Moschini Cristiano
MSc in Agribusiness Management
__________________________________________________________________________________________________________________
8
implementation of the ARIMA modelling techniques to the same time series data. Finally,
Chapter V focuses on conclusions and limitations concerning the results of the applied
techniques.
CHAPTER I
Critical review of smoothing and extrapolation of time series
Smoothing techniques are a set of simple methods which permit extrapolation and can therefore be used for forecasting purposes. They provide projections for a large number of time series quickly, and so are used when time and resources do not permit the use of formal modelling techniques. These extrapolation techniques represent a deterministic approach to the modelling of time series, since no reference is made to the sources or nature of the underlying randomness in the series.
The basic idea is that it is often desirable to smooth a time series in order to eliminate some of the more volatile short-term fluctuations. Smoothing may be done before making a forecast or simply to make the series easier to analyse and interpret.
Smoothing methods can be classified into two main groups: averaging methods, which conform to the conventional understanding of what an average is (that is, equally weighted observations), and exponential smoothing methods, which apply an unequal set of weights to past data.
This report will refer purely to the second group of methods because they are generally
superior to averaging methods (Makridakis, Wheelwright and Hyndman, 1998).
1.1 - Exponential smoothing methods
These methods are basically an extension of moving average techniques and generate forecasts as weighted moving averages. With simple moving average forecasts, the mean of past observations, say k observations, is used as a forecast. This implies equal weights (equal to 1/k) for all k data points. In practice, forecasts made using the most recent observations will usually provide the best guide to the future, so what is needed is a
weighting scheme that has decreasing weights as the observations (i.e. our information)
become older. This scheme can be obtained by exponential smoothing procedures, a set of
methods which have in common the property that recent values are given relatively more
weight in forecasting than the older observations. The basic idea behind smoothing a series is that time-series data can be seen as the product of different factors: the data themselves, the trend, the seasonal pattern and/or the cyclical pattern that the series might display. Hence, in order to take account of the characteristics of the series, each smoothing method requires certain parameters to be defined, and these determine the unequal weights applied to past data.
As a starting point, Makridakis, Wheelwright and Hyndman (1998) suggest a strategy for
evaluating any forecasting methodology (see figure 1).
Figure 1 - A strategy for appraising any of the smoothing methods of forecasting
Source: S. Makridakis, S.C. Wheelwright and R.J. Hyndman, Forecasting, methods and applications (1998), p. 140
Basically, the forecaster should go through the following stages:
Stage 1: divide the time series of interest into two parts, an “initialisation set” and a “test set”, in order to evaluate the chosen forecasting method;
Stage 2: choose a forecasting method among all those available (which are described below)
by looking at the particular characteristics of the series under study;
Stage 3: apply the chosen forecasting method to the “initialisation” data set in order to get
estimates of any trend components, seasonal components and any other parameter values;
Stage 4: apply the method to the “test” data set to check how well it does on data that were
not used in estimating the parameters of the model. After each forecast, the forecasting error
is determined and over the complete test set certain accuracy measures are determined (e.g.
mean squared error (MSE), mean absolute percentage error (MAPE), etc.) in order to reach
the optimum parameter values in the model. This is really an iterative phase;
Stage 5: appraise the chosen forecasting method and its suitability for various kinds of data
patterns.
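The five stages above can be sketched in code. The following Python sketch is not from the source text: the series, the 8/4 split and the stand-in naive method are assumptions chosen only to illustrate the mechanics of splitting, testing and appraising.

```python
# Sketch of the five-stage strategy. The data, the 8/4 split and the
# naive "method" are illustrative assumptions, not from the text.

def naive_forecasts(series):
    """Stand-in forecasting method: predict each point with the previous one."""
    return [series[0]] + series[:-1]

def mse(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Stage 1: choose a series and divide it into initialisation and test sets.
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
split = 8
init_set, test_set = series[:split], series[split:]

# Stages 2-3: choose a method and initialise it on the first part of the data.
forecasts = naive_forecasts(series)

# Stage 4: measure forecasting accuracy over the held-out test set only.
test_mse = mse(test_set, forecasts[split:])
test_mape = mape(test_set, forecasts[split:])

# Stage 5: appraise the method against alternatives using these measures.
print(round(test_mse, 1), round(test_mape, 1))   # -> 213.5 12.3
```

In a real application, stage 4 would be repeated over a grid of candidate parameter values, keeping the combination that minimises the test-set measure.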
Once the series has been divided into the “initialisation” and “test” sets, it must be decided which of the various smoothing techniques is the most appropriate for the given series of data (i.e. stage 2). The simplest exponential smoothing technique is single exponential smoothing, for which just one parameter needs to be estimated. Then there is Holt’s method, which makes use of two different parameters and allows forecasting for series which show trends, and the Holt-Winters’ method, which requires three smoothing parameters to smooth the data, the trend and the seasonal index. Pegels (1969) also derives exponential smoothing methods based on a classification of trend and seasonality patterns, depending on whether they are additive or multiplicative.
Briefly, these methods can be described as follows.
1.1.a - Single exponential smoothing
This method uses the forecast for the previous period and adjusts it using the forecast error. It can be represented by the equation:
F_{t+1} = F_t + α(Y_t - F_t)
where α is a constant value between 0 and 1.
In practice, the new forecast (F_{t+1}) is simply the old forecast plus an adjustment for the error that occurred in the last forecast, (Y_t - F_t).
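As a minimal sketch of this update rule (the data and the value α = 0.3 are illustrative assumptions, not from the text):

```python
# Single exponential smoothing: F[t+1] = F[t] + alpha * (Y[t] - F[t]).
# The series and alpha = 0.3 are illustrative assumptions.

def single_exponential_smoothing(series, alpha):
    """Return one-step-ahead forecasts; F_1 is initialised to Y_1."""
    forecasts = [series[0]]
    for y in series[:-1]:
        error = y - forecasts[-1]                 # last forecast error, Y_t - F_t
        forecasts.append(forecasts[-1] + alpha * error)
    return forecasts

series = [100.0, 105.0, 103.0, 108.0, 110.0]
print([round(f, 3) for f in single_exponential_smoothing(series, alpha=0.3)])
# -> [100.0, 100.0, 101.5, 101.95, 103.765]
```

Note that the rule can equivalently be written F_{t+1} = αY_t + (1 - α)F_t, which makes explicit that α weights the most recent observation against the previous forecast.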
1.1.b - Holt’s linear method
This is an extension of single exponential smoothing to linear exponential smoothing, which allows forecasting of data with trends. In this case two parameters, α and β (with values between 0 and 1), have to be found in order to obtain the forecast value F_{t+m}. The method can be represented by the following set of equations:
F_{t+m} = L_t + b_t m
L_t = αY_t + (1 - α)(L_{t-1} + b_{t-1})
b_t = β(L_t - L_{t-1}) + (1 - β)b_{t-1}
where F_{t+m} is the forecast m periods ahead, L_t is an estimate of the level of the series at time t and b_t denotes an estimate of the slope of the series at time t. The second equation adjusts L_t directly for the trend of the previous period, b_{t-1}, by adding it to the last smoothed value, L_{t-1}. The last equation updates the trend, which is expressed as the difference between the last two smoothed values [1].
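The two-parameter scheme can be sketched as follows; the sample data and the simple initialisation L_1 = Y_1, b_1 = Y_2 - Y_1 are illustrative assumptions, not from the text:

```python
# Holt's linear method, implementing the level and trend equations above.
# The data and the simple initialisation L_1 = Y_1, b_1 = Y_2 - Y_1 are
# illustrative assumptions.

def holt_forecast(series, alpha, beta, m):
    """Return the forecast m periods beyond the end of the series."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        last_level = level
        level = alpha * y + (1 - alpha) * (last_level + trend)   # L_t
        trend = beta * (level - last_level) + (1 - beta) * trend # b_t
    return level + trend * m          # F_{t+m} = L_t + b_t * m

series = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0]
print(holt_forecast(series, alpha=0.5, beta=0.5, m=2))   # -> 20.96875
```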
1.1.c - Holt-Winters’ trend and seasonality method
When seasonality is embedded in the time series the two previous methods are not sufficient.
The Holt-Winters’ trend and seasonal smoothing method provides adjustments in this case.
Winters (1960) extended Holt’s method in order to capture seasonality directly. This technique is based on three smoothing equations to take account of the level, the trend and the seasonal variation. There is then one further choice: whether to treat seasonality in a multiplicative (non-linear) or an additive (linear) way [2].
[1] A wider explanation can be found in S. Makridakis, S.C. Wheelwright and R.J. Hyndman, Forecasting, methods and applications (1998), p. 158.
Multiplicative seasonality. In this case the basic equations are as follows:
Level: L_t = α(Y_t / S_{t-s}) + (1 - α)(L_{t-1} + b_{t-1})
Trend: b_t = β(L_t - L_{t-1}) + (1 - β)b_{t-1}
Seasonal: S_t = γ(Y_t / L_t) + (1 - γ)S_{t-s}
Forecast: F_{t+m} = (L_t + b_t m)S_{t-s+m}
where L_t represents the level of the series, s is the length of seasonality (e.g. the number of quarters or months in a year), b_t stands for the trend, S_t is the seasonal component and F_{t+m} represents the forecast for m periods ahead. The parameters α, β and γ are chosen so as to minimise the MSE or the MAPE. Nowadays, with the increasing power of computers, it is easy for software such as SPSS Trends to find the optimal parameter values using a grid search function [3].
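A minimal sketch of the four multiplicative equations follows. The quarterly data and the crude initialisation (level equal to the mean of the first season, zero initial trend, ratio-to-mean seasonal indices) are illustrative assumptions; a real application would initialise and optimise the parameters as described in the five-stage strategy above.

```python
# Holt-Winters' multiplicative method, implementing the four equations above.
# The quarterly data and the crude first-season initialisation are
# illustrative assumptions.

def holt_winters_mult(series, s, alpha, beta, gamma, m):
    """Return the forecast m periods beyond the series end (1 <= m <= s)."""
    level = sum(series[:s]) / s                  # level: mean of the first season
    trend = 0.0                                  # no initial trend assumed
    seasonal = [y / level for y in series[:s]]   # ratio-to-mean seasonal indices
    for t in range(s, len(series)):
        y, last_level = series[t], level
        level = alpha * (y / seasonal[t - s]) + (1 - alpha) * (last_level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal.append(gamma * (y / level) + (1 - gamma) * seasonal[t - s])
    # F_{t+m} = (L_t + b_t * m) * S_{t-s+m}
    return (level + trend * m) * seasonal[len(series) - s + m - 1]

quarterly = [30.0, 40.0, 50.0, 60.0, 33.0, 44.0, 55.0, 66.0]
print(round(holt_winters_mult(quarterly, s=4, alpha=0.3, beta=0.1, gamma=0.2, m=1), 2))
```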
Additive seasonality. Alternatively, seasonality can be treated additively. In this case the equations which lead to the forecast value after m periods are:
Level: L_t = α(Y_t - S_{t-s}) + (1 - α)(L_{t-1} + b_{t-1})
Trend: b_t = β(L_t - L_{t-1}) + (1 - β)b_{t-1}
Seasonal: S_t = γ(Y_t - L_t) + (1 - γ)S_{t-s}
Forecast: F_{t+m} = L_t + b_t m + S_{t-s+m}
It can be seen that the only differences in the equations are that the seasonal indices are now added and subtracted instead of entering through products and ratios.
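For comparison, the additive variant changes only the seasonal operations (subtraction and addition in place of division and multiplication); the data and initialisation below are again illustrative assumptions:

```python
# Additive variant of Holt-Winters: seasonal indices are subtracted and added
# instead of divided and multiplied. Data and initialisation are illustrative.

def holt_winters_add(series, s, alpha, beta, gamma, m):
    """Return the forecast m periods beyond the series end (1 <= m <= s)."""
    level = sum(series[:s]) / s
    trend = 0.0
    seasonal = [y - level for y in series[:s]]   # indices now centred near zero
    for t in range(s, len(series)):
        y, last_level = series[t], level
        level = alpha * (y - seasonal[t - s]) + (1 - alpha) * (last_level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal.append(gamma * (y - level) + (1 - gamma) * seasonal[t - s])
    # F_{t+m} = L_t + b_t * m + S_{t-s+m}
    return level + trend * m + seasonal[len(series) - s + m - 1]

quarterly = [30.0, 40.0, 50.0, 60.0, 34.0, 44.0, 54.0, 64.0]
print(round(holt_winters_add(quarterly, s=4, alpha=0.3, beta=0.1, gamma=0.2, m=1), 2))
```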
[2] A multiplicative model is a model in which the various terms are multiplied together, while an additive model is one in which the various terms are added together.
1.1.d - Pegels’ classification of exponential smoothing techniques
Pegels (1969) set up a useful classification for interpreting and dealing with exponential smoothing methods which have separate trend and seasonal aspects. Globally, there are nine exponential smoothing models, which can be summarised by the following equations:
L_t = αP_t + (1 - α)Q_t
b_t = βR_t + (1 - β)b_{t-1}
S_t = γT_t + (1 - γ)S_{t-s}
where P, Q, R and T vary depending on the characteristics of the trend component (i.e. none, additive or multiplicative) and of the seasonal component (i.e. none, additive or multiplicative). The specific formulae for forecasting m periods ahead are also provided [4].
[3] See the practical application in Chapter 3.
[4] For more details see S. Makridakis, S.C. Wheelwright and R.J. Hyndman, Forecasting, methods and applications (1998), p. 171.
CHAPTER II
Critical review of the Box-Jenkins approach to time-series model building
The Box-Jenkins (B-J) (1976) approach to time-series model building is a method of finding,
for a given sample of data, an ARIMA (Autoregressive Integrated Moving-average) model that
may adequately represent the data-generating process. “In practice, the emphasis of this forecasting method is on analysing the probabilistic, or stochastic, properties of the economic time series on their own, under the philosophy ‘let the data speak for themselves’” (D. Gujarati, 1995).
Before introducing the B-J methodology a brief explanation of concepts such as stationarity,
autocorrelation function, autoregressive (AR) process, moving-average (MA) process and
autoregressive and moving-average (ARMA) process is necessary.
As a starting point, it must be noted that empirical work based on time series data assumes that the underlying time series is stationary.
In short, “a stochastic process is said to be stationary (or weakly stationary) if its mean and variance are constant over time and the value of covariance between two time periods depends only on the distance or lag between the two time periods and not on the actual time at which the covariance is computed” [5] (D. Gujarati, 1995).
If the characteristics of the stochastic process change over time, i.e. if the process is
therefore nonstationary, it will be difficult to represent the time series over past and future
intervals of time by a simple algebraic model. On the other hand, if the stochastic process is
[5] To explain this statement, let Y_t be a stochastic time series with the following properties:
Mean: E(Y_t) = µ
Variance: var(Y_t) = E(Y_t - µ)² = σ²
Covariance: γ_k = E[(Y_t - µ)(Y_{t+k} - µ)]
fixed in time, that is, stationary, one can model the process via an equation with fixed coefficients that can be estimated from past data.
Usually, very few of the time series one meets in practice are stationary, but fortunately many of the nonstationary time series that are encountered have the desirable property that, if they are differenced one or more times, the resulting series will be stationary. Such a nonstationary series is called homogeneous, and the number of times the original series must be differenced before a stationary series results is termed the order of homogeneity. It follows that a method is needed in order to decide whether a series is stationary, or to determine the number of times a homogeneous series should be differenced to arrive at a stationary series.
For doing that we first look at the plot of the Autocorrelation Function (called a Correlogram).
The Autocorrelation Function (ACF) is one simple test of stationarity, and the ACF at lag k, denoted by ρ_k, is defined as:
ρ_k = γ_k / γ_0 = (covariance at lag k) / (variance)
Since covariance and variance are measured in the same units of measurement, ρ_k is unitless: a pure number which lies between -1 and +1, as any other correlation coefficient does.
The ACF of a stationary series drops off as k, the number of lags, becomes large, but this is not the case for a nonstationary series (see figures 1 and 2 in the appendix).
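The sample counterpart of ρ_k can be computed directly from a series. In the sketch below, the simulated random walk (nonstationary) and its first differences (stationary) are illustrative assumptions used only to show the contrasting correlograms:

```python
# Sample autocorrelation function r_k = c_k / c_0, the empirical analogue of
# rho_k = gamma_k / gamma_0. The simulated series is an illustrative assumption.
import random
from itertools import accumulate

def acf(series, max_lag):
    n = len(series)
    mean = sum(series) / n
    c0 = sum((y - mean) ** 2 for y in series) / n              # variance (lag 0)
    acfs = []
    for k in range(1, max_lag + 1):
        ck = sum((series[t] - mean) * (series[t + k] - mean)
                 for t in range(n - k)) / n                    # autocovariance, lag k
        acfs.append(ck / c0)                                   # unitless, in [-1, 1]
    return acfs

random.seed(0)
steps = [random.gauss(1.0, 1.0) for _ in range(80)]
walk = list(accumulate(steps))       # random walk with drift: nonstationary

print([round(r, 2) for r in acf(walk, 4)])    # stays high as the lag grows
print([round(r, 2) for r in acf(steps, 4)])   # small at all lags: stationary
```

Here differencing the walk recovers the steps, so one difference suffices: the series is homogeneous of order one.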
If the series does not appear to be stationary, it must be differenced so that, once it has reached the stationary condition, we can model it in one of the variety of ways described below.
[5, continued] where γ_k, the covariance at lag k, is the covariance (also called autocovariance) between the values of Y_t and Y_{t+k}, that is, between values k periods apart. Imagine that there is a shift in the origin of Y from Y_t to Y_{t+m}; then, if Y_t is supposed to be stationary, the mean, variance