volatility implied by the market. It is common knowledge that many types of
assets experience periods of high and low volatility. This phenomenon, called
volatility clustering, means that during some periods prices go up and
down quickly, while during other periods they hardly seem to move at all.
Periods when prices fall quickly (a crash) are often followed by prices going
down even more, or going up by an unusual amount. Likewise, a period when prices
rise quickly (a bubble) is often followed by prices going up even more,
or going down by an unusual amount. Typically, extreme movements
do not appear out of nowhere; they are presaged by larger movements than
usual. Of course, whether such large movements have the same direction
or the opposite one is more difficult to say. Moreover, an increase in volatility does
not always presage a further increase; the volatility may simply go back
down again. Below, we illustrate the most important models that allow us
to capture the volatility of financial returns. Recently, various econometric
models have been developed to describe the evolution of the volatility
of an asset return over time. All these models belong to the class of conditional
heteroskedastic models. The univariate volatility models discussed in
this chapter include the Autoregressive Conditional Heteroskedastic (ARCH)
model of Engle (1982), the Generalized ARCH (GARCH) model of Bollerslev
(1986), and the extensions of the GARCH model: the Absolute Value
GARCH, the Exponential GARCH (E-GARCH), the Threshold GARCH (T-GARCH)
model, the GJR-GARCH, the I-GARCH, the GARCH-M and the
FI-GARCH. We also discuss the advantages and weaknesses of each volatility
model and show some applications of the models.
1.2 Characteristics of Volatility
A special feature of volatility is that it is not directly observable. For example,
consider the daily log-returns of IBM stock. The daily volatility is
not directly observable from the returns because there is only one observation
in a trading day. If intraday data, such as five-minute returns, are
available, then one can estimate the daily volatility using the intraday information.
However, the accuracy of such an estimate deserves careful
study. Furthermore, the unobservability of volatility makes it difficult
to evaluate the forecasting performance of conditional heteroskedastic models.
We discuss this issue later. Although volatility is not directly observable,
it has some characteristics that are commonly seen in asset returns:
1. There exist volatility clusters (i.e., volatility may be high for certain
time periods and low for other periods);
2. Volatility evolves over time in a continuous manner - that is, volatility
jumps are rare;
3. Volatility does not diverge to infinity - that is, volatility varies within
some fixed range. Statistically speaking, this means that volatility is
often stationary;
4. Volatility seems to react differently to a big price increase and a big price
drop (the so-called leverage effect).
These properties play an important role in the development of volatility
models. Some volatility models were proposed specifically to correct the
inability of existing models to capture the characteristics
mentioned above. For example, the E-GARCH model was developed
to capture the asymmetry in volatility induced by big positive and negative
asset returns. Fig. 1.1 shows the volatility clusters
observed in the conditional variances predicted by a GARCH(1,1) model for
the log-returns of TESCO PLC.
Figure (1.1). Conditional variances predicted by a GARCH(1,1) model for TESCO PLC log-returns. The
circles mark the volatility clusters.
[Figure: conditional variance (vertical axis, 0 to 1.5 x 10^-3) against time (2003 to 2009), with circled regions labelled Low Volatility and High Volatility.]
1.3 Volatility in Options trading
Volatility is an important factor in options trading. Here volatility means the
conditional standard deviation of the underlying asset return. Consider, for example,
the price of a European call option, which is a contract giving its holder the
right, but not the obligation, to buy a fixed number of shares of a specified
common stock at a fixed price on a given date. The fixed price is called the
strike price and is commonly denoted by $K$. The given date is called the
expiration date. The important time duration here is the time to expiration,
and we denote it by $l$. If the holder can exercise this right at any time on or
before the expiration date, then the option is called an American call option.
The well-known Black and Scholes option pricing formula states that the
price of a European call option is
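$$C^{BS}_t = S_t\,\Phi(x) - K r^{-l}\,\Phi(x - \sigma_t\sqrt{l}), \qquad \text{and} \qquad x = \frac{\ln\left(S_t/(K r^{-l})\right)}{\sigma_t\sqrt{l}} + \frac{1}{2}\sigma_t\sqrt{l}, \tag{1.1}$$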
where $S_t$ is the current price of the underlying stock, $r$ is the risk-free interest
rate (entering as a gross rate, so that $r^{-l}$ is the discount factor over the life of
the option), $\sigma_t$ is the conditional standard deviation of the log-return of the specified
stock, $l$ is the time to expiration of the option and $\Phi(x)$ is the cumulative distribution
function of the standard normal random variable evaluated at $x$. We
give more details on the BS model in Chapter 3, where we discuss
option pricing models. The BS formula has several interesting interpretations,
but it suffices to say here that the conditional standard deviation of the log
return of the underlying stock plays an important role. In options markets,
if one accepts the idea that prices are governed by a model
such as the Black and Scholes formula, then one can use the price to obtain
the implied volatility. The implied volatility of an option is the volatility
implied by the market price of the option based on an option pricing model.
In other words, it is the volatility that, when used in a particular pricing
model, yields a theoretical value for the option equal to the current market
price of that option. For example, the implied volatility of the BS model can
be written as

$$\sigma^{imp}_t = \sigma[S_t, C^{BS}_t, K, r, l].$$

From the observed price of a European call option, one can use the Black
and Scholes formula in (1.1) to deduce the conditional standard deviation
$\sigma_t$. However, this implied volatility is derived under the assumption of a
log-normal distribution for the asset price series. It might be very different
from the actual volatility. Experience shows that the implied volatility of an
asset return tends to be larger than that obtained by using a GARCH type
of volatility model (Figlewski (1994)).
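The inversion of formula (1.1) described above can be sketched numerically. The following is a minimal illustration, not a definitive implementation: it assumes that $r$ is a gross rate per unit of time (so that $K r^{-l}$ is the discounted strike), it uses bisection because the call price is increasing in $\sigma_t$, and the function names and all numerical inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, r, l, sigma):
    """European call price from formula (1.1); r is assumed to be a gross rate."""
    discounted_strike = K * r ** (-l)
    vol = sigma * math.sqrt(l)
    x = math.log(S / discounted_strike) / vol + 0.5 * vol
    return S * norm_cdf(x) - discounted_strike * norm_cdf(x - vol)


def implied_vol(price, S, K, r, l, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(..., sigma) = price for sigma by bisection on [lo, hi]."""
    if not bs_call(S, K, r, l, lo) <= price <= bs_call(S, K, r, l, hi):
        raise ValueError("price cannot be matched by a volatility in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, l, mid) < price:
            lo = mid   # model price too low: volatility must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs, chosen only for illustration.
price = bs_call(S=100.0, K=105.0, r=1.03, l=0.5, sigma=0.25)
sigma_imp = implied_vol(price, S=100.0, K=105.0, r=1.03, l=0.5)
print("call price:", round(price, 4), "implied volatility:", round(sigma_imp, 6))
```

Pricing the option with a known volatility and then inverting the price is a simple round-trip check that the solver recovers the input.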
1.4 A general model structure
Let $r_t$ be the log return of an asset at time index $t$. The basic idea behind
volatility study is that the series $r_t$ is either serially uncorrelated or
has only minor lower-order serial correlations, but it is nevertheless dependent. For illustration,
Fig. 1.2 shows the ACF and PACF of some functions of the daily
log-returns of TESCO PLC from 2 January 2003 to 17 July
2009. The upper left panel shows the sample ACF of the squared log-returns,
which suggests significant serial correlations because the autocorrelations lie outside the
confidence bands. The upper right panel shows the sample PACF of the
squared log-returns, whereas the lower left panel shows the sample ACF of
the log-returns, which suggests that the daily log-returns have some low-order serial
correlation, because the first, third, fourth and fifth autocorrelations lie outside
the confidence bands. Volatility models attempt to capture such dependence
in the return series. To put the volatility models in a proper perspective, it
is informative to consider the conditional mean and conditional variance of
$r_t$ given $F_{t-1}$ - that is,
$$\mu_t = E(r_t|F_{t-1}), \qquad \sigma_t^2 = \mathrm{Var}(r_t|F_{t-1}) = E[(r_t - \mu_t)^2|F_{t-1}], \tag{1.2}$$
where $F_{t-1}$ denotes the information set available at time $t-1$. As shown by
many empirical examples, serial dependence of a return series $r_t$ is weak if
it exists at all. Therefore, the equation for $\mu_t$ in (1.2) should be simple, and
we assume that $r_t$ follows a simple time series model such as a stationary
ARMA($p$, $q$) model. In other words, we consider the model
$$r_t = \mu_t + a_t, \qquad \mu_t = \phi_0 + \sum_{i=1}^{p}\phi_i r_{t-i} - \sum_{i=1}^{q}\theta_i a_{t-i}, \tag{1.3}$$
for $r_t$, where $p$ and $q$ are non-negative integers.

Figure (1.2). Sample ACF and PACF of various functions of the daily log-returns of TESCO PLC
from 2 January 2003 to 17 July 2009: (1) ACF of the squared returns (upper left), (2) ACF of the log returns (lower left), (3)
PACF of the squared returns (upper right), and (4) ACF of the absolute returns (lower right).
[Figure: four panels, each plotting sample autocorrelations (partial autocorrelations in the PACF panel) against lags 0 to 20 on a vertical scale from -0.2 to 0.8.]

The model in equation (1.3) illustrates a possible financial application of a linear time series model (in
this case an ARMA($p$, $q$)). The order ($p$, $q$) of an ARMA model may depend
on the frequency of the return series. One may also include explanatory
variables in the conditional mean equation and use a linear regression model
with time series errors to capture the behavior of $\mu_t$. For example, a dummy
variable for Mondays can be used to study the effect of the weekend on daily
stock returns. Combining Eqs. (1.2) and (1.3), we have

$$\sigma_t^2 = \mathrm{Var}(r_t|F_{t-1}) = \mathrm{Var}(a_t|F_{t-1}). \tag{1.4}$$
The conditional heteroskedastic models of this chapter are concerned with the
evolution of $\sigma_t^2$. The manner in which $\sigma_t^2$ evolves over time distinguishes
one volatility model from another. Conditional heteroskedastic models can be
classified into two general categories. Those in the first category use an exact
function to govern the evolution of $\sigma_t^2$, whereas those in the second category
use a stochastic equation to describe $\sigma_t^2$. The GARCH model belongs to
the first category, and the stochastic volatility model (SVM) is in the second
category. For simplicity in introducing volatility models, we assume that the
model for the conditional mean is given. Throughout this work, $a_t$ is referred
to as the shock or mean-corrected return of an asset at time $t$, and $\sigma_t$
is the positive square root of $\sigma_t^2$. The model for $\mu_t$ in (1.3) is referred to as
the mean equation for $r_t$ and the model for $\sigma_t^2$ is the volatility equation for $r_t$.
Therefore, modeling conditional heteroskedasticity amounts to augmenting
a time series model with a dynamic equation that governs the time evolution of
the conditional variance of the shock.
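The distinction drawn in this section between a series that is serially uncorrelated and one that is independent can be illustrated with sample autocorrelations. The sketch below is only an illustration under stated assumptions: it uses a toy series with random signs and a magnitude that alternates between calm and turbulent blocks (not the TESCO PLC data), the function name sample_acf is hypothetical, and the band $\pm 1.96/\sqrt{T}$ is the usual approximate 95% band for white noise.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


# Toy returns (an assumption): random signs, with the magnitude switching
# between calm (0.01) and turbulent (0.03) blocks of 50 observations.
rng = random.Random(0)
returns = [rng.choice([-1.0, 1.0]) * (0.01 if (t // 50) % 2 == 0 else 0.03)
           for t in range(1000)]
squared = [v * v for v in returns]

band = 1.96 / math.sqrt(len(returns))  # approximate 95% band for white noise
acf_r = sample_acf(returns, 5)
acf_r2 = sample_acf(squared, 5)
print("band            :", round(band, 3))
print("ACF of returns  :", [round(v, 3) for v in acf_r])
print("ACF of squares  :", [round(v, 3) for v in acf_r2])
```

In this toy series the returns themselves show little autocorrelation, while the squared returns are strongly autocorrelated, which is the kind of dependence a volatility equation is meant to capture.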
1.5 The ARCH Model
The first model that provides a systematic framework for volatility modeling
is the ARCH model of Engle (1982). The basic ideas of ARCH models are
that:
1. the mean-corrected asset return $a_t$ is serially uncorrelated, but dependent;
2. the conditional variance of $a_t$ can be described by a simple quadratic
function of the lagged values of $a_t$.
Specifically, an ARCH($m$) model assumes that

$$a_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \alpha_1 a_{t-1}^2 + \cdots + \alpha_m a_{t-m}^2, \tag{1.5}$$
where $z_t$ is a sequence of independent and identically distributed (i.i.d.) random
variables with mean zero and variance 1, $\omega > 0$, and $\alpha_i \ge 0$ for $i > 0$.
The coefficients $\alpha_i$ must satisfy some regularity conditions to ensure that the
unconditional variance of $a_t$ is finite. In practice, $z_t$ is often assumed to follow
the standard normal or a standardized Student-t distribution (see footnote 1). From the
structure of the model, it is seen that a large past squared shock $a_{t-1}^2$ implies a
large conditional variance $\sigma_t^2$ for the mean-corrected return $a_t$. Consequently,
$a_t$ tends to assume a large value (in modulus). This means that, under the
ARCH framework, large shocks tend to be followed by another large shock.
Here I use the word tend because a large variance does not necessarily produce
a large variate. It only says that the probability of obtaining a large
variate is greater than that under a smaller variance. This feature is similar to
the volatility clustering observed in asset returns.
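The statement that large shocks tend to be followed by large shocks can be checked on a simulated path of model (1.5). This is a minimal sketch under stated assumptions: $z_t$ is standard normal, the start-up shocks are set to zero and a burn-in is discarded, and the function name simulate_arch, the parameter values and the 90th-percentile definition of a "large" shock are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_arch(omega, alphas, n, seed=0, burn_in=500):
    """Simulate a_t = sigma_t * z_t with Gaussian z_t and
    sigma_t^2 = omega + sum_i alphas[i-1] * a_{t-i}^2, as in (1.5)."""
    rng = random.Random(seed)
    shocks = [0.0] * len(alphas)          # start-up values for the lagged shocks
    variances = []
    for _ in range(n + burn_in):
        sigma2 = omega + sum(alpha * shocks[-i] ** 2
                             for i, alpha in enumerate(alphas, start=1))
        shocks.append(math.sqrt(sigma2) * rng.gauss(0.0, 1.0))
        variances.append(sigma2)
    return shocks[-n:], variances[-n:]    # drop start-up values and burn-in


# Hypothetical ARCH(1) parameters.
shocks, variances = simulate_arch(omega=1.0, alphas=[0.5], n=20000, seed=1)

abs_shocks = [abs(a) for a in shocks]
threshold = sorted(abs_shocks)[int(0.9 * len(abs_shocks))]  # 90th percentile
large = [a > threshold for a in abs_shocks]

p_large = sum(large) / len(large)
after_large = [large[t] for t in range(1, len(large)) if large[t - 1]]
p_large_after_large = sum(after_large) / len(after_large)

print("P(large shock)                     :", round(p_large, 3))
print("P(large shock | previous one large):", round(p_large_after_large, 3))
```

The conditional frequency exceeds the unconditional one, yet stays well below one, which matches the use of the word tend above.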
1.5.1 Properties of ARCH Models
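To understand the ARCH models, it pays to carefully study the ARCH(1)
model

$$a_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \alpha_1 a_{t-1}^2, \tag{1.6}$$

where $\omega > 0$ and $0 \le \alpha_1 < 1$ (the ARCH(1) case of the condition $\sum_{i=1}^{m}\alpha_i < 1$). First, the unconditional mean of
$a_t$ remains zero because

$$E(a_t) = E[E(a_t|F_{t-1})] = E[\sigma_t E(z_t)] = 0. \tag{1.7}$$

Second, the unconditional variance of $a_t$ can be obtained as

$$\mathrm{Var}(a_t) = E(a_t^2) = E[E(a_t^2|F_{t-1})] = E(\omega + \alpha_1 a_{t-1}^2) = \omega + \alpha_1 E(a_{t-1}^2).$$

Because $a_t$ is a stationary process (see footnote 2) with $E(a_t) = 0$, we have
$\mathrm{Var}(a_t) = \mathrm{Var}(a_{t-1}) = E(a_{t-1}^2)$. Therefore,

$$\mathrm{Var}(a_t) = \omega + \alpha_1 \mathrm{Var}(a_t) \qquad \text{and} \qquad \mathrm{Var}(a_t) = \frac{\omega}{1 - \alpha_1}.$$

Because the variance of $a_t$ must be positive, we need $0 \le \alpha_1 < 1$.

Footnote 1: Other distributions are also possible; see Bauwens and Laurent (2005).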
Footnote 2: We suppose that $a_t$ is stationary. The condition for the stationarity of the process is
that $\sum_{i=1}^{m}\alpha_i < 1$. If instead the sum of the parameters is $\ge 1$, then the unconditional
variance does not exist and the process is not covariance-stationary.

Third, in some applications, we need higher order moments of $a_t$ to exist
and, hence, $\alpha_1$ must also satisfy some additional constraints. For instance,
to study its tail behavior, we require that the fourth moment of $a_t$ is finite.
Under the normality assumption of $z_t$ in Eq. (1.5), we have

$$E(a_t^4|F_{t-1}) = 3[E(a_t^2|F_{t-1})]^2 = 3(\omega + \alpha_1 a_{t-1}^2)^2.$$

Therefore,

$$E(a_t^4) = E[E(a_t^4|F_{t-1})] = 3E(\omega + \alpha_1 a_{t-1}^2)^2 = 3E[\omega^2 + 2\omega\alpha_1 a_{t-1}^2 + \alpha_1^2 a_{t-1}^4].$$

If $a_t$ is fourth-order stationary with $m_4 = E(a_t^4)$, then we have

$$m_4 = 3[\omega^2 + 2\omega\alpha_1 \mathrm{Var}(a_t) + \alpha_1^2 m_4] = 3\omega^2\left(1 + 2\,\frac{\alpha_1}{1 - \alpha_1}\right) + 3\alpha_1^2 m_4. \tag{1.8}$$

Consequently,

$$m_4 = \frac{3\omega^2(1 + \alpha_1)}{(1 - \alpha_1)(1 - 3\alpha_1^2)}.$$
This result has two important implications:
(a) since the fourth moment of $a_t$ is positive, $\alpha_1$ must also
satisfy the condition $1 - 3\alpha_1^2 > 0$, that is, $0 \le \alpha_1^2 < 1/3$;
(b) the unconditional kurtosis of $a_t$ is

$$\frac{E(a_t^4)}{[\mathrm{Var}(a_t)]^2} = \frac{3\omega^2(1 + \alpha_1)}{(1 - \alpha_1)(1 - 3\alpha_1^2)} \times \frac{(1 - \alpha_1)^2}{\omega^2} = 3\,\frac{1 - \alpha_1^2}{1 - 3\alpha_1^2} > 3.$$
Thus, the excess kurtosis of $a_t$ is positive and the tail distribution of $a_t$ is
heavier than that of a normal distribution. In other words, the shock $a_t$ of a
conditional Gaussian ARCH(1) model is more likely than a Gaussian white
noise series to produce outliers. This is in agreement with the empirical
finding that outliers appear more often in asset returns than implied
by an i.i.d. sequence of normal random variates. These properties continue to
hold for general ARCH models, but the formulas become more complicated
for higher order ARCH models. The condition $\alpha_i \ge 0$ can be relaxed. It is a
condition to ensure that the conditional variance $\sigma_t^2$ is positive for all $t$. In
fact, a natural way to achieve positiveness of the conditional variance is to
rewrite an ARCH($m$) model as

$$a_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + A'_{m,t-1}\,\Omega\,A_{m,t-1}, \tag{1.9}$$

where $A_{m,t-1} = (a_{t-1}, \cdots, a_{t-m})'$ and $\Omega$ is an $m \times m$ non-negative definite
matrix. The ARCH($m$) model in (1.5) corresponds to the case in which $\Omega$ is diagonal. Thus,
Engle's model uses a parsimonious approach to approximate a quadratic
function. A simple way to achieve Eq. (1.9) is to employ a random-coefficient
model (RCA) for $a_t$, such as the CHARMA model proposed by Tsay (1987).
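The moment formulas of this subsection are easy to evaluate and to compare with a simulation. The sketch below is an illustration under stated assumptions: the shocks are conditionally Gaussian, the parameter values $\omega = 1$ and $\alpha_1 = 0.3$ are hypothetical, and only the variance is compared with its sample counterpart, since sample kurtosis converges slowly for heavy-tailed series.

```python
import math
import random


def arch1_variance(omega, alpha1):
    """Unconditional variance omega / (1 - alpha1), valid for 0 <= alpha1 < 1."""
    if not 0.0 <= alpha1 < 1.0:
        raise ValueError("the variance is finite only for 0 <= alpha1 < 1")
    return omega / (1.0 - alpha1)


def arch1_kurtosis(alpha1):
    """Unconditional kurtosis 3 (1 - alpha1^2) / (1 - 3 alpha1^2) of a
    conditionally Gaussian ARCH(1), valid for 0 <= alpha1^2 < 1/3."""
    if alpha1 < 0.0 or 3.0 * alpha1 ** 2 >= 1.0:
        raise ValueError("the fourth moment is finite only for alpha1^2 < 1/3")
    return 3.0 * (1.0 - alpha1 ** 2) / (1.0 - 3.0 * alpha1 ** 2)


# Hypothetical parameter values.
omega, alpha1 = 1.0, 0.3
rng = random.Random(42)
n, burn_in = 100_000, 1_000

a_prev, sum_sq = 0.0, 0.0
for t in range(n + burn_in):
    sigma2 = omega + alpha1 * a_prev ** 2          # volatility equation (1.6)
    a_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
    if t >= burn_in:
        sum_sq += a_prev ** 2
sample_var = sum_sq / n                            # the mean is zero by (1.7)

print("theoretical variance:", round(arch1_variance(omega, alpha1), 4))
print("simulated variance  :", round(sample_var, 4))
print("theoretical kurtosis:", round(arch1_kurtosis(alpha1), 4))
```

Setting $\alpha_1 = 0$ in arch1_kurtosis returns 3, the Gaussian value, while any admissible $\alpha_1 > 0$ gives a value above 3.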
1.5.2 Building an ARCH Model
The procedure for drawing inferences for an ARCH model consists of three
steps:
1. build an econometric model (e.g., an ARMA model) for the return series
to remove any linear dependence in the data, and use the residual series
of the model to test for ARCH effects;
2. specify the ARCH order and perform estimation;
3. check the fitted ARCH model carefully and refine it if necessary.
More details are given later.
An ARMA model is built for the observed time series to remove any serial
correlations in the data. For most asset return series, this step amounts to
removing the sample mean from the data if the sample mean is significantly
different from zero. For some daily return series, a simple AR model might be
needed. The squared series $a_t^2$ is used to check for conditional heteroskedasticity,
where $a_t = r_t - \mu_t$ is the residual of the fitted ARMA model. Two
tests are available here. The first test is to apply the usual Ljung-Box statistics
to $a_t^2$. The second test for conditional heteroskedasticity is the Lagrange
multiplier test of Engle (1982). The null hypothesis is:
$$H_0 : \alpha_i = 0 \quad (i = 1, \cdots, m),$$

or equivalently

$$H_0 : \alpha_1 = \alpha_2 = \cdots = \alpha_m = 0,$$

and the linear regression is

$$a_t^2 = \omega + \alpha_1 a_{t-1}^2 + \cdots + \alpha_m a_{t-m}^2 + e_t, \qquad t = m + 1, \cdots, T, \tag{1.10}$$
where $e_t$ denotes the error term, $m$ is a prespecified positive integer, and $T$
is the sample size. If the estimated parameters are not significant, we do not
reject the null hypothesis and conclude that there
are no ARCH effects. If the estimated coefficients are significant, we
reject the null hypothesis and conclude that ARCH effects are present. The statistic of
Engle is obtained as $LM = T \times R^2$, which is distributed as

$$T \times R^2 \xrightarrow{d} \chi^2_m$$

for $T \to \infty$, where $\chi^2_m$ is a chi-squared distribution with $m$ degrees of freedom (see footnote 3) and $R^2$ is the
coefficient of determination of the regression (see footnote 4). The logic of the test is simple:
if

$$\alpha_1 = \alpha_2 = \alpha_3 = \cdots = \alpha_m$$

were equal to zero, their estimates would not be statistically significant.
Hence, the $R^2$ coefficient would be close to zero. To obtain a statistic
whose distribution is known (at least approximately), the
coefficient of determination is scaled up by multiplying it by the sample size.
We then compute the critical value $\chi^2_{m,\alpha}$, which is the boundary between the two regions (see footnote 5),
that is, the number for which

$$P(\chi^2_m > \chi^2_{m,\alpha}) = \alpha,$$

where $\alpha$ is the chosen significance level of the test.

Footnote 3: McLeod and Li (1983) show that the sample autocorrelations of $a_t^2$ have asymptotic
variance equal to $T^{-1}$ and that portmanteau statistics calculated from them are asymptotically
distributed as a $\chi^2$ if the $a_t^2$ are independent.
Footnote 4: $R^2$ is the squared multiple correlation coefficient of the regression.
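The two tests for ARCH effects described above can be sketched as follows. This is a simplified illustration, not a full implementation: the LM test is written only for the special case $m = 1$, where the $R^2$ of regression (1.10) equals the squared sample correlation between $a_t^2$ and $a_{t-1}^2$; the Ljung-Box statistic uses its standard form $Q(m) = T(T+2)\sum_{k=1}^{m}\hat\rho_k^2/(T-k)$; the residuals are a simulated Gaussian ARCH(1) series with hypothetical parameters; and 3.841 is the 5% critical value of a chi-squared distribution with one degree of freedom. All function names are hypothetical.

```python
import math
import random

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value, chi-squared with 1 degree of freedom


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box(x, m):
    """Ljung-Box statistic Q(m) = T (T + 2) * sum_k rho_k^2 / (T - k)."""
    n = len(x)
    rho = sample_acf(x, m)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


def engle_lm_lag1(shocks):
    """LM = T * R^2 for regression (1.10) with m = 1 (constant and one lag)."""
    y = [a * a for a in shocks[1:]]    # a_t^2 for t = 2, ..., T
    x = [a * a for a in shocks[:-1]]   # a_{t-1}^2
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    r2 = sxy * sxy / (sxx * syy)       # R^2 of a simple regression
    return len(shocks) * r2            # T * R^2, with T the sample size


# Hypothetical residuals: a simulated Gaussian ARCH(1) series.
rng = random.Random(7)
omega, alpha1 = 1.0, 0.5
shocks, a_prev = [], 0.0
for _ in range(2000):
    a_prev = math.sqrt(omega + alpha1 * a_prev ** 2) * rng.gauss(0.0, 1.0)
    shocks.append(a_prev)

squared = [a * a for a in shocks]
q_stat = ljung_box(squared, 1)
lm_stat = engle_lm_lag1(shocks)
print("Ljung-Box Q(1) on squared shocks:", round(q_stat, 2))
print("Engle LM statistic (m = 1)      :", round(lm_stat, 2))
print("reject H0 at the 5% level       :", lm_stat > CHI2_1_CRIT_5PCT)
```

For $m > 1$ the regression needs a multiple least squares fit and the critical value must be taken from a chi-squared distribution with $m$ degrees of freedom.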
Alternatively, we can use another test for the joint significance of the $m$ coefficients
of the regression for the squared residual series. Let
$SSR_0 = \sum_{t=m+1}^{T}(a_t^2 - \bar{\mu})^2$, where $\bar{\mu}$ is the sample mean of $a_t^2$, and
$SSR_1 = \sum_{t=m+1}^{T}\hat{e}_t^2$,
where $\hat{e}_t$ is the least squares residual of the linear regression in
(1.10). Then we have

$$F = \frac{(SSR_0 - SSR_1)/m}{SSR_1/(T - 2m - 1)},$$

which is asymptotically distributed as a Fisher ($F$) distribution with $(m, T - 2m - 1)$
degrees of freedom under the null hypothesis. If the test statistic $F$
is significant, then conditional heteroskedasticity of $a_t$ is detected, and we use
the PACF of $a_t^2$ to determine the ARCH order. Using the PACF of $a_t^2$ to select
the ARCH order can be justified as follows. From the model in Eq. (1.5), we
have

$$\sigma_t^2 = \omega + \alpha_1 a_{t-1}^2 + \cdots + \alpha_m a_{t-m}^2.$$
For a given sample, $a_t^2$ is an unbiased estimate of $\sigma_t^2$. Therefore, we expect
that $a_t^2$ is linearly related to $a_{t-1}^2, \ldots, a_{t-m}^2$ in a manner similar to that of an
autoregressive model of order $m$.

Footnote 5: They are the region of acceptance and the region of rejection (the critical region).

Note that a single $a_t^2$ is generally not an