
Chapter 9: Nonseasonal Box-Jenkins Models

The concepts of ‘stationary time series’ and
‘nonstationary time series’ are important in the
Box-Jenkins methodology.

Stationary time series


A time series $\{y_t\}$ is said to be stationary if the
following two conditions are satisfied:
(a) the mean function is constant over time, i.e.,
$\mu_t = E(y_t) = c$ for all $t$;
(b) the autocorrelations
$\rho_{t,s} = \mathrm{cov}(y_t, y_s)/\sqrt{\mathrm{var}(y_t)\,\mathrm{var}(y_s)}$
are not functions of time, i.e., $\rho_{t,t-k} = \rho_{0,k} = \rho_k$ for all
time $t$ and lag $k$. This is equivalent to the
condition that the autocovariances $\gamma_{t,s} = \mathrm{cov}(y_t, y_s)$
are independent of time $t$ also, i.e.,
$\gamma_{t,t-k} = \gamma_{0,k} = \gamma_k$ for all $t$ and lag $k$.

In other words, both the autocorrelations $\rho_{t,s}$ and
autocovariances $\gamma_{t,s}$ depend only on the distance
between the two time points $s$ and $t$, but not on the
actual positions of $s$ and $t$.
Note: Since $\gamma_{t,t} = \mathrm{cov}(y_t, y_t) = \mathrm{var}(y_t)$, a stationary
time series also necessarily has a variance that is
constant with respect to $t$.

Nonstationary Time Series


If the $n$ values of $y_t$ do not fluctuate around a
constant mean or do not fluctuate with constant
variation, then it is reasonable to believe the time
series is not stationary.

[Figure: time series plot of a random walk with zero mean, $Z_t$ against time ($t = 1$ to $150$).]

A nonstationary series can be transformed into a
stationary one by first differencing:
$$z_t = \nabla y_t = y_t - y_{t-1}.$$
The Minitab command for differencing is
Stat > Time Series > Difference (lag 1)

(Differencing is like differentiation in calculus.)
$$\nabla y_t = y_t - y_{t-1} \;\Rightarrow\; \nabla y_t = \frac{\nabla y_t}{1} = \frac{y_t - y_{t-1}}{t - (t-1)},$$
which is similar to the definition of the derivative of
a function $f(t)$:

$$f'(t) = \lim_{\Delta \to 0} \frac{f(t+\Delta) - f(t)}{t + \Delta - t} = \lim_{\Delta \to 0} \frac{f(t+\Delta) - f(t)}{\Delta}.$$
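Outside Minitab, the same differencing operation can be sketched in Python with pandas; this is a minimal illustration with an arbitrary placeholder series, not the course data:

import pandas as pd

# Placeholder series; in practice this would hold the observed data.
y = pd.Series([15.0, 15.6, 16.1, 15.9, 16.8, 17.2, 16.9, 17.8])

z1 = y.diff()          # first differences: z_t = y_t - y_{t-1}
z2 = y.diff().diff()   # second differences: (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})

print(z1.dropna().values)
print(z2.dropna().values)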

[Figure: time series plot of paper towel sales $y$, $t = 1$ to $120$.]

After first differencing

[Figure: time series plot of the first differences of the paper towel sales series, $t = 1$ to $120$.]

If this is not sufficient, taking second differences (the
first differences of the first differences) of the
original series values normally does the job:
$$z_t = \nabla^2 y_t = \nabla y_t - \nabla y_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}).$$
If a time series plot indicates increasing variability,
it is common to transform the series first, using either a
square root, quartic root or logarithmic
transformation, and then take first differences.
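A minimal Python sketch of "transform first, then difference", using arbitrary placeholder numbers whose level and spread both grow over time:

import numpy as np
import pandas as pd

# Placeholder series with increasing level and variability.
y = pd.Series([100, 120, 150, 190, 240, 310, 400, 520], dtype=float)

log_y = np.log(y)          # stabilise the increasing variability
d1_log_y = log_y.diff()    # then take first differences

print(d1_log_y.dropna().round(3).values)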

Example:
Consider the following NCR (New Company
Registrations) rates data given below:

[Figure: time series plot of NCR, $t = 1$ to $36$.]

The series is clearly not stationary since it has a
trend and increasing variability, which means both
$E(y_t)$ and $\mathrm{var}(y_t)$ depend on the time
variable $t$.

[Figure: time series plot of lnNCR (the log-transformed NCR series), $t = 1$ to $36$.]

Clearly the log transformation has stabilised the


variance somewhat.

Applying first differencing to the logged series:


[Figure: time series plot of d1lnNCR (first differences of lnNCR), $t = 1$ to $36$.]

It now appears that the resulting series is stationary.

Working Series
The textbook uses $z_b, z_{b+1}, \ldots, z_n$ as the ‘working
series’ obtained from the original series by
transformation or differencing.
$b = 2$ if $z_t = y_t - y_{t-1}$.

Sample autocorrelation coefficient (SAC)


The sample autocorrelation at lag $k$ is
$$r_k = \frac{\sum_{t=b}^{n-k} (z_t - \bar{z})(z_{t+k} - \bar{z})}{\sum_{t=b}^{n} (z_t - \bar{z})^2}$$
where
$$\bar{z} = \sum_{t=b}^{n} z_t \,/\, (n - b + 1).$$

The standard error of $r_k$ is
$$s_{r_k} =
\begin{cases}
\dfrac{1}{(n-b+1)^{1/2}}, & \text{if } k = 1,\\[2ex]
\dfrac{\left(1 + 2\sum_{j=1}^{k-1} r_j^2\right)^{1/2}}{(n-b+1)^{1/2}}, & \text{if } k = 2, 3, \ldots
\end{cases}$$

The $t_{r_k}$-statistic is
$$t_{r_k} = \frac{r_k}{s_{r_k}}.$$
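A minimal Python sketch of these formulas, assuming the working series is held in a NumPy array z (the helper name sample_acf_stats is made up for illustration; here n plays the role of n - b + 1, the number of working-series values):

import numpy as np

def sample_acf_stats(z, max_lag):
    """Return (r_k, s_rk, t_rk) for k = 1, ..., max_lag using the formulas above."""
    z = np.asarray(z, dtype=float)
    n = len(z)                      # number of working-series observations (n - b + 1)
    zbar = z.mean()
    denom = np.sum((z - zbar) ** 2)
    results, r_values = [], []
    for k in range(1, max_lag + 1):
        r_k = np.sum((z[:n - k] - zbar) * (z[k:] - zbar)) / denom
        s_rk = np.sqrt((1.0 + 2.0 * np.sum(np.square(r_values))) / n)
        t_rk = r_k / s_rk
        results.append((r_k, s_rk, t_rk))
        r_values.append(r_k)
    return results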

The SAC graph is a graph of the sample autocorrelations
(Minitab calls it the ACF plot):

[Figure: autocorrelation function for y (original towel sales), with 5% significance limits, lags 1 to 30.]

Spikes
We say that a spike exists at lag $k$ if $r_k$ is
statistically large, say $t_{r_k} = r_k / s_{r_k} > 2$ in absolute
value.
In the Minitab ACF graph, any $r_k$ that extends above or below
the confidence bands is considered to be a spike, so
you do not need to compute the value of $t_{r_k}$.

Cuts off after k


We say that the SAC cuts off after lag $k$ if there are no
spikes at lags greater than $k$ in the SAC.

Using the SAC to find a stationary time series


For nonseasonal data:
(i) If the SAC of the time series either cuts off fairly
quickly or dies down fairly quickly, then
the series is considered stationary.
(ii) If the SAC of the time series dies down extremely
slowly, then the series is considered
nonstationary.
Note that the SAC of the towel sales series fails to
die down quickly, which is a clear sign that the
series is nonstationary.

Sample partial autocorrelation $r_{kk}$

This can be thought of as the sample autocorrelation of
time series observations separated by a lag of $k$
time units, with the effects of the intervening
observations eliminated.

In other words, this measure of correlation is used
to identify the extent of the relationship between
current values of a variable and earlier values of
the same variable (values at various time lags)
while holding the effects of all other time lags
constant.
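Outside Minitab, the SAC and SPAC graphs can be produced with statsmodels; a brief sketch, where random numbers stand in for the working (differenced) series z:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Placeholder working series; replace with the differenced data.
rng = np.random.default_rng(0)
z = rng.normal(size=120)

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(z, lags=30, ax=axes[0])    # SAC with approximate 5% limits
plot_pacf(z, lags=30, ax=axes[1])   # SPAC with approximate 5% limits
plt.show()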

Consider now the differenced series of the towel
sales
[Figure: autocorrelation function for z (the differenced towel sales series), with 5% significance limits, lags 1 to 30.]

Here the SAC has a spike at lag 1 and cuts off after
lag 1, so the differenced series is stationary.

Simple Stationary Time Series Models – ARMA

Let $\{a_t\}$ be a sequence of random shocks which
describes the effect of all factors other than
$z_{t-1}$ on $z_t$. It is more or less the residual error of
the forecast (if the residuals $e_t$ are not independent,
then we cannot treat $e_t$ as $a_t$).
Note: Most textbooks call $\{a_t\}$ white noise.

Properties of $\{a_t\}$:
(i) $a_1, a_2, a_3, \ldots$ are independent
(ii) $a_i \sim N(0, \sigma_a^2)$
(iii) $a_{t+1}$ is independent of $y_t, y_{t-1}, \ldots$

$\{a_t\}$ plays a very important role in the Box-Jenkins
methodology. Essentially, every stationary Box-
Jenkins model can be expressed in terms of the
white noise process.

Simple Box-Jenkins Models

Moving Average Models

$$z_t = a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$
and we refer to it as a moving average process of order
$q$, denoted by MA($q$). (Note that, structurally
speaking, MA($q$) is expressed as an averaging of the $a_t$
terms, apart from the negative signs.)

The special case:

MA(1)

$$z_t = a_t - \theta_1 a_{t-1}$$

$E(z_t) = 0$
$\mathrm{var}(z_t) = \sigma_a^2 (1 + \theta_1^2)$
$\mathrm{cov}(z_t, z_{t+1}) = -\theta_1 \sigma_a^2$
$\mathrm{cov}(z_t, z_{t+k}) = 0$ for $k \ge 2$

Thus $\rho_1 = \dfrac{-\theta_1}{1 + \theta_1^2}$ and all other $\rho_k$ are zero.
(Make sure you know how to derive the above.)

Hence the TAC of an MA(1) “cuts off” after lag 1.
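This cut-off can be checked by simulation with statsmodels' ArmaProcess; a hedged sketch with an arbitrary illustrative value theta1 = 0.6 (ArmaProcess takes the full lag polynomials, so the MA polynomial 1 - theta1*B is entered as [1, -0.6]):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

theta1 = 0.6  # illustrative value in the text's convention z_t = a_t - theta1*a_{t-1}

ma1 = ArmaProcess(ar=[1.0], ma=[1.0, -theta1])   # MA polynomial 1 - theta1*B

print(ma1.acf(lags=5))              # theoretical ACF: rho_1 = -theta1/(1 + theta1**2), rest zero

z = ma1.generate_sample(nsample=500)  # a simulated MA(1) series for a sample ACF check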

MA(2)

$$z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2}$$
$E(z_t) = 0$
$\mathrm{var}(z_t) = \sigma_a^2 (1 + \theta_1^2 + \theta_2^2)$
$\mathrm{cov}(z_t, z_{t+1}) = (-\theta_1 + \theta_1\theta_2)\,\sigma_a^2$
$\mathrm{cov}(z_t, z_{t+2}) = -\theta_2\,\sigma_a^2$
$\mathrm{cov}(z_t, z_{t+k}) = 0$ for $k \ge 3$

$$\rho_1 = \frac{-\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}, \qquad
\rho_2 = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2},$$
and all other $\rho_k$ are zero.
Thus the TAC of an MA(2) “cuts off” after lag 2.

In general, for MA($q$):
(i) $\rho_k \ne 0$ for $k = 1, 2, \ldots, q$ and $\rho_k = 0$ for $k > q$
(ii) the PAC dies down

Autoregressive Models

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$$

Here the $z_t$ are regressed on themselves (hence, of
course, the name), but lagged by various amounts.
The simplest case is the first order, denoted AR(1),
which takes the form
$$z_t = \phi_1 z_{t-1} + a_t$$

$E(z_t) = 0$
$\mathrm{var}(z_t) = \gamma_0 = \dfrac{\sigma_a^2}{1 - \phi_1^2}$,
so $|\phi_1| < 1$ is required to ensure stationarity
$\gamma_k = \phi_1 \gamma_{k-1}$
$\rho_k = \phi_1^k$

Thus $\rho_k$ “dies down” exponentially as $k$ increases,
oscillating if $\phi_1 < 0$. Thus if the TAC of a series
dies down rather than cuts off, we suspect it to be
an AR rather than an MA.
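A short companion sketch to the MA(1) example above, again using ArmaProcess with an arbitrary phi1 = 0.7 (the AR polynomial 1 - phi1*B is entered as [1, -0.7]):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

phi1 = 0.7  # illustrative AR(1) coefficient

ar1 = ArmaProcess(ar=[1.0, -phi1], ma=[1.0])

print(ar1.acf(lags=6))            # theoretical ACF of an AR(1)
print(phi1 ** np.arange(6))       # the geometric "dies down" pattern rho_k = phi1**k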

Note that AR and MA series are not entirely
unrelated. It can be shown that an AR(1) can be
expressed as an “infinite” MA series, much like the
general linear process. The MA(1) can similarly be
expressed as an “infinite” AR series.

Note: a linear process is a time series that has the
form
$$y_t = a_t + \psi_1 a_{t-1} + \psi_2 a_{t-2} + \cdots$$

The AR(2) can be written as
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + a_t$$
$$\rho_1 = \frac{\phi_1}{1 - \phi_2}, \qquad
\rho_2 = \phi_1 \rho_1 + \phi_2, \qquad
\rho_3 = \phi_1 \rho_2 + \phi_2 \rho_1,$$
etc.

Thus again the TAC dies down rather than cuts off,
though it is difficult at times to tell the difference in
TAC’s between AR(1) and AR(2).

The TPAC has nonzero partial autocorrelations at lags 1
and 2 and zero partial autocorrelations at all lags after
lag 2, i.e., it cuts off after lag 2.

In general, for AR($p$), the TAC dies down and the TPAC
cuts off after lag $p$.

ARMA(p, q)
Mixed autoregressive-moving average models

The model can be written as
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
or
$$z_t - \phi_1 z_{t-1} - \phi_2 z_{t-2} - \cdots - \phi_p z_{t-p} = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q},$$
i.e., we move the autoregressive part to the left-hand side
and keep the moving average part on the right.

ARMA(1, 1)

$$z_t = \phi_1 z_{t-1} + a_t - \theta_1 a_{t-1}$$

$$\rho_k = \frac{(1 - \theta_1\phi_1)(\phi_1 - \theta_1)}{1 - 2\theta_1\phi_1 + \theta_1^2}\,\phi_1^{k-1}, \qquad k \ge 1,$$

i.e., the TAC dies down exponentially from $\rho_1$ (not
from $\rho_0 = 1$).

The TPAC also dies down exponentially.

Summary

We can therefore tentatively produce a Model


Identification Chart, as follows, based on the
behaviours of the SAC and SPAC of a stationary
series.

SAC behaviour        SPAC behaviour       Tentative model
Cuts off after 1     Dies down            MA(1)
Cuts off after 2     Dies down            MA(2)
Dies down            Cuts off after 1     AR(1)
Dies down            Cuts off after 2     AR(2)
Dies down            Dies down            ARMA(1, 1)

This looks relatively obvious, but isn’t as easy in
practice as it appears. Note that no process has
ACF and PACF that both cut off.

Box-Jenkins Models with a nonzero constant term

MA(q):
$$z_t = \delta + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$
$$E(z_t) = \mu = \delta$$

AR(p):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$$
$$\delta = \mu(1 - \phi_1 - \phi_2 - \cdots - \phi_p) \;\Rightarrow\; \mu = \delta/(1 - \phi_1 - \phi_2 - \cdots - \phi_p)$$

ARMA(p, q):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
$$\delta = \mu(1 - \phi_1 - \phi_2 - \cdots - \phi_p)$$

Time Series Operations and Representation of
ARMA (p,q) Models.

Backshift Operator
$$B y_t = y_{t-1}$$
(Push the time series back to the previous position.)

Difference operator
$\nabla = 1 - B$, so $\nabla y_t = (1 - B) y_t = y_t - y_{t-1}$. Thus, $\nabla$ is
generally known as the differencing operator.

$$\nabla^2 y_t = \nabla(\nabla y_t) = \nabla(y_t - y_{t-1}) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}$$

Also $\nabla^d = (1 - B)^d$.
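A tiny numeric sketch verifying that second differencing matches $y_t - 2y_{t-1} + y_{t-2}$ (the array values are arbitrary):

import numpy as np

y = np.array([3.0, 5.0, 8.0, 12.0, 17.0, 23.0])

second_diff = np.diff(y, n=2)               # nabla^2 y_t = (1 - B)^2 y_t
expanded = y[2:] - 2 * y[1:-1] + y[:-2]     # y_t - 2*y_{t-1} + y_{t-2}

print(np.allclose(second_diff, expanded))   # True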

Representation of an ARMA(p, q) model:


AR(p)

$$z_t = \delta + \phi_1 z_{t-1} + \cdots + \phi_p z_{t-p} + a_t \;\Rightarrow\; z_t - \phi_1 z_{t-1} - \cdots - \phi_p z_{t-p} = \delta + a_t,$$

which can also be written as
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) z_t = \delta + a_t.$$

Define $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$,
so
$$\phi_p(B) z_t = \delta + a_t.$$

MA(q) – moving average model of order q

The model is written as
$$z_t = \delta + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q},$$
which can also be written as
$$z_t = \delta + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t.$$

Define
$$\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q,$$
then
$$z_t = \delta + \theta_q(B) a_t.$$

ARMA (p, q)—Mixed autoregressive-moving


average model of order (p, q):

$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
or
$$z_t - \phi_1 z_{t-1} - \phi_2 z_{t-2} - \cdots - \phi_p z_{t-p} = \delta + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) z_t = \delta + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t$$
or
$$\phi_p(B) z_t = \delta + \theta_q(B) a_t \qquad (*)$$
where $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$.

In this notation, ARMA(p, 0) = AR(p) and
ARMA(0, q) = MA(q).

In such cases one would prefer to write AR(p) and


MA(q) instead of ARMA(p, 0) and ARMA(0, q).

Point Estimates of the model parameters
Having identified a tentative ARMA model, we
must now fit it to the dataset concerned, and in so doing
obtain estimates of the parameters defined by the
model. For the ARMA(p, q) model, the
parameters are the $\theta_i$, the $\phi_i$ and $\delta$ (if the constant term is
required).

These parameters are usually estimated by the least
squares method (as we understand it, both Minitab
and SAS use this approach).
The least squares method essentially finds the estimates so
that $SSE = \sum (y_t - \hat{y}_t)^2$ is minimised.

You do not need to know the detailed algorithm.
Isn’t it nice that the computer packages do it for us?
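For reference, a hedged sketch of the same fit in Python with statsmodels (the file name towel_sales.csv and column y are placeholders for the 120 observations in Table 9.1; statsmodels estimates by maximum likelihood rather than the least squares routine described above, so the numbers may differ slightly from Minitab's):

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Placeholder: load the 120 paper towel sales values (Table 9.1).
towel_sales = pd.read_csv("towel_sales.csv")["y"]

# ARIMA(0, 1, 1): one regular difference plus an MA(1) term, no constant.
model = ARIMA(towel_sales, order=(0, 1, 1), trend="n")
fit = model.fit()

# Note: statsmodels writes the MA part as +theta*a_{t-1}, so its reported
# coefficient has the opposite sign to the text's -theta_1 parameterisation.
print(fit.summary())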

Forecasts
What is the meaning of forecasting?
$\hat{y}_{t+\tau}(t)$ is a point forecast of the series at time $t + \tau$,
given that the series has been observed from time 1 to $t$.
Statistically speaking,
$$\hat{y}_{t+\tau}(t) = E(y_{t+\tau} \mid y_1, y_2, \ldots, y_t).$$

Since ARMA models are built upon the series $\{a_t\}$, the
properties of $\{a_t\}$ need to be revisited. In
particular, $a_1, a_2, a_3, \ldots$ are independent, and
future values of the $a$'s are independent of the present
and past values of the $y$'s, i.e., $a_{t+1}$ is independent
of $y_t, y_{t-1}, \ldots$.

Example: Paper Towel Sales

It is found that the differenced series can be fitted
by an MA(1), so
$$z_t = a_t - \theta_1 a_{t-1}$$
(assuming $\delta = 0$).

Since $z_t = y_t - y_{t-1}$, we have

$$y_t - y_{t-1} = a_t - \theta_1 a_{t-1} \;\Rightarrow\; y_t = y_{t-1} + a_t - \theta_1 a_{t-1}.$$
(This is known as the difference-equation form.)
One-step forecast:

First, we have $y_{t+1} = y_t + a_{t+1} - \theta_1 a_t$, so
$$\hat{y}_{t+1}(t) = E(y_{t+1} \mid y_1, y_2, \ldots, y_t)
= E(y_t + a_{t+1} - \theta_1 a_t \mid y_1, \ldots, y_t)
= y_t + 0 - \hat{\theta}_1 \hat{a}_t = y_t - \hat{\theta}_1 \hat{a}_t,$$
since $a_{t+1}$ is independent of $y_1, \ldots, y_t$,
so $E(a_{t+1} \mid y_1, y_2, \ldots, y_t) = E(a_{t+1}) = 0$.

Let $t = 120$ and $\tau = 1$, so
$$\hat{y}_{121}(120) = y_{120} - \hat{\theta}_1 \hat{a}_{120}.$$

In the absorbent towel sales example given in Table
9.1, Minitab gives $\hat{\theta}_1 = -0.3544$:

Final Estimates of Parameters

Type    Coef      SE Coef    T       P
MA 1    -0.3544   0.0864     -4.10   0.000

Differencing: 1 regular difference


Number of observations: Original series 120, after
differencing 119
Residuals: SS = 127.367 (backforecasts excluded)
MS = 1.079 DF = 118

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48
Chi-Square 10.3 18.6 27.5 41.2
DF 11 23 35 47
P-Value 0.500 0.725 0.815 0.710

The last two residuals are $e_{119} = -1.0890$ and
$e_{120} = 0.6903$, so $\hat{a}_{119} = -1.0890$ and $\hat{a}_{120} = 0.6903$.

Thus
$$\hat{y}_{121}(120) = 15.6453 + 0.3544 \times 0.6903 = 15.8899.$$
Using Minitab to forecast, we get
Forecasts from period 120

95 Percent
Limits
Period Forecast Lower Upper Actual
121 15.8899 13.8532 17.9267

which is identical.

Two-step forecast:
$$y_{t+2} = y_{t+1} + a_{t+2} - \theta_1 a_{t+1} \;\Rightarrow\;
\hat{y}_{t+2}(t) = \hat{y}_{t+1}(t) + E(a_{t+2}) - \hat{\theta}_1 E(a_{t+1}) = \hat{y}_{t+1}(t)$$

Again, let $t = 120$;
then $\hat{y}_{122}(120) = \hat{y}_{121}(120) = 15.8899$.

However, the prediction interval is wider:

Forecasts from period 120

95 Percent
Limits
Period Forecast Lower Upper Actual
121 15.8899 13.8532 17.9267
122 15.8899 12.4609 19.3189
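The same point forecasts and 95% limits can be reproduced with the hedged statsmodels sketch from the estimation section (again assuming the placeholder file towel_sales.csv; small numerical differences from Minitab are possible because the estimation method differs):

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

towel_sales = pd.read_csv("towel_sales.csv")["y"]
fit = ARIMA(towel_sales, order=(0, 1, 1), trend="n").fit()

forecast = fit.get_forecast(steps=2)        # forecasts from period 120 for periods 121 and 122
print(forecast.predicted_mean)              # point forecasts (equal at both steps for this model)
print(forecast.conf_int(alpha=0.05))        # 95% limits, wider at the second step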

Finally, in ARIMA notation, we may write our


model that fits the original series as

ARIMA(0,1,1).

