
Chapter 9: Nonseasonal Box-Jenkins Models

The concepts of ‘stationary time series’ and
‘nonstationary time series’ are important in the
Box-Jenkins methodology.

Stationary time series


A time series $\{y_t\}$ is said to be stationary if the
following two conditions are satisfied:
(a) the mean function is constant over time, i.e.,
$\mu_t = E(y_t) = c$ for all $t$;
(b) the autocorrelations
$\rho_{t,s} = \mathrm{cov}(y_t, y_s)/\sqrt{\mathrm{var}(y_t)\,\mathrm{var}(y_s)}$
are not functions of time, i.e., $\rho_{t,t-k} = \rho_{0,k} = \rho_k$ for all
time $t$ and lag $k$. This is equivalent to the
condition that the autocovariances $\gamma_{t,s} = \mathrm{cov}(y_t, y_s)$
are independent of time $t$ also, i.e.,
$\gamma_{t,t-k} = \gamma_{0,k} = \gamma_k$ for all $t$ and lag $k$.

In other words, both the autocorrelations $\rho_{t,s}$ and
autocovariances $\gamma_{t,s}$ depend only on the distance
between the two time points $s$ and $t$, but not on the
actual positions of $s$ and $t$.
Note: Since $\gamma_{t,t} = \mathrm{cov}(y_t, y_t) = \mathrm{var}(y_t)$, a stationary
time series also necessarily has a variance that is
constant with respect to $t$.

Nonstationary Time Series


If the $n$ values of $y_t$ do not fluctuate around a
constant mean or do not fluctuate with constant
variation, then it is reasonable to believe the time
series is not stationary.

[Figure: time series plot of a random walk with zero mean, $Z_t$ against time ($t = 1$ to $150$).]

A nonstationary series can be transformed into a
stationary one by first differencing:
$$z_t = \nabla y_t = y_t - y_{t-1}.$$
The Minitab command for differencing is
Stat > Time Series > Difference (lag 1)

(Differencing is like differentiation in calculus.)
$$\nabla y_t = y_t - y_{t-1} \;\Rightarrow\; \nabla y_t = \frac{\nabla y_t}{1} = \frac{y_t - y_{t-1}}{t - (t-1)},$$
which is similar to the definition of the derivative of
a function $f(t)$:

$$f'(t) = \lim_{\Delta \to 0} \frac{f(t+\Delta) - f(t)}{t + \Delta - t} = \lim_{\Delta \to 0} \frac{f(t+\Delta) - f(t)}{\Delta}.$$
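Outside Minitab, the same differencing operation can be sketched in Python with pandas; this is a minimal illustration with an arbitrary placeholder series, not the course data:

import pandas as pd

# Placeholder series; in practice this would hold the observed data.
y = pd.Series([15.0, 15.6, 16.1, 15.9, 16.8, 17.2, 16.9, 17.8])

z1 = y.diff()          # first differences: z_t = y_t - y_{t-1}
z2 = y.diff().diff()   # second differences: (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})

print(z1.dropna().values)
print(z2.dropna().values)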

[Figure: time series plot of paper towel sales $y$, $t = 1$ to $120$.]

After first differencing

[Figure: time series plot of the first differences of the paper towel sales series, $t = 1$ to $120$.]

If this is not sufficient, taking second differences (the
first differences of the first differences) of the
original series values normally does the job:
$$z_t = \nabla^2 y_t = \nabla y_t - \nabla y_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}).$$
If a time series plot indicates increasing variability,
it is common to transform the series first, using either a
square root, quartic root or logarithmic
transformation, and then take first differences.
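A minimal Python sketch of "transform first, then difference", using arbitrary placeholder numbers whose level and spread both grow over time:

import numpy as np
import pandas as pd

# Placeholder series with increasing level and variability.
y = pd.Series([100, 120, 150, 190, 240, 310, 400, 520], dtype=float)

log_y = np.log(y)          # stabilise the increasing variability
d1_log_y = log_y.diff()    # then take first differences

print(d1_log_y.dropna().round(3).values)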

Example:
Consider the following NCR (New Company
Registrations) rates data given below:

[Figure: time series plot of NCR, $t = 1$ to $36$.]

The series is clearly not stationary since it has a
trend and increasing variability, which means both
$E(y_t)$ and $\mathrm{var}(y_t)$ depend on the time
variable $t$.

[Figure: time series plot of lnNCR (the log-transformed NCR series), $t = 1$ to $36$.]

Clearly the log transformation has stabilised the


variance somewhat.

Applying first differencing to the logged series:


[Figure: time series plot of d1lnNCR (first differences of lnNCR), $t = 1$ to $36$.]

It now appears that the resulting series is stationary.

Working Series
The textbook uses $z_b, z_{b+1}, \ldots, z_n$ as the ‘working
series’ obtained from the original series by
transformation or differencing.
$b = 2$ if $z_t = y_t - y_{t-1}$.

Sample autocorrelation coefficient (SAC)


The sample autocorrelation at lag $k$ is
$$r_k = \frac{\sum_{t=b}^{n-k} (z_t - \bar{z})(z_{t+k} - \bar{z})}{\sum_{t=b}^{n} (z_t - \bar{z})^2}$$
where
$$\bar{z} = \sum_{t=b}^{n} z_t \,/\, (n - b + 1).$$

The standard error of $r_k$ is
$$s_{r_k} =
\begin{cases}
\dfrac{1}{(n-b+1)^{1/2}}, & \text{if } k = 1,\\[2ex]
\dfrac{\left(1 + 2\sum_{j=1}^{k-1} r_j^2\right)^{1/2}}{(n-b+1)^{1/2}}, & \text{if } k = 2, 3, \ldots
\end{cases}$$

The $t_{r_k}$-statistic is
$$t_{r_k} = \frac{r_k}{s_{r_k}}.$$
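A minimal Python sketch of these formulas, assuming the working series is held in a NumPy array z (the helper name sample_acf_stats is made up for illustration; here n plays the role of n - b + 1, the number of working-series values):

import numpy as np

def sample_acf_stats(z, max_lag):
    """Return (r_k, s_rk, t_rk) for k = 1, ..., max_lag using the formulas above."""
    z = np.asarray(z, dtype=float)
    n = len(z)                      # number of working-series observations (n - b + 1)
    zbar = z.mean()
    denom = np.sum((z - zbar) ** 2)
    results, r_values = [], []
    for k in range(1, max_lag + 1):
        r_k = np.sum((z[:n - k] - zbar) * (z[k:] - zbar)) / denom
        s_rk = np.sqrt((1.0 + 2.0 * np.sum(np.square(r_values))) / n)
        t_rk = r_k / s_rk
        results.append((r_k, s_rk, t_rk))
        r_values.append(r_k)
    return results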

The SAC graph is a graph of the sample autocorrelations
(Minitab calls it the ACF plot):

[Figure: autocorrelation function for y (original towel sales), with 5% significance limits, lags 1 to 30.]

Spikes
We say that a spike exists at lag $k$ if $r_k$ is
statistically large, say $t_{r_k} = r_k / s_{r_k} > 2$ in absolute
value.
In the Minitab ACF graph, any $r_k$ that extends above or below
the confidence bands is considered to be a spike, so
you do not need to compute the value of $t_{r_k}$.

Cuts off after k


We say that the SAC cuts off after lag $k$ if there are no
spikes at lags greater than $k$ in the SAC.

Using the SAC to find a stationary time series


For nonseasonal data:
(i) If the SAC of the time series either cuts off fairly
quickly or dies down fairly quickly, then
the series is considered stationary.
(ii) If the SAC of the time series dies down extremely
slowly, then the series is considered
nonstationary.
Note that the SAC of the towel sales series fails to
die down quickly, which is a clear sign that the
series is nonstationary.

Sample partial autocorrelation $r_{kk}$

This can be thought of as the sample autocorrelation of
time series observations separated by a lag of $k$
time units, with the effects of the intervening
observations eliminated.

In other words, this measure of correlation is used
to identify the extent of the relationship between
current values of a variable and earlier values of
the same variable (values at various time lags)
while holding the effects of all other time lags
constant.
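Outside Minitab, the SAC and SPAC graphs can be produced with statsmodels; a brief sketch, where random numbers stand in for the working (differenced) series z:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Placeholder working series; replace with the differenced data.
rng = np.random.default_rng(0)
z = rng.normal(size=120)

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(z, lags=30, ax=axes[0])    # SAC with approximate 5% limits
plot_pacf(z, lags=30, ax=axes[1])   # SPAC with approximate 5% limits
plt.show()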

Consider now the differenced series of the towel
sales
[Figure: autocorrelation function for z (the differenced towel sales series), with 5% significance limits, lags 1 to 30.]

Here the SAC has a spike at lag 1 and cuts off after
lag 1, so the differenced series is stationary.

Simple Stationary Time Series Models – ARMA

Let $\{a_t\}$ be a sequence of random shocks which
describes the effect of all factors other than
$z_{t-1}$ on $z_t$. It is more or less the residual error of
the forecast (if the residuals $e_t$ are not independent,
then we cannot treat $e_t$ as $a_t$).
Note: Most textbooks call $\{a_t\}$ white noise.

Properties of $\{a_t\}$:
(i) $a_1, a_2, a_3, \ldots$ are independent
(ii) $a_i \sim N(0, \sigma_a^2)$
(iii) $a_{t+1}$ is independent of $y_t, y_{t-1}, \ldots$

$\{a_t\}$ plays a very important role in the Box-Jenkins
methodology. Essentially, every stationary Box-
Jenkins model can be expressed in terms of the
white noise process.

Simple Box-Jenkins Models

Moving Average Models

$$z_t = a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$
and we refer to it as a moving average process of order
$q$, denoted by MA($q$). (Note that, structurally
speaking, MA($q$) is expressed as an averaging of the $a_t$
terms, apart from the negative signs.)

The special case:

MA(1)

$$z_t = a_t - \theta_1 a_{t-1}$$

$E(z_t) = 0$
$\mathrm{var}(z_t) = \sigma_a^2 (1 + \theta_1^2)$
$\mathrm{cov}(z_t, z_{t+1}) = -\theta_1 \sigma_a^2$
$\mathrm{cov}(z_t, z_{t+k}) = 0$ for $k \ge 2$

Thus $\rho_1 = \dfrac{-\theta_1}{1 + \theta_1^2}$ and all other $\rho_k$ are zero.
(Make sure you know how to derive the above.)

Hence the TAC of an MA(1) “cuts off” after lag 1.
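This cut-off can be checked by simulation with statsmodels' ArmaProcess; a hedged sketch with an arbitrary illustrative value theta1 = 0.6 (ArmaProcess takes the full lag polynomials, so the MA polynomial 1 - theta1*B is entered as [1, -0.6]):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

theta1 = 0.6  # illustrative value in the text's convention z_t = a_t - theta1*a_{t-1}

ma1 = ArmaProcess(ar=[1.0], ma=[1.0, -theta1])   # MA polynomial 1 - theta1*B

print(ma1.acf(lags=5))              # theoretical ACF: rho_1 = -theta1/(1 + theta1**2), rest zero

z = ma1.generate_sample(nsample=500)  # a simulated MA(1) series for a sample ACF check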

MA(2)

$$z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2}$$
$E(z_t) = 0$
$\mathrm{var}(z_t) = \sigma_a^2 (1 + \theta_1^2 + \theta_2^2)$
$\mathrm{cov}(z_t, z_{t+1}) = (-\theta_1 + \theta_1\theta_2)\,\sigma_a^2$
$\mathrm{cov}(z_t, z_{t+2}) = -\theta_2\,\sigma_a^2$
$\mathrm{cov}(z_t, z_{t+k}) = 0$ for $k \ge 3$

$$\rho_1 = \frac{-\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}, \qquad
\rho_2 = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2},$$
and all other $\rho_k$ are zero.
Thus the TAC of an MA(2) “cuts off” after lag 2.

In general, for MA($q$):
(i) $\rho_k \ne 0$ for $k = 1, 2, \ldots, q$ and $\rho_k = 0$ for $k > q$
(ii) the PAC dies down

Autoregressive Models

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$$

Here the $z_t$ are regressed on themselves (hence, of
course, the name), but lagged by various amounts.
The simplest case is the first order, denoted AR(1),
which takes the form
$$z_t = \phi_1 z_{t-1} + a_t$$

$E(z_t) = 0$
$\mathrm{var}(z_t) = \gamma_0 = \dfrac{\sigma_a^2}{1 - \phi_1^2}$,
so $|\phi_1| < 1$ is required to ensure stationarity
$\gamma_k = \phi_1 \gamma_{k-1}$
$\rho_k = \phi_1^k$

Thus $\rho_k$ “dies down” exponentially as $k$ increases,
oscillating if $\phi_1 < 0$. Thus if the TAC of a series
dies down rather than cuts off, we suspect it to be
an AR rather than an MA.
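A short companion sketch to the MA(1) example above, again using ArmaProcess with an arbitrary phi1 = 0.7 (the AR polynomial 1 - phi1*B is entered as [1, -0.7]):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

phi1 = 0.7  # illustrative AR(1) coefficient

ar1 = ArmaProcess(ar=[1.0, -phi1], ma=[1.0])

print(ar1.acf(lags=6))            # theoretical ACF of an AR(1)
print(phi1 ** np.arange(6))       # the geometric "dies down" pattern rho_k = phi1**k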

Note that AR and MA series are not entirely
unrelated. It can be shown that an AR(1) can be
expressed as an “infinite” MA series, much like the
general linear process. The MA(1) can similarly be
expressed as an “infinite” AR series.

Note: a linear process is a time series that has the
form
$$y_t = a_t + \psi_1 a_{t-1} + \psi_2 a_{t-2} + \cdots$$

The AR(2) can be written as
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + a_t$$
$$\rho_1 = \frac{\phi_1}{1 - \phi_2}, \qquad
\rho_2 = \phi_1 \rho_1 + \phi_2, \qquad
\rho_3 = \phi_1 \rho_2 + \phi_2 \rho_1,$$
etc.

Thus again the TAC dies down rather than cuts off,
though it is difficult at times to tell the difference in
TAC’s between AR(1) and AR(2).

The TPAC has nonzero partial autocorrelations at lags 1
and 2 and zero partial autocorrelations at all lags after
lag 2, i.e., it cuts off after lag 2.

In general, for AR($p$), the TAC dies down and the TPAC
cuts off after lag $p$.

ARMA(p, q)
Mixed autoregressive-moving average models

The model can be written as
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
or
$$z_t - \phi_1 z_{t-1} - \phi_2 z_{t-2} - \cdots - \phi_p z_{t-p} = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q},$$
i.e., we move the autoregressive part to the left-hand side
and keep the moving average part on the right.

ARMA(1, 1)

$$z_t = \phi_1 z_{t-1} + a_t - \theta_1 a_{t-1}$$

$$\rho_k = \frac{(1 - \theta_1\phi_1)(\phi_1 - \theta_1)}{1 - 2\theta_1\phi_1 + \theta_1^2}\,\phi_1^{k-1}, \qquad k \ge 1,$$

i.e., the TAC dies down exponentially from $\rho_1$ (not
from $\rho_0 = 1$).

The TPAC also dies down exponentially.

Summary

We can therefore tentatively produce a Model


Identification Chart, as follows, based on the
behaviours of the SAC and SPAC of a stationary
series.

SAC behaviour        SPAC behaviour       Tentative model
Cuts off after 1     Dies down            MA(1)
Cuts off after 2     Dies down            MA(2)
Dies down            Cuts off after 1     AR(1)
Dies down            Cuts off after 2     AR(2)
Dies down            Dies down            ARMA(1, 1)

This looks relatively obvious, but isn’t as easy in
practice as it appears. Note that no process has
ACF and PACF that both cut off.

Box-Jenkins Models with a nonzero constant term

MA(q):
$$z_t = \delta + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$
$$E(z_t) = \mu = \delta$$

AR(p):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$$
$$\delta = \mu(1 - \phi_1 - \phi_2 - \cdots - \phi_p) \;\Rightarrow\; \mu = \delta/(1 - \phi_1 - \phi_2 - \cdots - \phi_p)$$

ARMA(p, q):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
$$\delta = \mu(1 - \phi_1 - \phi_2 - \cdots - \phi_p)$$

Time Series Operations and Representation of
ARMA (p,q) Models.

Backshift Operator
$$B y_t = y_{t-1}$$
(Push the time series back to the previous position.)

Difference operator
$\nabla = 1 - B$, so $\nabla y_t = (1 - B) y_t = y_t - y_{t-1}$. Thus, $\nabla$ is
generally known as the differencing operator.

$$\nabla^2 y_t = \nabla(\nabla y_t) = \nabla(y_t - y_{t-1}) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}$$

Also $\nabla^d = (1 - B)^d$.
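A tiny numeric sketch verifying that second differencing matches $y_t - 2y_{t-1} + y_{t-2}$ (the array values are arbitrary):

import numpy as np

y = np.array([3.0, 5.0, 8.0, 12.0, 17.0, 23.0])

second_diff = np.diff(y, n=2)               # nabla^2 y_t = (1 - B)^2 y_t
expanded = y[2:] - 2 * y[1:-1] + y[:-2]     # y_t - 2*y_{t-1} + y_{t-2}

print(np.allclose(second_diff, expanded))   # True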

Representation of an ARMA(p, q) model:


AR(p)

$$z_t = \delta + \phi_1 z_{t-1} + \cdots + \phi_p z_{t-p} + a_t \;\Rightarrow\; z_t - \phi_1 z_{t-1} - \cdots - \phi_p z_{t-p} = \delta + a_t,$$

which can also be written as
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) z_t = \delta + a_t.$$

Define $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$,
so
$$\phi_p(B) z_t = \delta + a_t.$$

MA(q) – moving average model of order q

The model is written as
$$z_t = \delta + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q},$$
which can also be written as
$$z_t = \delta + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t.$$

Define
$$\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q,$$
then
$$z_t = \delta + \theta_q(B) a_t.$$

ARMA (p, q)—Mixed autoregressive-moving


average model of order (p, q):

$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
or
$$z_t - \phi_1 z_{t-1} - \phi_2 z_{t-2} - \cdots - \phi_p z_{t-p} = \delta + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) z_t = \delta + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t$$
or
$$\phi_p(B) z_t = \delta + \theta_q(B) a_t \qquad (*)$$
where $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$.

In this notation, ARMA(p, 0) = AR(p) and
ARMA(0, q) = MA(q).

In such cases one would prefer to write AR(p) and


MA(q) instead of ARMA(p, 0) and ARMA(0, q).

Point Estimates of the model parameters
Having identified a tentative ARMA model, we
must now fit it to the dataset concerned, and in so doing
obtain estimates of the parameters defined by the
model. For the ARMA(p, q) model, the
parameters are the $\theta_i$, the $\phi_i$ and $\delta$ (if the constant term is
required).

These parameters are usually estimated by the least
squares method (as we understand it, both Minitab
and SAS use this approach).
The least squares method essentially finds the estimates so
that $SSE = \sum (y_t - \hat{y}_t)^2$ is minimised.

You do not need to know the detailed algorithm.
Isn’t it nice that the computer packages do it for us?
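For reference, a hedged sketch of the same fit in Python with statsmodels (the file name towel_sales.csv and column y are placeholders for the 120 observations in Table 9.1; statsmodels estimates by maximum likelihood rather than the least squares routine described above, so the numbers may differ slightly from Minitab's):

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Placeholder: load the 120 paper towel sales values (Table 9.1).
towel_sales = pd.read_csv("towel_sales.csv")["y"]

# ARIMA(0, 1, 1): one regular difference plus an MA(1) term, no constant.
model = ARIMA(towel_sales, order=(0, 1, 1), trend="n")
fit = model.fit()

# Note: statsmodels writes the MA part as +theta*a_{t-1}, so its reported
# coefficient has the opposite sign to the text's -theta_1 parameterisation.
print(fit.summary())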

Forecasts
What is the meaning of forecasting?
$\hat{y}_{t+\tau}(t)$ is a point forecast of the series at time $t + \tau$,
given that the series has been observed from time 1 to $t$.
Statistically speaking,
$$\hat{y}_{t+\tau}(t) = E(y_{t+\tau} \mid y_1, y_2, \ldots, y_t).$$

Since ARMA models are built upon the series $\{a_t\}$, the
properties of $\{a_t\}$ need to be revisited. In
particular, $a_1, a_2, a_3, \ldots$ are independent, and
future values of the $a$'s are independent of the present
and past values of the $y$'s, i.e., $a_{t+1}$ is independent
of $y_t, y_{t-1}, \ldots$.

Example: Paper Towel Sales

It is found that the differenced series can be fitted
by an MA(1), so
$$z_t = a_t - \theta_1 a_{t-1}$$
(assuming $\delta = 0$).

Since $z_t = y_t - y_{t-1}$, we have

$$y_t - y_{t-1} = a_t - \theta_1 a_{t-1} \;\Rightarrow\; y_t = y_{t-1} + a_t - \theta_1 a_{t-1}.$$
(This is known as the difference-equation form.)
One-step forecast:

First, we have $y_{t+1} = y_t + a_{t+1} - \theta_1 a_t$, so
$$\hat{y}_{t+1}(t) = E(y_{t+1} \mid y_1, y_2, \ldots, y_t)
= E(y_t + a_{t+1} - \theta_1 a_t \mid y_1, \ldots, y_t)
= y_t + 0 - \hat{\theta}_1 \hat{a}_t = y_t - \hat{\theta}_1 \hat{a}_t,$$
since $a_{t+1}$ is independent of $y_1, \ldots, y_t$,
so $E(a_{t+1} \mid y_1, y_2, \ldots, y_t) = E(a_{t+1}) = 0$.

Let $t = 120$ and $\tau = 1$, so
$$\hat{y}_{121}(120) = y_{120} - \hat{\theta}_1 \hat{a}_{120}.$$

In the absorbent towel sales example given in Table
9.1, Minitab gives $\hat{\theta}_1 = -0.3544$:

Final Estimates of Parameters

Type    Coef      SE Coef    T       P
MA 1    -0.3544   0.0864     -4.10   0.000

Differencing: 1 regular difference


Number of observations: Original series 120, after
differencing 119
Residuals: SS = 127.367 (backforecasts excluded)
MS = 1.079 DF = 118

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag 12 24 36 48
Chi-Square 10.3 18.6 27.5 41.2
DF 11 23 35 47
P-Value 0.500 0.725 0.815 0.710

The last two residuals are $e_{119} = -1.0890$ and
$e_{120} = 0.6903$, so $\hat{a}_{119} = -1.0890$ and $\hat{a}_{120} = 0.6903$.

Thus
$$\hat{y}_{121}(120) = 15.6453 + 0.3544 \times 0.6903 = 15.8899.$$
Using Minitab to forecast, we get
Forecasts from period 120

95 Percent
Limits
Period Forecast Lower Upper Actual
121 15.8899 13.8532 17.9267

which is identical.

Two-step forecast:
$$y_{t+2} = y_{t+1} + a_{t+2} - \theta_1 a_{t+1} \;\Rightarrow\;
\hat{y}_{t+2}(t) = \hat{y}_{t+1}(t) + E(a_{t+2}) - \hat{\theta}_1 E(a_{t+1}) = \hat{y}_{t+1}(t)$$

Again, let $t = 120$;
then $\hat{y}_{122}(120) = \hat{y}_{121}(120) = 15.8899$.

However, the prediction interval is wider:

Forecasts from period 120

95 Percent
Limits
Period Forecast Lower Upper Actual
121 15.8899 13.8532 17.9267
122 15.8899 12.4609 19.3189
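The same point forecasts and 95% limits can be reproduced with the hedged statsmodels sketch from the estimation section (again assuming the placeholder file towel_sales.csv; small numerical differences from Minitab are possible because the estimation method differs):

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

towel_sales = pd.read_csv("towel_sales.csv")["y"]
fit = ARIMA(towel_sales, order=(0, 1, 1), trend="n").fit()

forecast = fit.get_forecast(steps=2)        # forecasts from period 120 for periods 121 and 122
print(forecast.predicted_mean)              # point forecasts (equal at both steps for this model)
print(forecast.conf_int(alpha=0.05))        # 95% limits, wider at the second step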

Finally, in ARIMA notation, we may write our


model that fits the original series as

ARIMA(0,1,1).

