THE BOX-JENKINS (ARIMA) METHODOLOGY
BOX-JENKINS METHODOLOGY
The Box-Jenkins methodology does not assume any particular pattern in the historical data.
It uses an iterative approach of identifying a possible model from a general class of models.
The chosen model is then checked against the historical data.
The model fits well if the residuals are small and randomly distributed.
If the specified model is not satisfactory, the process is repeated using a new model.
The final model is used for forecasting.
Autoregressive Models
$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t$
Where:
Yt = the response (dependent) variable at time t
Forecasting with the Model
Step 4: Use the model for forecasting.
[Figure: time series plot of Closing Average; x-axis: Day]
[Figure: autocorrelation plot for Closing Average; x-axis: Lag]
The time series is nonstationary.
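The autocorrelations behind a plot like this can be computed directly. The following is a minimal sketch in plain Python, assuming the usual estimator (lag-k cross products of deviations from the mean, divided by the total sum of squared deviations). The function names and the rough ±2/√n band are illustrative assumptions; statistical packages may compute their significance limits differently.

```python
import math

def sample_acf(series, max_lag):
    """Return the sample autocorrelations [r_1, ..., r_max_lag]."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    c0 = sum(d * d for d in dev)  # sum of squared deviations about the mean
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        acf.append(ck / c0)
    return acf

def rough_limit(n):
    """Approximate 5% limit for a single autocorrelation (an assumption)."""
    return 2.0 / math.sqrt(n)

# A pure trend (hypothetical data) gives autocorrelations that start near 1
# and decline slowly, the typical signature of a nonstationary series.
trend = [float(t) for t in range(1, 61)]
trend_acf = sample_acf(trend, 5)
print([round(r, 3) for r in trend_acf], "limit:", round(rough_limit(60), 3))
```

Autocorrelations that stay far outside the band for many lags, as in the trend example, point toward differencing before a model is identified.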
[Figure: Time Series Plot of First Differences; y-axis: First Differences, x-axis: Day]
[Figure: Autocorrelation Function for First Differences; x-axis: Lag]
[Figure: partial autocorrelation plot; x-axis: Lag]
ARIMA(1, 1, 0)
√
[Figure: Partial Autocorrelation Function for First Differences; x-axis: Lag]
ARIMA(0, 1, 1)
???
[Figure: autocorrelation plot; x-axis: Lag]
ARIMA(1, 1, 0) Model for Transportation Index
Final Estimates of Parameters
Autocorrelation Function (rows for lags 5 to 16)
Lag   ACF         T       LBQ
5     0.053912    0.41    4.15
6    -0.069976   -0.53    4.50
7     0.215477    1.62    7.94
8    -0.062525   -0.45    8.24
9    -0.155310   -1.12   10.09
10    0.119656    0.85   11.21
11   -0.087686   -0.61   11.82
12    0.005717    0.04   11.83
13   -0.024945   -0.17   11.88
14    0.052536    0.37   12.11
15   -0.083232   -0.58   12.71
16   -0.170800   -1.18   15.28
[Figure: autocorrelation plot; x-axis: Lag]
[Figure: ACF of Residuals for Closing Average, ARIMA(0, 1, 1) (with 5% significance limits for the autocorrelations); x-axis: Lag]
Example 9.4
Readings for the Atron Process
60.0 99.0 75.0 79.5 61.5 88.5 72.0 90
81.0 25.5 78.0 64.5 81.0 51.0 66.0 78
72.0 93.0 66.0 99.0 76.5 85.5 73.5 87
78.0 75.0 97.5 72.0 84.0 58.5 66.0 99
61.5 57.0 60.0 78.0 57.0 90.0 73.5 72
78.0 88.5 97.5 63.0 84.0 60.0 103.5
57.0 76.5 61.5 66.0 73.5 78.0 60.0
84.0 82.5 96.0 84.0 78.0 66.0 81.0
72.0 72.0 79.5 66.0 49.5 97.5 87.0
67.8 76.5 72.0 87.0 78.0 64.5 73.5
[Figure: Readings for the Atron Process; y-axis: Atron Readings, x-axis: Time Period]
The series seems to be stationary.
[Figure: Autocorrelation Function for Yt; x-axis: Lag]
[Figure: Partial Autocorrelation for Atron Readings; x-axis: Lag]
ARIMA(0, 0, 2) Model 1: Atron readings
Final Estimates of Parameters
Type Coef SE Coef T P
MA 1 0.5663 0.1108 5.11 0.000
MA 2 -0.3549 0.1147 -3.09 0.003
Constant 75.414 1.060 71.13 0.000
Mean 75.414 1.060
Number of observations: 75
Residuals: SS = 9729.53 (backforecasts excluded)
MS = 135.13 DF = 72
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 7.0 23.7 31.8 46.9
DF 9 21 33 45
P-Value 0.639 0.305 0.528 0.394
Forecasts from period 75
95% Limits
Period Forecast Lower Upper Actual
76 80.626 57.837 103.415
77 78.164 51.975 104.353
78 75.414 48.005 102.823
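The widths of these 95% limits follow from the MA(2) structure. The sketch below assumes the standard result that the h-step forecast error variance of an MA(q) model is the error variance times (1 + θ1² + ... + θ(h-1)²), with the residual MS used as the variance estimate and 1.96 as the normal multiplier. The function name is hypothetical; the coefficients, MS, and limits are taken from the output above.

```python
import math

def ma_forecast_halfwidths(theta, mse, steps, z=1.96):
    """Half-widths of approximate 95% forecast limits for an MA(q) model.

    theta: MA coefficients [theta_1, ..., theta_q]
    mse:   residual mean square, used as the error variance estimate
    """
    halfwidths = []
    for h in range(1, steps + 1):
        # Only the first h-1 MA coefficients contribute at horizon h.
        variance = mse * (1.0 + sum(t * t for t in theta[:h - 1]))
        halfwidths.append(z * math.sqrt(variance))
    return halfwidths

theta = [0.5663, -0.3549]  # MA 1 and MA 2 from the output
computed = ma_forecast_halfwidths(theta, 135.13, 3)

# Half-widths implied by the printed limits, (Upper - Lower) / 2.
printed = [(103.415 - 57.837) / 2, (104.353 - 51.975) / 2, (102.823 - 48.005) / 2]
for h, (c, p) in enumerate(zip(computed, printed), start=1):
    print(h, round(c, 3), round(p, 3))
```

The computed half-widths agree with the printed ones to within about 0.01. Note also that the period 78 forecast equals the estimated mean, 75.414: beyond two steps ahead an MA(2) model has no remaining memory of past errors.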
[Figure: Residual Autocorrelations: ARIMA(0, 0, 2); x-axis: Lag]
ARIMA(1, 0, 0) Model 1: Atron readings
Final Estimates of Parameters
Number of observations: 75
Residuals: SS = 10065.8 (backforecasts excluded)
MS = 137.9 DF = 73
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 9.3 29.8 37.2 58.2
DF 10 22 34 46
P-Value 0.508 0.124 0.324 0.107
Forecasts from period 75
95% Limits
Period Forecast Lower Upper Actual
76 77.122 54.102 100.142
77 74.368 48.232 100.504
78 75.849 48.879 102.818
By the principle of parsimony, choose ARIMA(1, 0, 0): it has one fewer parameter than ARIMA(0, 0, 2) and a similar residual mean square (137.9 versus 135.13).
[Figure: autocorrelation plot; x-axis: Lag]
Example 9.5
Errors from Atron Quality Control
-0.23  -0.20  -1.93  -0.97   0.10
 0.63  -0.21   1.87   0.83  -0.62
 0.48   0.91  -0.97  -0.33   2.27
-0.83  -0.36   0.46   0.91  -0.62
-0.03   0.48   2.12  -1.13   0.74
 1.31   0.61  -2.11   2.22  -0.16
 0.86  -1.38   0.70   0.80   1.34
-1.28  -0.04   0.69  -1.95  -1.83
 0.00   0.90  -0.24   2.61   0.31
-0.63   1.79   0.34   0.59   1.13
 0.08  -0.37   0.60   0.71  -0.87
-1.30   0.40   0.15  -0.84   1.45
 1.48  -1.19  -0.02  -0.11  -1.95
-0.28   0.98   0.46   1.27  -0.51
-0.79  -1.51  -0.54  -0.80  -0.41
 1.86   0.90   0.89  -0.76   0.49
 0.07  -1.56   1.07   1.58   1.54
 0.09   2.18   0.20  -0.38  -0.96
[Figure: Errors (Deviations from Target): Atron Quality Control; y-axis: Errors, x-axis: Time Period]
The series seems to be stationary.
[Figure: Autocorrelations for Atron Quality Control Errors; x-axis: Lag]
[Figure: partial autocorrelation plot; x-axis: Lag]
ARIMA (0, 0, 1) Model: Atron Quality Control
Final Estimates of Parameters
Type Coef SE Coef T P
MA 1 0.5875 0.0864 6.80 0.000
Constant 0.15128 0.04022 3.76 0.000
Mean 0.15128 0.04022
Number of observations: 90
Residuals: SS = 74.4934 (backforecasts excluded)
MS = 0.8465 DF = 88
Lag 12 24 36 48
Chi-Square 9.1 10.8 17.3 31.5
DF 10 22 34 46
P-Value 0.524 0.977 0.992 0.950
[Figure: autocorrelation plot; x-axis: Lag]
Example 9.6
Errors for Ed Jones’ Quality Control
 0.77   1.04  -2.46  -0.73  -0.23
 0.33   1.02  -0.37   0.10   1.05
 2.15  -2.03   0.80  -1.47  -0.66
 2.50  -2.54   0.49  -0.89   0.25
 1.36  -0.23   0.50  -0.53  -0.63
 0.48   0.49   0.07  -0.20   0.91
 2.05  -0.87   1.92  -0.70  -0.21
-1.46   0.61   1.00  -0.27   0.24
-1.13   0.20   2.16   0.39   0.05
-2.85   0.98   0.04  -0.07   0.85
-2.67   0.78   1.91   0.89   1.55
-2.71   0.80   0.43   0.37   0.40
-1.30   0.86  -0.32  -0.75   1.82
-0.88   1.72  -0.48  -1.24   0.81
-0.07   0.15  -0.13  -0.62   0.28
-1.47  -1.15  -2.26  -0.54   1.06
[Figure: Errors (Deviations from Target): Ed Jones' Quality Control; y-axis: Errors, x-axis: Time Period]
The time series appears to vary about a fixed level of zero, and the autocorrelations die out rather quickly.
[Figure: Autocorrelations for Ed Jones' Quality Control Errors; x-axis: Lag]
The graphs indicate that the series is stationary.
[Figure: partial autocorrelation plot for Ed Jones' Quality Control Errors; x-axis: Lag; annotation on the plot: "Attributed to sampling errors"]
Minitab Output for AR(1) Model for Ed Jones’ Quality Control Errors
Final Estimates of Parameters
Number of observations: 80
Residuals: SS = 86.8808 (backforecasts excluded)
MS = 1.0998 DF = 79
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 10.7 19.5 36.2 44.2
DF 11 23 35 47
P-Value 0.468 0.669 0.410 0.591
95% Limits
Period Forecast Lower Upper Actual
81 0.53088 -1.52498 2.58673
82 0.26588 -2.03340 2.56515
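Forecasts from an autoregressive model are built recursively: each forecast is a weighted sum of the most recent values, with future errors replaced by their expected value of zero. The sketch below illustrates this for the AR(1) case. The output above does not show the coefficient estimate, so `phi_implied` is backed out from the ratio of the two printed forecasts; that is an inference, not a reported value. The model is taken to have no constant term (consistent with DF = 79 for 80 observations), and the forecast origin 1.06 is the final entry of the Ed Jones data table, assuming the table ends with the most recent observation.

```python
def ar_forecasts(history, phi, steps):
    """Recursive forecasts for an AR(p) model with no constant term.

    history: observed values, most recent last
    phi:     coefficients [phi_1, ..., phi_p]
    """
    values = list(history)
    forecasts = []
    for _ in range(steps):
        # phi_1 multiplies the latest value, phi_2 the one before it, and so on.
        nxt = sum(c * values[-1 - i] for i, c in enumerate(phi))
        values.append(nxt)      # later forecasts reuse earlier forecasts
        forecasts.append(nxt)
    return forecasts

phi_implied = 0.26588 / 0.53088   # inferred from the printed forecasts
last_obs = 1.06                   # assumed to be observation 80
f81, f82 = ar_forecasts([last_obs], [phi_implied], 2)
print(round(phi_implied, 4), round(f81, 5), round(f82, 5))
```

Under these assumptions the recursion reproduces both printed forecasts, and the forecasts shrink toward zero geometrically as the horizon grows.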
[Figure: Residual Autocorrelations: ARIMA(1, 0, 0); x-axis: Lag]
Example 9.7
Closing Prices for ISC Corporation Stock
235 200 250 270 275
320 290 225 240 205
115 220 125 275 265
355 400 295 225 245
190 275 250 285 170
320 185 355 250 175
275 370 280 310 270
205 255 370 220 225
295 285 250 320 340
240 250 290 215 190
355 300 225 260 250
175 225 270 190 300
285 285 180 295 195
[Figure: ISC Corporation Closing Stock Prices; y-axis: Price, x-axis: Time Period]
[Figure: autocorrelation plot; x-axis: Lag]
[Figure: partial autocorrelation plot; x-axis: Lag]
Minitab Output for AR(2) Model for ISC Closing Stock Prices
Final Estimates of Parameters
Lag 12 24 36 48
Chi-Square 6.3 13.3 18.2 29.1
DF 9 21 33 45
P-Value 0.707 0.899 0.983 0.969
[Figure: autocorrelation plot; x-axis: Lag]
Model Selection Criteria
It is possible that two (or more) initial models may be
consistent with the patterns of the sample autocorrelations
and partial autocorrelations.
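One way to compare competing models is with an information criterion that trades off fit against the number of parameters. The sketch below uses one commonly quoted form, AIC = ln(SS/n) + 2r/n and BIC = ln(SS/n) + r ln(n)/n, where SS is the residual sum of squares, n the number of observations, and r the number of estimated parameters. These formulas and the function names are assumptions (other scalings exist); the inputs are the SS values reported earlier for the two Atron models.

```python
import math

def aic(ss_resid, n, r):
    """Akaike criterion in the ln(variance) + penalty form (an assumed form)."""
    return math.log(ss_resid / n) + 2.0 * r / n

def bic(ss_resid, n, r):
    """Bayesian (Schwarz) criterion in the same form; heavier penalty per parameter."""
    return math.log(ss_resid / n) + r * math.log(n) / n

n = 75
models = {
    "ARIMA(0, 0, 2)": {"ss": 9729.53, "r": 3},   # MA 1, MA 2, constant
    "ARIMA(1, 0, 0)": {"ss": 10065.8, "r": 2},   # AR 1, constant
}
for name, m in models.items():
    print(name, round(aic(m["ss"], n, m["r"]), 4), round(bic(m["ss"], n, m["r"]), 4))
```

With these inputs the two criteria disagree slightly: AIC is marginally lower for ARIMA(0, 0, 2), while BIC, which penalizes extra parameters more heavily, is lower for ARIMA(1, 0, 0). Smaller values are preferred under either criterion.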
[Figure: time series plot of monthly Sales, 1997 to 2006]
[Figure: autocorrelation plot; x-axis: Lag]
[Figure: time series plot of Diff12Sales, 1997 to 2006]
[Figure: autocorrelation plot; x-axis: Lag]
[Figure: partial autocorrelation plot; x-axis: Lag]
ARIMA model
• ARIMA(0, 0, 0)(0, 1, 1)12 model, in the notation ARIMA(p, d, q)(P, D, Q)s with seasonal period s = 12
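Written out, this seasonal model has a simple form. The equation below is a sketch that assumes the Box-Jenkins sign convention for moving average terms and no constant term; neither is stated on the slide, and the symbol $\Theta_1$ for the seasonal moving average coefficient is introduced here for illustration.

```latex
W_t = Y_t - Y_{t-12}, \qquad W_t = \varepsilon_t - \Theta_1\,\varepsilon_{t-12}
```

Here d = 0 and D = 1, so the only differencing is the seasonal difference at lag 12, and Q = 1 contributes the single error term at lag 12.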
[Figure: time series plot of Sales; x-axis: Time]
[Figure: Residual Autocorrelations: ARIMA(0, 0, 0)(0, 1, 1)12; x-axis: Lag]
Example 9.10
Quarterly Sales for Coastal Marine Corporation
Fiscal Year December 31 March 31 June 30 September 30
1994 147.6 251.8 273.1 249.1
1995 139.3 221.2 260.2 259.5
1996 140.5 245.5 298.8 287.0
1997 168.8 322.6 393.5 404.3
1998 259.7 401.1 464.6 479.7
1999 264.4 402.6 411.3 385.9
2000 232.7 309.2 310.7 293.0
2001 205.1 234.4 285.4 258.7
2002 193.2 263.7 292.5 315.2
2003 178.3 274.5 295.4 286.4
2004 190.8 263.5 318.8 305.5
2005 242.6 318.8 329.6 338.2
2006 232.1 285.6 291.0 281.4
Minitab Solution
1. Enter the data on the worksheet.
Plot the data (a seasonal pattern, with a slight upward trend).
Minitab menus:
Stat>Time Series>Time Series Plot
[Figure: Time Series Plot of Sales by quarter, 1994 to 2006]
Trend Analysis Plot for Sales
Linear Trend Model: Yt = 270.3 + 0.510*t
Accuracy Measures: MAPE = 21.55, MAD = 54.63, MSD = 5571.70
[Figure: actual Sales and fitted trend line; x-axis: Index]
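The linear trend model can be checked against the Coastal Marine table with ordinary least squares. The sketch below assumes the quarters are indexed t = 1, ..., 52 in table order (row by row) and uses the textbook definitions of MAPE, MAD, and MSD as mean absolute percentage error, mean absolute deviation, and mean squared deviation. The function names are hypothetical.

```python
sales = [
    147.6, 251.8, 273.1, 249.1,  # 1994
    139.3, 221.2, 260.2, 259.5,  # 1995
    140.5, 245.5, 298.8, 287.0,  # 1996
    168.8, 322.6, 393.5, 404.3,  # 1997
    259.7, 401.1, 464.6, 479.7,  # 1998
    264.4, 402.6, 411.3, 385.9,  # 1999
    232.7, 309.2, 310.7, 293.0,  # 2000
    205.1, 234.4, 285.4, 258.7,  # 2001
    193.2, 263.7, 292.5, 315.2,  # 2002
    178.3, 274.5, 295.4, 286.4,  # 2003
    190.8, 263.5, 318.8, 305.5,  # 2004
    242.6, 318.8, 329.6, 338.2,  # 2005
    232.1, 285.6, 291.0, 281.4,  # 2006
]

def fit_linear_trend(y):
    """Least-squares fit of y_t = intercept + slope * t for t = 1..n."""
    n = len(y)
    t_bar = (n + 1) / 2.0
    y_bar = sum(y) / n
    sxy = sum((t - t_bar) * (yt - y_bar) for t, yt in enumerate(y, start=1))
    sxx = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

def accuracy_measures(y, fits):
    """Return (MAPE in percent, MAD, MSD) for actual values and fitted values."""
    n = len(y)
    errors = [yt - ft for yt, ft in zip(y, fits)]
    mape = 100.0 * sum(abs(e / yt) for e, yt in zip(errors, y)) / n
    mad = sum(abs(e) for e in errors) / n
    msd = sum(e * e for e in errors) / n
    return mape, mad, msd

intercept, slope = fit_linear_trend(sales)
fits = [intercept + slope * t for t in range(1, len(sales) + 1)]
print(round(intercept, 1), round(slope, 3), accuracy_measures(sales, fits))
```

The fitted intercept and slope can be compared with the reported model Yt = 270.3 + 0.510*t, and the three accuracy measures with the values printed on the plot. The large errors relative to the small slope reflect the strong seasonal pattern that a straight line cannot capture.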
[Figure: Time Series Plot of Quarterly Sales for Coastal Marine, 1994-2006; y-axis: Quarterly Sales (Thousands); seasonal peaks and troughs are marked "High" and "Low"]
2. Compute the autocorrelations for the variable Sales:
Stat>Time Series>Autocorrelation
[Figure: autocorrelation plot for Sales; x-axis: Lag]
4. Seasonally difference the data:
Stat>Time Series>Differences
5. Complete the Difference dialog box
Autocorrelation Function: Diff4Sales (rows for lags 9 to 12)
Lag   ACF         T       LBQ
9    -0.338302   -1.14    91.42
10   -0.392471   -1.29   101.15
11   -0.407624   -1.30   111.93
12   -0.387378   -1.19   121.93
The autocorrelations for the seasonally differenced data are large at low lags and decline rather slowly. The series may still be nonstationary. A regular difference in addition to the seasonal difference might be required to achieve stationarity.
[Figure: Autocorrelation Function for Diff4Sales; x-axis: Lag]
7. Compute the first differences for the Diff4Sales variable. Store the differences in C6.
8. Label the C6 variable Diff1Diff4Sales.
Compute the autocorrelations for the Diff1Diff4Sales variable.
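The two differencing steps can be reproduced outside Minitab. The sketch below is a minimal plain-Python version, assuming the Coastal Marine table is read row by row; the helper name `difference` is hypothetical, and the variable names simply mirror the worksheet labels.

```python
def difference(series, lag=1):
    """Return the lag-`lag` differences y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

sales = [
    147.6, 251.8, 273.1, 249.1,  # 1994
    139.3, 221.2, 260.2, 259.5,  # 1995
    140.5, 245.5, 298.8, 287.0,  # 1996
    168.8, 322.6, 393.5, 404.3,  # 1997
    259.7, 401.1, 464.6, 479.7,  # 1998
    264.4, 402.6, 411.3, 385.9,  # 1999
    232.7, 309.2, 310.7, 293.0,  # 2000
    205.1, 234.4, 285.4, 258.7,  # 2001
    193.2, 263.7, 292.5, 315.2,  # 2002
    178.3, 274.5, 295.4, 286.4,  # 2003
    190.8, 263.5, 318.8, 305.5,  # 2004
    242.6, 318.8, 329.6, 338.2,  # 2005
    232.1, 285.6, 291.0, 281.4,  # 2006
]

diff4_sales = difference(sales, lag=4)              # seasonal difference (quarterly data)
diff1_diff4_sales = difference(diff4_sales, lag=1)  # regular difference of the result

print(len(sales), len(diff4_sales), len(diff1_diff4_sales))
print([round(v, 1) for v in diff1_diff4_sales[:3]])
```

Each seasonal difference of order 4 costs four observations and each regular difference costs one, so 52 quarters leave 47 values after both steps.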
Autocorrelation Function: Diff1Diff4Sales (rows for lags 9 to 12)
Lag   ACF         T       LBQ
9     0.127245    0.63   26.80
10   -0.126554   -0.62   27.80
11   -0.053071   -0.26   27.98
12   -0.021962   -0.11   28.01
Only two autocorrelations are significant, those at lags 1 and 8, and the autocorrelations for the first two lags alternate in sign.
This suggests an ARIMA model with a regular autoregressive term and perhaps a seasonal moving average term at lag 8 (or 4): ARIMA(1, 1, 0)(0, 1, 2)4.
[Figure: autocorrelation plot for Diff1Diff4Sales; x-axis: Lag]
9. Compute the Partial Autocorrelations for the variable
Diff1Diff4Sales:
Stat>Time Series>Partial Autocorrelation
10. Complete the Partial Autocorrelation Function dialog box:
Partial Autocorrelation Function: Diff1Diff4Sales (rows for lags 4 to 7 and 12)
Lag   PACF        T
4    -0.026932   -0.18
5     0.036653    0.25
6    -0.238502   -1.64
7    -0.020543   -0.14
12   -0.111998   -0.77
The partial autocorrelations seem to cut off after the first lag, consistent with an AR(1) term.
• Try this model: ARIMA(1, 1, 0)(0, 1, 0)4
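The "cut off after the first lag" pattern can be made concrete. Partial autocorrelations are obtainable from the autocorrelations by the Durbin-Levinson recursion; the sketch below is a plain-Python version of that recursion (the function name is hypothetical). It is applied to the theoretical autocorrelations of an AR(1) process with coefficient 0.5, a made-up illustration rather than the Coastal Marine data.

```python
def pacf_from_acf(r):
    """Durbin-Levinson recursion.

    r: autocorrelations [r_1, ..., r_K]
    Returns the partial autocorrelations [phi_11, ..., phi_KK].
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} from the previous order
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            cur = [phi_kk]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append the new one.
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Theoretical AR(1) autocorrelations with coefficient 0.5: r_k = 0.5 ** k.
ar1_acf = [0.5 ** k for k in range(1, 5)]
print(pacf_from_acf(ar1_acf))
```

For an AR(1) process the autocorrelations decay geometrically while the partial autocorrelations are zero beyond lag 1, which is the pattern being read off the sample plot here, up to sampling error.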
[Figure: Partial Autocorrelation Function for Diff1Diff4Sales (with 5% significance limits for the partial autocorrelations); x-axis: Lag]
11. To run an ARIMA (1,1, 0)(0,1, 2)4 model:
Stat>Time Series>ARIMA
12. Complete the ARIMA dialog box:
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 -0.3505 0.1423 -2.46 0.018
SMA 4 0.2392 0.1340 1.78 0.081
SMA 8 0.6716 0.1404 4.78 0.000
Differencing: 1 regular, 1 seasonal of order 4
Number of observations: Original series 52, after differencing 47
Residuals: SS = 31525.4 (backforecasts excluded)
MS = 716.5 DF = 44
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 8.4 10.7 22.0 *
DF 9 21 33 *
P-Value 0.493 0.969 0.928 *
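The fitted model can be written as an equation. The form below is a sketch that assumes the Box-Jenkins sign convention (moving average coefficients enter with a minus sign) and no constant term, since none appears in the output; $W_t$ is a symbol introduced here for the doubly differenced series, which corresponds to Diff1Diff4Sales.

```latex
W_t = (Y_t - Y_{t-4}) - (Y_{t-1} - Y_{t-5})

W_t = -0.3505\,W_{t-1} + \varepsilon_t - 0.2392\,\varepsilon_{t-4} - 0.6716\,\varepsilon_{t-8}
```

The coefficients are the AR 1, SMA 4, and SMA 8 estimates from the output. A forecast for $Y_t$ is recovered by adding back $Y_{t-1} + Y_{t-4} - Y_{t-5}$ to the forecast of $W_t$.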
Autocorrelation Function: RESI1 (rows for lags 9 to 12)
Lag   ACF         T      LBQ
9    -0.060856   -0.37   7.04
10   -0.080227   -0.49   7.44
11   -0.044193   -0.27   7.57
12   -0.113656   -0.69   8.42
[Figure: autocorrelation plot for RESI1; x-axis: Lag]
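The LBQ column is the cumulative Ljung-Box statistic, Q = n(n + 2) Σ r_k² / (n - k), summed over lags 1 to m. Because the rows for lags 1 to 8 are not shown, the sketch below starts from the reported value at lag 9 and adds the increments for lags 10 to 12, using n = 47 (the number of observations after differencing, from the output). The function name is hypothetical.

```python
def ljung_box_increment(r_k, k, n):
    """Contribution of the lag-k residual autocorrelation to the Ljung-Box Q."""
    return n * (n + 2) * r_k ** 2 / (n - k)

n = 47  # observations after differencing, as reported in the output

# lag: (residual autocorrelation, reported LBQ), copied from the table
reported = {
    9: (-0.060856, 7.04),
    10: (-0.080227, 7.44),
    11: (-0.044193, 7.57),
    12: (-0.113656, 8.42),
}

q = reported[9][1]  # start from the reported cumulative value at lag 9
running = {}
for k in (10, 11, 12):
    q += ljung_box_increment(reported[k][0], k, n)
    running[k] = q
    print(k, round(q, 2), reported[k][1])
```

The running totals match the reported LBQ values to within rounding, and the lag 12 value of about 8.4 is the chi-square statistic shown in the Ljung-Box table for the fitted model. A small Q relative to its degrees of freedom indicates no evidence of remaining autocorrelation in the residuals.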
13. Develop a time series plot including a forecast.
(Click on Graphs on the ARIMA dialog box).
Complete the ARIMA-Graphs dialog box.
[Figure: Time Series Plot for Sales (with forecasts and their 95% confidence limits); y-axis: Sales, x-axis: Quarter]
Forecasts for the Next Four Quarters from the ARIMA (1,1, 0)(0,1, 2)4 Model