TIME SERIES MODELS

Time Series
A time series is a sequence of observations. Example: daily returns on a stock. A multivariate time series is a sequence of vectors of observations; example: returns from a set of stocks. Statistical models for univariate time series are widely used: in finance to model asset prices, in OR to model the output of simulations, and in business for forecasting.
Stationary Processes
Often a time series has the same type of random behavior from one time period to the next: outside temperature (each summer is similar to past summers), interest rates, returns on equities. Stationary stochastic processes are probability models for such series. A process is stationary if its behavior is unchanged by shifts in time.
A process is weakly stationary if its mean, variance, and covariance are unchanged by time shifts. Thus X_1, X_2, ... is a weakly stationary process if

E(X_i) = μ (a constant) for all i,
Var(X_i) = σ² (a constant) for all i, and
Corr(X_i, X_j) = ρ(|i - j|) for all i and j, for some function ρ.
The correlation between two observations depends only on the time distance between them, called the lag. Example: the correlation between X_2 and X_5 equals the correlation between X_7 and X_10 (both at lag 3).
ρ is the correlation function; note that ρ(h) = ρ(-h). The covariance between X_t and X_{t+h} is denoted by γ(h); γ(·) is called the autocovariance function. Note that γ(h) = σ²ρ(h) and that γ(0) = σ², since ρ(0) = 1. Many financial time series are not stationary, but the changes in these time series may be stationary.
Weak White Noise
The simplest example of a stationary process is white noise: no correlation. X_1, X_2, ... is WN(μ, σ²) if

E(X_i) = μ for all i,
Var(X_i) = σ² (a constant) for all i, and
Corr(X_i, X_j) = 0 for all i ≠ j.

If X_1, X_2, ... are IID normal, then the process is a Gaussian white noise process.
A weak white noise process is weakly stationary with

ρ(0) = 1 and ρ(t) = 0 if t ≠ 0,

so that γ(0) = σ² and γ(t) = 0 if t ≠ 0.
White noise
WN is uninteresting in itself but is the building block of important models. It is interesting to know whether a financial time series, e.g., of net returns, is WN.
Estimating parameters of a stationary process
We observe y_1, ..., y_n, and estimate μ and σ² with ȳ and s². Estimate the autocovariance with

γ̂(h) = n⁻¹ Σ_{j=1}^{n-h} (y_{j+h} - ȳ)(y_j - ȳ).
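As a sketch (not part of the original notes), the sample autocovariance can be computed directly from the definition above; `acvf` and `acf` are helper names I introduce for illustration:

```python
import random

def acvf(y, h):
    """Sample autocovariance at lag h: (1/n) * sum_{j=1}^{n-h} (y_{j+h} - ybar)(y_j - ybar)."""
    n = len(y)
    ybar = sum(y) / n
    return sum((y[j + h] - ybar) * (y[j] - ybar) for j in range(n - h)) / n

def acf(y, h):
    """Sample autocorrelation: rho_hat(h) = gamma_hat(h) / gamma_hat(0)."""
    return acvf(y, h) / acvf(y, 0)

random.seed(0)
y = [random.gauss(0, 1) for _ in range(2000)]  # a white noise sample
# For white noise, acf(y, h) should be near 0 for h > 0 and exactly 1 at h = 0.
print(acf(y, 0), acf(y, 1))
```

For a white noise sample the lag-1 sample autocorrelation is only estimation noise, of order 1/sqrt(n).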
AR(1) processes
Time series models with correlation are built from WN. In AR processes, y_t is modeled as a weighted average of past observations plus a white noise error. AR(1) is the simplest AR process:

y_t = μ + φ(y_{t-1} - μ) + ε_t,   (1)

where ε_1, ε_2, ... are WN(0, σ_ε²).
From the previous page: y_t - μ = φ(y_{t-1} - μ) + ε_t. Only three parameters:

μ, the mean;
σ_ε², the variance of the one-step-ahead prediction errors; and
φ, a correlation parameter.
If |φ| < 1, then y_1, ... is a weakly stationary process with mean μ. Since

y_t = (1 - φ)μ + φ y_{t-1} + ε_t,   (2)

compare with the linear regression model y_t = β₀ + β₁ x_t + ε_t: β₀ = (1 - φ)μ is called the constant in computer output, and μ is called the mean in the output.
Iterating equation (1) gives

y_t = μ + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ⋯ = μ + Σ_{h=0}^∞ φ^h ε_{t-h}.
Properties of a stationary AR(1) process
When |φ| < 1 (stationarity),

E(y_t) = μ for all t,
γ(0) = Var(y_t) = σ_ε²/(1 - φ²) for all t, and
ρ(h) = φ^{|h|} for all t and h.
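These properties are easy to check by simulation; the following sketch is my illustration (the name `simulate_ar1` is not from the notes):

```python
import random

def simulate_ar1(mu, phi, sigma_eps, n, seed=1):
    """Simulate y_t = mu + phi*(y_{t-1} - mu) + eps_t with eps_t ~ N(0, sigma_eps^2)."""
    rng = random.Random(seed)
    y = [mu]
    for _ in range(n - 1):
        y.append(mu + phi * (y[-1] - mu) + rng.gauss(0, sigma_eps))
    return y

mu, phi, sigma_eps = 0.5, 0.6, 1.0
y = simulate_ar1(mu, phi, sigma_eps, 50000)
ybar = sum(y) / len(y)
var = sum((v - ybar) ** 2 for v in y) / len(y)
rho1 = sum((y[t] - ybar) * (y[t + 1] - ybar) for t in range(len(y) - 1)) / len(y) / var
# Theory: E(y_t) = mu, Var(y_t) = sigma_eps^2/(1 - phi^2), rho(1) = phi
print(ybar, var, rho1)
```

The sample mean, variance, and lag-1 correlation should land close to μ = 0.5, σ_ε²/(1 - φ²) = 1.5625, and φ = 0.6.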
If |φ| ≥ 1, then the AR(1) process is nonstationary, and the mean, variance, and correlation are not constant. These formulas can be proved using the representation

y_t = μ + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ⋯ = μ + Σ_{h=0}^∞ φ^h ε_{t-h}.

For example,

Var(y_t) = Var(Σ_{h=0}^∞ φ^h ε_{t-h}) = σ_ε² Σ_{h=0}^∞ φ^{2h} = σ_ε²/(1 - φ²).
Similarly,

γ(h) = Cov(y_t, y_{t+h}) = σ_ε² φ^{|h|}/(1 - φ²),

and therefore ρ(h) = φ^{|h|}.
Nonstationary AR(1) processes: Random Walk
If φ = 1, then y_t = y_{t-1} + ε_t, which is not stationary: the random walk process. Iterating,

y_t = y_{t-1} + ε_t = (y_{t-2} + ε_{t-1}) + ε_t = ⋯ = y_0 + ε_1 + ⋯ + ε_t.
Start the process at an arbitrary point y_0. Then E(y_t | y_0) = y_0 for all t, and Var(y_t | y_0) = t σ_ε². The increasing variance makes the random walk wander.

AR(1) processes when |φ| > 1
When |φ| > 1, an AR(1) process has explosive behavior.
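The linear growth Var(y_t | y_0) = t σ_ε² of the random walk can be seen in a small Monte Carlo sketch (my illustration, not from the notes; `random_walk_var` is a name I introduce):

```python
import random

def random_walk_var(t, n_paths=20000, sigma_eps=1.0, seed=2):
    """Estimate Var(y_t | y_0 = 0) for the random walk y_t = y_{t-1} + eps_t."""
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_paths):
        y = 0.0
        for _ in range(t):
            y += rng.gauss(0, sigma_eps)
        endpoints.append(y)
    m = sum(endpoints) / n_paths
    return sum((v - m) ** 2 for v in endpoints) / n_paths

# Theory: Var(y_t | y_0) = t * sigma_eps^2
v5, v20 = random_walk_var(5), random_walk_var(20)
print(v5, v20)
```

The estimated variances should be close to 5 and 20, i.e. proportional to t.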
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, and 1; n = 200.]
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, 1, and 1.02; n = 30.]
[Figure: simulated AR(1) paths with φ = 0.6, 0.9, 1, and 1.02; n = 1000.]
Iterating the AR(1) equation,

y_t - μ = φ(y_{t-1} - μ) + ε_t
        = φ²(y_{t-2} - μ) + φ ε_{t-1} + ε_t
        = ⋯
        = φ^t(y_0 - μ) + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ⋯ + φ^{t-1} ε_1.
Since |φ| > 1, the variance increases geometrically fast as t → ∞. Explosive AR processes are not widely used in econometrics, since economic growth usually is not explosive.
Estimation
An AR(1) can be fit to either the raw data or a variable constructed from the raw data. To create the log returns: take logs of the prices, then difference.
In MINITAB, to difference: Stat menu, Time Series menu, select Differences, and select log prices as the variable.
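Outside MINITAB, the same construction (log, then difference) is a few lines; a sketch in Python with made-up prices for illustration (`log_returns` is my name for the helper):

```python
import math

def log_returns(prices):
    """Log returns r_t = log(P_t) - log(P_{t-1}): difference the log prices."""
    logp = [math.log(p) for p in prices]
    return [logp[t] - logp[t - 1] for t in range(1, len(logp))]

prices = [100.0, 102.0, 101.0, 105.0]  # illustrative prices, not real data
r = log_returns(prices)
print(r)  # three returns from four prices
```

Note that differencing shortens the series by one observation.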
The AR(1) model is a linear regression model and can be analyzed using linear regression software: one creates a lagged variable from y_t and uses this as the x-variable. MINITAB and SAS both support lagging; to lag in MINITAB: Stat menu, Time Series menu, then select Lag.
Regress {y_t} on {y_{t-1}}, for t = 2, ..., n. If the errors are Gaussian white noise, then the LSE equals the MLE. Both MINITAB and SAS have special procedures for fitting AR models.
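As the notes say, fitting an AR(1) is just simple linear regression of y_t on y_{t-1}; here is a self-contained sketch (the names `fit_ar1`, `b0`, `phi_hat`, `mu_hat` are mine, chosen to match the constant/mean terminology above):

```python
import random

def fit_ar1(y):
    """OLS regression of y_t on y_{t-1} (t = 2..n): returns (b0, phi_hat, mu_hat)."""
    x = y[:-1]   # lagged series y_{t-1}
    z = y[1:]    # response y_t
    n = len(x)
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxx = sum((v - xbar) ** 2 for v in x)
    sxz = sum((x[i] - xbar) * (z[i] - zbar) for i in range(n))
    phi_hat = sxz / sxx
    b0 = zbar - phi_hat * xbar     # the "constant", estimating (1 - phi) * mu
    mu_hat = b0 / (1 - phi_hat)    # the "mean"
    return b0, phi_hat, mu_hat

# Simulate an AR(1) with mu = 0, phi = 0.5 and recover phi from the regression.
rng = random.Random(3)
y = [0.0]
for _ in range(20000):
    y.append(0.5 * y[-1] + rng.gauss(0, 1))
b0, phi_hat, mu_hat = fit_ar1(y)
print(b0, phi_hat, mu_hat)
```

The estimated slope should be close to the true φ = 0.5 and the estimated mean close to 0.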
In MINITAB: Stat menu, Time Series, then choose ARIMA; use 1 autoregressive parameter, 0 differencing if using log returns (or 1 if using log prices), and 0 moving average parameters. In SAS, use the AUTOREG or the ARIMA procedure.
Residuals
The residuals are

ε̂_t = (y_t - μ̂) - φ̂(y_{t-1} - μ̂),  t = 2, ..., n.

They estimate the white noise process, since ε_t = (y_t - μ) - φ(y_{t-1} - μ). The residuals are used to check that y_1, y_2, ..., y_n is an AR(1) process: autocorrelation in the residuals is evidence against the AR(1) assumption. To test for residual autocorrelation, use the test bounds provided by MINITAB's or SAS's autocorrelation plots.
One can also use the Ljung-Box test; the null hypothesis is that the autocorrelations up to a specified lag are all zero.
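A minimal sketch of the Ljung-Box statistic, Q = n(n+2) Σ_{h=1}^{K} ρ̂(h)²/(n-h), which under the null is approximately chi-squared with K degrees of freedom (the p-value step is omitted to keep this dependency-free; `ljung_box` is my name):

```python
import random

def ljung_box(y, K):
    """Ljung-Box statistic Q = n(n+2) * sum_{h=1}^{K} rho_hat(h)^2 / (n - h).
    Under H0 (no autocorrelation up to lag K), Q ~ chi-squared with K df, approximately."""
    n = len(y)
    ybar = sum(y) / n
    g0 = sum((v - ybar) ** 2 for v in y) / n
    q = 0.0
    for h in range(1, K + 1):
        gh = sum((y[j + h] - ybar) * (y[j] - ybar) for j in range(n - h)) / n
        q += (gh / g0) ** 2 / (n - h)
    return n * (n + 2) * q

random.seed(4)
wn = [random.gauss(0, 1) for _ in range(1000)]
q = ljung_box(wn, 12)
print(q)  # for white noise, Q should be comparable in size to its df (12 here)
```

Large values of Q relative to the chi-squared(K) distribution are evidence against the white noise null.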
To appreciate why residual autocorrelation indicates a possible problem, suppose that we are fitting an AR(1) model but the true model is an AR(2) process given by

y_t - μ = φ₁(y_{t-1} - μ) + φ₂(y_{t-2} - μ) + ε_t.

There is no hope of estimating φ₂, and φ̂ does not necessarily estimate φ₁, because of bias.
Let φ̄ be the expected value of φ̂. For the purpose of illustration, assume that μ̂ = μ and φ̂ = φ̄. Then

ε̂_t = (y_t - μ) - φ̄(y_{t-1} - μ) = (φ₁ - φ̄)(y_{t-1} - μ) + φ₂(y_{t-2} - μ) + ε_t.
Thus the residuals

ε̂_t = (φ₁ - φ̄)(y_{t-1} - μ) + φ₂(y_{t-2} - μ) + ε_t

do not estimate the white noise process. If there is no bias in the estimation of φ₁, then φ₁ = φ̄ and (φ₁ - φ̄)(y_{t-1} - μ) drops out. But the presence of φ₂(y_{t-2} - μ) still causes the residuals to be autocorrelated.
Example: GE daily returns
The MINITAB output below was obtained by running MINITAB interactively. The variable logR is the time series of log returns.
Results for: GE_DAILY.MT
ARIMA Model: logR

Estimates at each iteration
Iteration      SSE    Parameters
    0      2.11832    0.100   0.090
    1      0.12912    0.228   0.015
    2      0.07377    0.233   0.001
    3      0.07360    0.230   0.000
    4      0.07360    0.230  -0.000
    5      0.07360    0.230  -0.000
Relative change in each estimate less than 0.0010
Final Estimates of Parameters
Type        Coef       SE Coef       T      P
AR 1       0.2299      0.0621      3.70  0.000
Constant  -0.000031    0.001081   -0.03  0.977
Mean      -0.000040    0.001403

Number of observations: 252
Residuals: SS = 0.0735911 (backforecasts excluded)
           MS = 0.0002944  DF = 250

Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag          12     24     36     48
Chi-Square 23.0   33.6   47.1   78.6
DF           10     22     34     46
P-Value   0.011  0.054  0.066  0.002
Preliminary MSE 0.000292
Estimates of Autoregressive Parameters
Lag   Coefficient   Standard Error   t Value
  1    -0.225457       0.061617       -3.66
GE - Daily prices, Dec 17, 1999 to Dec 15, 2000
The AUTOREG Procedure
Yule-Walker Estimates
SSE       0.07359998     DFE              250
MSE       0.0002944      Root MSE     0.01716
SBC      -1324.6559      AIC       -1331.7148
Regress R-Square 0.0000  Total R-Square 0.0518
Durbin-Watson 1.9326

Variable    DF   Estimate    Standard Error   t Value   Approx Pr > |t|
Intercept    1  -0.000040       0.001394       -0.03        0.9773
φ̂ = 0.2299, and the standard deviation of φ̂ is 0.0621. The t-value for testing H₀: φ = 0 versus H₁: φ ≠ 0 is 0.2299/0.0621 = 3.70, with p-value 0.000 (to three decimals). The null hypothesis is that the log returns are white noise; the alternative is that they are correlated. The small p-value is evidence against the geometric random walk hypothesis.
However, φ̂ = 0.2299 is not large. Since ρ(h) = φ^h, the correlation between successive log returns is 0.2299, and the squared correlation is only 0.0528: only about five percent of the variation in a log return can be predicted by the previous day's return. In summary, the AR(1) process fits the GE log returns better than white noise. This is not proof that the AR(1) fits these data, only that it fits better than a white noise model.
To check that the AR(1) fits well, look at the sample autocorrelation function (SACF) of the residuals; a plot of the residual SACF is available from MINITAB or SAS. The SACF of the residuals from the GE daily log returns shows high negative autocorrelation at lag 6: ρ̂(6) is outside the test limits, so it is significant at α = .05. This is disturbing.
[Figure: SACF of the residuals from the AR(1) fit to the GE daily log returns, lags 0 to 35.]
The more conservative Ljung-Box simultaneous test that ρ(1) = ⋯ = ρ(12) = 0 has p = .011. Since the AR(1) model does not fit well, one might consider more complex models; these will be discussed in the following sections.
Note that SAS's φ is the negative of φ as we, and MINITAB, define it. The difference, 0.2299 versus 0.2254, between MINITAB and SAS is slight and due to the estimation algorithm.
One can also estimate μ and test that μ is zero. From the MINITAB output, μ̂ is nearly zero; the t-value for testing that μ is zero is very small, and the p-value is near one. Small values of the p-value are significant; since the p-value is large, we accept the null hypothesis that μ is zero.
The ARIMA Procedure
Name of Variable = log_return
Mean of Working Series    0.000953
Standard Deviation        0.051749
Number of Observations        3244

Autocorrelations
Lag   Covariance    Correlation
  0    0.0026779      1.00000
  1   -0.0000656     -.02450
  2   -0.0000694     -.02593
  3   -0.0000339     -.01266
  4   -0.0000221     -.00826
  5   -0.0000175     -.00653
  6    0.00005227     0.01952
  7    0.00003363     0.01256
  8   -0.0000703     -.02624
  9    0.00008622     0.03220
 10   -6.6465E-6     -.00248
Autocorrelation Check for White Noise
To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6        6.25       6     0.3955     -0.025 -0.026 -0.013 -0.008 -0.007  0.020
  12       12.69      12     0.3919      0.013 -0.026  0.032 -0.002 -0.002 -0.009
  18       15.40      18     0.6343      0.013 -0.003  0.020  0.008  0.010 -0.009
  24       17.57      24     0.8235      0.013 -0.004 -0.011  0.012  0.012  0.009
Conditional Least Squares Estimation
Parameter    Estimate    Standard Error   t Value   Approx Pr > |t|   Lag
MU           0.0009521      0.0008869       1.07        0.2831          0
AR1,1       -0.02450        0.01756        -1.40        0.1630          1

Constant Estimate     0.000975
Variance Estimate     0.002678
Std Error Estimate    0.051749
AIC                   -10005.2
SBC                      -9993
Number of Residuals       3244
* AIC and SBC do not include log determinant.

Correlations of Parameter Estimates
Parameter      MU     AR1,1
MU          1.000    -0.000
AR1,1      -0.000     1.000
The ARIMA Procedure
Autocorrelation Check of Residuals
To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6        4.58       5     0.4697     -0.001 -0.027 -0.014 -0.009 -0.006  0.020
  12       10.67      11     0.4716      0.012 -0.025  0.032 -0.002 -0.003 -0.009
  18       13.36      17     0.7115      0.013 -0.002  0.020  0.008  0.010 -0.009
  24       15.62      23     0.8711      0.012 -0.004 -0.011  0.012  0.012  0.011
  30       34.93      29     0.2067      0.042  0.043 -0.005 -0.006  0.014  0.045
  36       48.74      35     0.0614     -0.020 -0.025 -0.036 -0.029  0.019 -0.027
  42       52.02      41     0.1162      0.007 -0.025 -0.011  0.001 -0.000 -0.015
  48       57.08      47     0.1488     -0.006 -0.016  0.015  0.012 -0.006 -0.029
AR(p) models
y_t is an AR(p) process if

y_t - μ = φ₁(y_{t-1} - μ) + φ₂(y_{t-2} - μ) + ⋯ + φ_p(y_{t-p} - μ) + ε_t,

where ε_1, ..., ε_n is WN(0, σ²). This is a multiple linear regression model with lagged values of the time series as the x-variables. The model can be re-expressed as

y_t = β₀ + φ₁ y_{t-1} + ⋯ + φ_p y_{t-p} + ε_t, where β₀ = μ{1 - (φ₁ + ⋯ + φ_p)}.
The least-squares estimator can be calculated using a multiple linear regression program; one must create the x-variables by lagging the time series with lags 1 through p. It is easier to use the ARIMA command in MINITAB or SAS, or SAS's AUTOREG procedure: these do the lagging automatically.
To fit an AR(6) in SAS's AUTOREG procedure, the model statement is replaced by

model logR = / nlag = 6 ;

φ̂_i is significant at lags 1 and 6 but not at lags 2 through 5. Here "significant" means at α = .05, which corresponds to an absolute t-value bigger than 2. MINITAB will not allow p > 5, but SAS does not have such a constraint.
Moving Average (MA) Processes
MA(1) processes
A moving average process of order 1 [MA(1)] is

y_t = μ + ε_t - θ ε_{t-1},

where, as before, the ε_t are WN(0, σ²). One can show that

E(y_t) = μ and Var(y_t) = σ²(1 + θ²).
General MA processes
The MA(q) process is

y_t = μ + ε_t - θ₁ ε_{t-1} - ⋯ - θ_q ε_{t-q}.

One can show that γ(h) = 0 and ρ(h) = 0 if |h| > q. Formulas for γ(h) and ρ(h) when |h| ≤ q are given in time series textbooks; they are complicated but will not be needed by us.
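A simulation sketch of the MA(1) moments (my illustration, with the minus-sign convention used above; `simulate_ma1` and `rho` are names I introduce):

```python
import random

def simulate_ma1(mu, theta, sigma, n, seed=5):
    """Simulate y_t = mu + eps_t - theta * eps_{t-1}, eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    eps = [rng.gauss(0, sigma) for _ in range(n + 1)]
    return [mu + eps[t + 1] - theta * eps[t] for t in range(n)]

def rho(y, h, ybar, var):
    """Sample autocorrelation at lag h."""
    return sum((y[j + h] - ybar) * (y[j] - ybar) for j in range(len(y) - h)) / len(y) / var

mu, theta, sigma = 0.0, 0.5, 1.0
y = simulate_ma1(mu, theta, sigma, 50000)
ybar = sum(y) / len(y)
var = sum((v - ybar) ** 2 for v in y) / len(y)
# Theory: Var(y_t) = sigma^2 * (1 + theta^2); rho(h) = 0 for |h| > 1
print(var, sigma ** 2 * (1 + theta ** 2), rho(y, 2, ybar, var))
```

The sample variance should be near σ²(1 + θ²) = 1.25, and the lag-2 sample autocorrelation near 0.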
ARIMA Processes
ARMA (autoregressive and moving average): stationary time series with complex autocorrelation behavior are better modeled by mixed autoregressive and moving average processes. ARIMA (autoregressive, integrated, moving average) processes are based on stationary ARMA processes but are nonstationary. ARIMA processes are easily described with the backwards operator, B.
The backwards operator
The backwards operator B is defined by

B y_t = y_{t-1}, and more generally B^k y_t = y_{t-k}.

B c = c for any constant c, since a constant does not change with time.
ARMA Processes
An ARMA(p, q) process satisfies the equation

(1 - φ₁B - ⋯ - φ_pB^p)(y_t - μ) = (1 - θ₁B - ⋯ - θ_qB^q) ε_t.   (3)

A white noise process is ARMA(0,0) with μ = 0, since if p = q = 0, then (3) reduces to y_t - μ = ε_t.
The differencing operator
The differencing operator is Δ = 1 - B, so that

Δ y_t = y_t - B y_t = y_t - y_{t-1}.

Differencing a time series produces a new time series consisting of the changes in the original series. For example, if p_t = log(P_t) is the log price, then the log return is r_t = Δ p_t.
Differencing can be iterated. For example,

Δ² y_t = Δ(Δ y_t) = Δ(y_t - y_{t-1}) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}.
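These identities are easy to verify mechanically; a sketch with a `diff` helper (my name) and a made-up series:

```python
def diff(y):
    """First difference: (delta y)_t = y_t - y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

y = [1.0, 4.0, 9.0, 16.0, 25.0]  # y_t = t^2, for illustration
d1 = diff(y)                      # first differences
d2 = diff(d1)                     # second differences, delta^2 y_t
# Check delta^2 y_t = y_t - 2*y_{t-1} + y_{t-2} directly:
direct = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]
print(d2, direct)  # both are [2.0, 2.0, 2.0]
```

Second-differencing a quadratic series gives a constant, just as second derivatives do.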
From ARMA processes to ARIMA processes
Often the first or second differences of nonstationary time series are stationary. For example, the first differences of a random walk (nonstationary) are white noise (stationary). A time series y_t is said to be ARIMA(p, d, q) if Δ^d y_t is ARMA(p, q). For example, if the log returns r_t on an asset are ARMA(p, q), then the log prices p_t are ARIMA(p, 1, q).
The ARIMA procedures in MINITAB and SAS allow one to specify p, d, and q. An ARIMA(p, 0, q) model is the same as an ARMA(p, q) model; ARIMA(p, 0, 0), ARMA(p, 0), and AR(p) models are the same. Also, ARIMA(0, 0, q), ARMA(0, q), and MA(q) models are the same.
A random walk is an ARIMA(0, 1, 0) model. The inverse of differencing is integrating. The integral of a process y_t is

w_t = w_{t₀} + y_{t₀} + y_{t₀+1} + ⋯ + y_t,

where t₀ is an arbitrary starting time point and w_{t₀} is the starting value of the w_t process. The next figure shows an AR(1), its integral, and its second integral, meaning the integral of its integral.
[Figure: an ARIMA(1,0,0) process with μ = 0 and φ = 0.4, its integral ARIMA(1,1,0), and its second integral ARIMA(1,2,0); n = 400.]
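Integration (cumulative summation) undoing differencing can be sketched as follows (my illustration; `diff` and `integrate` are names I introduce):

```python
def diff(y):
    """First difference of a series."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

def integrate(d, w0):
    """Integral of d with starting value w0: w_t = w0 + d_1 + ... + d_t."""
    w = [w0]
    for v in d:
        w.append(w[-1] + v)
    return w

y = [2.0, 3.0, 5.0, 8.0, 12.0]  # illustrative series
d = diff(y)                      # the changes in y
w = integrate(d, y[0])           # integrating the changes recovers y
print(w)  # [2.0, 3.0, 5.0, 8.0, 12.0]
```

Given the starting value, integrating the first differences reproduces the original series exactly.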
Model Selection
Once the parameters p, d, and q are selected, the coefficients can be estimated by maximum likelihood. But how do we choose p, d, and q? Generally, d is either 0, 1, or 2, chosen by looking at the SACF of y_t, Δ y_t, and Δ² y_t. A sign that a process is nonstationary is that its SACF decays to zero very slowly; if this is true of y_t, then the original series is nonstationary and should be differenced at least once.
If the SACF of Δ y_t looks stationary, then we use d = 1; otherwise, we look at the SACF of Δ² y_t, and if this looks stationary we use d = 2. Real time series where Δ² y_t does not look stationary are rare, but if one were encountered, then d > 2 would be used. Once d has been chosen, we fit an ARMA(p, q) process to Δ^d y_t, but we still need p and q: we compare various choices of p and q by some criterion that measures how well a model fits.
AIC and SBC
AIC and SBC are model selection criteria based on the log-likelihood. Akaike's information criterion (AIC) is defined as

-2 log(L) + 2(p + q),

where L is the likelihood evaluated at the MLE. Schwarz's Bayesian Criterion (SBC), also called the Bayesian Information Criterion (BIC), is defined as

-2 log(L) + log(n)(p + q),

where n is the length of the time series.
The best model by either criterion is the model that minimizes that criterion. Either criterion will tend to select models with a large likelihood value; this makes perfect sense, since a large L means the observed data are likely under that model.
The term 2(p + q) in AIC, or log(n)(p + q) in SBC, is a penalty on having too many parameters. Therefore AIC and SBC try to trade off good fit to the data, measured by L, against the desire to use few parameters. Which penalizes the most? log(n) > 2 if n ≥ 8, and most time series are much longer than 8, so SBC penalizes p + q more than AIC. Therefore AIC will tend to choose models with more parameters than SBC.
Compared to SBC, with AIC the tradeoff is more in favor of a large value of L than a small value of p + q. Unfortunately, MINITAB does not compute AIC and SBC, but SAS does. Here is how you can calculate approximate AIC and SBC values using MINITAB. It can be shown that

log(L) ≈ -(n/2) log(σ̂²) + K,

where K is a constant that does not depend on the model or on the parameters. Since we only want to minimize AIC and SBC, the exact value of K is irrelevant, and we will drop it.
Thus, you can use the approximations

AIC ≈ n log(σ̂²) + 2(p + q) and SBC ≈ n log(σ̂²) + log(n)(p + q).

σ̂² is called MSE (mean squared error) on the MINITAB output.
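These approximations are trivial to compute once MSE is read off the output; a sketch using the GE AR(1) fit above as illustration (n = 252, MSE = 0.0002944, p + q = 1; the function names are mine):

```python
import math

def approx_aic(n, mse, p_plus_q):
    """Approximate AIC = n*log(sigma_hat^2) + 2*(p + q)."""
    return n * math.log(mse) + 2 * p_plus_q

def approx_sbc(n, mse, p_plus_q):
    """Approximate SBC = n*log(sigma_hat^2) + log(n)*(p + q)."""
    return n * math.log(mse) + math.log(n) * p_plus_q

n, mse = 252, 0.0002944  # from the GE AR(1) MINITAB output above
aic = approx_aic(n, mse, 1)
sbc = approx_sbc(n, mse, 1)
print(aic, sbc)
# SBC penalizes parameters more than AIC whenever log(n) > 2, i.e. n >= 8:
print(sbc - aic)  # equals log(252) - 2 > 0
```

Only differences in AIC or SBC between candidate models matter, which is why the dropped constant K is harmless.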
The difference between AIC and SBC is due to the way they were designed. AIC is designed to select the model that will predict best and is less concerned with having a few too many parameters; SBC is designed to select the true values of p and q exactly. In practice the best AIC model is usually close to the best SBC model, and often they are the same model. Models can also be compared by likelihood ratio testing when one model is bigger than the other; in this sense, AIC and SBC are basically LR tests.
Stepwise regression applied to AR processes
Stepwise regression looks at many regression models and sees which ones fit the data well; it will be discussed later. Backwards regression starts with all possible x-variables and eliminates them one at a time, stopping when all remaining variables are significant. This can be applied to AR models: SAS's AUTOREG procedure allows backstepping as an option.
The following SAS program starts with an AR(6) model and backsteps:

options linesize = 72 ;
data ge ;
infile 'c:\courses\or473\data\ge_quart.dat' ;
input close ;
D_p = dif(close) ;
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'GE - Quarterly closing prices, Dec 1990 to Dec 2000' ;
title2 'AR(6) with backstepping' ;
proc autoreg ;
model logR = / nlag = 6 backstep ;
run ;
GE - Quarterly closing prices, Dec 1990 to Dec 2000
AR(6) with backstepping
The AUTOREG Procedure

Backward Elimination of Autoregressive Terms
Lag   Estimate    t Value   Pr > |t|
  4   0.020648      0.12     0.9058
  2  -0.023292     -0.14     0.8921
  1   0.035577      0.23     0.8226
  6   0.082465      0.50     0.6215
  5   0.170641      1.13     0.2655

Preliminary MSE 0.00328
Estimates of Autoregressive Parameters
Lag   Coefficient   Standard Error   t Value
  3    -0.392878       0.151180       -2.60

Expected Autocorrelations
Lag        0        1        2        3
Autocorr   1.0000   0.0000   0.0000   0.3929
Yule-Walker Estimates
SSE   0.12476731    DFE             37
MSE   0.00337       Root MSE   0.05807
SBC  -105.5425      AIC     -108.86962
Regress R-Square 0.0000  Total R-Square 0.1751
Durbin-Watson 1.9820
GE - Quarterly closing prices, Dec 1990 to Dec 2000
AR(6) with backstepping
The AUTOREG Procedure

Variable    DF   Estimate   Standard Error   t Value
Intercept    1   0.0632        0.0146          4.33

Expected Autocorrelations
Lag        0        1        2        3
Autocorr   1.0000   0.0000   0.0000   0.3929
Using ARIMA in SAS: Cree data
In this example, we illustrate fitting an ARMA model in SAS, using daily log returns on Cree from December 1999 to December 2000.
[Figure: CREE daily prices, 12/17/99 to 12/15/00; log returns; normal probability plot of the log returns; and volatility.]
Here is the SAS output. The Cree log returns appear to be white noise, since each of φ₁ (denoted by AR1,1 in SAS) and θ₁ (denoted by MA1,1) is not significantly different from zero.
Autocorrelations
Lag   Covariance    Correlation
  0    0.0045526      1.00000
  1    0.00031398     0.06897
  2   -0.0000160     -.00351
  3   -5.5958E-6     -.00123
  4   -0.0002213     -.04862
  5    0.00002748     0.00604
  6   -0.0000779     -.01712
  7   -0.0000207     -.00454
  8   -0.0003281     -.07207
  9    0.00015664     0.03441
 10    0.00057077     0.12537
Autocorrelation Check for White Noise
To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6        1.91       6     0.9276      0.069 -0.004 -0.001 -0.049  0.006 -0.017
  12       10.02      12     0.6143     -0.005 -0.072  0.034  0.125  0.052 -0.076
  18       21.95      18     0.2344     -0.030 -0.123  0.051 -0.022 -0.013 -0.157
  24       23.37      24     0.4978      0.014 -0.010 -0.037 -0.032 -0.047  0.01
Conditional Least Squares Estimation
Parameter    Estimate    Standard Error   t Value   Lag
MU          -0.0006814      0.0045317      -0.15      0
MA1,1       -0.18767        0.88710        -0.21      1
AR1,1       -0.11768        0.89670        -0.13      1

Constant Estimate    -0.00076
Variance Estimate     0.004585
Std Error Estimate    0.067712
AIC                  -638.889
SBC                  -628.301
Number of Residuals       252
* AIC and SBC do not include log determinant.
Autocorrelation Check of Residuals
To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6        0.75       4     0.9444      0.000  0.004  0.001 -0.049  0.010 -0.019
  12        8.54      10     0.5761      0.003 -0.075  0.032  0.118  0.050 -0.079
  18       21.12      16     0.1741     -0.014 -0.127  0.062 -0.029  0.001 -0.159
  24       22.48      22     0.4314      0.025 -0.011 -0.035 -0.026 -0.045  0.016
Three possible scenarios:
1. The log returns are white noise; then the log returns should pass the white noise test.
2. The log returns are not white noise but fit the time series model; then the log returns should fail the white noise test, but the residuals should pass it.
3. The log returns are not white noise and do not fit the time series model; then the log returns and the residuals will both fail the white noise test.
Warning
Don't rely too much on the residual tests for autocorrelation. If n is large, an autocorrelation might be small but statistically significant. In my opinion, SBC is a better guide to model choice than the residual tests for autocorrelation.
Example: Three-month Treasury bill rates
In our empirical results so far, log returns have had little autocorrelation, though they are not exactly white noise. Other financial time series do have substantial autocorrelation.
Example: monthly interest rates on three-month US Treasury bills from December 1950 until February 1996. The data come from Example 16.1 of Pindyck and Rubinfeld (1998), Econometric Models and Economic Forecasts. The rates are plotted in the next figure. The first differences look somewhat stationary, so we will fit ARMA models to the first differences.
[Figure: 3-month T-bill rates (months since Jan 1950), their first differences, and the SACFs of the rates and of the first differences.]
Autocorrelations (first differences)
Lag   Covariance    Correlation
  0    0.244138      1.00000
  1    0.067690      0.27726
  2   -0.026212     -.10736
  3   -0.022360     -.09159
  4   -0.0091143    -.03733
  5    0.011399      0.04669
  6   -0.045339     -.18571
  7   -0.047987     -.19656
  8    0.022734      0.09312
  9    0.047441      0.19432
 10    0.014282      0.05850
Autocorrelation Check for White Noise
To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6       75.33       6     <.0001      0.277 -0.107 -0.092 -0.037  0.047 -0.186
  12      130.15      12     <.0001     -0.197  0.093  0.194  0.059 -0.007 -0.093
  18      158.33      18     <.0001      0.036  0.157 -0.102  0.005  0.082  0.078
  24      205.42      24     <.0001     -0.033 -0.232 -0.160 -0.015 -0.008 -0.030
Conditional Least Squares Estimation
Parameter    Estimate    Standard Error   t Value   Approx Pr > |t|   Lag
MU           0.0071463      0.02056         0.35        0.7283          0
AR1,1        0.33494        0.04287         7.81        <.0001          1
AR1,2       -0.16456        0.04501        -3.66        0.0003          2
AR1,3        0.01712        0.04535         0.38        0.7060          3
AR1,4       -0.10901        0.04522        -2.41        0.0163          4
AR1,5        0.14252        0.04451         3.20        0.0014          5
AR1,6       -0.21560        0.04451        -4.84        <.0001          6
AR1,7       -0.08347        0.04522        -1.85        0.0655          7
AR1,8        0.10382        0.04536         2.29        0.0225          8
AR1,9        0.10007        0.04502         2.22        0.0267          9
AR1,10      -0.04723        0.04290        -1.10        0.2714         10

Constant Estimate     0.006585
Variance Estimate     0.198648
Std Error Estimate    0.445699
Three month treasury bills
ARIMA model - to first differences
The ARIMA Procedure
AIC                   687.6855
SBC                   735.1743
Number of Residuals        554
* AIC and SBC do not include log determinant.
Three month treasury bills
ARIMA model - to first differences
The ARIMA Procedure
Autocorrelation Check of Residuals
To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6        0.00       0     <.0001      0.003 -0.011  0.003  0.021 -0.015 -0.031
  12        9.56       2     0.0084      0.036 -0.001 -0.031  0.018  0.105 -0.040
  18       42.72       8     <.0001     -0.076  0.177 -0.115  0.081  0.019  0.025
  24       62.06      14     <.0001     -0.062 -0.149 -0.078 -0.025 -0.024 -0.013
  30       65.76      20     <.0001      0.002  0.008  0.045  0.048 -0.043 -0.007
  36       73.52      26     <.0001     -0.070 -0.004 -0.051 -0.003 -0.053 -0.052
  42       74.14      32     <.0001     -0.007  0.028 -0.007 -0.005  0.010  0.006
  48       82.20      38     <.0001     -0.011 -0.000 -0.006  0.001 -0.103  0.050
Autocorrelation Plot of Residuals
Lag   Covariance    Correlation
  0    0.198648      1.00000
  1    0.00057812    0.00291
  2   -0.0020959    -.01055
  3    0.00068451    0.00345
  4    0.0041792     0.02104
  5   -0.0030362    -.01528
  6   -0.0061377    -.03090
  7    0.0071315     0.03590
  8   -0.0001693    -.00085
  9   -0.0061781    -.03110
 10    0.0036055     0.01815
 11    0.020788      0.10465
 12   -0.0078818    -.03968
 13   -0.015171     -.07637
 14    0.035240      0.17740
The AR(10) model does not fit well, so try an AR(24) model with backfitting. Here is the SAS program:

options linesize = 72 ;
data rate1 ;
infile 'c:\courses\or473\data\fygn.dat' ;
input date $ z ;
zdif = dif(z) ;
title 'Three month treasury bills' ;
title2 'AR(24) model to first differences with backfitting' ;
proc autoreg ;
model zdif = / nlag=24 backstep ;
run ;
AR(24) model to first differences with backfitting
The AUTOREG Procedure
Backward Elimination of Autoregressive Terms
Lag   Estimate    t Value   Pr > |t|
 10   0.007567      0.16     0.8721
 23   0.010212      0.22     0.8241
 17   0.008951      0.19     0.8492
  3  -0.014390     -0.32     0.7496
 24   0.015798      0.40     0.6907
 13   0.041434      0.92     0.3605
  7   0.038880      0.85     0.3964
 18  -0.037456     -0.90     0.3702
 22   0.042555      1.02     0.3090
 20   0.058230      1.31     0.1912
  4   0.059903      1.48     0.1389
  9  -0.058141     -1.42     0.1562

Preliminary MSE 0.1765
Estimates of Autoregressive Parameters
Lag   Coefficient   Standard Error   t Value
  1   -0.388246        0.040419       -9.61
  2    0.200242        0.040438        4.95
  5   -0.108069        0.040513       -2.67
  6    0.249095        0.039719        6.27
  8   -0.103462        0.039668       -2.61
 11   -0.102896        0.040278       -2.55
 12    0.119950        0.040704        2.95
 14   -0.204702        0.040427       -5.06
 15    0.223381        0.042441        5.26
 16   -0.151917        0.040811       -3.72
 19    0.103356        0.038847        2.66
 21    0.108074        0.039511        2.74
My analysis of these results: we are probably overfitting, and we should probably look carefully at the SBC values of these models.
Forecasting
ARIMA models can forecast future values. Consider forecasting using an AR(1) process: we have data y_1, ..., y_n and estimates μ̂ and φ̂. The forecast of the next observation is

ŷ_{n+1} = μ̂ + φ̂(y_n - μ̂).
In general,

ŷ_{n+k} = μ̂ + φ̂^k (y_n - μ̂).

If |φ̂| < 1, then as k increases the forecasts decay exponentially fast to μ̂. Forecasting general AR(p) processes is similar.
Example: for an AR(2) process,

y_{n+1} = μ + φ₁(y_n - μ) + φ₂(y_{n-1} - μ) + ε_{n+1},

therefore

ŷ_{n+1} := μ̂ + φ̂₁(y_n - μ̂) + φ̂₂(y_{n-1} - μ̂),

and also

ŷ_{n+2} := μ̂ + φ̂₁(ŷ_{n+1} - μ̂) + φ̂₂(y_n - μ̂), etc.
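The recursion above can be sketched in a few lines (my illustration, not the notes' code; `forecast_ar` is a name I introduce). Unobserved future values are replaced by their own forecasts:

```python
def forecast_ar(y, mu, phis, k):
    """k-step-ahead forecasts for an AR(p) with mean mu and coefficients phis.
    Later forecasts re-use earlier forecasts in place of unobserved future values."""
    hist = list(y)
    out = []
    for _ in range(k):
        p = len(phis)
        yhat = mu + sum(phis[i] * (hist[-1 - i] - mu) for i in range(p))
        hist.append(yhat)
        out.append(yhat)
    return out

# AR(1) example: forecasts decay exponentially fast toward mu.
mu, phi = 1.0, 0.5
f = forecast_ar([3.0], mu, [phi], 4)
print(f)  # [2.0, 1.5, 1.25, 1.125]
```

For the AR(1) case this reproduces ŷ_{n+k} = μ̂ + φ̂^k (y_n - μ̂): here 1 + 0.5^k · 2.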
Forecasting ARMA and ARIMA processes is slightly more complicated and is discussed in time series courses such as ORIE 563. The forecasts can be generated automatically by statistical software such as MINITAB and SAS.
GE daily returns
Fitting an ARIMA(1,0,0) model to the log returns is equivalent to fitting an ARIMA(1,1,0) model to the log prices. We fit both models to the GE daily price data; the next figure shows the forecasts of the log returns up to 24 days ahead.
[Figure: forecasts of the GE log returns, with upper and lower forecast limits.]
[Figure: forecasts of the GE log prices, with data, forecasts, and upper and lower forecast limits.]
MINITAB always forecasts the input series. The two figures show that forecasts of a stationary process behave very differently from forecasts of a nonstationary process.
MATLAB now has a GARCH Toolbox, which can be used to fit ARIMA models as well as GARCH models.
MATLAB Program

load cree_daily_adjclose.txt ;
x = cree_daily_adjclose ;
n = length(x) ;
year = 1993 + (1:n)*(2006-1993)/n ;
net_return = price2ret(x) ;
spec = garchset('R', 2, 'M', 0) ;
[coeff, errors, LLF, innovations, sigmas, summary] = garchfit(spec, net_return) ;
garchdisp(coeff, errors)
garchplot(innovations, sigmas, net_return)
[h, pvalue, Qstat] = lbqtest(innovations, [6 12])
MATLAB Output

Diagnostic Information
Number of variables: 4

Constraints
Number of nonlinear inequality constraints: 2
Number of nonlinear equality constraints:   0
Number of linear inequality constraints:    0
Number of linear equality constraints:      0
Number of lower bound constraints:          4
Number of upper bound constraints:          4

Iter  F-count    f(x)      max constraint   Step-size   Directional deriv   First-order optimality
  0      5     -5005.73      -0.002674
  1     31     -5005.73      -0.002674     -9.54e-007      -0.00417              0.581
  2     57     -5005.73      -0.002674     -9.54e-007      -0.00825              1.81
  3     78     -5005.73      -0.002674      3.05e-005      -0.000112             1.44
  4     97     -5005.73      -0.002674      0.000122       -8.6e-007             1.26
Optimization terminated: magnitude of directional derivative in search direction
less than 2*options.TolFun and maximum constraint violation is less than
options.TolCon. No active inequalities.
Mean: ARMAX(2,0,0); Variance: GARCH(0,0)
Conditional Probability Distribution: Gaussian
Number of Model Parameters Estimated: 4

Standard Error:  0.00090898   0.013994   0.014982   4.2403e-005
T Statistic:     1.1029      -1.7973    -1.7717    63.0711

Ljung-Box test on the innovations (lags 6 and 12):
pvalue = 0.8990  0.7574
Qstat  = 2.2139  8.3481
[Figure: innovations and returns from the GARCH fit to the Cree daily data, n ≈ 3500.]