PART II
EDUCATIONAL AIM
The main objective of the course is to introduce students to the fundamental concepts of
econometrics and its usefulness in analysing financial data. The module is designed to
give students an understanding of why econometrics is necessary, to provide them with a
working ability in basic econometric tools, and to illustrate their application in finance.
More broadly, it is concerned with estimating financial relationships, testing assumptions
involving financial behaviour and forecasting financial variables. The course is highly
participative; there will be exercises on the development of empirical models that are
coherent with financial data and on the explanation of estimation results.
EDUCATIONAL OBJECTIVES
Introduce students to the basic econometric tools for empirical modelling
Familiarise students with applying these tools to estimation, statistical inference, and
forecasting in financial markets.
Develop the necessary skills to critically interpret the results of econometric analyses.
LEARNING OUTCOMES
Have a solid knowledge of the econometrics required to formulate and test
appropriate financial models.
Have a comprehensive understanding of how information from observed financial
data could be processed.
Be able to draw conclusions regarding financial behaviour and to assess critically
the extant empirical research in finance.
Appreciate the range of more advanced techniques that exist and have the foundations
for further study of econometrics.
TEACHING FORMAT
The course will comprise 11 lectures of 2 hours contact time each. Students are also
expected to attend regularly scheduled workshops. The latter are intended to demonstrate
the use of the econometric package EViews for the practical implementation of the
theoretical material covered during the lectures using a data set provided by the lecturer.
LECTURES
The aforementioned aims and the intended learning outcomes will be addressed in a
series of lectures. The lectures will embody activities such as formal lecturing,
participative discussions, and exercises.
Lecture 1
The Simple Linear Regression Model
Lecture 2
Inference in Simple Regression: Interval Estimation and Hypothesis Testing
Lecture 3
The Multiple Regression Model
Lectures 4 and 5
Heteroskedasticity and Autocorrelation: Causes, Consequences, Remedies
Lecture 6
Misspecification Problems: Multicollinearity, Functional Form, Normality,
Omitted Variables
Lecture 7
Seasonality with Financial Data
Lecture 8
Time Series Analysis and Stationarity
Lecture 9
ARMA Models
Lecture 10
Limited Dependent Variable Models
Lecture 11
Revision
LECTURE 1
What is Econometrics? What is Regression Analysis? The Basic Econometric Model,
Introducing the Error Term, Assumptions of Simple Linear Regression Model, Parameter
Interpretation, Point Estimation, The Method of Ordinary Least Squares, Properties and
Precision of the Least Squares Estimators.
LECTURE 2
Interval Estimation: Confidence Intervals for Regression Parameters, Hypothesis Testing,
Testing a Hypothesis Involving a Linear Combination of Parameters, Test Statistics,
Critical Region, p Value, Scaling and Units of Measurement, Functional Form.
LECTURE 3
The Multiple Regression Model, Assumptions of Multiple Regression Model,
Interpretation of Multiple Regression Equation, The Meaning of the Regression
Coefficients, Single Hypothesis Tests, Goodness of Fit, Joint Hypotheses Tests, The
F-test, Testing the Overall Significance of a Model.
LECTURE 4
The Nature of the Heteroskedasticity Problem, Causes of Heteroskedastic Errors,
Consequences of Ignoring Heteroskedasticity, Detecting Heteroskedasticity: Graphical
Method and Tests, Treatment of Heteroskedasticity.
LECTURE 5
The Nature of the Autocorrelation Problem, First-Order Autocorrelated Errors, Causes of
Autocorrelation, Consequences of Ignoring Autocorrelation, Detecting Autocorrelation:
Graphical Method and Tests, Treatment of Autocorrelation.
LECTURE 6
Other Misspecification Issues, Multicollinearity, Non-normality of Errors, Nonlinearity,
Functional Form, The RESET Misspecification Test, Omitted Variable Bias, Irrelevant
Variables.
LECTURE 7
The Use of Intercept Dummy Variables, The Use of Interaction (Slope) Dummy
Variables, Controlling for Time: Estimating Seasonal Effects, Testing for Structural
Stability.
LECTURE 8
White Noise, Non-Stationary vs. Stationary Processes, Stochastic and Deterministic
Trend, Random Walk, Stationarity, Unit Root Tests.
LECTURE 9
Time Series Regression, ARMA Models, The Autocorrelation
Autocorrelation Functions, Identification, Estimation, Diagnostic Tests.
and
Partial
LECTURE 10
Discrete Variable Models, Linear Probability Model, Probit Model, Logit Model,
Parameter Interpretation, Goodness of Fit.
LECTURE 11
Review of course.
ASSESSMENT
There will be one piece of group coursework, and a written examination that will be
weighted as 30% and 70%, respectively. The coursework is highly empirical, and the
students will have to apply their theoretical and quantitative skills to investigate a given
problem in finance. The coursework should demonstrate a sufficient understanding of the
issues analyzed during the course.
READING LIST
Financial Econometrics
Dr Elena Kalotychou
Topic 1: Simple linear regression (assumptions, OLS estimation)
Topic 2: Hypothesis testing (single and multiple hypotheses)
Topic 3: Multiple regression (cross-section and time series)
Topic 4: Goodness of fit statistics
Topic 5: Violations of the assumptions (causes, consequences, remedies)
Topic 6: Non-stationarity, unit root tests; ARMA models
Topic 7: Limited dependent variable models
Administrative Preliminaries
This module comprises 22 hours of lectures plus EViews sessions in the
computer lab.
Attend all the lectures/labs and don't switch groups.
Textbook: Brooks, but others are also good.
Assessment:
Comprising 30% group coursework and 70% individual final exam.
Introduction:
The Nature and Purpose of Econometrics
What is Econometrics?
[Table: typical observation frequencies of financial and economic data: annually, monthly or quarterly, weekly, or tick by tick as transactions occur.]
It is preferable not to work directly with asset prices, so we usually convert the
raw prices into a series of returns. There are two ways to do this:

Simple returns: Rt = ((pt − pt−1) / pt−1) × 100%

or log returns: rt = ln(pt / pt−1) × 100%
Log Returns
The returns are also known as log price relatives. We will use the log-returns.
There are a number of reasons for this:
1. They have the nice property that they can be interpreted as continuously
compounded returns.
2. We can add them up, e.g. if we want a weekly return and we have calculated
daily log returns:
r1 = ln(p1/p0) = ln p1 − ln p0
r2 = ln(p2/p1) = ln p2 − ln p1
r3 = ln(p3/p2) = ln p3 − ln p2
r4 = ln(p4/p3) = ln p4 − ln p3
r5 = ln(p5/p4) = ln p5 − ln p4
Summing, r1 + r2 + r3 + r4 + r5 = ln p5 − ln p0 = ln(p5/p0), the weekly log return.
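The additivity of log returns can be checked numerically: a minimal sketch with numpy, using made-up prices (the price values are illustrative, not taken from the notes).

```python
import numpy as np

# Hypothetical daily closing prices p0..p5 (illustrative values only)
prices = np.array([100.0, 101.0, 99.5, 102.0, 103.5, 104.0])

# Daily log returns: r_t = ln(p_t / p_{t-1})
log_returns = np.diff(np.log(prices))

# Summing the five daily log returns gives the weekly log return ln(p5/p0)
weekly_from_daily = log_returns.sum()
weekly_direct = np.log(prices[-1] / prices[0])

assert np.isclose(weekly_from_daily, weekly_direct)
```

The same additivity does not hold for simple returns, which is the main practical reason for preferring log returns when aggregating over time.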
[Flow chart of the stages of econometric modelling: if the model is deemed adequate ("Yes"), interpret it and use it for analysis; otherwise reformulate the model.]
Regression
Regression is probably the single most important tool at the econometrician's
disposal.
But what is regression analysis?
It is concerned with describing and evaluating the relationship between a given
variable (usually called the dependent variable) and one or more other
variables (usually known as the independent variable(s)).
Some Notation
Denote the dependent variable by y and the independent variable(s) by
x1, x2, ..., xk where there are k independent variables.
Some alternative names for the y and x variables:
y: dependent variable, regressand, effect variable, explained variable
x: independent variables, regressors, causal variables, explanatory variables
Note that there can be many x variables but we will limit ourselves to the case
where there is only one x variable to start with. In our set-up, there is only one
y variable.
Simple Regression
For simplicity, suppose that k = 1. This is the situation where y depends on only
one x variable.
Examples of the kind of relationship that may be of interest include:
How asset returns vary with their level of market risk
Measuring the long-term relationship between spot prices and dividends.
Constructing an optimal hedge ratio
We have some intuition that the beta on this fund is positive, and we therefore
want to find whether there appears to be a relationship between x and y given
the data that we have. The first stage would be to form a scatter plot of the two
variables.
[Scatter plot: excess return on fund XXX (y axis, roughly 0 to 45) plotted against the excess return on the market (x axis, roughly 0 to 25).]
The most common method used to fit a line to the data is known as
OLS (ordinary least squares).
What we actually do is take each distance and square it (i.e. take the
area of each of the squares in the diagram) and minimise the total sum
of the squares (hence least squares).
Tightening up the notation, let
yt denote the actual data point t,
ŷt denote the fitted value from the regression line, and
ût denote the residual, ût = yt − ŷt.
[Diagram: actual point yt, fitted value ŷt and residual ût at a given xt.]
So we minimise û1² + û2² + û3² + û4² + û5², or minimise Σt ût². This is known as the
residual sum of squares (RSS).
But ŷt = α̂ + β̂xt, so let
L = Σt (yt − ŷt)² = Σt (yt − α̂ − β̂xt)²
Minimising L with respect to α̂ and β̂ gives the first-order conditions
Σt (yt − α̂ − β̂xt) = 0    (1)
Σt xt(yt − α̂ − β̂xt) = 0    (2)
From (1), Σ yt − Tα̂ − β̂ Σ xt = 0.
But Σ yt = Tȳ and Σ xt = Tx̄, so
ȳ − α̂ − β̂x̄ = 0    (3)
From (3),
α̂ = ȳ − β̂x̄    (4)
From (2), substituting in (4):
Σ xt(yt − ȳ + β̂x̄ − β̂xt) = 0    (5)
Σ xt yt − ȳ Σ xt + β̂x̄ Σ xt − β̂ Σ xt² = 0
Σ xt yt − Tx̄ȳ + β̂Tx̄² − β̂ Σ xt² = 0
Rearranging for β̂, so overall we have
β̂ = (Σ xt yt − Tx̄ȳ) / (Σ xt² − Tx̄²) and α̂ = ȳ − β̂x̄
Question: If an analyst tells you that she expects the market to yield a return
20% higher than the risk-free rate next year, what would you expect the return
on fund XXX to be?
Solution: We can say that the expected value of y = −1.74 + 1.64 × value of x,
so plug x = 20 into the equation to get the expected value for y:
ŷ = −1.74 + 1.64 × 20 = 31.06%
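The OLS formulae and the prediction step can be sketched in Python. The data below are invented for illustration, so the fitted coefficients will not match the −1.74 and 1.64 of the fund example.

```python
import numpy as np

# Hypothetical paired observations on x (excess market return) and
# y (excess fund return) -- illustrative numbers only
x = np.array([2.0, 5.0, 8.0, 11.0, 15.0, 18.0, 21.0, 24.0])
y = np.array([1.0, 7.0, 12.0, 17.0, 24.0, 28.0, 33.0, 39.0])

T = len(y)
x_bar, y_bar = x.mean(), y.mean()

# beta_hat = (sum(x*y) - T*xbar*ybar) / (sum(x^2) - T*xbar^2)
beta_hat = (np.sum(x * y) - T * x_bar * y_bar) / (np.sum(x**2) - T * x_bar**2)
# alpha_hat = ybar - beta_hat * xbar
alpha_hat = y_bar - beta_hat * x_bar

# Prediction for a given x, e.g. x = 20
y_pred = alpha_hat + beta_hat * 20

# Cross-check against numpy's built-in least-squares line fit
slope, intercept = np.polyfit(x, y, 1)
assert np.isclose(beta_hat, slope) and np.isclose(alpha_hat, intercept)
```

The closed-form expressions agree with `np.polyfit` because both minimise the same residual sum of squares.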
[Example: in an opinion poll, the population of interest is the entire electorate; we must infer about it from a sample.]
The population regression function (PRF) is
yt = α + βxt + ut
The SRF is
ŷt = α̂ + β̂xt, with ût = yt − ŷt
Linearity
In order to use OLS, we need a model which is linear in the parameters (α
and β). It does not necessarily have to be linear in the variables (y and x).
Linear in the parameters means that the parameters are not multiplied
together, divided, squared or cubed etc.
Some models can be transformed to linear ones by a suitable substitution
or manipulation. For example, if theory suggests that y and x should be
inversely related:
yt = α + β/xt + ut
then the regression can be estimated using OLS by substituting
zt = 1/xt, giving yt = α + βzt + ut.
Similarly, a model such as ln yt = α + β ln xt + ut is linear in the parameters.
This is known as the exponential regression model. Here, the coefficients can
be interpreted as elasticities.
Estimator or Estimate?
α̂ and β̂ are the estimators of the true values of α and β; a particular numerical value obtained from a given sample is an estimate.
If assumptions 1 through 4 hold, then the estimators α̂ and β̂ determined by
OLS are known as Best Linear Unbiased Estimators (BLUE).
What does the acronym stand for?
Best: the OLS estimator has minimum variance among linear unbiased estimators
Linear: the estimator is a linear function of the data
Unbiased: on average, the estimated values equal the true values
Estimator: α̂ and β̂ are estimators of the true values of α and β
Consistency/Unbiasedness/Efficiency
Consistent
The least squares estimators α̂ and β̂ are consistent. That is, the estimates will
converge to their true values as the sample size increases to infinity. We need the
assumptions E(xt ut) = 0 and Var(ut) = σ² < ∞ to prove this. Consistency implies that
lim(T→∞) Pr[|β̂ − β| > δ] = 0 for any δ > 0
Unbiased
The least squares estimates of α̂ and β̂ are unbiased. That is, E(α̂) = α and
E(β̂) = β.
Thus on average the estimated values will be equal to the true values. To prove this
also requires the assumption that E(ut) = 0. Unbiasedness is a stronger condition
than consistency.
Efficiency
An estimator β̂ of parameter β is said to be efficient if it is unbiased and no other
unbiased estimator has a smaller variance. If the estimator is efficient, we are
minimising the probability that it is a long way off from the true value of β.
The standard errors of the OLS estimators are
SE(α̂) = s √( Σ xt² / (T Σ (xt − x̄)²) ) = s √( Σ xt² / (T(Σ xt² − Tx̄²)) )
SE(β̂) = s √( 1 / Σ (xt − x̄)² ) = s √( 1 / (Σ xt² − Tx̄²) )
where s is the estimated standard deviation of the residuals, given by
s² = Σ ût² / (T − 2)
Example: given T = 22 and
Σ xt yt = 830102, x̄ = 416.5, Σ xt² = 3919654, ȳ = 86.65, RSS = Σ ût² = 130.6
Calculations:
β̂ = (830102 − 22 × 416.5 × 86.65) / (3919654 − 22 × 416.5²) = 0.35
α̂ = ȳ − β̂x̄ = 86.65 − 0.35 × 416.5 = −59.12
We write
ŷt = −59.12 + 0.35xt
s² = Σ ût² / (T − 2) = 130.6 / 20, so s = 2.55
SE(α̂) = 2.55 × √( 3919654 / (22 × 3919654 − 22² × 416.5²) ) = 3.35
SE(β̂) = 2.55 × √( 1 / (3919654 − 22 × 416.5²) ) = 0.0079
We want to make inferences about the likely population values from the
regression parameters.
Example: Suppose we have the following regression results
(standard errors in parentheses):
ŷt = 20.3 + 0.5091xt
     (14.38) (0.2561)
Under the CLRM assumptions with normal disturbances, the OLS estimators are
normally distributed:
α̂ ~ N(α, Var(α̂)) and β̂ ~ N(β, Var(β̂))
What if the disturbances are not normally distributed? Will the parameter
estimates still be normally distributed?
Yes, if the other assumptions of the CLRM hold, and the sample size is
sufficiently large.
Standardising the estimators gives
(α̂ − α)/√Var(α̂) ~ N(0,1) and (β̂ − β)/√Var(β̂) ~ N(0,1)
But Var(α̂) and Var(β̂) are unknown, so we replace them by their estimated
standard errors, which changes the distribution to a t:
(α̂ − α)/SE(α̂) ~ t(T−2) and (β̂ − β)/SE(β̂) ~ t(T−2)
Testing Hypotheses:
The Test of Significance Approach
Assume that, for t = 1, 2, ..., T, the regression equation is given by
yt = α + βxt + ut
5. Given a significance level, we can determine a rejection region and
non-rejection region. For a 2-sided test:
[Figure: density f(x) with a 2.5% rejection region in each tail and a 95% non-rejection region in the middle.]
[Figures: for a one-sided test, the entire 5% rejection region lies in a single tail (upper or lower, depending on the alternative), with a 95% non-rejection region.]
You should all be familiar with the normal distribution and its characteristic
bell shape.
We can scale a normal variate to have zero mean and unit variance by
subtracting its mean and dividing by its standard deviation.
There is, however, a specific relationship between the t- and the standard
normal distribution. Both are symmetrical and centred on zero.
But the t-distribution has another parameter: its degrees of freedom. We will
always know this (for the time being, from the number of observations − 2).
[Figure: the t-distribution has fatter tails than the normal distribution.]
[Figure: critical values of the t(4) distribution: 2.13, 2.78 and 4.60 cut off 5%, 2.5% and 0.5% respectively in the upper tail.]
The reason for using the t-distribution rather than the standard normal is that
we had to estimate σ², the variance of the disturbances.
4. The confidence interval is given by
(β̂ − tcrit × SE(β̂), β̂ + tcrit × SE(β̂))
5. Perform the test: If the hypothesised value of β (β*) lies outside the
confidence interval, then reject the null hypothesis that β = β*, otherwise do
not reject the null.
The two approaches are equivalent: not rejecting under the test of significance
approach means
−tcrit ≤ (β̂ − β*)/SE(β̂) ≤ +tcrit
which rearranges to
β̂ − tcrit × SE(β̂) ≤ β* ≤ β̂ + tcrit × SE(β̂)
But this is just the rule under the confidence interval approach.
Using the regression results above,
ŷt = 20.3 + 0.5091xt
     (14.38) (0.2561)
with T = 22.
Using both the test of significance and confidence interval approaches, test the
hypothesis that β = 1 against a two-sided alternative.
The first step is to obtain the critical value. We want tcrit = t20;5% = 2.086.
[Table of critical values of the t-distribution, for significance levels from 0.4 down to 0.0005 and degrees of freedom from 1 to 300. Source: Biometrika Tables for Statisticians (1966), Volume 1, 3rd Edition. Reprinted with permission of Oxford University Press.]
[Figure: t(20) distribution with the 5% two-sided rejection regions beyond −2.086 and +2.086.]
Confidence interval approach:
β̂ ± tcrit × SE(β̂) = 0.5091 ± 2.086 × 0.2561 = (−0.0251, 1.0433)
Since the hypothesised value of 1 lies inside this interval, do not reject H0.
Test of significance approach:
test statistic = (β̂ − β*)/SE(β̂) = (0.5091 − 1)/0.2561 = −1.917
Since −2.086 < −1.917 < +2.086, the statistic lies in the non-rejection region:
do not reject H0.
Other hypotheses can be tested in the same way, e.g.
H0: β = 2 vs. H1: β ≠ 2
Changing the size of the test: note that we looked at only a 5% size of test. In
marginal cases, a different size of test may give a different answer. Say we
wanted to use a 10% size of test. The test statistic is unchanged:
test statistic = (0.5091 − 1)/0.2561 = −1.917
The only thing that changes is the critical value: t20;10% = 1.725.
[Figure: t distribution with a 5% rejection region in each tail beyond −1.725 and +1.725.]
So now, as the test statistic lies in the rejection region, we
would reject H0.
If we reject the null hypothesis at the 5% level, we say that the result of the test
is statistically significant.
The errors we can make using hypothesis tests:

                                Reality
  Result of test                H0 is true           H0 is false
  Significant (reject H0)       Type I error (= α)   correct
  Insignificant (do not
  reject H0)                    correct              Type II error (= β)
The probability of a type I error is just α, the significance level or size of test we
chose. To see this, recall what we said significance at the 5% level meant: it is only
5% likely that a result as or more extreme as this could have occurred purely by
chance.
Note that there is no chance for a free lunch here! What happens if we reduce the size
of the test (e.g. from a 5% test to a 1% test)? We reduce the chances of making a type
I error ... but we also reduce the probability that we will reject the null hypothesis at
all, so we increase the probability of a type II error:
Reduce size of test → more strict criterion for rejection → reject null hypothesis
less often → less likely to falsely reject, but more likely to incorrectly not reject.
So there is always a trade-off between type I and type II errors when choosing a
significance level. The only way we can reduce the chances of both is to increase the
sample size.
The multiple regression model can be written in matrix form as
y = Xβ + u, where t = 1, 2, ..., T and
y is T × 1, X is T × k, β is k × 1, u is T × 1.
For example, with a constant and one regressor (k = 2):

  [y1]   [1  x21]          [u1]
  [y2] = [1  x22] [β1]  +  [u2]
  [...]  [ ... ]  [β2]     [...]
  [yT]   [1  x2T]          [uT]

The RSS would be given by
û'û = [û1 û2 ... ûT][û1 û2 ... ûT]' = û1² + û2² + ... + ûT² = Σ ût²
The OLS estimator for the multiple regression model is
β̂ = (X'X)⁻¹X'y
Previously, to estimate the variance of the errors, σ², we used s² = Σ ût² / (T − 2).
Now, using the matrix notation, we use
s² = û'û / (T − k)
where k is the number of parameters estimated (including the constant).
Example: suppose T = 15, k = 3 and RSS = 10.96. Then
s² = RSS / (T − k) = 10.96 / (15 − 3) = 0.91
The diagonal elements of s²(X'X)⁻¹ give
Var(β̂1) = 1.83, Var(β̂2) = 0.91, Var(β̂3) = 3.93
SE(β̂1) = 1.35, SE(β̂2) = 0.96, SE(β̂3) = 1.98
We write:
ŷt = 1.10 − 4.40x2t + 19.88x3t
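The matrix formulae can be sketched with numpy on synthetic data (the data-generating parameters below are invented for illustration, not the data behind the example above):

```python
import numpy as np

# Synthetic data set: T = 15 observations, k = 3 parameters (constant + 2 regressors)
rng = np.random.default_rng(0)
T = 15
x2, x3 = rng.normal(size=T), rng.normal(size=T)
u = rng.normal(size=T)
y = 1.0 + 2.0 * x2 - 3.0 * x3 + u     # assumed true model, for illustration

X = np.column_stack([np.ones(T), x2, x3])   # T x k design matrix
k = X.shape[1]

# beta_hat = (X'X)^{-1} X'y  (solve is numerically safer than an explicit inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals and s^2 = u'u / (T - k)
u_hat = y - X @ beta_hat
s2 = (u_hat @ u_hat) / (T - k)

# Standard errors: square roots of the diagonal of s^2 (X'X)^{-1}
var_beta = s2 * np.linalg.inv(X.T @ X)
se_beta = np.sqrt(np.diag(var_beta))
```

Dividing by T − k rather than T − 2 generalises the earlier simple-regression formula to k estimated parameters.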
For each coefficient we can compute the test statistic
(β̂i − βi*) / SE(β̂i)
with H0: βi = 0 and H1: βi ≠ 0, i.e. a test that the population coefficient is zero
against a two-sided alternative; this is known as a t-ratio test. Since βi* = 0, the
test statistic reduces to β̂i / SE(β̂i).
Using the example above, with T − k = 15 − 3 = 12 d.f., the critical values are
2.179 (5%) and 3.055 (1%):
Do we reject H0: β1 = 0? t-ratio = 1.10/1.35 = 0.81 (No)
Do we reject H0: β2 = 0? t-ratio = −4.40/0.96 = −4.63 (Yes)
Do we reject H0: β3 = 0? t-ratio = 19.88/1.98 = 10.04 (Yes)
Data Mining
Data mining is searching many series for statistical relationships without
theoretical justification.
For example, suppose we generate one dependent variable and twenty
explanatory variables completely randomly and independently of each other.
If we regress the dependent variable separately on each independent variable,
on average one slope coefficient will be significant at 5%.
If data mining occurs, the true significance level will be greater than the
nominal significance level.
An example: for each stock j, run the excess-return (CAPM-style) regression
Rjt − Rft = αj + βj(Rmt − Rft) + ujt
[Table: cross-sectional summary of the estimates of α̂ and β̂ and the t-ratio on α̂:]

            α̂        β̂      t-ratio on α̂
  Mean     −0.02%    0.91    −0.07
  Minimum  −0.54%    0.56    −2.44
  Maximum   0.33%    1.09     3.11
  Median   −0.03%    0.91    −0.25
The F-test statistic is
F = ((RRSS − URSS) / URSS) × ((T − k) / m)
The F-Distribution
The test statistic follows the F-distribution, which has 2 d.f. parameters.
The values of the degrees of freedom parameters are m and (T−k) respectively
(the order of the d.f. parameters is important).
The appropriate critical value will be in column m, row (T-k).
The F-distribution has only positive values and is not symmetrical. We
therefore only reject the null if the test statistic > critical F-value.
Example
The general regression is
yt = β1 + β2x2t + β3x3t + β4x4t + ut    (1)
We want to test the restriction that β3 + β4 = 1 (we have some hypothesis from
theory which suggests that this would be an interesting hypothesis to study).
The unrestricted regression is (1) above, but what is the restricted regression?
yt = β1 + β2x2t + β3x3t + β4x4t + ut  s.t.  β3 + β4 = 1
We substitute the restriction (β3 + β4 = 1) into the regression so that it is
automatically imposed on the data:
β3 + β4 = 1 ⇒ β4 = 1 − β3
yt = β1 + β2x2t + β3x3t + (1 − β3)x4t + ut
yt = β1 + β2x2t + β3x3t + x4t − β3x4t + ut
Gather terms in βs together and rearrange:
(yt − x4t) = β1 + β2x2t + β3(x3t − x4t) + ut
This is the restricted regression. We actually estimate it by creating two
new variables, call them, say, Pt and Qt:
Pt = yt − x4t
Qt = x3t − x4t
so Pt = β1 + β2x2t + β3Qt + ut is the restricted regression we actually estimate.
Examples:
  H0: hypothesis                      No. of restrictions, m
  β1 + β2 = 2                         1
  β2 = 1 and β3 = −1                  2
  β2 = 0, β3 = 0 and β4 = 0           3
If the model is yt = β1 + β2x2t + β3x3t + β4x4t + ut, then the null hypothesis
H0: β2 = 0 and β3 = 0 and β4 = 0 is tested by the regression F-statistic. It
tests the null hypothesis that all of the coefficients except the intercept
coefficient are zero.
Note the form of the alternative hypothesis for all tests when more than one
restriction is involved: H1: β2 ≠ 0, or β3 ≠ 0, or β4 ≠ 0
We cannot test hypotheses which are not linear or which are multiplicative
using this framework, e.g.
H0: β2β3 = 2 or H0: β2² = 1
cannot be tested.
F-test Example
Question: Suppose a researcher wants to test whether the returns on a
company stock (y) show unit sensitivity to two factors (factor x2 and factor x3)
among three considered. The regression is carried out on 144 monthly
observations. The regression is yt = β1 + β2x2t + β3x3t + β4x4t + ut
- What are the restricted and unrestricted regressions?
- If the two RSS are 436.1 and 397.2 respectively, perform the test.
Solution:
Unit sensitivity implies H0: β2 = 1 and β3 = 1. The unrestricted regression is the
one in the question. The restricted regression is (yt − x2t − x3t) = β1 + β4x4t + ut or,
letting zt = yt − x2t − x3t, the restricted regression is zt = β1 + β4x4t + ut.
In the F-test formula, T = 144, k = 4, m = 2, RRSS = 436.1, URSS = 397.2.
F-test statistic = 6.86. Critical values are F(2,140) = 3.07 (5%) and 4.79 (1%).
Conclusion: Reject H0.
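Plugging the example's numbers into the F-test formula (note that the ratio evaluates to about 6.86):

```python
# F-test for linear restrictions, using the numbers from the example:
# T = 144 observations, k = 4 parameters, m = 2 restrictions
RRSS, URSS = 436.1, 397.2
T, k, m = 144, 4, 2

# F = ((RRSS - URSS) / URSS) * ((T - k) / m)
F_stat = ((RRSS - URSS) / URSS) * ((T - k) / m)

# Compare with the 5% critical value F(2, 140) = 3.07 from the tables
reject_5pct = F_stat > 3.07   # True: reject H0 of unit sensitivity
```

Since the statistic also exceeds the 1% critical value of 4.79, the rejection is robust to the choice of significance level here.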
[Table of 5% critical values of the F-distribution, for numerator degrees of freedom m and denominator degrees of freedom (T−k). Source: Biometrika Tables for Statisticians (1966), Volume 1, 3rd Edition. Reprinted with permission of Oxford University Press.]
We would like some measure of how well our regression model actually fits
the data.
We have goodness of fit statistics to test this: i.e. how well the sample
regression function (SRF) fits the data.
The most common goodness of fit statistic is known as R². One way to define
R² is to say that it is the square of the correlation coefficient between y and ŷ.
For another explanation, recall that what we are interested in doing is
explaining the variability of y about its mean value, ȳ, i.e. the total sum of
squares, TSS:
TSS = Σt (yt − ȳ)²
We can split the TSS into two parts, the part which we have explained (known
as the explained sum of squares, ESS) and the part which we did not explain
using the model (the RSS).
Defining R²
That is, TSS = ESS + RSS:
Σt (yt − ȳ)² = Σt (ŷt − ȳ)² + Σt ût²
The goodness of fit statistic is then
R² = ESS/TSS = (TSS − RSS)/TSS = 1 − RSS/TSS
R² must always lie between zero and one. To understand this, consider two
extremes:
RSS = TSS, i.e. ESS = 0, so R² = ESS/TSS = 0
ESS = TSS, i.e. RSS = 0, so R² = ESS/TSS = 1
[Figures: scatter plots of yt against xt illustrating the two extreme cases, R² = 0 and R² = 1.]
Adjusted R²
In order to get around these problems, a modification is often made which
takes into account the loss of degrees of freedom associated with adding extra
variables. This is known as R̄², or adjusted R²:
R̄² = 1 − [(T − 1)/(T − k)] (1 − R²)
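The decomposition and the adjusted R² formula can be sketched with numpy (the data are invented for illustration):

```python
import numpy as np

# Illustrative data and a simple OLS line fit (hypothetical numbers)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
T, k = len(y), 2                     # k = 2 parameters: intercept + slope

slope, intercept = np.polyfit(x, y, 1)
y_fit = intercept + slope * x

TSS = np.sum((y - y.mean()) ** 2)    # total sum of squares
RSS = np.sum((y - y_fit) ** 2)       # residual sum of squares
ESS = TSS - RSS                      # explained sum of squares

R2 = 1 - RSS / TSS
R2_adj = 1 - (T - 1) / (T - k) * (1 - R2)
```

Since (T − 1)/(T − k) ≥ 1, the adjusted R² is never larger than R², and it falls when an added regressor does not improve the fit enough to offset the lost degree of freedom.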
A Regression Example:
Hedonic House Pricing Models
Hedonic models are used to value real assets, especially housing, and view
the asset as representing a bundle of characteristics.
Des Rosiers and Thériault (1996) consider the effect of various amenities on
rental values for buildings and apartments in 5 sub-markets in the Quebec area
of Canada.
The rental value in Canadian Dollars per month (the dependent variable) is a
function of 9 to 14 variables (depending on the area under consideration). The
paper employs 1990 data, and for the Quebec City region, there are 13,378
observations, and the 12 explanatory variables are:
LnAGE - log of the apparent age of the property
NBROOMS - number of bedrooms
AREABYRM - area per room (in square metres)
ELEVATOR - a dummy variable = 1 if the building has an elevator; 0 otherwise
BASEMENT - a dummy variable = 1 if the unit is located in a basement; 0 otherwise
  Variable      Coefficient   t-ratio
  Intercept     282.21        56.09
  LnAGE         −53.10        −59.71
  NBROOMS       48.47         104.81
  AREABYRM      3.97          29.99
  ELEVATOR      88.51         45.04
  BASEMENT      −15.90        −11.32
  OUTPARK       7.17          7.07
  INDPARK       73.76         31.25
  NOLEASE       −16.99        −7.62
  LnDISTCBD     5.84          4.60
  SINGLPAR      −4.27         −38.88
  DSHOPCNTR     −10.04        −5.97
  VACDIFF1      0.29          5.98

Notes: Adjusted R² = 0.651; regression F-statistic = 2082.27. Source: Des Rosiers and
Thériault (1996). Reprinted with permission of the American Real Estate Society.
Assumption 1: E(ut) = 0
Assumption 2: Var(ut) = σ² < ∞. Errors whose variance is not constant are said
to be heteroscedastic.
[Figure: residuals plotted against x2t, with points more widely scattered around the regression line for larger x.]
Example: in the market model Rt = β1 + β2RtM + εt, heteroscedasticity arises if
var(εt) = var(Rt | RtM) is not constant.
Detection of Heteroscedasticity
Graphical methods
Formal tests: one of the best is White's general test for heteroscedasticity.
The test is carried out as follows:
1. Assume that the regression we carried out is as follows
yt = β1 + β2x2t + β3x3t + ut
and we want to test Var(ut) = σ². We estimate the model, obtaining the
residuals, ût.
2. Then run the auxiliary regression
ût² = α1 + α2x2t + α3x3t + α4x2t² + α5x3t² + α6x2tx3t + vt
3. Obtain R² from the auxiliary regression and multiply it by the number of
observations, T. It can be shown that
T × R² ~ χ²(m)
where m is the number of regressors in the auxiliary regression excluding the
constant term.
4. If the χ² test statistic from step 3 is greater than the corresponding value
from the statistical table then reject the null hypothesis that the disturbances are
homoscedastic.
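The steps of White's test can be sketched with numpy on simulated data; the data-generating process below is an assumption chosen so that the errors really are heteroscedastic (their spread grows with x2), so the statistic should tend to be large:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200

# Hypothetical regressors; error spread grows with |x2| (heteroscedastic by design)
x2, x3 = rng.normal(size=T), rng.normal(size=T)
u = rng.normal(size=T) * (1 + np.abs(x2))
y = 1.0 + 0.5 * x2 - 0.3 * x3 + u

# Step 1: estimate the original model and keep the residuals
X = np.column_stack([np.ones(T), x2, x3])
b = np.linalg.lstsq(X, y, rcond=None)[0]
u_hat = y - X @ b

# Step 2: auxiliary regression of u_hat^2 on levels, squares and cross-product
Z = np.column_stack([np.ones(T), x2, x3, x2**2, x3**2, x2 * x3])
g = np.linalg.lstsq(Z, u_hat**2, rcond=None)[0]
fit = Z @ g
u2 = u_hat**2
aux_R2 = 1 - np.sum((u2 - fit) ** 2) / np.sum((u2 - u2.mean()) ** 2)

# Step 3: T * R^2 ~ chi-squared(m), with m = 5 regressors excluding the constant
test_stat = T * aux_R2
chi2_crit_5pct = 11.07               # chi-squared(5) 5% critical value
reject = test_stat > chi2_crit_5pct  # heteroscedasticity detected?
```

The same test is available pre-packaged (e.g. `het_white` in statsmodels); writing it out makes the T × R² construction explicit.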
Under the model, E(yt | xt) = β1 + β2xt and
var(yt | xt) = var(εt) = E(εt²) = σt²
Heteroscedasticity means that the error variance σt² changes over the sample.
If the form (i.e. the cause) of the heteroscedasticity is known, then we can use
an estimation method which takes this into account (called generalised least
squares, GLS).
A simple illustration of GLS is as follows: Suppose that the error variance is
related to another variable zt by var(ut) = σ²zt².
To remove the heteroscedasticity, divide the regression equation by zt:
yt/zt = β1(1/zt) + β2(x2t/zt) + β3(x3t/zt) + vt
where vt = ut/zt is an error term.
Now var(vt) = var(ut/zt) = var(ut)/zt² = σ²zt²/zt² = σ² for known zt.
So the disturbances from the new regression equation will be homoscedastic.
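The GLS transformation can be sketched on simulated data (the model parameters and the form var(ut) = σ²zt² are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500

# Suppose var(u_t) = sigma^2 * z_t^2 for an observable, known z_t
z = rng.uniform(0.5, 3.0, size=T)
x2, x3 = rng.normal(size=T), rng.normal(size=T)
u = rng.normal(size=T) * z          # heteroscedastic errors by construction
y = 1.0 + 0.5 * x2 - 0.3 * x3 + u

# GLS: divide every term (including the constant) by z_t, then run OLS
X_star = np.column_stack([1 / z, x2 / z, x3 / z])
y_star = y / z
beta_gls = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

# The transformed errors v_t = u_t / z_t are homoscedastic with variance sigma^2
```

Note that after the transformation the model has no ordinary intercept; the column 1/zt plays the role of the constant term.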
Example: constructing the lagged value and first difference of a residual series ût:

  ût      ût−1    Δût = ût − ût−1
  0.8     -       -
  1.3     0.8     1.3 − 0.8 = 0.5
  −0.9    1.3     −0.9 − 1.3 = −2.2
  0.2     −0.9    0.2 − (−0.9) = 1.1
  −1.7    0.2     −1.7 − 0.2 = −1.9
  2.3     −1.7    2.3 − (−1.7) = 4.0
  0.1     2.3     0.1 − 2.3 = −2.2
  0.0     0.1     0.0 − 0.1 = −0.1
  ...     ...     ...
Autocorrelation
We assumed of the CLRM's errors that Cov(ui, uj) = 0 for i ≠ j. This is
essentially the same as saying there is no pattern in the disturbances.
In a regression, if there are patterns in the residuals from a model, we say that
they are autocorrelated.
Obviously we never have the actual u's, so we use their sample counterpart, the
residuals (the ût).
Some stereotypical patterns we may find in the residuals are given on the next
three slides.
Positive Autocorrelation
[Figures: ût plotted against time and against ût−1, showing a slowly evolving, cyclical pattern and a positive association.]
Negative Autocorrelation
[Figures: ût plotted against time and against ût−1, showing alternating signs and a negative association.]
No Autocorrelation
[Figure: ût plotted against ût−1, showing no pattern.]
Detecting Autocorrelation:
The Durbin-Watson Test
The Durbin-Watson (DW) test is a test for first-order autocorrelation, i.e. it
assumes that the relationship is between an error and the previous one:
ut = ρut−1 + vt    (1)
where vt ~ N(0, σv²).
The DW test statistic actually tests
H0: ρ = 0 and H1: ρ ≠ 0
The test statistic is calculated by
DW = Σt=2..T (ût − ût−1)² / Σt=2..T ût²
DW ≈ 2(1 - ρ̂)                (2)
where ρ̂ is the estimated correlation coefficient. Since ρ̂ is a correlation, it implies that -1 ≤ ρ̂ ≤ 1.
Rearranging for DW from (2) would give 0 ≤ DW ≤ 4.
If ρ̂ = 0, DW = 2. So roughly speaking, do not reject the null hypothesis if DW is near 2, i.e. there is little evidence of autocorrelation.
Unfortunately, DW has 2 critical values, an upper critical value (du) and a
lower critical value (dL), and there is also an intermediate region where we can
neither reject nor not reject H0.
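The link between DW and ρ̂ is easy to verify numerically. This sketch (plain Python; the AR(1) coefficient 0.7 and the sample size are arbitrary illustrative choices) generates positively autocorrelated residuals and confirms that DW ≈ 2(1 - ρ̂), giving a value well below 2:

```python
import random

random.seed(0)
rho, T = 0.7, 2000
u = [random.gauss(0, 1)]
for _ in range(T - 1):
    u.append(rho * u[-1] + random.gauss(0, 1))      # AR(1) "residuals"

# DW statistic and the sample first-order correlation of the residuals
dw = sum((u[t] - u[t - 1]) ** 2 for t in range(1, T)) / sum(x * x for x in u)
rho_hat = sum(u[t] * u[t - 1] for t in range(1, T)) / sum(x * x for x in u)
print(round(dw, 3), round(2 * (1 - rho_hat), 3))    # the two values are nearly identical
```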
127
k=1
k=2
k=4
k=5
dU
dL
dU
dL
dU
dL
dU
0.81
0.84
0.87
0.90
0.93
0.95
1.07
1.09
1.10
1.12
1.13
1.15
0.70
0.74
0.77
0.80
0.83
0.86
1.25
1.25
1.25
1.26
1.26
1.27
0.59
0.63
0.67
0.71
0.74
0.77
1.46
1.44
1.43
1.42
1.41
1.41
0.49
0.53
0.57
0.61
0.65
0.68
1.70
1.66
1.63
1.60
1.58
1.57
0.39
0.44
0.48
0.52
0.56
0.60
1.96
1.90
1.85
1.80
1.77
1.74
21
22
23
24
25
0.97
1.00
1.02
1.04
1.05
1.16
1.17
1.19
1.20
1.21
0.89
0.91
0.94
0.96
0.98
1.27
1.28
1.29
1.30
1.30
0.80
0.83
0.86
0.88
0.90
1.41
1.40
1.40
1.41
1.41
0.72
0.75
0.77
0.80
0.83
1.55
1.54
1.53
1.53
1.52
0.63
0.66
0.70
0.72
0.75
1.71
1.69
1.67
1.66
1.65
26
27
28
29
30
1.07
1.09
1.10
1.12
1.13
1.22
1.23
1.24
1.25
1.26
1.00
1.02
1.04
1.05
1.07
1.31
1.32
1.32
1.33
1.34
0.93
0.95
0.97
0.99
1.01
1.41
1.41
1.41
1.42
1.42
0.85
0.88
0.90
0.92
0.94
1.52
1.51
1.51
1.51
1.51
0.78
0.81
0.83
0.85
0.88
1.64
1.63
1.62
1.61
1.61
31
32
33
34
35
1.15
1.16
1.17
1.18
1.19
1.27
1.28
1.29
1.30
1.31
1.08
1.10
1.11
1.13
1.14
1.34
1.35
1.36
1.36
1.37
1.02
1.04
1.05
1.07
1.08
1.42
1.43
1.43
1.43
1.44
0.96
0.98
1.00
1.01
1.03
1.51
1.51
1.51
1.51
1.51
0.90
0.92
0.94
0.95
0.97
1.60
1.60
1.59
1.59
1.59
36
37
38
39
40
1.21
1.22
.
1.23
1.24
1.25
1.32
1.32
.3
1.33
1.34
1.34
1.15
1.16
. 6
1.18
1.19
1.20
1.38
1.38
.38
1.39
1.39
1.40
1.10
1.11
.
1.12
1.14
1.15
1.44
1.45
. 5
1.45
1.45
1.46
1.04
1.06
.06
1.07
1.09
1.10
1.51
1.51
.5
1.52
1.52
1.52
0.99
1.00
.00
1.02
1.03
1.05
1.59
1.59
.59
1.58
1.58
1.58
45
50
55
60
65
70
1.29
1.32
1.36
1.38
1.41
1.43
1.38
1.40
1.43
1.45
1.47
1.49
1.24
1.28
1.32
1.35
1.38
1.40
1.42
1.45
1.47
1.48
1.50
1.52
1.20
1.24
1.28
1.32
1.35
1.37
1.48
1.49
1.51
1.52
1.53
1.55
1.16
1.20
1.25
1.28
1.31
1.34
1.53
1.54
1.55
1.56
1.57
1.58
1.11
1.16
1.21
1.25
1.28
1.31
1.58
1.59
1.59
1.60
1.61
1.61
75
1.45
1.50
1.42
1.53
1.39
1.56
1.37
1.59
1.34
80
1.47
1.52
1.44
1.54
1.42
1.57
1.39
1.60
1.36
85
1.48
1.53
1.46
1.55
1.43
1.58
1.41
1.60
1.39
90
1.50
1.54
1.47
1.56
1.45
1.59
1.43
1.61
1.41
95
1.51
1:55
1.49
1.57
1.47
1.60
1.45
1.62
1.42
100
1.52
1.56
1.50
1.58
1.48
1.60
1.46
1.63
1.44
T, number of observations; k, number of explanatory variables (excluding a constant term).
Source: Econometrica, 48, 1554. Reprinted with the permission of the Econometric Society.
1.62
1.62
1.63
1.64
1.64
1.65
dL
dU
k=3
dL
15
16
17
18
19
20
The coefficient estimates derived using OLS are still unbiased, but they are
inefficient, i.e. they are not BLUE, even in large sample sizes.
Thus if the standard error estimates are inappropriate, there exists the
possibility that we could make the wrong inferences.
R² is likely to be inflated relative to its correct value for positively correlated residuals.
Dynamic Models
All of the models we have considered so far have been static, e.g.
yt = β1 + β2x2t + ... + βkxkt + ut
But we can easily extend this analysis to the case where the current value of yt depends on previous values of y or one of the x's, e.g.
yt = β1 + β2x2t + ... + βkxkt + γ1yt-1 + γ2x2t-1 + ... + γkxkt-1 + ut
We could extend the model even further by adding extra lags, e.g. x2t-2 , yt-3 .
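Operationally, fitting such a dynamic model just means lining up each observation with the lagged values of y and x; the first observation(s) are lost in the process. A minimal sketch with made-up numbers:

```python
# Hypothetical short series, purely for illustration
y = [1.0, 1.2, 0.9, 1.1, 1.3, 1.25]
x = [2.0, 2.1, 1.9, 2.2, 2.4, 2.3]

# Each regression row pairs (y_t, x_t) with the one-period lags (y_{t-1}, x_{t-1});
# one observation is lost in constructing the lags.
rows = [(y[t], x[t], y[t - 1], x[t - 1]) for t in range(1, len(y))]
print(len(rows), rows[0])   # → 5 (1.2, 2.1, 1.0, 2.0)
```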
Inclusion of lagged values of the dependent variable violates the assumption that the RHS variables are non-stochastic.
What does an equation with a large number of lags actually mean?
Note that if there is still autocorrelation in the residuals of a model including
lags, then the OLS estimators will not even be consistent.
Multicollinearity
This problem occurs when the explanatory variables are very highly correlated
with each other.
Perfect multicollinearity
Cannot estimate all the coefficients
- e.g. suppose x3t = 2x2t
and the model is yt = β1 + β2x2t + β3x3t + β4x4t + ut
Problems if Near Multicollinearity is Present but Ignored
- R2 will be high but the individual coefficients will have high standard errors.
- The regression becomes very sensitive to small changes in the specification.
- Thus confidence intervals for the parameters will be very wide, and
significance tests might therefore give inappropriate conclusions.
Measuring Multicollinearity

Corr    x2t    x3t    x4t
x2t     -      0.2    0.8
x3t     0.2    -      0.3
x4t     0.8    0.3    -
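A correlation matrix like the one above can be computed directly. The sketch below (plain Python, made-up data) also illustrates the perfect-multicollinearity case x3t = 2x2t, whose correlation with x2t is exactly 1:

```python
import math

def corr(a, b):
    # Sample correlation coefficient between two equal-length series
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

x2 = [1.0, 2.0, 1.5, 3.0, 2.5, 4.0]
x3 = [2 * v for v in x2]              # x3t = 2 x2t: perfect multicollinearity
x4 = [1.1, 1.9, 1.4, 3.2, 2.4, 4.1]  # merely highly correlated with x2t
print(corr(x2, x3), corr(x2, x4))
```

The first correlation is exactly 1 (up to rounding), so the coefficients on x2t and x3t could not be estimated separately.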
Another possibility is to transform the data into logarithms. This will linearise
many previously multiplicative models into additive ones:
yt = A xt^β exp(ut)  ⟹  ln yt = α + β ln xt + ut,  where α = ln A
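The log-linearisation is exact in the absence of the error term, as a quick check confirms (A, β and x are arbitrary illustrative values):

```python
import math

A, beta, x = 2.0, 0.75, 3.0
y = A * x ** beta                        # multiplicative model, error term omitted
lhs = math.log(y)                        # ln y
rhs = math.log(A) + beta * math.log(x)   # alpha + beta * ln x, with alpha = ln A
print(abs(lhs - rhs) < 1e-12)   # → True
```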
[Figure: a normal distribution and a skewed distribution]
[Figure: probability density function plotted over the range -5.4 to 5.4]
Bera and Jarque formalise this by testing whether the coefficient of skewness and the coefficient of excess kurtosis of the residuals are jointly zero.
It can be proved that the coefficients of skewness and kurtosis can be expressed respectively as:
b1 = E[u³] / (σ²)^(3/2)   and   b2 = E[u⁴] / (σ²)²
The Bera-Jarque test statistic is given by
W = T [ b1²/6 + (b2 - 3)²/24 ]  ~  χ²(2)
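The statistic is straightforward to compute from a series of residuals. A sketch in plain Python (the residuals are simulated standard normals, so skewness should be near 0, kurtosis near 3, and W small):

```python
import random

random.seed(1)
u = [random.gauss(0, 1) for _ in range(5000)]
T = len(u)
m = sum(u) / T
var = sum((x - m) ** 2 for x in u) / T

b1 = sum((x - m) ** 3 for x in u) / T / var ** 1.5   # coefficient of skewness
b2 = sum((x - m) ** 4 for x in u) / T / var ** 2     # coefficient of kurtosis
W = T * (b1 ** 2 / 6 + (b2 - 3) ** 2 / 24)           # Bera-Jarque statistic, ~ chi2(2) under H0
print(b1, b2, W)
```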
[Figure: a time series plotted against time, with a large outlier in Oct 1987]
How many dummy variables do we need? We need one less than the
seasonality of the data. e.g. for quarterly series, consider what happens if we
use all 4 dummies
          D1t   D2t   D3t   D4t   Sum
1986 Q1    1     0     0     0     1
     Q2    0     1     0     0     1
     Q3    0     0     1     0     1
     Q4    0     0     0     1     1
1987 Q1    1     0     0     0     1
     Q2    0     1     0     0     1
     Q3    0     0     1     0     1
etc.
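The trap is easy to see in code: the four dummies sum to one in every period, exactly replicating the intercept column. A minimal sketch:

```python
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3"]
names = ("Q1", "Q2", "Q3", "Q4")
dummies = {q: [1 if qt == q else 0 for qt in quarters] for q in names}

# In every period the four dummies sum to exactly 1 - identical to the
# intercept column, so the regressors are perfectly multicollinear.
row_sums = [sum(dummies[q][t] for q in names) for t in range(len(quarters))]
print(row_sums)   # → [1, 1, 1, 1, 1, 1, 1]
```

Dropping one dummy (or the intercept) removes the exact linear dependence.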
[Figure: yt plotted against xt, with a different intercept for each quarter]
              Monday       Tuesday      Wednesday    Thursday     Friday
South Korea   0.49E-3      -0.45E-3     -0.37E-3     0.40E-3      -0.31E-3
              (0.6740)     (-0.3692)    (-0.5005)    (0.5468)     (-0.3998)
Thailand      0.00322      -0.00179     -0.00160     0.00100      0.52E-3
              (3.9804)**   (-1.6834)    (-1.5912)    (1.0379)     (0.5036)
Malaysia      0.00185      -0.00175     0.31E-3      0.00159      0.40E-4
              (2.9304)**   (-2.1258)**  (0.4786)     (2.2886)**   (0.0536)
Taiwan        0.56E-3      0.00104      -0.00264     -0.00159     0.43E-3
              (0.4321)     (0.5955)     (-2.107)**   (-1.2724)    (0.3123)
Philippines   0.00119      -0.97E-4     -0.49E-3     0.92E-3      0.00151
              (1.4369)     (-0.0916)    (-0.5637)    (0.8908)     (1.7123)

Notes: Coefficients are given in each cell followed by t-ratios in parentheses; * and ** denote significance at the 5% and 1% levels respectively. Source: Brooks and Persand (2001).
rt = Σ from i=1 to 5 of (αi Dit + βi Dit RWMt) + ut
where Dit is the i-th dummy variable, taking the value 1 for day t = i and zero otherwise, and RWMt is the return on the world market index.
Now both risk and return are allowed to vary across the days of the week.
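Constructing the regressors for this specification is mechanical: one dummy per day plus one interaction of each dummy with the market return. A sketch with hypothetical market returns:

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Mon", "Tue"]
rwm = [0.010, -0.020, 0.005, 0.000, 0.015, -0.010, 0.020]   # hypothetical RWM_t values
names = ("Mon", "Tue", "Wed", "Thu", "Fri")

X = []
for d, r in zip(days, rwm):
    dummies = [1 if d == name else 0 for name in names]   # D_it
    interactions = [di * r for di in dummies]             # D_it * RWM_t
    X.append(dummies + interactions)
print(X[0])   # Monday row: dummy 1 in the first slot, interaction 0.01 in the sixth
```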
Day-of-the-week effects allowing risk to vary across days (αi and βi with t-ratios in parentheses):

              Monday       Tuesday      Wednesday    Thursday     Friday
Thailand
  αi          0.00322      -0.00114     -0.00164     0.00104      0.31E-4
              (3.3571)**   (-1.1545)    (-1.6926)    (1.0913)     (0.03214)
  βi          0.3573       1.0254       0.6040       0.6662       0.9124
              (2.1987)*    (8.0035)**   (3.7147)**   (3.9313)**   (5.8301)**
Malaysia
  αi          0.00185      -0.00122     0.25E-3      0.00157      -0.3752
              (2.8025)**   (-1.8172)    (0.3711)     (2.3515)*    (-0.5680)
  βi          0.5494       0.9822       0.5753       0.8163       0.8059
              (4.9284)**   (11.2708)**  (5.1870)**   (6.9846)**   (7.4493)**
Taiwan
  αi          0.544E-3     0.00140      -0.00263     -0.00166     -0.13E-3
              (0.3945)     (1.0163)     (-1.9188)    (-1.2116)    (-0.0976)
  βi          0.6330       0.6572       0.3444       0.6055       1.0906
              (2.7464)**   (3.7078)**   (1.4856)     (2.5146)*    (4.9294)**

Notes: Coefficients are given in each cell followed by t-ratios in parentheses; * and ** denote significance at the 5% and 1% levels respectively. Source: Brooks and Persand (2001).
Chow test statistic = { [RSS - (RSS1 + RSS2)] / k } / { (RSS1 + RSS2) / (T - 2k) }  ~  F(k, T - 2k)
where RSS is the residual sum of squares from the whole-sample regression, RSS1 and RSS2 are those from the two sub-sample regressions, T is the total number of observations and k is the number of parameters estimated in each regression.
First sub-sample:  T = 82,  RSS1 = 0.03555
Second sub-sample: T = 62,  RSS2 = 0.00336
Whole sample:      T = 144, RSS = 0.0434
H0: α1 = α2 and β1 = β2
The unrestricted model is the model where this restriction is not imposed.
Substituting the RSS values above into the test statistic formula gives
Test statistic = 7.698
Compare with 5% F(2,140) = 3.06
We reject H0 at the 5% level and say that we reject the restriction that the coefficients are the same in the two periods.
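The whole calculation can be packaged in a few lines (the helper name chow_stat is ours, not from the slides). Because the reported RSS values are rounded, the figure obtained this way can differ slightly from the quoted 7.698, but it comfortably exceeds the F(2,140) critical value of 3.06 either way:

```python
def chow_stat(rss, rss1, rss2, T, k):
    """Chow F-statistic for H0: coefficients equal across the two sub-samples."""
    num = (rss - (rss1 + rss2)) / k
    den = (rss1 + rss2) / (T - 2 * k)
    return num / den

# Values from the example above; k = 2 (intercept and slope)
stat = chow_stat(0.0434, 0.03555, 0.00336, 144, 2)
print(stat > 3.06)   # → True: reject H0 at the 5% level
```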
[Figure: time series plotted over the sample period (observations 1 to 443)]
Our Objective:
To build a statistically adequate empirical model which
- satisfies the assumptions of the CLRM
- is parsimonious
- has the appropriate theoretical interpretation
- has the right shape - i.e.
  - all signs on coefficients are correct
  - all sizes of coefficients are correct
White Noise
[Figure: a simulated white noise process]
Stationary Process
[Figure: a stationary process returning to its long-run value after a shock hits]
[Figure: a random walk and a random walk with drift]
[Figure: a deterministic trend process]
The stationarity or otherwise of a series can strongly influence its behaviour and properties - e.g. persistence of shocks will be infinite for nonstationary series.
Spurious regressions. If two variables are trending over time, a regression of one on the other could have a high R² even if the two are totally unrelated.
If the variables in the regression model are not stationary, then it can be proved that the standard assumptions for asymptotic analysis will not be valid. In other words, the usual t-ratios will not follow a t-distribution, so we cannot validly undertake hypothesis tests about the regression parameters.
Stochastic Non-stationarity
Note that the model (1) could be generalised to the case where yt is an explosive process:
yt = μ + φyt-1 + ut
where φ > 1.
Typically, the explosive case is ignored and we use φ = 1 to characterise the non-stationarity because
φ > 1 does not describe many data series in economics and finance.
φ > 1 has an intuitively unappealing property: shocks to the system are not only persistent through time, they are propagated so that a given shock will have an increasingly large influence.
2. φ = 1  ⟹  yt = y0 + Σ from i=0 of ut-i  as T → ∞
So this is just an infinite sum of past shocks plus some starting value of y0.
3. φ > 1. Now given shocks become more influential as time goes on, since if φ > 1, φ³ > φ² > φ etc.
Δyt = ψyt-1 + ut
so that a test of φ = 1 is equivalent to a test of ψ = 0 (since φ - 1 = ψ).
test statistic = ψ̂ / SE(ψ̂)
The test statistic does not follow the usual t-distribution under the null, since the null is
one of non-stationarity, but rather follows a non-standard distribution. Critical values
are derived from Monte Carlo experiments in, for example, Fuller (1976). Relevant
examples of the distribution are shown in table 4.1 below
ADF Tests (Fuller, 1976): examples of critical values at the 1% level are -3.43 for a regression with a constant and -3.96 for a regression with a constant and trend.
The null hypothesis of a unit root is rejected in favour of the stationary alternative
in each case if the test statistic is more negative than the critical value.
The tests above are only valid if ut is white noise. In particular, ut will be
autocorrelated if there was autocorrelation in the dependent variable of the
regression (yt) which we have not modelled. The solution is to augment
the test using p lags of the dependent variable. The alternative model in
case (i) is now written:
Δyt = ψyt-1 + Σ from i=1 to p of αi Δyt-i + ut
The same critical values from the DF tables are used as before. A problem now arises in determining the optimal number of lags of the dependent variable.
There are 2 ways
- use the frequency of the data to decide
- use information criteria
Autocovariances
So if the process is covariance stationary, all the variances are the same and all the covariances depend on the difference between t1 and t2. The moments
E[(yt - E(yt))(yt-s - E(yt-s))] = γs,   s = 0, ±1, ±2, ...
are known as the covariance function.
The covariances, γs, are known as autocovariances.
However, the autocovariances depend on the units of measurement of yt.
It is thus more convenient to use the autocorrelations, which are the autocovariances normalised by dividing by the variance:
τs = γs / γ0,   s = 0, ±1, ±2, ...
If we plot τs against s = 0, 1, 2, ... then we obtain the autocorrelation function or correlogram.
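Computing a sample correlogram requires only the definitions above. A plain-Python sketch (the function name acf is ours):

```python
def acf(y, max_lag):
    """Sample autocorrelations tau_s = gamma_s / gamma_0 for s = 1..max_lag."""
    T = len(y)
    mean = sum(y) / T
    gamma0 = sum((v - mean) ** 2 for v in y) / T
    taus = []
    for s in range(1, max_lag + 1):
        gamma_s = sum((y[t] - mean) * (y[t - s] - mean) for t in range(s, T)) / T
        taus.append(gamma_s / gamma0)
    return taus

# An alternating series is strongly negatively autocorrelated at lag 1
# and positively autocorrelated at lag 2.
print([round(t, 3) for t in acf([1.0, -1.0] * 20, 2)])   # → [-0.975, 0.95]
```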
Thus the autocorrelation function will be zero apart from a single peak of 1 at s = 0.
τ̂s is approximately distributed N(0, 1/T), where T = sample size.
We can use this to do significance tests for the autocorrelation coefficients by constructing a confidence interval.
For example, a 95% confidence interval would be given by ±1.96 × 1/√T.
If the sample autocorrelation coefficient, τ̂s, falls outside this region for any value of s, then we reject the null hypothesis that the true value of the coefficient at lag s is zero.
We can also test the joint hypothesis that all m of the τk correlation coefficients are simultaneously equal to zero using the Q-statistic developed by Box and Pierce:
Q = T Σ from k=1 to m of τ̂k²
The Ljung-Box variant of the statistic is
Q* = T(T + 2) Σ from k=1 to m of τ̂k² / (T - k)   ~  χ²(m)
This statistic is very useful as a test of linear dependence in time series. It can
also be used on residuals.
An ACF Example
Question:
Suppose that a researcher had estimated the first 5 autocorrelation coefficients using a series of length 100 observations, and found them to be (from 1 to 5): 0.207, -0.013, 0.086, 0.005, -0.022.
Test each of the individual coefficients for significance, and use both the Box-Pierce and Ljung-Box tests to establish whether they are jointly significant.
Solution:
A coefficient would be significant if it lies outside (-0.196, +0.196) at the 5% level, so only the first autocorrelation coefficient is significant.
Q = 5.09 and Q* = 5.26
Compared with a tabulated χ²(5) = 11.1 at the 5% level, so the 5 coefficients are jointly insignificant.
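The Q and Q* values in the solution can be reproduced directly from the formulas:

```python
rhos = [0.207, -0.013, 0.086, 0.005, -0.022]   # sample autocorrelations, lags 1-5
T = 100

Q = T * sum(r ** 2 for r in rhos)   # Box-Pierce
Q_star = T * (T + 2) * sum(r ** 2 / (T - k) for k, r in enumerate(rhos, start=1))   # Ljung-Box
print(round(Q, 2), round(Q_star, 2))   # → 5.09 5.26
```

Both fall well short of the χ²(5) 5% critical value of 11.1, matching the conclusion above.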
Let ut (t = 1, 2, 3, ...) be a sequence of independently and identically distributed (iid) random variables with E(ut) = 0 and Var(ut) = σ², then
yt = μ + ut + θ1ut-1 + θ2ut-2 + ... + θqut-q
is a qth order moving average model, MA(q).
Its properties are
Constant mean
Constant variance
Autocovariances are zero beyond lag q
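For an MA(1), yt = μ + ut + θ1ut-1, these properties can be written in closed form (a standard result; θ1 = 0.5 is an arbitrary illustrative value):

```python
theta = 0.5
gamma0 = 1 + theta ** 2    # var(y_t), in units of sigma^2
gamma1 = theta             # first autocovariance, in units of sigma^2
rho1 = gamma1 / gamma0     # autocorrelation at lag 1
# gamma_s = 0 for every s >= 2: the acf cuts off after lag q = 1
print(rho1)   # → 0.4
```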
ACF Plot
[Figure: a sample autocorrelation function]
Autoregressive Processes
yt = μ + φ1yt-1 + φ2yt-2 + ... + φpyt-p + ut
Or using the lag operator notation:
Lyt = yt-1      L^i yt = yt-i
yt = μ + Σ from i=1 to p of φi yt-i + ut
or
yt = μ + Σ from i=1 to p of φi L^i yt + ut
or
φ(L) yt = μ + ut
where φ(L) = 1 - (φ1L + φ2L² + ... + φpL^p).
States that any stationary series can be decomposed into the sum of two unrelated processes, a purely deterministic part and a purely stochastic part.
For an AR(p), Wold's decomposition theorem essentially means that the model can be written as an MA(∞).
An MA(q) can also be written as an AR(∞).
The pacf is useful for telling the difference between an AR process and an ARMA process.
In the case of an AR(p), there are direct connections between yt and yt-s only for s ≤ p.
So for an AR(p), the theoretical pacf will be zero after lag p.
In the case of an MA(q), this can be written as an AR(∞), so there are direct connections between yt and all its previous values.
For an MA(q), the theoretical pacf will be geometrically declining.
ARMA Processes
yt = μ + φ1yt-1 + φ2yt-2 + ... + φpyt-p + θ1ut-1 + θ2ut-2 + ... + θqut-q + ut
with E(ut) = 0; E(ut²) = σ²; E(ut us) = 0, t ≠ s
[Figures: sample acf and pacf plots for various MA, AR and ARMA processes]
Box and Jenkins (1970) were the first to approach the task of estimating an ARMA model in a systematic manner. There are 3 steps to their approach:
1. Identification
2. Estimation
3. Model diagnostic checking
Step 1:
- Involves determining the order of the model.
- Use of graphical procedures
- A better procedure is now available
Step 2:
- Estimation of the parameters
- Can be done using least squares or maximum likelihood depending
on the model.
Step 3:
- Model checking
Box and Jenkins suggest 2 methods:
- deliberate overfitting
- residual diagnostics
The information criteria vary according to how stiff the penalty term is.
The three most popular criteria are Akaike's (1974) information criterion (AIC), Schwarz's (1978) Bayesian information criterion (SBIC), and the Hannan-Quinn criterion (HQIC).
AIC = ln(σ̂²) + 2k/T
SBIC = ln(σ̂²) + (k/T) ln T
HQIC = ln(σ̂²) + (2k/T) ln(ln(T))
where k = p + q + 1 and T = sample size. So we min. IC s.t. p ≤ p̄, q ≤ q̄.
SBIC embodies a stiffer penalty term than AIC.
Which IC should be preferred if they suggest different model orders?
SBIC is strongly consistent (but inefficient).
AIC is not consistent, and will typically pick bigger models.
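The three penalties are easy to compare side by side (the helper names are ours; σ̂² is set to 1 so that only the penalty terms drive the comparison):

```python
import math

def aic(sigma2, k, T):
    return math.log(sigma2) + 2 * k / T

def sbic(sigma2, k, T):
    return math.log(sigma2) + k * math.log(T) / T

def hqic(sigma2, k, T):
    return math.log(sigma2) + 2 * k * math.log(math.log(T)) / T

# With sigma2 = 1 only the penalties remain: for T = 100 and k = 2,
# AIC's penalty is the lightest and SBIC's the stiffest.
vals = aic(1.0, 2, 100), hqic(1.0, 2, 100), sbic(1.0, 2, 100)
print(vals[0] < vals[1] < vals[2])   # → True
```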
ARIMA Models
Estimated model for xt (standard errors in parentheses):
x̂t = 0.0197 + 0.927 xt-1 - 0.079 xt-2
      (0.0101)  (0.084)     (0.084)
t-ratios: 1.950, 11.036, -0.940
Sample autocorrelation and partial autocorrelation coefficients (τk and τkk), lags 1-12:
0.148, -0.061, 0.117, 0.067, -0.082, 0.013, 0.041, -0.011, 0.087, 0.021, -0.008, 0.026
0.148, 0.085, 0.143, 0.020, -0.079, 0.034, 0.008, 0.002, 0.102, -0.030, 0.012, 0.010
[Table: AIC and SBIC values for candidate ARMA(p,q) models, p = 0, ..., 3 and q = 0, ..., 3; the order minimising each criterion is selected]
[Estimation output: two fitted models for xt, reported with standard errors and t-ratios, with Ljung-Box diagnostics LB-Q*(12) = 6.07 and LB-Q*(12) = 15.30 respectively]
There are numerous examples of instances where this may arise, for example where we want to model:
Why firms choose to list their shares on the NASDAQ rather than the NYSE
Why some stocks pay dividends while others do not
What factors affect whether countries default on their sovereign debt
Why some firms choose to issue new stock to finance an expansion while others issue bonds
Why some firms choose to engage in stock splits while others do not.
It is fairly easy to see in all these cases that the appropriate form for the
dependent variable would be a 0-1 dummy variable since there are only two
possible outcomes. There are, of course, also situations where it would be
more useful to allow the dependent variable to take on other values, but these
will be considered later.
The slope estimates for the linear probability model can be interpreted as the change in the probability that the dependent variable will equal 1 for a one-unit change in a given explanatory variable, holding the effect of all other explanatory variables fixed.
Suppose, for example, that we wanted to model the probability that a firm i
will pay a dividend p(yi = 1) as a function of its market capitalisation (x2i,
measured in millions of US dollars), and we fit the following line:
For any firm whose value is less than $25m, the model-predicted probability
of dividend payment is negative, while for any firm worth more than $88m,
the probability is greater than one.
However, there are at least two reasons why simply truncating the fitted probabilities at zero and one is still not adequate.
The process of truncation will result in too many observations for which the estimated probabilities are exactly zero or one.
More fundamentally, is it plausible that the true probability is exactly zero or one? Probably not, and so a different kind of model is usually used for binary dependent variables - either a logit or a probit specification.
The LPM also suffers from a couple of more standard econometric problems that we have examined in previous chapters.
Since the dependent variable only takes one of two values, for given (fixed in repeated samples) values of the explanatory variables, the disturbance term will also only take on one of two values.
Hence the error term cannot plausibly be assumed to be normally distributed.
The logit model is so-called because it uses the cumulative logistic distribution to transform the model so that the probabilities follow the S-shape given on the previous slide.
With the logistic model, 0 and 1 are asymptotes to the function and thus the
probabilities will never actually fall to exactly zero or rise to one, although
they may come infinitesimally close.
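A quick sketch of the cumulative logistic function makes the asymptote point concrete: it maps any value on the real line into the open interval (0, 1), approaching but never reaching either endpoint:

```python
import math

def logistic(z):
    """Cumulative logistic function F(z) = 1 / (1 + exp(-z))."""
    return 1 / (1 + math.exp(-z))

# F(0) = 0.5; even at z = +/-20 the probabilities only approach 0 and 1
print(logistic(0), logistic(-20), logistic(20))
```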
The theory of firm financing suggests that corporations should use the cheapest methods of financing their activities first (i.e. the sources of funds that require payment of the lowest rates of return to investors) and then only switch to more expensive methods when the cheaper sources have been exhausted.
This is known as the pecking order hypothesis.
Differences in the relative cost of the various sources of funds are argued to arise largely from information asymmetries, since the firm's senior managers will know the true riskiness of the business, whereas potential outside investors will not.
Hence, all else equal, firms will prefer internal finance and then, if further
(external) funding is necessary, the firm's riskiness will determine the type of
funding sought.
Data
Helwege and Liang (1996) examine the pecking order hypothesis in the context of a set of US firms that had been newly listed on the stock market in 1983, with their additional funding decisions being tracked over the 1984-1992 period.
Such newly listed firms are argued to experience higher rates of growth, and are more likely to require additional external funding than firms which have been stock market listed for many years.
They are also more likely to exhibit information asymmetries due to their lack of a track record.
The list of initial public offerings (IPOs) was obtained from the Securities
Data Corporation and the Securities and Exchange Commission with data
obtained from Compustat.
A core objective of the paper is to determine the factors that affect the probability of raising external financing.
As such, the dependent variable will be binary - that is, a column of 1's (firm raises funds externally) and 0's (firm does not raise any external funds).
Thus OLS would not be appropriate and hence a logit model is used.
The explanatory variables are a set that aims to capture the relative degree of information asymmetry and degree of riskiness of the firm.
If the pecking order hypothesis is supported by the data, then firms should be
more likely to raise external funding the less internal cash they hold.
Analysis of Results
The key variable, deficit, has a parameter that is not statistically significant, and hence the probability of obtaining external financing does not depend on the size of a firm's cash deficit.
Or an alternative explanation, as with a similar result in the context of a
standard regression model, is that the probability varies widely across firms
with the size of the cash deficit so that the standard errors are large relative to
the point estimate.
The parameter on the surplus variable has the correct negative sign,
indicating that the larger a firm's surplus, the less likely it is to seek external
financing, which provides some limited support for the pecking order
hypothesis.
Larger firms (with larger total assets) are more likely to use the capital
markets, as are firms that have already obtained external financing during the
previous year.
Instead of using the cumulative logistic function to transform the model, the cumulative normal distribution is sometimes used.
Logit or Probit?
For the majority of the applications, the logit and probit models will give very similar characterisations of the data because the densities are very similar.
That is, the fitted regression plots will be virtually indistinguishable, and the implied relationships between the explanatory variables and the probability that yi = 1 will also be very similar.
Both approaches are much preferred to the linear probability model. The only instance where the models may give non-negligibly different results occurs when the split of the yi between 0 and 1 is very unbalanced - for example, when yi = 1 occurs only 10% of the time.
Stock and Watson (2006) suggest that the logistic approach was traditionally
preferred since the function does not require the evaluation of an integral and
thus the model parameters could be estimated faster.
However, this argument is no longer relevant given the computational speeds
now achievable and the choice of one specification rather than the other is now
usually arbitrary.
To obtain the required relationship between changes in x2i and Pi, we would need to differentiate F with respect to x2i, and it turns out that this derivative is β2 f(β1 + β2 x2i), where f is the corresponding density function.
So in fact, a 1-unit increase in x2i will cause a β2 f(β1 + β2 x2i) increase in probability.
Usually, these impacts of incremental changes in an explanatory variable are
evaluated by setting each of them to their mean values.
These estimates are sometimes known as the marginal effects.
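For the logit model this marginal effect works out to β2 Λ(z)(1 - Λ(z)), where Λ is the cumulative logistic function. The sketch below (illustrative parameter values, chosen by us) checks the analytic expression against a numerical derivative:

```python
import math

def logistic(z):
    return 1 / (1 + math.exp(-z))

b1, b2, x = -0.5, 0.8, 1.2      # hypothetical intercept, slope and evaluation point
z = b1 + b2 * x

analytic = b2 * logistic(z) * (1 - logistic(z))   # logit marginal effect
h = 1e-6                                          # central finite difference
numeric = (logistic(b1 + b2 * (x + h)) - logistic(b1 + b2 * (x - h))) / (2 * h)
print(abs(analytic - numeric) < 1e-6)   # → True
```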
There is also another way of interpreting discrete choice models known as the
random utility model.
The idea is that we can view the value of y that is chosen by individual i
(either 0 or 1) as giving that person a particular level of utility, and the choice
that is made will obviously be the one that generates the highest level of
utility.
This interpretation is particularly useful in the situation where the person faces a choice between more than 2 possibilities - see a later slide.
The End!
City University
Financial Econometrics
Exercise 1: Simple Linear Regression and Hypothesis Testing
1. What are the five assumptions usually made about the unobservable disturbance
terms? Briefly explain the meaning of each. Why do we need to make these
assumptions?
2. Which of the following models can be estimated (following a suitable
rearrangement) using ordinary least squares?
(Hint: the models need to be linear in the parameters).
where x, y, z are variables and α, β, γ are parameters to be estimated.
1. yt = α + βxt + ut
2. yt = e^α xt^β e^(ut)
3. yt = α + βγxt + ut
4. ln(yt) = α + β ln(xt) + ut
5. yt = α + βxtzt + ut
3. The capital asset pricing model (CAPM) can be written as
E(Ri) = Rf + β[E(Rm) - Rf]                (1)
where Rit is the return for security i at time t, Rmt is the return on a proxy for the market portfolio at time t, and ut is an iid random disturbance term.
The coefficient beta in this case is also the CAPM beta for security i.
Suppose that you had estimated equation (1) and found that the estimated value of beta, β̂, was 1.147. The standard error associated with this coefficient, SE(β̂), is estimated to be 0.0548.
A City Analyst has told you that this security closely follows the market, but that it
is no more risky, on average, than the market. This can be tested by the null
hypothesis that the value of beta is one. The model is estimated over 62 daily
observations. Test this hypothesis against a one-sided alternative that the security is more risky than the market, at the 5% level. Write down the null and alternative hypotheses. What do you conclude? Are the Analyst's claims empirically verified?
4. The Analyst also tells you that shares in Chris Mining PLC have no systematic
risk, in other words the returns on its shares are completely un-related to
movements in the market. The value of beta and its standard error are calculated to
be 0.214 and 0.186 respectively. The model is estimated over 38 quarterly
observations. Write down the null and alternative hypothesis. Test this null
hypothesis against a two-sided alternative.
5. Form and interpret a 95% and a 99% confidence interval for beta using the
figures given in question 4.
CITY UNIVERSITY
Financial Econometrics
Solutions to Exercise 1: Simple Linear Regression and Hypothesis
Testing
1. A list of the assumptions of the classical linear regression model's disturbance terms is given in the lecture notes handout.
We need to make the first four assumptions in order to prove that the ordinary least squares estimators of α and β are "best", that is to prove that they have minimum
that OLS estimators are BLUE (provided the assumptions are fulfilled) is known as
the Gauss-Markov theorem. If these assumptions are violated (which we will look
at in detail later in the course), then it may be that OLS estimators are no longer
unbiased or efficient. That is, they may be inaccurate or subject to fluctuations
between samples.
We needed to make the fifth assumption, that the disturbances are normally
distributed, in order to make statistical inferences about the population parameters
from the sample data, i.e. to test hypotheses about the coefficients. Making this
assumption (provided that the other assumptions also hold) implies that test
statistics will follow a t-distribution.
2. If the models are linear in the parameters, we can use OLS.
(1) Yes, can use OLS since the model is the usual linear model we have been
dealing with.
(2) Yes. The model can be linearised by taking logarithms of both sides and by
rearranging. Although this is a very specific case, it has sound theoretical
foundations (e.g. the Cobb-Douglas production function in economics), and it is the
case that many relationships can be approximately linearised by taking logs of the
variables. The effect of taking logs is to reduce the effect of extreme values on the
regression function, and it may be possible to turn multiplicative models into
additive ones which we can easily estimate.
(3) Yes and no! We can estimate this model using OLS, but we would not be able to obtain the values of both β and γ separately; we would obtain only the value of these two coefficients multiplied together.
(4) Yes, we can use OLS, since this model is linear in the logarithms. For those who have done some economics, models of this kind which are linear in the logarithms have the interesting property that the coefficients (α and β) can be interpreted as elasticities.
(5) Yes, in fact we can still use OLS since it is linear in the parameters. If we make a substitution, say qt = xtzt, then we can run the regression:
yt = α + βqt + ut as usual.
So, in fact, we can estimate a fairly wide range of model types using these simple
tools.
3. The null hypothesis is that the true (but unknown) value of beta is equal to one, against a one sided alternative that it is greater than one:
H0: β = 1
H1: β > 1
The test statistic is given by
test stat = (β̂ - β*) / SE(β̂) = (1.147 - 1) / 0.0548 = 2.682
We want to compare this with a value from the t-table with T-2 degrees of freedom,
where T is the sample size and T-2 =60. We want a value with 5% all in one tail
since we are doing a 1-sided test. The critical t-value from the t-table is 1.671:
[Figure: t-distribution with the 5% rejection region in the upper tail, beyond +1.671]
The value of the test statistic is in the rejection region and hence we can reject the
null hypothesis. We have statistically significant evidence that this security has a
beta greater than one, i.e. it is significantly more risky than the market as a whole.
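The arithmetic of this test is compact enough to script:

```python
beta_hat, se_beta = 1.147, 0.0548
t_stat = (beta_hat - 1.0) / se_beta   # test of H0: beta = 1
t_crit = 1.671                        # 5% one-sided critical value, 60 degrees of freedom
print(round(t_stat, 3), t_stat > t_crit)   # → 2.682 True
```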
4. We want to use a two-sided test to test the null hypothesis that shares in Chris
Mining are completely un-related to movements in the market as a whole. In other
words, the value of beta in the regression model would be zero so that whatever
happens to the value of the market proxy, Chris Mining would be completely
unaffected by it.
The null and alternative hypotheses are therefore:
H0 : β = 0
H1 : β ≠ 0
The test statistic has the same format as before, and is given by:
test stat = (β̂ − β*) / SE(β̂) = (0.214 − 0) / 0.186 = 1.150
We want to find a value from the t-tables for a variable with 38-2=36 degrees of
freedom, and we want to look up the value that puts 2.5% of the distribution in each
tail since we are doing a two-sided test and we want to have a 5% size of test over
all. The critical t-value is 2.03:
[Figure: t-distribution with 2.5% rejection regions in each tail, beyond −2.03 and +2.03]
Since the test statistic is not within the rejection region, we do not reject the null
hypothesis. We therefore conclude that we have no statistically significant evidence
that Chris Mining has any systematic risk. In other words, we have no evidence that
changes in the company's value are driven by movements in the market.
5. A confidence interval for beta is given by the formula:
(β̂ − SE(β̂) × tcrit , β̂ + SE(β̂) × tcrit)
Confidence intervals are almost invariably 2-sided, unless we are told otherwise
(which we are not here), so we want to look up the values which put 2.5% in each
tail (for the 95% confidence interval) and 0.5% in each tail (for the 99%
confidence interval).
For a t-distribution with T−2 = 38−2 = 36 degrees of freedom, the 2.5% and 0.5%
critical values are 2.03 and 2.72 respectively.
[Figure: t-distribution with 0.5% rejection regions in each tail, beyond −2.72 and +2.72]
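The intervals themselves can then be computed from the question 4 estimates (β̂ = 0.214, SE = 0.186) and the tabulated critical values 2.03 and 2.72; a short sketch:

```python
# Confidence intervals: (beta_hat - SE*t_crit, beta_hat + SE*t_crit)
beta_hat, se = 0.214, 0.186          # estimates from question 4
for level, t_crit in [("95%", 2.03), ("99%", 2.72)]:
    lower = beta_hat - se * t_crit
    upper = beta_hat + se * t_crit
    # 95%: roughly (-0.164, 0.592); 99%: roughly (-0.292, 0.720).
    # Both intervals contain zero, consistent with the t-test in question 4.
    print(level, round(lower, 3), round(upper, 3))
```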
City University
Financial Econometrics
Exercise 2: Multiple Linear Regression F-tests and Goodness of Fit
Statistics
1. By using examples from the relevant statistical tables, explain the relationship
between the t and the F-distributions.
For questions 2-7 assume that our econometric model is of the form
yt = β1 + β2x2t + β3x3t + β4x4t + β5x5t + ut        (1)
2. Do we test hypotheses about the actual values of the coefficients (i.e. the β's) or
their estimated values (i.e. the β̂'s) and why?
3. Which of the following hypotheses about the coefficients can be tested using a
t-test? Which of them can be tested using an F-test? In each case, state the number
of restrictions.
i) H0 : β2 = 2
ii) H0 : β2 + β3 = 1
iii) H0 : β2 + β3 = 1 and β4 = 1
iv) H0 : β2 = 0 and β3 = 0 and β4 = 0 and β5 = 0
v) H0 : β2 β3 = 1
4. Which of the above null hypotheses constitutes the regression F-statistic? Why
are we always interested in this null hypothesis whatever the regression relationship
under study? What exactly would constitute the alternative hypothesis in this case?
5. Which would we expect to be bigger - the unrestricted residual sum of squares or
the restricted residual sum of squares and why?
6. You decide to investigate the relationship given in the null hypothesis of
question 3 part (iii). What would constitute the restricted regression? The
regressions are carried out on a sample of 96 quarterly observations, and the
residual sum of squares for the restricted and unrestricted regressions are 102.87
and 91.41 respectively. Perform the test. What is your conclusion?
7. You estimate a regression of the form given by the equation below in order to
evaluate the effect of various firm-specific factors on the firm's return series. You
run a cross-sectional regression with 200 firms:
ri = β1 + β2Si + β3MBi + β4PEi + β5BETAi + ui
where ri is the percentage annual return for the stock
Si is the size of firm i measured in terms of sales revenue
MBi is the market to book ratio of the firm
PEi is the price-earnings ratio of the firm
BETAi is the stock's CAPM beta coefficient
You obtain the following results (with standard errors in parentheses):
Solutions to Exercise 2: Multiple Linear Regression F-tests and
Goodness of Fit Statistics
1. It can be proved that a t-distribution is just a special case of the more general
F-distribution. The square of a t-distribution with T−k degrees of freedom will be
identical to an F-distribution with (1, T−k) degrees of freedom. But remember that
if we use a 5% size of test, we will look up a 5% value for the F-distribution even
though we only look in one tail of that distribution, because the F-test corresponds
to a 2-sided test; for the t-distribution we look up a 2.5% value, since the test is
2-tailed.
Examples at the 5% level from tables:

T−k    F critical value    t critical value
20     4.35                2.09
40     4.08                2.02
60     4.00                2.00
120    3.92                1.98
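The relationship can be checked directly by squaring the t critical values (the small discrepancy at T−k = 20, 2.09² = 4.37 against 4.35, is just rounding in the printed tables):

```python
# For each T-k, the square of the two-tailed 5% t critical value should match
# (up to table rounding) the 5% F(1, T-k) critical value.
table = {20: (4.35, 2.09), 40: (4.08, 2.02), 60: (4.00, 2.00), 120: (3.92, 1.98)}
for dof, (f_crit, t_crit) in table.items():
    print(dof, round(t_crit ** 2, 2), f_crit)
```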
2. We test hypotheses about the actual coefficients, not the estimated values. We
want to make inferences about the likely values of the population parameters (i.e. to
test hypotheses about them). We do not need to test hypotheses about the estimated
values since we know exactly what our estimates are because we calculated them!
3. i) H0 : β2 = 2
We could use an F- or a t- test for this one since it is a single hypothesis involving
only one coefficient. We would probably in practice use a t-test since it is
computationally simpler and we only have to estimate one regression. There is one
restriction.
ii) H0 : β2 + β3 = 1
Since this involves more than one coefficient, we should use an F-test. There is one
restriction.
iii) H0 : β2 + β3 = 1 and β4 = 1
Since we are testing more than one hypothesis simultaneously, we would use an
F-test. There are 2 restrictions.
iv) H0 : β2 = 0 and β3 = 0 and β4 = 0 and β5 = 0
As for iii), we are testing multiple hypotheses so we cannot use a t-test. We have 4
restrictions.
v) H0 : β2 β3 = 1
Although there is only one restriction, it is a multiplicative restriction. We therefore
cannot use a t-test or an F-test to test it. In fact we cannot test it at all using the
methodology that we have learned.
4. THE regression F-statistic would be given by the test statistic associated with
hypothesis iv) above. We are always interested in testing this hypothesis since it
tests whether all of the coefficients in the regression (except the constant) are
jointly zero, i.e. whether the regression as a whole has any explanatory power. The
alternative hypothesis is that at least one of these coefficients is not zero.
reject the null hypothesis that the restrictions are valid. We cannot impose these
restrictions on the data without a substantial increase in the residual sum of squares.
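The F-statistic behind this conclusion can be reconstructed from the figures given in question 6 (a sketch; the critical value quoted in the comment is approximate):

```python
# F-test of the two restrictions in question 3(iii):
# F = ((RRSS - URSS)/m) / (URSS/(T - k))
rrss = 102.87        # restricted residual sum of squares
urss = 91.41         # unrestricted residual sum of squares
T, k, m = 96, 5, 2   # observations, parameters in unrestricted model, restrictions

f_stat = ((rrss - urss) / m) / (urss / (T - k))
print(round(f_stat, 3))   # 5.704, well above the 5% F(2, 91) critical value of roughly 3.1
```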
7.
The t-ratios are given in the final row above, and are in italics. They are calculated
by dividing the coefficient estimate by its standard error.
The relevant value from the t-tables is for a 2-sided test with 5% rejection overall.
T-k = 195; tcrit = 1.97. The null hypothesis is rejected at the 5% level if the absolute
value of the test statistic is greater than the critical value.
We would conclude based on this evidence that only firm size and market to book
value have a significant effect on stock returns.
If a stock's beta increases from 1 to 1.2, then we would expect the return on the
stock to FALL by (1.2 − 1) × 0.084 = 0.0168 = 1.68%.
This is not the sign we would have expected on beta: beta should be positively
related to return, since investors would require higher returns as compensation for
bearing higher market risk.
We would thus consider deleting the price/earnings and beta variables from the
regression since these are not significant in the regression - i.e. they are not helping
much to explain variations in y. We would not delete the constant term from the
regression even though it is insignificant since there are good statistical reasons for
its inclusion.
8. We need the concept of a parsimonious model - one that describes the most
important features of the data but using as few parameters as possible. We do want
to form a model that fits the data as well as possible, but in most financial series,
there is a substantial amount of noise. This can be interpreted as a random event
that is unlikely to be repeated in any forecastable way. We want to fit a model to
the data that will be able to generalise. In other words, we want a model that fits
to features of the data that will be replicated in future; we do not want to fit to
sample-specific noise.
This is why we need the concept of parsimony - fitting the smallest possible
model to the data. Otherwise we may get a great fit to the data in sample, but any
use of the model for forecasts could yield terrible results.
Another important point is that the larger the number of estimated parameters (i.e.
the more variables we have), then the smaller will be the number of degrees of
freedom, and this will imply that standard errors will be larger than they would
otherwise have been. This could lead to a loss of power in hypothesis tests, so that
variables which would otherwise have been significant now appear insignificant.
Exercise 3: Goodness of Fit Statistics and the Assumptions of the
CLRM
1. A researcher estimates the following econometric models including a lagged
dependent variable:
yt = β1 + β2x2t + β3x3t + β4yt-1 + ut
Δyt = γ1 + γ2x2t + γ3x3t + γ4yt-1 + vt
where Δyt = yt − yt-1, and ut and vt are iid disturbances.
Will these models have the same value of
(i) the residual sum of squares (RSS)
(ii) R2
(iii)Adjusted R2 ?
2. A researcher estimates the following two econometric models
yt = β1 + β2x2t + β3x3t + ut        (1)
Solutions to Exercise 3: Goodness of Fit Statistics and the
Assumptions of the CLRM
1.
yt = β1 + β2x2t + β3x3t + β4yt-1 + ut
Δyt = γ1 + γ2x2t + γ3x3t + γ4yt-1 + vt
Note that we have not changed anything substantial between these models in the
sense that the second model is just a re-parameterisation (rearrangement) of the
first, where we have subtracted yt-1 from both sides of the equation.
(i) Remember that the residual sum of squares is the sum of each of the squared
residuals. So let's consider what the residuals will be in each case. For the first
model in the level of y:
ût = yt − ŷt = yt − β̂1 − β̂2x2t − β̂3x3t − β̂4yt-1
Now for the second model, the dependent variable is the change in y:
v̂t = Δyt − Δŷt = Δyt − γ̂1 − γ̂2x2t − γ̂3x3t − γ̂4yt-1
where ŷ (or Δŷ) is the fitted value in each case (note that we do not need at this
stage to assume they are the same).
Rearranging this second model would give
v̂t = yt − yt-1 − γ̂1 − γ̂2x2t − γ̂3x3t − γ̂4yt-1
   = yt − γ̂1 − γ̂2x2t − γ̂3x3t − (γ̂4 + 1)yt-1
If we compare this formulation with the one we calculated for the first model, we
can see that the residuals are exactly the same for the two models, with β̂4 = γ̂4 + 1.
Hence if the residuals are the same, the residual sum of squares must also be the
same. In fact the two models are really identical, since one is just a rearrangement
of the other.
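This equivalence can also be verified numerically: the sketch below uses arbitrary made-up data and coefficient values, and checks that the levels model and its reparameterisation in differences give identical residuals whenever the coefficients on the lagged level differ by exactly one:

```python
# Check that the levels model  y_t = b1 + b2*x2 + b3*x3 + b4*y_{t-1} + u_t
# and the difference model     dy_t = g1 + g2*x2 + g3*x3 + g4*y_{t-1} + v_t
# give identical residuals when b4 = g4 + 1 (toy numbers, chosen arbitrarily).
y  = [1.0, 1.3, 0.9, 1.4, 1.2]
x2 = [0.5, 0.2, 0.8, 0.1, 0.6]
x3 = [1.1, 0.9, 1.0, 1.2, 0.8]
b1, b2, b3, b4 = 0.2, 0.5, -0.3, 0.7   # coefficients of the levels model
g4 = b4 - 1.0                          # gamma4 = beta4 - 1 (intercept/slopes unchanged)

for t in range(1, len(y)):
    u = y[t] - (b1 + b2 * x2[t] + b3 * x3[t] + b4 * y[t - 1])               # levels residual
    v = (y[t] - y[t - 1]) - (b1 + b2 * x2[t] + b3 * x3[t] + g4 * y[t - 1])  # difference residual
    assert abs(u - v) < 1e-12
print("residuals identical")
```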
(ii) R2 is defined as
R2 = 1 − RSS / Σ(yi − ȳ)2
Although the RSS is unchanged, the dependent variable is now Δy rather than y in
the second case. Therefore, since the total sum of squares Σ(yi − ȳ)2 (the
denominator) has changed, the value of R2 must have also changed as a
consequence of changing the dependent variable.
(iii) By the same logic, since the value of the adjusted R2 is just an algebraic
modification of R2 itself, the value of the adjusted R2 must also change.
4. We would like to see no pattern in the residual plot! If there is a pattern in the
residual plot, this is an indication that there is still some action or variability left
in yt that has not been explained by our model. This indicates that potentially it may
be possible to form a better model, perhaps using additional or completely different
explanatory variables, or by using lags of either the dependent or of one or more of
the explanatory variables. Recall the two plots shown in the lectures: where the
residuals followed a cyclical pattern, this indicated positive autocorrelation, and
where they followed an alternating pattern, negative autocorrelation.
Another problem if there is a pattern in the residuals is that, if it does indicate the
presence of autocorrelation, then this may suggest that our standard error estimates
for the coefficients could be wrong and hence any inferences we make about the
coefficients could be misleading.
Exercise 4: Assumptions of the CLRM 2
1. A researcher estimates the following model for stock market returns, but thinks
that there may be a problem with it. By calculating the t-ratios, and considering
their significance or otherwise, suggest what the problem might be.
ŷt = 0.638 + 0.402 x2t − 0.891 x3t
     (0.436)  (0.291)   (0.763)
R2 = 0.96, adjusted R2 = 0.89
How might you go about solving the perceived problem?
2.
(i) State in algebraic notation and explain the assumption about the CLRM's
error terms that is referred to by the term homoscedasticity.
(ii) What would the consequence be for a regression model if the errors are
not homoscedastic?
(iii) How might you proceed if you found that (ii) was actually the case?
3.
4. Calculate the long run static equilibrium solution to the following dynamic
econometric model:
Δyt = β1 + β2Δx2t + β3Δx3t + β4yt-1 + β5x2t-1 + β6x3t-1 + β7x3t-4 + ut
Solutions to Exercise 4: Assumptions of the CLRM 2
1. The t-ratios for the coefficients in this model are given in the third row below
after the standard errors. They are calculated by dividing the individual coefficients
by their standard errors.
ŷt = 0.638 + 0.402 x2t − 0.891 x3t
     (0.436)  (0.291)   (0.763)
t-ratio: 1.46   1.38   −1.17
R2 = 0.96, adjusted R2 = 0.89
The problem appears to be that the regression parameters are all individually
insignificant (i.e. not significantly different from zero), although the value of R2
and its adjusted version are both very high, so that the regression taken as a whole
seems to indicate a good fit. This looks like a classic example of what we term near
multicollinearity. This is where the individual regressors are very closely related, so
that it becomes difficult to disentangle the effect of each individual variable upon
the dependent variable.
The solution to near multicollinearity that is usually suggested is that since the
problem is really one of insufficient information in the sample to determine each of
the coefficients, then one should go out and get more data. In other words, we
should switch to a higher frequency of data for analysis (e.g. weekly instead of
monthly, monthly instead of quarterly etc.). An alternative is also to get more data
by using a longer sample period (i.e. one going further back in time), or to combine
the two independent variables in a ratio (e.g. x2t / x3t ).
Other, more ad hoc methods for dealing with the possible existence of near
multicollinearity were discussed in the lectures:
- Ignore it: if the model is otherwise adequate, i.e. statistically and in terms of
each coefficient being of a plausible magnitude and having an appropriate sign.
Sometimes, the existence of multicollinearity does not reduce the t-ratios on
variables that would have been significant without the multicollinearity
sufficiently to make them insignificant. It is worth stating that the presence of
near multicollinearity does not affect the BLUE properties of the OLS estimator
i.e. it will still be consistent, unbiased and efficient since the presence of near
multicollinearity does not violate any of the CLRM assumptions 1-4. However,
in the presence of near multicollinearity, it will be hard to obtain small standard
errors. This will not matter if the aim of the model-building exercise is to
produce forecasts from the estimated model, since the forecasts will be
unaffected by the presence of near multicollinearity so long as this relationship
between the explanatory variables continues to hold over the forecasted sample.
- Drop one of the collinear variables - so that the problem disappears. However,
this may be unacceptable to the researcher if there were strong a priori
theoretical reasons for including both variables in the model. Also, if the
removed variable was relevant in the data generating process for y, an omitted
variable bias would result.
- Transform the highly correlated variables into a ratio and include only the ratio
and not the individual variables in the regression. Again, this may be
unacceptable if financial theory suggests that changes in the dependent variable
should occur following changes in the individual explanatory variables, and not
a ratio of them.
2.
(i) The assumption of homoscedasticity is that the variance of the errors is
constant and finite over time. Technically, we write Var(ut) = σu² < ∞.
(ii) The coefficient estimates would still be the correct ones (assuming
that the other assumptions for OLS optimality are not violated), but the problem
would be that the standard errors could be wrong. Hence if we were trying to test
hypotheses about the true parameter values, we could end up drawing the wrong
conclusions. In fact, for all of the variables except the constant, the standard errors
would typically be too small, so that we would end up rejecting the null hypothesis
too many times.
(iii) There are a number of ways to proceed in practice, including
- Using heteroscedasticity robust standard errors which correct for the problem by
enlarging the standard errors relative to what they would have been for the situation
where the error variance is positively related to one of the explanatory variables.
- Transforming the data into logs, which has the effect of reducing the effect of
large errors relative to small ones.
3.
(i) This is where there is a relationship between the ith and jth residuals.
Recall that one of the assumptions of the CLRM was that such a relationship did
not exist. We want our residuals to be random, and if there is evidence of
autocorrelation in the residuals, then it implies that we could predict the sign of the
next residual and get the right answer more than half the time on average!
(ii) The Durbin Watson test is a test for first order autocorrelation. The test
is calculated as follows. You would run whatever regression you were interested in,
and obtain the residuals. Then calculate the statistic
DW = Σt=2..T (ût − ût-1)² / Σt=2..T ût²
You would then need to look up the two critical values from the Durbin-Watson
tables, and these would depend on the number of observations and the number of
regressors (excluding the constant this time) in the model.
The rejection / non-rejection rule would be given by selecting the appropriate
region from the following diagram:
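The DW statistic itself is easily computed directly from a set of residuals; a minimal sketch (the two toy residual series are made up to illustrate the extremes):

```python
# Durbin-Watson statistic: DW = sum_{t=2..T} (u_t - u_{t-1})^2 / sum_{t=2..T} u_t^2
def durbin_watson(resids):
    num = sum((resids[t] - resids[t - 1]) ** 2 for t in range(1, len(resids)))
    den = sum(u ** 2 for u in resids[1:])
    return num / den

# Perfectly alternating residuals (negative autocorrelation) push DW towards 4;
# a smooth, persistent pattern (positive autocorrelation) pushes it towards 0.
print(round(durbin_watson([1, -1, 1, -1, 1, -1]), 2))   # 4.0
print(round(durbin_watson([1, 1, 1, -1, -1, -1]), 2))   # 0.8
```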
4. Setting the difference terms (Δyt, Δx2t, Δx3t) to zero and removing the time
subscripts:
0 = β1 + β4 y + β5 x2 + (β6 + β7) x3
−β4 y = β1 + β5 x2 + (β6 + β7) x3
y = −β1/β4 − (β5/β4) x2 − ((β6 + β7)/β4) x3
The last equation above is the long run solution.
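As a numerical check (with arbitrary illustrative coefficient values, since the question gives no numbers), iterating the dynamic equation with the x variables held fixed converges to the same value as the long-run formula:

```python
# Verify the long-run solution y = -b1/b4 - (b5/b4)x2 - ((b6+b7)/b4)x3
# by iterating the dynamic equation to its equilibrium.
b1, b4, b5, b6, b7 = 0.3, -0.5, 0.2, 0.1, 0.05   # b4 in (-2, 0) so iteration converges
x2, x3 = 1.0, 2.0                                # x variables held fixed

y = 0.0
for _ in range(200):
    dy = b1 + b4 * y + b5 * x2 + b6 * x3 + b7 * x3   # dx terms are zero in equilibrium
    y += dy

long_run = -b1 / b4 - (b5 / b4) * x2 - ((b6 + b7) / b4) * x3
print(round(y, 6), round(long_run, 6))   # both equal 1.6
```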
Exercise 5: Assumptions of the CLRM and Structural Stability
1. What might we use Ramsey's RESET test for? What could we do if we find that
we have failed the RESET test?
2.
(i) Why do we need to assume that the disturbances of a regression model
are normally distributed?
(ii) In a practical econometric modelling situation, how might we get around
the problem of residuals that are not normal?
3. A researcher is attempting to form an econometric model to explain daily
movements of stock returns. A colleague suggests that she might want to see
whether her data are influenced by daily seasonality.
(i) How might she go about doing this?
(ii) The researcher estimates a model with the dependent variable as the daily
returns on a given share traded on the London Stock Exchange, and various
macroeconomic variables and accounting ratios as independent variables. She
attempts to estimate this model, together with five daily dummy variables (one for
each day of the week), and a constant term, using EViews. EViews then tells her
that it cannot estimate the parameters of the model. Explain what has probably
happened, and how she can fix it.
(iii) The final model for asset returns, rt is as follows (with standard errors in
parentheses):
rt = 0.0034 - 0.0183 D1t + 0.01554 D2t - 0.0007 D3t - 0.0272 D4t + other variables
     (0.0146)  (0.0068)     (0.0231)     (0.0179)     (0.0193)
The model is estimated using 500 observations. Is there significant evidence of any
day of the week effects? Assume that there are 3 other variables.
(Continued)
Solutions to Exercise 5: Assumptions of the CLRM and Structural
Stability
1. Ramsey's RESET test is a test of whether the functional form of the regression
is appropriate. In other words, we test whether the relationship between the
dependent variable and the independent variables really should be linear or whether
a non-linear form would be more appropriate. The test works by adding powers of
the fitted values from the regression into a second regression. If the appropriate
model was a linear one, then the powers of the fitted values would not be
significant in this second regression.
If we fail Ramseys RESET test, then the easiest solution is probably to transform
all of the variables into logarithms. This has the effect of turning a multiplicative
model into an additive one.
If this still fails, then we really have to admit that the relationship between the
dependent variable and the independent variables was probably not linear after all
so that we have to either estimate a non-linear model for the data (which is beyond
the scope of this course) or we have to go back to the drawing board and run a
different regression containing different variables.
2.
(i) It is important to note that we did not need to assume normality in order
to derive the sample estimates α̂ and β̂, or in calculating their standard errors. We
needed the normality assumption at the later stage when we come to test hypotheses
about the regression coefficients, either singly or jointly, so that the test statistics
we calculate would indeed have the distribution (t or F) that we said they would.
(ii) One solution would be to use a technique for estimation and inference
which did not require normality. But these techniques are often highly complex and
also their properties are not so well understood, so we do not know with such
certainty how well the methods will perform in different circumstances. Non-normality is only really a problem when the sample size is small.
One pragmatic response to failing the normality test is to plot the estimated
residuals of the model, and look for one or more very extreme outliers. These
would be residuals that are much bigger (either very big and positive, or very big
and negative) than the rest. It is, fortunately for us, often the case that just one or
two very extreme outliers cause a violation of the normality assumption, since they
lead the skewness and/or kurtosis of the residuals to be very large.
Once we spot a few extreme residuals, we should look at the dates when these
outliers occurred. If we have a good theoretical reason for doing so, we can add in
separate dummy variables for big outliers caused by, for example, wars, changes of
government, stock market crashes, changes in market microstructure (e.g. the Big
Bang in 1986). The effect of the dummy variable is exactly the same as if we had
removed the observation from the sample altogether and estimated the regression
on the remainder. Provided we only remove observations in this way, where there
is a good reason to do so, we make sure that we do not throw away useful
information represented by other sample points.
3.
(i) The researcher could construct four dummy variables, which take the value 1 for
the day of the week they correspond to, and zero elsewhere. For example, if the
sample starts on a Tuesday, and D1t, D2t, D3t, and D4t represent the dummies for
Monday - Thursday respectively, then the additional variables to be put into the
regression would be
Day         D1t   D2t   D3t   D4t
Tuesday     0     1     0     0
Wednesday   0     0     1     0
Thursday    0     0     0     1
Friday      0     0     0     0
Monday      1     0     0     0
Tuesday     0     1     0     0
Wednesday   0     0     1     0
Thursday    0     0     0     1
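This dummy construction can be sketched in a few lines (hypothetical day labels; a package such as EViews would generate these automatically):

```python
# Hypothetical sample starting on a Tuesday; D1 = Monday, ..., D4 = Thursday,
# with Friday as the omitted (base) category, as in the table above.
days = ["Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu"]
order = ["Mon", "Tue", "Wed", "Thu"]   # one dummy per day; no Friday dummy

dummies = [[1 if day == d else 0 for d in order] for day in days]
for day, row in zip(days, dummies):
    print(day, row)
```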
(ii) The problem is probably one of perfect multicollinearity between the five daily
dummy variables and the constant term. The reason is that when we add the five
dummy variables together, they will sum to one in every time period. Thus the sum
is exactly the same as the constant term in the regression, so there is an exact
linear relationship between the dummy variables and the constant term, which is
not allowed!
Technically, the problem would be that (X′X) will not be of full rank and hence its
inverse will not exist. Hence we cannot calculate any of the parameter estimates,
which are computed using (X′X)⁻¹.
Fortunately, the problem is easy to fix. What we would do is to include just four
dummy variables and a constant, or all five of the dummy variables but no
constant, in the regression. Thus the multicollinearity problem would never arise.
The convention is to use Monday - Friday dummy variables and to leave out the
constant, or to use a constant plus the first four dummy variables, although as far as
I am aware there is no theoretical reason for doing this.
(iii) The thing to do to test whether there are significant day of the week effects
is of course to calculate the t-ratios and to therefore see if the coefficients are
significantly different from zero. The t-ratios are given in the third line under the
standard errors. The coefficients that are significant at the 5% level are indicated by
an asterisk (*):
rt = 0.0034 - 0.0183 D1t + 0.01554 D2t - 0.0007 D3t - 0.0272 D4t + other variables
     (0.0146)  (0.0068)     (0.0231)     (0.0179)     (0.0193)
t:    0.233   -2.691*       0.673       -0.0391      -1.409
So there is evidence that there is a significant Monday effect. Since the sign on
the Monday dummy variable is negative, then it implies that everything else being
equal, returns will be significantly lower on Mondays relative to other days of the
week. There is no statistically significant evidence of any other seasonalities,
however, according to these results.
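These t-ratios and significance flags follow directly from dividing each coefficient by its standard error (a sketch; 1.96 is an approximate two-sided 5% critical value for the roughly 490 degrees of freedom here):

```python
# t-ratios for the day-of-the-week regression, flagged at the (approximate) 5% level.
coefs = {"const": 0.0034, "D1": -0.0183, "D2": 0.01554, "D3": -0.0007, "D4": -0.0272}
ses   = {"const": 0.0146, "D1": 0.0068,  "D2": 0.0231,  "D3": 0.0179,  "D4": 0.0193}
t_crit = 1.96   # approximate two-sided 5% value for ~490 d.o.f.

for name in coefs:
    t = coefs[name] / ses[name]
    flag = "*" if abs(t) > t_crit else ""
    print(name, round(t, 3), flag)   # only D1 (Monday) is flagged
```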
4.(a) Parameter structural stability refers to whether the coefficient estimates for a
regression equation are stable over time. If the regression is not structurally stable,
it implies that the coefficient estimates would be different for some sub-samples of
the data compared to others. This is clearly not what we want to find since when we
estimate a regression, we are implicitly assuming that the regression parameters are
constant over the entire sample period under consideration.
(b)
1981M1-1995M12:  rt = 0.0215 + 1.491 Rmt;  RSS = 0.189, T = 180
1981M1-1987M10:  rt = 0.0163 + 1.308 Rmt;  RSS = 0.079, T = 82
1987M11-1995M12: rt = 0.0360 + 1.613 Rmt;  RSS = 0.082, T = 98
(i) If we define the coefficient estimates for the first and second halves of the
sample as α̂1 and β̂1, and α̂2 and β̂2 respectively, then the null and alternative
hypotheses are
H0 : α1 = α2 and β1 = β2
and
H1 : α1 ≠ α2 or β1 ≠ β2
The test statistic is
test stat = [RSS − (RSS1 + RSS2)] / (RSS1 + RSS2) × (T − 2k) / k
          = [0.189 − (0.079 + 0.082)] / (0.079 + 0.082) × (180 − 4) / 2
          = 15.304
This follows an F distribution with (k,T-2k) degrees of freedom. F(2,176) = 3.05 at
the 5% level. Clearly we reject the null hypothesis that the coefficients are equal in
the two sub-periods.
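The Chow statistic can be reproduced directly from the three residual sums of squares (a sketch using the figures above):

```python
# Chow breakpoint test: F = [RSS - (RSS1 + RSS2)] / (RSS1 + RSS2) * (T - 2k) / k
rss, rss1, rss2 = 0.189, 0.079, 0.082   # whole-sample and sub-sample RSS
T, k = 180, 2                           # observations; parameters per sub-sample model

f_stat = (rss - (rss1 + rss2)) / (rss1 + rss2) * (T - 2 * k) / k
print(round(f_stat, 3))   # 15.304, against F(2, 176) of about 3.05 at 5%
```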
5. The data we have is:
1981M1-1995M12: rt = 0.0215 + 1.491 Rmt;  RSS = 0.189, T = 180
1981M1-1994M12: rt = 0.0212 + 1.478 Rmt;  RSS = 0.148, T = 168
1982M1-1995M12: rt = 0.0217 + 1.523 Rmt;  RSS = 0.182, T = 168
First, the forward predictive failure test - i.e. we are trying to see if the model for
1981M1-1994M12 can predict 1995M1-1995M12.
The test statistic is given by
test stat = (RSS − RSS1)/RSS1 × (T1 − k)/T2 = (0.189 − 0.148)/0.148 × (168 − 2)/12 = 3.832
Where T1 is the number of observations in the first period (i.e. the period that we
actually estimate the model over), and T2 is the number of observations we are
trying to predict. The test statistic follows an F-distribution with (T2, T1-k)
degrees of freedom. F(12, 166) = 1.81 at the 5% level. So we reject the null
hypothesis that the model can predict the observations for 1995. We would
conclude that our model is no use for predicting this period, and from a practical
point of view, we would have to consider whether this failure is a result of atypical
behaviour of the series out-of-sample (i.e. during 1995), or whether it results from a
genuine deficiency in the model.
The backward predictive failure test is a little more difficult to understand, although
no more difficult to implement. The test statistic is given by
test stat = (RSS − RSS1)/RSS1 × (T1 − k)/T2 = (0.189 − 0.182)/0.182 × (168 − 2)/12 = 0.532
Now we need to be a little careful in our interpretation of what exactly are the
first and second sample periods. It would be possible to define T1 as always
being the first sample period. But I think it easier to say that T1 is always the sample
over which we estimate the model (even though it now comes after the hold-out sample). Thus T2 is still the sample that we are trying to predict, even though it
comes first. You can use either notation, but you need to be clear and consistent. If
you wanted to choose the other way to the one I suggest, then you would need to
change the subscript 1 everywhere in the formula above so that it was 2, and change
every 2 so that it was a 1.
Either way, we conclude that there is little evidence against the null hypothesis.
Thus our model is able to adequately back-cast the first 12 observations of the
sample.
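Both predictive failure statistics can be reproduced with a small helper function (the function name is ours, not standard; figures as above):

```python
# Predictive failure test: F = (RSS - RSS1)/RSS1 * (T1 - k)/T2,
# where T1 is the estimation sample and T2 the sample being predicted.
def pred_failure(rss_whole, rss1, t1, t2, k=2):
    return (rss_whole - rss1) / rss1 * (t1 - k) / t2

print(round(pred_failure(0.189, 0.148, 168, 12), 3))   # forward:  3.832 -> reject
print(round(pred_failure(0.189, 0.182, 168, 12), 3))   # backward: 0.532 -> do not reject
```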
Exercise 6: Testing for Unit Roots
1. Why is it in general important to test for non-stationarity in time series data
before attempting to build an empirical model?
2. A researcher wants to test the order of integration of some time series data. He
decides to use the DF test. He estimates a regression of the form
Δyt = α + ψyt-1 + ut
and obtains the estimate ψ̂ = −0.02 with standard error = 0.31.
(i) What are the null and alternative hypotheses for this test?
(ii) Given the data, and a critical value of -2.88, perform the test.
(iii) What do we conclude from this test what should be the next step?
(iv) Why can we not compare the estimated test statistic with the corresponding
critical value from a t-distribution, even though the test statistic takes the form of
the usual t-ratio?
3. Using the same regression as above, but on a different set of data, the researcher
now obtains the estimate ψ̂ = −0.52 with standard error = 0.16.
(i) Perform the test.
(ii) What do we conclude, and what should be the next step?
(iii) Another researcher suggests that there may be a problem with this
methodology since it assumes that the disturbances (ut) are white noise. Suggest a
possible source of difficulty and how we might in practice get around it.
Solutions to Exercise 6: Testing for Unit Roots
1. Non-stationarity can be an important determinant of the properties of a series.
Also, if two series are non-stationary, we may experience the problem of spurious
regression. This occurs when we regress one non-stationary variable on a
completely unrelated non-stationary variable, but obtain a reasonably high value of
R2, apparently indicating that the model fits well.
Most importantly therefore, we are not able to perform any hypothesis tests in
models which inappropriately use non-stationary data since the test statistics will no
longer follow the distributions which we assumed they would (e.g. a t or F), so any
inferences we make are likely to be invalid.
2. (i) The null hypothesis is of a unit root against a one-sided stationary
alternative, i.e. we have
H0 : yt ~ I(1)
H1 : yt ~ I(0)
which is also equivalent to
H0 : ψ = 0
H1 : ψ < 0
(ii) The test statistic is given by ψ̂ / SE(ψ̂), which equals −0.02 / 0.31 = −0.06.
Since this is not more negative than the appropriate critical value, we do not reject
the null hypothesis.
(iii) We therefore conclude that there is at least one unit root in the series (there
could be 1, 2, 3 or more). What we would do now is to regress Δ²yt on Δyt-1 and
test if there is a further unit root. The null and alternative hypotheses would now be
H0 : Δyt ~ I(1) i.e. yt ~ I(2)
H1 : Δyt ~ I(0) i.e. yt ~ I(1)
If we rejected the null hypothesis, we would therefore conclude that the first
differences are stationary, and hence the original series was I(1). If we did not reject
at this stage, we would conclude that yt must be at least I(2), and we would have to
test again until we rejected.
(iv) We cannot compare the test statistic with that from a t-distribution since we
have non-stationarity under the null hypothesis and hence the test statistic will no
longer follow a t-distribution.
3. Using the same regression as above, but on a different set of data, the researcher
now obtains the estimate ψ̂ = −0.52 with standard error = 0.16.
(i) The test statistic is calculated as above. The value of the test statistic = -0.52
/0.16 = -3.25. We therefore reject the null hypothesis since the test statistic is
smaller (more negative) than the critical value.
(ii) We conclude that the series is stationary since we reject the unit root
null hypothesis. We need do no further tests since we have already rejected.
(iii) The researcher is correct. One possible source of non-whiteness is when the
errors are autocorrelated. This will occur if there is autocorrelation in the original
dependent variable in the regression (yt). In practice, we can easily get around
this by augmenting the test with lags of the dependent variable to soak up the
autocorrelation. The appropriate number of lags can be determined using the
information criteria.
City University
Financial Econometrics
Exercise 7: Linear Univariate Time Series Modelling
1. Why are ARMA models particularly useful for financial time series?
2. Consider the following three models which a researcher suggests might be a
reasonable model of stock market prices:
yt = yt−1 + ut          (1)
yt = 0.5yt−1 + ut       (2)
yt = 0.8ut−1 + ut       (3)
(i) What classes of model are these examples of?
(ii) What would the autocorrelation and partial autocorrelation functions for each of
these processes look like? (You do not need to calculate the acf, simply consider
what shape it might have given the class of model from which it is drawn).
(iii) Which model is more likely to represent stock market prices from a theoretical
perspective, and why? If any of the three models truly represented the way stock
market prices move, which could we potentially use to make money by forecasting
future values of the series?
(iv) By making a series of successive substitutions or from your knowledge of the
behaviour of these types of processes, consider the extent of persistence of shocks
to the series in each case.
3.
(i) Describe the steps that Box and Jenkins (1970) suggested should be
involved in constructing an ARMA model.
(ii) What particular aspect of this methodology has been the subject of
criticism and why?
(iii) Describe an alternative procedure that could be used for this aspect.
4. A researcher is trying to determine the appropriate order of an ARMA model to
describe some actual data, with 200 observations available. She has the following
figures for the log of the estimated residual variance (i.e. log(σ̂²)) for various
candidate models. She has assumed that an order greater than (3,3) should not be
necessary to model the dynamics of the data. What is the optimal model order?
ARMA(p,q) model order    log(σ̂²)
(0,0)                    0.932
(1,0)                    0.864
(0,1)                    0.902
(1,1)                    0.836
(2,1)                    0.801
(1,2)                    0.821
(2,2)                    0.789
(3,2)                    0.773
(2,3)                    0.782
(3,3)                    0.764
5. How could you determine whether the order you suggested in question 4 was in fact
appropriate?
6. Given that the objective of any econometric modelling exercise is to find the
model that most closely fits the data, then adding more lags to an ARMA model
will almost invariably lead to a better fit. Therefore a large model is best because it
will fit the data more closely.
Comment on the validity (or otherwise) of this statement.
7. (a) You obtain the following sample autocorrelations and partial autocorrelations
for a sample of 100 observations from actual data:
Lag      1       2       3       4       5       6       7       8
acf    0.420   0.104   0.032  −0.206  −0.138   0.042  −0.018   0.074
pacf   0.632   0.381   0.268   0.199   0.205   0.101   0.096   0.082
Can you identify the most appropriate time series process for this data?
(b) Use the Ljung-Box Q* test to determine whether the first three autocorrelation
coefficients taken together are jointly significantly different from zero.
8. Explain what stylised shapes would be expected for the autocorrelation and
partial autocorrelation functions for the following stochastic processes:
- white noise
- An AR(2)
- An MA(1)
- An ARMA (2,1)
9. (a) Briefly explain any difference you perceive between the characteristics of
macroeconomic and financial data. Which of these features suggest the use of
different econometric tools for each class of data?
(b) Consider the following autocorrelation and partial autocorrelation
coefficients estimated using 500 observations for a weakly stationary series, yt:
Lag      1       2       3       4       5
acf    0.307  −0.013   0.086   0.031  −0.197
pacf   0.307   0.264   0.147   0.086   0.049
Using a simple rule of thumb, determine which, if any, of the acf and pacf
coefficients are significant at the 5% level. Use both the Box-Pierce and
Ljung-Box statistics to test the joint null hypothesis that the first 5
autocorrelation coefficients are jointly zero.
(c) What process would you tentatively suggest could represent the most
appropriate model for the series in part (b)? Explain your answer.
(d) Outline two methods proposed by Box and Jenkins (1970) for determining
the adequacy of the models proposed in part (c).
City University
Financial Econometrics
Solutions to Exercise 7: Linear Univariate Time Series Modelling
1. ARMA models are of particular use for financial series due to their flexibility.
They are fairly simple to estimate, can often produce reasonable forecasts, and most
importantly, they require no knowledge of any structural variables that might be
required for more traditional econometric analysis. When the data are available at
high frequencies, we can still use ARMA models while exogenous explanatory
variables (e.g. macroeconomic variables, accounting ratios) may be unobservable at
any more than monthly intervals at best.
2.
yt = yt−1 + ut          (1)
yt = 0.5yt−1 + ut       (2)
yt = 0.8ut−1 + ut       (3)
(i) The first two models are AR(1) models (although the first one is the very special
case of a unit root or random walk), while the last is an MA(1). Strictly, since the
first model is a random walk, it should be called an ARIMA(0,1,0) model, but it
could still be viewed as a special case of an autoregressive model.
(ii) We know that the theoretical acf of an MA(q) process will be zero after q lags,
so the acf of the MA(1) will be zero at all lags after one. For an autoregressive
process, the acf dies away gradually. It will die away fairly quickly for case (2),
with each successive autocorrelation coefficient taking on a value equal to half that
of the previous lag. For the first case, however, the acf will never die away, and in
theory will always take on a value of one, whatever the lag.
Turning now to the pacf, the pacf for the first two models would have a large
positive spike at lag 1, and no statistically significant pacfs at other lags. Again,
the unit root process of (1) would have a pacf the same as that of a stationary AR
process. The pacf for (3), the MA(1), will decline geometrically.
(iii) Clearly the first equation (the random walk) is more likely to represent stock
prices in practice. The discounted dividend model of share prices states that the
current value of a share will be simply the discounted sum of all expected future
dividends. If we assume that investors form their expectations about dividend
payments rationally, then the current share price should embody all information that
is known about the future of dividend payments, and hence today's price should
only differ from yesterday's by the amount of unexpected news which influences
dividend payments.
Thus stock prices should follow a random walk. Note that we could apply a similar
rational expectations and random walk model to many other kinds of financial
series.
If the stock market really followed the process described by equations (2) or (3),
then we could potentially make useful forecasts of the series using our model. In
the latter case of the MA(1), we could only make one-step ahead forecasts since the
memory of the model is only that length. In the case of equation (2), we could
potentially make a lot of money by forming multiple step ahead forecasts and
trading on the basis of these.
Hence after a period, it is likely that other investors would spot this potential
opportunity and hence the model would no longer be a useful description of the
data.
(iv) See the handouts for the algebra. This part of the question is really an extension
of the others. Analysing the simplest case first, the MA(1), the memory of the
process will only be one period, and therefore a given shock or innovation, ut,
will only persist in the series (i.e. be reflected in yt) for one period. After that, the
effect of a given shock would have completely worked through.
For the case of the AR(1) given in equation (2), a given shock, ut, will persist
indefinitely and will therefore influence the properties of yt for ever, but its effect
upon yt will diminish geometrically as time goes on.
In the first case, the series yt could be written as an infinite sum of past shocks, and
therefore the effect of a given shock will persist indefinitely, and its effect will not
diminish over time.
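The persistence described above can be traced out directly as impulse responses to a unit shock (a pure-Python sketch; the 10-period horizon is an arbitrary choice):

```python
# Effect of a unit shock at time 0 on y_t for the three processes in question 2.
H = 10  # horizon

irf_rw  = [1.0] * H                     # (1) random walk: shock persists fully
irf_ar1 = [0.5 ** h for h in range(H)]  # (2) AR(1), φ = 0.5: geometric decay
irf_ma1 = [1.0, 0.8] + [0.0] * (H - 2)  # (3) MA(1), θ = 0.8: gone after one lag

print(irf_rw, irf_ar1, irf_ma1)
```

The three lists make the contrast explicit: the MA(1) response dies after one period, the AR(1) response halves each period, and the random walk response never decays.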
3.
(i) Box and Jenkins were the first to consider ARMA modelling in this
logical and coherent fashion. Their methodology consists of 3 steps:
Identification - determining the appropriate order of the model using graphical
procedures (e.g. plots of autocorrelation functions).
Estimation - of the parameters of the model of size given in the first stage. This can
be done using least squares or maximum likelihood depending on the model.
Diagnostic checking - this step is to ensure that the model actually estimated is
adequate. B & J suggest two methods for achieving this:
- overfitting, which involves deliberately fitting a model larger than that
suggested in step 1 and testing the hypothesis that all the additional coefficients can
jointly be set to zero.
- Residual diagnostics. If the model estimated is a good description of the
data, there should be no further linear dependence in the residuals of the estimated
model. Therefore, we could calculate the residuals from the estimated model, and
use the Ljung-Box test on them, or calculate their acf. If either of these reveal
evidence of additional structure, then we assume that the estimated model is not an
adequate description of the data.
If the model appears to be adequate, then it can be used for policy analysis and for
constructing forecasts. If it is not adequate, then we must go back to stage 1 and
start again!
(ii) The main problem with the B & J methodology is the inexactness of the
identification stage. Autocorrelation functions and partial autocorrelations for
actual data are very difficult to interpret accurately, rendering the whole procedure
often little more than educated guesswork. A further problem concerns the
diagnostic checking stage, which will only indicate when the proposed model is
too small and would not inform the researcher of when it is too large.
(iii) We could use Akaike's or Schwarz's Bayesian information criteria. Our
objective would then be to fit the model order that minimises these.
We can calculate the value of Akaike's (AIC) and Schwarz's Bayesian (SBIC)
information criteria using the following respective formulae:
AIC = ln(σ̂²) + 2k/T
SBIC = ln(σ̂²) + k ln(T)/T
where k is the number of parameters estimated and T is the sample size.
The information criteria trade off an increase in the number of parameters and
therefore an increase in the penalty term against a fall in the RSS, implying a closer
fit of the model to the data.
4. Using the formulae above, we end up with the following values for each criterion
and for each model order (with an asterisk denoting the smallest value of the
information criterion in each case).
ARMA(p,q) model order    log(σ̂²)    AIC       SBIC
(0,0)                    0.932      0.942     0.944
(1,0)                    0.864      0.884     0.887
(0,1)                    0.902      0.922     0.925
(1,1)                    0.836      0.866     0.870
(2,1)                    0.801      0.841     0.847
(1,2)                    0.821      0.861     0.867
(2,2)                    0.789      0.839     0.846
(3,2)                    0.773      0.833*    0.842*
(2,3)                    0.782      0.842     0.851
(3,3)                    0.764      0.834     0.844
The result is pretty clear: both SBIC and AIC say that the appropriate model is an
ARMA(3,2).
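The AIC column of the table can be reproduced from the stated formula (a sketch, not part of the original materials; here k = p + q + 1, i.e. the intercept is counted as a parameter, which is consistent with the AIC values shown):

```python
# Reproduce the AIC selection for T = 200 using AIC = log(σ̂²) + 2k/T.
T = 200
log_sigma2 = {(0, 0): 0.932, (1, 0): 0.864, (0, 1): 0.902, (1, 1): 0.836,
              (2, 1): 0.801, (1, 2): 0.821, (2, 2): 0.789, (3, 2): 0.773,
              (2, 3): 0.782, (3, 3): 0.764}

aic = {order: ls + 2 * (order[0] + order[1] + 1) / T
       for order, ls in log_sigma2.items()}
best = min(aic, key=aic.get)
print(best, round(aic[best], 3))  # (3, 2) 0.833
```

Note how close the (3,3) value is to the minimum; the penalty term is what tips the choice towards the smaller model.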
5. We could still perform the Ljung-Box test on the residuals of the estimated
models to see if there was any linear dependence left unaccounted for by our
postulated models.
Another test of the models' adequacy that we could use is to leave out some of the
observations at the identification and estimation stage, and attempt to construct out-of-sample forecasts for these. For example, if we have 2000 observations, we may
use only 1800 of them to identify and estimate the models, and leave the remaining
200 for construction of forecasts. We would then prefer the model that gave the
most accurate forecasts.
6. This is not true in general. Yes, we do want to form a model which fits the data
as well as possible. But in most financial series, there is a substantial amount of
noise. This can be interpreted as a number of random events that are unlikely to
be repeated in any forecastable way. We want to fit a model to the data that will be
able to generalise. In other words, we want a model that fits to features of the
data that will be replicated in future; we do not want to fit to sample-specific noise.
This is why we need the concept of parsimony - fitting the smallest possible
model to the data. Otherwise we may get a great fit to the data in sample, but any
use of the model for forecasts could yield terrible results.
Another important point is that the larger the number of estimated parameters (i.e.
the more variables we have), then the smaller will be the number of degrees of
freedom, and this will imply that coefficient standard errors will be larger than they
would otherwise have been. This could lead to a loss of power in hypothesis tests,
and variables that would otherwise have been significant are now insignificant.
7. (a) We class an autocorrelation coefficient or partial autocorrelation coefficient
as significant if it exceeds ±1.96 × 1/√T = ±0.196 (here T = 100). Under this rule, the
sample autocorrelation coefficients (sacfs) at lags 1 and 4 are significant, and the
spacfs at lags 1, 2, 3, 4 and 5 are all significant.
This clearly looks like the data are consistent with a first-order moving average
process, since all the acfs after the first are insignificant (the significant lag 4
acf is a typical wrinkle that one might expect with real data and should probably be
ignored), and the pacf has a slowly declining structure.
(b) The formula for the Ljung-Box Q* test is given by
Q* = T(T+2) Σ_{k=1}^{m} [τ̂k² / (T − k)]
With T = 100 and m = 3,
Q* = 100 × 102 × [0.420²/(100−1) + 0.104²/(100−2) + 0.032²/(100−3)] = 19.41.
The 5% and 1% critical values for a χ² distribution with 3 degrees of freedom are
7.81 and 11.3 respectively. Clearly, then, we would reject the null hypothesis that
the first three autocorrelation coefficients are jointly not significantly different from
zero.
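The calculation above can be verified with a short Python sketch (not part of the original materials):

```python
# Ljung-Box Q* for the first three autocorrelation coefficients of 7(a), T = 100.
T = 100
taus = [0.420, 0.104, 0.032]

q_star = T * (T + 2) * sum(tau**2 / (T - k)
                           for k, tau in enumerate(taus, start=1))
print(round(q_star, 2))  # 19.41, exceeding both the 5% and 1% critical values
```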
8. The shapes of the acf and pacf are perhaps best summarised in a table:

Process       acf                                pacf
White noise   No significant coefficients        No significant coefficients
AR(2)         Geometrically declining or         First 2 pacf coefficients
              damped sinusoid acf                significant, all others insignificant
MA(1)         First acf coefficient              Geometrically declining or
              significant, all others            damped sinusoid pacf
              insignificant
ARMA(2,1)     Geometrically declining or         Geometrically declining or
              damped sinusoid acf                damped sinusoid pacf
A couple of further points are worth noting. First, it is not possible to tell what the
signs of the coefficients for the acf or pacf would be for the last three processes,
since that would depend on the signs of the coefficients of the processes. Second,
for mixed processes, the AR part dominates from the point of view of acf
calculation, while the MA part dominates for pacf calculation.
9. (b) Using the ±1.96/√T rule of thumb with T = 500, coefficients larger in absolute
value than 0.088 are significant at the 5% level: the acfs at lags 1 and 5, and the
pacfs at lags 1, 2 and 3.
The Box-Pierce and Ljung-Box statistics are given respectively by
Q = T Σ_{k=1}^{m} τ̂k²
and
Q* = T(T+2) Σ_{k=1}^{m} [τ̂k² / (T − k)]
With T = 500 and m = 5,
Q = 500 × (0.307² + 0.013² + 0.086² + 0.031² + 0.197²) = 70.79
Q* = 500 × 502 × [0.307²/(500−1) + 0.013²/(500−2) + 0.086²/(500−3) + 0.031²/(500−4) + 0.197²/(500−5)] = 71.39.
The test statistics will both follow a χ² distribution with 5 degrees of freedom (the
number of autocorrelation coefficients being used in the test). The critical values
are 11.07 and 15.09 at 5% and 1% respectively. Clearly, the null hypothesis that the
first 5 autocorrelation coefficients are jointly zero is resoundingly rejected.
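Both statistics can be reproduced as follows (a Python sketch, for checking the arithmetic only):

```python
# Box-Pierce Q and Ljung-Box Q* for the five coefficients in 9(b), T = 500.
T = 500
taus = [0.307, -0.013, 0.086, 0.031, -0.197]

q = T * sum(tau**2 for tau in taus)
q_star = T * (T + 2) * sum(tau**2 / (T - k)
                           for k, tau in enumerate(taus, start=1))
print(round(q, 2), round(q_star, 2))  # 70.79 71.39
```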
(c) Setting aside the lag 5 autocorrelation coefficient, the pattern in the table is for
the autocorrelation coefficient to only be significant at lag 1 and then to fall rapidly
to values close to zero, while the partial autocorrelation coefficients appear to fall
much more slowly as the lag length increases. These characteristics would lead us
to think that an appropriate model for this series is an MA(1). Of course, the
autocorrelation coefficient at lag 5 is an anomaly that does not fit in with the pattern
of the rest of the coefficients. But such a result would be typical of a real data series
(as opposed to a simulated data series that would have a much cleaner structure).
This serves to illustrate that when econometrics is used for the analysis of real data,
the data generating process was almost certainly not any of the models in the
ARMA family. So all we are trying to do is to find a model that best describes the
features of the data to hand. As one econometrician put it, all models are wrong, but
some are useful!
(d) The methods are overfitting and residual diagnostics. Overfitting involves
selecting a deliberately larger model than the proposed one, and examining the
statistical significances of the additional parameters. If the additional parameters are
statistically insignificant, then the originally postulated model is deemed acceptable.
The larger model would usually involve the addition of one extra MA term and one
extra AR term. Thus it would be sensible to try an ARMA(1,2) in the context of
Model A, and an ARMA(3,1) in the context of Model B. Residual diagnostics
would involve examining the acf and pacf of the residuals from the estimated
model. If the residuals showed any action, that is, if any of the acf or pacf
coefficients showed statistical significance, this would suggest that the original
model was inadequate. Residual diagnostics in the Box-Jenkins sense of the term
involved only examining the acf and pacf, rather than the array of diagnostics. It is
worth noting that these two model evaluation procedures would only indicate a
model that was too small. If the model were too large, i.e. it had superfluous terms,
these procedures would deem the model adequate.
1 INTRO TO EVIEWS
1.1
Prelude
This guide is compiled following the textbook by Chris Brooks. Some of the
material contained within is for you to do, while some is for reading only. Instructions
where you actually have to do something (e.g. type, point or click) are given in bold-faced type.
1.2
What is EViews?
Getting Started:
Importing data:
There are several different ways to input data into EViews. The simplest way to input
(very small amounts of) data is manually, but since there's no real practical reason to
do this, we will start by importing data from an Excel spreadsheet or CSV file.
File; Open; Foreign data as workfile
We will use data on the stock of British Petroleum and the FTSE All Shares Index.
The data are from Datastream and come in a CSV format. The data is in
T:\Eviews\BP.csv
You will see the following windows
In Step 3 you can edit the column names by selecting each one.
Once done, click Finish; you will now have a new EViews workfile:
1.5
For now let's double-click on BP to open it in a series window. You will see a
button bar on the top. On that button bar, click on the View button to see a series of
options. These options include:
View:
Spreadsheet
Descriptive Statistics
To see some descriptive statistics for the variables, select the option View; Descriptive
Statistics; Histogram and Stats. The histogram of the variable and, amongst others, the
mean, the minimum, the maximum, the median and the standard deviation of the series
will now be shown.
To see the spreadsheet again, click on View; Spreadsheet.
To plot variable BP, we simply choose the View; Graph; Line graph option, and in the
same window, the graph of variable BP will appear.
It is also possible to create other types of graph, for example a scatter plot. Close the
BP series window, select the 2 variables (using Ctrl) and select Open as group.
Now, View; Graph; Scatter; Simple Scatter will give you a scatter plot.
If you click on Name and label the scatter plot (e.g. Scatter), in the workfile window,
there will now be a new object that is named scatter. If you double click on this object,
a scatter plot of the two variables will now appear. NOTE that saving the plot in this
way doesn't actually freeze the output; it saves the group of 2 variables you selected.
You can still select View; Spreadsheet and return to just the numbers. To save just the
output, e.g. just the graph, select Freeze, then in the new window that appears select
Name and name your new Graph object. Double-click this object and you will see that
the View menu has changed.
1.6
Now we will look at how to run a simple OLS regression in EViews. Don't worry
about whether it makes sense to run this particular regression. At this stage we are
simply trying to familiarize ourselves with the package and the concepts.
There are 2 ways to do this:
Either:
Object; New Object. In the new window select Equation and name the equation
REG1. Click on OK.
In order to specify the regression equation we can use two alternative methods. In the
new window, type EITHER
BP=c(1)+c(2)*FTAS
OR
BP C FTAS
Both expressions are equivalent and it is obvious that we are instructing EViews to
run a regression of BP on a constant and FTAS. It is possible to choose another
estimation method, but for now use LS - Least Squares (NLS and ARMA).
ALTERNATIVELY, you can select BP and FTAS together, right-click, and open as
equation object. This will get you to the same point that the procedure outlined above
does.
We could also choose not to include all our observations in the regression by
specifying a subset of observations to be used. This would allow some data points to be
left out and used for other purposes (e.g. forecasting). For now, all the available
observations will be used and so the sample is left at 1 370.
Click on OK and the screen will display the regression results. However complex
the regression equation, the results screen will always have the same format. The
results specify the dependent variable, the estimation method, the sample and the
observations used, and the coefficients' names, values, standard errors, t-statistics and
p-values of the t-test. Below this, some regression statistics are shown, including the
R-squared, the adjusted R-squared and the p-value of the F-test of the regression.
Dependent Variable: BP
Method: Least Squares
Sample: 1 370
Included observations: 370

Variable   Coefficient   Std. Error   t-Statistic   Prob.
FTAS       0.223737      0.021684     10.31805      0.0000
C          -100.1781     64.80856     -1.545754     0.1230

R-squared            0.224385    Mean dependent var      568.1250
Adjusted R-squared   0.222277    S.D. dependent var      48.58218
S.E. of regression   42.84394    Akaike info criterion   10.35840
Sum squared resid    675501.9    Schwarz criterion       10.37955
Log likelihood       -1914.303   F-statistic             106.4622
Durbin-Watson stat   0.079089    Prob(F-statistic)       0.000000
Note that the p-values are given so that one does not have to look up the t-values in a t
statistical table. Recall that the t-test examines whether the value of an individual
coefficient is statistically significant or not. This test in effect examines whether the
value of the coefficient is equal to zero or is significantly different from zero. Under
this setting, the test takes the form:
H0: α = 0
H1: α ≠ 0
for the intercept coefficient, denoted α, and, for a single slope coefficient denoted β,
H0: β = 0
H1: β ≠ 0
The value of the estimated t-statistic is then compared with the value of the
tabulated t-statistic for a number of degrees of freedom and the confidence interval and
a conclusion can be reached. An alternative way to test the hypothesis is by looking
at the p-value. The p-value is the probability, by chance alone, of drawing a t-statistic
as extreme as the one actually observed. This probability is the marginal
significance level and, given a p-value, it can be concluded immediately whether to
reject or not reject the null hypothesis that the true coefficient is zero. If
conducting a one-sided test, the probability is one-half that reported by EViews. In the
simple regression above, the slope coefficient is statistically significant (i.e.
significantly different from zero), since the p-value of the t-test is 0.0000 and thus the
null hypothesis is rejected. On the other hand, the intercept coefficient is statistically
insignificant and the null hypothesis that its true value is zero cannot be rejected.
For statistical reasons, the intercept coefficient should never be excluded from a
regression equation.
We now turn our attention to the F-test statistic. This test examines whether all the
slope coefficients are jointly statistically insignificant. It is a joint test of significance
that, assuming the slope coefficients are denoted β1, β2, β3, …, βk, takes the form:
H0: β1 = 0 and β2 = 0 and … and βk = 0
H1: β1 ≠ 0 or β2 ≠ 0 or … or βk ≠ 0
In this regression, there is only one slope coefficient and so the t- and the F-statistic
provide the same information.
The R-squared value shows how much of the variability of the dependent variable is
explained by the independent variable(s).
It is possible to copy these results into the clipboard and then to paste them into a
document by simply selecting the result table and then copying them formatted or
unformatted into the clipboard. To copy the results, press Ctrl+A (which selects the
non-empty cells), then Ctrl+C (or Edit; Copy), and Ctrl+V (or Edit; Paste) wherever you want to paste
the table. To save regression results into an equation object use Name as we did with
the scatter plot. In order to change the estimation method or the regression equation,
click on the estimate button in the equation window.
To save just the table of output use Freeze and Name (again as with the scatter plot).
Let us turn our attention now to the residuals of the regression. We can view the
table of the actual and fitted values and the residuals by clicking on the view button in
the equation window and selecting Actual, Fitted, Residuals; Actual, Fitted,
Residuals Table. This will give a table of the actual and fitted values and of the
residuals with a plot of the residuals. To obtain a graph of the residuals, select Actual,
Fitted, Residuals; Residual Graph.
In order to save your work, go to File; Save and then provide a name for your file.
EViews will save the entire workfile with all the objects created in it. This means that
all series, equations and graphs will be saved in this workfile in a .wf1 format that can
be accessed only from EViews.
This file can be accessed in the future by simply opening EViews and then clicking on
File; Open; EViews Workfile and selecting the file you have created.
This guide has so far covered how to import data, how to view various descriptive
statistics, how to plot the variables, how to run a regression, how to interpret simple
regression statistics and tests, how to view the residuals, and how to save a workfile.
We will now continue with some real data and a financial application.
1.7
For the population of chief executive officers, let y be annual salary (salary) in
thousands of dollars. Thus, a value y = 1256.3 indicates an annual salary of $1,256,300. Let
x denote the average return on equity (roe) for the CEO's firm for the previous three
years. (Return on equity is defined in terms of net income as a percentage of common
equity.) For example, if roe = 19, then average return on equity is 19%. The file
CEOSAL1 contains information on 209 CEOs for the year 1990; these data were
obtained from Business Week 5/6/91 (Wooldridge, 2003).
(a) What is the average annual salary in this sample? And the average return on
equity?
In order to find the average annual salary and the average roe we open the
corresponding series in the workfile and we click View/Descriptive Stats and
Tests/Stats Table.
As we can see, the average annual salary is 1281.120 (i.e. about $1,281,120) and the average roe is 17.18%.
An alternative solution is to run a regression of salary (and likewise roe) on just a
constant: the estimated intercept then equals the sample mean of the dependent variable.
(b) What is the variance of the average annual salary in this sample? And the
average return on equity?
The variance is equal to the squared standard deviation. Thus, the variance of the
annual salary is (1372.345)² ≈ 1,883,331 and the variance of roe is equal to (8.5185)² ≈
72.565.
[Figure: scatter plot of SALARY (vertical axis, 0 to 10,000) against ROE (horizontal axis, 0 to 60)]
(d) Estimate the following regression model (by LS) relating salary to roe.
salaryi = β1 + β2·roei + ui
What estimates do you obtain for the model parameters? What is the standard error
of each? Interpret the latter.
Dependent Variable: SALARY
Method: Least Squares
Date: 08/28/12 Time: 13:09
Sample: 1 209
Included observations: 209

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          963.1913      213.2403     4.516930      0.0000
ROE        18.50119      11.12325     1.663290      0.0978

R-squared            0.013189    Mean dependent var      1281.120
Adjusted R-squared   0.008421    S.D. dependent var      1372.345
S.E. of regression   1366.555    Akaike info criterion   17.28750
Sum squared resid    3.87E+08    Schwarz criterion       17.31948
Log likelihood       -1804.543   Hannan-Quinn criter.    17.30043
F-statistic          2.766532    Durbin-Watson stat      2.104990
Prob(F-statistic)    0.097768
(e) If the return on equity is zero, what is the expected (or predicted) salary?
If the roe = 0, then E(salary) = $ 963 191.3.
(f) If the return on equity increases by 1 percentage point, by how much is salary
expected to increase?
If roe increases by 1 percentage point, then salary is expected to increase by 18.50, i.e.
about $18,501 (recall that salary is measured in thousands of dollars).
(g) Use your estimation results to compare predicted salaries at different values
of roe, for instance roe = 30% and roe = 15%.
For each 1 percentage point of roe, the predicted salary rises by 18.50. Relative to the
intercept, predicted salary is higher by 30 × 18.50 = 555 at roe = 30% and by
15 × 18.50 = 277.5 at roe = 15%.
(h) On the basis of the current sample and your estimation results, can we claim
that roe plays a significant role in explaining CEOs salary? (Hint: conduct a
hypothesis test)
Hypotheses:
H0: b2 = 0
HA: b2 ≠ 0
t-test:
t-stat = 18.50 / 11.12 = 1.66
The critical value is t2.5%,207 = 1.97 or t5%,207 = 1.65.
At the 5% level of significance, roe doesn't play a significant role; however, at the
10% level roe is significant.
Alternatively, we can look at the p-value of the relevant t-statistic. The p-value is
9.7%, which is greater than the 5% level of significance, and thus we don't reject
H0 that b2 = 0: b2 is insignificant at the 5% level.
However, it is significant at the 10% level, since the p-value of 9.7% is smaller than 10%.
(i) Construct a two-sided 95% confidence interval on the marginal effect of roe
on CEOs salary and answer the previous question using this interval. Do you
get the same answer as in (h)?
The confidence interval is
b2 ± tcrit × SE(b2) = 18.50 ± t2.5%,207 × 11.12 = 18.50 ± 1.97 × 11.12 = (−3.40, 40.40)
Since zero lies inside the confidence interval, we don't reject the null hypothesis that
b2 = 0. Thus b2 is insignificant at the 5% level of significance.
Same answer was obtained in (h). The hypothesis testing with the confidence intervals
and test of significance approach leads to the same results.
(j) Test the hypothesis that if the roe increases by 1 percentage point the CEOs
salary is expected to change by more than $15,000.
H0 : b2 = 15
HA : b2 > 15
t-stat = (18.50 − 15) / 11.12 = 0.31
t5%,207 = 1.65
Since this is a one-sided test, the critical value is 1.65, which is greater than the
t-statistic. We therefore do not reject the null that b2 = 15, i.e. we cannot conclude
that a 1 percentage point increase in roe changes the CEO's salary by more than $15,000.
(k) Plot the predicted salaries (according to this model) and the actual salaries.
Discuss.
[Figure: plot of the residual, actual and fitted salaries across the 209 observations]
From the plot we can see that there is no big difference between the actual and the
fitted (predicted) salaries.
The errors are equal to the actual salaries minus the predicted salaries. As we can see,
with the exception of three cases, our model predicts the actual salaries of the 209
CEOs well.
(m) Report and interpret the marginal impact of roe on CEOs salary.
If roe changes by 1 percentage point, the CEO's salary will increase by 18.50 (about
$18,501), but the impact is only significant at the 10% level.
Import Data
These series are daily closing prices of the FTSE All Shares Index and BP stock. In order to estimate a CAPM equation, these series must first be transformed into returns, and then excess returns over the risk-free rate must be calculated. To transform the series,
click on the Generate button (Genr) on the workfile window. In the new window, type:
RFTAS=LOG(FTAS/FTAS(-1))
This will create a new series named RFTAS that will contain the returns of the
FTAS. The operator (-1) is used to instruct EViews to use the one period lagged
observation of the series.
To estimate returns on the BP Stock, press the (Genr) button again and type
RBP=LOG(BP/BP(-1))
This will yield a new series named RBP that will contain the returns of the BP
Stock.
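Outside EViews, the same log-return transformation can be sketched in a few lines of Python (the prices below are made up purely for illustration):

```python
import math

# Continuously compounded (log) returns: the same quantity as
# RFTAS = LOG(FTAS/FTAS(-1)) in EViews. Prices are illustrative only.
prices = [100.0, 102.0, 101.0, 105.0]

returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
print([round(r, 4) for r in returns])  # [0.0198, -0.0099, 0.0388]
```

Note that one observation is lost at the start of the sample, exactly as EViews loses the first observation when using the (-1) lag operator.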
EViews allows various different kinds of transformations to the series. For example:
X2=X/2
XSQ=X^2
LX=LOG(X)
LAGX=X(-1) lags X by 1 period
LAGX2=X(-2) lags X by two periods etc.
Other functions include:
d(x)        first difference
d(x,n)      nth-order difference
d(x,n,s)    nth-order difference with a seasonal difference at lag s
dlog(x)     first difference of the logarithm
dlog(x,n)   nth-order difference of the logarithm
abs(y)      absolute value of y
Note that if the new series in a transformation is given the same name as an old series, the old series will be overwritten (it is best to avoid doing this to your original series).
Using the above functions, it is possible to generate new series by simply typing in the Genr window:
RFTAS=DLOG(FTAS)
In the same way, we will transform the returns into excess returns.
Click the Genr button again and type in:
RFTAS=RFTAS-UKTBILL
Now transform the series of returns for BP to excess returns.
Note that this has now overwritten the returns series with excess returns.
We will now run a CAPM regression to identify the beta and the alpha for BP stock.
2.3
Now that we have the excess returns of the series, we can proceed to run the CAPM
regression. Before running the regression, plot the data to examine visually whether the
series move together. To do this, create a new object by clicking on the Objects; New
Object menu on the menu bar. Select Graph, provide a name (call the graph graph1)
and then in the new window provide the names of the series to plot. In this new
window, type:
RFTAS RBP
In the new window press OK. What does the graph imply about the beta of BP
Stock?
Close the window of the graph and return to the workfile window. Select RBP and
RFTAS, open group and select the scatter plot. It should appear as below.
To estimate the CAPM equation we click on Objects; New Objects. In the new
window, select Equation and Name the equation CAPM. Click on OK. In the new
window, specify the regression equation. The regression equation takes the form:
(RBP - rf)t = α + β(RM - rf)t + ut
Since the data has already been transformed, in order to specify this regression
equation, type in the equation window:
RBP c RFTAS
To use all the observations in the sample and to estimate the regression using LS
Least Squares, click on OK. The results screen appears and has the same format as the
screen of the previous section.
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.000720      0.001105     0.651537      0.5151
RFTAS       0.697092      0.107925     6.459017      0.0000

R-squared            0.102072    Mean dependent var       0.000761
Adjusted R-squared   0.099626    S.D. dependent var       0.022379
S.E. of regression   0.021235    Akaike info criterion   -4.860932
Sum squared resid    0.165489    Schwarz criterion       -4.839735
Log likelihood     898.8419      F-statistic             41.71890
Durbin-Watson stat   1.898081    Prob(F-statistic)        0.000000
2.4
Take a couple of minutes to examine the results of the regression. What do the
results tell us? What is the slope coefficient estimate and what does it signify? Is this
coefficient statistically significant?
We can see that the beta coefficient (the slope coefficient, i.e. the coefficient on RFTAS) is equal to 0.697092. The p-value of its t-ratio is 0.0000, signifying that the excess return on the market proxy helps explain the variability of the excess returns of BP.
What does the constant coefficient mean? Is it statistically significant?
The F test shows that the regression slope coefficient is significantly different from
zero, which in this case is the same result as the t-test for the beta coefficient (note that
we only have one slope coefficient).
Finally, by examining the R2 and the adjusted R2, it can be seen that the excess
returns of the market proxy are able to explain a relatively small proportion of the
variability of the excess returns on BP stock.
It is of interest to test whether the beta coefficient is statistically different from 1. To
do this, click on the View button in the regression window and choose Coefficient
tests; Wald-Coefficient Restrictions. In the new window type:
C(2)=1
This tells EViews to test whether the slope coefficient (i.e. the coefficient on the
second variable, since the intercept is c(1)) is equal to 1. Click on OK. In the new test
result screen you will see:
Wald Test:
Equation: CAPM

Test Statistic   Value      df         Probability
F-statistic      7.877243   (1, 367)   0.0053
Chi-square       7.877243   1          0.0050

Restriction (C(2) - 1):   Value  -0.302908   Std. Err.  0.107925
There are two versions of the test given: an F-version and a χ²-version. The F-version is adjusted for small-sample bias and should be used when the regression is estimated using a small sample. Both statistics asymptotically yield the same result, and hence in a sample of this size the p-values are very similar. The beta of BP is significantly different from 1. We thus conclude that, in the CAPM world, BP returns on average fluctuate less than those of the market as a whole, as we might expect of a basic commodity supplier.
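Because there is a single restriction here, the Wald F-statistic is simply the squared t-ratio of the restriction, which can be verified from the numbers in the output above:

```python
# Single restriction C(2) = 1: the Wald F-statistic equals the squared
# t-ratio (beta_hat - 1) / SE(beta_hat). Estimates from the CAPM output.
beta_hat, se_beta = 0.697092, 0.107925

t_ratio = (beta_hat - 1.0) / se_beta
print(round(t_ratio, 6))       # -2.806653
print(round(t_ratio ** 2, 4))  # 7.8773 (the F-statistic reported by EViews)
```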
2.5
We will now examine the residuals of the regression. To examine whether there is
autocorrelation and heteroscedasticity it is important to look at the residuals. Plot the
residuals by selecting View; Actual, Fitted, Residuals; Residual Graph.
If the residuals of the regression have systematically changing variability over the
sample, that is sign of heteroscedasticity. In that case, any inferences that are made
regarding the coefficient estimates may be wrong since although the coefficient
estimates are still unbiased in the presence of heteroscedasticity, they are no longer
BLUE.
To test for heteroscedasticity using the White heteroscedasticity test, click on the
View button in the regression window and select Residual Tests; White
Heteroscedasticity (no cross terms). The results of the test are:
White Heteroskedasticity Test:
F-statistic      0.712712    Probability    0.490992
Obs*R-squared    1.431533    Probability    0.488817

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Included observations: 369

Variable    Std. Error   t-Statistic   Prob.
C           5.36E-05     7.813221      0.0000
RFTAS       0.004320     0.865445      0.3874
RFTAS^2     0.295599     0.948413      0.3435

Mean dependent var   0.000448    Akaike info criterion   -11.31970
S.D. dependent var   0.000839    Schwarz criterion       -11.28791
F-statistic          0.712712    Prob(F-statistic)         0.490992
What do we conclude from the results? Remember that the null hypothesis is that the model is well specified, i.e. that there is no heteroskedasticity.
We will now examine whether there is autocorrelation in the residuals. If
autocorrelation is present, the coefficient estimates of the regression are still unbiased
but they are inefficient.
There are several ways to test for autocorrelation. The easiest (but not very accurate) is to examine the Durbin-Watson statistic for first-order autocorrelation.
This statistic was given in the general results screen shown above. To view the results
screen again, click on the View button in the regression window and select Estimation
output. The DW statistic is found at the bottom of the table. What does the DW
statistic tell us in this case?
18
The DW statistic always lies between 0 and 4:
DW around 2 = no first-order autocorrelation
DW << 2 = positive autocorrelation
DW >> 2 = negative autocorrelation
Here DW = 1.898, close to 2, suggesting no first-order autocorrelation.
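The DW statistic itself is straightforward to compute from a residual series; a sketch (the toy residuals below are illustrative only, chosen so the answer is easy to verify by hand):

```python
def durbin_watson(e):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by their sum of squares."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(x ** 2 for x in e)
    return num / den

# Perfectly alternating residuals -> strong negative first-order
# autocorrelation, so DW comes out well above 2.
print(durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0, -1.0]))  # 3.333..., i.e. >> 2
```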
To examine whether the residuals contain any higher order autocorrelation, we
could plot them over time, although this is likely to be difficult to interpret. A
statistical approach would be to use the Breusch-Godfrey Serial Correlation LM Test.
This test can be conducted by selecting View; Residual Tests; Serial Correlation
LM Tests. In the new window, type the number of lagged residuals you want to
include in the test and click on OK. Assuming that you selected to employ 10 lags in
the test, the results would be:
Breusch-Godfrey Serial Correlation LM Test:
F-statistic      1.799298    Probability    0.059338
Obs*R-squared   17.70543     Probability    0.060141

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -3.65E-06     0.001094     -0.003335     0.9973
RFTAS        -0.016466     0.107771     -0.152785     0.8787
RESID(-1)     0.039652     0.052849      0.750292     0.4536
RESID(-2)    -0.118282     0.052990     -2.232153     0.0262
RESID(-3)    -0.123380     0.053359     -2.312255     0.0213
RESID(-4)    -0.025917     0.053765     -0.482043     0.6301
RESID(-5)    -0.087858     0.054094     -1.624178     0.1052
RESID(-6)     0.022433     0.054170      0.414124     0.6790
RESID(-7)     0.038984     0.054162      0.719768     0.4721
RESID(-8)    -0.051403     0.053886     -0.953924     0.3408
RESID(-9)     0.029604     0.053578      0.552542     0.5809
RESID(-10)   -0.080451     0.053732     -1.497262     0.1352

R-squared            0.047982
Adjusted R-squared   0.018648
S.E. of regression   0.021007
Sum squared resid    0.157548
Log likelihood     907.9140
Durbin-Watson stat   1.992850
The test is an F-test of serial correlation and if the p-value of the F-statistic is
smaller than 0.05 we reject the null of no serial correlation.
Another assumption of the CLRM is that the disturbances follow a normal
distribution. If the residuals do not follow a normal distribution then we cannot make
correct inferences about the true coefficients from the coefficient estimates.
To test for normality, the Jarque-Bera test is used. This test can be viewed by
selecting View; Residual Tests; Histogram-Normality Test. The Jarque-Bera
statistic has a χ² distribution with two degrees of freedom under the null hypothesis of
normally distributed errors. If the residuals are normally distributed, the histogram
should be bell-shaped and the Jarque-Bera statistic would not be significant. This
means that the p-value given at the bottom of the normality test screen should be
bigger than 0.05 to not reject the null of normality at the 5% level. In this case, the
screen would appear as:
In this case, the null hypothesis for residual normality is rejected, implying that the
inferences we make about the coefficient estimates could be wrong, although the
sample is probably sufficiently large to not give great cause for concern.
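The statistic combines the sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4); a sketch of the computation on toy data:

```python
# Jarque-Bera statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the
# sample skewness and K the sample kurtosis. Toy data for illustration.
def jarque_bera(x):
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - m) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - m) ** 4 for v in x) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

# A symmetric sample: skewness is 0, so JB reflects only excess kurtosis.
print(round(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]), 4))  # 0.3521
```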
Let us now shift our attention to the functional form of the regression. A simple test
for the functional form of the model is the Ramsey Reset Test found in the View menu
of the regression window under Stability tests; Ramsey RESET test. It examines whether the relationship between the dependent and explanatory variables is nonlinear, and is therefore useful for detecting misspecification problems. You are asked for the number of fitted terms, equivalent
to the number of powers of the fitted value to be used in the regression; type 1 to
consider only the square of the fitted values. The Ramsey RESET test for this
regression is in effect testing whether the relationship between the stock excess returns
and the market proxy excess returns is linear or not. The results of this test for one
fitted term are:
Ramsey RESET Test:
F-statistic            1.368964    Probability    0.242751
Log likelihood ratio   1.377610    Probability    0.240509

Test Equation:
Dependent Variable: RBP
Method: Least Squares
Included observations: 369

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -0.000208     0.001360     -0.152761     0.8787
RFTAS        0.699135     0.107885      6.480356     0.0000
FITTED^2    17.99653     15.38130       1.170027     0.2428

R-squared            0.105418    Mean dependent var       0.000761
Adjusted R-squared   0.100530    S.D. dependent var       0.022379
S.E. of regression   0.021224    Akaike info criterion   -4.859245
Sum squared resid    0.164872    Schwarz criterion       -4.827450
Log likelihood     899.5307      F-statistic             21.56490
Durbin-Watson stat   1.893274    Prob(F-statistic)        0.000000
We can see that there is no apparent nonlinearity in the regression equation and we
thus conclude that the linear model in the returns is appropriate.
Taking the results as a whole, what are the implications for the validity and
testability of the estimated model?
Heteroskedasticity? Autocorrelation? Normality? Non-linearity?
2.6
For the population of chief executive officers, let y_{t} be annual salary (salary) in thousands of $. Thus, a value y=1256.3 indicates an annual salary of $1,256,300. Let x_{t}..
2.7
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.001895      0.008300      0.228254     0.8199
RFUTURE     0.936129      0.083072     11.26887      0.0000

R-squared            0.533589    Mean dependent var      -0.000578
Adjusted R-squared   0.529387    S.D. dependent var       0.128572
S.E. of regression   0.088202    Akaike info criterion   -2.000832
Sum squared resid    0.863536    Schwarz criterion       -1.952560
Log likelihood     115.0470      F-statistic            126.9875
Durbin-Watson stat   2.002975    Prob(F-statistic)        0.000000
What does the slope coefficient tell us? Does it have the correct sign? What does the R² tell us?
2.8
Yt = α + β1X1t + β2X2t + β3X3t + .... + ut
OLS assumptions and properties of OLS estimators are the same as in the simple
regression case.
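The OLS mechanics are a direct extension of the simple case; a minimal numpy sketch with made-up data (the disturbance is set to zero here so that the true coefficients are recovered exactly):

```python
import numpy as np

# OLS for Y_t = a + b1*X1_t + b2*X2_t + u_t, with u_t = 0 so the
# estimated coefficients equal the true ones. Data are illustrative only.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = 0.5 + 2.0 * x1 - 1.0 * x2        # true coefficients: a=0.5, b1=2, b2=-1

X = np.column_stack([np.ones_like(x1), x1, x2])  # regressor matrix with constant
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares solution
print(np.round(beta, 6))             # [ 0.5  2.  -1. ]
```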
A Practical Example: from the CAPM to a 3-Factor Model
Data:
We will use the data in the file MRM that can be found at:
T:\Eviews\MRM
This data comes from Professor Kenneth French's website:
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
We have monthly data from 1980:01 to 2007:08
SMB (Small Minus Big) is the average return on the three small portfolios minus
the average return on the three big portfolios
HML (High Minus Low) is the average return on the two value portfolios minus the
average return on the two growth portfolios
Rm-Rf, the excess return on the market, is the value-weight return on all NYSE,
AMEX, and NASDAQ stocks (from CRSP) minus the one-month Treasury bill rate
(from Ibbotson Associates)
LRG_GRW: average monthly returns on large, growth portfolio
SM_VAL: average monthly returns on small, value portfolio
MID: neutral
23
We can first estimate a 1 factor model (CAPM) as before, for example using the large
growth portfolio. The regression would be:
ERLG = c(1) * MKT_RF + c(2)
Or equivalently
ERLG MKT_RF c
Variable    Coefficient   Std. Error   t-Statistic   Prob.
MKT_RF      1.204779      0.021592     55.79650      0.0000
C          -0.093009      0.095212     -0.976863     0.3294

R-squared            0.904160    Mean dependent var       0.667018
Adjusted R-squared   0.903870    S.D. dependent var       5.537812
S.E. of regression   1.716991    Akaike info criterion    3.925029
Sum squared resid  972.8590      Schwarz criterion        3.947952
Log likelihood    -649.5549     F-statistic            3113.250
Durbin-Watson stat   2.076069    Prob(F-statistic)        0.000000
This output is of course interpreted as before. What would the constant tell us in this
case? How about the coefficient on the market excess return?
Now we want to expand on CAPM by adding size and value factors in addition to the
market risk factor. The Fama-French 3 factor model considers the fact that value (high
book to market) and small cap stocks historically outperform markets. Accounting for
this observation and adjusting for it should give us a better model for expected (and
therefore excess) returns. The 3 factor model adjusts downward for small cap and
value outperformance.
Now the regression to estimate is:
R - Rf = α + β1(Rm - Rf) + β2 SMB + β3 HML + u
Dependent Variable: ERLG
Method: Least Squares
Date: 10/22/07 Time: 19:22
Sample: 1980M01 2007M08
Included observations: 332
Variable    Coefficient   Std. Error   t-Statistic   Prob.
MKT_RF      1.104110      0.021004     52.56632      0.0000
SMB         0.122345      0.027086      4.516952     0.0000
HML        -0.249939      0.031572     -7.916555     0.0000
C           0.119039      0.084372      1.410880     0.1592

R-squared            0.930598    Mean dependent var       0.667018
Adjusted R-squared   0.929963    S.D. dependent var       5.537812
S.E. of regression   1.465553    Akaike info criterion    3.614318
Sum squared resid  704.4935      Schwarz criterion        3.660163
Log likelihood    -595.9767     F-statistic            1466.027
Durbin-Watson stat   1.673780    Prob(F-statistic)        0.000000
How do we interpret this model? What extra considerations must we make when we have
more than one regressor?
Try the same exercise with the small value and neutral portfolio.
In the simple regression model, the slope coefficient is the derivative dY/dX. BUT since we have more than 1 independent variable in the MRM we write the partial derivatives:
∂Y/∂X1 = β1,  ∂Y/∂X2 = β2,  ∂Y/∂X3 = β3
These partial derivatives are called the partial regression coefficients and they measure the isolated effect of each variable on Y:
β1: the change in Y due to a unit change in X1, keeping X2 and X3 fixed (in this case where we have 3 independent variables).
Therefore it is important that we have no perfect collinearity (i.e. an exact linear relationship) between the regressors X1, X2, X3.
e.g. X1 and X2 can be correlated (and probably will be), BUT they must not be a perfect linear function of each other.
Intuitively, perfect collinearity would be like having only one variable in the model: the two effects cannot be separated.
NOTE:
If there is high (but not perfect) correlation between variables, the coefficients can still be estimated, but the high correlation will affect their reliability: they will have too large a variance and, as a consequence, the estimated t-values will be low. Therefore it will be more likely to find a variable non-significant when it is actually significant.
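This variance inflation can be quantified: with two regressors whose correlation is r, the variance of each slope estimate is inflated by the factor 1/(1 - r²) relative to the uncorrelated case (the two-regressor special case of the variance inflation factor). A sketch:

```python
# Variance inflation factor for the two-regressor case: VIF = 1 / (1 - r^2),
# where r is the correlation between the two regressors.
def vif_two_regressors(r):
    return 1.0 / (1.0 - r ** 2)

print(vif_two_regressors(0.0))   # 1.0   (no collinearity, no inflation)
print(vif_two_regressors(0.9))   # ~5.26 (variances inflated ~5-fold)
print(vif_two_regressors(0.99))  # ~50   (t-values badly deflated)
```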
R²: now called the multiple coefficient of determination.
Adjusted R²: important as variables are added to the model, since unlike R² it does not automatically rise when a regressor is added.
26
F-statistic: tests the null hypothesis that all coefficients except the intercept are zero. Now that we have more than 1 regressor, this statistic can give us different information than the t-statistic does in the SRM.
Specification Bias
Inclusion of an irrelevant variable does not alter the results of our estimation (apart from an increase in R2). This is called model overfitting and is not a serious problem, as coefficient estimates remain consistent and unbiased, though inefficient.
Model underfitting, i.e. omitting an important variable, makes the coefficient estimates biased and inconsistent, and the variances of both the regression and the estimators will be wrongly calculated. NO inferences can be made based on the usual tests.
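The direction of the bias from underfitting follows the omitted-variable formula: the short-regression slope picks up b1 + b2·δ, where δ is the slope from regressing the omitted X2 on X1. A deterministic sketch (made-up data in which X2 = 0.5·X1 exactly, so δ = 0.5):

```python
import numpy as np

# True model: y = 1*x1 + 2*x2. Omitting x2 (which here equals 0.5*x1)
# biases the slope on x1 to b1 + b2*delta = 1 + 2*0.5 = 2.
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = 0.5 * x1
y = 1.0 * x1 + 2.0 * x2

X_short = np.column_stack([np.ones_like(x1), x1])   # x2 omitted
beta_short, *_ = np.linalg.lstsq(X_short, y, rcond=None)
print(np.round(beta_short, 6))   # [0. 2.] -- slope is 2, not the true 1
```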
EVIEWS TUTORIAL 3
3 BUILDING AN EMPIRICAL MODEL OF RETURNS FOR GE
3.1 Introduction
3.2
3.3 Heteroskedasticity-robust Standard Errors
3.4 Multicollinearity
3.5
3.6
3.7
Price of GEC
Dividend Yield
Assets per Share
Earnings Index
Return on Investment
Value of FTSE 100 Index
Index Level of GDP
Retail Price Index
Redemption Yield on Long Gilts
3 Month T-Bill Rate
Sterling Effective Exchange Rate
Some of the variables are general macroeconomic variables while others are
company-specific accounting variables. All of these variables could, a priori, be
expected to affect the returns on the share of GE. There are a total of 11 variables.
We have monthly data from Jan. 1980 to Mar. 1999: 1980:1 - 1999:3
Import the data from:
T:\EViews\GEC
You should have 11 new variables in the workfile, each named as in the table above.
Plot the price of GE against the FTSE in a line graph.
Note that the two series take significantly different values (scale).
Do the series appear to move together?
CAPM Regression:

Dependent Variable: RP
Method: Least Squares
Date: 11/14/07 Time: 16:55
Sample (adjusted): 1980M02 1999M03
Included observations: 230 after adjustments

Variable    Std. Error   t-Statistic   Prob.
C           0.003688     -0.365769     0.7149
RFTSE       0.072245     12.17854      0.0000

Mean dependent var   0.007791    Akaike info criterion   -2.963013
S.D. dependent var   0.070199    Schwarz criterion       -2.933117
F-statistic        148.3167      Prob(F-statistic)        0.000000

Wald Test:
Equation: CAPM

Test Statistic   Value      df         Probability
F-statistic      2.766384   (1, 228)   0.0976
Chi-square       2.766384   1          0.0963
Breusch-Godfrey Serial Correlation LM Test:
F-statistic      0.692363    Probability    0.758055
Obs*R-squared    8.519170    Probability    0.743358

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 10/10/07 Time: 23:37
Presample missing value lagged residuals set to zero.

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             0.000308     0.003724      0.082807     0.9341
RFTSE        -0.027081     0.074639     -0.362828     0.7171
RESID(-1)     0.029441     0.067980      0.433083     0.6654
RESID(-2)     0.039321     0.067932      0.578822     0.5633
RESID(-3)     0.021543     0.068869      0.312816     0.7547
RESID(-4)     0.054740     0.068672      0.797125     0.4263
RESID(-5)    -0.052836     0.069127     -0.764335     0.4455
RESID(-6)     0.072001     0.068893      1.045118     0.2971
RESID(-7)     0.099430     0.069482      1.431019     0.1539
RESID(-8)     0.025695     0.069541      0.369498     0.7121
RESID(-9)    -0.052712     0.069636     -0.756973     0.4499
RESID(-10)    0.014645     0.069646      0.210279     0.8336
RESID(-11)   -0.099172     0.069737     -1.422079     0.1564
RESID(-12)    0.043024     0.070567      0.609690     0.5427

R-squared            0.037040    Mean dependent var       1.36E-18
Adjusted R-squared  -0.020916    S.D. dependent var       0.054641
S.E. of regression   0.055210    Akaike info criterion   -2.896409
Sum squared resid    0.658398    Schwarz criterion       -2.687134
Log likelihood     347.0870      F-statistic              0.639104
Durbin-Watson stat   1.997872    Prob(F-statistic)        0.819499
White Heteroskedasticity Test:
F-statistic      0.617161    Probability    0.540376
Obs*R-squared    1.243871    Probability    0.536904

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 10/10/07 Time: 23:39
Sample: 1980M02 1999M03
Included observations: 230

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.002945      0.000371      7.937081     0.0000
RFTSE       0.006337      0.007019      0.902845     0.3676
RFTSE^2    -0.014673      0.051158     -0.286822     0.7745

R-squared            0.005408    Mean dependent var       0.002973
Adjusted R-squared  -0.003355    S.D. dependent var       0.004969
S.E. of regression   0.004978    Akaike info criterion   -7.754750
Sum squared resid    0.005624    Schwarz criterion       -7.709905
Log likelihood     894.7962      F-statistic              0.617161
Durbin-Watson stat   2.027455    Prob(F-statistic)        0.540376
Don't forget a normality test and a test of functional form: the results are not included here, but you should still run the tests.
What other misspecification issues might we face?
As we can see, the returns of the market proxy are able to explain only a small
percentage of the variability of the returns of GE and the model does not appear to
suffer from any violations of the CLRM.
However, are there any other variables that might affect the returns on GE Stock?
To identify other variables, we will run regressions that are more complex.
Now run two separate regressions: one including only macroeconomic variables
and the market proxy returns and one using only accounting variables and the
market returns.
The results you should obtain are as follows:
Variable    Std. Error   t-Statistic   Prob.
C           0.004651     -1.331952     0.1842
RFTSE       0.072014     12.31951      0.0000
            0.302943      0.634309     0.5265
            0.008463      1.174484     0.2414
            0.671929      1.657277     0.0989
            0.200599      0.999785     0.3185

Mean dependent var   0.007791    Akaike info criterion   -2.955017
S.D. dependent var   0.070199    Schwarz criterion       -2.865327
F-statistic         31.15033     Prob(F-statistic)        0.000000
Variable    Std. Error   t-Statistic   Prob.
C           0.001248     -2.705825     0.0073
RFTSE       0.029927      1.993494     0.0474
LDIVY       0.033471     -2.091854     0.0376
LAPS        0.026445      7.627164     0.0000
LROI        0.036151     24.22464      0.0000
LEI         0.037254     -2.872610     0.0045

Mean dependent var   0.007791    Akaike info criterion   -5.228590
S.D. dependent var   0.070199    Schwarz criterion       -5.138901
F-statistic        692.9854      Prob(F-statistic)        0.000000
Run all the diagnostics checks. There are some problems with both sets of
regressions: suggest remedies for these.
3.3 Heteroskedasticity-robust Standard Errors
When the form of heteroskedasticity is unknown, it is usually not possible to obtain
efficient coefficient estimates of the parameters using weighted least squares. OLS
provides consistent parameter estimates in the presence of heteroskedasticity but the
usual OLS standard errors will be incorrect and should not be used for inference. In
order to allow for this problem, heteroskedasticity-robust standard errors are
constructed. White (1980) has derived a heteroskedasticity consistent covariance
matrix estimator which provides correct estimates of the coefficient covariances in the
presence of heteroskedasticity of unknown form. The White covariance matrix
assumes that the residuals of the estimated equation are serially uncorrelated. Newey
and West (1987) have proposed a more general covariance estimator that is consistent
in the presence of both heteroskedasticity and autocorrelation of unknown form.
Note that using the White heteroskedasticity consistent or the Newey-West HAC
covariance estimates will not change the estimated coefficient values, but only their
estimated standard errors.
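The White (HC0) estimator replaces the homoskedastic covariance s²(X'X)⁻¹ with (X'X)⁻¹ X' diag(e²) X (X'X)⁻¹; a numpy sketch with made-up data (EViews, of course, does all of this internally when the option is selected):

```python
import numpy as np

def white_se(X, e):
    """HC0 heteroskedasticity-robust standard errors:
    sqrt(diag( (X'X)^-1 X' diag(e^2) X (X'X)^-1 ))."""
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ np.diag(e ** 2) @ X
    cov = XtX_inv @ meat @ XtX_inv
    return np.sqrt(np.diag(cov))

# Illustrative regression with made-up data; the coefficient estimates are
# ordinary OLS -- only the standard errors change under the White correction.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
robust = white_se(X, resid)
print(robust)   # robust SEs for the intercept and slope
```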
In order to estimate the regression with heteroskedasticity robust standard errors,
select this option from the option button in the regression entry window. If we click
on the Estimate button in the regression window we will come to the screen where
we input the regression equation. By clicking on the Options button, a new window
will open that will allow the selection of the required methodology:
Variable    Std. Error   t-Statistic   Prob.
C           0.004812     -1.287331     0.1993
RFTSE       0.059768     14.84350      0.0000
            0.473926      0.405462     0.6855
            0.007385      1.345988     0.1797
            0.577138      1.929474     0.0549
            0.211973      0.946138     0.3451

Mean dependent var   0.007791    Akaike info criterion   -2.955017
S.D. dependent var   0.070199    Schwarz criterion       -2.865327
F-statistic         31.15033     Prob(F-statistic)        0.000000
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -0.003378     0.001279     -2.640086     0.0089
RFTSE        0.059660     0.023409      2.548646     0.0115
LDIVY       -0.070017     0.039532     -1.771120     0.0779
LAPS         0.201698     0.130585      1.544580     0.1239
LROI         0.875756     0.040083     21.84847      0.0000
LEI         -0.107015     0.098188     -1.089904     0.2769

R-squared            0.939278    Mean dependent var       0.007791
Adjusted R-squared   0.937922    S.D. dependent var       0.070199
S.E. of regression   0.017490    Akaike info criterion   -5.228590
Sum squared resid    0.068525    Schwarz criterion       -5.138901
Log likelihood     607.2878      F-statistic            692.9854
Durbin-Watson stat   2.370655    Prob(F-statistic)        0.000000
          RFTSE       LDIVY       LAPS        LROI        LEI
RFTSE     1.000000   -0.582813    0.062844    0.626814    0.070636
LDIVY    -0.582813    1.000000   -0.089244   -0.872911   -0.001587
LAPS      0.062844   -0.089244    1.000000    0.071339    0.283713
LROI      0.626814   -0.872911    0.071339    1.000000   -0.028627
LEI       0.070636   -0.001587    0.283713   -0.028627    1.000000
Do the results indicate any significant correlations between the independent variables?
(The log-differences of the return on investment and of the dividend yield have a correlation of -0.87, which indicates that they are closely related but move in opposite directions.)
Now repeat this step for the macroeconomic variables. Overall, which model do
you think better explains the variability of returns of GE stock?
Since the company-specific model seems to be the better-fitting model, we will concentrate on this one. From the regression output, it is evident that three of the variables are statistically insignificant. We remove the insignificant variables one at a time, starting with the variable that has the highest p-value. As a result, we get the following (note that after removing LEI and LAPS, LDIVY becomes significant):
Dependent Variable: RP
Method: Least Squares
Date: 11/15/07 Time: 12:12
Sample (adjusted): 1980M02 1999M03
Included observations: 230 after adjustments
White Heteroskedasticity-Consistent Standard Errors & Covariance
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -0.003060     0.001337     -2.289359     0.0230
RFTSE        0.054179     0.022717      2.384958     0.0179
LDIVY       -0.079813     0.037850     -2.108668     0.0361
LROI         0.880422     0.040137     21.93556      0.0000

R-squared            0.923361    Mean dependent var       0.007791
Adjusted R-squared   0.922344    S.D. dependent var       0.070199
S.E. of regression   0.019562    Akaike info criterion   -5.013190
Sum squared resid    0.086486    Schwarz criterion       -4.953398
Log likelihood     580.5169      F-statistic            907.6348
Durbin-Watson stat   2.316229    Prob(F-statistic)        0.000000

3.5
Note that in the case of the company specific regression the normality test suggests
that the residuals do not follow a normal distribution. This might be caused by an
outlier or a breakpoint in the regression residuals. In order to check whether this is the
case or not, we will examine two tests.
First, we will examine whether there is a breakpoint in the regression relationship.
This will inform us of any changes in the regression equation caused by a specific
event. We can identify such an event by plotting the actual values, the fitted values
and the residuals of the regression. This can be achieved by selecting View; Actual,
Fitted, Residual; Actual, Fitted, Residual Graph. The plot should look as follows:
From the graph, we can see that some time in late 1996 there is a big residual outlier
that is probably disrupting the model. In order to identify the exact date that this
outlier was realized, we use the shading option by right clicking on the graph and
selecting the add shading option. In the new window, input 1996M10 as the ending
date of the shade.
We can see that October 1996 is the probable date of the outlier. Another approach to determining the exact date of the break would be to view the residuals in a table (again from the View button). The large negative residual is indeed observed in October 1996; this represents a mini-crash in the markets. We need the exact date of the outlier in order to adjust our model to correct for it.
Dependent Variable: RP
Method: Least Squares
Date: 11/15/07 Time: 13:03
Sample (adjusted): 1980M02 1999M03
Included observations: 230 after adjustments
White Heteroskedasticity-Consistent Standard Errors & Covariance
Variable    Coefficient   Std. Error    t-Statistic   Prob.
C           -0.002042     0.000864      -2.362356     0.0190
RFTSE        0.055328     0.022499       2.459101     0.0147
LDIVY       -0.087297     0.037562      -2.324069     0.0210
LROI         0.871601     0.039670      21.97137      0.0000
OCTDUMMY    -0.212753     0.000880    -241.7862       0.0000

R-squared            0.963280    Mean dependent var       0.007791
Adjusted R-squared   0.962627    S.D. dependent var       0.070199
S.E. of regression   0.013571    Akaike info criterion   -5.740281
Sum squared resid    0.041438    Schwarz criterion       -5.665541
Log likelihood     665.1324      F-statistic           1475.623
Durbin-Watson stat   2.621413    Prob(F-statistic)        0.000000
Variable    Coefficient   Std. Error    t-Statistic   Prob.
C           -0.001596     0.000872      -1.830720     0.0685
RFTSE        0.057380     0.022565       2.542887     0.0117
LDIVY       -0.089424     0.037555      -2.381145     0.0181
LROI         0.871254     0.039882      21.84575      0.0000
OCTDUMMY    -0.213223     0.000883    -241.4054       0.0000
            -0.005578     0.003954      -1.410659     0.1597

R-squared            0.963752    Mean dependent var       0.007791
Adjusted R-squared   0.962943    S.D. dependent var       0.070199
S.E. of regression   0.013514    Akaike info criterion   -5.744514
Sum squared resid    0.040906    Schwarz criterion       -5.654825
Log likelihood     666.6191      F-statistic           1191.127
Durbin-Watson stat   2.620576    Prob(F-statistic)        0.000000
F-statistic             2.173931    Prob. F(4,222)         0.072854
Log likelihood ratio    8.837132    Prob. Chi-Square(4)    0.065302
The result indicates that there was no structural break in the data. If there was a
structural break, how would we account for it in our model?
Finally, before you exit, save the workfile as an EViews file.
4 NON-STATIONARITY
4.1 Stationarity
5.2
5.3
4 NON-STATIONARITY
A test that should always be carried out before running a regression is a non-stationarity test of the included variables. If the series included in a regression are not stationary, then their means and variances are not constant over time, making any inferences about the coefficients unreliable. To examine the variables for a unit root we perform a Dickey-Fuller/Augmented Dickey-Fuller test.
Monthly price data for the S&P 500 Index and the GBP/USD monthly exchange rate
for the same period will be used.
Import the file: t:\Eviews\stationary.xls.
Transform both series into logarithms. Call the new variables LNSP500 and
LNFX.
4.1
Stationarity
To test for stationarity, double click on the series and then in the View menu,
select Unit Root Test.
EViews gives you the options to select from various nonstationarity tests (Augmented
Dickey Fuller, Phillips-Perron, KPSS etc.) and to select whether to:
1. Test the levels or the first or second differences of the series
2. Include an intercept, an intercept and a trend or neither of the two
3. Select a number of lagged differences to be included
Run an ADF test with an intercept and a trend on the levels of the series leaving
the lag length selection to automatic. The lagged differences to be included can be
selected based on the data frequency or chosen using an information criterion such as
AIC or Schwarz.
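The regression underlying the simplest DF test is Δy_t = α + γ·y_{t−1} + ε_t, with the t-ratio on γ compared against the non-standard DF critical values; a numpy sketch on simulated data (the critical values themselves must still be read from EViews or DF tables):

```python
import numpy as np

def df_t_stat(y):
    """t-ratio on gamma in the Dickey-Fuller regression
    dy_t = alpha + gamma * y_{t-1} + e_t (no trend, no lagged differences)."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)           # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
e = rng.standard_normal(500)
random_walk = np.cumsum(e)   # nonstationary: t-ratio not very negative
white_noise = e              # stationary: strongly negative t-ratio
print(df_t_stat(random_walk), df_t_stat(white_noise))
```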
The results of the Test for the S&P 500 index are:
Null Hypothesis: LNSP500 has a unit root
Exogenous: Constant, Linear Trend
Lag Length: 0 (Automatic based on SIC, MAXLAG=14)

                                         t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic   -1.535907     0.8146
Test critical values:  1% level          -3.997083
                       5% level          -3.428819
                       10% level         -3.137851

Variable          Coefficient   Std. Error   t-Statistic   Prob.
LNSP500(-1)       -0.018202     0.011851     -1.535907     0.1259
C                  0.115889     0.067318      1.721508     0.0865
@TREND(1987M11)    9.78E-05     9.73E-05      1.005046     0.3159

R-squared            0.014945    Mean dependent var       0.007948
Adjusted R-squared   0.006561    S.D. dependent var       0.039233
S.E. of regression   0.039104    Akaike info criterion   -3.632672
Sum squared resid    0.359339    Schwarz criterion       -3.588904
Log likelihood     435.2880      F-statistic              1.782663
Durbin-Watson stat   2.055505    Prob(F-statistic)        0.170456
Is the series Stationary? No, since the test statistic is bigger than (i.e. not as negative
as) the critical values. Can also simply look at the p-value.
For LNFX:

                                         t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic   -2.088622     0.5490
Test critical values:  1% level          -3.997587
                       5% level          -3.429063
                       10% level         -3.137995
The results clearly show that both series are non-stationary. Try performing the test regressions with different numbers of lags and without a trend: does this make any difference to the conclusion? (No.)
In this case, the first differences of the series must also be examined for nonstationarity, to determine the order of integration.
We can test the first differences from the Stationarity test window by selecting the
Option Test for Unit Root in 1st difference. Note that this could also have been
achieved by using GENR to construct the two series of first differences and then
testing for a unit root in the levels of these already differenced series.
The results are:
Null Hypothesis: D(LNSP500) has a unit root
Exogenous: Constant, Linear Trend
Lag Length: 0 (Automatic based on SIC, MAXLAG=14)
                                         t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic  -16.01127      0.0000
Test critical values:  1% level          -3.997250
                       5% level          -3.428900
                       10% level         -3.137898

For D(LNFX):

                                         t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic  -11.55602      0.0000
Test critical values:  1% level          -3.997418
                       5% level          -3.428981
                       10% level         -3.137946
In both cases, the test statistic is now more negative than the critical values, and so we
reject the null hypotheses that the differenced series contain a unit root. Hence the
results indicate that both of the original log-levels series are I(1).
Plotting the series in levels and then in first differenced form (returns) illustrates this
result:
[Figure: time-series plots of the log exchange rate LNFX in levels (upper panel)
and of its first differences DLNFX (lower panel), monthly, 1988-2006.]
Getting Started
Using the instructions discussed previously, open EViews and import the BT data
(from T:/Eviews/BT) as before. (Remember that a constant term (c) and a residual
series (resid) will be added automatically.)
Save the workfile in the directory you prefer with the name ARMA.WF1.
Repeat the above procedure for the FTSE100 Dividend Yield series in T:/Eviews/FTDY
(start date 1986:1, end date 1999:12).
Construct sets of log-price changes for the two series. In the case of the BT shares,
the first differences of the logs are interpreted as continuously compounded returns,
while the log-differences of the dividend yield are simply the continuously
compounded changes in the dividend yield.
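Constructing continuously compounded returns is just a first difference of the logs. A minimal numpy sketch, using made-up prices rather than the actual BT data:

```python
import numpy as np

prices = np.array([100.0, 101.5, 99.8, 102.3])   # hypothetical share prices
log_returns = np.diff(np.log(prices))            # continuously compounded returns
# e.g. the first return is ln(101.5 / 100.0), roughly 0.0149
```

In EViews the equivalent step would be a GENR command creating the log-differenced series.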
The objective of this exercise is to build an ARMA model for both the British
Telecom returns and the FTSE100 Dividend Yield. Recall that there are three stages
involved: identification, estimation, and diagnostic checking. The first stage is carried
out by looking at the autocorrelation coefficients to identify any structure in the data.
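The identification stage rests on the shapes of the sample autocorrelation function. As a rough illustration of what to look for, the sketch below computes sample autocorrelations for a simulated AR(1) process (the series and its coefficient of 0.5 are placeholders, not the workshop data):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients at lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

rng = np.random.default_rng(1)
# Simulated AR(1) with coefficient 0.5
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()

ac = sample_acf(x, 5)
# For an AR(1) the acf decays geometrically (roughly 0.5, 0.25, 0.125, ...);
# for a unit-root process it would instead die away only very slowly
```

A persistent, slowly decaying acf therefore signals non-stationarity, while a rapid geometric decay is consistent with a low-order AR structure.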
5.2
BT share price series (levels):

Lag     AC      PAC     Q-Stat    Prob
  1    0.994   0.994    1291.7    0.000
  2    0.988  -0.007    2569.1    0.000
  3    0.983   0.071    3834.4    0.000
  4    0.978   0.019    5088.4    0.000
  5    0.974   0.026    6331.7    0.000
  6    0.969  -0.010    7564.1    0.000
  7    0.965   0.024    8786.2    0.000
  8    0.960  -0.004    9998.2    0.000
  9    0.956   0.019    11200.    0.000
 10    0.952   0.002    12393.    0.000
 11    0.948   0.056    13578.    0.000
 12    0.945  -0.028    14754.    0.000
 13    0.940  -0.050    15919.    0.000
 14    0.935  -0.026    17074.    0.000
 15    0.930  -0.014    18218.    0.000
 16    0.926   0.004    19351.    0.000
 17    0.921   0.039    20474.    0.000
 18    0.916  -0.066    21586.    0.000
 19    0.911  -0.042    22686.    0.000
 20    0.906   0.035    23775.    0.000
 21    0.901  -0.013    24853.    0.000
 22    0.897   0.049    25921.    0.000
 23    0.893   0.059    26982.    0.000
 24    0.889  -0.018    28034.    0.000
 25    0.885   0.014    29078.    0.000
FTDY
Date: 10/26/07 Time: 20:33
Sample: 1986M01 1999M12
Included observations: 168
Lag     AC      PAC     Q-Stat    Prob
  1    0.950   0.950    154.22    0.000
  2    0.896  -0.060    292.30    0.000
  3    0.854   0.092    418.48    0.000
  4    0.818   0.037    535.14    0.000
  5    0.781  -0.032    642.04    0.000
  6    0.739  -0.056    738.27    0.000
  7    0.700   0.019    825.29    0.000
  8    0.665  -0.009    904.16    0.000
  9    0.637   0.068    977.12    0.000
 10    0.607  -0.043    1043.8    0.000
 11    0.575  -0.022    1103.8    0.000
 12    0.541  -0.024    1157.5    0.000
 13    0.517   0.057    1206.7    0.000
 14    0.488  -0.072    1250.8    0.000
 15    0.463   0.050    1290.7    0.000
 16    0.439  -0.010    1326.9    0.000
 17    0.412  -0.036    1359.1    0.000
 18    0.391   0.035    1388.2    0.000
 19    0.372   0.013    1414.8    0.000
 20    0.354  -0.012    1439.0    0.000
 21    0.339   0.039    1461.3    0.000
 22    0.325   0.002    1482.0    0.000
 23    0.305  -0.076    1500.3    0.000
 24    0.285   0.003    1516.4    0.000
 25    0.266  -0.012    1530.5    0.000
Note that the output here differs slightly from that which appears on screen. It is
clearly evident from the correlograms that both series are very persistent, with
autocorrelation functions that die away only very slowly. Only the first partial
autocorrelation coefficient appears strongly significant. The numerical values of the
autocorrelation and partial autocorrelation coefficients at lags 1 to 25 are given in
the AC and PAC columns of the output, with the lag length given in the first column.
Again, for both of the raw data series, the slow decay of the acf is evident, especially
for the BT share price series: even at lag 25, the autocorrelation coefficient is still 0.885.
The penultimate column of the output gives the statistic resulting from a Ljung-Box
test with the number of lags in the sum equal to the lag length in the first column.
The test statistic follows a χ2(1) distribution for the first row, a χ2(2) for the second
row, and so on. The p-values associated with these test statistics are given in the last
column. Since the raw data are likely to be non-stationary, an application of this test
to them is not valid. For this and various other reasons, it is usual practice to work
with the log-changes of a series rather than the series itself; the non-stationary
feature of the raw data would otherwise swamp all others.
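The Ljung-Box statistic the output reports is straightforward to compute directly; a sketch on simulated white noise (where the test should not reject), using the standard Q = T(T+2) Σ r_k²/(T−k) form:

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(x, m):
    """Ljung-Box Q statistic over lags 1..m and its chi-squared(m) p-value."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T = len(x)
    denom = np.dot(x, x)
    r = np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, m + 1)])
    q = T * (T + 2) * np.sum(r**2 / (T - np.arange(1, m + 1)))
    return q, chi2.sf(q, m)

rng = np.random.default_rng(2)
q, p = ljung_box(rng.standard_normal(500), 10)
# For white noise the p-value should be large: no autocorrelation to reject
```

For the persistent raw price series above, by contrast, Q is enormous at every lag and all p-values are reported as 0.000.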
The autocorrelation and partial autocorrelation functions for the BT returns and the
continuously compounded changes in the dividend yield are:
BT Return Series:
Date: 10/26/07 Time: 20:37
Sample: 1/02/1995 12/30/1999
Included observations: 1303
Lag     AC      PAC     Q-Stat    Prob
  1    0.091   0.091    10.916    0.001
  2   -0.087  -0.096    20.734    0.000
  3   -0.073  -0.057    27.773    0.000
  4   -0.050  -0.046    30.989    0.000
  5   -0.046  -0.050    33.756    0.000
  6   -0.021  -0.025    34.329    0.000
  7   -0.036  -0.048    36.066    0.000
  8    0.002  -0.003    36.074    0.000
  9    0.065   0.051    41.675    0.000
 10   -0.011  -0.032    41.841    0.000
 11    0.000   0.009    41.841    0.000
 12    0.027   0.026    42.812    0.000
 13    0.015   0.012    43.091    0.000
 14   -0.026  -0.021    43.979    0.000
 15   -0.021  -0.010    44.553    0.000
 16   -0.000   0.007    44.553    0.000
 17    0.018   0.014    44.980    0.000
 18    0.050   0.044    48.328    0.000
 19   -0.007  -0.011    48.395    0.000
 20    0.035   0.047    49.987    0.000
 21    0.023   0.018    50.680    0.000
 22   -0.025  -0.019    51.537    0.000
 23   -0.057  -0.038    55.867    0.000
 24   -0.023  -0.011    56.564    0.000
 25   -0.017  -0.020    56.968    0.000
FTDY Log-differences
Date: 10/26/07 Time: 20:41
Sample: 1986M01 1999M12
Included observations: 167
Autocorrelation
.|.
*|.
*|.
.|.
.|.
.|.
*|.
*|.
.|*
.|.
.|.
.|.
.|*
.|.
.|.
.|.
.|.
.|.
.|.
*|.
.|.
.|.
.|*
.|.
.|.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Partial Correlation
.|.
*|.
*|.
.|.
.|.
.|.
*|.
*|.
.|.
.|.
.|.
.|.
.|*
.|.
.|.
.|.
.|.
.|.
.|.
*|.
.|.
.|.
.|*
.|.
.|.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
AC
PAC
Q-Stat
Prob
0.050
-0.083
-0.091
0.022
-0.005
-0.037
-0.102
-0.090
0.071
0.036
0.028
-0.045
0.091
0.013
-0.018
-0.033
-0.044
-0.034
-0.040
-0.066
0.025
0.009
0.119
0.028
-0.003
0.050
-0.086
-0.083
0.024
-0.022
-0.041
-0.099
-0.093
0.058
-0.001
0.024
-0.036
0.092
-0.011
-0.024
-0.008
-0.035
-0.034
-0.048
-0.073
0.039
-0.030
0.109
0.005
0.005
0.4233
1.5977
3.0165
3.1021
3.1066
3.3505
5.2028
6.6500
7.5523
7.7810
7.9237
8.2938
9.8164
9.8481
9.9054
10.111
10.480
10.701
11.010
11.850
11.967
11.985
14.748
14.900
14.902
0.515
0.450
0.389
0.541
0.684
0.764
0.635
0.575
0.580
0.650
0.720
0.762
0.709
0.773
0.826
0.861
0.882
0.907
0.923
0.921
0.940
0.958
0.903
0.924
0.944
It can be deduced that, for the BT returns series, the first three autocorrelation
coefficients and the first three partial autocorrelation coefficients are significant under
this rule. Since the first acf coefficient is highly significant, the Ljung-Box joint test
statistic rejects the null hypothesis of no autocorrelation at the 1% level for all
numbers of lags considered.
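The significance rule in question is presumably the usual ±1.96/√T band for sample autocorrelations. With T = 1303 BT return observations the band works out at about ±0.054, which is consistent with the first three acf coefficients being called significant:

```python
import numpy as np

T = 1303                                 # number of BT return observations
band = 1.96 / np.sqrt(T)                 # approximate 95% band for sample acf/pacf
first_three = [0.091, -0.087, -0.073]    # acf values transcribed from the output
significant = [abs(r) > band for r in first_three]
# all three exceed the band in absolute value, so all three are significant
```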
In the case of the dividend yield log change series, the second and third
autocorrelation and partial autocorrelation coefficients are significant, although the
first acf and pacf coefficients are not. The Ljung-Box test statistic is never significant
for this series.
In the BT case, it could be concluded that a mixed ARMA process might be
appropriate, although it is hard to determine the appropriate order precisely from
this output alone.
For the dividend yield series, on the other hand, it seems that there is little structure in
the data that could be captured by a linear time series model. In order to investigate
this issue further, the information criteria are now employed.
5.3
An important point to note is that books and statistical packages often differ in their
construction of the test statistic. For example, Akaike's and Schwarz's information
criteria are frequently written in terms of the estimated residual variance:

    AIC  = log(sigma_hat^2) + 2k/T
    SBIC = log(sigma_hat^2) + (k log T)/T

whereas EViews reports them in the equivalent log-likelihood form:

    AIC  = -2l/T + 2k/T
    SBIC = -2l/T + (k log T)/T

where l = -(T/2)[1 + log(2*pi) + log(u'u/T)] is the maximised value of the
log-likelihood, u is the residual vector, k the number of estimated parameters and
T the sample size.
Unfortunately, this modification is not benign, since it affects the relative strength of
the penalty term compared with the error variance, sometimes leading different
packages to select different model orders for the same data and criterion!
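To see the size of the discrepancy, one can compute both versions from the same fit. The figures below are taken from the ARMA(1,1) output reported later in this section (T = 1302 observations, k = 3 estimated parameters):

```python
import numpy as np

T, k = 1302, 3          # observations and parameters (c, ar(1), ma(1))
rss = 0.447170          # sum of squared residuals from the ARMA(1,1) output
loglik = 3345.227       # log likelihood from the same output

aic_var = np.log(rss / T) + 2 * k / T    # residual-variance form
aic_ll = -2 * loglik / T + 2 * k / T     # log-likelihood form (as EViews reports)

# aic_ll reproduces the Akaike criterion of about -5.134 shown in the output;
# aic_var differs from it by (approximately) the constant 1 + log(2*pi)
```

Since the two forms differ by a constant per observation, rankings across models with the same T are unaffected; problems arise only when comparing numbers across packages or sample sizes.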
Suppose that it is thought that ARMA models from order (0,0) to (5,5) are plausible
for these two return series. This entails considering 36 models (ARMA(0,0),
ARMA(1,0), ARMA(2,0), ..., ARMA(5,5)), i.e. up to 5 lags in both the autoregressive
and moving average terms.
In EViews, this can be done by separately estimating each of the models and noting
down the value of the information criteria in each case. This can be done in the
following way:
On the EViews main menu, click on Quick and choose Estimate Equation. EViews
will open an Equation Specification window. In the Equation Specification editor,
type, for example
rbt c ar(1) ma(1)
For the estimation settings, select LS - Least Squares (NLS and ARMA), select the
whole sample, and click OK; this will specify an ARMA(1,1). The output is given
below:
Dependent Variable: RBT
Method: Least Squares
Date: 10/26/07 Time: 21:00
Sample (adjusted): 1/04/1995 12/30/1999
Included observations: 1302 after adjustments
Convergence achieved after 33 iterations
Backcast: 1/03/1995
Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                     0.001065     0.000556      1.913614     0.0559
AR(1)                -0.305276     0.214965     -1.420121     0.1558
MA(1)                 0.412712     0.205631      2.007055     0.0450

R-squared             0.012580     Mean dependent var        0.001064
Adjusted R-squared    0.011059     S.D. dependent var        0.018657
S.E. of regression    0.018554     Akaike info criterion    -5.133989
Sum squared resid     0.447170     Schwarz criterion        -5.122073
Log likelihood        3345.227     F-statistic               8.274493
Durbin-Watson stat    2.016086     Prob(F-statistic)         0.000269

Inverted AR Roots     -.31
Inverted MA Roots     -.41
Note that the header for the EViews output for ARMA models states the number of
iterations that have been used in the model estimation process. This shows that, in
fact, an iterative numerical optimisation procedure has been employed to estimate the
coefficients.
Repeating these steps for the other ARMA models would give all of the required
values for the information criteria.
To give just one more example, in the case of an ARMA(5,5), the following would be
typed in the Equation specification editor box:
rbt c ar(1) ar(2) ar(3) ar(4) ar(5) ma(1) ma(2) ma(3) ma(4) ma(5)
The values of all of the information criteria, calculated using EViews, are as
follows:
Information Criteria for British Telecom Stock Return ARMA Models

AIC
p\q       0        1        2        3        4        5
0      -5.125   -5.134   -5.138   -5.143   -5.144   -5.144
1      -5.131   -5.134   -5.146   -5.145   -5.143   -5.142
2      -5.138   -5.146   -5.144   -5.143   -5.141   -5.140
3      -5.139   -5.144   -5.143   -5.142   -5.144   -5.143
4      -5.140   -5.143   -5.143   -5.145   -5.141   -5.141
5      -5.140   -5.141   -5.140   -5.139   -5.142   -5.141

SBIC
p\q       0        1        2        3        4        5
0      -5.121   -5.126   -5.126   -5.127   -5.124   -5.120
1      -5.123   -5.122   -5.130   -5.125   -5.119   -5.114
2      -5.126   -5.130   -5.125   -5.119   -5.113   -5.108
3      -5.123   -5.125   -5.119   -5.114   -5.112   -5.107
4      -5.120   -5.119   -5.115   -5.113   -5.105   -5.102
5      -5.117   -5.113   -5.108   -5.103   -5.102   -5.097
So what model actually minimises the two information criteria? Both the AIC and
SBIC are minimised at p=1 and q=2 or p=2 and q=1 for British Telecom returns. For
the FTSE100 dividend yield log-changes, both information criteria are minimised at
p=2 and q=3 (not shown). Interestingly, both criteria suggest the same model order for
the BT series and for the dividend yield series, although this is usually not the case.