This summary is intended to help you learn key aspects of econometrics. There is no need to memorize the formulas; spend your energy on understanding the concepts and how they are used in economic analysis.
Ordinary Least Squares (OLS) estimators for the simple (1 X variable) model
We use economic theory to specify a model, e.g.,
yt = β1 + β2 xt + et , t = 1,...,T , (1)
where yt is the variable to be predicted (or explained, the dependent variable),
xt is the independent (explanatory) variable used to predict yt ,
T is the number of data observations on each of yt and xt ,
β1 is the constant term,
β2 is the slope parameter, and
et is the random error term.
Regardless of the number of x variables or the form of the model being estimated, y is
decomposed into what we can explain and what we can’t,
yt = β1 + β2 xt + et , t = 1,...,T ,
where yt is the total, β1 + β2 xt is the explained part, and et is the unexplained part.
Given the T observations of data, OLS regression is used to estimate the parameters β1 and β2.
For example,
Rt^MSFT − RFt = β̂1 + β̂2 ( RMt − RFt ) = 0.0175 + 1.0138 ( RMt − RFt ) ,
where β̂1 = 0.0175 and β̂2 = 1.0138 are from the MSFT CAPM.
Since the et , t = 1,...,T , are not observed, they are to be predicted. The predicted (or forecast) error is
êt = yt − ŷt ,
where ŷt is the predicted (or forecast) value of yt , which is conditional on the value of xt , i.e.,
ŷt = E( yt | xt ) = β̂1 + β̂2 xt . (2)
OLS parameters (coefficients), β̂1 and β̂ 2 , are chosen to minimize the sum of squared
errors,
SSE = ∑ êt² = ∑ ( yt − ŷt )² = ∑ ( yt − β1 − β2 xt )² ,
where each sum runs from t = 1 to T.
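To make the least-squares idea concrete, here is a minimal numpy sketch (hypothetical simulated data, not the MSFT CAPM series) computing the closed-form estimates that minimize SSE:

```python
import numpy as np

# Hypothetical simulated sample: T observations of (x, y)
rng = np.random.default_rng(0)
T = 50
x = rng.normal(size=T)
y = 2.0 + 0.5 * x + rng.normal(size=T)   # true beta1 = 2.0, beta2 = 0.5

# Closed-form OLS estimates for the simple model:
#   beta2_hat = sum((xt - xbar)(yt - ybar)) / sum((xt - xbar)^2)
#   beta1_hat = ybar - beta2_hat * xbar
xbar, ybar = x.mean(), y.mean()
beta2_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
beta1_hat = ybar - beta2_hat * xbar

y_hat = beta1_hat + beta2_hat * x   # predicted values
e_hat = y - y_hat                   # predicted (forecast) errors, residuals
SSE = np.sum(e_hat ** 2)            # the minimized sum of squared errors
```

Any other pair of parameter values produces a larger SSE; that is the sense in which OLS is "least squares."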
OLS parameter variances, covariances, and standard errors for the simple model are
vâr( β̂1 ) = σ̂² ∑ xt² / ( T ∑ ( xt − x̄ )² ) ,  SE( β̂1 ) = √ vâr( β̂1 ) ,
vâr( β̂2 ) = σ̂² / ∑ ( xt − x̄ )² ,  SE( β̂2 ) = √ vâr( β̂2 ) ,
côv( β̂1 , β̂2 ) = − x̄ σ̂² / ∑ ( xt − x̄ )² ,
where each sum runs from t = 1 to T.
Standard error of the regression:
σ̂ = [ ∑ êt² / ( T − K ) ]^(1/2) , the sum running from t = 1 to T.
This is a measure of the variation in the predicted errors (residuals) of the model. K= the
number of parameters estimated in the model, e.g., K=2 in model (1), the β̂1 and β̂ 2
parameters.
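These formulas can be checked numerically. The sketch below (hypothetical simulated data, K = 2) computes σ̂, the parameter variances, the covariance, and the standard errors exactly as written above:

```python
import numpy as np

# Hypothetical simulated sample
rng = np.random.default_rng(1)
T, K = 60, 2
x = rng.normal(loc=1.0, size=T)
y = 1.5 + 0.8 * x + rng.normal(size=T)

# OLS estimates (closed form)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)            # sum of squared deviations of x
b2 = np.sum((x - xbar) * (y - y.mean())) / Sxx
b1 = y.mean() - b2 * xbar
e_hat = y - b1 - b2 * x

# Standard error of the regression: sigma_hat^2 = SSE / (T - K)
sigma2_hat = np.sum(e_hat ** 2) / (T - K)
sigma_hat = np.sqrt(sigma2_hat)

# Parameter variances, covariance, and standard errors
var_b1 = sigma2_hat * np.sum(x ** 2) / (T * Sxx)
var_b2 = sigma2_hat / Sxx
cov_b1_b2 = -xbar * sigma2_hat / Sxx
se_b1, se_b2 = np.sqrt(var_b1), np.sqrt(var_b2)
```

These scalar formulas agree with the diagonal and off-diagonal elements of the matrix form σ̂² (X′X)⁻¹.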
Coefficient of determination R 2
If there is a constant term in the regression model, then 0 ≤ R² ≤ 1 and R² can be interpreted as the proportion of the variation in y that is explained by the model.
yt = β1 + β2 xt + et , t = 1,...,T ,
where β1 + β2 xt is the explained part of yt and et is the unexplained part. Squaring and summing the corresponding deviations gives
∑ ( yt − ȳ )²  =  ∑ ( ŷt − ȳ )²  +  ∑ êt²   (each sum from t = 1 to T),
     TSS              XSS              ESS
so
TSS=XSS+ESS (note that MS Excel prints these values in the ANOVA table, the SS column).
TSS ≡ total sum of squares
XSS ≡ explained or regression sum of squares
ESS ≡ error (residual) sum of squares.
R² ≡ XSS / TSS = 1 − ESS / TSS .
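A quick numerical check (hypothetical data) that TSS = XSS + ESS and that the two forms of R² agree:

```python
import numpy as np

# Hypothetical simulated sample and OLS fit (model includes a constant term)
rng = np.random.default_rng(2)
T = 40
x = rng.normal(size=T)
y = 0.5 + 1.2 * x + rng.normal(size=T)

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()
y_hat = b1 + b2 * x

TSS = np.sum((y - y.mean()) ** 2)        # total sum of squares
XSS = np.sum((y_hat - y.mean()) ** 2)    # explained (regression) sum of squares
ESS = np.sum((y - y_hat) ** 2)           # error (residual) sum of squares
R2 = XSS / TSS                           # equivalently 1 - ESS / TSS
```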
If the goal is to obtain the highest R2, which is the objective in many stepwise regressions,
one can always obtain an R2=1 as long as a constant term is included in the regression. To
obtain the upper bound result with any Y and X data, just estimate
Yt = β1 + β2 Xt + β3 Xt² + ... + βT Xt^(T−1) .
t-statistic
The t-statistic presented with regression results tests the null hypothesis H0 : β = 0. More generally, to test H0 : β = c compute
t = ( β̂ − c ) / SEβ ∼ t(df) ,
and compare it with the critical value at the chosen significance level (usually 0.05). The degrees of freedom for the model are given by df = T − K , K = the number of parameters estimated in the model (so far K = 2 because we are estimating β1 and β2).
Alternatively, the same hypothesis can be tested using a confidence interval for β̂. For a 95% confidence interval,
β̂ ± t.05,T−K SEβ ⇒ β̂ − t.05,T−K SEβ ≤ c ≤ β̂ + t.05,T−K SEβ .
If the null hypothesis is, e.g., H0 : β = 0, then reject the null hypothesis if 0 lies outside the calculated bounds.
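A sketch of the t-test of H0: β2 = 0 and the matching 95% confidence interval, on hypothetical data; the critical value comes from scipy's Student-t quantile function:

```python
import numpy as np
from scipy import stats

# Hypothetical simulated sample and OLS fit
rng = np.random.default_rng(3)
T, K = 50, 2
x = rng.normal(size=T)
y = 1.0 + 0.6 * x + rng.normal(size=T)

Sxx = np.sum((x - x.mean()) ** 2)
b2 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b1 = y.mean() - b2 * x.mean()
e_hat = y - b1 - b2 * x
sigma2_hat = np.sum(e_hat ** 2) / (T - K)
se_b2 = np.sqrt(sigma2_hat / Sxx)

c = 0.0                                   # hypothesized value under H0
t_stat = (b2 - c) / se_b2
t_crit = stats.t.ppf(0.975, df=T - K)     # two-tailed 5% critical value
ci = (b2 - t_crit * se_b2, b2 + t_crit * se_b2)

reject = abs(t_stat) > t_crit             # same decision as: c lies outside ci
```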
OLS assumptions for the simple (single x variable) linear (in parameters) regression
model:
1. yt = β1 + β2 xt + et , t = 1,..., T
2. E( et ) = 0
3. var( et ) = σ² < ∞
4. cov( et , es ) = 0 for all t ≠ s
5. xt is not random and cov( xt , et ) = 0
If assumptions 1-5 hold, then the OLS estimator is the Best Linear Unbiased Estimator
(BLUE) of the parameters β1 and β 2 and their standard errors. Linear unbiased applies
to the parameter estimates and best refers to the efficiency of the standard errors.
BLUE implies that the OLS estimators have the smallest variance of all linear (in
parameters) and unbiased estimators (see the Gauss-Markov Theorem for the proof).
6. et ∼ N(0, σ²), the error term is normally distributed with mean zero and variance sigma-squared.
Implications of violating any of these assumptions on hypothesis tests and forecasting are
covered after we consider adding additional X variables to the model.
Multiple regression involves more than one X variable in the model, e.g.,
yt = β1 + β2 x2t + ... + βK xKt + et , t = 1,...,T .
Regardless of the number of x variables or the form of the model being estimated, y is
decomposed into what we can explain and what we can’t, as above.
There may be several reasons to include additional X variables:
1. Theoretical and empirical interest. There may be more than one variable that explains Y, i.e., additional X variables should be in the model (such as in APT or multifactor asset pricing models, demand and supply, etc.).
2. Avoiding omitted-variable bias. If a relevant X variable is left out of the model, it is absorbed into the error term; if that variable is correlated with an included x variable, then cov( xit , et ) ≠ 0. This then creates a violation of assumption 5, leading to bias in the estimated βi .
a. Assumption 1 becomes yt = β1 + β2 x2t + ... + βK xKt + et , t = 1,...,T .
b. Assumption 2 becomes E( et ) = 0 if and only if E( yt ) = ŷt = β̂1 + β̂2 x2t + ... + β̂K xKt .
c. Assumption 5 becomes cov( xkt , et ) = 0 for all k = 2,..., K, t = 1,...,T , and there is
no exact linear relationship between two or more of the X variables (i.e., no exact
multicollinearity).
With multiple X variables there is often correlation across these variables. This means we may
not be able to separate the individual effects of the collinear variables (although we can
validly consider the collinear variables jointly with an F-test). Multicollinearity is not a
violation of the above OLS assumptions, but it may bring into question the validity of
individual parameter tests when the null hypothesis is not rejected. However, it has no
effect on forecasting with the regression model, even if we have reached, e.g., a do-not-reject
conclusion on the zero null with individual parameter t-tests.
F-test
F = [ ( ESSr − ESSu ) / m ] / [ ESSu / ( T − K ) ] ∼ F(m, T−K) ,
where ESSi , i = r, u, is the error sum of squares for the restricted and unrestricted model,
and m is the number of restrictions specified in the null hypothesis. The statistic is
distributed according to the F distribution with degrees of freedom m for the numerator
and T − K for the denominator.
Consider the unrestricted model yt = β1 + β 2 x2t + ...+ β K x Kt + et , t = 1,...,T . As an example
(we can test any subset of the parameters), suppose we are interested in testing the null
hypothesis
H0 : β2 = ... = βK = 0
versus the alternative
H1 : β2 ,..., and/or βK ≠ 0.
The F statistic printed with the regression output is to test this null. Apply the null
hypothesis to the unrestricted model to obtain the restricted model yt = β1 + et , t = 1,...,T , and
estimate to obtain the ESSr . The degrees of freedom for the numerator, the difference in the
two models, is given by the difference in the number of parameters estimated in unrestricted
and restricted models, and is K −1 for the example above. The degrees of freedom for the
denominator are the same as the degrees of freedom for the unrestricted model, T − K.
If the calculated F(m, T−K) > F( critical sig. level, m, T−K ), then reject the null hypothesis. The
F critical value for significance level 0.05 is the value such that the right tail area equals 0.05.
Note the degrees of freedom for the numerator, df1, and for the denominator, df2, used to
determine the critical value. Changing the order of the degrees of freedom changes the critical value.
The F-test can be used for testing linear (in parameters) restrictions on any subset of
parameters in the model. It will also be useful in testing the OLS assumptions above.
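The following sketch (hypothetical data, K = 3, so m = K − 1 = 2 restrictions) computes the F statistic by estimating the unrestricted and restricted models and comparing their error sums of squares:

```python
import numpy as np

# Hypothetical simulated sample with two X variables
rng = np.random.default_rng(4)
T, K = 60, 3
x2 = rng.normal(size=T)
x3 = rng.normal(size=T)
y = 0.5 + 0.7 * x2 - 0.4 * x3 + rng.normal(size=T)

# Unrestricted model: constant, x2, x3
X_u = np.column_stack([np.ones(T), x2, x3])
b_u, *_ = np.linalg.lstsq(X_u, y, rcond=None)
ESS_u = np.sum((y - X_u @ b_u) ** 2)

# Restricted model under H0: beta2 = beta3 = 0 (constant only)
ESS_r = np.sum((y - y.mean()) ** 2)

m = K - 1                                  # number of restrictions
F = ((ESS_r - ESS_u) / m) / (ESS_u / (T - K))
```

Compare F with the F(m, T−K) critical value and reject H0 if F exceeds it.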
Violations of assumption(s)
The LM test for autocorrelation is used to test for violation of assumption 4, which can cause
inefficient standard errors.
The White test for heteroskedasticity can be used to test for violation of assumption 3, which
can cause inefficient standard errors.
Type I errors (rejecting the null when it is true) and Type II errors (not rejecting the null
when it is false) may result from violations of the OLS assumptions, i.e., model
specification issues.
FORECASTING (prediction) with a single equation model
The object is the forecast ŷT+τ , τ = 1,..., N , where N is the number of time periods into the future that y is being forecast. For
example, if τ = 1 then we are calculating the one period ahead forecast, ŷT+1 .
Types of forecasting
Unconditional forecast: explanatory variables are known with certainty (they are
observed).
• Ex post forecast is unconditional (values of explanatory variables are observed).
• Ex ante forecast may be unconditional if explanatory variables are lagged. If
there are no lagged explanatory variables then all ex ante forecasts are
conditional.
Conditional forecast: at least one explanatory variable is not known (i.e., we must
forecast at least one explanatory variable in order to forecast ŷ ). In this case the forecast
of ŷ is said to be conditional on the forecast of the x variables.
Example:
Time t = [ 0,…,t1 ] is the sample data used in estimating the model (e.g., Jan. 1990-Dec.
Forecast standard error
For the simple regression model, the standard error around the forecast of ŷT +τ is given
by
σ̂f = [ σ̂² ( 1 + 1/T + ( x0 − x̄ )² / ∑ ( xt − x̄ )² ) ]^(1/2) , (4)
where x0 is the value of x in the forecast period and the sum runs from t = 1 to T.
This expression is only for simple regression (1 X variable) models and includes error
from the error term in the model (the standard error of the regression) and error
associated with the parameter estimates (parameter standard error).
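A numerical check of expression (4) on hypothetical data, with an assumed value x0 for the forecast period:

```python
import numpy as np

# Hypothetical simulated sample and OLS fit
rng = np.random.default_rng(5)
T, K = 50, 2
x = rng.normal(size=T)
y = 2.0 + 0.5 * x + rng.normal(size=T)

xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)
b2 = np.sum((x - xbar) * (y - y.mean())) / Sxx
b1 = y.mean() - b2 * xbar
sigma2_hat = np.sum((y - b1 - b2 * x) ** 2) / (T - K)

x0 = 1.5                                  # assumed x value in the forecast period
y_forecast = b1 + b2 * x0
# Forecast standard error, expression (4)
sigma_f = np.sqrt(sigma2_hat * (1 + 1 / T + (x0 - xbar) ** 2 / Sxx))
```

Note that σ̂f grows as x0 moves away from the sample mean x̄, so forecasts far from the observed data are less precise.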
Expression (4) is a special case of a more complicated expression that includes terms for multiple X
variables. To consider more than one X variable, the forecast standard error expression
in (4) has to be expanded to include analogous terms for each X and covariance terms
between all the β̂ associated with X variables in the model. The prediction or forecast
standard error becomes
σ̂f = [ σ̂² ( 1 + 1/T ) + ∑i ∑j ( xi0 − x̄i )( xj0 − x̄j ) côv( β̂i , β̂j ) ]^(1/2) , (5)
where each sum runs over 1,..., K − 1 and côv( β̂i , β̂i ) = vâr( β̂i ).
For conditional mean forecasts, drop the 1 from expression (5), so
σ̂f = [ σ̂² / T + ∑i ∑j ( xi0 − x̄i )( xj0 − x̄j ) côv( β̂i , β̂j ) ]^(1/2) . (6)
Using dummy variables to compute the forecast and forecast standard error
For any number of X variables and forecast time periods, a relatively easy way to
calculate the forecast and forecast standard error is to use the following dummy variable
approach. Expand the data used to estimate the model by increasing the number of rows
and columns of data by the number of periods being forecast. Each new column
contains all zeros except on the diagonal of the rows and columns that have been
added; each element of this diagonal is set equal to −1. Each of the new rows below the
original data contains a zero in the y column and the values of the x variables for that
period. For example, to forecast two periods forward (i.e., time periods T+1 and T+2), it is necessary to have
values for all xij , i = 1, 2, 3 and j = T + 1, T + 2. The forecast of y is conditional on these x
values. Below is the data layout which will allow us to calculate the forecast values and the
forecast standard error for each forecast period. The black entries are the original data
matrix to estimate the model. The blue entries are the additional rows and columns to
add to obtain the forecast, forecast standard error, and the probability bounds around the
forecast.
⎡ y1   x11      x21      x31       0    0 ⎤
⎢ y2   x12      x22      x32       0    0 ⎥
⎢  ⋮     ⋮        ⋮        ⋮        ⋮    ⋮ ⎥
⎢ yT   x1T      x2T      x3T       0    0 ⎥     (7)
⎢ 0    x1,T+1   x2,T+1   x3,T+1   −1    0 ⎥
⎣ 0    x1,T+2   x2,T+2   x3,T+2    0   −1 ⎦
See the DAL CAPM and other examples for more detail.
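In place of the example files, here is a minimal numpy sketch of the layout in (7), reduced to a single x variable for brevity: the coefficient on each −1 dummy column equals that period's forecast, and its OLS standard error equals the forecast standard error.

```python
import numpy as np

# Hypothetical estimation sample and assumed x values for periods T+1, T+2
rng = np.random.default_rng(6)
T, K = 40, 2
x = rng.normal(size=T)
y = 1.0 + 0.9 * x + rng.normal(size=T)
x_new = np.array([0.8, -0.3])
h = len(x_new)

# Augmented data as in (7): h extra rows with y = 0, and h dummy columns
# that are all zeros except -1 on the diagonal of the added rows
y_aug = np.concatenate([y, np.zeros(h)])
X_aug = np.zeros((T + h, K + h))
X_aug[:T, 0] = 1.0
X_aug[:T, 1] = x
X_aug[T:, 0] = 1.0
X_aug[T:, 1] = x_new
X_aug[T:, K:] = -np.eye(h)

b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
resid = y_aug - X_aug @ b
sigma2_hat = np.sum(resid ** 2) / (T - K)   # forecast-row residuals are zero
cov_b = sigma2_hat * np.linalg.inv(X_aug.T @ X_aug)

forecasts = b[K:]                           # y-hat for T+1 and T+2
forecast_se = np.sqrt(np.diag(cov_b)[K:])   # forecast standard errors
```

The dummy rows add no information about the βs, so the β estimates and σ̂² are identical to those from the original regression.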
Choosing among alternative forecasting models
The root mean square error (RMSE) is a measure of the deviation of the simulated
(forecast) variable from the actual (observed) value of the variable, and can be used to
choose between alternative forecasting models.
To calculate the RMSE, we need the predicted and actual values of the y so that we can
calculate the prediction error in the forecast, ŷt − yt . Using this information over the
forecasting horizon we can calculate the
Root mean square error = RMSE = [ ∑ ( ŷt − yt )² / h ]^(1/2) , the sum running from t = T + 1 to T + h,
where h is the number of forecasting periods, ŷt is the forecast value of y at time t, and yt is the actual value.
For any specified forecasting model, the magnitude of this error can only be evaluated
relative to the mean of the variable being forecast.
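A small sketch comparing two forecasting models by RMSE; all numbers are hypothetical:

```python
import numpy as np

# Hypothetical actual values and two competing sets of forecasts, t = T+1,...,T+h
y_actual = np.array([1.2, 0.8, 1.5, 1.1])
y_fcst_a = np.array([1.0, 0.9, 1.4, 1.3])   # model A forecasts
y_fcst_b = np.array([0.5, 1.6, 0.9, 1.8])   # model B forecasts

def rmse(y_hat, y):
    # RMSE = [ sum((y_hat_t - y_t)^2) / h ]^(1/2)
    return np.sqrt(np.sum((y_hat - y) ** 2) / len(y))

rmse_a = rmse(y_fcst_a, y_actual)
rmse_b = rmse(y_fcst_b, y_actual)
# Choose the model with the smaller RMSE (here, model A)
```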
“Can Economists Forecast Crashes?” David Hendry (a little over 3 min video):
http://www.youtube.com/watch?v=yrpUO0k3kSM
Econometrics Beat: Dave Giles’ Blog (has data, among other things)
http://davegiles.blogspot.com