1. Linear Regression Model:
The regression model is linear in the parameters, though it may or may not be linear in the variables. That is, the regression model is of the form Yi = β1 + β2Xi + ui.
2. Fixed X Values or X Values Independent of the Error Term:
Values taken by the regressor X may be considered fixed in repeated samples (the case of fixed regressors) or they may be sampled along with the dependent variable Y (the case of stochastic regressors). In the latter case, it is assumed that the X variable(s) and the error term are independent, that is, cov(Xi, ui) = 0.
3. Zero Mean Value of the Disturbance ui:
Given the value of Xi, the mean, or expected, value of the random disturbance term ui is zero. Symbolically, E(ui | Xi) = 0.
4. Homoscedasticity (Constant Variance of ui):
The variance of the error, or disturbance, term is the same regardless of the value of X. Symbolically, var(ui | Xi) = σ².
5. No Autocorrelation between the Disturbances:
Given any two X values, Xi and Xj (i ≠ j), the correlation between any two ui and uj (i ≠ j) is zero. In short, the observations are sampled independently. Symbolically, cov(ui, uj | Xi, Xj) = 0, i ≠ j.
6. The Number of Observations:
The number of observations n must be greater than the number of parameters to be estimated; alternatively, the number of observations must be greater than the number of explanatory variables.
7. The Nature of X Variables:
The X values in a given sample must not all be the same. Technically, var(X) must be a positive number. Furthermore, there can be no outliers in the values of the X variable, that is, values that are very large in relation to the rest of the observations.
OLS Properties
Small-Sample Properties
i. Unbiasedness
E(ê) = e, that is, E(ê) − e = 0.
If this equality does not hold, then the estimator is said to be biased, and the bias is calculated as bias(ê) = E(ê) − e.
The average value of these estimates is expected to be equal to the true value if the
estimator is unbiased.
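This repeated-sampling idea can be illustrated with a small Monte Carlo sketch. The model, parameter values, and sample sizes below are illustrative assumptions, not taken from the text; the point is only that the average of the OLS slope estimates across many samples is close to the true slope.

```python
# Monte Carlo sketch of unbiasedness for a hypothetical model
# y = b1 + b2*x + u with b1 = 2, b2 = 0.5 (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
b1_true, b2_true, n, reps = 2.0, 0.5, 50, 2000
x = rng.uniform(0, 10, n)          # regressor held fixed in repeated samples
estimates = []
for _ in range(reps):
    u = rng.normal(0, 1, n)        # disturbances with zero conditional mean
    y = b1_true + b2_true * x + u
    # OLS slope: cov(x, y) / var(x)
    b2_hat = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    estimates.append(b2_hat)

print(np.mean(estimates))          # close to the true slope 0.5
```

Each individual estimate deviates from 0.5, but the average over the 2000 repeated samples is very close to it, which is what E(ê) = e asserts.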
iii. Efficiency:
If ê1 and ê2 are two unbiased estimators of e, and the variance of ê1 is smaller than or at most equal to the variance of ê2, then ê1 is a minimum-variance unbiased, or best unbiased, or efficient, estimator.
iv. Linearity
An estimator ê is said to be a linear estimator of e if it is a linear function of the sample observations. Thus, the sample mean, defined as X̄ = (1/n) ΣXi = (1/n)(X1 + X2 + ··· + Xn), is a linear estimator because it is a linear function of the observations.
If ê is linear, is unbiased, and has minimum variance in the class of all linear unbiased
estimators of e, then it is called a best linear unbiased estimator, or BLUE for short.
The mean squared error of ê is defined as MSE(ê) = E(ê − e)². The difference between MSE(ê) and var(ê) is that var(ê) measures the dispersion of the distribution of ê around its mean or expected value, whereas MSE(ê) measures dispersion around the true value of the parameter. It can be shown that MSE(ê) = var(ê) + [bias(ê)]².
Large-Sample Properties
It often happens that an estimator does not satisfy one or more of the desirable statistical properties in small samples. But as the sample size increases indefinitely, the estimator may come to possess several desirable statistical properties. These are known as the large-sample, or asymptotic, properties.
i. Asymptotic Unbiasedness
lim n→∞ E(ên) = e
where ên means that the estimator is based on a sample of size n, lim means limit, and n → ∞ means that n increases indefinitely. In words, ê is an asymptotically unbiased estimator of e if its expected, or mean, value approaches the true value as the sample size gets larger and larger.
ii. Consistency
ê is said to be a consistent estimator if it approaches the true value e as the sample size gets larger and larger. Symbolically, plim n→∞ ên = e, where plim denotes the probability limit.
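Consistency can be seen in a minimal simulation using the sample mean as the estimator. The population mean 3.0, the normal population, and the sample sizes are illustrative assumptions:

```python
# Consistency sketch: the sample mean of a hypothetical normal population
# with true mean e = 3.0 concentrates around the true value as n grows.
import numpy as np

rng = np.random.default_rng(1)
e_true = 3.0
deviations = []
for n in (10, 1_000, 100_000):
    sample = rng.normal(e_true, 2.0, n)
    deviations.append(abs(sample.mean() - e_true))

print(deviations)   # the deviation from e_true typically shrinks as n grows
```

At n = 100,000 the sample mean sits within a tiny neighborhood of 3.0, which is the practical content of plim ên = e.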
Problems:
1. Multicollinearity:
Multicollinearity arises when two or more explanatory variables are highly (or perfectly) linearly related. The OLS estimators remain unbiased, but their variances become large, so the individual coefficients are estimated imprecisely.
Test for Multicollinearity:
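A widely used diagnostic is the variance inflation factor (VIF): regress each explanatory variable on the others and compute VIF_j = 1 / (1 − R_j²); values well above 10 are commonly read as a sign of serious multicollinearity. The data below are made up so that the second regressor nearly duplicates the first:

```python
# Minimal numpy sketch of the variance inflation factor (VIF).
import numpy as np

def vif(X):
    """VIF for each column of X (n x k matrix of regressors, no constant)."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(n), others])     # add an intercept
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1 - resid.var() / y.var()                # R^2 of x_j on the rest
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)
v = vif(np.column_stack([x1, x2, x3]))
print(v)    # first two VIFs are very large; the third is near 1
```

The large VIFs for x1 and x2 flag the near-exact linear relation between them, while the independent x3 is unaffected.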
2. Heteroscedasticity:
If it is suspected that the variances are not homogeneous (a plot of the residuals against the explanatory variables may reveal heteroscedasticity), it is necessary to perform a test for heteroscedasticity. Several tests have been developed, with the null hypothesis that the error variance is constant (homoscedasticity) against the alternative that it is not:
Tests:
Breusch-Pagan test
White test
Breusch-Pagan test:
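A minimal numpy sketch of the (Koenker studentized) Breusch-Pagan procedure: regress the squared OLS residuals on the regressors and compare LM = n·R² with a chi-squared critical value. The data are simulated so that the error variance grows with x, i.e. heteroscedasticity is present by construction:

```python
# Breusch-Pagan (studentized/Koenker form) on simulated heteroscedastic data.
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = rng.uniform(1, 10, n)
u = rng.normal(0, 1, n) * x               # error variance increases with x
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta                      # OLS residuals

g = resid**2                              # squared residuals
gamma, *_ = np.linalg.lstsq(X, g, rcond=None)
aux_resid = g - X @ gamma
r2 = 1 - aux_resid.var() / g.var()        # R^2 of the auxiliary regression
lm = n * r2                               # LM statistic, chi-squared with 1 df here
print(lm)   # reject homoscedasticity if lm exceeds 3.84 (chi2, 1 df, 5%)
```

Because the simulated variance rises with x, the LM statistic lands far above the 5% critical value and the null of homoscedasticity is rejected.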
White test:
The White test allows for nonlinearities by using the squares and cross products of all the x's. The procedure can be summarized as:
1. Estimate the model by OLS and obtain the residuals.
2. Square the residuals.
3. Obtain the predicted Y values after estimating your model.
4. Square the predicted values.
5. Regress the squared residuals on the predicted values and their squares (in the general form, on the x's, their squares, and their cross products).
6. Retain the R-squared value from this regression.
7. State the null hypothesis that the error variance is constant.
8. Calculate the F-statistic or the chi-squared statistic (n times the R-squared) and reject the null if it exceeds the critical value.
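The steps above can be sketched in numpy with simulated data. The two regressors and coefficient values are illustrative assumptions; the auxiliary regression uses the x's, their squares, and their cross product, as in the general form of White's test:

```python
# White's test on simulated data with variance driven by x1.
import numpy as np

rng = np.random.default_rng(4)
n = 400
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(1, 5, n)
u = rng.normal(0, 1, n) * x1             # heteroscedastic by construction
y = 1.0 + 0.5 * x1 - 0.3 * x2 + u

def ols_resid_r2(X, y):
    """OLS residuals and R^2 of y on X (X includes the constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid, 1 - resid.var() / y.var()

X = np.column_stack([np.ones(n), x1, x2])
resid, _ = ols_resid_r2(X, y)            # steps 1-2: fit model, square residuals

# steps 5-6: auxiliary regression with squares and the cross product
Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
_, r2_aux = ols_resid_r2(Z, resid**2)
lm = n * r2_aux                          # step 8: chi-squared statistic, 5 df
print(lm)   # reject homoscedasticity if lm exceeds 11.07 (chi2, 5 df, 5%)
```

Since the simulated variance depends on x1², a term the auxiliary regression includes, the statistic comfortably rejects homoscedasticity.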
3. Autocorrelation:
The term autocorrelation may be defined as "correlation between members of series of observations ordered in time [as in time series data] or space [as in cross-sectional data]." In the regression context, the classical linear regression model assumes that such autocorrelation does not exist in the disturbances ui. Symbolically, cov(ui, uj) = 0 for i ≠ j.
The most celebrated test for detecting serial correlation is the Durbin-Watson d test, which rests on the following assumptions:
1. The regression model includes the intercept term. If it is not present, as in the case of regression through the origin, it is essential to rerun the regression including the intercept term to obtain the RSS.
2. The explanatory variables, the X’s, are non-stochastic, or fixed in repeated sampling.
3. The disturbances ut are generated by the first-order autoregressive scheme: ut= ρut-1 +εt.
Therefore, it cannot be used to detect higher-order autoregressive schemes.
4. The error term ut is assumed to be normally distributed.
5. The regression model does not include the lagged value(s) of the dependent variable as one of
the explanatory variables.
Method:
1. Run the OLS regression and obtain the residuals ût.
2. Compute d = Σ(ût − ût−1)² / Σût².
3. For the given sample size and the given number of explanatory variables, find the critical dL and dU values.
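The steps above can be sketched with simulated AR(1) disturbances (the value ρ = 0.8 and the model coefficients are illustrative). The d statistic is roughly 2(1 − ρ̂), so values near 2 suggest no first-order autocorrelation:

```python
# Durbin-Watson d on a regression with simulated AR(1) errors.
import numpy as np

rng = np.random.default_rng(5)
n, rho = 200, 0.8
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):                    # u_t = rho * u_{t-1} + e_t
    u[t] = rho * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta                     # step 1: OLS residuals

d = np.sum(np.diff(resid)**2) / np.sum(resid**2)   # step 2: compute d
print(d)    # well below 2 here, consistent with positive autocorrelation
```

With strongly positively autocorrelated errors, d falls well below 2, which the decision rules below translate into a rejection of ρ = 0.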
Decision Rule:
1. H0: ρ = 0 versus H1: ρ > 0. Reject H0 at the α level if d < dU; that is, there is statistically significant positive autocorrelation.
2. H0: ρ = 0 versus H1: ρ < 0. Reject H0 at the α level if (4 − d) < dU; that is, there is statistically significant evidence of negative autocorrelation.
3. H0: ρ = 0 versus H1: ρ ≠ 0. Reject H0 at the 2α level if d < dU or (4 − d) < dU; that is, there is statistically significant evidence of autocorrelation, positive or negative.
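These rules can be written as a small helper function. The critical value 1.54 used in the examples is illustrative; in practice dU is looked up from a Durbin-Watson table for the given sample size and number of regressors:

```python
# Modified Durbin-Watson d-test decision rules; H0 is rho = 0.
def modified_dw_decision(d, d_u, alternative="positive"):
    """Return the test decision given d and the upper critical value d_u."""
    if alternative == "positive":        # H1: rho > 0, level alpha
        return "reject H0" if d < d_u else "do not reject H0"
    if alternative == "negative":        # H1: rho < 0, level alpha
        return "reject H0" if (4 - d) < d_u else "do not reject H0"
    # H1: rho != 0, level 2*alpha
    return "reject H0" if d < d_u or (4 - d) < d_u else "do not reject H0"

print(modified_dw_decision(0.9, 1.54))                  # reject H0
print(modified_dw_decision(2.1, 1.54, "two-sided"))     # do not reject H0
```

Using dU rather than dL in each rule is what makes this the modified d test: it sidesteps the inconclusive region of the original test.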