
Review

Probability
Random variable (RV)
Probability distribution
Expectation operator
Mean and variance of a RV
Independence
Conditional expectation

Review
Regression
Assumptions about the residual
OLS estimation (minimise the SSR)
Forecasting with an estimated model
Testing for coefficient significance
Selecting between models (AIC, SIC)

Review Questions
What is the distribution of a RV?
What is the expected value of a RV? What is
the expectation operator?
What do we mean when we say that two RVs
are independent?
What is the conditional expectation of a RV?

PS1 : Question 5, p. 30
Draw qualitative scatterplots and regression
lines for each of the following two-variable
datasets, and state the R-square in each case:
(a) Dataset 1: y and x have correlation 1
(b) Dataset 2: y and x have correlation -1
(c) Dataset 3: y and x have correlation 0

PS1 : Answer
(a) Data points are precisely on a positively
sloped line. R-sq = 1
(b) Data points are precisely on a negatively
sloped line. R-sq = 1
(c) Data points are scattered around a horizontal
line. R-sq = 0
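
The three answers can be checked numerically, since in a two-variable regression with a constant the R-squared equals the squared correlation. The sketch below is illustrative only: the data are made up, the helper name `r_squared` is hypothetical, and the zero-correlation case uses a deterministic parabola (chosen so the sample correlation is exactly zero) rather than a random scatter.

```python
import numpy as np

def r_squared(x, y):
    """R-squared from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = np.sum(resid ** 2)              # sum of squared residuals
    sst = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - ssr / sst

# Made-up data for illustration.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_pos = 3.0 + 2.0 * x                          # exact positive line: correlation +1
y_neg = 3.0 - 2.0 * x                          # exact negative line: correlation -1
y_zero = np.array([4.0, 1.0, 0.0, 1.0, 4.0])   # x squared: correlation 0 with x

for label, y in [("(a)", y_pos), ("(b)", y_neg), ("(c)", y_zero)]:
    corr = np.corrcoef(x, y)[0, 1]
    print(label, "correlation:", round(corr, 6), "R-sq:", round(r_squared(x, y), 6))
```

In cases (a) and (b) the fitted line passes through every point, so the SSR is zero; in case (c) the fitted line is horizontal at the mean of y, so the SSR equals the total sum of squares.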

PS2 : Question 4, p. 30
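Given the regression model:
$y_t = \beta_0 + \beta_1 x_t + \beta_2 x_t^2 + \beta_3 z_t + \varepsilon_t$
$\varepsilon_t \sim \text{iid}\,(0, \sigma^2)$
Find the mean and variance of $y_t$ conditional on
$x_t = x_t^*$ and $z_t = z_t^*$. Does the conditional mean
adapt to the conditioning information? Does the
conditional variance adapt to the conditioning
information?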

PS2 : Answer
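$E(y_t \mid x_t = x_t^*, z_t = z_t^*) = \beta_0 + \beta_1 x_t^* + \beta_2 x_t^{*2} + \beta_3 z_t^*$

$\operatorname{var}(y_t \mid x_t = x_t^*, z_t = z_t^*) = \sigma^2$

The conditional mean of y adapts to the
conditioning information (i.e., the values of x and
z). However, the conditional variance of y does
not.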
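
The two results can be derived in a couple of lines. The sketch below writes the model as y = β0 + β1 x + β2 x² + β3 z + ε with starred symbols for the conditioning values, and assumes (the slide does not say so explicitly) that the error is independent of x and z, with mean zero and variance σ²:

```latex
\begin{aligned}
E(y_t \mid x_t = x_t^*, z_t = z_t^*)
  &= \beta_0 + \beta_1 x_t^* + \beta_2 x_t^{*2} + \beta_3 z_t^*
     + E(\varepsilon_t \mid x_t = x_t^*, z_t = z_t^*) \\
  &= \beta_0 + \beta_1 x_t^* + \beta_2 x_t^{*2} + \beta_3 z_t^*, \\[4pt]
\operatorname{var}(y_t \mid x_t = x_t^*, z_t = z_t^*)
  &= \operatorname{var}(\varepsilon_t \mid x_t = x_t^*, z_t = z_t^*)
   = \sigma^2 .
\end{aligned}
```

Once x and z are fixed, the only random part of y is the error, which is why the conditional variance is the constant σ² while the conditional mean moves with the conditioning values.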

PS3 : Question 9, p. 31
Let
$y_t = \beta_0 + \beta_1 x_t + \beta_2 z_t + \varepsilon_t$
where $y_t$ is the number of hot dogs sold at an
amusement park on a given day, $x_t$ is the
number of admission tickets sold that day, $z_t$ is
the daily maximum temperature, and $\varepsilon_t$ is a
random error.

PS3 : Question 9, p. 31
(a) State whether each of $y_t$, $x_t$, $z_t$, $\beta_0$, $\beta_1$, and $\beta_2$
is a coefficient or a variable.
(b) Determine the units of $\beta_0$, $\beta_1$, and $\beta_2$, and
describe the physical meaning of each.
(c) What does the sign of a coefficient tell you
about how its corresponding variable affects the
number of hot dogs (HDs) sold? What are your
expectations for the signs of the various
coefficients: negative, zero, positive, or unsure?
(d) Is it sensible to think of a non-zero intercept,
that is, $\beta_0 \neq 0$? $\beta_0 > 0$? $\beta_0 < 0$?

PS3 : Answer
(a) y is the dependent variable, x and z are
independent variables, and $\beta_0$, $\beta_1$, and $\beta_2$ are
coefficients (parameters).
(b) Units: $\beta_0$ = number of HDs; $\beta_1$ = number of
HDs per admission ticket; and $\beta_2$ = number of
HDs per degree of temperature.

PS3 : Answer
(c) The sign indicates whether the dependent variable is
positively or negatively related to the corresponding
independent variable. You would expect the x coefficient to be
non-negative (more admissions lead to more sales). There is no
clear prior intuition about the sign of the z coefficient; it
may be zero.

(d) It is probably sensible to include an intercept.
Strictly, HD sales must be zero if admissions are zero, which
would suggest a zero intercept. However, if we view this linear
model as a linear approximation to a potentially non-linear
relationship, the intercept may well be non-zero (of either sign).

PS4 : Question 8, p. 31
Consider Fig 2.2
(a) In fitting that regression line, we included a
constant term. How can you tell?
(b) Suppose that we had not included a constant
term. How would the figure look?
(c) We almost always include a constant term
when estimating regressions. Why?
(d) When, if ever, might you explicitly want to
exclude the constant term?

PS4 : Answer
(a) The fitted regression line does not go
through the origin, indicating a non-zero
intercept term.

(b) If the regression did not include a constant
term, the fitted line would pass through the
origin, as the regression function would be:
$y = \beta_1 x$

PS4 : Answer
(c) As the linear regression model is usually an
approximation to the relationship between y & x
in the range of data, there is no reason to force
the approximation through the origin, except in
very special circumstances.

(d) If, for example, a production function were
truly linear, then it should pass through the
origin (no inputs, no outputs).
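
The effect of dropping the constant can be seen in a small numerical sketch. The data below are synthetic (a true intercept of 5 and slope of 0.5 are assumptions made purely for illustration), and x is kept well away from zero, as in a typical sample:

```python
import numpy as np

# Synthetic data (assumed for illustration): true intercept 5, true slope 0.5.
rng = np.random.default_rng(0)
x = np.linspace(10.0, 20.0, 50)
y = 5.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

# With a constant: regress y on (1, x).
X_const = np.column_stack([np.ones_like(x), x])
b_const, *_ = np.linalg.lstsq(X_const, y, rcond=None)

# Without a constant: regress y on x alone, forcing the line through the origin.
b_origin, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

ssr_const = np.sum((y - X_const @ b_const) ** 2)
ssr_origin = np.sum((y - x * b_origin[0]) ** 2)

print("with constant:    intercept, slope =", b_const)
print("without constant: slope =", b_origin[0])
print("SSR with / without constant:", ssr_const, ssr_origin)
```

Forcing the line through the origin distorts the slope when the true intercept is not zero, and the SSR can never be lower than in the regression that includes the constant, because the no-constant model is a restricted version of it.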

PS5 : Question 6, p. 31
For each of the diagnostics listed, indicate
whether bigger is better, smaller is better, or
neither. Explain your reasoning.
(a) coefficient, (b) standard error, (c) t-statistic,
(d) p-value of t-statistic, (e) R-sq., (f) Adj. R-sq.,
(g) standard error of regression, (h) SSR,
(i) Log-likelihood, (j) DW statistic, (k) Mean of y,
(l) standard deviation of y, (m) AIC, (n) SIC,
(o) F-statistic, (p) p-value of F-statistic.

PS5 : Answer
(a) coefficient: neither
(b) standard error: smaller is better
(c) t-statistic: bigger (in absolute value) is better
(d) p-value of t-statistic: smaller is better
(e) R-sq.: bigger is better
(f) Adj. R-sq.: bigger is better
(g) standard error of regression: smaller is better
(h) SSR (sum of squared residuals): smaller is better
(i) Log-likelihood: bigger is better
(j) DW statistic: neither; closer to 2 is better
(k) mean of y: neither
(l) standard deviation of y: neither
(m) AIC: smaller is better
(n) SIC: smaller is better
(o) F-statistic: bigger is better
(p) p-value of F-statistic: smaller is better
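
To make several of these diagnostics concrete, the sketch below computes them directly from OLS residuals using their textbook definitions. The function name `regression_diagnostics` and the data are hypothetical. AIC and SIC are left out because their exact formulas differ between textbooks and software packages.

```python
import numpy as np

def regression_diagnostics(y, X):
    """OLS of y on X, where X already contains a column of ones."""
    T, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                          # residuals
    ssr = e @ e                               # sum of squared residuals
    sst = np.sum((y - y.mean()) ** 2)         # total sum of squares
    s2 = ssr / (T - k)                        # residual variance estimate
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return {
        "coef": beta,
        "std_err": se,
        "t_stat": beta / se,
        "r2": 1.0 - ssr / sst,
        "adj_r2": 1.0 - (ssr / (T - k)) / (sst / (T - 1)),
        "ser": np.sqrt(s2),                   # standard error of regression
        "ssr": ssr,
        "dw": np.sum(np.diff(e) ** 2) / ssr,  # Durbin-Watson statistic
    }

# Synthetic example (illustrative data, not taken from the problem set).
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])

diag = regression_diagnostics(y, X)
for name, value in diag.items():
    print(name, value)
```

The formulas show why the answers pair up: a smaller SSR mechanically gives a smaller standard error of regression and a larger R-squared, while the adjusted R-squared additionally penalises the number of estimated coefficients k.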

PS6
The file oz-gdp.dat contains a single time
series, quarterly Australian GDP from 1970Q1 to
1998Q4. Read the data into EViews.
(a) Find the summary statistics by typing hist
gdp and pressing Enter.
(b) Generate a time trend by typing genr
tt=@trend(0) and pressing Enter.
(c) Generate the log of GDP by typing genr
y=log(gdp) and pressing Enter.

PS6
(d) Regress log GDP on (1,tt) by typing ls y c tt
and pressing Enter.
(e) Comment on the estimation.
(f) In the results window, click View/Actual, Fitted,
Residual/Graph to find the residual plot.
Comment.
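
For readers without EViews, the trend regression can be mimicked in Python. The sketch below is built on stated assumptions: oz-gdp.dat is not reproduced here, so a synthetic trending series with persistent noise stands in for GDP, and the time trend is simply taken to be 0, 1, 2, and so on. The numbers it prints are illustrative and unrelated to the Australian data.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 116  # 1970Q1 to 1998Q4: 29 years x 4 quarters

# Synthetic stand-in for GDP (assumption): exponential trend with AR(1) noise.
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.9 * u[t - 1] + rng.normal(scale=0.01)

tt = np.arange(T, dtype=float)           # time trend, playing the role of tt
gdp = np.exp(10.0 + 0.008 * tt + u)      # synthetic level series

y = np.log(gdp)                          # log of the series, as in step (c)
X = np.column_stack([np.ones(T), tt])    # regressors (1, tt), as in step (d)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Durbin-Watson statistic of the residuals.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

print("intercept, slope:", beta)
print("Durbin-Watson:", dw)
```

Because the noise in this synthetic series is persistent by construction, the residuals from the trend regression are strongly autocorrelated and the Durbin-Watson statistic falls well below 2.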

PS6 (a): Answer

Because the data are trending, the histogram does not show an obvious central tendency or a
unique peak.

PS6 (d): Answer

PS6 (e): Answer
1. The slope coefficient is statistically different
from zero (virtually zero p-value) and could be
interpreted as the quarterly GDP growth rate
(0.779%) if the regression model assumptions
were satisfied.
2. The R-sq. is very large, because the GDP series
and the time trend (tt) are both trending.
3. The error term of the regression is strongly
autocorrelated, indicated by the tiny DW
statistic (0.141034). Hence, the assumption that
the error is iid does not hold here.
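
A rough sense of how strong the autocorrelation is can be obtained from the standard approximation linking the DW statistic to the first-order sample autocorrelation of the residuals, written here as $\hat{\rho}$ (a symbol introduced for this illustration):

```latex
DW \approx 2\,(1 - \hat{\rho})
\quad\Longrightarrow\quad
\hat{\rho} \approx 1 - \frac{DW}{2} = 1 - \frac{0.141034}{2} \approx 0.93
```

Under this approximation, a DW near 2 corresponds to residuals with little first-order autocorrelation, whereas the value reported here implies residuals whose neighbouring values are very highly positively correlated.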

PS6 (f): Answer

The time trend fits the GDP series fairly well in the
given sample, and the residuals are very small in
magnitude. However, the residual plot shows that the
residuals stay either positive or negative for sizable
episodes. This is a typical sign of strong
autocorrelation in the residual series (which is
consistent with the DW statistic).
For forecasting purposes, the autocorrelation in the
residual series is a predictable pattern that should be
exploited further. Later on we will spend a lot of time
modelling autocorrelation.
