
Econ 140 SLC Midterm Review Session

Lead Tutor: Tomas Villena, tomas.villena@berkeley.edu
Instructor: Professor Woroch


10/3/2016, 10/5/2016

1. Concept Review for Midterm

Up to now, we have seen three building blocks:

Review of Probability
Review of Statistics
Basics of the Simple Linear Regression Model

The following is a non-comprehensive outline of some of the key points in these three topics. They are ordered by their respective Stock and Watson chapters for reference.

Chapter 1: Economic Questions and Data


Chapter References: Pages 1-14

Types of Data

Cross-Sectional: multiple entities observed at a single time period
Time-series: a single entity observed at multiple time periods
Panel Data: multiple entities, where each entity is observed at multiple time periods

Chapter 2: Review of Probability

Chapter References: Pages 14-64

2.1 Probability and Random Variables
Random Variable: numerical summary of a random outcome. Random variables can be discrete or continuous.

Discrete Example (Computer Crashes):

Outcome (crashes)          0      1      2      3
Probability                0.80   0.10   0.05   0.05
Cumulative Probability     0.80   0.90   0.95   1.00

Continuous:

Cumulative Density Function (CDF): same as in the discrete case,

    F(x) = \Pr(X \le x) = \int_{-\infty}^{x} p(t) \, dt

Probability Density Function (PDF): the probability at each (continuous) point,

    f(x) = \frac{d}{dx} F(x)

Best Reference: pg. 35 (Key Concept 2.3) of S&W!
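To see the relationship between the probability distribution and the cumulative distribution in the table above, you can simply accumulate the probabilities. A minimal sketch in Python (assuming NumPy is available):

    import numpy as np

    # pmf from the computer-crash table above
    p = np.array([0.80, 0.10, 0.05, 0.05])

    # The CDF at each outcome is the running sum of the probabilities.
    cdf = np.cumsum(p)
    print(cdf)  # [0.8  0.9  0.95 1.  ] -- matches the cumulative column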

2.2 Expectation
The expectation is the sum of all possible values a random variable X can take, weighted by the probability of each outcome:

    E[X] = \sum_{i} x_i P(X = x_i)

Notation: E[X] = \mu_X

Expectation property:

    E[aX + bY + c] = a E[X] + b E[Y] + c

2.3 Variance

    Var[X] = E[(X - \mu_X)^2] = \sigma_X^2 = \sum_{i} (x_i - \mu_X)^2 P(X = x_i)

Variance property:

    Var[aX + bY + c] = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab \, \sigma_{XY}
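Both definitions, and the linearity properties, can be checked numerically. A minimal sketch using the crash distribution above (NumPy assumed; the outcome labels 0-3 follow the table):

    import numpy as np

    x = np.array([0, 1, 2, 3])              # outcomes
    p = np.array([0.80, 0.10, 0.05, 0.05])  # P(X = x)

    EX = np.sum(x * p)                  # E[X] = sum of x_i * P(X = x_i)
    VarX = np.sum((x - EX) ** 2 * p)    # Var[X] = E[(X - mu_X)^2]

    # Property check: E[aX + c] = a*E[X] + c and Var[aX + c] = a^2 * Var[X]
    a, c = 2.0, 3.0
    print(EX, VarX)
    print(a * EX + c, a**2 * VarX)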

2.4 Covariance and Correlation

    Cov(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]

    Corr(X, Y) = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}

2.5 Independent Draws (i.i.d.)
Observations X_1, ..., X_n are independently and identically distributed (i.i.d.) when every draw comes from the same distribution and is independent of every other draw.

Chapter 3: Review of Statistics

Chapter References: Pages 64-107

3.1 Properties of Random Variables
Centrality: Average = \mu_X = E(X); Median = m such that P(X \le m) = 0.5
Dispersion: X takes on different values centered around the mean, with the spread depending on the variance.
Skewness: E[(X - \mu_X)^3] / \sigma_X^3; a skewed distribution has a tail on one side.
Kurtosis: E[(X - \mu_X)^4] / \sigma_X^4; measures the thickness of the tails.

3.2 Properties of Good Estimators
Unbiasedness: the expectation of the estimator is equal to the population parameter, E[\hat{\theta}] = \theta.
Consistency: as we approach infinite data, the estimator gets close to the true parameter, i.e.

    \Pr[|\hat{\theta} - \theta| < \varepsilon] \to 1 \text{ as } N \to \infty

This is convergence in probability, written \hat{\theta} \to_p \theta. Intuition: when the sample size is large, our estimate is approximately equal to the true parameter.
Efficiency: \hat{\theta} is more efficient than \tilde{\theta} if it has a lower variance, Var(\hat{\theta}) \le Var(\tilde{\theta}).
Best linear unbiased: \hat{\theta} is the best linear unbiased estimator if it has the lowest variance of any unbiased estimator, Var(\hat{\theta}) \le Var(\tilde{\theta}) for all \tilde{\theta} with E[\tilde{\theta}] = \theta.

3.3 Statistics Theorems
Sample average: for a sample X_1, ..., X_n, the sample average is \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i. Its mean and variance are:

    E(\bar{X}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} \, n \mu_X = \mu_X

    Var(\bar{X}) = Var\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{1}{n^2} \, n \sigma_X^2 = \frac{\sigma_X^2}{n}, \qquad \lim_{n \to \infty} \frac{\sigma_X^2}{n} = 0

Law of Large Numbers: as the sample size n approaches infinity, the sample mean \bar{X} converges in probability to the population mean:

    \bar{X} = \frac{1}{n} \sum_{i} X_i \to_p \mu_X

Central Limit Theorem: as n approaches infinity, the distribution of the sample mean approaches a normal distribution:

    \bar{X} \to_d N(\mu_X, \sigma_X^2 / n) \quad \Rightarrow \quad Z = \frac{\bar{X} - \mu_X}{\sigma_{\bar{X}}} \to_d N(0, 1)
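Both theorems are easy to see in a small simulation. A sketch assuming NumPy, drawing repeated samples from a skewed exponential population with mean \mu = 1 (so \sigma^2 = 1):

    import numpy as np

    rng = np.random.default_rng(0)

    # 5000 samples of size n = 500 from an exponential population (mean 1, variance 1)
    mu, n, reps = 1.0, 500, 5000
    sample_means = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

    # LLN / consistency: the sample means cluster tightly around mu.
    print(sample_means.mean())   # close to 1.0
    print(sample_means.var())    # close to sigma^2 / n = 1/500 = 0.002

    # CLT: the standardized sample mean is approximately N(0, 1),
    # even though the underlying population is skewed.
    z = (sample_means - mu) / np.sqrt(1.0 / n)
    print(z.mean(), z.std())     # close to 0 and 1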


Chapter 4: Simple Linear Regression

Introduction to Linear Regression

Important Note: There is a true or population model, and there is our sample model. We cannot know the true model, but we can study how close our sample estimator gets to it. Econometrics is about understanding and building upon our knowledge of how different estimations of sample models are related to the true model.

For a start, let us assume that our variables of interest Y and X are related to each other linearly. We posit a statistical model of the form:

    Y_i = \beta_0 + \beta_1 X_i + u_i

where
\beta_0 is the intercept of the population regression line,
\beta_1 is the slope of the population regression line,
u_i is the error term, sometimes also known as the residual.

The subscript i refers to observations in the population that run from i = 1 to N. Collectively, \beta_0 and \beta_1 are known as the coefficients or parameters of the population regression function.

OLS Assumptions

1. Conditional Mean of Errors Is Zero:

    E[u_i | X_i] = 0

2. i.i.d. Sampling: if we are drawing a sample of characteristics X_i from a population, the i.i.d. assumption implies that every draw of a different observation is independent from the previous draw, i.e. X_i is independent from X_j for all i \ne j.

3. Large Outliers Are Unlikely:

    E[X_i^4] < \infty, \qquad E[Y_i^4] < \infty

Notation Note: As a rule, if a true parameter is denoted \theta, then its sample estimate is denoted \hat{\theta}.

4.1 Ordinary Least Squares Estimator

The Ordinary Least Squares (OLS) estimator chooses the regression coefficients (b_0, b_1) so that the estimated sample regression line is as close as possible to the observed data. From here on, instead of the notation (b_0, b_1), we will use (\hat{\beta}_0, \hat{\beta}_1) to denote the OLS estimators for the population regression parameters (\beta_0, \beta_1). We will also use \hat{u}_i to denote the estimate of the error term (also known as the fitted residual). Re-writing our sample regression line, we obtain:

    Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i

Notice that the OLS regression coefficients and the error term for our sample regression line have a hat on top of them. This notation enables us to differentiate between the actual population parameters (without the hats) and the estimated parameters (with the hats): \hat{\beta}_0 is the OLS estimator for \beta_0.

To formalize what is meant by "as close as possible to the observed data", the regression coefficients (\hat{\beta}_0, \hat{\beta}_1) are chosen to minimize the Sum of Squared Residuals (SSR), the sum of the squared differences between the values predicted by our estimated linear regression and the actual values:

    \min \sum_{i=1}^{N} \hat{u}_i^2 = \min \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2

where the predicted value of Y_i based on the sample regression line is

    \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i

and the part of the outcome that is not explained by the estimated model is the fitted residual:

    \hat{u}_i = Y_i - \hat{Y}_i

Solving this minimization gives the OLS estimator formula:

    \hat{\beta}_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} = \frac{s_{XY}}{s_X^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}

Interpretation: suppose we estimate the model Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{u}_i. If we were to read a Stata output, the coefficients would be interpreted as follows:

\hat{\beta}_0: the estimator that captures the expected effect on Y when X is zero. Depending on the relation, the constant \hat{\beta}_0 may or may not have a useful interpretation.

\hat{\beta}_1: the predicted effect of a one-unit change in X on Y. That is to say, a one-unit change in X should lead to a \hat{\beta}_1-unit change in Y.

\hat{u}_i: the prediction error, also known as the difference between the observed value of Y and the predicted value. If \hat{u} = 0, it means that our sample regression perfectly predicts the population regression.
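To make these formulas concrete, here is a minimal sketch (assuming NumPy is available) that computes \hat{\beta}_0, \hat{\beta}_1, and the fitted residuals; the data are purely hypothetical:

    import numpy as np

    # Hypothetical data for illustration only
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Slope: sample covariance of X and Y over sample variance of X
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    # Intercept: makes the line pass through the point of means
    b0 = Y.mean() - b1 * X.mean()

    Y_hat = b0 + b1 * X      # predicted values
    u_hat = Y - Y_hat        # fitted residuals

    print(b0, b1)
    print(u_hat.sum())       # ~0: with an intercept, OLS residuals sum to zero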

Measures of Fit

Explained Sum of Squares (ESS): the amount of deviation of the predicted values of Y, \hat{Y}_i, from the sample average \bar{Y}:

    ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2

Total Sum of Squares (TSS): the deviation of the actual values of Y from their sample average \bar{Y}:

    TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2

Sum of Squared Residuals (SSR): the sum of the squared differences between the values predicted by our estimated linear regression and the actual values:

    SSR = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2

These three always satisfy

    TSS = ESS + SSR

R^2: the ratio of the variation in Y which is explained by X. We want a high R^2!

    R^2 = \frac{ESS}{TSS} = \frac{TSS - SSR}{TSS} = 1 - \frac{SSR}{TSS}
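A short sketch (again with the hypothetical numbers from the OLS example above) verifying the decomposition and the equivalent expressions for R^2:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # np.polyfit returns coefficients highest-degree first: [slope, intercept]
    b1, b0 = np.polyfit(X, Y, deg=1)
    Y_hat = b0 + b1 * X

    ESS = np.sum((Y_hat - Y.mean()) ** 2)
    TSS = np.sum((Y - Y.mean()) ** 2)
    SSR = np.sum((Y - Y_hat) ** 2)

    print(np.isclose(TSS, ESS + SSR))    # True: TSS = ESS + SSR
    print(ESS / TSS, 1 - SSR / TSS)      # the two R^2 expressions agree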

Hypothesis Testing

General form of a t-statistic:

    t = \frac{\text{estimator} - \text{hypothesized value under } H_0}{\text{standard error of the estimator}}

T-statistic: the distance of your observed value from your null hypothesis.
P-value: the smallest level of significance at which your null hypothesis will be rejected.

Hypothesis testing conditions and conclusions:
If |t-stat| > critical value: reject the null hypothesis.
If p-value < significance level: reject the null hypothesis.

Calculating a confidence interval: estimator \pm SE \times critical value.

Calculating the t-statistic:
For a single variable: t = \frac{\text{observed value} - H_0 \text{ value}}{\text{standard error}}

For a regression parameter (hypothesis testing of \beta_1): t = \frac{\hat{\beta}_1 - \beta_{1,H_0}}{SE(\hat{\beta}_1)}

For a difference of means: t = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_X^2}{N_X} + \frac{s_Y^2}{N_Y}}}
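A minimal worked sketch of a two-sided test on a sample mean (assuming NumPy and SciPy are available; the data are made up):

    import numpy as np
    from scipy import stats

    x = np.array([0.8, 1.3, -0.2, 2.1, 0.9, 1.7, 0.4, 1.1])  # hypothetical sample
    n = len(x)

    se = x.std(ddof=1) / np.sqrt(n)     # standard error of the sample mean
    t_stat = (x.mean() - 0.0) / se      # testing H0: population mean = 0

    # Two-sided p-value and a 95% confidence interval from the t distribution
    p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df=n - 1))
    crit = stats.t.ppf(0.975, df=n - 1)
    ci = (x.mean() - crit * se, x.mean() + crit * se)

    print(t_stat, p_value, ci)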

2. Exercises

GSI: Alessandra, Byoungchang, Caroline, Nicholas

2.1 Short Questions

1. [Properties of expected values and variance] Let X, Y and Z be random variables, where X \sim N(3, 25), Y \sim N(1, 4) and Z \sim N(16, 1). X is independent from Y, and Cov(X, Z) = 3 while Cov(Y, Z) = 7. Using the information given above, compute:

(a) [E(X + Z)]^2
(b) E(2X^2 + 3Y)
(c) Cov(X + 3Y, 2Y + 3Z)
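Two identities do most of the work in (b) and (c), stated here as a hint:

    E[X^2] = Var(X) + (E[X])^2

    Cov(aX + bY, cW + dV) = ac \, Cov(X, W) + ad \, Cov(X, V) + bc \, Cov(Y, W) + bd \, Cov(Y, V)

together with Cov(Y, Y) = Var(Y) and Cov(X, Y) = 0 when X and Y are independent.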

2. [Summation operator] Using the summation rules and the definitions \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i and \bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i, prove that:

(a) Cov(X, Y) = \frac{1}{N} \sum_{i=1}^{N} X_i Y_i - \bar{X} \bar{Y}

(b) Var(X) = \frac{1}{N} \sum_{i=1}^{N} X_i^2 - \bar{X}^2
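For instance, a sketch of the algebra for part (b) (part (a) follows the same pattern):

    Var(X) = \frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})^2
           = \frac{1}{N} \sum_{i=1}^{N} X_i^2 - 2\bar{X} \cdot \frac{1}{N} \sum_{i=1}^{N} X_i + \bar{X}^2
           = \frac{1}{N} \sum_{i=1}^{N} X_i^2 - 2\bar{X}^2 + \bar{X}^2
           = \frac{1}{N} \sum_{i=1}^{N} X_i^2 - \bar{X}^2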

3. [Probability distribution] Let Y denote the number of heads that occur when two coins are tossed.

(a) Derive the probability distribution of Y.
(b) Derive the cumulative probability distribution of Y.
(c) Compute the mean and variance of Y.

4. [Regression] The dataset BWGHT.RAW contains data on births to women in the United States. The variables of interest are the dependent variable, infant birth weight in ounces (bwght), and an explanatory variable, the average number of cigarettes the mother smoked per day during pregnancy (cigs). The following simple regression was estimated using data on n = 1,388 births:

    \widehat{bwght} = 119.77 - 0.514 \, cigs

(a) What is the predicted birth weight when cigs = 0? What about when cigs = 20 (one pack per day)? Comment on the difference.
(b) Does this simple regression necessarily capture a causal relationship between the child's birth weight and the mother's smoking habit? Explain.
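As a plug-in illustration of how the estimated line generates predictions (the mechanical part of question (a)):

    \widehat{bwght}\big|_{cigs=0} = 119.77, \qquad \widehat{bwght}\big|_{cigs=20} = 119.77 - 0.514 \times 20 = 109.49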

Properties of Estimators

Consider the model Y_i = \beta_0 + \beta_1 X_i + u_i.

- Suppose you know that \beta_0 = 0. Derive a formula for the least squares estimator of \beta_1.
- Suppose you know that \beta_0 = 4. Derive a formula for the least squares estimator of \beta_1.

Exercises

1. [Unbiasedness] Assume X_i, i = 1, ..., N, are i.i.d. random variables. Are the following estimators biased? If so, calculate the bias.

(a) M_1 = \frac{2X_1 + 3X_2}{5}

(b) M_2 = \frac{1}{N+1} \sum_{i=1}^{N} X_i

(c) M_3 = \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i
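A quick way to build intuition before doing the algebra is to simulate the estimators and compare their averages to the true mean. A minimal sketch (assuming NumPy; the population and sample size are made up):

    import numpy as np

    rng = np.random.default_rng(1)
    mu, N, reps = 5.0, 10, 100_000

    X = rng.normal(loc=mu, scale=2.0, size=(reps, N))   # i.i.d. draws, true mean 5

    M1 = (2 * X[:, 0] + 3 * X[:, 1]) / 5    # (2*X1 + 3*X2)/5
    M2 = X.sum(axis=1) / (N + 1)            # sum of X_i over (N + 1)
    M3 = X.mean(axis=1)                     # sample average

    # Averaging over many replications approximates each estimator's expectation;
    # compare each to mu to spot bias.
    print(M1.mean(), M2.mean(), M3.mean(), "vs mu =", mu)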

2. [Efficiency] Consider X_i, i = 1, ..., N, i.i.d. random variables. Is the estimator M_1 = \frac{2X_1 + 3X_2}{5} more efficient than \bar{X} = \frac{X_1 + X_2}{2}? Justify your answer.

3. [Linear Regression] Suppose that there are four friends who attend an Introductory Econometrics course, and they all attend all lectures and sections. The four friends, however, differ in the amount of hours they devote to studying. Table 1 reports the number of hours each student spent studying for the midterm and their grade on the exam, which is graded on a scale from 0 to 100.

Table 1: Hours spent studying and grades

Student   Hours   Grade
1         20      55
2         30      65
3         60      90
4         70      95

ESS = 1112.13 and TSS = 1118.75

(a) Compute the OLS estimators for the slope and constant you obtain by regressing grades on hours spent studying. Interpret your estimates.
(b) Compute the R^2 for this regression and interpret it.
(c) Figure 1 reports the STATA output obtained by regressing grades on hours worked. Do \hat{\beta}_0 and \hat{\beta}_1 correspond to those you calculated by hand? What about the R^2?
(d) Find a measure of goodness of fit in the regression output other than the R^2 and its adjusted version, and comment on it.
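If you want to check your hand calculations (and part (c)) without STATA, a minimal sketch in Python reproduces the regression and R^2 from the Table 1 data:

    import numpy as np

    hours = np.array([20.0, 30.0, 60.0, 70.0])
    grade = np.array([55.0, 65.0, 90.0, 95.0])

    # OLS by the formulas from the review above
    dx = hours - hours.mean()
    b1 = np.sum(dx * (grade - grade.mean())) / np.sum(dx ** 2)
    b0 = grade.mean() - b1 * hours.mean()

    pred = b0 + b1 * hours
    ESS = np.sum((pred - grade.mean()) ** 2)   # should match 1112.13
    TSS = np.sum((grade - grade.mean()) ** 2)  # should match 1118.75

    print(b0, b1, ESS / TSS)   # compare with your hand-derived values and R^2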
2.2 Stock & Watson Questions

Chapter 2: Probability
Chapter 3: Statistics
Chapter 5: Hypothesis Testing

Thanks to previous GSI Edson Severnini for sharing his section notes.