Econometrics III, MT5

Computer Based Assignment


Due June 27, Friday (by 11:55pm)

General Rules
The table below assigns each student a variant. Please find your variant number in the table and
follow these instructions to complete the problems assigned to your variant. All the data files that you will need to
complete the assignment are located in the shared folder “CBA data”.

Last name first letter(s)    Variant
A-E                          1
G-Ku                         2
Kv-O                         3
P-Z                          4

Please note that this assignment is strictly individual and must be completed on your own. Cases of
plagiarism in programming or in answers to problems will be strictly punished.

You should submit 2 files as part of your assignment:

1) A working Stata do-file that performs all the steps involved in solving the assigned
problems. Add as many explanatory notes to your do-file as are necessary to
understand your program. In particular, note your name, the dataset(s) that
you are working with, and the problem number before the set of commands addressing
that problem (follow the numbering in Wooldridge, for example: *** Problem C10.8, part a).
Name this file with the first 3 letters of your first name followed by the
first 3 letters of your last name. For example, Saba Metreveli will submit the file Sabmet.do.
2) A document file with your written answers to the problems (explanations, interpretations,
conclusions). Give complete answers to all questions. Mark each answer
with the number of the corresponding question (e.g. Problem C10.8, part a). Follow the
same naming rule as above (Saba Metreveli will submit the file Sabmet.doc).

Submission directions:

- Upload both files (.do and .doc) to Moodle, in the folder “Computer
Based Assignment” under Week 7. If one or both files are not submitted by the
deadline, your assignment will not be considered submitted on time.
- Late submissions will be penalized as follows: -20% of the earned grade if
submitted on the first day after the deadline, -50% if submitted on the second
day, -80% if submitted on the third day. Beyond that you will not earn any points
for your assignment. Late assignments should be e-mailed to the instructor (use “Late
Assignment” in the subject line of your message).

The grade for the assignment will be based on the completeness and academic quality of your answers to
the assigned problems, the quality of your programming, and the timely submission of the assignment. If you have any
questions, you can address them to the course TAs and/or send an e-mail to the instructor.

Problems
ALL students should do the first two exercises. These exercises use generated data to demonstrate the
main issues with missing data and special observations. Follow the do-file “Project exercises.do”. Instead of
DDMM, fill in your day and month of birth. Example: if you were born on the 21st of February, you will use 2102
as your seed number.

Exercise 1: Missing data

a) Carefully examine the provided code. Based on the code, what is the equation of the TRUE
regression line? Write this equation down and keep it handy when looking at the results of the
estimations that follow. With observational data we never observe the parameter values or the
errors, but with generated data you KNOW these things.
b) Follow the do-file to estimate the model by OLS on the full sample. Compare the estimates
obtained with the TRUE parameter values. Note the standard errors produced by this estimation
and the estimated error variance. Comment on the “closeness” of your estimates to the true
parameters.
c) Now continue with the do-file and estimate the model on a sub-sample selected
on the values of x1. Compare the OLS results for the full sample, the sample with mild selection,
and the sample with substantial selection on x1. How do the parameter estimates and their
standard errors change in each of these cases? Are these changes in line with your
expectations? Are the true parameters still within the 95% confidence intervals?
d) Now program a case yourself in which the sample is selected on y: mild selection with
y>8 and more substantial selection with y>18. Repeat the steps in part c) and report your findings.
e) Program the case of random selection, unrelated to the value of y (you could simply take the first 700
observations as the case of mild selection and the first 300 observations as the case of
substantial selection). Repeat the steps in parts c) and d) and report your findings.
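
The contrast that parts c) to e) ask you to explore can be previewed with a minimal sketch. It is written in Python rather than Stata, uses only one regressor, and is not the provided do-file: the true line y = 1 + 2x + u, the distributions, the seed 2102, the cutoffs and every function name are assumptions made for illustration only. Your own results must come from the model in “Project exercises.do”.

```python
import random


def ols_simple(xs, ys):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(seed=2102, n=5000):
    """Generate (x, y) pairs from the assumed true line y = 1 + 2x + u."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(5.0, 2.0)   # assumed regressor distribution
        u = rng.gauss(0.0, 3.0)   # assumed error distribution
        data.append((x, 1.0 + 2.0 * x + u))
    return data


def fit(data, keep=lambda x, y: True):
    """Fit OLS on the observations for which keep(x, y) is true."""
    kept = [(x, y) for x, y in data if keep(x, y)]
    xs, ys = zip(*kept)
    intercept, slope = ols_simple(xs, ys)
    return intercept, slope, len(kept)


data = simulate()
results = {
    "full sample": fit(data),
    "selection on x (x > 5)": fit(data, lambda x, y: x > 5.0),
    "selection on y (y > 11)": fit(data, lambda x, y: y > 11.0),
    "random (first 1500 obs)": fit(data[:1500]),
}
for label, (b0, b1, n_obs) in results.items():
    print(f"{label:26s} n={n_obs:5d}  intercept={b0:6.3f}  slope={b1:6.3f}")
```

In this sketch one would expect the slope to stay near 2 under selection on x and under random selection, while selection on y pulls it below 2. Standard errors are not computed here; in Stata they are part of the regression output.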

Exercise 2: Special observations

a) Continue with the sample generated in exercise 1. The do-file shows you how to generate
several leverage points in the sample (Note: use your day of birth instead of DD). Note that the y
values for leverage points are recalculated to reflect the new value of x2 for these observations.
Generate a scatterplot of the y values versus x2. Comment on the graph. Run OLS estimation of the
model (same variables as before) with and without the leverage points. Comment on the changes
you observe in the model estimates.
b) Triple the number of leverage points in your data. Run the regression with and without these
leverage points and compare the results. What is the effect of having even more leverage
points on the regression results?
c) Follow the do-file to create a few outliers in the dataset (note that the first 100 observations, some of
which were leverage points, are dropped). Use your month of birth instead of MM. Make a
scatterplot of y against x3. Do you notice the outliers? Run a regression with the outliers and
predict the residuals from this regression. Can you detect the outliers by catching cases where
the residuals exceed 2*StErr(residuals) in absolute value?
d) Run the regression with and without the outliers, then save and compare the results. Can you say
that any of the outliers happen to be influential data points?
e) Create a “super”-outlier from observation 900 by multiplying its error by 500. Generate the
corresponding y value for that observation. Repeat the steps in d). Does this new outlier seem to be
an influential point?
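
The residual rule in part c) and the with/without comparison in part d) can be sketched as follows. Again this is an illustrative Python sketch, not the provided Stata do-file: the model y = 1 + 2x + u, the seed 21, the three observation indices and the error value 8 are all hypothetical, and StErr(residuals) is taken here to be the square root of SSR/(n - 2).

```python
import random


def ols_simple(xs, ys):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(21)               # hypothetical seed
n = 500
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
us = [rng.gauss(0.0, 1.0) for _ in range(n)]
planted = [10, 200, 450]              # hypothetical outlier positions
for i in planted:
    us[i] = 8.0                       # replace the error with a large value
ys = [1.0 + 2.0 * x + u for x, u in zip(xs, us)]

# Regression with the outliers, then residuals and their standard error.
b0, b1 = ols_simple(xs, ys)
resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
se_resid = (sum(r * r for r in resid) / (n - 2)) ** 0.5
flagged = [i for i, r in enumerate(resid) if abs(r) > 2.0 * se_resid]

# Regression without the planted outliers, for comparison.
keep = [i for i in range(n) if i not in planted]
c0, c1 = ols_simple([xs[i] for i in keep], [ys[i] for i in keep])

print("flagged observations:", flagged)
print(f"with outliers:    intercept={b0:.3f}  slope={b1:.3f}")
print(f"without outliers: intercept={c0:.3f}  slope={c1:.3f}")
```

The rule typically flags some ordinary observations too, since roughly 5% of normally distributed residuals lie beyond two standard errors. Whether a flagged point is influential is judged by how much the estimates move when it is dropped.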

The rest of the problems are variant-specific. All the problems are from Wooldridge (3rd edition).

Variant 1:
C10.11
C10.7
C11.9
C18.2
C13.2
C14.10 (i-v)

Variant 2:
C10.1
C10.2
C10.8
C11.7
C18.11
C13.10
C14.6

Variant 3:
C10.5
C10.10
C11.6
C18.4
C18.5
C13.12
C14.7

Variant 4:
C10.6
C10.12
C11.4
C18.13
C13.11
C14.8
