
MID TERM

SOLUTION
PREVIOUS MID TERM

The nature of Dummy Variables


In regression analysis the dependent variable is frequently influenced not
only by ratio scale variables (e.g., income, output, prices, costs, height, and
temperature) but also by variables that are essentially qualitative, or nominal
scale, in nature, such as sex, race, color, religion, nationality, geographical
region, political upheavals, and party affiliation. For example, holding all
other factors constant, female workers are found to earn less than their male
counterparts, and nonwhite workers are found to earn less than whites. This
shows that qualitative variables are no less important than quantitative ones
and should be included in the regression analysis.

Dummy variables (DV) usually indicate the presence or absence of a "quality"
or an attribute. How to quantify them? Construct artificial variables that
take on values of 1 or 0, with 1 indicating the presence of the attribute and
0 its absence. Dummy variables are thus essentially a device to classify data
into mutually exclusive categories such as male or female.

How to incorporate them in regression models: dummy variables can be
incorporated in regression models just as easily as quantitative variables. In
fact, a regression model may contain regressors that are all exclusively dummy,
or qualitative, in nature. Such models are called Analysis of Variance
(ANOVA) models.

For example, variables such as sex (male or female), colour (black, white),
nationality, and employment status (employed, unemployed) are defined on a
nominal scale. Such variables do not have any natural scale of measurement.
They usually indicate the presence or absence of a quality or an attribute:
employed or unemployed, graduate or non-graduate, smoker or nonsmoker, yes or
no, acceptance or rejection. Such variables can be quantified by artificially
constructing variables that take the values 1 and 0, where 1 usually indicates
the presence of the attribute and 0 its absence. For example, 1 indicates that
the person is male and 0 that the person is female. Similarly, 1 may indicate
that the person is employed and 0 that the person is unemployed. Such
variables classify the data into mutually exclusive categories. These
variables are called indicator variables or dummy variables. Usually, the
dummy variables take on the values 0 and 1 to identify the
mutually exclusive classes of the explanatory variables. For example, 1 if the
person is male and 0 if the person is female; 1 if the person is employed and
0 if the person is unemployed.
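
The 0/1 coding described above can be sketched in a few lines of Python. This is only an illustration: the three sample workers and the helper name `dummy` are hypothetical and do not come from the exam.

```python
# Hypothetical sample data, used only to illustrate 0/1 coding.
workers = [
    {"sex": "male", "status": "employed"},
    {"sex": "female", "status": "employed"},
    {"sex": "female", "status": "unemployed"},
]


def dummy(value, category):
    """Return 1 if value belongs to the given category, else 0."""
    return 1 if value == category else 0


# 1 = male, 0 = female
male = [dummy(w["sex"], "male") for w in workers]
# 1 = employed, 0 = unemployed
employed = [dummy(w["status"], "employed") for w in workers]
# The complementary category is fully determined by the first dummy.
female = [1 - m for m in male]

print(male)      # [1, 0, 0]
print(employed)  # [1, 1, 0]
print(female)    # [0, 1, 1]
```

Because the categories are mutually exclusive, `male` and `female` always sum to 1 for every observation, which is why a regression with an intercept would normally include only one of the two.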

Consider an equation to explain salaries of CEOs in terms of annual firm
sales, return on equity (roe, in percentage form), and return on the firm's
stock (ros, in percentage form):

$\log(salary) = \beta_0 + \beta_1 \log(sales) + \beta_2\, roe + \beta_3\, ros + u$
(i) In terms of the model parameters, state the null hypothesis that, after
controlling for sales and roe, ros has no effect on CEO salary. State the
alternative that better stock market performance increases a CEO's salary.
Ans. $H_0: \beta_3 = 0$. $H_1: \beta_3 > 0$.
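(ii) Using a data set on firms, suppose the following was obtained via OLS
(standard errors in parentheses, in the same order as the coefficients):

$\widehat{\log(salary)} = 4.32 + .280 \log(sales) + .0174\, roe + .00024\, ros$
(.32) (.035) (.0041) (.00054)
$n = 209$, $R^2 = .283$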
By what percentage is salary predicted to increase if ros increases by 50
points? Does ros have an economically large relationship with salary?
Ans. - Recall that we interpret a log-level model as
$\%\Delta salary \approx (100 \cdot \beta_3)\, \Delta ros$.
- So a 50 point increase in ros is associated with an increase in salary of
1.2 percent ($.00024 \times 50 = .012$, and $100 \times .012 = 1.2$).
- A 1.2 percent increase in salary associated with a 50 percentage point
increase in the return on a firm's stock does not seem economically
meaningful.
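
The arithmetic in part (ii) can be checked with a short Python sketch. The variable names are illustrative, and the "exact" figure is an extra comparison that the exam answer does not ask for; it uses the standard conversion 100(exp(change in log) - 1).

```python
import math

beta_ros = 0.00024  # estimated coefficient on ros from the fitted equation
delta_ros = 50      # change in ros, in percentage points

# Log-level approximation: %change in salary is about 100 * beta * change in ros.
pct_change_approx = 100 * beta_ros * delta_ros

# Exact percentage change implied by the same change in log(salary).
pct_change_exact = 100 * (math.exp(beta_ros * delta_ros) - 1)

print(round(pct_change_approx, 2))  # 1.2
print(round(pct_change_exact, 2))   # 1.21
```

The approximation and the exact value agree to within a hundredth of a percentage point here, because the implied change in log(salary) is very small.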
(iii) Test the null hypothesis that ros has no effect on salary against the
alternative that ros has a positive effect. Carry out the test at the 10%
significance level.
Ans. The 10% critical value for a one-tailed test, using df = 200, is 1.29
(from the t table); the exact degrees of freedom are 209 - 3 - 1 = 205.
The t statistic on ros is .00024/.00054 = .44, which is well below the
critical value. Therefore, we fail to reject $H_0$ at the 10% significance
level and say that the relationship between ros and salary is statistically
indistinguishable from zero.
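
As a quick check of the test in part (iii), the sketch below recomputes the t statistic and compares it with the critical value quoted in the answer. The critical value 1.29 is copied from the text, not computed; the variable names are illustrative.

```python
coef_ros = 0.00024     # estimated coefficient on ros
se_ros = 0.00054       # its standard error
critical_value = 1.29  # 10% one-tailed critical value quoted in the answer

n, k = 209, 3          # observations and number of slope coefficients
df = n - k - 1         # degrees of freedom of the t statistic

t_stat = coef_ros / se_ros
reject_h0 = t_stat > critical_value  # one-tailed test against beta_3 > 0

print(df, round(t_stat, 2), reject_h0)  # 205 0.44 False
```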
(iv) Would you include ros in a final model explaining CEO compensation in
terms of firm performance?
Ans. Based on this sample, ros is not a statistically significant predictor of
CEO compensation. However, including ros may not be causing harm.
Q. What does this depend on?
- It depends on how correlated ros is with the other independent variables.
The following table contains the ACT scores and the GPA
(grade point average) for eight college students. Grade
point average is based on a four-point scale and has been
rounded to one digit after the decimal.
Student  GPA  ACT
1        2.8  21
2        3.4  24
3        3.0  26
4        3.5  27
5        3.6  29
6        3.0  25
7        2.7  25
8        3.7  30

(i) Estimate the relationship between GPA and ACT using OLS; that is, obtain
the intercept and slope estimates in the equation
$\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1 ACT$

Comment on the direction of the relationship. Does the intercept have a useful
interpretation here? Explain. How much higher is the GPA predicted to be if
the ACT score is increased by five points?

The easiest way to get the answer to this question would be to enter the data
by hand into Stata (or Excel) and then run a simple OLS regression. Since
there are only 8 data points, this would be pretty quick. Or, again because
there aren't many data points, you could just compute the OLS coefficients
using the formula on page 29 and a calculator. I used Stata (the easiest way
to enter this number of data points is by using the edit command and just
filling in the spreadsheet) and found the following coefficients (rounded to
two decimal places):
$\hat{\beta}_0 = 0.57$, $\hat{\beta}_1 = 0.10$.
The direction of the relationship is positive, as expected. Students with a
higher ACT score tend to have a higher GPA. Whether this has to do with
intelligence, test-taking ability, or whatever, I am not sure. But the
direction of the relationship is certainly expected. The literal
interpretation of the intercept is "the GPA if the ACT score was 0." Is that
even possible? Can you score a 0 on your ACT? Or do you get 10 points for
writing your name? I don't know. But either way, my answer to the question is
probably "no, the intercept does not have a useful interpretation." All of the
ACT scores in our data are well above 0, indicating to me that the notion of a
0 ACT score is well outside our sample, and thus well outside what I am
comfortable predicting with this model.
If the ACT score increases by 5 points, I expect to observe a 0.5 point
increase in GPA (= 0.1 × 5).
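
For readers without Stata, the same estimates can be reproduced by hand. The sketch below applies the usual simple-regression formulas (slope = sample covariance of ACT and GPA over sample variance of ACT, intercept = mean GPA minus slope times mean ACT) to the eight observations in the table. It is written in plain Python as an illustration and is not the Stata session used for the answer.

```python
# Data from the table of eight students.
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]
act = [21, 24, 26, 27, 29, 25, 25, 30]

n = len(gpa)
mean_act = sum(act) / n
mean_gpa = sum(gpa) / n

# Sums of cross products and squared deviations around the means.
s_xy = sum((x - mean_act) * (y - mean_gpa) for x, y in zip(act, gpa))
s_xx = sum((x - mean_act) ** 2 for x in act)

slope = s_xy / s_xx                       # estimate of beta_1
intercept = mean_gpa - slope * mean_act   # estimate of beta_0

print(round(intercept, 2), round(slope, 2))  # 0.57 0.1
print(round(5 * slope, 2))                   # 0.51
```

With the unrounded slope the five-point effect is about 0.51 rather than 0.5; the difference comes only from rounding the slope to 0.1 before multiplying.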
(iii) What is the predicted value of GPA when ACT = 20?

The predicted value of GPA when ACT = 20 is
$0.57 + 0.1 \times 20 = 2.57$.

Even though we don't have an observation with ACT = 20, we can predict what
would happen if we did, using our model.
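
The prediction is somewhat sensitive to how the coefficients are rounded. The sketch below compares the two-decimal coefficients used in the answer with four-decimal values recomputed from the same eight observations (0.5681 and 0.1022); the function name `predict_gpa` is hypothetical.

```python
def predict_gpa(intercept, slope, act_score):
    """Fitted value from a simple regression of GPA on ACT."""
    return intercept + slope * act_score


# Coefficients rounded as in the answer.
rounded = predict_gpa(0.57, 0.1, 20)
# Coefficients to four decimals, recomputed from the table data.
less_rounded = predict_gpa(0.5681, 0.1022, 20)

print(round(rounded, 2))       # 2.57
print(round(less_rounded, 2))  # 2.61
```

Either figure answers the question; the gap simply shows that rounding the slope before multiplying by 20 shifts the prediction by about 0.04 grade points.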
(iv) How much of the variation in GPA for these eight students is explained
by ACT? Explain.

About 58% of the variation in GPA is explained by ACT. We see this in the
R-squared statistic, which represents the proportion of the total variation in
GPA that is explained by all the explanatory variables. Since we only have one
explanatory variable, R-squared gives us exactly what the question is looking
for. (This would not be true if there were multiple explanatory variables.)
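
The R-squared figure can be verified directly from its definition, 1 - SSR/SST, using the eight observations in the table. This is a plain-Python sketch for checking the number, not the output of the original Stata run.

```python
# Data from the table of eight students.
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]
act = [21, 24, 26, 27, 29, 25, 25, 30]

n = len(gpa)
mean_act = sum(act) / n
mean_gpa = sum(gpa) / n

# OLS slope and intercept for the regression of GPA on ACT.
slope = (sum((x - mean_act) * (y - mean_gpa) for x, y in zip(act, gpa))
         / sum((x - mean_act) ** 2 for x in act))
intercept = mean_gpa - slope * mean_act

fitted = [intercept + slope * x for x in act]
ssr = sum((y - f) ** 2 for y, f in zip(gpa, fitted))  # residual sum of squares
sst = sum((y - mean_gpa) ** 2 for y in gpa)           # total sum of squares

r_squared = 1 - ssr / sst
print(round(r_squared, 3))  # 0.577
```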
