Decision Sciences II
Mid-Term Examination
Friday 19 October, 2018
Time: 180 minutes
Total No. of Pages: 21
Total No. of Questions: 3
Total marks: 40
Name ________________________
Roll No. ________________________
Section ________________________
Instructions
1. This is a closed-book exam. You are NOT allowed to use textbooks or class notes.
2. Answer all questions only in the space provided following the question.
3. Show all work and give adequate explanations to get full credit.
4. You may use the back of the last page for rough work only if needed. Do NOT attach any
rough work/sheets.
5. Encircle or underline your final answer for each part.
6. No clarifications will be made during the exam.
7. Assume a 95% confidence level if necessary (α = 0.05).
8. Use approximate critical values for Z, t and F tests if the exact value is not available in the tables
attached with the question paper.
Question 1 (8 points)
Data on Life Expectancy at birth (measured in years) and Infant Mortality (deaths per thousand
live births) is collected for 20 countries in South and East Asia and the Pacific Region.
The descriptive statistics related to the data are given in the table below:
A simple linear regression model is developed to predict Life Expectancy using Infant Mortality
as the only explanatory variable. The regression outputs obtained are given below:
SUMMARY OUTPUT
Regression Statistics
Multiple R
R Square
Adjusted R Square
Standard Error
Observations
ANOVA
df SS MS F Significance F
Regression 1040.828 140.893 6.01375E-10
Residual
Total 1173.800
Question 1.1
What is the change in Life Expectancy when Infant Mortality increases by one unit? (1 point)
Question 1.2
Calculate Life Expectancy for India given that it has an Infant Mortality of 51. (1 point)
Question 1.3
What percentage of variation in Life Expectancy can be explained by Infant Mortality? (1 point)
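The blank cells of the ANOVA table can be recovered from the printed ones. The following is a minimal sketch, assuming that 1040.828 is the regression sum of squares, that 1173.800 is the total sum of squares, and that the 20 countries are the 20 observations; the variable names are illustrative. The reconstructed F should come out close to the printed 140.893, which supports that reading of the table.

```python
# Values printed in the ANOVA table for Question 1
ss_regression = 1040.828   # assumed to be the regression sum of squares
ss_total = 1173.800        # assumed to be the total sum of squares
n_obs = 20                 # countries in the sample
k = 1                      # one explanatory variable (Infant Mortality)

# Degrees of freedom implied by n_obs and k
df_regression = k
df_residual = n_obs - k - 1
df_total = n_obs - 1

# Remaining sums of squares and mean squares
ss_residual = ss_total - ss_regression
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Share of the variation in Life Expectancy explained by the model
r_square = ss_regression / ss_total
# Standard error of the regression (root mean squared residual)
standard_error = ms_residual ** 0.5

print(f"df: regression={df_regression}, residual={df_residual}, total={df_total}")
print(f"SS residual={ss_residual:.3f}, MS residual={ms_residual:.3f}")
print(f"F={f_stat:.3f}, R Square={r_square:.4f}, Standard Error={standard_error:.3f}")
```

Small differences from the printed F are expected because the table values are rounded.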
Question 1.4
At the 95% confidence level, calculate the maximum value of Life Expectancy for Vietnam, which
has an Infant Mortality of 34. (2 points)
Question 1.5
At the 95% confidence level, test the hypothesis that Life Expectancy for New Zealand is 77, given
that Infant Mortality is 5. (2 points)
Question 1.6
Can the regression result be used to predict the Life Expectancy of a country in South Asia with
an Infant Mortality of 110? Why or why not? (1 point)
Question 2 (16 points)
Consider the following data on sericulture (production of silk and rearing of silkworms), collected
from 561 farmers in Karnataka, where the variables are listed in Table 2.1.
The descriptive statistics of these variables are shown in Table 2.2 below.
Model 1
A linear regression model was developed between income per acre as the dependent variable (Y)
and years of experience as the independent variable. The SPSS model outcome is shown in
Tables 2.3 to 2.5.
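[Tables 2.3 to 2.5, SPSS output for Model 1: only fragments survive. Recovered value: 44043.3492. Coefficients table columns: Unstandardized Coefficients (B, Std. Error), t, Sig.]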
Question 2.2
Is it possible to claim that the income per acre increases by at least 200 rupees for every one-year
increase in experience at a 5% significance level? (2 points)
Question 2.3
What is the average revenue earned by the first-time farmers? (1 point)
A second model is developed based on whether a farmer has undergone any training or not and
the results obtained are displayed in Table 2.6.
Table 2.6 Coefficients
Question 2.5
Is the variable “training” statistically significant at a 1% significance level? (1 point)
Question 2.6
Is it possible to conclude that the farmers should not attend the training program? State your
reasons clearly. (1 point)
Model 3 is developed using both training and experience as independent variables, and the
corresponding SPSS output is shown in Table 2.7.
Question 2.7
Comment on the type of correlation (positive or negative) that exists between training and
experience. Clearly show all the steps. (2 points)
Model 4 is developed using the following independent variables: (1) Training; (2) Experience;
and (3) TrainExp, which is an interaction variable between Training and Experience, and the
SPSS output for this model is recorded in Table 2.8 below.
Table 2.8 Coefficients
Question 2.8
Can we say that farmers who have attended the training program will always earn less than those
who did not attend the training program, irrespective of their experience? (2 points)
Model 5 is developed using stepwise regression with all variables included, and the results are
documented in Table 2.9 below.
Question 2.9
Among the continuous variables used in Model 5, which independent variable has the highest
impact on the income per acre? (1 point)
Model 6 is developed after adding a new variable “chawki_bivoltine”, which is a dummy variable
that captures whether the farmer used a hybrid variety called “bivoltine” (1 implies that the
farmer used the hybrid variety and 0 implies otherwise). The outputs are shown in Tables 2.10 and
2.11.
Figure 2.1
Question 3 (16 points)
The Election Commission (EC) and the Supreme Court have of late been trying to improve the
election process by looking at various data related to winners from different constituencies in the
recent elections. Some of this data comes from the EC, while the rest is gathered directly from the
winners. One of the questions that the EC is trying to understand is whether the margin of victory
depends on: (a) the financial strength of the candidate; (b) the candidate’s involvement in various
court cases; and (c) other demographic data related to the candidate. This would enable the EC to
better plan security and other arrangements for the next election cycle.
In this context, the EC uses the following data, which is given in Table 3.1 below.
There were other categorical and dummy variables that the EC was interested in:
Gp1, Gp2, Gp3 are three Groups from which the winners come (e.g., General, SC, ST)
Cases Win gt Loser = 1 if the winner was involved in more court cases than the runner up; 0
otherwise
Serious W gt L = 1 if the winner was involved in more serious court cases than the runner up; 0
otherwise.
Grad, PG, Professional, PhD, None are all 0-1 variables which take a value of 1 if the winner
had finished a graduate degree, a postgraduate degree, a professional degree, a PhD, or none of
these respectively, and 0 otherwise. The highest educational qualification obtained by the winner
was taken. Some winners did not go to college.
Various regressions were tried and the results obtained for some of them are given below.
Regression 1 Output
Regression Statistics
Multiple R 0.176394898
R Square 0.03111516
Adjusted R Square 0.023788999
Standard Error 1.133067839
Observations 534
ANOVA
df SS MS F Significance F
Regression 4 21.81058803 5.452647008 4.247130033 0.00215867
Residual 529 679.1528032 1.283842728
Total 533 700.9633912
              Coefficients  Standard Error  t Stat    P-value
Intercept     1.89774       0.09752         19.46022  0.00000
Grad          -0.36673      0.14378         -2.55057  0.01104
Professional  -0.39892      0.14783         -2.69844  0.00719
PG            -0.51749      0.13485         -3.83750  0.00014
PhD           -0.11008      0.22278         -0.49412  0.62143
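When every explanatory variable is a 0-1 dummy for one category, the intercept is the fitted mean of the omitted category and each coefficient is a difference from it. The sketch below applies this to the Regression 1 coefficients, assuming that "None" is the omitted education category (it does not appear in the output) and that the dependent variable is the margin of victory in the units of Table 3.1.

```python
# Coefficients printed in Regression 1 Output
intercept = 1.89774
coefficients = {
    "Grad": -0.36673,
    "Professional": -0.39892,
    "PG": -0.51749,
    "PhD": -0.11008,
}

# Assumption: "None" is the omitted dummy, so the intercept is its fitted mean
fitted_mean = {"None": intercept}
for level, coef in coefficients.items():
    # Each dummy coefficient shifts the fitted mean relative to "None"
    fitted_mean[level] = intercept + coef

for level, value in fitted_mean.items():
    print(f"{level:<13}{value:.5f}")
```

The P-values in the output indicate which of these differences from the baseline are statistically distinguishable from zero.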
Regression 2 Output
Regression Statistics
Multiple R 0.685085
R Square 0.469342
Adjusted R Square 0.468344
Standard Error 0.836179
Observations 534
ANOVA
            df   SS        MS        F        Significance F
Regression  1    328.9912  328.9912  470.528  3.18E-75
Residual    532  371.9722  0.699196
Total       533  700.9634

                 Coefficients  Standard Error  t Stat    P-value
Intercept        1.471912      0.036599        40.21676  2E-163
Assets 1 over 2  0.017381      0.000801        21.69166  3.18E-75
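With a single explanatory variable, the figures in this output are tied together: the t statistic is the coefficient divided by its standard error, F equals t squared, and R Square is the regression share of the total sum of squares. The sketch below checks these relations on the printed values; the variable names are illustrative, and the inputs are rounded, so the agreement is approximate.

```python
# Values printed in Regression 2 Output
coefficient = 0.017381      # slope on "Assets 1 over 2"
standard_error = 0.000801
t_printed = 21.69166
f_printed = 470.528
ss_regression = 328.9912
ss_total = 700.9634

# t is the coefficient divided by its standard error (inputs are rounded)
t_from_rounded = coefficient / standard_error

# In a simple linear regression, F equals the square of the slope's t statistic
f_from_t = t_printed ** 2

# R Square is the regression share of the total sum of squares
r_square = ss_regression / ss_total

print(f"t from rounded inputs = {t_from_rounded:.3f} (printed {t_printed})")
print(f"t squared = {f_from_t:.3f} (printed F {f_printed})")
print(f"R Square = {r_square:.6f}")
```

This also explains why the P-value of the slope and the Significance F are the same number, 3.18E-75.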
Regression 3 Output
Regression Statistics
Multiple R 0.69933132
R Square 0.489064296
Adjusted R Square 0.485200888
Standard Error 0.822816437
Observations 534
ANOVA
            df   SS        MS        F         Significance F
Regression  4    342.8162  85.70404  126.5888  9.51E-76
Residual    529  358.1472  0.677027
Total       533  700.9634
Regression 4 Output
Regression Statistics
Multiple R 0.129845
R Square 0.01686
Adjusted R Square 0.013157
Standard Error 1.139221
Observations 534
ANOVA
            df   SS        MS       F        Significance F
Regression  2    11.81811  5.90905  4.55304  0.01095
Residual    531  689.1453  1.29783
Total       533  700.9634

           Coefficients  Standard Error  t Stat    P-value
Intercept  1.66155       0.056678        29.31541  4.5E-113
Gp2        -0.41222      0.136612        -3.01748  0.002671
Gp3        -0.06565      0.177274        -0.37036  0.711265
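All four regressions share the same total sum of squares, so they appear to model the same dependent variable and can be compared on Adjusted R Square, which penalises additional explanatory variables. The sketch below recomputes it from each ANOVA table as a cross-check. It is one possible comparison criterion, not necessarily the one the examiners intend, and the function name is illustrative.

```python
n_obs = 534
ss_total = 700.9634

# (residual SS, number of explanatory variables) read from each ANOVA table
models = {
    "Regression 1": (679.1528, 4),
    "Regression 2": (371.9722, 1),
    "Regression 3": (358.1472, 4),
    "Regression 4": (689.1453, 2),
}

def adjusted_r_square(ss_residual, k):
    """Adjusted R Square = 1 - (residual mean square / total mean square)."""
    ms_residual = ss_residual / (n_obs - k - 1)
    ms_total = ss_total / (n_obs - 1)
    return 1 - ms_residual / ms_total

adjusted = {name: adjusted_r_square(ss, k) for name, (ss, k) in models.items()}
best = max(adjusted, key=adjusted.get)

for name, value in adjusted.items():
    print(f"{name}: Adjusted R Square = {value:.6f}")
print(f"Highest Adjusted R Square: {best}")
```

A higher Adjusted R Square indicates a better overall fit, but the most appropriate model for a given question also depends on which variables that question is about.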
Identify the most appropriate regression model and give proper explanations for your
answers to the following questions.
3.1 Are people who attended college likely to win with a greater margin? (2 points)
3.2 Given that the values of the variable “Assets 1 over 2” are within 3 standard deviations, what
is the maximum margin of victory? (3 points)
3.3 What is the best inference you can make about the relationship between education levels and
“Assets 1 over 2”? (2 points)
The interaction effects between “Assets 1 over 2” and the other variables are added and
Stepwise Regression is carried out using all the variables. The output is given below.
Regression 5 Output
Coefficients
Model Summary
Descriptive Statistics
3.5 The EC is concerned about constituencies where the margin of victory is the lowest. In what
type of constituencies does this happen? What are the attributes of the winners in constituencies
where this happens? Assume that the winner had 15% fewer assets than the runner-up. (3 points)
3.7 What is the change in the R Square value due to the entering variable between Steps 4 and 5 in
Regression 5 Output? (2 points)
ROUGH SHEET