Chapter 12
Simple Regression
12.1 For each sample: H0: ρ = 0 versus H1: ρ ≠ 0. The following formula is used to calculate the
test statistic: $t_{calc} = r\sqrt{\dfrac{n-2}{1-r^2}}$, and tcrit = T.INV(α/2, df). Because these are all two-tailed
tests, the decision rule is: Reject H0 if tcalc > +tcrit or tcalc < −tcrit.
Summary Table
Decision
2.138 > 2.101, Reject H0
−1.977 < −1.701, Reject H0
1.677 is between the critical values, Fail to Reject H0
−2.416 < −2.390, Reject H0
12.2 a. The scatter plot shows a positive correlation between hours worked and weekly pay.
b.
Hours Worked (X)   Weekly Pay (Y)   (xi − x̄)(yi − ȳ)
10                 93               840
15                 171              30
20                 204              0
20                 156              0
35                 261              1260
x̄ = 20            ȳ = 177          SSxy = 2130

$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \dfrac{2130}{\sqrt{(350)(15318)}} = .9199$
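The calculation in part (b) can be checked with a short script. This is a minimal sketch in Python using only the standard library; the function name `correlation` is illustrative and does not come from the textbook.

```python
import math

def correlation(x, y):
    """Return (SSxy, SSxx, SSyy, r) for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sums of squared deviations about the means
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    return ss_xy, ss_xx, ss_yy, r

# Data from exercise 12.2
hours = [10, 15, 20, 20, 35]
pay = [93, 171, 204, 156, 261]

ss_xy, ss_xx, ss_yy, r = correlation(hours, pay)
print(ss_xy, ss_xx, ss_yy, round(r, 4))
```

With the 12.2 data this should reproduce SSxy = 2130, SSxx = 350, SSyy = 15318, and r ≈ .9199.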
12.3 a. The scatter plot shows a negative correlation between operators and wait time.
b.
Operators (X)   Wait (Y)   (xi − x̄)(yi − ȳ)
4               385        −76
5               335        12
6               383        0
7               344        −3
8               288        −118
x̄ = 6          ȳ = 347    SSxy = −185

$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \dfrac{-185}{\sqrt{(10)(6374)}} = -.7328$
c. tcrit = t.025 = T.INV(.025,3) = ±3.182, using d.f. = 3
d. $t_{calc} = -.7328\sqrt{\dfrac{5-2}{1-(-.7328)^2}} = -1.865$. We fail to reject the null hypothesis of zero
correlation because −1.865 > −3.182.
e. p-value = T.DIST.2T(1.865,3) = .159.
Learning Objective: 12-1
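The test statistic in part (d) is the same formula given in 12.1, so it is easy to script. The sketch below assumes the critical value is looked up separately (for example with Excel's T.INV, as in the solutions); the function names are hypothetical.

```python
import math

def corr_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def two_tailed_decision(t_calc, t_crit):
    """Reject H0 when |t_calc| exceeds the magnitude of the critical value."""
    return "Reject H0" if abs(t_calc) > abs(t_crit) else "Fail to reject H0"

# Exercise 12.3: r = -.7328, n = 5, t.025 with 3 d.f. = 3.182
t_calc = corr_t_stat(-0.7328, 5)
decision = two_tailed_decision(t_calc, 3.182)
print(round(t_calc, 3), decision)
```

Taking absolute values means the same helper works whether the critical value is supplied as +3.182 or as the negative number that T.INV(.025, 3) returns.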
12.4 a. The scatter plot shows little correlation between age and amount spent.
b. rcalc = −.292
c. tcrit = t.025 = T.INV(.025,8) = ±2.306, using d.f. = 8
d. $t_{calc} = -.292\sqrt{\dfrac{10-2}{1-(-.292)^2}} = -.864$
e. Because tcalc (−.864) > −2.306, we fail to reject the null hypothesis of zero correlation.
Learning Objective: 12-1
12.5 a. The scatter plot shows a positive correlation between returns from last year and returns
from this year.
b. rcalc = .5313
c. tcrit = t.025 = T.INV(.025,15) = ±2.131, using d.f. = 15
d. $t_{calc} = .5313\sqrt{\dfrac{17-2}{1-(.5313)^2}} = 2.429$
e. Because tcalc (2.429) > 2.131, we reject the null hypothesis of zero correlation.
Learning Objective: 12-1
12.6 a. The scatter plot shows a positive correlation between orders and ship cost.
b. rcalc = .820
c. tcrit = t.025 = T.INV(.025,10) = ±2.228, using d.f. = 10
d. $t_{calc} = .820\sqrt{\dfrac{12-2}{1-(.820)^2}} = 4.530$
e. Because tcalc (4.530) > 2.228, we reject the null hypothesis of zero correlation.
Learning Objective: 12-1
12.7 a. Increasing the size of a home by 1 square foot increases the price by $150.
b. HomePrice = $125000 + $150×(2000) = $425,000
c. The intercept might be interpreted as the value of the lot without a home. But the range
of values for X does not include zero so it would be dangerous to extrapolate for x = 0.
Learning Objective: 12-2
Learning Objective: 12-3
12.8 a. An increase in the price of the item of $1 reduces its expected sales by 37.5 units.
b. Sales = 842 – ($20)×37.5 = 92
c. From a practical point of view no. A zero price is unrealistic.
Learning Objective: 12-2
Learning Objective: 12-3
12.9 a. An increase in the median age of one year means the number of car thefts decreases by
35.3.
b. CarTheft = 1,667 - 35.3×40 = 255
c. The intercept would not be meaningful because you would not have a median age of
zero for any state.
Learning Objective: 12-2
Learning Objective: 12-3
12.10 a. An increase in the microprocessor speed of one MHz means the computer power
dissipation increases by 0.032 watts.
b. Computer power dissipation= 15.73 + 0.032×3000 = 111.73 watts
c. The intercept would not be meaningful because you would not have a computer with
zero speed.
Learning Objective: 12-2
Learning Objective: 12-3
12.11 a. An increase in a country’s Power distance index of one unit means the number of
international franchises increases by 1.75.
b. International franchises= -47.5 + 1.75×85 = 101.25
c. The intercept would not be meaningful because you cannot have a negative number of
franchises. While the range for the index does include zero, it is unlikely that a
country’s index value will be close to zero.
Learning Objective: 12-2
Learning Objective: 12-3
12.12 a. Increasing the average revenue by $1 million raises the net income by $30,700.
b. If revenue is zero, then net income is $2,277 million, which suggests that the firm has
net income when revenue is zero. This does not seem logical.
c. NetIncome = $2,277 + 0.0307×($20,000) = $2,891 million
Learning Objective: 12-2
Learning Objective: 12-3
12.13 a. Increasing the median income by $1,000 raises the median home price by $2,610.
b. If median income is zero, then the model suggests that median home price is $51,300.
While it does not seem logical that the median family income for any city is zero, it is
unclear what the lower bound would be.
c. HomePrice = $51.3 + 2.61×($50) = $181.8 or $181,800
HomePrice = $51.3 + 2.61×($100) = $312.3 or $312,300
Learning Objective: 12-2
Learning Objective: 12-3
12.14 a. Increasing the number of hours worked per week by 1 hour reduces the expected
number of credits by .07.
b. Yes, the intercept makes sense in this situation. It is possible that a student does not
have a job outside of school.
c. Credits= 15.4 - .07×0 = 15.4 credits
Credits= 15.4 - .07×40 = 12.6 credits
The more hours a student works, the fewer credits (courses) he or she will take on average.
Learning Objective: 12-2
Learning Objective: 12-3
12.15 a. Chevy Blazer: a one-year increase in vehicle age reduces the price by $1,050.
Chevy Silverado: a one-year increase in vehicle age reduces the price by $1,339.
b. Chevy Blazer: If age = 0 then price = $16,189. This could be the price of a new Blazer.
Chevy Silverado: If age = 0 then price = $22,591. This could be the price of a new
Silverado.
c. $16,189 − $1,050×5 = $10,939
$22,591 − $1,339×5 = $15,896
Learning Objective: 12-2
Learning Objective: 12-3
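Exercises 12.7 through 12.15 all evaluate a fitted line at a chosen x. A minimal Python sketch of that step follows; the helper name `predict` is an assumption for illustration, and the coefficients are the ones quoted in 12.15.

```python
def predict(b0, b1, x):
    """Evaluate the fitted line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

# Exercise 12.15(c): predicted price of a five-year-old vehicle
blazer = predict(16189, -1050, 5)
silverado = predict(22591, -1339, 5)
print(blazer, silverado)
```

As the intercept discussions above note, such a prediction is only trustworthy for x values inside the range of the data used to fit the line.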
12.18 a.
Hours Worked (X)   Weekly Pay (Y)   (xi − x̄)(yi − ȳ)
10                 93               840
15                 171              30
20                 204              0
20                 156              0
35                 261              1260
x̄ = 20            ȳ = 177          SSxy = 2130

b. $b_1 = \dfrac{2130}{350} = 6.086$, $b_0 = 177 - 6.086(20) = 55.286$, $\hat{y} = 55.286 + 6.086x$
c.
Hours Worked (xi)   Weekly Pay (yi)   Estimated Pay (ŷi)   yi − ŷi    (yi − ŷi)²   (ŷi − ȳ)²   (yi − ȳ)²
10                  93                116.146              −23.146    535.7373     3703.209    7056
15                  171               146.576              24.424     596.5318     925.6198    36
20                  204               177.006              26.994     728.676      3.6E-05     729
20                  156               177.006              −21.006    441.252      3.6E-05     441
35                  261               268.296              −7.296     53.23162     8334.96     7056
20 (x̄)             177 (ȳ)           177.006              −0.006     3.6E-05      3.6E-05     0
Totals                                                                SSE = 2355.429   SSR = 12963.79   SST = 15318
d. $R^2 = \dfrac{SSR}{SST} = \dfrac{12{,}963}{15{,}318} = .8462$
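Parts (b) through (d) can be reproduced end to end in a few lines. The following Python sketch is illustrative only (the function names are not from the textbook). It carries the coefficients at full precision, so its SSR matches the ANOVA output (12,962.57) rather than the hand table above, which uses coefficients rounded to three decimals.

```python
def fit_line(x, y):
    """Ordinary least squares slope and intercept for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sums_of_squares(x, y, b0, b1):
    """Return (SSE, SSR, SST) for the fitted line."""
    y_bar = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return sse, ssr, sst

# Data from exercise 12.18
hours = [10, 15, 20, 20, 35]
pay = [93, 171, 204, 156, 261]

b0, b1 = fit_line(hours, pay)
sse, ssr, sst = sums_of_squares(hours, pay, b0, b1)
r_squared = ssr / sst
print(round(b0, 3), round(b1, 3), round(sse, 3), round(ssr, 3), sst, round(r_squared, 4))
```

Because SST = SSR + SSE for a least squares fit with an intercept, computing R² as 1 − SSE/SST would give the same value.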
e.
12.19 a.
b. $b_1 = \dfrac{-185}{10} = -18.5$, $b_0 = 347 + 18.5(6) = 458$, $\hat{y} = 458 - 18.5x$
c.
Operators (xi)   Wait Time (yi)   Estimated Time (ŷi)   yi − ŷi   (yi − ŷi)²   (ŷi − ȳ)²   (yi − ȳ)²
4                385              384                   1         1            1369        1444
5                335              365.5                 −30.5     930.25       342.25      144
6                383              347                   36        1296         0           1296
7                344              328.5                 15.5      240.25       342.25      9
8                288              310                   −22       484          1369        3481
x̄ = 6           ȳ = 347                                          SSE = 2951.5   SSR = 3422.5   SST = 6374
d. $R^2 = \dfrac{SSR}{SST} = \dfrac{3{,}422.5}{6{,}374.0} = .5369$
e.
12.20 a. and b.
12.21 a. and b.
12.22 a. and b.
r² 0.846 n 5
r 0.920 k 1
Std. Error 28.020 Dep. Var. Weekly Pay (Y)
ANOVA table
Source SS df MS F p-value
Regression 12,962.5714 1 12,962.5714 16.51 .0269
Residual 2,355.4286 3 785.1429
Total 15,318.0000 4
r² 0.537 n 5
r -0.733 k 1
Std. Error 31.366 Dep. Var. Wait Time (Y)
ANOVA table
Source SS df MS F p-value
Regression 3,422.5000 1 3,422.5000 3.48 .1590
Residual 2,951.5000 3 983.8333
Total 6,374.0000 4
d. The slope is not significantly different from zero because the p-value is greater than .05.
Learning Objective: 12-6
Learning Objective: 12-7
f. This model has a poor fit. The F statistic is barely significant at a level of .05 (p-value =
.0478) and R2 = .2. Only 20% of the variation in units sold can be explained by average
price.
Learning Objective: 12-6
Learning Objective: 12-8
Regression Analysis
r² 0.085 n 10
r -0.292 k 1
Std. Error 2.128 Dep. Var. Spent (Y)
ANOVA table
Source SS df MS F p-value
Regression 3.3727 1 3.3727 0.74 .4133
Residual 36.2396 8 4.5299
Total 39.6123 9
b. (−0.1946, 0.0886) This interval contains zero; therefore we cannot conclude that the
slope is different from zero.
c. The t statistic is −0.863 and the p-value is .4133. Because the p-value is greater than
0.05, we cannot conclude that the slope is different from zero.
d. Fcalc = 0.745 with a p-value = .4133. This indicates that the model does not fit the data.
e. The p-values match. Fcalc = (−0.863)^2 = 0.745.
f. This model does not fit the data. The F statistic is not significant.
Learning Objective: 12-6
Learning Objective: 12-7
Learning Objective: 12-8
r² 0.282 n 17
r 0.531 k 1
Std. Error 4.335 Dep. Var. This Year (Y)
ANOVA table
Source SS df MS F p-value
Regression 110.8585 1 110.8585 5.90 .0282
Residual 281.8321 15 18.7888
Total 392.6906 16
Regression Analysis
r² 0.672 n 12
r 0.820 k 1
Std. Error 599.029 Dep. Var. Ship Cost (Y)
ANOVA table
Source SS df MS F p-value
Regression 7,340,819.5514 1 7,340,819.5514 20.46 .0011
Residual 3,588,357.1152 10 358,835.7115
Total 10,929,176.6667 11
b. (2.5024, 7.3619) This interval does not contain zero therefore we can conclude that the
slope is greater than zero.
c. The t statistic is 4.523 and the p-value is 0.0011. Because the p-value is less than 0.05,
we can conclude that the slope is positive.
d. Fcalc = 20.46 with a p-value = .0011. This indicates that the model does provide some fit
to the data.
e. The p-values match. Fcalc = (4.523)^2 = 20.46.
f. This model provides a good fit to the data. The F statistic is highly significant and R2
shows that 67% of the variation in shipping cost is explained by number of orders.
Learning Objective: 12-6
Learning Objective: 12-7
Learning Objective: 12-8
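The identity used in part (e), that F equals t squared in a simple regression, can be verified from the ANOVA table alone. This Python sketch uses the Ship Cost sums of squares shown earlier; `f_statistic` is a hypothetical helper name.

```python
import math

def f_statistic(ss_regression, ss_residual, n, k=1):
    """F = MSR / MSE with k and n - k - 1 degrees of freedom."""
    ms_regression = ss_regression / k
    ms_residual = ss_residual / (n - k - 1)
    return ms_regression / ms_residual

# Ship Cost regression: SSR, SSE, and n taken from the ANOVA table
f_calc = f_statistic(7_340_819.5514, 3_588_357.1152, n=12)

# With one predictor, |t| for the slope is the square root of F
t_from_f = math.sqrt(f_calc)
print(round(f_calc, 2), round(t_from_f, 3))
```

The square root recovers only the magnitude of t; the sign comes from the sign of the estimated slope.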
b. x = 17: 95% confidence interval (116.377, 201.109), 95% prediction interval (60.017,
257.469)
c. The 95% confidence interval for µY: $177 \pm 2.776\,\dfrac{61.883}{\sqrt{5}}$ or (100.174, 253.826).
d. The margin of error for the confidence interval in part (c) is 76.826 whereas the
margin of error for the confidence interval in part (b) is less than 76.826. Knowing the
number of hours a student works helps us better estimate the student's average weekly pay.
Learning Objective: 12-5
Learning Objective: 12-9
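The interval in part (c) ignores x and uses only the sample mean and standard deviation of Y. A small Python sketch of that calculation is given below, assuming the critical value t.025 = 2.776 (4 d.f.) is supplied by the user; the function name `mean_ci` is illustrative.

```python
import math

def mean_ci(y_bar, s, n, t_crit):
    """Confidence interval for a mean: y_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return y_bar - margin, y_bar + margin, margin

# Part (c): y-bar = 177, s = 61.883, n = 5, t.025 = 2.776
lower, upper, margin = mean_ci(177, 61.883, 5, 2.776)

# Part (b): half-width of the regression-based confidence interval at x = 17
regression_margin = (201.109 - 116.377) / 2
print(round(lower, 3), round(upper, 3), round(margin, 3), round(regression_margin, 3))
```

Comparing the two margins shows numerically what part (d) states in words: conditioning on hours worked narrows the interval.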
12.34 No, these plots do not show that regression error assumptions of normality or constant
variance have been violated. The plot on the left is a normplot and shows a fairly
straight line on the diagonal. This indicates that the assumption of a normal distribution
for residuals is reasonable. The plot on the right shows residual values plotted against
the corresponding x value. The plot suggests that the residuals are homoscedastic
because there is no increase or decrease in residual magnitude.
Learning Objective: 12-10
12.35 The plot on the left is a normplot and shows a fairly straight line on the diagonal. This
indicates that the assumption of a normal distribution for residuals is reasonable. The
plot on the right shows residual values plotted against the corresponding x value. The
plot suggests that the residuals are heteroscedastic because there is an increase in
residual magnitude as the x values increase.
Learning Objective: 12-10
12.36 a. Predicted Defects = 3.2 + 0.045(100) = 7.7 defects per million parts
b. ei = yi − ŷi = 4.4 – 7.7 = −3.3
c. $e_i^* = \dfrac{e_i}{s_{e_i}} = \dfrac{-3.3}{1.07} = -3.084$
d. Yes, this residual is considered an outlier because ei* < −3.
Learning Objective: 12-11
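The rule applied in parts (c) and (d), with |e*| above 2 flagged as unusual and above 3 as an outlier, can be sketched in Python as follows. The function name and the label strings are assumptions made for this example.

```python
def classify_residual(residual, std_error):
    """Standardize a residual and label it using the 2 and 3 cutoffs."""
    e_star = residual / std_error
    if abs(e_star) > 3:
        label = "outlier"
    elif abs(e_star) > 2:
        label = "unusual"
    else:
        label = "not unusual"
    return e_star, label

# Exercise 12.36: residual = -3.3, standard error of the residual = 1.07
e_star, label = classify_residual(-3.3, 1.07)
print(round(e_star, 3), label)
```

The check uses the absolute value so that large negative and large positive residuals are treated alike.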
c. $e_i^* = \dfrac{e_i}{s_{e_i}} = \dfrac{5.13}{2.03} = 2.527$
d. No, this residual is not considered an outlier because ei* < 3 but it is considered unusual
because ei* > 2.
Learning Objective: 12-11
12.38 a. $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{SS_{xx}} = \dfrac{1}{29} + \dfrac{(2382 - 2004)^2}{999{,}603} = 0.1774$. 4/n = 4/29 = 0.1379. Because
0.1774 > 0.1379 this would be considered a high leverage observation.
b. $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{SS_{xx}} = \dfrac{1}{29} + \dfrac{(2125 - 2004)^2}{999{,}603} = 0.0491$. 4/n = 4/29 = 0.1379. Because 0.0491
< 0.1379 this would not be considered a high leverage observation.
c. $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{SS_{xx}} = \dfrac{1}{29} + \dfrac{(1620 - 2004)^2}{999{,}603} = 0.1820$. 4/n = 4/29 = 0.1379. Because
0.1820 > 0.1379 this would be considered a high leverage observation.
Learning Objective: 12-11
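The leverage checks in 12.38 follow one pattern, so a short helper makes the comparison with the 4/n cutoff explicit. This is a sketch in Python; the function names are hypothetical, and the inputs (x̄ = 2004, SSxx = 999,603, n = 29) are the values quoted in the solution.

```python
def leverage(x, x_bar, ss_xx, n):
    """Leverage of one observation: 1/n + (x - x_bar)^2 / SSxx."""
    return 1 / n + (x - x_bar) ** 2 / ss_xx

def is_high_leverage(h, n):
    """Flag an observation whose leverage exceeds the 4/n rule of thumb."""
    return h > 4 / n

# Exercise 12.38, parts (a) through (c)
n, x_bar, ss_xx = 29, 2004, 999_603
results = {}
for x in (2382, 2125, 1620):
    h = leverage(x, x_bar, ss_xx, n)
    results[x] = (h, is_high_leverage(h, n))
    print(x, round(h, 4), results[x][1])
```

Leverage depends only on how far x lies from x̄, which is why both the largest and the smallest x values are flagged while the one near the mean is not.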
12.39 a. $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{SS_{xx}} = \dfrac{1}{74} + \dfrac{(0.072 - 2.027)^2}{22.285} = 0.185$. 4/n = 4/74 = 0.0541. Because 0.185
> 0.0541 this would be considered a high leverage observation.
b. $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{SS_{xx}} = \dfrac{1}{74} + \dfrac{(1.413 - 2.027)^2}{22.285} = 0.0304$. 4/n = 4/74 = 0.0541. Because
0.0304 < 0.0541 this would not be considered a high leverage observation.
c. $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{SS_{xx}} = \dfrac{1}{74} + \dfrac{(3.376 - 2.027)^2}{22.285} = 0.0952$. 4/n = 4/74 = 0.0541. Because
0.0952 > 0.0541 this would be considered a high leverage observation.
Learning Objective: 12-11
Questions 12.40 through 12.55 refer to 10 different data sets labeled A-J. The answers to each question
are listed for each data set in turn. Note that one can find the tcrit values using either
=T.INV(α/2, df) or =T.INV.2T(α, df).
DATA SET A
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. Most likely from a survey similar to the Current Population Survey
conducted by The Bureau of Labor Statistics.
Learning Objective: 02-7
12.42 Answers will vary. Sample size is sufficient for educational purposes but most government
studies will have much larger sample sizes.
Learning Objective: 02-9
12.43 A positive slope would be logical. It makes sense that higher income is associated with
higher home values. Cause and effect cannot be assumed. An increase in income does
not automatically compel a family to purchase a more expensive home.
Learning Objective: 12-2
12.44
There is a moderate positive relationship between Median Income and Median Home
Price.
Learning Objective: 12-4
12.46 An increase in median income of $1,000 increases home price by $2,609.80. No, the
intercept does not have meaning because it seems unlikely that the median income for a
family will be equal to zero.
Learning Objective: 12-2
Regression Analysis
r² 0.340 n 34
r 0.583 k 1
Std. Error 58.855 Dep. Var. Home
ANOVA table
Source SS df MS F p-value
Regression 57,071.8007 1 57,071.8007 16.48 .0003
Residual 110,844.4754 32 3,463.8899
Total 167,916.2761 33
Observation  Home  Predicted  Residual  Leverage  Studentized Residual  Studentized Deleted Residual
1 290.0000 207.7728 82.2272 0.108 1.479 1.508
2 279.9000 344.6811 -64.7811 0.115 -1.170 -1.177
3 338.2500 332.7568 5.4932 0.089 0.098 0.096
4 316.0000 295.2225 20.7775 0.037 0.360 0.355
5 207.0000 252.4398 -45.4398 0.038 -0.787 -0.782
6 250.0000 252.8364 -2.8364 0.038 -0.049 -0.048
7 320.0000 298.8240 21.1760 0.040 0.367 0.362
8 150.0000 191.5449 -41.5449 0.150 -0.766 -0.761
9 230.0000 274.9494 -44.9494 0.029 -0.775 -0.770
10 199.0000 252.2884 -53.2884 0.038 -0.923 -0.921
11 218.5000 214.7044 3.7956 0.092 0.068 0.067
12 290.0000 337.0265 -47.0265 0.098 -0.841 -0.837
13 315.0000 278.2247 36.7753 0.030 0.634 0.628
14 248.0000 269.3827 -21.3827 0.030 -0.369 -0.364
15 290.0000 271.8724 18.1276 0.030 0.313 0.308
16 220.0000 220.7383 -0.7383 0.080 -0.013 -0.013
17 170.4500 219.3995 -48.9495 0.083 -0.868 -0.865
18 290.0000 296.5352 -6.5352 0.038 -0.113 -0.111
19 205.0000 320.0496 -115.0496 0.066 -2.022 -2.131
20 410.0000 289.3791 120.6209 0.033 2.084 2.207
21 379.9750 335.0874 44.8876 0.094 0.801 0.796
12.48 a. 95% confidence interval for β1: (1.3002, 3.9195). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 32, t.025 =T.INV(.025, 32) = ±2.037, tcalc = 4.059 > 2.037.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0003. This means that 3 times out of 10,000 we will see a sample this
extreme if the slope is actually equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R2 = .34. 34% of the variation in median home price can be explained by median
household income. Closer to 100% is always desirable but this is still decent
considering we are using only one predictor variable.
b. Fcalc = 16.48 and its p-value = .0003. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.50 Observations 19, 20, 28, and 29 have unusual residual values. Observation 19 is in PA. The
model overestimated the median home price. The other three cities were in NJ and NY.
The model underestimated the median home price. Home prices tend to be higher in the
northeast part of the country. More research would be needed to better explain the
unusual observations.
Learning Objective: 12-11
12.51 a. The normplot of residuals shows a fairly straight line on the diagonal. The histogram of
residuals shows a fairly symmetric distribution with slight skewness to the right.
b. The residuals do not show any obvious departure from normality. The assumption of
residual normality is valid.
12.52
12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10
12.54 Answers will vary. Confidence and prediction intervals for x = $75,000 and $105,000 are
shown in the table below.
12.55 Observation 8 has a high leverage. This is Chesapeake, VA, which has a very low median
income.
Learning Objective: 12-11
DATA SET B
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Data were most likely from a semester’s class of 58 students. This can be treated as a
sample from the larger population of students who take statistics over several years.
Learning Objective: 02-7
12.42 A sample size of 58 should be sufficient to draw significant conclusions about the
relationship between midterm and final exam scores.
Learning Objective: 02-9
12.43 A positive slope would be logical. It makes sense that higher midterm scores are associated
with higher final exam scores and vice versa. While cause and effect cannot be assumed
it would be reasonable to hypothesize that if a student performs well on their midterm
exam due to a solid understanding of the material they will then be able to comprehend
the material in the last half of the semester and will be well-prepared for their final
exam.
Learning Objective: 12-2
12.44
There is a moderate positive relationship between Midterm Exam Score and Final
Exam Score.
Learning Objective: 12-4
12.46 A one point increase in the midterm score is associated with a 1.014 point increase in the
final exam score. While theoretically a student could score zero on the midterm, the y-
intercept is negative which is not feasible.
Learning Objective: 12-2
Regression Analysis
r² 0.430 n 58
r 0.656 k 1
Std. Error 7.517 Dep. Var. Final Exam Score
ANOVA table
Source SS df MS F p-value
Regression 2,385.5446 1 2,385.5446 42.22 2.33E-08
Residual 3,164.0588 56 56.5011
Total 5,549.6034 57
Observation  Final Exam Score  Predicted  Residual  Leverage  Studentized Residual  Studentized Deleted Residual
1 78.0 79.5 -1.5 0.022 -0.208 -0.206
2 85.0 86.6 -1.6 0.063 -0.226 -0.224
3 81.0 71.4 9.6 0.027 1.290 1.297
4 54.0 68.4 -14.4 0.042 -1.957 -2.009
5 70.0 85.6 -15.6 0.055 -2.139 -2.212
6 73.0 82.6 -9.6 0.035 -1.298 -1.306
7 89.0 77.5 11.5 0.018 1.541 1.561
8 84.0 74.5 9.5 0.018 1.279 1.286
9 86.0 73.5 12.5 0.020 1.685 1.714
10 79.0 74.5 4.5 0.018 0.607 0.604
11 75.0 83.6 -8.6 0.040 -1.168 -1.172
12 63.0 72.4 -9.4 0.023 -1.272 -1.279
13 72.0 73.5 -1.5 0.020 -0.197 -0.195
14 69.0 72.4 -3.4 0.023 -0.464 -0.461
15 86.0 79.5 6.5 0.022 0.868 0.866
12.48 a. 95% confidence interval for β1: (0.7012, 1.3263). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 56, t.025 =T.INV(.025,56) = ±2.003, tcalc = 6.498 > 2.003.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0000. This means that there is very little chance of observing this
sample if there is no correlation between midterm scores and final scores.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R2 = .43. 43% of the variation in final exam scores can be explained by midterm exam
scores. Closer to 100% is always desirable but this is still decent considering we are
using only one predictor variable.
b. Fcalc = 42.22 and its p-value = .0000. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.50 Students 4 and 5 have unusual residual values with residuals that are negative. The model
overestimated the final exam score for these two students. Student 55 had a residual
beyond −4 which means the model significantly overestimated the final exam score.
This observation would be considered an outlier. Low exam scores are typically due to
students who do not study or attend class. On occasion the material is very difficult
which can also result in a low exam score.
Learning Objective: 12-11
12.51 a. The normplot of residuals shows a fairly straight line on the diagonal. The histogram of
residuals shows a skewed left distribution which is fairly common for exam scores.
Most students do well on exams with a few students in the low range.
b. The residuals do not show strong departure from normality. The assumption of residual
normality is valid.
Learning Objective: 12-10
12.52
The residual plot does not show signs of heteroscedasticity although the low final exam
score can be seen in the lower part of the graph.
Learning Objective: 12-10
12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10
12.54 Answers will vary. Confidence and prediction intervals for x = 75 and 95 are shown in the
table below.
12.55 Students 43, 49, and 55 have high leverage. These three students had very low midterm
scores.
Learning Objective: 12-11
DATA SET C
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. Hospital data bases will contain this type of information.
Learning Objective: 02-7
12.42 Answers will vary. Sample size is sufficient for educational purposes but most studies
conducted by a hospital will use larger sample sizes because the data is available.
Learning Objective: 02-9
12.43 A positive slope would be logical. The estimated length of stay of a patient should be based
on their condition upon admission and should be related to their actual length of stay.
Although estimating a patient’s time in the hospital should not cause their time to be
longer or shorter.
Learning Objective: 12-2
12.44
12.46 An increase in ELOS of one month increases ALOS by 1.03 months. No, the intercept does
not have meaning because by definition admission to a hospital implies staying in the
hospital.
Learning Objective: 12-2
r² 0.625 n 16
r 0.791 k 1
Std. Error 2.282 Dep. Var. ALOS
ANOVA table
Source SS df MS F p-value
Regression 121.5515 1 121.5515 23.35 .0003
Residual 72.8860 14 5.2061
Total 194.4375 15
Studentized
Studentized Deleted
Observation ALOS Predicted Residual Leverage Residual Residual
1 10.00 11.32 -1.32 0.171 -0.636 -0.622
2 2.00 5.15 -3.15 0.116 -1.466 -1.536
3 4.00 8.23 -4.23 0.065 -1.919 -2.154
4 11.00 12.87 -1.87 0.283 -0.966 -0.963
5 11.00 8.23 2.77 0.065 1.254 1.282
6 11.00 9.78 1.22 0.098 0.564 0.550
7 6.50 6.69 -0.19 0.071 -0.087 -0.083
8 5.00 5.66 -0.66 0.096 -0.305 -0.295
9 8.00 6.69 1.31 0.071 0.595 0.581
10 16.00 12.87 3.13 0.283 1.622 1.735
11 6.50 7.72 -1.22 0.063 -0.552 -0.538
12 6.00 5.15 0.85 0.116 0.398 0.385
13 3.50 4.12 -0.62 0.167 -0.296 -0.287
14 10.00 6.69 3.31 0.071 1.505 1.584
15 7.00 8.23 -1.23 0.065 -0.559 -0.545
16 5.50 3.60 1.90 0.200 0.930 0.925
12.48 a. 95% confidence interval for β1: (0.5724, 1.4862). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 14, t.025 =T.INV(.025,14) = ±2.145, tcalc = 4.832 > 2.145.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0003. This means that 3 times out of 10,000 we will see a sample this
extreme if the slope is actually equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R2 = .625. 62.5% of the variation in actual length of stay of a patient can be explained
by their expected length of stay. This shows a moderately strong relationship.
b. Fcalc = 23.35 and its p-value = .0003. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.51 a. The histogram of residuals does not show a clear normal distribution; however, the
normal plot of residuals has a modest straight line.
b. While the residuals show some departure from normality, this is a very small sample so
it is difficult to see clear normality. There are no strong outliers in the data set so at this
point the slight departure from normality is not too troublesome.
Learning Objective: 12-10
12.52
12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10
12.54 Answers will vary. Prediction and confidence intervals for x = 4 and 9 days are shown.
Predicted values for:
ALOS
95% Confidence Intervals 95% Prediction Intervals
ELOS Predicted lower upper lower upper
4 4.6318 2.8052 6.4584 -0.5917 9.8554
9 9.7782 8.2426 11.3138 4.6492 14.9072
12.55 Observations 4 and 10 have high leverage. These patients had unusually long estimated
lengths of stay.
Learning Objective: 12-9
DATA SET D
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. This type of information could probably be collected from the airplane
manufacturer.
Learning Objective: 02-7
12.42 Answers will vary. Sample size of 52 should be sufficient to observe important
relationships.
Learning Objective: 02-9
12.43 A positive slope would be logical. Cruise speed should be related to engine size. And yes, a
cause and effect relationship would make sense.
Learning Objective: 12-2
12.44
12.46 An increase of one unit of horsepower increases cruise speed by .1931 mph. No, the
intercept does not have meaning because an engine cannot have zero horsepower.
Learning Objective: 12-4
Regression Analysis
r² 0.684 n 52
r 0.827 k 1
Std. Error 20.596 Dep. Var. Cruise
ANOVA table
Source SS df MS F p-value
Regression 45,896.1706 1 45,896.1706 108.20 4.20E-14
Residual 21,209.2717 50 424.1854
Total 67,105.4423 51
Studentized
Studentized Deleted
Observation Cruise Predicted Residual Leverage Residual Residual
1 100.0 125.5 -25.5 0.046 -1.267 -1.275
2 200.0 219.0 -19.0 0.093 -0.967 -0.966
3 241.0 228.6 12.4 0.119 0.640 0.636
4 199.0 213.2 -14.2 0.079 -0.717 -0.714
5 174.0 161.0 13.0 0.019 0.636 0.632
6 164.0 172.6 -8.6 0.022 -0.423 -0.420
7 141.0 172.6 -31.6 0.022 -1.552 -1.575
8 161.0 161.0 -0.0 0.019 -0.001 -0.001
9 107.0 124.3 -17.3 0.048 -0.863 -0.860
10 104.0 131.1 -27.1 0.038 -1.341 -1.353
11 122.0 134.0 -12.0 0.035 -0.593 -0.589
12 129.0 137.9 -8.9 0.031 -0.437 -0.433
13 144.0 147.5 -3.5 0.023 -0.172 -0.171
14 194.0 213.2 -19.2 0.079 -0.970 -0.969
15 170.0 184.2 -14.2 0.031 -0.701 -0.697
16 223.0 222.8 0.2 0.103 0.009 0.009
17 234.0 247.9 -13.9 0.185 -0.749 -0.746
18 124.0 137.9 -13.9 0.031 -0.683 -0.679
19 186.0 158.1 27.9 0.019 1.366 1.379
20 190.0 158.1 31.9 0.019 1.563 1.586
21 190.0 199.7 -9.7 0.052 -0.481 -0.478
22 159.0 148.5 10.5 0.023 0.517 0.513
23 160.0 148.5 11.5 0.023 0.566 0.562
24 148.0 163.0 -15.0 0.019 -0.733 -0.730
25 143.0 161.0 -18.0 0.019 -0.884 -0.882
12.48 a. 95% confidence interval for β1: (0.1558, 0.2304). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 50, t.025 =T.INV(.025,50) =±2.009, tcalc = 10.402 > 2.009.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 4.20×10^-14. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R2 = .684. 68.4% of the variation in cruise speed can be explained by engine
horsepower. This shows a moderately strong relationship.
b. Fcalc = 108.20 and its p-value = 4.20×10^-14. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a very
slight right skewed distribution.
12.52
The residual plot does not show obvious signs of heteroscedasticity. We can see a possible
high outlier on the graph.
Learning Objective: 12-10
12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10
12.54 Answers will vary. Confidence and prediction intervals for x = 150 and 250 are shown
below.
12.55 Observations 2-4 and 14, 16, and 17 have high leverage.
Learning Objective: 12-11
DATA SET E
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. This type of information would probably be collected from a researcher
who has obtained different types of processors.
Learning Objective: 02-7
12.42 Answers will vary. Sample size of 14 is fairly small and can open one up to Type II error.
Learning Objective: 02-9
12.43 A positive slope would be logical. Microprocessor speed should be related to Power
dissipation.
Learning Objective: 12-2
12.44
12.46 An increase of one unit of microprocessor speed increases power dissipation by 0.032
watts. No, the intercept does not have meaning because a speed of zero is not logical.
Learning Objective: 12-4
Regression Analysis
r² 0.925 n 14
r 0.962 k 1
Std. Error 13.109 Dep. Var. Power
ANOVA table
Source SS df MS F p-value
Regression 25,561.9157 1 25,561.9157 148.75 4.03E-08
Residual 2,062.0843 12 171.8404
Total 27,624.0000 13
Observation  Power  Predicted  Residual  Leverage  Studentized Residual  Studentized Deleted Residual
1 3.0 16.4 -13.4 0.184 -1.129 -1.143
2 10.0 18.9 -8.9 0.174 -0.748 -0.734
3 35.0 23.2 11.8 0.157 0.985 0.983
4 20.0 25.3 -5.3 0.150 -0.437 -0.422
5 42.0 34.8 7.2 0.120 0.582 0.565
6 50.0 34.8 15.2 0.120 1.233 1.263
7 51.0 57.1 -6.1 0.078 -0.488 -0.472
8 73.0 82.6 -9.6 0.078 -0.764 -0.750
9 115.0 136.8 -21.8 0.246 -1.912 -2.196
10 130.0 117.7 12.3 0.160 1.027 1.030
11 95.0 89.0 6.0 0.086 0.479 0.463
12 136.0 117.7 18.3 0.160 1.527 1.629
13 95.0 108.1 -13.1 0.128 -1.071 -1.078
14 125.0 117.7 7.3 0.160 0.611 0.594
12.48 a. 95% confidence interval for β1: (0.0262, 0.0375). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 12, t.025 =T.INV(.025,12) = ±2.179, tcalc = 12.196 > 2.179.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0000. This means that it is highly unlikely to obtain a slope estimate of
this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a left
skewed distribution.
b. The sample size is small so it is difficult to determine normality but there is no obvious
evidence to assume non-normality.
Learning Objective: 12-10
12.52
12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10
12.54 Answers will vary. Confidence and prediction intervals for x = 1000 and 2500 are shown
below.
Predicted values for: Power
95% Confidence Intervals 95% Prediction Intervals
Speed Predicted lower upper lower upper
1,000 47.583 38.962 56.203 17.748 77.417
2,500 95.362 86.485 104.238 65.452 125.271
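As a rough check on the slope quoted in 12.46, the fitted line can be recovered from the two predicted values in the table above, because any two points on a line determine it. The Python sketch below is an illustration only and is not part of the original solution; the variable names and the predict helper are assumptions chosen for readability.

```python
# Two (speed, predicted power) pairs copied from the prediction table.
x1, y1 = 1000, 47.583
x2, y2 = 2500, 95.362

# Two points on the fitted line determine its slope and intercept.
slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1


def predict(speed):
    """Point prediction of power dissipation from the recovered line (hypothetical helper)."""
    return intercept + slope * speed


print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
```

The recovered slope rounds to the 0.032 watts per unit of speed reported in 12.46 and lies inside the confidence interval given in 12.48(a).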
DATA SET F
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. This could probably be collected through surveys.
Learning Objective: 02-7
12.42 Answers will vary. Sample size of 10 is fairly small and can open one up to Type II error.
But this information could be difficult to obtain.
Learning Objective: 02-9
12.43 A positive slope would be logical. Increased website hits should be associated with
increased revenue.
Learning Objective: 12-2
12.44
There is a weak positive relationship between website hits and restaurant revenue.
Learning Objective: 12-4
12.46 An increase of one website visit increases weekly revenue by $1.67. The intercept could be
interpreted as the weekly revenue with no website hits. However, in this particular
sample there were no restaurants with an x near zero, so it would be dangerous to
extrapolate.
Regression Analysis
r² 0.128 n 10
r 0.357 k 1
Std. Error 1078.961 Dep. Var. Restaurant Revenue
ANOVA table
Source SS df MS F p-value
Regression 1,361,311.7375 1 1,361,311.7375 1.17 .3111
Residual 9,313,262.2625 8 1,164,157.7828
Total 10,674,574.0000 9
Studentized
Studentized Deleted
Observation Restaurant Revenue Predicted Residual Leverage Residual Residual
1 12,113.0 12,407.5 -294.5 0.278 -0.321 -0.302
2 11,409.0 12,869.9 -1,460.9 0.101 -1.428 -1.547
3 14,579.0 12,661.3 1,917.7 0.142 1.919 2.443
4 11,605.0 12,811.5 -1,206.5 0.106 -1.182 -1.218
5 12,308.0 12,501.0 -193.0 0.217 -0.202 -0.190
6 12,320.0 13,107.0 -787.0 0.131 -0.783 -0.762
7 13,225.0 12,591.1 633.9 0.170 0.645 0.620
8 13,652.0 13,496.0 156.0 0.361 0.181 0.170
9 13,893.0 13,036.9 856.1 0.114 0.843 0.826
10 13,896.0 13,517.7 378.3 0.380 0.445 0.422
12.48 a. 95% confidence interval for β1: (−1.8907, 5.2297). This interval contains zero, which
means that we cannot be confident the slope differs from zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 8, t.025 =T.INV(.025,8) = ±2.306, tcalc = 1.081 falls
between the critical values. Therefore fail to reject H0 and conclude that the slope is not
significantly different from zero.
c. The p-value = .3111. This means that a slope estimate of this size would not be unusual
even if the true slope were equal to zero.
d. No, this sample does not support our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R² = .128. Because the slope is not significant, R² does not have meaning.
b. Fcalc = 1.17 and its p-value = .3111. The F statistic is not significant which means the
linear model does not provide significant fit.
c. This model is not fit for practical use.
Learning Objective: 12-8
12.50 Restaurant 3 shows an unusual residual. It appears the model is underestimating the
revenue.
Learning Objective: 12-11
12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a fairly
uniform distribution.
b. The sample size is small so it is difficult to determine normality from the histogram.
Based on the normplot there is no obvious evidence to assume non-normality.
Learning Objective: 12-10
12.52
12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10
12.54 Answers will vary. Confidence and prediction intervals for x = 1200 and 1500 are shown
below. Keep in mind that these intervals are unreliable because there is no significant
relationship between website hits and revenue.
Predicted values for: Restaurant Revenue
95% Confidence Intervals 95% Prediction Intervals
Website Hits Predicted lower upper lower upper
1,200 12,385.790 11,036.168 13,735.411 9,555.230 15,216.349
1,500 12,886.644 12,099.326 13,673.962 10,276.958 15,496.330
DATA SET G
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. This information can be gathered from manufacturers’ specification
information which is listed on their websites. The manufacturers use sophisticated
sampling techniques for estimating these values.
Learning Objective: 02-7
12.43 A negative slope would be logical. It would make sense that the heavier a car is the lower
the MPG.
Learning Objective: 12-2
12.44
There is a fairly strong negative relationship between the Weight and CityMPG.
Learning Objective: 12-4
12.46 An increase in the weight of a car by one pound reduces its city mpg by 0.0045 mpg. No,
the intercept does not make sense.
Learning Objective: 12-2
Regression Analysis
r² 0.681 n 43
r -0.825 k 1
Std. Error 2.499 Dep. Var. City MPG
ANOVA table
Source SS df MS F p-value
Regression 546.43787944 1 546.43787944 87.51 1.00E-11
Residual 256.02723683 41 6.24456675
Total 802.46511628 42
Studentized
Studentized Deleted
Observation City MPG Predicted Residual Leverage Residual Residual
1 20.0 20.9 -0.9 0.027 -0.371 -0.367
2 23.0 21.5 1.5 0.031 0.607 0.602
3 19.0 21.2 -2.2 0.029 -0.888 -0.886
4 20.0 21.4 -1.4 0.030 -0.557 -0.552
5 18.0 17.4 0.6 0.031 0.260 0.257
6 18.0 18.2 -0.2 0.026 -0.073 -0.072
7 19.0 21.8 -2.8 0.034 -1.141 -1.145
8 14.0 14.1 -0.1 0.074 -0.062 -0.061
9 15.0 15.4 -0.4 0.053 -0.165 -0.163
10 17.0 15.4 1.6 0.053 0.657 0.653
11 18.0 17.5 0.5 0.030 0.223 0.220
12 13.0 12.5 0.5 0.111 0.219 0.216
13 13.0 9.8 3.2 0.194 1.448 1.469
14 26.0 24.1 1.9 0.063 0.803 0.799
15 15.0 15.4 -0.4 0.053 -0.165 -0.163
16 21.0 21.2 -0.2 0.029 -0.076 -0.075
17 18.0 17.0 1.0 0.034 0.418 0.414
18 24.0 23.5 0.5 0.054 0.201 0.199
19 16.0 17.1 -1.1 0.033 -0.433 -0.429
20 15.0 14.0 1.0 0.077 0.412 0.408
21 18.0 19.3 -1.3 0.023 -0.525 -0.520
22 25.0 26.2 -1.2 0.107 -0.498 -0.494
23 17.0 20.0 -3.0 0.024 -1.235 -1.243
24 18.0 21.2 -3.2 0.029 -1.295 -1.306
25 13.0 13.9 -0.9 0.080 -0.355 -0.352
12.48 a. 95% confidence interval for β1: (-0.005, -0.0036). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 41, t.025 =T.INV(.025,41) = ±2.020, tcalc = -9.354 < -2.020.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 1.00×10⁻¹¹. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R² = .681. 68.1% of the variation in the City MPG of a vehicle can be explained by its
weight. This shows a fairly strong relationship.
b. Fcalc = 87.51 and its p-value = 1.00×10⁻¹¹. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
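The figures quoted in 12.48 and 12.49 for Data Set G can all be rebuilt from the sums of squares in the ANOVA table. The Python sketch below is a consistency check under that reading, not part of the original solution; the variable names are assumptions. It relies on the identity t² = F, which holds only for simple (one-predictor) regression.

```python
import math

# Sums of squares and sample size copied from the Data Set G ANOVA table.
ss_regression = 546.43787944
ss_residual = 256.02723683
n = 43
k = 1  # number of predictors in a simple regression

df_residual = n - k - 1
ms_regression = ss_regression / k
ms_residual = ss_residual / df_residual

r_squared = ss_regression / (ss_regression + ss_residual)
f_calc = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)

# With one predictor, t squared equals F; the sign follows the negative slope.
t_calc = -math.sqrt(f_calc)

print(round(r_squared, 3), round(f_calc, 2), round(std_error, 3), round(t_calc, 3))
```

The four results agree with the reported R², Fcalc, standard error, and tcalc to the precision shown in the solution.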
12.50 Observation 42 is an outlier residual. This happens to be Volkswagen Jetta. The model
underestimated the true MPG by quite a margin. Perhaps this was the diesel version of
that model.
Learning Objective: 12-11
12.51 a. The normplot of residuals is not perfect. The high outlier can be seen and the line sags in
the middle. The histogram of residuals shows a somewhat bell-shaped distribution with
one high outlier.
b. The residuals do not show significant departure from normality in spite of the one high
outlier.
Learning Objective: 12-11
12.52
The residual plot does not show obvious signs of heteroscedasticity. It is possible to see the
Jetta outlier residual.
Learning Objective: 12-10
12.53 An autocorrelation test is not appropriate because this is cross sectional data.
Learning Objective: 12-10
12.54 Answers will vary. Confidence and prediction intervals for x = 3000 and 4000 are shown
below.
Predicted values for: City MPG
95% Confidence Intervals 95% Prediction Intervals
Weight Predicted lower upper lower upper
3,000 22.965 21.879 24.050 17.803 28.127
4,000 18.408 17.608 19.208 13.299 23.518
Learning Objective: 12-9
12.55 Observations 12, 13, and 22 have high leverage. These correspond to the Dodge Ram 1500,
the Ford Expedition, and the Kia Rio.
Learning Objective: 12-11
DATA SET H
12.40 Cross-sectional.
Learning Objective: 02-3
12.41 Answers will vary. This information can be gathered from food labels or manufacturers’
websites.
Learning Objective: 02-7
12.42 Answers will vary. Sample size is reasonable although a larger sample would be better.
Learning Objective: 02-9
12.43 A positive slope would be logical. It would make sense that the more fat calories the sauce
has the more calories are in the sauce in general.
Learning Objective: 12-2
12.44
12.46 An increase of 1 in fat calories per gram increases total calories per gram by 2.2179.
The intercept might make sense for sauce labeled “fat free.”
Learning Objective: 12-2
Regression Analysis
r² 0.843 n 20
r 0.918 k 1
Std. Error 0.102 Dep. Var. Calories Per Gram
ANOVA table
Source SS df MS F p-value
Regression 1.0076 1 1.0076 96.49 1.17E-08
Residual 0.1880 18 0.0104
Total 1.1955 19
Studentized
Studentized Deleted
Observation Calories Per Gram Predicted Residual Leverage Residual Residual
1 0.640 0.749 -0.109 0.053 -1.096 -1.103
2 0.560 0.572 -0.012 0.066 -0.117 -0.114
3 0.400 0.483 -0.083 0.096 -0.853 -0.846
4 0.480 0.394 0.086 0.142 0.907 0.902
5 0.640 0.572 0.068 0.066 0.693 0.682
6 0.400 0.305 0.095 0.203 1.037 1.039
7 0.560 0.483 0.077 0.096 0.794 0.785
8 0.550 0.483 0.067 0.096 0.691 0.681
9 0.480 0.572 -0.092 0.066 -0.927 -0.923
10 0.480 0.572 -0.092 0.066 -0.927 -0.923
11 1.250 1.148 0.102 0.251 1.151 1.162
12 1.000 1.037 -0.037 0.164 -0.400 -0.390
13 1.000 0.949 0.051 0.112 0.534 0.523
14 1.170 1.037 0.133 0.164 1.420 1.465
15 0.920 0.860 0.060 0.076 0.612 0.601
16 0.670 0.860 -0.190 0.076 -1.933 -2.111
17 0.860 0.749 0.111 0.053 1.116 1.124
18 0.700 0.727 -0.027 0.051 -0.270 -0.262
19 0.560 0.749 -0.189 0.053 -1.900 -2.066
20 0.640 0.660 -0.020 0.051 -0.204 -0.198
Learning Objective: 12-7
12.48 a. 95% confidence interval for β1: (1.7436, 2.6923). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 18, t.025 =T.INV(.025,18) = ±2.101, tcalc = 9.823 > 2.101.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 1.17×10⁻⁸. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
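The confidence interval in 12.48(a) and the test statistic in 12.48(b) for Data Set H carry the same information. Because the interval is the estimate plus or minus tcrit times the standard error, the slope and its standard error can be backed out of the interval endpoints. The Python sketch below illustrates this; it is not part of the original solution, and the variable names are assumptions.

```python
# 95% confidence interval for the slope and the critical value quoted in 12.48.
lower, upper = 1.7436, 2.6923
t_crit = 2.101  # t.025 with 18 d.f.

# The interval is centered on the slope estimate.
slope = (lower + upper) / 2

# Half-width = t_crit * standard error, so the standard error follows directly.
std_err = (upper - lower) / (2 * t_crit)

# Test statistic for H0: beta1 = 0.
t_calc = slope / std_err

print(f"slope = {slope:.4f}, std_err = {std_err:.4f}, t_calc = {t_calc:.3f}")
```

The recovered t statistic matches the reported tcalc up to rounding, which is why an interval that excludes zero and a rejection of H0 at the same α always go together.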
12.49 a. R² = .843. 84.3% of the variation in the total calories/gram of a pasta sauce can be
explained by the number of fat calories/gram. This shows a fairly strong relationship.
b. Fcalc = 96.49 and its p-value = 1.17×10⁻⁸. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.50 Observation 16 is an unusual residual. This is the Ragu Old World Style with meat sauce.
Learning Objective: 12-11
12.51 a. The normal probability plot of residuals does not show a clear normal pattern.
b. The residuals show some departure from normality but this is a small sample size and a
larger sample might help.
Learning Objective: 12-10
12.52
12.54 Answers will vary. Confidence and prediction intervals for x = 0.10 and 0.20 are shown
below.
12.55 Observations 6 and 11 show high leverage. These correspond to Healthy Choice Traditional
and Prego Hearty Meat Pepperoni.
Learning Objective: 12-11
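The two high-leverage observations named in 12.55 for Data Set H can be picked out of the Leverage column mechanically. The Python sketch below assumes the common rule of thumb that flags leverage above 2(k + 1)/n; the solution does not state which cutoff was used, so the threshold is an assumption, as are the variable names.

```python
# Leverage column for observations 1 to 20, copied from the Data Set H table.
leverage = [
    0.053, 0.066, 0.096, 0.142, 0.066, 0.203, 0.096, 0.096, 0.066, 0.066,
    0.251, 0.164, 0.112, 0.164, 0.076, 0.076, 0.053, 0.051, 0.053, 0.051,
]

n = len(leverage)
k = 1  # one predictor

# Assumed cutoff: twice the average leverage, i.e. 2(k + 1)/n.
threshold = 2 * (k + 1) / n

# Observation numbers start at 1 in the table.
high_leverage = [i for i, h in enumerate(leverage, start=1) if h > threshold]

print(threshold, high_leverage)
```

Under this cutoff only observations 6 and 11 are flagged, which agrees with the answer above. As a side check, leverages in a regression with an intercept sum to k + 1, and the column does so up to rounding.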
DATA SET I
12.40 Time-series.
Learning Objective: 02-3
12.41 Answers will vary. This information can be gathered by taking a random household and
observing their energy usage.
Learning Objective: 02-7
12.42 Answers will vary. Two years of data is good but when looking for seasonal influences
more years would be better.
Learning Objective: 02-9
12.43 A negative slope would be logical. It would make sense that the lower the temperature the
more energy a household would use.
Learning Objective: 12-2
12.44
There is a fairly strong negative relationship between the Average Daily Temperature and
Energy Consumption.
Learning Objective: 12-4
12.46 An increase of 1° in average temperature decreases the monthly energy use by 9.661 kWh.
Yes, the intercept does make sense. There can be a month with an average temperature
of 0°.
Learning Objective: 12-2
r² 0.766 n 24
r -0.875 k 1
Std. Error 84.951 Dep. Var. Electric Consumption (KWH)
ANOVA table
Source SS df MS F p-value
Regression 520,420.3570 1 520,420.3570 72.11 2.16E-08
Residual 158,766.1430 22 7,216.6429
Total 679,186.5000 23
Studentized
Studentized Deleted
Observation Energy Use (KWH) Predicted Residual Leverage Residual Residual
1 436.0 566.8 -130.8 0.056 -1.585 -1.645
2 464.0 479.9 -15.9 0.098 -0.197 -0.192
3 446.0 431.6 14.4 0.135 0.183 0.179
4 391.0 499.2 -108.2 0.086 -1.332 -1.358
5 444.0 557.2 -113.2 0.059 -1.373 -1.403
6 608.0 663.4 -55.4 0.042 -0.667 -0.658
7 885.0 808.3 76.7 0.089 0.945 0.943
8 821.0 779.4 41.6 0.073 0.509 0.500
9 830.0 789.0 41.0 0.078 0.502 0.494
10 750.0 827.7 -77.7 0.101 -0.964 -0.963
11 617.0 731.0 -114.0 0.054 -1.380 -1.411
12 598.0 644.1 -46.1 0.042 -0.554 -0.545
13 597.0 499.2 97.8 0.086 1.205 1.218
14 528.0 470.2 57.8 0.105 0.719 0.711
15 477.0 402.6 74.4 0.161 0.956 0.954
16 562.0 489.5 72.5 0.092 0.895 0.891
17 658.0 586.1 71.9 0.050 0.868 0.863
18 690.0 702.1 -12.1 0.047 -0.145 -0.142
19 862.0 798.7 63.3 0.083 0.778 0.771
20 1,008.0 837.3 170.7 0.108 2.127 2.332
21 840.0 924.3 -84.3 0.184 -1.098 -1.104
22 867.0 798.7 68.3 0.083 0.840 0.834
23 606.0 702.1 -96.1 0.047 -1.158 -1.168
24 657.0 653.8 3.2 0.042 0.039 0.038
Learning Objective: 12-7
12.48 a. 95% confidence interval for β1: (-12.0202, -7.3016). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 22, t.025 =T.INV(.025,22) = ±2.074, tcalc = -8.492 < -2.074.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 2.16×10⁻⁸. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R² = .766. 76.6% of the variation in the energy usage can be explained by average daily
temperature. This shows a fairly strong relationship.
b. Fcalc = 72.11 and its p-value = 2.16×10⁻⁸. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.50 Observation 20 is an unusual residual. The model underestimates the energy usage for that
month.
Learning Objective: 12-11
12.51 a. The normplot of residuals does not show a clear normal pattern. The histogram also
shows non-normality.
b. The residuals show some departure from normality but this is a small sample size and a
larger sample might help.
Learning Objective: 12-10
12.52
12.53 An autocorrelation test is appropriate because this is time series data. The residual plot
below does not show obvious increasing or decreasing trends nor does it show signs of
negative autocorrelation. The DW statistic = 1.53 which indicates slight positive
autocorrelation. This is not unexpected because temperature for one month is related to
temperature for the previous month. The level of autocorrelation does not invalidate the
regression results.
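The Durbin-Watson statistic quoted in 12.53 for Data Set I can be reproduced from the Residual column of the observation table, taking the observations in time order. The Python sketch below uses the usual definition DW = Σ(e_t − e_{t−1})² / Σe_t²; it is an illustration rather than the original computation, and because the residuals are rounded to one decimal the result matches only approximately.

```python
# Residuals for observations 1 to 24, copied from the Data Set I table.
residuals = [
    -130.8, -15.9, 14.4, -108.2, -113.2, -55.4, 76.7, 41.6,
    41.0, -77.7, -114.0, -46.1, 97.8, 57.8, 74.4, 72.5,
    71.9, -12.1, 63.3, 170.7, -84.3, 68.3, -96.1, 3.2,
]

# Numerator: squared changes between successive residuals.
numerator = sum(
    (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
)

# Denominator: sum of squared residuals (close to the Residual SS in the ANOVA table).
denominator = sum(e ** 2 for e in residuals)

dw = numerator / denominator
print(round(dw, 2))
```

Values near 2 suggest no autocorrelation, values below 2 suggest positive autocorrelation, and values above 2 suggest negative autocorrelation.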
12.54 Answers will vary. Confidence and prediction intervals for x = 50 and 70 are shown below.
Predicted values for: Electric Consumption (KWH)
95% Confidence Intervals 95% Prediction Intervals
Avg Daily Temp (deg F) Predicted lower upper lower upper
50 682.745 645.995 719.495 502.776 862.715
70 489.527 436.022 543.033 305.405 673.650
Learning Objective: 12-9
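The prediction interval in 12.54 is wider than the confidence interval because it adds the variability of an individual observation to the uncertainty in the estimated mean. In simple regression the half-widths satisfy (PI half-width)² = (CI half-width)² + (tcrit × s)², where s is the standard error of the regression. The Python sketch below checks this for x = 50 in Data Set I; it is an illustration with assumed variable names, and the rounded inputs give only approximate agreement.

```python
import math

t_crit = 2.074      # t.025 with 22 d.f., from 12.48(b)
std_error = 84.951  # standard error of the regression, from the output table

# Intervals for x = 50, copied from the prediction table.
ci_lower, ci_upper = 645.995, 719.495
pi_lower, pi_upper = 502.776, 862.715

ci_half = (ci_upper - ci_lower) / 2
pi_half = (pi_upper - pi_lower) / 2

# Rebuild the prediction half-width from the confidence half-width.
pi_half_rebuilt = math.sqrt(ci_half ** 2 + (t_crit * std_error) ** 2)

print(round(pi_half, 3), round(pi_half_rebuilt, 3))
```

The two half-widths agree to within rounding, which shows that most of the prediction interval's width comes from the tcrit × s term rather than from uncertainty about the fitted line.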
DATA SET J
12.40 Time-series.
Learning Objective: 02-3
12.41 Answers will vary. This information is collected by The Bureau of Labor Statistics.
Learning Objective: 02-7
12.42 Answers will vary. 47 years of data is a good sample. Methods of measurement can vary
over time. It is important to consider how the measures were calculated and compare
years in which the calculations were similar.
Learning Objective: 02-9
12.43 A positive slope would be logical. It would make sense that the change in Commodities
CPI would move in the same direction as change in Services CPI.
Learning Objective: 12-2
12.44
There is a fairly strong positive relationship between the Commodities% and Services%.
Learning Objective: 12-4
12.46 An increase in the change in Commodities CPI of 1% increases the change in Services CPI
by .830%. Yes, the intercept has meaning because it is possible that there is no change in the CPI between two years.
Learning Objective: 12-2
Regression Analysis
r² 0.727 n 47
r 0.853 k 1
Std. Error 1.574 Dep. Var. Services%
ANOVA table
Source SS df MS F p-value
Regression 297.2715 1 297.2715 120.03 2.76E-14
Residual 111.4532 45 2.4767
Total 408.7247 46
Studentized
Studentized Deleted
Observation Services% Predicted Residual Leverage Residual Residual
1 3.40 2.95 0.45 0.037 0.289 0.286
2 1.70 2.70 -1.00 0.041 -0.652 -0.648
3 2.00 2.95 -0.95 0.037 -0.618 -0.613
4 2.00 2.95 -0.95 0.037 -0.618 -0.613
5 2.00 3.20 -1.20 0.034 -0.778 -0.774
6 2.30 3.12 -0.82 0.035 -0.530 -0.526
7 3.80 4.37 -0.57 0.023 -0.363 -0.360
8 4.30 3.78 0.52 0.027 0.332 0.329
9 5.20 5.11 0.09 0.021 0.056 0.056
10 6.90 6.11 0.79 0.025 0.509 0.505
11 8.00 5.94 2.06 0.024 1.323 1.334
12 5.70 5.20 0.50 0.021 0.324 0.321
13 3.80 4.70 -0.90 0.022 -0.577 -0.572
14 4.40 8.35 -3.95 0.057 -2.584 -2.769
15 9.20 12.09 -2.89 0.185 -2.031 -2.107
16 9.60 9.51 0.09 0.086 0.058 0.058
17 8.30 5.78 2.52 0.023 1.622 1.653
18 7.70 7.02 0.68 0.034 0.438 0.434
19 8.60 8.18 0.42 0.053 0.272 0.269
20 11.00 11.59 -0.59 0.162 -0.408 -0.404
21 15.40 12.42 2.98 0.201 2.120 2.209
22 13.10 9.18 3.92 0.077 2.592 2.779
23 9.00 5.61 3.39 0.022 2.178 2.277
24 3.50 4.61 -1.11 0.022 -0.716 -0.712
25 5.20 5.03 0.17 0.021 0.110 0.108
26 5.10 3.95 1.15 0.026 0.740 0.736
12.48 a. 95% confidence interval for β1: (0.6776, 0.9828). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 45, t.025 =T.INV(.025,45) = ±2.014, tcalc = 10.956 > 2.014.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 2.76×10⁻¹⁴. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
12.49 a. R² = .727. 72.7% of the variation in the Services CPI change can be explained by
Commodities CPI change. This shows a fairly strong relationship.
b. Fcalc = 120.03 and its p-value = 2.76×10⁻¹⁴. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
12.51 a. The histogram shows a slight bell-shaped curve although there is more concentration in
the middle of the graph than you would see in a true normal distribution. The normplot
shows a similar pattern.
b. The histogram of residuals does not show obvious departure from normality.
Learning Objective: 12-10
12.52
The residual plot does not show significant signs of heteroscedasticity although there is a
slight fan out pattern as X increases.
Learning Objective: 12-10
12.53 An autocorrelation test is appropriate because this is time series data. The residual plot
below does not show obvious increasing or decreasing trends nor does it show signs of
negative autocorrelation. The DW statistic = 1.08 which indicates slight positive
autocorrelation. This is not unexpected because CPI indexes are economic data which
one would expect to be correlated year to year. The level of autocorrelation does
not invalidate the regression results.
12.54 Answers will vary. Confidence and prediction intervals for x = 1.5 and 2.5 are shown
below.
Predicted values for: Services%
95% Confidence Intervals 95% Prediction Intervals
Commodities% Predicted lower upper lower upper
1.5 3.4520 2.8982 4.0059 0.2343 6.6698
2.5 4.2822 3.7954 4.7690 1.0753 7.4891
Learning Objective: 12-9
12.56 No, r measures the strength and direction of the linear relationship, but not the amount of
variation explained by the explanatory variable.
Learning Objective: 12-15
12.57 H0: ρ = 0 versus H1: ρ ≠ 0. α = .025 so tcrit = t.0125 =T.INV(.0125,53) = ±2.3069.
tcalc = .3043 √[(55 − 2)/(1 − .3043²)] = 2.3256 > 2.3069 so we reject the null hypothesis.
The correlation is not equal to zero.
Learning Objective: 12-1
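The test statistic used in 12.57, and in the other correlation tests in this chapter, is tcalc = r √[(n − 2)/(1 − r²)]. The Python sketch below evaluates it; the function name corr_t is an assumption introduced for illustration, and the critical value still has to come from a t table or from Excel as in the solution.

```python
import math


def corr_t(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Problem 12.57: r = .3043 with n = 55, compared with the quoted critical value.
t_calc = corr_t(0.3043, 55)
t_crit = 2.3069
reject = abs(t_calc) > t_crit

print(round(t_calc, 4), reject)
```

The same function applies to any sample in the chapter: pass the sample correlation and the sample size, then compare the result with ±tcrit for n − 2 degrees of freedom.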
12.58 The correlation coefficient, r, is only .13, indicating that there exists a very weak positive
correlation between prices on successive days. The fact that it is a highly significant
result stems from a large sample size which increases power of the test. This means that
very small correlations will show statistical significance even though the correlation is
not truly important.
Learning Objective: 12-1
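The point made in 12.58 can be seen numerically: holding r fixed at .13, the t statistic grows roughly with the square root of n. The Python sketch below uses three hypothetical sample sizes chosen only for illustration (the solution does not state n) and a rough two-tailed cutoff of 2, which is also an assumption standing in for the exact critical value.

```python
import math

r = 0.13  # correlation reported in 12.58

# Hypothetical sample sizes; the actual n is not given in the solution.
sample_sizes = (50, 500, 5000)

# t statistic for H0: rho = 0 at each sample size.
t_by_n = {n: r * math.sqrt((n - 2) / (1 - r ** 2)) for n in sample_sizes}

for n, t in t_by_n.items():
    # A cutoff of 2 is only a rough stand-in for the two-tailed 5% critical value.
    print(n, round(t, 2), "significant" if abs(t) > 2 else "not significant")
```

The correlation explains the same small share of variation (r² is about .017) in every case; only the sample size, and therefore the power of the test, changes.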
of downtime will not completely resolve the issue. Indicates that there are most likely
other reasons why machines have the amount of downtime incurred.
Learning Objective: 12-6
Learning Objective: 12-8
12.64 a.
b. r = −.297. This shows a weak negative linear relationship between loyalty card use and
sales growth.
c. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 72 and α = .05, tcrit = t.025
=T.INV(.025,72) = ±1.9935. tcalc = −.297 √[(74 − 2)/(1 − (−.297)²)] = −2.639. Because −2.639
< −1.9935, we reject the hypothesis of no correlation and the sample evidence supports
the notion of negative correlation.
d. It appears that a higher loyalty card usage is associated with lower sales growth.
12.65 a. The scatter plot indicates that there is a positive correlation between the fertility rates in
1990 and 2000.
b. r = .749. There is a strong positive linear relationship between the fertility rates in 1990
and 2000.
c. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 13 and α = .05, t.025 =T.INV(.025,13) = ±2.160.
tcalc = .749 √[(15 − 2)/(1 − .749²)] = 4.076. Because 4.076 > 2.160, we reject the hypothesis of no
correlation and the sample evidence supports the notion of positive correlation.
Learning Objective: 12-1
b. r = −.105. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 25 and α = .05, tcrit = t.025
=T.INV(.025,25) = ±2.060. tcalc = −.105 √[(27 − 2)/(1 − (−.105)²)] = −0.528. Because −0.528 falls
between the critical values, we fail to reject the hypothesis of no correlation.
c. It appears there is very little relationship between price and accuracy rating of speakers.
Learning Objective: 12-1
12.67 For each of these, the scatter plot will contain the answers to (a), (b), and (d) with respect to
the fitted equation.
c. Salary: The fit is good. Assessed: The fit is excellent. HomePrice2: The fit is good.
d. Salary: An increase in the age by 1 year increases salary by $1447.4.
Assessed: An increase in 1 sq. ft. of floor space increases assessed value by $313.30.
HomePrice2: An increase in 1 sq. ft. of home size increases the selling price by
$209.20.
e. The intercept is not meaningful for any of these data sets as a zero value for any of X’s
respectively cannot realistically result in a positive Y value.
12.68 a. t = estimated slope / standard error.
See table below for calculations.
b. Answers shown in right column in table below.
12.69 a.
c. The fit of this regression is weak as given by R² = 0.2474. About 24.7% of the variation in %
Operating Margin is explained by % Equity Financing.
Learning Objective: 12-7
Learning Objective: 12-8
12.70 a.
c. The fit of this regression is very good as given by R² = 0.8216. The regression line
does show a strong positive linear relationship between molecular weight and retention
time, indicating that the greater the molecular weight the greater is the retention time.
Learning Objective: 12-7
Learning Objective: 12-8
12.71 a. Based on both the R² = 0 and the p-value > .10, there is no relationship between the
class size and teacher ratings.
b. Given that R² = 0, we have not “explained” teacher ratings in this bivariate model. Other
factors might be students’ expected GPA, whether the course is a core class or not, the
age of the student, gender of student, gender of instructor, etc. Answers will vary.
Learning Objective: 12-8
c. The fit of this regression is very good as given by R² = .8206. The regression line
shows a strong positive linear relationship between revenue and profit, indicating that
greater revenue is associated with higher profit.
Learning Objective: 12-7
Learning Objective: 12-8
12.73 a. The slope of each model indicates the impact an additional year in vehicle age has on
the price. This relationship for each model is negative indicating that an additional year
of age reduces the asking price. This impact ranges from a low for the Taurus (an
additional year reduces the asking price by $906) to a high for the Ford Explorer (an
additional year reduces the asking price by $2,452).
b. The intercepts could indicate the price of a new vehicle.
c. Based on the R² values: The fit is very good for the Explorer, the F-150 Pickup and the
Taurus. The fit is weak for the Mustang. One reason for the seemingly poor fit for the
Mustang is the fact that this is a collector item (if in good condition) so that age is
a less important factor in determining the asking price.
d. Answers will vary, but a bivariate model for 3 of the vehicles explains approximately
2/3 of the variation in asking price at a minimum. Other factors: condition of the car,
collector status, proposed usage, price of a new vehicle.
Learning Objective: 12-6
Learning Objective: 12-8
12.74 a. The regression results are not significant, based on the p-value, for the 1-Year holding
period. The results for the 2-Year period are significant at the 5% level, while for 5-, 8-,
and 10-Years the results are significant at the 1% level. For each regression there is an
inverse relationship between P/E and the stock return. For the 8-Year and 10-Year
period the relationship is approximately -1. The R² increases as the holding period
increases. This indicates that P/E ratio explains a greater portion of the variation in
stock return, the longer the stock is held.
b. Yes, given the data are time series, the potential for autocorrelation is present. Also, it is
commonly recognized that stock returns do exhibit a high degree of autocorrelation, as
do most financial series.
Learning Objective: 12-6
Learning Objective: 12-8
12.75 a. Using father’s height: my predicted height = 71 + 2.5 = 73.5”. My actual height =
73”. Using the average of my parents’ heights: my predicted height = 68 + 2.5 = 70.5”.
b. Fairly accurate: within 0.5” when using my father’s height and within 2.5” when using
the average parent height. Perhaps accuracy improves when only the father’s height is used
for males.
c. Regression analysis of samples of daughters and sons, with respective average height of
parents. Separate samples of each.
Learning Objective: 12-3