Chapter 12 - Simple Regression
12.1 For each sample: H0: ρ = 0 versus H1: ρ ≠ 0. The test statistic is
tcalc = r √[(n − 2)/(1 − r²)], with d.f. = n − 2.
The critical value comes from =T.INV(α/2, d.f.), which returns the lower-tail value −tcrit
(or from =T.INV.2T(α, d.f.), which returns +tcrit). Because these are all two-tailed
tests, the decision rule is: Reject H0 if tcalc > +tcrit or tcalc < −tcrit.
Summary Table
Decision
2.138 > 2.101, Reject H0
−1.977 < −1.701, Reject H0
1.677 is between the critical values, Fail to Reject H0
−2.416 < −2.390, Reject H0
Learning Objective: 12-1
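The test statistic and decision rule in 12.1 can be checked numerically. The following is a minimal sketch rather than part of the original solution: the function names `t_calc` and `reject_two_tailed` are hypothetical, and the critical value is assumed to be supplied by the reader (for example from Excel) rather than computed here.

```python
import math

def t_calc(r, n):
    """t statistic for H0: rho = 0, using d.f. = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def reject_two_tailed(t, t_crit):
    """Two-tailed rule: reject H0 when |t| exceeds the critical value."""
    return abs(t) > abs(t_crit)

# Illustration with the values used in exercise 12.2: r = .9199, n = 5, tcrit = 3.182
t = t_calc(0.9199, 5)
print(round(t, 3), reject_two_tailed(t, 3.182))
```

Taking the absolute value of the critical value lets the same function accept either the negative result of =T.INV(α/2, d.f.) or the positive result of =T.INV.2T(α, d.f.).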
12.2 a. The scatter plot shows a positive correlation between hours worked and weekly pay.
b.
Hours Worked (X)   Weekly Pay (Y)   (xi − x̄)(yi − ȳ)
10                 93               840
15                 171              30
20                 204              0
20                 156              0
35                 261              1260
x̄ = 20             ȳ = 177          SSxy = 2130
r = SSxy/√(SSxx·SSyy) = 2130/√((350)(15318)) = .9199
c. t.025 =T.INV(.025, 3) = −3.182, so tcrit = ±3.182, using d.f. = n − 2 = 3.
d. tcalc = .9199 √[(5 − 2)/(1 − (.9199)²)] = 4.063. We reject the null hypothesis of zero correlation
because 4.063 > 3.182.
e. p-value =T.DIST.2T(4.063,3) = .0269.
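The hand calculation of r in 12.2 can be reproduced from the raw data. This is an illustrative sketch only: the helper name `correlation` is hypothetical, and the data lists are the five (hours, pay) pairs from the table in part (b).

```python
import math

def correlation(xs, ys):
    """Sample correlation r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

hours = [10, 15, 20, 20, 35]      # Hours Worked (X)
pay = [93, 171, 204, 156, 261]    # Weekly Pay (Y)
r = correlation(hours, pay)
print(round(r, 4))
```

The same function applies unchanged to the operators and wait-time data in 12.3.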
12.3 a. The scatter plot shows a negative correlation between operators and wait time.
b.
Operators (X)   Wait (Y)   (xi − x̄)(yi − ȳ)
4               385        −76
5               335        12
6               383        0
7               344        −3
8               288        −118
x̄ = 6           ȳ = 347    SSxy = −185
r = SSxy/√(SSxx·SSyy) = −185/√((10)(6374)) = −.7328
c. t.025 =T.INV(.025, 3) = −3.182, so tcrit = ±3.182, using d.f. = 3.
d. tcalc = −.7328 √[(5 − 2)/(1 − (−.7328)²)] = −1.865. We fail to reject the null hypothesis of zero
correlation because −1.865 > −3.182.
e. p-value = T.DIST.2T(1.865,3) = .159.
12.4 a. The scatter plot shows little correlation between age and amount spent.
b. rcalc = −.292
c. t.025 =T.INV(.025, 8) = −2.306, so tcrit = ±2.306, using d.f. = 8.
d. tcalc = −.292 √[(10 − 2)/(1 − (−.292)²)] = −.864
e. Because tcalc (−.864) > −2.306, we fail to reject the null hypothesis of zero correlation.
12.5 a. The scatter plot shows a positive correlation between returns from last year and returns
from this year.
b. rcalc = .5313
c. t.025 =T.INV(.025, 15) = −2.131, so tcrit = ±2.131, using d.f. = 15.
d. tcalc = .5313 √[(17 − 2)/(1 − (.5313)²)] = 2.429
e. Because tcalc (2.429) > 2.131, we reject the null hypothesis of zero correlation.
12.6 a. The scatter plot shows a positive correlation between orders and ship cost.
b. rcalc = .820
c. t.025 =T.INV(.025, 10) = −2.228, so tcrit = ±2.228, using d.f. = 10.
d. tcalc = .820 √[(12 − 2)/(1 − (.820)²)] = 4.530
e. Because tcalc (4.530) > 2.228, we reject the null hypothesis of zero correlation.
12.7 a. Increasing the size of a home by 1 square foot increases the price by $150.
b. HomePrice = $125000 + $150×(2000) = $425,000
c. The intercept might be interpreted as the value of the lot without a home. But the range
of values for X does not include zero, so it would be dangerous to extrapolate to x = 0.
12.8 a. An increase in the price of the item of $1 reduces its expected sales by 37.5 units.
b. Sales = 842 – ($20)×37.5 = 92
c. From a practical point of view no. A zero price is unrealistic.
12.9 a. An increase in the median age of one year means the number of car thefts decreases by
35.3.
b. CarTheft = 1,667 - 35.3×40 = 255
c. The intercept would not be meaningful because you would not have a median age of
zero for any state.
12.10 a. An increase in the microprocessor speed of one MHz means the computer power
dissipation increases by 0.032 watts.
b. Computer power dissipation= 15.73 + 0.032×3000 = 111.73 watts
c. The intercept would not be meaningful because you would not have a computer with
zero speed.
12.11 a. An increase in a country’s Power distance index of one unit means the number of
international franchises increases by 1.75.
b. International franchises= -47.5 + 1.75×85 = 101.25
c. The intercept would not be meaningful because you cannot have a negative number of
franchises. While the range for the index does include zero, it is unlikely that a
country’s index value will be close to zero.
12.12 a. Increasing the average revenue by $1 million raises the net income by $30,700.
b. If revenue is zero, then net income is $2,277 million, which suggests that the firm has
net income when revenue is zero. This does not seem logical.
c. Net income = $2,277 + 0.0307×($20,000) = $2,891 million
12.13 a. Increasing the median income by $1,000 raises the median home price by $2610.
b. If median income is zero, then the model suggests that median home price is $51,300.
While it does not seem logical that the median family income for any city is zero, it is
unclear what the lower bound would be.
c. HomePrice = $51.3 + 2.61×($50) = $181.8 or $181,800
Homeprice = $51.3 + 2.61×($100) = $312.3 or $312,300
12.14 a. Increasing the number of hours worked per week by 1 hour reduces the expected
number of credits by .07.
b. Yes, the intercept makes sense in this situation. It is possible that a student does not
have a job outside of school.
c. Credits= 15.4 - .07×0 = 15.4 credits
Credits= 15.4 - .07×40 = 12.6 credits
The more hours a student works, the fewer credits (courses) he or she will take on average.
12.15 a. Chevy Blazer: a one year increase in vehicle age reduces the price by $1050.
Chevy Silverado: a one year increase in vehicle age reduces the price by $1339.
b. Chevy Blazer: If age = 0 then price = $16,189. This could be the price of a new Blazer.
Chevy Silverado: If age = 0 then price = $22,591. This could be the price of a new
Silverado.
c. $16,189 – $1,050×5 = $10,939
$22,591 -$1,339×5 = $15,896
12.16 a. ŷi = $2,277 + 0.0307($41,078) = $3,538.0946,
ei = yi − ŷi = 8,301 − 3,538.0946 = 4,762.9054. The regression equation underestimated
the net income.
b. ŷi = $2,277 + 0.0307($61,768) = $4,173.2776,
ei = yi − ŷi = 893 − 4,173.2776 = −3,280.2776. The regression equation overestimated
the net income.
12.17 a. ŷi = 15.4 − 0.07(14) = 14.42, ei = yi − ŷi = 18 − 14.42 = 3.58. The regression equation
underestimated the number of credits.
b. ŷi = 15.4 − 0.07(30) = 13.3, ei = yi − ŷi = 6 − 13.3 = −7.3. The regression equation
overestimated the number of credits.
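The residual calculations in 12.16 and 12.17 follow one pattern: compute the fitted value from the regression equation, then subtract it from the observed value. A minimal sketch of that pattern is shown below; the function name `fitted_and_residual` is hypothetical, and the coefficients are the ones quoted in 12.17.

```python
def fitted_and_residual(b0, b1, x, y):
    """Return (y_hat, e) where y_hat = b0 + b1*x and e = y - y_hat."""
    y_hat = b0 + b1 * x
    return y_hat, y - y_hat

# Exercise 12.17(a): Credits = 15.4 - 0.07 * Hours, observed (x = 14, y = 18)
y_hat, e = fitted_and_residual(15.4, -0.07, 14, 18)
direction = "underestimated" if e > 0 else "overestimated"
print(round(y_hat, 2), round(e, 2), direction)
```

A positive residual means the observed value lies above the fitted line, so the equation underestimated it; a negative residual means the opposite.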
12.18 a.
Hours Worked (X)   Weekly Pay (Y)   (xi − x̄)(yi − ȳ)
10                 93               840
15                 171              30
20                 204              0
20                 156              0
35                 261              1260
x̄ = 20             ȳ = 177          SSxy = 2130
b. b1 = SSxy/SSxx = 2130/350 = 6.086, b0 = 177 − 6.086(20) = 55.286, ŷ = 55.286 + 6.086x
c.
Hours         Weekly     Estimated
Worked (xi)   Pay (yi)   Pay (ŷi)    yi − ŷi    (yi − ŷi)²   (ŷi − ȳ)²   (yi − ȳ)²
10            93         116.146     −23.146    535.7373     3703.209    7056
15            171        146.576     24.424     596.5318     925.6198    36
20            204        177.006     26.994     728.676      3.6E-05     729
20            156        177.006     −21.006    441.252      3.6E-05     441
35            261        268.296     −7.296     53.23162     8334.96     7056
20            177        177.006     −0.006     3.6E-05      3.6E-05     0
x̄ = 20        ȳ = 177                           SSE          SSR         SST
                                                2355.429     12963.79    15318
d. R² = SSR/SST = 12,963/15,318 = .8462
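The steps in 12.18 (slope, intercept, sums of squares, and R²) can be sketched in a few lines. This is an illustration under stated assumptions, not the textbook's own procedure: the function name `fit_simple_regression` and the dictionary keys are hypothetical, and the data are the five observations from part (a).

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares for one predictor, with the ANOVA sums of squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx                 # slope
    b0 = y_bar - b1 * x_bar            # intercept
    fitted = [b0 + b1 * x for x in xs]
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # unexplained
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained
    sst = sum((y - y_bar) ** 2 for y in ys)               # total
    return {"b0": b0, "b1": b1, "sse": sse, "ssr": ssr, "sst": sst, "r2": ssr / sst}

hours = [10, 15, 20, 20, 35]
pay = [93, 171, 204, 156, 261]
fit = fit_simple_regression(hours, pay)
print({key: round(value, 4) for key, value in fit.items()})
```

Because the sketch keeps full precision for the slope instead of rounding it to 6.086, its SSR agrees with the MegaStat value in 12.23 (12,962.5714) rather than the hand-rounded 12963.79 in the table; R² is .8462 either way.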
e.

12.19 a.
Operators (X)   Wait (Y)   (xi − x̄)(yi − ȳ)
4               385        −76
5               335        12
6               383        0
7               344        −3
8               288        −118
x̄ = 6           ȳ = 347    SSxy = −185
b. b1 = SSxy/SSxx = −185/10 = −18.5, b0 = 347 + 18.5(6) = 458, ŷ = 458 − 18.5x
c.
Operators   Wait Time   Estimated
(xi)        (yi)        Time (ŷi)   yi − ŷi   (yi − ŷi)²   (ŷi − ȳ)²   (yi − ȳ)²
4           385         384         1         1            1369        1444
5           335         365.5       −30.5     930.25       342.25      144
6           383         347         36        1296         0           1296
7           344         328.5       15.5      240.25       342.25      9
8           288         310         −22       484          1369        3481
x̄ = 6       ȳ = 347                           SSE          SSR         SST
                                              2951.5       3422.5      6374
d. R² = SSR/SST = 3,422.5/6,374.0 = .5369
e.

12.20 a. and b.
c. An increase in age of 10 years leads to an average decrease in spending of $5.30.

d. The intercept is not meaningful in this case.
e. R2 = .0851. Only 8.51% of the variation in spending is explained by the variation in age. Age of the
consumer has little impact on the amount spent.
12.21 a. and b.
c. An increase of 1% in last year’s return leads to an increase, on average, of .458% for
this year’s return.
d. If last year’s return is zero, this year’s return is 11.155%. Yes, this is meaningful,
returns can be zero.
e. R2 = .2823. Only 28.23% of the variation in this year’s return is explained by last year’s
return.
12.22 a. and b.
c. An increase of 100 orders leads to an average increase in shipping cost of $493.22.

d. The intercept is not meaningful in this case.
e. R2 = .6717. 67.17% of the variation in shipping costs is explained by number of orders.
12.23 a. MegaStat regression output:

Regression Analysis
r² 0.846 n 5
r 0.920 k 1
Std. Error 28.020 Dep. Var. Weekly Pay (Y)
ANOVA table
Source SS df MS F p-value
Regression 12,962.5714 1 12,962.5714 16.51 .0269
Residual 2,355.4286 3 785.1429
Total 15,318.0000 4
Regression output confidence interval

variables coefficients std. error t (df=3) p-value 95% lower 95% upper
Intercept 55.2857 32.4705 1.703 .1872 -48.0500 158.6214
Hours Worked (X) 6.0857 1.4978 4.063 .0269 1.3192 10.8522
b. H0: 1 = 0 vs. H1: 1 ≠ 0.

c. p-value = .0269, (1.3192, 10,8522)
d. The slope is significantly different from zero because the p-value is less than .05.

Regression Analysis
r² 0.537 n 5
r -0.733 k 1
Std. Error 31.366 Dep. Var. Wait Time (Y)
ANOVA table
Regression 3,422.5000 1 3,422.5000 3.48 .1590
Residual 2,951.5000 3 983.8333
Total 6,374.0000 4

Intercept 458.0000 61.1438 7.491 .0049 263.4131 652.5869
Operators (X) -18.5000 9.9188 -1.865 .1590 -50.0662 13.0662
b. H0: 1 = 0 vs. H1: 1 ≠ 0.

c. p-value = .1590, (-50.0662, 13.0662)
12-10
d. The slope is not significantly different from zero because the p-value is greater than .05.
12.25 a. ŷ = 557.4511 + 3.0047x

b. For a 95% confidence level use t.025 =T.INV(.025,30) = −2.042. The 95% confidence
interval is 3.0047 ± 2.042(0.8820) or (1.203, 4.806).
c. H0: β1 ≤ 0 versus H1: β1 > 0. tcrit =T.INV(.95,30) = 1.697. Reject the null hypothesis if tcalc >
1.697. tcalc = 3.0047/0.8820 = 3.407 > 1.697, so we reject the null hypothesis.
d. p-value =T.DIST.RT(3.407,30) = .0009 < .05 so we reject the null hypothesis. The slope
is positive. Increased debt is correlated with increased NFL team value.
12.26 a. ŷ = 7.6425 + 0.9467x

b. For a 95% confidence level use t.025 =T.INV(.025,14) = −2.145. The 95% confidence
interval is 0.9467 ± 2.145(0.0936) or (0.7460, 1.1473).
c. H0: β1 ≤ 0 versus H1: β1> 0. tcrit=T.INV(.95,14) = 1.761. Reject the null hypothesis if
tcalc> 1.761. tcalc = 0.9467/0.0936 = 10.114> 1.761 so we reject the null hypothesis.
d. p-value =T.DIST.RT(10.114,14) = .0000 < .05 so we reject the null hypothesis. The
slope is positive. Increased revenue is correlated with increased expenses.
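The interval and test calculations in 12.25 and 12.26 use only the slope estimate, its standard error, and a critical value. The sketch below illustrates them under that assumption; the function names are hypothetical, and the critical values are taken from the solutions above rather than computed.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Confidence interval b1 +/- |t_crit| * se(b1)."""
    margin = abs(t_crit) * se_b1
    return b1 - margin, b1 + margin

def slope_t_statistic(b1, se_b1, beta1_null=0.0):
    """t statistic for a hypothesized slope (zero by default)."""
    return (b1 - beta1_null) / se_b1

# Exercise 12.25: b1 = 3.0047, se = 0.8820, d.f. = 30
low, high = slope_confidence_interval(3.0047, 0.8820, -2.042)
t = slope_t_statistic(3.0047, 0.8820)
print(round(low, 3), round(high, 3))
print(round(t, 3), t > 1.697)   # right-tailed test: reject H0 when t exceeds 1.697
```

Using the absolute value of the critical value means the negative number returned by =T.INV(.025, d.f.) can be passed in directly.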
12.27 a. ŷ = 1.8064 + .0039x

b. Intercept: tcalc = 1.8064/0.6116 = 2.954, Slope: tcalc = 0.0039/0.0014 = 2.786 (Excel
value is slightly different due to internal rounding.)
c. d.f. = 10, t.025 =T.INV(.025, 10) = −2.228 so tcrit = ±2.228. Or one could use
=T.INV.2T(.05,10)
d. Intercept: p-value =T.DIST.2T(2.954,10) = .0144. Slope: p-value
=T.DIST.2T(2.869,10) = .0167.
e. Fcalc = (2.869)² = 8.23
f. This model fits the data fairly well. The F statistic is significant at α = .05 (p-value =
.0167). Also, R2 = .452, indicating almost half of the variation in annual taxes is
explained by home price.
12.28 a. ŷ = 614.930 − 109.112x

b. Intercept: tcalc = 614.930/51.2343 = 12.002. Slope: tcalc = −109.112/51.3623 = −2.124.
c. d.f. = 18, t.025 =T.INV(.025, 18) = −2.101, so tcrit = ±2.101.
d. Intercept: p-value =T.DIST.2T(12.002,18) = .0000, Slope: p-value
=T.DIST.2T(2.124,18) = .0478
e. Fcalc = (−2.124)² = 4.51
f. This model has a poor fit. The F statistic is barely significant at a level of .05 (p-value =
.0478) and R2 = .2. Only 20% of the variation in units sold can be explained by average
price.
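Part (e) of 12.27 and 12.28 relies on a fact specific to simple regression: with one predictor, the ANOVA F statistic equals the square of the slope's t statistic, so the two tests give the same p-value. A short sketch (the function name `f_from_t` is hypothetical) confirms the arithmetic:

```python
def f_from_t(t_slope):
    """With a single predictor, F equals the squared slope t statistic."""
    return t_slope ** 2

# Slope t statistics quoted in 12.27 and 12.28
for label, t_slope in [("12.27", 2.869), ("12.28", -2.124)]:
    print(label, round(f_from_t(t_slope), 2))
```

The sign of t is lost when squaring, which is why the F test cannot distinguish a positive slope from a negative one.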
Regression Analysis
r² 0.085 n 10
r -0.292 k 1
Std. Error 2.128 Dep. Var. Spent (Y)
ANOVA table
Regression 3.3727 1 3.3727 0.74 .4133
Residual 36.2396 8 4.5299
Total 39.6123 9

Intercept 6.9609 2.0885 3.333 .0103 2.1449 11.7770
Age (X) -0.0530 0.0614 -0.863 .4133 -0.1946 0.0886
b. (−0.1946, 0.0886) This interval contains zero, therefore we cannot conclude that the
slope is different from zero.
c. The t statistic is −0.863 and the p-value is .4133. Because the p-value is greater than
0.05, we cannot conclude that the slope is different from zero.
d. Fcalc = 0.745 with a p-value = .4133. This indicates that the model does not fit the data.
e. The p-values match. Fcalc= (−0.863)2 = 0.745.
f. This model does not fit the data. The F statistic is not significant.

Regression Analysis
r² 0.282 n 17
r 0.531 k 1
Std. Error 4.335 Dep. Var. This Year (Y)
ANOVA table
Regression 110.8585 1 110.8585 5.90 .0282
Residual 281.8321 15 18.7888
Total 392.6906 16

Intercept 11.1549 2.1907 5.092 .0001 6.4854 15.8243
Last Year (X) 0.4580 0.1885 2.429 .0282 0.0561 0.8598
b. (0.0561, 0.8598) This interval does not contain zero and falls on the positive side
therefore we can conclude that the slope is greater than zero.
c. The t statistic is 2.429 and the p-value is .0282. Because the p-value is less than 0.05,
we can conclude that the slope is positive.
d. Fcalc = 5.90 with a p-value = .0282. This indicates that the model does provide some fit
to the data.
e. The p-values match. Fcalc= (2.429)2 = 5.90.
f. This model provides modest fit to the data. Although the F statistic is significant, R2
shows that only 28% of the variation in this year’s return is explained by last year’s
return.
Regression Analysis
r² 0.672 n 12
r 0.820 k 1
Std. Error 599.029 Dep. Var. Ship Cost (Y)
ANOVA table
Regression 7,340,819.5514 1 7,340,819.5514 20.46 .0011
Residual 3,588,357.1152 10 358,835.7115
Total 10,929,176.6667 11

Intercept -31.1895 1,059.8678 -0.029 .9771 -2,392.7222 2,330.3432
Orders (X) 4.9322 1.0905 4.523 .0011 2.5024 7.3619
b. (2.5024, 7.3619) This interval does not contain zero therefore we can conclude that the
slope is greater than zero.
c. The t statistic is 4.523 and the p-value is 0.0011. Because the p-value is less than 0.05,
we can conclude that the slope is positive.
d. Fcalc = 20.46 with a p-value = .0011. This indicates that the model does provide some fit
to the data.
e. The p-values match. Fcalc = (4.523)2 = 20.46.
f. This model provides a good fit to the data. The F statistic is highly significant and R2
shows that 67% of the variation in shipping cost is explained by number of orders.
12.32 a. MegaStat Predictions:
Predicted values for: Weekly Pay (Y)

95% Confidence Intervals 95% Prediction Intervals
Hours Worked (X) Predicted lower upper lower upper
12 128.314 73.138 183.491 23.451 233.178
17 158.743 116.377 201.109 60.017 257.469
21 183.086 142.922 223.249 85.285 280.887
25 207.429 160.970 253.887 106.879 307.978
30 237.857 175.709 300.005 129.164 346.551
b. x = 17: 95% confidence interval (116.377, 201.109), 95% prediction interval (60.017,
257.469)
c. The 95% confidence interval for µY: 177 ± 2.776 × 61.883/√5 or (100.174, 253.826).
d. The margin of error for the confidence interval in part (c) is 76.826 whereas the
margin of error for the confidence interval in part (b) is less than 76.826. Knowing the
number of hours a student works helps us better estimate the student's average
weekly pay.
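The comparison in parts (c) and (d) can be made concrete by computing both margins of error. The sketch below is illustrative only: the function name `mean_interval` is hypothetical, and the inputs (ȳ = 177, s = 61.883, n = 5, t.025 = 2.776, and the regression interval at x = 17) are taken from the solution above.

```python
import math

def mean_interval(y_bar, s, n, t_crit):
    """Interval for the mean of Y that ignores X: y_bar +/- t * s / sqrt(n)."""
    margin = abs(t_crit) * s / math.sqrt(n)
    return y_bar - margin, y_bar + margin, margin

low, high, margin_no_x = mean_interval(177, 61.883, 5, 2.776)

# Half-width of the regression confidence interval at x = 17 from part (b)
margin_with_x = (201.109 - 116.377) / 2

print(round(low, 3), round(high, 3), round(margin_no_x, 3))
print(round(margin_with_x, 3), margin_with_x < margin_no_x)
```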
12.33 a. MegaStat Predictions:
Predicted values for: Profit (Y)

Revenue (X) Predicted lower upper lower upper
1.8 -0.070148 -0.554318 0.414023 -1.326790 1.186495
15.0 0.757399 0.367076 1.147723 -0.466153 1.980952
30.0 1.697794 1.106753 2.288834 0.396234 2.999353
b. x = 15: 95% confidence interval (0.3671, 1.1477), 95% prediction interval

(-0.4662,1.9810)
c. The 95% confidence interval for µY: 0.628 ± 2.306 × 1.083/√9 or (−0.2045, 1.4605).
d. The margin of error for the confidence interval in part (c) is 0.8325 whereas the margin
of error for the confidence interval in part b is less than 0.8325. Knowing the revenue
an entertainment company brings in helps us better estimate their average profit.
12.34 No, these plots do not show that regression error assumptions of normality or constant
variance have been violated. The plot on the left is a normplot and shows a fairly
straight line on the diagonal. This indicates that the assumption of a normal distribution
for residuals is reasonable. The plot on the right shows residual values plotted against
the corresponding x value. The plot suggests that the residuals are homoscedastic
because there is no increase or decrease in residual magnitude.
12.35 The plot on the left is a normplot and shows a fairly straight line on the diagonal. This
indicates that the assumption of a normal distribution for residuals is reasonable. The
plot on the right shows residual values plotted against the corresponding x value. The
plot suggests that the residuals are heteroscedastic because there is an increase in
residual magnitude as the x values increase.
12.36 a. Predicted Defects = 3.2 + 0.045(100) = 7.7 defects per million parts
b. ei = yi − ŷi = 4.4 – 7.7 = −3.3
c. ei* = ei/sei = −3.3/1.07 = −3.084
d. Yes, this residual is considered an outlier because ei* < −3.
12.37 a. Predicted MPG = 49.22 – 0.081(200) = 33.02 MPG

b. ei = yi − ŷi = 38.15 – 33.02 = 5.13
c. ei* = ei/sei = 5.13/2.03 = 2.527
d. No, this residual is not considered an outlier because ei* < 3 but it is considered unusual
because ei* > 2.
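The rule applied in 12.36 and 12.37 (a standardized residual beyond ±2 is unusual, beyond ±3 is an outlier) can be written as a small classifier. This is a sketch under that rule; the function names and the label "typical" for residuals within ±2 are hypothetical.

```python
def standardized_residual(e, s_e):
    """Residual divided by its standard error."""
    return e / s_e

def classify_residual(e_star):
    """Apply the |e*| > 3 (outlier) and |e*| > 2 (unusual) cutoffs."""
    size = abs(e_star)
    if size > 3:
        return "outlier"
    if size > 2:
        return "unusual"
    return "typical"

for e, s_e in [(-3.3, 1.07), (5.13, 2.03)]:   # values from 12.36 and 12.37
    e_star = standardized_residual(e, s_e)
    print(round(e_star, 3), classify_residual(e_star))
```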
12.38 a. hi = 1/n + (xi − x̄)²/SSxx = 1/29 + (2382 − 2004)²/999,603 = 0.1774. 4/n = 4/29 = 0.1379. Because
0.1774 > 0.1379 this would be considered a high leverage observation.
b. hi = 1/n + (xi − x̄)²/SSxx = 1/29 + (2125 − 2004)²/999,603 = 0.0491. 4/n = 4/29 = 0.1379. Because 0.0491
< 0.1379 this would not be considered a high leverage observation.
c. hi = 1/n + (xi − x̄)²/SSxx = 1/29 + (1620 − 2004)²/999,603 = 0.1820. 4/n = 4/29 = 0.1379. Because
0.1820 > 0.1379 this would be considered a high leverage observation.
12.39 a. hi = 1/n + (xi − x̄)²/SSxx = 1/74 + (0.072 − 2.027)²/22.285 = 0.185. 4/n = 4/74 = 0.0541. Because 0.185
> 0.0541 this would be considered a high leverage observation.
b. hi = 1/n + (xi − x̄)²/SSxx = 1/74 + (1.413 − 2.027)²/22.285 = 0.0304. 4/n = 4/74 = 0.0541. Because
0.0304 < 0.0541 this would not be considered a high leverage observation.
c. hi = 1/n + (xi − x̄)²/SSxx = 1/74 + (3.376 − 2.027)²/22.285 = 0.0952. 4/n = 4/74 = 0.0541. Because
0.0952 > 0.0541 this would be considered a high leverage observation.
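The leverage calculations in 12.38 and 12.39 all use the same formula and the same 4/n cutoff. A minimal sketch follows; the function names `leverage` and `is_high_leverage` are hypothetical, and the inputs are the values given in 12.38.

```python
def leverage(x, x_bar, ss_xx, n):
    """Leverage h = 1/n + (x - x_bar)^2 / SSxx for one observation."""
    return 1 / n + (x - x_bar) ** 2 / ss_xx

def is_high_leverage(h, n):
    """Flag the observation when h exceeds the 4/n cutoff used in these solutions."""
    return h > 4 / n

n, x_bar, ss_xx = 29, 2004, 999603    # values from exercise 12.38
for x in (2382, 2125, 1620):
    h = leverage(x, x_bar, ss_xx, n)
    print(x, round(h, 4), is_high_leverage(h, n))
```

Leverage depends only on how far x lies from the mean of X, not on the observed y, so an observation can have high leverage without having a large residual.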
Questions 12.40 through 12.55 refer to 10 different data sets labeled A-J. The answers to each question
are listed for each data set in turn. Note that one can find the tcrit values using either
=T.INV(α/2, df) or =T.INV.2T(α, df).
DATA SET A
12.40 Cross-sectional.
12.41 Answers will vary. Most likely from a survey similar to the Current Population Survey
conducted by The Bureau of Labor Statistics.
12.42 Answers will vary. Sample size is sufficient for educational purposes but most government
studies will have much larger sample sizes.
12.43 A positive slope would be logical. It makes sense that higher income is associated with
higher home values. Cause and effect cannot be assumed. An increase in income does
not automatically compel a family to purchase a more expensive home.
12.44
There is a moderate positive relationship between Median Income and Median Home
Price.
12.45 See graph above. Yes, a linear relationship is plausible.

12.46 An increase in median income of $1000, increases home price by $2,609.80. No, the
intercept does not have meaning because it seems unlikely that the median income for a
family will be equal to zero.
12.47 MegaStat output is provided.
Regression Analysis
r² 0.340 n 34
r 0.583 k 1
Std. Error 58.855 Dep. Var. Home
ANOVA table
Regression 57,071.8007 1 57,071.8007 16.48 .0003
Residual 110,844.4754 32 3,463.8899
Total 167,916.2761 33

Intercept 51.2465 55.9415 0.916 .3665 -62.7025 165.1955
Income 2.6098 0.6430 4.059 .0003 1.3002 3.9195
Studentized
Studentized Deleted
Observation Home Predicted Residual Leverage Residual Residual
1 290.0000 207.7728 82.2272 0.108 1.479 1.508
2 279.9000 344.6811 -64.7811 0.115 -1.170 -1.177
3 338.2500 332.7568 5.4932 0.089 0.098 0.096
4 316.0000 295.2225 20.7775 0.037 0.360 0.355
5 207.0000 252.4398 -45.4398 0.038 -0.787 -0.782
6 250.0000 252.8364 -2.8364 0.038 -0.049 -0.048
7 320.0000 298.8240 21.1760 0.040 0.367 0.362
8 150.0000 191.5449 -41.5449 0.150 -0.766 -0.761
9 230.0000 274.9494 -44.9494 0.029 -0.775 -0.770
10 199.0000 252.2884 -53.2884 0.038 -0.923 -0.921
11 218.5000 214.7044 3.7956 0.092 0.068 0.067
12 290.0000 337.0265 -47.0265 0.098 -0.841 -0.837
13 315.0000 278.2247 36.7753 0.030 0.634 0.628
14 248.0000 269.3827 -21.3827 0.030 -0.369 -0.364
15 290.0000 271.8724 18.1276 0.030 0.313 0.308
16 220.0000 220.7383 -0.7383 0.080 -0.013 -0.013
17 170.4500 219.3995 -48.9495 0.083 -0.868 -0.865
18 290.0000 296.5352 -6.5352 0.038 -0.113 -0.111
19 205.0000 320.0496 -115.0496 0.066 -2.022 -2.131
20 410.0000 289.3791 120.6209 0.033 2.084 2.207
21 379.9750 335.0874 44.8876 0.094 0.801 0.796
22 135.0000 221.2733 -86.2733 0.079 -1.528 -1.562

23 358.5000 306.2855 52.2145 0.047 0.909 0.906
24 342.5000 257.8630 84.6370 0.034 1.463 1.491
25 341.0000 288.2803 52.7197 0.033 0.911 0.908
26 287.4500 314.0966 -26.6466 0.057 -0.466 -0.460
27 214.5000 259.5228 -45.0228 0.033 -0.778 -0.773
28 330.8750 220.7644 110.1106 0.080 1.951 2.045
29 444.5000 322.9831 121.5169 0.070 2.141 2.277
30 240.0000 273.7698 -33.7698 0.029 -0.582 -0.576
31 226.4500 250.9756 -24.5256 0.039 -0.425 -0.420
32 278.2500 320.9709 -42.7209 0.067 -0.752 -0.746
33 290.0000 293.8079 -3.8079 0.036 -0.066 -0.065
34 230.0000 249.7908 -19.7908 0.040 -0.343 -0.338
12.48 a. 95% confidence interval for β1: (1.3002, 3.9195). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 32, t.025 =T.INV(.025, 32) = ±2.037, tcalc = 4.059 > 2.037.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0003. This means that 3 times out of 10,000 we will see a sample this
extreme if the slope is actually equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
12.49 a. R2 = .34. 34% of the variation in median home price can be explained by median
household income. Closer to 100% is always desirable but this is still decent
considering we are using only one predictor variable.
b. Fcalc = 16.48 and its p-value = .0003. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
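The fit statistics quoted in 12.49 can be recovered from the ANOVA sums of squares alone. The sketch below is an illustration, not MegaStat's own procedure: the function name `anova_summary` and its dictionary keys are hypothetical, and the inputs are the regression and residual sums of squares from the output in 12.47.

```python
import math

def anova_summary(ss_regression, ss_residual, n, k=1):
    """R-squared, F statistic, and standard error from the ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    ms_regression = ss_regression / k            # regression d.f. = k
    ms_residual = ss_residual / (n - k - 1)      # residual d.f. = n - k - 1
    return {
        "r2": ss_regression / ss_total,
        "f": ms_regression / ms_residual,
        "std_error": math.sqrt(ms_residual),
    }

# Data Set A: n = 34 cities, one predictor (median income)
summary = anova_summary(57071.8007, 110844.4754, n=34)
print({key: round(value, 3) for key, value in summary.items()})
```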
12.50 Observations 19, 20, 28, and 29 have unusual residual values. Observation 19 is in PA. The
model overestimated the median home price. The other three cities were in NJ and NY.
The model underestimated the median home price. Home prices tend to be higher in the
northeast part of the country. More research would be needed to better explain the
unusual observations.
12.51 a. The normplot of residuals show a fairly straight line on the diagonal. The histogram of
residuals shows a fairly symmetric distribution with slight skewness to the right.
b. The residuals do not show any obvious departure from normality. The assumption of
residual normality is valid.
12.52
The residual plot does not show signs of heteroscedasticity.

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
12.54 Answers will vary. Confidence and prediction intervals for x = $75,000 and $105,000 are
shown in the table below.
Predicted values for: Home

Income Predicted lower upper lower upper
75.000 246.982632 222.190412 271.774852 124.562573 369.402690
105.000 325.277089 292.571489 357.982690 201.012561 449.541617
12.55 Observation 8 has a high leverage. This is Chesapeake, VA, which has a very low median
income.
DATA SET B
12.41 Data were most likely from a semester’s class of 58 students. This can be treated as a
sample from the larger population of students who take statistics over several years.
12.42 A sample size of 58 should be sufficient to draw significant conclusions about the
relationship between midterm and final exam scores.
12.43 A positive slope would be logical. It makes sense that higher midterm scores are associated
with higher final exam scores and vice versa. While cause and effect cannot be assumed
it would be reasonable to hypothesize that if a student performs well on their midterm
exam due to a solid understanding of the material they will then be able to comprehend
the material in the last half of the semester and will be well-prepared for their final
exam.
12.44
There is a moderate positive relationship between Midterm Exam Score and Final
Exam Score.

12.46 A one point increase in the midterm score is associated with a 1.014 point increase in the
final exam score. While theoretically a student could score zero on the midterm, the y-
intercept is negative which is not feasible.
Regression Analysis
r² 0.430 n 58
r 0.656 k 1
Std. Error 7.517 Dep. Var. Final Exam Score
ANOVA table
Regression 2,385.5446 1 2,385.5446 42.22 2.33E-08
Residual 3,164.0588 56 56.5011
Total 5,549.6034 57

Intercept -1.5568 12.0004 -0.130 .8972 -25.5965 22.4829
Midterm Exam Score 1.0138 0.1560 6.498 2.33E-08 0.7012 1.3263
Studentized
Studentized Deleted
Observation Final Exam Score Predicted Residual Leverage Residual Residual
1 78.0 79.5 -1.5 0.022 -0.208 -0.206
2 85.0 86.6 -1.6 0.063 -0.226 -0.224
3 81.0 71.4 9.6 0.027 1.290 1.297
4 54.0 68.4 -14.4 0.042 -1.957 -2.009
5 70.0 85.6 -15.6 0.055 -2.139 -2.212
6 73.0 82.6 -9.6 0.035 -1.298 -1.306
7 89.0 77.5 11.5 0.018 1.541 1.561
8 84.0 74.5 9.5 0.018 1.279 1.286
9 86.0 73.5 12.5 0.020 1.685 1.714
10 79.0 74.5 4.5 0.018 0.607 0.604
11 75.0 83.6 -8.6 0.040 -1.168 -1.172
12 63.0 72.4 -9.4 0.023 -1.272 -1.279
13 72.0 73.5 -1.5 0.020 -0.197 -0.195
14 69.0 72.4 -3.4 0.023 -0.464 -0.461
15 86.0 79.5 6.5 0.022 0.868 0.866
16 78.0 74.5 3.5 0.018 0.473 0.470
17 75.0 71.4 3.6 0.027 0.481 0.477
18 68.0 76.5 -8.5 0.017 -1.141 -1.145
19 77.0 75.5 1.5 0.017 0.203 0.201
20 78.0 65.4 12.6 0.066 1.741 1.774
21 77.0 73.5 3.5 0.020 0.475 0.472
22 73.0 70.4 2.6 0.031 0.348 0.346
23 79.0 84.6 -5.6 0.047 -0.765 -0.762
24 74.0 73.5 0.5 0.020 0.072 0.071
25 79.0 75.5 3.5 0.017 0.471 0.468
26 73.0 75.5 -2.5 0.017 -0.334 -0.332
27 72.0 83.6 -11.6 0.040 -1.576 -1.597
28 81.0 76.5 4.5 0.017 0.603 0.600
29 86.0 77.5 8.5 0.018 1.139 1.142
30 76.0 85.6 -9.6 0.055 -1.318 -1.327
31 83.0 80.6 2.4 0.025 0.329 0.326
32 83.0 77.5 5.5 0.018 0.736 0.733
33 86.0 84.6 1.4 0.047 0.189 0.187
34 71.0 72.4 -1.4 0.023 -0.195 -0.193
35 83.0 82.6 0.4 0.035 0.056 0.055
36 79.0 82.6 -3.6 0.035 -0.486 -0.482
37 68.0 71.4 -3.4 0.027 -0.463 -0.460
38 90.0 82.6 7.4 0.035 1.004 1.004
39 89.0 83.6 5.4 0.040 0.733 0.730
40 83.0 72.4 10.6 0.023 1.420 1.433
41 81.0 85.6 -4.6 0.055 -0.633 -0.630
42 79.0 75.5 3.5 0.017 0.471 0.468
43 58.0 57.2 0.8 0.167 0.110 0.109
44 77.0 72.4 4.6 0.023 0.612 0.609
45 85.0 80.6 4.4 0.025 0.598 0.595
46 67.0 71.4 -4.4 0.027 -0.598 -0.595
47 70.0 76.5 -6.5 0.017 -0.873 -0.871
48 79.0 81.6 -2.6 0.030 -0.348 -0.345
49 59.0 63.3 -4.3 0.086 -0.602 -0.599
50 74.0 78.5 -4.5 0.020 -0.609 -0.606
51 86.0 78.5 7.5 0.020 1.003 1.003
52 85.0 81.6 3.4 0.030 0.463 0.459
53 79.0 74.5 4.5 0.018 0.607 0.604
54 81.0 77.5 3.5 0.018 0.467 0.464
55 31.0 57.2 -26.2 0.167 -3.826 -4.411
56 82.0 79.5 2.5 0.022 0.330 0.327
57 70.0 67.4 2.6 0.050 0.357 0.355
58 69.0 72.4 -3.4 0.023 -0.464 -0.461
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 56, t.025 =T.INV(.025,56) = −2.003, so tcrit = ±2.003. tcalc = 6.498 > 2.003.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0000. This means that there is very little chance of observing this
sample if there is no correlation between midterm scores and final scores.
12.49 a. R2 = .43. 43% of the variation in final exam scores can be explained by midterm exam
scores. Closer to 100% is always desirable but this is still decent considering we are
using only one predictor variable.
12.50 Students 4 and 5 have unusual residual values with residuals that are negative. The model
overestimated the final exam score for these two students. Student 55 had a residual
beyond −4 which means the model significantly overestimated the final exam score.
This observation would be considered an outlier. Low exam scores are typically due to
students who do not study or attend class. On occasion the material is very difficult
which can also result in a low exam score.
12.51 a. The normplot of residuals show a fairly straight line on the diagonal. The histogram of
residuals shows a skewed left distribution which is fairly common for exam scores.
Most students do well on exams with a few students in the low range.
b. The residuals do not show strong departure from normality. The assumption of residual
normality is valid.
12.52
The residual plot does not show signs of heteroscedasticity although the low final exam
score can be seen in the lower part of the graph.
12.54 Answers will vary. Confidence and prediction intervals for x = 75 and 95 are shown in the
table below.
Predicted values for: Final Exam Score

95% Confidence Intervals 95% Prediction Intervals
Midterm Exam Score Predicted lower upper lower upper
75 74.477 72.433 76.521 59.281 89.673
95 94.753 88.688 100.818 78.520 110.986
12.55 Students 43, 49, and 55 have high leverage. These three students had very low midterm
scores.
DATA SET C
12.41 Answers will vary. Hospital data bases will contain this type of information.
12.42 Answers will vary. Sample size is sufficient for educational purposes but most studies
conducted by a hospital will use larger sample sizes because the data is available.
12.43 A positive slope would be logical. The estimated length of stay of a patient should be based
on their condition upon admission and should be related to their actual length of stay.
However, estimating a patient’s time in the hospital should not cause that time to be
longer or shorter.
12.44
There is a strong positive relationship between ELOS and ALOS.


12.46 An increase in ELOS of one month increases ALOS by 1.03 months. No, the intercept does
not have meaning because by definition admission to a hospital implies staying in the
hospital.

Regression Analysis
r² 0.625 n 16
r 0.791 k 1
Std. Error 2.282 Dep. Var. ALOS
ANOVA table
Regression 121.5515 1 121.5515 23.35 .0003
Residual 72.8860 14 5.2061
Total 194.4375 15

Intercept 0.5147 1.5903 0.324 .7510 -2.8961 3.9255
ELOS 1.0293 0.2130 4.832 .0003 0.5724 1.4862
Studentized
Studentized Deleted
Observation ALOS Predicted Residual Leverage Residual Residual
1 10.00 11.32 -1.32 0.171 -0.636 -0.622
2 2.00 5.15 -3.15 0.116 -1.466 -1.536
3 4.00 8.23 -4.23 0.065 -1.919 -2.154
4 11.00 12.87 -1.87 0.283 -0.966 -0.963
5 11.00 8.23 2.77 0.065 1.254 1.282
6 11.00 9.78 1.22 0.098 0.564 0.550
7 6.50 6.69 -0.19 0.071 -0.087 -0.083
8 5.00 5.66 -0.66 0.096 -0.305 -0.295
9 8.00 6.69 1.31 0.071 0.595 0.581
10 16.00 12.87 3.13 0.283 1.622 1.735
11 6.50 7.72 -1.22 0.063 -0.552 -0.538
12 6.00 5.15 0.85 0.116 0.398 0.385
13 3.50 4.12 -0.62 0.167 -0.296 -0.287
14 10.00 6.69 3.31 0.071 1.505 1.584
15 7.00 8.23 -1.23 0.065 -0.559 -0.545
16 5.50 3.60 1.90 0.200 0.930 0.925
c. The p-value = .0003. This means that 3 times out of 10,000 we will see a sample this
extreme if the slope is actually equal to zero.
12.49 a. R2 = .625. 62.5% of the variation in actual length of stay of a patient can be explained
by their expected length of stay. This shows a moderately strong relationship.
12.50 There are no unusual residuals.

12.51 a. The histogram of residuals does not show a clear normal distribution; however, the
normal plot of residuals has a modest straight line.
b. While the residuals show some departure from normality, this is a very small sample so
it is difficult to see clear normality. There are no strong outliers in the data set so at this
point the slight departure from normality is not too troublesome.
12.52
The residual plot does not show signs of heteroscedasticity.

12.54 Answers will vary. Prediction and confidence intervals for x = 4 and 9 days are shown.
Predicted values for: ALOS
ELOS Predicted lower upper lower upper
4 4.6318 2.8052 6.4584 -0.5917 9.8554
9 9.7782 8.2426 11.3138 4.6492 14.9072
12.55 Observations 4 and 10 have high leverage. These patients had unusually long estimated
lengths of stay.
DATA SET D
12.41 Answers will vary. This type of information could probably be collected from the airplane
manufacturer.
12.42 Answers will vary. Sample size of 52 should be sufficient to observe important
relationships.
12.43 A positive slope would be logical. Cruise speed should be related to engine size. And yes, a
cause and effect relationship would make sense.
12.44
There is a strong positive relationship between TotalHP and CruiseSpeed.


12.46 An increase of one unit of horsepower increases cruise speed by .1931 mph. No, the
intercept does not have meaning because an engine cannot have zero horsepower.
Regression Analysis
r² 0.684 n 52
r 0.827 k 1
Std. Error 20.596 Dep. Var. Cruise
ANOVA table
Regression 45,896.1706 1 45,896.1706 108.20 4.20E-14
Residual 21,209.2717 50 424.1854
Total 67,105.4423 51

Intercept 103.0870 6.2426 16.513 6.92E-22 90.5483 115.6256
TotalHP 0.1931 0.0186 10.402 4.20E-14 0.1558 0.2304
Studentized
Studentized Deleted
Observation Cruise Predicted Residual Leverage Residual Residual
1 100.0 125.5 -25.5 0.046 -1.267 -1.275
2 200.0 219.0 -19.0 0.093 -0.967 -0.966
3 241.0 228.6 12.4 0.119 0.640 0.636
4 199.0 213.2 -14.2 0.079 -0.717 -0.714
5 174.0 161.0 13.0 0.019 0.636 0.632
6 164.0 172.6 -8.6 0.022 -0.423 -0.420
7 141.0 172.6 -31.6 0.022 -1.552 -1.575
8 161.0 161.0 -0.0 0.019 -0.001 -0.001
9 107.0 124.3 -17.3 0.048 -0.863 -0.860
10 104.0 131.1 -27.1 0.038 -1.341 -1.353
11 122.0 134.0 -12.0 0.035 -0.593 -0.589
12 129.0 137.9 -8.9 0.031 -0.437 -0.433
13 144.0 147.5 -3.5 0.023 -0.172 -0.171
14 194.0 213.2 -19.2 0.079 -0.970 -0.969
15 170.0 184.2 -14.2 0.031 -0.701 -0.697
16 223.0 222.8 0.2 0.103 0.009 0.009
17 234.0 247.9 -13.9 0.185 -0.749 -0.746
18 124.0 137.9 -13.9 0.031 -0.683 -0.679
19 186.0 158.1 27.9 0.019 1.366 1.379
20 190.0 158.1 31.9 0.019 1.563 1.586
21 190.0 199.7 -9.7 0.052 -0.481 -0.478
22 159.0 148.5 10.5 0.023 0.517 0.513
23 160.0 148.5 11.5 0.023 0.566 0.562
24 148.0 163.0 -15.0 0.019 -0.733 -0.730
25 143.0 161.0 -18.0 0.019 -0.884 -0.882
26 160.0 141.7 18.3 0.027 0.900 0.898
27 140.0 127.2 12.8 0.044 0.634 0.630
28 235.0 170.7 64.3 0.021 3.157 3.492
29 191.0 163.0 28.0 0.019 1.375 1.388
30 132.0 127.2 4.8 0.044 0.237 0.235
31 115.0 137.9 -22.9 0.031 -1.127 -1.130
32 170.0 143.6 26.4 0.026 1.296 1.305
33 175.0 150.2 24.8 0.022 1.217 1.223
34 156.0 141.7 14.3 0.027 0.703 0.700
35 188.0 157.2 30.8 0.020 1.512 1.532
36 128.0 134.0 -6.0 0.035 -0.296 -0.293
37 107.0 127.2 -20.2 0.044 -1.004 -1.005
38 148.0 161.0 -13.0 0.019 -0.639 -0.635
39 129.0 137.9 -8.9 0.031 -0.437 -0.433
40 191.0 199.7 -8.7 0.052 -0.432 -0.428
41 147.0 148.5 -1.5 0.023 -0.072 -0.072
42 213.0 170.7 42.3 0.021 2.077 2.151
43 186.0 161.0 25.0 0.019 1.224 1.231
44 148.0 161.0 -13.0 0.019 -0.639 -0.635
45 180.0 188.1 -8.1 0.035 -0.399 -0.395
46 186.0 188.1 -2.1 0.035 -0.102 -0.101
47 100.0 132.1 -32.1 0.037 -1.586 -1.611
48 176.0 161.0 15.0 0.019 0.734 0.731
49 151.0 153.3 -2.3 0.020 -0.113 -0.112
50 98.0 118.7 -20.7 0.058 -1.037 -1.038
51 163.0 151.4 11.6 0.021 0.571 0.567
52 143.0 137.9 5.1 0.031 0.254 0.252
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 50, t.025 =T.INV(.025,50) =±2.009, tcalc = 10.402 > 2.009.
c. The p-value = 4.20 × 10⁻¹⁴. This means that a slope estimate this far from zero would be
highly unlikely if the true slope were equal to zero.
12.49 a. R2 = .684. 68.4% of the variation in cruise speed can be explained by engine
horsepower. This shows a moderately strong relationship.
b. Fcalc = 108.20 and its p-value = 4.20 × 10⁻¹⁴. The F statistic is significant, which means the
linear model provides a significant fit.
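The ANOVA quantities for Data Set D can be rechecked from the sums of squares in the table. The following is a minimal sketch, assuming the usual simple-regression definitions (MSE = SSE/(n − k − 1), F = MSR/MSE, s = √MSE); the variable names are illustrative and are not part of the original output.

```python
import math

# Sums of squares copied from the Data Set D ANOVA table.
ss_regression = 45896.1706
ss_residual = 21209.2717
n = 52  # observations
k = 1   # predictors

df_residual = n - k - 1                  # residual degrees of freedom
ms_regression = ss_regression / k        # mean square for regression
ms_residual = ss_residual / df_residual  # mean squared error (MSE)
f_calc = ms_regression / ms_residual     # F statistic
std_error = math.sqrt(ms_residual)       # standard error of the regression

print(f"d.f. = {df_residual}, MSE = {ms_residual:.4f}, "
      f"F = {f_calc:.2f}, s = {std_error:.3f}")
```

Up to rounding, the results should agree with the F statistic and Std. Error reported in the regression output.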
12.50 Observation 28 is an outlier residual. This corresponds to the ExtraExtra400 model.

Observation 42 is an unusual residual. This is the Piper Malibu Mirage model.
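The studentized residuals for these two observations can be approximated from the table values. This is a sketch that assumes the standard formula e / (s·√(1 − h)) and the common labels "unusual" for |value| > 2 and "outlier" for |value| > 3; the function names are hypothetical, and small differences from the table come from rounded inputs.

```python
import math

def studentized_residual(residual, leverage, std_error):
    """Internally studentized residual: e / (s * sqrt(1 - h))."""
    return residual / (std_error * math.sqrt(1.0 - leverage))

def classify(value):
    """Assumed rule of thumb: beyond 3 is an outlier, beyond 2 is unusual."""
    if abs(value) > 3:
        return "outlier"
    if abs(value) > 2:
        return "unusual"
    return "ordinary"

s = 20.596  # standard error of the regression, Data Set D
for obs, e, h in [(28, 64.3, 0.021), (42, 42.3, 0.021)]:
    value = studentized_residual(e, h, s)
    print(obs, round(value, 3), classify(value))
```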
12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a very
slight right skewed distribution.
b. The residuals do not show significant departure from normality.

12.52
The residual plot does not show obvious signs of heteroscedasticity. We can see a possible
high outlier on the graph.
12.54 Answers will vary. Confidence and prediction intervals for x = 150 and 250 are shown
below.

Cruise
TotalHP Predicted lower upper lower upper
150 132.057 124.072 140.043 89.926 174.189
250 151.371 145.350 157.391 109.567 193.174
12.55 Observations 2, 3, 4, 14, 16, and 17 have high leverage.
DATA SET E
12.41 Answers will vary. This type of information would probably be collected by a researcher
who has obtained different types of processors.
12.42 Answers will vary. Sample size of 14 is fairly small and can open one up to Type II error.
12.43 A positive slope would be logical. Microprocessor speed should be related to Power
dissipation.
12.44
There is a strong positive relationship between microprocessor speed and power
dissipation.

12.46 An increase of one unit of microprocessor speed increases power dissipation by 0.032
watts. No, the intercept does not have meaning because a speed of zero is not logical.
Regression Analysis
r² 0.925 n 14
r 0.962 k 1
Std. Error 13.109 Dep. Var. Power
ANOVA table
Regression 25,561.9157 1 25,561.9157 148.75 4.03E-08
Residual 2,062.0843 12 171.8404
Total 27,624.0000 13

Intercept 15.7299 5.6634 2.777 .0167 3.3905 28.0693
Speed 0.0319 0.0026 12.196 4.03E-08 0.0262 0.0375
Studentized
Studentized Deleted
Observation Power Predicted Residual Leverage Residual Residual
1 3.0 16.4 -13.4 0.184 -1.129 -1.143
2 10.0 18.9 -8.9 0.174 -0.748 -0.734
3 35.0 23.2 11.8 0.157 0.985 0.983
4 20.0 25.3 -5.3 0.150 -0.437 -0.422
5 42.0 34.8 7.2 0.120 0.582 0.565
6 50.0 34.8 15.2 0.120 1.233 1.263
7 51.0 57.1 -6.1 0.078 -0.488 -0.472
8 73.0 82.6 -9.6 0.078 -0.764 -0.750
9 115.0 136.8 -21.8 0.246 -1.912 -2.196
10 130.0 117.7 12.3 0.160 1.027 1.030
11 95.0 89.0 6.0 0.086 0.479 0.463
12 136.0 117.7 18.3 0.160 1.527 1.629
13 95.0 108.1 -13.1 0.128 -1.071 -1.078
14 125.0 117.7 7.3 0.160 0.611 0.594
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 12, t.025 =T.INV(.025,12) = ±2.179, tcalc = 12.196 > 2.179.
c. The p-value = 4.03E-08 (.0000 when rounded to four decimals). This means that a slope
estimate this far from zero would be highly unlikely if the true slope were equal to zero.
12.49 a. R2 = .925. 92.5% of the variation in power dissipation can be explained by
microprocessor speed. This shows a strong relationship.
12.50 There are no unusual or outlier standardized residuals.

12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a left
skewed distribution.
b. The sample size is small so it is difficult to determine normality but there is no obvious
evidence to assume non-normality.
12.52
The residual plot does not show obvious signs of heteroscedasticity.

12.54 Answers will vary. Confidence and prediction intervals for x = 1,000 and 2,500 are shown
below.
Power
Speed Predicted lower upper lower upper
1,000 47.583 38.962 56.203 17.748 77.417
2,500 95.362 86.485 104.238 65.452 125.271
12.55 There are no high leverage observations.

DATA SET F
12.41 Answers will vary. This could probably be collected through surveys.
12.42 Answers will vary. Sample size of 10 is fairly small and can open one up to Type II error.
But this information could be difficult to obtain.
12.43 A positive slope would be logical. Increased website hits should be associated with
increased revenue.
12.44
There is a weak positive relationship between website hits and restaurant revenue.

12.46 An increase of one website visit increases weekly revenue by $1.67. The intercept could be
interpreted as the weekly revenue with no website hits. However, no restaurant in this
sample had an x near zero, so it would be dangerous to extrapolate.
Regression Analysis
r² 0.128 n 10
r 0.357 k 1
Std. Error 1078.961 Dep. Var. Restaurant Revenue
ANOVA table
Regression 1,361,311.7375 1 1,361,311.7375 1.17 .3111
Residual 9,313,262.2625 8 1,164,157.7828
Total 10,674,574.0000 9

Intercept 10,382.3728 2,353.0592 4.412 .0022 4,956.2085 15,808.5371
Website Hits 1.6695 1.5439 1.081 .3111 -1.8907 5.2297
Studentized
Studentized Deleted
Observation Restaurant Revenue Predicted Residual Leverage Residual Residual
1 12,113.0 12,407.5 -294.5 0.278 -0.321 -0.302
2 11,409.0 12,869.9 -1,460.9 0.101 -1.428 -1.547
3 14,579.0 12,661.3 1,917.7 0.142 1.919 2.443
4 11,605.0 12,811.5 -1,206.5 0.106 -1.182 -1.218
5 12,308.0 12,501.0 -193.0 0.217 -0.202 -0.190
6 12,320.0 13,107.0 -787.0 0.131 -0.783 -0.762
7 13,225.0 12,591.1 633.9 0.170 0.645 0.620
8 13,652.0 13,496.0 156.0 0.361 0.181 0.170
9 13,893.0 13,036.9 856.1 0.114 0.843 0.826
10 13,896.0 13,517.7 378.3 0.380 0.445 0.422
12.48 a. 95% confidence interval for β1: (−1.8907, 5.2297). This interval contains zero, which
means that we cannot conclude the slope differs from zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 8, t.025 =T.INV(.025,8) = ±2.306, tcalc = 1.081 falls
between the critical values. Therefore fail to reject H0 and conclude that the slope is not
significantly different from zero.
c. The p-value = .3111. This means that a slope estimate at least this far from zero would
occur in about 31% of samples even if the true slope were equal to zero.
d. No, this sample does not support our a priori hypothesis about the slope.
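The confidence interval in part (a) can be reproduced from the coefficient table as b1 ± t·se(b1). The sketch below uses the slope, standard error, and critical value quoted above; the function name is illustrative only.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) for b1 +/- t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

# Data Set F: slope, its standard error, and t.025 with d.f. = 8.
lower, upper = slope_confidence_interval(b1=1.6695, se_b1=1.5439, t_crit=2.306)
contains_zero = lower <= 0.0 <= upper

print(f"({lower:.4f}, {upper:.4f}), contains zero: {contains_zero}")
```

Because the interval contains zero, it leads to the same conclusion as the t test in part (b).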
12.49 a. R2 = .128. Because the slope is not significant R2 does not have meaning.
b. Fcalc = 1.17 and its p-value = .3111. The F statistic is not significant which means the
linear model does not provide significant fit.
c. This model is not fit for practical use.
12.50 Restaurant 3 shows an unusual residual. It appears the model is underestimating the
revenue.
12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a fairly
uniform distribution.
b. The sample size is small so it is difficult to determine normality from the histogram.
Based on the normplot there is no obvious evidence to assume non-normality.
12.52

12.54 Answers will vary. Confidence and prediction intervals for x = 1,200 and 1,500 are shown
below. Keep in mind that these intervals are unreliable because there is no significant
relationship between website hits and revenue.
Predicted values for: Restaurant Revenue
Website Hits Predicted lower upper lower upper
1,200 12,385.790 11,036.168 13,735.411 9,555.230 15,216.349
1,500 12,886.644 12,099.326 13,673.962 10,276.958 15,496.330
12.55 There are no high leverage observations.

DATA SET G
12.41 Answers will vary. This information can be gathered from manufacturers’ specification
information, which is listed on their websites. The manufacturers use sophisticated
sampling techniques for estimating these values.
12.42 Answers will vary. Sample size of 43 is reasonable.

12.43 A negative slope would be logical. It would make sense that the heavier a car is the lower
the MPG.
12.44
There is a fairly strong negative relationship between the Weight and CityMPG.

12.46 An increase in the weight of a car by one pound reduces its city mpg by 0.0046 mpg. No,
the intercept does not make sense because a car cannot weigh zero pounds.
Regression Analysis
r² 0.681 n 43
r -0.825 k 1
Std. Error 2.499 Dep. Var. City MPG
ANOVA table
Regression 546.43787944 1 546.43787944 87.51 1.00E-11
Residual 256.02723683 41 6.24456675
Total 802.46511628 42

Intercept 36.6337 1.8793 19.493 2.39E-22 32.8383 40.4291
Weight -0.0046 0.00048708 -9.354 1.00E-11 -0.0055 -0.0036
Studentized
Studentized Deleted
Observation City MPG Predicted Residual Leverage Residual Residual
1 20.0 20.9 -0.9 0.027 -0.371 -0.367
2 23.0 21.5 1.5 0.031 0.607 0.602
3 19.0 21.2 -2.2 0.029 -0.888 -0.886
4 20.0 21.4 -1.4 0.030 -0.557 -0.552
5 18.0 17.4 0.6 0.031 0.260 0.257
6 18.0 18.2 -0.2 0.026 -0.073 -0.072
7 19.0 21.8 -2.8 0.034 -1.141 -1.145
8 14.0 14.1 -0.1 0.074 -0.062 -0.061
9 15.0 15.4 -0.4 0.053 -0.165 -0.163
10 17.0 15.4 1.6 0.053 0.657 0.653
11 18.0 17.5 0.5 0.030 0.223 0.220
12 13.0 12.5 0.5 0.111 0.219 0.216
13 13.0 9.8 3.2 0.194 1.448 1.469
14 26.0 24.1 1.9 0.063 0.803 0.799
15 15.0 15.4 -0.4 0.053 -0.165 -0.163
16 21.0 21.2 -0.2 0.029 -0.076 -0.075
17 18.0 17.0 1.0 0.034 0.418 0.414
18 24.0 23.5 0.5 0.054 0.201 0.199
19 16.0 17.1 -1.1 0.033 -0.433 -0.429
20 15.0 14.0 1.0 0.077 0.412 0.408
21 18.0 19.3 -1.3 0.023 -0.525 -0.520
22 25.0 26.2 -1.2 0.107 -0.498 -0.494
23 17.0 20.0 -3.0 0.024 -1.235 -1.243
24 18.0 21.2 -3.2 0.029 -1.295 -1.306
25 13.0 13.9 -0.9 0.080 -0.355 -0.352
26 18.0 18.7 -0.7 0.024 -0.304 -0.300
27 19.0 21.3 -2.3 0.030 -0.954 -0.953
28 17.0 17.5 -0.5 0.030 -0.211 -0.209
29 20.0 21.4 -1.4 0.031 -0.575 -0.571
30 20.0 21.7 -1.7 0.032 -0.678 -0.673
31 20.0 21.4 -1.4 0.030 -0.566 -0.561
32 15.0 17.2 -2.2 0.032 -0.886 -0.884
33 16.0 17.0 -1.0 0.034 -0.396 -0.392
34 25.0 22.5 2.5 0.041 1.009 1.009
35 28.0 23.9 4.1 0.059 1.711 1.754
36 24.0 23.6 0.4 0.056 0.154 0.152
37 21.0 20.3 0.7 0.025 0.266 0.263
38 17.0 20.3 -3.3 0.025 -1.328 -1.340
39 23.0 24.9 -1.9 0.079 -0.802 -0.799
40 26.0 23.0 3.0 0.047 1.216 1.224
41 19.0 17.9 1.1 0.028 0.462 0.458
42 34.0 22.8 11.2 0.044 4.600 6.531
43 20.0 19.8 0.2 0.024 0.073 0.072
12.48 a. 95% confidence interval for β1: (-0.0055, -0.0036). This interval does not contain zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 41, t.025 =T.INV(.025,41) =±2.020, tcalc = -9.354<-2.020.
12.49 a. R2 = .681. 68.1% of the variation in the City MPG of a vehicle can be explained by its
weight. This shows a fairly strong relationship.
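The R2 and r values in the Data Set G output can be recovered from the ANOVA sums of squares. This is a small sketch, assuming R2 = SSR/SST and that r takes the sign of the fitted slope; the variable names are illustrative.

```python
import math

# Sums of squares and slope copied from the Data Set G regression output.
ss_regression = 546.43787944
ss_total = 802.46511628
slope = -0.0046

r_squared = ss_regression / ss_total
# In simple regression r has the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"R^2 = {r_squared:.3f}, r = {r:.3f}")
```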
12.50 Observation 42 is an outlier residual. This happens to be the Volkswagen Jetta. The model
underestimated the true MPG by quite a margin. Perhaps this was the diesel version of
that model.
12.51 a. The normplot of residuals is not perfect. The high outlier can be seen and the line sags in
the middle. The histogram of residuals shows a somewhat bell-shaped distribution with
one high outlier.
b. The residuals do not show significant departure from normality in spite of the one high
outlier.
12.52
The residual plot does not show obvious signs of heteroscedasticity. It is possible to see the
Jetta outlier residual.
12.53 An autocorrelation test is not appropriate because this is cross sectional data.
12.54 Answers will vary. Confidence and prediction intervals for x = 3,000 and 4,000 are shown
below.
Predicted values for: City MPG
Weight Predicted lower upper lower upper
3,000 22.965 21.879 24.050 17.803 28.127
4,000 18.408 17.608 19.208 13.299 23.518
12.55 Observations 12, 13, and 22 have high leverage. These correspond to the Dodge Ram 1500,
the Ford Expedition, and the Kia Rio.
DATA SET H
12.41 Answers will vary. This information can be gathered from food labels or manufacturers’
websites.
12.42 Answers will vary. Sample size is reasonable although a larger sample would be better.
12.43 A positive slope would be logical. It would make sense that the more fat calories the sauce
has the more calories are in the sauce in general.
12.44
There is a fairly strong positive relationship between FatCalories/Gram and
Calories/Gram.

12.46 An increase of 1 in fat calories per gram increases total calories per gram by 2.2179.
The intercept might make sense for sauce labeled “fat free.”
Regression Analysis
r² 0.843 n 20
r 0.918 k 1
Std. Error 0.102 Dep. Var. Calories Per Gram
ANOVA table
Regression 1.0076 1 1.0076 96.49 1.17E-08
Residual 0.1880 18 0.0104
Total 1.1955 19

Intercept 0.3054 0.0460 6.635 3.16E-06 0.2087 0.4021
Fat Calories Per Gram 2.2179 0.2258 9.823 1.17E-08 1.7436 2.6923
Studentized
Studentized Deleted
Observation Calories Per Gram Predicted Residual Leverage Residual Residual
1 0.640 0.749 -0.109 0.053 -1.096 -1.103
2 0.560 0.572 -0.012 0.066 -0.117 -0.114
3 0.400 0.483 -0.083 0.096 -0.853 -0.846
4 0.480 0.394 0.086 0.142 0.907 0.902
5 0.640 0.572 0.068 0.066 0.693 0.682
6 0.400 0.305 0.095 0.203 1.037 1.039
7 0.560 0.483 0.077 0.096 0.794 0.785
8 0.550 0.483 0.067 0.096 0.691 0.681
9 0.480 0.572 -0.092 0.066 -0.927 -0.923
10 0.480 0.572 -0.092 0.066 -0.927 -0.923
11 1.250 1.148 0.102 0.251 1.151 1.162
12 1.000 1.037 -0.037 0.164 -0.400 -0.390
13 1.000 0.949 0.051 0.112 0.534 0.523
14 1.170 1.037 0.133 0.164 1.420 1.465
15 0.920 0.860 0.060 0.076 0.612 0.601
16 0.670 0.860 -0.190 0.076 -1.933 -2.111
17 0.860 0.749 0.111 0.053 1.116 1.124
18 0.700 0.727 -0.027 0.051 -0.270 -0.262
19 0.560 0.749 -0.189 0.053 -1.900 -2.066
20 0.640 0.660 -0.020 0.051 -0.204 -0.198
12.49 a. R2 = .843. 84.3% of the variation in the total calories/gram of a pasta sauce can be
explained by the number of fat calories/gram. This shows a fairly strong relationship.
12.50 Observation 16 is an unusual residual. This is the Ragu Old World Style with meat sauce.
12.51 a. The normal probability plot of residuals does not show a clear normal pattern.
b. The residuals show some departure from normality but this is a small sample size and a
larger sample might help.
12.52

12.53 An autocorrelation test is not appropriate because this is cross-sectional data.

12.54 Answers will vary. Confidence and prediction intervals for x = 0.10 and 0.20 are shown
below.
Predicted values for: Calories Per Gram

95% Confidence Intervals 95% Prediction Intervals
Fat Calories Per Gram Predicted lower upper lower upper
0.10 0.52722 0.46690 0.58754 0.30422 0.75021
0.20 0.74901 0.69978 0.79824 0.52876 0.96927
12.55 Observations 6 and 11 show high leverage. These correspond to Healthy Choice Traditional
and Prego Hearty Meat Pepperoni.
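The high-leverage observations can be flagged directly from the Leverage column. The sketch below assumes the common rule of thumb for simple regression, h > 4/n; the cutoff is an assumption here, since the answer above does not state which rule was used.

```python
# Leverage values for observations 1..20, copied from the Data Set H table.
leverages = [
    0.053, 0.066, 0.096, 0.142, 0.066, 0.203, 0.096, 0.096, 0.066, 0.066,
    0.251, 0.164, 0.112, 0.164, 0.076, 0.076, 0.053, 0.051, 0.053, 0.051,
]

n = len(leverages)
cutoff = 4 / n  # assumed rule of thumb: 2(k + 1)/n with k = 1 predictor

# Observation numbers start at 1, matching the table.
high_leverage = [obs for obs, h in enumerate(leverages, start=1) if h > cutoff]

print(f"cutoff = {cutoff:.3f}, high leverage: {high_leverage}")
```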
DATA SET I
12.40 Time-series.
12.41 Answers will vary. This information can be gathered by taking a random household and
observing their energy usage.
12.42 Answers will vary. Two years of data is good but when looking for seasonal influences
more years would be better.
12.43 A negative slope would be logical. It would make sense that the lower the temperature the
more energy a household would use.
12.44
There is a fairly strong negative relationship between the Average Daily Temperature and
Energy Consumption.

12.46 An increase of 1° in average temperature decreases the monthly energy use by 9.661 kWh.
Yes, the intercept does make sense. There can be a month with an average temperature
of 0°.

Regression Analysis
r² 0.766 n 24
r -0.875 k 1
Std. Error 84.951 Dep. Var. Electric Consumption (KWH)
ANOVA table
Regression 520,420.3570 1 520,420.3570 72.11 2.16E-08
Residual 158,766.1430 22 7,216.6429
Total 679,186.5000 23

Intercept 1,165.7901 62.9671 18.514 6.64E-15 1,035.2043 1,296.3760
Avg Daily Temp (deg F) -9.6609 1.1376 -8.492 2.16E-08 -12.0202 -7.3016
Studentized
Studentized Deleted
Observation Energy Use (KWH) Predicted Residual Leverage Residual Residual
1 436.0 566.8 -130.8 0.056 -1.585 -1.645
2 464.0 479.9 -15.9 0.098 -0.197 -0.192
3 446.0 431.6 14.4 0.135 0.183 0.179
4 391.0 499.2 -108.2 0.086 -1.332 -1.358
5 444.0 557.2 -113.2 0.059 -1.373 -1.403
6 608.0 663.4 -55.4 0.042 -0.667 -0.658
7 885.0 808.3 76.7 0.089 0.945 0.943
8 821.0 779.4 41.6 0.073 0.509 0.500
9 830.0 789.0 41.0 0.078 0.502 0.494
10 750.0 827.7 -77.7 0.101 -0.964 -0.963
11 617.0 731.0 -114.0 0.054 -1.380 -1.411
12 598.0 644.1 -46.1 0.042 -0.554 -0.545
13 597.0 499.2 97.8 0.086 1.205 1.218
14 528.0 470.2 57.8 0.105 0.719 0.711
15 477.0 402.6 74.4 0.161 0.956 0.954
16 562.0 489.5 72.5 0.092 0.895 0.891
17 658.0 586.1 71.9 0.050 0.868 0.863
18 690.0 702.1 -12.1 0.047 -0.145 -0.142
19 862.0 798.7 63.3 0.083 0.778 0.771
20 1,008.0 837.3 170.7 0.108 2.127 2.332
21 840.0 924.3 -84.3 0.184 -1.098 -1.104
22 867.0 798.7 68.3 0.083 0.840 0.834
23 606.0 702.1 -96.1 0.047 -1.158 -1.168
24 657.0 653.8 3.2 0.042 0.039 0.038
12.48 a. 95% confidence interval for β1: (-12.0202, -7.3016). This interval does not contain zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 22, t.025 =T.INV(.025,22) = ±2.074, tcalc = -8.492
<-2.074. Therefore reject H0 and conclude that the slope is significantly different from
zero.
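The test statistic in part (b) is the slope divided by its standard error. A minimal check using the Data Set I coefficient table, with the critical value taken from the answer above:

```python
# Slope, standard error, and two-tailed critical value (d.f. = 22) for Data Set I.
b1 = -9.6609
se_b1 = 1.1376
t_crit = 2.074

t_calc = b1 / se_b1
reject_h0 = abs(t_calc) > t_crit  # two-tailed decision rule

print(f"tcalc = {t_calc:.3f}, reject H0: {reject_h0}")
```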
12.49 a. R2 = .766. 76.6% of the variation in the energy usage can be explained by average daily
temperature. This shows a fairly strong relationship.
12.50 Observation 20 is an unusual residual. The model underestimates the energy usage for that
month.
12.51 a. The normplot of residuals does not show a clear normal pattern. The histogram also
shows non-normality.
b. The residuals show some departure from normality but this is a small sample size and a
larger sample might help.
12.52

12.53 An autocorrelation test is appropriate because this is time series data. The residual plot
below does not show obvious increasing or decreasing trends nor does it show signs of
negative autocorrelation. The DW statistic = 1.53 which indicates slight positive
autocorrelation. This is not unexpected because temperature for one month is related to
temperature for the previous month. The level of autocorrelation does not invalidate the
regression results.
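The Durbin-Watson statistic can be approximated from the residual column of the Data Set I table. This sketch assumes the usual definition, DW = Σ(e_t − e_{t−1})² / Σe_t², and uses the residuals as rounded in the table, so the result matches the reported value only to about two decimals.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum((curr - prev) ** 2
                    for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Residuals for observations 1..24 in time order, copied from the table.
residuals = [
    -130.8, -15.9, 14.4, -108.2, -113.2, -55.4, 76.7, 41.6,
    41.0, -77.7, -114.0, -46.1, 97.8, 57.8, 74.4, 72.5,
    71.9, -12.1, 63.3, 170.7, -84.3, 68.3, -96.1, 3.2,
]

dw = durbin_watson(residuals)
print(f"DW = {dw:.2f}")
```

Values near 2 suggest no autocorrelation, values below 2 suggest positive autocorrelation, and values above 2 suggest negative autocorrelation.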
12.54 Answers will vary. Confidence and prediction intervals for x = 50 and 70 are shown below.
Predicted values for: Electric Consumption (KWH)
Avg Daily Temp (deg F) Predicted lower upper lower upper
50 682.745 645.995 719.495 502.776 862.715
70 489.527 436.022 543.033 305.405 673.650
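The predicted values in this table follow from the fitted equation, and the interval half-widths show how much wider a prediction interval is than a confidence interval. The sketch below uses the coefficients from the Data Set I output; the helper names are illustrative.

```python
def predict(x, intercept=1165.7901, slope=-9.6609):
    """Fitted energy use (KWH) for average daily temperature x (deg F)."""
    return intercept + slope * x

def half_width(lower, upper):
    """Half the width of an interval, i.e. its margin of error."""
    return (upper - lower) / 2

for temp in (50, 70):
    print(temp, round(predict(temp), 3))

# Interval limits for x = 50 copied from the table above.
ci_margin = half_width(645.995, 719.495)
pi_margin = half_width(502.776, 862.715)
print(f"CI margin = {ci_margin:.2f}, PI margin = {pi_margin:.2f}")
```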
12.55 Observation 21 shows high leverage.

DATA SET J
12.40 Time-series.
12.41 Answers will vary. This information is collected by The Bureau of Labor Statistics.
12.42 Answers will vary. 47 years of data is a good sample. Methods of measurement can vary
over time. It is important to consider how the measures were calculated and compare
years in which the calculations were similar.
12.43 A positive slope would be logical. It would make sense that the change in Commodities
CPI would move in the same direction as change in Services CPI.
12.44
There is a fairly strong positive relationship between the Commodities% and Services%.

12.46 An increase in the change in Commodities CPI of 1% increases the change in Services CPI
by .830%. Yes, the intercept has meaning because it is possible that there is no change in
the Commodities CPI between two years.
Regression Analysis
r² 0.727 n 47
r 0.853 k 1
Std. Error 1.574 Dep. Var. Services%
ANOVA table
Regression 297.2715 1 297.2715 120.03 2.76E-14
Residual 111.4532 45 2.4767
Total 408.7247 46

Intercept 2.2068 0.3506 6.293 1.14E-07 1.5005 2.9130
Commodities% 0.8302 0.0758 10.956 2.76E-14 0.6776 0.9828
Studentized
Studentized Deleted
Observation Services% Predicted Residual Leverage Residual Residual
1 3.40 2.95 0.45 0.037 0.289 0.286
2 1.70 2.70 -1.00 0.041 -0.652 -0.648
3 2.00 2.95 -0.95 0.037 -0.618 -0.613
4 2.00 2.95 -0.95 0.037 -0.618 -0.613
5 2.00 3.20 -1.20 0.034 -0.778 -0.774
6 2.30 3.12 -0.82 0.035 -0.530 -0.526
7 3.80 4.37 -0.57 0.023 -0.363 -0.360
8 4.30 3.78 0.52 0.027 0.332 0.329
9 5.20 5.11 0.09 0.021 0.056 0.056
10 6.90 6.11 0.79 0.025 0.509 0.505
11 8.00 5.94 2.06 0.024 1.323 1.334
12 5.70 5.20 0.50 0.021 0.324 0.321
13 3.80 4.70 -0.90 0.022 -0.577 -0.572
14 4.40 8.35 -3.95 0.057 -2.584 -2.769
15 9.20 12.09 -2.89 0.185 -2.031 -2.107
16 9.60 9.51 0.09 0.086 0.058 0.058
17 8.30 5.78 2.52 0.023 1.622 1.653
18 7.70 7.02 0.68 0.034 0.438 0.434
19 8.60 8.18 0.42 0.053 0.272 0.269
20 11.00 11.59 -0.59 0.162 -0.408 -0.404
21 15.40 12.42 2.98 0.201 2.120 2.209
22 13.10 9.18 3.92 0.077 2.592 2.779
23 9.00 5.61 3.39 0.022 2.178 2.277
24 3.50 4.61 -1.11 0.022 -0.716 -0.712
25 5.20 5.03 0.17 0.021 0.110 0.108
26 5.10 3.95 1.15 0.026 0.740 0.736
27 5.00 1.46 3.54 0.066 2.328 2.454
28 4.20 4.86 -0.66 0.021 -0.426 -0.422
29 4.60 5.11 -0.51 0.021 -0.329 -0.326
30 4.90 6.11 -1.21 0.025 -0.778 -0.774
31 5.50 6.52 -1.02 0.028 -0.660 -0.656
32 5.10 4.78 0.32 0.022 0.205 0.203
33 3.90 3.87 0.03 0.026 0.021 0.021
34 3.90 3.78 0.12 0.027 0.075 0.074
35 3.30 3.62 -0.32 0.029 -0.205 -0.203
36 3.40 3.78 -0.38 0.027 -0.247 -0.245
37 3.20 4.37 -1.17 0.023 -0.749 -0.745
38 3.00 3.37 -0.37 0.031 -0.238 -0.236
39 2.70 2.29 0.41 0.048 0.267 0.264
40 2.50 3.70 -1.20 0.028 -0.774 -0.771
41 3.40 4.95 -1.55 0.021 -0.993 -0.993
42 4.10 3.04 1.06 0.036 0.688 0.684
43 3.10 1.63 1.47 0.062 0.967 0.967
44 3.20 3.04 0.16 0.036 0.106 0.104
45 2.90 4.12 -1.22 0.025 -0.782 -0.779
46 3.30 5.20 -1.90 0.021 -1.217 -1.224
47 3.80 4.20 -0.40 0.024 -0.257 -0.254
12.49 a. R2 = .727. 72.7% of the variation in the Services CPI change can be explained by
Commodities CPI change. This shows a fairly strong relationship.
12.50 Observations 14, 15, 21-23 and 27 are unusual residuals.

12.51 a. The histogram shows a slight bell-shaped curve although there is more concentration in
the middle of the graph than you would see in a true normal distribution. The normplot
shows a similar pattern.
b. The histogram of residuals does not show obvious departure from normality.
12.52
The residual plot does not show significant signs of heteroscedasticity although there is a
slight fan out pattern as X increases.
12.53 An autocorrelation test is appropriate because this is time series data. The residual plot
below does not show obvious increasing or decreasing trends nor does it show signs of
negative autocorrelation. The DW statistic = 1.08 which indicates slight positive
autocorrelation. This is not unexpected because CPI indexes are economic data which
one would expect to be correlated from year to year. The level of autocorrelation does
not invalidate the regression results.
12.54 Answers will vary. Confidence and prediction intervals for x = 1.5 and 2.5 are shown
below.
Predicted values for: Services%
95% Confidence
Intervals 95% Prediction Intervals
Commodities% Predicted lower upper lower upper
1.5 3.4520 2.8982 4.0059 0.2343 6.6698
2.5 4.2822 3.7954 4.7690 1.0753 7.4891
12.55 Observations 15, 16, 20 and 21 show high leverage.

12.56 No, r measures the strength and direction of the linear relationship, but not the amount of
variation explained by the explanatory variable.
12.57 H0: ρ = 0 versus H1: ρ ≠ 0. α = .025 so tcrit = t.0125 =T.INV(.0125,53) =±2.3069.
$t_{calc} = r\sqrt{\dfrac{n-2}{1-r^2}} = .3043\sqrt{\dfrac{55-2}{1-(.3043)^2}} = 2.3256$ > 2.3069, so we reject the null hypothesis. The correlation is
not equal to zero.
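The test statistic above can be recomputed from r and n alone. The following sketch applies the formula from 12.1 to this exercise; the function name is illustrative, and the critical value is the one quoted in the answer.

```python
import math

def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with d.f. = n - 2."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))

t_calc = correlation_t_stat(r=0.3043, n=55)
t_crit = 2.3069  # two-tailed critical value quoted in 12.57
reject_h0 = abs(t_calc) > t_crit

print(f"tcalc = {t_calc:.4f}, reject H0: {reject_h0}")
```

The same function applies to any exercise that reports r and n, for example `correlation_t_stat(0.749, 15)` for 12.65.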
12.58 The correlation coefficient, r, is only .13, indicating that there exists a very weak positive
correlation between prices on successive days. The fact that it is a highly significant
result stems from a large sample size which increases power of the test. This means that
very small correlations will show statistical significance even though the correlation is
not truly important.
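The effect of sample size on significance can be illustrated numerically. In this sketch r is the value from 12.58, but the sample sizes are hypothetical, since the exercise's n is not given here.

```python
import math

r = 0.13  # correlation reported in 12.58

def t_stat(r, n):
    """t statistic for H0: rho = 0 with n paired observations."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The share of variation explained is the same at every sample size.
print(f"R^2 = {r ** 2:.4f}")

for n in (30, 300, 3000):  # hypothetical sample sizes
    print(n, round(t_stat(r, n), 2))
```

The t statistic grows with n while r² stays below 2%, which is why a weak correlation can be statistically significant without being practically important.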
12.59 a. ŷ = 55.2 +.73(2000) = 1515.2 total free throws expected.

b. No, the intercept is not meaningful. You can’t make free throws without attempting
them.
c. Quick rule: 1515.2 ± 2.052(53.2) = 1515.2 ± 109.17 or (1406.03, 1624.37)
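The arithmetic in parts (a) and (c) can be checked as follows. This is a sketch of the quick rule ŷ ± t·se using only the numbers quoted above; the variable names are illustrative.

```python
# Fitted equation and quick-rule inputs from 12.59.
intercept, slope = 55.2, 0.73
attempts = 2000
t_value = 2.052
std_error = 53.2

y_hat = intercept + slope * attempts  # predicted free throws made
margin = t_value * std_error          # quick-rule margin of error
lower, upper = y_hat - margin, y_hat + margin

print(f"y_hat = {y_hat:.1f}, interval = ({lower:.2f}, {upper:.2f})")
```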
12.60 a. ŷ = 30.7963+ .0343x (R2 = .202, se = 6.816)

b. d.f. = 33, α = .05 so tcrit = t.025 =T.INV(.025,33) =±2.035
c. tcalc = 2.889 > 2.035 so we will reject the null hypothesis that the slope is zero.
d. We are 95% confident that the slope is contained in the interval (.0101, .0584). This CI
does not contain zero, hence, there is a relationship between the weekly pay and the
income tax withheld.
e. Fcalc = (2.889)² = 8.3463
f. The value of R2 assigns only 20% of the variation in income withholding to the weekly
pay. While the F statistic is significant, the fit is only a modest fit.
12.61 a. ŷ = 1743.57 − 1.2163x (R2 = .370, s = 286.793)

b. d.f. = 13, α = .05 so tcrit = t.025 =T.INV(.025,13) =±2.160
c. tcalc = −2.764 < −2.160 so we will reject the null hypothesis that the slope is zero.
d. We are 95% confident that the slope is contained in the interval (−2.1617, −0.2656).
This CI does not contain zero, hence, there is a relationship between the monthly
maintenance spending and monthly machine downtime.
e. Fcalc = (−2.764)² = 7.639696
f. The value of R2 assigns only 37% of the variation in monthly machine downtime to the
monthly maintenance spending (dollars). Thus, throwing more “money” at the problem
of downtime will not completely resolve the issue. This indicates that there are most likely
other reasons why machines have the amount of downtime incurred.
12.62 a. ŷ = 6.5763 +0.0452x (R2 = .519, s = 6.977)

b. d.f. = 62, α = .05 so tcrit = t.025 =T.INV(.025,62) = ±1.999
c. tcalc = 8.183 > 1.999 so we will reject the null hypothesis that the slope is zero.
d. We are 95% confident that the slope is contained in the interval (0.0342, 0.0563). This
CI does not contain zero, hence, there is a relationship between the total assets
(billions) and total revenue (billions).
e. Fcalc = (8.183)² = 66.96
f. The value of R2 assigns 51.9% of the variation in total revenue (billions) to the total
assets (billions). Thus, higher total assets are associated with higher revenue. However,
the results also indicate that there are most likely other reasons why companies earn the
revenue they do.
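In simple regression the F statistic is the square of the slope's t statistic, which is how part (e) is obtained in 12.60 through 12.62. A short check using the tcalc values quoted in those answers:

```python
# t statistics for the slope, keyed by exercise number.
t_values = {"12.60": 2.889, "12.61": -2.764, "12.62": 8.183}

# With one predictor, F = t^2, so the sign of t does not matter.
f_values = {exercise: t ** 2 for exercise, t in t_values.items()}

for exercise, f_calc in f_values.items():
    print(exercise, round(f_calc, 4))
```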
12.63 a. r = −.387 (from Excel)

b. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 41, α = .01 so tcrit = t.005 =T.INV(.005,41)
=±2.701. $t_{calc} = -.387\sqrt{\dfrac{43-2}{1-(-.387)^2}} = -2.687$. Because −2.687 is between the critical
values we do not reject the hypothesis of zero correlation. The sample does not provide
evidence at the .01 significance level that the stock prices move together.
c. The scatterplot shows a slight negative correlation between IBM and HPQ stock prices.
(Note that if α = .05, t.025 = ±2.0195, and we would reject the hypothesis of zero
correlation and conclude that there is correlation between the stock prices.)
12.64 a.
b. r = −.297. This shows a weak negative linear relationship between loyalty card use and
sales growth.
c. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 72 and α = .05, tcrit = t.025
=T.INV(.025,72) =±1.9935. $t_{calc} = -.297\sqrt{\dfrac{74-2}{1-(-.297)^2}} = -2.639$. Because −2.639
< −1.9935, we reject the hypothesis of no correlation and the sample evidence supports
the notion of negative correlation.
d. It appears that a higher loyalty card usage is associated with lower sales growth.
12.65 a. The scatter plot indicates that there is a positive correlation between the fertility rates in
1990 and 2000.
b. r = .749. There is a strong positive linear relationship between the fertility rates in 1990
and 2000.
c. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 13 and α = .05, t.025 =T.INV(.025,13) = ±2.160.
$t_{calc} = .749\sqrt{\dfrac{15-2}{1-(.749)^2}} = 4.076$. Because 4.076 > 2.160, we reject the hypothesis of no
correlation and the sample evidence supports the notion of positive correlation.
12.66 a. The scatter plot shows almost no pattern.
b. r = −.105. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 25 and α = .05, tcrit = t.025
=T.INV(.025,25) =±2.060. $t_{calc} = -.105\sqrt{\dfrac{27-2}{1-(-.105)^2}} = -0.528$. Because −.528 falls
between the critical values we fail to reject the hypothesis of no correlation.
c. It appears there is very little relationship between price and accuracy rating of speakers.
12.67 For each of these, the scatter plot will contain the answers to (a), (b), and (d) with respect to
the fitted equation.
c. Salary: The fit is good. Assessed: The fit is excellent. HomePrice2: The fit is good.
d. Salary: An increase in the age by 1 year increases salary by $1447.4.
Assessed: An increase in 1 sq. ft. of floor space increases assessed value by $313.30.
HomePrice2: An increase in 1 sq. ft. of home size increases the selling price by
$209.20.
e. The intercept is not meaningful for any of these data sets because a zero value of X
(age, floor space, or home size) cannot realistically correspond to a positive Y value.

12.68 a. $t = \dfrac{\text{estimated slope}}{\text{standard error}}$
See table below for calculations.
b. Answers shown in right column in table below.
Dependent Variable Estimated Slope Differ from 0?
Highest grade achieved -0.027 Yes
Reading grade equivalent -0.07 Yes
Class standing -0.006 No
Absence from school 4.8 Yes
Grammatical reasoning 0.159 No
Vocabulary -0.124 Yes
Hand-eye coordination 0.041 No
Reaction time 11.8 No
Minor antisocial behavior -0.639 No
c. It would be inappropriate to assume cause and effect without a better understanding of
how the study was conducted.
12.69 a.
c. The fit of this regression is weak as given by R2 = 0.2474. About 24.7% of the variation in %
Operating Margin is explained by % Equity Financing.
12.70 a.
c. The fit of this regression is very good as given by R2 = 0.8216. The regression line
does show a strong positive linear relationship between molecular weight and retention
time, indicating that the greater the molecular weight the greater is the retention time.
12.71 a. Based on both R2 = 0 and the p-value > .10, there is no evidence of a linear
relationship between class size and teacher ratings.
b. Given that R2 = 0, we have not “explained” teacher ratings in this bivariate model. Other
factors might be students’ expected GPA, whether the course is a core class or not, the
age of the student, gender of the student, gender of the instructor, etc. Answers will vary.
12.72 a. The scatter plot shows a positive relationship.
c. The fit of this regression is very good, as given by R2 = .8206. The regression line
shows a strong positive linear relationship between revenue and profit, indicating that
greater revenue is associated with higher profit.
12.73 a. The slope of each model indicates the impact an additional year in vehicle age has on
the price. This relationship for each model is negative indicating that an additional year
of age reduces the asking price. This impact ranges from a low for the Taurus (an
additional year reduces the asking price by $906) to a high for the Ford Explorer (an
additional year reduces the asking price by $2,452).
b. The intercepts could indicate the price of a new vehicle.
c. Based on the R2 values: The fit is very good for the Explorer, the F-150 Pickup and the
Taurus. The fit is weak for the Mustang. One reason for the seemingly poor fit for the
Mustang is that it is a collector's item (if in good condition), so age is a less
important factor in determining the asking price.
d. Answers will vary, but a bivariate model for 3 of the vehicles explains approximately
2/3 of the variation in asking price at a minimum. Other factors: condition of the car,
collector status, proposed usage, price of a new vehicle.
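The slope interpretation in part (a) amounts to multiplying the slope by the change in age. The Python sketch below applies the two slopes quoted above to a 3-year change in age; the 3-year horizon is an arbitrary choice for illustration, and the intercepts are omitted because they are not listed here.

```python
# Estimated slopes (change in asking price, in dollars, per additional year of age)
slopes = {"Taurus": -906.0, "Explorer": -2452.0}


def price_change(slope: float, years: float) -> float:
    """Predicted change in asking price for a given change in vehicle age."""
    return slope * years


years = 3  # arbitrary illustrative horizon
changes = {model: price_change(b1, years) for model, b1 in slopes.items()}
for model, change in changes.items():
    print(f"{model}: {change:,.0f} dollars over {years} years")
```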
12.74 a. The regression results are not significant, based on the p-value, for the 1-Year holding
period. The results for the 2-Year period are significant at the 5% level, while for 5-, 8-,
and 10-Years the results are significant at the 1% level. For each regression there is an
inverse relationship between P/E and the stock return. For the 8-Year and 10-Year
period the relationship is approximately -1. The R2 increases as the holding period
increases. This indicates that P/E ratio explains a greater portion of the variation in
stock return, the longer the stock is held.
b. Yes, given the data are time series, the potential for autocorrelation is present. Also, it is
commonly recognized that stock returns do exhibit a high degree of autocorrelation, as
do most financial series.
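One common way to check for first-order autocorrelation is the Durbin-Watson statistic, computed from the residuals in time order. The sketch below is a minimal implementation; the residuals are hypothetical values chosen only to illustrate the calculation, not data from this exercise.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# HYPOTHETICAL residuals in time order, for illustration only
residuals = [1.0, 0.8, 0.5, -0.2, -0.6, -0.9]
dw = durbin_watson(residuals)
print(round(dw, 4))
```

A value near 2 suggests no first-order autocorrelation, a value well below 2 suggests positive autocorrelation, and a value well above 2 suggests negative autocorrelation.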
12.75 a. Using Father’s Height: My Predicted Height = 71 + 2.5 = 73.5”. My actual height =
73”. Using Average of Parents’ Height: My Predicted Height = 68 + 2.5 = 70.5”.
b. Fairly accurate: within 0.5” when using my father’s height, within 2.5” when using
average parent height. There may be improved accuracy using only the father’s height
for males.
c. Regression analysis of samples of daughters and sons, with respective average height of
parents. Separate samples of each.
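The arithmetic in parts (a) and (b) can be reproduced with a few lines of Python. The sketch assumes the prediction rule is simply the reference height plus 2.5 inches, as the sample answer applies it; the function and variable names are illustrative.

```python
ADJUSTMENT_IN = 2.5  # inches added to the reference height, as in the sample answer


def predicted_height(reference_height_in: float) -> float:
    """Predicted height in inches from a reference (father's or average parent) height."""
    return reference_height_in + ADJUSTMENT_IN


actual = 73.0                          # actual height from the sample answer
pred_father = predicted_height(71.0)   # using father's height
pred_avg = predicted_height(68.0)      # using average of parents' heights

err_father = abs(actual - pred_father)
err_avg = abs(actual - pred_avg)
print(pred_father, err_father, pred_avg, err_avg)
```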