
Chapter 12 - Simple Regression

Chapter 12
Simple Regression
12.1 For each sample: H0: ρ = 0 versus H1: ρ ≠ 0. The test statistic is
$t_{calc} = r\sqrt{\frac{n-2}{1-r^2}}$ with d.f. = n − 2, and the critical values are ±tcrit, where tcrit is the
magnitude of =T.INV(α/2, df). Because these are all two-tailed tests, the decision rule is:
Reject H0 if tcalc > +tcrit or tcalc < −tcrit.

Summary Table
Decision
2.138 > 2.101, Reject H0
−1.977 < −1.701, Reject H0
1.677 is between the critical values, Fail to Reject H0
−2.416 < −2.390, Reject H0

Learning Objective: 12-1
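
The test in 12.1 can be sketched in Python. This is a minimal illustration rather than part of the original solution: the function names are hypothetical, and the critical value is assumed to come from a t table or Excel's T.INV instead of being computed here.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, using n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def two_tailed_decision(t_calc: float, t_crit: float) -> str:
    """Two-tailed rule; only the magnitude of t_crit is used."""
    if abs(t_calc) > abs(t_crit):
        return "Reject H0"
    return "Fail to reject H0"

# First row of the summary table: t_calc = 2.138 against t_crit = 2.101.
print(two_tailed_decision(2.138, 2.101))
```

Passing the magnitude of the critical value keeps the rule the same whether T.INV returned the negative or the positive tail value.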

12.2 a. The scatter plot shows a positive correlation between hours worked and weekly pay.

b.
Hours Worked (X)   Weekly Pay (Y)   (xi − x̄)(yi − ȳ)
10                 93               840
15                 171              30
20                 204              0
20                 156              0
35                 261              1260
x̄ = 20            ȳ = 177          SSxy = 2130

$r = \frac{SS_{xy}}{\sqrt{SS_{xx}}\sqrt{SS_{yy}}} = \frac{2130}{\sqrt{350}\sqrt{15318}} = .9199$


c. tcrit = t.025 = T.INV(.025, 3) = ±3.182, using d.f. = n − 2 = 3
d. $t_{calc} = .9199\sqrt{\frac{5-2}{1-(.9199)^2}} = 4.063$. We reject the null hypothesis of zero correlation
because 4.063 > 3.182.
e. p-value =T.DIST.2T(4.063,3) = .0269.
Learning Objective: 12-1
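
The correlation in 12.2(b) can be reproduced from the raw data. The sketch below is an illustration only: the function name is hypothetical, and it uses the five (hours, pay) pairs listed in the table.

```python
import math

hours = [10, 15, 20, 20, 35]
pay = [93, 171, 204, 156, 261]

def sample_correlation(x, y):
    """Pearson r computed as SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

r = sample_correlation(hours, pay)
print(round(r, 4))
```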

12.3 a. The scatter plot shows a negative correlation between operators and wait time.

b.
Operators (X)   Wait (Y)   (xi − x̄)(yi − ȳ)
4               385        −76
5               335        12
6               383        0
7               344        −3
8               288        −118
x̄ = 6          ȳ = 347    SSxy = −185

$r = \frac{SS_{xy}}{\sqrt{SS_{xx}}\sqrt{SS_{yy}}} = \frac{-185}{\sqrt{10}\sqrt{6374}} = -.7328$
c. tcrit = t.025 = T.INV(.025, 3) = ±3.182, using d.f. = 3
d. $t_{calc} = -.7328\sqrt{\frac{5-2}{1-(-.7328)^2}} = -1.865$. We fail to reject the null hypothesis of zero
correlation because −1.865 > −3.182.
e. p-value = T.DIST.2T(1.865,3) = .159.
Learning Objective: 12-1


12.4 a. The scatter plot shows little correlation between age and amount spent.

b. rcalc = −.292
c. tcrit = t.025 = T.INV(.025, 8) = ±2.306, using d.f. = 8
d. $t_{calc} = -.292\sqrt{\frac{10-2}{1-(-.292)^2}} = -.864$
e. Because tcalc (−.864) > −2.306, we fail to reject the null hypothesis of zero correlation.
Learning Objective: 12-1

12.5 a. The scatter plot shows a positive correlation between returns from last year and returns
from this year.

b. rcalc = .5313
c. tcrit = t.025 = T.INV(.025, 15) = ±2.131, using d.f. = 15
d. $t_{calc} = .5313\sqrt{\frac{17-2}{1-(.5313)^2}} = 2.429$
e. Because tcalc (2.429) > 2.131, we reject the null hypothesis of zero correlation.
Learning Objective: 12-1


12.6 a. The scatter plot shows a positive correlation between orders and ship cost.

b. rcalc = .820
c. tcrit = t.025 = T.INV(.025, 10) = ±2.228, using d.f. = 10
d. $t_{calc} = .820\sqrt{\frac{12-2}{1-(.820)^2}} = 4.530$
e. Because tcalc (4.53) > 2.228, we reject the null hypothesis of zero correlation.
Learning Objective: 12-1

12.7 a. Increasing the size of a home by 1 square foot increases the price by $150.
b. HomePrice = $125000 + $150×(2000) = $425,000
c. The intercept might be interpreted as the value of the lot without a home. But the range
of values for X does not include zero, so it would be dangerous to extrapolate to x = 0.
Learning Objective: 12-2
Learning Objective: 12-3

12.8 a. An increase in the price of the item of $1 reduces its expected sales by 37.5 units.
b. Sales = 842 – ($20)×37.5 = 92
c. From a practical point of view no. A zero price is unrealistic.
Learning Objective: 12-2
Learning Objective: 12-3

12.9 a. An increase in the median age of one year means the number of car thefts decreases by
35.3.
b. CarTheft = 1,667 - 35.3×40 = 255
c. The intercept would not be meaningful because you would not have a median age of
zero for any state.
Learning Objective: 12-2
Learning Objective: 12-3

12.10 a. An increase in the microprocessor speed of one MHz means the computer power
dissipation increases by 0.032 watts.
b. Computer power dissipation= 15.73 + 0.032×3000 = 111.73 watts
c. The intercept would not be meaningful because you would not have a computer with
zero speed.
Learning Objective: 12-2
Learning Objective: 12-3


12.11 a. An increase in a country’s Power distance index of one unit means the number of
international franchises increases by 1.75.
b. International franchises= -47.5 + 1.75×85 = 101.25
c. The intercept would not be meaningful because you cannot have a negative number of
franchises. While the range for the index does include zero, it is unlikely that a
country’s index value will be close to zero.
Learning Objective: 12-2
Learning Objective: 12-3

12.12 a. Increasing the average revenue by $1 million raises the net income by $30,700.
b. If revenue is zero, then net income is $2,277 million, which suggests that the firm has
net income when revenue is zero. This does not seem logical.
c. Net income = $2,277 + 0.0307×($20,000) = $2,891 million
Learning Objective: 12-2
Learning Objective: 12-3

12.13 a. Increasing the median income by $1,000 raises the median home price by $2,610.
b. If median income is zero, then the model suggests that median home price is $51,300.
While it does not seem logical that the median family income for any city is zero, it is
unclear what the lower bound would be.
c. HomePrice = $51.3 + 2.61×($50) = $181.8 or $181,800
HomePrice = $51.3 + 2.61×($100) = $312.3 or $312,300
Learning Objective: 12-2
Learning Objective: 12-3

12.14 a. Increasing the number of hours worked per week by 1 hour reduces the expected
number of credits by .07.
b. Yes, the intercept makes sense in this situation. It is possible that a student does not
have a job outside of school.
c. Credits = 15.4 − .07×0 = 15.4 credits
Credits = 15.4 − .07×40 = 12.6 credits
The more hours a student works, the fewer credits (courses) the student will take on average.
Learning Objective: 12-2
Learning Objective: 12-3

12.15 a. Chevy Blazer: a one year increase in vehicle age reduces the price by $1,050.
Chevy Silverado: a one year increase in vehicle age reduces the price by $1,339.
b. Chevy Blazer: If age = 0 then price = $16,189. This could be the price of a new Blazer.
Chevy Silverado: If age = 0 then price = $22,591. This could be the price of a new
Silverado.
c. Chevy Blazer: $16,189 − $1,050×5 = $10,939
Chevy Silverado: $22,591 − $1,339×5 = $15,896
Learning Objective: 12-2
Learning Objective: 12-3
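
The predictions in 12.7 through 12.15 all substitute an x value into a fitted line. A minimal sketch follows, using the 12.15 coefficients as inputs; the helper name `predict` is hypothetical.

```python
def predict(intercept, slope, x):
    """Point prediction from a fitted line: y_hat = b0 + b1 * x."""
    return intercept + slope * x

# 12.15(c): predicted price of a five-year-old vehicle.
blazer = predict(16189, -1050, 5)
silverado = predict(22591, -1339, 5)
print(blazer, silverado)
```

As the intercept discussions note, a prediction is only trustworthy for x values inside the range of the observed data.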


12.16 a. ŷi = $2,277 + 0.0307($41,078) = $3,538.0946,
ei = yi − ŷi = 8,301 − 3,538.0946 = 4,762.9054. The regression equation underestimated
the net income.
b. ŷi = $2,277 + 0.0307($61,768) = $4,173.2776,
ei = yi − ŷi = 893 − 4,173.2776 = −3,280.2776. The regression equation overestimated
the net income.
Learning Objective: 12-2
Learning Objective: 12-3

12.17 a. ŷi = 15.4 − 0.07(14) = 14.42, ei = yi − ŷi = 18 − 14.42 = 3.58. The regression equation
underestimated the number of credits.
b. ŷi = 15.4 − 0.07(30) = 13.3, ei = yi − ŷi = 6 − 13.3 = −7.3. The regression equation
overestimated the number of credits.
Learning Objective: 12-2
Learning Objective: 12-3
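
The sign convention used in 12.16 and 12.17 (a positive residual means the line predicted too low) can be sketched as follows. The function names are hypothetical, and the inputs are the 12.17(a) values.

```python
def residual(y_actual, y_predicted):
    """Residual e = y - y_hat."""
    return y_actual - y_predicted

def describe_fit(e):
    """Positive residual: prediction was too low; negative: too high."""
    if e > 0:
        return "underestimated"
    if e < 0:
        return "overestimated"
    return "exact"

# 12.17(a): 14 hours worked, 18 credits observed.
y_hat = 15.4 - 0.07 * 14
e = residual(18, y_hat)
print(round(y_hat, 2), round(e, 2), describe_fit(e))
```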

12.18 a.
Hours Worked (X)   Weekly Pay (Y)   (xi − x̄)(yi − ȳ)
10                 93               840
15                 171              30
20                 204              0
20                 156              0
35                 261              1260
x̄ = 20            ȳ = 177          SSxy = 2130

b. $b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{2130}{350} = 6.086$, b0 = 177 − 6.086(20) = 55.286, ŷ = 55.286 + 6.086x
c.

Hours         Weekly     Estimated
Worked (xi)   Pay (yi)   Pay (ŷi)   yi − ŷi   (yi − ŷi)²   (ŷi − ȳ)²   (yi − ȳ)²
10            93         116.146    −23.146   535.7373     3703.209    7056
15            171        146.576    24.424    596.5318     925.6198    36
20            204        177.006    26.994    728.676      3.6E-05     729
20            156        177.006    −21.006   441.252      3.6E-05     441
35            261        268.296    −7.296    53.23162     8334.96     7056
20 (x̄)       177 (ȳ)    177.006    −0.006    3.6E-05      3.6E-05     0
Totals:                                        SSE          SSR         SST
                                               2355.429     12963.79    15318

d. $R^2 = \frac{SSR}{SST} = \frac{12,963}{15,318} = .8462$


e.

Learning Objective: 12-4


Learning Objective: 12-7
Learning Objective: 12-8
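
The hand calculation in 12.18 can be checked with a short least-squares sketch. This is an illustration under the assumption of a single predictor; the function name and the returned dictionary keys are hypothetical. Because it does not round b1 to 6.086 before computing fitted values, its SSR is about 12,962.57 rather than the 12,963.79 in the hand table.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for one predictor, with the sums of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {"b0": b0, "b1": b1, "sse": sse, "ssr": ssr, "sst": sst, "r2": ssr / sst}

fit = fit_simple_regression([10, 15, 20, 20, 35], [93, 171, 204, 156, 261])
print({name: round(value, 4) for name, value in fit.items()})
```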

12.19 a.

Operators (X)   Wait (Y)   (xi − x̄)(yi − ȳ)
4               385        −76
5               335        12
6               383        0
7               344        −3
8               288        −118
x̄ = 6          ȳ = 347    SSxy = −185

b. $b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{-185}{10} = -18.5$, b0 = 347 + 18.5(6) = 458, ŷ = 458 − 18.5x
c.
Operators   Wait Time   Estimated
(xi)        (yi)        Time (ŷi)   yi − ŷi   (yi − ŷi)²   (ŷi − ȳ)²   (yi − ȳ)²
4           385         384         1         1            1369        1444
5           335         365.5       −30.5     930.25       342.25      144
6           383         347         36        1296         0           1296
7           344         328.5       15.5      240.25       342.25      9
8           288         310         −22       484          1369        3481
x̄ = 6      ȳ = 347                           SSE          SSR         SST
                                              2951.5       3422.5      6374

d. $R^2 = \frac{SSR}{SST} = \frac{3,422.5}{6,374.0} = .5369$
e.


Learning Objective: 12-4


Learning Objective: 12-7
Learning Objective: 12-8

12.20 a. and b.

c. An increase in age of 10 years leads to an average decrease in spending of $0.53.


d. The intercept is not meaningful in this case.
e. R² = .0851. Only 8.51% of the variation in spending is explained by the variation in age. Age of the
consumer has little impact on the amount spent.
Learning Objective: 12-4
Learning Objective: 12-7
Learning Objective: 12-8


12.21 a. and b.

c. An increase of 1% in last year's return leads to an increase, on average, of .458% for
this year's return.
d. If last year’s return is zero, this year’s return is 11.155%. Yes, this is meaningful,
returns can be zero.
e. R² = .2823. Only 28.23% of the variation in this year's return is explained by last year's
return.
Learning Objective: 12-4
Learning Objective: 12-7
Learning Objective: 12-8

12.22 a. and b.

c. An increase of 100 orders leads to an average increase in shipping cost of $493.22.


d. The intercept is not meaningful in this case.
e. R² = .6717. 67.17% of the variation in shipping costs is explained by number of orders.
Learning Objective: 12-4
Learning Objective: 12-7
Learning Objective: 12-8


12.23 a. MegaStat regression output:


Regression Analysis

r² 0.846 n 5
r 0.920 k 1
Std. Error 28.020 Dep. Var. Weekly Pay (Y)

ANOVA table
Source SS df MS F p-value
Regression 12,962.5714 1 12,962.5714 16.51 .0269
Residual 2,355.4286 3 785.1429
Total 15,318.0000 4

Regression output confidence interval


variables coefficients std. error t (df=3) p-value 95% lower 95% upper
Intercept 55.2857 32.4705 1.703 .1872 -48.0500 158.6214
Hours Worked (X) 6.0857 1.4978 4.063 .0269 1.3192 10.8522

b. H0: 1 = 0 vs. H1: 1 ≠ 0.


c. p-value = .0269, (1.3192, 10,8522)
d. The slope is significantly different from zero because the p-value is less than .05.
Learning Objective: 12-6
Learning Objective: 12-7

12.24 a. MegaStat regression output:


Regression Analysis

r² 0.537 n 5
r -0.733 k 1
Std. Error 31.366 Dep. Var. Wait Time (Y)

ANOVA table
Source SS df MS F p-value
Regression 3,422.5000 1 3,422.5000 3.48 .1590
Residual 2,951.5000 3 983.8333
Total 6,374.0000 4

Regression output confidence interval


variables coefficients std. error t (df=3) p-value 95% lower 95% upper
Intercept 458.0000 61.1438 7.491 .0049 263.4131 652.5869
Operators (X) -18.5000 9.9188 -1.865 .1590 -50.0662 13.0662

b. H0: 1 = 0 vs. H1: 1 ≠ 0.


c. p-value = .1590, (-50.0662, 13.0662)


d. The slope is not significantly different from zero because the p-value is greater than .05.
Learning Objective: 12-6
Learning Objective: 12-7

12.25 a. ŷ = 557.4511 + 3.0047x


b. For a 95% confidence level use t.025 =T.INV(.025,30) = −2.042. The 95% confidence
interval is 3.0047 ± 2.042(0.8820) or (1.203, 4.806).
c. H0: β1 ≤ 0 versus H1: β1 > 0. tcrit = T.INV(.95, 30) = 1.697. Reject the null hypothesis if tcalc >
1.697. tcalc = 3.0047/0.8820 = 3.407 > 1.697 so we reject the null hypothesis.
d. p-value =T.DIST.RT(3.407,30) = .0009 < .05 so we reject the null hypothesis. The slope
is positive. Increased debt is correlated with increased NFL team value.
Learning Objective: 12-6
Learning Objective: 12-7
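
The interval and test statistic in 12.25 follow directly from the slope estimate and its standard error. The sketch below is illustrative only: the function name is hypothetical, and the critical value is assumed to be supplied from a t table or Excel.

```python
def slope_inference(b1, se_b1, t_crit):
    """t statistic and confidence interval for a regression slope."""
    margin = abs(t_crit) * se_b1
    return {
        "t_calc": b1 / se_b1,
        "lower": b1 - margin,
        "upper": b1 + margin,
    }

# 12.25: b1 = 3.0047, standard error = 0.8820, t.025 with 30 d.f. = 2.042.
result = slope_inference(3.0047, 0.8820, 2.042)
print({name: round(value, 3) for name, value in result.items()})
```

The same t_calc serves the one-tailed test in part (c); only the critical value changes (1.697 instead of 2.042).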

12.26 a. ŷ = 7.6425 + 0.9467x


b. For a 95% confidence level use t.025 =T.INV(.025,14) = −2.145. The 95% confidence
interval is 0.9467 ± 2.145(0.0936) or (0.7460, 1.1473).
c. H0: β1 ≤ 0 versus H1: β1> 0. tcrit=T.INV(.95,14) = 1.761. Reject the null hypothesis if
tcalc> 1.761. tcalc = 0.9467/0.0936 = 10.114> 1.761 so we reject the null hypothesis.
d. p-value =T.DIST.RT(10.114,14) = .0000 < .05 so we reject the null hypothesis. The
slope is positive. Increased revenue is correlated with increased expenses.
Learning Objective: 12-6
Learning Objective: 12-7

12.27 a. ŷ = 1.8064 + .0039x


b. Intercept: tcalc = 1.8064/0.6116 = 2.954, Slope: tcalc = 0.0039/0.0014 = 2.786 (Excel
value is slightly different due to internal rounding.)
c. d.f. = 10, t.025 =T.INV(.025, 10) = −2.228 so tcrit = ±2.228. Or one could use
=T.INV.2T(.05,10)
d. Intercept: p-value = T.DIST.2T(2.954,10) = .0144. Slope: p-value
= T.DIST.2T(2.869,10) = .0167.
e. Fcalc = (2.869)² = 8.23
f. This model fits the data fairly well. The F statistic is significant at the .05 level
(p-value = .0167). Also, R² = .452, indicating almost half of the variation in annual taxes is
explained by home price.
Learning Objective: 12-6
Learning Objective: 12-8

12.28 a. ŷ = 614.930 − 109.112x
b. Intercept: tcalc = 614.930/51.2343 = 12.002. Slope: tcalc = −109.112/51.3623 = −2.124.
c. d.f. = 18, t.025 = T.INV(.025, 18) = −2.101 so tcrit = ±2.101.
d. Intercept: p-value =T.DIST.2T(12.002,18) = .0000, Slope: p-value
=T.DIST.2T(2.124,18) = .0478
e. Fcalc =(−2.124)2= 4.51


f. This model has a poor fit. The F statistic is barely significant at a level of .05 (p-value =
.0478) and R2 = .2. Only 20% of the variation in units sold can be explained by average
price.
Learning Objective: 12-6
Learning Objective: 12-8

12.29 a. MegaStat regression output:

Regression Analysis

r² 0.085 n 10
r -0.292 k 1
Std. Error 2.128 Dep. Var. Spent (Y)

ANOVA table
Source SS df MS F p-value
Regression 3.3727 1 3.3727 0.74 .4133
Residual 36.2396 8 4.5299
Total 39.6123 9

Regression output confidence interval


variables coefficients std. error t (df=8) p-value 95% lower 95% upper
Intercept 6.9609 2.0885 3.333 .0103 2.1449 11.7770
Age (X) -0.0530 0.0614 -0.863 .4133 -0.1946 0.0886

b. (−0.1946, 0.0886) This interval contains zero; therefore we cannot conclude that the
slope is different from zero.
c. The t statistic is −0.863 and the p-value is .4133. Because the p-value is greater than
0.05, we cannot conclude that the slope is different from zero.
d. Fcalc = 0.745 with a p-value = .4133. This indicates that the model does not fit the data.
e. The p-values match. Fcalc= (−0.863)2 = 0.745.
f. This model does not fit the data. The F statistic is not significant.
Learning Objective: 12-6
Learning Objective: 12-7
Learning Objective: 12-8
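
Part (e) relies on the fact that, with a single predictor, the F statistic equals the square of the slope's t statistic. A minimal sketch using the 12.29 output follows; the function names are hypothetical.

```python
def f_from_t(t_calc):
    """In simple regression, F equals the squared slope t statistic."""
    return t_calc ** 2

def f_from_anova(ss_regression, ss_residual, df_residual):
    """F = MSR / MSE, with one regression degree of freedom."""
    ms_regression = ss_regression / 1
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual

# 12.29: slope t = -0.863; ANOVA SS = 3.3727 and 36.2396 with 8 residual d.f.
print(round(f_from_t(-0.863), 3), round(f_from_anova(3.3727, 36.2396, 8), 3))
```

The two routes agree up to rounding in the printed output, which is why the p-values for the slope test and the F test match.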


12.30 a. MegaStat regression output:


Regression Analysis

r² 0.282 n 17
r 0.531 k 1
Std. Error 4.335 Dep. Var. This Year (Y)

ANOVA table
Source SS df MS F p-value
Regression 110.8585 1 110.8585 5.90 .0282
Residual 281.8321 15 18.7888
Total 392.6906 16

Regression output confidence interval


variables coefficients std. error t (df=15) p-value 95% lower 95% upper
Intercept 11.1549 2.1907 5.092 .0001 6.4854 15.8243
Last Year (X) 0.4580 0.1885 2.429 .0282 0.0561 0.8598
b. (0.0561, 0.8598) This interval does not contain zero and falls on the positive side
therefore we can conclude that the slope is greater than zero.
c. The t statistic is 2.429 and the p-value is .0282. Because the p-value is less than 0.05,
we can conclude that the slope is positive.
d. Fcalc = 5.90 with a p-value = .0282. This indicates that the model does provide some fit
to the data.
e. The p-values match. Fcalc= (2.429)2 = 5.90.
f. This model provides modest fit to the data. Although the F statistic is significant, R2
shows that only 28% of the variation in this year’s return is explained by last year’s
return.
Learning Objective: 12-6
Learning Objective: 12-7
Learning Objective: 12-8

12.31 a. MegaStat regression output:

Regression Analysis

r² 0.672 n 12
r 0.820 k 1
Std. Error 599.029 Dep. Var. Ship Cost (Y)

ANOVA table
Source SS df MS F p-value
Regression 7,340,819.5514 1 7,340,819.5514 20.46 .0011
Residual 3,588,357.1152 10 358,835.7115
Total 10,929,176.6667 11


Regression output confidence interval


variables coefficients std. error t (df=10) p-value 95% lower 95% upper
Intercept -31.1895 1,059.8678 -0.029 .9771 -2,392.7222 2,330.3432
Orders (X) 4.9322 1.0905 4.523 .0011 2.5024 7.3619

b. (2.5024, 7.3619) This interval does not contain zero therefore we can conclude that the
slope is greater than zero.
c. The t statistic is 4.523 and the p-value is 0.0011. Because the p-value is less than 0.05,
we can conclude that the slope is positive.
d. Fcalc = 20.46 with a p-value = .0011. This indicates that the model does provide some fit
to the data.
e. The p-values match. Fcalc = (4.523)2 = 20.46.
f. This model provides a good fit to the data. The F statistic is highly significant and R²
shows that 67% of the variation in shipping cost is explained by number of orders.
Learning Objective: 12-6
Learning Objective: 12-7
Learning Objective: 12-8

12.32 a. MegaStat Predictions:

Predicted values for: Weekly Pay (Y)


95% Confidence Intervals 95% Prediction Intervals
Hours Worked (X) Predicted lower upper lower upper
12 128.314 73.138 183.491 23.451 233.178
17 158.743 116.377 201.109 60.017 257.469
21 183.086 142.922 223.249 85.285 280.887
25 207.429 160.970 253.887 106.879 307.978
30 237.857 175.709 300.005 129.164 346.551

b. x = 17: 95% confidence interval (116.377, 201.109), 95% prediction interval (60.017,
257.469)
c. The 95% confidence interval for µY: $177 \pm 2.776\frac{61.883}{\sqrt{5}}$ or (100.174, 253.826).
d. The margin of error for the confidence interval in part (c) is 76.826 whereas the
margin of error for the confidence interval in part (b) is less than 76.826. Knowing the
number of hours a student works helps us better estimate the student's average weekly pay.
Learning Objective: 12-5
Learning Objective: 12-9
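
The comparison in parts (c) and (d) can be checked numerically. The sketch below assumes the critical value 2.776 (4 d.f.) is taken from a table; the function name is hypothetical, and the regression interval endpoints are copied from the MegaStat row for x = 17.

```python
import math

def mean_confidence_interval(y_bar, s, n, t_crit):
    """Confidence interval for the mean of Y when X is ignored."""
    margin = abs(t_crit) * s / math.sqrt(n)
    return y_bar - margin, y_bar + margin, margin

lower, upper, margin = mean_confidence_interval(177, 61.883, 5, 2.776)

# Half-width of the regression-based interval at x = 17 (MegaStat output).
regression_margin = (201.109 - 116.377) / 2

print(round(lower, 3), round(upper, 3), round(margin, 3), round(regression_margin, 3))
```

The regression-based half-width is smaller, which is the sense in which knowing X sharpens the estimate.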


12.33 a. MegaStat Predictions:

Predicted values for: Profit (Y)


95% Confidence Intervals 95% Prediction Intervals
Revenue (X) Predicted lower upper lower upper
1.8 -0.070148 -0.554318 0.414023 -1.326790 1.186495
15.0 0.757399 0.367076 1.147723 -0.466153 1.980952
30.0 1.697794 1.106753 2.288834 0.396234 2.999353

b. x = 15: 95% confidence interval (0.3671, 1.1477), 95% prediction interval
(−0.4662, 1.9810)
c. The 95% confidence interval for µY: $0.628 \pm 2.306\frac{1.083}{\sqrt{9}}$ or (−0.2045, 1.4605).
d. The margin of error for the confidence interval in part (c) is 0.8325 whereas the margin
of error for the confidence interval in part b is less than 0.8325. Knowing the revenue
an entertainment company brings in helps us better estimate their average profit.
Learning Objective: 12-5
Learning Objective: 12-9

12.34 No, these plots do not show that regression error assumptions of normality or constant
variance have been violated. The plot on the left is a normplot and shows a fairly
straight line on the diagonal. This indicates that the assumption of a normal distribution
for residuals is reasonable. The plot on the right shows residual values plotted against
the corresponding x value. The plot suggests that the residuals are homoscedastic
because there is no increase or decrease in residual magnitude.
Learning Objective: 12-10

12.35 The plot on the left is a normplot and shows a fairly straight line on the diagonal. This
indicates that the assumption of a normal distribution for residuals is reasonable. The
plot on the right shows residual values plotted against the corresponding x value. The
plot suggests that the residuals are heteroscedastic because there is an increase in
residual magnitude as the x values increase.
Learning Objective: 12-10

12.36 a. Predicted Defects = 3.2 + 0.045(100) = 7.7 defects per million parts
b. ei = yi − ŷi = 4.4 – 7.7 = −3.3
c. $e_i^* = \frac{e_i}{s_{e_i}} = \frac{-3.3}{1.07} = -3.084$
d. Yes, this residual is considered an outlier because ei* < −3.
Learning Objective: 12-11

12.37 a. Predicted MPG = 49.22 – 0.081(200) = 33.02 MPG


b. ei = yi − ŷi = 38.15 – 33.02 = 5.13

c. $e_i^* = \frac{e_i}{s_{e_i}} = \frac{5.13}{2.03} = 2.527$
d. No, this residual is not considered an outlier because ei* < 3 but it is considered unusual
because ei* > 2.
Learning Objective: 12-11
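
The classification rule applied in 12.36 and 12.37 (beyond ±3 is an outlier, beyond ±2 is unusual) can be sketched as below. The function names and the label "typical" are hypothetical choices for illustration.

```python
def standardized_residual(e, s_e):
    """Residual divided by its standard error."""
    return e / s_e

def classify_residual(e_star):
    """Label a standardized residual using the 2 and 3 cutoffs."""
    if abs(e_star) > 3:
        return "outlier"
    if abs(e_star) > 2:
        return "unusual"
    return "typical"

defects = standardized_residual(-3.3, 1.07)   # 12.36
mpg = standardized_residual(5.13, 2.03)       # 12.37
print(round(defects, 3), classify_residual(defects))
print(round(mpg, 3), classify_residual(mpg))
```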

12.38 a. $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_{xx}} = \frac{1}{29} + \frac{(2382 - 2004)^2}{999{,}603} = 0.1774$. 4/n = 4/29 = 0.1379. Because
0.1774 > 0.1379 this would be considered a high leverage observation.
b. $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_{xx}} = \frac{1}{29} + \frac{(2125 - 2004)^2}{999{,}603} = 0.0491$. 4/n = 4/29 = 0.1379. Because 0.0491
< 0.1379 this would not be considered a high leverage observation.
c. $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_{xx}} = \frac{1}{29} + \frac{(1620 - 2004)^2}{999{,}603} = 0.1820$. 4/n = 4/29 = 0.1379. Because
0.1820 > 0.1379 this would be considered a high leverage observation.
Learning Objective: 12-11
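
The leverage calculation and the 4/n rule of thumb can be sketched as follows, using the 12.38 inputs. The function names are hypothetical.

```python
def leverage(x_i, x_bar, ss_xx, n):
    """Leverage of one observation in simple regression."""
    return 1 / n + (x_i - x_bar) ** 2 / ss_xx

def is_high_leverage(h, n):
    """Flag leverage above the 4/n threshold."""
    return h > 4 / n

# 12.38: n = 29, x-bar = 2004, SSxx = 999,603.
for x_i in (2382, 2125, 1620):
    h = leverage(x_i, 2004, 999603, 29)
    print(x_i, round(h, 4), is_high_leverage(h, 29))
```

Leverage depends only on how far x lies from the mean of X, so a point can have high leverage without having a large residual.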

12.39 a. $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_{xx}} = \frac{1}{74} + \frac{(0.072 - 2.027)^2}{22.285} = 0.185$. 4/n = 4/74 = 0.0541. Because 0.185
> 0.0541 this would be considered a high leverage observation.
b. $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_{xx}} = \frac{1}{74} + \frac{(1.413 - 2.027)^2}{22.285} = 0.0304$. 4/n = 4/74 = 0.0541. Because
0.0304 < 0.0541 this would not be considered a high leverage observation.
c. $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_{xx}} = \frac{1}{74} + \frac{(3.376 - 2.027)^2}{22.285} = 0.0952$. 4/n = 4/74 = 0.0541. Because
0.0952 > 0.0541 this would be considered a high leverage observation.
Learning Objective: 12-11

Questions 12.40 through 12.55 refer to 10 different data sets labeled A-J. The answers to each question
are listed for each data set in turn. Note that one can find the tcrit values using either
=T.INV(α/2, df) or =T.INV.2T(α, df).

DATA SET A

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. Most likely from a survey similar to the Current Population Survey
conducted by The Bureau of Labor Statistics.
Learning Objective: 02-7


12.42 Answers will vary. Sample size is sufficient for educational purposes but most government
studies will have much larger sample sizes.
Learning Objective: 02-9
12.43 A positive slope would be logical. It makes sense that higher income is associated with
higher home values. Cause and effect cannot be assumed. An increase in income does
not automatically compel a family to purchase a more expensive home.
Learning Objective: 12-2

12.44

There is a moderate positive relationship between Median Income and Median Home
Price.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase in median income of $1000, increases home price by $2,609.80. No, the
intercept does not have meaning because it seems unlikely that the median income for a
family will be equal to zero.
Learning Objective: 12-2


12.47 MegaStat output is provided.

Regression Analysis

r² 0.340 n 34
r 0.583 k 1
Std. Error 58.855 Dep. Var. Home

ANOVA table
Source SS df MS F p-value
Regression 57,071.8007 1 57,071.8007 16.48 .0003
Residual 110,844.4754 32 3,463.8899
Total 167,916.2761 33

Regression output confidence interval


variables coefficients std. error t (df=32) p-value 95% lower 95% upper
Intercept 51.2465 55.9415 0.916 .3665 -62.7025 165.1955
Income 2.6098 0.6430 4.059 .0003 1.3002 3.9195

Observation   Home   Predicted   Residual   Leverage   Studentized Residual   Studentized Deleted Residual
1 290.0000 207.7728 82.2272 0.108 1.479 1.508
2 279.9000 344.6811 -64.7811 0.115 -1.170 -1.177
3 338.2500 332.7568 5.4932 0.089 0.098 0.096
4 316.0000 295.2225 20.7775 0.037 0.360 0.355
5 207.0000 252.4398 -45.4398 0.038 -0.787 -0.782
6 250.0000 252.8364 -2.8364 0.038 -0.049 -0.048
7 320.0000 298.8240 21.1760 0.040 0.367 0.362
8 150.0000 191.5449 -41.5449 0.150 -0.766 -0.761
9 230.0000 274.9494 -44.9494 0.029 -0.775 -0.770
10 199.0000 252.2884 -53.2884 0.038 -0.923 -0.921
11 218.5000 214.7044 3.7956 0.092 0.068 0.067
12 290.0000 337.0265 -47.0265 0.098 -0.841 -0.837
13 315.0000 278.2247 36.7753 0.030 0.634 0.628
14 248.0000 269.3827 -21.3827 0.030 -0.369 -0.364
15 290.0000 271.8724 18.1276 0.030 0.313 0.308
16 220.0000 220.7383 -0.7383 0.080 -0.013 -0.013
17 170.4500 219.3995 -48.9495 0.083 -0.868 -0.865
18 290.0000 296.5352 -6.5352 0.038 -0.113 -0.111
19 205.0000 320.0496 -115.0496 0.066 -2.022 -2.131
20 410.0000 289.3791 120.6209 0.033 2.084 2.207
21 379.9750 335.0874 44.8876 0.094 0.801 0.796


22 135.0000 221.2733 -86.2733 0.079 -1.528 -1.562


23 358.5000 306.2855 52.2145 0.047 0.909 0.906
24 342.5000 257.8630 84.6370 0.034 1.463 1.491
25 341.0000 288.2803 52.7197 0.033 0.911 0.908
26 287.4500 314.0966 -26.6466 0.057 -0.466 -0.460
27 214.5000 259.5228 -45.0228 0.033 -0.778 -0.773
28 330.8750 220.7644 110.1106 0.080 1.951 2.045
29 444.5000 322.9831 121.5169 0.070 2.141 2.277
30 240.0000 273.7698 -33.7698 0.029 -0.582 -0.576
31 226.4500 250.9756 -24.5256 0.039 -0.425 -0.420
32 278.2500 320.9709 -42.7209 0.067 -0.752 -0.746
33 290.0000 293.8079 -3.8079 0.036 -0.066 -0.065
34 230.0000 249.7908 -19.7908 0.040 -0.343 -0.338
Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (1.3002, 3.9195). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 32, t.025 =T.INV(.025, 32) = ±2.037, tcalc = 4.059 > 2.037.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0003. This means that 3 times out of 10,000 we will see a sample this
extreme if the slope is actually equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R2 = .34. 34% of the variation in median home price can be explained by median
household income. Closer to 100% is always desirable but this is still decent
considering we are using only one predictor variable.
b. Fcalc = 16.48 and its p-value = .0003. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Observations 19, 20, 28, and 29 have unusual residual values. Observation 19 is in PA. The
model overestimated the median home price. The other three cities were in NJ and NY.
The model underestimated the median home price. Home prices tend to be higher in the
northeast part of the country. More research would be needed to better explain the
unusual observations.
Learning Objective: 12-11
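
The screening step in 12.50 can be sketched as a filter over the residual table. This sketch assumes the screen uses the studentized deleted residual with a cutoff of 2 in absolute value; that assumption is consistent with observation 28 being listed (its studentized residual is 1.951 but its deleted residual is 2.045), though the solution does not state the rule explicitly. Only a few rows of the table are included, and the function name is hypothetical.

```python
# Observation number -> studentized deleted residual (subset of the 12.47 table).
deleted_residuals = {
    1: 1.508,
    8: -0.761,
    19: -2.131,
    20: 2.207,
    22: -1.562,
    28: 2.045,
    29: 2.277,
}

def flag_unusual(residuals_by_obs, cutoff=2.0):
    """Return observation numbers whose residual exceeds the cutoff in magnitude."""
    return sorted(obs for obs, value in residuals_by_obs.items() if abs(value) > cutoff)

unusual = flag_unusual(deleted_residuals)
print(unusual)
```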


12.51 a. The normplot of residuals show a fairly straight line on the diagonal. The histogram of
residuals shows a fairly symmetric distribution with slight skewness to the right.

b. The residuals do not show any obvious departure from normality. The assumption of
residual normality is valid.


Learning Objective: 12-10

12.52

The residual plot does not show signs of heteroscedasticity.


Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = $75,000 and $105,000 are
shown in the table below.

Predicted values for: Home


95% Confidence Intervals 95% Prediction Intervals
Income Predicted lower upper lower upper
75.000 246.982632 222.190412 271.774852 124.562573 369.402690
105.000 325.277089 292.571489 357.982690 201.012561 449.541617

Learning Objective: 12-9

12.55 Observation 8 has a high leverage. This is Chesapeake, VA, which has a very low median
income.
Learning Objective: 12-11


DATA SET B

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Data were most likely from a semester’s class of 58 students. This can be treated as a
sample from the larger population of students who take statistics over several years.
Learning Objective: 02-7

12.42 A sample size of 58 should be sufficient to draw significant conclusions about the
relationship between midterm and final exam scores.
Learning Objective: 02-9

12.43 A positive slope would be logical. It makes sense that higher midterm scores are associated
with higher final exam scores and vice versa. While cause and effect cannot be assumed
it would be reasonable to hypothesize that if a student performs well on their midterm
exam due to a solid understanding of the material they will then be able to comprehend
the material in the last half of the semester and will be well-prepared for their final
exam.
Learning Objective: 12-2

12.44

There is a moderate positive relationship between Midterm Exam Score and Final
Exam Score.
Learning Objective: 12-4


12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 A one point increase in the midterm score is associated with a 1.014 point increase in the
final exam score. While theoretically a student could score zero on the midterm, the y-
intercept is negative which is not feasible.
Learning Objective: 12-2

12.47 MegaStat output is provided.

Regression Analysis

r² 0.430 n 58
r 0.656 k 1
Std. Error 7.517 Dep. Var. Final Exam Score

ANOVA table
Source SS df MS F p-value
Regression 2,385.5446 1 2,385.5446 42.22 2.33E-08
Residual 3,164.0588 56 56.5011
Total 5,549.6034 57

Regression output confidence interval


variables coefficients std. error t (df=56) p-value 95% lower 95% upper
Intercept -1.5568 12.0004 -0.130 .8972 -25.5965 22.4829
Midterm Exam
Score 1.0138 0.1560 6.498 2.33E-08 0.7012 1.3263

Observation   Final Exam Score   Predicted   Residual   Leverage   Studentized Residual   Studentized Deleted Residual
1 78.0 79.5 -1.5 0.022 -0.208 -0.206
2 85.0 86.6 -1.6 0.063 -0.226 -0.224
3 81.0 71.4 9.6 0.027 1.290 1.297
4 54.0 68.4 -14.4 0.042 -1.957 -2.009
5 70.0 85.6 -15.6 0.055 -2.139 -2.212
6 73.0 82.6 -9.6 0.035 -1.298 -1.306
7 89.0 77.5 11.5 0.018 1.541 1.561
8 84.0 74.5 9.5 0.018 1.279 1.286
9 86.0 73.5 12.5 0.020 1.685 1.714
10 79.0 74.5 4.5 0.018 0.607 0.604
11 75.0 83.6 -8.6 0.040 -1.168 -1.172
12 63.0 72.4 -9.4 0.023 -1.272 -1.279
13 72.0 73.5 -1.5 0.020 -0.197 -0.195
14 69.0 72.4 -3.4 0.023 -0.464 -0.461
15 86.0 79.5 6.5 0.022 0.868 0.866

16 78.0 74.5 3.5 0.018 0.473 0.470
17 75.0 71.4 3.6 0.027 0.481 0.477


18 68.0 76.5 -8.5 0.017 -1.141 -1.145
19 77.0 75.5 1.5 0.017 0.203 0.201
20 78.0 65.4 12.6 0.066 1.741 1.774
21 77.0 73.5 3.5 0.020 0.475 0.472
22 73.0 70.4 2.6 0.031 0.348 0.346
23 79.0 84.6 -5.6 0.047 -0.765 -0.762
24 74.0 73.5 0.5 0.020 0.072 0.071
25 79.0 75.5 3.5 0.017 0.471 0.468
26 73.0 75.5 -2.5 0.017 -0.334 -0.332
27 72.0 83.6 -11.6 0.040 -1.576 -1.597
28 81.0 76.5 4.5 0.017 0.603 0.600
29 86.0 77.5 8.5 0.018 1.139 1.142
30 76.0 85.6 -9.6 0.055 -1.318 -1.327
31 83.0 80.6 2.4 0.025 0.329 0.326
32 83.0 77.5 5.5 0.018 0.736 0.733
33 86.0 84.6 1.4 0.047 0.189 0.187
34 71.0 72.4 -1.4 0.023 -0.195 -0.193
35 83.0 82.6 0.4 0.035 0.056 0.055
36 79.0 82.6 -3.6 0.035 -0.486 -0.482
37 68.0 71.4 -3.4 0.027 -0.463 -0.460
38 90.0 82.6 7.4 0.035 1.004 1.004
39 89.0 83.6 5.4 0.040 0.733 0.730
40 83.0 72.4 10.6 0.023 1.420 1.433
41 81.0 85.6 -4.6 0.055 -0.633 -0.630
42 79.0 75.5 3.5 0.017 0.471 0.468
43 58.0 57.2 0.8 0.167 0.110 0.109
44 77.0 72.4 4.6 0.023 0.612 0.609
45 85.0 80.6 4.4 0.025 0.598 0.595
46 67.0 71.4 -4.4 0.027 -0.598 -0.595
47 70.0 76.5 -6.5 0.017 -0.873 -0.871
48 79.0 81.6 -2.6 0.030 -0.348 -0.345
49 59.0 63.3 -4.3 0.086 -0.602 -0.599
50 74.0 78.5 -4.5 0.020 -0.609 -0.606
51 86.0 78.5 7.5 0.020 1.003 1.003
52 85.0 81.6 3.4 0.030 0.463 0.459
53 79.0 74.5 4.5 0.018 0.607 0.604
54 81.0 77.5 3.5 0.018 0.467 0.464
55 31.0 57.2 -26.2 0.167 -3.826 -4.411
56 82.0 79.5 2.5 0.022 0.330 0.327
57 70.0 67.4 2.6 0.050 0.357 0.355
58 69.0 72.4 -3.4 0.023 -0.464 -0.461

Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (0.7012, 1.3263). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 56, t.025 = T.INV(.025,56) = ±2.003, tcalc = 6.498 > 2.003.
Therefore reject H0 and conclude that the slope is significantly different from zero.


c. The p-value = .0000. This means that there is very little chance of observing this
sample if there is no correlation between midterm scores and final scores.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .43. 43% of the variation in final exam scores can be explained by midterm exam
scores. Closer to 100% is always desirable but this is still decent considering we are
using only one predictor variable.
b. Fcalc = 42.22 and its p-value = .0000. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Students 4 and 5 have unusual residual values with residuals that are negative. The model
overestimated the final exam score for these two students. Student 55 had a studentized deleted
residual beyond −4 (−4.411), which means the model significantly overestimated the final exam score.
This observation would be considered an outlier. Low exam scores are typically due to
students who do not study or attend class. On occasion the material is very difficult
which can also result in a low exam score.
Learning Objective: 12-11

12.51 a. The normplot of residuals shows a fairly straight line on the diagonal. The histogram of
residuals shows a skewed left distribution which is fairly common for exam scores.
Most students do well on exams with a few students in the low range.

b. The residuals do not show strong departure from normality. The assumption of residual
normality is valid.
Learning Objective: 12-10

12.52

The residual plot does not show signs of heteroscedasticity although the low final exam
score can be seen in the lower part of the graph.
Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 75 and 95 are shown in the
table below.

Predicted values for: Final Exam Score


95% Confidence Intervals 95% Prediction Intervals
Midterm Exam Score Predicted lower upper lower upper
75 74.477 72.433 76.521 59.281 89.673
95 94.753 88.688 100.818 78.520 110.986

Learning Objective: 12-9

12.55 Students 43, 49, and 55 have high leverage. These three students had very low midterm
scores.
Learning Objective: 12-11

DATA SET C

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. Hospital data bases will contain this type of information.
Learning Objective: 02-7

12.42 Answers will vary. Sample size is sufficient for educational purposes but most studies
conducted by a hospital will use larger sample sizes because the data is available.
Learning Objective: 02-9

12.43 A positive slope would be logical. The estimated length of stay of a patient should be based
on their condition upon admission and should be related to their actual length of stay.
However, estimating a patient's time in the hospital should not cause the actual stay to be
longer or shorter.
Learning Objective: 12-2

12.44

There is a strong positive relationship between ELOS and ALOS.


Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase in ELOS of one month increases ALOS by 1.03 months. No, the intercept does
not have meaning because by definition admission to a hospital implies staying in the
hospital.
Learning Objective: 12-2

12.47 MegaStat output is provided.


Regression Analysis

r² 0.625 n 16
r 0.791 k 1
Std. Error 2.282 Dep. Var. ALOS

ANOVA table
Source SS df MS F p-value
Regression 121.5515 1 121.5515 23.35 .0003
Residual 72.8860 14 5.2061
Total 194.4375 15

Regression output confidence interval


variables coefficients std. error t (df=14) p-value 95% lower 95% upper
Intercept 0.5147 1.5903 0.324 .7510 -2.8961 3.9255
ELOS 1.0293 0.2130 4.832 .0003 0.5724 1.4862

Studentized
Studentized Deleted
Observation ALOS Predicted Residual Leverage Residual Residual
1 10.00 11.32 -1.32 0.171 -0.636 -0.622
2 2.00 5.15 -3.15 0.116 -1.466 -1.536
3 4.00 8.23 -4.23 0.065 -1.919 -2.154
4 11.00 12.87 -1.87 0.283 -0.966 -0.963
5 11.00 8.23 2.77 0.065 1.254 1.282
6 11.00 9.78 1.22 0.098 0.564 0.550
7 6.50 6.69 -0.19 0.071 -0.087 -0.083
8 5.00 5.66 -0.66 0.096 -0.305 -0.295
9 8.00 6.69 1.31 0.071 0.595 0.581
10 16.00 12.87 3.13 0.283 1.622 1.735
11 6.50 7.72 -1.22 0.063 -0.552 -0.538
12 6.00 5.15 0.85 0.116 0.398 0.385
13 3.50 4.12 -0.62 0.167 -0.296 -0.287
14 10.00 6.69 3.31 0.071 1.505 1.584
15 7.00 8.23 -1.23 0.065 -0.559 -0.545
16 5.50 3.60 1.90 0.200 0.930 0.925

12.48 a. 95% confidence interval for β1: (0.5724, 1.4862). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 14, t.025 =T.INV(.025,14) = ±2.145, tcalc = 4.832 > 2.145.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0003. This means that 3 times out of 10,000 we will see a sample this
extreme if the slope is actually equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
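
The arithmetic behind 12.48 for Data Set C can be checked directly from the regression output. The sketch below is only an illustration: the function name slope_test is hypothetical, and the critical value 2.145 is taken from part (b) rather than computed.

```python
def slope_test(b1, se_b1, t_crit):
    """Two-tailed test of H0: beta1 = 0 and the matching confidence interval.

    t_crit is supplied by the caller (for example from T.INV or a t table).
    """
    t_calc = b1 / se_b1                 # test statistic for the slope
    margin = t_crit * se_b1             # half-width of the confidence interval
    ci = (b1 - margin, b1 + margin)
    reject = abs(t_calc) > t_crit       # two-tailed decision rule
    return t_calc, ci, reject


# Data Set C: slope and std. error from the MegaStat output, t.025 for d.f. = 14
t_calc, ci, reject = slope_test(b1=1.0293, se_b1=0.2130, t_crit=2.145)
print(round(t_calc, 3), [round(v, 4) for v in ci], reject)
```

The interval and the t test always agree: the 95% interval excludes zero exactly when |tcalc| exceeds the two-tailed critical value at α = .05.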

12.49 a. R² = .625. 62.5% of the variation in actual length of stay of a patient can be explained
by their expected length of stay. This shows a moderately strong relationship.
b. Fcalc = 23.35 and its p-value = .0003. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8
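
The fit statistics quoted in 12.49 follow from the sums of squares in the ANOVA table. As a minimal sketch (the helper name anova_summary is hypothetical), the Data Set C values can be recomputed like this:

```python
import math


def anova_summary(ss_regression, ss_residual, n, k=1):
    """Recompute R-squared, F, and the standard error from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_residual = n - k - 1
    ms_regression = ss_regression / k
    ms_residual = ss_residual / df_residual
    return {
        "r_squared": ss_regression / ss_total,
        "f_calc": ms_regression / ms_residual,
        "std_error": math.sqrt(ms_residual),
    }


# Data Set C: SS values from the ANOVA table, n = 16 observations, one predictor
summary = anova_summary(ss_regression=121.5515, ss_residual=72.8860, n=16)
print({name: round(value, 3) for name, value in summary.items()})
```

The results should agree with the reported r² (0.625), F (23.35), and Std. Error (2.282) up to rounding.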

12.50 There are no unusual residuals.


Learning Objective: 12-11

12.51 a. The histogram of residuals does not show a clear normal distribution; however, the
normal plot of residuals has a modest straight line.

b. While the residuals show some departure from normality, this is a very small sample so
it is difficult to see clear normality. There are no strong outliers in the data set so at this
point the slight departure from normality is not too troublesome.
Learning Objective: 12-10

12.52

The residual plot does not show signs of heteroscedasticity.


Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10

12.54 Answers will vary. Prediction and confidence intervals for x = 4 and 9 days are shown.
Predicted values for: ALOS
95% Confidence Intervals 95% Prediction Intervals
ELOS Predicted lower upper lower upper
4 4.6318 2.8052 6.4584 -0.5917 9.8554
9 9.7782 8.2426 11.3138 4.6492 14.9072

12.55 Observations 4 and 10 have high leverage. These patients had unusually long estimated
lengths of stay.
Learning Objective: 12-9
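
The solution does not state which cutoff was used to flag high leverage. The sketch below assumes the common rule of thumb that an observation has high leverage when its leverage exceeds 2(k + 1)/n, which is 4/n in simple regression. The function name high_leverage is hypothetical. The leverage values are copied from the Data Set C output.

```python
def high_leverage(leverages, k=1):
    """Return 1-based observation numbers whose leverage exceeds 2(k + 1)/n.

    The 2(k + 1)/n cutoff is an assumed rule of thumb, not taken from the output.
    """
    n = len(leverages)
    cutoff = 2 * (k + 1) / n
    return [i for i, h in enumerate(leverages, start=1) if h > cutoff]


# Leverage column for Data Set C (n = 16, so the assumed cutoff is 4/16 = 0.25)
leverage_c = [0.171, 0.116, 0.065, 0.283, 0.065, 0.098, 0.071, 0.096,
              0.071, 0.283, 0.063, 0.116, 0.167, 0.071, 0.065, 0.200]
flagged = high_leverage(leverage_c)
print(flagged)
```

Under this assumption the flagged observations match the two named in 12.55.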

DATA SET D

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. This type of information could probably be collected from the airplane
manufacturer.
Learning Objective: 02-7

12.42 Answers will vary. Sample size of 52 should be sufficient to observe important
relationships.
Learning Objective: 02-9

12.43 A positive slope would be logical. Cruise speed should be related to engine size, and yes, a
cause and effect relationship would make sense.
Learning Objective: 12-2

12.44

There is a strong positive relationship between TotalHP and CruiseSpeed.


Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase of one unit of horsepower increases cruise speed by .1931 mph. No, the
intercept does not have meaning because an engine cannot have zero horsepower.
Learning Objective: 12-4

12.47 MegaStat output is provided.

Regression Analysis

r² 0.684 n 52
r 0.827 k 1
Std. Error 20.596 Dep. Var. Cruise

ANOVA table
Source SS df MS F p-value
Regression 45,896.1706 1 45,896.1706 108.20 4.20E-14
Residual 21,209.2717 50 424.1854
Total 67,105.4423 51

Regression output confidence interval


variables coefficients std. error t (df=50) p-value 95% lower 95% upper
Intercept 103.0870 6.2426 16.513 6.92E-22 90.5483 115.6256
TotalHP 0.1931 0.0186 10.402 4.20E-14 0.1558 0.2304

Studentized
Studentized Deleted
Observation Cruise Predicted Residual Leverage Residual Residual
1 100.0 125.5 -25.5 0.046 -1.267 -1.275
2 200.0 219.0 -19.0 0.093 -0.967 -0.966
3 241.0 228.6 12.4 0.119 0.640 0.636
4 199.0 213.2 -14.2 0.079 -0.717 -0.714
5 174.0 161.0 13.0 0.019 0.636 0.632
6 164.0 172.6 -8.6 0.022 -0.423 -0.420
7 141.0 172.6 -31.6 0.022 -1.552 -1.575
8 161.0 161.0 -0.0 0.019 -0.001 -0.001
9 107.0 124.3 -17.3 0.048 -0.863 -0.860
10 104.0 131.1 -27.1 0.038 -1.341 -1.353
11 122.0 134.0 -12.0 0.035 -0.593 -0.589
12 129.0 137.9 -8.9 0.031 -0.437 -0.433
13 144.0 147.5 -3.5 0.023 -0.172 -0.171
14 194.0 213.2 -19.2 0.079 -0.970 -0.969
15 170.0 184.2 -14.2 0.031 -0.701 -0.697
16 223.0 222.8 0.2 0.103 0.009 0.009
17 234.0 247.9 -13.9 0.185 -0.749 -0.746
18 124.0 137.9 -13.9 0.031 -0.683 -0.679
19 186.0 158.1 27.9 0.019 1.366 1.379
20 190.0 158.1 31.9 0.019 1.563 1.586
21 190.0 199.7 -9.7 0.052 -0.481 -0.478
22 159.0 148.5 10.5 0.023 0.517 0.513
23 160.0 148.5 11.5 0.023 0.566 0.562
24 148.0 163.0 -15.0 0.019 -0.733 -0.730
25 143.0 161.0 -18.0 0.019 -0.884 -0.882
26 160.0 141.7 18.3 0.027 0.900 0.898
27 140.0 127.2 12.8 0.044 0.634 0.630
28 235.0 170.7 64.3 0.021 3.157 3.492
29 191.0 163.0 28.0 0.019 1.375 1.388
30 132.0 127.2 4.8 0.044 0.237 0.235
31 115.0 137.9 -22.9 0.031 -1.127 -1.130
32 170.0 143.6 26.4 0.026 1.296 1.305
33 175.0 150.2 24.8 0.022 1.217 1.223
34 156.0 141.7 14.3 0.027 0.703 0.700
35 188.0 157.2 30.8 0.020 1.512 1.532
36 128.0 134.0 -6.0 0.035 -0.296 -0.293
37 107.0 127.2 -20.2 0.044 -1.004 -1.005
38 148.0 161.0 -13.0 0.019 -0.639 -0.635
39 129.0 137.9 -8.9 0.031 -0.437 -0.433
40 191.0 199.7 -8.7 0.052 -0.432 -0.428
41 147.0 148.5 -1.5 0.023 -0.072 -0.072
42 213.0 170.7 42.3 0.021 2.077 2.151
43 186.0 161.0 25.0 0.019 1.224 1.231
44 148.0 161.0 -13.0 0.019 -0.639 -0.635
45 180.0 188.1 -8.1 0.035 -0.399 -0.395
46 186.0 188.1 -2.1 0.035 -0.102 -0.101
47 100.0 132.1 -32.1 0.037 -1.586 -1.611
48 176.0 161.0 15.0 0.019 0.734 0.731
49 151.0 153.3 -2.3 0.020 -0.113 -0.112
50 98.0 118.7 -20.7 0.058 -1.037 -1.038
51 163.0 151.4 11.6 0.021 0.571 0.567
52 143.0 137.9 5.1 0.031 0.254 0.252

Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (0.1558, 0.2304). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 50, t.025 =T.INV(.025,50) =±2.009, tcalc = 10.402 > 2.009.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 4.20 × 10^-14. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .684. 68.4% of the variation in cruise speed can be explained by engine
horsepower. This shows a moderately strong relationship.
b. Fcalc = 108.20 and its p-value = 4.20 × 10^-14. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Observation 28 is an outlier residual. This corresponds to the ExtraExtra400 model.
Observation 42 is an unusual residual. This is the Piper Malibu Mirage model.
Learning Objective: 12-11
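
The labels in 12.50 are consistent with a rule of thumb that calls a studentized residual "unusual" beyond ±2 and an "outlier" beyond ±3. The cutoffs are an assumption here, since the solution does not state them, and the function name classify_residual is hypothetical. A minimal sketch using four studentized residuals from the Data Set D output:

```python
def classify_residual(studentized, unusual_cutoff=2.0, outlier_cutoff=3.0):
    """Label a studentized residual using assumed cutoffs of 2 and 3."""
    size = abs(studentized)
    if size >= outlier_cutoff:
        return "outlier"
    if size >= unusual_cutoff:
        return "unusual"
    return "ordinary"


# Observation number -> studentized residual, copied from the Data Set D table
sample = {20: 1.563, 28: 3.157, 42: 2.077, 47: -1.586}
labels = {obs: classify_residual(r) for obs, r in sample.items()}
print(labels)
```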

12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a very
slight right skewed distribution.

b. The residuals do not show significant departure from normality.


Learning Objective: 12-10

12.52

The residual plot does not show obvious signs of heteroscedasticity. We can see a possible
high outlier on the graph.
Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 150 and 250 are shown
below.

Predicted values for: Cruise
95% Confidence Intervals 95% Prediction Intervals
TotalHP Predicted lower upper lower upper
150 132.057 124.072 140.043 89.926 174.189
250 151.371 145.350 157.391 109.567 193.174

Learning Objective: 12-9
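
The predicted values in 12.54 come from substituting x into the fitted line, and the two interval types are linked: in simple regression the confidence interval half-width is t·s·√h and the prediction interval half-width is t·s·√(1 + h), so PI² = CI² + (t·s)². The sketch below checks both for x = 150 using the rounded Data Set D output. The function names are hypothetical, and small differences from the table are expected because the inputs are rounded.

```python
import math


def predict(b0, b1, x):
    """Point prediction from the fitted line."""
    return b0 + b1 * x


def prediction_half_width(ci_half_width, t_crit, std_error):
    """Prediction-interval half-width implied by the CI half-width at the same x."""
    return math.sqrt(ci_half_width ** 2 + (t_crit * std_error) ** 2)


# Data Set D: intercept, slope, t.025 (d.f. = 50), and Std. Error from the output
y_hat = predict(b0=103.0870, b1=0.1931, x=150)
ci_half = (140.043 - 124.072) / 2          # from the 95% confidence interval at x = 150
pi_half = prediction_half_width(ci_half, t_crit=2.009, std_error=20.596)
print(round(y_hat, 2), round(pi_half, 2))
```

The prediction interval is always wider than the confidence interval at the same x because it adds the variability of an individual observation to the uncertainty in the estimated mean.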

12.55 Observations 2, 3, 4, 14, 16, and 17 have high leverage.
Learning Objective: 12-11

DATA SET E

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. This type of information would probably be collected from a researcher
who has obtained different types of processors.
Learning Objective: 02-7

12.42 Answers will vary. Sample size of 14 is fairly small and can open one up to Type II error.
Learning Objective: 02-9

12.43 A positive slope would be logical. Microprocessor speed should be related to Power
dissipation.
Learning Objective: 12-2

12.44

There is a strong positive relationship between microprocessor speed and power
dissipation.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase of one unit of microprocessor speed increases power dissipation by 0.032
watts. No, the intercept does not have meaning because a speed of zero is not logical.
Learning Objective: 12-4

12.47 MegaStat output is provided.

Regression Analysis

r² 0.925 n 14
r 0.962 k 1
Std. Error 13.109 Dep. Var. Power

ANOVA table
Source SS df MS F p-value
Regression 25,561.9157 1 25,561.9157 148.75 4.03E-08
Residual 2,062.0843 12 171.8404
Total 27,624.0000 13

Regression output confidence interval


variables coefficients std. error t (df=12) p-value 95% lower 95% upper
Intercept 15.7299 5.6634 2.777 .0167 3.3905 28.0693
Speed 0.0319 0.0026 12.196 4.03E-08 0.0262 0.0375

Studentized
Studentized Deleted
Observation Power Predicted Residual Leverage Residual Residual
1 3.0 16.4 -13.4 0.184 -1.129 -1.143
2 10.0 18.9 -8.9 0.174 -0.748 -0.734
3 35.0 23.2 11.8 0.157 0.985 0.983
4 20.0 25.3 -5.3 0.150 -0.437 -0.422
5 42.0 34.8 7.2 0.120 0.582 0.565
6 50.0 34.8 15.2 0.120 1.233 1.263
7 51.0 57.1 -6.1 0.078 -0.488 -0.472
8 73.0 82.6 -9.6 0.078 -0.764 -0.750
9 115.0 136.8 -21.8 0.246 -1.912 -2.196
10 130.0 117.7 12.3 0.160 1.027 1.030
11 95.0 89.0 6.0 0.086 0.479 0.463
12 136.0 117.7 18.3 0.160 1.527 1.629
13 95.0 108.1 -13.1 0.128 -1.071 -1.078
14 125.0 117.7 7.3 0.160 0.611 0.594

Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (0.0262, 0.0375). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 12, t.025 =T.INV(.025,12) = ±2.179, tcalc = 12.196 > 2.179.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = .0000. This means that it is highly unlikely to obtain a slope estimate of
this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .925. 92.5% of the variation in power dissipation can be explained by
microprocessor speed. This shows a strong relationship.
b. Fcalc = 148.75 and its p-value = .0000. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 There are no unusual or outlier standardized residuals.


Learning Objective: 12-11

12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a left
skewed distribution.

b. The sample size is small so it is difficult to determine normality but there is no obvious
evidence to assume non-normality.
Learning Objective: 12-10

12.52

The residual plot does not show obvious signs of heteroscedasticity.


Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 1000 and 2500 are shown
below.
Predicted values for:
Power
95% Confidence Intervals 95% Prediction Intervals
Speed Predicted lower upper lower upper
1,000 47.583 38.962 56.203 17.748 77.417
2,500 95.362 86.485 104.238 65.452 125.271

Learning Objective: 12-9

12.55 There are no high leverage observations.


Learning Objective: 12-11

DATA SET F

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. This could probably be collected through surveys.
Learning Objective: 02-7

12.42 Answers will vary. Sample size of 10 is fairly small and can open one up to Type II error.
But this information could be difficult to obtain.
Learning Objective: 02-9

12.43 A positive slope would be logical. Increased website hits should be associated with
increased revenue.
Learning Objective: 12-2

12.44

There is a weak positive relationship between website hits and restaurant revenue.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase of one website visit increases weekly revenue by $1.67. The intercept could be
interpreted as the weekly revenue with no website hits. However, this particular
sample had no restaurants with an x near zero, so it would be dangerous to
extrapolate.
Learning Objective: 12-4

12.47 MegaStat output is provided.

Regression Analysis

r² 0.128 n 10
r 0.357 k 1
Std. Error 1078.961 Dep. Var. Restaurant Revenue

ANOVA table
Source SS df MS F p-value
Regression 1,361,311.7375 1 1,361,311.7375 1.17 .3111
Residual 9,313,262.2625 8 1,164,157.7828
Total 10,674,574.0000 9

Regression output confidence interval


variables coefficients std. error t (df=8) p-value 95% lower 95% upper
Intercept 10,382.3728 2,353.0592 4.412 .0022 4,956.2085 15,808.5371
Website Hits 1.6695 1.5439 1.081 .3111 -1.8907 5.2297

Studentized
Studentized Deleted
Observation Restaurant Revenue Predicted Residual Leverage Residual Residual
1 12,113.0 12,407.5 -294.5 0.278 -0.321 -0.302
2 11,409.0 12,869.9 -1,460.9 0.101 -1.428 -1.547
3 14,579.0 12,661.3 1,917.7 0.142 1.919 2.443
4 11,605.0 12,811.5 -1,206.5 0.106 -1.182 -1.218
5 12,308.0 12,501.0 -193.0 0.217 -0.202 -0.190
6 12,320.0 13,107.0 -787.0 0.131 -0.783 -0.762
7 13,225.0 12,591.1 633.9 0.170 0.645 0.620
8 13,652.0 13,496.0 156.0 0.361 0.181 0.170
9 13,893.0 13,036.9 856.1 0.114 0.843 0.826
10 13,896.0 13,517.7 378.3 0.380 0.445 0.422

Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (−1.8907, 5.2297). This interval does contain zero which
means that we are not confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 8, t.025 =T.INV(.025,8) = ±2.306, tcalc = 1.081 falls
between the critical values. Therefore fail to reject H0 and conclude that the slope is not
significantly different from zero.
c. The p-value = .3111. This means that a slope estimate this far from zero would not be unusual
even if the true slope is equal to zero.
d. No, this sample does not support our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .128. Because the slope is not significant, R² does not have meaning.
b. Fcalc = 1.17 and its p-value = .3111. The F statistic is not significant which means the
linear model does not provide significant fit.
c. This model is not fit for practical use.
Learning Objective: 12-8
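
In simple regression the F test and the two-tailed t test for the slope are the same test: Fcalc equals tcalc squared, which is why both report the same p-value (.3111 here). The sketch below checks this against the reported values for Data Sets C through F. The tolerance of 0.02 is an arbitrary allowance for rounding in the printed output, and the function name is hypothetical.

```python
def f_matches_t_squared(t_calc, f_calc, tol=0.02):
    """Check that F equals t squared, allowing for rounding in the printed output."""
    return abs(t_calc ** 2 - f_calc) <= tol


# (data set, slope t statistic, F statistic) as printed in the regression output
reported = [
    ("C", 4.832, 23.35),
    ("D", 10.402, 108.20),
    ("E", 12.196, 148.75),
    ("F", 1.081, 1.17),
]
checks = {name: f_matches_t_squared(t, f) for name, t, f in reported}
print(checks)
```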

12.50 Restaurant 3 shows an unusual residual. It appears the model is underestimating the
revenue.
Learning Objective: 12-11

12.51 a. The residual normplot is somewhat linear. The histogram of residuals shows a fairly
uniform distribution.

b. The sample size is small so it is difficult to determine normality from the histogram.
Based on the normplot there is no obvious evidence to assume non-normality.
Learning Objective: 12-10

12.52

The residual plot does not show obvious signs of heteroscedasticity.


Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is not a time series data set.
Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 1200 and 1500 are shown
below. Keep in mind that these intervals are unreliable because there is no significant
relationship between website hits and revenue.
Predicted values for: Restaurant Revenue
95% Confidence Intervals 95% Prediction Intervals
Website Hits Predicted lower upper lower upper
1,200 12,385.790 11,036.168 13,735.411 9,555.230 15,216.349
1,500 12,886.644 12,099.326 13,673.962 10,276.958 15,496.330

Learning Objective: 12-9

12.55 There are no high leverage observations.


Learning Objective: 12-11

DATA SET G

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. This information can be gathered from manufacturers’ specification
information which is listed on their websites. The manufacturers use sophisticated
sampling techniques for estimating these values.
Learning Objective: 02-7

12.42 Answers will vary. Sample size of 43 is reasonable.


Learning Objective: 02-9

12.43 A negative slope would be logical. It would make sense that the heavier a car is the lower
the MPG.
Learning Objective: 12-2

12.44

There is a fairly strong negative relationship between the Weight and CityMPG.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase in the weight of a car by one pound reduces its city mpg by 0.0046 mpg. No,
the intercept does not make sense.
Learning Objective: 12-2

12.47 MegaStat output is provided.

Regression Analysis

r² 0.681 n 43
r -0.825 k 1
Std. Error 2.499 Dep. Var. City MPG

ANOVA table
Source SS df MS F p-value
Regression 546.43787944 1 546.43787944 87.51 1.00E-11
Residual 256.02723683 41 6.24456675
Total 802.46511628 42

Regression output confidence interval


variables coefficients std. error t (df=41) p-value 95% lower 95% upper
Intercept 36.6337 1.8793 19.493 2.39E-22 32.8383 40.4291
Weight -0.0046 0.00048708 -9.354 1.00E-11 -0.0055 -0.0036

Studentized
Studentized Deleted
Observation City MPG Predicted Residual Leverage Residual Residual
1 20.0 20.9 -0.9 0.027 -0.371 -0.367
2 23.0 21.5 1.5 0.031 0.607 0.602
3 19.0 21.2 -2.2 0.029 -0.888 -0.886
4 20.0 21.4 -1.4 0.030 -0.557 -0.552
5 18.0 17.4 0.6 0.031 0.260 0.257
6 18.0 18.2 -0.2 0.026 -0.073 -0.072
7 19.0 21.8 -2.8 0.034 -1.141 -1.145
8 14.0 14.1 -0.1 0.074 -0.062 -0.061
9 15.0 15.4 -0.4 0.053 -0.165 -0.163
10 17.0 15.4 1.6 0.053 0.657 0.653
11 18.0 17.5 0.5 0.030 0.223 0.220
12 13.0 12.5 0.5 0.111 0.219 0.216
13 13.0 9.8 3.2 0.194 1.448 1.469
14 26.0 24.1 1.9 0.063 0.803 0.799
15 15.0 15.4 -0.4 0.053 -0.165 -0.163
16 21.0 21.2 -0.2 0.029 -0.076 -0.075
17 18.0 17.0 1.0 0.034 0.418 0.414
18 24.0 23.5 0.5 0.054 0.201 0.199
19 16.0 17.1 -1.1 0.033 -0.433 -0.429
20 15.0 14.0 1.0 0.077 0.412 0.408
21 18.0 19.3 -1.3 0.023 -0.525 -0.520
22 25.0 26.2 -1.2 0.107 -0.498 -0.494
23 17.0 20.0 -3.0 0.024 -1.235 -1.243
24 18.0 21.2 -3.2 0.029 -1.295 -1.306
25 13.0 13.9 -0.9 0.080 -0.355 -0.352
26 18.0 18.7 -0.7 0.024 -0.304 -0.300
27 19.0 21.3 -2.3 0.030 -0.954 -0.953
28 17.0 17.5 -0.5 0.030 -0.211 -0.209
29 20.0 21.4 -1.4 0.031 -0.575 -0.571
30 20.0 21.7 -1.7 0.032 -0.678 -0.673
31 20.0 21.4 -1.4 0.030 -0.566 -0.561
32 15.0 17.2 -2.2 0.032 -0.886 -0.884
33 16.0 17.0 -1.0 0.034 -0.396 -0.392
34 25.0 22.5 2.5 0.041 1.009 1.009
35 28.0 23.9 4.1 0.059 1.711 1.754
36 24.0 23.6 0.4 0.056 0.154 0.152
37 21.0 20.3 0.7 0.025 0.266 0.263
38 17.0 20.3 -3.3 0.025 -1.328 -1.340
39 23.0 24.9 -1.9 0.079 -0.802 -0.799
40 26.0 23.0 3.0 0.047 1.216 1.224
41 19.0 17.9 1.1 0.028 0.462 0.458
42 34.0 22.8 11.2 0.044 4.600 6.531
43 20.0 19.8 0.2 0.024 0.073 0.072
Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (−0.0055, −0.0036). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 41, t.025 =T.INV(.025,41) = ±2.020, tcalc = −9.354 < −2.020.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 1.00 × 10^-11. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .681. 68.1% of the variation in the City MPG of a vehicle can be explained by its
weight. This shows a fairly strong relationship.
b. Fcalc = 87.51 and its p-value = 1.00 × 10^-11. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Observation 42 is an outlier residual. This happens to be the Volkswagen Jetta. The model
underestimated the true MPG by quite a margin. Perhaps this was the diesel version of
that model.
Learning Objective: 12-11

12.51 a. The normplot of residuals is not perfect. The high outlier can be seen and the line sags in
the middle. The histogram of residuals shows a somewhat bell-shaped distribution with
one high outlier.

b. The residuals do not show significant departure from normality in spite of the one high
outlier.
Learning Objective: 12-11

12.52

The residual plot does not show obvious signs of heteroscedasticity. It is possible to see the
Jetta outlier residual.
Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is cross sectional data.
Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 3000 and 4000 are shown
below.
Predicted values for: City MPG
95% Confidence Intervals 95% Prediction Intervals
Weight Predicted lower upper lower upper
3,000 22.965 21.879 24.050 17.803 28.127
4,000 18.408 17.608 19.208 13.299 23.518
Learning Objective: 12-9

12.55 Observations 12, 13, and 22 have high leverage. These correspond to the Dodge Ram 1500,
the Ford Expedition, and the Kia Rio.
Learning Objective: 12-11

DATA SET H

12.40 Cross-sectional.
Learning Objective: 02-3

12.41 Answers will vary. This information can be gathered from food labels or manufacturers’
websites.
Learning Objective: 02-7

12.42 Answers will vary. Sample size is reasonable although a larger sample would be better.
Learning Objective: 02-9

12.43 A positive slope would be logical. It would make sense that the more fat calories the sauce
has the more calories are in the sauce in general.
Learning Objective: 12-2

12.44

There is a fairly strong positive relationship between the FatCalories/Gram and
Calories/Gram.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase in the fat calories per gram by 1 increases total calories per gram by 2.2179.
The intercept might make sense for sauce labeled “fat free.”
Learning Objective: 12-2

12.47 MegaStat output is provided.

Regression Analysis

r² 0.843 n 20
r 0.918 k 1
Std. Error 0.102 Dep. Var. Calories Per Gram

ANOVA table
Source SS df MS F p-value
Regression 1.0076 1 1.0076 96.49 1.17E-08
Residual 0.1880 18 0.0104
Total 1.1955 19

Regression output confidence interval


variables coefficients std. error t (df=18) p-value 95% lower 95% upper
Intercept 0.3054 0.0460 6.635 3.16E-06 0.2087 0.4021
Fat Calories Per Gram 2.2179 0.2258 9.823 1.17E-08 1.7436 2.6923

Studentized
Studentized Deleted
Observation Calories Per Gram Predicted Residual Leverage Residual Residual
1 0.640 0.749 -0.109 0.053 -1.096 -1.103
2 0.560 0.572 -0.012 0.066 -0.117 -0.114
3 0.400 0.483 -0.083 0.096 -0.853 -0.846
4 0.480 0.394 0.086 0.142 0.907 0.902
5 0.640 0.572 0.068 0.066 0.693 0.682
6 0.400 0.305 0.095 0.203 1.037 1.039
7 0.560 0.483 0.077 0.096 0.794 0.785
8 0.550 0.483 0.067 0.096 0.691 0.681
9 0.480 0.572 -0.092 0.066 -0.927 -0.923
10 0.480 0.572 -0.092 0.066 -0.927 -0.923
11 1.250 1.148 0.102 0.251 1.151 1.162
12 1.000 1.037 -0.037 0.164 -0.400 -0.390
13 1.000 0.949 0.051 0.112 0.534 0.523
14 1.170 1.037 0.133 0.164 1.420 1.465
15 0.920 0.860 0.060 0.076 0.612 0.601
16 0.670 0.860 -0.190 0.076 -1.933 -2.111
17 0.860 0.749 0.111 0.053 1.116 1.124
18 0.700 0.727 -0.027 0.051 -0.270 -0.262
19 0.560 0.749 -0.189 0.053 -1.900 -2.066
20 0.640 0.660 -0.020 0.051 -0.204 -0.198
Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (1.7436, 2.6923). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 18, t.025 =T.INV(.025,18) = ±2.101, tcalc = 9.823 > 2.101.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 1.17 × 10^-8. This means that it is highly unlikely to obtain a slope estimate
of this value if the true slope is equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .843. 84.3% of the variation in the total calories/gram of a pasta sauce can be
explained by the number of fat calories/gram. This shows a fairly strong relationship.
b. Fcalc = 96.49 and its p-value = 1.17 × 10^-8. The F statistic is significant which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Observation 16 is an unusual residual. This is the Ragu Old World Style with meat sauce.
Learning Objective: 12-11

12.51 a. The normal probability plot of residuals does not show a clear normal pattern.

b. The residuals show some departure from normality but this is a small sample size and a
larger sample might help.
Learning Objective: 12-10

12.52

The residual plot does not show obvious signs of heteroscedasticity.


Learning Objective: 12-10

12.53 An autocorrelation test is not appropriate because this is cross-sectional data.


Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 0.10 and 0.20 are shown
below.

Predicted values for: Calories Per Gram


95% Confidence Intervals 95% Prediction Intervals
Fat Calories Per Gram Predicted lower upper lower upper
0.10 0.52722 0.46690 0.58754 0.30422 0.75021
0.20 0.74901 0.69978 0.79824 0.52876 0.96927
Learning Objective: 12-9

12.55 Observations 6 and 11 show high leverage. These correspond to Healthy Choice Traditional
and Prego Hearty Meat Peperoni.
Learning Objective: 12-11

DATA SET I

12.40 Time-series.
Learning Objective: 02-3

12.41 Answers will vary. This information can be gathered by taking a random household and
observing their energy usage.
Learning Objective: 02-7

12.42 Answers will vary. Two years of data is good but when looking for seasonal influences
more years would be better.
Learning Objective: 02-9

12.43 A negative slope would be logical. It would make sense that the lower the temperature the
more energy a household would use.
Learning Objective: 12-2

12.44

There is a fairly strong negative relationship between the Average Daily Temperature and
Energy Consumption.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 A 1° increase in average daily temperature decreases predicted monthly energy use by 9.661 kWh.
Yes, the intercept does make sense. There can be a month with an average temperature
of 0°.
Learning Objective: 12-2

12.47 MegaStat output is provided.


Regression Analysis

r² 0.766 n 24
r -0.875 k 1
Std. Error 84.951 Dep. Var. Electric Consumption (KWH)

ANOVA table
Source SS df MS F p-value
Regression 520,420.3570 1 520,420.3570 72.11 2.16E-08
Residual 158,766.1430 22 7,216.6429
Total 679,186.5000 23

Regression output confidence interval


variables coefficients std. error t (df=22) p-value 95% lower 95% upper
Intercept 1,165.7901 62.9671 18.514 6.64E-15 1,035.2043 1,296.3760
Avg Daily Temp (deg F) -9.6609 1.1376 -8.492 2.16E-08 -12.0202 -7.3016

Observation Energy Use (KWH) Predicted Residual Leverage Studentized Residual Studentized Deleted Residual
1 436.0 566.8 -130.8 0.056 -1.585 -1.645
2 464.0 479.9 -15.9 0.098 -0.197 -0.192
3 446.0 431.6 14.4 0.135 0.183 0.179
4 391.0 499.2 -108.2 0.086 -1.332 -1.358
5 444.0 557.2 -113.2 0.059 -1.373 -1.403
6 608.0 663.4 -55.4 0.042 -0.667 -0.658
7 885.0 808.3 76.7 0.089 0.945 0.943
8 821.0 779.4 41.6 0.073 0.509 0.500
9 830.0 789.0 41.0 0.078 0.502 0.494
10 750.0 827.7 -77.7 0.101 -0.964 -0.963
11 617.0 731.0 -114.0 0.054 -1.380 -1.411
12 598.0 644.1 -46.1 0.042 -0.554 -0.545
13 597.0 499.2 97.8 0.086 1.205 1.218
14 528.0 470.2 57.8 0.105 0.719 0.711
15 477.0 402.6 74.4 0.161 0.956 0.954
16 562.0 489.5 72.5 0.092 0.895 0.891
17 658.0 586.1 71.9 0.050 0.868 0.863
18 690.0 702.1 -12.1 0.047 -0.145 -0.142
19 862.0 798.7 63.3 0.083 0.778 0.771
20 1,008.0 837.3 170.7 0.108 2.127 2.332
21 840.0 924.3 -84.3 0.184 -1.098 -1.104
22 867.0 798.7 68.3 0.083 0.840 0.834
23 606.0 702.1 -96.1 0.047 -1.158 -1.168
24 657.0 653.8 3.2 0.042 0.039 0.038
Learning Objective: 12-7
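
The summary statistics in the output above can be cross-checked from the ANOVA table alone. The following is a minimal sketch, assuming only the sums of squares and degrees of freedom printed in that table; the variable names are illustrative and are not part of MegaStat.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table above.
ss_regression = 520_420.3570
ss_residual = 158_766.1430
df_regression = 1
df_residual = 22

# R-squared is the share of total variation explained by the regression.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

# The F statistic is the ratio of the two mean squares.
ms_residual = ss_residual / df_residual
f_calc = (ss_regression / df_regression) / ms_residual

# The standard error of the regression is the square root of the residual mean square.
std_error = math.sqrt(ms_residual)

print(f"R-squared  = {r_squared:.3f}")
print(f"F          = {f_calc:.2f}")
print(f"Std. Error = {std_error:.3f}")
```

Up to rounding, the three printed values should agree with r², F, and Std. Error in the output above.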

12.48 a. 95% confidence interval for β1: (-12.0202, -7.3016). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 22, t.025 =T.INV(.025,22) = ±2.074, tcalc = -8.492
<-2.074. Therefore reject H0 and conclude that the slope is significantly different from
zero.
c. The p-value = 2.16×10⁻⁸. This means that a slope estimate this far from zero would be highly
unlikely if the true slope were equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6
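
Parts (a) and (b) of 12.48 can be reproduced from the coefficient and its standard error. This is a small sketch, assuming the slope and standard error from the regression output and the critical value 2.074 quoted in part (b); the function name `slope_inference` is hypothetical.

```python
def slope_inference(b1, se_b1, t_crit):
    """Return (t_calc, lower, upper) for a two-tailed test and CI on the slope."""
    t_calc = b1 / se_b1          # test statistic for H0: beta1 = 0
    margin = t_crit * se_b1      # half-width of the confidence interval
    return t_calc, b1 - margin, b1 + margin


# Slope and standard error from the regression output; t.025 with d.f. = 22.
T_CRIT = 2.074
t_calc, lower, upper = slope_inference(b1=-9.6609, se_b1=1.1376, t_crit=T_CRIT)
reject_h0 = abs(t_calc) > T_CRIT

print(f"t_calc = {t_calc:.3f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f})")
print(f"Reject H0: {reject_h0}")
```

The interval differs from the MegaStat interval only in the fourth decimal place, because the critical value is rounded to three decimals here.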

12.49 a. R² = .766. 76.6% of the variation in the energy usage can be explained by average daily
temperature. This shows a fairly strong relationship.
b. Fcalc = 72.11 and its p-value = 2.16×10⁻⁸. The F statistic is significant, which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Observation 20 is an unusual residual. The model underestimates the energy usage for that
month.
Learning Objective: 12-11
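
The screening behind 12.50 can be sketched as follows. The cutoff of 2 for the absolute studentized residual is an assumption here (a common rule of thumb, not stated in this solution); the values are the studentized residuals from the 12.47 output, and `flag_unusual` is a hypothetical helper name.

```python
# Studentized residuals for observations 1-24, copied from the 12.47 output.
studentized = [
    -1.585, -0.197, 0.183, -1.332, -1.373, -0.667, 0.945, 0.509,
    0.502, -0.964, -1.380, -0.554, 1.205, 0.719, 0.956, 0.895,
    0.868, -0.145, 0.778, 2.127, -1.098, 0.840, -1.158, 0.039,
]


def flag_unusual(residuals, cutoff=2.0):
    """Return 1-based observation numbers with |studentized residual| > cutoff."""
    return [i for i, e in enumerate(residuals, start=1) if abs(e) > cutoff]


unusual = flag_unusual(studentized)
print("Unusual residuals at observations:", unusual)
```

Under that assumed cutoff, only observation 20 is flagged, and its residual is positive, which is consistent with the model underestimating usage for that month.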

12.51 a. The normplot of residuals does not show a clear normal pattern. The histogram also
shows non-normality.

b. The residuals show some departure from normality but this is a small sample size and a
larger sample might help.
Learning Objective: 12-10

12.52

The residual plot does not show obvious signs of heteroscedasticity.


Learning Objective: 12-10

12.53 An autocorrelation test is appropriate because this is time series data. The residual plot
below does not show obvious increasing or decreasing trends nor does it show signs of
negative autocorrelation. The DW statistic = 1.53 which indicates slight positive
autocorrelation. This is not unexpected because temperature for one month is related to
temperature for the previous month. The level of autocorrelation does not invalidate the
regression results.

Learning Objective: 12-10
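
The DW statistic quoted in 12.53 can be recomputed from the residual column of the 12.47 output. The sketch below uses the standard Durbin-Watson formula (sum of squared successive differences divided by the sum of squared residuals); because the printed residuals are rounded to one decimal, the result is only approximate.

```python
# Residuals for observations 1-24 in time order, copied from the 12.47 output.
residuals = [
    -130.8, -15.9, 14.4, -108.2, -113.2, -55.4, 76.7, 41.6,
    41.0, -77.7, -114.0, -46.1, 97.8, 57.8, 74.4, 72.5,
    71.9, -12.1, 63.3, 170.7, -84.3, 68.3, -96.1, 3.2,
]


def durbin_watson(e):
    """Durbin-Watson statistic: sum of (e_t - e_{t-1})^2 divided by sum of e_t^2."""
    numerator = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    denominator = sum(v * v for v in e)
    return numerator / denominator


dw = durbin_watson(residuals)
print(f"DW = {dw:.2f}")
```

A value below 2 points toward positive autocorrelation, and a value near 2 toward none, which is how the solution reads the statistic.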

12.54 Answers will vary. Confidence and prediction intervals for x = 50 and 70 are shown below.
Predicted values for: Electric Consumption (KWH)
95% Confidence Intervals 95% Prediction Intervals
Avg Daily Temp (deg F) Predicted lower upper lower upper
50 682.745 645.995 719.495 502.776 862.715
70 489.527 436.022 543.033 305.405 673.650
Learning Objective: 12-9
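
The predicted values in 12.54 follow directly from the fitted equation. A minimal sketch, assuming the intercept and slope from the 12.47 regression output (the function name `predict_kwh` is illustrative):

```python
# Coefficients from the 12.47 regression output.
INTERCEPT = 1165.7901
SLOPE = -9.6609


def predict_kwh(avg_daily_temp):
    """Predicted monthly electric consumption (KWH) at a given average daily temperature (deg F)."""
    return INTERCEPT + SLOPE * avg_daily_temp


for temp in (50, 70):
    print(f"x = {temp}: predicted = {predict_kwh(temp):.3f}")
```

The interval limits themselves also depend on the sample mean and spread of temperature, which are not listed in this solution, so only the point predictions are reproduced here.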

12.55 Observation 21 shows high leverage.


Learning Objective: 12-11

DATA SET J

12.40 Time-series.
Learning Objective: 02-3

12.41 Answers will vary. This information is collected by The Bureau of Labor Statistics.
Learning Objective: 02-7

12.42 Answers will vary. 47 years of data is a good sample. Methods of measurement can vary
over time. It is important to consider how the measures were calculated and compare
years in which the calculations were similar.
Learning Objective: 02-9

12.43 A positive slope would be logical. It would make sense that the change in Commodities
CPI would move in the same direction as change in Services CPI.
Learning Objective: 12-2

12.44

There is a fairly strong positive relationship between the Commodities% and Services%.
Learning Objective: 12-4

12.45 See graph above. Yes, a linear relationship is plausible.


Learning Objective: 12-4

12.46 An increase of 1% in the change in Commodities CPI increases the change in Services CPI
by .830%. Yes, the intercept is meaningful: it is possible that there is no change in the CPI between two years.
Learning Objective: 12-2

12.47 MegaStat output is provided.

Regression Analysis

r² 0.727 n 47
r 0.853 k 1
Std. Error 1.574 Dep. Var. Services%

ANOVA table
Source SS df MS F p-value
Regression 297.2715 1 297.2715 120.03 2.76E-14
Residual 111.4532 45 2.4767
Total 408.7247 46

Regression output confidence interval


variables coefficients std. error t (df=45) p-value 95% lower 95% upper
Intercept 2.2068 0.3506 6.293 1.14E-07 1.5005 2.9130
Commodities% 0.8302 0.0758 10.956 2.76E-14 0.6776 0.9828

Observation Services% Predicted Residual Leverage Studentized Residual Studentized Deleted Residual
1 3.40 2.95 0.45 0.037 0.289 0.286
2 1.70 2.70 -1.00 0.041 -0.652 -0.648
3 2.00 2.95 -0.95 0.037 -0.618 -0.613
4 2.00 2.95 -0.95 0.037 -0.618 -0.613
5 2.00 3.20 -1.20 0.034 -0.778 -0.774
6 2.30 3.12 -0.82 0.035 -0.530 -0.526
7 3.80 4.37 -0.57 0.023 -0.363 -0.360
8 4.30 3.78 0.52 0.027 0.332 0.329
9 5.20 5.11 0.09 0.021 0.056 0.056
10 6.90 6.11 0.79 0.025 0.509 0.505
11 8.00 5.94 2.06 0.024 1.323 1.334
12 5.70 5.20 0.50 0.021 0.324 0.321
13 3.80 4.70 -0.90 0.022 -0.577 -0.572
14 4.40 8.35 -3.95 0.057 -2.584 -2.769
15 9.20 12.09 -2.89 0.185 -2.031 -2.107
16 9.60 9.51 0.09 0.086 0.058 0.058
17 8.30 5.78 2.52 0.023 1.622 1.653
18 7.70 7.02 0.68 0.034 0.438 0.434
19 8.60 8.18 0.42 0.053 0.272 0.269
20 11.00 11.59 -0.59 0.162 -0.408 -0.404
21 15.40 12.42 2.98 0.201 2.120 2.209
22 13.10 9.18 3.92 0.077 2.592 2.779
23 9.00 5.61 3.39 0.022 2.178 2.277
24 3.50 4.61 -1.11 0.022 -0.716 -0.712
25 5.20 5.03 0.17 0.021 0.110 0.108
26 5.10 3.95 1.15 0.026 0.740 0.736
27 5.00 1.46 3.54 0.066 2.328 2.454
28 4.20 4.86 -0.66 0.021 -0.426 -0.422
29 4.60 5.11 -0.51 0.021 -0.329 -0.326
30 4.90 6.11 -1.21 0.025 -0.778 -0.774
31 5.50 6.52 -1.02 0.028 -0.660 -0.656
32 5.10 4.78 0.32 0.022 0.205 0.203
33 3.90 3.87 0.03 0.026 0.021 0.021
34 3.90 3.78 0.12 0.027 0.075 0.074
35 3.30 3.62 -0.32 0.029 -0.205 -0.203
36 3.40 3.78 -0.38 0.027 -0.247 -0.245
37 3.20 4.37 -1.17 0.023 -0.749 -0.745
38 3.00 3.37 -0.37 0.031 -0.238 -0.236
39 2.70 2.29 0.41 0.048 0.267 0.264
40 2.50 3.70 -1.20 0.028 -0.774 -0.771
41 3.40 4.95 -1.55 0.021 -0.993 -0.993
42 4.10 3.04 1.06 0.036 0.688 0.684
43 3.10 1.63 1.47 0.062 0.967 0.967
44 3.20 3.04 0.16 0.036 0.106 0.104
45 2.90 4.12 -1.22 0.025 -0.782 -0.779
46 3.30 5.20 -1.90 0.021 -1.217 -1.224
47 3.80 4.20 -0.40 0.024 -0.257 -0.254
Learning Objective: 12-7

12.48 a. 95% confidence interval for β1: (0.6776, 0.9828). This interval does not contain zero
which means that we are confident the slope is not zero.
b. H0: β1 = 0 vs. H1: β1 ≠ 0, d.f. = 45, t.025 =T.INV(.025,45) = ±2.014, tcalc = 10.956 > 2.014.
Therefore reject H0 and conclude that the slope is significantly different from zero.
c. The p-value = 2.76×10⁻¹⁴. This means that a slope estimate this far from zero would be highly
unlikely if the true slope were equal to zero.
d. Yes, this sample supports our a priori hypothesis about the slope.
Learning Objective: 12-5
Learning Objective: 12-6

12.49 a. R² = .727. 72.7% of the variation in the Services CPI change can be explained by
Commodities CPI change. This shows a fairly strong relationship.
b. Fcalc = 120.03 and its p-value = 2.76×10⁻¹⁴. The F statistic is significant, which means the
linear model provides significant fit.
c. This model provides a good enough fit for practical use.
Learning Objective: 12-8

12.50 Observations 14, 15, 21-23 and 27 are unusual residuals.


Learning Objective: 12-11

12.51 a. The histogram shows a slight bell-shaped curve although there is more concentration in
the middle of the graph than you would see in a true normal distribution. The normplot
shows a similar pattern.

b. The histogram of residuals does not show obvious departure from normality.
Learning Objective: 12-10

12.52

The residual plot does not show significant signs of heteroscedasticity although there is a
slight fan out pattern as X increases.
Learning Objective: 12-10

12.53 An autocorrelation test is appropriate because this is time series data. The residual plot
below does not show obvious increasing or decreasing trends nor does it show signs of
negative autocorrelation. The DW statistic = 1.08 which indicates slight positive
autocorrelation. This is not unexpected because CPI indexes are economic data which
one would expect to be correlated from year to year. The level of autocorrelation does
not invalidate the regression results.

Learning Objective: 12-10

12.54 Answers will vary. Confidence and prediction intervals for x = 1.5 and 2.5 are shown
below.
Predicted values for: Services%
95% Confidence Intervals 95% Prediction Intervals
Commodities% Predicted lower upper lower upper
1.5 3.4520 2.8982 4.0059 0.2343 6.6698
2.5 4.2822 3.7954 4.7690 1.0753 7.4891
Learning Objective: 12-9
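
The prediction intervals in 12.54 are much wider than the confidence intervals because they add the variation of an individual observation to the uncertainty in the mean. As a rough consistency check, the half-widths satisfy (PI half-width)² = (CI half-width)² + (t × s)², where s is the standard error of the regression. The sketch below assumes s = 1.574 from the 12.47 output and t.025 = 2.014 from 12.48; the function name is hypothetical.

```python
import math

T_CRIT = 2.014      # t.025 with d.f. = 45, from 12.48 b
STD_ERROR = 1.574   # standard error of the regression, from the 12.47 output


def prediction_interval(predicted, ci_lower, ci_upper, t_crit, std_error):
    """Rebuild the prediction interval from the confidence interval at the same x."""
    ci_half = (ci_upper - ci_lower) / 2
    # Individual predictions carry one extra unit of error variance.
    pi_half = math.sqrt(ci_half ** 2 + (t_crit * std_error) ** 2)
    return predicted - pi_half, predicted + pi_half


# (x, predicted, CI lower, CI upper) rows copied from the table above.
rows = [(1.5, 3.4520, 2.8982, 4.0059), (2.5, 4.2822, 3.7954, 4.7690)]
for x, predicted, ci_lower, ci_upper in rows:
    pi_lower, pi_upper = prediction_interval(predicted, ci_lower, ci_upper, T_CRIT, STD_ERROR)
    print(f"x = {x}: prediction interval = ({pi_lower:.4f}, {pi_upper:.4f})")
```

The rebuilt limits should match the tabulated prediction intervals to about three decimals; the small gap comes from rounding in the inputs.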

12.55 Observations 15, 16, 20 and 21 show high leverage.


Learning Objective: 12-11
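
The leverage screening in 12.55 can be sketched as below. The cutoff 2(k + 1)/n, which equals 4/n for one predictor, is an assumption (a widely used rule of thumb that this solution does not state explicitly). The leverage values are copied from the 12.47 output, and `flag_high_leverage` is a hypothetical helper.

```python
# Leverage values for observations 1-47, copied from the 12.47 output.
leverage = [
    0.037, 0.041, 0.037, 0.037, 0.034, 0.035, 0.023, 0.027, 0.021, 0.025,
    0.024, 0.021, 0.022, 0.057, 0.185, 0.086, 0.023, 0.034, 0.053, 0.162,
    0.201, 0.077, 0.022, 0.022, 0.021, 0.026, 0.066, 0.021, 0.021, 0.025,
    0.028, 0.022, 0.026, 0.027, 0.029, 0.027, 0.023, 0.031, 0.048, 0.028,
    0.021, 0.036, 0.062, 0.036, 0.025, 0.021, 0.024,
]


def flag_high_leverage(h, k=1):
    """Return 1-based observation numbers whose leverage exceeds 2(k + 1)/n."""
    cutoff = 2 * (k + 1) / len(h)
    return [i for i, value in enumerate(h, start=1) if value > cutoff]


high = flag_high_leverage(leverage)
print(f"cutoff = {4 / len(leverage):.4f}")
print("High leverage at observations:", high)
```

With n = 47 the assumed cutoff is about .0851, so observation 16 (leverage .086) only just qualifies.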

12.56 No, r measures the strength and direction of the linear relationship, but not the amount of
variation explained by the explanatory variable.
Learning Objective: 12-15

12.57 H0: ρ = 0 versus H1: ρ ≠ 0. α = .025 so tcrit = t.0125 =T.INV(.0125,53) = ±2.3069.
tcalc = .3043 × sqrt[(55 − 2)/(1 − .3043²)] = 2.3256 > 2.3069, so we reject the null hypothesis. The correlation is
not equal to zero.
Learning Objective: 12-1
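
The test statistic in 12.57 uses the formula from 12.1, tcalc = r × sqrt[(n − 2)/(1 − r²)]. A short sketch of that calculation follows; the function name `correlation_t` is illustrative, and the critical value is taken from the solution rather than computed.

```python
import math


def correlation_t(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Values from 12.57: r = .3043, n = 55, two-tailed critical value 2.3069.
T_CRIT = 2.3069
t_calc = correlation_t(r=0.3043, n=55)
reject_h0 = abs(t_calc) > T_CRIT

print(f"t_calc = {t_calc:.4f}")
print(f"Reject H0: {reject_h0}")
```

The same function applies to the other correlation tests in this chapter by changing r, n, and the critical value.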

12.58 The correlation coefficient, r, is only .13, indicating that there exists a very weak positive
correlation between prices on successive days. The fact that it is a highly significant
result stems from a large sample size which increases power of the test. This means that
very small correlations will show statistical significance even though the correlation is
not truly important.
Learning Objective: 12-1

12.59 a. ŷ = 55.2 +.73(2000) = 1515.2 total free throws expected.


b. No, the intercept is not meaningful. You can’t make free throws without attempting
them.
c. Quick rule: 1515.2 ± 2.052(53.2) = 1515.2 ± 109.17 or (1406.03, 1624.37)
Learning Objective: 12-2
Learning Objective: 12-3
Learning Objective: 12-9
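
The arithmetic in 12.59 (a) and (c) can be sketched as below. It assumes, as the quick rule suggests, that 2.052 is the critical t value and 53.2 is the standard error of the regression; both function names are hypothetical.

```python
def predict_free_throws_made(attempts):
    """Fitted equation from part (a): y-hat = 55.2 + .73x."""
    return 55.2 + 0.73 * attempts


def quick_interval(y_hat, t_crit, std_error):
    """Quick-rule interval: y-hat plus or minus t times the standard error."""
    margin = t_crit * std_error
    return y_hat - margin, y_hat + margin


y_hat = predict_free_throws_made(2000)
lower, upper = quick_interval(y_hat, t_crit=2.052, std_error=53.2)

print(f"Predicted = {y_hat:.1f}")
print(f"Quick interval = ({lower:.2f}, {upper:.2f})")
```

The quick rule ignores the extra term for the distance of x from its mean, so it should be read as an approximation to the full prediction interval.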

12.60 a. ŷ = 30.7963 + .0343x (R² = .202, se = 6.816)


b. d.f. = 33, α = .05 so tcrit = t.025 =T.INV(.025,33) = ±2.035
c. tcalc = 2.889 > 2.035 so we will reject the null hypothesis that the slope is zero.
d. We are 95% confident that the slope is contained in the interval (.0101, .0584). This CI
does not contain zero, hence, there is a relationship between the weekly pay and the
income tax withheld.
e. Fcalc = (2.889)² = 8.3463
f. The value of R² assigns only 20% of the variation in income withholding to the weekly
pay. While the F statistic is significant, the fit is only modest.
Learning Objective: 12-6
Learning Objective: 12-8

12.61 a. ŷ = 1743.57 − 1.2163x (R² = .370, s = 286.793)


b. d.f. = 13, α = .05 so tcrit = t.025 =T.INV(.025,13) = ±2.160
c. tcalc = −2.764 < −2.160 so we will reject the null hypothesis that the slope is zero.
d. We are 95% confident that the slope is contained in the interval (−2.1617, −0.2656).
This CI does not contain zero, hence, there is a relationship between the monthly
maintenance spending and monthly machine downtime.
e. Fcalc = (−2.764)² = 7.639696
f. The value of R² assigns only 37% of the variation in monthly machine downtime to the
monthly maintenance spending (dollars). Thus, throwing more “money” at the problem
of downtime will not completely resolve the issue. This indicates that there are most likely
other reasons why machines have the amount of downtime incurred.
Learning Objective: 12-6
Learning Objective: 12-8

12.62 a. ŷ = 6.5763 + 0.0452x (R² = .519, s = 6.977)


b. d.f. = 62, α = .05 so tcrit = t.025 =T.INV(.025,62) = ±1.999
c. tcalc = 8.183 > 1.999 so we will reject the null hypothesis that the slope is zero.
d. We are 95% confident that the slope is contained in the interval (0.0342, 0.0563). This
CI does not contain zero, hence, there is a relationship between the total assets
(billions) and total revenue (billions).
e. Fcalc = (8.183)² = 66.96
f. The value of R² assigns 51.9% of the variation in total revenue (billions) to the total
assets (billions). Thus, higher total assets are associated with higher revenue. However,
the results also indicate that there are most likely other reasons why companies earn the
revenue they do.
Learning Objective: 12-6
Learning Objective: 12-8
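
Part (e) of 12.60 through 12.62 relies on the fact that, with one predictor, the F statistic equals the square of the slope's t statistic. In simple regression F can also be written as R²(n − 2)/(1 − R²), which gives a rough cross-check against the reported R² values. The sketch below uses only numbers quoted in those three solutions; the function names are illustrative.

```python
def f_from_t(t_calc):
    """In simple regression, F equals the squared t statistic of the slope."""
    return t_calc ** 2


def f_from_r_squared(r_squared, df_residual):
    """Equivalent form for one predictor: F = R^2 * (n - 2) / (1 - R^2)."""
    return r_squared * df_residual / (1 - r_squared)


# problem: (t_calc, R-squared, residual d.f.) as reported in each solution
problems = {
    "12.60": (2.889, 0.202, 33),
    "12.61": (-2.764, 0.370, 13),
    "12.62": (8.183, 0.519, 62),
}

for name, (t_calc, r_squared, df_residual) in problems.items():
    print(f"{name}: t^2 = {f_from_t(t_calc):.4f}, "
          f"from R^2 = {f_from_r_squared(r_squared, df_residual):.4f}")
```

The two columns agree only approximately, because R² is reported to three decimals.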

12.63 a. r = −.387 (from Excel)


b. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 41, α = .01 so tcrit = t.005 =T.INV(.005,41)
= ±2.701. tcalc = −.387 × sqrt[(43 − 2)/(1 − (−.387)²)] = −2.687. Because −2.687 is between the critical
values we do not reject the hypothesis of zero correlation. The sample does not provide
evidence at the .01 significance level that the stock prices move together.
c. The scatterplot shows a slight negative correlation between IBM and HPQ stock prices.
(Note that if α = .05, t.025 = ±2.0195, and we would reject the hypothesis of zero
correlation and conclude that there is correlation between the stock prices.)

Learning Objective: 12-1

12.64 a.

b. r = −.297. This shows a weak negative linear relationship between loyalty card use and
sales growth.
c. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 72 and α = .05, tcrit = t.025
=T.INV(.025,72) = ±1.9935. tcalc = −.297 × sqrt[(74 − 2)/(1 − .297²)] = −2.639. Because −2.639
< −1.9935, we reject the hypothesis of no correlation and the sample evidence supports
the notion of negative correlation.
d. It appears that a higher loyalty card usage is associated with lower sales growth.

12.65 a. The scatter plot indicates that there is a positive correlation between the fertility rates in
1990 and 2000.

b. r = .749. There is a strong positive linear relationship between the fertility rates in 1990
and 2000.
c. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 13 and α = .05, t.025 =T.INV(.025,13) = ±2.160.
tcalc = .749 × sqrt[(15 − 2)/(1 − .749²)] = 4.076. Because 4.076 > 2.160, we reject the hypothesis of no
correlation and the sample evidence supports the notion of positive correlation.
Learning Objective: 12-1

12.66 a. The scatter plot shows almost no pattern.

b. r = −.105. H0: ρ = 0 versus H1: ρ ≠ 0. Using d.f. = 25 and α = .05, tcrit = t.025
=T.INV(.025,25) = ±2.060. tcalc = −.105 × sqrt[(27 − 2)/(1 − .105²)] = −0.528. Because −.528 falls
between the critical values we fail to reject the hypothesis of no correlation.
c. It appears there is very little relationship between price and accuracy rating of speakers.
Learning Objective: 12-1

12.67 For each of these, the scatter plot will contain the answers to (a), (b), and (d) with respect to
the fitted equation.
c. Salary: The fit is good. Assessed: The fit is excellent. HomePrice2: The fit is good.
d. Salary: An increase in age of 1 year increases salary by $1,447.40.
Assessed: An increase of 1 sq. ft. of floor space increases assessed value by $313.30.
HomePrice2: An increase of 1 sq. ft. of home size increases the selling price by
$209.20.
e. The intercept is not meaningful for any of these data sets, because a zero value of X
cannot realistically result in a positive Y value.

Learning Objective: 12-7


Learning Objective: 12-8

12.68 a. t = estimated slope / standard error. See table below for calculations.
b. Answers shown in right column in table below.

Dependent Variable          Estimated Slope   Differ from 0?
Highest grade achieved      -0.027            Yes
Reading grade equivalent    -0.07             Yes
Class standing              -0.006            No
Absence from school         4.8               Yes
Grammatical reasoning       0.159             No
Vocabulary                  -0.124            Yes
Hand-eye coordination       0.041             No
Reaction time               11.8              No
Minor antisocial behavior   -0.639            No

c. It would be inappropriate to assume cause and effect without a better understanding of
how the study was conducted.
Learning Objective: 12-6

12.69 a.

c. The fit of this regression is weak as given by R² = 0.2474. About 24.7% of the variation in %
Operating Margin is explained by % Equity Financing.
Learning Objective: 12-7
Learning Objective: 12-8

12.70 a.

c. The fit of this regression is very good as given by R² = 0.8216. The regression line
does show a strong positive linear relationship between molecular weight and retention
time, indicating that the greater the molecular weight, the greater the retention time.
Learning Objective: 12-7
Learning Objective: 12-8

12.71 a. Based on both R² = 0 and the p-value > .10, there is no evidence of a relationship between
class size and teacher ratings.
b. Given that R² = 0, we have not “explained” teacher ratings in this bivariate model. Other
factors might be students’ expected GPA, whether the course is a core class or not, the
age of the student, gender of student, gender of instructor, etc. Answers will vary.
Learning Objective: 12-8

12.72 a. The scatter plot shows a positive relationship.

c. The fit of this regression is very good as given by R² = .8206. The regression line
shows a strong positive linear relationship between revenue and profit, indicating that
greater revenue is associated with higher profit.
Learning Objective: 12-7
Learning Objective: 12-8

12.73 a. The slope of each model indicates the impact an additional year in vehicle age has on
the price. This relationship for each model is negative indicating that an additional year
of age reduces the asking price. This impact ranges from a low for the Taurus (an
additional year reduces the asking price by $906) to a high for the Ford Explorer (an
additional year reduces the asking price by $2,452).
b. The intercepts could indicate the price of a new vehicle.
c. Based on the R² values: The fit is very good for the Explorer, the F-150 Pickup and the
Taurus. The fit is weak for the Mustang. One reason for the seemingly poor fit for the
Mustang is the fact that this is a collector item (if in good condition) so that the age is
a less important factor in determining the asking price.
d. Answers will vary, but a bivariate model for 3 of the vehicles explains approximately
2/3 of the variation in asking price at a minimum. Other factors: condition of the car,
collector status, proposed usage, price of a new vehicle.
Learning Objective: 12-6
Learning Objective: 12-8

12.74 a. The regression results are not significant, based on the p-value, for the 1-Year holding
period. The results for the 2-Year period are significant at the 5% level, while for 5-, 8-,
and 10-Years the results are significant at the 1% level. For each regression there is an
inverse relationship between P/E and the stock return. For the 8-Year and 10-Year
period the relationship is approximately −1. The R² increases as the holding period
increases. This indicates that P/E ratio explains a greater portion of the variation in
stock return, the longer the stock is held.

b. Yes, given the data are time series, the potential for autocorrelation is present. Also, it is
commonly recognized that stock returns do exhibit a high degree of autocorrelation, as
do most financial series.
Learning Objective: 12-6
Learning Objective: 12-8

12.75 a. Using Father’s Height: My Predicted Height = 71 + 2.5 = 73.5”. My actual height =
73”. Using Average of Parents’ Height: My Predicted Height = 68 + 2.5 = 70.5”.
b. Fairly accurate: within 0.5” when using my father’s height, within 2.5” when using
average parent height. Perhaps accuracy improves when using only the father’s height
for males.
c. Regression analysis of samples of daughters and sons, with respective average height of
parents. Separate samples of each.
Learning Objective: 12-3
