
Business Statistics, 4e

by Ken Black

Chapter 14

Multiple Regression Analysis

Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons. 14-1
Learning Objectives

• Develop a multiple regression model.
• Understand and apply significance tests of the regression model and its coefficients.
• Compute and interpret residuals, the standard error of the estimate, and the coefficient of determination.
• Interpret multiple regression computer output.

Regression Models

Probabilistic Multiple Regression Model

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k + \epsilon$

where:
$Y$ = the value of the dependent (response) variable
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient of independent variable 1
$\beta_2$ = the partial regression coefficient of independent variable 2
$\beta_k$ = the partial regression coefficient of independent variable k
$k$ = the number of independent variables
$\epsilon$ = the error of prediction

Estimated Regression Model

$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \cdots + b_k X_k$

where:
$\hat{Y}$ = predicted value of Y
$b_0$ = estimate of regression constant
$b_1$ = estimate of regression coefficient 1
$b_2$ = estimate of regression coefficient 2
$b_3$ = estimate of regression coefficient 3
$b_k$ = estimate of regression coefficient k
$k$ = number of independent variables


Multiple Regression Model with Two
Independent Variables (First-Order)
Population Model

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$

where:
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient for independent variable 1
$\beta_2$ = the partial regression coefficient for independent variable 2
$\epsilon$ = the error of prediction

Estimated Model

$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$

where:
$\hat{Y}$ = predicted value of Y
$b_0$ = estimate of regression constant
$b_1$ = estimate of regression coefficient 1
$b_2$ = estimate of regression coefficient 2

Response Plane for First-Order Two-
Predictor Multiple Regression Model
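[Figure: three-dimensional plot of the response plane over the X1 and X2 axes, with the vertical intercept and a point labeled Y1 marked on the Y axis.]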

Least Squares Equations for k = 2

$b_0 n + b_1 \sum X_1 + b_2 \sum X_2 = \sum Y$

$b_0 \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2 = \sum X_1 Y$

$b_0 \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2 = \sum X_2 Y$
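
The three least squares equations form a 3×3 linear system in b0, b1, and b2. The sketch below is a minimal pure-Python illustration, assuming a small made-up data set (not from the text) generated exactly from Y = 1 + 2X1 + 3X2 so that the recovered coefficients are known in advance. The helper names `normal_equations` and `solve_linear` are hypothetical.

```python
def normal_equations(x1, x2, y):
    """Build the coefficient matrix and right-hand side for k = 2."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    matrix = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    rhs = [sy, s1y, s2y]
    return matrix, rhs


def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    m = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, m))
        solution[i] = (aug[i][m] - tail) / aug[i][i]
    return solution


# Hypothetical data: Y is an exact linear function of X1 and X2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

matrix, rhs = normal_equations(x1, x2, y)
b0, b1, b2 = solve_linear(matrix, rhs)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

Because the made-up Y values lie exactly on a plane, the solution should recover the generating coefficients up to floating-point error. With real data the same system yields the least squares estimates.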

Real Estate Data
Observation   Market Price ($1,000), Y   Square Feet, X1   Age (Years), X2
 1             63.0                      1,605             35
 2             65.1                      2,489             45
 3             69.9                      1,553             20
 4             76.8                      2,404             32
 5             73.9                      1,884             25
 6             77.9                      1,558             14
 7             74.9                      1,748              8
 8             78.0                      3,105             10
 9             79.0                      1,682             28
10             83.4                      2,470             30
11             79.5                      1,820              2
12             83.9                      2,143              6
13             79.7                      2,121             14
14             84.5                      2,485              9
15             96.0                      2,300             19
16            109.5                      2,714              4
17            102.5                      2,463              5
18            121.0                      3,076              7
19            104.9                      3,048              3
20            128.0                      3,267              6
21            129.0                      3,069             10
22            117.9                      4,765             11
23            140.0                      4,540              8

MINITAB Output
for the Real Estate Example
The regression equation is
Price = 57.4 + 0.0177 Sq.Feet - 0.666 Age

Predictor    Coef        StDev       T        P
Constant     57.35       10.01       5.73     0.000
Sq.Feet      0.017718    0.003146    5.63     0.000
Age          -0.6663     0.2280      -2.92    0.008

S = 11.96    R-Sq = 74.1%    R-Sq(adj) = 71.5%

Analysis of Variance

Source            DF    SS         MS        F        P
Regression         2    8189.7     4094.9    28.63    0.000
Residual Error    20    2861.0     143.1
Total             22    11050.7

Predicting the Price of Home

$\hat{Y} = 57.4 + 0.0177 X_1 - 0.666 X_2$

For $X_1 = 2500$ and $X_2 = 12$:

$\hat{Y} = 57.4 + 0.0177(2500) - 0.666(12) = 93.658$ thousand dollars

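
The price prediction for a 2,500-square-foot, 12-year-old home can be reproduced with a few lines of Python. This is a small sketch using the rounded coefficients from the regression equation; the function name `predict_price` is illustrative only.

```python
def predict_price(sq_feet, age_years):
    """Predicted market price in $1,000, using the rounded fitted equation."""
    return 57.4 + 0.0177 * sq_feet - 0.666 * age_years


price = predict_price(2500, 12)
print(f"Predicted price: {price:.3f} thousand dollars")
```

Using the unrounded MINITAB coefficients (57.35, 0.017718, -0.6663) would give a slightly different figure, since the equation above is rounded to three significant digits.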
Evaluating the Multiple
Regression Model
Testing the Overall Model

$H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0$
$H_a$: At least one of the regression coefficients is $\neq 0$

Significance Tests for Individual Regression Coefficients

$H_0: \beta_1 = 0$    $H_a: \beta_1 \neq 0$
$H_0: \beta_2 = 0$    $H_a: \beta_2 \neq 0$
$H_0: \beta_3 = 0$    $H_a: \beta_3 \neq 0$
$\vdots$
$H_0: \beta_k = 0$    $H_a: \beta_k \neq 0$

Testing the Overall Model for the
Real Estate Example

$H_0: \beta_1 = \beta_2 = 0$
$H_a$: At least one of the regression coefficients is $\neq 0$

$F_{.01,2,20} = 5.85$

$F_{Cal} = 28.63 > 5.85$, reject $H_0$.

$MSR = \frac{SSR}{k}$    $MSE = \frac{SSE}{n - k - 1}$    $F = \frac{MSR}{MSE}$

ANOVA
                    df    SS           MS         F        p
Regression           2    8189.723     4094.86    28.63    .000
Residual (Error)    20    2861.017     143.1
Total               22    11050.74

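
The F statistic in the ANOVA table follows directly from the sums of squares and degrees of freedom. A minimal sketch of that arithmetic is shown here; the critical value 5.85 is the tabled figure quoted on the slide and is entered by hand rather than computed.

```python
# Sums of squares and sample dimensions from the real estate ANOVA table.
ssr = 8189.723   # regression sum of squares
sse = 2861.017   # residual (error) sum of squares
n, k = 23, 2     # observations and independent variables

msr = ssr / k              # mean square regression
mse = sse / (n - k - 1)    # mean square error
f_cal = msr / mse

f_critical = 5.85          # tabled F(.01; 2, 20), taken from the slide
reject_h0 = f_cal > f_critical

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_cal:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```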
Significance Test of the Regression Coefficients for the Real Estate Example

$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$

$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$

$t_{.025,20} = 2.086$

$t_{Cal} = 5.63 > 2.086$, reject $H_0$.

                Coefficients    Std Dev     t Stat    p
x1 (Sq.Feet)    0.0177          0.003146    5.63      .000
x2 (Age)        -0.666          0.2280      -2.92     .008
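
Each t statistic is the coefficient divided by its standard deviation, compared in absolute value with the two-tailed critical value. The sketch below repeats that calculation for both predictors, using the unrounded MINITAB coefficients for the real estate example; the critical value 2.086 is the tabled figure from the slide, entered by hand.

```python
# (coefficient, standard deviation) pairs from the MINITAB output.
coefficients = {
    "Sq.Feet": (0.017718, 0.003146),
    "Age": (-0.6663, 0.2280),
}
t_critical = 2.086  # tabled t(.025, 20), taken from the slide

results = {}
for name, (coef, std_dev) in coefficients.items():
    t_stat = coef / std_dev
    reject = abs(t_stat) > t_critical  # two-tailed test
    results[name] = (t_stat, reject)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{name}: t = {t_stat:.2f}, {decision}")
```

Under these figures both coefficients exceed the critical value in absolute value, so the same decision applies to Age (t = -2.92) as to Sq.Feet.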

Residuals and Sum of Squares Error for the Real Estate Example

Observation    Y        Ŷ          Y - Ŷ      (Y - Ŷ)²
 1              63.0     62.466      0.534       0.285
 2              65.1     71.465     -6.365      40.517
 3              69.9     71.540     -1.640       2.689
 4              76.8     78.622     -1.822       3.319
 5              73.9     74.073     -0.173       0.030
 6              77.9     75.627      2.273       5.168
 7              74.9     82.991     -8.091      65.466
 8              78.0    105.702    -27.702     767.388
 9              79.0     68.495     10.505     110.360
10              83.4     81.124      2.276       5.181
11              79.5     88.265     -8.765      76.823
12              83.9     91.322     -7.422      55.092
13              79.7     85.602     -5.902      34.832
14              84.5     95.383    -10.883     118.438
15              96.0     85.442     10.558     111.479
16             109.5    102.772      6.728      45.265
17             102.5     97.659      4.841      23.440
18             121.0    107.187     13.813     190.799
19             104.9    109.356     -4.456      19.858
20             128.0    111.237     16.763     280.982
21             129.0    105.064     23.936     572.936
22             117.9    134.447    -16.547     273.815
23             140.0    132.460      7.540      56.854

SSE = $\sum (Y - \hat{Y})^2$ = 2861.017
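
As a check on the residuals and SSE, the sketch below refits the two-predictor model to the real estate data in plain Python, using the deviation (mean-centered) form of the least squares equations for k = 2. It assumes observation 10's market price is 83.4, the value consistent with the residual reported for that observation; variable names are illustrative.

```python
# Real estate data: (price in $1,000, square feet, age in years).
# Assumption: observation 10's price is entered as 83.4, the value
# consistent with the residual and SSE reported for this example.
data = [
    (63.0, 1605, 35), (65.1, 2489, 45), (69.9, 1553, 20), (76.8, 2404, 32),
    (73.9, 1884, 25), (77.9, 1558, 14), (74.9, 1748, 8), (78.0, 3105, 10),
    (79.0, 1682, 28), (83.4, 2470, 30), (79.5, 1820, 2), (83.9, 2143, 6),
    (79.7, 2121, 14), (84.5, 2485, 9), (96.0, 2300, 19), (109.5, 2714, 4),
    (102.5, 2463, 5), (121.0, 3076, 7), (104.9, 3048, 3), (128.0, 3267, 6),
    (129.0, 3069, 10), (117.9, 4765, 11), (140.0, 4540, 8),
]
n = len(data)
y = [row[0] for row in data]
x1 = [row[1] for row in data]
x2 = [row[2] for row in data]

y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n

# Centered sums of squares and cross-products.
s11 = sum((a - x1_bar) ** 2 for a in x1)
s22 = sum((b - x2_bar) ** 2 for b in x2)
s12 = sum((a - x1_bar) * (b - x2_bar) for a, b in zip(x1, x2))
s1y = sum((a - x1_bar) * (c - y_bar) for a, c in zip(x1, y))
s2y = sum((b - x2_bar) * (c - y_bar) for b, c in zip(x2, y))

# Solve the two centered normal equations, then recover the constant.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = y_bar - b1 * x1_bar - b2 * x2_bar

fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
residuals = [obs - fit for obs, fit in zip(y, fitted)]
sse = sum(e * e for e in residuals)

print(f"b0 = {b0:.2f}, b1 = {b1:.6f}, b2 = {b2:.4f}")
print(f"SSE = {sse:.3f}")
```

The printed coefficients and SSE can be compared against the MINITAB output for this example; small differences in the last digit are expected from rounding.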

MINITAB Residual Diagnostics for
the Real Estate Problem
[Figure: MINITAB residual model diagnostics in four panels. Normal Plot of Residuals (Residual vs. Normal Score); I Chart of Residuals (Residual vs. Observation Number, center line X = -7.2E-14, limits 3.0SL = 31.26 and -3.0SL = -31.26); Histogram of Residuals (Frequency vs. Residual); Residuals vs. Fits (Residual vs. Fit).]

SSE and Standard Error
of the Estimate
ANOVA
                    df    SS         MS        F        P
Regression           2    8189.7     4094.9    28.63    .000
Residual (Error)    20    2861.0     143.1
Total               22    11050.7

$S_e = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{\frac{2861}{23 - 2 - 1}} = 11.96$

where: $n$ = number of observations
$k$ = number of independent variables
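
The standard error of the estimate can be verified numerically. This is a short sketch of the arithmetic, using the unrounded SSE (2861.017) reported for the real estate example.

```python
import math

sse = 2861.017   # sum of squares error
n, k = 23, 2     # observations and independent variables

s_e = math.sqrt(sse / (n - k - 1))
print(f"Se = {s_e:.2f}")
```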

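Coefficient of Multiple Determination ($R^2$)

ANOVA
                    df    SS         MS         F        p
Regression           2    8189.7     4094.86    28.63    .000
Residual (Error)    20    2861.0     143.1
Total               22    11050.7

In the SS column, the Regression entry is SSR, the Residual (Error) entry is SSE, and the Total entry is $SS_{YY}$.

$R^2 = \frac{SSR}{SS_{YY}} = \frac{8189.723}{11050.74} = .741$

$R^2 = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{2861.017}{11050.74} = .741$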
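
Both forms of the coefficient of multiple determination give the same value because SSR + SSE equals the total sum of squares. A brief sketch of the check, with illustrative variable names:

```python
ssr = 8189.723     # regression sum of squares
sse = 2861.017     # residual (error) sum of squares
ss_yy = 11050.74   # total sum of squares

r2_from_ssr = ssr / ss_yy
r2_from_sse = 1 - sse / ss_yy
print(f"R^2 = {r2_from_ssr:.3f} (from SSR), {r2_from_sse:.3f} (from SSE)")
```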

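Adjusted $R^2$

ANOVA
                    df    SS         MS        F        p
Regression           2    8189.7     4094.9    28.63    .000
Residual (Error)    20    2861.0     143.1
Total               22    11050.7

In the table, the Residual (Error) row holds $n - k - 1$ (df) and SSE (SS); the Total row holds $n - 1$ (df) and $SS_{YY}$ (SS).

$\text{adj. } R^2 = 1 - \frac{SSE / (n - k - 1)}{SS_{YY} / (n - 1)} = 1 - \frac{2861.017 / (23 - 2 - 1)}{11050.74 / (23 - 1)} = 1 - .285 = .715$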
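
The adjustment divides each sum of squares by its degrees of freedom before forming the ratio. A minimal sketch of the calculation for the real estate example:

```python
sse = 2861.017     # residual (error) sum of squares
ss_yy = 11050.74   # total sum of squares
n, k = 23, 2       # observations and independent variables

ratio = (sse / (n - k - 1)) / (ss_yy / (n - 1))
adj_r2 = 1 - ratio
print(f"ratio = {ratio:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

The adjusted value (.715) is lower than the unadjusted $R^2$ (.741) because the ratio is scaled up by $(n - 1)/(n - k - 1)$.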

Demonstration Problem 14.1: Freight Data

Country    Freight Cargo Shipped by Road    Length of Roads    Number of Commercial
           (Million Short-Ton Miles)        (Miles)            Vehicles
China      278,806                          673,239            5,010,000
Brazil     178,359                          1,031,693          1,371,127
India      144,000                          1,342,000          1,980,000
Germany    138,975                          395,367            2,923,000
Italy      125,171                          188,597            2,745,500
Spain      105,824                          206,271            2,859,438
Mexico     96,049                           157,036            3,758,034

Demonstration Problem 14.1: Excel Output

Regression Statistics
Multiple R           0.812
R Square             0.659
Adjusted R Square    0.488
Standard Error       44273.86677
Observations         7

ANOVA
              df    SS             MS          F       Sig. F
Regression     2    15148592381    7.57E+09    3.86    0.116
Residual       4    7840701114     1.96E+09
Total          6    22989293495

                       Coefficients     Standard Error    t Stat    P-value
Intercept              -26425.45085     67624.93769       -0.39     0.716
Length of Roads        0.101820862      0.043495015       2.34      0.079
Commercial Vehicles    0.04094856       0.017121018       2.39      0.075

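
The summary statistics in the Excel output can be rebuilt from the ANOVA sums of squares alone. The sketch below does so, and then compares the reported p-values with a significance level of .05; that level is an assumption made here for illustration and is not stated in the output.

```python
import math

# Sums of squares from the Excel ANOVA table.
ssr = 15148592381
sse = 7840701114
sst = 22989293495
n, k = 7, 2

r2 = ssr / sst
multiple_r = math.sqrt(r2)
adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))
std_error = math.sqrt(sse / (n - k - 1))
f_stat = (ssr / k) / (sse / (n - k - 1))

print(f"Multiple R = {multiple_r:.3f}, R Square = {r2:.3f}")
print(f"Adjusted R Square = {adj_r2:.3f}, Standard Error = {std_error:.2f}")
print(f"F = {f_stat:.2f}")

# Reported p-values, compared with an assumed alpha of .05.
alpha = 0.05
p_values = {"Overall F": 0.116, "Length of Roads": 0.079, "Commercial Vehicles": 0.075}
significant = {name: p < alpha for name, p in p_values.items()}
for name, flag in significant.items():
    print(f"{name}: {'significant' if flag else 'not significant'} at alpha = {alpha}")
```

Under that assumed level, neither the overall model nor either predictor is significant, which is consistent with the small sample (seven countries) and the gap between R Square and Adjusted R Square.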
