00 voturi pozitive00 voturi negative

60 vizualizări71 paginiDec 21, 2010

© Attribution Non-Commercial (BY-NC)

PPT, PDF, TXT sau citiți online pe Scribd

Attribution Non-Commercial (BY-NC)

60 vizualizări

00 voturi pozitive00 voturi negative

Attribution Non-Commercial (BY-NC)

Sunteți pe pagina 1din 71

5th Edition

Chapter 13

Simple Linear Regression

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-1

Learning Objectives

To use regression analysis to predict the value of a

dependent variable based on an independent variable

The meaning of the regression coefficients b0 and b1

To evaluate the assumptions of regression analysis and

know what to do if the assumptions are violated

To make inferences about the slope and correlation

coefficient

To estimate mean values and predict individual values

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-2

Correlation vs. Regression

A scatter plot (or scatter diagram) can be

used to show the relationship between two

numerical variables

Correlation analysis is used to measure

strength of the association (linear

relationship) between two variables

Correlation is only concerned with strength of

the relationship

No causal effect is implied with correlation

Correlation was first presented in Chapter 3

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-3

Regression Analysis

Predict the value of a dependent variable based

on the value of at least one independent variable

Explain the impact of changes in an independent

variable on the dependent variable

Dependent variable: the variable you wish to explain

Independent variable: the variable used to explain

the dependent variable

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-4

Simple Linear Regression

Model

Only one independent variable, X

Relationship between X and Y is described

by a linear function

Changes in Y are related to changes in X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-5

Types of Relationships

Linear relationships Curvilinear relationships

Y Y

X X

Y Y

X X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-6

Types of Relationships

Strong relationships Weak relationships

Y Y

X X

Y Y

X X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-7

Types of Relationships

No relationship X

X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-8

The Linear Regression Model

Population Random

Population Independent Error

Slope

Y intercept Variable term

Coefficient

Dependent

Variable

Yi = β 0 + β1X i + ε i

Linear component Random Error

component

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-9

The Linear Regression Model

Y Yi = β 0 + β1X i + ε i

Observed Value

of Y for Xi

εi Slope = β1

Predicted Value Random Error

of Y for Xi for this Xi value

Intercept = β0

Xi X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-10

Linear Regression Equation

The simple linear regression equation provides an

estimate of the population regression line

Estimated (or

predicted) Y Estimate of the Estimate of the

value for regression regression slope

observation i intercept

Value of X for

Ŷi = b 0 + b1X i

observation i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-11

The Least Squares Method

between Y and : Ŷ

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-12

Finding the Least Squares

Equation

The coefficients b0 and b1 , and other regression

results in this chapter, will be found using Excel

those who are interested

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-13

Interpretation of the

Intercept and the Slope

b0 is the estimated mean value of Y when the

value of X is zero

Y for every one-unit change in X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-14

Linear Regression Example

relationship between the selling price of a

home and its size (measured in square feet)

Dependent variable (Y) = house price in

$1000s

Independent variable (X) = square feet

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-15

Linear Regression Example

Data

House Price in $1000s Square Feet

(Y) (X)

245 1400

312 1600

279 1700

308 1875

199 1100

219 1550

405 2350

324 2450

319 1425

255 1700

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-16

Linear Regression Example

Scatterplot

House price model: scatter plot

450

400

350

House Price ($1000s)

300

250

200

150

100

50

0

0 500 1000 1500 2000 2500 3000

Square Feet

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-17

Linear Regression Example

Using Excel

Tools

--------

Data Analysis

--------

Regression

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-18

Linear Regression Example

Excel Output

Regression Statistics

Multiple R 0.76211

The regression equation is:

R Square 0.58082 house price = 98.24833 + 0.10977 (square feet)

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA df SS MS F Significance F

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-19

Linear Regression Example

Graphical Representation

House price model: scatter plot and regression line

450

400

350 Slope

House Price ($1000s)

300 = 0.10977

250

200

150

100

Intercept 50

= 98.248 0

0 500 1000 1500 2000 2500 3000

Square Feet

house price = 98.24833 + 0.10977 (square feet)

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-20

Linear Regression Example

Interpretation of b0

house price = 98.24833 + 0.10977 (square feet)

X is zero (if X = 0 is in the range of observed X

values)

Because the square footage of the house cannot be 0,

the Y intercept has no practical application.

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-21

Linear Regression Example

Interpretation of b1

house price = 98.24833 + 0.10977 (square feet)

as a result of a one-unit change in X

Here, b1 = .10977 tells us that the mean value of a

house increases by .10977($1000) = $109.77, on

average, for each additional one square foot of size

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-22

Linear Regression Example

Making Predictions

Predict the price for a house with 2000 square feet:

= 98.25 + 0.1098(2000)

= 317.85

is 317.85($1,000s) = $317,850

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-23

Linear Regression Example

Making Predictions

When using a regression model for prediction, only

predict within the relevant range of data

Relevant range for

interpolation

450

400

350

House Price ($1000s)

300

Do not try to

250 extrapolate beyond

200 the range of

150 observed X’s

100

50

0

Statistics 0 500Using1000

for Managers Microsoft1500

Excel, 5e ©2000 2500 Inc.

2008 Prentice-Hall, 3000 Chap 13-24

Measures of Variation

Total variation is made up of two parts:

Total Sum of Regression Sum of Error Sum of

Squares Squares Squares

where:

Y = Mean value of the dependent variable

Yi = Observed values of the dependent variable

Ŷ = Predicted value of Y for the given X value

i i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-25

Measures of Variation

Measures the variation of the Yi values around

their mean Y

SSR = regression sum of squares

Explained variation attributable to the

relationship between X and Y

SSE = error sum of squares

Variation attributable to factors other than the

relationship between X and Y

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-26

Measures of Variation

Y

Yi ∧ ∧

SSE = ∑(Yi - Yi )2 Y

_

SST = ∑(Yi - Y)2

∧

Y ∧ _

_ SSR = ∑(Yi - Y)2 _

Y Y

Xi X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-27

Coefficient of Determination, r2

The coefficient of determination is the portion of

the total variation in the dependent variable that

is explained by variation in the independent

variable

The coefficient of determination is also called r-

squared and is denoted as r2

SSR regression sum of squares

r =

2

=

SST total sum of squares

0 ≤r ≤1 2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-28

Coefficient of Determination, r2

Y

r2 = 1

between X and Y:

X

r2 = -1

Y 100% of the variation in Y is

explained by variation in X

X

r =1

2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-29

Coefficient of Determination, r2

Y

0 < r2 < 1

between X and Y:

X

Some but not all of the variation

Y

in Y is explained by variation in

X

X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-30

Coefficient of Determination, r2

r2 = 0

Y

No linear relationship between X

and Y:

X (None of the variation in Y is

r2 = 0

explained by variation in X)

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-31

Linear Regression Example

Coefficient of Determination, r2

Regression Statistics

SSR 18934.9348

Multiple R 0.76211 r =2

= = 0.58082

R Square 0.58082 SST 32600.5000

Adjusted R Square 0.52842

Standard Error 41.33032

58.08% of the variation in house

Observations 10 prices is explained by variation in

square feet

ANOVA df SS MS F Significance F

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-32

Standard Error of Estimate

The standard deviation of the variation of observations

around the regression line is estimated by

SSE ∑ i i

(Y − Yˆ ) 2

SYX = = i =1

n−2 n−2

Where

SSE = error sum of squares

n = sample size

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-33

Linear Regression Example

Standard Error of Estimate

Regression Statistics

Multiple R

R Square

0.76211

0.58082

S YX = 41.33032

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA df SS MS F Significance F

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-34

Comparing Standard Errors

SYX is a measure of the variation of observed

Y values from the regression line

Y Y

small s YX X large s YX X

the size of the Y values in the sample data

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-35

Assumptions of Regression

L.I.N.E

Linearity

The relationship between X and Y is linear

Independence of Errors

Error values are statistically independent

Normality of Error

Error values are normally distributed for any given value

of X

Equal Variance (also called homoscedasticity)

The probability distribution of the errors has constant

variance

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-36

Residual Analysis

ei = Yi − Yˆi

The residual for observation i, ei, is the difference between its

observed and predicted value

Check the assumptions of regression by examining the residuals

Examine for Linearity assumption

Evaluate Independence assumption

Evaluate Normal distribution assumption

Examine Equal variance for all levels of X

Graphical Analysis of Residuals

Can plot residuals vs. X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-37

Residual Analysis for Linearity

Y Y

x x

residuals

x residuals x

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-38

Residual Analysis for

Independence

Not Independent Independent

residuals

residuals

X

residuals

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-39

Checking for Normality

Residuals

Examine the Box-and-Whisker Plot of the

Residuals

Examine the Histogram of the Residuals

Construct a Normal Probability Plot of the

Residuals

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-40

Residual Analysis for

Equal Variance

Y Y

x x

residuals

x residuals x

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-41

Linear Regression Example

Excel Residual Output

RESIDUAL OUTPUT House Price Model Residual Plot

Predicted House Residuals

Price 80

1 251.92316 -6.923162 60

2 273.87671 38.12329 40

Residuals

3 284.85348 -5.853484 20

4 304.06284 3.937162 0

5 218.99284 -19.99284 0 1000 2000 3000

-20

6 268.38832 -49.38832

-40

7 356.20251 48.79749

-60

8 367.17929 -43.17929

Square Feet

9 254.6674 64.33264

10 284.85348 -29.85348 Does not appear to violate

any regression assumptions

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-42

Measuring Autocorrelation:

The Durbin-Watson Statistic

Used when data are collected over time to

detect if autocorrelation is present

Autocorrelation exists if residuals in one

time period are related to residuals in another

period

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-43

Autocorrelation

Autocorrelation is correlation of the errors (residuals) over time

15

10

Here, residuals suggest a Residuals 5

cyclic pattern, not 0

random -5 0 2 4 6 8

-10

-15

Time (t)

statistically independent

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-44

The Durbin-Watson Statistic

The Durbin-Watson statistic is used to test for autocorrelation

H1: autocorrelation is present

∑ (e − e i i −1 ) 2

D= i=2

n

∑i

e 2

i =1

D less than 2 may signal positive

autocorrelation, D greater than 2 may

signal negative autocorrelation

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-45

The Durbin-Watson Statistic

H0: positive autocorrelation does not exist

H1: positive autocorrelation is present

Calculate the Durbin-Watson test statistic = D

(The Durbin-Watson Statistic can be found using Excel)

Find the values dL and dU from the Durbin-Watson table

(for sample size n and number of independent variables k)

0 dL dU 2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-46

The Durbin-Watson Statistic

Example with n = 25: 160

140

Excel output:

120

Durbin-Watson Calculations

100

Sum of Squared Difference 3296.18

y = 30.65 + 4.7038x

Sales

of Residuals 80

2

60 R = 0.8976

Sum of Squared Residuals 3279.98

40

Durbin-Watson Statistic 1.00494

20

0

0 5 10 15 20 25 30

n Tim e

∑(e i − ei−1 ) 2

3296.18

D= i=2

n

= = 1.00494

3279.98

∑ei

2

i=1

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-47

The Durbin-Watson Statistic

Here, n = 25 and there is k = 1 independent variable

Using the Durbin-Watson table, dL = 1.29 and dU = 1.45

D = 1.00494 < dL = 1.29, so reject H0 and conclude that

significant positive autocorrelation exists

Therefore the linear model is not the appropriate model to

predict sales

Decision: reject H0 since

D = 1.00494 < dL

0 dL=1.29 dU=1.45 2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-48

Inferences About the Slope:

t Test

t test for a population slope

Is there a linear relationship between X and Y?

Null and alternative hypotheses

H0 : β 1 = 0 (no linear relationship)

H1 : β 1 ≠ 0 (linear relationship does exist)

Test statistic where:

b1 − β1

t= b1 = regression slope

coefficient

Sb1 β1 = hypothesized slope

error of the slope

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-49

Inferences About the Slope:

t Test Example

House Price in Square Feet Estimated Regression Equation:

$1000s (x)

(y)

house price = 98.25 + 0.1098 (sq.ft.)

245 1400

312 1600

279 1700

308 1875 The slope of this model is 0.1098

199 1100

219 1550 Is there a relationship between the

405 2350 square footage of the house and its

324 2450 sales price?

319 1425

255 1700

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-50

Inferences About the Slope:

t Test Example

b1 Sb1

H0: β1 = 0 From Excel output:

Coefficients Standard Error t Stat P-value

H1: β1 ≠ 0

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

b1 − β1 0.10977 − 0

t= = t = 3.32938

Sb1 0.03297

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-51

Inferences About the Slope:

t Test Example

Test Statistic: t = 3.329 H0: β1 = 0

H1: β1 ≠ 0

d.f. = 10- 2 = 8

α /2=.025 α /2=.025

Decision: Reject H0

Reject H0

-tα/2

Do not reject H0

tα/2 0 Reject H that square footage affects

0

-2.3060 2.3060 3.329 house price

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-52

Inferences About the Slope:

t Test Example

From Excel output: P-Value

Coefficients Standard Error t Stat P-value

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

H0: β1 = 0

H1: β1 ≠ 0

Decision: Reject H0, since p-value < α

square footage affects house price.

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-53

F-Test for Significance

F Test statistic:

MSR

F=

MSE

where

SSR

MSR =

k

SSE

MSE =

n − k −1

and (n - k - 1) denominator degrees of freedom

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-54

F-Test for Significance

Excel Output

Regression Statistics

Multiple R 0.76211

MSR 18934.9348

R Square 0.58082

F= = = 11.0848

Adjusted R Square 0.52842 MSE 1708.1957

Standard Error 41.33032

Observations 10 With 1 and 8 degrees P-value for

of freedom the F-Test

ANOVA df SS MS F Significance F

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-55

F-Test for Significance

H 0 : β1 = 0 Test Statistic:

H 1 : β1 ≠ 0 MSR

F= = 11.08

α = .05 MSE

df1= 1 df2 = 8

Decision:

Critical Value: Reject H0 at α =

0.05

Fα = 5.32

α =. Conclusion:

05 There is sufficient evidence that

0 Do not Reject H0

F house size affects selling price

reject H0 F.05 = 5.32

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-56

Confidence Interval Estimate

for the Slope

Confidence Interval Estimate of the Slope:

b1 ± t n−2Sb1 d.f. = n - 2

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

interval for the slope is (0.0337, 0.1858)

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-57

Confidence Interval Estimate

for the Slope

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

are 95% confident that the mean change in sales price is

between $33.74 and $185.80 per square foot of house size

Conclusion: There is a significant relationship between house price and

square feet at the .05 level of significance

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-58

t Test for a Correlation Coefficient

Hypotheses

H0: ρ = 0 (no correlation between X and Y)

H1: ρ ≠ 0 (correlation exists)

Test statistic

r - ρ of freedom)

(with n – 2 degrees

t=

1− r 2 where

r = + r 2 if b1 > 0

n−2

r = − r 2 if b1 < 0

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-59

t Test for a Correlation Coefficient

between square feet and house price at the .05

level of significance?

H0: ρ = 0 (No correlation)

H1: ρ ≠ 0 (correlation exists)

α =.05 , df = 10 - 2 = 8

r −ρ .762 − 0

t= = = 3.329

1− r2 1 − .762 2

n−2 10 − 2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-60

t Test for a Correlation Coefficient

d.f. = 10- 2 = 8

Decision:

Reject H0

α /2=.025 α /2=.025

Conclusion:

There is evidence

Reject H0 Do not reject H0 Reject H0

of a linear

-tα/2 0

tα/2 association at the

-2.3060 2.3060 5% level of

3.329

significance

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-61

Estimating Mean Values and

Predicting Individual Values

Goal: Form intervals around Y to express

uncertainty about the value of Y for a given Xi

Confidence

Interval for

the mean of Y ∧

Y

Y, given Xi

∧

Y = b0+b1Xi

individual Y, given Xi

Xi X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-62

Confidence Interval for

the Average Y, Given X

Confidence interval estimate for the mean value

of Y given a particular Xi

Confidence interval for μY|X = Xi :

Ŷ ± t n−2S YX hi

Size of interval varies according to

distance away from mean, X

1 (X i − X) 2 1 (X i − X) 2

hi = + = +

n SSX n ∑ (X i − X) 2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-63

Prediction Interval for

an Individual Y, Given X

Prediction interval estimate for an individual value

of Y given a particular Xi

Yˆ ± t n − 2SYX 1 + hi

added uncertainty for an individual case

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-64

Estimation of Mean Values:

Example

Confidence Interval Estimate for μY|X=X

i

Find the 95% confidence interval for the mean price of 2,000

square-foot houses

∧

Predicted Price Yi = 317.85 ($1,000s)

1 (Xi − X)2

Ŷ ± t n-2S YX + = 317.85 ± 37.12

n ∑ (Xi − X) 2

or from $280,660 to $354,900

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-65

Estimation of Individual

Values: Example

Prediction Interval Estimate for YX=X

i

2,000 square feet

∧

Predicted Price Yi = 317.85 ($1,000s)

1 (Xi − X)2

Ŷ ± t n-1S YX 1+ + = 317.85 ± 102.28

n ∑ (Xi − X) 2

from $215,500 to $420,070

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-66

Pitfalls of Regression Analysis

underlying least-squares regression

Not knowing how to evaluate the assumptions

Not knowing the alternatives to least-squares

regression if a particular assumption is violated

Using a regression model without knowledge of

the subject matter

Extrapolating outside the relevant range

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-67

Strategies for Avoiding

the Pitfalls of Regression

Start with a scatter plot of X on Y to observe

possible relationship

Perform residual analysis to check the

assumptions

Plot the residuals vs. X to check for violations

of assumptions such as equal variance

Use a histogram, stem-and-leaf display, box-

and-whisker plot, or normal probability plot of

the residuals to uncover possible non-normality

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-68

Strategies for Avoiding

the Pitfalls of Regression

If there is violation of any assumption, use

alternative methods or models

If there is no evidence of assumption

violation, then test for the significance of the

regression coefficients and construct

confidence intervals and prediction intervals

Avoid making predictions or forecasts

outside the relevant range

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-69

Chapter Summary

In this chapter, we have

Introduced types of regression models

Reviewed assumptions of regression and

correlation

Discussed determining the simple linear

regression equation

Described measures of variation

Discussed residual analysis

Addressed measuring autocorrelation

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-70

Chapter Summary

In this chapter, we have

Discussed correlation -- measuring the

strength of the association

Addressed estimation of mean values and

prediction of individual values

Discussed possible pitfalls in regression and

recommended strategies to avoid them

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-71

## Mult mai mult decât documente.

Descoperiți tot ce are Scribd de oferit, inclusiv cărți și cărți audio de la editori majori.

Anulați oricând.