
The Excel spreadsheet housedata.xls contains data on the sales of 950 single-family homes in Springfield, MA.

We wish to explain and predict the price of a single-family home (Y, in thousands of dollars) using the
following predictor variables:

Data Description

Variable Name   Description                               House of interest
s_p             Sale price in dollars                     ?
inv             Sale date inventory of homes on market    100
bath            Number of bathrooms                       2
ltsz            Lot size in acres                         0.25
hssz            Sq. ft. of living area                    1200
bsemt           1 if basement, 0 otherwise                0
a_c             1 if central a/c, 0 otherwise             1
f_place         1 if fireplace, 0 otherwise               0
garsz_a         1 if garage, 0 otherwise                  1
dinsp           1 if dining space, 0 otherwise            1
dw              1 if dishwasher, 0 otherwise              1
dr              1 if dining room, 0 otherwise             0
fr              1 if family room, 0 otherwise             0
age5            1 if age <= 5 yrs, 0 otherwise            1
stl10           1 if 1 story house, 0 otherwise           1
bdrms           Number of bedrooms                        4

1) Build a regression model to predict the selling price for a home. Explain your thinking and your analytical
process concisely but clearly, using specific excerpts from your data analysis where appropriate (Time
saving hint: Mark them up by hand, and staple them, in a logical order). Be sure to discuss any additional
steps you would like to perform if you had more time for your analysis (and why those steps would be
important).
2) What is your BEST, MOST COMPLETE answer to what the house of interest listed above will cost (i.e.,
point and interval)?












Model 1:

Excel Output for Regression Analysis:
SUMMARY OUTPUT

Regression Statistics

Multiple R          0.772837797
R Square            0.597278261
Adjusted R Square   0.591248203
Standard Error      18649.3224
Observations        950

ANOVA

             df     SS          MS         F         Significance F
Regression   14     4.82E+11    3.4E+10    9.9E+01   1.52E-173
Residual     935    3.25E+11    3.5E+08
Total        949    8.07E+11

            Coefficients   Standard Error   t Stat    P-value   Lower 95%     Upper 95%
Intercept   -3024.133      5204.164         -0.581    0.561     -13237.327    7189.061
inv         -15.139        11.826           -1.280    0.201     -38.348       8.070
bath        11374.445      1121.711         10.140    0.000     9173.082      13575.807
ltsz        18524.424      1529.900         12.108    0.000     15521.988     21526.860
hssz        12.469         2.318            5.378     0.000     7.919         17.019
bsemt       9093.475       2556.131         3.558     0.000     4077.057      14109.893
a_c         1191.235       1797.291         0.663     0.508     -2335.957     4718.426
f_place     9252.545       1490.819         6.206     0.000     6326.805      12178.284
garsz_a     2884.968       3268.070         0.883     0.378     -3528.633     9298.570
dw          1204.861       1895.385         0.636     0.525     -2514.840     4924.563
dr          5670.059       1575.809         3.598     0.000     2577.526      8762.591
fr          -1032.413      1710.942         -0.603    0.546     -4390.144     2325.319
age5        11582.990      1771.466         6.539     0.000     8106.480      15059.499
stl10       -2911.433      1332.341         -2.185    0.029     -5526.158     -296.708
bdrms       6984.691       1056.320         6.612     0.000     4911.659      9057.723
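
Each t Stat in the first regression output is the coefficient divided by its standard error. The following Python sketch recomputes those ratios and lists the predictors that fall short of the usual 5% threshold. The cutoff of 1.96 is an assumption (the large-sample two-sided value, close to the exact t value for 935 residual degrees of freedom), and the names `model1`, `T_CRIT`, and `weak` are illustrative only.

```python
# (coefficient, standard error) pairs copied from the first Excel output.
model1 = {
    "inv": (-15.139, 11.826),
    "bath": (11374.445, 1121.711),
    "ltsz": (18524.424, 1529.900),
    "hssz": (12.469, 2.318),
    "bsemt": (9093.475, 2556.131),
    "a_c": (1191.235, 1797.291),
    "f_place": (9252.545, 1490.819),
    "garsz_a": (2884.968, 3268.070),
    "dw": (1204.861, 1895.385),
    "dr": (5670.059, 1575.809),
    "fr": (-1032.413, 1710.942),
    "age5": (11582.990, 1771.466),
    "stl10": (-2911.433, 1332.341),
    "bdrms": (6984.691, 1056.320),
}

# Assumed cutoff: approximate two-sided 5% critical value for large samples.
T_CRIT = 1.96


def t_stat(coef, se):
    """t statistic for testing that a single coefficient equals zero."""
    return coef / se


# Predictors whose |t| is below the cutoff are not significant at about 5%.
weak = [name for name, (b, se) in model1.items() if abs(t_stat(b, se)) < T_CRIT]
print(weak)
```

The predictors flagged this way are the ones that do not appear in the second, smaller model.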





The required regression model to predict the selling price for a home is as below:

Selling Price = -3024.133 - 15.139 * inv + 11374.445 * bath + 18524.424 * ltsz + 12.469 * hssz
+ 9093.475 * bsemt + 1191.235 * a_c + 9252.545 * f_place + 2884.968 * garsz_a
+ 1204.861 * dw + 5670.059 * dr - 1032.413 * fr + 11582.990 * age5 - 2911.433 * stl10
+ 6984.691 * bdrms

The overall model is highly significant (F = 99, Significance F = 1.52E-173), with a coefficient of
determination of 0.5973, which indicates that the model explains 59.73% of the variation in the selling price.
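
The adjusted R Square and the overall F statistic both follow from R Square, the number of observations, and the number of predictors. A minimal Python sketch of those two textbook formulas, using the values reported in the first output (the function names are illustrative, not taken from Excel):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """F statistic for the null hypothesis that all k slopes are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values taken from the first regression output.
r2, n, k = 0.597278261, 950, 14

adj = adjusted_r2(r2, n, k)
f_value = overall_f(r2, n, k)
print(adj, f_value)
```

Both results should agree with the reported Adjusted R Square of 0.591248 and the F value of about 99.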

Using the above model, the point estimate for the price of the house of interest is:

Selling Price = 79696.48

Evaluating the model at the lower and upper 95% confidence limits of the regression coefficients
from the above output gives the following interval for the estimate:

Selling Price = (28505.43, 130887.53)
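
The arithmetic behind these figures can be retraced with a short Python sketch. It plugs the house of interest into the first model three times: once with the coefficients, once with their lower 95% limits, and once with their upper 95% limits. The coefficients are the rounded values printed in the output, so the results differ from the figures above by less than a dollar. The names `coefs`, `house`, and `evaluate` are illustrative.

```python
# (coefficient, lower 95%, upper 95%) copied from the first Excel output.
coefs = {
    "Intercept": (-3024.133, -13237.327, 7189.061),
    "inv": (-15.139, -38.348, 8.070),
    "bath": (11374.445, 9173.082, 13575.807),
    "ltsz": (18524.424, 15521.988, 21526.860),
    "hssz": (12.469, 7.919, 17.019),
    "bsemt": (9093.475, 4077.057, 14109.893),
    "a_c": (1191.235, -2335.957, 4718.426),
    "f_place": (9252.545, 6326.805, 12178.284),
    "garsz_a": (2884.968, -3528.633, 9298.570),
    "dw": (1204.861, -2514.840, 4924.563),
    "dr": (5670.059, 2577.526, 8762.591),
    "fr": (-1032.413, -4390.144, 2325.319),
    "age5": (11582.990, 8106.480, 15059.499),
    "stl10": (-2911.433, -5526.158, -296.708),
    "bdrms": (6984.691, 4911.659, 9057.723),
}

# House of interest from the data description (Intercept is multiplied by 1).
house = {
    "Intercept": 1, "inv": 100, "bath": 2, "ltsz": 0.25, "hssz": 1200,
    "bsemt": 0, "a_c": 1, "f_place": 0, "garsz_a": 1, "dw": 1,
    "dr": 0, "fr": 0, "age5": 1, "stl10": 1, "bdrms": 4,
}


def evaluate(column):
    """Sum of coefficient * value, using the given column of `coefs`."""
    return sum(coefs[name][column] * house[name] for name in coefs)


point = evaluate(0)  # fitted price
low = evaluate(1)    # every coefficient at its lower 95% limit
high = evaluate(2)   # every coefficient at its upper 95% limit
print(point, low, high)
```

Because every predictor value for this house is non-negative, the lower limits give the smallest total and the upper limits the largest.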

























Model 2:

SUMMARY OUTPUT

Regression Statistics

Multiple R          0.771816
R Square            0.5957
Adjusted R Square   0.591829
Standard Error      18636.06
Observations        950

ANOVA

             df     SS          MS          F          Significance F
Regression   9      4.81E+11    5.34E+10    153.8899   4.6E-178
Residual     940    3.26E+11    3.47E+08
Total        949    8.07E+11

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   -2181.5        4253.8           -0.5     0.6       -10529.5    6166.4
bath        11397.6        1110.3           10.3     0.0       9218.7      13576.5
ltsz        18359.7        1524.0           12.0     0.0       15368.9     21350.5
hssz        12.7           2.3              5.6      0.0       8.2         17.3
bsemt       9191.8         2515.2           3.7      0.0       4255.7      14127.8
f_place     9407.3         1434.8           6.6      0.0       6591.6      12223.0
dr          5913.3         1550.3           3.8      0.0       2870.9      8955.6
age5        12058.4        1734.8           7.0      0.0       8653.9      15462.8
stl10       -2963.8        1324.8           -2.2     0.0       -5563.8     -363.9
bdrms       7055.2         1035.4           6.8      0.0       5023.2      9087.2


The regression equation for this reduced model is:

s_p = - 2182 + 11398 bath + 18360 ltsz + 12.7 hssz + 9192 bsemt + 9407 f_place
+ 5913 dr + 12058 age5 - 2964 stl10 + 7055 bdrms

The overall model is highly significant (F = 153.89, Significance F = 4.6E-178), with a coefficient of
determination of 0.5957 (adjusted 0.5918), which indicates that the model explains 59.57% of the variation
in the selling price.
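
One further check, which is not part of the original output, is a partial F test of whether the five predictors dropped from the first model (inv, a_c, garsz_a, dw, fr) added any real explanatory power. The Python sketch below uses the standard R Square form of that test. The R Square of the reduced model is only reported to four decimals, so the result is approximate, and the function name is illustrative.

```python
def partial_f(r2_full, r2_reduced, n, k_full, q):
    """F statistic for dropping q predictors from a model with k_full predictors."""
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# R Square values from the two outputs; 14 predictors in the full model, 5 dropped.
f_drop = partial_f(0.597278261, 0.5957, n=950, k_full=14, q=5)
print(f_drop)
```

The statistic comes out below 1, while the 5% critical value of F with 5 and 935 degrees of freedom is roughly 2.2, which would support using the smaller model.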

Using the above model, the point estimate for the price of the house of interest is:

Selling Price = 77757.5

Evaluating the model at the lower and upper 95% confidence limits of the regression coefficients
from the above output gives the following interval for the estimate:

Selling Price = (44830.5, 110805.2)
