REGRESSION CASE
QUANTITATIVE METHODS II
TO
BY
Table of Contents
S.No  Particulars
1.    Executive Summary
2.    Understanding of the Problem
3.    Model Description
         Model 1
            Prediction interval Vs Confidence Interval
            Step wise Regression: A closer look
            Test of Model: Analysis of Results
         Model 2
            Test of Model: Analysis of Results
         Other Models
4.
5.
         Variables Entered/Removed
         Model Summary
         ANOVA
         Coefficients
         Residual Statistics
Executive Summary
Reyem Affiar has recently found the condominium described below in Mid-Cambridge, which he wants to
purchase.
Street Address : 236 Ellery Street
Last Price     : $169000
Area           : M/9
Bed            : 2
Bath           : 1
Rooms          : 5
Interior       : 1040
Condo          : $175
Tax            : $1121
RC             :
Even though Affiar is financially able to pay the asking price of $169000, negotiation by the buyer's
agent generally keeps the selling price below the last asking price. Given the above information and
the data that Reyem Affiar has on condominiums sold in Cambridge over the past five years, we need to
help him decide on a fair offer price.
Solution Approach
An estimate of the selling price of the above condominium needs to be made, so selling price is the
dependent variable Y of the regression model. First date, close date and the number of days between
the two (Days) cannot be part of the independent variable set, since this information is not yet
available for the 236 Ellery Street condominium (the sale has not taken place). Further, the
condominium of interest lies in area M (9), so one could analyze only the data on the 111 condominiums
from the same area and ignore the rest. On the other hand, if we set up dummy variables for the area
codes, these can be incorporated into the regression model, giving a larger sample of 456 data points
and potentially a more accurate prediction for Affiar. This is explained in detail in the model
description. Stepwise regression in SPSS has been adopted for variable selection. This method combines
forward selection and backward elimination, and it helps limit the errors that multi-collinearity can
introduce into a regression model.
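The dummy-variable idea described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the report: the function name `encode_area_dummies` is hypothetical, and the assumption that area codes run from 1 to 16 with Mid-Cambridge (code 9) as the reference category is inferred from the variable names A1 to A8 and A10 to A16 in the output.

```python
# Minimal sketch of 0/1 dummy coding for area codes (names are illustrative).
AREA_CODES = list(range(1, 17))  # assumed codes 1..16
REFERENCE = 9                    # Mid-Cambridge, left out as the reference category


def encode_area_dummies(area_code, area_codes=AREA_CODES, reference=REFERENCE):
    """Return one 0/1 indicator per non-reference area code."""
    return {f"A{c}": int(c == area_code) for c in area_codes if c != reference}


# The Ellery Street condominium is in area 9, so every dummy is 0.
ellery = encode_area_dummies(9)
# A condominium in East Cambridge (area 5) has A5 = 1 and all other dummies 0.
east_cambridge = encode_area_dummies(5)
```

Leaving out one category avoids perfect collinearity between the dummies and the constant term; the reference area is then represented by all dummies being 0.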
The listing's asking price of $169,000 is treated as the Last Price. This means we do not have
information on the first price for the Ellery Street condominium, so we remove first price from our
list of possible independent variables. As stated before in section 1.1, we cannot use the number of
days between first and last date as an independent variable either, since the sale of the condominium
has not happened and we do not know the first date on which it was put on sale. Finally, we can
intuitively see that there will be a positive correlation between interior space and the number of
rooms, bathrooms and bedrooms. Since interior space is representative of all three, it can act as a
good proxy for them in our regression model and so avoid the issue of multi-collinearity. We will also
show this through the output generated in the model description section. Further, one can expect last
price and interior space to have positive coefficients, while condominium fees, property taxes and RC
are expected to have negative coefficients. The effect of the dummy variables for the area codes needs
to be explored by running the regression model.
We start with a baseline regression model and check it for normality and linearity. If it fails these
checks, we transform the variables using log, square root or inverse transformations, rerun the
regression with the transformed variables and look for outliers. Any outliers found are removed and
the regression is run again. We then check the residuals for normality and homoscedasticity. If the
R square value increases by at least 2% compared to the baseline regression, we go with the regression
model with transformed variables; otherwise we go with the baseline model and note the cautions
regarding non-normality etc.
Model Description
Model 1
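Baseline regression model:
Sale Price = -15967.736 + 0.333 * LastPrice + 35.947 * Tax + 44.967 * Interior + 105.108 * Condo
             - 27984.595 * A12 + 29804.817 * A5 + 10992.327 * RC - 12447.291 * A16 + 12290.704 * A2
where A2, A5, A12 and A16 are the dummy variables associated with the areas Avon Hill, East Cambridge,
Porter Square and West Cambridge respectively. Each takes the value 1 if the condominium whose price
we are predicting lies in that area and 0 otherwise. For the 236 Ellery Street apartment, we have
Sale Price = 156757.758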
95% prediction interval for the Selling price of 236 Ellery Street Condominium is given by:
= {72630.188, 240885.328}
The standard error and MSE are taken from the regression output table (Appendix).
Now, a 95% Confidence Interval for the Selling Price (conditional mean) of 236 Ellery Street
Condominium would be given by:
= 156757.758 ± t[0.025,(456-10)] * (4021)
= 156757.758 ± 1.9653 * (4021)
= 156757.758 ± 7902.471
= {148855.29, 164660.23}
The standard error of mean predicted value is taken from the Residual Statistics table (Appendix).
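The point estimate and confidence interval above can be checked with a short sketch. The coefficients are read from Exhibit 1, with the signs of the constant, A12 and A16 inferred from their t-statistics and confidence bounds. RC = 1 for the Ellery Street condominium is an assumption, chosen because it reproduces the reported 156757.758. The prediction half-width uses the textbook form t * sqrt(s^2 + SE_mean^2) for comparison only; it need not reproduce the prediction interval reported above if a different standard-error term was used there.

```python
import math

# Baseline (Model 9) coefficients from Exhibit 1; signs inferred where the sign was lost.
COEF = {
    "LastPrice": 0.333, "Tax": 35.947, "Interior": 44.967, "Condo": 105.108,
    "A12": -27984.595, "A5": 29804.817, "RC": 10992.327,
    "A16": -12447.291, "A2": 12290.704,
}
CONSTANT = -15967.736

# Ellery Street inputs; RC = 1 is an assumption, area dummies are 0 (Mid-Cambridge).
ellery = {"LastPrice": 169000, "Tax": 1121, "Interior": 1040, "Condo": 175,
          "A12": 0, "A5": 0, "RC": 1, "A16": 0, "A2": 0}

y_hat = CONSTANT + sum(COEF[name] * value for name, value in ellery.items())

T_CRIT = 1.9653      # t[0.025, 446] as used in the report
SE_MEAN = 4021.0     # standard error of the mean predicted value
S = 30268.70125      # standard error of the estimate

# Confidence interval for the conditional mean.
ci = (y_hat - T_CRIT * SE_MEAN, y_hat + T_CRIT * SE_MEAN)

# Textbook prediction interval for a single new observation.
pi_half_width = T_CRIT * math.sqrt(S ** 2 + SE_MEAN ** 2)
pi = (y_hat - pi_half_width, y_hat + pi_half_width)
```

The prediction interval is always wider than the confidence interval because it adds the variance of an individual observation to the variance of the estimated mean.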
Exhibit 1: Regression Model Coefficients
Coefficients (Model 9)
B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, Lower and
Upper Bound give the 95% confidence interval for B, and Tolerance and VIF are the collinearity
statistics.
Variable     B            Std. Error  Beta   t       Sig.  Lower Bound  Upper Bound  Tolerance  VIF
(Constant)   -15967.736   5913.780           -2.700  .007  -27590.071   -4345.402
LastPrice    .333         .023        .403   14.763  .000  .289         .377         .335       2.988
Tax          35.947       3.136       .364   11.462  .000  29.783       42.110       .248       4.035
Interior     44.967       5.554       .173   8.097   .000  34.052       55.882       .549       1.821
Condo        105.108      21.268      .127   4.942   .000  63.311       146.906      .380       2.629
A12          -27984.595   8366.791    -.056  -3.345                                  .902       1.108
A5           29804.817    6552.903    .084   4.548                                   .738       1.354
RC           10992.327    3445.556    .059   3.190   .002  4220.785     17763.869    .726       1.378
A16          -12447.291   5480.634    -.037  -2.271  .024  -23218.366   -1676.216    .944       1.059
A2           12290.704    5486.742    .036   2.240   .026  1507.625     23073.784    .967       1.034
Let us check whether the model's regression assumptions are satisfied through residual analysis:
Complete stepwise multiple regression analysis: sample size
Since the number of cases is 456 and the number of independent variables is 9, the ratio is 50.67 to 1,
which satisfies the criterion of 50 to 1.
The dependent variable (Sale Price) does not pass the normality test, so Sale Price is transformed to
Log(Sale Price) so that it more closely follows a normal distribution.
Tax
Tax does not follow a normal distribution.
Interior
Interior does not follow a normal distribution.
Condo
Condo does not follow a normal distribution.
Last Price
Last Price also does not follow a normal distribution.
Since all the other variables are ordinal or categorical, we do not test them for normality.
Test for Linearity
Because the variables have been transformed, a separate test for linearity was not carried out.
TEST for OUTLIERS
To detect outliers, we reran the regression (EXHIBIT B) with the transformed variables and checked the
casewise diagnostics. The result is shown below.
Casewise Diagnostics
Case Number  Std. Residual  LOG_SALEPRICE  Predicted Value  Residual
59           4.446          13.68          13.1891          .49288
217          3.660          11.03          10.6291          .40575
305          3.420          13.33          12.9502          .37916
306          3.162          13.35          12.9950          .35051
360          -8.181         11.73          12.6349          -.90689
408          -3.276         11.17          11.5336          -.36320
The standardized residuals of the above cases lie outside the range -3 to +3, so these cases are
outliers. We deleted them and reran the regression.
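The outlier rule used here (a standardized residual beyond plus or minus 3) can be sketched as a small filter. The function name `flag_outliers` is hypothetical; the residual values are the ones listed in the casewise diagnostics, plus one made-up in-range case (case 1) added only to show that it is not flagged.

```python
# Standardized residuals by case number; case 1 is a made-up in-range example.
std_residuals = {
    1: 0.750,
    59: 4.446, 217: 3.660, 305: 3.420,
    306: 3.162, 360: -8.181, 408: -3.276,
}


def flag_outliers(residuals, threshold=3.0):
    """Return the case numbers whose standardized residual lies outside +/- threshold."""
    return sorted(case for case, r in residuals.items() if abs(r) > threshold)


outlier_cases = flag_outliers(std_residuals)
kept_cases = [case for case in std_residuals if case not in outlier_cases]
```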
Complete stepwise multiple regression analysis: assumption of independence of errors
The Durbin-Watson value is 1.649, which lies between 1.5 and 2.5, so the errors can be treated as
independent.
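For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it on made-up residual sequences (the report's residuals are not reproduced here) and applies the 1.5 to 2.5 rule of thumb used above; both function names are illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def errors_look_independent(dw, low=1.5, high=2.5):
    """Rule of thumb from the text: values between 1.5 and 2.5 are acceptable."""
    return low <= dw <= high


# Made-up sequences: alternating signs push the statistic up, constant residuals give 0.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
reported_ok = errors_look_independent(1.649)
```

Values near 2 indicate little first-order autocorrelation, values near 0 indicate positive autocorrelation and values near 4 indicate negative autocorrelation.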
Stepwise regression, with entry and removal thresholds of Pin = 0.05 and Pout = 0.10 applied at each
stage, left Beds, Rooms and Bath out of the model because they were collinear with the retained
variables. The VIF values of the retained variables are all below 5.
R square value = 0.949 and adjusted R square = 0.948, which is more than 2% higher than the baseline
regression R square value of 0.889. Hence we go with the model with transformed variables
and outliers removed. The six outliers identified in the casewise diagnostics were removed.
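The model-selection rule stated earlier (keep the transformed model only if R square improves by at least 2%) can be written as a one-line check. Reading "2%" as 0.02 in absolute R square terms is an assumption; with the reported values the conclusion is the same under a relative reading. The function name is illustrative.

```python
def prefer_transformed(r2_baseline, r2_transformed, min_gain=0.02):
    """True if the transformed model improves R square by at least min_gain (absolute)."""
    return (r2_transformed - r2_baseline) >= min_gain


# Reported values: baseline R square 0.889, transformed model R square 0.949.
use_transformed = prefer_transformed(0.889, 0.949)
```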
All log transformations use base e (natural log).
Coefficients (Model 5)
B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, Lower and
Upper Bound give the confidence interval for B, and Tolerance and VIF are the collinearity statistics.
Variable          B            Std. Error  Beta   t        Sig.  Lower Bound  Upper Bound  Tolerance  VIF
(Constant)        12.059       .178               67.598   .000  11.708       12.409
INVERSE_LASTRICE  -133487.069  3284.282    -.808  -40.644  .000  -139941.779  -127032.360  .294       3.403
SQRT_TAX          .004         .001                              .003         .006         .324       3.089
A5                .119         .019                              .082         .155         .869       1.151
LOG_CONDO         .034         .011                              .012         .057         .596       1.677
LOG_INTERIOR      .062         .021                              .020         .104         .368       2.720
where A5 is the dummy variable associated with East Cambridge. It takes the value 1 if the condominium
whose price we are predicting lies in that area and 0 otherwise. For the 236 Ellery Street apartment,
we have
Log(Sale Price) = 12.059 - 133487.069 * (1 / 169000) + .004 * SQRT(1121) + .119 * 0 + .034 * Log(175)
                  + .062 * Log(1040) = 12.0094 (approximately)
Sale Price = exp(Log(Sale Price)) = 164288.0015
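The arithmetic of this prediction can be checked with a short sketch. The model predicts the natural log of the sale price, so the result must be exponentiated to get a dollar figure. The function names are hypothetical; the coefficients are the rounded values from the Model 5 coefficients table, so the result matches the reported figure only to within rounding. The helper `back_transform_interval` is a generic illustration of how a log-scale interval maps to dollars and is shown with a made-up half-width, not a figure from the report.

```python
import math


def predict_log_price(last_price, tax, condo, interior, a5):
    """Model 5 prediction on the natural-log scale (coefficients as reported, rounded)."""
    return (12.059
            - 133487.069 * (1.0 / last_price)
            + 0.004 * math.sqrt(tax)
            + 0.119 * a5
            + 0.034 * math.log(condo)
            + 0.062 * math.log(interior))


def back_transform_interval(log_center, log_half_width):
    """Map an interval computed on the log scale back to the original (dollar) scale."""
    return (math.exp(log_center - log_half_width),
            math.exp(log_center + log_half_width))


log_price = predict_log_price(last_price=169000, tax=1121, condo=175,
                              interior=1040, a5=0)
price = math.exp(log_price)

# Illustration only: 0.05 is a made-up log-scale half-width.
example_interval = back_transform_interval(log_price, 0.05)
```

Because the exponential is nonlinear, an interval that is symmetric on the log scale becomes asymmetric around the point prediction in dollars.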
95% prediction interval for the Selling price of 236 Ellery Street Condominium is given by:
The standard error and MSE are taken from the regression output table (Appendix).
Now, a 95% Confidence Interval for the Selling Price (conditional mean) of 236 Ellery Street
Condominium would be given by:
Lastly, homoscedasticity can be seen from the residual scatter plot, where the residuals are scattered
around the mean of 0 in a random fashion with no observable pattern of heteroscedasticity.
A high Adjusted R2 value of 0.948 in this case (Appendix) suggests that 94.8% of the variation in
Log(Sale Price) is explained by the regression model.
Model 2:
In Model 1, we accounted for the area codes of the condominiums by starting the step-wise regression
with 15 dummy variables. One could argue that condominiums outside Mid-Cambridge should not be
considered in the analysis. Hence step-wise regression (Pin = 0.05, Pout = 0.10) was also run with only
the 111 data points from Mid-Cambridge condominiums, starting with the independent variables Last
Price, Bed, Bath, Rooms, Interior, Condo, Tax and RC. As we can see from the Appendix, Last Price and
RC were the only independent variables with a significant impact (based on the step-wise partial
F-test) on Selling Price. The model can be summarized as below:
Sale Price = -2181.178 + 0.960 * LastPrice + 1935.903 * RC
Similar to model 1, the 95% prediction interval for the selling price of the 236 Ellery Street
condominium is given by:
= {149596.661, 174392.789}
Now, a 95% Confidence Interval for the Selling Price (conditional mean) of 236 Ellery Street
Condominium would be given by:
= 161,994.725 ± t[0.025,(111-3)] * (698.994)
= 161,994.725 ± 1.98217 * (698.994)
= 161,994.725 ± 1385.525
= {160609.2, 163380.25}
The standard error of mean predicted value is taken from the Residual Statistics table (Appendix).
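The Model 2 figures can be reproduced with a few lines. The coefficients come from the Model 2 coefficients table; RC = 1 for the Ellery Street condominium is an assumption, chosen because it reproduces the reported mean prediction of 161,994.725. Variable names are illustrative.

```python
# Model 2 coefficients (Mid-Cambridge subsample, n = 111).
CONSTANT = -2181.178
B_LAST_PRICE = 0.960
B_RC = 1935.903

LAST_PRICE = 169000
RC = 1  # assumption: this value reproduces the reported prediction

y_hat = CONSTANT + B_LAST_PRICE * LAST_PRICE + B_RC * RC

T_CRIT = 1.98217    # t[0.025, 108] as used in the report
SE_MEAN = 698.994   # standard error of the mean predicted value

half_width = T_CRIT * SE_MEAN
ci = (y_hat - half_width, y_hat + half_width)
```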
As explained for model 1, there is more uncertainty about the predicted value than there is about the
average value of Y given the values of Xi. Based on the confidence interval, the recommendation for
Affiar would be not to bid more than the upper limit of $163,380, since he can be confident to a
level of 97.5% (100% - 5%/2) that the mean selling price of the condominium would be below this
number. So $163,380 is the maximum that he should bid on the condominium. If he prefers to be more
conservative in his bid, he can go by the prediction interval. Since the upper limit of the prediction
interval, $174,393, is greater than the asking price of $169,000, his bid should be $169,000 in this
case. The maximum he can afford to bid for the condominium at a 95% confidence level would be $174,393.
Coefficients
B and Std. Error are the unstandardized coefficients and Beta is the standardized coefficient.
Model  Variable    B          Std. Error  Beta  t        Sig.
1      (Constant)  -544.824   1357.461          -.401    .689
1      LastPrice   .958       .008        .996  123.128  .000
2      (Constant)  -2181.178  1541.383          -1.415   .160
2      LastPrice   .960       .008        .998  124.529  .000
2      RC          1935.903   909.479     .017  2.129    .036
Let us check whether the model's regression assumptions are satisfied through residual analysis:
From the histogram of the standardized residuals, the normality assumption appears to be satisfied,
since the residuals seem to be normally distributed. The normal P-P plot confirms the same. Lastly,
homoscedasticity can be seen from the residual scatter plot, where the residuals are scattered around
the mean of 0 in a random fashion with no observable pattern of heteroscedasticity.
Multi-collinearity is largely taken care of by the step-wise regression technique, which re-evaluates
the variables after each stage (as shown in Figure 1) with Pin = 0.05 and Pout = 0.10. A variable that
is highly correlated with variables already in the model adds little explanatory power and tends to be
left out, so only the most significant independent variables stay in the model.
Other Models:
In addition to the above 2 best-fit models, a number of other regression models with different
combinations of input independent variables were tried. For instance, areas were grouped by location
(with the help of the map provided) to form a smaller number of dummy variables (e.g., grouping
Agassiz, Harvard Square and Radcliffe). Several such combinations were formed to see how area can
best be fitted into the model. Rooms was tried as a proxy for Interior (due to their high correlation,
as seen in the Appendix). Each model was assessed on its R2 value, the significance of its
coefficients and its residual plots, and the best 2 models have been presented in the case solution.
In each model, the given price for the Ellery Street condominium has been taken as the Last Price, as
stated before.
Model    Mean Selling  Prediction Interval ($)      Confidence Interval ($)      Recommended    Max. Conservative
         Price ($)                                                               bid price ($)  bid price ($)
Model 1  164288.0015   {164287.4172, 164288.586}    {164287.9818, 164288.0212}   164,288.02     164288.58
Model 2  161,994.725   {149596.661, 174392.789}     {160609.2, 163380.25}        163,380        174,393
Comparing the Adjusted R2 values of the two models, we see that Model 2 is able to explain 99.3% of
the variation in Sale Price against Model 1's 94.8%. Hence one might be tempted to use Model 2. But on
a closer look, Last Price and RC are the only independent variables used in Model 2. In this case
there is not a large difference between the prices recommended for Affiar by Model 1 and Model 2, but
in reality a buyer cannot base his or her offer on the seller's stated Last Price alone. A number of
other factors such as interior space, tax, apartment maintenance fee and area need to be considered.
From the given data, Model 1 makes a comprehensive attempt to form the best possible regression fit
using the maximum number of data points. Hence the recommendation would be to go by Model 1, but in
this specific case of the Ellery Street condominium, since the predicted selling prices from the two
models do not differ much, it is left to Affiar to make an initial offer of either $164,288 or
$163,380.
Appendix
EXHIBIT A (BASELINE REGRESSION)
Variables Entered/Removed
Model  Variables Entered  Variables Removed
1      LastPrice
2      Tax
3      Interior
4      Condo
5      A12
6      A5
7      RC
8      A16
9      A2
Method for every step: Stepwise (Criteria: Probability-of-F-to-enter <= .050,
Probability-of-F-to-remove >= .100).
Model Summary (Model 9)
R = .943; R Square = .889; Adjusted R Square = .886; Std. Error of the Estimate = 30268.70125
Change Statistics: R Square Change = .001; F Change = 5.018; df1 = 1; df2 = 446; Sig. F Change = .026
ANOVA (Model 9)
            Sum of Squares  df   Mean Square  F        Sig.
Regression  3.264E12        9    3.627E11     395.860  .000
Residual    4.086E11        446  9.162E8
Total       3.673E12        455
i. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC, A16, A2
j. Dependent Variable: SalePrice
Correlations
The correlation output covers SalePrice, LastPrice, Interior, Bed, Bath, Rooms, Condo, Tax, RC and the
area dummies A1 to A8 and A10 to A16, with N = 456 for every pair. Pearson correlations with the two
price variables:
Variable   SalePrice  LastPrice
SalePrice  1.000      .872
LastPrice  .872       1.000
Interior   .652       .574
Bed        .405       .356
Bath       .534       .510
Rooms      .420       .355
Condo      .713       .643
Tax        .866       .766
Coefficients (Model 9)
B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, Lower and
Upper Bound give the 95% confidence interval for B, and Tolerance and VIF are the collinearity
statistics.
Variable     B            Std. Error  Beta   t       Sig.  Lower Bound  Upper Bound  Tolerance  VIF
(Constant)   -15967.736   5913.780           -2.700  .007  -27590.071   -4345.402
LastPrice    .333         .023        .403   14.763  .000  .289         .377         .335       2.988
Tax          35.947       3.136       .364   11.462  .000  29.783       42.110       .248       4.035
Interior     44.967       5.554       .173   8.097   .000  34.052       55.882       .549       1.821
Condo        105.108      21.268      .127   4.942   .000  63.311       146.906      .380       2.629
A12          -27984.595   8366.791    -.056  -3.345                                  .902       1.108
A5           29804.817    6552.903    .084   4.548                                   .738       1.354
RC           10992.327    3445.556    .059   3.190   .002  4220.785     17763.869    .726       1.378
A16          -12447.291   5480.634    -.037  -2.271  .024  -23218.366   -1676.216    .944       1.059
A2           12290.704    5486.742    .036   2.240   .026  1507.625     23073.784    .967       1.034
Excluded Variables (Model 9)
Tolerance, VIF and Minimum Tolerance are the collinearity statistics.
Variable  Beta In  t       Sig.  Partial Correlation  Tolerance  VIF    Minimum Tolerance
Bed       -.010    -.408   .684  -.019                .414       2.416  .245
Bath      -.002    -.070   .944  -.003                .436       2.295  .248
Rooms     -.003    -.095   .924  -.005                .340       2.938  .246
A1        .001     .077    .939  .004                 .971       1.030  .248
A3        -.025    -1.589  .113  -.075                .972       1.029  .248
A4        -.013    -.783   .434  -.037                .957       1.045  .247
A6        -.004    -.274   .784  -.013                .987       1.014  .248
A7        .010     .564    .573  .027                 .818       1.223  .247
A8        -.019    -1.094  .275  -.052                .837       1.195  .248
A10       -.016    -.976   .329  -.046                .927       1.079  .246
A11       .003     .159    .873  .008                 .983       1.017  .248
A13       .024     1.411   .159  .067                 .894       1.118  .245
A14       -.001    -.079   .937  -.004                .956       1.046  .248
A15       -.004    -.271   .786  -.013                .978       1.022  .246
Residuals Statistics
Statistic                          Minimum      Maximum    Mean        Std. Deviation  N
Predicted Value                    2.1894E4     7.3736E5   1.7108E5    84699.37571     456
Std. Predicted Value               -1.761       6.686      .000        1.000           456
Standard Error of Predicted Value  1971.030     2.458E4    4.021E3     1982.252        456
Adjusted Predicted Value           1.6813E4     1.1794E6   1.7253E5    95574.81320     456
Residual                           -3.59573E5   1.37644E5  .00000      29967.84529     456
Std. Residual                      -11.879      4.547      .000        .990            456
Stud. Residual                     -20.352      4.861      -.017       1.268           456
Deleted Residual                   -1.05539E6   1.57295E5  -1.45182E3  55783.52632     456
Stud. Deleted Residual             -76.135      4.990      -.139       3.664           456
Mahal. Distance                    .932         298.983    8.980       16.348          456
Cook's Distance                    .000         80.153     .179        3.753           456
Centered Leverage Value            .002         .657       .020        .036            456
Variables Entered/Removed
Model  Variables Entered  Variables Removed
1      INVERSE_LASTRICE
2      SQRT_TAX
3      A5
4      LOG_CONDO
5      LOG_INTERIOR
6      RC
7      A16
Method for every step: Stepwise (Criteria: Probability-of-F-to-enter <= .050,
Probability-of-F-to-remove >= .100).
Correlations
The correlation output covers LOG_SALEPRICE, RC, A2, A5, A12, A16, INVERSE_LASTRICE, SQRT_TAX,
LOG_INTERIOR and LOG_CONDO, with N = 455 for every pair. Pearson correlations among the transformed
variables:
Variable          LOG_SALEPRICE  INVERSE_LASTRICE  SQRT_TAX  LOG_INTERIOR  LOG_CONDO
LOG_SALEPRICE     1.000          -.949             .816      .750          .502
INVERSE_LASTRICE  -.949          1.000             -.759     -.757         -.427
SQRT_TAX          .816           -.759             1.000     .655          .573
LOG_INTERIOR      .750           -.757             .655      1.000         .204
LOG_CONDO         .502           -.427             .573      .204          1.000
Model Summary (Model 7)
R = .965; R Square = .932; Adjusted R Square = .931; Std. Error of the Estimate = .11086
Change Statistics: R Square Change = .001; F Change = 4.369; df1 = 1; df2 = 447; Sig. F Change = .037
Durbin-Watson = 1.615
g. Predictors: (Constant), INVERSE_LASTRICE, SQRT_TAX, A5, LOG_CONDO, LOG_INTERIOR, RC, A16
Dependent Variable: LOG_SALEPRICE
ANOVA (Model 7)
            Sum of Squares  df   Mean Square  F        Sig.
Regression  75.497          7    10.785       877.618  .000
Residual    5.493           447  .012
Total       80.991          454
Coefficients (Model 7)
B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, and Lower
and Upper Bound give the confidence interval for B.
Variable          B            Std. Error  Beta   t        Sig.  Lower Bound  Upper Bound
(Constant)        11.619       .216               53.895   .000  11.195       12.043
INVERSE_LASTRICE  -121950.297  3836.638    -.726  -31.786  .000
SQRT_TAX          .007         .001        .180   7.971    .000  .005         .009
A5                .135         .023        .081   5.886    .000  .090         .181
LOG_CONDO         .049         .014        .059   3.595    .000  .022         .076
LOG_INTERIOR      .087         .026        .069   3.412    .001  .037         .138
RC                .031         .012        .035   2.488    .013  .006         .055
A16               -.042        .020        -.026  -2.090   .037  -.081        -.002
Casewise Diagnostics
Case Number  Std. Residual  LOG_SALEPRICE  Predicted Value  Residual
59           4.446          13.68          13.1891          .49288
217          3.660          11.03          10.6291          .40575
305          3.420          13.33          12.9502          .37916
306          3.162          13.35          12.9950          .35051
360          -8.181         11.73          12.6349          -.90689
408          -3.276         11.17          11.5336          -.36320
Residuals Statistics
Statistic                          Minimum  Maximum  Mean     Std. Deviation  N
Predicted Value                    10.5508  13.1891  11.9509  .40779          455
Std. Predicted Value               -3.433   3.036    .000     1.000           455
Standard Error of Predicted Value  .007     .033     .014     .005            455
Adjusted Predicted Value           10.5324  13.1544  11.9506  .40777          455
Residual                           -.90689  .49288   .00000   .11000          455
Std. Residual                      -8.181   4.446    .000     .992            455
Stud. Residual                     -8.391   4.600    .001     1.009           455
Deleted Residual                   -.95413  .52762   .00029   .11387          455
Stud. Deleted Residual             -9.132   4.708    .000     1.028           455
Mahal. Distance                    .861     39.738   6.985    5.922           455
Cook's Distance                    .000     .458     .005     .025            455
Centered Leverage Value            .002     .088     .015     .013            455
Variables Entered/Removed
Model  Variables Entered  Variables Removed
1      INVERSE_LASTRICE
2      SQRT_TAX
3      A5
4      LOG_CONDO
5      LOG_INTERIOR
Method for every step: Stepwise (Criteria: Probability-of-F-to-enter <= .050,
Probability-of-F-to-remove >= .100).
Correlations
The correlation output covers LOG_SALEPRICE, RC, A2, A5, A12, A16, INVERSE_LASTRICE, SQRT_TAX,
LOG_INTERIOR and LOG_CONDO, with N = 449 for every pair. Pearson correlations among the transformed
variables:
Variable          LOG_SALEPRICE  INVERSE_LASTRICE  SQRT_TAX  LOG_INTERIOR  LOG_CONDO
LOG_SALEPRICE     1.000          -.965             .800      .752          .472
INVERSE_LASTRICE  -.965          1.000             -.761     -.757         -.416
SQRT_TAX          .800           -.761             1.000     .658          .543
LOG_INTERIOR      .752           -.757             .658      1.000         .183
LOG_CONDO         .472           -.416             .543      .183          1.000
Model Summary (Model 5)
R = .974; R Square = .949; Adjusted R Square = .948; Std. Error of the Estimate = .09189
Change Statistics: R Square Change = .001; F Change = 8.435; df1 = 1; df2 = 443; Sig. F Change = .004
Durbin-Watson = 1.649
Dependent Variable: LOG_SALEPRICE
ANOVA (Model 5)
            Sum of Squares  df   Mean Square  F        Sig.
Regression  68.896          5    13.779       1.632E3  .000
Residual    3.741           443  .008
Total       72.636          448
e. Predictors: (Constant), INVERSE_LASTRICE, SQRT_TAX, A5, LOG_CONDO, LOG_INTERIOR
f. Dependent Variable: LOG_SALEPRICE
Coefficients (Model 5)
B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, Lower and
Upper Bound give the confidence interval for B, and Tolerance and VIF are the collinearity statistics.
Variable          B            Std. Error  Beta   t        Sig.  Lower Bound  Upper Bound  Tolerance  VIF
(Constant)        12.059       .178               67.598   .000  11.708       12.409
INVERSE_LASTRICE  -133487.069  3284.282    -.808  -40.644  .000  -139941.779  -127032.360  .294       3.403
SQRT_TAX          .004         .001                              .003         .006         .324       3.089
A5                .119         .019                              .082         .155         .869       1.151
LOG_CONDO         .034         .011                              .012         .057         .596       1.677
LOG_INTERIOR      .062         .021                              .020         .104         .368       2.720
Residuals Statistics
Statistic                          Minimum  Maximum  Mean     Std. Deviation  N
Predicted Value                    10.4796  12.9755  11.9452  .39215          449
Std. Predicted Value               -3.737   2.627    .000     1.000           449
Standard Error of Predicted Value  .005     .028     .010     .004            449
Adjusted Predicted Value           10.4557  12.9635  11.9449  .39233          449
Residual                           -.26221  .37351   .00000   .09138          449
Std. Residual                      -2.853   4.065    .000     .994            449
Stud. Residual                     -2.922   4.137    .001     1.007           449
Deleted Residual                   -.27492  .38696   .00027   .09376          449
Stud. Deleted Residual             -2.947   4.215    .003     1.012           449
Mahal. Distance                    .299     41.115   4.989    5.166           449
Cook's Distance                    .000     .182     .004     .015            449
Centered Leverage Value            .001     .092     .011     .012            449
Regression of Interior on Rooms
Regression Statistics
Multiple R         0.775952808
R Square           0.60210276
Adjusted R Square  0.601226335
Standard Error     217.7411745
Observations       456
ANOVA
            df   SS           MS        F         Significance F
Regression  1    32571418.05  32571418  686.9981  6.7719E-93
Residual    454  21524693.45  47411.22
Total       455  54096111.51
           Coefficients  Standard Error  t Stat    P-value   Lower 95%     Upper 95%
Intercept  -76.7538578   42.08789622     -1.82366  0.068861  -159.4651166  5.957400971
Rooms      235.8872688   8.999672999     26.21065  6.77E-93  218.2010847   253.5734529
Plots: Normal Probability Plot (Interior against Sample Percentile); Residuals against Rooms.