Article has been submitted for consideration to the Journal of Spatial Economic Analysis
1. Introduction
The addition or expansion of municipal Light Rail Transit (LRT) structures is an issue that often divides
communities. It is becoming clear that urban public transit is necessary, yet researchers and policy makers
remain unsure as to whether the presence of light rail transit affects house prices. Accessible
transit is an affordable, environmentally sustainable transportation option for urban communities (Hewitt &
Hewitt, 2012; Hess & Almeida, 2007). The importance of determining LRT effects on house prices is growing:
all urban centers need more public transit options, so if LRT does negatively affect nearby
homeowners, it becomes an issue that municipal authorities must address. Of secondary importance is the
need to add to the growing body of literature on adding spatiality to the traditional hedonic linear
regression models currently applied to housing datasets. One of the biggest issues with the traditional hedonic
model is the assumption of homogeneous model parameters across the study region (, 201).
The null hypothesis is that housing prices within City of Calgary communities are not affected by the presence
of or proximity to light rail transit (LRT) stations; instead, housing prices are dictated strictly by the
other variables taken into account in this study. The alternative hypothesis is that housing prices within
City of Calgary communities are affected by the presence of or proximity to LRT stations. In
addition to testing these hypotheses, this paper addresses the following questions: To what degree
does proximity to LRT stations affect housing prices within surrounding communities? Is the relationship
between this dependent and independent variable inherently positive or negative? Does accessibility to LRT
contribute substantially to housing prices or does it exert negligible influence? To what degree do other
variables, such as number of bedrooms, number of bathrooms and proximity to the CBD, affect housing prices?
Does the cumulative effect of these additional variables vastly supersede the associative influence of LRT
station proximity?
The following analysis uses the city of Calgary, Alberta as a general study area and applies advanced spatial
regression techniques to a subset of the city consisting of homes near LRT stations. Using a dataset of 1242
observed transaction prices, we conduct a spatial analysis exploring a possible relationship between sold
house prices and proximity to LRT stations. The paper is organized into five sections. Section one
introduces the study, section two lays out the methodology of the analysis, section three
presents the statistical results, section four discusses the model results and the last section
concludes the study by summarizing the results.
2.2 Data
The following data were obtained from the City of Calgary Open Data Catalogue in shapefile format:
City Community Districts, Schools, Amenities, Census 2006 and Major Road Networks. Median income for
Calgary communities in 2006 was downloaded from the Statistics Canada website. LRT Line and Station data
were obtained from Calgary Transit and the University of Calgary SANDS Desk. Initially, an attempt was made
to acquire the raw sold house price data through the Calgary Real Estate Board (CREB) and then later through
the Real Estate Council of Alberta (RECA). Unfortunately, despite extensive communications with both
organizations, the data were not released and were instead obtained from a personal real estate contact of one
of the authors. The data provided by the personal contact included information on style, type of house and
other housing characteristics which may affect house prices.
2.3 Methodology
This section lays out the process used to select, prepare and create data as well as the steps taken to generate
the Ordinary Least Squares (OLS), Geographically Weighted Regression (GWR) and Moving Windows Kriging (MWK)
regression models presented below.
Property Variables
Area, age, bathrooms, bedrooms, basement, garage and building type were chosen as variables that can be
assumed to affect house prices (Hewitt & Hewitt, 2012).
Locational Variables
Clip all data and variables to relevant communities in the study area
Once this buffer was created, a select-by-location strategy was used in order to locate all communities
falling within or along the buffer zone.
Given that our study would only be concerned with residential communities, any industrial or
principally commercial regions falling within this buffer were systematically removed from our study
area.
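The selection steps above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the buffer is a fixed radius around station points, that each community is represented by its centroid in projected coordinates (meters), and that a land-use label is available; the radius, coordinates, names and labels below are hypothetical and are not the values used in this study.

```python
import math

def within_buffer(point, stations, radius_m):
    """Return True if `point` lies within `radius_m` of at least one station."""
    px, py = point
    return any(math.hypot(px - sx, py - sy) <= radius_m for sx, sy in stations)

def select_communities(communities, stations, radius_m,
                       excluded_land_use=("industrial", "commercial")):
    """Keep communities inside the buffer, dropping non-residential ones."""
    selected = []
    for name, (centroid, land_use) in communities.items():
        if land_use in excluded_land_use:
            continue  # industrial / principally commercial regions are removed
        if within_buffer(centroid, stations, radius_m):
            selected.append(name)
    return sorted(selected)

# Hypothetical inputs (projected coordinates in meters); not the study data.
stations = [(0.0, 0.0), (3000.0, 0.0)]
communities = {
    "A": ((500.0, 500.0), "residential"),
    "B": ((1500.0, 4000.0), "residential"),
    "C": ((2800.0, 300.0), "industrial"),
    "D": ((3500.0, -900.0), "residential"),
}
study_area = select_communities(communities, stations, radius_m=1600.0)
print(study_area)
```

A centroid test is a simplification of a true polygon intersection ("within or along the buffer"); a GIS select-by-location tool would test the community polygons themselves.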
Figure 3. Location of observations, transit lines and transit stations in study area
Variable Type | Variable | Definition | Units | Source
Dependent Variable | PRICE | Sale price | $ | Industry contact
Independent Variables: Property Characteristic Variables | BDRMS | Number of bedrooms | Count | Industry contact
 | BATHS | Number of bathrooms | Count | Industry contact
 | M_SQ | Area of property | m2 | Industry contact
 | Type_BLDG | Dummy - Type of house: Duplex = 3; Detached = 2; Attached = 1; Otherwise = 0 | Value | Industry contact
 | BSMNT | Dummy - Basement type: Walk-out = 4; Dugout = 3; Full = 2; Partial = 1; None = 0 | Value | Industry contact
 | GAR | Dummy - Presence of garage: Garage = 1; Otherwise = 0 | Value | Industry contact
 | AGE | Age of property | # years | Calculated from YEAR
Independent Variables: Location Dependent Variables | Dist_CBD | | | Calculated
 | Dist_SCL | | m | Calculated
 | Dist_AMN | | m | Calculated
Independent Variables: Transportation Accessibility Variables | Dist_STN | | | Calculated
 | Dist_LINE | | m | Calculated
 | Dist_RDWYS | | m | Calculated
Independent Variables: Neighbourhood Quality Variables | Med_INC | Median income | | Statistics Canada
3. Results
Exploratory data analysis
Figure 4. Spatial distribution of price
Within the study area, higher priced homes are clustered in the central and west-central parts of the city, middle
priced homes are scattered throughout including two substantial groupings in the north-west and south-central
areas and lastly, lower priced homes dominate the north-east portion of Calgary and are also present in the
south-central area.
Figure 5. Histogram of Price (transformation: none; dataset: Obs_1242_WGS19843TM; attribute: Sale_Price)
Count: 1242; Min: 31800; Max: 2500000; Mean: 503440; Std. Dev.: 255980; Skewness: 3.0929; Kurtosis: 17.363;
1st Quartile: 350000; Median: 436750; 3rd Quartile: 579000
Figure 5 shows a histogram of the observations, which reveals that the sale price distribution is roughly
bell-shaped but skewed to the right (skewness 3.09).
Figure 6. Scatterplot of Price (Normal QQPlot; transformation: none)
Figure 7. Scatterplot of logged price (Normal QQPlot; transformation: log)
After the log transformation, the normal QQ plot of price becomes more linear, especially at each end.
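The effect of logging a right-skewed price variable can be sketched numerically. The example below uses synthetic log-normal "prices" (an assumption for illustration only, not the study data) and a moment-based skewness estimate; the function name is hypothetical.

```python
import math
import random

def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

random.seed(0)
# Synthetic right-skewed prices (log-normal); NOT the 1242 observed transactions.
prices = [math.exp(random.gauss(13.0, 0.45)) for _ in range(1242)]
log_prices = [math.log(p) for p in prices]

raw_skew = sample_skewness(prices)
log_skew = sample_skewness(log_prices)
print(f"skewness of raw prices:    {raw_skew:.2f}")
print(f"skewness of logged prices: {log_skew:.2f}")
```

On data of this shape the raw series is clearly right-skewed while the logged series is close to symmetric, which is the pattern a straighter QQ plot reflects.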
Price: mean 503440; std dev 255980; min 31800; median 436750; max 2500000
Area: mean 141.800; std dev 55.373; min 53.9; median 122.21; max 510.48
# Beds: mean 3.752; std dev 0.924
# Baths: mean 2.259; std dev 0.788; 2.1
Age: mean 31.531; std dev 22.162; 25; 106
Table 2 shows that the average house in this study was built around 1982, is 141.8 m2, has 3.8 bedrooms and
2.3 bathrooms, and sold for approximately $503,440. The median house price was $436,750, while
the least expensive home in the study area sold for $31,800 and the most expensive for $2,500,000.
[Scatterplot: Sale_Price versus DISTSTATION]
The principal variable of interest, distance to stations, does not visually appear to be correlated with price;
the correlation coefficient is 0.0474. Area, baths and median income are moderately positively correlated with
price, while basement, beds and garage are mildly positively correlated with price. The correlations of price
with distance to line, school, roadway and amenities and with building type are all positive but very small.
Distance to the central business district is mildly negatively correlated with price, while the correlations
with percent owned and age are negative but very small.
Five variables were removed: beds, building type, distance to schools, distance to roadways and adjacency to
roadway.
Final model - Model #1 Hedonic OLS (see Appendix B):
Price = Area + Baths + Age + Bsmt + Garage + percent owned + median income + distance to station +
distance to amenities + adjacency to line
Variable | Coefficient [a] | StdError | t-Statistic | Probability
AREA | 2767.86 | 115.95 | 23.87 | 0.000
BATHS | 32175.15 | 7420.61 | 4.34 | 0.000
AGE | 922.4979 | 253.05 | 3.65 | 0.000
BSMT | 21936.87 | 7698.17 | 2.85 | 0.004
GARAGE | 37150.08 | 12397.93 | 3.00 | 0.003
PERC_OWNED | -2185.59 | 252.22 | -8.67 | 0.000
MED_INCOME | 4.366 | 0.33 | 13.38 | 0.000
DISTSTATIO | -32.25 | 10.03 | -3.21 | 0.001
DIST_AMN | -28.45 | 9.59 | -2.967 | 0.003
ADJ_LINE | -22370.92 | 10914.42 | -2.05 | 0.041
Intercept | -24568.16 | 32620.61 | -0.75 | 0.452
For every additional square meter of living area, price increases on average by $2,767.86. Across the study
area, house prices increase on average by $32,175.15 for each additional bathroom. Basement type is
positively related to house price, as is the presence of a garage. For each one-point increase in the
percentage of units in the study area which are owned, house prices decrease on average by $2,185.58. For each
additional dollar of median neighbourhood income, house prices increase on average by $4.37. Each additional
meter of distance to a station lowers house prices by $32.25, and each additional meter of distance to
amenities lowers house prices by $28.45. Homes that are located within the adjacency to line buffer zone are
associated on average with a $22,370.92 decrease in house prices, which is what we expected to see as a result
of disamenity effects such as noise and pollution.
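As a worked illustration of how these coefficients translate into price differences, the sketch below plugs the reported Model #1 coefficients into the linear equation. The house characteristics are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
# Coefficients copied from the Model #1 OLS coefficient table (dollars per unit).
COEF = {
    "AREA": 2767.86, "BATHS": 32175.15, "AGE": 922.4979, "BSMT": 21936.87,
    "GARAGE": 37150.08, "PERC_OWNED": -2185.59, "MED_INCOME": 4.366,
    "DISTSTATIO": -32.25, "DIST_AMN": -28.45, "ADJ_LINE": -22370.92,
}
INTERCEPT = -24568.16

def predict_price(house):
    """Linear prediction: intercept plus the sum of coefficient * value."""
    return INTERCEPT + sum(COEF[name] * house[name] for name in COEF)

# Hypothetical house; these values are NOT taken from the dataset.
house = {
    "AREA": 141.8, "BATHS": 2, "AGE": 31, "BSMT": 2, "GARAGE": 1,
    "PERC_OWNED": 70, "MED_INCOME": 60000, "DISTSTATIO": 800,
    "DIST_AMN": 500, "ADJ_LINE": 0,
}
# Same house, moved 500 m farther from the nearest station.
farther = dict(house, DISTSTATIO=house["DISTSTATIO"] + 500)

difference = predict_price(farther) - predict_price(house)
print(f"Predicted price change for +500 m to station: {difference:,.2f}")
```

Holding everything else fixed, the difference is simply 500 x (-32.25), i.e. a predicted decrease of about $16,125.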
Table 4. Model #1 Summary of OLS model results (Stata)
Metric | Metric Values
R2 | 0.6026
F-statistic | 186.84
Prob(>F) | 0.000000
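The summary metrics can be cross-checked against the sums of squares and degrees of freedom listed in the Stata output in the appendix. The sketch below is a plain arithmetic check under that assumption; variable names are illustrative.

```python
# Sums of squares and degrees of freedom as listed in the appendix Stata output.
ss_model = 4.9006e13
ss_resid = 3.2314e13
df_model = 10
df_resid = 1232

ss_total = ss_model + ss_resid
df_total = df_model + df_resid

# R2: share of total variation explained by the model.
r2 = ss_model / ss_total
# Adjusted R2: compares residual and total mean squares.
adj_r2 = 1 - (ss_resid / df_resid) / (ss_total / df_total)
# F-statistic: ratio of model mean square to residual mean square.
f_stat = (ss_model / df_model) / (ss_resid / df_resid)

print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}, F = {f_stat:.2f}")
```

The computed values agree with the reported R2 of 0.6026, adjusted R2 of 0.5994 and F-statistic of 186.84 to the precision shown.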
Variable | Coefficient | Standard Error | t-statistic | p-value
AREA | 2768.47 | 116.20 | 23.83 | 0.000
BATHS | 32172.91 | 7423.64 | 4.33 | 0.000
AGE | 922.958 | 253.20 | 3.65 | 0.000
BSMT | 21877.77 | 7730.17 | 2.83 | 0.005
GARAGE | 37122.45 | 12397.93 | 2.99 | 0.003
PERC_OWNED | -2185.39 | 252.22 | -8.66 | 0.000
MED_INCOME | 4.366 | 0.33 | 13.36 | 0.000
DISTSTATIO | -32.22 | 10.05 | -3.21 | 0.001
DIST_AMN | -28.42 | 9.60 | -2.96 | 0.003
ADJ_LINE | -22389.72 | 10920.87 | -2.05 | 0.041
Intercept | -24522.21 | 32637.85 | -0.75 | 0.453
Coefficients, standard errors, t-statistics and p-values are all similar to the values obtained from OLS in Stata.
Table 6. Model #1 Summary of OLS model results (ArcMap)
Metric | Metric Values
AICc | 33334.59
R2 | 0.602615
F-statistic | 186.675499
Prob(>F) | 0.000000
Koenker (BP) statistic | 65.436437
BP Prob(>chi-squared) | 0.000000
Jarque-Bera statistic | 49454.709984
JB Prob(>chi-squared) | 0.000000
Metric | Metric Values
Morans Index | 0.143652
Expected Index | -0.000806
Variance | 0.000045
z-score | 21.435641
p-value | 0.000000
Distance Threshold | 1828.52 (m)
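The relationship between these Moran's I metrics can be sketched as follows, assuming the standard definitions: an expected index of -1/(n - 1) under no spatial autocorrelation, and a z-score equal to the observed minus expected index divided by the square root of the variance. The inputs are the rounded values reported above, so the result only approximates the reported z-score.

```python
import math

n = 1242                 # number of observations
observed_i = 0.143652    # reported Moran's Index
variance = 0.000045      # reported variance (rounded to six decimals)

# Expected index under the null hypothesis of no spatial autocorrelation.
expected_i = -1.0 / (n - 1)
# z-score: standardized distance of the observed index from its expectation.
z_score = (observed_i - expected_i) / math.sqrt(variance)

print(f"Expected index: {expected_i:.6f}")
print(f"Approximate z-score: {z_score:.2f}")
```

The expected index reproduces the reported -0.000806, and the z-score comes out near 21.5; the small gap from the reported 21.44 is consistent with the variance having been rounded.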
Model #2 GWR
Summary Statistics
Metric | Metric Values
AICc | 53500.47
R2 | 0.762
Bandwidth | 1828.35
Residuals | 362563.93
Sigma | 135522.63
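To make the role of the bandwidth concrete, the following is a minimal sketch of the local fit that underlies GWR: a weighted least squares regression at one location, with weights that decay with distance. It assumes a fixed Gaussian kernel and uses synthetic data with a slope that drifts from west to east; the kernel choice, the function name and the data are assumptions for illustration and do not reproduce the ArcMap implementation or the study results.

```python
import numpy as np

def gwr_local_fit(coords, X, y, target, bandwidth):
    """Weighted least squares at `target` with Gaussian distance-decay weights."""
    dist = np.linalg.norm(coords - target, axis=1)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
    design = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    root_w = np.sqrt(weights)
    # Scaling rows by sqrt(w) turns ordinary least squares into weighted least squares.
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return beta

rng = np.random.default_rng(42)
n = 400
coords = rng.uniform(0, 10000, size=(n, 2))   # synthetic locations (meters)
area = rng.uniform(60, 300, size=n)           # synthetic living area (m2)
# Non-stationary relationship: the price per m2 rises from west to east.
slope = 2000 + 0.2 * coords[:, 0]
price = 50000 + slope * area + rng.normal(0, 5000, size=n)

bandwidth = 1828.35  # bandwidth reported for Model #2
west = gwr_local_fit(coords, area[:, None], price, np.array([1000.0, 5000.0]), bandwidth)
east = gwr_local_fit(coords, area[:, None], price, np.array([9000.0, 5000.0]), bandwidth)
print(f"local area coefficient, west: {west[1]:.0f}; east: {east[1]:.0f}")
```

A global OLS fit would return a single area coefficient for the whole region, whereas the local fits recover different coefficients in the west and east, which is the kind of non-stationarity GWR is designed to expose.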
Model #3 MWK
4. Discussion
It would have been possible to account for spatial heterogeneity by disaggregating the data, but
without expert knowledge of the local real estate micro-markets in each community we would almost certainly
do it incorrectly. Disaggregating to the neighbourhood level is not a viable option because neighbourhood
boundaries are often arbitrary and do not accurately reflect actual spatial groupings on the ground. In either
case, if linear regression is performed at the incorrect level of aggregation, the resulting estimates have the
potential to be severely biased (Sundig et al, 2008).
The fact that the Koenker (BP) statistic is statistically significant indicates that the relationships between price
and the independent variables in Model #1 are not consistent, either due to non-stationarity or
heteroskedasticity (ESRI, 2013). It is likely that the Koenker statistic is indicating an error arising from using
global (stationary) models to estimate local (non-stationary) phenomena. GWR improves predictions and
sometimes reveals the type of non-stationarity. The fact that the Jarque-Bera statistic is statistically significant
indicates that the model predictions are biased, because the residuals are not normally distributed
(ESRI, 2013).
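For readers unfamiliar with the Jarque-Bera test, the sketch below computes the statistic from its textbook definition, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis, and compares it with a chi-squared distribution with 2 degrees of freedom. The residuals are synthetic (one normal set and one strongly skewed set), not the Model #1 residuals, and the function name is an assumption.

```python
import math
import random

def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its chi-squared (2 df) p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-squared survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

random.seed(1)
normal_resid = [random.gauss(0.0, 1.0) for _ in range(1242)]
skewed_resid = [random.expovariate(1.0) - 1.0 for _ in range(1242)]  # right-skewed

jb_normal, p_normal = jarque_bera(normal_resid)
jb_skewed, p_skewed = jarque_bera(skewed_resid)
print(f"normal residuals: JB = {jb_normal:.2f}, p = {p_normal:.4f}")
print(f"skewed residuals: JB = {jb_skewed:.2f}, p = {p_skewed:.4g}")
```

A very large statistic with a p-value near zero, as reported for Model #1, is what the skewed case illustrates: the residuals depart strongly from normality.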
Despite the extensive literature regarding the determinants of house prices in North America and globally, most
of the popular variables utilized are reported as varying between being positively significant, negatively
significant or insignificant (Zietz et al, 2008; Landis et al, 1995) which is consistent with the results above.
Literature Cited
Bowes, D.R. and Ihlanfeldt, K.R. (2001). Identifying the Impacts of Rail Transit Stations on
Residential Property Values. Journal of Urban Economics. 50: 1-25.
Brandt, S. and Maennig W. (2012). The impact of rail access on condominium prices in
Hamburg. Transportation. 39: 997-1017.
Calgary Transit (n.d.). Technical Data. Available at:
http://www.calgarytransit.com/html/technical_information.html
ESRI. (2011). Interpreting OLS regression results. Available at:
http://resources.arcgis.com/en/help/main/10.1/index.html#/Interpreting_OLS_results/005p00000030000000/
ESRI. (2012a). Geographically weighted regression (GWR). Available at:
http://resources.arcgis.com/en/help/main/10.1/index.html#/Geographically_Weighted_Regression_GWR/005p00000021000000/
ESRI. (2012b). Moving Windows Kriging (Geostatistical Analyst). Available at:
http://resources.arcgis.com/en/help/main/10.1/index.html#//00300000000m000000
ESRI. (2013). Modeling spatial relationships. Available at:
http://resources.arcgis.com/en/help/main/10.1/index.html#/Modeling_spatial_relationships/005p00000005000000/
Hess, D.B. and Almeida, T.M. (2007). Impact of Proximity to Light Rail Rapid Transit on
Station-area Property Values in Buffalo, New York. Urban Studies. 44: 1041-1068.
Hewitt, C.M. and Hewitt, W.E. (2012). The Effect of Proximity to Urban Rail on Housing Prices
in Ottawa. Journal of Public Transportation. 15(4): 43-62.
Landis, J., Guhathakurta, S., Zhang, M. (1995). Rail transit investments, real estate values and
land use change: A comparative analysis of five California rail transit systems. University
of California at Berkeley. 27-41. Available at:
http://www.fltod.com/research/housing/rail_transit_investments_real_estate_values_and_land_use_change.pdf
Long, F., Paez, A., Farber, S. (2007). Spatial effects in hedonic price estimation: A case study in
the city of Toronto. Center for Spatial Analysis, McMaster University. Available at:
http://sciwebserver.science.mcmaster.ca/cspa/papers/CSpA%20WP%20020.pdf
Sundig, D., Swoboda, A. (2008). Hedonic analysis with locally weighted regression: An
application to the shadow cost of housing regulation in Southern California. Journal of
Regional Science and Urban Economics. 40: 550-573.
Zietz, J., Norman-Zietz, E. (2008). Determinants of house prices: A quantile regression
approach. Journal of Real Estate Financial Economics. 37: 317-333.
Appendix
Appendix A - Correlation matrix of independent variables
. correlate sale_price area beds baths diststation distschool distline distcbd age perc_owned med_income bsmt garage bldg_type dist_rdwy dist_amn
(obs=1242)
 | sale_price | area | beds | baths | diststation | distschool | distline | distcbd | age | perc_owned | med_income | bsmt | garage | bldg_type | dist_rdwy | dist_amn
sale_price | 1.0000
area | 0.6890 | 1.0000
beds | 0.1589 | 0.2025 | 1.0000
baths | 0.4445 | 0.5633 | 0.3572 | 1.0000
diststation | 0.0474 | 0.1930 | 0.0205 | 0.1043 | 1.0000
distschool | 0.0982 | 0.2006 | 0.0025 | 0.1058 | 0.1713 | 1.0000
distline | 0.0731 | 0.2146 | 0.0036 | 0.1333 | 0.9332 | 0.1362 | 1.0000
distcbd | -0.2727 | 0.0822 | 0.0074 | 0.0555 | 0.3277 | 0.2839 | 0.3092 | 1.0000
age | -0.0340 | -0.3279 | -0.0680 | -0.2709 | -0.2461 | -0.2548 | -0.2334 | -0.5453 | 1.0000
perc_owned | -0.0454 | 0.1324 | -0.0168 | 0.0882 | 0.1250 | 0.2051 | 0.1256 | 0.5331 | -0.3271 | 1.0000
med_income | 0.4376 | 0.3233 | -0.0291 | 0.1482 | 0.0722 | 0.1612 | 0.0736 | -0.0270 | 0.0878 | 0.3405 | 1.0000
bsmt | 0.2034 | 0.2262 | 0.0877 | 0.1976 | 0.1288 | 0.1657 | 0.1337 | 0.1353 | -0.1876 | 0.1723 | 0.1979 | 1.0000
garage | 0.3016 | 0.2773 | 0.1070 | 0.2611 | 0.0903 | 0.0182 | 0.1109 | -0.0634 | 0.0252 | -0.0568 | 0.1589 | 0.0567 | 1.0000
bldg_type | 0.0134 | 0.0312 | -0.0319 | 0.0650 | -0.1140 | -0.0945 | -0.1123 | -0.2259 | -0.0654 | -0.2255 | -0.1084 | -0.0336 | -0.1556 | 1.0000
dist_rdwy | 0.0499 | 0.0249 | 0.0130 | 0.0478 | 0.0072 | -0.0167 | 0.0013 | -0.0440 | 0.0567 | 0.0662 | 0.1094 | -0.0115 | 0.0387 | -0.0004 | 1.0000
dist_amn | 0.0167 | 0.0714 | -0.0365 | 0.0408 | 0.0494 | 0.0714 | 0.0581 | 0.1363 | -0.0952 | 0.2653 | 0.2970 | 0.1125 | 0.0132 | -0.1118 | 0.1198 | 1.0000
Appendix B - Stata OLS regression output (Model #1)
Source        SS            df      MS
Model         4.9006e+13    10      4.9006e+12
Residual      3.2314e+13    1232    2.6229e+10
Total         8.1319e+13    1242    6.5475e+10

Number of obs   = 1243
F( 10, 1232)    = 186.84
Prob > F        = 0.0000
R-squared       = 0.6026
Adj R-squared   = 0.5994
Root MSE        = 1.6e+05

Sale_Price     Coef.        Std. Err.    t        P>|t|    95% conf. interval (upper bound)
Area           2767.86      115.9448     23.87    0.000    2995.331
Baths          32175.15     7420.606     4.34     0.000    46733.57
AGE            922.4979     253.0472     3.65     0.000    1418.949
BSMT           21936.87     7698.334     2.85     0.004    37040.16
Garage         37150.08     12389.03     3.00     0.003    61456.01
Perc_OWNED     -2185.584    252.1045     -8.67    0.000    -1690.982
MED_INCOME     4.366606     .3263052     13.38    0.000    5.006782
DISTSTATION    -32.24929    10.03949     -3.21    0.001    -12.5529
Dist_AMN       -28.45044    9.593748     -2.97    0.003    -9.628551
Adj_LINE       -22370.92    10914.42     -2.05    0.041    -958.0202
_cons          -24568.16    32620.61     -0.75    0.452    39429.93
 | Area | Baths | AGE | BSMT | Garage | Perc_OWNED | MED_INCOME | DISTSTATION | Dist_AMN | Adj_LINE
Area | 1.0000
Baths | 0.5633 | 1.0000
AGE | -0.3279 | -0.2709 | 1.0000
BSMT | 0.2262 | 0.1976 | -0.1876 | 1.0000
Garage | 0.2773 | 0.2611 | 0.0252 | 0.0567 | 1.0000
Perc_OWNED | 0.1324 | 0.0882 | -0.3271 | 0.1723 | -0.0568 | 1.0000
MED_INCOME | 0.3233 | 0.1482 | 0.0878 | 0.1979 | 0.1589 | 0.3405 | 1.0000
DISTSTATION | 0.1930 | 0.1043 | -0.2461 | 0.1288 | 0.0903 | 0.1250 | 0.0722 | 1.0000
Dist_AMN | 0.0714 | 0.0408 | -0.0952 | 0.1125 | 0.0132 | 0.2653 | 0.2970 | 0.0494 | 1.0000
Adj_LINE | 0.0335 | 0.0163 | -0.0873 | 0.0361 | -0.0493 | 0.1322 | 0.1133 | 0.0500 | 0.1833 | 1.0000