BES Tutorial Sample Solutions, S1/13



WEEK 12 TUTORIAL EXERCISES (To be discussed in the week starting
May 27)

1. Recall the Anzac Garage data (ANZACG.XLS) used in Weeks 3, 8 and 10.
In Week 3 we considered the simple linear regression model given by:

\[ \text{price}_i = \beta_0 + \beta_1\,\text{age}_i + u_i \]

where price = used car price in dollars and age = age of the car in years.
The EXCEL results obtained using Ordinary Least Squares are
presented below:

Regression Statistics
R²                 0.077
Standard Error     42069
Observations       117

             Coefficients   Standard Error   t Stat    p-value
Intercept    47469          6748             7.035     0.000
Age          -2658          856              -3.106    0.002

(a) Interpret the t Stat and the p-values in the EXCEL output.
What do you need to assume?

The t stats and p-values in the EXCEL output are derived from two-tailed
tests with null hypotheses that the associated population parameter equals 0.
Hence, larger t stats (in absolute value) and lower p-values mean we are more
confident that the associated population parameter is non-zero. Here, the
p-values for both the intercept and Age coefficients are below 1%, and hence
we can be confident that both population parameters are statistically
significant (non-zero).

We need to assume the disturbances are normal or, because the sample size is
large, invoke the CLT.

(b) Calculate a 95% confidence interval for the coefficient on age.

The standard normal critical value is 1.96, hence the 95% confidence interval
is:

\[ -2658 \pm 1.96 \times 856 = -2658 \pm 1678 = (-4336,\ -980) \]
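The interval in (b) can be checked with a few lines of Python. This is only a sketch: the numbers are the rounded values from the EXCEL output, and the variable names are illustrative rather than part of the original exercise.

```python
# Rounded values from the EXCEL output; names are illustrative assumptions.
b1 = -2658.0     # estimated coefficient on Age
se_b1 = 856.0    # standard error of that coefficient
z_crit = 1.96    # standard normal critical value for a 95% interval

margin = z_crit * se_b1                  # half-width of the interval
lower, upper = b1 - margin, b1 + margin

print(f"95% CI: ({lower:.0f}, {upper:.0f})")  # prints "95% CI: (-4336, -980)"
```

Because the whole interval lies below zero, it agrees with the two-tailed test in (a), which rejects a zero coefficient on Age.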

(c) Interpret the R² value.

The regression model including age explains 7.7% of the variation in used car
prices.

(d) Test whether the estimated coefficient of Age is significantly less
than zero at the 5% level of significance.

Unlike in (a), this is a one-tailed test:

\[ H_0: \beta_1 = 0; \qquad H_1: \beta_1 < 0 \]

Decision rule: Reject H0 if b1/se(b1) < -1.645
Test statistic: b1/se(b1) = -3.106 < -1.645 and hence reject H0
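A minimal sketch of the one-tailed decision rule in (d), assuming the standard normal approximation used in the solution. The helper `normal_cdf` is an illustrative name, not part of the exercise. With the rounded inputs the ratio comes out near -3.105 rather than the -3.106 that EXCEL reports from unrounded values.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

b1 = -2658.0      # coefficient on Age (rounded, from the EXCEL output)
se_b1 = 856.0     # its standard error (rounded)
z_crit = -1.645   # lower-tail 5% critical value of the standard normal

t_stat = b1 / se_b1             # test statistic b1/se(b1)
reject_h0 = t_stat < z_crit     # one-tailed rule: reject if below the critical value
p_lower = normal_cdf(t_stat)    # lower-tail p-value under the normal approximation

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}, one-tailed p = {p_lower:.4f}")
```

The lower-tail p-value is roughly half the two-tailed p-value shown in the EXCEL output, since the estimate lies on the side of the alternative hypothesis.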

(e) Estimate a 95% confidence interval for the mean price for a
second-hand passenger car that is 10 years old and interpret the
result. Note: the sample mean of age is 6.44 years.

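A 10-year-old car is expected to be valued at $47469 - 10 × 2658 = $20889.

The boundaries of the confidence interval for this prediction can be found by:

\[ \hat{y}_p \pm t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(X_p - \bar{X})^2}{\sum (X_i - \bar{X})^2}} \]

where s = 42069, se(b1) = 856 and, since se(b1) = s / sqrt(Σ(X_i - X̄)²), hence

\[ \sum (X_i - \bar{X})^2 = \frac{s^2}{(se(b_1))^2} = \frac{42069^2}{856^2} = 2415 \]

Hence:

\[ 20889 \pm 1.98 \times 42069 \sqrt{\frac{1}{117} + \frac{(10 - 6.44)^2}{2415}} = 20889 \pm 9783 \]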

We are 95% confident that the mean price of 10-year-old cars lies between
$11,106 and $30,672. While the impact of age on price is precisely
estimated, the CI is quite wide because of the large amount of
unexplained variation that is indicated by the very low R² value reported.
(Note: use of normal critical values here would be acceptable given the
large sample size and would make little practical difference, as the
critical value would be 1.96 rather than 1.98.)
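The calculation in (e) can be sketched in Python as follows. All inputs are the rounded figures quoted in the solution, including the critical value 1.98, and the variable names are illustrative assumptions.

```python
from math import sqrt

# Rounded figures quoted in the solution; names are illustrative.
b0, b1 = 47469.0, -2658.0   # intercept and Age coefficient
s = 42069.0                 # standard error of the regression
se_b1 = 856.0               # standard error of the Age coefficient
n = 117                     # number of observations
x_bar = 6.44                # sample mean of age
x_p = 10.0                  # age at which the mean price is estimated
t_crit = 1.98               # critical value used in the solution

y_hat = b0 + b1 * x_p                   # point estimate of the mean price
sxx = (s / se_b1) ** 2                  # sum of squared deviations of age, from se(b1)
half_width = t_crit * s * sqrt(1.0 / n + (x_p - x_bar) ** 2 / sxx)

print(f"{y_hat:.0f} +/- {half_width:.0f}")
print(f"interval: ({y_hat - half_width:.0f}, {y_hat + half_width:.0f})")
```

Recovering the sum of squared deviations from s and se(b1) avoids needing the raw data, which is why the solution takes that route.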

Anzac Garage's pricing scheme based on the age of the car is not
working out very well. When its second-hand cars are compared with
cars of the same age from other dealers, prices often diverge. One of
their consultants noted that the value of a second-hand car should
depend on both the Odometer reading and the Age of the vehicle.
This consultant wanted to estimate the following two simple linear
regression models separately:

\[ \text{price}_i = \beta_0 + \beta_1\,\text{age}_i + u_i \]

\[ \text{price}_i = \alpha_0 + \alpha_1\,\text{odometer}_i + v_i \]

where Odometer = distance the car has travelled since leaving the factory,
in kilometres. A senior consultant advised use of a multiple linear
regression model instead:

\[ \text{price}_i = \gamma_0 + \gamma_1\,\text{odometer}_i + \gamma_2\,\text{age}_i + v_i \]

(f) Discuss why the simple linear regression methods may not be
preferable to the multiple regression method, in general, and in
the context of this problem. The resultant OLS estimates for the
multiple regression model are given below.

The predictive performance of the model will improve as relevant variables are
added to a simple regression model.

Also, the assumption that the disturbance is uncorrelated with the explanatory
variables is critical for the unbiased estimation of coefficients of included
variables. In the simple price on age regression it will be violated if variables
affecting price and correlated with age have been omitted from the model.
This is likely to be the case here with distance the car has travelled.

We see the R² has improved (approximately doubled) with the addition of
odometer, and the coefficient on age is now much smaller in magnitude and is
now statistically insignificant.

SUMMARY OUTPUT

Regression Statistics
R Square           0.150
Standard Error     40568
Observations       117

                 Coefficients   Standard Error   t Stat    P-value
Intercept        53867          682S             7.89S     0.000
Odometer (km)    -0.270         0.087            -3.110    0.002
Age              -360           1108             -0.325    0.746
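The omitted-variable argument in (f) can be illustrated with a small simulation. This is a sketch on entirely hypothetical data, not the Anzac Garage data: the sample size, coefficients and noise levels are all assumptions chosen for illustration. Price is generated to depend on odometer only, while odometer is correlated with age, so a regression of price on age alone picks up the odometer effect.

```python
import random

def slope_simple(x, y):
    """OLS slope from regressing y on a single regressor x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def slopes_two(x1, x2, y):
    """OLS slopes from regressing y on x1 and x2 (with intercept)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 5000                 # hypothetical sample size

# Hypothetical data-generating process: age drives odometer, odometer drives price,
# and age has NO direct effect on price.
age = [rng.uniform(1, 15) for _ in range(n)]
odometer = [15000 * a + rng.gauss(0, 20000) for a in age]
price = [50000 - 0.25 * o + rng.gauss(0, 5000) for o in odometer]

b_age_simple = slope_simple(age, price)
b_age_multi, b_odo_multi = slopes_two(age, odometer, price)

print(f"age slope, simple regression:   {b_age_simple:.1f}")
print(f"age slope, multiple regression: {b_age_multi:.1f}")
print(f"odometer slope, multiple:       {b_odo_multi:.3f}")
```

In this setup the simple regression gives age a large negative slope (close to -0.25 × 15000) even though age has no direct effect, while the multiple regression puts the age slope near zero. That is the same qualitative pattern as the drop in the Age coefficient reported above.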


2. Computing Exercise #4
Refer to the computing program and Discussion Question 4.S on
multiple regression.

After estimating three import equations, the first two being simple
linear regressions, the third being a multiple regression containing GNE
and relative prices as explanatory variables, you were asked the
following discussion question:

Are the coefficients β1 and β2 statistically different from zero at the 5%
level? Of the three regression equations you estimated, which one
provides a better explanation of the level of imports?

The p-values for β1 and β2 are both < 0.0005 and hence at all conventional
significance levels one would reject the null hypotheses that these coefficients
are individually equal to zero.

We could interpret "better" in a number of ways. In terms of fit, the third
regression is best in terms of adjusted R²: 0.9713 compared to 0.9457 and
0.3167 in the two simple regression models. (Notice the multiple regression
model will always dominate the two simple regression models in terms of R²
but may not in terms of adjusted R².)

In addition though, you could argue that the multiple regression model is
better because it guards against the omitted variable bias that is likely in the
two simple linear regression models.
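The adjusted R² quoted for the third regression can be reproduced from the R Square (0.9736) and the number of observations (26) reported in its SUMMARY OUTPUT, using the usual formula 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below also includes a second, purely hypothetical pair of R² values (0.9500 and 0.9505, not from the exercise) to show how adjusted R² can fall when an extra variable adds very little.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Third (multiple) regression: R Square and Observations from the SUMMARY OUTPUT.
adj_multiple = adjusted_r_squared(0.9736, n=26, k=2)
print(round(adj_multiple, 4))  # prints 0.9713

# Hypothetical illustration (values NOT from the exercise): a tiny gain in R-squared
# from adding a second regressor can still lower adjusted R-squared.
adj_one_var = adjusted_r_squared(0.9500, n=26, k=1)
adj_two_var = adjusted_r_squared(0.9505, n=26, k=2)
print(adj_two_var < adj_one_var)  # prints True
```

The penalty term (n - 1)/(n - k - 1) grows with k, which is why adjusted R², unlike R², is not guaranteed to rise when a variable is added.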




3. SIA: Sydney housing prices.
Recall the housing price data for Sydney suburbs used in Question 6 in
Week 3. Your statistically naïve friend has been doing some analysis of
Sydney housing prices using these data and has asked you for help. In
addition to the price data there are a number of characteristics
associated with the suburb that have been collected and are likely to
explain some of the large variation in housing prices across suburbs
that is observed in the data. Your friend was very interested in the
impact on housing prices of being located under the flight path. The
regression of housing price on the flightpath variable (Model 1)
provided a result that he did not expect. On your advice he ran a
second regression (Model 2) that included several extra explanatory
variables. Results for Model 1 and Model 2 are presented in the table,
together with a full description of variables used in the analysis.

Housing price is the mean of the median price of houses sold in each
suburb for two quarters (September and December 2002), measured
in thousands of dollars;
SUMMARY OUTPUT (Question 2: multiple regression of imports on GNE and relative prices)

Regression Statistics
Multiple R           0.9867
R Square             0.9736
Adjusted R Square    0.9713
Standard Error       3140.3680
Observations         26

             Coefficients   Standard Error   t Stat    P-value
Intercept    16101.329      10822.442        1.488     0.150
GNE          0.249          0.011            23.406    0.000
Price        -38978.894     8255.354         -4.722    0.000

Distance to CBD is distance measured in kilometres of the suburb from
Sydney's CBD;
Distance to Airport is distance measured in kilometres of the suburb
from Sydney Airport;
Distance to beach is distance of the suburb measured in kilometres
from the nearest beach;
Flightpath is a dummy variable that equals 1 if the suburb is under the
flight path and equals 0 otherwise.

(a) How would you interpret the regression estimates for the
parameters in Model 1, and explain why your friend found the
result to be unexpected.

Because the estimate of β1 is positive, this means houses under the flightpath
on average sell for more ($216,200 more) than houses not under the
flightpath. This is surprising because you would expect that aircraft noise
associated with being under the flightpath would be unattractive and hence
lead to lower, not higher, prices.

(b) Explain why the results in Model 1 are unreliable as a basis for
determining the impact on housing prices of being located under
the flight path. Which of the assumptions associated with simple
linear regression has clearly been violated in Model 1?

You would like to make the statement about the impact of being under the
flightpath holding other factors constant. This is not possible with Model 1
as it is a simple linear regression and hence there is potential for omitted
(confounding) variables that lead to biased estimates of the impact of being
situated under the flightpath.

For example, proximity to the beach is likely to impact on housing prices and
be correlated with being under the flightpath. In Model 1, the variable
Distance to beach is in the disturbance term and hence leads to a violation of
the assumption that E(u|X) = 0.

(c) Write a brief description of the results for Flightpath in Model 2 in
terms of the parameter estimate, its interpretation and its
statistical significance.

The estimated parameter indicates a $51,500 premium (much smaller than
for Model 1) for suburbs under the flightpath relative to those that are not,
holding other factors constant.

For statistical significance:

\[ H_0: \beta_i = 0 \quad \text{versus} \quad H_1: \beta_i \neq 0 \]

where β_i is the i-th regression coefficient.

Because we have a large sample size we can invoke the CLT and use standard
normal critical values when evaluating the test statistics given by bi/se(bi).

If we choose α = 0.05 then the decision rule will be to reject if |bi/se(bi)| > 1.96.

The test statistic for flightpath (51.5/50.2 = 1.03) indicates that this
parameter is not statistically different from zero.
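A short sketch of the two-tailed test for the Flightpath coefficient in Model 2, assuming the standard normal approximation described above. The helper `normal_cdf` and the variable names are illustrative; the inputs 51.5 and 50.2 are the estimate and standard error quoted in the solution.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

b_flightpath = 51.5    # Model 2 estimate, in thousands of dollars
se_flightpath = 50.2   # its standard error
z_crit = 1.96          # two-tailed 5% critical value

z = b_flightpath / se_flightpath
reject_h0 = abs(z) > z_crit                     # two-tailed decision rule
p_two_tailed = 2.0 * (1.0 - normal_cdf(abs(z)))

print(f"z = {z:.2f}, reject H0: {reject_h0}, two-tailed p = {p_two_tailed:.3f}")
```

The p-value is far above 0.05, so the decision is not sensitive to the choice between normal and t critical values.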

(d) Interpret the overall fit of Model 2.

Model 2 produces an R² of 0.372: 37.2% of the variation in Sydney housing
prices is explained by the explanatory variables in the regression.

(e) Use Model 2 to predict the average housing price for the suburb of
Randwick, which is 5.21 km from the CBD, 1.78 km from the
beach, 6.62 km from the airport and is not deemed to be under
the flight path.

\[ \text{Prediction} = 853.5 + 0 - 21.5 \times 5.21 + 21 \times 6.62 - 13.9 \times 1.78 = 855.763 \]

The predicted average house price for Randwick is $855,763.
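The Randwick prediction can be reproduced as below. The dictionary keys are illustrative names; the coefficients are the Model 2 estimates used in the solution (in thousands of dollars) and the distances are those given in the question.

```python
# Model 2 coefficient estimates (thousands of dollars); key names are illustrative.
coef = {
    "flightpath": 51.5,
    "dist_cbd": -21.5,
    "dist_airport": 21.0,
    "dist_beach": -13.9,
}
intercept = 853.5

# Randwick's characteristics from the question (distances in km).
randwick = {
    "flightpath": 0,       # not under the flight path
    "dist_cbd": 5.21,
    "dist_airport": 6.62,
    "dist_beach": 1.78,
}

prediction = intercept + sum(coef[name] * value for name, value in randwick.items())
print(f"Predicted price: {prediction:.3f} thousand dollars")
```

Setting the dummy to 0 removes the Flightpath term, which is the "+ 0" in the worked calculation.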



Multiple regression results for Sydney housing prices*

Dependent variable: Housing price

Explanatory variables     Model 1      Model 2
Intercept                 S69.9        853.5
                          (20.6)       (SS.S)
Flightpath                216.2        51.5
                          (56.0)       (50.2)
Distance to CBD                        -21.5
                                       (S.4)
Distance to Airport                    21.0
                                       (2.9)
Distance to beach                      -13.9
                                       (2.S)
Observations              50S          50S
R squared                 0.029        0.372

* Numbers in brackets below coefficient estimates are standard errors.
