
1 a)

b)

R = SSay / sqrt(SSaa * SSyy) = 0.8696
b = SSay / SSaa = 1.1
c = ybar - b*Abar = -4
Yi = -4 + 1.1Ai
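
Written out with the usual sums-of-squares definitions (a sketch of the standard least-squares formulas that the shorthand above abbreviates):

b = \frac{SS_{AY}}{SS_{AA}} = \frac{\sum_i (A_i - \bar{A})(Y_i - \bar{Y})}{\sum_i (A_i - \bar{A})^2}, \qquad c = \bar{Y} - b\,\bar{A}, \qquad R = \frac{SS_{AY}}{\sqrt{SS_{AA}\,SS_{YY}}}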

c) Using the equation, the predicted income at age 70 is -4 + 1.1(70) = 73.
However, the data show income starting to drop again at around age 60, so income at 70 would
probably be less than 50; intuitively this would be due to retirement.
2 a)

CI = [b_hat - t(n-2)*se(b_hat), b_hat + t(n-2)*se(b_hat)] = [0.4472007, 0.8247993]
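
This interval can be reproduced in Stata. The point estimate and standard error are not shown above, so the values used here (b_hat = 0.636, se = 0.0930) are back-calculated from the interval's midpoint and half-width, using the 35 degrees of freedom from part b), and should be treated as approximate:

. display 0.636 - invttail(35, 0.025)*0.0930
. display 0.636 + invttail(35, 0.025)*0.0930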


i)
ii)
iii)

Ho: B2 = 0
H1: B2 ≠ 0

t crit = t(35, 0.05/2) = 2.0301

Rejection region: T < -2.0301 or T > 2.0301

iv)

We reject the null hypothesis because the test statistic T_H0 is greater than t crit.
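
As a check, the critical value used above can be reproduced in Stata (35 degrees of freedom, 2.5% in each tail):

. display invttail(35, 0.025)

which returns 2.0301, matching the tabulated value.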

c) R^2 = 0.2763
d) The model's functional form should be changed: the U-shape in the residual plot suggests that
the relationship between the variables is not linear.
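
One common way to address this (a sketch only; the variable names y and x are placeholders rather than the names in the assignment's dataset) is to add a quadratic term and re-examine the residuals:

. gen x2 = x^2
. reg y x x2
. rvfplot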
3.
A)

      Source |       SS       df       MS              Number of obs =       7
-------------+------------------------------           F(  1,     5) =    0.85
       Model |  79.1208602     1  79.1208602           Prob > F      =  0.3994
    Residual |  466.593426     5  93.3186851           R-squared     =  0.1450
-------------+------------------------------           Adj R-squared = -0.0260
       Total |  545.714286     6   90.952381           Root MSE      =  9.6602

             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           b |   .2892977   .3141838     0.92   0.399    -.5183375    1.096933
       _cons |    73.6583   10.20055     7.22   0.001     47.43695    99.87966

Therefore: Wi = 73.6583 + 0.2892977 Bi + ei

a = 73.6583
b = 0.2892977
R squared = 0.1450
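
As a check, the reported R-squared can be recomputed from the ANOVA table above, since R-squared = Model SS / Total SS:

. display 79.1208602/545.714286

which gives approximately 0.1450, matching the value reported above.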

B)

[Scatterplot of the data against b (horizontal axis, roughly 10 to 50) with the fitted values line; vertical axis roughly 75 to 95. Legend: b, Fitted values.]
C)
. ttest b==0
One-sample t test
    Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
-------------+----------------------------------------------------------------------
           b |       7    30.31571     4.74434    12.55234     18.70673      41.9247

    mean = mean(b)                                        t = 6.3899
    Ho: mean = 0                         degrees of freedom = 6

    Ha: mean < 0               Ha: mean != 0                Ha: mean > 0
 Pr(T < t) = 0.9997       Pr(|T| > |t|) = 0.0007        Pr(T > t) = 0.0003

Therefore we do not reject the null hypothesis.
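
The t statistic reported in the output can be reproduced from the summary numbers above, since t = mean / (sd / sqrt(n)):

. display 30.31571/(12.55234/sqrt(7))

which gives approximately 6.39, the value shown in the output.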


D)
1/0.2892977 = 3.4566469
E)
. predict res, r
    Team               res
    Baltimore     8.453517
    Boston       -11.49829
    Cleveland    -.1867665
    Detroit      -6.822286
    Milwaukee     9.648299
    New York     -7.644862
    Toronto       8.050388

Therefore the largest residuals (in absolute value) are Boston: -11.4983 and Milwaukee: +9.6483.
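
A quick way to confirm which observations have the largest residuals in absolute value (a sketch; it assumes the team identifier is stored in the variable Team, as in the listing above):

. gen absres = abs(res)
. gsort -absres
. list Team res in 1/2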
4.
A)

i) Units of measurement: prpblck is the proportion of the population that is black, ranging from 0 to 0.9816579; income is measured in dollars, ranging from $15,919 to $136,529.
ii) prpblck mean: 0.1134864; income mean: 47053.78
iii) prpblck SD: 0.1824165; income SD: 13179.29

. summ prpblck income

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     prpblck |       409    .1134864    .1824165          0   .9816579
      income |       409    47053.78    13179.29      15919     136529

B)
Psoda = 0.9563196 + 0.1149882(prpblck) + 1.60e-06(income) + u
R^2 = 0.0642
Sample size = 401


. reg psoda prpblck income


      Source |       SS       df       MS              Number of obs =     401
-------------+------------------------------           F(  2,   398) =   13.66
       Model |  .202552215     2  .101276107           Prob > F      =  0.0000
    Residual |  2.95146493   398  .007415741           R-squared     =  0.0642
-------------+------------------------------           Adj R-squared =  0.0595
       Total |  3.15401715   400  .007885043           Root MSE      =  .08611

       psoda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     prpblck |   .1149882   .0260006     4.42   0.000     .0638724    .1661039
      income |   1.60e-06   3.62e-07     4.43   0.000     8.91e-07    2.31e-06
       _cons |   .9563196    .018992    50.35   0.000     .9189824    .9936568

C)
The coefficient on prpblck is smaller in this simple regression because income and prpblck have a
negative relationship, so the simple regression attributes part of the income effect to prpblck.
Holding income constant gives a larger and more accurate estimate of the relationship between
prpblck and psoda. This is also reflected in the R-squared values: although both are low, the
previous regression produced the higher R-squared.
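
The negative relationship between income and prpblck asserted here can be verified directly (a sketch, assuming the same dataset is still in memory):

. corr prpblck income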

. reg psoda prpblck


      Source |       SS       df       MS              Number of obs =     401
-------------+------------------------------           F(  1,   399) =    7.34
       Model |  .057010466     1  .057010466           Prob > F      =  0.0070
    Residual |  3.09700668   399  .007761922           R-squared     =  0.0181
-------------+------------------------------           Adj R-squared =  0.0156
       Total |  3.15401715   400  .007885043           Root MSE      =   .0881

       psoda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     prpblck |   .0649269    .023957     2.71   0.007     .0178292    .1120245
       _cons |   1.037399   .0051905   199.87   0.000     1.027195    1.047603

D)
. reg lpsoda prpblck lincome

      Source |       SS       df       MS              Number of obs =     401
-------------+------------------------------           F(  2,   398) =   14.54
       Model |  .196020672     2  .098010336           Prob > F      =  0.0000
    Residual |  2.68272938   398  .006740526           R-squared     =  0.0681
-------------+------------------------------           Adj R-squared =  0.0634
       Total |  2.87875005   400  .007196875           Root MSE      =   .0821

      lpsoda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     prpblck |   .1215803   .0257457     4.72   0.000     .0709657    .1721948
     lincome |   .0765114   .0165969     4.61   0.000     .0438829    .1091399
       _cons |   -.793768   .1794337    -4.42   0.000    -1.146524   -.4410117

Log(psoda) = -0.793768 + 0.1215803 prpblck + 0.0765114 Log(income) + U
R squared = 0.0681
N = 401

If prpblck increases by 0.20, the estimated change in log(psoda) is 0.1215803 * 0.2 = 0.0243, so psoda is estimated to rise by about 2.4%.

E)

. reg lpsoda prpblck lincome prppov


      Source |       SS       df       MS              Number of obs =     401
-------------+------------------------------           F(  3,   397) =   12.60
       Model |  .250340622     3  .083446874           Prob > F      =  0.0000
    Residual |  2.62840943   397  .006620679           R-squared     =  0.0870
-------------+------------------------------           Adj R-squared =  0.0801
       Total |  2.87875005   400  .007196875           Root MSE      =  .08137

      lpsoda |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     prpblck |   .0728072   .0306756     2.37   0.018     .0125003    .1331141
     lincome |   .1369553   .0267554     5.12   0.000     .0843552    .1895553
      prppov |     .38036   .1327903     2.86   0.004     .1192999    .6414201
       _cons |  -1.463333   .2937111    -4.98   0.000    -2.040756   -.8859092

The estimate of B1 changes from 0.1215803 to 0.0728072, suggesting that the effect of prppov was
previously being absorbed into the coefficient on prpblck: prpblck was correlated with the omitted
prppov (part of the error term), which in turn affects lpsoda. However, if you check the correlations
between the three variables, the correlation between psoda and prppov is low (0.0260) while the
correlation between prpblck and prppov is fairly high (0.6795), so it may also be a multicollinearity issue.
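
The size of this change is consistent with the standard omitted-variable algebra (a textbook identity, not part of the assignment's output). Writing \hat{\delta} for the coefficient on prpblck in a regression of prppov on prpblck and lincome,

\tilde{\beta}_{prpblck} = \hat{\beta}_{prpblck} + \hat{\beta}_{prppov}\,\hat{\delta}

With the estimates above, 0.1215803 = 0.0728072 + 0.38036 * \hat{\delta} implies \hat{\delta} is roughly 0.13.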
. corr psoda prpblck prppov
(obs=401)

             |    psoda  prpblck   prppov
-------------+---------------------------
       psoda |   1.0000
     prpblck |   0.1344   1.0000
      prppov |   0.0260   0.6795   1.0000

F).
. corr lincome prppov
(obs=409)

             |  lincome   prppov
-------------+------------------
     lincome |   1.0000
      prppov |  -0.8385   1.0000

Yes, this is what I would expect: a correlation close to -1 makes sense, since as income goes up,
poverty goes down.
G) Having log(income) and prppov in the same regression model causes multicollinearity, the problem
of two or more independent variables in a regression being highly correlated. But since prpblck is
not too highly correlated with either of these variables, it is still viable to include both in the
regression when looking at the causal effect of prpblck on psoda.
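
One standard way to gauge how serious this is (a sketch; it assumes the same dataset is still loaded) is to compute variance inflation factors after the full regression:

. reg lpsoda prpblck lincome prppov
. estat vif

Large VIFs for lincome and prppov would confirm that their strong mutual correlation inflates those standard errors, while a modest VIF for prpblck would support keeping all three regressors when the interest is in prpblck.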
