
Bivariate Regression Lab #8

In order to identify explanatory relationships between two variables, we will use a
bivariate regression approach. For this example, use the same variables from Lab 7
(Correlation): age and income98. We saw that there was a significant correlation
between these two variables; now we will use regression to obtain a precise estimate of
what that relationship is in our sample.
To begin, I am going to re-estimate the correlation coefficient with a significance
indicator (p-value) using the same command from Lab 7. To do this, type the following
command in the command line editor:
pwcorr age income98, sig
The results indicate that there is a significant (p-value = 0.000) weak positive correlation
(0.1488) between age and income for our survey respondents.

. pwcorr age income98, sig

             |      age income98
-------------+------------------
         age |   1.0000
             |
    income98 |   0.1488   1.0000
             |   0.0000

To examine this correlation via regression, follow the Statistics > Linear models and
related > Linear regression menu options.

In the Linear Regression popup window, we need to identify the Dependent Variable (DV)
and the Independent Variable (IV). This is the first time this semester that we have had to
identify the DV and IV prior to running the statistical test. To this point, all of our
analyses could be carried out mechanically without formally specifying the DV and
the IV, although we may have had a theoretical idea of what the relationship
is.
For this analysis, we are going to say that your income (income98) is dependent on your
age (age); thus, the DV = income98 and the IV = age. Enter these into the linear
regression window, leave everything else as is, and click OK.

OUTPUT, NEXT PAGE.

The regression output can be split into two sections. The top half contains all of the model
fit information. From the model fit information, we can see that the model is statistically
significant with an F value = 21.00 (p-value = 0.000). This F-statistic indicates that a
significant amount of the total variation in the DV (income98) is explained by the age
IV, relative to the amount of variation left unexplained. These show up in the model fit
table as the MS-Model (458.095281) and the MS-Residual (21.8092642), respectively; the
F value is their ratio. Finally, we can also see our R2 statistic here (0.0221), indicating
that about 2.2% of the variation in income is accounted for by the variation in age in
this linear model.

. regress income98 age

      Source |       SS       df       MS              Number of obs =     930
-------------+------------------------------           F(1, 928)     =   21.00
       Model |  458.095281     1  458.095281           Prob > F      =  0.0000
    Residual |  20238.9972   928  21.8092642           R-squared     =  0.0221
-------------+------------------------------           Adj R-squared =  0.0211
       Total |  20697.0925   929  22.2788939           Root MSE      =    4.67
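The model fit figures reported by Stata can be cross-checked by hand from the sums of squares and degrees of freedom. The short Python sketch below is only an illustration outside Stata (the variable names are my own, and the numbers are copied from the output in this lab); it recomputes F, R-squared, adjusted R-squared, and Root MSE, and squares the Lab 7 correlation to show that it agrees with R-squared in the bivariate case.

```python
import math

# Values copied from the model fit (ANOVA) portion of the regress output.
ss_model, df_model = 458.095281, 1
ss_resid, df_resid = 20238.9972, 928
ss_total = ss_model + ss_resid      # matches the reported Total SS up to rounding
df_total = df_model + df_resid      # 929

# Mean squares: sum of squares divided by its degrees of freedom.
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid
ms_total = ss_total / df_total

# F is the ratio of explained to unexplained mean squares.
f_stat = ms_model / ms_resid

# R-squared is the share of total variation explained by the model.
r_squared = ss_model / ss_total

# Adjusted R-squared penalizes for the degrees of freedom used.
adj_r_squared = 1 - ms_resid / ms_total

# Root MSE is the square root of the residual mean square.
root_mse = math.sqrt(ms_resid)

# With a single IV, R-squared equals the squared correlation from pwcorr.
r_from_pwcorr = 0.1488
r_squared_from_r = r_from_pwcorr ** 2

print("F          =", round(f_stat, 2))
print("R-squared  =", round(r_squared, 4))
print("Adj R2     =", round(adj_r_squared, 4))
print("Root MSE   =", round(root_mse, 2))
print("r squared  =", round(r_squared_from_r, 4))
```

Each printed value should agree with the corresponding entry in the Stata table to the number of decimals Stata reports.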

The second half of the output is where we find the slope coefficient (Coef.) associated
with the unstandardized relationship between age and income (0.0516031). We see
that this slope coefficient is statistically significant with a t-value of 4.58 and a p-value of
0.000. In interpretation, we would say that for each additional year of age reported by
the survey respondents, we find a 0.0516031 unit increase in the category of income
reported by the respondents.
    income98 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0516031   .0112595     4.58   0.000     .0295061    .0737001
       _cons |   15.45208   .4952788    31.20   0.000     14.48008    16.42407
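The coefficient table can be read the same way. The Python sketch below is an illustration only, with names of my own choosing and numbers copied from the output in this lab. It recovers the t-value as the coefficient divided by its standard error, backs out the critical value implied by the reported confidence interval, and applies the fitted line to a hypothetical respondent. Keep in mind that income98 is recorded in income categories, so the predicted value is a position on that category scale, not a dollar amount.

```python
# Values copied from the coefficient portion of the regress output.
coef_age, se_age = 0.0516031, 0.0112595
coef_cons = 15.45208
ci_low, ci_high = 0.0295061, 0.0737001

# The t-value is the coefficient divided by its standard error.
t_age = coef_age / se_age

# The interval is coef +/- t_crit * se, so the reported bounds imply t_crit.
ci_midpoint = (ci_low + ci_high) / 2
t_crit_implied = (ci_high - ci_low) / (2 * se_age)

def predicted_income_category(age):
    """Fitted value of income98 (category units) for a given age."""
    return coef_cons + coef_age * age

# Hypothetical ages chosen only for illustration.
pred_40 = predicted_income_category(40)
pred_41 = predicted_income_category(41)

print("t for age        =", round(t_age, 2))
print("CI midpoint      =", round(ci_midpoint, 7))
print("implied t_crit   =", round(t_crit_implied, 2))
print("predicted at 40  =", round(pred_40, 3))
print("change per year  =", round(pred_41 - pred_40, 7))
```

The one-year difference between the two predictions equals the slope coefficient, which is the interpretation given in the text: each additional year of age is associated with a 0.0516031 unit increase on the income category scale.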
