Confidence intervals for regression parameters

A statistic calculated from a sample provides a point estimate of the unknown parameter. A point estimate can be thought of as the single best guess for the population value. While the estimated value from the sample is typically different from the value of the unknown population parameter, the hope is that it is not too far away. Based on the sample estimates, it is possible to calculate a range of values that, with a designated likelihood, includes the population value. Such a range is called a confidence interval.

NOTE: A 90% C.I. can be interpreted as follows: if we take 100 samples of the same size under the same conditions and compute a C.I. for the parameter from each sample, then about 90 of those C.I.s will contain the parameter (i.e., not all of the constructed C.I.s).
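To make this coverage interpretation concrete, the short Python sketch below (a minimal illustration, not part of the original example; the population mean, spread, and sample size are arbitrary choices) simulates repeated sampling and counts how many 90% confidence intervals for the mean actually contain it.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps, conf = 50.0, 10.0, 25, 1000, 0.90
t_crit = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)  # two-sided critical value

covered = 0
for _ in range(reps):
    sample = rng.normal(mu, sigma, n)
    half_width = t_crit * sample.std(ddof=1) / np.sqrt(n)  # t * SE of the mean
    if sample.mean() - half_width <= mu <= sample.mean() + half_width:
        covered += 1

print(covered, "of", reps, "intervals contain mu (expected ~90%)")
```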
A confidence interval estimate of a parameter is more informative than a point estimate because it reflects the precision of the estimate. The width of the C.I. (i.e., U.L. − L.L.) measures the precision of the estimate: the narrower the interval, the more precise the estimate. The precision can be increased either by decreasing the confidence level or by increasing the sample size.
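Both effects can be checked numerically; the sketch below is an illustrative aside (the per-observation spread is an assumed value) showing that the t critical value shrinks as the confidence level drops, and the standard error shrinks as n grows.

```python
import math
from scipy import stats

# The critical value (hence the interval width) shrinks as the confidence level drops...
for conf in (0.99, 0.95, 0.90):
    print(conf, round(stats.t.ppf(1 - (1 - conf) / 2, df=6), 3))

# ...and the standard error shrinks as the sample size grows (SE ~ 1/sqrt(n)).
sigma = 1.0  # assumed per-observation spread, for illustration only
for n in (10, 40, 160):
    print(n, round(sigma / math.sqrt(n), 3))
```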

(iv):- Test the hypothesis that β0 = 3.5


Test of hypothesis for β0
1) Construction of hypotheses
H0: β0 = 3.5
H1: β0 ≠ 3.5
2) Level of significance
α = 5%
3) Test statistic
$t = \frac{b_0 - \beta_0}{SE(b_0)} = \frac{3.47 - 3.5}{0.06} = -0.5$
4) Decision Rule:- Reject H0 if $|t_{cal}| \ge t_{\alpha/2}(n-2) = t_{0.025}(6) = 2.447$
5) Result:- Since $|-0.5| = 0.5 < 2.447$, do not reject H0.
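Steps 3–5 can be reproduced in a few lines of Python. This is only a sketch built around the numbers quoted above (b0 = 3.47, SE = 0.06, n = 8); the same helper applies unchanged to the β1 test further below.

```python
from scipy import stats

def t_test_coef(estimate, hypothesized, se, n, alpha=0.05):
    """Two-sided t-test for a regression coefficient with n - 2 d.f."""
    t_cal = (estimate - hypothesized) / se
    t_tab = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t_cal, t_tab, abs(t_cal) >= t_tab  # True means reject H0

# Test H0: beta0 = 3.5 using the values from the text.
t_cal, t_tab, reject = t_test_coef(3.47, 3.5, 0.06, n=8)
print(t_cal, t_tab, reject)  # -0.5, 2.447, False -> do not reject H0
```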


(vi):- Construct 95% C.I. for regression parameters.
95% C.I. for β0:

$b_0 \pm t_{\alpha/2}(n-2)\, SE(b_0)$

$3.47 \pm t_{0.025}(6)(0.06)$

$3.47 \pm (2.447)(0.06)$

$(3.32,\ 3.62)$
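The same recipe in code, again using only the quoted values (b0 = 3.47, SE = 0.06, n = 8); swapping in b1 and SE(b1) gives the β1 interval constructed later.

```python
from scipy import stats

b0, se_b0, n = 3.47, 0.06, 8
t_tab = stats.t.ppf(0.975, df=n - 2)     # t_{0.025}(6) = 2.447
lower, upper = b0 - t_tab * se_b0, b0 + t_tab * se_b0
print(round(lower, 2), round(upper, 2))  # 3.32, 3.62
```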

(v):- Test the hypothesis that there is no linear relation between Y and X, i.e., β1 = 0

Test of hypothesis for β1
1) Construction of hypotheses
H0: β1 = 0
H1: β1 ≠ 0
2) Level of significance
α = 5%
3) Test statistic
$t = \frac{b_1 - \beta_1}{SE(b_1)} = \frac{-0.0878 - 0}{0.005} = -17.56$
4) Decision Rule:- Reject H0 if $|t_{cal}| \ge t_{\alpha/2}(n-2) = t_{0.025}(6) = 2.447$
5) Result:- Since $|-17.56| = 17.56 > 2.447$, reject H0 and conclude that there is a significant relationship between temperature and oxygen consumption.
(vi):- Construct 95% C.I. for regression parameters.
95% C.I. for β1:

$b_1 \pm t_{\alpha/2}(n-2)\, SE(b_1)$

$-0.0878 \pm t_{0.025}(6)(0.005)$

$-0.0878 \pm (2.447)(0.005)$

$(-0.1,\ -0.076)$
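In practice, b1 and SE(b1) come straight from the fitted model. The sketch below runs scipy.stats.linregress on a small hypothetical temperature/oxygen-consumption data set (invented for illustration; it is not the data behind the numbers above) and computes the slope test and interval the same way.

```python
import numpy as np
from scipy import stats

# Hypothetical data, invented for illustration only.
temp = np.array([10, 15, 20, 25, 30, 35, 40, 45], dtype=float)
oxygen = np.array([3.0, 2.6, 2.1, 1.7, 1.2, 0.8, 0.4, 0.1])

fit = stats.linregress(temp, oxygen)
t_tab = stats.t.ppf(0.975, df=len(temp) - 2)

print("slope b1 =", fit.slope, "SE(b1) =", fit.stderr)
print("t for H0: beta1 = 0:", fit.slope / fit.stderr)
print("95% C.I.:", (fit.slope - t_tab * fit.stderr,
                    fit.slope + t_tab * fit.stderr))
```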

(vii):- Perform Analysis of Variance. Calculate and interpret the coefficient of determination.
ANALYSIS OF VARIANCE IN SIMPLE LINEAR REGRESSION

The Analysis of Variance table is also known as the ANOVA table (for ANalysis Of VAriance). It tells the story of how the regression equation accounts for variability in the response variable.

The column labeled Source has three rows: Regression, Residual, and Total. The column labeled Sum of Squares describes the variability in the response variable, Y. The variation in the dependent variable is partitioned into explained and unexplained variation:

Total variation = Explained variation (variation due to X, also called variation due to regression) + Unexplained variation (variation due to unknown factors)

Total variation:- First, the overall variability of the dependent variable is calculated by computing the sum of squares of deviations of the Y-values from $\bar{Y}$, a quantity termed the total sum of squares:

$S(YY) = \sum (Y - \bar{Y})^2 = 8.915$
Explained variation (variation in Y due to X, also called variation due to regression):

$b_1 S(XY) = -0.0878(-99.65) = 8.7452$ (using the unrounded slope)

Unexplained variation: Total variation − explained variation = 8.915 − 8.7452 = 0.1698

Associated with any sum of squares is its degrees of freedom (the number of independent observations). The TSS has n − 1 d.f. because it loses 1 d.f. in computing the sample mean $\bar{Y}$; the regression SS has k − 1 d.f. because there is only one independent variable; and the residual SS has n − k d.f., where k is the number of parameters in the model.

The hypothesis β1 = 0 may be tested by the analysis of variance procedure.

ANOVA TABLE

Source of Variation   Degrees of       Sum of     Mean Sum of      Fcal      Ftab
(S.O.V)               Freedom (DF)     Squares    Squares
                                       (SS)       (MSS = SS/df)
Regression            k-1 = 1          8.7452     8.7452           308.93*   F.05(1,6) = 5.99
Error                 n-k = 8-2 = 6    0.1698     0.0283
TOTAL                 n-1 = 8-1 = 7    8.915

S = 0.1682    R-Sq = 98.1%    R-Sq(adj) = 97.8%

Relation between F and t for testing β1 = 0:

$F = t^2$: $308.93 = (-17.56)^2$ (up to rounding)
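The whole ANOVA table, the F = t² check, and the R² value discussed next can all be rebuilt from the two quantities the text starts from, S(YY) = 8.915 and the explained SS 8.7452; a minimal sketch:

```python
from scipy import stats

n, k = 8, 2              # observations, parameters in the model
tss = 8.915              # total sum of squares, S(YY)
reg_ss = 8.7452          # explained SS, b1 * S(XY)
res_ss = tss - reg_ss    # unexplained SS = 0.1698

ms_reg = reg_ss / (k - 1)
mse = res_ss / (n - k)                    # 0.0283; S = sqrt(MSE) = 0.1682
f_cal = ms_reg / mse                      # ~308.9, matches t^2 = (-17.56)^2
f_tab = stats.f.ppf(0.95, k - 1, n - k)   # F.05(1,6) = 5.99

r_squared = reg_ss / tss                  # 0.981 -> 98.1%
print(f_cal, f_tab, r_squared)
```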
Goodness of Fit:
An important part of any statistical procedure that builds models from data is establishing how well the model actually fits. This topic encompasses detecting possible violations of the required assumptions in the data being analyzed and checking how close the observed data points lie to the fitted line.

A commonly used measure of the goodness of fit of a linear model is R², called the coefficient of determination. R² is the squared multiple correlation coefficient. It is the Regression sum of squares divided by the Total sum of squares, RegSS/TotSS, and is the fraction of the variability in the response that is accounted for by the model. Some call R² the proportion of the variance explained by the model. If a model has perfect predictability, the Residual Sum of Squares will be 0 and R² = 1. If a model has no predictive capability, R² = 0.

$R^2 = \frac{\text{Reg.SS}}{\text{Total SS}} \times 100 = \frac{8.7452}{8.915} \times 100 = 98.1\%$

The value of R² indicates that about 98% of the variation in the dependent variable has been explained by the linear relationship with X; the remaining variation is due to other unknown factors.

(x): Test the goodness of fit of the regression model by residual plot

Residual Plot:- The estimated residuals $e_i$ are defined as the differences between the observed and fitted values of the $y_i$, i.e., $e_i = Y_i - \hat{Y}_i$.

The plot of the $e_i$ against the corresponding fitted values $\hat{Y}_i$ provides useful information about the appropriateness of the model. If the plot of residuals against the $\hat{Y}_i$ is a random scatter and does not show any systematic pattern, then we conclude that the model is appropriate.

NOTE: If there are some residuals with very large values, it may be an indication of the presence of outliers (values that are not consistent with the rest of the data).
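As a sketch of how such a plot is produced (with hypothetical fitted values and residuals, since the original data are not reproduced here):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
fitted = np.linspace(0.5, 3.0, 8)    # hypothetical Y-hat values
residuals = rng.normal(0, 0.17, 8)   # hypothetical e_i scattered around 0

plt.scatter(fitted, residuals)
plt.axhline(0, linestyle="--")       # reference line at e = 0
plt.xlabel("Fitted values (Y-hat)")
plt.ylabel("Residuals (e)")
plt.title("Residuals vs fitted: look for a patternless scatter")
plt.show()
```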
