Chapter 17
Least-Squares Regression
9 4444 260
Physics I/II, English 123, Statics, Dynamics, Strength, Structure I/II, C++, Java, Data, Algorithms,
Numerical, Economy
eng-hs.com, eng-hs.net
July 2013
Where substantial error is associated with data, ... the following figure (a) shows seven ... with higher values of x. ... A more appropriate strategy is to derive an ...
To avoid this, some criterion ...
... $(x_n, y_n)$.
$$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)$$
where n = total number of points. However, this is an
inadequate criterion, as illustrated by the next figure, which
shows the fit of a straight line to two points.
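A minimal sketch of why this criterion fails, assuming two illustrative points (the values below are hypothetical and are not read from the figure). Every line that passes through the midpoint of the two points gives a residual sum of zero, so the criterion cannot single out one line.

```python
def residual_sum(points, a0, a1):
    """Sum of residuals e_i = y_i - a0 - a1*x_i (no absolute value, no square)."""
    return sum(y - a0 - a1 * x for x, y in points)

# Two illustrative points (assumed values, not taken from the figure).
points = [(1.0, 1.0), (3.0, 5.0)]
x_mid, y_mid = 2.0, 3.0  # midpoint of the two points

# Any line through the midpoint has intercept a0 = y_mid - a1 * x_mid.
for a1 in (2.0, 0.0, -1.0, 10.0):
    a0 = y_mid - a1 * x_mid
    total = residual_sum(points, a0, a1)
    print(f"slope = {a1:5.1f}   sum of residuals = {total:.6f}")
```

Only the slope 2.0 actually passes through both points, yet all four slopes score equally well because positive and negative residuals cancel.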
$$\sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|$$
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
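$$0 = \sum y_i - \sum a_0 - \sum a_1 x_i$$
$$0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2$$
Because $\sum a_0 = n a_0$, these can be written as the normal equations
$$n a_0 + \left(\sum x_i\right) a_1 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 = \sum x_i y_i$$
which can be solved for
$$a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$
$$a_0 = \bar{y} - a_1 \bar{x}$$
where $\bar{y}$ and $\bar{x}$ are the means of y and x, respectively.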
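The slope and intercept formulas above translate directly into a few lines of code. The following is a minimal sketch, not a library routine: the function name `linear_fit` and the sample points are assumptions made for illustration (the points lie exactly on y = 2 + 3x, so the fit should return that line).

```python
def linear_fit(xs, ys):
    """Least-squares straight line y = a0 + a1*x from the normal equations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a1 = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a0 = sum_y / n - a1 * sum_x / n             # intercept = ybar - a1*xbar
    return a0, a1

# Illustrative data (assumed): points generated from y = 2 + 3x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0, 5.0, 8.0, 11.0]
a0, a1 = linear_fit(xs, ys)
print(f"a0 = {a0:.6f}, a1 = {a1:.6f}")
```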
Solution:
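The following quantities can be computed:
$$n = 7 \qquad \sum x_i y_i = 119.5 \qquad \sum x_i^2 = 140$$
$$\sum x_i = 28 \qquad \bar{x} = \frac{28}{7} = 4$$
$$\sum y_i = 24 \qquad \bar{y} = \frac{24}{7} = 3.428571$$
Using the previous two equations,
$$a_1 = \frac{7(119.5) - 28(24)}{7(140) - (28)^2} = 0.8392857$$
$$a_0 = 3.428571 - 0.8392857(4) = 0.07142857$$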
The line, along with the data, is shown in the first figure (c).
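As a quick check of the arithmetic in this solution, the sketch below plugs the summations quoted above into the slope and intercept formulas. Only the sums are used, because the raw data table is not reproduced here; the variable names are assumptions made for illustration.

```python
# Summations quoted in the solution above.
n = 7
sum_x = 28.0
sum_y = 24.0
sum_xy = 119.5
sum_x2 = 140.0

# Slope and intercept from the normal equations.
a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a0 = sum_y / n - a1 * (sum_x / n)

print(f"a1 = {a1:.7f}")
print(f"a0 = {a0:.8f}")
```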
17.1.3 Quantification of Error of Linear Regression
Any line other than the one computed in the previous
example results in a larger sum of the squares of the
residuals. Thus, the line is unique and in terms of our
chosen criterion is a best line through the points.
... were computed.
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
$$s_{y/x} = \sqrt{\frac{S_r}{n-2}}$$
where $s_{y/x}$ is called the standard error of the estimate. The subscript notation "y/x" designates that the error is ...
... we have lost two degrees of freedom.
The above concepts can be used to quantify the "goodness" of our fit. This is particularly useful for comparison of several regressions ...
... the dependent variable ... residuals around the regression line.
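$$r^2 = \frac{S_t - S_r}{S_t}$$
$$r = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}}$$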
Example 17.2 Estimate of Errors for the Linear Least-Squares Fit
Problem Statement:
Compute the total standard deviation, the standard error of
the estimate, and the correlation coefficient for the data in
Example 17.1
Solution:
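The summations are performed and presented in the previous example's table. The standard deviation is
$$s_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.9457$$
and the standard error of the estimate is
$$s_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$$
Thus, because $s_{y/x} < s_y$, the linear regression model is efficient.
The extent of the improvement is quantified by
$$r^2 = \frac{S_t - S_r}{S_t} = \frac{22.7143 - 2.9911}{22.7143} = 0.868$$
or
$$r = \sqrt{0.868} = 0.932$$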
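The three error measures in this example depend only on $S_t$, $S_r$, and $n$, so they are easy to reproduce. The sketch below uses the values quoted in the solution; the variable names are assumptions made for illustration, and small differences in the last digit can arise because the quoted sums are already rounded.

```python
import math

n = 7          # number of data points
s_t = 22.7143  # total sum of squares around the mean
s_r = 2.9911   # sum of squares of residuals around the regression line

s_y = math.sqrt(s_t / (n - 1))    # standard deviation of y
s_yx = math.sqrt(s_r / (n - 2))   # standard error of the estimate
r2 = (s_t - s_r) / s_t            # coefficient of determination
r = math.sqrt(r2)                 # correlation coefficient

print(f"s_y   = {s_y:.4f}")
print(f"s_y/x = {s_yx:.4f}")
print(f"r^2   = {r2:.3f}")
print(f"r     = {r:.3f}")
```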
... variables is linear.
This is not always the case, and the first step in any regression analysis should be to plot and visually inspect the data to determine whether a linear model applies. For example, the next figure shows some data that are obviously curvilinear. In some cases, techniques such as polynomial regression are appropriate. In others, transformations can be used to express the data in a form that is compatible with linear regression.
One example is the exponential model
$$y = \alpha_1 e^{\beta_1 x} \qquad (17.12)$$
where $\alpha_1$ and $\beta_1$ are constants. As shown in the next figure, the equation represents a nonlinear relationship (for $\beta_1 \neq 0$) between x and y.
Another example of a nonlinear model is the simple power equation
$$y = \alpha_2 x^{\beta_2} \qquad (17.13)$$
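where $\alpha_2$ and $\beta_2$ are constants. ...
$$y = \alpha_3 \frac{x}{\beta_3 + x} \qquad (17.14)$$
where $\alpha_3$ and $\beta_3$ are constants. ...
... A simpler alternative is to use mathematical ...
But because $\ln e = 1$,
$$\ln y = \ln \alpha_1 + \beta_1 x$$
... and an intercept of $\ln \alpha_1$.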
Thus, a plot of $1/y$ versus $1/x$ will be linear, with a slope of $\beta_3/\alpha_3$ and an intercept of $1/\alpha_3$ (previous figure, panel f).
In their transformed forms, these models can use linear regression ... predictive purposes.
Solution:
The next figure (a) is a plot of the original data in its
untransformed state. Figure (b) shows the plot of the
transformed data. A linear regression of the log-transformed
data yields the result
$$\log y = 1.75 \log x - 0.300$$
so that $\alpha_2 = 10^{-0.3} = 0.5$. The slope is $\beta_2 = 1.75$.
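The log-log procedure in this example can be sketched as follows. The original data table is not reproduced here, so the sample points are synthetic: they are generated from the fitted result y = 0.5 x^1.75 purely to show that the transform, fit, and back-transform recover those coefficients. The function names `linear_fit` and `power_fit` are assumptions made for illustration.

```python
import math

def linear_fit(xs, ys):
    """Least-squares straight line y = a0 + a1*x; returns (a0, a1)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n
    return a0, a1

def power_fit(xs, ys):
    """Fit y = alpha * x**beta by linear regression of log10(y) on log10(x)."""
    log_x = [math.log10(x) for x in xs]   # requires x > 0
    log_y = [math.log10(y) for y in ys]   # requires y > 0
    intercept, slope = linear_fit(log_x, log_y)
    return 10 ** intercept, slope         # back-transform the intercept

# Synthetic data (assumed), generated from y = 0.5 * x**1.75.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.5 * x ** 1.75 for x in xs]

alpha2, beta2 = power_fit(xs, ys)
print(f"alpha2 = {alpha2:.4f}, beta2 = {beta2:.4f}")
```

Note that this minimizes the squared residuals of log y rather than of y itself, so with noisy data the result generally differs from a direct nonlinear fit.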
17.1.6 General Comments on Linear Regression
We have focused on the ...
... to accomplish this objective is to use ... that we fit ...
$$y = a_0 + a_1 x + a_2 x^2 + e$$
$$S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - a_2 x_i^2\right)^2$$
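$$n a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$$
$$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$$
These are three linear equations in the three unknowns $a_0$, $a_1$, and $a_2$.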
Solution:
From the given data,
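$$m = 2 \qquad n = 6 \qquad \bar{x} = 2.5 \qquad \bar{y} = 25.433$$
$$\sum x_i = 15 \qquad \sum x_i^2 = 55 \qquad \sum x_i^3 = 225 \qquad \sum x_i^4 = 979$$
$$\sum y_i = 152.6 \qquad \sum x_i y_i = 585.6 \qquad \sum x_i^2 y_i = 2488.8$$
$$\begin{bmatrix} 6 & 15 & 55 \\ 15 & 55 & 225 \\ 55 & 225 & 979 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{Bmatrix}$$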
Continue:
Therefore, the least-squares quadratic equation for this case is
$$y = 2.47857 + 2.35929x + 1.86071x^2$$
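The coefficients follow from solving the 3×3 system of normal equations in this example. The sketch below does so with a small Gaussian elimination routine with partial pivoting; the function name `solve_linear_system` is an assumption made for illustration, and any linear solver could be substituted.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot magnitude.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

# Normal equations for the quadratic fit, using the sums from the solution.
a = [[6.0, 15.0, 55.0],
     [15.0, 55.0, 225.0],
     [55.0, 225.0, 979.0]]
b = [152.6, 585.6, 2488.8]

a0, a1, a2 = solve_linear_system(a, b)
print(f"a0 = {a0:.5f}, a1 = {a1:.5f}, a2 = {a2:.5f}")
```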
The standard error of the estimate based on the regression
polynomial is
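$$s_{y/x} = \sqrt{\frac{S_r}{n-(m+1)}} = \sqrt{\frac{3.74657}{6-3}} = 1.12$$
The coefficient of determination is
$$r^2 = \frac{S_t - S_r}{S_t} = \frac{2513.39 - 3.74657}{2513.39} = 0.99851$$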
17.3 Multiple Linear Regression
... Such an equation is particularly useful when fitting ...
As with the previous cases, the best values of the
coefficients are determined by setting up the sum of the
squares of the residuals,
$$S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_{1i} - a_2 x_{2i}\right)^2$$
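$$\begin{bmatrix} n & \sum x_{1i} & \sum x_{2i} \\ \sum x_{1i} & \sum x_{1i}^2 & \sum x_{1i} x_{2i} \\ \sum x_{2i} & \sum x_{1i} x_{2i} & \sum x_{2i}^2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum y_i \\ \sum x_{1i} y_i \\ \sum x_{2i} y_i \end{Bmatrix}$$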
Example 17.6 Multiple Linear Regression
Problem Statement:
The following data were calculated from
the equation $y = 5 + 4x_1 - 3x_2$:
Use multiple linear regression to fit this
data.
Solution:
The result is
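$$\begin{bmatrix} 6 & 16.5 & 14 \\ 16.5 & 76.25 & 48 \\ 14 & 48 & 54 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 54 \\ 243.5 \\ 100 \end{Bmatrix}$$
which can be solved to give
$$a_0 = 5 \qquad a_1 = 4 \qquad a_2 = -3$$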
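The whole procedure, from data to coefficients, can be sketched as below. The data table for this example did not survive in this copy, so the sample points are an assumption: six (x1, x2) pairs with y computed from y = 5 + 4x1 - 3x2, chosen so that they reproduce the sums in the system above. The function names are likewise hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def dot(u, v):
    """Sum of element-wise products of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def multiple_linear_fit(x1, x2, y):
    """Build and solve the normal equations for y = a0 + a1*x1 + a2*x2."""
    ones = [1.0] * len(y)
    columns = [ones, x1, x2]
    matrix = [[dot(ci, cj) for cj in columns] for ci in columns]
    rhs = [dot(ci, y) for ci in columns]
    return matrix, rhs, solve_linear_system(matrix, rhs)

# Assumed sample points, consistent with y = 5 + 4*x1 - 3*x2.
x1 = [0.0, 2.0, 2.5, 1.0, 4.0, 7.0]
x2 = [0.0, 1.0, 2.0, 3.0, 6.0, 2.0]
y = [5.0 + 4.0 * u - 3.0 * v for u, v in zip(x1, x2)]

matrix, rhs, coeffs = multiple_linear_fit(x1, x2, y)
print("normal-equation matrix:", matrix)
print("right-hand side:", rhs)
print("a0, a1, a2 =", [round(c, 6) for c in coeffs])
```

Because the data contain no noise, the fit recovers the generating equation exactly (up to rounding).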
The foregoing two-dimensional case can be easily extended to m dimensions, as in
$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + e$$
where the standard error is formulated as
$$s_{y/x} = \sqrt{\frac{S_r}{n-(m+1)}}$$
Such equations are extremely useful when fitting experimental data. To use multiple linear regression, the equation is ...
Problem 17.5
[Data table only partly recoverable. Surviving entries, in extraction order: 11, 15, 17, 21, 23, 29; 21, 29, 14, 21, 15; 29, 37, 39; y: 29, 13, 3]
($s_{y/x} = 4.476306$; $r = 0.901489$)
At x = 10, the best fit equation gives 23.2543. The line and
data can be plotted along with the point (10, 10).
$$k = \frac{k_{\max} c^2}{c_s + c^2}$$
where $c_s$ and $k_{\max}$ ... $c_s$ and $k_{\max}$ ... mg/L.
c    0.5    0.8    1.5    2.5    4
k    1.1    2.4    5.3    7.6    8.9
Solution:
The equation can be linearized by inverting it to yield
$$\frac{1}{k} = \frac{c_s}{k_{\max}} \frac{1}{c^2} + \frac{1}{k_{\max}}$$
c      k      1/c^2      1/k        (1/c^2)(1/k)   (1/c^2)^2
0.5    1.1    4.000000   0.909091   3.636364       16.000000
0.8    2.4    1.562500   0.416667   0.651042       2.441406
1.5    5.3    0.444444   0.188679   0.083857       0.197531
2.5    7.6    0.160000   0.131579   0.021053       0.025600
4      8.9    0.062500   0.112360   0.007022       0.003906
Sum           6.229444   1.758375   4.399338       18.66844
Continue:
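The slope and the intercept can be computed as
$$a_1 = \frac{5(4.399338) - 6.229444(1.758375)}{5(18.66844) - (6.229444)^2} = 0.202489$$
$$a_0 = \frac{1.758375}{5} - 0.202489\,\frac{6.229444}{5} = 0.099396$$
Therefore, $k_{\max} = 1/a_0 = 10.06074$ and $c_s = k_{\max} a_1 = 10.06074(0.202489) = 2.037189$, and the fit is
$$k = \frac{10.06074\,c^2}{2.037189 + c^2}$$
At c = 2 mg/L, the equation gives
$$k = \frac{10.06074\,(2)^2}{2.037189 + (2)^2} = 6.666$$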
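The full linearization can be reproduced in a few lines, as sketched below under two assumptions: the helper name `linear_fit` is illustrative, and the fifth concentration, c = 4, is inferred from the tabulated value 1/c^2 = 0.0625 because the entry itself is missing from the data listing.

```python
def linear_fit(xs, ys):
    """Least-squares straight line y = a0 + a1*x; returns (a0, a1)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n
    return a0, a1

# Data from the problem; c = 4.0 is inferred from 1/c^2 = 0.0625.
c = [0.5, 0.8, 1.5, 2.5, 4.0]
k = [1.1, 2.4, 5.3, 7.6, 8.9]

# Transform: 1/k = (c_s / k_max) * (1 / c^2) + 1 / k_max.
inv_c2 = [1.0 / ci ** 2 for ci in c]
inv_k = [1.0 / ki for ki in k]
a0, a1 = linear_fit(inv_c2, inv_k)

# Back-transform the intercept and slope to the model parameters.
k_max = 1.0 / a0
c_s = k_max * a1

def growth_rate(conc):
    """Evaluate the fitted model k = k_max * c^2 / (c_s + c^2)."""
    return k_max * conc ** 2 / (c_s + conc ** 2)

print(f"a1 = {a1:.6f}, a0 = {a0:.6f}")
print(f"k_max = {k_max:.5f}, c_s = {c_s:.6f}")
print(f"k at c = 2: {growth_rate(2.0):.3f}")
```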