
EXTRAPOLATION OF MONOTONOUS FUNCTIONS USING THE LINEAR CORRELATION COEFFICIENT

C. DOCA and L. DOCA

ABSTRACT

The paper presents a possible method of extrapolation / prognosis for monotonic functions based on the linear correlation coefficient. The method does not require knowledge of the analytical (explicit) expression of the function that generated the initial values, and the computing formulae proposed by the authors can be implemented in computer programs for the automatic processing and interpretation of experimental data or of data taken from various computing tables.
Key words: prognosis, extrapolation, linear correlation coefficient

Introduction
Given the real, continuous and twice differentiable functions $y = f(t; a, b, c, \ldots)$, of variable $t$ and parameters $a, b, c, \ldots$, describing the evolution in time of a phenomenon (physical, economic, social etc.), one ascertains [1], in relation to prognosis problems, that $y' = df(t; a, b, c, \ldots)/dt$ offers information regarding the short-term tendency, while $y'' = d^2 f(t; a, b, c, \ldots)/dt^2$ deals with the long-term tendency.
Knowing the analytical expression $f(t; a, b, c, \ldots)$ and with the help of a set of measurements $(t_i, y_i)$, $i = 1, 2, \ldots, N$, the most probable values of the parameters $a, b, c, \ldots$ are determined, for instance, by the least-squares method [2], solving the system of equations:
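$$\frac{\partial S(a,b,c,\ldots)}{\partial a} = 0; \quad \frac{\partial S(a,b,c,\ldots)}{\partial b} = 0; \quad \frac{\partial S(a,b,c,\ldots)}{\partial c} = 0 \tag{1}$$

where:

$$S(a,b,c,\ldots) = \sum_{i=1}^{N} \left[ y_i - f(t_i; a,b,c,\ldots) \right]^2 \tag{2}$$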
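As a minimal illustration of system (1), assume the simplest model $f(t; a, b) = a t + b$. This model is an assumption made here for illustration only; the paper does not fix the form of $f$. Setting the two partial derivatives of (2) to zero gives a pair of linear equations with a closed-form solution. The function name `fit_line` is hypothetical.

```python
def fit_line(t, y):
    """Least-squares estimates (a, b) for the assumed model f(t) = a*t + b.

    Solves dS/da = 0 and dS/db = 0 for S = sum((y_i - a*t_i - b)**2).
    """
    n = len(t)
    st, sy = sum(t), sum(y)
    stt = sum(ti * ti for ti in t)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    det = n * stt - st * st  # zero only if all t_i coincide
    if det == 0:
        raise ValueError("all t values are identical")
    a = (n * sty - st * sy) / det
    b = (sy - a * st) / n
    return a, b
```

For a non-linear $f$ the same system generally has no closed form and has to be solved numerically.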

With $f(t; a, b, c, \ldots)$ thus determined, in the case of extrapolation to the moment $t = t^*$, one can define:
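- short-term prognosis, if:

$$\begin{cases} t^* - t_N \le \frac{1}{3}(t_N - t_1); & \text{right extrapolation} \\ t_1 - t^* \le \frac{1}{3}(t_N - t_1); & \text{left extrapolation} \end{cases} \tag{3}$$

- medium-term prognosis, if:

$$\begin{cases} \frac{1}{3}(t_N - t_1) < t^* - t_N \le (t_N - t_1); & \text{right extrapolation} \\ \frac{1}{3}(t_N - t_1) < t_1 - t^* \le (t_N - t_1); & \text{left extrapolation} \end{cases} \tag{4}$$

- long-term prognosis, if:

$$\begin{cases} t^* - t_N > (t_N - t_1); & \text{right extrapolation} \\ t_1 - t^* > (t_N - t_1); & \text{left extrapolation} \end{cases} \tag{5}$$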

assuming as evaluation errors [1]: 15% for the short-term prognosis, 25% for the medium-term
prognosis and 50% for the long-term prognosis, respectively.
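The classification in (3)-(5) can be sketched as a small helper. The function name `prognosis_horizon` and its return format are hypothetical; the thresholds are the ones stated above.

```python
def prognosis_horizon(t, t_star):
    """Classify the extrapolation point t_star relative to the samples t.

    Returns (term, side), with term in {"short", "medium", "long"} and
    side in {"right", "left"}, following conditions (3)-(5).
    """
    t1, tN = min(t), max(t)
    span = tN - t1
    if t_star > tN:
        side, dist = "right", t_star - tN
    elif t_star < t1:
        side, dist = "left", t1 - t_star
    else:
        raise ValueError("t_star lies inside [t1, tN]: this is interpolation")
    if dist <= span / 3:
        term = "short"
    elif dist <= span:
        term = "medium"
    else:
        term = "long"
    return term, side
```

With samples at $t = 1, 2, \ldots, 10$, the point $t^* = 11$ is a short-term right extrapolation, since $11 - 10 \le (10 - 1)/3$.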
If the analytical expression of the function $f(t; a, b, c, \ldots)$ is not known, then, based on the same set of measured values $(t_i, y_i)$, $i = 1, 2, \ldots, N$, one can build interpolation polynomials (Newton, Lagrange, Hermite etc.) that would allow, again, the evaluation $y^* = f(t^*; a, b, c, \ldots)$.

In both these cases one solves systems of equations, possibly non-linear, requiring in most situations the development and implementation of appropriate computer programs.

Extrapolation Using the Linear Correlation Coefficient


It is known that for the set of N measurements $(t_i, y_i)$, $i = 1, 2, \ldots, N$, the linear correlation coefficient [1]:

$$R_{y,t,N} = \frac{N \sum_{i=1}^{N} t_i y_i - \sum_{i=1}^{N} t_i \sum_{i=1}^{N} y_i}{\sqrt{\left[ N \sum_{i=1}^{N} t_i^2 - \left( \sum_{i=1}^{N} t_i \right)^2 \right] \left[ N \sum_{i=1}^{N} y_i^2 - \left( \sum_{i=1}^{N} y_i \right)^2 \right]}}; \quad R_{y,t,N} \in [-1, 1] \tag{6}$$

is a measure of the deviation of the function $y = f(t)$ from a straight line. It is obvious that a new value $y_{N+1}$, obtained at the moment $t^* = t_{N+1}$, will lead to the amplification or diminution of this deviation if $R_{y,t,N+1} < R_{y,t,N}$ or $R_{y,t,N+1} > R_{y,t,N}$, respectively.
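Formula (6) translates directly into code. The sketch below uses only plain sums; the function name `linear_correlation` is hypothetical, and the inputs are assumed to be equal-length sequences in which neither $t$ nor $y$ is constant.

```python
import math

def linear_correlation(t, y):
    """Linear correlation coefficient R_{y,t,N} computed as in eq. (6)."""
    n = len(t)
    st, sy = sum(t), sum(y)
    stt = sum(ti * ti for ti in t)
    syy = sum(yi * yi for yi in y)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    num = n * sty - st * sy
    den = math.sqrt((n * stt - st * st) * (n * syy - sy * sy))
    return num / den
```

For exactly linear data the function returns 1 (or -1 for a decreasing line) up to rounding, while for $y = t^2$ sampled at $t = 1, \ldots, 10$ it returns a value slightly below 1.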
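Since:

$$R_{y,t,N+1} = \frac{(N+1) \sum_{i=1}^{N+1} t_i y_i - \sum_{i=1}^{N+1} t_i \sum_{i=1}^{N+1} y_i}{\sqrt{\left[ (N+1) \sum_{i=1}^{N+1} t_i^2 - \left( \sum_{i=1}^{N+1} t_i \right)^2 \right] \left[ (N+1) \sum_{i=1}^{N+1} y_i^2 - \left( \sum_{i=1}^{N+1} y_i \right)^2 \right]}} \tag{7}$$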

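can be re-written as:

$$R_{y,t,N+1} = \frac{(N+1) \left( \sum_{i=1}^{N} t_i y_i + t_{N+1} y_{N+1} \right) - \sum_{i=1}^{N+1} t_i \left( \sum_{i=1}^{N} y_i + y_{N+1} \right)}{\sqrt{\left[ (N+1) \sum_{i=1}^{N+1} t_i^2 - \left( \sum_{i=1}^{N+1} t_i \right)^2 \right] \left[ (N+1) \left( \sum_{i=1}^{N} y_i^2 + y_{N+1}^2 \right) - \left( \sum_{i=1}^{N} y_i + y_{N+1} \right)^2 \right]}} \tag{8}$$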

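then, with the notations:

$$A = N t_{N+1} - \sum_{i=1}^{N} t_i \tag{9}$$

$$B = (N+1) \sum_{i=1}^{N} t_i y_i - \sum_{i=1}^{N+1} t_i \sum_{i=1}^{N} y_i \tag{10}$$

$$C = N \left[ (N+1) \sum_{i=1}^{N+1} t_i^2 - \left( \sum_{i=1}^{N+1} t_i \right)^2 \right] \tag{11}$$

$$D = -2 \sum_{i=1}^{N} y_i \left[ (N+1) \sum_{i=1}^{N+1} t_i^2 - \left( \sum_{i=1}^{N+1} t_i \right)^2 \right] \tag{12}$$

$$E = \left[ (N+1) \sum_{i=1}^{N} y_i^2 - \left( \sum_{i=1}^{N} y_i \right)^2 \right] \left[ (N+1) \sum_{i=1}^{N+1} t_i^2 - \left( \sum_{i=1}^{N+1} t_i \right)^2 \right] \tag{13}$$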

equation (8) becomes:

$$R_{y,t,N+1} = \frac{A\, y_{N+1} + B}{\sqrt{C\, y_{N+1}^2 + D\, y_{N+1} + E}} \tag{14}$$

or, finally:

$$\left( C R_{y,t,N+1}^2 - A^2 \right) y_{N+1}^2 + \left( D R_{y,t,N+1}^2 - 2AB \right) y_{N+1} + E R_{y,t,N+1}^2 - B^2 = 0 \tag{15}$$
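The coefficients (9)-(13) depend only on the $N$ known points and on $t_{N+1}$, so they can be computed once and reused for any candidate $y_{N+1}$. The sketch below does this and evaluates (14); the function names `coefficients` and `r_next` are hypothetical.

```python
import math

def coefficients(t, y, t_next):
    """Coefficients A, B, C, D, E of eqs. (9)-(13) for the new abscissa t_next."""
    n = len(t)
    st, sy = sum(t), sum(y)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    syy = sum(yi * yi for yi in y)
    st1 = st + t_next                            # sum of t_i, i = 1..N+1
    stt1 = sum(ti * ti for ti in t) + t_next**2  # sum of t_i^2, i = 1..N+1
    spread_t = (n + 1) * stt1 - st1**2           # bracket shared by C, D and E
    A = n * t_next - st
    B = (n + 1) * sty - st1 * sy
    C = n * spread_t
    D = -2 * sy * spread_t
    E = ((n + 1) * syy - sy**2) * spread_t
    return A, B, C, D, E

def r_next(coeffs, y_next):
    """R_{y,t,N+1} as a function of the candidate value y_next, eq. (14)."""
    A, B, C, D, E = coeffs
    return (A * y_next + B) / math.sqrt(C * y_next**2 + D * y_next + E)
```

For $y = t^2$ sampled at $t = 1, \ldots, 10$ and $t_{N+1} = 11$, the coefficients are $A = 55$, $B = 7865$, $C = 12100$, $D = -931700$, $E = 157829980$, and `r_next` evaluated at the true value 121 agrees with formula (7) applied to all eleven points.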

In order to obtain immediate information within the limits of the acceptable evaluation errors for prognosis problems, equality (15) can be interpreted as a quadratic equation in the unknown $y_{N+1}$ at the point $t = t_{N+1}$. For this, it is sufficient to accept, in a first approximation, the equality $R_{y,t,N+1} = R_{y,t,N}$ and to check the condition for the existence of real solutions, i.e.:

$$\Delta = \left( D R_{y,t,N}^2 - 2AB \right)^2 - 4 \left( C R_{y,t,N}^2 - A^2 \right) \left( E R_{y,t,N}^2 - B^2 \right) \ge 0 \tag{16}$$

Between the two solutions:

$$y_{N+1} = \frac{-\left( D R_{y,t,N}^2 - 2AB \right) \pm \sqrt{\Delta}}{2 \left( C R_{y,t,N}^2 - A^2 \right)} \tag{17}$$

one will choose the minimum or maximum value, depending on the more or less monotonic evolution of the previous observations $y_i = f(t_i)$, $i = 1, 2, \ldots, N$.
For monotonically increasing functions one must have:

$$A\, y_{N+1} + B \ge 0 \tag{18}$$

and for monotonically decreasing functions:

$$A\, y_{N+1} + B \le 0 \tag{19}$$
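The procedure of (15)-(19) can be sketched end to end as follows. This is an illustration under stated assumptions, not the authors' program: the function name `extrapolate_roots` is hypothetical, the direction of monotonicity is guessed from the first and last observation, and the final choice between the remaining roots (minimum or maximum) is left to the caller, as in the text.

```python
import math

def extrapolate_roots(t, y, t_next):
    """Real roots of eq. (15) with R_{N+1} = R_N that satisfy (18) or (19).

    Returns the admissible roots in ascending order.
    """
    n = len(t)
    st, sy = sum(t), sum(y)
    stt = sum(ti * ti for ti in t)
    syy = sum(yi * yi for yi in y)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    # Square of R_{y,t,N}, from eq. (6).
    r2 = (n * sty - st * sy) ** 2 / ((n * stt - st * st) * (n * syy - sy * sy))
    # Coefficients (9)-(13).
    st1 = st + t_next
    spread_t = (n + 1) * (stt + t_next * t_next) - st1 * st1
    A = n * t_next - st
    B = (n + 1) * sty - st1 * sy
    C = n * spread_t
    D = -2 * sy * spread_t
    E = ((n + 1) * syy - sy * sy) * spread_t
    # Quadratic (15): p*y^2 + q*y + s = 0, with discriminant (16).
    p = C * r2 - A * A
    q = D * r2 - 2 * A * B
    s = E * r2 - B * B
    delta = q * q - 4 * p * s
    if p == 0 or delta < 0:
        raise ValueError("eq. (15) has no pair of real solutions")
    root = math.sqrt(delta)
    roots = sorted([(-q - root) / (2 * p), (-q + root) / (2 * p)])
    # Assumption: monotonic direction inferred from the end points.
    if y[-1] >= y[0]:
        return [v for v in roots if A * v + B >= 0]   # condition (18)
    return [v for v in roots if A * v + B <= 0]       # condition (19)
```

For $y = t^2$ sampled at $t = 1, \ldots, 10$ and $t_{N+1} = 11$, both roots satisfy (18); the larger one is approximately 120.125, the value listed for this case in Table 1.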

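Because of the estimation error $dR$ of the linear correlation coefficient $R_{y,t,N+1}$, taking the logarithm of relation (17) and differentiating gives the formula for the relative error of the value $y_{N+1}$:

$$\frac{dy_{N+1}}{y_{N+1}} = \frac{d\left( -D R_{y,t,N}^2 + 2AB \pm \sqrt{\Delta} \right)}{-D R_{y,t,N}^2 + 2AB \pm \sqrt{\Delta}} - \frac{d\left( C R_{y,t,N}^2 - A^2 \right)}{C R_{y,t,N}^2 - A^2} \tag{20}$$

Making the implied calculation, the result is:

$$\frac{dy_{N+1}}{y_{N+1}} = 2 \left\{ \frac{-D}{-D R_{y,t,N}^2 + 2AB \pm \sqrt{\Delta}} - \frac{C}{C R_{y,t,N}^2 - A^2} \pm \frac{\left( D^2 - 4CE \right) R_{y,t,N}^2 + 2 \left( A^2 E + B^2 C - ABD \right)}{\sqrt{\Delta} \left( -D R_{y,t,N}^2 + 2AB \pm \sqrt{\Delta} \right)} \right\} R_{y,t,N}\, dR \tag{21}$$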

Starting from the fact that, in the case of a linear dependency $y = f(t)$, we have $|R_{y,t,N}| = 1$, and for a non-linear monotonic function $R_{y,t,N} \neq 0$, in a first approximation, for the numerical evaluation of the relative error (21) the value:

$$dR \approx R_{y,t,N} - R_{y,t,N-1} \tag{22}$$

can be admitted.
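A minimal sketch of the error estimate, assuming that (21) is the logarithmic derivative of the root (17) with respect to $R_{y,t,N}$, multiplied by the $dR$ of (22). The function names `root_17` and `relative_error`, and the `sign` argument that selects the root (+1 or -1), are hypothetical. The coefficients are passed in as a tuple `(A, B, C, D, E)` computed from (9)-(13).

```python
import math

def root_17(coeffs, r, sign=1):
    """Root of eq. (17) for correlation value r; sign selects +sqrt or -sqrt."""
    A, B, C, D, E = coeffs
    r2 = r * r
    delta = (D * r2 - 2 * A * B) ** 2 - 4 * (C * r2 - A * A) * (E * r2 - B * B)
    return (-D * r2 + 2 * A * B + sign * math.sqrt(delta)) / (2 * (C * r2 - A * A))

def relative_error(coeffs, r_n, r_prev, sign=1):
    """dy/y from the logarithmic derivative of (17), with dR = r_n - r_prev."""
    A, B, C, D, E = coeffs
    r2 = r_n * r_n
    delta = (D * r2 - 2 * A * B) ** 2 - 4 * (C * r2 - A * A) * (E * r2 - B * B)
    sq = math.sqrt(delta)
    top = -D * r2 + 2 * A * B + sign * sq
    # Half of d(delta)/dR divided by 2R; appears when differentiating sqrt(delta).
    x = (D * D - 4 * C * E) * r2 + 2 * (A * A * E + B * B * C - A * B * D)
    dlny_dr = 2 * r_n * ((-D + sign * x / sq) / top - C / (C * r2 - A * A))
    return dlny_dr * (r_n - r_prev)
```

A useful sanity check on any implementation of this kind is to compare the analytic derivative with a finite-difference derivative of `math.log(root_17(coeffs, r))` around the working value of `r`.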

Numerical Verifications
In order to check and support the above assertions, Table 1 presents the results obtained by extrapolation by means of the linear correlation coefficient for several monotonically increasing, strongly non-linear functions.
Table 1

f(t)      t_i, i = 1, 2, ..., N   t_{N+1}   f(t_{N+1})      y_{N+1}          [y_{N+1} - f(t_{N+1})] / f(t_{N+1})   dy_{N+1} / y_{N+1}
--------------------------------------------------------------------------------------------------------------------------------------
t^2       1, 2, ..., 10           11        121             120.125          -0.723 %                              -0.960 %
          1, 2, ..., 100          101       10201           10196.473        -0.044 %                              -0.060 %
          1, 2, ..., 1000         1001      1002001         1001958.982      -0.004 %                              -0.005 %
t^3       1, 2, ..., 10           11        1331            1317.899         -0.984 %                              -1.413 %
          1, 2, ..., 100          101       1030301         1029826.664      -0.046 %                              -0.078 %
          1, 2, ..., 1000         1001      1003003001      1002960781.302   -0.004 %                              -0.007 %
exp(t)    1, 2, ..., 10           11        59874.141       50178.334        -16.193 %                             -21.847 %
          1, 2, ..., 100          101       7.307·10^43     7.154·10^43      -2.092 %                              -20.267 %
          1, 2, ..., 1000         1001      5.355·10^434    5.343·10^434     -0.214 %                              -20.026 %
ln(t)     1, 2, ..., 10           11        2.397895        2.445906         2.001 %                               2.428 %
          1, 2, ..., 100          101       4.615121        4.634768         0.425 %                               0.343 %
          1, 2, ..., 1000         1001      6.908755        6.914453         0.082 %                               0.065 %

Conclusions
The assessment of these data shows that formulae (17) and (21), which were used to obtain the numerical values of $y_{N+1}$ and $dy_{N+1}/y_{N+1}$ respectively, lead to results that are acceptable with respect to the errors assumed in short-term extrapolation and prognosis problems. The benefit of the method presented in the paper is that it does not require knowledge / determination of the analytical (specific) form of the function $f(t)$.
At least from this last perspective one may conclude that the extrapolation and prognosis method based on the linear correlation coefficient can be used, with a more or less reliable confidence factor, in computer programs for the automatic processing and interpretation of experimental data or of data taken from various computing tables.

References
[1] Rafiroiu, M., Simulation models in constructing (in Romanian), Editura Facla, Timișoara, 1982
[2] Constantinescu, I., Experimental data analysis using numerical computers (in Romanian), Editura Tehnică, București, 1980
