
Novel Time Series Analysis and Prediction of Stock Trading Using Fractal Theory and Time Delayed Neural Network*
Fuminori Yakuwa, Yasuhiko Dote
Department of Computer Science & Systems Engineering
Muroran Institute of Technology
27-1, Mizumoto-cho, Muroran 050, JAPAN
Phone: +81-143-46-5432; FAX: +81-143-46-5499

Mika Yoneyama
Hokkaido Electric Power Co., Inc., Kushiro Branch
8-1, Saiwai-cho, Kushiro 050, JAPAN
Phone: +81-154-23-1114; FAX: +81-154-23-2220

Shinji Uzurabashi
Panasonic Mobile & System Engineering Co., Ltd.
series using a Hurst exponent, a fractal analysis method, and an autocorrelation analysis method. In order to extract the knowledge, decision-making rules comprehensible by humans are derived from the features with rough set theory [26]. Finally, the knowledge is embedded into the structure of the Time Delayed Neural Network (TDNN). Accurate prediction is obtained.

Abstract - Stock markets are well known for wide variations in prices over short and long terms. These fluctuations are due to a large number of deals produced by agents that act independently of each other. However, even in the middle of this apparently chaotic world, there are opportunities for making good predictions [1].

This paper is organized as follows. In Section 2 time series analysis using fractal analysis is described. Section 3 illustrates the structure of neural networks for time series. Section 4 describes short-term prediction using TDNN. Some conclusions are drawn in Section 5.

In this paper the Nikkei stock prices over 1500 days from July 1996 to Oct. 2002 are analyzed and predicted using a Hurst exponent (H), a fractal dimension (D), and an autocorrelation coefficient (C). They are H = 0.6699, D = 2 - H = 1.3301, and C = 0.26558 over three days. This obtained knowledge is embedded into the structure of our developed time delayed neural network [2]. It is confirmed that the obtained prediction accuracy is much higher than that by a back propagation-type feedforward neural network for the short term.
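The three constants are mutually consistent: D follows from H, and C is the increment correlation 2^(2H-1) - 1 given later as Eq. (10). A minimal Python check, with the numbers taken from the text:

```python
H = 0.6699                 # Hurst exponent estimated from the Nikkei series
D = 2 - H                  # fractal dimension of the price graph
C = 2 ** (2 * H - 1) - 1   # correlation of successive increments (Eq. 10)

print(round(D, 4))  # 1.3301
print(round(C, 5))  # 0.26558
```

The reported value C = 0.26558 is exactly what the formula gives for H = 0.6699.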

2 Time Series Analysis Using Fractal Analysis

2.1 Fractal

Fractal analysis provides a unique insight into a wide range of natural phenomena. Fractal objects are those which exhibit self-similarity. This means that the general shape of the object is repeated at arbitrarily smaller and smaller scales. Coastlines have this property: a particular coastline viewed on a world map has the same character as a small piece of it seen on a local map. New details appear at each smaller scale, so that the coastline always appears rough. Although true fractals repeat the detail to a vanishingly small scale, examples in nature are self-similar only down to some non-zero limit. The fractal dimension measures how much complexity is being repeated at each scale. A shape with a higher fractal dimension is more complicated or rough than one with a lower dimension, and fills more space. These dimensions are fractional: a shape with fractal dimension D = 1.2, for example, fills

Although this predictor works for the short term, it is embedded into our developed fuzzy neural network [3] to construct multi-blended local nonlinear models. It is applied to general long-term prediction, for which more accurate prediction is expected than by the method proposed in [1].

1 Introduction

The Nikkei Average Stock prices over 1500 days are in the middle of the apparently chaotic world. In this paper, on the basis of Zadeh's proposal "From Manipulation of Measurements to Manipulation of Perceptions - Computing with Words" [25], which is a data mining technology, knowledge easily comprehensible by humans is extracted by obtaining the features of the time
0-7803-7952-7/03/$17.00 © 2003 IEEE.


more space than a one-dimensional curve, but less space than a two-dimensional area. The fractal dimension thus conveys much information about the geometry of an object. Very realistic computer images of mountains, clouds and plants can be produced by simple recursions with the appropriate fractal dimension. Time series of many natural phenomena are fractal. Small sections taken from these series, once scaled by the appropriate factor, cannot be distinguished from the whole signal. Being able to recognize a time series as fractal means being able to link information at different time scales. We call such sets 'self-affine' instead of self-similar because they scale by different amounts in each axis direction.

2.3 R/S method

The rescaled range analysis, also called the R/S or Hurst method, was invented by Hurst for the evaluation of time-dependent hydrological data [8][9]. His original work is related to water reservoirs and the design of an ideal storage on the river Nile. After the detailed discussion of this work by Mandelbrot [10][11], the method has attracted much attention in many fields of science. For the mathematical aspects of the method we refer to the papers of Mandelbrot [10][11], Feder [12], and Daley [13]. Since its earliest days the method has been used for a number of applications, whenever the question was the quantification of long-range statistical interdependence within time series. As examples we can cite the analysis of the asymmetry of solar activity [14][15], relaxation of stress [16], problems in particle physics [17], and mechanical sliding in solids [18]. The Hurst analysis is also used as a tool to determine the self-similarity parameter of fractal signals [20-23], or to detect unwanted correlations in pseudo-random number generators [23]. The Hurst exponent was calculated for corrosion noise data in the work of Moon and Skerry [24], where the corrosion resistance properties of organic paint films were analyzed and a direct relationship between the Hurst exponent and the corrosion resistance of different coatings was established. Greisiger and Schauer [7] discussed the applicability of different methods to electrochemical potential and current noise analysis. They concluded that the Hurst exponent allows the extraction of mechanistic information about corrosion processes, and hence is suitable for characterizing coatings.

There are many methods available for estimating the fractal dimension of data sets. These lead to different numerical results, yet little comparison of accuracy has been made among them in the literature. We combine the two methods best known for assigning fractal dimensions to time series: the box-counting method and rescaled range analysis.

2.2 Box-counting

The box-counting algorithm is intuitive and easy to apply. It can be applied to sets in any dimension, and has been used on images of everything from river systems to clusters of galaxies. A fractal curve is a curve of infinite detail, by virtue of its self-similarity. The length of the curve is indefinite, increasing as the resolution of the measuring instrument increases. The fractal dimension determines the increase in detail, and therefore length, at each resolution change. For a fractal, the length $L$ as a function of the resolution of the measurement device, $\delta$, is:

We give a brief introduction to the R/S method, along the lines of Feder's work [12]. Let the time coordinate, $t$, be discretized in terms of the time resolution, $\Delta t$, as $i = t/\Delta t$. The discrete time record of a given process is denoted by $x_i$, $i = 0, 1, \ldots, N$, if the total duration of the observation is $T = N\Delta t$. According to the basic idea of the R/S method, the time record is evaluated for a certain time interval, called the time lag, the length of which is $\tau = j\Delta t$ and which begins at $t_0 = l_0 \Delta t$. Obviously, $j < N$ and $l_0 < N$ hold. The average of $x_i$ over the time lag is calculated as

$L(\delta) \propto \delta^{1-D}$    (1)

where $D$ is an exponent known as the fractal dimension. (For ordinary curves $L(\delta)$ approaches a constant value as $\delta$ decreases.) Box-counting algorithms measure $L(\delta)$ for varying $\delta$ by counting the number of non-overlapping boxes of size $\delta$ required to cover the curve. These measurements are fitted to Eq. (1) to obtain an estimate of the fractal dimension, known as the box dimension. A fractal dimension can be assigned to a set of time series data by plotting it as a function of time and calculating the box dimension. Eq. (1) will hold over a finite range of box sizes; the smallest boxes will be of width $r$, where $r$ is the resolution in time, and height $a$, where $a$ is the resolution of the magnitude of the time series.
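A sketch of the box-counting estimate in Python; it uses the equivalent occupancy form $N(\delta) \propto \delta^{-D}$ rather than the length form of Eq. (1), and the grid scales are illustrative choices:

```python
import numpy as np

def box_count_dimension(x, scales=(1, 2, 4, 8, 16)):
    """Estimate the box dimension of the graph of a time series.

    The series is normalized so time and amplitude both span [0, 1];
    for each grid of s x s cells the number of occupied cells is
    counted, and the slope of log N versus log s is returned.
    """
    x = np.asarray(x, dtype=float)
    t = np.linspace(0.0, 1.0, len(x))
    y = (x - x.min()) / (x.max() - x.min() + 1e-12)
    counts, grid_sizes = [], []
    for s in scales:
        # map each sample point to an integer grid cell of side 1/s
        cells = set(zip(np.minimum((t * s).astype(int), s - 1),
                        np.minimum((y * s).astype(int), s - 1)))
        counts.append(len(cells))
        grid_sizes.append(s)
    # N(delta) ~ delta^(-D)  =>  log N vs log s has slope D
    slope, _ = np.polyfit(np.log(grid_sizes), np.log(counts), 1)
    return slope
```

For a straight line the estimate comes out close to 1, as expected for an ordinary curve.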

$\langle x \rangle_{\tau,l_0} = \frac{1}{j} \sum_{i=l_0+1}^{l_0+j} x_i$    (3)

Next the accumulated deviation from the mean, $y_{k,l_0}$, is evaluated as

$y_{k,l_0} = \sum_{i=l_0+1}^{l_0+k} \left( x_i - \langle x \rangle_{\tau,l_0} \right)$    (4)

where $k$ takes the values $1 \le k \le j$.
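Eqs. (3)-(4), together with the range and standard deviation defined below, can be sketched for a single position of the time lag as follows (the function name and the use of the population standard deviation for $S$ are my choices):

```python
import numpy as np

def rescaled_range(x):
    """R/S statistic for one time-lag window x = (x_1, ..., x_j).

    Mean over the lag (Eq. 3), accumulated deviation y_k (Eq. 4),
    range R = max(y) - min(y), and S the standard deviation
    over the same window.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()              # Eq. (3)
    y = np.cumsum(x - mean)      # Eq. (4), k = 1..j
    R = y.max() - y.min()        # the range
    S = x.std()                  # standard deviation over the lag
    return R / S
```

For x = (1, 2, 3, 4) the accumulated deviations are (-1.5, -2, -1.5, 0), so R = 2 and S = sqrt(1.25), giving R/S ≈ 1.7889.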


In order to visualize the meaning of Eq. (4), let us refer to the hydrological context in which the method was devised by Hurst. Here $x_i$ is the annual water input into a reservoir in the $i$th year of a series of $N$ years, and $y_{k,l_0}$ is the net gain or loss of stored water in the year $l_0 + k$, i.e. some time within the time lag in question. That is, the annual increment is the object of analysis. The ideal reservoir never empties and never overflows, so the required storage capacity is equal to the difference between the maximum and minimum value of $y_{k,l_0}$ over $j$. This difference is called the range, $R_{\tau,l_0}$:

$R_{\tau,l_0} = \max_{1 \le k \le j} y_{k,l_0} - \min_{1 \le k \le j} y_{k,l_0}$    (5)

For a sequence of uncorrelated random events with Gaussian distribution and zero mean, the Hurst exponent is $1/2$. For $1/2 < H < 1$ the time series is called persistent, i.e. an increasing trend in the past implies, on average, a continued increasing trend in the future, and a decreasing trend in the past implies a decrease in the future. If $0 < H < 1/2$ prevails, the time series observed is anti-persistent, i.e. an increasing trend in the past implies a decreasing trend in the future and vice versa. Persistency is also found in cases where the time series exhibits clear trends with relatively little noise [11][14][22].

2.4 Interpretation of fractal dimension

We have already mentioned that the fractal dimension of an object is a measure of complexity and degree of space filling. When the object is a series in time, the dimension also tells us something about the relation between increments. It is a useful and meaningful insight into series of natural processes.

The standard deviation of $x_i$ over the same period, $\tau$, is given as

$S_{\tau,l_0} = \left[ \frac{1}{j} \sum_{i=l_0+1}^{l_0+j} \left( x_i - \langle x \rangle_{\tau,l_0} \right)^2 \right]^{1/2}$    (6)


and the quotient $R_{\tau,l_0}/S_{\tau,l_0}$ is called the rescaled range. The above expressions refer to a given position of the time lag on the time axis. However, the time lag can be shifted, and the procedure given by Eqs. (3)-(6) can be repeated for each position. Thus a series of rescaled ranges is obtained, the average of which can be evaluated. As a non-unique but rational choice the lag is shifted in steps of $j$, so that a series of non-overlapping but contacting intervals is constructed. In other words, a series of $R_{j,l}/S_{j,l}$ is evaluated with $j$ fixed and $l$ varied as

$l = l_0 + mj$, where $m = 1, 2, \ldots, [N/j]$

$\mathrm{R/S} = \frac{1}{[N/j]} \sum_{m=1}^{[N/j]} R_{j,\,l_0+mj} / S_{j,\,l_0+mj}$    (7)

addition of all past increments. The function $X(t)$ is a self-affine fractal whose graph has dimension 1.5.

Ordinary Brownian motion can be defined by

$X(t) - X(t_0) = \xi\,|t - t_0|^{H}$, with $H = 1/2$    (9)

where $\xi$ is a normalized independent Gaussian process and $X(t_0)$ is the initial position [4][5]. Fractional Brownian motion (fBm) generalizes $X(t)$ by allowing the increments to be correlated. Replacing the exponent $H = 1/2$ in Eq. (9) with any other number in the range $0 < H < 1$ defines an fBm function $X_H(t)$. The exponent $H$ here corresponds to the statistic $H$ that R/S analysis calculates.

Hurst observed that there is a great number of natural phenomena for which the ratio R/S obeys the rule

$\mathrm{R/S} \propto \tau^{H}$    (8)

2.5 Fractional Brownian motion

A particle undergoing Brownian motion moves by jumping step-lengths which are given by independent Gaussian random variables. For one-dimensional motion the position of the particle in time, $X(t)$, is given by the

bracket denoting integer part. Then the rescaled range for the time lag $\tau$ is calculated as the average of $R_{j,l}/S_{j,l}$ over all positions of the lag.

The correlation function of future increments with past increments for the motion $X_H(t)$ can be shown to be [5]:

where $H$ is called the Hurst exponent. The Hurst exponent lies between 0 and 1. The value $H = 1/2$ has a special significance, because it reflects that the observations are statistically independent of each other. This is the random-noise case. For example, the increment series, i.e. the series of displacements, in Brownian motion is a sequence of uncorrelated random events.

$C(t) = 2^{2H-1} - 1$    (10)

Clearly, $C(t) = 0$ for $H = 1/2$; increments in ordinary Brownian motion are independent. For $H > 1/2$, $C(t)$ is positive for all $t$. This means that after a positive

increment, future increments are more likely to be positive. This is known as persistence. When $H < 1/2$, increments are negatively correlated, which means an increase in the past makes a decrease more likely in the future. This is called anti-persistence. Now it is true for self-affine functions such as $X_H(t)$ that the fractal dimension, $D$, is related to $H$ by [4]:

$D = 2 - H$    (11)

We can then identify persistence or anti-persistence


in data sets whose graphs are fractal. Persistent time series
show long-term memory effects. An increasing trend in
the past is likely to continue in the future because future
increments are positively correlated to past ones.
Similarly, a negative trend will persist. This means that
extreme values in the series tend to be more extreme than
for uncorrelated series. In the context of climatic data,
droughts or extended rain periods are more likely for
persistent data.
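This persistence/anti-persistence behaviour can be reproduced numerically. The sketch below samples fractional Gaussian noise (the increment process of fBm) from its exact autocovariance via a Cholesky factor; this is a standard textbook construction, not a method used in the paper, and the parameters are illustrative:

```python
import numpy as np

def fgn(n, H, seed=0):
    """Sample n points of fractional Gaussian noise with Hurst exponent H.

    Uses the exact autocovariance of fBm increments,
    gamma(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)),
    and a Cholesky factor of the resulting Toeplitz covariance matrix.
    """
    k = np.arange(n)
    gamma = 0.5 * (np.abs(k + 1) ** (2 * H)
                   - 2 * np.abs(k) ** (2 * H)
                   + np.abs(k - 1) ** (2 * H))
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(n))
    rng = np.random.default_rng(seed)
    return L @ rng.standard_normal(n)

# persistent noise (H > 1/2): positively correlated increments
x = fgn(800, H=0.8)
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]   # theory: 2**(2H-1) - 1
```

With H = 0.8 the sample lag-1 correlation comes out clearly positive, and with H = 0.3 clearly negative, matching Eq. (10).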


In order to determine the Hurst exponent, $\log(\mathrm{R/S})$ is plotted against $\log \tau$ and the slope renders $H$. However, not all the points of this plot have the same statistical weight: when $\tau$ is very small, a large number of R/S data can be calculated but their scatter is large; when $\tau$ is very large, only a few R/S data are at hand, so the statistics are poor. For this reason the first and the last few points of the double-logarithmic plot are usually discarded.
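The whole estimation procedure, non-overlapping windows, an averaged R/S per lag, and a double-logarithmic fit, can be sketched as follows (the window sizes are illustrative; for real use the extreme points would be trimmed as described above):

```python
import numpy as np

def hurst_rs(x, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate the Hurst exponent as the slope of log(R/S) vs log(tau).

    For each window size j the series is cut into [N/j] non-overlapping
    windows; R/S is computed in each window and averaged, and H is the
    slope of the double-logarithmic fit.
    """
    x = np.asarray(x, dtype=float)
    log_tau, log_rs = [], []
    for j in window_sizes:
        rs_vals = []
        for start in range(0, len(x) - j + 1, j):
            w = x[start:start + j]
            y = np.cumsum(w - w.mean())   # accumulated deviation, Eq. (4)
            R = y.max() - y.min()         # range, Eq. (5)
            S = w.std()                   # standard deviation, Eq. (6)
            if S > 0:
                rs_vals.append(R / S)
        log_tau.append(np.log(j))
        log_rs.append(np.log(np.mean(rs_vals)))
    H, _ = np.polyfit(log_tau, log_rs, 1)
    return H

rng = np.random.default_rng(1)
H_noise = hurst_rs(rng.standard_normal(4000))  # uncorrelated noise: H near 1/2
```

On uncorrelated Gaussian noise the estimate lands near 0.5 (small-sample R/S estimates are known to be biased slightly upward).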

Figure 2. Relationship of scaling interval (N) versus Hurst exponent (H)

To begin with, we verified whether the change of a stock price time series follows the random walk hypothesis, using rescaled range analysis. We analyzed the stock price time series of the Nikkei Stock Average. The analysis period used the data for 1500 days from July 1996 to October 2002. In the analysis, the logarithmic return was applied to the original analysis object.


As shown in Figure 2, H = 0.88, corresponding to N = 3, is the maximum. Therefore, it is found that data for 3 days show the strongest correlation in the fractal analysis of the Nikkei Stock Average prices. This knowledge is discovered using a feature-rule map with rough set theory [26], shown in Table 1. First a Hurst exponent is obtained. Then the fractal dimension and the autocorrelation coefficient are calculated from the Hurst exponent.

Table 1. Knowledge extraction with feature-rule map with rough set theory

Figure 1. Rescaled range analysis of Nikkei Stock Average time series

Figure 1 shows the result of analyzing the Nikkei Stock Average prices for 1500 days. From the gradient of the obtained straight line, the Hurst exponent is H = 0.6699.

Table 2. Pawlak's lower and upper approximation (classes: Brownian motion, N = 3, and Similarity (fractal); for each class the number of objects, the lower and upper approximations, and the approximation accuracy are tabulated; the individual cell values are not recoverable from the scan)

4 Short-term Prediction Using BPNN

We obtained the features (knowledge: N = 3) of the time series by the fractal analysis. Two kinds of 3-layer BP neural networks, which have a delay element between each input node, are considered. No filter is connected at any node in the first one. The second one has a 3rd-order FIR filter at each input node.


4.1 Simulation by 3-layer BPNN without filters

No filter is connected at each input node. The structure of the neural network is shown in Table 3.

Table 3. BP network structure

                  3 input nodes   5 input nodes
Hidden nodes            3               3
Output node             1               1
Epochs                 500             300

3 Structure of Neural Network for Time Series

4.1.1 BPNN simulation with 3 input nodes

The structure of the neural network is illustrated in Figure 4. The simulation result with 3 input nodes is shown in Figure 5. We predicted seven days from the 1501st day. The error and the number of epochs are given in Table 4.

In order to embed the discovered knowledge into the structure of neural networks, it is found that our developed time delayed neural network is suitable [2].

3.1 Time Delayed Neural Network (TDNN)

In order to handle dynamical systems, time delay elements representing the obtained knowledge are put into the inputs of neural networks [2]. The configuration of the FIR filter is shown in Figure 3. It is a finite impulse response (FIR) digital filter which is connected to each input of a back propagation type feedforward neural network (BPNN). A time delay element is also put between the inputs of the filter.

Table 4. BPNN without filters with 3 inputs

Error: 59.4803
Epochs: 201

4.1.2 BPNN simulation with 5 input nodes

In the same way, the structure of the neural network is illustrated in Figure 6. The simulation result with 5 input nodes is shown in Figure 7. The error and the number of epochs are listed in Table 5.

Table 5. BPNN without filters with 5 input nodes

Error: 304.9743
Epochs: 447

Figure 3. FIR filter ($z^{-1}$: delay element)



where $f$ is a sigmoid function and the weights are corrected by the back propagation (BP) algorithm.

Table 6. Comparison of the BPNNs without filters

          3 input nodes   5 input nodes
Error        59.4803        304.9743
Epochs         201             447

Figure 4. The structure of the BPNN without filters with 3 input nodes

Figure 8. The structure of the BPNN with filters with 3 input nodes

Figure 9. BPNN simulation with filters with 3 input nodes

Figure 5. BPNN simulation without filters with 3 input nodes

Figure 6. The structure of the BPNN without filters with 5 input nodes

Figure 7. BPNN simulation without filters with 5 input nodes

Figure 10. The structure of the BPNN with filters with 5 input nodes

Figure 11. BPNN simulation with filters with 5 input nodes

4.2 Simulation by the 3-layer BPNN with filters

Table 7 shows the structure of the 3-layer BPNN with filters.

Table 7. 3-layer BPNN with filters: network structure

                       3 input nodes   5 input nodes
3rd-order FIR filter     connected       connected
Hidden nodes                 3               3
Output node                  1               1
Epochs                      500             300

4.2.1 Simulation with 3 input nodes

The structure of the neural network is illustrated in Figure 8. The simulation result is shown in Figure 9. Table 8 lists the prediction error and the number of epochs.

Table 11. Comparison of both networks

                             Error      Epochs
Without filters, 3 inputs    59.4803      201
Without filters, 5 inputs   304.9743      447
With filters, 3 inputs       47.3381       37
With filters, 5 inputs       88.2962      154

5 Conclusion

A data mining technique is applied to time series analysis and prediction. From a large amount of data, understandable knowledge is extracted using a Hurst exponent, a fractal analysis method and an autocorrelation analysis method. Then it is embedded into the suitable network structure. Accurate prediction of the Nikkei Average Stock price time series is obtained by the BPNN with filters.

Table 8. Simulation with 3 input nodes

Error: 47.3381
Epochs: 37

References

[1] O. Castillo and P. Melin, "Hybrid Intelligent Systems for Time Series Prediction Using Neural Networks, Fuzzy Logic, and Fractal Theory," IEEE Transactions on Neural Networks, Vol. 13, no. 6, pp. 1395-1407, Nov. 2002.

[2] M. S. Shafique and Y. Dote, "An Empirical Study on Fault Diagnosis for Nonlinear Time Series Using Linear Regression Method and FIR Network," Trans. IEE of Japan, Vol. 120-C, no. 10, pp. 1435-1440, Oct. 2000.

Table 9. Simulation with 5 input nodes

Error: 88.2962
Epochs: 154

[3] F. Yakuwa, S. Satoh, M. S. Shaikh, and Y. Dote, "Fault Diagnosis for Dynamical Systems Using Soft Computing," Proceedings of the 2002 World Congress on Computational Intelligence, Honolulu, Hawaii, U.S.A., May 12-17, 2002.

Table 10 shows that with both 3 and 5 input nodes the prediction accuracy is fairly high.

[4] J. Feder, Fractals, Plenum Press, New York, 1988, p. 288.

Table 10. Comparison of both

          3 input nodes   5 input nodes
Error        47.3381         88.2962
Epochs          37             154

[5] N. Wiener, "Differential space," J. Math. Phys. Mass. Inst. Technol. 2, 1923, pp. 131-174.

[6] T. Vicsek, Fractal Growth Phenomena, World Scientific, Singapore, 1992, p. 488.

[7] H. Greisiger and T. Schauer, Prog. Org. Coat. 39, 2000, p. 31.

[8] H. E. Hurst, Nature 180, 1957, p. 494.

[9] H. E. Hurst, R. P. Black and Y. M. Simaika, Long-term Storage, an Experimental Study, Constable, London, 1965.

[10] B. Mandelbrot and J. R. Wallis, Water Resour. Res. 5, 1969, p. 228.

[11] B. Mandelbrot and J. R. Wallis, Water Resour. Res. 5, 1969, p. 967.

[12] J. Feder, Fractals, Plenum, New York, 1988.

[13] D. J. Daley, Ann. Probab. 27, 1999, p. 2035.

[14] R. W. Komm, Solar Phys. 156, 1995, p. 17.

[15] R. Oliver and J. L. Ballester, Solar Phys. 169, 1996, p. 216.

[16] A. Gadomski, Mod. Phys. Lett. B 11, 1997, p. 645.

[17] I. A. Lebedev and B. G. Shaikhatdenov, J. Phys. G: Nucl. Part. Phys. 23, 1997, p. 637.

[18] M. A. F. Gomes, F. A. O. Souza and V. P. Brito, J. Phys. D 31, 1998, p. 3223.

[19] C. L. Jones, G. T. Lonergan and D. E. Mainwaring, J. Phys. A 29, 1996, p. 2509.

[20] C. Heneghan and G. McDarby, Phys. Rev. E 62, 2000, p. 6103.

[21] C. W. Lung, J. Jiang, E. K. Tian and C. H. Zhang, Phys. Rev. E 60, 1999, p. 5121.

[22] B. A. Carreras, B. Ph. van Milligen, M. A. Pedrosa, R. Balbin, C. Hidalgo, D. E. Newman, E. Sanchez, M. Frances, I. Garcia-Cortes, J. Bleuel, M. Endler, C. Ricardi, S. Davies, G. F. Matthews, E. Martines, V. Antoni, A. Latten and T. Klinger, Phys. Plasmas 5, 1998, p. 3632.

[23] B. M. Gammel, Phys. Rev. E 58, 1998, p. 2586.

[24] M. Moon and B. Skerry, J. Coat. Technol. 67, 1995, p. 35.

[25] L. A. Zadeh, plenary talk, "From Computing with Numbers to Computing with Words: From Manipulation of Measurements to Manipulation of Perceptions," Proceedings of IWSCI '99, June 16-18, 1999, Muroran, Japan.

[26] A. Kusiak, "Rough Set Theory: A Data Mining Tool for Semiconductor Manufacturing," IEEE Trans. on Electronics Packaging Manufacturing, Vol. 24, No. 1, pp. 44-50, January 2001.

