1 Introduction
1.1 Background
The predictability of stock markets has long been the subject of debate [1, 2]. No prediction
tool which applies to all stock markets and guarantees high profits has been
published yet. Even if such a tool existed, it would remain unpublished for two
obvious reasons: First, the material gain from such a tool exceeds the moral gain
that could be made by publishing it. Second, public and widespread use of the
tool would render its predictions almost useless.
The two well-known hypotheses which foster this pessimism even further are
the efficient market hypothesis and the random walk hypothesis. The efficient
market hypothesis states that at any time, the price of a stock reflects its actual
value, that is, any information that can affect the stock price must have been
Table 1. Possible outcomes of a transaction made by a prediction tool.

Case  Trader has stock  Prediction made  Price change  Trading action  Result
 1    No                Down             Down          Wait            Nothing lost or gained
 2    No                Down             Up            Wait            Nothing lost or gained
 3    No                Up               Down          Buy             Lost money
 4    No                Up               Up            Buy             Gained money
 5    Yes               Down             Down          Sell            Avoided the fall
 6    Yes               Down             Up            Sell            Missed the rise
 7    Yes               Up               Down          Hold            Lost money
 8    Yes               Up               Up            Hold            Gained money
already reflected in the stock price. Depending on the availability and type of the
information, three different forms of this hypothesis exist. The weak form states
that past price movements are of no use. The semi-strong form states that publicly
available information, such as company balance sheets and profit forecasts, is of no
use. The strong form states that even private information is not useful. Although
the weak and semi-strong forms are widely accepted, there is no consensus on
the validity of the strong form.
The other important hypothesis, namely the random walk hypothesis, is a
variant of the efficient market hypothesis. It claims that stock prices do
not follow any patterns, and that there is no way to predict future prices using
past data. According to this hypothesis, a trader investing his money in an
efficient market is no different from a gambler playing card games in a casino.
Despite all this pessimism, stock market prediction models are still being pub-
lished showing how to make money in some particular stock markets, within some
specific time intervals, or under some special circumstances [9, 13, 14, 16, 17].
All these prediction models have in common that they develop buy and sell
strategies depending on the behavior of past technical and fundamental data.
Technical data includes the previous prices of a stock, the total number of trans-
actions, the transaction volume, and the values obtained from technical indicators.
Fundamental data includes general economic parameters and information
about company activities. Economic parameters include the inflation rate, in-
terest rates and currency exchange rates. Parameters about a company include
data about its sales performance, profits, debts and estates. It is also possible
to use combinations of technical and fundamental data for prediction (such as
the price/earnings ratio).
Although most people think that making transactions relying on the predic-
tions of a non-living trader is doubtful and risky, the decisions of a prediction
tool are in fact no worse than those of a naive stock trader who has no insider
information. Table 1 displays eight possible scenarios, each formed depending on
the current portfolio of the trader and the correctness of the trading system's
decisions. Assuming all scenarios are equally probable, at least half of the time
(cases 1, 2, 5 and 6), the trader can avoid the risk just by staying out of the
market. In the four remaining cases, the trader either loses (cases 3 and 7) or
gains (cases 4 and 8) money. Surely, in a more detailed analysis, current inter-
est rates, transaction fees, the frequency of transactions and the prediction
accuracy of the system play an important role.
series. This is basically due to the fact that stock prices vary at large magnitudes
over time, which makes both the application of the models and the comparison of
results between different stocks rather difficult. Hence, using price changes
between consecutive sessions as input is more meaningful, since price changes
fall within a more restricted range. For example, in IMKB, due to the
restrictions on price changes, a stock cannot rise or fall more than 25% within
the same day.
As a convention, we use two-letter names to denote price change time series.
We use fg(t) to denote the change between the values of price series f and g,
at days t−1 and t, respectively. The formula for calculating a price change series is
shown in Equation 1. With this notation, for example, lc(t) corresponds to the
rate of change between the closing price at day t and the previous day's lowest
price. Surely, combining the letters e, n and v with other letters is not possible,
since these series belong to different domains. Hence, there are 19 possible price
change series that can be formed. Note that, although the nn and vv series are not
actual price changes, with some abuse of language, we refer to them as price
changes.
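Equation 1 itself is not reproduced in this excerpt; assuming the usual percentage-change definition fg(t) = 100 · (g(t) − f(t−1)) / f(t−1), a price change series can be sketched as follows (the helper name and sample values are illustrative, not the paper's):

```python
def change_series(f, g):
    """Percentage change between series f at day t-1 and series g at day t.

    Assumed form of Equation 1: fg(t) = 100 * (g(t) - f(t-1)) / f(t-1).
    """
    return [100.0 * (g[t] - f[t - 1]) / f[t - 1] for t in range(1, len(f))]

# Example: lc(t) is the change from yesterday's lowest price to today's close.
low = [10.0, 10.5, 10.2]
close = [10.8, 10.9, 10.4]
lc = change_series(low, close)  # lc[0] = 100 * (10.9 - 10.0) / 10.0 = 9.0
```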
Fig. 1. (a) 7-way partitioning of the cc series into categories Acc through Gcc;
(b) 5-way partitioning of the ll series into categories All through Ell.
[Plot residue removed; interval boundaries shown include -2.08, -0.23, 0, 0.93 and 2.69.]
Figure 1 illustrates the working of our approach over the cc and ll price change se-
ries. Figure 1-a presents the result of 7-way partitioning over the cc series. We
denote the categories corresponding to intervals by capital letters. In our exam-
ple, Acc indicates a drastic decrease in closing prices. Similarly, in Figure 1-b,
Dll indicates a fair increase in lowest price changes. We denote categorical price
change series by f. In the given example, for cc(t) < −2.98, cc(t) returns Acc.
The disadvantage of this approach is due to uniform discretization. Even if
two values are very close to each other, they can still fall into different categories.
For example, whereas a closing price change of −0.383 is classified as Ccc, −0.381
is classified as Dcc. A more general problem is finding the appro-
priate number of categories, k. This number is usually determined empirically.
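Uniform discretization as described above can be sketched as follows (our own minimal implementation; the paper's exact boundary placement is not reproduced here):

```python
def discretize(values, k):
    """Map each value of a change series to one of k category labels
    ('A', 'B', ...) using uniform-width intervals over the observed range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    labels = []
    for v in values:
        idx = min(int((v - lo) / width), k - 1)  # clamp the maximum into the last bin
        labels.append(chr(ord('A') + idx))
    return labels

# Values just on either side of an interval boundary receive different labels
# even though they are almost equal -- the weakness noted above.
cats = discretize([-3.1, -0.4, 0.1, 2.7], 4)  # ['A', 'B', 'C', 'D']
```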
A pattern P is formed by combining one or more category labels from dif-
ferent series. The example pattern we gave at the beginning of this section can
be stated as EccDLH. If, when this pattern is observed, the experience is that
the market closing price goes up, we create the rule R : EccDLH → +cc. During
training, all past patterns are searched and frequently occurring patterns
are used to construct the predictive rules. Among these rules, the ones with the
highest confidence over the training set are used for prediction over the test set.
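The training step described above can be sketched as follows (a minimal illustration under our own assumptions; the function names, thresholds and sample data are hypothetical, and real patterns would combine category labels from several series):

```python
from collections import Counter

def mine_rules(patterns, outcomes, min_support, min_confidence):
    """Count how often each pattern occurs and how often it is followed by a
    rise ('+') in the target series; keep rules whose support and confidence
    clear the thresholds."""
    seen, rises = Counter(), Counter()
    for p, o in zip(patterns, outcomes):
        seen[p] += 1
        if o == '+':
            rises[p] += 1
    rules = {}
    for p, n in seen.items():
        if n >= min_support:           # frequently occurring patterns only
            conf = rises[p] / n
            if conf >= min_confidence:  # high-confidence rules only
                rules[p] = conf
    return rules

history = ['EccDLH', 'EccDLH', 'AccBLH', 'EccDLH']
ups     = ['+',      '+',      '-',      '+']
rules = mine_rules(history, ups, min_support=2, min_confidence=0.7)
# rules == {'EccDLH': 1.0}
```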
5 Performance Measures
For evaluating the performance of the neural network model, the error made over
the training set can be used. However, the normalized mean squared error (NMSE) is
a better measure, since it allows comparison between experiments over different
neural network architectures and stock data. NMSE can be used to see how
closely the predictions simulate the actual values. The formula for NMSE is given,
for an N-day long test period, in Equation 5. In this formula, f′(t) represents
the predicted price for the actual price f(t), and f̂ is the mean of f(t).
NMSE = [ Σ_{t=1}^{N} (f(t) − f′(t))² ] / [ Σ_{t=1}^{N} (f(t) − f̂)² ]    (5)
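Equation 5 translates directly into code; the following sketch assumes plain Python lists for the actual and predicted series:

```python
def nmse(actual, predicted):
    """Normalized mean squared error (Equation 5): squared prediction error
    divided by the squared deviation of the actual series from its mean."""
    mean = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean) ** 2 for a in actual)
    return num / den

# A predictor that always outputs the series mean scores exactly 1;
# values below 1 mean the model beats that trivial baseline.
```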
From the trader's point of view, a more important measure is correctly
predicting the sign of the change. Knowing whether the market will rise or fall tomorrow
would be a great help to the trader. Sign accuracy can be calculated using the
following formula,
SignAccuracy = ( Σ_{t=1}^{N} s(t) ) / N    (6)
where s(t) is calculated as,
s(t) = { 1, if f(t) · f′(t) > 0
       { 1, if f(t) = f′(t) = 0    (7)
       { 0, otherwise
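Equations 6 and 7 can be sketched as follows; we read the first case of Equation 7 as the product f(t) · f′(t) being positive, i.e. the actual and predicted changes having the same sign:

```python
def sign_accuracy(actual, predicted):
    """Sign accuracy (Equations 6-7): the fraction of days on which the
    predicted direction of change matches the actual direction."""
    def s(a, p):
        if a * p > 0:
            return 1  # same sign: both rose or both fell
        if a == p == 0:
            return 1  # both flat
        return 0
    return sum(s(a, p) for a, p in zip(actual, predicted)) / len(actual)

acc = sign_accuracy([0.5, -0.2, 0.1, 0.0], [0.3, -0.6, -0.1, 0.0])
# 3 of the 4 signs match, so acc == 0.75
```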
The quality of the rules produced by the second model can be measured in a
similar manner. Among the rules created, only the ones with higher confidence
values are used for predictions. If a low-confidence rule or a low-support pattern
is observed, the prediction tool is assumed to remain silent. The accuracy of the
produced rules can be calculated as shown in Equation 8.

Accuracy = (number of correct predictions of s(t)) / (total number of predictions made)    (8)
Despite all these measures, we believe that the real value of a stock market
prediction model can best be judged by its ability to earn money. For this pur-
pose, recommended buy and sell orders should be simulated over the test set.
Transaction fees, the inflation rate and interest rates are factors which further
complicate the profit simulation process. A prediction system which yields
revenues beyond the interest rates, after the transaction fees are subtracted
from the profit, can be considered successful.
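A profit simulation along these lines can be sketched as follows (our own simplified model: a proportional fee on each trade, all-in position sizing, and no interest or inflation adjustment; the fee level is an assumed example, not the paper's):

```python
def simulate_profit(prices, signals, capital=100.0, fee=0.001):
    """Follow buy/sell recommendations over a test set, charging a
    proportional transaction fee on every trade."""
    cash, shares = capital, 0.0
    for price, signal in zip(prices, signals):
        if signal == 'buy' and cash > 0:
            shares = cash * (1 - fee) / price  # spend all cash, minus the fee
            cash = 0.0
        elif signal == 'sell' and shares > 0:
            cash = shares * price * (1 - fee)  # liquidate, minus the fee
            shares = 0.0
    return cash + shares * prices[-1]  # final portfolio value
```

With no fee, buying at 10 and selling at 12 turns 100 units of capital into 120; any nonzero fee reduces that, which is why the fee level matters in judging whether the system beats the interest rate.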
6 Experimental Results
6.1 Experimental Setup
The experiments are carried out on a Pentium III 500 MHz PC with 512 MB
of RAM, running a Debian Linux system. The execution time of the neural
network model largely depends on the network parameters. With 10 hidden layer
neurons, it takes around 5 minutes to complete 5000 epochs in the training phase.
Compared to this, the second model runs quite fast. Both the pattern discovery and
rule formation phases require around a few seconds to complete.
The data set we used contains some missing data and requires some cleansing.
While creating the inputs to the neural network and generating the patterns
used in the second model, we took the gaps existing between consecutive sessions
into consideration. These gaps are due either to missing session data or to
holidays. If the gap is longer than three days, or the difference between the
closing prices of the last day before the gap and the first day after the gap is
higher than 25%, then the values of the price change series on the first day
after the gap are not used for training. Other preprocessing, for the neural
network model, includes normalization of the input price change values.
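The gap-filtering rule above can be sketched as follows (function and parameter names are ours):

```python
from datetime import date

def usable_for_training(prev_day, day, prev_close, close,
                        max_gap_days=3, max_jump_pct=25.0):
    """Decide whether the first session after a gap may be used for training,
    per the cleansing rule: skip it when the gap exceeds three days or the
    closing price moved by more than 25% across the gap."""
    gap = (day - prev_day).days
    jump = abs(close - prev_close) / prev_close * 100.0
    return gap <= max_gap_days and jump <= max_jump_pct

ok = usable_for_training(date(2001, 1, 2), date(2001, 1, 10), 100.0, 104.0)
# an 8-day gap: the session after the gap is excluded, so ok is False
```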
6.2 Back-Propagation Neural Network Model
Back-propagation learning is a gradient-descent search algorithm which aims to
find a local minimum over an error surface. Finding the absolute minimum is
known to be an NP-hard problem. The back-propagation algorithm, most of the
time, converges to a sub-optimal local minimum. The execution time of the algo-
rithm depends on the targeted error value, the learning constant used, the number
of processing units, and the size of the training data. Converging to a local min-
imum may take quite a long time, and hence in our networks we tried small
numbers of neurons. Using higher numbers of neurons did not improve the
accuracies much.
For the learning constant, we used the fixed value 0.5. However, as the
epoch count increased, we gradually decreased this value; in other words, we
used a dynamic learning constant for faster convergence. Also, to prevent
possible oscillations during the search process, and to add some hill-climbing
capability, we used a forgetting factor of 0.1. Training terminated when the
epoch count reached 5000 or the change in error between two successive epochs
dropped below 10⁻⁷.
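This training schedule can be sketched on a toy one-dimensional error surface E(w) = (w − 3)². The exact decay law of the learning constant is not stated in the text, so the form used below is an assumption, as is our reading of the forgetting factor as a momentum term:

```python
def train(epochs=5000, eta0=0.5, alpha=0.1, eps=1e-7):
    """Gradient descent with a decaying learning constant and a momentum
    ('forgetting') factor on the toy error surface E(w) = (w - 3)**2."""
    w, prev_delta, prev_error = 0.0, 0.0, float('inf')
    for epoch in range(1, epochs + 1):
        eta = eta0 / (1 + epoch / 1000.0)  # assumed gradual decay from 0.5
        grad = 2 * (w - 3)                 # dE/dw
        delta = -eta * grad + alpha * prev_delta  # momentum damps oscillation
        w += delta
        error = (w - 3) ** 2
        if abs(prev_error - error) < eps:  # error change below the threshold
            break
        prev_error, prev_delta = error, delta
    return w

# converges near the minimum w = 3
```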
The results in Table 3 are obtained using a network with 5 hidden layer neurons.
For training, the first 90% of the time series is used; the remaining 10% is used for
testing. The base-line accuracies (BLA) over the test set are 53.53%, 52.52% and
51.52% for the predicted cc, ll and hh series, respectively. In Table 3, E
represents the error over the training set at the point where the training phase
terminated. NMSE is measured using the test set. Accuracy stands for the sign
accuracy of the predictions made by the neural network.
The results indicate that the cc series is hardly predictable. The best sign accuracy
(54.55%) is obtained using the ccLHHC combination. Again with the same combi-
nation, 77.78% accuracy is achieved for the hh series. The most easily predictable
series seems to be the hh series. Figure 2 displays the simulated (hh′) and actual
(hh) series, for the IMKB 30 index, between 2 Oct 2000 and 20 Feb 2001.
7 Discussion
The results from both models indicate the difficulty of predicting closing
prices. However, despite the fact that the other intra-day price change series
behave as if random, there exist useful patterns in these series that can be used
for prediction. In both models, the best results are obtained for predicting the daily
highest price change series. As we stated before, the result of a profit simulation is an
important measure for evaluating the performance of an automated trading system.
So far, no means of performing profit simulation over the highest and lowest price
change series has been proposed. We leave this, and the application of the models
to the portfolio selection problem, as future work.
Although not reported here, we have conducted experiments to see the effect
of the number of transactions and the volume on price movements. Interestingly, no
significant correlations have been found. We also tried rules having larger window
sizes (in the reported experiments, only the data of the previous day, that is, a
window size of 1, is used). This way more specific rules have been obtained, but
due to the lack of available data the support for these patterns was too low.
Currently we are working on creating series for technical indicators. We
plan to combine the indicator series with the other price change series and to integrate
them into both models. We believe that the use of technical indicators will boost the
prediction accuracy of the forecasting system we developed.
Another interesting issue is making predictions for coarser-grained time
intervals. Currently, our forecasting system produces daily predictions. Weekly
or monthly predictions could be tried. At the other extreme, each session in a
day could be used as an instance. This approach may be beneficial in that it
doubles the amount of available data.
Our results on temporal effects are rather interesting. The best days to buy
stock are found to be Thursday and Friday. The highest price changes occur in the
first and third weeks of a month. It is interesting to note the correlation
between these weeks and the days on which people receive their salaries (the 1st and
15th days of a month). Furthermore, we observed some seasonal effects: prices
seemed to go up towards the start of winter.
References
1. B. G. Malkiel, A Random Walk Down Wall Street, 7th ed., New York, 1999.
2. E. Maasoumi and J. Racine, Entropy and Predictability of Stock Market Returns, Journal of Econometrics, March 2002.
3. R. Herbrich, M. Keilbach, T. Graepel, P. Bollmann-Sdorra, and K. Obermayer, Neural Networks in Economics: Background, Applications and New Developments, in Advances in Computational Economics: Computational Techniques for Modelling Learning in Economics, Vol. 11, pp. 169–196, 2000.
4. S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company Inc., 1994.
5. T. Hellström and K. Holmström, Predicting the Stock Market, Technical Report, Dept. of Mathematics and Physics, Mälardalen University, 1998.
6. F. Castiglione, Forecasting Price Increments Using an Artificial Neural Network, Advances in Complex Systems, No. 1, pp. 45–56, March 2001.
7. J. Li and E. P. K. Tsang, Investment Decision Making Using FGP: A Case Study, Proceedings of the Congress on Evolutionary Computation, 1999.
8. S. Singh, Noisy Time-Series Prediction Using Pattern Recognition Techniques, Computational Intelligence, Vol. 16, No. 1, pp. 114–133, 2000.
9. T. Hellström and K. Holmström, Predictable Patterns in Stock Returns, Technical Report, Dept. of Mathematics and Physics, Mälardalen University, 1998.
10. P. J. McCann and B. L. Kalman, A Neural Network Model for the Gold Market, Technical Report, Washington University, 1993.
11. G. Dorffner, Neural Networks for Time Series Processing, Neural Network World, 6(4), pp. 447–468, 1996.
12. M. Magdon-Ismail, A. Nicholson and Y. S. Abu-Mostafa, Financial Markets: Very Noisy Information Processing, Proceedings of the IEEE, Special Issue on Information Processing, 1998.
13. T. Chenoweth and Z. Obradovic, A Multi-Component Nonlinear Prediction System for the S&P 500 Index, Neurocomputing, Vol. 10, No. 3, pp. 275–290.
14. J. Yao and H. L. Poh, Equity Forecasting: A Case Study on the KLSE Index, in Neural Networks in Financial Engineering, Proceedings of the 3rd International Conference on Neural Networks in the Capital Markets, London, 1995.
15. S. B. Achelis, Technical Analysis from A to Z, Irwin Professional Publishing, Chicago, 2nd Edition, 1995 (available at http://www.equis.com/free/taaz/).
16. F. Hamelink, Systematic Patterns Before and After Large Price Changes: Evidence From High Frequency Data from the Paris Bourse, 1999.
17. P. Tino, C. Schittenkopf and G. Dorffner, Volatility Trading via Temporal Pattern Recognition in Quantized Financial Time Series, Pattern Analysis and Applications, 4(4), pp. 283–299, 2001.
18. S. M. Weiss and N. Indurkhya, Rule-based Machine Learning Methods for Functional Prediction, Journal of Artificial Intelligence Research, Vol. 3, pp. 383–403, 1995.
19. http://www.analiz.com/AYADL/ayadl01.html
Table 3. Accuracies obtained by the back-propagation neural network.
[Table body lost in extraction; per the text, its columns are the predicted series cc(t+1), ll(t+1) and hh(t+1), with rows reporting the training error E, the NMSE over the test set, and the sign accuracy.]

Fig. 2. Actual (hh) and predicted (hh′) daily highest price series.
[Plot residue removed; axes: Daily Highest Price, 9000–19000, over Days 1–97.]
Table 5. Accuracies obtained by the generated rules (k=5).

Rule                           Predicted Series
Domain  ll(t+1)  lh(t+1)  lc(t+1)  hl(t+1)  hh(t+1)  hc(t+1)  cl(t+1)  ch(t+1)  cc(t+1)
ll(t)    59.39     100     76.32     100     65.24    68.94    99.63    99.89    47.73
lh(t)    59.29     100     76.32     100     58.69    68.94    99.63    99.89    39.03
lc(t)    66.53     100     76.32     100     70.76    68.94    99.63    99.89    32.21
hl(t)    57.96     100     76.32     100     61.15    68.94    99.63    99.89    51.68
hh(t)    58.16     100     76.32     100     58.59    68.94    99.63    99.89    42.61
hc(t)    64.29     100     76.32     100     68.81    68.94    99.63    99.89    52.7
cl(t)    61.63     100     76.32     100     68.51    68.94    99.63    99.89    52.7
ch(t)    60.51     100     76.32     100     62.68    68.94    99.63    99.89    38.83
cc(t)    68.88     100     76.32     100     75.66    68.94    99.63    99.89    52.09
ee(t)    56.94     100     76.32     100     53.37    68.94    99.63    99.89    51.07
LH(t)    56.94     100     76.32     100     56.44    68.94    99.63    99.89    52.6
LC(t)    72.14     100     76.32     100     70.25    68.94    99.63    99.89    52.7
HC(t)    68.67     100     76.32     100     76.69    71.59    99.63    99.89    52.91
Fig. 3. Predicting the closing price change by various price change series.
[Plot residue removed; accuracy (30.0–60.0) versus k (1–8) for the ll, lh, lc, hl, hh, hc, cl, ch, cc, ee, LH, LC and HC series.]
Fig. 4. Predicting the lowest price change by various price change series.
[Plot residue removed; accuracy (50.0–75.0) versus k (1–8) for the ll, lh, lc, hl, hh, hc, cl, ch, cc, ee, LH, LC and HC series.]
Fig. 5. Predicting the highest price change by various price change series.
[Plot residue removed; accuracy (45.0–80.0) versus k (1–8) for the ll, lh, lc, hl, hh, hc, cl, ch, cc, ee, LH, LC and HC series.]
Fig. 6. Predicting the highest price change by multi-label patterns.
[Plot residue removed; accuracy (72.0–80.0) versus k (1–8) for the ccHC, ccLC, HCLC and ccHCLC patterns.]
Fig. 7. Impact of the month of the year on closing stock prices.
[Plot residue removed; average of daily closing price changes (−0.4 to 0.6) by month, Jan–Dec.]
Fig. 9. Impact of the day of the week on closing stock prices.
[Plot residue removed; average of daily closing price changes by day of the week, Mon–Fri.]