
NELION: A Non-Linear Stock Prediction and Portfolio Management System

Dissertation submitted to the Department of Mathematics and Computer Science
of the Freie Universität Berlin
for the academic degree of Doctor of Natural Sciences

Submitted by Thomas Schwerk
Berlin, 2001


Reviewers: Prof. Dr. Raul Rojas, Freie Universität Berlin
Prof. Dr. Volker Sperschneider, Universität Osnabrück

Date of the defense: February 13, 2001


This dissertation is dedicated to Victoria Malaika Bonnekamp


Preface
This thesis presents a unified Internet-based portfolio management tool, NELION¹, that produces non-linear stock predictions and suggests the optimal portfolio for multiple investors, based on their specific preferences.

¹ At 5188 m, Nelion is the second of the twin peaks of Mount Kenya, the highest mountain in this East African country. The silhouette of the mountain is reminiscent of a local maximum in a stock chart.
There exist a variety of publications on different approaches to
time series analysis, especially as it pertains to financial data,
including stock prices. The bulk of this work, however, optimizes a particular technique for a specific set of data. There
also exists a wide consensus on portfolio management theory,
which describes how stocks in a portfolio should be distributed
so that it conforms to the risk and return requirements of an
investor. Until now, however, these two approaches have not
been connected to form an automated and integrated stock
prediction and portfolio management tool.
NELION connects to the Internet on a daily basis and downloads stock prices and transaction volumes to a local database. With this information at hand, it determines the volatility of each stock and the correlation between any two stocks that are tracked in the database. The system then
calculates numerous mathematical predictor models to
forecast the stock price one day, one week and one month
into the future. These models are continually refined through
a genetic algorithm that uses the available processing power
of the computer to search the model parameter space for
improved configurations whenever no other tasks are awaiting
execution.
Investors benefit from these models through personalized transaction suggestions, which take their current portfolio, their risk aversion and their other preferences into account. They can be notified by e-mail at defined intervals about the current state of their portfolio and receive suggestions on how to reduce its risk while maintaining a defined return on investment. In order to limit expensive excess trades, the system takes the minimum transaction value defined by the investor into account. If NELION identifies that a particular stock has undergone a dramatic change in price, the system generates an alert for all investors who own shares in the company by sending a short message to their mobile phones.
NELION provides a Test Investor function that simulates trades based on historic data for any number of investor profiles. This function is designed to be used with all combinations of the adjustable parameters, so that the results of these trials can be analyzed and suitable profiles recommended for different levels of risk aversion.
Additionally, the system has an Auto Investor module that acts
like an autonomous trader on real data and automatically
executes unsupervised, simulated trades at defined intervals.
These functions base all purchases and sales on the current
stock price and include transaction costs in an effort to provide
a fair measure of the success of the system.
The experiment allowed the agent to perform unlimited transactions once every weekend with no human intervention and ran for one year, starting May 15, 1999, for a conservative and a high-risk investor configuration. The only restrictions were that the algorithm could not hold negative stock positions (short positions) and could not use more cash than the US$ 10,000 initial investment.
The results show that NELION is an effective advisor, alerting a private investor to promising opportunities and showing alternatives that reduce the risk of the portfolio while maintaining a defined level of return. The portfolio management tools provide the user with relevant information both on the portfolio history and on recommendations for the future.


Table of Contents
Preface
1 Introduction
2 Predicting Stock Prices
2.1 Efficient Market Hypothesis
2.2 Mathematical Modeling Techniques
3 Portfolio Management
3.1 Return
3.2 Risk
3.3 The Optimal Portfolio
3.4 Applying the Theory
4 Methodology
4.1 Overview
4.2 The HTML Interface
4.3 The Administration Tool
4.4 The Database
4.5 The Task Agent
5 Implementation
5.1 Overview
5.2 The HTML Interface
5.3 The Administration Tool
5.4 The Database
5.5 The Task Agent
6 Experimental Results
6.1 Test Investor Identification
6.2 Testing the Profiles
6.3 Model Distribution
6.4 Daily Operation
7 Conclusion
8 Appendix A: Investor Profiles
8.1 Conservative Investors
8.2 High Risk Investors
9 Appendix B: Portfolio History
9.1 Transactions by Conservative Investor
9.2 Transactions by High Risk Investor
10 Appendix C: Screen Shots
11 Appendix D: Conceptual Data Model
12 Appendix E: Stocks Tracked in the Simulation
13 Appendix F: Bibliography
14 Appendix G: Curriculum Vitae for Thomas Schwerk

Table of Figures
Figure 2.2.1: Trend Lines and Trend Channel
Figure 2.2.2: Resistance Line
Figure 2.2.3: Momentum for n=7
Figure 2.2.4: Trend Confirmation Indicator
Figure 2.2.5: Comparison of the 50 and 200 Day Moving Average
Figure 2.2.6: Trend Oscillator with a 10-Day Moving Average
Figure 2.2.7: The Over-Bought/Over-Sold Indicator with n=20
Figure 2.2.8: Artificial Neural Network
Figure 2.2.9: Utans, Moody Experimental Training and Test Error
Figure 2.2.10: Utans, Moody Test Error with Removed Input Data
Figure 2.2.11: Utans, Moody Experiment with Optimal Brain Damage
Figure 2.2.12: Hierarchical Networks
Figure 2.2.13: Training Error for Hierarchical Networks
Figure 2.2.14: Test Error for Different Prediction Horizons
Figure 2.2.15: The VSmart Virtual Stock Market
Figure 3.2.1: The One-in-Six Rule
Figure 3.3.1: Risk of a Portfolio
Figure 3.3.2: The Efficient Frontier
Figure 3.3.3: The Utility Function
Figure 3.3.4: The Optimal Portfolio
Figure 4.1.1: Block Diagram NELION
Figure 4.4.1: Simplified Conceptual Data Model
Figure 4.5.1: Artificial Neural Network
Figure 5.2.1: Portfolio Overview via the HTML Interface
Figure 5.3.1: The NELION Administration Tool
Figure 5.5.1: The Task Agent Program
Figure 6.2.1: Comparison of NELION Investors with Major Indexes

1 Introduction
The rapid growth of the Internet in recent years has made an unprecedented wealth of information and data available to every investor for the stock portfolio decision-making process. However, easy access to information does
not translate into knowledge and does not automatically result
in a higher yield. In this thesis, we present NELION, a stock
prediction and portfolio management tool that takes advantage
of the Internet. NELION is designed to help small investors
improve their return on investment and achieve consistently
higher yields.
Since the introduction of the first stock market, many attempts
have been made to predict the prices of commodities. The
result of a simple calculation by Farmer and Lo impressively
underlines the motivation for these predictions: A single US
dollar invested in US Treasury bills in January 1926 and not
touched since then, would have grown to 14 dollars by
December 1996. A single US dollar invested in the S&P 500
for the same period would have grown to 1,370 dollars. In the
same period, an investor with perfect foresight and who would
have placed his cumulative fortune in the financial vehicle with
the highest yield at the beginning of every month would have
grown his one dollar investment to a staggering 2,296,183,456
dollars [Farmer, Lo 1998]. With returns like these, it is easy to see why attempts at market prediction are almost as old as the markets themselves.
As early as the late 1800s, Charles Dow focused on simple
models to predict share prices [Bishop 1960]. One of the
components of what was later called the Dow Theory tracked
the Industrial Average and the Railroad Average. The theory
stated that when the price of both averages remains within a
band of 5% for several weeks, a line has been drawn. If,
thereafter, both averages break out of this band in the same
direction, the Dow Theory states that the price movement has
the momentum to continue with its trend.
Since then, both the legal framework and the technology have progressed. On May 1, 1975, the U.S. Securities and Exchange Commission ruled that security exchanges could not fix brokerage commission rates, forcing them into a competitive market situation. The result has been a rich landscape of different service providers, including many discount brokers. Unlike traditional stockbrokers, these companies do not offer any investment advice, but will manage the stock portfolios of private investors at very competitive prices.
The recent advances in computer technology and the availability of diverse and comprehensive investment information on-line have dramatically affected the domain of the individual investor. For the first time, it is feasible for small investors to manage their own investment portfolio, though not necessarily wise. Barber and Odean show that Frank Zappa's statement that "information is not knowledge" applies here as well [Barber, Odean 1999] [Zappa 1979]. They show that access to the breadth of information that is available today tends to give investors a false sense of confidence, prompting them to trade excessively. Additionally, the authors show that investors tend to hold on to their losing investments disproportionately, while selling winners.
This psychological trap seems to be the Achilles' heel of most private investors. Because most investors do not have a clear understanding of their aversion to risk given a medium-term benefit, they are likely to make counterproductive decisions. Inevitably, these decisions have a negative effect on the long-term profitability of the portfolio. Kahneman and Tversky documented this phenomenon as early as 1979 [Kahneman, Tversky 1979]. They asked two groups of subjects the following two questions:
1. In addition to whatever you own, you have been given US$
1,000. You are now asked to choose between A) A sure gain
of US$ 500 and B) A 50% chance to gain US$ 1,000 and a
50% chance to gain nothing.
2. In addition to whatever you own, you have been given US$
2,000. You are now asked to choose between A) A sure loss
of US$ 500 and B) A 50% chance to lose US$ 1,000 and a
50% chance to lose nothing.
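Statistically, the two questions are equivalent: in each case, both options lead to an expected final wealth of US$ 1,500, as a quick check of the arithmetic (using the endowments and payoffs stated above) shows:

\text{Question 1:}\quad E[A] = 1000 + 500 = 1500, \qquad E[B] = 1000 + 0.5 \cdot 1000 + 0.5 \cdot 0 = 1500
\text{Question 2:}\quad E[A] = 2000 - 500 = 1500, \qquad E[B] = 2000 - 0.5 \cdot 1000 - 0.5 \cdot 0 = 1500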
Neither choice therefore affects the expected net gain. However, in the first group 84% chose A), while 69% of the second group chose B). The result indicates that people generally tend to be risk averse when faced with a potential gain, but are willing to take more risk when faced with a loss.
As Joachim Goldman, the head of the Behavioral Finance department of Deutsche Bank, explained in 1999, this tendency leads investors to sell stocks too quickly, simply because they have appreciated from their purchase value [Reitz 2000]. This directly contradicts the adage, popular within the investment community, that no one has ever gotten poorer by realizing gains. However, the development of a portfolio depends on future movements, not on gains or losses in the past. Unless there are clear reasons for the sale of a commodity that has appreciated, one is better off holding the stock than incurring transaction costs by selling it. Similarly, investors tend to keep stocks in order to avoid realizing a paper loss. Selling a commodity at a depreciated level seals the loss, and that seems to be a mental hurdle for many investors. However, Koija Rudzio agrees with the behavioral finance research and states that holding an overvalued stock is liable to result in further losses [Rudzio 1999].
In this thesis, I present a system, NELION, that harnesses the massive amounts of stock data available on the Internet and provides an objective prediction and stock selection strategy. The system automatically downloads selected information to a local database and uses mathematical models based on auto-regressive, Markov, k-nearest neighbors and artificial neural network algorithms to forecast each stock price one day, one week and one month into the future.
In order to optimize the mathematical predictors, a background
thread uses a genetic algorithm to search the input parameter
space for improved models. Since this task can be distributed
to any number of task agents running on different computers
in the network, NELION is a very powerful prediction tool that
constantly adjusts to the changing dynamics of the market.
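To make the mechanism concrete, the following is a minimal sketch of such a background search, not the actual NELION implementation: a simple genetic algorithm that recombines and mutates candidate parameter sets and keeps those with the lowest prediction error. The parameter names and the evaluate callback are hypothetical placeholders.

import random

def evolve_parameters(evaluate, bounds, population=20, generations=50):
    """Search a model parameter space with a simple genetic algorithm.

    evaluate(params) must return a prediction error (lower is better);
    bounds maps each parameter name to an inclusive (low, high) integer range.
    Illustrative sketch only."""
    def random_individual():
        return {k: random.randint(lo, hi) for k, (lo, hi) in bounds.items()}

    def mutate(ind):
        child = dict(ind)
        k = random.choice(list(bounds))
        lo, hi = bounds[k]
        child[k] = min(hi, max(lo, child[k] + random.choice((-1, 1))))
        return child

    def crossover(a, b):
        return {k: random.choice((a[k], b[k])) for k in bounds}

    pop = [random_individual() for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=evaluate)            # best (lowest error) first
        parents = pop[: population // 2]  # keep the better half
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(population - len(parents))]
        pop = parents + children
    return min(pop, key=evaluate)

# Hypothetical use: tune the data window length and neighbor count of a predictor.
# best = evolve_parameters(backtest_error, {"window": (5, 60), "neighbors": (1, 20)})

Because each candidate model is evaluated independently, such a search distributes naturally over several task agents running on different computers.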
At the same time, NELION maintains parameters set by the investor in a profile that takes the desired return into account in conjunction with the aversion to risk. Based on these investor profiles and the predictions for each stock, the system suggests specific purchases or sales at defined intervals and for a selected investment horizon, leading to an optimal portfolio.
NELION helps the investor set parameters by use of the Test
Investor function. This module simulates automatic trades
during a specified interval in the past to observe how the
investor would have fared. By choosing a profile that
corresponds to his needs, a new investor can expect relevant
suggestions in the future.
As a proof of concept, an Auto-Investor function built into the
system executed transactions for two different investor profiles
as an autonomous agent for one year in a simulation using
real data and realistic transaction costs. The results show that, without any intervention, NELION was able to outperform the major indexes, depending on the exact profile and the target markets.
Though one could blindly follow the investment advice as
recommended by the system, NELION is designed as a
trading advisor that helps the investor focus on promising
opportunities and maintain a balanced portfolio. A real
investor still has to verify that these suggestions conform to his
geographical, industry and personal preferences and
estimations.
The rest of the thesis is organized as follows:
In chapter 2, I present the theory of stock price prediction and
the Efficient Market Hypothesis. Based on this, I compare
numerous examples of how different authors have
implemented stock price prediction models.
Chapter 3 addresses the theory of portfolio management
based on the Markowitz approach [Markowitz 1952]. After
defining the return of a portfolio and comparing it to the
inherent risk associated with investments, I introduce the
theory of optimal portfolios and how this theory can be applied.
Based on these two pillars, I describe the theoretical foundation of NELION in Chapter 4, which covers the mathematical models used in the system and how each algorithm is designed.
Chapter 5 covers the implementation details of the system
from an information technical perspective, including the
interfaces to the Internet for external communication.
The use of NELION is described in Chapter 6, starting with the
identification of adequate investor profiles with different risk
and return requirements. In this chapter, I also present the
promising results from a simulation using the Auto-Investor
function, working with real data, executing transactions
autonomously for one year.
Finally, in Chapter 7, I assess the strengths and weaknesses of NELION and suggest further directions of research.


2 Predicting Stock Prices
Mathematicians and economists have studied stock price prediction for many years. In this chapter, the theory of efficient markets is presented, showing that although no one can consistently predict an exact future stock price, it is possible, on average, to exploit inefficiencies in the commodity markets and achieve a favorable return. With this theoretical framework in hand, I describe various practical approaches and conclude with what I perceive to be a promising direction.
2.1 Efficient Market Hypothesis
The ability of capital markets to reflect and react to the data relating to a tradable security is known as the Efficient Market Hypothesis (EMH). Paul Samuelson coined this term in his seminal work [Samuelson 1965], and the fact that he was awarded the Nobel Prize in economics underlines the importance of the EMH concept to generations of investors. Simply put, the
EMH states that the price of a stock is the consensus of all
investors and other players in the market. If a disproportionate
group believes that a security is undervalued, the buyers will
outnumber the sellers, driving the price up until it has reached
equilibrium. Similarly, an overvalued commodity will attract
fewer buyers than sellers, so that its price will drop.
In a perfect market, one that is completely efficient, the price
of a commodity reflects all information that pertains to it in any
way. This includes published reports and press releases,
articles in newspapers, magazines or electronic media as well
as macro-economic trends, the political climate and strategic
as well as tactical plans of the companies. New information
and decisions would immediately lead to an adjustment of the
price of the commodity.
The efficient market hypothesis is typically formulated in a
weak, semi-strong and a strong form. The weak form of
market efficiency assumes that security prices follow patterns
with specific cycles of upward and downward trends. Analysts
subscribing to the weak form of the efficient market hypothesis
generally search for specific patterns in charts or the product
and management structure of a company to identify under- or
overvalued stocks. This includes all investment advisors who
make a living researching particular companies, markets and
industries. Prominent representatives of this guild are
Goldman Sachs, Merrill Lynch, Salomon Smith Barney and
Lehman Brothers.
Followers of the semi-strong form of the EMH assume that the
prices of all securities reflect all publicly available data. This
includes fundamental business data, press releases as well as
rumors, which possibly spread inaccurate information about
the underlying commodity. By implication, the only possible
means of consistently benefiting from the stock market would
be to act on non-public or internal information about a
company. This is usually information held by the directors of
the relevant companies and includes plans and strategies.
Much of this information can affect the stock prices if leaked to
the general investment public. However, doing business on
non-public information is called insider trading and is
punishable by law.
Persons subscribing to the semi-strong form of the efficient
market hypothesis build portfolios with a long-term gain in
mind, based on the assumption that the stock market has
traditionally outperformed risk-free investments over periods of
ten years or longer. The most renowned representative of this
school of thought is Warren Buffett and his Berkshire Hathaway Mutual Fund.
In contrast, the strong form of the efficient market hypothesis
suggests that stock prices reflect all data relevant to the
security, both publicly available and non-public information.
This form is generally rejected by the investment community
and its rejection was expressed eloquently by Farmer and Lo. They argue that, taken to its logical conclusion, no biotechnology company would attempt to develop a vaccine for the AIDS virus, because if the market for biotechnology is efficient in the classical EMH sense, such a vaccine can never be developed; if it could, someone would have already done it! This is clearly an absurd conclusion because it ignores the challenges and gestation lags of research and development in biotechnology [Farmer, Lo 1998].
Much work has been done by a variety of persons on the EMH
to verify if price movements are indeed stochastic and
unpredictable.
As early as 1963, C.W.J. Granger analyzed stock behavior
based on linear models and spectral analysis and found
evidence of inefficiencies [Granger 1963]. These findings
were supported by the research of Niederhoffer and Osborn, which showed that professional portfolio management statistically achieved greater returns than amateur or random selections [Niederhoffer, Osborn 1966].
As Ambachtsheer showed, only the introduction of massive databases and complex algorithms permits reasonably consistent investment success [Ambachtsheer 1994]. Studies
like the one by Per H. Ivarsson on inter-bank foreign exchange
trading show that there continues to be extensive interest in
the subject [Ivarsson 1997]. The conclusion of many authors
is that inefficiencies exist and can be exploited given a
coherent investment strategy.
Not surprisingly, exchanges with better infrastructure, players
that are more sophisticated and better regulatory frameworks
are more efficient than others. In general, the opportunities
are more pronounced in smaller, less developed markets like
the Bombay or Helsinki stock exchange, as opposed to the
New York Stock Exchange, as Samuelson convincingly
argued [Samuelson 1965].
Though it is unlikely that the debate regarding the efficiencies
of markets will ever have a formal conclusion, the
overwhelming evidence shows that even if stock markets are
not gold mines, they do offer opportunities, given a coherent
strategy. This view was succinctly expressed by Boldt and Arbit: "trading carefully and searching for opportunities caused by a bias in conventional thinking seem to be the keys to success for professional investors in a highly competitive, but not strictly efficient, market" [Boldt, Arbit 1984].
2.2 Mathematical Modeling Techniques
Traditionally stocks were researched using fundamental
analysis, a method by which the financial health of the
company is evaluated and compared to those of competitors
in the same industry and the market as a whole.
Warren Buffett is probably the most famous investor who has used this approach successfully over decades. He has amassed a fortune both for himself and for the investors of his Berkshire Hathaway Mutual Fund. At the annual meetings in Omaha, Nebraska, he makes investing sound simple with statements like his conviction to judge a company only by its "inner values" and that they buy "if we like what we see" [Heller 2000]. However, the sheer number of potential investment opportunities does not permit an in-depth analysis of management personalities, business models and financial health. Consequently, computers were introduced
very early to analyze quantitative data from a variety of
companies to help identify potential winners.
Frequently this analysis focuses on the comparison of many different ratios, which investors weight according to their relative importance. The most common and most widely quoted ratio continues to be price-to-earnings, though all other values from the balance sheet, the profit and loss statement and the cash flow statement can be taken into account. Rüegg-Stürm and the training materials from the Financial Training Company introduce all common ratios used in the financial evaluation of companies, and both provide an intuitive tutorial on the topic [Rüegg-Stürm 1997] [The Financial Training Company 1998].
It is worth emphasizing that the ratios do not represent
absolute quantities, but should only be used for comparison of
companies in the same industry, region or market. Also, all
ratios can only serve as one indication and should not be
viewed in isolation.
\text{P/E multiple} = \frac{\text{Price}}{\text{Earnings}}    Equation 2.2.1
The P/E multiple or price/earnings ratio compares the closing
price of the stock with the earnings of the last 12 months. A
high value is often a reflection of lofty expectations of stock
price and may indicate that the stock is overpriced.
\text{Gross Profit Margin} = \frac{\text{Gross Profit}}{\text{Sales}}    Equation 2.2.2
The Gross Profit Margin characterizes a company's trading activity. It indicates a company's profit margin by showing the relationship between sales and direct production costs.
\text{Operating Margin} = \frac{\text{Net Trading Profit}}{\text{Sales}}    Equation 2.2.3
The Operating Margin indicates the profitability of sales, taking
into account the volume of activity. Net trading profits should
be before tax, interest paid and income from investments.
\text{Overhead Rate} = \frac{\text{Overheads}}{\text{Sales}}    Equation 2.2.4
The Overhead Rate forms the bridge between gross profit and
trading profit to sales in the previous two ratios. If there is a
significant upward movement in this ratio, it may be a cause
for concern.
\text{Return on Capital} = \frac{\text{Net Profit}}{\text{Capital Employed}}    Equation 2.2.5
This Return on Capital Employed ratio measures the overall
efficiency of the business but is only meaningful if compared
within the same industry. Manufacturing, for example, tends
to be more capital intensive than service industries and will
exhibit a lower return on capital ratio. Capital Employed
should include share capital, reserves and long-term loans.
\text{Current Ratio} = \frac{\text{Current Assets}}{\text{Current Liabilities}}    Equation 2.2.6
Based on the balance sheet figures, the Current Ratio
comments on the working capital position of the company and
is generally accepted as a measure of short-term solvency.
This ratio is particularly pertinent for dot.com companies.
\text{Quick Ratio} = \frac{\text{Current Assets} - \text{Stock}}{\text{Current Liabilities}}    Equation 2.2.7
The Quick Ratio or Acid Test indicates a company's ability to pay its debts quickly. Stock and work-in-progress are generally excluded, since they are not readily convertible to cash.
\text{Stock Turnover} = \frac{\text{Stock} \times 365\ \text{days}}{\text{Cost of Materials}}    Equation 2.2.8
The Stock Turnover ratio indicates the stock turnover period in
days. If this value increases, it could indicate excessive or
obsolete stock, a negative indicator, particularly for high-tech
companies, where the life cycle of a product is relatively short.
\text{Debtors Turnover} = \frac{\text{Debtors} \times 365\ \text{days}}{\text{Credit Sales}}    Equation 2.2.9
The Debtors Turnover indicates the average period of credit
taken by customers and is useful in determining the possible
existence of bad debts.
\text{Gearing} = \frac{\text{Interest Bearing Debt}}{\text{Shareholder Funds}}    Equation 2.2.10
This Gearing ratio helps measure the long-term strength of a
company. High gearing indicates a high risk and susceptibility
to economic fluctuations. It may also indicate that the
company would have difficulties borrowing additional funds.
\text{Dividend Cover} = \frac{\text{Profit}}{\text{Dividend}}    Equation 2.2.11
The Dividend Cover calculates the number of times the company could have paid the dividend amount out of profit. A high dividend cover may indicate that the company is financially sound, having retained considerable amounts of profit for investment back into the company, or that the dividend was very low. The latter may be a sign of expansion.
\text{Relative Market Share} = \frac{\text{Market Share} \times 100}{\text{Total Market}}    Equation 2.2.12
The Relative Market Share measures the market share of the
company as a percentage compared to its major competitors.
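Computing these ratios is trivial once the underlying figures are available. The following sketch, with invented balance-sheet numbers used purely for illustration, evaluates a few of the equations above.

# Invented example figures; in practice they come from the published accounts.
price, earnings_per_share = 42.0, 2.8
current_assets, stock, current_liabilities = 510.0, 120.0, 340.0
interest_bearing_debt, shareholder_funds = 200.0, 650.0

pe_multiple   = price / earnings_per_share                      # Equation 2.2.1
current_ratio = current_assets / current_liabilities            # Equation 2.2.6
quick_ratio   = (current_assets - stock) / current_liabilities  # Equation 2.2.7
gearing       = interest_bearing_debt / shareholder_funds       # Equation 2.2.10

print(f"P/E {pe_multiple:.1f}, current {current_ratio:.2f}, "
      f"quick {quick_ratio:.2f}, gearing {gearing:.2f}")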
Calculating ratios like these does not require significant
processing power. As computer capacity and processing
power increased, it became easier for analysts to visually
inspect graphs of individual stock prices and values derived
from them. This led to the birth of a new school of stock
analysis based on charting techniques.
2.2.1 Charting Techniques
Charting techniques work with the visual representation of the
stock price graph over a selected period. By enhancing the
diagrams with secondary time series documenting perceived
trends, the chartists identify trends or trading opportunities.
All of these charting techniques represent common analysis tools and are frequently quoted by all major investment magazines, including Capital, Börse and Wirtschaftswoche in Germany and Forbes, Money Magazine and Barron's in the United States. Bookstaber describes and explains these
techniques in detail and shows which combination of triggers
he considers particularly valuable [Bookstaber 1985].
One notable proponent of these techniques is Chrystyna
Bedrij, chief investment officer at Griffin Securities in New
York. She publishes a one-page document three to four times
a week, called The X list and has gained notoriety for having
produced portfolio recommendations with returns in excess of
60% since 1997. Her recommendations are naturally only
available to paying customers, but frequently appear on public
web sites with a one or two week delay, including
MoneyCentral on MSN.com.
Like the many ratios discussed earlier, analysts have devised
numerous charting techniques worth consideration.
Comparable to the Price-to-Earnings ratio in importance, the
most basic enhancement used by chartists to a stock price is
the trend line. It is defined by connecting local maxima or
minima with a straight line. The area between the two lines is
called the trend channel and provides an indication of the price
tendencies.
[Chart of Microsoft stock prices (US$), 1/1/1999 to 7/1/1999, with trend lines and the trend channel]
Figure 2.2.1: Trend Lines and Trend Channel
This technique is as much an art as a science since the quality
of the implicit statement depends on the correct identification
of the local maxima and minima.
Another somewhat subjective indicator is the support and
resistance line. Similar to trend lines, these floors and ceilings
are usually based on psychological barriers, which are
frequently associated with round numbers.
[Chart of Spiegel Corp. stock prices (US$), 1/1/1999 to 7/1/1999, with a resistance line at US$ 9.00]
Figure 2.2.2: Resistance Line
In the example above, the Spiegel stock price has repeatedly
challenged the US$ 9.00 level, but there seems to be a barrier
preventing it from passing this value.
The momentum (M_t) of a price is defined as follows, where P_t is the price of the stock at time t:

M_t = 100 \cdot \frac{P_t}{P_{t-n}}    Equation 2.2.13


Using this definition, it is possible to supplement the graph of a
stock with its momentum for different values of n. The
momentum indicator can help identify trend reversals and is
designed to show the strength of the movement. A reversal of the momentum from values smaller than 100 to values bigger than 100 is interpreted as a buy signal, and vice versa.
Interestingly, some analysts draw trend channels into the
momentum lines to identify buy and sell signals.
[Chart of Spiegel Corp. stock prices (US$) and their momentum, 1/1/1999 to 7/1/1999]
Figure 2.2.3: Momentum for n=7
Like many indicators, this technique can help identify
opportunities, though some analysts claim that it primarily
documents historic opportunities, instead of predicting the
future. On 1/12/99 and 4/30/99 the momentum value in the
diagram above climbed above 130, documenting a purchase
opportunity that would have resulted in a profitable trade. It
subsequently rose further and broke through the 140 level
confirming the momentum.
The trend confirmation indicator (TCI_t) is the ratio of two moving averages D_n and D_m, of n and m days, with n < m:

TCI_t = 100 \cdot \frac{D_n}{D_m}    Equation 2.2.14
A TCI value below 100 is usually interpreted as a signal
forecasting a change in trends. Values above 100 confirm the
current trend and provide an indication of its strength.
[Chart of Spiegel Corp. stock prices (US$) and the trend confirmation indicator, 1/1/1999 to 7/1/1999]
Figure 2.2.4: Trend Confirmation Indicator with n=5 and m=10
We see that this indicator can, at times, provide helpful
predictions. On 4/30/99, for example, the TCI climbed to
above 110, which would have allowed for a lucrative
investment in the Spiegel stock.
Another popular indicator graphs a short and a long term
moving average on the same graph. Common values for this
moving average (MA) comparison are 50 and 200 days. A
simple trading rule states that if the short term moving average
crosses the long term moving average from below (above), the
stock price shows a trend of increasing (decreasing) strength
and promises to continue rising (falling).
[Chart of Spiegel Corp. stock prices (US$) with the 50-day and 200-day moving averages, 1/1/1999 to 7/1/1999]
Figure 2.2.5: Comparison of the 50 and 200 Day Moving Average
In this example, this MA indicator triggered a buy signal on
1/9/99 and indeed, the stock price rose from around US$ 6.00
to close to US$ 9.00. It did not, however, identify the
subsequent decrease in stock price, nor did it trigger any other
buy or sell signals in the period displayed.
Trend oscillators (TO_t) express the relationship between the current price and a moving average of n days:

TO_t = 100 \cdot \frac{P_t}{D_n}    Equation 2.2.15
This indicator is based on the theory that commodities
oscillate within a defined trend channel over extended periods.
Given this indicator, it is possible to fine-tune the timing of a
purchase. Values above 110 (below 90) are generally
interpreted as a buy (sell) signal.
[Chart of Spiegel Corp. stock prices (US$) and the trend oscillator, 1/1/99 to 7/1/99]

Figure 2.2.6: Trend Oscillator with a 10-Day Moving Average
Again, we see that on 1/12/99 and on 4/30/99 the trend oscillator exceeds 110 and is followed by values of up to 130. A purchase on either of these days would have been followed by a substantial increase in price within a few days.
The over-bought/over-sold indicator (OBOS_t) is designed to identify stocks that are currently traded at an inefficient price.

OBOS_t = 100 \cdot \frac{H_n - P_t}{H_n - L_n}    Equation 2.2.16
In this equation, H_n and L_n are the high and low prices of the previous n days. Generally, it is assumed that a value over 90 forecasts a price reduction, while a result below 10 indicates a price increase.
[Chart of Spiegel Corp. stock prices (US$) and the OBOS indicator, 01/01/99 to 07/01/99]

Figure 2.2.7: The Over-Bought/Over-Sold Indicator with n=20
The OBOS_t indicator clearly reacts more sensitively than the previous ratios, resulting in numerous buy and sell signals. Like the previous charting techniques, it identified the 4/30/99 buying opportunity. However, there were also some false alarms at the beginning of February and on March 20, 1999.
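All of the indicators discussed in this section reduce to a few arithmetic operations on the price history. The sketch below, written against Equations 2.2.13 to 2.2.16 with a synthetic price list standing in for real closing quotes, computes the momentum, the trend confirmation indicator, the trend oscillator and the OBOS indicator for the most recent trading day.

def moving_average(prices, n):
    return sum(prices[-n:]) / n

def momentum(prices, n):                        # Equation 2.2.13
    return 100.0 * prices[-1] / prices[-1 - n]

def trend_confirmation(prices, n, m):           # Equation 2.2.14, n < m
    return 100.0 * moving_average(prices, n) / moving_average(prices, m)

def trend_oscillator(prices, n):                # Equation 2.2.15
    return 100.0 * prices[-1] / moving_average(prices, n)

def obos(prices, n):                            # Equation 2.2.16
    window = prices[-n:]
    high, low = max(window), min(window)
    return 100.0 * (high - prices[-1]) / (high - low)

# Synthetic example prices; real input would be daily closing quotes.
prices = [6.0, 6.2, 6.1, 6.4, 6.8, 7.1, 7.0, 7.4, 7.9, 8.2,
          8.0, 8.4, 8.6, 8.3, 8.8, 8.9, 8.7, 8.5, 8.9, 9.0]
print(momentum(prices, 7), trend_confirmation(prices, 5, 10),
      trend_oscillator(prices, 10), obos(prices, 20))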
These issues showed that even the chartists are not infallible
and in 1986, Frankel and Froot suggested that it is necessary
to take the expectations from both fundamentalists and
chartists into account, if one is to understand financial markets
[Frankel, Froot, 1986]. As an example, they analyzed the
value of the US dollar and devised a meta-model, based on a
combination of models. They showed that the complex
behavior in the years before the paper was published could be
explained by the interplay between these chartist and
fundamentalist schools of thought.
After the stock market crash on October 19, 1987, non-linear
dynamics and especially deterministic chaotic systems
became a major topic both among the financial press and
academic literature. Since the violent swings could not be
explained with the usual assumptions, this approach was seen
as an alternative model for the stock market.
Hsieh, for example, analyzed the S&P 500 Index between
1982 and 1990 and concluded that non-linear methods offer
promising new venues in the attempt to model this data [Hsieh
1990]. His extensive tests show evidence that the stock
returns tested are not independent and identically distributed.
Hutchinson also bases his work on the assumption that
financial time series are fundamentally non-linear in nature
[Hutchinson 1994]. He shows that though it is difficult to
benefit from them, models based on radial basis functions
provide better forecasts than linear predictors.
This was good news for the proponents of complex, chaotic models, and a variety of them have been developed and tested over the past years. With their proliferation, a third approach to stock evaluation evolved, and Robinson and Zigomanis propose extending the work of Frankel and Froot to include expectations of the non-linear dependence of financial data in order to optimize these models [Robinson, Zigomanis 1999].
The following sections address mathematical prediction
models starting with linear auto-regressive methods as a base
line. Subsequently, I focus on more advanced non-linear
approaches using k-nearest neighbors, Markov Models and
artificial neural networks.
2.2.2 Auto-Regressive Models
The literature usually distinguishes between two types of Linear Models: AR(p) or Auto-regressive Models of degree p have the form

X_t = \sum_{i=1}^{p} \phi_i X_{t-i}    Equation 2.2.17

and MA(q) or Moving Average Models of degree q are defined as follows

X_t = \sum_{i=1}^{q} \theta_i Z_{t-i}    Equation 2.2.18

where Z_t are elements of a white noise process.
Though the two can be and frequently are combined to form ARMA(p,q) models, we concentrate on AR(p) models because any MA(q) process can be represented as an AR(∞) process.
The explicit representation of an AR(p) model entails the determination of the p weights φ_{p,i}, which is frequently accomplished with the Durbin-Levinson algorithm. Brockwell and Davis show an elegant derivation for this method that uses a recursive scheme to circumvent the need for a large matrix inversion [Brockwell, Davis 1986].
The algorithm requires a stationary process with a constant arithmetic mean and an autocovariance function γ such that γ(0) > 0 and γ(h) → 0 as h → ∞. For the algorithm, the mean squared error of the prediction, v_n, is defined as

v_n = E\left(X_{n+1} - \hat{X}_{n+1}\right)^2    Equation 2.2.19
Using the standard definition for the estimated autocovariance function

\hat{\gamma}(h) = \frac{1}{N-h} \sum_{i=1}^{N-h} \left(X_i - \bar{X}\right)\left(X_{i+h} - \bar{X}\right)    Equation 2.2.20

it follows that

v_0 = \hat{\gamma}(0)    Equation 2.2.21
The coefficients of the AR models are calculated using the following equation:

\phi_{n,n} = \frac{\hat{\gamma}(n) - \sum_{i=1}^{n-1} \phi_{n-1,i}\,\hat{\gamma}(n-i)}{v_{n-1}}    Equation 2.2.22

The AR(1) model follows immediately:

\phi_{1,1} = \frac{\hat{\gamma}(1)}{\hat{\gamma}(0)}    Equation 2.2.23

Given this recursive anchor and the following equations, it is possible to compute \phi_{n,m} for n = 2, 3, ... and m = 1, ..., n, as well as v_n, providing the coefficients for the AR(n), n > 1, models.

\begin{pmatrix} \phi_{n,1} \\ \vdots \\ \phi_{n,n-1} \end{pmatrix} = \begin{pmatrix} \phi_{n-1,1} \\ \vdots \\ \phi_{n-1,n-1} \end{pmatrix} - \phi_{n,n} \begin{pmatrix} \phi_{n-1,n-1} \\ \vdots \\ \phi_{n-1,1} \end{pmatrix}    Equation 2.2.24

v_n = v_{n-1}\left(1 - \phi_{n,n}^2\right)    Equation 2.2.25
AR(n) models compute a weighted mean of past values.
Though very useful and easy to compute, the method does not
perform well when the underlying system contains non-linear
dependencies.
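As an illustration of the recursion, the following sketch computes the AR(p) coefficients from the estimated autocovariances of Equations 2.2.20 to 2.2.25. It assumes a de-meaned stationary series and is a didactic implementation, not the code used in NELION.

def autocovariance(x, h):
    """Estimated autocovariance gamma_hat(h) of a series x (Equation 2.2.20)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[i] - mean) * (x[i + h] - mean) for i in range(n - h)) / (n - h)

def durbin_levinson(x, p):
    """Return the AR(p) coefficients phi_{p,1}..phi_{p,p} via the recursion."""
    gamma = [autocovariance(x, h) for h in range(p + 1)]
    phi = [gamma[1] / gamma[0]]                 # AR(1) anchor (Equation 2.2.23)
    v = gamma[0] * (1 - phi[0] ** 2)            # prediction error (Equation 2.2.25)
    for n in range(2, p + 1):
        # Reflection coefficient phi_{n,n} (Equation 2.2.22).
        phi_nn = (gamma[n] - sum(phi[j] * gamma[n - 1 - j]
                                 for j in range(n - 1))) / v
        # Update the remaining coefficients (Equation 2.2.24).
        phi = [phi[j] - phi_nn * phi[n - 2 - j] for j in range(n - 1)] + [phi_nn]
        v *= 1 - phi_nn ** 2                    # Equation 2.2.25
    return phi

# One-step forecast for a de-meaned series x:
#   prediction = sum(c * x[-1 - i] for i, c in enumerate(durbin_levinson(x, p)))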
The results from Hsieh indicate that most financial data are
non-linear in nature resulting in a natural disadvantage for
these models [Hsieh 1990]. Additionally, due to their simple
nature they are used widely as a baseline for comparison but
consequently offer no competitive advantage.
In an effort to benefit from the extensive research done for AR
models and the numerous well-documented algorithms that
exist, it is possible to enhance the basic algorithm in various
ways. By dividing the input space into two or more regions, defined by the Euclidean distance to their respective centers in n-dimensional space, it is possible to generate several local linear models. Each approximates the function linearly in its respective region. The prediction is the weighted sum of the individual models, based on the distance of the input to each region's center.
This generalization requires sufficient data to generate several
local models and the quality of the results depends on the
choice of the centers used to define the separate regions, but
generally helps to reduce the model error.
Mălăroiu, Kiviluoto and Oja proposed a different enhancement to generic time series prediction [Mălăroiu et al 1999]. After
preprocessing the target time series to zero mean and unit
variance, they separate it into different independent spectral
components using the FastICA package in MATLAB. Each
component is then filtered to reduce the effects from supposed
noise, by applying a high-pass and/or low-pass filter. The
individual components are then modeled using the AR method
and combined by calculating the weighted sum of each
prediction.
Hsieh developed a similar model, which decomposes
exchange rate futures contracts into a (linear) predictable and
a (non-linear) unpredictable component [Hsieh 1993]. He
focused on US dollar contracts traded on the Chicago
Mercantile Exchange for the British Pound, German Mark,
Japanese Yen and Swiss Franc. Though this approach does
not accurately calculate the expected prices, it is able to
isolate the autoregressive volatility of the data allowing him to
forecast the prediction risk.
In the Santa Fe Time Series prediction competition organized
by Weigend and Gershenfeld, Sauer concentrated on the
prediction of data set A, the intensity of a detuned NH3-FIR
Laser [Weigend, Gershenfeld 1993, Sauer 1993]. By using
delay coordinate embedding, he successfully built local-linear
models to predict the output data. In the same competition,
Lewis, Ray and Stevens modeled the time series A, B and C,
which additionally included physiological and foreign currency
exchange data. They used multivariate adaptive regression
splines in another example where a linear concept is
expanded in scope so that it can be applied to the non-linear
domain.
2.2.3 K-Nearest Neighbors Models
The k-nearest-neighbors models (KNN) search the training
data for historic points in n-dimensional space that correspond
to the current configuration. The assumption is that similar
configurations in the past are followed by values, which can be
interpreted as predictions for the future. Usually the
predictions are the weighted sum of k of these nearest
neighbors based on distance, hence the name.
In its basic form, the algorithm defines a data window ω_t = (x_t, ..., x_{t-(n-1)}), where x_t is the value of the time series at time t. As a next step, it calculates the distance d_{t-m} between ω_t and ω_{t-m} for all m < t-n. A common metric used is the Euclidean distance, shown in the following equation, though numerous alternatives are possible.

d_{t-m} = \sqrt{\sum_{i=0}^{n-1} \left(x_{t-i} - x_{t-m-i}\right)^2}    Equation 2.2.26
The resulting scalars are sorted in increasing order so that the k closest data windows ω_{t_1}, ..., ω_{t_k} can be identified. Now, simply taking the values following these historic neighbors, x_{t_1+1}, ..., x_{t_k+1}, the system has identified k predictions for the future value of the time series.
Commonly, these k nearest neighbors are averaged, using the distances d_{t_1}, ..., d_{t_k} as a weighting mechanism.
\hat{x}_{t+1} = \frac{\sum_{i=1}^{k} d_{t_i}^{-1}\, x_{t_i+1}}{\sum_{j=1}^{k} d_{t_j}^{-1}}    Equation 2.2.27
Frequently, the algorithm is extended by including values from related time series y_t or z_t, representing the trading volume and general economic data like inflation, interest and unemployment rates, as well as the price history of major indexes or related stocks. This tends to increase the dimensionality of the data window, but does not affect the complexity of the remaining algorithm.

ω_t = (x_t, ..., x_{t-n}, y_t, ..., y_{t-m}, z_t, ..., z_{t-l})    Equation 2.2.28
This is a general model and performs well both for linear and
non-linear time series. The results of the model can be
reconstructed because the system can list the points it used
for input, making the predictions very transparent.
The quality of the model deteriorates when the time series
enters uncharted territory or domain space for which no
previous examples exist. This can be avoided somewhat by
normalizing the input, though this usually increases the error
for the known input space.
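The basic procedure translates directly into code. The sketch below follows Equations 2.2.26 and 2.2.27, using a plain Euclidean metric and inverse-distance weighting; it is a minimal illustration rather than the NELION implementation.

def knn_predict(series, window, k):
    """Predict the next value of a time series with k nearest neighbors."""
    current = series[-window:]                    # the current data window
    candidates = []
    for t in range(window, len(series)):
        past = series[t - window:t]               # a historic data window
        dist = sum((a - b) ** 2 for a, b in zip(current, past)) ** 0.5
        candidates.append((dist, series[t]))      # the value that followed it
    candidates.sort()                             # closest windows first
    neighbors = candidates[:k]
    # Inverse-distance weighted average of the k follower values (Equation 2.2.27).
    weights = [1.0 / (d + 1e-9) for d, _ in neighbors]
    values = [v for _, v in neighbors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical use with a list of daily closing prices:
# prediction = knn_predict(closing_prices, window=10, k=5)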
2.2.4 Markov Models
Markov Models (MMs) assume that it is only possible to obtain certain observations of the system that describe its state, possibly incompletely. In the space-time continuum, the system jumps from one state to the next. The idea is that if one observes the system long enough, one can note the progression from any state σ_x to the states σ_{y_1}, σ_{y_2}, ..., σ_{y_n}. With this information at hand, it is possible to calculate the probability that the state following σ_x will be σ_{y_i}, for every i. If the system should end up in state σ_x in the future, it is possible to determine for each state the likelihood that it will be the next one.
When we apply this method to time series analysis, we first have to define our data window of width Δ. Given a training set of N data points, we are then able to define N - Δ = P training tuples. Each of these tuples represents a move in Δ-space from state σ_{t-1} = (s_{t-Δ}, ..., s_{t-1}) to state σ_t = (s_{t-Δ+1}, ..., s_t). This can also be interpreted as the functional f(σ_{t-1}) → σ_t, where 1 ≤ t ≤ P. In the next step, we divide the Δ-space into k ≤ P clusters. This is done by dividing the bounded Δ-dimensional hypercube into k = n^Δ smaller hypercubes, or by randomly selecting k points and attaching each tuple to its nearest representative. The second method would necessitate the definition of a metric to identify the nearest point, with all common variants possible.
Having now categorized the states, one would have to
determine the probabilities of a transition between any two
states by counting the total number of transitions from one
state to another. Once the probabilities are calculated, the
model is fully specified and can be used to predict the next
state if a new point is presented. The prediction follows from
the most probable state. Calculating the weighted mean of all
follower states can extend this model. Many schemes are
conceivable, with linear and exponential weights used most
commonly.
The state of a financial time series is frequently identified both
by the price and volume of the recent trading days. This
algorithm is also frequently enhanced by including related time
series as described in the KNN models.
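A toy version of this procedure is shown below: daily returns are discretized into three coarse states, transition counts are normalized into probabilities, and the most probable successor of the current state serves as the prediction. The three-state discretization is an arbitrary choice for the example, not the clustering scheme described above.

from collections import Counter, defaultdict

def discretize(returns, threshold=0.01):
    """Map each one-day return to a state: 'up', 'down' or 'flat'."""
    return ['up' if r > threshold else 'down' if r < -threshold else 'flat'
            for r in returns]

def transition_probabilities(states):
    """Count transitions between consecutive states and normalize them."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {s: {t: c / sum(nxt.values()) for t, c in nxt.items()}
            for s, nxt in counts.items()}

def predict_next_state(model, current_state):
    """Return the most probable successor of the current state."""
    successors = model.get(current_state, {})
    return max(successors, key=successors.get) if successors else None

# Example with synthetic daily returns.
returns = [0.02, -0.015, 0.005, 0.03, 0.012, -0.02, 0.0, 0.018, -0.011, 0.02]
states = discretize(returns)
model = transition_probabilities(states)
print(predict_next_state(model, states[-1]))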
Fraser and Dimitriadis used MMs in speech research and
recognized their possibilities for time series prediction when
they became aware of the Santa Fe competition [Fraser,
Dimitriadis 1993]. Their contribution focused on data set D, a
numerically generated series representing a chaotic process.
The authors describe their use of Baum's EM algorithm, which
iteratively adjusts the model parameters to maximize the
likelihood of its observations and apply the resulting model to
forecast the data. For every point, they are able to map the
probability density making it possible to attach a confidence to
the predicted value. The latter is especially appealing for
stock trading predictions.
Poritz and Rabiner also came from the speech recognition
field and used Markov Models to forecast the likelihood of a
particular word given a certain beginning of a phrase [Poritz
1998] [Rabiner 1989]. In their work, the states described
recent words in a phrase. Given numerous historic examples,
the system is able to question every interpreted word and
replace it with one that makes more sense, or
mathematically expressed, where the likelihood of its
placement in a particular position is higher.
2.2.5 Artificial Neural Network Models
Artificial neural networks (ANN) are based on research of the
human brain. Here neurons receive input from n different
electrical sources, weight the inputs through an electrical
resistance and sum the results. The output of the neuron is a
transformation of this sum and is fed into the next neuron. An
ANN simulates this operation within a computer program.
The human brain consists of approximately 10^10 neurons, all of which are active simultaneously and can be interconnected. The computer equivalent consists of only a few tens or hundreds of the electronic "neurons", called units, and normally only one is active at any one time.

[Diagram of a feed-forward artificial neural network: an input layer, a hidden layer and a single output unit producing the prediction]

Figure 2.2.8: Artificial Neural Network
In the figure above, one can detect a strict hierarchy, with
clearly identifiable layers. Each layer is fully connected to the
next. Though recurrent connections are conceivable and
sometimes used, we will restrict ourselves to feed-forward
ANNs with this structure. The bottom layer in the diagram
represents the input layer. The activation of this layer is
determined by the input into the system, for example I_{t-1}, I_{t-2}, ..., I_{t-n}. The input data is then propagated to the next (hidden) layer, as represented by the connecting lines in Figure 2.2.8.
Each of the four units in the middle layer sums its weighted inputs and scales the output using the transfer function σ(h):

o_j = \sigma\left(\sum_{i=1}^{6} w_{j,i} I_i\right), \quad j = 1, 2, 3, 4    Equation 2.2.29
The I_i in this case are the values from the input units, and the weights w_{j,i} represent the axon or weight connection between input i and hidden unit j. It is common to use a sigmoidal function for σ(h), like
\sigma(h) = \frac{1}{1 + e^{-2h}}    Equation 2.2.30

or

\sigma(h) = \tanh(h)    Equation 2.2.31
with the alternatives being linear or sinusoidal functions for an input between -∞ and ∞. Similarly, the output unit or prediction sums and transforms the activation values from the hidden layer.

O_1 = \sum_{j=1}^{4} W_{1,j}\, o_j    Equation 2.2.32
The example in the figure contains only one hidden layer and
an output layer with one unit, though more output units and
more layers are possible.
The output unit contains the desired prediction and depends
on all the weights in the net. The ANN has to learn the data
to search its configuration space for a good set of weights.
This is usually done using the back propagation algorithm. In
it, we define the error of the prediction of a particular pattern
as a function of the weight vector to the output unit as follows,
where O_i is the output of the ANN and T_i represents the target value.
E(\vec{w}) = \frac{1}{2} \sum_{i=1}^{i_{\max}} \left(O_i - T_i\right)^2    Equation 2.2.33
By differentiating the error expression in Equation 2.2.33 with respect to each weight w_{ij}, we can determine an expression that will reduce the error using the gradient descent method with a step size η.
\Delta W_{ij} = -\eta \frac{\partial E}{\partial W_{ij}} = \eta\, (T_i - O_i)\, \sigma'\!\left(\sum_{k=1}^{k_{\max}} W_{ik} O_k\right) O_j    Equation 2.2.34
The error is then propagated to the next layer of units where
the same algorithm is then used. This update routine can be
applied after the presentation of each pattern (on-line learning)
or collected for all the patterns in a training set (batch mode
learning). After running through all training patterns, the
model is verified on the test set. Training continues as long as
the test error is reduced with each iteration. This ensures that
the ANN does not start modeling stochastic noise in the
training set, a process called "overfitting." It is common for the
error to vary dramatically in the first iterations, so that many
algorithms permit the definition of a minimum number of
iterations to be tested.
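The forward pass and the weight update of Equations 2.2.29 to 2.2.34 fit into a few lines of code. The sketch below performs one on-line back-propagation step for a network with a single hidden layer and a single linear output unit, using the standard logistic transfer function; it is a didactic illustration, not the network code used in NELION.

import math, random

def train_step(x, target, w_hidden, w_out, eta=0.1):
    """One on-line back propagation step for a 1-hidden-layer network.

    x: list of inputs; w_hidden[j][i]: weight from input i to hidden unit j;
    w_out[j]: weight from hidden unit j to the single output unit."""
    sigma = lambda h: 1.0 / (1.0 + math.exp(-h))   # logistic transfer function
    # Forward pass (cf. Equations 2.2.29 and 2.2.32).
    hidden = [sigma(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sum(w * o for w, o in zip(w_out, hidden))
    # Backward pass: gradient of the squared error (cf. Equations 2.2.33, 2.2.34).
    delta_out = output - target
    for j, o in enumerate(hidden):
        delta_hidden = delta_out * w_out[j] * o * (1.0 - o)
        w_out[j] -= eta * delta_out * o
        for i, xi in enumerate(x):
            w_hidden[j][i] -= eta * delta_hidden * xi
    return output

# Tiny example: 6 inputs, 4 hidden units, random initial weights.
random.seed(0)
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(6)] for _ in range(4)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(4)]
for _ in range(100):
    train_step([0.1, 0.2, 0.3, 0.2, 0.1, 0.0], 0.25, w_hidden, w_out)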
From this description, it is obvious that this method is
computationally considerably more expensive than the
previous algorithms. Also, there is no easy analytical method
to determine the number of units necessary in the hidden
layers, or to make other topological decisions so that it is
necessary to find the optimal configuration using trial-and-
error. The high number of degrees of freedom poses some
difficulties as well, since it reduces the stability during the
convergence process. However, the gain, once a good
configuration is found, is a truly non-linear function predictor.
This difficulty associated with models built using artificial
neural networks is exemplified by an experiment conducted by
White [White 1989]. He attempted to predict the quarterly IBM
stock prices with a static network topology of five input and
five hidden units. The single output unit was trained with data
points from the second quarter of 1974 until the first quarter of
1978. The resulting artificial neural network was used to
predict prices from the second quarter of 1972 until the first
quarter of 1974 as well as the second quarter of 1978 until the
first quarter of 1980. Though the network was able to model
the training set well, the project had no control over the extent
of the training on the network, so that its ability to generalize
was unsatisfactory. For the two test sets, the correlation
between the predicted and the real values was 0.0751 and
0.0699, respectively. This example shows that this powerful tool cannot be applied blindly, without tailoring the network to the specifics of the time series.
With increased complexity and consideration however, ANNs
offer good opportunities. Rehkugler and Poddig show the
potential of ANNs in an experiment, which included only three
input values and was designed to predict the movement of the
German stock index DAX [Rehkugler, Poddig 1990]. Instead
of using the prices of the index itself, the ANN was based on
the nominal interest rate, a business confidence indicator and
free liquidity, defined as follows:
$$L = \frac{M1}{GNP}$$   Equation 2.2.35
In this equation, the money supply M1 is divided by the gross
national product, GNP. In order to eliminate seasonal swings,
the input to the ANN included only the change in value
compared to the previous year. The system was trained with
this delta data from the first quarter 1965 with 6 years worth of
information. The topology included different numbers of
hidden layers, each with a different number of units. The
network had five output units. One was trained to calculate a
simple rise/fall predictor with the values 1 or 0 respectively.
The remaining four output units defined index changes of
more than 10%, between 0% and 10%, between -10% and
0%, or less than -10%. During every training cycle, only
one of these four units was trained with the value 1. The
others were set to 0.
After training, the network was tested with 68 values. For the
rise/fall indicator, values of 0.5 or greater were interpreted as a
rise, while lower values were considered a falling prediction.
For the four categorization output units, the algorithm
assumed that the output with the highest value was the
winner and interpreted it as the prediction.
As a simple trading strategy, the program simulated a
purchase of the index as if it were a stock if the prediction was
positive and it did not already own it. Similarly, it sold the
index, if it owned it and the network predicted a decrease.
The basis for this decision was the first rise/fall indicator. The
table below summarizes the results.
Topology Rise/Fall Output Categorization Return
3-5 49 correct, 19 false 43 correct, 25 false 177.28%
3-5-5 49 correct, 19 false 47 correct, 21 false 160.23%
3-9-7-5 44 correct, 24 false 44 correct, 24 false 142.29%
Table 2.2.1: First Experiment [Rehkugler, Poddig 1990]
Interestingly, the return as well as the correct prediction
frequency responded negatively to increased complexity in the
network. Nevertheless, the average annual returns well over
100% suggest an opportunity, although transaction costs are
not considered.
Since this artificial neural network effectively produced two
different outputs, the two scientists simplified the output in a
second experiment to the simple rise/fall indicator. This
approach ensured that the secondary categorization goal did
not interfere with the desired prediction. In an effort to
improve the forecasts, the output in this second experiment
was interpreted more stringently: Only outputs over 0.9 and
under 0.1 were considered rise or fall predictors respectively.
All other values specified an undefined state and resulted in a
hold strategy for the hypothetical stock. The results are
shown in the table below:
Topology Rise/Fall Output Return
3-1 32 correct, 13 false 225.16%
3-2-1 42 correct, 16 false 194.39%
3-3-1 43 correct, 14 false 237.65%
3-4-1 44 correct, 19 false 194.36%
Table 2.2.2: Second Experiment [Rehkugler, Poddig 1990]
The results indicate that the specialized approach increased
returns considerably, and that the number of undecided states
primarily helped reduce the number of false predictions. It is
also noteworthy that the increase in network complexity
reduced the undecided states and improved the number of
correct predictions. Though the return did not immediately
benefit from the increased quality of the predictions, one
should question whether a simple portfolio management
strategy is an adequate measure for this model.
In an attempt to improve the generalization ability of artificial
neural networks, Utans and Moody extended the methodology
in a study that was designed to predict the Standard & Poor
(S&P) rating for assorted companies [Utans, Moody 1991].
These ratings define the risk associated with an investment in
a particular company or market. S&P categorizes the
companies into 18 discrete steps. Since these ratings are
updated infrequently, this ANN helped interpolate the rating
between official releases.
The ANN used ten financial ratios as input data, a hidden layer
and a single output unit, which categorized the risk rating for
the company between 2 and 19, in the order of decreasing
risk. The hidden units used sigmoidal transfer functions. In
contrast, the output unit used a piece-wise linear transfer
function in order to reduce the complexity and thereby
increase the training speed.
$$f(x) = \begin{cases} 0 & \text{for } x \le -16 \\ \dfrac{x}{32} + \dfrac{1}{2} & \text{for } -16 < x \le 16 \\ 1 & \text{for } x > 16 \end{cases}$$   Equation 2.2.36
The authors tested this configuration with different numbers of
hidden units, in order to determine the optimal topology.
[Figure: training and test error plotted against the number of hidden units (0 to 10)]
Figure 2.2.9: Utans, Moody Experimental Training and Test Error
The results show a pronounced change of trend in the training
error. This point coincides with the minimum test error,
indicating that the optimal configuration should include three
hidden units.
As a next step, Utans and Moody defined the importance of
each input i by measuring the sensitivity of the output to that
input.
$$S_i = \frac{1}{N} \sum_{p=1}^{N} \frac{\partial O}{\partial X_{pi}}, \qquad \text{where } N = \text{number of input tuples}$$   Equation 2.2.37
By sorting the sensitivity in decreasing order, the two authors
identified the test error as more of the input data was
removed.
[Figure: test error plotted against the number of remaining inputs (10 down to 6)]
Figure 2.2.10: Utans, Moody Test Error with Removed Input Data
This approach shows that the removal of two inputs actually
improved the test error of the resulting ANN.
In a separate effort to improve the performance of the model,
the authors applied the Optimal Brain Damage (OBD) method,
as presented by Le Cun [Le Cun et al 1990]. This approach
defined the influence of each weight in the network through
the saliency function, defined in the equation below.
$$s_i = \frac{1}{2} \frac{\partial^2 E(w_i)}{\partial w_i^2}\, w_i^2$$   Equation 2.2.38
By resetting the weights with the lowest saliency, their effect
on the output is effectively removed. The resulting network is
retrained so that it can adjust to this brain damage. The
figure below shows the effect of this adjustment to the test
error.
[Figure: test error plotted against the number of reset weights (0 to 18)]
Figure 2.2.11: Utans, Moody Experiment with Optimal Brain Damage
These two comparisons both showed a promising
improvement but were strangely not combined into a single
model.
The results of these two ANNs were categorized by the
deviation from the actual S&P classification. The table below
shows the percentages for each error group.
Deviation 8 Input Units, No OBD 10 Input Units, with OBD
0 31.1 % 29.1 %
1 39.3 % 39.3 %
2 15.8 % 17.9 %
>2 13.8 % 13.7 %
Table 2.2.3: Error with Reduced Input and OBD Methods
Approximately 70 % of the test set were categorized correctly
or were only off by one, resulting in a reasonably accurate
predictor for the S&P risk ratings.
In a different attempt to optimize the network topology and
improve the ability to generalize, Fahlman and LeBiere
introduced cascade-correlation networks [Fahlman, LeBiere
1990]. In contrast to the standard back-propagation algorithm,
in these networks the size of the hidden layer is not static.
Initially, a two-layer network is trained until the error for each
vector pair in the training set falls below a defined threshold.
When this occurs, a neuron is added to the hidden layer. The
synaptic weights between the input layer neurons and the new
neuron are adjusted to maximize the magnitude of the
correlation between its activation and the output layer error.
Deppisch, Bauer and Geisel generalize this idea in their paper
on hierarchical networks [Deppisch, et al 1991]. The concept
envisions training an ANN until it is no longer able to model
the complexities of the input data. In a second step, a new ANN is
trained to predict the residues of the original model. Finally,
the two ANNs are used together as a predictor of the time
series.
[Figure: two networks combined additively]
Figure 2.2.12: Hierarchical Networks
The diagram above shows the combination of two ANNs but
this procedure could be extended indefinitely to continually
reduce the residual model. The final output is the sum of all
sub-models.
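The procedure can be sketched generically: train a first model, fit a second model to the residuals of the first, and use the sum of both as the predictor. In the following sketch, fit_model is a placeholder for any trainable regressor (for example the networks discussed above); the names are illustrative.

def fit_hierarchical(X, y, fit_model):
    # Train a base model, then a second model on the residuals of the first,
    # and return a predictor that sums the two outputs.
    model_1 = fit_model(X, y)                      # e.g. a small 1-2-1 network
    residuals = y - model_1.predict(X)
    model_2 = fit_model(X, residuals)              # e.g. a larger 1-6-1 network
    return lambda X_new: model_1.predict(X_new) + model_2.predict(X_new)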
The authors tested this approach on the chaotic, coupled
Rössler differential equations defined in the equation below.
$$\frac{dx}{dt} = -(y + z), \qquad \frac{dy}{dt} = x + 0.2\,y, \qquad \frac{dz}{dt} = 0.2 + z\,(x - 5.7)$$   Equation 2.2.39
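For reference, the benchmark series can be generated by integrating this system numerically; the following sketch uses simple Euler steps with an arbitrary step size and initial state.

def roessler_series(n, dt=0.01, x=0.1, y=0.0, z=0.0):
    # Integrate the Roessler system with Euler steps and return the x component.
    xs = []
    for _ in range(n):
        dx = -(y + z)
        dy = x + 0.2 * y
        dz = 0.2 + z * (x - 5.7)
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        xs.append(x)
    return xs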
In order to approximate the optimal learning per network, the
first ANN had a 1-2-1 topology, using one input x value and two
hidden units to predict the next x value. Network 1 was trained
for 10³ epochs, until it produced an average test error of 10⁻², and
then supplemented with a second ANN with a 1-6-1 topology
until the test error reached a minimum. In Network 2, the
original 1-2-1 ANN was trained until it produced an average
output error of 10⁻³ and was then supplemented with the same
1-6-1 ANN. As a comparison, the following diagram includes
the test error of the 1-2-1 network without a secondary ANN and the
error of an ANN where all eight hidden units are trained from
the beginning.
[Figure: training error (log scale) against training epochs for the 1-2-1 network, the 1-8-1 network, and the two hierarchical 1-2-1 + 1-6-1 networks]
Figure 2.2.13: Training Error for Hierarchical Networks
Not surprisingly, the 1-2-1 network learned the data more quickly
than the more complex 1-8-1 network. After several thousand
epochs the latter caught up and eventually produced a slightly lower
training error. The first hierarchical network initially improved
the simpler model only slightly. It only showed a significant
error reduction after 10⁴ epochs, reducing the training error by
almost two orders of magnitude. The second network
immediately caught up to this level and was apparently able to
continue improving its model in the following epochs. After
10⁶ epochs, the training error leveled off for all models.
A possibly more significant measure of the quality of the model
was the prediction quality measured by the test error. The
diagram below shows a comparison of the most successful of
the hierarchical models in comparison to a linear predictor.
[Figure: test error (log scale) for prediction horizons of 2 to 16 steps, comparing the 1-2-1 + 1-6-1 network with a linear predictor]
Figure 2.2.14: Test Error for Different Prediction Horizons
Though its test error is significantly larger than its training error,
the hierarchical predictor is an order of magnitude better than the
linear model. Unsurprisingly, both errors increase with a longer
prediction horizon, but the ANN remains about ten times better
throughout the domain shown in Figure 2.2.14.
The experiment shows that hierarchical networks are able to
offer advantages in at least some cases. The difference in
training and test errors points to overfitting, an issue that would
need to be addressed, if this approach were to be used in a
system.
Artificial Neural Networks also received a big push in
popularity from two entries by Wan (data set A) and Mozer
(data set C) to the Santa Fe Time Series Prediction
Competition [Weigend, Gershenfeld 1993, Wan 1993, Mozer
1993]. The latter also showed that these models could be
applied successfully in the financial domain, an aspect of
particular relevance to this paper.
Networks based on radial basis functions have also been
applied to financial time series prediction. As early as 1986,
Hutchinson showed that these non-linear models performed
better than linear and univariate models in a simulated stock
trading comparison [Hutchinson 1986]. Building on the
theoretical foundation, he developed a system that was also
applicable to stock option pricing, showing the versatility of the
approach.
Parkinson analyzed preprocessing techniques and their
application to financial time series prediction using neural
networks [Parkinson 1999]. He proposes a method whereby it
should be possible to compare scaling, logarithmic transforms,
smoothing, differences and ratios as well as normalization.
However, the sheer number of combinations of these
preprocessing techniques makes it difficult to identify which
one of them optimally suits what kinds of data.
Nevertheless, artificial neural networks have been applied
extensively to predict financial time series. Zimmermann
enthusiastically supports their usage, because they combine
the non-linearity of multi-variate calculus with the number of
variables typically used in linear algebra [Zimmermann 1994].
Working with him, Braun developed several applications for
the prediction of the DAX, which included manual optimization
techniques like weight and input pruning as well as the
merging of hidden units and layers [Braun 1994].
2.2.6 New Approaches to Financial Market Analysis
In recent years, several new agent based approaches to the
analysis and prediction of financial data have gained
popularity. In 1997, Arthur, Holland, LeBaron and Tayler
proposed an artificial stock market (ASM) where N simulated
players, called agents, each had the option to buy, sell or keep
their current position [Arthur et al 1997]. The experiment was
permitted to run for 250,000 periods during which each agent
was permitted to execute orders once. Each agent forecast
the price of each commodity using auto-regressive models,
which it adapted using genetic algorithms. The trading rules
were based on a parameterized strategy, which was permitted
to evolve using a genetic algorithm over the course of the
experiment. This project spawned several relevant research
initiatives.
One was the experiment by Joshi and Bedau, who assume
that investors continually explore and develop expectation
models, buy and sell assets based on the predictions of the
models that perform best and confirm or discard these models
based on their performance over time [Joshi, Bedau 1998].
They used the ASM to explore the volatility of prices and
average wealth earned by the investors as a function of the
frequency of strategy adjustments.
In their experiments, they simulated a market with a fixed
number N=25 of agents. Time was discrete and in each
interval, the agents had to decide whether to invest their
portfolio in a risky stock or a risk free asset, analogous to a
Treasury Bond. There was an unlimited supply of the risk free
assets and it paid a constant interest rate of r=10%. The risky
stock, issued in S shares, paid a stochastic dividend that
varied over time governed by a process that was unknown to
the agents.
The agents applied forecasting rules to their knowledge of the
stock's price and dividend history, performed a risk aversion
calculation and decided how to invest their money at
each time period. The price of the stock rose if the demand
exceeded the supply and fell if the supply exceeded the
demand. Each agent could submit a bid to buy or an offer to
sell fractions of shares at the previous period's price.
Their results were classified into four different classes of
behavior depending on the frequency of rule updates through
the genetic algorithm.
Genetic Interval        Volatility of Prices    Average Wealth Earned    Complexity of Forecasting Rules
Never                   Low                     Low                      Low
Every Iteration         Low                     High                     Very Low
10² < interval < 10³    High                    Low                      Very High
10³ < interval < 10⁴    Moderate                High                     High
Table 2.2.4: ASM Investor Types as a Function of GA Interval
Not surprisingly, the volatility of the simulated market was low
if the investors never updated their forecasting model, since
each remained with the set of rules it was originally
endowed with. Interestingly, the volatility of the price structure
remained low when the agents updated their rules at every
interval and increased dramatically as the genetic interval
grew to somewhere between 10² and 10³. With this
configuration, the complexity of the forecasting rules was also
very high. At the same time, the average wealth earned by
the investors was high if they adjusted their strategy on every
iteration, dropped to a low as soon as updates became less
frequent, but peaked if the investors modified their strategy
every 10³ to 10⁴ intervals.
This work was used to explain the rapid increase in the volatility of
the financial markets in recent years, by implying that the
average trading strategies of the investment community have
changed. Since it is assumed that professional investors,
who represent the traditional players, do not adjust their
strategy very rapidly, the authors suggest that this increase is
due to private investors, who tend to trade with a higher
frequency and have presumably grown their share of the
trading volume with the increasing ease of online trading via
the Internet.
Also based on the ASM, Kurumatani proposes a virtual stock
market, Vsmart, where researchers worldwide can inquire
about stock prices, and can execute purchases and sales, just
like in its real world counterpart [Kurumatani et al 2000].
Unlike actual markets, no actual money will transfer ownership
and all trades, trading histories and results are open to all
participants. This results in complete transparency and
hopefully in some insights into the dynamics of stock markets,
multi-agent research and profitable trading strategies as well
as human decision-making.
To this end, the authors provide a Simple Virtual Market
Protocol, SVMP, which allows the academic community to
automate the transaction chain via the Internet. This process
allows autonomous agents to retrieve current and historic
market data, which can be processed using any kind of
algorithm. Using the resulting predictions, the agent can send
order inquiries. The central VSmart Server matches offers
and bids, thereby determining the resulting price and providing
an immediate response whether the order resulted in a
transaction.
[Figure: VSmart clients send order inquiries to and receive order result status from the central VSmart Server]
Figure 2.2.15: The VSmart Virtual Stock Market
Since all transactions are tracked and available for all
participants, this open approach can bring together
heterogeneous agent types. A human interface via the
Internet even allows participants who have not fully
automated the transaction chain to interact with the server.
Due to its completely transparent design, this project promises
to provide interesting results when it goes live.
In 1990, Granger hypothesized that trading volume and price
movements were related [Granger 1990]. Karpoff found that
stock prices and trading volumes are related in bull markets,
supporting this Granger causality [Karpoff 1987]. Taking this
work as a basis, Chen, Yeh and Liao extended the algorithms
used in the ASM to analyze the old saying on Wall Street that
it takes volume to make prices move [Chen et al, 2000]. They
developed a similar artificial agent based stock market and
tested for Granger causality between these two components
and showed that the relationship exists in all examples tested.
This finding also emphasized the validity of the agent based
simulations and reinforced this new methodology.
Ingber and Mondescu developed an interesting alternative
based on an Adaptive Simulated Annealing (ASA) approach to
generate buy and sell signals for S&P futures contracts
[Ingber, Mondescu 2000]. The algorithm fits short-time
probability distributions to observed data using a maximum
likelihood technique on the Lagrangian. It was developed to fit
observed data to the following function.
$$dF = f^{F}\, dt + \sigma F^{x}\, dz(t)$$   Equation 2.2.40
Using this equation, it defined the market momentum as
follows:
$$\Pi^{F} = \frac{\dfrac{dF}{dt} - f^{F}}{\sigma^{2} F^{2x}}$$   Equation 2.2.41
In these equations, f^F represents the drift, σ is the standard
deviation defining the volatility, F(t) is the S&P futures price and
dz the standard Gaussian noise with zero mean and unit
standard deviation. The parameter x was used to adjust the
system to the current market conditions. The system sampled
the data with different time resolutions Δt and averaged the actual
tick data that fell into a particular sample interval.
The ASA algorithm calculated optimal parameters for the drift
f^F and the exponent x in the equations above as soon as
sufficient data was available in the trading day, and periodically
thereafter. By then defining the null momentum
$$\Pi^{F}_{0} = \frac{-f^{F}}{\sigma^{2} F^{2x}}$$   Equation 2.2.42
and the momentum uncertainty band
$$\Delta\Pi^{F} = \frac{1}{\sigma F^{x} \sqrt{dt}}$$   Equation 2.2.43
the system was able to execute a simple trading rule for long
(buy) and short (sell) signals.
If Π^F > M Π^F_0 then signal = buy
If Π^F < -M Π^F_0 then signal = sell
The threshold parameter M was used to limit the number of
transactions by defining a momentum uncertainty band around
the null momentum value.
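Schematically, such a threshold rule can be written as follows; momentum and null_momentum correspond to Π^F and Π^F_0 as reconstructed above, and the exact comparison used by Ingber and Mondescu may differ in detail. The names are illustrative.

def trading_signal(momentum, null_momentum, M):
    # Trade only when the momentum leaves the band defined by the threshold M.
    if momentum > M * null_momentum:
        return "buy"
    if momentum < -M * null_momentum:
        return "sell"
    return "hold"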
The system was tested with different sampling intervals Δt,
data windows W and threshold parameters M. The volatility
parameter was continually updated. The results presented in
the paper included US$ 35 transaction costs for each buy and
sell combination and show the trading profit for two specific
days, June 20 and June 22, 1999. Depending on the
parameter selection for Δt, W and M, the system generated a
gain of up to US$ 2285 or a loss of up to US$ 1125. The paper does
not state how much money was available for investment in total.
Nevertheless, these results show how mathematical methods
usually used in physics can be successfully applied to the
financial trading domain.
No doubt, these are not the last new prediction strategies, as
scientists keep forging ahead with ever more potent
algorithms, which alone would allow them to remain ahead of
the pack, as Lequarré predicts. However, this race will
continue, as patterns in the price tend to disappear as agents
evolve profitable strategies to exploit them [Lequarré 1993].
3 Portfolio Management
If all stock predictions were perfect, portfolio management
would amount to the transfer of funds to the commodity that
promises the highest return in the specified investment
interval. Unfortunately, the future is not predictable to that
degree of accuracy. Consequently, portfolio management
requires a careful distribution of funds in various stocks so that
any one single incorrect prediction does not dramatically and
negatively affect the performance of the entire portfolio. On
the other hand, spreading the risk between numerous stocks
also implies that a dramatic upside gain by any one
investment only helps the portfolio proportionally.
Due to this dynamic, portfolio selection is dependent on the
risk adversity of the investor. Markowitz defined the
theoretical concept of the perfect portfolio, on which NELION
is based [Markowitz 1959]. After analyzing the concepts of
return and risk in this chapter, I present the parameterization
of the conflicting goals of high return with low risk in the
optimal portfolio theory.
3.1 Return
The return of a stock in a specified period is the percentage
increase of the value of the investment. It is defined as
follows:
$$R_t = \frac{P_t - P_{t-1} + D_t}{P_{t-1}}$$   Equation 3.1.1
In the equation above, P_t is the current price and P_{t-1} is the
price at the beginning of the interval, while D_t is the dividend
within that period. The dividend can never be negative. For
periods which do not coincide with the financial year of the
underlying stock, the dividend is calculated as a percentage of
the total accrued during the period. Following standard
investment convention, we assume that the interval is one
year. From Equation 3.1.1, it is clear that the return R_t is
positive if P_t is larger than P_{t-1}, that is, the price of the
commodity has increased.
The return of a portfolio is the weighted sum of the i individual
stock returns.
$$R_t(X_P) = \sum_i X_i\, R_{t,i}$$   Equation 3.1.2
In this equation, X_i denotes the fraction of the portfolio covered
by each investment and therefore
$$\sum_{i=1}^{n} X_i = 1 \quad \text{where } X_i \ge 0$$   Equation 3.1.3
This requirement does not impose any restrictions on the
allocation of funds, since it allows for money kept as cash.
The return would then simply be the bank interest rate, which
may be 0%, depending on the account type.
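Equations 3.1.1 to 3.1.3 translate directly into code; the following sketch uses illustrative names and assumes the weights have already been normalized.

def stock_return(p_now, p_prev, dividend=0.0):
    # R_t = (P_t - P_{t-1} + D_t) / P_{t-1}   (Equation 3.1.1)
    return (p_now - p_prev + dividend) / p_prev

def portfolio_return(weights, returns):
    # Weighted sum of the individual returns (Equation 3.1.2); the weights
    # must be non-negative and sum to one (Equation 3.1.3).
    assert all(w >= 0.0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    return sum(w * r for w, r in zip(weights, returns))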
3.2 Risk
Unlike the return of an investment, the definition of risk is more
subjective. Markowitz assumes a normal distribution of upside
and downside potential around the return of a commodity,
based on its volatility σ.
[Figure: normal distribution of returns centered on the expected return, with 1/6 downside risk below μ-σ, 4/6 within one σ, and 1/6 upside potential above μ+σ]
Figure 3.2.1: The One-in-Six Rule
In Figure 3.2.1, the expected return defines the peak of the
normal distribution, with μ-σ and μ+σ defining a 2/3 margin of
return. The downside potential is 1/6, hence the name of the
rule. It is clear that a small σ reduces the potential loss, thereby
minimizing the associated risk of the equity. We therefore
define the risk V(X) of an investment of value X with a variance
σ² as follows.
$$V(X) = \sigma^2 X^2$$   Equation 3.2.1
Unlike the return, the risk can not simply be calculated as the
weighted sum of the individual risks, since individual stocks
can be dependent on similar external factors. Both Daimler-
Chrysler and Ford are affected negatively by rising oil prices,
for example, so that a portfolio consisting of these two stocks
has a higher risk than one with Daimler-Chrysler and
Microsoft, for example, assuming that Microsoft and Ford have
the same volatility. Consequently, the systemic risk of a
portfolio includes the covariance σ_ij of the individual
investments i and j, as shown in Equation 3.2.2 below.
$$V_P(X) = \sum_{i=1}^{n} X_i^2 \sigma_i^2 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_i X_j \sigma_{ij}$$   Equation 3.2.2
The first term represents the inherent risk of every individual
stock, while the second term captures the risk associated with
the correlation between stocks. Given a portfolio where the
correlation σ_ij between all stocks i and j is zero, the risk V_P
reduces to the simple sum of the individual risks for each stock.
$$V_P(X) = \sum_{i=1}^{n} X_i^2 \sigma_i^2 \qquad \text{for } \sigma_{ij} = 0$$   Equation 3.2.3
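A sketch of Equation 3.2.2, with sigma holding the individual volatilities and cov[i][j] the covariances σ_ij; the names are illustrative.

def portfolio_risk(X, sigma, cov):
    # V_P = sum_i X_i^2 sigma_i^2 + 2 * sum_{i<j} X_i X_j sigma_ij   (Equation 3.2.2)
    n = len(X)
    risk = sum(X[i] ** 2 * sigma[i] ** 2 for i in range(n))
    risk += 2.0 * sum(X[i] * X[j] * cov[i][j]
                      for i in range(n - 1) for j in range(i + 1, n))
    return risk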
3.3 The Optimal Portfolio
The conditions of Equation 3.2.3 are virtually impossible to
achieve for any n>1, and additionally this approach ignores
the return of the portfolio. In order to find the optimal stock
distribution, we look at a sample portfolio of two stocks with
an equal expected return, where σ_ij = 0.2, σ_1 = 0.6 and σ_2 = 0.8.
We can plot the risk of the portfolio as a function of the
investment in the first stock.
[Figure: portfolio risk plotted against the fraction invested in stock 1 (0% to 100%)]
Figure 3.3.1: Risk of a Portfolio
If these two stocks were the only available choices, an
investor would ideally distribute 60% of the available capital in
stock 1 and the remaining 40% in stock 2. This example
shows that the risk of a portfolio can be minimized without
changing the expected return.
In order to calculate this optimal portfolio, we use Markowitz'
approach. He defined the objective function, which assigns a
weight between the desire for high returns and a low risk.
$$f = A \cdot E[R_P] - V_P$$   Equation 3.3.1
In this function, A represents the risk aversion of the investor
and is dependent on his investment needs. A graph of this
function highlights a region that satisfies the investor's
requirements for return as well as risk. The edge of this region
defines the portfolios with the highest return given a specific
risk or, conversely, the lowest risk given a defined return, and is
called the Efficient Frontier.
[Figure: expected return plotted against risk; the Efficient Frontier separates portfolios with excessive expected risk for a given expected return from those with a low expected return for a given expected risk]
Figure 3.3.2: The Efficient Frontier
In the next step, Markowitz defined the Utility Function, which
is also investor dependent and describes the utility U(R) of a
specific return R. This function is used to identify the desired
return when optimizing a portfolio.
$$U(R_P) = a + b\,R_P - c\,R_P^2$$   Equation 3.3.2
The coefficients b and c are not negative so that the resulting
graph will have a form as shown below.
[Figure: utility plotted against return, rising to a maximal utility and falling thereafter]
Figure 3.3.3: The Utility Function
Figure 3.3.3: The Utility Function
A person at the beginning of his career can generally afford to
take a larger risk since he will generally not depend on the
savings in the near future but would benefit from higher
returns later in life. Short-term downward fluctuations are
tolerable to this group of persons but not for an investor who is
close to retirement and will need his savings in the near future.
Job security, the plans for a large purchase in the near future
or personal risk aversion are other considerations, which will
affect these parameters.
Applying the expectation operator E(.) on Equation 3.3.2 we
get the following result.
$$E\!\left[U(R_P)\right] = a + b\,E(R_P) - c\,E\!\left(R_P^2\right)$$   Equation 3.3.3
Using the definition of the variance
$$V(R_P) = E\!\left(R_P^2\right) - \left[E(R_P)\right]^2$$   Equation 3.3.4
we can re-write Equation 3.3.3 as follows:
$$E\!\left[U(R_P)\right] = a + b\,E(R_P) - c\left[E(R_P)\right]^2 - c\,V(R_P)$$   Equation 3.3.5
For a constant expected utility, C, we can solve this equation
for the expected return E(R_P).
$$E(R_P) = C_2 - \sqrt{C_1 - V(R_P)}$$   Equation 3.3.6
where
$$C_1 = \left(\frac{b}{2c}\right)^2 - \frac{C - a}{c} \qquad \text{and} \qquad C_2 = \frac{b}{2c}$$   Equation 3.3.7
This equation defines utility curves, which we can add to the
graph shown in Figure 3.3.2, to arrive at the optimal portfolio
as shown below.
[Figure: expected return plotted against risk; the optimal portfolio lies at the tangency point between a utility curve V(R) and the efficient frontier]
Figure 3.3.4: The Optimal Portfolio
The point of tangency between the utility curve and the
efficient frontier defines the optimal portfolio for this investor.
This point can be calculated by substituting Equation 3.3.6 into
Equation 3.3.1 and solving for V(R_P) or E(R_P).
[Equations 3.3.8 and 3.3.9: the resulting closed-form expressions for V(R_P) and E(R_P) in terms of A, f, C_1 and C_2, obtained from this substitution]
This expression uniquely specifies the optimal portfolio.
The challenge of this approach is the identification of the
parameters in the utility and the objective functions since they
are highly subjective and represent relative weights and
cannot be attached to measurements in the physical world.
3.4 Applying the Theory
Most trading systems are extensions of financial prediction
experiments and have the goal of measuring the real-world
results that can be associated with the forecasts. The
simplest form was already mentioned in the experiments from
Rehkugler and Poddig: If an increase was predicted, the
system purchased one additional fictitious unit of the DAX, if a
decrease was predicted, one was sold. The system did not
permit the ownership of negative numbers of the stock, or
short positions.
Due to their mathematical simplicity, trading strategies based
on moving averages are probably the most widely used
technical rules. These models were prominently used by
LeBaron and became the baseline for further comparison
[LeBaron 1995].
In LeBaron's experiment, the single moving average indicator
generated buying (selling) signals when its value was above
(below) the current stock price. The adjustment of the model
required identifying the optimal length of the data window. A
slight improvement on the basic algorithm could be achieved if
trading signals were only generated if the difference between
the moving average and the current price exceeded a
specified band. This reduced the number of trades and, by
implication, the transaction costs that a real world investor has
to bear.
Moving average oscillators compare a short term and a long
term moving average of the stock price against each other.
These models frequently use the commonly quoted moving
averages of five, ten, 15, 50 and 200 days for their
comparisons. Buy (sell) signals are generated only if the short
(long) term moving average rises above that of the long (short)
term. Again, frequently the difference between the two values
has to exceed a specified value in order to trigger a trading
signal, so that the number of transactions is kept at bay.
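The oscillator rule with a band threshold can be sketched as follows; prices is a list of closing prices, and the window lengths and band width are illustrative parameters rather than values taken from the studies cited here.

def moving_average(prices, window):
    return sum(prices[-window:]) / window

def oscillator_signal(prices, short=10, long=200, band=0.01):
    # Buy (sell) when the short-term average rises above (falls below) the
    # long-term average by more than the specified band.
    short_ma = moving_average(prices, short)
    long_ma = moving_average(prices, long)
    if short_ma > long_ma * (1.0 + band):
        return "buy"
    if short_ma < long_ma * (1.0 - band):
        return "sell"
    return "hold"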
Using both of these moving average trading systems as a
foundation, Dihardjo and Tan compared artificial neural
network prediction models with an associated trading system
to predict profitability opportunities in the Australian Dollar/US
Dollar exchange rate [Dihardjo, Tan 1999]. Though they found
that both systems were profitable in the period tested, the
ANN models performed better (annualized returns between
13% and 19%) than the simple moving average approach
(returns of between 8% and 13%). The experiments showed
that both models were particularly successful in markets,
which exhibited long term trends.
Kumar, Tan and Ghosh used the same Australian Dollar/US
Dollar exchange rate data and built sophisticated financial
forecasting models. These incorporated the chaotic
components in numerous ways in an effort to optimize the
predictive powers of the models and, by implication, the
profitability of their system [Kumar et al, 1999]. The trading
system worked with two different rule patterns:
Pattern 1:
If (Current Forecast - Previous Forecast) > Delta then
Signal = Buy
Else If (Previous Forecast - Current Forecast) > Delta then
Signal = Sell
Else
Signal = Hold
Pattern 2:
If (Current Forecast - Current Close) > Delta then
Signal = Buy
Else If (Current Close - Current Forecast) > Delta then
Signal = Sell
Else
Signal = Hold
The Delta value provides a threshold that eliminates excessive
trades, since transaction costs of 0.1% of the transaction value
were taken into account in this experiment.
Interestingly, though the forecasting models were considerably
more complex than the ones used by Dihardjo and Tan, the
profitability ranged between 11% and 20% annualized return,
so the additional complexity did not help this goal much.
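Pattern 2, for instance, translates directly into code; the names are illustrative and delta is the threshold described above.

def pattern_2_signal(current_forecast, current_close, delta):
    # Compare the forecast against the current closing price.
    if current_forecast - current_close > delta:
        return "buy"
    if current_close - current_forecast > delta:
        return "sell"
    return "hold"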
Notably missing from this list of trading strategies is one that
addresses the realities of an individual investor, who has to
decide not only which stocks offer good growth opportunities,
but also how to distribute his investment between the
numerous alternatives. Bookstaber describes a simple BASIC
program that combines chart analysis with a simple risk
calculation algorithm, but does not analyze the success or
failure of this approach given historic data [Bookstaber 1985].
Programs with a similar focus exist with investment institutions
or other professional investors who emphasize risk analysis,
however, this work tends to not get published since it is
considered the strategic advantage of the respective owner or
user community. Jean Y. Lequarré voiced a similar sentiment
in the conclusion of his article: "This inability to discuss their
findings in the open is often frustrating for many of those
involved in this activity and specially the ones who come from
academia" [Lequarré 1993].
This thesis is an effort to combine the significant work on
financial time series analysis and prediction with a coherent
trading strategy that can be adjusted to the preferences and
needs of the individual investor. The resulting system is
designed to run on common PC hardware making it suitable
for personal investment advice and as a portfolio management
tool.
4 Methodology
4.1 Overview
NELION is an Internet based personal investment tool and
portfolio management software. It retrieves stock data from
the Internet, manages it and generates investment
suggestions and portfolio updates via e-Mail and alerts via
short messaging system, SMS. Additionally, an investor can
view his portfolio via a web page and can manipulate his
preferences and execute purchases or sales.
The system is separated into a task agent, an administration
tool and an HTML interface, each of which attaches to the
common NELION database.
[Figure: block diagram showing the Task Agent, Admin Tool and HTML Interface attached to the NELION Database, with Internet connections for WWW data, e-Mail and SMS]
Figure 4.1.1: Block Diagram NELION
4.2 The HTML Interface
The HTML Interface provides an investor using NELION with a
means to manage his portfolio and edit his preferences. After
entering his e-mail address as a user name and the
associated password under www.nelion.net, the investor gets an
overview of his portfolio including the current value, stocks
owned and current recommendations. A graph compares the
return on investment of his portfolio with the Dow
Jones Industrial Index as well as the Nasdaq.
From this main page, the investor can select hyperlinks that
allow him to maintain his investment preferences and account
parameters, buy or sell or research specific stocks.
4.3 The Administration Tool
The Administration Tool is designed for the administration of
the data, parameters and tasks on the database. As such, it is
only used by the NELION system manager and does not
require any investment experience, since it only provides a
means to maintain data but does not contain any logic for
stock prediction or portfolio distribution.
The data entry screens mirror the structure of the database
and consequently include a data entry screen for all the major
tables. For the investor data, the Administration Tool provides
one page for the data kept in the investor table. Additionally,
there are pages to view the current portfolio, its development
in a graphical format, the purchasing and sales history, as well
as the return on investment.
The stock interface includes pages with lists of the
mathematical models and current predictions with different
horizons, in addition to the standard information, such as the
company name and ticker, current price and volume. A
parameter dialog allows the administrator to maintain all the
adjustable settings of the system, while the task list shows the
jobs that the task agents are currently working on or that are
waiting for execution by one of them.
4.4 The Database
The database is the central store of information and has to
scale to several gigabytes in size in order to be able to
accommodate historic data, models, recommendations and
portfolio histories for thousands of stocks and investors. The
diagram below shows a simplified conceptual data model.
The additional tables required for the model storage have not
been included for simplicity. The complete data model is
shown in Appendix D. The primary key for each table is
included and underlined in the figure.
[Figure: simplified conceptual data model with the entities STOCK (Stock_ID), MODEL (Model_ID), CORROLATION, TODO (Todo_Key, Todo_Desc), PARAMETERS (Parameter_ID), PREDICTION (Prediction_Interval, Prediction_Date), RECOMMENDATION (Recommendation_Date), INVESTOR (Investor_ID), PORTFOLIO, PURCHASES (Purchase_ID) and a representative StockData table (Date)]
Figure 4.4.1: Simplified Conceptual Data Model
The historic price and volume data for each stock that is
tracked is stored in a separate table, which is created when
the stock is entered. The figure above only shows one
representative table, StockData.
The system hinges on the two main entities, stocks and
investors. For each stock, the system stores the company
name, stock exchange abbreviation called the ticker as well
as the volatility. The internal Stock_ID number is the primary
key of the table and used as a foreign key in all related tables.
The correlation between all stocks, for example, uses the two
Stock_IDs as its primary key and merely stores the correlation
as an attribute. The prediction table needs two additional
fields, the prediction interval and the prediction date, as a
primary key and stores the percentage prediction increase or
decrease as a positive or negative float as its only attribute.
The stock model component is only represented by a single
table, which defines its own internal model number Model_ID
as a primary key. The additional tables that are required to
store the assorted models are not included in the diagram.
The investor entity requires fewer support tables but contains
more fields within the table itself; it also uses an Investor_ID
as a primary key. Besides the investor name and his e-mail
addresses for update and notification purposes as well as
SMS updates, the system stores the investment preferences
in the form of risk adversity parameters, expected minimum
annual return as well as the minimum transaction volume.
Additionally, it tracks parameters that control how frequently
the investor receives e-mails with an update of the current
portfolio and purchase and sale recommendations. In order to
be able to show the change in portfolio value from one e-mail
to the next, it also stores the account value from the last
update. Furthermore, it maintains a field for the cost of each
transaction.
Lastly, each investor has an investor type, where three
different options are possible: A Test Investor is used to
simulate trading behavior and the resultant portfolio for
different configurations in a past period, so that a new
potential user of the system can select a configuration with the
desired characteristics. For these investors, the system
maintains a start and end date for this test. An Auto-Trader
is an automatic investor that autonomously acts on the
investment recommendations in a live simulation on current
data. This function allows the analysis of the system as a
proof of concept. Finally, there is the Normal Investor, who
receives regular updates but has to record his purchases
and sales on the system himself, whether they were recommended or
not.
The portfolio, purchases and recommendation tables provide
the link between investors and stocks, since each investor has
a portfolio containing zero or more stocks. Each inherits the
respective internal identifiers as foreign keys.
The purchase table shows when these stocks were purchased
and sold and at what price these transactions were executed.
The current and historic recommendations are kept in the
corresponding table along with their recommendation dates.
The parameters are not connected to the remaining tables,
since they only store system values, which will be read by the
task agent or Administration Tool for specific functions. The
table below describes the six parameters that are stored in it.
Parameter                       Description
TimerInterval                   Specifies the frequency with which the Administration Tool updates the task list on the screen
BankRate                        Guaranteed interest rate from the broker or bank
Diff2NY                         Time difference to the New York Stock Exchange; used to schedule the Internet download task
Mutation                        The likelihood of mutation in the genetic algorithm
SMS Threshold                   If the value of a stock changes by more than this threshold, an SMS message is sent to all investors who own the stock, as well as to the System Administrator
SMS for System Administrator    The SMS e-mail address of the system administrator
Table 4.4.1: System Parameters
The task table has an implicit link to the stocks and investors,
since all tasks relate to one of these two entities. Due to this
dual connection, there cannot be an explicit database
constraint and the referential integrity of the link has to be
verified by the task agent.
4.5 The Task Agent
The task agent supports the stock prediction and portfolio
management calculations, as well as the Internet interface. It
retrieves individual tasks from the task list, marks them as
taken until they have been executed and then deletes them
from the database.
The task agent can perform nine different tasks. The Internet
Load function to retrieve data from the World Wide Web is
always the first step. Given this data, the next tasks,
calculating the volatility, mathematical models and the
correlation between stocks can be executed. These tasks can
be grouped together into a single task for a new time series.
Further tasks include sending a portfolio update, possibly
including transaction recommendations, to the investor.
Lastly, the task agent executes the test investor function and a
background thread that performs model optimization with a
genetic algorithm. All of these are explained in detail in the
following sections.
4.5.1 Internet Load
The Internet Load function connects to the Internet and
downloads historic price and volume data for each stock
tracked by NELION. If a stock has just been added to the
system so that the data table is empty, it will attempt to load
data starting from January 1, 1980. In case this function was
not invoked for several days, it retrieves missing data in one
download, bringing the stock data up-to-date.
The system stores the closing price for every day since the
stock has started trading. On weekends, public holidays and
days where the trading volume was nil, it assumes that the
price has not changed but still inserts a record into the
corresponding table. This facilitates the monthly model
calculation, which uses the last day of each month as a basis.
Additionally, it allows for consistent correlation calculations of
stocks that are traded in different markets and with different
public holidays.
Significant changes in the stock price tend to signify dramatic
occurrences either for the stock itself or for the market as a
whole. Since this may require the attention of the investor, the
system notifies the administrator and all investors who own a
stock via a mobile phone SMS message. The sensitivity of
this threshold is controlled by the SMS threshold parameter,
which defaults to 20% so that stock price changes that exceed
that value will result in a message.
4.5.2 Calculate Volatility
The volatility of each stock is calculated using the following
equation:
$$\sigma(x) = \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{\left| x_{i+1} - x_i \right|}{x_i}$$   Equation 4.5.1
This value measures the mean absolute percentage change in
price over the entire interval for which the system has data.
Though it treats public holidays like regular trading days, it
ignores weekends.
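A sketch of Equation 4.5.1 over a list of daily closing prices (with weekends already removed, as described above); the names are illustrative.

def volatility(prices):
    # Mean absolute one-step percentage change in price (Equation 4.5.1).
    changes = [abs(prices[i + 1] - prices[i]) / prices[i]
               for i in range(len(prices) - 1)]
    return sum(changes) / len(changes)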
4.5.3 Calculate Models
For each stock, NELION calculates prediction models of four
different types: Autoregressive models of degree n (ARN),
artificial neural networks (ANN), k-nearest neighbor models
(KNN) and Markov models (MM). Since the system permits
prediction horizons of one day, seven days (one week) and 30
days (one month), it calculates models for each of these. As
input, it correspondingly uses the closing stock prices of the
last n days, weeks or months to predict the closing stock price
one prediction interval into the future.
This function serves as a bootstrap for the genetic algorithm
described below, which uses the available models to further
search the parameter space for better predictors.
The data was divided into a test and a training set but in order
to capture trends throughout the available data, each input
tuple had a 50% chance of being assigned to one of the two.
The quality of a model was measured by calculating the
normalized mean squared error (NMSE) of the predictions in
the test set.
$$\mathrm{NMSE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left(\mathrm{prediction}_i - \mathrm{target}_i\right)^2}{\left(\max_i \mathrm{target}_i - \min_i \mathrm{target}_i\right)^2}$$   Equation 4.5.2
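A sketch of Equation 4.5.2, where predictions and targets are equally long sequences over the test set; the names are illustrative.

def nmse(predictions, targets):
    # Normalized mean squared error of the predictions (Equation 4.5.2).
    span = max(targets) - min(targets)
    squared_errors = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return squared_errors / (len(targets) * span ** 2)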
Since each stock price time series has a different dynamic,
NELION calculates models of each type and prediction
horizon with various parameter combinations and stores the
best two configurations of each type on the database. The
details of each model type are described in the following
paragraphs.
The autoregressive models (ARN) use the Durbin-Levinson
algorithm described in Chapter 2 to calculate a linear
prediction model. The only parameter that could be adjusted
for this model type was the number of input values, which
ranged between two and 14. It stored the best two
models for each prediction horizon on the database.
The artificial neural network (ANN) models in NELION use a
single hidden layer with a single output unit, which represents
the prediction of the model. The units in the input layer are
mapped to the input tuple of the network. Each unit in the
hidden and output layers is fully connected to the previous
layer and has an additional link to a threshold input, which has
a constant input of one.
Since the model is only defined for an input range of between
zero and one, all input data is normalized to a range between
zero and 0.5. This ensures that all values remain within unity,
since the stock prices in our experiments never doubled their
value within one unit of the investment horizon.
As suggested by Weigend and Nix, the hidden units have a
sigmoidal transfer function while the output unit uses a linear
transfer function [Weigend, Nix, 1994]. The artificial neural
network was trained using the back propagation network as
described in Chapter 2. The learn rate and momentum
parameters were set to 0.1 for all units, but in an effort to
speed up convergence, the learn rate was left dynamic and was
increased or decreased by a factor of two depending on whether
consecutive updates were in the same direction.
[Figure: network with i = 2...14 input units, h = 2...14 hidden units with sigmoidal transfer functions, and one output unit with a linear transfer function]
Figure 4.5.1: Artificial Neural Network
The system used batch learning so that weight updates on
each link were performed after every epoch since this proved
to be more reliable than on-line learning. Since the test error
initially tends to exhibit rather erratic behavior, NELION
imposes a minimum number of 500 epochs. Learning was
stopped after three consecutive epochs increased the test
error or until it reached a maximum number of learning
epochs. This value was set at 1000 for an investment horizon
of one day, 2000 for one week and 3000 for one month.
These values were identified through experimentation and
helped some configurations, which remained near the
minimum error but never achieved three consecutive
increases.
NELION tested all combinations of artificial neural network
models with between two and 14 input units and between two
and 14 hidden units and stored the best two for each
prediction interval.
The k-nearest neighbor (KNN) algorithm retrieves
each tuple in the test set and searches the training set for the
constellations that resemble the given pattern most closely.
NELION calculates all models with between two and 14 input
values and identifies between k=2 and k=14 nearest
neighbors. The distance from the input tuple to the tuples in
the training set is calculated using the Euclidean metric,
though the genetic algorithms described below can select
between this and a Gaussian or constant metric.
The prediction for each tuple is the weighted average of the k
nearest neighbors, where the weight of each neighbor is
inversely proportional to the distance as shown in the equation
below.
$$w_i = \frac{d_i^{-1}}{\sum_{n=1}^{k} d_n^{-1}}$$   Equation 4.5.3
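A sketch of the KNN prediction step with the inverse-distance weighting of Equation 4.5.3, using the Euclidean metric; training is assumed to be a list of (input tuple, next value) pairs and the names are illustrative.

import math

def knn_predict(query, training, k):
    # Find the k closest training tuples and average their targets,
    # weighted inversely by their Euclidean distance to the query.
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    neighbours = sorted(training, key=lambda pair: distance(query, pair[0]))[:k]
    inverse = [1.0 / max(distance(query, inp), 1e-12) for inp, _ in neighbours]
    total = sum(inverse)
    return sum((w / total) * target for w, (_, target) in zip(inverse, neighbours))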
The Markov models (MM) use between four and 14 input
values and select between four and 20 random states from the
training set. The system then assigns each tuple in the
training set to one of these states and then counts the number
of transitions from one state to another. Given these numbers,
it is possible to calculate the probability of each transition. The
prediction was the weighted outcome of between one and all
states used in the model. The weighting algorithm is the same
as for the k-nearest neighbor algorithm.
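The transition-counting step can be sketched as follows, assuming each training tuple has already been assigned to one of the states (numbered 0 to num_states - 1); the names are illustrative.

def transition_probabilities(state_sequence, num_states):
    # Count transitions between consecutive states and normalize each row.
    counts = [[0] * num_states for _ in range(num_states)]
    for current, following in zip(state_sequence, state_sequence[1:]):
        counts[current][following] += 1
    probabilities = []
    for row in counts:
        total = sum(row)
        probabilities.append([c / total if total else 0.0 for c in row])
    return probabilities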
4.5.4 Calculate Correlations
The correlation between stock x and stock y measures how
closely the two time series are related and is calculated as
follows:
$$\rho(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \; \sum_i (y_i - \bar{y})^2}}$$   Equation 4.5.4
In the formula above, x_i represents the price of time series x at
a specific time i and x̄ is the mean price over all i. The task
calculates the correlation between the given time series and
all other time series tracked in the system. If time series x
equals time series y, the correlation is unity by definition.
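A sketch of Equation 4.5.4 for two equally long price series; the names are illustrative.

def correlation(x, y):
    # Pearson correlation of the two time series (Equation 4.5.4).
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = (sum((a - mean_x) ** 2 for a in x) *
                   sum((b - mean_y) ** 2 for b in y)) ** 0.5
    return numerator / denominator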
4.5.5 Send E-Mail Update
In order to provide the investor with an update on his portfolio,
its total value and the loss or gain for each stock currently
held, the system periodically sends an e-mail to the specified
address. The frequency of these messages is set for each
investor, though the task can be created manually at any time
for one specific or all investors.
4.5.6 Calculate Recommendation
Given models for all stocks tracked in NELION, the system
calculates predictions for investment horizons on a daily,
weekly and monthly basis. It is assumed that the transactions
executed by the investor have a negligible influence on the
stock market as a whole. The system uses the model with the
lowest NMSE to predict the future stock price. It is worth
noting that the MM and KNN models use the entire historic
data to predict the future stock prices and not only the test set,
as was done during the model calculation.
The predictions for each stock are stored on the database as
the percentage change from the current stock price and form
the basis of a recommendation for each investor. Additionally,
however, the recommendations take the current portfolio as
well as risk adversity parameters of the investor into account
by calculating the relative risk of all favorable future portfolios.
The parameters pertain to stock correlation, volatility, model
error and trading volume as well as a minimum transaction
amount.
The correlation between two stocks measures the likelihood of
congruent movement in response to external market forces,
like interest rate changes, new laws, political conflict or acts of
nature. The volatility measure described above measures the
variability of the stock price. Stocks with a high volatility tend
to show erratic price movements, making them a riskier
investment than those with a low volatility. NELION is able to
forecast some stocks with greater precision than others,
resulting in a lower NMSE for these time series. An investor
can specify that the recommendations should favor these
stocks, since this would decrease the risk of the resulting
portfolio. A stock with a high turnover volume tends to show
greater price stability. Additionally, the transactions by the
investor affect the market price of the stock to a lesser degree
for stocks with a high volume. Consequently, this reduces the
risk associated with these kinds of stock.
In order to identify the portfolio with the lowest relative risk
given the investor parameters, NELION calculates the risk of
the current portfolio using the following equation.
$$\mathrm{Risk} = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{C_{i,j}\; L_{i,j}\; E_{i,j}}{O_{i,j}}$$   Equation 4.5.5
In this equation, the factors in the numerator (C_ij for the
correlation, L_ij for the volatility and E_ij for the measure of error)
increase the risk of the portfolio, while the volume factor O_ij in
the denominator decreases it.
The dependence on the correlation C_ij is defined as follows,
where C represents the investor-specific correlation
parameter, I_i is the portfolio value currently invested in stock i,
and ρ_ij is the correlation between stocks i and j.
$$C_{i,j} = C\left(\rho_{i,j}\, I_i\, I_j - 1\right) + 1$$   Equation 4.5.6
Similarly, the volatility, error and volume components are
defined as follows.
$$L_{i,j} = L\left(l_i\, l_j - 1\right) + 1$$   Equation 4.5.7
$$E_{i,j} = E\left(e_i\, e_j - 1\right) + 1$$   Equation 4.5.8
$$O_{i,j} = O\left(o_i\, o_j - 1\right) + 1$$   Equation 4.5.9
In these equations, L, E and O represent the investor
parameters for volatility, model error adversity and volume
preference respectively and must lie in the interval [0,1]. The
terms l_i, e_i and o_i represent the volatility, model error and
current trading volume of stock i.
The relative weight of each factor is determined by the
relationship between each parameter. A comparatively large
value of L, for example, increases the weight of the volatility
characteristics, which can be interpreted as a particular risk
adversity as it pertains to volatility.
Each of the individual factors is reduced to unity in case one of
the parameters C, L, E and O vanishes, making the risk
calculation independent of that component. This is equivalent
to the investor stating that the corresponding component
should not be considered in his risk calculation.
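Following the factor definitions above, the relative risk of a candidate portfolio can be sketched as follows; I is the vector of invested values, rho the correlation matrix, l, e and o the per-stock volatility, model error and trading volume, and C, L, E, O the investor parameters. The names are illustrative and the code mirrors the reconstructed equations rather than the actual NELION implementation.

def relative_risk(I, rho, l, e, o, C, L, E, O):
    # Relative portfolio risk as in Equation 4.5.5; each factor collapses to
    # unity when its investor parameter is zero, and stocks without trading
    # volume are disregarded.
    n = len(I)
    risk = 0.0
    for i in range(n):
        for j in range(n):
            if o[i] == 0 or o[j] == 0:
                continue
            c_ij = C * (rho[i][j] * I[i] * I[j] - 1.0) + 1.0
            l_ij = L * (l[i] * l[j] - 1.0) + 1.0
            e_ij = E * (e[i] * e[j] - 1.0) + 1.0
            o_ij = O * (o[i] * o[j] - 1.0) + 1.0
            risk += (c_ij * l_ij * e_ij) / o_ij
    return risk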
Several boundary conditions were handled as exception cases
in NELION. If the trading volume o_i of a particular stock was
zero, the entire term in the double sum was disregarded. This
is equivalent to disregarding this stock completely. If all
investor parameters were set to zero, all terms in the double
sum would be unity and the calculated risk would be the same
for all possible portfolios. This is equivalent to the investor not
making any statement regarding his investment preferences
and is disallowed by the system.
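As an illustration of how the pieces fit together, the following sketch evaluates Equations 4.5.5 to 4.5.9, including the zero-volume boundary case, for a given set of investor parameters and stock data. It is written in modern standard C++ rather than the Visual C++ 6.0 used for NELION, and all type and variable names are illustrative; they are not taken from the actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Investor parameters C, L, E and O from Equations 4.5.6-4.5.9, each in [0,1].
struct RiskParameters {
    double correlation;  // C
    double volatility;   // L
    double error;        // E
    double volume;       // O
};

// Per-stock inputs: invested value I_i, volatility l_i, model error e_i,
// trading volume o_i, and the pairwise correlations rho[i][j].
struct PortfolioData {
    std::vector<double> invested;
    std::vector<double> volatility;
    std::vector<double> error;
    std::vector<double> volume;
    std::vector<std::vector<double> > rho;
};

// Relative portfolio risk following Equation 4.5.5.
double relativeRisk(const RiskParameters& p, const PortfolioData& d)
{
    const std::size_t n = d.invested.size();
    double risk = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Boundary case: a stock with zero trading volume is disregarded.
            if (d.volume[i] == 0.0 || d.volume[j] == 0.0)
                continue;
            const double C = p.correlation * (d.rho[i][j] * d.invested[i] * d.invested[j] - 1.0) + 1.0;
            const double L = p.volatility  * (d.volatility[i] * d.volatility[j] - 1.0) + 1.0;
            const double E = p.error       * (d.error[i] * d.error[j] - 1.0) + 1.0;
            const double O = p.volume      * (d.volume[i] * d.volume[j] - 1.0) + 1.0;
            risk += (C * L * E) / O;  // one term of the double sum in Equation 4.5.5
        }
    }
    return risk;
}
```

Because every factor collapses to one when its parameter is zero, the sketch reproduces the behavior described above without any special handling of vanishing parameters.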
It is important to note that the equation above represents a
relative risk calculation and that it does not map directly to a
physical quantity. It does, however, permit NELION to
compare the relative risk associated with different portfolios by
initially calculating the relative risk value for the current
portfolio and then searching for portfolios with a lower relative
risk.
This is done using the gradient descent method over this n²-dimensional parameter space. The standard algorithm is restricted to prevent the recommendation of negative ownership of specific stocks, called short positions. The system starts its search with a step size of one and iteratively calculates the resultant portfolio. In case the resultant portfolio does not have a lower relative risk, the step size is reduced by a factor of two and the iteration is restarted. This process is repeated until the step size has diminished to 10⁻⁸.
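The search itself can be sketched as a restricted gradient descent over the invested amounts. The thesis specifies only the step-size schedule (start at one, halve on failure, stop below 10⁻⁸) and the prohibition of short positions; the numerical gradient used below and the function names are assumptions made for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Restricted gradient-descent search: the risk function is passed in as a
// callable, for example a wrapper around the relativeRisk() sketch above.
std::vector<double> searchLowerRiskPortfolio(
    std::vector<double> invested,
    const std::function<double(const std::vector<double>&)>& risk)
{
    const std::size_t n = invested.size();
    double step = 1.0;
    double best = risk(invested);

    while (step > 1e-8) {
        // Numerical gradient of the risk with respect to each invested amount.
        std::vector<double> gradient(n);
        const double h = 1e-6;
        for (std::size_t i = 0; i < n; ++i) {
            std::vector<double> probe = invested;
            probe[i] += h;
            gradient[i] = (risk(probe) - best) / h;
        }
        // Candidate portfolio: step against the gradient, forbid short positions.
        std::vector<double> candidate = invested;
        for (std::size_t i = 0; i < n; ++i) {
            candidate[i] -= step * gradient[i];
            if (candidate[i] < 0.0)
                candidate[i] = 0.0;          // no negative ownership of a stock
        }
        const double candidateRisk = risk(candidate);
        if (candidateRisk < best) {          // keep the improvement
            invested = candidate;
            best = candidateRisk;
        } else {                             // otherwise halve the step size and retry
            step *= 0.5;
        }
    }
    return invested;
}
```

In practice, the risk callable would wrap the relative-risk calculation together with the fixed investor parameters and correlations.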
Once the optimal portfolio is calculated, NELION filters it to
ensure that the minimum transaction limit for the investor is
not violated. This restriction prevents the system from
recommending purchases or sales, where the cost of the
transaction outweighs the benefit of it. This filter also validates
that a sale suggestion of a particular stock does not result in a
portfolio, where the sale of the remaining stocks of the same
company would force a transaction that would fall below the
minimum transaction volume at the current price. In such a
case, the system would recommend selling all of the stocks for
this company.
For example, if NELION finds the optimum by selling 60 of the
100 stocks of company XYZ in the portfolio, and the future
sale of the remaining 40 stocks would result in a transaction of
less than the minimum transaction volume, it would
recommend selling all 100 stocks.
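A minimal sketch of this filter is shown below; the structure and field names are hypothetical, and only the two sale-related rules described above are covered.

```cpp
// Hypothetical structure for a single sale recommendation.
struct SaleRecommendation {
    int    sharesHeld;     // shares of the stock currently in the portfolio
    int    sharesToSell;   // shares the optimizer proposes to sell
    double price;          // current stock price
};

// Applies the minimum-transaction rules described above to a proposed sale.
void applyMinimumTransactionFilter(SaleRecommendation& r, double minTransaction)
{
    const double saleValue      = r.sharesToSell * r.price;
    const double remainderValue = (r.sharesHeld - r.sharesToSell) * r.price;

    if (saleValue < minTransaction) {
        r.sharesToSell = 0;              // the transaction cost would outweigh the benefit
    } else if (remainderValue > 0.0 && remainderValue < minTransaction) {
        r.sharesToSell = r.sharesHeld;   // a later sale of the remainder would fall below
    }                                    // the limit, so sell the entire position now
}
```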
In order to ensure that the expected portfolio return specified
by the investor is met, the portfolio selection only includes
stocks, for which the system has predicted a price increase
greater than this threshold. This results in a customized
portfolio recommendation for each investor, which is sent to
his e-mail address for the next investment horizon. For the
auto-investors, the recommendations are executed
immediately, so that the purchases and the portfolio are
updated automatically.
4.5.7 New Time Series
This task combines the tasks that are necessary for each new
time series: Internet Load, Calculate Volatility, Calculate
Models and Calculate Correlations.
4.5.8 Test Investor
In order to identify parameters that correspond to the risk and
return expectations of an investor, NELION provides the Test
Investor function. This task simulates the behavior of an
investor for a specified period in the past so that the outcome
of the resulting portfolio can be analyzed.
This function is designed with the assumption that test
investors with all different parameter combinations are created
on a specific database. The investors are then tested in a
defined interval of sufficient length to be able to analyze their
behavior.
Assuming that the dynamics of the past hold in the future, one
can then aggregate the results from many different parameter
combinations. It is then possible to make statistical
statements about investors with certain parameter
combinations so that a potential user of NELION can select
the risk and return structure suitable for his needs.
4.5.9 Parameter Selection with the Genetic Algorithm
Since the computer running the task agent only responds to
requests entered into the task list on the database, it spends
the majority of its time waiting for new jobs. This processing
time is nevertheless available for productive tasks at no
incremental costs. In order to take advantage of this power,
NELION starts a background thread to search the parameter
space of the prediction models using a genetic algorithm if no
other tasks need to be addressed immediately.
Genetic algorithms imitate the gene selection process from
nature to mix different traits from two parents, in the hope of
generating a child that can outperform either parent, as
defined by some fitness function. Much like in its biological
equivalent, where an animal, the phenotype, is defined by its genetic makeup, its genotype, a mathematical model can be specified by a series of parameters. These parameters are encoded in a string of bytes of finite length.
Biological reproduction entails the selection of specific genes from the two parental genotypes. Similarly, a mathematical genetic algorithm maps this crossover function to the random selection of bytes from the genotypes of the two parents, resulting in a child genotype that has inherited some features from each of its parents.
Biological mutation is a process by which a specific gene is not inherited from either parent but is randomly generated,
frequently through some sort of defect or external influence.
In the overwhelming majority of cases, this leads to children
with undesirable characteristics. However, occasionally, this
leads to a new trait that increases the likelihood of survival and
begins to dominate the population thereafter. This dynamic
can be imitated in genetic algorithms by selecting random
bytes instead of inheriting them from one of the parents on
occasion.
The child genotype can be used to generate a new mathematical model, which can be trained and tested. If its model error is lower than that of either of its parents, the child is apparently superior and can replace one of the two parents.
For each stock and model type (ANN, ARN, MM and KNN) the system stores the two models with the lowest test error as parents. Using these two models, the algorithm applies crossover and mutation to generate new models, calculates their predictive quality and replaces the worse of the existing parent models if the child test error is lower than that of either of them. The likelihood of mutation is controlled through a system parameter, which can lie anywhere between 0% (no mutation) and 50%, the latter meaning that on average every second byte is randomly selected with no heritage from either parent genotype.
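The following sketch shows byte-wise crossover and mutation as described above, with the mutation rate corresponding to the system parameter between 0% and 50%. The function signature is illustrative and does not reproduce NELION's internal representation.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Byte-wise crossover and mutation on the encoded model parameters (the
// genotypes).  mutationRate corresponds to the system parameter between
// 0.0 (no mutation) and 0.5.
std::vector<std::uint8_t> breedChild(const std::vector<std::uint8_t>& parentA,
                                     const std::vector<std::uint8_t>& parentB,
                                     double mutationRate,
                                     std::mt19937& rng)
{
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    std::uniform_int_distribution<int> randomByte(0, 255);

    const std::size_t length = std::min(parentA.size(), parentB.size());
    std::vector<std::uint8_t> child(length);

    for (std::size_t i = 0; i < length; ++i) {
        if (uniform(rng) < mutationRate) {
            // Mutation: the byte is generated at random, inherited from neither parent.
            child[i] = static_cast<std::uint8_t>(randomByte(rng));
        } else {
            // Crossover: the byte is taken at random from one of the two parents.
            child[i] = (uniform(rng) < 0.5) ? parentA[i] : parentB[i];
        }
    }
    return child;
}
```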
The resulting genotype is converted back to a phenotype by interpreting the string and populating the parameters of a new model. These parameters are validated to ensure a valid and sensible configuration. The specific parameters and their validation depend on the model type and are shown in the table below.
ARN
  # input values:
    Can not be more than twice the # of input values of either of the parent models
    Can not be more than 32
    Must be at least 1

ANN
  # input values (units):
    Can not be more than twice the # of input values of either of the parent models
    Can not be more than 32
    Must be at least 1
  # hidden units:
    Can not be more than twice the # of input values of either of the parent models
    Can not be more than 32
    Must be at least 1
  Transfer Function:
    Must be 1 or 2 representing the constant a in the transfer function

KNN
  # input values:
    Can not be more than twice the # of input values of either of the parent models
    Can not be more than 32
    Must be at least 1
  # of nearest neighbors (k):
    Can not be more than twice the # of nearest neighbors of either of the parent models
    Can not be more than 32
    Must be at least 1
  Metric:
    Must be 1, 2 or 3 representing Euclidean, Gaussian or Constant functions
  Weighting:
    Must be 1, 2 or 3 representing Euclidean, Gaussian or Constant functions

MM
  # input values:
    Can not be more than twice the # of input values of either of the parent models
    Can not be more than 32
    Must be at least 1
  # states:
    Can not be more than twice the # of states of either of the parent models
    Can not be more than 32
    Must be at least 1
  # states used to calculate prediction:
    Can not be more than # states
    Must be at least 1
  Weighting:
    Must be 1, 2 or 3 representing Euclidean, Gaussian or Constant functions

Table 4.5.1: Genetic Algorithm Parameter Validation
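Most of the rules in Table 4.5.1 clamp either an integer count or an enumerated code, which could be expressed along the following lines. The helper names are hypothetical, and reading "twice the value of either of the parent models" as a bound relative to the larger parent is an interpretation.

```cpp
#include <algorithm>

// Sketch of the validation rules in Table 4.5.1.  Counts such as the number of
// input values, hidden units, states or nearest neighbors are clamped to at
// least 1, at most 32 and at most twice the corresponding value of a parent
// model (interpreted here as the larger of the two parents).
int validateCount(int childValue, int parentA, int parentB)
{
    const int upperBound = std::min(32, 2 * std::max(parentA, parentB));
    return std::max(1, std::min(childValue, upperBound));
}

// Enumerated parameters (transfer function 1-2, metric and weighting 1-3) are
// clamped to their permitted code range.
int validateCode(int childValue, int maxCode)
{
    return std::max(1, std::min(childValue, maxCode));
}
```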
Given the verified parameters, the system calculates the model and, if its test error is lower than that of either of the two existing models, replaces the worse of them on the database.
5 Implementation
5.1 Overview
While many mathematical models concentrate on the
prediction of stock prices, few use these predictions to
manage a portfolio given the theoretical basis described in
Chapter 3. I have developed a system, NELION, based on
non-linear stock prediction models and this investment theory.
It uses the many different possibilities that the Internet offers
both for data retrieval as well as user interaction.
NELION is designed to manage the portfolios for numerous
investors and to suggest customized purchases and sales for
each, in an effort to achieve an optimal portfolio. Investors
can request to receive portfolio recommendations on a daily,
weekly or monthly basis, depending on their preference.
Correspondingly, these recommendations are based on
mathematical models, which take into account daily, weekly or
monthly stock data.
As described in Chapter 4, NELION is divided into four
components that are shown in Figure 4.1.1. The database is a
Microsoft SQL Server 7 running on MS Windows 2000 and the
Task Agent and Administration Tool are two applications
written in Microsoft Visual C++ 6.0. The User Interface runs
on MS Internet Information Server and uses Active Server
Pages. All components connect to the database via an ODBC
interface, which is used to manage the entire data pool. Both
Visual C++ programs require the ODBC Data Source Name
(DSN) of the NELION database as an input parameter and
use MS Windows authentication to ensure access rights.
5.2 The HTML Interface
The HTML Interface runs on MS Internet Information Server
5.0 and provides the investor with a means to manage his
portfolio. Since this web server is not connected to the Internet continuously, the domain www.nelion.net is hosted by an Internet service provider; whenever the web server logs onto the Internet, it updates a link on the welcome page that points to the server. After a successful log-in using the e-mail address as a user name and a password, the investor is shown an overview of his current portfolio.
Figure 5.2.1: Portfolio Overview via the HTML Interface
The upper half of the page is divided into two columns: the
left side contains basic portfolio information including the e-
mail address, the sum of all stocks currently owned, the cash
reserves in the portfolio and the sum of these two figures,
which represents the total value of the portfolio. It also shows
the gain or loss in portfolio value both in US dollars and in
percentage terms. The right side contains a graph comparing
the return of the portfolio with the developments of the Dow
Jones Industrial Index and the Nasdaq.
Below these two components, the web page shows a list of all
stocks in the current portfolio, including the ticker symbol, the
full stock name, the current price and quantity as well as the
product of these two values, which represents the total
investment in this stock. Lastly, the table shows the total gain
or loss that the investor has incurred with this stock.
Finally, the web page contains a further table with the current
recommendation for the investor. This component contains
the same columns as the portfolio table, with the exception of
the gain. A positive quantity represents a buy, a negative
quantity a sell recommendation.
From this page, the investor can select numerous links that
permit him to manage and adjust his portfolio.
5.2.1 Stock Search
The stock search link opens a new window that permits the
entry of a stock ticker symbol, a company name or a part of a
company name. After completing this information and
pressing the Search Button, NELION will attempt to locate a
stock with the specified stock symbol.
If it can not find the corresponding company, it will list all
companies that contain the word specified in the search field.
The ticker symbol for each company in the list also serves as
a link to the relevant stock, so that the investor can view all
relevant company details.
5.2.2 Buy/Sell Stock
This link is used to update the portfolio after a transaction.
NELION will open a dialog in the menu window when the link
is pressed and will request the investor to specify whether a
stock was bought or sold. Additionally, it needs the
transaction date, the ticker symbol, the amount, purchase or
sales price as well as the transaction costs. The latter is pre-filled with the value specified in the investor parameters.
5.2.3 Deposit/Withdraw Cash
If the investor changes the cash reserves by depositing or
withdrawing cash from the account, he will need to update the
portfolio accordingly with this link. The system will respond
with a dialog in the menu window requesting information on
whether cash was deposited or withdrawn from the account,
the transaction date and the cash amount.
5.2.4 Password
In order to change the NELION password, an investor has to
enter the current password and type the new password in
twice. The double entry is necessary since none of the
passwords are shown in clear text but only with a star (*) for
every character entered.
5.2.5 Parameters
This page is used to specify all investor preferences and
parameters. For each parameter, the system offers an
explanation of the usage of this parameter. Besides the
investor name, e-mail and SMS e-mail address, the system
requires information on the investment horizon and the e-mail
frequency. The transaction cost field is used to populate the
Buy/Sell Stock dialog with a correct default value. The
remaining parameters focus on the risk adversity of the
investor and include the minimum transaction amount, as well
as the volatility, error, volume and correlation risk adversity.
This same page is used for the online application of new
investors. It allows a new applicant to specify all relevant
parameters when he opens an account without intervention by
the NELION administrator.
5.2.6 Log Out
The log out link removes a session token from the web server
so that no additional transactions are possible and returns the
investor to the NELION welcome page.
5.3 The Administration Tool
The NELION Administration Tool is the primary tool for the
administrator of the system. It contains all common features
of an MS Windows application including a menu, icon bar for
short cuts and a status bar at the bottom of the screen.
Figure 5.3.1: The NELION Administration Tool
The program is written using a multi document interface (MDI)
so that the administrator can open multiple windows, each
displaying information for a specific investor or stock.
Additionally, it is possible to open a window for the system
parameters. The task list is also implemented as an MDI
window but, unlike the others, it remains open on the left side
of the screen as long as the application is running.
Following the MS Windows standard, the Administration Tool
uses multiple tabs in documents in order to make optimal use
of the available screen real estate. Some of these allow the
user to enter or update data pertaining to the corresponding
entity, while others show historical, calculated data or
parameters. They are described in detail in the following
sections. Full screen prints are included in Appendix C.
5.3.1 Investors
The General tab for the investors contains the basic investor information like the name, e-mail address, investment horizon, e-mail notification interval, transaction costs and minimum transaction amount, as well as all investor risk adversity parameters. The investor type is maintained here as well; for test investors, the dates for the beginning and end of the test have to be entered. The investor number
cannot be modified and represents the internal system
number. Similarly, the portfolio value is calculated from the
current stock prices held by the investor.
The Portfolio History tab displays a graph of all the stocks
that the investor owned since he started tracking his account
on NELION. The proportional value of each stock as well as
the overall value of the portfolio is visible at a glance for the
entire portfolio history.
On the Purchases tab, the Administration Tool provides a list
of all purchases and sales undertaken within the portfolio.
Each transaction is one record and includes the ticker and
company name, date, stock price, amount, and the product of
these two, representing the total transaction value.
The Portfolio tab displays a list where every record
represents one stock that is currently held. Each line contains
the ticker and stock name, the current stock price, the amount
held and again the product of these two, representing the total
value of the stock. The sum of these values is shown on the
header of this column representing the total portfolio value.
Finally, the list contains a column for the value gained or lost
with this stock.
The Return tab shows the actual return on the investment for
three different periods: Since the portfolio was included in
NELION, since the beginning of the calendar year and in the
last 12 months.
5.3.2 Stocks
The General tab for a stock includes the ticker and stock
name and the web site for data download. At present, the
latter only offers one option, which is Quote Central. If more
sites provide historic stock data, especially for European and
Asian stocks, this list and the additional required functionality
would be expanded. The ticker name is used to create a table
on the database to store the daily price and volume
information for the stock.
The two additional pieces of information on the tab, current
price and volatility, are read-only fields since the former is
downloaded from the Internet and the latter is calculated.
For a new stock in NELION, this is the only tab available.
When editing an existing stock, three further tabs with
information are available. The Models tab shows a list with
the header information of the mathematical models stored for
this stock. Each model has a name, the number of data input
values used, the prediction interval as well as the NMSE.
The Graph tab displays the price movement over time of the
stock. The graph auto scales the y-axis to ensure that the
entire data is visible on the screen.
Finally, the Correlation tab shows a list for the correlations of
the chosen stock with all other stocks in the system. The list is
sorted in decreasing order, so that the selected stock will
always be at the top with a correlation of 1.
5.3.3 Parameters
The Parameters data entry screen permits the maintenance
of the system parameters. The Refresh Interval parameter
controls how often the Administration Tool will update the task
list. The entry is interpreted in seconds, so that a value of 60
will result in a refresh rate of once a minute.
The Bank Interest Rate represents the return on cash kept
with the broker or with another bank. Since this can be viewed
as a risk-free investment, any stock purchase, which is by
definition risky to a greater or lesser degree, is only
recommended by the system if the prediction for it exceeds
the bank interest rate.
The Time Difference to New York parameter records the time
difference from the current location to the home of the New
York Stock Exchange. Since Quote Central, the Internet
source of our data, does not update the historic stock price
and volume list until approximately 4:00 a.m. local time in New
York, we use this value to calculate the time for the Internet
data download.
The SMS Threshold defines a band of uncritical price
swings. If the price of a stock changes by an amount, which
exceeds the SMS threshold, all investors who own the stock
as well as the NELION system administrator are notified
through an e-mail. The e-mail address is different from the
one used for regular updates and should be tied to a
messaging provider that forwards the e-mail to a specified
mobile phone via short messaging system, SMS. The system
administrator's SMS e-mail address is defined in the
corresponding system parameter.
5.4 The Database
The NELION database server runs on MS Windows 2000 and
employs MS SQL Server 7.0. This software infrastructure
ensures scalability to a multi-gigabyte installation, while
permitting an individual investor to work with standard PC
hardware on a single, modern workstation. The database
platform includes functionality to ensure referential integrity, by
including counters, stored procedures, triggers, primary and
foreign keys and other constraints.
In order to harness the power of these tools effectively, I used
S-Designor DataArchitect 5.1 to design and document the
database, which only required a few manual adjustments for
installation. The graphical tool permits relating tables
graphically and generates an SQL script for different database
types. The database creation is then limited to the execution
of this script and assigning user access rights.
DataArchitect also allows the definition of data types, which
map to types supported by the underlying database. In a first
design, I defined a type "Geld" (the German word for money), which mapped to the data
type money. At a later stage, I reduced the number of
different data types and mapped Geld to float, in an effort to
reduce complexity. This merely required creating a new
database and transferring the data to it, with an appropriate
mapping from money to float.
MS SQL Server offers scheduled tasks, which I used to
automatically generate tasks in the task list for e-mail updates
of the portfolio value and recommendations for the investors.
Similarly, the system inserted an Internet Download task into
the task list once every weekday. In order to adjust the mathematical models to the changing dynamics of the market,
NELION autonomously increased the error of all models by
5% every Sunday. This gave new models calculated by the
genetic algorithm a better chance of undercutting the existing
models, with no loss of generality. In a worst-case scenario, it
identified a model with the same parameters as optimal again,
so that it merely overwrote itself.
A trigger on the Purchases table secured referential integrity
between transactions, investor portfolio and the investor
header tables. When a purchase or sale was entered, either
through the Administration Tool or directly on the database,
the trigger updated the Portfolio table and reduced the
portfolio value on the investor table to reflect the associated
transaction cost.
Since the data tables for each stock were created dynamically
and represented the bulk of the tables in the database, I did
not include the same trigger on each of these to keep the
current value of the header table updated. Instead, I wrote a
stored procedure that was called from the Administration Tool
and updated both the data table and the stock header table
when data was downloaded from the Internet.
A number of tables used to store the mathematical models
contained a database-internal counter as the primary key,
since this key was a foreign key in subsequent tables. This
design reduced the data requirements in the database and
normalized the tables effectively. When storing a model on
the database, it was necessary to save the model header
information and then to retrieve the counter because it was a
required piece of information on the detailed tables. Here, I
also used stored procedures to save the record and
immediately deliver the ID as the return value, because it
reduced the database accesses from two to one, thereby
optimizing the system.
Though the insert function automatically increments the
counter for a table, I manually added a stock with the counter
0, which is defined as Cash. It is used to maintain the cash
held by the individual investors in their respective portfolios
and is automatically adjusted to account for purchases and
sales. Similarly, on the Purchase table, each transaction
results in an entry for this stock to account for the transaction
costs, as they are defined for each investor.
5.5 The Task Agent
The Task Agent is a separate program with a simple interface
that consists of an Exit button as well as nine check boxes,
which allow the user to define which actions this instance of
the program should perform. When the program exits, the
current configuration is stored in the MS Windows registry and
is restored the next time the program is started to ensure that
the same configuration is retained.
Figure 5.5.1: The Task Agent Program
Two text boxes provide the user with a description of the
current object and the actions on that object that are currently
executed. The objects are either the stock or an investor from
the corresponding table.
The server process typically sleeps until either the Administration Tool or a scheduled task enters a record in the task list. In this state, it can start a second thread that benefits from the idle processing time to search for improved prediction models using a genetic algorithm. While performing this optimization, a progress bar is updated every second to show that the system is still running, and the task list is checked for new tasks every six seconds.
If the Task Agent finds an Open task in the task list, it marks the task as Working so that no other Task Agent on the network begins executing it. After performing the relevant task, it deletes the entry from the task list and the cycle starts afresh.
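The control flow of this cycle can be sketched as follows. The in-memory queue and the stub functions merely stand in for the database task list and the real work performed by the Task Agent; only the polling structure mirrors the description above.

```cpp
#include <chrono>
#include <deque>
#include <thread>

// The queue and the stubs stand in for the database task list and the real
// work; only the control flow mirrors the Task Agent described above.
struct Task { int id; };
static std::deque<Task> taskList;

static void executeTask(const Task&) { /* perform the requested function */ }
static void runBackgroundOptimizationStep() { /* one genetic-algorithm step */ }

void taskAgentLoop(bool backgroundThreadEnabled)
{
    for (;;) {
        if (!taskList.empty()) {
            Task task = taskList.front();   // in NELION the task is marked as Working
            taskList.pop_front();           // so that no other Task Agent executes it,
            executeTask(task);              // and deleted once it has been performed
        } else if (backgroundThreadEnabled) {
            runBackgroundOptimizationStep();          // use the idle time for the GA search
        }
        std::this_thread::sleep_for(std::chrono::seconds(6));  // poll every six seconds
    }
}
```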
The Task Agent can perform nine different functions, which
are described in detail in the following section.
5.5.1 Internet Load
The Internet Load function assumes that a permanent
connection to the Internet exists and downloads data from a
site called Quote Central, which has its Internet homepage at
www.wallstreetcity.com. Navigating to the Stocks tab on this
site allows the user to enter a stock ticker name to retrieve
stock data and charts with a 20-minute delay. On the left-
hand side of the screen, it also provides the option to view
historical quotes. Choosing this function, the site provides
options to enter the stock, the start and end dates of the
historic data requested, the data interval as well as eight
different formats for the historical stock data. After selecting
the appropriate parameters, pressing the Get Quotes button
performs the request.
By analyzing the URL that results from this request, one can
automate the retrieval by generating it within NELION and
executing it directly. The Internet Load function builds the
correct URL, using a daily interval and requests historic stock
data in the HEADER 5 Super Charts format. In order to
limit the data download, it sets the start date to the day of the
last date available in the system and overwrites it in case it
has been updated. All new data is appended to the stock data
table. For newly created stocks, NELION attempts to
download all data since January 1, 1980.
After sending the request to the Internet site,
wallstreetcity.com responds with a new page, which contains a
temporary link to a page with the Historical Quote Header 5
Export File. NELION extracts this link and executes it. The
result is a comma-delimited ASCII file, which contains the
date, open, high, low and close prices as well as the daily
volume. From this page, it extracts the close price and volume
for each day and stores them on the database. For days,
which do not have an entry on the web site, like public
holidays and weekends, NELION generates a fictitious record
using the previous stock price and volume figures. In case the
price changes more than a specified percentage from one day
to the next, the system automatically sends an alert to all
investors, who own the stock as well as the system
administrator. The e-mail address of this alert can be different
from the address used for regular updates, so that the investor
receives it as an SMS notification on his mobile phone. The
technical details of this function are described in section 5.5.5
below.
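Parsing one line of the downloaded export file and filling a non-trading day could look roughly like the sketch below. The field order of the HEADER 5 format is assumed to be date, open, high, low, close and volume, as listed above, and only the close price and volume are retained.

```cpp
#include <sstream>
#include <string>

// One record of the historical quote file; only close and volume are stored.
struct Quote {
    std::string date;
    double close;
    double volume;
};

// Parses a comma-delimited line of the assumed form date,open,high,low,close,volume.
bool parseQuoteLine(const std::string& line, Quote& quote)
{
    std::istringstream in(line);
    std::string date, open, high, low, close, volume;
    if (!std::getline(in, date, ',')  || !std::getline(in, open, ',')  ||
        !std::getline(in, high, ',')  || !std::getline(in, low, ',')   ||
        !std::getline(in, close, ',') || !std::getline(in, volume, ','))
        return false;                       // malformed or incomplete line
    quote.date   = date;
    quote.close  = std::stod(close);
    quote.volume = std::stod(volume);
    return true;
}

// Weekends and public holidays have no entry on the web site; the gap is
// filled with a fictitious record repeating the previous price and volume.
Quote fillMissingDay(const Quote& previous, const std::string& missingDate)
{
    Quote filled = previous;
    filled.date = missingDate;
    return filled;
}
```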
Since the system does not download any images, the data
volume of the download is kept at a minimum so that the daily
update per page can be accomplished within one second,
given the necessary bandwidth to and within the Internet.
Since the system is tailored to this specific site map, it is clear
that a change in the structure at wallstreetcity.com will require
an adjustment of the NELION Internet Load function.
Though other sites also offer historic stock quotes, many are
fee-based and require a log in process, which complicates the
retrieval. Of those that were free of charge, none that I found offered the simple ASCII format, and several tests showed that parsing an HTML page is considerably more complex and error-prone.
It is worth noting that the system does not retrieve dividend
and stock split notices, so that these would have to be entered
manually. The former would merely be an adjustment to the
available cash. For the latter, stocks are normally split in a
ratio of 2:1 or 3:1, so that one can expect a drop in the stock
price by approximately 50% or 66% respectively. The investor
database administrator will be alerted to drastic changes in
price through the SMS notification so that he can take
appropriate action.
5.5.2 Calculate Volatility
Calculating the volatility of a stock is a straightforward
implementation of the equation described in 4.5.2. To this
end, NELION retrieves all of the data of a particular stock and
executes the loop described in the section. This task is
performed when a new stock is created and can be repeated
periodically thereafter.
A single task in the task list, for stock ID 0, performs this for
all stocks. Since this stock counter refers to Cash and no
data table exists for this dummy stock, the instruction is well
defined.
5.5.3 Calculate Models
Before a forecast for a stock can be calculated, it is necessary
to calculate predictors. The Calculate Models task performs
this task for a specific stock for all investment horizons (one
day, one week and one month) and all model types (ARN,
ANN, KNN and MM), keeping only the two models for each
combination that have the lowest NMSE. The algorithm for
the model calculation follows from section 4.5.3.
It is worth noting that though the database stored all model and financial parameters only as floats, all models employed
the data type double in RAM, in order to ensure greater
precision for internal calculations, potentially increasing the
quality of the results.
Given the models as a basis, the system was able to make
predictions for each horizon. In an effort to continually improve
the quality of the models, however, the fine-tuning of these
predictors took place in the genetic algorithm implemented in
the background thread described in section 5.5.9.
5.5.4 Calculate Correlations
To calculate the correlations for a stock, the system loaded the
historic data into memory, excluding weekends. It then
iteratively loaded the data for the remaining time series into
memory and started calculating the correlation as specified in
section 4.5.4 starting with the later of the two beginning dates.
The function did not disregard public holidays, because this
does not significantly affect the calculation and helps reduce
the complexity of the application.
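A sketch of this step is given below: the two series are aligned on their overlapping tail, which corresponds to starting at the later of the two beginning dates, and a standard Pearson correlation is computed. The exact formula NELION uses is the one defined in section 4.5.4, so the Pearson form here is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Correlation of two price series aligned on their overlapping tail, i.e.
// starting at the later of the two beginning dates.
double alignedCorrelation(const std::vector<double>& a, const std::vector<double>& b)
{
    const std::size_t n = std::min(a.size(), b.size());
    if (n < 2)
        return 0.0;
    const double* pa = a.data() + (a.size() - n);   // most recent n values of a
    const double* pb = b.data() + (b.size() - n);   // most recent n values of b

    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < n; ++i) { meanA += pa[i]; meanB += pb[i]; }
    meanA /= n;
    meanB /= n;

    double cov = 0.0, varA = 0.0, varB = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        cov  += (pa[i] - meanA) * (pb[i] - meanB);
        varA += (pa[i] - meanA) * (pa[i] - meanA);
        varB += (pb[i] - meanB) * (pb[i] - meanB);
    }
    if (varA == 0.0 || varB == 0.0)
        return 0.0;
    return cov / std::sqrt(varA * varB);
}
```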
5.5.5 Send E-Mail Update
The Send E-Mail Update function is designed to keep the
investor informed on the status of his portfolio. The subject of
the message contains the current total account value and in
parentheses the change from the last e-mail.
The body of the e-mail contains a table with all the stocks that
the investor currently holds in his portfolio. For each stock, the
message shows the number of shares, the current price and
the product of these two, representing the value held in this
stock. The last column shows the total gain or loss that the
investor has incurred to date with this investment. The last
row is always the current cash holdings of the portfolio. At the
bottom of the table, the message shows the sum of all
investments representing the total portfolio value.
All dollar amounts are displayed with three decimal places, to ensure that eighths of dollars (US$ 0.125) can be represented accurately.
NELION generates a plain-text e-mail and needs a MAPI
compliant e-mail client to send it. The application uses the
default profile to access the e-mail services. The MAPI client
does not have to be running, but its behavior and settings
determine how the message is treated thereafter. I tested
the task agent with Outlook Express 5 running under Windows
2000 as well as Outlook 2000 running under Windows 2000
and Windows 95. The OS did not affect the operation, but
although both e-mail clients were set to send e-mails
immediately, only Outlook Express 5.0 performed this action,
even if it was not started. Outlook 2000 needed to be started,
but only sent the message because it was configured to send
and receive new messages every ten minutes.
In order to provide a consistent appearance, I registered the
domain name NELION.NET. All e-mail was sent using the e-
mail address Administrator@NELION.NET.
5.5.6 Calculate Recommendation
In the recommendation calculation for a specific investor,
NELION attempted to load the prediction with the specified
investment horizon for each stock into memory. If the
prediction did not exist on the database, it proceeded to load
all historic stock data and the model with the lowest NMSE
into memory and calculated the prediction. To make this
prediction available for subsequent recommendations, it was
stored on the database.
With the predictions in memory, NELION identifies the stocks,
which promise to achieve an annual return greater than the
bank interest rate and uses the algorithm described in section
4.5.5 to identify the optimum portfolio.
The investor subsequently receives an e-mail similar to the e-
mail update described in the previous section. In addition,
however, the system adds a line for each purchase or sale
recommendation, specifying the stock and the recommended
number of shares. For Auto-Investors, the transactions were
simulated, using the most recent stock price and included
transaction costs in order to provide a realistic scenario for
comparison. In this case, the e-mail indicated exactly which
transactions were performed.
5.5.7 New Time Series
This task combines the tasks that are necessary for each new
time series: Internet Load, Calculate Volatility, Calculate
Models and Calculate Correlations. This was necessary in
case more than one Task Agent accessed the database,
because combining these tasks into a single function ensured
that they were executed sequentially. If they were executed in
parallel, one Task Agent might be downloading the historic
data from the Internet, while a second might start calculating
its volatility, models or correlations with incomplete data, which
would have falsified the results.
5.5.8 Test Investor
The Test Investor function is necessary to identify parameter
combinations, which result in investment recommendations
that suit the preferences of each investor. The implementation
follows from the theoretical discussion in section 4.5.8.
In order to provide a realistic test environment, I created a
separate database, deleted all stock data after the beginning
of the test interval, and recreated all prediction models.
Thereafter, task agents on two computers spent two weeks
optimizing the models using the genetic algorithms of the
background thread.
The test investor results are discussed in detail in chapter 6,
where the experimental results show the quality of our Internet
trading system.
5.5.9 Parameter Selection with the Genetic Algorithm
If the Background Thread check box was set on the interface
and the task list of the database only contained functions that
the instance of the Task Agent was not requested to perform,
it started a program thread to calculate new predictors for a
randomly selected stock.
In the foreground, the application continued updating the
progress bar every second and checked the task list every six
seconds. In case a new task was entered while the
background thread was active or the Exit button was pressed,
the program completed the calculation of the current model in
the background before ending the thread.
The symmetric multi-processor (SMP) architecture of Windows
2000 ensured that the two threads were executed on different
processors in multi-processor machines so that no
degradation in execution speed was noticeable. On a single
processor computer, the two threads had to share the
resources, so that two computing intensive tasks increased
the execution time.
The additional models calculated by the background task used
the genetic algorithm described in section 4.5.9, using
inheritance and cross-over to generate new model parameters
and then checking whether this model was able to achieve a
lower NMSE than either of its parents. If this was the case, it
replaced the worse of the two existing models and became a
parent for future calculations. This regenerative process was
facilitated by increasing the NMSE value of all models by 5%
every Sunday, because this permitted new models, which
capture new market dynamics, to replace outdated predictors.
6 Experimental Results
In order to test the trading system we described in the
previous two chapters experimentally, I first executed the Test
Investor function with different parameter configurations in
order to identify promising candidates for high risk and
conservative investment profiles. Once these investment
profiles were identified, I used them in a live simulation to
test their performance under realistic circumstances with the
Auto Investor function in NELION. In this chapter, I present
the results of these two steps. Additionally, I show the
distribution of the four different model types that were used in
the simulation.
6.1 Test Investor Identification
The Test Investor function in NELION is designed to identify
the optimal investor configuration for a high risk and
conservative investor. This option allowed the user to specify a
test interval to simulate the actions of investors with specific
parameters. I limited the parameter space to [0,1] for the
volatility, correlation, volume and error parameters and tested
weekly investors for the period from January 1, 1998 to
December 31, 1998. A second test included the period of
January 1, 1999 to December 31, 1999 with an initial capital of
US$ 10,000 and a minimum transaction volume of US$
250.00. The cost of each transaction was set at US$ 9.99,
which corresponds to the charges at the online broker
Datek.com and exceeds the cost at AmeriTrade.com. The
minimum expected return was 6%. The trials included all
combinations of the parameter values in increments of 0.125.
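Enumerating this grid yields 9 values per parameter and therefore 9⁴ = 6561 test-investor profiles per test interval, which could be generated along the following lines; the Profile structure is illustrative.

```cpp
#include <vector>

// One test-investor profile; the structure is illustrative.
struct Profile {
    double correlation;
    double error;
    double volatility;
    double volume;
};

// Enumerates all 9^4 = 6561 combinations of the four parameters in [0,1]
// with an increment of 0.125.
std::vector<Profile> enumerateTestProfiles()
{
    std::vector<Profile> profiles;
    for (int c = 0; c <= 8; ++c)
        for (int e = 0; e <= 8; ++e)
            for (int l = 0; l <= 8; ++l)
                for (int o = 0; o <= 8; ++o)
                    profiles.push_back(Profile{ c * 0.125, e * 0.125,
                                                l * 0.125, o * 0.125 });
    return profiles;
}
```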
As a measure of the risk profile, I examined the final portfolio
of each test investor and discarded parameter combinations,
which had more than 75% of the portfolio value invested in
one stock. Investors with between 50% and 75% invested in a
single stock at the end of the investment period were
considered High Risk combinations, while those with less
than 50% invested in one stock were considered
Conservative Investors.
This separation resulted in two sets of parameter
configurations, which I analyzed to identify a promising
combination by creating a cross tab query on every
combination with two of the four parameters. The resulting six
tables for each set and both test intervals for the subsequent
calculations are shown in Appendix A.
From the tables, I identified the parameter combination that
occurred the most frequent in each set and used this to define
two of the four parameters. This value is highlighted on the
tables in Appendix A with a dark grey background. The table
below summarizes these results.
Investor Parameter 1 Parameter 2
Conservative 1998 Volume=0.875 Error=0.75
Conservative 1999 Volume=0 Volatility=1
High Risk 1998 Volume=0 Volatility=0.875
High Risk 1999 Volume=0 Correlation=0.25
Table 6.1.1: Sample Investor with two Parameters identified
In a second iteration, I used these two values to identify the
parameter combination in the remaining four tables that
occurred the most frequently to define the third parameter. To
simplify the search, I highlighted all relevant columns and rows
in the tables of Appendix A with a light grey background and
identified the largest values with dark grey characters. This
specifies a third parameter as shown in the following table.
Investor Parameter 1 Parameter 2 Parameter 3
Conservative 1998 Volume=0.875 Error=0.75 Volatility=0.875
Conservative 1999 Volume=0 Volatility=1 Error=0.625
High Risk 1998 Volume=0 Volatility=0.875 Correlation=0.875
High Risk 1999 Volume=0 Correlation=0.25 Volatility=0
Table 6.1.2: Sample Investor with three Parameters identified
Finally, using the three defined parameters, I identified the
combination from the last three tables that occurred the most
frequently to set the last parameter. The relevant columns
and rows are displayed with a medium grey background in the tables in Appendix A that had not been used before; the two columns or rows that had already been identified in the second iteration are marked with a medium grey line at the bottom or on the left-hand side.
For the conservative investor 1998, this did not uniquely define
the correlation parameter, because a volatility of 0.875 and a
volume of 0.875 both had 14 occurrences in the correlation
parameter at 0.5 and 0.375 respectively. This seemed to
indicate that the maximum is somewhere between these two
values. Comparing the number of occurrences at 0.375 and
0.5 for each of these two parameters respectively indicated a
score of eleven versus ten occurrences, so that I opted for the
0.375 value. This is supported by the tests for the
conservative investor in 1999.
The results from this third and final iteration are summarized in
the table below.
Investor Correlation Error Volatility Volume
Conservative 1998 0.375 0.750 0.875 0.875
Conservative 1999 0.375 0.625 1 0
High Risk 1998 0.875 0 0.875 0
High Risk 1999 0.250 0.500 0 0
Table 6.1.3: Sample Investor Parameters
The results show considerably more consistency for the
conservative than for the high-risk investors. The correlation,
error and volatility parameters changed only slightly between
the tests for 1998 and 1999, while the volume parameter
dropped from 0.875 to 0. For the high-risk investors, only the
volume parameter remained the same at 0, while the
remaining parameters underwent significant adjustments.
This is caused by two factors. Firstly, in 1998 the high-risk
investor selection only contained 150 investors, compared to
3774 in 1999. For comparison, the conservative investors in
1998 and 1999 had 1218 and 1360 profiles respectively in
their analysis. Consequently, the analysis in 1999 allows for a
significantly higher degree of confidence, since it is based on
25 times as many investors.
Secondly, the volatility in the markets rose in this period, most
notably for the Nasdaq, which had an increase of 17%
between 1998 and 1999. As a result, a number of profiles,
which had a volatility parameter of 0 in 1998, were spread
evenly between all investor profiles, so that the profiles with a
volatility value of 0.875 were able to dominate. In 1999, the
profiles with a volatility parameter of 0 were concentrated in
the high-risk selection and defined this set.
The conservative investor profile shows a consistently strong
adversity toward high volatility and stocks where the system
cannot effectively predict future movements. At the same
time, it does not disregard the need for a well-balanced
portfolio, as exemplified by the correlation parameter. This is
to be expected, since the portfolios in that conservative set
were chosen so that no more than 50% of the invested value
remained in a single stock at the end of the interval.
The high-risk investors disregarded the transaction volume
and, in 1999, volatility of the stocks completely. This is
consistent with the approach of relying on the raw predictions
since that parameter had the biggest weight in the 1999 test.
The result was a portfolio, which fluctuated, at times widely, as
one might expect.
6.2 Testing the Profiles
Using the results from the previous section, I tested the quality
of the system for one year starting May 15, 1999 with the
NELION test investor function. I configured a high-risk and a
conservative investor with the parameters calculated from the
1998 investor profiles. Each had a starting capital of US$
10,000, expected a return of at least 6% annually and required
a minimum transaction volume of US$ 250. The transaction
costs were calculated at US$ 9.99 per trade. On January 15, 2000, I updated the profiles to those resulting from the 1999 test.
Every weekend, the system had the opportunity to
automatically perform fictitious but unverified purchases and
sales. A detailed list of purchases and sales is included in
Appendix B. The portfolio development is shown in the
diagram below. The Nasdaq, Dow Jones Composites as well
as the S&P 500 indexes are included for comparison.
[Figure: line chart of portfolio value, indexed from 80% to 200%, over the interval 5/15/1999 to 5/15/2000, for the NELION High Risk and NELION Conservative investors, the Nasdaq Composite, the Dow Jones Composite and the S&P 500.]
Figure 6.2.1: Comparison of NELION Investors with Major Indexes
This diagram is not adjusted for inflation, which amounts to
approximately 1.6% in the test interval. For easy comparison,
the return on investment calculation for the interval is
summarized in the table below.
              NELION      NELION         NELION     Nasdaq      Dow Jones   S&P
              High Risk   Conservative   Average    Composite   Composite   500
5/15/1999     $10,000     $10,000        $10,000    $2,528      $3,306      $1,338
5/15/2000     $15,323     $9,736         $12,530    $3,607      $3,143      $1,452
Return        53.2%       -2.6%          25.3%      42.7%       -4.9%       8.6%
Table 6.2.1: NELION Test Investor Comparison
The results exhibit a pronounced difference between the high-
risk and the conservative investor, the former achieving a
53.2% return, while the latter lost 2.6% of the portfolio value.
This follows from the portfolio held. The high-risk investor
disregarded the model error in the first half of the trial and
volatility in the second so that his investment choices
gravitated toward stocks traded on the Nasdaq because they
tended to exhibit comparatively erratic behavior. This
correlation is apparent from Figure 6.2.1, where the portfolio
value of the high-risk investor and the Nasdaq remained close
during the entire year. NELION beat the index the first six
months, trailed it slightly at the beginning of 2000 and
regained the edge shortly before the correction in March 2000.
Though this correction did not spare the NELION portfolio, the drop was not as pronounced as for the Nasdaq.
The conservative portfolio consistently emphasized good
predictability of the stock, as one might expect. Consequently,
it chose less volatile stocks, resulting in a mix between Dow
Jones and Nasdaq stocks. The fact that the old economy
stocks did not perform well is documented by the 4.9% decline
of the Dow Jones Composite during our simulation interval.
Two changes in the portfolio value are worth noting here: On
November 21, 1999, the conservative portfolio held 1072
stocks of Angeion Corporation at US$ 0.88 a stock. Within
two days, the price had jumped to US$ 2.25 and continued
climbing up to a peak of US$ 3.94 on February 18, 2000. This
increase pushed the value of the portfolio up 16% within two days and is clearly visible on the graph above. On the other
hand, the conservative investor purchased 80 shares of
Fruit of the Loom on May 15, 1999 for US$ 11.88 each, for a
total investment of US$ 950.40. Unfortunately, the company
sought protection under Chapter 11 of the bankruptcy law on
December 28, 1999 so that this investment was lost
completely.
The average of the two NELION portfolios achieved a healthy
25.3% return on investment, well above the S&P 500.
Compared to a risk-free investment in government bonds,
which returned about 6%-8% annually, these portfolios
represent a very attractive alternative. Bearing in mind that
the returns already account for transaction costs, they
compare favorably to many mutual funds, which state the
return on investment without mentioning their charge of
between 3% and 5% of the invested value.
6.3 Model Distribution
In the test above NELION selected the weekly trial portfolios
from the total list of stocks tracked, which is included in
Appendix E. However, since the system is designed for
investment horizons of one day, one week and one month, it
calculated models for all of these intervals. For the
overwhelming majority of the stocks, NELION was best able to
predict future values using the k-nearest-neighbor models.
For a detailed list, please see the table below.
Model Type Daily Weekly Monthly
ANN 0 1 7
ARN 11 2 1
MM 0 3 5
KNN 96 101 94
Table 6.3.1: Model Type Usage
The low success rate of auto-regressive models is not
surprising, given the complexity of stock price movements.
The dominance of k-nearest-neighbor models over the Markov
and Artificial Neural Network models seems to indicate that
the stocks used in this experiment exhibit low-dimensional chaotic behavior, since the complexity that KNN models are generally able to capture is lower than that of the other two non-linear predictors. Hsieh supports this by showing that stock market
data is a low-dimensional deterministic system [Hsieh 1990].
The ANN models require considerably more processing time
than any of the other models. This poses a challenge, since
the algorithm tends to spend as much time calculating these
models, as it requires for all the others together.
Consequently, the initial model calculation does not search the
parameter space of these models as extensively as it does for
the remaining model types, which may have further helped the
dominance of the KNN models.
6.4 Daily Operation
The application evolved over the months and years in
response to the specific requirements of private investors and
addresses their immediate needs in its current form. Several
distinguishing features were emphasized repeatedly, besides
the obvious guidance with specific suggestions.
Overall, the investment recommendations were considered
valuable because they helped direct attention to opportunities
that are beyond the scope of an individual investor.
The customization of the suggestions instilled a significant
amount of trust, because each investor felt that he was getting
individual attention. It was clear that the recommendations
were not of one mold and independent of the personal goals of
the investor, so that there was no cause for suspicion that the
recommendations were motivated by NELIONs personal gain.
The recommendation and update intervals differ widely between persons who actively participate on a daily basis and investors with a long-term horizon, and each group appreciates the corresponding e-mail frequency.
7 Conclusion
There exist numerous articles and papers on various
approaches to the prediction of financial time series. Each of
these focuses on the specific qualities of the data at hand and
attempts to optimize the predictions for it based on historic
values. On the other hand, the theory of portfolio
management is documented and widely agreed upon, in its basic
form. Though these two components must be combined to
form a coherent portfolio management system, academic
papers have largely ignored this comprehensive approach.
The goal of this thesis was to build a fully integrated Internet-
based system that helps a private investor focus on promising
opportunities from the vast amount of financial data that is
available. NELION retrieves historic stock data from the World
Wide Web, stores it on a local database and uses four
mathematical model types to predict stock prices at different
intervals in the future.
At the same time, it allows an investor to choose from four risk
adversity parameters to establish a risk profile that matches
his needs. Given the investor preferences and stock
predictions, the system calculates the optimal portfolio for the
investor and sends him an e-mail with these
recommendations. The investor can then evaluate these
suggestions based on qualitative indicators, which cannot be
captured in the mathematical models. NELION tracks his
portfolio, provides regular transaction recommendations and updates via e-mail, and sends alerts via SMS to his mobile phone in case any one of the stocks in his portfolio undergoes dramatic swings.
The Auto-Investor function in NELION autonomously
simulates the trading behavior without intervention, providing
an objective means to evaluate the success of the system as a
real world application. Our comparisons of the U.S. markets
show that on average our fully automated investment agents
performed better than the major indexes in our test period of
one year.
As an initial experiment, these are promising results.
However, the system has also shown that a number of further
features could increase the flexibility and profitability of the
system.
Increased Diversity in the Input Data Set
The models are currently only based on historic data of the
time series itself. It is reasonable to assume that adding other
information to the input vector of the model will improve the
quality of the predictions by reducing the model error. This
information can include stock prices and trading volumes of
other stocks or indexes as well as price/earnings or other
ratios. A genetic algorithm can then search for effective input
combinations for each stock.
Generalized Data Pre-processing
Parkinson proposed an idea of pre-processing the input data
of time series analysis in order to improve the model quality
[Parkinson 1999]. By generalizing this approach and allowing
any number of combinations of both pre- and post-processing,
one can assume that the quality of the models can be
enhanced. Again, a genetic algorithm could be used to search
the parameter space for the best combinations.
Identifying Appropriate Age of Historic Input Data
Presently, NELION uses all available historic data to generate
models. This is likely to include periods where the dynamic of
the stock has undergone changes, leading to reduced overall
performance. It is possible to shorten the input data to an
appropriate length by identifying specific, current
characteristics of it. This measure would improve the model
quality and will reduce computation time, offsetting the
increase in parameter space proposed by the previous
suggestions.
Additional Attributes for each Stock
Each stock can be associated with a region and industry. By
allowing each investor to assign a subjective risk factor for
these categories and factoring this value into the risk equation,
the investor can focus on opportunities that conform to his
preference or that, in his estimation, promise above-average
returns.
Including a Prediction Error for each Forecast
The current approach defines the reliability of the model by
using the out-of-sample error. By enhancing all models to
associate an error with each specific prediction, it should be
possible to identify stocks, which have entered a phase of
unpredictability.
Including a Qualitative Explanation Module
With its current functionality, NELION offers valuable
assistance to the informed investor by directing his attention
to promising opportunities. In order to appeal to novices as
well, the system would need to include a qualitative
explanation module. This function would support the
recommendations with copies of, or links to, articles on the
Internet that shed more light on a suggestion. Additionally,
the module could automate the explanation of financial ratios
and charting techniques, helping less experienced investors
judge the validity of the system's forecasts.
NELION represents a first attempt at an automated stock
prediction and portfolio management system. In its present
form, it shows promise and can be used effectively as a tool.
Based on the experience gained from the extensive tests, it is
possible to refine this first version and take the integrated
approach to the next level of sophistication.

8 Appendix A: Investor Profiles
In order to specify his risk adversity, an investor has to define
numeric values for four parameters. The Correlation
parameter specifies the emphasis NELION should place on
the correlation between the stocks when identifying the
optimal portfolio. The Error and Volatility parameters
determine whether stocks with a low model error or a low
volatility are favored. Lastly, the Volume parameter performs
the same function, based on the trading volume of the previous
trading day.
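For illustration, such a profile can be written as four weights on the
0.125 grid used in the experiments; the values below are a made-up
example, not one of the profiles derived in this appendix.

    # Hypothetical investor profile; each weight lies on the grid
    # 0.000 to 1.000 in steps of 0.125 used throughout the experiments.
    profile = {
        "correlation": 0.750,   # emphasis on inter-stock correlation
        "error":       0.875,   # preference for stocks with a low model error
        "volatility":  0.625,   # preference for stocks with a low volatility
        "volume":      0.500,   # weight on the previous day's trading volume
    }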
Chapter 6 explains the details of how the experiments were
conducted. The results were a collection of conservative and
high-risk investor profiles for 1998 and 1999. For each of
these four categories, I created cross-tab queries, each
comparing two of the four parameters, resulting in the six
tables shown in this appendix.
By selecting the combination that occurred most frequently
across the six tables, it was possible to identify the first two
parameters for each category. This combination is highlighted
with a dark grey background.
Given these two parameters, I highlighted with a light grey
background the four columns and/or rows of the remaining five
tables that contained either of the two parameter values. From
these, I chose the combination with the highest occurrence,
thereby defining the third parameter.
Lastly, the fourth parameter was selected in the same manner
from the remaining three tables. The rows and columns defined
by the first three parameters were highlighted with a medium
grey background, or with a medium grey strip on the left side
of the column or at the bottom of the row if they had already
been highlighted with a light grey background.
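The selection procedure described above can be summarized as a greedy
search over the six cross tabs. The sketch below assumes the tables are
available as dictionaries mapping a (row value, column value) pair to
its count; the data structure and the function are illustrative, not
part of NELION.

    PARAMS = ["correlation", "error", "volatility", "volume"]

    def select_profile(crosstabs):
        # crosstabs maps a parameter pair, e.g. ("correlation", "error"),
        # to a dictionary {(row_value, column_value): count}.
        chosen = {}
        # Step 1: the single most frequent cell over all six tables fixes
        # the first two parameters.
        (pair, cell), _ = max(
            (((pair, cell), count)
             for pair, table in crosstabs.items()
             for cell, count in table.items()),
            key=lambda item: item[1])
        chosen[pair[0]], chosen[pair[1]] = cell
        # Steps 2 and 3: repeatedly pick the most frequent cell that agrees
        # with the values chosen so far and fixes at least one new parameter.
        while len(chosen) < len(PARAMS):
            candidates = []
            for (p, q), table in crosstabs.items():
                for (pv, qv), count in table.items():
                    values = {p: pv, q: qv}
                    agrees = all(chosen.get(k, v) == v for k, v in values.items())
                    adds_new = any(k not in chosen for k in (p, q))
                    if agrees and adds_new:
                        candidates.append((values, count))
            if not candidates:
                break
            best, _ = max(candidates, key=lambda item: item[1])
            chosen.update(best)
        return chosen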
8.1 Conservative Investors
8.1.1 January 1, 1998 – December 31, 1998

Error (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 4 3 4 8 1
0.250 4 4 2 1 10 2
0.375 4 7 1 3 10 5 4 1
0.500 3 9 3 4 2 3 4 1
0.625 1 6 5 4 4 1 1
0.750 2 2 3 2 7 1
0.875 6 2 9 2 1 3
1.000 4 3 2 5 2 5 10

Volume (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 4 3 3 10
0.250 2 2 2 13 4
0.375 2 1 4 11 2 14 1
0.500 1 5 1 1 6 5 10
0.625 1 1 4 4 4 4 4
0.750 2 2 5 7 1
0.875 3 6 2 4 5 3
1.000 1 2 2 1 9 8 2 6

Volume (columns) by Error (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 2 4 2 8 4
0.125 1 1 3 5 12 4 8
0.250 4 2 2 4
0.375 7 2 4 3 3 4
0.500 18 6 7
0.625 2 6 1
0.750 37 5
0.875 2 16 4 3
1.000 2 2

Correlation (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 1 1 1 2 2
0.125 2 3 2 2 2 1 6
0.250 1 1 3 4 1 2 3 5
0.375 3 3 2 1 2 4
0.500 2 5 3 1 2 2
0.625 3 3 1 1 1 1 3
0.750 11 8 10 1 11 5 4 4
0.875 2 5 11 14 2 6 9 5
1.000 4

Error (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 1 1 4 1
0.125 4 1 3 3 1 3 3
0.250 4 4 1 1 6 3 1
0.375 2 1 1 3 3 1 2 2
0.500 5 5 2 1 2
0.625 2 3 1 2 1 4
0.750 6 6 2 9 11 3 17
0.875 6 12 2 2 11 10 11
1.000 4

Volume (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 1 2 4
0.125 1 2 2 1 2 4 4 2
0.250 1 1 2 5 2 8 1
0.375 2 1 2 3 4 2 1
0.500 2 1 4 3 2 3
0.625 3 1 3 1 4 1
0.750 2 8 3 7 10 4 18 2
0.875 4 2 4 12 9 21 2
1.000 4
8.1.2 January 1, 1999 – December 31, 1999

Error (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 3 4 4 2 4 4 2 2 2
0.250 1 1 3 8 4 3 1 2 3
0.375 2 5 3 5 4 2 1 2
0.500 2 7 3 4 6 4 3 1 2
0.625 2 4 4 3 5 4 1 2 1
0.750 1 2 3 3 6 2 3 3 3
0.875 2 5 3 2 4 5 3 3 1
1.000 1 3 4 6 4 6 3 5 2

Volume (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 13 1 1 1 2 2 2 5
0.250 10 1 1 3 6 5
0.375 14 1 1 5 3 4
0.500 8 1 6 5 6 6
0.625 11 1 4 3 2 5
0.750 13 2 1 1 4 5
0.875 12 3 1 5 3 4
1.000 12 2 5 4 5 6

Volume (columns) by Error (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 10 1 2 1
0.125 9 2 5 2 2 6
0.250 8 3 5 6 7
0.375 5 2 2 9 7 6
0.500 6 4 1 5 5 9 8
0.625 17 2 2 4 7
0.750 9 1 2 3 3
0.875 13 1 1 2 2
1.000 12 1 1 1 1

Correlation (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 2 2 1 3 2
0.125 2 3 4 1 1 4 4 5
0.250 2 6 4 6 7 2 4
0.375 4 3 4 4 2 2 3 8
0.500 2 4 2 8 2 4 1 3
0.625 4 4 6 2 3 1 4 1
0.750 3 2 1 4 7 2 4 6
0.875 4 2 1 3 1 3 3 3
1.000 4 2 4 5 4 3 4 2

Error (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 1 2 1 1 1 3 1
0.125 1 7 5 4 1 3 1 2
0.250 3 6 6 8 2 1 4 1
0.375 1 4 5 5 7 4 2 1 1
0.500 1 2 3 5 5 5 1 2 2
0.625 3 1 6 6 2 4 1 2
0.750 1 4 5 2 4 9 3 1
0.875 1 3 2 2 1 2 6 3
1.000 10 5 1 7 1 1 3

Volume (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 1 1 3 1 4
0.125 7 2 4 3 2 6
0.250 8 2 4 1 7 9
0.375 11 1 3 5 3 7
0.500 12 1 1 2 7 3
0.625 11 1 1 6 2 4
0.750 10 1 2 2 5 5 4
0.875 10 1 3 1 2 3
1.000 19 2 1 2 2 2
8.2 High Risk Investors
8.2.1 January 1, 1998 – December 31, 1998

Error (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 1 2 1
0.250 1
0.375 1 1
0.500 2 1 1
0.625
0.750 1 1
0.875 6 1
1.000 1 1

Volume (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 2
0.250 3
0.375 2
0.500 3 1
0.625
0.750 2
0.875 7
1.000 2

Volume (columns) by Error (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 6
0.125
0.250 4
0.375 2
0.500 2
0.625
0.750 2 1
0.875 4
1.000 1

Correlation (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 1 2 1 1 1 1
0.125
0.250 1 1
0.375
0.500 1 1
0.625
0.750 1
0.875 1 6 1
1.000 2

Error (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 2 1 3 1
0.125
0.250 2
0.375
0.500 2
0.625
0.750 1
0.875 6 2
1.000 1 1

Volume (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 6 1
0.125
0.250 2
0.375
0.500 2
0.625
0.750 1
0.875 8
1.000 2
8.2.2 January 1, 1999 – December 31, 1999

Error (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 9 8 10 11 10 10 10 6 8
0.250 10 10 9 9 13 8 9 6 10
0.375 9 9 10 9 12 9 8 8 8
0.500 9 6 8 8 10 8 8 8 11
0.625 7 7 9 9 8 8 8 8 8
0.750 8 8 9 9 9 10 8 8 7
0.875 7 9 9 7 12 7 10 7 10
1.000 9 6 8 10 11 7 8 8 9

Volume (columns) by Correlation (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000
0.125 65 2 3 3 6 3
0.250 69 1 3 3 2 3 3
0.375 67 2 2 1 4 6
0.500 67 1 1 3 4
0.625 67 2 3
0.750 64 1 1 1 2 4 3
0.875 65 1 3 9
1.000 65 1 2 2 1 5

Volume (columns) by Error (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 62 3 2 1
0.125 57 1 1 4
0.250 59 1 2 5 5
0.375 60 1 2 1 1 3 4
0.500 65 2 2 2 2 1 4 7
0.625 55 2 4 6
0.750 61 1 2 3 2
0.875 50 2 2 5
1.000 60 4 4 3

Correlation (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 10 11 11 9 8 8 9 7
0.125 7 9 7 9 7 6 7 7
0.250 8 6 9 10 8 7 9 8
0.375 8 10 5 7 10 8 9 8
0.500 11 8 7 7 7 8 9 9
0.625 7 6 9 6 8 10 10 9
0.750 9 10 10 10 7 10 8 8
0.875 11 12 13 8 9 8 9 9
1.000 11 12 11 10 8 11 8 11

Error (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 9 8 8 6 11 8 6 8 9
0.125 10 4 5 8 6 7 7 6 6
0.250 8 8 8 7 9 7 7 3 8
0.375 7 7 6 10 7 6 9 6 7
0.500 7 7 9 8 9 6 7 7 6
0.625 8 7 8 7 8 8 4 8 7
0.750 8 8 6 7 10 6 7 11 9
0.875 9 8 11 8 12 11 11 1 8
1.000 2 6 11 11 13 8 11 9 11

Volume (columns) by Volatility (rows)
0.000 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
0.000 67 1 1 2 2
0.125 56 1 1 1
0.250 63 1 1
0.375 59 2 1 1 2
0.500 59 1 1 1 2 2
0.625 59 1 5
0.750 57 2 1 1 6 5
0.875 57 2 1 6 4 9
1.000 52 1 3 3 4 8 11
9 Appendix B: Portfolio History
This appendix shows all transactions for the two test investors.
The first column contains the ticker abbreviation of the stock
as defined in Appendix E. The second column shows the
transaction date. The amount in the third column is positive
for purchases and negative for sales. Finally, the stock price at
the time the transaction took place is shown in the last column.
Please note that these prices are adjusted for stock splits as of
May 15, 2000.
9.1 Transactions by Conservative Investor
Ticker Date Amount Price
MADGF 5/15/1999 208 $ 3.50
GAP 5/15/1999 23 $ 31.56
IBM 5/15/1999 2 $239.25
SPGLA 5/15/1999 70 $ 8.50
DOW 5/15/1999 7 $135.31
FTLAF 5/15/1999 80 $ 11.88
AAPL 5/15/1999 28 $ 44.38
AOL 5/15/1999 10 $125.25
BI 5/15/1999 13 $ 11.00
COMS 5/15/1999 4 $ 28.56
AXP 5/15/1999 6 $120.75
BS 5/15/1999 118 $ 9.13
CSCO 5/15/1999 2 $115.44
CRUS 5/15/1999 32 $ 7.56
EK 5/15/1999 3 $ 77.63
IT 5/15/1999 5 $ 24.06
IBM 5/23/1999 1 $230.38
BA 5/23/1999 3 $ 44.94
DOW 5/23/1999 -2 $126.19
AOL 5/23/1999 -4 $126.44
AXP 5/23/1999 1 $120.25
CSCO 5/23/1999 1 $113.25

DOW 9/11/1999 5 $113.13
AAPL 9/11/1999 -5 $ 77.44
AOL 9/11/1999 -5 $ 96.31
AXP 9/11/1999 -2 $140.00
CHV 9/11/1999 4 $ 95.88
DOW 9/19/1999 -6 $115.63
AAPL 9/19/1999 -17 $ 76.94
AXP 9/19/1999 -4 $139.88
CVD 9/19/1999 48 $ 8.75
EBF 9/19/1999 46 $ 8.56
DZB 9/19/1999 17 $ 24.00
HSIC 9/19/1999 18 $ 15.81
MADGF 11/14/1999 -208 $ 2.44
GAP 11/14/1999 -23 $ 27.25
IBM 11/14/1999 -3 $ 95.88
SPGLA 11/14/1999 -70 $ 12.44
DOW 11/14/1999 -4 $118.56
AAPL 11/14/1999 -6 $ 90.63
BI 11/14/1999 72 $ 4.75
COMS 11/14/1999 28 $ 33.38
BS 11/14/1999 -67 $ 6.06
CHV 11/14/1999 -4 $ 91.38
CSCO 11/14/1999 -3 $ 83.44
IT 11/14/1999 32 $ 11.44
ANGN 11/14/1999 536 $ 1.56
CVD 11/14/1999 -48 $ 11.25
EBF 11/14/1999 46 $ 9.00
DZB 11/14/1999 -17 $ 22.38
SHN 11/14/1999 846 $ 1.50
UMGI 11/14/1999 28 $ 70.19
DAB 11/14/1999 48 $ 10.31
HBNK 11/14/1999 60 $ 18.56
BS 1/23/2000 -101 $ 7.50
DAB 1/23/2000 -95 $ 6.88
BI 2/12/2000 -85 $ 3.56
BS 2/12/2000 50 $ 5.63
DAB 2/12/2000 47 $ 7.88
HBNK 2/12/2000 -60 $ 16.13
FLSC 2/19/2000 210 $ 1.88
IT 5/7/2000 -37 $ 13.44
SHN 5/7/2000 -846 $ 0.81
IT 5/14/2000 37 $ 12.13
SHN 5/14/2000 846 $ 0.63
IT 7/14/2000 -37 $ 13.31
FLSC 7/14/2000 -210 $ 2.38
GPS 7/15/2000 22 $ 38.19

9.2 Transactions by High Risk Investor
Ticker Date Amount Price
MSFT 5/15/1999 126 $ 76.88
AOL 5/15/1999 1 $125.25
INTC 5/23/1999 2 $ 57.00
MSFT 5/23/1999 -9 $ 77.56
AAPL 5/23/1999 6 $ 43.94
AOL 5/23/1999 2 $126.44
BS 5/23/1999 18 $ 9.00
MSFT 5/30/1999 -2 $ 80.69
AOL 5/30/1999 1 $119.25
INTC 6/12/1999 42 $ 54.44
MSFT 6/12/1999 -100 $ 78.13
AAPL 6/12/1999 -4 $ 46.44
AOL 6/12/1999 55 $ 99.50
BS 6/12/1999 18 $ 7.56
INTC 6/19/1999 12 $ 54.94
MSFT 6/19/1999 6 $ 85.00
AOL 6/19/1999 -17 $112.00
CSCO 6/19/1999 3 $119.38
MSFT 7/3/1999 6 $ 90.19
AOL 7/3/1999 -3 $110.00
DELL 7/10/1999 24 $ 42.81
INTC 7/10/1999 4 $ 66.25
AAPL 7/10/1999 9 $ 55.63
AOL 7/10/1999 -23 $128.25
BS 7/10/1999 90 $ 8.06
CSCO 7/10/1999 9 $ 67.06
MSFT 7/17/1999 3 $ 99.44
BS 7/17/1999 -60 $ 8.13
IBM 7/24/1999 5 $124.81
BS 7/24/1999 -46 $ 8.06
IBM 7/31/1999 -2 $125.69




10 Appendix C: Screen Shots

This appendix contains screen shots of the following NELION views:

The Task Agent
An E-Mail Recommendation
The Administration Tool
The New Task Dialog
The Investor List
General Investor Information
The Graphical Portfolio History
The Investor Transactions
The Current Investor Portfolio
The Return on Investment
The Stock List
General Stock Information
The Graph of the Stock Price
The Correlation between Stocks
11 Appendix D: Conceptual Data Model

12 Appendix E: Stocks Tracked in the
Simulation
3Com COMS
ACE*COMM Corp. ACEC
Actrade International, Ltd. ACRT
ADC, Inc. ADCT
Air Express International AEIC
Air Transportation Holding AIRT
Airborne Freight Corp. ABF
AirNet Systems, Inc. ANS
AK Steel Holdings AKS
Allied Irish Banks, p.l.c AIB
Alterra Healthcare Corp. ALI
America Online AOL
American Express AXP
American Home Products AHP
AMRESCO, Inc. AMMB
ANGEION Corp. ANGN
Apple Computer AAPL
Applied Materials, Inc. AMAT
Archer Daniels Midland Co ADM
Argentaria, Caja Postal y AGR
ARV Assisted Living, Inc. SRS
AT&T T
Australia and New Zealand ANZ
Autodesk ADSK
Avon AVP
Banco Comercial Portugues BPC
Banco Santander Central STD
Bank of Montreal BMO
Bank of Tokyo-Mitsubishi MBK
Barclays PLC BCS
Bausch + Lomb BOL
Bell Industries BI
Bestfoods BFO
Bethlehem Steel BS

Beverly Enterprises, Inc. BEV
Black + Decker BDK
Blonder Tongue Laboratories BDR
Boca Research, Inc. BOCI
Boeing BA
Boston Communications Group BCGI
Campbell Soup Company CPB
Caterpillar CAT
Celadon Group, Inc. CLDN
Centigram Communications CGRM
Chase Manhattan Bank CMB
Checkpoint Software CHKP
Chevron CHV
Circle International Group CRCL
Cirrus Logic CRUS
Cisco CSCO
Citigroup C
Citrix CTXS
CNA Financial Corp. CNA
Coca-Cola COKE
Colgate-Palmolive CL
Compaq Computers CPQ
Computer Associates CA
ConAgra, Inc. CAG
Converse Inc. CVE
C-Phone Corp. CFON
Cutter + Buck Inc. CBUK
Cymer, Inc. CYMI
Data Broadcasting Corp. DBCC
Dave & Buster's Inc. DAB
Deere + Co. DE
Delhaize America Inc. DZB
Dell Computers DELL
Delta Airlines DAL
Dexter Corp. DEX
Dole Food Company, Inc. DOL
Dow Chemicals DOW
Eagle USA Airfreight, Inc. EUSA
Eastman Kodak EK
EIS International, Inc. EISI
Emeritus Assisted Living ESC
Ennis Business Forms, Inc. EBF
EXECUTONE ELOT
Expeditors International EXPD
Exxon XOM
FDX Corp. FDX
First Albany Companies Inc. FACT
Florsheim Group Inc. FLSC
Fonix Corp. FONX
Ford Motors F
Fresh America Corp. FRES
Fritz Companies, Inc. FRTZ
Fruit of the Loom FTL
Gartner Group IT
General Electric GE
General Mills, Inc. GIS
General Motors GM
Genesis Health Ventures GHV
Great Atlantic and Pacific Tea Co. GAP
Greenbriar Corp. GBR
Gucci Group N.V GUC
H.D. Vest, Inc. HDVS
H.J. Heinz Company HNZ
Harvey Entertainment HRVY
HCC Insurance Holdings HCC
Healthsouth Corp. HRC
Henry Schein Inc. HSIC
Hewlett Packard HWP
Highland BanCorp. Inc. HBNK
Hoenig Group Inc. HOEN
Hollywood Park Inc. HPK
Hub Group, Inc. HUBG
HUMANA Inc. HUM
IBM IBM
Imperial Credit Industries ICII
Intel Corp. INTC
InterVoice-Brite Inc. INTV
Jaclyn, Inc. JLN
Justin Industries, Inc. JSTN
Kellogg Company K
Kenneth Cole Productions KCP
K-Swiss Inc. KSWS
LaCrosse Footwear, Inc. BOOT
LandAir Corp. LAND
Leather Factory, Inc. TLF
Lifeline Systems, Inc. LIFE
LSI Logic LSI
Lucent Technology LU
M. H. Meyerson & Co., Inc. MHMY
Madge Networks MADGF
Manor Care, Inc. HCR
Maytag Corp. MYG
Mediaone Group Inc. UMG
Metretek Technologies MTEK
Microlog Corp. MLOG
Micron Technology MU
Microsoft Corp. MSFT
MTI Technology Corp. MTIC
Nabisco Holdings Corp. NA
Nabors Industries Inc. NBR
National Australia Bank NAB
National Discount Brokers NDB
National HealthCare Corp. NHC
National Semiconductor NSM
National Westminster Bank NW
Natural MicroSystems Corp. NMSS
Nexus Telecomm Sys Ltd. NXUSF
Nike NKE
Noble Drilling Corp. NE
NUWAVE Technologies, Inc. WAVE
Oracle ORCL
Orange Telephone PLC ORNGY
Pepsi Co. PEP
Pfizer PFE
Pittston BAX Group PZX
Professional BanCorp., Inc. MDB
Quaker Oats Company OAT
Ralston Purina Company RAL
RCM Technologies, Inc. RCMT
Reebok International Ltd. RBK
Res-Care, Inc. RSCR
Research Partners Intl RPII
Response USA, Inc. RSPN
Rite Aid Corp. RAD
Rockwell International ROK
Ross Stores ROST
Royal Bank of Canada RY
Salton, Inc. SFP
Sara Lee Corp. SLE
Saucony, Inc. SCNYA
Sawtek Inc. SAWS
Schering Plough Corp. SGP
Shoney's, Inc. SHN
Smithfield Foods Inc. SFD
Southwest Securities Group SWS
Spiegel Corp. SPGLA
Starbucks Coffee SBUX
Stifel Financial Corp. SF
Stride Rite Corp. SRR
Sun Microsystems SUNW
Sunrise Assisted Living SNRZ
Swank, Inc. SNKI
Tandy Brands Accessories TBAC
Texas Instruments TXN
Textron Inc. TXT
The Gap GPS
Time Warner TWX
Toronto-Dominion Bank TD
Total Oil Co. TOT
Track Data Corp. TRAC
Unilever N.V. UN
Unilever PLC UL
United Airlines UAL
United Shipping USHP
Value City Department Stores VCD
Vans, Inc. VANS
Veritas DGC Inc. VTS
Verity Inc. VRTY
Vicon Industries, Inc. VII
VISX, Inc. VISX
Vodafone Airtouch VOD
VTEL Corp. VTEL
Wal-Mart WMT
Wendt-Bristol Health Services WMD
Westpac Banking Corp. WBK
Xerox XRX

13 Appendix F: Bibliography
(1) Ambachtsheer, Keith, The Economics of Pension Fund
Management, Financial Analysts Journal, Nov.-Dec.
1994, p.21-31.
(2) Anderson, The Box-Jenkins Approach, Butterworths,
1976.
(3) Aoki, Masanao Notes on Economic Time Series Analysis:
System Theoretic Perspectives, Springer Verlag, 1983.
(4) Arthur, W.B., J. H. Holland, B. LeBaron and P. Tayler,
Asset Pricing under Endogenous Expectations in an
Artificial Stock Market, SFI Studies in the Sciences of
Complexity, Vol. XXVII, Addison-Wesley, 1997.
(5) Baldi, Pierre and Yves Chauvin, Smooth On-Line Learning
Algorithms for Hidden Markov Models, Technical Paper of
the Jet Propulsion Laboratory, 1993.
(6) Baldi, P., et al, Hidden Markov Models in molecular
biology: New algorithms and applications, Advances in
Neural Information Processing Systems, volume 5,
Morgan Kaufmann, San Mateo, 1993.
(7) Barber, Brad M., Terrance Odean, The Courage of
Misguided Convictions: The Trading Behavior of
Individual Investors, Technical Report of the Graduate
School of Management, UC Davis, Davis, California,
1999.
(8) Barnsley, Michael F., Fractals Everywhere, 2nd Edition,
Academic Press Professional, 1993.
(9) Baum, L. E., T. Petrie, G. Soules and N. Weiss, A
Maximization Technique Occurring in the Statistical
Analysis of Probabilistic Functions of Markov Chains,
Ann. Math. Stat. 41(1): 164-171, 1970.
(10) Berenson, Mark L. and David M. Levine, Statistics for
Business & Economics, Prentice Hall, 1990.
(11) Bhattacharyya, M. N. Comparison of Box-Jenkins and
Bonn Monetary Model Prediction Performance, Lecture
Notes in Economics and Mathematical Systems, Springer
Verlag, 1980.
(12) Bishop, George W., Jr., Charles Dow and the Dow
Theory, Appleton, Century Crofts, Inc., New York, 1960.
(13) Boldt, B. and H. Arbit, Efficient markets and the
professional investor, Financial Analysts Journal, 40,
July-August 1984, 22-34.
(14) Bookstaber, Richard, The Complete Investment Book,
Scott, Foresman and Company, 1985.
(15) Borscheid, Peter Vom verdienten zum erzwungenen
Ruhestand. Wirtschaftliche Entwicklung und der Ausbau
des Sozialstaates, Digitale Bibliothek der Friedrich Ebert
Stiftung, 1999.
(16) Box, George E. P. and Gwilym M. Jenkins, Time Series
Analysis, 2nd Edition, Holden-Day, 1976.
(17) Braun, Susanne, Neuronale Netze in der Aktienprognose,
article contributed to Neuronale Netze in der Ökonomie,
Eds. Heinz Rehkugler and Hans Georg Zimmermann,
Verlag Vahlen, 1994.
(18) Brockwell, Peter J. and Richard A. Davis, Time Series:
Theory and Methods, Springer Verlag, 2nd Edition, 1986.
(19) Casdagli, Martin Nonlinear Prediction of Chaotic Time
Series, Physica D35 (1989), 335-356.
(20) Chen, Shu-Heng, Chia-Hsuan Yeh and Chung-Chih Liao,
Testing for Granger Causality in the Stock Price-Volume
Relation: A Perspective from the Agent-Based Model of
Stock Markets, Proceedings of the Fifth Joint Conference
on Information Science, Volume II, Atlantic City, New
Jersey, USA, 2000.
(21) Day, Shawn P. and Michael R. Davenport, Continuous-
Time Temporal Back-Propagation with Adaptable Time
Delays, submitted to IEEE Transactions on Neural
Networks, 1992.
(22) de Groot, C. and D. Würtz, Analysis of univariate time series
with connectionist nets: A case study of two classical
examples, Neurocomputing 2, Elsevier, 1990/91, 177 -
192.
(23) de Jong, Eelke Exchange Rate Determination and
Optimal Economic Policy Under Various Exchange Rate
Regimes, Lecture Notes in Economics and Mathematical
Systems, (1991).
(24) Deppisch, Hierarchical training of neural networks and
prediction of chaotic time series, Physics Letters A 158
(1991), 57-62.
(25) Dihardjo, Herlina and Clarence Tan, Moving Average As
Technical Indicators And Artificial Neural Network Model
Trading System for Australian Dollar Market,
Proceedings of the Conference on Advanced Investment
Technology, Bond University, Australia, 1999.
(26) Dobbins, Richard and Stephen F. Witt, Portfolio Theory &
Investment Management, Martin Robinson, Oxford, 1983.
(27) Doya, Kenji Bifurcations in the Learning of Recurrent
Neural Networks, Proceedings of 1992 IEEE International
Symposium on Circuits and Systems, 2777-2780.
(28) Doya, Kenji Bifurcations of Recurrent Neural Networks in
Gradient Descent Learning, submitted to IEEE
Transactions on Neural Networks, 1993.
(29) Elsner, J. B., Predicting time series using a neural network as a
method of distinguishing chaos from noise, Journal of
Physics A 25 (1992), 843-850.
(30) Fahlman, S. E. and C. Lebiere, The cascade-correlation
learning architecture, Advances in neural information
processing systems 2, D. S. Touretzky (Eds.) pp. 524-
532, 1990.
(31) Farmer, J. Doyne, Andrew W. Lo, Frontiers of Finance:
Evolution and Efficient Markets, Frontiers of Science,
1998.
(32) Financial Training Company, The, Finance for Non-
Financial Manager, Course Training Materials, 1998.
(33) Frankel, J. and K. Froot, Understanding the U.S. Dollar in
the Eighties: The Expectations of Chartists and
Fundamentalists, The Economic Record, vol.62, pp.24-
38, 1986.
(34) Fraser, Andrew M. and Alexis Dimitriadis, Forecasting
Probability Densities by Using Hidden Markov Models
with Mixed States, article contributed to Time Series
Prediction: Forecasting the Future and Understanding the
Past, Eds. Andreas S. Weigend and Neil A. Gershenfeld,
Santa Fe Institute Studies in the Sciences of Complexity,
Proc. Vol. XV, Addison-Wesley, 1993.
(35) Goh, T. H., P. Z. Wang and H. C. Lui, Learning Algorithm for the
Enhanced Fuzzy Perceptron, Institute of Systems
Science, Technical Report presented at the National
University of Singapore, 1991.
(36) Granger, C. W. J., Economic processes involving
feedback, Information and Control, 6(1): 28-48, March
1963.
(37) Granger, C. W. J. Modeling Economic Series, Clarendon
Press, 1990.
(38) Hastings, Harold M. and George Sugihara, Fractals - A
User's Guide for the Natural Sciences, Oxford Science
Publications, 1993.
(39) Heller, Robert, Business Masterminds: Warren Buffett,
Dorling Kindersley, 2000.
(40) Hertz, John et al, Introduction to the Theory of Neural
Computation, Santa Fe Institute Studies in the Sciences
of Complexity, Addison-Wesley Publishing Company,
1992.
(41) Hsieh, David, Chaos and Nonlinear Dynamics:
Application to Financial Markets, Technical Report
presented at the Fuqua School of Business, Duke
University, Durham, North Carolina, 1990.
(42) Hsieh, David, Implications of Nonlinear Dynamics for
Financial Risk Management, Journal of Financial and
Quantitative Analysis, March 1993.
(43) Hutchinson, James M., A Radial Basis Function Approach
to Financial Time Series Analysis, Department of
Electrical Engineering and Computer Science,
Massachusetts Institute of Technology, 1994.
(44) Ivarsson, Per H Weak-Form Efficient Market Hypothesis
in the Interbank Foreign Exchange Market, Technical
Report, Institute of Theoretical Physics, December 16,
1997.
(45) Jensen, Michael C, The Performance of Mutual Funds in
the Period 1945-64, Journal of Finance 23, 389-416,
1968.
(46) Joshi, S. and M. A. Bedau, An Explanation of Generic
Behavior in an Evolving Financial Market. In R. Standish,
B. Henry, S. Watt, R. Marks, R. Stocker, D. Green, S.
Keen, T. Bossomaier, eds., Complex Systems '98--
Complexity Between the Ecos: From Ecology to
Economics, Complexity Online Network; Sydney, pp. 327-
335, 1998.
(47) Kahneman, Daniel and Amos Tversky, Prospect Theory:
An Analysis of Decision Making under Risk,
Econometrica, 1979.
(48) Karpoff, J.M. The Relation between Price Changes and
Trading Volume: A Survey, Journal of Financial and
Quantitative Analysis, Vol. 22, No. 1, pp. 109-126, 1987.
(49) Kerr, Edward Financial Management Study Notes,
University of Hertfordshire, Business School, Lecture
Notes, March 1997.
(50) Kosko, Bart Neural Networks and Fuzzy Systems,
Prentice-Hall International Editions, 1992.
(51) Kumar, Kuldeep, Clarence Tan and Ranadhir Ghosh,
Using Chaotic Component and ANN for Forecasting
Exchange Rates in Foreign Currency Market,
Proceedings of the Conference on Advanced Investment
Technology, Bond University, Australia, 1999.
(52) Kurumatani, Koichi, Yuhsuke Koyama, Takao Terano,
Hajime Kita, Akira Namatame, Hiroshi Deguchi, Yoshinori
Shiozawa and Hitoshi Matsubara, Vsmart: A Virtual Stock
Market as a Forum for Market Structure Analysis and
Engineering, Proceedings of the Fifth Joint Conference
on Information Science, Volume II, Atlantic City, New
Jersey, USA, 2000.
(53) Langer, Mariensen, Quinke, Simulationsexperimente mit
ökonomischen Makromodellen, GMD, R. Oldenbourg
Verlag, 1984.
(54) LeBaron, Blake, Experiments in evolutionary finance,
Working Paper, Department of Economics, University of
Wisconsin-Madison, November 1995.
(55) Lewis, Michael, Liar's Poker, Hodder and Stoughton, page
38, 1989.
(56) Le Cun, Y., J. S. Denker and S. A. Solla, Optimal Brain
Damage, Advances in Neural Information Processing
Systems 2, Morgan Kaufmann Publishers, 1990.
(57) Lequarré, Jean Y., Foreign Currency Dealing: A Brief
Introduction (Data Set C), article contributed to Time
Series Prediction: Forecasting the Future and
Understanding the Past, Eds. Andreas S. Weigend and
Neil A. Gershenfeld, Santa Fe Institute Studies in the
Sciences of Complexity, Proc. Vol. XV, Addison-Wesley,
1993.
(58) Looman, Volker, Die Höhe der Altersversorgung ist eine
Frage der Selbstdisziplin, Frankfurter Allgemeine Zeitung,
Seite 26, Nr. 128, 3. Juni 2000.
(59) Naylor, Thomas H., Computer Simulation Experiments
with Models of Economic Systems, John Wiley & Sons,
Inc., 1971.
(60) Markowitz, Harry M., Portfolio Selection, Journal of
Finance, Vol. VII No. 1, 1952.
(61) Markowitz, Harry M., Portfolio Selection: Efficient
Diversification of Investments, John Wiley and Sons,
1959.
(62) Malaroiu, Simona, Kimmo Kiviluoto and Erkki Oja, Time
Series Prediction with Independent Component Analysis,
Proceedings of the Conference on Advanced Investment
Technology, Bond University, Australia, 1999.
(63) Motamen, Homa Economic Modelling in the OECD
Countries, Chapman and Hall, 1987.
(64) Mozer, Michael C., Neural Net Architectures for Temporal
sequence Processing, article contributed to Time Series
Prediction: Forecasting the Future and Understanding the
Past, Eds. Andreas S. Weigend and Neil A. Gershenfeld,
Santa Fe Institute Studies in the Sciences of Complexity,
Proc. Vol. XV, Addison-Wesley, 1993.
(65) Murata, Noboru et al, Network Information Criterion -
Determining the Number of Hidden Units for an Artificial
Neural Network Model, Department of Mathematical
Engineering and Information Physics, University of Tokyo,
1992.
(66) Niarchos, Nikitas A. and Christos A. Alexakis, Stock market
prices, causality and efficiency: evidence from the Athens
stock exchange, Applied Financial Economics, Volume 8,
Issue 2, pp. 167-174.
(67) Nienstedt, Heinz-Werner Ein Verfahren zur
Kurzfristprognose - Die Integration von Methoden der
Zeitreihenanalyse in ein ökonometrisches Modell,
Dissertation zum Erlangen des Grades eines Doktors der
Wirtschaftswissenschaften eingereicht am Fachbereich
Informatik der Technischen Universität Berlin, 1984.
(68) Ofer, Aharon R. and Arie Melnick, Price Deregulation in
the Brokerage Industry: An Empirical Analysis, pp. 633-
641, The RAND Journal of Economics, Volume 9, No. 2,
Autumn 1978.
(69) Paaß, Gerhard, Prognose und Asymptotik Bayesscher
Modelle, GMD, R. Oldenbourg Verlag, 1984.
(70) Packard, N. H. et al, Geometry from a Time Series, Physical
Review Letters, Vol 45, Number 9, (1980), 712-716.
(71) Parkinson, Alan, Financial Time Series Prediction using
Neural Networks: Approaches to Data Pre-Processing,
Proceedings of the Conference on Advanced Investment
Technology, 1999.
(72) Poritz, A.B., Hidden Markov Models: A guided Tour, IEEE
International Conference on Acoustics, Speech, and Signal
Processing, volume 198-8, 1988.
(73) Rabiner, L.R., A Tutorial on Hidden Markov Models and
Selected Applications in Speech Recognition, Proceedings
of the IEEE, volume 77-2, 1989.
(74) Rehkugler, Heinz und T. Poddig, Statistische Methoden
versus Künstliche Neuronale Netzwerke zur
Aktienkursprognose, Nr. 73/1990, T.U. Bamberg.
(75) Rehkugler, Heinz und Hans Georg Zimmermann,
Neuronale Netze in der Ökonomie, Verlag Vahlen, 1994.
(76) Reitz, Ulrich Marktprofil: Joachim Goldberg, Welt am
Sonntag, 6. Februar 2000.
(77) Renals, Steve Chaos in Neural Networks, Lecture Notes
in Computer Science, 90 - 99 (1990).
(78) Robinson, David M., Steven Zigomanis, Chartists,
Fundamentalists and Nonlinear Dependence,
Proceedings of the Conference on Advanced Investment
Technology, Bond University, Australia, 1999.
(79) Rössler, O.E. Chaos and Order, Physics Letters, A57,
397-402, (1976).
(80) Rohwer, Richard, The Moving Targets Training
Algorithm, Lecture Notes in Computer Science, 100-109
(1990).
(81) Rojas, Raul, Theorie der neuronalen Netze. Eine
systematische Einführung, Springer-Verlag, 1993.
(82) Rudzio, Kolja, Verflixte Psyche, Die Zeit, Nr. 41, 7.
Oktober 1999.
(83) Rüegg-Stürm, Controlling für Manager, Campus Verlag,
1997.
(84) Samuelson, Paul, Proof that Properly Anticipated Prices
Fluctuate Randomly, 1965.
(85) Sauer, Tim, Time Series Prediction by Using Delay
Coordinate Embedding, article contributed to Time
Series Prediction: Forecasting the Future and
Understanding the Past, Eds. Andreas S. Weigend and
Neil A. Gershenfeld, Santa Fe Institute Studies in the
Sciences of Complexity, Proc. Vol. XV, Addison-Wesley,
1993.
(86) Schöneburg, E., Stock price prediction using neural
networks: A project report, Neurocomputing 2, Elsevier,
1990/91, 17 - 27.
(87) Schwerk, Thomas, Künstliche Neuronale Netze zur
Beurteilung von Marktgrößen, Diplomarbeit, TU Berlin,
1994.
(88) Schwerk, Thomas, Forecasting of Time Series with
Neural Networks, Proceedings of The Third International
Congress on Industrial and Applied Mathematics,
Hamburg, Germany, 1995.
(89) Schwerk, Thomas, Portfolio Recommendations Based on
Non-Linear Models, Proceedings of the Conference on
Advanced Investment Technology, Bond University, Gold
Coast, Australia, 1999.
(90) Schwerk, Thomas, Using Non-Linear Mathematical
Models for Stock Portfolio Management, Proceedings of
the Fifth Joint Conference on Information Science,
Volume II, Atlantic City, New Jersey, USA, 2000.
(91) Schwert, G. William, The Capital Asset Pricing Model:
Theory, Tests and Extensions, William E. Simon
Graduate School of Business Administration, Lecture
Notes, 1997.
(92) Silva, Fernando M. and Luis B. Almeida, Acceleration
Techniques for the Backpropagation Algorithm, Lecture
Notes in Computer Science, 100 - 119 (1990).
(93) Trippi, Robert R. and Efraim Turban, Neural Networks in
Finance and Investing, Probus Publishing Company,
1993.
(94) Utans, Joachim and John Moody, Selecting Neural
Network Architectures via the Prediction Risk: Application
to Corporate Bond Rating Prediction, submitted to First
International Conference on Artificial Intelligence
Applications on Wall Street, IEEE Computer Society
Press, 1991.
(95) Wan, Eric, Time Series Prediction by Using a
Connectionist Network with Internal Delay Lines, article
contributed to Time Series Prediction: Forecasting the
Future and Understanding the Past, Eds. Andreas S.
Weigend and Neil A. Gershenfeld, Santa Fe Institute
Studies in the Sciences of Complexity, Proc. Vol. XV,
Addison-Wesley, 1993.
(96) Weymann, Peter, Kalman-Filter und -Glättung und deren
Anwendung auf Erwartungsbildungsmechanismen,
Dissertation an der Universität Fridericiana Karlsruhe,
1987.
(97) White, Halbert Economic prediction using neural
networks: the case of IBM daily stock returns, Proceedings
of the IEEE International Conference on Neural Networks,
San Diego, 1989, II-451 - II-459.
(98) Wong, S. and Pan Yong Tan, Neural Networks And
Genetic Algorithm For Economic Forecasting, submitted
to AI in economics and business administration, 1990.
(99) Wong, S. Time series forecasting using backpropagation
neural networks, Neurocomputing 2, Elsevier, 1990/91,
147 - 159.
(100) Wong, S. and P.Z. Wang, A stock selection strategy
using fuzzy neural networks, Neurocomputing 2, Elsevier,
1990/91, 233 - 242.
(101) Weigend, Andreas S. and Neil A. Gershenfeld, Time
Series Prediction: Forecasting the Future and
Understanding the Past, Santa Fe Institute Studies in the
Sciences of Complexity, Proc. Vol. XV, Addison-Wesley,
1993.
(102) Weigend, Andreas S. and David A. Nix, Predictions with
Confidence Intervals (Local Error Bars), working paper of
the Sonderforschungsbereich 373 at the Humboldt-
Universitt zu Berlin, 1994.
(103) Woodwell, Donald R., Automating your Financial
Portfolio, Dow Jones-Irwin, 1983.
(104) Zappa, Frank, Joe's Garage: Packard Goose, Munchkin
Music, 1979.
(105) Zimmermann, Hans Georg, Neuronale Netze als
Entscheidungskalkül, article contributed to Neuronale
Netze in der Ökonomie, Eds. Heinz Rehkugler and Hans
Georg Zimmermann, Verlag Vahlen, 1994.
