Master of Engineering
McGill University
Montreal, Quebec
2007-05-01
I thank Shie Mannor for his advice. I thank Hasan Mirza and Chantale Cardinal-
Watkins for reviewing my text. I thank my parents for their support.
ABSTRACT
algorithm not only “beats the market”, but can also beat the best stock. Our study
of the Anticor algorithm extends these results in several ways. First, we examine how
the Anticor algorithm performs on more recent market data. Second, we run Anticor
on several simulated markets, as part of an attempt to explain its performance.
Finally, we examine how the Anticor algorithm’s performance is affected when some
ABRÉGÉ
historical, showing that the Anticor algorithm not only “beats the market”
but can also outperform the best stock. Our study of the Anticor algorithm
adds to these results in several ways. First, we examine how
the Anticor algorithm performs on recent financial markets. Second,
we apply the Anticor algorithm to simulated markets in an attempt to explain
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
ABRÉGÉ
LIST OF TABLES
LIST OF FIGURES
1 Background Information
1.1 Agents and Monetary Resources
1.2 Financial Markets
1.3 Efficient Market Hypothesis
1.3.1 Weak-Form Efficient Market Hypothesis
1.3.2 Semi-Strong-Form Efficient Market Hypothesis
1.3.3 Strong-Form Efficient Market Hypothesis
1.3.4 Empirical Evidence
2 Portfolio Selection Problem
3.2.1 Constant rebalancing
3.2.2 The Universal Portfolio Algorithm
4 The Anticor Algorithm
4.1 Notation preliminaries
4.2 The algorithm
4.3 Compounded Algorithms
4.4 Compounding the Anticor Algorithm
4.5 Anticor Explorer
5 Transaction Cost Considerations
5.1 Brokerage Scheme Examples
5.2 Proportional Commission Model
5.2.1 Modifications to the Proportional Commission Model
6 Markets Used For Simulation
7.1 Overview
7.1.1 Total Return
7.1.2 Cumulative Return
7.1.3 In-Hindsight Geometric Mean Return
7.1.4 Sharpe Ratio
7.2 Market Detailed View
7.2.1 Old Historical Market Data
7.2.2 Recent Historical Market Data
7.3 Simulated Market Data
7.3.1 Modified Random Walk
7.3.2 Modified Autoregressive Model
8 Conclusion
8.1 Future extensions
A Dependence Matrix Generation Algorithm
B Indices Composition
B.1 Old Historical Market Data
B.2 Recent Historical Market Data
References
LIST OF TABLES
LIST OF FIGURES
7–18 Results for mrw
7–19 Results for mrw3
7–20 Results for mam0
7–21 Results for mam1
CHAPTER 1
Background Information
We start with some background information on the portfolio selection problem
in a semi-formal context. We explain how groups of agents come together to exchange
monetary resources, thus forming financial markets. We then provide a brief overview
of the efficient market hypothesis. Moving into the core subject matter, we formally
present the portfolio selection problem and carefully list the simplifying assumptions
made to render it more manageable. We then present
a series of portfolio selection algorithms, namely UBAH (the uniform buy-and-hold
strategy), BAH* (the optimal “in hindsight” buy-and-hold strategy), UCBAL (the
uniform constant rebalancing strategy) and CBAL* (the optimal constant rebalancing
strategy). We then present the markets used for simulation: old historical
Finance studies the ways agents allocate monetary resources over time. An
agent could be a person or an organization, while a monetary resource is anything
to which a numerical “dollar value” can be assigned.
An agent holding some monetary resources can decide either to consume them,
the value of the resource decreases. To invest a monetary resource is, simply, to not
consume it. Buying a house and living in it is both an investment and a consumption.
As the price of the house changes over time, a profit (or loss) may be realized and
that is the investment part. But by not leasing out the house, some revenue is not
the money; instead, it just involves exchanging it for another monetary resource.
In economics, the agent’s increase in happiness is measured by a numerical utility
function. Agents are usually assumed to always act in a way that maximizes their
utility. This characteristic is called rationality. It has been suggested, however, that
humans are not always rational. In [10], Kahneman and Tversky present a critique
of expected utility theory and, instead, propose an alternative model, called prospect
theory. We mention this because, as we will see in Section 1.3, the Efficient Market
Hypothesis assumes that agents are rational.
The mechanisms which allow agents to exchange monetary resources are called
financial markets. There are many types of financial markets, such as stock markets,
bond markets, commodities markets, futures markets, and foreign exchange markets.
In general, when a group of agents come together and exchange their monetary
resources, a financial market is created.
A commonly used tool for evaluating the characteristics of a group of related
etary resources has captivated the attention of many researchers both in and out of
academia. The fierce competition has led some to formulate the efficient
market hypothesis, which suggests that security prices adjust rapidly and rationally
to new information.
Since many researchers believe in the efficient market hypothesis, and since those
who firmly believe in it would probably find little interest in learning about the
Anticor algorithm, we will briefly present the essence of the efficient market
hypothesis. Following that, we will formally present the portfolio selection problem,
which the Anticor algorithm attempts to solve.
1.3 Efficient Market Hypothesis
always open for debate. If the efficient market hypothesis holds true, then agents
should not be able to consistently achieve above-average performance. In other
words, the likes of Warren Buffett are akin to lottery winners: gamblers who got
lucky. The efficient market hypothesis is based on the following assumptions:
sumptions, then the prices on the securities traded in this market reflect all known
information.
There are three ways to define what is meant by “information”, leading to three
different forms of the efficient market hypothesis: the weak-form, the semi-strong-
The weak-form efficient market hypothesis defines information as all the histor-
ical market data (expressible as large matrices of real numbers) including sequences
of prices, rates of return, trading volume, odd-lot transactions, block trades, and
exchange specialist transactions.
Consequently, the weak-form efficient market hypothesis states that “technical
analysis” will not be able to consistently produce excess returns. As we will see later,
this implies that algorithms such as the Anticor algorithm (which uses only historical
sequences of prices) should not be able to achieve consistent excess returns. On the
collective beliefs and expectations are rapidly (or instantly) reflected in the asset’s
prices. This implies that it is not possible to consistently outperform the market
using the information that the market already knows. The semi-strong-form implies
that neither technical nor fundamental analysis will allow agents to consistently
outperform the market. The only way to consistently outperform the market is
through luck or by obtaining and trading on material non-public information.
1.3.3 Strong-Form Efficient Market Hypothesis
1.3.4 Empirical Evidence
In [6] and [7], Fama presents the efficient market theory in terms of a fair game
model: an agent can be confident that current market prices fully reflect all available
CHAPTER 2
Portfolio Selection Problem
2.1 Problem Definition
by X.
At the start of each day $t$, an agent chooses a portfolio $b_t = [b_t(1), \ldots, b_t(m)]^\top$
satisfying $b_t(j) \ge 0$ for all $j$ and $\sum_j b_t(j) = 1$, where each entry $b_t(j)$ is the fraction
of the agent’s wealth invested in security $j$. These portfolios produce a total return
$\prod_{t=1}^{T} b_t^\top x_t$ over the $T$ days. The agent might not have access to the full market
sequence $X$ when choosing each $b_t$. (In practice, for example, an agent does not
have access to future market vectors.) Loosely stated, the goal in the portfolio
selection problem is to choose $b_t$ to achieve a “good” return, given the information
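As a minimal sketch of the return computation described above, the following Python function multiplies the daily returns b_t · x_t of a portfolio sequence. The function name `total_return` and the list-of-lists representation of the market sequence are illustrative assumptions, not part of the original text.

```python
def total_return(portfolios, market):
    """Total return of a portfolio sequence over a market sequence.

    portfolios -- list of T portfolios b_t (each a list of m weights)
    market     -- list of T market vectors x_t (each a list of m relative prices)
    Both names and the data layout are hypothetical, chosen for illustration.
    """
    ret = 1.0
    for b, x in zip(portfolios, market):
        # Each portfolio must be non-negative and sum to one.
        if any(w < 0 for w in b) or abs(sum(b) - 1.0) > 1e-9:
            raise ValueError("portfolio must be non-negative and sum to 1")
        # Daily return is the inner product b_t . x_t.
        ret *= sum(bj * xj for bj, xj in zip(b, x))
    return ret
```

For example, holding `[0.5, 0.5]` on a day with market vector `[1.1, 0.9]` yields a daily return of 1, so the total return is driven entirely by the other days.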
A portfolio selection algorithm $A$ is a (deterministic or randomized) rule for
specifying the portfolio sequence $b_1, \ldots, b_T$. We define $\mathrm{ret}_X(A)$ as the total return
of $A$ for the market sequence $X$. In practice, the market sequence $X$ is not known
in advance and is therefore modeled as a random process. In this context, $\mathrm{ret}_X(A)$ is a random
how markets operate. Specifically, it assumes that there is a given “current price”
for each security, which is set at the start of each trading day, and any agent can
always buy or sell any amount of any security at its current price. Below, we present
a more realistic view of market operation.
Throughout each trading day, agents enter the market at various times, seeking
to buy or sell securities. When an agent decides to buy (respectively, sell) a security,
it places an order specifying the quantity desired (respectively, for sale), and the
maximum (respectively, minimum) price per unit that it is willing to pay (respectively
sell for). This order is recorded in a tabular structure called an order book, which
contains two lists: one containing all the unfulfilled purchase orders, called the bid
list, and one containing the unfulfilled sale orders, called the ask list. The contents
of an order book change as orders are placed, fulfilled, and retracted; an example is
given in Table 2–1.
Table 2–1: Example of an order book
      Bid               Ask
Qty.     Price    Qty.     Price
100      50$      200      51$
150      49$      100      52$
The difference between the lowest price in the ask list and the highest price in the
bid list is called the bid-ask spread. Whenever the bid-ask spread is negative or zero,
a transaction occurs between the agent who placed the highest-price purchase order
and the agent who placed the lowest-price sale order. Whichever order has the smaller
quantity is completely fulfilled and removed from the order book; the other has its
quantity reduced accordingly (if the quantities are equal, both orders are removed).
Transactions continue to occur in this way until the bid-ask spread is strictly
positive, at which time the market becomes idle until an agent places or retracts an
order, changing the bid-ask spread. Note that a single purchase order can be fulfilled
by multiple sale orders with different prices, and vice versa.
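The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `match_orders` and the `[quantity, price]` order representation are hypothetical, and because the text does not say at which price a transaction executes, the sketch simply records both the bid and the ask price of each match.

```python
def match_orders(bids, asks):
    """Match orders until the bid-ask spread is strictly positive.

    bids, asks -- lists of [quantity, price] orders (hypothetical layout);
                  both lists are modified in place.
    Returns a list of (quantity, bid_price, ask_price) transactions.
    """
    bids.sort(key=lambda order: -order[1])  # highest bid first
    asks.sort(key=lambda order: order[1])   # lowest ask first
    trades = []
    # Spread = lowest ask - highest bid; trade while it is zero or negative.
    while bids and asks and asks[0][1] - bids[0][1] <= 0:
        qty = min(bids[0][0], asks[0][0])
        trades.append((qty, bids[0][1], asks[0][1]))
        bids[0][0] -= qty
        asks[0][0] -= qty
        # Fully fulfilled orders leave the book; the other keeps its remainder.
        if bids[0][0] == 0:
            bids.pop(0)
        if asks[0][0] == 0:
            asks.pop(0)
    return trades
```

Starting from the book in Table 2–1, no transaction occurs because the spread is 1$. Adding a purchase order for 250 units at 52$ would be fulfilled by two sale orders at different prices, as noted in the text.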
In effect, the simplified view given above assumes that throughout each trading
day, the order book always appears as shown in Table 2–2. The price p is set at the
beginning of the day and does not change throughout the day.
Table 2–2: Effective order book under the simplified view of market operation
      Bid               Ask
Qty.     Price    Qty.     Price
∞        p$       ∞        p$
The assumption that the bid and ask quantities are both infinite is known as
infinite liquidity, while the assumption that the bid and ask prices are equal is known
as zero spread.
2.2.2 Infinitely small agent
Under this assumption, no brokerage fees are incurred when a transaction takes
place. As we shall see later, it is possible to relax this assumption to penalize overly
active strategies. However, brokerage fees are difficult to represent accurately, since
they vary widely with the amounts being traded, the type of securities being traded,
and other considerations such as soft dollars.
2.2.4 Tax-free profits
Under this assumption, there is no tax incurred when a profit is realized. This
assumption is often overlooked, but as we will see, a strategy’s after-tax profit can
sometimes be significantly less than its before-tax profit.
Although the tax rate is the same for all strategies, different strategies will pay
different amounts of taxes, because taxes are paid whenever
securities are sold for a profit. For example, buying a security for d dollars then
selling it later for 1.5d dollars will result in a before-tax profit of 0.5d dollars and an
after-tax profit of (1 − τ) · 0.5d dollars, where τ is the tax rate. Note, however, that
taxes are calculated on the aggregate of the profits and losses of all the securities in
the portfolio.
In practice, it is preferable to compare the relative performance of two strategies
on an after-tax basis. The following example shows a case where the
buy-and-hold strategy performs worse on a before-tax basis but better on an after-tax
basis than another strategy.
Consider a situation where the market consists of only two investment possibilities,
$s_1$ and $s_2$, such that the yearly return market vector is
$$x(t) = \begin{bmatrix} x_t^{s_1} \\ x_t^{s_2} \end{bmatrix} = 1_{2 \times 1} + \begin{bmatrix} \sigma + (-1)^{t}\,\delta \\ \sigma + (-1)^{t+1}\,\delta \end{bmatrix}$$
where σ = 12 percent, δ = 0.5 percent and t is in years. Suppose that two investors,
Alice and Bob, both purchase 10,000$ worth of s1 at the start of a 10-year period,
and that the tax rate τ is 20 percent. Alice adopts the buy-and-hold strategy (see
Section 3.1) and, hence, she holds the securities for the entire 10 years, and only sells
them at the end, realizing a before-tax profit of
Bob, on the other hand, has a time machine and uses it to determine which stock will
perform best. On the first day of each year, he sells the underperforming security
and buys the security with the best return for the coming year, realizing a before-tax
profit of
$$(1.125^{10} - 1) \times 10{,}000\$ = 22{,}473.21\$$$
However, every time he sells, he pays a 20% tax on any profits made since the previous
purchase and his after-tax profit is
Thus, over the 10-year period, Bob’s total before-tax profit is greater than Alice’s,
but his after-tax profit is lower. Table 2–3 shows the before- and after-tax return of
the two strategies in greater detail. The conclusion to draw from this example is that
although Bob correctly chooses the best performing securities, he should consider the
effect of taxes before making his investment decisions. Fortunately, he can go back
in time and rectify his investments to account for taxes.
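The arithmetic of this example can be checked with a short Python sketch. It rests on assumptions that the text only implies: Alice pays tax once, on her total profit, when she sells at the end; Bob holds the better security every year, sells it at the end of each year, and pays tax on that year's gain out of the proceeds. The function name `alice_and_bob` is hypothetical.

```python
def alice_and_bob(years=10, capital=10_000.0, sigma=0.12, delta=0.005, tau=0.20):
    """Before- and after-tax profits for the two strategies in the example.

    Assumes Alice is taxed once at the end, while Bob is taxed on each
    year's gain when he sells (an interpretation of the text, not a quote).
    """
    alice = capital
    bob_before_tax = capital
    bob_after_tax = capital
    for t in range(1, years + 1):
        x1 = 1 + sigma + (-1) ** t * delta        # yearly return of s1
        x2 = 1 + sigma + (-1) ** (t + 1) * delta  # yearly return of s2
        alice *= x1                               # Alice holds s1 throughout
        best = max(x1, x2)                        # Bob always holds the winner
        bob_before_tax *= best
        bob_after_tax *= 1 + (1 - tau) * (best - 1)
    alice_profit = alice - capital
    return {
        "alice_before": alice_profit,
        "alice_after": (1 - tau) * alice_profit,
        "bob_before": bob_before_tax - capital,
        "bob_after": bob_after_tax - capital,
    }
```

Under these assumptions Bob's before-tax profit matches the 22,473.21$ figure given in the text, while his after-tax profit falls below Alice's.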
Table 2–3: The before and after tax return of two strategies
We have given here a simplified view of how taxes affect return on investments.
In practice, however, taxes are significantly more complicated; hence, we
decided to ignore the effect of taxes.
CHAPTER 3
Portfolio Selection Algorithms
We now present algorithms that attempt to solve the portfolio selection prob-
lem (as defined in Chapter 2) and we classify them as either “passive” or “active”
algorithms. Presentation of the Anticor algorithm will follow in Chapter 4.
3.1 Passive Algorithms
Passive algorithms are algorithms that never re-invest any money. The main example
of such an algorithm is $\mathrm{BAH}_b$ (buy-and-hold). This algorithm is parametrized
by an initial portfolio $b$; it invests according to $b$ on the first trading day, then never
re-invests any money henceforth. This results in a portfolio sequence given by
$$b_{t+1} = \frac{1}{x_t^\top b_t}\,[x_t(1)\,b_t(1), \ldots, x_t(m)\,b_t(m)]^\top$$
mance of the market when benchmarking other algorithms. (In practice, however,
stock market indices such as the Dow Jones use non-uniform weights.) If
an algorithm $A$ has $\mathrm{ret}_X(A) > \mathrm{ret}_X(\text{U-BAH})$, then we say that $A$ “beats the
market.”
• The $\mathrm{BAH}^*$ algorithm is $\mathrm{BAH}_b$ with $b = \arg\max_b \mathrm{ret}_X(\mathrm{BAH}_b)$. This is the
optimal “in hindsight” buy-and-hold strategy, and is often used in offline benchmarks.
The portfolio $b$ assigns a weight of 1 to the “best” stock, and a weight
of 0 to all others.
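The buy-and-hold definitions above can be illustrated with a small Python sketch. The helper names (`bah_portfolios`, `bah_return`, `best_stock_portfolio`) and the list-based market representation are assumptions made for this example only.

```python
import math

def bah_portfolios(b, market):
    """Portfolio sequence b_1, ..., b_T obtained by buying b on day 1 and never trading."""
    seq = [list(b)]
    for x in market[:-1]:
        cur = seq[-1]
        growth = sum(xj * bj for xj, bj in zip(x, cur))  # x_t^T b_t
        # Weights drift with the market: b_{t+1}(j) = x_t(j) b_t(j) / (x_t^T b_t).
        seq.append([xj * bj / growth for xj, bj in zip(x, cur)])
    return seq

def bah_return(b, market):
    """Total return of BAH_b: each holding grows by its stock's cumulative return."""
    return sum(b[j] * math.prod(x[j] for x in market) for j in range(len(b)))

def best_stock_portfolio(market):
    """Portfolio used by BAH*: weight 1 on the stock with the largest cumulative return."""
    m = len(market[0])
    totals = [math.prod(x[j] for x in market) for j in range(m)]
    best = totals.index(max(totals))
    return [1.0 if j == best else 0.0 for j in range(m)]
```

Multiplying the daily returns of the drifting portfolio sequence gives the same value as `bah_return`, which is a quick consistency check on the update formula.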
In practice, one could use a “fundamental” [9] approach such as that of Warren
Buffett or Peter Lynch, but such approaches are informal and require the evaluation
of intangible factors, such as the quality of management of the company selling the
stock. Alternatively, one could use a “behavioral” approach. In [15], Shleifer argues
that less than fully rational investors trade against arbitrageurs whose resources are
limited by risk aversion, short horizons, and agency problems. We are focusing our
work on quantitative approaches that require only the market sequence X.
This algorithm maintains a fixed portfolio b throughout the entire trading period
by appropriately re-investing money at the end of each trading day. As with BAH,
there are two important special cases:
• U-CBAL, where $b = [\tfrac{1}{m}, \ldots, \tfrac{1}{m}]$.
• CBAL*, where the fixed portfolio is the optimal “in hindsight” portfolio.
We have that
$$\mathrm{ret}_X(\mathrm{CBAL}^*) \ge \mathrm{ret}_X(\mathrm{BAH}^*)$$
Cover and Gluss [3] present an interesting example involving a hypothetical “no
consider the market sequence
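$$X_T = \begin{bmatrix} \tfrac{1}{2} & 2 & \tfrac{1}{2} & 2 & \cdots \\ 1 & 1 & 1 & 1 & \cdots \end{bmatrix};$$
we have
$$\mathrm{ret}_{X_T}(\text{U-BAH}) = 1$$
but
$$\mathrm{ret}_{X_T}(\text{U-CBAL}) = \left(\frac{9}{8}\right)^{T/2}$$
which is exponential in T .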
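The two returns in this example can be reproduced numerically. The sketch below is illustrative only; the function names are hypothetical, and T is assumed to be even so that the first stock ends where it started.

```python
def alternating_market(T):
    """Market sequence from the example: stock 1 alternates 1/2, 2; stock 2 stays at 1."""
    return [[0.5 if t % 2 == 0 else 2.0, 1.0] for t in range(T)]

def ubah_return(market):
    """Uniform buy-and-hold: average of the stocks' cumulative returns."""
    m = len(market[0])
    totals = [1.0] * m
    for x in market:
        totals = [tot * xj for tot, xj in zip(totals, x)]
    return sum(totals) / m

def ucbal_return(market):
    """Uniform constant rebalancing: product of the daily average relative prices."""
    ret = 1.0
    for x in market:
        ret *= sum(x) / len(x)
    return ret
```

Every pair of days multiplies the rebalanced wealth by 0.75 × 1.5 = 9/8, while the buy-and-hold wealth returns to its starting value.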
In [5], Cover and Thomas prove that, if a random market sequence $X = [x_1, \ldots, x_T]$
consists of i.i.d. daily market vectors, then for any online algorithm $A$,
However, in [11], MacKinlay and Lo argue that the daily market vectors $x_t$ are not
i.i.d., but instead have memory. It would be preferable to have an online algorithm
which drops the i.i.d. assumption and makes use of the memory between the $x_t$’s.
3.2.2 The Universal Portfolio Algorithm
Cover and Ordentlich [4] present an algorithm, called the Universal Portfolio
Algorithm, which they prove guarantees a sub-exponential ratio (in n) between its
return and the return of CBAL∗ for any market sequence over n days. This result
is surprising, as it implies that the Universal algorithm can track the potentially
CHAPTER 4
The Anticor Algorithm
By attempting to systematically follow the constant rebalancing philosophy, the
$$A_{k,\ldots,l} = \begin{bmatrix} a_{1k} & \cdots & a_{1l} \\ \vdots & \ddots & \vdots \\ a_{mk} & \cdots & a_{ml} \end{bmatrix}$$
Next, for an m × n matrix A, we denote by Log(A) (note the capital L) the
element-wise logarithm of A:
$$\mathrm{Log}(A) = \begin{bmatrix} \log a_{11} & \cdots & \log a_{1n} \\ \vdots & \ddots & \vdots \\ \log a_{m1} & \cdots & \log a_{mn} \end{bmatrix}$$
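For two m × n matrices $B$ and $C$, we denote by $B \otimes C$ their element-wise product
(also known as the Hadamard product), and by $B \oslash C$ their element-wise quotient:
$$B \otimes C = \begin{bmatrix} b_{11} c_{11} & \cdots & b_{1n} c_{1n} \\ \vdots & \ddots & \vdots \\ b_{m1} c_{m1} & \cdots & b_{mn} c_{mn} \end{bmatrix} \qquad B \oslash C = \begin{bmatrix} b_{11}/c_{11} & \cdots & b_{1n}/c_{1n} \\ \vdots & \ddots & \vdots \\ b_{m1}/c_{m1} & \cdots & b_{mn}/c_{mn} \end{bmatrix}$$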
which produce m×1 column vectors containing, respectively, the mean and standard
deviation of each row of A.
4.2 The algorithm
stocks. Specifically, whenever the algorithm detects that (i) a stock i outperformed
a stock j during the last window, and (ii) i’s performance in the second-to-last window
is positively correlated with j’s performance in the last window, it transfers wealth
from i to j. We present the algorithm more formally below.
$$L_1 = \mathrm{Log}(X_{t-2w+1}^{t-w})$$
$$L_2 = \mathrm{Log}(X_{t-w+1}^{t})$$
which are m × w matrices containing the logarithms of the daily market vectors
during the second-to-last and last windows. We take logarithms because ordering
the arithmetic means of the logarithms is equivalent to ordering the geometric means
of the relative prices, and is analytically simpler.
Next, we derive “centered” versions L̄1 and L̄2 of L1 and L2 by subtracting the
$$\mu_1 = \mathrm{Mean}(L_1)$$
$$\mu_2 = \mathrm{Mean}(L_2)$$
$$\bar{L}_1 = L_1 - [\mu_1 \cdots \mu_1]$$
$$\bar{L}_2 = L_2 - [\mu_2 \cdots \mu_2]$$
Now, we let
$$M_{cov} = \frac{1}{w-1}\,\bar{L}_1 \bar{L}_2^\top$$
For each i and j, Mcov (i, j) is the covariance between the log-relative prices of stock
i over the first window and stock j over the second window.
Finally, we let
σ 1 = StdDev(L1 )
σ 2 = StdDev(L2 )
procedure Anticor(w, t, Xt , b)
if t < 2w then
return b
end if
L1 ← Log([X_t]_{t−2w+1,...,t−w})
L2 ← Log([X_t]_{t−w+1,...,t})
μ1 ← Mean(L1)
μ2 ← Mean(L2)
L̄1 ← L1 − [μ1 · · · μ1]
L̄2 ← L2 − [μ2 · · · μ2]
Mcov ← (1/(w − 1)) L̄1 L̄2ᵀ
σ 1 ← StdDev(L1 )
σ 2 ← StdDev(L2 )
for all i, j ∈ {1, . . . , m} do
else
Mcor (i, j) ← 0
end if
end for
for all i, j ∈ {1, . . . , m} do
if μ2(i) ≥ μ2(j) and Mcor(i, j) > 0 then
claim_{i→j} ← Mcor(i, j) − [Mcor(i, i)]⁻ − [Mcor(j, j)]⁻
else
claim_{i→j} ← 0
end if
end for
end for
else
transfer_{i→j} ← 0
end if
end for
for all i, j ∈ {1, . . . , m} do
Note that the output of ANTICORw for day t cannot be directly fed back into
ANTICORw as the next day’s input; we must first compute the effect of the market
vector $x_t$ on $b_t$:
$$\hat{b}_t = \frac{1}{b_t^\top x_t}\,(b_t \otimes x_t)$$
The resulting vector $\hat{b}_t$ can then be fed into ANTICORw as input for day t + 1.
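To make the steps of the procedure concrete, here is a hedged Python sketch of a single Anticor update using only the standard library. Several details are assumptions rather than statements from the text: the correlation and transfer steps follow the Anticor description published in [2] (each stock with positive claims distributes its wealth in proportion to those claims), [x]⁻ is read as min(x, 0), the standard deviation uses the same 1/(w − 1) normalization as the covariance, and w ≥ 2 is required. The function and variable names are hypothetical.

```python
import math

def _mean(v):
    return sum(v) / len(v)

def _stdev(v):
    # Sample standard deviation, consistent with the 1/(w - 1) covariance factor.
    m = _mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def anticor_step(w, X, b_hat):
    """One Anticor update with window length w (assumes w >= 2).

    X     -- list of the market vectors x_1, ..., x_t seen so far
    b_hat -- current portfolio, already adjusted for the latest market vector
    """
    t, m = len(X), len(b_hat)
    if t < 2 * w:
        return list(b_hat)
    # Row i of L1 / L2 holds the log relative prices of stock i in each window.
    L1 = [[math.log(X[d][i]) for d in range(t - 2 * w, t - w)] for i in range(m)]
    L2 = [[math.log(X[d][i]) for d in range(t - w, t)] for i in range(m)]
    mu1, mu2 = [_mean(r) for r in L1], [_mean(r) for r in L2]
    s1, s2 = [_stdev(r) for r in L1], [_stdev(r) for r in L2]

    def cor(i, j):
        # Correlation of stock i (second-to-last window) with stock j (last window).
        if s1[i] == 0 or s2[j] == 0:
            return 0.0
        cov = sum((L1[i][k] - mu1[i]) * (L2[j][k] - mu2[j]) for k in range(w)) / (w - 1)
        return cov / (s1[i] * s2[j])

    M = [[cor(i, j) for j in range(m)] for i in range(m)]
    claim = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if mu2[i] >= mu2[j] and M[i][j] > 0:
                claim[i][j] = M[i][j] - min(M[i][i], 0.0) - min(M[j][j], 0.0)
    b_next = list(b_hat)
    for i in range(m):
        total = sum(claim[i])
        if total > 0:
            for j in range(m):
                transfer = b_hat[i] * claim[i][j] / total
                b_next[i] -= transfer
                b_next[j] += transfer
    return b_next
```

In the small usage example below, stock 0 outperforms stock 1 in the last window, and its earlier window is positively correlated with stock 1's last window, so the sketch moves all of stock 0's wealth to stock 1.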
4.3 Compounded Algorithms
cantly affects the algorithm’s performance. We can thus view the Anticor algorithm
not as a single algorithm, but rather as a family of algorithms, indexed by the
parameter w. Since it is not possible to choose w in hindsight when applying the
Anticor algorithm “online”, the authors of [2] (effectively) suggest viewing the dif-
4.4 Compounding the Anticor Algorithm
first two columns which show the portfolios bt and bt+1 . The frequency with which
we execute the Anticor algorithm can vary anywhere from split seconds to years.
In this context, we apply it daily, which qualifies the Anticor algorithm as a high-
frequency trading strategy. Furthermore, the algorithm has a high turnover ratio
every time it is applied, as is clearly demonstrated in 4–1.
CHAPTER 5
Transaction Cost Considerations
The effect of transaction costs associated with brokerage fees is non-negligible.
The following example demonstrates two ways of computing transaction
vectors. One way is to compute the change in monetary value of the securities
and the other is to compute the change in number of units (or shares, for the sake
of the example).
is a unitless measure of each share’s growth over the period. Let us assume that the
algorithm outputs $b_2 = [0.5, 0.5]^\top$. Hence we can compute $d_2 = B_1^\top X_1 = 125\$$ and
If the broker charges a fee per unit of monetary value traded, then we first need to compute the
value of B at the end of period 1, right before the transfer occurs, which we denote
by B̂1. Hence, B̂1 = B1 ⊗ X1 = [100$, 25$] and the transfer is Tm = B̂1 − B2 =
[100$, 25$] − [62.5$, 62.5$] = [37.5$, −37.5$], where ⊗ is the element wise multi-
The problem with these two methods is that one requires knowledge of the
price of the underlying securities to compute the transfer vectors. The authors
of [2] suggest using the proportional commission model, which assumes a fraction
$\gamma \in (0, 1)$ that an investor pays at a rate of $\gamma/2$ for each buy and for each sell. The
model specifies that the return of a sequence $b_1, \ldots, b_n$ of portfolios with respect to
a market sequence $x_1, \ldots, x_n$ is
$$\prod_t b_t^\top x_t \left(1 - \sum_j \frac{\gamma}{2}\left|b_t(j) - \hat{b}_t(j)\right|\right)$$
where
$$\hat{b}_t = \frac{1}{b_t^\top x_t}\,(b_t \otimes x_t)$$
if the transaction costs are to be included in the previous day’s performance or
$$\prod_t b_t^\top x_t \left(1 - \sum_j \frac{\gamma}{2}\left|b_t(j) - \hat{b}_{t-1}(j)\right|\right)$$
if the transaction costs are to be included in the next day’s performance. We have
arbitrarily decided that the transaction costs be included in the previous day’s
performance measure.
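A short Python sketch may help in reading the proportional commission model. For simplicity it implements the second variant above, in which the cost of moving from the drifted portfolio of day t − 1 to the portfolio chosen for day t is charged on day t; it also assumes, as an illustration only, that the initial purchase on day 1 is free. The function names `drift` and `return_with_commission` are hypothetical.

```python
def drift(b, x):
    """Portfolio after the market vector x has acted on b (the b-hat of the text)."""
    growth = sum(bj * xj for bj, xj in zip(b, x))
    return [bj * xj / growth for bj, xj in zip(b, x)]

def return_with_commission(portfolios, market, gamma):
    """Total return under the proportional commission model.

    The cost of rebalancing from the previous day's drifted portfolio to b_t
    is charged on day t; the first day's purchase is assumed to be cost-free.
    """
    ret = 1.0
    prev_hat = None
    for b, x in zip(portfolios, market):
        if prev_hat is None:
            cost = 0.0
        else:
            # gamma/2 is paid on every unit of wealth bought or sold.
            cost = (gamma / 2) * sum(abs(bj - hj) for bj, hj in zip(b, prev_hat))
        ret *= sum(bj * xj for bj, xj in zip(b, x)) * (1 - cost)
        prev_hat = drift(b, x)
    return ret
```

With this convention a buy-and-hold sequence never pays commissions, because each day's portfolio equals the previous day's drifted portfolio, whereas a constantly rebalanced portfolio pays a little every day.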
A very important point to consider here is that brokerage fees are not the same
for all types of securities. If we were to apply the Anticor algorithm to derivative
products (such as options, futures, or swaps), which usually incur much smaller
transaction fees, then the zero-transaction-fee assumption would be more
acceptable.
CHAPTER 6
Markets Used For Simulation
The experimental study was performed using three different types of data, de-
Our first data set consisted of historical data for DJIA, SP500, NYSE and
TSE, obtained from the authors of [2]. Running our implementation of the Anticor
algorithm on this data allowed us to verify their results, and to ensure that our
implementation was correct. Another source is the London Stock Exchange data set,
DATML1. Table 6–1 gives the daily sampled mean, variance, skewness and kurtosis
of the old historical market data.
Table 6–1: Daily Sampled Statistics of Old Historical Markets
Index Start Date End Date Mean Variance Skewness Kurtosis
datML1.txt N/A N/A 1.0004 0.00044 0.21604 13.9545
djia.txt 2001-01-14 2003-01-14 0.9997 0.000662 -0.8938 26.557
nyse.txt 1962-07-03 1984-12-31 1.0006 0.000399 1.0445 17.7132
sp500.txt 1998-01-02 2003-01-31 1.0005 0.000656 0.13304 8.074
tse.txt 1994-01-04 1998-12-31 1.0004 0.00057745 1.5791 71.435
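The four statistics reported in these tables can be computed as in the sketch below. The exact estimators used for the tables are not stated, so this is an assumption: the sketch uses population (divide-by-n) central moments and the non-excess kurtosis, for which a Gaussian sample gives a value near 3. The function name `sample_moments` is hypothetical.

```python
def sample_moments(xs):
    """Return (mean, variance, skewness, kurtosis) of a list of relative prices.

    Uses population central moments; kurtosis is not reduced by 3.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2
```

A kurtosis far above 3, as in several rows of the historical tables, indicates much heavier tails than a Gaussian with the same variance.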
Our second data set consisted of recent trading data, obtained from Yahoo [8], for
several market indices, each containing about 30 stocks. Note that the composition
of these indices is given in Appendix B.2. Each index was treated as one “market”
for the purpose of the algorithm. Although we could have assembled markets from
other collections of securities, market indices are more representative of practical
trading situations, and their data are easier to obtain. Also, they are more likely to
approximately satisfy the simplifying assumptions presented earlier.
Several flaws in this data set make it difficult to use with the algorithms:
• Stocks that cease to exist during the considered time period are completely
omitted from the provided data set, even for the time when they did exist.
This is difficult to compensate for, as the data set contains no evidence that
the stocks were ever in the index, and it introduces a bias towards stocks that
• Each stock may have gaps in its sequence of prices, where the last traded
price is unknown for one or more consecutive days. These gaps are filled in by
interpolating between the nearest known prices.
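The gap-filling step mentioned in the last item can be sketched as follows. The text only says that gaps are filled by interpolation, so linear interpolation is an assumption here, as are the function name `fill_gaps` and the use of `None` to mark a missing price. Gaps at the very start or end of a series have only one neighbouring known price and are left untouched by this sketch.

```python
def fill_gaps(prices):
    """Linearly interpolate missing prices (marked as None) between known neighbours."""
    filled = list(prices)
    known = [i for i, p in enumerate(filled) if p is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            frac = (i - a) / (b - a)  # position of day i between the known days
            filled[i] = filled[a] + frac * (filled[b] - filled[a])
    return filled
```

For instance, a two-day gap between prices of 10 and 16 is filled with 12 and 14.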
Table 6–2 gives the daily sampled mean, variance, skewness and kurtosis of the
Gaussian Noise
In this model, we draw each relative price $x_s(t)$ (the relative price of security $s$ at
time $t$) from a Gaussian distribution with mean $\mu$ and variance $\sigma^2$. For each security
Table 6–2: Daily Sampled Statistics of Recent Historical Markets
Index Start Date End Date Mean Variance Skewness Kurtosis
dja 1998-11-16 2007-02-02 1.0006 0.00053639 0.68273 62.7138
dji 1998-11-16 2007-02-02 1.0004 0.00043574 -0.032271 11.3017
dot 1998-11-16 2007-02-02 1.0012 0.0015245 0.62122 15.9012
iix 1998-11-16 2007-02-02 1.0012 0.0022224 0.84644 15.4131
ndx 1998-11-16 2007-02-02 1.0012 0.0012992 1.3052 39.0859
nwx 1998-11-16 2007-02-02 1.0008 0.0017904 0.37211 12.332
nyi 1998-11-16 2007-02-02 1.0007 0.00045597 8.6386 770.3089
nyy 1998-11-16 2007-02-02 1.0006 0.00075616 1.6179 88.5836
oex 1998-11-16 2007-02-02 1.0006 0.0005303 -0.13936 22.0578
soxx 1998-11-16 2007-02-02 1.0009 0.0014157 0.36659 7.8734
xau 1998-11-16 2007-02-02 1.0012 0.0011922 0.5639 10.9352
xmi 1998-11-16 2007-02-02 1.0003 0.00036708 -0.080012 11.591
except that the noise at each time step is scaled according to the price level. Indeed,
a (scalar-valued) Gaussian random walk process is given by
$$v(t) = v(t-1) + n_{\sigma^2}(t)$$
where each $n_{\sigma^2}(t)$ is a Gaussian random variable with mean 0 and variance $\sigma^2$; this
produces a relative price sequence
$$x(t) = \frac{v(t)}{v(t-1)} = 1 + \frac{n_{\sigma^2}(t)}{v(t-1)}$$
Hence the larger $v(t-1)$ is, the smaller the variance of $x(t)$ is. This, we feel, is not
representative of real markets; in other words, we believe it is not the case that a
100$ security has a relative price standard deviation approximately 10 times smaller
(that is, a variance 100 times smaller) than a 10$ security. Thus, we instead use the
modified model given above, which gives the price sequence
$$v(t) = v(t-1) \cdot (\mu + n_{\sigma^2}(t))$$
To determine values for μ and σ 2 , we computed the first four moments of the
historical market data sets. We then conservatively set the yearly expected return
to 1.07 (i.e., 7%) and the daily variance to σ 2 = 0.0005. Assuming 252
days of trading a year, this corresponds to a daily mean of μ ≈ 1.000268. Comparing
this with the values in Table 6–1 and Table 6–2, we see that a 7 percent annual return
is smaller than that of almost all the real market data (the exception being djia.txt,
whose daily mean is 0.9997).
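The modified random walk and the conversion from a yearly to a daily mean can be sketched with the Python standard library. This is an illustration under assumptions: the function names, the starting price and the fixed seed are hypothetical, and no safeguard against a negative relative price is included because the text does not describe one.

```python
import random

def daily_mean(yearly_mean=1.07, trading_days=252):
    """Daily mean relative price that compounds to the yearly mean."""
    return yearly_mean ** (1.0 / trading_days)

def modified_random_walk(T, v0=1.0, mu=None, variance=0.0005, seed=0):
    """Simulate v(t) = v(t-1) * (mu + n(t)) with Gaussian noise n(t) ~ N(0, variance)."""
    rng = random.Random(seed)
    mu = daily_mean() if mu is None else mu
    sigma = variance ** 0.5
    prices = [v0]
    for _ in range(T):
        prices.append(prices[-1] * (mu + rng.gauss(0.0, sigma)))
    return prices
```

Because the noise multiplies the previous price, the relative prices v(t)/v(t − 1) have the same distribution whatever the price level, which is the property motivating the modification.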
Table 6–3: Daily Sampled Statistics of Simulated Markets
Market Mean Variance Skewness Kurtosis
mrw0 1.0003 0.00049848 0.0087546 2.961
mrw1 1.0003 0.00049848 0.0087546 2.961
mam0 1.0002 0.00049841 0.056457 2.9856
mam1 1.0003 0.00051299 0.062445 3.0189
mam2 1.0003 0.00054372 0.061123 3.0098
mam3 1.0001 0.0005965 0.054341 3.0443
mam4 1.0003 0.00072091 0.046434 3.0126
mam5 1.0003 0.00088263 0.012532 3.0171
mam6 1.0003 0.0012225 0.013761 3.0878
mam7 1.0004 0.0018012 0.0097862 3.2389
mam8 1.0003 0.0033626 -0.002742 3.377
mam9 1.0003 0.0051434 0.0093402 4.0548
It should be noted here that under such random markets, one cannot expect to
perform well. Hence if the Anticor algorithm performs poorly on mrw0, this should
not come as a surprise.
Log-normal Noise
Alternatively, we tried log-normally distributed noise, since markets tend to
be positively skewed. In this version of the model, the log-normal distribution has
the same mean and variance as the previous model. Hence, the distribution of the
log-normal relative price is obtained by taking the exponential of the normal
was not noticeable as far as the relative performance of the algorithms is concerned.
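One way to obtain a log-normal distribution with a prescribed mean and variance is moment matching, sketched below. Whether the original experiments used exactly this parametrization is not stated, so treat it as an assumption; the function name `lognormal_params` is hypothetical, while `random.lognormvariate` is the standard-library sampler.

```python
import math
import random

def lognormal_params(mean, variance):
    """Parameters (m, s) of the underlying normal so that exp(N(m, s^2))
    has the requested mean and variance."""
    s2 = math.log(1.0 + variance / mean ** 2)
    m = math.log(mean) - s2 / 2.0
    return m, math.sqrt(s2)

def lognormal_relative_price(mean, variance, rng=random):
    """Draw one log-normally distributed relative price with the given moments."""
    m, s = lognormal_params(mean, variance)
    return rng.lognormvariate(m, s)
```

The identities used are mean = exp(m + s²/2) and variance = (exp(s²) − 1) · exp(2m + s²), which can be inverted in closed form as above.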
6.3.2 Modified Autoregressive Model
In this model, the joint movements of all securities’ relative prices are generated by
$$x(t) = 1_{m \times 1} + \sum_{l=1}^{L} D_l \left(1_{m \times 1} - x(t-l)\right) + n(t)$$
Here, $n(t)$ is a noise process, which may be either normally or log-normally
distributed. The matrices $D_1, \ldots, D_L$, each of size m × m, are parameters that can be
used to express dependencies between different securities. Specifically, the (i, j) entry
of $D_l$ expresses how much the price of security j will influence the price of i, l days
later. This overcomes an important limitation of the modified random walk model
given above, which assumes the securities’ prices to be independent.
This model can be visualized as the system diagram shown in Figure 6–1.
[Figure 6–1: system diagram of the modified autoregressive model, with delay blocks z⁻¹, gain matrices D1, . . . , DL, noise input n(t) and output x(t)]
Note that if L = 1 and D1 = 0, then this model reduces to the modified random
walk model.
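The recursion of the modified autoregressive model can be sketched as follows. The assumptions are labeled here and in the code: the noise is taken to be Gaussian with mean 0, the relative prices before the start of the simulation are taken to equal 1, and the function name `simulate_mam` and its arguments are hypothetical.

```python
import random

def simulate_mam(D, T, sigma=0.0005 ** 0.5, seed=0):
    """Simulate x(t) = 1 + sum_l D_l (1 - x(t - l)) + n(t) for T days.

    D -- list of L dependence matrices, each an m x m list of lists
    Assumes Gaussian noise and x(t) = 1 for all days before the simulation.
    """
    rng = random.Random(seed)
    L, m = len(D), len(D[0])
    history = [[1.0] * m for _ in range(L)]  # assumed pre-simulation values
    out = []
    for _ in range(T):
        x = []
        for i in range(m):
            # history[-1 - l] is x(t - (l + 1)), matched with D_{l+1}.
            feedback = sum(
                D[l][i][j] * (1.0 - history[-1 - l][j])
                for l in range(L)
                for j in range(m)
            )
            x.append(1.0 + feedback + rng.gauss(0.0, sigma))
        history.append(x)
        out.append(x)
    return out
```

With a single zero matrix the feedback term vanishes and each relative price is just 1 plus noise, in line with the remark above about the case L = 1 and D1 = 0.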
For some choices of the matrices $D_1, \ldots, D_L$, the relative price sequence x(t)
may grow unboundedly, because of positive feedback. We have not explored the exact
conditions under which this happens, but we have found a method for constructing
the D matrices which, intuitively and experimentally, seems to avoid unbounded
growth in x(t). The algorithm for generating these matrices is shown in Appendix
A; by ensuring that each matrix $D_l$ is lower triangular, it avoids cyclical dependencies
CHAPTER 7
Empirical Comparison
We implemented several software systems to perform this experiment. We will
not discuss the implementation details here; however, we have provided online a
“readme” file which explains how to obtain the source code. The readme file is
available at:
In this chapter, we present empirical results for every market mentioned in
Chapter 6. We focus our attention on four graphs (as presented in Section 7.1), which we
use to compare the relative performance of the algorithms.
To avoid overcrowding the graphs, we do not display all of the benchmarks. As
stated before, ret_X(CBAL*) ≥ ret_X(BAH*), since we know that BAH* invests only
in the best stock(s), and this strategy is a special case of CBAL*. We thus decided
to hide BAH*. The performances of UBAH and UCBAL are usually similar, so only
one should suffice. However, the performance of UBAH is unaffected by transaction
costs, so we decided to hide it as well. Hence the algorithms presented are as follows:
• UCBAL (as defined in Chapter 3)
• CBAL* (as defined in Chapter 3)
• BAH_W(ANTICOR_w): abbreviated as “ANTI1” (as defined in Section 4.3)
• BAH_W(ANTICOR_w(ANTICOR_w)): abbreviated as “ANTI2” (as defined in Section 4.4)
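The relation ret_X(CBAL*) ≥ ret_X(BAH*), used above to justify hiding BAH*, can be checked numerically. The sketch below assumes the standard definitions of buy-and-hold and constant rebalancing from [2] (Chapter 3 is not reproduced in this chapter); the function names and the two-stock market are hypothetical and chosen only for illustration.

```python
from math import prod


def bah_return(b, X):
    """Buy-and-hold: split wealth by b once, then never trade."""
    return sum(b_i * prod(x[i] for x in X) for i, b_i in enumerate(b))


def cbal_return(b, X):
    """Constant rebalancing: restore the proportions b every day."""
    return prod(sum(b_i * x_i for b_i, x_i in zip(b, x)) for x in X)


# Hypothetical market: each row holds one day's relative prices.
X = [[2.0, 0.5],
     [0.5, 2.0]]

best_stock = bah_return([1.0, 0.0], X)        # all wealth held in stock 0
same_as_cbal = cbal_return([1.0, 0.0], X)     # a CBAL that never rebalances
rebalanced = cbal_return([0.5, 0.5], X)       # can do strictly better
```

Putting all wealth in one stock gives the same return under both rules, so the best constant rebalanced portfolio can never do worse than the best stock; in this toy market the uniform portfolio does strictly better.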
To account for transaction costs, we used the method described in Section 5.2.1.
The friction coefficient used is equal to one percent and the performances of the
algorithms after transaction costs are incurred are shown as a dotted line (and are
denoted by f-name, where “f” stands for friction).
It is difficult to provide a completely unbiased view, but we hope that these four
graphs provide the reader with enough information to assess the performance and
the risk (as will be discussed in Section 7.1.4) of the Anticor algorithm. In Section
7.1 we discuss each of the four graphs in turn and present a summary of the salient
points observed in the simulations. For a more thorough investigation, we present
(in Section 7.2) specific observations for every market mentioned in Chapter 6.
7.1 Overview
We will now present each of the four graphs and make general observations
about what each graph enables us to see.
7.1.1 Total Return
The first graph we consider shows the total return versus the window sizes. The
benchmark algorithms are not parametrized by the window sizes and so are displayed
as straight lines. The most relevant curves plot the total return of ANTICORw (and
lead to better performance. In some but not all cases, the Anticor algorithm beats
CBAL* for both old and recent historical markets. For simulated markets, it is clear
that as the “dependence factor” increases in magnitude, so does the comparative
performance of the Anticor algorithm. For most of these simulated markets, ANTI1
provides higher returns than ANTI2, which could be attributed to the simplicity
of the simulated markets: ANTI2 is attempting to exploit complex interdependencies
that do not exist. Note that the maximum lag is 30, so it is not surprising that the
performance of the Anticor algorithm declines greatly for window sizes between 30
and 50.
The total return versus window size enables us to see how ANTICORw performs
with respect to the window size parameter. If ANTICORw does better than the
benchmark algorithms for all window sizes, we know that for the specific period
between the start date and the end date the Anticor algorithm (irrespective of the
window size used) was a “better” way to invest. However, the total return versus
window size graph tells us little about risk and performance over time. Indeed, it is
heavily biased by the specific choice of start date and end date.
7.1.2 Cumulative Return
The cumulative return graph enables us to look at return over time, removing
the bias associated with choosing a specific end date (but retaining the bias from the
start date). It also allows us to obtain an idea of the risk by looking at the volatility
(in absolute terms) of the portfolio given a strategy. This view enables us to avoid
the bias associated with the starting point, as we can see that certain periods account
for much of the growth.
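As a rough illustration of what the cumulative return adds over the total return, the sketch below builds the wealth curve whose last point is the total return. It assumes daily returns are given as relative values (1.02 for a 2 percent gain); the helper name and the two return sequences are hypothetical.

```python
from itertools import accumulate
from operator import mul


def cumulative_return(daily_returns):
    """Wealth after each day, starting from one unit of wealth."""
    return list(accumulate(daily_returns, mul))


# Two made-up strategies with the same total return but different paths.
steady = cumulative_return([1.5, 1.0, 1.0])
bumpy = cumulative_return([0.5, 1.5, 2.0])
```

Both curves end at the same total return, yet the second one loses half of its wealth on the first day, which is the kind of information only the cumulative view exposes.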
7.1.3 In-Hindsight Geometric Mean Return
Given a particular ending date, the in-hindsight geometric mean return (IGMR)
over T days is the average yearly return that we would have obtained if we had started
investing T days before this end date. (Note that larger values of T correspond to
earlier start dates.) By examining this quantity, we avoid the bias associated with
choosing a start date, but still incur bias from the chosen end date.
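A minimal sketch of this quantity follows, assuming daily returns are relative values and that a year has 252 trading days. The annualization convention is an assumption, since it is not stated here, and the function name is hypothetical.

```python
from math import prod

TRADING_DAYS_PER_YEAR = 252  # assumption: convention not stated in the text


def igmr(daily_returns, T, days_per_year=TRADING_DAYS_PER_YEAR):
    """Annualized geometric mean return over the last T days (as a fraction)."""
    if not 1 <= T <= len(daily_returns):
        raise ValueError("T must be between 1 and the number of days available")
    growth = prod(daily_returns[-T:])  # wealth multiple over the last T days
    return growth ** (days_per_year / T) - 1.0


# Evaluating igmr for T = 1, ..., len(daily_returns) gives one IGMR curve.
returns = [1.01, 0.99, 1.02, 1.00]
curve = [igmr(returns, T) for T in range(1, len(returns) + 1)]
```

Each point of the curve answers the question "what yearly return would an investor have seen by starting T days before the end date", so the end date is fixed while the start date varies.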
It is obvious that, in principle, we would want to always invest in the strategy
with the highest IGMR at every t, to maximize the amount of return obtained at the
end date. Also interesting to note are the “cross-overs” between strategies; cross-over
signifies that the strategy “going up” will do better than the strategy “going down”
for the coming while.
Another interesting point to note is that UCBAL is usually in the 10 percent
region while CBAL* is in the 20 percent region. ANTICORw produces returns near
of investment performance that accounts for both the return obtained and the risk
incurred.
The finance literature on balancing risk and return, and on the proposed metrics
for doing so, is far too large to survey here (see [1], Chapter 4 for an overview).
Among the most common methods are the Sharpe ratio [14] and the mean-variance
(MV) criterion, of which Markowitz was the first proponent [12].
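For concreteness, a textbook form of the Sharpe ratio on daily data can be sketched as follows. The exact formula used for the graphs is not given in this chapter, so the conversion of the four percent yearly risk-free rate to a daily rate, the 252-day year, and the function name are assumptions.

```python
from statistics import mean, stdev


def sharpe_ratio(daily_returns, yearly_risk_free=0.04, days_per_year=252):
    """Mean daily excess return divided by the daily risk (standard deviation).

    daily_returns are relative values, e.g. 1.01 for a 1 percent gain.
    """
    # Assumption: compound the yearly risk-free rate down to a daily rate.
    daily_rf = (1.0 + yearly_risk_free) ** (1.0 / days_per_year) - 1.0
    simple = [r - 1.0 for r in daily_returns]
    excess = [r - daily_rf for r in simple]
    return mean(excess) / stdev(simple)


example = sharpe_ratio([1.01, 1.03, 1.02, 1.00])
```

The numerator and denominator correspond to the "Daily Return" and "Daily Risk" axes of the Sharpe ratio graphs, with the risk-free rate shifting the origin.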
Even after taking the transaction costs into account, ANTI1 and ANTI2 have a
higher Sharpe ratio than UCBAL for some indices. Table 7–1 summarizes how the
Table 7–1: Comparison of Sharpe ratios of ANTI1/ANTI2 with CBAL∗ . The “Bet-
ter” column contains the data sets where ANTI1 and ANTI2 always had better
Sharpe ratios than CBAL∗ ; similarly for the “Worse” column.
Better Mixed Worse
Old market data
djia.txt datML1.txt
sp500.txt
tse.txt
nyse.txt
Recent market data
dji soxx dja
iix dot ndx
nyy nwx
xau nyi
oex
xmi
This distribution is impressive and puts the Anticor algorithm in a good light.
Whether such Sharpe ratios will exist in the future is a difficult question to answer.
On the other hand, it is clear that the results provided by the authors of [2] paint
a more positive picture than the results we obtained. We also see that even with
small commissions, the Sharpe ratio is significantly affected. The performance is
diminished, but we see that the risk stays the same, which serves as a sanity check
of the results.
7.2 Market Detailed View
This Section covers the empirical comparisons for old “offline” markets. We
consider the four markets considered by the authors of [2] and one extra market,
datML1, for the London Stock Exchange.
It should be noted that the dates on the graphs are not meaningful because the
results were obtained offline and we do not have access to the exact dates at which
the data were recorded.
datML1
Figure 7–1 shows the empirical results obtained for the datML1 historical market
data set. The total return graph shows that the Anticor algorithm performed better
than the UCBAL but worse than CBAL*. We also note that when transaction
costs are taken into consideration the Anticor Algorithm still performs better than
UCBAL.
The Sharpe ratio shows that CBAL* offers more return for a greater risk. On a
return per unit of risk basis CBAL* is also the clear winner. In the IGMR graph it is
interesting to note that for a short while in the middle section the ANTI2 algorithm
actually surpassed CBAL*. Hence, an investor starting to invest during this short
time window could have expected to obtain a greater return from ANTI2 than from
CBAL*.
In the cumulative return graph we observe that during a short period of time
ANTI1 had less cumulative return than even UCBAL but somewhere after the mid-
point ANTI1 managed to surpass UCBAL. The clear winner for datML1 is CBAL*
but since CBAL* requires knowledge of the future, ANTI1 or ANTI2 would have
given excellent returns.
[Figure 7–1: datML1.txt. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
djia
Figure 7–2 shows results that are consistent with those of the authors of [2]. It
is interesting to note that both ANTI1 and ANTI2 performed extremely well both
in terms of total return and Sharpe ratio. These market data clearly demonstrate
that historically the Anticor algorithm could have generated high returns.
nyse
Figure 7–3 confirms the extraordinary results obtained on the nyse by authors
of [2]. All the graphs confirm that ANTI1 and ANTI2 offer much higher returns and
[Figure 7–2: djia.txt. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
[Figure 7–3: nyse. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
sp500
The Anticor algorithm performed well on the sp500 as shown in Figure 7–4. The
most notable aspect is that in the IGMR graph, the yearly return of all strategies
goes down as time increases. This suggests that as time passes, over the period
covered by the sp500 data set, the overall market performed worse.
[Figure 7–4: sp500.txt. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
tse
The Toronto Stock Exchange historical market empirical results shown in Figure
7–5 are also consistent with what the authors of [2] found. Of particular interest is
that even though in terms of absolute return ANTI2 performs better than ANTI1,
the risk-adjusted return of ANTI1 is superior to that of ANTI2. In fact, the Sharpe
ratio of f-ANTI2 is approximately equal to that of CBAL*.
[Figure 7–5: tse.txt. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
7.2.2 Recent Historical Market Data
dja
Figure 7–6 shows the four graphs for the dja market index. We should note that
the window size parameter has a significant effect. Indeed, for small window sizes
both ANTI1 and ANTI2 perform worse than CBAL* but perform better for larger
window sizes. Furthermore, we find it interesting that the Sharpe ratio of CBAL* is
the best. Hence for an equal amount of risk (assuming that the investor can borrow
at the risk free rate of four percent), an investor should prefer CBAL* over ANTI1
or ANTI2.
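The reasoning behind this preference can be made explicit with the usual capital allocation argument. This is a sketch under the assumption that the investor can both lend and borrow at a single risk-free rate; the symbols below are introduced only for this illustration. Suppose a fraction $w \ge 0$ of wealth is placed in a strategy $s$ with expected return $\mathbb{E}[r_s]$ and risk $\sigma_s$, and the remainder $1 - w$ earns (or, if negative, is borrowed at) the risk-free rate $r_f$. The combined position then has risk $\sigma_p = w \sigma_s$ and expected return

\[
\mathbb{E}[r_p] = r_f + w \left( \mathbb{E}[r_s] - r_f \right)
               = r_f + \frac{\mathbb{E}[r_s] - r_f}{\sigma_s} \, \sigma_p .
\]

The slope multiplying $\sigma_p$ is the Sharpe ratio of $s$, so for any target level of risk $\sigma_p$ the strategy with the larger Sharpe ratio yields the larger expected return.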
dji
The dji market is one of the new historical markets where the Anticor algorithm
performs best. As shown in Figure 7–7 both ANTI1 and ANTI2 perform better than
UCBAL and CBAL*. In addition, the Sharpe ratios of the Anticor algorithms are
better than that of CBAL* even after transaction costs. However, as can be observed from
the IGMR graph, most of the return is accumulated prior to 2002. In fact, if we look
at the cumulative return graph it appears that the period 2002 to 2005 resulted in a
negative return.
[Figure 7–6: dja. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
[Figure 7–7: dji. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
dot
Similarly to dji, the Anticor algorithms perform well on the dot market. However,
it should be noted that when the transaction costs are taken into consideration,
ANTI1 performs worse than CBAL*. In the cumulative return graph shown
in Figure 7–8, we note that most of the gains of ANTI2 were accomplished after
2002. In that respect, the performance of the Anticor algorithm on the dot market
is negatively correlated with the performance on the dji market.
[Figure 7–8: dot. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
iix
The performance of the Anticor algorithm for the iix market suggests that the
extremely high returns on the old historical nyse are not an isolated case. Indeed,
an investor using either ANTI1 or ANTI2 since 1998 would have obtained a yearly return of
over 60 percent, as shown in Figure 7–9. In addition,
this would have been accomplished at a small level of risk. Indeed, the daily risk of
ANTI1 is equal to the daily risk of CBAL*.
[Figure 7–9: iix. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
ndx
In Figure 7–10, we observe the four graphs for the ndx market. As in many
other instances, both ANTI1 and ANTI2 perform better than UCBAL even when
transaction costs of one percent are taken into consideration. However, unlike for
other markets, the total performance of CBAL* is between that of the two Anticor
algorithms. Most interestingly, a closer examination of the risk-adjusted performance
suggests that in this case the preferred strategy would be to adopt CBAL* (if it were
possible to see into the future). As shown in the IGMR graph, at all times it is clearly
preferable to invest in either ANTI1 or ANTI2 rather than UCBAL.
nwx
Similarly to the ndx market, the nwx market differentiates the performance of
ANTI1 versus ANTI2 by a large factor. However, as shown in Figure 7–11, the
[Figure 7–10: ndx. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
note that most of the gain realized by ANTI2 occurred after 2002, a period during
which ANTI1 tracked approximately the performance of CBAL*.
[Figure 7–11: nwx. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
nyi
Figure 7–12 shows impressive total returns earned by ANTI1 and ANTI2 be-
tween November 1998 and December 2002. Here we need to emphasize that the total
return graph is not in units of percent; it is expressed as a multiple of the initial investment.
The IGMR graph shows that over that time period, UCBAL had yearly returns
slightly less than 20 percent, which is also very good. It is astonishing to see that,
even after accounting for transaction fees, both ANTI1 and ANTI2 could have yielded
[Figure 7–12: nyi. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
nyy
Figure 7–13 shows impressive returns for the nyy. As surprising as the results
on nyi were, the results for nyy are superior. Both of these markets start with the
letters “ny” (for New York Stock Exchange); however, they have little overlap. On
the one hand, nyi contains international stocks from all industries, while on the other
hand, nyy contains technology, media and telecommunications stocks.
One interesting observation is that we can clearly observe a sharp decline in the
IGMR around 2000-2001, the time when the so-called “dot-com bubble” burst.
[Figure 7–13: nyy. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
oex
The total return graph for the oex market (S&P 100 Index - American) is shown
in Figure 7–14. This graph shows that the performance of ANTI1 and ANTI2 can
be strongly affected by the window size used. Indeed, we observe a trend where the
higher the window size the higher the total return. One interesting observation is
that even though ANTI1 and ANTI2 have higher Sharpe ratios than CBAL*, when
transaction costs are included the three strategies appear to have equal Sharpe
ratios. In the cumulative return graph we observe that much of the growth occurred
in early 2002.
[Figure 7–14: oex. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
soxx
Figure 7–15 shows the four graphs for the soxx (Philadelphia Stock Exchange
Semiconductor Sector) market. The ANTI1 and ANTI2 curves for the total return
versus window size graph start high and gradually decrease as the window size in-
creases. It is interesting to note in the cumulative return graph that a sharp increase
in wealth occurred between 1998 and 2000 followed by a strong decline. In early 2002,
it seems that all four strategies resulted in an equal total return. However, during the
period between December 2002 and February 2007, ANTI2 (and to a lesser extent
ANTI1) has shown stalwart performance.
[Figure 7–15: soxx. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
xau
Of all the recent historical markets, xau is perhaps the most curious one. Figure
7–16 shows the exceptionally high total return for ANTI1 and ANTI2. If an investor
had decided to invest in the xau market in November 1998 using the ANTI2 strategy,
he would have obtained over 100 percent return every year until December 2002. On
the other hand, we can observe that starting to invest in this market using ANTI2 at
a later time resulted in a marginally smaller yearly return every following year. Yet
even an investor joining in 2002 would have earned over 20 percent per year, an
excellent return when compared to the market.
[Figure 7–16: xau. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
xmi
Figure 7–17 shows the results obtained for the xmi market. In the total return
graph we observe that ANTI1 with transaction costs results in a worse total return
for window size of 2 than UCBAL but in a better performance than both UCBAL
and CBAL* for window size 50. The inverse performance relationship exists for
ANTI2.
It is also interesting to note how the IGMR graph is distributed. For the period
before 1999, the ANTI1 and ANTI2 strategies provided returns better than CBAL*.
However, after 2001, ANTI2 performed worse than CBAL* and ANTI1 performed
worse than UCBAL. This market contains major stocks and mirrors the Dow Jones
Industrial Average. That the results are mixed suggests, perhaps, that major stocks
are priced more efficiently and behave more randomly than others.
[Figure 7–17: xmi. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (Nov 1998 to Feb 2007), Yearly Return (%) vs. Time. Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
7.3 Simulated Market Data
Figures 7–18 and 7–19 show the four graphs for the modified random walk
simulated market as described in Section 6.3.1. The relative prices are independent
and identically-distributed random variables and so no correlation exists between
the relative prices. It is not surprising therefore to find that ANTI1 and ANTI2
performed very poorly on these markets. This suggests that in real markets, relative
prices are not independent and identically-distributed.
[Figure 7–18: mrw. Four panels: Sharpe Ratio (Daily Return vs. Daily Risk, risk-free rate = 0.04), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
Figures starting from 7–20 up to and including 7–29 show the simulation results
[Figures 7–19 to 7–29: mrw3 and mam0 through mam9, one figure per simulated market. Each figure has four panels: Sharpe Ratio (Daily Return vs. Daily Risk), Total Return vs. Window Size − 1, Cumulative Return vs. Time (days), Yearly Return (%) vs. Time (days). Curves: ucbal, cbal*, anti1, anti2 and their f- (friction) variants.]
CHAPTER 8
Conclusion
We have presented the portfolio selection problem, as well as some of the sim-
sets, including the historical market data used by Borodin et al. in [2] in their
experiments, as well as more recent market data. Settings both with and without
transaction costs were considered. On the historical data, our results matched those
of Borodin et al.; on the recent market data, the Anticor algorithm also performed
well, but not at the same exceptional level as with the historical data. Nonetheless,
our results show that it may be possible for an algorithm to consistently outperform
the market.
In Section 1.3, we presented the efficient market hypothesis, which states that
agents should not be able to consistently achieve above-average performance. It is
an open question whether the efficient market hypothesis is valid or not, and this
thesis does not intend to argue in favour of either side. However, we can observe
that the historical markets considered, as well as some of our simulated markets, do
respect the assumptions put forth in the efficient market hypothesis, and thus the
performance of the Anticor algorithm suggests that the weak-form efficient market hypothesis
did not hold. Thus, it may be possible to obtain benefits via technical analysis. Our
results do not imply that the hypothesis will fail to hold in the future.
The behavioural finance ideas of Shleifer [15] and the human irrationality ideas
of Kahneman and Tversky [10] suggest weaknesses in the efficient market hypothesis’
assumptions.
8.1 Future extensions
plified view of market operation, which does not consider how transactions would
actually be carried out via orders placed in the order books. In theory, a purchase
could be done by placing an order with a bid price of ∞, while a sale could be done
using an order with an ask price of 0; however, we suspect that this would severely
degrade the performance of the algorithm. Kearns et al. [16] have presented a
a limit order, which is a request from an agent to a broker to buy (or sell) a desired
quantity of a given security, within a given time interval, for no more than a given
maximum (or minimum, when selling) price per unit. The time interval can range
from a few seconds to several weeks, and the broker may or may not be able to fulfill
the order. For more information on limit orders, see [13]. Although the Anticor
algorithm cannot, in its current form, work with limit orders, it may be possible
to extend the algorithm to do so, allowing it to be combined with the algorithm of
Kearns et al. A combination of a portfolio selection algorithm with a trade-execution algorithm
could potentially be useful in practice.
APPENDIX A
Dependence Matrix Generation Algorithm
procedure DependenceMatrix(m, L, δ)
    D_·(1, ·) ← 0_{m×1}
    λ ← ceil(L × rand(m, 1))
    for all s ∈ {2, . . . , m} do
        dep ← ceil((s − 1) × rand)
        D_{λ_s}(s, dep) ← δ
    end for
end procedure
This algorithm creates a dependence matrix where each security si can only be
dependent (directly or indirectly) on securities sj with j < i, avoiding the creation of
cycles. In our case, we have decided to let securities have at most one direct positive
dependence and one direct negative dependence.
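The procedure can be sketched in Python as follows. This is one reading of the pseudocode, not the original implementation: it uses 0-based indices, assumes every entry not explicitly set is zero, and the function name and the optional `rng` argument are hypothetical.

```python
import random


def dependence_matrices(m, L, delta, rng=None):
    """Return L strictly lower-triangular m x m matrices as nested lists.

    Each security s > 0 receives exactly one entry delta, placed in a
    randomly chosen lag matrix and pointing at a lower-indexed security.
    """
    rng = rng or random.Random()
    # Assumption: all entries not set below are zero.
    D = [[[0.0] * m for _ in range(m)] for _ in range(L)]
    for s in range(1, m):          # securities 2..m in the 1-based pseudocode
        lag = rng.randrange(L)     # plays the role of lambda_s
        dep = rng.randrange(s)     # a security with a smaller index
        D[lag][s][dep] = delta
    return D


D = dependence_matrices(m=5, L=3, delta=0.3, rng=random.Random(42))
```

Because every nonzero entry sits strictly below the diagonal, no security can influence itself through a chain of dependencies. A single call places one dependence per security; obtaining both a positive and a negative dependence would presumably take a second call with −δ and an entry-wise sum, which is our interpretation and is not stated in the pseudocode.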
APPENDIX B
Indices Composition
B.1 Old Historical Market Data
Index compositions, data sets and supplementary market information for the djia,
nyse, sp500 and tse are available online at:
aa, aep, aes, aig, alex, amr, axp, ba, bni, c, cal, cat, chrw, cnp, cnw, csx, d, dd,
dis, duk, ed, eix, exc, expd, fdx, fe, ge, gm, gmt, hd, hon, hpq, ibm, intc, jbht, jblu,
jnj, jpm, ko, lstr, luv, mcd, mmm, mo, mrk, msft, ni, nsc, osg, pcg, peg, pfe, pg, r,
so, t, txu, unp, ups, utx, vz, wmb, wmt, xom, yrcw.
aa, aig, axp, ba, c, cat, dd, dis, ge, gm, hd, hon, hpq, ibm, intc, jnj, jpm, ko,
mcd, mmm, mo, mrk, msft, pfe, pg, t, utx, vz, wmt, xom.
adbe, amzn, avct, chkp, ckfr, csco, ebay, fdx, goog, hlth, iaci, ibm, intu, mfe,
iix, AMEX INTERACTIVE WEEK INTERNET
akam, amzn, aqnt, beas, brcm, chkp, cien, ckfr, cnet, csco, dgin, driv, ebay, elnk,
etfc, fdry, ffiv, goog, hlth, iaci, intu, jcom, jnpr, mfe, mnst, nflx, ntbk, palm, pcln, q,
qcom, rht, rimm, rnwk, sone, sunw, symc, tibx, twx, untd, vrsn, wbsn, webm, webx,
yhoo.
aapl, adbe, adsk, aeos, akam, altr, amat, amgn, amln, amzn, apcc, apol, atvi,
bbby, beas, biib, bmet, brcm, cdns, cdwc, celg, chkp, chrw, ckfr, cmcsa, cost, csco,
ctas, ctsh, ctxs, dell, disca, dish, ebay, eric, erts, esrx, expd, expe, fast, fisv, flex,
adct, adpt, adtn, alu, av, axe, cien, coms, csco, elx, fdry, jnpr, tlab.
abb, abn, abx, aeg, aib, amx, anz, axa, az, azn, bay, bbl, bbv, bce, bcs, bf, bhp,
bmo, bns, bp, brg, bt, caj, ceo, chl, cm, cni, cnq, cs, csg, da, db, dcm, dcx, deo, dt,
e, eca, ele, en, eon, fte, gsk, hbc, hmc, ing, ire, ity, kb, kep, kpn, lr, lyg, mc, mfc,
mfg, mt, mtu, nab, ngg, nhy, nmr, nok, ntt, nvs, pbr, phg, pkx, puk, rds-a, rds-b,
rep, rio, rtp, ry, sap, scm, si, slf, sne, sny, ssl, std, sto, su, sze, td, tef, ti, tls, tm, toc,
tot, ts, tsm, ubs, ul, un, vod, wbk.
adi, alu, amd, amt, amx, at, ate, auo, bce, bmc, bsy, bt, ca, caj, cbs, cci, ccu,
cha, chl, chu, cn, csc, cvc, dcm, dis, dox, dt, dtv, eds, emc, enl, eq, fte, gci, glw, hpq,
htx, ibm, ifx, ipg, kpn, ktc, lpl, luk, lxk, mbt, mhp, mot, mu, ncr, nok, nsm, nt, ntt,
nws, nws-a, nzt, omc, ote, pbi, phi, pso, pt, pub, q, rg, ruk, s, sap, say, scm, skm,
ssp, stm, stx, t, tef, ti, tka, tkc, tkg, tlk, tls, tmx, toc, trb, tsm, tu, tv, twx, txn,
umc, uvn, via-b, vip, vod, vz, wfr, wit, xrx.
aa, abt, aep, aes, aig, all, amgn, ati, avp, axp, ba, bac, bax, bdk, bhi, bmy, bni,
bud, c, cat, cbs, ccu, ci, cl, cmcsa, cof, cop, cpb, csc, csco, cvx, dd, dell, dis, dow, ek,
emc, ep, etr, exc, f, fdx, gd, ge, gm, goog, gs, hal, hd, het.
altr, amat, amd, brcm, fsl-b, ifx, intc, klac, lltc, mrvl, mu, mxim, nsm, nvls,
stm, ter, tsm, txn, xlnx.
abx, aem, au, cde, fcx, gfi, gg, gold, hmy, kgc, mdg, nem, paas, rgld.
axp, cvx, dd, dis, dow, ek, ge, gm, ibm, ip, jnj, ko, mcd, mmm, mo, mrk, msft,
pg, wmt, xom.
References
[1] Z. Bodie, A. Kane, and A. J. Marcus. Portfolio Performance Evaluation:
Investments. Irwin McGraw-Hill, 4th edition, 1999.
[2] A. Borodin, R. El-Yaniv, and V. Gogan. Can we learn to beat the best stock.
Journal of Artificial Intelligence Research, 21:579–594, May 2004.
[3] T.M. Cover and D. Gluss. Empirical Bayes stock market portfolios. Advances
in Applied Mathematics, pages 170–181, 1986.
[4] T.M. Cover and E. Ordentlich. Universal portfolios with side information. IEEE
Transactions on Information Theory, pages 348–363, 1996.
[5] T.M. Cover and J. Thomas. Elements of Information Theory. John Wiley and
Sons, Inc., 1991.
[6] Eugene F. Fama. Efficient capital markets: A review of theory and empirical
work. Journal of Finance, 25(2):383, 1970.
[7] Eugene F. Fama. Efficient capital markets: II. Journal of Finance, 46(5):1575,
1991.
[8] Yahoo! Finance. Historical market data. http://finance.yahoo.com/. Data
providers: http://finance.yahoo.com/exchanges.
[9] Benjamin Graham and David Dodd. Security Analysis. McGraw-Hill, 1951.
[10] D. Kahneman and A. Tversky. Prospect theory: An analysis of decision under
risk. Econometrica, 47(2):263–291, 1979.
[11] Archie Craig MacKinlay and Andrew W. Lo. A Non-Random Walk Down Wall
Street. Princeton University Press, 1999.
[12] Harry M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, 1952.
[13] Frank K. Reilly and Keith C. Brown. Investment Analysis and Portfolio
Management. Harcourt College Publishers, 2003.