
The Fundamental Law of Active Management:

Time Series Dynamics and Cross-Sectional Properties

Zhuanxin Ding, Ph.D.

Research Analyst

Analytic Investors
555 W 5th St, 50th Floor
Los Angeles, CA 90013

(Phone) 213-787-9760
(Fax) 213-688-1518
zding@aninvestor.com

First Draft October 15, 2009


Revised May 1st, 2011

Electronic copy available at: http://ssrn.com/abstract=1625834


ABSTRACT

I derive a generalized version of the fundamental law of active management under
some weak conditions. I show that the original fundamental law of Grinold and
various extensions are special cases of the result presented in this paper. I also
show that cross-sectional ICs are usually different from time series ICs even if the
time series ICs are all the same across securities. The fundamental law derived
in this paper is robust to forecast model specification. My results show that
average signal IC and IC standard deviation are the two most important variables
in determining the potential IR a strategy can achieve. I extend the fundamental
law to models with multiple factors and study the impact of missing one or more
return or risk factors to portfolio IR.



The Fundamental Law of Active Management:
Time Series Dynamics and Cross-Sectional Properties

Introduction

Since the seminal work by Grinold (1989), the fundamental law of active management
has been widely used in the quantitative investment community as a tool to assess a
portfolio manager's ability to add value. According to Grinold (1989), the fundamental
law relates three variables: your skill in forecasting exceptional returns (IC), the breadth
of your strategy (N), and the value added of your investment strategy (IR). Grinold
(1989) concludes that "based on assumptions that are not quite true and simplified with
some reasonable approximations" the three variables have the following relationship:
$$IR = IC \cdot \sqrt{N}, \tag{1}$$
where IR is the information ratio, IC is the information coefficient, and N is the breadth.
The derivation of the fundamental law is closely related to another Grinold paper
(Grinold (1994)) that shows "Alpha is Volatility Times IC Times Score", i.e.,
$$\alpha_{it} = \sigma_{r_i} \cdot IC \cdot z_{it-1}, \tag{2}$$
where $\sigma_{r_i}$ is the residual return (defined below) volatility and $z_{it-1}$ is the
standardized forecast signal (score) that is known at the end of time t-1. The theoretical
and empirical development on this line of the fundamental law culminated in the book by
Grinold and Kahn (2000) titled "Active Portfolio Management." Based on the
fundamental law, Grinold and Kahn (2000) conclude that "you (portfolio managers) must
play often and play well to win at the investment management game. It takes only a
modest amount of skill to win as long as that skill is deployed frequently and across a
large number of stocks."

Grinold and Kahn (2000, p148) define "breadth" as the number of independent forecasts
of exceptional return one makes per year. Buckle (2004) and Grinold (2007) provide
some further discussion on how to measure breadth. Given the interdependent nature of
the investment process, it is rather hard to measure "breadth" in any precise way, which
limits the usefulness of the fundamental law in practice. Portfolio managers or analysts
usually use the number of stocks in the selection universe as breadth. The consequence of
using this simplified definition is that the theoretically calculated IR number seems to
overestimate the IR a portfolio manager can reach in practice. For example, given a
forecast signal with a monthly average IC of 0.03 and a selection universe of 1000 stocks,
the expected annualized IR from Grinold's formula will be 3.29 which is usually much
higher than anyone can get from an actual portfolio or a backtest simulation. Portfolio
managers are left wondering why realized information ratios are only a fraction of their
predicted value. Clarke, de Silva and Thorley (2002, p50) point out "a common rule of
thumb in practice is that the theoretical information ratio suggested by the fundamental
law should be cut in half." However, for the above mentioned example, the IR estimate
will still be too high even if cut by half (IR=1.64). As noted by Grinold (1989, p32)
himself "an observed information ratio above 1.5 is rare indeed."
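
The arithmetic behind the example above can be checked in a few lines. This is a minimal sketch: the function name `grinold_ir` is introduced here for illustration, and annualizing a monthly IR by the square root of 12 is an assumption (independent monthly bets) that reproduces the quoted figures.

```python
import math

def grinold_ir(ic: float, breadth: int, periods_per_year: int = 1) -> float:
    """Equation (1), IR = IC * sqrt(N), scaled by sqrt(periods) to annualize.

    The sqrt(periods_per_year) scaling is an assumption made for this sketch.
    """
    return ic * math.sqrt(breadth) * math.sqrt(periods_per_year)

# Monthly IC of 0.03 on a 1000-stock universe, annualized over 12 months.
annual_ir = grinold_ir(0.03, 1000, periods_per_year=12)
print(round(annual_ir, 2))      # 3.29
print(round(annual_ir / 2, 2))  # 1.64, the "cut in half" rule of thumb
```

Even the halved figure sits above the 1.5 level that Grinold describes as rare.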

Clarke et al. (2002) attribute the reduction in performance to the constraints in the
portfolio construction process and proposed the concept of "the transfer coefficient" to
account for the leakage of IR from Grinold's original formula. They show that constraints
in portfolio construction (such as country or sector exposures, long only, etc.)
lead to suboptimal portfolio weights in terms of alpha generation, thus reducing the
maximum achievable IR. They developed a framework for measuring the deviation of the
optimal constrained weights from optimal non-constrained weights and proposed a
generalized fundamental law as follows:
$$IR = TC \cdot IC \cdot \sqrt{N}, \tag{3}$$
where TC is the transfer coefficient, defined as the cross-sectional correlation coefficient
between risk-adjusted expected residual returns and risk-adjusted active weights. They
define N precisely as the number of stocks in the selection universe. According to their
simulation study, the typical transfer coefficient is in the range of 0.3 to 0.8. So the
original IR calculated from Grinold's formula should be about halved. Even so, as
discussed above, the TC adjusted IR still appears to be too high.

In order to understand why that happens, we need to examine the assumptions made by
Grinold in deriving his fundamental law. It is essential to specify the modeling
assumptions clearly in order to have a constructive discussion. Different modeling assumptions
will give different return forecasts and forecast error covariance matrices which will
generate different estimates for IR. As readily acknowledged by Grinold, the original
form of the fundamental law is based on some unrealistic assumptions. The most
important assumption is that time series ICs between an individual stock's residual return
and its forecast signal are the same across all securities and are a constant over time.
Grinold (1989, 1994) and Grinold and Kahn (2000) then used the time series IC and
cross-sectional IC interchangeably. In practice, many quantitative managers run a Fama-
MacBeth type cross-sectional regression to get realized ICs at different time periods. The
ICs calculated this way are far from constant and often fluctuate around an average IC,
i.e., the ICs themselves are stochastic. Based on this observation, Qian and Hua (2004)
show that a more appropriate IR to use in practice is average IC divided by the standard
deviation of IC
$$IR = \frac{\overline{IC}_N}{\sigma_{IC,N}}, \tag{4}$$
where $\overline{IC}_N$ is the sample average IC from a selection universe of N securities, and $\sigma_{IC,N}$
is the sample standard deviation of IC that Qian and Hua (2004) call the "strategy risk".
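
As a minimal sketch of how Equation (4) is applied to a realized IC series, the snippet below divides the sample mean by the sample standard deviation. The function name `sample_ir` and the six monthly IC values are hypothetical, made up purely for illustration.

```python
import statistics

def sample_ir(ic_series):
    """Equation (4): sample average IC divided by sample standard deviation of IC."""
    return statistics.mean(ic_series) / statistics.stdev(ic_series)

# Hypothetical monthly cross-sectional ICs (illustrative values, not real data).
monthly_ics = [0.05, -0.02, 0.08, 0.01, -0.04, 0.10]
monthly_ir = sample_ir(monthly_ics)
print(monthly_ir)
```

Note that the universe size N does not appear explicitly; it enters only through its effect on the dispersion of the realized ICs.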

In a more recent paper, Ye (2008) goes one step further to bridge the gap between the
original Grinold (1989) formula and the Qian and Hua (2004) formula. Based on her
assumptions, she establishes that
$$IR = \frac{IC}{\sqrt{1/N + \sigma_{IC}^2}}. \tag{5}$$

It is obvious that Equation (1) is a special case of Equation (5) when $\sigma_{IC} = 0$ (as assumed
by Grinold (1989)).
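
To make the relationship between Equations (1) and (5) concrete, here is a small sketch. The function names are introduced for illustration, and the IC standard deviation of 0.1 is an assumed value chosen only to show the order of magnitude of the effect.

```python
import math

def ir_grinold(ic: float, n: int) -> float:
    """Equation (1): IR = IC * sqrt(N)."""
    return ic * math.sqrt(n)

def ir_ye(ic: float, sigma_ic: float, n: int) -> float:
    """Equation (5): IR = IC / sqrt(1/N + sigma_IC^2)."""
    return ic / math.sqrt(1.0 / n + sigma_ic ** 2)

ic, n = 0.03, 1000
no_strategy_risk = ir_ye(ic, 0.0, n)    # collapses to Equation (1)
with_strategy_risk = ir_ye(ic, 0.1, n)  # sigma_IC = 0.1 is an assumed value
print(ir_grinold(ic, n), no_strategy_risk, with_strategy_risk)
```

With these inputs the IC volatility term dominates 1/N in the denominator, so the single-period IR falls to well under a third of the value from Equation (1).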

With all these different versions of fundamental laws, it can be confusing for practitioners
to decide which one to use. It is crucial to have a full grasp of the different underlying
assumptions and the resulting conclusions from these fundamental laws. In this paper, I
try to set up a coherent econometric modeling structure and derive a more precise
fundamental law under much weaker assumptions. I will show that time series ICs are
usually different from cross-sectional ICs even if time series ICs are the same across all
individual securities. They will be the same only under some strong conditions. I will also
show that different forms of fundamental laws are a result of either simplistic
assumptions (Grinold (1989)), or approximations (Ye (2008)), or sample vs. population
(Qian and Hua (2004)). When the more relevant conditional residual return forecast error
covariance matrices are used, we will arrive at the more general form of the fundamental
law presented in this paper.

I further extend the fundamental law to models with more than one factor, and discuss the
impact of missing one or more return or risk factors to the portfolio IR. I also show that
the form of the generalized fundamental law derived in this paper is robust to model
specification. If the true relationship in the analysis is between the raw residual returns
and the factor exposures, one will get the fundamental law in a similar form.

The empirical section of this study shows that the fundamental law derived in this paper
gives a very close estimate for the empirical IR one can reach when the proper risk model
that incorporates the "strategy risk" is used. It also shows that the impact of other risk
factors is usually small or negligible after the most important "strategy risk" is included
in the risk model.

Framework and Notation

I will follow the framework and notation in Clarke et al. (2002) and Ye (2008). A
variable with subscript i ($i = 1, \ldots, N$) and t ($t = 1, \ldots, T$) represents the variable value for
security i at the end of time t. A variable in bold represents a vector or matrix.

Given a benchmark portfolio, the total excess return (i.e., return in excess of the risk-free
rate) on any stock i can be decomposed into a systematic portion that is correlated with
the benchmark excess return and a residual return that is not, by
$$r_{it}^{Total} = \beta_{it} R_{B,t} + r_{it}, \tag{6}$$
where
$\beta_{it}$ = beta of security i with respect to the benchmark
$R_{B,t}$ = benchmark excess return
$r_{it}$ = realized residual return
The benchmark and the actively managed portfolios are defined by the weights,
$w_{B,it}$ and $w_{P,it}$, assigned to each of the N stocks in the investable universe respectively. It
is shown in Clarke et al. (2002) that the portfolio active return, which is defined as the
managed portfolio total excess return minus the benchmark total excess return, adjusted
for the managed portfolio's beta with respect to the benchmark, can be written as

$$R_{A,t} \equiv R_{P,t} - \beta_{P,t} R_{B,t} = \sum_{i=1}^{N} w_{P,it}\, r_{it} = \sum_{i=1}^{N} \Delta w_{it}\, r_{it}, \tag{7}$$

where $\Delta w_{it}$ is the active weight defined as the difference between the managed portfolio
weight and the benchmark weight at the beginning of time period t.¹ Note that the active
weights, $\Delta w_{it}$, sum to 0 because they are differences in two sets of weights that each sum
to 1. Also note that the stock returns, $r_{it}$, in (7) are residual, not total, excess returns. As
pointed out in Clarke, et al. (2002), residuals are the relevant component of security
returns when performance is measured against a benchmark on a beta-adjusted basis.

We assume that residual returns follow a conditional normal distribution, and define ex
ante alpha of security i ($i = 1, \ldots, N$) in period t as the expected residual return
conditional on information available at the end of time period t-1, $I_{t-1}$:
$$\boldsymbol{\alpha}_t = E(\mathbf{r}_t \mid I_{t-1}), \tag{8}$$
and we define risk related to the alpha expectation as the conditional covariance of the
forecast errors
$$\boldsymbol{\Omega}_t = E[(\mathbf{r}_t - \boldsymbol{\alpha}_t)(\mathbf{r}_t - \boldsymbol{\alpha}_t)' \mid I_{t-1}], \tag{9}$$
where $\boldsymbol{\alpha}_t$ and $\mathbf{r}_t$ are $N \times 1$ vectors with $\alpha_{it}$ and $r_{it}$ as their elements respectively. The
assumption of asset return normality is one of the fundamental assumptions under
Markowitz's mean-variance portfolio choice theory, and the mean and covariance matrix
fully determine a multivariate normal distribution. Under the residual return normality
assumption, the covariance of the forecast errors is the relevant measure of risk. There is
risk because there is uncertainty, and risk is associated with the part of return that we are not
able to predict. If we know the future returns perfectly then there is no uncertainty, hence
no risk. The conditional risk associated with our alpha estimate must be smaller than the
total risk around the unconditional alpha expectation. If this is not the case, then the
forecast provides no additional information and the lagged information set, $I_{t-1}$, is
useless.² This is the major difference between the risk model used in this paper and the
risk models used in Grinold (1989, 1994), Grinold and Kahn (2000), Clarke et al. (2002,
2006), Qian and Hua (2004), and Ye (2008). Of course, the assumption of stock return
normality may not be valid in practice, and the return and risk models one uses are very
likely mis-specified, which may cause theoretically derived results not to reflect what one
gets in reality. I will give some discussion later on the impact of missing alpha or risk
factors in conditional mean and covariance modeling.

After having specified the conditional mean and covariance matrix, we will then use the
mean-variance analysis tool for portfolio construction based on the theory of utility
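maximization. In each period t, the optimal market-neutral portfolio, $P_t$, is selected to
maximize the mean-variance utility function:
$$\max_{\Delta\mathbf{w}_t} \; U_t = \alpha_{Pt} - \frac{1}{2}\lambda\,\sigma_{Pt}^2 = \Delta\mathbf{w}_t'\boldsymbol{\alpha}_t - \frac{1}{2}\lambda\,\Delta\mathbf{w}_t'\boldsymbol{\Omega}_t\,\Delta\mathbf{w}_t \quad \text{s.t.} \quad \Delta\mathbf{w}_t'\mathbf{1} = 0, \tag{10}$$
where
$\alpha_{Pt}$ = expected active return on the portfolio
$\sigma_{Pt}^2$ = active risk of the portfolio based on the portfolio holdings
$\lambda$ = a risk-aversion parameter
$\mathbf{1}$ = $N \times 1$ vector of 1s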
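The solution for this optimization problem is
$$\Delta\mathbf{w}_t = \frac{1}{\lambda}\left(\boldsymbol{\Omega}_t^{-1}\boldsymbol{\alpha}_t - \kappa\,\boldsymbol{\Omega}_t^{-1}\mathbf{1}\right), \tag{11}$$
where $\kappa = \dfrac{\boldsymbol{\alpha}_t'\boldsymbol{\Omega}_t^{-1}\mathbf{1}}{\mathbf{1}'\boldsymbol{\Omega}_t^{-1}\mathbf{1}}$ is a scalar.
A certain value of $\lambda$ corresponds to a certain value of $\sigma_{Pt}$ since
$$\Delta\mathbf{w}_t'\boldsymbol{\Omega}_t\,\Delta\mathbf{w}_t = \sigma_{Pt}^2. \tag{12}$$
Substituting (11) into (12) and by some straightforward algebra we have
$$\lambda = \frac{1}{\sigma_{Pt}}\sqrt{\boldsymbol{\alpha}_t'\boldsymbol{\Omega}_t^{-1}\boldsymbol{\alpha}_t - \kappa\,\mathbf{1}'\boldsymbol{\Omega}_t^{-1}\boldsymbol{\alpha}_t}. \tag{13}$$
The optimal portfolio active weight is then
$$\Delta\mathbf{w}_t = \sigma_{Pt}\,\frac{\boldsymbol{\Omega}_t^{-1}(\boldsymbol{\alpha}_t - \kappa\mathbf{1})}{\sqrt{\boldsymbol{\alpha}_t'\boldsymbol{\Omega}_t^{-1}(\boldsymbol{\alpha}_t - \kappa\mathbf{1})}}, \tag{14}$$
and the expected portfolio return
$$\alpha_{Pt} = \Delta\mathbf{w}_t'\boldsymbol{\alpha}_t = \sigma_{Pt}\sqrt{\boldsymbol{\alpha}_t'\boldsymbol{\Omega}_t^{-1}(\boldsymbol{\alpha}_t - \kappa\mathbf{1})}. \tag{15}$$
If we assume that the target tracking error remains a constant ($\sigma_{Pt} = \sigma_P$) at each
rebalance of the portfolio, a typical practice for many quantitative portfolio managers,
then the ex ante expected information ratio of the portfolio is
$$IR = E(\alpha_{Pt})/\sigma_P = E\left[\sqrt{\boldsymbol{\alpha}_t'\boldsymbol{\Omega}_t^{-1}(\boldsymbol{\alpha}_t - \kappa\mathbf{1})}\right]. \tag{16}$$
This is a very general result that should hold as long as the residual return has a
conditional normal distribution with mean $\boldsymbol{\alpha}_t$ and covariance matrix $\boldsymbol{\Omega}_t$. The result here
is very similar to the result by Clarke, de Silva and Thorley (2006). The difference is that
they only deal with a one period static model with no time series dynamics. It will be
seen below that the interesting part of the fundamental law is in decomposing $\boldsymbol{\alpha}_t$ and $\boldsymbol{\Omega}_t$
so that analysts can get more insight out of Equation (16).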
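
The market-neutral solution in Equations (11) to (16) can be sanity-checked numerically. The sketch below is illustrative only: it assumes a diagonal forecast-error covariance matrix so that no linear algebra library is needed, it writes `kappa` for the scalar in Equation (11), and the alphas, variances, and target tracking error are hypothetical inputs.

```python
import math

def optimal_active_weights(alpha, var, target_te):
    """Equations (11)-(15) for a DIAGONAL covariance matrix (an assumption).

    alpha: expected residual returns; var: forecast-error variances (diagonal
    of Omega); target_te: target tracking error sigma_P.
    Returns (active weights, single-period ex ante IR).
    """
    inv = [1.0 / v for v in var]                           # diagonal of Omega^{-1}
    kappa = sum(a * w for a, w in zip(alpha, inv)) / sum(inv)
    # alpha' Omega^{-1} (alpha - kappa 1): the squared IR inside Equation (16)
    ir_sq = sum(a * w * (a - kappa) for a, w in zip(alpha, inv))
    scale = target_te / math.sqrt(ir_sq)
    weights = [scale * w * (a - kappa) for a, w in zip(alpha, inv)]
    return weights, math.sqrt(ir_sq)

# Hypothetical inputs for four securities.
alpha = [0.02, -0.01, 0.005, -0.015]
var = [0.04, 0.09, 0.0225, 0.0625]
weights, ir = optimal_active_weights(alpha, var, target_te=0.05)

active_risk = math.sqrt(sum(w * w * v for w, v in zip(weights, var)))
expected_return = sum(w * a for w, a in zip(weights, alpha))
print(sum(weights), active_risk, expected_return / active_risk, ir)
```

The printed values illustrate three properties of the solution: the active weights sum to zero, the realized ex ante risk equals the target tracking error, and expected return divided by risk equals the square-root expression in Equation (16).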

From the above discussion, it is clear that the key is how to forecast the alpha and the
corresponding covariance matrix. As Kahn (1997) points out "active management is
forecasting." Different forecasts will give us different ex ante expected information
ratios. In the literature, two different approaches are used to forecast alpha. One uses time
series models and the other uses a Fama-MacBeth type cross-sectional regression
approach. As for the covariance matrix, many people use a risk model that does not have a
direct relationship with the alpha estimation. Strictly speaking, a risk model that is
detached from the alpha model will be a mis-specified risk model for the reasons
discussed above (see Lee and Stefek (2008) for a very good discussion on this topic).
This mis-specification usually results in the underestimation of risk when one runs an
actual portfolio because the very important "strategy risk" is being left out (see Qian and
Hua (2004), Qian, Hua, and Sorensen (2007)).

Time Series Dynamics

Given two panel data sets r and z observed over time periods $t = 1, \ldots, T$ for
securities $i = 1, \ldots, N$, I will show in this section that the cross-sectional IC (correlation)
between r and z is not necessarily the same as time series IC (correlation) even if the time
series IC is the same across all the securities. The result from time series modeling
assumptions cannot be applied to cross-sectional modeling structures without some
further assumptions. In practice, cross-sectional IC is what most practitioners care about
for the reason that will become clear later. It is important for researchers to exercise
caution in applying results from time series analysis to cross-sectional data.

Assume that the true forecasting relationship between the lagged information set,
$I_{t-1}$, and the residual returns, $r_{it}$, is the linear model
$$r_{it} = g_i z_{it-1} + \varepsilon_{it} \tag{17}$$
for security i over a sample of T periods. In the equation, $g_i$ is the regression coefficient
specific for security i that relates lagged information $z_{it-1}$ with future residual returns $r_{it}$
(please note that $g_i$ is different from the usual definition of factor return from a
cross-sectional regression), $z_{it-1}$ is the lagged explanatory variable that becomes known at the
end of time t-1 and has both time series and cross-sectional mean 0 and standard
deviation 1, and $\varepsilon_{it} \sim N(0, \sigma_{\varepsilon_i}^2)$ is the idiosyncratic noise that cannot be predicted. We further
assume
T1) $E(z_{it-1}\varepsilon_{it}) = 0$ for all i and t,
and
T2) $E(\varepsilon_{it}\varepsilon_{jt}) = 0$ for $i \neq j$.
T1) is a very general assumption for linear regression models stating that the explanatory
variable and the residual are not correlated, and T2) assumes that the forecast errors are
not correlated across stocks so that the idiosyncratic covariance matrix is diagonal. This
is also a common assumption for idiosyncratic noise.

For ease of exposition, we will focus our attention on population quantity and ignore the
sample estimation error of the parameters. Basic regression of Equation (17) gives us,
$$g_i = \mathrm{Var}^{-1}(z_i)\,\mathrm{Cov}(z_i, r_i) = \frac{\mathrm{Cov}(z_i, r_i)}{\sqrt{\mathrm{Var}(z_i)\,\mathrm{Var}(r_i)}}\sqrt{\frac{\mathrm{Var}(r_i)}{\mathrm{Var}(z_i)}} = IC_{ts,i}\,\sigma_{r_i}/\sigma_{z_i}, \tag{18}$$
where $IC_{ts,i}$ is the time series correlation between residual return $r_{it}$ and forecast
signal $z_{it-1}$, $\sigma_{r_i}$ is the standard deviation (volatility) of residual return $r_{it}$, and $\sigma_{z_i}$ is the
standard deviation (volatility) of $z_{it-1}$, which is 1 by assumption. The time series prediction
for alpha from this model is
$$\alpha_{it} = E(r_{it} \mid I_{t-1}) = IC_{ts,i}\,\sigma_{r_i}\,z_{it-1}, \tag{19}$$
and the conditional volatility, or forecast error volatility, is
$$\sigma_{\varepsilon_i}^2 = \mathrm{Var}(r_{it} \mid I_{t-1}) = (1 - IC_{ts,i}^2)\,\sigma_{r_i}^2. \tag{20}$$
It should be noted here that $\sigma_{r_i} \neq \sigma_{\varepsilon_i}$ when $IC_{ts,i} \neq 0$. As we discussed above for
Equation (9), when the forecast signal $z_{it-1}$ contains useful information for predicting
residual return $r_{it}$, then the resulting error variance ($\sigma_{\varepsilon_i}^2$) should be smaller than the
original unconditional residual return variance ($\sigma_{r_i}^2$). Our residual error variance
estimation is related to the alpha model estimation. This is the major difference between
the risk estimate here and the risk estimate provided by any commercial risk model which
has no connection with alpha estimation.
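
The step from Equation (17) to Equation (20) can be spelled out as a short variance decomposition. The sketch below uses only assumption T1) and the unit-variance normalization of the signal; $\sigma_{\varepsilon_i}$ denotes the forecast error volatility and $\sigma_{r_i}$ the unconditional residual return volatility.

```latex
\begin{aligned}
\mathrm{Var}(r_{it}) &= g_i^2\,\mathrm{Var}(z_{it-1}) + \mathrm{Var}(\varepsilon_{it})
  && \text{(cross term vanishes by T1)} \\
\sigma_{r_i}^2 &= IC_{ts,i}^2\,\sigma_{r_i}^2 + \sigma_{\varepsilon_i}^2
  && \text{(using } g_i = IC_{ts,i}\,\sigma_{r_i} \text{ and } \mathrm{Var}(z_{it-1}) = 1\text{)} \\
\sigma_{\varepsilon_i}^2 &= (1 - IC_{ts,i}^2)\,\sigma_{r_i}^2
\end{aligned}
```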

Substituting the alpha and volatility prediction into Equation (16) we have the ex ante
expected information ratio as

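$$IR = E\left[\sqrt{\boldsymbol{\alpha}_t'\boldsymbol{\Omega}_t^{-1}(\boldsymbol{\alpha}_t - \kappa\mathbf{1})}\right] = E\left[\sqrt{\sum_{i=1}^{N}\frac{IC_{ts,i}^2\,z_{it-1}^2}{1 - IC_{ts,i}^2} - \kappa\sum_{i=1}^{N}\frac{IC_{ts,i}\,z_{it-1}}{(1 - IC_{ts,i}^2)\,\sigma_{r_i}}}\right]. \tag{21}$$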

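If we assume that the cross-sectional distributions of $IC_{ts,i}$ and $z_{it-1}$ are independent, then as
N becomes large, we have
$$\begin{aligned}
IR &\approx E\left[\sqrt{N\,E_{cs}\!\left(\frac{IC_{ts,i}^2\,z_{it-1}^2}{1 - IC_{ts,i}^2}\right) - \kappa N\,E_{cs}\!\left(\frac{IC_{ts,i}\,z_{it-1}}{(1 - IC_{ts,i}^2)\,\sigma_{r_i}}\right)}\right] \\
&= E\left[\sqrt{N\,E_{cs}\!\left(\frac{IC_{ts,i}^2}{1 - IC_{ts,i}^2}\right)E_{cs}\!\left(z_{it-1}^2\right) - \kappa N\,E_{cs}\!\left(\frac{IC_{ts,i}}{(1 - IC_{ts,i}^2)\,\sigma_{r_i}}\right)E_{cs}\!\left(z_{it-1}\right)}\right] \\
&= \sqrt{\sum_{i=1}^{N}\frac{IC_{ts,i}^2}{1 - IC_{ts,i}^2}},
\end{aligned} \tag{22}$$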

where $E_{cs}$ stands for the cross-sectional expectation operator. In deriving Equation (22)
we used the assumption that the forecast signal, $z_{it-1}$, is cross-sectionally normalized to
have mean 0 and standard deviation 1. When the time series ICs are the same across all
the securities, i.e. $IC_{ts,i} = IC_{ts}$ for all i, we have
$$IR = \frac{IC_{ts}}{\sqrt{1 - IC_{ts}^2}}\sqrt{N} \approx IC_{ts}\sqrt{N}. \tag{23}$$
The approximation holds when $IC_{ts}$ is small, which is typically the case in empirical work.

Equation (23) shows that the original fundamental law of Grinold (1989) holds
approximately under the time series model assumption when the ICs are the same across all
the assets and are small. The reason that the original formula of Grinold (1989) needs to be
adjusted by $\sqrt{1 - IC_{ts}^2}$ is that we used the conditional volatility of the residual return
instead of the unconditional one. If the unconditional residual return variance ($\sigma_{r_i}^2$) is
used instead, then the denominator part of Equations (21) to (23) becomes 1 and we get
Grinold's original fundamental law. Some interesting observations can be made from
Equations (22) and (23). When one has the skill to predict some residual returns perfectly
(some $IC_{ts,i} = 1$) then the IR goes to infinity no matter what the breadth is. This makes
intuitive sense because if one can predict some residual returns perfectly then she/he can
make a sure bet on these stocks against the rest of the universe to achieve the desired
excess return. The IR will be infinite since the optimization is set up in such a way that one
can take a leveraged bet. This is not a feature of the original Grinold formula, which states
that the IR will increase with the square root of N even if $IC_{ts} = 1$.
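
A short numerical sketch of Equations (22) and (23) illustrates both observations: the correction is negligible for small ICs, and a single near-perfect forecast dominates the IR. The function name `ir_time_series` and all input values are hypothetical.

```python
import math

def ir_time_series(ics):
    """Equation (22): IR = sqrt(sum_i IC_i^2 / (1 - IC_i^2)); requires |IC_i| < 1."""
    return math.sqrt(sum(ic * ic / (1.0 - ic * ic) for ic in ics))

n = 1000
exact = ir_time_series([0.03] * n)   # Equation (23), exact form
approx = 0.03 * math.sqrt(n)         # Grinold's approximation IC * sqrt(N)

# Replace one security's IC with an almost perfect forecast (hypothetical).
with_sure_bet = ir_time_series([0.03] * (n - 1) + [0.999999])
print(exact, approx, with_sure_bet)
```

With a common IC of 0.03 the exact and approximate values differ by less than a tenth of a percent, while the single near-perfect forecast pushes the IR up by several orders of magnitude regardless of the other 999 securities.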

If, instead of running a time series regression, we run a "mis-specified" cross-sectional
regression for the model in Equation (17),
$$r_{it} = f_t z_{it-1} + \eta_{it} \tag{24}$$
for cross-sectional security i = 1, 2, ..., N at time t, a simple cross-sectional regression
gives us
$$\begin{aligned}
f_t &= E_{cs,t}(r_{it}z_{it-1})/E_{cs,t}(z_{it-1}^2) \\
&= \frac{E_{cs,t}(r_{it}z_{it-1})}{\sqrt{E_{cs,t}(r_{it}^2)\,E_{cs,t}(z_{it-1}^2)}}\sqrt{\frac{E_{cs,t}(r_{it}^2)}{E_{cs,t}(z_{it-1}^2)}} \\
&= IC_{cs,t}\,d(\mathbf{r}_t)/d(\mathbf{z}_{t-1}) \\
&= IC_{cs,t}\,d(\mathbf{r}_t),
\end{aligned} \tag{25}$$
where $E_{cs,t}$ stands for the cross-sectional expectation operator at time t, $IC_{cs,t}$ is the
cross-sectional correlation between residual return $r_{it}$ and forecast signal $z_{it-1}$, $d(\mathbf{z}_{t-1})$ is
the cross-sectional dispersion³ of $z_{it-1}$, which is 1 by assumption, and $d(\mathbf{r}_t)$ is the
cross-sectional residual return dispersion at time t.

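The expected value of $f_t$ is
$$\begin{aligned}
\bar{f} = E(f_t) &= E(E_{cs,t}(r_{it}z_{it-1})) = E_{cs,t}(E(r_{it}z_{it-1})) \\
&= E_{cs,t}(E((g_i z_{it-1} + \varepsilon_{it})z_{it-1})) \\
&= E_{cs,t}(g_i) = \frac{1}{N}\sum_{i=1}^{N} g_i \\
&= \frac{1}{N}\sum_{i=1}^{N} IC_{ts,i}\,\sigma_{r_i}.
\end{aligned} \tag{26}$$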

On the other hand, if we assume $IC_{cs,t}$ and $d(\mathbf{r}_t)$ are independent over t, then from
Equation (25) we have
$$\bar{f} = E(f_t) = E(IC_{cs,t}\,d(\mathbf{r}_t)) = E(IC_{cs,t})\,E(d(\mathbf{r}_t)) = IC_{cs}\,\delta, \tag{27}$$
where $\delta = E(d(\mathbf{r}_t))$ is the expected cross-sectional residual return dispersion.

Substituting (26) into (27) we have
$$IC_{cs} = \frac{1}{N}\sum_{i=1}^{N} IC_{ts,i}\,\sigma_{r_i}/\delta, \tag{28}$$
i.e., the expected cross-sectional IC, $IC_{cs}$, is a weighted average of time series ICs and
they are usually not the same. If the time series ICs are the same across all securities,
i.e., $IC_{ts,i} = IC_{ts}$ for all i, then
$$IC_{cs} = IC_{ts}\,\frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}/\delta = IC_{ts}\,\tilde{\sigma}_r/\delta, \tag{29}$$
where $\tilde{\sigma}_r = \frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}$ is the cross-sectional average of the residual return standard
deviation. So as long as $\tilde{\sigma}_r \neq \delta$, we have the seemingly surprising result that the
cross-sectional $IC_{cs}$ will be different from the time series $IC_{ts}$ even if the time series ICs are the
same across all securities.

In the extreme case that all residual return standard deviations are the same, i.e. $\sigma_{r_i} = \sigma_r$
for all i, we have $\tilde{\sigma}_r = \sigma_r = \delta$ and $IC_{cs} = IC_{ts}$.⁴ So the discussion here shows that the
cross-sectional IC is usually different from the time series IC for an identical set of return
and factor exposures. They will only be the same under the very strong assumption that
the residual return volatilities are the same across all securities.
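
The gap between the two ICs in Equation (29) can be illustrated numerically. This is a rough sketch under one extra assumption that is not in the text: for a large universe, the expected cross-sectional dispersion is approximated by the square root of the average residual variance. The function name, the symbol `delta` for that dispersion, and the volatilities are all hypothetical.

```python
import math

def expected_cs_ic(ic_ts, vols):
    """Equation (29): IC_cs = IC_ts * (average volatility) / (expected dispersion).

    Assumption for this sketch: the expected dispersion is approximated by
    sqrt(mean of squared volatilities), a large-N approximation.
    """
    mean_vol = sum(vols) / len(vols)
    delta = math.sqrt(sum(v * v for v in vols) / len(vols))
    return ic_ts * mean_vol / delta

equal = expected_cs_ic(0.05, [0.3, 0.3, 0.3, 0.3])  # identical volatilities
mixed = expected_cs_ic(0.05, [0.1, 0.2, 0.4, 0.5])  # heterogeneous volatilities
print(equal, mixed)
```

Under this approximation the cross-sectional IC equals the common time series IC only when every security has the same volatility; any spread in volatilities pulls the cross-sectional IC below it.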

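Given the "mis-specified" cross-sectional model prediction for each individual security,
$$\alpha_{it} = IC_{cs}\,\delta\,z_{it-1} = IC_{ts}\,\tilde{\sigma}_r\,z_{it-1}, \tag{30}$$
we have the forecast error term as
$$\eta_{it} = IC_{ts}\,\sigma_{r_i}\,z_{it-1} - IC_{cs}\,\delta\,z_{it-1} + \varepsilon_{it} = IC_{ts}(\sigma_{r_i} - \tilde{\sigma}_r)\,z_{it-1} + \varepsilon_{it}, \tag{31}$$
which is different from $\varepsilon_{it}$. The conditional covariance matrix has the following elements:
$$\omega_{ij} = E(\eta_{it}\eta_{jt}) = \begin{cases} IC_{ts}^2(\sigma_{r_i} - \tilde{\sigma}_r)^2 + (1 - IC_{ts}^2)\,\sigma_{r_i}^2 & \text{when } i = j \\ 0 & \text{when } i \neq j \end{cases} \tag{32}$$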
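Substituting (32) into (16) we have
$$IR = E\left[\sqrt{\sum_{i=1}^{N}\frac{IC_{ts}^2\,\tilde{\sigma}_r^2\,z_{it-1}^2 - \kappa\,IC_{ts}\,\tilde{\sigma}_r\,z_{it-1}}{IC_{ts}^2(\sigma_{r_i} - \tilde{\sigma}_r)^2 + (1 - IC_{ts}^2)\,\sigma_{r_i}^2}}\right]. \tag{33}$$
If we assume that the cross-sectional distributions of $\sigma_{r_i}$ and $z_{it-1}$ are independent, then as N
becomes large, we have
$$IR \approx IC_{ts}\sqrt{\sum_{i=1}^{N}\frac{1}{IC_{ts}^2(\sigma_{r_i}/\tilde{\sigma}_r - 1)^2 + (1 - IC_{ts}^2)(\sigma_{r_i}/\tilde{\sigma}_r)^2}}. \tag{34}$$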

When all the residual return volatilities are the same we have
$$IR = \frac{IC_{ts}}{\sqrt{1 - IC_{ts}^2}}\sqrt{N} \approx IC_{ts}\sqrt{N}, \tag{35}$$
which is consistent with the result from the time series model. When the individual residual
return standard deviation varies across securities, the IR we get from the mis-specified
cross-sectional model will be different from the IR we get from the time series model.
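
Equation (34) is easy to evaluate directly, which makes the dependence on the volatility distribution visible. The following is a sketch with hypothetical inputs; the function name `ir_misspecified_cs` is introduced here for illustration.

```python
import math

def ir_misspecified_cs(ic_ts, vols):
    """Equation (34): large-N IR implied by the mis-specified cross-sectional model."""
    mean_vol = sum(vols) / len(vols)
    total = 0.0
    for v in vols:
        ratio = v / mean_vol  # sigma_{r_i} relative to the average volatility
        total += 1.0 / (ic_ts ** 2 * (ratio - 1.0) ** 2
                        + (1.0 - ic_ts ** 2) * ratio ** 2)
    return ic_ts * math.sqrt(total)

n, ic_ts = 1000, 0.03
equal_vols = ir_misspecified_cs(ic_ts, [0.3] * n)
mixed_vols = ir_misspecified_cs(ic_ts, [0.1, 0.2, 0.4, 0.5] * (n // 4))
time_series_ir = ic_ts / math.sqrt(1.0 - ic_ts ** 2) * math.sqrt(n)  # Equation (35)
print(equal_vols, mixed_vols, time_series_ir)
```

With identical volatilities the function reproduces Equation (35); with the heterogeneous set it returns a visibly different number, in line with the statement above.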

The discussion above shows that the original fundamental law of Grinold (1989, 1994)
holds under the assumption that the time series ICs are the same across all the securities
and the common IC is small. The cross-sectional IC is only the same as the time series IC
if an additional assumption is imposed that all residual return standard deviations are the
same (Ye (2008) made this assumption).

In practice, the above two assumptions (time series ICs and residual return volatilities are
the same across all securities) are overly restrictive and we can almost surely say they do
not hold. As an example, I calculated monthly means and standard deviations for time
series and cross-sectional ICs for book/price ratio (B/P) and Momentum factors for US
stocks in Table 1. The top panels in Figures 1 and 2 show the time series IC distributions
for both factors. It can be seen that the time series ICs have a normal-like distribution
with high dispersion. The bottom panels in Figures 1 and 2 show the cross-sectional IC
distributions for both factors. The result here shows that the assumption that the time
series ICs are the same across securities or the cross-sectional ICs are the same over time
is not valid in practice.

Table 1. Mean and Standard Deviation for Factor IC (Time Series and Cross-Section)

                                 Time Series                Cross-Section
Factors                      mean    stdev      n       mean    stdev     n     t-test
Original Signal
  Book to Price             0.088    0.176   15232     0.017    0.062    412     1.82
  Momentum                 -0.028    0.152   15232     0.025    0.099    412    -1.58
Both Dimension Normalized
  Book to Price             0.087    0.175   15232     0.050    0.072    412     0.94
  Momentum                 -0.028    0.152   15232    -0.003    0.085    412    -0.74

Note: The cross-sectional time period is from 1975:05 to 2009:08 with 412 observations. Time series ICs
are calculated for the companies that existed during this time period that have meaningful returns and factor
exposures data.

Figure 1. Histogram for Time Series and Cross-Sectional Correlation (one dimension standardized). [Four histogram panels: Book to Price Ratio and Momentum; top row time series ICs, bottom row cross-sectional ICs.]

Figure 2. Histogram for Time Series and Cross-Sectional Correlation (both dimensions standardized). [Four histogram panels: Book to Price Ratio and Momentum; top row time series ICs, bottom row cross-sectional ICs.]

Cross-Sectional Properties

The above discussion shows the assumption that all time series ICs are the same is not
realistic. I will show below it is also not necessary in deriving the (generalized)
fundamental law. In empirical finance work, many people use a Fama-MacBeth type
cross-sectional regression in relating the explanatory variables with asset returns.
Ibragimov and Müller (2009) find that as long as the cross-sectional coefficient
estimators are approximately normal (or scale mixtures of normals) and independent, the
Fama-MacBeth method results in valid inference even for a short panel that is
heterogeneous over time. Due to the small sample conservativeness result, the approach
allows for unknown and unmodelled heterogeneity. Peterson (2009) shows that when the
residuals of a given time period are correlated across firms, the Fama-MacBeth method
produces more efficient estimates than OLS and the standard error will be correct.
Another advantage is that the assumptions we have to make to achieve the kind of
fundamental law are much weaker than the assumptions we have to make in the time
series section.

Assume that the true forecasting relationship between the lagged information set,
$I_{t-1}$, and the residual returns, $r_{it}$, is a Fama-MacBeth type cross-sectional linear one
factor model as follows:
$$r_{it}/\sigma_{r_i} = \phi_t z_{it-1} + \varepsilon_{it} \tag{36}$$
for i = 1, ..., N assets at any given time t over $t = 1, \ldots, T$ time periods, where $\phi_t$ is the
cross-sectional factor return at time t, $\sigma_{r_i}$ is the conditional volatility for security i based
on lagged information set $I_{t-1}$ so that the risk-adjusted residual return, $r_{it}/\sigma_{r_i}$, has mean
0 and standard deviation 1, $z_{it-1}$ is the lagged factor exposure that becomes known at the
end of time t-1 and has both time series and cross-sectional mean 0 and standard
deviation 1, and $\varepsilon_{it} \sim N(0, \sigma_{\varepsilon_i}^2)$ is the idiosyncratic noise that cannot be predicted. We further
assume:
C1) $E(z_{it-1}\varepsilon_{it}) = 0$ for all i and t,
C2) $E(\varepsilon_{it}\varepsilon_{jt} \mid I_{t-1}) = 0$ for $i \neq j$,
and
C3) $E(\phi_t\varepsilon_{it} \mid I_{t-1}) = 0$ for all i and t.
C1) is a very general assumption for linear regression models stating that the explanatory
variable and the idiosyncratic noise are not correlated, C2) assumes that the forecast
errors are not correlated across stocks so that the idiosyncratic covariance matrix is
diagonal, and C3) assumes that there is no serial correlation between factor returns and
the idiosyncratic noise terms. C3) is important because otherwise the information
contained in the idiosyncratic noise terms can be used to improve the factor exposures
which can further improve the return forecasts.

Under the above assumptions, we have
$$\phi_t = IC_t, \tag{37}$$
where $IC_t$ is the cross-sectional IC (all the ICs discussed in this section will be
cross-sectional IC unless otherwise specified) between the risk-adjusted residual returns and the
forecast signals. In empirical work, one needs to get an ex ante estimate for the
cross-sectional correlation $IC_t$ before making an estimate for the alpha. The most common
method simply uses the historical average as an estimate. After the fact we can estimate
the ex post realized $IC_t$ using the actual risk-adjusted residual return $r_{it}/\sigma_{r_i}$ and $z_{it-1}$. As
discussed before, usually the cross-sectional factor IC spreads around a mean. For ease of
exposition below, we will assume that the cross-sectional factor $IC_t$ follows a normal
distribution with mean $IC$ and standard deviation $\sigma_{IC}$.⁵

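When the alpha model has the linear one factor structure in Equation (36) and under the
above assumptions, we have the conditional expectation (on known $\mathbf{z}_{t-1}$) of $\mathbf{r}_t$ as
$$\boldsymbol{\alpha}_t = E(\mathbf{r}_t \mid I_{t-1}) = \boldsymbol{\Lambda}_t^{1/2}\,IC\,\mathbf{z}_{t-1}, \tag{38}$$
and the conditional covariance as
$$\boldsymbol{\Omega}_t = E((\mathbf{r}_t - \boldsymbol{\alpha}_t)(\mathbf{r}_t - \boldsymbol{\alpha}_t)' \mid I_{t-1}) = \boldsymbol{\Lambda}_t^{1/2}\left(\sigma_{IC}^2\,\mathbf{z}_{t-1}\mathbf{z}_{t-1}' + \boldsymbol{\Sigma}_\varepsilon\right)\boldsymbol{\Lambda}_t^{1/2}, \tag{39}$$
where $\boldsymbol{\Lambda}_t$ is an $N \times N$ diagonal matrix with $\sigma_{r_i}^2$ as its elements, $\mathbf{z}_{t-1}$ is an $N \times 1$ vector of
factor exposures, and $\boldsymbol{\Sigma}_\varepsilon$ is the conditional covariance matrix of $\boldsymbol{\varepsilon}_t$, which is diagonal
according to assumption C2) above,⁶
$$\boldsymbol{\Sigma}_\varepsilon = \sigma_\varepsilon^2\,\mathbf{I}, \tag{40}$$
where $\sigma_\varepsilon^2 = 1 - (IC^2 + \sigma_{IC}^2)$ is the variance of the idiosyncratic error term $\varepsilon_{it}$ (it is the
same for all i based on the modeling assumptions) and $\mathbf{I}$ is the identity matrix.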
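
One way to see where the idiosyncratic variance in Equation (40) comes from is to take the variance of both sides of Equation (36). The sketch below writes $\phi_t$ for the cross-sectional factor return (a symbol chosen here for illustration), uses $\phi_t = IC_t$ with mean $IC$ and standard deviation $\sigma_{IC}$, and assumes, beyond C1) and C3), that $\phi_t$ is independent of the unit-variance exposure $z_{it-1}$.

```latex
\begin{aligned}
1 = \mathrm{Var}(r_{it}/\sigma_{r_i}) &= E(\phi_t^2\,z_{it-1}^2) + \sigma_\varepsilon^2 \\
&= E(\phi_t^2)\,E(z_{it-1}^2) + \sigma_\varepsilon^2 \\
&= (IC^2 + \sigma_{IC}^2) + \sigma_\varepsilon^2 \\
\Rightarrow\quad \sigma_\varepsilon^2 &= 1 - IC^2 - \sigma_{IC}^2
\end{aligned}
```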

Given the above modeling assumptions and by some straightforward algebra, it is shown
in Appendix A that the ex ante expected portfolio excess return at time t is
$$\alpha_{Pt} = \frac{IC}{\sqrt{(1 - IC^2 - \sigma_{IC}^2)/N + \sigma_{IC}^2}}\,\sigma_P. \tag{41}$$
So the fundamental law in the more general form should be
$$IR = \frac{E(\alpha_{Pt})}{\sigma_P} = \frac{IC}{\sqrt{(1 - IC^2 - \sigma_{IC}^2)/N + \sigma_{IC}^2}}. \tag{42}$$
The portfolio IR is positively related to the average cross-sectional IC (skill) and the
square root of N (breadth), but inversely related to the cross-sectional IC standard
deviation, $\sigma_{IC}$ (Qian and Hua (2004) call this strategy risk). This result should not be
surprising to students of modern portfolio theory. Basically, if a portfolio is built upon a
sufficiently large universe (large N), the main risk to the portfolio comes from a bet on
the alpha factor that has an uncertain (but positive average) payoff stream (strategy risk).
As the universe (N) becomes larger, the impact of the idiosyncratic risk will diminish
(the $(1 - IC^2 - \sigma_{IC}^2)/N$ part goes to zero in the formula). In statistics, the quantity $1/\sigma_{IC}^2$ is a
measure of how close the realized information coefficient at time t, $IC_t$, is to the mean IC.
Under our modeling assumptions, the breadth in Grinold's original formula is a function
of N and $\sigma_{IC}^2$, and it will approach $1/\sigma_{IC}^2$ as N gets larger.

Three interesting special cases emerge from Equation (42):

1) if the cross-sectional $IC_t$ is a constant over time, i.e., $\sigma_{IC} = 0$, then we have
   approximately the Fundamental Law of Grinold (1989) with breadth being the
   same as the number of stocks in the selection universe:
   $$IR = \frac{IC\sqrt{N}}{\sqrt{1 - IC^2}} \approx IC\sqrt{N}$$
   when IC is small. If we ignore the sample estimation error in this extreme case,
   then the strategy becomes a money machine, generating constant excess return
   every single period with no active risk at all. One can make a leveraged bet in
   proportion to the square root of N, and the ex ante portfolio IR is also proportional
   to the square root of N. It should be noted that $\sqrt{(1 - IC^2)/N}$ is the standard error
   of the sample IC estimation when the true IC is a constant over time. This term is
   there simply to control the risk of sample estimation error so that one will not
   make too big a leveraged bet and have the portfolio wiped out.
2) we have approximately the IR formula of Ye (2008) when $IC$ and $\sigma_{IC}$ are both
   very small numbers:
   $$IR = \frac{IC}{\sqrt{(1 - IC^2 - \sigma_{IC}^2)/N + \sigma_{IC}^2}} \approx \frac{IC}{\sqrt{1/N + \sigma_{IC}^2}}$$
   (this is typically the case in practice: factor average IC is usually in the range of 0.02 to
   0.05 and IC standard deviation is around 0.1). The approximation results from Ye
   (2008) using the unconditional residual return standard deviation in her risk
   model instead of the conditional idiosyncratic error standard deviation that is
   consistent with the alpha model. Equation (42) has the property that IR will go
   to infinity when $IC = 1$ and $\sigma_{IC} = 0$ no matter what the breadth (N) is, while Ye's
   original formula does not have this feature.
3) when the number of stocks in the selection universe (N) goes to infinity, then we
   have $IR = IC/\sigma_{IC}$. As N becomes larger, one gets a more accurate estimate of
   the population parameters $IC$ and $\sigma_{IC}$. The risk owing to sample estimation error
   disappears. However, the intrinsic strategy risk still remains and one can incur bad
   performance at any time even if she/he can estimate $IC$ and $\sigma_{IC}$ perfectly.
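
The three special cases can be compared numerically. The following is a minimal sketch, not part of the original analysis; the function names are hypothetical, and the formulas are Equation (42), Grinold's $IC\sqrt{N}$, the Ye (2008) approximation quoted in case 2, and the large-N limit $IC/\sigma_{IC}$ from case 3.

```python
import math


def ir_grinold(ic, n):
    """Grinold (1989): IR = IC * sqrt(N), i.e. sigma_IC = 0 and IC small."""
    return ic * math.sqrt(n)


def ir_ye(ic, sigma_ic, n):
    """Approximation quoted for Ye (2008): IC / sqrt(1/N + sigma_IC^2)."""
    return ic / math.sqrt(1.0 / n + sigma_ic ** 2)


def ir_ding(ic, sigma_ic, n):
    """Equation (42): IC / sqrt((1 - IC^2 - sigma_IC^2)/N + sigma_IC^2)."""
    eps_var = 1.0 - ic ** 2 - sigma_ic ** 2  # idiosyncratic variance
    return ic / math.sqrt(eps_var / n + sigma_ic ** 2)


def ir_limit(ic, sigma_ic):
    """Large-N limit of Equation (42): IC / sigma_IC."""
    return ic / sigma_ic


if __name__ == "__main__":
    ic, sigma_ic = 0.03, 0.1  # parameter values used for Figure 3
    for n in (100, 1000, 3000):
        print(n, ir_grinold(ic, n), ir_ye(ic, sigma_ic, n),
              ir_ding(ic, sigma_ic, n), ir_limit(ic, sigma_ic))
```

With these inputs the Grinold value keeps growing with N, while the other two forms stay below the limit $IC/\sigma_{IC} = 0.3$.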

The results here show that the ex ante expected portfolio excess return is proportional to
targeted portfolio tracking error, $\sigma_P$, i.e., the more risk one takes, the more return one
gets. This is consistent with the fundamentals of financial economics. The portfolio
excess return is also positively related to one's skill that is represented by the average IC
one can achieve, and inversely related to the volatility of the skill, $\sigma_{IC}$, i.e., the more
volatile the skill, the less excess return one can get.

Figure 3 plots the relationship between portfolio IR and breadth N for various forms of
the fundamental law discussed above. The parameters are assumed to be $IC = 0.03$ and
$\sigma_{IC} = 0.1$. The portfolio IR based on the Grinold fundamental law increases at the rate of
the square root of breadth N. As the breadth increases, the portfolio IR will increase
without limit. According to our analysis above, this is true if the manager can pick
stocks consistently at a certain skill level (so that the cross-sectional IC is a constant over
time). In reality, this is hardly the case. A forecast signal's IC changes constantly over
time, and $\sigma_{IC} > 0$. Under this more realistic situation, the standard deviation adjusted IC,
$IC/\sigma_{IC}$, sets a "Chinese Wall" as the limit IR one can achieve. According to the formula,
as long as $IC/\sigma_{IC}$ does not improve, one will not be able to improve the performance
much even if the breadth increases.

Figure 3. Various Forms of the Fundamental Law ($IC = 0.03$, $\sigma_{IC} = 0.1$)

[Figure: portfolio IR (vertical axis, 0 to 0.9) plotted against breadth (horizontal axis, 0 to 1000)
for the Grinold & Kahn, Ye, and Ding formulas, with the maximum IR marked as "The Chinese Wall".]

One may wonder what the relationship is between Qian and Hua's fundamental law in
Equation (4) and the fundamental law I derived in Equation (42). It all depends on how
one interprets the variables in Qian and Hua's formula. If the average IC and IC standard
deviation in their formula are the true population measures as in my discussion, i.e.,
$\overline{IC}_N = IC$ and $\sigma_{IC,N} = \sigma_{IC}$, then Qian and Hua's fundamental law is the "Chinese Wall" I
discussed above. On the other hand, if we interpret $\overline{IC}_N$ and $\sigma_{IC,N}$ as the sample average
IC and sample IC standard deviation from a sample of size N and calculated over T time
periods, then it is shown in Ding (2011) that
$$\widehat{IR} = \frac{\overline{IC}_N}{\sigma_{IC,N}} \xrightarrow{\;p\;} IR = \frac{IC}{\sqrt{(1 - IC^2 - \sigma_{IC}^2)/N + \sigma_{IC}^2}}, \tag{43}$$
i.e., the sample average IC divided by sample IC standard deviation approaches the true
IR derived in Equation (42) asymptotically. Since no one knows the true parameters $IC$
and $\sigma_{IC}$ in practice, the Qian and Hua formula should then be used in empirical work in
estimating the signal IR.
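
The sample-based estimate can be illustrated with a small simulation. This is a sketch under assumptions that are mine rather than the paper's: risk-adjusted returns are generated from the one-factor model with a normally distributed $IC_t$, and the function name, seed, and sample sizes are hypothetical. It compares the ratio of sample IC mean to sample IC standard deviation with Equation (42).

```python
import numpy as np


def simulate_sample_ic(ic, sigma_ic, n, periods, seed=0):
    """Simulate per-period cross-sectional sample ICs under a one-factor model."""
    rng = np.random.default_rng(seed)
    eps_sd = np.sqrt(1.0 - ic ** 2 - sigma_ic ** 2)  # idiosyncratic std dev
    sample_ics = np.empty(periods)
    for t in range(periods):
        ic_t = rng.normal(ic, sigma_ic)               # realized IC this period
        z = rng.standard_normal(n)                    # factor exposures
        r = ic_t * z + eps_sd * rng.standard_normal(n)  # risk-adjusted returns
        sample_ics[t] = np.corrcoef(z, r)[0, 1]       # cross-sectional sample IC
    return sample_ics


ic, sigma_ic, n = 0.03, 0.1, 500
ics = simulate_sample_ic(ic, sigma_ic, n, periods=2000)
sample_ratio = ics.mean() / ics.std(ddof=1)           # sample IC mean / sample IC std
theory = ic / np.sqrt((1 - ic ** 2 - sigma_ic ** 2) / n + sigma_ic ** 2)
print(sample_ratio, theory)
```

The two numbers should agree up to sampling noise, which shrinks as the number of periods grows.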

Multifactor Fundamental Law and the Impact of Missing Factors

The fundamental law we discussed so far only concerns one factor and we assumed the
true forecasting relationship to be a linear one factor model between risk-adjusted
residual return and the forecast signal. In practice, analysts or portfolio managers often
use raw residual returns directly and build a multi-factor model. It will be interesting to
see the form of fundamental law with multiple factors and study the consequences of
missing one or more factors in modeling. In deriving the fundamental laws presented in
previous sections, we used the risk-adjusted residual return in analysis. But this is not
necessary if we work on residual security returns and factor returns directly.

If we assume residual returns follow a linear relationship with factor exposures
$$r_t = Z_{t-1}F_t + \varepsilon_t, \tag{44}$$
where $r_t$ is an $N \times 1$ vector of residual returns, $Z_{t-1}$ is an $N \times K$ matrix of factor
exposures, $F_t$ is a $K \times 1$ vector of factor returns, and $\varepsilon_t$ is an $N \times 1$ vector of
idiosyncratic noise. It is shown in Appendix B under some weak regularity conditions
that the ex ante expected portfolio IR has the following relationship with the expected
factor return ($\bar{F}$) and factor return covariance ($\Sigma_F$)
$$IR = \sqrt{\bar{F}'\big(1/(N\phi)\, I + \Sigma_F\big)^{-1}\bar{F}} \;\to\; \sqrt{\bar{F}'\Sigma_F^{-1}\bar{F}}, \tag{45}$$
where $\phi = E_{cs}(1/\sigma_{\varepsilon_i}^2)$ represents part of the risk related to idiosyncratic noise. As in the
univariate case, this part of the risk will be diversified away as N gets larger, and the
remaining dominant risk is the "strategy risk" represented by the factor return covariance
that cannot be diversified away. When there is only one factor, Equation (45) reduces to
$$IR = \frac{\bar{f}}{\sqrt{\big(E_{cs}(1/\sigma_{\varepsilon_i}^2)\big)^{-1}/N + \sigma_f^2}} \;\to\; \frac{\bar{f}}{\sigma_f}. \tag{46}$$
So the expected portfolio IR is just the IR of the factor-mimicking portfolio.

If, instead of using the raw residual return in Equation (44), we use the risk-adjusted
residual returns, then the multi-factor fundamental law in Equation (45) becomes (see
Appendix B)
$$IR = \sqrt{\mathbf{IC}'\big(\sigma_\varepsilon^2/N\, I + \Sigma_{IC}\big)^{-1}\mathbf{IC}} \;\to\; \sqrt{\mathbf{IC}'\Sigma_{IC}^{-1}\mathbf{IC}}, \tag{47}$$
where $\sigma_\varepsilon^2 = 1 - \sum_{k=1}^{K}(\sigma_{IC,k}^2 + IC_k^2)$ is the variance for idiosyncratic noise, $\mathbf{IC}$ is the cross-
sectional correlation vector between factor exposures and risk-adjusted residual returns,
and $\Sigma_{IC}$ is the factor IC covariance matrix. Equation (47) reduces to Equation (42) when
there is only one factor.
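
As an illustration, Equation (47) can be evaluated directly. The sketch below is not from the paper; the function name and the example inputs are hypothetical, and it assumes the IC covariance matrix is positive definite.

```python
import numpy as np


def multifactor_ir(ic, ic_cov, n):
    """Equation (47): sqrt(IC' (sigma_eps^2 / N * I + Sigma_IC)^{-1} IC)."""
    ic = np.asarray(ic, dtype=float)
    ic_cov = np.asarray(ic_cov, dtype=float)
    # Idiosyncratic variance: 1 - sum_k (sigma_IC,k^2 + IC_k^2)
    eps_var = 1.0 - np.sum(np.diag(ic_cov) + ic ** 2)
    risk = eps_var / n * np.eye(len(ic)) + ic_cov
    return float(np.sqrt(ic @ np.linalg.solve(risk, ic)))


# One factor: this is Equation (42) with IC = 0.03 and sigma_IC = 0.1.
one_factor = multifactor_ir([0.03], [[0.01]], 1000)

# Two factors whose ICs are uncorrelated over time (illustrative values).
two_factor = multifactor_ir([0.03, 0.02], [[0.01, 0.0], [0.0, 0.01]], 1000)
print(one_factor, two_factor)
```

Adding a second factor with a nonzero average IC raises the IR in this example, as the multi-factor law suggests.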

The above conclusion is based on the assumption that the model is correctly specified
which is almost surely not the case in practice. A natural question to ask is what happens
if the return or risk model is mis-specified. With the fundamental law in multi-factor
format, we can easily study the impact of missing one or more return or risk factors. For
ease of exposition, I will only present the analysis for a 2-factor system here. More
detailed analysis with missing multiple factors can be found in Appendix B. In the
analysis below, I will not purposely distinguish risk factors from alpha factors.
Statistically, the only difference should be that the expected IC (or factor return) for risk
factor is zero while that for alpha factor is different from zero.

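For a 2-factor system, Equation (B17) reduces to
$$
\begin{aligned}
IR &= \sqrt{\frac{1}{1-\rho_{IC_1,IC_2}^2}\left(\frac{IC_1^2}{\sigma_{IC_1}^2} + \frac{IC_2^2}{\sigma_{IC_2}^2} - 2\rho_{IC_1,IC_2}\frac{IC_1\, IC_2}{\sigma_{IC_1}\sigma_{IC_2}}\right)} \\
   &= \sqrt{\left(\frac{IC_1}{\sigma_{IC_1}}\right)^2 + \frac{1}{1-\rho_{IC_1,IC_2}^2}\left(\frac{IC_2}{\sigma_{IC_2}} - \rho_{IC_1,IC_2}\frac{IC_1}{\sigma_{IC_1}}\right)^2} \\
   &\ge \frac{IC_1}{\sigma_{IC_1}},
\end{aligned}
\tag{48}
$$
where $\rho_{IC_1,IC_2}$ is the time series correlation of the two factor ICs.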

From Equation (48), it is clear that a mis-specified model, whether it is mis-specified in
the return forecast part or the risk forecast part, will almost always hurt the performance.
For a missing return factor, the adverse impact comes from both the missing return
forecast, $IC_2$, and the resulting conditional covariance mis-specification ($1 - \rho_{IC_1,IC_2}^2$).
For a missing risk factor, the adverse impact only comes from the resulting conditional
covariance mis-specification ($1 - \rho_{IC_1,IC_2}^2$). This is not surprising. The only
exception is when the missing factor is a risk factor and the risk factor IC is not
correlated over time with the return factor IC (i.e., when $IC_2 = 0$ and $\rho_{IC_1,IC_2} = 0$). When the risk
factor is missing, the ex post realized portfolio tracking error will be larger than the ex
ante targeted portfolio tracking error by a factor of $1/\sqrt{1 - \rho_{IC_1,IC_2}^2} \ge 1$. So if $\rho_{IC_1,IC_2}$ is
small, then the impact of missing a risk factor is small.
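
A short numerical sketch of the two-factor case follows. It assumes the large-N form of the law, in which only the IC means, IC standard deviations, and their time series correlation matter; the function names and inputs are hypothetical.

```python
import math


def two_factor_ir(ic1, sigma1, ic2, sigma2, rho):
    """Two-factor IR in the large-N limit, written in standardized ICs."""
    a, b = ic1 / sigma1, ic2 / sigma2  # IC divided by its standard deviation
    return math.sqrt((a * a + b * b - 2.0 * rho * a * b) / (1.0 - rho * rho))


def tracking_error_inflation(rho):
    """Factor 1 / sqrt(1 - rho^2) quoted for a missing risk factor."""
    return 1.0 / math.sqrt(1.0 - rho * rho)


# Factor 2 is a pure risk factor (average IC of zero).
alone = two_factor_ir(0.03, 0.1, 0.0, 0.1, 0.0)   # uncorrelated ICs
full = two_factor_ir(0.03, 0.1, 0.0, 0.1, 0.5)    # IC correlation of 0.5
print(alone, full, tracking_error_inflation(0.5))
```

When the two ICs are uncorrelated, the risk factor adds nothing to the IR; when they are correlated, the IR with the risk factor included exceeds the single-factor IR by exactly the inflation factor.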

Empirical Factor IR Comparison

In order to compare the differences between various forms of the fundamental law, I
calculated the IR that can be achieved by various quantitative factors using different
formulas. For each factor, I calculate the ex post realized cross-sectional correlation (IC)
between lagged factor exposures and raw residual returns, and then calculate the mean
and standard deviation of the sample IC series. The results are then substituted as true
population values into various formulas to generate Table 3 (given the size of the selection
universe (N=1000, 2000 and 3000), the sample estimation error is negligible, which can be
seen by how close the results from Qian and Hua (2004), Ye (2008) and my formula are).
For all the factors considered here, $\sigma_{IC}$ is much more important than $1/\sqrt{N}$. I
calculated $\sigma_{IC}\sqrt{N}$ for each factor and they are in the range of 4 to 10, which means $\sigma_{IC}$
is 4 to 10 times more important than $1/\sqrt{N}$ for a typical investment universe. From the
table, we can see that the expected IR from the Grinold formula (GK) is always much
higher than those from Qian and Hua (QH), Ye (YE) and this paper (DING) while QH,
YE and DING stay very close to each other. This is not surprising given the result in
Figure 3 and the above discussion.

Table 3. Monthly Factor IR Comparison
(Raw Return, Pearson Correlation, 1978:12-2009:8)

                            Average  IC      σ_IC /    IR      IR      IR      IR      IR Simulated
Factor              Index   IC       Stdev   (1/√N)    GK      QH      YE      DING    Portfolio
Book to Price       R1000   0.011    0.125    5.01     0.337   0.085   0.082   0.083   0.077
                    R2000   0.013    0.095    5.95     0.599   0.141   0.137   0.139   0.096
                    R3000   0.012    0.096    7.35     0.683   0.131   0.128   0.129   0.095
Cash Flow to Price  R1000   0.023    0.085    3.39     0.732   0.273   0.256   0.262   0.218
                    R2000   0.033    0.094    5.87     1.476   0.351   0.342   0.346   0.292
                    R3000   0.031    0.089    6.83     1.694   0.348   0.341   0.345   0.296
Earnings to Price   R1000   0.009    0.132    5.27     0.276   0.066   0.064   0.065   0.056
                    R2000   0.025    0.120    7.50     1.123   0.209   0.206   0.207   0.168
                    R3000   0.023    0.122    9.36     1.276   0.191   0.189   0.190   0.164
Sales to Price      R1000   0.017    0.109    4.36     0.551   0.160   0.153   0.156   0.100
                    R2000   0.017    0.086    5.35     0.756   0.198   0.191   0.194   0.116
                    R3000   0.016    0.087    6.68     0.903   0.190   0.186   0.188   0.118
12-Month Momentum   R1000   0.031    0.190    7.61     0.983   0.163   0.161   0.162   0.120
                    R2000   0.039    0.135    8.44     1.734   0.287   0.283   0.285   0.203
                    R3000   0.037    0.144   11.06     2.040   0.259   0.257   0.258   0.195
Share Repurchase    R1000   0.013    0.075    3.02     0.427   0.179   0.165   0.170   0.180
                    R2000   0.021    0.068    4.23     0.935   0.309   0.294   0.301   0.269
                    R3000   0.020    0.067    5.13     1.105   0.303   0.292   0.297   0.264
Percent Short       R1000   0.017    0.134    5.37     0.530   0.125   0.122   0.123   0.084
                    R2000   0.033    0.095    5.95     1.495   0.351   0.342   0.346   0.214
                    R3000   0.030    0.100    7.68     1.635   0.299   0.294   0.296   0.215

Note: I calculate N, and sample IC for each time period for different universes from 1978:12 to 2009:08
and then take the average. For IC I also calculate sample IC standard deviation. Since the IC is between raw
returns and factor exposures I use the formula in Equation (B12) for IR DING.

The last column in Table 3 is the simulated portfolio IR using alpha generated from
various alpha factors and the conditional covariance matrix as specified in Equation (B4)
that incorporates the strategy risk. It can be seen that the simulated factor portfolio IR is
smaller than the theoretically calculated IR in general. The discrepancy can result from
the fact that N and are not constants over time and stock excess returns and factor

exposures do not satisfy the conditions as specified by the theoretical model. However, it
can also be seen that the IR numbers calculated using the formulas by Qian and Hua
(2004), Ye (2008) and this paper are much closer to the actual numbers than those from
Grinold's original formula. As discussed in more detail in Ding (2011), one should
simply use sample IC divided by sample IC standard deviation or sample factor return
divided by sample factor return standard deviation as an ex ante estimate for the IR that is
achievable in practice.

Conclusion

I have derived a generalized version of the fundamental law of active management under
some weak assumptions. I show that different forms of fundamental laws are a result of
either simplistic assumptions (Grinold (1989)), or approximations (Ye (2008)), or sample
vs. population (Qian and Hua (2004)). When the more relevant conditional residual return
forecast error covariance matrices are used, we will arrive at the more general form of the
fundamental law presented in this paper. I show that cross-sectional ICs are usually
different from time series ICs, and they will be the same only under the strong
assumption that either the residual return volatilities are the same across all the securities
or the ICs are calculated using risk-adjusted residual returns with the forecast signal.

I also show that the form of the fundamental law derived in this paper is quite robust to
forecast model specification. According to our generalized fundamental law, the variation
in IC (IC volatility over time) has a much bigger impact on portfolio IR than the breadth
N for a typical investment universe. Signal IC divided by IC standard deviation sets a
"Chinese Wall" as the upper limit for the portfolio IR a portfolio manager can reach when
the cross-sectional IC varies over time. The fundamental law by Grinold (1989) is
derived under some unrealistic assumptions and always overestimates by a large margin
the IR a portfolio manager can actually reach. I extend the fundamental law to models
with multiple factors and study the impact of missing one or more return or risk factors. It
is shown that a mis-specified model, whether it is mis-specified in the return forecast part
or risk forecast part, will almost always hurt performance. The exception occurs when a
missing risk factor (IC=0) has a zero time series IC correlation with all the other factors.
For the commonly used quantitative return and risk factors, I found that the impact of a
missing risk factor is usually small.

One insight from this paper is that the most important risk for a quant manager is the
"strategy risk" of betting on certain alpha factors. Even though the strategy risk takes a
small portion of the total risk for each individual security, it is a systematic risk for the
portfolio that cannot be diversified away. Portfolio managers should include the strategy
risk in their risk model and try to play well (high $IC$) and play precisely (low $\sigma_{IC}$). Extra
efforts should be made to process the information and to build models that can increase
IC and reduce IC variation.

I thank Roger Clarke, Harindra de Silva, Rob Engle, Russell Fuller, Tom Fuller, John Kling, Wenling Lin,
Doug Martin, Edward Qian, Doug Stone, Wei Su, Yixiao Sun, Steven Thorley, Yining Tung, Jia Ye, and an
anonymous referee for helpful discussions and comments. I would also like to thank Richard Grinold for
providing me with his original technical notes. Yining Tung helped with some empirical calculations in the
paper.

Appendix A

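Given the conditional forecasting error covariance matrix in Equation (39) and based on
the Woodbury matrix identity, we have the inverse matrix of $\Sigma_t$ as
$$\Sigma_t^{-1} = \sigma_\varepsilon^{-2}\Lambda_t^{-1/2}\big(I - \gamma\, z_{t-1}z_{t-1}'\big)\Lambda_t^{-1/2}, \tag{A1}$$
where
$$\gamma = \frac{\sigma_{IC}^2/\sigma_\varepsilon^2}{1 + (\sigma_{IC}^2/\sigma_\varepsilon^2)\, z_{t-1}'z_{t-1}}. \tag{A2}$$
Substituting (A1) and Equation (38) into Equation (15), and writing $z$ for $z_{t-1}$, we have
$$
\begin{aligned}
\alpha_{Pt} &= \frac{\sigma_P}{\sqrt{\mu_t'\Sigma_t^{-1}\mu_t}}\;\mu_t'\Sigma_t^{-1}(\mu_t - \kappa\mathbf{1}) \\
&= \frac{\sigma_P}{\sqrt{\mu_t'\Sigma_t^{-1}\mu_t}}\;\sigma_\varepsilon^{-2}\, IC\, z'\Lambda_t^{1/2}\Lambda_t^{-1/2}(I - \gamma zz')\Lambda_t^{-1/2}\big(IC\,\Lambda_t^{1/2}z - \kappa\mathbf{1}\big) \\
&= \frac{\sigma_P}{\sqrt{\mu_t'\Sigma_t^{-1}\mu_t}}\;\frac{IC^2}{\sigma_\varepsilon^2}\; z'(I - \gamma zz')\big(z - \kappa\Lambda_t^{-1/2}\mathbf{1}/IC\big) \\
&= \frac{\sigma_P}{\sqrt{\mu_t'\Sigma_t^{-1}\mu_t}}\;\frac{IC^2}{\sigma_\varepsilon^2}\;(z' - \gamma z'zz')\big(z - \kappa\Lambda_t^{-1/2}\mathbf{1}/IC\big) \\
&= \frac{\sigma_P}{\sqrt{\mu_t'\Sigma_t^{-1}\mu_t}}\;\frac{IC^2}{\sigma_\varepsilon^2}\;(1 - \gamma z'z)\big(z'z - \kappa z'\Lambda_t^{-1/2}\mathbf{1}/IC\big) \\
&= \frac{\sigma_P}{\sqrt{\mu_t'\Sigma_t^{-1}\mu_t}}\;\frac{IC^2}{\sigma_\varepsilon^2}\;\frac{z'z - \kappa z'\Lambda_t^{-1/2}\mathbf{1}/IC}{1 + (\sigma_{IC}^2/\sigma_\varepsilon^2)\, z'z}.
\end{aligned}
\tag{A3}
$$
The same algebra with $\kappa = 0$ gives $\mu_t'\Sigma_t^{-1}\mu_t = (IC^2/\sigma_\varepsilon^2)\, z'z / \big(1 + (\sigma_{IC}^2/\sigma_\varepsilon^2)\, z'z\big)$.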
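When $\sigma_{r_i}$ and $z_{it-1}$ are cross-sectionally independent, and using the fact that $z_{it-1}$ has cross-
sectional mean zero and standard deviation 1 (so that $z_{t-1}'z_{t-1} = N$), we have as N becomes large
$$
\begin{aligned}
\alpha_{Pt} &\approx \sigma_P\,\frac{IC}{\sigma_\varepsilon}\;\frac{N - (\kappa/IC)\sum_{i=1}^{N}(z_{it-1}/\sigma_{r_i})}{\sqrt{N\big(1 + N\sigma_{IC}^2/\sigma_\varepsilon^2\big)}} \\
&\approx \sigma_P\,\frac{IC}{\sigma_\varepsilon}\;\frac{\sqrt{N}\,\big(1 - (\kappa/IC)\,E_{cs}(z_{it-1}/\sigma_{r_i})\big)}{\sqrt{1 + N\sigma_{IC}^2/\sigma_\varepsilon^2}} \\
&= \sigma_P\,\frac{IC}{\sigma_\varepsilon}\;\frac{\sqrt{N}\,\big(1 - (\kappa/IC)\,E_{cs}(z_{it-1})\,E_{cs}(1/\sigma_{r_i})\big)}{\sqrt{1 + N\sigma_{IC}^2/\sigma_\varepsilon^2}} \\
&= \sigma_P\,\frac{IC}{\sigma_\varepsilon}\;\sqrt{\frac{N}{1 + N\sigma_{IC}^2/\sigma_\varepsilon^2}} \\
&= \sigma_P\,\frac{IC}{\sqrt{\sigma_\varepsilon^2/N + \sigma_{IC}^2}} \\
&= \sigma_P\,\frac{IC}{\sqrt{(1 - IC^2 - \sigma_{IC}^2)/N + \sigma_{IC}^2}},
\end{aligned}
\tag{A4}
$$
which is Equation (41) in the paper.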
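
The Woodbury step can be checked numerically. The sketch below is illustrative only: the universe size, the residual volatilities, and the variable names are assumptions of mine. It builds the conditional covariance of Equation (39), applies the closed-form inverse, and confirms that the quadratic form $\mu_t'\Sigma_t^{-1}\mu_t$ matches the squared IR of Equation (42) when the exposures are standardized.

```python
import numpy as np

rng = np.random.default_rng(0)
n, ic, sigma_ic = 200, 0.03, 0.1
eps_var = 1.0 - ic ** 2 - sigma_ic ** 2           # idiosyncratic variance

z = rng.standard_normal(n)
z = (z - z.mean()) / z.std()                      # cross-sectional mean 0, std 1, so z'z = n
sig_r = rng.uniform(0.05, 0.15, n)                # assumed residual volatilities

lam_half = np.diag(sig_r)                         # square root of the diagonal volatility matrix
lam_inv_half = np.diag(1.0 / sig_r)

# Conditional covariance, Equation (39)
sigma = lam_half @ (sigma_ic ** 2 * np.outer(z, z) + eps_var * np.eye(n)) @ lam_half

# Closed-form inverse from the Woodbury (Sherman-Morrison) identity
c = sigma_ic ** 2 / eps_var
gamma = c / (1.0 + c * (z @ z))
sigma_inv = lam_inv_half @ (np.eye(n) - gamma * np.outer(z, z)) @ lam_inv_half / eps_var

max_err = np.max(np.abs(sigma @ sigma_inv - np.eye(n)))

mu = ic * sig_r * z                               # conditional mean, Equation (38)
ir_numeric = np.sqrt(mu @ sigma_inv @ mu)
ir_formula = ic / np.sqrt(eps_var / n + sigma_ic ** 2)
print(max_err, ir_numeric, ir_formula)
```

The agreement is exact up to floating-point error here because the simulated exposures are standardized so that $z'z = N$.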

Appendix B

Assume residual security returns $r_t$ and security factor exposures $Z_{t-1}$ are related through
a linear factor model as follows
$$r_t = Z_{t-1}F_t + \varepsilon_t, \tag{B1}$$
where $r_t$ is an $N \times 1$ vector of residual returns, $Z_{t-1}$ is an $N \times K$ matrix of factor
exposures that become known at the end of time t-1, $F_t$ is a $K \times 1$ vector of factor
returns, and $\varepsilon_t \mid I_{t-1} \sim N(0, \Omega)$ is an $N \times 1$ vector of idiosyncratic noise with mean 0
and covariance $\Omega = \mathrm{diag}(\sigma_{\varepsilon_1}^2, \sigma_{\varepsilon_2}^2, \ldots, \sigma_{\varepsilon_N}^2)$. The factor exposures are normalized to have
both time series and cross-sectional mean 0 and standard deviation 1, and are cross-
sectionally orthogonal to each other so that $Z_{t-1}'Z_{t-1}/N = I$. Other regularity
assumptions like those in C1) and C3) also apply. We further assume that factor returns
follow a multivariate normal distribution
$$F_t \mid I_{t-1} \sim N(\bar{F}, \Sigma_F). \tag{B2}$$

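Based on the above assumptions, we have
$$\mu_t = Z_{t-1}\bar{F}, \tag{B3}$$
and
$$\Sigma_t = Z_{t-1}\Sigma_F Z_{t-1}' + \Omega. \tag{B4}$$
Applying the Woodbury matrix identity, we get the inverse of the conditional covariance
matrix as
$$\Sigma_t^{-1} = \Omega^{-1} - \Omega^{-1}Z_{t-1}\big(\Sigma_F^{-1} + Z_{t-1}'\Omega^{-1}Z_{t-1}\big)^{-1}Z_{t-1}'\Omega^{-1}. \tag{B5}$$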

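Substituting Equations (B3) and (B5) into the two components of the IR formula in
Equation (16), and writing $Z$ for $Z_{t-1}$ and $\phi = E_{cs}(1/\sigma_{\varepsilon_i}^2)$, we get
$$
\begin{aligned}
\mu_t'\Sigma_t^{-1}\mu_t &= \bar{F}'Z'\big(\Omega^{-1} - \Omega^{-1}Z(\Sigma_F^{-1} + Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}\big)Z\bar{F} \\
&= \bar{F}'Z'\Omega^{-1}Z\big(I - (\Sigma_F^{-1} + Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}Z\big)\bar{F} \\
&= \bar{F}'Z'\Omega^{-1}Z(\Sigma_F^{-1} + Z'\Omega^{-1}Z)^{-1}\Sigma_F^{-1}\bar{F} \\
&= \bar{F}'\big(\Sigma_F(\Sigma_F^{-1} + Z'\Omega^{-1}Z)(Z'\Omega^{-1}Z)^{-1}\big)^{-1}\bar{F} \\
&= \bar{F}'\big((Z'\Omega^{-1}Z)^{-1} + \Sigma_F\big)^{-1}\bar{F} \\
&= \bar{F}'\big((Z'\Omega^{-1}Z/N)^{-1}/N + \Sigma_F\big)^{-1}\bar{F} \\
&\approx \bar{F}'\big(1/(N\phi)\, I + \Sigma_F\big)^{-1}\bar{F}
\end{aligned}
\tag{B6}
$$
and
$$
\begin{aligned}
\mu_t'\Sigma_t^{-1}\mathbf{1} &= \bar{F}'Z'\big(\Omega^{-1} - \Omega^{-1}Z(\Sigma_F^{-1} + Z'\Omega^{-1}Z)^{-1}Z'\Omega^{-1}\big)\mathbf{1} \\
&= \bar{F}'\big(I - Z'\Omega^{-1}Z(\Sigma_F^{-1} + Z'\Omega^{-1}Z)^{-1}\big)Z'\Omega^{-1}\mathbf{1} \\
&\approx \bar{F}'\big(1/(N\phi)\, I + \Sigma_F\big)^{-1}Z'\Omega^{-1}\mathbf{1}/(N\phi) \\
&\approx 0,
\end{aligned}
\tag{B7}
$$
where we assumed $z_{k,i}$ and $\sigma_{\varepsilon_i}$ to be cross-sectionally independent and used the facts that
for $k, l = 1, 2, \ldots, K$,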

23
N z k ,it 1 zl ,it 1
1
Z k ,t 1 ' 1Z l ,t 1 / N
N

i 1 2i

z z
Ecs k ,it 1 2 l ,it 1

i (B8)

Ecs z k ,it 1 zl ,it 1 Ecs (1 / 2i )
1 N
E (1 / 2i )
cs N
(1 / )
i 1
2
i
when k l
0 when k l
and
N
1
Z k ,t 1 ' 1 1 / N
N
(z
i 1
k ,it 1 / 2i )

Ecs ( z k ,it 1 / 2i ) (B9)


Ecs ( z k ,it 1 ) Ecs (1 / 2i )
0.
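So the ex ante expected portfolio IR is
$$
\begin{aligned}
IR &= \frac{E\big(\mu_t'\Sigma_t^{-1}(\mu_t - \kappa\mathbf{1})\big)}{\sqrt{E\big(\mu_t'\Sigma_t^{-1}\mu_t\big)}} \\
&= \sqrt{E\big(\bar{F}'(1/(N\phi)\, I + \Sigma_F)^{-1}\bar{F}\big)} \\
&= \sqrt{\bar{F}'\big(1/(N\phi)\, I + \Sigma_F\big)^{-1}\bar{F}} \\
&\to \sqrt{\bar{F}'\Sigma_F^{-1}\bar{F}},
\end{aligned}
\tag{B10}
$$
where $\phi = E_{cs}(1/\sigma_{\varepsilon_i}^2)$.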
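For a one factor model, Equation (B10) simplifies to
$$IR = \frac{\bar{f}}{\sqrt{\big(E_{cs}(1/\sigma_{\varepsilon_i}^2)\big)^{-1}/N + \sigma_f^2}} \;\to\; \frac{\bar{f}}{\sigma_f}, \tag{B11}$$
i.e., the expected portfolio IR is just the IR of the factor-mimicking portfolio. When the
cross-sectional residual return dispersion is a constant,$^{7}$ i.e., $d(r_t) = d = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}^2}$, then
Equation (B11) becomes
$$
\begin{aligned}
IR &= \frac{IC}{\sqrt{\big(E_{cs}(1/\sigma_{\varepsilon_i}^2)\big)^{-1}/(N d^2) + \sigma_{IC}^2}} \\
&= \frac{IC}{\sqrt{\big(E_{cs}(1/\sigma_{\varepsilon_i}^2)\, d^2\big)^{-1}/N + \sigma_{IC}^2}} \\
&= \frac{IC}{\sqrt{1/(N\eta) + \sigma_{IC}^2}},
\end{aligned}
\tag{B12}
$$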
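where
$$
\begin{aligned}
\eta &= \left(\frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}^2\right)\left(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_{\varepsilon_i}^2}\right) \\
&= \left(\frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}^2\right)\left(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_{r_i}^2 - (IC^2 + \sigma_{IC}^2)\, d^2}\right) \\
&\ge \left(\frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}^2\right)\left(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_{r_i}^2}\right) \\
&\ge 1.
\end{aligned}
\tag{B13}
$$
The last line in (B13) is based on Jensen's inequality.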

By applying the same assumptions for deriving Equation (B12) to Equation (B10), we get
the multifactor fundamental law in terms of IC as follows:
$$
\begin{aligned}
IR &= \sqrt{\bar{F}'\big(1/(N\phi)\, I + \Sigma_F\big)^{-1}\bar{F}} \\
&= \sqrt{\mathbf{IC}'\big(1/(N\eta)\, I + \Sigma_{IC}\big)^{-1}\mathbf{IC}} \\
&\to \sqrt{\mathbf{IC}'\Sigma_{IC}^{-1}\mathbf{IC}},
\end{aligned}
\tag{B14}
$$
where $\mathbf{IC} = \bar{F}/d$ is the cross-section correlation vector between factor exposures and
residual security returns, and $\Sigma_{IC} = \Sigma_F/d^2$ is the factor IC covariance matrix. It should
be emphasized that the results in Equations (B12) and (B14) are only valid when the
cross-sectional residual return dispersion is a constant. When this assumption is violated,
then the IR calculated from Equations (B10) and (B11) will usually be smaller than that
from (B12) and (B14).

To avoid the problem of cross-sectional heteroskedasticity in cross-sectional regression,
one can use the risk-adjusted residual security returns as the dependent variable, i.e.,
$$\tilde{r}_t = \Lambda_t^{-1/2}r_t = Z_{t-1}\mathbf{IC}_t + \varepsilon_t, \tag{B15}$$
where $\Lambda_t = \mathrm{diag}(\sigma_{r_1}^2, \sigma_{r_2}^2, \ldots, \sigma_{r_N}^2)$, and $\sigma_{r_i}^2$ is the residual return variance for security i. By
using the same algebra one can get
$$IR = \sqrt{\mathbf{IC}'\big(\sigma_\varepsilon^2/N\, I + \Sigma_{IC}\big)^{-1}\mathbf{IC}} \;\to\; \sqrt{\mathbf{IC}'\Sigma_{IC}^{-1}\mathbf{IC}}, \tag{B16}$$
where $\sigma_\varepsilon^2 = 1 - \sum_{k=1}^{K}(\sigma_{IC,k}^2 + IC_k^2)$. It should be emphasized again that the ICs in Equation
(B16) are the cross-sectional correlation between risk-adjusted residual security returns
and factor exposures, while the ICs in Equation (B14) are the cross-sectional correlation
between the raw residual security returns and factor exposures, hence they will usually be
different.

With the fundamental law in multifactor format, we can easily study the impact of
missing one or more return or risk factors. In the analysis below, I will study the impact
of missing factors based on factor ICs; the analysis based on factor returns is almost
identical. I will not purposely distinguish risk factors from alpha factors. Statistically, the
only difference should be that the expected IC (or factor return) for a risk factor is zero
while that for an alpha factor is different from zero. I will separate the factors into two
groups with $\mathbf{IC}_i$ and $\Sigma_{ii}$ (i=1,2) as their factor IC and IC covariance respectively. I will
also assume the inter-group factor IC covariance to be $\Sigma_{12}$. Under these assumptions,
we can write Equation (B14) as follows$^{8}$
$$
\begin{aligned}
IR^2 &= \mathbf{IC}'\Sigma_{IC}^{-1}\mathbf{IC} \\
&= (\mathbf{IC}_1' \;\; \mathbf{IC}_2')\begin{pmatrix}\Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22}\end{pmatrix}^{-1}\begin{pmatrix}\mathbf{IC}_1 \\ \mathbf{IC}_2\end{pmatrix} \\
&= \mathbf{IC}_1'\Sigma_{11}^{-1}\mathbf{IC}_1 + (\mathbf{IC}_2 - \Sigma_{12}'\Sigma_{11}^{-1}\mathbf{IC}_1)'E^{-1}(\mathbf{IC}_2 - \Sigma_{12}'\Sigma_{11}^{-1}\mathbf{IC}_1) \\
&\ge \mathbf{IC}_1'\Sigma_{11}^{-1}\mathbf{IC}_1,
\end{aligned}
\tag{B17}
$$
where $E = \Sigma_{22} - \Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}$.

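So $IR^2$ will be reduced by an amount of
$$(\mathbf{IC}_2 - \Sigma_{12}'\Sigma_{11}^{-1}\mathbf{IC}_1)'(\Sigma_{22} - \Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12})^{-1}(\mathbf{IC}_2 - \Sigma_{12}'\Sigma_{11}^{-1}\mathbf{IC}_1) \ge 0 \tag{B18}$$
when the second group of $k_2$ factors are missing. The impacts come from both alpha
model mis-specification (when $\mathbf{IC}_2 \ne 0$) and risk model mis-specification (when
$\mathbf{IC}_2 = 0$ but $\mathbf{IC}_1'\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22} - \Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\mathbf{IC}_1 > 0$).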

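Alternatively the IR can be expressed as
$$IR^2 = \mathbf{IC}_2'\Sigma_{22}^{-1}\mathbf{IC}_2 + (\mathbf{IC}_1 - \Sigma_{12}\Sigma_{22}^{-1}\mathbf{IC}_2)'D^{-1}(\mathbf{IC}_1 - \Sigma_{12}\Sigma_{22}^{-1}\mathbf{IC}_2), \tag{B19}$$
where $D = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'$. When $\mathbf{IC}_2 = 0$, the missing group consists purely of risk
factors, and
$$
\begin{aligned}
IR^2 &= \mathbf{IC}_1'\Sigma_{11}^{-1}\mathbf{IC}_1 + \mathbf{IC}_1'\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22} - \Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\mathbf{IC}_1 \\
&= \mathbf{IC}_1'(\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}')^{-1}\mathbf{IC}_1 \\
&\ge \mathbf{IC}_1'\Sigma_{11}^{-1}\mathbf{IC}_1,
\end{aligned}
\tag{B20}
$$
so the reduction in IR comes only from missed risk allocation. When $\Sigma_{12} = 0$, i.e., the
alpha group factor ICs and risk group factor ICs are not correlated, then missing risk
factors will not impact the final portfolio performance.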
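
The partitioned decomposition can be verified on a small example. The sketch below is illustrative: the IC vectors and covariance blocks are made-up inputs and the function names are hypothetical. It checks that the squared IR with all factors equals the squared IR from the first group plus the loss term attributed to the missing second group.

```python
import numpy as np


def ir_squared(ic, cov):
    """Large-N squared IR: IC' Sigma_IC^{-1} IC."""
    return float(ic @ np.linalg.solve(cov, ic))


def missing_group_loss(ic1, ic2, s11, s12, s22):
    """Reduction in squared IR when the second factor group is omitted."""
    resid = ic2 - s12.T @ np.linalg.solve(s11, ic1)
    e = s22 - s12.T @ np.linalg.solve(s11, s12)   # Schur complement of s11
    return float(resid @ np.linalg.solve(e, resid))


# Two alpha factors and one pure risk factor (average IC of zero).
ic1 = np.array([0.03, 0.02])
ic2 = np.array([0.0])
s11 = np.array([[0.010, 0.002], [0.002, 0.010]])
s12 = np.array([[0.003], [0.001]])
s22 = np.array([[0.010]])

full_cov = np.block([[s11, s12], [s12.T, s22]])
full = ir_squared(np.concatenate([ic1, ic2]), full_cov)
partial = ir_squared(ic1, s11)
loss = missing_group_loss(ic1, ic2, s11, s12, s22)
print(full, partial, loss)
```

Setting `s12` to zeros makes the loss vanish for a pure risk factor, matching the statement that uncorrelated risk factors can be omitted without cost.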

The result we derived in Equation (B6) relies on the condition that assumption C2) is
satisfied. When C2) is violated, then the idiosyncratic error covariance matrix $\Omega$ is no
longer diagonal and we have
$$
\begin{aligned}
\mu_t'\Sigma_t^{-1}\mu_t &= \bar{F}'\big((Z_{t-1}'\Omega^{-1}Z_{t-1})^{-1} + \Sigma_F\big)^{-1}\bar{F} \\
&= \bar{F}'\big((Z_{t-1}'\Omega^{-1}Z_{t-1}/N)^{-1}/N + \Sigma_F\big)^{-1}\bar{F}.
\end{aligned}
\tag{B21}
$$
We cannot simplify Equation (B21) without further assumptions. In order to study the
impact of a full idiosyncratic error covariance matrix on portfolio IR, I ran simulations of
a one factor system for the value of $Z_{t-1}'\Omega^{-1}Z_{t-1}$, assuming $Z_{t-1}$ and $\Omega$ are independent
and $\Omega$ is consistent with what is implied by the factor model, for N from 50 to 2000. I
found that $Z_{t-1}'\Omega^{-1}Z_{t-1} > N$ in all the simulations I ran (the minimum $Z_{t-1}'\Omega^{-1}Z_{t-1}/N$
is 26 and the maximum is $1.6 \times 10^{10}$ out of 1950 simulations). So the impact of
$(Z_{t-1}'\Omega^{-1}Z_{t-1})^{-1}$ is negligible compared to $\Sigma_F$ in Equation (B21). The study here shows
again that the most important risk in running a portfolio using a factor model is the factor
IC variation (strategy risk) itself. Most other risks will be diversified away for a large
enough selection universe.

Notes
1
We used the fact that the benchmark residual return is zero in deriving Equation (7), i.e.,
$$\sum_{i=1}^{N} w_{B,it}\, r_{it} = 0.$$
This is true because
$$R_{B,t} = \sum_{i=1}^{N} w_{B,it}\, r_{it}^{Total} = \sum_{i=1}^{N} w_{B,it}\,\beta_{it} R_{B,t} + \sum_{i=1}^{N} w_{B,it}\, r_{it} = R_{B,t} + \sum_{i=1}^{N} w_{B,it}\, r_{it}.$$

2
This is called the Law of Total Variance, which states that
$$E(\mathrm{Var}(r \mid x)) = \mathrm{Var}(r) - \mathrm{Var}(E(r \mid x)) \le \mathrm{Var}(r).$$
A proper forecast, E (r | x) , is one where the errors are not correlated with the expected
returns. That means that the variance of actual returns must be equal to the variance of
your expected returns plus the variance of the error terms. As long as there is any forecast
error at all, an efficient forecast will always be one where your expected returns are less
variable than what actually takes place.
3
We define the realized cross-sectional residual return dispersion at time t as
$$d(r_t) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(r_{it} - \bar{r}_t)^2} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(r_{it}^2 - \bar{r}_t^2)} = \sqrt{E_{cs}(r_{it}^2 - \bar{r}_t^2)},$$
where $\bar{r}_t = \frac{1}{N}\sum_{i=1}^{N} r_{it}$ is the average cross-sectional residual return, which we will assume to be
zero in this article. The expected cross-sectional residual return dispersion is then
$$E(d(r_t)) = E\sqrt{\frac{1}{N}\sum_{i=1}^{N}(r_{it}^2 - \bar{r}_t^2)} = E\sqrt{E_{cs}(r_{it}^2 - \bar{r}_t^2)}.$$
4
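We can decompose $r_{it}$ as $r_{it} = \sigma_r e_{it}$ where $e_{it} \sim N(0, 1)$. So
$$E(d(r_t)) = E\sqrt{\frac{1}{N}\sum_{i=1}^{N} r_{it}^2} = \sigma_r\, E\sqrt{\frac{1}{N}\sum_{i=1}^{N} e_{it}^2} \to \sigma_r$$
as $N \to \infty$ by the law of large numbers.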
5
The assumption of normality in the information coefficient is approximate because IC is
bounded by $\pm 1$.

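6
The unconditional covariance of risk adjusted residual return $\Lambda_t^{-1/2}r_t$ is
$$E\big(\Lambda_t^{-1/2}r_t r_t'\Lambda_t^{-1/2}\big) = (IC^2 + \sigma_{IC}^2)\,\Sigma_z + \Omega,$$
where $\Sigma_z$ is the covariance matrix of $z_{t-1}$ with 1 in the diagonal and
$$\Omega = \mathrm{var}(\varepsilon_t) = \mathrm{diag}(\mathrm{var}(\varepsilon_{it})) = (1 - IC^2 - \sigma_{IC}^2)\, I.$$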
7
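When we assume the cross-sectional residual return dispersion is a constant, i.e.,
$$d(r_t) = \sqrt{\frac{1}{N}\sum_{i=1}^{N} r_{it}^2} = d,$$
then
$$E(d(r_t)) = d.$$
On the other hand,
$$E(d^2(r_t)) = \frac{1}{N}\sum_{i=1}^{N} E(r_{it}^2) = \frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}^2 = d^2.$$
So we have
$$E(d(r_t)) = d = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\sigma_{r_i}^2}.$$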

8
The inverse of a partitioned matrix is repeatedly used in the derivation; see Magnus and
Neudecker (2002, p. 11).

References

Buckle, David. 2004. How to calculate breadth: An evolution of the fundamental law of
active portfolio management. Journal of Asset Management 4, 393-405 (1 April 2004).
Clarke, Roger, Harindra de Silva, and Steven Thorley. 2002. Portfolio Constraints and
the Fundamental Law of Active Management. Financial Analysts Journal, vol. 58, no. 5
(September/October): 48-66.
Clarke, Roger, Harindra de Silva, and Steven Thorley. 2006. The Fundamental Law of
Active Management. Journal of Investment Management, vol. 4, no. 3: 54-72.
Ding, Zhuanxin. 2011. The Statistics of Cross-Sectional Information Coefficient.
Working Paper, Available at SSRN: http://ssrn.com/abstract=1826303.
Grinold, Richard C. 1989. The Fundamental Law of Active Management. The Journal
of Portfolio Management, vol. 15, no. 3 (Spring): 30-38.
Grinold, Richard C. 1994. Alpha is Volatility Times IC times Score. The Journal of
Portfolio Management, vol. 20, no. 4 (Summer): 9-16.
Grinold, Richard C. 2007. Dynamic Portfolio Analysis. The Journal of Portfolio
Management, vol. 34, no. 1 (Fall): 12-26.
Grinold, Richard C., and Ronald N. Kahn. 2000. Active Portfolio Management. 2nd ed.
New York: McGraw-Hill.
Ibragimov, R. and Müller, U. (2009). "t-statistic Based Correlation and Heterogeneity
Robust Inference," forthcoming in the Journal of Business & Economic Statistics.
Kahn, Ronald. 1997. "Seven Quantitative Insights into Active Management Part 3: The
Fundamental Law of Active Management," BARRA Newsletter, Winter.
Lee, Jyh-Huei and Dan Stefek. Do Risk Factors Eat Alphas. The Journal of Portfolio
Management, vol. 34, no. 4 (Summer 2008), pp. 12-25.
Magnus, Jan R. and Heinz Neudecker. 2002. Matrix Differential Calculus with
Applications in Statistics and Econometrics. Revised Edition. New York: John Wiley &
Sons.
Petersen, M. A. (2009), Estimating standard errors in finance panel data sets: Comparing
approaches, The Review of Financial Studies, 22, 435-480.
Qian, Edward, and Ronald Hua. Active Risk and Information Ratio. The Journal of
Investment Management, vol. 2, no. 3 (2004), pp. 20-34.
Qian, E., Hua, R., and Sorensen, E.H. (2007). Quantitative Equity Portfolio Management:
Modern Techniques and Applications, London: CRC Press.
Sorensen, Eric H., Ronald Hua, Edward Qian, and Robert Schoen. Multiple Alpha
Sources and Active Management. The Journal of Portfolio Management, vol. 30, no. 2
(Winter 2004), pp. 39-45.
Sorensen, Eric H., Ronald Hua, and Edward Qian. 2007. Aspects of Constrained Long-
Short Equity Portfolios. The Journal of Portfolio Management, vol. 33, no. 2
(Winter): 12-22.
Ye, Jia. "How Variation in Signal Quality Affects Performance." Financial Analysts
Journal, vol. 64, no. 4 (2008), 48-61.
