2009 L1 Key Formulas

2009 Financial Risk Manager (FRM)
Level 1 (L1) Formula Sheets

Prepared by David Harper, CFA, FRM, CIPM
This document is only for paid members of Bionic Turtle

Copyright @ 2009 by Bionic Turtle, LLC
Value‐at‐risk (VaR)
VaR summarizes the worst loss over a target horizon that will not be exceeded with a
given level of confidence. VaR is given by:
$$P(L > \text{VaR}) \le 1 - c$$
Portfolio Variance and Covariance

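The variance of the two-asset portfolio is given by:
$$\sigma_{a+b}^2 = w_a^2\sigma_a^2 + w_b^2\sigma_b^2 + 2 w_a w_b\,\text{cov}(a,b)$$
Because $\text{cov}(a,b) = \rho_{ab}\,\sigma_a\sigma_b$, this is equivalent to:
$$\sigma_{a+b}^2 = w_a^2\sigma_a^2 + w_b^2\sigma_b^2 + 2 w_a w_b\,\rho_{ab}\,\sigma_a\sigma_b$$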
Portfolio variance with several assets:
$$\sigma_{\text{Portfolio}}^2 = \sum_{i=1}^{N} w_i^2\,\text{Var}(R_i) + \sum_{i=1}^{N}\sum_{j \ne i}^{N} w_i w_j\,\text{Cov}(R_i, R_j)$$
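As a rough illustration of the portfolio variance formulas, the sketch below computes the variance from a covariance matrix and checks it against the two-asset form. The weights, volatilities and correlation are hypothetical inputs chosen only for this example, and `portfolio_variance` is an illustrative helper, not part of the original sheet.

```python
import math

def portfolio_variance(weights, cov):
    """Sum of w_i * w_j * Cov(i, j) over all pairs (i, j), including i == j."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

# Hypothetical inputs: 60/40 weights, volatilities of 20% and 10%, correlation 0.25
w_a, w_b = 0.6, 0.4
sigma_a, sigma_b, rho = 0.20, 0.10, 0.25
cov_ab = rho * sigma_a * sigma_b
cov = [[sigma_a ** 2, cov_ab],
       [cov_ab, sigma_b ** 2]]

var_matrix = portfolio_variance([w_a, w_b], cov)
# Two-asset formula written out term by term
var_two_asset = (w_a ** 2 * sigma_a ** 2 + w_b ** 2 * sigma_b ** 2
                 + 2 * w_a * w_b * cov_ab)
portfolio_vol = math.sqrt(var_matrix)
```

Both routes give the same variance because the double sum over all pairs counts each off-diagonal covariance twice.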
Capital asset pricing model (CAPM)

The capital asset pricing model (CAPM) is given by:
$$E(R_i) = R_F + \beta_i\,[E(R_M) - R_F]$$
CAPM tells us that the expected excess return of a risky security is equal to the systematic risk of
that security measured by its beta times the market's risk premium. The key insight of the CAPM
is that a security’s risk premium is proportional to its systematic risk.
Security Market Line (SML)

The security market line (SML) plots expected return against beta:
$$E(R_i) = R_F + \beta_i\,[E(R_m) - R_F]$$
Beta
Beta is a measure of an asset’s sensitivity to movements in the market. A security’s beta is the
covariance of the return of the security with the return of the market portfolio divided by the
variance of the return of the market portfolio:
$$\beta_i = \frac{\text{Cov}(R_i, R_M)}{\sigma_M^2}$$
Price of Risk (beta)
The price of risk is the expected excess return of the market portfolio above the risk-free rate:
$$\text{price of risk} = E(R_M - R_F)$$

Quantity of Risk (beta)
The quantity of risk is beta ($\beta$). Beta is the measure of the asset’s sensitivity to the market.
Specifically, it is the covariance between the asset’s return and the market’s return divided by the
market’s variance (the division is a way to standardize the beta into a unit-less metric).
$$\text{Beta} = \beta_i = \frac{\text{cov}(R_i, R_m)}{\text{var}(R_m)}$$
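A minimal sketch of the beta calculation and the CAPM expected return, assuming population (divide-by-n) moments. The return series and rates are hypothetical, and the helper names are illustrative only.

```python
def mean(xs):
    return sum(xs) / len(xs)

def beta(asset_returns, market_returns):
    """Covariance of asset and market returns divided by the market variance."""
    ma, mm = mean(asset_returns), mean(market_returns)
    cov = mean([(a - ma) * (m - mm)
                for a, m in zip(asset_returns, market_returns)])
    var_m = mean([(m - mm) ** 2 for m in market_returns])
    return cov / var_m

def capm_expected_return(rf, beta_i, expected_market_return):
    """E(R_i) = R_F + beta_i * [E(R_M) - R_F]."""
    return rf + beta_i * (expected_market_return - rf)

# Hypothetical data: the asset moves exactly 1.5x the market, so beta is 1.5
market = [0.02, -0.01, 0.03, 0.00]
asset = [1.5 * m for m in market]
b = beta(asset, market)
er = capm_expected_return(0.03, b, 0.08)  # 0.03 + 1.5 * (0.08 - 0.03)
```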
Diversification and Risk Management

The value of a firm’s equity is its expected future cash flows discounted at the CAPM rate:
$$\text{Value}_{\text{Equity}} = \frac{E[\text{future cash flows}]}{1 + R_F + \beta\,[E(R_m) - R_F]}$$
Treynor Measure, Sharpe Measure, and Jensen’s Alpha

The Treynor measure: excess return divided by portfolio beta ($\beta$):
$$T_P = \frac{E(R_P) - R_F}{\beta_P}$$
The Sharpe measure: excess return divided by portfolio volatility (standard deviation):
$$S_P = \frac{E(R_P) - R_F}{\sigma(R_P)}$$
Jensen’s alpha is the excess return equated to alpha plus expected systematic return:
$$E(R_P) - R_F = \alpha_P + \beta_P\,(E(R_M) - R_F)$$
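The three measures can be sketched as small functions. All inputs below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def treynor(rp, rf, beta_p):
    """Excess return per unit of portfolio beta."""
    return (rp - rf) / beta_p

def sharpe(rp, rf, sigma_p):
    """Excess return per unit of portfolio volatility."""
    return (rp - rf) / sigma_p

def jensen_alpha(rp, rf, beta_p, rm):
    """Excess return minus the expected systematic return."""
    return (rp - rf) - beta_p * (rm - rf)

# Hypothetical inputs: 12% portfolio return, 3% risk-free rate, 9% market return
rp, rf, rm = 0.12, 0.03, 0.09
beta_p, sigma_p = 1.2, 0.18

t_p = treynor(rp, rf, beta_p)              # 0.09 / 1.2
s_p = sharpe(rp, rf, sigma_p)              # 0.09 / 0.18
alpha_p = jensen_alpha(rp, rf, beta_p, rm)  # 0.09 - 1.2 * 0.06
```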
Tracking Error (TE)

Tracking error (TE) is the standard deviation of the difference between the portfolio return
and the benchmark return:
$$TE = \sigma(R_P - R_B)$$
Ex ante tracking error can be given by:
$$TEV^2 = \sigma_{TE}^2 = \sigma_P^2 - 2\rho\,\sigma_P\sigma_B + \sigma_B^2$$
Information ratio (IR)
The information ratio (IR, a.k.a. the appraisal ratio) is given by:
$$IR = \frac{E(R_P) - E(R_B)}{\sigma(R_P - R_B)}$$
Sortino Ratio
The Sortino ratio is given by:
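$$\text{Sortino ratio} = \frac{E(R_P) - MAR}{\sqrt{\dfrac{1}{T}\sum_{t=0,\;R_{Pt} < MAR}^{T}(R_{Pt} - MAR)^2}}$$
where MAR is the minimum acceptable return and the sum includes only the periods in which $R_{Pt} < MAR$.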
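A small sketch of the Sortino ratio follows. It assumes that the downside deviation divides by the total number of observations (not only the downside ones) and that only returns below MAR enter the sum. The return series is hypothetical.

```python
import math

def sortino_ratio(returns, mar):
    """(Mean return - MAR) divided by the downside deviation below MAR."""
    t = len(returns)
    mean_return = sum(returns) / t
    # Only periods with a return below MAR contribute to the downside sum
    downside_sq = sum((r - mar) ** 2 for r in returns if r < mar)
    downside_dev = math.sqrt(downside_sq / t)
    return (mean_return - mar) / downside_dev

# Hypothetical periodic returns with a minimum acceptable return of zero
returns = [0.05, -0.02, 0.03, -0.04]
s = sortino_ratio(returns, mar=0.0)
```

Unlike the Sharpe measure, the denominator ignores upside variability, so two portfolios with the same volatility can have different Sortino ratios.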
Arbitrage Pricing Theory (APT)

APT postulates a multiple-factor model of excess returns. APT assumes that there are K factors
such that the excess returns can be expressed as:
$$r_n = \sum_{k=1}^{K} X_{n,k}\, b_k + u_n$$
• X(n,k) = exposure of stock (n) to factor (k). These exposures are factor loadings. In
practice, we will assume that the exposures are known before the returns are observed.
• b(k) = the factor return for factor k. These factor returns are either attributed to the
factors at the end of the period or observed during the period.
• u(n) = stock (n)'s specific return; i.e., the return that cannot be explained by the factors.
This is also called the idiosyncratic return to the stock.
The expected excess return is given by:
$$f_n = E\{r_n\} = \sum_{k=1}^{K} X_{n,k}\, m_k$$
• m(k) = the factor forecast for factor k.
The expected excess return is simply the sum of [Exposure(k) * Factor forecast(k)].
Discrete random variables (probability function)
A discrete random variable (X) assumes a value among a finite set including x1, x2, x3 and so
on. The probability function is expressed by:
$$P(X = x_k) = f(x_k)$$
The function must meet two conditions:
1st condition: $f(x) \ge 0$
2nd condition: $\sum_x f(x) = 1$
Continuous random variable

A continuous random variable (X) has an infinite number of values within an interval:
$$P(a < X < b) = \int_a^b f(x)\,dx$$
The function must meet two conditions:
1st condition: $f(x) \ge 0$
2nd condition: $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Note that instead of an “in between” interval, a continuous variable can be expressed in
cumulative terms; i.e., what is the probability that X “is less than” some value?
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du \qquad (-\infty < x < \infty)$$
Bayes’ Theorem
$$P(G \mid U) = \frac{P(U \mid G)\,P(G)}{P(U)}$$
We can expand the denominator into the elaborated Bayes’ formula:
$$P(G \mid U) = \frac{P(U \mid G)\,P(G)}{P(U \mid G)\,P(G) + P(U \mid \bar{G})\,P(\bar{G})}$$
where $\bar{G}$ denotes the complement of G ("not G").
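The elaborated Bayes' formula can be sketched as a single function. The prior and the conditional probabilities below are hypothetical numbers used only to show the mechanics.

```python
def bayes_posterior(p_u_given_g, p_g, p_u_given_not_g):
    """P(G | U) with the denominator expanded over G and its complement."""
    p_not_g = 1.0 - p_g
    p_u = p_u_given_g * p_g + p_u_given_not_g * p_not_g
    return p_u_given_g * p_g / p_u

# Hypothetical inputs: P(G) = 0.7, P(U | G) = 0.8, P(U | not G) = 0.3
posterior = bayes_posterior(0.8, 0.7, 0.3)  # 0.56 / (0.56 + 0.09)
```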
Conditional probability
What is the probability of B occurring, given that A has already occurred?
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} \;\Rightarrow\; P(A)\,P(B \mid A) = P(A \cap B)$$
Mathematical expectation
In the case of a discrete random variable, expected value is given by:
$$E(X) = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_n f(x_n) = \sum_x x f(x)$$
In the case of a continuous random variable, expected value is given by:
$$E(X) = \int x f(x)\,dx$$
Variance and standard deviation


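If X is a continuous random variable, the variance is given by:
$$\text{var}(X) = \sigma_X^2 = \int (x - \mu)^2 f(x)\,dx$$
But if X is a discrete random variable, the variance is given by:
$$\text{Variance}(X) = \sigma_X^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x)$$
And the standard deviation, which is simply the square root of the variance, is given by:
$$\text{Standard Deviation} = \sigma_X = \sqrt{\text{var}(X)} = \sqrt{E[(X - \mu)^2]}$$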
Important variance formula

Variance is also conveniently expressed as the difference between the expected value of X^2 and
the square of the expected value of X:
$$\text{Variance}(X) = E[(X - \mu)^2] = E(X^2) - [E(X)]^2$$
Properties of variance if X and Y are independent
1. $\sigma_{\text{constant}}^2 = 0$
2a. $\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$ only if independent
2b. $\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$ only if independent
3. $\sigma_{X+b}^2 = \sigma_X^2$
4. $\sigma_{aX}^2 = a^2\sigma_X^2$
5. $\sigma_{aX+b}^2 = a^2\sigma_X^2$
6. $\sigma_{aX+bY}^2 = a^2\sigma_X^2 + b^2\sigma_Y^2$ only if independent
7. $\sigma_X^2 = E(X^2) - E(X)^2$
Chebyshev’s Inequality
Chebyshev’s inequality provides a shorthand method for bounding a cumulative probability
without needing to know the underlying distribution (conditional on a finite variance):
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}, \quad\text{or}\quad P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$$
Coefficient of variation
Because the standard deviation depends on the units of measurement, the coefficient of variation
is used to measure relative variation. In other words, the coefficient of variation (like the
correlation coefficient) is a unit-less number.
$$\text{coeff. of variation } (V) = \frac{\sigma_X}{\mu_X}\,(100)$$
Covariance
The covariance (a.k.a., covariance of joint distributions) is given by:
$$\sigma_{XY} = \text{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

Properties of covariance
1. If X and Y are independent, $\sigma_{XY} = \text{cov}(X, Y) = 0$
2. $\text{cov}(a + bX, c + dY) = bd\,\text{cov}(X, Y)$
3. $\text{cov}(X, X) = \text{var}(X)$. In notation, $\sigma_{XX} = \sigma_X^2$
4. If X and Y are not independent,
$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}$$
$$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}$$
Correlation Coefficient
The correlation coefficient is the covariance (X,Y) divided by the product of each variable’s
standard deviation. The correlation coefficient translates covariance into a unitless metric
that runs from -1.0 to +1.0:
$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X\sigma_Y} = \frac{\text{cov}(X, Y)}{\text{StandardDev}(X) \times \text{StandardDev}(Y)} \;\Rightarrow\; \rho\,\sigma_X\sigma_Y = \sigma_{XY}$$
Properties of correlation:
• Correlation has the same sign (+/-) as covariance
• Correlation measures the linear relationship between two variables
• Between -1.0 and +1.0, inclusive
• The correlation is a unit-less metric
• Zero covariance → zero correlation. But zero correlation does not necessarily imply independence. For example, Y=X^2 is a nonlinear dependence that can have zero correlation.
Define, calculate and interpret the mean and variance of a set of random variables.
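The variance of the sum of correlated variables is given by the following:
$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}, \text{ and given that } \sigma_{XY} = \rho\,\sigma_X\sigma_Y$$
$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\,\sigma_X\sigma_Y$$
The variance of the difference of correlated variables is given by:
$$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}, \text{ and given that } \sigma_{XY} = \rho\,\sigma_X\sigma_Y$$
$$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y$$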
Describe the difference between conditional and unconditional expectation.

An unconditional expectation is the expected value of the variable without any restrictions (or
lacking any prior information).
A conditional expectation is an expected value for the variable conditional on prior information
or some restriction (e.g., the value of a correlated variable). The conditional expectation of Y,
conditional on X = x, is given by:
$$E(Y \mid X = x)$$
The conditional variance of Y, conditional on X = x, is given by:
$$\text{var}(Y \mid X = x)$$
Moments of a distribution
The k-th moment about the mean ($\mu$) is given by:
$$k\text{-th moment} = \frac{\sum_{i=1}^{n}(x_i - \mu)^k}{n}$$
In this way, the difference of each data point from the mean is raised to a power (k=1, k=2, k=3,
and k=4). These are the four moments of the distribution:
• If k=1, refers to the first moment about zero: the mean.
• If k=2, refers to the second moment about the mean: the variance.
• If k=3, refers to the third moment about the mean: skewness.
• If k=4, refers to the fourth moment about the mean: kurtosis (peakedness).
Skewness (asymmetry)
Skewness refers to whether a distribution is symmetrical. An asymmetrical distribution is
skewed and will be either positively (to the right) or negatively (to the left) skewed. The measure of
“relative skewness” is given by the equation below, where zero indicates symmetry (no skewness):
$$\text{Skewness} = \frac{E[(X - \mu)^3]}{\sigma^3}$$
Kurtosis
Kurtosis measures the degree of “peakedness” of the distribution, and consequently of “heaviness
of the tails.” A value of three (3) indicates normal peakedness. The normal distribution has
kurtosis of 3, such that “excess kurtosis” equals (kurtosis – 3).
$$\text{Kurtosis} = \frac{E[(X - \mu)^4]}{\sigma^4}$$
Note that technically skew and kurtosis are not, respectively, equal to the third and fourth
moments; rather they are functions of the third and fourth moments.
Summary
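Population versus sample:
• Mean: population $\mu_X = \dfrac{\sum_{i=1}^{n} X_i}{n}$; sample $\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$
• Variance: population $\sigma_x^2 = \dfrac{1}{n}\sum_{i=1}^{n}(X_i - \mu_X)^2$; sample $s_x^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$
• Covariance: population $\sigma_{XY} = \dfrac{1}{n}\sum (X_i - \mu_X)(Y_i - \mu_Y)$; sample $\sigma_{XY} = \dfrac{1}{n-1}\sum (X_i - \bar{X})(Y_i - \bar{Y})$
• Correlation: population $\rho = \dfrac{\sigma_{XY}}{\sigma_X\sigma_Y}$; sample $\rho = \dfrac{\text{sample } \sigma_{XY}}{S_X S_Y}$
• Skew: population skewness $= \dfrac{E[(X - \mu)^3]}{\sigma^3}$; sample skewness $= \dfrac{\sum (X - \bar{X})^3 / (N - 1)}{S^3}$
• Kurtosis: population kurtosis $= \dfrac{E[(X - \mu)^4]}{\sigma^4}$; sample kurtosis $= \dfrac{\sum (X - \bar{X})^4 / (N - 1)}{S^4}$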
Normal distribution
Here is the probability density function (PDF) for a normally distributed random variable:
1 2 2 2
f ( x)  e ( x  )
 2
Key properties of the normal include:
• Symmetrical around its mean value (skew = 0)
• Peaks at mean but descends rapidly to tails
• ~68% within ±1σ, ~95% within ±2σ, ~99.7% within ±3σ
• Only requires (fully described by) two parameters, mean and variance
• A linear combination (function) of two normally distributed random variables is itself
normally distributed
• Skew = 0, Kurtosis = 3 (excess kurtosis = 0)
Key locations on the normal distribution are noted below. In the FRM curriculum, the choice
of one-tailed 5% significance and 1% significance (i.e., 95% and 99% confidence) is
common, so please pay particular attention to the 1.645 and 2.33 rows:
% of all observations (two-tailed) | % of observations “to the left” (one-tailed) | Interval: multiple of standard deviation | Interval: mathematically expressed (two-tailed) | VaR: worst expected loss at the given confidence
~50% | ~25% | 2/3 | $\hat{\mu} \pm 0.67\hat{\sigma}$ |
~68% | ~16% | 1 | $\hat{\mu} \pm \hat{\sigma}$ |
~90% | ~5.0% | 1.645 (~1.65) | $\hat{\mu} \pm 1.65\hat{\sigma}$ | $\hat{\mu} - 1.65\hat{\sigma}$
~95% | ~2.5% | 1.96 | $\hat{\mu} \pm 1.96\hat{\sigma}$ |
~98% | ~1.0% | 2.327 (~2.33) | $\hat{\mu} \pm 2.33\hat{\sigma}$ | $\hat{\mu} - 2.33\hat{\sigma}$
~99% | ~0.5% | 2.58 | $\hat{\mu} \pm 2.58\hat{\sigma}$ |
Standard normal distribution

A normal distribution is fully specified by two parameters, mean and variance (or standard
deviation). We can transform a normal into a unit or standardized variable:
$$Z = \frac{X - \mu_X}{\sigma_X}$$
This unit or standardized variable is normally distributed with zero mean and variance of
one (1.0).
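As a quick check on the standardization and on the one-tailed 95% and 99% critical values, the sketch below uses `statistics.NormalDist` from the Python standard library. The mean of 100 and standard deviation of 15 are hypothetical.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

std_normal = NormalDist(mu=0.0, sigma=1.0)

# One-tailed critical values for 95% and 99% confidence
z_95 = std_normal.inv_cdf(0.95)
z_99 = std_normal.inv_cdf(0.99)

# Hypothetical observation: X = 130 from a normal with mean 100 and sd 15
z = standardize(130.0, 100.0, 15.0)
prob_below = std_normal.cdf(z)  # probability of observing a value below X
```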
Sampling distribution of means
If either (i) the population is infinite and sampling is random, or (ii) the population is finite
and sampling is with replacement, then the variance of the sampling distribution of means is
given by:
$$E[(\bar{X} - \mu)^2] = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$$
If the population is size (N), if the sample size n ≤ N, and if sampling is conducted “without
replacement,” then the variance of the sampling distribution of means is given by:
$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}\left(\frac{N - n}{N - 1}\right)$$
Standard error of a sample mean.

The standard error is simply the standard deviation of the sampling distribution of the
estimator.
In the case of a sample mean, according to the central limit theorem, the variance of the
estimator is the population variance divided by the sample size. The standard error is the square
root of this quantity:
$$se = \sqrt{\frac{\sigma_X^2}{n}} = \frac{\sigma_X}{\sqrt{n}}$$
If the population is distributed with mean $\mu$ and variance $\sigma^2$ but the distribution is not a
normal distribution, then the standardized variable given by Z below is “asymptotically normal”;
i.e., as (n) approaches infinity ($\infty$) the distribution becomes normal.
$$Z = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}} \sim N(0, 1)$$
Student’s t‐distribution…
As the degrees of freedom (d.f.) increase, the t-distribution converges to the normal
distribution. It is similar to the normal, except it exhibits heavier tails (the lower the d.f., the
heavier the tails). The student’s t variable is given by:
$$t = \frac{\bar{X} - \mu_X}{S_x / \sqrt{n}}$$
Properties of the t-distribution:
• Like the normal, it is symmetrical
• Like the standard normal, it has mean of zero (mean = 0)
• Its variance = k/(k-2) where k = degrees of freedom (for k > 2). Note, as k increases, the variance
approaches 1.0. Therefore, as k increases, the t-distribution approximates the standard
normal distribution.
Both the normal (Z) and student’s t (t) distribution characterize the sampling distribution of
the sample mean. The difference is that the normal is used when we know the population
variance; the student’s t is used when we must rely on the sample variance. In practice, we don’t
know the population variance, so the student’s t is typically appropriate.
$$Z = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}}, \qquad t = \frac{\bar{X} - \mu_X}{S_X / \sqrt{n}}$$
Chi‐square distribution
For the chi-square distribution, we observe a sample variance and compare it to a
hypothetical population variance. This variable has a chi-square distribution with (n-1) degrees
of freedom:
$$\chi^2 = (n - 1)\left(\frac{s^2}{\sigma^2}\right) \sim \chi^2_{(n-1)}$$
Properties of the chi-square distribution:
• Nonnegative (≥ 0)
• Skewed right, but as d.f. increases it approaches normal
• Expected value (mean) = k, where k = degrees of freedom
• Variance = 2k, where k = degrees of freedom
• The sum of two independent chi-square variables is also a chi-square variable
F distribution
The F ratio is the ratio of sample variances, with the greater sample variance in the numerator:
$$F = \frac{s_x^2}{s_y^2}$$
Properties of F distribution:
• Nonnegative (≥ 0)
• Skewed right
• Like the chi-square distribution, as d.f. increases, approaches normal
• The square of t-distributed r.v. with k d.f. has an F distribution with 1,k d.f.
• As the denominator d.f. (n) becomes large, m * F(m,n) approaches a chi-square variable with m d.f.
Critical t-values
The critical t-values show what percentage of the area under the student’s t distribution curve lies
between the values. The random variable is given by:
$$t = \frac{\bar{X} - \mu_X}{S_x / \sqrt{n}}$$
It follows the student’s t distribution with (n-1) degrees of freedom (d.f.). The confidence interval
is given by:
$$\bar{X} - t\,\frac{S_x}{\sqrt{n}} \le \mu_X \le \bar{X} + t\,\frac{S_x}{\sqrt{n}}$$
Population regression function (PRF)

The population regression function (PRF) describes the population regression line (PRL). We
don’t observe the PRF; instead, we try to infer it with the sample regression function (SRF). The
population regression function is given by:
$$E(Y \mid X_i) = B_1 + B_2 X_i$$
Its stochastic equivalent adds the stochastic (or random) error term:
$$Y_i = B_1 + B_2 X_i + u_i$$
• B1 = intercept = parameter or regression coefficient
• B2 = slope = parameter or regression coefficient
Sample regression function (SRF)
Stochastic PRF: $Y_i = B_1 + B_2 X_i + u_i$
Sample regression function (SRF): $\hat{Y}_i = b_1 + b_2 X_i$
Stochastic sample regression function (SRF): $Y_i = b_1 + b_2 X_i + e_i$
• b1 = intercept = estimated regression coefficient (estimator of B1)
• b2 = slope = estimated regression coefficient (estimator of B2)
• ei = the residual term
Residual sum of squares (RSS)

Each residual is the difference between the observed and predicted Y; the residual sum of squares
(RSS) is the sum of the squares of these residuals:
$$RSS = \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - b_1 - b_2 X_i)^2$$
Standard errors in OLS

Assume the sample regression function (SRF):
Yi  b1  b2 Xi  ei
It contains two estimators (estimates of the population parameters). Each estimate has a
standard error, a measure of its variability.
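The intercept is given by b1. Its standard error is the square root of its variance:
$$\text{var}(b_1) = \frac{\sum X_i^2}{n\sum x_i^2}\,\sigma^2, \qquad se(b_1) = \sqrt{\text{var}(b_1)}$$
where $x_i = X_i - \bar{X}$ is the deviation from the sample mean.
The slope coefficient is given by b2. Its standard error is the square root of its variance:
$$\text{var}(b_2) = \frac{\sigma^2}{\sum x_i^2}, \qquad se(b_2) = \sqrt{\text{var}(b_2)}$$
The standard error of the regression, SER (a.k.a., standard error of estimate), is given by the
square root of RSS/(n-2):
$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - k} \;\Rightarrow\; \hat{\sigma} = \sqrt{\frac{\sum e_i^2}{n - k}}, \qquad k = 2 \text{ in a two-variable model}$$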
Sum of squares
We can break the regression equation into three parts:
• Explained sum of squares (ESS),
• Residual sum of squares (RSS), and
• Total sum of squares (TSS).
The explained sum of squares (ESS) is the sum of squared distances between the predicted Y and the
mean of Y:
$$ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$$
The residual sum of squares (RSS) is the summation of each squared deviation between the
observed (actual) Y and the predicted Y:
$$RSS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
The ordinary least squares (OLS) approach minimizes the RSS. The RSS is the sum of the squared
residuals, and it is directly related to the standard error of the regression (SER); the SER is
the standard deviation of the Y values around the regression line:
$$RSS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \;\Rightarrow\; RSS = SER^2\,(n - k) \;\Rightarrow\; SER = \sqrt{\frac{RSS}{n - k}} = \sqrt{\frac{\sum e_i^2}{n - k}}$$
Or, equivalently:
$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - k} \;\Rightarrow\; \hat{\sigma} = \sqrt{\frac{\sum e_i^2}{n - k}}, \qquad k = 2 \text{ in a two-variable model}$$
Coefficient of Determination & Correlation Coefficient
The coefficient of determination (R^2 or r^2) measures the proportion or percentage of the total
variation in Y explained by the regression model:
$$r^2 = 1 - \frac{\sum (y - y_{est})^2}{\sum (y - \bar{y})^2} = \frac{\sum (y_{est} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{\text{explained variation}}{\text{total variation}}$$
The coefficient of determination can be directly inferred from the ANOVA and its sum of squares:
$$TSS = ESS + RSS$$
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$$
In a two-variable regression, the correlation coefficient is the square root of the coefficient of
determination (taking the sign of the slope):
$$r = \pm\sqrt{R^2} = \pm\sqrt{\frac{ESS}{TSS}} = \pm\sqrt{1 - \frac{RSS}{TSS}}$$
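To make the sum-of-squares decomposition concrete, the following sketch fits a two-variable OLS regression by hand and computes ESS, RSS, TSS, R^2, the SER and the slope's standard error. The five data points are hypothetical, and the slope and intercept use the usual least-squares formulas in deviation form.

```python
import math

def ols_summary(x, y):
    """Two-variable OLS with the ANOVA decomposition TSS = ESS + RSS."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx              # slope
    b1 = y_bar - b2 * x_bar     # intercept
    y_hat = [b1 + b2 * xi for xi in x]
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    ser = math.sqrt(rss / (n - 2))  # k = 2 in a two-variable model
    return {"b1": b1, "b2": b2, "ESS": ess, "RSS": rss, "TSS": tss,
            "R2": ess / tss, "SER": ser, "se_b2": ser / math.sqrt(sxx)}

# Hypothetical sample
result = ols_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For this sample the slope is 0.6 and the intercept is 2.2, and ESS (3.6) plus RSS (2.4) recovers TSS (6.0), so R^2 is 0.6.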
Test of hypothesis for the slope (b2)

To test the hypothesis that the regression coefficient (b2) is equal to some specified value ($\beta$), we
use the test statistic:
$$t = \frac{b_2 - \beta}{se(b_2)}$$
This has a student's t distribution with n - k degrees of freedom; k=2 in a two-variable model.
Jarque-Bera
The Jarque-Bera is a popular test of normality that incorporates both skew and kurtosis. It is given
by the following:
$$JB = \frac{n}{6}\left[S^2 + \frac{(K - 3)^2}{4}\right]$$
• n = the sample size, S = skewness (not sample variance!), K = kurtosis
The JB value is a random variable that follows the chi-square distribution with 2 degrees of
freedom (d.f. = 2).
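A minimal sketch of the Jarque-Bera statistic, assuming skewness and kurtosis are computed from population-style (divide-by-n) moments. The five-point sample is hypothetical and far too small for a real test; it is used only so the arithmetic can be followed by hand.

```python
def jarque_bera(xs):
    """Return (JB, skewness, kurtosis) using divide-by-n central moments."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, skew, kurt

# Hypothetical symmetric sample: skewness is zero, kurtosis is 1.7
jb, skew, kurt = jarque_bera([-2, -1, 0, 1, 2])

# Compare with the 5% critical value of a chi-square with 2 d.f. (about 5.99)
reject_normality = jb > 5.99
```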
Prediction Error
The prediction error is the difference between the predicted Y and the true mean value. Like the
regression coefficients, the predicted Y has a sampling distribution. The variance of the
predicted Y is given by:
$$\text{var}(\hat{Y}_0 \mid X_0) = \sigma^2\left[\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x_i^2}\right]$$
Distinguish between simple and multivariate regression

A simple regression is a two-variable regression, one dependent regressed against one
independent variable:
$$E(Y_t) = B_1 + B_2 X_{2t}$$
$$Y_t = B_1 + B_2 X_{2t} + \epsilon_t$$
A multivariate regression has more than one independent variable:
$$E(Y_t) = B_1 + B_2 X_{2t} + B_3 X_{3t}$$
$$Y_t = B_1 + B_2 X_{2t} + B_3 X_{3t} + \epsilon_t$$
Partial slope coefficient

The three-variable (two independents + one dependent) linear regression is given by:
$$Y_t = B_1 + B_2 X_{2t} + B_3 X_{3t} + \epsilon_t$$
In this case, B2 and B3 are partial slope coefficients (or partial regression coefficients). This
means:
• In the case of B2, B2 measures the change in the mean value Y, E(Y), per unit of change in
X2, holding constant the value of X3.
• In the case of B3, B3 measures the change in the mean value Y, E(Y), per unit of change in
X3, holding constant the value of X2.
The partial slope coefficients are, therefore, measures of direct sensitivity; i.e., what change in
the dependent variable can be directly attributable to a particular independent variable.
Explain the assumptions of the multiple linear regression model.
• A8.1. Linear in parameters
• A8.2. X2 and X3 uncorrelated with disturbance term
• A8.3. Expected value of error term is zero
• A8.4. Constant variance (homoskedasticity)
• A8.5. No autocorrelation (a.k.a., serial correlation) between error terms
• A8.6. No collinearity between X2 and X3. If the explanatory variables are correlated, such
a violation is called multicollinearity.
• A8.7. Error term is normally distributed
Standard errors in multilinear regression

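The variances and standard errors are given by the following:
$$\text{var}(b_1) = \left[\frac{1}{n} + \frac{\bar{X}_2^2\sum x_{3t}^2 + \bar{X}_3^2\sum x_{2t}^2 - 2\bar{X}_2\bar{X}_3\sum x_{2t}x_{3t}}{\sum x_{2t}^2\sum x_{3t}^2 - \left(\sum x_{2t}x_{3t}\right)^2}\right]\sigma^2, \qquad se(b_1) = \sqrt{\text{var}(b_1)}$$
$$\text{var}(b_2) = \frac{\sum x_{3t}^2}{\sum x_{2t}^2\sum x_{3t}^2 - \left(\sum x_{2t}x_{3t}\right)^2}\,\sigma^2, \qquad se(b_2) = \sqrt{\text{var}(b_2)}$$
$$\text{var}(b_3) = \frac{\sum x_{2t}^2}{\sum x_{2t}^2\sum x_{3t}^2 - \left(\sum x_{2t}x_{3t}\right)^2}\,\sigma^2, \qquad se(b_3) = \sqrt{\text{var}(b_3)}$$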
Relationship between the number of Monte Carlo replications and the standard error
of the estimated values.
The relationship between the number of replications and precision (i.e., the standard error of
estimated values) is not linear: to increase the precision by 10X requires approximately 100X more
replications. The standard error of the sample standard deviation is:
$$SE(\hat{\sigma}) = \sigma\,\frac{1}{\sqrt{2T}} \;\Rightarrow\; \frac{SE(\hat{\sigma})}{\sigma} = \frac{1}{\sqrt{2T}}$$
Therefore, because the standard error shrinks in proportion to $1/\sqrt{T}$, increasing VaR precision
by a factor of k requires about $k^2$ times the number of replications; e.g., x 10 precision needs x 100.
Correlated random variables
The following transforms two independent random variables into correlated random variables:
$$\epsilon_1 = \eta_1$$
$$\epsilon_2 = \rho\,\eta_1 + (1 - \rho^2)^{1/2}\,\eta_2$$
• $\eta_1, \eta_2$: independent random variables
• $\rho$: correlation coefficient
• $\epsilon_1, \epsilon_2$: correlated random variables
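The transformation can be sketched with standard normal draws from Python's `random` module. The seed, the sample size of 20,000 and the target correlation of 0.6 are arbitrary choices for illustration; the sample correlation should land close to, but not exactly on, the target.

```python
import math
import random

def correlate(eta1, eta2, rho):
    """Map two independent draws to a pair with correlation rho."""
    eps1 = eta1
    eps2 = rho * eta1 + math.sqrt(1 - rho ** 2) * eta2
    return eps1, eps2

def sample_correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(42)  # fixed seed so the illustration is repeatable
rho = 0.6
pairs = [correlate(random.gauss(0, 1), random.gauss(0, 1), rho)
         for _ in range(20000)]
eps1 = [p[0] for p in pairs]
eps2 = [p[1] for p in pairs]
estimated_rho = sample_correlation(eps1, eps2)
```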
Periodic returns (e.g., 1 period = 1 day)

Assume that one period equals one day. You can either compute the “continuously compounded
daily return” or the “simple percentage change.” If $S_{i-1}$ is yesterday’s price and $S_i$ is today’s price,
the continuously compounded return ($u_i$) is given by:
$$u_i = \ln\left(\frac{S_i}{S_{i-1}}\right)$$
The simple percentage change is given by:
$$u_i = \frac{S_i - S_{i-1}}{S_{i-1}}$$
Volatility weighting schemes

Un-Weighted Scheme:
$$\sigma_n^2 = \frac{1}{m}\sum_{i=1}^{m} u_{n-i}^2$$
Weighted Scheme (alphas are weights, so they must sum to one):
$$\sigma_n^2 = \sum_{i=1}^{m} \alpha_i\,u_{n-i}^2$$
Estimate volatility using the EWMA model.
In exponentially weighted moving average (EWMA), the weights decline in constant proportion,
given by lambda.
Exponentially weighted moving average (EWMA):
$$\sigma_n^2 = (1 - \lambda)\lambda^0 u_{n-1}^2 + (1 - \lambda)\lambda^1 u_{n-2}^2 + (1 - \lambda)\lambda^2 u_{n-3}^2 + \cdots$$
In EWMA, the weights also sum to one. However, they decline in constant ratio (lambda).
Recursive version of EWMA:
$$\sigma_n^2 = \lambda\,\sigma_{n-1}^2 + (1 - \lambda)\,u_{n-1}^2$$
Lambda is the “persistence parameter” or “smoothing constant.”
RiskMetrics™ is a branded EWMA with a lambda (smoothing constant) of 0.94:
$$\sigma_n^2 = (0.94)\,\sigma_{n-1}^2 + (0.06)\,u_{n-1}^2$$
RiskMetricsTM Approach
RiskMetrics is a branded form of the exponentially weighted moving average (EWMA) approach:
$$h_t = \lambda\,h_{t-1} + (1 - \lambda)\,r_{t-1}^2$$
The optimal (theoretical) lambda varies by asset class, but the overall optimal parameter used by
RiskMetrics has been 0.94. In practice, RiskMetrics only uses one decay factor for all series:
• 0.94 for daily data
• 0.97 for monthly data (month defined as 25 trading days)
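The recursive EWMA update can be sketched in a few lines. The starting variance (1% daily volatility) and the 2% return are hypothetical; the default lambda of 0.94 is the RiskMetrics daily value quoted in the text.

```python
import math

def ewma_update(prev_variance, prev_return, lam=0.94):
    """sigma_n^2 = lam * sigma_{n-1}^2 + (1 - lam) * u_{n-1}^2."""
    return lam * prev_variance + (1 - lam) * prev_return ** 2

def ewma_variance(returns, initial_variance, lam=0.94):
    """Apply the recursive update across a series of returns, oldest first."""
    variance = initial_variance
    for u in returns:
        variance = ewma_update(variance, u, lam)
    return variance

# Hypothetical inputs: yesterday's volatility was 1% and yesterday's return was 2%
new_variance = ewma_update(0.01 ** 2, 0.02)  # 0.94 * 0.0001 + 0.06 * 0.0004
new_vol = math.sqrt(new_variance)
```

Because the weight on the latest squared return is only 6%, a return of twice the prevailing volatility lifts the estimate from 1.00% to roughly 1.09%.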
GARCH(p,q) model
EWMA is a special case of GARCH(1,1) where gamma = 0 and (alpha + beta = 1):
$$\sigma_n^2 = \lambda\,\sigma_{n-1}^2 + (1 - \lambda)\,u_{n-1}^2$$
GARCH(1,1) is the weighted sum of a long-run variance (weight = gamma), the most recent
squared return (weight = alpha), and the most recent variance (weight = beta):
$$\sigma_n^2 = \gamma V_L + \alpha\,u_{n-1}^2 + \beta\,\sigma_{n-1}^2$$
• $\gamma V_L$: (weighted) long-run variance
• $\alpha\,u_{n-1}^2$: lagged, squared return (1)
• $\beta\,\sigma_{n-1}^2$: lagged variance (1)
Mean reversion in the GARCH(1,1) model.
Long-run average variance as a function of omega ($\omega = \gamma V_L$) and the weights (alpha, beta):
$$V_L = \frac{\omega}{1 - \alpha - \beta}$$
Explain how GARCH models perform in volatility forecasting.

The forecast variance forward (k) days is given by:
$$E[\sigma_{n+k}^2] = V_L + (\alpha + \beta)^k\,(\sigma_n^2 - V_L)$$
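The GARCH(1,1) update, the long-run variance and the k-day forecast can be sketched together. The parameter values (omega = 0.000002, alpha = 0.13, beta = 0.86) and the starting volatility and return are hypothetical inputs chosen for illustration.

```python
def long_run_variance(omega, alpha, beta):
    """V_L = omega / (1 - alpha - beta); requires alpha + beta < 1."""
    return omega / (1 - alpha - beta)

def garch_update(omega, alpha, beta, prev_return, prev_variance):
    """sigma_n^2 = omega + alpha * u_{n-1}^2 + beta * sigma_{n-1}^2."""
    return omega + alpha * prev_return ** 2 + beta * prev_variance

def garch_forecast(v_l, alpha, beta, current_variance, k):
    """Expected variance k days ahead; it reverts toward V_L as k grows."""
    return v_l + (alpha + beta) ** k * (current_variance - v_l)

# Hypothetical parameters and state
omega, alpha, beta = 0.000002, 0.13, 0.86
v_l = long_run_variance(omega, alpha, beta)                     # about 0.0002
variance_n = garch_update(omega, alpha, beta, 0.01, 0.016 ** 2)
forecast_10 = garch_forecast(v_l, alpha, beta, variance_n, 10)
```

Since alpha + beta = 0.99 is below one, the forecast sits between the current variance and the long-run variance and moves closer to the latter as the horizon lengthens.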
Discuss how correlations and covariances are calculated, and explain the consistency
condition for covariances.
Correlations play a key role in the calculation of value at risk (VaR). We can use methods similar
to EWMA for volatility. In this case, an updated covariance estimate is a weighted sum of:
• The recent covariance; weighted by lambda
• The recent cross-product; weighted by (1-lambda)
$$\text{cov}_n = \lambda\,\text{cov}_{n-1} + (1 - \lambda)\,x_{n-1}\,y_{n-1}$$
Bernoulli
A Bernoulli variable is discrete and has two possible outcomes:
$$X = \begin{cases} 1 & \text{if C defaults in I} \\ 0 & \text{else} \end{cases}$$
Binomial
A binomial distributed random variable is the sum of (n) independent and identically
distributed (i.i.d.) Bernoulli-distributed random variables. The probability of observing (k)
successes is given by:
$$P(Y = k) = \binom{n}{k}\,p^k (1 - p)^{n-k}, \qquad \binom{n}{k} = \frac{n!}{(n - k)!\,k!}$$
Poisson
The Poisson distribution depends upon only one parameter, lambda (λ), and can be interpreted as
an approximation to the binomial distribution. A Poisson-distributed random variable is usually
used to describe the random number of events occurring over a certain time interval. The
lambda parameter (λ) indicates the rate of occurrence of the random events; i.e., it tells us
how many events occur on average per unit of time.
$$P(N = k) = \frac{\lambda^k}{k!}\,e^{-\lambda}$$
Exponential
The exponential distribution is popular in queuing theory. It is used to model the time we have
to wait until a certain event takes place. According to the text, examples include “the time
until the next client enters the store, the time until a certain company defaults or the time until
some machine has a defect.”
$$f(x) = \lambda e^{-\lambda x}, \qquad \lambda = \frac{1}{\beta}, \qquad x \ge 0$$

Weibull
Weibull is a generalized exponential distribution; i.e., the exponential is a special case of the
Weibull where the alpha parameter equals 1.0.
$$F(x) = 1 - e^{-(x/\beta)^{\alpha}}, \qquad x \ge 0$$
Gamma distribution
$$f(x) = \frac{1}{\beta\,\Gamma(\alpha)}\,e^{-x/\beta}\left(\frac{x}{\beta}\right)^{\alpha - 1}, \qquad x \ge 0$$
Generalized extreme value (GEV) fits block maxima

The Generalized extreme value (GEV) distribution is given by:
$$H(y) = \begin{cases} \exp\left(-(1 + \xi y)^{-1/\xi}\right) & \xi \ne 0 \\ \exp\left(-e^{-y}\right) & \xi = 0 \end{cases}$$
The $\xi$ (xi) parameter is the “tail index”; it represents the fatness of the tails. In this expression, a
higher $\xi$ corresponds to fatter tails (equivalently, a lower $1/\xi$ corresponds to fatter tails).
Peaks over threshold (POT)

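Peaks over threshold (POT) collects the dataset of losses above (or in excess of) some threshold (u):
$$F_u(y) = P(X - u \le y \mid X > u)$$
The distribution of the excesses is given by:
$$G_{\xi,\beta}(x) = \begin{cases} 1 - \left(1 + \dfrac{\xi x}{\beta}\right)^{-1/\xi} & \xi \ne 0 \\ 1 - \exp\left(-\dfrac{x}{\beta}\right) & \xi = 0 \end{cases}$$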
Describe the hazard rate of an exponentially distributed random variable.
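The parameter $1/\beta$ has a natural interpretation as hazard rate or default intensity. With $\lambda = 1/\beta$:
$$f(x) = \frac{1}{\beta}\,e^{-x/\beta},\; x \ge 0 \;\Rightarrow\; f(x) = \lambda e^{-\lambda x}$$
$$F(x) = 1 - e^{-x/\beta},\; x \ge 0 \;\Rightarrow\; F(x) = 1 - e^{-\lambda x}$$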
Basis
Basis = Spot price of hedged asset – Futures price of futures contract = $S_0 - F_0$
Minimum variance hedge ratio

If the spot and futures positions are perfectly correlated, then a 1:1 hedge ratio results in a perfect
hedge. However, this is not typically the case. The optimal hedge ratio (a.k.a., minimum variance
hedge ratio) is the ratio of futures position relative to the spot position that minimizes the
variance of the position. Where $\rho$ is the correlation and $\sigma$ is the standard deviation, the optimal
hedge ratio is given by:
$$h^* = \rho\,\frac{\sigma_S}{\sigma_F}$$
And the number of futures contracts is given by N* when $N_A$ is the size of the position being
hedged and $Q_F$ is the size of one futures contract:
$$N^* = \frac{h^*\,N_A}{Q_F}$$
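A short sketch of the minimum variance hedge ratio and the resulting number of contracts. The correlation, volatilities, position size and contract size are hypothetical inputs.

```python
def min_variance_hedge_ratio(rho, sigma_s, sigma_f):
    """h* = rho * sigma_S / sigma_F."""
    return rho * sigma_s / sigma_f

def optimal_contracts(h_star, position_size, contract_size):
    """N* = h* * N_A / Q_F (before rounding to whole contracts)."""
    return h_star * position_size / contract_size

# Hypothetical inputs: hedge 2,000,000 units with futures on 42,000 units each
h_star = min_variance_hedge_ratio(rho=0.928, sigma_s=0.0263, sigma_f=0.0313)
n_star = optimal_contracts(h_star, 2_000_000, 42_000)
contracts_to_trade = round(n_star)
```

In practice the result is rounded to a whole number of contracts, which is why the sketch keeps the unrounded value and the rounded value separate.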
Optimal number of futures contracts needed to hedge an exposure

When futures are used, a small adjustment, known as “tailing the hedge” can be made to allow for
the impact of daily settlement. The only difference here is to replace the units with values.
Instead of using quantities, as in:
$$N^* = \frac{h^*\,Q_A}{Q_F}$$
We use the dollar value of the position being hedged and the dollar value of one futures
contract, as in:
$$N^* = \frac{h^*\,V_A}{V_F}$$
Stock index futures contracts to change a stock portfolio’s beta

Given a portfolio beta ($\beta$), the current value of the portfolio (P), and the value of stocks
underlying one futures contract (A), the number of stock index futures contracts (i.e., which
minimizes the portfolio variance) is given by:
$$N^* = \beta\,\frac{P}{A}$$
By extension, when the goal is to shift portfolio beta from ($\beta$) to a target beta ($\beta^*$), the number of
contracts required is given by:
$$N = (\beta^* - \beta)\,\frac{P}{A}$$
Compounding
Assuming:
• $R_c$: rate of interest with continuous compounding
• $R_m$: rate of interest with discrete compounding (m per annum)
• n: number of years
$$A e^{R_c n} = A\left(1 + \frac{R_m}{m}\right)^{mn} \;\Rightarrow\; e^{R_c} = \left(1 + \frac{R_m}{m}\right)^{m}$$
$$R_c = m \ln\left(1 + \frac{R_m}{m}\right)$$
$$R_m = m\left(e^{R_c/m} - 1\right)$$
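The two conversion formulas between discrete and continuous compounding can be sketched as a pair of inverse functions. The rates below (10% semi-annual, 8% continuous) are hypothetical examples.

```python
import math

def discrete_to_continuous(r_m, m):
    """R_c = m * ln(1 + R_m / m)."""
    return m * math.log(1 + r_m / m)

def continuous_to_discrete(r_c, m):
    """R_m = m * (exp(R_c / m) - 1)."""
    return m * (math.exp(r_c / m) - 1)

# Hypothetical rates
rc_from_semi = discrete_to_continuous(0.10, m=2)       # 10% semi-annual
rq_from_cont = continuous_to_discrete(0.08, m=4)       # 8% continuous to quarterly
round_trip = continuous_to_discrete(rc_from_semi, m=2)  # recovers 10%
```

A 10% rate with semi-annual compounding corresponds to roughly 9.758% with continuous compounding; the continuous rate is always slightly lower than its discrete equivalent.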
Convert rates based on different compounding frequencies
The present value is discretely discounted at (m) periods per year (e.g., m=2 for semi-annual
compounding) over n years by using the discrete formula below, followed by its continuous
equivalent. Note that if the future value is one dollar (FV = $1), then the PV is the discount factor (DF).
Discrete:
$$PV = \frac{FV}{\left(1 + \dfrac{r}{m}\right)^{mn}}, \qquad PV = \frac{\$1}{\left(1 + \dfrac{r}{m}\right)^{mn}}$$
Continuous:
$$PV = FV \cdot e^{-rn}, \qquad PV = \$1 \cdot e^{-rn}$$
Calculate forward interest rates from a set of spot rates

Hull assumes continuous compounding/discounting. For example, given a two-year spot
rate of 4% and a one-year spot rate of 3%, the one-year implied forward rate = [(4%*2) –
(3%)(1)]/[2-1] = 5%.
$$R_F = \frac{R_2 T_2 - R_1 T_1}{T_2 - T_1}$$
Par Yield
The par yield for a certain maturity is the coupon rate that causes the bond price to equal its face
value. For example, assume a 2-year bond that pays semi-annual coupons. Further, assume the
zero rate term structure is given by {0.5 years = 5.0%, 1.0 years = 5.8%, 1.5 years = 6.4%, and 2.0
years = 6.8%}. Then solve for the coupon rate (c) that solves for a price (present value) equal to
the par (100):
$$\frac{c}{2}e^{-0.05 \times 0.5} + \frac{c}{2}e^{-0.058 \times 1.0} + \frac{c}{2}e^{-0.064 \times 1.5} + \left(100 + \frac{c}{2}\right)e^{-0.068 \times 2.0} = 100$$
to get c = 6.87 (with s.a. compounding)
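The par yield equation above is linear in c, so it can be solved in closed form rather than by iteration. The sketch below assumes continuously compounded zero rates, one zero rate per coupon date, and uses the term structure from the example; `par_yield` is an illustrative helper name.

```python
import math

def par_yield(zero_curve, m=2, face=100.0):
    """Annual coupon that prices the bond at par.

    zero_curve: list of (maturity in years, continuously compounded zero rate),
    one entry per coupon date, ending at the bond's maturity.
    """
    discount_factors = [math.exp(-rate * t) for t, rate in zero_curve]
    annuity = sum(discount_factors)
    # (c / m) * annuity + face * last_discount_factor = face, solved for c
    return m * face * (1 - discount_factors[-1]) / annuity

zero_curve = [(0.5, 0.050), (1.0, 0.058), (1.5, 0.064), (2.0, 0.068)]
c = par_yield(zero_curve)
```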
Yield
The bond yield is also known as the yield to maturity (YTM). The yield (YTM) is the discount
rate that makes the present value of the cash flows on the bond equal to the market price of the
bond.
(Macaulay) Duration
Duration of a bond that provides cash flow $c_i$ at time $t_i$ is
$$D = \sum_{i=1}^{n} t_i\left[\frac{c_i e^{-y t_i}}{B}\right]$$
where B is its price and y is its yield (continuously compounded). This leads to:
$$\frac{\Delta B}{B} = -D\,\Delta y$$
B
Modified Duration
When the yield y is expressed with compounding m times per year:
$$\Delta B = -\frac{B D\,\Delta y}{1 + y/m}$$
Modified duration (D*) is related to (Macaulay) duration (D) by the following:
$$D^* = \frac{D}{1 + y/m}$$
Such that the estimated change in bond price is a function of the modified duration:
$$\Delta B = -B D^* \Delta y$$
Dollar duration
Dollar duration (D**), also known as value duration, is the slope of the tangent line (a first
partial derivative):
$$\Delta B = -B D^* \Delta y$$
$$D^{**} = B D^*$$
$$\Delta B = -D^{**} \Delta y$$
Convexity
Convexity is the weighted average of the squared maturities of a bond's cash flows, where the weights are the present values of the cash flows expressed as proportions of the bond's price. Convexity can be expressed mathematically as:
\[ C = \frac{1}{B} \frac{d^2 B}{dy^2} = \frac{\sum_{i=1}^{n} c_i t_i^2 e^{-y t_i}}{B} \]
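The duration and convexity sums above share the same discounted cash flows, so they can be computed in one pass. This is a minimal sketch assuming a continuously compounded yield; the function name and the sample bond (a 3-year bond paying 5 every six months, yield 12%) are illustrative assumptions.

```python
import math


def bond_measures(cash_flows, y):
    """Return (price, Macaulay duration, convexity).

    cash_flows is a list of (t_i, c_i) pairs; y is the continuously
    compounded yield.
    """
    pv = [(t, c * math.exp(-y * t)) for t, c in cash_flows]
    price = sum(v for _, v in pv)
    duration = sum(t * v for t, v in pv) / price
    convexity = sum(t * t * v for t, v in pv) / price
    return price, duration, convexity


flows = [(0.5, 5), (1.0, 5), (1.5, 5), (2.0, 5), (2.5, 5), (3.0, 105)]
price, duration, convexity = bond_measures(flows, y=0.12)
dollar_duration = price * duration           # D** = B x D* (D* = D when y is continuous)
delta_b = -dollar_duration * 0.001           # estimated price change for a +10 bp move
print(round(price, 3), round(duration, 3))   # 94.213 2.653
```

With continuous compounding the modified and Macaulay durations coincide; for a yield compounded m times per year, divide the duration by (1 + y/m) before computing the dollar duration.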
Cost of Carry Model

For a non-dividend-paying investment asset (i.e., an asset which has no storage cost) the cost
of carry model says the futures price is given by:
\[ F_0 = S_0 e^{cT} \;\Rightarrow\; F_0 = S_0 e^{rT} \]
The equations for forward prices are essentially the same as those for futures prices. The generalized forward price (F0) in either case (futures or forwards) is therefore given by:
\[ F_0 = S_0 e^{rT} \]
If the asset provides interim cash flows (e.g., a stock that pays dividends), then let (I) equal the present value of the cash flows received and the cost-of-carry model is then given by:
\[ F_0 = (S_0 - I) e^{rT} \]
If the asset provides income (e.g., a stock that pays dividends), where the income can be expressed as a constant percentage of the spot price (given by q), then the model is given by:
\[ F_0 = S_0 e^{(r - q)T} \]
If the asset has a storage cost and produces a convenience yield (where the convenience yield is a constant percentage of the spot price, denoted by 'y'), the cost-of-carry model expands to:
\[ F_0 = S_0 e^{(r + u - y)T} \]
Where r is the risk-free rate, u is the storage cost as a constant percentage, and y is the convenience yield.
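The cost-of-carry variants differ only in what is added to or subtracted from the carry rate. The sketch below folds them into one hypothetical helper (`forward_price` is not from the source); combining the income PV and the yield terms in a single call is a convenience assumption, since the text presents them as separate cases.

```python
import math


def forward_price(s0, r, t, income_pv=0.0, q=0.0, u=0.0, y=0.0):
    """Cost-of-carry forward/futures price with continuous compounding.

    income_pv: present value of known interim cash flows (I)
    q: income yield, u: storage cost rate, y: convenience yield
    """
    return (s0 - income_pv) * math.exp((r - q + u - y) * t)


# No income: F0 = S0 * exp(rT)
print(round(forward_price(40.0, 0.05, 0.25), 2))          # 40.5
# Income yield q: F0 = S0 * exp((r - q)T)
print(round(forward_price(25.0, 0.10, 0.5, q=0.0396), 2))  # 25.77
```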
Value of a forward contract
The value of a forward contract (f) is given by either equation below:
\[ f = (F_0 - K) e^{-rT} \]
\[ f = S_0 e^{-qT} - K e^{-rT} \]
Interest rate parity: Discrete annual compounding

In discrete terms, the equality is given by:
\[ 1 + r^{D}_{domestic} = \frac{1}{S_t} \left[ 1 + r^{F}_{foreign} \right] F_t \]
\[ \frac{1 + r^{D}_{domestic}}{1 + r^{F}_{foreign}} = \frac{F_t}{S_t} \]
Where:
\(1 + r^{D}_{domestic}\) = 1 + the domestic interest rate at time t
\(S_t\) = spot exchange rate (domestic/foreign) at time t
\(1 + r^{F}_{foreign}\) = 1 + the foreign interest rate at time t
\(F_t\) = forward exchange rate at time t
Interest rate parity: continuous

Alternatively, interest rate parity (IRP) can be given in continuously compounded terms:
\[ F_0 = S_0 e^{(r - r_f)T} \]
Where r is the domestic interest rate and rf is the foreign interest rate.
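A short sketch of the continuous form of interest rate parity follows. The helper name `irp_forward` and the sample inputs (spot 0.6200, domestic rate 7%, foreign rate 5%, two years) are illustrative assumptions.

```python
import math


def irp_forward(spot, r_domestic, r_foreign, t):
    """Forward exchange rate (domestic per foreign) under continuous IRP."""
    return spot * math.exp((r_domestic - r_foreign) * t)


print(round(irp_forward(0.6200, 0.07, 0.05, 2.0), 4))  # 0.6453
```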
Convenience Yield
For a consumption asset—where (y) is the convenience yield and (c) is the cost of carry—the
futures price is given by:
\[ F_0 = S_0 e^{(c - y)T} \]
If a non-dividend-paying stock offered a “convenience yield” then its forward price calculation would mirror the above formula:
\[ F_0 = S_0 e^{(r - y)T} \]
Cost of carry with storage costs
The futures price for a commodity can be given by two formulae:
\[ F_0 = (S_0 + U) e^{rT} \]
Where U is the present value of storage costs
\[ F_0 = S_0 e^{(r + u - y)T} \]
Where
• u is the storage cost as a proportion of the spot price
• y is the convenience yield
Day Count
Day count conventions are important for computing accrued interest:
• Actual/actual: U.S. Treasury bonds
• 30/360: U.S. corporate and municipal bonds
• Actual/360: U.S. Treasury bills and other money market instruments
Calculate the cost of delivering a bond into a Treasury bond futures contract
The cost to deliver is the dirty price, which is the bond quoted price plus accrued interest (AI). The short position will receive the settlement multiplied by the conversion factor plus accrued interest (AI). The cheapest to deliver (CTD) is:
• The bond that minimizes: Quoted Bond Price - (Settlement)(CF), or equivalently
• The bond that maximizes: (Settlement)(CF) - Quoted Bond Price
Describe and compute the Eurodollar futures contract convexity adjustment

The convexity adjustment assumes continuous compounding. Let \(\sigma\) be the standard deviation of the change in the short-term interest rate in one year, \(t_1\) the time to maturity of the futures contract, and \(t_2\) the time to maturity of the rate underlying the futures contract. Then:
\[ \text{Forward} = \text{Futures} - \frac{1}{2} \sigma^2 t_1 t_2 \]
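The adjustment is a single product, sketched below. The function name and the sample inputs (sigma of 1.2%, an 8-year futures contract on a 3-month rate, a continuously compounded futures rate of 6.038%) are illustrative assumptions.

```python
def convexity_adjustment(sigma, t1, t2):
    """0.5 * sigma^2 * t1 * t2, subtracted from the futures rate."""
    return 0.5 * sigma ** 2 * t1 * t2


adjustment = convexity_adjustment(sigma=0.012, t1=8.0, t2=8.25)
forward_rate = 0.06038 - adjustment  # both rates continuously compounded
print(round(adjustment, 6))  # 0.004752
```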
Duration‐based hedge ratio

The number of contracts required to hedge against an uncertain change in the yield, given by \(\Delta y\), is given by:
\[ N^* = \frac{P D_P}{F_C D_F} \]
Note: \(F_C\) = contract price for the interest rate futures contract. \(D_F\) = duration of asset underlying futures contract at maturity. P = forward value of the portfolio being hedged at the maturity of the hedge (typically assumed to be today's portfolio value). \(D_P\) = duration of portfolio at maturity of the hedge
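The hedge ratio is a ratio of two dollar durations. The sketch below uses an illustrative portfolio (value $10 million, duration 6.80) and an illustrative futures contract (price $93,062.50, underlying duration 9.20); none of these numbers come from the source.

```python
def duration_hedge_contracts(p, d_p, f_c, d_f):
    """Number of futures contracts N* = (P * D_P) / (F_C * D_F)."""
    return (p * d_p) / (f_c * d_f)


n_star = duration_hedge_contracts(p=10_000_000, d_p=6.80, f_c=93_062.50, d_f=9.20)
print(round(n_star, 2))  # 79.42
```

In practice the result is rounded to a whole number of contracts, and the position is short when hedging a long bond portfolio.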
Duration of a Bond
The duration of a bond (D) is given by a formula that says “the percentage change of a bond's price (B) is a function of its duration (D) and the change in the yield:”
\[ \frac{dB}{B} = -D \, dy \]
\[ D = -\frac{dB}{B} \cdot \frac{1}{dy} \]
If we recast the same equation with deltas (finite changes), the percentage price change is approximately the negative of the duration multiplied by the change in yield:
\[ \frac{\Delta B}{B} = -D \Delta y \]
And solving this equation for the duration (D) gives us:
\[ D = -\frac{\Delta B}{B} \cdot \frac{1}{\Delta y} \]
Modified Duration of a Bond (D*)

\[ D^* = \frac{D}{1 + y/k} \]
where k = compound periods per year
Black-Scholes-Merton
\[ c = S_0 N(d_1) - K e^{-rT} N(d_2) \]
Identify, interpret and compute upper and lower bounds for option prices
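Upper bounds (X denotes the strike price, written K in the lower bounds):
\[ c \le S_0, \quad C \le S_0 \]
\[ p \le X, \quad P \le X \]
Lower bound for European CALL on non-dividend paying stock:
\[ c \ge \max(S_0 - K e^{-rT}, 0) \]
Lower bound for European PUT on non-dividend paying stock:
\[ p \ge \max(K e^{-rT} - S_0, 0) \]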
Put‐call parity
Put–call parity is based on a no-arbitrage argument; it can be shown that arbitrage opportunities
exist if put–call parity does not hold. Put–call parity is given by:
\[ c + K e^{-rT} = p + S_0 \]
\[ c = p - K e^{-rT} + S_0 \]
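Put-call parity lets either European price be recovered from the other. This is a minimal sketch for a non-dividend-paying stock; the function name and the sample inputs are illustrative assumptions.

```python
import math


def put_from_call(c, s0, k, r, t):
    """European put implied by put-call parity: p = c + K*exp(-rT) - S0."""
    return c + k * math.exp(-r * t) - s0


# Illustrative inputs: S0 = 31, K = 30, r = 10%, T = 3 months, call = 3.00
print(round(put_from_call(c=3.0, s0=31.0, k=30.0, r=0.10, t=0.25), 2))  # 1.26
```

If the quoted put differs from this value, one side of the parity relation is cheap relative to the other, which is the arbitrage the text refers to.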
Explain the early exercise features of American call and put options on a
non‐dividend‐paying stock and the price effect early exercise may have
The difference between an American call and an American put (C–P) is bounded by the following:
\[ S_0 - K \le C - P \le S_0 - K e^{-rT} \]
Discuss the effects dividends have on the put‐call parity, the bounds of put and call
option prices, and on the early exercise feature of American options
The ex-dividend date is specified when a dividend is declared. Investors who own shares of the
stock as of the ex-dividend date receive the dividend.
An American call option should never be exercised early in the absence of dividends. In the case of a dividend-paying stock, it would only be optimal to exercise immediately before the stock goes ex-dividend. Specifically, early exercise would remain sub-optimal if the following inequality applied:
\[ D_i \le K \left( 1 - e^{-r(t_{i+1} - t_i)} \right) \]
Derive the basic equilibrium formula for pricing commodity forwards and futures
The forward price is equal to the expected spot price in the future, but discounted to the present.
\[ F_{0,T} = E_0(S_T) e^{(r - \alpha)T} \]
Where:
\(E_0(S_T)\) = spot price of S at time T, as expected at time 0
\(F_{0,T}\) = forward price
r = risk-free rate
\(\alpha\) = discount rate for commodity S
And, as McDonald says, the forward price [F0] is a biased estimate of expected spot price
[E(St)], where the bias is due to the risk premium on the commodity (risk premium = α – r).
Explain the implication basic equilibrium has for different types of commodities
The forward price is a biased estimate of the expected spot price.
\[ e^{-rT} F_{0,T} = E_0(S_T) e^{-\alpha T} \]
For commodities on which forward prices are available, the forward price can be discounted; i.e., Forward Price * EXP[(-rate)(time)]. This gives the present value of the commodity received at future time (T).
The forward price when there is a storage (carry) cost is given by:
\[ F_{0,T} = S_0 e^{(r + \lambda)T} \]
Where:
\(\delta = \alpha - g\)
\(\delta\): lease rate
\(\alpha\): commodity discount rate
g: commodity growth rate
\(\lambda\): storage cost
Define the lease rate
If the lease rate is given by \(\delta\), then the forward price is given by:
\[ F_{0,T} = S_0 e^{(r - \delta)T} \]
The lease rate = commodity discount rate – growth rate:
\[ \delta = \alpha - g \]
The lease rate is economically like (~) a dividend yield.
Define carry markets

A commodity that is stored is in a carry market. Storage is carry. Storage permits consumption throughout the year. With storage cost \(\lambda\) and convenience yield c:
\[ F_{0,T} = S_0 e^{(r + \lambda - c)T} \]
Compare the lease rate with the convenience yield

If we are given the forward price, we only need to re-arrange the above formula to solve for the
implicit lease rate. We re-arrange as follows:
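\[ F_{0,T} = S_0 e^{(r - \delta)T} \;\Rightarrow\; \frac{F_{0,T}}{S_0} = e^{(r - \delta)T} \]
\[ \ln\left(\frac{F_{0,T}}{S_0}\right) = (r - \delta)T \;\Rightarrow\; \frac{1}{T} \ln\left(\frac{F_{0,T}}{S_0}\right) = r - \delta \]
\[ \delta = r - \frac{1}{T} \ln\left(\frac{F_{0,T}}{S_0}\right) \]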
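The rearrangement above can be checked numerically by pricing a forward with a known lease rate and then backing the rate out again. The function names and sample inputs are illustrative assumptions.

```python
import math


def forward_with_lease_rate(s0, r, delta, t):
    """F = S0 * exp((r - delta) * T)."""
    return s0 * math.exp((r - delta) * t)


def implied_lease_rate(f, s0, r, t):
    """delta = r - (1/T) * ln(F / S0)."""
    return r - math.log(f / s0) / t


f = forward_with_lease_rate(s0=100.0, r=0.05, delta=0.02, t=1.5)
print(round(implied_lease_rate(f, s0=100.0, r=0.05, t=1.5), 4))  # 0.02
```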
Define basis risk and the variance of the basis
\[ \text{Basis}_{t,T} = \text{Spot Price}_t - F_T(t) \]
\[ \sigma^2(S_t - F_T(t)) = \sigma^2(S_t) + \sigma^2(F_T(t)) - 2\rho \, \sigma(S_t) \, \sigma(F_T(t)) \]
This equation shows that basis risk is zero when
• The variances of the futures and spot prices are identical, and
• The correlation coefficient between spot and futures prices is equal to one.
Effectiveness of hedging a spot position with a futures contract
The classical measure of the effectiveness of hedging a spot position with futures contracts is given by:
\[ h = 1 - \frac{\sigma^2(\text{basis})}{\sigma^2(S_t)} \]
The nearer (h) is to one, the better (more perfect) the hedge.
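The basis variance and the hedge-effectiveness measure fit together as follows. This is a minimal sketch with hypothetical function names and illustrative volatilities.

```python
def basis_variance(sigma_spot, sigma_futures, rho):
    """Variance of (spot - futures) from the two volatilities and correlation."""
    return sigma_spot ** 2 + sigma_futures ** 2 - 2 * rho * sigma_spot * sigma_futures


def hedge_effectiveness(sigma_spot, sigma_futures, rho):
    """h = 1 - var(basis) / var(spot)."""
    return 1 - basis_variance(sigma_spot, sigma_futures, rho) / sigma_spot ** 2


# Equal volatilities and perfect correlation: no basis risk, h = 1.
print(hedge_effectiveness(0.20, 0.20, 1.0))
# Equal volatilities, correlation 0.9: h is about 0.8.
print(round(hedge_effectiveness(0.20, 0.20, 0.9), 4))
```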
Historical > standard deviation (simple parametric)

A moving average forecast requires a window of fixed length; e.g., 30 or 60 trading days. If we
observe returns (rt) over M days, this volatility estimate is constructed from a moving average
(MA):
\[ \sigma_t^2 = \frac{1}{M} \sum_{i=1}^{M} r_{t-i}^2 \]
The moving average (MA) series is simple but has two drawbacks:
• The MA series ignores the order of the observations. Older observations may no longer be relevant, but they receive the same weight.
• The MA series has a so-called ghosting feature: data points are dropped arbitrarily due to the length of the window.
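A minimal sketch of the moving-average estimate follows, assuming the mean return is zero (as the formula does) and that `returns` is ordered oldest to newest; the function name and data are illustrative.

```python
def moving_average_variance(returns, window):
    """Equal-weighted average of the last `window` squared returns."""
    recent = returns[-window:]
    return sum(r * r for r in recent) / window


daily_returns = [0.010, -0.020, 0.015, 0.005]
print(moving_average_variance(daily_returns, window=4))
```

Shortening the window drops the oldest observation entirely, which is the ghosting feature described above.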
GARCH (p, q) and in particular GARCH (1, 1)

GARCH (p, q) is the Generalized Autoregressive Conditional Heteroskedasticity model. There are three key aspects to the GARCH moniker:
• Autoregressive (AR): tomorrow's variance (or volatility) is a regressed function of today's variance; it regresses on itself
• Conditional (C): tomorrow's variance depends on (is conditional on) the most recent variance. An unconditional variance would not depend on today's variance
• Heteroskedastic (H): variances are not constant; they fluctuate over time
GARCH regresses on “lagged” or historical terms. The lagged terms are either variance or squared returns. The generic GARCH (p, q) model regresses on (p) squared returns and (q) variances. Therefore, GARCH (1, 1) “lags” or regresses on last period's squared return (i.e., just 1 return) and last period's variance (i.e., just 1 variance). GARCH (1, 1) is given by the following equation:
\[ \sigma_t^2 = a + b \, r_{t-1,t}^2 + c \, \sigma_{t-1}^2 \]
Hull writes the same GARCH equation as:
\[ \sigma_n^2 = \gamma V_L + \alpha u_{n-1}^2 + \beta \sigma_{n-1}^2 \]
The first term (\(\gamma V_L\)) is important because \(V_L\) is the long-run average variance. Therefore, (\(\gamma V_L\)) is a product: it is the weighted long-run average variance.
The same GARCH (1, 1) formula can be given with Greek parameters. The GARCH (1, 1) model solves for the conditional variance as a function of three variables (previous variance, previous return^2, and long-run variance):
\[ h_t = \alpha_0 + \alpha_1 r_{t-1}^2 + \beta h_{t-1} \]
Where:
\(h_t\) or \(\sigma_t^2\) = conditional variance (i.e., we're solving for it)
a or \(\alpha_0\) (Hull's \(\omega = \gamma V_L\)) = weighted long-run (average) variance
\(h_{t-1}\) or \(\sigma_{t-1}^2\) = previous variance
\(r_{t-1}^2\) or \(r_{t-1,t}^2\) = previous squared return
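One GARCH (1, 1) step is a weighted sum of three terms. The sketch below uses illustrative daily parameters (weighted long-run term 0.000002, weight 0.13 on the squared return, weight 0.86 on the previous variance); the function name is hypothetical.

```python
import math


def garch_update(omega, alpha, beta, prev_return, prev_variance):
    """sigma_n^2 = omega + alpha * u_{n-1}^2 + beta * sigma_{n-1}^2,
    where omega is the weighted long-run variance (gamma * V_L)."""
    return omega + alpha * prev_return ** 2 + beta * prev_variance


variance = garch_update(omega=0.000002, alpha=0.13, beta=0.86,
                        prev_return=0.01, prev_variance=0.016 ** 2)
print(round(math.sqrt(variance), 5))  # 0.01533, i.e. about 1.53% per day
```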
EWMA
EWMA is a special case of GARCH (1,1). Here is how we get from GARCH (1,1) to EWMA:
\[ \text{GARCH}(1,1): \quad \sigma_t^2 = a + b \, r_{t-1,t}^2 + c \, \sigma_{t-1}^2 \]
Then we let a = 0 and (b + c) = 1, such that the above equation simplifies to:
\[ \text{GARCH}(1,1): \quad \sigma_t^2 = b \, r_{t-1,t}^2 + (1 - b) \sigma_{t-1}^2 \]
This is now equivalent to the formula for exponentially weighted moving average (EWMA):
\[ \text{EWMA}: \quad \sigma_t^2 = b \, r_{t-1,t}^2 + (1 - b) \sigma_{t-1}^2 \]
\[ \sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) r_{t-1,t}^2 \]
In EWMA, the lambda parameter now determines the “decay:” a lambda that is close to one (high lambda) exhibits slow decay.
EWMA is a recursive solution to an infinite series

The exponentially weighted moving average (EWMA) is given by:
\[ \sigma_n^2 = \lambda \sigma_{n-1}^2 + (1 - \lambda) u_{n-1}^2 \]
The above formula is a recursive simplification of the “true” EWMA series which is given by:
\[ \sigma_n^2 = (1 - \lambda) \lambda^0 u_{n-1}^2 + (1 - \lambda) \lambda^1 u_{n-2}^2 + (1 - \lambda) \lambda^2 u_{n-3}^2 + \dots \]
In the EWMA series, each weight assigned to the squared returns is a constant ratio of the preceding weight. Specifically, lambda (\(\lambda\)) is the ratio between neighboring weights. In this way, older data is systematically discounted. The systematic discount can be gradual (slow) or abrupt, depending on lambda. If lambda is high (e.g., 0.99), then the discounting is very gradual. If lambda is low (e.g., 0.7), the discounting is more abrupt.
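The recursive and series forms of EWMA can be compared directly. In this sketch the recursion is seeded with zero variance so that both forms use exactly the same finite set of returns; the function names and data are illustrative assumptions.

```python
def ewma_update(prev_variance, prev_return, lam):
    """Recursive form: lam * sigma_{n-1}^2 + (1 - lam) * u_{n-1}^2."""
    return lam * prev_variance + (1 - lam) * prev_return ** 2


def ewma_series(returns_newest_first, lam):
    """Series form, truncated to the returns supplied."""
    return sum((1 - lam) * lam ** i * r ** 2
               for i, r in enumerate(returns_newest_first))


# One recursive step: lambda = 0.90, yesterday's vol 1%, yesterday's return 2%.
print(round(ewma_update(0.01 ** 2, 0.02, 0.90), 6))  # 0.00013

# Recursion from a zero seed reproduces the truncated series.
returns = [0.02, -0.01, 0.015]  # newest first
variance = 0.0
for r in reversed(returns):     # apply oldest return first
    variance = ewma_update(variance, r, 0.94)
print(abs(variance - ewma_series(returns, 0.94)) < 1e-15)  # True
```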
Explain how persistence is related to the reversion to the mean.

Given the GARCH (1, 1) equation:
\[ h_t = \alpha_0 + \alpha_1 r_{t-1}^2 + \beta h_{t-1} \]
Persistence is given by:
\[ \text{Persistence} = \alpha_1 + \beta \]
GARCH (1, 1) is unstable if the persistence > 1. A persistence of 1.0 indicates no mean reversion. A low persistence (e.g., 0.6) indicates rapid decay and high reversion to the mean. The average, unconditional variance in the GARCH (1, 1) model is given by:
\[ LV = \frac{\alpha_0}{1 - \alpha_1 - \beta} \]
Using GARCH (1, 1) to forecast volatility

The expected future variance rate, in (t) periods forward, is given by:
\[ E[\sigma_{n+t}^2] = V_L + (\alpha + \beta)^t (\sigma_n^2 - V_L) \]
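The long-run variance and the forecast formula can be sketched together, since the forecast decays toward the long-run level at the persistence rate. Function names and sample parameters are illustrative assumptions.

```python
def long_run_variance(omega, alpha, beta):
    """Unconditional variance: omega / (1 - alpha - beta)."""
    return omega / (1 - alpha - beta)


def garch_forecast(v_l, alpha, beta, current_variance, t):
    """E[sigma_{n+t}^2] = V_L + (alpha + beta)^t * (sigma_n^2 - V_L)."""
    return v_l + (alpha + beta) ** t * (current_variance - v_l)


print(round(long_run_variance(0.000002, 0.13, 0.86), 6))  # 0.0002

# Persistence 0.9868, long-run variance 0.0002075, current variance 0.0003:
forecast = garch_forecast(0.0002075, 0.0868, 0.9000, 0.0003, t=10)
print(round(forecast, 7))  # 0.0002885
```

As t grows, the second term shrinks toward zero and the forecast converges on the long-run variance, which is the mean reversion the persistence measure describes.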
Multivariate Density Estimation (MDE)

\[ \sigma_t^2 = \sum_{i=1}^{K} \omega(x_{t-i}) \, u_{t-i}^2 \]
Instead of weighting returns^2 by time, MDE weights them by proximity to the current state: \(\omega(\cdot)\) is the kernel function and \(x_{t-i}\) is the vector describing the economic state at time t-i.
Square root rule (i.e., variance is linear with time) only applies under restrictive i.i.d.
The simplest approach to extending the horizon is to use the “square root rule”:
\[ \sigma(r_{t,t+J}) = \sigma(r_{t,t+1}) \times \sqrt{J} \qquad \text{J-period VaR} = \sqrt{J} \times \text{1-period VaR} \]
For example, if the 1-period VaR is $10, then the 2-period VaR is $14.14 ($10 x square root of 2) and the 5-period VaR is $22.36 ($10 x square root of 5).
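The example above can be reproduced in a couple of lines. This sketch assumes i.i.d. returns, which is what makes the scaling valid; the function name is illustrative.

```python
import math


def scale_var(one_period_var, periods):
    """J-period VaR under the square root rule."""
    return one_period_var * math.sqrt(periods)


print(round(scale_var(10.0, 2), 2))  # 14.14
print(round(scale_var(10.0, 5), 2))  # 22.36
```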
The square-root rule: under the two assumptions below, VaR scales with the square root of time. Extend one-period VaR to J-period VaR by multiplying by the square root of J.
The square-root rule for extending the time horizon requires i.i.d. returns, which amounts to two assumptions:
• Random walk (acceptable)
• Constant volatility (unlikely)
Auto Regression, AR(1)

The auto-regression AR(1) model describes tomorrow's return as a function of (dependent on) today's return:
\[ X_{t+1} = a + b X_t + e_{t+1} \]
If the expected value of the error term is zero, then the expected value of \(X_{t+1}\) simplifies to the equation below, where the parameter (b) is called the “speed of reversion.” If (b = 1) then the formula is a random walk:
\[ E[X_{t+1}] = a + b X_t \]
Explain how to calculate VaR for linear derivatives.

By definition, the transmission parameter is constant. Therefore, in the case of a linear derivative,
VaR scales directly with the underlying risk factor.
\[ \text{VaR}_{\text{Linear Derivative}} = \Delta \times \text{VaR}_{\text{Underlying Risk Factor}} \]
Taylor Series Approximation
\[ f(x) \approx f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2 \]
Explain the full revaluation method for computing VaR.
Full revaluation considers values for a range of price levels. New values can be generated by:
• Historical simulation
• Bootstrap (simulation)
• Monte Carlo simulation
\[ dV = V(S_1) - V(S_0) \]
… Multifactor Models
Stock returns (as the dependent variable) are regressed against multiple factors. This is a multiple
regression where Iit are the external risk factors and the betas are the sensitivity (of each firm) to
the external risk factors:
\[ R_{it} = \alpha_{it} + \beta_{1i} I_{1t} + \beta_{2i} I_{2t} + \dots + \varepsilon_{it} \]

The risk factors are external to the firm; e.g., interest rates, GDP. Also, note the multi-factor
model cannot help model low-frequency, high-severity loss (LFHS) events.
… Income Based Models

These are also called Earning at Risk (EaR) models. Income or revenue (as the dependent
variable) is regressed against credit risk factor(s) and market risk factor(s). The residual, or
unexplained, volatility component is deemed to be the measure of operational risk.
Extract market & credit risk from historical income volatility
Residual volatility (volatility of ε) is the operational risk measure
\[ E_{it} = \alpha_{it} + \beta_{1t} C_{1t} + \beta_{2t} M_{2t} + \varepsilon_{it} \]
\(C_1\) = credit risk
\(M_2\) = market risk
\(\varepsilon_{it}\) = residual. Volatility of the residual is operational risk
Binomial
We need the following notation:
f = price/value of option
\(S_0\) = stock price
\(\Delta\) = number of shares of stock
u = proportional "up" jump (u > 1)
d = proportional "down" jump (d < 1)
\(f_u\) = option payoff if stock jumps up
\(f_d\) = option payoff if stock jumps down
The probability of an “up jump” (or up movement) is denoted by (p) and given by:
\[ p = \frac{e^{rT} - d}{u - d} \]
This probability (p) then plugs into the equation that solves for the option price:
\[ f = e^{-rT} [p f_u + (1 - p) f_d] \]
Up movement (u) and down movement (d)
\[ u = e^{\sigma \sqrt{\Delta t}} \quad \text{and} \quad d = e^{-\sigma \sqrt{\Delta t}} \]
Probability of up (p):
\[ p = \frac{e^{r \Delta t} - d}{u - d} \]
If u = 1.15, then the stock moves up +15% to S0*u.
\[ f = e^{-r \Delta t} [p f_u + (1 - p) f_d] \]
where p = probability of an up movement, so (1-p) = probability of a down movement
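A one-step binomial valuation takes only a few lines. The sketch below uses hypothetical helper names; the optional `q` argument (an income yield that lowers the risk-neutral drift) and the sample inputs are illustrative assumptions.

```python
import math


def crr_factors(sigma, dt):
    """Up and down jump sizes from volatility: u = exp(sigma*sqrt(dt)), d = 1/u."""
    u = math.exp(sigma * math.sqrt(dt))
    return u, 1.0 / u


def one_step_value(f_up, f_down, u, d, r, dt, q=0.0):
    """Return (option value, risk-neutral up probability) for one step."""
    p = (math.exp((r - q) * dt) - d) / (u - d)
    value = math.exp(-r * dt) * (p * f_up + (1 - p) * f_down)
    return value, p


# Illustrative call: S0 = 20, K = 21, u = 1.1, d = 0.9, r = 12%, T = 3 months.
# Payoffs: max(22 - 21, 0) = 1 in the up state, 0 in the down state.
value, p = one_step_value(f_up=1.0, f_down=0.0, u=1.1, d=0.9, r=0.12, dt=0.25)
print(round(p, 4), round(value, 3))  # 0.6523 0.633
```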
Describe the impact dividends have on the binomial model

If the stock pays a known dividend yield at rate (q), the probability (p) of an up movement is
adjusted:
\[ p = \frac{e^{(r - q) \Delta t} - d}{u - d} \]
Options on Currencies
Analogous to the adaptation of the cost of carry model to foreign exchange forwards, if (rf) is the
foreign risk-free rate, we can use:
\[ p_{\text{option on currency}} = \frac{e^{(r - r_f) \Delta t} - d}{u - d} \]
Options on Futures
Since it costs nothing to take a long or short position in a futures contract, in a risk-neutral world
the futures price has an expected growth rate of zero. In this case, we can use:
\[ p_{\text{futures}} = \frac{1 - d}{u - d} \]
Lognormal property of stock prices

Under GBM (driven by a Wiener process), periodic returns are normally distributed. Here \(\phi(m, s)\) denotes a normal distribution with mean m and standard deviation s:
\[ \frac{\Delta S}{S} \sim \phi\left(\mu \Delta t, \; \sigma \sqrt{\Delta t}\right) \]
Price levels are log-normally distributed:
\[ \ln S_T \sim \phi\left[ \ln S_0 + \left(\mu - \frac{\sigma^2}{2}\right) T, \; \sigma \sqrt{T} \right] \]
An Ito process is a generalized Wiener process (a stochastic process) where the change in the variable during a short interval is normally distributed. The mean and variance of the distribution are proportional to \(\Delta t\). In an Ito process, the parameters are a function of the variables x and t.
\[ \ln \frac{S_T}{S_0} \sim \phi\left[ \left(\mu - \frac{\sigma^2}{2}\right) T, \; \sigma \sqrt{T} \right] \]
which is equivalent to the distribution of \(\ln S_T\) given above.
Let \(S_T\) equal the stock price at future time T. The expected value and variance of \(S_T\) are given by:
\[ E(S_T) = S_0 e^{\mu T} \]
\[ \text{var}(S_T) = S_0^2 e^{2 \mu T} \left( e^{\sigma^2 T} - 1 \right) \]
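The two moments of the lognormal price can be evaluated directly. Function names and the sample inputs (price 20, expected return 20%, volatility 40%, one year) are illustrative assumptions.

```python
import math


def lognormal_mean(s0, mu, t):
    """E(S_T) = S0 * exp(mu * T)."""
    return s0 * math.exp(mu * t)


def lognormal_variance(s0, mu, sigma, t):
    """var(S_T) = S0^2 * exp(2*mu*T) * (exp(sigma^2 * T) - 1)."""
    return s0 ** 2 * math.exp(2 * mu * t) * (math.exp(sigma ** 2 * t) - 1)


print(round(lognormal_mean(20.0, 0.20, 1.0), 2))            # 24.43
print(round(lognormal_variance(20.0, 0.20, 0.40, 1.0), 2))  # 103.54
```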
Distribution of the Rate of Return
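The continuously compounded rate of return per annum (\(\eta\)) is normally distributed. The distribution of this rate of return is given by the following, where the second argument is the standard deviation:
\[ \eta \sim \phi\left( \mu - \frac{\sigma^2}{2}, \; \frac{\sigma}{\sqrt{T}} \right) \]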
The Expected Return: Arithmetic vs. Geometric

The phrase “expected return” has two common meanings: arithmetic and geometric.
E( Arithmetic )   2
E(Geometric )   
2
The continuously compounded return realized over T years is given by:
1 S
 ln( T )
T S0
Compute the realized return and historical volatility of a stock

Start with the variable (\(u_i\)) which is the natural log of the ratio between a stock price at time (i) and the previous stock price at time (i-1):
\[ u_i = \ln\left(\frac{S_i}{S_{i-1}}\right) \]
An unbiased estimate of the variance is given by:
\[ \sigma_n^2 = \frac{1}{m - 1} \sum_{i=1}^{m} (u_{n-i} - \bar{u})^2 \]
Important: the equation above is the variance. The volatility is the standard deviation and, therefore, is given by:
\[ \sigma_n = \sqrt{\sigma_n^2} = \sqrt{\frac{1}{m - 1} \sum_{i=1}^{m} (u_{n-i} - \bar{u})^2} \]
For purposes of calculating VaR (and often for volatility calculations in general) a few simplifying assumptions are applied to this volatility formula. Specifically:
• Instead of the natural log of the ratio [Si/Si-1], we can substitute a simple percentage change in price: %ΔS = [(Si-Si-1)/Si-1]
• Assume the average price change is zero
• Replace the denominator (m-1) with (m)
With these three simplifications, an alternative volatility calculation is based on the following simplified variance:
\[ \sigma_n^2 = \frac{1}{m} \sum_{i=1}^{m} u_{n-i}^2 \]
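The unbiased and simplified volatility estimates differ only in the mean adjustment and the divisor. A minimal sketch, with illustrative function names and data:

```python
import math


def log_returns(prices):
    """u_i = ln(S_i / S_{i-1}) for consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def sample_volatility(returns):
    """Unbiased estimate: deviations from the mean, divisor (m - 1)."""
    m = len(returns)
    mean = sum(returns) / m
    return math.sqrt(sum((u - mean) ** 2 for u in returns) / (m - 1))


def simplified_volatility(returns):
    """Simplified estimate: zero mean assumed, divisor m."""
    return math.sqrt(sum(u * u for u in returns) / len(returns))


returns = [0.01, -0.01]
print(round(sample_volatility(returns), 6))      # 0.014142
print(round(simplified_volatility(returns), 6))  # 0.01
```

The gap between the two estimates narrows as the number of observations grows and as the sample mean approaches zero.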
The stock price process is described by the following formula:
\[ dS = \mu S \, dt + \sigma S \, dz \;\Rightarrow\; \frac{dS}{S} = \mu \, dt + \sigma \, dz \]
The Black–Scholes–Merton Differential Equation is given by:
\[ \frac{\partial f}{\partial t} + rS \frac{\partial f}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 f}{\partial S^2} = rf \]
European option using Black‐Scholes‐Merton model on a non‐dividend‐paying stock

The Black–Scholes model gives the following values for a call (c) and a put (p) in the case of a
European option:
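\[ c = S_0 N(d_1) - K e^{-rT} N(d_2) \qquad p = K e^{-rT} N(-d_2) - S_0 N(-d_1) \]
Where d1 and d2 are given by:
\[ d_1 = \frac{\ln(S_0 / K) + (r + \sigma^2 / 2) T}{\sigma \sqrt{T}} \qquad d_2 = \frac{\ln(S_0 / K) + (r - \sigma^2 / 2) T}{\sigma \sqrt{T}} = d_1 - \sigma \sqrt{T} \]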
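The Black–Scholes–Merton formulas for a European call and put can be sketched with only the standard library, using the error function for the normal CDF. The function names, the optional dividend-yield argument `q` (zero reproduces the non-dividend formulas), and the sample inputs are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_price(s0, k, r, sigma, t, q=0.0):
    """Return (call, put) prices for a European option."""
    d1 = (math.log(s0 / k) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    call = s0 * math.exp(-q * t) * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)
    put = k * math.exp(-r * t) * norm_cdf(-d2) - s0 * math.exp(-q * t) * norm_cdf(-d1)
    return call, put


# Illustrative inputs: S0 = 42, K = 40, r = 10%, sigma = 20%, T = 6 months.
call, put = bsm_price(42.0, 40.0, 0.10, 0.20, 0.5)
print(round(call, 2), round(put, 2))  # 4.76 0.81
```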
European option using the Black‐Scholes‐Merton model on a dividend‐paying stock
A European option on a dividend-paying stock can be analyzed as the sum of two components:
• A riskless component = known dividends during the life of the option, plus
• A risk component
A dividend yield effectively reduces the stock price (the option holder forgoes dividends). Operationally, this amounts to reducing the stock price by the present value of all the dividends during the life of the option. If (q) represents the annual continuous dividend yield on a stock (or stock index), the adjusted Black-Scholes-Merton for a European call option is given by:
\[ c = S_0 e^{-qT} N(d_1^*) - K e^{-rT} N(d_2^*) \]
\[ d_1^* = \frac{\ln(S_0 / K) + (r - q + \sigma^2 / 2) T}{\sigma \sqrt{T}} \qquad d_2^* = d_1^* - \sigma \sqrt{T} \]
Identify the complications involving the valuation of warrants

Assume that VT equals the value of the company’s equity and N equals the number of
outstanding shares. Further, assume that a company will issue (M) number of warrants with a
strike price equal to K. ST equals the stock price at time T. The (adjusted) stock price, after we
account for the dilution effect of the issued warrants, is:
\[ S_{\text{adjusted}} = \frac{V_T + MK}{N + M} \]
The Black–Scholes can be used to value a warrant; however, three adjustments are required:
• The stock price (S0) is replaced by an “adjusted” stock price
• The volatility input is calculated on equity (i.e., common equity plus warrants) not stock price
• The calculation is reduced by a multiplier. The multiplier captures dilution and is also called a “haircut.” The haircut is given by:
\[ \frac{N}{N + M} \]
\[ \text{Value}_{\text{Warrant}} = \text{Value}_{\text{Option}} \times \frac{N}{N + M}, \quad N = \text{shares and } M = \text{warrants} \]
Define delta hedging for an option, forward, and futures contracts
Delta is the rate of change of the option price with respect to the price of the underlying asset:
\[ \text{delta} = \Delta = \frac{\partial c}{\partial S} \]
where \(\partial c\) = change in the price of the call option and \(\partial S\) = change in the stock price
Delta of European Stock Options

\(\Delta = N(d_1)\): European call on non-dividend stock
\(\Delta = e^{-qT} N(d_1)\): European call on dividend stock
\(\Delta = N(d_1) - 1\): European put on non-dividend stock
\(\Delta = e^{-qT} [N(d_1) - 1]\): European put on dividend stock
Delta of Forward Contracts

The delta of a forward contract on one share of stock is 1.0.
If the stock pays a dividend, the delta = EXP(-qT).
Delta of Futures Contract

The delta of a futures contract is \(e^{rT}\).
If the asset pays a dividend, the delta = EXP[(r-q)*T]
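The European option deltas above depend only on N(d1) and the dividend yield. This is a minimal sketch with hypothetical function names; the sample inputs are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def option_deltas(s0, k, r, sigma, t, q=0.0):
    """Return (call delta, put delta) for European options on a stock
    paying a continuous dividend yield q (q = 0 for no dividends)."""
    d1 = (math.log(s0 / k) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    call_delta = math.exp(-q * t) * norm_cdf(d1)
    put_delta = math.exp(-q * t) * (norm_cdf(d1) - 1.0)
    return call_delta, put_delta


call_delta, put_delta = option_deltas(42.0, 40.0, 0.10, 0.20, 0.5)
print(round(call_delta - put_delta, 6))  # 1.0 when q = 0
```

The difference between the call and put deltas equals exp(-qT), which matches the forward-contract delta stated above.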
Gamma
Gamma is the rate of change of the portfolio's delta with respect to the underlying asset; it is therefore a second partial derivative of the portfolio:
\[ \text{Gamma} = \Gamma = \frac{\partial^2 \Pi}{\partial S^2} \]
i.e., the second partial derivative of the portfolio (e.g., call) value with respect to the stock price
Vega
Vega is the rate of change of the value of a portfolio (of derivatives) with respect to the volatility of the underlying asset:
\[ \text{Vega} = \frac{\partial \Pi}{\partial \sigma} \]
Rho
Rho is the rate of change of the value of a portfolio (of derivatives) with respect to the interest rate (or, as in Black–Scholes, the risk-free interest rate):
\[ \text{Rho} = \frac{\partial \Pi}{\partial r} \]
Relationship between delta, theta, and gamma

The risk-free rate multiplied by the portfolio (i.e., a fractional share of the portfolio) is directly
related to a linear function of theta, delta and gamma:
\[ r \Pi = \Theta + r S \Delta + \frac{1}{2} \sigma^2 S^2 \Gamma \]
Where:
r = risk-free interest rate
\(\Pi\) = value of the portfolio
\(\Theta\) = option theta
S = stock price
\(\Delta\) = option delta
\(\sigma^2\) = variance of underlying stock
\(\Gamma\) = option gamma
If theta is large and positive then gamma tends to be large and negative. Delta is zero by definition in a “delta-neutral” portfolio, in which case the formula simplifies to:
\[ r \Pi = \Theta + \frac{1}{2} \sigma^2 S^2 \Gamma \]
Discount factor and discount function

The discount factor, d(t), for a term of (t) years, gives the present value of one unit of currency
($1) to be received at the end of that term.
If d(.5) = .97557, the present value of $1 to be received in six months is 97.557 cents.
Assume A pays $105 in six months. Given the same discount factor of 0.97557, $105 to be received in six months is worth .97557 x $105 = $102.43.
Equivalently, $1 invested today grows to:
\[ \frac{\$1}{d(.5)} = \frac{\$1}{0.97557} = \$1.025 \text{ in six months} \]
The discount function is simply the series of discount factors that correspond to a series of times
to maturity (t). For example, a discount function is the series of discount factors: d(0.5), d(1.0),
d(1.5), d(2.0).
Impact of different compounding frequencies on a bond’s value.
Investing (x) at an annual rate of (r) compounded semiannually for (T) years produces a terminal
wealth (w) of:
\[ w = x \left( 1 + \frac{r}{2} \right)^{2T} \]
Discount factor
Let d(t) equal the discounted value of one unit of currency. Assuming the one unit of currency is discounted for (t) years at the semiannually compounded rate \(\hat{r}(t)\), then the discount factor d(t) is given by:
\[ d(t) = \frac{1}{\left( 1 + \frac{\hat{r}(t)}{2} \right)^{2t}} \]
The relationship between continuous compounding and discrete compounding (semi-annual
compounding is discrete compounding where the number of periods per year is equal to 2) is
given by:
\[ A e^{R_c n} = A \left( 1 + \frac{R_m}{m} \right)^{mn} \]
The continuous rate of return as a function of the discrete rate of return (where m is the number of periods per year) is given by:
\[ R_c = m \ln\left( 1 + \frac{R_m}{m} \right) \]
The discrete rate of return as a function of the continuous rate of return is given by:
\[ R_m = m \left( e^{R_c / m} - 1 \right) \]
Compute semi‐annual compounded rate of return for a C‐Strip

If the price of one unit of currency maturing in t years is given by d(t), the semiannual compounded return is given by:
\[ \hat{r}(t) = 2 \left[ \left( \frac{1}{d(t)} \right)^{\frac{1}{2t}} - 1 \right] \]
Derive spot rates from discount factors.
Given a t-period discount factor d(t), the semiannual compounded return is given by:
\[ \hat{r}(t) = 2 \left[ \left( \frac{1}{d(t)} \right)^{\frac{1}{2t}} - 1 \right] \]
The relationship between spot rates and maturity/term is called the term structure of spot
rates. When spot rates increase with maturity, the term structure is said to be upward-sloping.
When spot rates decrease with maturity, the term structure is said to be downward-sloping or
inverted.
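Converting between discount factors and semiannually compounded spot rates is a pair of inverse functions. The sketch below uses hypothetical function names and the six-month discount factor of 0.97557 quoted earlier in the text.

```python
def spot_rate_from_discount(d, t):
    """Semiannually compounded spot rate implied by discount factor d(t)."""
    return 2.0 * ((1.0 / d) ** (1.0 / (2.0 * t)) - 1.0)


def discount_from_spot_rate(rate, t):
    """Discount factor implied by a semiannually compounded spot rate."""
    return 1.0 / (1.0 + rate / 2.0) ** (2.0 * t)


rate = spot_rate_from_discount(0.97557, 0.5)
print(round(rate, 4))  # 0.0501
```

Applying the first function to each point of a discount function gives the term structure of spot rates.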
Define, interpret, and apply a bond’s yield‐to‐maturity (YTM) to bond pricing.

Yield-to-maturity (YTM), sometimes just yield, is the single rate that, when used to discount a bond's cash flows, produces the bond's market price. Given an annual coupon of c (and therefore a semi-annual coupon of c/2), a final principal payment of F, and a market price of P(T) with T years to maturity, the yield to maturity (y) solves the following equation:
\[ P(T) = \sum_{t=1}^{2T} \frac{c/2}{\left(1 + \frac{y}{2}\right)^t} + \frac{F}{\left(1 + \frac{y}{2}\right)^{2T}} \]
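The yield equation has no closed-form solution in general, so it is usually solved numerically. The sketch below uses simple bisection, which is one possible root-finder and not necessarily the one the source has in mind; the function names and the assumption that the yield lies between 0% and 100% are illustrative.

```python
def bond_price(coupon, face, years, y):
    """Price with semiannual coupons of coupon/2 and semiannual yield y/2."""
    n = int(round(2 * years))
    pv_coupons = sum((coupon / 2) / (1 + y / 2) ** t for t in range(1, n + 1))
    return pv_coupons + face / (1 + y / 2) ** n


def yield_to_maturity(price, coupon, face, years, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on y: price falls as yield rises, so move the lower
    bound up whenever the model price is still above the market price."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bond_price(coupon, face, years, mid) > price:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# A bond priced at par has a yield equal to its coupon rate.
print(round(yield_to_maturity(100.0, coupon=6.0, face=100.0, years=5), 6))  # 0.06
```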
Price of an annuity and a perpetuity

An annuity with semiannual payments is a security that makes a payment c/2 every six months for T years but never makes a final “principal” payment (i.e., FV=0). The price of an annuity, A(T), is given by:
\[ A(T) = \frac{c}{y} \left[ 1 - \left( \frac{1}{1 + \frac{y}{2}} \right)^{2T} \right] \]
A perpetuity bond is a bond that pays coupons forever. The price of a perpetuity is simply the
coupon divided by the yield (i.e., the price of a perpetuity = c/y).
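The annuity formula can be checked against a direct sum of discounted payments, and the perpetuity is its limit as T grows. Function names and inputs are illustrative assumptions.

```python
def annuity_price(c, y, years):
    """A(T) = (c / y) * (1 - (1 / (1 + y/2)) ** (2T))."""
    return (c / y) * (1.0 - (1.0 / (1.0 + y / 2.0)) ** (2 * years))


def perpetuity_price(c, y):
    """Price of a perpetuity paying c per year: c / y."""
    return c / y


closed_form = annuity_price(c=6.0, y=0.05, years=10)
direct_sum = sum(3.0 / (1.0 + 0.025) ** t for t in range(1, 21))
print(abs(closed_form - direct_sum) < 1e-9)  # True
```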
Calculate the price of an annuity.
Annuity: makes semiannual payments of c/2 every six months for T years but never makes a final payment. Price is given by:
\[ A(T) = \frac{c}{y} \left[ 1 - \left( \frac{1}{1 + y/2} \right)^{2T} \right] \]
One-factor measures of sensitivity

DV01 = dollar value of an ’01
a.k.a., PV01, price value of an ’01
Gives the dollar value change of a fixed income security for a one-basis point decline in rates.
Modified duration
Percentage change in value of security for a one unit change (10,000 basis points)
Key relationship:
\[ DV01 = \frac{P \times D_{Mod}}{10{,}000} \]
DV01
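DV01 is an acronym for “dollar value of an 01” (.01%). DV01 gives the change in the value of a fixed income security for a one-basis point decline:
\[ DV01 = -\frac{\Delta P}{10{,}000 \times \Delta y} \]
Importantly, the DV01 is related to modified duration (and to Macaulay duration once it is divided by 1 + y/k, where k is the number of compounding periods per year):
\[ DV01 = \frac{D_{Mod} \times \text{Price}}{10{,}000} = \frac{\left[ D_{Macaulay} / (1 + y/k) \right] \times \text{Price}}{10{,}000} \]
\[ D_{Mod} = \frac{DV01}{\text{Price}} \times 10{,}000 \]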
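The DV01 and modified duration relationship is a pair of one-line conversions. Function names and the sample bond (price 100, modified duration 5) are illustrative assumptions.

```python
def dv01_from_duration(price, modified_duration):
    """DV01 = price * modified duration / 10,000."""
    return price * modified_duration / 10_000


def duration_from_dv01(price, dv01):
    """Modified duration = DV01 / price * 10,000."""
    return dv01 / price * 10_000


dv01 = dv01_from_duration(price=100.0, modified_duration=5.0)
print(dv01)  # 0.05
```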
Duration
Duration (D) is given by:
\[ D = -\frac{1}{P} \frac{\Delta P}{\Delta y} \]
If we multiply both sides of the equation by \(\Delta y\), then we get the following key equation:
\[ \frac{\Delta P}{P} = -D \Delta y \]
The above equation says: the percentage change in the price is equal to the modified duration multiplied by the change in the rate (the minus sign indicates they move in opposite directions; i.e., a positive yield change corresponds to a negative price change).
Duration can be calculated with the following formula:
\[ \text{Duration} = \frac{\text{price if yields decline} - \text{price if yields rise}}{2 \times (\text{initial price}) \times (\text{yield change in decimal})} \]
\[ D = \frac{V_- - V_+}{2 (V_0)(\Delta y)} \]
Convexity
Convexity also measures interest rate sensitivity. Mathematically, convexity is given by the
formula below where the term (d2P/dy2) is the second derivative of the price-rate function:
\[ C = \frac{1}{P} \frac{d^2 P}{dy^2} \]
The common convexity formula is given by:
\[ \text{convexity measure} = \frac{V_+ + V_- - 2 V_0}{V_0 (\Delta y)^2} \]
Where:
• V0 is the initial price of the bond
• V+ is the price of the bond if yields increase by Δy
• V- is the price of the bond if yields decrease by Δy
• Δy is a change in the yield (in decimal terms)
Applying the Convexity Measure
In order to estimate the percentage price change due to a bond's convexity (i.e., the percentage price change not explained by duration), the convexity measure must be “translated” into a convexity adjustment:
\[ \text{Convexity adjustment} = \frac{1}{2} \times \text{convexity measure} \times (\Delta y)^2 \]
The (1/2) in the formula above is called the “scaling factor.”
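Duration, the convexity measure, and the convexity adjustment can all be estimated from three prices: the initial price and the prices after bumping the yield up and down. The sketch below assumes a hypothetical pricing function (a 5-year zero-coupon bond with a continuously compounded yield) so that the exact answers, duration 5 and convexity 25, are known.

```python
import math


def price(y):
    """Hypothetical bond: 100 face, 5-year zero, continuously compounded yield."""
    return 100.0 * math.exp(-5.0 * y)


def effective_duration(v_minus, v_plus, v0, dy):
    """(V- - V+) / (2 * V0 * dy)."""
    return (v_minus - v_plus) / (2.0 * v0 * dy)


def convexity_measure(v_minus, v_plus, v0, dy):
    """(V+ + V- - 2*V0) / (V0 * dy^2)."""
    return (v_plus + v_minus - 2.0 * v0) / (v0 * dy ** 2)


y0, bump = 0.05, 0.001
v0, v_up, v_down = price(y0), price(y0 + bump), price(y0 - bump)
duration = effective_duration(v_down, v_up, v0, bump)
convexity = convexity_measure(v_down, v_up, v0, bump)

# Estimated percentage price change for a +100 bp move in yield.
dy = 0.01
estimate = -duration * dy + 0.5 * convexity * dy ** 2
actual = price(y0 + dy) / v0 - 1.0
print(round(duration, 2), round(convexity, 1))  # 5.0 25.0
```

Adding the convexity adjustment brings the estimate closer to the actual percentage change than duration alone.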
Debt service ratio

Exports generate dollars and hard currencies; the greater the exports, the easier it is to service
debt
\[ DSR = \frac{\text{interest + amortization on debt}}{\text{exports}} \]
↑ DSR → ↑ Likely to reschedule
Debt Service Ratio (DSR) has a positive (+) relationship to the likelihood of debt rescheduling: exports are an LDC's primary way of generating hard currencies. Larger debt repayments (i.e., in relation to export revenues) imply a greater probability that the country will need to reschedule.
Import ratio
To pay for imports, LDC must run down its stock of hard currencies
\[ \text{Import Ratio (IR)} = \frac{\text{total imports}}{\text{total foreign exchange reserves}} \]
↑ IR → ↑ Likely to reschedule
Import Ratio (IR) has a positive (+) relationship to the likelihood of debt rescheduling: To pay
for imports, the LDC must run down its stock of hard currencies. The greater the need for
imports, the quicker a country can be expected to deplete its foreign exchange reserves.
Investment ratio
Higher investment implies greater future productivity (i.e., negative relationship: less likely to
reschedule) but also greater bargaining power with creditors (positive relationship)
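\[ \text{Investment Ratio (INVR)} = \frac{\text{real investment}}{\text{gross national product}} \]
Both views:
↑ INVR → ↓ Likely to reschedule (greater future productivity)
↑ INVR → ↑ Likely to reschedule (greater bargaining power with creditors)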
Variance of export ratio

LDC export revenue variability impacted by (i) Quantity risk [how much sold?] and (ii) Price risk
[exchange rate]
\[ VAREX = \sigma_{ER}^2 \]
↑ VAREX → ↑ Likely to reschedule
Variance of Export Revenue (VAREX) has a positive relationship (+) to the likelihood of debt rescheduling.
Domestic money supply growth

Faster growth in money supply → higher domestic inflation rate → weaker currency
\[ \text{Domestic Money Supply Growth (MG)} = \frac{\Delta \text{money supply}}{\text{money supply}} \]
↑ MG → ↑ Likely to reschedule
Domestic Money Supply Growth (MG) has a positive relationship (+) to likelihood of debt
rescheduling: a higher rate of growth in domestic money supply should cause a higher domestic
inflation rate and, consequently, a weaker currency.
Define, calculate and interpret the expected loss for an individual credit instrument.
Expected loss = Assured payment at maturity time T x Loss Given Default (LGD) x Probability
that default occurs before maturity T (PD)
However, “Assured payment at maturity time T” should be replaced with “Exposure.” Therefore, the key formula is given by:
Expected loss = Exposure (at default, EAD) x Loss Given Default (LGD) x Probability of default (PD)
Or, equivalently and ultimately:
Expected loss = Exposure at default (EAD) x Loss Given Default (LGD) x Expected Default Frequency (EDF)
\[ EL = AE \times LGD \times EDF \]
\[ EL = AE \times LGD \times PD \]
Define exposures, adjusted exposures, commitments, covenants, and outstandings.
Assume
• Value of bank asset = V
• Outstandings = OS
• Commitments = COM
Then V = OS + COM
Outstandings: generic term referring to the portion of the bank asset which has already been
extended to the borrowers and also to other receivables in the form of contractual payments
which are due from customers. Examples of outstandings include term loans, credit cards, and
receivables.
Commitments: An amount the bank has committed to lend, at the borrower’s request, up to the
full amount of the commitment. An example of a commitment is a line of credit (LOC). A
commitment consists of two portions:
• Drawn, and
• Undrawn
The drawn portion of the commitment should be treated as part of outstandings (i.e., the amount
currently borrowed).
Define, calculate and interpret the unexpected loss of an asset.
Unexpected loss (UL) = Standard Deviation of unconditional value of the asset at horizon.
Unexpected loss (UL) is given by:
$UL = AE \times \sqrt{EDF \times \sigma^2_{LGD} + LGD^2 \times \sigma^2_{EDF}}$
Where the variance of the default frequency (EDF) is given by:
$\sigma^2_{EDF} = EDF \times (1 - EDF)$
Note: the variance of loss given default (LGD), unlike the variance of EDF, is non-trivial.
Expected loss (EL) is the average loss the bank can expect (to lose on its asset) over the specified
horizon; unexpected loss (UL) measures the volatility (standard deviation) around that average.
The standard deviation of EDF = SQRT[(EDF)(1-EDF)]
The standard deviation of LGD is given as an input (not solved, being non-trivial)
Unexpected loss = AE x SQRT[(EDF)(variance of LGD) + (LGD^2)(variance of EDF)]
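The unexpected loss calculation can be sketched as follows. The function names and the input figures are hypothetical; the standard deviation of LGD is supplied as an input, as noted above, and the variance of EDF is taken as EDF x (1 - EDF).

```python
import math


def edf_variance(edf):
    """Variance of the default frequency: EDF x (1 - EDF)."""
    return edf * (1.0 - edf)


def unexpected_loss(adjusted_exposure, lgd, edf, sigma_lgd):
    """UL = AE x sqrt(EDF x var(LGD) + LGD^2 x var(EDF)).

    sigma_lgd is the standard deviation of LGD, given as an input.
    """
    var_lgd = sigma_lgd ** 2
    var_edf = edf_variance(edf)
    return adjusted_exposure * math.sqrt(edf * var_lgd + lgd ** 2 * var_edf)


# Hypothetical loan: AE = 1,000,000, LGD = 50%, EDF = 2%, sigma(LGD) = 25%.
# Under the square root: 0.02 x 0.0625 + 0.25 x 0.0196 = 0.00615,
# so UL is roughly 78,422.
print(round(unexpected_loss(1_000_000, 0.50, 0.02, 0.25)))
```

With these inputs the expected loss is roughly 10,000, so the unexpected loss is several times larger than the expected loss, which is typical when the default frequency is low.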