Post-Crisis Quant Finance

Edited by Mauro Cesa

Published by Risk Books, a Division of Incisive Media Investments Ltd

Incisive Media
32–34 Broadwick Street
London W1A 2HG
Tel: +44(0) 20 7316 9000
E-mail: books@incisivemedia.com
Sites: www.riskbooks.com
www.incisivemedia.com

© 2013 Incisive Media

ISBN 978 1 782720 07 2

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Publisher: Nick Carver
Commissioning Editor: Sarah Hastings
Editorial Development: Amy Jordan
Managing Editor: Lewis O’Sullivan

Designer: Lisa Ling
Copy-edited by Laurie Donaldson

Typeset by Mark Heslington Ltd, Scarborough, North Yorkshire
Printed and bound in the UK by Berforts Group Ltd

Conditions of sale
All rights reserved. No part of this publication may be reproduced in any material form whether
by photocopying or storing in any medium by electronic means whether or not transiently or
incidentally to some other use for this publication without the prior written consent of the copy-
right owner except in accordance with the provisions of the Copyright, Designs and Patents Act
1988 or under the terms of a licence issued by the Copyright Licensing Agency Limited of Saffron
House, 6–10 Kirby Street, London EC1N 8TS, UK.

Warning: the doing of any unauthorised act in relation to this work may result in both civil and
criminal liability.

Every effort has been made to ensure the accuracy of the text at the time of publication,
including efforts to contact each author to confirm that their details are correct at
publication. However, no responsibility for loss occasioned to any person acting or
refraining from acting as a result of the material contained in this publication will be
accepted by the copyright owner, the editor, the authors or Incisive Media.

Many of the product names contained in this publication are registered trade marks, and Risk
Books has made every effort to print them with the capitalisation and punctuation used by the
trademark owner. For reasons of textual clarity, it is not our house style to use symbols such
as TM, ®, etc. However, the absence of such symbols should not be taken to indicate absence of
trademark protection; anyone wishing to use product names in the public domain should first
clear such use with the product owner.

While best efforts have been made in the preparation of this book, neither the publisher,
the editor nor any affiliated organisation accepts responsibility for any errors, mistakes
or omissions it may contain, or for any losses howsoever arising from, or in reliance
upon, its information, meanings and interpretations by any parties.


Contents

About the Editor ix
About the Authors xi
Acknowledgements xix
Foreword xxi
Introduction xxv

SECTION 1: DERIVATIVES PRICING

1 Smile Dynamics IV 3
Lorenzo Bergomi
Société Générale

2 Funding Beyond Discounting: Collateral Agreements
and Derivatives Pricing 25
Vladimir V. Piterbarg
Barclays

3 Two Curves, One Price 43
Marco Bianchetti
Intesa Sanpaolo Bank

4 A Libor Market Model with a Stochastic Basis 61
Fabio Mercurio
Bloomberg

5 Volatility Interpolation 77
Jesper Andreasen and Brian Huge
Danske Bank

6 Random Grids 91
Jesper Andreasen and Brian Huge
Danske Bank

7 Being Particular About Calibration 109
Julien Guyon and Pierre Henry-Labordère
Bloomberg and Société Générale

8 Cooking with Collateral 129
Vladimir V. Piterbarg
Barclays

SECTION 2: ASSET AND RISK MANAGEMENT

9 A Dynamic Model for Hard-to-borrow Stocks 149
Marco Avellaneda and Mike Lipkin
New York University and Columbia University

10 Shortfall Factor Contributions 165
Richard Martin and Roland Ordovàs
Longwood Credit Partners and Sovereign Bank

11 Stressed in Monte Carlo 183
Christian Fries
DZ Bank

12 A New Breed of Copulas for Risk and Portfolio
Management 197
Attilio Meucci
SYMMYS

13 A Historical-parametric Hybrid VaR 213
Robin Stuart
State Street Global Markets Risk Management

14 Impact-adjusted Valuation and the Criticality of
Leverage 229
Jean-Philippe Bouchaud, Fabio Caccioli and Doyne Farmer
Capital Fund Management, Santa Fe Institute and
University of Oxford

SECTION 3: COUNTERPARTY CREDIT RISK

15 Being Two-faced Over Counterparty Credit Risk 243
Jon Gregory
Solum Financial Partners

16 Real-time Counterparty Credit Risk Management in
Monte Carlo 259
Luca Capriotti, Jacky Lee and Matthew Peacock
Credit Suisse and Axon Strategies

17 Counterparty Risk Capital and CVA 275
Michael Pykhtin
US Federal Reserve Board

18 Partial Differential Equation Representations of
Derivatives with Bilateral Counterparty Risk and
Funding Costs 295
Christoph Burgard and Mats Kjaer
Barclays

19 Close-out Convention Tensions 315
Damiano Brigo and Massimo Morini
Imperial College London and
IMI Bank of Intesa Sanpaolo

20 Cutting CVA’s Complexity 329
Pierre Henry-Labordère
Société Générale

Notes on Chapters 349

Index 351

About the Editor

Mauro Cesa is the technical editor of the Risk Management and
Alternative Investment (RMAI) division at Incisive Media in
London. Since 2009, he has been responsible for the Cutting Edge
section of Risk, Energy Risk, Insurance Risk and ETF Risk magazines.
Cutting Edge publishes peer-reviewed quantitative finance articles
with a focus on the pricing and hedging of financial instruments, as
well as risk management relevant to investment banking, buy-side
industry, energy firms and insurance companies. Before joining
Incisive Media in 2007, Mauro worked with the quantitative asset
management team at Eurizon Capital in Milan on equity and fixed
income investment models for mutual funds and pension funds.
He studied economics at Trieste University and Aarhus University,
and holds an MA in quantitative finance from Brescia University.

About the Authors

Jesper Andreasen heads the quantitative research department at
Danske Bank in Copenhagen. He has previously held positions in
the quantitative research departments of Bank of America, Nordea
and General Re Financial Products. Jesper’s research interests
include term structure modelling, volatility smiles and numerical
methods. He has a PhD in mathematical finance from Aarhus
University, Denmark, is an honorary professor of mathematical
finance at Copenhagen University, and has twice received Risk
magazine’s Quant of the Year award.

Marco Avellaneda has been involved in teaching, developing and
practising quantitative finance since the late 1990s. He previously
worked at Banque Indosuez, Morgan Stanley, Gargoyle Strategic
Investments, Capital Fund Management and at the Galleon Group.
His interests – both practical and theoretical – are focused on quan-
titative alpha generation. As a faculty member at the Courant
Institute, he teaches classes in stochastic calculus, risk management
and portfolio theory, PDEs in finance and quantitative investment
strategies. He is on the editorial boards of Communications on Pure
and Applied Mathematics, the International Journal for Theoretical and
Applied Finance and Quantitative Finance, and co-authored the text-
book Quantitative Modeling of Derivative Securities. He was named
2010 Quant of the Year by Risk magazine.

Lorenzo Bergomi is head of quantitative research in the global
markets division at Société Générale. Originally trained in electrical
engineering, he obtained a PhD in theoretical physics and spent a
few years as an academic before joining Société Générale’s equity
derivatives department in 1997. Lorenzo’s team was given a cross-
asset global mandate in 2009. He is best known for his work on
stochastic volatility, most of which has been published in a series of
papers in Risk magazine.

Marco Bianchetti joined the market risk management area of Intesa
Sanpaolo in 2008, to cover derivatives pricing and risk manage-
ment across all asset classes, with a focus on new products
development, model validation, model risk management, interest
rate modelling, funding and counterparty risk. Marco previously
worked in the front-office financial engineering division of Banca
Caboto (now Banca IMI), developing pricing models and applica-
tions for interest rate and inflation trading desks. He is a speaker at
international conferences and training in quantitative finance, and
holds an MSc in theoretical nuclear physics and a PhD in theoretical
condensed matter physics.

Jean-Philippe Bouchaud obtained his PhD in physics from the
Ecole Normale Supérieure in 1985, working first on the dynamics
of complex systems and then, since 1991, in theoretical finance. His
work has been very critical of the standard concepts and models
used in economics and in the financial industry. Jean-Philippe
co-founded Capital Fund Management in 1994, and is now their
president and head of research, as well as being a professor at the
Ecole Polytechnique. He has published over 300 scientific papers
and several books, and was awarded the CNRS Silver Medal in
1996.

Damiano Brigo is chair and co-head of mathematical finance at
Imperial College London and director of the Capco Research
Institute. Previously a professor at King’s College and managing
director at Fitch, he is also managing editor of the International
Journal of Theoretical and Applied Finance. Damiano has published
over 70 works on mathematical finance, probability and statistics,
and field reference books on interest rate and credit modelling, and
his interests include pricing, risk, credit, funding and stochastic
models for commodities and inflation. He holds a PhD in differen-
tial geometric stochastic filtering.

Christoph Burgard is a managing director at Barclays, with global
responsibility for the modelling of equity, securitisation deriva-
tives, counterparty credit, banking book capital and ALM. In
addition, he has built up a number of other quant teams over the
years, including in emerging markets and exposure analytics
modelling, and managed the credit quant team through the credit
crisis. Before joining Barclays in 1999, Christoph worked in theo-
retical and experimental particle physics and was a fellow at CERN
and DESY. He holds a PhD in physics from Hamburg University.

Fabio Caccioli is a postdoctoral fellow at the Santa Fe Institute in
Santa Fe, New Mexico. His research mainly focuses on systemic risk
and financial stability, as well as complex networks and non-equi-
librium statistical mechanics. Fabio holds a PhD in statistical
physics from the International School for Advanced Studies in
Trieste, Italy.

Luca Capriotti is the US head of quantitative strategies global credit
products at Credit Suisse where he focuses on flow and structured
credit, strategic risk programmes and counterparty credit risk
management. He also works on developing efficient computational
methods for the fast calculation of Greeks, for which he has a patent
pending. Prior to working in finance, Luca was a researcher at the
Kavli Institute for Theoretical Physics, Santa Barbara, California.
He holds an MPhil and a PhD in condensed matter theory from the
International School for Advanced Studies, Trieste, Italy.

J. Doyne Farmer is a professor of mathematics and director of the
complexity economics program at the Institute for New Economic
Thinking at Oxford University. He was previously founder of the
complex systems group at Los Alamos National Laboratory,
founder of Prediction Company and spent 10 years as a professor at
the Santa Fe Institute.

Christian Fries is head of model development at DZ BANK’s risk
control division and professor of applied mathematical finance in
the Department of Mathematics at LMU Munich. His current
research interests are hybrid interest rate models, exposure simula-
tion, Monte Carlo methods and valuation under funding and
counterparty risk. Christian is the author of Mathematical Finance:
Theory, Modeling, Implementation, and runs the finmath.net website.

Jon Gregory is a partner at Solum Financial Partners and special-
ises in counterparty risk and CVA-related consulting and advisory
projects. He has worked on many aspects of credit risk in his career,
being previously with Barclays Capital, BNP Paribas and Citigroup.
Jon is author of Counterparty Credit Risk: The New Challenge for Global
Financial Markets, now in its second edition. He holds a PhD from
Cambridge University.

Julien Guyon is a senior quantitative analyst in the quantitative
financial research group at Bloomberg, New York. Before joining
Bloomberg, he worked in the global markets quantitative research
team at Société Générale in Paris. Julien graduated from Ecole
Polytechnique (Paris), Université Paris 6 and Ecole des Ponts, and
received his PhD in probability theory and statistics from the Ecole
des Ponts (Paris). He has also been a visiting professor at Université
Paris 7 and at Ecole des Ponts, teaching mathematics of finance in
their master programmes. Julien’s main research interests are
numerical probabilistic methods and volatility modelling.

Pierre Henry-Labordère works in the global markets quantitative
research team at Société Générale. After receiving his PhD at Ecole
Normale Supérieure (Paris) in the theory of superstrings, he joined
the theoretical physics department at Imperial College London,
before moving to finance in 2004. Since 2011, Pierre has also been an
associate researcher at Centre de Mathématiques Appliquées, Ecole
Polytechnique. He was the recipient of the 2013 Quant of the Year
award from Risk magazine.

Brian Huge is chief analyst in the quant group, with focus on FX
and equity derivatives, at Danske Bank, where he has worked since
the early 2000s. He has a PhD in mathematical finance from
Copenhagen University, with a thesis entitled “On Defaultable
Claims and Credit Derivatives”.

Mats Kjaer works in the quantitative analytics group at Barclays,
which he joined as a graduate in 2006. He specialises in CVA and
funding modelling, on which he has published three peer-reviewed
papers and numerous working papers, and regularly presents his
work at academic and industry conferences. Prior to joining
Barclays, Mats was a PhD student in mathematical finance at
Gothenburg University, Sweden, where he earned his doctorate in
2006. He has also worked as a visiting research fellow at the
University of Texas at Austin and as a management consultant with
the Boston Consulting Group in Stockholm.

Jacky Lee is the US regional head of quantitative strategies and the
global head of quantitative strategies global credit products at
Credit Suisse, where he focuses on flow and structured credit, stra-
tegic risk programmes and counterparty credit risk management.
Prior to joining Credit Suisse in August 2002, he began his career as
a quantitative modeller in the credit derivative research team at
Morgan Stanley in 1998. Jacky holds a PhD in operations research
from Stanford University and an MSc in applied mathematics from
Auckland University, New Zealand.

Mike Lipkin is associate adjunct professor in the industrial engi-
neering and operations research department at Columbia
University. He has been an options market maker on the American
Stock Exchange since the late 1990s, and has also carried out
research in derivatives, producing a generally accepted theory of
the pinning of optionable stocks on expirations with Marco
Avellaneda. Mike’s research involves take-overs, earnings and
special announcements, all topics covered in his Columbia course
on experimental finance. He has a PhD in chemistry.

Richard Martin is a founding partner at Longwood Credit Partners
in London. He was previously at AHL, part of Man Group, where
he was initially head of quantitative credit strategies, then a port-
folio manager in the fixed income sector. Between 2003 and 2008, he
was a managing director in fixed income with Credit Suisse in
London. Richard’s interests include systematic trading, CDO corre-
lation trading, credit-equity trading and the pricing and hedging of
credit derivatives. An authority on portfolio modelling, he intro-
duced the saddle-point method as a tool for assisting in portfolio
risk calculations. He was awarded Quant of the Year by Risk maga-
zine in 2002.

Fabio Mercurio is head of derivatives research at Bloomberg in
New York and an adjunct professor at NYU. Previously, he was
head of financial engineering at Banca IMI, Milan. Fabio has jointly
authored the book Interest Rate Models: Theory and Practice and has
published extensively in books and international journals, including
13 Cutting Edge articles in Risk magazine. He holds a BSc in applied
mathematics from the University of Padua and a PhD in mathemat-
ical finance from the Erasmus University of Rotterdam.

Attilio Meucci is the founder of SYMMYS, under whose umbrella
he designed, and continues to teach, the six-day Advanced Risk and
Portfolio Management Bootcamp (ARPM Bootcamp), and manages
the charity One More Reason. He is also the chief risk officer and
director of portfolio construction at Kepos Capital. Previously,
Attilio was the head of research at ALPHA, Bloomberg’s portfolio
analytics and risk platform, a researcher at POINT, Lehman
Brothers’ portfolio analytics and risk platform, a trader at Relative
Value International and a consultant at Bain & Co. Concurrently, he
taught at Columbia–IEOR, NYU–Courant, Baruch College–CUNY
and Bocconi University.

Massimo Morini is head of interest rate and credit models at IMI
Bank of Intesa Sanpaolo, where he is also coordinator of model
research. He is professor of fixed income at Bocconi University and
he was previously a research fellow at Cass Business School.
Massimo regularly delivers advanced training worldwide, and has
led workshops and expert panels on the financial crisis at major
international conferences. He has published in journals including
Risk magazine, Mathematical Finance and the Journal of Derivatives,
and is the author of Understanding and Managing Model Risk: A
Practical Guide for Quants, Traders and Validators and other books on
credit and interest rate modelling. He holds a PhD in mathematics
and an MSc in economics.

Roland Ordovàs is head of US risk methodology at Sovereign Bank
in Boston. Previously, he was director of capital methodology at
Santander, where he was responsible for credit portfolio modelling,
inter-risk aggregation, the development of low-default portfolio
rating models and credit risk measures. Roland has also worked at
BNP Paribas, London, where he was involved in the capital manage-
ment team and credit counterparty risk analytics. His interests
include research on analytical solutions related to the area of credit
portfolio modelling and capital allocation. Roland received a scien-
tific PhD from Imperial College, London in 2000.

Matthew Peacock worked for six years at Credit Suisse as a quant,
specialising in flow credit products, and has since co-founded Axon
Strategies to research and develop quantitative systematic trading
systems. He holds a BE and a BSc from the University of Melbourne
and a PhD in engineering from the University of Sydney.

Vladimir V. Piterbarg is a managing director and the global head of
quantitative analytics at Barclays. Before joining Barclays Capital in
March 2005, he was a co-head of quantitative research for Bank of
America. Vladimir’s main areas of expertise are the modelling of
interest rate and hybrid derivatives. He has won two Quant of the
Year awards from Risk magazine, and serves as an associate editor
of the Journal of Computational Finance and the Journal of Investment
Strategies. Vladimir co-authored the three-volume book Interest Rate
Modeling, and has published more than 20 articles on quantitative
finance. He holds a PhD in mathematics from the University of
Southern California.

Michael Pykhtin is a senior economist in the quantitative risk
management section at the Federal Reserve Board, where he is
responsible for carrying out policy analysis and independent
research related to financial markets, risk management and regula-
tion of financial institutions. Prior to joining the FRB in 2009, he was
a quantitative researcher at Bank of America and KeyCorp. Michael
edited the book Counterparty Credit Risk Modelling and has contrib-
uted to several edited collections, as well as being an associate
editor of the Journal of Credit Risk and extensively publishing in
leading industry journals. He holds a PhD in physics from the
University of Pennsylvania.

Robin Stuart is the head of risk analytics for State Street Corporation
Global Markets, responsible for the modelling of market and coun-
terparty credit risk. Trained in mathematics and theoretical physics,
he held post-doctoral positions at a number of international institu-
tions including CERN and the Max Planck Institute. Robin was
previously a professor of physics at the University of Michigan,
Ann Arbor, before joining Merrill Lynch in 1999, where he held a
number of roles including model validation, risk management for
the FX and short-term interest rate business and VaR modelling,
continuing after the merger with Bank of America. He holds doctor-
ates in theoretical physics from the Universities of Oxford and
Otago.

Acknowledgements from the Editor

I am grateful to each and every one of the chapter contributors for
choosing to publish their research papers with Risk magazine and
for showing their support to this project from its beginning. I would
like to express my gratitude to colleagues at Risk, in particular my
current and past colleagues at the Cutting Edge section: Laurie
Carver, Nazneen Sherif and Sebastian Wang, whose hard work has
been fundamental to the publication of this book. Special thanks go
to Matt Cameron for his numerous helpful suggestions. I would
also like to thank Sarah Hastings for approaching me with the idea
for this project, Lewis O’Sullivan for directing its production and
Amy Jordan for encouraging and guiding me through the process
and facilitating it with her infinite patience and thorough
professionalism.

Foreword

The origins of quantitative finance are lost in the mists of time and
are difficult to identify precisely. However, most scholars agree that
they can be traced back certainly as far as the celebrated treatise on
double entry bookkeeping Summa de Arithmetica, Geometria,
Proportioni et Proportionalita (Everything About Arithmetic, Geometry
and Proportion), which was published in 1494 by Luca Pacioli. He, in
turn, credits an even earlier manuscript Della Mercatura et del
Mercante Perfetto (Of Trading and the Perfect Trader) by his half-
forgotten predecessor Benedetto Cotrugli. Two important treatises
on options trading are Confusion de Confusiones (Confusion of
Confusions), published by Joseph de la Vega in 1688, and Traité de la
Circulation et du Crédit (An Essay on Circulation of Currency and
Credit), published by Isaac de Pinto in 1771. The works by de la
Vega and de Pinto clearly show that trading in options is not a new
phenomenon (as is from time to time wrongly claimed by its detrac-
tors) and has been thriving in Europe at least since the 16th century,
if not earlier. For instance, the Antwerp Exchange, the London
Royal Exchange and the Amsterdam Bourse were opened in 1531,
1571 and 1611, respectively. The reasons for the existence of a
burgeoning trade in options are not difficult to fathom – such
trading is crucial for the smooth functioning of commerce.
Regardless of any disputes about the history of quantitative
finance, there is general consensus that the starting point of modern
quantitative finance was the PhD thesis by Louis Bachelier Théorie
de la Spéculation (The Theory of Speculation), which was published in
1900. In his thesis, Bachelier introduced the stochastic process now
known as Brownian motion, and used it to study the evolution of
stock prices and to develop a theory of option pricing. Bachelier’s
work was forgotten for several decades until it was rediscovered
and published in English by Paul Cootner in 1964.
Although several distinguished scholars contributed to the
progress of quantitative finance in the interim (the name of Paul
Samuelson springs to mind), the first major advance since
Bachelier’s PhD came in 1973 when Fischer Black, Myron Scholes
and Robert Merton (BSM) presented a novel solution to the option
pricing problem, now known as the BSM formula. In 1997, Merton
and Scholes shared the Nobel Prize in Economics for their
discovery; Black, who died in 1995, was mentioned as a contrib-
utor by the Swedish Academy. The BSM theory is based on the
following explicit assumptions: (i) there are no arbitrage opportu-
nities in the market; (ii) the market is frictionless, so that it is
possible to borrow and lend cash at a constant risk-free interest
rate and buy and sell any amount of stock without transaction
costs and taxes; (iii) the underlying stock does not pay dividends
and its price is driven by a geometric Brownian motion with
constant drift and volatility parameters. While assumption (iii) is
easy to relax, assumptions (i) and (ii) are so fundamental to the
BSM theory that they have been taken for granted ever since the
theory was first proposed.
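For reference, under these assumptions the BSM price of a European
call with spot S, strike K, maturity T, risk-free rate r and volatility σ
is the familiar closed form

\[ C = S\,N(d_1) - K e^{-rT} N(d_2), \qquad d_{1,2} = \frac{\ln(S/K) + \left(r \pm \tfrac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}} \]

where N denotes the standard normal cumulative distribution function.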
Between 1973 and 2008, quantitative finance developed at a very
fast pace. A major effort was aimed at replacing assumption (iii),
since it was realised early on that no actual option market conforms
exactly to the BSM framework; to reconcile premiums in the market,
practitioners assume that the volatility argument in the BSM
formula depends on option maturity and strike. To put it differ-
ently, in practice it is necessary to use the market-implied volatility,
which differs in shape from a constant, and may have considerable
slope and convexity as a function of its arguments. In order to
account for the implied volatility not being constant, several exten-
sions to the BSM theory were introduced: local volatility models;
stochastic volatility models; jump-diffusion models; and universal
volatility models. In addition to options on stocks, many other
types of derivatives were introduced and analysed in detail, largely
within the confines of the extended BSM theory. An incomplete list
includes foreign exchange derivatives, interest rate swaps and
swaptions, commodity derivatives and credit derivatives. Overall,
pre-crisis option pricing theory satisfied the needs of the banking
industry reasonably well.
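In symbols, the market-implied volatility is defined implicitly: for
each strike K and maturity T, σ_imp(K, T) is the number that, inserted
into the BSM formula, reproduces the quoted premium,

\[ C_{\mathrm{mkt}}(K,T) = C_{\mathrm{BSM}}\big(K,T;\sigma_{\mathrm{imp}}(K,T)\big) \]

and the smile and term structure are precisely the dependence of
σ_imp on K and T.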
Another important source of inspiration for quantitative finance
is modern portfolio theory (MPT), largely developed by Harry
Markowitz (1952, 1959), James Tobin (1958), William Sharpe (1964),
John Lintner (1965) and Fischer Black and Robert Litterman (1992).
Tobin, and Markowitz and Sharpe were awarded the Nobel Prize in
Economics in 1981 and 1990, respectively. MPT explains the advan-
tages of diversification and shows how to achieve it in the best
possible way, and makes numerous idealised assumptions that it
shares with the BSM framework. In essence, it describes the behav-
iour of rational investors operating in frictionless Gaussian markets
and aiming at maximisation of economic utility.
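In its simplest form, the Markowitz problem chooses portfolio
weights w to trade expected return against variance; schematically,
with μ the vector of expected returns, Σ the covariance matrix and λ
a risk-aversion parameter,

\[ \max_{w}\; w^{\top}\mu - \frac{\lambda}{2}\, w^{\top}\Sigma\, w \]

a formulation that inherits the Gaussian, frictionless idealisations
just described.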
However, since the 2008 financial crash, practitioners and
academics alike have realised that in markets under duress frictions
become dominant. This means that some large parts of quantitative
finance, including option pricing theory and MPT, have to be rebuilt
in order to account for market frictions in earnest. Some of the fail-
ures of quantitative finance from the pre-crisis build-up and during
the crisis itself were used by ill-informed detractors to claim that
the mathematical modelling of financial markets is futile and there-
fore has no future. This timely book serves as a concise response to
these detractors; it shows very clearly that well thought through
modelling is not only useful but necessary in order to help financial
markets to operate smoothly and perform their social role properly.
The editor, Mauro Cesa, has selected some of the best papers
published in Risk magazine since the beginning of the crisis; he
should be congratulated on his knowledge and taste.
The book consists of three parts and covers several important
topics, including post-BSM derivative pricing, asset allocation and
risk management, and, most importantly, counterparty risk.
Broadly speaking, it addresses the following subjects: (i) choices of
appropriate stochastic processes for modelling primary assets
including cash, bonds (government and corporate), equities,
currencies and commodities; (ii) financial derivatives on primary
assets and their risk-neutral and real-world pricing in different
modelling frameworks; (iii) modern approaches to volatility objects
and model calibration; (iv) asset allocation and related issues in the
presence of market frictions; and (v) credit risk and credit, debt, and
funding value adjustment calculations with and without collateral.
The reader will benefit from the expertise of some of the sharpest
thinkers in the field. Although most of the post-crisis models are
still far from being in the same state of completeness as their pre-
crisis predecessors, after reading the book it becomes clear that in
the future these new and more realistic and accurate models will
find wide applications and thus flourish and expand.

Alexander Lipton
Bank of America Merrill Lynch and Imperial College
February 2013

Introduction

Since its inception in 1987, Risk magazine has had the privilege of
publishing a collection of articles widely considered to be milestones
of modern quantitative finance, such as Vasicek (2002) on distribu-
tions of portfolio losses and Lipton (2002) on the volatility smile of
exotic options, while Dupire (1994), which introduced local vola-
tility, is still considered one of the most influential articles on
derivatives pricing. However, the world of modern quantitative
finance is changing. Where pre-2007 quants dreamed up compli-
cated theorems and designed exotic payouts, the credit crisis has
caused the industry as a whole to question long-held truisms,
including the pricing of something as simple as a plain vanilla
interest rate swap. Quants have also had to refocus their attentions
on capital and funding as a wave of regulatory reform has dramati-
cally reshaped the derivatives industry. As a result of this rapid
change, and adaption to the post-crisis landscape, quants have
generated a new wave of research. The aim of this book is to provide
a comprehensive overview of this new research, the challenges
quants have had to confront during the crisis, and of course their
responses; it will also focus on instruments and methodologies that
emerged or showed resilience during the crisis.
The repercussions of the credit crisis that enveloped global
markets in 2007 were keenly felt and continue to have a widespread
effect in all asset classes, even obliterating some and contributing to
the birth of others. Prior to 2007, a significant portion of financial
research was dedicated to complex credit derivatives. However,
subsequent to the collapse in credit markets in 2007, much of that
research was singled out as blameworthy and a key contributor to
the deterioration of banks’ balance-sheet health and their plum-
meting stock prices. One of the instruments borne out of that
quantitative research, the collateralised debt obligation (CDO), an
instrument that constituted the most toxic component of banks’
portfolios, virtually disappeared from the markets – although it
may make a comeback in 2013 or 2014.
In over-the-counter (OTC) derivatives markets, counterparties
to swaps trades typically sign up to a credit risk mitigant known
as a credit support annex (CSA), a legal document designed by
the International Swaps and Derivatives Association (ISDA) to
govern collateral posting between the two counterparties to a
trade. Under a CSA, counterparties to a trade agree to post each
other collateral, which is intended to cover the mark-to-market
value of the swap and ensure that if one counterparty were to
default, the non-defaulting counterparty would bear no loss as a
result. The great majority of trades are now collateralised.
However, the way in which collateralised trades are valued has
undergone a revolution. Prior to the crisis, every bank discounted
all trades at the prevailing risk-free rate, typically Libor. But
during the crisis, the basis between Libor and the overnight
indexed swap rate blew out, and it was no longer true that banks
could borrow at a risk-free rate. Banks using Libor to discount
trades produced swap prices that were too low, and because the
overnight indexed swap (OIS) rate is the rate that CSAs stipulate
should be paid on collateral, it became the rate at which future
cashflows in a swap should be discounted.
Chapter 2 and Chapter 8 will discuss the radical evolution of
pricing collateralised trades.
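Schematically, for a perfectly collateralised trade the CSA rate c (the
OIS rate) becomes the discount rate, so that

\[ V_t = \mathbb{E}_t\!\left[ e^{-\int_t^T c_s\,ds}\, V_T \right] \]

a statement made rigorous, and extended to partial collateralisation,
in Chapter 2.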
Meanwhile, when a trade is not collateralised, counterparties
need to account for the risk of both counterparties defaulting and
calculate the market value of the potential associated losses. The
valuation and the inclusion in pricing models of credit value adjust-
ments (CVA) and debit value adjustments (DVA) is hotly debated
by market players. Counterparty credit risk has evolved into the
primary focus of research, and we thought that it was appropriate
to dedicate an entire section to these issues.
The traditional concepts of funding and discounting have been
revolutionised by the explosion of basis spreads – the differences
between Libor and OIS rates. Before the crisis,
these two curves tended to coincide and one term structure was
used for calculating discount factors and forward rates. Not any
more. The multiple-curve environment that stemmed from the
credit and liquidity risks priced in these markets pushed the
development of new paradigms and pricing models. Chapters 2, 3,
4 and 8 will focus on these issues.
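Concretely, in the multiple-curve setting forwards and discount
factors come from different term structures; the value of the
floating-minus-fixed legs of a vanilla swap becomes, schematically,

\[ V_0 = \sum_i \tau_i \big( L_i(0) - K \big)\, P_{\mathrm{OIS}}(0, T_i) \]

with the forward Libor rates L_i read off the projection curve and
the discount factors P_OIS off the OIS curve.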
On the buy-side, the effects of prolonged periods of high vola-
tility have prompted the strengthening of risk management tools.
Value-at-risk (VaR) and expected shortfall have been put under
scrutiny, and several solutions have been proffered to resolve their
flaws. Meanwhile, in an attempt to quell high volatility and tame
negative spikes in the stock markets, regulators have controver-
sially adopted periodic short-sell bans. This practice has been
criticised for its negative impact on the hedge fund industry, which
claims the bans cause volatility spikes and liquidity issues. Chapter
9 will discuss the topic.
Some contributions to this book, rather than being a direct conse-
quence of the credit crisis, follow the natural evolution of quant
finance, specifically equity derivatives modelling, which, when it
ignores counterparty risk, amounts to pure derivatives pricing. Chapter
1 will investigate the complex behaviour of volatility smiles in
stochastic models, while Chapters 5, 6 and 7 focus on calibration of
derivatives pricing models.
The book is organised as follows. The first section deals with
derivatives pricing, including topics on equity derivatives, interest
rates derivatives, multiple-curve environments, collateralisation
and pricing model calibration. The second section, on asset and risk
management, offers contributions on liquidity risk, short selling,
risk measurement tools and correlation structures, while the
following section explores counterparty credit risk, examining its
bilateral formulation, connection with risk capital, stochastic repre-
sentations, challenging computation and the residual value of a
deal at close-out.

SECTION ONE – DERIVATIVES PRICING


The first chapter, “Smile Dynamics IV,” is the final addition to a
ground-breaking series of articles by Lorenzo Bergomi, who was
crowned Risk’s Quant of the Year in 2009 for his previous work,
Smile Dynamics III. The series explores the dynamics of spot prices
in conjunction with implied variance swap volatilities. By
combining the two, it is possible to design a framework consistent
with both exotic equity derivatives markets and volatility swap
markets, which reduces computational costs and operational risk,
as well as avoiding constraints associated with other popular
models such as Heston’s stochastic volatility and Merton’s jump
model. Building on his previous work, Bergomi explains the rela-
tionship between the rate at which the at-the-money-forward
(ATMF) skew decays with maturity and the rate at which the ATMF
volatility varies with spot price. He then introduces the skew sticki-
ness ratio (SSR) as the regression coefficient of the ATMF volatility
conditional on the price movements and shows its interrelation
with the ATMF skew. Subsequently, for short maturities he explains
how to exploit the difference between implied and realised SSR
through the example of a dynamic option strategy on Eurostoxx 50
implied volatilities.
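In the chapter’s notation, rendered schematically here, the SSR for
maturity T is the regression coefficient of the ATMF volatility on the
log-spot, normalised by the ATMF skew:

\[ R_T = \frac{1}{\mathcal{S}_T}\, \frac{d\hat{\sigma}_{F_T T}}{d\ln S}, \qquad \mathcal{S}_T = \left. \frac{d\hat{\sigma}_{KT}}{d\ln K} \right|_{K=F_T} \]

In this convention, R_T = 1 corresponds to sticky-strike dynamics,
while values near 2 are characteristic of stochastic volatility models
at short maturities.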
Chapter 2 is the first of a number of chapters dedicated to funding
costs and collateralisation. Prior to February 2010, when Vladimir
Piterbarg’s “Funding Beyond Discounting: Collateral Agreements
and Derivatives Pricing” was published, the market had already
begun to move away from Libor-based discounting for collateral-
ised swaps; however, his paper was the first to provide a consistent
and mathematically rigorous framework explaining the necessity
of using different discounting rates for collateralised and non-
collateralised trades. Starting from first principles, Piterbarg points
out that collateralised trades, governed by a CSA, should be
discounted using OIS rates, while non-collateralised trades should
be discounted using the bank’s own cost of funding. He also adds
that a convexity adjustment is needed to capture the collateralisa-
tion effects on forward curves. To illustrate the impact of a CSA on
standard derivatives pricing, Piterbarg shows how the Black–
Scholes formula in the presence of a CSA is derived. He won his
second Risk Quant of the Year award for this paper, which has been
cited as one of the most influential since the onset of the crisis, and
has sparked a set of follow-up papers building on cases of more
complex derivatives.
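The paper’s central result can be summarised in a single pricing
formula: with r_F the dealer’s funding rate, r_C the collateral (CSA)
rate and C_u the collateral balance, the derivative value satisfies

\[ V_t = \mathbb{E}_t\!\left[ e^{-\int_t^T r_F\,du}\, V_T \right] + \mathbb{E}_t\!\left[ \int_t^T e^{-\int_t^u r_F\,dv}\,\big(r_F(u) - r_C(u)\big)\, C_u\, du \right] \]

so that full collateralisation (C = V) collapses to discounting at r_C,
and zero collateral to discounting at r_F.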
One of the effects of the crisis and the associated spikes in vola-
tility has been the divergence of basis spreads in the single-currency
interest rate derivatives market between instruments with different
tenors. Before this, interest rate derivatives were priced using a
single term structure despite practitioners and researchers being
aware of the presence of multiple curves for many years (the first
theoretical treatment was published in Tuckman and Porfirio, 2003).
Although the distances between the curves were considered negli-
gible for pricing and hedging purposes, when the spread between
Libor and Eonia rates, which used to float at around six basis points
before the crisis, peaked at over 80 basis points in 2008 – due to
what is widely believed to be the presence of liquidity and credit
risk premia – the market had to adapt quickly in order to avoid
arbitrage opportunities presented by the spread explosion. In
Chapter 3, “Two Curves, One Price”, Marco Bianchetti formalises
the pricing methodology that became standard practice during the
crisis and provides double-curve no-arbitrage pricing formulas for
vanilla interest rate derivatives – including forward rate agree-
ments (FRAs), swaps, caps, floors and swaptions. In his framework,
Bianchetti uses the foreign currency analogy to explain the move-
ment of one curve relative to another.
Chapter 4 builds on the discussion of models used to price
interest rate derivatives post the misalignment of OIS and forward
rate agreement (FRA) rates, which are typically calculated using
Libor or Euribor depending on the currency involved. Previous
works by Alan Brace and Fabio Mercurio independently addressed
the issue by jointly modelling OIS and Libor rates, and developing
a multiple-curve model consistent with the Libor market model
(LMM). “A Libor Market Model with a Stochastic Basis” by
Mercurio adapts the LMM to the multi-curve setting using a
different approach. He models the OIS rate and the basis spread
and obtains the FRA rate as the sum of the two. This has the advan-
tage of keeping a handle on the credit spread and being able to
model the Libor curves more easily. The multi-tenor, multi-curve
LMM (McLMM) can be applied to plain vanilla interest rate
derivatives as well as more complex instruments, such as basis
swaps or caps and swaptions with non-standard underlying tenors.
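Schematically, the FRA rate for tenor k is modelled as the sum of
the corresponding OIS forward rate and a stochastic basis,

\[ L_k(t) = F_k(t) + S_k(t) \]

with separate dynamics assigned to F_k and S_k, which keeps the
credit/liquidity spread as an explicit state variable.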
In Chapters 5 and 6, “Volatility Interpolation” and “Random
Grids”, Jesper Andreasen and Brian Huge build, within a local
stochastic volatility framework, a robust method for calibrating
implied volatilities that guarantees no arbitrage. The distinctive
feature of their approach is that a discrete set of option quotes can
be calibrated to generate a smooth, arbitrage-free implied volatility
surface suitable for use in Monte Carlo simulation.
The advantage of using a first-order discretisation, which avoids
continuous time modelling, is the significant reduction of
computational costs. The method has been implemented at Danske
Bank since 2010 and is versatile enough to potentially accommo-
date additional features like jump processes and asset correlation
structures. Although it is designed for equity derivatives, it can be
extended to interest rates – Andreasen and Huge (2013), for
example, applies it to the SABR model – and credit risk. The two
papers drew many plaudits, causing some to claim that Andreasen
and Huge have reinvented local volatility modelling, and contrib-
uted to the authors picking up Risk’s Quant of the Year award in
2012.
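The flavour of the approach can be conveyed by a minimal sketch of
its core building block, a single fully implicit finite-difference step
applied to call prices in strike space; everything here (the uniform
grid, flat volatility slice, zero rates and all names) is an illustrative
assumption, not the authors’ production code.

```python
import numpy as np

def one_step_implicit(calls, strikes, local_vol, dt):
    """One fully implicit step of the Dupire forward equation in strike,
    dC/dT = 0.5 * vol(K)^2 * K^2 * d2C/dK2 (zero rates and dividends),
    solved as (I - dt * A) C_new = C_old; the implicit, monotone scheme
    is what keeps the update arbitrage-free."""
    n = len(strikes)
    dk = strikes[1] - strikes[0]
    M = np.zeros((n, n))
    M[0, 0] = M[-1, -1] = 1.0  # Dirichlet rows: boundary values held fixed
    for i in range(1, n - 1):
        a = 0.5 * (local_vol[i] * strikes[i]) ** 2 * dt / dk ** 2
        M[i, i - 1], M[i, i], M[i, i + 1] = -a, 1.0 + 2.0 * a, -a
    return np.linalg.solve(M, calls)

# toy usage: one implicit step away from the intrinsic values
strikes = np.linspace(50.0, 150.0, 101)
intrinsic = np.maximum(100.0 - strikes, 0.0)   # calls at T = 0, spot 100
vols = np.full_like(strikes, 0.2)              # flat slice, illustrative
calls = one_step_implicit(intrinsic, strikes, vols, dt=1.0)
```

Roughly, the full method applies one such implicit step per maturity,
with a piecewise-constant volatility slice calibrated so that the
output prices match the quotes.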
Chapter 7 further explores the calibration of market smile for
equity derivatives. Julien Guyon and Pierre Henry-Labordère show
how to calibrate multi-factor hybrid local stochastic volatility
models to market smiles using an algorithm borrowed from particle
physics. Their paper explains how this Monte Carlo method can
calibrate any local stochastic volatility or hybrid model to fit the
market smile. Its derivation and implementation are mathemati-
cally challenging, but the chapter introduces the necessary tools
and demonstrates the efficiency of the algorithm on well-known models,
such as the Ho–Lee and Dupire hybrid models, and Bergomi’s local
stochastic volatility model. Compared to Andreasen and Huge’s
approaches in Chapters 5 and 6, Guyon and Henry-Labordère’s
method handles high dimensionality in the stochastic volatility
model more efficiently. This work, and Henry-Labordère’s CVA
model treatise in Chapter 20, led to him being chosen as Risk’s
Quant of the Year for 2013.
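At the heart of the method is the calibration condition for the
leverage function of a local stochastic volatility model
dS_t/S_t = ℓ(t, S_t) a_t dW_t, written schematically as

\[ \ell(t,S)^2 = \frac{\sigma_{\mathrm{Dup}}(t,S)^2}{\mathbb{E}\big[\, a_t^2 \mid S_t = S \,\big]} \]

where σ_Dup is the Dupire local volatility and the conditional
expectation is estimated across the simulated “particles” with a
regularising kernel.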
Chapter 8 also deals with the pricing of collateralised trades
discussed in previous chapters. In “Cooking with Collateral”,
Vladimir Piterbarg develops a model for an economy in which
there are no such things as credit risk-free securities and all contracts
are collateralised. After explaining the foundations of the model, he
develops it in a cross-currency setting, consistent with Fujii and
Takahashi (2011). However, in Fujii and Takahashi (2011), the risk-
free rate is still part of the model, and the innovation in Piterbarg’s
model is that it excludes risk-free rates but still coherently defines a
risk-neutral measure, allowing the use of much of the traditional
pricing framework. To obtain that result, each collateralised asset
grows at the rate at which it is collateralised. Interestingly, as there
are no risk-free rates, the FX rate drift is not expressed in the form of
a rate spread, but is instead given by an overnight repo rate on the
sale of currencies.
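The resulting pricing rule can be stated compactly, if schematically:
under the pricing measure, every asset drifts at its own collateral
rate,

\[ \mathbb{E}_t\big[\, dV_t \,\big] = c^{V}_t\, V_t\, dt \]

which suffices to define prices without ever invoking a risk-free rate.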

SECTION TWO – ASSET AND RISK MANAGEMENT


During the crisis, regulators readily claimed that short selling was
one of the prime reasons behind market downturns, and subse-
quently introduced a patchwork of short-selling restrictions.
However, market players claim that short-selling restrictions, in
particular situations where re-purchases of shorted shares are
imposed by clearing houses, entail unjustified price spikes or
consistent overpricing, higher volatility and liquidity issues.
Chapter 9, “A Dynamic Model for Hard-to-borrow Stocks” by
Marco Avellaneda and Mike Lipkin, proposes a pricing model for
stocks for which the probability of buy-ins is strictly positive. The
model comprises two stochastic differential equations explaining
the behaviour of the stock price, one of which describes the evolu-
tion of the buy-in rate. The model enables the assessment of the
effective cost of borrowing the stocks one wants to short. Conversely,
the buy-in rate can be interpreted by the owner of the stock as a
convenience yield, which is technically equivalent to a dividend.
The arguments are then applied to option pricing and leveraged
exchange-traded funds (ETFs). Following the publication of this
work, Avellaneda was awarded Risk’s Quant of the Year in 2010.
Chapter 10, “Shortfall Factor Contributions” by Richard Martin
and Roland Ordovàs, proposes a generalisation of the Euler formula
to decompose the expected shortfall of a portfolio into a sum of risk
factor contributions. The Euler formula, an instrument used in asset and risk
allocation, does not capture the contribution of individual factors as
it focuses on positions and portfolio weights instead. Its generalisa-
tion allows the production of factor contributions that add up to the
systematic part of the expected shortfall minus the expected loss. In
using this method, it is possible to calculate the sensitivity of a port-
folio to each of the individual factors. Since it is not model-specific,
it is potentially applicable to a variety of portfolio models. The
authors provide worked examples, using a multivariate normal
model, on a portfolio of defaultable instruments and a retail banking
portfolio.
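For reference, the classical Euler decomposition applies to any
positively homogeneous risk measure ρ, expected shortfall included:
with w_i the position weights,

\[ \rho(w) = \sum_i w_i\, \frac{\partial \rho(w)}{\partial w_i} \]

and it is this position-level identity that the chapter generalises to
risk factors.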
In Chapter 11, “Stressed in Monte Carlo,” Christian Fries
discusses stress-test failures that may occur with Monte Carlo
simulation when model parameters reach extreme values, and
proposes an alternative solution. Misleading results may be
observed in stressed situations, for example, when pricing an
option in a high-volatility regime, but Fries shows how the method
can be modified by introducing analytic boundary conditions.
When boundaries are defined, the in-bound area is distinguished
from the out-bound one and a Monte Carlo scheme is designed in
such a way that all the paths stay within the in-bound area. Put
simply, this excludes the simulated paths that violate the boundary,
and the simulation thus gains stability in the numerical results.
While the process of determining the boundaries is an additional
computational challenge, the method allows robust stress tests on a
portfolio of complex products.
Copula functions have been applied to risk and asset manage-
ment, and credit derivatives pricing since their introduction to
finance by Li (2000). However, copulas have had a turbulent history:
once hailed as a modelling masterstroke, they are now singled out
as incapable of capturing the complexity of correlation structures
underlying CDOs. One limit of a copula function is that the
marginal distributions of its random variables are uniform, and
therefore have limited flexibility. In Chapter 12, “A New Breed of
Copulas for Risk and Portfolio Management,” Attilio Meucci intro-
duces a technique to generate new flexible copulas in which the
marginals can be distributed arbitrarily. To exemplify the potential
of this methodology, Meucci shows how it can be used to create
“panic” copulas for stress testing, and in a separate case study
explains how to transform copulas in order to generate new ones.
The method has been praised as it does not attribute equal proba-
bilities to all scenarios, and it is also computationally efficient.
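The standard starting point here is Sklar’s theorem: any joint
distribution F with marginals F_1, …, F_n can be written

\[ F(x_1, \ldots, x_n) = C\big( F_1(x_1), \ldots, F_n(x_n) \big) \]

where the copula C is itself a joint distribution with uniform
marginals on [0, 1]; Meucci’s construction works around precisely
that uniformity constraint.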
In Chapter 13, “A Historical-parametric Hybrid VaR”, Robin
Stuart addresses the issue of missing data in market time series. The
absence of data points can affect VaR calculations via historical
simulation and distort its outcome, and Stuart proposes a method
that combines historical and parametric VaR by taking the main
framework of a historical simulation and incorporating arguments
from the parametric method to fill the time series where needed.
Instead of using Monte Carlo simulation, as is often done in similar
circumstances, the hybrid method provides an analytical treatment
of the probability distribution of the missing data point. To achieve
that, it decomposes the portfolio P&L into the P&L of individual
market variable changes. For estimating the possible changes and
obtaining the associated probability distributions, Stuart suggests
using a multi-factor model, such as the capital asset pricing model
(CAPM), and notes that the methodology can be applied to several
asset classes and to nonlinear portfolios.
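As a one-factor illustration of the fill-in (a simplification of the
chapter’s multi-factor setting), a missing return on asset i can be
modelled conditionally on an observed market return r_m in CAPM
style,

\[ r_i = \beta_i\, r_m + \varepsilon_i, \qquad \varepsilon_i \sim N\big(0, \sigma_{\varepsilon,i}^2\big) \]

so the historical scenario supplies r_m and the parametric layer
supplies the conditional distribution of the missing r_i.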
In Chapter 14, “Impact-adjusted Valuation and the Criticality of
Leverage”, Jean-Philippe Bouchaud, Fabio Caccioli and Doyne
Farmer discuss the impact that liquidating assets may have on a
leveraged portfolio. The usual practice is to attribute the mark-to-
market value to a position, which has the effect of overestimating
the monetisable value of the trade, and the overestimation increases
with the size of the liquidation. Conversely, the level of leverage is
underestimated and the two effects together may result in signifi-
cant unexpected portfolio losses. The authors start their analysis
from an empirical law that describes the market impact as the
difference between pre-trade price and execution price as a func-
tion of the liquidating quantity, the total volume traded for that
asset and the volatility. They show that, counterintuitively, when a
position is liquidated the fall in price causes the leverage ratio to
increase momentarily. The authors propose a model that accounts
for these issues by incorporating an impact adjustment in the
pricing; the price obtained is more realistic as it estimates the
liquidation price. The method can also provide early warnings of
critical leverage levels.
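The empirical law in question is the widely documented square-root
impact law: for a metaorder of size Q in an asset with total traded
volume V and volatility σ, the average difference between pre-trade
and execution price behaves as

\[ \mathcal{I}(Q) \approx Y\, \sigma\, \sqrt{Q / V} \]

with Y a constant of order one.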

SECTION THREE – COUNTERPARTY CREDIT RISK


Until 2007, financial institutions were considered default-free and
contracts between a bank and counterparty used to take into
account only the default risk of the latter. The wave of defaults that
has shaken the markets since 2008 highlighted the necessity of
measuring the default risk of both parties. The next chapter, Jon
Gregory’s “Being Two-faced Over Counterparty Credit Risk,” is
one of the first studies made during the crisis that analyses the bilat-
eral credit risk associated with derivatives contracts. An article by
Brigo and Capponi (2008) investigated the issue around the same
time, while the first reference to bilateral counterparty risk is attrib-
uted to Duffie and Huang (1996). In addition to the risk of a
counterparty defaulting on its obligations, Gregory’s work takes
into account the probability that the dealer itself can default, and
thus combines the two different credit risks together in a frame-
work that defines bilateral CVA. If the dealer defaults first, it will in
effect experience a gain because the swap is closed out and no
future payments will be made. Gregory also shows the general
formulas for simultaneous defaults and for non-simultaneous
defaults, and, as an example, presents the case of two counterpar-
ties whose default probabilities are correlated and modelled by a
Gaussian copula. The model has been an important reference for
future developments of CVA and DVA pricing.
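In its now-standard form, and suppressing the first-to-default
conditioning that the chapter treats in full, the bilateral adjustment
is the difference of two one-sided terms,

\[ \mathrm{BCVA} = (1 - R_C) \int_0^T \mathbb{E}\big[ D(0,t)\, V_t^{+} \big]\, dPD_C(t) \;-\; (1 - R_B) \int_0^T \mathbb{E}\big[ D(0,t)\, V_t^{-} \big]\, dPD_B(t) \]

with D the discount factor, R the recovery rates, V⁺ and V⁻ the
positive and negative parts of the exposure, and the subscripts C
and B denoting counterparty and bank respectively.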
Chapter 16, “Real-time Counterparty Credit Risk Management
in Monte Carlo”, by Luca Capriotti, Jacky Lee and Matthew
Peacock, extends the application of the adjoint algorithmic differen-
tiation (AAD) to the calculation of counterparty credit risk.
Algorithmic differentiation is a methodology devised to calculate
numerically the derivative of a function specified by a computer
code, and is designed to be less time consuming than standard tech-
niques. It has been applied to several areas of physics, engineering,
meteorology, chemistry, biology and, latterly, finance. Michael
Giles and Paul Glasserman first introduced the methodology to
finance in 2006, with their seminal work in Risk magazine, “Smoking
Adjoints: Fast Monte Carlo Greeks”. These techniques are particu-
larly attractive in finance because they considerably reduce
computational costs and deliver outputs more rapidly. One obvious
application is the calculation of sensitivities of option prices, which
need to be computed continuously and be available in real time.
With standard methods, traders must find a compromise between
accuracy and computational speed, and AAD proves
party credit risk requires high computational capacity since the
number of risk factors to be considered for the computation of CVA
can be very large. In the chapter, the AAD method is applied to an
example portfolio of five swap contracts referencing distinct
commodities futures, showing that CVA and its risk measures can be
calculated 150 times faster than with the finite-differences method.
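The economics of AAD can be seen in a minimal hand-written sketch
(purely illustrative and unrelated to the authors’ implementation):
one backward sweep yields the sensitivity of a present value to
every input rate at a cost comparable to a single extra valuation,
instead of one bumped revaluation per input as with finite
differences.

```python
import math

def pv_and_rate_adjoints(cashflows, zero_rates, times):
    """Forward pass: PV = sum_i cf_i * exp(-r_i * t_i).
    Reverse (adjoint) pass: dPV/dr_i for every i in one sweep."""
    # forward sweep, keeping intermediates on a 'tape'
    dfs = [math.exp(-r * t) for r, t in zip(zero_rates, times)]
    pv = sum(cf * df for cf, df in zip(cashflows, dfs))
    # reverse sweep: seed pv_bar = 1 and apply the chain rule backwards
    pv_bar = 1.0
    df_bars = [pv_bar * cf for cf in cashflows]      # dPV/d(df_i)
    rate_bars = [db * (-t) * df                      # through d(exp)/dr
                 for db, t, df in zip(df_bars, times, dfs)]
    return pv, rate_bars

# toy usage: all five rate sensitivities of a unit annuity in one pass
pv, dpv_dr = pv_and_rate_adjoints(
    cashflows=[1.0] * 5,
    zero_rates=[0.020, 0.022, 0.025, 0.027, 0.030],
    times=[1.0, 2.0, 3.0, 4.0, 5.0],
)
```

Here the adjoint pass costs roughly the same handful of operations
as the forward pass, which is the source of the speed-ups the
chapter reports.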
In Chapter 17, “Counterparty Risk Capital and CVA,” Michael
Pykhtin proposes a general framework to calculate counterparty
credit risk (CCR) that includes CVA consistently with the Basel III
regulatory package, published by the Basel Committee on Banking
Supervision (BCBS) in 2010. The chapter navigates the definitions
of CCR, CVA and bilateral CVA. It explains how the last of these –
an approach that allows the two parties to agree on a price – is a
function of loss given default (LGD), expected exposure and default
probability, and that it takes into account the joint default proba-
bility and the first-to-default entity. In the general setting of an
asymptotic single risk factor model, Pykhtin shows two applica-
tions for the proposed framework. The first, a market risk
approach indicated for use by sophisticated banks that actively and
dynamically hedge CCR, allows banks to calculate VaR for a
trading book comprising both market risk and CCR
simultaneously. The second approach treats CCR separately from
market risk and is more suitable for banks that do not actively
manage CCR. Finally, Pykhtin discusses the minimum capital
requirements under Basel II and Basel III, and argues that the CVA
capital charge for Basel III, as it is calculated independently from
market risk, could incentivise risk taking. He concludes by
proposing a solution to this issue.
In Chapter 18, “Partial Differential Equation Representations of
Derivatives with Bilateral Counterparty Risk and Funding Costs,”
Christoph Burgard and Mats Kjaer propose a unified framework in
which the creditworthiness of the dealer and its subsequent effects
on funding costs and bilateral counterparty risk are taken into
account. The model is derived as an extension of the Black–Scholes
partial differential equation (PDE) that includes a funding compo-
nent, which may differ for lending and borrowing. The model is
based on a controversial assumption – that there exists the possi-
bility for the bank to buy back its own bonds in order to hedge its
credit risk. Some say this operation cannot be executed, as it is tech-
nically not possible for a bank to have a long position in its own
debt (see, for example, Castagna, 2012). However, assuming DVA is
replicable, the model is presented in two settings. In the first, the
mark-to-market value of a derivative at default includes counter-
party credit risk, while in the second it does not. In the latter
situation, the authors obtain a linear PDE whose Feynman–Kac
representation (a formula that allows for solving certain types of
PDEs) makes it easily tractable. One example shows how large the
impact on CVA can be if funding is taken into account. The work is
considered one of the most influential on the subject.
Damiano Brigo and Massimo Morini, in “Close-out Convention


Tensions” (Chapter 19), address an issue that is rarely dealt with in


the quantitative finance literature: the close-out value of derivatives
at default. Prior to the crisis, post the default of a counterparty, the
non-defaulting counterparty would receive a portion (determined
by the recovery rate) of the close-out value, which was calculated as
the expected discounted value of the future payments of the swap.
De facto, this method assumes future payments are risk-free.
However, because ISDA does not identify in its protocol (2009)
which close-out valuation approach has to be adopted, and only
mentions the possibility of a replacement close-out, it is hotly
debated whether or not the value should take into account the cred-
itworthiness of the non-defaulting party. Brigo and Morini explain
the advantages and disadvantages of a risk-free and replacement
close-out, both from the point of view of the debtor and creditor.
The outcome is mixed and the question as to which of the two is
preferable is left to the regulators to answer. Finally, the authors
also present a bilateral counterparty risk framework that incorpo-
rates a replacement close-out feature.
In Chapter 20, “Cutting CVA’s Complexity,” Pierre Henry-
Labordère presents an algorithm aimed at reducing computational
costs of CVA calculations. The bilateral counterparty risk calcula-
tion is expressed in the form of nonlinear, second-order, partial
differential equations, as presented by Burgard and Kjaer in Chapter
18. Solving this equation numerically is computationally cumbersome, and high dimensionality impedes the use of finite difference methods – therefore, Monte Carlo simulation is the only available instrument. Henry-Labordère proposes a solution by simulating
backward stochastic differential equations (BSDEs), and to calcu-
late the conditional expectations of defaults he suggests adopting a
Galton–Watson process – a statistical tool originally devised in 1875
for demographic analysis. The adoption of the process helps in the
modelling of default probabilities and recovery rates, and even if its
implementation is not straightforward, the method results in a
reduction of complexity and computing time.
The intention of this selection of articles is to present the reader
with a near exhaustive spectrum of topics that comprise the fore-
most concerns of quants and their employers. The crisis has shown
that value adjustments are of paramount importance when it comes
to pricing financial instruments. And it is now clear to every player


in the market that these issues need to be studied thoroughly and


prices need to account for credit risk, cost of funding, discounting
rates and liquidity. The result is a complex combination of factors,
which need to be captured by models in a consistent framework.
Computational limits need also be considered, and work towards
devising clever algorithms will allow greater accuracy and
reliability.
This book deals with these issues in detail and aims to provide
the building blocks for a post-crisis quantitative finance world
(where “post-crisis” is a stochastic variable with unknown time
boundaries). Multi-curve environments, collateralisation of trades,
the effects of new regulation and the uncertainty associated with it,
and the awareness of the flaws in risk measurement, are all compo-
nents of a new reality whose foundations have been laid in the past
four years. I trust the reader will find the ideas in this book both
inspiring and constructive.

Mauro Cesa

REFERENCES

Andreasen, J. and B. Huge, 2013, “Expanded Local Volatility,” Risk, January.

Basel Committee on Banking Supervision (BCBS), 2010, "Basel III: A Global Regulatory
Framework for more Resilient Banks and Banking Systems," December.

Bergomi, L., 2008, “Smile Dynamics III,” Risk, October, pp 90–96.

Brace A., 2010, “Multiple Arbitrage Free Interest Rate Curves,” preprint, National
Australia Bank.

Brigo, D. and A. Capponi, 2008, "Bilateral Counterparty Risk with Stochastic Dynamical
Models" (available at SSRN or arXiv.org).

Castagna, A., 2012, “The Impossibility of DVA Replication,” Risk, November, pp 66–70.

Duffie, D. and M. Huang, 1996, “Swap Rates and Credit Quality,” Journal of Finance,
51(3), pp 921–49.

Dupire, B., 1994, “Pricing with a Smile,” Risk, January.

Fujii, M. and A. Takahashi, 2011, “Choice of Collateral Currency,” Risk, pp 120–25.

Giles, M. B. and P. Glasserman, 2006, “Smoking Adjoints: Fast Monte Carlo Greeks,” Risk, January.

International Swaps and Derivatives Association, 2009, “ISDA Close-out Amount Protocol,” October.



post-crisis quant finance

Lipton, A., 2002, “The Volatility Smile Problem,” Risk, February.

Mercurio, F., 2010, “Modern Libor Market Models: Using Different Curves for Projecting Rates and for Discounting,” International Journal of Theoretical and Applied Finance, 13(1), pp 113–37.

Tuckman, B. and P. Porfirio, 2003, “Interest Rate Parity, Money Market Basis Swaps and
Cross-currency Basis Swaps,” Fixed Income Liquid Market Research, Lehman Brothers,
June.

Vasicek, O., 2002, “The Distribution of Loan Portfolio Value,” Risk, December.



Section 1

Derivatives Pricing

1
Smile Dynamics IV
Lorenzo Bergomi
Société Générale

In previous works (Bergomi, 2004, 2005, 2008), we studied the


dynamical properties of popular smile models and proposed a new
framework for specifying stochastic volatility models with the
objective of controlling some of their dynamical properties, such as
the term structure of the volatilities of volatilities, the level of short
forward skew and the smile of volatility of volatility.
While these issues are mostly relevant for pricing and risk
managing exotic options, the subject of the joint dynamics of spot
and implied volatilities has wider relevance, both for managing
exotic and vanilla books.
Stochastic volatility models can be assessed either synchroni-
cally, by examining the strike and maturity dependence of the smile
they produce, or diachronically, by studying the dynamics of vola-
tilities they generate. How are these two aspects of a model related?
Is one a reflection of the other and is this connection quantifiable? If
so, where a violation of this relationship is observed on market
smiles, can it be arbitraged?
These are the issues we address in this chapter, for general
stochastic volatility models based on diffusion processes. In the first
section, we derive a relationship, at first order in the volatility of
volatility, linking two features – one static, one dynamic – of
stochastic volatility models:

❑❑ the rate at which the at-the-money-forward (ATMF) skew decays


with maturity; and
❑❑ the rate at which the ATMF volatility moves when the spot
moves,


which will prompt us to introduce a new quantity: the skew stick-


iness ratio (SSR).
In the second section, we address the issue of practically materi-
alising the profit and loss (P&L) resulting from a difference between
implied and realised SSR, focusing on short maturities.

The skew stickiness ratio


The vanilla ATM skew
Let us assume a general stochastic volatility model driven by
Brownian motion. Quite generally, the dynamics in a stochastic
volatility model can be formulated using as basic objects forward variances $\xi_t^T$: $\xi_t^T$ is the instantaneous variance for date $T$, observed at time $t$: $\xi_t^T = d((T-t)\,\hat\sigma_{tT}^2)/dT$, where $\hat\sigma_{tT}$ is the implied variance swap (VS) volatility for maturity $T$, observed at time $t$. The $\xi_t^T$ have no drift. The dynamics of the $\xi^T$ may or may not have a low-dimensional Markov representation. While, for example, the Heston model allows for a one-dimensional representation built on the instantaneous variance, we can imagine an extreme case where the variance curve is driven by a Brownian sheet: each $\xi^T$ is driven by its own Brownian motion.
Let us write a general stochastic volatility model driven by a
diffusion process as:
$$dS_t^{\omega} = (r - q)\,S_t^{\omega}\,dt + \sqrt{\xi_t^t}\,S_t^{\omega}\,dZ_t$$
$$d\xi_t^T = \omega\,\xi_t^T \sum_{i=1}^{n} \lambda_{it}^{T}\,dW_t^i \qquad (1.1)$$

where $Z_t$ is a Brownian motion, $W_t$ is a vector of $n$ Brownian motions – all possibly correlated – and $\omega$ is a common scale factor for volatilities of volatilities $\lambda_{it}^T$ that may depend very generally on the curve $\xi_t$ and time, but not on $S_t$. Without loss of generality we factor $\xi_t^T$ out of $\lambda_{it}^T$. When $\omega = 0$, volatilities are not stochastic anymore: $\xi_t^T = \xi_0^T$, where $\xi_0$ is the variance curve calibrated at $t = 0$. Let us expand $\xi_t^T$ at first order in $\omega$. We have:

$$\xi_t^T = \xi_0^T\left(1 + \omega \int_0^t \sum_{i=1}^n (\lambda_{i\tau}^T)^0\,dW_\tau^i\right)$$

where $(\lambda_{i\tau}^T)^0$ is evaluated in the unperturbed state ($\omega = 0$), in which forward variances are frozen and $S_t^0$ is lognormal. We now derive an expression of the ATMF skew for maturity $T$, $\mathcal{S}_T$, at first order in $\omega$


by evaluating the skewness of $x_T = \ln(S_T/F_T)$ and using the well-known approximation relating the ATMF skew to the skewness $s_T$ of $x_T$:¹

$$\mathcal{S}_T = \left.\frac{d\hat\sigma_K^T}{d\ln K}\right|_{F} = \frac{s_T}{6\sqrt{T}}$$

For the sake of analytical tractability, we assume that $(\lambda_{i\tau}^T)^0$ does not depend on $S_t$. This restricts our analysis to pure stochastic volatility models with no local volatility component.
$s_T$ is given by $s_T = M_3^T/(M_2^T)^{3/2}$, where $M_i^T = \langle (x_T - \langle x_T\rangle)^i\rangle$ and $\langle X\rangle$ denotes $E[X]$. Let us denote by $\delta\xi_t$ the perturbation of the instantaneous variance at time $t$ at order one in $\omega$:

$$\xi_t^t = \xi_0^t + \delta\xi_t,\qquad \delta\xi_t = \omega\,\xi_0^t \int_0^t \sum_{i=1}^n (\lambda_{i\tau}^t)^0\,dW_\tau^i$$

For $\omega = 0$, $M_3 = 0$. At lowest order, $M_3$ is thus of order one in $\omega$. We then need to compute $M_3$ at order one and $M_2$ at order zero in $\omega$:

$$x_T - \langle x_T\rangle = \int_0^T \sqrt{\xi_0^t + \delta\xi_t}\,dZ_t - \frac{1}{2}\int_0^T (\xi_0^t + \delta\xi_t)\,dt + \frac{1}{2}\int_0^T \xi_0^t\,dt$$
$$= \int_0^T \sqrt{\xi_0^t}\,dZ_t + \frac{1}{2}\int_0^T \frac{\delta\xi_t}{\sqrt{\xi_0^t}}\,dZ_t - \frac{1}{2}\int_0^T \delta\xi_t\,dt$$

$$M_2 = \int_0^T \xi_0^t\,dt$$

$$M_3 = \frac{3}{2}\,E\left[\left(\int_0^T \sqrt{\xi_0^t}\,dZ_t\right)^{\!2}\left(-\int_0^T \delta\xi_t\,dt + \int_0^T \frac{\delta\xi_t}{\sqrt{\xi_0^t}}\,dZ_t\right)\right]$$

Evaluating the expectation for $M_3$, we get:

$$M_3 = 3\omega \int_0^T dt\,\xi_0^t \int_0^t \sqrt{\xi_0^\tau}\,\sum_{i=1}^n \rho_{iS}\,(\lambda_{i\tau}^t)^0\,d\tau$$

where $\rho_{iS}$ is the correlation between $Z$ and $W^i$. This expression can be rewritten as:

$$M_3 = 3\int_0^T dt \left(\int_0^t E\left[\frac{dS_\tau^0}{S_\tau^0}\,\delta\xi_t\right]\right) \qquad (1.2)$$
Equation 1.2 shows that $M_3$ is given at first order in the volatility of volatility by the double integral of the spot/volatility covariance function. The expectation $E[\frac{dS_\tau^0}{S_\tau^0}\,\delta\xi_t]$ quantifies how much a move of the (unperturbed) spot at time $\tau$ is correlated with the fluctuation of the instantaneous variance at a later time $t$. Let us define the spot/volatility covariance function $f$ as:

$$f(\tau, t) = \frac{1}{d\tau}\,E\left[\frac{dS_\tau^0}{S_\tau^0}\,\delta\xi_t\right] \qquad (1.3)$$

At first order in the volatility of volatility, the ATMF skew is then given by:

$$\mathcal{S}_T = \frac{1}{2\sqrt{T}}\,\frac{\int_0^T dt \int_0^t f(\tau, t)\,d\tau}{\left(\int_0^T \xi_0^t\,dt\right)^{3/2}} \qquad (1.4)$$

For an illustration of the accuracy of formula 1.4, see Figure 1.1 for the case of a two-factor lognormal model for forward variances.

The skew stickiness ratio


Different models generate different deltas for vanilla options as
they imply different scenarios for implied volatilities, conditional
on a move of the spot. Market-makers on index options empirically
adjust their deltas by making an assumption for the following ratio:
$$r_T = \frac{1}{\left.\frac{d\hat\sigma_K^T}{d\ln K}\right|_F}\,\frac{d\hat\sigma_F^T}{d\ln S}$$

which quantifies how much the ATMF volatility $\hat\sigma_F^T$ moves conditional on a move of $S$. They have coined names for two types of market regimes: sticky-strike ($r = 1$) and sticky-delta ($r = 0$).
While they may be correlated with S, volatilities are not functions
of $S$. Let us introduce the SSR $R_T$, which we define as:

$$R_T = \frac{1}{\left.\frac{d\hat\sigma_K^T}{d\ln K}\right|_F}\,\frac{E\big[d\hat\sigma_F^T\,d\ln S\big]}{E\big[(d\ln S)^2\big]}$$

$R_T$ is then the regression coefficient of $d\hat\sigma_F^T$ on $d\ln S$ in units of the ATMF skew. The values of $R$ for some classes of models are well known:

❑❑ in models built with jump or Lévy processes, R = 0;


❑❑ in local volatility models, for weak skews, R = 2; and
❑❑ in stochastic volatility models, for short maturities and weak
skews, R = 2.


As we are working at order one in $\omega$ and the numerator of $R_T$ is of order one in $\omega$, using either the VS or the ATMF volatility is indifferent, as their difference is of order one in $\omega$. For the purpose of calculating the numerator, we then use the VS volatility, whose variation at order one in $\omega$ at lowest order in $dt$ is given by:

$$d\hat\sigma_t^T = \frac{1}{2\hat\sigma_t^T (T-t)} \int_t^T d\xi_t^u\,du$$

where dξ ut is given by Equation 1.1. Taking now the expectation


E [dσ^ tT d ln S] and keeping only terms at order one in ω, we get:
1 T
E [dσ̂ tT d ln St ] =
2σ̂ tT (T − t )
∫ t
E ⎡⎣dξ tud ln St ⎤⎦ du

T
1 ⌠ ⎡ dSt0 ⎤
= ⎮ E ⎢ 0 δξ u ⎥ du
2σ̂ tT (T − t ) ⌡t ⎣ St ⎦
We now divide by $\langle(d\ln S)^2\rangle$ and evaluate expectations at $t = 0$, making use of the definition of $f$ in Equation 1.3 and the expression of the ATMF skew 1.4 to get:

$$R_T = \frac{\int_0^T \xi_0^t\,dt}{T\,\xi_0^0}\;\frac{T\int_0^T f(0,u)\,du}{\int_0^T dt\int_0^t f(\tau,t)\,d\tau} \qquad (1.5)$$

Notice how, except for some dependence on the term structure of the variance curve, this expression for $R_T$, as well as expression 1.4 for $\mathcal{S}_T$, involves the same ingredient: the spot/volatility covariance function. The common dependence of $R_T$ and $\mathcal{S}_T$ on $f$ supplies the connection between a static feature of the smile – the term structure of the ATMF skew – and a dynamic property – the SSR.²
We now study the limit of RT and ST when T → 0 then characterise
further the relationship between the SSR and the ATMF skew for the
case of a time-homogeneous model and a flat variance curve.

Short-maturity limit of the ATMF skew and the SSR


Let us take the limit $T \to 0$. Using expression 1.4:

$$\mathcal{S}_0 = \lim_{T\to 0}\frac{1}{2\sqrt{T}}\,\frac{\int_0^T dt\int_0^t f(\tau,t)\,d\tau}{\left(\int_0^T \xi_0^t\,dt\right)^{3/2}} = \frac{f(0,0)}{4\,(\xi_0^0)^{3/2}} \qquad (1.6)$$


The short skew has a finite limit that directly measures the covariance function at the origin. Let us now turn to $R$. The pre-factor in Equation 1.5 tends to one and we get:

$$R_0 = \lim_{T\to 0}\frac{T\int_0^T du}{\int_0^T dt\int_0^t d\tau} = 2 \qquad (1.7)$$

We recover for short maturities, at first order in volatility of vola-


tility, the same value for stochastic volatility as for local volatility
models. We had pointed out this general property in Bergomi (2004)
explaining why, for short maturities and weak skews, the dynamics
– and hence the deltas – in stochastic volatility and local volatility
models calibrated on the same ATMF skew were identical. Our
calculation at order one in the volatility of volatility is also of order
one in the spot/volatility correlations ρiS: stochastic and local vola-
tility models behave differently if the smile near the money is
dominated by curvature (ρiS = 0) rather than skew.

Scaling behaviour of ST and RT for a time-


homogeneous model and a flat term structure of
variance
Let us now assume that the term structure of variance is flat and
that the underlying model is time-homogeneous, so that the covari-
ance function is a function of t − τ only: f(τ, t) ≡ f(t − τ). We now get
simpler expressions for ST and RT:
$$\mathcal{S}_T = \frac{\int_0^T (T-t)\,f(t)\,dt}{2\,\xi_0^{3/2}\,T^2},\qquad R_T = \frac{\int_0^T f(t)\,dt}{\int_0^T \left(1-\frac{t}{T}\right) f(t)\,dt}$$

Admissible range for RT


The expression for $R_T$ can be rewritten as:

$$R_T = \frac{g(T)}{\frac{1}{T}\int_0^T g(t)\,dt}$$

where $g(t) = \int_0^t f(u)\,du$. Let us make the natural assumption that $f(u)$ decays monotonically towards zero as $u \to \infty$. $R_T$ is the ratio of $g(T)$ – either positive increasing concave or negative decreasing convex, depending on the sign of $f$ – to its average value over $[0, T]$. Thus $R_T \geq 1$. Using the fact that $g(t)/g(T) \geq t/T$ yields an upper bound for $R_T$: $R_T \leq 2$.


[Figure 1.1: Approximate and actual 95/105 skew (in %) in a two-factor model, as a function of $T$ (from 0 to 2.0 years)]

We then have the following model-independent range for RT:

1 ≤ RT ≤ 2
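A quick numerical illustration of these bounds, assuming an exponentially decaying covariance function (one admissible monotonic choice):

```python
# Check 1 <= R_T <= 2 for f(u) = exp(-k*u), using
# R_T = g(T) / ((1/T) * int_0^T g(t) dt), with g(t) = int_0^t f(u) du.
import numpy as np

def ssr(T, k=0.35, n=10_000):
    t = np.linspace(0.0, T, n)
    g = (1.0 - np.exp(-k * t)) / k   # closed form of g(t)
    return g[-1] * T / np.trapz(g, t)

for T in (0.25, 1.0, 5.0, 20.0):
    print(T, round(ssr(T), 4))       # drifts from ~2 (short T) towards 1
```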

Scaling of ST and RT
Let us investigate the scaling behaviour of $\mathcal{S}_T$ and $R_T$ by assuming that for large time separations, $f$ decays algebraically with exponent $\gamma$: $f(u) \propto u^{-\gamma}$. As $T \to \infty$, the integral $\int_0^T f(u)\,du$ either scales like $T^{1-\gamma}$ if $\gamma < 1$ or tends to a constant if $\gamma > 1$. Working out the limiting regimes for $\mathcal{S}_T$ and $R_T$, we get:

❑❑ (Type I) If $\gamma > 1$: $\mathcal{S}_T \propto \frac{1}{T}$ and $\lim_{T\to\infty} R_T = 1$.
❑❑ (Type II) If $\gamma < 1$: $\mathcal{S}_T \propto \frac{1}{T^{\gamma}}$ and $\lim_{T\to\infty} R_T = 2-\gamma$.

It is easy to check that exponential decay falls into the type I


category. Let us comment on these results:

❑❑ If f(u) decays faster than 1/u or exponentially, the ATMF skew


decays like 1/T and the long-maturity limit of the SSR is one. We
have already reported this property for the specific case of the
Heston model (Bergomi, 2004). The fact that ST decays like 1/T


can be understood by realising that, if the spot/volatility covari-


ance function decays too rapidly, increments of ln(St) become
independent. This leads to the 1/T scaling for the ATMF skew, a
feature shared with jump and Lévy models.
❑❑ If f(u) decays more slowly than 1/u, ST decays with the same expo-
nent as f and RT tends to the non-trivial limit 2 − γ.

The connection between the decay of the ATMF skew and the
long-maturity limit of the SSR can be summarised compactly by the
following formula: if the spot/volatility covariance function has
either algebraic or exponential decay, then for long maturities:

$$\mathcal{S}_T \propto \frac{1}{T^{2-R^*}} \qquad (1.8)$$

with:

$$R^* = \lim_{T\to\infty} R_T$$

Type II behaviour in a two-factor model


Of the two types of scaling listed above for ST and RT, type II is the
most interesting. Can we build a model that generates such behav-
iour? Consider a model of the following type (Bergomi, 2008):

$$d\xi_t^T = \xi_t^T\,\omega \sum_i w_i\,e^{-k_i(T-t)}\,dW_t^i$$

where $w_i$ are positive weights. Assuming a flat initial VS term structure, $f$ has the following form:

$$f(\tau) = \omega\,(\xi_0)^{3/2} \sum_i w_i\,\rho_{Si}\,e^{-k_i\tau}$$

Plugging this expression in Equations 1.4 and 1.5 yields:

$$\mathcal{S}_T = \frac{\omega}{2}\sum_i w_i\,\rho_{Si}\,\frac{k_iT - (1 - e^{-k_iT})}{(k_iT)^2}$$
$$R_T = \frac{\sum_i w_i\,\rho_{Si}\,\frac{1-e^{-k_iT}}{k_iT}}{\sum_i w_i\,\rho_{Si}\,\frac{k_iT-(1-e^{-k_iT})}{(k_iT)^2}} \qquad (1.9)$$

$f$ is a linear combination of exponentials: for large $u$, $f(u) \propto e^{-\min_i(k_i)\,u}$. Thus, when $T \to \infty$, $\mathcal{S}_T \propto 1/T$ and $R_T \to 1$: the model eventually behaves like type I. However, by suitably choosing parameters, it is

[Figure 1.2: $\ln(\mathcal{S}_T \cdot \ln(95/105))$ as a function of $\ln(T)$ (top); $R(T)$ (bottom, between 1.0 and 2.0, for $T$ from 0 to 10)]

possible to generate a power law-like behaviour for f over a suffi-


ciently wide range of maturities.
Let us take two factors and use the following parameters (these
are values we used in Bergomi, 2008): k1 = 8.0, k2 = 0.35, w1 = 72%, w2 =
28%, ρS1 = −70%, ρS2 = −35.7%, ω = 3.36, which are typical of equity
index skews and volatilities of volatilities. Figure 1.1 shows a
comparison of the 95/105 skew in volatility points calculated either
by direct Monte Carlo simulation of the model, or using Equation
1.9.
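These closed forms are straightforward to evaluate; the following is a direct transcription of Equation 1.9 in Python (flat variance curve assumed), using the parameter values quoted above:

```python
# Skew S_T and SSR R_T from Equation 1.9 with the two-factor parameters.
import numpy as np

k     = np.array([8.0, 0.35])
w     = np.array([0.72, 0.28])
rho_S = np.array([-0.70, -0.357])
omega = 3.36

def skew_and_ssr(T):
    kT = k * T
    num = (1.0 - np.exp(-kT)) / kT              # (1 - e^{-kT}) / (kT)
    den = (kT - (1.0 - np.exp(-kT))) / kT**2    # (kT - (1 - e^{-kT})) / (kT)^2
    S_T = 0.5 * omega * np.sum(w * rho_S * den)
    R_T = np.sum(w * rho_S * num) / np.sum(w * rho_S * den)
    return S_T, R_T

for T in (0.25, 0.5, 1.0, 2.0, 5.0):
    S_T, R_T = skew_and_ssr(T)
    print(f"T={T}: skew={S_T:.4f}, SSR={R_T:.3f}")  # SSR ~2 short, ~1.5 shoulder
```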
The top of Figure 1.2 shows the 95/105 skew, defined as $\mathcal{S}_T \cdot \ln(95/105)$, as a function of $T$ in a log/log plot, for maturities from three months to five years. It is almost a straight line with slope of about $-1/2$ ($-0.51$). This exponent is a well-known typical feature of equity smiles. The bottom of Figure 1.2 shows $R_T$ as a function of $T$.


As expected, RT starts from two and tends for long maturities to


one. Note, however, the shoulder around 1.5 for intermediate matu-
rities. This can be traced to the scaling of ST: initially ST decays
approximately algebraically with power 1/2. Consequently,
according to Equation 1.8 RT initially stabilises to a value equal to 2
− 1/2 = 1.5. Eventually, for longer maturities, the exponential decay
of $f$ kicks in ($f(\tau) \sim e^{-k_2\tau}$), so that $\mathcal{S}_T$ decays like $1/T$ and $R_T$ tends to its

long-maturity limit of one.


Even though the model becomes of type I when T → ∞, we are
able to get type II behaviour over a range of maturities that is wide
enough for practical purposes.

Type II behaviour with the Eurostoxx 50 index


As mentioned above, the ATMF skew of equity smiles typically
decays like 1/T 1/2. This would suggest type II behaviour, but is this
confirmed by the value of RT? Let us here look at the realised SSR of
the Eurostoxx 50 index for different maturities, measured using the
ATM volatility. Figure 1.3 displays the three-month running average
of RT for maturities of one month, six months and two years, since
May 2004, calculated as:

$$R_T = \frac{\sum_i \big(\hat\sigma_{i+1}-\hat\sigma_i\big)\,\ln\!\left(\frac{S_{i+1}}{S_i}\right)}{\left.\frac{d\hat\sigma_K^T}{d\ln K}\right|_S\,\sum_i \ln\!\left(\frac{S_{i+1}}{S_i}\right)^{\!2}}$$
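In code, the estimator reads as follows (a sketch; `atm_vol`, `spot` and `atm_skew` are assumed inputs: daily series over the window and the market ATM skew for the relevant maturity):

```python
# Realised SSR over a window of daily observations, per the formula above.
import numpy as np

def realised_ssr(atm_vol, spot, atm_skew):
    dvol = np.diff(atm_vol)            # sigma_{i+1} - sigma_i
    dlns = np.diff(np.log(spot))       # ln(S_{i+1} / S_i)
    return np.sum(dvol * dlns) / (atm_skew * np.sum(dlns**2))
```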

[Figure 1.3: Three-month running average of $R_T$ for the Eurostoxx 50 index, for two-year, six-month and one-month maturities, May 2004 to May 2009 (values between 0 and 2.0)]


We observe that:

❑❑ The SSR usually lies in the interval [1, 2]; and


❑❑ The SSR is very noisy and is affected by sudden and simulta-
neous changes in spot and ATM volatility. However, over the
past five years, the average value of the SSR for long-dated
options has been notably larger than one and hovers around 1.4.
This number is – in our framework – compatible with a power
law decay of the skew with an exponent around 1/2.

Equity volatility markets thus seem to behave like type II. Note,
however, that the short-maturity limit of the SSR (Equation 1.7) is
two whether the model be of type I or type II. This value is very
different from the historical value of the SSR for one-month options
on the Eurostoxx 50. As Figure 1.3 shows, its value is always lower
than two, at times markedly.
It is then natural to ask whether this can be arbitraged: is it
possible to implement an option strategy whose P&L is 2 − R0?

Arbitraging the SSR for short maturities


We consider short maturities, for which the distinction between
ATMF and ATM strikes is not relevant. Let us use the ATM vola-
tility, which we denote by σ0. We also neglect interest rate and repo
effects. The SSR relates the level of the market ATMF skew to the
spot/volatility covariance. Arbitraging the SSR entails being able to
materialise this covariance as a P&L, and so requires a hedging
model in which at least both S and σ0 can move.
We now develop a model for the joint dynamics of S and the
smile, using as dynamical variables S and σ0. Our goal is to be able
to write the P&L of a delta-hedged, σ0-hedged vanilla option as:

$$\mathrm{P\&L} = \frac{1}{2}S^2\frac{d^2Q}{dS^2}\left(\left(\frac{\delta S}{S}\right)^{\!2} - \sigma_S^2\,\delta t\right) + \frac{1}{2}\sigma_0^2\frac{d^2Q}{d\sigma_0^2}\left(\left(\frac{\delta\sigma_0}{\sigma_0}\right)^{\!2} - \nu^2\,\delta t\right) + S\sigma_0\frac{d^2Q}{dS\,d\sigma_0}\left(\frac{\delta S}{S}\frac{\delta\sigma_0}{\sigma_0} - \rho\sigma_S\nu\,\delta t\right) \qquad (1.10)$$


with the crucial condition that the breakeven levels $\sigma_S$, $\nu$, $\rho$ be strike-independent – unlike the Black–Scholes implied volatility $\hat\sigma_K$ – and such that the market smile is recovered.

A model for short near-the-money options


Let us consider short-maturity vanilla options. Let us introduce
moneyness x = ln(K/S) and parameterise the smile near the money
as:
$$\hat\sigma(x) = \sigma_0\left(1 + \alpha(\sigma_0)\,x + \frac{\beta(\sigma_0)}{2}\,x^2\right) \qquad (1.11)$$

The smile is characterised by three quantities: $\sigma_0$, the skew $\sigma_0\alpha(\sigma_0)$ and the curvature $\sigma_0\beta(\sigma_0)$. $\alpha$ and $\beta$ are functions of $\sigma_0$, and $\hat\sigma(x)$ has no explicit dependence on $T - t$.
The price of an option is then $Q(S, K, \sigma_0, \alpha, \beta, T) = P_{BS}(S, K, \hat\sigma(x), T)$, where $P_{BS}(S, K, \hat\sigma, T)$ is the Black–Scholes formula. The P&L of a delta-hedged, $\sigma_0$-hedged vanilla option reads, at order $\delta t$:
$$\mathrm{P\&L} = \frac{dQ}{dt}\,\delta t + \frac{1}{2}S^2\frac{d^2Q}{dS^2}\left(\frac{\delta S}{S}\right)^{\!2} + \frac{1}{2}\sigma_0^2\frac{d^2Q}{d\sigma_0^2}\left(\frac{\delta\sigma_0}{\sigma_0}\right)^{\!2} + S\sigma_0\frac{d^2Q}{dS\,d\sigma_0}\,\frac{\delta S}{S}\frac{\delta\sigma_0}{\sigma_0} \qquad (1.12)$$

Our parameterisation for the smile 1.11, together with the assumption that $S$, $\sigma_0$ are the only dynamical quantities in our model, is consistent only if we are able to find breakeven levels $\sigma_S$, $\nu$, $\rho$ that make this P&L vanish on average, irrespective of the strike considered. Since $\hat\sigma$ has no explicit time-dependence, the theta in our model is the same as the Black–Scholes theta: $dQ/dt = dP_{BS}/dt$. Our consistency requirement can be stated as:

$$-\frac{dQ}{dt} = \frac{1}{2}S^2\hat\sigma_K^2\,\frac{d^2P_{BS}^K}{dS^2} = \frac{1}{2}S^2\sigma_S^2\,\frac{d^2Q^K}{dS^2} + \frac{1}{2}\sigma_0^2\nu^2\,\frac{d^2Q^K}{d\sigma_0^2} + S\sigma_0\,\rho\sigma_S\nu\,\frac{d^2Q^K}{dS\,d\sigma_0} \qquad (1.13)$$

In other words, we need to be able to split the Black–Scholes theta into three pieces – matching our three gammas.
Inspection of the derivatives $d^2P_{BS}/dS^2$, $d^2P_{BS}/dS\,d\sigma$, $d^2P_{BS}/d\sigma^2$ shows that they all have the same pre-factor $SN'(d)/(\sigma\sqrt{T})$, which encapsulates the singularity at $T = 0$, where:

$$d = \frac{-x + \frac{\sigma^2 T}{2}}{\sigma\sqrt{T}}$$

Factoring this pre-factor out and keeping terms at order two in $x$ and order zero in $T$ yields the following expressions for the Greeks in Equation 1.12:

$$\frac{dQ}{dt} = -\frac{1}{2}\,\frac{SN'(d)}{\sqrt{T}}\,\sigma_0\left(1+\alpha x+\frac{\beta}{2}x^2\right)$$
$$\frac{1}{2}S^2\frac{d^2Q}{dS^2} = \frac{1}{2}\,\frac{SN'(d)}{\sigma_0\sqrt{T}}\left(1-3\alpha x+\left(6\alpha^2-\frac{5}{2}\beta\right)x^2\right)$$
$$\frac{1}{2}\sigma_0^2\frac{d^2Q}{d\sigma_0^2} = \frac{1}{2}\,\frac{SN'(d)}{\sigma_0\sqrt{T}}\,x^2$$
$$S\sigma_0\frac{d^2Q}{dS\,d\sigma_0} = \frac{SN'(d)}{\sigma_0\sqrt{T}}\left(x-(2\alpha-\sigma_0\alpha')\,x^2\right)$$
where $\alpha' = d\alpha/d\sigma_0$. Plugging these expressions into Equation 1.12 yields:

$$\mathrm{P\&L} = \frac{1}{2}\,\frac{N'(d)\,S}{\sigma_0\sqrt{T}}\Bigg[-\sigma_0^2\,\delta t\left(1+\alpha x+\frac{\beta}{2}x^2\right) + \left(1-3\alpha x+\left(6\alpha^2-\frac{5}{2}\beta\right)x^2\right)\left(\frac{\delta S}{S}\right)^{\!2} + x^2\left(\frac{\delta\sigma_0}{\sigma_0}\right)^{\!2} + 2\left(x-(2\alpha-\sigma_0\alpha')\,x^2\right)\frac{\delta S}{S}\frac{\delta\sigma_0}{\sigma_0}\Bigg]$$

Let us now find breakeven levels:


$$\left\langle\left(\frac{\delta S}{S}\right)^{\!2}\right\rangle = \sigma_S^2\,\delta t,\qquad \left\langle\left(\frac{\delta\sigma_0}{\sigma_0}\right)^{\!2}\right\rangle = \nu^2\,\delta t,\qquad \left\langle\frac{\delta S}{S}\,\frac{\delta\sigma_0}{\sigma_0}\right\rangle = \rho\sigma_S\nu\,\delta t$$

that make the P&L vanish for all x. Grouping terms by powers of x
we get:

$$\mathrm{P\&L} = \frac{1}{2}\,\frac{N'(d)\,S}{\sigma_0\sqrt{T}}\Big[\big(-\sigma_0^2+\sigma_S^2\big) + \big(-\sigma_0^2\alpha - 3\alpha\sigma_S^2 + 2\rho\sigma_S\nu\big)\,x + \Big(-\sigma_0^2\tfrac{\beta}{2} + \big(6\alpha^2-\tfrac{5}{2}\beta\big)\,\sigma_S^2 + \nu^2 - 2\,(2\alpha-\sigma_0\alpha')\,\rho\sigma_S\nu\Big)x^2\Big]\,\delta t$$

This yields the following equations for σS, ν, ρ :

$$\sigma_S = \sigma_0 \qquad (1.14)$$
$$\rho\nu = 2\alpha\sigma_0 \qquad (1.15)$$


$$\nu^2 = \sigma_0^2\,\big(3\beta + 2\alpha^2 - 4\sigma_0\alpha\alpha'\big) \qquad (1.16)$$

❑❑ The first equation expresses that the breakeven volatility for the
spot is the ATM volatility.
❑❑ The second equation relates the ATM skew to the covariance of $S$ and $\sigma_0$. Using the fact that, in our parameterisation, $\left.\frac{d\hat\sigma}{d\ln K}\right|_S = \alpha\sigma_0$, it can be rewritten as:

$$\frac{\frac{1}{\delta t}\left\langle\frac{\delta S}{S}\,\frac{\delta\sigma_0}{\sigma_0}\right\rangle}{\sigma_0\left.\frac{d\hat\sigma}{d\ln K}\right|_S} = 2$$

which recovers the result that R0 = 2. A discrepancy between the


realised value of R and 2 is then materialised as a spot/volatility
cross-gamma/theta P&L. Using condition 1.15, the third piece in
Equation 1.10 now reads:

$$S\sigma_0\,\frac{d^2Q}{dS\,d\sigma_0}\left(\frac{\delta S}{S}\frac{\delta\sigma_0}{\sigma_0} - 2\sigma_0\left.\frac{d\hat\sigma}{d\ln K}\right|_S\,\delta t\right) \qquad (1.17)$$

❑❑ Most importantly, these two properties are model-independent.


They do not depend on the functions α (σ0) and β (σ 0). The third
equation, which relates the curvature parameter β to the vola-
tility of volatility ν and α, is model-dependent as it involves
α ′(σ0).

We are free to specify any functional form for α (σ0) and β (σ0)
provided conditions 1.14, 1.15 and 1.16 hold.

Consistency conditions
S, σ0 are allowed to move while functions α, β stay constant. If ρ is
assumed to be constant, then Equations 1.15 and 1.16 show that the
dependence of the ATM skew on σ 0 is related to the dependence of
ν on σ 0.

Lognormal dynamics for σ0


Let us assume that $\nu$ is a constant. Equation 1.15 implies that $\sigma_0\alpha(\sigma_0)$ is constant, thus $\alpha$ is proportional to $1/\sigma_0$. Equation 1.16 then implies that $\beta$ is proportional to $1/\sigma_0^2$. Let us write $\alpha = a/\sigma_0$, $\beta = b/\sigma_0^2$. We get the following expression for the smile:


$$\hat\sigma(x) = \sigma_0\left(1 + \frac{a}{\sigma_0}\,x + \frac{b}{2\sigma_0^2}\,x^2\right) \qquad (1.18)$$

where $a$, $b$ are constant, and $\nu$ and $\rho$ are given by:

$$\rho\nu = 2a \qquad (1.19)$$
$$\nu^2 = 3b + 6a^2 \qquad (1.20)$$

Thus assuming a lognormal dynamics for the ATM volatility


implies on one hand that the skew is constant, and on the other
hand that the curvature is inversely proportional to the ATM
volatility.

Normal dynamics for σ0


Assume now that $\nu$ is inversely proportional to $\sigma_0$: $\nu = \mu/\sigma_0$, where $\mu$ is the constant normal volatility of volatility. Equations 1.15 and 1.16 then imply that $\alpha$ is inversely proportional to $\sigma_0^2$ and $\beta$ is inversely proportional to $\sigma_0^4$: $\alpha = a/\sigma_0^2$, $\beta = b/\sigma_0^4$, where $a$, $b$ are constant. We get the following expression for the smile:

$$\hat\sigma(x) = \sigma_0\left(1 + \frac{a}{\sigma_0^2}\,x + \frac{b}{2\sigma_0^4}\,x^2\right)$$

where $a$, $b$ are related to $\mu$ and $\rho$ by:

$$\rho\mu = 2a,\qquad \mu^2 = 3b + 10a^2$$

Assuming a normal dynamics for the ATM volatility generates a


skew that is inversely proportional to the ATM volatility and a
curvature inversely proportional to the cube of the ATM volatility.
These scaling properties for skew and curvature as a function of
the ATM volatility induced by the dependence of the volatility of
the ATM volatility on the ATM volatility itself are in agreement
with more general results derived for short maturities by directly
specifying a dynamics for the implied volatilities and imposing that
the discounted option price be a martingale (see, for example,
Balland, 2006, and Durrleman, 2004).
One can check that the lognormal (respectively, normal) case is the
short-maturity limit of the SABR (respectively, Heston) model.


Conclusion and application to one-month-maturity options –


lognormal dynamics for σ0
In what follows, we use a lognormal dynamics for σ0. The parame-
terisation of the smile near the money is given by Equation 1.18 and
the P&L during δt of a delta-hedged, σ0-hedged vanilla option reads:

$$\mathrm{P\&L} = \frac{1}{2}S^2\frac{d^2Q}{dS^2}\left(\left(\frac{\delta S}{S}\right)^{\!2}-\sigma_0^2\,\delta t\right) + \frac{1}{2}\sigma_0^2\frac{d^2Q}{d\sigma_0^2}\left(\left(\frac{\delta\sigma_0}{\sigma_0}\right)^{\!2}-(3b+6a^2)\,\delta t\right) + S\sigma_0\frac{d^2Q}{dS\,d\sigma_0}\left(\frac{\delta S}{S}\frac{\delta\sigma_0}{\sigma_0}-2a\sigma_0\,\delta t\right) \qquad (1.21)$$

This was derived in the limit of a short maturity and for strikes near
the money. How reliable are our approximations, for practical
trading purposes? Let us check how accurately Equation 1.13 holds:
how well do our three thetas add up to the Black–Scholes theta and
what are their relative magnitudes?
We consider a one-month maturity and a typical index short-
maturity smile, shown in Figure 1.4. The corresponding values of
σ0, a and b are σ0 = 20%, a = −10% and b = 0.4%.
The top of Figure 1.5 shows:

❑❑ the spot theta: $\frac{\sigma_0^2}{2}\,S^2\frac{d^2Q}{dS^2}$
❑❑ the volatility theta: $\frac{3b+6a^2}{2}\,\sigma_0^2\frac{d^2Q}{d\sigma_0^2}$
❑❑ and the cross spot/volatility theta: $2a\sigma_0^2\,S\,\frac{d^2Q}{dS\,d\sigma_0}$

The lower graph shows both the Black–Scholes theta:


$$\frac{\hat\sigma_K^2}{2}\,S^2\frac{d^2P_{BS}}{dS^2}$$


[Figure 1.4: Implied volatilities (in %, from 10 to 35) for $\sigma_0 = 20\%$, $a = -10\%$, $b = 0.4\%$; strikes from 80% to 120%]

and the sum of the three thetas in the top graph. The x-axis in both
graphs is the option’s strike. The lower graph shows acceptable
agreement: the model is usable in practice.
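The check is easy to reproduce numerically. The sketch below is our own finite-difference illustration, not the author's code: it prices through the smile parameterisation 1.18, computes the three thetas and compares their sum with the Black–Scholes theta across strikes.

```python
# Theta split of Equation 1.13 for sigma0 = 20%, a = -10%, b = 0.4%, T = 1 month.
import numpy as np
from scipy.stats import norm

sig0, a, b, T, S = 0.20, -0.10, 0.004, 1.0 / 12.0, 100.0

def smile_vol(S_, K, s0):
    x = np.log(K / S_)
    return s0 * (1.0 + (a / s0) * x + (b / (2.0 * s0**2)) * x**2)

def bs_call(S_, K, vol):
    d1 = (np.log(S_ / K) + 0.5 * vol**2 * T) / (vol * np.sqrt(T))
    return S_ * norm.cdf(d1) - K * norm.cdf(d1 - vol * np.sqrt(T))

def Q(S_, s0, K):                      # option price through the smile
    return bs_call(S_, K, smile_vol(S_, K, s0))

h = 1e-3
for K in (85.0, 90.0, 95.0, 100.0, 105.0, 110.0, 115.0):
    d2_dS2 = (Q(S + h, sig0, K) - 2 * Q(S, sig0, K) + Q(S - h, sig0, K)) / h**2
    d2_dv2 = (Q(S, sig0 + h, K) - 2 * Q(S, sig0, K) + Q(S, sig0 - h, K)) / h**2
    d2_dSdv = (Q(S + h, sig0 + h, K) - Q(S + h, sig0 - h, K)
               - Q(S - h, sig0 + h, K) + Q(S - h, sig0 - h, K)) / (4 * h * h)
    theta_sum = (0.5 * sig0**2 * S**2 * d2_dS2                  # spot theta
                 + 0.5 * (3 * b + 6 * a**2) * sig0**2 * d2_dv2  # volatility theta
                 + 2 * a * sig0**2 * S * d2_dSdv)               # cross theta
    vol = smile_vol(S, K, sig0)
    d1 = (np.log(S / K) + 0.5 * vol**2 * T) / (vol * np.sqrt(T))
    bs_theta = 0.5 * vol * S * norm.pdf(d1) / np.sqrt(T)        # BS theta
    print(f"K={K}: sum of thetas={theta_sum:.3f}, BS theta={bs_theta:.3f}")
```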

Arbitraging the 95/105 one-month skew on the Eurostoxx 50


The discrepancy between the realised value of R0 and its model-
independent value of two evidenced in Figure 1.3 can equivalently
be expressed as a discrepancy between the market ATM skew and
the ‘realised’ skew whose definition is suggested by expression
1.17:
$$\left.\frac{d\hat\sigma}{d\ln K}\right|_S^{\mathrm{Realised}} = \frac{1}{2\,\delta t}\,\left\langle\frac{\delta S}{S}\,\frac{\delta\sigma_0}{\sigma_0^2}\right\rangle$$

which materialises into a non-vanishing cross gamma/theta P&L.


This is shown in Figure 1.6, where the realised and implied skew
have been multiplied by 10 to correspond approximately to a
95/105 skew.
Equation 1.21 shows that to isolate as a P&L the difference
between realised and implied skew we need to cancel the spot and
volatility gamma, and ensure that it is not dwarfed by the P&L
generated by remarking to market $a$ and $b$. This could happen if our assumption of diffusive behaviour for $\sigma_0$ did not hold, or if the assumption of constant $a$ and $b$ was blatantly violated.
We now back-test a strategy that involves selling every day one
one-month option of strike 95 and buying the appropriate number
– typically around 0.7 – of options of strike 105 so as to cancel the
spot gamma. This position is delta-hedged and unwound the next


[Figure 1.5: The spot, volatility and cross spot/volatility thetas (top); the sum of the three thetas compared with the Black–Scholes theta (bottom); x-axis: strike, from 80% to 120%]

day, then started again. On top of the third piece in Equation 1.21,
our total P&L comprises: (a) a volatility gamma/theta P&L – the
second piece in expression 1.21; (b) a vega P&L as our position has
some small residual sensitivity to σ0; and (c) additional P&L created
by remarking a, b to market on the next day. Since dP/dσ0 and d2P/
dσ 20 are approximately symmetric around the money, P&Ls (a) and
(b) are expected to be small.
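Schematically, one day of the back-test looks like the following pseudo-structure. This is illustrative only: `price`, `delta` and `gamma` are assumed market-data helpers, and bid/offer costs are ignored.

```python
# One day: sell one 95-strike one-month call, buy lam 105-strike calls to
# cancel spot gamma, delta-hedge with stock, unwind everything the next day.
def one_day_pnl(day, mkt):
    K_lo, K_hi = 0.95 * mkt.spot(day), 1.05 * mkt.spot(day)
    lam = mkt.gamma(day, K_lo) / mkt.gamma(day, K_hi)           # ~0.7
    hedge = mkt.delta(day, K_lo) - lam * mkt.delta(day, K_hi)   # stock held
    return (-(mkt.price(day + 1, K_lo) - mkt.price(day, K_lo))
            + lam * (mkt.price(day + 1, K_hi) - mkt.price(day, K_hi))
            + hedge * (mkt.spot(day + 1) - mkt.spot(day)))
```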
The results of our back-test are illustrated in Figure 1.7. The top
scatter plot shows the daily P&L without portions (b) and (c) as a
function of the P&L calculated using Equation 1.21 while the scatter
plot at the bottom shows the real total daily P&L as a function of the
cross-gamma/theta P&L (Equation 1.17).
The dispersion of points around a straight line in the top graph is


[Figure 1.6: Three-month running averages of realised and market skew for one-month options on the Eurostoxx 50, May 2002 to May 2008 (skew values from 0 to 10)]

another measure of how well the P&L accounting in Equation 1.21


holds. Note that it is only correct at order one in δ t and at order two
in δ S and δσ0 and has been derived with the assumption of short
maturities and strikes near the money – still the agreement is satis-
factory. More interestingly, the bottom graph shows that, although
some noise is contributed by P&Ls (b) and (c), the total P&L of our
strategy is still well correlated with the cross-gamma/theta P&L,
that is, the difference between realised and implied skew.
Our conclusion is that the difference between the realised value of
R0 and two can be materialised as a P&L and hence arbitraged.
Running the strategy we have outlined entails unreasonable bid/
offer costs; one would rather use a market-making automaton to
optimally rebalance an option position near the money so as to main-
tain vanishing spot gamma, small vega and volatility gamma.

Conclusion
In time-homogeneous stochastic volatility models at order one in
the volatility of volatility, the SSR and the rate at which the ATMF
skew decays with maturity are structurally related through the
spot/volatility covariance function. Assuming time-homogeneity
and a flat VS volatility term structure implies that the SSR is
restricted to the interval [1, 2]. Inspection of the historical behaviour
of Eurostoxx 50 implied volatilities shows that while for longer
maturities the relationship between the SSR and the ATMF skew
holds approximately, for short maturities the SSR is at


[Figure 1.7: Profit and loss of a dynamic option strategy – two scatter plots of daily P&L against the cross-gamma/theta P&L, both axes from −0.3 to 0.3]

times significantly smaller than its model-independent value of


two, indicating that the market skew is “too steep”. We show that
this discrepancy is equivalent to the difference between the market
implied at-the-money skew and the “realised” skew and that it can
be materialised as the cross-gamma/theta P&L of a hedged near-
the-money vanilla option position. We provide an example of a
dynamic option strategy whose P&L approximately captures this
effect.

The author is grateful to Julien Guyon, Pierre Henry-Labordère and


other members of his team for useful discussions and suggestions.

  1 This approach allows for an economical derivation of the ATMF skew at order one in ω. We would get the same result by perturbing the pricing equation at order one in ω.
  2 For a similar approach based on the spot/realised volatility covariance function, see
Ciliberti, Bouchaud and Potters (2008).


REFERENCES

Balland, P., 2006, “Forward Smile,” presentation at Global Derivatives, Paris.

Bergomi, L., 2004, “Smile Dynamics,” Risk, September, pp 117–23.

Bergomi, L., 2005, “Smile Dynamics II,” Risk, October, pp 67–73.

Bergomi, L., 2008, “Smile Dynamics III,” Risk, October, pp 90–96.

Ciliberti, S., J.-P. Bouchaud and M. Potters, 2008, “Smile Dynamics – A Theory of the Implied Leverage Effect” (available at http://arxiv.org/pdf/0809.3375v1).

Durrleman, V., 2004, “From Implied to Spot Volatilities” (available at http://math.stanford.edu/~valdo/papers/FmImplied2SpotVols.pdf).

2
Funding Beyond Discounting: Collateral
Agreements and Derivatives Pricing
Vladimir V. Piterbarg
Barclays

Standard derivatives pricing theory (see, for example, Hull, 2006)


relies on the assumption that one can borrow and lend at a unique
risk-free rate. The realities of being a derivatives desk are, however,
rather different these days, as historically stable relationships
between bank funding rates, government rates, Libor rates, etc,
have broken down.
The practicalities of funding, that is, how dealers borrow and lend
money, are of central importance to derivatives pricing, because
replicating naturally involves borrowing and lending money
and other assets. In this chapter, we establish derivatives valua-
tion formulas in the presence of such complications starting from
first principles, and study the impact of market features such as
stochastic funding and collateral posting rules on values of funda-
mental derivatives contracts, including forwards and options.
Simplifying considerably, we can describe a derivatives desk’s
activities as selling derivatives securities to clients while hedging
them with other dealers. Should the desk default, a client would
join the queue of the bank’s creditors. The situation is a bit different
for trading among dealers where, to reduce credit risk, agreements
have been put in place to collateralise mutual exposures.
Such agreements are based on the so-called credit support annex
(CSA) to the International Swaps and Derivatives Association
master agreement, so we often refer to collateralised trades as CSA
trades. As collateral is used to offset liabilities in case of a default, it


could be thought of as an essentially risk-free investment, so the


rate on collateral is usually set to be a proxy of a risk-free rate such
as the fed funds rate for dollar transactions, Eonia for euro, etc.
Often, purchased assets are posted as collateral against the funds
used to buy them, such as in the “repo” market for shares used in
delta hedging.
Secured borrowing will normally attract a better rate than
unsecured borrowing. In a bank, funding functions are often
centralised within a treasury desk. The unsecured rates that the
treasury desk provides to the trading desks are generally linked to
the unsecured funding rate at which the bank itself can borrow/
lend, a rate typically based on the bank credit rating, that is, its
perceived probability of default.
The money that a derivatives desk uses in its operations comes
from a multitude of sources, from the collateral posted by counter-
parties to funds secured by various types of assets. We show in this
chapter how to aggregate these rates to come up with the value of a
derivatives security given the rules for collateral posting and repo
rates available for the underlying. Note that some desks may be
required to borrow at rates different from those that they can lend
at – a complication we avoid in this chapter as our formalism does
not extend readily to the nonlinear partial differential equations
that such a set-up would require.
Having derived an appropriate extension to the standard
no-arbitrage result, we then look carefully at the differences in
value of CSA (that is, collateralised) and non-CSA (not collateral-
ised) versions of the same derivatives security. This is important as
dealers often calibrate their models to market-observed prices of
derivatives, which typically reflect CSA-based valuations, yet they
also trade a large volume of non-CSA over-the-counter derivatives.
We demonstrate that a number of often significant adjustments are
required to reflect the difference between CSA and non-CSA trades.
The first adjustment is to use different discounting rates for CSA
and non-CSA versions of the same derivative. The second adjust-
ment is a convexity, or quanto, adjustment and affects forward
curves – such as equity forwards or Libor forward rates – as they
turn out to depend on collateralisation used. This is a consequence
of the stochastic funding spread and, in particular, of the correlation
between the bank funding spread and the underlying assets. The


third adjustment that may be required is to volatility information


used for options – in particular, the volatility smile changes
depending on collateral. We show some numerical results for these
effects.

Preliminaries
We start with the risk-free curve for lending, a curve that corre-
sponds to the safest available collateral (cash). We denote the
corresponding short rate at time t by rC(t); C here stands for “CSA”,
as we assume this is the agreed overnight rate paid on collateral
among dealers under CSA. It is convenient to parameterise term
curves in terms of discount factors; we denote corresponding risk-
free discount factors by PC(t, T), 0 ≤ t ≤ T < ∞. Standard
Heath–Jarrow–Morton theory applies, and we specify the following
dynamics for the yield curve:
$$dP_C(t,T)/P_C(t,T) = r_C(t)\,dt - \sigma_C(t,T)\,dW_C(t) \qquad (2.1)$$

where $W_C(t)$ is a $d$-dimensional Brownian motion under the risk-neutral measure $P$ and $\sigma_C$ is a vector-valued (dimension $d$) stochastic process.
In what follows, we shall consider derivatives contracts on a
particular asset, whose price process we denote by S(t), t ≥ 0. We
denote by rR(t) the short rate on funding secured by this asset (here
R stands for “repo”). The difference rC(t) – rR(t) is sometimes called
the stock lending fee. Finally, let us define the short rate for unse-
cured funding by rF(t), t ≥ 0. As a rule, we would expect that rC(t) ≤
rR(t) ≤ rF(t).
The existence of non-zero spreads between short rates based
on different collateral can be recast in the language of credit risk,
by introducing joint defaults between the bank and various assets
used as collateral for funding. In particular, the funding spread
$s_F(t) \triangleq r_F(t) - r_C(t)$ could be thought of as the (stochastic) intensity
of default of the bank. We do not pursue this formalism here (see,
for example, Gregory, 2009, or Burgard and Kjaer, 2009), postu-
lating the dynamics of funding curves directly instead. Likewise,
we ignore the possibility of a counterparty default, an extension
that could be developed rather easily.


Black–Scholes with collateral


Let us look at how the standard Black–Scholes pricing formula
changes in the presence of a CSA. Let S(t) be an asset that follows, in
the real world, the following dynamics:
$$dS(t)/S(t) = \mu_S(t)\,dt + \sigma_S(t)\,dW(t)$$
Let V(t, S) be a derivatives security on the asset; by Itô’s lemma it
follows that:
$$dV(t) = (\mathcal{L}V(t))\,dt + \Delta(t)\,dS(t)$$

where $\mathcal{L}$ is the standard pricing operator:

$$\mathcal{L} = \frac{\partial}{\partial t} + \frac{\sigma_S(t)^2 S^2}{2}\,\frac{\partial^2}{\partial S^2}$$

and $\Delta$ is the option's delta:

$$\Delta(t) = \frac{\partial V(t)}{\partial S}$$

Let C(t) be the collateral (cash in the collateral account) held at time
t against the derivative. For flexibility, we allow this amount to be
different1 from V(t).
To replicate the derivative, at time $t$ we hold $\Delta(t)$ units of stock and $\gamma(t)$ in cash. Then the value of the replication portfolio, which we denote by $\Pi(t)$, is equal to:

$$V(t) = \Pi(t) = \Delta(t)\,S(t) + \gamma(t) \qquad (2.2)$$

The cash amount $\gamma(t)$ is split among a number of accounts:

❑❑ Amount $C(t)$ is in collateral.
❑❑ Amount $V(t) - C(t)$ needs to be borrowed/lent unsecured from the treasury desk.
❑❑ Amount $\Delta(t)S(t)$ is borrowed to finance the purchase of $\Delta(t)$ stocks. It is secured by the stock purchased.
❑❑ Stock is paying dividends at rate $r_D$.

The growth of all cash accounts (collateral, unsecured, stock-


secured, dividends) is given by:

$$d\gamma(t) = \big[r_C(t)\,C(t) + r_F(t)\,(V(t)-C(t)) - r_R(t)\,\Delta(t)S(t) + r_D(t)\,\Delta(t)S(t)\big]\,dt$$


On the other hand, from Equation 2.2, by the self-financing condition:

$$d\gamma(t) = dV(t) - \Delta(t)\,dS(t)$$

which is, by Itô’s lemma:

$$dV(t) - \Delta(t)\,dS(t) = (\mathcal{L}V(t))\,dt = \left(\frac{\partial}{\partial t} + \frac{\sigma_S(t)^2 S^2}{2}\,\frac{\partial^2}{\partial S^2}\right)V(t)\,dt$$

Thus we have:

$$\left(\frac{\partial}{\partial t} + \frac{\sigma_S(t)^2 S^2}{2}\,\frac{\partial^2}{\partial S^2}\right)V = r_C(t)\,C(t) + r_F(t)\,\big(V(t)-C(t)\big) + \big(r_D(t)-r_R(t)\big)\,S\,\frac{\partial V}{\partial S}$$

which, after some rearrangement, yields:

$$\frac{\partial V}{\partial t} + \big(r_R(t)-r_D(t)\big)\,S\,\frac{\partial V}{\partial S} + \frac{\sigma_S(t)^2 S^2}{2}\,\frac{\partial^2 V}{\partial S^2} = r_F(t)\,V(t) - \big(r_F(t)-r_C(t)\big)\,C(t)$$
The solution, obtained by essentially following the steps that lead
to the Feynman–Kac formula (see, for example, Karatzas and
Shreve, 1997, theorem 4.4.2), is given by:
$$V(t) = E_t\left(e^{-\int_t^T r_F(u)\,du}\,V(T) + \int_t^T e^{-\int_t^u r_F(v)\,dv}\,\big(r_F(u)-r_C(u)\big)\,C(u)\,du\right) \qquad (2.3)$$

in the measure in which the stock grows at rate $r_R(t) - r_D(t)$, that is:

$$dS(t)/S(t) = \big(r_R(t) - r_D(t)\big)\,dt + \sigma_S(t)\,dW_S(t) \qquad (2.4)$$

Note that if our probability space is rich enough, we can take it to be


the same risk-neutral measure P as used in Equation 2.1. We note
that this derivation validates the view of Barden (2009) (who also
cites Hull, 2006) that the repo rate rR(t) is the right “risk-free” rate to
use when valuing assets on S(t).
By rearranging terms in Equation 2.3, we obtain another useful
formula for the value of the derivative:
$$V(t) = E_t\left(e^{-\int_t^T r_C(u)\,du}\,V(T)\right) - E_t\left(\int_t^T e^{-\int_t^u r_C(v)\,dv}\,\big(r_F(u)-r_C(u)\big)\big(V(u)-C(u)\big)\,du\right) \qquad (2.5)$$


We note that:
$$E_t\big(dV(t)\big) = \big(r_F(t)\,V(t) - (r_F(t)-r_C(t))\,C(t)\big)\,dt = \big(r_F(t)\,V(t) - s_F(t)\,C(t)\big)\,dt \qquad (2.6)$$

So, the rate of growth in the derivatives security is the funding rate $r_F(t)$ applied to its value minus the funding spread $s_F(t)$ applied to the collateral. In particular, if the collateral is equal to the value $V$, then:

$$E_t\big(dV(t)\big) = r_C(t)\,V(t)\,dt,\qquad V(t) = E_t\left(e^{-\int_t^T r_C(u)\,du}\,V(T)\right) \qquad (2.7)$$
and the derivative grows at the risk-free rate. The final value is the
only payment that appears in the discounted expression as the
other payments net out given the assumption of full collateralisa-
tion. This is consistent with the drift in Equation 2.1 as PC(t, T)
corresponds to deposits secured by cash collateral. On the other
hand, if the collateral is zero, then:
$$E_t\big(dV(t)\big) = r_F(t)\,V(t)\,dt \qquad (2.8)$$

and the rate of growth is equal to the bank’s unsecured funding rate or, using credit risk language, adjusted for the possibility of the bank default. We show later that the case $C = V$ could be handled by using a measure that corresponds to the risk-free bond $P_C(t,T) = E_t(e^{-\int_t^T r_C(u)\,du})$ as a numeraire and, likewise, the case $C = 0$ could be handled by using a measure that corresponds to the risky bond $P_F(t,T) = E_t(e^{-\int_t^T r_F(u)\,du})$ as a numeraire.

Before we proceed with valuing derivatives securities in our


set-up, let us comment on the portfolio effects of the collateral. When
two dealers are trading with each other, the collateral is applied to
the overall value of the portfolio of derivatives between them, with
positive exposures on some trades offsetting negative exposures on
other trades (so-called netting). Hence, potentially, valuation of indi-
vidual trades should take into account the collateral position on the
whole portfolio. Fortunately, in the simple case of the collateral
requirement being a linear function of the exact value of the portfolio
(the case that includes both the no-collateral case C = 0 and the full
collateral case C = V), the value of the portfolio is just the sum of
values of individual trades (with collateral attributed to trades by the
same linear function). This easily follows from the linearity of the
pricing formula 2.3 in V and C.
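For intuition, consider the linear rule $C = \gamma V$ with constant rates: Equation 2.6 then makes the value grow at $r_F - \gamma(r_F - r_C)$, so the effective discount rate interpolates between the unsecured rate ($\gamma = 0$) and the CSA rate ($\gamma = 1$). A minimal sketch with assumed numbers:

```python
# Effective discounting under the linear collateral rule C = gamma * V,
# with constant rates -- a direct consequence of Equation 2.6.
import numpy as np

r_C, r_F, T = 0.02, 0.04, 5.0   # assumed CSA rate, funding rate, maturity
E_payoff = 100.0                # E_t[V(T)] under the measure of Equation 2.4

for gamma in (0.0, 0.5, 1.0):
    r_eff = r_F - gamma * (r_F - r_C)
    print(gamma, round(E_payoff * np.exp(-r_eff * T), 3))
# gamma = 0 discounts at r_F (Equation 2.8); gamma = 1 at r_C (Equation 2.7)
```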


Zero-strike call option


Probably the simplest derivatives contract on an asset is a promise
to deliver this asset at a given future time T. The contract could be
seen as a zero-strike call option with expiry T. In the standard
theory, of course, the value of this derivative is equal to the value of
the asset itself (in the absence of dividends). Let us see what the
situation is in our case. The payout of the derivative is given by $V(T) = S(T)$ and the value, at time $t$, assuming no CSA, is given by:

$$V_{zsc}(t) = E_t\left(e^{-\int_t^T r_F(u)\,du}\,S(T)\right)$$

On the other hand, if $r_D(t) = 0$, then:

$$S(t) = E_t\left(e^{-\int_t^T r_R(u)\,du}\,S(T)\right)$$
as follows from Equation 2.4 and, clearly, $S(t) \neq V_{zsc}(t)$. The difference in values between the derivative and the asset is now easily understood, as the zero-strike call option carries the credit risk of the bank, while the asset $S(\cdot)$ does not. Or, in our language of funding, the asset $S(\cdot)$ can be used to secure funding – which is reflected in the discount rate applied – while $V_{zsc}$ cannot be used for such a purpose.
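With constant rates and no dividends the gap is explicit: $V_{zsc}(t) = e^{-(r_F - r_R)(T-t)}\,S(t)$. A two-line check with assumed levels:

```python
# Zero-strike call vs the asset itself: constant rates, r_D = 0.
import math
S0, r_F, r_R, T = 100.0, 0.040, 0.015, 2.0   # assumed levels
print(math.exp(-(r_F - r_R) * T) * S0)       # ~95.12, vs 100 for the asset
```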

Forward contract
We now consider a forward contract on S(⋅), where at time t the
bank agrees to deliver the asset at time T, against a cash payment at
time T.

Without CSA
A no-CSA forward contract could be seen as a derivative with the
payout S(T) – FnoCSA(t, T) at time T, where FnoCSA(t, T) is the forward
price at t for delivery at T. As the forward contract is cost-free, we
have by 2.3 that:
$$0 = E_t\left(e^{-\int_t^T r_F(u)\,du}\,\big(S(T) - F_{noCSA}(t,T)\big)\right)$$

so we get:

$$F_{noCSA}(t,T) = \frac{E_t\left(e^{-\int_t^T r_F(u)\,du}\,S(T)\right)}{E_t\left(e^{-\int_t^T r_F(u)\,du}\right)} \qquad (2.9)$$


Going back to Equation 2.9, let us define:

$$P_F(t,T) \triangleq E_t\left(e^{-\int_t^T r_F(u)\,du}\right)$$

Note that this is essentially a credit-risky bond issued by the bank. Then we can rewrite Equation 2.9 as:

$$F_{noCSA}(t,T) = \tilde{E}_t^T\big(S(T)\big)$$

where the measure $\tilde{P}^T$ is defined by the numeraire $P_F(t,T)$, in the sense that:

$$e^{-\int_0^t r_F(u)\,du}\,P_F(t,T) = E_t\left(e^{-\int_0^T r_F(u)\,du}\right)$$

is a $P$-martingale. Finally, we see that $F_{noCSA}(t,T)$ is a $\tilde{P}^T$-martingale.
We note that the value of an asset under no CSA at time $t$ with payout $V(T)$ is given, by Equation 2.8, to be:

$$V(t) = E_t\left(e^{-\int_t^T r_F(u)\,du}\,V(T)\right) = P_F(t,T)\,\tilde{E}_t^T\big(V(T)\big)$$

so it could be calculated by simply taking the expected value of the payout in the risky $T$-forward measure.

With CSA
Now let us consider a forward contract covered by CSA, where we
assume that the collateral posted C is always equal to the value of
the contract V. Let the CSA forward price FCSA(t, T) be fixed at t, then
the value, from Equation 2.5, is given by:
$$0 = V(t) = E_t\left(e^{-\int_t^T r_C(u)\,du}\,V(T)\right) = E_t\left(e^{-\int_t^T r_C(u)\,du}\,\big(S(T)-F_{CSA}(t,T)\big)\right)$$

so that:

$$F_{CSA}(t,T) = \frac{E_t\left(e^{-\int_t^T r_C(u)\,du}\,S(T)\right)}{E_t\left(e^{-\int_t^T r_C(u)\,du}\right)} \qquad (2.10)$$

Comparing this with Equation 2.9, we see that in general:


$$F_{CSA}(t,T) \neq F_{noCSA}(t,T)$$

By the arguments similar to the no-CSA case, we obtain:

$$F_{CSA}(t,T) = E_t^T\big(S(T)\big)$$

where the measure $P^T$ is the standard $T$-forward measure, that is, a measure defined by $P_C(t,T) = E_t(e^{-\int_t^T r_C(u)\,du})$ as a numeraire.
We note that the value of an asset under CSA at time $t$ with payout $V(T)$ is given, by Equation 2.7, to be:

$$V(t) = E_t\left(e^{-\int_t^T r_C(u)\,du}\,V(T)\right) = P_C(t,T)\,E_t^T\big(V(T)\big)$$

so it could be calculated by simply taking the expected value of the payout in the (risk-free) $T$-forward measure.

Calculating CSA convexity adjustment


Let us now calculate the difference between CSA and non-CSA
forward prices. We have:
$$F_{noCSA}(t,T) = \tilde{E}_t^T\big(S(T)\big) = \frac{E_t\left(e^{-\int_t^T r_F(u)\,du}\,S(T)\right)}{P_F(t,T)} = \frac{E_t\left(e^{-\int_t^T r_C(u)\,du}\,e^{-\int_t^T (r_F(u)-r_C(u))\,du}\,S(T)\right)}{P_F(t,T)}$$
$$= \frac{P_C(t,T)}{P_F(t,T)}\,E_t^T\left(e^{-\int_t^T s_F(u)\,du}\,S(T)\right) = E_t^T\left(\frac{M(T,T)}{M(t,T)}\,S(T)\right) \qquad (2.11)$$

where:

$$M(t,T) \triangleq \frac{P_F(t,T)}{P_C(t,T)}\,e^{-\int_0^t s_F(u)\,du} \qquad (2.12)$$

is a $P^T$-martingale, as:

$$M(t,T) = E_t^T\left(e^{-\int_0^T s_F(u)\,du}\right)$$
We note that, trivially:

$$E_t^T\left(\frac{M(T,T)}{M(t,T)}\right) = 1$$

so:

$$F_{noCSA}(t,T) - F_{CSA}(t,T) = E_t^T\left(\left(\frac{M(T,T)}{M(t,T)} - E_t^T\left(\frac{M(T,T)}{M(t,T)}\right)\right)\big(S(T)-F_{CSA}(t,T)\big)\right)$$
$$= \frac{1}{M(t,T)}\,\mathrm{Cov}_t^T\big(M(T,T),\,F_{CSA}(T,T)\big) \qquad (2.13)$$

To obtain the actual value of the adjustment we would need to


postulate joint dynamics of sF(u) and S(u), u ≥ t. We present a simple
model below where we carry out the calculations.
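Ahead of that, the Monte Carlo sketch below conveys the size of the effect under one assumed joint model: a mean-reverting normal funding spread and a lognormal spot with correlation ρ, with deterministic rC so that the T-forward and risk-neutral measures coincide. The parameters echo Table 2.1; reading the table's third parameter as the spread's mean-reversion speed is our assumption.

```python
# Monte Carlo sketch of the CSA convexity adjustment (Equations 2.11/2.13)
# under an assumed joint model; not necessarily the model used for Table 2.1.
import numpy as np

rng = np.random.default_rng(0)
sig_S, sig_F, kappa, s0 = 0.30, 0.015, 0.05, 0.015   # assumed parameters
T, n_steps, n_paths = 5.0, 50, 500_000
dt = T / n_steps

for rho in (-0.3, -0.1, 0.0, 0.1):
    s = np.full(n_paths, s0)          # funding spread s_F
    lnS = np.zeros(n_paths)           # log-spot (zero drift under P^T)
    int_s = np.zeros(n_paths)         # integral of s_F along each path
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n_paths)
        int_s += s * dt
        s += -kappa * s * dt + sig_F * np.sqrt(dt) * z2
        lnS += -0.5 * sig_S**2 * dt + sig_S * np.sqrt(dt) * z1
    S_T, D = np.exp(lnS), np.exp(-int_s)
    F_csa = S_T.mean()                            # E^T[S(T)]
    F_nocsa = (D * S_T).mean() / D.mean()         # Equation 2.11
    print(rho, round(F_nocsa / F_csa - 1.0, 5))   # ~ +1.6% at rho = -30%,
                                                  # of the order of Table 2.1
```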

Relationship with futures contracts


At first sight, a forward contract with CSA looks rather like a futures
contract on the asset. Recall that with futures contracts, the (daily)
difference in the futures price gets credited/debited to the margin
account. In the same way, as forward prices move, a CSA forward
contract also specifies that money exchanges hands. There is,
however, an important difference. Consider the value of a forward
contract at t′ > t, a contract that was entered at time t (so V(t) = 0).
Then:
$$V(t') = E_{t'}\left(e^{-\int_{t'}^T r_C(u)\,du}\,\big(S(T)-F_{CSA}(t,T)\big)\right) = E_{t'}\left(e^{-\int_{t'}^T r_C(u)\,du}\,S(T)\right) - E_{t'}\left(e^{-\int_{t'}^T r_C(u)\,du}\right)F_{CSA}(t,T)$$

By Equation 2.10:

$$V(t') - V(t) = E_{t'}\left(e^{-\int_{t'}^T r_C(u)\,du}\right)\big(F_{CSA}(t',T) - F_{CSA}(t,T)\big)$$

so the difference in contract values on t′ and t that exchanges hands


at t′ is equal to the discounted (to T) difference in forward prices.
For a futures contract, the difference will not be discounted.
Therefore, the type of convexity effects we see in futures contracts
are different from what we see in CSA versus no-CSA forward
contracts, a conclusion different from that reached in Johannes and
Sundaresan (2007).


European-style options
Consider now a European-style call option on S(T) with strike K.
Depending on the presence or absence of CSA, we get two prices:
⎛ − ∫ T rF (u) du + ⎞
VnoCSA (t) = Et ⎜ e t (S (T ) − K ) ⎟
⎝ ⎠

⎛ − ∫ tT rC (u) du + ⎞
VCSA (t) = Et ⎜ e (S (T ) − K ) ⎟
⎝ ⎠
(where for the CSA case we assumed that the collateral posted, C, is
always equal to the option value, VCSA). By the same measure-
change arguments as in the previous section:

( )
VnoCSA (t ) = PF (t,T ) E tT (S (T ) − K )
+


(t ) = P (t,T ) E ((S (T ) − K ) )
T +
VCSA C t

~
The difference between measures P Tt and PTt not only manifests itself
in the mean of S(T) – as already established in the previous section
– but also shows up in other characteristics of the distribution of
S(⋅), such as its variance and higher moments. We explore these
effects in the next section.

Distribution impact of convexity adjustment


Let us see how a change of measure affects the distribution of S(⋅).
In the spirit of Equation 2.11, we have:
⎛ M (T,T ) + ⎞
VnoCSA (t ) = PF (t,T ) EtT ⎜ ( S (T ) − K ) ⎟
⎝ M (t,T ) ⎠
where M(t, T) is defined in Equation 2.12. Then, by conditioning on
S(T), we obtain:

(
VnoCSA (t ) = PF (t,T ) EtT α (t,T,S (T )) (S (T ) − K )
+
) (2.14)

where the deterministic function a (t, T, x) is given by:


⎛ M (T,T ) ⎞
α (t,T, x ) = EtT ⎜⎜ S (T ) = x ⎟⎟
⎝ M (t,T ) ⎠
Inspired by Antonov and Arneguy (2009), we approximate the
function a (t, T, x) by a linear (in x) function:

α (t,T, x ) ≈ α 0 (t,T ) + α 1 (t,T ) x

35

02 Piterbarg PCQF.indd 35 11/03/2013 10:09


post-crisis quant finance

and obtain a 0 and a 1 by minimising the squared difference (while


using the fact that ETt(M(T, T)/M(t, T)) = 1 and ETt(S(T)) = FCSA(t, T)):

α 1 (t,T ) =
EtT ( M(T ,T )
M (t,T ) S (T ) − FCSA (t,T ))
VartT (S (T ))
α 0 (t,T ) = 1− α 1FCSA (t,T )

We recognise the term:


⎛ M (T,T ) ⎞
EtT ⎜ S (T ) ⎟ − FCSA (t,T )
⎝ M (t,T ) ⎠
as the convexity adjustment of the forward between the no-CSA
and CSA versions (see Equation 2.13), and rewrite:
FnoCSA (t,T ) − FCSA (t,T )
α 1 (t,T ) =
VartT (S (T ))

Differentiating Equation 2.14 with respect to K twice, we obtain the


following relationship between the probability density functions
(PDFs) of S(T) under the two measures:

PtT (S (T ) ∈ dK ) = (α 0 (t,T ) + α 1 (t,T ) K ) PtT (S (T ) ∈ dK )  (2.15)

Figure 2.1 Historical credit spread/interest rates and credit spread/equity


correlation calculated with a rolling one-year window
50
40
30
20
10
0
%

–10
–20
–30
–40 Credit/rates correlation
Credit/equity correlation
–50
Mar 27, 2006

Dec 6, 2007
Mar 19, 2008
Jul 6, 2006

Oct 7, 2008
Dec 13, 2005

Aug 22, 2007


May 14, 2007
Jan 31, 2007

Jun 27, 2008

Jan 21, 2009


May 20, 2005

May 1, 2009
Aug 30, 2005

Oct 16, 2006

36

02 Piterbarg PCQF.indd 36 11/03/2013 10:09


FUNDING BEYOND DISCOUNTING: COLLATERAL AGREEMENTS AND DERIVATIVES PRICING

Table 2.1  Relative differences between non-CSA and CSA forward prices
with s S = 30%, sF = 1.50%, ℵF = 5.00%

Time/ρ –30% –20% –10% 0% 10%

1 0.07% 0.04% 0.02% 0.00% –0.02%


2 0.26% 0.17% 0.09% 0.00% –0.09%
3 0.58% 0.39% 0.19% 0.00% –0.19%
4 1.02% 0.68% 0.34% 0.00% –0.34%
5 1.57% 1.04% 0.52% 0.00% –0.52%
6 2.23% 1.48% 0.74% 0.00% –0.73%
7 3.00% 1.99% 0.99% 0.00% –0.98%
8 3.87% 2.56% 1.27% 0.00% –1.26%
9 4.85% 3.20% 1.59% 0.00% –1.56%
10 5.92% 3.91% 1.94% 0.00% –1.90%

so the PDF of S(T) under the no-CSA measure is obtained from the
density of S(T) under the CSA measure by multiplying it with a
linear function. It is not hard to see that the main impact of such a
transformation is on the slope of the volatility smile of S(⋅). We
demonstrate this impact numerically below.

Example: stochastic funding model


Let us consider a simple model that we can use to estimate the
impact of collateral rules on forwards and options. We start with an
asset that follows a lognormal process:

dS (t ) /S (t ) = O ( dt ) + σ SdWS (t )
and funding spread that follows dynamics inspired by a simple
one-factor Gaussian model of interest rates:2

dsF (t ) =ℵF (θ − sF (t )) dt + σ F dWF (t )


with 〈dWS(t), dWF(t)〉 = r dt. Here r is the correlation between the asset
and the funding spread. We also assume for simplicity that rC(t),
rR(t) are deterministic, while rD(t) = 0. Then:

FCSA (t,T ) = Et (S (T ))

and:

dFCSA (t,T ) / FCSA (t,T ) = σ SdWS (t )

with WS(t) being a Brownian motion in the risk-neutral measure P.


On the other hand:

37

02 Piterbarg PCQF.indd 37 11/03/2013 10:09


post-crisis quant finance

dPF (t,T ) / PF (t,T ) = O (dt ) − σ F b (T − t ) dWF (t )

where:
1− e −ℵF (T−t)
b (T − t ) =
ℵF

As M(t, T) is a martingale under P (since rC(t) is deterministic, the


measures P and PT coincide), we have from Equation 2.12 that:

dM (t,T ) / M (t,T ) = −σ F b (T − t ) dWF (t)


Also both M(t, T) and FCSA(t, T) are martingales under P. We then
have:
d ( M (t,T ) FCSA (t,T )) / ( M (t,T ) FCSA (t,T ))

= σ Sσ F b (T − t ) ρdt + O (dW (t ))

Recall that:
FnoCSA (0,T ) − FCSA (0,T )
⎛ M (T,T ) ⎞
= E ⎜ ( FCSA (T,T ) − FCSA (0,T )) ⎟
M
⎝ ( 0,T ) ⎠
so that:

( T
FnoCSA ( 0,T ) = FCSA ( 0,T ) exp − ∫ 0 σ Sσ F b (T − t ) ρ dt )
⎛ T − b (T ) ⎞
= FCSA ( 0,T ) exp ⎜−σ Sσ F ρ ⎟  (2.16)
⎝ ℵF ⎠

and, in the case ℵF = 0:


FnoCSA ( 0,T ) − FCSA ( 0,T )

(
= FCSA ( 0,T ) exp (−σ Sσ F ρT 2 / 2 ) − 1 )
We note that the adjustment grows as (roughly) T2. A similar
formula was obtained by Barden (2009) using a model in which
funding spread is functionally linked to the value of the asset.
Let us perform a couple of numerical experiments. We start with
an equity-related example. Let us set sS = 30%, a number roughly in
line with implied volatilities of options on the S&P 500 equity index
(SPX). We estimate the basis-point volatility of the funding spread
to be sF = 1.50% and mean reversion to be ℵF = 5% by looking at
historical data of credit spreads on US banks. Figure 2.1 shows a

38

02 Piterbarg PCQF.indd 38 11/03/2013 10:09


FUNDING BEYOND DISCOUNTING: COLLATERAL AGREEMENTS AND DERIVATIVES PRICING

Figure 2.2 Difference in CSA v. non-CSA implied distribution for


European options using Equation 2.15, expressed in implied vol across
strikes, for different levels of correlation ρ
35
Original (CSA) implied volatilities
34 Adjusted (non-CSA) implied volatilities, corr = –30%
33 Adjusted (non-CSA) implied volatilities, corr = –10%
32 Adjusted (non-CSA) implied volatilities, corr = 10%

31
30
%

29
28
27
26
25
40 60 80 100 120 140 160
Strike
Note: T = 10 years, FCSA(0, T) = 100, σF = 1.50%, ℵF = 5.00%

rolling historical estimate of correlations between credit spreads


and the SPX (as well as credit spread and interest rates in the form
of a five-year swap rate). From this graph, we estimate a reasonable
range for the correlation r to be [–30%, 10%]. In Table 2.1, we report
relative adjustments:

FnoCSA (0,T ) − FCSA ( 0,T )



FCSA ( 0,T )

for different values of correlations and for different T from one to 10


years. Clearly, the adjustments could be quite significant.
Next we look at the difference in implied volatilities for CSA and
non-CSA options. We look at options expiring in 10 years across
different strikes, with FCSA(0, T) = 100. We assume that the market
prices of CSA options are given by the 30% implied volatility (for all
strikes), so that the “CSA distribution” of the asset is lognormal with
30% volatility. Then we express the distribution of the underlying
asset for non-CSA options as given by Equation 2.15 in terms of
implied volatilities (using put options and the original value of the
forward, 100, to ensure fair comparison). Figure 2.2 demonstrates the
impact – non-CSA options have lower volatility (lower put option
values), and the volatility smile has a higher (negative) skew.

39

02 Piterbarg PCQF.indd 39 11/03/2013 10:09


post-crisis quant finance

Table 2.2  Absolute differences between non-CSA and CSA forward Libor
rates, using market-implied caplet volatilities and sF = 1.50%, ℵF = 5.00%

Time/ρ –20% 0% 20% 40%

1 0.00% 0.00% 0.00% 0.00%


2 0.01% 0.00% –0.01% –0.01%
3 0.01% 0.00% –0.01% –0.02%
4 0.02% 0.00% –0.02% –0.04%
5 0.03% 0.00% –0.03% –0.05%
7 0.05% 0.00% –0.05% –0.10%
10 0.09% 0.00% –0.09% –0.18%
15 0.18% 0.00% –0.18% –0.37%
20 0.30% 0.00% –0.30% –0.60%
25 0.42% 0.00% –0.42% –0.84%
30 0.54% 0.00% –0.54% –1.07%

Finally, let us look at CSA convexity adjustments to forward


Libor rates. Table 2.2 presents absolute differences (that is, FnoCSA(0,
T) – FCSA(0, T)) in non-CSA versus CSA forward Libor rates fixing in
one to 30 years over a reasonable range of possible correlations. We
use the same parameters for the funding spread as above together
with recent market-implied caplet volatilities and forward Libor
rates. Again, the differences are not negligible, especially for longer-
expiry Libor rates.

Conclusions
In this chapter, we have developed valuation formulas for derivative
contracts that incorporate the modern realities of funding and
collateral agreements that deviate significantly from the textbook
assumptions. We have shown that the pricing of non-collateralised
derivatives needs to be adjusted, as compared with the collateralised
version, with the adjustment essentially driven by the correlation
between market factors for a derivative and the funding spread.
Apart from rather obvious differences in discounting rates used for
CSA and non-CSA versions of the same derivative, we have exposed
the required changes to forward curves and, even, the volatility
information used for options. In a simple model with stochastic
funding spreads we demonstrated the typical sizes of these adjust-
ments and found them significant.

40

02 Piterbarg PCQF.indd 40 11/03/2013 10:09


FUNDING BEYOND DISCOUNTING: COLLATERAL AGREEMENTS AND DERIVATIVES PRICING

The author would like to thank members of the quantitative and


trading teams at Barclays Capital for thoughtful discussions, and
referees for comments that greatly improved the quality of the
chapter.

  1 In what follows we use Equation 2.3, 2.5 with either C = 0 or C = V. However, these formulas,
in their full generality, could be used to obtain, for example, the value of a derivative
covered by one-way (asymmetric) CSA agreement, or a more general case where the collat-
eral amount tracks the value only approximately.
  2 While a diffusion process for the funding spread may be unrealistic, the impact of more
complicated dynamics on the convexity adjustment is likely to be muted.

REFERENCES

Antonov A. and M. Arneguy, 2009, “Analytical Formulas for Pricing CMS Products in
the Libor Market Model with the Stochastic Volatility,” SSRN eLibrary.

Barden P., 2009, “Equity Forward Prices in the Presence of Funding Spreads,” ICBI
Conference, Rome, April.

Burgard C. and M. Kjaer, 2009, “Modelling and Successful Management of Credit-


counterparty Risk of Derivative Portfolios”, ICBI Conference, Rome, April.

Gregory J., 2009, “Being Two-faced over Counterparty Credit Risk”, Risk, February, pp
86–90.

Hull, J., 2006, Options, Futures and Other Derivatives (6e) (Upper Saddle River, NJ:
Pearson/ Prentice Hall).

Johannes M. and S. Sundaresan, 2007, “Pricing Collateralized Swaps,” Journal of Finance,


62, pp 383–410.

Karatzas I. and S. Shreve, 1996, Brownian Motion and Stochastic Calculus (2e) (New York,
NY: Springer).

41

02 Piterbarg PCQF.indd 41 11/03/2013 10:09


02 Piterbarg PCQF.indd 42 11/03/2013 10:09
3
Two Curves, One Price
Marco Bianchetti
Intesa Sanpaolo Bank

The credit crunch that began in the second half of 2007 has trig-
gered, among many consequences, the explosion of the basis
spreads quoted on the market between single-currency interest rate
instruments (swaps in particular) characterised by different under-
lying rate tenors (Xibor three-month and Xibor six-month, etc,
where Xibor denotes a generic interbank offered rate). In Figure 3.1,
we show a snapshot of the market quotations as of February 16,
2009 for the six basis swap term structures corresponding to the
four Euribor tenors, one month, three months, six months and 12
months. Such very high basis spreads reflect the increased liquidity
risk suffered by financial institutions and the corresponding prefer-
ence for receiving payments with higher frequency (quarterly
instead of semi-annually, etc). Other indicators of changes in the
interest rate markets are the divergence between deposit (Xibor-
based) and overnight indexed swaps (OIS, Eonia based for euro)
rates, and between forward rate agreement (FRA) contracts and the
corresponding forward rates implied by consecutive deposits (see,
for example, Ametrano and Bianchetti, 2009, Mercurio, 2009, and
Morini, 2009).
These frictions reveal that apparently similar interest rate
instruments with different underlying rate tenors are character-
ised, in practice, by different liquidity and credit risk premiums,
reflecting the different views and interests of market players.
Thinking in terms of more fundamental variables, for example, a
short rate, the credit crunch has acted as a sort of “symmetry
breaking mechanism”: from an unstable situation in which a

43

03 Biancheti PCQF.indd 43 11/03/2013 10:10


post-crisis quant finance

unique short rate process was able to model and explain the whole
term structure of interest rates of all tenors, towards a sort of
“segmentation” into sub-areas corresponding to instruments with
different underlying rate tenors, characterised, in principle, by
distinct dynamics, for example, distinct short rate processes. We
stress that market segmentation was already present (and well
understood) before the credit crunch (see, for example, Tuckman
and Porfirio, 2003), but not effective due to negligible basis
spreads.
Such evolution of the financial markets has had strong effects
on the methodology used to price and hedge interest rate deriva-
tives. In principle, a consistent credit and liquidity theory would
be required to account for the interest rate market segmentation,1
but unfortunately such a framework is not easy to construct (see,
for example, Mercurio, 2009, and Morini, 2009). In practice, an
empirical approach has prevailed among market practitioners,
based on the construction of multiple “forwarding” yield curves
from plain vanilla market instruments homogeneous in the under-
lying rate tenor, used to calculate future cashflows based on
forward interest rates with the corresponding tenor, and of a
“discounting” yield curve, used to calculate discount factors and
cashflows’ present values. Such a “double-curve” approach allows
for an immediate recovery of market prices of quoted instruments
but, unfortunately, it does not fulfil the classic no-arbitrage
constraints of the single-curve pricing approach.
In this chapter, we acknowledge the current market practice,
assuming the existence of a given methodology for bootstrapping
multiple homogeneous forwarding and discounting yield curves,
and focus on the consequences for pricing and hedging interest rate
derivatives. This is a central problem in the interest rate market,
which still lacks attention in the published financial literature. In
particular, Boenkost and Schmidt (2005) discuss two methodologies
for pricing cross-currency basis swaps, the first of which (the actual
pre-crisis common market practice) does coincide, once reduced to
the single-currency case, with the double-curve procedure presently
adopted by the market2 (see also Tuckman and Porfirio, 2003, and
Fruchard, Zammouri and Willems, 1995). Kijima, Tanaka and
Wong (2008) have extended the approach of Boenkost and Schmidt
(2005) to the (cross-currency) case of three curves for discount rates,

44

03 Biancheti PCQF.indd 44 11/03/2013 10:10


TWO CURVES, ONE PRICE

Figure 3.1 Quotations as of February 16, 2009 for the six euro basis
swap spread curves corresponding to the four Euribor swap curves 1M,
3M, 6M, 12M
80
1M v. 3M
70 1M v. 6M
60 1M v. 12M
Basis spread (bp)

3M v. 6M
50 3M v. 12M
40 6M v. 12M

30
20
10
0
1Y 2Y 3Y 4Y 5Y 6Y 7Y 8Y 9Y 10Y11Y12Y15Y20Y25Y30Y

Source: Reuters

Libor rates and bond rates. Finally, simultaneous with the develop-
ment of this chapter, Morini (2009) has been approaching the
problem in terms of counterparty risk, Mercurio (2009) in terms of
the extended Libor market model and Henrard (2009) using an
axiomatic model.
Here, we follow an alternative route with respect to those cited
above, in the sense that: we adopt a “bottom-up” practitioner’s
perspective, starting from the current market practice of using
multiple yield curves and working out its natural consequences,
looking for a minimal and light generalisation of well-known
frameworks, keeping things as simple as possible; we show how
no-arbitrage can be recovered in the double-curve approach by
taking into account the basis adjustment, whose term structure can
be extracted from available market quotations; and we use a
straightforward foreign currency analogy to derive generalised
double-curve market-like pricing expressions for basic single-
currency interest rate derivatives, such as FRAs, swaps, caps/floors
and swaptions.3

Pre- and post-credit crunch market practices for


pricing and hedging interest rate derivatives
Following the discussion above, we denote with Mx, x =
{d, f1, ... , fn} multiple, distinct interest rate sub-markets, characterised

45

03 Biancheti PCQF.indd 45 11/03/2013 10:10


post-crisis quant finance

by the same currency and by distinct bank accounts Bx, such that
t
Bx(t) = exp ∫0rx(u)du, where rx(t) is the associated short rate. We also
have multiple distinct yield curves Cx in the form of a continuous
term structure of discount factors, Cx = {T → Px(t0, T), T ≥ t0}, where t0
is the reference date (for example, the settlement date, or today) and
Px(t, T) denotes the price at time t ≥ t0 of the Mx-zero-coupon bond
for maturity T, such that Px(T, T) = 1. In each sub-market Mx we
assume the usual no-arbitrage relation:

Px (t,T2 ) = Px (t,T1 ) Px (t,T1 ,T2 ) ,t ≤ T1 < T2  (3.1)

where Px(t, T1, T2) denotes the Mx-forward discount factor from time
T2 to time T1. By expressing the latter in terms of the corresponding
simple compounded forward rate Fx(t; T1, T2), we obtain from
Equation 3.1 the familiar no-arbitrage expression:

Px (t,T1 ) − Px (t,T2 )
Fx (t;T1 ,T2 ) =  (3.2)
τ x (T1 ,T2 ) Px (t,T2 )
where tx(T1, T2) is the year fraction between times T1 and T2 with day
count dcx. Equation 3.2 can be also derived as the fair value condi-
tion of the FRA contract4 with price at time t ≤ T1 < T2, for the
fixed-rate payer side, given by:

FRAx (t;T1 ,T2 , K, N )

{ }
T
= N Px (t,T2 ) τ x (T1 ,T2 ) EQt x ⎡⎣Lx (T1 ,T2 )⎤⎦ − K
2

= N Px (t,T2 ) τ x (T1 ,T2 ) ⎡⎣ Fx (t;T1 ,T2 ) − K ⎤⎦  (3.3)

where N is the nominal amount, Lx(T1, T2) := Fx(T1; T1, T2) is the T1 spot
Xibor rate, K the strike rate (sharing the same compounding and
day count conventions), QxT denotes the Mx-T2- forward measure
2

corresponding to the numeraire Px(t, T2), EtQ[.] is the expectation at


time t with respect to measure Q and filtration Ft, encoding the
market information available up to time t. We stress that the
assumptions above imply that we have multiple interest rate sub-
markets, each with the same properties of the “classic” interest rate
market before the crisis. This is a strong hypothesis, which could be
relaxed in more sophisticated frameworks.
The pre-crisis approach for pricing and hedging single-currency
interest rate derivatives was based on a single-curve procedure,
well known to the financial world. For instance, a 5.5-year maturity

46

03 Biancheti PCQF.indd 46 11/03/2013 10:10


TWO CURVES, ONE PRICE

euro floating swap leg on Euribor one-month (not directly quoted


on the market) was commonly priced using discount factors and
forward rates calculated on a single yield curve C, built from quoted
plain vanilla interest rate derivatives (for example, deposit, FRA,
futures and swap contracts) using a preferred bootstrapping proce-
dure. The delta sensitivity was calculated by shocking one by one
the market pillars, and the resulting delta risk was hedged using
the suggested amounts (hedge ratios) of five-year and six-year
Euribor six-month swaps.5 The post-crisis market practice, for a
general single-currency interest rate derivative with m ≥ 1 future
coupons with payouts p = {p1, ... , pm}, generating m cashflows c = {c1,
... , cm} at future dates T = {T1, ... , Tm}, t < T1 < ... < Tm, can be summa-
rised as follows.6

❑❑ Build one discounting curve Cd using the preferred selection of


vanilla interest rate market instruments and bootstrapping
procedure.
❑❑ Build multiple distinct forwarding curves Cf , ... , Cf using the
n 1

preferred selections of distinct sets of market instruments, each


homogeneous in the underlying rate tenor, and bootstrapping
procedures.
❑❑ For each interest rate coupon i ∈ {1, ... , m}, calculate the relevant
forward rates with tenor f using the corresponding curve Cf as in
Equation 3.2:
Pf (t,Ti−1 ) − Pf (t,Ti )
Ff (t;Ti−1 ,Ti ) = , t ≤ Ti−1 < Ti  (3.4)
τ f (Ti−1 ,Ti ) Pf (t,Ti )
❑❑ Calculate cashflows ci as expectations at time t of the corre-
sponding coupon payouts pi with respect to the discounting
Ti-forward measures QdT , associated with the numeraire7 Pd(t, Ti),
i

as:
QTi
c (t,Ti , π i ) = E t d [ π i ]  (3.5)

❑❑ Calculate the relevant discount factors Pd(t, Ti) using the


discounting curve Cd.
❑❑ Calculate the derivative’s price at time t as the sum of the
discounted cashflows:
m m
Ti
π (t;T ) = ∑ Pd (t,Ti ) c (t,Ti , π i ) = ∑ Pd (t,Ti ) EQt d [ π i ]  (3.6)
i=1 i=1

47

03 Biancheti PCQF.indd 47 11/03/2013 10:10


post-crisis quant finance

❑❑ Calculate the delta sensitivity with respect to the market pillars of


each curve and hedge the resulting delta risk using the suggested
amounts (hedge ratios) of the corresponding set of vanillas.

For instance, the 5.5-year floating swap leg cited above is currently
priced using Euribor one-month forward rates calculated on the C1M
forwarding curve (bootstrapped using Euribor one-month vanillas
only), plus discount factors calculated on the discounting curve Cd.
The delta sensitivity is calculated with respect to the market pillars
of both C1M and Cd curves, and the resulting delta risk is hedged
using the suggested amounts (hedge ratios) of five-year and six-
year Euribor one-month swaps plus the suggested amounts of
five-year and six-year instruments from the discounting curve.8
The static double-curve methodology described above can be
extended, in principle, by adopting multiple distinct models for
the evolution of each underlying interest rate with tenors f1, ... , fn
to calculate the dynamics of yield curves and expected cashflows.
The volatility/correlation dependencies carried by the models
would imply, in principle, the bootstrapping of multiple distinct
variance/covariance matrices and hedging the corresponding
sensitivities using volatility- and correlation-dependent plain
vanilla market instruments. A more general problem has been
approached in Mercurio (2009) in the context of the generalised
Libor market model. In this chapter, we will focus only on the
basic matter of static yield curves and leave out the dynamical
volatility/correlation dimensions. In the following two sections,
we will work out some consequences of the assumptions above in
terms of no-arbitrage.

No-arbitrage and basis adjustment


First, we notice that in the double-curve framework, classic no-arbi-
trage relations are broken. For instance, Equations 3.1 and 3.2
become:
Pd (t,T2 ) = Pd (t,T1 ) Pf (t,T1 ,T2 )  (3.7)

1 Pf (t,T2 )
Pf (t,T1 ,T2 ) = =  (3.8)
1+ Ff (t;T1 ,T2 ) τ f (T1 ,T2 ) Pf (t,T1 )
but clearly cannot hold at the same time. No-arbitrage is recovered
by taking into account the basis adjustment (to be distinguished
from the quoted market basis of Figure 3.1) defined as:

48

03 Biancheti PCQF.indd 48 11/03/2013 10:10


TWO CURVES, ONE PRICE

1
Pf (t,T1 ,T2 ) :=
1+ ⎡⎣Fd (t;T1 ,T2 ) + BA fd (t;T1 ,T2 )⎤⎦τ d (T1 ,T2 )  (3.9)

From Equation 3.9, we obtain an expression in terms of discount


factors from Cd and Cf curves as:
BA fd (t;T1 ,T2 )
1 ⎡⎣Ff (t;T1 ,T2 ) τ f (T1 ,T2 ) − Fd (t;T1 ,T2 ) τ d (T1 ,T2 )⎤⎦
=
τ d (T1 ,T2 )
1 ⎡ Pf (t,T1 ) P (t,T ) ⎤
1
= ⎢ − d ⎥  (3.10)
τ d (T1 ,T2 ) ⎢⎣ Pf (t,T2 ) Pd (t,T2 ) ⎥⎦

Note that if Cd = Cf we recover the single-curve case BAfd(t; T1, T2) = 0.
In Figure 3.2, we plot a numerical example of basis adjustment in
a realistic market situation. We bootstrap five distinct yield curves
Cx = {Cd, C1M, C3M, C6M, C12M}. The discounting curve Cd is built with a
typical “pre-crisis” standard recipe (using the most liquid deposits,
futures and swaps).9 The four forwarding curves are built from
convenient selections of plain vanilla instruments with homoge-
neous underlying rate tenors. A smooth and robust algorithm is
used for interpolations (monotonic cubic spline on log discounts, as
described in Ametrano and Bianchetti, 2009). In the upper panels,
we plot the term structure of the four corresponding basis adjust-
ment curves calculated through Equation 3.10. Overall, we notice
that they reveal a complex micro-term structure not present either
in the monotonic basis swaps market quotes of Figure 3.1 or in the
smooth yield curves Cx (not shown here, see Ametrano and
Bianchetti, 2009). Such effect is due essentially to an amplification
mechanism of small local differences between Cd and Cf forward
curves. In the lower panels, we also show that smooth yield curves
are a crucial input for the basis adjustment: using a non-smooth
bootstrapping, for example, linear interpolation on zero rates (still a
diffused market practice), the zero curve apparently shows no
particular problems, while the forward curve displays a jagged
shape inducing, in turn, strong and unnatural oscillations in the
basis adjustment.
We conclude that, once a smooth and robust bootstrapping tech-
nique for yield curve construction is used, the richer term structure
of the basis adjustment curves provides a sensitive indicator of the
tiny, but observable, static differences between different interest
rate market sub-areas in the post-credit crunch interest rate world,

49

03 Biancheti PCQF.indd 49 11/03/2013 10:10


post-crisis quant finance

Figure 3.2 Basis adjustment as of end of day February 16, 2009

Basis adjustment 0–3 years


A 80

60

40

20
Basis points

–20

–40 1M v. Disc
3M v. Disc
–60 6M v. Disc
12M v. Disc
–80
Feb May Aug Nov Feb May Aug Nov Feb May Aug Nov Feb
09 09 09 09 10 10 10 10 11 11 11 11 12

Euro forwarding curve 3M, 0–30 years


B 6

4
%

2
Zero rates
Forward rates
1
Feb Feb Feb Feb Feb Feb Feb Feb
09 13 17 21 25 29 33 37

and a tool to assess the degree of liquidity and credit issues in


interest rate derivatives’ prices. It is also helpful for a better under-
standing of the profit and loss encountered when switching between
the single- and double-curve worlds.

No-arbitrage and quanto adjustment


A second important issue regarding no-arbitrage arises in the
double-curve framework: from Equation 3.6 we see that, for
instance, the single-curve FRA price in Equation 3.3 is generalised
into the following double-curve expression:

50

03 Biancheti PCQF.indd 50 11/03/2013 10:10


TWO CURVES, ONE PRICE

Figure 3.2 (continued)

Basis adjustment 3–30 years


C 10
8
6
4
Basis points

2
0
–2
–4 1M v. Disc
–6 3M v. Disc
–8 6M v. Disc
12M v. Disc
–10
Feb Feb Feb Feb Feb Feb Feb Feb Feb Feb
12 15 18 21 24 27 30 33 36 39

Basis adjustment 3–30 years


D 4

0
Basis points

–2

–4

–6 1M v. Disc
3M v. Disc
–8 6M v. Disc
12M v. Disc
–10
Feb Feb Feb Feb Feb Feb Feb Feb Feb Feb
12 15 18 21 24 27 30 33 36 39
Note: Graphs A and C: basis adjustment from Equation 3.10 (basis points) for daily
sampled 3M-tenor forward rates calculated on C1M, C3M, C6M and C12M curves
against Cd taken as reference curve. Graphs A: 0–3-year data; graph C: 3–30-year
data on magnified scales. Graphs B and D: the effect of poor interpolation
schemes (linear on zero rates, see Ametrano and Bianchetti, 2009) on
zero/forward 3M rates (graph B) and on basis adjustment (graph D)

FRA (t;T1 ,T2 , K, N )

{ }
T2
= N Pd (t,T2 ) τ f (T1 ,T2 ) E Qt x ⎡⎣L f (T1 ,T2 )⎤⎦ − K

≠ N Pd (t,T2 ) τ f (T1 ,T2 ) ⎡⎣Ff (t;T1 ,T2 ) − K ⎤⎦  (3.11)

51

03 Biancheti PCQF.indd 51 11/03/2013 10:10


post-crisis quant finance

Obviously the forward rate Ff(t; T1, T2) is not, in general, a martin-
gale under the discounting measure QdT , and the expression in the
2

third line above is an approximation discarding the adjustment


coming from this measure mismatch. Hence, a correct no-arbitrage
pricing within the double-curve framework requires a theoretical
model for the calculation of expectations, as in the first line of
Equation 3.11. This task can be accomplished by resorting to the
natural analogy with cross-currency derivatives: if we identify the
two interest rate markets Md and Mf with a domestic and a foreign
market, Cd and Cf with the corresponding yield curves, and the bank
accounts Bd(t), Bf(t) with the corresponding currencies, respectively,10
we may recognise on the right-hand side of Equation 3.11 the expec-
tation of the foreign forward rate with respect to the domestic
forward measure, thus leading to the well-known quanto adjust-
ment commonly encountered in the pricing of cross-currency
derivatives. We revisit its derivation here within the present double-
curve single-currency framework.11
In the general double-curve double-currency case, no-arbitrage
requires the existence at any time t0 ≤ t ≤ T of a spot and a forward
exchange rate between equivalent amounts of money in the two
currencies:
c d (t ) P (t,T )
x fd (t ) = , X fd (t,T ) = x fd (t ) f  (3.12)
c f (t ) Pd (t,T )
where the subscripts f and d stand for foreign and domestic, cd(t) is
any cashflow (amount of money) at time t in units of domestic
currency and cf(t) is the corresponding cashflow at time t (the corre-
sponding amount of money) in units of foreign currency. Our
particular double-curve single-currency case is obtained from
Equations 3.12 above simply by reading the subscripts f and d as
shorthand for forwarding and discounting, and collapsing today’s
spot exchange rate to xfd(t0) = 1. Note that for Cd = Cf, we recover the
single-currency single-curve case Xfd(t0, T) = 1 ∀ T.
According to standard market practice, we assume a (driftless)
lognormal martingale dynamics for Cf (foreign) forward rates:
dFf (t;T1 ,T2 )
= σ f (t ) dWfT2 (t ) , t ≤ T1  (3.13)
Ff (t;T1 ,T2 )

where sf(t) is the volatility of the process and WfT is a Brownian 2

motion under the forwarding (foreign) T2-forward measure QfT 2

52

03 Biancheti PCQF.indd 52 11/03/2013 10:10


TWO CURVES, ONE PRICE

associated with the Cf (foreign) numeraire Pf(t, T2). Furthermore,


since Xfd(t, T2) in Equation 3.12 is the ratio between the price at time
t of a Cd (domestic) tradable asset and the Cd (domestic) numeraire, it
must evolve according to a (driftless) martingale process under the
associated discounting (domestic) T2-forward measure:

dX fd (t,T2 )
= σ X (t) dWXT2 (t ) , t ≤ T2  (3.14)
X fd (t,T2 )

where sX(t) is the volatility of the process and WXT is a Brownian 2

motion under QdT such that: 2

dW fT2 (t ) dWXT2 (t ) = ρ fX (t) dt  (3.15)

Now, returning to the change-of-numeraire technique (see Brigo


and Mercurio, 2006, Jamshidian, 1989, and Geman, El Karoui and
Rochet, 1995), we switch the dynamics of Ff(t; T1, T2) from the
forwarding (foreign) measure QfT associated with the numeraire
2

Pf(t, T2) to the discounting (domestic) measure QdT associated with 2

the numeraire Pd(t, T2), and obtain:

dFf (t;T1 ,T2 )


= µ f (t) dt + σ f (t ) dW fT2 (t) , t ≤ T1  (3.16)
Ff (t;T1 ,T2 )

µ f (t ) = −σ f (t ) σ X (t ) ρ fX (t )  (3.17)

T
EQt d ⎡⎣L f (T1 ,T2 )⎤⎦ = Ff (t;T1 ,T2 ) + QA fd (t,T1 , σ f , σ X , ρ fX ) 
2
(3.18)

QA fd (t,T1 , σ f , σ X , ρ fX ) = Ff (t;T1 ,T2 ) ⎡⎢⎣exp ∫ µ f ( u) du − 1⎤⎥⎦ 


T1
t
(3.19)

where in Equation 3.18 we have defined an additive quanto adjust-


ment.12 We stress that a non-trivial adjustment is obtained if and
only if the forward exchange rate Xfd is stochastic (sX ≠ 0) and corre-
lated with the forward rate Ff(rfX ≠ 0), otherwise expression 3.19
collapses to the single-curve case QAfd = 0.
The derivation above can be remapped to swap rates. Given
two increasing date sets T = {T0, ... , Tn}, S = {S0, ... , Sm}, T0 = S0 ≥ t and
an interest rate swap with a floating leg paying at times Ti, i = 1, ...,
n the Xibor rate L(Ti–1, Ti) fixed at time Ti–1, versus a fixed leg paying
at times Sj, j = 1, ... , m a fixed rate, we obtain:

EQt d ⎡⎣S f (T0 , T,S)⎤⎦ = S f (t, T,S) + QA fd (t, T,S, ν f , ν Y , ρ fY ) 


S

(3.20)

53

03 Biancheti PCQF.indd 53 11/03/2013 10:10


post-crisis quant finance

QA fd (t, T,S, ν f , ν Y , ρ fY )

= S f (t, T,S) ⎡⎢⎣exp ∫ λ f (u, T,S) du − 1⎥⎦⎤ 


T0
(3.21)
t

λ f (t, T,S) = −ν f (t, T, S) ν Y (t,S) ρ fY (t, T, S)  (3.22)

where Sf(t, T, S) is the (fair) swap rate on curve Cf, QSd is the
discounting (domestic) swap measure associated with the annuity
Ad(t, S) on curve Cd, nf(t, T, S) is the swap rate volatility, nY(t, S) is the
volatility of the swap forward exchange rate defined as:

A f (t,S)
Yfd (t,S) = x fd (t )  (3.23)
Ad (t,S)

(equivalent to Equation 3.12), and rfY(t, T, S) is the correlation


between the swap rate and the swap forward exchange rate. The
same considerations as above apply.

Double-curve pricing interest rate derivatives


The results above allow us to derive no-arbitrage, market-like,
double-curve single-currency pricing formulas for interest rate
derivatives.
The FRA, whose single-curve price is given in Equation 3.3, is
priced at time t ≤ T1 ≤ T2 as:
FRA (t;T1 ,T2 , K, N )

{ }
2 T
= N Pd (t,T2 ) τ f (T1 ,T2 ) EQt d ⎡⎣L f (T1 ,T2 )⎤⎦ − K

= N Pd (t,T2 ) τ f (T1 ,T2 ) ⎡⎣Ff (t;T1 ,T2 )


+ QA fd (t,T1 , σ f , σ X , ρ fX ) − K ⎤⎦  (3.24)

where we have used Equation 3.18 and the quanto adjustment term
is given by Equation 3.19.
For a (payer) floating versus fixed swap with payment date
vectors T, S as above, we have the price at time t ≤ T0:
Swap (t; T,S, K, N )
m
= −∑ N j Pd (t,Sj ) τ d (Sj−1 ,Sj ) K j
j=1
n
+∑ N i Pd (t,Ti ) τ f (Ti−1 ,Ti ) ⎡⎣Ff (t;Ti−1 ,Ti )
i=1

+ QA fd (t,Ti−1 , σ f , i , σ X , i , ρ fX , i )⎤⎦  (3.25)

54

03 Biancheti PCQF.indd 54 11/03/2013 10:10


TWO CURVES, ONE PRICE

For caplet/floorlet options on T1-spot rates Lf(T1, T2), the standard


market-like pricing expression at time t ≤ T1 ≤ T2 is modified as
follows:
cf (t;T1 ,T2 , K, ω , N )
= NEQt d ⎡⎣ Max ω ⎡⎣L f (T1 ,T2 ) − K ⎤⎦ τ f (T1 ,T2 )⎤⎦
T2

{ }
= N Pd (t,T2 ) τ f (T1 ,T2 ) Black ⎡⎣Ff (t;T1 ,T2 )

+QA fd (t,T1 , σ f , σ X , ρ fX ) , K, µ f , σ f , ω ⎤⎦  (3.26)

where w = ±1 for caplets/floorlets, respectively, and Black[F, K, m, s,


w] is the standard Black formula. Hence, cap/floor options prices
are given at t ≤ T0 by:
n
CF (t;T, K, ω , N ) = ∑ cf (t;Ti−1 ,Ti , K i , ωi , N i )
i=1
n
= ∑ N i Pd (t,Ti ) τ f (Ti−1 ,Ti ) ×Black ⎡⎣Ff (t;Ti−1 ,Ti )
i=1

QA fd (t,Ti−1σ f , i , σ X , i , ρ fX , i ) K i , µ f , i , σ f , i , ωi ⎤⎦  (3.27)

Finally, for swaptions on T0-spot swap rates Sf(T0, T, S), the standard
market-like pricing expression, using the discounting swap
measure QSd associated with the numeraire Ad(t, S) on curve Cd, is
modified as follows at time t ≤ T0:
Swaption (t; T, S, K, ω , N )
= NEQt d Max ⎡⎣ω (S f (T0 , T,S) − K )⎤⎦ Ad (t,S)
S

{ }
= N Ad (t,S) Black ⎡⎣S f (t, T,S)
+ QA fd (t, T,S, ν f , ν Y , ρ fY ) , K, λ f , ν f , ω ⎤⎦  (3.28)

where we have used Equation 3.20 and the quanto adjustment term
is given by Equation 3.21.
The calculations above also show that basic interest rate deriva-
tives prices become, in principle, volatility and correlation
dependent. The volatilities and the correlation in Equations 3.17
and 3.22 can be inferred from market data. In the euro market, the
volatilities sf and nf can be extracted from quoted caps/floors/
swaptions on Euribor six-month, while for sX, rfX and nY, rfY one
must resort to historical estimates.
In Figure 3.3, we show a numerical scenario for the quanto
adjustment in Equation 3.19. We see that, for realistic values

55

03 Biancheti PCQF.indd 55 11/03/2013 10:10


post-crisis quant finance

of volatility, the magnitude of the additive adjustment may be


important or negligible, depending on the correlation. Note that
positive correlation implies negative adjustment, thus lowering the
forward rates in the pricing formulas. Through historical estima-
tion we obtain, using Equation 3.12 with the same yield curves as in
Figure 3.2 and considering one year of backward data, forward
exchange rate volatilities below 5–10% and correlations within the
range [–0.6; +0.4].
We conclude that pricing interest rate derivatives without the
quanto adjustment (as in the third line of Equation 3.11) leaves, in
principle, the door open to arbitrage opportunities. In practice,
the correction depends on financial variables currently not quoted
on the market, thus making it very difficult to set up arbitrage
positions and lock positive gains expected in the future today.
Obviously, a better understanding of this conundrum requires us
to go beyond a pure interest rate description and introduce credit
issues, as outlined in the next section. On the other hand, given
that, regardless of the model adopted, an adjustment is imposed
by no-arbitrage, the present framework has the advantage of
introducing a minimal set of parameters with a transparent finan-
cial interpretation and leading to familiar pricing formulas, thus

Figure 3.3 Numerical scenarios for the quanto adjustment (from


Equation 3.19) corresponding to three different combinations of (flat)
volatility values as a function of the correlation
100
80
Quanto adjustment (bp)

60
40
20
0
–20
–40
–60 σf = 10%, σX = 2.5%
σf = 20%, σX = 5%
–80 σf = 30%, σX = 10%
–100
–1.0 –0.8 –0.6 –0.4 –0.2 0 0.2 0.4 0.6 0.8 1.0
Correlation
Note: The time interval is fixed to T1 – t = 10 years and the forward rate entering
Equation 3.19 to 3%

56

03 Biancheti PCQF.indd 56 11/03/2013 10:10


TWO CURVES, ONE PRICE

constituting a simple and easy-to-use tool for practitioners and


traders to promptly intercept possible market evolutions.

No-arbitrage and counterparty risk


Both the basis and the quanto adjustment discussed above find a
simple financial explanation in terms of counterparty risk. From
this point of view we may identify Pd(t, T) with a default-free zero-
coupon bond and Pf(t, T) with a risky zero-coupon bond with
recovery rate Rf, emitted by a generic interbank counterparty subject
to default risk. The associated risk-free and risky Xibor rates, Ld(T1,
T2) and Lf(T1, T2), respectively, are the underlyings of the corre-
sponding derivatives, for example, FRAd and FRAf. Adapting the
simple credit model proposed in Mercurio (2009), we may write,
using our notation:13

Pf (t,T ) = Pd (t,T ) R (t;t,T, R f )  (3.29)

1 ⎡ P (t,T ) R (t;t,T , R ) ⎤
1 f
Ff (t;T1 ,T2 ) = ⎢ d 1
− 1⎥  (3.30)
τ f (T1 ,T2 ) ⎢⎣ Pd (t,T2 ) R (t;t,T2 , R f ) ⎥⎦

FRA f (t;T1 ,T2 , K )


Pd (t,T1 )
= − Pd (t,T2 ) ⎡⎣1+ Kτ f (T1 ,T2 )⎤⎦  (3.31)
R (t;T1 ,T2 , R f )

R (t;T1 ,T2 , R f ) := R f + (1− R f ) EQt d ⎡⎣qd (T1 ,T2 )⎤⎦  (3.32)

where qd(T1, T2) = EtQ [1t (t)>T] is the counterparty survival probability
d

up to time T2 expected at time T1 under the risk-neutral discounting


measure Qd. Comparing the expressions above with Equations 3.10
and 3.24, we obtain:

1 Pd (t,T1 ) ⎡ R (t;t,T1 , R f ) ⎤
BA fd (t;T1 ,T2 ) = ⎢ − 1⎥  (3.33)
τ d (T1 ,T2 ) Pd (t,T2 ) ⎢⎣ R (t;t,T2 , R f ) ⎥⎦

QA fd (t;T1 ,T2 )

1 Pd (t,T1 ) ⎡ 1 R (t;t,T1 , R f ) ⎤
= ⎢ − ⎥  (3.34)
τ f (T1 ,T2 ) Pd (t,T2 ) ⎢⎣ R (t;T1 ,T2 , R f ) R (t;t,T2 , R f ) ⎥⎦

Thus the basis and the quanto adjustment can be expressed, under
simple credit assumptions, in terms of risk-free zero-coupon bonds,
survival probability and recovery rate. A more complex credit

57

03 Biancheti PCQF.indd 57 11/03/2013 10:10


post-crisis quant finance

model, as, for example, in Morini (2009), would also be able to


explain the spot exchange rate in Equation 3.12 in terms of credit
variables. Note that the single-curve case Cd = Cf is recovered for
vanishing default risk (full recovery).

Conclusion
We have shown that after the credit crunch the classical single-curve
no-arbitrage relations are no longer valid and can be recovered by
taking into account the basis adjustment, whose term structure can
be extracted from available market quotations. Our numerical
results show that, once a smooth and robust bootstrapping tech-
nique for yield curve construction is used, the richer term structure
of the basis adjustment curves provides a sensitive indicator of the
tiny, but observable, static differences between different interest rate
market sub-areas in the post-credit crunch interest rate world.
Furthermore, the basis adjustment may also be helpful for a better
understanding of the profit and loss encountered when switching
between single- and double-curve worlds.
Using the foreign currency analogy, we have recalculated gener-
alised, double-curve no-arbitrage market-like pricing formulas for
basic interest rate derivatives, FRAs, swaps, caps/floors and swap-
tions in particular. When the forward exchange rate between the
two curves is stochastic and correlated with the forward rate, these
expressions include a single-currency version of the quanto adjust-
ment typical of cross-currency derivatives, naturally arising from
the change between the numeraires, or probability measures, asso-
ciated with the two yield curves. Numerical scenarios show that the
quanto adjustment can be important, depending on volatilities and
correlation. Unadjusted interest rate derivatives prices are thus, in
principle, not arbitrage-free, but, in practice, at the moment the
market does not trade enough instruments to set up arbitrage
positions.
Both the basis adjustment and the quanto adjustment find a
natural financial explanation in terms of counterparty risk within a
simple credit model including a default-free and a defaultable zero-
coupon bond.
Besides the current lack of information about volatility and corre-
lation, the present framework has the advantage of introducing a
minimal set of parameters with a transparent financial interpreta-

58

03 Biancheti PCQF.indd 58 11/03/2013 10:10


TWO CURVES, ONE PRICE

tion and leading to familiar pricing formulas, thus constituting a


simple and easy-to-use tool for practitioners and traders to promptly
intercept possible market evolutions.

The author acknowledges fruitful discussions with M. de Prato, M.


Henrard, M. Joshi, C. Maffi, G. V. Mauri, F. Mercurio, N. Moreni, A.
Pallavicini and many colleagues in risk management and participants
at Quant Congress Europe 2009. A particular mention goes to M.
Morini and M. Pucci for their encouragement, and to F. M. Ametrano
and the QuantLib community for the open-source developments
used here. The views expressed are those of the author and do not
represent the opinions of his employer; they are not responsible for
any use that may be made of these contents.

  1 This would also explain why the frictions cited above do not necessarily lead to arbitrage
opportunities, once counterparty and liquidity risks are taken into account.
  2 These authors were concerned with the fact that their first methodology was not consistent
with the pre-crisis single-curve market practice for pricing single-currency swaps.
Actually, it has become consistent with the post-crisis multiple curve practice.
  3 Some details have been omitted here for brevity (see Ametrano and Bianchetti, 2009, and
Bianchetti, 2009).
  4 See, for example, Brigo and Mercurio (2006), section 1.4. Note that here we are using the
“textbook” representation of the FRA contract, which is slightly different from the market
term sheet (see also Morini, 2009).
  5 We refer here to the case of local yield curve bootstrapping methods, for which there is no
sensitivity delocalisation effect (see Hagan and West, 2006 and 2008).
  6 This is a description of what really happens inside an investment bank after August 2007.
Even though it is rather familiar to many practitioners in the financial world, we summa-
rise it here in order to keep in touch with a larger audience, and to remark on the changes
induced by the credit crunch.
  7 We use the T-forward measure here because it emphasises that the numeraire is associated
with the discounting curve; obviously any other equivalent measure would be fine as well.
  8 The construction of the “right” discounting curve at first step in the post-credit crunch
world is a debated question that we do not consider here. See, for example, Piterbarg (2010).
  9 This particular discounting curve, Euribor-based, is not considered risk-free in the post-
credit crunch world. Anyway, different choices (for example, an Eonia curve) as well as other
technicalities of the bootstrapping would obviously lead to slightly different numerical
results, but do not alter the conclusions drawn here.
10 Notice the fortunate notation, where d stands either for “discounting” or “domestic” and f
for “forwarding” or “foreign”, respectively.
11 In particular, we will adapt to the present context the discussion found in Brigo and
Mercurio, Chapters 2.9 and 14.4.
12 Frequently the quanto adjustment for cross-currency derivatives is defined as a multipli-
cative factor. Here, we prefer an additive definition to be consistent with the additive basis
adjustment in Equation 3.10)

59

03 Biancheti PCQF.indd 59 11/03/2013 10:10


post-crisis quant finance

13 In particular, in contrast to Mercurio (2009), we use here the FRA definition of Equation
3.3, leading to Equation 3.31.

REFERENCES

Ametrano F. and M. Bianchetti, 2009, “Bootstrapping the Illiquidity: Multiple Yield


Curves Construction for Market Coherent Forward Rates Estimation,” in Fabio Mercurio
(Ed), Modelling Interest Rates: Advances in Derivatives Pricing (London, England: Risk
Books).

Bianchetti M., 2009, “Two Curves, One Price: Pricing and Hedging Interest Rate
Derivatives Decoupling Forwarding and Discounting Yield Curves,” working paper
(available at http://ssrn. com/abstract=1334356).

Boenkost W. and W. Schmidt, 2005, “Cross Currency Swap Valuation,” working paper,
HfB Business School of Finance & Management.

Brigo D. and F. Mercurio, 2006, Interest Rate Models – Theory and Practice (2e) (Berlin,
Germany: Springer).

Fruchard E., C. Zammouri and E. Willems, 1995, “Basis for Change,” Risk, October, pp
70–75.

Geman H., N. El Karoui and J. Rochet, 1995, “Changes of Numeraire, Changes of


Probability Measure and Option Pricing,” Journal of Applied Probability, 32(2), pp 443–58.

Hagan P. and G. West, 2006, “Interpolation Methods for Curve Construction,” Applied
Mathematical Finance, 13(2), June, pp 89–129.

Hagan P. and G. West, 2008, “Methods for Constructing a Yield Curve,” Wilmott
Magazine, May, pp 70–81.

Henrard M., 2009, “The Irony in the Derivatives Discounting – Part II: The Crisis,”
working paper (available at http://ssrn.com/abstract=1433022).

Jamshidian F., 1989, “An Exact Bond Option Formula,” Journal of Finance, 44, pp 205–09.

Kijima M., K. Tanaka and T. Wong, 2008, “A Multi-quality Model of Interest Rates,”
Quantitative Finance, 9(2), pp 133–145.

Mercurio F., 2009, “Post Credit Crunch Interest Rates: Formulas and Market Models,”
working paper, Bloomberg (available at http://ssrn.com/abstract= 1332205).

Morini M., 2009, “Solving the Puzzle in the Interest Rate Market,” working paper
(available at http://ssrn.com/abstract=1506046).

Piterbarg V., 2010, “Funding Beyond Discounting: Collateral Agreements and Derivatives
Pricing,” Risk, February, pp 97–102.

Tuckman B. and P. Porfirio, 2003, “Interest Rate Parity, Money Market Basis Swaps, and
Cross-currency Basis Swaps,” Fixed Income Liquid Markets Research, Lehman Brothers.

60

03 Biancheti PCQF.indd 60 11/03/2013 10:10


4
A Libor Market Model with
a Stochastic Basis
Fabio Mercurio
Bloomberg

The 2007 credit crunch brought unprecedented levels and volatility


of basis spreads in the interest rate market. Classic no-arbitrage
rules broke down and rates that used to closely track each other
suddenly diverged. Discrepancies between theoretically equivalent
rates were present in the market even before 2007. For instance,
deposit rates and overnight indexed swap (OIS) rates for the same
maturity had always been a few basis points apart. Likewise, swap
rates with the same maturity, but based on different Libor tenors,
had always been quoted at a non-zero spread. However, all these
spreads were generally regarded as negligible and often assumed
to be zero when constructing zero-coupon curves or pricing interest
rate derivatives.
Since August 2007, basis spreads have been neither constant nor
so small as to be safely ignored. Practitioners acknowledged the
presence of several yield curves in the market, and started to use a
given discount curve to calculate net present values (NPVs) in a
default-free setting, and different forward Libor curves to generate
future cashflows dependent on different Libor tenors (see, for
example, Ametrano and Bianchetti, 2009).
The assumption of distinct discount and forward curves, for the
same currency and in the absence of counterparty risk, immediately
invalidates the classic pricing principles, which were built on the
cornerstone of a single zero-coupon curve that contains all relevant
information about the (risk-adjusted) projection of future rates and

61

04 Mercurio PCQF.indd 61 11/03/2013 10:10


post-crisis quant finance

the NPV calculation of associated payouts. A new model paradigm


is thus needed to accommodate the market practice of using
multiple interest rate curves (see also Bianchetti, 2010, and Kenyon,
2010).
In this chapter, we extend the Libor market model (LMM) to the
multi-curve setting by modelling the basis between OIS and
forward rate agreement (FRA) rates, which is consistent with the
market practice of building (forward) Libor curves at a spread over
the OIS curve.1 To this end, we will assume that the discount curve
coincides with that stripped from OIS swap rates. This assumption
is reasonable due to the collateral agreements that are typically in
place between banks, which led to the recent market practice of
pricing swaps and swaptions with OIS discounting (see also
Whittall, 2010).
The market practice of using OIS discounting necessitates a model
framework where OIS and Libor rates are jointly modelled. Our
multi-curve LMM goes in this direction, providing a consistent
framework for the valuation of any (collateralised) interest rate
derivative, from linear to exotic. In addition, our extension can be
used to price contracts that depend on more than one Libor tenor or
to link volatilities of Libor rates that belong to different curves (see,
for example, our numerical example below). In such cases, tradi-
tional single-curve models fail to provide sensible valuations because
of the impossibility of simultaneously fitting forward Libor curves
that are associated with different tenors.
Note also that no market data on OIS or basis volatilities is
needed for our LMM calibration. In fact, OIS rates and basis spreads
can be viewed as factors driving the evolution of Libor rates. This is
similar to what we have in some short-rate models where the
instantaneous rate is defined as the sum of two (or more) additive
factors. Such factors do not need specific options to be calibrated to
but their parameters can be fitted to market quotes of standard
(Libor-based) caps and swaptions.

Assumptions and definitions


We assume we are given a single discount curve to be used in the
calculation of all NPVs. This curve is assumed to coincide with the
OIS zero-coupon curve, which is in turn assumed to be stripped
from market OIS rates and defined for every possible maturity T →

62

04 Mercurio PCQF.indd 62 11/03/2013 10:10


A LIBOR MARKET MODEL WITH A STOCHASTIC BASIS

PD(0, T) = POIS(0, T), where PD(t, T) denotes the discount factor (zero-
coupon bond) at time t for maturity T. The subscript D stands for
discount curve.
We assume that the tradable assets in our economy, at time t, are
the zero-coupon bonds PD(t, T) and the floating legs of (theoretical)
FRAs setting at times Tk−1 and paying at times Tk, t ≤ Tk−1 < Tk, for a
given time structure T0, ... , TM.
Consider times t, Tk−1 and Tk, t ≤ Tk−1 < Tk. The time-t FRA rate Lk(t)
is defined as the fixed rate to be exchanged at time Tk for the Libor
rate L(Tk−1, Tk) so that the swap has zero value at time t.
As in Kijima, Tanaka and Wong (2009), the pricing measures we
will consider are those associated with the discount curve.2
Denoting by QTD the T-forward measure with numeraire the zero-
coupon bond PD(t, T), we then assume the FRA rate Lk(t) to be
defined by:

Lk (t ) := EDTk ⎡⎣L (Tk−1 ,Tk ) Ft ⎤⎦  (4.1)

where ETD denotes expectation under QTD and Ft denotes the infor-
mation available in the market at time t.
In the classic single-curve valuation, that is, when the forward
Libor curve for the tenor Tk − Tk−1 coincides with the discount curve,
it is well known that the FRA rate Lk(t) coincides with the forward
rate [PD(t, Tk−1)/PD(t, Tk)−1]/(Tk − Tk−1). In our multi-curve setting,
however, this no longer holds, since the simply compounded rates
defined by the discount curve are different, in general, from the
corresponding Libor fixings.

Extending the LMM


As is well known, the classic (single-curve) LMMs are based on
modelling the joint evolution of a set of consecutive forward Libor
rates, as defined by a given time structure. When moving to a
multi-curve setting, we immediately face two complications. The
first is the existence of several yield curves, which multiplies the
number of building blocks (the “old” forward rates) that one needs
to jointly model. The second is the impossibility of applying the old
definitions, which were based on the equivalence between forward
Libor rates and the corresponding ones defined by the discount
curve.
The former issue can be trivially addressed by adding extra

63

04 Mercurio PCQF.indd 63 11/03/2013 10:10


post-crisis quant finance

dimensions to the vector of modelled rates, and by suitably model-


ling their instantaneous covariance structure. The latter is less
straightforward, requiring a new definition of forward rates, which
needs to be compatible with the existence of different curves for
discounting and for projecting future Libors.
A natural extension of the definition of forward rate to a multi-
curve setting is given by the FRA rate above. In fact, FRA rates have
the following properties.

❑❑ They reduce to the “old” forward rates when the particular case
of a single-curve framework is assumed.
❑❑ They coincide with the corresponding Libor rates at their reset
times: Lk(Tk−1) = L(Tk−1, Tk).
❑❑ They are martingales, by definition, under the corresponding
forward measures.
❑❑ Their time-0 value Lk(0) can easily be bootstrapped from market
data of swap rates (see Chibane and Sheldon, 2009, Henrard,
2009, Fujii, Shimada and Takahashi, 2009a, and formula 4.11
below).

A consistent extension of an LMM to the multi-curve case can thus


be obtained by modelling the joint dynamics of FRA rates and of
forward rates belonging to the discount curve.3 This extension,

Figure 4.1 Basis between 6 × 12 forward Eonia rates and 6 × 12 FRA


rates in the euro market: Oct 7, 2005–Oct 7, 2010
1.0
0.9
0.8
0.7
0.6
0.5
%

0.4
0.3
0.2
0.1
0
Oct 7, Feb 19, Jul 3, Nov 15,
2005 2007 2008 2009
Source: Bloomberg

64

04 Mercurio PCQF.indd 64 11/03/2013 10:10


A LIBOR MARKET MODEL WITH A STOCHASTIC BASIS

hereafter denoted by McLMM, was first proposed in Mercurio


(2009, 2010), with lognormal dynamics for given-tenor FRA and OIS
rates, later adding stochastic volatility to their evolution. Also, Brace
(2010) models the joint evolution of FRA and OIS rates, but starts
from an HJM framework and gives conditions for the existence of
the related processes. In this chapter, we follow a different approach.
We still model the OIS rates but now explicitly model the basis
between OIS and FRA rates. By doing so, we can easily guarantee
the positivity of the forward Libor–OIS bases in accordance with
what has been observed historically (see, for instance, Figure 4.1). In
the following, we will first introduce a general modelling frame-
work for pricing derivatives whose underlying rates are based on
one tenor only. We will then propose a class of multi-tenor McLMMs,
specifically designed to retain tractability across different tenors.

A general framework for the single-tenor McLMM


Let us fix a given tenor x and consider a time structure T = {0 < T_0^x, ..., T_{M_x}^x} compatible with x, where typically x ∈ {1m, 3m, 6m, 1y} and M_x ∈ N.4 Let us define forward OIS rates by:

$$F_k^x(t) := F_D(t; T_{k-1}^x, T_k^x) = \frac{1}{\tau_k^x}\left[\frac{P_D(t, T_{k-1}^x)}{P_D(t, T_k^x)} - 1\right],\qquad k = 1, \ldots, M_x \tag{4.2}$$

where τ_k^x is the year fraction for the interval (T_{k-1}^x, T_k^x], and basis spreads by:

$$S_k^x(t) := L_k^x(t) - F_k^x(t),\qquad k = 1, \ldots, M_x \tag{4.3}$$
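As a small numerical illustration of definitions 4.2 and 4.3, the sketch below computes forward OIS rates and basis spreads directly from discount factors and FRA quotes; the curve inputs are made-up placeholders, not market data.

import numpy as np

# Hypothetical inputs on a 6m grid: OIS discount factors P_D(0, T_k^x)
# and FRA rates L_k^x(0) bootstrapped from the 6m Libor curve.
T = np.array([0.5, 1.0, 1.5, 2.0])               # T_k^x
P_D = np.array([0.995, 0.989, 0.982, 0.974])     # P_D(0, T_k^x)
L = np.array([0.0135, 0.0142, 0.0151, 0.0160])   # L_k^x(0)

tau = np.diff(np.concatenate(([0.0], T)))        # year fractions tau_k^x
P_prev = np.concatenate(([1.0], P_D[:-1]))       # P_D(0, T_{k-1}^x)

F = (P_prev / P_D - 1.0) / tau                   # forward OIS rates, Equation 4.2
S = L - F                                        # basis spreads, Equation 4.3

Spreads computed this way from post-crisis data should come out positive, in line with Figure 4.1.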

By definition, both L_k^x and F_k^x are martingales under the forward measure Q_D^{T_k^x}, and hence their difference S_k^x is a Q_D^{T_k^x}-martingale as well.
We define the joint evolution of rates F_k^x and spreads S_k^x under the spot Libor measure Q_D^T associated with times T, and whose numeraire is the discretely rebalanced bank account:

$$B_D^T(t) = \frac{P_D(t, T_{\beta(t)-1}^x)}{\prod_{j=0}^{\beta(t)-1} P_D(T_{j-1}^x, T_j^x)}$$

where β(t) = m if T_{m-2}^x < t ≤ T_{m-1}^x, m ≥ 1, and T_{-1}^x := 0.


Our single-tenor framework is based on assuming that, under Q_D^T, OIS rates follow general stochastic local volatility processes:

$$dF_k^x(t) = \phi_k^F(t, F_k^x(t))\,\psi_k^F(V^F(t))\left[\sum_{h=\beta(t)}^{k} \frac{\tau_h^x \rho_{h,k}\,\phi_h^F(t, F_h^x(t))\,\psi_h^F(V^F(t))}{1 + \tau_h^x F_h^x(t)}\,dt + dZ_k^T(t)\right]$$
$$dV^F(t) = a^F(t, V^F(t))\,dt + b^F(t, V^F(t))\,dW^T(t) \tag{4.4}$$

where φ_k^F, ψ_k^F, a^F and b^F are deterministic functions of their respective arguments,5 Z^T = {Z_1^T, ..., Z_{M_x}^T} is an M_x-dimensional Q_D^T-Brownian motion with instantaneous correlation matrix (ρ_{k,j})_{k,j=1,...,M_x}, and W^T is a Q_D^T-Brownian motion whose instantaneous correlation with Z_k^T is denoted by ρ_k^x for each k. The stochastic volatility V^F is assumed to be a process common to all OIS forward rates, with V^F(0) = 1.
We then assume that the spreads S_k^x follow stochastic local volatility processes analogous to Equation 4.4. For computational convenience, we assume that spreads and their volatilities are independent of OIS rates,6 which implies that each S_k^x is a Q_D^T-martingale as well. Finally, the global correlation matrix that includes all cross-correlations is assumed to be positive semi-definite.
There are several examples of dynamics (Equation 4.4) and
respective ones for the spreads that can be considered. Obvious
choices include combinations (and permutations) of geometric
Brownian motions and stochastic volatility models. Some explicit
examples can be found below. However, the discussion that follows
is rather general and requires no dynamics specification.

Caplet pricing
Let us consider the x-tenor caplet paying out at time T_k^x:

$$\tau_k^x\left[L_k^x(T_{k-1}^x) - K\right]^+ = \tau_k^x\left[F_k^x(T_{k-1}^x) + S_k^x(T_{k-1}^x) - K\right]^+ \tag{4.5}$$

where K is the caplet’s strike.


Our assumptions on the discount curve imply that the caplet price at time t is given by:

$$\mathrm{Cplt}(t, K; T_{k-1}^x, T_k^x) = \tau_k^x P_D(t, T_k^x)\, E_D^{T_k^x}\!\left\{\left[F_k^x(T_{k-1}^x) + S_k^x(T_{k-1}^x) - K\right]^+ \Big|\ \mathcal{F}_t\right\} \tag{4.6}$$

Assume we explicitly know the Q_D^{T_k^x}-densities f_{S_k^x(T_{k-1}^x)} and f_{F_k^x(T_{k-1}^x)} (conditional on F_t) of S_k^x(T_{k-1}^x) and F_k^x(T_{k-1}^x), respectively, and/or the associated caplet prices. Thanks to the independence of the random variables F_k^x(T_{k-1}^x) and S_k^x(T_{k-1}^x) we equivalently have:

$$\mathrm{Cplt}(t, K; T_{k-1}^x, T_k^x) = \tau_k^x P_D(t, T_k^x) \int_{-\infty}^{+\infty} E_D^{T_k^x}\!\left\{\left[F_k^x(T_{k-1}^x) - (K - z)\right]^+ \Big|\ \mathcal{F}_t\right\} f_{S_k^x(T_{k-1}^x)}(z)\,dz$$
$$= \tau_k^x P_D(t, T_k^x) \int_{-\infty}^{+\infty} E_D^{T_k^x}\!\left\{\left[S_k^x(T_{k-1}^x) - (K - z)\right]^+ \Big|\ \mathcal{F}_t\right\} f_{F_k^x(T_{k-1}^x)}(z)\,dz \tag{4.7}$$

To calculate Equation 4.7, one needs to derive the dynamics of F_k^x and V^F under the forward measure Q_D^{T_k^x}, given that the Q_D^{T_k^x}-dynamics of S_k^x and its volatility are the same as those under Q_D^T thanks to our independence assumption. To this end, we apply the standard change-of-numeraire result that relates the drifts of a (continuous) process X under measures Q_D^T and Q_D^{T_k^x}:

$$\mathrm{Drift}\left(X; Q_D^{T_k^x}\right) = \mathrm{Drift}\left(X; Q_D^T\right) + \frac{d\left\langle X,\ \ln\!\left(P_D(\cdot, T_k^x)/P_D(\cdot, T_{\beta(t)-1}^x)\right)\right\rangle_t}{dt} \tag{4.8}$$

where ⟨⋅, ⋅⟩_t denotes instantaneous covariation at time t.


The dynamics of F_k^x and V^F under Q_D^{T_k^x} are thus given by:

$$dF_k^x(t) = \phi_k^F(t, F_k^x(t))\,\psi_k^F(V^F(t))\,dZ_k^k(t)$$
$$dV^F(t) = a^F(t, V^F(t))\,dt + b^F(t, V^F(t))\left[-\sum_{h=\beta(t)}^{k} \frac{\tau_h^x \phi_h^F(t, F_h^x(t))\,\psi_h^F(V^F(t))\,\rho_h^x}{1 + \tau_h^x F_h^x(t)}\,dt + dW^k(t)\right] \tag{4.9}$$

where Z_k^k and W^k are Q_D^{T_k^x}-Brownian motions.

By resorting to standard drift-freezing techniques, one can find tractable approximations of V^F for typical choices of a^F and b^F, which will lead either to an explicit density f_{F_k^x(T_{k-1}^x)} or to an explicit option pricing formula (on F_k^x). This, along with the assumed tractability of S_k^x, will finally allow the calculation of the caplet price by application of Equation 4.7 (see also our explicit example below).
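As a minimal sketch of this calculation, with illustrative distributional assumptions that are not the chapter's later specification, the convolution in Equation 4.7 can be evaluated by Gauss–Hermite quadrature over the spread variable. Here F_k^x(T_{k-1}^x) is taken Gaussian (so the inner expectation is a Bachelier formula) and S_k^x(T_{k-1}^x) lognormal; all numbers in the usage line are placeholders.

import numpy as np
from scipy.stats import norm

def bachelier_call(F0, K, sd):
    # E[(F - K)^+] for F Gaussian with mean F0 and standard deviation sd.
    d = (F0 - K) / sd
    return (F0 - K) * norm.cdf(d) + sd * norm.pdf(d)

def caplet_4_7(t, K, T_reset, tau, P_D, F0, volF, S0, volS, n=40):
    # Equation 4.7: integrate the caplet on F against the spread density.
    x, w = np.polynomial.hermite.hermgauss(n)
    z = np.sqrt(2.0) * x                          # standard normal nodes
    dt = T_reset - t
    S = S0 * np.exp(-0.5 * volS**2 * dt + volS * np.sqrt(dt) * z)
    inner = bachelier_call(F0, K - S, volF * np.sqrt(dt))
    return tau * P_D * np.sum(w / np.sqrt(np.pi) * inner)

price = caplet_4_7(0.0, 0.02, 1.0, 0.5, 0.982,
                   F0=0.0125, volF=0.006, S0=0.003, volS=0.3)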

Swaption pricing
Let us consider a (payer) swaption, which gives the right to enter at time T_a^x = T_c^S an interest rate swap with payment times for the floating and fixed legs given by T_{a+1}^x, ..., T_b^x and T_{c+1}^S, ..., T_d^S, respectively, with T_b^x = T_d^S and where the fixed rate is K. We assume that each T_j^S belongs to {T_a^x, ..., T_b^x}.

The swaption payout at time T_a^x = T_c^S is given by:

$$\left[S_{a,b,c,d}(T_a^x) - K\right]^+ \sum_{j=c+1}^{d} \tau_j^S\,P_D(T_c^S, T_j^S) \tag{4.10}$$

where the forward swap rate Sa,b,c,d(t) is defined by:


b
∑ τ kx PD (t,Tkx ) Lxk (t )
Sa,b, c , d (t ) = k=a+1
d
 (4.11)
∑ τ Sj PD (t,TjS )
j=c+1

The swaption payout 4.10 is conveniently priced under the swap measure Q_D^{c,d}, whose associated numeraire is the annuity Σ_{j=c+1}^d τ_j^S P_D(t, T_j^S). In fact, denoting by E_D^{c,d} the expectation under Q_D^{c,d}, we have:

$$\mathrm{PS}(t, K; T_a^x, \ldots, T_b^x, T_{c+1}^S, \ldots, T_d^S) = \sum_{j=c+1}^{d} \tau_j^S P_D(t, T_j^S)\, E_D^{c,d}\!\left\{\left[S_{a,b,c,d}(T_a^x) - K\right]^+ \Big|\ \mathcal{F}_t\right\} \tag{4.12}$$

so that, in a multi-curve as in the single-curve set-up, pricing a


swaption is equivalent to pricing an option on the underlying swap
rate.
To calculate the last expectation, we proceed as follows. We set:

$$\omega_k(t) := \frac{\tau_k^x P_D(t, T_k^x)}{\sum_{j=c+1}^{d} \tau_j^S P_D(t, T_j^S)} \tag{4.13}$$

and write:

$$S_{a,b,c,d}(t) = \sum_{k=a+1}^{b} \omega_k(t)\,L_k^x(t) = \sum_{k=a+1}^{b} \omega_k(t)\,F_k^x(t) + \sum_{k=a+1}^{b} \omega_k(t)\,S_k^x(t) =: \bar F(t) + \bar S(t) \tag{4.14}$$

with the last equality defining processes F̄ and S̄.
The processes S_{a,b,c,d}, F̄ and S̄ are all Q_D^{c,d}-martingales. In particular, F̄ is equal to the classic single-curve forward swap rate that is defined by OIS discount factors, and whose reset and payment times are given by T_c^S, ..., T_d^S. If the dynamics in Equation 4.4, which define a standard (single-curve) LMM based on OIS rates, are sufficiently tractable, we can approximate F̄(t) by a driftless stochastic volatility process, F̃(t), of the same type as Equation 4.9. This property holds for the majority of LMMs in the financial literature, such
as the LMMs of Wu and Zhang (2006), Rebonato (2007) and Mercurio and Morini (2009), so that we can safely assume it also applies to our dynamics in Equation 4.4.
The process S̄ is more complex, since it explicitly depends both on OIS discount factors and on basis spreads. However, we can resort to a standard approximation and freeze the weights ω_k at their time-0 value, thus removing the dependence of S̄ on OIS discount factors. We then assume we can further approximate S̄ with a dynamics S̃ similar to that of S_k^x, for instance by matching instantaneous variations.
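A sketch of the frozen-weight computation, with a hypothetical schedule and made-up curve numbers, makes the approximation concrete: the weights ω_k of Equation 4.13 are evaluated at time 0, and F̄(0), S̄(0) follow from Equation 4.14.

import numpy as np

# Hypothetical 2y swap: semi-annual floating leg, annual fixed leg.
T_float = np.array([0.5, 1.0, 1.5, 2.0])          # T_k^x
P_float = np.array([0.995, 0.989, 0.982, 0.974])  # P_D(0, T_k^x)
L0 = np.array([0.0135, 0.0142, 0.0151, 0.0160])   # L_k^x(0)
F0 = np.array([0.0110, 0.0117, 0.0125, 0.0133])   # F_k^x(0)

T_fix = np.array([1.0, 2.0])                      # T_j^S
P_fix = np.array([0.989, 0.974])                  # P_D(0, T_j^S)

tau_x = np.diff(np.concatenate(([0.0], T_float)))
tau_S = np.diff(np.concatenate(([0.0], T_fix)))

annuity = np.sum(tau_S * P_fix)                   # denominator of Equation 4.13
w0 = tau_x * P_float / annuity                    # frozen weights omega_k(0)

F_bar0 = np.sum(w0 * F0)                          # F-bar(0)
S_bar0 = np.sum(w0 * (L0 - F0))                   # S-bar(0)
swap_rate0 = F_bar0 + S_bar0                      # S_{a,b,c,d}(0), Equation 4.14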
After the approximations just described, the swaption price becomes:

$$\mathrm{PS}(t, K; T_a^x, \ldots, T_b^x, T_{c+1}^S, \ldots, T_d^S) = \sum_{j=c+1}^{d} \tau_j^S P_D(t, T_j^S)\, E_D^{c,d}\!\left\{\left[\bar F(T_a^x) + \bar S(T_a^x) - K\right]^+ \Big|\ \mathcal{F}_t\right\} \tag{4.15}$$

which can be calculated exactly in the same fashion as the caplet


price Equation 4.6.

A tractable class of multi-tenor McLMMs


Let us now consider different tenors x_1 < x_2 < ··· < x_n with associated time structures T^{x_i} = {0 < T_0^{x_i}, ..., T_{M_{x_i}}^{x_i}}, M_{x_i} ∈ N, i = 1, ..., n. We assume that each x_i is a multiple of the preceding tenor x_{i-1}, and that T^{x_n} ⊂ T^{x_{n-1}} ⊂ ··· ⊂ T^{x_1}. We set T := T^{x_1}.

The joint evolution of forward OIS rates for all given tenors x can be defined by modelling the rates with smallest tenor x_1. In fact, the dynamics of rates F_k^x for tenors x ∈ {x_2, ..., x_n} can be obtained from the dynamics of rates F_k^{x_1} by noting that we can write:

$$\prod_{h=i_{k-1}+1}^{i_k}\left[1 + \tau_h^{x_1} F_h^{x_1}(t)\right] = 1 + \tau_k^x F_k^x(t) \tag{4.16}$$

for some indexes i_{k-1} and i_k.
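For instance, compounding two hypothetical 3m forward OIS rates into the 6m rate they span is a one-liner:

import numpy as np

tau_3m = np.array([0.25, 0.25])       # tau_h^{x_1}, h = i_{k-1}+1, ..., i_k
F_3m = np.array([0.0108, 0.0112])     # F_h^{x_1}(t), hypothetical values
tau_6m = 0.5                          # tau_k^x

# Equation 4.16: the 6m accrual factor is the product of the 3m ones.
F_6m = (np.prod(1.0 + tau_3m * F_3m) - 1.0) / tau_6m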


Our class of multi-tenor McLMMs is based on assuming that, under Q_D^T, the OIS forward rates F_k^{x_1}, k = 1, ..., M_{x_1}, follow "shifted-lognormal" stochastic volatility processes:

$$dF_k^{x_1}(t) = \sigma_k^{x_1}(t)\,V^F(t)\left[\frac{1}{\tau_k^{x_1}} + F_k^{x_1}(t)\right]\left[V^F(t)\sum_{h=\beta(t)}^{k} \rho_{h,k}\,\sigma_h^{x_1}(t)\,dt + dZ_k^T(t)\right] \tag{4.17}$$


where, for each k, σ_k^{x_1} is a deterministic function and β(t) refers to times T = T^{x_1}. This corresponds to setting x = x_1, φ_k^F(t, F) = σ_k^{x_1}(t)[1/τ_k^{x_1} + F] and ψ_k^F(V) = V in the general Equation 4.4. The reason for this modelling choice will be made clear below.7 The stochastic volatility V^F is assumed to follow the dynamics in Equation 4.4.
As per spread dynamics, we assume for each tenor x ∈ {x_1, ..., x_n} the following one-factor models:

$$S_k^x(t) = S_k^x(0)\,M^x(t),\qquad k = 1, \ldots, M_x \tag{4.18}$$

where, for each x, M^x is a (continuous and) positive Q_D^T-martingale independent of rates F_k^{x_1} and of the stochastic volatility V^F. Clearly, M^x(0) = 1. The spreads S_k^x are thus positive martingales under Q_D^T and any forward or swap measures. A convenient choice in terms of model tractability is to assume that the M^x are stochastic processes whose densities or associated option prices are known in closed form. This will be the case in our explicit example below.

Measure changes and option pricing


To price a caplet with payout 4.5, which is based on a general tenor x, we need to derive the Q_D^{T_k^x}-dynamics of rate F_k^x. To this end, applying Itô's lemma to Equation 4.16, and Equation 4.8 to 4.17, we get, for each x ∈ {x_1, ..., x_n}:

$$dF_k^x(t) = \sigma_k^x(t)\,V^F(t)\left[\frac{1}{\tau_k^x} + F_k^x(t)\right] dZ_k^{k,x}(t) \tag{4.19}$$

where σ_k^x, x ∈ {x_2, ..., x_n}, is a deterministic function whose value is determined by volatilities σ_h^{x_1} and correlations ρ_{h,k}, and Z_k^{k,x} is a Q_D^{T_k^x}-Brownian motion whose instantaneous correlation with Z_h^{h,x} is specified by the instantaneous covariance structure of rates F_h^{x_1}.

The Q_D^{T_k^x}-dynamics of V^F are characterised by a drift correction similar to (but different from) that in Equation 4.9. The difference is given by the fact that T^x is in general different from (and contained in) T^{x_1}. We have:

$$dV^F(t) = -V^F(t)\,b^F(t, V^F(t)) \sum_{h=\beta(t)}^{i_k} \sigma_h^{x_1}(t)\,\rho_h^{x_1}\,dt + a^F(t, V^F(t))\,dt + b^F(t, V^F(t))\,dW^{k,x}(t) \tag{4.20}$$

where W^{k,x} is a Q_D^{T_k^x}-Brownian motion and i_k is defined in Equation 4.16, that is, T_{i_k}^{x_1} = T_k^x. The instantaneous correlation ρ_k^x between Z_k^{k,x} and W^{k,x} is specified by volatilities σ_h^{x_1} and correlations ρ_h^{x_1} and ρ_{i,j}.

From Equation 4.19, we notice that 4.17 are the simplest stochastic
volatility dynamics that are consistent across different tenors. This
means, for example, that if three-month rates follow shifted-
lognormal processes with common stochastic volatility, the same
type of dynamics (modulo the drift correction in the volatility
process) is also followed by six-month rates under the respective
forward measures. Our choice of dynamics in 4.17 is motivated by
this feature, which allows us to price simultaneously in closed form
(with the same type of formula) caps and swaptions with different
underlying tenors.
Caplet prices can then be calculated along the lines suggested
above. Likewise, analytical approximations for swaption prices can
be obtained by applying the procedure described above (“Swaption
pricing”) and by noticing that, thanks to assumption 4.18, formula
4.15 can be simplified as follows:
$$\bar S(t) = \sum_{k=a+1}^{b} \omega_k(t)\,S_k^x(0)\,M^x(t) \approx M^x(t) \sum_{k=a+1}^{b} \omega_k(0)\,S_k^x(0) = \bar S(0)\,M^x(t)$$

A specific example of rate and spread dynamics


Dynamics 4.17 and 4.18 can both be driven by stochastic volatility. For ease of computation, in the following example we choose to model with stochastic volatility the forward rates F_k^{x_1} and not the spreads S_k^{x_1}, for the smaller tenor x_1. Precisely, we assume constant volatilities σ_k^{x_1}(t) = σ_k^{x_1} in Equation 4.17 and the SABR dynamics of Hagan et al (2002) for V^F. This leads to the following dynamics for the x-tenor rate F_k^x under Q_D^{T_k^x}:

$$dF_k^x(t) = \sigma_k^x\,V^F(t)\left[\frac{1}{\tau_k^x} + F_k^x(t)\right] dZ_k^{k,x}(t)$$
$$dV^F(t) = -\varepsilon\left[V^F(t)\right]^2 \sum_{h=\beta(t)}^{i_k} \sigma_h^{x_1}\rho_h^{x_1}\,dt + \varepsilon\,V^F(t)\,dW^{k,x}(t),\qquad V^F(0) = 1 \tag{4.21}$$

where also σ_k^x is now constant and ε is a positive constant.
We then assume that basis spreads for all tenors x are governed by the same process M^x ≡ M, which is assumed to follow a (driftless) geometric Brownian motion:

$$dM(t) = \sigma\,M(t)\,dZ(t) \tag{4.22}$$

where Z is a Q_D^{T_k^x}-Brownian motion independent of Z_k^{k,x} and W^{k,x}, and σ is a positive constant.
Caplet prices under this specification can easily be calculated in closed form as soon as we smartly approximate the drift term of V^F. Some possible choices can be found in Mercurio and Morini (2009). Applying the first of formulas 4.7, we get:

$$\mathrm{Cplt}(t, K; T_{k-1}^x, T_k^x) = \int_{-\infty}^{a_k^x(t)} \mathrm{Cplt}^{SABR}\!\left(t,\ F_k^x(t) + \frac{1}{\tau_k^x},\ K + \frac{1}{\tau_k^x} - S_k^x(t)\,e^{-\frac{1}{2}\sigma^2 (T_{k-1}^x - t) + \sigma\sqrt{T_{k-1}^x - t}\,z};\ T_{k-1}^x, T_k^x\right) \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,dz$$
$$\quad + \tau_k^x P_D(t, T_k^x)\left(F_k^x(t) - K\right)\Phi\!\left(-a_k^x(t)\right) + \tau_k^x P_D(t, T_k^x)\,S_k^x(t)\,\Phi\!\left(-a_k^x(t) + \sigma\sqrt{T_{k-1}^x - t}\right) \tag{4.23}$$

where:

$$a_k^x(t) := \frac{\ln\dfrac{K + 1/\tau_k^x}{S_k^x(t)} + \frac{1}{2}\sigma^2\left(T_{k-1}^x - t\right)}{\sigma\sqrt{T_{k-1}^x - t}}$$

and Cplt^{SABR}(t, F, K; T_{k-1}^x, T_k^x) denotes the ("lognormal") SABR price at time t of the caplet that sets at time T_{k-1}^x and pays at time T_k^x, where F is the underlying's value, K is the caplet's strike, and the SABR parameters are σ_k^x (corrected for the drift approximation), ε and ρ_k^x (the SABR β is here equal to one).
The caplet pricing formula 4.23 can be used to price caps on any tenor x. Cap prices on a non-standard tenor z can be derived by calibrating the market prices of standard y-tenor caps using formula 4.23 with x = y and assuming a specific correlation structure ρ_{i,j}. One then obtains in output the model parameters σ_k^{x_1}, ρ_k^{x_1}, ε and σ, which can be used to price z-based caps again with formula 4.23, this time setting x = z. This will be done in the following example.
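A possible implementation of formula 4.23 is sketched below. The outer integral over the Gaussian driver of M is evaluated by Gauss–Legendre quadrature on a truncated range, and the SABR caplet is priced with the β = 1 case of the Hagan et al (2002) implied-vol expansion fed into a Black formula. The drift correction to σ_k^x mentioned above is ignored here, and all parameter values are hypothetical.

import numpy as np
from scipy.stats import norm

def sabr_lognormal_vol(F, K, T, alpha, rho, nu):
    # Hagan et al (2002) implied volatility with beta = 1.
    if abs(np.log(F / K)) < 1e-10:
        z_over_x = 1.0
    else:
        z = (nu / alpha) * np.log(F / K)
        x = np.log((np.sqrt(1 - 2 * rho * z + z * z) + z - rho) / (1 - rho))
        z_over_x = z / x
    return alpha * z_over_x * (1 + (0.25 * rho * nu * alpha
                                    + (2 - 3 * rho**2) * nu**2 / 24) * T)

def black_call(F, K, vol, T):
    d1 = (np.log(F / K) + 0.5 * vol**2 * T) / (vol * np.sqrt(T))
    return F * norm.cdf(d1) - K * norm.cdf(d1 - vol * np.sqrt(T))

def caplet_4_23(t, K, T_reset, tau, P_D, F0, S0, sig_k, eps, rho_k, sig_M, n=200):
    # Formula 4.23 by quadrature; integration range truncated at z = -8.
    dt = T_reset - t
    a = (np.log((K + 1 / tau) / S0) + 0.5 * sig_M**2 * dt) / (sig_M * np.sqrt(dt))
    u, w = np.polynomial.legendre.leggauss(n)
    z = 0.5 * (a + 8.0) * u + 0.5 * (a - 8.0)     # map nodes to [-8, a]
    w = 0.5 * (a + 8.0) * w
    S_T = S0 * np.exp(-0.5 * sig_M**2 * dt + sig_M * np.sqrt(dt) * z)
    K_sh = K + 1 / tau - S_T                      # shifted strike, > 0 for z < a
    vols = np.array([sabr_lognormal_vol(F0 + 1 / tau, k, dt, sig_k, rho_k, eps)
                     for k in K_sh])
    integ = tau * P_D * black_call(F0 + 1 / tau, K_sh, vols, dt) * norm.pdf(z)
    return (np.sum(w * integ)
            + tau * P_D * (F0 - K) * norm.cdf(-a)
            + tau * P_D * S0 * norm.cdf(-a + sig_M * np.sqrt(dt)))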

An example of calibration to real market data


We finally consider an example of calibration to market data of the
multi-tenor McLMM defined by Equations 4.21 and 4.22. As we
have already pointed out in the introduction, OIS rates and basis
spreads can be interpreted as additive factors driving the evolution

Figure 4.2 Absolute differences between market and model cap volatilities (top), and between model-implied three-month Libor cap volatilities and model six-month Libor ones (bottom). Surfaces plotted over strikes 2%–7% and maturities 4y–10y.

of FRA rates. As such, their model calibration requires no specific


market information on their respective volatilities and can be
directly performed on Libor-based instruments. To this end, we use
euro data as of September 15, 2010 and calibrate six-month caps
with (semi-annual) maturities from three to 10 years. The consid-
ered strikes range from 2% to 7%.
The calibration is performed by minimising the sum of squared
relative differences between model prices (Equation 4.23) and

respective market ones. To simplify things, we assume that OIS rates are perfectly correlated with one another, that all ρ_k^{x_1} are equal to the same ρ and that the drift of V^F is approximately linear in V^F.


The resulting fitting is shown on the top of Figure 4.2, where we
plot the absolute differences between model and market cap vola-
tilities (obtained under OIS discounting). The average of the
absolute values of these differences is 19bp.
After calibrating the model parameters to caps with x = 6m, we can apply the same model shown in Equations 4.21 and 4.22 to price caps based on the three-month Libor (x = 3m), where we assume that σ_{i_{k-1}}^{3m} = σ_{i_k}^{3m} for each k. The absolute difference between model-implied three-month-based cap volatilities (obtained under OIS discounting) and corresponding six-month-based ones is plotted on the bottom of Figure 4.2.

Conclusion
We have shown how to extend the LMM to price interest rate deriv-
atives under distinct yield curves, which are used for generating
future Libor rates and for discounting. We have first modelled the
joint evolution of forward OIS rates and related Libor-OIS spreads
for a given tenor, and then proposed a class of models for the multi-
tenor case. Under assumptions that are standard in the classic LMM
literature, the general dynamics we have considered imply the
possibility of pricing in closed form both caps and swaptions, with
procedures that are only slightly more involved than the corre-
sponding ones in the single-curve case.
Modelling different tenors at the same time has the advantage
of allowing for the valuation of derivatives that are based on
multiple tenors, for example, basis swaps. Another interesting
application involves the pricing of caps or swaptions with a non-
standard underlying tenor, given the market quotes of
standard-tenor options. In both cases, additional constraints on
the model dynamics should be imposed so as to ensure that basis
spreads keep realistic relations between one another as they move
over time.
An issue that needs further investigation is the modelling of
correlations with parametric forms granting the positive definite-
ness of the global correlation matrix. To this end, one may try to
extend to the multi-curve case the parameterisation proposed by
Mercurio and Morini (2007) in the single-curve setting.


The author would like to thank Peter Carr, Liuren Wu, Antonio
Castagna, Raffaele Giura and Massimo Morini for stimulating discus-
sions, and Nabyl Belgrade, Marco Bianchetti, Marcelo Piza, Riccardo
Rebonato and two anonymous referees for helpful comments.

  1 A similar approach has recently been proposed by Fujii, Shimada and Takahashi (2009b),
who model stochastic basis spreads in a Heath–Jarrow–Morton (HJM) framework both in
single- and multi-currency cases, but without providing examples of dynamics. An alter-
native route is chosen by Henrard (2009), who hints at the modelling of basis swap spreads,
but without addressing typical issues such as the modelling of joint dynamics or the
pricing of plain vanilla derivatives.
  2 This is also consistent with the results of Fujii, Shimada and Takahashi (2009a, 2009b) and Piterbarg (2010), since we assume credit support annex agreements where the collateral rate to be paid equals the (assumed risk-free) overnight rate.
  3 The reason for modelling OIS rates in addition to FRA rates is twofold. First, by assumption, our pricing measures are related to the discount (ie, OIS) curve. Second, swap rates explicitly depend on zero-coupon bonds P_D(t, T).
  4 For instance, if the tenor is three months, the times T_k^x must be three-month spaced.
  5 These functions must be chosen so that F_k^x is a martingale under Q_D^{T_k^x} (see Equation 4.9).
  6 We acknowledge that this assumption may lack economic foundation. However, the
historical correlation between OIS rates and spreads in the post credit-crunch period has
been rather unstable. In fact, both positive and negative values have been recorded. The
zero-correlation assumption may thus be regarded as reflecting an average (long-term)
behaviour.
  7 Notice also that simply compounded forward rates in a Gaussian short rate model follow stochastic differential equations analogous to Equation 4.17 with V^F ≡ 1.

REFERENCES

Ametrano F. and M. Bianchetti, 2009, “Bootstrapping the Illiquidity,” in Fabio Mercurio


(Ed), Modelling Interest Rates: Advances for Derivatives Pricing (London: Risk Books).

Bianchetti M., 2010, “Two Curves, One Price,” Risk, August, pp 66–72.

Brace A., 2010, “Multiple Arbitrage Free Interest Rate Curves”, preprint, National
Australia Bank.

Chibane M. and G. Sheldon, 2009, “Building Curves on a Good Basis,” working paper, Shinsei Bank (available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1394267).

Fujii M., Y. Shimada and A. Takahashi, 2009a, “A Note on Construction of Multiple Swap Curves With and Without Collateral,” CARF working paper series F-154 (available at http://ssrn.com/abstract=1440633).

Fujii M., Y. Shimada and A. Takahashi, 2009b, “A Market Model of Interest Rates with
Dynamic Basis Spreads in the Presence of Collateral and Multiple Currencies,” working
paper, University of Tokyo and Shinsei Bank (available at www.e.u-tokyo.ac.jp/cirje/
research/dp/2009/2009cf698.pdf).

Hagan P., D. Kumar, A. Lesniewski and D. Woodward, 2002, “Managing Smile Risk,”
Wilmott Magazine, September, pp 84–108.


Henrard M., 2009, “The Irony in the Derivatives Discounting Part II: The Crisis,”
preprint, Dexia Bank, Brussels.

Kenyon C., 2010, “Post-shock Short-rate Pricing,” Risk, November, pp 79–83.

Kijima M., K. Tanaka and T. Wong, 2009, “A Multi-quality Model of Interest Rates,”
Quantitative Finance, 9(2), pp 133–45.

Mercurio F., 2009, “Interest Rates and the Credit Crunch: New Formulas and Market Models” (available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1332205).

Mercurio F., 2010, “Modern Libor Market Models: Using Different Curves for Projecting
Rates and for Discounting,” International Journal of Theoretical and Applied Finance, 13, pp
1–25.

Mercurio F. and M. Morini, 2007, “A Note on Correlation in Stochastic Volatility Term


Structure Models,” working paper (available at SSRN.com).

Mercurio F. and M. Morini, 2009, “Joining the SABR and Libor Models Together,” Risk,
March, pp 80–85.

Piterbarg V., 2010, “Funding Beyond Discounting: Collateral Agreements and Derivatives
Pricing,” Risk, February, pp 97–102.

Rebonato R., 2007, “A Time-homogeneous, SABR-consistent Extension of the LMM,”


Risk, November, pp 92–97.

Whittall C., 2010, “The Price is Wrong,” Risk, March, pp 18–22.

Wu L. and F. Zhang, 2006, “Libor Market Model with Stochastic Volatility,” Journal of
Industrial and Management Optimization, 2(2), pp 199–227.



5
Volatility Interpolation
Jesper Andreasen and Brian Huge
Danske Bank

Local volatility models such as those of Dupire (1994), Andersen


and Andreasen (1999), JP Morgan (1999) and Andreasen and Huge
(2010) ideally require a full continuum in expiry and strike of arbi-
trage-consistent European-style option prices as input. In practice,
of course, we only observe a discrete set of option prices.
It is well known that interpolation and extrapolation of a two-
dimensional implied volatility surface is a non-trivial problem,
particularly if one wishes to preserve characteristics that guar-
antee arbitrage-free option prices. Previous attempts to solve the
problem include: interpolation in the strike dimension via fitting
of an implied density; best-fit approaches where parametric
option pricing models such as the Heston and SABR models are
fitted to observed option prices and subsequently used for inter-
polation; and full-scale non-parametric optimisation approaches
where local volatility models are fitted directly to observed option
prices (see, for example, Jackwerth and Rubinstein, 1996, Sepp,
2007, Coleman, Li and Verma, 1999, and Avellaneda et al, 1997).
All these approaches, however, suffer from drawbacks: the
implied-density route does not directly lend itself to interpola-
tion in the maturity dimension; the parametric model approach
will not necessarily exactly match all the observed option prices;
and the full-scale optimisation technique is computationally
costly.
Our modelling approach is based on the finite difference solution
of the Dupire (1994) forward equation for option prices, and, as such,
is related to the work by Carr (2008), where it is shown that at one


step the implicit finite difference method can be viewed as option


prices coming from a local variance gamma model. The methodology
is related to the implied-density approach and can be specified to
give an exact fit to the observed option prices. But, contrary to the
implied-density approach, it directly allows for arbitrage-consistent
interpolation in the maturity dimension.
For each maturity, a nonlinear optimisation problem has to be
solved. The number of free parameters will typically be equal to the
number of targets, that is, strikes. An update in the optimisation
problem is quick as it only involves one time step in the implicit
finite difference method, that is, the solution of one tri-diagonal
matrix system, a reduction by an order of magnitude or more on
traditional approaches. The model calibration can be bootstrapped
in the maturity direction but global optimisation is also an option.
After the model is calibrated, the full continuous surface of
option prices is, again, generated by a single time step finite differ-
ence solution of Dupire’s forward equation. Typical interpolation
problems for equity options can be solved in a few hundredths of a
second of CPU time.

Discrete expiries
Given a time grid of expiries 0 = t_0 < t_1 < ... and a set of volatility functions {ϑ_i(k)}_{i=0,1,...}, we construct European-style option prices for all the discrete expiries by recursively solving the forward system:

$$\left[1 - \tfrac{1}{2}\Delta t_i\,\vartheta_i(k)^2 \frac{\partial^2}{\partial k^2}\right] c(t_{i+1}, k) = c(t_i, k),\qquad c(0, k) = (s(0) - k)^+,\quad i = 0, 1, \ldots \tag{5.1}$$

where Δt_i = t_{i+1} − t_i.


If we discretise the strike space k_j = k_0 + jΔk, j = 0, 1, ..., n, and replace the differential operator by the difference operator, we get the following finite difference scheme:

$$\left[1 - \tfrac{1}{2}\Delta t_i\,\vartheta_i(k)^2 \delta_{kk}\right] c(t_{i+1}, k) = c(t_i, k),\qquad c(0, k) = (s(0) - k)^+,\quad i = 0, 1, \ldots$$
$$\delta_{kk} f(k) = \frac{1}{\Delta k^2}\left(f(k - \Delta k) - 2f(k) + f(k + \Delta k)\right) \tag{5.2}$$

The system 5.2 can be solved by recursively solving tri-diagonal matrix systems. One can thus view the system 5.1 as a one-step-per-expiry implicit finite-difference discretisation of the Dupire (1994) forward equation:

$$0 = -\frac{\partial c}{\partial t} + \frac{1}{2}\sigma(t, k)^2 \frac{\partial^2 c}{\partial k^2} \tag{5.3}$$
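One expiry step of this kind is a single tridiagonal solve. A minimal Python sketch (assuming numpy and scipy are available) applying the operator of 5.2 with the absorbing boundary rows of the matrix 5.10 below is:

import numpy as np
from scipy.linalg import solve_banded

def step_5_2(c, theta, dt, dk):
    # One implicit step: solve [1 - 0.5*dt*theta(k)^2*delta_kk] c_new = c
    # with boundary rows c_kk = 0, ie, the matrix A of Equation 5.10.
    n = len(c)
    z = 0.5 * dt * theta**2 / dk**2
    ab = np.zeros((3, n))                 # banded storage (super/diag/sub)
    ab[1, :] = 1.0
    ab[1, 1:-1] += 2.0 * z[1:-1]
    ab[0, 2:] = -z[1:-1]                  # superdiagonal entries A[j, j+1]
    ab[2, :-2] = -z[1:-1]                 # subdiagonal entries A[j, j-1]
    return solve_banded((1, 1), ab, c)

# Usage with placeholder inputs: initial call payoffs, flat volatility.
k = np.linspace(0.0, 2.0, 201)
c = np.maximum(1.0 - k, 0.0)              # c(0, k) = (s(0) - k)^+, s(0) = 1
c = step_5_2(c, theta=0.2 * np.ones_like(k), dt=0.5, dk=k[1] - k[0])

The same function also performs the gap-filling step 5.6 below: replacing dt by t − t_i prices any intermediate expiry in one solve.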

For a set of discrete option quotes {ĉ(t_i, k_{ij})}, the system 5.1 can be bootstrapped forward, expiry by expiry, to find piecewise constant functions:

$$\vartheta_i(k) = a_{ij},\qquad b_{i,j-1} < k \le b_{ij} \tag{5.4}$$

that minimise the pricing error in 5.1. In other words, we solve the optimisation problems:

$$\inf_{\vartheta_i(\cdot)} \sum_j \left(\left(c(t_i, k_{ij}) - \hat c(t_i, k_{ij})\right) / w_{ij}\right)^2,\qquad w_{ij} = \partial\hat c(t_i, k_{ij}) / \partial\hat\sigma(t_i, k_{ij}) \tag{5.5}$$

sequentially for i = 1, 2, .... Here σ̂ denotes implied Black volatility. The point here is that for each iteration in 5.5 only one tri-diagonal matrix system 5.2 needs to be solved.
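The bootstrap then amounts to one small least-squares problem per expiry, each residual evaluation costing exactly one tridiagonal solve. A sketch, reusing step_5_2 from the snippet above and omitting the vega weights w_ij of 5.5 for brevity, might read:

import numpy as np
from scipy.optimize import least_squares

def calibrate_expiry(c_prev, k, dk, dt, strikes, target_prices):
    # Fit a piecewise constant theta_i(k), one level per target strike (5.4),
    # so that a single implicit step of 5.2 reprices the quotes (5.5).
    bucket = np.clip(np.searchsorted(strikes, k), 0, len(strikes) - 1)

    def resid(a):
        c_model = step_5_2(c_prev, a[bucket], dt, dk)
        return np.interp(strikes, k, c_model) - target_prices

    fit = least_squares(resid, x0=0.2 * np.ones(len(strikes)),
                        bounds=(1e-4, 2.0))
    return fit.x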

Filling the gaps


The system 5.1 translates the local volatility functions into arbitrage-consistent prices for a discrete set of expiries but it does not directly specify the option prices between the expiries. We fill the gaps by constructing the option prices between two expiries according to:

$$\left[1 - \tfrac{1}{2}(t - t_i)\,\vartheta_i(k)^2 \frac{\partial^2}{\partial k^2}\right] c(t, k) = c(t_i, k),\qquad t \in\, ]t_i, t_{i+1}[ \tag{5.6}$$

Note that for expiries that lie between the quoted expiries, the time
stepping is non-standard. Instead of multiple small time steps that
connect all the intermediate time points, we step directly from ti to
all times t ∈ ]ti, ti+1[. The time-stepping scheme is illustrated in Figure
5.1. This methodology is essentially what distinguishes our model-
ling approach from previously presented finite difference-based
algorithms, for example, Coleman, Li and Verma (1999) and
Avellaneda et al (1997).

Absence of arbitrage and stability


Carr (2008) shows that the option prices generated by 5.1 are
consistent with the underlying being a local variance gamma

process. From this or from straight calculation we have that Equation 5.6 can be written as:

$$c(t, k) = \int_0^{\infty} \frac{1}{t - t_i}\,e^{-u/(t - t_i)}\,g(u, k)\,du,\qquad t > t_i \tag{5.7}$$

where g(u, k) is the solution to:

$$0 = -\frac{\partial g}{\partial u} + \frac{1}{2}\vartheta(k)^2 \frac{\partial^2 g}{\partial k^2},\quad u > 0;\qquad g(0, k) = c(t_i, k) \tag{5.8}$$

Figure 5.1 Model timeline

In the appendix, we use this to show that the option prices generated by 5.1 and 5.6 are consistent with absence of arbitrage, that is, that c_t(t, k) ≥ 0, c_kk(t, k) ≥ 0 for all (t, k).
For the discrete space case, we note that with the additional (absorbing) boundary conditions c_kk(t, k_0) = c_kk(t, k_n) = 0, 5.2 can be written as:

$$A\,c(t_{i+1}) = c(t_i) \tag{5.9}$$

where A is the tri-diagonal matrix:

$$A = \begin{bmatrix} 1 & 0 & & & \\ -z_1 & 1 + 2z_1 & -z_1 & & \\ & -z_2 & 1 + 2z_2 & -z_2 & \\ & & \ddots & \ddots & \ddots \\ & & -z_{n-1} & 1 + 2z_{n-1} & -z_{n-1} \\ & & & 0 & 1 \end{bmatrix},\qquad z_j = \frac{1}{2}\frac{\Delta t}{\Delta k^2}\,\vartheta_i(k_j)^2 \tag{5.10}$$

The tri-diagonal matrix A is diagonally dominant with positive diagonal and negative off-diagonals. Nabben (1999) shows that for this type of matrix:

$$A^{-1} \ge 0 \tag{5.11}$$

This implies that the discrete system 5.2 is stable. As we also have:

Table 5.1  SX5E implied volatility quotes (%)


k\t 0.025 0.101 0.197 0.274 0.523 0.772 1.769 2.267 2.784 3.781 4.778 5.774

51.31 33.66 32.91


58.64 31.78 31.29 30.08
65.97 30.19 29.76 29.75
73.30 28.63 28.48 28.48
76.97 32.62 30.79 30.01 28.43
80.63 30.58 29.36 28.76 27.53 27.13 27.11 27.11 27.22 28.09
84.30 28.87 27.98 27.50 26.66
86.13 33.65
87.96 32.16 29.06 27.64 27.17 26.63 26.37 25.75 25.55 25.80 25.85 26.11 26.93
89.79 30.43 27.97 26.72
91.63 28.80 26.90 25.78 25.57 25.31 25.19 24.97
93.46 27.24 25.90 24.89
95.29 25.86 24.88 24.05 24.07 24.04 24.11 24.18 24.10 24.48 24.69 25.01 25.84
97.12 24.66 23.90 23.29
98.96 23.58 23.00 22.53 22.69 22.84 22.99 23.47
100.79 22.47 22.13 21.84
102.62 21.59 21.40 21.23 21.42 21.73 21.98 22.83 22.75 23.22 23.84 23.92 24.86
104.45 20.91 20.76 20.69
106.29 20.56 20.24 20.25 20.39 20.74 21.04 22.13
108.12 20.45 19.82 19.84
109.95 20.25 19.59 19.44 19.62 19.88 20.22 21.51 21.61 22.19 22.69 23.05 23.99
111.78 19.33 19.29 19.20
113.62 19.02 19.14 19.50 20.91
117.28 18.85 18.54 18.88 20.39 20.58 21.22 21.86 22.23 23.21

120.95 18.67 18.11 18.39 19.90
124.61 18.71 17.85 17.93 19.45 20.54 21.03 21.64 22.51
131.94 19.88 20.54 21.05 21.90
139.27 19.30 20.02 20.54 21.35
146.60 18.49 19.64 20.12

Note: The table shows implied Black volatilities for European-style options on the SX5E index. Expiries range from two weeks to a little under six years
and strikes range from 50–146% of current spot of 2,772.70. Data is as of March 1, 2010
Table 5.2  SX5E calibration accuracy
k\t 0.025 0.101 0.197 0.274 0.523 0.772 1.769 2.267 2.784 3.781 4.778 5.774

51.31 0.00 0.00


58.64 0.00 –0.02 0.08
65.97 0.00 0.02 –0.23
73.30 0.00 –0.02 0.05
76.97 –0.02 –0.01 0.00 0.00
80.63 –0.02 –0.01 0.00 0.01 0.00 0.00 0.01 0.06 0.00
84.30 0.00 0.00 0.00 –0.02
86.13 0.01
87.96 –0.07 –0.05 0.01 0.02 0.01 –0.01 0.01 0.00 0.00 –0.01 –0.02 0.00
89.79 0.02 0.01 0.00
91.63 0.01 0.01 0.00 0.02 0.01 0.00 –0.01
93.46 –0.02 –0.02 0.00
95.29 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.01 –0.01 0.00
97.12 0.02 0.01 –0.01
98.96 –0.01 –0.01 0.00 0.00 0.00 0.00 0.00
100.79 0.01 0.00 0.00
102.62 0.01 –0.01 0.00 0.00 0.00 –0.01 –0.01 0.00 0.00 –0.03 0.00 0.00
104.45 0.01 0.00 0.02
106.29 –0.06 –0.01 0.00 0.01 0.00 0.03 0.01
108.12 0.00 0.00 –0.02
109.95 –0.10 –0.09 0.00 –0.02 0.00 0.02 –0.01 0.00 –0.01 0.02 –0.02 0.00
111.78 –0.02 0.03 –0.04
113.62 0.03 0.00 –0.01 0.00
117.28 –0.03 0.00 0.01 0.00 0.00 0.02 –0.02 0.00 0.00
120.95 0.01 0.00 –0.02 0.00
124.61 0.00 0.02 0.07 0.02 –0.03 0.02 –0.02 0.00
131.94 0.00 –0.05 0.01 0.00
139.27 0.00 0.01 –0.01 –0.01
146.60 0.02 –0.01 0.00

Note: The table shows the difference between the model and the target in implied Black volatilities for European-style options on the SX5E index. Data
is as of March 1, 2010

$$A^{-1}\iota = \iota,\qquad \iota = (1, \ldots, 1)' \tag{5.12}$$

we can further conclude that the discrete system 5.2 is arbitrage-


free. Because of the tri-diagonal form of the matrix and the
discretisation, this also holds if the spacing is non-equidistant; a
proof of which is provided in the appendix of this chapter.
If the problem is formulated in logarithmic space, x = lnk, as
would often be the case, then the discrete system 5.2 becomes:

⎡1− 1 Δt ϑ ( x )2 (δ − δ )⎤ c (t , x ) = c (t , x ) ,
⎣ 2 i i xx x ⎦ i+1 i

+
c ( 0, x ) = ( s ( 0) − e x ) , i = 0,1,...
1
δx f ( x ) = ( f ( x + Δx) − f ( x − Δx))
2Δx
1
δxx f ( x) = 2 ( f ( x − Δx ) − 2 f ( x ) + f ( x + Δx ))  (5.13)
Δx

It follows that the system is stable if Δx = ln(k_{j+1}/k_j) ≤ 2, not a constraint that will be breached in any practical application.
As shown in the appendix, 5.1 and 5.6 can be slightly generalised by introducing a deterministic time-change T(t):

⎡ 1 2 ∂ ⎤
2
⎢1− 2 (T (t ) − ti ) ϑ i ( k ) ⎥ c (t, k ) = c (ti , k ) , t ∈ ]ti ,ti+1 ]  (5.14)
⎣ ∂k 2 ⎦
where T(ti) = ti and Tʹ(t) > 0. In this case, the local volatility function
5.3 consistent with the model is given by:
2 ct (t, k )
σ (t, k ) = 2
c kk (t, k )
2 ⎡ ∂ln c kk (t, k ) ⎤
= ϑ i ( k ) ⎢T ʹ′ (t ) + (T (t ) − ti ) ⎥  (5.15)
⎣ ∂t ⎦
The introduction of the time-change facilitates the interpolation in
the expiry direction. For example, a choice of piecewise cubic func-
tions T(t) can be used to ensure that implied volatility is roughly
linear in expiry.

Algorithm
In summary, a discrete set of European-style option quotes is inter-
polated into a full continuously parameterised surface of
arbitrage-consistent option quotes by:

Figure 5.2 Local volatility derived from model option prices. Note: the graph shows the local volatility surface (%), over spot/strike and time/maturity, in the model after it has been fitted to the SX5E market. Data is as of March 1, 2010

❑❑ Step 1. For each expiry, solve an optimisation problem 5.5 for a piecewise constant volatility function with as many levels as target strikes at the particular expiry. Each iteration involves one update of 5.1 and is equivalent to one time step in a fully implicit finite difference solver.
❑❑ Step 2. For expiries between the original expiries, the volatility functions from step 1 are used in conjunction with 5.6 to generate option prices for all strikes.

Note that step 2 does not involve any iteration. The process of the
time stepping is shown in Figure 5.1.

Numerical example
Here, we consider fitting the model to the Eurostoxx 50 (SX5E)
equity option market. The number of expiries is 12, with up to 15
strikes per expiry. The target data is given in Table 5.1. We choose to
fit a lognormal version of the model based on a finite difference
solution with 200 grid points. The local volatility function is set up
to be piecewise linear with as many levels as calibration strikes per
expiry. The model fits to the option prices in approximately 0.05
seconds of CPU time on a standard PC. The average number of


iterations is 86 per expiry. Table 5.2 shows the calibration accuracy


for the target options. The standard deviation of the error is 0.03%
in implied Black volatility.
After the model has been calibrated, we use 5.6 to calculate option
prices for all expiries and strikes, and deduce the local volatility from
the option prices using 5.3. Figure 5.2 shows the resulting local vola-
tility surface. We note that the local volatility surface has no
singularities. So, as expected, the model produces arbitrage-
consistent European-style option prices for all expiries and strikes.

Conclusion
We have shown how a non-standard application of the fully implicit
finite difference method can be used for arbitrage-free interpolation
of implied volatility quotes. The method is quick and robust, and
can be used both as a pre-pricing step for local volatility models as
well as for market-making in option markets.

Appendix: technical results


Proposition 1: Absence of arbitrage
The surface of option prices constructed by the recursive schemes 5.1 and 5.6 is consistent with absence of arbitrage, that is:

$$c_t(t, k) \ge 0,\qquad c_{kk}(t, k) \ge 0 \tag{5.16}$$

for all (t, k).

Proof of proposition 1
Consider option prices generated by the forward equation:

$$0 = -\frac{\partial g}{\partial t} + \frac{1}{2}\vartheta(k)^2 \frac{\partial^2 g}{\partial k^2} \tag{5.17}$$

which is solved forward in time t given the initial boundary condition g(0, k).
As also noted in Andreasen (1996), 5.17 can also be seen as the backward equation for:

$$g(t, k) = E^k\left[g(0, k(0))\,\middle|\,k(t) = k\right] \tag{5.18}$$

where k follows the process:

$$dk(t) = \vartheta(k(t))\,dZ(t) \tag{5.19}$$


and Z is a Brownian motion running backwards in time. The mapping g(0, ·) ↦ g(t, ·) given by 5.17 thus defines a positive linear functional, in the sense that:

$$g(0, \cdot) \ge 0 \Rightarrow g(t, \cdot) \ge 0 \tag{5.20}$$

Further, differentiating 5.17 twice with respect to k yields the forward equation for p = g_kk:

$$0 = -\frac{\partial p}{\partial t} + \frac{1}{2}\frac{\partial^2}{\partial k^2}\left[\vartheta(k)^2 p\right],\qquad p(0, k) = g_{kk}(0, k) = \int g_{kk}(0, l)\,\delta(k - l)\,dl \tag{5.21}$$

Equation 5.21 is equivalent to the Fokker–Planck equation for the process:

$$dx(t) = \vartheta(x(t))\,dW(t) \tag{5.22}$$

where W is a standard Brownian motion. From this, we conclude that 5.17 preserves convexity:

$$g_{kk}(0, \cdot) \ge 0 \Rightarrow g_{kk}(t, \cdot) \ge 0 \tag{5.23}$$

Let T(u) be a strictly increasing function. Define the (Laplace) transform of the option prices by:

$$h(u, k) = \int_0^{\infty} \frac{1}{T(u)}\,e^{-t/T(u)}\,g(t, k)\,dt \tag{5.24}$$

Multiplying 5.17 by e^{−t/T(u)} and integrating in t yields:

$$\left[1 - \tfrac{1}{2}T(u)\,\vartheta(k)^2 \frac{\partial^2}{\partial k^2}\right] h(u, k) = g(0, k) \tag{5.25}$$

From 5.20 and 5.23, we conclude that 5.25 defines a positive linear functional that preserves convexity.
Differentiating 5.25 with respect to u yields:

$$\left[1 - \tfrac{1}{2}T(u)\,\vartheta(k)^2 \frac{\partial^2}{\partial k^2}\right] h_u(u, k) = \tfrac{1}{2}T'(u)\,\vartheta(k)^2\,h_{kk}(u, k) \tag{5.26}$$

Using 5.25 as a positive linear functional that preserves convexity, we have that if g(0, ·) is convex then:

$$h_u(u, k) \ge 0 \tag{5.27}$$

for all (u, k).


Proposition 2
On the discrete non-uniform strike grid k_0 < k_1 < … < k_n, the finite difference scheme 5.2 produces option prices that are consistent with absence of arbitrage, ie, the generated option prices are increasing in maturity, decreasing in strike and convex in strike:

$$c(t, k_i) \ge c(t, k_{i+1}),\qquad \frac{\partial c(t, k_i)}{\partial t} \ge 0,\qquad \frac{c(t, k_{i+1}) - c(t, k_i)}{k_{i+1} - k_i} \ge \frac{c(t, k_i) - c(t, k_{i-1})}{k_i - k_{i-1}}$$

for all i = 1, …, n − 1.

Proof of proposition 2
On the discrete non-uniform strike grid k_0 < k_1 < … < k_n, the finite difference scheme 5.2 can be written as the matrix equation system:

$$[I - \Theta D]\,c(t) = c(0) \tag{5.28}$$

where c(t) = (c(t, k_0), …, c(t, k_n))′ is a vector of option prices, I is the identity matrix, Θ is a diagonal matrix and D is proportional to the discrete second-order difference matrix. Specifically:

$$\Theta = \mathrm{diag}\left(\theta(k_0)^2, \theta(k_1)^2, \ldots, \theta(k_n)^2\right)$$
$$D = \frac{1}{2}\,t\,\delta_{kk} = \begin{bmatrix} 0 & 0 & & & \\ l_1 & -l_1 - u_1 & u_1 & & \\ & l_2 & -l_2 - u_2 & u_2 & \\ & & \ddots & \ddots & \ddots \\ & & l_{n-1} & -l_{n-1} - u_{n-1} & u_{n-1} \\ & & & 0 & 0 \end{bmatrix}$$
$$l_i = t\cdot\frac{1}{k_{i+1} - k_{i-1}}\cdot\frac{1}{k_i - k_{i-1}},\qquad u_i = t\cdot\frac{1}{k_{i+1} - k_{i-1}}\cdot\frac{1}{k_{i+1} - k_i} \tag{5.29}$$


In the following, assume θ(k_i) > 0. Multiplying 5.28 by D and rearranging yields:

$$[\Theta^{-1} - D]\left(\Theta D\,c(t)\right) = D\,c(0) \tag{5.30}$$

It follows that:

$$A \equiv [\Theta^{-1} - D] = \begin{bmatrix} \theta(k_0)^{-2} & 0 & & & \\ -l_1 & \theta(k_1)^{-2} + l_1 + u_1 & -u_1 & & \\ & -l_2 & \theta(k_2)^{-2} + l_2 + u_2 & -u_2 & \\ & & \ddots & \ddots & \ddots \\ & & -l_{n-1} & \theta(k_{n-1})^{-2} + l_{n-1} + u_{n-1} & -u_{n-1} \\ & & & 0 & \theta(k_n)^{-2} \end{bmatrix} \tag{5.31}$$

The matrix A is diagonally dominant with positive diagonal elements and non-positive off-diagonal elements. It follows from Gershgorin's circle theorem that the real parts of the eigenvalues of A are positive. Hence, A is an M-matrix, and thus all elements of A^{-1} are non-negative (see Theorem 5.1 in Fiedler, 1986).
As Dc(0) ≥ 0, we get that ΘDc(t) ≥ 0 and thereby that c(t) is convex in strike, ie:

$$\delta_{kk}\,c(t) \ge 0 \tag{5.32}$$

Differentiating 5.28 by t yields:

$$\frac{\partial c(t)}{\partial t} = [\Theta^{-1} - D]^{-1}\left(\tfrac{1}{2}\delta_{kk}\,c(t)\right) \ge 0 \tag{5.33}$$

Since c(t, k_0) = (s − k_0)^+ and c(t, k_n) = (s − k_n)^+, we can further conclude that c(t) is also monotone decreasing in strike and satisfies:

$$c(t, k_i) \ge c(0, k_i) = (s - k_i)^+ \tag{5.34}$$

We conclude that the option prices c(t) generated by Equations 5.2 and 5.28 are indeed arbitrage-free, and hence that the option prices constructed by 5.2 and 5.6 are consistent with absence of arbitrage.


REFERENCES

Andersen L. and J. Andreasen, 1999, “Jumping Smiles,” Risk, November, pp 65–68.

Andreasen J., 1996, “Implied Modeling,” working paper, Aarhus University.

Andreasen J. and B. Huge, 2010, “Expanded Smiles,” Risk, May, pp 78–81.

Avellaneda M., C. Friedman, R. Holmes and D. Samperi, 1997, “Calibrating Volatility


Surfaces Via Relative Entropy Minimization,” Applied Mathematical Finance, 4, pp 37–64.

Carr P., 2008, “Local Variance Gamma,” working paper, Bloomberg, New York.

Coleman T., Y. Li and A. Verma, 1999, “Reconstructing the Unknown Local Volatility
Function,” Journal of Computational Finance, 2(3), pp 77–100.

Dupire B., 1994, “Pricing With a Smile,” Risk, January, pp 18–20.

Fiedler, M., 1986, Special Matrices and their Application in Numerical Mathematics
(Dordrecht, Holland: Martinus Nijhoff Publishers).

Jackwerth J. and M. Rubinstein, 1996, “Recovering Probability Distributions from


Options Prices,” Journal of Finance, 51(5), pp 1,611–31.

JP Morgan, 1999, “Pricing Exotics Under the Smile,” Risk, November, pp 72–75.

Nabben R., 1999, “On Decay Rates of Tridiagonal and Band Matrices,” SIAM Journal on
Matrix Analysis and Applications, 20, pp 820–37.

Sepp A., 2007, “Using SABR Model to Produce Smooth Local Volatility Surfaces,”
working paper, Merrill Lynch.

6
Random Grids
Jesper Andreasen and Brian Huge
Danske Bank

Derivatives models are generally specified in continuous form as a


stochastic differential equation (SDE), and implementation of a
model will typically involve a number of different discrete approxi-
mations of the SDE. For example, an implementation of the Heston
(1993) model might have calibration to European-style option
prices via numerical inversion of discrete Fourier transforms, back-
ward pricing of exotics handled in a Craig–Sneyd (1988) finite
difference scheme, and Monte Carlo pricing using the Andersen
(2006) method for simulation. In this case, the different numerical
schemes will only be fully consistent with each other in the limit
when the number of Fourier steps, the number of time and spatial
steps in the finite difference grid, and the number of time steps in
the Monte Carlo all tend to infinity and the numerical schemes
converge to the continuous time and space solution.
In this chapter, we present an approach that achieves full discrete
consistency between calibration, finite difference solution and
Monte Carlo simulation. A continuous time stochastic model has a
backward partial differential equation (BPDE) associated with it.
We derive a discrete model based on a finite difference scheme for
this BPDE. We term this scheme the backward finite difference
(BFD) scheme, which we take as our base model.
For calibration purposes, we develop a forward finite difference
(FFD) scheme that is dual to, and fully consistent with, the BFD
scheme. It is important to stress that the FFD scheme is not a direct
discretisation of the forward (Fokker–Planck) partial differential
equation (FPDE) of the continuous time model and that by


construction our FFD scheme eliminates the need for specification


of the non-trivial boundary conditions that are normally associated
with numerical solution of FPDEs (see Lucic, 2008).
Next, we use results in Nabben (1999) to devise an efficient algo-
rithm for calculation of the transition probabilities implicit in the
BFD scheme, which is then used for simulating the model in a way
fully consistent with the BFD scheme. The numerical work associated
with identifying the transition probabilities in the BFD scheme is
equivalent to numerical solution of one BFD scheme.
As calibration, backward finite difference solution and simula-
tion are based on the same discretisation, the prices generated by
the model are the same, up to Monte Carlo noise, regardless of
which numerical scheme is used. Our implementation method-
ology is presented in the context of a stochastic local volatility
model but is applicable to a variety of models.
The rest of the chapter is organised as follows: the following
section describes the stochastic local volatility model and its associ-
ated forward and backward partial differential equations (PDEs),
the third section introduces the backward and forward finite differ-
ence methods, and in the fourth section we present the simulation
algorithm. In the fifth section, we describe our implementation and
give numerical examples. The chapter is rounded off with a section
that discusses possible extensions and a conclusion.

The model and the PDEs


For simplicity we assume zero rates. We let s be the price of the underlying stock and assume it evolves according to a stochastic local volatility model with zero correlation between spot and volatility:

$$ds(t) = \sqrt{z(t)}\,\sigma(t, s(t))\,dW(t)$$
$$dz(t) = \theta\left(1 - z(t)\right)dt + \varepsilon\,z(t)^{\gamma}\,dZ(t),\qquad z(0) = 1$$
$$dW(t)\cdot dZ(t) = 0 \tag{6.1}$$

where W, Z are independent Brownian motions under the risk-


neutral measure.
The FFD equations used for calibration can still be derived for
the case of non-zero correlation between underlying stock and its
volatility. However, as it stands, our simulation methodology
cannot directly be applied for the non-zero correlation case. We will
discuss this in a subsequent section.


Equation 6.1 leads to the BPDE:

$$0 = \frac{\partial V}{\partial t} + D_x V + D_y V$$
$$D_x = \frac{1}{2}\sigma^2 y\,\frac{\partial^2}{\partial x^2},\qquad D_y = \theta(1 - y)\frac{\partial}{\partial y} + \frac{1}{2}\varepsilon^2 y^{2\gamma}\frac{\partial^2}{\partial y^2} \tag{6.2}$$
where V(t, x, y) is the price of a claim at time t with current stock price s(t) = x and volatility z(t) = y. The boundary conditions are defined by the payout of the claim in question.
The joint density (or Green's function) q(t, x, y) = Pr(s(t) ∈ dx, z(t) ∈ dy) satisfies the FPDE:

$$0 = -\frac{\partial q}{\partial t} + D_x^* q + D_y^* q,\qquad q(0, x, y) = \delta(x - s(0))\cdot\delta(y - z(0))$$
$$D_x^* q = \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\sigma^2 y\,q\right],\qquad D_y^* q = -\frac{\partial}{\partial y}\left[\theta(1 - y)\,q\right] + \frac{1}{2}\frac{\partial^2}{\partial y^2}\left[\varepsilon^2 y^{2\gamma} q\right] \tag{6.3}$$

where δ is the Dirac function. The operator pair D_x^*, D_y^* are the adjoint operators of D_x, D_y.
The FPDE 6.3 can be seen as the dual of the BPDE 6.2 in the sense that European-style options satisfy 6.2 but also satisfy:

$$V(0, x(0), y(0)) = \int\!\!\int V(t, x, y)\,q(t, x, y)\,dx\,dy \tag{6.4}$$

The BPDE is solved backward in time and the solution describes the
future prices of a particular derivative for all times, spots and vola-
tility levels, whereas the FPDE is solved forward in time. The
solution gives the current marginal densities to all future times,
spot and volatility levels, and thereby the current prices of all
European-style options on spot and volatility.
The European-style call option prices are given by:

$$c(t, k) = E\left[(s(t) - k)^+\right] = \int\!\!\int (x - k)^+\,q(t, x, y)\,dx\,dy \tag{6.5}$$

Double integration of the FPDE 6.3 or local time arguments can be used to find the extended Dupire (1994) equation that relates initially observed option prices to the local volatility function:

$$0 = -\frac{\partial c}{\partial t} + \frac{1}{2}E\left[z(t)\,\middle|\,s(t) = k\right]\sigma(t, k)^2\,\frac{\partial^2 c}{\partial k^2} \tag{6.6}$$


where:

$$E\left[z(t)\,\middle|\,s(t) = k\right] = \frac{\int y\,q(t, k, y)\,dy}{\int q(t, k, y)\,dy} \tag{6.7}$$
A typical approach for implementing the model 6.1 would be to
discretise 6.3 to find the local volatility function from 6.6 that can
then, in turn, be used in a discretisation of the BPDE or in a Monte
Carlo simulation of a discretisation of the SDE 6.1.
There are several problems with this approach. The approxima-
tions are not mutually consistent. Specifically, direct application of
the same type of finite difference scheme to 6.2 and 6.3 would not
lead to the same results. Further, and more importantly, naive Euler
discretisation of the SDE for Monte Carlo may necessitate very fine
time stepping for the Monte Carlo prices to be close to the finite
difference prices. Finally, application of a finite difference scheme to
the FPDE requires specification of non-trivial boundary conditions
along the edges of the grid. The latter is particularly a problem for
the discretisation in the z direction for parameter choices when the
domain of z is closed and includes z = 0 (see Lucic, 2008).

The finite difference schemes


Let 0 = t_0 < t_1 < ... be a discretisation of the time axis. A finite difference scheme for the backward PDE 6.2 is the locally one-dimensional (LOD) scheme:

$$\left(1 - \Delta t D_x\right)v(t_{h+1/2}) = v(t_{h+1})$$
$$\left(1 - \Delta t D_y\right)v(t_h) = v(t_{h+1/2}) \tag{6.8}$$

where Δt = t_{h+1} − t_h, and:

$$D_x = \frac{1}{2}y\,\sigma^2\delta_{xx},\qquad \delta_{xx} f(x) = \frac{1}{\Delta x^2}\left[f(x + \Delta x) - 2f(x) + f(x - \Delta x)\right] \tag{6.9}$$

and:

$$D_y = \theta(1 - y)\,\delta_y + \frac{1}{2}y^{2\gamma}\varepsilon^2\delta_{yy}$$
$$\delta_y g(y) = \frac{1_{y<1}}{\Delta y}\left[g(y + \Delta y) - g(y)\right] + \frac{1_{y>1}}{\Delta y}\left[g(y) - g(y - \Delta y)\right]$$
$$\delta_{yy} g(y) = \frac{1}{\Delta y^2}\left[g(y + \Delta y) - 2g(y) + g(y - \Delta y)\right] \tag{6.10}$$


Note that we use an upwind operator that switches between


forward and backward approximations of the first-order derivative
with respect to y depending on the sign of the drift of y.
To close the system 6.8, boundary conditions at the edges of the grid have to be defined. Here, setting the second derivative to zero, that is, δ_xx v = 0 and δ_yy v = 0, can be used. For A_x ≡ 1 − ΔtD_x, we have:

$$A_x = \begin{bmatrix} 1 & 0 & & & \\ -a_1 & 1 + 2a_1 & -a_1 & & \\ & -a_2 & 1 + 2a_2 & -a_2 & \\ & & \ddots & \ddots & \ddots \\ & & -a_{m-2} & 1 + 2a_{m-2} & -a_{m-2} \\ & & & 0 & 1 \end{bmatrix}$$
$$a_i = \frac{1}{2}\frac{\Delta t}{\Delta x^2}\,y\,\sigma(t, x_i)^2,\qquad x_i = x_0 + i\Delta x \tag{6.11}$$
and for A_y ≡ 1 − ΔtD_y we have:

$$A_y = \begin{bmatrix} 1 + c_0 & -c_0 & & & \\ -a_1 & 1 + b_1 & -c_1 & & \\ & -a_2 & 1 + b_2 & -c_2 & \\ & & \ddots & \ddots & \ddots \\ & & -a_{n-2} & 1 + b_{n-2} & -c_{n-2} \\ & & & -a_{n-1} & 1 + a_{n-1} \end{bmatrix}$$
$$c_0 = \frac{\Delta t}{\Delta y}\theta(1 - y_0),\qquad a_k = \frac{1}{2}\frac{\Delta t}{\Delta y^2}\varepsilon^2 y_k^{2\gamma} - \frac{\Delta t}{\Delta y}\theta(1 - y_k)\,1_{y_k > 1}$$
$$b_k = a_k + c_k,\qquad c_k = \frac{1}{2}\frac{\Delta t}{\Delta y^2}\varepsilon^2 y_k^{2\gamma} + \frac{\Delta t}{\Delta y}\theta(1 - y_k)\,1_{y_k < 1}$$
$$a_{n-1} = -\frac{\Delta t}{\Delta y}\theta(1 - y_{n-1}),\qquad y_k = y_0 + k\,\Delta y \tag{6.12}$$
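For concreteness, the two operators can be assembled as follows; the matrices are built dense purely for readability, whereas an implementation would store only the three bands and solve with a routine such as tridag().

import numpy as np

def build_Ax(y_level, sigma, dt, dx):
    # A_x = 1 - dt*D_x of Equation 6.11, for one volatility level y.
    m = len(sigma)
    a = 0.5 * dt / dx**2 * y_level * sigma**2
    A = np.zeros((m, m))
    A[0, 0] = A[-1, -1] = 1.0
    for i in range(1, m - 1):
        A[i, i - 1], A[i, i], A[i, i + 1] = -a[i], 1.0 + 2.0 * a[i], -a[i]
    return A

def build_Ay(y, theta, eps, gamma, dt, dy):
    # A_y = 1 - dt*D_y of Equation 6.12, with the upwind switch at y = 1.
    n = len(y)
    A = np.zeros((n, n))
    drift = dt / dy * theta * (1.0 - y)                # dt/dy * theta(1 - y_k)
    diff = 0.5 * dt / dy**2 * eps**2 * y**(2 * gamma)
    A[0, 0], A[0, 1] = 1.0 + drift[0], -drift[0]       # c_0 row
    A[-1, -2], A[-1, -1] = drift[-1], 1.0 - drift[-1]  # a_{n-1} row
    for k in range(1, n - 1):
        a = diff[k] - drift[k] * (y[k] > 1)            # a_k >= 0 by upwinding
        c = diff[k] + drift[k] * (y[k] < 1)            # c_k >= 0 by upwinding
        A[k, k - 1], A[k, k], A[k, k + 1] = -a, 1.0 + a + c, -c
    return A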
In 6.8, the first equation has to be solved for each y and the second
equation for each x. As usual, the solution of the tridiagonal matrix
equations can be performed in linear time using the tridag() algo-
rithm in Press et al (1988). The LOD scheme 6.8 can be seen as an
alternating direction implicit extension for two spatial dimensions
of the fully implicit finite difference scheme for one-dimensional
problems. It is not the most sophisticated finite difference scheme
and only achieves O(Δt + Δx² + Δy) accuracy. The drop in accuracy in
the y direction is due to the upwind operator used for the first


derivative of y. The strength of the scheme is that it has strong stability properties that, in conjunction with the upwind operator used for the δ_y, guarantee positivity of the inverse of the operators, that is, (1 − ΔtD_x)^{-1} ≥ 0 and (1 − ΔtD_y)^{-1} ≥ 0. This will be a crucial property for the Monte Carlo scheme that we present in the next section.
erty for the Monte Carlo scheme that we present in the next section.
As our schemes for calibration, backward finite difference and
simulation are mutually consistent, low (formal) order of accuracy
and (relatively) poor approximation of the SDE is not critical. We
can say that the basis of our modelling is the BFD scheme 6.8 rather
than the SDE 6.1.
The LOD scheme 6.8 can be rewritten (in transition form) as:
−1 −1
v (th ) = (1− ΔtDy ) (1− ΔtD ) v (t ) 
x h+1 (6.13)

Multiplying 6.13 by a vector p yields:1

$$p(t_h)'\,v(t_h) = p(t_h)'\left(1 - \Delta t D_y\right)^{-1}\left(1 - \Delta t D_x\right)^{-1} v(t_{h+1}) \tag{6.14}$$

From 6.14 we see that if p is chosen to satisfy:

$$p(t_0) = 1_{x = x(0)}\cdot 1_{y = y(0)}$$
$$\left(1 - \Delta t D_y\right)' p(t_{h+1/2}) = p(t_h)$$
$$\left(1 - \Delta t D_x\right)' p(t_{h+1}) = p(t_{h+1/2}) \tag{6.15}$$

then:

$$p(t_h)'\,v(t_h) = p(t_{h+1})'\,v(t_{h+1}) \tag{6.16}$$

for every h. In particular, for European-style options we have:

$$v(0, s(0), z(0)) = p(t_0)'\,v(t_0) = p(t_n)'\,v(t_n) \tag{6.17}$$

We conclude that p is the discrete Green’s function for the BFD


scheme 6.8. Note that 6.15 is completely consistent with the BFD
scheme 6.8, since it is constructed directly from that equation. The
adjoint operators in the FPDE 6.3 are in 6.15 replaced by the trans-
pose of the matrix. There is no need to take special care of the
boundary conditions at the edges because these are implicitly
defined by the backward scheme. We also note that direct applica-
tion of the LOD scheme to the FPDE 6.3 would result in a different
order of the application of the x and y operators.
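In code, one forward step of 6.15 is just a pair of solves against the transposed backward operators. A minimal sketch, assuming the build_Ax/build_Ay helpers from the previous snippet and dense solves purely for clarity:

import numpy as np

def ffd_step(p, Ax_list, Ay):
    # p has shape (n_y, n_x); Ay acts in the y direction for each x, and
    # Ax_list[k] is the x-direction matrix 6.11 at volatility level y_k.
    p_half = np.linalg.solve(Ay.T, p)                 # (1 - dt Dy)' p_half = p(t_h)
    p_next = np.empty_like(p_half)
    for k, Ax in enumerate(Ax_list):
        p_next[k] = np.linalg.solve(Ax.T, p_half[k])  # (1 - dt Dx)' p_next = p_half
    return p_next

Starting from p(t_0) = 1_{x=x(0)}·1_{y=y(0)}, repeated calls propagate the discrete Green's function forward, with no boundary conditions to specify beyond those already built into A_x and A_y.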
Consistent FFD schemes can be developed for any BFD scheme,


including schemes that are second-order accurate in time such as,


for example, the Craig–Sneyd (1988) scheme. In fact, for any BFD
scheme that can be written as:

$$v(t_h) = H\,v(t_{h+1}) \tag{6.18}$$

for some matrix H, we have a consistent FFD scheme that is given by:

$$p(t_{h+1}) = H'\,p(t_h) \tag{6.19}$$

Further, we note that the matrix H can be seen as a transition prob-


ability matrix, which in the context of Markov chain models would
be termed as a discrete time generator matrix. The latter observa-
tion is what will be used for developing our Monte Carlo simulation
scheme in the following section.
A similar result exists for the continuous time and discrete space
case. If the prices solve the backward differential equation:

∂v
0= + Lv  (6.20)
∂t

for some matrix L, then the associated Green’s function solves the
forward differential equation:

∂p
0=− + Lʹ′p  (6.21)
∂t
This can be used for deriving boundary conditions for the FPDE 6.3
(see Andreasen, 2009).
Multiplying the last equation in 6.15 by (x − k)^+ and summing over x, y leads to the following discrete version of the Dupire Equation 6.6:

$$0 = -\frac{1}{\Delta t}\left[c(t_{h+1}, x) - c(t_{h+\frac{1}{2}}, x)\right] + \frac{1}{2}\sigma(t_h, x)^2\,E\left[y(t_{h+1})\,\middle|\,x\right]\delta_{xx}\,c(t_{h+1}, x)$$
$$c(t_{h+\frac{1}{2}}, k) = \sum_x\sum_y (x - k)^+\,p(t_{h+\frac{1}{2}}, x, y)$$
$$E\left[y(t_{h+1})\,\middle|\,x\right] = \frac{\sum_y y\,p(t_{h+1}, x, y)}{\sum_y p(t_{h+1}, x, y)} \tag{6.22}$$


Equation 6.22 can be used in conjunction with 6.15 for calibrating the local volatility surfaces so that we achieve consistency between the calibration scheme and the BFD scheme.
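A single local volatility update via 6.22 can be prototyped as below. In the actual calibration the update is iterated, or bootstrapped in time, since c(t_{h+1}, ·) itself depends on σ(t_h, ·); the sketch shows only the algebraic inversion, with no smoothing or flooring of the denominator, which a robust implementation would add.

import numpy as np

def local_var_6_22(c_half, c_next, Ey, dt, dx):
    # sigma(t_h, x)^2 from Equation 6.22 at interior grid points:
    # sigma^2 = (c(t_{h+1}) - c(t_{h+1/2})) / (0.5 * dt * E[y|x] * dxx c(t_{h+1})).
    dxx = (c_next[:-2] - 2.0 * c_next[1:-1] + c_next[2:]) / dx**2
    return (c_next[1:-1] - c_half[1:-1]) / (0.5 * dt * Ey[1:-1] * dxx)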

Monte Carlo simulation


As mentioned earlier, the representation in 6.13 of the BFD scheme
can be viewed as a description of the transition probabilities of (x, y)
over a single time step, and thereby as a recipe for simulation of (x,
y). Over the first half time step we simulate y while x is frozen, over
the second half time step we simulate x while y is frozen:
v(t_h) = (1 − Δt D_y)^{−1} · (1 − Δt D_x)^{−1} v(t_{h+1}) = A_y^{−1} · A_x^{−1} v(t_{h+1})  (6.23)
where the factor A_y^{−1} corresponds to simulating y and the factor A_x^{−1} to simulating x.

The conditional transition probabilities are given by:


Pr ⎡⎣ x (th+1 ) = x j x (th ) = x (th+1/2 ) = xi ⎤⎦ = ( Ax−1 ) ij

Pr ⎡⎣ y (th+1 ) = y (th+1/2 ) = yl y (th ) = y k ⎤⎦ = ( Ay−1 )  (6.24)


kl

From 6.11 we see that:

A_x ι = ι, ι = (1, ..., 1)′  (6.25)

Hence:

A_x^{−1} ι = ι  (6.26)

Further, as a_j ≥ 0, the matrix A_x is diagonally dominant with positive diagonal and negative off-diagonals, and results in Nabben (1999) and Achdou and Pironneau (2005) can be used to show that:

A_x^{−1} ≥ 0  (6.27)

Inspecting 6.12 reveals that the use of upwind operators for the first
derivative ensures that:

a_k, b_k, c_k ≥ 0  (6.28)

and thereby that A_y is diagonally dominant with positive diagonal and negative off-diagonals. This, in turn, ensures that the properties 6.26 and 6.27 are also satisfied for A_y.
We conclude that the transition probabilities 6.24 are indeed non-negative and sum to one. For constant parameters the rows in A_x^{−1} and A_y^{−1} are Laplace densities. Figure 6.1 shows an example of this.


Generally, the calculations needed to find the inverse of a matrix are quite time-consuming, of the order O(N³) for a matrix in R^{N×N}. However, for a tridiagonal matrix A ∈ R^{N×N}, Nabben (1999) shows that the inverse can be represented as:

A_{ij}^{−1} = u_i v_j if i ≤ j, A_{ij}^{−1} = r_j s_i if i > j  (6.29)

where the vectors r, s, u, v can be found in linear time, that is, at a computational cost of O(N). The drawback of the Nabben algorithm is that the elements of u, r tend to be very large and the elements of v, s tend to be very small.
Instead, we base our simulation algorithm on Huge (2010), who provides an O(N) algorithm for identifying the vectors x̃, ỹ, d̃ ∈ R^N satisfying:
∑_{j≤i} A_{ij}^{−1} = d̃_i, A_{ii}^{−1} = x̃_i ỹ_i / A_{ii}
A_{ij}^{−1} = A_{i,j+1}^{−1} · (x̃_j / x̃_{j+1}) · (−A_{j+1,j} / A_{jj}), j < i
A_{ij}^{−1} = A_{i,j−1}^{−1} · (ỹ_j / ỹ_{j−1}) · (−A_{j−1,j} / A_{jj}), j > i  (6.30)

A C-implementation of the Huge (2010) algorithm is given in Figure 6.2. The algorithm is very similar in structure and numerical complexity to the tridag() algorithm in Press et al (1988). Computing the vectors x̃, ỹ, d̃ for all steps in the finite difference grid 6.8 has a computational cost that is roughly equivalent to solving one finite difference grid.
For the cumulative distribution function:

Qij ≡ Pr ⎡⎣x (th+1 ) ≤ x j x (th ) = xi ⎤⎦ = ∑ Aik−1  (6.31)


k≤ j

we have:
Q_{ij} = Q_{ii} − ∑_{k=j+1}^{i} A_{ik}^{−1} + ∑_{k=i+1}^{j} A_{ik}^{−1}  (6.32)

The results in 6.30 can be used for identifying each of the elements
in 6.32 by recursion from the diagonal j = i.
This leads to the following simulation algorithm:


0. Suppose x(t_h) = x_i. Set j = i.
1. Draw a uniform u ~ U(0, 1).
2. If u ≤ Q_{ii}: while u < Q_{i,j−1} set j := j − 1. Go to 4.
3. If u > Q_{ii}: while u > Q_{i,j} set j := j + 1. Go to 4.
4. Set x(t_{h+1}) = x_j.  (6.33)
Once the vectors x̃, ỹ, d̃ have been calculated using the Huge algorithm, the actual simulation algorithm is very fast. There are two reasons for this. First, the updating in the while loops in steps two and three is quick due to the recursive nature of 6.30 and 6.32. Second, as we start in the diagonal, the number of steps in the while loops will be very limited, say zero, one or two, for most draws of the uniform variable.
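As an illustration of steps 0–4, consider the following minimal C sketch (our own helper names, not the authors' listing). It assumes a callback ainv(i, k) returning A_{ik}^{−1}, which in practice is evaluated recursively from the diagonal via 6.30 at O(1) per step:

typedef double (*ainv_fn)(int i, int k, const void *ctx);

/* Draw the next state index j given the current state i by inverting the
   CDF Q_{ij} of 6.31-6.32, starting the search at the diagonal (6.33).
   Qii is the precomputed diagonal value d~_i; u is uniform on (0,1). */
int draw_next_state(int i, double Qii, double u, int n,
                    ainv_fn ainv, const void *ctx)
{
    int j = i;
    double Qij = Qii;
    if (u <= Qii) {                            /* step 2: walk left */
        while (j > 0) {
            double Qleft = Qij - ainv(i, j, ctx);   /* Q_{i,j-1} */
            if (u >= Qleft) break;
            Qij = Qleft;
            --j;
        }
    } else {                                   /* step 3: walk right */
        while (j < n - 1 && u > Qij) {
            ++j;
            Qij += ainv(i, j, ctx);            /* Q_{ij} = Q_{i,j-1} + A^{-1}_{ij} */
        }
    }
    return j;                                  /* step 4: x(t_{h+1}) = x_j */
}

Because the search starts at the diagonal, only a handful of callback evaluations are needed for most draws of the uniform.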

Numerical implementation and examples


In our implementation, we use the algorithm of Andreasen and
Huge (2011) to interpolate and extrapolate discrete sets of observed
European-style option prices to a continuum of arbitrage-consistent
European-style option prices before calibration of the dynamic
model.
In a conventional model implementation where different approxi-
mations of the original SDE are used, it is important to verify the
convergence and to control the approximation error. With our
approach, this is no longer as important since the calibration, BFD
scheme pricing and Monte Carlo simulation all are done within the
same discrete model. That said, it is interesting to quantify the inherent
discrepancies between the continuous and the discrete models.
To this end, we consider a Heston (1993) model with flat vola-
tility and the parameters:
σ (t, s) = 0.3 ⋅ s
ε = 3, θ = 1, γ = 0.5  (6.34)
We calculate option prices using the following numerical
methods: BFD scheme solution using the LOD scheme 6.8, Monte
Carlo simulation as described in the previous section, that is, 6.23,
and numerical inversion of the Fourier transform, as in Lipton
(2002). The option prices calculated with different numerical reso-
lution are reported in Table 6.1. We note that the prices generated by
the BFD scheme and Monte Carlo on the same finite difference grid
match. But we also note that the convergence is relatively crude in


Figure 6.1 A row in the transition matrix
[Chart not reproduced: values of a row in A_x^{−1} plotted against x from −1.50 to 1.50]
Note: graph shows the values of a row in A_x^{−1}

Figure 6.2 Algorithm for decomposition of tridiagonal matrix
[Code listing not reproduced]


Table 6.1  Pricing on flat parameters

Strike 50% 100% 200% CPU

Fourier 29.69% 26.56% 29.69% < 0.001s


FD
25 × 50 × 12 29.53% 25.46% 29.58% 0.04s
25 × 100 × 25 29.60% 25.78% 29.62% 0.11s
25 × 200 × 50 29.63% 25.88% 29.63% 0.39s
MC
25 × 50 × 12 29.50% 25.45% 29.59% 0.69s
25 × 100 × 25 29.61% 25.78% 29.64% 0.80s
25 × 200 × 50 29.67% 25.89% 29.67% 1.09s
FD
100 × 100 × 25 29.63% 26.22% 29.64% 0.35s
1,000 × 100 × 25 29.64% 26.38% 29.65% 2.80s
10,000 × 100 × 25 29.64% 26.40% 29.65% 27.23s
CS
25 × 100 × 25 29.67% 26.51% 29.69% 0.15s

Note: The table reports the implied Black volatilities for five-year European-style options
with strikes given as 50%, 100% and 200% of the initial forward priced in a (Heston)
model with flat parameters. Results are reported for solution by Fourier inversion,
and backward finite difference solution (FD) and Monte Carlo (MC) for different grid
sizes. Here 25 × 100 × 25 refers to a grid with a total of 25 t-steps, 100 x-steps and
25 y-steps. The term “CS” refers to pricing in a second-order accurate Craig–Sneyd
scheme. MC prices are generated by 32 randomly seeded batches of 16,384 Sobol paths
(see Glasserman, 2003). MC pricing error is approximately 0.02% in implied Black
volatility terms. For the FD case, the reported CPU times include one forward sweep
(calibration) and one backward sweep (pricing) of the finite difference grid. For the MC
case, it includes one forward sweep, one decomposition sweep (MC initialisation) and
simulation of 16,384 paths. Hardware is a standard 2.66GHz Intel PC

the time domain. This is as expected as the LOD method only achieves O(Δt) accuracy.
However, this is of little practical importance when the discrete models are calibrated to the same observed option prices. This is illustrated in Table 6.2, where we compare European-style option prices in discrete models with different numbers of time steps and different levels of ε. The models are all calibrated to the SX5E equity option data as of July 28, 2010.
The default configuration in our implementation uses a finite difference grid with a total size of 25 × 200 × 50 (t × x × y) steps. As can be seen in Tables 6.1 and 6.2, a BFD scheme solution on such a grid can be done in approximately 0.4 seconds of CPU time on a standard


Table 6.2  Pricing with stochastic local volatility

50% 100% 200% CPU

FD 25 × 200 × 50
EPS = 0 36.96% 26.73% 19.19% 0.05s
EPS = 1 36.96% 26.73% 19.19% 0.42s
EPS = 2 36.96% 26.73% 19.19% 0.42s
EPS = 3 36.96% 26.72% 19.18% 0.42s
MC 25 × 200 × 50
EPS = 0 36.94% 26.74% 19.17% 0.68s
EPS = 1 36.90% 26.72% 19.14% 1.20s
EPS = 2 36.95% 26.72% 19.14% 1.20s
EPS = 3 36.99% 26.74% 19.14% 1.20s
FD 1,000 × 200 × 50
EPS = 0 36.98% 26.75% 19.21% 0.41s
EPS = 1 36.98% 26.75% 19.21% 14.14s
EPS = 2 36.98% 26.75% 19.21% 14.14s
EPS = 3 36.98% 26.74% 19.21% 14.14s

Note: The table reports the implied Black volatilities for five-year European-style options with strikes of 50%, 100% and 200% of the initial forward priced when the model is calibrated to the SX5E equity option market, for different levels of the volatility of variance parameter ε. Here, we use the same terminology for grid sizes as in Table 6.1. Monte Carlo (MC) prices are generated the same way as for Table 6.1, again with an MC pricing error of approximately 0.02% in implied Black volatility terms. CPU times are also measured in the same way as in Table 6.1. The SX5E equity option data is as of July 28, 2010

Table 6.3  Pricing exotics

Epsilon 0.0 1.0 2.0 3.0 MC error

Variance 0.1891 0.1862 0.1843 0.1854 0.0030


Floored variance 0.2423 0.2466 0.2567 0.2665 0.0027
Straddles 1.8123 1.7155 1.5230 1.3460 0.0089

Note: The table reports prices of three different exotics on the SX5E equity index for different levels of ε. Prices are generated from 16,384 paths in a grid of dimensions 25 × 200 × 25. Market data is as of July 28, 2010

computer. This includes a forward sweep to calibrate the model and a backward sweep of the grid to price the actual contract. Monte Carlo simulation on the finite difference grid carries an overhead to set up the simulation that is similar to the BFD scheme solution: a forward sweep to calibrate the model and a decomposition sweep to identify the vectors x̃, ỹ, d̃ at all steps in the grid. The total CPU time for simulation of 16,384 paths on a 25 × 200 × 50 grid is approximately 1.2 seconds. So,


in this case, roughly 0.8 seconds is spent inside the simulation algo-
rithm described in Equation 6.33. Profiling our code reveals that
roughly 80% of the time spent in the simulation algorithm involves
drawing the random numbers in step 1. So in terms of speed, the
simulation methodology is, step by step, almost as fast as naive Euler
discretisation. Since our algorithm reproduces the exact distribution
of the BFD scheme 6.8, there will, however, be no need for more steps
in the simulation than in the BFD scheme.
Though two models produce the same prices for European-style
options, they can produce markedly different prices for exotics. To
illustrate this, we consider three different exotic options on the
SX5E equity index. Let t_i = i/12 and define the returns:
R_i = s(t_i)/s(t_{i−1}) − 1, i = 1, ..., 36  (6.35)

We consider three different exotic options:
Variance: U = ∑_i R_i²
Floored variance: max(H, U)
Straddles: ∑_i |R_i|  (6.36)

Table 6.3 reports the prices of these exotics for different levels of volatility of variance ε. We see that the variance contract is almost invariant to the level of ε. The intuition here is that if there are no jumps in the underlying stock, then a contract on the continuously observed variance can be statically hedged by a contract on the logarithm of the underlying stock (see Dupire, 1993). Hence, if European-style option prices are the same in two models with continuous evolution of the stock, then the value of the variance contract should be the same. The floored variance contract, on the other hand, includes an option on the variance and should therefore increase with the volatility of variance parameter ε. Finally, for each period, the forward starting straddle payout is the square root of the realised variance:
of the realised variance:

Ri = Ri2

As the square root is a concave function, we should expect to see the value of the straddle contract decrease with the level of ε. We


conclude that the ε dependence of the exotic option prices in Table 6.3 is in line with what we expect.

Extensions
An easy way to extend the model to the multi-asset case is to use a joint volatility process and correlate the increments of the underlying stocks using a Gaussian copula. Specifically, if ũ_i is the uniform used for propagating stock i at a given time step, we can set:
ũ_i = Φ(w_i), i = 1, ..., l, w = (w_i) = Pξ  (6.37)
where ξ_1, ..., ξ_l are independent with ξ_i ~ N(0, 1), and PP′ is a correlation matrix in R^{l×l}.
The BFD scheme 6.8 can be extended to include correlation
between stock and volatility, in the following way:

⎡⎣1− ΔtDx ⎤⎦ v (th+1/2 ) = ⎡1+ 1 ΔtDxy ⎤ v (th+1 )


⎢⎣ 2 ⎥⎦

⎡⎣1− ΔtDy ⎤⎦ v (th ) = ⎡1+ 1 ΔtDxy ⎤ v (th+1/2 )  (6.38)


⎢⎣ 2 ⎥⎦

where:
D_xy = σερ y^{γ+1/2} δ_xy
δ_xy f(x, y) = [f(x+Δx, y+Δy) − f(x+Δx, y−Δy)]/(4ΔxΔy) − [f(x−Δx, y+Δy) − f(x−Δx, y−Δy)]/(4ΔxΔy)  (6.39)

The scheme 6.38 is unconditionally stable, in the von Neumann sense, and has accuracy of order O(Δt + Δx² + Δy). The BFD scheme 6.38 leads to the dual FFD scheme:
p(t_0) = 1_{x=x(0)} · 1_{y=y(0)}
(1 − Δt D_y)′ p(t_{h+1/4}) = p(t_h)
p(t_{h+1/2}) = (1 + ½ Δt D_xy)′ p(t_{h+1/4})
(1 − Δt D_x)′ p(t_{h+3/4}) = p(t_{h+1/2})
p(t_{h+1}) = (1 + ½ Δt D_xy)′ p(t_{h+3/4})  (6.40)


The FFD scheme 6.40 is then to be used as the basis for calibration
instead of 6.15.
For simulation, we note that 6.38 can be rearranged as:

−1 ⎡ 1 ⎤
v (th ) = ⎡⎣1− ΔtDy ⎤⎦ ⋅ ⎢1+ ΔtDxy ⎥
⎣ 2 ⎦
−1 ⎡ 1 ⎤
⋅⎡⎣1− ΔtDx ⎤⎦ ⋅ ⎢1+ ΔtDxy ⎥ v (th+1 )  (6.41)
⎣ 2 ⎦
with x or y simulated for the first-order factors, and both for the
second order.
The matrix:

⎡ 1 ⎤
B ≡ ⎢1+ ΔtDxy ⎥  (6.42)
⎣ 2 ⎦

specifies weights that sum to one and link the left-hand side values
at state (x, y) to the right-hand side values at the states:

{(x − Δx, y − Δy), (x − Δx, y + Δy), (x + Δx, y − Δy), (x + Δx, y + Δy)}  (6.43)


Like A_x^{−1}, A_y^{−1}, it can therefore be viewed as a transition matrix for joint transition of (x, y) into one of the states in 6.43. This suggests simulation according to the (conditional) probabilities:
Pr[(x, y) = (x_i, y_j)] = B_{ij}  (6.44)

The trouble is that some of the entries of B are negative. This can be handled by simulation according to the transition probabilities:
Pr[(x, y) = (x_i, y_j)] = |B_{ij}| / ∑_k ∑_l |B_{kl}|  (6.45)

in combination with a numeraire that over each B-step is updated according to:
N := N · sgn(B_{ij}) ∑_k ∑_l |B_{kl}|  (6.46)

if entry ij is simulated. The numeraire will then have to be


multiplied on the terminal payout.
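For concreteness, a minimal C sketch of one B-step under 6.45 and 6.46 (our own helper, not the authors' code; the four entries of b correspond to the corner states in 6.43 and u is a uniform draw):

#include <math.h>

/* Sample a corner state with probability |B_ij| / sum |B_kl| (6.45) and
   return the numeraire updated by sgn(B_ij) * sum |B_kl| (6.46). */
double apply_b_step(const double b[4], double u, int *state, double numeraire)
{
    double total = 0.0;
    for (int k = 0; k < 4; ++k) total += fabs(b[k]);
    double cum = 0.0;
    int k = 0;
    for (; k < 4; ++k) {                       /* inverse CDF over |B| weights */
        cum += fabs(b[k]) / total;
        if (u <= cum) break;
    }
    if (k == 4) k = 3;                         /* guard against rounding */
    *state = k;
    return numeraire * (b[k] >= 0.0 ? total : -total);
}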


Conclusion
We have presented a methodology that achieves full discrete
consistency between calibration, backward finite difference pricing
and Monte Carlo simulation, in the context of a stochastic local
volatility model. The methods extend to the multi-asset case and to
the case of non-zero correlation between the underlying and the
volatility process, as well as to other model types.

The authors would like to thank two anonymous referees for helpful
comments and suggestions.

  1 Here, A′ denotes the transpose of a matrix A

REFERENCES

Achdou Y. and O. Pironneau, 2005, “Computational Methods for Option Pricing,” SIAM
Frontiers in Applied Mathematics.

Andersen L., 2006, “Efficient Simulation of the Heston Process,” working paper, Bank of
America.

Andreasen J., 2009, “Planck–Fokker Boundary Conditions,” working paper, Danske


Markets.

Andreasen J. and B. Huge, 2011, “Volatility Interpolation,” Risk, March, pp 86–89.

Craig I. and A. Sneyd, 1988, “An Alternating-direction Implicit Scheme for Parabolic
Equations with Mixed Derivatives,” Computers and Mathematics with Applications, 16(4),
pp 341–50.

Dupire B., 1993, “Model Art,” Risk, July, pp 118–21.

Dupire B., 1994, “Pricing with a Smile,” Risk, January, pp 18–20.

Glasserman P., 2003, Monte Carlo Methods in Financial Engineering (New York, NY:
Springer).

Heston S., 1993, “A Closed-form Solution for Options with Stochastic Volatility with
Applications to Bond and Currency Options,” Review of Financial Studies, 6(2), pp 327–43.

Huge B., 2010, “The Inverse of a Tridiagonal Matrix,” working paper, Danske Markets.

Lipton A., 2002, “The Vol Smile Problem,” Risk, February, pp 61–65.

Lucic V., 2008, “Boundary Conditions for Computing Densities in Hybrid Models via
PDE Methods,” working paper, Barclays Capital.

Nabben R., 1999, “On Decay Rates of Tridiagonal and Band Matrices,” SIAM Journal on
Matrix Analysis and Applications, 20, pp 820–37.

Press W., W. Vetterling, S. Teukolsky and B. Flannery, 1988, Numerical Recipes in C: The
Art of Scientific Computing (Cambridge, England: Cambridge University Press).

7
Being Particular About Calibration
Julien Guyon and Pierre Henry-Labordère
Bloomberg and Société Générale

The calibration of stochastic volatility and hybrid models to market


smiles is a longstanding problem in quantitative finance. Partial
answers have been given: for low-dimensional factor models such
as old-fashioned one-factor local stochastic volatility (LSV) models
or a hybrid Dupire local volatility model with a one-factor interest
rate model, this calibration can be achieved by solving a two-
dimensional nonlinear Fokker–Planck partial differential equation
(PDE) (Lipton, 2002). For multi-factor stochastic models such as
variance swap curve models, Libor market models (LMMs) with
stochastic volatility such as the SABR–LMM model or hybrid
Dupire local volatility appearing in the pricing of power reverse
dual derivatives (Piterbarg, 2006), approximate solutions have been
suggested based on heat kernel perturbation expansions, time-
averaging methods and the so-called Markovian projection
techniques (Piterbarg, 2006, and Henry-Labordère, 2009).
In this chapter, we introduce the particle algorithm. This Monte
Carlo method allows us to exactly calibrate any LSV/hybrid model
to market smiles. This method relies on the fact that the dynamics
of the calibrated model is a nonlinear McKean stochastic differen-
tial equation (SDE). Here, nonlinear means that the volatility
depends on the marginal distribution of the process. As a conse-
quence, this SDE is associated with a nonlinear Fokker–Planck PDE.
The particle method consists of considering this equation as the large N limit of an N-dimensional linear Fokker–Planck PDE that can be simulated efficiently with a Monte Carlo algorithm.
This chapter is organised as follows. We first introduce nonlinear


McKean SDEs, then give examples of them arising in the calibration


of LSV and hybrid models to market smiles. Then we present the
particle algorithm for LSV and hybrid models, including important
implementation details. Next, we illustrate the efficiency of our
algorithm on various models commonly used by practitioners.

McKean SDEs
A McKean equation for an n-dimensional process X is an SDE in
which the drift and volatility depend not only on the current value
Xt of the process, but also on the probability distribution Pt of Xt:
dX_t = b(t, X_t, P_t) dt + σ(t, X_t, P_t) · dW_t, P_t = Law(X_t)  (7.1)

where Wt is a d-dimensional Brownian motion. In Sznitman (1991),


uniqueness and existence are proved for Equation 7.1 if the drift and
volatility coefficients are Lipschitz-continuous functions of Pt, with
respect to the so-called Wasserstein metric. The probability density
function (PDF) p(t, ·) of Xt is the solution to the Fokker-Planck PDE:
−∂_t p − ∑_{i=1}^{n} ∂_i (b^i(t, x, P_t) p(t, x)) + ½ ∑_{i,j=1}^{n} ∂_{ij} (∑_{k=1}^{d} σ_k^i(t, x, P_t) σ_k^j(t, x, P_t) p(t, x)) = 0
lim_{t→0} p(t, x) = δ(x − X_0)  (7.2)

It is nonlinear because b^i(t, x, P_t) and σ_j^i(t, x, P_t) depend on the unknown p.

Particle method
The stochastic simulation of the McKean SDE 7.1 is very natural. It
consists of replacing the law Pt , which appears explicitly in the drift
and diffusion coefficients, by the empirical distribution:
P_t^N = (1/N) ∑_{i=1}^{N} δ_{X_t^{i,N}}

where the X_t^{i,N} are the solution to the (R^n)^N-dimensional linear SDE:
dX_t^{i,N} = b(t, X_t^{i,N}, P_t^N) dt + σ(t, X_t^{i,N}, P_t^N) · dW_t^i, Law(X_0^{i,N}) = P_0

where {W_t^i}_{i=1,…,N} are N independent d-dimensional Brownian motions and P_t^N is a random measure on R^n. For instance, in the case of the McKean–Vlasov SDE where:


b^i(t, x, P_t) = ∫ b^i(t, x, y) p(t, y) dy = E[b^i(t, x, X_t)]
σ_j^i(t, x, P_t) = ∫ σ_j^i(t, x, y) p(t, y) dy = E[σ_j^i(t, x, X_t)]

{X_t^{i,N}}_{i=1,…,N} are n-dimensional Itô processes given by:

dX_t^{i,N} = (∫ b(t, X_t^{i,N}, y) dP_t^N(y)) dt + (∫ σ(t, X_t^{i,N}, y) dP_t^N(y)) · dW_t^i

which is equivalent to:

dX_t^{i,N} = (1/N) ∑_{j=1}^{N} b(t, X_t^{i,N}, X_t^{j,N}) dt + (1/N) ∑_{j=1}^{N} σ(t, X_t^{i,N}, X_t^{j,N}) · dW_t^i
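In one dimension, a single Euler step of this interacting particle system might look as follows in C (a minimal sketch with our own names; b and sigma are the pairwise interaction kernels and dw holds the Brownian increments over the step):

#include <stdlib.h>

typedef double (*pair_fn)(double t, double x, double y);

void euler_step(double t, double dt, double *x, const double *dw,
                int n, pair_fn b, pair_fn sigma)
{
    double *xnew = malloc(n * sizeof *xnew);
    for (int i = 0; i < n; ++i) {
        double drift = 0.0, vol = 0.0;
        for (int j = 0; j < n; ++j) {          /* average over all N particles */
            drift += b(t, x[i], x[j]);
            vol   += sigma(t, x[i], x[j]);
        }
        xnew[i] = x[i] + (drift / n) * dt + (vol / n) * dw[i];
    }
    for (int i = 0; i < n; ++i) x[i] = xnew[i];
    free(xnew);
}

The new positions are computed from the old ones before any particle is overwritten, so all interactions use the same time slice.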

One can then show the chaos propagation property (Sznitman, 1991). If at t = 0 the X_0^{i,N} are independent random variables, then as N → ∞, for any fixed t > 0, the X_t^{i,N} are asymptotically independent and their empirical measure P_t^N converges in distribution towards the true measure P_t. This means that, in the space of probabilities, the distribution of the random measure P_t^N converges towards a Dirac mass at the deterministic measure P_t. Practically, it means that for bounded continuous functions f:

(1/N) ∑_{i=1}^{N} f(X_t^{i,N}) → ∫_{R^n} f(x) p(t, x) dx in L¹ as N → ∞

where p(t, ·) is the fundamental solution of the nonlinear Fokker–


Planck PDE 7.2 (see Sznitman, 1991).
Using an analogy with the mean-field approximation in statistical physics, we speak of the "particle method": the N processes {X_t^i}_{i=1,…,N} can be seen as a system of N interacting (bosonic) particles. In the large N limit, the linear R^{n×N}-dimensional Fokker–Planck PDE approximates the nonlinear low-dimensional (n-dimensional) Fokker–Planck PDE 7.2. Then, the resulting drift and diffusion coefficients of X_t^{i,N} depend not only on the position of the particle X_t^{i,N} but also on the interaction with the other N − 1 particles.
We now show how standard local stochastic volatility and hybrid
models can be written in this form, and how an efficient simulation
can yield their exact calibration to market data.

Local stochastic volatility model


An LSV model is defined by the following SDE for the T-forward ft
in the forward measure PT:


df_t = f_t σ(t, f_t) a_t dW_t  (7.3)

where at is a (possibly multi-factor) stochastic process. It can be seen


as an extension to the Dupire local volatility model, or as an exten-
sion to the stochastic volatility model. In the stochastic volatility
model, one handles only a finite number of parameters (volatility of
volatility, spot/volatility correlations, etc). As a consequence, one is
not able to perfectly calibrate the implied volatility surface. To be
able to calibrate market smiles exactly, one decorates the volatility
of the forward with a local volatility function σ(t, f). By definition,
the effective local volatility is:
σ_loc(t, f)² = σ(t, f)² E^{P^T}[a_t² | f_t = f]

From Dupire, this model is exactly calibrated to market smiles if and only if this effective local volatility equals the square of the Dupire local volatility, σ_Dup(t, f)². SDE 7.3, once the requirement that market marginals have to be calibrated exactly has been taken into account, can be rewritten as a McKean SDE:

df_t = f_t (σ_Dup(t, f_t) / √(E^{P^T}[a_t² | f_t])) a_t dW_t  (7.4)
The local volatility function depends on the joint PDF p(t, f, a) of (f_t, a_t):
σ(t, f, p) = σ_Dup(t, f) √( ∫ p(t, f, a′) da′ / ∫ a′² p(t, f, a′) da′ )  (7.5)

In 7.5, the Lipschitz condition is not satisfied, so a uniqueness and


existence result for solutions to Equation 7.4 is not at all obvious. In
particular, given a set of stochastic volatility parameters, it is not
clear at all whether an LSV model exists for a given arbitrary arbi-
trage-free implied volatility surface: some smiles may not be
attainable by the model. However, a partial result exists: in Abergel and Tachet (2010) it is shown that the calibration problem for an LSV model is well posed but only: (i) until some maturity T*; (ii) if the
volatility of volatility is small enough; and (iii) in the case of suit-
ably regularised initial conditions. The result does not apply to
Equation 7.2 because of the Dirac mass initial condition. Our
numerical experiments will show that the calibration does not work


for large enough volatility of volatility. This may come from numer-
ical error, or from the non-existence of a solution. The problem of
deriving the set of stochastic volatility parameters for which the
LSV model does exist for a given market smile is very challenging
and open (see an illustration in the numerical experiments section).
Local correlation models can similarly be put in McKean form
(see an extended version of this chapter, Guyon and Henry-
Labordère, 2011).

Hybrid models
A hybrid LSV model is defined in a risk-neutral measure P by:

dS_t / S_t = r_t dt + σ(t, S_t) a_t dW_t

where the short-term rate rt and the stochastic volatility at are Itô
processes. For simplicity we assume no dividends. We explain how
to include (discrete) dividends in Guyon and Henry-Labordère
(2011).
This model is exactly calibrated to the market smile if and only if
(see Guyon and Henry-Labordère, 2011, for proof):

E P ⎡⎣( rT − rT0 ) 1ST >K ⎤⎦


T

2 T 2
σ (T, K ) E P ⎡⎣aT2 ST = K ⎤⎦ = σ Dup (T, K ) − P0T  (7.6)
1 2
K ∂K C (T, K )
2

with r_T^0 = E^{P^T}[r_T] = −∂_T ln P_{0T} and:
σ_Dup(T, K)² = (∂_T C(T, K) + r_T^0 K ∂_K C(T, K)) / (½ K² ∂_K² C(T, K))
Here P^T denotes the T-forward measure, C(T, K) the market fair value of a vanilla option with strike K and maturity T, and P_{tT} the time-t value of the bond of maturity T. Hence, the dynamics of the calibrated hybrid LSV model reads as the following nonlinear McKean diffusion for the forward f_t = S_t / P_{tT̄} in the forward measure P^T̄, where T̄ denotes the last maturity date for which we want to calibrate the market smile:
df_t / f_t = σ(t, P_{tT̄} f_t, P_{tT̄}) a_t dW_t^T̄ − σ_{P^T̄}(t) · dB_t^T̄


where:1
σ(t, K, P_{tT̄})² = (σ_Dup(t, K)² − P_{0T̄} E^{P^T̄}[P_{tT̄}^{−1}(r_t − r_t^0) 1_{S_t>K}] / (½ K ∂_K² C(t, K))) × E^{P^T̄}[P_{tT̄}^{−1} | S_t = K] / E^{P^T̄}[P_{tT̄}^{−1} a_t² | S_t = K]  (7.7)

σ_{P^T̄}(t) is the volatility of the bond P_{tT̄}, and B_t^T̄ is the (possibly multidimensional) P^T̄-Brownian motion that drives the interest rate curve.

Malliavin representation
We now give another expression of the contribution of stochastic
interest rates to local volatility:

E P ⎡⎣( rT − rT0 ) 1ST >K ⎤⎦


T

P0T
1 2
K ∂K C (T, K )
2

Numerical implementation of the particle algorithm using the alternative formula proves to produce a much more accurate and smooth estimation of the local volatility for far-from-the-money strikes. As a consequence, it is very useful for extrapolation purposes. To derive this new formula, we will make use of the Malliavin calculus.
From the martingale representation theorem:
r_T − r_T^0 = ∫_0^T σ_r^T(s) · dB_s^T

with σ_r^T(s) an adapted process. By the Clark–Ocone formula, σ_r^T(s) = E_s^{P^T}[D_s^{B^T} r_T], with D_s^{B^T} the Malliavin derivative with respect to the Brownian motion B^T, and E_s the conditional expectation given F_s, the natural filtration of all the Brownian motions used. The application of the Clark–Ocone formula to the process 1_{S_T>K}, combined with the Itô isometry, gives (see Henry-Labordère, 2008, for similar calculations and a brief introduction to Malliavin calculus):
P_{0T} E^{P^T}[(r_T − r_T^0) 1_{S_T>K}] = ∂_K² C(T, K) ∫_0^T E^{P^T}[σ_r^T(s) · D_s^{B^T} S_T | S_T = K] ds

so that the stochastic interest rate contribution to local volatility


reads (see Balland, 2005, for a similar expression):


E P ⎡⎣( rT − rT0 ) 1ST >K ⎤⎦ 2


T

T
E P ⎡⎣σ rT ( s) ⋅ DsB ST ST = K ⎤⎦ ds 
T T
P0T = ∫ (7.8)
1 2 K 0
K ∂ K C (T, K )
2

We call this trick a Malliavin "disintegration by parts", because it transforms an unconditional expectation involving the Heaviside function 1_{S_T>K} into a conditional expectation given S_T = K. The Malliavin integration by parts formula goes the other way round.
Note that the second derivative ∂_K² C(T, K) of the call option with respect to strike cancels out in the right-hand side of Equation 7.8. This is fortunate, as the computation of this term is sensitive to the strike interpolation/extrapolation method. Also, both E^{P^T}[(r_T − r_T^0) 1_{S_T>K}] and K ∂_K² C(T, K) are very small for strikes K that are far away from the money. Numerically, this 0/0 ratio can be problematic. There is no such problem in the right-hand side of Equation 7.8, because of the Malliavin disintegration by parts. This makes the Malliavin representation of the hybrid local volatility very useful in practice, in particular when one wants to design an accurate extrapolation of the contribution of stochastic interest rates to local volatility.

The case of one-factor short rate models


For the sake of simplicity, let us assume that the short rate r_t follows a one-factor Itô diffusion:
dr_t = μ_r(t, r_t) dt + σ_r(t, r_t) dB_t
where μ_r(t, r_t) and σ_r(t, r_t) are deterministic functions of the time t and the short rate r_t, and B_t is a one-dimensional P-Brownian motion with d⟨B, W⟩_t = ρ dt. Then σ_{P^T}(t), the volatility of the bond P_{tT}, is also a deterministic function σ_{P^T}(t, r_t) of the time t and the short rate r_t. Moreover, we assume that the stochastic volatility is not correlated with the stochastic rate r_t. Both assumptions can easily be relaxed but at the cost of additional straightforward calculations. By explicitly calculating D_s^{B^T} S_T, 7.8 can then be written as (see Guyon and Henry-Labordère, 2011, for detailed calculations):2

E P ⎡⎣( rT − rT0 ) 1ST >K ⎤⎦


T

T
P0T = 2E P ⎡⎣VT ( ρUTT + ΘTT ΞTT − Λ TT ) ST = K ⎤⎦
1 2
K ∂ K C (T, K )  (7.9)
2

with:


dV_t / V_t = S_t ∂_S σ(t, S_t) a_t (dW_t − a_t σ(t, S_t) dt), V_0 = 1  (7.10)
dU_t^T = σ_r^T(t) (σ(t, S_t) / V_t) a_t dt, U_0^T = 0  (7.11)
dR_t^T / R_t^T = (∂_r μ_r(t, r_t) + σ_r(t, r_t) ∂_r σ_{P^T}(t, r_t)) dt + ∂_r σ_r(t, r_t) dB_t, R_0^T = 1  (7.12)
dΘ_t^T = (R_t^T / V_t)(1 + ρ ∂_r σ_{P^T}(t, r_t) σ(t, S_t) a_t) dt, Θ_0^T = 0  (7.13)
dΞ_t^T = (σ_r^T(t) σ_r(t, r_t) / R_t^T) dt, Ξ_0^T = 0  (7.14)
dΛ_t^T = Θ_t^T dΞ_t^T, Λ_0^T = 0  (7.15)
and
σ_r^T(t) = E_t^{P^T}[R_T^T] (R_t^T)^{−1} σ_r(t, r_t)  (7.16)

The particular case of the Ho-Lee and Hull-White models


Equation 7.9 is not completely satisfactory in two ways. First, the extra processes U_t^T, R_t^T, Θ_t^T, Ξ_t^T and Λ_t^T depend on T, which means that in Equation 7.9, one has to simulate five processes for each value of T! Second, σ_r^T(t) still has to be evaluated in closed form. Considering constant short rate volatility and affine short rate drift:
dr_t = (λ(t) − κ r_t) dt + σ_r dB_t  (7.17)

solves those two issues at a time. This extra hypothesis is restrictive, but actually encompasses the cases of commonly used short rate models, such as the Ho–Lee and Hull–White models, so the results below are very useful in practice. In the Ho–Lee model, κ = 0; in the Hull–White model, κ > 0.
Under Equation 7.17, the volatility of the bond is deterministic, so that ∂_r σ_{P^T}(t, r_t) = 0. Then the process R_t = e^{−κt} is independent of T; it coincides with the tangent process of r_t, and
σ_r^T(t) = σ_r e^{−κ(T−t)}


and Equation 7.9 reads:
P_{0T} E^{P^T}[(r_T − r_T^0) 1_{S_T>K}] / (½ K ∂_K² C(T, K)) = 2 σ_r e^{−κT} E^{P^T}[V_T (ρ U_T + Θ_T Ξ_T − Λ_T) | S_T = K]  (7.18)

with
dV_t / V_t = S_t ∂_S σ(t, S_t) a_t (dW_t − a_t σ(t, S_t) dt), V_0 = 1
dU_t = e^{κt} (σ(t, S_t) / V_t) a_t dt, U_0 = 0
dΘ_t = (e^{−κt} / V_t) dt, Θ_0 = 0
Ξ_t = σ_r (e^{2κt} − 1) / (2κ) if κ ≠ 0, Ξ_t = σ_r t otherwise
dΛ_t = σ_r e^{2κt} Θ_t dt, Λ_0 = 0

The computation of Equation 7.18, for all T, requires the simulation of only three processes, f_t, r_t, V_t, and three integrals, U_t, Θ_t, Λ_t.
In this case we eventually obtain the following representation of
the local volatility (7.7):
σ(t, K, P_{tT̄})² = (E^{P^T̄}[P_{tT̄}^{−1} | S_t = K] / E^{P^T̄}[P_{tT̄}^{−1} a_t² | S_t = K]) × (σ_Dup(t, K)² − 2 σ_r e^{−κt} E^{P^T̄}[P_{tT̄}^{−1} V_t (ρ U_t + Θ_t Ξ_t − Λ_t) | S_t = K] / E^{P^T̄}[P_{tT̄}^{−1} | S_t = K])  (7.19)

where the dynamics for V_t is:3
dV_t / V_t = S_t ∂_S σ(t, S_t) a_t (dW_t^T̄ + (ρ σ_{P^T̄}(t, r_t) − a_t σ(t, S_t)) dt), V_0 = 1

A similar derivation in the case of hybrid LMMs can be found in


Guyon and Henry-Labordère (2011).
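In an implementation, the three integrals can be accumulated path by path alongside f_t, r_t and V_t. A minimal C sketch of one time step (our own helper names; left-point discretisation of the system following 7.18):

#include <math.h>

typedef struct { double U, Theta, Lambda; } PathInts;

/* One Euler step of dU = e^{kt} sigma(t,S) a / V dt, dTheta = e^{-kt}/V dt,
   dLambda = sigma_r e^{2kt} Theta dt. Lambda uses the pre-step Theta,
   hence it is updated first. */
void step_integrals(PathInts *p, double t, double dt, double kappa,
                    double sigma_r, double sig_S, double a, double V)
{
    p->Lambda += sigma_r * exp(2.0 * kappa * t) * p->Theta * dt;
    p->U      += exp(kappa * t) * sig_S * a / V * dt;
    p->Theta  += exp(-kappa * t) / V * dt;
}

Ξ_t itself needs no accumulation since it is known in closed form.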

Particle simulation method


Local stochastic volatility model
In the LSV model, to simulate the process we need to calculate the
approximated conditional expectation:


E^{P_t^N}[a_t² | f_t = f] = ∫ a′² p(t, f, a′) da′ / ∫ p(t, f, a′) da′ ≈ ∑_{i=1}^{N} (a_t^{i,N})² δ(f_t^{i,N} − f) / ∑_{i=1}^{N} δ(f_t^{i,N} − f)

However, it is not properly defined because of the Dirac delta functions. We use a regularising kernel δ_{t,N}(·) that converges to the Dirac function as N → ∞. It is natural to take δ_{t,N}(x) = (1/h_{t,N}) K(x/h_{t,N}), where K is a fixed, symmetric kernel with a bandwidth h_{t,N} that tends to zero as N grows to infinity. The exponential kernel K(x) = (1/√(2π)) exp(−x²/2) and the quartic kernel K(x) = (15/16)(1 − x²)² 1_{|x|≤1} are typical examples. We use the latter because it saves computational time. We take:
h_{t,N} = κ f_0 σ_{VS,t} √(max(t, t_min)) N^{−1/5}
with σ_{VS,t} the variance swap volatility at maturity t, κ ≅ 1.5, t_min = 1/4.
Then we define:
σ_N(t, f) = σ_Dup(t, f) √( ∑_{i=1}^{N} δ_{t,N}(f_t^{i,N} − f) / ∑_{i=1}^{N} (a_t^{i,N})² δ_{t,N}(f_t^{i,N} − f) )  (7.20)
and simulate:
df_t^{i,N} = f_t^{i,N} σ_N(t, f_t^{i,N}) a_t^{i,N} dW_t^i  (7.21)

A similar algorithm was used in Jourdain & Sbai (2010) in the case of the joint calibration of smiles of a basket and its components.
At first sight, 7.20 and 7.21 require O(N²) operations at each discretisation date: each calculation of σ_N(t, f_t^{i,N}) requires O(N) operations, and there are N such local volatilities to calculate. This naive method is too slow. First, computing σ_N(t, f_t^{i,N}) for all i is useless. One can save considerable time by calculating σ_N(t, f) for only a grid G_{f,t} of values of f, of a size much smaller than N, say N_{f,t}, and then inter- and extrapolate. We use cubic splines, with a flat extrapolation, and N_{f,t} = max(N_f √t, N_f′); typical values are N_f = 30 and N_f′ = 15. The range of the grid can be inferred from the prices of digital options: E[1_{f_t > max G_{f,t}}] = E[1_{f_t < min G_{f,t}}] = α. In practice, we take α = 10^{−3}.
Second, in the sums in 7.20, a large number of terms make a negligible contribution: we can disregard f_t^{i,N} when it is far from f,


say, when δ_{t,N}(f_t^{i,N} − f) is smaller than some threshold η. In practice, this requires sorting particles according to the spot value. The cost of sorting, O(N ln N), is more than compensated for by the acceleration in the N_{f,t} evaluations of 7.20. By following this through we develop our calibration algorithm.

Particle algorithm
Let {tk} denote a time discretisation of [0, T]. The particle algorithm
can now be described by the following:

❑❑ 1. Initialise k := 1 and set σ_N(t, f) = σ_Dup(0, f)/a_0 for all t ∈ [t_0 = 0, t_1].
❑❑ 2. Simulate the N processes {f_t^{i,N}, a_t^{i,N}}_{i=1,…,N} from t_{k−1} to t_k according to 7.21.
❑❑ 3. Sort the particles according to spot value. For f ∈ G_{f,t_k}, find the smallest index i(f) and the largest index ī(f) for which δ_{t_k,N}(f_{t_k}^{i,N} − f) > η, and calculate the local volatility (a C sketch of this kernel estimate is given below):
σ_N(t_k, f) = σ_Dup(t_k, f) √( ∑_{i=i(f)}^{ī(f)} δ_{t_k,N}(f_{t_k}^{i,N} − f) / ∑_{i=i(f)}^{ī(f)} (a_{t_k}^{i,N})² δ_{t_k,N}(f_{t_k}^{i,N} − f) )
Interpolate the local volatility using cubic splines, and extrapolate the surface as flat outside the interval [min G_{f,t_k}, max G_{f,t_k}]. Set σ_N(t, f) ≡ σ_N(t_k, f) for all t ∈ [t_k, t_{k+1}].
❑❑ 4. Set k := k + 1. Iterate steps 2 and 3 up to the maturity date T.

Step 2 is easily parallelisable. Calibration and pricing can be achieved in the course of the same Monte Carlo simulation. We only need to ensure that all spot observation dates needed in the calculation of the payout are included in the time discretisation {t_k}. The price of an option is estimated as (1/N) ∑_{i=1}^{N} H^{i,N}, where H^{i,N} is the discounted payout evaluated on the path of particle i.
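For concreteness, here is a minimal C sketch of the kernel estimate in step 3 (our own helper names, not the authors' code), assuming the particles have already been sorted by forward value:

#include <math.h>

/* Quartic kernel K(x) = 15/16 (1 - x^2)^2 on |x| <= 1. */
static double quartic(double x)
{
    if (fabs(x) >= 1.0) return 0.0;
    double v = 1.0 - x * x;
    return 0.9375 * v * v;
}

/* sigma_N(t, fg) = sigma_dup * sqrt(sum K / sum a^2 K), as in 7.20, with
   f[] the sorted particle forwards, a2[] the squared volatility factors
   and h the bandwidth h_{t,N}. In production one would bisect straight to
   the kernel support rather than scan from the start. */
double leverage_at(double fg, double sigma_dup, double h,
                   const double *f, const double *a2, int n)
{
    double num = 0.0, den = 0.0;
    for (int i = 0; i < n; ++i) {
        if (f[i] <= fg - h) continue;   /* outside support on the left */
        if (f[i] >= fg + h) break;      /* sorted: nothing further right */
        double k = quartic((f[i] - fg) / h) / h;
        num += k;
        den += a2[i] * k;
    }
    return den > 0.0 ? sigma_dup * sqrt(num / den) : sigma_dup;
}

Because the quartic kernel has compact support, only the particles inside [f − h, f + h] contribute, which is what the sorting buys.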

Hybrid local stochastic volatility model


In the case of the hybrid LSV model, a particle is described by three processes (f_t, a_t, r_t). If we use representation 7.7 of the hybrid local volatility, we define:


Figure 7.1 Ho–Lee/Dupire hybrid model calibration (Dax implied volatilities T = 4Y, T = 10Y: May 30, 2011)
[Charts not reproduced: fit of the market smile for T = 4Y and T = 10Y, implied volatility versus strike (0.2 to 2.4), comparing 2^10 particles, 2^12 particles, the market smile and the model with no calibration; for T = 10Y, computational times were four seconds (2^10 particles) and 12 seconds (2^12 particles)]
Note: Ho–Lee parameters σ_r = 6.3bp a day, ρ = 40%. Δ = 1/100, N = 2^10 on a full 10-year implied volatility surface with an Intel Core Duo, 3GHz, 3GB of RAM: four seconds

σ_N(t, S)² = ( ∑_{i=1}^{N} (P_{tT̄}^{i,N})^{−1} δ_{t,N}(S_t^{i,N} − S) / ∑_{i=1}^{N} (P_{tT̄}^{i,N})^{−1} (a_t^{i,N})² δ_{t,N}(S_t^{i,N} − S) ) × (σ_Dup(t, S)² − P_{0T̄} ((1/N) ∑_{i=1}^{N} (P_{tT̄}^{i,N})^{−1} (r_t^{i,N} − r_t^0) 1_{S_t^{i,N}>S}) / (½ S ∂_K² C(t, S)))  (7.22)
and simulate:
df_t^{i,N} = f_t^{i,N} σ_N(t, f_t^{i,N} P_{tT̄}^{i,N}) a_t^{i,N} dW_t^i − f_t^{i,N} σ_{P^T̄,i,N}(t) · dB_t^i
where W^i and B^i are P^T̄-Brownian motions.
If we use representation 7.19 of the hybrid local volatility, we
need to add the Malliavin processes to the particle, which means
more processes to simulate, but usually results in a more accurate
estimation of the wings of the local volatility.


Figure 7.2 Bergomi LSV model calibration (Dax implied volatilities T = 4Y, T = 10Y: May 30, 2011)
[Charts not reproduced: fit of the market smile for T = 4Y and T = 10Y, implied volatility versus strike (0.2 to 2.4), comparing 2^10, 2^12 and 2^13 particles, the market smile, the model with no calibration and the approximation; for T = 10Y, computational times were four seconds (2^10), 11 seconds (2^12), 21 seconds (2^13) and 12 seconds (approximation)]
Note: Bergomi parameters σ = 200%, θ = 22.65%, k_X = 4, k_Y = 12.5%, ρ_XY = 30%, ρ_SX = −50%, ρ_SY = −50%

Numerical tests
Ho–Lee/Dupire hybrid model
We consider a hybrid local volatility model (a_t ≡ 1) where the short rate follows a Ho–Lee model, for which the volatility σ_r(t, r_t) = σ_r is a constant. A bond of maturity T̄ is given by:
P_{tT̄} = (P_{0T̄}^{mkt} / P_{0t}^{mkt}) exp(σ_r² (T̄ − t)² t / 2 − σ_r (T̄ − t) B_t^T̄)
with a volatility σ_{P^T̄}(t) = −σ_r(T̄ − t). From 7.19, the local volatility is:
σ(t, K, P_{tT̄})² = σ_Dup(t, K)² − 2 ρ σ_r E^{P^T̄}[P_{tT̄}^{−1} V_t U_t | S_t = K] / E^{P^T̄}[P_{tT̄}^{−1} | S_t = K] − 2 σ_r² E^{P^T̄}[P_{tT̄}^{−1} V_t (t Θ_t − Λ_t) | S_t = K] / E^{P^T̄}[P_{tT̄}^{−1} | S_t = K]

with:


Figure 7.3 High σ Bergomi LSV solution may not exist (Dax implied volatilities T = 4Y: May 30, 2011)
[Chart not reproduced: fit of the market smile for T = 4Y with volatility of volatility 350%, implied volatility versus strike (0.2 to 2.4), comparing 2^13 particles, 2^15 particles, the market smile and the approximation]
Note: Bergomi parameters σ = 350%, θ = 22.65%, k_X = 4, k_Y = 12.5%, ρ_XY = 30%, ρ_SX = −50%, ρ_SY = −50%

dV_t / V_t = S_t ∂_S σ(t, S_t) (dW_t^T̄ + (ρ σ_{P^T̄}(t) − σ(t, S_t)) dt), V_0 = 1
U_t = ∫_0^t (σ(s, S_s) / V_s) ds, Θ_t = ∫_0^t ds / V_s, Λ_t = ∫_0^t Θ_s ds

As a sanity check, when σ_Dup depends only on the time t, we obtain the exact expression for σ(·) as expected:
σ(t)² = σ_Dup(t)² − 2 ρ σ_r ∫_0^t σ(s) ds − σ_r² t²

Note that in Benhamou, Gruz and Rivoira (2008), the local volatility is approximated by:
σ(t, K)² ≈ σ_Dup(t, K)² − 2 ρ σ_r ∫_0^t σ(s, K) ds

This equation in σ(t, K) is then solved using a fixed-point method. Practitioners typically use such approximations for σ(t, K), whose quality deteriorates significantly far out of the money or for long maturities. We emphasise that even in the simple case where σ_Dup depends only on the time t, the above approximation is not exact because of the missing term σ_r²t². Our algorithm achieves exact calibration in this case with a single particle.
We have checked the accuracy of our calibration procedure on the Dax market smile (May 30, 2011). We have chosen σ_r = 6.3 basis points a day (1% a year) and set the correlation between the stock and the rate to ρ = 40%. The time discretisation Δ = t_{k+1} − t_k has been


Figure 7.4 Ho–Lee/Bergomi (Dax implied volatilities T = 4Y, T = 10Y: May 30, 2011)
[Charts not reproduced: fit of the market smile for T = 4Y and T = 10Y, implied volatility versus strike (0.2 to 2.4), comparing 2^10, 2^12 and 2^13 particles, the market smile and the model with no calibration; for T = 10Y, computational times were four seconds (2^10), 20 seconds (2^12) and 40 seconds (2^13)]
Note: Bergomi parameters σ = 200%, θ = 22.65%, k_X = 4, k_Y = 12.5%, ρ_XY = 30%, ρ_SX = −50%, ρ_SY = −50%. Ho–Lee parameters: σ_r = 6.3bp per day, ρ = 40%

set to Δ = 1/100 and we have used N = 2^10 or N = 2^12 particles. After calibrating the model using the particle algorithm, we have calculated vanilla smiles using a (quasi) Monte Carlo pricer with N = 2^15 paths and a time step of 1/250. Figure 7.1 shows the implied volatility for the market smile (Dax, May 30, 2011) and the hybrid local volatility model for maturities of four years and 10 years. When we use the Malliavin representation, the computational time is around four seconds for maturities up to 10 years with N = 2^10 particles (12 seconds with N = 2^12). Our algorithm definitively outperforms a (two-dimensional) PDE implementation and has already converged

Table 7.1 Dax implied volatilities T = 10Y: May 30, 2011

Strike 0.5 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.5 1.8

With Malliavin, time = four seconds 14 10 10 9 8 7 6 5 3 1
Without Malliavin, time = eight seconds 16 8 7 4 1 1 1 3 3 5

Note: Errors in basis points using the particle method with N = 2^10 particles


with N = 2^10 particles. Note that the calibration is also exact using Equation 7.22, that is, with no use of the Malliavin representation, but with a larger computational time of eight seconds with N = 2^10 particles (26 seconds with N = 2^12). As shown in Table 7.1, the absolute error in implied volatility is a few basis points. For completeness, we have plotted the smile obtained from the hybrid local volatility model without any calibration, that is, σ(t, K) = σ_Dup(t, K), to materialise the impact of the stochastic rate.

Bergomi’s local stochastic volatility model


For the next example, we consider Bergomi’s LSV model (Henry-
Labordère, 2009):

dft = ftσ (t, ft ) ξ tt dWt


ξ tT = ξ 0T f T (t, xtT )
f T (t, x ) = exp ( 2σ x − 2σ 2 h (t,T ))

(
xtT = αθ (1− θ ) e −kX (T−t)X t + θ e −kY (T−t)Yt )
−1/2
( 2
αθ = (1− θ ) + θ 2 + 2 ρXYθ (1− θ ) )
X
dX t = −kX X t dt + dW
t

dYt = −kYYt dt + dWtY

where:
2
h (t,T ) = (1− θ ) e −2 kX (T−t)E ⎡⎣X t2 ⎤⎦ + θ 2 e −2 kY (T−t)E ⎡⎣Yt2 ⎤⎦
+2θ (1− θ ) e −( kX +kY )(T−t)E [ X tYt ]
1− e −2 kX t ⎡ 2 ⎤ 1− e −2 kY t
E ⎡⎣X t2 ⎤⎦ = , E ⎣Yt ⎦ = ,
2k X 2kY
1− e −( kX +kY )t
E [ X tYt ] = ρXY
k X + kY

This model, commonly used by practitioners, is a variance swap curve model that admits a two-dimensional Markovian representation. We have performed similar tests as in the previous section (see Figure 7.2). The Bergomi model parameters are σ = 200% (the volatility of an ultra-short volatility), θ = 22.65%, k_X = 4, k_Y = 12.5%, ρ_XY = 30%, ρ_SX = −50% and ρ_SY = −50%. The time discretisation has been fixed to Δ = 1/100 and we have used N = 2^10, N = 2^12 or N = 2^13 particles. Figure 7.2 shows the implied volatility for the market smile


(Dax, May 30, 2011) and the LSV model for maturities of four years and 10 years. The computational time is four seconds for maturities up to 10 years with N = 2^10 particles (11 seconds with N = 2^12). This should be compared with the approximate calibration (Henry-Labordère, 2009), which has a computational time of around 12 seconds. To illustrate that we have used stressed parameters to check the efficiency of our algorithm, we have plotted the smile produced by the naked stochastic volatility model, which significantly differs from the market smile.

Existence under question


As highlighted previously, the existence of LSV models for a given market smile is not at all obvious, although this seems to be a common belief in the quant community. To illustrate this mathematical question, we ran our algorithm with a volatility of volatility of σ = 350% (see Figure 7.3). This large value of volatility of volatility is sometimes needed to generate typical levels of forward skew for indexes.
Our algorithm seems to converge with N = 2^13 particles around the money, but the market smile is not calibrated. For the maturity T = 4 years, we have an error of around 61bp at-the-money, which increases to 245bp for K = 2. For comparison, we graph the result of an approximate calibration (Henry-Labordère, 2009), which definitively breaks down for high levels of volatility of volatility. The failure to calibrate could indicate the non-existence of a solution in this regime, or it could simply be due to numerical error.

Local Bergomi and Ho–Lee hybrid model


We now go one step further in complexity and consider a Bergomi LSV model with Ho–Lee stochastic rates. We should emphasise that since this model is driven by four Brownian motions, a calibration relying on a PDE solver is out of the question. The Bergomi model parameters are those used in the previous section. Additionally, we have chosen σ_r = 6.3bp a day, set the correlation between the stock and the rate to ρ = 40% (see Figure 7.4), and assumed that interest rates and stochastic volatility are uncorrelated. With N = 2^12 particles, the fit is very accurate, while computational time is only 20 seconds, for maturities up to 10 years.


Conclusion
We have explained how to calibrate multi-factor hybrid local
stochastic volatility models exactly to market smiles using the
particle algorithm to simulate solutions to McKean SDEs. We have
also provided a Malliavin stochastic representation of the stochastic
interest rate contribution to local volatility. The technique we
proposed proves useful when models incorporating multiple vola-
tility and interest rate risks are needed, typically for long-dated,
forward skew-sensitive payouts. Our algorithm represents, to the
best of our knowledge, the first exact algorithm for the calibration
of multi-factor hybrid local stochastic volatility models. Acceleration
techniques make it efficient in practice. As highlighted in our
numerical experiments, the computation time is excellent and, even
for low-dimensional (hybrid) LSV models, our algorithm outper-
forms PDE implementations.
The analysis of nonlinear (kinetic) PDEs arising in statistical
physics such as the McKean–Vlasov PDE and the Boltzmann equa-
tion has become more popular and drawn attention in part thanks
to the work of Fields medallist Cédric Villani. We hope that this
work will initiate new research and attract the attention of practi-
tioners to the world of nonlinear SDEs.

This research was done at the time when both authors were working
at Société Générale. The authors wish to thank their colleagues at
Société Générale for useful discussions. This chapter is dedicated to
the memory of Paul Malliavin, who pointed out to us the efficiency of
the Clark–Ocone formula as we were working on this project.

  1 For any claim X_T, we have the identity: P_{0T} E^{P^T}[X_T] = P_{0T̄} E^{P^T̄}[P_{TT̄}^{−1} X_T]
  2 Equations (11) and (14) in Guyon and Henry-Labordère, 2011 for U and Ξ are actually erroneous in general – our mistake – and are replaced respectively by Equations 7.11 and 7.14. However, they are correct for the Ho–Lee model – the model we used in our numerical experiments in Guyon and Henry-Labordère, 2011 – because in this case σ_r^T(t) = σ_r(t, r_t). The correct derivation can be found in Guyon and Henry-Labordère, 2013.
  3 U_t, Θ_t, Ξ_t and Λ_t have finite variation and are not affected by the change of measure from P to P^T̄.


REFERENCES

Abergel F. and R. Tachet, 2010, “A Nonlinear Partial Integro-differential Equation from


Mathematical Finance,” Discrete and Continuous Dynamical Systems, Series A, 27(3), pp
907–17.

Balland P., 2005, “Stoch-vol Model with Interest Rate Volatility,” ICBI conference.

Benhamou E., A. Gruz and A. Rivoira, 2008, “Stochastic Interest Rates for Local Volatility
Hybrids Models,” Wilmott Magazine.

Bergomi L., 2008, “Smile Dynamics III,” Risk, October, pp 90–96 (available at www.risk.
net/1500216).

Guyon J. and P. Henry-Labordère, 2011, “The Smile Calibration Problem Solved,”


extended version (available at http:// ssrn.com/abstract=1885032).

Guyon J. and P. Henry-Labordère, 2013, “Nonlinear Option Pricing,” Chapman & Hall,
CRC Financial Mathematics Series (forthcoming).

Henry-Labordère P., 2009, Analysis, Geometry, and Modeling in Finance: Advanced Methods
in Option Pricing (Chapman & Hall, CRC Financial Mathematics Series).

Henry-Labordère P., 2009, “Calibration of Local Stochastic Volatility Models to Market


Smiles,” Risk, September, pp 112–17 (available at www.risk.net/1532624).

Jourdain B. and M. Sbai, 2010, “Coupling Index and Stocks,” Quantitative Finance,
October.

Lipton A., 2002, “The Vol Smile Problem,” Risk, February, pp 61–65 (available at www.
risk.net/1530435).

Piterbarg V., 2006, “Smiling Hybrids,” Risk, May, pp 66–71 (available at www.risk.
net/1500244).

Sznitman A., 1991, “Topics in Propagation of Chaos,” Ecole d’été de probabilités de


Saint-Flour XIX – 1989, volume 1,464 of Lecture Notes in Mathematics (Berlin, Germany:
Springer).

8
Cooking with Collateral
Vladimir V. Piterbarg
Barclays

An economy without a risk-free rate has been considered in the


past (see Black, 1972), but traditional derivatives pricing theory (see,
for example, Duffie, 2001) assumed the existence of such a rate as a
matter of course. Until the crisis, this assumption worked well, but
now even government bonds cannot be considered credit risk-free.
Hence, using a risk-free money-market account or a zero-coupon
bond as a foundation for asset pricing theory needs revisiting.
While some of the standard constructions in asset pricing theory
could be reinterpreted in a way consistent with the developments
of this chapter, there is significant value in going through the steps
of derivations to show how they should be adapted to the prevailing
market practice. This is the programme we carry out here.
What comes closest to a credit risk-free asset in a modern
economy, in our view, is an asset fully collateralised on a contin-
uous basis. Of course, possible jumps in asset values and
practicalities of collateral monitoring and posting do not allow for
full elimination of credit risk, but we will neglect this here.
A collateralised asset is fundamentally different from the money-
market account that serves the role of risk-free asset traditionally.
Whereas with a money-market account one can deposit money
now and withdraw it, credit risk-free, in 10 years, a collateralised
asset produces a continuous stream of payments from the changes
in mark-to-market value. In this chapter, we show how to develop a
model of the economy from these non-traditional ingredients. We
show that a risk-neutral measure can still be defined and hence
much of the pricing technology developed in the traditional setting


can, fortunately, still be used. A similar argument is independently


developed in Macey (2011).
Once the basic building blocks are developed, we apply our
approach in a cross-currency setting. This allows us to rigorously
develop a model of assets collateralised in different currencies. A
model with these features has been presented in Fujii and Takahashi
(2011). However, the authors start from assuming the existence of a
risk-free rate and measure. The lack of financial meaning of this
risk-free rate is the main weakness of the argument of Fujii and
Takahashi (2011), although, as shall be clear later, much of what
they do can be justified from the developments of this chapter.

Collateralised processes
A collateralised derivative has quite a different set of cashflows
from an uncollateralised “traditional” one. At the inception of the
collateralised trade there is no exchange of cashflows – the price
paid for the derivative is immediately returned as collateral. During
the life of a collateralised trade, there is a continuous stream of
payments based on the changes in the trade’s mark-to-market. A
collateralised trade can be terminated at any moment at zero addi-
tional cost. So the notion of a price of a collateralised asset is actually
somewhat misleading – as the trade can be terminated at zero addi-
tional cost, the value of this transaction is always zero. What we
would call a price is nothing but a process that defines the level of
the collateral holding. Or, in the language of the classic asset pricing
theory, a collateralised transaction is an asset with a zero price
process and given cumulative-dividend process (see Duffie, 2001).
A moment’s thought shows that this is very much the same as for a
futures contract. In fact, a futures contract is just a particular type of
collateralised forward contract, with the collateral rate set to zero.
Keeping this in mind helps set the right picture.
Still, given the standard terminology, we will use the word "price" for a collateralised trade, but the meaning should always be clear – this is the level of holding of a collateral at any given time.
Let us start by introducing some notation. Let V(t) be the
price of a collateralised asset between parties A and B. If V(t) is
positive from the point of view of A, party B will post V(t) to A.
Party A will then pay party B a contractually specified collateral


rate c(t) on V(t). Throughout we expressly do not assume that c(t) is


deterministic.
Suppose two parties agree to enter into a collateralised transac-
tion at time t, and in particular A buys some collateralised asset
from B. Let us consider the cashflows.

❑❑ Purchase of the asset. The amount of V(t) is paid by A to B.


❑❑ Collateral at t. Since A’s mark-to-market is V(t), the amount V(t) of
collateral is posted by B to A.
❑❑ Return of collateral. At time t + dt, A returns collateral V(t) to B.
❑❑ Interest. At time t + dt, A also pays V(t)c(t) dt interest to B.
❑❑ New collateral. The new mark-to-market is V(t + dt). Party B pays
V(t + dt) in collateral to A.

Note that there is no actual cash exchange at time t. At time t + dt,


the net cashflow to A is given by:

V(t + dt) − V(t)(1 + c(t) dt) = dV(t) − c(t) V(t) dt

As already noted, at time t + dt the sum of the mark-to-market and
the collateral for each party is zero, meaning they can terminate the
contract at no cost, keeping the collateral.
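
To make this accounting concrete, here is a minimal one-period numerical check – our own illustration, with made-up numbers – that the three collateral flows net to dV(t) – c(t)V(t)dt:

```python
# One-period check that the collateral cashflows net to dV - c*V*dt.
dt = 1.0 / 365.0            # one day
c = 0.02                    # collateral rate (made up)
V_t, V_next = 100.0, 100.7  # marks at t and t + dt (made up)

returned_collateral = -V_t      # A returns B's collateral
interest = -c * V_t * dt        # A pays interest on the collateral held
new_collateral = V_next         # B posts the new mark-to-market

net_to_A = returned_collateral + interest + new_collateral
assert abs(net_to_A - ((V_next - V_t) - c * V_t * dt)) < 1e-12
print(net_to_A)
```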

Two collateralised assets


Now assume there are two assets both collateralised at rate c(t).
Assume that in the real-world measure the asset prices follow:

$$dV_i(t) = \mu_i(t)\,V_i(t)\,dt + \sigma_i(t)\,V_i(t)\,dW(t), \quad i = 1,2 \qquad (8.1)$$

with both assets driven by the same Brownian motion. This is the
case when, for example, we have a stock1 and an option on that
stock. At time t, we can enter into a portfolio of two collateralised
transactions to hedge the effect of the randomness of dW(t) on the
cash exchanged at time t + dt. To do that, we go long a notional of
σ2(t)V2(t) in asset 1 and short a notional of σ1(t)V1(t) in asset 2. The
cash exchange at time t + dt is then equal to:
$$\sigma_2(t)\,V_2(t)\,\big(dV_1(t) - c(t)\,V_1(t)\,dt\big) - \sigma_1(t)\,V_1(t)\,\big(dV_2(t) - c(t)\,V_2(t)\,dt\big)$$

which, after some manipulation, gives us:

$$\sigma_2(t)\,V_1(t)\,V_2(t)\,\big(\mu_1(t) - c(t)\big)\,dt - \sigma_1(t)\,V_1(t)\,V_2(t)\,\big(\mu_2(t) - c(t)\big)\,dt$$

This amount is known at time t and, moreover, the contract can be terminated at t + dt, after the cashflow is paid, at zero cost. Hence, for both parties to agree to transact on this portfolio – in other words, for the economy to have no arbitrage – this cashflow must actually be zero, which gives us:

$$\sigma_2(t)\,\big(\mu_1(t) - c(t)\big) = \sigma_1(t)\,\big(\mu_2(t) - c(t)\big)$$

and, in particular:

$$\frac{\mu_1(t) - c(t)}{\sigma_1(t)} = \frac{\mu_2(t) - c(t)}{\sigma_2(t)}$$

Let us now define:

$$d\tilde{W}(t) \triangleq dW(t) + \frac{\mu_1(t) - c(t)}{\sigma_1(t)}\,dt$$

By the previous result, we also have:

$$d\tilde{W}(t) = dW(t) + \frac{\mu_2(t) - c(t)}{\sigma_2(t)}\,dt$$
Hence, we can rewrite 8.1 using the newly defined dW̃ as:

$$dV_i(t) = c(t)\,V_i(t)\,dt + \sigma_i(t)\,V_i(t)\,d\tilde{W}(t), \quad i = 1,2 \qquad (8.2)$$

Now, looking at 8.2 we see that there exists a measure, equivalent to
the real-world one, in which both assets grow at rate c(t). This is the
analogue to the traditional risk-neutral measure. In this measure Q,
the price process for each asset is given by:

$$V_i(t) = E_t^Q\left(e^{-\int_t^T c(s)\,ds}\,V_i(T)\right), \quad i = 1,2 \qquad (8.3)$$

Different collateral rates


Note that the two assets can be collateralised at different rates, c1
and c2, and the same result would apply. In particular, we would
still have the condition:

$$\sigma_2(t)\,\big(\mu_1(t) - c_1(t)\big) = \sigma_1(t)\,\big(\mu_2(t) - c_2(t)\big)$$

from the cashflow analysis. Hence, the change of measure is still
possible, and 8.3 still holds with c(t) replaced by the appropriate c1
or c2:
$$V_i(t) = E_t^Q\left(e^{-\int_t^T c_i(s)\,ds}\,V_i(T)\right), \quad i = 1,2 \qquad (8.4)$$
In the stock option example of the previous section, the stock will
grow at its repo rate and the option will grow at its collateral rate in
the risk-neutral measure, consistent with the analysis of Piterbarg
(2010).
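
As an illustration of 8.4 – ours, not the chapter’s – the following sketch prices a collateralised European call by Monte Carlo under the simplifying assumption that both rates are deterministic constants: the stock drifts at its repo rate while the payoff is discounted at the option’s collateral rate.

```python
import numpy as np

def collateralised_call_mc(S0, K, T, sigma, c_repo, c_coll,
                           n_paths=200_000, seed=0):
    """Monte Carlo price of a call collateralised at rate c_coll, on a stock
    repo-ing at c_repo; both rates deterministic, per equation 8.4."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    S_T = S0 * np.exp((c_repo - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    payoff = np.maximum(S_T - K, 0.0)
    return np.exp(-c_coll * T) * payoff.mean()

# Illustrative: the stock repos at 1%, the option is collateralised at 3%.
print(collateralised_call_mc(S0=100.0, K=100.0, T=1.0, sigma=0.2,
                             c_repo=0.01, c_coll=0.03))
```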

Many collateralised assets


Let us now consider a general economy where we have more assets
than sources of noise. In particular, assume that N + 1 collateralised
(with the same collateral rate c) assets are traded, and their real-
world dynamics are given by:
$$dV = \mu V\,dt + \Sigma\,dW$$

where dW is an N-dimensional Brownian motion. Here μ and dV = (dV1, …, dVN+1)T are column vectors of dimension N + 1, μV is understood as a vector with elements μiVi, i = 1, …, N + 1, and Σ is an (N + 1) × N matrix of full rank N. We can find a column vector of weights w of dimension N + 1 such that:

$$w^T \Sigma = 0 \qquad (8.5)$$

Then the cash in the portfolio wTV has no randomness and hence,
by the no-arbitrage arguments used previously, we must have that:
$$w^T (\mu V - cV) = 0$$

Therefore the vector μV − cV belongs to the N-dimensional subspace of vectors orthogonal to w. This subspace also contains all columns of the matrix Σ by 8.5 and, since they are linearly independent by the full-rank assumption, the vector μV − cV is spanned by the N columns of Σ. Hence, there exists an N-dimensional vector λ such that:

$$\mu V - cV = \Sigma\lambda$$

So we can write:

$$dV = cV\,dt + \Sigma\,(dW + \lambda\,dt)$$

and define the risk-neutral measure by the condition that dW + λ dt
is a driftless Brownian motion. In this measure, all processes V have
drift c.
If we now consider the assets to be collateralised zero-coupon
bonds, we obtain a model of interest rates that looks exactly like
the standard Heath–Jarrow–Morton (HJM) model, except each
collateralised zero-coupon bond grows at a – possibly maturity-
dependent, although this is unusual – collateral rate c(t). A
zero-coupon bond is then given, in analogy with the corresponding
HJM formula, by:
$$P(t,T) = E_t^Q\left(e^{-\int_t^T c(s)\,ds}\right)$$

Counterparty-specific collateral rates


The same asset can be collateralised with different rates when, for
example, traded with different counterparties. Clearly, it will have
two different price processes if the collateral rates are different, and
so it should actually be thought of as two different assets.
Given that the price processes are different, this case is no
different from that considered above. So 8.4 would still hold. For
example, if the asset is a zero-coupon bond with maturity T collat-
eralised with either rate c1 or c2, then the ratio of their prices under
different collateral mechanisms would be given by:
$$\frac{P_1(t,T)}{P_2(t,T)} = \frac{E_t^Q\left(e^{-\int_t^T c_1(s)\,ds}\right)}{E_t^Q\left(e^{-\int_t^T c_2(s)\,ds}\right)}$$

Switching to the measure in which P2(t, T) is a martingale (which
we denote by Q2,T), this ratio is given by:
$$E^{2,T}\left(e^{\int_t^T (c_2(s) - c_1(s))\,ds}\right)$$

Cross-currency model
The previous section gives a flavour of the results one gets for an
asset collateralised with different rates, but probably the main
example when this situation occurs is in cross-currency markets.

According to London-based clearing house LCH.Clearnet’s collateral rules, single-currency swaps are collateralised in the currency
of the trade, while cross-currency swaps, when they start to be
cleared, are likely to be collateralised in dollars. Clearly, having
both types of swap leads to an economy where we must consider
zero-coupon bonds collateralised in the domestic, as well as in
some foreign currency. Collateralised zero-coupon bonds are not
traded by themselves, but provide convenient fundamental
building blocks for swaps collateralised in different currencies. We
carefully develop such a model in this section.

Domestic and foreign collateral


Consider an economy with domestic and foreign assets and a forex
rate X(t) expressed as a number of domestic (denoted D) units per
one foreign (denoted F). Suppose the only possible collateral types
are the domestic currency with a given unique domestic collateral
rate cd(t), and the foreign currency with a given unique foreign rate
cf(t). Denote a domestic zero-coupon bond collateralised in domestic
currency by Pd,d(t, T). This bond generates the following cashflow at
time t + dt:

$$dP_{d,d}(t,T) - c_d(t)\,P_{d,d}(t,T)\,dt \qquad (8.6)$$

Now consider a foreign zero-coupon bond collateralised with the domestic rate. Let its price, in foreign currency, be Pf,d(t, T). We consider the cashflows to determine its price process from no-arbitrage arguments.

❑❑ Purchase of the asset. The amount of Pf,d(t, T) is paid (in foreign
currency F) by party A to B.
❑❑ Collateral at t. Since A’s mark-to-market is Pf,d(t, T) in foreign
currency, the amount Pf,d(t, T)X(t) of collateral is posted in
domestic currency D by B to A.
❑❑ Return of collateral. At time t + dt, A returns collateral Pf,d(t, T)X(t)
D to B.
❑❑ Interest. At time t + dt, A also pays cd(t)Pf,d(t, T)X(t)dt interest to B
in D.
❑❑ New collateral. The new mark-to-market is Pf,d(t + dt, T). Party B
pays Pf,d(t + dt, T)X(t + dt) collateral to A in D.

The cashflow, in D, at t + dt is:

$$d\big(P_{f,d}(t,T)\,X(t)\big) - c_d(t)\,P_{f,d}(t,T)\,X(t)\,dt \qquad (8.7)$$

Drift of FX rate
Equations 8.6 and 8.7 are insufficient to determine the drift of the
forex rate X(·). From 8.7 we can only deduce the drift of the
combined quantity XPf,d and the drift of Pf,d is in general not cf (nor is
it cd, for that matter). To understand the drift of X(·), we need to
understand what kind of domestic cashflow we can generate from
holding a unit of foreign currency. So, suppose we have 1F. If it was
a unit of stock, we could repo it out (that is, borrow money secured
by the stock) and pay a repo rate on the stock. What is the equiva-
lent transaction in the forex markets? Having 1F, we can give it to
another dealer and receive its price in domestic currency, X(t)D.
The next instant t + dt we would get back 1F, and pay back X(t) +
rd,f(t)X(t)dt, where rd,f(t) is a rate agreed on this domestic loan collat-
eralised by foreign currency. As we can sell our 1F for X(t + dt)D at
time t + dt the cashflow at t + dt would be:
$$dX(t) - r_{d,f}(t)\,X(t)\,dt$$

It is not hard to see that the transaction is in fact an “instantaneous”
forex swap, with a real-life equivalent an overnight (or what is
known as tom/next) forex swap.
Is there any relationship between the rate rd,f(t) and the collateralisation rates in the two currencies? We contend that there is none – the rate is independent of the collateral rates in either currency.

Cross-currency model under domestic collateral


Let us summarise the instruments we have discussed so far and the
cashflows they generate at time t + dt.

❑❑ The market in instantaneous forex swaps allows us to generate
cashflow dX(t) – rd,f(t)X(t)dt.
❑❑ The market in Pd,d allows us to generate cashflow dPd,d(t, T) – cd(t)
Pd,d(t, T)dt.
❑❑ The market in Pf,d allows us to generate cashflow d(Pf,d(t, T)X(t))
– cd(t)Pf,d(t, T)X(t)dt.

Assuming real-world measure dynamics (μ, dW are vectors and Σ is a matrix):

⎛ ⎞
⎜ dX / X ⎟
⎜ dPd,d / Pd,d ⎟ = µ dt + ΣdW
⎜ ⎟
⎜ d P X / P X ⎟
⎝ ( f ,d ) ( f ,d )⎠

by the same arguments as above (see Different collateral rates), we
can find a measure (“domestic risk-neutral”) Qd under which the
dynamics are:
$$\begin{pmatrix} dX/X \\ dP_{d,d}/P_{d,d} \\ d(P_{f,d}X)/(P_{f,d}X) \end{pmatrix} = \begin{pmatrix} r_{d,f} \\ c_d \\ c_d \end{pmatrix} dt + \Sigma\,dW^d \qquad (8.8)$$

In particular, we have:
$$X(t) = E_t^d\left(e^{-\int_t^T r_{d,f}(s)\,ds}\,X(T)\right)$$
$$P_{d,d}(t,T) = E_t^d\left(e^{-\int_t^T c_d(s)\,ds}\right)$$
$$P_{f,d}(t,T) = \frac{1}{X(t)}\,E_t^d\left(e^{-\int_t^T c_d(s)\,ds}\,X(T)\right) \qquad (8.9)$$

Cross-currency model under foreign collateral


We can consider the same model under foreign collateralisation. We
would look at foreign bonds Pf,f and domestic bonds collateralised
in foreign currency Pd,f. By repeating the arguments above, we can
find a measure Qf under which:
⎛ ⎞ ⎛ ⎞
⎜ d (1/ X ) / (1/ X ) ⎟ ⎜−rd, f ⎟
⎜ dP / P 
⎟ = ⎜ c ⎟ dt + ΣdW f
⎜ f , f f,f
⎟ ⎜ f ⎟
⎜ d P / X / P / X ⎟ ⎝ c f ⎠
⎝ ( d, f ) ( d, f )⎠  (8.10)

Note the drift of the first component is the rate –rd,f, which is the rate
on the instantaneous forex swap from the point of view of the
foreign party. In particular:
$$P_{d,f}(t,T) = X(t)\,E_t^f\left(e^{-\int_t^T c_f(s)\,ds}\,\frac{1}{X(T)}\right) \qquad (8.11)$$
It is not hard to see the connection between Qf and Qd. In
particular:

$$\left.\frac{dQ^f}{dQ^d}\right|_{\mathcal{F}_t} = M(t) \triangleq e^{-\int_0^t r_{d,f}(s)\,ds}\,\frac{X(t)}{X(0)} \qquad (8.12)$$

where the quantity:

$$M(t) = e^{-\int_0^t r_{d,f}(s)\,ds}\,\frac{X(t)}{X(0)}$$

is a normalised positive martingale under the domestic measure.
Having this connection allows us to find the dynamics of Pf,f under
Qd, for example.
Not all processes in 8.8 and 8.10 can be specified independently.
In fact, with the addition of the dynamics of Pf,f to 8.8, the model is
fully specified, as the dynamics of Pd,f can then be derived.
Comparing our setup with that of Fujii and Takahashi (2011), we
can clarify the roles of risk-free rates that were introduced there.
While by themselves they are superfluous for the development of a
cross-currency model with collateralisation, their spreads have a
concrete market interpretation – they are given by the rates quoted
for instantaneous forex swaps and define the rate of growth of the
forex rate.

Forward forex
Forward forex contracts are traded among dealers and, as such, are
subject to collateralisation rules. A forward forex contract pays X(T)
– K at T in domestic currency. The price process of the domestic-
currency-collateralised forward contract is given by:
$$E_t^d\left(e^{-\int_t^T c_d(s)\,ds}\,\big(X(T) - K\big)\right) = X(t)\,P_{f,d}(t,T) - K\,P_{d,d}(t,T)$$

and so the forward forex rate, that is, K that makes the price process
have value zero is given by:
$$X_d(t,T) = \frac{X(t)\,P_{f,d}(t,T)}{P_{d,d}(t,T)} \qquad (8.13)$$

Note that by switching to the measure associated with Pd,d(·, T), we
get:

$$X_d(t,T) = E_t^{d,T}\big(X(T)\big)$$

so Xd(·, T) is a martingale under this measure.
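
Numerically, 8.13 is just a ratio of collateralised discount factors scaled by spot; a toy computation with hypothetical inputs:

```python
# Equation 8.13 with hypothetical inputs.
X_spot = 1.30   # spot forex rate, domestic units per foreign unit
P_fd = 0.985    # foreign zero-coupon bond, domestic collateral: P_{f,d}(t,T)
P_dd = 0.970    # domestic zero-coupon bond, domestic collateral: P_{d,d}(t,T)

X_fwd_domestic = X_spot * P_fd / P_dd   # forward forex rate X_d(t,T)
print(X_fwd_domestic)                   # ~1.3201
```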


We can also view a forward forex contract as paying 1 – K/X(T)
in foreign currency. Then, with foreign collateralisation, the value
would be:

$$E_t^f\left(e^{-\int_t^T c_f(s)\,ds}\,\big(1 - K/X(T)\big)\right) = P_{f,f}(t,T) - K\,P_{d,f}(t,T)/X(t)$$

and the forward forex rate collateralised in cf is given by:


$$X_f(t,T) = \frac{X(t)\,P_{f,f}(t,T)}{P_{d,f}(t,T)} \qquad (8.14)$$

In a general model, there is no reason why Xf(t, T) would be equal to
Xd(t, T), and the forward forex rate would depend on the collateral
used. It appears, however, that in current market practice forex
forwards are quoted without regard for the collateral arrangements,
essentially assuming that the cross-currency spread qd,f(t), defined
in the next section, is deterministic or that its volatility is small
enough to make no practical difference at liquidly observed maturi-
ties of forex forwards.
Forward forex rates are fundamental market inputs, so the
Formulas 8.13 and 8.14 are not, strictly speaking, required for
pricing them. They are needed, however, for calibrating a model
such as developed in the next section, as they are the source of
information on the initial term structures Pf,d and Pd,f and, ultimately,
on the expected values of the spot forex drift rd,f.

A simple model for collateral choice


Collateral choice
Let us consider a domestic asset, with price process V(t), that can be
collateralised either in the domestic (rate cd) or the foreign (rate cf)
currency. What is the price process of such an asset? From the
analysis in the previous section, it follows that the foreign-­
collateralised domestic zero-coupon bond grows (in the domestic
currency) at the rate cf + rd,f. It can be shown rigorously, through the
type of cashflow analysis we have performed a few times in this
chapter, that the same is true for any domestic asset. When one can
choose the collateral, one would maximise the rate received on it, so
the collateral choice rate is equal to:

$$\max\big(c_d(t),\, c_f(t) + r_{d,f}(t)\big) = c_d(t) + \max\big(c_f(t) + r_{d,f}(t) - c_d(t),\, 0\big) \qquad (8.15)$$

The simplest extension of the traditional cross-currency model that
accounts for different collateralisation would keep the spread:

$$q_{d,f}(t) \triangleq c_f(t) + r_{d,f}(t) - c_d(t)$$

deterministic. In this case, the collateral choice will not generate
any optionality, although the discounting curve for the choice
collateral rate will be modified (see Fujii and Takahashi, 2011).
Anecdotal evidence at the time of writing suggests that at least
some dealers do assign some value to the option to switch collateral
in the future. So, let us build a simple model that would give some
value to the collateral choice option. The most technically straight-
forward extension of the standard cross-currency model would
then involve specifying volatilities for the following objects: Pd,d, Pf,d,
Pf,f and X, and then proceeding to derive relevant drifts through the
HJM-type calculations. In our view, this is not particularly conven-
ient as it would make it difficult to choose parameters in a way that
would keep the spread qd,f deterministic, which is an important
boundary case. So, instead, we will specify the dynamics of Pd,d, Pf,f,
X and, importantly, the spread qd,f directly.

Zero-coupon curves
Before we start, let us discuss time-zero market data that the model
needs to recover. We have the domestic-collateral, domestic-
currency zero-coupon bonds Pd,d(0, T) that can be obtained from the
market on linear instruments in a single currency. We denote corre-
sponding instantaneous forward rates by pd,d(0, T) = –∂log Pd,d(0,
T)/∂T. Similarly we can build the “pure foreign” discounting curve
Pf,f(0, T), pf,f(0, T). From the cross-currency swaps collateralised in
the foreign currency (or from the forex forward market via 8.14), we
can obtain the foreign-collateral domestic zero-coupon bonds Pd,f(0,
T) and corresponding forward rates pd,f(0, T).
Note that we have from 8.11 and the measure change 8.12 that:
$$P_{d,f}(t,T) = X(t)\,E_t^f\left(e^{-\int_t^T c_f(s)\,ds}\,\frac{1}{X(T)}\right) = E_t^d\left(e^{-\int_t^T \left(c_f(s)+r_{d,f}(s)\right)\,ds}\right)$$
$$= E_t^d\left(e^{-\int_t^T c_d(s)\,ds}\,e^{-\int_t^T q_{d,f}(s)\,ds}\right) = P_{d,d}(t,T)\,E_t^{d,T}\left(e^{-\int_t^T q_{d,f}(s)\,ds}\right)$$

where the T-forward measure Qd,T corresponds to Pd,d(t, T) being the numeraire. Hence, in the deterministic-spread case, the time-zero curve qd,f(·) will be given by the forward rate difference pd,f(0, T) – pd,d(0, T).
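
In the deterministic-spread case this gives a direct bootstrap. The sketch below – with hypothetical discount factors – approximates the instantaneous forwards by finite differences of the log discount factors and takes their difference:

```python
import numpy as np

# Hypothetical discount factors on a common grid of maturities (years).
t = np.array([0.5, 1.0, 2.0, 3.0, 5.0])
P_dd = np.array([0.990, 0.979, 0.955, 0.929, 0.874])  # domestic collateral
P_df = np.array([0.989, 0.977, 0.950, 0.922, 0.862])  # foreign collateral

# Instantaneous forwards p(0,t) = -d log P(0,t)/dt, via finite differences.
p_dd = -np.gradient(np.log(P_dd), t)
p_df = -np.gradient(np.log(P_df), t)

q_df = p_df - p_dd               # deterministic-spread curve q_{d,f}(t)
print(np.round(q_df * 1e4, 1))   # in basis points
```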

Dynamics
We work under the domestic measure. Let the dynamics for Pd,d(t,
T) be given by:

$$dP_{d,d}(t,T)/P_{d,d}(t,T) = c_d(t)\,dt - \Sigma_d(t,T)\,dW_d(t) \qquad (8.16)$$

Standard HJM machinery can be employed to obtain the dynamics of cd(t), which, in the simplest case, can be taken to be of the Hull–White form.
Let the forex rate dynamics be given by:

$$dX(t)/X(t) = r_{d,f}(t)\,dt + \Sigma_X(t)\,dW_X(t) \qquad (8.17)$$

We will eventually be able to derive the dynamics for rd,f through
that of cd, cf and qd,f.
The dynamics of Pf,f have the same form as 8.16 but under the
foreign measure. Changing measure per 8.12 and using 8.17, we
obtain:
$$dP_{f,f}(t,T)/P_{f,f}(t,T) = c_f(t)\,dt - \Sigma_f(t,T)\,\big(dW_f(t) - \rho_{X,f}\,\Sigma_X(t)\,dt\big)$$

where ρX,f is the correlation between dWX and dWf. Again, the
dynamics for cf will follow from the standard HJM arguments.
Now let us decide on the dynamics of qd,f . Recall:
$$Q_{d,f}(t,T) = E_t^{d,T}\left(e^{-\int_t^T q_{d,f}(s)\,ds}\right)$$

where Qd,f(t, T) ≜ Pd,f(t, T)/Pd,d(t, T). Denoting the volatility of Qd,f(t, T) by Σq(t, T) and using Wq(t) as a driving Brownian motion under Qd, we can write down the dynamics of Qd,f(t, T) as:

$$dQ_{d,f}(t,T)/Q_{d,f}(t,T) = q_{d,f}(t)\,dt - \Sigma_q(t,T)\,\big(dW_q(t) + \rho_{q,d}\,\Sigma_d(t,T)\,dt\big) \qquad (8.18)$$

By standard calculations along the lines of section 10.1 of Andersen
and Piterbarg (2010), we obtain that qd,f (t) is given by:

$$q_{d,f}(t) = p_{d,f}(0,t) - p_{d,d}(0,t) + \frac{1}{2}\int_0^t \frac{\partial}{\partial t}\big(\Sigma_q(s,t)^2\big)\,ds + \rho_{q,d}\int_0^t \frac{\partial}{\partial t}\big(\Sigma_q(s,t)\,\Sigma_d(s,t)\big)\,ds + \int_0^t \frac{\partial}{\partial t}\Sigma_q(s,t)\,dW_q(s)$$

Choosing Σq(t, T) to be of the standard deterministic mean-reverting form, Σq(t, T) = σq(t)(1 – e–aq(T–t))/aq, we obtain the following model dynamics under the domestic risk-neutral measure Qd:

$$dP_{d,d}(t,T)/P_{d,d}(t,T) = c_d(t)\,dt - \Sigma_d(t,T)\,dW_d(t)$$
$$dP_{f,f}(t,T)/P_{f,f}(t,T) = c_f(t)\,dt - \Sigma_f(t,T)\,\big(dW_f(t) - \rho_{X,f}\,\Sigma_X(t)\,dt\big)$$
$$dX(t)/X(t) = \big(c_d(t) - c_f(t) + q_{d,f}(t)\big)\,dt + \Sigma_X(t)\,dW_X(t)$$
$$q_{d,f}(t) = p_{d,f}(0,t) - p_{d,d}(0,t) + \frac{1}{2}\int_0^t \frac{\partial}{\partial t}\big(\Sigma_q(s,t)^2\big)\,ds + \rho_{q,d}\int_0^t \frac{\partial}{\partial t}\big(\Sigma_q(s,t)\,\Sigma_d(s,t)\big)\,ds + \int_0^t e^{-a_q(t-s)}\,\sigma_q(s)\,dW_q(s) \qquad (8.19)$$

By setting σq(·) = 0, we recover the deterministic-spread model. What is also convenient about this formulation is that it is symmetric: all quantities change in the expected way when switching from the domestic to the foreign point of view, that is, qf,d = –qd,f and rf,d = –rd,f.

Observations
We make the following observations regarding the model 8.19.
While writing the dynamics is relatively straightforward, using
such a model in practice presents considerable challenges. There are
a number of parameters that are simply not observable in the
market, such as those for the process qd,f and various correlations.
Even if statistical estimates could be used, hedging these parame-
ters would be very difficult. Moreover, the option to switch collateral
– which is ultimately the application of this model – could disap-
pear for reasons unrelated to the model, such as a move to central
clearing or a standard credit support annex. Moreover, there are
doubts about whether an ability to instantaneously switch collateral
from one currency to another is a good reflection of reality. On the
other hand, it does give a way of getting some estimate for the
option to switch collateral, and is derived in a rigorous way.
Another point worthy of note is that the model 8.19 is a model of
discounting only. If one were to use it to price interest rate deriva-
tives beyond those depending on discounting rates only, additional
dynamics would need to be specified for forecasting curves. This

can be done, for example, either for Libor forwards or for short
rates that drive the forecasting curves. These can be specified as
deterministic spreads to the collateral rates or, in full generality,
modelled with their own stochastic drivers, further increasing the
number of unobservable parameters. In the latter case, not only the
discounting curves will depend on the collateralisation used, but
also forecasting curves such as forward Libor curves will as well, in
close analogy to the quanto-type adjustments obtained in Piterbarg
(2010).

Valuing a collateral choice option


We proceed to look at the problem of collateral choice option valua-
tion. Given an asset subject to collateral choice, its value at t = 0 is
given by:

$$P_{d,d}(0,T)\,E^{d,T}\left(e^{-\int_0^T \max(q_{d,f}(s),0)\,ds}\,V(T)\right) \qquad (8.20)$$

For an interest rate swap, say, V(T) here will be either a constant
(fixed leg cashflow) or a Libor rate fixing (floating leg cashflow). Let
us consider the fixed leg first. Here we need to calculate:
$$E^{d,T}\left(e^{-\int_0^T \max(q_{d,f}(s),0)\,ds}\right) \qquad (8.21)$$

There appears to be no closed-form expression for an option like
this. However, given stringent computational requirements for a
typical swaps trading system, using say a Monte Carlo or a partial
differential equation method is rarely feasible, and a fast analytic
approximation is required. By Jensen’s inequality:
$$E^{d,T}\left(e^{-\int_0^T \max(q_{d,f}(s),0)\,ds}\right) \geq e^{-\int_0^T E^{d,T}\left[\max(q_{d,f}(s),0)\right]\,ds}$$

The integrand in the exponent on the right-hand side is T-dependent
(through Ed,T expectation), which means we have to re-evaluate the
integrand terms for each T, slowing down calculations. It seems
sensible to replace the expectations with Ed,s[max(qd,f (s), 0)], allowing
them to be calculated once for all T. Given small differences between
the two, other more significant approximations involved and the
uncertainty in market parameters, the trade-off seems justified. So
we use a simple first-order approximation:

$$E^{d,T}\left(e^{-\int_0^T \max(q_{d,f}(s),0)\,ds}\right) \approx e^{-\int_0^T E^{d,s}\left[\max(q_{d,f}(s),0)\right]\,ds} \qquad (8.22)$$

Given that in the model 8.19 qd,f (s) is Gaussian, the required
Ed,s[max(qd,f (s), 0)] can be readily calculated in closed form. In prac-
tice, we would calculate it for a number of points si and interpolate
in between. While 8.22 is only an approximation, at least for some
values of market parameters it appears to be a good one (see below).
For the floating leg, a pragmatic choice would be to move the
Libor fixing outside of the expected value, that is, replace 8.20 with:
$$P_{d,d}(0,T)\,E^{d,T}\left(e^{-\int_0^T \max(q_{d,f}(s),0)\,ds}\right)\,E^{d,T}\big(V(T)\big)$$

and proceed with 8.22.
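
Since qd,f(s) is Gaussian, Ed,s[max(qd,f(s), 0)] is the familiar expectation mΦ(m/v) + vφ(m/v) for a normal variable with mean m and standard deviation v. The sketch below assembles the first-order factor of 8.22 by trapezoidal integration; the spread volatility parameters echo the example below (σq = 0.50%, aq = 40%), while the mean curve is hypothetical:

```python
import numpy as np
from scipy.stats import norm

def expected_positive_part(m, v):
    """E[max(q, 0)] for q ~ N(m, v**2), treating v = 0 as max(m, 0)."""
    m, v = np.asarray(m, float), np.asarray(v, float)
    out = np.maximum(m, 0.0)
    pos = v > 0
    d = m[pos] / v[pos]
    out[pos] = m[pos] * norm.cdf(d) + v[pos] * norm.pdf(d)
    return out

s = np.linspace(0.0, 5.0, 101)           # grid of dates (years)
mean_q = 0.0010 * np.cos(s)              # E^{d,s}[q_{d,f}(s)] - hypothetical
a_q, sigma_q = 0.40, 0.0050              # spread mean reversion and volatility
stdev_q = sigma_q * np.sqrt((1.0 - np.exp(-2.0 * a_q * s)) / (2.0 * a_q))

option_curve = expected_positive_part(mean_q, stdev_q)
first_order_factor = np.exp(-np.trapz(option_curve, s))   # 8.22 at T = 5y
print(first_order_factor)
```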

Example
Here, we present a numerical example for collateral choice option
valuation. We use data from November 2011 from Barclays. The
domestic currency is the euro and the foreign currency is sterling.
We use the following parameters for the process qd,f (·), estimated
historically: σq = 0.50% and aq = 40%. In Figure 8.1, we plot a number
of forward curves against time t in years. The curve labelled option
forward is pd,f (0, t) – pd,d (0, t), that is, the forward curve for the spread
process qd,f (·). The curve labelled option value (intrinsic) is the curve
max(pd,f (0, t) – pd,d (0, t), 0). This would be the value of the collateral
choice option assuming deterministic evolution of the spread qd,f (·).
The curve labelled option value (exp) is the true value calculated
from 8.21 by Monte Carlo simulation, expressed in instantaneous
forward rate terms:

$$\text{Option value (exp)} = -\frac{\partial}{\partial t}\,\log E^{d,t}\left(e^{-\int_0^t \max(q_{d,f}(s),0)\,ds}\right)$$

Finally, the curve labelled option value (first order) is the first-order
approximation from 8.22, Ed,t[max(qd,f (t), 0)].
We see that the option value is not insignificant. We also see that
the first-order approximation matches the true value of the option
closely, at least for the values of the parameters used.

Figure 8.1 Collateral choice forward, intrinsic and option values

[chart: curves in %, roughly –1.0 to +0.6, against time t in years – option forward; option value (intrinsic); option value (exp); option value (first order)]

Conclusion
We have developed a framework for asset pricing in an economy
where there is no risk-free rate and all transactions are collateral-
ised. It turns out that much of the machinery of standard risk-neutral
pricing theory can be reused, with a few changes. In the risk-neutral
measure, each collateralised asset grows at the rate at which it is
collateralised. The forex rate drift is not given by the difference of
the risk-free rates in two currencies (as they do not exist in such an
economy), but is given by a rate on an instantaneous forex swap,
which is essentially an overnight repo rate on the sale of one unit of
foreign currency for its domestic price. Consequently, the forex rate
drift is not dependent on the collateral rates in the two economies
(domestic and foreign), but the forward forex rates are.
Furthermore, we demonstrated a simple model with stochastic
dynamics for the difference between the forex-adjusted foreign
collateral rate and the domestic collateral rate in which the option
to switch collateral has time value, commented on the practical use
of such a model and presented a numerical example.

The author would like to thank Rashmi Tank and Thomas Roos for
stimulating discussions.

  1 Collateralised stock sale is actually a repo transaction. Here we assume that the repo rate
is the same as the collateral rate. We consider different rates later.

REFERENCES

Andersen L. and V. Piterbarg, 2010, Interest Rate Modeling (London, England: Atlantic
Financial Press).

Black F., 1972, “Capital Market Equilibrium with Restricted Borrowing,” Journal of
Business, 45, pp 444–55.

Duffie D., 2001, Dynamic Asset Pricing Theory (3e) (Princeton, NJ: Princeton University
Press).

Fujii M. and A. Takahashi, 2011, “Choice of Collateral Currency,” Risk, January, pp


120–25 (available at www.risk.net/1935412).

Macey G., 2011, “Pricing with Standard CSA Defined by Currency Buckets,” SSRN
eLibrary.

Piterbarg V., 2010, “Funding Beyond Discounting: Collateral Agreements and Derivatives
Pricing,” Risk, February, pp 97–102 (available at www.risk.net/1589992).



Section 2

Asset and Risk Management

9
A Dynamic Model for
Hard-to-borrow Stocks
Marco Avellaneda and Mike Lipkin
New York University and Columbia University

Moves by regulators to put restrictions on short selling financial
stocks have had many repercussions for financial markets. Such
restrictions are known to lead to overpricing, in the sense used by
Jones and Lamont (2002) – stock prices have been “pumped up” by
forced buying of short positions in the market – and have increased
market volatility.
The availability of stocks for borrowing depends on market
conditions. Firms usually charge a fee, often in the form of a reduced
interest rate, to accommodate clients who wish to short “hard-to-
borrow” stocks, so there is a cost associated with maintaining a
short position. While many stocks are easily borrowed, others are in
short supply.
In general, hard-to-borrow stocks earn a reduced interest rate on
cash credited for short positions by the clearing firms. Moreover,
short positions in hard-to-borrow stocks may be forcibly repur-
chased (bought in) by the clearing firms. These buy-ins will usually
be made in order to cover shortfalls in delivery of stock following
the US Securities and Exchange Commission’s Regulation SHO,
which requires traders to “locate” shares of “threshold” securi-
ties that they intend to short before doing so. Although a stock may
have a large short interest – the percentage of the float currently held
short – without actually being subject to buy-ins, hard-to-borrow
stocks are those for which buy-ins will occur with non-zero prob-
ability. The larger the short interest, the harder it is to borrow stock.

When a buy-in takes place, firms repurchase stock in the amount
of the undelivered short positions of their clients. This introduces
an excess demand for stock that is unmatched by supply at the
current price, resulting in a temporary upward impact on prices.
Each day, when buy-ins are completed, the excess demand disap-
pears, causing the stock price to jump roughly to where it was
before the buy-in started (see Figure 9.1). We note that the short
interest and the buy-in rate should vary in the same direction: the
greater the short interest, the more frequent the buy-ins. The more
frequent the buy-ins, the higher the stock price gets driven by
market impact.
A critical consideration is that shorting stock and buying puts are
not equivalent as a means of gaining short exposure. A trader
subject to a potential buy-in remains uncertain of how much, if any,
of his short position might be repurchased until the market closes,
and will have to sell any unexpected long deltas acquired through
buy-ins. As a consequence, someone who is long a put will not have
the same synthetic position as the holder of a call and short stock.
The latter position will reflect an uncertain amount of short stock
overnight but not the former. The following examples illustrate the
rich variety of phenomena associated with hard-to-borrow stocks,
which we will attempt to explain with our model.

Artificially high prices and sharp drops


Over a period of less than two years, from 2003 to 2005, the stock of
Krispy Kreme Doughnuts (KKD) made extraordinary moves, rising
from single digits to more than US$200. During this time, buy-ins
were quite frequent. Short holders of the stock were unpredictably
forced to cover part of their shorts by their clearing firms, often at
unfavourable prices. After 2005, KKD failed to report earnings for
more than four consecutive quarters, several members of the orig-
inal management team left or were replaced, and the stock price
dropped to less than US$3.

Short-squeezes
A short-squeeze is often defined as a situation in which an imbal-
ance between supply and demand causes the stock to rise abruptly
and a scramble to cover on the part of short sellers. The need to
cover short positions drives the stock even higher. In another

market development, Porsche indicated its desire to control 75% of
Volkswagen, leading to an extraordinary spike in the stock price
(see Figure 9.2).

Cost of conversions
Converting means selling a call option and buying a put option of
the same strike and 100 shares of stock. According to put–call parity,
for an ordinary (non-dividend paying) stock, the premium-over-
parity of a call (Cpop) should exceed the premium-over-parity of the
corresponding put (Ppop) by an amount approximately equal to the
strike times the spot rate.1 In particular, a converter should receive a
credit for selling the call, buying the put and buying 100 shares.
However, for hard-to-borrow stocks the reverse is often true. In
January 2008, prior to announcing earnings, the stock of VMWare
Corporation (VMW) became extremely hard to borrow. This was
reflected by the unusual cost of converting on the January 2009
at-the-money strike. The difference Cpop – Ppop for the January 2009
US$60 line was –US$8! A converter would therefore need to pay
US$8 (per share) to enter the position, that is, US$800 per contract.
Following the earnings announcement, VMW fell roughly US$28
(see Figure 9.3). At the same time, the cost of the conversion on the
60 strike in January 2009 dropped in absolute value to

Figure 9.1 Minute-by-minute price evolution of Interoil Corporation: June 17–23, 2008

[intraday price chart: IOC (US$), approximately US$25–43]

Note: Note the huge spike, which occurred on the closing print of June 19. The
price retreats nearly to the same level as prior to the buy-in

Figure 9.2 Short-squeeze in Volkswagen: October 2008

[chart: Volkswagen closing prices (€), September–November 2008, axis up to €1,000]

approximately –US$1.80 (per share) from –US$8. (The stock was


still hard to borrow, but much less so.) Therefore, a trader holding
10 puts, long 1,000 shares and short 10 calls, believing himself to be
delta-neutral, would have lost (US$8 – US$1.80) × 10 × 100 =
US$6,200.

Unusual pricing of vertical spreads


A vertical spread (see Natenberg, 1998) is defined as buying an
option with one strike and selling another with a different strike on
the same series. Options on the same hard-to-borrow name with
different strikes and the same expiry seem to be mis-priced. For
example, the biotech company Dendreon was extremely hard to
borrow in February 2008. With stock trading at US$5.90, the January
2009 US$2.50–5.00 put spread was trading at US$2.08 (midpoint
prices), shy of a maximal value of US$2.50, despite having zero
intrinsic value. Note this greatly exceeds the midpoint-rule value of
US$1.25, which is typically a good upper bound for out-of-the-money vertical spreads.
To recover these features within a mathematical model, we
propose a stochastic buy-in rate that provides a feedback mecha-
nism coupling the dynamics of the stock price with the frequency at

Figure 9.3 Closing prices of VMWare: November 1, 2007–September 26, 2008

[chart: VMW closing prices (US$), November 2007–September 2008]

Note: The large drop in price after an earnings announcement in late January 2008 was accompanied by a reduction in the difficulty to borrow, as seen in the price of conversions

which buy-ins take place, measured in events per year. We model
the temporary excess demand as a drift proportional to the buy-in
rate and the relaxation as a Poisson jump with intensity equal to the
buy-in rate, so that, on average, the expected return from holding
stock that is attributable to buy-in events is zero. Using this process,
we derive option pricing formulas and describe many empirical
stylised facts. The model presented here can be seen as providing a
dynamic framework for quantifying market-makers’ losses due to
buy-ins, as in the empirically focused article by Evans et al (2008),
and adds to a considerable amount of previous theoretical work on
hard-to-borrow stocks, for example in Nielsen (1989), Duffie,
Garleanu and Pedersen (2002) and Diamond and Verrecchia (1987).
After introducing the model in the following section, we derive a
corresponding put-call parity relation matching the observed
conversion prices. The anomalous vertical spreads are also
explained. Then we present option pricing formulas for European-
style options and tractable approximations for American-style
options. One of the most striking consequences is the early exercise
of deep in-the-money calls. Then we observe that the fluctuations in
the intensity of buy-ins and changes in the cost of borrowing can be
measured using leveraged exchange-traded funds (ETFs) tracking
financial stocks (which were extremely hard to borrow in autumn
2008).

The model
We assume that under the physical measure the hard-to-borrow stock St and buy-in rate λt satisfy the system of coupled equations:

$$\frac{dS_t}{S_t} = \sigma\,dW_t + \gamma\lambda_t\,dt - \gamma\,dN_{\lambda_t}(t) \qquad (9.1)$$
$$dX_t = \kappa\,dZ_t + \alpha\big(\bar{X} - X_t\big)\,dt + \beta\,\frac{dS_t}{S_t}, \quad X_t = \ln(\lambda_t/\lambda_0) \qquad (9.2)$$
where dNλ(t) denotes the increment of a standard Poisson process with intensity λt over the interval (t, t + dt); the parameters σ and γ are respectively the volatility of the continuous part and the price elasticity of demand due to buy-ins; and Wt is a standard Brownian motion. Equation 9.2 describes the evolution of the logarithm of the buy-in rate; κ is the volatility of the rate, Zt is a Brownian motion independent of Wt, X̄ is a long-term equilibrium value for Xt, α is the speed of mean-reversion and β couples the change in price with the buy-in rate.
We assume that β > 0; in particular, Xt = ln(λt/λ0) is positively correlated with price changes, introducing a positive feedback between increases in buy-ins (hence in short interest in the stock) and price.
Equations 9.1 and 9.2 describe the evolution of the stock price across an extended period of time. One can think of a diffusion process for the stock price, which is punctuated by jumps occurring at the end of the trading day, the magnitude and frequency of the latter being determined by λt. Fluctuations in λt represent the fact that a stock may be difficult to borrow one day and easier another. In this way, the model describes the dynamics of the stock price as costs for stock-loan vary. Short squeezes can be seen as events associated with large values of λt, which will invariably exhibit price spikes (rallies followed by a steep drop).
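
A minimal Euler discretisation of Equations 9.1–9.2 is sketched below; the scheme and all parameter values are our own illustrative choices:

```python
import numpy as np

def simulate_htb(S0=100.0, lam0=10.0, sigma=0.5, gamma=0.02, kappa=0.5,
                 alpha=4.0, X_bar=0.0, beta=2.0, T=1.0, n_steps=2520, seed=1):
    """One Euler path of the coupled system 9.1-9.2.
    X_t = ln(lambda_t/lambda_0); X_bar = 0 makes lambda_0 the long-run rate."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S, X = S0, 0.0
    S_path, lam_path = [S0], [lam0]
    for _ in range(n_steps):
        lam = lam0 * np.exp(X)
        dN = rng.poisson(lam * dt)                         # buy-in jumps
        dS_over_S = (sigma * np.sqrt(dt) * rng.standard_normal()
                     + gamma * lam * dt - gamma * dN)      # equation 9.1
        X += (kappa * np.sqrt(dt) * rng.standard_normal()
              + alpha * (X_bar - X) * dt + beta * dS_over_S)  # equation 9.2
        S *= 1.0 + dS_over_S
        S_path.append(S)
        lam_path.append(lam0 * np.exp(X))
    return np.array(S_path), np.array(lam_path)

S_path, lam_path = simulate_htb()
print(S_path[-1], lam_path.max())
```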

The cost of shorting: buy-ins and effective dividend yield
Option market-makers need to hedge by trading the underlying
stock, both on the long and short side, with frequent adjustments.
However, securities that become hard to borrow are subject to
buy-ins as the firm needs to deliver shares according to the pres-
ently existing settlement rules. From a market-maker’s viewpoint,
a hard-to-borrow stock is essentially a security that presents an
increased likelihood of buy-ins.
The profit or loss for a market-maker is affected by whether and
when their short stock is bought in and at what price. Generally,
this information is not known until the end of the trading day. To
model the economic effect of buy-ins, we assume that the trader’s
profit and loss from a short position of one share over a period (t, t +
dt) is:
$$P\&L = -dS_t - \xi\gamma S_t = -S_t\,\big(\sigma\,dW_t + \lambda_t\gamma\,dt\big)$$

where Prob{ξ = 0} = 1 – λtdt + o(dt) and Prob{ξ = 1} = λtdt + o(dt). Thus,
we assume that the trader who is short the stock does not benefit
from the downward jump in Equation 9.1 because they are no
longer short by the time the buy-in is completed. The idea is that the
short trader takes an economic loss post-jump due to the fact that
their position was closed at the buy-in price.
Suppose then, hypothetically, that the trader was presented with
the possibility of “renting” the stock for the period (t, t + dt) so that
they can remain short and be guaranteed not to be bought in. The
corresponding profit and loss would now include the negative of
the downward jump, that is, γSt if the jump happened right after time t. Since jumps and buy-ins occur with frequency λt, the expected economic gain is λtγSt. It follows that the fair value of the proposed rent is λtγ per dollar of equity shorted. In other words, λtγ can be viewed as the cost-of-carry for borrowing the stock.
Hence, we can interpret λtγ as a convenience yield associated with owning the stock when the buy-in rate is λt. This convenience
yield is monetised by holders of long positions lending their stock
out for one day at a time and charging the fee associated with the
observed buy-in rate. The convenience yield or rent is mathemati-
cally equivalent to a stochastic dividend yield that is credited to
long positions and debited from holders of short positions who
enter into lending agreements. For traders who are short but do
not enter into such agreements, it is assumed that stochastic
buy-ins prevent them from gaining from downward jumps.
We can therefore introduce an arbitrage-free pricing measure
associated with the physical process 9.1–9.2, in which the rent, or
stock-financing, ltg, cancels the drift component of the model and
the expected return is equal to the cost of carry. Under this measure,
our model takes the form:

$$\frac{dS_t}{S_t} = \sigma\,dW_t + r\,dt - \gamma\,dN_{\lambda_t}(t) \qquad (9.3)$$
where r is the instantaneous interest rate. The absence of the drift
term λtγ in this last equation is due to the fact that, under an arbi-
trage-free pricing measure, the discounted price process is a
martingale.
It follows from Equation 9.3 that the stock price in the risk-neutral
world can be written as:
$$S_t = S_0\,e^{rt}\,M_t\,(1-\gamma)^{\int_0^t dN_{\lambda_t}(t)} \qquad (9.4)$$

where the third factor represents the effects of buy-ins, and:


$$M_t := \exp\left\{\sigma W_t - \frac{\sigma^2 t}{2}\right\}$$
is the classical lognormal martingale.
The first application of the model is forward pricing. Assuming
constant interest rates, we have:
$$\text{Forward price} = E\{S_T\} = E\left\{S_0\,e^{\sigma W_T - \frac{\sigma^2 T}{2} + rT}\,(1-\gamma)^{\int_0^T dN_{\lambda_t}(t)}\right\}$$
$$= S_0\,e^{rT}\,E\left\{e^{-\int_0^T \lambda_t\,dt}\,\sum_k \frac{\big(\int_0^T \lambda_t\,dt\big)^k}{k!}\,(1-\gamma)^k\right\} = S_0\,e^{rT}\,E\left\{e^{-\gamma\int_0^T \lambda_t\,dt}\right\} \qquad (9.5)$$

This gives a mathematical formula for the forward price in terms of the buy-in rate and the constant γ. Clearly, if there are no jumps, the formula becomes classical. Otherwise, notice that the dividend is positive and delivering stock into a forward contract requires hedging with less than one unit of stock, “renting it” along the way to arrive at one share at delivery. From Equation 9.5, the term structure of forward dividend yields (dt) associated with the model is given by:
$$e^{-\int_0^T d_t\,dt} = E\left\{e^{-\gamma\int_0^T \lambda_t\,dt}\right\} \qquad (9.6)$$

Option pricing for hard-to-borrow stocks


Put-call parity for European-style options states that:

$$C(K,T) - P(K,T) = S\,(1 - DT) - K\,(1 - RT)$$

where P(K, T), C(K, T) represent respectively the fair values of a put
and a call with strike K and maturity T, S is the spot price and R, D
are respectively the simply discounted interest rate and dividend
rate. It is equivalent to:

$$C_{pop}(K,T) - P_{pop}(K,T) = KRT - DST \qquad (9.7)$$

where Ppop(K, T) = P(K, T) – max(K – S, 0) represents the premium-
over-parity for the put, a similar notation applying to calls.
It is well known that put-call parity does not hold for hard-to-
borrow stocks if we enter the nominal rates and dividend rates in
Equation 9.7. The price of conversions in actual markets should
therefore reflect this. A long put position is mathematically equiva-
lent to being long a call and short 100 shares of common stock, but
this will not hold if the stock is a hard-to-borrow stock. The reason
is that shorting costs money and the arbitrage between puts and
calls on the same line, known as a conversion, cannot be made
unless there is stock available to short. Conversions that look attrac-
tive, in the sense that:

$$C_{pop}(K,T) - P_{pop}(K,T) < KRT - DST \qquad (9.8)$$

may not result in a risk-free profit due to the fact that the crucial
stock hedge (short 100 shares) may be impossible to establish.
We quantify deviations from put-call parity by considering the
function:

$$d_{imp}(K,T) \equiv \frac{C_{pop}(K,T) - P_{pop}(K,T) - KRT}{-ST}, \quad 0 < K < \infty \qquad (9.9)$$

As a function of K, dimp(K, T) will be approximately flat for low strikes
and will rise slightly for large values of K because puts become
more likely to be exercised. The dividend yield for the stock should
correspond roughly to the level of dimp(K, T) for at-the-money strikes.
If we consider American-style options on dividend-paying stocks
or exchange-traded funds (for example, the S&P 500), then the
implied dividend curve will, in addition, be lower for low strikes,
reflecting the fact that calls have an early-exercise premium.

Figure 9.4 Implied dividend rates as a function of strike price for options
on Dendreon
[chart: implied dividend (%) against strike (US$), strikes 0–45]
Note: The trade date is January 10, 2008 and the expiry is January 17, 2009.
The stock price is US$5.81. The best fit constant dividend rate is approximately
15%. Dendreon does not pay dividends

The situation is quite different for hard-to-borrow stocks, as we
can see from Figures 9.4 and 9.5. Two distinctions are important: (i)
the implied dividend curve dimp(K, T) for K ≈ S is not equal to the
nominal dividend yield (which is zero, in the case of the stocks that
are displayed in the figures) – instead, it has a positive value; and
(ii) the implied dividend curve dimp(K, T) also bends for low values of
the strike, suggesting that calls with low strikes should have an
early exercise premium.
The first feature – a change in level in the implied dividend curve
– has to do with the extra premium for being long puts in a world
where shorting stock is difficult or expensive. Since synthetic puts
cannot be manufactured by owning calls and shorting stock, the
nominal put–call parity does not hold. Instead, it is replaced by a
functional put–call parity, which expresses the relative value of
puts and calls via an effective dividend rate. Indeed, if we define
the at-the-money implied dividend yield D*(T) = dimp(S, T), we obtain
the new parity relation:

$$C_{pop}(K,T) - P_{pop}(K,T) = KRT - D^*(T)\,ST$$

According to our model, we have, from Equation 9.9:

Figure 9.5 Implied dividend rates for VMWare

[chart: implied dividend (%) against strike (US$), strikes 0–300]
Note: The dates are as in Figure 9.4 and the stock price is US$80.30. The best fit
dividend rate (associated with at-the-money options) is 5.5%. VMWare does
not pay dividends

$$D^*(T) = \frac{1 - e^{-\int_0^T d_t\,dt}}{T} = \frac{1 - E\left[e^{-\gamma\int_0^T \lambda_t\,dt}\right]}{T} \qquad (9.10)$$

which connects the implied dividend rate obtained from the options
markets to the buy-in rate process.
The option market predicts different borrowing rates over time
for any given stock, through variations in the interest rate quoted
by clearing firms, and by conversion-reversals quoted by option
market-makers. The latter approach suggests different implied
dividends per option series, that is, it contains market expectations
of the varying degree of difficulty of borrowing a stock in the future.
We can use the model 9.1–9.2 and Equation 9.10 to calculate a term
structure of effective dividends (or, equivalently, short rates) that
could be calibrated to any given stock. To generate such a term
structure, we simulate paths of λt, 0 < t < Tmax and calculate the
discount factors by Monte Carlo. Figure 9.6 shows a declining term
structure, which is typical of most stocks. This decay represents the
fact that stocks rarely remain hard to borrow over extremely long
time periods.
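
A minimal version of that Monte Carlo is sketched below, taking β = 0 (no price feedback) so that λt can be simulated on its own; all parameter choices other than λ0 and γ, which echo Figure 9.6, are illustrative:

```python
import numpy as np

def dividend_term_structure(lam0=15.0, gamma=0.01, kappa=1.0, alpha=2.0,
                            X_bar=0.0, T_max=2.0, n_steps=500,
                            n_paths=20_000, seed=2):
    """D*(T) of Equation 9.10 by Monte Carlo, with beta = 0 (no feedback)."""
    rng = np.random.default_rng(seed)
    dt = T_max / n_steps
    X = np.zeros(n_paths)            # X_t = ln(lambda_t/lambda_0)
    integral = np.zeros(n_paths)     # int_0^t lambda_s ds along each path
    T_grid = np.arange(1, n_steps + 1) * dt
    D_star = np.empty(n_steps)
    for i in range(n_steps):
        integral += lam0 * np.exp(X) * dt
        X += (kappa * np.sqrt(dt) * rng.standard_normal(n_paths)
              + alpha * (X_bar - X) * dt)
        D_star[i] = (1.0 - np.exp(-gamma * integral).mean()) / T_grid[i]
    return T_grid, D_star

T_grid, D_star = dividend_term_structure()
print(D_star[[49, 249, 499]])        # D*(T) at T = 0.2, 1.0 and 2.0 years
```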

Figure 9.6 Term structure of effective dividend rates


[chart: D*(T) (%) against T (years), 0–2 years]
Note: D*(T) for λ0 = 15, γ = 0.01, β = 30, σ = 0.5

Figure 9.7 Theoretical implied dividend yield dimp(K, T) generated by the model

[chart: implied dividend (%) against strike, strikes 50–150]
Note: We assume that the stock price is US$100, σ = 0.50, β = 1.00, λ0 = 50,
T = 0.5yrs, γ = 0.03, r = 10%. The effective dividend rate is dimp(100, T) = 14%.
For low strikes, the drop in value is related to the early exercise of calls, a
feature unique to hard-to-borrow stocks. For high strikes, the broad increase
corresponds to the classical early exercise property of in-the-money puts

If we make the approximation that λt is independent of Wt, in the sense that:

$$dX_t = \kappa\,dZ_t + \alpha\big(\bar{X} - X_t\big)\,dt + \beta\,\big(\gamma\lambda_t\,dt - \gamma\,dN_{\lambda_t}(t)\big)$$

the model becomes more tractable and we obtain semi-explicit
pricing formulas for European-style puts and calls as series expan-
sions by separation of variables. To see this, we define the weights:

$$\Pi(n,T) = \text{Prob}\left\{\int_0^T dN_{\lambda_t}(t) = n\right\} = E\left\{e^{-\int_0^T \lambda_t\,dt}\,\frac{\big(\int_0^T \lambda_t\,dt\big)^n}{n!}\right\} \qquad (9.11)$$

Denote by BSCall(s, t, k, r, d, σ) the Black–Scholes value of a call option for a stock with price s, time to maturity t, strike price k, interest rate r, dividend yield d and volatility σ. We then have:


$$C(S,K,T) = \sum_{n=0}^{\infty} \Pi(n,T)\,\text{BSCall}\big(S\,(1-\gamma)^n,\, T,\, K,\, r,\, 0,\, \sigma\big) \qquad (9.12)$$

with a similar formula holding for European-style puts.
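
In the constant-λ case the weights Π(n, T) reduce to Poisson probabilities and 9.12 becomes a Poisson mixture of Black–Scholes prices. A sketch of ours, using the parameter values quoted in the note to Figure 9.7:

```python
import numpy as np
from scipy.stats import norm, poisson

def bs_call(s, t, k, r, d, sigma):
    """Black-Scholes call on a stock with continuous dividend yield d."""
    d1 = (np.log(s / k) + (r - d + 0.5 * sigma**2) * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    return s * np.exp(-d * t) * norm.cdf(d1) - k * np.exp(-r * t) * norm.cdf(d2)

def htb_call(S, K, T, r, sigma, lam, gamma, n_max=200):
    """Equation 9.12 with a constant buy-in rate lambda:
    Poisson-weighted Black-Scholes calls on S*(1-gamma)**n."""
    n = np.arange(n_max)
    weights = poisson.pmf(n, lam * T)                    # Pi(n, T)
    prices = np.array([bs_call(S * (1.0 - gamma)**k, T, K, r, 0.0, sigma)
                       for k in n])
    return float(weights @ prices)

print(htb_call(S=100.0, K=100.0, T=0.5, r=0.10, sigma=0.50,
               lam=50.0, gamma=0.03))
```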


Note that Equation 9.4 can be viewed as the risk-neutral process
for a stock that pays a discrete dividend γSt with frequency λt.
Therefore, calls will be exercisable if they are deep enough in-the-
money. A heuristic explanation is that a trader long a call and short
stock would suffer repeated buy-ins costing more than the synthetic
put forfeited by exercising. Unfortunately, pricing an American-
style call using the full model 9.3 entails a high-dimensional
numerical calculation, because the number of jumps until time t:
$$\int_0^t dN_{\lambda_t}(t)$$

is not a Markov process unless λt is constant. In other words, the state of the system depends on the current value of λt and not just on the number of jumps that occurred previously. The case λt = constant is an exception; it corresponds to β = 0, that is, to the absence
of coupling between the price process and the buy-in rate. The
calculation of American-style option prices in this case is classical
(see, for instance, Amin, 1993). Figure 9.7 shows the curve dimp(K, T)
for American-style options using the model, consistent with the
observed graphs of Dendreon and VMW (Figures 9.4 and 9.5).

Hard-to-borrowness in leveraged short ETFs


Here, we show how hard-to-borrowness can also be observed, in
some cases, from price data for leveraged long and short ETFs.
Since short ETFs maintain a short position in the underlying secu-
rity, we expect that the cost of borrowing the underlying shares
should be reflected in the value of the fund. Thus, we should be
able to observe the borrowing costs, by comparing the prices of the
short-leveraged product with the underlying ETF or with a long-
leveraged product.
Let Ut(2) and Ut(–2) denote respectively the prices of a double-long
ETF and a double-short ETF on the same underlying index, St. It
follows from the definition of these products that:
$$\frac{dU_t^\beta}{U_t^\beta} = \beta\,\frac{dS_t}{S_t} - \beta\,(r - \rho_t)\,dt + r\,dt - f\,dt, \quad \beta \in \{2, -2\} \qquad (9.13)$$
where r is the benchmark funding rate (Libor or Fed Funds), f is the
expense ratio or management fee and:
$$\rho_t = \begin{cases} \gamma\lambda_t & \text{if } \beta < 0 \\ 0 & \text{if } \beta > 0 \end{cases} \qquad (9.14)$$

Thus ρt is the instantaneous (annualised) rent that is associated with shorting the underlying stock. We can view ρt as a proxy for γλt, the expected shortfall for a short-seller subject to buy-in risk, or the “fair” reduced rate associated with shorting the underlying asset.
We obtain:
$$\frac{dU_t^{(2)}}{U_t^{(2)}} + \frac{dU_t^{(-2)}}{U_t^{(-2)}} = 2\,\big((r - \rho_t) - f\big)\,dt$$

which implies that:


$$\gamma\lambda_t\,dt = \frac{\dfrac{dU_t^{(2)}}{U_t^{(2)}} + \dfrac{dU_t^{(-2)}}{U_t^{(-2)}} + (2f - 2r)\,dt}{-2} \qquad (9.15)$$

This suggests that we can use daily data on leveraged ETFs to esti-
mate the cost of borrowing the underlying stock.
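
A sketch of that estimator follows; the function is ours, and the price series standing in for the ETF closes are synthetic:

```python
import numpy as np
import pandas as pd

def implied_borrow(long_px, short_px, r=0.03, f=0.0095, dt=1.0/252.0):
    """Annualised gamma*lambda_t from Equation 9.15, given daily closes of a
    2x long and a 2x short ETF on the same index, smoothed over 10 days."""
    ret_long = long_px.pct_change()          # dU(2)/U(2)
    ret_short = short_px.pct_change()        # dU(-2)/U(-2)
    gl = (ret_long + ret_short + (2.0 * f - 2.0 * r) * dt) / (-2.0 * dt)
    return gl.rolling(10).mean()

# Synthetic stand-ins for dividend-adjusted closes.
idx = pd.date_range("2008-09-01", periods=120, freq="B")
rng = np.random.default_rng(0)
long_px = pd.Series(50.0 * np.exp(np.cumsum(0.03 * rng.standard_normal(120))), idx)
short_px = pd.Series(90.0 * np.exp(np.cumsum(0.03 * rng.standard_normal(120))), idx)
print(implied_borrow(long_px, short_px).dropna().tail())
```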
For the empirical analysis, we used dividend-adjusted closing
prices from the ProShares UltraShort Financials ETF (SKF) and the ProShares Ultra Financials ETF (UYG). The underlying
ETF is the Barclays Dow Jones Financial Index ETF. Using historical
data, we calculated the right-hand side of Equation 9.15, which we
interpret as corresponding to daily sampling, with dt = 1/252, r =

Figure 9.8 The cost of borrowing the Barclays Dow Jones Financial
Index
[chart: implied borrow cost (%), roughly –100 to 300, January 2007–February 2009]
Note: The thin line corresponds to the daily values of the cost of borrowing
parameter γλt, in percentage points, estimated from Equation 9.15. The thick
line is a 10-day moving average. Hard-to-borrowness exceeds 100% in
September–October 2008 and remains elevated until March 2009

three-month Libor and f = 0.95%, the expense ratio of SKF and UYG
advertised by ProShares. The results of the simulation are seen in
Figure 9.8.
We see that ρt, the cost of borrowing, varies in time and can change quite dramatically. In Figure 9.8, we consider a 10-day moving average of ρt to smooth out the effect of volatility and end-
of-day marks. The data shows that increases in borrowing costs, as
implied from the leveraged ETFs, began in late summer 2008 and
intensified in mid-September, when Lehman Brothers collapsed
and the US Securities and Exchange Commission (SEC) ban on
shorting 800 financial stocks was implemented (the latter occurred
on September 19, 2008). Note that the implied borrowing costs for
financial stocks remain elevated subsequently, despite the fact that
the SEC ban on shorting was removed in mid-October. This calcula-
tion may be interpreted as exhibiting the variations of λt (or γλt) for a basket of financial stocks. For instance, if we assume that the elasticity remains constant (for example, at 2%), the buy-in rate will range from a low number (for example, λ = 1, or one buy-in a year)
to 50 or 80, corresponding to several buy-ins a week.

Conclusion
In the past, attempts have been made to understand option pricing
for hard-to-borrow stocks using models that do not take into
account price dynamics. This approach leads to a view of put–call
parity that is at odds with the functional equilibrium (steady state)
evidenced in the options markets, in which put and call prices are
stable and yet naive put–call parity does not hold. The point of this
chapter has been to show how dynamics and pricing are inter-
twined. The notion of effective dividend is the principal consequence of our model, which also produces a term structure of dividend yields. Reasonable parametric choices lead to a term structure that
is concave down, a shape frequently seen in real option markets.
The model also reproduces the (American-style) early exercise
features, including early exercise of calls, which cannot happen for
non-dividend-paying easy-to-borrow stocks.

The authors would like to thank the referees for many helpful and
insightful comments, Sacha Stanton of Modus Incorporated for assist-
ance with options data and Stanley Zhang for exciting discussions on
leveraged exchange-traded funds.

  1 Premium-over-parity (POP) means the difference between the (mid-)market price of the
option and its intrinsic value. Some authors also call the POP the extrinsic value. We use
“approximately equal” because listed options are American-style, so they have an early
exercise premium. Nevertheless, at-the-money options will generally satisfy the put–call
parity equation within narrow bounds.




10
Shortfall Factor Contributions
Richard Martin and Roland Ordovàs
Longwood Credit Partners and Sovereign Bank

The notion of the contribution of a position to the risk of a portfolio
is well understood as the sensitivity of risk to a fractional change in
position (for comprehensive reviews, see Tasche, 2007 and 2008). By
the Euler formula, these contributions add up to the risk, as the risk
is a homogeneous function1 of the asset allocations. It is also known
that when risk is to be understood as the standard deviation or
expected shortfall (ES) then, in the context of a factor or conditional-
independence model, the risk can be decomposed into a systematic
part and a positive unsystematic part (Martin and Tasche, 2007).
For the portfolio shortfall, defined as E[Y|Y > y] with Y being the
portfolio loss and y the corresponding value-at-risk, the systematic
part is E[μY|V | Y > y], where μY|V is the conditional expected loss (EL) of
the portfolio given the risk factors V. The purpose of this chapter is
to better understand the behaviour of this systematic part and, in
particular, its sensitivity to the underlying factors. For this reason,
we are only interested in the variation of μY|V with V, as any further
variation of Y conditional on μY|V is unsystematic risk and is not of
interest in this chapter.
It is convenient to have a particular model in mind, though none
of this chapter is model-specific. Consider the probit or Vasicek
model of credit default risk, or any kind of binary event risk:
⎛ −1 ⎞
N Φ ( p j ) − c j ⋅V ⎟
µY V = ∑ a jΦ ⎜⎜  (10.1)
j=1 ⎜ 1− cʹ′j ∑ c j ⎟⎟
⎝ ⎠

where pj is the unconditional default probability, V ~ N(0, S) is the


factor vector, aj is the exposure net of recovery and cj

165

10 Martin and Ordovas PCQF.indd 165 11/03/2013 10:14


post-crisis quant finance

is the correlation weight vector for the jth issuer.2 If risk factors
correspond to sectors, for example industrial sectors, and each
issuer corresponds to a unique sector, then we can proceed as
follows: as explained in Martin and Tasche (2007) and Martin (2009),
each issuer’s contribution can be split into a systematic and an
unsystematic part. The systematic parts are grouped by sector, and
are represented as the central segments of the “dartboard” in Figure
10.1, with an area proportional to the associated contribution. The
unsystematic parts, which are necessarily positive, are arranged
around the edge. If a few issuers belong to, say, two sectors (for
example, automotive and financial), then one can legitimately
subdivide their contribution appropriately between the sectors and
still arrive at the same kind of result, with only a small loss in clarity
as to what exactly is meant by sector contribution.
However, factors may not correspond to sectors in a simple way.
For example, the model 10.1 could be a model of credit card loans or
other types of retail loan. In that case, the factors in question might
be interest rates, GDP, unemployment, regional, foreign exchange
(for foreign currency loans), etc. Or, more generally, the portfolio
could be a fund of hedge funds, each with different styles (vola-
tility, momentum, value, etc). Then, each constituent is linked to
many factors, not just one. Again, one wants the sensitivity of risk
to each factor, thereby differentiating with respect to V rather than a.
Whereas simple Euler-type constructions are central to the theory
of position risk contribution, and have a certain appeal, they seem
to cause problems with factor contributions.
The first difficulty with the Euler construction is that, whereas
the derivative of a risk measure with respect to a parameter (such as
asset allocation) is easily defined, the derivative with respect to a
random variable (such as a risk factor) is not. On the other hand, mY|V
can be differentiated with respect to V because it is simply a func-
tion of V. It seems, therefore, that we should attempt to find
“contributions” only to quantities that can be represented as func-
tions of mY|V. Now the systematic part of the ES measure, previously
defined as E[mY|V|Y > y], is one such function, as it is a weighted sum
of mY|V over all values of V (explicitly, the weight is P[V = v|Y > y].
Indeed, some integration over V is inevitable, otherwise one ends
up with a contribution formula that is a random variable). Let us
now denote by R the function of V that we are trying to decompose,
and consider what the construction should look like. It is here that
we encounter further problems.
Consider, for example, the derivative with respect to the kth
factor, that is, ∂R/∂Vk. This is unlikely to be useful, because it has
dimensions reciprocal to Vk. In other words, if the units of Vk are
changed, or Vk is rescaled, the risk contribution changes. It is then
impossible to compare the contributions of different factors. If one
examines Vk∂R/∂Vk instead, one obtains a measure that is invariant
to scaling but dependent on the choice of origin of the factor, so it
changes if the factor’s origin is shifted (and by shifting each factor
appropriately, one can arrange for each factor to have zero contri-
bution, which seems fundamentally wrong). An improvement on
both these two ideas is to normalise by the standard deviation of
the factor, that is, something more like s [Vk]∂R/∂Vk. This now has the
required invariance to scaling and shifting, and has some merit,
though the standard deviation might not be the best quantity to
multiply by if one is more interested in tail risks, and it is a little ad
hoc.
There is also the issue of what the contributions add up to, because
whereas risk is a homogeneous function of asset allocations, it is not
in general a homogeneous function of the factors (and indeed it is not
in 10.1, though notably it is linear in CreditRisk+). So the contribu-
tions no longer add up to the risk. Arguably, this is not mandatory.
However, if they do not, it calls into question what the sensitivities
mean and why there is a “bit left over”. Whatever the academic
discussion (which, in view of the large amount of literature on Euler
allocations, is likely to rumble on for some time), there is little doubt
that those given the task of managing firm-wide risk like the contri-
butions to add up to the whole and that a method that does not satisfy
this condition is unlikely to gain general acceptance. None of the
simple constructions above does.
The literature on factor risk contributions is still very small, and
none of it follows the “Euler route”. Cherny and Madan (2006)
propose that the factor contribution to a risk measure be defined as
that risk measure applied to the conditional expectation of the port-
folio on the factor, and Bonti et al (2006) take what might be
described as a cut-down version of this by looking at the impact of
factor-driven stress scenarios. Subsequently, Cherny, Douady and
Molchanov (2009) propose regressing the portfolio return on
nonlinear functions of each single risk driving factor in turn, then
merging together the obtained estimates. They derive analytical
expressions for the solution for a Gaussian copula. A similar idea is
pursued by Rosen and Saunders (2009), who define nonlinear
contributions through a linear approximation, and make a connec-
tion with the concept of hedging. More recently, Rosen and Saunders
(2010) also suggest the Hoeffding decomposition of the loss random
variable into components that depend on all combinations of the
factors (ones, twos, etc), although this appears to generate compu-
tational difficulties arising from the fact that the decomposition is
into 2m components (m being the number of factors), which could be
prohibitively large for complex models, and their exposition
assumes independence of the factors, which may obscure the
interpretation.
In this chapter, we revisit the above idea of using Vk∂R/∂Vk for the
contribution, but overcome the lack of shift-invariance by placing
the risk factor’s origin in all possible positions and then averaging
the results, weighted by the density of the risk factor. The end
expression describes a marginal allocation formula, and has the
intuitive appeal of producing contributions that add up to the short-
fall minus the EL (see Figure 10.2). It is a direct generalisation of the
Euler decomposition, in that it reduces exactly to the Euler allo-
cation for homogeneous functions (not necessarily linear). A further
connection with the Euler allocation is provided by the explicit
representation of the factor risk contribution as the position risk
contribution of a hypothetical replicating instrument. We make no
assumptions about the factor distributions or dependency of loss on
the factors, and we do not need to assume that the factors are inde-
pendent. The method is in principle applicable to, for example, loan
portfolios, retail banking or portfolios of hedge funds.
We demonstrate it on different portfolio models: first the multi-
variate normal model, where it is particularly simple; then a
hypothetical portfolio of defaultable instruments under the model
10.1 to illustrate some of the features of the allocation formula, in
particular its behaviour at different percentiles. Finally we take a
real-life example from a retail banking portfolio composed of
typical products with the factors calibrated from a set of macr-
oeconomic indexes associated with the Spanish market. The
results show how the relative factor contributions to each level of


Figure 10.1 Risk dartboard for decomposition of portfolio risk into sectors and issuers by position (central segments: Financials, Industrials, Utilities, Telecoms; outer ring: individual issuer contributions)

Note: Sector factors are around the middle, unsystematic parts are around the edge

Figure 10.2 Risk dartboard for decomposition of portfolio expected shortfall into factors (central bullseye: EL; middle ring: Rates, GDP, Forex, Unemployment; outer ring: individual issuer contributions)

Note: Central bullseye is the expected loss, contributions from factors are shown in the middle ring, unsystematic parts are around the edge

loss depend on the interplay of the different variables governing
the portfolio loss and the nature of the products of which it is
composed.


Factor decomposition formula for ES less EL


Consider the identity, for some differentiable function f: Rm → R
(with x ∈ Rm a vector, shortly to represent the risk factor, and λ a
scalar):
$$\int_0^1 x\cdot(\nabla f)(\lambda x)\,d\lambda = f(x) - f(0) \qquad (10.2)$$

This amounts to the fundamental theorem of calculus.3 By writing
the dot product as a sum over its components, we identify a contri-
bution of the kth component to the variation of f between zero and
x. Now the origin (which in risk-factor space is not of special impor-
tance) can be replaced by z, to give:

$$\int_0^1 (x-z)\cdot(\nabla f)\big(\lambda(x-z)+z\big)\,d\lambda = f(x) - f(z)$$

and this can be decomposed in the same way:


$$\sum_{k=1}^{m}(x_k - z_k)\int_0^1 (\partial_k f)\big(\lambda(x-z)+z\big)\,d\lambda = f(x) - f(z)$$

We can now apply this to the conditional EL function in a factor risk
model. The function f is the conditional EL of the portfolio on the
factors. We replace x with the factor variable V, and z with an inde-
pendent copy of the variable, V° say, and then integrate V° out:

m ⎡ 1 ⎤
∑ E ⎢⎣(V − V ) ∫ (∂ f ) (λ (V − V ) + V ) dλ ⎥⎦
°
k k
°
k
° °

k=1 0

= f (V ) − E ⎡⎣ f (V ° )⎤⎦ = f (V ) − µY  (10.3)

where μY is the unconditional EL and the notation E° indicates that
only V° is being integrated out (not V as well). The reason for intro-
ducing a variable V°, and then integrating it out, is that in general
the function f must be many-to-one, so there is no unique V° satis-
fying f(V°) = μY. This gives us an expression for the variation of the
conditional EL function about the EL. Finally, to apply it to short-
fall, we take the expectation of the whole expression conditionally
on the portfolio loss Y exceeding the VAR y. This gives the final
expression:
m ⎡ ⎡ 1 ⎤ ⎤
∑ E⎢⎢E ⎢⎣(V
°
k ( ⎦
)
− Vk° ) ∫ (∂k f ) λ (V − V ° ) + V ° d λ ⎥ Y > y⎥
⎥⎦
k=1 ⎣ 0

⎡ ⎤
= E ⎣ f (V ) Y > y ⎦ − µY  (10.4)


which decomposes the systematic part of the ES, less the EL (which
it must always exceed), as a sum of factor components.
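
As a quick numerical sanity check of the identity behind 10.3 (a toy example of our own, not part of the original exposition), the following Python fragment verifies pathwise that the factor components add up to f(V) − f(V°) for a simple nonlinear conditional EL function and correlated normal factors:

```python
import numpy as np

rng = np.random.default_rng(1)
m, J = 2, 10_000                                  # two factors, J scenarios
Sigma = np.array([[1.0, 0.4], [0.4, 1.0]])
L = np.linalg.cholesky(Sigma)
V = rng.standard_normal((J, m)) @ L.T             # factor scenarios V
V0 = rng.standard_normal((J, m)) @ L.T            # independent copy V°

f = lambda v: np.exp(0.3 * v[:, 0] + 0.5 * v[:, 1])        # toy conditional EL
grad = lambda v: f(v)[:, None] * np.array([0.3, 0.5])      # its gradient

# k-th component: (V_k - V°_k) * int_0^1 (d_k f)(lambda (V - V°) + V°) dlambda,
# with the lambda-integral approximated by a 50-point midpoint rule
lam = (np.arange(50) + 0.5) / 50.0
contrib = np.zeros((J, m))
for l in lam:
    contrib += (V - V0) * grad(l * (V - V0) + V0) / len(lam)

print(np.max(np.abs(contrib.sum(axis=1) - (f(V) - f(V0)))))  # ~0 up to quadrature error
```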
We can justify that 10.2 is a direct generalisation of the Euler
formula by simply observing that it agrees for any p-homogeneous
function (with p > 0)4:
$$x_k \int_0^1 \frac{\partial}{\partial(\lambda x_k)}\big[f(\lambda x)\big]\,d\lambda = x_k \int_0^1 \lambda^{-1}\frac{\partial}{\partial x_k}\big[f(\lambda x)\big]\,d\lambda = \int_0^1 \lambda^{p-1}\,d\lambda\;\, x_k\frac{\partial}{\partial x_k}\big[f(x)\big] = \frac{1}{p}\,x_k\,\partial_k f$$

In fact we may work back from this to 10.2, as follows, thereby
establishing 10.2 as the only generalised Euler formula.5 To repli-
cate the Euler formula for each p > 0, the function f(x) = (c·x)^p must
decompose as Σ_{k=1}^m c_k x_k (c·x)^{p−1}. From the power series representation of
the exponential, we must therefore have the kth contribution of the
function6 exp(iu·x) as u_k x_k [exp(iu·x) − 1]/(u·x). Write f in terms of its
Fourier transform:
Fourier transform:

$$f(x) = \int_{\mathbb{R}^m} F(u)\,\exp(iu\cdot x)\,d[u], \qquad F(u) = \left(\frac{1}{2\pi}\right)^m \int_{\mathbb{R}^m} f(y)\,\exp(-iu\cdot y)\,d[y]$$

with d[u] denoting the volume element in u space. The kth contribu-
tion of this function is therefore:

$$\left(\frac{1}{2\pi}\right)^m \int_{\mathbb{R}^m} u_k x_k\, \frac{\exp(iu\cdot x)-1}{u\cdot x} \int_{\mathbb{R}^m} f(y)\exp(-iu\cdot y)\,d[y]\,d[u] = x_k \int_0^1 (\partial_k f)(\lambda x)\,d\lambda$$

the last stage following immediately from the Fourier inversion


theorem. The right-hand side of this expression is the same as 10.2.

Interpretation of formula via hedges


We do not think that the notion of contribution of a factor to overall
EL is meaningful. This may seem strange at first, but suppose for
example that the conditional loss is 2 + 3V1 + 4V2, where both V1 and
V2 have zero mean. As the EL (which is 2) is not coupled to the
factors, there is no obvious definition of a contribution to it: the
factors only describe the variation about the EL, and so does 10.4.


It is desirable to link factor contribution with hedging or replicating
instruments where possible, and such an interpretation is
possible with our construction. It is helpful to consider a specific
form of model, so let us assume that the dependency of the condi-
tional EL on the factors can be written Ψ(c·V) or a linear combination of
such terms (the latter extension will be clear in context). Indeed,
10.1 is of this form.
the instruments. In general these will need to be nonlinear func-
tions of V (except if Ψ is linear) and will not be expressible as a
simple additive combination “function(V1) + function(V2) + ...”,
though the kth instrument will depend primarily on Vk. Indeed, it
turns out that the kth instrument is essentially of the form Vk × (func-
tion of V). Writing f(V) = Ψ(c⋅V) in Equation 10.3 and performing the
integrals gives the following equation, in which we interpret the
summation as a sum over hypothetical “replicating instruments”:
$$\mu_{Y|V} - \mu_Y = \sum_{k=1}^{m} c_k h_k(V), \qquad h_k(V) = E^{\circ}\!\left[(V_k - V_k^{\circ})\,\frac{\Psi(c\cdot V) - \Psi(c\cdot V^{\circ})}{c\cdot V - c\cdot V^{\circ}}\right] \qquad (10.5)$$
Note that the expected payout of each instrument is zero (that is,
E[hk(V)] = 0, clear from anti-symmetry under the interchange V ↔
V°), and indeed this effect is anticipated in the discussion above: the
factors are there to describe the variation about the EL. It is straight-
forward to check that the position risk contribution of the kth
instrument, that is, the conditional expectation of ckhk(V) in 10.5 on
the loss exceeding the VAR, is simply the factor risk contribution
10.4. The last part of the expression for hk(V) is the gradient of the
chord joining (c⋅V°, Ψ(c⋅V°)) to (c⋅V, Ψ(c⋅V)) and therefore is a “discrete
derivative” of Ψ, and the (Vk – V°k ) term gives the kth component of
the variation. The expectation over V° takes into account the proba-
bility-weighted variation of Ψ between the arbitrary “reference
point” V° and V. The instruments together therefore replicate the
variation of the conditional EL around the EL and each one describes
in a reasonably natural way how much variation is due to the kth
component Vk.
We can now demonstrate 10.4 for two different well-known models.


Multivariate normal model


This can be written:

$$Y = \mu_Y + c\cdot V + U, \qquad V \sim N(0, \Sigma)$$

with U denoting the unsystematic risk. The decomposition of the
systematic part of ES into factors is easily calculated because ∇f = c,
which is constant. After performing the necessary algebra, one
finds that the kth factor contribution is simply:


$$\frac{c_k\,(\Sigma c)_k\;\phi\!\left(\Phi^{-1}(P^{+})\right)}{\sigma_Y\,P^{+}}$$

with P+ = P[Y > y]. Note that this decomposition is therefore essen-
tially the same as the logical decomposition of the standard
deviation, as the systematic part of the standard deviation is c′Σc/σY,
and the multiplier, the right-hand term of the above expression,
depends only on the choice of tail probability. The kth contribu-
tion therefore vanishes if the factor is uncorrelated with the
portfolio, as expected.
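
As a minimal sketch of this formula (our illustration, using scipy for the normal distribution), the vector of factor contributions can be computed in a few lines:

```python
import numpy as np
from scipy.stats import norm

def normal_factor_contributions(c, Sigma, sigma_u, p_plus):
    """k-th contribution c_k (Sigma c)_k phi(Phi^{-1}(P+)) / (sigma_Y P+)
    for Y = mu_Y + c.V + U, V ~ N(0, Sigma), U ~ N(0, sigma_u^2)."""
    sigma_y = np.sqrt(c @ Sigma @ c + sigma_u**2)        # total portfolio volatility
    mult = norm.pdf(norm.ppf(p_plus)) / (sigma_y * p_plus)
    return c * (Sigma @ c) * mult                        # one entry per factor

Sigma = np.array([[1.0, 0.4], [0.4, 1.0]])
print(normal_factor_contributions(np.array([3.0, 4.0]), Sigma, sigma_u=1.0, p_plus=0.05))
```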
The equivalence with the standard deviation measure in the
multivariate normal model was obtained for the decomposition of
ES into systematic and unsystematic parts (Martin and Tasche,
2007), so the fact that it carries over to this subdivision of the
systematic part is unsurprising.

Vasicek (probit) model and numerical examples


This is less tractable than the multivariate normal model and
requires numerical methods. We have from equation 10.1:7
$$\nabla f = -\sum_{j=1}^{N} \frac{a_j c_j}{\sqrt{1 - c_j'\Sigma c_j}}\,\phi\!\left(\Phi^{-1}(p_{j|V})\right)$$

with p_{j|V} shorthand for the conditional default probability, that is:

$$p_{j|V} = \Phi\!\left(\frac{\Phi^{-1}(p_j) - c_j\cdot V}{\sqrt{1 - c_j'\Sigma c_j}}\right)$$

In calculating 10.3, the integral over λ is simple, giving the kth
contribution as:


Table 10.1 Test portfolio description

Group  Exposure  Default prob.  Factor weight                 R²
                                1    2    3    4    5
1      50        0.4%           0.8  0.0  0.0  0.0  0.0       0.64
2      5         3%             0.0  0.4  0.1  0.0  0.1       0.26
3      4         3%             0.2  0.0  0.5  0.0  0.2       0.54
4      5         4%             0.1  0.0  0.2  0.5  0.0       0.45
5      6         4%             0.1  0.2  0.0  0.0  0.5       0.44
6      11        7%             0.2  0.2  0.2  0.1  0.2       0.44
7      8         7%             0.2  0.0  0.2  0.2  0.2       0.37
8      8         2%             0.2  0.1  0.1  0.2  0.1       0.27
9      8         5%             0.1  0.0  0.2  0.2  0.0       0.16
10     2         10%            0.2  0.3  0.2  0.2  0.2       0.66

N ⎧⎪ V − V °
∑a c j jk E° ⎨ k k

⎪⎩ c j ⋅ (V − V )
°
j=1

( j ) j ⎟ − Φ ⎛⎜ Φ−1 ( p j ) − c j ⋅V ° ⎞⎟⎤⎥⎫⎪⎬


⎡ ⎛ Φ−1 p − c ⋅V ⎞
⎢Φ ⎜
⎢⎣ ⎜⎝ 1− cʹ′j Σc j ⎟⎠ ⎜ 1− cʹ′j Σc j ⎟⎠⎥⎦⎪
⎝ ⎭
but the expectation over V° has to be calculated numerically. To
attend to this, we consider for any vector w the expression:

$$E^{\circ}\!\left\{\frac{w\cdot(V-V^{\circ})}{c\cdot(V-V^{\circ})}\left[\Phi\!\left(\frac{\Phi^{-1}(p) - c\cdot V}{\sqrt{1-c'\Sigma c}}\right) - \Phi\!\left(\frac{\Phi^{-1}(p) - c\cdot V^{\circ}}{\sqrt{1-c'\Sigma c}}\right)\right]\right\}$$
Write c·V° = θZ, with Z ~ N(0, 1), θ = √(c′Σc), and write w = (c′Σw/c′Σc)c +
u, where c·V° and u·V° are by construction independent. Only the
integration in the Z-direction is non-trivial, and the expectation
therefore emerges as the one-dimensional integral:
$$E_Z\!\left\{\frac{w\cdot V - (c'\Sigma w/\theta)\,Z}{c\cdot V - \theta Z}\left[\Phi\!\left(\frac{\Phi^{-1}(p) - c\cdot V}{\sqrt{1-\theta^2}}\right) - \Phi\!\left(\frac{\Phi^{-1}(p) - \theta Z}{\sqrt{1-\theta^2}}\right)\right]\right\}$$

for which a simple numerical routine can be written, in effect calcu-
lating the function:

$$\psi(g,h,x,y,\theta) := \int_{-\infty}^{\infty} \frac{g-hz}{y-\theta z}\left[\Phi\!\left(\frac{x-y}{\sqrt{1-\theta^2}}\right) - \Phi\!\left(\frac{x-\theta z}{\sqrt{1-\theta^2}}\right)\right]\phi(z)\,dz$$


Table 10.2 Covariance matrix of factors Σ

     1     2     3     4     5
1    1.00  0.40  0.40  0.40  0.40
2    0.40  1.00  0.43  0.43  0.43
3    0.40  0.43  1.00  0.47  0.47
4    0.40  0.43  0.47  1.00  0.48
5    0.40  0.43  0.47  0.48  1.00

(with an obvious interpretation when y = θz). The kth summand in
10.3 thereby emerges as:

$$\sum_{j=1}^{N} a_j c_{jk}\,\psi\!\left(V_k,\ \frac{(\Sigma c_j)_k}{\sqrt{c_j'\Sigma c_j}},\ \Phi^{-1}(p_j),\ c_j\cdot V,\ \sqrt{c_j'\Sigma c_j}\right) \qquad (10.6)$$

Finally, the ‘outer’ integration over Y has to be done to obtain 10.4.


In practice this is easily calculated along with the loss distribution.
We suggest the following scheme. First, fix some loss levels, say {yν}.
Next, run a Monte Carlo simulation in which at each sample the
following steps are performed.

❑ Sample8 the risk factors V.
❑ Calculate the conditional EL Σ_{j=1}^N a_j p_{j|V} and calculate 10.6 for each k.
❑ Calculate P[Y > y_ν|V] and E[Y1(Y > y_ν)|V] for each ν. (In the examples we run here, the actual loss equals the conditional EL because unsystematic risk is being neglected, so the probability is zero or one and the expectation is zero or Y according to whether each loss level is exceeded. In more general models, one uses analytical approximations to do this (see Martin, 2009).) This allows 10.4 to be estimated for each ν, k.

Once this has been done, one has at each loss level the probability of
exceeding it, and the factor contributions. The VAR, ES and contri-
butions for any level can then be found by interpolation.
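
The scheme can be sketched end-to-end in Python as follows (our illustration with placeholder inputs; unsystematic risk is neglected, as in the examples below, and the λ-integral inside 10.3 is approximated on a small midpoint grid rather than via the one-dimensional ψ-integral):

```python
import numpy as np
from scipy.stats import norm

def shortfall_factor_contributions(a, p, C, Sigma, levels, J=200_000, seed=0):
    """Monte Carlo estimate of the factor contributions 10.4 under model 10.1.
    a: exposures (N,), p: default probabilities (N,), C: factor weights (N, m),
    Sigma: factor covariance (m, m), levels: loss levels {y_nu}."""
    rng = np.random.default_rng(seed)
    Lc = np.linalg.cholesky(Sigma)
    V = rng.standard_normal((J, Sigma.shape[0])) @ Lc.T    # factor scenarios
    V0 = rng.standard_normal((J, Sigma.shape[0])) @ Lc.T   # independent copy V°
    s = np.sqrt(1.0 - np.einsum('jk,kl,jl->j', C, Sigma, C))  # sqrt(1 - c'Σc) per issuer

    cond_el = lambda v: norm.cdf((norm.ppf(p) - v @ C.T) / s) @ a
    Y = cond_el(V)                                 # loss equals the conditional EL here
    lam = (np.arange(20) + 0.5) / 20.0             # midpoint grid for the λ-integral
    contrib = np.zeros_like(V)
    for l in lam:
        W = l * (V - V0) + V0
        dens = norm.pdf((norm.ppf(p) - W @ C.T) / s) / s      # phi(...)/sqrt(1-c'Σc)
        contrib += (V - V0) * (-(a * dens) @ C) / len(lam)    # (V - V°)·(∇f)(W)
    for y in levels:
        tail = Y > y                               # scenarios beyond the VAR level
        fc = contrib[tail].mean(axis=0)            # factor contributions, 10.4
        print(y, fc, fc.sum(), Y[tail].mean() - Y.mean())     # sum ≈ ES − EL
```

The last printed comparison checks that the contributions add up to the shortfall minus the EL, as they must by construction.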
We showed that in the multivariate normal case, all shortfall
measures are essentially equivalent in that the tail probability has
no effect on the decomposition. In general, and indeed for the
probit model, this no longer holds. To demonstrate this, we take
the grouped portfolio described in Tables 10.1 and 10.2. By
grouping, we mean that we are assuming that each group in the
portfolio consists of many thousands of similar, individual


Figure 10.3 Shortfall allocation into portfolio constituents (1–10) for two percentiles (95% and 99.5%), numerically (top) and pictorially (bottom)

Note the higher risk from constituent 1 at the higher percentile

Figure 10.4 Shortfall allocation into factors (1–5) for two percentiles (95% and 99.5%), numerically (top) and pictorially (bottom)

Note: The central part of the chart is the expected loss, which of course is the same in the two percentiles. In the higher percentile, there is a significantly higher contribution from factor 1

exposures and is fine-grained enough to make the unsystematic
risk ignorable. This allows the exposures within any group (of
equally correlated exposures) to be added up and represented as a
single line item. The factor weight vectors (cj) are shown and the
is c′j Σcj (not simply |cj|2, as the factors are not orthogonal), for each j
from one to 10. The portfolio EL is exactly 3. The portfolio consists
of groups that are linked primarily to one factor each and a collec-
tion of other groups that are linked more generally. Group 1 is a
large exposure to a low-probability default event linked strongly
to factor 1.
The shortfall is calculated at two different tail probabilities,
0.5% and 5%, by Monte Carlo simulation (the grouping allows this
to be made very fast).9 The results are shown in Figures 10.3 and
10.4. Figure 10.3 shows the usual decomposition of shortfall into
constituents,10 and Figure 10.4 shows the decomposition into
factors using the methods shown here. Looking at the portfolio
model, it is reasonably clear that group 1 is a “tail event”, that is,
an unlikely but severe event. At a higher percentile (further into
the tail), we therefore expect it to show a larger risk contribution,
which is seen from Figure 10.3; this much is standard. The new

Table 10.3 Retail portfolio description

Product pool           Exposure  Def. prob.  Factor weight                                  R²
                                             GDP     Unempl.  Rates    CPI     Housing
Mortgages              32        3.5%        –0.176  0.123    0.001    0.108   0.122        0.14
Personal loans         21        5%          –0.071  0.115    0.057    0.044   0.035        0.05
Credit cards           3         6%          –0.103  0.108    0.029    0.016   0.056        0.05
Current a/c overdraft  2         7%          –0.093  0.083    0.042    0.021   0.027        0.03
Savings a/c overdraft  2         7%          –0.086  0.095    0.099    0.090   –0.092       0.05
SME                    27        0.80%       –0.200  0.004    0.023    0.054   0.023        0.05
Other                  13        5.5%        –0.123  0.045    –0.007   –0.030  0.065        0.03

Table 10.4 Covariance matrix of macroeconomic factors Σ

          GDP    Unempl.  Rates  CPI    Housing
GDP       1.00   –0.45    –0.24  –0.09  –0.30
Unempl.   –0.45  1.00     0.10   0.33   0.39
Rates     –0.24  0.10     1.00   0.42   0.10
CPI       –0.09  0.33     0.42   1.00   0.43
Housing   –0.30  0.39     0.10   0.43   1.00


part of our analysis is that factor 1, to which group 1 is most
strongly linked, should contribute a larger proportion of the risk,
which is seen from Figure 10.4. Analogous behaviour is demon-
strated in Martin and Tasche (2007) in the context of unsystematic
risk, where a single exposure to an unlikely event contributes
more substantially to unsystematic tail risk. In this chapter, the tail
risk is coming from sectorial or “factorial” concentrations instead.
Moving on to an example from real life, we consider a retail
portfolio driven by Spanish macroeconomic factors as described
in Tables 10.3 and 10.4. The dispersion of exposures (net of
recovery) among the different products is a typical one, adding to
a total of 100 units. The systematic drivers are the Spanish GDP,
unemployment, one-year interest rate, consumer price index
(CPI) and housing price index. The calibration of 10.1 to these
indexes is based on a 16-year window of monthly returns without
introducing any orthogonalisation, so the covariance matrix is
not diagonal (Tables 10.3 and 10.4). As the dependent variable,
the historical default rate for each product is used whenever
available and otherwise reconstructed with the aid of the global
Spanish mortality rate, which then acts like a “proxy”. On the
whole, the variables that dominate are, as suspected, GDP and
unemployment, especially affecting personal finance. Credit
positions to small enterprises appear to be less sensitive to unem-
ployment and concentrate their sensitivity to GDP. The asset

Figure 10.5 Proportion of risk in each factor (GDP, Unempl., Rates, CPI, Housing) in the base case (solid lines) and rebalanced portfolio (dashed lines, marked *); y-axis: factor contribution (%), 0–60; x-axis: expected shortfall minus expected loss, 0–14


Figure 10.6 Loss distributions for base case and improved (rebalanced) portfolio; y-axis: tail probability, from 1 down to 0.001 (log scale); x-axis: expected shortfall minus expected loss, 0–14

correlations thus considered are similar to those proposed within
the Basel II Accord for these product types (Bank for International
Settlements, 2005).
From the composition of the portfolio, we anticipate that the
relative contribution of the unemployment factor is more impor-
tant for lower loss levels given that the products that exhibit a
major contribution to this factor (that is, mortgages and personal
loans) also tend to have larger probabilities of default and should
enter earlier in the losses as we move down the tail. This is what is
observed in Figure 10.5, where the solid lines show the factor
contributions, as percentages of risk, at different loss levels. At
higher loss levels (further into the tail), the relative GDP contribu-
tion increases from the presence of the significant credit positions
to small and medium-size enterprises (SMEs), which are mainly
GDP-sensitive and have the lowest probability of default in the
portfolio.
We can also use the results to rebalance the portfolio and improve
performance, in the sense of reducing risk for a given total expo-
sure. Intuitively, it seems that the portfolio should be rebalanced
away from the GDP and employment factors, which leads us to
consider a reduction in exposure to mortgages and SME lending in
favour of the other asset classes (which are more related to consumer
finance). This can be done by, for example, changing the exposure
mix from (32, 21, 3, 2, 2, 27, 13) in the base case to (10, 36, 14, 7, 7, 13,
13). In the absence of revenue information, it is not clear how to say


anything well-defined about optimality of performance. However,
it is clear that the risk decreases, as desired. For example, the reduc-
tion is about 10% at 99.5% confidence (see Figure 10.6). Notice,
however, that the risk contribution from the employment factor
increases (dashed lines), and ostensibly this portfolio is mainly
about a trade-off between GDP and employment factors, the former
being the primary driver of corporate defaults and the latter the
primary driver of the performance of consumer finance.
A similar analysis, where the “factorial” concentration is of
interest to financial institutions and the regulator, could be
performed on particular portfolios that may embed large amounts
of systematic risk, such as books of mortgages stemming from
differently behaved geographical regions.

Conclusions
We have demonstrated how to decompose the systematic part of ES
among its dependent risk factors in arbitrary models for which the
simple Euler formula no longer holds. The decomposition is done
using a direct generalisation of the Euler formula (Equations 10.2
and 10.4), which reduces to the Euler formula for any function that
is homogeneous of positive degree.
In the context of the multivariate normal model, the decompo-
sition we have exhibited is identical to a simple differentiation of
the variance with respect to the risk factors. This identity is lost in
more general models: one can easily find examples in which
contributors to some percentiles are less significant at others and
vice versa.

  1 A function f is p-homogeneous if f(θx) = θ^p f(x) for all θ > 0. The Euler formula is Σ_k x_k ∂_k f = pf.
  2 For a logit model, one replaces Φ(...) by the function 1/(1 + e^(−x)) and corrects the Φ^(−1)(p_j) term appropriately, so the same remarks will apply to that model too.
  3 In compound expressions, we distinguish ∇ and ∂_k (differentiate with respect to the kth argument), which operate on the function, from (∂/∂x_k), which operates on a whole expression containing x. For example, if f(x) = sin(2x₁ + 3x₂) then (∂₁f)(4x₁, 5x₂) = 2cos(8x₁ + 15x₂) and (∇f)(4x₁, 5x₂) = (2, 3)cos(8x₁ + 15x₂), but (∂/∂x₁)[f(4x₁, 5x₂)] = 8cos(8x₁ + 15x₂). Putting brackets around the ∇f helps to clarify this.
  4 Note again the importance of distinguishing between differentiating with respect to x and with respect to λx.
  5 The argument in the rest of this section is not used subsequently, so can be skipped.
  6 As usual, i = √(−1).
  7 Note that c_j is a vector and its kth component is written c_jk.


  8 Standard techniques such as importance sampling can be used (see, for example,
Glasserman, 2005).
  9 One million simulations were used.
10 The ES contribution of a constituent is its EL conditionally on portfolio loss exceeding the
VAR.

REFERENCES

Bank for International Settlements, 2005, “An Explanatory Note on the Basel II IRB Risk
Weight Functions,” BIS, July (available at www.bis.org/bcbs/irbriskweight.pdf).

Bonti G., M. Kalkbrenner, C. Lotz and G. Stahl, 2006, “Credit Risk Concentrations
Under Stress,” Journal of Credit Risk, 2(3), pp 115–36.

Cherny A., R. Douady and S. Molchanov, 2009, “On Measuring Nonlinear Risk with
Scarce Observations,” Finance & Stochastics, 14(3), pp 375–95.

Cherny A. and D. Madan, 2006, “Coherent Measurement of Factor Risks,” May 2 (available at http://arXiv.org/abs/math/0605062v1).

Glasserman P., 2005, “Importance Sampling for Portfolio Credit Risk,” Management
Science, 51(11), pp 1,643−56.

Martin R., 2009, “Shortfall: Who Contributes and How Much?” Risk, October, pp 84−89.

Martin R. and D. Tasche, 2007, “Shortfall: A Tail of Two Parts,” Risk, February, pp 84−89.

Rosen D. and D. Saunders, 2009, “Analytical Methods for Hedging Systematic Credit
Risk with Linear Factor Portfolios,” Journal of Economic Dynamics & Control, 33(1), pp
37−52.

Rosen D. and D. Saunders, 2010, “Risk Factor Contributions in Portfolio Credit Risk Models,” Journal of Banking & Finance, 34(2), pp 336−49 (also available at www.defaultrisk.com).

Tasche D., 2007, “Euler Allocation: Theory and Practice” (available at www.defaultrisk.com).

Tasche D., 2008, “Capital Allocation to Business Units and Sub-portfolios: The Euler
Principle In Pillar II,” in A. Resti (Ed), The New Basel Accord: The Challenge of Economic
Capital (London: Risk Books): pp 423–53.

11
Stressed in Monte Carlo
Christian Fries
DZ Bank

A stress test is an important tool for assessing risk in a portfolio. In
this chapter, we consider a stress test implemented by an evaluation
under stressed model parameters. These could stem from a calibra-
tion to stressed market data created by a historical simulation for
value-at-risk (or some other risk measure), for instance.
From the perspective of the numerical valuation (mark-to-model)
of derivatives, the valuation under stressed market data is
demanding, as the calibration procedure may break down and the
numerical valuation method itself may break down.
We focus on the latter, that is, valuation under stressed model
parameters. We will compare Monte Carlo with partial differential
equation (PDE) valuation and propose a new, robust variant: Monte
Carlo simulation with boundary conditions.
When valuation models are used as part of, for example, VAR
calculation or stressed VAR, we may easily end up feeding our
numerical algorithm with 10-year volatilities of 50% or more.1
While one may question the level of the stressed data by itself, we
like to consider another aspect: is it safe to feed a numerical valu-
ation model such as a Monte Carlo simulation with stressed
parameters?

Why Monte Carlo fails for stressed data


The way a Monte Carlo valuation algorithm can fool you can be
observed for even the simplest model and the simplest product:
valuation of a call option under a Black–Scholes model. If stressed
model parameters are used, for example, stressed (local) volatility,


then standard error estimates can report that the result is very accu-
rate, but the result can be completely wrong.
Consider the Black–Scholes model:
$$dS(t) = rS(t)\,dt + \sigma S(t)\,dW(t) \qquad (11.1)$$
The exact solution of this stochastic differential equation is known
analytically and using an Euler scheme for log(S), we get the time
discretisation scheme:
$$S(t_{i+1}) = S(t_i)\,e^{\,r\Delta t_i - \frac{1}{2}\sigma^2\Delta t_i + \sigma\Delta W(t_i)}$$
which is the exact solution, that is, the Euler scheme has no discreti-
sation error. Hence, a Monte Carlo simulation using this scheme
bears only the Monte Carlo error, which can be assessed by a simple
standard error estimate.
Because of its functional form, Black–Scholes paths tend to zero
as T → ∞. The convergence is quicker and more easily observed
under extreme volatility scenarios (see Figure 11.1).
For fixed maturity and Brownian sample, the paths also converge
to zero as σ → ∞. This may come as a surprise. Intuitively, one might
expect that, since the distribution widens, the paths should widen.

Figure 11.1 Sample paths of Monte Carlo simulation of the Black−Scholes model with high volatility (y-axis: value, 0–20; x-axis: simulation time, 0–10)

Note: All paths share the same volatility but are generated with different Brownian paths. The paths flatten as maturity increases. For an animation of the effect, see Fries (2010)


The effect is easy to understand. For a given, fixed volatility σ = σ₀, the function:

$$g : (\sigma, z) \mapsto -\frac{1}{2}\sigma^2 T + \sigma\sqrt{T}\,z$$

increases monotonically in z (the random number). But for a fixed
random number z = z₀, the function is a parabola with g(σ, z₀) → −∞
as σ → ∞.
Of course, higher volatility should be accompanied by a larger
number of paths, and Monte Carlo simulation itself assumes that
the number of paths should tend to infinity, not the volatility.
However, in practice the number of paths is limited (for example,
by computing power or available memory) while the model
parameter is not restricted. Hence, it is tempting to apply a stress
test to a model without adapting its numerical properties. If the
numerical algorithm does not adapt to the parameter change in a
sophisticated way, the result may be surprising.
The behaviour that all (fixed) paths tend to zero is a result of the
assumed lognormal dynamics. It may be questionable whether the
model is still appropriate in a stressed environment. However, the
behaviour is not unrealistic. The model now resembles the model-
ling of a credit event, where we have high(er) probability of default
and a rare probability of survival where huge values occur in order
to match the assumed average return of the distribution.
As a consequence, looking at the valuation of a European-style
call we have the (discounted) payout V(T, ω) := max(S(T, ω) − K, 0)
exp(−rT) converging to zero pointwise in ω as σ → ∞. Hence, the
Monte Carlo valuation:

$$V(T_0) := \frac{1}{n}\sum_i V(T, \omega_i)$$

converges to zero for σ → ∞, while the correct limit for the call
option is V(0) = S(0). In addition, the standard estimate for the
Monte Carlo error:

$$\mathrm{ErrorEst} := \sqrt{\frac{1}{n^2}\sum_i \big(V(T,\omega_i) - V(T_0)\big)^2}$$
converges to zero for σ → ∞.
So, applying a stress to the volatility, then re-evaluating the
product, can result in the Monte Carlo error estimator reporting a


good level of accuracy, but the Monte Carlo valuation being
completely wrong.
Note that standard importance sampling does not solve the
problem, because the relevant Monte Carlo weights (likelihood
ratios) converge to zero too.
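
The breakdown is easy to reproduce. The following self-contained Python experiment (ours, not from the chapter) prices a 10-year at-the-money call using the exact lognormal terminal distribution and reports the estimator with its standard error as volatility is stressed:

```python
import numpy as np

def bs_call_mc(S0=1.0, K=1.0, r=0.0, T=10.0, sigma=0.5, n=100_000, seed=42):
    """Plain Monte Carlo price of a European call under Black-Scholes,
    using the exact terminal distribution (no discretisation error)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    payoff = np.exp(-r * T) * np.maximum(ST - K, 0.0)
    return payoff.mean(), payoff.std(ddof=1) / np.sqrt(n)

for sigma in (0.5, 1.0, 2.0, 4.0):
    price, stderr = bs_call_mc(sigma=sigma)
    # the true value tends to S0 = 1 as sigma grows, yet the estimate collapses
    # towards zero while the reported standard error shrinks as well
    print(f"sigma={sigma:4.1f}  MC price={price:8.4f}  std err={stderr:8.4f}")
```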

Why a PDE is more robust


The picture is different if we evaluate using a PDE. Let us consider
the corresponding Black–Scholes PDE:

$$\frac{\partial V}{\partial t}(t,S) + rS\frac{\partial V}{\partial S}(t,S) + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}(t,S) = rV(t,S) \qquad (11.2)$$

on some bounded domain [0, T] × A together with a linear extra-
polation at the boundary ∂A, that is, we assume the boundary
condition:

$$\frac{\partial^2 V}{\partial S^2}(t,S) = 0$$

on ∂A. Using a space discretisation S(t_i) ∈ {S₀, S₁, ... , S_{m−1}} to discretise
the PDE by an implicit Euler scheme, we find:

$$(1-\tilde L)\,V(t_i) = (1+r\Delta t)^{-1}\,V(t_{i+1})$$

that is:

$$V(t_i) = (1-\tilde L)^{-1}(1+r\Delta t)^{-1}\,V(t_{i+1})$$

Figure 11.2 Monte Carlo simulation restricted to inbound domain (sample paths ω1–ω4 of X across time steps t0–t5, with the outbound regions B above and below the boundary)


where V(t_i) := (V(t_i, S₀), ... , V(t_i, S_{m−1}))ᵀ is the value vector and L̃ is a
tri-band matrix with:

$$\tilde L_{i,i\mp 1} = \frac{S_i^2\sigma^2 \mp S_i r\,\Delta S}{2\,\Delta S^2}$$

and:

$$\tilde L_{i,i} = -\frac{S_i^2\sigma^2}{\Delta S^2}$$

for i = 1, ... , m − 2 and L̃_{i,j} = 0 otherwise, and with ΔS = S_{j+1} − S_j (assuming
equidistant space discretisation).
For σ → ∞, the PDE scheme converges to (setting r = 0 without
loss of generality):

$$V(t_i, S_j) = \frac{S_j - S_l}{S_u - S_l}\,V(t_{i+1}, S_u) + \frac{S_u - S_j}{S_u - S_l}\,V(t_{i+1}, S_l)$$

where S_l = S₀(t_{i+1}) (lower bound) and S_u = S_{m−1}(t_{i+1}) (upper bound) (see
Fries, 2010). In other words, in the limit we have a linear inter-
polation of the boundary values. Thus, the PDE recovers at least all
linear payouts in the limit for large σ.
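
For concreteness, here is a compact Python version of such an implicit scheme (our sketch; the grid sizes, the placement of Δt inside the tri-band matrix and the treatment of the boundary rows are implementation choices):

```python
import numpy as np

def bs_call_pde(S0=1.0, K=1.0, r=0.0, sigma=2.0, T=10.0, m=200, n=100, Smax=6.0):
    """Implicit Euler finite-difference solver for the Black-Scholes PDE with
    zero-second-derivative (linear extrapolation) boundary conditions."""
    S = np.linspace(0.0, Smax, m)
    dS, dt = S[1] - S[0], T / n
    V = np.maximum(S - K, 0.0)                       # terminal call payoff
    A = np.zeros((m, m))
    for i in range(1, m - 1):                        # tri-band operator, times dt
        A[i, i - 1] = dt * (S[i]**2 * sigma**2 - S[i] * r * dS) / (2 * dS**2)
        A[i, i + 1] = dt * (S[i]**2 * sigma**2 + S[i] * r * dS) / (2 * dS**2)
        A[i, i] = -dt * S[i]**2 * sigma**2 / dS**2
    M = np.eye(m) - A
    M[0, :3] = [1.0, -2.0, 1.0]                      # d2V/dS2 = 0 at both boundaries
    M[-1, -3:] = [1.0, -2.0, 1.0]
    rhs_mask = np.ones(m)
    rhs_mask[[0, -1]] = 0.0                          # boundary rows carry zero rhs
    for _ in range(n):                               # backward time-stepping
        V = np.linalg.solve(M, rhs_mask * V / (1 + r * dt))
    return np.interp(S0, S, V)

print(bs_call_pde())   # remains of order S0 instead of collapsing to zero
```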

Figure 11.3 European-style option under Black−Scholes model, super-hedge boundary condition at barrier = 3, 10-year option in 100 time steps (y-axis: option value, 0.0–1.1; x-axis: volatility, 0.0–4.0; shown: standard Monte Carlo valuation (light grey), Monte Carlo valuation with super-hedge boundary condition (black), analytic benchmark (grey))

Note: In this example, the super-hedge boundary condition induces a significant overpricing at medium volatilities. As with a lattice-based PDE solver, this effect can be reduced by widening the sampling region and moving the boundary further away. The choice of the boundary position has to balance induced bias and gained convergence (see Figure 11.4)


With this PDE scheme, the valuation of a call with strike K ∈ [Sl,
Su] will converge to:

$$\frac{S(0)-S_l}{S_u-S_l}\,(S_u-K) + 0\cdot\frac{S_u-S(0)}{S_u-S_l} = S(0)\,\frac{S_u-K}{S_u-S_l} - S_l\,\frac{S_u-K}{S_u-S_l}$$

and for S_l = 0 (lower bound) to:

$$S(0) - \frac{K}{S_u}\,S(0) \qquad (11.3)$$

which is close to the true value S(0) when S_u is sufficiently large. If
the upper bound grows with σ, it will converge to the correct value.
So, why is the PDE more robust against stresses of the model
data? The answer is that the high volatility pushes probability to
the boundaries and the PDE has a simple analytic rule for its
boundary values. Can we add boundary conditions to a Monte
Carlo simulation?

Monte Carlo simulation scheme with boundary conditions
Fries and Kienitz (2010) propose a Monte Carlo simulation scheme
that features boundary conditions from which they derive a
boundary value process for the underlying value process. The
complete set-up consists of four parts:

❑ The definition of a boundary and the corresponding inbound regime A and outbound regime B. This is done for each time step t_i.
❑ The definition of an inbound Monte Carlo scheme for which all paths remain within the boundary, that is, within A.
❑ The definition of a boundary condition that defines the value process V in the region B and its valuation conditional to being in A at the time step before.
❑ A modified pricing algorithm that allows us to evaluate the product using the Monte Carlo scheme within the boundary conditions, adding the boundary value process.

Monte Carlo scheme restricted to inbound regime


Consider Monte Carlo valuation of t → V(t) derived from model
primitives t → X(t) (for example, underlying(s)), discretised at:



0 = t0 < t1 < t2 < t3 < ...

Let the Monte Carlo simulation be modified to sample only:

$$A := \{\omega \mid X(t_i,\omega) \in A_i\ \forall i\}$$

for some given sets Ai. Let Bi denote the domain X(ti, Ω)\Ai. The situ-
ation is sketched in Figure 11.2 and we refer to Fries and Kienitz
(2010) on how to construct such a Monte Carlo simulation. The
construction is similar to Fries and Joshi (2011). See also Glasserman
and Staum (2001) and Joshi and Leung (2007).

Modified valuation algorithm using restricted Monte Carlo simulation and boundary value process
As a next step, we describe how to evaluate a financial product
using the inbound Monte Carlo simulation. We specify a backward
induction to determine V(T0), where t → V(t) denotes the product’s

value process. To do so we will make assumptions about the
boundary value of that process. These assumptions will be consid-
ered in the following sections.
Let V i(t) denote the value of the financial product at time t
(excluding cashflows in T < Ti). Let Ci(Ti) denote the time Ti value of
the cashflows/change in value in [Ti, Ti+1), that is:

$$C^i(T_i) = V^i(T_i) - V^{i+1}(T_i)$$

where:2
$$V^{i+1}(T_i) = N(T_i)\, E^{Q}\!\left[\frac{V^{i+1}(T_{i+1})}{N(T_{i+1})}\,\Big|\,\mathcal{F}_{T_i}\right]$$

Furthermore, let:

$$V^{out,i+1}(T_{i+1}) := V^{i+1}(T_{i+1})\,\mathbf{1}_{B_{i+1}}, \qquad V^{in,i+1}(T_{i+1}) := V^{i+1}(T_{i+1})\,\mathbf{1}_{A_{i+1}}$$

Since Ai+1 ∪ Bi+1 = X(Ti+1, Ω), we have:

$$V^{i+1}(T_i) = V^{out,i+1}(T_i) + V^{in,i+1}(T_i)$$

Here V out,i+1 is the value of the paths ending in the outbound domain
in time Ti+1, and V in,i+1 is the value of the paths ending in the inbound
domain in time Ti+1.


Figure 11.4 European-style option under Black−Scholes model, super-hedge boundary condition at barrier = 6, 10-year option in 100 time steps (y-axis: option value, 0.0–1.1; x-axis: volatility, 0.0–4.0; shown: standard Monte Carlo valuation (light grey), Monte Carlo valuation with super-hedge boundary condition (black), analytic benchmark (grey))

We make the assumption that there is an analytic formula for
V^out,i+1(T_i) (or an approximation thereof), at least when X(T_i) is in the
inbound domain.3
Then we define the modified valuation algorithm recursively (in
a backward algorithm) as Ṽ^in,i(T_i) = 0 on B_i and on A_i:

$$\tilde V^{in,i}(T_i) := V^{out,i+1}(T_i) + \tilde V^{in,i+1}(T_{i+1})\,\frac{N(T_i)}{N(T_{i+1})}\,\mathbf{1}_{A_{i+1}} + C(T_i)\,\mathbf{1}_{A_i}$$
given some final value Ṽ^in,n(T_n).
The above backward induction defines Ṽ^in,i(T_i) under the assump-
tion that X(T_i) is inbound. Note that C(T_i) has to be evaluated only
on A_i and that given V^out,i+1(T_i) (or an approximation thereof), Ṽ^in,i(T_i)
can be constructed from the modified (inbound) Monte Carlo
simulation.
On {X(Ti) ∈ Ai} we have (by backward induction):
⎛ V in, i (T ) ⎞ 1
E Q ⎜⎜
N ( T )
i
FTi ⎟⎟ =
N (T )
(V out, i+1 (Ti ) + V in, i+1 (Ti ) + C (Ti ))
⎝ i ⎠ i

1 V i (Ti )
=
N (Ti )
( V i+1 (Ti ) + C (Ti )) =
N (Ti )

Given that our Monte Carlo simulation starts within the bounda-
ries, the value of the product at T0 is then:


Figure 11.5 European-style option under Black−Scholes model, sub-hedge boundary condition at barrier = 6, 10-year option in 100 time steps (y-axis: option value, 0.0–1.1; x-axis: volatility, 0.0–4.0; shown: standard Monte Carlo valuation (light grey), Monte Carlo valuation with sub-hedge boundary condition (black), analytic benchmark (grey))

$$V(T_0) = E^{Q}\!\left[\tilde V^{0}(T_0)\,\big|\,\mathcal{F}_{T_0}\right]$$
If the product has an early exercise or any other payout conditional
on its future value, that can be incorporated in C(Ti).
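
Structurally, the backward induction can be outlined as follows (a pseudo-Python sketch of our own; the inbound path generator, the cashflow array and the analytic boundary formula v_out are assumed to be given):

```python
import numpy as np

def restricted_mc_value(X, inbound, cashflow, v_out, numeraire):
    """Backward induction of the modified valuation algorithm.
    X         : inbound Monte Carlo paths, shape (paths, times)
    inbound   : boolean indicator of {X(T_i) in A_i} per path and time
    cashflow  : C(T_i) per path and time
    v_out     : v_out(i, x) -> analytic boundary value V^out,i+1(T_i)
    numeraire : N(T_i) per path and time"""
    n_paths, n_times = X.shape
    v_in = np.zeros(n_paths)                         # final value V~^in,n(T_n)
    for i in range(n_times - 2, -1, -1):
        carried = v_in * numeraire[:, i] / numeraire[:, i + 1] * inbound[:, i + 1]
        v_in = (v_out(i, X[:, i]) + carried + cashflow[:, i]) * inbound[:, i]
    return v_in.mean()                               # estimate of V(T_0)
```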

Definition of boundary value process


The missing link is the definition of the boundary value process
V out,i+1(Ti). One possible approach to defining it is as follows.
Determine a functional representation of V out,i+1(Ti+1), for example:

$$V^{out,i+1}(T_{i+1}) = G\big(T_{i+1}, X(T_{i+1})\big)$$

and derive valuation formulas for V^out,i+1(T_i) using the transition
probabilities of the numerical scheme, that is:

$$V^{out,i+1}(T_i) = N(T_i)\,E^{Q}\!\left[\frac{G\big(T_{i+1}, X(T_{i+1})\big)}{N(T_{i+1})}\,\Big|\,X(T_i)\right]$$

Example: linear boundary conditions on a scalar underlying


Let X(t) be real values. For the linear boundary functionals:
$$V(T_{i+1}, x) = ax + b \quad \text{for } x < x_l, \qquad V(T_{i+1}, x) = cx + d \quad \text{for } x > x_u$$

the conditional values of the payouts are given by option prices. It
is:


$$E^{Q}\!\left[\frac{V^{out,i+1}(T_{i+1})}{N(T_{i+1})}\,\Big|\,X(T_i)\right] = (b + ax_l)\,DP\big(X(T_i), x_l, T_{i+1}-T_i\big) - a\,P\big(X(T_i), x_l, T_{i+1}-T_i\big) + (d + cx_u)\,DC\big(X(T_i), x_u, T_{i+1}-T_i\big) + c\,C\big(X(T_i), x_u, T_{i+1}-T_i\big)$$

where DP(X, K, T) is the value of a digital put, DC(X, K, T) is the
value of a digital call, P(X, K, T) is the value of a put, C(X, K, T) is the
value of a call, each with spot X, strike K and maturity T.
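
Under Black–Scholes dynamics for X, all four building blocks have closed forms, so the boundary value can be assembled directly. A minimal Python sketch (ours; the function names and the sanity check are illustrative):

```python
import numpy as np
from scipy.stats import norm

def linear_boundary_value(X, xl, xu, a, b, c, d, T, r, sigma):
    """Value of the linear boundary functional (ax + b below xl, cx + d above
    xu) under Black-Scholes dynamics, assembled from digital and vanilla
    prices as in the formula above."""
    disc = np.exp(-r * T)
    def d12(K):
        d1 = (np.log(X / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
        return d1, d1 - sigma * np.sqrt(T)
    d1l, d2l = d12(xl)
    d1u, d2u = d12(xu)
    DP = disc * norm.cdf(-d2l)                           # digital put struck at xl
    DC = disc * norm.cdf(d2u)                            # digital call struck at xu
    P = xl * disc * norm.cdf(-d2l) - X * norm.cdf(-d1l)  # put struck at xl
    C = X * norm.cdf(d1u) - xu * disc * norm.cdf(d2u)    # call struck at xu
    return (b + a * xl) * DP - a * P + (d + c * xu) * DC + c * C

# sanity check: a globally linear payout V(x) = x is priced back to the forward
print(linear_boundary_value(1.0, 1.0 - 1e-9, 1.0 + 1e-9, 1, 0, 1, 0, T=1.0, r=0.0, sigma=0.3))
```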

Some boundary assumptions


To complete the definition of the boundary value process, we have
to specify how to determine the functional representation (in the
above example, a, b, c, d). A PDE scheme usually determines the
(linear) extrapolation of the value function by evaluating the neigh-
bouring points. For a Monte Carlo simulation, it is not obvious how
the linear boundary value functional x → V^out,i(t_i, x) can be
determined.
We propose three different variants: analytic calculation, which
is only possible for products with analytic valuation formulas and
is useful for benchmarking; sub-hedge or super-hedge, which is
product-dependent, but easy to find and gives fast, good results;
and numerical calculation, which is product-independent, similar
to PDEs, and gives good results, but may be unstable.

Analytic linear extrapolation of the boundary value functional


We can derive a linear boundary value functional analytically if we
have an analytic valuation formula and we have an analytic formula
for its delta.
This is, for example, the case for simple European-style options.
Of course, one would not use a Monte Carlo simulation at all in this
case. However, we will consider this case to benchmark our method.

Boundary value functional as sub-hedge or super-hedge


This is a simple but nevertheless very useful definition of the
boundary value functional.
Assume we have Gsup and/or Gsub such that on Bi+1:

$$G^{sub}\big(T_{i+1}, X(T_{i+1})\big) \le V^{i+1}(T_{i+1})$$

and:


$$V^{i+1}(T_{i+1}) \le G^{sup}\big(T_{i+1}, X(T_{i+1})\big)$$

where V i+1 denotes the true value of the future cashflows of the
product under consideration. Then, using Gsub in place of G as an
approximation of V out,i+1(Ti+1), we get a lower bound of the true
option price. Using Gsup in place of G as an approximation of
V out,i+1(Ti+1), we get an upper bound of the true option price.
Both methods will only give bounds for the true option prices;
however, the deviation of the value depends on the location of the
boundary. If the boundary is distant, the induced error is small.
The value returned by the Monte Carlo valuation with super-
hedge boundary conditions can be interpreted as the costs of a
corresponding hedge strategy: using dynamic hedging below a
certain barrier and converting to a safe static super-hedge (at addi-
tional costs) once a barrier has been crossed.

Numerical linear extrapolation of the boundary value functional


In the general case, a linear extrapolation of the boundary value
functional can be determined numerically, for example, using a
regression.
Such a numerical calculation of the boundary value process can
be designed to be completely product-independent. The method
then resembles more closely the approach taken by a PDE, where
the extrapolation of the value functional is determined
numerically.
However, the numerical calculation of the boundary value
process using a simple regression is not suitable for our application
to stress testing. This is easy to see: the plain regression on the
unmodified Monte Carlo simulation will suffer from the same
degeneration as the standard Monte Carlo valuation. See Fries and
Kienitz (2010) for an example.

Numerical results
We compare a standard Monte Carlo simulation (cross-hatch), the
analytic benchmark (grey) and the Monte Carlo simulation with
boundary conditions (black) for the valuation of a European-style
call under the Black–Scholes model. While this is a simple product
(with an analytic formula), the Monte Carlo algorithm is fully
fledged using fine steps (even where European-style options require
only one time step). We check the behaviour for increasing


volatility. The figures show the mean (grey line) and standard error
estimate (black/light grey area).
In Figure 11.3, we see that a Monte Carlo simulation with a super-
hedge boundary condition (black) converges to the analytic value
(grey), while the standard Monte Carlo simulation breaks down
(cross-hatch). Figure 11.4 shows how the error induced by the
super-hedge assumption decreases when the distance from the
barrier increases (from three in Figure 11.3 to six in Figure 11.4).
Figure 11.5 shows a simple sub-hedge boundary condition (V(t,
S) = S – K). The error induced is around 0.1, which is still much less
than that of a corresponding PDE valuation, which would give
1.05/6 = 0.175 (compare 11.3).

Other applications and conclusion


Monte Carlo simulation with boundary conditions can also be
applied to other applications, for example, the simulation of models
where the numerical scheme would otherwise show undesired
boundary behaviour (see Fries and Kienitz, 2010). This chapter also
comments on the extension to multi-dimensions (multiple risk
factors).
With respect to stress testing, we found that the super/sub-hedge
boundary condition is a very promising choice. It gives a stable
upper/lower bound for the true value with low Monte Carlo error.
The bound can be made as sharp as the original Monte Carlo simu-
lation when the model is in its non-stressed region. If the boundary
value process is good, then the method gives even better results
than a corresponding PDE algorithm.

Christian Fries would like to thank his co-authors Mark Joshi, Jörg
Kampen and Jörg Kienitz, as well as Peter Landgraf and his colleagues
at DZ Bank. This chapter expresses the views of its authors and does
not represent the opinion of DZ Bank, which is not responsible for
any use that may be made of its contents.

  1 In December 2008, volatility was high: 10-year volatility on a 30-year swap rate was
observed to move from 20% to 40%.
  2 Here N denotes the numeraire and Q the corresponding equivalent martingale measure.
  3 We will derive an approximation to Vout,i+1(Ti) later.


REFERENCES

Fries C., 2010, “Stressed in Monte Carlo: Comparing a Monte-Carlo Simulation to a PDE for Stressed Volatility” (available at www.christian-fries.de/finmath/stressedinmontecarlo).

Fries C. and M. Joshi, 2011, “Perturbation Stable Conditional Analytic Monte-Carlo Pricing Scheme for Auto-callable Products,” International Journal of Theoretical and Applied Finance, 14(2), March.

Fries C. and J. Kampen, 2006, “Proxy Simulation Schemes for Generic Robust Monte
Carlo Sensitivities, Process Oriented Importance Sampling and High Accuracy Drift
Approximation,” Journal of Computational Finance, 10(2), pp 97–128 (available at www.christian-fries.de/finmath/proxyscheme).

Fries C. and J. Kienitz, 2010, “Monte Carlo Simulation with Boundary Conditions,”
(available at www.christian-fries.de/finmath/montecarloboundaryconditions).

Glasserman P. and J. Staum, 2001, “Conditioning on One-step Survival in Barrier Option Simulations,” Operations Research, 49, pp 923–37.

Joshi M. and T. Leung, 2007, “Using Monte Carlo Simulation and Importance Sampling
to Rapidly Obtain Jump-diffusion Prices of Continuous Barrier Options,” Journal of
Computational Finance, 10(4), pp 93–105.

12
A New Breed of Copulas for Risk and
Portfolio Management
Attilio Meucci
SYMMYS

The multivariate distribution of a set of risk factors such as stock
returns, interest rates or volatility surfaces is fully specified by the
separate marginal distributions of the factors and by their copula
or, loosely speaking, the dependence among the factors.
Modelling the marginals and the copula separately provides
greater flexibility for the practitioner to model randomness. As a
result, copulas have been used extensively in finance, both on the
sell side to price derivatives (see, for example, Li, 2000), and on the
buy side to model portfolio risk (see, for example, Meucci, Gan,
Lazanas and Phelps, 2007).
In practice, a large variety of marginal distributions can be
modelled by parametric or non-parametric specifications. However,
unlike for marginal distributions, despite the wealth of theoretical
results on copulas, only a few parametric families of copulas, such
as elliptical or Archimedean, are used in practice in commercial
applications.
Here we introduce a technique, which we call the copula-marginal
algorithm (CMA), to generate and use new, extremely flexible copulas.
The CMA enables us to extract the copulas and the marginals from
arbitrary joint distributions, perform arbitrary transformations of
those extracted copulas and then glue those transformed copulas
back with another set of arbitrary marginal distributions.
This flexibility follows from the fact that, unlike traditional approaches to copula implementation, the CMA does not require the explicit computation of marginal cumulative distribution functions
(CDFs) and their inverses. As a result, the CMA can generate scenarios
for many more copulas than the few parametric families used in the
traditional approach. For instance, it includes large-dimensional,
downside-only panic copulas that can be coupled with, say, extreme
value theory marginals for portfolio stress testing.
An additional benefit of the CMA is that it does not assume that
all the scenarios have equal probabilities.
Finally, the CMA is computationally very efficient even in large
markets, as can be verified in the code available for download.
In Table 12.1, we summarise the main differences between the CMA and the traditional approach to applying the theory of copulas in practice.
We proceed as follows. First, we review the basics of copula theory.
Then we discuss the traditional approaches to copula implementa-
tion. Next, we introduce the CMA in full generality. Then we present a
first application: we create a panic copula for stress testing that hits
the downside non-symmetrically and is probability-adjusted for risk
premium. Finally, we present a second application of the CMA,
namely how to perform arbitrary transformations of copulas.
Documented code for the general algorithm and for the applications
of the CMA is available at http://symmys.com/node/335.

Review of copula theory


The two-step theory of copulas is simple and powerful. For more
information on the subject, see articles such as Embrechts, McNeil
and Straumann (2000), Durrleman, Nikeghbali and Roncalli (2000),
Embrechts, Lindskog and McNeil (2003), Nelsen (1999), Cherubini,
Luciano and Vecchiato (2004), Brigo, Pallavicini and Torresetti
(2010) and Jaworski, Durante, Haerdle and Rychlik (2010), or refer
to Meucci (2011) for a quick primer and code.
Consider a set of N joint random variables X ≡ (X_1, ..., X_N) with a given joint distribution that we represent in terms of the CDF:

$$F_X(x_1,\dots,x_N) \equiv \mathbb{P}\{X_1\le x_1,\dots,X_N\le x_N\} \tag{12.1}$$

Table 12.1 Main differences between CMA and the traditional approach to copulas implementation

              Copula       Marginals   Probabilities
Traditional   Parametric   Flexible    Equal
CMA           Flexible     Flexible    Flexible


We call the first step “separation”. This step separates the distribution $F_X$ into the pure “individual” information contained in each variable $X_n$, that is, the marginals $F_{X_n}$, and the pure “joint” information of all the entries of $X$, that is, the copula $F_U$. The copula is the joint distribution of the grades, that is, the random variables $U \equiv (U_1,\dots,U_N)$ defined by feeding the original variables $X_n$ into their respective marginal CDFs:

$$U_n \equiv F_{X_n}(X_n),\quad n=1,\dots,N \tag{12.2}$$

Each grade $U_n$ has a uniform distribution on the interval [0, 1] and thus it can be interpreted as a nonlinear z-score of the original variable $X_n$ that has lost all the individual information of the distribution of $X_n$ and only preserved its joint information with the other $X_m$'s. To summarise, the separation step $\mathcal{S}$ proceeds as follows:

$$\mathcal{S}:\ \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix} \sim F_X \ \longmapsto\ \begin{cases} F_{X_1},\dots,F_{X_N} \\[6pt] \begin{pmatrix} U_1 \\ \vdots \\ U_N \end{pmatrix} \sim F_U \end{cases} \tag{12.3}$$

The above separation step can be reversed by a second step, which we call “combination”. We start from arbitrary marginal distributions $\bar F_{X_n}$, which are in general different from the above $F_{X_n}$, and grades $\bar U \equiv (\bar U_1,\dots,\bar U_N)$ distributed according to a chosen arbitrary copula $\bar F_U$, which can, but does not need to, be obtained by separation as the above $F_U$. Then we combine the marginals $\bar F_{X_n}$ and the copula $\bar F_U$ into a new joint distribution $\bar F_X$ for $\bar X$. To do so, for each marginal $\bar F_{X_n}$ we first compute the inverse CDF $\bar F_{X_n}^{-1}$, or quantile, and then we apply the inverse CDF to the respective grade from the copula:

$$\bar X_n \equiv \bar F_{X_n}^{-1}(\bar U_n),\quad n=1,\dots,N \tag{12.4}$$

To summarise, the combination step $\mathcal{C}$ proceeds as follows:

$$\mathcal{C}:\ \begin{cases} \bar F_{X_1},\dots,\bar F_{X_N} \\[6pt] \begin{pmatrix} \bar U_1 \\ \vdots \\ \bar U_N \end{pmatrix} \sim \bar F_U \end{cases} \ \longmapsto\ \begin{pmatrix} \bar X_1 \\ \vdots \\ \bar X_N \end{pmatrix} \sim \bar F_X \tag{12.5}$$


Traditional copula implementation


In general, the separation step 12.3 and combination step 12.5
cannot be performed analytically. Therefore, in practice, one resorts
to Monte Carlo scenarios.
In the traditional implementation of the separation step 12.3, first we select a parametric N-variate joint distribution $F_X^\theta$ to model $X \equiv (X_1,\dots,X_N)$, whose marginal distributions $F_{X_n}^\theta$ can be represented analytically. Then we draw J joint Monte Carlo scenarios $\{x_{1,j},\dots,x_{N,j}\}_{j=1,\dots,J}$ from $F_X^\theta$. Next, we compute the marginal CDFs $F_{X_n}^\theta$ from their analytical representation. Then, the joint scenarios for X are mapped as in 12.2 into joint grade scenarios by means of the respective marginal CDFs:

$$u_{n,j} \equiv F_{X_n}^\theta(x_{n,j}),\quad n=1,\dots,N;\ j=1,\dots,J \tag{12.6}$$

The grade scenarios $\{u_{1,j},\dots,u_{N,j}\}_{j=1,\dots,J}$ now represent simulations from the parametric copula $F_U^\theta$ of X.
To illustrate the traditional implementation of the separation, $F_X^\theta$ can be normal, and the scenarios $x_{n,j}$ can be simulated by twisting N independent standard normal draws by the Cholesky decomposition of the covariance and adding the expectations. The marginals of the joint normal distribution are normal, and the normal CDFs $F_{X_n}^\theta$ are computed by quadratures of the normal probability density function. Then the scenarios for the normal copula follow from 12.6.
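As a concrete sketch of this separation (an independent illustration under assumed parameter values, not code from the original article), the Cholesky twist and the mapping 12.6 can be written as:

```python
# Sketch of the traditional separation step for a normal copula (12.6-12.7).
# All parameter values below are assumptions chosen for illustration.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
J, N = 10_000, 2                          # number of scenarios and variables
mu = np.zeros(N)                          # expectations
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])              # covariance matrix

# twist independent standard normal draws by the Cholesky factor
chol = np.linalg.cholesky(cov)
x = mu + rng.standard_normal((J, N)) @ chol.T

# Equation 12.6: map each scenario through its analytical marginal CDF
u = norm.cdf((x - mu) / np.sqrt(np.diag(cov)))  # grade scenarios ~ normal copula
```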
We can summarise the traditional implementation of the separation step as follows:

$$\mathcal{S}:\ \left\{\begin{pmatrix} x_{1,j} \\ \vdots \\ x_{N,j} \end{pmatrix}\right\} \sim F_X^\theta \ \longmapsto\ \begin{cases} F_{X_1}^\theta,\dots,F_{X_N}^\theta \\[6pt] \left\{\begin{pmatrix} u_{1,j} \\ \vdots \\ u_{N,j} \end{pmatrix}\right\} \sim F_U^\theta \end{cases} \tag{12.7}$$

where for brevity we dropped the subscript j = 1, ..., J from the curly brackets, displaying only the generic jth joint N-dimensional scenario.
In the traditional implementation of the combination step 12.5, we first generate scenarios from the desired copula $\bar F_U$, typically obtained via a parametric separation step, that is, $\bar F_U \equiv F_U^\theta$ and thus $\bar u_{n,j} \equiv u_{n,j}$. Then we specify the desired marginal distributions, typically parametrically $\bar F_{X_n}^\theta$, and we compute analytically or by


quadratures the inverse CDFs $(\bar F_{X_n}^\theta)^{-1}$. Then we feed as in 12.4 each grade scenario $\bar u_{n,j}$ into the respective quantiles:

$$\bar x_{n,j} \equiv (\bar F_{X_n}^\theta)^{-1}(\bar u_{n,j}),\quad n=1,\dots,N;\ j=1,\dots,J \tag{12.8}$$
The joint scenarios $\{\bar x_{1,j},\dots,\bar x_{N,j}\}_{j=1,\dots,J}$ display the desired copula $\bar F_U^\theta$ and marginals $\bar F_{X_n}^\theta$.
To illustrate the traditional implementation of the combination, we can use the previously obtained normal copula scenarios and combine them with, say, chi-square marginals with different degrees of freedom, giving rise to a multivariate correlated chi-square distribution.
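Continuing the sketch above, this combination amounts to feeding the grade scenarios into the analytical inverse CDFs of Equation 12.8 (the degrees of freedom are arbitrary illustrative choices):

```python
# Sketch of the traditional combination step (12.8) with chi-square marginals.
from scipy.stats import chi2

dof = [3, 7]                               # assumed degrees of freedom
x_bar = np.column_stack([chi2.ppf(u[:, n], dof[n]) for n in range(u.shape[1])])
# x_bar: scenarios with chi-square marginals and the normal dependence structure
```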
We can summarise the traditional implementation of the combination step as follows:

$$\mathcal{C}:\ \begin{cases} \bar F_{X_1}^\theta,\dots,\bar F_{X_N}^\theta \\[6pt] \left\{\begin{pmatrix} \bar u_{1,j} \\ \vdots \\ \bar u_{N,j} \end{pmatrix}\right\} \sim \bar F_U^\theta \end{cases} \ \longmapsto\ \left\{\begin{pmatrix} \bar x_{1,j} \\ \vdots \\ \bar x_{N,j} \end{pmatrix}\right\} \sim \bar F_X^\theta \tag{12.9}$$

In practice, only a few parametric joint distributions are used to obtain the copula scenarios that appear in 12.7 and 12.9, because in general it is impossible to compute the marginal CDFs and thus perform the transformations $u_{n,j} \equiv F_{X_n}(x_{n,j})$ in 12.2 and 12.6. As a result, practitioners resort to elliptical distributions such as the normal or Student t, or a few isolated tractable distributions for which the CDFs are known, such as in Daul, De Giorgi, Lindskog and McNeil (2003).
An alternative approach to broaden the choice of copulas involves simulating the grade scenarios $\bar u_{n,j}$ in Equation 12.9 directly from a parametric copula $\bar F_U^\theta$, without obtaining them from a separation step 12.7. However, the parametric specifications that allow for direct simulation are limited to the Archimedean family (see Genest and Rivest, 1993) and a few other extensions. Furthermore, the parameters of the Archimedean family are not easily interpreted. Finally, simulating the grade scenarios $\bar u_{n,j}$ from the Archimedean family when the dimension N is large is computationally challenging.
To summarise, only a restrictive set of parametric copulas is used in practice, whether they stem from parametric joint distributions or they are simulated directly from parametric copula


specifications. The CMA is intended to greatly extend the set of


copulas that can be used in practice.

The copula-marginal algorithm


Unlike the traditional approach, the CMA does not require the
analytical representation of the marginals that appear in theory in
12.2 and in practice in 12.6. Instead, we construct these CDFs non-
parametrically from the joint scenarios for FX. It then becomes easy
to extract the copula. This allows us to start from arbitrary para-
metric or non-parametric joint distributions FX and thus achieve
much higher flexibility.
Even better, the CMA allows us to extract both the marginal
CDFs and the copula from distributions that are represented by
joint scenarios with fully general, non-equal probabilities. Therefore,
we can include distributions FX obtained from advanced Monte
Carlo techniques such as importance sampling (see Glasserman,
2004), from posterior probabilities driven by the entropy pooling
approach (see Meucci, 2008) or from “fully flexible probabilities”,
as in Meucci (2010).
Let us first discuss the separation step $\mathcal{S}$. For this step, the CMA takes as input the scenarios-probabilities representation $\{x_{1,j},\dots,x_{N,j};\,p_j\}_{j=1,\dots,J}$ of a fully general distribution $F_X$. See Figure 12.1, where we display an N = 2-variate distribution with J = 4 scenarios. With this input, the CMA computes the grade scenarios $u_{n,j}$ as the probability-weighted empirical grades:

$$u_{n,j} \equiv \sum_{i=1}^{J} p_i\, 1_{x_{n,i}\le x_{n,j}},\quad n=1,\dots,N;\ j=1,\dots,J \tag{12.10}$$

where $1_A$ denotes the indicator function for the generic statement A, which is equal to one if A is true and zero otherwise (refer again to Figure 12.1).
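A minimal sketch of 12.10 follows. It is an independent illustration, not the documented reference code at http://symmys.com/node/335, and it assumes the scenario values within each margin are distinct (ties would require a convention):

```python
# Sketch of the CMA grade computation (Equation 12.10).
import numpy as np

def cma_grades(x, p):
    """x: J x N scenario matrix; p: length-J probability vector.
    Returns the J x N probability-weighted empirical grades u_{n,j}."""
    u = np.empty_like(x, dtype=float)
    for n in range(x.shape[1]):
        order = np.argsort(x[:, n])
        # cumulative probability of all scenarios with x_{n,i} <= x_{n,j}
        u[order, n] = np.cumsum(p[order])
    return u
```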
With the grade scenarios 12.10 we are ready to separate both the copula and the marginals in the distribution $F_X$. For the copula, we associate the probabilities $p_j$ of the original scenarios $x_{n,j}$ with the grade scenarios $u_{n,j}$. As it turns out, the copula $F_U$, that is, the joint distribution of the grades, is given by the scenarios-probabilities $\{u_{1,j},\dots,u_{N,j};\,p_j\}_{j=1,\dots,J}$. For the marginal distributions, the CMA interpolates/extrapolates as in Meucci (2006) a function $I_{\{x_{n,j},u_{n,j}\}}$ from the grid of scenario pairs $\{x_{n,j},u_{n,j}\}$ (see Figure 12.1). This function is the CDF of the generic nth variable:


Figure 12.1 Copula-marginal algorithm: separation [figure: the joint scenarios (x_{1,j}, x_{2,j}) with probabilities p_j, the interpolated/extrapolated marginal CDFs I_{{x_{n,j}, u_{n,j}}}, and the resulting grade scenarios (u_{1,j}, u_{2,j}) on the axes X_1, X_2 and U_1, U_2]

$$F_{X_n}(x) \equiv I_{\{x_{n,j},u_{n,j}\}}(x),\quad n=1,\dots,N \tag{12.11}$$

To summarise, the CMA separation step attains from the distribution $F_X$ the scenarios-probabilities representation of the copula $F_U$ and the interpolation/extrapolation representation of the marginal CDFs $F_{X_n}$ as follows:

$$\mathcal{S}_{\mathrm{CMA}}:\ \left\{\begin{pmatrix} x_{1,j} \\ \vdots \\ x_{N,j} \end{pmatrix};\,p_j\right\} \sim F_X \ \longmapsto\ \begin{cases} I_{\{x_{1,j},u_{1,j}\}},\dots,I_{\{x_{N,j},u_{N,j}\}} \\[6pt] \left\{\begin{pmatrix} u_{1,j} \\ \vdots \\ u_{N,j} \end{pmatrix};\,p_j\right\} \sim F_U \end{cases} \tag{12.12}$$

Notice that the CMA avoids the parametric CDFs $F_{X_n}^\theta$ that appear in 12.6.
Let us now address the combination step $\mathcal{C}$. The two inputs are an arbitrary copula $\bar F_U$ and arbitrary marginal distributions, represented by the CDFs $\bar F_{X_n}$. For the copula, we take any copula $\bar F_U$ obtained with the separation step, that is, a set of scenarios-probabilities $\{\bar u_{1,j},\dots,\bar u_{N,j};\,\bar p_j\}$. For the marginals, we take any parametric or non-parametric specification of the CDFs $\bar F_{X_n}$. Then for each n we construct, in one of a few ways discussed in the appendix available at http://symmys.com/node/335, a grid of significant points $\{\tilde x_{n,k},\tilde u_{n,k}\}_{k=1,\dots,K}$, where $\tilde u_{n,k} \equiv \bar F_{X_n}(\tilde x_{n,k})$. Then, the CMA takes each grade scenario for the copula $\bar u_{n,j}$ and maps it into the desired combined


scenarios $\bar x_{n,j}$ by interpolation/extrapolation of the copula scenarios $\bar u_{n,j}$ on the grid:

$$\bar x_{n,j} \equiv I_{\{\tilde u_{n,k},\tilde x_{n,k}\}}(\bar u_{n,j}),\quad n=1,\dots,N;\ j=1,\dots,J \tag{12.13}$$

To summarise, the CMA combination step achieves the scenarios-probabilities representation of the joint distribution $\bar F_X$ that glues the copula $\bar F_U$ with the marginals $\bar F_{X_n}$ as follows:

$$\mathcal{C}_{\mathrm{CMA}}:\ \begin{cases} \bar F_{X_1},\dots,\bar F_{X_N} \\[6pt] \left\{\begin{pmatrix} \bar u_{1,j} \\ \vdots \\ \bar u_{N,j} \end{pmatrix};\,\bar p_j\right\} \sim \bar F_U \end{cases} \ \longmapsto\ \left\{\begin{pmatrix} \bar x_{1,j} \\ \vdots \\ \bar x_{N,j} \end{pmatrix};\,\bar p_j\right\} \sim \bar F_X \tag{12.14}$$

Notice that the interpolation/extrapolation replaces the computation of the inverse CDF $\bar F_{X_n}^{-1}$ that appears in 12.4 and 12.8.
From a computational perspective, both the separation step 12.12
and the combination step 12.14 are extremely efficient, as they run
in fractions of a second even in large markets with very large
numbers of scenarios. Refer to the code and the appendix available
at http://symmys.com/node/335 for more details.
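A matching sketch of the combination step 12.13 is given below; it assumes the target marginal CDFs are supplied as callables, and it uses numpy's linear interpolation (constant beyond the grid ends, a simplification of the extrapolation discussed above) in place of the function $I_{\{\tilde u_{n,k},\tilde x_{n,k}\}}$:

```python
# Sketch of the CMA combination step (Equations 12.13-12.14).
import numpy as np

def cma_combine(u_bar, cdfs, grids):
    """u_bar: J x N copula grade scenarios (their probabilities travel
    unchanged alongside); cdfs[n]: callable CDF of the nth target marginal;
    grids[n]: increasing array of significant points x~_{n,k}."""
    x_bar = np.empty_like(u_bar)
    for n in range(u_bar.shape[1]):
        uk = cdfs[n](grids[n])             # u~_{n,k} = F_bar_{X_n}(x~_{n,k})
        x_bar[:, n] = np.interp(u_bar[:, n], uk, grids[n])
    return x_bar
```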

Case study: panic copula


Here we apply the CMA to generate a large-dimensional panic copula for stress testing and portfolio optimisation. The code for this case study is available at http://symmys.com/node/335.
Consider an N-dimensional vector of financial random variables X ≡ (X_1, ..., X_N), such as the yet-to-be-realised returns of the N = 500 stocks in the S&P 500. Our aim is to construct a panic stress-test joint distribution $\bar F_X$ for X.
To do so, we first introduce a distribution $F_X$ that is driven by two separate sets of random variables $X^{(c)}$ and $X^{(p)}$, representing the calm market and the panic-stricken market. From $F_X$ we will extract the panic copula, which we will then glue with marginal distributions fitted to empirical data.
Accordingly, we first define the joint distribution $F_X$ with a few components, as follows:

$$X \stackrel{d}{=} (1_N - B)\circ X^{(c)} + B\circ X^{(p)} \tag{12.15}$$


where $1_N$ is an N-dimensional vector of ones and the operation $\circ$ multiplies vectors term-by-term. The first component, $X^{(c)} \equiv (X_1^{(c)},\dots,X_N^{(c)})$, are the calm-market drivers, which are normally distributed with expectation an N-dimensional vector of zeros $0_N$ and correlation matrix $\rho^2$:

$$X^{(c)} \sim N(0_N,\rho^2) \tag{12.16}$$

The second component, $X^{(p)} \equiv (X_1^{(p)},\dots,X_N^{(p)})$, are panic-market drivers independent of $X^{(c)}$, with high homogeneous correlations r among each other:

$$X^{(p)} \sim N\left(\begin{pmatrix}0\\ \vdots\\ 0\end{pmatrix},\begin{pmatrix}1 & r & \cdots \\ r & 1 & r \\ \vdots & r & 1\end{pmatrix}\right) \tag{12.17}$$

The variables $B \equiv (B_1,\dots,B_N)$ are panic triggers. More precisely, B selects the panic downside endogenously as in Merton (1974):

$$B_n \equiv \begin{cases} 1 & \text{if } X_n^{(p)} < \Phi^{-1}(b) \\ 0 & \text{otherwise} \end{cases} \tag{12.18}$$

where Φ is the standard normal CDF and b is a low threshold


probability.
The parameters $(\rho^2, r, b)$ that specify the joint distribution 12.15 have an intuitive interpretation. The correlation matrix $\rho^2$ characterises the dependence structure of the market in times of regular activity. This matrix can be obtained by fitting a normal copula to the realisations of X that occurred in non-extreme regimes, as filtered by the minimum volume ellipsoid (see, for example, Meucci, 2005, for a review and the code). The homogeneous correlation level r determines the dependence structure of the market in the panic regime. The probability b determines the likelihood of a high-correlation crash event. Therefore, r and b steer the effect of a non-symmetric panic correlation structure on an otherwise calm-market correlation $\rho^2$ and are set as stress-test parameters.
The highly non-symmetrical joint distribution $F_X$ defined by 12.15 is not analytically tractable. Nonetheless, we can easily generate a large number J of equal-probability joint scenarios $\{x_{1,j},\dots,x_{N,j}\}_{j=1,\dots,J}$ from this distribution, and for enhanced precision impose as in Meucci (2009) that the first two moments of the simulations match the theoretical distribution.
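A Monte Carlo sketch of 12.15–12.18 follows; the calm-market correlation is an illustrative assumption, and the exact moment-matching step of Meucci (2009) is omitted:

```python
# Sketch of the panic-market distribution (Equations 12.15-12.18).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
J, N = 10_000, 2
rho2 = np.array([[1.0, 0.3], [0.3, 1.0]])     # calm correlations (assumed)
r, b = 0.90, 0.02                             # panic correlation and probability

x_c = rng.standard_normal((J, N)) @ np.linalg.cholesky(rho2).T        # 12.16
pc = np.full((N, N), r); np.fill_diagonal(pc, 1.0)
x_p = rng.standard_normal((J, N)) @ np.linalg.cholesky(pc).T          # 12.17
B = (x_p < norm.ppf(b)).astype(float)          # panic triggers, 12.18
x = (1.0 - B) * x_c + B * x_p                  # 12.15, term-by-term products
```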


Due to the non-symmetrical nature of the panic triggers 12.18, this distribution has negative expectations, that is, $\frac{1}{J}\sum_j x_{n,j} < 0$. Now we perform a second step to create a more realistic distribution that compensates for the exposure to downside risk. For this purpose, we use the entropy pooling approach as in Meucci (2008). Accordingly, we twist the probabilities $p_j$ of the Monte Carlo scenarios in such a way that they display the least distortion with respect to the original probabilities $p_j \equiv 1/J$, and yet they give rise to non-negative expectations for the market X, that is, $\sum_j \bar p_j x_{n,j} \ge 0$. In practice, this amounts to solving the following minimisation:

$$\{\bar p_j\} \equiv \operatorname*{arg\,min}_{\{p_j\}} \sum_j p_j \ln(J p_j) \quad\text{such that}\quad \sum_j p_j x_{n,j} \ge 0,\ \ \sum_j p_j = 1,\ \ p_j \ge 0 \tag{12.19}$$

(see Meucci, 2008, for more details). Now the scenarios-probabilities $\{x_{1,j},\dots,x_{N,j};\,\bar p_j\}$ represent a panic distribution $F_X$ adjusted for risk premium.
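For moderate J, the minimisation 12.19 can be sketched directly with a general-purpose solver, as below; the entropy pooling implementation of Meucci (2008) instead solves a dual formulation, which is what makes very large J practical:

```python
# Direct small-scale sketch of the relative-entropy minimisation 12.19.
import numpy as np
from scipy.optimize import minimize

def entropy_pool(x, eps=1e-12):
    """x: J x N scenario matrix. Returns probabilities p_bar minimising
    sum_j p_j ln(J p_j) subject to sum_j p_j x_{n,j} >= 0 for every n."""
    J = x.shape[0]
    p0 = np.full(J, 1.0 / J)
    objective = lambda p: np.sum(p * np.log(np.maximum(J * p, eps)))
    constraints = [{"type": "eq",   "fun": lambda p: p.sum() - 1.0},
                   {"type": "ineq", "fun": lambda p: p @ x}]
    res = minimize(objective, p0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * J, constraints=constraints)
    return res.x
```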
Using the separation step of the CMA 12.12, we produce the scenarios-probabilities representation $\{u_{1,j},\dots,u_{N,j};\,\bar p_j\}$ of the panic copula $F_U$. Then, using the combination step of the CMA 12.14, we glue the panic copula $F_U$ with marginals $\bar F_{X_n}$ fitted to the empirical observations of X, creating the scenarios-probabilities $\{\bar x_{1,j},\dots,\bar x_{N,j};\,\bar p_j\}$ for the panic distribution $\bar F_X$, which fits the empirical data. The distribution $\bar F_X$ can be used for stress testing, or it can be fed into an optimiser to select an optimal panic-aware portfolio allocation.
To illustrate the panic copula, we show in the top-left portion of
Figure 12.2 the scenarios of this copula with panic correlations r ≡
90% and with very low panic probability b ≡ 2%, for two stock
returns. In the circle we highlight the non-symmetrical downside
panic scenarios.
For the marginals, a possible choice is Student t fits, as in Meucci, Gan, Lazanas and Phelps (2007). Alternatively, we can
construct the marginals as the kernel-smoothed empirical distribu-
tions of the returns, with tails fitted using extreme value theory (see
Embrechts, Klueppelberg and Mikosch, 1997).
However, for didactical purposes, in the top-right portion of Figure 12.2 we combine the panic copula with normal marginals fitted to the empirical data. This way we obtain a deceptively tame joint market distribution $\bar F_X$ of normal returns. Nevertheless, even

Figure 12.2 Panic copula, normal marginals and skewed portfolio of normal returns [figure: the panic copula U = (U_1, U_2) ~ F_U; the stock returns X̄ = (X̄_1, X̄_2) ~ F̄_X with normal marginal distributions; the PDF of the equally weighted portfolio return; panic scenarios highlighted]

Table 12.2 Risk statistics for the equally weighted portfolio

Risk                  Panic copula   Normal copula
CVaR 95%              –29%           –24%
Expected value        0              0
Standard deviation    12%            12%
Skewness              –0.4           0
Kurtosis              4.4            3

with perfectly normal marginal returns, and even with a very unlikely panic probability b ≡ 2%, the market is dangerously skewed towards less favourable outcomes: portfolios of normal securities are not necessarily normal! In the bottom portion of Figure 12.2, we can visualise this effect for the equally weighted portfolio.
In Table 12.2, we report relevant risk statistics for the equally weighted portfolio in our panic market $\bar F_X$. We also report the same statistics in a perfectly normal market, which follows by setting b ≡ 0 in 12.18.
For more details, documented code is available at http://
symmys.com/node/335.

Case study: copula transformations


Here we use the CMA to perform arbitrary operations on arbitrary
copulas. The documented code for this case study is available at
http://symmys.com/node/335.


By construction, a generic copula $F_U$ lives on the unit cube because each grade is normalised to have a uniform distribution on the unit interval. At times, when we need to modify the copula, the unit-interval, uniform normalisation is impractical. For instance, one might need to reshuffle the dependence structure of the N × 1 vector of the grades U by means of a linear transformation:

$$T_\gamma: U \mapsto \gamma U \tag{12.20}$$

where γ is an N × N matrix. Unfortunately, the transformed entries of


γ U are not the grades of a copula. This is easily verified because in
general 12.20 transforms the copula domain, which is the unit cube,
into a parallelotope that trespasses the boundaries of the unit cube.
To perform transformations on copulas, we propose to simply
use alternative, not necessarily uniform, normalisations for the
copulas, operate the transformation on the normalised variables,
and then map the result back in the unit cube.
To be concrete, let us focus on the linear transformation 12.20. First, we normalise each grade to have a standard normal distribution, instead of uniform, that is, we define the following random variables:

$$Z_n \equiv \Phi^{-1}(U_n) \sim N(0,1) \tag{12.21}$$
This is a special case of a combination step 12.5, where $\bar F_{X_n} = \Phi$. Then we operate the linear transformation 12.20 on the normalised variables:

$$T_\gamma: Z \mapsto \tilde Z \equiv \gamma Z \tag{12.22}$$
Finally, we map the transformed variables $\tilde Z$ back into the unit cube space of the copula by means of the marginal CDFs of $\tilde Z$:

$$\tilde U_n \equiv F_{\tilde Z_n}(\tilde Z_n) \sim U([0,1]) \tag{12.23}$$

This step entails performing a separation step, Equation 12.3, and then only retaining the copula. This way we obtain the distribution of the grades $\tilde F_U \equiv F_{\tilde U}$.
We summarise the copula transformation process in the following diagram:

$$\begin{array}{ccc} F_U & \xrightarrow{\ T\ } & \tilde F_U \\ \downarrow\,\mathcal{C} & & \uparrow\,\mathcal{S} \\ F_Z & \xrightarrow{\ T\ } & F_{\tilde Z} \end{array} \tag{12.24}$$


It is trivial to generalise the above steps and diagram to arbitrary


nonlinear transformations T. It is also possible to consider non-
normal standardisations of the grades in the combination step
12.21, which can be tailored to the desired transformation T. The
theory of the most suitable standardisation for a given transforma-
tion is the subject of a separate publication.
In rare cases, the above copula transformations can be imple-
mented analytically. However, the family of copulas that can be
transformed analytically is extremely small, and depends on the
specific transformation. For instance, for linear transformations we
can only rely on elliptical copulas.
Instead, to implement copula transformations in practice, we
rely on the CMA, which allows us to perform arbitrary combination
steps and separation steps, which are suitable for fully general
transformations of arbitrary copulas.
To illustrate how to transform a copula using the CMA, we perform a special case of the linear transformation γ in 12.22, namely a rotation of the panic copula introduced in the previous section. In the bivariate case, we can parameterise the rotations by an angle θ as follows:

$$\gamma \equiv \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \tag{12.25}$$

In Figure 12.3, we display the result for θ ≡ π/2: the non-symmetric panic scenarios now affect the second security positively. For more details, documented code is available at http://symmys.com/node/335.
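Chaining 12.21–12.23 gives the sketch below, which reuses the cma_grades function from the separation sketch earlier; shrinking the grades into the open unit interval before applying Φ⁻¹ is an implementation detail added here to avoid infinite normal scores, not a step from the text:

```python
# Sketch of a copula rotation via the CMA (Equations 12.21-12.25).
import numpy as np
from scipy.stats import norm

def rotate_copula(u, p, theta):
    J = u.shape[0]
    z = norm.ppf(u * J / (J + 1.0))                    # 12.21, grades in (0, 1)
    g = np.array([[np.cos(theta),  np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])    # 12.25
    return cma_grades(z @ g.T, p)                      # 12.22 then 12.23
```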

Figure 12.3 Copula transformations: a rotation [figure: scatter plots of the panic copula U ~ F_U and the rotated panic copula Ũ ~ F̃_U, with the panic scenarios highlighted]


Conclusion
We have introduced the CMA, a technique to generate new flexible
copulas for risk management and portfolio management. The CMA
generates flexible copulas and glues them with arbitrary marginals
using the scenarios-probabilities representation of a distribution.
The CMA generates many more copulas than the few parametric
families used in traditional copula implementations. For instance,
with the CMA we can generate large-dimensional, downside-only
panic copulas. The CMA also allows us to perform arbitrary trans-
formations of copulas, despite the fact that copulas are only defined
on the unit cube. Finally, unlike in traditional approaches to copula
implementation, the probabilities of the scenarios are not assumed
to be equal. Therefore, the CMA allows us to leverage techniques
such as importance sampling and entropy pooling.

The author is grateful to Garli Beibi.

REFERENCES

Brigo D., A. Pallavicini and R. Torresetti, 2010, Credit Models and the Crisis: A Journey into
CDOs, Copulas, Correlations and Dynamic Models (Chichester, England: Wiley).

Cherubini U., E. Luciano and W. Vecchiato, 2004, Copula Methods in Finance (Hoboken,
NJ: Wiley).

Daul S., E. De Giorgi, F. Lindskog and A. McNeil, 2003, “The Grouped t-Copula
with an Application to Credit Risk,” working paper (available at http://ssrn.com/
abstract=1358956).

Durrleman V., A. Nikeghbali and T. Roncalli, 2000, “Which Copula is the Right One?,” working paper.

Embrechts P., A. McNeil and D. Straumann, 2000, “Correlation: Pitfalls and


Alternatives,” working paper.

Embrechts P., C. Klueppelberg and T. Mikosch, 1997, Modelling Extremal Events for
Insurance and Finance (New York, NY: Springer).

Embrechts P., F. Lindskog and A. McNeil, 2003, “Modelling Dependence with Copulas
and Applications to Risk Management,” in Handbook of Heavy Tailed Distributions in
Finance (Amsterdam, Holland: Elsevier).

Genest C. and R. Rivest, 1993, “Statistical Inference Procedures for Bivariate


Archimedean Copulas,” Journal of the American Statistical Association, 88, pp 1,034–43.

Glasserman P., 2004, Monte Carlo Methods in Financial Engineering (New York, NY:
Springer).

Jaworski P., F. Durante, W. Haerdle and T. Rychlik, 2010, Copula Theory and its
Applications (Heidelberg, Germany: Springer).


Li D., 2000, “On Default Correlation: A Copula Function Approach,” Journal of Fixed
Income, 9, pp 43–54.

Merton R., 1974, “On the Pricing of Corporate Debt: The Risk Structure of Interest Rates,”
Journal of Finance, 29, pp 449–70.

Meucci A., 2005, Risk and Asset Allocation (New York, NY: Springer) (available at http://symmys.com/attilio-meucci/book).

Meucci A., 2006, “Beyond Black–Litterman in Practice,” Risk, September, pp 114–19 (article and code available at http://symmys.com/node/157).

Meucci A., 2008, “Fully Flexible Views: Theory and Practice,” Risk, October, pp 97–102 (article and code available at http://symmys.com/node/158).

Meucci A., 2009, “Simulations with Exact Means and Covariances,” Risk, July, pp 89–91 (article and code available at http://symmys.com/node/162).

Meucci A., 2010, “Historical Scenarios with Fully Flexible Probabilities,” GARP Risk Professional – The Quant Classroom, December, pp 40–43 (article and code available at http://symmys.com/node/150).

Meucci A., 2011, “A Short, Comprehensive, Practical Guide to Copulas,” GARP Risk Professional, October, pp 40–43 (article and code available at http://symmys.com/node/351).

Meucci A., Y. Gan, A. Lazanas and B. Phelps, 2007, A Portfolio Manager's Guide to Lehman Brothers Tail Risk Model (Lehman Brothers Publications).

Nelsen R., 1999, An Introduction to Copulas (New York, NY: Springer).

13
A Historical-parametric Hybrid VaR
Robin Stuart
State Street Global Markets Risk Management

In standard calculations of value-at-risk by historical simulation,


some representation of the profit and loss (P&L) function is obtained
for the portfolio of interest that permits it to be repriced under
shocks to market variables observed in some specified historical
time window. Often the P&L function is represented by a set of
revaluations on a regular grid from which the P&L associated with
any changes in market variables can be calculated by interpolation.
Histories of market variable changes constitute a set of time series and, as the degree of specificity captured by the model increases, their number may become very large. As an inevitable consequence, the proportion of incomplete or bad data points will rise.
A number of methods have been devised to deal with the problem
of incomplete time-series data. Isolated intervals of missing observa-
tions in a time series can be filled with plausible values that preserve
the variance of the time series itself and its correlation with other
time series by means of the so-called Brownian bridge (see
Glasserman, 2004). This may be applied as a “one-off” realisation of
the missing data or repeatedly in the form of a Monte Carlo simula-
tion. Alternatively, if a multi-factor model is available, as in the
capital asset pricing model (CAPM), the expectation maximisation or
EM-algorithm (Dempster, Laird and Rubin, 1977, and Schafer, 1997)
or similar can be employed to obtain an expected value for a missing
data change. A distribution of idiosyncratic or residual market vari-
able moves is then placed around it. This approach too can be applied
in one-off or Monte Carlo form.
An alternative to the historical simulation described above is


parametric VaR. In this case market variable shocks are represented


by a multivariate distribution that is combined with the portfolio’s
P&L function to generate a probability distribution for the P&L.
Locating the required percentile of this distribution gives the VaR.
Exact analytic results for parametric VaR can, in general, only be obtained when the P&L function is at most quadratic in the market variables (Britten-Jones and Schaefer, 1999, Jaschke, 2002, and Holton, 2003), so the approach is not well suited to treating derivatives with strongly nonlinear payout functions.
Here, we describe the calculation of a hybrid historical-para-
metric VaR, referred to as hybrid VaR. Hybrid VaR is essentially a
historical simulation, but in the presence of time series with missing
data it combines some of the best features of the methodologies
mentioned above. When a missing point in a time series is encoun-
tered it is filled, not by a single draw from some assumed underlying
distribution, but with a parametric representation of the distribu-
tion of the possible values for the missing data as a whole. As a
consequence, it effectively produces the same results as Monte
Carlo simulation of missing data but by analytic means, hence
avoiding the computational burden and statistical uncertainty. The
methodology can be applied to highly nonlinear portfolios,
provided the number of positions is reasonably large, as the charac-
teristics of the resulting P&L distributions are carried through the
full course of the calculation.

Hybrid VaR
Suppose the calculation of a historical simulation VaR is to be
undertaken. The P&L of the portfolio for each historical date is
calculated by applying a set of observed changes xi in the market
variables to a function P&L(x1, x2, ...). The term market variable here
applies to any input variable that is required in order to price the
positions in the portfolio, and could include such things as an
interest or foreign exchange rate, the price of a particular equity, the
credit spread of a specific corporate bond issue, etc. For our
purposes, it will be assumed that the P&L function for the portfolio
as a whole can be decomposed into the sum of P&L functions in the
individual market variable changes:

$$P\&L(x_1,x_2,\dots) = \sum_i P\&L_i(x_i) \tag{13.1}$$


Of course, multiple positions within the portfolio may respond to


changes in the same market variable, but it is assumed that these
have been netted and P&Li gives the total response of the portfolio
to xi. The effects of cross-terms between market variables, which
may be present, will be ignored here although the methodology
described below can be generalised to include them.
At issue here is that on any given historical date, some of the
observations of market variable changes will be missing. This
section shows how to obtain, on a particular historical date, the
probability distribution for the P&L of the portfolio as a whole from
the probability distribution obtained or assumed for the missing
data. The contribution to the portfolio’s P&L from any market vari-
ables that are actually observed on the date in question is obtained
by simply plugging the observed discrete market variable change
into the relevant P&Li in Equation 13.1.
In the case of a market variable whose change is not observed on a
given historical date, the hybrid VaR methodology requires that a
probability distribution of possible changes be provided and that it is
independent of the distributions associated with other market varia-
bles. In a CAPM or similar multi-factor model these requirements are
easily and naturally satisfied. The distribution would generally be
taken to be Gaussian and, for a given market variable, the mean of the
distribution is determined by regressing against systematic variables,
and the standard deviation is set to that of the idiosyncratic compo-
nent. Alternatively, the characteristics of the distribution could be
determined by looking at a set of related securities and obtaining the
mean and standard deviation of changes experienced by them on that
date. For positions that are poorly observed in the market, the mean
of the distribution might be determined from the change experienced
by some proxy, and the standard deviation chosen to reflect its quality
or to impose an uncorrelated penalty/noise factor incurred against
the use of imperfect information. Whatever the method used to arrive
at it, it will be assumed here that on dates for which an observation is
not available, the distribution of possible changes is known. It should
be emphasised that the choice of a Gaussian distribution is a conven-
ience that follows widely used practice, but once moments are
calculated, nothing in the subsequent methodology is dependent on
that choice and the substitution of more general distributions, for
example, with fatter tails, is relatively straightforward.


For definiteness, let the function P&L_i(x_i) be known on a grid of discrete points, u_j, and assume that, on the historical date in question, the distribution of possible missing values is Gaussian with mean μ and standard deviation σ. The notation φ(x) and Φ(x) will be used to denote the standard and cumulative Gaussian probability functions respectively. For compactness, the subscript i will be dropped in this section.
The probability distribution in market variable changes, taken together with the function P&L(x), implies a probability distribution in P&L associated with that market variable. Rather than calculate the P&L probability distribution itself, its moments will be used to characterise it and their information content carried analytically through the entire calculation. The nth raw moment, μ′_n, is given by:

$$\mu'_n = \left\langle P\&L^n\right\rangle = \int_{-\infty}^{\infty} P\&L(x)^n\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\frac{dx}{\sigma} = \int_{-\infty}^{\infty} P\&L(\mu+\sigma\hat x)^n\,\phi(\hat x)\,d\hat x$$

The value of the function P&L(x) can be estimated at intermediate points not lying on the grid by using some form of interpolation. If linear interpolation is used to produce a piecewise continuous representation of P&L(x), then:

$$\mu'_n = \sum_j \int_{\hat u_j}^{\hat u_{j+1}} (m_j\hat x + c_j)^n\,\phi(\hat x)\,d\hat x$$

Figure 13.1 P&L distribution produced by a nonlinear P&L function in the presence of a Gaussian uncertainty in the market variable change [figure: P&L (US$ '000) versus market variable change; a nonlinear P&L function, the Gaussian distribution of the change on the horizontal axis and the induced P&L distribution on the vertical axis]


where $\hat x = (x-\mu)/\sigma$, $\hat u_j = (u_j-\mu)/\sigma$ and:

$$m_j = \sigma\,\frac{P\&L(u_{j+1}) - P\&L(u_j)}{u_{j+1}-u_j},\qquad c_j = \frac{(u_{j+1}-\mu)\,P\&L(u_j) - (u_j-\mu)\,P\&L(u_{j+1})}{u_{j+1}-u_j}$$
The linear functions in the first and last intervals are extrapolated to –∞ and +∞ respectively.
The required integrals can be calculated by recursively applying the identities:

$$\int_a^b \phi(x)\,dx = \Phi(b)-\Phi(a),\qquad \int_a^b x\,\phi(x)\,dx = -\big(\phi(b)-\phi(a)\big)$$
$$\int_a^b x^n\phi(x)\,dx = -\left[x^{n-1}\phi(x)\right]_a^b + (n-1)\int_a^b x^{n-2}\phi(x)\,dx \tag{13.2}$$

If other interpolation methods, such as cubic spline, are used, the term $(m_j\hat x + c_j)$ would be replaced by a cubic polynomial. Although somewhat more complicated, the integrals involved can still be evaluated by means of the identities 13.2 given above.
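The following sketch applies the recursion 13.2 segment by segment for the piecewise-linear case (the grid, P&L values and shock parameters at the bottom are purely illustrative):

```python
# Sketch: raw moments of a piecewise-linear P&L under a Gaussian shock.
import numpy as np
from math import comb
from scipy.stats import norm

def gauss_power_int(k, a, b):
    """int_a^b x^k phi(x) dx via the recursion 13.2; the boundary term
    x^(k-1) phi(x) vanishes at +/- infinity."""
    if k == 0:
        return norm.cdf(b) - norm.cdf(a)
    if k == 1:
        return -(norm.pdf(b) - norm.pdf(a))
    t = lambda x: 0.0 if np.isinf(x) else x ** (k - 1) * norm.pdf(x)
    return -(t(b) - t(a)) + (k - 1) * gauss_power_int(k - 2, a, b)

def raw_moment(n, grid, pnl, mu, sigma):
    """mu'_n for P&L linearly interpolated on `grid` (values `pnl`), with the
    shock ~ N(mu, sigma^2); first/last segments extrapolated to -/+ infinity."""
    u_hat = (grid - mu) / sigma
    m = sigma * np.diff(pnl) / np.diff(grid)                     # slopes m_j
    c = ((grid[1:] - mu) * pnl[:-1]
         - (grid[:-1] - mu) * pnl[1:]) / np.diff(grid)           # intercepts c_j
    a = np.concatenate(([-np.inf], u_hat[1:-1]))
    b = np.concatenate((u_hat[1:-1], [np.inf]))
    # binomial expansion of (m_j x + c_j)^n against the Gaussian weight
    return sum(comb(n, r) * mj ** r * cj ** (n - r) * gauss_power_int(r, aj, bj)
               for mj, cj, aj, bj in zip(m, c, a, b) for r in range(n + 1))

grid = np.linspace(-200.0, 200.0, 21)          # illustrative grid
pnl = np.maximum(grid - 20.0, 0.0)             # illustrative nonlinear P&L
mean_pnl = raw_moment(1, grid, pnl, mu=0.0, sigma=50.0)
```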
Figure 13.1 plots the P&L for a single position in the portfolio as a
nonlinear function of the change in some market variable such as the
credit spread in basis points. If the observed market variable change is
subject to an uncertainty, indicated by the dark grey Gaussian distribu-
tion on the horizontal axis, then the P&L is correspondingly uncertain
and is represented by the light grey distribution on the vertical axis.
The characteristics of this distribution are encapsulated by and carried
through the calculation of VaR using its moments/cumulants.

P&L distribution for combined positions


When, on a given historical date, observations are missing for the
changes in multiple market variables, the P&L distribution for the
portfolio as a whole is obtained by convolving together the P&L
distributions associated with each of the individual market varia-
bles. In principle, this is straightforwardly done by recursively
building up the raw moments of the portfolio from the raw
moments of the individual position’s distributions. If P&L1(x1) and
P&L2(x2) gives the P&L of the portfolio for shocks in market varia-
bles x1 and x2, then the moments of their combined P&L distribution
is given by:


n ⎛ ⎞
(P&L1 + P&L 2 ) = ∑⎜ n ⎟ P&Ln1 P&Ln−r
n
2
r=0 ⎝ r ⎠

Note that this, and the subsequent development, is independent of the form of the underlying distributions.
In practice, the aggregation of moments is considerably simplified by transforming them into cumulants, κ_n. The formal relationship between the cumulants of a probability density function, Pr(x), and its moments can be encapsulated in terms of its characteristic function or Fourier transform, φ(t), by:

$$\varphi(t) = \int_{-\infty}^{\infty} e^{itx}\Pr(x)\,dx = \sum_{n=0}^{\infty}\frac{\mu'_n}{n!}(it)^n = \exp\sum_{n=1}^{\infty}\frac{\kappa_n}{n!}(it)^n$$

Since convolution can be performed by multiplication of Fourier transforms, and cumulants are the coefficients in the Taylor series representation of the logarithm of the Fourier transform, it follows that the cumulants are simply added. From a practical standpoint, the cumulants are most easily calculated using the recursion relation:

$$\kappa_n = \mu'_n - \sum_{r=1}^{n-1}\binom{n-1}{r-1}\mu'_{n-r}\,\kappa_r$$
which is also valid for central moments by bootstrapping with κ_1 = 0. The first few cumulants are explicitly given in terms of the central moments by:

$$\kappa_1 = \mu;\quad \kappa_2 = \mu_2;\quad \kappa_3 = \mu_3;\quad \kappa_4 = \mu_4 - 3\mu_2^2;\quad \kappa_5 = \mu_5 - 10\mu_2\mu_3$$

Thus, at the cost of converting from moments to cumulants, the aggregation process is significantly simplified. The skewness and excess kurtosis of the distribution are described by the parameters:

$$\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}},\qquad \gamma_2 = \frac{\kappa_4}{\kappa_2^2}$$
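The recursion and the additivity of cumulants across independent positions can be sketched in a few lines (an independent illustration, not the author's production code):

```python
# Sketch: raw moments -> cumulants, and aggregation by simple addition.
from math import comb

def moments_to_cumulants(mu):
    """mu[0..n]: raw moments with mu[0] = 1. Returns kappa[0..n]."""
    kappa = [0.0] * len(mu)
    for n in range(1, len(mu)):
        kappa[n] = mu[n] - sum(comb(n - 1, r - 1) * mu[n - r] * kappa[r]
                               for r in range(1, n))
    return kappa

# convolving two independent P&L distributions = adding their cumulants:
# kappa_total = [k1 + k2 for k1, k2 in zip(kappa_a, kappa_b)]
```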

Moments for correlated residuals


In certain circumstances, a group of distinct market variables may
show a high degree of correlation between their idiosyncratic
components. Two such variables will be referred to as having corre-
lated residuals. Examples of this might be the credit spread risk
associated with differently rated bonds from an issuer or its subsid-
iaries. In such a case, the distributions of the idiosyncratic


component of market variable changes are better modelled as being 100% correlated. Fortunately, it is possible to calculate the moments for a group of market variables with correlated residuals. The cumulants may then be generated from the moments and aggregated with those of the rest of the portfolio in the usual way.
Assume the N members of the group of positions with correlated residuals are indexed by the variable i. On a historical date for which actual observations are not available, let the systematic variable changes be μ_i with residual uncertainty σ_i.
For exactly correlated residuals, the raw moments are:

$$\mu'_n = \int_{-\infty}^{\infty}\left[\sum_{i=1}^N P\&L_i(\mu_i+\sigma_i\hat x)\right]^n \phi(\hat x)\,d\hat x$$

where, as before, P&L_i(x_i) is the P&L function associated with the ith market variable. As previously, if the P&L_i(x_i) are all represented as piecewise continuous functions by interpolating between values on a discrete grid u_1, u_2, ..., u_j, ..., then the range of integration must be broken up into adjacent intervals within which the functional form of the term in square brackets remains fixed. The boundaries of the intervals lie at the points $\hat x = (u_j-\mu_i)/\sigma_i$ for all i, j and the identities 13.2 can then be used to calculate the raw moments. These are converted to cumulants that are added to those of the rest of the portfolio, ensuring that there is zero correlation between the residuals of the group and those of all other positions.

Computation of VaR
VaR by historical simulation is calculated by treating the calculated
historical P&L values as random draws from an underlying histor-
ical P&L distribution. Figure 13.2 illustrates the effect of taking
account of the uncertainties, the σ ’s, in daily observations due to
missing data. The dots on the horizontal axis represent P&L values
for the portfolio of interest obtained by applying a set of market
variable changes observed in the same historical window. The
spikes rising from the dots are appropriately weighted stylised
Dirac delta functions, indicating the probability density function
associated with each P&L value when no uncertainty, that is, σ = 0,
is attributed to the daily observations. The effect of associating an
uncertainty with the daily observations is shown by the distribu-
tions plotted in black, with one distribution corresponding to each


Figure 13.2 Effect of associating uncertainties with observed daily historical changes in a market variable [figure: probability density versus P&L (US$ million)]
Note: When no uncertainty is assumed, the P&L is a set of discrete values with a probability distribution represented by Dirac delta functions. Introducing uncertainties transforms the delta functions into the black curves, which combine to produce the overall P&L distribution shown by the dotted curve

of the original uncertainty-free P&L values. Note that nonlinearities


present in the P&L functions cause the peaks of these distributions
not to be aligned with those original P&L values. In the case shown
here, individual distributions are very close to Gaussian as a fairly
large number of missing observations in market variables causes
them to “pull to normal” in accordance with the central limit
theorem. It should be emphasised that this pull to normal tends to
happen in practice but is not a requirement of the method described
here, as deviations from Gaussian distributions are accounted for
by the cumulants and carried through the analysis. The overall
dotted curve is the weighted sum, or mixture, of the individual
daily observations, which becomes progressively smoother as the
density of P&L observations increases or as the uncertainty associ-
ated with them grows.
Assume that there are N historical observations, one at each historical date t_1, t_2, ..., t_k, .... If Pr_k(p) is the probability density function for the P&L associated with historical date t_k, then the probability of the portfolio's P&L, p, falling in the interval (p, p + dp) is:

$$\Pr(p)\,dp = \sum_{k=1}^N w_k\cdot \Pr\nolimits_k(p)\,dp \tag{13.3}$$

where the weight, wk, gives the probability of drawing the kth
historical observation. Pr(p) is the estimated probability density


function for the portfolio and is thus a mixture of the distributions of the daily observations. If the historical observations are considered to be equally likely, then w_k = 1/N. With the constraint:

$$\sum_{k=1}^N w_k = 1$$

adjustments can be made to the w_k to impose an exponential or other weighting scheme. When the historical observations are free from uncertainty, the above equation can be written in terms of the Dirac delta function as:

$$\Pr(p) = \sum_{k=1}^N w_k\cdot\delta(p-p_k) \tag{13.4}$$

in which p_k is the P&L observed on date t_k.
The raw moments of the P&L distribution are:

$$\mu'_n = \int_{-\infty}^{\infty} p^n \Pr(p)\,dp = \sum_{k=1}^N w_k\cdot\int_{-\infty}^{\infty} p^n \Pr\nolimits_k(p)\,dp$$

which is simply the weighted sum of the corresponding raw


moments of the individual daily historical observations. When the
historical observations are equally weighted, the moments of Pr(p)
are just simple averages of the raw moments of the corresponding
daily observations. The raw moments could, in principle, be used
as a basis for calculating the quantiles of the P&L distribution and
hence to directly derive a number for the VaR. An obvious approach

Figure 13.3 P&L probability density function simulated for a large portfolio assuming zero and non-zero uncertainty in the observed daily changes in market variables [figure: probability density versus P&L (US$ million)]
Note: The white triangles and white diamonds indicate the estimated locations of the 99th and 95th percentiles of the respective distributions


Figure 13.4 P&L probability density function for a portfolio in 91 high-yield CDSs using the conventions from Figure 13.3 [figure: probability density versus P&L (US$ million)]

would be to convert the raw moments to cumulants and then use


the Cornish–Fisher asymptotic expansion (Cornish and Fisher,
1937) to obtain the quantiles of the P&L distribution (Jaschke, 2002,
and Holton, 2003). However, the Cornish–Fisher expansion is best
suited to distributions that are close to Gaussian. Since the P&L
distribution used for the calculation of VaR is a mixture of the daily
P&L distributions, it can be very far from Gaussian and the Cornish–
Fisher expansion would be expected to perform poorly. On the
other hand, the daily P&L distributions are joint distributions of the
daily P&L distributions of a large number of individual positions.
As a consequence of the central limit theorem, these daily joint
distributions will each tend to closely approximate Gaussian distri-
butions and thus be amenable to standard expansion techniques.
A stable procedure that can be used to extract the location of the required percentile of the distribution Pr(p) in Equation 13.3, and hence the VaR, is to numerically solve for ν in the equation:

$$q = \int_{-\infty}^{\nu}\Pr(p)\,dp = \sum_k w_k\cdot\int_{-\infty}^{\nu}\Pr\nolimits_k(p)\,dp \tag{13.5}$$

setting q = 0.01 and q = 0.05 respectively for the 99% and 95% VaR. In
the case where the historical observations are free from uncertainty
and equally weighted, plugging Equation 13.4 into Equation 13.5
yields the VaR to be the (N × q)th worst loss experienced by the port-
folio in the historical period and is commonly used as a conservative
estimate of VaR in historical simulation.
In the case where the Prk(x) are sufficiently close to Gaussian, the
integrals on the right-hand side are just cumulative normal


functions, Φ (x). Departures from normality may be captured using


the Gram–Charlier A series (Charlier, 1905) or Edgeworth expan-
sion (Cramér, 1925).
Without loss of generality, consider a standardised distribution, $\overline{\Pr}(\hat x)$, with cumulants $\hat\kappa_n$, where $\hat\kappa_1 = 0$ and $\hat\kappa_2 = 1$. The Gram–Charlier A series is given by:

$$\overline{\Pr}(\hat x) = \left(1+\sum_{n=3}\frac{\hat\kappa_n}{n!}(-D)^n\right)\phi(\hat x) = \phi(\hat x)\left(1+\sum_{n=3}\frac{\hat\kappa_n}{n!}He_n(\hat x)\right)$$

where D is the differentiation operator and $He_n(x)$ is the nth Chebyshev–Hermite polynomial. From the results in the Appendix, it follows that:

$$\int_{-\infty}^{\hat\nu}\overline{\Pr}(\hat x)\,d\hat x = \Phi(\hat\nu)-\sum_{n=3}\frac{\hat\kappa_n}{n!}(-D)^{n-1}\phi(\hat x)\Big|_{\hat x=\hat\nu} = \Phi(\hat\nu)-\phi(\hat\nu)\sum_{n=3}\frac{\hat\kappa_n}{n!}He_{n-1}(\hat\nu)$$

To calculate hybrid VaR, use:

$$\int_{-\infty}^{\nu}\Pr\nolimits_k(x)\,dx = \int_{-\infty}^{\hat\nu_k}\overline{\Pr}_k(\hat x_k)\,d\hat x_k$$

with $\hat x_k = (x-\mu_k)/\sigma_k$ and $\hat\nu_k = (\nu-\mu_k)/\sigma_k$, in which $\mu_k = \kappa_1$ and $\sigma_k = \sqrt{\kappa_2}$ are respectively the mean and standard deviation of the daily P&L distribution for the portfolio as a whole on the historical date $t_k$. For the other cumulants, $n\ge 3$, associated with that date, $\hat\kappa_n = \kappa_n/\sigma_k^n$.
The more reliable Edgeworth expansion may be employed in a similar manner. Up to terms involving at most $\hat\kappa_5$, the corresponding result is:

$$\overline{\Pr}(\hat x) = \phi(\hat x)\Bigg(1+\left\{\frac{\hat\kappa_3}{6}He_3(\hat x)\right\}+\left\{\frac{\hat\kappa_3^2}{72}He_6(\hat x)+\frac{\hat\kappa_4}{24}He_4(\hat x)\right\}+\left\{\frac{\hat\kappa_3^3}{1296}He_9(\hat x)+\frac{\hat\kappa_3\hat\kappa_4}{144}He_7(\hat x)+\frac{\hat\kappa_5}{120}He_5(\hat x)\right\}\Bigg)$$

and hence:

$$\int_{-\infty}^{\hat\nu}\overline{\Pr}(\hat x)\,d\hat x = \Phi(\hat\nu)-\phi(\hat\nu)\Bigg(\left\{\frac{\hat\kappa_3}{6}He_2(\hat\nu)\right\}+\left\{\frac{\hat\kappa_3^2}{72}He_5(\hat\nu)+\frac{\hat\kappa_4}{24}He_3(\hat\nu)\right\}+\left\{\frac{\hat\kappa_3^3}{1296}He_8(\hat\nu)+\frac{\hat\kappa_3\hat\kappa_4}{144}He_6(\hat\nu)+\frac{\hat\kappa_5}{120}He_4(\hat\nu)\right\}\Bigg)$$


An efficient method for calculating the Edgeworth expansion to arbitrary order has been given by Blinnikov and Moessner (1998). As an alternative to the Gram–Charlier A series or Edgeworth expansion, path-of-steepest-descent techniques could be used, but these will not be discussed here.
When the daily P&L distributions are strongly Gaussian, that is, the $\hat\kappa_n$ are sufficiently small, terms beyond the leading cumulative normal function can be safely neglected.
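In that leading-order Gaussian case, Equation 13.5 reduces to one-dimensional root finding on a mixture of normal CDFs; a sketch with illustrative inputs follows:

```python
# Sketch: solving Equation 13.5 for nu when each daily P&L distribution is
# approximated as Gaussian (higher cumulants neglected).
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def hybrid_var(mu_k, sigma_k, w_k, q=0.01):
    """Solve q = sum_k w_k Phi((nu - mu_k)/sigma_k) for the threshold nu."""
    mix_cdf = lambda v: float(np.sum(w_k * norm.cdf((v - mu_k) / sigma_k)))
    lo = float(np.min(mu_k - 10.0 * sigma_k))      # bracket for the root
    hi = float(np.max(mu_k + 10.0 * sigma_k))
    return brentq(lambda v: mix_cdf(v) - q, lo, hi)

rng = np.random.default_rng(2)                      # illustrative inputs
mu_k = rng.normal(0.0, 20.0, 780)                   # daily P&L means
sigma_k = np.full(780, 10.0)                        # daily P&L uncertainties
w_k = np.full(780, 1.0 / 780)                       # equal weights
nu_99 = hybrid_var(mu_k, sigma_k, w_k, q=0.01)      # 99% VaR threshold
```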
The outcome of applying the methodology described here is illus-
trated in Figure 13.3 using 780 observations of simulated data
representing the behaviour that might be seen for a large portfolio.
The figure shows the probability density of P&L values. These were
initially calculated assuming no uncertainty in the historically
observed market variables changes and bucketed to produce the light
grey histogram. The black triangles on the horizontal axis show the
locations of the 99th and 95th percentile worst loss as estimated
conservatively at the seventh and 39th order statistics respectively.
The black curve shows the effect of incorporating an uncertainty in a
significant number of the positions, which results in a roughly US$10
million uncertainty in the P&L value on any given historical date.
Applying the hybrid VaR methodology produces the smooth black
curve for the P&L probability density function. The white diamonds
on the black curve indicate the locations of the 99th and 95th percen-
tiles of this distribution and are obtained by numerically solving
Equation 13.5. The net effect of introducing the uncertainty is that the
distribution is smoothed and broadened, as evidenced both visually
and by the percentile locations. In practice, when estimating the
percentiles that lie far into a sparsely populated tail, introducing a
non-zero uncertainty may sometimes result in the VaR estimates
being reduced, but this is dependent on exactly where the points in
the tail fall.
Using the same conventions as above, Figure 13.4 shows the result
of applying the hybrid VaR methodology to a modestly sized equally
weighted portfolio in 91 members of CDX High Yield Series 17 over a
recent period of 770 days. The portfolio notional is US$100 million. In
this example, it is assumed that 35% of the total possible historical
observations are missing, which yields an average uncertainty in the
daily returns of US$150,000. As the proportion of missing data
increases, the uncertainty increases and the black curve becomes
smoother as the overall hybrid VaR distribution broadens.


Expected tail loss


The expected tail loss (ETL), or expected shortfall, is a useful alternative risk measure to VaR. It is calculated by considering losses in the historical P&L distribution that exceed some specified threshold, ν. The ETL is the average of those losses that exceed the threshold, which may be calculated in the hybrid VaR framework by:

$$ETL = \frac{\displaystyle\sum_{k=1}^N w_k\cdot\int_{-\infty}^{\nu} x\Pr\nolimits_k(x)\,dx}{\displaystyle\sum_{k=1}^N w_k\cdot\int_{-\infty}^{\nu}\Pr\nolimits_k(x)\,dx}$$

An expression for the denominator can be found in the previous section and an expression for the numerator can be easily obtained from results given in the Appendix:

$$\int_{-\infty}^{\hat\nu}\hat x\,\overline{\Pr}(\hat x)\,d\hat x = -\phi(\hat\nu)\Bigg(1+\left\{\frac{\hat\kappa_3}{6}\big(He_3(\hat\nu)+3He_1(\hat\nu)\big)\right\}$$
$$\qquad+\left\{\frac{\hat\kappa_3^2}{72}\big(He_6(\hat\nu)+6He_4(\hat\nu)\big)+\frac{\hat\kappa_4}{24}\big(He_4(\hat\nu)+4He_2(\hat\nu)\big)\right\}$$
$$\qquad+\left\{\frac{\hat\kappa_3^3}{1296}\big(He_9(\hat\nu)+9He_7(\hat\nu)\big)+\frac{\hat\kappa_3\hat\kappa_4}{144}\big(He_7(\hat\nu)+7He_5(\hat\nu)\big)+\frac{\hat\kappa_5}{120}\big(He_5(\hat\nu)+5He_3(\hat\nu)\big)\right\}\Bigg)$$

and:

$$\int_{-\infty}^{\nu} x\Pr\nolimits_k(x)\,dx = \mu_k\int_{-\infty}^{\hat\nu_k}\overline{\Pr}_k(\hat x)\,d\hat x + \sigma_k\int_{-\infty}^{\hat\nu_k}\hat x\,\overline{\Pr}_k(\hat x)\,d\hat x$$

Summary and conclusion


We have presented a method for the calculation of VaR by historical
simulation in the presence of incomplete market data. It draws on
the spirit of the CAPM and treats the probability distribution asso-
ciated with missing data analytically, thereby providing an efficient
alternative to Monte Carlo methods. The characteristics of the P&L
distributions for daily observations are carried through the calcula-
tion by means of their cumulants and then combined to form an
overall P&L distribution. The upshot is that the P&L probability
distribution from which the VaR is determined becomes continuous
as opposed to a set of discrete samples. Applying the hybrid VaR


methodology has the effect of smoothing and broadening the P&L


distribution. As a result of the smoothing, the calculated VaR
becomes relatively stable day on day. Hybrid VaR is found to
perform very well in practical applications for a broad range of
asset classes.

Appendix: Chebyshev–Hermite polynomials


The Chebyshev–Hermite polynomials are related to the Hermite polynomials, $H_n(x)$, by:

$$He_n(x) = 2^{-n/2}\,H_n\!\left(\frac{x}{\sqrt 2}\right)$$

They are useful in the present context because of the property:

$$(-1)^n\,\frac{d^n\phi(x)}{dx^n} = He_n(x)\cdot\phi(x)$$

The Chebyshev–Hermite polynomials can be generated by the recursion relation:

$$He_0(x) = 1;\qquad He_n(x) = x\,He_{n-1}(x) - (n-1)\,He_{n-2}(x)$$

From the above results, the following derivative and integral identities are easily obtained:

$$\left[He_n(x)\,\phi(x)\right]' = -He_{n+1}(x)\,\phi(x)$$
$$\int_{-\infty}^{\nu} He_n(x)\,\phi(x)\,dx = -He_{n-1}(\nu)\,\phi(\nu)$$
$$\int_{-\infty}^{\nu} x\,He_n(x)\,\phi(x)\,dx = -\left\{He_n(\nu)+n\,He_{n-2}(\nu)\right\}\phi(\nu)$$
$$\int_{-\infty}^{\nu} x^2 He_n(x)\,\phi(x)\,dx = -\left\{He_{n+1}(\nu)+(2n+1)He_{n-1}(\nu)+n(n-1)He_{n-3}(\nu)\right\}\phi(\nu)$$
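A sketch of the recursion, with a finite-difference check of the first derivative identity, follows:

```python
# Sketch: Chebyshev-Hermite polynomials by recursion, plus a numerical check
# of [He_n(x) phi(x)]' = -He_{n+1}(x) phi(x).
import numpy as np
from scipy.stats import norm

def hermite_he(n, x):
    h_prev, h = np.ones_like(x), x        # He_0 and He_1
    if n == 0:
        return h_prev
    for k in range(2, n + 1):
        h_prev, h = h, x * h - (k - 1) * h_prev
    return h

x, eps = np.linspace(-3.0, 3.0, 7), 1e-6
deriv = (hermite_he(3, x + eps) * norm.pdf(x + eps)
         - hermite_he(3, x - eps) * norm.pdf(x - eps)) / (2 * eps)
assert np.allclose(deriv, -hermite_he(4, x) * norm.pdf(x), atol=1e-4)
```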
−∞

The hybrid VaR methodology described here was developed while


the author was at Merrill Lynch and Bank of America Merrill Lynch.


REFERENCES

Blinnikov S. and R. Moessner, 1998, “Expansions for Nearly Gaussian Distributions,” Astronomy and Astrophysics Supplement Series, 130, pp 193–205 (available at www.edpsciences.org/10.1051/aas:1998221).

Britten-Jones M. and S. Schaefer, 1999, “Non-linear Value-at-Risk,” European Finance


Review, 2, pp 161–87.

Charlier C., 1905, “Über das Fehlergesetz Arkiv för Matematik,” Astronomi och Fysik, 2(8),
pp 1–9.

Cornish E. and R. Fisher, 1937, “Moments and Cumulants in the Specification of


Distributions,” Review of the International Statistical Institute, 5, pp 307–20.

Cramér H., 1925, “On Some Classes of Series Used in Mathematical Statistics,”
Proceedings of the Sixth Scandinavian Congress of Mathematicians, Copenhagen.

Dempster A., N. Laird and D. Rubin, 1977, “Maximum Likelihood from Incomplete
Data via the EM Algorithm,” Journal of the Royal Statistical Society Series B (Methodological),
39(1), pp 1–38.

Glasserman P., 2004, Monte Carlo Methods in Financial Engineering (New York, NY:
Springer).

Holton G., 2003, Value-at-risk: Theory and Practice (Amsterdam, Holland: Academic Press).

Jaschke S., 2002, “The Cornish–Fisher Expansion in the Context of Delta-gamma-normal


Approximations,” Journal of Risk, 4, pp 33–55.

Schafer J., 1997, Analysis of Incomplete Multivariate Data (Boca Raton, Fl: Chapman &
Hall/CRC).

227

13 Stuart PCQF.indd 227 11/03/2013 10:15


13 Stuart PCQF.indd 228 11/03/2013 10:15
14
Impact-adjusted Valuation and the
Criticality of Leverage
Jean-Philippe Bouchaud; Fabio Caccioli
and J. Doyne Farmer
Capital Fund Management; Santa Fe Institute and University of Oxford

Mark-to-market or “fair-value” accounting is standard industry


practice. It consists of assigning a value to a position held in a finan-
cial instrument based on the current market clearing price for the
relevant instrument or similar instruments. This is commonly justi-
fied by the theory of efficient markets, which posits that at any
given time market prices faithfully reflect all known information
about the value of an asset. However, mark-to-market prices are
only marginal prices, reflecting the value of selling an infinitesimal
number of shares.
Obviously, traders are typically concerned with selling more
than an infinitesimal number of shares, and are intuitively aware
that this practice is flawed. Selling has an impact on the market,
depressing the price by an amount that increases with the quantity
sold. The first part of a sale will be sold near the current price, but as
more is liquidated the clearing price may drop substantially. This
counterintuitively implies the value of 10% of a company is less
than 10 times the value of 1% of that company. We take advantage
of what has been learned about market impact to propose an
impact-adjusted valuation method that results in better risk control
than mark-to-market valuation. This is in line with other proposals
that valuation should be based on liquidation prices (Acerbi and
Scandolo, 2008, and Caccioli et al, 2011).
The need for a better alternative to marking-to-market is most


evident for positions with leverage, that is, when assets are
purchased with borrowed money. As a leveraged position is sold,
the price tends to drop due to market impact. As it is gradually
unwound, the depression in prices due to impact overwhelms the
decrease in position size, and leverage can initially rise rather than
fall. As more of the position is sold, provided the initial leverage
and initial position are not too large, it will eventually come back
down and the position retains some value. However, if the initial
leverage and position are too large, the leverage diverges during
unwinding, and the resulting liquidation value is less than zero,
that is, the debt to the creditors exceeds the resale value of the asset.
The upshot is that, under mark-to-market accounting, a leveraged
position that appears to be worth billions of dollars may predict-
ably be worth less than nothing by the time it is liquidated. Under
firesale conditions or in very illiquid markets, things are even
worse.
From the point of view of a risk manager or regulator, this makes
it clear that an alternative to mark-to-market accounting is badly
needed. Neglecting impact allows huge positions in illiquid instru-
ments to appear profitable when this is not the case. We propose
such an alternative based on the known functional form of market
impact, and that valuation should be based on expected liquidation
value. While mark-to-market valuation only indicates problems
with excessive leverage after they have occurred, this makes them
clear before positions are entered into. At the macro level, this could
be extremely useful for damping the leverage cycle and coping with
pro-cyclical behaviour (Thurner, Geanakoplos and Farmer, 2012,
and Geanakoplos, 2010). An extended discussion of our proposal
that treats extensions to the problem of risky execution can be found
in Caccioli, Bouchaud and Farmer (2012).

Market impact and liquidation accounting


Accounting based on liquidation prices requires a quantitative
model of market impact. Because market impact is very noisy, and
because it usually requires proprietary data to be studied empiri-
cally, a good picture of market impact has emerged only gradually
in the literature (for reviews, see Bouchaud, Farmer and Lillo, 2009,
Moro et al, 2009, and Toth et al, 2011). Here, we are particularly
concerned with the liquidation of large positions, which must


either be sold in a block market or broken into pieces and executed


incrementally. Our interest is therefore in the impact of a so-called
meta-order, that is, a single large trade that must be executed in
pieces. This is in contrast to the impact of a single small trade in the
order book, or the impact of the average order flow, both of which
have different functional forms and different time dependencies
(see Toth et al, 2011). Empirical studies on meta-orders make it clear that the market impact $I = E[\epsilon \cdot (p_f - p_0)/p_0]$, defined as the expected shift in price from the price $p_0$ observed before a buy trade ($\epsilon = +1$) or a sell trade ($\epsilon = -1$) to the price $p_f$ at which the last share is executed, is a concave function of position size Q normalised by the trading volume V. When liquidation occurs in normal conditions, that is, at a reasonable pace that does not attempt to remove liquidity too quickly from the order book, the expected impact I due to liquidating Q shares is to a large extent universal, independent of the asset, time period, tick size, execution style, etc. It is given by:

$$I(Q) = Y\sigma\sqrt{\frac{Q}{V}} \qquad (14.1)$$

where σ is the daily volatility, V is the daily share transaction volume and Y is a numerical constant of order unity (see Toth et al, 2011, for a detailed discussion). A crucial observation for the validity of our further analysis is that the above formula holds approximately true within each meta-order as well, that is, the impact of the first q shares is simply given by I(q). After completion of the meta-order the behaviour of impact is less clear (Farmer et al, 2011, and Toth et al, 2011).
The earliest theory of market impact (Kyle, 1985) predicted that
expected impact should be linear and permanent. This was further
supported by the work of Huberman and Stanzl (2004), who argued that provided certain assumptions are met, such as lack of correlation in order flow, impact has to be linear in order to avoid arbitrage.
However, empirical studies have made it clear that these assump-
tions are not met (see, for example, Toth et al, 2011), and the
overwhelming empirical evidence that impact is concave has driven
the development of alternative theories. For example, Farmer et al
(2011) have proposed a theory based on a strategic equilibrium
between liquidity demanders and liquidity providers, in which
uncertainty about Q on the part of liquidity providers dictates the


functional form of the impact. Toth et al (2011), in contrast, derive a


square-root impact function within a stochastic order flow model.
Assuming prices are diffusions, they show that this implies a locally
linear latent order book, and provide a proof-of-principle using a
simple agent-based model. Both of these theories roughly predict
square-root impact, although with some differences.
We should stress that the formulas above for market impact hold only in relatively calm market conditions, when execution is slow enough for the order book to replenish between successive trades (Weber and Rosenow, 2005, and Bouchaud, Farmer and Lillo, 2009). If the execution schedule is so aggressive that Q becomes comparable to V, liquidity may dry up, in which case the parameters σ and V can no longer be considered fixed, but themselves react to the trade: σ increases and V drops.
such as the so-called flash crashes, is expected to be much larger
than the square-root formula above. In these cases, the expected
impact becomes less concave and it can become linear or even
super-linear (Gatheral, 2010). For the above impact formula to be

Figure 14.1 Possible deleveraging trajectories

[Figure: leverage λ(x) (y axis, 0–30) against the fraction liquidated x (x axis, 0–1), with curves for I(Q) = 0, 0.1, 0.15 and 0.19]

Note: Possible deleveraging trajectories, showing the leverage λ(x) based on mark-to-market accounting as a function of the fraction x of the position that has been liquidated. We hold the initial leverage λ0 = 9 constant and show four trajectories for different values of the market impact parameter I = I(Q) = Yσ√(Q/V), that is, I = 0 (black dashed line, corresponding to the no-impact case), I = 0.1 (dotted line), 0.15 (black line) and 0.19 (grey dotted-dashed line). If the market impact is too high, the leverage diverges before the position can be liquidated, implying that the position is bankrupt.


valid, the execution time T needs to be large enough that Q remains


much smaller than V (20% is a typical upper limit). The execution
time should not be too long either, otherwise impact is necessarily
linear in Q: beyond the timescale for the market to remember linkages between individual trades, trades must necessarily become
independent and impact must be additive (see Toth et al, 2011).
The establishment of a quantitative theory for expected impact
makes it possible to do impact-adjusted accounting. Rather than
using the mark-to-market price, which is the marginal price of an
infinitesimal liquidation, we propose using the expected price
under complete liquidation, depressed by the impact. Using the
approximation that shares are executed continuously and integrating the impact, it is easy to see that this is given by:

$$\tilde p = p_0\left(1 - \frac{2}{3}\,I(Q)\right) \qquad (14.2)$$

where $p_0$ is the initial mark-to-market price.
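As a minimal numerical sketch of Equations 14.1 and 14.2 (the function names and example inputs are ours, not from the chapter):

```python
# A sketch of Equations 14.1 and 14.2; Y, sigma, V and Q are user inputs.
import math

def impact(Q, V, sigma, Y=1.0):
    """Equation 14.1: expected impact of liquidating Q shares."""
    return Y * sigma * math.sqrt(Q / V)

def impact_adjusted_price(p0, Q, V, sigma, Y=1.0):
    """Equation 14.2: integrating I(q) = I(Q) sqrt(q/Q) over the meta-order
    gives an average discount of (2/3) I(Q) on the marginal price p0."""
    return p0 * (1.0 - (2.0 / 3.0) * impact(Q, V, sigma, Y))

# a stock with 2% daily volatility, position ten times the daily volume
print(impact(Q=10.0, V=1.0, sigma=0.02))              # ~0.063, i.e. 6.3%
print(impact_adjusted_price(100.0, 10.0, 1.0, 0.02))  # ~95.8
```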

The critical nature of leverage


When leverage is used, it becomes particularly important to take
impact into account and value assets based on their expected liqui-
dation prices. Consider an asset manager taking on liabilities L to
hold Q shares of an asset with price p. For simplicity, we consider
the case of a single asset. We define the leverage l as the ratio of the
value of the asset to the total equity:

$$\lambda = \frac{Qp}{Qp - L} \qquad (14.3)$$

In the absence of market impact, selling q shares always decreases leverage linearly, because the denominator remains constant – the cash generated by selling the asset reduces the liability by the same amount, that is, Qp − L → (Q − q)p − (L − qp). So λ → λ(1 − x), where x = q/Q is the fraction of assets sold.
This changes when impact is considered in deleveraging. Selling
q shares pushes current trading prices down, which under mark-to-
market accounting decreases the value of the remaining Q – q
unsold shares. As we will show, this generally overwhelms the
effect of selling the shares, increasing the leverage even as the
overall position is reduced. Letting λ0 be the initial leverage before


Figure 14.2 Leverage as a function of position size

[Figure: two panels of leverage (y axis, 0–30) against the number of shares held q, as the position is built from zero to Q and then unwound back to zero]

Note: The x axes denote the number of shares held. The position q(t) varies from zero to Q in the left half of each panel and from Q to zero in the right. The black dashed line shows the leverage without price impact; the grey line shows the leverage including impact under mark-to-market accounting; and the grey dot-dashed line shows the leverage using impact-adjusted valuation. The upper panel is a case in which Q is small enough that the leverage never becomes critical; the lower panel is a case where the leverage becomes supercritical. In this case, the impact-adjusted leverage diverges as the position is entered, warning the manager of the impending disaster. The vertical grey dashed line shows the critical position qc.

selling begins, the leverage as a function of the fraction x sold can be shown to be:

$$\lambda(x) = \lambda_0\,\frac{(1 - x)\left(1 - I\sqrt{x}\right)}{1 - \lambda_0 I\sqrt{x}\,(1 - x/3)} \qquad (14.4)$$

where $I \equiv I(Q) = Y\sigma\sqrt{Q/V}$ is the impact of selling the entire position. From this expression, one deduces that for small x and any I > 0, λ(x) is larger than λ0 for λ0 > 1, that is, whenever any leverage is

used. This means, seemingly paradoxically, that when selling a leveraged position, the expected leverage under mark-to-market accounting always initially increases. When λ0I > 3/2, the leverage λ(x) in fact diverges during liquidation.
Three representative deleveraging trajectories λ(x) are illustrated in Figure 14.1, together with the trajectory obtained in the absence of market impact. We assume a fixed starting mark-to-market leverage λ0 = 9 and show three cases corresponding to different values of the overall market impact parameter I. For the two cases where the leverage is subcritical, that is, with λ0I < 3/2, the manager unwinds the position without bankruptcy. However, due to the rise in leverage during the course of liquidation, they may have trouble with their prime broker. For example, in the case where I = 0.15, λ(x) at its peak is more than twice its starting value (see Figure 14.1). The case where the leverage is allowed to become supercritical is a disaster. If λ0I > 3/2, which for λ0 = 9 implies I > 0.16, the manager is trapped, and the likely outcome of attempting to deleverage is bankruptcy.
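The trajectories just described follow directly from Equation 14.4; a short sketch (our own naming) with the λ0I > 3/2 criticality test made explicit:

```python
# A sketch of the deleveraging trajectories of Equation 14.4 and Figure 14.1.
import math

def leverage_path(lam0, I, x):
    """Mark-to-market leverage after liquidating a fraction x in [0, 1)."""
    num = lam0 * (1.0 - x) * (1.0 - I * math.sqrt(x))
    den = 1.0 - lam0 * I * math.sqrt(x) * (1.0 - x / 3.0)
    return num / den

lam0 = 9.0
for I in (0.10, 0.15, 0.19):
    peak = max(leverage_path(lam0, I, k / 1000.0) for k in range(1000))
    tag = "supercritical" if lam0 * I > 1.5 else "subcritical"
    print(f"I = {I:.2f}: {tag}, peak mark-to-market leverage {peak:.1f}")
# for I = 0.19, lam0 * I = 1.71 > 3/2: the denominator vanishes during the
# unwind and the printed 'peak' blows up, matching Figure 14.1's divergence
```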
Risk management is improved by impact-adjusted accounting, simply by using the average impact-adjusted valuation price $\tilde p$ in the formula for leverage. In Figure 14.2, we show how the leverage
behaves when a manager first steadily assumes a position 0 ≤ q(t) ≤
Q and then steadily liquidates it. We compare three different notions
of leverage:

❑❑ No-impact leverage is represented by the dashed black line. This


is the leverage that would exist if the price remained constant on
average. It rises and falls linearly proportional to the position
q(t).
❑❑ Mark-to-market leverage is represented by the grey line. While
the position is building, it rises more slowly than linearly,
because impact causes the price to increase, partially offsetting the increasing position size. When the position is exited, the
expected leverage initially shoots up. In the subcritical case
(upper graph) it eventually returns to zero, but in the supercrit-
ical case (lower graph) it diverges, making the position
bankrupt.
❑❑ Impact-adjusted leverage is represented by the dashed grey line,
and is always greater than the other two measures. It is


particularly useful in the supercritical case – its rapid increase is a clear warning that a problem is developing, in contrast to the decreasing mark-to-market leverage. In particular, this shows how dangerous the mark-to-market case is – it overestimates profits and depresses the true leverage value. Over-leveraging is only revealed when it is too late. A prudent risk manager would use impact-adjusted leverage to avoid bankruptcies.

So far it is not clear whether the effects we have illustrated in the preceding sections happen under realistic conditions, or whether they require such extreme conditions as to be practically unimportant. In this section, we plug in some numbers for different assets and show that these effects can indeed happen under realistic conditions.
Let us first give some orders of magnitude for stock markets. The daily volume of a typical stock is roughly 5 × 10⁻³ of its market cap, while its volatility is of the order of 2% a day. Suppose the portfolio to be liquidated owns Q = 5% of the market cap of a given stock. Taking Y = 1, the impact discount is:

$$I(Q) \approx 2\% \times \sqrt{\frac{0.05}{0.005}} \approx 6\% \qquad (14.5)$$

A 6% haircut on the value of a portfolio of very liquid stocks is


already quite large, and it is obviously much larger for less liquid/
more volatile markets.
Let us now turn to the question of the critical leverage λc under mark-to-market accounting. From the discussion above, the condition reads:

$$\lambda_c I = \frac{3}{2} \;\Rightarrow\; \lambda_c = \frac{3}{2Y\sigma}\sqrt{\frac{V}{Q}} \qquad (14.6)$$

To get a feeling for whether or not these conditions can be met, we


present representative values for several different assets. For futures
we assume Q = V, implying that it would take five days to trade out
of the position with a participation rate of 20%. For stocks we
assume Q = 10V, which assuming the same participation rate
implies a position that would take 50 trading days to unwind. Such
positions might seem large, but they do occur for large funds; for
instance, Warren Buffett was reported to have taken more than eight


Table 14.1 Rough orders of magnitude for the numerical parameters entering the impact formula given in Equation 14.1, with the corresponding estimates of impact and critical leverage

Asset          σ (daily)   V (US$bn)   I*       λc
Bund**         0.4%        140         0.4%     ~300
S&P 500**      1.6%        150         1.6%     ~100
MSFT***        2%          1.25        6.3%     ~25
AAPL***        2.8%        0.5         8.9%     ~17
KKD****        2.5%        2           7.9%     ~16
ClubMed*****   4.3%        1           13.5%    ~11

Note: Except as otherwise noted, numbers are based on data for first-quarter 2008
* Impact I = I(Q) based on volatility and volume, calculated with Equation 14.1, with Y = 1 and Q = V for futures and Q = 10V for stocks, roughly 5% of the market capitalisation
** For futures, we refer to the nearest maturity; the numbers for the 10-year US note are very similar to those for the Bund
*** Large cap US stocks, Q = 10V
**** Krispy Kreme Doughnuts, a small cap US stock, March 2012, with Q = 10V and V in US$m
***** ClubMed, a small cap French stock, Q = 10V, with V in €m

months to buy a 5.5% share of IBM. The results are given in Table
14.1.
We see that for liquid futures, such as the Bund or S&P 500, the
critical leverage is large enough that the phenomenon we discuss
here is unlikely to ever occur. As soon as we enter the world of
equities, however, the situation looks quite different. For over-the-
counter markets, the effect is certainly very real. Using reasonable
estimates, we find that the impact of deleveraging a position can
easily reach 20% on these markets, corresponding to a critical leverage λc ≈ 7.5.
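These order-of-magnitude estimates are easy to reproduce from Equations 14.1 and 14.6; the sketch below uses Y = 1 and the chapter's illustrative inputs, and lands close to the rough figures in Table 14.1:

```python
# Reproducing the Table 14.1 orders of magnitude (Y = 1; inputs illustrative).
import math

def critical_leverage(sigma, q_over_v, Y=1.0):
    """Equation 14.6: lambda_c = 3 / (2 Y sigma) * sqrt(V / Q)."""
    return 1.5 / (Y * sigma * math.sqrt(q_over_v))

for asset, sigma, q_over_v in [("Bund", 0.004, 1.0), ("S&P 500", 0.016, 1.0),
                               ("MSFT", 0.020, 10.0), ("ClubMed", 0.043, 10.0)]:
    I = sigma * math.sqrt(q_over_v)         # Equation 14.1 with Y = 1
    lc = critical_leverage(sigma, q_over_v)
    print(f"{asset:8s} impact {I:6.1%}   critical leverage ~{lc:.0f}")
```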

Conclusion
Position valuations need to be based on liquidation prices rather than mark-to-market prices. For small, unleveraged positions in liquid markets
there is no problem, but as soon as any of these conditions are
violated, the problem can become severe. As we have shown,
standard valuations, which do nothing to take impact into account,
can be wildly overoptimistic.
Impact-adjusted accounting gives a more realistic value by esti-
mating liquidation prices based on recent advances in understanding
market impact. If one believes – as we do – that Equation 14.1 is a


reasonable representation of the impact that an asset manager will


unavoidably incur when liquidating their position, our procedure
has the key virtue of being extremely easy to implement. It is based
on quantities such as volatility, trading volume or the spread, which
are all relatively easy to measure. Risk estimates can be calculated
for the typical expected behaviour or for the probability of a loss of
a given magnitude (see Caccioli, Bouchaud and Farmer, 2012).
The worst negative side-effects of mark-to-market valuations
occur when leverage is used. As we have shown here, when
liquidity is low, leverage can become critical. By this we mean that
as a position is being entered there is a critical value of the leverage
lc above which it becomes very likely that liquidation will result in
bankruptcy, that is, liquidation value less than money owed to cred-
itors. This does not require bad luck or unusual price fluctuations
– it is a nearly mechanical consequence of using too much leverage.
Standard mark-to-market accounting gives no warning of this
problem, in fact quite the opposite: mark-to-market prices rise as a
position is purchased, causing leverage to be underestimated.
However, as a position is unwound the situation is reversed. The
impact of unwinding causes leverage to rise, and if the initial
leverage is at or above a critical value, the leverage becomes infinite
and the position is bankrupt. Under mark-to-market accounting
this comes as a complete surprise. Under impact-adjusted
accounting, in contrast, the warning is clear. As the critical point is
approached, the impact-adjusted leverage diverges, telling any
sensible portfolio manager that it is time to stop buying.
The method of valuation that we propose here could potentially
be used both by individual risk managers as well as by regulators.
Had such procedures been in place in the past, we believe many
previous disasters could have been avoided. As demonstrated in
the previous section, the values where leverage becomes critical are
not unreasonable compared with those used before, such as the
leverages of 50–100 used by Long-Term Capital Management in
1998, or 30–40 used by Lehman Brothers and other investment
banks in 2008.
However, one should worry about other potentially destabilising
feedback loops that our impact-adjusted valuation could trigger.
For example, in a crisis situation, spreads and volatilities increase
while the liquidity of the market decreases. Updating the


parameters entering the impact formula (volatility, spread and


available volumes) too quickly would predict a deeper discount on
the asset valuation, potentially leading to further fire sales, fuelling
more panic, etc. It is therefore important to estimate parameters
using a slow-moving average to avoid any overreaction to tempo-
rary liquidity droughts. This observation is in fact quite general:
recalibrating models after every market hiccup often leads to
instabilities.
The failure of marginal prices as a useful means of valuation is
part of an emerging view of markets as dynamic, endogenously
driven and self-referential (Bouchaud, 2010), as suggested long ago
by Keynes and more recently by Soros. For example, studies suggest
that exogenous news plays a minor role in explaining major price
jumps (Joulin et al, 2008), while self-referential feedback effects are
strong (Filimonov and Sornette, 2012). Market prices are moulded
and shaped by trading, just as trading is moulded and shaped by
prices, with intricate and sometimes destabilising feedback. Because
the liquidity of markets is so low, the impact of trades is essential to
understand why prices move (Bouchaud, Farmer and Lillo, 2009).

This work was supported by the National Science Foundation under


grant 0965673, the European Union Seventh Framework Programme
FP7/2007-2013 under grant agreement CRISIS-ICT-2011-288501 and
the Sloan Foundation. Jean-Philippe Bouchaud acknowledges impor-
tant discussions with Xavier Brockman, Julien Kockelkoren, Yves
Lempérière and Bence Toth.

REFERENCES

Acerbi C. and G. Scandolo, 2008, “Liquidity Risk Theory and Coherent Measures of
Risk,” Quantitative Finance, 8(7), pp 681–92.

Bouchaud J.-P., 2010, “The Endogenous Dynamics of Markets: Price Impact, Feedback
Loops and Instabilities,” in A. Berd (Ed), Lessons From The Financial Crisis (London,
England: Risk Books).

Bouchaud J.-P., D. Farmer and F. Lillo, 2009, “How Markets Slowly Digest Changes
in Supply and Demand,” in T. Hens and K. Schenk-Hoppe (Eds), Handbook of Financial
Markets: Dynamics and Evolution (Amsterdam, Holland: Elsevier).

Caccioli F., J.-P. Bouchaud and D. Farmer, 2012, "A Proposal for Impact Adjusted Valuation: Critical Leverage and Execution Risk" (available at arXiv:1204.0922).


Caccioli F., S. Still, M. Marsili and I. Kondor, 2011, "Optimal Liquidation Strategies Regularize Portfolio Selection," European Journal of Finance, special issue.

Farmer D., A. Gerig, F. Lillo and H. Waelbroeck, 2011, “How Efficiency Shapes Market
Impact,” technical reprint (available at http://arxiv.org/abs/1102.5457).

Filimonov V. and D. Sornette, 2012, “Quantifying Reflexivity in Financial Markets:


Towards a Prediction of Flash Crashes,” technical reprint.

Gatheral J., 2010, "No-dynamic-arbitrage and Market Impact," Quantitative Finance, 10, pp 749–59.

Geanakoplos J., 2010, “Solving the Present Crisis and Managing the Leverage Cycle,”
FRBNY Economic Policy Review, pp 101–31.

Huberman G. and W. Stanzl, 2004, “Price Manipulation and Quasi-arbitrage,”


Econometrica, 72(4), pp 1,247–75.

Joulin A., A. Lefevre, D. Grunberg and J.-P. Bouchaud, 2008, “Stock Price Jumps: News
and Volume Play a Minor Role,” Wilmott Magazine, 46, September–October, pp 1–7.

Kyle A., 1985, “Continuous Auctions and Insider Trading,” Econometrica, 53, pp 1,315–35.

Moro E., L. Moyano, J. Vicente, A. Gerig, D. Farmer, G. Vaglica, F. Lillo and R.


Mantegna, 2009, “Market Impact and Trading Protocols of Hidden Orders in Stock
Markets,” Physical Review, E, 80(6), 066102.

Thurner S., G. Geanakoplos and D. Farmer, 2012, “Leverage Causes Fat Tails and
Clustered Volatility,” Quantitative Finance, 12(5), pp 695–707.

Toth B., Y. Lemperiere, C. Deremble, J. de Lataillade, J. Kockelkoren and J.-P.


Bouchaud, 2011, “Anomalous Price Impact and the Critical Nature of Liquidity in
Financial Markets,” Physical Review, X, 1(2), 021006.

Weber P. and B. Rosenow, 2005, “Order Book Approach to Price Impact,” Quantitative
Finance, 5, pp 357–64.



Section 3

Counterparty Credit Risk

15
Being Two-faced Over
Counterparty Credit Risk
Jon Gregory
Solum Financial Partners

Counterparty credit risk is the risk that a counterparty in a financial


contract will default prior to the expiry of the contract and fail to
make future payments. Counterparty risk is taken by each party in
an over-the-counter derivatives contract and is present in all asset
classes, including interest rates, foreign exchange, equity deriva-
tives, commodities and credit derivatives. Given the decline in
credit quality and heterogeneous concentration of credit exposure,
the high-profile defaults of Enron, Parmalat, Bear Stearns and
Lehman Brothers, and writedowns associated with insurance
purchased from monoline insurance companies, the topic of counterparty risk management remains ever-important.
A typical financial institution, while making use of risk mitigants
such as collateralisation and netting, will still take a significant amount
of counterparty risk, which needs to be priced and risk-managed
appropriately. Since the early 2000s, some financial institutions have
built up their capabilities for handling counterparty risk and active
hedging has also become common, largely in the form of buying
credit default swap (CDS) protection to mitigate large exposures (or
future exposures). Some financial institutions have a dedicated unit
that charges a premium to each business line and in return takes on
the counterparty risk of each new trade, taking advantage of port-
folio-level risk mitigants such as netting and collateralisation. Such
units might operate partly on an actuarial basis, utilising the diversi-
fication benefits of the exposures, and partly on a risk-neutral basis,
hedging key risks such as default and forex volatility.


A typical counterparty risk business line will have significant


reserves held against some proportion of expected and unexpected
losses, taking into account hedges. The significant increases in
credit spreads, especially in the financial markets, will have
increased such reserves and/or future hedging costs associated
with counterparty risk. It is perhaps not surprising that many insti-
tutions, notably banks, are increasingly considering the two-sided
or bilateral nature when quantifying counterparty risk.1 A clear
advantage of doing this is that it will dampen the impact of credit
spread increases by offsetting mark-to-market losses arising, for
example, from increases in required reserves. However, it requires
an institution to attach economic value to its own default, just as it
may expect to make an economic loss when one of its counterpar-
ties defaults. While it is true a corporation does “gain” from its own
default, it might seem strange to take this into account from a
pricing perspective. In this chapter, we will make a quantitative
analysis of the pricing of counterparty risk and use this to draw
conclusions about the validity of bilateral pricing.

Unilateral counterparty risk


The reader is referred to Pykhtin and Zhu (2006) for an excellent
overview of measuring counterparty risk. We denote by V(s, T) the
value at time s of a derivatives position with a final maturity date of
T. The value of the position is known with certainty at the current
time t (t < s ≤ T). We note that the analysis is general in the sense that
V(s, T) could indicate the value of a single derivatives position or a
portfolio of netted positions,2 and could also incorporate effects
such as collateralisation. In the event of default, an institution must
consider the following two situations:

❑❑ V(s, T) > 0. In this case, since the netted trades are in the institution's favour (positive present value), it will close out the position but retrieve only a recovery value, δCV(s, T), with δC a percentage recovery fraction.
❑❑ V(s, T) ≤ 0. In this case, since the netted trades are valued against
the institution, it is still obliged to settle the outstanding amount
(it does not gain from the counterparty defaulting).

We can therefore write the payout3 in default as δCV(τC, T)+ + V(τC, T)− where τC is the default time of the counterparty. The risky value of a


trade or portfolio of trades where the counterparty may default at


some time in the future is then:

V (t,T ) = Et ⎡⎢1τ C >T V (t,T ) + 1τ C ≤T


⎣

(V (t, τ + V (τ C ,T ) ⎤⎦ 
)
+ −
C ) + δCV (τ C ,T ) (15.1)

The first term in the expectation is simply the risk-free value conditional upon no default before the final maturity. The second component, $\mathbf{1}_{\tau_C \le T}\,V(t, \tau_C)$, corresponds to the cashflows paid up to4 the default time. The final components can be identified as the default payout as described above.
Rearranging the above equation, we obtain:
V (t,T )
= Et ⎡⎣1τ C >T V (t,T ) + 1τ C ≤T

(V (t, τ C ) + δCV (τ C ,T )
+
+ V (τ C ,T ) − V (τ C ,T ) ⎤⎦
+
)
= Et ⎡⎣1τ C >T V (t,T ) + 1τ C ≤T V (t,T )

(
+ 1τ C ≤T δCV (τ C ,T ) − V (τ C ,T ) ⎤⎦
+ +
)
= V (t,T ) − Et ⎡⎣1τ C ≤T (1− δC ) V (τ C ,T ) ⎤⎦
+  (15.2)

This allows us to express the risky value as the risk-free value less an additional component. This component is often referred to (see, for example, Pykhtin and Zhu, 2006) as the credit value adjustment (CVA). As first discussed by Sorensen and Bollier (1994), an analogy is often made that the counterparty is long a series of options. Let us denote the standard CVA in this unilateral case as:

$$\mathrm{CVA}_{\mathrm{unilateral}} = E_t\Big[\mathbf{1}_{\tau_C \le T}\,(1 - \delta_C)\,V(\tau_C, T)^+\Big] \qquad (15.3)$$

We might calculate the expectation under the risk-neutral (Q) or


the real probability measure (P), in the latter case using historical
analysis rather than market-implied parameters. Traditionally, the
real measure is used in risk management applications involving
modelling future events such as exposures. However, since the
default component of the CVA is likely to be hedged, the risk-
neutral measure is more appropriate. Since most counterparty risk
books may hedge only the major risks and are therefore part risk-
neutral, part real, we can note that the choice of measure to use in
Equation 15.3 is a rather subtle point.


Bilateral counterparty risk – no simultaneous defaults

The unilateral treatment neglects the fact that an institution may default before its counterparty, in which case the latter default would become irrelevant. Furthermore, the institution actually gains following its own default since it will pay the counterparty only a fraction of the value of the contract. The payout to the institution in its own default is δIV(τI, T)− + V(τI, T)+ with τI and δI representing its own default time and associated recovery percentage (to its counterparties), respectively.
Denoting by τ1 = min(τC, τI) the "first-to-default" time of both the institution and counterparty, and assuming that simultaneous defaults are not possible, the valuation equation becomes:
V (t,T )
⎡ ⎛ ⎞⎤
⎜V (t, τ ) +
1
⎢ ⎟⎥
⎢ ⎜ ⎟⎥
⎢ ⎜
( C
1 + 1 −
= Et ⎢1τ 1 >T V (t,T ) + 1τ 1 ≤T ⎜1τ 1 =τ δCV (τ ,T ) + V (τ ,T ) +⎟⎥
⎟⎥
)
⎢
⎢⎣ (
⎜1 1 δ V τ 1 ,T − + V τ 1 ,T + ⎟⎥
⎝ τ =τ I I ( ) ( ) ⎠⎥⎦ )
= V (t,T )

⎣ ( C I
)
−Et ⎡⎢1τ 1 ≤T 1τ 1 =τ (1− δC ) V (τ 1 ,T ) + 1τ 1 =τ (1− δI ) V (τ 1 ,T ) ⎤⎥
+ −

⎦
(15.4)

We can identify the first component in Equation 15.4 as being the


same adjustment as before conditioned on no default of the institu-
tion. The additional term corresponds to the gain made by the
institution in the event of its default (conditional on no previous
counterparty default). This term is commonly referred to as DVA, which stands for debt value adjustment. Using the Sorensen and Bollier (1994) analogy, the institution is then also long a series of options on the reverse contract. We note that the mean of the future distribution of V(τ1, T) (for example, due to forward rates being far from spot rates) will be important in determining the relative value of the two terms above, in addition to the individual default probabilities.


Bilateral counterparty risk – with simultaneous defaults

For the reader to gain some insight into bilateral CVA, we extend the formula to allow for a simultaneous default of both parties at a time denoted by τ. One motivation for this is that super-senior
tranched credit protection has been traded at significant premiums.
For example, in the case of the 30–100% tranche on the CDX IG
index, 54 out of 125 investment-grade defaults5 are required to
cause a loss on the tranche and yet the five-year maturity tranche
has for the past year traded at a premium of around 50 basis points
a year (a significant proportion of many financial spreads). The
price of such protection is often modelled through a catastrophic
event causing many simultaneous (or closely clustered) default
events. The joint default representation can also be thought of as a
simple way to introduce systemic over idiosyncratic risk.
With joint default of the counterparty and institution, the valuation formula becomes:

$$\begin{aligned}
\hat V(t,T) = E_t\Big[&\mathbf{1}_{\tau^1 > T}\,V(t,T) + \mathbf{1}_{\tau^1 \le T}\Big(V(t,\tau^1)\\
&+ \mathbf{1}_{\tau^1=\tau_C}\big(\delta_C V(\tau^1,T)^+ + V(\tau^1,T)^-\big)\\
&+ \mathbf{1}_{\tau^1=\tau_I}\big(\delta_I V(\tau^1,T)^- + V(\tau^1,T)^+\big)\\
&+ \mathbf{1}_{\tau^1=\tau}\big(\delta_C V(\tau^1,T)^+ + \delta_I V(\tau^1,T)^-\big)\Big)\Big]\\
= V(t,T) - E_t\Big[&\mathbf{1}_{\tau^1 \le T}\Big(\mathbf{1}_{\tau^1=\tau_C}(1-\delta_C)V(\tau^1,T)^+\\
&+ \mathbf{1}_{\tau^1=\tau_I}(1-\delta_I)V(\tau^1,T)^-\\
&+ \mathbf{1}_{\tau^1=\tau}\big(V(\tau^1,T) - \delta_C V(\tau^1,T)^+ - \delta_I V(\tau^1,T)^-\big)\Big)\Big]\\
= V(t,T) &- \mathrm{CVA}_{\mathrm{bilateral}}
\end{aligned} \qquad (15.5)$$

with τ1 = min(τC, τI, τ). The final term corresponds to the fact that in the event of joint default, the value of the derivatives position is essentially cancelled, with a recovery value paid to whichever party is owed money. It can be seen that an overall positive (negative) CVA will increase (decrease) with increasing joint default probability.6
We will make the common assumption that the default times and


value of the derivatives portfolio are independent. This is a rather standard simplification in the case where there is no obvious "wrong-way risk" (which clearly exists in credit derivatives and certain other cases).7 The most straightforward way to calculate the expectation in Equation 15.5 is by discretisation over a suitable time grid [t0 = t, t1, ..., tm−1, tm = T]. With this and the independence assumption we obtain:

$$\begin{aligned}
\mathrm{CVA}_{\mathrm{bilateral}} \approx\ &\sum_{i=1}^{m} Q\big(\tau_C\in[t_{i-1},t_i],\,\tau_I>t_i,\,\tau>t_i\big)\,E_t\big[(1-\delta_C)\,V(\tau_C,T)^+\big]\\
+\,&\sum_{i=1}^{m} Q\big(\tau_I\in[t_{i-1},t_i],\,\tau_C>t_i,\,\tau>t_i\big)\,E_t\big[(1-\delta_I)\,V(\tau_I,T)^-\big]\\
+\,&\sum_{i=1}^{m} Q\big(\tau\in[t_{i-1},t_i],\,\tau_C>t_i,\,\tau_I>t_i\big)\,E_t\big[V(\tau,T) - \delta_C V(\tau,T)^+ - \delta_I V(\tau,T)^-\big]
\end{aligned} \qquad (15.6)$$

Example
We now present a simple example8 assuming that the counterparty and institution default probabilities (conditional on no joint default) are correlated according to a Gaussian copula. The correlation parameter is denoted by ρ. Following the Gaussian correlation assumption between τC and τI and the independence of τ, the above probabilities can be readily calculated, for example:
$$\begin{aligned}
&Q\big(\tau_C\in[t_{i-1},t_i],\,\tau_I>t_i,\,\tau>t_i\big)\\
&\quad= Q(\tau_C>t_{i-1},\,\tau_I>t_i,\,\tau>t_i) - Q(\tau_C>t_i,\,\tau_I>t_i,\,\tau>t_i)\\
&\quad= \Big[N_{2d}\big(N^{-1}(Q(\tau_C>t_{i-1})),\,N^{-1}(Q(\tau_I>t_i));\,\rho\big)\\
&\qquad\ - N_{2d}\big(N^{-1}(Q(\tau_C>t_i)),\,N^{-1}(Q(\tau_I>t_i));\,\rho\big)\Big]\,Q(\tau>t_i)
\end{aligned} \qquad (15.7)$$

where N(.) and N2d(.) represent the univariate and bivariate cumula-
tive normal distribution functions.
We assume that the probabilities of default are determined by:

$$Q(\tau_C > s) = \exp\big[-(\lambda_C - \lambda)\,s\big], \quad \lambda_C \ge \lambda \qquad (15.8a)$$

$$Q(\tau_I > s) = \exp\big[-(\lambda_I - \lambda)\,s\big], \quad \lambda_I \ge \lambda \qquad (15.8b)$$

$$Q(\tau > s) = \exp[-\lambda s] \qquad (15.8c)$$
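These probabilities are straightforward to evaluate with a bivariate normal CDF. The sketch below (our own naming, with scipy supplying N and N2d) implements Equation 15.7 using the survival curves of Equations 15.8a–c:

```python
# A sketch of Equation 15.7 with the survival curves of Equations 15.8a-c.
import numpy as np
from scipy.stats import norm, multivariate_normal

def default_bucket_prob(t_prev, t_i, lam_c, lam_i, lam, rho):
    """Q(tau_C in [t_prev, t_i], tau_I > t_i, tau > t_i)."""
    n2d = multivariate_normal(mean=[0.0, 0.0],
                              cov=[[1.0, rho], [rho, 1.0]]).cdf
    inv = lambda q: norm.ppf(np.clip(q, 1e-12, 1.0 - 1e-12))
    q_c = lambda s: np.exp(-(lam_c - lam) * s)     # Equation 15.8a
    q_i = lambda s: np.exp(-(lam_i - lam) * s)     # Equation 15.8b
    a = n2d([inv(q_c(t_prev)), inv(q_i(t_i))])
    b = n2d([inv(q_c(t_i)), inv(q_i(t_i))])
    return (a - b) * np.exp(-lam * t_i)            # times Q(tau > t_i)

# sanity check: with rho = 0 the expression factorises into independent terms
print(default_bucket_prob(1.0, 1.25, lam_c=0.02, lam_i=0.04, lam=0.0, rho=0.0))
```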


Figure 15.1 Expected exposure profiles for case A and case B with μ = −1%, σ = 10% and μ = 1%, σ = 10%, respectively

[Figure: EPE and ENE profiles (exposure, %) for cases A and B against time from zero to 10 years]

where λC, λI and λ are deterministic default intensities that could readily be made time-dependent or, in a more complex approach, stochastic. The joint default probability, λ, could be calculated from the prices of nth-to-default baskets or (under the assumption that this will be a systemic event) senior tranches of a relevant credit index. Subsequently, λC and λI can be calibrated to the CDS spreads and recovery rates of the counterparty and institution, respectively. Since derivatives under standard International Swaps and Derivatives Association documentation are pari passu with senior debt,9 a cancellation effect means we do not expect a considerable impact from differing recovery assumptions.
We finally use the simple representation:10

$$V(s,T) = \mu(s - t) + \sigma\sqrt{s - t}\,Z$$

where μ and σ are drift11 and volatility parameters, respectively, and Z is a random variable drawn from a standard normal distribution.

Table 15.1 Unilateral and bilateral CVA values for case A and case B under the assumption of independence

                       Case A     Case B
Unilateral             0.668%     2.140%
Unilateral adjusted    0.535%     1.902%
Bilateral             −1.366%     1.366%


Figure 15.2 CVA as a function of correlation between counterparty and institution default for case A (top) and case B (bottom)

[Figure: two panels of CVA (%) against correlation (%) from 0 to 100, each showing the unilateral, unilateral adjusted and bilateral measures]

The simple assumptions above allow us to calculate the required exposure quantities as:

$$E_t\big[V(s,T)^+\big] = \mu\,\Delta x\,N\big(\mu\sqrt{\Delta x}/\sigma\big) + \sigma\sqrt{\Delta x}\,\varphi\big(\mu\sqrt{\Delta x}/\sigma\big) \qquad (15.9a)$$

$$E_t\big[V(s,T)^-\big] = \mu\,\Delta x\,N\big(-\mu\sqrt{\Delta x}/\sigma\big) - \sigma\sqrt{\Delta x}\,\varphi\big(\mu\sqrt{\Delta x}/\sigma\big) \qquad (15.9b)$$

$$\Delta x = s - t$$

where φ(.) represents the normal density function. These components are typically known as the expected positive exposure (EPE) and the expected negative exposure (ENE). Under the independence assumptions, interest rates simply amount to multiplicative


Figure 15.3 CVA as a function of the systemic spread intensity with zero correlation for case A (top) and case B (bottom)

[Figure: two panels of CVA (%) against the joint default intensity λ, each showing the unilateral, unilateral adjusted and bilateral measures]

components via discount factors, and thus to simplify and aid


reproduction of the results, we ignore them.

Let us assume a maturity of 10 years, that δC = δI = 40%, and define two parameter sets:12

❑❑ Case A: μ = −1%, σ = 10%, λC = 2%, λI = 4%.
❑❑ Case B: μ = +1%, σ = 10%, λC = 4%, λI = 2%.

The (symmetric) exposure profiles EPE and ENE are shown in Figure 15.1.
We will consider three distinct CVA measures outlined below:


❑❑ Unilateral. This is the standard unilateral formula given in


Equation 15.3.
❑❑ Adjusted unilateral. This is the unilateral adjustment but taking into account the default probability of the institution, that is, the first term in Equation 15.6, with no negative contribution as can arise from the second and third terms.
❑❑ Bilateral. The bilateral CVA given by Equation 15.6.

Since this article was originally published there has been a discus-
sion on “closeout conventions” in relation to the above components
and the reader is referred to Brigo and Morini (2011) and Gregory
and German (2013) for further reading. Initially we assume zero correlation and zero joint default probability, ρ = λ = 0, and show the three CVA values in Table 15.1.
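In the independent case the whole calculation collapses to a one-dimensional sum over the time grid; the following sketch (our naming throughout) reproduces the Table 15.1 figures approximately, with small differences reflecting the discretisation:

```python
# A sketch reproducing Table 15.1 in the independent case (rho = lambda = 0).
import numpy as np
from scipy.stats import norm

def epe_ene(mu, sigma, dt):
    """Equations 15.9a/b with Delta x = dt."""
    d = mu * np.sqrt(dt) / sigma
    epe = mu * dt * norm.cdf(d) + sigma * np.sqrt(dt) * norm.pdf(d)
    ene = mu * dt * norm.cdf(-d) - sigma * np.sqrt(dt) * norm.pdf(d)
    return epe, ene

def cva_measures(mu, sigma, lam_c, lam_i, delta=0.4, T=10.0, m=2000):
    t = np.linspace(0.0, T, m + 1)
    q_c, q_i = np.exp(-lam_c * t), np.exp(-lam_i * t)
    uni = adj = dva = 0.0
    for i in range(1, m + 1):
        epe, ene = epe_ene(mu, sigma, 0.5 * (t[i - 1] + t[i]))
        uni += (q_c[i - 1] - q_c[i]) * (1 - delta) * epe            # Eq 15.3
        adj += (q_c[i - 1] - q_c[i]) * q_i[i] * (1 - delta) * epe   # 1st term, Eq 15.6
        dva += (q_i[i - 1] - q_i[i]) * q_c[i] * (1 - delta) * ene   # 2nd term, Eq 15.6
    return uni, adj, adj + dva

for case, mu, lc, li in [("A", -0.01, 0.02, 0.04), ("B", +0.01, 0.04, 0.02)]:
    u, a, b = cva_measures(mu, 0.10, lc, li)
    print(f"case {case}: unilateral {u:.3%}, adjusted {a:.3%}, bilateral {b:.3%}")
# output is close to Table 15.1, e.g. case A: ~0.67%, ~0.54%, ~ -1.37%
```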
Case A represents a situation where the bilateral CVA is negative due to the institution's higher default probability and the high chance that they will owe money on the contract (negative exposure due to μ = −1%). Case B is the opposite case and, since the counterparty is more risky than the institution, the bilateral CVA is reduced by only around one third compared with the unilateral case. We see that, since case A and case B represent equal and opposite scenarios for each party, the sum of the bilateral adjustments is zero.
Now we show the impact of correlation on the CVA. As shown in
Figure 15.2, correlation can have a reasonably significant impact on
both the unilateral and bilateral values. As correlation increases, we
approach comonotonicity, where the more risky credit is sure to
default first. This means that, in case A, the unilateral adjusted CVA
goes to zero (the institution is sure to default first), while in case B it
converges to the pure unilateral value (the counterparty is sure to
default first).
Let us finally consider the impact of joint default in Figure 15.3, which illustrates the three CVA components versus the joint default intensity, λ ≤ min(λC, λI). We see that joint default plays a similar role
to that of correlation but does not have a significant impact on the
bilateral CVA. This illustrates, importantly, that even with high
joint default probability (systemic component), a substantial
portion of the bilateral benefit comes from the idiosyncratic compo-
nent, a point that is particularly acute in case A.


Bilateral or unilateral?
An obvious implication of the bilateral formula is that the overall
CVA may be negative, that is, actually increase the overall value of
the derivatives position(s). Another result of the above symmetry is
that the overall amount of counterparty risk in the market would be
zero.13 While this symmetry of the bilateral risk might seem reasonable and clean, let us consider the associated hedging issues. While
the default component of the unilateral CVA is often hedged by
buying CDS protection on the counterparty, the additional term in
the bilateral formula would require an institution to sell CDS protec-
tion on themselves (or trading their credit quality in some other way
such as by shorting their own stock). Even using the “adjusted
unilateral” CVA is debatable on hedging grounds since the relevant
hedging instruments do not exist (for example, an institution buying
CDS protection that cancels if they themselves default).
Since hedging arguments do not support the use of a bilateral
CVA, let us consider the ways in which the bilateral reduction to the
CVA could be monetarised.

❑❑ File for bankruptcy. An institution can obviously realise the bilat-


eral reduction by going into bankruptcy but, since the component
is directly related to default, this is a circular argument. Consider
a firm with a bilateral counterparty benefit so substantial that it
can prevent their bankruptcy. Yet going into bankruptcy is the
only way to realise the bilateral counterparty risk gain!
❑❑ Get very close to bankruptcy. The institution may realise bilateral
CVA if a trade is unwound at some point, probably due to their
heavily declining credit quality. For example, some monolines
have gained from banks unwinding senior credit insurance and
realising large CVA-related losses. However, we would suggest
that an institution would need to be in severe financial distress
and not expected to survive before being able to recognise gains
in this way. Indeed, one way of interpreting the failure of mono-
lines is through a naive use of bilateral counterparty risk pricing.
❑❑ Beta hedging. While it is not possible for an institution to sell CDS
protection on themselves, they could instead sell protection on a
highly correlated credit or credits; for example, banks might sell
CDS protection on (a portfolio of) other banks.14 However, we
note that a hedging instrument is required so that an institution


makes money when its credit spread widens (and vice versa).
Our view is that this is problematic, especially since the calcula-
tions earlier showed that the bilateral CVA was not strongly
sensitive to the joint default – an illustration that the idiosyn-
cratic component of the spread constitutes the significant
proportion of the bilateral CVA. We also point out that institu-
tions wishing to sell protection on credits highly correlated with
their own creditworthiness will lead to an increase in the overall
amount of (wrong-way) counterparty risk in the market.
❑❑ As a funding benefit. Since this article was first published in 2008,
it seems that DVA has been increasingly seen as a funding benefit.
Since there is a potential double-counting between DVA and
funding benefits, our view is that DVA should be associated with
“own default” and not as a funding benefit. There has been a
significant amount of discussion in this area since the original
publication of this article and the reader is referred to Gregory
(2012) for a more detailed discussion on DVA and funding and
also Hull and White (2012) and Burgard and Kjaer (2011) for more
theoretical discussion around DVA and funding value adjust-
ment (FVA).

Appropriate pricing and risk management of counterparty risk is a


key area for financial institutions, and controlling the level of reserves
and cost of hedging is critical in turbulent times. However, realistic
pricing and management of risk should always be the key objective.
While standard risk-neutral pricing arguments lead to a reduction of
counterparty risk charges (CVA) in line with the default probability
of an institution, the question of how to monetarise this component
should be carefully considered. Arguments that the bilateral counter-
party risk can be beta hedged, realised when an institution is in
severe financial distress, or represents an offset to future funding
costs are in our view simply not strong enough to justify the wide-
spread use of bilateral CVA.

Conclusion
We have presented an overview of bilateral counterparty risk
pricing. Using a model that represents a simple extension of
standard counterparty risk pricing approaches, we have illustrated
pricing behaviour and considered the impact of default of both


parties. Such ideas can readily be incorporated into counterparty


risk pricing and management functions in order to attempt a
reasonable treatment of the bilateral nature of this risk.
Should therefore an institution post profits linked to their own
worsening credit quality? Standard valuation of contingent claims
that have a payout linked to an institution’s own bankruptcy may
give some mathematically appealing and symmetric results.
However, in practice, an institution attaching economic value to
their own default (and, indeed, gaining when their own credit
quality worsens) may be simply fooling themselves and storing up greater problems in the future.
A problem with using only unilateral CVA for pricing counter-
party risk is that in many cases parties will simply not be able to
agree a price for a trade. However, this is a strong argument for
better collateral management functions or a central clearing house
for counterparty risk and not for the naive introduction of bilateral
CVA pricing.
Bilateral counterparty risk pricing has become standard in the
market and agreed upon by all relevant parties (practitioners,
accountants, regulators, tax officers and legal). Given some of the
lessons learnt from the global financial crisis, such as the issues with monoline insurers, we suggest that a sanity check on the validity of
using bilateral counterparty risk quantification is appropriate.

This article was first published in February 2009 (see Notes on


Chapters) and there have been many developments in the area of
bilateral counterparty risk since this time. Some more recent refer-
ences are included below to aid the reader. The author acknowledges
helpful comments and ideas from Matthew Leeming, Andrew Green,
Vladimir Piterbarg, Sitsofe Kodjo, Peter Jäckel and Michael Pykhtin.
Discussions with participants at the WBS Fixed Income conference in
Budapest on September 25–26, 2008, and the critical suggestions of
two anonymous referees were also extremely helpful. A spreadsheet
with the model-based calculations from this chapter is available from
the author on request. Email: jon@solum-financial.com

  1 Since the publication of the original Risk article, accounting for two-sided counterparty risk has become mandatory under IFRS 13.
  2 We note that since exposures within netted portfolios are linear then this case is suitably
general.


  3 We use the notation x+ = max(x, 0) and x– = min(x, 0).


  4 Strictly speaking, V(t, τC) corresponds to cashflows paid before the default time of the counterparty but for the sake of brevity we do not introduce additional notation.
  5 This assumes an average recovery value of 40%.
  6 This follows from V(t, T) − δCV(t, T)+ − δIV(t, T)− = (1 − δC)V(t, T)+ + (1 − δI)V(t, T)−.
  7 As noted before, the approach described here could be combined with a “wrong-way risk”
approach such as in Cherubini and Luciano (2002).
  8 A spreadsheet with an implementation of the simple model is available from the author on
request.
  9 We note that there is some additional complexity regarding this point. First, since CDS
protection buyers must buy bonds to deliver, a “delivery squeeze” can occur if there is
more CDS notional in the market than outstanding deliverable bonds. In this case, the
bond price can be bid up and suppress the value of the CDS hedging instrument. This has
been seen in many recent defaults such as Parmalat (2003) and Delphi (2005), and for many
counterparties the amount of CDSs traded is indeed larger than available pool of bonds.
We also note that while CDSs are settled shortly after default, derivatives claims go
through a workout process that can last years.
10 For single cashflow products, such as forex forwards, or products with a final large cash-
flow, such as the exchange of principal in a cross-currency swap, the maximum exposure
occurs at the maturity of the transaction and this formula proves a good proxy for the
typical exposure. Products with multiple cashflows, such as interest rate swaps, typically
have a peak exposure between one half and one third of the maturity. We note that the
exposure of the same instrument may also vary significantly due to market conditions,
such as the shape of yield curves. We have confirmed that the qualitative conclusions do
not depend on the precise exposure profile chosen.
11 Given the risk-neutral setting, V(s, T) should be a martingale and therefore determined
uniquely by the relevant forward rates for the product in question. We note that some
institutions follow the practice of modelling exposure under the physical measure.
12 The constant intensities of default are approximately related to CDS premiums via λ(1 − δ).
13 This assumes that all parties have the same pricing measure, in which case the two sides to
a trade or netted portfolio of trades will always have equal and opposite CVAs.
14 But, in doing so, the bank should expect to incur a relatively large CVA on the hedge.

REFERENCES

Arvanitis A. and J. Gregory, 2001, Credit: The Complete Guide to Pricing, Hedging and Risk Management (London, England: Risk Books).

Brigo D. and M. Morini, 2011, "Closeout Convention Tensions," Risk, December, pp 86–90.

Burgard C. and M. Kjaer, 2011, "In the Balance," Risk, October.

Canabarro E. and D. Duffie, 2003, "Measuring and Marking Counterparty Risk," in L. M. Tilman (Ed), Asset/Liability Management of Financial Institutions (London, England: Euromoney Books).

Canabarro E., E. Picoult and T. Wilde, 2003, "Analysing Counterparty Risk," Risk, September, pp 117–22.

Cherubini U. and E. Luciano, 2002, "Copula Vulnerability," Risk, October, pp 83–86.

Duffie D. and M. Huang, 1996, "Swap Rates and Credit Quality," Journal of Finance, 6, pp 379–406.

Gregory J., 2012, Counterparty Credit Risk: A Continuing Challenge for Global Financial Markets (Chichester, England: Wiley).

Gregory J. and I. German, 2013, "Closing Out DVA," Risk, January.

Hull J. and A. White, 2012, "CVA, DVA, FVA and the Black-Scholes-Merton Arguments," working paper, September.

Pykhtin M., 2005, Counterparty Credit Risk Modelling (London, England: Risk Books).

Pykhtin M. and S. Zhu, 2006, "Measuring Counterparty Credit Risk for Trading Products Under Basel II," in M. Ong (Ed), Basel II Handbook (London, England: Risk Books).

Sorensen E. and T. Bollier, 1994, "Pricing Swap Default Risk," Financial Analysts Journal, 50, pp 23–33.

United States Tax Court, 2003, "Bank One Corporation, Petitioner, v. Commissioner of Internal Revenue," respondent, May 2.

16
Real-time Counterparty Credit Risk
Management in Monte Carlo
Luca Capriotti, Jacky Lee and Matthew Peacock
Credit Suisse and Axon Strategies

One of the most active areas of risk management is counterparty credit risk management (CCRM). Managing counterparty risk is
particularly challenging because it requires the simultaneous eval-
uation of all the trades facing a given counterparty. For multi-asset
portfolios, this typically comes with extraordinary computational
challenges.
Indeed, for portfolios other than those comprising simple
vanilla instruments, computationally intensive Monte Carlo (MC)
simulations are often the only practical tool available for this task.
Standard approaches for the calculation of risk require repeating
the calculation of the profit and loss of the portfolio under
hundreds of market scenarios. As a result, in many cases these
calculations cannot be completed in a practical amount of time,
even employing a vast amount of computer power. Since the total
cost of through-the-life risk management can determine whether
it is profitable to execute a new trade, solving this technology
problem is critical to allow a securities firm to remain
competitive.
Following the introduction of adjoint methods in finance (Giles
and Glasserman, 2006), a computational technique dubbed adjoint
algorithmic differentiation (AAD) (Capriotti, 2011, and Capriotti
and Giles, 2010 and 2011) has recently emerged as tremendously
effective for speeding up the calculation of sensitivities in MC in the
context of the so-called pathwise derivatives method (Broadie and
Glasserman, 1996).


Algorithmic differentiation (AD) (Griewank, 2000) is a set of programming techniques for the efficient calculation of the derivatives of functions implemented as computer programs. The main
idea underlying AD is that any such function – no matter how
complicated – can be interpreted as a composition of basic arith-
metic and intrinsic operations that are easy to differentiate. What
makes AD particularly attractive, when compared with standard
(finite-difference) methods for the calculation of derivatives, is its
computational efficiency. In fact, AD exploits the information on the
structure of the computer code in order to optimise the calculation.
In particular, when one requires the derivatives of a small number
of outputs with respect to a large number of inputs, the calculation
can be highly optimised by applying the chain rule through the
instructions of the program in opposite order with respect to their
original evaluation (Griewank, 2000). This gives rise to AAD.
Surprisingly, even if AD has been an active branch of computer
science for several decades, its impact in other research fields has
been fairly limited. Interestingly, in contrast to the usual situation in
which well-established ideas in applied mathematics or physics have
often been “borrowed” by quants, AAD has been introduced in MC
applications in natural science (Sorella and Capriotti, 2010) only after
its “rediscovery” in quantitative finance.
In this chapter, we demonstrate how this powerful technique can
be used for highly efficient computation of price sensitivities in the
context of CCRM.

Counterparty credit risk management


As a typical task in the day-to-day operation of a CCRM desk, here
we consider the calculation of the credit valuation adjustment
(CVA) as the main measure of a dealer’s counterparty credit risk.
For a given portfolio of trades facing the same investor or institu-
tion, the CVA aims to capture the expected loss associated with the
counterparty defaulting in a situation in which the position, netted
for any collateral agreement, has a positive mark-to-market for the
dealer. This can be evaluated at time T0 = 0 as:

$$V_{\mathrm{CVA}} = E\Big[\, I(\tau_c \le T)\, D(0,\tau_c)\, \mathrm{LGD}(\tau_c)\,\big(\mathrm{NPV}(\tau_c) - C(R(\tau_c^-))\big)^+ \Big] \qquad (16.1)$$


where τ_c is the default time of the counterparty, NPV(t) is the net present value of the portfolio at time t from the dealer's point of view, C(R(t)) is the collateral outstanding, typically dependent on the rating R of the counterparty, LGD(t) is the loss given default, D(0, t) is the discount factor for the interval [0, t], and I(τ_c ≤ T) is the indicator that the counterparty's default happens before the longest deal maturity in the portfolio, T. Here, for simplicity of notation, we consider the unilateral CVA, and the generalisation to bilateral CVA (Brigo and Capponi, 2010) is straightforward. The quantity in 16.1 is typically calculated on a discrete time grid of “horizon dates” T_0 < T_1 < ... < T_{N_O} as, for instance:
$$V_{\mathrm{CVA}} \simeq \sum_{i=1}^{N_O} E\Big[\, I(T_{i-1} < \tau_c \le T_i)\, D(0,T_i)\, \mathrm{LGD}(T_i)\big(\mathrm{NPV}(T_i) - C(R(T_i^-))\big)^+ \Big] \qquad (16.2)$$

In general, the quantity above depends on several correlated random market factors, including the interest rate, the counterparty's default time and rating, the recovery amount, and all the market factors the net present value of the portfolio depends on. As such, its calculation requires an MC simulation.
To simplify the notation and generalise the discussion beyond the small details that might form part of a dealer's definition of a specific credit charge, here we consider expectation values of the form:
To simplify the notation and generalise the discussion beyond
the small details that might form part of a dealer’s definition of a
specific credit charge, here we consider expectation values of the
form:

$$V = E^Q\big[P(R, X)\big] \qquad (16.3)$$

with “payout” given by:


$$P = \sum_{i=1}^{N_O} P\big(T_i, R(T_i), X(T_i)\big) \qquad (16.4)$$

where:

$$P\big(T_i, R(T_i), X(T_i)\big) = \sum_{r=0}^{N_R} \tilde P_i\big(X(T_i); r\big)\, \delta_{r, R(T_i)} \qquad (16.5)$$

Here the rating of the counterparty entity including default, R(t), is represented by an integer r = 0, ..., N_R for simplicity; X(t) is the realised value of the M market factors at time t. Q = Q(R, X) represents a probability distribution according to which R = (R(T_1), ..., R(T_{N_O}))^T


and X = (X(T_1), ..., X(T_{N_O}))^T are distributed; P̃_i(·; r) is a rating-dependent payout at time T_i.1


The expectation value in 16.3 can be estimated by means of MC by sampling a number N_MC of random replicas of the underlying rating and market state vector, R[1], ..., R[N_MC] and X[1], ..., X[N_MC], according to the distribution Q(R, X), and evaluating the payout P(R, X) for each of them. This leads to the central limit theorem (Kallenberg, 1997) estimate of the option value V as:

$$V \simeq \frac{1}{N_{MC}} \sum_{i_{MC}=1}^{N_{MC}} P\big(R[i_{MC}], X[i_{MC}]\big) \qquad (16.6)$$

with standard error $S/\sqrt{N_{MC}}$, where $S^2 = E^Q[P(R,X)^2] - E^Q[P(R,X)]^2$ is the variance of the sampled payout.
In the following, we will make minimal assumptions about the
particular model used to describe the dynamics of the market
factors. In particular, we will only assume that for a given MC
sample the value at time Ti of the market factors can be obtained
from their value at time T_{i−1} by means of a mapping of the form X(T_i) = F_i(X(T_{i−1}), Z^X), where Z^X is an N_X-dimensional vector of correlated
standard normal random variates, X(T0) is today’s value of the
market state vector, and Fi is a mapping regular enough for the
pathwise derivatives method to be applicable (Glasserman, 2004),
as is generally the case for practical applications.
As an example of a counterparty rating model generally used in
practice, here we consider the rating transition Markov chain model
of Jarrow, Lando and Turnbull (1997) in which the rating at time Ti
can be simulated as:
$$R(T_i) = \sum_{r=1}^{N_R} I\big(\tilde Z_i^R > Q(T_i, r)\big) \qquad (16.7)$$
where Z̃_i^R is a standard normal variate, and Q(T_i, r) is the quantile-
threshold corresponding to the transition probability from today’s
rating to a rating r at time Ti. Note that the discussion below is not
limited to this particular model, and it could be applied with minor
modifications to other commonly used models describing the
default time of the counterparty and its rating (Schönbucher, 2003).
Here we consider the rating transition model 16.7 for its practical
utility, as well as for the challenges it poses in the application of the
pathwise derivatives method, because of the discreteness of its state
space.


In this setting, MC samples of the payout estimator in 16.6 can be generated according to the following standard algorithm (a minimal sketch of this loop in code is given after the list). For i = 1, ..., N_O:

❑❑ Step 1. Generate a sample of N_X + 1 jointly normal random variables (Z_i^R, Z_i^X) ≡ (Z_i^R, Z_{i,1}^X, ..., Z_{i,N_X}^X)^T distributed according to φ(Z_i^R, Z_i^X; ρ_i), an (N_X + 1)-dimensional standard normal probability density function with correlation matrix ρ_i, for example, with the first row and column corresponding to the rating factor.
❑❑ Step 2. Iterate the recursion X(T_i) = F_i(X(T_{i−1}), Z_i^X).
❑❑ Step 3. Set Z̃_i^R = Σ_{j=1}^i Z_j^R/√i and calculate R(T_i) according to 16.7.2
❑❑ Step 4. Calculate the time-T_i payout estimator P(T_i, R(T_i), X(T_i)) in 16.5, and add this contribution to the total estimator in 16.4.
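To make the structure of the simulation concrete, the following Python sketch illustrates the loop above. The callables propagate (the mapping F_i), payout (the rating-dependent payout P̃_i), and the arrays chol (Cholesky factors of the ρ_i) and thresholds (the quantile-thresholds Q(T_i, r)) are hypothetical stand-ins, not objects defined in this chapter:

```python
import numpy as np

def simulate_cva_payout(n_paths, n_o, chol, propagate, payout,
                        thresholds, x0, n_x, rng=np.random.default_rng(0)):
    """Minimal sketch of the standard CVA sampling algorithm (steps 1-4)."""
    estimator = np.zeros(n_paths)
    for path in range(n_paths):
        x = x0.copy()
        z_r_sum = 0.0
        for i in range(n_o):
            # Step 1: jointly normal (Z_i^R, Z_i^X) with correlation rho_i
            z = chol[i] @ rng.standard_normal(n_x + 1)
            z_r, z_x = z[0], z[1:]
            # Step 2: propagate the market factors, X(T_i) = F_i(X(T_{i-1}), Z_i^X)
            x = propagate(i, x, z_x)
            # Step 3: cumulative rating driver Z~_i^R and rating via Q(T_i, r)
            z_r_sum += z_r
            z_tilde = z_r_sum / np.sqrt(i + 1)
            r = int(np.sum(z_tilde > thresholds[i]))
            # Step 4: accumulate the time-T_i payout contribution
            estimator[path] += payout(i, r, x)
    return estimator.mean(), estimator.std() / np.sqrt(n_paths)
```

The function returns the MC estimate of V in 16.6 together with its standard error.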

The calculation of risk can be obtained in a highly efficient way by implementing the pathwise derivatives method (Broadie and Glasserman, 1996) according to the principles of AAD (Capriotti, 2011, and Capriotti and Giles, 2010 and 2011). The pathwise derivatives method allows the calculation of the sensitivities of V in 16.6 with respect to a set of N_θ parameters θ = (θ_1, ..., θ_{N_θ}), say:

$$\frac{\partial V(\theta)}{\partial \theta_k} = \frac{\partial}{\partial \theta_k} E\big[P(R, X)\big] \qquad (16.8)$$

by defining appropriate estimators that can be sampled simultaneously in the same MC simulation. This can be achieved by observing that whenever the payout function is regular enough (for example, Lipschitz-continuous, see Glasserman, 2004), one can rewrite Equation 16.8 by taking the derivative inside the expectation value, as:

$$\frac{\partial V(\theta)}{\partial \theta_k} = E^{P}\!\left[\frac{\partial P(R, X)}{\partial \theta_k}\right] \qquad (16.9)$$

where P = P(Z^R, Z^X) is the distribution of the correlated normal variates used in the MC simulation, which is independent of θ.3 The calculation of Equation 16.9 can be performed by applying the chain rule, and calculating the average value of the pathwise derivatives estimator:

$$\bar\theta_k \equiv \frac{\partial P_\theta(R, X)}{\partial \theta_k} = \sum_{i=1}^{N_O}\sum_{l=1}^{M} \frac{\partial P_\theta(R, X)}{\partial X_l(T_i)} \times \frac{\partial X_l(T_i)}{\partial \theta_k} + \frac{\partial P_\theta(R, X)}{\partial \theta_k} \qquad (16.10)$$


where we have allowed for an explicit dependence of the payout on the model parameters.4 Due to the discreteness of the state space of
the rating factor, the pathwise estimator for its related sensitivities
is not well defined. However, as we will show below, one can
express things in such a way that the rating sensitivities are incor-
porated in the explicit term ∂Pθ (R, X)/∂θ k.
In the following, we will show how the calculation of the path-
wise derivatives estimator 16.10 can be implemented efficiently by
means of AAD. We begin by briefly reviewing this technique.

Adjoint algorithmic differentiation


Griewank (2000) contains a detailed discussion of the computa-
tional cost of AAD. Here, we will only recall the main results in
order to clarify how this technique can be beneficial for the efficient
implementation of the pathwise derivatives method. The interested
reader can find in Capriotti (2011) and Capriotti and Giles (2010 and
2011) several examples illustrating the intuition behind these
results.
To this end, consider a function:

$$Y = \mathrm{FUNCTION}(X) \qquad (16.11)$$

mapping a vector X in R^n to a vector Y in R^m through a sequence of steps:

$$X \to \cdots \to U \to V \to \cdots \to Y \qquad (16.12)$$

Here, each step can be a distinct high-level function or even an indi-


vidual instruction.
The adjoint mode of AD results from propagating the derivatives
of the final result with respect to all the intermediate variables – the
so-called adjoints – until the derivatives with respect to the inde-
pendent variables are formed. Using the standard AD notation, the
adjoint of any intermediate variable Vk is defined as:
$$\bar V_k = \sum_{j=1}^{m} \bar Y_j\, \frac{\partial Y_j}{\partial V_k} \qquad (16.13)$$

where Ȳ is a vector in R^m. In particular, for each of the intermediate variables U_i, using the chain rule we get:

$$\bar U_i = \sum_{j=1}^{m} \bar Y_j\, \frac{\partial Y_j}{\partial U_i} = \sum_{j=1}^{m} \bar Y_j \sum_{k} \frac{\partial Y_j}{\partial V_k}\,\frac{\partial V_k}{\partial U_i}$$


which corresponds to the adjoint mode equation for the intermediate function V = V(U):

$$\bar U_i = \sum_k \bar V_k\, \frac{\partial V_k}{\partial U_i} \qquad (16.14)$$

namely a function of the form Ū = V̄(U, V̄). Starting from the adjoint of the outputs, Ȳ, we can apply this to each step in the calculation, working from right to left:

$$\bar X \leftarrow \cdots \leftarrow \bar U \leftarrow \bar V \leftarrow \cdots \leftarrow \bar Y \qquad (16.15)$$
until we obtain X̄, that is, the following linear combination of the rows of the Jacobian of the function X → Y:

$$\bar X_i = \sum_{j=1}^{m} \bar Y_j\, \frac{\partial Y_j}{\partial X_i} \qquad (16.16)$$

with i = 1, ... , n.
In the adjoint mode, the cost does not increase with the number
of inputs, but it is linear in the number of (linear combinations of
the) rows of the Jacobian that need to be evaluated independently.
In particular, if the full Jacobian is required, one needs to repeat the adjoint calculation m times, setting the vector Ȳ equal to each
of the elements of the canonical basis in Rm. Furthermore, since the
partial (branch) derivatives depend on the values of the interme-
diate variables, one generally first has to compute the original
calculation storing the values of all the intermediate variables
such as U and V, before performing the adjoint mode sensitivity
calculation.
One particularly important theoretical result (Griewank, 2000) is
that given a computer program performing some high-level func-
tion 16.11, the execution time of its adjoint counterpart:
$$\bar X = \mathrm{FUNCTION\_b}(X, \bar Y) \qquad (16.17)$$

(with suffix _b for “backward” or “bar”) calculating the linear combination 16.16 is bounded by approximately four times the cost of execution of the original one, namely:

$$\frac{\mathrm{Cost}[\mathrm{FUNCTION\_b}]}{\mathrm{Cost}[\mathrm{FUNCTION}]} \le \omega_A \qquad (16.18)$$

with ω_A ∈ [3, 4]. Thus, one can obtain the sensitivity of a single
output, or of a linear combination of outputs, to an unlimited


number of inputs for a little more work than the original calculation.
As also discussed at length in Capriotti (2011) and Capriotti and
Giles (2010 and 2011), AAD can be straightforwardly implemented
by starting from the output of an algorithm and proceeding back-
wards, applying systematically the adjoint composition rule 16.14
to each intermediate step, until the adjoints of the inputs 16.16 are
calculated. As already noted, the execution of such a backward sweep
requires information that needs to be calculated and stored by
executing beforehand the steps of the original algorithm – the
so-called forward sweep. A simple illustration of this procedure is
discussed in the Appendix.

AAD and counterparty credit risk management


When applied to the pathwise derivatives method, AAD allows the
simultaneous calculation of the pathwise derivatives estimators for
an arbitrarily large number of sensitivities at a small fixed cost.
Here, we describe in detail the AAD implementation of the path-
wise derivatives estimator 16.10 for the CCRM problem 16.1.
As noted above, the sensitivities with respect to parameters
affecting the rating dynamics need special care due to the discrete
nature of the state space. However, setting these sensitivities aside
for the moment, the AAD implementation of the pathwise deriva-
tives estimator consists of Steps 1–4 described above plus the
following steps of the backward sweep. For i = N_O, ..., 1:

❑❑ Step 4̄. Evaluate the adjoint of the payout:

$$(\bar X(T_i), \bar\theta) = \bar P\big(T_i, R(T_i), X(T_i), \theta, \bar P\big)$$

with P̄ = 1.
❑❑ Step 3̄. Nothing to do: the parameters θ do not affect this non-differentiable step.
❑❑ Step 2̄. Evaluate the adjoint of the propagation rule in step 2:

$$(\bar X(T_{i-1}), \bar\theta)\; {+}{=}\; \bar F_i\big(X(T_{i-1}), \theta, Z^X, \bar X(T_i), \bar\theta\big)$$

where += is the standard addition assignment operator.
❑❑ Step 1̄. Nothing to do: the parameters θ do not affect this step.

A few comments are in order. In step 4̄, the adjoint of the payout function is defined while keeping the discrete rating variable constant. This provides the derivatives X̄_l(T_i) = ∂P_θ/∂X_l(T_i), and θ̄_k = ∂P_θ/∂θ_k. In defining the adjoint in step 2̄, we have taken into account that the propagation rule in step 2 is explicitly dependent on both X(T_i) and the model parameters θ. As a result, its adjoint counterpart produces contributions to both θ̄ and X̄(T_i). Both the adjoint of the payout and of the propagation mapping can be implemented following the principles of AAD as discussed in Capriotti (2011) and Capriotti and Giles (2011). In many situations, AD tools can also be used as an aid or to automate the implementation, especially for simpler, self-contained functions. In the backward sweep above, steps 1̄ and 3̄ have been skipped because we have assumed for simplicity of exposition that the parameters θ do not affect the correlation matrices ρ_i and the rating dynamics. If correlation risk is instead required, step 2̄ also produces the adjoint of the random variables Z^X, and step 1̄ contains the adjoint of the Cholesky decomposition, possibly with the support of the binning technique, as described in Capriotti and Giles (2010).
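As a minimal Python illustration of how these steps pair with the forward loop sketched earlier, the following sketch processes one MC path; payout_bar and propagate_bar denote hand-written adjoints of P̃_i and F_i, and are assumptions of this illustration rather than objects defined in the text:

```python
import numpy as np

def backward_sweep(path, ratings, normals, theta, payout_bar, propagate_bar):
    """Sketch of steps 4-bar to 1-bar for one MC path.

    path[i] = X(T_i) with path[0] today's state; ratings and normals are
    stored by the forward sweep, as required by the adjoint calculation."""
    n_o = len(path) - 1
    x_bar = np.zeros_like(path[0])      # X-bar(T_i), accumulated right to left
    theta_bar = np.zeros_like(theta)    # theta-bar, the pathwise estimator
    for i in range(n_o, 0, -1):
        # step 4-bar: adjoint of the payout at T_i, with P-bar = 1
        dx, dtheta = payout_bar(i, ratings[i], path[i], theta)
        x_bar += dx
        theta_bar += dtheta
        # step 2-bar: adjoint of X(T_i) = F_i(X(T_{i-1}), Z_i^X)
        x_bar, dtheta = propagate_bar(i, path[i - 1], normals[i], theta, x_bar)
        theta_bar += dtheta
        # steps 3-bar and 1-bar: nothing to do under the stated assumptions
    return theta_bar
```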

Rating transition risk


The risk associated with the rating dynamics can be treated by
noting that 16.5 can be expressed more conveniently as:
P Ti , Z iR , X (Ti ) = Pi ( X (Ti ) ;0)
( )
NR
+∑ Pi ( X (Ti ) ;r ) − Pi ( X (Ti ) ;r − 1) I Z iR > Q (Ti , r;θ )
( )( )  (16.19)
r=1

so that the singular contribution to the pathwise derivatives estimator reads:
$$\partial_{\theta_k} P\big(T_i, \tilde Z_i^R, X(T_i)\big) = -\sum_{r=1}^{N_R} \Big(\tilde P_i\big(X(T_i); r\big) - \tilde P_i\big(X(T_i); r-1\big)\Big) \times \delta\big(\tilde Z_i^R - Q(T_i, r; \theta)\big)\, \partial_{\theta_k} Q(T_i, r; \theta) \qquad (16.20)$$

This estimator cannot be sampled in this form with MC. Nevertheless, it can be integrated out using the properties of Dirac's delta along the lines of Joshi and Kainth (2004), giving after straightforward computations:
$$\bar\theta_k = -\sum_{r=1}^{N_R} \partial_{\theta_k} Q(T_i, r; \theta)\, \frac{\phi\big(Z^*, Z_i^X; \rho_i\big)}{\sqrt{i}\,\phi\big(Z_i^X; \rho_i^X\big)} \times \Big(\tilde P_i\big(X(T_i); r\big) - \tilde P_i\big(X(T_i); r-1\big)\Big) \qquad (16.21)$$


where Z^* is such that (Z^* + Σ_{j=1}^{i−1} Z_j^R)/√i = Q(T_i, r; θ), and φ(Z_i^X; ρ_i^X) is an N_X-dimensional standard normal probability density function with correlation matrix ρ_i^X obtained by removing the first row and column of ρ_i; here ∂_{θ_k}Q(T_i, r; θ) is not stochastic and can be evaluated (for example, using AAD) once per simulation. The final result is rather intuitive as it is given by the probability-weighted sum of the discontinuities in the payout.

Results
As a numerical test, we present here results for the calculation of
risk on the CVA of a portfolio of commodity derivatives. For the
purpose of this illustration, we consider a simple one-factor
lognormal model for the futures curve of the form:
$$\frac{dF_T(t)}{F_T(t)} = \sigma_T \exp\big(-\beta(T - t)\big)\, dW_t \qquad (16.22)$$

where W_t is a standard Brownian motion; F_T(t) is the price at time t of a futures contract expiring at T; σ_T and β define a simple instantaneous volatility function that increases approaching the contract expiry, as empirically observed for many commodities. The value of the future's price F_T(t) can be simulated exactly for any time t so that the propagation rule in step 2 reads for T_i ≤ T:

$$F_T(T_i) = F_T(T_{i-1})\, \exp\!\Big(\sigma_i \sqrt{\Delta T_i}\, Z - \frac{1}{2}\sigma_i^2 \Delta T_i\Big) \qquad (16.23)$$

where ΔT_i = T_i − T_{i−1} and:

$$\sigma_i^2 = \frac{\sigma_T^2}{2\beta\Delta T_i}\, e^{-2\beta T}\big(e^{2\beta T_i} - e^{2\beta T_{i-1}}\big)$$

is the outturn variance. In this example, we will consider determin-
istic interest rates. As an underlying portfolio for the CVA
calculation, we consider a set of commodity swaps, paying on a strip of futures (for example, monthly) expiries t_j, j = 1, ..., N_e the amount F_{t_j}(t_j) − K. The time-t net present value for this portfolio reads:

$$NPV(t) = \sum_{j=1}^{N_e} D(t, t_j)\big(F_{t_j}(t) - K\big) \qquad (16.24)$$

Note that, although we consider here for simplicity of exposition a linear portfolio, the method proposed applies to an arbitrarily


Figure 16.1 Speed-up in the calculation of risk for the CVA of a portfolio of five commodity swaps over a five-year horizon, as a function of the number of risks calculated (empty dots). Note: the full dots are the ratio of the CPU time required for the calculation of the CVA and its sensitivities to the CPU time spent for the calculation of the CVA alone. Lines are guides for the eye.

complex portfolio of derivatives, for which in general NPV(t) will be a nonlinear function of the market factors F_{t_j}(t) and model parameters θ.
For this example, the adjoint propagation rule in step 2̄ simply reads:
⎛ 1 ⎞
FT (Ti − 1) + = FT (Ti ) exp ⎜σ i ΔTi Z − σ i2 ΔTi ⎟
⎝ 2 ⎠
σ i = FT (Ti ) F (Ti ) ( ΔTi Z − σ i ΔTi )
_
with s i related to this step’s contribution to the adjoint of the
_
future’s volatility s T by:
σi
σT+ = e −2 βT ( e 2 βTi − e 2 βTi−1 )
2 βΔTi
At the end of the backward path, F̄_T(0) and σ̄_T contain the pathwise derivatives estimator 16.10 corresponding, respectively, to the sensitivity with respect to today's price and volatility of the futures contract with expiry T.
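As an illustration only, the following Python sketch implements the propagation rule 16.23 and its adjoint for the two derivatives F̄_T(T_{i−1}) and σ̄_i, with a finite-difference check (function names and inputs are assumptions of this sketch):

```python
import numpy as np

def forward_step(f_prev, sigma_i, dt, z):
    """Propagation rule 16.23: F_T(T_i) from F_T(T_{i-1})."""
    return f_prev * np.exp(sigma_i * np.sqrt(dt) * z - 0.5 * sigma_i**2 * dt)

def forward_step_b(f_prev, sigma_i, dt, z, f_i, f_i_bar):
    """Adjoint of the step: contributions to F-bar(T_{i-1}) and sigma_i-bar."""
    f_prev_bar = f_i_bar * f_i / f_prev            # = f_i_bar * exp(...)
    sigma_i_bar = f_i_bar * f_i * (np.sqrt(dt) * z - sigma_i * dt)
    return f_prev_bar, sigma_i_bar

# finite-difference check of the two adjoint derivatives
f_prev, sigma_i, dt, z, eps = 100.0, 0.3, 0.25, 0.7, 1e-6
f_i = forward_step(f_prev, sigma_i, dt, z)
f_prev_bar, sigma_i_bar = forward_step_b(f_prev, sigma_i, dt, z, f_i, 1.0)
fd_f = (forward_step(f_prev + eps, sigma_i, dt, z) - f_i) / eps
fd_s = (forward_step(f_prev, sigma_i + eps, dt, z) - f_i) / eps
print(f_prev_bar - fd_f, sigma_i_bar - fd_s)  # both differences ~0
```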
The remarkable computational efficiency of the AAD implemen-
tation is illustrated in Figure 16.1. Here, we plot the speed-up


Table 16.1 Variance reduction (VR) on the sensitivities with respect to the thresholds Q(1, r) (N_R = 3) for a call option with a rating-dependent strike

δ        VR[Q(1, 1)]    VR[Q(1, 2)]    VR[Q(1, 3)]
0.1      24             16             12
0.01     245            165            125
0.001    2,490          1,640          1,350

Note: δ indicates the perturbation used in the finite-difference estimators of the sensitivities. The specification of the parameters used for this example is available upon request

produced by AAD with respect to the standard finite-difference method. On a fairly typical trade horizon of five years, for a portfolio of five swaps referencing distinct commodities futures with monthly expiries, the CVA bears non-trivial risk to more than 600 parameters: 300 futures prices (F_T(0)) and at-the-money volatilities (σ_T), say 10 points on the zero rate curve, and 10 points on the credit
default swap curve of the counterparty used to calibrate the transi-
tion probabilities of the rating transition model 16.7. As illustrated
in Figure 16.1, the computation time required for the calculation of
the CVA and its sensitivities is less than four times that spent for the
computation of the CVA alone, as predicted by Equation 16.18. As a
result, even for this very simple application, AAD produces risk
measures more than 150 times faster than finite differences, that is,
for a CVA evaluation taking 10 seconds, AAD produces the full set
of sensitivities in less than 40 seconds, while finite differences
require approximately one hour and 40 minutes.
Moreover, as a result of the analytic integration of the singularities
introduced by the rating process, the risk measures produced by AAD
are typically less noisy than those produced by finite differences. This
is illustrated in Table 16.1, which shows the variance reduction on the
sensitivities with respect to the thresholds Q(Ti, r) for a simple test
case. Here, we have considered the calculation of a call option of the
form (FT(Ti) – C(R(Ti)))+ with a strike C(R(Ti)) linearly dependent on the
rating, and Ti = 1. The variance reduction displayed in the table can be
thought of as a further speed-up factor because it corresponds to the
reduction in the computation time for a given statistical uncertainty
on the sensitivities. This diverges as the perturbation in the finite-
difference estimators δ tends to zero, and may be very significant even for a fairly large value of δ.


Conclusion
In conclusion, we have shown how AAD allows an extremely effi-
cient calculation of counterparty credit risk valuations in MC. The
scope of this technique is clearly not limited to this important appli-
cation but extends to any valuation performed with MC. For any
number of underlying assets or names in a portfolio, the proposed
method allows the calculation of the complete risk at a computa-
tional cost that is at most four times the cost of calculating the profit
and loss of the portfolio. This results in remarkable computational
savings with respect to standard finite-difference approaches. In
fact, AAD allows one to perform in minutes risk runs that would
take otherwise several hours or could not even be performed over-
night without large parallel computers. AAD therefore makes
possible real-time risk management in MC, allowing investment
firms to hedge their positions more effectively, actively manage
their capital allocation, reduce their infrastructure costs and ulti-
mately attract more business.

Appendix: a simple example


As a simple example of AAD implementation, we consider an algo-
rithm mapping a set of inputs (θ 1, ... , θ n) into a single output P,
according to the following steps:

❑❑ Step 1. Set X_i = exp(−θ_i²/2 + θ_i Z), for i = 1, ..., n, where Z is a constant.
❑❑ Step 2. Set P = (Σ_{i=1}^n X_i − K)^+, where K is a constant.

The corresponding adjoint algorithm consists of steps 1 and 2 (forward sweep), plus a backward sweep consisting of the adjoints of steps 2 and 1, respectively:

❑❑ Step 2̄. Set X̄_i = P̄ · I(Σ_{i=1}^n X_i − K > 0), for i = 1, ..., n.
❑❑ Step 1̄. Set θ̄_i = X̄_i X_i(−θ_i + Z), for i = 1, ..., n.

We can immediately verify that the output of the adjoint algorithm above gives for P̄ = 1 the full set of sensitivities with respect to the inputs, θ̄_i = ∂P/∂θ_i. Note that, as described in the main text, the backward sweep requires information that is calculated during the execution of the forward sweep, steps 1 and 2, for example, to calculate the indicator I(Σ_{i=1}^n X_i − K > 0) and the value of X_i. Finally, simple inspection shows that both the forward and the backward sweep have a computation complexity O(n), that is, all the components of


the gradient of P can be obtained at a cost that is of the same order of the cost of computing P, in agreement with the general result 16.18. It is easy to recognise in this example a stylised representation of the calculation of the pathwise estimator for vega (volatility sensitivity) of a call option on a sum of lognormal assets.
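As a concrete companion to this appendix, here is a minimal Python rendering of the forward and backward sweeps, with a finite-difference check of θ̄_i = ∂P/∂θ_i (the variable names and numerical inputs are illustrative assumptions):

```python
import numpy as np

def forward(theta, z, k):
    """Forward sweep (steps 1-2): basket of lognormals, call payout."""
    x = np.exp(-theta**2 / 2 + theta * z)   # step 1
    p = max(x.sum() - k, 0.0)               # step 2
    return p, x

def adjoint(theta, z, k, x, p_bar=1.0):
    """Backward sweep (steps 2-bar, 1-bar): all dP/dtheta_i in one pass."""
    x_bar = p_bar * float(x.sum() - k > 0.0) * np.ones_like(x)  # step 2-bar
    theta_bar = x_bar * x * (-theta + z)                        # step 1-bar
    return theta_bar

theta, z, k = np.array([0.2, 0.3, 0.4]), 0.5, 2.5
p, x = forward(theta, z, k)
theta_bar = adjoint(theta, z, k, x)

# finite-difference check of each component of the gradient
eps = 1e-7
for i in range(len(theta)):
    bumped = theta.copy(); bumped[i] += eps
    print(theta_bar[i], (forward(bumped, z, k)[0] - p) / eps)
```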
The authors would like to thank Mike Giles, Adam Peacock, Nick
Seed and Mark Stedman for numerous useful discussions, and
Fredrik Akesson for a careful reading of the manuscript. The opinions
and views expressed in this chapter are those of the authors, and do
not necessarily represent those of Credit Suisse Group.

  1 The discussion below applies also to the case in which the payout at time Ti depends on the
history of the market factors X up to time Ti.
  2 Here we have used the fact that the payout 16.5 depends on the outturn value of the rating
at time Ti and not on its history.
  3 For simplicity of notation, we exclude the case in which θ includes the elements of the correlation matrix ρ in φ(Z^R, Z^X; ρ). The extension to this case is straightforward and can be performed along the lines of Capriotti and Giles (2010).
  4 Here and in the following we will use the standard AD notation θ̄_k to indicate the sensitivity of the payout with respect to the model parameter θ_k.

REFERENCES

Brigo D. and A. Capponi, 2010, “Bilateral Counterparty Risk with Application to CDSs,”
Risk, March, pp 85–90.

Broadie M. and P. Glasserman, 1996, “Estimating Security Price Derivatives Using Simulation,” Management Science, 42, pp 269–85.

Capriotti L., 2011, “Fast Greeks by Algorithmic Differentiation,” Journal of Computational Finance, 3(3), pp 3–35.

Capriotti L. and M. Giles, 2010, “Fast Correlation Greeks by Adjoint Algorithmic Differentiation,” Risk, April, pp 79–83.

Capriotti L. and M. Giles, 2011, “Algorithmic Differentiation: Adjoint Greeks Made Easy,” Risk, September, pp 96–102 (available at http://ssrn.com/abstract=1801522).

Giles M. and P. Glasserman, 2006, “Smoking Adjoints: Fast Monte Carlo Greeks,” Risk,
January, pp 92–96.

Glasserman P., 2004, Monte Carlo Methods in Financial Engineering (New York, NY:
Springer).

Griewank A., 2000, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Frontiers in Applied Mathematics (Philadelphia, PA: SIAM).

Jarrow R., D. Lando and S. Turnbull, 1997, “A Markov Model for the Term Structure of
Credit Spreads,” Review of Financial Studies, 10, pp 481–523.


Joshi M. and D. Kainth, 2004, “Rapid Computation of Prices and Deltas of n-th to
Default Swaps in the Li Model,” Quantitative Finance, 4, pp 266–75.

Kallenberg O., 1997, Foundations of Modern Probability (New York, NY: Springer).

Schönbucher P., 2003, Credit Derivatives Pricing Models: Models, Pricing, Implementation
(London, England: Wiley).

Sorella S. and L. Capriotti, 2010, “Algorithmic Differentiation and the Calculation of Forces in Quantum Monte Carlo,” Journal of Chemical Physics, 133, 234111, pp 1–10.

17
Counterparty Risk Capital and CVA
Michael Pykhtin
US Federal Reserve Board

Counterparty credit risk (CCR)1 is one of the primary focus points of the recent changes to regulatory minimum capital requirements,
now commonly known as Basel III (Basel Committee on Banking
Supervision (BCBS), 2010). Among other things, Basel III has intro-
duced the concept of credit value adjustment (CVA) into calculations
of the CCR capital charge. CVA appears twice in the Basel III
minimum capital requirements for CCR:

❑❑ In addition to the default capital charge, banks are required to calculate a CVA capital charge. The CVA capital charge is
supposed to account for losses related to deterioration of credit
quality of the counterparties that survive the time period up to
the capital horizon.
❑❑ Banks will have to subtract the counterparty-level CVA from the
counterparty’s exposure-at-default (EAD).

The appearance of CVA in CCR capital charges has prompted numerous discussions between regulators and the finance industry.
It appears that there has been no universal agreement on how to
incorporate CVA into the CCR capital charge consistently.
In this chapter, we propose a general framework for calculating
capital for CCR that consistently incorporates CVA. We illustrate
two possible applications of this framework:

❑❑ CCR as market risk – This is a traditional market risk approach with a relatively short time horizon (for example, two weeks).
We show that CCR under this approach cannot be separated


from the market risk of the CCR-free trading book. Instead,


market risk should be calculated for the extended trading book,
where CCR is incorporated by adding to the portfolio one virtual
defaultable contingent claim per counterparty. Value-at-risk
calculated for the extended trading book covers both the market
risk and the CCR. This approach is appropriate for sophisticated
financial institutions that manage and hedge their CCR dynami-
cally; and
❑❑ CCR as credit risk – This is a traditional credit risk approach
with a relatively long time horizon (for example, one year).
Under the asymptotic single risk factor (ASRF) framework that
underlies Basel minimum capital requirements, we show that the
full capital is the sum of two terms: the default capital charge and
the CVA capital charge. The default capital charge depends on
time-zero CVA, but does not depend on the expected loss (EL).
The CVA capital charge covers the risk of CVA change from time
zero to the capital horizon and is an analogue of the credit migra-
tion risk of a loan or bond portfolio. This approach (but not
necessarily the ASRF flavour discussed in this chapter) is appro-
priate for financial institutions that hold CCR to maturity.

We use our general framework to analyse both Basel II and Basel III treatment of CCR. We show how Basel II rules can be made
consistent with the credit risk application of our framework by
making a few adjustments to the default risk formula and recali-
brating the maturity adjustment. Although the Basel III default
capital charge depends on the time-zero CVA, it is not accurate, and
we propose a replacement formula. While the Basel III CVA capital
charge is certainly a step in the right direction, we point out that
because CVA risk is treated separately from the market risk of the
trading book, it does not capture the CVA risk properly and even
has a potential for creating perverse incentives for banks. Finally,
we argue that advanced banks that hedge their CCR dynamically
should be allowed to move CCR fully to the market risk framework
for their regulatory capital calculations.

Counterparty credit exposure


Before we address CCR losses and capital charges, we will give
precise definitions of counterparty credit exposure and CVA.


Suppose a bank has a portfolio of trades with a counterparty. This portfolio may contain multiple netting and/or margin agree-
ments. Exposure of the bank to the counterparty Ec(t) at time t is
determined by the trades’ mark-to-market (MTM) values and the
amount of collateral C(t) available to the bank at time t. If all the
trades between the bank and the counterparty net, the bank’s expo-
sure to the counterparty is given by:

$$E_c(t) = \max\{V(t) - C(t),\, 0\} \qquad (17.1)$$

where V(t) is the counterparty-credit-risk-free MTM value of the entire portfolio with the counterparty from the bank's perspective.
Under the sign convention that C(t) > 0 when the bank holds collat-
eral at time t, and C(t) < 0 when the bank has posted collateral at
time t, Equation 17.1 holds for both unilateral and bilateral margin
agreements.2
The counterparty is also exposed to the bank’s default. The coun-
terparty’s exposure to the bank Eb(t) at time t is given by:

$$E_b(t) = \max\{-V(t) + C(t),\, 0\} \qquad (17.2)$$

where we have flipped the signs of V(t) (the portfolio value from the
counterparty’s perspective is –V(t)) and C(t) (collateral available to
the counterparty is –C(t)). Note that both Ec(t) and Eb(t) for future
time points t are uncertain because neither future portfolio MTM
value nor the amount of collateral the bank will hold in the future
are known at present.
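In code, the two definitions and the stated sign convention for collateral can be sketched as follows (a minimal illustration, not part of the original text):

```python
def exposures(v, c):
    """Equations 17.1-17.2. Sign convention: C > 0 when the bank holds
    collateral, C < 0 when the bank has posted collateral."""
    e_c = max(v - c, 0.0)    # bank's exposure to the counterparty
    e_b = max(-v + c, 0.0)   # counterparty's exposure to the bank
    return e_c, e_b
```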

Credit valuation adjustment


Let us assume for a moment that the bank is default-risk-free. Then,
when pricing transactions with the counterparty, the bank should
require a risk premium from the counterparty to be compensated
for the risk of the counterparty defaulting. The market value of this
risk premium, defined for the entire portfolio of trades with the
counterparty, is known as unilateral CVA and is given by:
$$\mathrm{CVA}_c(t) = \mathrm{LGD}_c^Q \cdot \int_t^T EE_c^*(s\,|\,t)\; dPD_c^Q(s\,|\,t) \qquad (17.3)$$

where LGD_c^Q is market-implied loss given default (LGD) for the counterparty, PD_c^Q(s|t) is the risk-neutral cumulative probability of the counterparty's default3 between time t and time s ≥ t, estimated


at time t, and EE_c^*(s|t) is the risk-neutral discounted conditional expected exposure (EE) of the bank to the counterparty at time s estimated at time t < s, given by:

$$EE_c^*(s\,|\,t) = E_t^Q\!\left[\frac{B(0)}{B(s)}\, E_c(s)\;\Big|\; \tau_c = s\right] \qquad (17.4)$$

where τ_c is the time of default of the counterparty, B(t) is the value of the money-market account at time t, and E_t^Q[·] denotes the risk-neutral expectation, conditional on all the information available up to time t. Note that the expectation in Equation 17.4 is also conditional on the counterparty defaulting at time s. This conditioning is material when the exposure to the counterparty depends on the counterparty's credit quality (that is, wrong-way or right-way risk is present). Note also that the exposure is discounted to time zero instead of time t to facilitate measuring losses in time-zero dollars.
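As an illustration of how Equation 17.3 is evaluated in practice, the following is a minimal Python sketch that discretises the integral on a time grid; the inputs (a pre-computed discounted EE profile and a flat-hazard PD curve) are stylised assumptions, not prescriptions of this chapter:

```python
import numpy as np

def unilateral_cva(lgd, grid, ee_star, pd_curve):
    """Discretisation of Equation 17.3 at t = 0: LGD^Q * sum EE* dPD.

    grid: time points s_1 < ... < s_n up to portfolio maturity T;
    ee_star[j]: discounted expected exposure EE*_c(s_j | 0);
    pd_curve(s): risk-neutral cumulative PD, PD^Q_c(s | 0)."""
    pd = np.array([pd_curve(s) for s in grid])
    dpd = np.diff(np.concatenate(([0.0], pd)))   # default probability per bucket
    return lgd * np.sum(ee_star * dpd)

# usage with stylised inputs: flat 2% hazard rate, decaying EE profile
grid = np.linspace(0.25, 5.0, 20)
pd_curve = lambda s: 1.0 - np.exp(-0.02 * s)
ee_star = 1e6 * np.exp(-0.1 * grid)
print(unilateral_cva(0.6, grid, ee_star, pd_curve))
```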
Suppose now that the counterparty is default-risk-free. Then, the counterparty should require a risk premium from the bank to be compensated for the risk of the bank defaulting. The market value of this risk premium is given by:

$$\mathrm{CVA}_b(t) = \mathrm{LGD}_b^Q \cdot \int_t^T EE_b^*(s\,|\,t)\; dPD_b^Q(s\,|\,t) \qquad (17.5)$$

where LGD_b^Q is the market-implied LGD for the bank, PD_b^Q(s|t) is the risk-neutral cumulative probability of the bank's default between time t and time s ≥ t, estimated at time t, and EE_b^*(s|t) is the discounted risk-neutral expected exposure of the counterparty to the bank at time s calculated at time t, conditional on the bank defaulting at time s, given by:

$$EE_b^*(s\,|\,t) = E_t^Q\!\left[\frac{B(0)}{B(s)}\, E_b(s)\;\Big|\; \tau_b = s\right] \qquad (17.6)$$

where τ_b is the time of default of the bank. Note that in practice the
bank would often refer to the unilateral CVA calculated from the
counterparty’s perspective as debit valuation adjustment (DVA).
However, neither the bank nor the counterparty is default-risk-
free. If they value counterparty risk for their portfolio unilaterally,
they would never agree on the price, as one would demand a posi-
tive risk premium from the other. The bank and the counterparty
would agree on the price only if they both price counterparty risk


bilaterally. The bilateral pricing approach specifies a single quantity – known as bilateral CVA – that accounts both for the bank's loss
caused by the counterparty’s default and the counterparty’s loss
caused by the bank’s default.
Often the bilateral CVA is approximated by the difference
between unilateral CVA and unilateral DVA:

$$\mathrm{CVA}_{cb}(t) \approx \mathrm{CVA}_c(t) - \mathrm{CVA}_b(t) \qquad (17.7)$$

where CVAcb(t) is the bilateral CVA at time t from the bank’s perspec-
tive. However, Equation 17.7 is not quite accurate because it ignores
the order in which the bank and the counterparty default.
It is not difficult to account for the default order in calculation of
the bilateral CVA. There are two types of possible default scenario:4

❑❑ Counterparty defaults before the bank does (that is, τ_c < τ_b). Under these scenarios, the loss for the bank is equal to the bank's exposure to the counterparty at the time of the counterparty's default less the amount the bank is able to recover: LGD_c^Q · E_c(τ_c).
❑❑ Counterparty defaults after the bank does (that is, τ_c > τ_b). Under these scenarios, the counterparty experiences a loss equal to the counterparty's exposure to the bank at the time of the bank's default less the amount the counterparty is able to recover: LGD_b^Q · E_b(τ_b). However, from the bank's perspective, the counterparty's loss is the bank's gain (or negative loss) resulting from the bank's option to default.

Combining both types of scenario into a single expression, applying appropriate discounting and taking conditional expectation, we obtain the bilateral CVA from the bank's perspective:5

$$\mathrm{CVA}_{cb}(t) = \mathrm{LGD}_c^Q \cdot \int_t^T EE_c^*(s\,|\,t)\, \mathrm{Pr}_t^Q\big[\tau_b > s \,\big|\, \tau_c = s\big]\; dPD_c^Q(s\,|\,t) - \mathrm{LGD}_b^Q \cdot \int_t^T EE_b^*(s\,|\,t)\, \mathrm{Pr}_t^Q\big[\tau_c > s \,\big|\, \tau_b = s\big]\; dPD_b^Q(s\,|\,t) \qquad (17.8)$$

where Pr_t^Q[·] denotes the risk-neutral probability conditional on all the information available up to time t.
One can use a copula model to express the conditional probabili-
ties in Equation 17.8 as functions of the counterparty’s and the
bank’s risk-neutral unconditional probabilities of default (PDs). For


example, if the normal copula model (see Li, 2000) is used to describe the dependence between τ_c and τ_b, the conditional probabilities in Equation 17.8 take this form:
⎛ Φ−1 ⎡PDQ ( s t)⎤ − ρΦ−1 ⎡PDQ ( s t )⎤ ⎞
⎣ b ⎦ ⎣ c ⎦ ⎟
PrtQ ⎡⎣τ b > s τ c = s⎤⎦ = 1− Φ ⎜⎜ 2 ⎟
 (17.9)
⎝ 1− ρ ⎠
and:
⎛ Φ−1 ⎡PDQ ( s t)⎤ − ρΦ−1 ⎡PDQ ( s t )⎤ ⎞
⎣ c ⎦ ⎣ b ⎦ ⎟
PrtQ ⎡⎣τ c > s τ b = s⎤⎦ = 1− Φ ⎜⎜ 2 ⎟
 (17.10)
⎝ 1− ρ ⎠
where r is the normal copula correlation, Φ(⋅) is the standard normal
cumulative distribution function, and Φ–1(⋅) is its inverse function.
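A minimal Python sketch of Equations 17.9–17.10, assuming SciPy for the normal distribution functions (the function name is illustrative):

```python
from scipy.stats import norm

def survival_prob_given_default(pd_other, pd_defaulted, rho):
    """Equations 17.9/17.10: probability that the other party survives past s,
    conditional on the named party defaulting exactly at s (normal copula).

    pd_other, pd_defaulted: cumulative risk-neutral PDs, PD^Q(s|t)."""
    num = norm.ppf(pd_other) - rho * norm.ppf(pd_defaulted)
    return 1.0 - norm.cdf(num / (1.0 - rho**2) ** 0.5)

# e.g. Pr[tau_b > s | tau_c = s] with PD_b = 3%, PD_c = 5%, rho = 0.4
print(survival_prob_given_default(0.03, 0.05, 0.4))
```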

Loss in a trading book


Suppose V(t) is the MTM value of the bank’s entire trading book at
time t, discounted to time zero. Then the bank’s loss LMkt(H) over
time horizon H in time-zero dollars is given by:

$$L^{Mkt}(H) = V(0) - V(H) - CF(0, H) \qquad (17.11)$$

where CF(0, H) is all the trading book cashflows the bank receives
between time zero and time H, discounted to time zero (the bank’s
payments result in negative contributions to CF). These cashflows
may be deterministic or stochastic and include coupon and divi-
dend payments, other periodic payments (for example, swaps),
payments in the event of default, payments at trades’ maturity, and
exercising of options.
Equation 17.11 (or its equivalents) has been used by financial
institutions for many years to quantify trading book losses and
calculate VaR. Until recently, these market risk calculations were
performed without taking into account CCR, as all trade values
were calculated counterparty-risk-free. However, CCR is an
inherent part of a trading book and must be accounted for in
Equation 17.11 by adjusting trade values for CCR and including
cashflows arising from the counterparties’ defaults.
Let us consider trading book losses of the bank on counterparty i.
If the counterparty does not default prior to the horizon H, we only
need to adjust the portfolio value at the beginning and at the end of
the time interval for CCR in Equation 17.11 to arrive at:


$$L_i(H \,|\, \tau_i > H) = \big[V_i(0) - \mathrm{CVA}_i^b(0)\big] - \big[V_i(H) - \mathrm{CVA}_i^b(H)\big] - CF_i(0, H) \qquad (17.12)$$

where CVA_i^b(t) is calculated bilaterally, as discussed above.
If the counterparty defaults at time τ_i prior to the horizon, the
portfolio that the bank has with the counterparty is closed out.
During the close-out the bank receives the portfolio MTM value
Vi(τi) at the time of default (if negative, the bank pays to the counter-
party) less the loss in the event of default. The loss in the event of
default is given by the product of the LGD LGDi on counterparty i
and the bank’s exposure Ei*(t i) to counterparty i at the time of default,
discounted to time zero (that is, Ei*(t) = [B(0)/B(t)]⋅Ei(t)). Since the
portfolio is closed out at the time of default, portfolio MTM value at
the horizon is zero. Thus, the bank’s trading book loss on counter-
party i conditional on the counterparty defaulting prior to the
horizon is:
$$L_i(H \,|\, \tau_i \le H) = \big[V_i(0) - \mathrm{CVA}_i^b(0)\big] - \big[V_i(\tau_i) - \mathrm{LGD}_i \cdot E_i^*(\tau_i)\big] - CF_i(0, \tau_i) \qquad (17.13)$$

Equation 17.13 is valid if the bank does not take any action after the
settlement with the defaulting counterparty. However, this would
rarely be the case because banks try to maintain a market-neutral
position, and removal of the trades with the defaulting counter-
party from the bank’s trading book exposes the bank to unhedged
market risk. To restore the market-neutral position, the bank would
have to replace the trades it had with the defaulting counterparty
by booking equivalent trades with another counterparty. It is typi-
cally assumed that the bank pays the amount V_i(τ_i) (if negative, the
bank receives the money) to replace the portfolio.6 Thus, the bank
will have the same portfolio at the horizon as in the case of no
default, and Equation 17.13 transforms to:

$$L_i(H \,|\, \tau_i \le H) = \big[V_i(0) - \mathrm{CVA}_i^b(0)\big] - V_i(H) + \mathrm{LGD}_i \cdot E_i^*(\tau_i) - CF_i(0, H) \qquad (17.14)$$

Combining Equations 17.12 and 17.14, summing over the counterparties and adding the market positions that are not subject to CCR
(for example, positions in stocks, bonds, exchange-traded deriva-
tives), we obtain the trading book loss LTB(H) in this form:


$$L^{TB}(H) = L^{Mkt}(H) + L^{CCR}(H) \qquad (17.15)$$

where L^{Mkt}(H) is given by Equation 17.11 and L^{CCR}(H) is given by:

$$L^{CCR}(H) = \sum_i \Big( 1_{\{\tau_i \le H\}} \cdot \big[\mathrm{LGD}_i \cdot E_i^*(\tau_i) - \mathrm{CVA}_i^b(0)\big] + 1_{\{\tau_i > H\}} \cdot \big[\mathrm{CVA}_i^b(H) - \mathrm{CVA}_i^b(0)\big] \Big) \qquad (17.16)$$

where 1_{·} is the indicator function of a Boolean argument that takes a value of one if the argument is true and a value of zero otherwise. Loss in Equation 17.16 is calculated as the summation of CCR losses across all the counterparties of the bank. For each counterparty, the CCR loss in Equation 17.16 consists of two terms: the loss that occurs in the event of the counterparty's default and the loss that occurs in the event of the counterparty's survival.
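A minimal sketch of how Equation 17.16 could be accumulated per simulated scenario (the argument names are illustrative assumptions, not part of the original text):

```python
def ccr_loss(h, default_times, lgd, exposure_at_default, cva0, cva_h):
    """Per-scenario CCR loss, Equation 17.16, summed over counterparties.

    default_times[i]: simulated tau_i (use float('inf') if no default);
    exposure_at_default[i]: discounted exposure E*_i(tau_i) in the scenario;
    cva0[i], cva_h[i]: bilateral CVA of counterparty i at time 0 and at H."""
    loss = 0.0
    for i, tau in enumerate(default_times):
        if tau <= h:   # default leg: LGD * E*(tau) less today's CVA
            loss += lgd[i] * exposure_at_default[i] - cva0[i]
        else:          # survival leg: change in bilateral CVA over [0, H]
            loss += cva_h[i] - cva0[i]
    return loss
```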

CCR as market risk


Many large banks actively manage CCR. They have established
CVA trading desks that aggregate the bank’s CCR by selling protec-
tion to its traditional trading desks and then hedging the aggregated
CCR either internally or externally. To hedge jump-to-default risk
and CVA credit spread risk, CVA desks would buy and sell credit
protection on the counterparty in the form of credit default swaps
(CDSs). For a counterparty whose credit risk is not traded and for
own credit risk that appears in bilateral CVA, one can still hedge the
systematic component of the credit spread via index CDSs. To
hedge CVA exposure risk, CVA desks would use the sensitivities of
CVA to various market risk factors and then enter into over-the-
counter derivatives transactions that offset these sensitivities.
Banks that actively manage CCR should model and calculate it
jointly with market risk. Conceptually, this can be done by adding
one virtual trade per counterparty to the trading book and then
calculating the distribution of market losses for the extended
trading book, which includes the virtual trades. From Equation
17.16, we see that these virtual trades are complex defaultable
derivatives whose value is equal to the negative of the bilateral
CVA. When the counterparty defaults, the derivative’s value jumps
to zero and an outgoing cashflow equal to the product of LGD and
exposure at the time of default occurs.


Incorporating CCR in market risk calculations is quite challenging. One has to calculate the CVA at the market risk time
horizon H for several thousand market scenarios. As can be seen
from Equation 17.8, a CVA calculation for a single market scenario
at H requires the counterparty discounted EE profile EEc*(t|H) for a
set of time points t ≥ H that extend to the maturity of the portfolio
(which may be as high as 30 or even 50 years), so that the number of
time points can be of the order of 100. To obtain this EE profile, one
has to simulate several thousand paths of market risk factors for all
time points beyond H and revalue the entire portfolio for each path
at each time point. Repeating this extensive simulation procedure
several thousand times on a daily basis is not computationally
feasible. Instead, banks often apply pre-calculated CVA sensitivities
to various market risk factors to simulated changes of these factors.
Another approach is to calculate CVA for a set of grid points in the
market risk factor space prior to market risk simulation. During the
simulation, the CVA at the horizon is obtained from the grid point
values via interpolation.
It is important to understand that the outcome of these extended
market risk calculations would be a distribution of trading book
losses, which cannot be separated into a market risk part and a CCR
part. From this distribution, trading book risk measures (such as VaR
or expected shortfall) can be calculated. These measures can be used
for calculating trading book economic capital. For more details on
CCR as market risk, see Canabarro (2010).

CCR as credit risk


Many banks do not actively manage CCR, but hold this risk to the
portfolio maturity (as they would do with lending credit risk). For
such banks, the joint treatment of market risk and CCR may not be
appropriate. Time horizons used for market risk calculations are
usually quite short, which reflects short liquidity horizons for many
trading book items. However, if a bank does not hedge CCR, appli-
cation of the short time horizon is not justified. Instead, it may be
more appropriate to treat the CCR loss given by Equation 17.16
jointly with the credit risk of the banking book.
In this section, we analyse economic capital for CCR assuming an
ASRF framework. The ASRF framework underlies Basel II and, to
some extent, Basel III minimum capital requirements and allows for


analytical tractability. The ASRF framework is based on two assumptions: a bank's exposure to any obligor is infinitesimally small in
comparison with the total portfolio exposure; and credit losses in the
portfolio are driven by a single systematic risk factor.
From these assumptions, it follows that the contribution of each
exposure to the portfolio credit VaR at confidence level q is inde-
pendent of the portfolio composition and is given by the expected
loss on that exposure conditional on the systematic risk factor being
equal to either its q-percentile (portfolio loss is an increasing func-
tion of the factor) or its (1 – q)-percentile (portfolio loss is a decreasing
function of the factor).7
We will assume that the portfolio loss is a decreasing function of
a single systematic risk factor Z that has the standard normal distri-
bution. We will focus our analysis on a single counterparty. From
Equation 17.16, the loss Li(H) on counterparty i for the time horizon
H is given by:

$$L_i(H) = 1_{\{\tau_i \le H\}} \cdot \big[\mathrm{LGD}_i \cdot E_i^*(\tau_i) - \mathrm{CVA}_i^b(0)\big] + 1_{\{\tau_i > H\}} \cdot \big[\mathrm{CVA}_i^b(H) - \mathrm{CVA}_i^b(0)\big] \qquad (17.17)$$

The first term in Equation 17.17 describes a bank's losses resulting from the counterparty's default, while the second term describes losses experienced by the bank when the counterparty survives.
The credit VaR capital charge Ki(H) for counterparty i at confi-
dence level q is obtained by applying the expectation8 conditional
on Z = z1–q to Equation 17.17:

$$K_i(H) = PD_i(H \,|\, z_{1-q}) \cdot \big[\mathrm{LGD}_i(z_{1-q}) \cdot EAD_i(H \,|\, z_{1-q}) - \mathrm{CVA}_i^b(0)\big] + \big[1 - PD_i(H \,|\, z_{1-q})\big] \cdot \Big[E\big[\mathrm{CVA}_i^b(H) \,\big|\, Z = z_{1-q}, \tau_i > H\big] - \mathrm{CVA}_i^b(0)\Big] \qquad (17.18)$$

Several important quantities enter Equation 17.18:

❑❑ PD_i(H|z_{1−q}) is the probability of default of counterparty i for the time horizon H conditional on Z = z_{1−q} under the real measure. Assuming the Merton/Vasicek model of default,9 this probability is related to the unconditional PD via (a numerical sketch of this and of Equation 17.21 follows this list):

$$PD_i(H \,|\, z_{1-q}) = \Phi\!\left(\frac{\Phi^{-1}\big[PD_i(H)\big] - r_i\, z_{1-q}}{\sqrt{1 - r_i^2}}\right) \qquad (17.19)$$


where PD_i(H) is the unconditional PD of the counterparty for the time horizon H under the real measure, and r_i describes the sensitivity of the counterparty to the systematic credit risk. Basel rules for many types of exposure use Equation 17.19.
❑❑ LGDi(z1–q) is the expected LGD conditional on Z = z1–q. It is still
common practice to assume that LGD is completely idiosyn-
cratic. Conceptually, the dependence of LGD on Z can be
modelled as described in Pykhtin (2003). Basel rules do not have
a model of this dependence, but do require the use of downturn
LGD.
❑❑ EAD_i(H|z_{1−q}) is the exposure-at-default for horizon H defined according to:

$$EAD_i(H \,|\, z_{1-q}) = \frac{\int_0^H dPD_i(t \,|\, z_{1-q}) \cdot E\big[E_i^*(t) \,\big|\, Z = z_{1-q}, \tau_i = t\big]}{PD_i(H \,|\, z_{1-q})} \qquad (17.20)$$
Equation 17.20 is a weighted average of conditional expected exposure over the time interval that starts today and ends at the horizon. The weights are determined by the cumulative conditional PD in Equation 17.19. In practice, two simplifications are often applied to Equation 17.20: the weighted average is replaced with the simple time average; and the conditional expectation is replaced by the product of the unconditional expectation and a multiplier commonly known as alpha. After these simplifications, the EAD becomes:

$$EAD_i(H \,|\, z_{1-q}) = \alpha(q) \cdot \frac{1}{H} \int_0^H dt \cdot E\big[E_i^*(t)\big] \qquad (17.21)$$

The multiplier α(q) incorporates the effects of possible dependence between exposure and the counterparty credit quality (wrong-way risk) along with possible deviations from ASRF assumptions (such as dependence of exposure on systematic market factors).10 Basel rules use a modified version of Equation 17.21.
❑❑ E[CVA_i^b(H)|Z = z_{1−q}, τ_i > H] is the expectation of the bilateral CVA at the horizon conditional on Z = z_{1−q} and on no default prior to the horizon. We are not aware of any common practice of evaluating this quantity.
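The numerical sketch promised above: a minimal Python illustration of Equations 17.19 and 17.21, with stylised inputs (the names and parameter values are assumptions of this sketch):

```python
import numpy as np
from scipy.stats import norm

def conditional_pd(pd_h, r_i, z):
    """Equation 17.19: PD conditional on the systematic factor Z = z."""
    return norm.cdf((norm.ppf(pd_h) - r_i * z) / np.sqrt(1.0 - r_i**2))

def ead_simplified(alpha, ee_star):
    """Equation 17.21 on a uniform time grid: alpha times the average EE."""
    return alpha * np.mean(ee_star)

# stressed PD at q = 99.9% (z_{1-q} ~ -3.09) and EAD with alpha = 1.4
z_1mq = norm.ppf(0.001)
print(conditional_pd(0.01, 0.3, z_1mq))
grid = np.linspace(0.0, 1.0, 13)
print(ead_simplified(1.4, 1e6 * np.exp(-0.1 * grid)))
```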


The first term in Equation 17.18 is the capital charge covering losses
due to the counterparty’s default, while the second term is the
capital charge covering losses occurring in the event of the counter-
party’s survival. These terms correspond to “default risk” and
“credit migration risk” terms in standard portfolio credit risk
models, but there are important differences:

❑❑ In the default risk term, the default loss is reduced by the amount
of the time-zero CVA (since the CVA is bilateral, it can be nega-
tive, so the loss would increase in this case). In loan portfolio
models, no such term is present.
❑❑ In the credit migration risk term, the capital charge covers the
potential increase of the CVA over the time horizon, which can be
caused by changes in the counterparty’s credit spread, in the
bank’s credit spread and in the risk-neutral EE profile (see
Equation 17.8). In loan portfolio models, this capital charge
covers the change in loan value due to potential deterioration of
the obligor’s credit quality.
❑❑ While EL is usually subtracted from credit VaR in loan portfolio
models, no such subtraction occurs in Equation 17.18. The moti-
vation for subtracting EL is that losses up to EL are covered by a
bank’s reserves, funded by the income from the bank’s assets.
However, CCR can be described in terms of virtual trades (see
the previous section) that produce no income that could fund the
reserve. Therefore, subtracting EL from credit VaR is not
appropriate.

Capital for CCR under Basel II


In the last two sections, we described two different ways of treating
CCR. One approach is a pure market risk approach, which is appro-
priate for banks that actively manage and hedge CCR. The other
approach is a pure credit risk approach, appropriate for banks that
hold CCR until the trades’ maturity.
Basel II looks at CCR from the pure credit risk perspective
assuming a capital horizon H = 1 year and a confidence level q =
99.9%. For banks with internal models method (IMM) approval, the
CCR capital charge under Basel II is calculated according to the
formula used for corporate exposures:


$$K_i(H) = EAD_i \cdot \mathrm{LGD}_i^{DT} \cdot \big[PD_i(H \,|\, z_{1-q}) - PD_i(H)\big] \cdot MA(M_i) \qquad (17.22)$$

where LGD_i^{DT} is the downturn (hence the superscript DT) LGD, PD_i(H|z_{1−q}) is calculated according to Equation 17.19 with the asset correlations specified in BCBS (2006), and EAD_i is calculated according to Equation 17.21 with two modifications: exposure is not discounted; and the non-decreasing constraint is applied to the EE profile to account for rollover of short-term trades. The multiplier α(q) is set to 1.4, but advanced banks may use internal calculations of α(q), subject to supervisory approval and to a floor of 1.2. The multiplier MA(M_i) is the maturity adjustment (MA) for an exposure with effective remaining maturity M_i calculated according to the formula specified in BCBS (2006).
Comparing Equation 17.22 with our discussion in the previous section and the equations there, one can see that the Basel II capital
downturn LGD is a reasonable proxy for expected LGD conditional
on a stressed value of the systematic risk factor. However, one can
raise two issues regarding Equation 17.22:

❑❑ Expected loss11 is subtracted from the credit VaR. As we have discussed at the end of the previous section, EL should not be
subtracted from credit VaR because the CCR virtual trades have
no income that could be counted against CCR losses. Instead,
today’s value of the bilateral CVA should be subtracted from the
conditional loss in the event of default, as in the first term (top
line) of Equation 17.18.
❑❑ The MA factor is used as a proxy for the second term (bottom
line) in Equation 17.18 that describes credit migration risk. The
calibration of the MA formula was done for loan portfolios by
means of an MTM portfolio credit risk model similar to KMV
Portfolio Manager.12 One can argue that the effective MA factors
for CCR should be different from the ones for lending risk. One
obvious difference is that the MA for effective maturity equal to
one year would not be necessarily equal to one, but would
depend on the term structure of the EE.

If these two issues were resolved, the Basel II framework for CCR would be very reasonable for banks that do not actively manage or hedge CCR. However, resolving the MA issue is not trivial because it would involve calculation of E[CVAib(H) | Z = z1−q, τi > H].

Capital for CCR under Basel III


Recall that CCR loss is defined according to Equation 17.16. We
have argued that this CCR loss can be treated either as market risk
or as credit risk. While Basel II treated CCR loss as credit risk, Basel
III applies a hybrid approach by treating default risk and credit
migration risk differently: the default risk is treated as credit risk
under the ASRF framework (with capital horizon H = 1 year and
confidence level q = 99.9%), while the credit migration risk is treated
as market risk (the CVA capital charge in BCBS, 2010).
Under Basel III, banks with IMM approval for CCR should calcu-
late the default risk capital charge according to the Basel II formula
(Equation 17.22) with EAD replaced with so-called ‘outstanding
EAD’, defined as the greater of zero and the difference between
EAD and unilateral CVA:

$$K_i^{\mathrm{def}}(H) = \max\left\{\mathrm{EAD}_i - \mathrm{CVA}_i(0),\, 0\right\} \cdot \mathrm{LGD}_i^{DT} \cdot \left[\mathrm{PD}_i(H \mid z_{1-q}) - \mathrm{PD}_i(H)\right] \cdot \mathrm{MA}(M_i) \quad (17.23)$$

IMM banks that have approval for specific-risk interest rate internal VaR models, and that can demonstrate that these models incorporate credit migrations (for brevity, we will call such banks advanced banks), may set the maturity adjustment equal to one (that is, MA(Mi) = 1 for any Mi).
One can raise the following issues regarding Equation 17.23.

❑❑ Unconditional PD is subtracted from conditional PD. This is a carry-over from the Basel II formula resulting from subtracting the EL from the credit VaR. As we have argued above, the CCR virtual trades do not generate any income, so the EL subtraction should be eliminated.
❑❑ Time zero unilateral CVA is subtracted from EAD. Instead, as we
have shown above, the bilateral CVA should be subtracted from
the product of EAD and LGD.
❑❑ Maturity adjustment should not be present in the default risk

capital charge, as credit migrations should be captured in the explicitly modelled credit migration term (the CVA capital charge discussed below).13

The correct default risk capital charge formula is described by the first term (the top line) in Equation 17.18, which can be restated in terms of Basel definitions of EAD and LGD as:

$$K_i^{\mathrm{def}}(H) = \mathrm{PD}_i(H \mid z_{1-q}) \cdot \left[\mathrm{LGD}_i^{DT} \cdot \mathrm{EAD}_i - \mathrm{CVA}_i^b(0)\right] \quad (17.24)$$

Note that there is no MA factor and that CVAib(0) is bilateral and, as such, can be negative.
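The contrast between Equations 17.23 and 17.24 is easy to state in code; in the minimal sketch below, all inputs are hypothetical and the conditional and unconditional PDs are computed as in Equation 17.22.

```python
def basel3_default_charge(ead, cva_unilateral, pd_cond, pd, lgd_dt, ma):
    # Equation 17.23: "outstanding EAD", EL subtraction and MA factor
    return max(ead - cva_unilateral, 0.0) * lgd_dt * (pd_cond - pd) * ma

def proposed_default_charge(ead, cva_bilateral, pd_cond, lgd_dt):
    # Equation 17.24: no EL subtraction, no MA factor; the bilateral
    # CVA (which can be negative) is netted against LGD * EAD
    return pd_cond * (lgd_dt * ead - cva_bilateral)
```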
The credit migration risk is addressed in Basel III via the CVA
capital charge. There are two approaches available: advanced and
standardised. Under the advanced approach (which will be the focus
of the rest of this section), banks would simulate the credit indexes
and credit spreads of all their counterparties for a market risk time
horizon (H = 10 days), calculate the changes in the unilateral CVA for
each counterparty and in the value of eligible hedges caused by the
changes in credit spreads (the EE profiles are assumed to be fixed),
and calculate the VaR for this portfolio at the confidence level of 99%.
Then, the CVA capital charge will be calculated according to the
market risk capital rules (see BCBS, 2009, for details). Unilateral CVA
calculations under the advanced approach are consistent with
Equation 17.3, but there is one caveat: the EE profile has to be calcu-
lated according to Basel II requirements, as described in BCBS (2006).
Eligible hedges include single-name CDSs, single-name contingent
CDSs and index CDSs booked specifically for hedging CVA risk. The
eligible hedges are to be removed from the bank’s market risk capital
charge calculations.
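The sketch below illustrates the mechanics for a single counterparty. Only the outline is prescribed by Basel III (10-day horizon, fixed EE profile, 99% confidence); the lognormal spread dynamics, the credit-triangle bootstrap of default probabilities from a flat spread, and all parameter values are our own illustrative assumptions.

```python
import numpy as np

def unilateral_cva(spread, ee, t, disc, lgd=0.6):
    # Proxy for the unilateral CVA of Equation 17.3: fixed EE profile,
    # flat spread, hazard rate from the credit triangle s = LGD * lambda
    surv = np.exp(-spread * t / lgd)
    dpd = surv[:-1] - surv[1:]                       # bucket default probs
    return lgd * np.sum(dpd * 0.5 * (ee[:-1] + ee[1:])
                            * 0.5 * (disc[:-1] + disc[1:]))

def advanced_cva_var(spread0, vol, ee, t, disc,
                     h=10.0 / 250.0, q=0.99, n=20_000, seed=7):
    # Shock the credit spread over the 10-day horizon, revalue the CVA
    # with the EE profile held fixed, and take the 99% loss quantile
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    spreads = spread0 * np.exp(vol * np.sqrt(h) * z - 0.5 * vol**2 * h)
    cva0 = unilateral_cva(spread0, ee, t, disc)
    losses = np.array([unilateral_cva(s, ee, t, disc) - cva0
                       for s in spreads])
    return np.quantile(losses, q)

t = np.linspace(0.0, 5.0, 21)        # hypothetical quarterly grid
ee = 10.0 * np.sqrt(t)               # toy EE profile
disc = np.exp(-0.03 * t)
print(advanced_cva_var(0.02, 0.8, ee, t, disc))
```

In a production setting the revaluation would of course run over the full set of counterparties and eligible hedges, and the result would feed the market risk capital rules of BCBS (2009).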
At first glance, the Basel III advanced CVA capital charge makes
perfect sense, as it addresses the second term of Equation 17.16.
However, it is not difficult to see that the advanced CVA capital
charge has some problems.

❑❑ Calculating the CVA capital charge on a stand-alone basis – As we have argued above, by treating CVA as market risk, one has to
have argued above, by treating CVA as market risk, one has to
put the CVA virtual trades together with the rest of the trading
book and calculate market risk VaR for this extended portfolio
via joint simulation of real and virtual trades. Calculating market

risk VaR for CVA separately from the market risk calculations for
actual trades completely ignores dependencies between CVA
and the real trades.
❑❑ Keeping the EE profiles fixed during CVA simulations and exclusion of
market risk hedges – CVA fluctuates due to variability of both the
counterparty credit spread and the EE profile. Keeping the EE
profile fixed ignores a significant portion of CVA risk. Moreover,
banks that actively manage CCR hedge both the EE changes and
the counterparty credit spread changes. EE hedges may include
trades of many types: they offset the sensitivity of the EE to
multiple risk factors: interest rate, foreign exchange, etc.14 While
these trades actually reduce the bank’s trading book risk, they
are not eligible CVA hedges under Basel III and have to be
included in the market risk capital calculations. However, these
trades do not offset any of the real trades – they will appear as
“naked” trades in the market risk calculations, resulting in higher
market risk capital. This could create a perverse incentive for
banks: banks that hedge variability of CVA due to market risk
factors will be punished by higher market risk capital.
❑❑ Unilateral CVA is used in calculations – Bilateral CVA determines
the market value of CCR and, as such, should be used for calcu-
lating the CVA capital charge. Note that only bilateral CVA enters
Equation 17.16.

Thus, the Basel III CVA capital charge does not capture CVA risk
properly and even creates incentives for banks to leave the EE
portion of CVA risk unhedged.

Conclusion
We have proposed a general framework for calculating capital for
CCR that consistently incorporates CVA. We have considered two
applications of this framework: market risk and credit risk. Under
the market risk approach, CCR is modelled and calculated jointly
with the CCR-free trading book. This can be done by extending the
actual trading book via adding to the portfolio one virtual hybrid
defaultable contingent claim per counterparty. VaR calculated for
the extended trading book covers both market risk and CCR. This
approach is appropriate for sophisticated financial institutions that
manage and hedge their CCR dynamically. Under the credit risk

approach, CCR is treated as credit risk jointly with a bank's banking
book. We have derived expressions for CCR capital under the ASRF
framework that underlies the minimum capital requirements of
Basel II and Basel III. Both default risk and credit migration risk
terms critically depend on bilateral CVA.
We used our framework to analyse minimum capital requirements
for CCR under Basel II and Basel III. We have shown that Basel II
capital requirements could be made consistent with the credit risk
version of our framework by amending the default risk capital charge
formula (via removing the subtraction of expected loss and correctly
incorporating the time-zero bilateral CVA) and redefining the matu-
rity adjustment. We have also shown that the Basel III default capital
charge is not correct either: it incorporates the time-zero CVA incor-
rectly and still involves subtraction of the expected loss. The CVA
capital charge introduced in Basel III is certainly a step in the right
direction. However, while appearing conceptually sound, it ignores
exposure variability and treats CVA risk on a stand-alone basis (that
is, separately from the market risk of the trading book). This stand-
alone treatment could potentially create perverse incentives for
banks not to hedge the variability of CVA due to changes in market
risk factors that drive exposure.
A conceptually sound regulatory treatment of CCR could be as
follows. The default treatment, applicable to all internal ratings-
based banks, should be the credit risk approach described above.
For IMM banks, the capital charge should be in the form of default
charge (given by Equation 17.24), supplemented with an appropri-
ately calibrated CVA/credit migration charge. Banks that can prove
to their supervisors that they actively manage CCR should be
allowed to bring CCR completely to the market risk framework.
CCR virtual trades should be subject to the same market risk rules
as real trades, including an incremental risk capital charge that
accounts for credit migration and jump-to-default risk. No CCR
default risk capital (the “IMM capital charge”) should be applied to
such banks. We hope that the upcoming fundamental review of the
trading book rules by the Basel Committee will address most of
the issues discussed in this chapter by following this general
direction.

The author would like to thank Eduardo Canabarro, David Lynch and Dan Rosen for valuable comments. The opinions expressed here are those of the author, and do not necessarily reflect the views of the Federal Reserve Board or its staff.

  1 For a comprehensive review of CCR, see, for example, Canabarro and Duffie (2003),
Pykhtin and Zhu (2007) or Gregory (2010).
  2 See Pykhtin (2010) for details.
  3 The term structure of the risk-neutral PDs is obtained from the CDS spreads quoted in the
market (see, for example, Schönbucher, 2003).
  4 There is also a possibility of simultaneous default (τC = τB), but we ignore this scenario as
unlikely. There have been studies that accounted for joint default possibility (see, for
example, Gregory, 2009).
  5 See, for example, Pykhtin (2011) for more details.
  6 Strictly speaking, the bilateral CVA at the time of default should be subtracted from the
risk-free portfolio value Vi(ti). However, this CVA is usually ignored because it is practi-
cally impossible to estimate and it is likely to be negligible. First, it is not known in advance
what counterparty the bank will use to replace the portfolio, so it is not clear what credit
spread to use in the CVA calculations. Second, the replacement counterparty is likely to be
another bank, and the replacement trades are likely to be a part of a larger netting set.
Their contribution to the netting set CVA will depend on the existing trades in that netting
set. Third, interbank OTC derivatives portfolios are usually well collateralised (with low
threshold), so that the extra exposure and CVA resulting from the replacement trades
should be small. Finally, the replacement trades are not completely equivalent to the orig-
inal trades: they have the same sensitivity to the market risk factors, but may have different
MTM value. For example, regardless of the current market value of an interest rate swap,
the replacement swap’s value will always be zero. Thus, even if a swap portfolio with the
defaulting counterparty is well in-the-money (which would lead to a high bilateral CVA),
the bilateral CVA for the replacement trades will be negligible because the EE profiles for
the bank and the replacement counterparty will be similar.
  7 For a rigorous proof, see Gordy (2003).
  8 The expectation is taken under the physical probability measure, not risk-neutral.
  9 See, for example, Vasicek (2002).
10 While Equation 17.20 does capture wrong-way risk, it relies on the ASRF framework, thus
ignoring the systematic nature of many market risk factors. Thus, the right-hand side of
Equation 17.20 still requires a multiplier alpha, but this alpha would be more stable and
usually have smaller magnitude than alpha in Equation 17.21. ISDA-TBMA-LIBA (2003)
reported alpha calculated by four large banks for their actual portfolios to be in the 1.07–
1.10 range. For theoretical work on alpha, see Canabarro, Picoult and Wilde (2003) and
Wilde (2005).
11 The EL calculation in Equation 17.23 is not quite correct: the unconditional expected LGD
should be used in the EL calculation rather than the downturn LGD.
12 See Basel Committee on Banking Supervision (2004) for details.
13 This issue is addressed only for advanced banks by allowing them to set MA(Mi) = 1 for
any Mi.
14 See Canabarro (2010) for more details.

REFERENCES

BCBS, 2004, "An Explanatory Note on the Basel II IRB Risk Weight Functions," October.

BCBS, 2006, "International Convergence of Capital Measurement and Capital Standards: A Revised Framework," June.

BCBS, 2009, "Revisions to the Basel II Market Risk Framework," July.

BCBS, 2010, "Basel III: A Global Regulatory Framework for More Resilient Banks and Banking Systems," December.

Canabarro E., 2010, "Pricing and Hedging Counterparty Risk: Lessons Relearned?" in E. Canabarro (Ed), Counterparty Credit Risk (London, England: Risk Books).

Canabarro E. and D. Duffie, 2003, "Measuring and Marking Counterparty Risk," in L. Tilman (Ed), Asset/Liability Management for Financial Institutions (London, England: Euromoney Books).

Canabarro E., E. Picoult and T. Wilde, 2003, "Analysing Counterparty Risk," Risk, September, pp 117–22.

Gordy M., 2003, "A Risk-factor Model Foundation for Ratings-based Bank Capital Rules," Journal of Financial Intermediation, 12(3), July, pp 199–232.

Gregory J., 2009, "Being Two-faced Over Counterparty Credit Risk," Risk, February, pp 86–90.

Gregory J., 2010, Counterparty Credit Risk: The New Challenge for Global Financial Markets (Hoboken, NJ: Wiley).

ISDA-TBMA-LIBA, 2003, "Counterparty Risk Treatment of OTC Derivatives and Securities Financing Transactions," June (available at www.isda.org/c_and_a/pdf/counterpartyrisk.pdf).

Li D., 2000, "On Default Correlation: A Copula Approach," Journal of Fixed Income, 9, pp 43–54.

Pykhtin M., 2003, "Unexpected Recovery Risk," Risk, August, pp 74–78.

Pykhtin M., 2010, "Collateralised Credit Exposure," in E. Canabarro (Ed), Counterparty Credit Risk (London, England: Risk Books).

Pykhtin M., 2011, "Counterparty Risk Management and Valuation," in T. Bielecki, D. Brigo and F. Patras (Eds), Credit Risk Frontiers: Subprime Crisis, Pricing and Hedging (Hoboken, NJ: Wiley).

Pykhtin M. and S. Zhu, 2007, "A Guide to Modeling Counterparty Credit Risk," GARP Risk Review, July/August, pp 16–22.

Schönbucher P., 2003, Credit Derivatives Pricing Models: Models, Pricing, and Implementation (Chichester, England: Wiley).

Vasicek O., 2002, "Loan Portfolio Value," Risk, December, pp 160–62.

Wilde T., 2005, "Analytic Methods for Portfolio Counterparty Risk," in M. Pykhtin (Ed), Counterparty Credit Risk Modelling (London, England: Risk Books).

18
Partial Differential Equation
Representations of Derivatives with
Bilateral Counterparty Risk and
Funding Costs
Christoph Burgard and Mats Kjaer
Barclays

Given recent market conditions, counterparty credit risk implicitly embedded in derivative contracts has become increasingly relevant.
This kind of risk represents the possibility that a counterparty will
default while owing money under the terms of a derivative contract,
or, more precisely, if the mark-to-market value of the derivative is
positive to the seller at the time of default of the counterparty.
While, for exchange-traded contracts, the counterparty credit risk is
mitigated by the exchange’s presence as intermediary, this is not the
case for over-the-counter products. For these, a number of different
techniques are used to mitigate counterparty risk, most commonly
by means of netting agreements and collateral mechanisms. The
details of these agreements are specified, for example, by the
International Swaps and Derivatives Association (ISDA) 2002
Master Agreement. However, the counterparty faces the similar
risk of the seller defaulting when the mark-to-market value is posi-
tive to the counterparty. Taking into account the credit risk of both
parties is commonly referred to as considering bilateral counter-
party risk. When doing so, the value of the derivative to the seller is
influenced by its own credit quality. Research into developing tech-
niques for the valuation of derivatives and derivative portfolios
under counterparty risk includes, but is not limited to, Alavian et al

(2008), Brigo and Mercurio (2006), Gregory (2009), Jarrow and Turnbull (1995), Jarrow and Yu (2001), Li and Tang (2007), Pykhtin
Turnbull (1995), Jarrow and Yu (2001), Li and Tang (2007), Pykhtin
and Zhu (2007) and Cesari et al (2009). There are other contexts
where the credit of the seller is relevant, in particular in terms of
mark-to-market accounting of the seller’s own debt and the effect
that this has on its funding costs. Piterbarg (2010) discusses the
effect of funding costs on derivative valuations when collateral has
to be posted. Here, we combine the effects of the seller’s credit on
its funding costs with the effects on the bilateral counterparty risk
into a unified framework. We use hedging arguments to derive the
extensions to the Black–Scholes partial differential equation (PDE)
in the presence of bilateral counterparty risk in a bilateral jump-to-
default model, and we include funding considerations in the
financing of the hedge positions. In addition, we consider two rules
for the determination of the derivative mark-to-market value at
default, namely, the total risky value and the counterparty-risk-free
value. The latter corresponds to the most common approach taken
in the literature. The total value of the derivative will then depend
on which of the two mark-to-market rules is used at default. For
contracts following the ISDA 2002 Master Agreement, for example,
the value of the derivative upon default of one of the counterparties
is determined by a dealer poll. There is no reference to the counter-
parties and one could reasonably expect the derivative value to be
the counterparty-risk-free value, ie, the second case considered. In
the first case, where we use the default-risky derivative price as the
mark-to-market value, we derive a pricing PDE that, in general, is
nonlinear, and show that the unknown risky price can be found by
solving a nonlinear integral equation. Under certain conditions on
the payout, the nonlinear terms vanish and we study the Feynman–
Kac representation of the solution of the resulting linear PDE. In the
second case, where we use the counterparty-risk-free derivative
price as the mark-to-market value, the resulting pricing PDE is
linear. As in the first case, we use the Feynman–Kac representation
to decompose the risky derivative value into a counterparty-risk-
free part, a funding adjustment part and a bilateral credit-valuation
adjustment (CVA) part. By using a hedging strategy to derive our
results, we ensure that the hedging costs of all considered risk
factors are included in the derivative price and our decomposition
of the risky price is a generalisation of the result commonly found

in the literature. Moreover, we obtain explicit expressions for the
hedges, which is important for risk management. There have been
discussions in the literature about how a seller can hedge out its
own credit risk (see Gregory (2009) for a summary). The strategy
described in this chapter includes the (re)purchase by the seller of
its own bonds to hedge out its own credit risk. On the face of it, this
may seem like a futile approach, since if this bond purchase were
funded by issuing more debt (ie, more bonds), the seller would, in
effect, have achieved nothing in terms of hedging its own credit
risk. However, the replication strategy presented shows how the
funding for the purchase of the seller’s own bond is achieved
through the cash account of the hedging strategy. The hedging
strategy (including the premium of the derivative) generates the
cash needed to fund the repurchase of the seller’s own bonds.
Although all results in this chapter are derived for one derivative
on one underlying asset following a particular dynamic, they
extend directly to the situation of a netted derivatives portfolio on
several underlyings following general diffusion dynamics.
This chapter is organised as follows. In the second section we
summarise the main results of the chapter. A general PDE for the
counterparty risky derivative value is derived in the third section
but we defer the specification of the boundary condition at default
to the fourth and fifth sections. In these sections we assume that the
mark-to-market value of the derivative at default is given by the
risky value and the counterparty-risk-free value discounted at the
risk-free rate, respectively. We then consider some examples in the
next section before concluding in the last section.

MAIN RESULTS
We consider a derivative contract V̂ on an asset S between a seller B and a counterparty C that may both default. The asset S is not affected by a default of either B or C, and is assumed to follow a Markov process with generator At. Similarly, we let V denote the same derivative between two parties that cannot default. At default of either the counterparty or the seller, the value of the derivative to the seller V̂ is determined with a mark-to-market rule M, which may equal V̂ or V (throughout this chapter we use the convention that positive derivative values correspond to seller assets and counterparty liabilities).

Table 18.1  Definitions of the rates used throughout the chapter

Rate   Definition                                      Choices discussed

r      Risk-free rate
rB     Yield on recoveryless bond of seller B
rC     Yield on recoveryless bond of counterparty C
λB     λB ≡ rB − r
λC     λC ≡ rC − r
rF     Seller funding rate for borrowed cash on the    rF = r if the derivative can be used as
       seller's derivatives replication cash account   collateral; rF = r + (1 − RB)λB if it cannot
sF     sF ≡ rF − r
RB     Recovery on derivative mark-to-market value
       in case seller B defaults
RC     Recovery on derivative mark-to-market value
       in case counterparty C defaults

By using replication arguments and including funding costs, we derive the PDEs in Equations 18.1 and 18.2 below. The rates, spreads and recoveries used here and throughout the chapter are summarised in Table 18.1.

Main result 1 (PDE for V̂ when M = V̂)
When the mark-to-market value at default is given by M = V̂, then V̂ satisfies the PDE:

$$\partial_t \hat{V} + A_t \hat{V} - r\hat{V} = (1-R_B)\lambda_B \hat{V}^- + (1-R_C)\lambda_C \hat{V}^+ + s_F \hat{V}^+ \quad (18.1)$$

Main result 2 (PDE for V̂ when M = V)
When the mark-to-market value at default is given by M = V, then V̂ satisfies the PDE:

$$\partial_t \hat{V} + A_t \hat{V} - (r + \lambda_B + \lambda_C)\hat{V} = -(R_B\lambda_B + \lambda_C)V^- - (\lambda_B + R_C\lambda_C)V^+ + s_F V^+ \quad (18.2)$$

Main result 3 (CVA when M = V)
Let M = V and rF = r + sF. Then V̂ = V + U, where the CVA U is given by:

$$U(t,S) = -(1-R_B)\int_t^T \lambda_B(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^-(u,S(u))\right]du$$
$$\qquad -(1-R_C)\int_t^T \lambda_C(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^+(u,S(u))\right]du$$
$$\qquad -\int_t^T s_F(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^+(u,S(u))\right]du \quad (18.3)$$

where:
$$D_k(t,u) \equiv \exp\left(-\int_t^u k(\upsilon)\,d\upsilon\right)$$
is the discount factor between times t and u using rate k. If sF = 0, then U is identical to the regular bilateral CVA derived in many of the papers cited in the first section.
Another important result of the chapter is the justification of how the seller's own credit risk can be taken into account. In the
hedging strategy considered, this risk is hedged out by the seller
buying back its own bond. It is shown that the cash needed for
doing so is generated through the replication strategy.

MODEL SETUP AND DERIVATION OF A BILATERAL RISKY PARTIAL DIFFERENTIAL EQUATION
We consider an economy with the following four traded assets:
PR: default risk-free zero-coupon bond;
PB: default risky, zero-recovery, zero-coupon bond of party B;
PC: default risky, zero-recovery, zero-coupon bond of party C; and
S: spot asset with no default risk.
Both risky bonds PB and PC pay 1 at some future date T if the issuing
party has not defaulted, and 0 otherwise. These simplistic bonds
are useful for modelling and can be used as building blocks of more
complex corporate bonds, including those with nonzero recovery.
We assume that the processes for the assets PR, PB, PC and S under
the historical probability measure are specified by:
$$\frac{dP_R}{P_R} = r(t)\,dt, \qquad \frac{dP_B}{P_B} = r_B(t)\,dt - dJ_B$$
$$\frac{dP_C}{P_C} = r_C(t)\,dt - dJ_C, \qquad \frac{dS}{S} = \mu(t)\,dt + \sigma(t)\,dW \quad (18.4)$$

where W(t) is a Wiener process, where r(t) > 0, rB(t) ≥ 0, rC(t) > 0, σ(t) > 0 are deterministic functions of t, and where JB and JC are two independent point processes that jump from zero to one on default

of B and C, respectively. This assumption implies that we can hedge
using the bonds PB and PC alone, but we will discuss how to relax
this later. We further stress that the spot asset price S is assumed to
be unaffected by a default of party B or C. For the remainder of this
chapter, we will refer to B as the seller and C as the counterparty,
respectively.
Now assume that the parties B and C enter a derivative on the
spot asset that pays the seller B the amount H(S ) ∈ R at maturity T.
Thus, in our convention, the payout scenario H(S) ≥ 0 means that
the seller receives cash or an asset from the counterparty. The value
of this derivative to the seller at time t is denoted V̂(t, S, JB, JC) and
depends on the spot S of the underlying and the default states JB
and JC of the seller B and counterparty C. Analogously, we let V(t, S)
denote the value to the seller of the same derivative if it were a
transaction between two default-free parties.
When party B or C defaults, in general, the mark-to-market value
of the derivative determines the close-out or claim on the position.
However, the precise nature of this depends on the contractual
details and the mechanism by which the mark-to-market is determined. The 2002 ISDA Master Agreement specifies that the
derivative contract will return to the surviving party the recovery
value of its positive mark-to-market value (from the view of the
surviving party) just prior to default, whereas the full mark-to-
market has to be paid to the defaulting party if the mark-to-market
value is negative (from the view of the surviving party). The Master
Agreement specifies a dealer poll mechanism to establish the mark-
to-market to the seller M(t, S) at default, without referring to the
names of the counterparties involved in the derivative transaction.
In this case, one would expect M(t, S) to be close to V(t, S), even
although it is unclear whether the dealers in the poll may or may
not include their funding costs in the derivatives price. In other
cases, not following the Master Agreement, there may be other
mechanisms described. Hence, we will derive the PDE for the
general case M(t, S) and consider the two special cases where M(t,
S) = V̂(t, S, 0, 0) and M(t, S) = V(t, S) in later sections. Let RB ∈ [0, 1] and RC ∈ [0, 1] denote the recovery rates on the derivative positions
of parties B and C, respectively. In this chapter, we take them to be
deterministic. From the above discussion it follows that we have
the following boundary conditions:

300

18 Burgard PCQF.indd 300 11/03/2013 10:17


partial differential equation representations of derivatives
with bilateral counterparty risk and funding costs

$$\hat{V}(t,S,1,0) = M^+(t,S) + R_B M^-(t,S) \quad \text{(seller defaults first)}$$
$$\hat{V}(t,S,0,1) = R_C M^+(t,S) + M^-(t,S) \quad \text{(counterparty defaults first)} \quad (18.5)$$

Gregory (2009), Li and Tang (2007) and the vast majority of papers
on the valuation of counterparty risk use M(t, S) = V(t, S).
As in the usual Black–Scholes framework, we hedge the derivative with a self-financing portfolio that covers all the underlying risk factors of the model. In our case, the portfolio Π that the seller sets up consists of δ(t) units of S, αB(t) units of PB, αC(t) units of PC and β(t) units of cash, such that the portfolio value at time t hedges out the value of the derivative contract to the seller, ie, V̂(t) + Π(t) = 0. Thus:

$$-\hat{V}(t) = \Pi(t) = \delta(t)S(t) + \alpha_B(t)P_B(t) + \alpha_C(t)P_C(t) + \beta(t)$$

Before proceeding, we note that when V̂ ≥ 0 the seller will incur a loss at counterparty default. To hedge this loss, PC needs to be shorted, so we expect that αC ≤ 0. Assuming that the seller can borrow the bond PC close to the risk-free rate r through a repurchase agreement, the spread λC between the rate rC on the bond and the cost of financing the hedge position in C can be approximated to λC = rC − r. Since we defined PC to be a bond with zero recovery, this spread corresponds to the default intensity of C.
On the other hand, if V̂ ≤ 0, the seller will gain at its own default, which can be hedged by buying back PB bonds, so we expect that αB ≥ 0 in this case. For this to work, we need to ensure that enough cash
is generated and that any remaining cash (after purchase of PB) is
invested in a way that does not generate additional credit risk for
the seller; ie, any remaining positive cash generates a yield at the
risk-free rate r.
Imposing that the portfolio Π(t) is self-financing implies that:

$$-d\hat{V}(t) = \delta(t)\,dS(t) + \alpha_B(t)\,dP_B(t) + \alpha_C(t)\,dP_C(t) + d\bar{\beta}(t) \quad (18.6)$$

where the growth in cash1 dβ̄ may be decomposed as dβ̄(t) = dβ̄S(t) + dβ̄F(t) + dβ̄C(t) with the following.

❑❑ dβ̄S(t): the share position provides a dividend income of δ(t)γS(t)S(t) dt and a financing cost of −δ(t)qS(t)S(t) dt, so dβ̄S(t) = δ(t)(γS(t) − qS(t))S(t) dt. Here the value of qS(t) depends on the risk-free rate r(t) and the repo rate of S(t).

❑❑ dβ̄F(t): from the above analysis, any surplus cash held by the seller after the own bonds have been purchased must earn the risk-free rate r(t) in order not to introduce any further credit risk for the seller. If borrowing money, the seller needs to pay the rate rF(t). For this rate, we distinguish two cases. Where the derivative itself can be used as collateral for the required funding, we assume no haircut and set rF(t) = r(t). If, however, the derivative cannot be used as collateral, we set the funding rate to the yield of the unsecured seller bond with recovery RB, ie, rF(t) = r(t) + (1 − RB)λB. In practice, the latter case is often the more realistic one. Keeping rF general for now, we have:

$$d\bar{\beta}_F(t) = \left\{r(t)\left(-\hat{V} - \alpha_B P_B\right)^+ + r_F(t)\left(-\hat{V} - \alpha_B P_B\right)^-\right\}dt$$
$$= r(t)\left(-\hat{V} - \alpha_B P_B\right)dt + s_F\left(-\hat{V} - \alpha_B P_B\right)^- dt \quad (18.7)$$
where the funding spread sF ≡ rF − r, ie, sF = 0 if the derivative can be used as collateral, and sF = (1 − RB)λB if it cannot.
❑❑ dβ̄C(t): by the arguments above, the seller will short the counterparty bond through a repurchase agreement and incur financing costs of dβ̄C(t) = −αC(t)r(t)PC(t) dt if we assume a zero haircut.

For the remainder of this chapter we will drop the t from our notation, where applicable, to improve clarity. From the above analysis it follows that the change in the cash account is given by:

$$d\bar{\beta} = \delta(\gamma_S - q_S)S\,dt + \left\{r\left(-\hat{V} - \alpha_B P_B\right) + s_F\left(-\hat{V} - \alpha_B P_B\right)^-\right\}dt - r\,\alpha_C P_C\,dt \quad (18.8)$$
so 18.6 becomes:

$$-d\hat{V} = \delta\,dS + \alpha_B\,dP_B + \alpha_C\,dP_C + d\bar{\beta}$$
$$= \delta\,dS + \alpha_B P_B\left(r_B\,dt - dJ_B\right) + \alpha_C P_C\left(r_C\,dt - dJ_C\right)$$
$$\quad + \left\{r\left(-\hat{V} - \alpha_B P_B\right) + s_F\left(-\hat{V} - \alpha_B P_B\right)^- - \alpha_C r P_C - \delta\left(q_S - \gamma_S\right)S\right\}dt$$
$$= \left\{-r\hat{V} + s_F\left(-\hat{V} - \alpha_B P_B\right)^- + \left(\gamma_S - q_S\right)\delta S + \left(r_B - r\right)\alpha_B P_B + \left(r_C - r\right)\alpha_C P_C\right\}dt$$
$$\quad - \alpha_B P_B\,dJ_B - \alpha_C P_C\,dJ_C + \delta\,dS \quad (18.9)$$

On the other hand, by Ito’s lemma for jump diffusions and our
assumption that a simultaneous jump is a zero probability event,
the derivative value moves by:

$$d\hat{V} = \partial_t \hat{V}\,dt + \partial_S \hat{V}\,dS + \tfrac{1}{2}\sigma^2 S^2 \partial_S^2 \hat{V}\,dt + \Delta\hat{V}_B\,dJ_B + \Delta\hat{V}_C\,dJ_C \quad (18.10)$$

where:

$$\Delta\hat{V}_B = \hat{V}(t,S,1,0) - \hat{V}(t,S,0,0) \quad (18.11)$$
$$\Delta\hat{V}_C = \hat{V}(t,S,0,1) - \hat{V}(t,S,0,0) \quad (18.12)$$

which can be computed from the boundary condition 18.5.


Replacing dV̂ in 18.9 by 18.10 shows that we can eliminate all risks in the portfolio by choosing δ, αB and αC as:

$$\delta = -\partial_S \hat{V} \quad (18.13)$$

$$\alpha_B = \frac{\Delta\hat{V}_B}{P_B} = -\frac{\hat{V} - \left(M^+ + R_B M^-\right)}{P_B} \quad (18.14)$$

$$\alpha_C = \frac{\Delta\hat{V}_C}{P_C} = -\frac{\hat{V} - \left(M^- + R_C M^+\right)}{P_C} \quad (18.15)$$

Hence, the cash account evolution 18.7 can be written as:

$$d\bar{\beta}_F(t) = \left\{-r(t)R_B M^- - r_F(t)M^+\right\}dt \quad (18.16)$$

so the amount of cash deposited by the seller at the risk-free rate equals −RB M− and the amount borrowed at the funding rate rF equals −M+.
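In code, the hedge positions 18.14–18.15 and the cash-account split of 18.16 take only a few lines; the sketch below is for a generic mark-to-market M, with all inputs hypothetical.

```python
def hedge_positions(v_hat, m, p_b, p_c, recovery_b, recovery_c):
    """Equations 18.14-18.15: alpha_B (own bonds bought back) and
    alpha_C (counterparty bonds shorted) for mark-to-market M."""
    m_pos, m_neg = max(m, 0.0), min(m, 0.0)
    alpha_b = -(v_hat - (m_pos + recovery_b * m_neg)) / p_b
    alpha_c = -(v_hat - (m_neg + recovery_c * m_pos)) / p_c
    # Equation 18.16: -RB * M- is deposited at the risk-free rate,
    # and -M+ is the balance funded at the rate rF
    return alpha_b, alpha_c, -recovery_b * m_neg, -m_pos
```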
If we now introduce the parabolic differential operator At as:

$$A_t V \equiv \tfrac{1}{2}\sigma^2 S^2 \partial_S^2 V + \left(q_S - \gamma_S\right)S\,\partial_S V \quad (18.17)$$

then it follows that V̂ is the solution of the PDE:


$$\partial_t \hat{V} + A_t \hat{V} - r\hat{V} = s_F\left(\hat{V} + \Delta\hat{V}_B\right)^+ - \lambda_B \Delta\hat{V}_B - \lambda_C \Delta\hat{V}_C$$
$$\hat{V}(T,S) = H(S) \quad (18.18)$$

where λB ≡ rB − r and λC ≡ rC − r. Inserting 18.11 and 18.12 with boundary condition 18.5 into 18.18 finally gives:

∂t V̂ + AtV̂ − rV̂ = ( λB + λC ) V̂ + sF M + ⎫⎪


⎬
−λB ( RB M + M ) − λC ( RC M + M )⎪⎭
− + + −

V̂ (T,S) = H (S)  (18.19)

where we have used (V̂ + ΔV̂B)+ = (M+ + RB M−)+ = M+.


In contrast, the risk-free value V satisfies the regular Black–
Scholes PDE:
$$\partial_t V + A_t V - rV = 0, \qquad V(T,S) = H(S) \quad (18.20)$$

so if we interpret λB and λC as effective default rates, then the differences between 18.19 and 18.20 may be interpreted as follows.

1. The first term on the right-hand side of 18.19 is the additional growth rate that seller B requires on the risky asset V̂ to compensate for the risk that default of either the seller or the counterparty will terminate the derivative contract.
2. The second term is the additional funding cost for negative
values of the cash account of the hedging strategy.
3. The third term is the adjustment in growth rate that the seller can
accept because of the cashflow occurring at own default.
4. The fourth term is the adjustment in growth rate that the seller
can accept because of the cashflow occurring at counterparty
default.

The first, third and fourth terms are related to counterparty risk,
whereas the second term represents the funding cost. From this
interpretation, it follows that the PDE for a so-called extinguisher
trade, whereby it is agreed that no party gets anything at default, is
obtained by removing terms three and four from the PDE 18.19.
In the subsequent sections we will examine the PDE 18.19 in the
following four cases:

1. M(t, S) = V̂(t, S, 0, 0) and rF = r;
2. M(t, S) = V̂(t, S, 0, 0) and rF = r + sF;
3. M(t, S) = V(t, S) and rF = r;
4. M(t, S) = V(t, S) and rF = r + sF.

Because either the value M at default or the funding rate rF differs between these four cases, we expect the total derivative value V̂ to differ as well.

USING V̂(t, S) AS THE MARK-TO-MARKET AT DEFAULT


Let us consider the case where the payments in case of default are based on V̂ so that M(t, S) = V̂(t, S) in the boundary condition 18.5. Conceptually, this is the simpler case, since if the defaulting party is in-the-money with respect to the derivative contract, then there is no additional effect on the profit and loss at the point of default. Similarly, if the surviving party is in-the-money with respect to the derivative contract, then its loss is simply (1 − R)V̂. In this case, the PDE 18.19 simplifies to:
∂t V̂ + AtV̂ − rV̂ = (1− RB ) λBV̂ − + (1− RC ) λCV̂ + + sFV̂ + ⎪⎫
⎬
V̂ (T,S) = H (S) ⎪⎭  (18.21)

where we recall that sF = 0 if the derivative can be posted as collateral, and sF = (1 − RB)λB if it cannot. Moreover, the hedge ratios αB and αC are given by:

$$\alpha_B = -\frac{(1-R_B)\hat{V}^-}{P_B} \quad (18.22)$$
$$\alpha_C = -\frac{(1-R_C)\hat{V}^+}{P_C} \quad (18.23)$$

so αB ≥ 0 and αC ≤ 0, and the replication strategy generates enough cash (−V̂−) for the seller to purchase back its own bonds.2
In the counterparty risk literature, it is customary to write V̂ = V + U, where U is called the CVA. Inserting this decomposition into 18.21 and using the fact that V satisfies 18.20 yields:
$$\partial_t U + A_t U - rU = (1-R_B)\lambda_B(V+U)^- + (1-R_C)\lambda_C(V+U)^+ + s_F(V+U)^+$$
$$U(T,S) = 0 \quad (18.24)$$

where V is known and acts as a source term. Furthermore, we may formally apply the Feynman–Kac theorem to 18.24, which, with the assumption of deterministic rates, gives us the following nonlinear integral equation:

$$U(t,S) = -(1-R_B)\int_t^T \lambda_B(u)\,D_r(t,u)\,E_t\!\left[\left(V(u,S(u)) + U(u,S(u))\right)^-\right]du$$
$$\qquad -(1-R_C)\int_t^T \lambda_C(u)\,D_r(t,u)\,E_t\!\left[\left(V(u,S(u)) + U(u,S(u))\right)^+\right]du$$
$$\qquad -\int_t^T s_F(u)\,D_r(t,u)\,E_t\!\left[\left(V(u,S(u)) + U(u,S(u))\right)^+\right]du \quad (18.25)$$

It follows that we can compute U by first computing V and then solving either the nonlinear PDE 18.24 or the integral equation 18.25.
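As an illustration of the second route, the sketch below solves 18.25 by Picard (fixed-point) iteration in a deliberately trivial setting with no market risk: a deterministic profile V(u) = V0 e^{ru} (so that V(t) = Dr(t, u)Et[V(u, S(u))] holds by construction), constant rates and sF = (1 − RB)λB, all values hypothetical. The converged U(0) can be checked against the closed form 18.40 derived later.

```python
import numpy as np

r, lam_b, lam_c, rec_b, rec_c = 0.02, 0.03, 0.02, 0.4, 0.4
v0, T, n = 10.0, 5.0, 400
s_f = (1.0 - rec_b) * lam_b
t = np.linspace(0.0, T, n + 1)
v = v0 * np.exp(r * t)                 # deterministic risk-free value
u = np.zeros(n + 1)                    # initial guess U = 0

def trap(y, x):                        # simple trapezoidal quadrature
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

for _ in range(25):                    # Picard iteration on 18.25
    w = v + u
    f = ((1.0 - rec_b) * lam_b * np.minimum(w, 0.0)
         + ((1.0 - rec_c) * lam_c + s_f) * np.maximum(w, 0.0))
    u = np.array([-trap(np.exp(-r * (t[i:] - t[i])) * f[i:], t[i:])
                  for i in range(n + 1)])

k = (1.0 - rec_b) * lam_b + (1.0 - rec_c) * lam_c
print(u[0], -v0 * (1.0 - np.exp(-k * T)))  # iteration vs closed form 18.40
```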
Before proceeding with a study of the two cases sF = 0 and sF = (1 − RB)λB, it is worthwhile to study a few examples, namely, where V
corresponds to bonds of the seller or the counterparty, where those
bonds are either without or with recovery.

The seller sells PB to the counterparty


The first case we consider is a risky, recoveryless bond sold by the seller B to the counterparty C. In this case, we have V̂ = V̂− = −PB and RB = 0. Since we consider deterministic rates and credit spreads, we do not have any risk with respect to the underlying market factors and the term At V̂ vanishes, so that 18.21 becomes:
$$\partial_t \hat{V} = (r + \lambda_B)\hat{V} = r_B \hat{V}, \qquad \hat{V}(T,S) = -1 \quad (18.26)$$

with the solution:

$$\hat{V}(t) = -\exp\left(-\int_t^T r_B(s)\,ds\right) \quad (18.27)$$

as expected for V̂ = −PB(t).


If, on the other hand, we consider the bond PBì that has recovery
RB, then 18.21 becomes:
$$\partial_t \hat{V} = \left\{r + (1-R_B)\lambda_B\right\}\hat{V}, \qquad \hat{V}(T,S) = -1 \quad (18.28)$$

with the solution:

$$\hat{V}(t) = -\exp\left(-\int_t^T \left\{r(s) + (1-R_B)\lambda_B(s)\right\}ds\right) \quad (18.29)$$

As expected, the rate r + (1 − RB)λB payable on the bond with recovery is equal to the unsecured funding rate rF that the seller has to pay on negative cash balances when the derivative cannot be posted as collateral.
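These closed forms are easy to verify numerically. A quick sanity check with hypothetical constant rates: integrating the ODE backward in time from V̂(T) = −1 reproduces the discount factors of 18.27 and 18.29.

```python
import numpy as np

# Backward integration of d V-hat/dt = g * V-hat with V-hat(T) = -1,
# for the two growth rates appearing in Equations 18.26 and 18.28
r, lam_b, rec_b, T, n = 0.03, 0.02, 0.4, 5.0, 5000
dt = T / n
for g, label in [(r + lam_b, "zero recovery (18.27)"),
                 (r + (1.0 - rec_b) * lam_b, "recovery RB (18.29)")]:
    v = -1.0
    for _ in range(n):
        v /= 1.0 + g * dt             # one implicit step back in time
    print(label, v, -np.exp(-g * T))  # numerical vs closed form
```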

The seller purchases PC from C


If, on the other hand, V̂ describes the purchase of the bond PC by the seller from the counterparty (ie, the seller lends to the counterparty without recovery), then V̂ = V̂+ = PC and RC = 0, and Equation 18.21 becomes:

$$\partial_t \hat{V} = (r_F + \lambda_C)\hat{V} = \left(r_F + (r_C - r)\right)\hat{V}, \qquad \hat{V}(T,S) = 1 \quad (18.30)$$

In this case, if the seller can use the derivative (ie, the loan asset) as
collateral for the funding of its short cash position within its replica-
tion strategy, then (neglecting haircuts) we have rF = r, the risk-free
rate. The net result in this case is then:

$$\partial_t \hat{V} = r_C \hat{V} \quad (18.31)$$

$$\hat{V}(t) = \exp\left(-\int_t^T r_C(s)\,ds\right) \quad (18.32)$$

as expected for V̂(t) = PC(t). If, on the other hand, we consider a bond P̄C with recovery RC, then we find:

$$\hat{V}(t) = \exp\left(-\int_t^T \left\{r(s) + (1-R_C)\lambda_C(s)\right\}ds\right) \quad (18.33)$$

as expected.

The case where rF = r


If the derivative can be posted as collateral, the PDE 18.21 becomes:

∂t V̂ + AtV̂ − rV̂ = (1− RB ) λBV − + (1− RC ) λCV̂ + ⎫⎪


⎬
V̂ (T,S) = H (S) ⎭⎪  (18.34)

which is a nonlinear PDE that needs to be solved numerically unless V̂ ≥ 0 or V̂ ≤ 0.
Assuming that V̂ ≤ 0 (ie, the seller sold an option to the counterparty, so H(S) ≤ 0) and that all rates are deterministic, the Feynman–Kac representation of V̂ is given by:

V̂ (t,S) = E t ⎡⎣Dr+(1−RB )λB (t,T ) H (S (T ))⎤⎦  (18.35)

where:
$$D_k(t,T) \equiv \exp\left(-\int_t^T k(s)\,ds\right)$$
is the discount factor over [t, T] given the rate k.
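For instance, for a sold European call, H(S(T)) = −(S(T) − K)+ ≤ 0, and with constant coefficients and (for simplicity) qS = r and γS = 0, Equation 18.35 reduces to the Black–Scholes price discounted at the extra rate (1 − RB)λB. A sketch with hypothetical parameters:

```python
from math import exp, log, sqrt
from scipy.stats import norm

def sold_call_risky(s, k, tau, r, sigma, lam_b, rec_b):
    """Equation 18.35 for a sold call, assuming q_S = r and gamma_S = 0:
    V-hat(t, S) = -exp(-(1 - RB) lambda_B tau) * (Black-Scholes call)."""
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    call = s * norm.cdf(d1) - k * exp(-r * tau) * norm.cdf(d2)
    return -exp(-(1.0 - rec_b) * lam_b * tau) * call

print(sold_call_risky(100.0, 100.0, 5.0, 0.03, 0.2, 0.02, 0.4))
```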
Alternatively, if, for V̂ ≤ 0, we insert the ansatz3 V̂ = V + U0 into 18.34, apply the Feynman–Kac theorem and finally use that V(t, S) = Dr(t, u)Et[V(u, S(u))], then we get:

$$U_0(t,S) = -V(t,S)\left\{\int_t^T (1-R_B)\lambda_B(u)\,D_{(1-R_B)\lambda_B}(t,u)\,du\right\} \quad (18.36)$$

When V̂ ≥ 0, ie, the seller buys an option, symmetry yields that:

$$U_0(t,S) = -V(t,S)\left\{\int_t^T (1-R_C)\lambda_C(u)\,D_{(1-R_C)\lambda_C}(t,u)\,du\right\} \quad (18.37)$$

We conclude by noting that if V̂ ≤ 0, then U0 depends only on the credit of the seller, whereas if V̂ ≥ 0, then it depends only on the credit of the counterparty.

The case where rF = r + (1 − RB)λB


If the derivative cannot be posted as collateral, the PDE 18.21
becomes:

$$\partial_t \hat{V} + A_t \hat{V} - r\hat{V} = (1-R_B)\lambda_B \hat{V}^- + \left\{(1-R_B)\lambda_B + (1-R_C)\lambda_C\right\}\hat{V}^+$$
$$\hat{V}(T,S) = H(S) \quad (18.38)$$

which, again, is a nonlinear PDE.
If V̂ ≤ 0, we write V̂ = V + U, and it is easy to see that U = U0 as given in 18.36, so V̂ is given by 18.35. If V̂ ≥ 0, we have that:

V̂ (t,S) = E t ⎡⎣Dr+k (t,T ) H (S (T ))⎤⎦  (18.39)

with k ≡ (1 − RB)λB + (1 − RC)λC. Analogously to the case rF = r, we can make the ansatz V̂ = V + U and show that:

$$U(t,S) = -V(t,S)\left\{\int_t^T k(u)\,D_k(t,u)\,du\right\} \quad (18.40)$$

Comparing 18.37 and 18.40 shows that, when the seller buys an option from the counterparty, it encounters an additional funding spread sF = (1 − RB)λB.

USING V(t, S) AS THE MARK-TO-MARKET AT DEFAULT


We will now consider the case where payments in case of default
are based on V and hence use M(t, S) = V(t, S) in the boundary
condition 18.5. Equation 18.19 then becomes:

$$\partial_t \hat{V} + A_t \hat{V} - (r + \lambda_B + \lambda_C)\hat{V} = -(R_B\lambda_B + \lambda_C)V^- - (\lambda_B + R_C\lambda_C)V^+ + s_F V^+$$
$$\hat{V}(T,S) = H(S) \quad (18.41)$$

The PDE 18.41 is linear and has a source term on the right-hand side. If we write V̂ = V + U, then the hedge ratios become:

$$\alpha_B = -\frac{U + (1-R_B)V^-}{P_B} \quad (18.42)$$

$$\alpha_C = -\frac{U + (1-R_C)V^+}{P_C} \quad (18.43)$$

Comparing 18.22–18.23 and 18.42–18.43 shows that, in the latter case, a default
triggers a windfall cashflow of U that needs to be taken into account
in the hedging strategy.
Writing V̂ = U + V also gives us the following linear PDE for U:

$$\partial_t U + A_t U - (r + \lambda_B + \lambda_C)U = (1-R_B)\lambda_B V^- + (1-R_C)\lambda_C V^+ + s_F V^+$$
$$U(T,S) = 0 \quad (18.44)$$

so again applying the Feynman–Kac theorem yields:

$$U(t,S) = -(1-R_B)\int_t^T \lambda_B(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^-(u,S(u))\right]du$$
$$\qquad -(1-R_C)\int_t^T \lambda_C(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^+(u,S(u))\right]du$$
$$\qquad -\int_t^T s_F(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^+(u,S(u))\right]du \quad (18.45)$$
t

The CVA U can be calculated by using V(t, S) as a known source term when solving the PDE 18.44 or computing the integrals 18.45. In the case where we can use the derivative as collateral for the funding of our cash account, ie, sF = 0, the last term of 18.45 vanishes and the equation reduces to the regular bilateral CVA derived in many of the works cited in the first section (see, for example, Gregory (2009)). In this case the bilateral benefit does not come from any gains at own default, but from being able to use the cash generated by the hedging strategy to buy back own bonds, thereby generating an excess return of (1 − RB)λB. We denote this CVA when M = V and sF = 0 by Ū0.
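The sketch below evaluates the integrals 18.45 by trapezoidal quadrature under constant, hypothetical rates; the expected exposure profiles Et[V−(u, S(u))] and Et[V+(u, S(u))] are assumed to have been precomputed on the grid, for example by Monte Carlo.

```python
import numpy as np

def cva_18_45(t, ev_neg, ev_pos, r, lam_b, lam_c, s_f, rec_b=0.4, rec_c=0.4):
    """Trapezoidal evaluation of Equation 18.45 with constant rates, so
    D_{r+lam_B+lam_C}(t, u) = exp(-(r + lam_b + lam_c) * (u - t))."""
    disc = np.exp(-(r + lam_b + lam_c) * (t - t[0]))
    def trap(y):
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))
    return -((1.0 - rec_b) * lam_b * trap(disc * ev_neg)
             + (1.0 - rec_c) * lam_c * trap(disc * ev_pos)
             + s_f * trap(disc * ev_pos))

# Toy symmetric exposure profiles on a five-year grid (hypothetical)
t = np.linspace(0.0, 5.0, 21)
ev_pos = 8.0 * np.sqrt(t) * (1.0 - t / 5.0)
ev_neg = -ev_pos
print(cva_18_45(t, ev_neg, ev_pos, 0.02, 0.03, 0.02, 0.018))
```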
In practice, however, we cannot normally use the derivative as collateral, and Equation 18.45 gives us a consistent adjustment of the derivatives prices for bilateral counterparty risk and funding costs. In the specific case where the funding spread corresponds to that of the unsecured B bond (with recovery RB), ie, sF = (1 − RB)λB, we may merge the first and third terms of 18.45 and rewrite Ū as:
$$\bar{U}(t,S) = -(1-R_B)\int_t^T \lambda_B(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V(u,S(u))\right]du$$
$$\qquad -(1-R_C)\int_t^T \lambda_C(u)\,D_{r+\lambda_B+\lambda_C}(t,u)\,E_t\!\left[V^+(u,S(u))\right]du \quad (18.46)$$

The first term of 18.46 now not only contains the bilateral asset described above, but also the funding liability arising from the fact that the higher rate rF = r + (1 − RB)λB is paid when borrowing to fund the hedging strategy's cash account.

EXAMPLES
In this section we calculate the total derivative value V̂ for a call option bought by the seller in the following four cases:

Case 1: M = V̂, sF = 0
Case 2: M = V̂, sF = (1 − RB)λB
Case 3: M = V, sF = 0
Case 4: M = V, sF = (1 − RB)λB

A bought call is a one-sided trade that satisfies V ≥ 0 and V̂ ≥ 0, and if, furthermore, we assume constant rates, the CVAs U0, U, Ū0 and Ū from the previous two sections simplify to:

$$\text{Case 1:}\quad U_0(t,S) = -\left(1 - e^{-(1-R_C)\lambda_C(T-t)}\right)V(t,S)$$
$$\text{Case 2:}\quad U(t,S) = -\left(1 - e^{-\left\{(1-R_B)\lambda_B + (1-R_C)\lambda_C\right\}(T-t)}\right)V(t,S)$$
$$\text{Case 3:}\quad \bar{U}_0(t,S) = -\frac{(1-R_C)\lambda_C\left(1 - e^{-(\lambda_B+\lambda_C)(T-t)}\right)}{\lambda_B + \lambda_C}\,V(t,S)$$
$$\text{Case 4:}\quad \bar{U}(t,S) = -\frac{\left\{(1-R_B)\lambda_B + (1-R_C)\lambda_C\right\}\left(1 - e^{-(\lambda_B+\lambda_C)(T-t)}\right)}{\lambda_B + \lambda_C}\,V(t,S)$$

The results are shown in Figures 18.1–18.3. Since the four CVAs above are linear in V, we have chosen to display their magnitude as a percentage of V. All CVAs are negative, since the seller faces counterparty risk (and funding costs when sF > 0), but does not have any bilateral asset because of the one-sidedness of the option payout. From these results we see that the effect of the funding cost is significantly larger than that of choosing M = V̂ or M = V for a bought option. For a sold option, the funding cost does not have any effect.
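The four closed forms are reproduced by the short script below (a sketch; the parameter slice matches Figure 18.2):

```python
from math import exp

def cva_fractions(lam_b, lam_c, T, rec_b=0.4, rec_c=0.4):
    """|CVA|/V for Cases 1-4 (bought call, constant rates, lam_b+lam_c>0)."""
    k = (1.0 - rec_b) * lam_b + (1.0 - rec_c) * lam_c
    w = (1.0 - exp(-(lam_b + lam_c) * T)) / (lam_b + lam_c)
    return (1.0 - exp(-(1.0 - rec_c) * lam_c * T),   # Case 1
            1.0 - exp(-k * T),                       # Case 2
            (1.0 - rec_c) * lam_c * w,               # Case 3
            k * w)                                   # Case 4

# lambda_C = 2.5%, T - t = 5 years, RB = RC = 40%, as in Figure 18.2
for lam_b in (0.01, 0.03, 0.05):
    print(lam_b, ["%.1f%%" % (100.0 * x)
                  for x in cva_fractions(lam_b, 0.025, 5.0)])
```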

Figure 18.1  Credit valuation adjustment relative to V, when M = V̂ and M = V, for different values of λB when λC = 0 and T − t = 5 years. [Plot: CVA (% of option value V) against seller hazard rate (% per annum, 0–5%); four curves: M = V̂, sF = 0; M = V̂, sF > 0; M = V, sF = 0; M = V, sF > 0. Other parameters: RB = 40%, RC = 40%]

CONCLUSION AND POSSIBLE EXTENSIONS


The results in this chapter extend the standard CVAs encountered
in the literature by taking all funding costs associated with the
hedging strategy into account. Since the seller’s funding costs and
own credit are intimately related, this results in a consistent treat-
ment of bilateral credit and funding costs in the bilateral CVA
calculation. A numerical example of a bought call option shows that
taking funding into account is relevant and may result in the CVA
being up to 100% higher than if funding is assumed to occur at the
risk-free rate. We believe that the results presented here are particu-
larly relevant when pricing interest rate swaps and vanilla options
since these markets are very liquid, and having an analytical model
that does not fully take into account all costs may consume all
profits from a deal given high funding and credit spreads.
Although we worked within a simple one-derivative, one-asset
Black–Scholes framework, the results can be immediately extended
as follows.

Derivatives with more general payments than H(S(T))


These derivatives could be Asian options or interest rate swaps.

Figure 18.2  Credit valuation adjustment relative to V, when M = V̂ and M = V, for different values of λB when λC = 2.5% and T − t = 5 years. [Plot: CVA (% of option value V) against seller hazard rate (% per annum, 0–5%); four curves as in Figure 18.1. Other parameters: RB = 40%, RC = 40%]

Netted portfolios with many trades


In this case, the values V̂ and V represent the net derivative portfolio value rather than the value of a single derivative.

Generalised multiasset diffusion dynamics for multiple underlyings


Here, the only restriction is that the asset-price stochastic differen-
tial equations satisfy technical conditions such that the option
pricing PDE (now multi-dimensional) admits a unique solution
given by the Feynman–Kac representation. Note that if the number
of assets exceeds two or three, it is computationally more efficient
to compute the CVA using Monte Carlo simulation combined with
numerical integration rather than solving the high-dimensional
PDE.

Stochastic interest rates


This is essential for interest rate derivatives, and the effect would be
that the discounting in the CVA formula would occur inside the
expectation operator.

Stochastic hazard rates


One way of introducing default time dependence and right-way/
wrong-way risk would be to make λB and λC stochastic and

Figure 18.3  Credit valuation adjustment relative to V, when M = V̂ and M = V, for different values of λB when λC = 5% and T − t = 5 years. [Plot: CVA (% of option value V) against seller hazard rate (% per annum, 0–5%); four curves as in Figure 18.1. Other parameters: RB = 40%, RC = 40%]

correlate them with each other and the other market factors. This
would simply imply that we would not move the discount factors
outside of the expectation operator in 18.25 and 18.45. Also, the
generator At would incorporate terms corresponding to the new
stochastic state variables.

Direct default time dependence


Another way of introducing default time dependence is by allowing simultaneous defaults. This could be done by letting J0, J1 and J2 be independent point processes and then setting JB = J0 + J1 and JC = J0 + J2. This approach is known as the Marshall–Olkin copula and would require some kind of basket default instrument for perfect replication. The hazard rates λ0, λ1 and λ2 of J0, J1 and J2 could be made stochastic, in which case we can model right-way and wrong-way risk as well.
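A minimal simulation sketch of this construction (intensities hypothetical): each default time is the minimum of independent exponential arrival times, and a simultaneous default occurs exactly when the common shock J0 arrives first.

```python
import numpy as np

# Marshall-Olkin default times: J_B = J_0 + J_1, J_C = J_0 + J_2
rng = np.random.default_rng(0)
lam0, lam1, lam2, n = 0.01, 0.02, 0.03, 200_000
e0 = rng.exponential(1.0 / lam0, n)   # common shock (J_0)
e1 = rng.exponential(1.0 / lam1, n)   # idiosyncratic shock of B (J_1)
e2 = rng.exponential(1.0 / lam2, n)   # idiosyncratic shock of C (J_2)
tau_b = np.minimum(e0, e1)            # default time of B
tau_c = np.minimum(e0, e2)            # default time of C
# fraction of simultaneous defaults; analytically lam0/(lam0+lam1+lam2)
print(np.mean(e0 < np.minimum(e1, e2)))
```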
Our results can also be readily extended to the case where a
collateral agreement is in place by following Piterbarg (2010) and
introducing a collateral account in the delta hedging strategy.

This chapter represents the views of the authors alone, and not the
views of Barclays Bank Plc. The authors would like to thank Tom
Hulme and Vladimir Piterbarg for useful comments and suggestions.

Various versions of this work have been presented and benefited from
discussions with participants at ICBI conferences in Rome (April
2009) and Paris (May 2010). This is a slightly revised version of the
paper that originally appeared in Volume 7, Issue 3 of The Journal of
Credit Risk in September 2011 (submitted November 2009).

  1 Note that this growth is the growth in the cash account before rebalancing of the portfolio. The self-financing condition ensures that, after dt, the rebalancing can happen at zero overall cost. The original version of this chapter used the notation dβ, suggesting the total change in the cash position. This notation has been corrected here. The authors are grateful to Brigo et al (2012) for pointing this out.
2 For the first term, the cash available to the seller is (−V̂−), of which a fraction of (1 − RB) is invested in buying back the recoveryless bond PB and the fraction RB is invested risk-free. This is equivalent to investing the total amount (−V̂−) into purchasing back a seller bond P̄B with recovery RB.
3 We use the zero subscript to indicate that the CVA U0 has been computed with a zero
funding spread sF.

REFERENCES

Alavian, S., J. Ding, P. Whitehead and L. Laiddicina, 2008, "Counterparty Valuation Adjustment (CVA)," working paper, Lehman Brothers (available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1310226).

Brigo, D. and F. Mercurio, 2006, Interest Rate Models: Theory and Practice (2e) (Berlin, Germany: Springer).

Brigo, D., C. Buescu, A. Pallavicini and Q. D. Liu, 2012, "Illustrating a Problem in the Self-financing Condition in Two 2010–2011 Papers on Funding, Collateral and Discounting," working paper, King's College London (available at http://ssrn.com/abstract=2103121).

Cesari, G., J. Aquilina, N. Charpillon, X. Filipovic, G. Lee and L. Manda, 2009, Modelling, Pricing and Hedging Counterparty Credit Exposure: A Technical Guide (New York, NY: Springer).

Gregory, J., 2009, "Being Two-faced Over Counterparty Risk," Risk, 22(2), pp 86–90.

Jarrow, R. and S. Turnbull, 1995, "Pricing Derivatives on Financial Securities Subject to Credit Risk," Journal of Finance, 50(1), pp 53–85.

Jarrow, R. and F. Yu, 2001, "Counterparty Risk and the Pricing of Defaultable Securities," Journal of Finance, 56(5), pp 1765–99.

Li, B. and Y. Tang, 2007, Quantitative Analysis, Derivatives Modeling, and Trading Strategies in the Presence of Counterparty Credit Risk for the Fixed-Income Market (Hackensack, NJ: World Scientific).

Piterbarg, V., 2010, "Funding Beyond Discounting: Collateral Agreements and Derivatives Pricing," Risk, February, pp 97–102.

Pykhtin, M. and S. Zhu, 2007, "A Guide to Modelling Counterparty Risk," GARP Risk Review, July/August, pp 16–22.

19
Close-out Convention Tensions
Damiano Brigo and Massimo Morini
Imperial College London and IMI Bank of Intesa Sanpaolo

When a default event happens to one of the counterparties in a deal, it is stopped and marked-to-market: the net present value (NPV) of
the residual part of the deal is calculated. The recovery rate is
applied to this close-out value to determine the default payment.
While modelling the recovery is known to be a difficult task, the
calculation of the close-out amount has never been the focus of
extensive research. Before the credit crunch, and actually up to the
Lehman Brothers default in 2008, the close-out amount was usually
calculated as the expectation of the future payments discounted
back to the default day by a Libor-based curve of discount factors.
However, things have become less trivial. We are aware that
discounting a deal that is default-free and backed by a liquid collat-
eral should be performed using a default-free curve of discount
factors, based on overnight quotations, whereas a deal that is not
collateralised and is thus subject to default risk should be discounted
taking liquidity costs into account and include a credit value adjust-
ment. NPV should be calculated in different ways even for equal
payouts, depending on the liquidity and credit conditions of the
deal.
The previous literature on counterparty risk assumes that when
default happens the close-out amount is calculated treating the
NPV of the residual deal as risk-free (risk-free close-out). This was
an obvious choice when one of the two parties, usually the bank,
was treated as default-free, based on its generally very superior
credit standing. Latterly, no counterparty can be considered risk-
free. In the case that a default happens, the surviving party can still
default before the maturity of the deal. In spite of this, even more

recent literature that assumes such a bilateral counterparty risk still adopts a risk-free close-out amount at default.
The legal (International Swaps and Derivatives Association)
documentation on the settlement of a default does not confirm this
assumption. Isda (2010) says: “Upon default close-out, valuations
will in many circumstances reflect the replacement cost of transac-
tions calculated at the terminating party’s bid or offer side of the
market, and will often take into account the creditworthiness of the
terminating party.” Analogously, Isda (2009) says that in deter-
mining a close-out amount the information used includes
“quotations (either firm or indicative) for replacement transactions
supplied by one or more third parties that may take into account
the creditworthiness of the determining party at the time the quota-
tion is provided”.
A real market counterparty replacing the defaulted one would
not neglect the creditworthiness of the surviving party. On the other
hand, there is no binding prescription – the Isda documentation
speaks of creditworthiness that is taken into account often, not
always, and that may, not must, be included. This leaves room for a
risk-free close-out, which is probably easier to calculate since it is
independent of the features of the surviving party. The counterparty
risk adjustments change strongly depending on which close-out
amount is considered. Also, the effects at the moment of default of a
company are very different under the two close-out conventions,
with some dramatic consequences on default contagion, as we
show in the following. These results should be considered carefully
by the financial community, and in particular by Isda, which can
give more certainty on this issue.

Risk-free versus replacement close-out: practical consequences
A risk-free close-out has implications that are very different from
what we are used to in the case of a default in standardised markets,
such as the bond or loan markets. If the owner of a bond defaults, or
if the lender in a loan defaults, this means no losses to the bond
issuer or to the loan borrower. But if the risk-free default close-out is
applied to a derivatives transaction when the net creditor party in a
derivative (thus in a position similar to a bond owner or loan lender)
defaults, the value of the net debtor’s liability will suddenly jump

up. This is because before default it is marked-to-market accounting for this default risk, while afterwards this is excluded under the
risk-free close-out and so the contract is more valuable.
This increase grows with the debtor’s credit spread, and it must
be paid upfront at default by the debtors to the liquidators of the
defaulted party. So, obviously net debtors will prefer a replacement
close-out, which does not imply this increase. Under a replacement
close-out, if one of the two parties in the deal has no future obliga-
tions, just like a bond or option holder, their default probability
does not influence the value of the deal at inception, consistently
with market practice for bonds and options.
On the other hand, the replacement close-out has shortcomings
opposite to the risk-free close-out. While protecting debtors, it can
in some situations penalise the creditors. Consider the case when
the defaulted entity is a company with high systemic impact, like
Lehman Brothers, such that when it defaults the credit spreads of
its counterparties are expected to jump high. Under a replacement
close-out this jump reduces the creditworthiness of the debtors and
therefore the market value of their liabilities. All the claims of the
liquidators towards the debtors of the defaulted company will be
deflated, and the low level of the recovery may again be a dramatic
surprise, but this time for the creditors of the defaulted company.

Unilateral and bilateral valuation adjustments


Consider two parties entering a deal with final maturity T, an
investor I and a counterparty C. Assume the deal’s discounted total
cashflows at time t, in the absence of default risk of either party, are
valued by I at Π_I(t, T). The analogous cashflows seen from C are denoted by Π_C(t, T) = −Π_I(t, T). In a “unilateral” situation where
only the counterparty risk of name C is considered, one can write
the value of the deal to either party including this counterparty risk.
This will be the value when this default does not occur before matu-
rity, plus a credit value adjustment (for I) or debit value adjustment
(for C) term consisting of the expected value at default plus terms
reflecting the recovery payments. From the point of view of C this is:

$$NPV_C^C(t) = \mathbb{E}_t\left\{\mathbf{1}_{\tau_C>T}\,\Pi_C(t,T)\right\} + \mathbb{E}_t\left\{\mathbf{1}_{t<\tau_C\le T}\left[\Pi_C(t,\tau_C) + D(t,\tau_C)\left(\left(NPV_C(\tau_C)\right)^+ - REC_C\left(-NPV_C(\tau_C)\right)^+\right)\right]\right\}$$

where REC and LGD = 1 – REC denote recoveries and loss given
defaults, D(t, T) is the discount factor between times t and T, and the
expected exposure NPVC(t) = Et[PC(t, T)] is the default risk-free value
of the residual deal at time t, seen by C. The corresponding formula
for I is the classical result from Brigo and Masetti (2005), and can be
seen by changing the lower index – which represents who is doing
the valuation – to I, and switching the order of the recovery and
positive exposure terms in the difference part, remembering that
the sign of the exposure changes.
In the situation where both I and C may default, we have a bilat-
eral valuation adjustment (see, for example, Brigo and Capponi,
2008, or Gregory, 2009, for the general framework, and Brigo,
Pallavicini and Papatheodorou, 2011, for the application to interest
rate instrument portfolios with netting and wrong-way risk). We
define τ¹ to be the first-to-default time, τ¹ = min(τ_I, τ_C).
Inclusion of bilateral default risk leads to the risk-free close-out
adjustment:

$$NPV_I^{Free}(t,T) = \mathbb{E}_t\left\{\mathbf{1}_{\tau^1>T}\,\Pi_I(t,T)\right\}$$
$$+\ \mathbb{E}_t\left\{\mathbf{1}_{\tau^1=\tau_C<T}\left[\Pi_I(t,\tau_C) + D(t,\tau_C)\left(REC_C\left(NPV_I(\tau_C)\right)^+ - \left(-NPV_I(\tau_C)\right)^+\right)\right]\right\}$$
$$+\ \mathbb{E}_t\left\{\mathbf{1}_{\tau^1=\tau_I<T}\left[\Pi_I(t,\tau_I) + D(t,\tau_I)\left(\left(-NPV_C(\tau_I)\right)^+ - REC_I\left(NPV_C(\tau_I)\right)^+\right)\right]\right\} \qquad (19.1)$$

where we use the risk-free NPV upon the first default to close the
deal, in keeping with a risk-free close-out. But, as we saw in the
introduction, this choice is not obvious in a bilateral setting because
the surviving party is not default risk-free, and even Isda docu-
mentation considers a replacement close-out taking into account
the credit quality of the surviving party. So, we consider the
substitutions:
$$NPV_I(\tau_C) \to NPV_I^I(\tau_C), \qquad NPV_C(\tau_I) \to NPV_C^C(\tau_I)$$

with the counterparties valuing the NPV accounting for the risk of
their own default, as denoted by the superscript. The final formula

for the adjustment under a replacement close-out is the same as Equation 19.1 but with this substitution. We denote the related NPV by NPV_I^{Repl}(t, T). For more details, see Brigo and Morini (2010) and Morini (2010).

A quantitative analysis and a numerical example


Here we choose quite simple payouts and modelling assumptions.
This is done to show the effects of the close-out conventions by
disentangling them from complex modelling and payout assump-
tions that would obscure patterns. We consider a simple T-maturity
call option on stock S, with the risk-free price for I, the option holder,
given by:
$$NPV_I(0, K, T) = P(0,T)\,\mathbb{E}_0\left[(S_T - K)^+\right]$$

where we assume deterministic interest rates and P(0, T) is the deterministic discount factor (risk-free bond price), and an even
simpler deal where C promises to pay an amount K to I at maturity
T. In this case the risk-free price to the bondholder I is:

$$NPV_I(0, K, T) = P(0,T)\,K$$

Note that in the above deals I is the option or bondholder, so it is the lender in the deal, with no further obligation after the payment of
the premium at inception, while C is in the position of the borrower,
the party that commits to execute payments at a future time. We
will often refer to I as the lender and to C as the borrower. The
second payout is an established market standard to compare the
consistency of each close-out assumption. Here the comparison
with a bond-style payout is interesting for a further reason: when
bilateral counterparty risk was introduced for derivatives, it was
pointed out in the market that this approach, involving a bank
including its own risk of default in valuation, already existed for
bonds through the fair-value option, and this analogy dominated
the discussion on its appropriateness.
We now introduce risk of default for both parties. If we consider
an underlying stock independent of the risk of default of the parties,
the above formulas for the risky price under the two possible close-
out assumptions reduce to:

$$NPV_I^{Repl}(0,K,T) = NPV_I(0,K,T)\left[Q(\tau_C > T) + REC_C\,Q(\tau_C \le T)\right]$$
$$NPV_I^{Free}(0,K,T) = NPV_I(0,K,T)\left[Q(\tau_C > T) + Q(\tau_I < \tau_C < T) + REC_C\,Q(\tau_C \le \min(\tau_I, T))\right]$$
$$= NPV_I(0,K,T)\left[Q(\tau_C > T) + REC_C\,Q(\tau_C \le T) + LGD_C\,Q(\tau_I < \tau_C < T)\right]$$

where Q is the risk-neutral probability measure. We see an important oddity of the risk-free close-out in this case. The adjusted price
of the bond or of the option depends on the credit risk of the lender
I (bondholder or option holder) if we use the risk-free close-out.
This is counterintuitive since the lender has no obligations in the
deal, and it is not consistent with market practice for loans or bonds.
From this point of view, the replacement close-out is preferable.
This bizarre dependence of the risk-free close-out price on the risk
of default of the party with no obligations in a deal can be properly
appreciated in the following numerical example, where we consider
the option-style payout with S0 = 2.5, K = 2 and a stock volatility equal
to 40% in a standard lognormal Black–Scholes framework. Set the
risk-free rate and the dividend yield at r = q = 3%, and consider a matu-
rity of five years. The price of an option varies with the default risk of
the option writer, as usual, and here also with the default risk of the

[Figure 19.1: Value of an option with risk-free close-out, as a percentage of the default-free price, plotted against the five-year spreads λ_C of the writer (borrower) and λ_I of the buyer (lender), each running from 0 to 100%.]

option holder, due to the risk-free close-out. In Figure 19.1, we show the price of the option for default intensities λ_I, λ_C going from zero to 100%. We consider REC_C = 0 so that the level of the intensity approximately coincides with the market credit default swap spread on the five-year maturity. We also assume that the defaults of the entities I and C are independent of each other.
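Under these assumptions (independent exponential default times, zero recovery), the surfaces of Figures 19.1 and 19.2 can be reproduced in closed form: the adjusted price is the Black–Scholes price times a close-out-dependent multiplier. The following minimal Python sketch is our own illustration, not part of the original text; parameter names and the grid of intensities are assumptions:

import numpy as np
from statistics import NormalDist

# Close-out comparison for the call example: S0 = 2.5, K = 2, sigma = 40%,
# r = q = 3%, T = 5y, REC_C = 0, independent exponential defaults.
N = NormalDist().cdf

def bs_call(s0, k, t, r, q, vol):
    d1 = (np.log(s0 / k) + (r - q + 0.5 * vol**2) * t) / (vol * np.sqrt(t))
    return np.exp(-q * t) * s0 * N(d1) - np.exp(-r * t) * k * N(d1 - vol * np.sqrt(t))

def q_lender_first(lam_i, lam_c, t):
    # Q(tau_I < tau_C < t) for independent exponential default times
    return (lam_i / (lam_i + lam_c)) * (1 - np.exp(-(lam_i + lam_c) * t)) \
           - np.exp(-lam_c * t) * (1 - np.exp(-lam_i * t))

v0, t = bs_call(2.5, 2.0, 5.0, 0.03, 0.03, 0.4), 5.0
for lam_c in (0.1, 0.5, 1.0):
    for lam_i in (0.0, 0.5, 1.0):
        repl = v0 * np.exp(-lam_c * t)                      # replacement close-out
        free = repl + v0 * q_lender_first(lam_i, lam_c, t)  # risk-free close-out
        print(f"lam_C={lam_c:.1f} lam_I={lam_i:.1f} Repl={repl:.4f} Free={free:.4f}")

Under the replacement close-out the buyer's own intensity λ_I drops out, while under the risk-free close-out the LGD_C Q(τ_I < τ_C < T) term makes the price grow with λ_I – exactly the pattern of Figure 19.1.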
We see that the effect of the holder’s risk of default is not negli-
gible, and is particularly decisive when the writer’s risk is high.
Similar patterns are shown for a bond payout in Brigo and Morini
(2010): with a risk-free close-out there is a strong effect of the default
risk of the bondholder, an effect that is higher the higher the risk of
default of the bond issuer. The results of Figure 19.1 can be compared
with those of Figure 19.2, where we apply the formula that assumes
a replacement close-out. This is the pattern one would expect from
standard financial principles: independence of the price of the deal
from the risk of default of the counterparty that has no future obli-
gations in the deal.
We can also consider a special case where, at first sight, the
picture appears different – when we assume maximal dependence
between the defaults. We assume the default of I and C to be
co-monotonic, and the spread of the lender I to be larger, so it
defaults first in all scenarios (for example, C is a subsidiary of I, or a
company whose well-being is completely driven by I: C is a tyre

[Figure 19.2: Value of an option with substitution close-out, as a percentage of the default-free price, against the same five-year spreads λ_C (writer/borrower) and λ_I (buyer/lender).]

[Figure 19.3: Loss for the borrower at default of the lender under risk-free close-out. On a timeline running to the lender's default time τ_Len, the borrower's liability is −Pr(τ_Bor > T)e^{−rT} under replacement close-out and −Pr(τ_Bor > T)e^{−rT} − Pr(τ_Len < τ_Bor < T)e^{−rT} under risk-free close-out; after the default it stays at −Pr_{τ_Len}(τ_Bor > T)e^{−r(T−τ_Len)} under replacement close-out but jumps to the full risk-free value −e^{−r(T−τ_Len)} under risk-free close-out.]

factory whose only client is car producer I). In this case, the two
formulas become:
$$NPV_I^{Repl}(0,K,T) = NPV_I(0,K,T)\left[Q(\tau_C > T) + REC_C\,Q(\tau_C \le T)\right]$$
$$NPV_I^{Free}(0,K,T) = NPV_I(0,K,T)\left[Q(\tau_C > T) + Q(\tau_C < T)\right] = NPV_I(0,K,T)$$

Now the results we obtain with a risk-free close-out appear somehow more logical. Either I does not default, and then C does
not default either, or when I defaults C is still solvent, and so I
recovers the whole payment. The credit risk of C should not affect
the deal. This happens with the risk-free close-out but not with the
replacement close-out. However, one may argue that this result is
obtained under a hypothesis that is totally unrealistic: the hypoth-
esis of perfect default dependency with heterogeneous deterministic
spreads (co-monotonicity), which can imply that company C will
go on paying its obligations, maybe for years, in spite of being
doomed to default at a fully predictable time. For a discussion on
the problems that can arise when assuming perfect default depend-
ency with deterministic spreads, see Brigo and Chourdakis (2009)
and Morini (2009) and (2011).
In an example such as the one described above, where the
borrower is so linked to the lender, the realistic scenario is that the
default of the borrower will not happen simultaneously with the
default of the lender, but before the settlement of the lender’s
default, so that the borrower will be in a default state and will pay

[Figure 19.4: Lower recovery for creditors under replacement close-out. In the same setting as Figure 19.3, when lender and borrower have strong links the borrower's spread jumps at τ_Len, so under replacement close-out the liability falls from −Pr(τ_Bor > T)e^{−rT} towards the value of a defaulted contract, reducing the recovery collected by the lender's liquidators; under risk-free close-out it instead jumps to −e^{−r(T−τ_Len)}.]

only a recovery fraction of the risk-free present value of the derivative. This makes the payout exactly the same as in a replacement
close-out, so this assumption also appears more logical in the
special case of co-monotonic companies. Standard formulas for
counterparty risk cannot capture this reality because they make the
simplification that default is settled exactly at default time, as
pointed out above.
We now analyse contagion issues. We write the price at a generic
time t < T, and then assume the lender defaults between t and t + Δt, t < τ_I < t + Δt, checking the consequences in both formulas:

$$NPV_I^{Repl}(t,T) = NPV_I(t,T)\left[Q_t(\tau_C > T) + REC_C\,Q_t(t < \tau_C \le T)\right]$$
$$NPV_I^{Free}(t,T) = NPV_I^{Repl}(t,T) + NPV_I(t,T)\,LGD_C\,Q_t(t < \tau_I < \tau_C < T) \qquad (19.2)$$

Here the subscript t on the probabilities means we are conditioning on the market information at time t. This conditioning will be
crucial in the co-monotonic case. Indeed, we focus on two cases:

❑❑ τ_I and τ_C are independent. In this case, the default event τ_I alters only one quantity: we move from:
$$Q_t(\tau_I < \tau_C < T) < Q_t(\tau_C < T)$$

to:

$$Q_{t+\Delta t}(\tau_I < \tau_C < T) = Q_{t+\Delta t}(\tau_C < T) \approx Q_t(\tau_C < T)$$

for small Δt, so that from NPV_I^{Free}(t, T) given in 19.2 we move to:

$$NPV_I^{Free}(t+\Delta t, T) = NPV_I(t+\Delta t, T)$$

whereas the replacement close-out price does not change.


❑❑ τ_I and τ_C are co-monotonic. Take an example where t < τ_I < t + Δt implies that t + Δt < τ_C < T. Then, using A → B with the meaning of “we go from A to B”, we have, with t < τ_I < t + Δt:


$$Q_t(\tau_C > T) > 0 \ \to\ Q_{t+\Delta t}(\tau_C > T) = 0$$
$$Q_t(\tau_C \le T) < 1 \ \to\ Q_{t+\Delta t}(\tau_C \le T) = 1$$
$$Q_t(\tau_I < \tau_C < T) < 1 \ \to\ Q_{t+\Delta t}(\tau_I < \tau_C < T) = 1$$

This means that from NPV_I^{Repl}(t, T) given in 19.2 we move to:

$$NPV_I^{Repl}(t+\Delta t, T) = REC_C\,NPV_I(t+\Delta t, T)$$

Under a risk-free close-out and independent defaults, a previously


risky derivative turns suddenly into a risk-free one at default of the
lender, suddenly raising the liability of the borrower. This jump will
be greater the higher the default risk of the borrower. As we said
above, it is a form of contagion that affects debtors of a defaulted
entity and adds to the standard contagion affecting creditors. Under
a replacement close-out we have no discontinuity and no contagion
of the debtors.
In the co-monotonic case, under a replacement close-out the
default of the lender sends the value of the contract to its minimum
value, the value of a defaulted contract. The borrower will see a
strong decrease of its liabilities to the lender. This is a positive fact
for debtors, but it is an increase of the contagion of the creditors of
the defaulted company, which will see the recovery reduced. This
does not happen in case of a risk-free close-out. This example is
under the extreme hypothesis of co-monotonicity, but in this case
the main conclusions do not hinge on the unrealistic elements of
this hypothesis. We can see it as the extreme of a realistic scenario:
the case when the defaulted company has a strong systemic impact,
leading the spreads of the counterparties to very high values,
deflating the liabilities of the debtors under a replacement close-
out. We cannot deny this is realistic: it is what we saw in the Lehman
case.
Let us now consider the loan/bond/deposit payout, with counterparty C (borrower) promising to pay K = 1 to I (lender). We start from the above r = 3% and maturity of five years, for a 1 billion notional. We suppose the borrower has a very low credit quality, as expressed by λ_C = 0.2, which means a probability to default before maturity of 63.2%, while λ_I = 0.04, which means a default probability of 18.1%. An analogous risk-free “bond” would have a price:

$$P(0, 5y) = 860.7 \text{ million}$$

while taking into account the default probability of the two parties,
which are assumed to be independent, we have:

$$NPV_I^{Free}(0, 5y) = 359.5 \text{ million}, \qquad NPV_I^{Repl}(0, 5y) = 316.6 \text{ million}$$

The difference of the two valuations is not negligible but not dramatic. More relevant is the difference in case of a default. We
have the following risk-adjusted probabilities of the occurrence of a
default event:

$$\min(5y, \tau_I, \tau_C) = \begin{cases} \tau_C & \text{with prob } 58\% \\ \tau_I & \text{with prob } 12\% \\ 5y & \text{with prob } 30\% \end{cases}$$

The two formulas disagree only when the lender defaults first. Let
us analyse in detail what happens in this case. Suppose the exact
day when default happens is τ_I = 2.5y. Just before default, at 2.5
years less one day, we have for the borrower C the following book
value of the liability produced by the above deal, depending on the
assumed close-out:
$$NPV_C^{Free}(\tau_I - 1d, 5y) = -578.9 \text{ million}$$
$$NPV_C^{Repl}(\tau_I - 1d, 5y) = -562.7 \text{ million}$$

Now default of the lender happens. In case of a risk-free close-out, the book value of the bond becomes simply the value of a risk-free
bond:

$$NPV_C^{Free}(\tau_I + 1d, 5y) = -927.7 \text{ million}$$

The borrower, which has not defaulted, must pay this amount
entirely – and soon. It has a sudden loss of 348.8 million due to
default of the lender. With the substitution close-out, we have
instead:

$$NPV_C^{Repl}(\tau_I + 1d, 5y) = -562.7 \text{ million}$$

There is no discontinuity and no loss for the borrower in case of default of the lender. This is true, however, only in case of independence. If the default of the lender leads to an increase of the spreads of the borrower, the liability can jump to a lower absolute value, also lowering the expected recovery for the liquidators of the defaulted lender (see Figures 19.3 and 19.4, and Table 19.1).
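All the figures in this example follow from elementary closed-form expressions under the stated assumptions (independent exponential defaults, zero recoveries). A small Python check – our own illustration, not part of the original text – recovers them:

from math import exp

# Bond example: lam_C = 0.2 (borrower), lam_I = 0.04 (lender), r = 3%,
# T = 5y, notional 1,000 million, zero recoveries, independent defaults.
r, T, notional = 0.03, 5.0, 1000.0
lam_C, lam_I = 0.20, 0.04

def q_I_first(t):
    # Q(tau_I < tau_C < t): both default before t, lender first
    return (lam_I / (lam_I + lam_C)) * (1 - exp(-(lam_I + lam_C) * t)) \
           - exp(-lam_C * t) * (1 - exp(-lam_I * t))

P = notional * exp(-r * T)                       # risk-free bond: 860.7 million
print(P * exp(-lam_C * T))                       # NPV_Repl(0, 5y) ~ 316.6 million
print(P * (exp(-lam_C * T) + q_I_first(T)))      # NPV_Free(0, 5y) ~ 359.5 million

# book value of the borrower's liability around tau_I = 2.5y
P25, T25 = notional * exp(-r * 2.5), 2.5
print(-P25 * (exp(-lam_C * T25) + q_I_first(T25)))  # Free, just before: -578.9
print(-P25 * exp(-lam_C * T25))                     # Repl, before and after: -562.7
print(-P25)                                         # Free, just after: -927.7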
In Brigo and Morini (2010), we also cover the issue of how to treat
the two close-out conventions for the case of collateralised deals,
when the final outcome should always be that, irrespective of close-
out, collateral and exposure match at default.

Conclusion
We have analysed the effect of the assumptions about the computa-
tion of the close-out amount on the counterparty risk adjustments
of derivatives. We have compared the risk-free close-out assumed
in the earlier literature with the replacement close-out we introduce
here, inspired by Isda documentation on the subject.
We have provided a formula for bilateral counterparty risk when
a replacement close-out is used at default. We reckon that the
replacement close-out is consistent with counterparty risk adjust-
ments for standard and consolidated financial products, such as
bonds and loans. On the contrary, the risk-free close-out introduces
at time zero a dependence on the risk of default of the party with no
future obligations.
We have also shown that in case of the risk-free close-out, a party
that is a net debtor of a company will have a sudden loss at the
default of the latter, and this loss is higher the higher the debtor’s
credit spreads. This does not happen when a replacement close-out
is considered.
Thus, the risk-free close-out increases the number of operators
subject to contagion from a default, including parties that currently
seem to think they are not exposed, and this is certainly a negative
fact. On the other hand, it spreads the default losses to a higher
number of parties and reduces the classic contagion channel
affecting creditors. For the creditors, this is a positive fact because
it brings more money to the liquidators of the defaulted company.
We think the close-out issue should be considered carefully by
market operators and Isda. For example, if the risk-free close-out
introduced in the previous literature had to be recognised as a

Table 19.1 Impact of the lender default on counterparties and contagion – risk-free and substitution close-out

Close-out ↓ / Dependence →   Independence                   Co-monotonicity
Risk-free                    Negatively affects borrower    No contagion
Substitution                 No contagion                   Further negatively affects lender

standard, banks should understand the consequences of this as explained above. In fact, banks usually perform stress tests and set
aside reserves for the risk of default of their net borrowers, but do
not consider any risk related to the default of net lenders. The above
calculations and the numerical examples show that under risk-free
close-out banks should set aside important reserves against this
risk. On the other hand, under replacement close-out, banks can
expect the recovery to be lowered when their net borrowers default,
compared with the case when a risk-free close-out applies. In the
case of a replacement close-out, the money collected by liquidators
from the counterparties will be lower, since it will be deflated by the
default probability of the counterparties themselves, especially if
they are strongly correlated to the defaulted entity.

The authors would like to thank Giorgio Facchinetti, Marco Bianchetti, Luigi Cefis, Martin Baxter, Andrea Bugin, Vladimir Chorny, Josh
Danziger, Igor Smirnov and other participants at the ICBI 2010 Global
Derivatives and Risk Management Conference for helpful discussion.
They would also like to give special thanks to Andrea Pallavicini and
Andrea Prampolini for thoroughly and deeply discussing the research
issues considered in this chapter. The remaining errors are their own.
This chapter expresses the views of its authors and does not represent
the institutions where the authors are working or have worked in the
past. Such institutions, including Banca IMI, are not responsible for
any use that may be made of this chapter’s contents.

REFERENCES

Brigo D. and I. Bakkar, 2009, “Accurate Counterparty Risk Valuation for Energy-
commodities Swaps,” Energy Risk, March, pp 106–11.

Brigo D. and A. Capponi, 2008, “Bilateral Counterparty Risk Valuation with Stochastic Dynamical Models and Application to Credit Default Swaps,” working paper (available at http://arxiv.org/abs/0812.3705). A short updated version appeared in Risk, March 2010, pp 85–90.

Brigo D. and K. Chourdakis, 2009, “Counterparty Risk for Credit Default Swaps: Impact
of Spread Volatility and Default Correlation,” International Journal of Theoretical and
Applied Finance, 12(7), pp 1,007–26.

Brigo D. and M. Masetti, 2005, “Risk Neutral Pricing of Counterparty Risk,” in M. Pykhtin (Ed), Counterparty Credit Risk Modeling: Risk Management, Pricing and Regulation (London, England: Risk Books).

Brigo D. and M. Morini, 2010, “Dangers of Bilateral Counterparty Risk: The Fundamental Impact of Closeout Conventions” (available at http://arxiv.org, http://defaultrisk.com and http://ssrn.com/abstract=1709370). A summary appeared as “Rethinking Counterparty Default,” Creditflux, 114, pp 18–19, 2011.

Brigo D., A. Pallavicini and V. Papatheodorou, 2011, “Arbitrage-free Valuation of Bilateral Counterparty Risk for Interest-rate Products: Impact of Volatilities and Correlations,” International Journal of Theoretical and Applied Finance, 14(6), pp 773–802.

Gregory J., 2009, “Being Two-faced Over Counterparty Credit Risk,” Risk, February, pp
86–90.

International Swaps and Derivatives Association, 2009, “ISDA Close-out Amount Protocol” (available at www.isda.org/isdacloseoutamtprot/isdacloseoutamtprot.html).

International Swaps and Derivatives Association, 2010, “Market Review of OTC Derivative Bilateral Collateralization Practices,” March 1.

Morini M., 2009, “One More Model Risk When Using Gaussian Copula for Risk Management,” April (available at http://ssrn.com/abstract=1520670).

Morini M., 2010, “Can the Default of a Bank Cause the Default of its Debtors? The
Destabilizing Consequences of the Standard Definition of Bilateral Counterparty Risk,”
working paper, March.

Morini M., 2011, Understanding and Managing Model Risk: A Practical Guide for Quants,
Traders and Validators (Hoboken, NJ: Wiley).

Morini M. and A. Prampolini, 2011, “Risky Funding with Counterparty and Liquidity
Charges,” Risk, March, pp 70–75.

20
Cutting CVA’s Complexity
Pierre Henry-Labordère
Société Générale

The financial crisis has highlighted the importance of the credit value adjustment (CVA) when pricing derivatives. Bilateral coun-
terparty risk is the risk that the issuer of a derivative, or its
counterparty, may default prior to the expiry and fail to make future
payments. For Markovian models, this leads naturally to nonlinear
second-order parabolic partial differential equations (PDEs) to
price the contract. More precisely, the nonlinearity in the pricing
equation affects none of the differential terms but depends on the
positive part of the mark-to-market value M of the derivative upon
default. Where this mark-to-market value is calculated in the pres-
ence of counterparty risk, M = V̂, we have a so-called semi-linear
PDE. If we do not include counterparty risk, M = V, we have a semi-
linear (respectively linear) PDE if the funding lending and
borrowing rates are different (respectively equal) (see Burgard and
Kjaer, 2011).
The numerical solution of this equation is a formidable task.
Typically, the market data is evolved and the mark-to-market values
are calculated at each default date. This works fine for simple trades
such as swaps, forwards or vanilla options. More complex trades
can use pre-calculated regression/PDE look-up tables. For multi-
asset portfolios, these PDEs cannot be solved with finite-difference
schemes because they suffer from the curse of dimensionality. We
must rely on new probabilistic methods. It has seemed that a brute
force intensive nested Monte Carlo method is the only tool avail-
able for this task, particularly in the case M = V̂.
In this chapter, we rely on new advanced nonlinear Monte Carlo

methods to solve these semi-linear PDEs. A first approach is to use the so-called first-order backward stochastic differential equations.
Unfortunately, in practice this method requires the calculation of
conditional expectations using regressions. Finding good quality
regressors is notably difficult, especially for multi-asset portfolios.
This leads us to introduce a new method based on branching diffu-
sions describing a marked Galton–Watson random tree. A similar
algorithm can also be applied to obtain stochastic representations
for solutions of a large class of semi-linear parabolic PDEs in which
the nonlinearity can be approximated by a polynomial function.

The valuation PDEs


For completeness, we derive the PDE arising in counterparty risk
valuation of a European-style derivative with a payout ψ at matu-
rity T. In short, depending on the modelling choice of the
mark-to-market value of the derivative upon default, we will get
two types of semi-linear PDEs that can be schematically written as:

$$\partial_t \hat V + \mathcal{L}\hat V - (1-R)\,\lambda_C\,\hat V^+ - r\hat V = 0, \qquad \hat V(T,x) = \psi(x) \qquad (20.1)$$

and:

∂t V + LV + λ2 RV + − V − − V − rV = 0, V (T, x ) = ψ ( x )


( )
∂t V + LV − rV = 0,V (T, x ) = ψ ( x )  (20.2)

where +/− superscripts denote positive and negative parts respectively, X ≡ X^+ − X^−, and $\mathcal{L}$ is the Itô generator of a multi-dimensional diffusion process. V̂ (respectively V̄) denotes the derivative value with (respectively without) provision for counterparty risk.

Derivation
We assume the issuer is allowed to dynamically trade d underlying
assets independent of the counterparty’s default process in a
complete market. Counterparty quantities are denoted with a
superscript C. To hedge their credit risk on the counterparty name,
the issuer can trade a default-risky bond, denoted P_t^C, and the
default process is modelled by a Poisson jump process, with
constant intensity λ_C, although this assumption can be easily
relaxed, for instance to have the intensity follow an Itô diffusion.
We consider the case of a long position in a single derivative whose

value we denote u. In practice, netting agreements apply to the


global mark-to-market value of a pool of derivative positions – u
would then denote the aggregate value of these derivatives. The
processes $X_t$, $P_t^C$ satisfy under the risk-neutral measure P:
$$\frac{dX_t}{X_t} = r\,dt + \sigma(t, X_t)\cdot dW_t, \qquad \frac{dP_t^C}{P_t^C} = (r + \lambda_C)\,dt - dJ_t^C$$

where $W_t$ is a d-dimensional Brownian motion, $J_t^C$ is a jump Poisson process with intensity λ_C and r is the interest rate. The no-arbitrage condition and completeness of the market give that $e^{-rt}u(t, X_t)$ is a P-martingale, characterised by:

$$\partial_t u + \mathcal{L}u + \lambda_C\left(\tilde u - u\right) - ru = 0$$

where $\mathcal{L}$ denotes the Itô generator of X and ũ is the derivative’s value after the counterparty has defaulted.

PDEs for M = V̂ and M = V


At the default event, ũ is given by:
$$\tilde u = R\,M^+ - M^-$$

where M is the mark-to-market value of the derivative to be used in the unwinding of the position upon default and R is the recovery
rate. There is an ambiguity in the market about the convention for
the mark-to-market value to be settled at default. There are two
natural conventions: the mark-to-market of the derivative is evalu-
ated at the time of default with provision for counterparty risk or
without. The exposure, in case of provisions, is set equal to the pre-
default value of the derivative. Brigo and Morini (2011) compare
the risk-free close-out with the replacement close-out convention.
In general, the assumption of using the pre-default value of the
derivative in calculating the close-out amount might be question-
able, since after the default event one party has defaulted and the
replacement deal must be closed with another party. This issue is
not important for our numerical algorithm, which can handle these
various conventions.

❑❑ Provision for counterparty risk, M = V̂. We get the semi-linear PDE 20.1. Although the case M = V̂ does not seem to be supported
in the International Swaps and Derivatives Association Master

Agreement, a similar nonlinear PDE is obtained in the case M = V when there is a bid/offer spread on the issuer funding rate (see
Burgard and Kjaer, 2011, for details).
❑❑ No provision for counterparty risk, M = V. We get PDE 20.2. This
is a linear PDE with a source term V^+.

In the case of collateralised positions, counterparty risk applies to the variation of the mark-to-market value of the corresponding positions experienced over the time it takes to qualify a failure to pay margin as a default event – typically a few days. In the latter case, the nonlinearity $V_t^+$ should be substituted with $(V_t - V_{t+\Delta})^+$, where Δ is this delay. We will come back to this situation in the last section.
By proper discounting and replacing V̂ (respectively V̄, V) by −V̂ (respectively −V̄, −V) for the sake of the presentation, these two PDEs can be cast into normal forms:

$$\partial_t \hat V + \mathcal{L}\hat V + \beta\left(\hat V^+ - \hat V\right) = 0, \qquad \hat V(T,x) = \psi(x) \quad : \text{PDE2} \qquad (20.3)$$
$$\partial_t \bar V + \mathcal{L}\bar V + \frac{\beta}{1-R}\left((1-R)\left(\mathbb{E}_{t,x}[\psi]\right)^+ + R\,\mathbb{E}_{t,x}[\psi] - \bar V\right) = 0, \qquad \bar V(T,x) = \psi(x) \quad : \text{PDE1} \qquad (20.4)$$

where $\beta \equiv \lambda_C(1-R) \in \mathbb{R}_+$. It is interesting to note that a similar semi-linear PDE of type 20.3 also appears in the pricing of American-style options (see Benth, Karlsen and Reikvam, 2003, for details) and
corresponds to well-known early exercise premium formulas.
In the next section, we briefly list (nonlinear) Monte Carlo algo-
rithms that can be used to solve PDEs 20.3–20.4 and highlight their
weaknesses in the context of CVA. We then explain our algorithm,
which relies on approximating the nonlinearity x+ by a polynomial
in x and then finding an efficient stochastic representation based on
(marked) branching diffusions.

Existing methods for solving the valuation PDEs


A brute force algorithm
Using Feynman–Kac’s formula, the solution of PDE 20.3 can be
represented stochastically as:
$$\hat V(t,x) = e^{-\beta(T-t)}\,\mathbb{E}_{t,x}\left[\psi(X_T)\right] + \int_t^T \beta\,e^{-\beta(s-t)}\,\mathbb{E}_{t,x}\left[\hat V^+(s, X_s)\right]ds \qquad (20.5)$$

where X is an Itô diffusion with generator $\mathcal{L}$ and $\mathbb{E}_{t,x}[\cdot] = \mathbb{E}[\cdot\,|\,X_t = x]$.


By assuming that the intensity β is small, we get the approximation:
$$\hat V(t,x) = e^{-\beta(T-t)}\,\mathbb{E}_{t,x}\left[\psi(X_T)\right] + \beta\,e^{-\beta(T-t)}\int_t^T \mathbb{E}_{t,x}\left[\left(\mathbb{E}_{s,X_s}\left[\psi(X_T)\right]\right)^+\right]ds + O(\beta^2) \qquad (20.6)$$

This is exact for PDE 20.4 and consists of applying Feynman–Kac’s formula with the source term $(\mathbb{E}_{s,X_s}[\psi(X_T)])^+$. Then, as a next step, we discretise the Riemann integral:

$$\hat V(t,x) \simeq e^{-\beta(T-t)}\,\mathbb{E}_{t,x}\left[\psi(X_T)\right] + \beta\,e^{-\beta(T-t)}\sum_{i=1}^{n}\mathbb{E}_{t,x}\left[\left(\mathbb{E}_{t_i,X_{t_i}}\left[\psi(X_T)\right]\right)^+\right]\Delta t_i$$

This last expression can be numerically tackled by using a brute force nested Monte Carlo method. The second Monte Carlo is used to calculate $\mathbb{E}_{t_i,X_{t_i}}[\psi(X_T)]$ on each path generated by the first Monte Carlo algorithm. Although straightforward, this method suffers from the curse of dimensionality – requiring generation of $O(N_1 \times N_2)$ paths. Due to this complexity, the literature focuses on exposures of linear portfolios for which the second Monte Carlo simulation can be skipped by using closed-form formulas or low-dimensional parametric regressions (see, for example, Brigo and Pallavicini, 2007 and 2008, in which the authors consider the pricing of constant maturity swap spread options and contingent credit default swaps).
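As an illustration of the cost, here is a rough Python sketch (ours, not the chapter's) of the nested estimator for the time-discretised Equation 20.6, under driftless lognormal dynamics dX/X = σdW (the setting used in the numerical section below); all parameter values and names are our own assumptions, and the payout psi must accept numpy arrays:

import numpy as np

rng = np.random.default_rng(1)

def nested_estimator(psi, x0, sigma, beta, T, n_dates=20, n1=2_000, n2=500):
    t = np.linspace(0.0, T, n_dates + 1)
    integrals = np.empty(n1)
    for n in range(n1):
        x, integral = x0, 0.0
        for i in range(1, n_dates + 1):
            dt = t[i] - t[i - 1]
            x *= np.exp(-0.5 * sigma**2 * dt
                        + sigma * np.sqrt(dt) * rng.standard_normal())
            # inner Monte Carlo for E_{t_i, X_{t_i}}[psi(X_T)], one per date
            z = rng.standard_normal(n2)
            xT = x * np.exp(-0.5 * sigma**2 * (T - t[i])
                            + sigma * np.sqrt(T - t[i]) * z)
            integral += max(psi(xT).mean(), 0.0) * dt
        integrals[n] = integral
    # first term of 20.6, estimated with a plain Monte Carlo draw
    z = rng.standard_normal(10 * n1)
    xT = x0 * np.exp(-0.5 * sigma**2 * T + sigma * np.sqrt(T) * z)
    return np.exp(-beta * T) * (psi(xT).mean() + beta * integrals.mean())

print(nested_estimator(lambda x: (x > 1.0).astype(float),
                       x0=1.0, sigma=0.2, beta=0.05, T=10.0))

The O(N₁ × N₂ × number of dates) double loop is exactly the cost that the branching algorithm introduced below avoids.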
Could we design a simple (nonlinear) Monte Carlo algorithm
that solves our PDEs 20.3–20.4, without relying on an approxima-
tion such as 20.6? This is the purpose of this article.

Backward stochastic differential equations


A first approach is to simulate a backward stochastic differential
equation (BSDE):

$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\cdot dW_t, \qquad X_0 = x \qquad (20.7)$$
$$dY_t = -\beta Y_t^+\,dt + Z_t\,\sigma(t, X_t)\cdot dW_t, \qquad Y_T = \psi(X_T) \qquad (20.8)$$

where (Y, Z) are required to be adapted processes and $\mathcal{L} = \sum_i \mu_i\partial_{x_i} + \frac{1}{2}\sum_{i,j}(\sigma\sigma^*)_{ij}\partial^2_{x_i x_j}$. BSDEs differ from (forward) SDEs in that we impose the terminal value (see Equation 20.8). Under the condition ψ ∈ L²(Ω), this BSDE admits a unique solution (Pardoux and Peng, 1990). A

straightforward application of Itô’s lemma gives that the solution of this BSDE is $(Y_t = e^{\beta(T-t)}\hat V(t, X_t),\ Z_t = e^{\beta(T-t)}\nabla_x \hat V(t, X_t))$, with V̂ the solution of PDE 20.3. This leads to a Monte Carlo-like numerical solution of 20.3 via an efficient discretisation scheme for the above BSDE.
This BSDE can be discretised by an Euler-like scheme ($Y_{t_{i-1}}$ is forced to be $\mathcal{F}_{t_{i-1}}$-adapted, $(\mathcal{F}_t)_{t\ge 0}$ being the natural filtration generated by the Brownian motions):
$$Y_{t_{i-1}} = \mathbb{E}_{t_{i-1}}\left[Y_{t_i}\right]\left(\mathbf{1}_{\mathbb{E}_{t_{i-1}}[Y_{t_i}]\ge 0}\,\frac{1+(1-\theta)\beta\Delta t_i}{1-\theta\beta\Delta t_i} + \mathbf{1}_{\mathbb{E}_{t_{i-1}}[Y_{t_i}]< 0}\right)$$

with θβΔt_i < 1 and θ ∈ [0, 1]. This requires the calculation of the conditional expectation $\mathbb{E}_{t_{i-1}}[Y_{t_i}]$ (in practice by regression methods),
which could be quite difficult and time-consuming, especially for
multi-asset portfolios. A careful analysis of the convergence in
terms of the number of simulations, the function regressors and the
time step Δt_i is achieved in Gobet, Lemor and Warin (2006).
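For concreteness, a schematic Python implementation (ours, not the authors') of this Euler scheme with θ = 0 might look as follows; the conditional expectation is approximated by a cross-sectional polynomial regression on the simulated paths, a Longstaff–Schwartz-style choice, and this regression step is precisely where the quality issues mentioned above arise:

import numpy as np

rng = np.random.default_rng(2)

def bsde_price(psi, x0, sigma, beta, T, n_steps=50, n_paths=50_000, deg=4):
    # forward paths of a driftless lognormal X (an assumption of this sketch)
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    logx = np.cumsum(-0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * z, axis=1)
    x = x0 * np.hstack([np.ones((n_paths, 1)), np.exp(logx)])
    y = psi(x[:, -1])                            # Y_T = psi(X_T)
    for i in range(n_steps, 0, -1):
        if i == 1:
            cond = np.full(n_paths, y.mean())    # X_0 is deterministic
        else:
            coef = np.polyfit(x[:, i - 1], y, deg)
            cond = np.polyval(coef, x[:, i - 1])
        # explicit scheme: Y_{i-1} = E[Y_i](1 + beta dt) when E[Y_i] >= 0
        y = cond * np.where(cond >= 0.0, 1.0 + beta * dt, 1.0)
    # Y_t = e^{beta(T-t)} V-hat(t, X_t), so discount to recover the price
    return np.exp(-beta * T) * y[0]

Choosing a good regression basis (here a blunt global polynomial) is the hard part the text refers to, especially in several dimensions.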

An introduction to branching diffusions


Branching diffusions were first introduced by McKean (1975) to
give a probabilistic representation of the Kolmogorov-Petrovskii-
Piskunov PDE and more generally of semi-linear PDEs of the type:
⎛ ∞ ⎞
∂t u + Lu + β (t ) ⎜∑ pk u k − u⎟ = 0 in R + × R d
⎝ k=0 ⎠
u (T, x ) = ψ ( x ) in R d  (20.9)

with β(·) ∈ ℝ₊. Here the nonlinearity is a power series in u where the coefficients satisfy the restrictive condition:
$$f(u) \equiv \sum_{k=0}^{\infty} p_k u^k, \qquad \sum_{k=0}^{\infty} p_k = 1, \quad 0 \le p_k \le 1 \qquad (20.10)$$

The probabilistic interpretation of such an equation goes as follows: let a single particle start at the origin, perform an Itô diffusion on ℝ^d with generator $\mathcal{L}$ and, after an exponential time with intensity β(·) (independent of X), die and produce k descendants with probability $p_k$. Then the descendants perform independent Itô diffusions on ℝ^d (with the same generator $\mathcal{L}$) from their birth locations, die and produce descendants after further exponential times with intensity β(·), etc. This process is called a d-dimensional branching diffusion with branching rate β(·), and its paths are known as Galton–Watson trees. β can also

depend spatially on x or itself be stochastic (a so-called Cox process).


We denote by $Z_t \equiv (z_t^1, \ldots, z_t^{N_t}) \in \mathbb{R}^{d\times N_t}$ the locations of the particles alive at time t and by $N_t$ the number of particles at t (see Figure 20.1 for examples with two and three descendants). We then consider the multiplicative functional defined by ($\prod_{N_T=0} \equiv 1$):
$$\hat u(t,x) = \mathbb{E}_{t,x}\left[\prod_{i=1}^{N_T}\psi(z_T^i)\right] \qquad (20.11)$$

where $\mathbb{E}_{t,x}[\cdot] = \mathbb{E}[\cdot\,|\,N_t = 1, z_t^1 = x]$. Note that, as $N_T$ can become infinite when $m = \sum_{k=0}^{\infty} k p_k > 1$ (super-critical regime, see Durrett, 2004), a sufficient condition on ψ in order to have a well-behaved product is $\|\psi\|_\infty < 1$. Then û solves the semi-linear PDE 20.9. This stochastic representation can be understood as follows: mathematically, by conditioning on τ, the first jump time of a Poisson process with intensity β(t), we get from 20.11:

$$\hat u(t,x) = \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau\ge T}\,\psi(z_T^1)\right] + \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau< T}\sum_{k=0}^{\infty} p_k\,\mathbb{E}_{\tau}\left[\prod_{j=1}^{k}\prod_{i=1}^{N_T^j(\tau)}\psi\left(z_T^{i,j,z_\tau}\right)\right]\right]$$

where $N_T^j(\tau)$ is the number of descendants of particle j born at time τ and alive at time T, and $z_T^{i,j,z_\tau}$ is the position at maturity T of the ith particle produced by the jth particle generated at time τ. By using the independence and the strong Markov property, we obtain:

$$\hat u(t,x) = \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau\ge T}\,\psi(z_T^1)\right] + \sum_{k=0}^{\infty}\mathbb{E}_{t,x}\left[\mathbf{1}_{\tau<T}\,p_k\prod_{j=1}^{k}\mathbb{E}_{\tau}\left[\prod_{i=1}^{N_T^j(\tau)}\psi\left(z_T^{i,j,z_\tau}\right)\right]\right]$$
$$= \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau\ge T}\,\psi(z_T^1)\right] + \sum_{k=0}^{\infty} p_k\,\mathbb{E}_{t,x}\left[\hat u^k(\tau, z_\tau^1)\,\mathbf{1}_{\tau<T}\right]$$
$$= \mathbb{E}_{t,x}\left[e^{-\int_t^T \beta(s)\,ds}\,\psi(z_T^1)\right] + \int_t^T \sum_{k=0}^{\infty} p_k\,\mathbb{E}_{t,x}\left[\beta(s)\,e^{-\int_t^s \beta(u)\,du}\,\hat u^k(s, z_s^1)\right]ds$$

[Figure 20.1: Marked Galton–Watson random trees for the nonlinearity $F(u) = (a/p_2)p_2u^2 + (b/p_3)p_3u^3$, up to two defaults. A grey (respectively black) vertex corresponds to the weight $a/p_2$ (respectively $b/p_3$); the diagram with two grey vertices has the weights $(\omega_2 = 2, \omega_3 = 0)$.]

Then, by applying formally the Feynman–Kac formula, û is a formal solution of PDE 20.9. This can be deduced rigorously by assuming that $\|\psi\|_\infty < 1$: û is then uniformly bounded by one in $[0,T]\times\mathbb{R}^d$ and we get from the Feynman–Kac formula that û is a viscosity solution of PDE 20.9 (see Theorem 6.4 in Touzi, 2010). By assuming that PDE 20.9 satisfies a comparison principle, we conclude that u = û.

Marked branching diffusions


It seems too restrictive and unreasonable to fit the nonlinearity $u^+$ by a polynomial of type 20.10. The aim of this section is to extend the results of the previous section to an arbitrary polynomial $F(u) = \sum_{k=0}^{M} a_k u^k$, for which the PDE is:

$$\partial_t u + \mathcal{L}u + \beta\left(F(u) - u\right) = 0 \qquad (20.12)$$

For convenience, we write $F(u) = \sum_{k=0}^{M}(a_k/p_k)\,p_k u^k$.

Assumption (comparison)
To have uniqueness in the viscosity sense, we assume PDE 20.12
satisfies a comparison principle for sub- and super-solutions (see
Fleming and Soner, 1993).
For each Galton–Watson tree, we denote by $\omega_k \in \mathbb{N}$ the number of particles that branch into k descendants, with $k \in \{0, \ldots, M\}$. $\sum_{k=0}^{M}\omega_k(k-1) + 1$ gives the total number of individuals produced by the branching $\omega \equiv (\omega_0, \ldots, \omega_M)$. The descendants are drawn with an arbitrary distribution $p_k$ – for example, we can take a uniform distribution $p_k = 1/(M+1)$ (see another choice below). In Figure 20.1, we have drawn the diagrams for the nonlinearity $F(u) = (a/p_2)p_2u^2 + (b/p_3)p_3u^3$ up to two defaults. We then define the multiplicative functional:

Main formula
$$\hat u(t,x) = \mathbb{E}_{t,x}\left[\prod_{i=1}^{N_T}\psi(z_T^i)\prod_{k=0}^{M}\left(\frac{a_k}{p_k}\right)^{\omega_k}\right] \qquad (20.13)$$
We state our main result (the proof is reported in the Appendix,
available in an extended version of this chapter, Henry-Labordère,
2012).

Theorem
Assume that û ∈ L∞([0, T] × Rd) and the comparison holds. The func-
tion û(t, x) is the unique viscosity solution of 20.12.

Diagrammatic interpretation
From Feynman–Kac’s formula, we have:

$$u(t,x) = \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau\ge T}\,\psi(X_T)\right] + \mathbb{E}_{t,x}\left[F\left(u(\tau, X_\tau)\right)\mathbf{1}_{\tau<T}\right] \qquad (20.14)$$

This integral equation can be recursively solved in terms of multiple exponential random times $\tau^i$:
$$u(t,x) = \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau^0\ge T}\,\psi(X_T)\right] + \mathbb{E}_{t,x}\Big[F\Big(\mathbb{E}_{\tau^0}\left[\mathbf{1}_{\tau^1\ge T}\,\psi(X_T)\right] + \mathbb{E}_{\tau^0}\Big[F\left(\mathbb{E}_{\tau^1}\left[\mathbf{1}_{\tau^2\ge T}\,\psi(X_T)\right] + \cdots\right)\mathbf{1}_{\tau^1<T}\Big]\Big)\mathbf{1}_{\tau^0<T}\Big] \qquad (20.15)$$

Each term can be interpreted as a Feynman diagram (see Figure 20.1) representing the trajectory of a branching diffusion with a

weight depending on the branching of each monomial. For example, in Figure 20.1, the diagram with two grey vertices corresponds to:
$$\left(\frac{a_2}{p_2}\right)^2 \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau^0<T}\,\mathbb{E}_{\tau^0}\left[\mathbf{1}_{\tau^1\ge T}\,\psi(X_T)\right]\mathbb{E}_{\tau^0}\left[\mathbf{1}_{\tau^2<T}\,\mathbb{E}_{\tau^2}\left[\mathbf{1}_{\tau^3\ge T}\,\psi(X_T)\right]^2\right]\right]$$

By assuming that the series 20.15 is convergent, one can guess that
the solution is given by our multiplicative functional 20.13.
Next, we focus on convergence issues and deduce a sufficient
condition to ensure that û ∈ L∞([0, T] × Rd) if ψ is bounded (the proof
is reported in the Appendix), so that our algorithm is certain to
converge.

Proposition 1
Let us assume that $\psi \in L^\infty(\mathbb{R}^d)$. Set $q(s) = \sum_{k=0}^{M}|a_k|\,\|\psi\|_\infty^{k-1}\,s^k$.

❑❑ Case q(1) > 1. We have $\hat u \in L^\infty([0,T]\times\mathbb{R}^d)$ (as defined by 20.13) if there exists $\bar X \in \mathbb{R}_+^*$ such that:
$$\int_1^{\bar X} \frac{ds}{q(s) - s} = \beta T$$
In the particular case of one branching type k, the sufficient condition for convergence reads:
$$|a_k|\,\|\psi\|_\infty^{k-1}\left(1 - e^{-\beta T(k-1)}\right) < 1$$
❑❑ Case q(1) ≤ 1. $\hat u \in L^\infty([0,T]\times\mathbb{R}^d)$ for all T.
Note that our blow-up criterion – the converse of the above
inequality – does not depend on the probabilities pk as expected.

PDE 20.4
We assume that the function $(1-R)x^+ + Rx$ can be well approximated by a polynomial F(x) (see the next section) and consider the PDE:
$$\partial_t u(t,x) + \mathcal{L}u(t,x) + \frac{\beta}{1-R}\left(F\left(\mathbb{E}_{t,x}\left[\psi(X_T)\right]\right) - u(t,x)\right) = 0, \qquad u(T,x) = \psi(x)$$

From Feynman–Kac’s formula, we have:

$$u(t,x) = \mathbb{E}_{t,x}\left[\mathbf{1}_{\tau\ge T}\,\psi(X_T)\right] + \mathbb{E}_{t,x}\left[F\left(\mathbb{E}_{\tau}\left[\psi(X_T)\right]\right)\mathbf{1}_{\tau<T}\right]$$

where τ is a Poisson default time with intensity β/(1−R). As compared with the previous section, we have the term $\mathbb{E}_{t,x}[F(\mathbb{E}_\tau[\psi(X_T)])\mathbf{1}_{\tau<T}]$ instead of $\mathbb{E}_{t,x}[F(u(\tau, X_\tau))\mathbf{1}_{\tau<T}]$. This term can be calculated using the previous algorithm by imposing that the particle can default only once. This corresponds to the first three diagrams in Figure 20.1. As $N_T$ is valued in [0, M], our formula 20.13 is convergent here for all polynomial nonlinearities.
As a conclusion, without any modification, the branching particle algorithm can solve the two PDEs 20.3–20.4, provided the nonlinearity $u^+$ can be fairly well approximated by a polynomial.

Optimal probabilities pk
Is there a better choice than a uniform distribution pk = 1/(M+1) for
improving the convergence?
For the PDE 20.3, the variance of the algorithm (depending on the probabilities $p_k$) is bounded by (see the Appendix):
$$\|\psi\|_\infty\,\hat P\left(T,\ -2\ln\frac{|a_k|}{p_k} - 2\ln\|\psi\|_\infty^{k-1}\right) \qquad (20.16)$$

where $\hat P(T,c) = \mathbb{E}\left[\prod_{k=0}^{M}e^{-c_k\omega_k}\right]$ is given by:
$$\int_1^{\hat P(T,c)}\frac{ds}{-s + \sum_{k=0}^{M}p_k e^{-c_k}s^k} = \beta T \quad \text{if } \sum_{k=0}^{M}p_k e^{-c_k} \ne 1$$
$$\hat P(T,c) = 1 \quad \text{if } \sum_{k=0}^{M}p_k e^{-c_k} = 1 \qquad (20.17)$$

By minimising with respect to $p_k$, we get:
$$p_k = \frac{|a_k|\,\|\psi\|_\infty^k}{\sum_{i=0}^{M}|a_i|\,\|\psi\|_\infty^i} \qquad (20.18)$$

Similarly, for the PDE 20.4, the variance (depending on the probabilities $p_k$) is bounded by:
$$\sum_{k=0}^{M}\frac{a_k^2}{p_k}\,\|\psi\|_\infty^{2k}\,\beta T e^{-\beta T}$$

By minimising with respect to $p_k$, we also get 20.18.
We recall that the population in the Galton–Watson tree disappears in finite time almost surely if $m \equiv \sum_{k=0}^{M} k p_k \le 1$ (see Durrett, 2004). In the super-critical case m > 1, the population explodes at a finite time $T_{exp}$ with probability $1 - s_0$, where $s_0 = \inf\{s \in [0,1],\ \sum_{k=0}^{M} p_k s^k = s\}$. From 20.18, we are in the super-critical case if $\sum_{k=0}^{M}(k-1)\,|a_k|\,\|\psi\|_\infty^k > 0$.

Credit valuation adjustment algorithm


In the previous section, we assumed that the payout was bounded: $\psi \in L^\infty$. The solution u can then be written as $v = u/\|\psi\|_\infty$, where v satisfies:

$$\partial_t v + \mathcal{L}v + \beta\left(v^+ - v\right) = 0, \qquad v(T,\cdot) \le 1 \qquad (20.19)$$

Therefore, by rescaling, we can consider that the payout satisfies the condition $\|\psi\|_\infty \le 1$. The condition $\psi \in L^\infty$ can be easily relaxed, as observed in Fahim, Touzi and Warin (2011, see Remark 3.7). Let ψ be a payout with α-exponential growth for some α > 0. We scale the solution by an arbitrary smooth positive function ρ given by:
$$\rho(x) \equiv e^{\alpha|x|} \ \text{for } |x| \ge M, \qquad \tilde v(t,x) \equiv \rho^{-1}(x)\,v(t,x)$$

If we write the linear operator $\mathcal{L}$, in one dimension for simplicity, as $\mathcal{L}v = \mu(t,x)\partial_x v + \frac{1}{2}\sigma^2(t,x)\partial_x^2 v$, then $\tilde v$ satisfies a PDE with the same nonlinearity $\beta\tilde v^+$:
$$\partial_t \tilde v + \tilde{\mathcal{L}}\tilde v + \beta\left(\tilde v^+ - \tilde v\right) = 0$$
with $\tilde{\mathcal{L}}\tilde v = (\mu + \sigma^2\rho^{-1}\partial_x\rho)\partial_x\tilde v + \frac{1}{2}\sigma^2(t,x)\partial_x^2\tilde v + (\mu\rho^{-1}\partial_x\rho + \frac{1}{2}\rho^{-1}\sigma^2\partial_x^2\rho)\tilde v$.
What remains to be done in order to use 20.13 is to approximate $v^+$ by a polynomial F(v):
$$\partial_t v + \mathcal{L}v + \beta\left(F(v) - v\right) = 0, \qquad v(T,x) = \psi(x) \qquad (20.20)$$

In our numerical experiments, we take (see Figure 20.2):
$$F(u) = 0.0589 + 0.5u + 0.8164u^2 - 0.4043u^4 \qquad (20.21)$$
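Such a fit can be reproduced with an ordinary least-squares regression. The sketch below is our own, using an arbitrary 201-point grid, so it only comes close to the quoted coefficients (the exact values depend on the fitting grid); since the odd part of $x^+$ is exactly x/2, the cubic coefficient vanishes. It also evaluates the optimal branching probabilities of Equation 20.18:

import numpy as np

x = np.linspace(-1.0, 1.0, 201)
coef = np.polyfit(x, np.maximum(x, 0.0), 4)  # returns a_4, a_3, a_2, a_1, a_0
a = coef[::-1]                               # reorder to a_0, ..., a_4
print(a)                                     # roughly [0.059, 0.5, 0.82, 0, -0.41]

# optimal branching probabilities of Equation 20.18, with ||psi||_inf = 1
p = np.abs(a) / np.abs(a).sum()
print(p)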

The proposition above gives that the solution does not blow up if βT < 0.50829 (take $\bar X = \infty$ with $\|\psi\|_\infty = 1$). Moreover, as a numerical check of 20.17, we have calculated using a PDE solver the solution of 20.20 with ψ(x) = 1, $F(u) = 0.0589 + 0.5u + 0.8164u^2 + 0.4043u^4$, β = 0.05 and T = 10 years. The solution $\bar X$ is given by $\hat P(T, -\ln(|a_k|/p_k))$ and should satisfy (see Equation 20.17):
$$\int_1^{\bar X} \frac{ds}{-s + 0.0589 + 0.5s + 0.8164s^2 + 0.4043s^4} = 0.5 \qquad (20.22)$$

We found $\bar X = 4.497$ (PDE solver) and the reader can check that this value satisfies the above identity 20.22 as expected.
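Assuming scipy is available, the check is a one-line quadrature (our own verification sketch):

from scipy.integrate import quad

integrand = lambda s: 1.0 / (-s + 0.0589 + 0.5 * s + 0.8164 * s**2 + 0.4043 * s**4)
value, err = quad(integrand, 1.0, 4.497)
print(value)   # should be close to beta * T = 0.05 * 10 = 0.5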

Algorithm: Final recipe


The algorithm for solving PDEs 20.3–20.4 can be described by the
following steps:
1. Choose a polynomial approximation of $u^+ \simeq \sum_{k=0}^{M}a_k u^k$ on the domain [−1, 1].
2. Simulate the assets and the Poisson default time with intensity β (respectively β/(1−R)) for PDE2 (respectively PDE1), which is usually calibrated to default probabilities implied by credit default swap quotes.
3. At each default time, produce k descendants with probability $p_k$ (given by 20.18). For PDE type 1, descendants produced after the first default become immortal.
4. Evaluate for each particle alive the payout:
$$\prod_{i=1}^{N_T}\psi(z_T^i)\prod_{k=0}^{M}\left(\frac{a_k}{p_k}\right)^{\omega_k}, \qquad \text{PDE2}$$
$$\prod_{i}^{N_T\in[0,M]}\psi(z_T^i)\left(\frac{a_1(1-R)+R}{p_1}\right)^{\omega_1}\prod_{k\ne 1}\left(\frac{a_k(1-R)}{p_k}\right)^{\omega_k}\ \left(\text{here }\sum_{k=0}^{M}\omega_k = 0 \text{ or } 1\right), \qquad \text{PDE1}$$

where $\omega_k$ denotes the number of branchings of type k. We should highlight that the algorithm for PDE1 is always convergent, for all T and whatever the payout, as the multiplicative functional involves at most M particles.
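A minimal Python sketch of this recipe for PDE2 is given below. It is our own illustration under assumed driftless lognormal dynamics dX/X = σdW (as in the numerical examples); the payout, parameters and names are assumptions, ψ must satisfy $\|\psi\|_\infty \le 1$, and βT must respect the blow-up bound of Proposition 1. A PDE1 variant (one default, modified weights) is sketched in the CVA examples section.

import numpy as np

rng = np.random.default_rng(0)

def branching_price(a, psi, x0, sigma, beta, T, n_mc=100_000):
    a = np.asarray(a, dtype=float)
    p = np.abs(a) / np.abs(a).sum()          # optimal p_k of Equation 20.18
    out = np.empty(n_mc)
    for n in range(n_mc):
        particles = [(0.0, x0)]              # stack of (birth time, location)
        weight, payoff = 1.0, 1.0
        while particles:
            t, x = particles.pop()
            tau = rng.exponential(1.0 / beta)          # step 2: default time
            dt = min(tau, T - t)
            x *= np.exp(-0.5 * sigma**2 * dt
                        + sigma * np.sqrt(dt) * rng.standard_normal())
            if t + tau >= T:
                payoff *= psi(x)             # particle survives to maturity
            else:                            # step 3: branch into k children
                k = rng.choice(a.size, p=p)
                weight *= a[k] / p[k]        # step 4: weight (a_k/p_k)^w_k
                particles.extend([(t + tau, x)] * k)
        out[n] = weight * payoff
    return out.mean(), out.std() / np.sqrt(n_mc)

# example: the quartic 20.21 with the payout of the CVA section below
a = [0.0589, 0.5, 0.8164, 0.0, -0.4043]
mean, se = branching_price(a, lambda x: 1.0 - 2.0 * (x > 1.0),
                           x0=1.0, sigma=0.2, beta=0.01, T=5.0)
print(mean, se)

When βT is small, each path generates only a handful of particles on average, which is the source of the complexity reduction discussed in the Complexity section.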

Extensions
In the case of collateralised positions, the nonlinearity $u_t^+$ should be substituted with $(u_t - u_{t+\Delta})^+$, where Δ is a delay. Using our polynomial approximation, we get $F(u_t - u_{t+\Delta})$. By expanding this function, we get monomials of the form $\{u_t^p\,u_{t+\Delta}^q\}$. Our algorithm can then be easily extended to handle this case: at each default time τ, we produce p descendants starting at $(\tau, X_\tau)$ and q descendants starting at $(\tau+\Delta, X_{\tau+\Delta})$.
In the case of bilateral counterparty risk, the PDE 20.1 reads:

[Figure 20.2: $u^+$ versus its polynomial approximation on [−1, 1]: $y = -0.4043x^4 + 0.8164x^2 + 0.5x + 0.0589$, $R^2 = 0.9969$.]

$$\partial_t \hat V + \mathcal{L}\hat V + \lambda_2(1-R_2)\left(-\hat V^+ + \frac{\lambda_1(1-R_1)}{\lambda_2(1-R_2)}\hat V^- - \frac{r}{\lambda_2(1-R_2)}\hat V\right) = 0$$

Our above algorithm can be easily adapted to this situation by taking the Poisson intensity $\beta = \lambda_2(1-R_2)$ and choosing a polynomial approximation of:
$$F(x) = -x^+ + \frac{\lambda_1(1-R_1)}{\lambda_2(1-R_2)}\,x^- - \frac{r}{\lambda_2(1-R_2)}\,x$$

The linear term in F can be arbitrarily scaled by a constant μ/β by doing a proper discounting $u = e^{-\mu(T-t)}v$.
A natural question is to characterise the error of the algorithm as
a function of the approximation error of u+ by F(u). Using the para-
bolicity of the semi-linear PDE, we can characterise the bias of our
algorithm (the proof is reported in the Appendix).

Proposition 2
Let us assume that $\underline{F}$ and $\bar F$ are two polynomials satisfying (comparison) and the sufficient condition in Proposition 1 for a maturity T, and:
$$\underline{F}(x) \le x^+ \le \bar F(x)$$

We denote by $\underline v$ and $\bar v$ the corresponding solutions of 20.20 and by v the solution of 20.19. Then:
$$\underline v \le v \le \bar v$$

A similar result can be found for PDE 20.4. In the case of American-
style options, our algorithm gives robust lower and upper bounds.

Complexity
By approximating $u^+$ with a high-order – say $N_2$ – polynomial, our algorithm converges towards the brute force nested Monte Carlo method, with a complexity $O(N_1 \times N_2)$. By comparison, with our choice 20.21, the complexity is at most $O(4N_1)$ for PDE type 20.4. The regression-based method has a complexity of $O(N_1 \times b)$, with b the number of regressors. Here, the accuracy depends on the choice of basis functions, which may require

Table 20.1 Monte Carlo price, quoted as a percentage of notional, as a function of the number of Monte Carlo paths 2^N

N    Fair (PDE1)   Stdev (PDE1)   Fair (PDE2)   Stdev (PDE2)
12   21.31         0.79           20.78         0.78
14   21.37         0.39           22.25         0.39
16   21.76         0.20           21.97         0.19
18   21.51         0.10           21.90         0.10
20   21.48         0.05           21.86         0.05
22   21.50         0.02           21.81         0.02

Note: PDE pricer (PDE1) = 21.50; PDE pricer (PDE2) = 21.82; nonlinearity F(u) = ½(u³ − u²)

Table 20.2 Monte Carlo price, quoted as a percentage of notional, as a function of the number of Monte Carlo paths 2^N

N    Fair (PDE1)   Stdev (PDE1)   Fair (PDE2)   Stdev (PDE2)
12   20.00         0.78           21.14         0.78
14   19.90         0.39           21.56         0.38
16   20.25         0.20           21.62         0.19
18   20.39         0.10           21.31         0.10
20   20.36         0.05           21.38         0.05
22   20.40         0.02           21.36         0.02

Note: PDE pricer (PDE1) = 20.39; PDE pricer (PDE2) = 21.37; nonlinearity F(u) = ⅓(u³ − u² − u⁴)

Table 20.3 Monte Carlo price, quoted as a percentage of notional, as a function of the maturity for the nonlinearity F(u) = u² + u

Maturity (year)   BBM alg. (Stdev)   PDE
0.5               71.66 (0.09)       71.50
1                 157.35 (0.49)      157.17
1.1               ∞ (∞)              ∞

Note: ψ(x) ≡ 1_{x>1}

numerical experimentation and a good understanding of the financial derivative under consideration.
This is a common feature in the pricing of American-style options.
The main contribution of our approach is to remove the bias
produced by this difficult regression procedure.
Moreover, this method allows us to solve exactly PDE type 20.3,
which cannot be tackled without relying on an approximation
within the nested Monte Carlo method or the regression-based
BSDE approach.

Table 20.4 Monte Carlo price, quoted as a percentage of notional, as a function of the maturity for PDE1 with β = 1%

Maturity (year)   PDE with poly.   BBM alg. (Stdev)   PDE
2                 11.62            11.63 (0.00)       11.62
4                 16.54            16.53 (0.00)       16.55
6                 20.28            20.27 (0.00)       20.30
8                 23.39            23.38 (0.00)       23.41
10                26.11            26.09 (0.00)       26.14

Table 20.5 Monte Carlo price, quoted as a percentage of notional, as a function of the maturity for PDE2 with β = 1%

Maturity (year)   PDE with poly.   BBM alg. (Stdev)   PDE
2                 11.62            11.64 (0.00)       11.63
4                 16.56            16.55 (0.00)       16.57
6                 20.32            20.30 (0.00)       20.34
8                 23.45            23.45 (0.00)       23.48
10                26.20            26.18 (0.00)       26.24

Numerical examples
Before applying our algorithm to the problem of credit valuation
adjustment, we check it on polynomials that do not belong to the
classes defined by 20.10.

Experiment 1
We have implemented our algorithm for the two PDE types:
$$\partial_t u + \mathcal{L}u + \beta\left(F(u) - u\right) = 0, \qquad u(T,x) = \mathbf{1}_{x>1} \quad : \text{PDE2}$$
and:
$$\partial_t u + \mathcal{L}u + \beta\left(F\left(\mathbb{E}_{t,x}\left[\mathbf{1}_{X_T>1}\right]\right) - u\right) = 0, \qquad u(T,x) = \mathbf{1}_{x>1} \quad : \text{PDE1}$$

with $F(u) = \frac{1}{2}(u^3 - u^2)$. $\mathcal{L}$ is the Itô generator of a geometric Brownian motion with a volatility $\sigma_{BS} = 0.2$ and the Poisson intensity is β = 0.05. In financial terms, this corresponds to a credit default swap spread around 500 basis points. The maturity is T = 10 years. From 20.18, we note that our optimal probability distributions for PDE1 and PDE2 coincide with the uniform distribution. Moreover, our Proposition 1 gives that the solution does not blow up.
The numerical method has been checked against a one-dimen-
sional PDE solver with a fully implicit scheme (see Table 20.1) for

Table 20.6 Monte Carlo price, quoted as a percentage of notional, as a function of the maturity for PDE1 with β = 3%

Maturity (year)   PDE with poly.   BBM alg. (Stdev)   PDE
2                 12.34            12.35 (0.00)       12.35
4                 17.72            17.71 (0.00)       17.75
6                 21.77            21.76 (0.00)       21.82
8                 25.07            25.06 (0.00)       25.14
10                27.89            27.88 (0.00)       27.98

Table 20.7 Monte Carlo price, quoted as a percentage of notional, as a function of the maturity for PDE2 with β = 3%

Maturity (year)   PDE with poly.   BBM alg. (Stdev)   PDE
2                 12.38            12.39 (0.00)       12.39
4                 17.88            17.86 (0.00)       17.91
6                 22.08            22.07 (0.00)       21.14
8                 25.58            25.57 (0.00)       25.66
10                28.62            28.60 (0.00)       28.74

which we find u = 21.50% (PDE1) and u = 21.82% (PDE2). Note that this algorithm converges as expected and the error is properly indi-
cated by the Monte Carlo standard deviation estimator (see the
column headed Stdev).

Experiment 2
Same test with $F(u) = \frac{1}{3}(u^3 - u^2 - u^4)$ (see Table 20.2).

Experiment 3: Blow-up
It is well known that the semi-linear PDE in $\mathbb{R}^d$:

$$\partial_t u + \mathcal{L}u + u^2 = 0$$

blows up in finite time if d ≤ 2 for any bounded positive payout (see Sugitani, 1975). We deduce that the PDE with the nonlinearity F(u) = u² + u blows up in finite time (T_max) in one dimension. Using our first proposition, our sufficient condition reads:

$$T_{\max}\,\|\psi\|_{\infty} < 1$$

We have verified this explosion when the maturity T is greater than one year (in our case ψ = 1_{x>0}, ‖ψ‖_∞ = 1) using our algorithm (and a PDE solver as a benchmark, see Table 20.3). Note that for T = 1, the algorithm starts to blow up (see Stdev = 0.49). A different stochastic representation can be obtained by setting u = e^{(T−t)}v; substituting this into the PDE above and dividing by e^{(T−t)} gives:

$$\partial_t v + \mathcal{L}v + e^{(T-t)}v^2 - v = 0,\qquad v(T,x) = \psi(x)$$

and this can be interpreted as a binary tree with a weight e^{(T−t)}. Our stochastic representation then gives:

$$u(t,x) = e^{T-t}\,E_{t,x}\!\left[\,\prod_{i=1}^{N_T}\psi\bigl(z_T^i\bigr)\; e^{\sum_{i=1}^{\#\mathrm{branchings}}(T-\tau_i)}\right] \qquad (20.23)$$

where τ_i is the time at which the ith branching occurs. This representation (20.23) appears in López-Mimbela and Wakolbinger (1998) and was used to reproduce Sugitani's blow-up criteria (1975).
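Representation 20.23 can be simulated directly. The sketch below is again only an illustration with our own variable names, under the same driftless GBM assumption; it uses the payout ψ(x) = 1_{x>1} from the note to Table 20.3 (the running text writes 1_{x>0}, but either choice has ‖ψ‖_∞ = 1), and shows how the estimator deteriorates as T approaches the blow-up time.

import numpy as np

sigma_bs = 0.2

def psi(x):
    # Payout as in the note to Table 20.3.
    return 1.0 if x > 1.0 else 0.0

def sample_u(T, x0, rng):
    # One draw of representation 20.23: binary branching at unit rate,
    # each branching at time tau contributing a weight exp(T - tau).
    stack = [(0.0, x0)]
    value = np.exp(T)                  # prefactor e^{T - t} with t = 0
    while stack:
        t, x = stack.pop()
        tau = t + rng.exponential(1.0)
        dt = min(tau, T) - t
        x *= np.exp(-0.5 * sigma_bs**2 * dt
                    + sigma_bs * np.sqrt(dt) * rng.standard_normal())
        if tau >= T:
            value *= psi(x)            # leaf of the Galton-Watson tree
        else:
            value *= np.exp(T - tau)   # weight e^{(T - tau_i)}
            stack.extend([(tau, x), (tau, x)])
    return value

rng = np.random.default_rng(1)
for T in (0.5, 1.0):
    draws = np.array([sample_u(T, 1.0, rng) for _ in range(100_000)])
    # The sample standard deviation grows rapidly as T approaches the blow-up time.
    print(T, draws.mean(), draws.std() / np.sqrt(draws.size))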

CVA: Some examples
We have implemented our algorithm for the two PDE types:

$$\partial_t u + \tfrac{1}{2}x^2\sigma_{BS}^2\,\partial_x^2 u + \beta\bigl(u^{+} - u\bigr) = 0,\qquad u(T,x) = 1 - 2\cdot\mathbf{1}_{x>1} \quad \text{: PDE1}$$


and:

$$\partial_t u + \tfrac{1}{2}x^2\sigma_{BS}^2\,\partial_x^2 u + \frac{\beta}{1-R}\Bigl((1-R)\,E_{t,x}\bigl[1-2\cdot\mathbf{1}_{X_T>1}\bigr]^{+} + R\,E_{t,x}\bigl[1-2\cdot\mathbf{1}_{X_T>1}\bigr] - u\Bigr) = 0 \quad \text{: PDE2}$$

with Poisson intensities β = 1% and β = 3%, and a recovery rate R = 0.4 (see Tables 20.4–20.7). In financial terms, this corresponds to credit default swap spreads of around 100 and 300 basis points. We have used a deterministic default intensity to check our numerical algorithm against a one-dimensional PDE solver. A more realistic example with a stochastic intensity can be easily handled (see Algorithm: Final recipe). The question of model relevance for pricing counterparty risk is a key issue and has been explored by many authors (see, for example, Brigo and Morini, 2011, and Brigo and Pallavicini, 2008 and 2007) but is not addressed here.
The method has been checked using a PDE solver with the polynomial approximation 20.21 (see the column headed PDE with poly.). To justify the validity of 20.21, we have included the PDE price with the true nonlinearity u⁺ (see the column headed PDE). As can be observed, the prices produced by our algorithm converge to the PDE solver with the polynomial approximation and are close to the exact CVA values. We would like to highlight that replacing the Black–Scholes generator $\tfrac{1}{2}x^2\sigma_{BS}^2\partial_x^2$ by a multi-dimensional operator $\mathcal{L}$ can easily be handled in our framework by simulating the branching particles with a diffusion process associated with $\mathcal{L}$. This is out of reach for finite-difference schemes and not such an easy step for the BSDE approach.
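The polynomial approximation 20.21 itself is not reproduced in this excerpt. Purely as a hypothetical illustration of the preprocessing involved, the sketch below fits a low-degree polynomial to the nonlinearity u⁺ on [−1, 1] — a natural interval here, since the payout 1 − 2·1_{x>1} keeps u within [−1, 1] — and derives one possible branching law from its coefficients. The degree, the interval and the choice p_k ∝ |a_k| are our own, not the chapter's.

# Hypothetical illustration only: fit u+ by a degree-4 polynomial on [-1, 1]
# so that the CVA nonlinearity fits the branching framework; the chapter's
# actual approximation 20.21 and probabilities are not reproduced here.
import numpy as np

u = np.linspace(-1.0, 1.0, 2001)
target = np.maximum(u, 0.0)                          # the nonlinearity u+
a = np.polynomial.polynomial.polyfit(u, target, 4)   # coefficients a_0, ..., a_4
fit = np.polynomial.polynomial.polyval(u, a)
print("max fit error:", np.max(np.abs(fit - target)))

# One simple choice of branching law: p_k proportional to |a_k|, with the
# mark a_k / p_k carried whenever a particle branches into k offspring.
p = np.abs(a) / np.abs(a).sum()
marks = np.divide(a, p, out=np.zeros_like(a), where=p > 0)
print("p_k:", p)
print("marks a_k/p_k:", marks)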

Conclusion
CVA is now an important quantitative issue that needs to receive special attention. Neither brute-force nested Monte Carlo nor the regression-based BSDE approach is a good solution for large, multi-asset portfolios. The algorithm presented here, based on marked branching diffusions, reduces the complexity by an order of magnitude, as illustrated by our numerical examples. This method can also be used for semi-linear PDEs with polynomial nonlinearities, and extended to fully nonlinear PDEs by including Malliavin weights for derivatives in the branching process. We leave this investigation for future research.


The author would like to thank the other members of the team for their comments. He is also grateful to Jean-François Delmas and Denis Talay for useful discussions.

REFERENCES

Benth F., K. Karlsen and K. Reikvam, 2003, “A Semilinear Black and Scholes Partial Differential Equation for Valuing American Options,” Finance and Stochastics, 7(3), pp 277–98.

Brigo D. and M. Morini, 2011, “Close-out Convention Tensions,” Risk, December, pp 74–78 (available at www.risk.net/2128152).

Brigo D. and A. Pallavicini, 2008, “Counterparty Risk and CCDSs under Correlation,” Risk, February, pp 84–88 (available at www.risk.net/1500236).

Brigo D. and A. Pallavicini, 2007, “Counterparty Risk Pricing Under Correlation Between Default and Interest-rates,” in J. Miller, D. Edelman and J. Appleby (Eds), Numerical Methods for Finance (Boca Raton, FL: Chapman & Hall/CRC).

Burgard C. and M. Kjaer, 2011, “PDE Representations of Derivatives with Bilateral Counterparty Risk and Funding Costs,” Journal of Credit Risk, 7(3) (available at http://ssrn.com/abstract=1605307).

Durrett R., 2005, Probability: Theory and Examples (3e) (Belmont, CA: Thomson Brooks/Cole).

Fahim A., N. Touzi and X. Warin, 2011, “A Probabilistic Numerical Method for Fully Nonlinear Parabolic PDEs,” Annals of Applied Probability, 21(4), pp 1,322–64.

Fleming W. and H. Soner, 1993, Controlled Markov Processes and Viscosity Solutions (New York, NY: Springer).

Gobet E., J.-P. Lemor and X. Warin, 2006, “Rate of Convergence of an Empirical Regression Method for Solving Generalized Backward Stochastic Differential Equations,” Bernoulli, 12(5), pp 889–916.

Henry-Labordère P., 2012, “Counterparty Risk Valuation: A Marked Branching Diffusion Approach,” extended version (available at http://ssrn.com/abstract=1995503).

López-Mimbela J. and A. Wakolbinger, 1998, “Length of Galton-Watson Trees and Blow-up of Semilinear Systems,” Journal of Applied Probability, 35(4), pp 802–11.

McKean H., 1975, “Application of Brownian Motion to the Equation of Kolmogorov–Petrovskii–Piskunov,” Communications on Pure and Applied Mathematics, 28(3), May, pp 323–31.

Pardoux E. and S. Peng, 1990, “Adapted Solution of a Backward Stochastic Differential Equation,” Systems & Control Letters, 14, pp 55–61.

Sugitani S., 1975, “On Non-existence of Global Solutions for Some Nonlinear Integral Equations,” Osaka Journal of Mathematics, 12, pp 45–51.

Touzi N., 2010, “Optimal Stochastic Control, Stochastic Target Problems, and Backward SDE,” lecture notes.



Notes on Chapters

Chapter 1: Lorenzo Bergomi, 2009, “Smile Dynamics IV,” Risk, December.


Chapter 2: Vladimir Piterbarg, 2010, “Funding Beyond Discounting:
Collateral Agreements and Derivatives Pricing,” Risk, February.
Chapter 3: Marco Bianchetti, 2010, “Two Curves, One Price,” Risk, August.
Chapter 4: Fabio Mercurio, 2010, “A Libor Market Model with a Stochastic
Basis,” Risk, December.
Chapter 5: Jesper Andreasen and Brian Huge, 2011, “Volatility
Interpolation,” Risk, March.
Chapter 6: Jesper Andreasen and Brian Huge, 2011, “Random Grids,” Risk,
July.
Chapter 7: Julien Guyon and Pierre Henry-Labordère, 2012, “Being
Particular about Calibration,” Risk, January.
Chapter 8: Vladimir Piterbarg, 2012, “Cooking with Collateral,” Risk,
August.
Chapter 9: Marco Avellaneda and Mike Lipkin, 2009, “A Dynamic Model
for Hard-to-borrow Stocks,” Risk, June.
Chapter 10: Richard Martin and Roland Ordovàs, 2010, “Shortfall Factor
Contributions,” Risk, October.
Chapter 11: Christian Fries, 2011, “Stressed in Monte Carlo,” Risk, April.
Chapter 12: Attilio Meucci, 2011, “A New Breed of Copulas for Risk and
Portfolio Management,” Risk, September.
Chapter 13: Robin Stuart, 2011, “A Historical-parametric Hybrid VAR,”
Risk, December.
Chapter 14: Jean-Philippe Bouchaud, Fabio Caccioli and Doyne Farmer,
2012, “Impact-adjusted Valuation and the Criticality of Leverage,” Risk,
December.
Chapter 15: Jon Gregory, 2009, “Being Two-faced Over Counterparty
Credit Risk,” Risk, February.


Chapter 16: Luca Capriotti, Jacky Lee and Matthew Peacock, 2011, “Real-
time Counterparty Credit Risk Management in Monte Carlo,” Risk, June.
Chapter 17: Michael Pykhtin, 2011, “Counterparty Risk Capital and CVA,”
Risk, August.
Chapter 18: Christoph Burgard and Mats Kjaer, 2011, “Partial Differential
Equation Representations of Derivatives with Bilateral Counterparty Risk
and Funding Costs,” Journal of Credit Risk, 7(3).
Chapter 19: Damiano Brigo and Massimo Morini, 2011, “Close-out
Convention Tensions,” Risk, December.
Chapter 20: Pierre Henry-Labordère, 2012, “Cutting CVA’s Complexity,”
Risk, July.



Index
(page numbers in italic type refer to figures and tables)

A
adjoint algorithmic differentiation (ADD) 259
  and counterparty credit risk 264–7
  and counterparty credit risk management (CCRM) 266–7
  simple example of 271–2
  see also counterparty credit risk

B
backward finite difference (BFD) scheme 91–2, 96–7
backward partial differential equations (BPDEs) 91
Basel II 179
  and counterparty credit risk (CCR) 276, 283–4, 286–8
    capital for 286–8
Basel III:
  and counterparty credit risk (CCR) 275, 276, 283–4
    capital for 288–90
Bear Stearns 243
Bergomi’s local stochastic volatility model 121, 122, 124–5
  see also calibration
bilateral counterparty risk 246–54
  with no simultaneous defaults 246
  with simultaneous defaults 247–52
  see also counterparty credit risk
Black–Scholes, with collateral 28–30

C
calibration 109–26, 120, 121, 122, 123
  hybrid local stochastic volatility model 119–21
  hybrid models 113–14
  and local stochastic volatility model 111–13
  as longstanding problem 109
  Malliavin representation 114–17
    and one-factor, short-rate models 115–17
  and McKean SDEs 109, 110–11
    particle method 110–11
  numerical tests 121–5
    Bergomi’s local stochastic volatility model 121, 122, 124–5
    Ho–Lee/Dupire hybrid model 120, 121–4
    local stochastic volatility (LSV) 123, 125
  particle simulation method 117–21
    algorithm 119


    local stochastic volatility model 117–19
capital asset pricing model (CAPM) 213, 215
caplet pricing 66–7, 71, 72
CCRM, see counterparty credit risk: management of
Chebyshev–Hermite polynomials 226
close-out 315–27, 320, 321, 322, 323, 327
  quantitative analysis and numerical example 319–26
  risk-free vs replacement 316–17
  and unilateral and bilateral valuation adjustments 317–19
collateral agreements and derivatives pricing 25–41, 36, 37, 39, 40
  and Black–Scholes 28–30
  European-style options 35–40
    and distribution impact of convexity adjustment 35–7
    example: stochastic funding model 37–40
  forward contract 31–4
    calculating CSA convexity adjustment 33–4
    with CSA 32–3
    and futures contracts, relationship with 34
    minus CSA 31–2
  preliminaries 27
  and risk-free curve for lending 27
  zero-strike call option 31
collateralised assets 129–45, 145
  and collateral choice, simple model for 139–44
    collateral choice 139–40
    and collateral choice option, valuing 143–4
    dynamics 141–2
    example 144
    observations 142–3
    zero-coupon curves 140
  cross-currency model 134–9
    under domestic collateral 136–7
    domestic and foreign 135–6
    drift of FX rate 136
    under foreign collateral 137–8
    and forward forex 138–9
  different rates 132–3
    example, many 133–4
    example, two 131–2
  processes 130–1
  rates, counterparty-specific 134
copula-marginal algorithm (CMA) 197–8, 202–9, 203
  case study: copula transformations 207–9
  case study: panic copula 204–7
copulas 197–210, 198, 203, 207, 209
  and copula-marginal algorithm (CMA) 197–8, 202–9, 203
    case study: copula transformations 207–9
    case study: panic copula 204–7
  for risk and portfolio management 197–210
  theory of, reviewed 198–9
  traditional implementation of 198, 200–2
counterparty credit exposure 276–7


counterparty credit risk 243–56, 259–72
  and adjoint algorithmic differentiation 264–6
  and adjoint algorithmic differentiation (ADD) 264–7
    simple example of 271–2
  bilateral 246–54
    example 248–52
    with no simultaneous defaults 246
    with simultaneous defaults 247–52
    vs unilateral 253–4
  capital, and credit value adjustment (CVA) 275–92
    and counterparty credit exposure 276–7
    and credit value adjustment (CVA) 277–80
    and loss in a trading book 280–2
  capital for, under Basel II 286–8
  capital for, under Basel III 288–90
  as credit risk 276, 283–6
  management of (CCRM) 260–4
    and adjoint algorithmic differentiation (ADD) 266–7
    real-time 259–72
  as market risk 275–6, 282–3
  numerical test 268–70, 269, 270
  and rating transition risk 267–8
  unilateral 244–5
    vs bilateral 253–4
counterparty-specific collateral rates 134
  see also collateralised assets
credit crunch:
  and basis spreads 43, 61
  pre- and post-, market practices for pricing and hedging interest-rate derivatives 45–8
credit risk, CCR as 276, 283–6
credit value adjustment (CVA):
  algorithm 340–4
  and branching diffusions 334–40
    marked 336–40
  complexity of, cutting 329–47, 336, 342, 343, 344, 345
    numerical examples 345–7, 345
    and valuation PDEs 330–4
  and counterparty risk capital 275–92
    and counterparty credit exposure 276–7
    and credit value adjustment (CVA) 277–80
    and loss in a trading book 280–2

D
derivatives pricing and collateral agreements, see collateral agreements and derivatives pricing
derivatives with bilateral counterparty risk and funding costs:
  partial differential equations (PDE) representations of 295–314, 311, 312, 313
    chapter results 297–9
    examples 310–11
    and mark-to-market at default 305–10
    possible extensions 311–14


“discounting” yield curves 44
“double-curve” methodology 43–60, 45, 50–1, 56
  and interest-rate derivatives, double-curve pricing of 43–60
  and no-arbitrage and basis adjustment 48–50, 50–1
  and no-arbitrage and counterparty risk 57–8
  and no-arbitrage and quanto adjustment 50–4, 56
  and pre- and post-credit-crunch market practices for pricing and hedging interest-rate derivatives 45–8

E
Edgeworth expansion 223–4
Enron 243
Euler formula 165, 166, 168, 171, 184, 186
  large amount of literature on 167
exchange-traded funds (ETFs), leveraged, hard-to-borrowness in 162–3

F
finite difference schemes 94–8
forward finite difference (FFD) scheme 91–2
“forwarding” yield curves 44

G
global financial crisis, see credit crunch
Gram–Charlier A 223
grids, random, see random grids

H
hard-to-borrow stocks 149–64, 151, 152, 153, 158, 159, 160, 163
  and artificially high prices and sharp drops 150
  and buy-ins and effective dividend yield 154–6
  and cost of conversions 151–2
  and exchange-traded-funds (ETFs), leveraged, hard-to-borrowness in 162–3
  the model 154
  option pricing for 157–61
  and short squeezes 150–1
  and shorting, cost of 154–6
  and vertical spreads, unusual pricing of 152–3
Ho–Lee/Dupire hybrid model 120, 121–4
  see also calibration

I
interest-rate derivatives, double-curve pricing of 54–7
interest-rate market segmentation 44
International Swaps and Derivatives Association 249, 316
  Master Agreement of 25, 295, 296, 300, 331–2

L
Lehman Bros 163, 238, 243, 315, 317
leverage, criticality of:
  and impact-adjusted valuation 229–39, 234, 237
  and market impact and liquidation accounting 230–3


Libor market model (LMM):
  example of calibration to real market data 72–4, 73
  example of rate and spread dynamics 71–2
  extending (McLMM) 63–74, 64
    calibration to real market data, example of 72–4, 73
    and caplet pricing 66–7
    and measuring changes and option pricing 70–1
    multi-tenor, tractable class of 69–74
    and rate and spread dynamics, example of 71–2
    single-tenor, general framework for 65–9
    and swaption pricing 67–9
  and measuring changes and option pricing 70–1
  and rate and spread dynamics, example of 71–2
  with stochastic basis 61–75, 64, 73
    assumptions and definitions 62–3
LMM, see Libor market model
local Bergomi and Ho–Lee hybrid model 123, 125
local stochastic volatility (LSV) 111–13
  existence of, under question 125
  hybrid 119–21
  and particle simulation 117–19
  see also calibration
Long-Term Capital Management 238

M
McKean SDEs 110–11
  see also stochastic differential equation
McLMM, see under Libor market model
Malliavin representation 114–17
mark-to-market accounting 229–30
  badly needed alternative to 230
  as standard industry practice 229
market impact and liquidation accounting 230–3
market risk, CCR as 275–6, 282–3
market segmentation 44
Monte Carlo:
  failure of, in stress-testing 183–6
  and PDEs 186–8
  simulation, with boundary conditions 188–93
    boundary assumptions 192–3
    definition of boundary value process 191–2
    and modified valuation algorithm 189–91
    other applications 194
    restricted to inbound regime 188–9

N
no-arbitrage and basis adjustment 48–50, 50–1
no-arbitrage and counterparty risk 57–8
no-arbitrage and quanto adjustment 50–4
numerical tests 121–5
  Bergomi’s local stochastic volatility model 121, 122, 124–5


  calculation of risk on CVA of commodity derivatives 268
  Ho–Lee/Dupire hybrid model 120, 121–4
  local Bergomi and Ho–Lee hybrid model 123, 125
  see also calibration

O
one-month-maturity options, application to 18–19
option pricing for hard-to-borrow stocks 157–61
  see also hard-to-borrow stocks

P
Parmalat 243
partial differential equations:
  backward 91–2
  forward 91–2
  and the model 92–4
partial differential equations (PDEs):
  backward 91, 92, 94
  bilateral, risky, model setup and derivation of 299–304
  Fokker–Planck 109, 110, 111
  greater robustness of 186–8
  non-linear 126
  and random grids, see random grids
  representing derivatives with bilateral counterparty risk and funding costs 295–314, 311, 312, 313
    chapter results 297–9
    examples 310–11
    and mark-to-market at default 305–10
    possible extensions 311–14
  valuation 330–4
particle algorithm 119
PowerShares 162

R
random grids 91
  and backward partial differential equations (BPDEs) 91
  extensions 105–6
  and finite difference schemes 94–8
  model and PDEs 92–4
  and Monte Carlo simulation 98–100
  numerical implementation and examples 100–5, 101, 102, 103
  and stochastic differential equation (SDE), discrete approximations of 91
rating transition risk 267–8
risk-free versus replacement close-out, practical consequences of 316–17; see also close-out
risk and portfolio management:
  copulas for 197–210
  theory of, review 198–9
  traditional implementation of 198, 200–2

S
scaling behaviour 8–13
  Type II behaviour with Eurostoxx 50 index 12–13
  Type II, in a two-factor model 10–12


short maturities, arbitraging SSR for 13–21
  arbitraging the 95/105 one-month skew on Eurostoxx 50 19–21
  and consistency conditions 16–17
  and one-month-maturity options, application to 18–19
  and short near-the-money options, model for 13–21
short-maturity limit of ATMF skew and SSR 7–8
shortfall factor contributions 165–81, 169, 170–1, 174, 175, 176, 177, 178, 179
  and factor decomposition formula for ES less EL 170–1
  and hedges, interpretation of formula via 171–2
  multivariate normal model 173
  Vasicek (probit) model and numerical examples 173–4
skew stickiness ratio (SSR):
  arbitraging, for short maturities 13–21
    arbitraging the 95/105 one-month skew on Eurostoxx 50 19–21
    and consistency conditions 16–17
    and one-month-maturity options, application to 18–19
    short near-the-money options, model for 13–21
  vanilla ATM skew 4–6
smile dynamics 3–22, 9, 11, 12, 19, 20, 21, 22
  and scaling behaviour 8–13
    Type II behaviour with Eurostoxx 50 index 10–12
    Type II, in a two-factor model 10–12
  and short maturities, arbitraging SSR for 13–21
    arbitraging the 95/105 one-month skew on Eurostoxx 50 19–21
    and consistency conditions 16–17
    and one-month-maturity options, application to 18–19
    short near-the-money options, model for 13–21
    short-maturity limit of ATMF skew and SSR 7–8
  and skew stickiness ratio 4–8
    short-maturity limit of ATMF skew and SSR 7–8
stochastic differential equation (SDE):
  discrete approximations of 91
  McKean 109–11, 112, 113
stress-testing 183–94
  and Monte Carlo:
    failure of 183–6
    and PDE’s greater robustness 186–8
  Monte Carlo simulation, with boundary conditions 188–93
    boundary assumptions 192–3
    definition of boundary value process 191–2


    and modified valuation algorithm 189–91
    other applications 194
    restricted to inbound regime 188–9
  numerical results 193–4
swap curves, euro and Euribor 45
swaption pricing 67–9, 71

U
unilateral counterparty risk 244–5
  see also counterparty credit risk

V
valuation, impact-adjusted:
  and criticality of leverage 229–39, 234, 237
  and market impact and liquidation accounting 230–3
value-at-risk (VaR):
  and Chebyshev–Hermite polynomials 226
  computation of 219–24
  and expected tail loss 225
  and Gram–Charlier A 223
  historical-parametric hybrid (hybrid VaR) 213–26, 216, 221, 222
  and moments for correlated residuals 218–19
  and P&L distribution for combined positions 217–18
vanilla ATM skew 4–6
volatility interpolation 77–88, 80, 81, 82, 84
  and absence of arbitrage, proof 85–6
  and absence of arbitrage and stability 79–83
  algorithm 83–4
  discrete expiries 78–9
  filling the gaps 79
  numerical example 84–5

X
Xibor 43

Y
yield curves, “forwarding” and “discounting” 44
