Chibisi Chima-Okereke
Mango Solutions
E-mail: cchima-okereke@mango-solutions.com
Agenda
What is R?
R as a functional language
Basic Examples
Actuarial pricing
GLM Example
Calculations are hard to check, and some types of error can be silent and go unnoticed
Do your spreadsheets start to grind to a halt with rather moderate sets of data?
Versioned Excel files can be over 50 MB each, whereas script versions are a few KB. Imagine this across your network and the waste of space it encourages
VBA has versioning problems and is inadequate for data analysis and most useful purposes. Harsh but true?
What is R?
People have described R as:
A big calculator?
A programming language?
A rapid prototyping tool?
A free SAS?
A statistical analysis tool?
Useful R Features
Easy output to many formats: all picture file types, data formats, even Excel!
Current Actuarial R Packages
ChainLadder
lifecontingencies
LifeTables
http://cran.r-project.org/web/packages/
Functional Programming
Reference: http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/
lapply(list, function)
http://www.jstatsoft.org/v40/i01/paper
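A minimal sketch of the lapply(list, function) pattern. The claim amounts and the names claims, meanClaim, claimRange and nClaims are made up for illustration and are not taken from the talk.

```r
# Hypothetical claim amounts grouped by policy year (made-up numbers)
claims <- list(
  y2008 = c(120, 340, 90),
  y2009 = c(200, 410),
  y2010 = c(75, 60, 180, 95)
)

# lapply applies the function to each list element and returns a list
meanClaim <- lapply(claims, mean)

# An anonymous function is passed in exactly the same way
claimRange <- lapply(claims, function(x) max(x) - min(x))

# sapply does the same job but simplifies the result to a named vector
nClaims <- sapply(claims, length)

print(meanClaim)
print(nClaims)
```

The function is passed as an ordinary argument, so no explicit loop or index variable is needed.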
Data Source (Simulated): Modern Actuarial Risk Theory Using R: Kaas, Goovaerts, Dhaene, and Denuit.
Dynamic SQL Query Example
require(RODBC)
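One way a dynamic query might be assembled, as a sketch only. The function buildClaimsQuery, the table name claims, the columns Year and Age, and the DSN "myDSN" are all hypothetical; the RODBC calls are left commented out because they need a configured data source.

```r
# Build a query string from R values.
# Table and column names are hypothetical placeholders.
buildClaimsQuery <- function(table, years, minAge) {
  sprintf(
    "SELECT * FROM %s WHERE Year IN (%s) AND Age >= %d",
    table,
    paste(years, collapse = ", "),  # e.g. "2008, 2009, 2010"
    as.integer(minAge)
  )
}

query <- buildClaimsQuery("claims", 2008:2010, 24)
print(query)

# With a configured ODBC data source the query could then be sent via RODBC:
# library(RODBC)
# channel    <- odbcConnect("myDSN")     # "myDSN" is a placeholder DSN
# claimsData <- sqlQuery(channel, query) # returns a data frame on success
# odbcClose(channel)
```

Because the query is just a string, the same function can be reused inside lapply to pull one data set per year or per age band. Values pasted into SQL this way should come from trusted code rather than free user input.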
C:\Users\cchima-okereke\Documents\R\RScripts\ActuarialPricing\tmp\myPlots.pdf
Plotting Analysis
GUI In R (claimsExploreR)
Figure: Histogram of claim counts with BonusMalus and Age. Panels by Age (18-23, 24-64, >65) and Year (2001 to 2010); y-axis: Frequency.
Figure: Exposure Weighted Severity (Log Scale) against BonusMalus (levels 1 to 14). Panels by Age (18-23, 24-64, >65) and Year (2001 to 2010).
GLM Models in Pricing
Poisson Frequency
Gamma Severity
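A hedged sketch of the two model types named above, fitted with base R's glm(). The data are simulated here purely for illustration (they are not the data set used in the talk), and the column names AgeBand, BonusMalus, Exposure, nClaims and AvgCost are assumptions.

```r
set.seed(1)
n <- 2000

# Simulated policy data (illustrative only)
policies <- data.frame(
  AgeBand    = factor(sample(c("18-23", "24-64", ">65"), n, replace = TRUE)),
  BonusMalus = sample(1:14, n, replace = TRUE),
  Exposure   = runif(n, 0.2, 1)
)
lambda <- policies$Exposure * exp(-1.5 + 0.05 * policies$BonusMalus)
policies$nClaims <- rpois(n, lambda)

# Poisson frequency model: log link, log(Exposure) as offset
freqModel <- glm(nClaims ~ AgeBand + BonusMalus + offset(log(Exposure)),
                 family = poisson(link = "log"), data = policies)

# Gamma severity model: fitted only on policies with at least one claim,
# average claim cost as response, number of claims as weight
claimants <- policies[policies$nClaims > 0, ]
claimants$AvgCost <- rgamma(nrow(claimants), shape = 2, rate = 2 / 1000)
sevModel <- glm(AvgCost ~ AgeBand + BonusMalus,
                family = Gamma(link = "log"), data = claimants,
                weights = nClaims)

print(coef(freqModel))
print(coef(sevModel))
```

With log links in both models, the expected pure premium per unit of exposure is the product of the fitted frequency and the fitted severity.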
What metrics shall we use to include/exclude variables?
Information Criteria: AIC, BIC (Multiple flavours)
Significance of variable: Chi-Squared/F-Test
Consistency measures
Other Measures
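A small sketch of how two of these metrics can be computed in base R: AIC() and BIC() for the information criteria, and anova() with a Chi-squared test for the significance of an added variable. The data are simulated, and the variable Noise is a deliberately irrelevant predictor invented for this example.

```r
set.seed(2)
n <- 1000

# Simulated data: BonusMalus drives claim counts, Noise does not
d <- data.frame(
  BonusMalus = sample(1:14, n, replace = TRUE),
  Noise      = rnorm(n)
)
d$nClaims <- rpois(n, exp(-1.5 + 0.08 * d$BonusMalus))

smallModel <- glm(nClaims ~ BonusMalus,         family = poisson, data = d)
bigModel   <- glm(nClaims ~ BonusMalus + Noise, family = poisson, data = d)

# Information criteria: lower is better
ic <- data.frame(
  model = c("small", "big"),
  AIC   = c(AIC(smallModel), AIC(bigModel)),
  BIC   = c(BIC(smallModel), BIC(bigModel))
)
print(ic)

# Chi-squared (likelihood-ratio) test for the extra variable
lrTest <- anova(smallModel, bigModel, test = "Chisq")
print(lrTest)
```

For families with an estimated dispersion parameter, such as the Gamma, anova(..., test = "F") is the usual counterpart to the Chi-squared test.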
Automation Algorithms
What mechanics will we use to select/exclude variables?
Forward Algorithm
Backward Algorithm
Some other bespoke method
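Forward and backward selection can be sketched with base R's step(), which adds or drops terms by AIC. This is one possible mechanic, not necessarily the one used in the talk; the data are simulated and Noise is an invented irrelevant predictor.

```r
set.seed(3)
n <- 1000

# Simulated data: only BonusMalus truly affects claim counts
d <- data.frame(
  BonusMalus = sample(1:14, n, replace = TRUE),
  AgeBand    = factor(sample(c("18-23", "24-64", ">65"), n, replace = TRUE)),
  Noise      = rnorm(n)
)
d$nClaims <- rpois(n, exp(-1.5 + 0.08 * d$BonusMalus))

nullModel <- glm(nClaims ~ 1, family = poisson, data = d)
fullModel <- glm(nClaims ~ BonusMalus + AgeBand + Noise,
                 family = poisson, data = d)

# Forward: start from the intercept-only model and add terms
forwardFit <- step(nullModel,
                   scope = list(lower = ~ 1,
                                upper = ~ BonusMalus + AgeBand + Noise),
                   direction = "forward", trace = 0)

# Backward: start from the full model and drop terms
backwardFit <- step(fullModel, direction = "backward", trace = 0)

print(formula(forwardFit))
print(formula(backwardFit))
```

Passing k = log(n) to step() switches the penalty from AIC to a BIC-style criterion. A bespoke method would replace step() with a custom loop over candidate terms.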
Actuarial Pricing in R
Link to Excel
Figure: Relativity (%) against Bonus Malus (levels 1 to 14), in three panels.
Data residing in some database
Connect to R: RODBC, RPostgreSQL, RODM etc.
Carry out analysis in R
Write results to PDF, any picture format, push to Latex, Excel, CSV, etc
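The analysis and output steps of this workflow might look like the sketch below. It is an illustration under stated assumptions: the data are simulated instead of read from a database, the file names are invented, and results are written to a temporary directory. Relativities are taken as 100 * exp(coefficient) against the base Bonus Malus level.

```r
set.seed(4)
n <- 1000

# Simulated data standing in for a database extract
bm <- sample(1:14, n, replace = TRUE)
d  <- data.frame(BonusMalus = factor(bm),
                 nClaims    = rpois(n, exp(-1.5 + 0.05 * bm)))

# Analysis step: Poisson GLM with Bonus Malus as a factor
fit <- glm(nClaims ~ BonusMalus, family = poisson, data = d)

# Relativities in percent; the base level is 100 by construction
relativity <- data.frame(
  BonusMalus = levels(d$BonusMalus),
  Relativity = 100 * exp(c(0, unname(coef(fit)[-1])))
)

# Output step 1: CSV (file name is hypothetical)
outDir  <- tempdir()
csvFile <- file.path(outDir, "relativities.csv")
write.csv(relativity, csvFile, row.names = FALSE)

# Output step 2: PDF plot (file name is hypothetical)
pdfFile <- file.path(outDir, "relativities.pdf")
pdf(pdfFile)
plot(seq_len(nrow(relativity)), relativity$Relativity, type = "b",
     xlab = "Bonus Malus", ylab = "Relativity (%)")
dev.off()
```

Swapping pdf() for png() or another graphics device changes the picture format without touching the plotting code. Writing to Excel or LaTeX needs an add-on package and is not shown here.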
Advantages of R for GLM Analysis
environment
techniques, ODE/PDE, HMMs, contingency tables,
survival analysis, copulas, extreme value analysis,
geospatial analysis and visualisation
Challenges & Opportunities
IT support for R
For mere mortals (like me) the learning curve is tough and the documentation can appear ambiguous