R in Insurance

11 September 2015

Agenda

Introduction to R
Mortality Overview
Mortality Models
- Classical
- Stochastic
Workshop: Stochastic Mortality Models in R
- Models Fitting
- Mortality Projection - Forecasting
- Portfolio Analysis of BEL and SCR
Case Studies

What is R?

R Highlights

Free, open source, community based statistical software and programming language
Main usage: Statistical and numerical computing, Data Analysis, Data Visualisation
Under GNU GPL (General Public License) allowing commercial usage
Power of R in its packages

“Everything that exists [in R] is an object. Everything that happens [in R] is a function call.” - John Chambers

RStudio

Most popular Environment to run R

http://www.rstudio.com/products/rstudio/

Free & Open-Source Integrated Development Environment (IDE) for R

Features:

Built-in R console
R, LaTeX, HTML, C++ syntax highlighting, R code completion
Easy management of multiple projects
Integrated R documentation
Interactive debugger
Package development tools

Note: R must be installed first!

Playing with R

Type in the interactive console:

3 + 3

## [1] 6

getwd()

## [1] "C:/Users/omierzwa/Documents/GitHub/smm"

1:10

##  [1]  1  2  3  4  5  6  7  8  9 10

Playing with R

Type in the interactive console:

x <- 1:10 # "name <- value returns the value invisibly"

##  [1]  1  2  3  4  5  6  7  8  9 10

(x <- 1:10) # creates x and prints it

##  [1]  1  2  3  4  5  6  7  8  9 10

Operations in R

x + y addition

x - y substraction

x * y multiplication

x / y division

x ^ y exponention

x %% y devision remainder

x %/% y integer division

numeric vector operator numeric vector –> numeric vector

Vectors

(x <- 11:20) # exemplary vector

##  [1] 11 12 13 14 15 16 17 18 19 20

x[1] # first

## [1] 11

x[length(x)] # last element using length function

## [1] 20

x[c(1, length(x), 1)] # first, last and first again

## [1] 11 20 11

Vectors

x[1000] # no such element

## [1] NA

x * 3 # multiplication

##  [1] 33 36 39 42 45 48 51 54 57 60

y <- 1:10
x * y

##  [1]  11  24  39  56  75  96 119 144 171 200

R in Insurance

Main focus Non-life
- fit loss distributions and perform credibility analysis - package actuar
- estimate loss reserves - package ChainLadder
Financial analysis
- packages YieldCurve, termstrc
Life insurance
- handle demography data - package demography
- demography projections - package LifeTables
- actuarial and financial mathematics - package lifecontingencies
- Life models - packages ilc, LifeMetrics, StMoMo

.

Mortality Overview

Mortality Overview

Basic quantities in the analysis of mortality

Survival function

\(s_{x} = Pr (X > x)\)

Probability that (x) will survive for another t

\(p_{xt} = s_{x+t}/s_{x}\)

Probability that (x) will die within t years

\(q_{xt} = [s_{x} - s_{x+t}]/s_{x}\)

Mortality intensity (hazard function or force of mortality)

\(\mu_{x} = lim_{h \to 0} 1/h * q_{xh}\)

Probability that (x) will die within h

Mortality Features

Mortality Features

.

Mortality Models

Mortality Models

Drawing

Classic Models

Drawing

Classic Models

Some special parametric laws of mortality

De Moivre

\(\mu_{x} = 1/ (ω – x)\) subject to \(0 \leq x < ω\)

Gompertz

\(\mu_{x} = Bc^{x}\) subject to \(x \geq 0, B>0, c>1\)

Makeham

\(\mu_{x} = A + Bc^{x}\) subject to \(x \geq 0, B>0, c>1, A>=-B\)

Thiele

\(\mu_{x} = B_{1}C_{1}^{-x}+B_{2}C_{2}^{[-1/2(x-k)^2]}+B_{3}C_{3}^{x}\) subject to \(x\geq 0, B_{1}, B_{2}, B_{3}>0, C_{1}, C_{2}, C_{3}>1\)

Classic Models

Advantages:

Compact, small numbers of parameters
Highly interpretable
Good for comparative work

Disadvantages:

Almost certainly “wrong”
Too simplistic
Struggle with a new source of mortality

Stochastic Models

Drawing

Stochastic Models

Drawing

Predictor

\(\hat{\mu}_{x}(t)\)

Predictor

\(\hat{\mu}_{x}(t) = \alpha_{x}\)

Predictor

\(\hat{\mu}_{x}(t) = \alpha_{x} + \kappa_{t}\)

Predictor

\(\hat{\mu}_{x}(t) = \alpha_{x} + \beta_{x} \kappa_{t}\)

Predictor

\(\hat{\mu}_{x}(t) = \alpha_{x} + \sum^{N}_{i=1} \beta^{i}_{x} \kappa^{i}_{t}\)

\(N\) - number of age-period terms

Predictor

\(\hat{\mu}_{x}(t) = \alpha_{x} + \sum^{N}_{i=1} \beta^{i}_{x} \kappa^{i}_{t} + \gamma_{t-x}\)

\(N\) - number of age-period terms

Predictor

\(\hat{\mu}_{x}(t) = \alpha_{x} + \sum^{N}_{i=1} \beta^{i}_{x} \kappa^{i}_{t} + \beta^{0}_{x} \gamma_{t-x}\)

\(N\) - number of age-period terms

Models Predictor

Generalised Age-Period-Cohort Stochastic Mortality Models

Lee-Carter (LC) \(\hat{\mu_{xt}} = \alpha_{x} + \beta^{(1)}_{x} \kappa^{(1)}_{t}\)
Age-Period-Cohort (APC) \(\hat{\mu_{xt}} = \alpha_{x} + \kappa^{(1)}_{t} +\gamma_{t-x}\)
Cairns-Blake_Dowd (CBD) \(\hat{\mu_{xt}} = \kappa^{(1)}_{t} + (x-\bar{x})\kappa^{(2)}_{t}\)
Quadratic CBD with cohort effects M6 \(\hat{\mu_{xt}} = \kappa^{(1)}_{t} + (x-\bar{x})\kappa^{(2)}_{t} +\gamma_{t-x}\)
Quadratic CBD with cohort effects M7 \(\hat{\mu_{xt}} = \kappa^{(1)}_{t} + (x-\bar{x})\kappa^{(2)}_{t} + ((x-\bar{x})^2-\hat{\sigma^{2}_{x}}) \kappa^{3}_{t}\)
Quadratic CBD with cohort effects M8 \(\hat{\mu_{xt}} = \kappa^{(1)}_{t} + (x-\bar{x})\kappa^{(2)}_{t} + (x_{c}-x)\gamma_{t-x}\)
Plat \(\hat{\mu_{t}} = \alpha_{x} + \kappa^{(1)}_{t} + (x-\bar{x})\kappa^{(2)}_{t} + \gamma_{t-x}\)

Model Comparison

Random Component

The numbers of deaths \(D_{xt}\) are independent

\(D_{xt}\) follows a Poisson or a Binomial distribution

Link Function

Number of link functions possible
Convenient to use canonical link
- Log link for Poisson \(\hat{\mu}_{xt} = log\mu_{xt}\)
- Logit link for Binomial \(\hat{\mu}_{xt} = ln[\mu_{xt}/(1-\mu_{xt})]\)

Parameter Constraints

Most stochastic mortality models not identifiable up to a transformation
Therefore parameter constraints required to ensure unique parameter estimates
Parameter constraints applied through constraint function

Stochastic Models

Models sensitive to:

Historical data
Assumptions made

Good practice

Understand and validate data
Understand models and their assumptions
Test several methods
Compare and validate results

.

R Workshop

R getting started

Create a new folder and put the R scripts prepared for the workshop there https://github.com/TARF/SMM/raw/master/Case_studies.zip
Set working directory Session > Set Working Directory > Choose Directory… or press Ctrl+Shift+H and select the folder or use the console

setwd('<directory name>') # wrapped in '' 
## for Windows the path uses / instead \

Install required packages

install.packages(c("demography","StMoMo","rgl",
                   "googleVis","fanplot", "gdata"))

Case Study

Slovak mortality data from Human Mortality Database

library(demography)
source("temp_functions.R")

skDemo<-hmd.mx("SVK", username=username, password=password)
# load("skDemo.RData")
years <- skDemo$year
ages<- skDemo$age
Dxt <- skDemo$rate[[3]] * skDemo$pop[[3]]
E0xt <- skDemo$pop[[3]] + 0.5 * Dxt
Ecxt <- skDemo$pop[[3]]
qxt <- Dxt/E0xt

Inspect data

RStudio Environment double click the qxt object

Let's inspect mortality rates

testglsnapshot
You must enable Javascript to view this page properly.

Settings for Model Fitting

library(StMoMo)
forecastTime <- 120
ages.fit <- 25:90
weights <- genWeightMat(ages.fit, years, 3)

## Modelling
modelFit <- array(data = NA, c(7, 2))
rownames(modelFit) <- c("LC", "APC", "CBD", "M6", "M7", "M8", "PLAT")
colnames(modelFit) <- c("AIC", "BIC")

Model Fit Criteria

AIC tries to select the model that most adequately describes an unknown, high dimensional reality. Reality is never in the set of candidate models that are being considered.
BIC tries to find the TRUE model among the set of candidates.
Best model minimizes AIC and BIC.

Lee Carter Model

## LC model under a Binomial setting - M1
LC <- lc(link = "logit")
LCfit <- fit(LC, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
             ages.fit = ages.fit, wxt = weights)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## Initialising
## Running start-up iterations..
## Running main iterations...............
## Done
## StMoMo: Finish fitting with gnm

LCres <- residuals(LCfit)

## Collect the fit statistics
modelFit[1, 1] <- AIC(LCfit)
modelFit[1, 2] <- BIC(LCfit)

Lee Carter Model Fitting Results

plot(LCfit, nCol = 3)

Forecast for Lee Carter

## Forecast
LCfor <- forecast(LCfit, h = forecastTime)
LCqxt <- cbind(LCfor$fitted, LCfor$rates)

Projected Mortality

unnamed_chunk_24snapshot
You must enable Javascript to view this page properly.

APC Model

## Fitting APC Currie (2006) - M3
APC <- apc("logit")
APCfit <- fit(APC, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
              ages.fit = ages.fit, wxt = weights, start.ax = LCfit$ax,
              start.bx = LCfit$bx, start.kt = LCfit$kx)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## StMoMo: Finish fitting with gnm

## Collecting fit results
modelFit[2, 1] <- AIC(APCfit)
modelFit[2, 2] <- BIC(APCfit)

## Forecasting
APCfor <- forecast(APCfit, h = forecastTime, gc.order = c(1, 1, 0))
APCqxt <- cbind(APCfor$fitted, APCfor$rates)

CBD Model

CBD model Cairns (2009) under a Binomial distribution of deaths Haberman and Renshaw (2011) - M5

CBD <- cbd()
CBDfit <- fit(CBD, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
              ages.fit = ages.fit, wxt = weights)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## StMoMo: Finish fitting with gnm

modelFit[3, 1] <- AIC(CBDfit)
modelFit[3, 2] <- BIC(CBDfit)

CBDfor <- forecast(CBDfit, h = forecastTime)
CBDqxt <- cbind(CBDfor$fitted, CBDfor$rates)

M6 Model

## M6
M6 <- m6()
M6fit <- fit(M6, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
             ages.fit = ages.fit, wxt = weights)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## StMoMo: Finish fitting with gnm

modelFit[4, 1] <- AIC(M6fit)
modelFit[4, 2] <- BIC(M6fit)

M6for <- forecast(M6fit, h = forecastTime, gc.order = c(2, 0, 0))
M6qxt <- cbind(M6for$fitted, M6for$rates)

M7 Model

M7 <- m7(link = "logit")
M7fit <- fit(M7, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
             ages.fit = ages.fit, wxt = weights)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## StMoMo: Finish fitting with gnm

modelFit[5, 1] <- AIC(M7fit)
modelFit[5, 2] <- BIC(M7fit)

M7for <- forecast(M7fit, h = forecastTime, gc.order = c(2, 0, 0))
M7qxt <- cbind(M7for$fitted, M7for$rates)

M8 Model

M8 <- m8(link = "logit", xc = 65)
M8fit <- fit(M8, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
             ages.fit = ages.fit, wxt = weights)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## StMoMo: Finish fitting with gnm

modelFit[6, 1] <- AIC(M8fit)
modelFit[6, 2] <- BIC(M8fit)

M8for <- forecast(M8fit, h = forecastTime, gc.order = c(2, 0, 0))
M8qxt <- cbind(M8for$fitted, M8for$rates)

PLAT Model

f2 <- function(x, ages) mean(ages) - x
constPlat <- function(ax, bx, kt, b0x, gc, wxt, ages){
  nYears <- dim(wxt)[2]
  x <- ages
  t <- 1:nYears
  c <- (1 - tail(ages, 1)):(nYears - ages[1])
  xbar <- mean(x)
  phiReg <- lm(gc ~ 1 + c + I(c^2), na.action = na.omit)
  phi <- coef(phiReg)
  gc <- gc - phi[1] - phi[2] * c - phi[3] * c^2
  kt[2, ] <- kt[2, ] + 2 * phi[3] * t
  kt[1, ] <- kt[1, ] + phi[2] * t + phi[3] * (t^2 - 2 * xbar * t)
  ax <- ax + phi[1] - phi[2] * x + phi[3] * x^2
  ci <- rowMeans(kt, na.rm = TRUE)
  ax <- ax + ci[1] + ci[2] * (xbar - x)
  kt[1, ] <- kt[1, ] - ci[1]
  kt[2, ] <- kt[2, ] - ci[2]
  list(ax = ax, bx = bx, kt = kt, b0x = b0x, gc = gc)
}

PLAT Model Fitting

PLAT <- StMoMo(link = "logit", staticAgeFun = TRUE,
               periodAgeFun = c("1", f2), cohortAgeFun = "1",
               constFun = constPlat)
PLATfit <- fit(PLAT, Dxt = Dxt, Ext= E0xt, ages = ages, years = years, 
               ages.fit = ages.fit, wxt = weights)

## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 
## StMoMo: Start fitting with gnm
## StMoMo: Finish fitting with gnm

modelFit[7, 1] <- AIC(PLATfit)
modelFit[7, 2] <- BIC(PLATfit)

PLATfor <- forecast(PLATfit, h = forecastTime, gc.order = c(2, 0, 0))
PLATqxt <- cbind(PLATfor$fitted, PLATfor$rates)

Fitting Results

What model has the best fit?

	AIC	BIC
LC	44621.04	45814.42
APC	40844.80	42364.79
CBD	53410.05	54163.77
M6	39526.23	41014.82
M7	36862.39	38721.56
M8	39538.75	41033.62
PLAT	38546.23	40430.52

Model Comparison

Extrapolation

Limited data, often poor quality at high ages
Kannisto extrapolation

extrapolate <- kannistoExtrapolation(qx=PLATqxt, ages=ages.fit, 
                                     years=years_chart, max_age=120, nObs=15)
PLATqxtExtr <- extrapolate$qxt

Curious about the function?

Just type kannistoExtrapolation in the console.

Portfolio Calculations

Assumption: we have imaginary portofolio
Premium read from input data
Youngest entry age 25
Retirement age 65
Annuity is paid until age 120
Annuity annual amount 1000 EUR
Valuation year 2015

## Read in portfolio data
portfolio <- read.csv('portfolio.csv')

## Assumptions
ages.fit <- 25:120
valyear <- 2015
pension <- 1000

Experience Factors

Experience factors - observed portfolio mortality divided by population mortality.

## Read in experience factors
experience.factors <- read.csv('experience-factors.csv')

# calculate ages of the insured
portfolio$age <- valyear - portfolio$YoB
experience.factors$total <- (experience.factors$Male + 
                               experience.factors$Female)/2
expF <- experience.factors$total[ages.fit]

Best Estimate Liability

The expected or mean value of the present value of future cash flows for current obligations, projected over the contract’s run-off period, taking into account all up-to-date financial market and actuarial information.

Calculate BEL for all models: LC, APC, CBD, M6, M7, M8 and PLAT.
Use our function DFcashflowfunction

DFcashflow = function(qxt, ageStart, omegaAge, pensionAge, valyear, ir, type = 1)

Function arguments:
- qxt - mortality rates
- ageStart - age of the insured
- omegaAge - omega age
- pensionAge - retirement age
- valyear - valuation year
- ir - interest rate value (for simplifaction constant)
- type - indicator for discounting premium or annuity

BEL Calculations

BEL <- array(NA, c(7,1))
rownames(BEL) <- c("LC", "APC", "CBD", "M6", "M7", "M8", "PLAT")
colnames(BEL) <- "BEL"

for (m in 1:length(models)){
output <- list()
output2 <- list()
for (i in 1:nrow(portfolio)){
  output[[i]] <- DFcashflow(models[[m]]*expF, ageStart = portfolio$age[i], 
                            omegaAge = 120, pensionAge = 65, valyear = valyear, 
                            ir = 0.02, type = 1)*pension
  output2[[i]] <- DFcashflow(models[[m]]*expF, ageStart = portfolio$age[i], 
                             omegaAge = 120, pensionAge = 65, valyear = valyear, 
                             ir = 0.02, type = 2)*portfolio$Premium[i]
}
BEL[m, 1] <- round(do.call(sum, output)-do.call(sum, output2),2)
}

BEL per Model

Simulations for SCR

SCR is the amount of capital the insurance company needs to hold in order to survive a 1 in 200 event.

Use stochastic simulations to calculate SCR

set.seed(1234) ## Specify seed for Random Number Generation
nsim <- 200 ## Number of simulations
models2run <- length(modelsFitted) ## Number of models to loop for
ages.fit <- 25:90 ## Ages used in model fitting

Simulations in R

modelSim <- list()
selectBEL <- list()
for (m in 1:models2run){
  modelSim[[m]] <- simulate(modelsFitted[[m]], nsim = nsim, h = forecastTime)
  collectBEL <- array(NA, c(200,1))
  for (s in 1:nsim){
    prem_s <- 0
    ben_s <- 0
    qx <- cbind(modelSim[[m]]$fitted[, , s], modelSim[[m]]$rates[, , s])
    extrapolate <- kannistoExtrapolation(qx, ages.fit, years_chart)
    for (i in 1:nrow(portfolio)){
      ben_s <- ben_s+DFcashflow(extrapolate$qxt*expF, ageStart = portfolio$age[i], 
              omegaAge = 120, pensionAge = 65, valyear = valyear, 
              ir = 0.02, type = 1)*pension
      prem_s <- prem_s+DFcashflow(extrapolate$qxt*expF, ageStart = portfolio$age[i], 
                omegaAge = 120, pensionAge = 65, valyear = valyear, 
                ir = 0.02, type = 2)*portfolio$Premium[i]
    }
    collectBEL[s, 1] <- round(ben_s - prem_s, 2)
  }
  selectBEL[[m]] <- quantile(collectBEL, probs = 0.995, type = 1)
}
SCR <- round(as.numeric(selectBEL) - BEL, 2)
colnames(SCR) <- "SCR"

Simulation Results

Simulation Uncertainity

Percentails (%): 0.5, 2.5, 10, 25, 75, 90, 97.5, 99.5

Show SCR for Models

Different Capital Requirements

.

Case Studies Now it's your turn.

Case 1

It's time to play with portfolio data. Task is to make everybody in our portfolio 5 years older.

Can we form any expections regarding the BEL and SCR values?

Open Case_study_1.R instructions and code to adjust is commented out.

Case 2

Now shorten the time trend. Instead of using hitory data from 1950 we will just focus on 1989 to 2009.

Can we form any expections regarding the BEL and SCR values?

Open Case_study_2.R instructions and code to adjust is commented out.

Case 3

For all the analysis done so was raw data was used. Let's check what happens if we used smoothing.

Can we form any expections regarding the BEL and SCR values?

Open Case_study_3.R instructions and code to adjust is commented out.

Case 4

To the list of already fitted models add amortality table that assumes constant mortality rates after 2009. Projection period remains 120. Extrapolation is used for ages 91-120.

Can we form any expections regarding the BEL value?

Open Case_study_4.R instructions and code to adjust is commented out.

References

"StMoMO: An R Package for Stochastic Mortality Modelling" A.M. Villegas et al.

"Stochastic Modelling of Mortality Risks" Frankie Gregorkiewicz

"Solvency II Glossary" CEA and the Groupe Consultatif

"R: The most powerful and most widely used statistical software" Revolution Analytics

Contact

Franciszek Gregorkiewicz

franciszek.gregorkiewicz@aaa-riskfinance.nl
+48 605 682 112

Olga Mierzwa

olga.mierzwa@aaa-riskfinance.nl
+48 798 878 358

Presentation and source code available on GitHub: https://github.com/TARF/SMM/

Who We Are?

An independent and innovative consultancy firm specialized in the field of actuarial science and risk management

Founded in 2006
80 FTE in 2015
Headquarters in Amsterdam
Local, dedicated office in Warsaw
Highly qualified and trained actuaries and risk professionals

Comprehensive, hands-on services for insurance companies

Actuarial modeling
Solvency II and regulation
Risk management function
Actuarial function
Governance
Strategy and profitability
Investment consulting and ALM