Introduction to R
Mortality Overview
- Mortality Models
- Classical
- Stochastic
- Workshop: Stochastic Mortality Models in R
- Models Fitting
- Mortality Projection - Forecasting
- Portfolio Analysis of BEL and SCR
Case Studies
11 September 2015
Introduction to R
Mortality Overview
Case Studies
“Everything that exists [in R] is an object. Everything that happens [in R] is a function call.” - John Chambers
Most popular Environment to run R
http://www.rstudio.com/products/rstudio/
Free & Open-Source Integrated Development Environment (IDE) for R
Features:
Note: R must be installed first!
Type in the interactive console:
3 + 3
## [1] 6
getwd()
## [1] "C:/Users/omierzwa/Documents/GitHub/smm"
1:10
## [1] 1 2 3 4 5 6 7 8 9 10
Type in the interactive console:
x <- 1:10 # "name <- value returns the value invisibly"
x
## [1] 1 2 3 4 5 6 7 8 9 10
(x <- 1:10) # creates x and prints it
## [1] 1 2 3 4 5 6 7 8 9 10
x + y addition
x - y substraction
x * y multiplication
x / y division
x ^ y exponention
x %% y devision remainder
x %/% y integer division
numeric vector operator numeric vector –> numeric vector
(x <- 11:20) # exemplary vector
## [1] 11 12 13 14 15 16 17 18 19 20
x[1] # first
## [1] 11
x[length(x)] # last element using length function
## [1] 20
x[c(1, length(x), 1)] # first, last and first again
## [1] 11 20 11
x[1000] # no such element
## [1] NA
x * 3 # multiplication
## [1] 33 36 39 42 45 48 51 54 57 60
y <- 1:10 x * y
## [1] 11 24 39 56 75 96 119 144 171 200
Basic quantities in the analysis of mortality
\(s_{x} = Pr (X > x)\)
\(p_{xt} = s_{x+t}/s_{x}\)
\(q_{xt} = [s_{x} - s_{x+t}]/s_{x}\)
\(\mu_{x} = lim_{h \to 0} 1/h * q_{xh}\)
Probability that (x) will die within h
Some special parametric laws of mortality
\(\mu_{x} = 1/ (ω – x)\) subject to \(0 \leq x < ω\)
\(\mu_{x} = Bc^{x}\) subject to \(x \geq 0, B>0, c>1\)
\(\mu_{x} = A + Bc^{x}\) subject to \(x \geq 0, B>0, c>1, A>=-B\)
\(\mu_{x} = B_{1}C_{1}^{-x}+B_{2}C_{2}^{[-1/2(x-k)^2]}+B_{3}C_{3}^{x}\) subject to \(x\geq 0, B_{1}, B_{2}, B_{3}>0, C_{1}, C_{2}, C_{3}>1\)
Advantages:
Disadvantages:
\(\hat{\mu}_{x}(t)\)
\(\hat{\mu}_{x}(t) = \alpha_{x}\)
\(\hat{\mu}_{x}(t) = \alpha_{x} + \kappa_{t}\)
\(\hat{\mu}_{x}(t) = \alpha_{x} + \beta_{x} \kappa_{t}\)
\(\hat{\mu}_{x}(t) = \alpha_{x} + \sum^{N}_{i=1} \beta^{i}_{x} \kappa^{i}_{t}\)
\(N\) - number of age-period terms
\(\hat{\mu}_{x}(t) = \alpha_{x} + \sum^{N}_{i=1} \beta^{i}_{x} \kappa^{i}_{t} + \gamma_{t-x}\)
\(N\) - number of age-period terms
\(\hat{\mu}_{x}(t) = \alpha_{x} + \sum^{N}_{i=1} \beta^{i}_{x} \kappa^{i}_{t} + \beta^{0}_{x} \gamma_{t-x}\)
\(N\) - number of age-period terms
Generalised Age-Period-Cohort Stochastic Mortality Models
The numbers of deaths \(D_{xt}\) are independent
\(D_{xt}\) follows a Poisson or a Binomial distribution
Most stochastic mortality models not identifiable up to a transformation
Therefore parameter constraints required to ensure unique parameter estimates
Parameter constraints applied through constraint function
Models sensitive to:
Historical data
Assumptions made
Good practice
Understand and validate data
Understand models and their assumptions
Test several methods
Compare and validate results
Create a new folder and put the R scripts prepared for the workshop there https://github.com/TARF/SMM/raw/master/Case_studies.zip
Set working directory Session > Set Working Directory > Choose Directory… or press Ctrl+Shift+H and select the folder or use the console
setwd('<directory name>') # wrapped in ''
## for Windows the path uses / instead \
install.packages(c("demography","StMoMo","rgl",
"googleVis","fanplot", "gdata"))
Slovak mortality data from Human Mortality Database
library(demography)
source("temp_functions.R")
skDemo<-hmd.mx("SVK", username=username, password=password)
# load("skDemo.RData")
years <- skDemo$year
ages<- skDemo$age
Dxt <- skDemo$rate[[3]] * skDemo$pop[[3]]
E0xt <- skDemo$pop[[3]] + 0.5 * Dxt
Ecxt <- skDemo$pop[[3]]
qxt <- Dxt/E0xt
RStudio Environment double click the qxt object
You must enable Javascript to view this page properly.
library(StMoMo)
forecastTime <- 120
ages.fit <- 25:90
weights <- genWeightMat(ages.fit, years, 3)
## Modelling
modelFit <- array(data = NA, c(7, 2))
rownames(modelFit) <- c("LC", "APC", "CBD", "M6", "M7", "M8", "PLAT")
colnames(modelFit) <- c("AIC", "BIC")
AIC tries to select the model that most adequately describes an unknown, high dimensional reality. Reality is never in the set of candidate models that are being considered.
BIC tries to find the TRUE model among the set of candidates.
Best model minimizes AIC and BIC.
## LC model under a Binomial setting - M1
LC <- lc(link = "logit")
LCfit <- fit(LC, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## Initialising ## Running start-up iterations.. ## Running main iterations............... ## Done ## StMoMo: Finish fitting with gnm
LCres <- residuals(LCfit) ## Collect the fit statistics modelFit[1, 1] <- AIC(LCfit) modelFit[1, 2] <- BIC(LCfit)
plot(LCfit, nCol = 3)
## Forecast LCfor <- forecast(LCfit, h = forecastTime) LCqxt <- cbind(LCfor$fitted, LCfor$rates)
You must enable Javascript to view this page properly.
## Fitting APC Currie (2006) - M3
APC <- apc("logit")
APCfit <- fit(APC, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights, start.ax = LCfit$ax,
start.bx = LCfit$bx, start.kt = LCfit$kx)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## StMoMo: Finish fitting with gnm
## Collecting fit results modelFit[2, 1] <- AIC(APCfit) modelFit[2, 2] <- BIC(APCfit) ## Forecasting APCfor <- forecast(APCfit, h = forecastTime, gc.order = c(1, 1, 0)) APCqxt <- cbind(APCfor$fitted, APCfor$rates)
CBD model Cairns (2009) under a Binomial distribution of deaths Haberman and Renshaw (2011) - M5
CBD <- cbd()
CBDfit <- fit(CBD, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## StMoMo: Finish fitting with gnm
modelFit[3, 1] <- AIC(CBDfit) modelFit[3, 2] <- BIC(CBDfit) CBDfor <- forecast(CBDfit, h = forecastTime) CBDqxt <- cbind(CBDfor$fitted, CBDfor$rates)
## M6
M6 <- m6()
M6fit <- fit(M6, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## StMoMo: Finish fitting with gnm
modelFit[4, 1] <- AIC(M6fit) modelFit[4, 2] <- BIC(M6fit) M6for <- forecast(M6fit, h = forecastTime, gc.order = c(2, 0, 0)) M6qxt <- cbind(M6for$fitted, M6for$rates)
M7 <- m7(link = "logit")
M7fit <- fit(M7, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## StMoMo: Finish fitting with gnm
modelFit[5, 1] <- AIC(M7fit) modelFit[5, 2] <- BIC(M7fit) M7for <- forecast(M7fit, h = forecastTime, gc.order = c(2, 0, 0)) M7qxt <- cbind(M7for$fitted, M7for$rates)
M8 <- m8(link = "logit", xc = 65)
M8fit <- fit(M8, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## StMoMo: Finish fitting with gnm
modelFit[6, 1] <- AIC(M8fit) modelFit[6, 2] <- BIC(M8fit) M8for <- forecast(M8fit, h = forecastTime, gc.order = c(2, 0, 0)) M8qxt <- cbind(M8for$fitted, M8for$rates)
f2 <- function(x, ages) mean(ages) - x
constPlat <- function(ax, bx, kt, b0x, gc, wxt, ages){
nYears <- dim(wxt)[2]
x <- ages
t <- 1:nYears
c <- (1 - tail(ages, 1)):(nYears - ages[1])
xbar <- mean(x)
phiReg <- lm(gc ~ 1 + c + I(c^2), na.action = na.omit)
phi <- coef(phiReg)
gc <- gc - phi[1] - phi[2] * c - phi[3] * c^2
kt[2, ] <- kt[2, ] + 2 * phi[3] * t
kt[1, ] <- kt[1, ] + phi[2] * t + phi[3] * (t^2 - 2 * xbar * t)
ax <- ax + phi[1] - phi[2] * x + phi[3] * x^2
ci <- rowMeans(kt, na.rm = TRUE)
ax <- ax + ci[1] + ci[2] * (xbar - x)
kt[1, ] <- kt[1, ] - ci[1]
kt[2, ] <- kt[2, ] - ci[2]
list(ax = ax, bx = bx, kt = kt, b0x = b0x, gc = gc)
}
PLAT <- StMoMo(link = "logit", staticAgeFun = TRUE,
periodAgeFun = c("1", f2), cohortAgeFun = "1",
constFun = constPlat)
PLATfit <- fit(PLAT, Dxt = Dxt, Ext= E0xt, ages = ages, years = years,
ages.fit = ages.fit, wxt = weights)
## StMoMo: The following cohorts have been zero weigthed: 1860 1861 1862 1982 1983 1984 ## StMoMo: Start fitting with gnm ## StMoMo: Finish fitting with gnm
modelFit[7, 1] <- AIC(PLATfit) modelFit[7, 2] <- BIC(PLATfit) PLATfor <- forecast(PLATfit, h = forecastTime, gc.order = c(2, 0, 0)) PLATqxt <- cbind(PLATfor$fitted, PLATfor$rates)
What model has the best fit?
| AIC | BIC | |
|---|---|---|
| LC | 44621.04 | 45814.42 |
| APC | 40844.80 | 42364.79 |
| CBD | 53410.05 | 54163.77 |
| M6 | 39526.23 | 41014.82 |
| M7 | 36862.39 | 38721.56 |
| M8 | 39538.75 | 41033.62 |
| PLAT | 38546.23 | 40430.52 |
extrapolate <- kannistoExtrapolation(qx=PLATqxt, ages=ages.fit,
years=years_chart, max_age=120, nObs=15)
PLATqxtExtr <- extrapolate$qxt
Curious about the function?
Just type kannistoExtrapolation in the console.
## Read in portfolio data
portfolio <- read.csv('portfolio.csv')
## Assumptions
ages.fit <- 25:120
valyear <- 2015
pension <- 1000
Experience factors - observed portfolio mortality divided by population mortality.
## Read in experience factors
experience.factors <- read.csv('experience-factors.csv')
# calculate ages of the insured
portfolio$age <- valyear - portfolio$YoB
experience.factors$total <- (experience.factors$Male +
experience.factors$Female)/2
expF <- experience.factors$total[ages.fit]
The expected or mean value of the present value of future cash flows for current obligations, projected over the contract’s run-off period, taking into account all up-to-date financial market and actuarial information.
DFcashflow = function(qxt, ageStart, omegaAge, pensionAge, valyear, ir, type = 1)
BEL <- array(NA, c(7,1))
rownames(BEL) <- c("LC", "APC", "CBD", "M6", "M7", "M8", "PLAT")
colnames(BEL) <- "BEL"
for (m in 1:length(models)){
output <- list()
output2 <- list()
for (i in 1:nrow(portfolio)){
output[[i]] <- DFcashflow(models[[m]]*expF, ageStart = portfolio$age[i],
omegaAge = 120, pensionAge = 65, valyear = valyear,
ir = 0.02, type = 1)*pension
output2[[i]] <- DFcashflow(models[[m]]*expF, ageStart = portfolio$age[i],
omegaAge = 120, pensionAge = 65, valyear = valyear,
ir = 0.02, type = 2)*portfolio$Premium[i]
}
BEL[m, 1] <- round(do.call(sum, output)-do.call(sum, output2),2)
}
SCR is the amount of capital the insurance company needs to hold in order to survive a 1 in 200 event.
set.seed(1234) ## Specify seed for Random Number Generation nsim <- 200 ## Number of simulations models2run <- length(modelsFitted) ## Number of models to loop for ages.fit <- 25:90 ## Ages used in model fitting
modelSim <- list()
selectBEL <- list()
for (m in 1:models2run){
modelSim[[m]] <- simulate(modelsFitted[[m]], nsim = nsim, h = forecastTime)
collectBEL <- array(NA, c(200,1))
for (s in 1:nsim){
prem_s <- 0
ben_s <- 0
qx <- cbind(modelSim[[m]]$fitted[, , s], modelSim[[m]]$rates[, , s])
extrapolate <- kannistoExtrapolation(qx, ages.fit, years_chart)
for (i in 1:nrow(portfolio)){
ben_s <- ben_s+DFcashflow(extrapolate$qxt*expF, ageStart = portfolio$age[i],
omegaAge = 120, pensionAge = 65, valyear = valyear,
ir = 0.02, type = 1)*pension
prem_s <- prem_s+DFcashflow(extrapolate$qxt*expF, ageStart = portfolio$age[i],
omegaAge = 120, pensionAge = 65, valyear = valyear,
ir = 0.02, type = 2)*portfolio$Premium[i]
}
collectBEL[s, 1] <- round(ben_s - prem_s, 2)
}
selectBEL[[m]] <- quantile(collectBEL, probs = 0.995, type = 1)
}
SCR <- round(as.numeric(selectBEL) - BEL, 2)
colnames(SCR) <- "SCR"
Percentails (%): 0.5, 2.5, 10, 25, 75, 90, 97.5, 99.5
It's time to play with portfolio data. Task is to make everybody in our portfolio 5 years older.
Can we form any expections regarding the BEL and SCR values?
Open Case_study_1.R instructions and code to adjust is commented out.
Now shorten the time trend. Instead of using hitory data from 1950 we will just focus on 1989 to 2009.
Can we form any expections regarding the BEL and SCR values?
Open Case_study_2.R instructions and code to adjust is commented out.
For all the analysis done so was raw data was used. Let's check what happens if we used smoothing.
Can we form any expections regarding the BEL and SCR values?
Open Case_study_3.R instructions and code to adjust is commented out.
To the list of already fitted models add amortality table that assumes constant mortality rates after 2009. Projection period remains 120. Extrapolation is used for ages 91-120.
Can we form any expections regarding the BEL value?
Open Case_study_4.R instructions and code to adjust is commented out.
"StMoMO: An R Package for Stochastic Mortality Modelling" A.M. Villegas et al.
"Stochastic Modelling of Mortality Risks" Frankie Gregorkiewicz
"Solvency II Glossary" CEA and the Groupe Consultatif
"R: The most powerful and most widely used statistical software" Revolution Analytics
Franciszek Gregorkiewicz
franciszek.gregorkiewicz@aaa-riskfinance.nl +48 605 682 112
Olga Mierzwa
olga.mierzwa@aaa-riskfinance.nl +48 798 878 358
Presentation and source code available on GitHub: https://github.com/TARF/SMM/
An independent and innovative consultancy firm specialized in the field of actuarial science and risk management
Comprehensive, hands-on services for insurance companies