Credibility and premium update

Now we can use the credibility approach to measure the frequency and severity of claims and contrast them with newly observed values. We will use a coverage probability of 95% and a range of 10% around the true mean.
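These choices correspond to the parameters \(k=0.1\) (the admissible range around the true mean) and \(\alpha=0.05\) (so the coverage probability is \(1-\alpha=0.95\)). As a small sketch, they could be stored as R objects, although the code below simply plugs in the numeric values:

k<-0.1      # admissible deviation around the true mean (10%)
alpha<-0.05 # 1 - coverage probability (95%)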

We can start with some data.

setwd("~/Dropbox/UDLAP/Cursos/2022 Primavera/Tema Selecto/data")
data<-read.csv("insample.csv")

In this dataset, we have different years:

table(data$Year)
## 
## 2006 2007 2008 2009 2010 
## 1154 1138 1125 1112 1110

Imagine we have information from years 2006-2008; we can estimate a premium based on expected frequency and expected severity. In this example, let's explore small pieces … that is, first we will review the frequency for the 2006-2008 period, then contrast it with the information from 2009, and decide whether to switch to the 2009 estimate or stay with the previous value.

data1<-data[which(data$Year<=2008),]
data2<-data[which(data$Year==2009),]

The mean frequency for 2006-2008, that is, the mean number of claims per policy, is:

mu<-mean(data1$Freq)
mu
## [1] 1.030729

Now, imagine we get information from 2009: should we update the frequency estimate?

First, the expected number of claims based on the previous estimate is:

\[ \lambda_N = \text{Policies} \times \mu \]

policies<-nrow(data2) # number of policies observed in 2009
lN<- policies*mu
lN
## [1] 1146.17

\[ \lambda_F=\bigg( \frac{z_{1-\alpha/2}}{k} \bigg)^2 \]

In this example, we are working with \(k=0.1\) and \(\alpha=0.05\); then:

\[ \lambda_F=\bigg( \frac{z_{0.975}}{0.1} \bigg)^2 \]

lF<-(qnorm(0.975)/0.1)^2
lF
## [1] 384.1459

Now, we have:

\[ 384.1459 = \lambda_F < \lambda_N = 1146.17 \]
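We can verify the full-credibility condition directly in code (a minimal check using the objects lN and lF computed above):

lN>=lF  # TRUE, since 1146.17 exceeds 384.1459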

This means that we attain full credibility for claim frequency. Therefore, for the next period we have an expected frequency of:

mu2<-mean(data2$Freq)
mu2
## [1] 1.219424

Now, we can see what happens with severity. First, we need to estimate the coefficient of variation, which is:

\[ C_x=\frac{\sigma_X}{\mu_X} \]

For the 2009 claims, we need the standard deviation and the mean of the claim amounts. First, the number of claims:

table(data2$Freq)
## 
##   0   1   2   3   4   5   6   7   8   9  10  11  12  13  15  16  18  19  21 143 
## 811 155  69  26  17  11   1   4   2   2   2   1   1   1   2   1   1   1   1   1 
## 228 263 
##   1   1
length(data2$Freq)
## [1] 1112

This means we have 1112-811=301 policies with at least one claim, which we use as the claim count… then:
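As a quick check, this count can be obtained directly from the frequency column (a small sketch consistent with the table above):

sum(data2$Freq>0)  # 1112 - 811 = 301 policies with claims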

varx<-var(data2$y[data2$y>0])
varx
## [1] 20039852864
sigmax<-sqrt(varx)
sigmax
## [1] 141562.2
mux<-mean(data2$y[data2$y>0])
Cx<-sigmax/mux
Cx
## [1] 3.857419

Now, for claim severity the full-credibility standard is:

\[ \lambda_F \cdot C_x^2 \]

lF*Cx*Cx
## [1] 5715.97

We then have

\[ \lambda_F \cdot C_x^2 > \text{Claims}_{2009} \]
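This comparison can also be checked in code (a minimal sketch using the 301 claims counted above):

301>=lF*Cx*Cx  # FALSE: 301 is far below the roughly 5716 claims required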

This means that we did not attain full credibility for claim severity. Then, we can estimate the credibility factor as:

\[ Z=\sqrt{\frac{\text{Claims}_{2009}}{\lambda_F \cdot C_x^2}} \]

Z<- sqrt(301/(lF*Cx*Cx))
Z
## [1] 0.2294765

Now, we can update our severity as:

\[ U_{X}=Z\cdot \mu_{X_{2009}} + (1-Z) \mu_{X_{2006-2008}} \]

Ux<-Z*mean(data2$y[data2$y>0])+(1-Z)*mean(data1$y[data1$y>0])
Ux
## [1] 47759.67

This compares with the observed mean severities for 2006-2008 and for 2009:

mean(data1$y[data1$y>0])
## [1] 51053.84
mean(data2$y[data2$y>0])
## [1] 36698.68

Now, how does this affect premiums?

According to the basic approach, the premium for 2006-2008 was calculated as:

p1<-mean(data1$Freq)*mean(data1$y[data1$y>0])
p1
## [1] 52622.66

Now, if we were to use only our data from 2009, we would have a premium of:

p2<-mean(data2$Freq)*mean(data2$y[data2$y>0])
p2
## [1] 44751.26

However, based on credibility theory, the updated premium would be:

pu<-mean(data2$Freq)*Ux
pu
## [1] 58239.31

This is the consequence of the higher claim frequency observed in 2009.
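As a recap, we can place the three premiums side by side (a small sketch using the objects defined above):

c(p1=p1,p2=p2,pu=pu)  # 2006-2008, naive 2009, credibility-updated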

\[\textbf{Activity: Replicate this example with 2010.}\]
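As a starting point for the activity (a sketch only; the remaining steps mirror the ones above), the 2010 subset can be obtained as:

data3<-data[which(data$Year==2010),]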