Graduation

Graduation is the process of smoothing crude estimates of $\hat{q}_x$ or $\hat{\mu}_x$.
There are many reasons for graduation:
- It is intuitive that the true rates underlying these estimates follow a smooth function of age
  - Allows for interpolation and extrapolation
- Removes sampling noise, making trends easier to see
- Incorporates information from adjacent ages into each estimate, similar to pooling information
- It is desirable for financial quantities to progress smoothly with age
There are also many desirable features of a graduation:
- Smoothed or graduated rates $\mathring{q}_x, \mathring{\mu}_x$ need to satisfy:
  - Smoothness
  - Goodness of fit/adherence to data
  - Suitability for application
- Smoothness and adherence typically conflict
  - If rates are too smooth (overgraduated), they show little adherence to the data
  - If rates follow the observed data too closely (undergraduated), they are not smooth enough
After graduation, consider the reasonableness of the rates and the financial risks
- Over- or underestimation of rates can lead to losses
- Do reasonableness checks, such as male $>$ female mortality

Parametric Graduation

Choose a mathematical formula with unknown parameters
Estimate parameters
- MLE
- Minimise sum of squared standardised deviations
- Minimise weighted least squares
Calculate graduated rates
Test Graduation
Note that too many parameters may result in undergraduation (low bias, high variance)

Gompertz and Makeham

Gompertz assumes the force of mortality grows exponentially with age
- Equivalently, $\ln\mu_x$ is linear in $x$

\[ \mu_x = Bc^x,\quad\text{or }\quad \mu_x=\alpha e^{\beta x}\]

Makeham improves on Gompertz by adding a constant term $A$. This matters most at younger ages, where observed mortality tends to be curved rather than linear in log space

\[ \mu_x = A + Bc^x\]

We can generalise these approaches with the generalised Makeham class of models:

\[ \mu_x = \sum^{r-1}_{i=0}\alpha_i x^i+\exp\left(\sum^{s-1}_{j=0}\beta_jx^j\right)\]
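As a rough illustration of the simplest case, the Gompertz form can be fitted by regressing $\ln\hat{\mu}_x$ on $x$. The sketch below uses unweighted least squares, which is a simplification of the estimation methods listed earlier and would more likely serve as a starting value than a final graduation. The function name `fit_gompertz` and the data are hypothetical.

```python
import math

def fit_gompertz(ages, crude_mu):
    """Fit ln(mu_x) = ln(B) + x * ln(c) by ordinary least squares.

    Returns the Gompertz parameters (B, c). Assumes all crude rates are > 0.
    """
    logs = [math.log(m) for m in crude_mu]
    n = len(ages)
    x_bar = sum(ages) / n
    y_bar = sum(logs) / n
    sxx = sum((x - x_bar) ** 2 for x in ages)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, logs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), math.exp(slope)

# Hypothetical crude rates generated exactly from B = 0.00005, c = 1.1
ages = [50, 55, 60, 65, 70]
crude_mu = [0.00005 * 1.1 ** x for x in ages]
B, c = fit_gompertz(ages, crude_mu)
graduated_mu = [B * c ** x for x in ages]
```

Because the illustrative data lie exactly on a Gompertz curve, the fit recovers the generating parameters; real crude rates would scatter around the fitted line.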

Heligman and Pollard (first) Law
- Has three components to account for the non-linear progression of mortality with age: early childhood mortality, the accident hump and mortality at older ages
\[ q_x = A^{(x+B)^C}+De^{-E(\ln(x)-\ln(F))^2}+\frac{GH^x}{1+GH^x}\]
Maximise the likelihood function
- Either binomial ($q_x$) or Poisson ($\mu_x$)
Minimise the $\chi^2$ statistic:

\[ \sum\frac{(A-E)^2}{E}\]

Minimise weighted least squares
- Use weight based on variance to give less weight to more variable ages (typically with less data)
  - Use inverse of variance

\[ \sum_x w_x(\hat{q}_x-\mathring{q}_x)^2 \]

Binomial Weights:

\[ w_x\approx \frac{E_x}{\hat{q}_x(1-\hat{q}_x)}\approx\frac{E_x}{\hat{q}_x}\]

Poisson Weights:

\[ w_x \approx \frac{E^c_x}{\hat{\mu}_x}\]

Graduation with Reference to Standard Table

Useful when there is not a lot of data
- Assume that the overall progression of rates from age to age should be similar
Select appropriate standard table
Decide on the relationship between the standard table and the graduated rates, $\mathring{q}_x = f(q_x^s)$ or $\mathring{\mu}_x = f(\mu_x^s)$
- Can determine a suitable relationship by plotting $\hat{q}_x$ against $q^s_x$
- For example, a linear relationship:

\[ \mathring{q}_x = a + bq_x^s\]

Determine parameter values through MLE, least squares, weighted least squares
Test the graduation/goodness of fit
- Can assume it will be relatively smooth since the standard rates themselves are smooth
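A minimal sketch of the linear relationship $\mathring{q}_x = a + bq_x^s$ fitted by weighted least squares, using the binomial-type weights $w_x\approx E_x/\hat{q}_x$. The function name `fit_to_standard` and all figures are hypothetical, and the code assumes every crude rate is positive so that the weights are defined.

```python
def fit_to_standard(q_std, q_crude, exposure):
    """Weighted least squares fit of q_crude ~ a + b * q_std.

    Weights follow the binomial approximation w_x ~ E_x / q_crude_x,
    so ages with more exposure (less variable rates) count for more.
    """
    w = [e / q for e, q in zip(exposure, q_crude)]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, q_std)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, q_crude)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, q_std))
    sxy = sum(wi * (x - x_bar) * (y - y_bar)
              for wi, x, y in zip(w, q_std, q_crude))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical standard table rates, crude rates and initial exposures
q_std = [0.010, 0.012, 0.015, 0.019]
q_crude = [0.001 + 0.9 * q for q in q_std]
exposure = [1000, 900, 800, 700]
a, b = fit_to_standard(q_std, q_crude, exposure)
graduated_q = [a + b * q for q in q_std]
```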

Graduation with Splines

Useful and flexible curve fitting method
Knots that are used are not necessarily integers
Concentrate knots around the accident hump as this is the most variable point
General Spline Formula:

\[ q_y = a_0+a_1y+a_2y^2+a_3y^3+\sum^n_{j=1}b_j(y-x^{(j)})_+^3 + \epsilon_y\]

We often use weights in the least squares spline regression. With $x$ denoting age, we multiply both sides by $\sqrt{w_x}$:

\[ \sqrt{w_x}q_x = \sum^3_{i=0}a_i\sqrt{w_x}x^i +\sum^n_{j=1}b_j\sqrt{w_x}(x-x^{(j)})_+^3+\sqrt{w_x}\epsilon_x\]
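To make the weighted regression concrete, the sketch below builds one row of the weighted design matrix: the four polynomial terms followed by one truncated cubic term per knot, all scaled by $\sqrt{w_x}$. The helper name `spline_row` is hypothetical; the rows, together with $\sqrt{w_x}q_x$ as the response, could then be passed to any ordinary least squares solver.

```python
import math

def spline_row(x, knots, weight=1.0):
    """One row of the weighted design matrix for the cubic spline regression."""
    root_w = math.sqrt(weight)
    poly = [x ** i for i in range(4)]                 # 1, x, x^2, x^3
    trunc = [max(x - k, 0.0) ** 3 for k in knots]     # (x - knot)_+^3
    return [root_w * v for v in poly + trunc]

# Hypothetical example: age 3, knots at 2 and 5, weight 4
row = spline_row(3, [2, 5], weight=4)
```

Only knots below the age contribute a non-zero truncated term, which is what lets the cubic change shape at each knot while remaining smooth.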

Smoothing Splines

Non-parametric natural cubic spline that minimises the following expression:
- $\lambda$ controls the level of smoothness: $\lambda = 0$ results in undergraduation to the extent of interpolating the data, while a very large $\lambda$ forces the fit towards a straight line (overgraduation)

\[ \sum^n_{i=1}[y_i-f(x_i)]^2 + \lambda\int^{x_n}_{x_1}[f''(t)]^2dt \]

Comparison of Different Grad Methods

Parametric

Used when large amounts of data are available
Can produce very precise results
Produces sufficiently smooth rates provided the number of parameters is small
Optimised through statistical fitting, therefore not subjective
Can use MLE parameters which have good statistical properties
However, it is hard to find a curve that fits an experience well at all ages

Reference to Standard Table

Relatively simple
Used with little data
Typically produces smooth rates
Easier to fit the high and low ages
Does not fully represent the data as it relies on a reference
Heavily depends on the standard table chosen

Testing for Smoothness

Can test for smoothness using finite differences of the graduated rates
First Differences:

\[ \Delta\mathring{q}_x= \mathring{q}_{x+1}-\mathring{q}_x\]

Second Differences:

\[ \Delta^2\mathring{q}_x= \Delta\mathring{q}_{x+1}-\Delta\mathring{q}_x\]

Third Differences:

\[\Delta^3\mathring{q}_x= \Delta^2\mathring{q}_{x+1}-\Delta^2\mathring{q}_x \]

Absolute value of third differences should be relatively small when compared to the graduated rates themselves
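The differencing can be sketched in a few lines. The function name `differences` and the graduated rates below are hypothetical; the last line expresses each third difference relative to the graduated rate at the same starting age, which is one way to judge whether it is "relatively small".

```python
def differences(values, order=1):
    """Apply the forward difference operator `order` times."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

# Hypothetical graduated rates at consecutive ages
graduated_q = [0.0100, 0.0110, 0.0121, 0.0133, 0.0146, 0.0160]
third = differences(graduated_q, order=3)
relative_size = [abs(d) / q for d, q in zip(third, graduated_q)]
```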

Statistical Tests for Adherence to Data of a Graduation

No single test will provide adequate coverage of every aspect of the fit
Assumptions
- Lives are independent
- No heterogeneity within each age
- The normal approximation holds (expected deaths at each age should be greater than 5)
Null hypothesis is that the observed deaths are consistent with those predicted by the graduated rates
Standardised Deviation:
- If there is a sufficient number of independent lives at each age $x$, then by the central limit theorem the standardised deviations are approximately standard normal and mutually independent

\[ \frac{A-E}{\sqrt{E}} \]

Under Poisson:
- Number of deaths:

\[ D_x\sim N(E^c_x\mathring{\mu}_{x+0.5}, E^c_x\mathring{\mu}_{x+0.5})\]

Standardised Deviation:

\[ z_x = \frac{d_x-E^c_x\mathring{\mu}_{x+0.5}}{\sqrt{E^c_x\mathring{\mu}_{x+0.5}}}\]

Under Binomial:
- Number of deaths:

\[ D_x\sim N(E_x\mathring{q}_x,E_x\mathring{q}_x(1-\mathring{q}_x))\]

Standardised Deviation

\[ z_x = \frac{d_x-E_x\mathring{q}_x}{\sqrt{E_x\mathring{q}_x(1-\mathring{q}_x)}}\approx \frac{d_x-E_x\mathring{q}_x}{\sqrt{E_x\mathring{q}_x}}\]
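The two standardised deviations translate directly into code. This is a minimal sketch with hypothetical function names and inputs; the binomial version keeps the $(1-\mathring{q}_x)$ factor rather than using the approximation.

```python
import math

def z_poisson(deaths, central_exposure, mu):
    """Standardised deviation under the Poisson model."""
    expected = central_exposure * mu
    return (deaths - expected) / math.sqrt(expected)

def z_binomial(deaths, initial_exposure, q):
    """Standardised deviation under the binomial model."""
    expected = initial_exposure * q
    return (deaths - expected) / math.sqrt(expected * (1 - q))

# Hypothetical single-age examples
z_p = z_poisson(deaths=30, central_exposure=1000, mu=0.025)
z_b = z_binomial(deaths=12, initial_exposure=100, q=0.1)
```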

Chi-Square Test of Fit

General test for goodness of fit
- Doesn’t give information on direction of any bias
Test Statistic:
- Degrees of freedom $n$ = Number of groups - Number of estimated parameters

\[ X = \sum z_x^2\sim\chi^2_n \]

Degrees of freedom:
- Lose one per parameter estimated
- When graduating using a standard table
  - Lose one degree for each parameter fitted
  - Also lose some degrees of freedom (2-3) due to constraints imposed on chosen table
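A short sketch of the test statistic and its degrees of freedom. The function name and the argument `extra_df_lost` (for the standard table allowance) are hypothetical; the resulting statistic would be compared with the upper tail of $\chi^2_n$ from tables or a statistics library.

```python
def chi_square_statistic(z_values, n_parameters, extra_df_lost=0):
    """Return (X, degrees of freedom) for the chi-square test of fit.

    extra_df_lost covers the additional degrees of freedom given up
    when graduating by reference to a standard table.
    """
    statistic = sum(z * z for z in z_values)
    dof = len(z_values) - n_parameters - extra_df_lost
    return statistic, dof

# Hypothetical standardised deviations for four ages, one fitted parameter
X, dof = chi_square_statistic([1, -2, 0.5, 0], n_parameters=1)
```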

Standardised Deviations Test

Normality test for standardised deviations
- Checks whether the standardised deviations are too bunched, too spread out or in line with a standard normal distribution
- Can also use any other normality test, such as a chi-squared test or Q-Q plots
Roughly half of the deviations should fall in $(-\frac{2}{3},\frac{2}{3})$

Partition the deviations into intervals and use a chi-squared statistic based on the number of deviations in each interval:
- Note the statistic has $n-1$ degrees of freedom, where $n$ is the number of intervals

\[ X=\sum\frac{(\text{actual}-\text{expected})^2}{\text{expected}}\sim\chi^2_{n-1}\]
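The interval-based test might be sketched as follows. The cut points are an assumption (any partition with adequate expected counts would do), and the function names are hypothetical. Expected counts come from the standard normal distribution function, computed here with `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def deviations_test(z_values, cut_points=(-2, -1, 0, 1, 2)):
    """Return (X, degrees of freedom) for the standardised deviations test."""
    edges = [-math.inf, *cut_points, math.inf]
    m = len(z_values)
    statistic = 0.0
    for lo, hi in zip(edges, edges[1:]):
        actual = sum(1 for z in z_values if lo < z <= hi)
        expected = m * (normal_cdf(hi) - normal_cdf(lo))
        statistic += (actual - expected) ** 2 / expected
    n_intervals = len(edges) - 1
    return statistic, n_intervals - 1

# Share of a standard normal expected inside (-2/3, 2/3)
central_share = normal_cdf(2 / 3) - normal_cdf(-2 / 3)

# Hypothetical, deliberately bunched deviations
X, dof = deviations_test([0.5] * 10)
```

`central_share` comes out just under one half, which is the basis of the quick check on the interval $(-\frac{2}{3},\frac{2}{3})$.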

Sign Test

Tests balance between positive and negative deviations
- Roughly half of the deviations should be positive and half negative
- Provides no indication of the size of the discrepancies
Calculate the test statistic $P$: the number of $z_x$ that are positive
Under the null hypothesis $P\sim Bin(m,\frac{1}{2})$, where $m$ is the number of ages; find the p-value from this binomial distribution
Alternatively use the normal approximation (if $m>20$):

\[ P\sim N\left(\frac{1}{2}m,\frac{1}{4}m\right)\]
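An exact version of the sign test can be sketched with `math.comb`. Two choices here are assumptions rather than part of the notes: deviations that are exactly zero are dropped, and the two-sided p-value is taken as twice the smaller tail probability, capped at 1.

```python
from math import comb

def sign_test(z_values):
    """Return (number of positive deviations, two-sided exact p-value)."""
    signs = [z for z in z_values if z != 0]   # assumption: ignore exact zeros
    m = len(signs)
    positives = sum(1 for z in signs if z > 0)
    lower = sum(comb(m, k) for k in range(0, positives + 1)) / 2 ** m
    upper = sum(comb(m, k) for k in range(positives, m + 1)) / 2 ** m
    return positives, min(1.0, 2 * min(lower, upper))

# Hypothetical deviations: all positive, which suggests bias
P, p_value = sign_test([0.4, 1.2, 0.3, 0.9, 0.1])
```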

Cumulative Deviations Test

General goodness of fit
- High value of test statistic indicates either:
  - The graduated rates are biased
  - Actual variance is higher than predicted by the assumed model for the range of ages considered (could be due to duplicate policies)
- Detects overall bias or long runs of deviations of the same sign
Test Statistic:

\[ \frac{\sum_x (A_x - E_x)}{\sqrt{\sum_x Var(D_x)}}\sim N(0,1)\]

Perform hypothesis test assuming standard normal distribution
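A minimal sketch of the cumulative deviations test with a two-sided p-value. The function name and data are hypothetical; the example assumes the Poisson model, so the variance of the deaths at each age equals the expected deaths.

```python
import math

def cumulative_deviations(actual, expected, variances):
    """Return (test statistic, two-sided p-value under N(0, 1))."""
    statistic = (sum(actual) - sum(expected)) / math.sqrt(sum(variances))
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    p_value = math.erfc(abs(statistic) / math.sqrt(2.0))
    return statistic, p_value

# Hypothetical deaths over three ages; Poisson variance = expected deaths
actual = [6, 7, 7]
expected = [5, 5, 6]
stat, p_value = cumulative_deviations(actual, expected, variances=expected)
```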

Grouping of Signs Test (Stevens)

Tests for overgraduation by looking for long runs of deviations of the same sign
- The result can differ depending on whether positive or negative groups are counted
Compare the observed number of groups (runs) with the number that would be expected if the positive and negative deviations were arranged in a random order
Test statistic is:

\[ G = \text{ Number of groups of positive }z_x's\]

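Perform a one-sided hypothesis test (too few groups indicates overgraduation) using the hypergeometric distribution of $G$, where $n_1$ and $n_2$ are the numbers of positive and negative deviations and $m=n_1+n_2$
For large $m$ we can use the normal approximation: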

\[ G\sim N\left(\frac{n_1(n_2+1)}{n_1+n_2},\frac{(n_1n_2)^2}{(n_1+n_2)^3}\right) \]
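The exact test can be sketched by counting the ways of splitting the $n_1$ positive signs into $t$ groups and placing those groups in the $n_2+1$ gaps around the negative signs, which gives $P(G=t)=\binom{n_1-1}{t-1}\binom{n_2+1}{t}/\binom{n_1+n_2}{n_1}$. The function names are hypothetical, and treating zero deviations as negative is an assumption.

```python
from math import comb

def count_positive_groups(z_values):
    """Number of runs of consecutive positive deviations."""
    groups, previous_positive = 0, False
    for z in z_values:
        positive = z > 0
        if positive and not previous_positive:
            groups += 1
        previous_positive = positive
    return groups

def grouping_of_signs(z_values):
    """Return (G, P(at most G groups)) under random ordering of the signs."""
    n1 = sum(1 for z in z_values if z > 0)
    n2 = len(z_values) - n1          # assumption: zeros counted as negative
    g = count_positive_groups(z_values)
    total = comb(n1 + n2, n1)
    ways = sum(comb(n1 - 1, t - 1) * comb(n2 + 1, t) for t in range(1, g + 1))
    return g, ways / total

# Hypothetical deviations with the positive signs clumped together
G, p_value = grouping_of_signs([0.8, 0.3, -0.5, -1.1])
```

A small p-value means the positive deviations form fewer groups than random ordering would suggest, which points to overgraduation.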

Serial Correlations Test

Tests for overgraduation
- Overgraduated curves tend to stay on the same side of the crude rates over relatively long age ranges
- Undergraduated curves will cross the crude rates frequently
- Correlation in one part of the age range can be cancelled by correlation in another part, therefore this test is typically weaker than the signs/grouping of signs tests
Formulas:

\[ r_j=\frac{\sum^{m-j}_{i=1}(z_i-\bar{z}_1)(z_{i+j}-\bar{z}_2)}{\sqrt{\sum^{m-j}_{i=1}(z_i-\bar{z}_1)^2\sum^{m-j}_{i=1}(z_{i+j}-\bar{z}_2)^2}}\]

\[ \bar{z}_1 = \frac{1}{m-j}\sum^{m-j}_{i=1}z_{i} \]

\[ \bar{z}_2 = \frac{1}{m-j}\sum^{m-j}_{i=1}z_{i+j}\]

For large $m$, $\bar{z}_1,\bar{z}_2$ can be approximated by the total average

\[ r_j\approx\frac{\frac{1}{m-j}\sum^{m-j}_{i=1}(z_i-\bar{z})(z_{i+j}-\bar{z})}{\frac{1}{m}\sum^{m}_{i=1}(z_i-\bar{z})^2}\]

Test statistic:
- A large positive test statistic indicates overgraduation

\[ t_j = r_j\sqrt{m}\sim N(0,1)\]
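The large-$m$ approximation and the test statistic can be sketched as below. The function name and data are hypothetical, and the tiny sample is only there to make the arithmetic easy to follow; in practice $m$ should be large for the approximation to be reasonable.

```python
import math

def serial_correlation(z_values, lag=1):
    """Return (r_j, t_j) using the overall mean of the deviations."""
    m = len(z_values)
    z_bar = sum(z_values) / m
    numerator = sum(
        (z_values[i] - z_bar) * (z_values[i + lag] - z_bar)
        for i in range(m - lag)
    ) / (m - lag)
    denominator = sum((z - z_bar) ** 2 for z in z_values) / m
    r = numerator / denominator
    return r, r * math.sqrt(m)

# Hypothetical deviations: alternating signs versus clumped signs
r_alt, t_alt = serial_correlation([1, -1, 1, -1])
r_clump, t_clump = serial_correlation([1, 1, -1, -1])
```

The alternating series gives a negative correlation and the clumped series a positive one, matching the idea that runs of same-signed deviations signal overgraduation.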

Effect of Duplicate Policies

There is the potential for duplicate policies (one life holding several policies), which violates the assumption that policies are independent
- Has no effect on bias (expected value)
- Affects variability, increasing the variance
Assume $N$ lives from age $x$ to $x+1$
- Assume a proportion $\pi_i$ of lives own $i$ insurance policies (these proportions are unknown)
- Total number of policies is:
\[ \sum_i i\pi_i N\]
- Assume the mortality rate for each life is $q_x$
- Let $D_i$ be the number of deaths among the $\pi_i N$ lives each with $i$ policies and $C_i$ be the number of claims among these lives
Assuming the binomial model:

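\[ D_i\sim Bin(\pi_iN, q_x)\]

Then, since $C_i = iD_i$, we have:

\[ \mathbb{E}[C] = \left(\sum_i i\pi_i N\right)q_x \]

\[ Var(C) = \left(\sum_i i^2\pi_i N\right)q_x(1-q_x)\]

Compared with the variance $\left(\sum_i i\pi_i N\right)q_x(1-q_x)$ that would apply if every policy were held by a different life, duplicate policies increase the variance of the number of claims by the ratio:

\[ \frac{\sum_i i^2\pi_i}{\sum_i i\pi_i}\]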

Should make an allowance for the increased variances in statistical tests
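As a small worked example of the variance ratio, the sketch below takes a hypothetical distribution of policies per life. The function name and the proportions are illustrative only, since in practice the $\pi_i$ are unknown and would need to be estimated.

```python
def variance_ratio(proportions):
    """Variance inflation factor from duplicate policies.

    proportions maps the number of policies i to the proportion pi_i
    of lives holding exactly i policies.
    """
    first_moment = sum(i * p for i, p in proportions.items())
    second_moment = sum(i * i * p for i, p in proportions.items())
    return second_moment / first_moment

# Hypothetical portfolio: 80% of lives hold one policy, 20% hold two
ratio = variance_ratio({1: 0.8, 2: 0.2})
```

One possible allowance is to divide the squared standardised deviations by this ratio before applying the tests above, although the notes do not prescribe a specific adjustment.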

Jake

27/04/2022
