Graduation
Graduation is the process of smoothing crude estimates of \(\hat{q}_x\) or \(\hat{\mu}_x\)
There are many reasons for graduation:
- It is intuitive that the values for these estimates would follow a smooth function
- Allows for interpolation and extrapolation
- Removes sampling noise, making trends easier to see
- Incorporates information from adjacent ages into each estimate, similar to pooling information
- It is desirable for financial quantities to progress smoothly with age
There are also many desirable features of a graduation:
- Smoothed or graduated rates \(\mathring{q}_x, \mathring{\mu}_x\) need to satisfy:
- Smoothness
- Goodness of fit/adherence to data
- Suitability for application
- Smoothness and Adherence typically conflict
- If rates are too smooth (overgraduated), they show little adherence to the data
- If rates follow the observed data too closely (undergraduated), they will have inadequate smoothness
After graduation, consider the reasonableness of the rates and the financial risks
- Over/under estimation of rates can lead to losses
- Do reasonability checks, such as male \(>\) female mortality
Parametric Graduation
Choose a mathematical formula with unknown parameters
Estimate parameters
- MLE
- Minimise sum of squared standardised deviations
- Minimise weighted least squares
Calculate graduated rates
Test Graduation
Note that too many parameters may result in undergraduation (low bias, high variance)
Gompertz and Makeham
- Gompertz considers an exponential approach
- Assumes \(\ln\mu_x\) is linear in age
\[ \mu_x = Bc^x,\quad\text{or }\quad \mu_x=\alpha e^{\beta x}\]
- Makeham improves on Gompertz by adding a constant, age-independent term \(A\). This matters most at younger ages, where observed mortality tends to be curved rather than linear on the log scale
\[ \mu_x = A + Bc^x\]
- We can generalise these approaches with the generalised Makeham class of models:
\[ \mu_x = \sum^{r-1}_{i=0}\alpha_i x^i+\exp\left(\sum^{s-1}_{j=0}\beta_jx^j\right)\]
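As a minimal sketch of the Gompertz and Makeham laws, the snippet below evaluates both forces of mortality and checks that the Gompertz curve rises by a constant \(\ln c\) per year of age on the log scale. The parameter values are illustrative assumptions, not fitted to any data.

```python
import math

def gompertz(x, B, c):
    """Gompertz force of mortality: mu_x = B * c**x."""
    return B * c ** x

def makeham(x, A, B, c):
    """Makeham force of mortality: mu_x = A + B * c**x."""
    return A + gompertz(x, B, c)

# Illustrative (assumed) parameter values, not fitted to any data.
A, B, c = 0.0005, 0.00003, 1.10

# Under Gompertz, log(mu_x) increases by log(c) for each year of age.
log_steps = [math.log(gompertz(x + 1, B, c)) - math.log(gompertz(x, B, c))
             for x in range(30, 35)]

# Makeham differs from Gompertz by the constant A at every age.
gap_at_40 = makeham(40, A, B, c) - gompertz(40, B, c)
```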
Heligman and Pollard (first) Law
- Has different components to account for the non-linear progression of mortality with age: early childhood mortality, the accident hump and senescent mortality
\[ q_x = A^{(x+B)^C}+De^{-E(\ln x-\ln F)^2}+\frac{GH^x}{1+GH^x}\]
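The three components can be evaluated separately, as in the sketch below. It assumes the accident-hump term takes the form \(De^{-E(\ln x-\ln F)^2}\), so the hump peaks at age \(F\) with height \(D\). The parameter values are hypothetical and chosen only to give plausible magnitudes.

```python
import math

def heligman_pollard(x, A, B, C, D, E, F, G, H):
    """q_x as the sum of childhood, accident-hump and senescent terms (x > 0)."""
    childhood = A ** ((x + B) ** C)
    hump = D * math.exp(-E * (math.log(x) - math.log(F)) ** 2)
    senescent = G * H ** x / (1 + G * H ** x)
    return childhood + hump + senescent

# Hypothetical parameter values, for illustration only.
params = dict(A=0.0005, B=0.01, C=0.10, D=0.001, E=10.0, F=20.0,
              G=0.00005, H=1.10)

q20 = heligman_pollard(20, **params)  # hump term equals D at x = F
q80 = heligman_pollard(80, **params)  # dominated by the senescent term
```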
Maximise the Likelihood Function
- Either binomial (\(q_x\)) or Poisson (\(\mu_x\))
Minimise \(\chi^2\) statistic:
\[ \sum\frac{(A-E)^2}{E}\]
- Minimise weighted least squares
- Use weight based on variance to give less weight to more variable ages (typically with less data)
- Use inverse of variance
\[ \sum_x w_x(\hat{q}_x-\mathring{q}_x)^2 \]
- Binomial Weights:
\[ w_x\approx \frac{E_x}{\hat{q}_x(1-\hat{q}_x)}\approx\frac{E_x}{\hat{q}_x}\]
- Poisson Weights:
\[ w_x \approx \frac{E^c_x}{\hat{\mu}_x}\]
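A short sketch of the weighted least squares objective with binomial weights follows. The exposures and rates are made-up numbers, and the function names are illustrative rather than taken from any library.

```python
def binomial_weight(exposure, q_crude):
    """w_x = E_x / (q_x * (1 - q_x)), the inverse of the estimated variance of the crude rate."""
    return exposure / (q_crude * (1.0 - q_crude))

def weighted_sse(q_crude, q_grad, weights):
    """Weighted least squares objective: sum of w_x * (crude - graduated)**2."""
    return sum(w * (qc - qg) ** 2 for qc, qg, w in zip(q_crude, q_grad, weights))

# Made-up data for two ages: the second age has far less exposure.
exposure = [1000.0, 200.0]
q_crude = [0.010, 0.020]
q_grad = [0.011, 0.018]

weights = [binomial_weight(e, q) for e, q in zip(exposure, q_crude)]
objective = weighted_sse(q_crude, q_grad, weights)
```

The age with less exposure receives the smaller weight, so its larger deviation contributes less to the objective than it would under unweighted least squares.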
Graduation with Reference to Standard Table
- Useful when there is not a lot of data
- Assume that the overall progression of rates from age to age should be similar
- Select appropriate standard table
- Decide relationship between standard table and graduated rates \(\mathring{q}_x = f(q_x^s)\) or \(\mathring{\mu}_x = f(\mu_x^s)\). For example:
\[ \mathring{q}_x = a + bq_x^s\]
- Can determine these relationships by plotting \(\hat{q}_x\) against \(q^s_x\)
- Determine parameter values through MLE, least squares, weighted least squares
- Test the graduation/goodness of fit
- Can assume it will be relatively smooth since the standard rates themselves are smooth
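For the linear relationship \(\mathring{q}_x = a + bq_x^s\), weighted least squares has a closed-form solution. The sketch below is one way to compute it; the standard-table rates and weights are invented for illustration.

```python
def fit_linear_to_standard(q_crude, q_std, weights):
    """Weighted least squares estimates of (a, b) in q = a + b * q_std."""
    sw = sum(weights)
    sx = sum(w * s for w, s in zip(weights, q_std))
    sy = sum(w * q for w, q in zip(weights, q_crude))
    sxx = sum(w * s * s for w, s in zip(weights, q_std))
    sxy = sum(w * s * q for w, s, q in zip(weights, q_std, q_crude))
    b = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a = (sy - b * sx) / sw
    return a, b

# Invented standard-table rates; crude rates built to lie exactly on a line.
q_std = [0.010, 0.012, 0.015, 0.019]
q_crude = [0.001 + 1.2 * s for s in q_std]
weights = [500.0, 400.0, 300.0, 200.0]

a, b = fit_linear_to_standard(q_crude, q_std, weights)
q_grad = [a + b * s for s in q_std]  # graduated rates
```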
Graduation with Splines
- Useful and flexible curve fitting method
- Knots that are used are not necessarily integers
- Concentrate knots around the accident hump as this is the most variable point
- General Spline Formula:
\[ q_y = a_0+a_1y+a_2y^2+a_3y^3+\sum^n_{j=1}b_j(y-x^{(j)})_+^3 + \epsilon_y\]
- We often use weights in the least squares spline regression. We multiply both sides by \(\sqrt{w_x}\):
\[ \sqrt{w_x}q_x = \sum^3_{i=0}a_i\sqrt{w_x}x^i +\sum^n_{j=1}b_j\sqrt{w_x}(x-x^{(j)})_+^3+\sqrt{w_x}\epsilon_x\]
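The regression above is ordinary least squares on a truncated power basis. The sketch below builds one row of that design matrix and its \(\sqrt{w_x}\)-scaled version. The knot positions are hypothetical, and solving the resulting least squares problem is left to whichever linear algebra routine is available.

```python
def cubic_spline_basis(x, knots):
    """Basis terms at age x: 1, x, x^2, x^3, then (x - knot)_+^3 for each knot."""
    truncated = [max(x - k, 0.0) ** 3 for k in knots]
    return [1.0, x, x ** 2, x ** 3] + truncated

def weighted_row(x, knots, w):
    """Scale every basis term by sqrt(w) so that ordinary least squares
    on the scaled rows is equivalent to weighted least squares."""
    root_w = w ** 0.5
    return [root_w * term for term in cubic_spline_basis(x, knots)]

# Hypothetical knots; they need not be integers.
knots = [18.5, 22.0, 40.0]
row = cubic_spline_basis(25, knots)
scaled = weighted_row(25, knots, 4.0)
```

Knots above the age being evaluated contribute zero, which is what lets the cubic change shape only after each knot.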
Smoothing Splines
- Non-parametric natural cubic spline that minimises the following expression:
\[ \sum^n_{i=1}[y_i-f(x_i)]^2 + \lambda\int^{x_n}_{x_1}[f''(t)]^2dt \]
- \(\lambda\) controls the level of smoothness, with \(\lambda = 0\) resulting in undergraduation to the extent of interpolating the data, and large \(\lambda\) forcing the fit towards a straight line
Comparison of Different Grad Methods
Parametric
- Used when large amounts of data are available
- Produces extremely precise results
- Produces sufficiently smooth rates provided the number of parameters is small
- Optimised through statistical fitting, therefore not subjective
- Can use MLE parameters which have good statistical properties
- However, it is hard to find a curve that fits an experience well at all ages
Reference to Standard Table
- Relatively simple
- Used with little data
- Typically produces smooth rates
- Easier to fit the high and low ages
- Does not fully represent the data as it relies on a reference
- Heavily depends on the standard table chosen
Testing for Smoothness
- Can test for smoothness using finite differences of the graduated rates
- First Differences:
\[ \Delta\mathring{q}_x= \mathring{q}_{x+1}-\mathring{q}_x\]
- Second Differences:
\[ \Delta^2\mathring{q}_x= \Delta\mathring{q}_{x+1}-\Delta\mathring{q}_x\]
- Third Differences:
\[\Delta^3\mathring{q}_x= \Delta^2\mathring{q}_{x+1}-\Delta^2\mathring{q}_x \]
- Absolute value of third differences should be relatively small when compared to the graduated rates themselves
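The smoothness check can be sketched by applying a first-difference function three times. The graduated rates below are hypothetical.

```python
def differences(values):
    """First differences: values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]

def third_differences(rates):
    """Apply the first-difference operator three times."""
    return differences(differences(differences(rates)))

# Hypothetical graduated rates at six consecutive ages.
q_grad = [0.00100, 0.00110, 0.00122, 0.00136, 0.00152, 0.00170]

d3 = third_differences(q_grad)
# Size of each third difference relative to the rate at the same starting age.
largest_relative = max(abs(d) / q for d, q in zip(d3, q_grad))
```

Each round of differencing shortens the list by one, so six rates give three third differences.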
Statistical Tests for Adherence to Data of a Graduation
- No single test will provide adequate coverage of every aspect of the fit
- Assumptions
- Lives are independent
- No heterogeneity within each age
- Approximation (expected deaths should be >5 for normal approximation)
- Null hypothesis is that the actual data are consistent with the values predicted by the graduated rates
- Standardised Deviation:
- If there is a sufficient number of independent lives at each age \(x\), then by the central limit theorem the standardised deviations are approximately standard normal and mutually independent.
\[ \frac{A-E}{\sqrt{E}} \]
Under Poisson:
- Number of deaths:
\[ D_x\sim N(E^c_x\mathring{\mu}_{x+0.5}, E^c_x\mathring{\mu}_{x+0.5})\]
- Standardised Deviation:
\[ z_x = \frac{d_x-E^c_x\mathring{\mu}_{x+0.5}}{\sqrt{E^c_x\mathring{\mu}_{x+0.5}}}\]
Under Binomial:
- Number of deaths:
\[ D_x\sim N(E_x\mathring{q}_x,E_x\mathring{q}_x(1-\mathring{q}_x))\]
- Standardised Deviation
\[ z_x = \frac{d_x-E_x\mathring{q}_x}{\sqrt{E_x\mathring{q}_x(1-\mathring{q}_x)}}\approx \frac{d_x-E_x\mathring{q}_x}{\sqrt{E_x\mathring{q}_x}}\]
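The two standardised deviations translate directly into code. This is a minimal sketch with illustrative function names and made-up inputs.

```python
import math

def z_poisson(deaths, central_exposure, mu_grad):
    """Poisson model: mean and variance are both E^c_x * mu."""
    expected = central_exposure * mu_grad
    return (deaths - expected) / math.sqrt(expected)

def z_binomial(deaths, initial_exposure, q_grad):
    """Binomial model: mean E_x * q, variance E_x * q * (1 - q)."""
    expected = initial_exposure * q_grad
    return (deaths - expected) / math.sqrt(expected * (1.0 - q_grad))

# Made-up inputs: 30 deaths against 25 expected, and 12 against 10 expected.
z_p = z_poisson(30, 10000.0, 0.0025)
z_b = z_binomial(12, 1000.0, 0.01)
```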
Chi-Square Test of Fit
- General test for goodness of fit
- Doesn’t give information on direction of any bias
- Test Statistic:
\[ X = \sum_x z_x^2\sim\chi^2_n \]
- Degrees of freedom \(n\) = Number of groups - Number of estimated parameters
- Degrees of freedom:
- Lose one per parameter estimated
- When graduating using a standard table
- Lose one degree for each parameter fitted
- Also lose some degrees of freedom (2-3) due to constraints imposed on chosen table
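A small sketch of the statistic and the degrees-of-freedom adjustment follows. The deviations are made up, and the `table_constraints` argument is a hypothetical way to subtract the extra degrees of freedom lost when a standard table is used.

```python
def chi_square_statistic(z_values):
    """Sum of squared standardised deviations."""
    return sum(z * z for z in z_values)

def degrees_of_freedom(n_groups, n_params, table_constraints=0):
    """Groups minus fitted parameters, minus any allowance for a standard table."""
    return n_groups - n_params - table_constraints

# Made-up standardised deviations for five age groups, two fitted parameters.
z = [0.5, -1.2, 0.3, 2.0, -0.7]
X = chi_square_statistic(z)
dof = degrees_of_freedom(len(z), n_params=2)

# 7.815 is the upper 5% point of the chi-square distribution with 3 degrees of freedom.
reject = X > 7.815
```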
Standardised Deviations Test
- Normality test for standardised deviations
- Checks whether standardised deviations are too bunched, too spread out or in line with a standard normal distribution
- Can also use any other normality test, such as chi-squared, qq plots
- Roughly half of the deviations should fall between \((-\frac{2}{3},\frac{2}{3})\)
- Partition the deviations into a table of intervals and use the chi-squared statistic based on the number of deviations in each interval:
\[ X=\sum\frac{(\text{actual}-\text{expected})^2}{\text{expected}}\sim\chi^2_{n-1}\]
- Note the chi-squared statistic has \(n-1\) degrees of freedom, where \(n\) is the number of intervals
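The interval counts and expected counts can be sketched as below, with expected counts taken from the standard normal distribution. The cut points and deviations are made up. The sample is deliberately tiny, so its expected counts are far too small for the chi-squared approximation to be reliable.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_chi_square(z_values, cut_points):
    """Chi-squared statistic comparing counts of deviations per interval
    with the counts expected under a standard normal distribution."""
    edges = [-math.inf] + list(cut_points) + [math.inf]
    m = len(z_values)
    stat = 0.0
    for lo, hi in zip(edges, edges[1:]):
        actual = sum(1 for z in z_values if lo < z <= hi)
        expected = m * (normal_cdf(hi) - normal_cdf(lo))
        stat += (actual - expected) ** 2 / expected
    return stat

# Made-up deviations; interior cut points -1, 0, 1 give four intervals.
z_values = [-1.5, -0.5, -0.2, 0.1, 0.4, 0.8, 1.3, -0.9]
stat = interval_chi_square(z_values, [-1.0, 0.0, 1.0])

# Proportion of a standard normal falling in (-2/3, 2/3).
central_half = normal_cdf(2 / 3) - normal_cdf(-2 / 3)
```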
Sign Test
Tests balance between positive and negative deviations
- Roughly half of the deviations should be positive and half negative
- Provides no indication of the extent of the discrepancies
Calculate the test statistic \(P\): the number of \(z_x\) that are positive
Find the p-value from the binomial distribution: under the null hypothesis \(P\sim Bin\left(m,\frac{1}{2}\right)\), where \(m\) is the number of ages
Alternatively use the normal approximation (if \(m>20\)):
\[ P\sim N\left(\frac{1}{2}m,\frac{1}{4}m\right)\]
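An exact two-sided p-value can be computed from the binomial distribution with \(m\) trials and probability \(\frac{1}{2}\). The sketch below doubles the smaller tail, which is one common convention and an assumption here.

```python
import math

def sign_test_p_value(z_values):
    """Two-sided exact p-value for the number of positive deviations,
    using Bin(m, 1/2) under the null hypothesis."""
    m = len(z_values)
    positives = sum(1 for z in z_values if z > 0)
    tail = min(positives, m - positives)
    lower = sum(math.comb(m, k) for k in range(tail + 1)) / 2 ** m
    return min(1.0, 2.0 * lower)

# Made-up deviations: 8 positive and 2 negative out of 10.
z_unbalanced = [0.4, 1.1, 0.2, -0.3, 0.9, 1.5, 0.1, -0.8, 0.6, 0.7]
p_unbalanced = sign_test_p_value(z_unbalanced)

# Perfectly balanced signs give the largest possible p-value.
p_balanced = sign_test_p_value([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])
```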
Cumulative Deviations Test
- General goodness of fit
- High value of test statistic indicates either:
- The graduated rates are biased
- Actual variance is higher than predicted by the assumed model for the range of ages considered (could be due to duplicate policies)
- Detects overall bias or long runs of deviations of the same sign
- Test Statistic:
\[ \frac{\sum_x (A_x - E_x)}{\sqrt{\sum_x Var(A_x)}}\sim N(0,1)\]
- Perform hypothesis test assuming standard normal distribution
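A minimal sketch of the statistic under the binomial model, where the variance of the deaths at each age is \(E_x\mathring{q}_x(1-\mathring{q}_x)\). The data are made up.

```python
import math

def cumulative_deviations_z(deaths, exposure, q_grad):
    """Sum of (actual - expected) deaths over the age range, divided by
    the square root of the summed binomial variances."""
    total_dev = sum(d - e * q for d, e, q in zip(deaths, exposure, q_grad))
    total_var = sum(e * q * (1.0 - q) for e, q in zip(exposure, q_grad))
    return total_dev / math.sqrt(total_var)

# Made-up data for three ages.
deaths = [12, 18, 33]
exposure = [1000.0, 1000.0, 1000.0]
q_grad = [0.01, 0.02, 0.03]

z = cumulative_deviations_z(deaths, exposure, q_grad)
reject_at_5_percent = abs(z) > 1.96  # two-sided test against N(0, 1)
```

Positive and negative deviations offset each other in the numerator, so the test picks up overall bias rather than large deviations that happen to cancel.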
Grouping of Signs Test (Stevens)
- Tests for overgraduation and runs of the same sign
- This can however lead to different results based on whether positive or negative is chosen
- Compare the number of groups/runs with the number of groups that would be expected if the positive and negative deviations were arranged in a random order
- Test statistic is:
\[ G = \text{ Number of groups of positive }z_x's\]
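- Perform hypothesis test with the hypergeometric distribution
- For large \(m\) we can use the normal approximation, where \(n_1\) is the number of positive deviations, \(n_2\) the number of negative deviations and \(m = n_1+n_2\):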
\[ G\sim N\left(\frac{n_1(n_2+1)}{n_1+n_2},\frac{(n_1n_2)^2}{(n_1+n_2)^3}\right) \]
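The sketch below counts the positive groups and evaluates the exact lower-tail probability. It uses the standard result \(P(G=t)=\binom{n_1-1}{t-1}\binom{n_2+1}{t}\big/\binom{n_1+n_2}{n_1}\), which is not stated in these notes and is supplied here as an assumption. Too few groups point to overgraduation, so the test is one-sided.

```python
import math

def count_positive_groups(z_values):
    """Number of runs of consecutive positive deviations."""
    groups, previous_positive = 0, False
    for z in z_values:
        positive = z > 0
        if positive and not previous_positive:
            groups += 1
        previous_positive = positive
    return groups

def prob_groups_at_most(g, n1, n2):
    """P(G <= g) when n1 positive and n2 negative signs are in random order."""
    total = math.comb(n1 + n2, n1)
    ways = sum(math.comb(n1 - 1, t - 1) * math.comb(n2 + 1, t)
               for t in range(1, g + 1))
    return ways / total

# Made-up deviations: positive runs are [1, 2], [3] and [4, 5].
z_values = [1, 2, -1, -2, 3, -1, 4, 5]
G = count_positive_groups(z_values)
n1 = sum(1 for z in z_values if z > 0)
n2 = len(z_values) - n1
p_value = prob_groups_at_most(G, n1, n2)
```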
Serial Correlations Test
- Tests for overgraduation
- Overgraduated curves tend to stay on the same side of the crude rates over relatively long age ranges
- Undergraduated curves will cross the crude rates frequently
- Positive correlation over one age range can be cancelled by negative correlation over another, so this test is typically weaker than the signs/grouping of signs tests
- Formulas:
\[ r_j=\frac{\sum^{m-j}_{i=1}(z_i-\bar{z}_1)(z_{i+j}-\bar{z}_2)}{\sqrt{\sum^{m-j}_{i=1}(z_i-\bar{z}_1)^2\sum^{m-j}_{i=1}(z_{i+j}-\bar{z}_2)^2}}\]
\[ \bar{z_1} = \frac{1}{m-j}\sum^{m-j}_{i=1}z_{i} \]
\[ \bar{z_2} = \frac{1}{m-j}\sum^{m-j}_{i=1}z_{i+j}\]
- For large \(m\), \(\bar{z}_1,\bar{z}_2\) can be approximated by the total average
\[ r_j\approx\frac{\frac{1}{m-j}\sum^{m-j}_{i=1}(z_i-\bar{z})(z_{i+j}-\bar{z})}{\frac{1}{m}\sum^{m}_{i=1}(z_i-\bar{z})^2}\]
- Test statistic:
\[ t_j = r_j\sqrt{m}\sim N(0,1)\]
- A large positive test statistic indicates overgraduation
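The large-\(m\) approximation, with a single overall mean \(\bar{z}\), can be sketched as follows. The two example sequences are made up to show the extremes: alternating signs and one long run of each sign.

```python
import math

def serial_correlation(z, lag=1):
    """Approximate lag-j serial correlation using the overall mean of z."""
    m = len(z)
    z_bar = sum(z) / m
    numerator = sum((z[i] - z_bar) * (z[i + lag] - z_bar)
                    for i in range(m - lag)) / (m - lag)
    denominator = sum((v - z_bar) ** 2 for v in z) / m
    return numerator / denominator

def serial_test_statistic(z, lag=1):
    """t_j = r_j * sqrt(m), compared with N(0, 1)."""
    return serial_correlation(z, lag) * math.sqrt(len(z))

# Alternating signs: strongly negative correlation (curve crosses the data often).
r_alternating = serial_correlation([1, -1, 1, -1, 1, -1])

# Long runs of the same sign: positive correlation, suggesting overgraduation.
r_runs = serial_correlation([1, 1, 1, -1, -1, -1])
t_runs = serial_test_statistic([1, 1, 1, -1, -1, -1])
```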
Effect of Duplicate Policies
There is the potential for duplicate policies, which violates the assumption of independence of policies
- Has no effect on bias (expected value)
- Affects variability, increasing the variance
Assume N lives from age \(x\) to \(x+1\)
- Assume a proportion \(\pi_i\) of lives own \(i\) insurance policies (these proportions are unknown)
- Total number of policies is:
\[ \sum_i i\pi_i N\]
- Assume the mortality rate for each life is \(q_x\)
- Let \(D_i\) be the number of deaths among the \(\pi_i N\) lives each with \(i\) policies and \(C_i\) be the number of claims among these lives
Assuming the binomial model:
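\[ D_i\sim Bin(\pi_iN, q_x)\]
- Each death among these lives produces \(i\) claims, so \(C_i = iD_i\) and the total number of claims is \(C=\sum_i C_i\). Then we have:
\[ \mathbb{E}[C] = \left(\sum_i i\pi_i N\right)q_x \]
\[ Var(C) = \left(\sum_i i^2\pi_i N\right)q_x(1-q_x)\]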
- Duplicate policies increase the variance of the number of claims by the ratio:
\[ \frac{\sum_i i^2\pi_i}{\sum_i i\pi_i}\]
- Should make an allowance for the increased variances in statistical tests
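The variance ratio is easy to compute once a distribution of policies per life is assumed. The proportions below are hypothetical, since in practice they are unknown and have to be estimated or assumed.

```python
def variance_inflation(policy_distribution):
    """Ratio sum(i^2 * pi_i) / sum(i * pi_i), where policy_distribution
    maps i (policies per life) to pi_i (proportion of lives)."""
    second = sum(i * i * p for i, p in policy_distribution.items())
    first = sum(i * p for i, p in policy_distribution.items())
    return second / first

# Hypothetical distribution: 80% of lives hold one policy, 15% two, 5% three.
pi = {1: 0.80, 2: 0.15, 3: 0.05}
ratio = variance_inflation(pi)

# One possible allowance: scale a standardised deviation by 1 / sqrt(ratio).
z_unadjusted = 2.1
z_adjusted = z_unadjusted / ratio ** 0.5
```

With no duplicates the ratio is 1, so the adjustment leaves the tests unchanged.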