class: center, middle, inverse, title-slide # Statistics with R ## Statistical Modelling with R for Actuarial Students --- <style type="text/css"> pre { background: #ADD8E6; max-width: 100%; overflow-x: scroll; } </style> ***CS2B – Risk Modelling and Survival Analysis *** * The emphasis is placed on being able to apply statistical methods to actuarial problems using real data sets and the open-source software environment R. * Time Series. Probability Distributions, Survival Analysis --- ### Compound Poisson Distribution The annual aggregate claim amount of an insurer follows a compound Poisson distribution with parameter 1,000. Individual claim amounts follow a Gamma distribution with shape parameter α = 750 and rate parameter λ = 0.25. --- 1. Generate 20,000 simulated aggregate claim values for this insurer, using a random number generator seed of 914. Use the R function, <tt>head()</tt>, to display the first seven simulated claim values in your answer script. 2. Determine the mean and the standard deviation of the simulated aggregate claim values from part 1. 3. Plot the empirical density function of the simulated aggregate claim values from part 1, setting the x-axis range from 2,600,000 to 3,300,000 and the y-axis range from 0 to 0.0000045. 4. Suggest a suitable distribution, including its parameters, that approximates the simulated aggregate claim values from part 1. 5. Generate 20,000 values from your suggested distribution in part 4 using a random number generator seed of 914. Use the R function, <tt>head()</tt>, to display the first seven generated values in your answer script. 6. Plot the empirical density function of the simulated values in part 5. as a different coloured line in the chart that was produced in part 3. --- ### Part 1 ```r set.seed(914) n = rpois(20000, 1000) s = numeric(20000) for(i in 1:20000) { x = rgamma(n[i], shape = 750, rate = 0.25) s[i] = sum(x) } head(s, 7) ``` ``` ## [1] 2860469 2915250 2998362 3223837 2915546 2971731 3132371 ``` --- ### Part 1 #### Alternative Approach ```r set.seed(914) s = numeric(20000) for(i in 1:20000) { n = rpois(1, 1000) x = rgamma(n, shape = 750, rate = 0.25) s[i] = sum(x) } head(s, 7) ``` ``` ## [1] 2856968 2929369 2910782 2941784 3041930 3057008 2953528 ``` --- #### Part 2. ```r mean(s) ``` ``` ## [1] 3000283 ``` The mean of the simulated claim values is 3,000,283 ```r sd(s) ``` ``` ## [1] 95263 ``` The standard deviation of the simulated claim values is 95,263 --- ### Part 3. ```r plot( density(s), xlim = c(2600000, 3300000), ylim = c(0, 4.5e-06), xlab = "Simulated Claims", main = "Probability Density Function", sub = "Simulated Claims from a Compound Poisson Distribution") ``` --- ### Part 3. <!-- --> --- ### Part 4. Suggest a suitable distribution, including its parameters, that approximates the simulated aggregate claim values #### Normal distribution * mean = 3,000,982 * standard deviation = 93,872.61 * 93,872.61^2 = 8,812,067,277 $$ X \sim N(\mu=3,000,982, \sigma^2 = 8,812,067,277)$$ --- #### Part 5. 20,000 values from your suggested distribution in part 4 using a random number generator seed of 914. ```r set.seed(914) approx_dist = rnorm(20000, mean(s), sd(s)) head(approx_dist, 7) ``` ``` ## [1] 2854076 3082548 2900897 2999033 3232148 2917926 2975351 ``` --- ### Part 6. ```r plot( density(s), xlim = c(2600000, 3300000), ylim = c(0, 4.5e-06), xlab = "Simulated Claims", main = "Probability Density Functions") ### Super-imposed theoretical distribution lines( density(approx_dist), col = "red") legend("topright", legend = c("Simulated Claims", "Approximate Normal Distribution"), col = c("black","red"), pch = 7) ``` --- ### Part 6. <!-- --> --- ---