Question 1

(i) Derive a two-sided 90% confidence interval for the mean difference in the patients’ blood pressure before and after participating in an exercise programme.

bp_before <- c(155, 152, 146, 153, 146, 160, 139, 148)
bp_after <- c(145, 147, 123, 137, 141, 142, 140, 138)

bp_diff <- bp_before - bp_after

bp_diff_size <- length(bp_diff)
bp_diff_mean <- mean(bp_diff)
bp_diff_stdv <- sd(bp_diff)

degree_of_freedom <- bp_diff_size - 1
t_score <- qt(0.95, degree_of_freedom)

confidence_interval <- c(bp_diff_mean - t_score*bp_diff_stdv/sqrt(bp_diff_size), bp_diff_mean + t_score*bp_diff_stdv/sqrt(bp_diff_size))

confidence_interval
## [1]  5.46661 16.03339

Conclusion: The two-sided 95% confidence interval for the mean difference in weights is [5.46661, 16.03339].

(ii) Perform a suitable t-test for the null hypothesis that the mean difference in the patients’ blood pressure before and after participating in an exercise programme is less than or equal to 10 units, against the alternative that it is greater than 10 units. Use a significance level of 1%.

t.test(x = bp_before, y = bp_after, paired = TRUE, alternative = "greater", mu = 10, conf.level = 0.99)
## 
##  Paired t-test
## 
## data:  bp_before and bp_after
## t = 0.26894, df = 7, p-value = 0.3979
## alternative hypothesis: true difference in means is greater than 10
## 99 percent confidence interval:
##  2.389646      Inf
## sample estimates:
## mean of the differences 
##                   10.75
t_score_0.01 <- qt(0.99, degree_of_freedom)
t_score_0.01
## [1] 2.997952

Since |to|= 0.26894 < 2.997952 = t0.01,7 or p-value = 0.3979 > 0.01 = alpha, the null hypothesis is not rejected at 1% significance level.

Question 2

(i)(a) Implement M = 1000 Monte Carlo repetitions of a credibility analysis to estimate the distribution of the posterior mean of parameter πœ† using the credibility factor Z = 1/(𝛽 + 1), in the case where the number of past claims π‘₯ is known only for the last one year.

set.seed(508)

simulation = 1000
alpha = 100
beta = 1
Z = 1/(beta + 1)
posterior_mean <- rep(0, simulation)
X <- rep(0, simulation)
for(i in 1:simulation){
  lamda = rgamma(1, shape = alpha, rate = beta)
  x = rpois(1, lamda)
  X[i]= x
  posterior_mean[i] = Z*x + (1-Z)*beta*alpha
}
posterior_mean
##    [1]  87.5  97.0 102.0 109.5  97.0 108.0 101.0 105.5 101.0 103.0  98.0  99.0
##   [13] 109.5 101.0  99.5  89.0 107.5 104.0 102.0  95.0 104.0  92.5  95.5 102.5
##   [25] 104.5 110.5 102.5 100.0 106.0 100.0  93.0 101.5 107.0 110.0  97.5 100.5
##   [37]  92.5 100.0 105.5 103.0 106.0 109.5  95.5  95.5 104.0 107.0 102.0 121.0
##   [49]  98.5 102.0 100.5  95.0  93.0  92.0  95.5 100.0 106.5  94.5 105.0  87.0
##   [61]  98.0 103.5  91.0 102.0  95.5 100.5 106.0 106.0  97.5  96.0  99.5 100.5
##   [73]  95.5  84.0 100.5 119.5 109.5  91.0 107.0 104.5  95.0 102.0  96.0 102.5
##   [85] 108.0 105.0  91.5 106.0  87.5  87.0 105.5 104.5 103.5  99.0  98.5 107.0
##   [97] 101.0 105.5 102.0 102.5  94.0 115.5 100.5 110.0 105.0  89.5 105.0 111.0
##  [109] 100.0 108.0 111.5  91.5 100.0 111.5  99.0  95.0  97.5 108.5 101.5 103.5
##  [121]  95.5 106.5  97.5  96.5 104.0  96.0  92.0 103.5 102.5  86.0 102.0 103.0
##  [133] 101.5  93.5 111.5  93.0 101.5 102.0  95.0 100.5 101.5 107.5  89.5 100.5
##  [145]  98.5  92.0 104.0 103.5 101.0 105.5  91.0  90.5  92.5 108.5 100.0 104.5
##  [157] 104.0 106.0 100.5 102.0 115.5 108.5 111.0  97.0  99.0  96.0  97.0 102.5
##  [169]  93.0 103.5 119.5  89.5  97.5 105.5 102.0 100.0 102.0 111.5 110.0  96.0
##  [181]  93.0  99.5  95.5  86.5 102.5 103.5  94.0  94.5  93.0  91.0  93.5 102.5
##  [193] 110.0 111.5  96.5  84.0 103.0  87.5  92.0  95.0  99.0  92.0  93.5 106.0
##  [205] 103.5 104.5  98.0 104.0 100.0  93.0  98.5  89.0 100.5  98.5  98.5  96.5
##  [217] 112.0  96.5 105.0 109.0 101.5 107.5  93.5  84.0  94.5 108.0  84.0 102.0
##  [229] 107.0  98.5  99.5 109.0  97.5 114.0  97.0  99.0  97.0 100.0 113.0 105.5
##  [241] 112.0 101.0  91.5 106.0 105.0  98.0 100.5  84.5  98.0  93.0 101.5  95.5
##  [253] 106.0 100.5  97.0 104.5  86.0  94.5 100.5  92.5  92.5  94.0  98.0 104.5
##  [265]  97.5  85.5 107.0 104.5 100.5 111.5  97.5  94.5 101.5  99.0  94.5  98.0
##  [277]  95.0  98.0  89.0 105.0 102.0 100.5  95.5 117.0 113.0  99.5 103.5 100.5
##  [289] 112.0  98.5 111.0 100.0 102.5 100.0  96.5  99.0 103.0  92.0  99.0  99.0
##  [301] 104.0  85.0 100.5 100.5  97.5 102.0  95.0  99.0  96.0 100.5  94.0 108.0
##  [313]  98.5 100.5 103.5 107.5 113.0 106.5 109.5 100.5  99.0 106.0 112.0  95.5
##  [325]  93.0 101.0  98.0  99.0  99.5 111.0 124.5  98.5  92.5 104.0  95.0 106.5
##  [337]  95.5 105.5 102.5 100.0  96.0  97.0  95.5  95.5  99.0  99.5  97.0 102.5
##  [349]  94.0 108.0  90.5  98.5 100.0  94.5  88.5  91.0 109.0 101.0  89.0  99.5
##  [361] 102.0 107.5  94.0 101.0 108.0  97.0  99.0  91.5 115.0  98.5  99.5  90.0
##  [373]  93.5  96.5 101.5 102.0 101.0  98.5  94.5  91.0  86.5  89.5  93.5  97.5
##  [385]  94.0  98.5  87.5  97.5 100.5 111.5  97.5 101.5  93.5 110.0 103.0  86.0
##  [397] 101.0  97.5  94.5  89.5  96.0  99.0  91.5 106.0  87.0  91.0 102.0  90.5
##  [409] 104.5 100.0 110.0  89.0 105.5  98.0 112.5  98.5 113.0  96.5  98.0  90.5
##  [421] 102.0  95.5  94.5  99.5 106.0 102.5 106.5  94.5 104.5 100.0  98.5  92.0
##  [433] 103.0  96.0  98.5 101.5  99.0  96.5 103.0 107.5  94.0  94.0  88.5 100.5
##  [445]  97.5 106.5 107.5  98.0  92.0 103.0  99.0  92.5 101.0 110.0  92.0  98.5
##  [457] 106.0 101.5  92.5  96.5  98.0  94.0  99.5  99.5 100.0  95.5  96.0 111.5
##  [469] 102.0 101.5 101.0  99.0 116.5 104.0 104.5  89.0 105.0 103.0  94.0  97.0
##  [481] 100.0  98.5 101.0  88.0  98.0  89.5 107.5  93.5  94.0  91.0  96.0  92.0
##  [493] 107.0 101.5  96.5 101.0 104.0 106.0  98.0 104.0  94.0  94.5  99.0  92.0
##  [505]  95.5 110.5 102.5  90.5  91.5 101.0 113.5  94.5 112.5  89.0  96.5 106.5
##  [517] 100.5 108.5  89.5 107.5 101.0  98.5  94.0 101.5 111.5 100.5  98.0 100.5
##  [529]  95.0 107.0  99.5 104.5  99.5  94.5 102.0 114.0 104.0 100.0  91.5 101.5
##  [541]  89.0 110.0  95.0  99.5  90.5  91.5 113.5 104.0  98.0  92.5  96.5 101.0
##  [553]  95.0 102.5 109.0 113.5 107.0 100.5  88.5 104.5 106.5 106.5  95.0  94.0
##  [565] 104.5 109.0 103.5  87.0 102.5  99.0 101.5  89.0  95.5 104.5  98.5  90.5
##  [577]  93.5 100.5  99.0 106.0 107.5 103.0  96.0 108.5  95.5 103.0  99.5 105.5
##  [589] 100.0  89.5  98.5 101.0 108.0  96.5 100.5  90.0  93.5 105.0 103.5 107.0
##  [601] 106.0  83.5  97.5 103.5 106.5 101.0 102.5 103.5 100.5  97.0 106.5  86.5
##  [613] 102.5  88.0  93.5 102.5 110.5  98.5  97.5 101.5 108.5 101.0  97.0 102.0
##  [625]  99.5 111.0 106.5 112.5 101.5 100.5 103.5 107.5 105.5  93.5 102.5 102.5
##  [637] 100.0  97.0  90.5 101.5 100.0 103.5 111.0 106.5  94.0 104.0 100.5  79.5
##  [649]  93.0 104.0 107.5  88.5  94.0 112.5 100.5 103.5  89.0 102.5 104.5 109.0
##  [661] 103.0  97.0 106.5  86.5 104.0  91.5  90.0 113.5 101.5  92.0 100.5 108.0
##  [673] 107.0 105.5  95.0  97.0 101.0 110.5  96.0  95.0 108.0 106.5  93.0  86.5
##  [685] 102.5  99.5  97.5  86.5  92.0 103.0 101.5  95.0  94.0  93.5  97.0 112.5
##  [697]  94.5 108.0 103.5  91.0 101.5 102.0  94.0 102.5  95.5 110.5 100.5  98.0
##  [709] 115.5  98.0  94.5  87.5 103.0 106.5  81.5 110.0  90.5 103.0  94.0 104.5
##  [721]  95.0  88.0  96.0 101.5  94.5 106.0  95.5 100.5 102.5  88.5 103.0  89.5
##  [733]  95.5  94.5  90.5  98.5  80.5  95.0 102.5  94.5  91.5  86.5 108.0 104.5
##  [745] 101.0 105.5  91.5 103.0 100.0  94.0  94.0 109.0 102.0 113.5 102.5 102.5
##  [757] 105.0 102.5 103.0  95.5  92.0  96.0  97.0  85.0 104.5  85.5  94.5  99.0
##  [769]  98.0 116.0  91.0 102.0 104.5  97.5 104.5 103.5 105.0 110.0 103.0 106.5
##  [781] 102.0  89.5  97.5 115.0 116.5 100.0 103.0 101.0  96.0  89.0  92.5  86.0
##  [793] 102.0  99.0  97.0 103.0  95.0  98.5  97.0 103.0 109.0 112.5  95.0  97.0
##  [805] 107.0  97.0 102.5 104.0  96.5  99.0  95.0  95.5 105.5 102.5 110.0 100.0
##  [817] 114.5 100.5  93.0 102.0 101.0 105.5 105.0 105.0 104.5  93.5 110.0  87.0
##  [829]  99.0  86.5 110.0 103.5  88.0  93.0 102.0 112.5 111.5 100.0 105.0 108.0
##  [841] 100.5 101.0 106.0  90.5 110.5 103.0 110.0  89.5 101.0  97.0 105.5 105.5
##  [853]  93.0 108.0 106.0  95.5  99.5 104.0  96.0  92.5 105.0  88.5 105.5 104.5
##  [865] 105.5 108.0  91.0 122.5 113.5 100.5  99.0 105.5  94.5  93.0  99.0  93.0
##  [877]  93.0  93.5 100.5 101.5  99.5 101.5 110.5 100.5 100.0 107.0 110.0 102.0
##  [889] 101.5 102.0 100.5  94.5  97.5  84.0  95.0  96.0  88.5  99.5 108.5  98.5
##  [901]  95.5  98.0 103.0  96.0  95.5 104.0  92.0 112.0 106.5  90.5 108.5  99.5
##  [913] 103.5 112.5  90.0 101.0 112.0 108.0 108.0  87.0  91.5 106.5 102.0  91.0
##  [925]  87.5  94.5  99.0 101.0  98.5 107.5  97.0  98.0  99.5 104.0 102.5 110.5
##  [937] 109.5 116.0 109.5 101.5 109.0 101.0 100.0  93.5 100.0 102.0  88.5 102.0
##  [949]  92.0 105.0  99.5  95.0  94.5  94.0  98.5  95.5  98.5 101.5  94.5 106.0
##  [961]  96.0 100.0  99.0 105.0 104.5  91.5  99.0 108.5 105.0 106.5  99.0 115.0
##  [973]  95.5 105.0  87.5  93.5  99.5  94.5 101.0  94.0 103.0  95.0 109.5 115.0
##  [985] 102.0  92.0 101.0 102.5  94.5  96.5 100.0  98.5  93.0  99.0  95.5  98.0
##  [997]  93.5 100.5  99.0  98.5

(i)(b) Provide the histogram of the 1000 Monte Carlo posterior mean estimates calculated in part (i)(a).

hist(posterior_mean, main = "Histogram of Posterior Mean", xlab = "Estimated posterior mean", ylab = "frequency")

(ii)(a)Calculate the mean and variance of the Monte Carlo posterior mean estimates from part (i).

mean1 <- mean(posterior_mean)
mean1
## [1] 99.951
var1 <- var(posterior_mean)
var1
## [1] 46.20981

(ii)(b) Compare the Monte Carlo mean and variance obtained in part (ii)(a) with those obtained from samples of size 1000 drawn from a Gamma(𝛼+π‘₯, 𝛽+1) distribution. Round your results to three decimal places.

set.seed(508)

Gamma_mean <- rep(0,simulation)
for(i in 1:simulation){
  lamda_2 = rgamma(1, shape = alpha, rate = beta)
  x = rpois(1, lamda_2)
  y = rgamma(1000, shape = alpha + x, rate = beta + 1)
  Gamma_mean[i] = mean(y)
}
Gamma_mean
##    [1]  87.44616  94.09157  85.72460  97.85816 106.26727 108.54678 102.77412
##    [8]  88.00897  82.48867  96.12308  96.25689  96.63907  98.34087  93.62327
##   [15] 111.00561  89.94649  99.16369  92.63186 101.40587 100.67896 101.34644
##   [22]  96.70605  91.00182  97.99534 105.36139  97.03906  99.00485 101.36582
##   [29]  94.11833 107.89870  96.71023  90.47764 106.81600 109.07858  89.56825
##   [36] 106.41340 110.56892 101.60491  99.99152 108.49164  95.40981 104.33371
##   [43] 111.53132 103.11583  91.67074  94.95488 106.12957  99.91089  96.13810
##   [50] 106.47255  90.64209  95.22531  93.23244  98.05992 102.78078  96.12602
##   [57] 108.53701 103.26429 107.30659  97.83310 106.88247 102.90077  95.09109
##   [64]  98.07250  91.60490 103.42467  96.20448  83.86903 101.51668 109.89602
##   [71]  94.02304  98.68766  96.17537  97.92481  88.20746  94.06660 109.50077
##   [78]  99.63221  97.78322  97.31124 107.90703 106.10030 102.70141  94.60826
##   [85]  94.66463  94.02434  99.76632 100.50942  95.80981  97.72138 113.80622
##   [92]  87.43322  86.76662 104.25398  91.20564 100.91093  93.30252 109.19926
##   [99] 103.81104  96.41373 102.97263 103.37098 109.83017  96.06345  97.06004
##  [106] 104.97312  97.04730 106.61357  91.55757 112.08850  99.83845 101.20049
##  [113]  96.94024  96.83808  89.90985 114.04112  94.79241  89.94812  97.25440
##  [120] 105.54951 102.50800 108.33600  99.83026 106.90319  97.96451 100.59141
##  [127]  99.28247  98.83492 107.70755 107.55030  93.03827  91.01591 100.29393
##  [134] 110.36471 100.42004  89.20323 102.72593 100.54160  92.09632 122.55762
##  [141]  94.56344 103.04441  98.57758  90.34179  90.40432  97.25036 110.16087
##  [148] 105.02314 106.58125  93.64644  94.90313 111.42166  98.79965  97.34388
##  [155] 101.50509 105.44850 100.04356 100.23349  92.67123 104.41229  99.17965
##  [162] 107.53292 101.82195  90.39792  84.05222 100.62570  98.21983  99.91959
##  [169]  90.19528 102.70981  98.32731  88.70233  90.74972 103.09004  88.28568
##  [176] 100.00051 103.88994  98.33723 106.16105  97.89534  98.49527  96.82221
##  [183]  97.78067 105.89948  99.42482  94.65552  93.36433  91.85416  92.16566
##  [190] 101.56286 121.75815 106.20370 100.62354  94.88206  93.71321 100.98945
##  [197]  95.69225  95.56637 101.50160 101.61649  92.39928 103.84804 107.10350
##  [204] 111.58744  98.99698  92.63066 105.93797  85.77849  96.75560  90.62887
##  [211] 100.65695 100.86954  96.94739 100.16276 102.10422  88.36398  96.45862
##  [218]  97.81552  89.34163  95.09918  90.11584 100.83208 101.16330 111.19539
##  [225] 101.17195 115.23920 101.47430  92.39896 101.01695  97.37216 108.49199
##  [232] 110.83271 101.00490  99.00640  86.72355  99.31187 102.97198  97.66921
##  [239] 107.10434  90.49001 102.61181 102.96857 103.01603  96.98987  88.43593
##  [246] 108.25383  94.56386 110.13111 112.13521 120.99265 100.50827  93.10325
##  [253] 112.44777  99.56319  98.22586 100.96484 112.96439 106.49221 102.09588
##  [260]  98.33470 102.69656 112.13980 101.64608 118.17814 103.13196 103.40540
##  [267]  99.39902 104.63635  98.46794  99.65787  93.08912 112.46158  98.32512
##  [274] 100.38694  99.38593 113.59222  96.25652 100.22600 101.15545  98.83821
##  [281] 103.96050  95.91277 110.35377 102.16971  83.04522 101.85201 104.83075
##  [288] 101.38901 103.74546  96.34709 108.40582  96.45871 100.77706  96.97167
##  [295] 102.10795 101.70023 108.09681 102.63587  97.65625 101.68092  99.28198
##  [302] 106.47276 104.03385  96.62634 101.83960 102.07004 101.73340 100.68486
##  [309]  96.26018 101.00391  89.83788 103.84014  92.01661  87.08430 101.12794
##  [316] 111.41198  95.98232 109.46326  99.33349 106.28514 102.54755  97.26418
##  [323]  90.69373 104.67262  91.67944  94.58466 100.89164 100.40316  99.18438
##  [330]  91.87316 108.86994  90.92165  91.43146  96.41397  82.48818  88.29153
##  [337] 107.50197  98.29373  99.44884  92.76681 102.49813 101.38737 101.93530
##  [344]  94.65647 107.98878 111.88269 100.42859  97.10136 113.39228  99.64481
##  [351]  88.03860 107.64982  92.82185  86.38416 105.99346  95.12570  92.24631
##  [358] 106.52606 102.32226 104.74807  93.60906 100.69022  97.20504  88.77994
##  [365] 107.90737 110.95431 107.97712 101.82913  98.77872 106.97517  86.54408
##  [372] 104.33553 109.12532  99.61971  98.63604  92.78762  89.04545 104.66290
##  [379] 103.73670 122.48188 104.56091 100.66509 104.88490  97.45248  89.74873
##  [386] 100.97469  95.51671  98.75483 107.00890 103.83926 116.43124 107.14330
##  [393]  92.36263  97.07104  98.63307 110.42854  97.08777 105.40266 103.58993
##  [400] 109.00032 111.74173 101.43560  96.74646  97.68013 110.90027 106.00200
##  [407]  94.72524  92.70860 110.90794 101.79501 107.08208  95.73410 103.50232
##  [414] 100.72377 100.83302 112.58904  97.11632 109.56297 108.92593 110.92980
##  [421] 102.85547  88.07015 107.02940  91.46797  90.90276 100.97484  99.78054
##  [428]  95.70747 108.45965 101.95553  84.42322  91.56156 108.56026 102.90360
##  [435]  95.91684 103.63731  95.69100  99.16172  98.01022  84.35218 101.80935
##  [442] 101.42450 105.28045  92.93507 106.40277  99.90608 109.30721 104.13838
##  [449] 101.24073 109.73983 114.49177  96.83795  88.94327 103.75058 110.07257
##  [456]  96.00466 100.91219  99.04004 106.64532  99.49275  94.27640  94.99293
##  [463]  97.31057 104.46321  90.83892  96.23999  92.57241 104.18238  89.78462
##  [470]  96.20905 104.66742 101.84355 102.13341  92.90728 104.26410  86.51517
##  [477] 111.21199 102.06889 108.11536 111.71046  90.73616  96.98806  94.73414
##  [484]  85.68390 106.97716  99.40113  90.31427  90.10186  94.99933 103.86845
##  [491]  90.52632  87.20375  78.43389 133.77980 106.61622 100.94964 105.36220
##  [498] 118.87780 103.81300 105.15170  90.96704  98.24581  98.27145  92.95346
##  [505]  99.10117 113.16065 100.42873 103.75855  90.17507  91.73830  93.55819
##  [512]  88.82999  95.17483  99.37052 100.73906 106.43899 109.40346 108.59328
##  [519]  87.55636  90.39187 108.05849  94.36629 109.51180 111.50090  94.13969
##  [526] 101.34821  91.16230  86.61341  90.92794  94.15869  88.68012  92.50711
##  [533]  98.26326  88.07729  96.31024 102.34229 103.19372  95.00675 107.99455
##  [540] 100.90313 106.30413 116.60951 102.18899  92.91114 110.36208  97.54739
##  [547] 100.07777  99.21793 101.10966 101.95566 113.67941  96.58718  96.78500
##  [554] 113.97514  97.80593 107.41107  89.62241 106.92618  97.60506 107.49522
##  [561] 113.38834 108.48806 108.27687  99.71753 109.83479  96.98818 105.57525
##  [568]  98.08588 110.66906  99.75421  94.19423 104.45455 105.36141 115.51086
##  [575]  94.41164  98.46013 108.48499  96.29530 103.23271 100.89120  87.30696
##  [582]  99.21887  98.80086  98.56372  96.14122 101.96920 102.51118  95.06496
##  [589]  95.38967 100.32601  94.65548  94.56099 101.85894  92.94141  93.93796
##  [596]  96.73081 114.47261 120.28136  85.38826 104.21116 112.18931  95.60098
##  [603] 101.48253 111.93263  90.30425  95.47443  97.42523  96.72229 101.19225
##  [610]  93.86689 100.37476  99.40089 102.49396  93.34206 101.08623  99.34340
##  [617] 100.53092 104.64393  96.55221  98.55754  94.33624  95.41766  96.45007
##  [624] 100.97140 102.05925  98.34855  96.95396  97.39041 102.81622 108.01099
##  [631]  96.76089  97.52009 101.16945 102.70544  99.05115 101.26496  96.09579
##  [638] 102.98429 100.91562 109.95788 102.92963 107.95557  90.19991 103.14442
##  [645]  94.59586  93.01039 113.07606 102.98218 111.36092  88.89660 102.06650
##  [652] 101.95314  89.00400  94.06921  94.00870 103.61869  98.20835 100.38653
##  [659]  91.31076 107.86976  88.45546  96.30892 104.66421 109.65591  97.73709
##  [666]  87.76384  87.55889  91.53958  99.76902 105.43556 102.86810 104.21399
##  [673] 104.37972 119.50351 100.88147 101.88751 103.30862  98.74279 102.17412
##  [680]  92.71833  99.94207 114.92651  90.70160 107.23152 104.67736 104.50080
##  [687]  96.56993 106.65094 102.24396  97.60543 109.72805  85.84588 110.55600
##  [694]  93.06968  90.00112  94.64601 104.37362  99.21842 101.85735  97.18406
##  [701] 111.87924 104.44849 110.97496 104.86912  95.75309  95.93884 103.04620
##  [708]  99.33497  96.59530  99.73511 107.46138 100.61708  97.88273  95.50523
##  [715]  87.77159 104.13410  99.63154 106.38232 109.63639 110.71812  94.75407
##  [722]  99.86386 106.05882  96.87083 100.49929  91.02235 108.06704 103.24560
##  [729] 100.41973 117.55787 103.96393 102.55973 109.40553  92.65406  90.60400
##  [736] 111.77905  96.41004 105.89494  96.64554  97.96821  95.62798  99.97740
##  [743]  93.25463 106.11694 106.20888  99.94326 108.88524  94.83751  98.46617
##  [750]  98.86928  93.15523  96.80631  96.99994 104.92613  88.06606 100.69113
##  [757] 103.67759 107.04164 109.28416  96.67825 104.11273  96.51885 102.14856
##  [764]  92.41350 102.40101 105.21326  94.04193  99.51141  99.63524  89.93903
##  [771] 106.53168 104.20317  96.01643 105.47308  87.05425 101.24503  97.99663
##  [778] 116.11448  99.21353 103.56187  91.39448  90.82117 106.27575 102.44303
##  [785]  98.55597 104.03033 103.58680 112.47971  95.46912 100.63739 109.78517
##  [792]  98.06233 106.63450  96.91841 101.64978  94.43952  99.54850 112.25846
##  [799] 116.36145 105.68101  90.97939  97.04789 103.58571 106.84938  92.32598
##  [806] 114.12723  88.42927 103.07770  94.35318  98.60106 104.12290  97.45895
##  [813] 102.35901 107.55078  91.22495  97.04382 103.89320  99.58226  96.19264
##  [820] 100.68243 104.00591  97.13521 112.55557 106.75806  96.40293  99.41698
##  [827]  98.01018  94.74184  98.76394 101.90416 112.03634  96.84842 104.71694
##  [834] 105.84669  91.90252 101.39648  99.30915  98.91371 106.44678 104.35390
##  [841]  91.23609 100.18974  86.78126 108.14106 102.45552  92.80435  95.12587
##  [848]  93.10227 114.85561  97.94076 111.25381  93.99849  94.28991  97.56167
##  [855]  97.75604 105.77706 104.95348  98.28365  92.72630  98.18982 110.95426
##  [862]  96.99250 114.40404  93.40090 102.40760  92.50450  90.00737  98.47570
##  [869] 101.86578  94.21597 103.29703  98.53546 109.22440 101.85882  93.06324
##  [876] 107.03880  94.15219  89.72064  97.31188 101.87961  98.86753  88.25822
##  [883] 110.72760  89.06658  91.19482 104.47034 103.63310 103.39327  99.96103
##  [890] 117.70096 103.17621 107.75024  92.47837  82.59295  96.15691 104.58926
##  [897]  89.52102  94.95797  95.72622 103.53877  88.42763 105.78718  92.50102
##  [904] 101.59008 109.80696 100.06104  96.03987 118.42744 102.62448  92.80573
##  [911] 109.85262  97.84872  96.42030 101.06734  92.63287  95.13725 103.43845
##  [918] 102.40939  86.71611 103.18285  97.32117  89.00621  99.92361  84.75884
##  [925] 109.42252  95.38031  97.83011  94.87282 101.19582 101.64716 107.41373
##  [932] 102.67419  96.95695 114.85442  99.24418  98.60267  98.57387 105.30228
##  [939] 109.58348  96.96969  85.76928 102.15183  92.20098  91.72992  99.58971
##  [946] 102.95779  98.16898 103.21473 104.31208 109.67143 101.08423  92.71915
##  [953] 109.11797  96.86271  85.46902  86.37256  90.74637 102.61629 109.49660
##  [960] 100.08103  83.87615 102.71285  99.43053 100.59307  97.92809  93.07519
##  [967]  94.80546  92.72045  96.05231  90.03669 102.99090  86.50909  92.75490
##  [974]  95.81167 106.72515  86.96840 108.12330 109.75482 103.15841 116.57593
##  [981] 113.71882  89.78510  98.04023  98.10058 107.01660  96.33334 103.96807
##  [988] 101.18980 101.82327 111.66546 105.59277 109.07567 105.75462  95.43854
##  [995] 100.81567 109.70459 106.14627 103.23544 107.06951 106.11316
mean2 <- mean(Gamma_mean)
mean2
## [1] 100.1183
var2 <- var(Gamma_mean)
var2
## [1] 51.2466
mean_diff = mean2 - mean1
round(mean_diff, 3)
## [1] 0.167
var_diff = var2 - var1
round(var_diff, 3)
## [1] 5.037

As the difference of the mean and variance from part(ii)(b) and part(ii)(a) are POSITIVE, which are 0.167 and 5.037 respectively, hence it can be concluded that both the mean and variance from part(ii)(b) are higher compared to part(ii)(a).

(iii) Comment on your findings in parts (i) and (ii).

The difference calculated of the Monte Carlo estimates for the mean and variance and those from Gamma (alpha + x, beta + 1) sample demonstrates that the posterior Distribution for the Poisson/Gamma credibility model is not the Gamma (alpha + x, theta + 1).

This may be due to the fact that the Gamma (alpha + x, beta + 1) sample has parameters that are different and where its alpha value is now dependent on the annual claim numbers model, X. On the other hand, its beta value is now greater by 1.

Hence, the difference in properties and values of the respective parameters are the main reason that lead to the difference in the model and the value of parameters.

Question 3

(i) Explain what error structure could be used in this GLM, including in your answer the R code used to justify your choice.

library(readxl)
datatrain <- read.csv("C:/Users/User/Documents/BAS/SEM 7 (AUG 2020)/MAT 3024 Regression Analysis/CBT/IFOA 2019 (CSB1)/CS1B_datatrain.csv", header = T)   

head(datatrain)
##   ID Age Car.Group                     Area NCD Gender Claim.number
## 1  1  89         7                   London   1 Female            0
## 2  2  22         6               South East   1 Female            0
## 3  3  52         3               South West   2   Male            0
## 4  4  90        11               North East   2 Female            0
## 5  5  89         5 Yorkshire and the Humber   3 Female            0
## 6  6  57        11                   London   2 Female            0
tail(datatrain)
##        ID Age Car.Group          Area NCD Gender Claim.number
## 7995 7995  89        20    South West   0   Male            1
## 7996 7996  36         2            NI   5 Female            0
## 7997 7997  96        19          East   5   Male            0
## 7998 7998  58         7 East Midlands   4   Male            0
## 7999 7999  42        11 East Midlands   4   Male            0
## 8000 8000  52         4          East   0 Female            1
str(datatrain)   
## 'data.frame':    8000 obs. of  7 variables:
##  $ ID          : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Age         : int  89 22 52 90 89 57 98 58 82 40 ...
##  $ Car.Group   : int  7 6 3 11 5 11 10 17 20 8 ...
##  $ Area        : chr  "London" "South East" "South West" "North East" ...
##  $ NCD         : int  1 1 2 2 3 2 2 5 0 3 ...
##  $ Gender      : chr  "Female" "Female" "Male" "Female" ...
##  $ Claim.number: int  0 0 0 0 0 0 0 0 0 0 ...
summary(datatrain)
##        ID            Age           Car.Group         Area          
##  Min.   :   1   Min.   : 18.00   Min.   : 1.00   Length:8000       
##  1st Qu.:2001   1st Qu.: 39.00   1st Qu.: 6.00   Class :character  
##  Median :4000   Median : 59.00   Median :11.00   Mode  :character  
##  Mean   :4000   Mean   : 59.15   Mean   :10.57                     
##  3rd Qu.:6000   3rd Qu.: 80.00   3rd Qu.:16.00                     
##  Max.   :8000   Max.   :100.00   Max.   :20.00                     
##       NCD           Gender           Claim.number   
##  Min.   :0.000   Length:8000        Min.   :0.0000  
##  1st Qu.:1.000   Class :character   1st Qu.:0.0000  
##  Median :3.000   Mode  :character   Median :0.0000  
##  Mean   :2.543                      Mean   :0.1459  
##  3rd Qu.:4.000                      3rd Qu.:0.0000  
##  Max.   :5.000                      Max.   :3.0000

All values are positive integer with some values more than 1, so use Poisson distribution as error structure.

(ii) Fit a GLM that treats Age as linear factor and all other four factors as categorical variables. Your answer should show the coefficient, standard error, and p-value of each parameter estimate in the model.

# datatrain$car_group <- as.factor(datatrain$Car.Group)
# str(datatrain)
model1 <- glm(Claim.number ~ Age + factor(Car.Group) + factor(Area) + factor(NCD) + factor(Gender), data = datatrain, family = poisson())

summary(model1)
## 
## Call:
## glm(formula = Claim.number ~ Age + factor(Car.Group) + factor(Area) + 
##     factor(NCD) + factor(Gender), family = poisson(), data = datatrain)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.3203  -0.5624  -0.4445  -0.3185   3.4254  
## 
## Coefficients:
##                                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                          -1.895413   0.245584  -7.718 1.18e-14 ***
## Age                                  -0.008371   0.001231  -6.802 1.03e-11 ***
## factor(Car.Group)2                    0.283162   0.285122   0.993 0.320648    
## factor(Car.Group)3                    0.311122   0.282852   1.100 0.271356    
## factor(Car.Group)4                    0.116037   0.292479   0.397 0.691563    
## factor(Car.Group)5                    0.578672   0.265600   2.179 0.029352 *  
## factor(Car.Group)6                    0.912683   0.251178   3.634 0.000279 ***
## factor(Car.Group)7                    0.260364   0.287397   0.906 0.364968    
## factor(Car.Group)8                    0.914236   0.253831   3.602 0.000316 ***
## factor(Car.Group)9                    0.877120   0.252408   3.475 0.000511 ***
## factor(Car.Group)10                   0.914426   0.250011   3.658 0.000255 ***
## factor(Car.Group)11                   0.799044   0.250434   3.191 0.001420 ** 
## factor(Car.Group)12                   1.025303   0.245168   4.182 2.89e-05 ***
## factor(Car.Group)13                   1.011678   0.248305   4.074 4.61e-05 ***
## factor(Car.Group)14                   1.118560   0.242695   4.609 4.05e-06 ***
## factor(Car.Group)15                   1.103179   0.244711   4.508 6.54e-06 ***
## factor(Car.Group)16                   0.996932   0.247455   4.029 5.61e-05 ***
## factor(Car.Group)17                   1.128584   0.242389   4.656 3.22e-06 ***
## factor(Car.Group)18                   1.198728   0.239721   5.001 5.72e-07 ***
## factor(Car.Group)19                   1.422781   0.238307   5.970 2.37e-09 ***
## factor(Car.Group)20                   1.317913   0.238579   5.524 3.31e-08 ***
## factor(Area)East Midlands             0.146664   0.132611   1.106 0.268739    
## factor(Area)London                    0.318305   0.126773   2.511 0.012045 *  
## factor(Area)NI                        0.393303   0.125065   3.145 0.001662 ** 
## factor(Area)North East               -0.060812   0.138426  -0.439 0.660439    
## factor(Area)North West               -0.193799   0.143745  -1.348 0.177590    
## factor(Area)South East               -0.323157   0.151830  -2.128 0.033303 *  
## factor(Area)South West               -0.097663   0.142546  -0.685 0.493256    
## factor(Area)Wales                    -0.309704   0.148283  -2.089 0.036744 *  
## factor(Area)West Midlands            -0.068206   0.141474  -0.482 0.629730    
## factor(Area)Yorkshire and the Humber  0.100272   0.132276   0.758 0.448421    
## factor(NCD)1                         -0.456078   0.086523  -5.271 1.36e-07 ***
## factor(NCD)2                         -0.679426   0.091303  -7.441 9.96e-14 ***
## factor(NCD)3                         -0.885118   0.096849  -9.139  < 2e-16 ***
## factor(NCD)4                         -0.963244   0.101502  -9.490  < 2e-16 ***
## factor(NCD)5                         -1.097532   0.104299 -10.523  < 2e-16 ***
## factor(Gender)Male                    0.259982   0.064879   4.007 6.14e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4569.9  on 7999  degrees of freedom
## Residual deviance: 4085.6  on 7963  degrees of freedom
## AIC: 6455.7
## 
## Number of Fisher Scoring iterations: 6

(iii) Describe the association between gender and the number of reported claims based on your output from part (ii). Your answer should include a numerical comparison of this association between male and female policyholders.

Beta <- coef(model1)["factor(Gender)Male"]
Beta
## factor(Gender)Male 
##          0.2599823

The coefficient for male is 0.2599823 while the coefficient for female is 0. This means that an increment of 1 unit in the regressor for male will cause an expected increase of 1 unit in the systematic component of the generalised linear model.

On the other hand, the zero coefficient for female indicates that the category of female is used as the baseline to obtain the significant positive results for the factor(sex)1 which is the category of male.

exp(Beta) - exp(0)
## factor(Gender)Male 
##          0.2969072

Male policyholders have higher mean of reported claaims of (29.69%) than female policyholders. The difference is significant at 5% significance level as the p-value = 6.14e-5 < 0.05 = alpha.

(iv) Comment on the fit of the model fitted in part (ii), based on the deviance value of the model, with reference to the suitability of the model.

null_deviance <- summary(model1)$null.deviance
null_deviance
## [1] 4569.915
residual_deviance <- deviance(model1)
residual_deviance
## [1] 4085.558
null_degree <- summary(model1)$df[2] + summary(model1)$df[1] - 1
null_degree
## [1] 7999
resid_degree <- summary(model1)$df[2]
resid_degree
## [1] 7963
deviance_diff <- null_deviance - residual_deviance
deviance_diff
## [1] 484.3565
df_diff <- null_degree - resid_degree
df_diff
## [1] 36
qchisq(0.95, 36)
## [1] 50.99846

After comparison is made, it is shown that the deviance is reduced by 484.36 while the degrees of freedom is reduced by 36.

As the observed difference = 484.3565 > 50.99846 = chi-square statistic with alpha = 0.05 and df = 36, the null hypothesis is rejected. Hence, the fitted model is significant at 5% significance level.

(v)(a) Add a variable representing the power of age squared to the data.

datatrain$Age_square= datatrain$Age^2

(v)(b) Fit an appropriate model including age squared as an explanatory variable in addition to the variables used in part (ii).

model2 <- glm(Claim.number ~ Age + Age_square + factor(Car.Group) + factor(Area) + factor(NCD) + factor(Gender), data = datatrain, family = poisson())

summary(model2)
## 
## Call:
## glm(formula = Claim.number ~ Age + Age_square + factor(Car.Group) + 
##     factor(Area) + factor(NCD) + factor(Gender), family = poisson(), 
##     data = datatrain)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.2760  -0.5532  -0.4261  -0.3063   3.2856  
## 
## Coefficients:
##                                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                          -4.118e-01  2.816e-01  -1.462 0.143613    
## Age                                  -7.268e-02  6.329e-03 -11.484  < 2e-16 ***
## Age_square                            5.603e-04  5.405e-05  10.366  < 2e-16 ***
## factor(Car.Group)2                    2.913e-01  2.852e-01   1.022 0.307011    
## factor(Car.Group)3                    2.854e-01  2.829e-01   1.009 0.313049    
## factor(Car.Group)4                    1.361e-01  2.925e-01   0.465 0.641617    
## factor(Car.Group)5                    5.608e-01  2.656e-01   2.111 0.034750 *  
## factor(Car.Group)6                    8.935e-01  2.512e-01   3.557 0.000376 ***
## factor(Car.Group)7                    2.729e-01  2.875e-01   0.949 0.342393    
## factor(Car.Group)8                    9.135e-01  2.539e-01   3.598 0.000320 ***
## factor(Car.Group)9                    8.766e-01  2.524e-01   3.473 0.000515 ***
## factor(Car.Group)10                   9.127e-01  2.501e-01   3.650 0.000263 ***
## factor(Car.Group)11                   8.012e-01  2.505e-01   3.198 0.001383 ** 
## factor(Car.Group)12                   1.039e+00  2.452e-01   4.239 2.25e-05 ***
## factor(Car.Group)13                   9.643e-01  2.483e-01   3.883 0.000103 ***
## factor(Car.Group)14                   1.109e+00  2.427e-01   4.568 4.93e-06 ***
## factor(Car.Group)15                   1.111e+00  2.445e-01   4.543 5.54e-06 ***
## factor(Car.Group)16                   9.760e-01  2.474e-01   3.945 7.99e-05 ***
## factor(Car.Group)17                   1.131e+00  2.424e-01   4.667 3.06e-06 ***
## factor(Car.Group)18                   1.188e+00  2.397e-01   4.957 7.14e-07 ***
## factor(Car.Group)19                   1.420e+00  2.383e-01   5.962 2.49e-09 ***
## factor(Car.Group)20                   1.322e+00  2.386e-01   5.540 3.03e-08 ***
## factor(Area)East Midlands             9.654e-02  1.327e-01   0.727 0.466981    
## factor(Area)London                    3.190e-01  1.268e-01   2.516 0.011863 *  
## factor(Area)NI                        3.837e-01  1.251e-01   3.067 0.002161 ** 
## factor(Area)North East               -6.318e-02  1.385e-01  -0.456 0.648255    
## factor(Area)North West               -2.015e-01  1.438e-01  -1.401 0.161092    
## factor(Area)South East               -3.268e-01  1.519e-01  -2.151 0.031438 *  
## factor(Area)South West               -1.148e-01  1.426e-01  -0.805 0.420806    
## factor(Area)Wales                    -3.166e-01  1.484e-01  -2.134 0.032820 *  
## factor(Area)West Midlands            -8.999e-02  1.415e-01  -0.636 0.524886    
## factor(Area)Yorkshire and the Humber  1.060e-01  1.323e-01   0.801 0.423004    
## factor(NCD)1                         -4.530e-01  8.657e-02  -5.233 1.67e-07 ***
## factor(NCD)2                         -6.705e-01  9.130e-02  -7.344 2.07e-13 ***
## factor(NCD)3                         -8.753e-01  9.692e-02  -9.031  < 2e-16 ***
## factor(NCD)4                         -9.487e-01  1.016e-01  -9.342  < 2e-16 ***
## factor(NCD)5                         -1.089e+00  1.043e-01 -10.439  < 2e-16 ***
## factor(Gender)Male                    2.657e-01  6.496e-02   4.091 4.30e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 4569.9  on 7999  degrees of freedom
## Residual deviance: 3981.3  on 7962  degrees of freedom
## AIC: 6353.4
## 
## Number of Fisher Scoring iterations: 6
null_deviance2 <- summary(model2)$null.deviance
null_deviance2
## [1] 4569.915
residual_deviance2 <- deviance(model2)
residual_deviance2
## [1] 3981.337
null_degree2 <- summary(model2)$df[2] + summary(model1)$df[1] - 1
null_degree2
## [1] 7998
resid_degree2 <- summary(model2)$df[2]
resid_degree2
## [1] 7962
deviance_diff2 <- null_deviance2 - residual_deviance2
deviance_diff2
## [1] 588.5777
df_diff2 <- null_degree2 - resid_degree2
df_diff2
## [1] 36
deviance_diff2 > deviance_diff + 2*df_diff2
## [1] TRUE

(v)(c) Comment on whether age squared is associated with the number of reported claims and on whether its inclusion improves the model fitted in part (ii), based on your output from part (v)(b).

The age squared coefficient has a p-value of lesser than 2e-16 < 0.05 = alpha, hence it shows that it is significant.

Also, it is proven that the deviance is reduced more than twice the change in degrees of freedom. Hence, the variable is significantly associated with the number of reported claims.