Question 1
(i) Construct a two-sided 95% confidence interval for the mean difference in weights. Assume the differences in weights to be normally distributed.
weight_before <- c(58.5, 60.3, 61.7, 69.0, 64.0, 62.6, 56.7)
weight_after <- c(60.0, 54.9, 58.1, 62.1, 58.5, 59.9, 54.4)
weight_diff <- weight_before - weight_after
weight_diff_size <- length(weight_diff)
weight_diff_mean <- mean(weight_diff)
weight_diff_stdv <- sd(weight_diff)
degree_of_freedom <- weight_diff_size - 1
t_score <- qt(0.975, degree_of_freedom)
half_width <- t_score*weight_diff_stdv/sqrt(weight_diff_size)
confidence_interval <- c(weight_diff_mean - half_width, weight_diff_mean + half_width)
confidence_interval
## [1] 0.9897686 6.1245171
Conclusion: The two-sided 95% confidence interval for the mean difference in weights (before minus after) is [0.9897686, 6.1245171].
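As a cross-check (an added line, not part of the original computation), R's built-in paired t-test should reproduce the same interval:
t.test(weight_before, weight_after, paired = TRUE, conf.level = 0.95)$conf.int
# should agree with the interval [0.9897686, 6.1245171] computed manually above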
Question 2
(i)(a) Repeat the Monte Carlo simulation of the credibility analysis with 1000 repetitions. Use the credibility factor Z = 1/alpha to estimate the distribution of the posterior mean of the parameter lambda.
library(invgamma)
## Warning: package 'invgamma' was built under R version 4.0.3
set.seed(508)
simulation = 1000
alpha = 3
theta = 10
Z = 1/alpha
posterior_mean <- rep(0, simulation)
X <- rep(0, simulation)
for(i in 1:simulation){
  # Draw lambda from its inverse gamma prior, then one observation x given lambda
  lamda <- rinvgamma(1, shape = alpha, rate = theta)
  x <- rexp(1, 1/lamda)
  X[i] <- x
  # Credibility estimate of the posterior mean; with Z = 1/alpha this equals (theta + x)/alpha
  posterior_mean[i] <- Z*x + (1-Z)*theta/(alpha-1)
}
head(posterior_mean)
## [1] 10.787679 4.004492 7.352629 3.938469 5.174283 5.773552
(i)(b) Plot the histogram of the 1000 Monte Carlo posterior mean estimates calculated in part (i)(a).
hist(posterior_mean, main = "Histogram of Posterior Mean", xlab = "Estimated posterior mean", ylab = "frequency")

(ii)(a) Calculate the mean and variance of the Monte Carlo posterior mean estimates from part (i). Give your answers to three decimal places.
mean1 <- round(mean(posterior_mean),3)
mean1
## [1] 4.954
var1 <- round(var(posterior_mean),3)
var1
## [1] 7.577
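As a sanity check (an added remark, not part of the original answer), the theoretical mean of the credibility estimator equals the prior mean of lambda, theta/(alpha - 1), because Z*E[X] + (1 - Z)*theta/(alpha - 1) = theta/(alpha - 1); the simulated value of 4.954 is close to it:
theta/(alpha - 1)
## [1] 5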
(ii)(b) Draw another 1000 samples, each of size 1000, from an inverse gamma distribution with parameters (alpha + x) and (theta + 1), and compute the mean of each sample. Give the mean and variance of these sample means to three decimal places.
set.seed(508)
InvGamma_mean <- rep(0,simulation)
for(i in 1:simulation){
  # Draw lambda and an observation x exactly as in part (i)(a)
  lamda_2 <- rinvgamma(1, shape = alpha, rate = theta)
  x <- rexp(1, 1/lamda_2)
  # Draw a sample of size 1000 from the inverse gamma with parameters (alpha + x) and (theta + 1)
  y <- rinvgamma(1000, shape = alpha + x, rate = theta + 1)
  InvGamma_mean[i] <- mean(y)
}
head(InvGamma_mean)
## [1] 0.4528455 1.1154490 2.7660686 3.5108914 5.5036172 4.4175952
mean2 <- round(mean(InvGamma_mean),3)
mean2
## [1] 2.55
var2 <- round(var(InvGamma_mean),3)
var2
## [1] 2.101
(ii)(c) Compare the answers obtained in part (ii)(a) and part (ii)(b).
mean_diff = mean2 - mean1
mean_diff
## [1] -2.404
var_diff = var2 - var1
var_diff
## [1] -5.476
Both differences are negative (mean: -2.404; variance: -5.476), so the mean and the variance of the Monte Carlo posterior-mean estimates from part (ii)(a) are both higher than those of the inverse gamma sample means from part (ii)(b).
Question 3
(i) Explain what error structure could be used in building this generalized linear model (GLM), including in your answer the R code used to justify your choice.
library(readxl)
HD_clean_data <- read.csv("C:/Users/User/Documents/BAS/SEM 7 (AUG 2020)/MAT 3024 Regression Analysis/CBT/IFOA April 2020 (CSB1)/HD_clean_data.csv",header = T)
head(HD_clean_data)
## Age Resting_Blood_Pressure Serum_Cholesterol Max_Heart_Rate_Achieved
## 1 63 145 233 150
## 2 67 160 286 108
## 3 67 120 229 129
## 4 37 130 250 187
## 5 41 130 204 172
## 6 56 120 236 178
## ST_Depression_Exercise Num_Major_Vessels_Flouro Sex Chest_Pain_Type
## 1 2.3 3 1 1
## 2 1.5 6 1 4
## 3 2.6 5 1 4
## 4 3.5 3 1 3
## 5 1.4 3 0 2
## 6 0.8 3 1 2
## Fasting_Blood_Sugar Resting_ECG Exercise_Induced_Angina
## 1 1 2 0
## 2 0 2 1
## 3 0 2 1
## 4 0 0 0
## 5 0 2 0
## 6 0 0 0
## Peak_Exercise_ST_Segment Thalassemia Diagnosis_Heart_Disease
## 1 3 6 0
## 2 2 3 1
## 3 2 7 1
## 4 3 3 0
## 5 1 3 0
## 6 1 3 0
tail(HD_clean_data)
## Age Resting_Blood_Pressure Serum_Cholesterol Max_Heart_Rate_Achieved
## 296 57 140 241 123
## 297 45 110 264 132
## 298 68 144 193 141
## 299 57 130 131 115
## 300 57 130 236 174
## 301 38 138 175 173
## ST_Depression_Exercise Num_Major_Vessels_Flouro Sex Chest_Pain_Type
## 296 0.2 3 0 4
## 297 1.2 3 1 1
## 298 3.4 5 1 4
## 299 1.2 4 1 4
## 300 0.0 4 0 2
## 301 0.0 2 1 3
## Fasting_Blood_Sugar Resting_ECG Exercise_Induced_Angina
## 296 0 0 1
## 297 0 0 0
## 298 1 0 0
## 299 0 0 1
## 300 0 2 0
## 301 0 0 0
## Peak_Exercise_ST_Segment Thalassemia Diagnosis_Heart_Disease
## 296 2 7 1
## 297 2 7 1
## 298 2 7 1
## 299 2 7 1
## 300 2 3 1
## 301 1 3 0
str(HD_clean_data)
## 'data.frame': 301 obs. of 14 variables:
## $ Age : int 63 67 67 37 41 56 62 57 63 53 ...
## $ Resting_Blood_Pressure : int 145 160 120 130 130 120 140 120 130 140 ...
## $ Serum_Cholesterol : int 233 286 229 250 204 236 268 354 254 203 ...
## $ Max_Heart_Rate_Achieved : int 150 108 129 187 172 178 160 163 147 155 ...
## $ ST_Depression_Exercise : num 2.3 1.5 2.6 3.5 1.4 0.8 3.6 0.6 1.4 3.1 ...
## $ Num_Major_Vessels_Flouro: int 3 6 5 3 3 3 5 3 4 3 ...
## $ Sex : int 1 1 1 1 0 1 0 0 1 1 ...
## $ Chest_Pain_Type : int 1 4 4 3 2 2 4 4 4 4 ...
## $ Fasting_Blood_Sugar : int 1 0 0 0 0 0 0 0 0 1 ...
## $ Resting_ECG : int 2 2 2 0 2 0 2 0 2 2 ...
## $ Exercise_Induced_Angina : int 0 1 1 0 0 0 0 1 0 1 ...
## $ Peak_Exercise_ST_Segment: int 3 2 2 3 1 1 3 1 2 3 ...
## $ Thalassemia : int 6 3 7 3 3 3 3 3 7 7 ...
## $ Diagnosis_Heart_Disease : int 0 1 1 0 0 0 1 0 1 1 ...
summary(HD_clean_data)
## Age Resting_Blood_Pressure Serum_Cholesterol
## Min. :29.00 Min. : 94.0 Min. :126.0
## 1st Qu.:48.00 1st Qu.:120.0 1st Qu.:211.0
## Median :56.00 Median :130.0 Median :242.0
## Mean :54.45 Mean :131.7 Mean :246.9
## 3rd Qu.:61.00 3rd Qu.:140.0 3rd Qu.:275.0
## Max. :77.00 Max. :200.0 Max. :564.0
## Max_Heart_Rate_Achieved ST_Depression_Exercise Num_Major_Vessels_Flouro
## Min. : 71.0 Min. :0.000 Min. :2.000
## 1st Qu.:134.0 1st Qu.:0.000 1st Qu.:3.000
## Median :153.0 Median :0.800 Median :3.000
## Mean :149.7 Mean :1.043 Mean :3.654
## 3rd Qu.:166.0 3rd Qu.:1.600 3rd Qu.:4.000
## Max. :202.0 Max. :6.200 Max. :6.000
## Sex Chest_Pain_Type Fasting_Blood_Sugar Resting_ECG
## Min. :0.0000 Min. :1.000 Min. :0.0000 Min. :0.00
## 1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:0.0000 1st Qu.:0.00
## Median :1.0000 Median :3.000 Median :0.0000 Median :1.00
## Mean :0.6811 Mean :3.156 Mean :0.1462 Mean :0.99
## 3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:0.0000 3rd Qu.:2.00
## Max. :1.0000 Max. :4.000 Max. :1.0000 Max. :2.00
## Exercise_Induced_Angina Peak_Exercise_ST_Segment Thalassemia
## Min. :0.0000 Min. :1.000 Min. :3.000
## 1st Qu.:0.0000 1st Qu.:1.000 1st Qu.:3.000
## Median :0.0000 Median :2.000 Median :3.000
## Mean :0.3256 Mean :1.601 Mean :4.734
## 3rd Qu.:1.0000 3rd Qu.:2.000 3rd Qu.:7.000
## Max. :1.0000 Max. :3.000 Max. :7.000
## Diagnosis_Heart_Disease
## Min. :0.0000
## 1st Qu.:0.0000
## Median :0.0000
## Mean :0.4585
## 3rd Qu.:1.0000
## Max. :1.0000
From the structure and summary of the dataset, the response variable Diagnosis_Heart_Disease takes only the values 0 and 1. Hence we use a binomial error structure (with the canonical logit link) in building this generalized linear model.
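A quick way to confirm this (a small added check, not part of the original output) is to tabulate the response variable directly:
table(HD_clean_data$Diagnosis_Heart_Disease)
# should list counts for the two levels 0 and 1 only, confirming a binary response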
(ii) Fit a GLM that treats Age, Resting Blood Pressure, Serum Cholesterol, Maximum Heart Rate Achieved, ST Depression Exercise, and Number of Major Vessels as linear factors, and the other seven factors as categorical variables. Your answer should show the coefficient, standard error, and p-value of each parameter estimate in the model.
model <- glm(Diagnosis_Heart_Disease ~ Age + Resting_Blood_Pressure + Serum_Cholesterol + Max_Heart_Rate_Achieved + ST_Depression_Exercise + Num_Major_Vessels_Flouro + factor(Sex) + factor(Chest_Pain_Type) + factor(Fasting_Blood_Sugar) + factor(Resting_ECG) + factor(Exercise_Induced_Angina) + factor(Peak_Exercise_ST_Segment) + factor(Thalassemia), data = HD_clean_data, family = binomial())
summary(model)
##
## Call:
## glm(formula = Diagnosis_Heart_Disease ~ Age + Resting_Blood_Pressure +
## Serum_Cholesterol + Max_Heart_Rate_Achieved + ST_Depression_Exercise +
## Num_Major_Vessels_Flouro + factor(Sex) + factor(Chest_Pain_Type) +
## factor(Fasting_Blood_Sugar) + factor(Resting_ECG) + factor(Exercise_Induced_Angina) +
## factor(Peak_Exercise_ST_Segment) + factor(Thalassemia), family = binomial(),
## data = HD_clean_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7657 -0.5015 -0.1386 0.3134 2.7962
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -9.836277 3.089155 -3.184 0.001452 **
## Age -0.016429 0.024719 -0.665 0.506298
## Resting_Blood_Pressure 0.024578 0.011286 2.178 0.029426 *
## Serum_Cholesterol 0.004556 0.004007 1.137 0.255492
## Max_Heart_Rate_Achieved -0.018073 0.011134 -1.623 0.104548
## ST_Depression_Exercise 0.349926 0.230502 1.518 0.128988
## Num_Major_Vessels_Flouro 1.305828 0.275083 4.747 2.06e-06 ***
## factor(Sex)1 1.545454 0.532301 2.903 0.003692 **
## factor(Chest_Pain_Type)2 1.204088 0.768206 1.567 0.117020
## factor(Chest_Pain_Type)3 0.241776 0.662567 0.365 0.715180
## factor(Chest_Pain_Type)4 2.114648 0.666969 3.171 0.001522 **
## factor(Fasting_Blood_Sugar)1 -0.493078 0.588486 -0.838 0.402101
## factor(Resting_ECG)1 0.851656 2.488853 0.342 0.732209
## factor(Resting_ECG)2 0.510913 0.382580 1.335 0.181731
## factor(Exercise_Induced_Angina)1 0.738811 0.439467 1.681 0.092733 .
## factor(Peak_Exercise_ST_Segment)2 1.152364 0.472645 2.438 0.014764 *
## factor(Peak_Exercise_ST_Segment)3 0.519801 0.924706 0.562 0.574030
## factor(Thalassemia)6 -0.029854 0.790598 -0.038 0.969878
## factor(Thalassemia)7 1.399100 0.424551 3.295 0.000983 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 415.20 on 300 degrees of freedom
## Residual deviance: 192.95 on 282 degrees of freedom
## AIC: 230.95
##
## Number of Fisher Scoring iterations: 6
(iii) Interpret the coefficient for sex. Your answer should include a numerical comparison of the odds of getting heart disease between male and female patients.
Beta <- coef(model)["factor(Sex)1"]
Beta
## factor(Sex)1
## 1.545454
The estimated coefficient for males (factor(Sex)1) is 1.545454, while females form the baseline category with a coefficient of 0. Being male therefore increases the linear predictor, i.e. the log-odds of heart disease, by 1.545454 relative to a female patient with the same values of all other covariates.
Because female is the reference level, the test on factor(Sex)1 measures the male effect relative to this baseline.
exp(Beta)/(1+exp(Beta)) - exp(0)/(1+exp(0))
## factor(Sex)1
## 0.3242562
The value 0.3242562 is the difference in fitted probabilities of heart disease between a male and a female patient when all other terms in the linear predictor are zero. The sex effect is significant at the 5% level, since its p-value of 0.003692 is less than 0.05. The odds comparison requested by the question is given by the odds ratio exp(Beta), computed below.
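A direct computation of the odds ratio (an added line; the printed value may differ slightly in the last decimals):
exp(Beta)
# approximately exp(1.545454) = 4.69, so the odds of heart disease for a male patient
# are roughly 4.7 times those for an otherwise identical female patient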
(v) Find the reduced model by using the backward elimination method based on the p-value criterion.
# step() carries out backward elimination using AIC as the selection criterion at each stage
model_backward_fit = step(model, direction = "backward", trace = 0)
formula(model_backward_fit)
## Diagnosis_Heart_Disease ~ Resting_Blood_Pressure + Max_Heart_Rate_Achieved +
## ST_Depression_Exercise + Num_Major_Vessels_Flouro + factor(Sex) +
## factor(Chest_Pain_Type) + factor(Exercise_Induced_Angina) +
## factor(Peak_Exercise_ST_Segment) + factor(Thalassemia)
summary(model_backward_fit)
##
## Call:
## glm(formula = Diagnosis_Heart_Disease ~ Resting_Blood_Pressure +
## Max_Heart_Rate_Achieved + ST_Depression_Exercise + Num_Major_Vessels_Flouro +
## factor(Sex) + factor(Chest_Pain_Type) + factor(Exercise_Induced_Angina) +
## factor(Peak_Exercise_ST_Segment) + factor(Thalassemia), family = binomial(),
## data = HD_clean_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7896 -0.4929 -0.1479 0.3404 2.7824
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -9.48926 2.62745 -3.612 0.000304 ***
## Resting_Blood_Pressure 0.02315 0.01034 2.239 0.025157 *
## Max_Heart_Rate_Achieved -0.01464 0.01013 -1.446 0.148205
## ST_Depression_Exercise 0.38222 0.22161 1.725 0.084573 .
## Num_Major_Vessels_Flouro 1.24221 0.25447 4.881 1.05e-06 ***
## factor(Sex)1 1.43651 0.49449 2.905 0.003672 **
## factor(Chest_Pain_Type)2 1.22865 0.75333 1.631 0.102900
## factor(Chest_Pain_Type)3 0.20237 0.64952 0.312 0.755372
## factor(Chest_Pain_Type)4 2.18778 0.65675 3.331 0.000865 ***
## factor(Exercise_Induced_Angina)1 0.70999 0.43087 1.648 0.099392 .
## factor(Peak_Exercise_ST_Segment)2 1.20100 0.46245 2.597 0.009404 **
## factor(Peak_Exercise_ST_Segment)3 0.46862 0.89722 0.522 0.601458
## factor(Thalassemia)6 -0.24773 0.76358 -0.324 0.745609
## factor(Thalassemia)7 1.35384 0.41284 3.279 0.001040 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 415.20 on 300 degrees of freedom
## Residual deviance: 197.46 on 287 degrees of freedom
## AIC: 225.46
##
## Number of Fisher Scoring iterations: 6
null_deviance2 <- summary(model_backward_fit)$null.deviance
null_deviance2
## [1] 415.1958
residual_deviance2 <- deviance(model_backward_fit)
residual_deviance2
## [1] 197.4618
null_degree2 <- summary(model_backward_fit)$df[2] + summary(model_backward_fit)$df[1] - 1
null_degree2
## [1] 300
resid_degree2 <- summary(model_backward_fit)$df[2]
resid_degree2
## [1] 287
deviance_diff2 <- null_deviance2 - residual_deviance2
deviance_diff2
## [1] 217.734
df_diff2 <- null_degree2 - resid_degree2
df_diff2
## [1] 13
# Compare the increase in deviance from dropping the removed regressors with twice the change in residual degrees of freedom
deviance(model_backward_fit) - deviance(model) > 2*(df.residual(model_backward_fit) - df.residual(model))
## [1] FALSE
Hence, the fitted model is ln(p/(1-p)) = -9.48926 + 0.02315x1 - 0.01464x2 + 0.38222x3 + 1.24221x4 + 1.43651x5 + 1.22865x6 + 0.20237x7 + 2.18778x8 + 0.70999x9 + 1.20100x10 + 0.46862x11 - 0.24773x12 + 1.35384x13, where p is the probability of heart disease and x1, ..., x13 denote the regressors in the order they appear in the model summary.
Also, the deviance increases by only about 4.51 (from 192.95 to 197.46) when the regressors removed by backward elimination are dropped, which is less than twice the change in degrees of freedom (2 x 5 = 10). The removed regressors therefore do not play a significant role, and the reduced model is preferred; this agrees with its lower AIC (225.46 against 230.95 for the full model).
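As a further check (a sketch added here, not part of the original analysis), a likelihood-ratio test between the reduced and full models leads to the same conclusion:
# change in deviance of about 4.5 on 5 degrees of freedom, which is not significant
anova(model_backward_fit, model, test = "Chisq")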