class: center, middle, inverse, title-slide

# Statistics with R
## R for Actuarial Students

---

### Mortality Investigation

* Use the data file “Mortality_Investigation.csv” to answer the questions below.
* The file contains the details of a mortality investigation performed on 100 different male lives between 1-Jan-2016 and 31-Dec-2018.

---

### Part 1

Print the first 6 rows of the data, ensuring correct data formats for all the columns, and calculate the proportions of the total sample that died, survived, and withdrew during the observation period.

```r
Mort_Inv <- read.csv("Mortality_Investigation.csv")

# Convert the date columns from text to Date
Mort_Inv$DoB <- as.Date(Mort_Inv$DoB)
Mort_Inv$DoJ <- as.Date(Mort_Inv$DoJ)
Mort_Inv$DoE <- as.Date(Mort_Inv$DoE)
```

---

```r
head(Mort_Inv)
```

```
##   Life        DoB        DoJ        DoE Exit_Reason
## 1   A1 1981-12-12 2018-11-13 2018-12-31    Survived
## 2   A2 1981-05-22 2017-10-06 2018-12-31    Survived
## 3   A3 1978-08-11 2018-01-30 2018-12-31    Survived
## 4   A4 1980-05-24 2016-05-12 2016-05-13  Withdrawal
## 5   A5 1979-04-03 2017-07-25 2018-12-31    Survived
## 6   A6 1979-11-08 2016-08-02 2017-04-14       Death
```

```r
# Share of lives by reason for exit
prop.table(table(Mort_Inv$Exit_Reason))
```

```
##
##      Death   Survived Withdrawal
##       0.31       0.40       0.29
```

---

### Part 2

Compute two new columns, ‘Age_At_Entry’ and ‘Age_At_Exit’, for each person, and print the last 6 rows of the resulting table. Assume a completed year is 365.25 days and express each age as a decimal number of years (for example, 35.234).

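
The age calculation can be sketched on two made-up lives; the dates below are invented for illustration and are not taken from the CSV file. Subtracting two `Date` vectors gives a `difftime` measured in days, and dividing by 365.25 leaves the "days" label attached, so this sketch converts to plain numbers before dividing.

```r
# Toy dates, invented for illustration only
dob <- as.Date(c("1980-01-01", "1979-06-15"))
doj <- as.Date(c("2017-01-01", "2016-06-15"))

# Date - Date returns a difftime in days
gap <- doj - dob
class(gap)   # "difftime"
units(gap)   # "days"

# as.numeric() drops the unit, so the result is a plain number of years
age_at_entry <- round(as.numeric(gap) / 365.25, 4)
age_at_entry
```
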

---

### Part 2

```r
# Subtracting Dates gives a difftime in days; as.numeric() drops that unit
# so the ages are stored as plain numbers of years
Mort_Inv$Age_At_Entry <- round(as.numeric(Mort_Inv$DoJ - Mort_Inv$DoB) / 365.25, 4)
Mort_Inv$Age_At_Exit <- round(as.numeric(Mort_Inv$DoE - Mort_Inv$DoB) / 365.25, 4)
tail(Mort_Inv)
```

```
##     Life        DoB        DoJ        DoE Exit_Reason Age_At_Entry Age_At_Exit
## 95   A95 1981-03-28 2016-03-06 2019-11-21    Survived      34.9405     38.6502
## 96   A96 1981-01-17 2018-04-04 2018-09-14       Death      37.2101     37.6564
## 97   A97 1980-01-17 2016-08-29 2016-09-18       Death      36.6160     36.6708
## 98   A98 1978-04-17 2016-06-14 2016-07-07  Withdrawal      38.1602     38.2231
## 99   A99 1978-06-12 2017-09-05 2019-11-25    Survived      39.2334     41.4538
## 100 A100 1980-06-29 2018-03-27 2019-10-04    Survived      37.7413     39.2635
```

---

### Part 3

Compute the average age at entry and the average age at exit of the people for whom “Death” was the reason for exit.

```r
mean(Mort_Inv$Age_At_Entry[Mort_Inv$Exit_Reason == "Death"])
```

```
## [1] 37.01715
```

```r
mean(Mort_Inv$Age_At_Exit[Mort_Inv$Exit_Reason == "Death"])
```

```
## [1] 37.89168
```

---

### Part 4

If the age label used is “age last birthday”, calculate the number of lives, out of the total of 100, that contributed a full year to the central exposed to risk `\(E^{C}_{37}\)`.

```r
# Under observation from before exact age 37 until after exact age 38
sum(Mort_Inv$Age_At_Entry < 37 & Mort_Inv$Age_At_Exit > 38)
```

```
## [1] 14
```

---

### Part 5

If the age label used is “age last birthday”, calculate the number of lives that did not contribute at all to the central exposed to risk `\(E^{C}_{37}\)`.

```r
# Entered after exact age 38 or exited before exact age 37
sum(Mort_Inv$Age_At_Entry > 38 | Mort_Inv$Age_At_Exit < 37)
```

```
## [1] 49
```

---

### Part 6

If the age label used is “age last birthday”, calculate the total number of years contributed to `\(E^{C}_{37}\)` by all the lives together (round your answer to 2 decimal places).

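
One way to read this question, with notation that is an assumption introduced here rather than part of the original exercise: if life `\(i\)` enters observation at exact age `\(a_i\)` and leaves at exact age `\(b_i\)`, its contribution is the length of the overlap between `\([a_i, b_i]\)` and the age interval `\([37, 38]\)`, so that

`\(E^{C}_{37} = \sum_{i=1}^{100} \max\bigl(0,\ \min(38, b_i) - \max(37, a_i)\bigr)\)`

For example, a hypothetical life with `\(a_i = 36.5\)` and `\(b_i = 37.4\)` would contribute `\(0.4\)` years.
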
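```r
# Flag the lives that spend any time between exact ages 37 and 38
Mort_Inv$Contribution37 <- ifelse(
  Mort_Inv$Age_At_Exit < 37 | Mort_Inv$Age_At_Entry > 38,
  "No", "Yes"
)

# Years each contributing life spends between exact ages 37 and 38
Mort_Inv$contribution37_Period <- ifelse(
  Mort_Inv$Contribution37 == "Yes",
  pmin(38, Mort_Inv$Age_At_Exit) - pmax(37, Mort_Inv$Age_At_Entry),
  0
)

# Total exposure, rounded to 2 decimal places as the question asks
round(sum(Mort_Inv$contribution37_Period), 2)
```

```
## [1] 27.42
```

---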
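
The Part 6 logic could also be wrapped in a small helper, sketched here under the assumption that ages are plain numbers in years. The function name `exposure_in_age` and its arguments are hypothetical, introduced only for illustration, and the ages in the usage lines are made up rather than taken from the data.

```r
# Hypothetical helper: years each life spends between exact ages x and x + 1
exposure_in_age <- function(age_entry, age_exit, x) {
  # Overlap of [age_entry, age_exit] with [x, x + 1];
  # a negative overlap means the life never fell in the interval
  pmax(0, pmin(x + 1, age_exit) - pmax(x, age_entry))
}

# Made-up ages for three lives
entry <- c(36.5, 37.2, 38.4)
exit  <- c(38.3, 37.9, 39.0)

exposure_in_age(entry, exit, 37)        # 1.0 0.7 0.0
sum(exposure_in_age(entry, exit, 37))   # 1.7
```

On the investigation data, a call along the lines of `sum(exposure_in_age(as.numeric(Mort_Inv$Age_At_Entry), as.numeric(Mort_Inv$Age_At_Exit), 37))` should reproduce the Part 6 total, and changing the last argument gives the exposure at another age.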