Self-management of asthma helps improve patient health. Asthma self-management education gives patients and caregivers the skills to understand the disease and its treatment. It teaches them to take medications appropriately, recognize early signs and symptoms of asthma episodes, seek medical care as appropriate, and identify and avoid environmental asthma allergens and irritants. In this project, we study the characteristics that influence asthma self-management.
## [1] 13572 1057
The data set comes from the CDC (URL: “https://www.cdc.gov/brfss/acbs/2016_documentation.html”). It is a survey study. The downloaded file is “2016 ACBS Adult Data SAS [ZIP – 3.10 MB]”. The unzipped file has 899 variables and 13,922 cases. We selected the variables to use in our study.
Meaning of variables used in the dataset
ASTHNOW Have you ever been told by a doctor or other health professional that you have asthma?
TCH_SIGN Has a doctor or other health professional ever taught you… a. How to recognize early signs or symptoms of an asthma episode?
TCH_RESP Has a doctor or other health professional ever taught you… b. What to do during an asthma episode or attack?
TCH_MON A peak flow meter is a hand held device that measures how quickly you can blow air out of your lungs. Has a doctor or other health professional ever taught you… c. How to use a peak flow meter to adjust your daily medications?
MGT_PLAN An asthma action plan, or asthma management plan, is a form with instructions about when to change the amount or type of medicine, when to call the doctor for advice, and when to go to the emergency room. Has a doctor or other health professional EVER given you an asthma action plan?
MOD_ENV (7.13) INTERVIEWER READ: Now, back to questions specifically about you. Has a health professional ever advised you to change things in your home, school, or work to improve your asthma?
MGT_CLAS Have you ever taken a course or class on how to manage your asthma?
INHALERH (8.3) Did a doctor or other health professional show you how to use the inhaler?
INHALERW (8.4) Did a doctor or other health professional watch you use the inhaler?
Response types: (1) YES (2) NO (7) DON’T KNOW (9) REFUSED
MISS_DAY = “NUMBER OF MISSED DAYS”
MOD_ENV = “EVER ADVISED CHANGE THINGS IN YOUR HOME”
AGEDX = “AGE AT ASTHMA DIAGNOSIS”
AGEG_F6_M = “MODIFIED SIX AGE GROUPS USED IN ASTHMA ADULT POST-STRATIFICATION”
AIRCLEANER = “AIR CLEANER USED”
ASMDCOST = “COST BARRIER: PRIMARY CARE DOCTOR”
ASRXCOST = “COST BARRIER: MEDICATION”
ASSPCOST = “COST BARRIER: SPECIALIST”
CATTMPTS_F = “DISPOSITION CODES FOR CALL ATTEMPTS 1 THROUGH 20 …”
EMP_STAT = “CURRENT EMPLOYMENT STATUS”
EPIS_12M = “ASTHMA EPISODE OR ATTACK”
EPIS_TP = “NUMBER OF EPISODES / ATTACKS”
ER_TIMES = “NUMBER OF EMERGENCY ROOM VISITS”
ER_VISIT = “EMERGENCY ROOM VISIT”
EVER_ASTH = “EVER HAVE ASTHMA INCONSISTENT WITH BRFSS”
HOSPPLAN = “HOSPITAL FOLLOW-UP”
HOSPTIME = “NUMBER OF HOSPITAL VISITS”
HOSP_VST = “HOSPITAL VISIT”
QSTLANG_F = “LANGUAGE IDENTIFIER”
SCR_MED3 = “HAVE ALL THE MEDICATIONS”
UNEMP_R = “REASON NOT NOW EMPLOYED”
URG_TIME = “NUMBER OF URGENT VISITS”
WORKENV5 = “ASTHMA AGGRAVATED BY CURRENT JOB”
WORKENV6 = “ASTHMA CAUSED BY CURRENT JOB”
WORKENV7 = “ASTHMA AGGRAVATED BY PREVIOUS JOB”
WORKENV8 = “ASTHMA CAUSED BY PREVIOUS JOB”
WORKQUIT1 = “EVER CHANGE OR QUIT A JOB”
WORKSEN3 = “DOCTOR DIAGNOSED WORK ASTHMA”
WORKSEN4 = “SELF-IDENTIFIED WORK ASTHMA”
WORKTALK = “DOCTOR DISCUSSED WORK ASTHMA”
INS1 = “INSURANCE”
INS2 = “INSURANCE OR COVERAGE GAP”
LASTSYMP = “LAST HAD ANY SYMPTOMS OF ASTHMA”
LAST_MD = “LAST TALKED TO A DOCTOR”
LAST_MED = “LAST TOOK ASTHMA MEDICATION”
COMPASTH = “TYPICAL ATTACK”
ACT_DAYS30 = “ACTIVITY LIMITATION”
We select all the variables that we can use in our dataset. We also start to clean the dataset.
Here we categ
## TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV SEX
## 1:8390 1:9755 1:5615 1:3657 1: 1179 1:9168 1:4218 1:4896
## 2:4877 2:3525 2:7800 2:9604 2:12357 2:2778 2:9265 2:8676
## 7: 294 7: 284 7: 150 7: 299 7: 32 5: 427 7: 81
## 9: 11 9: 8 9: 7 9: 12 9: 4 6: 747 9: 8
## 7: 451
## 9: 1
##
## AGEG.F7 X_IMPRACE EDUCAL X_INCOMG X_RFBMI5 SMOKE100 COPD
## 1: 795 1:10741 1: 17 1:1778 1:3424 1:6302 1 : 2665
## 2:1222 2: 797 2: 245 2:2108 2:9509 2:7231 2 :10738
## 3:1293 3: 176 3: 666 3:1231 9: 639 7: 37 7 : 117
## 4:2077 4: 290 4:3148 4:1604 9: 2 9 : 10
## 5:3299 5: 997 5:4183 5:5213 NA's: 42
## 6:3148 6: 571 6:5297 9:1638
## 7:1738 9: 16
## EMPHY DEPRESS BRONCH DUR.30D INCINDT LAST.MD
## 1 : 1090 1 :5194 1 :3483 12 :4774 1: 287 4 :7760
## 2 :12344 2 :8269 2 :9902 6 :4118 2: 1001 5 :1880
## 7 : 85 7 : 41 7 : 134 10 :1548 3:12253 6 : 801
## 9 : 11 9 : 26 9 : 11 1 :1237 7: 29 7 :2852
## NA's: 42 NA's: 42 NA's: 42 2 : 909 9: 2 77: 140
## 11 : 736 88: 133
## (Other): 250 99: 6
## LAST.MED LAST.SYMP COMPASTH INS2 ER.VISIT HOSP.VST
## 1 :4789 1 :3513 11 :4252 1: 628 1:1327 1: 404
## 7 :2822 7 :2456 6 :4118 2:12137 2:6517 2:6524
## 3 :1480 3 :2440 3 :3196 5: 743 5:2609 4: 920
## 4 :1178 2 :1665 1 :1139 7: 51 6:3097 5:2609
## 2 :1174 4 :1548 2 : 795 9: 13 7: 22 6:3097
## 5 :1018 5 :1014 7 : 51 7: 16
## (Other):1111 (Other): 936 (Other): 21 9: 2
## ASRXCOST WORKTALK ACT.DAY30
## 1 :1509 1 : 2409 1:5667
## 2 :9421 2 :10681 2:3009
## 5 :2607 6 : 264 3:1431
## 7 : 16 7 : 176 4: 801
## 9 : 12 8 : 20 5:2609
## NA's: 7 9 : 9 7: 45
## NA's: 13 9: 10
#### Here we collapse certain variables with too many classes, and factors with few cases.
# Recode missing values (NA) as "7" (DON'T KNOW) so that no row is lost in these variables
asthma.mgt.adult2 <- asthma.mgt.adult2 %>% mutate(COPD = replace(COPD, is.na(COPD), "7"),
EMPHY = replace(EMPHY, is.na(EMPHY), "7"),
DEPRESS = replace(DEPRESS, is.na(DEPRESS), "7"),
BRONCH = replace(BRONCH, is.na(BRONCH), "7"),
ASRXCOST = replace(ASRXCOST, is.na(ASRXCOST), "7"),
WORKTALK = replace(WORKTALK, is.na(WORKTALK), "7")
)
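The collapsing step described above can be sketched in base R. This is a minimal illustration on a hypothetical vector `answers`, not the report's own code: it assumes the codes 1 = yes, 2 = no, 7 = don't know, 9 = refused, and merges the rare codes 7 and 9 into a single level "3", which matches the level counts shown in the summaries.

```r
# Hypothetical survey answers (not taken from the ACBS file):
# 1 = yes, 2 = no, 7 = don't know, 9 = refused, NA = missing.
answers <- factor(c("1", "2", "7", "1", "9", "2", "1", NA))

# Step 1: treat a missing answer as "don't know" (code 7).
# This works because "7" is already a level of the factor.
answers[is.na(answers)] <- "7"

# Step 2: merge the rare levels 7 and 9 into one level named "3".
# Assigning the same name to several levels merges them.
levels(answers)[levels(answers) %in% c("7", "9")] <- "3"

collapsed_counts <- table(answers)
print(collapsed_counts)
```

Merging by renaming levels keeps the variable a factor and avoids a round trip through character vectors.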
## TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV SEX
## 1:8390 1:9755 1:5615 1:3657 1: 1179 1:9168 1:4218 1:4896
## 2:4877 2:3525 2:7800 2:9604 2:12357 2:2778 2:9265 2:8676
## 3: 305 3: 292 3: 157 3: 311 3: 36 4:1174 3: 89
## 3: 452
##
##
##
## AGEG.F7 X_IMPRACE EDUCAL X_INCOMG X_RFBMI5 SMOKE100 COPD EMPHY
## 1: 795 1:10741 1: 17 1:1778 1:3424 1:6302 1: 2665 1: 1090
## 2:1222 2: 797 2: 245 2:2108 2:9509 2:7231 2:10738 2:12344
## 3:1293 3: 176 3: 666 3:1231 9: 639 7: 39 7: 169 7: 127
## 4:2077 4: 290 4:3148 4:1604 9: 11
## 5:3299 5: 997 5:4183 5:5213
## 6:3148 6: 571 6:5313 9:1638
## 7:1738
## DEPRESS BRONCH DUR.30D INCINDT LAST.MD LAST.MED LAST.SYMP
## 1:5194 1:3483 1 :1237 1: 287 4:7760 4:4789 1 :3513
## 2:8269 2:9902 10:1548 2: 1001 5:1880 5:2192 7 :2456
## 7: 83 7: 176 11: 736 3:12253 6: 801 6:2031 3 :2440
## 9: 26 9: 11 12:4774 7: 31 7:2852 7:4000 2 :1665
## 2 : 909 9: 279 9: 560 4 :1548
## 6 :4118 5 :1014
## 7 : 250 (Other): 936
## COMPASTH INS2 ER.VISIT HOSP.VST ASRXCOST WORKTALK ACT.DAY30
## 1 :1139 1: 628 1:1327 1: 404 1:1509 1: 2409 1:5667
## 11:4252 2:12137 2:6517 2:6524 2:9421 2:10681 2:3009
## 2 : 795 5: 743 5:2609 4: 920 5:2607 6: 264 3:1431
## 3 :3210 7: 51 6:3097 5:2609 7: 23 7: 189 4: 801
## 6 :4118 9: 13 7: 22 6:3097 9: 12 8: 20 5:2609
## 7 : 58 7: 16 9: 9 7: 55
## 9: 2
## [1] 13572 30
Histograms tell us how the data is distributed in the dataset (numeric fields).
There are highly correlated predictors, so we remove some of them. This leaves 33 variables in our data set, with 7 response variables.
#### Selection of variables
We first extract the variables related to asthma education.
## TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV
## 1:8390 1:9755 1:5615 1:3657 1: 1179 1:9168 1:4218
## 2:4877 2:3525 2:7800 2:9604 2:12357 2:2778 2:9265
## 3: 305 3: 292 3: 157 3: 311 3: 36 4:1174 3: 89
## 3: 452
Elbow method Scree Plot
The number of clusters is 4.
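The elbow method can be sketched as follows. This is a minimal illustration, assuming k-means from base R (`stats::kmeans`) on a hypothetical numeric matrix `edu` standing in for the seven education items; the group profiles in `centres` are made up and are not the report's data.

```r
set.seed(42)  # k-means starts from random centres

# Hypothetical data: 400 respondents drawn around four made-up answer profiles.
n <- 400
centres <- rbind(
  c(1, 1, 1, 1, 1, 1, 1),
  c(1, 1, 2, 2, 2, 1, 2),
  c(2, 2, 2, 2, 2, 2, 2),
  c(2, 1, 2, 2, 2, 4, 2)
)
group <- sample(1:4, n, replace = TRUE)
edu <- centres[group, ] + matrix(rnorm(n * 7, sd = 0.2), ncol = 7)

# Total within-cluster sum of squares for k = 1..8.
ks <- 1:8
wss <- sapply(ks, function(k) kmeans(edu, centers = k, nstart = 20)$tot.withinss)

# The "elbow" is the k after which wss stops dropping sharply.
# plot(ks, wss, type = "b") would draw the scree plot.
elbow <- data.frame(k = ks, wss = round(wss, 1))
print(elbow)

# Fit the chosen number of clusters and store the labels as a factor.
fit <- kmeans(edu, centers = 4, nstart = 20)
target <- factor(fit$cluster)
```

Using several random starts (`nstart`) makes the curve less sensitive to an unlucky initialisation.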
## TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV
## 1 1.664207 1.597786 1.821033 1.935424 1.967405 3.722017 1.849323
## 2 1.169630 1.043248 2.010332 1.825324 1.938491 1.151129 1.669630
## 3 1.099168 1.033932 1.000000 1.460814 1.827715 1.110552 1.565674
## 4 2.009950 1.871269 1.800373 1.984142 1.985386 1.511194 1.836754
## TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV target
## 1 1 1 2 2 2 1 1 2
## 2 1 1 1 1 1 1 2 3
## 3 1 1 1 1 2 1 2 3
## 4 2 1 2 2 2 2 2 4
## 5 2 1 2 2 2 1 2 2
## 6 2 1 2 2 2 1 2 2
## TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV target
## 1:8390 1:9755 1:5615 1:3657 1: 1179 1:9168 1:4218 1:1626
## 2:4877 2:3525 2:7800 2:9604 2:12357 2:2778 2:9265 2:4162
## 3: 305 3: 292 3: 157 3: 311 3: 36 4:1174 3: 89 3:4568
## 3: 452 4:3216
View of the clustering result
## # A tibble: 11 x 5
## TCH.SIGN target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 1 618 8390 0.0737
## 2 1 2 3456 8390 0.412
## 3 1 3 4118 8390 0.491
## 4 1 4 198 8390 0.0236
## 5 2 1 936 4877 0.192
## 6 2 2 706 4877 0.145
## 7 2 3 447 4877 0.0917
## 8 2 4 2788 4877 0.572
## 9 3 1 72 305 0.236
## 10 3 3 3 305 0.00984
## 11 3 4 230 305 0.754
## # A tibble: 3 x 6
## # Groups: target [2]
## TCH.SIGN target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 4118 8390 0.491 4118
## 2 2 4 2788 4877 0.572 2788
## 3 3 4 230 305 0.754 230
For TCH.SIGN, the most frequent cluster in the target variable is 3 for the positive answer (1) and 4 for both the negative answer (2) and the combined don’t know/refused answer (3). The question is: TCH_SIGN Has a doctor or other health professional ever taught you… a. How to recognize early signs or symptoms of an asthma episode?
## # A tibble: 3 x 6
## # Groups: TCH.SIGN [3]
## target TCH.SIGN count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 3 1 4118 4568 0.901 4118
## 2 4 2 2788 3216 0.867 2788
## 3 4 3 230 3216 0.0715 230
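The count and proportion tables above can be reproduced in spirit with a cross-tabulation. The sketch below uses base R on hypothetical vectors `answer` and `target` (the report's tibbles suggest dplyr was used, so this is an equivalent illustration, not the original code).

```r
# Hypothetical answers (1 = yes, 2 = no, 3 = don't know/refused) and cluster labels.
answer <- factor(c(1, 1, 1, 1, 2, 2, 2, 3, 3, 1), levels = 1:3)
target <- factor(c(3, 3, 2, 3, 4, 4, 1, 4, 4, 3), levels = 1:4)

# Counts of each (answer, cluster) pair.
counts <- table(answer, target)

# Row proportions: share of each answer that falls in each cluster.
by_answer <- prop.table(counts, margin = 1)

# Column proportions: share of each cluster made up of each answer.
by_cluster <- prop.table(counts, margin = 2)

# Most frequent cluster for each answer.
dominant <- colnames(counts)[apply(counts, 1, which.max)]
names(dominant) <- rownames(counts)

print(round(by_answer, 3))
print(dominant)
```

Reading the table by rows and by columns answers two different questions, which is why the report shows both a per-answer and a per-cluster proportion.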
## # A tibble: 11 x 5
## TCH.RESP target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 1 741 9755 0.0760
## 2 1 2 3982 9755 0.408
## 3 1 3 4419 9755 0.453
## 4 1 4 613 9755 0.0628
## 5 2 1 798 3525 0.226
## 6 2 2 180 3525 0.0511
## 7 2 3 143 3525 0.0406
## 8 2 4 2404 3525 0.682
## 9 3 1 87 292 0.298
## 10 3 3 6 292 0.0205
## 11 3 4 199 292 0.682
## # A tibble: 3 x 6
## # Groups: target [2]
## TCH.RESP target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 4419 9755 0.453 4419
## 2 2 4 2404 3525 0.682 2404
## 3 3 4 199 292 0.682 199
For TCH.RESP, the most frequent cluster in the target variable is 3 for the positive answer (1) and 4 for both the negative answer (2) and the combined don’t know/refused answer (3). The question is: TCH_RESP Has a doctor or other health professional ever taught you… b. What to do during an asthma episode or attack?
## # A tibble: 3 x 6
## # Groups: TCH.RESP [3]
## target TCH.RESP count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 3 1 4419 4568 0.967 4419
## 2 4 2 2404 3216 0.748 2404
## 3 4 3 199 3216 0.0619 199
## # A tibble: 10 x 5
## TCH.MON target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 1 319 5615 0.0568
## 2 1 2 46 5615 0.00819
## 3 1 3 4568 5615 0.814
## 4 1 4 682 5615 0.121
## 5 2 1 1279 7800 0.164
## 6 2 2 4027 7800 0.516
## 7 2 4 2494 7800 0.320
## 8 3 1 28 157 0.178
## 9 3 2 89 157 0.567
## 10 3 4 40 157 0.255
## # A tibble: 3 x 6
## # Groups: target [2]
## TCH.MON target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 4568 5615 0.814 4568
## 2 2 2 4027 7800 0.516 4027
## 3 3 2 89 157 0.567 89
For TCH.MON, the most frequent cluster in the target variable is 3 for the positive answer (1) and 2 for both the negative answer (2) and the combined don’t know/refused answer (3). The question is: TCH_MON A peak flow meter is a hand held device that measures how quickly you can blow air out of your lungs. Has a doctor or other health professional ever taught you… c. How to use a peak flow meter to adjust your daily medications?
## # A tibble: 3 x 6
## # Groups: TCH.MON [3]
## target TCH.MON count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 2 2 4027 4162 0.968 4027
## 2 2 3 89 4162 0.0214 89
## 3 3 1 4568 4568 1 4568
## # A tibble: 12 x 5
## MGT.PLAN target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 1 161 3657 0.0440
## 2 1 2 854 3657 0.234
## 3 1 3 2491 3657 0.681
## 4 1 4 151 3657 0.0413
## 5 2 1 1409 9604 0.147
## 6 2 2 3181 9604 0.331
## 7 2 3 2049 9604 0.213
## 8 2 4 2965 9604 0.309
## 9 3 1 56 311 0.180
## 10 3 2 127 311 0.408
## 11 3 3 28 311 0.0900
## 12 3 4 100 311 0.322
## # A tibble: 3 x 6
## # Groups: target [2]
## MGT.PLAN target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 2491 3657 0.681 2491
## 2 2 2 3181 9604 0.331 3181
## 3 3 2 127 311 0.408 127
For MGT.PLAN, the most frequent cluster in the target variable is 3 for the positive answer (1) and 2 for both the negative answer (2) and the combined don’t know/refused answer (3). The question is: MGT_PLAN An asthma action plan, or asthma management plan, is a form with instructions about when to change the amount or type of medicine, when to call the doctor for advice, and when to go to the emergency room. Has a doctor or other health professional EVER given you an asthma action plan?
## # A tibble: 3 x 6
## # Groups: MGT.PLAN [3]
## target MGT.PLAN count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 2 2 3181 4162 0.764 3181
## 2 2 3 127 4162 0.0305 127
## 3 3 1 2491 4568 0.545 2491
## # A tibble: 12 x 5
## MGT.CLAS target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 1 61 1179 0.0517
## 2 1 2 266 1179 0.226
## 3 1 3 798 1179 0.677
## 4 1 4 54 1179 0.0458
## 5 2 1 1557 12357 0.126
## 6 2 2 3886 12357 0.314
## 7 2 3 3759 12357 0.304
## 8 2 4 3155 12357 0.255
## 9 3 1 8 36 0.222
## 10 3 2 10 36 0.278
## 11 3 3 11 36 0.306
## 12 3 4 7 36 0.194
## # A tibble: 3 x 6
## # Groups: target [2]
## MGT.CLAS target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 798 1179 0.677 798
## 2 2 2 3886 12357 0.314 3886
## 3 3 3 11 36 0.306 11
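For MGT.CLAS, the most frequent cluster in the target variable is 3 for the positive answer (1), 2 for the negative answer (2), and 3 for the combined don’t know/refused answer (3), although the last group has only 36 cases. The question is: MGT_CLAS Have you ever taken a course or class on how to manage your asthma?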
## # A tibble: 3 x 6
## # Groups: MGT.CLAS [3]
## target MGT.CLAS count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 2 2 3886 4162 0.934 3886
## 2 3 1 798 4568 0.175 798
## 3 3 3 11 4568 0.00241 11
## # A tibble: 8 x 5
## INHALERW target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 2 3533 9168 0.385
## 2 1 3 4063 9168 0.443
## 3 1 4 1572 9168 0.171
## 4 2 2 629 2778 0.226
## 5 2 3 505 2778 0.182
## 6 2 4 1644 2778 0.592
## 7 4 1 1174 1174 1
## 8 3 1 452 452 1
## # A tibble: 4 x 6
## # Groups: target [3]
## INHALERW target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 4063 9168 0.443 4063
## 2 2 4 1644 2778 0.592 1644
## 3 4 1 1174 1174 1 1174
## 4 3 1 452 452 1 452
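For INHALERW, the most frequent cluster in the target variable is 3 for the positive answer (1) and 4 for the negative answer (2). Cluster 1 contains all cases of level 3 (don’t know/refused combined) and of level 4 (original codes 5 and 6 combined). The question is: INHALERW (8.4) Did a doctor or other health professional watch you use the inhaler?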
## # A tibble: 4 x 6
## # Groups: INHALERW [4]
## target INHALERW count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 4 1174 1626 0.722 1174
## 2 1 3 452 1626 0.278 452
## 3 3 1 4063 4568 0.889 4063
## 4 4 2 1644 3216 0.511 1644
## # A tibble: 12 x 5
## MOD.ENV target count etotal proportion
## <fct> <fct> <int> <int> <dbl>
## 1 1 1 268 4218 0.0635
## 2 1 2 1396 4218 0.331
## 3 1 3 2005 4218 0.475
## 4 1 4 549 4218 0.130
## 5 2 1 1335 9265 0.144
## 6 2 2 2745 9265 0.296
## 7 2 3 2542 9265 0.274
## 8 2 4 2643 9265 0.285
## 9 3 1 23 89 0.258
## 10 3 2 21 89 0.236
## 11 3 3 21 89 0.236
## 12 3 4 24 89 0.270
## # A tibble: 3 x 6
## # Groups: target [3]
## MOD.ENV target count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 1 3 2005 4218 0.475 2005
## 2 2 2 2745 9265 0.296 2745
## 3 3 4 24 89 0.270 24
For MOD.ENV, the most frequent cluster in the target variable is 3 for the positive answer (1), 2 for the negative answer (2), and 4 for the combined don’t know/refused answer (3), although the last group has only 89 cases. The question is: MOD_ENV (7.13) INTERVIEWER READ: Now, back to questions specifically about you. Has a health professional ever advised you to change things in your home, school, or work to improve your asthma?
length(asth.res1$TCH.SIGN)
## [1] 3
## # A tibble: 3 x 6
## # Groups: MOD.ENV [3]
## target MOD.ENV count etotal proportion group.max
## <fct> <fct> <int> <int> <dbl> <int>
## 1 2 2 2745 4162 0.660 2745
## 2 3 1 2005 4568 0.439 2005
## 3 4 3 24 3216 0.00746 24
asth.edul1 <- merge(asth.res1, asth.res2 ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.res3 ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.res4 ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.res5 ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.res6 ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.res7 ,by.x = "target", by.y = "target", all = TRUE) %>%
select(., target, TCH.SIGN, TCH.RESP, TCH.MON, MGT.PLAN, MGT.CLAS, INHALERW, MOD.ENV)
## Warning in merge.data.frame(., asth.res4, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x', 'count.y',
## 'etotal.y', 'proportion.y', 'group.max.y' are duplicated in the result
## Warning in merge.data.frame(., asth.res5, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x', 'count.y',
## 'etotal.y', 'proportion.y', 'group.max.y' are duplicated in the result
## Warning in merge.data.frame(., asth.res6, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x',
## 'count.y', 'etotal.y', 'proportion.y', 'group.max.y', 'count.x', 'etotal.x',
## 'proportion.x', 'group.max.x', 'count.y', 'etotal.y', 'proportion.y',
## 'group.max.y' are duplicated in the result
## Warning in merge.data.frame(., asth.res7, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x',
## 'count.y', 'etotal.y', 'proportion.y', 'group.max.y', 'count.x', 'etotal.x',
## 'proportion.x', 'group.max.x', 'count.y', 'etotal.y', 'proportion.y',
## 'group.max.y' are duplicated in the result
asth.edul1
## target TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV
## 1 1 <NA> <NA> <NA> <NA> <NA> 4 <NA>
## 2 1 <NA> <NA> <NA> <NA> <NA> 3 <NA>
## 3 2 <NA> <NA> 2 2 2 <NA> 2
## 4 2 <NA> <NA> 2 3 2 <NA> 2
## 5 2 <NA> <NA> 3 2 2 <NA> 2
## 6 2 <NA> <NA> 3 3 2 <NA> 2
## 7 3 1 1 1 1 1 1 1
## 8 3 1 1 1 1 3 1 1
## 9 4 3 3 <NA> <NA> <NA> 2 3
## 10 4 2 2 <NA> <NA> <NA> 2 3
## 11 4 2 3 <NA> <NA> <NA> 2 3
## 12 4 3 2 <NA> <NA> <NA> 2 3
#write.csv(asth.edul1, "asthma_edu_level.csv")
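The duplicated-column warnings above arise because every summary table carries helper columns with the same names (`count`, `etotal`, `proportion`, `group.max`). One possible way to avoid them, sketched here on hypothetical small tables, is to drop the helper columns first and then fold the list of tables with `Reduce`. The object names below are illustrative and are not the report's objects.

```r
# Hypothetical per-question summaries that share a helper column named `count`.
res_sign <- data.frame(target = c(3, 4), TCH.SIGN = c(1, 2), count = c(10, 8))
res_resp <- data.frame(target = c(3, 4), TCH.RESP = c(1, 2), count = c(11, 7))
res_mon  <- data.frame(target = c(2, 3), TCH.MON  = c(2, 1), count = c(9, 12))

# Drop the helper column before merging so that only `target` is shared.
keep <- function(df) df[, setdiff(names(df), "count"), drop = FALSE]
tables <- lapply(list(res_sign, res_resp, res_mon), keep)

# Full outer join of all tables on `target`, one merge at a time.
edu_level <- Reduce(function(a, b) merge(a, b, by = "target", all = TRUE), tables)
print(edu_level)
```

Because no non-key column name repeats, `merge` has nothing to suffix and emits no warning.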
asth.edul2 <- merge(asth.sign, asth.resp ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.mon ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.plan ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.clas ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.inhal ,by.x = "target", by.y = "target", all = TRUE) %>%
merge(., asth.env ,by.x = "target", by.y = "target", all = TRUE) %>%
select(., target, TCH.SIGN, TCH.RESP, TCH.MON, MGT.PLAN, MGT.CLAS, INHALERW, MOD.ENV)
## Warning in merge.data.frame(., asth.plan, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x', 'count.y',
## 'etotal.y', 'proportion.y', 'group.max.y' are duplicated in the result
## Warning in merge.data.frame(., asth.clas, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x', 'count.y',
## 'etotal.y', 'proportion.y', 'group.max.y' are duplicated in the result
## Warning in merge.data.frame(., asth.inhal, by.x = "target", by.y =
## "target", : column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x',
## 'count.y', 'etotal.y', 'proportion.y', 'group.max.y', 'count.x', 'etotal.x',
## 'proportion.x', 'group.max.x', 'count.y', 'etotal.y', 'proportion.y',
## 'group.max.y' are duplicated in the result
## Warning in merge.data.frame(., asth.env, by.x = "target", by.y = "target", :
## column names 'count.x', 'etotal.x', 'proportion.x', 'group.max.x',
## 'count.y', 'etotal.y', 'proportion.y', 'group.max.y', 'count.x', 'etotal.x',
## 'proportion.x', 'group.max.x', 'count.y', 'etotal.y', 'proportion.y',
## 'group.max.y' are duplicated in the result
asth.edul2
## target TCH.SIGN TCH.RESP TCH.MON MGT.PLAN MGT.CLAS INHALERW MOD.ENV
## 1 1 <NA> <NA> <NA> <NA> <NA> 4 <NA>
## 2 1 <NA> <NA> <NA> <NA> <NA> 3 <NA>
## 3 2 <NA> <NA> 2 2 2 <NA> 2
## 4 2 <NA> <NA> 2 3 2 <NA> 2
## 5 2 <NA> <NA> 3 2 2 <NA> 2
## 6 2 <NA> <NA> 3 3 2 <NA> 2
## 7 3 1 1 1 1 1 1 1
## 8 3 1 1 1 1 3 1 1
## 9 4 3 3 <NA> <NA> <NA> 2 3
## 10 4 2 2 <NA> <NA> <NA> 2 3
## 11 4 2 3 <NA> <NA> <NA> 2 3
## 12 4 3 2 <NA> <NA> <NA> 2 3
## 'data.frame': 13572 obs. of 24 variables:
## $ TARGET : num 0 1 1 0 0 0 0 0 1 0 ...
## $ SEX : Factor w/ 2 levels "1","2": 1 2 2 2 2 1 1 2 1 2 ...
## $ AGEG.F7 : Factor w/ 7 levels "1","2","3","4",..: 4 4 6 5 6 6 6 7 3 5 ...
## $ X_IMPRACE: Factor w/ 6 levels "1","2","3","4",..: 1 2 1 5 1 1 1 1 5 1 ...
## $ EDUCAL : Factor w/ 6 levels "1","2","3","4",..: 6 4 5 5 6 4 6 6 4 5 ...
## $ X_INCOMG : Factor w/ 6 levels "1","2","3","4",..: 5 1 2 5 5 5 5 6 1 5 ...
## $ X_RFBMI5 : Factor w/ 3 levels "1","2","9": 2 2 2 2 1 1 2 1 1 2 ...
## $ SMOKE100 : Factor w/ 3 levels "1","2","7": 2 2 2 2 2 1 1 2 1 2 ...
## $ COPD : Factor w/ 3 levels "1","2","7": 2 2 2 2 2 1 2 2 2 2 ...
## $ EMPHY : Factor w/ 4 levels "1","2","7","9": 2 2 2 2 2 1 2 2 2 2 ...
## $ DEPRESS : Factor w/ 4 levels "1","2","7","9": 2 1 2 1 2 2 2 2 1 2 ...
## $ BRONCH : Factor w/ 4 levels "1","2","7","9": 2 2 2 2 2 2 2 2 2 2 ...
## $ DUR.30D : Factor w/ 7 levels "1","10","11",..: 4 4 3 6 3 4 6 6 2 3 ...
## $ INCINDT : Factor w/ 4 levels "1","2","3","7": 1 3 3 3 3 3 3 3 3 2 ...
## $ LAST.MD : Factor w/ 5 levels "4","5","6","7",..: 1 1 1 4 4 3 1 4 2 1 ...
## $ LAST.MED : Factor w/ 5 levels "4","5","6","7",..: 1 1 3 4 3 2 1 4 4 1 ...
## $ LAST.SYMP: Factor w/ 8 levels "1","2","3","4",..: 1 1 3 7 3 2 5 7 4 3 ...
## $ COMPASTH : Factor w/ 6 levels "1","11","2","3",..: 4 4 6 5 2 2 5 5 2 4 ...
## $ INS2 : Factor w/ 5 levels "1","2","5","7",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ ER.VISIT : Factor w/ 5 levels "1","2","5","6",..: 1 2 2 3 4 4 2 3 4 2 ...
## $ HOSP.VST : Factor w/ 7 levels "1","2","4","5",..: 2 2 2 4 5 5 3 4 5 2 ...
## $ ASRXCOST : Factor w/ 5 levels "1","2","5","7",..: 2 2 2 3 2 2 2 3 2 2 ...
## $ WORKTALK : Factor w/ 6 levels "1","2","6","7",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ ACT.DAY30: Factor w/ 6 levels "1","2","3","4",..: 1 4 2 5 1 1 1 5 1 1 ...
Here we drop the rows with missing data because there are only 12 of them out of 13,922 rows. We also convert all predictors to categorical variables.
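A minimal sketch of these two steps in base R, on a hypothetical data frame `raw` (the column values are made up for illustration):

```r
# Hypothetical raw data with one missing value.
raw <- data.frame(
  SEX    = c(1, 2, 2, 1, NA),
  COPD   = c(2, 2, 1, 7, 2),
  TARGET = c(0, 1, 1, 0, 0)
)

# Keep only the rows with no missing value in any column.
clean <- raw[complete.cases(raw), ]

# Convert every predictor (all columns except TARGET) to a factor.
predictors <- setdiff(names(clean), "TARGET")
clean[predictors] <- lapply(clean[predictors], factor)

str(clean)
```

Dropping rows is reasonable only when the missing share is tiny, as it is here; otherwise recoding or imputation would be safer.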
## TARGET SEX AGEG.F7 X_IMPRACE EDUCAL X_INCOMG X_RFBMI5 SMOKE100
## 0:9004 1:4896 1: 795 1:10741 1: 17 1:1778 1:3424 1:6302
## 1:4568 2:8676 2:1222 2: 797 2: 245 2:2108 2:9509 2:7231
## 3:1293 3: 176 3: 666 3:1231 9: 639 7: 39
## 4:2077 4: 290 4:3148 4:1604
## 5:3299 5: 997 5:4183 5:5213
## 6:3148 6: 571 6:5313 9:1638
## 7:1738
## COPD EMPHY DEPRESS BRONCH DUR.30D INCINDT LAST.MD LAST.MED
## 1: 2665 1: 1090 1:5194 1:3483 1 :1237 1: 287 4:7760 4:4789
## 2:10738 2:12344 2:8269 2:9902 10:1548 2: 1001 5:1880 5:2192
## 7: 169 7: 127 7: 83 7: 176 11: 736 3:12253 6: 801 6:2031
## 9: 11 9: 26 9: 11 12:4774 7: 31 7:2852 7:4000
## 2 : 909 9: 279 9: 560
## 6 :4118
## 7 : 250
## LAST.SYMP COMPASTH INS2 ER.VISIT HOSP.VST ASRXCOST WORKTALK
## 1 :3513 1 :1139 1: 628 1:1327 1: 404 1:1509 1: 2409
## 7 :2456 11:4252 2:12137 2:6517 2:6524 2:9421 2:10681
## 3 :2440 2 : 795 5: 743 5:2609 4: 920 5:2607 6: 264
## 2 :1665 3 :3210 7: 51 6:3097 5:2609 7: 23 7: 189
## 4 :1548 6 :4118 9: 13 7: 22 6:3097 9: 12 8: 20
## 5 :1014 7 : 58 7: 16 9: 9
## (Other): 936 9: 2
## ACT.DAY30
## 1:5667
## 2:3009
## 3:1431
## 4: 801
## 5:2609
## 7: 55
##
Proportion of Good Skill Management by Education Level
Proportion of Good Skill Management by Duration of Asthma Attack
##
## Call: glm(formula = TARGET ~ ., family = binomial, data = training1)
##
## Coefficients:
## (Intercept) SEX2 AGEG.F72 AGEG.F73 AGEG.F74 AGEG.F75
## -1.174536 0.277355 0.185815 -0.057024 -0.092841 -0.269772
## AGEG.F76 AGEG.F77 X_IMPRACE2 X_IMPRACE3 X_IMPRACE4 X_IMPRACE5
## -0.437822 -0.673535 0.290042 -0.304171 -0.125454 0.066594
## X_IMPRACE6 EDUCAL2 EDUCAL3 EDUCAL4 EDUCAL5 EDUCAL6
## -0.052169 0.385098 0.387940 0.554764 0.603154 0.583484
## X_INCOMG2 X_INCOMG3 X_INCOMG4 X_INCOMG5 X_INCOMG9 X_RFBMI52
## 0.171072 -0.015702 -0.012750 0.061015 -0.045599 0.010538
## X_RFBMI59 SMOKE1002 SMOKE1007 COPD2 COPD7 EMPHY2
## -0.016012 0.117088 -0.354173 -0.136657 -0.225422 -0.014686
## EMPHY7 EMPHY9 DEPRESS2 DEPRESS7 DEPRESS9 BRONCH2
## -0.534735 -0.142864 0.064885 0.133971 -0.487342 -0.064650
## BRONCH7 BRONCH9 DUR.30D10 DUR.30D11 DUR.30D12 DUR.30D2
## -0.388348 0.953002 0.292401 0.384874 0.142016 0.033771
## DUR.30D6 DUR.30D7 INCINDT2 INCINDT3 INCINDT7 LAST.MD5
## -0.357111 -0.438918 0.093255 1.127746 -0.326672 -0.036131
## LAST.MD6 LAST.MD7 LAST.MD9 LAST.MED5 LAST.MED6 LAST.MED7
## -0.162538 -0.449114 -0.669319 -0.269302 -0.352852 -0.437017
## LAST.MED9 LAST.SYMP2 LAST.SYMP3 LAST.SYMP4 LAST.SYMP5 LAST.SYMP6
## -2.154621 0.007867 0.141573 NA 0.534054 0.722396
## LAST.SYMP7 LAST.SYMP9 COMPASTH11 COMPASTH2 COMPASTH3 COMPASTH6
## 0.500094 -0.170911 -0.263481 -0.332154 -0.103789 NA
## COMPASTH7 INS22 INS25 INS27 INS29 ER.VISIT2
## -0.396203 0.225311 0.053888 -0.067315 -12.177179 -0.150526
## ER.VISIT5 ER.VISIT6 ER.VISIT7 HOSP.VST2 HOSP.VST4 HOSP.VST5
## -12.056841 -0.609194 -1.108039 -0.221764 -0.317311 NA
## HOSP.VST6 HOSP.VST7 HOSP.VST9 ASRXCOST2 ASRXCOST5 ASRXCOST7
## NA -1.516388 -13.699485 -0.059624 11.330773 0.147173
## ASRXCOST9 WORKTALK2 WORKTALK6 WORKTALK7 WORKTALK8 WORKTALK9
## -1.171160 -0.575374 -0.615817 -0.385011 0.445304 0.018963
## ACT.DAY302 ACT.DAY303 ACT.DAY304 ACT.DAY305 ACT.DAY307
## 0.147657 0.186494 0.130929 NA 0.003462
##
## Degrees of Freedom: 10857 Total (i.e. Null); 10768 Residual
## Null Deviance: 13850
## Residual Deviance: 12650 AIC: 12830
##
## Call: glm(formula = TARGET ~ SEX + AGEG.F7 + X_IMPRACE + X_INCOMG +
## SMOKE100 + COPD + DUR.30D + INCINDT + LAST.MD + LAST.MED +
## LAST.SYMP + COMPASTH + INS2 + ER.VISIT + HOSP.VST + WORKTALK +
## ACT.DAY30, family = binomial, data = training1)
##
## Coefficients:
## (Intercept) SEX2 AGEG.F72 AGEG.F73 AGEG.F74 AGEG.F75
## -0.678798 0.277213 0.190033 -0.057372 -0.087781 -0.261394
## AGEG.F76 AGEG.F77 X_IMPRACE2 X_IMPRACE3 X_IMPRACE4 X_IMPRACE5
## -0.427736 -0.655346 0.282615 -0.300361 -0.126453 0.046405
## X_IMPRACE6 X_INCOMG2 X_INCOMG3 X_INCOMG4 X_INCOMG5 X_INCOMG9
## -0.050401 0.196905 0.017796 0.030185 0.101560 -0.031801
## SMOKE1002 SMOKE1007 COPD2 COPD7 DUR.30D10 DUR.30D11
## 0.125338 -0.315724 -0.143446 -0.426710 0.282624 0.379578
## DUR.30D12 DUR.30D2 DUR.30D6 DUR.30D7 INCINDT2 INCINDT3
## 0.136008 0.031677 -0.376067 -0.456387 0.098652 1.139783
## INCINDT7 LAST.MD5 LAST.MD6 LAST.MD7 LAST.MD9 LAST.MED5
## -0.384838 -0.043707 -0.176186 -0.456484 -0.698527 -0.272267
## LAST.MED6 LAST.MED7 LAST.MED9 LAST.SYMP2 LAST.SYMP3 LAST.SYMP4
## -0.353263 -0.435766 -2.161181 0.003324 0.137790 NA
## LAST.SYMP5 LAST.SYMP6 LAST.SYMP7 LAST.SYMP9 COMPASTH11 COMPASTH2
## 0.535938 0.722550 0.500386 -0.174032 -0.272680 -0.330235
## COMPASTH3 COMPASTH6 COMPASTH7 INS22 INS25 INS27
## -0.105775 NA -0.418558 0.219372 0.050479 -0.081843
## INS29 ER.VISIT2 ER.VISIT5 ER.VISIT6 ER.VISIT7 HOSP.VST2
## -12.152164 -0.154096 -0.676921 -0.623558 -1.138380 -0.235152
## HOSP.VST4 HOSP.VST5 HOSP.VST6 HOSP.VST7 HOSP.VST9 WORKTALK2
## -0.333963 NA NA -1.535202 -13.765679 -0.580233
## WORKTALK6 WORKTALK7 WORKTALK8 WORKTALK9 ACT.DAY302 ACT.DAY303
## -0.676153 -0.496932 0.114223 -0.018377 0.146346 0.190329
## ACT.DAY304 ACT.DAY305 ACT.DAY307
## 0.124820 NA 0.019868
##
## Degrees of Freedom: 10857 Total (i.e. Null); 10788 Residual
## Null Deviance: 13850
## Residual Deviance: 12670 AIC: 12810
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 1625 686
## 1 161 242
##
## Accuracy : 0.6879
## 95% CI : (0.6701, 0.7053)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.0005201
##
## Kappa : 0.1975
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.26078
## Specificity : 0.90985
## Pos Pred Value : 0.60050
## Neg Pred Value : 0.70316
## Prevalence : 0.34193
## Detection Rate : 0.08917
## Detection Prevalence : 0.14849
## Balanced Accuracy : 0.58532
##
## 'Positive' Class : 1
##
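The statistics reported with the confusion matrix above follow directly from its four counts. The base R sketch below recomputes them by hand, taking class 1 as the positive class as in the output; it is a worked check and does not call the function that produced the report's output.

```r
# Counts copied from the confusion matrix above
# (rows = prediction, columns = reference).
cm <- matrix(c(1625, 161, 686, 242), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))

tp <- cm["1", "1"]  # predicted 1, truly 1
tn <- cm["0", "0"]  # predicted 0, truly 0
fp <- cm["1", "0"]  # predicted 1, truly 0
fn <- cm["0", "1"]  # predicted 0, truly 1

accuracy            <- (tp + tn) / sum(cm)
sensitivity         <- tp / (tp + fn)
specificity         <- tn / (tn + fp)
pos_pred_value      <- tp / (tp + fp)
balanced_accuracy   <- (sensitivity + specificity) / 2
no_information_rate <- max(colSums(cm)) / sum(cm)

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, ppv = pos_pred_value,
              balanced = balanced_accuracy, nir = no_information_rate), 4))
```

The low sensitivity shows that the model finds only about a quarter of the positive cases, even though overall accuracy is slightly above the no-information rate.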
##
## Call: glm(formula = TARGET ~ SEX + AGEG.F7 + EDUCAL + X_INCOMG + BRONCH +
## DUR.30D + INCINDT + LAST.MD + LAST.MED + LAST.SYMP + COMPASTH +
## WORKTALK, family = binomial, data = training1)
##
## Coefficients:
## (Intercept) SEX2 AGEG.F72 AGEG.F73 AGEG.F74 AGEG.F75
## -1.118715 0.284237 0.167241 -0.071978 -0.092099 -0.256311
## AGEG.F76 AGEG.F77 EDUCAL2 EDUCAL3 EDUCAL4 EDUCAL5
## -0.413130 -0.637474 0.424996 0.401404 0.555033 0.603441
## EDUCAL6 X_INCOMG2 X_INCOMG3 X_INCOMG4 X_INCOMG5 X_INCOMG9
## 0.584129 0.150742 -0.056336 -0.051021 0.018527 -0.081780
## BRONCH2 BRONCH7 BRONCH9 DUR.30D10 DUR.30D11 DUR.30D12
## -0.099060 -0.574021 -0.010334 0.204898 0.292044 0.095202
## DUR.30D2 DUR.30D6 DUR.30D7 INCINDT2 INCINDT3 INCINDT7
## -0.001767 -0.556779 -0.508096 0.070431 1.097929 -0.384934
## LAST.MD5 LAST.MD6 LAST.MD7 LAST.MD9 LAST.MED5 LAST.MED6
## -0.299784 -0.423770 -0.705918 -0.836967 -0.304092 -0.401499
## LAST.MED7 LAST.MED9 LAST.SYMP2 LAST.SYMP3 LAST.SYMP4 LAST.SYMP5
## -0.491506 -2.222637 0.018452 0.156085 NA 0.530238
## LAST.SYMP6 LAST.SYMP7 LAST.SYMP9 COMPASTH11 COMPASTH2 COMPASTH3
## 0.739814 0.500976 -0.211515 -0.315412 -0.280003 -0.104776
## COMPASTH6 COMPASTH7 WORKTALK2 WORKTALK6 WORKTALK7 WORKTALK8
## NA -0.400090 -0.577549 -0.562460 -0.457966 -0.004085
## WORKTALK9
## -0.013825
##
## Degrees of Freedom: 10857 Total (i.e. Null); 10805 Residual
## Null Deviance: 13850
## Residual Deviance: 12730 AIC: 12830
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 1626 701
## 1 160 227
##
## Accuracy : 0.6828
## 95% CI : (0.6649, 0.7002)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.003415
##
## Kappa : 0.1803
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.24461
## Specificity : 0.91041
## Pos Pred Value : 0.58656
## Neg Pred Value : 0.69875
## Prevalence : 0.34193
## Detection Rate : 0.08364
## Detection Prevalence : 0.14259
## Balanced Accuracy : 0.57751
##
## 'Positive' Class : 1
##
Since our dataset has many variables, we can use penalized logistic regression to find a well-performing model. Ridge regression and lasso regression take two different approaches. Ridge regression keeps all variables in the model and shrinks the coefficients of variables with a minor contribution towards zero. Lasso regression keeps only the most significant variables and sets the coefficients of the remaining variables to zero.
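The difference between the two penalties can be sketched with the glmnet package, where `alpha = 0` gives ridge and `alpha = 1` gives lasso. The data frame `dat`, its columns, and the penalty value 0.05 are hypothetical choices for illustration and are not the report's data or settings.

```r
library(glmnet)
set.seed(1)

# Hypothetical categorical predictors; NOISE has no effect on the outcome.
n <- 300
dat <- data.frame(
  SEX    = factor(sample(1:2, n, replace = TRUE)),
  EDUCAL = factor(sample(1:4, n, replace = TRUE)),
  NOISE  = factor(sample(1:3, n, replace = TRUE))
)
lin <- -0.5 + 0.8 * (dat$SEX == 2) + 0.6 * (dat$EDUCAL == 4)
dat$TARGET <- rbinom(n, 1, plogis(lin))

# glmnet needs a numeric matrix, so factors are expanded to dummy columns.
x <- model.matrix(TARGET ~ ., data = dat)[, -1]
y <- dat$TARGET

ridge <- glmnet(x, y, family = "binomial", alpha = 0)  # L2 penalty
lasso <- glmnet(x, y, family = "binomial", alpha = 1)  # L1 penalty

# Coefficients at the same (arbitrary) penalty value.
ridge_coef <- as.matrix(coef(ridge, s = 0.05))
lasso_coef <- as.matrix(coef(lasso, s = 0.05))

# Count coefficients that are exactly zero, excluding the intercept.
n_zero_ridge <- sum(ridge_coef[-1, ] == 0)
n_zero_lasso <- sum(lasso_coef[-1, ] == 0)
print(c(ridge = n_zero_ridge, lasso = n_zero_lasso))
```

Ridge shrinks coefficients without removing any, while lasso can set some of them exactly to zero, which is why lasso also acts as a variable selection method.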
Variation of Ridge Model Coefficient by Log Lambda
The coefficients are clearly non-zero for negative values of log lambda and start to stabilize around -4.
Lambda that Minimises MSE
The plot shows that the log of the optimal value of lambda (i.e. the one that minimises the cross-validated error) is approximately -3. The exact value can be viewed by examining the variable lambda_min in the code below. In general, though, the objective of regularisation is to balance accuracy and simplicity. In the present context, this means a model with the smallest number of coefficients that still gives good accuracy. To this end, the cv.glmnet function also reports lambda.1se, the value of lambda that gives the simplest model whose error lies within one standard error of the minimum.
## [1] 0.0232599
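The one-standard-error rule described above can be sketched in base R. The cross-validation summary below is made up for illustration (the column names `cvm` and `cvsd` mirror the components that cv.glmnet returns, but the numbers are hypothetical and do not come from this analysis).

```r
# Hypothetical cross-validation summary: one row per candidate lambda.
cv <- data.frame(
  lambda = c(0.001, 0.005, 0.023, 0.050, 0.100),
  cvm    = c(1.190, 1.184, 1.180, 1.183, 1.200),  # mean cross-validated error
  cvsd   = c(0.006, 0.006, 0.005, 0.005, 0.006)   # standard error of cvm
)

# Lambda with the smallest mean cross-validated error.
i_min <- which.min(cv$cvm)
lambda_min <- cv$lambda[i_min]

# Largest lambda whose mean error is within one standard error of the minimum.
threshold <- cv$cvm[i_min] + cv$cvsd[i_min]
lambda_1se <- max(cv$lambda[cv$cvm <= threshold])

print(c(lambda_min = lambda_min, lambda_1se = lambda_1se))
print(log(lambda_min))  # the value read off the horizontal axis of the plot
```

A larger lambda means a stronger penalty, so `lambda_1se` trades a small loss in cross-validated error for a simpler model.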
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 1770 905
## 1 16 23
##
## Accuracy : 0.6606
## 95% CI : (0.6425, 0.6785)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.397
##
## Kappa : 0.0206
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.024784
## Specificity : 0.991041
## Pos Pred Value : 0.589744
## Neg Pred Value : 0.661682
## Prevalence : 0.341931
## Detection Rate : 0.008475
## Detection Prevalence : 0.014370
## Balanced Accuracy : 0.507913
##
## 'Positive' Class : 1
##
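This ridge model assigns almost every case to class 0 (sensitivity 0.025), so its accuracy is no better than the no-information rate. This points to excessive shrinkage (underfitting) rather than overfitting.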
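The statistics reported for the first ridge model can be recomputed from the four cell counts of its confusion matrix. The sketch below copies the counts from the output above and treats class 1 as the positive class; the variable names are our own.

```r
# Cell counts from the first ridge confusion matrix
# (rows = prediction, columns = reference, positive class = 1).
tn <- 1770; fn <- 905
fp <- 16;   tp <- 23
n_total <- tp + tn + fp + fn

sensitivity <- tp / (tp + fn)            # share of true 1s that are detected
specificity <- tn / (tn + fp)            # share of true 0s that are detected
accuracy <- (tp + tn) / n_total
no_info_rate <- max(tp + fn, tn + fp) / n_total  # always predict the larger class
balanced_accuracy <- (sensitivity + specificity) / 2

print(round(c(sensitivity = sensitivity, specificity = specificity,
              accuracy = accuracy, no_info_rate = no_info_rate,
              balanced_accuracy = balanced_accuracy), 4))
```

Accuracy and the no-information rate differ only in the third decimal place, which is why the test of accuracy against the no-information rate is not significant for this model.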
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 1786 928
## 1 0 0
##
## Accuracy : 0.6581
## 95% CI : (0.6399, 0.6759)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.5089
##
## Kappa : 0
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.0000
## Specificity : 1.0000
## Pos Pred Value : NaN
## Neg Pred Value : 0.6581
## Prevalence : 0.3419
## Detection Rate : 0.0000
## Detection Prevalence : 0.0000
## Balanced Accuracy : 0.5000
##
## 'Positive' Class : 1
##
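The second ridge model predicts class 0 for every case (sensitivity 0, kappa 0), so it is equivalent to the no-information classifier rather than an overfitted model.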
## 96 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) -0.6785089445
## (Intercept) .
## SEX2 0.2377312946
## AGEG.F72 0.2924064689
## AGEG.F73 0.0882851251
## AGEG.F74 0.0739161765
## AGEG.F75 -0.0784363273
## AGEG.F76 -0.2266438614
## AGEG.F77 -0.4304243552
## X_IMPRACE2 0.2783708011
## X_IMPRACE3 -0.2617968077
## X_IMPRACE4 -0.1250822625
## X_IMPRACE5 0.0770993145
## X_IMPRACE6 -0.0291325153
## EDUCAL2 -0.1849724455
## EDUCAL3 -0.1539710139
## EDUCAL4 0.0048576314
## EDUCAL5 0.0472495259
## EDUCAL6 0.0240563693
## X_INCOMG2 0.1424495042
## X_INCOMG3 -0.0318310008
## X_INCOMG4 -0.0230713655
## X_INCOMG5 0.0438650791
## X_INCOMG9 -0.0566232475
## X_RFBMI52 0.0008484487
## X_RFBMI59 -0.0133589051
## SMOKE1002 0.1186408841
## SMOKE1007 -0.3227703841
## COPD2 -0.1146803040
## COPD7 -0.2100418150
## EMPHY2 -0.0119953775
## EMPHY7 -0.4306694169
## EMPHY9 0.0158954894
## DEPRESS2 0.0496872761
## DEPRESS7 0.0435764855
## DEPRESS9 -0.3683416941
## BRONCH2 -0.0644667318
## BRONCH7 -0.3570145913
## BRONCH9 0.5477470454
## DUR.30D10 0.0783812228
## DUR.30D11 0.2622753285
## DUR.30D12 0.0778292172
## DUR.30D2 -0.0099787768
## DUR.30D6 -0.0063046250
## DUR.30D7 -0.4413631006
## INCINDT2 -0.2309764521
## INCINDT3 0.7463276137
## INCINDT7 -0.5225892789
## LAST.MD5 0.0434348649
## LAST.MD6 -0.0675088335
## LAST.MD7 -0.3285714325
## LAST.MD9 -0.5710486154
## LAST.MED5 -0.1749557026
## LAST.MED6 -0.2340047015
## LAST.MED7 -0.2972935433
## LAST.MED9 -1.5011993571
## LAST.SYMP2 -0.0149304285
## LAST.SYMP3 0.1037468900
## LAST.SYMP4 0.0787196299
## LAST.SYMP5 0.0738268250
## LAST.SYMP6 0.2212507509
## LAST.SYMP7 -0.0216193311
## LAST.SYMP9 -0.2944037512
## COMPASTH11 -0.2050397620
## COMPASTH2 -0.2085562100
## COMPASTH3 -0.0304637973
## COMPASTH6 -0.0127191965
## COMPASTH7 -0.3240287061
## INS22 0.1619687023
## INS25 -0.0027254990
## INS27 -0.0889205227
## INS29 -1.9283450931
## ER.VISIT2 -0.0995234384
## ER.VISIT5 -0.1307948414
## ER.VISIT6 -0.2151046502
## ER.VISIT7 -0.9032624092
## HOSP.VST2 -0.0127124439
## HOSP.VST4 -0.0329132274
## HOSP.VST5 -0.1254808417
## HOSP.VST6 -0.2092959558
## HOSP.VST7 -1.0928568540
## HOSP.VST9 -2.6073184150
## ASRXCOST2 -0.0455939879
## ASRXCOST5 -0.1153836580
## ASRXCOST7 0.1327327507
## ASRXCOST9 -0.8039086057
## WORKTALK2 -0.5040869742
## WORKTALK6 -0.4678981762
## WORKTALK7 -0.3132559117
## WORKTALK8 0.3544284083
## WORKTALK9 0.0071385665
## ACT.DAY302 0.1277000083
## ACT.DAY303 0.1598666842
## ACT.DAY304 0.1249975782
## ACT.DAY305 -0.1174198946
## ACT.DAY307 -0.0230500270
Lambda that minimises MSE in Lasso
The plot shows that the log of the optimal value of lambda (i.e. the one that minimises the cross-validated error) is approximately -10. The exact value can be viewed by examining the variable lambda_min in the code below. In general, though, the objective of regularisation is to balance accuracy and simplicity. In the present context, this means a model with the smallest number of coefficients that still gives good accuracy. To this end, the cv.glmnet function also reports lambda.1se, the value of lambda that gives the simplest model whose error lies within one standard error of the minimum.
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 1640 711
## 1 146 217
##
## Accuracy : 0.6842
## 95% CI : (0.6664, 0.7017)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.002059
##
## Kappa : 0.1781
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.23384
## Specificity : 0.91825
## Pos Pred Value : 0.59780
## Neg Pred Value : 0.69758
## Prevalence : 0.34193
## Detection Rate : 0.07996
## Detection Prevalence : 0.13375
## Balanced Accuracy : 0.57604
##
## 'Positive' Class : 1
##
## 96 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) -0.8606406405
## (Intercept) .
## SEX2 0.2594402162
## AGEG.F72 0.2340892110
## AGEG.F73 0.0034616959
## AGEG.F74 .
## AGEG.F75 -0.1503925580
## AGEG.F76 -0.3141421454
## AGEG.F77 -0.5410988285
## X_IMPRACE2 0.2789342148
## X_IMPRACE3 -0.2088864711
## X_IMPRACE4 -0.0796924307
## X_IMPRACE5 0.0499448037
## X_IMPRACE6 .
## EDUCAL2 -0.1306405673
## EDUCAL3 -0.1452989939
## EDUCAL4 .
## EDUCAL5 0.0213667953
## EDUCAL6 .
## X_INCOMG2 0.1528571389
## X_INCOMG3 .
## X_INCOMG4 .
## X_INCOMG5 0.0480297485
## X_INCOMG9 -0.0359921061
## X_RFBMI52 .
## X_RFBMI59 .
## SMOKE1002 0.1112354961
## SMOKE1007 -0.1956145109
## COPD2 -0.1086642369
## COPD7 -0.1401300934
## EMPHY2 .
## EMPHY7 -0.3890223792
## EMPHY9 .
## DEPRESS2 0.0411592140
## DEPRESS7 .
## DEPRESS9 -0.0220003046
## BRONCH2 -0.0516858120
## BRONCH7 -0.3190504035
## BRONCH9 .
## DUR.30D10 0.1244464432
## DUR.30D11 0.2430586667
## DUR.30D12 0.0480299274
## DUR.30D2 -0.0087203262
## DUR.30D6 .
## DUR.30D7 -0.4637068965
## INCINDT2 .
## INCINDT3 1.0114193590
## INCINDT7 -0.1267449349
## LAST.MD5 .
## LAST.MD6 -0.0971328786
## LAST.MD7 -0.4076446404
## LAST.MD9 -0.6022595780
## LAST.MED5 -0.2122311723
## LAST.MED6 -0.2848120938
## LAST.MED7 -0.3795487381
## LAST.MED9 -1.9850509111
## LAST.SYMP2 .
## LAST.SYMP3 0.1162080311
## LAST.SYMP4 0.0185077766
## LAST.SYMP5 0.0135376553
## LAST.SYMP6 0.1735428119
## LAST.SYMP7 .
## LAST.SYMP9 -0.2513469618
## COMPASTH11 -0.1923410551
## COMPASTH2 -0.1789780883
## COMPASTH3 -0.0003082605
## COMPASTH6 .
## COMPASTH7 -0.2199743590
## INS22 0.1598095306
## INS25 .
## INS27 .
## INS29 -1.8091008137
## ER.VISIT2 -0.1002383815
## ER.VISIT5 -0.2548500002
## ER.VISIT6 -0.3308156833
## ER.VISIT7 -0.8113733714
## HOSP.VST2 .
## HOSP.VST4 .
## HOSP.VST5 -0.0948642873
## HOSP.VST6 -0.0252333067
## HOSP.VST7 -0.7813748722
## HOSP.VST9 -1.9292755596
## ASRXCOST2 -0.0016379281
## ASRXCOST5 .
## ASRXCOST7 .
## ASRXCOST9 -0.0787455438
## WORKTALK2 -0.5490490040
## WORKTALK6 -0.5184983953
## WORKTALK7 -0.3338646176
## WORKTALK8 .
## WORKTALK9 .
## ACT.DAY302 0.1080484254
## ACT.DAY303 0.1359887589
## ACT.DAY304 0.0759042895
## ACT.DAY305 .
## ACT.DAY307 .
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 1716 822
## 1 70 106
##
## Accuracy : 0.6713
## 95% CI : (0.6533, 0.689)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.07509
##
## Kappa : 0.0932
##
## Mcnemar's Test P-Value : < 2e-16
##
## Sensitivity : 0.11422
## Specificity : 0.96081
## Pos Pred Value : 0.60227
## Neg Pred Value : 0.67612
## Prevalence : 0.34193
## Detection Rate : 0.03906
## Detection Prevalence : 0.06485
## Balanced Accuracy : 0.53752
##
## 'Positive' Class : 1
##
fit <- glmnet(x, y, family = "multinomial")
tLL <- fit$nulldev - deviance(fit)
k <- fit$df
n <- fit$nobs
AICc <- -tLL + 2*k + 2*k*(k+1)/(n-k-1)
AICc
## [1] -975.4119
## [1] -1043.829
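The AICc expression used above can be wrapped in a small helper so the arithmetic is easier to follow. This is a sketch of the same formula only; the function name `aicc_from_deviance` and the input values are hypothetical and are not the values behind the two numbers printed above.

```r
# AICc as written in the report: -tLL + 2k + 2k(k + 1) / (n - k - 1),
# where tLL is the drop in deviance from the null model.
aicc_from_deviance <- function(null_dev, resid_dev, k, n) {
  stopifnot(all(n - k - 1 > 0))  # the small-sample correction needs n > k + 1
  tLL <- null_dev - resid_dev
  -tLL + 2 * k + 2 * k * (k + 1) / (n - k - 1)
}

# Hypothetical inputs chosen so the result is easy to check by hand.
aicc_small <- aicc_from_deviance(null_dev = 1000, resid_dev = 900, k = 5, n = 101)
print(aicc_small)

# The helper is vectorised, so it can take one k and one deviance per lambda.
aicc_path <- aicc_from_deviance(null_dev = 1000, resid_dev = c(950, 900),
                                k = c(2, 5), n = 101)
print(aicc_path)
```

Lower values indicate a better trade-off between fit and the number of non-zero coefficients.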
## Confusion Matrix and Statistics
##
## Reference
## Prediction F T
## F 1635 712
## T 151 216
##
## Accuracy : 0.682
## 95% CI : (0.6641, 0.6995)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.004356
##
## Kappa : 0.1734
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.23276
## Specificity : 0.91545
## Pos Pred Value : 0.58856
## Neg Pred Value : 0.69663
## Prevalence : 0.34193
## Detection Rate : 0.07959
## Detection Prevalence : 0.13522
## Balanced Accuracy : 0.57411
##
## 'Positive' Class : T
##
## Partial Least Squares
##
## 10858 samples
## 23 predictor
## 2 classes: 'F', 'T'
##
## Pre-processing: centered (94), scaled (94)
## Resampling: Cross-Validated (10 fold, repeated 3 times)
## Summary of sample sizes: 9772, 9772, 9772, 9772, 9772, 9773, ...
## Resampling results across tuning parameters:
##
## ncomp ROC Sens Spec
## 1 0.6393174 1.0000000 0.0000000
## 2 0.6692877 0.9255111 0.1661172
## 3 0.6738019 0.9249578 0.1762821
## 4 0.6758076 0.9159058 0.2019231
## 5 0.6772703 0.9143360 0.2068681
## 6 0.6784624 0.9093482 0.2118132
## 7 0.6791885 0.9091633 0.2141941
## 8 0.6791625 0.9091169 0.2168498
## 9 0.6790442 0.9082397 0.2171245
## 10 0.6791496 0.9077783 0.2169414
## 11 0.6794495 0.9082865 0.2183150
## 12 0.6795904 0.9087943 0.2172161
## 13 0.6795758 0.9085632 0.2170330
## 14 0.6795364 0.9084245 0.2177656
## 15 0.6795312 0.9085169 0.2173993
##
## ROC was used to select the optimal model using the largest value.
## The final value used for the model was ncomp = 12.
## Confusion Matrix and Statistics
##
## Reference
## Prediction F T
## F 1636 713
## T 150 215
##
## Accuracy : 0.682
## 95% CI : (0.6641, 0.6995)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.004356
##
## Kappa : 0.1729
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.23168
## Specificity : 0.91601
## Pos Pred Value : 0.58904
## Neg Pred Value : 0.69647
## Prevalence : 0.34193
## Detection Rate : 0.07922
## Detection Prevalence : 0.13449
## Balanced Accuracy : 0.57385
##
## 'Positive' Class : T
##
## glmnet
##
## 10858 samples
## 23 predictor
## 2 classes: 'F', 'T'
##
## Pre-processing: centered (94), scaled (94)
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 9773, 9772, 9772, 9772, 9772, 9772, ...
## Resampling results across tuning parameters:
##
## alpha lambda ROC Sens Spec
## 0.1 3.206429e-05 0.6802415 0.8943211 0.241318681
## 0.1 7.407266e-05 0.6802415 0.8943211 0.241318681
## 0.1 1.711175e-04 0.6802415 0.8943211 0.241318681
## 0.1 3.953035e-04 0.6802415 0.8943211 0.241318681
## 0.1 9.132025e-04 0.6802846 0.8947644 0.240329670
## 0.1 2.109616e-03 0.6804602 0.8966207 0.236648352
## 0.1 4.873487e-03 0.6806421 0.9023842 0.227747253
## 0.1 1.125839e-02 0.6804576 0.9143263 0.205549451
## 0.1 2.600834e-02 0.6787664 0.9354952 0.160329670
## 0.1 6.008263e-02 0.6743981 0.9670277 0.084120879
## 0.2 3.206429e-05 0.6802413 0.8941548 0.241428571
## 0.2 7.407266e-05 0.6802413 0.8941548 0.241428571
## 0.2 1.711175e-04 0.6802413 0.8941548 0.241428571
## 0.2 3.953035e-04 0.6802557 0.8941825 0.241428571
## 0.2 9.132025e-04 0.6804270 0.8952354 0.239450549
## 0.2 2.109616e-03 0.6806806 0.8988929 0.233956044
## 0.2 4.873487e-03 0.6809683 0.9074549 0.220549451
## 0.2 1.125839e-02 0.6800156 0.9229714 0.186978022
## 0.2 2.600834e-02 0.6768876 0.9508733 0.126043956
## 0.2 6.008263e-02 0.6686358 0.9924356 0.014175824
## 0.3 3.206429e-05 0.6802420 0.8940440 0.241758242
## 0.3 7.407266e-05 0.6802420 0.8940440 0.241758242
## 0.3 1.711175e-04 0.6802420 0.8940440 0.241758242
## 0.3 3.953035e-04 0.6802907 0.8942933 0.241153846
## 0.3 9.132025e-04 0.6805116 0.8957895 0.238241758
## 0.3 2.109616e-03 0.6808756 0.9008049 0.230604396
## 0.3 4.873487e-03 0.6810121 0.9116110 0.211923077
## 0.3 1.125839e-02 0.6790696 0.9316712 0.169285714
## 0.3 2.600834e-02 0.6738435 0.9614306 0.096208791
## 0.3 6.008263e-02 0.6621744 1.0000000 0.000000000
## 0.4 3.206429e-05 0.6802456 0.8942103 0.241703297
## 0.4 7.407266e-05 0.6802456 0.8942103 0.241703297
## 0.4 1.711175e-04 0.6802456 0.8942103 0.241703297
## 0.4 3.953035e-04 0.6803585 0.8946535 0.240934066
## 0.4 9.132025e-04 0.6806077 0.8969256 0.236923077
## 0.4 2.109616e-03 0.6810925 0.9028553 0.228076923
## 0.4 4.873487e-03 0.6808637 0.9160442 0.203736264
## 0.4 1.125839e-02 0.6781757 0.9390136 0.153296703
## 0.4 2.600834e-02 0.6710244 0.9717103 0.066648352
## 0.4 6.008263e-02 0.6573615 1.0000000 0.000000000
## 0.5 3.206429e-05 0.6802603 0.8942103 0.241813187
## 0.5 7.407266e-05 0.6802603 0.8942103 0.241813187
## 0.5 1.711175e-04 0.6802612 0.8942103 0.241813187
## 0.5 3.953035e-04 0.6804279 0.8950691 0.240549451
## 0.5 9.132025e-04 0.6807016 0.8978677 0.235549451
## 0.5 2.109616e-03 0.6812028 0.9053213 0.224615385
## 0.5 4.873487e-03 0.6804699 0.9194523 0.195604396
## 0.5 1.125839e-02 0.6772468 0.9454422 0.139505495
## 0.5 2.600834e-02 0.6681134 0.9828764 0.036483516
## 0.5 6.008263e-02 0.6538381 1.0000000 0.000000000
## 0.6 3.206429e-05 0.6802591 0.8941272 0.241813187
## 0.6 7.407266e-05 0.6802591 0.8941272 0.241813187
## 0.6 1.711175e-04 0.6802724 0.8942103 0.241813187
## 0.6 3.953035e-04 0.6804657 0.8952353 0.239945055
## 0.6 9.132025e-04 0.6807795 0.8986711 0.234560440
## 0.6 2.109616e-03 0.6812633 0.9072608 0.221263736
## 0.6 4.873487e-03 0.6799424 0.9235530 0.187307692
## 0.6 1.125839e-02 0.6759076 0.9506517 0.125109890
## 0.6 2.600834e-02 0.6651068 0.9979220 0.002747253
## 0.6 6.008263e-02 0.6476886 1.0000000 0.000000000
## 0.7 3.206429e-05 0.6802623 0.8940718 0.241978022
## 0.7 7.407266e-05 0.6802623 0.8940718 0.241978022
## 0.7 1.711175e-04 0.6802898 0.8941272 0.241703297
## 0.7 3.953035e-04 0.6805081 0.8956233 0.239395604
## 0.7 9.132025e-04 0.6808912 0.8995027 0.233241758
## 0.7 2.109616e-03 0.6812659 0.9090341 0.218021978
## 0.7 4.873487e-03 0.6794641 0.9270440 0.179285714
## 0.7 1.125839e-02 0.6742793 0.9551963 0.111263736
## 0.7 2.600834e-02 0.6624273 1.0000000 0.000000000
## 0.7 6.008263e-02 0.6454073 1.0000000 0.000000000
## 0.8 3.206429e-05 0.6802720 0.8941272 0.241978022
## 0.8 7.407266e-05 0.6802720 0.8941272 0.241978022
## 0.8 1.711175e-04 0.6803014 0.8942657 0.241538462
## 0.8 3.953035e-04 0.6805375 0.8959004 0.238571429
## 0.8 9.132025e-04 0.6809874 0.9003893 0.231923077
## 0.8 2.109616e-03 0.6812187 0.9109459 0.214560440
## 0.8 4.873487e-03 0.6790118 0.9306183 0.171098901
## 0.8 1.125839e-02 0.6727659 0.9592694 0.099120879
## 0.8 2.600834e-02 0.6601809 1.0000000 0.000000000
## 0.8 6.008263e-02 0.6353261 1.0000000 0.000000000
## 0.9 3.206429e-05 0.6802894 0.8941826 0.242032967
## 0.9 7.407266e-05 0.6802894 0.8941826 0.242032967
## 0.9 1.711175e-04 0.6803392 0.8944319 0.241208791
## 0.9 3.953035e-04 0.6805658 0.8961775 0.238186813
## 0.9 9.132025e-04 0.6810919 0.9011929 0.230989011
## 0.9 2.109616e-03 0.6811689 0.9131346 0.210384615
## 0.9 4.873487e-03 0.6785547 0.9340816 0.163956044
## 0.9 1.125839e-02 0.6714683 0.9633978 0.087857143
## 0.9 2.600834e-02 0.6581901 1.0000000 0.000000000
## 0.9 6.008263e-02 0.6192685 1.0000000 0.000000000
## 1.0 3.206429e-05 0.6802743 0.8942103 0.241923077
## 1.0 7.407266e-05 0.6802743 0.8942103 0.241923077
## 1.0 1.711175e-04 0.6803633 0.8945981 0.241043956
## 1.0 3.953035e-04 0.6806191 0.8966485 0.237362637
## 1.0 9.132025e-04 0.6811696 0.9023566 0.230000000
## 1.0 2.109616e-03 0.6810133 0.9147696 0.206263736
## 1.0 4.873487e-03 0.6781338 0.9368800 0.157637363
## 1.0 1.125839e-02 0.6701328 0.9676096 0.074450549
## 1.0 2.600834e-02 0.6560262 1.0000000 0.000000000
## 1.0 6.008263e-02 0.5872780 1.0000000 0.000000000
##
## ROC was used to select the optimal model using the largest value.
## The final values used for the model were alpha = 0.7 and lambda = 0.002109616.
## alpha lambda
## 66 0.7 0.002109616
## Confusion Matrix and Statistics
##
## Reference
## Prediction F T
## F 1640 710
## T 146 218
##
## Accuracy : 0.6846
## 95% CI : (0.6667, 0.7021)
## No Information Rate : 0.6581
## P-Value [Acc > NIR] : 0.001807
##
## Kappa : 0.1793
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.23491
## Specificity : 0.91825
## Pos Pred Value : 0.59890
## Neg Pred Value : 0.69787
## Prevalence : 0.34193
## Detection Rate : 0.08032
## Detection Prevalence : 0.13412
## Balanced Accuracy : 0.57658
##
## 'Positive' Class : T
##
## glm.mod11 glm.mod12 ridge.mod1 ridge.mod2 lasso.mod1 lasso.mod2
## Accuracy 0.6879145 0.6827561 0.66064849 0.6580693 0.6842299 0.6713338
## Precision 0.6004963 0.5865633 0.58974359 NA 0.5977961 0.6022727
## Sensitivity 0.2607759 0.2446121 0.02478448 0.0000000 0.2338362 0.1142241
## Specificity 0.9098544 0.9104143 0.99104143 1.0000000 0.9182531 0.9608063
## F1 0.3636364 0.3452471 0.04756980 NA 0.3361735 0.1920290
## pls.mod1 pls.mod2 en.mod
## Accuracy 0.6820192 0.6820192 0.6845984
## Precision 0.5885559 0.5890411 0.5989011
## Sensitivity 0.2327586 0.2316810 0.2349138
## Specificity 0.9154535 0.9160134 0.9182531
## F1 0.3335907 0.3325599 0.3374613
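With specificity equal to 1 and sensitivity equal to 0, ridge.mod2 predicts the negative class for every case, which is why its precision and F1 are NA; it is uninformative rather than accurate. Across the remaining models the differences are small: glm.mod11 has the highest accuracy (0.688), sensitivity (0.261) and F1 (0.364), while lasso.mod1 and en.mod are close behind with accuracy near 0.684 and specificity 0.918.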
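The precision and F1 rows of the comparison table follow directly from the confusion-matrix counts. The sketch below recomputes them for lasso.mod1 and ridge.mod2 using the counts printed earlier; the helper name `prf` is our own.

```r
# Precision, sensitivity (recall) and F1 from confusion-matrix counts.
# Precision is undefined when the model never predicts the positive class.
prf <- function(tp, fp, fn) {
  precision <- if (tp + fp == 0) NA_real_ else tp / (tp + fp)
  recall <- tp / (tp + fn)
  f1 <- 2 * precision * recall / (precision + recall)
  c(Precision = precision, Sensitivity = recall, F1 = f1)
}

lasso1 <- prf(tp = 217, fp = 146, fn = 711)  # counts from the lasso.mod1 matrix
ridge2 <- prf(tp = 0, fp = 0, fn = 928)      # counts from the ridge.mod2 matrix

print(round(rbind(lasso.mod1 = lasso1, ridge.mod2 = ridge2), 7))
```

The NA entries for ridge.mod2 come from dividing by zero predicted positives, not from a perfect score.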
We can plot the ROC curve and extract the AUC value.
Best Model with AUC
The Lasso model has the best Area Under the Curve.
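For readers who want to see what the AUC measures, it can be computed in base R from the ranks of the predicted scores (the Mann-Whitney formulation). This is a sketch with made-up scores and labels, not the code or data used to produce the figure; `auc_rank` is a hypothetical helper name.

```r
# AUC as the probability that a randomly chosen positive case
# receives a higher score than a randomly chosen negative case.
auc_rank <- function(score, label) {
  stopifnot(length(score) == length(label), all(label %in% c(0, 1)))
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  r <- rank(score)  # average ranks, so ties count as one half
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical predicted probabilities and true classes.
label <- c(0, 0, 1, 0, 1, 1, 0, 1)
score <- c(0.10, 0.35, 0.40, 0.45, 0.60, 0.70, 0.20, 0.30)

auc_example <- auc_rank(score, label)
print(auc_example)
```

Because the AUC depends only on the ordering of the scores, it does not change with the classification threshold, unlike the sensitivity and specificity values in the table.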
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 8208 3520
## 1 796 1048
##
## Accuracy : 0.682
## 95% CI : (0.6741, 0.6898)
## No Information Rate : 0.6634
## P-Value [Acc > NIR] : 2.227e-06
##
## Kappa : 0.1653
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.22942
## Specificity : 0.91159
## Pos Pred Value : 0.56833
## Neg Pred Value : 0.69986
## Prevalence : 0.33658
## Detection Rate : 0.07722
## Detection Prevalence : 0.13587
## Balanced Accuracy : 0.57051
##
## 'Positive' Class : 1
##
A dot in place of a coefficient means that the lasso has shrunk that coefficient to exactly zero, so the corresponding level of the variable is dropped from the model.
## 96 x 1 sparse Matrix of class "dgCMatrix"
## s0
## (Intercept) -0.779627250
## (Intercept) .
## SEX2 0.277009383
## AGEG.F72 0.330360460
## AGEG.F73 0.070974710
## AGEG.F74 .
## AGEG.F75 -0.136878425
## AGEG.F76 -0.256333773
## AGEG.F77 -0.493612960
## X_IMPRACE2 0.292997741
## X_IMPRACE3 -0.286803639
## X_IMPRACE4 .
## X_IMPRACE5 .
## X_IMPRACE6 -0.004331095
## EDUCAL2 -0.054088606
## EDUCAL3 -0.067669698
## EDUCAL4 .
## EDUCAL5 0.035873647
## EDUCAL6 .
## X_INCOMG2 0.146624310
## X_INCOMG3 .
## X_INCOMG4 .
## X_INCOMG5 0.064057869
## X_INCOMG9 -0.036774320
## X_RFBMI52 .
## X_RFBMI59 -0.028364176
## SMOKE1002 0.125876474
## SMOKE1007 .
## COPD2 -0.124264966
## COPD7 .
## EMPHY2 0.013162274
## EMPHY7 -0.402759362
## EMPHY9 .
## DEPRESS2 0.018738710
## DEPRESS7 .
## DEPRESS9 .
## BRONCH2 -0.037758413
## BRONCH7 -0.328147049
## BRONCH9 .
## DUR.30D10 0.079177130
## DUR.30D11 0.184955227
## DUR.30D12 0.003960264
## DUR.30D2 .
## DUR.30D6 .
## DUR.30D7 -0.420857016
## INCINDT2 .
## INCINDT3 0.938995906
## INCINDT7 .
## LAST.MD5 .
## LAST.MD6 -0.119977273
## LAST.MD7 -0.401266743
## LAST.MD9 -0.585236929
## LAST.MED5 -0.219709485
## LAST.MED6 -0.296826230
## LAST.MED7 -0.415750708
## LAST.MED9 -2.080335812
## LAST.SYMP2 .
## LAST.SYMP3 0.122806811
## LAST.SYMP4 0.086519091
## LAST.SYMP5 .
## LAST.SYMP6 0.133828101
## LAST.SYMP7 .
## LAST.SYMP9 -0.314854748
## COMPASTH11 -0.199542636
## COMPASTH2 -0.209621740
## COMPASTH3 -0.035235080
## COMPASTH6 .
## COMPASTH7 -0.191927449
## INS22 0.098402396
## INS25 -0.017884077
## INS27 -0.093557160
## INS29 -0.988939087
## ER.VISIT2 -0.059881236
## ER.VISIT5 -0.245948577
## ER.VISIT6 -0.325013005
## ER.VISIT7 -0.731852448
## HOSP.VST2 .
## HOSP.VST4 .
## HOSP.VST5 -0.093194618
## HOSP.VST6 -0.024683506
## HOSP.VST7 -0.782042795
## HOSP.VST9 -1.839595107
## ASRXCOST2 .
## ASRXCOST5 .
## ASRXCOST7 .
## ASRXCOST9 -0.177166670
## WORKTALK2 -0.548777713
## WORKTALK6 -0.564731481
## WORKTALK7 -0.230557453
## WORKTALK8 .
## WORKTALK9 -0.289124337
## ACT.DAY302 0.098410549
## ACT.DAY303 0.149522931
## ACT.DAY304 0.142025917
## ACT.DAY305 .
## ACT.DAY307 .
An odds ratio greater than 1 means higher odds of the outcome compared with the baseline level, and a value below 1 means lower odds. For example, for the SEX variable, women (SEX2) have higher odds of good asthma self-management skills than men (SEX1, the baseline), with an odds ratio of about 1.32. The other variables can be interpreted in the same way.
## 96 x 1 Matrix of class "dgeMatrix"
## s0
## (Intercept) 0.4585769
## (Intercept) 1.0000000
## SEX2 1.3191787
## AGEG.F72 1.3914696
## AGEG.F73 1.0735541
## AGEG.F74 1.0000000
## AGEG.F75 0.8720762
## AGEG.F76 0.7738836
## AGEG.F77 0.6104170
## X_IMPRACE2 1.3404398
## X_IMPRACE3 0.7506591
## X_IMPRACE4 1.0000000
## X_IMPRACE5 1.0000000
## X_IMPRACE6 0.9956783
## EDUCAL2 0.9473482
## EDUCAL3 0.9345691
## EDUCAL4 1.0000000
## EDUCAL5 1.0365249
## EDUCAL6 1.0000000
## X_INCOMG2 1.1579189
## X_INCOMG3 1.0000000
## X_INCOMG4 1.0000000
## X_INCOMG5 1.0661541
## X_INCOMG9 0.9638936
## X_RFBMI52 1.0000000
## X_RFBMI59 0.9720343
## SMOKE1002 1.1341421
## SMOKE1007 1.0000000
## COPD2 0.8831458
## COPD7 1.0000000
## EMPHY2 1.0132493
## EMPHY7 0.6684729
## EMPHY9 1.0000000
## DEPRESS2 1.0189154
## DEPRESS7 1.0000000
## DEPRESS9 1.0000000
## BRONCH2 0.9629455
## BRONCH7 0.7202571
## BRONCH9 1.0000000
## DUR.30D10 1.0823960
## DUR.30D11 1.2031646
## DUR.30D12 1.0039681
## DUR.30D2 1.0000000
## DUR.30D6 1.0000000
## DUR.30D7 0.6564840
## INCINDT2 1.0000000
## INCINDT3 2.5574122
## INCINDT7 1.0000000
## LAST.MD5 1.0000000
## LAST.MD6 0.8869406
## LAST.MD7 0.6694715
## LAST.MD9 0.5569739
## LAST.MED5 0.8027520
## LAST.MED6 0.7431731
## LAST.MED7 0.6598447
## LAST.MED9 0.1248883
## LAST.SYMP2 1.0000000
## LAST.SYMP3 1.1306660
## LAST.SYMP4 1.0903722
## LAST.SYMP5 1.0000000
## LAST.SYMP6 1.1431963
## LAST.SYMP7 1.0000000
## LAST.SYMP9 0.7298949
## COMPASTH11 0.8191053
## COMPASTH2 0.8108909
## COMPASTH3 0.9653784
## COMPASTH6 1.0000000
## COMPASTH7 0.8253667
## INS22 1.1034067
## INS25 0.9822749
## INS27 0.9106860
## INS29 0.3719711
## ER.VISIT2 0.9418764
## ER.VISIT5 0.7819624
## ER.VISIT6 0.7225180
## ER.VISIT7 0.4810171
## HOSP.VST2 1.0000000
## HOSP.VST4 1.0000000
## HOSP.VST5 0.9110162
## HOSP.VST6 0.9756186
## HOSP.VST7 0.4574705
## HOSP.VST9 0.1588817
## ASRXCOST2 1.0000000
## ASRXCOST5 1.0000000
## ASRXCOST7 1.0000000
## ASRXCOST9 0.8376402
## WORKTALK2 0.5776554
## WORKTALK6 0.5685128
## WORKTALK7 0.7940908
## WORKTALK8 1.0000000
## WORKTALK9 0.7489191
## ACT.DAY302 1.1034157
## ACT.DAY303 1.1612801
## ACT.DAY304 1.1526065
## ACT.DAY305 1.0000000
## ACT.DAY307 1.0000000
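The odds ratios above are the exponentials of the lasso coefficients. The sketch below reproduces a few of them from coefficients copied out of the earlier lasso output; the vector is a hand-picked subset for illustration, with 0 standing in for a coefficient printed as a dot.

```r
# A few lasso coefficients (log-odds scale) copied from the output above.
lasso_coef <- c(SEX2 = 0.277009383,
                AGEG.F74 = 0,            # printed as "." in the sparse matrix
                INCINDT3 = 0.938995906,
                LAST.MED9 = -2.080335812)

odds_ratio <- exp(lasso_coef)
print(round(odds_ratio, 7))

# Percentage change in the odds relative to the baseline level.
pct_change <- (odds_ratio - 1) * 100
print(round(pct_change, 1))
```

A coefficient that the lasso sets to zero gives an odds ratio of exactly 1, which is why the dropped levels appear as 1.0000000 in the table.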