Blame It on The Alcohol An Event History Analysis of Alcohol Consumption
Author
Jacob M. Souch
Introduction
Data and Materials
NHEFS: The NHEFS will be used to conduct exploratory empirical analysis. NHEFS is a national longitudinal study that was jointly initiated by the National Center for Health Statistics and the National Institute on Aging in collaboration with other agencies of the Public Health Service.15,16 The NHEFs includes 14,407 NHANES I participants examined in 1971 to 1975. Respondents were 25 to 74 years old, corresponding to birth years 1897 to 1946. Additional follow-up of the NHEFS population was conducted in 1986, 1987, and 1992 using the same design and data collection procedures. The 1992 Mortality Data file contains death certificate information collected during each follow-up period coded according to ICD-9 multiple-cause-of-death procedures for all 4,497 deceased subjects identified through 1992. NHEFS data will be obtained through ICPSR. The formatted NHEFS data will be analyzed in R using the survival package in R. Kaplan-Meier plots will be generated by subpopulation (race/ethnicity) and by alcohol type. Daily alcohol consumption will be calculated[1] for total alcohol consumption (beer, wine whisky) and each individual drink type to identify differential risks. Survminer will be used to find median survival rates and summary statistics. Cox proportional hazards models will be used to estimate the effect of alcohol consumption on all-cause mortality risk while adjusting for age, sex, race, BMI, self-rated health, and educational achievement. Death is indicated as the event in the time-to-event analysis. Year of death through 1992 and age at interview can be used to construct age at death. Deaths not observed by 1992 are right censored in all subsequent analyses. All-cause death is considered for simplicity.
Data File
Description
N
NHANES I
Baseline characteristics of NHANES participants.
18,836
NHEFS 1992 Interview
Interview items obtained at folllow up in 1992
9,281
NHEFS 1992 Vital Tracing and Status
Status of participant (dead, alive, lost to followup) as of 1992
14,407
NHEFS 1992 Mortality
Detailed mortality items of respondents as of 1992
4,609
library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.3.2
Warning: package 'readr' was built under R version 4.3.2
Warning: package 'purrr' was built under R version 4.3.2
Warning: package 'dplyr' was built under R version 4.3.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.3 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)
Attaching package: 'janitor'
The following objects are masked from 'package:stats':
chisq.test, fisher.test
NHEFS_Mortality92 <-read.csv("NHEFS_Mortality92.csv") NHEFS_Mortality92<- NHEFS_Mortality92 %>%select(SEQNUM, DTAGEDTH, MARITALSTATUS = MARTSTAT, EDUCATION = EDUCATN)NHEFS_Mortality92$EDUCATION <- car::Recode(NHEFS_Mortality92$EDUCATION,"'00' = 'No formal education';'09' = '1 year of high school';10 = '2 years of high school';11 = '3 years of high school';12 = '4 years of high school';13 = '1 year of college';14 = '2 years of college';15 = '3 years of college';16 = '4 years of college';17 = '5 or more years of college';99 = 'Not stated'")NHEFS_Mortality92$MARITALSTATUS <- car::Recode(NHEFS_Mortality92$MARITALSTATUS,"1 = 'Never married, single'; 2 = 'Married'; 3 = 'Widowed'; 4 = 'Divorced'; 5 = 'Marital status not on certificate'; 9 = 'Marital status not stated'")NHEFS_Mortality92$DTAGEDTH <- car::Recode(NHEFS_Mortality92$DTAGEDTH,"999 = 'AGE NOT STATED'")NHEFS_Mortality92$DTAGEDTH <-as.numeric(NHEFS_Mortality92$DTAGEDTH)
Warning: NAs introduced by coercion
NHANES I Data Files
NHANESI <-read_xlsx("NHANESI.xlsx")#Selected Variables NHANESI <- NHANESI %>%select(SEQNUM = SEQN,DRNKPYR = N1AH0404, DRINKFRQ71 = N1AH0405, DRNKTYPE71 = N1AH0406, DRNKVOL71 = N1AH0407, EDU = N1AH0020, STRATA = N1AH0194, PSU = N1AH0196, WAGES = N1AH0034)NHANESI$BACHELORS <- NHANESI$EDU ==45NHANESI$HSGRAD <- NHANESI$EDU >=34& NHANESI$EDU !=88NHANESI$INCOMEGT25K <- NHANESI$WAGES ==22 NHANESI$EDU <- car::Recode(NHANESI$EDU, "10='None'; 21='1st grade'; 22='2nd grade'; 23='3rd grade'; 24='4th grade'; 25='5th grade'; 26='6th grade'; 27='7th grade'; 28='8th grade'; 31='9th grade'; 32='10th grade'; 33='11th grade'; 34='12th grade'; 41='First year of college'; 42='Second year of college'; 43='Third year of college'; 44='Fourth year of college'; 45='Graduate'; 88='Blank, but applicable'") NHANESI$DRNKTYPE71 <- car::Recode(NHANESI$DRNKTYPE71,"1 = 'Beer'; 2 = 'Wine'; 3 = 'Liquor'; 8 = 'None'") NHANESI$DRNKTYPE71 <-relevel(as.factor(NHANESI$DRNKTYPE71),"Beer") NHANESI$DRINKFRQ71 <- car::Recode( NHANESI$DRINKFRQ71,"1 = 'Everyday';2 = 'Just about everyday';3 = 'About 2 or 3 times a week';4 = 'About 1-4 times a month';5 = 'More than 3 but less than 12 times a year';6 = 'No more than 2 or 3 times a year';8 = 'Blank, but applicable'" ) NHANESI$DRINKFRQ71 <-relevel(as.factor(NHANESI$DRINKFRQ71),'No more than 2 or 3 times a year') NHANESI$DRNKVOL71 <- car::Recode(NHANESI$DRNKVOL71,"88 = 'Blank'" )#Selected Survey NHANESIGROWTH <-read.xport("growthch.xpt") NHANESIGROWTH <- NHANESIGROWTH[NHANESIGROWTH$SURVEY ==3,] NHANESIGROWTH <- NHANESIGROWTH %>%select(SEQNUM = SEQN, AGE_EXAM, BMI, WEIGHT = WT_KG, HEIGHT = HT_CM)weights <-read.csv("NHEFS_weights.csv")
t1data <- data %>%select(`Age at Exam`= NEXAMAGE,`Age at Event`= ageevent,`Vital Tracing Status`= CVITALST,`Race`= RACE,`Bachelor's Degree`= BACHELORS,`High School Graduate`= HSGRAD,`Income Greater Than 25,000`= INCOMEGT25K,`Self-Identified Type of Drinker at NHANES I`= DRNKTYPE71,`Self-Reported Frequency of Drinking`= DRINKFRQ71,Sex = SEX)table1::table1(~`Age at Exam`+`Age at Event`+`Vital Tracing Status`+ Race +`Bachelor's Degree`+`High School Graduate`+`Income Greater Than 25,000`+`Self-Identified Type of Drinker at NHANES I`+`Self-Reported Frequency of Drinking`| Sex , data = t1data, footnote ="Values presented are unweighted.", caption ="TABLE 1. SUMMARY STATISTICS OF MEASURES")
TABLE 1. SUMMARY STATISTICS OF MEASURES
Female (N=8596)
Male (N=5811)
Overall (N=14407)
Values presented are unweighted.
Age at Exam
Mean (SD)
47.2 (15.5)
51.5 (15.3)
49.0 (15.6)
Median [Min, Max]
44.0 [24.0, 77.0]
53.0 [24.0, 76.0]
48.0 [24.0, 77.0]
Age at Event
Mean (SD)
63.9 (15.2)
66.4 (13.9)
64.9 (14.8)
Median [Min, Max]
63.0 [25.0, 97.0]
68.0 [25.0, 95.0]
65.0 [25.0, 97.0]
Vital Tracing Status
Alive
5797 (67.4%)
3004 (51.7%)
8801 (61.1%)
Dead
2134 (24.8%)
2470 (42.5%)
4604 (32.0%)
Traced alive with direct subject contact
173 (2.0%)
87 (1.5%)
260 (1.8%)
Traced alive without direct subject contact
116 (1.3%)
80 (1.4%)
196 (1.4%)
Unknown
376 (4.4%)
170 (2.9%)
546 (3.8%)
Race
White
7133 (83.0%)
4903 (84.4%)
12036 (83.5%)
Aleut, Eskimo or American Indian
13 (0.2%)
17 (0.3%)
30 (0.2%)
Asian/Pacific Islander
45 (0.5%)
48 (0.8%)
93 (0.6%)
Black
1376 (16.0%)
823 (14.2%)
2199 (15.3%)
Other
29 (0.3%)
20 (0.3%)
49 (0.3%)
Bachelor's Degree
Yes
381 (4.4%)
274 (4.7%)
655 (4.5%)
No
6488 (75.5%)
4205 (72.4%)
10693 (74.2%)
Missing
1727 (20.1%)
1332 (22.9%)
3059 (21.2%)
High School Graduate
Yes
3363 (39.1%)
2104 (36.2%)
5467 (37.9%)
No
3506 (40.8%)
2375 (40.9%)
5881 (40.8%)
Missing
1727 (20.1%)
1332 (22.9%)
3059 (21.2%)
Income Greater Than 25,000
Yes
338 (3.9%)
285 (4.9%)
623 (4.3%)
No
8258 (96.1%)
5526 (95.1%)
13784 (95.7%)
Self-Identified Type of Drinker at NHANES I
Beer
1281 (14.9%)
2195 (37.8%)
3476 (24.1%)
Liquor
2033 (23.7%)
1454 (25.0%)
3487 (24.2%)
None
2 (0.0%)
1 (0.0%)
3 (0.0%)
Wine
1015 (11.8%)
360 (6.2%)
1375 (9.5%)
Missing
4265 (49.6%)
1801 (31.0%)
6066 (42.1%)
Self-Reported Frequency of Drinking
No more than 2 or 3 times a year
1361 (15.8%)
517 (8.9%)
1878 (13.0%)
About 1-4 times a month
2062 (24.0%)
1461 (25.1%)
3523 (24.5%)
About 2 or 3 times a week
785 (9.1%)
937 (16.1%)
1722 (12.0%)
Blank, but applicable
9 (0.1%)
3 (0.1%)
12 (0.1%)
Everyday
385 (4.5%)
875 (15.1%)
1260 (8.7%)
Just about everyday
204 (2.4%)
356 (6.1%)
560 (3.9%)
More than 3 but less than 12 times a year
899 (10.5%)
394 (6.8%)
1293 (9.0%)
Missing
2891 (33.6%)
1268 (21.8%)
4159 (28.9%)
Creating the Survival Object
Kaplan-Meier
km <-with(data, Surv(NEXAMAGE,ageevent,event))
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
summary(km)
start stop status
Min. :24.00 Min. :25.00 Min. :0.0000
1st Qu.:35.00 1st Qu.:53.00 1st Qu.:0.0000
Median :48.00 Median :65.00 Median :0.0000
Mean :49.13 Mean :64.92 Mean :0.3196
3rd Qu.:65.00 3rd Qu.:77.00 3rd Qu.:1.0000
Max. :77.00 Max. :97.00 Max. :1.0000
NA's :326
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
According to the plot above, median age at death is 79 years. This means that half of the subjects died before 79 and half of them died after 79 years.
data %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ SEX, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 2. Kaplan-Meier Plot of Survival by Sex")+add_quantile()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ SEX, data = data)
n= 14081, number of events= 4531
(326 observations deleted due to missingness)
coef exp(coef) se(coef) z Pr(>|z|)
SEXMale 0.56749 1.76383 0.02992 18.96 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
SEXMale 1.764 0.5669 1.663 1.87
Concordance= 0.577 (se = 0.004 )
Likelihood ratio test= 359.6 on 1 df, p=<2e-16
Wald test = 359.7 on 1 df, p=<2e-16
Score (logrank) test = 369.2 on 1 df, p=<2e-16
Figure 2. shows marked differences in survival by sex. The median survival time for males was 75 years, whereas the median survival time for females was 82 years. The logrank test 𝛘2 = 369.2, P < 0.001 reveals significant differences in survival by sex.
data %>%filter(RACE %in%c("Black","White")) %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ RACE + SEX, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 3. Kaplan-Meier Plot of Survival by Sex and Race")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
data %>%filter(RACE %in%c("Black","White")) %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ RACE + SEX, data = .)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ RACE, data = data)
n= 14081, number of events= 4531
(326 observations deleted due to missingness)
coef exp(coef) se(coef) z
RACEAleut, Eskimo or American Indian 0.30435 1.35574 0.33379 0.912
RACEAsian/Pacific Islander 0.16876 1.18384 0.20076 0.841
RACEBlack 0.33415 1.39675 0.03817 8.754
RACEOther -0.44689 0.63961 0.57769 -0.774
Pr(>|z|)
RACEAleut, Eskimo or American Indian 0.362
RACEAsian/Pacific Islander 0.401
RACEBlack <2e-16 ***
RACEOther 0.439
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
RACEAleut, Eskimo or American Indian 1.3557 0.7376 0.7048 2.608
RACEAsian/Pacific Islander 1.1838 0.8447 0.7987 1.755
RACEBlack 1.3967 0.7159 1.2961 1.505
RACEOther 0.6396 1.5634 0.2062 1.984
Concordance= 0.529 (se = 0.003 )
Likelihood ratio test= 73.13 on 4 df, p=5e-15
Wald test = 78.22 on 4 df, p=4e-16
Score (logrank) test = 78.96 on 4 df, p=3e-16
Figure 3. above limits the analysis to Black and white individuals for the sake of simplicity and visual clarity. White females’ survival was best overall (medianage at death = 83) compared to Black males (medianage at death = 71).
The logrank test 𝛘2 = 78.96, P < 0.001 reveals significant differences in survival by race.
data %>%filter(DRINKFRQ71 !="Blank, but applicable") %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRINKFRQ71, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 4. Kaplan-Meier Plot of Survival by Drinking Frequency")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRINKFRQ71,
data = data)
n= 10008, number of events= 2769
(4399 observations deleted due to missingness)
coef exp(coef) se(coef)
DRINKFRQ71About 1-4 times a month 0.02803 1.02843 0.05581
DRINKFRQ71About 2 or 3 times a week 0.14132 1.15180 0.06452
DRINKFRQ71Blank, but applicable -0.12552 0.88204 0.57900
DRINKFRQ71Everyday 0.23250 1.26175 0.06257
DRINKFRQ71Just about everyday 0.21924 1.24513 0.09031
DRINKFRQ71More than 3 but less than 12 times a year 0.03151 1.03201 0.06795
z Pr(>|z|)
DRINKFRQ71About 1-4 times a month 0.502 0.615485
DRINKFRQ71About 2 or 3 times a week 2.190 0.028490 *
DRINKFRQ71Blank, but applicable -0.217 0.828378
DRINKFRQ71Everyday 3.716 0.000203 ***
DRINKFRQ71Just about everyday 2.428 0.015201 *
DRINKFRQ71More than 3 but less than 12 times a year 0.464 0.642906
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef)
DRINKFRQ71About 1-4 times a month 1.028 0.9724
DRINKFRQ71About 2 or 3 times a week 1.152 0.8682
DRINKFRQ71Blank, but applicable 0.882 1.1337
DRINKFRQ71Everyday 1.262 0.7926
DRINKFRQ71Just about everyday 1.245 0.8031
DRINKFRQ71More than 3 but less than 12 times a year 1.032 0.9690
lower .95 upper .95
DRINKFRQ71About 1-4 times a month 0.9219 1.147
DRINKFRQ71About 2 or 3 times a week 1.0150 1.307
DRINKFRQ71Blank, but applicable 0.2836 2.744
DRINKFRQ71Everyday 1.1161 1.426
DRINKFRQ71Just about everyday 1.0431 1.486
DRINKFRQ71More than 3 but less than 12 times a year 0.9033 1.179
Concordance= 0.531 (se = 0.006 )
Likelihood ratio test= 21.85 on 6 df, p=0.001
Wald test = 22.3 on 6 df, p=0.001
Score (logrank) test = 22.37 on 6 df, p=0.001
Figure 4. above shows significant differences in survival by drinking frequency. The log rank test reveals significant differences in survival by drinking frequency habits, 𝛘2 = 22.37, P < 0.01.
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in agreg.fit(X, Y, istrat, offset, init, control, weights = weights, :
Loglik converged before variable 2 ; beta may be infinite.
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRNKTYPE71,
data = data)
n= 8134, number of events= 2213
(6273 observations deleted due to missingness)
coef exp(coef) se(coef) z Pr(>|z|)
DRNKTYPE71Liquor -2.393e-01 7.872e-01 4.570e-02 -5.237 1.63e-07 ***
DRNKTYPE71None -1.117e+01 1.410e-05 4.467e+02 -0.025 0.98
DRNKTYPE71Wine -4.141e-01 6.609e-01 6.706e-02 -6.175 6.60e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
DRNKTYPE71Liquor 0.7871642 1.270 0.7197 0.8609
DRNKTYPE71None 0.0000141 70900.262 0.0000 Inf
DRNKTYPE71Wine 0.6609332 1.513 0.5795 0.7538
Concordance= 0.547 (se = 0.006 )
Likelihood ratio test= 50.51 on 3 df, p=6e-11
Wald test = 49.42 on 3 df, p=1e-10
Score (logrank) test = 50.17 on 3 df, p=7e-11
Figure 5 above shows significant differences in survival by drinking frequency. The survival of wine drinkers is the best, followed by liquor, and beer drinkers. The log rank test reveals significant differences in survival by drinking type habits, 𝛘2 = 50.17, P < 0.001.
data %>%filter(DRNKTYPE71 !="None") %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + SEX, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and Sex")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
data %>%filter(DRNKTYPE71 !="None") %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + BACHELORS, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and College Education")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
data %>%filter(DRNKTYPE71 !="None") %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + HSGRAD, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and High School Education")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
data %>%filter(DRNKTYPE71 !="None") %>%survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + INCOMEGT25K, data = .) %>%ggsurvfit(linetype_aes =TRUE, linewidth =1) +labs(x ="Age",y ="Cumulative survival probability" ) +add_confidence_interval() +add_risktable(risktable_stats =c("{n.risk} ({cum.event})","{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),stats_label =c("At Risk (Cum. Events)", "Survival (95% CI)"))+ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and Income > 25,000")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
summary(cox1)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Call:
svycoxph(formula = Surv(NEXAMAGE, ageevent, event) ~ SEX + RACE,
design = design.INT)
n= 14081, number of events= 4531
(326 observations deleted due to missingness)
coef exp(coef) se(coef) robust se
SEXMale 0.57872 1.78375 0.03446 0.03791
RACEAleut, Eskimo or American Indian 0.09014 1.09433 0.44083 0.35005
RACEAsian/Pacific Islander -0.57789 0.56108 0.34802 0.29021
RACEBlack 0.44590 1.56190 0.05221 0.07323
RACEOther -1.84704 0.15770 0.94672 0.79688
z Pr(>|z|)
SEXMale 15.265 < 2e-16 ***
RACEAleut, Eskimo or American Indian 0.258 0.7968
RACEAsian/Pacific Islander -1.991 0.0464 *
RACEBlack 6.089 1.14e-09 ***
RACEOther -2.318 0.0205 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
SEXMale 1.7837 0.5606 1.65601 1.9213
RACEAleut, Eskimo or American Indian 1.0943 0.9138 0.55105 2.1732
RACEAsian/Pacific Islander 0.5611 1.7823 0.31769 0.9910
RACEBlack 1.5619 0.6402 1.35307 1.8030
RACEOther 0.1577 6.3410 0.03308 0.7519
Concordance= 0.595 (se = 0.007 )
Likelihood ratio test= NA on 5 df, p=NA
Wald test = 325.4 on 5 df, p=<2e-16
Score (logrank) test = NA on 5 df, p=NA
(Note: the likelihood ratio and score tests assume independence of
observations within a cluster, the Wald and robust score tests do not).
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
`summarise()` has grouped output by 'ageevent'. You can override using the
`.groups` argument.
probs$DRNKTYPE71 <-factor(probs$DRNKTYPE71)probs$num <- probs$p*(1-probs$p)probs$sep <-sqrt(probs$num/probs$n)probs$me <-2*probs$sepprobs$`Drinker Type`<- probs$DRNKTYPE71ggplot(data = probs,aes(x = ageevent,y = p,ymin = p - me,ymax = p + me,fill =`Drinker Type`,color =`Drinker Type`,group =`Drinker Type` )) +geom_line(size =1) +geom_ribbon(alpha =0.15, aes(color =NULL)) +labs(title ="Mortality by Drinker Type",subtitle ="Ages 25 to 97",caption ="NHEFS 1992 Followup") +xlab(label ="Age at Death") +ylab(label ="Hazard of Death") +theme_pubclean()
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
The figure above shows mortality profile by self-identified drinker type. As shown below, generally beer drinkers have greater mortality and hazards across the lifespan. shows a Kaplan-Meier plot of survival disaggregated by drinker type, whereas the figure below shows hazards by drinker type. The hazards are highest across the lifespan for self-identified beer drinkers. Finally, the log rank test reveals significant differences in survival by drinking type habits, 𝛘2 = 50.17, P < 0.001.
The cox proportional hazards model in Table 3 indicates differences in survival of for all cause death by sex and race. Holding all else constant, the hazard of death is 1.77 times higher for males individuals, HR = 1.77(95% CI, 1.60 – 1.95); P < 0.001, compared to females at any given time. Moreover, the hazard of death is also significant higher for Black individuals, HR = 1.62 (95% CI, 1.41 – 1.87); P < 0.001. No significant differences were found for survival across drinking types. Figure 2 shows significant differences in survival by drinking frequency. The log rank test reveals significant differences in survival by drinking frequency habits, 𝛘2 = 22.37, P < 0.01.
s1 <-survfit(Surv(ageevent, event) ~1, data = data)
Supplement A: Creation of NHEFS Participant Weights
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in agreg.fit(X, Y, istrat, offset, init, control, weights = weights, :
Loglik converged before variable 2,5,14,17,19 ; beta may be infinite.
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
cox.zph(cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
chisq df p
DRNKVOL71 0.0027 26 1
GLOBAL 0.0027 26 1
survminer::ggcoxdiagnostics(cox)
Warning: `gather_()` was deprecated in tidyr 1.2.0.
ℹ Please use `gather()` instead.
ℹ The deprecated feature was likely used in the survminer package.
Please report the issue at <https://github.com/kassambara/survminer/issues>.
`geom_smooth()` using formula = 'y ~ x'
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: pseudoinverse used at 0
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: reciprocal condition number 2.0408e-14
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: There are other near singularities as well. 0.014619
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at 0
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius
0.15756
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
number 2.0408e-14
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : There are other near
singularities as well. 0.014619
summary(cox)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
cox.zph(cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
chisq df p
DRINKFRQ71 0.00202 6 1
GLOBAL 0.00202 6 1
survminer::ggcoxdiagnostics(cox)
`geom_smooth()` using formula = 'y ~ x'
summary(cox)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Call:
svycoxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRINKFRQ71,
design = design.cox)
n= 10008, number of events= 2769
(4399 observations deleted due to missingness)
coef exp(coef) se(coef)
DRINKFRQ71About 1-4 times a month 0.08073 1.08407 0.06351
DRINKFRQ71About 2 or 3 times a week 0.11332 1.12000 0.07104
DRINKFRQ71Blank, but applicable -1.67831 0.18669 0.95513
DRINKFRQ71Everyday 0.33181 1.39348 0.06765
DRINKFRQ71Just about everyday 0.09576 1.10050 0.09823
DRINKFRQ71More than 3 but less than 12 times a year 0.08663 1.09049 0.07529
robust se z Pr(>|z|)
DRINKFRQ71About 1-4 times a month 0.07205 1.120 0.262537
DRINKFRQ71About 2 or 3 times a week 0.09051 1.252 0.210521
DRINKFRQ71Blank, but applicable 0.93682 -1.791 0.073213
DRINKFRQ71Everyday 0.08773 3.782 0.000155
DRINKFRQ71Just about everyday 0.13572 0.706 0.480429
DRINKFRQ71More than 3 but less than 12 times a year 0.08296 1.044 0.296390
DRINKFRQ71About 1-4 times a month
DRINKFRQ71About 2 or 3 times a week
DRINKFRQ71Blank, but applicable .
DRINKFRQ71Everyday ***
DRINKFRQ71Just about everyday
DRINKFRQ71More than 3 but less than 12 times a year
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef)
DRINKFRQ71About 1-4 times a month 1.0841 0.9224
DRINKFRQ71About 2 or 3 times a week 1.1200 0.8929
DRINKFRQ71Blank, but applicable 0.1867 5.3565
DRINKFRQ71Everyday 1.3935 0.7176
DRINKFRQ71Just about everyday 1.1005 0.9087
DRINKFRQ71More than 3 but less than 12 times a year 1.0905 0.9170
lower .95 upper .95
DRINKFRQ71About 1-4 times a month 0.94130 1.248
DRINKFRQ71About 2 or 3 times a week 0.93795 1.337
DRINKFRQ71Blank, but applicable 0.02976 1.171
DRINKFRQ71Everyday 1.17335 1.655
DRINKFRQ71Just about everyday 0.84346 1.436
DRINKFRQ71More than 3 but less than 12 times a year 0.92684 1.283
Concordance= 0.539 (se = 0.009 )
Likelihood ratio test= NA on 6 df, p=NA
Wald test = 18.69 on 6 df, p=0.005
Score (logrank) test = NA on 6 df, p=NA
(Note: the likelihood ratio and score tests assume independence of
observations within a cluster, the Wald and robust score tests do not).
Discussion
Conclusions. The present study analyzed the survival differentials of an aging U.S. civilian cohort. Survival times differed by expected and established sociodemographic controls, sex and race. Specifically, the all-cause mortality of men and Black individuals was found to be lower compared to their respective counterparts. These findings are in accord with the extant literature. The final Cox proportional hazard model did not show statistically significant differences in hazards by drink type (beer, wine, or liquor) or by drinking frequency.
Despite the fact that these findings differ from that of Sempos et al. and Klatsky et al. there are salient insights to be discussed. Even in light of the lack of statistical significance, the general trends of this analysis are in accord with the extant alcohol literature. Indeed, as Klatsky reports, greater hazards of mortality were observed for beer drinkers relative to wine drinkers.18 However, the intermediate position of liquor drinkers shown in Figure Panel A was not expected. The volume of drinks likely differs by type of alcohol and the hazards of heavy drinkers may be disproportionally represented in that type.
Limitations. This study has several limitations. First, the NHEFS only reports alcohol consumption behaviors at the NHANES I 1971-1974 baseline. Thus, the NHEFS does not provide a more complete or granular view of alcohol consumption across the lifespan. It is entirely possible, and likely, that drinking behaviors transformed across the lifespan of the cohorts in the context of multifaceted, multi-level social, political, and environmental changes between baseline and follow-up. Second, another consideration observed in this study ones like it is that wine drinkers may be associated with greater survival simply because of a greater propensity to persist the behavior across the lifespan, relative to beer and liquor drinkers who may cease the behavior as they age.18 Third, all measures of alcohol consumption are self-reported and subject to social desirability and recall effects. Finally, all interpretations are merely associational in nature as causality cannot be inferred.
Future Directions. More research is needed to understand the motivational factors that differentially contribute to alcohol consumption behaviors and, by extension, mortality, and morbidity risk profiles. Future longitudinal studies should consider additional timepoints to generate a more complete history of alcohol consumption behavior. Additionally, future analyses should consider more social behaviors and contextual factors like volunteership and church attendance as covariates.
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100,
nest = TRUE, data = data)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Dependent variable
Predictors
Estimates
CI
DRINKFRQ71About 1-4 times a month
1.08
0.96 – 1.23
DRINKFRQ71 [About 2 or 3
times a week]
1.12
0.97 – 1.29
DRINKFRQ71 [Blank, but
applicable]
0.19
0.03 – 1.21
DRINKFRQ71 [Everyday]
1.39
1.22 – 1.59
DRINKFRQ71 [Just about
everyday]
1.10
0.91 – 1.33
DRINKFRQ71 [More than 3
but less than 12 times a
year]
1.09
0.94 – 1.26
Observations
10008
R2 Nagelkerke
0.004
AICc
39123.549
* p<0.05 ** p<0.01 *** p<0.001
References
1. Inc G. What Percentage of Americans Drink Alcohol? Gallup.com. Published December 29, 2022. Accessed October 20, 2023. https://news.gallup.com/poll/467507/percentage-americans-drink-alcohol.aspx
2. Wittgens C, Muehlhan M, Kräplin A, Wolff M, Trautmann S. Underlying mechanisms in the relationship between stress and alcohol consumption in regular and risky drinkers (MESA): methods and design of a randomized laboratory study. BMC Psychology. 2022;10(1):233. doi:10.1186/s40359-022-00942-1
3. Deaths and Years of Potential Life Lost From Excessive Alcohol Use — United States, 2011–2015 | MMWR. Accessed October 21, 2023. https://www.cdc.gov/mmwr/volumes/69/wr/mm6930a1.htm
4. Khamis AA, Salleh SZ, Ab Karim MS, et al. Alcohol Consumption Patterns: A Systematic Review of Demographic and Sociocultural Influencing Factors. Int J Environ Res Public Health. 2022;19(13):8103. doi:10.3390/ijerph19138103
5. Klatsky AL, Armstrong MA, Friedman GD. Alcohol and Mortality. Ann Intern Med. 1992;117(8):646-654. doi:10.7326/0003-4819-117-8-646
6. Tsai MK, Gao W, Wen CP. The relationship between alcohol consumption and health: J-shaped or less is more? BMC Medicine. 2023;21(1):228. doi:10.1186/s12916-023-02911-w
7. Benefits and hazards of alcohol-the J-shaped curve and public health | Emerald Insight. Accessed October 21, 2023. https://www.emerald.com/insight/content/doi/10.1108/DAT-09-2020-0059/full/html
8. Thompson PL. J-curve revisited: cardiovascular benefits of moderate alcohol use cannot be dismissed. Med J Aust. 2013;198(8). Accessed October 21, 2023. https://www.mja.com.au/journal/2013/198/8/j-curve-revisited-cardiovascular-benefits-moderate-alcohol-use-cannot-be
9. Marmot MG, Shipley MJ, Rose G, Thomas BrionyJ. ALCOHOL AND MORTALITY: A U-SHAPED CURVE. The Lancet. 1981;317(8220):580-583. doi:10.1016/S0140-6736(81)92032-8
10. Tian Y, Liu J, Zhao Y, et al. Alcohol consumption and all-cause and cause-specific mortality among US adults: prospective cohort study. BMC Medicine. 2023;21(1):208. doi:10.1186/s12916-023-02907-6
11. Stockwell T, Zhao J, Panwar S, Roemer A, Naimi T, Chikritzhs T. Do "Moderate" Drinkers Have Reduced Mortality Risk? A Systematic Review and Meta-Analysis of Alcohol Consumption and All-Cause Mortality. J Stud Alcohol Drugs. 2016;77(2):185-198. doi:10.15288/jsad.2016.77.185
12. Sempos CT, Rehm J, Wu T, Crespo CJ, Trevisan M. Average Volume of Alcohol Consumption and All-Cause Mortality in African Americans: The NHEFS Cohort. Alcoholism: Clinical and Experimental Research. 2003;27(1):88-92. doi:10.1111/j.1530-0277.2003.tb02726.x
13. Zhao J, Stockwell T, Roemer A, Naimi T, Chikritzhs T. Alcohol Consumption and Mortality From Coronary Heart Disease: An Updated Meta-Analysis of Cohort Studies. J Stud Alcohol Drugs. 2017;78(3):375-386. doi:10.15288/jsad.2017.78.375
14. Pearlin LI, Menaghan EG, Lieberman MA, Mullan JT. The Stress Process. Journal of Health and Social Behavior. 1981;22(4):337. doi:10.2307/2136676
15. DeAngelis RT. Striving While Black: Race and the Psychophysiology of Goal Pursuit. J Health Soc Behav. 2020;61(1):24-42. doi:10.1177/0022146520901695
16. Cox WM, Klinger E. A motivational model of alcohol use. Journal of Abnormal Psychology. 1988;97(2):168-180. doi:10.1037/0021-843X.97.2.168
17. Merrill JE, Thomas SE. Interactions between Adaptive Coping and Drinking to Cope in Predicting Naturalistic Drinking and Drinking Following a Lab-Based Psychosocial Stressor. Addict Behav. 2013;38(3):1672-1678. doi:10.1016/j.addbeh.2012.10.003
18. Klatsky AL, Friedman GD, Armstrong MA, Kipp H. Wine, Liquor, Beer, and Mortality. American Journal of Epidemiology. 2003;158(6):585-595. doi:10.1093/aje/kwg184
19. Klatsky AL, Armstrong MA, Kipp H. Correlates of alcoholic beverage preference: traits of persons who choose wine, liquor or beer. British Journal of Addiction. 1990;85(10):1279-1289. doi:10.1111/j.1360-0443.1990.tb01604.x
20. Halonen JI, Kivimäki M, Pentti J, et al. Association of the Availability of Beer, Wine, and Liquor Outlets with Beverage-Specific Alcohol Consumption: A Cohort Study. Alcoholism: Clinical and Experimental Research. 2014;38(4):1086-1093. doi:10.1111/acer.12350
21. Stephens MAC, Wand G. Stress and the HPA Axis. Alcohol Res. 2012;34(4):468-483.
22. Badrick E, Bobak M, Britton A, Kirschbaum C, Marmot M, Kumari M. The Relationship between Alcohol Consumption and Cortisol Secretion in an Aging Cohort. J Clin Endocrinol Metab. 2008;93(3):750-757. doi:10.1210/jc.2007-0737
23. High cortisol levels are associated with oxidative stress and mortality in maintenance hemodialysis patients - PMC. Accessed December 9, 2023. https://www-ncbi-nlm-nih-gov.libweb.lib.utsa.edu/pmc/articles/PMC8903641/
24. Lorem G, Cook S, Leon DA, Emaus N, Schirmer H. Self-reported health as a predictor of mortality: A cohort study of its relation to other health measurements and observation time. Sci Rep. 2020;10(1):4886. doi:10.1038/s41598-020-61603-0
25. Burstrom B, Fredlund P. Self rated health: Is it as good a predictor of subsequent mortality among adults in lower as well as in higher social classes? J Epidemiol Community Health. 2001;55(11):836-840. doi:10.1136/jech.55.11.836
26. Elliott M, Lowman J. Education, income and alcohol misuse: a stress process model. Soc Psychiatry Psychiatr Epidemiol. 2015;50(1):19-26. doi:10.1007/s00127-014-0867-3
27. Cox CS, Mussolino ME, Rothwell ST, et al. Plan and operation of the NHANES I Epidemiologic Followup Study, 1992. Vital Health Stat 1. 1997;(35):1-231.
28. NHANES I - Epidemiologic Followup Study (NHEFS). Accessed October 12, 2023. https://wwwn.cdc.gov/nchs/nhanes/nhefs/