Blame It on The Alcohol An Event History Analysis of Alcohol Consumption

Author

Jacob M. Souch

Introduction

Data and Materials

NHEFS: The NHEFS will be used to conduct exploratory empirical analysis. NHEFS is a national longitudinal study that was jointly initiated by the National Center for Health Statistics and the National Institute on Aging in collaboration with other agencies of the Public Health Service.15,16 The NHEFs includes 14,407 NHANES I participants examined in 1971 to 1975. Respondents were 25 to 74 years old, corresponding to birth years 1897 to 1946. Additional follow-up of the NHEFS population was conducted in 1986, 1987, and 1992 using the same design and data collection procedures. The 1992 Mortality Data file contains death certificate information collected during each follow-up period coded according to ICD-9 multiple-cause-of-death procedures for all 4,497 deceased subjects identified through 1992. NHEFS data will be obtained through ICPSR. The formatted NHEFS data will be analyzed in R using the survival package in R. Kaplan-Meier plots will be generated by subpopulation (race/ethnicity) and by alcohol type. Daily alcohol consumption will be calculated[1] for total alcohol consumption (beer, wine whisky) and each individual drink type to identify differential risks. Survminer will be used to find median survival rates and summary statistics. Cox proportional hazards models will be used to estimate the effect of alcohol consumption on all-cause mortality risk while adjusting for age, sex, race, BMI, self-rated health, and educational achievement.  Death is indicated as the event in the time-to-event analysis. Year of death through 1992 and age at interview can be used to construct age at death. Deaths not observed by 1992 are right censored in all subsequent analyses. All-cause death is considered for simplicity.

Data File Description N
NHANES I Baseline characteristics of NHANES participants. 18,836
NHEFS 1992 Interview Interview items obtained at folllow up in 1992 9,281
NHEFS 1992 Vital Tracing and Status Status of participant (dead, alive, lost to followup) as of 1992 14,407
NHEFS 1992 Mortality Detailed mortality items of respondents as of 1992 4,609
library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.3.2
Warning: package 'readr' was built under R version 4.3.2
Warning: package 'purrr' was built under R version 4.3.2
Warning: package 'dplyr' was built under R version 4.3.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)

Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test
library(survival)
library(ggsurvfit)
library(foreign)
library(readxl)
library(ggpubr)
Warning: package 'ggpubr' was built under R version 4.3.2
 setwd("C:/Users/jacob/OneDrive - University of Texas at San Antonio/OLD/Courses/Advanced Methods of Demographic Analysis/NHEFS")

NHEFS

Interview Measures

  NHEFS_Interview92 <- read_csv("N92interview.csv")
Rows: 9281 Columns: 1870
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (160): HANESEQ, RSLTDATE, RSLTMO, RSLTDY, GPIDAT87, IDATMM87, IDATDY87,...
dbl (1505): LINTSTAT, LINTDATE, LSSEX, LNDOB, LNURSING, LMARITAL, LWORKHST, ...
lgl  (205): BLANK1, BLANK2, BLANK3, BLANK3b, BLANK4, BLANK5, BLANK6, BLANK7,...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
  NHEFS_Interview92<- NHEFS_Interview92 %>% select(SEQNUM = HANESEQ, DRINKYR, OFTNBEER, OFTNWINE, OFTNLIQ, WKMOBEER, WKMOWINE, WKMOLIQ, NUMBBEER,NUMBWINE,NUMBSHOT, INTRACE = REVRACE, SRHEALTH = GENERAL, AGE92, LINTDATE )
  

  NHEFS_Interview92$DRINKYR <- car::Recode(
  NHEFS_Interview92$DRINKYR,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
            1 = 'YES';
            2 = 'NO'"
)
  
    NHEFS_Interview92$OFTNBEER <-car::Recode(
  NHEFS_Interview92$OFTNBEER,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             0 = 'NONE';
             94 = 'NO MORE THAN 3 BUT LESS THAN 12 TIMES PER YEAR';
  95 = 'NO MORE THAN 3 TIMES A YEAR'")
    
#Beer    
  NHEFS_Interview92$WKMOBEER <-car::Recode(
  NHEFS_Interview92$WKMOBEER,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             1 = 'WEEK';
             2 = 'MONTH'")
  
  
  NHEFS_Interview92$NUMBBEER <-car::Recode(
  NHEFS_Interview92$NUMBBEER,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             95 = 'LESS THAN ONE DRINK'")
    

#Wine
NHEFS_Interview92$OFTNWINE <-car::Recode(
  NHEFS_Interview92$OFTNWINE,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             0 = 'NONE';
             94 = 'NO MORE THAN 3 BUT LESS THAN 12 TIMES PER YEAR';
  95 = 'NO MORE THAN 3 TIMES A YEAR'")


NHEFS_Interview92$WKMOWINE <-car::Recode(
  NHEFS_Interview92$WKMOWINE,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             1 = 'WEEK';
             2 = 'MONTH'")

NHEFS_Interview92$NUMBWINE <-car::Recode(
  NHEFS_Interview92$NUMBWINE,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             95 = 'LESS THAN ONE DRINK'")

#Liquor

NHEFS_Interview92$OFTNLIQ <-car::Recode(
  NHEFS_Interview92$OFTNLIQ,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             0 = 'NONE';
             94 = 'NO MORE THAN 3 BUT LESS THAN 12 TIMES PER YEAR';
  95 = 'NO MORE THAN 3 TIMES A YEAR'")


NHEFS_Interview92$WKMOLIQ <-car::Recode(
  NHEFS_Interview92$WKMOLIQ,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             1 = 'WEEK';
             2 = 'MONTH'")

NHEFS_Interview92$NUMBSHOT <-car::Recode(
  NHEFS_Interview92$NUMBSHOT,
  "-1 = 'INAPPLICABLE';
            -7 = 'REFUSED';
            -8 = 'DO NOT KNOW';
            -9 = 'NOT ASCERTAINED';
             95 = 'LESS THAN ONE DRINK'")


#Race

NHEFS_Interview92$INTRACE <- car::Recode(
  NHEFS_Interview92$INTRACE,
  "1='Aleut, Eskimo or American Indian';
2= 'Asian/Pacific Islander' ;
3= 'Black';
4= 'White';
5= 'Other'"
)

#Self-Rated Health

NHEFS_Interview92$SRHEALTH <- car::Recode(NHEFS_Interview92$SRHEALTH,
"-1 = 'INAPPLICABLE';
-7 = 'REFUSED';
-8 = 'DON’T KNOW';
-9 = 'NOT ASCERTAINED';
1 = 'EXCELLENT';
2 = 'VERY GOOD';
3 = 'GOOD';
4 = 'FAIR';
5 = 'POOR'")

  NHEFS_Interview92$INTDATE <- lubridate::ymd(NHEFS_Interview92$LINTDATE)

Vital Tracing and Status Measures

  NHEFS_Vital92 <- read.csv("NHEFS_Vital92.csv")
  NHEFS_Vital92 <- NHEFS_Vital92 %>%  select(SEQNUM, CVITALST,DLKAMO92 , DLKADY92 ,DLKAMO92, DLKAYR92, NDOBMO, NDOBYR, RACE = REVRACE, NEXAMAGE, AGEDTH, SEXSUBJ)

  NHEFS_Vital92$DLKA92 <- as.Date(paste(NHEFS_Vital92$DLKAYR92, NHEFS_Vital92$DLKAMO92, NHEFS_Vital92$DLKADY92), format ="%y %m %d" )
  
  NHEFS_Vital92$DOB <- as.Date(paste(NHEFS_Vital92$NDOBYR, NHEFS_Vital92$NDOBMO,1),format = "%Y %m %d")
  
  NHEFS_Vital92$SEX <-car::Recode(NHEFS_Vital92$SEXSUBJ, "1 = 'Male'; 2 ='Female'")

  
  NHEFS_Vital92$CVITALST <- car::Recode(NHEFS_Vital92$CVITALST, 
          "'1' = 'Alive';
           '3' = 'Dead';
           '4' = 'Unknown';
           '5' = 'Traced alive with direct subject contact';
           '6' = 'Traced alive without direct subject contact';
           'subject’s vital status verified'")
  
  
  NHEFS_Vital92$RACE <- car::Recode(
  NHEFS_Vital92$RACE,
  "1='Aleut, Eskimo or American Indian';
2= 'Asian/Pacific Islander' ;
3= 'Black';
4= 'White';
5= 'Other'"
)
  
  NHEFS_Vital92$RACE <- relevel(as.factor(NHEFS_Vital92$RACE), ref ="White")
  NHEFS_Mortality92 <- read.csv("NHEFS_Mortality92.csv")
  
  NHEFS_Mortality92<- NHEFS_Mortality92 %>% select(SEQNUM, DTAGEDTH, MARITALSTATUS = MARTSTAT, EDUCATION = EDUCATN)

NHEFS_Mortality92$EDUCATION <- car::Recode(NHEFS_Mortality92$EDUCATION,
"'00' = 'No formal education';
'09' =  '1 year of high school';
10 =  '2 years of high school';
11 =  '3 years of high school';
12 =  '4 years of high school';
13 =  '1 year of college';
14 =  '2 years of college';
15 =  '3 years of college';
16 =  '4 years of college';
17 =  '5 or more years of college';
99 = 'Not stated'")

NHEFS_Mortality92$MARITALSTATUS <- car::Recode(NHEFS_Mortality92$MARITALSTATUS,
"1 = 'Never married, single';
 2  = 'Married';
 3  = 'Widowed';
 4  = 'Divorced';
 5  = 'Marital status not on certificate';
 9  = 'Marital status not stated'")

NHEFS_Mortality92$DTAGEDTH <- 
  car::Recode(NHEFS_Mortality92$DTAGEDTH,
              "999 = 'AGE NOT STATED'")

NHEFS_Mortality92$DTAGEDTH <-  as.numeric(NHEFS_Mortality92$DTAGEDTH)
Warning: NAs introduced by coercion

NHANES I Data Files

  NHANESI <- read_xlsx("NHANESI.xlsx")
  
  #Selected Variables
  NHANESI <- NHANESI %>% select(SEQNUM = SEQN,DRNKPYR =  N1AH0404, DRINKFRQ71 = N1AH0405, DRNKTYPE71 = N1AH0406, DRNKVOL71 = N1AH0407, EDU = N1AH0020, STRATA =  N1AH0194, PSU = N1AH0196, WAGES = N1AH0034)
  
  
NHANESI$BACHELORS <- NHANESI$EDU == 45

NHANESI$HSGRAD <- NHANESI$EDU >=34 & NHANESI$EDU != 88

NHANESI$INCOMEGT25K <- NHANESI$WAGES ==22
  
  NHANESI$EDU <- car::Recode(NHANESI$EDU, 
     "10='None';
     21='1st grade';
     22='2nd grade';
     23='3rd grade';
     24='4th grade';
     25='5th grade';
     26='6th grade';
     27='7th grade';
     28='8th grade';
     31='9th grade';
     32='10th grade';
     33='11th grade';
     34='12th grade';
     41='First year of college';
     42='Second year of college';
     43='Third year of college';
     44='Fourth year of college';
     45='Graduate';
     88='Blank, but applicable'")
  
  

  
  NHANESI$DRNKTYPE71 <- car::Recode(NHANESI$DRNKTYPE71,
                                 "1 = 'Beer';
                                  2 = 'Wine';
                                  3 = 'Liquor';
                                  8 = 'None'")
  
  NHANESI$DRNKTYPE71 <- relevel(as.factor(NHANESI$DRNKTYPE71),"Beer")
  
  
  NHANESI$DRINKFRQ71 <- car::Recode(
    NHANESI$DRINKFRQ71,
    "1 = 'Everyday';
2 = 'Just about everyday';
3 = 'About 2 or 3 times a week';
4 = 'About 1-4 times a month';
5 = 'More than 3 but less than 12 times a year';
6 = 'No more than 2 or 3 times a year';
8 = 'Blank, but applicable'"
  )
  
  NHANESI$DRINKFRQ71 <- relevel(as.factor(NHANESI$DRINKFRQ71),'No more than 2 or 3 times a year')
  
  
  NHANESI$DRNKVOL71 <- car::Recode(NHANESI$DRNKVOL71,
                                   "88 = 'Blank'" )
  
  #Selected Survey
  NHANESIGROWTH <- read.xport("growthch.xpt")
  NHANESIGROWTH <- NHANESIGROWTH[NHANESIGROWTH$SURVEY ==3,]
  NHANESIGROWTH <- NHANESIGROWTH %>% select(SEQNUM = SEQN, AGE_EXAM, BMI, WEIGHT = WT_KG, HEIGHT = HT_CM)

  
weights <- read.csv("NHEFS_weights.csv")

NHANESI$SEQNUM <- as.character(NHANESI$SEQNUM)
NHANESIGROWTH$SEQNUM <- as.character(NHANESIGROWTH$SEQNUM)
NHEFS_Interview92$SEQNUM <- as.character(NHEFS_Interview92$SEQNUM)
NHEFS_Mortality92$SEQNUM <- as.character(NHEFS_Mortality92$SEQNUM)
NHEFS_Vital92$SEQNUM <- as.character(NHEFS_Vital92$SEQNUM)
weights$SEQNUM <- as.character(weights$SEQNUM)


df <- list(NHEFS_Vital92, NHANESI, NHANESIGROWTH, weights)

data <- df %>% reduce(left_join, by= "SEQNUM")

Evaluate Data

data %>% select(SEQNUM, SEX, RACE, NEXAMAGE, HSGRAD, BACHELORS, INCOMEGT25K,DRINKFRQ71, DRNKTYPE71, DRNKVOL71) %>%

Amelia::missmap(.)

Demographic Characteristics

Creating ageevent Measure

data$ageevent <- case_when(is.na(data$AGEDTH) == FALSE ~ as.numeric(data$AGEDTH),
                           TRUE ~ round(as.numeric(data$DLKA92 - data$DOB)/365.25))


data$event <- data$CVITALST == "Dead"

data$time <- as.numeric(data$ageevent) - as.numeric(data$NEXAMAGE)
t1data <-
  data %>% select(
    `Age at Exam` = NEXAMAGE,
    `Age at Event` = ageevent,
    `Vital Tracing Status` =  CVITALST,
    `Race` = RACE,
    `Bachelor's Degree` =  BACHELORS,
    `High School Graduate` = HSGRAD,
    `Income Greater Than 25,000` = INCOMEGT25K,
    `Self-Identified Type of Drinker at NHANES I` =  DRNKTYPE71,
    `Self-Reported Frequency of Drinking` = DRINKFRQ71,
    Sex = SEX)
    
    

table1::table1(~ `Age at Exam` + `Age at Event` + `Vital Tracing Status` + Race + `Bachelor's Degree` + `High School Graduate` +`Income Greater Than 25,000` + `Self-Identified Type of Drinker at NHANES I` + `Self-Reported Frequency of Drinking` | Sex , data = t1data, footnote = "Values presented are unweighted.", caption = "TABLE 1.  SUMMARY STATISTICS OF MEASURES")
TABLE 1. SUMMARY STATISTICS OF MEASURES
Female
(N=8596)
Male
(N=5811)
Overall
(N=14407)

Values presented are unweighted.

Age at Exam
Mean (SD) 47.2 (15.5) 51.5 (15.3) 49.0 (15.6)
Median [Min, Max] 44.0 [24.0, 77.0] 53.0 [24.0, 76.0] 48.0 [24.0, 77.0]
Age at Event
Mean (SD) 63.9 (15.2) 66.4 (13.9) 64.9 (14.8)
Median [Min, Max] 63.0 [25.0, 97.0] 68.0 [25.0, 95.0] 65.0 [25.0, 97.0]
Vital Tracing Status
Alive 5797 (67.4%) 3004 (51.7%) 8801 (61.1%)
Dead 2134 (24.8%) 2470 (42.5%) 4604 (32.0%)
Traced alive with direct subject contact 173 (2.0%) 87 (1.5%) 260 (1.8%)
Traced alive without direct subject contact 116 (1.3%) 80 (1.4%) 196 (1.4%)
Unknown 376 (4.4%) 170 (2.9%) 546 (3.8%)
Race
White 7133 (83.0%) 4903 (84.4%) 12036 (83.5%)
Aleut, Eskimo or American Indian 13 (0.2%) 17 (0.3%) 30 (0.2%)
Asian/Pacific Islander 45 (0.5%) 48 (0.8%) 93 (0.6%)
Black 1376 (16.0%) 823 (14.2%) 2199 (15.3%)
Other 29 (0.3%) 20 (0.3%) 49 (0.3%)
Bachelor's Degree
Yes 381 (4.4%) 274 (4.7%) 655 (4.5%)
No 6488 (75.5%) 4205 (72.4%) 10693 (74.2%)
Missing 1727 (20.1%) 1332 (22.9%) 3059 (21.2%)
High School Graduate
Yes 3363 (39.1%) 2104 (36.2%) 5467 (37.9%)
No 3506 (40.8%) 2375 (40.9%) 5881 (40.8%)
Missing 1727 (20.1%) 1332 (22.9%) 3059 (21.2%)
Income Greater Than 25,000
Yes 338 (3.9%) 285 (4.9%) 623 (4.3%)
No 8258 (96.1%) 5526 (95.1%) 13784 (95.7%)
Self-Identified Type of Drinker at NHANES I
Beer 1281 (14.9%) 2195 (37.8%) 3476 (24.1%)
Liquor 2033 (23.7%) 1454 (25.0%) 3487 (24.2%)
None 2 (0.0%) 1 (0.0%) 3 (0.0%)
Wine 1015 (11.8%) 360 (6.2%) 1375 (9.5%)
Missing 4265 (49.6%) 1801 (31.0%) 6066 (42.1%)
Self-Reported Frequency of Drinking
No more than 2 or 3 times a year 1361 (15.8%) 517 (8.9%) 1878 (13.0%)
About 1-4 times a month 2062 (24.0%) 1461 (25.1%) 3523 (24.5%)
About 2 or 3 times a week 785 (9.1%) 937 (16.1%) 1722 (12.0%)
Blank, but applicable 9 (0.1%) 3 (0.1%) 12 (0.1%)
Everyday 385 (4.5%) 875 (15.1%) 1260 (8.7%)
Just about everyday 204 (2.4%) 356 (6.1%) 560 (3.9%)
More than 3 but less than 12 times a year 899 (10.5%) 394 (6.8%) 1293 (9.0%)
Missing 2891 (33.6%) 1268 (21.8%) 4159 (28.9%)

Creating the Survival Object

Kaplan-Meier

km <- with(data, Surv(NEXAMAGE,ageevent,event))
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
summary(km)
     start            stop           status      
 Min.   :24.00   Min.   :25.00   Min.   :0.0000  
 1st Qu.:35.00   1st Qu.:53.00   1st Qu.:0.0000  
 Median :48.00   Median :65.00   Median :0.0000  
 Mean   :49.13   Mean   :64.92   Mean   :0.3196  
 3rd Qu.:65.00   3rd Qu.:77.00   3rd Qu.:1.0000  
 Max.   :77.00   Max.   :97.00   Max.   :1.0000  
 NA's   :326                                     

Unweighted Life Table

km_fit <- survfit(Surv(NEXAMAGE,ageevent, event) ~ 1, data=data)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
summary.km_fit <- summary(km_fit)

km_fit
Call: survfit(formula = Surv(NEXAMAGE, ageevent, event) ~ 1, data = data)

   326 observations deleted due to missingness 
     records n.max n.start events median 0.95LCL 0.95UCL
[1,]   14081  5569     369   4531     79      78      79
summary.km_fit
Call: survfit(formula = Surv(NEXAMAGE, ageevent, event) ~ 1, data = data)

326 observations deleted due to missingness 
 time n.risk n.event survival  std.err lower 95% CI upper 95% CI
   30   1756       3   0.9983 0.000986       0.9964        1.000
   31   2048       1   0.9978 0.001099       0.9957        1.000
   32   2343       4   0.9961 0.001388       0.9934        0.999
   33   2635       5   0.9942 0.001623       0.9910        0.997
   34   2894       7   0.9918 0.001856       0.9882        0.995
   35   3166       2   0.9912 0.001907       0.9874        0.995
   36   3462       6   0.9895 0.002029       0.9855        0.993
   37   3722       5   0.9881 0.002111       0.9840        0.992
   38   3985       7   0.9864 0.002207       0.9821        0.991
   39   4237       7   0.9848 0.002288       0.9803        0.989
   40   4514      13   0.9819 0.002413       0.9772        0.987
   41   4769       8   0.9803 0.002478       0.9754        0.985
   42   5012      14   0.9775 0.002577       0.9725        0.983
   43   5263      11   0.9755 0.002644       0.9703        0.981
   44   5450      17   0.9725 0.002737       0.9671        0.978
   45   5569      19   0.9691 0.002831       0.9636        0.975
   46   5504      13   0.9669 0.002895       0.9612        0.973
   47   5384      23   0.9627 0.003008       0.9568        0.969
   48   5270      17   0.9596 0.003091       0.9536        0.966
   49   5167      22   0.9555 0.003198       0.9493        0.962
   50   5063      19   0.9519 0.003290       0.9455        0.958
   51   5009      28   0.9466 0.003422       0.9399        0.953
   52   4921      36   0.9397 0.003587       0.9327        0.947
   53   4818      38   0.9323 0.003754       0.9250        0.940
   54   4750      34   0.9256 0.003898       0.9180        0.933
   55   4675      41   0.9175 0.004065       0.9096        0.925
   56   4576      30   0.9115 0.004184       0.9033        0.920
   57   4521      30   0.9054 0.004299       0.8970        0.914
   58   4436      35   0.8983 0.004432       0.8896        0.907
   59   4307      49   0.8881 0.004616       0.8791        0.897
   60   4182      49   0.8777 0.004795       0.8683        0.887
   61   4062      46   0.8677 0.004959       0.8581        0.877
   62   3956      51   0.8565 0.005137       0.8465        0.867
   63   3811      67   0.8415 0.005366       0.8310        0.852
   64   3685      58   0.8282 0.005556       0.8174        0.839
   65   3550      52   0.8161 0.005724       0.8050        0.827
   66   3778      80   0.7988 0.005920       0.7873        0.811
   67   3962      84   0.7819 0.006076       0.7701        0.794
   68   4123     109   0.7612 0.006229       0.7491        0.774
   69   4261     114   0.7408 0.006348       0.7285        0.753
   70   4316     107   0.7225 0.006434       0.7100        0.735
   71   4388     123   0.7022 0.006508       0.6896        0.715
   72   4441     149   0.6787 0.006569       0.6659        0.692
   73   4454     167   0.6532 0.006612       0.6404        0.666
   74   4430     194   0.6246 0.006633       0.6118        0.638
   75   4330     198   0.5961 0.006633       0.5832        0.609
   76   4036     186   0.5686 0.006626       0.5557        0.582
   77   3727     203   0.5376 0.006612       0.5248        0.551
   78   3412     178   0.5096 0.006593       0.4968        0.523
   79   3115     209   0.4754 0.006561       0.4627        0.488
   80   2805     194   0.4425 0.006518       0.4299        0.455
   81   2526     195   0.4083 0.006458       0.3959        0.421
   82   2237     195   0.3727 0.006378       0.3605        0.385
   83   1944     173   0.3396 0.006289       0.3275        0.352
   84   1703     155   0.3087 0.006188       0.2968        0.321
   85   1474     156   0.2760 0.006060       0.2644        0.288
   86   1212     124   0.2478 0.005947       0.2364        0.260
   87    953      97   0.2225 0.005867       0.2113        0.234
   88    737     107   0.1902 0.005787       0.1792        0.202
   89    522      56   0.1698 0.005773       0.1589        0.182
   90    372      46   0.1488 0.005831       0.1378        0.161
   91    253      27   0.1329 0.005956       0.1218        0.145
   92    166      25   0.1129 0.006262       0.1013        0.126
   93     92       7   0.1043 0.006574       0.0922        0.118
   94     52       4   0.0963 0.007189       0.0832        0.111
   95     16       1   0.0903 0.008910       0.0744        0.110
   96      3       1   0.0602 0.025281       0.0264        0.137

Weighted Life Table

wkm_fit <- survfit(Surv(NEXAMAGE,ageevent, event) ~ 1, data=data, weights = WT1100)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
summary(wkm_fit)
Call: survfit(formula = Surv(NEXAMAGE, ageevent, event) ~ 1, data = data, 
    weights = WT1100)

326 observations deleted due to missingness 
 time   n.risk n.event censored survival std.err lower 95% CI upper 95% CI
   30 13795806   29166   880066   0.9979 0.00189      0.99418        1.000
   31 15972243    1642   137854   0.9978 0.00190      0.99407        1.000
   32 18262458   20479    59988   0.9967 0.00201      0.99273        1.000
   33 20602958   33509   105958   0.9950 0.00222      0.99070        0.999
   34 22642902   44229    79721   0.9931 0.00242      0.98836        0.998
   35 24922596    7018    80527   0.9928 0.00243      0.98806        0.998
   36 27083919   32390    96320   0.9916 0.00250      0.98674        0.997
   37 29126318   31907   155051   0.9905 0.00256      0.98555        0.996
   38 31131623   46436    92114   0.9891 0.00264      0.98391        0.994
   39 33061482   76352   105684   0.9868 0.00296      0.98100        0.993
   40 34973526   47030   208518   0.9855 0.00299      0.97962        0.991
   41 36827127   48018   300809   0.9842 0.00306      0.97819        0.990
   42 38764241  117825   318277   0.9812 0.00321      0.97491        0.987
   43 40557978   62028   758715   0.9797 0.00325      0.97333        0.986
   44 41961506   81817  1271969   0.9778 0.00331      0.97131        0.984
   45 42857584  197685  1956259   0.9733 0.00354      0.96636        0.980
   46 42858372  116177  2473931   0.9706 0.00364      0.96351        0.978
   47 42478539  239177  2544278   0.9652 0.00389      0.95756        0.973
   48 42071068  129276  2545146   0.9622 0.00398      0.95442        0.970
   49 42038578  168073  2267806   0.9583 0.00411      0.95033        0.966
   50 41923164  155891  2069035   0.9548 0.00423      0.94652        0.963
   51 42082204  229682  2205161   0.9496 0.00437      0.94105        0.958
   52 41876350  293112  2111420   0.9429 0.00457      0.93401        0.952
   53 41606799  289514  2208036   0.9364 0.00471      0.92718        0.946
   54 41445989  270849  2158046   0.9302 0.00484      0.92081        0.940
   55 41247628  381203  1935105   0.9216 0.00506      0.91178        0.932
   56 41090227  325009  1796651   0.9144 0.00527      0.90409        0.925
   57 41331346  203376  2040490   0.9099 0.00536      0.89942        0.920
   58 41170132  325756  2019249   0.9027 0.00555      0.89184        0.914
   59 40674196  418439  1898370   0.8934 0.00574      0.88219        0.905
   60 40298375  377719  1872868   0.8850 0.00588      0.87355        0.897
   61 39758843  360708  1632875   0.8770 0.00602      0.86525        0.889
   62 39699659  460131  2111776   0.8668 0.00620      0.85474        0.879
   63 38729449  610150  1831999   0.8531 0.00646      0.84058        0.866
   64 37939901  631582  2062476   0.8389 0.00681      0.82570        0.852
   65 37045773  490424  1866720   0.8278 0.00697      0.81429        0.842
   66 36484234  725868  1816373   0.8114 0.00720      0.79739        0.826
   67 35412718  669034  1961551   0.7960 0.00740      0.78168        0.811
   68 34219181  865503  1970789   0.7759 0.00765      0.76106        0.791
   69 32790489  799112  2042301   0.7570 0.00781      0.74184        0.772
   70 31195657  733611  1865163   0.7392 0.00802      0.72364        0.755
   71 29692017  746556  1886987   0.7206 0.00818      0.70476        0.737
   72 28225438  820400  1641929   0.6997 0.00828      0.68362        0.716
   73 26823493  948655  1639680   0.6749 0.00843      0.65860        0.692
   74 25280886  901502  1630582   0.6509 0.00852      0.63436        0.668
   75 23592595 1099775  1268701   0.6205 0.00871      0.60368        0.638
   76 21312276  892617  1323574   0.5945 0.00879      0.57754        0.612
   77 19100539 1001910  1120078   0.5633 0.00890      0.54616        0.581
   78 16982158  864515  1318894   0.5347 0.00898      0.51735        0.553
   79 14798749  920776  1191732   0.5014 0.00906      0.48395        0.519
   80 12686241  773496   830415   0.4708 0.00911      0.45330        0.489
   81 11082330  867991   894834   0.4339 0.00920      0.41628        0.452
   82  9319506  721505   859613   0.4004 0.00915      0.38282        0.419
   83  7738387  632673   470114   0.3676 0.00907      0.35026        0.386
   84  6635601  581957   528663   0.3354 0.00899      0.31822        0.353
   85  5524981  498080   575396   0.3051 0.00884      0.28831        0.323
   86  4451505  477410   612822   0.2724 0.00871      0.25587        0.290
   87  3361273  363057   420645   0.2430 0.00851      0.22688        0.260
   88  2577571  381135   399054   0.2071 0.00828      0.19146        0.224
   89  1797382  175449   343220   0.1869 0.00818      0.17149        0.204
   90  1278713  161757   247579   0.1632 0.00828      0.14777        0.180
   91   869376   92481   211594   0.1459 0.00838      0.13032        0.163
   92   565302   80894   144624   0.1250 0.00874      0.10897        0.143
   93   339784   33117   122395   0.1128 0.00930      0.09597        0.133
   94   184271   15386   112981   0.1034 0.01020      0.08520        0.125
   95    55904    7753    37973   0.0890 0.01538      0.06347        0.125
   96    10178    7506     1459   0.0234 0.02100      0.00402        0.136

Kaplan-Meier Plots

data  %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ 1, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 1. Kaplan-Meier Plot of Survival")+
  add_censor_mark()+
  add_quantile()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

According to the plot above, median age at death is 79 years. This means that half of the subjects died before 79 and half of them died after 79 years.

data  %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ SEX, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 2. Kaplan-Meier Plot of Survival by Sex")+
  add_quantile()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

coxph(Surv(NEXAMAGE, ageevent, event) ~ SEX, data=data) %>% summary()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ SEX, data = data)

  n= 14081, number of events= 4531 
   (326 observations deleted due to missingness)

           coef exp(coef) se(coef)     z Pr(>|z|)    
SEXMale 0.56749   1.76383  0.02992 18.96   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

        exp(coef) exp(-coef) lower .95 upper .95
SEXMale     1.764     0.5669     1.663      1.87

Concordance= 0.577  (se = 0.004 )
Likelihood ratio test= 359.6  on 1 df,   p=<2e-16
Wald test            = 359.7  on 1 df,   p=<2e-16
Score (logrank) test = 369.2  on 1 df,   p=<2e-16

Figure 2. shows marked differences in survival by sex. The median survival time for males was 75 years, whereas the median survival time for females was 82 years. The logrank test 𝛘2 = 369.2, P < 0.001 reveals significant differences in survival by sex.

data  %>%
  filter(RACE %in% c("Black","White")) %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ RACE + SEX, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 3. Kaplan-Meier Plot of Survival by Sex and Race")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

data  %>%
  filter(RACE %in% c("Black","White")) %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ RACE +  SEX, data = .) 
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call: survfit(formula = Surv(NEXAMAGE, ageevent, event) ~ RACE + SEX, 
    data = .)

   313 observations deleted due to missingness 
                       records n.max n.start events median 0.95LCL 0.95UCL
RACE=White, SEX=Female    6994  3153     216   1670     83      82      83
RACE=White, SEX=Male      4813  1665     106   1978     76      75      77
RACE=Black, SEX=Female    1325   609      29    435     77      75      79
RACE=Black, SEX=Male       790   300      13    411     71      68      74
coxph(Surv(NEXAMAGE, ageevent, event) ~ RACE , data=data) %>% summary()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ RACE, data = data)

  n= 14081, number of events= 4531 
   (326 observations deleted due to missingness)

                                         coef exp(coef) se(coef)      z
RACEAleut, Eskimo or American Indian  0.30435   1.35574  0.33379  0.912
RACEAsian/Pacific Islander            0.16876   1.18384  0.20076  0.841
RACEBlack                             0.33415   1.39675  0.03817  8.754
RACEOther                            -0.44689   0.63961  0.57769 -0.774
                                     Pr(>|z|)    
RACEAleut, Eskimo or American Indian    0.362    
RACEAsian/Pacific Islander              0.401    
RACEBlack                              <2e-16 ***
RACEOther                               0.439    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                                     exp(coef) exp(-coef) lower .95 upper .95
RACEAleut, Eskimo or American Indian    1.3557     0.7376    0.7048     2.608
RACEAsian/Pacific Islander              1.1838     0.8447    0.7987     1.755
RACEBlack                               1.3967     0.7159    1.2961     1.505
RACEOther                               0.6396     1.5634    0.2062     1.984

Concordance= 0.529  (se = 0.003 )
Likelihood ratio test= 73.13  on 4 df,   p=5e-15
Wald test            = 78.22  on 4 df,   p=4e-16
Score (logrank) test = 78.96  on 4 df,   p=3e-16

Figure 3. above limits the analysis to Black and white individuals for the sake of simplicity and visual clarity. White females’ survival was best overall (medianage at death = 83) compared to Black males (medianage at death = 71).

The logrank test 𝛘2 = 78.96, P < 0.001 reveals significant differences in survival by race.

data  %>%
  filter(DRINKFRQ71 != "Blank, but applicable") %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRINKFRQ71, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 4. Kaplan-Meier Plot of Survival by Drinking Frequency")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

coxph(Surv(NEXAMAGE, ageevent, event) ~ DRINKFRQ71 , data=data) %>% summary()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRINKFRQ71, 
    data = data)

  n= 10008, number of events= 2769 
   (4399 observations deleted due to missingness)

                                                        coef exp(coef) se(coef)
DRINKFRQ71About 1-4 times a month                    0.02803   1.02843  0.05581
DRINKFRQ71About 2 or 3 times a week                  0.14132   1.15180  0.06452
DRINKFRQ71Blank, but applicable                     -0.12552   0.88204  0.57900
DRINKFRQ71Everyday                                   0.23250   1.26175  0.06257
DRINKFRQ71Just about everyday                        0.21924   1.24513  0.09031
DRINKFRQ71More than 3 but less than 12 times a year  0.03151   1.03201  0.06795
                                                         z Pr(>|z|)    
DRINKFRQ71About 1-4 times a month                    0.502 0.615485    
DRINKFRQ71About 2 or 3 times a week                  2.190 0.028490 *  
DRINKFRQ71Blank, but applicable                     -0.217 0.828378    
DRINKFRQ71Everyday                                   3.716 0.000203 ***
DRINKFRQ71Just about everyday                        2.428 0.015201 *  
DRINKFRQ71More than 3 but less than 12 times a year  0.464 0.642906    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                                                    exp(coef) exp(-coef)
DRINKFRQ71About 1-4 times a month                       1.028     0.9724
DRINKFRQ71About 2 or 3 times a week                     1.152     0.8682
DRINKFRQ71Blank, but applicable                         0.882     1.1337
DRINKFRQ71Everyday                                      1.262     0.7926
DRINKFRQ71Just about everyday                           1.245     0.8031
DRINKFRQ71More than 3 but less than 12 times a year     1.032     0.9690
                                                    lower .95 upper .95
DRINKFRQ71About 1-4 times a month                      0.9219     1.147
DRINKFRQ71About 2 or 3 times a week                    1.0150     1.307
DRINKFRQ71Blank, but applicable                        0.2836     2.744
DRINKFRQ71Everyday                                     1.1161     1.426
DRINKFRQ71Just about everyday                          1.0431     1.486
DRINKFRQ71More than 3 but less than 12 times a year    0.9033     1.179

Concordance= 0.531  (se = 0.006 )
Likelihood ratio test= 21.85  on 6 df,   p=0.001
Wald test            = 22.3  on 6 df,   p=0.001
Score (logrank) test = 22.37  on 6 df,   p=0.001

Figure 4. above shows significant differences in survival by drinking frequency. The log rank test reveals significant differences in survival by drinking frequency habits, 𝛘2 = 22.37, P < 0.01.

data  %>%
  filter(DRNKTYPE71 != "None") %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 5. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

coxph(Surv(NEXAMAGE, ageevent, event) ~ DRNKTYPE71 , data=data) %>% summary()
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in agreg.fit(X, Y, istrat, offset, init, control, weights = weights, :
Loglik converged before variable 2 ; beta may be infinite.
Call:
coxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRNKTYPE71, 
    data = data)

  n= 8134, number of events= 2213 
   (6273 observations deleted due to missingness)

                       coef  exp(coef)   se(coef)      z Pr(>|z|)    
DRNKTYPE71Liquor -2.393e-01  7.872e-01  4.570e-02 -5.237 1.63e-07 ***
DRNKTYPE71None   -1.117e+01  1.410e-05  4.467e+02 -0.025     0.98    
DRNKTYPE71Wine   -4.141e-01  6.609e-01  6.706e-02 -6.175 6.60e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                 exp(coef) exp(-coef) lower .95 upper .95
DRNKTYPE71Liquor 0.7871642      1.270    0.7197    0.8609
DRNKTYPE71None   0.0000141  70900.262    0.0000       Inf
DRNKTYPE71Wine   0.6609332      1.513    0.5795    0.7538

Concordance= 0.547  (se = 0.006 )
Likelihood ratio test= 50.51  on 3 df,   p=6e-11
Wald test            = 49.42  on 3 df,   p=1e-10
Score (logrank) test = 50.17  on 3 df,   p=7e-11

Figure 5 above shows significant differences in survival by drinking frequency. The survival of wine drinkers is the best, followed by liquor, and beer drinkers. The log rank test reveals significant differences in survival by drinking type habits, 𝛘2 = 50.17, P < 0.001.

data  %>%
  filter(DRNKTYPE71 != "None") %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + SEX, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and Sex")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

data  %>%
  filter(DRNKTYPE71 != "None") %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + BACHELORS, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and College Education")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

data  %>%
  filter(DRNKTYPE71 != "None") %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + HSGRAD, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and High School Education")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

data  %>%
  filter(DRNKTYPE71 != "None") %>%
survfit2(Surv(NEXAMAGE,ageevent,event) ~ DRNKTYPE71 + INCOMEGT25K, data = .) %>% 
  ggsurvfit(linetype_aes = TRUE, linewidth =1) +
  labs(
    x = "Age",
    y = "Cumulative survival probability"
    ) + 
  add_confidence_interval() +
add_risktable(
  risktable_stats =
    c("{n.risk} ({cum.event})",
      "{round(estimate*100)}% ({round(conf.low*100)}, {round(conf.high*100)})"),
  stats_label = c("At Risk (Cum. Events)", "Survival (95% CI)")
)+
  ggtitle("Figure 6. Kaplan-Meier Plot of Survival by Self-Identified Drinker Type and Income > 25,000")
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Cox Proportion Hazards Models

Unweighted Model

design.INT <- survey::svydesign(strata=~STRATA, id=~PSU, weights=~WT1100,
                        nest=TRUE, data=data)

cox1<- survey::svycoxph(Surv(NEXAMAGE, ageevent, event) ~  SEX + RACE  , 
                      design = design.INT)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
summary(cox1)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Call:
svycoxph(formula = Surv(NEXAMAGE, ageevent, event) ~ SEX + RACE, 
    design = design.INT)

  n= 14081, number of events= 4531 
   (326 observations deleted due to missingness)

                                         coef exp(coef) se(coef) robust se
SEXMale                               0.57872   1.78375  0.03446   0.03791
RACEAleut, Eskimo or American Indian  0.09014   1.09433  0.44083   0.35005
RACEAsian/Pacific Islander           -0.57789   0.56108  0.34802   0.29021
RACEBlack                             0.44590   1.56190  0.05221   0.07323
RACEOther                            -1.84704   0.15770  0.94672   0.79688
                                          z Pr(>|z|)    
SEXMale                              15.265  < 2e-16 ***
RACEAleut, Eskimo or American Indian  0.258   0.7968    
RACEAsian/Pacific Islander           -1.991   0.0464 *  
RACEBlack                             6.089 1.14e-09 ***
RACEOther                            -2.318   0.0205 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                                     exp(coef) exp(-coef) lower .95 upper .95
SEXMale                                 1.7837     0.5606   1.65601    1.9213
RACEAleut, Eskimo or American Indian    1.0943     0.9138   0.55105    2.1732
RACEAsian/Pacific Islander              0.5611     1.7823   0.31769    0.9910
RACEBlack                               1.5619     0.6402   1.35307    1.8030
RACEOther                               0.1577     6.3410   0.03308    0.7519

Concordance= 0.595  (se = 0.007 )
Likelihood ratio test= NA  on 5 df,   p=NA
Wald test            = 325.4  on 5 df,   p=<2e-16
Score (logrank) test = NA  on 5 df,   p=NA

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).
sjPlot::tab_model(cox1, p.style = "stars", show.aicc = TRUE, use.viewer = TRUE) 
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
  Dependent variable
Predictors Estimates CI
SEX [Male] 1.78 *** 1.67 – 1.91
RACE [Aleut, Eskimo or
American Indian]
1.09 0.46 – 2.60
RACE [Asian/Pacific
Islander]
0.56 0.28 – 1.11
RACE [Black] 1.56 * 1.41 – 1.73
RACE [Other] 0.16 0.02 – 1.01
Observations 14081
R2 Nagelkerke 0.025
AICc 57254.328
* p<0.05   ** p<0.01   *** p<0.001

Person-Years Analysis

pydata<-survSplit(data, cut=c(24:97),end="ageevent",event="event")

pydata0 <- pydata %>% filter(ageevent >= NEXAMAGE & DRNKTYPE71 %in% c("Beer","Wine","Liquor"))

probs <- pydata0 %>%
    dplyr::group_by(ageevent, DRNKTYPE71) %>%
    dplyr::summarize(p=weighted.mean(event,WT1100, na.rm = TRUE),n=n())
`summarise()` has grouped output by 'ageevent'. You can override using the
`.groups` argument.
probs$DRNKTYPE71 <- factor(probs$DRNKTYPE71)

probs$num <- probs$p*(1-probs$p)

probs$sep <- sqrt(probs$num/probs$n)

probs$me <- 2*probs$sep

probs$`Drinker Type` <- probs$DRNKTYPE71

ggplot(
  data = probs,
  aes(
    x = ageevent,
    y = p,
    ymin = p - me,
    ymax = p + me,
    fill =  `Drinker Type`,
    color = `Drinker Type`,
    group = `Drinker Type`
  )
) +
  geom_line(size = 1) + geom_ribbon(alpha =
                                                                                                0.15, aes(color = NULL)) +
  labs(title = "Mortality by Drinker Type",
       subtitle = "Ages 25 to 97",
       caption = "NHEFS 1992 Followup") +
  xlab(label = "Age at Death") +
  ylab(label = "Hazard of Death") +
  theme_pubclean()
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.

The figure above shows mortality profile by self-identified drinker type. As shown below, generally beer drinkers have greater mortality and hazards across the lifespan. shows a Kaplan-Meier plot of survival disaggregated by drinker type, whereas the figure below shows hazards by drinker type. The hazards are highest across the lifespan for self-identified beer drinkers. Finally, the log rank test reveals significant differences in survival by drinking type habits, 𝛘2   =  50.17, P < 0.001.

            The cox proportional hazards model in Table 3 indicates differences in survival of for all cause death by sex and race. Holding all else constant, the hazard of death is 1.77 times higher for males individuals, HR = 1.77(95% CI, 1.60 – 1.95); P < 0.001, compared to females at any given time. Moreover, the hazard of death is also significant higher for Black individuals, HR = 1.62 (95% CI, 1.41 – 1.87); P < 0.001.  No significant differences were found for survival across drinking types.  Figure 2 shows significant differences in survival by drinking frequency. The log rank test reveals significant differences in survival by drinking frequency habits, 𝛘2   =  22.37, P < 0.01.

s1 <- survfit(Surv(ageevent, event) ~ 1, data = data)

Supplement A: Creation of NHEFS Participant Weights

Derived from: Appendix Ill SAS code for computing sample weights for locations 1-100.

# Input NHEFS vital status file
#NHEFSVTS <- read.fwf("your_SAS_data_file.txt", widths = c(5, 2, 1, 1), col.names = c("SEQNO", "AGE", "SEX", "RACE"))

# Sort the data by SEQNO
#NHEFSVTS <- NHEFSVTS[order(NHEFSVTS$SEQNO), ]

# Input NHANES I sample weights
#NHANESI <- read.fwf("your_other_SAS_data_file.txt", widths = c(5, 6, 6), col.names = c("SEQNO", "WT165", "WT661OO"))

# Sort the data by SEQNO
#NHANESI <- NHANESZ[order(NHANESZ$SEQNO), ]

# Merge NHEFSVTS and NHANESZ by SEQNO

    NHEFS_Vital92 <- read.csv("NHEFS_Vital92.csv")
   
   NHANESI <- read_xlsx("NHANESI.xlsx")
   
   NHANESI$SEQNUM <- NHANESI$SEQN

merged_data <- merge(NHEFS_Vital92, NHANESI, by = "SEQNUM")


merged_data$SEX <- merged_data$SEXSUBJ


merged_data$WT165 <- merged_data$N1AH0176 
merged_data$WT66100 <- merged_data$N1AH0182



# # Input NHEFS vital status file
# NHEFSVTS <- read.fwf("your_SAS_data_file.txt", widths = c(5, 2, 1, 1), col.names = c("SEQNO", "AGE", "SEX", "RACE"))
# 
# # Sort the data by SEQNO
# NHEFSVTS <- NHEFSVTS[order(NHEFSVTS$SEQNO), ]
# 
# # Input NHANES I sample weights
# NHANESZ <- read.fwf("your_other_SAS_data_file.txt", widths = c(5, 6, 6), col.names = c("SEQNO", "WT165", "WT661OO"))
# 
# # Sort the data by SEQNO
# NHANESZ <- NHANESZ[order(NHANESZ$SEQNO), ]
# 
# # Merge NHEFSVTS and NHANESZ by SEQNO
# merged_data <- merge(NHEFSVTS, NHANESZ, by = "SEQNO")

# Adjust weights based on conditions
merged_data$AGEC <- cut(merged_data$NEXAMAGE, breaks = c(-Inf, 45, 64, Inf), labels = c(25, 45, 65), right = FALSE)
merged_data$RACEC <- ifelse(merged_data$REVRACE == 3, 2, 1)

merged_data$ADJ <- ifelse(merged_data$AGEC == 25 & merged_data$SEX == 1 & merged_data$RACEC == 1, 1255/1804,
                          ifelse(merged_data$AGEC == 25 & merged_data$SEX == 2 & merged_data$RACEC == 1, 2879/3661,
                                 ifelse(merged_data$AGEC == 45 & merged_data$SEX == 1 & merged_data$RACEC == 1, 1152/1661,
                                        ifelse(merged_data$AGEC == 45 & merged_data$SEX == 2 & merged_data$RACEC == 1, 1263/1875,
                                               ifelse(merged_data$AGEC == 65 & merged_data$SEX == 1 & merged_data$RACEC == 1, 1361/1523,
                                                      ifelse(merged_data$AGEC == 65 & merged_data$SEX == 2 & merged_data$RACEC == 1, 1503/1684,
                                                             ifelse(merged_data$AGEC == 25 & merged_data$SEX == 1 & merged_data$RACEC == 2, 203/251,
                                                                    ifelse(merged_data$AGEC == 25 & merged_data$SEX == 2 & merged_data$RACEC == 2, 658/734,
                                                                           ifelse(merged_data$AGEC == 45 & merged_data$SEX == 1 & merged_data$RACEC == 2, 214/259,
                                                                                  ifelse(merged_data$AGEC == 45 & merged_data$SEX == 2 & merged_data$RACEC == 2, 250/309,
                                                                                         ifelse(merged_data$AGEC == 65 & merged_data$SEX == 1 & merged_data$RACEC == 2, 294/313,
                                                                                                ifelse(merged_data$AGEC == 65 & merged_data$SEX == 2 & merged_data$RACEC == 2, 316/333, NA))))))))))))

# Adjust weights based on conditions
merged_data$WT1100 <- ifelse(!is.na(merged_data$WT66100), round(merged_data$WT66100 * (1 - merged_data$ADJ), 1),
                             round(merged_data$WT165 * merged_data$ADJ, 1))

weights <- merged_data %>% select(SEQNUM, WT165,WT66100,WT1100)

write.csv(weights, file =  "NHEFS_weights.csv")
design.cox <- survey::svydesign(strata=~STRATA, id=~PSU, weights=~WT1100,
                        nest=TRUE, data=data)

cox <- survey::svycoxph(Surv(NEXAMAGE, ageevent, event) ~ DRNKVOL71, 
                      design = design.cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in agreg.fit(X, Y, istrat, offset, init, control, weights = weights, :
Loglik converged before variable 2,5,14,17,19 ; beta may be infinite.
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
cox.zph(cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
           chisq df p
DRNKVOL71 0.0027 26 1
GLOBAL    0.0027 26 1
survminer::ggcoxdiagnostics(cox)
Warning: `gather_()` was deprecated in tidyr 1.2.0.
ℹ Please use `gather()` instead.
ℹ The deprecated feature was likely used in the survminer package.
  Please report the issue at <https://github.com/kassambara/survminer/issues>.
`geom_smooth()` using formula = 'y ~ x'
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: pseudoinverse used at 0
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: neighborhood radius 0.15756
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: reciprocal condition number 2.0408e-14
Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: There are other near singularities as well. 0.014619
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at 0
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius
0.15756
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
number 2.0408e-14
Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : There are other near
singularities as well. 0.014619

summary(cox)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Call:
svycoxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRNKVOL71, 
    design = design.cox)

  n= 8134, number of events= 2213 
   (6273 observations deleted due to missingness)

                     coef  exp(coef)   se(coef)  robust se       z Pr(>|z|)    
DRNKVOL7110     1.075e+00  2.929e+00  2.083e-01  2.550e-01   4.215 2.50e-05 ***
DRNKVOL7111    -1.241e+01  4.094e-06  1.002e+03  5.367e-01 -23.114  < 2e-16 ***
DRNKVOL7112     9.482e-01  2.581e+00  2.675e-01  3.656e-01   2.594 0.009493 ** 
DRNKVOL7113     2.675e+00  1.451e+01  7.039e-01  8.607e-01   3.108 0.001886 ** 
DRNKVOL7114    -1.240e+01  4.128e-06  1.376e+03  8.089e-01 -15.326  < 2e-16 ***
DRNKVOL7115     4.119e-01  1.510e+00  4.927e-01  6.182e-01   0.666 0.505218    
DRNKVOL7116     1.713e+00  5.544e+00  4.037e-01  4.563e-01   3.753 0.000175 ***
DRNKVOL7117     5.947e-01  1.813e+00  6.567e-01  1.215e+00   0.489 0.624604    
DRNKVOL7118     7.360e-01  2.088e+00  5.683e-01  2.041e-01   3.606 0.000311 ***
DRNKVOL712      3.665e-02  1.037e+00  5.651e-02  7.877e-02   0.465 0.641718    
DRNKVOL7120     1.678e+00  5.354e+00  4.050e-01  5.719e-01   2.934 0.003345 ** 
DRNKVOL7124     1.790e+00  5.991e+00  7.026e-01  4.712e-01   3.800 0.000145 ***
DRNKVOL7125    -4.257e-01  6.533e-01  3.632e+00  1.414e+00  -0.301 0.763371    
DRNKVOL7127    -1.235e+01  4.331e-06  6.062e+02  1.003e+00 -12.312  < 2e-16 ***
DRNKVOL713      1.576e-01  1.171e+00  7.635e-02  1.001e-01   1.573 0.115613    
DRNKVOL7130     2.048e+00  7.753e+00  5.753e-01  6.119e-01   3.347 0.000816 ***
DRNKVOL7136    -1.237e+01  4.264e-06  1.202e+03  8.759e-01 -14.117  < 2e-16 ***
DRNKVOL714      4.919e-01  1.635e+00  8.734e-02  1.219e-01   4.035 5.45e-05 ***
DRNKVOL7145    -1.237e+01  4.245e-06  1.426e+03  1.006e+00 -12.302  < 2e-16 ***
DRNKVOL7148     1.364e+00  3.913e+00  1.373e+00  5.874e-01   2.323 0.020199 *  
DRNKVOL715      5.839e-01  1.793e+00  1.228e-01  1.644e-01   3.553 0.000381 ***
DRNKVOL716      7.501e-01  2.117e+00  1.080e-01  1.631e-01   4.598 4.27e-06 ***
DRNKVOL717      2.128e-01  1.237e+00  3.661e-01  5.669e-01   0.375 0.707440    
DRNKVOL718      6.174e-01  1.854e+00  1.896e-01  2.425e-01   2.546 0.010903 *  
DRNKVOL719      6.798e-01  1.973e+00  2.992e-01  2.429e-01   2.799 0.005127 ** 
DRNKVOL71Blank -8.329e-01  4.348e-01  1.119e+00  8.861e-01  -0.940 0.347224    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

               exp(coef) exp(-coef) lower .95 upper .95
DRNKVOL7110    2.929e+00  3.414e-01 1.777e+00 4.829e+00
DRNKVOL7111    4.094e-06  2.443e+05 1.430e-06 1.172e-05
DRNKVOL7112    2.581e+00  3.874e-01 1.261e+00 5.284e+00
DRNKVOL7113    1.451e+01  6.893e-02 2.685e+00 7.839e+01
DRNKVOL7114    4.128e-06  2.423e+05 8.456e-07 2.015e-05
DRNKVOL7115    1.510e+00  6.624e-01 4.495e-01 5.071e+00
DRNKVOL7116    5.544e+00  1.804e-01 2.267e+00 1.356e+01
DRNKVOL7117    1.813e+00  5.517e-01 1.674e-01 1.963e+01
DRNKVOL7118    2.088e+00  4.790e-01 1.399e+00 3.115e+00
DRNKVOL712     1.037e+00  9.640e-01 8.889e-01 1.211e+00
DRNKVOL7120    5.354e+00  1.868e-01 1.746e+00 1.642e+01
DRNKVOL7124    5.991e+00  1.669e-01 2.379e+00 1.508e+01
DRNKVOL7125    6.533e-01  1.531e+00 4.087e-02 1.044e+01
DRNKVOL7127    4.331e-06  2.309e+05 6.063e-07 3.093e-05
DRNKVOL713     1.171e+00  8.542e-01 9.620e-01 1.425e+00
DRNKVOL7130    7.753e+00  1.290e-01 2.337e+00 2.572e+01
DRNKVOL7136    4.264e-06  2.345e+05 7.661e-07 2.374e-05
DRNKVOL714     1.635e+00  6.115e-01 1.288e+00 2.077e+00
DRNKVOL7145    4.245e-06  2.356e+05 5.915e-07 3.046e-05
DRNKVOL7148    3.913e+00  2.556e-01 1.237e+00 1.237e+01
DRNKVOL715     1.793e+00  5.577e-01 1.299e+00 2.474e+00
DRNKVOL716     2.117e+00  4.723e-01 1.538e+00 2.915e+00
DRNKVOL717     1.237e+00  8.084e-01 4.072e-01 3.758e+00
DRNKVOL718     1.854e+00  5.394e-01 1.153e+00 2.982e+00
DRNKVOL719     1.973e+00  5.067e-01 1.226e+00 3.176e+00
DRNKVOL71Blank 4.348e-01  2.300e+00 7.656e-02 2.469e+00

Concordance= 0.572  (se = 0.011 )
Likelihood ratio test= NA  on 26 df,   p=NA
Wald test            = 2216  on 26 df,   p=<2e-16
Score (logrank) test = NA  on 26 df,   p=NA

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).
design.cox <- survey::svydesign(strata=~STRATA, id=~PSU, weights=~WT1100,
                        nest=TRUE, data=data)

cox <- survey::svycoxph(Surv(NEXAMAGE, ageevent, event) ~ SEX + RACE + DRNKTYPE71 + INCOMEGT25K , 
                      design = design.cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
Warning in agreg.fit(X, Y, istrat, offset, init, control, weights = weights, :
Loglik converged before variable 7 ; beta may be infinite.
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
cox.zph(cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
               chisq df    p
SEX         3.29e-04  1 0.99
RACE        1.34e-03  4 1.00
DRNKTYPE71  4.65e-04  3 1.00
INCOMEGT25K 1.14e-06  1 1.00
GLOBAL      1.90e-03  9 1.00
survminer::ggcoxdiagnostics(cox)
`geom_smooth()` using formula = 'y ~ x'

summary(cox)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Call:
svycoxph(formula = Surv(NEXAMAGE, ageevent, event) ~ SEX + RACE + 
    DRNKTYPE71 + INCOMEGT25K, design = design.cox)

  n= 8134, number of events= 2213 
   (6273 observations deleted due to missingness)

                                           coef  exp(coef)   se(coef)
SEXMale                               5.689e-01  1.766e+00  4.971e-02
RACEAleut, Eskimo or American Indian  2.550e-01  1.290e+00  5.239e-01
RACEAsian/Pacific Islander           -8.489e-01  4.279e-01  4.854e-01
RACEBlack                             4.838e-01  1.622e+00  7.221e-02
RACEOther                            -2.914e+00  5.426e-02  1.866e+00
DRNKTYPE71Liquor                     -1.504e-01  8.603e-01  4.950e-02
DRNKTYPE71None                       -1.133e+01  1.198e-05  4.380e+02
DRNKTYPE71Wine                       -2.892e-01  7.489e-01  7.617e-02
INCOMEGT25KTRUE                      -4.801e-01  6.187e-01  1.049e-01
                                      robust se       z Pr(>|z|)    
SEXMale                               6.262e-02   9.084  < 2e-16 ***
RACEAleut, Eskimo or American Indian  2.751e-01   0.927  0.35389    
RACEAsian/Pacific Islander            4.126e-01  -2.058  0.03963 *  
RACEBlack                             9.958e-02   4.858 1.18e-06 ***
RACEOther                             1.184e+00  -2.461  0.01385 *  
DRNKTYPE71Liquor                      6.483e-02  -2.321  0.02031 *  
DRNKTYPE71None                        9.773e-01 -11.596  < 2e-16 ***
DRNKTYPE71Wine                        1.011e-01  -2.859  0.00425 ** 
INCOMEGT25KTRUE                       1.541e-01  -3.117  0.00183 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                                     exp(coef) exp(-coef) lower .95 upper .95
SEXMale                              1.766e+00  5.661e-01 1.562e+00 1.997e+00
RACEAleut, Eskimo or American Indian 1.290e+00  7.749e-01 7.527e-01 2.213e+00
RACEAsian/Pacific Islander           4.279e-01  2.337e+00 1.906e-01 9.605e-01
RACEBlack                            1.622e+00  6.164e-01 1.335e+00 1.972e+00
RACEOther                            5.426e-02  1.843e+01 5.329e-03 5.525e-01
DRNKTYPE71Liquor                     8.603e-01  1.162e+00 7.577e-01 9.769e-01
DRNKTYPE71None                       1.198e-05  8.350e+04 1.764e-06 8.131e-05
DRNKTYPE71Wine                       7.489e-01  1.335e+00 6.142e-01 9.131e-01
INCOMEGT25KTRUE                      6.187e-01  1.616e+00 4.575e-01 8.368e-01

Concordance= 0.62  (se = 0.01 )
Likelihood ratio test= NA  on 9 df,   p=NA
Wald test            = 375.8  on 9 df,   p=<2e-16
Score (logrank) test = NA  on 9 df,   p=NA

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).
design.cox <- survey::svydesign(strata=~STRATA, id=~PSU, weights=~WT1100,
                        nest=TRUE, data=data)

cox <- survey::svycoxph(Surv(NEXAMAGE, ageevent, event) ~ DRINKFRQ71 , 
                      design = design.cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
cox.zph(cox)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
             chisq df p
DRINKFRQ71 0.00202  6 1
GLOBAL     0.00202  6 1
survminer::ggcoxdiagnostics(cox)
`geom_smooth()` using formula = 'y ~ x'

summary(cox)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Call:
svycoxph(formula = Surv(NEXAMAGE, ageevent, event) ~ DRINKFRQ71, 
    design = design.cox)

  n= 10008, number of events= 2769 
   (4399 observations deleted due to missingness)

                                                        coef exp(coef) se(coef)
DRINKFRQ71About 1-4 times a month                    0.08073   1.08407  0.06351
DRINKFRQ71About 2 or 3 times a week                  0.11332   1.12000  0.07104
DRINKFRQ71Blank, but applicable                     -1.67831   0.18669  0.95513
DRINKFRQ71Everyday                                   0.33181   1.39348  0.06765
DRINKFRQ71Just about everyday                        0.09576   1.10050  0.09823
DRINKFRQ71More than 3 but less than 12 times a year  0.08663   1.09049  0.07529
                                                    robust se      z Pr(>|z|)
DRINKFRQ71About 1-4 times a month                     0.07205  1.120 0.262537
DRINKFRQ71About 2 or 3 times a week                   0.09051  1.252 0.210521
DRINKFRQ71Blank, but applicable                       0.93682 -1.791 0.073213
DRINKFRQ71Everyday                                    0.08773  3.782 0.000155
DRINKFRQ71Just about everyday                         0.13572  0.706 0.480429
DRINKFRQ71More than 3 but less than 12 times a year   0.08296  1.044 0.296390
                                                       
DRINKFRQ71About 1-4 times a month                      
DRINKFRQ71About 2 or 3 times a week                    
DRINKFRQ71Blank, but applicable                     .  
DRINKFRQ71Everyday                                  ***
DRINKFRQ71Just about everyday                          
DRINKFRQ71More than 3 but less than 12 times a year    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                                                    exp(coef) exp(-coef)
DRINKFRQ71About 1-4 times a month                      1.0841     0.9224
DRINKFRQ71About 2 or 3 times a week                    1.1200     0.8929
DRINKFRQ71Blank, but applicable                        0.1867     5.3565
DRINKFRQ71Everyday                                     1.3935     0.7176
DRINKFRQ71Just about everyday                          1.1005     0.9087
DRINKFRQ71More than 3 but less than 12 times a year    1.0905     0.9170
                                                    lower .95 upper .95
DRINKFRQ71About 1-4 times a month                     0.94130     1.248
DRINKFRQ71About 2 or 3 times a week                   0.93795     1.337
DRINKFRQ71Blank, but applicable                       0.02976     1.171
DRINKFRQ71Everyday                                    1.17335     1.655
DRINKFRQ71Just about everyday                         0.84346     1.436
DRINKFRQ71More than 3 but less than 12 times a year   0.92684     1.283

Concordance= 0.539  (se = 0.009 )
Likelihood ratio test= NA  on 6 df,   p=NA
Wald test            = 18.69  on 6 df,   p=0.005
Score (logrank) test = NA  on 6 df,   p=NA

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).

Discussion

Conclusions. The present study analyzed the survival differentials of an aging U.S. civilian cohort. Survival times differed by expected and established sociodemographic controls, sex and race. Specifically, the all-cause mortality of men and Black individuals was found to be lower compared to their respective counterparts. These findings are in accord with the extant literature. The final Cox proportional hazard model did not show statistically significant differences in hazards by drink type (beer, wine, or liquor) or by drinking frequency.

Despite the fact that these findings differ from that of Sempos et al. and Klatsky et al. there are salient insights to be discussed. Even in light of the lack of statistical significance, the general trends of this analysis are in accord with the extant alcohol literature. Indeed, as Klatsky reports, greater hazards of mortality were observed for beer drinkers relative to wine drinkers.18 However, the intermediate position of liquor drinkers shown in Figure Panel A was not expected. The volume of drinks likely differs by type of alcohol and the hazards of heavy drinkers may be disproportionally represented in that type.

Limitations. This study has several limitations. First, the NHEFS only reports alcohol consumption behaviors at the NHANES I 1971-1974 baseline. Thus, the NHEFS does not provide a more complete or granular view of alcohol consumption across the lifespan. It is entirely possible, and likely, that drinking behaviors transformed across the lifespan of the cohorts in the context of multifaceted, multi-level social, political, and environmental changes between baseline and follow-up. Second, another consideration observed in this study ones like it is that wine drinkers may be associated with greater survival simply because of a greater propensity to persist the behavior across the lifespan, relative to beer and liquor drinkers who may cease the behavior as they age.18   Third, all measures of alcohol consumption are self-reported and subject to social desirability and recall effects. Finally, all interpretations are merely associational in nature as causality cannot be inferred.

Future Directions. More research is needed to understand the motivational factors that differentially contribute to alcohol consumption behaviors and, by extension, mortality, and morbidity risk profiles. Future longitudinal studies should consider additional timepoints to generate a more complete history of alcohol consumption behavior. Additionally, future analyses should consider more social behaviors and contextual factors like volunteership and church attendance as covariates.

sjPlot::tab_model(cox, p.style = "stars", show.aicc = TRUE, use.viewer = TRUE) 
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Stratified 1 - level Cluster Sampling design (with replacement)
With (1369) clusters.
survey::svydesign(strata = ~STRATA, id = ~PSU, weights = ~WT1100, 
    nest = TRUE, data = data)
Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created

Warning in Surv(NEXAMAGE, ageevent, event): Stop time must be > start time, NA
created
  Dependent variable
Predictors Estimates CI
DRINKFRQ71About 1-4 times a month 1.08 0.96 – 1.23
DRINKFRQ71 [About 2 or 3
times a week]
1.12 0.97 – 1.29
DRINKFRQ71 [Blank, but
applicable]
0.19 0.03 – 1.21
DRINKFRQ71 [Everyday] 1.39 1.22 – 1.59
DRINKFRQ71 [Just about
everyday]
1.10 0.91 – 1.33
DRINKFRQ71 [More than 3
but less than 12 times a
year]
1.09 0.94 – 1.26
Observations 10008
R2 Nagelkerke 0.004
AICc 39123.549
* p<0.05   ** p<0.01   *** p<0.001

References

1.             Inc G. What Percentage of Americans Drink Alcohol? Gallup.com. Published December 29, 2022. Accessed October 20, 2023. https://news.gallup.com/poll/467507/percentage-americans-drink-alcohol.aspx

2.             Wittgens C, Muehlhan M, Kräplin A, Wolff M, Trautmann S. Underlying mechanisms in the relationship between stress and alcohol consumption in regular and risky drinkers (MESA): methods and design of a randomized laboratory study. BMC Psychology. 2022;10(1):233. doi:10.1186/s40359-022-00942-1

3.             Deaths and Years of Potential Life Lost From Excessive Alcohol Use — United States, 2011–2015 | MMWR. Accessed October 21, 2023. https://www.cdc.gov/mmwr/volumes/69/wr/mm6930a1.htm

4.             Khamis AA, Salleh SZ, Ab Karim MS, et al. Alcohol Consumption Patterns: A Systematic Review of Demographic and Sociocultural Influencing Factors. Int J Environ Res Public Health. 2022;19(13):8103. doi:10.3390/ijerph19138103

5.             Klatsky AL, Armstrong MA, Friedman GD. Alcohol and Mortality. Ann Intern Med. 1992;117(8):646-654. doi:10.7326/0003-4819-117-8-646

6.             Tsai MK, Gao W, Wen CP. The relationship between alcohol consumption and health: J-shaped or less is more? BMC Medicine. 2023;21(1):228. doi:10.1186/s12916-023-02911-w

7.             Benefits and hazards of alcohol-the J-shaped curve and public health | Emerald Insight. Accessed October 21, 2023. https://www.emerald.com/insight/content/doi/10.1108/DAT-09-2020-0059/full/html

8.             Thompson PL. J-curve revisited: cardiovascular benefits of moderate alcohol use cannot be dismissed. Med J Aust. 2013;198(8). Accessed October 21, 2023. https://www.mja.com.au/journal/2013/198/8/j-curve-revisited-cardiovascular-benefits-moderate-alcohol-use-cannot-be

9.             Marmot MG, Shipley MJ, Rose G, Thomas BrionyJ. ALCOHOL AND MORTALITY: A U-SHAPED CURVE. The Lancet. 1981;317(8220):580-583. doi:10.1016/S0140-6736(81)92032-8

10.          Tian Y, Liu J, Zhao Y, et al. Alcohol consumption and all-cause and cause-specific mortality among US adults: prospective cohort study. BMC Medicine. 2023;21(1):208. doi:10.1186/s12916-023-02907-6

11.           Stockwell T, Zhao J, Panwar S, Roemer A, Naimi T, Chikritzhs T. Do "Moderate" Drinkers Have Reduced Mortality Risk? A Systematic Review and Meta-Analysis of Alcohol Consumption and All-Cause Mortality. J Stud Alcohol Drugs. 2016;77(2):185-198. doi:10.15288/jsad.2016.77.185

12.          Sempos CT, Rehm J, Wu T, Crespo CJ, Trevisan M. Average Volume of Alcohol Consumption and All-Cause Mortality in African Americans: The NHEFS Cohort. Alcoholism: Clinical and Experimental Research. 2003;27(1):88-92. doi:10.1111/j.1530-0277.2003.tb02726.x

13.          Zhao J, Stockwell T, Roemer A, Naimi T, Chikritzhs T. Alcohol Consumption and Mortality From Coronary Heart Disease: An Updated Meta-Analysis of Cohort Studies. J Stud Alcohol Drugs. 2017;78(3):375-386. doi:10.15288/jsad.2017.78.375

14.          Pearlin LI, Menaghan EG, Lieberman MA, Mullan JT. The Stress Process. Journal of Health and Social Behavior. 1981;22(4):337. doi:10.2307/2136676

15.          DeAngelis RT. Striving While Black: Race and the Psychophysiology of Goal Pursuit. J Health Soc Behav. 2020;61(1):24-42. doi:10.1177/0022146520901695

16.          Cox WM, Klinger E. A motivational model of alcohol use. Journal of Abnormal Psychology. 1988;97(2):168-180. doi:10.1037/0021-843X.97.2.168

17.          Merrill JE, Thomas SE. Interactions between Adaptive Coping and Drinking to Cope in Predicting Naturalistic Drinking and Drinking Following a Lab-Based Psychosocial Stressor. Addict Behav. 2013;38(3):1672-1678. doi:10.1016/j.addbeh.2012.10.003

18.          Klatsky AL, Friedman GD, Armstrong MA, Kipp H. Wine, Liquor, Beer, and Mortality. American Journal of Epidemiology. 2003;158(6):585-595. doi:10.1093/aje/kwg184

19.          Klatsky AL, Armstrong MA, Kipp H. Correlates of alcoholic beverage preference: traits of persons who choose wine, liquor or beer. British Journal of Addiction. 1990;85(10):1279-1289. doi:10.1111/j.1360-0443.1990.tb01604.x

20.          Halonen JI, Kivimäki M, Pentti J, et al. Association of the Availability of Beer, Wine, and Liquor Outlets with Beverage-Specific Alcohol Consumption: A Cohort Study. Alcoholism: Clinical and Experimental Research. 2014;38(4):1086-1093. doi:10.1111/acer.12350

21.          Stephens MAC, Wand G. Stress and the HPA Axis. Alcohol Res. 2012;34(4):468-483.

22.          Badrick E, Bobak M, Britton A, Kirschbaum C, Marmot M, Kumari M. The Relationship between Alcohol Consumption and Cortisol Secretion in an Aging Cohort. J Clin Endocrinol Metab. 2008;93(3):750-757. doi:10.1210/jc.2007-0737

23.          High cortisol levels are associated with oxidative stress and mortality in maintenance hemodialysis patients - PMC. Accessed December 9, 2023. https://www-ncbi-nlm-nih-gov.libweb.lib.utsa.edu/pmc/articles/PMC8903641/

24.          Lorem G, Cook S, Leon DA, Emaus N, Schirmer H. Self-reported health as a predictor of mortality: A cohort study of its relation to other health measurements and observation time. Sci Rep. 2020;10(1):4886. doi:10.1038/s41598-020-61603-0

25.          Burstrom B, Fredlund P. Self rated health: Is it as good a predictor of subsequent mortality among adults in lower as well as in higher social classes? J Epidemiol Community Health. 2001;55(11):836-840. doi:10.1136/jech.55.11.836

26.          Elliott M, Lowman J. Education, income and alcohol misuse: a stress process model. Soc Psychiatry Psychiatr Epidemiol. 2015;50(1):19-26. doi:10.1007/s00127-014-0867-3

27.          Cox CS, Mussolino ME, Rothwell ST, et al. Plan and operation of the NHANES I Epidemiologic Followup Study, 1992. Vital Health Stat 1. 1997;(35):1-231.

28.          NHANES I - Epidemiologic Followup Study (NHEFS). Accessed October 12, 2023. https://wwwn.cdc.gov/nchs/nhanes/nhefs/