This example will illustrate how to fit the discrete time hazard model to person-period data. Specifically, this example illustrates various parameterizartions of time in the discrete time model. In this example, I will use the event of a couple having a second birth. The data for this example come from the model.data Demographic and Health Survey for 2012 children’s recode file. This file contains information for all births in the last 5 years prior to the survey.

The second example will use data from the IPUMS NHIS and examine alternative methods for coding time between survey and mortality follow up.

Continuous duration outcome - DHS data

load the data

library(haven)
model.dat<-read_dta("~/Google Drive/classes/dem7223/dem7223_19/data/zzir62dt/ZZIR62FL.DTA")

Event - Second birth occurrence

In the DHS individual recode file, information on every live birth is collected using a retrospective birth history survey mechanism.

Since our outcome is time between first and second birth, we must select as our risk set, only women who have had a first birth.

The bidx variable indexes the birth history and if bidx_01 is not missing, then the woman should be at risk of having a second birth (i.e. she has had a first birth, i.e. bidx_01==1).

I also select only non-twin births (b0 == 0).

The DHS provides the dates of when each child was born in Century Month Codes.

To get the interval for women who actually had a second birth, that is the difference between the CMC for the first birth b3_01 and the second birth b3_02, but for women who had not had a second birth by the time of the interview, the censored time between births is the difference between b3_01 and v008, the date of the interview.

We have 6161 women who are at risk of a second birth.

#We form a subset of variables
sub<-subset(model.dat, model.dat$bidx_01==1&model.dat$b0_01==0)

#Here I keep only a few of the variables for the dates, and some characteristics of the women, and details of the survey
sub2<-data.frame(CASEID=sub$caseid, 
                 int.cmc=sub$v008,
                 fbir.cmc=sub$b3_01,
                 sbir.cmc=sub$b3_02,
                 marr.cmc=sub$v509,
                 rural=sub$v025,
                 educ=sub$v106,
                 age=sub$v012,
                 partneredu=sub$v701,
                 partnerage=sub$v730,
                 weight=sub$v005/1000000,
                 psu=sub$v021, strata=sub$v022)

sub2$agefb = (sub2$age - (sub2$int.cmc - sub2$fbir.cmc)/12)

#censoring indicator for death by age 5, in months (<=60 months)
sub2$secbi<-ifelse(is.na(sub2$sbir.cmc)==T, ((sub2$int.cmc))-((sub2$fbir.cmc)), (sub2$fbir.cmc-sub2$sbir.cmc))
sub2$b2event<-ifelse(is.na(sub2$sbir.cmc)==T,0,1) 
table(sub2$b2event)

## 
##    0    1 
## 1237 4789

sub2$educ.high<-ifelse(sub2$educ %in% c(2,3), 1, 0)
sub2$age2<-(sub2$agefb)^2
sub2$partnerhiedu<-ifelse(sub2$partneredu<3,0,ifelse(sub2$partneredu%in%c(8,9),NA,1 ))

Create the person-period file

The distinction between the way we have been doing things and the discrete time model, is that we treat time discretely, versus continuously. This means that we transform the data from the case-duration data format to the person-period format. For this example, a natural choice would be year, since we have intervals of equal length (12 months each).

R provides a useful function called survSplit() in the survival library that will split a continuous duration into discrete periods.

library(survival)
#make person period file
pp<-survSplit(Surv(secbi, b2event)~. , data = sub2[sub2$secbi>0,],
              cut=c(0,12, 24, 36, 48, 60, 72),  episode="year_birth")

pp$year <- pp$year_birth-1
pp<-pp[order(pp$CASEID, pp$year_birth),]
head(pp[, c("CASEID", "secbi", "b2event", "year", "educ.high", "agefb")], n=20)

##             CASEID secbi b2event year educ.high    agefb
## 1          1  1  2    12       0    1         0 29.58333
## 2          1  1  2    24       0    2         0 29.58333
## 3          1  1  2    32       1    3         0 29.58333
## 4          1  3  2    12       0    1         1 19.50000
## 5          1  3  2    24       0    2         1 19.50000
## 6          1  3  2    30       0    3         1 19.50000
## 7          1  4  2    12       0    1         0 31.66667
## 8          1  4  2    24       0    2         0 31.66667
## 9          1  4  2    32       1    3         0 31.66667
## 10         1  4  3    12       0    1         0 24.16667
## 11         1  4  3    20       1    2         0 24.16667
## 12         1  5  1    12       0    1         1 24.91667
## 13         1  5  1    24       0    2         1 24.91667
## 14         1  5  1    33       1    3         1 24.91667
## 15         1  6  2    12       0    1         0 31.75000
## 16         1  6  2    24       0    2         0 31.75000
## 17         1  6  2    36       0    3         0 31.75000
## 18         1  6  2    48       0    4         0 31.75000
## 19         1  6  2    60       0    5         0 31.75000
## 20         1  6  2    72       0    6         0 31.75000

We see that each woman is not in the data for multiple “risk periods”, until they experience the event (birth) or are censored.

Discrete time model

So, the best thing about the discrete time model, is that it’s just logistic regression. Each risk period is treated as a single Bernoulli trial, and the birth can either occur (y=1) or not (y=0) in the period. This is how we get the hazard of the event, as the estimated probability of failure in each discrete time period. So, any method you would like to use to model this probability would probably work (logit, probit models), but I will show two standard approaches.

First, we will use the traditional logit link to the binomial distribution, then we will use the complementary log-log link. The latter is used because it preserves the proportional hazards property of the model, as in the Cox model.

#generate survey design
library(survey)

## Loading required package: grid

## Loading required package: Matrix

## 
## Attaching package: 'survey'

## The following object is masked from 'package:graphics':
## 
##     dotchart

options(survey.lonely.psu = "adjust")
des<-svydesign(ids=~psu, strata = ~strata , weights=~weight, data=pp)

#Fit the basic logistic model with ONLY time in the model
#I do -1 so that no intercept is fit in the model, and we get a hazard estimate for each time period
fit.0<-svyglm(b2event~as.factor(year)-1,design=des , family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

summary(fit.0)

## 
## Call:
## svyglm(formula = b2event ~ as.factor(year) - 1, design = des, 
##     family = "binomial")
## 
## Survey design:
## svydesign(ids = ~psu, strata = ~strata, weights = ~weight, data = pp)
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## as.factor(year)1 -4.62968    0.17640 -26.245  < 2e-16 ***
## as.factor(year)2 -1.80896    0.05286 -34.222  < 2e-16 ***
## as.factor(year)3 -0.76732    0.04603 -16.669  < 2e-16 ***
## as.factor(year)4 -0.70704    0.06203 -11.399  < 2e-16 ***
## as.factor(year)5 -0.92652    0.06158 -15.045  < 2e-16 ***
## as.factor(year)6 -0.92979    0.08186 -11.359  < 2e-16 ***
## as.factor(year)7  0.80355    0.10880   7.386  5.1e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1.000044)
## 
## Number of Fisher Scoring iterations: 7

#Plot the hazard function on the probability scale
haz<-1/(1+exp(-coef(fit.0)))
time<-seq(1,7,1)
plot(haz~time, type="l", ylab="h(t)")
title(main="Discrete Time Hazard Function for second birth")

Alternative time specifications

Here, we can specify how we want time in our model. The general model fits a hazard at every time point. If you have a lot of time points, and especially if you have low numbers of events at some time points, this can be computationally expensive.

We can, however, specify time as a linear or quadratic term, and the model will not fit separate hazards at all times, instead, the baseline hazard will be a linear or curvilinear function of time.

#Linear term for time
fit.l<-svyglm(b2event~year,design=des , family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

summary(fit.l)

## 
## Call:
## svyglm(formula = b2event ~ year, design = des, family = "binomial")
## 
## Survey design:
## svydesign(ids = ~psu, strata = ~strata, weights = ~weight, data = pp)
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.90143    0.04425  -65.56   <2e-16 ***
## year         0.48399    0.01231   39.32   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 0.9228536)
## 
## Number of Fisher Scoring iterations: 4

Which shows the hazard decreases over time, which makes a lot of sense in the context of this outcome. Now we can consider quadratic terms for time and a natural spline function of time as well:

fit.s<-svyglm(b2event~year+I(year^2),design=des , family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

summary(fit.s)

## 
## Call:
## svyglm(formula = b2event ~ year + I(year^2), design = des, family = "binomial")
## 
## Survey design:
## svydesign(ids = ~psu, strata = ~strata, weights = ~weight, data = pp)
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4.364954   0.099119  -44.04   <2e-16 ***
## year         1.421368   0.064178   22.15   <2e-16 ***
## I(year^2)   -0.121211   0.008288  -14.62   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 0.8899194)
## 
## Number of Fisher Scoring iterations: 5

fit.c<-svyglm(b2event~year+I(year^2)+I(year^3 ),design=des , family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

summary(fit.c)

## 
## Call:
## svyglm(formula = b2event ~ year + I(year^2) + I(year^3), design = des, 
##     family = "binomial")
## 
## Survey design:
## svydesign(ids = ~psu, strata = ~strata, weights = ~weight, data = pp)
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -10.193252   0.355177  -28.70   <2e-16 ***
## year          7.053531   0.284805   24.77   <2e-16 ***
## I(year^2)    -1.688779   0.070271  -24.03   <2e-16 ***
## I(year^3)     0.129140   0.005371   24.05   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1.016774)
## 
## Number of Fisher Scoring iterations: 6

fit.q<-svyglm(b2event~year+I(year^2)+I(year^3 )+I(year^4),design=des , family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

summary(fit.q)

## 
## Call:
## svyglm(formula = b2event ~ year + I(year^2) + I(year^3) + I(year^4), 
##     design = des, family = "binomial")
## 
## Survey design:
## svydesign(ids = ~psu, strata = ~strata, weights = ~weight, data = pp)
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -9.283277   0.617622 -15.031  < 2e-16 ***
## year         5.864429   0.739788   7.927 2.01e-13 ***
## I(year^2)   -1.167265   0.316776  -3.685   0.0003 ***
## I(year^3)    0.036774   0.056085   0.656   0.5128    
## I(year^4)    0.005659   0.003465   1.633   0.1041    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 0.9833714)
## 
## Number of Fisher Scoring iterations: 7

#Spline
library(splines)
fit.sp<-svyglm(b2event~ns(year, df = 2),design=des , family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

summary(fit.sp)

## 
## Call:
## svyglm(formula = b2event ~ ns(year, df = 2), design = des, family = "binomial")
## 
## Survey design:
## svydesign(ids = ~psu, strata = ~strata, weights = ~weight, data = pp)
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -3.34925    0.06334  -52.88   <2e-16 ***
## ns(year, df = 2)1  5.11302    0.14699   34.79   <2e-16 ***
## ns(year, df = 2)2  1.87052    0.07779   24.05   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 0.8878417)
## 
## Number of Fisher Scoring iterations: 5

Now, let’s look at the hazards:

dat<-expand.grid(year=seq(1,7,1))
dat$genmod<-predict(fit.0, newdata=data.frame(year=as.factor(1:7 )), type="response")
dat$lin<-predict(fit.l, newdata=dat, type="response")
dat$sq<-predict(fit.s, newdata=dat, type="response")
dat$cub<-predict(fit.c, newdata=dat, type="response")
dat$quart<-predict(fit.q, newdata=dat, type="response")
dat$spline<-predict(fit.sp, newdata=expand.grid(year=seq(1,7,1)), type="response")

dat

##   year      genmod        lin         sq        cub     quart     spline
## 1    1 0.009663545 0.08185227 0.04458293 0.00901902 0.0105223 0.03391982
## 2    2 0.140764128 0.12636908 0.11846357 0.14085800 0.1371997 0.08454080
## 3    3 0.317059095 0.19008469 0.23294675 0.32193875 0.3221063 0.18017019
## 4    4 0.330252847 0.27578458 0.35004230 0.32418607 0.3319030 0.29978624
## 5    5 0.283631962 0.38190325 0.42840012 0.26844810 0.2671369 0.38977434
## 6    6 0.282966997 0.50062836 0.45008398 0.31566529 0.3012989 0.43823616
## 7    7 0.690733713 0.61928266 0.41224253 0.67403240 0.6840107 0.46295669

plot(genmod~year, dat, type="l", ylab="h(t)", xlab="Time")
title(main="Hazard function from different time parameterizations")
lines(lin~year, dat, col=2, lwd=2)
lines(sq~year, dat, col=3, lwd=2)
lines(cub~year, dat, col=4, lwd=2)
lines(quart~year, dat, col=5, lwd=2)
lines(spline~year, dat, col=6, lwd=2)

legend("topleft", legend=c("General Model", "Linear","Square", "Cubic", "Quartic", "Natural spline"), col=1:6, lwd=1.5)

#AIC table
aic<-round(c(
  fit.l$deviance+2*length(fit.l$coefficients),
  fit.s$deviance+2*length(fit.s$coefficients),
  fit.c$deviance+2*length(fit.c$coefficients),
  fit.q$deviance+2*length(fit.q$coefficients),
  fit.sp$deviance+2*length(fit.sp$coefficients),
  fit.0$deviance+2*length(fit.0$coefficients)),2)

#compare all aics to the one from the general model
dif.aic<-round(aic-aic[6],2)
data.frame(model =c( "linear","square", "cubic", "quartic","spline", "general"), aic=aic, aic_dif=dif.aic)

##     model      aic aic_dif
## 1  linear 20607.84 1432.79
## 2  square 20176.93 1001.88
## 3   cubic 19179.70    4.65
## 4 quartic 19177.30    2.25
## 5  spline 19969.57  794.52
## 6 general 19175.05    0.00

So the general model, quartic and spline give the same AIC points, but the square and cubic models also fit equally as well.

IPUMS NHIS Mortality example

library(survey)
library(survival)
library(car)

## Loading required package: carData

dat<-haven::read_dta("~/Google Drive/classes/dem7223/dem7223_19//data/nhis_00012.dta")
names(dat)

##  [1] "year"         "serial"       "strata"       "psu"         
##  [5] "nhishid"      "hhweight"     "region"       "pernum"      
##  [9] "nhispid"      "hhx"          "fmx"          "px"          
## [13] "perweight"    "sampweight"   "fweight"      "supp2wt"     
## [17] "intervwyr"    "astatflg"     "cstatflg"     "telephone"   
## [21] "age"          "sex"          "marstat"      "marst"       
## [25] "marstcohab"   "cohabmarst"   "cohabevmar"   "birthyr"     
## [29] "relate"       "famsize"      "momed"        "racea"       
## [33] "hispeth"      "yrsinus"      "hispyn"       "usborn"      
## [37] "citizen"      "racenew"      "educrec2"     "empstat"     
## [41] "pooryn"       "incfamr82on"  "gotnewelf"    "gotssiwhy"   
## [45] "gotnonssdis"  "gotdiv"       "gotchsup"     "gotstamp"    
## [49] "poverty"      "poverty2"     "ownership"    "lowrent"     
## [53] "health"       "height"       "weight"       "bmicalc"     
## [57] "bmi"          "bedayr"       "hstatyr"      "wldayr"      
## [61] "usualpl"      "typplsick"    "routcare"     "delaycost"   
## [65] "famdelaycono" "famdelaycost" "famybarcar"   "famybarcarno"
## [69] "placecar"     "delayappt"    "delayhrs"     "delayphone"  
## [73] "delaytrans"   "delaywait"    "ybarcare"     "ybardental"  
## [77] "ybarglass"    "ybarmeds"     "ybarmental"   "alc5upyr"    
## [81] "smokev"       "smokagereg"   "mortelig"     "mortstat"    
## [85] "mortdody"     "mortucodld"   "mortwt"       "mortcms"     
## [89] "mortndi"      "mortssa"      "mortwtsa"

sub<-subset(dat, dat$mortelig==1&is.na(dat$racenew)==F)
samps<-sample(1:length(sub$year), size = 100000, replace = F)
sub<-sub[samps,]

alternative codings of failure time

In mortality analysis, but in any type of case-duration data, we can construct our event indicator to represent if an event occurss during a set risk period. In the NHIS mortality, we could code the death as a few different ways:

If you are interested in the Age at death, then you will measure your outcome as the age when they died

Coding age at death. If the person died, then it is equal to the year the person died, minus the year in which they were born. If the person was still alive, then it is the difference between 2009 (year of follow up in this data file)

sub$d.age<-ifelse(sub$mortstat==1,sub$mortdody-(sub$year-sub$age) ,
                  ifelse(sub$mortstat==2,2009-(sub$year-sub$age), NA))

If you are interested in the risk of death in a follow up period, then you could code it as: - Did the person die in the follow up period at all? No time specified - Did the person die in a 1, 5, or 10 year period? This specifies a fixed risk period

sub$d.event<-ifelse(sub$mortstat==1,1,0) #did they die?

sub$timetodeath<-ifelse(sub$mortstat ==1, sub$mortdody-sub$year  , 2009 - sub$year ) #how long after the interview did they die?

sub$d1yr<-ifelse(sub$timetodeath<=1&sub$mortstat==1, 1,0) #did they die within 1 year of the interview
sub$d5yr<-ifelse(sub$timetodeath<=5&sub$mortstat==1, 1,0) #did they die within 5 years of the interview

Other variables

library(car)
sub$married<-recode(sub$marstat, recodes="00=NA; 10:13='married'; 20:40='sep'; 50='nm'; 99=NA" ,as.factor=T )
sub$male<-ifelse(sub$sex==1,1,0)
sub$mwt<-sub$mortwt/mean(sub$mortwt, na.rm=T)

sub$age5<-cut(sub$age,seq(15,85, 5))
  
sub$race<-Recode(sub$racenew, recodes ="10='wht'; 20 ='blk'; 30:61='other'; 97:99=NA", as.factor=T)
sub$college<-Recode(sub$educrec2, recodes="00=NA; 10:42='hs or less'; 50:53='some coll'; 54:60='coll'; else=NA", as.factor=T)
sub$black<-ifelse(sub$race=='blk',1,0)
sub$oth<-ifelse(sub$race=='other',1,0)
sub$hs<-ifelse(sub$college=='hs or less',1,0)
sub$col1<-ifelse(sub$college=='some coll',1,0)
sub$sep<-ifelse(sub$married=='sep',1,0)
sub$nm<-ifelse(sub$married=='nm',1,0)

sub$race<-Recode(sub$racenew, recodes ="10='wht'; 20 ='blk'; 30:61='other'; 97:99=NA", as.factor=T)
sub$hisp<-Recode(sub$hispyn, recodes="1=0; 2=1; else=NA")

sub$race_eth[sub$hisp == 0 & sub$race=="wht"]<-"NHWhite"

## Warning: Unknown or uninitialised column: 'race_eth'.

sub$race_eth[sub$hisp == 0 & sub$race=="blk"]<-"NHBlack"
sub$race_eth[sub$hisp == 0 & sub$race=="other"]<-"NHother"
sub$race_eth[sub$hisp == 1 ]<-"Hispanic"
sub$race_eth[is.na(sub$hisp) ==T | is.na(sub$race)==T]<-NA

Analysis of death outcome

des2<-svydesign(ids=~psu, strata=~strata, weights = ~perweight, data=sub, nest=T)

m1<-svyglm(d.event~race_eth+age5+college, design=des2, family=binomial (link="logit"))

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

m2<-svyglm(d1yr~race_eth+age5+college, design=des2, family=binomial (link="logit"))

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

m3<-svyglm(d5yr~race_eth+age5+college, design=des2, family=binomial (link="logit"))

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

#No age as predictor when age at death is the outcome!!!
library(survival)
m4<-coxph(Surv(d.age, d.event)~race_eth+college+cluster(strata),data=sub, weights=mortwt/mean(mortwt, na.rm=T), 
          subset = is.na(mortwt)==F&mortwt>0)

library(stargazer)

## 
## Please cite as:

##  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.2. https://CRAN.R-project.org/package=stargazer

stargazer(m1,m2,m3, type="html", style="demography")


	d.event	d1yr	d5yr
	Model 1	Model 2	Model 3

race_ethNHBlack	0.302^***	0.101	0.195^**
	(0.051)	(0.114)	(0.063)
race_ethNHother	0.014	-0.084	-0.174
	(0.075)	(0.182)	(0.099)
race_ethNHWhite	0.015	-0.120	-0.149^**
	(0.040)	(0.090)	(0.050)
age5(20,25]	0.107	-0.275	-0.118
	(0.143)	(0.340)	(0.195)
age5(25,30]	0.367^*	-0.298	-0.015
	(0.142)	(0.329)	(0.198)
age5(30,35]	0.760^***	-0.082	0.369
	(0.131)	(0.360)	(0.195)
age5(35,40]	0.963^***	0.397	0.537^**
	(0.125)	(0.298)	(0.172)
age5(40,45]	1.233^***	0.177	0.565^**
	(0.120)	(0.306)	(0.173)
age5(45,50]	1.652^***	0.787^**	1.234^***
	(0.120)	(0.297)	(0.163)
age5(50,55]	1.953^***	1.201^***	1.527^***
	(0.116)	(0.287)	(0.167)
age5(55,60]	2.338^***	1.426^***	1.943^***
	(0.117)	(0.280)	(0.161)
age5(60,65]	2.753^***	1.871^***	2.316^***
	(0.115)	(0.280)	(0.161)
age5(65,70]	3.203^***	2.263^***	2.721^***
	(0.119)	(0.278)	(0.154)
age5(70,75]	3.810^***	2.514^***	3.174^***
	(0.120)	(0.277)	(0.157)
age5(75,80]	4.368^***	3.018^***	3.678^***
	(0.120)	(0.269)	(0.154)
age5(80,85]	5.254^***	3.759^***	4.603^***
	(0.121)	(0.254)	(0.153)
collegehs or less	0.700^***	0.618^***	0.586^***
	(0.035)	(0.106)	(0.055)
collegesome coll	0.400^***	0.281^*	0.346^***
	(0.040)	(0.117)	(0.059)
Constant	-4.767^***	-6.180^***	-5.150^***
	(0.121)	(0.283)	(0.165)
N	98,225	98,225	98,225
Log Likelihood	-26,123.370	-5,904.630	-15,712.820
AIC	52,284.750	11,847.260	31,463.640

p < .05; p < .01; p < .001

stargazer(m4,type="html", style="demography")


	d.age

race_ethNHBlack	0.031
	(0.043)
race_ethNHother	-0.150
	(0.058)
race_ethNHWhite	-0.397^***
	(0.035)
collegehs or less	0.174^***
	(0.027)
collegesome coll	0.225^***
	(0.031)
N	97,783
R²	0.004
Max. Possible R²	0.890
Log Likelihood	-107,701.200
Wald Test	293.050^*** (df = 5)
LR Test	366.619^*** (df = 5)
Score (Logrank) Test	394.152^*** (df = 5)

p < .05; p < .01; p < .001

DEM 7223 Event History Analysis - Discrete Time Hazard Model - Alternative Time Specifications

Corey Sparks, PhD

October 29, 2019