Chapter 8: Cabinet Data in Stata

Figure 8.1

First we run the Cox model with the exact discrete approximation and compute the martingale residuals.

stcox invest polar  numst format postelec caretakr, exactm nohr mgale(martingale)

We calculate the Cox-Snell residuals in the following way.

*Use predict to derive Cox-Snell residuals*

predict CoxSnell, csnell

*Re-stset the data to treat the Cox-Snell residuals as "the data" (i.e. the time variable)*

stset CoxSnell, fail(censor)

*Generate the K-M estimates for the new data*

sts generate km=s

Then we plot the the residuals on the x-axis and the integrated hazard oof the residuals on the y-axis, against a 45 degree line that serves as a reference line. If the model holds, the plot of the residuals agianst the integrated hazard should fall roughly on that line.

*Generate the integrated hazard (using double option for increased computer precision)*

gen double H_cs=-log(km) 

*Reset working directory to output folder

cd ~/Dropbox/event_history/rmd

twoway (line H_cs CoxSnell, sort) (line CoxSnell CoxSnell, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Cox-Snell Residuals from Cabinet Data") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellcab.gph, replace)
    
graph export coxsnellcab.png, replace

Figure 8.2

Before plotting the martingale residuals on the polarization index and formation attempts variables, we drop the variables from our previous analysis and re-stset the data.

*Drop previous data
drop martingale CoxSnell H_cs km

*Resetting the data to original form*

stset durat, fail(censor)

We start with estimating the model for Approach 1.

* Estimate model of interest

stcox format polar, exactp nohr mgale(mg)

Then we can plot the martingales and lowess term against either the polarization or formation attempts covoriate (Approach 1).

twoway (scatter mg polar, sort mfcolor(white) mlcolor(black)) (lowess mg polar, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Polarization Index") ///
    title("Martingale Residuals: Approach 1", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(polarff1.gph, replace)

twoway (scatter mg format, sort mfcolor(white) mlcolor(black)) (lowess mg format, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Formation Attempts") ///
    title("Martingale Residuals: Approach 1", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(formatff1.gph, replace)

drop mg

Now we run two different models, one with only the formation attempts covariate and the other with only the polarization index covariate. From the first model, we plot the Martingale residuals and the smoothed residuals against the polarization variable while in the second, we plot the residuals against the formation attempts covariate.

* First estimate submodel

stcox format, exactp nohr mgale(mg)

* Plot of martingales vs. polarization variable with lowess term

twoway (scatter mg polar, sort mfcolor(white) mlcolor(black)) (lowess mg polar, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Polarization Index") ///
    title("Martingale Residuals: Approach 2", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(polarff2.gph, replace)

drop mg

* Now test for functional form of formation attempts

stcox polar, exactp nohr mgale(mg)

* Plot of martingales vs. formation attempts variable with lowess term  

  twoway (scatter mg format, sort mfcolor(white) mlcolor(black)) (lowess mg format, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Formation Attempts") ///
    title("Martingale Residuals: Approach 2", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(formatff2.gph, replace)

drop mg

Lastly, we combine all four graphs. In all four plots, we see mostly flat lines centered around 0, which indicates that no adjustments need to be made to the functional form

graph combine polarff1.gph formatff1.gph polarff2.gph ///
 formatff2.gph, graphregion(color(white) ///
icolor(none)) saving(funcformcab.gph, replace)

graph export funcformcab.png, replace

Figure 8.6

We want to plot the Cox-snell residuals on the x-axis and the integrated hazard function on the y-axis for a variety of parametric models. Let’s start with the exponential.

streg invest polar  numst format postelec caretakr, dist(exp) time

*Now compute Cox-Snell residuals*

predict double cs, csnell

*Now restset the data*

stset cs, failure(censor)

*Now generate K-M estimates*
  
sts generate km=s

*Back out the estimated cumulative hazard:*

gen double H=-log(km)

*Graph functions*

twoway (line H cs, sort) (line cs cs, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Exponential Model") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellexp.gph, replace)
    
drop H cs km 
stset durat, fail(censor)

Let’s estimate the Weibull next and calculate the Cox-snell residuals and integrated hazard.

streg invest polar  numst format postelec caretakr, dist(weibull) time

*Now compute Cox-Snell residuals*

predict double cs, csnell

*Now restset the data*
  
stset cs, failure(censor)

*Now generate K-M estimates*

sts generate km=s

*Back out the estimated cumulative hazard:*

gen double H=-log(km)

*Graph functions*

twoway (line H cs, sort) (line cs cs, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Weibull Model") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellweib.gph, replace)

drop H cs km    
stset durat, fail(censor)

We do the same for the log-log model.

streg invest polar  numst format postelec caretakr, dist(loglog) 

*Now compute Cox-Snell residuals*

predict double cs, csnell

*Now restset the data*

stset cs, failure(censor)

*Now generate K-M estimates*

sts generate km=s

*Back out the estimated cumulative hazard:*

gen double H=-log(km)

*Graph functions*

twoway (line H cs, sort) (line cs cs, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Log-Logistic Model") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellll.gph, replace)

drop H cs km 
stset durat, fail(censor)

The log-normal model is next.

streg invest polar  numst format postelec caretakr, dist(lognorm) time

*Now compute Cox-Snell residuals*

predict double cs, csnell

*Now restset the data*

stset cs, failure(censor)

*Now generate K-M estimates*

sts generate km=s

*Back out the estimated cumulative hazard:*

gen double H=-log(km)

*Graph functions*

twoway (line H cs, sort) (line cs cs, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Log-Normal Model") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellln.gph, replace)
    
drop H cs km 
stset durat, fail(censor)

Here is the gompertz model.

streg invest polar  numst format postelec caretakr, dist(gompertz) nohr

*Now compute Cox-Snell residuals*

predict double cs, csnell

*Now restset the data*

stset cs, failure(censor)

*Now generate K-M estimates*

sts generate km=s

*Back out the estimated cumulative hazard:*

gen double H=-log(km)

*Graph functions*

twoway (line H cs, sort) (line cs cs, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Gompertz Model") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellgomp.gph, replace)
    
drop H cs km 
stset durat, fail(censor)

Lastly, here is the generalized gamma model.

streg invest polar  numst format postelec caretakr, dist(ggamma) time

*Now compute Cox-Snell residuals*

predict double cs, csnell

*Now restset the data*

stset cs, failure(censor)

*Now generate K-M estimates*

sts generate km=s

*Back out the estimated cumulative hazard:*

gen double H=-log(km)

*Graph functions*

twoway (line H cs, sort) (line cs cs, sort lpattern(solid)), ///
    legend(off) ///
    xtitle("Generalized Gamma Model") ///
    title("H(t) based on Cox-Snell Residuals", position(11)) ///
    scheme(s2mono) graphregion(color(white) icolor(none)) ///
    saving(coxsnellgg.gph, replace)
    
drop H cs km

We can combine all six graphs.

graph combine coxsnellexp.gph coxsnellweib.gph coxsnellll.gph ///
 coxsnellln.gph coxsnellgomp.gph coxsnellgg.gph, col(2) graphregion(color(white) ///
icolor(none)) saving(coxsnellparm.gph, replace)

graph export coxsnellparm.png, replace