Survival Analysis: Cox Proportional Hazards Regression

Hazard ratios, model assumptions, covariate selection, adjusted survival curves, and forest plots

Author

Introduction

In Part 1 we established why censoring is unavoidable and how survival data differs from ordinary outcomes. In Part 2 we built the Kaplan-Meier estimator — a descriptive tool that draws survival curves without assuming any distribution. But KM has one fundamental limitation:

It cannot adjust for confounders, and it cannot quantify the effect of continuous covariates.

This is where Cox proportional hazards regression enters.

“The Cox model lets the data choose its own baseline hazard — while still producing clean, interpretable regression coefficients.”

Cox regression is the most widely used survival model in medicine, epidemiology, and RWE because it occupies a rare sweet spot: it is semi-parametric — making no assumption about the shape of the baseline hazard (like KM), but fully parametric in how covariates modify that hazard (like logistic regression).

This article is Part 3 of the series:

Part 1

Censoring & Truncation— Understanding incomplete observation

Part 2

Kaplan-Meier — Non-parametric survival curves and log-rank tests

Part 3 — Current

Cox Regression — Hazard ratios, adjusted effects, and model assumptions

Why Cox Regression?

KM curves are excellent for two-group comparisons. But real-world survival data almost always involves:

📊

Continuous predictors

Age, biomarker levels, dose — impossible to show one KM curve per value

🔀

Multiple covariates

Adjusting for age, sex, comorbidities simultaneously to isolate one effect

📐

Effect quantification

The log-rank test gives a p-value; Cox gives a hazard ratio with a confidence interval

🔬

Confounder control

Unadjusted and adjusted HRs can differ substantially — Cox separates the two cleanly

Method	Handles multiple covariates	Continuous predictors	Produces effect size	Baseline hazard assumed
Kaplan-Meier	✗ No	✗ No	✗ No	None (Non-parametric
Log-rank test	✗ No	✗ No	✗ No	None (Non-parametric
Cox regression	✓ Yes	✓ Yes	✓ HR + CI	Unspecified (semi-parametric)

The Cox Model — Intuition and Formula

What Cox models

The Cox model specifies the hazard at time \(t\) for an individual with covariate vector \(\mathbf{X}\) as:

Cox proportional hazards model

\[h(t \mid \mathbf{X}) = h_0(t) \cdot \exp(\beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K)\]

\(h_0(t)\) — baseline hazard: the hazard when all covariates are zero (or at their reference levels). It is unspecified — Cox regression never estimates this function. This is what makes the model semi-parametric.

\(\exp(\beta_k X_k)\) — multiplicative covariate effect. Each \(\beta_k\) is a log hazard ratio; \(e^{\beta_k}\) is the hazard ratio for a one-unit increase in \(X_k\).

The hazard ratio

Taking the ratio of hazards for two individuals differing only in \(X_k\) (by one unit):

\[\text{HR} = \frac{h(t \mid X_k = x+1)}{h(t \mid X_k = x)} = e^{\beta_k}\]

The baseline hazard \(h_0(t)\) cancels completely. This is why Cox regression never needs to estimate \(h_0(t)\) — the HR is independent of it, and independent of time.

This time-independence is the proportional hazards assumption: the hazard ratio between any two individuals remains constant over the entire follow-up period.

Interpreting the hazard ratio

HR < 1

Protective

The covariate is associated with reduced hazard — better survival. Example: HR = 0.59 means 41% lower hazard.

HR = 1

No effect

The covariate has no association with the hazard. The survival curves are identical between groups.

HR > 1

Harmful

The covariate is associated with increased hazard — worse survival. Example: HR = 2.34 means 134% higher hazard.

HR ≠ Risk Ratio

The hazard is an instantaneous rate — the probability of the event occurring in the next instant, given survival so far. It is not a probability and not a risk. A hazard ratio of 2 does not mean the probability of the event is twice as high; it means the event rate is approximately twice as high at any given moment. This distinction matters when communicating results to non-statisticians.

Fitting Cox Models in R

Unadjusted model — single categorical predictor

We use coxph() with the same Surv() syntax as survfit():

# Unadjusted Cox regression: effect of sex on lung cancer survival
cox_sex <- coxph(Surv(time, status) ~ sex, data = lung)
cox_sex

Call:
coxph(formula = Surv(time, status) ~ sex, data = lung)

       coef exp(coef) se(coef)      z       p
sex -0.5310    0.5880   0.1672 -3.176 0.00149

Likelihood ratio test=10.63  on 1 df, p=0.001111
n= 228, number of events= 165

The coef column gives \(\hat{\beta}\) (the log hazard ratio). The exp(coef) column gives \(e^{\hat{\beta}}\) — the hazard ratio we interpret. Sex is coded 1 = male, 2 = female, so sex is a numeric variable where higher values correspond to female — resulting in HR < 1 (female protective).

Publication-ready table with `{gtsummary}`

Use tbl_regression(exp = TRUE) to exponentiate the coefficients and present HRs directly:

Code

coxph(Surv(time, status) ~ sex, data = lung) |>
  tbl_regression(
    exp   = TRUE,
    label = list(sex ~ "Sex (1 = Male, 2 = Female)")
  )

Table 1: Unadjusted Cox regression: effect of sex on lung cancer survival. exp = TRUE returns hazard ratios rather than log hazard ratios.

Characteristic	HR	95% CI	p-value
Sex (1 = Male, 2 = Female)	0.59	0.42, 0.82	0.001
Abbreviations: CI = Confidence Interval, HR = Hazard Ratio

Female patients have 41% lower hazard of death than males (HR = 0.59; 95% CI: 0.42, 0.82; \(p = 0.001\)).

Unadjusted model — continuous predictor

For a continuous predictor, the HR represents the multiplicative change in hazard for a one-unit increase in that covariate:

cox_age <- coxph(Surv(time, status) ~ age, data = lung)
summary(cox_age)

Call:
coxph(formula = Surv(time, status) ~ age, data = lung)

  n= 228, number of events= 165 

        coef exp(coef) se(coef)     z Pr(>|z|)  
age 0.018720  1.018897 0.009199 2.035   0.0419 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    exp(coef) exp(-coef) lower .95 upper .95
age     1.019     0.9815     1.001     1.037

Concordance= 0.55  (se = 0.025 )
Likelihood ratio test= 4.24  on 1 df,   p=0.04
Wald test            = 4.14  on 1 df,   p=0.04
Score (logrank) test = 4.15  on 1 df,   p=0.04

Scaling continuous predictors

A one-year difference in age may produce a small HR that’s hard to communicate. To interpret on a clinically meaningful scale — say, a 10-year difference — multiply the coefficient by 10 before exponentiating (never multiply the HR itself):

beta_age <- coef(cox_age)["age"]
ci_age   <- confint(cox_age)["age", ]

cat("HR per 1 year :", round(exp(beta_age), 4), "\n")

HR per 1 year : 1.0189

cat("HR per 10 years:", round(exp(10 * beta_age), 4),
    " 95% CI: [", round(exp(10 * ci_age[1]), 3), ",",
    round(exp(10 * ci_age[2]), 3), "]\n")

HR per 10 years: 1.2059  95% CI: [ 1.007 , 1.444 ]

Multiplying \(\hat{\beta} \times 10\) then exponentiating is correct. Multiplying \(\text{HR}^{10}\) produces the same result algebraically but is clearer when written explicitly.

Multivariable Cox Regression

Including multiple covariates produces adjusted hazard ratios (AHRs) — the effect of each predictor holding all others constant.

Fitting the full model

cox_multi <- coxph(
  Surv(time, status) ~ sex + age + ph.ecog + ph.karno,
  data = lung
)
summary(cox_multi)

Call:
coxph(formula = Surv(time, status) ~ sex + age + ph.ecog + ph.karno, 
    data = lung)

  n= 226, number of events= 163 
   (2 observations deleted due to missingness)

              coef exp(coef)  se(coef)      z Pr(>|z|)    
sex      -0.572802  0.563943  0.169222 -3.385 0.000712 ***
age       0.012868  1.012951  0.009404  1.368 0.171226    
ph.ecog   0.633077  1.883397  0.176034  3.596 0.000323 ***
ph.karno  0.012558  1.012637  0.009514  1.320 0.186842    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

         exp(coef) exp(-coef) lower .95 upper .95
sex         0.5639     1.7732    0.4048    0.7857
age         1.0130     0.9872    0.9945    1.0318
ph.ecog     1.8834     0.5310    1.3338    2.6594
ph.karno    1.0126     0.9875    0.9939    1.0317

Concordance= 0.632  (se = 0.025 )
Likelihood ratio test= 31.27  on 4 df,   p=3e-06
Wald test            = 30.61  on 4 df,   p=4e-06
Score (logrank) test = 31.06  on 4 df,   p=3e-06

Code

cox_multi |>
  tbl_regression(
    exp   = TRUE,
    label = list(
      sex      ~ "Sex (1=M, 2=F)",
      age      ~ "Age (years)",
      ph.ecog  ~ "ECOG score",
      ph.karno ~ "Karnofsky score (physician)"
    )
  ) |>
  bold_p(t = 0.05)

Table 2: Multivariable Cox regression: adjusted hazard ratios for sex, age, and ECOG/Karnofsky performance scores. Each HR is adjusted for all other variables in the model.

Characteristic	HR	95% CI	p-value
Sex (1=M, 2=F)	0.56	0.40, 0.79	<0.001
Age (years)	1.01	0.99, 1.03	0.2
ECOG score	1.88	1.33, 2.66	<0.001
Karnofsky score (physician)	1.01	0.99, 1.03	0.2
Abbreviations: CI = Confidence Interval, HR = Hazard Ratio

Unadjusted vs adjusted — comparing effect estimates

A key reason to run both models is to assess confounding. If the unadjusted and adjusted HRs differ meaningfully, confounders are at work:

Code

cox_unadj <- coxph(Surv(time, status) ~ sex, data = lung)
cox_adj   <- coxph(Surv(time, status) ~ sex + age + ph.ecog, data = lung)

tbl_unadj <- cox_unadj |>
  tbl_regression(exp = TRUE, label = list(sex ~ "Sex")) |>
  modify_header(estimate = "**Unadjusted HR**")

tbl_adj <- cox_adj |>
  tbl_regression(exp = TRUE, include = "sex", label = list(sex ~ "Sex")) |>
  modify_header(estimate = "**Adjusted HR**")

tbl_merge(list(tbl_unadj, tbl_adj),
          tab_spanner = c("**Unadjusted**", "**Adjusted**"))

Table 3: Effect of sex on survival: unadjusted vs adjusted for age and ECOG score. Comparing these estimates reveals the degree of confounding.

Characteristic	Unadjusted			Adjusted
Characteristic	Unadjusted HR	95% CI	p-value	Adjusted HR	95% CI	p-value
Sex	0.59	0.42, 0.82	0.001	0.58	0.41, 0.80	<0.001
Abbreviations: CI = Confidence Interval, HR = Hazard Ratio

When should you report adjusted HRs?

In observational and RWE studies, always report adjusted HRs as your primary estimate. Unadjusted HRs are useful for describing the crude association, but they confound the true effect of exposure with the effects of other predictors. The difference between the two estimates quantifies the degree of confounding.

The Proportional Hazards Assumption

The most critical assumption of Cox regression is proportional hazards (PH): the hazard ratio between any two individuals must remain constant over time. Formally:

\[\frac{h(t \mid X_k = 1)}{h(t \mid X_k = 0)} = e^{\beta_k} \quad \text{for all } t\]

If this fails — if the effect of a covariate changes over time — the model is misspecified and the estimated HR is a dubious time-averaged quantity.

Assumption 1

Proportional hazards

The HR between any two covariate patterns is constant across all time points. Violating this is the most common and consequential misspecification.

Assumption 2

Non-informative censoring

Inherited from KM: the reason for censoring must be independent of the risk of the event, conditional on covariates in the model.

Assumption 3

Log-linearity

Continuous predictors enter the hazard log-linearly. If the true relationship is non-linear (e.g., U-shaped for age), the model is misspecified unless a transformation or spline is used.

Assumption 4

No influential observations

Extreme observations can disproportionately influence coefficient estimates. Residual diagnostics (dfbeta) should be examined in small or unbalanced datasets.

Testing the PH assumption with `cox.zph()`

cox.zph() performs a formal test by regressing scaled Schoenfeld residuals against time. A significant p-value indicates that the covariate’s effect is not constant over time — a PH violation:

ph_test <- cox.zph(cox_multi)
ph_test

          chisq df    p
sex      1.7133  1 0.19
age      0.0668  1 0.80
ph.ecog  1.6902  1 0.19
ph.karno 5.4535  1 0.02
GLOBAL   7.3678  4 0.12

A p-value below 0.05 for any covariate, or for the GLOBAL test, signals a potential PH violation for that term.

Visualizing with Schoenfeld residual plots

Code

ggcoxzph(ph_test,
         point.alpha = 0.3,
         point.col   = clr$teal,
         caption     = "Shaded band = 95% CI on Schoenfeld residuals · Flat line supports PH assumption")

Figure 1: Schoenfeld residual plots for each covariate. A flat, horizontal smoothed line indicates that the effect does not change over time (PH holds). A sloping or curved line indicates a time-varying effect (PH violated).

What to do when PH is violated

Several approaches exist depending on the nature of the violation:

Stratification: include the violating variable as a strata term (+ strata(var)) — this allows a separate baseline hazard per stratum without assuming proportionality
Time-varying coefficients: model \(\beta(t)\) explicitly using tt() or time_transform arguments in coxph()
Restricted mean survival time (RMST): an alternative estimand that doesn’t require the PH assumption
Parametric models: Weibull or flexible parametric models may fit better when the hazard shape is known

Visualizing Results: Forest Plot

A forest plot displays all adjusted HRs and their 95% CIs in a single figure — the standard visual output for multivariable survival analysis in publications. We show two approaches: a custom {ggplot2} plot for full styling control, and ggforest() from {survminer} which produces a two-panel publication-ready figure automatically.

Option 1 — Custom `{ggplot2}` forest plot

This approach gives you complete control over colour, axis labels, and theme:

Code

tidy_cox <- broom::tidy(cox_multi, exponentiate = TRUE, conf.int = TRUE) |>
  mutate(
    term = recode(term,
      "sex"      = "Sex (female vs male)",
      "age"      = "Age (per year)",
      "ph.ecog"  = "ECOG score",
      "ph.karno" = "Karnofsky score (physician)"
    ),
    significant = p.value < 0.05
  )

ggplot(tidy_cox, aes(x = estimate, y = reorder(term, estimate),
                     color = significant)) +
  geom_vline(xintercept = 1, linetype = "dashed",
             color = "gray50", linewidth = 0.7) +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
                 height = 0.22, linewidth = 0.8) +
  geom_point(size = 3.5) +
  scale_color_manual(
    values = c("TRUE" = clr$teal, "FALSE" = "gray60"),
    labels = c("TRUE" = "p < 0.05", "FALSE" = "p ≥ 0.05")
  ) +
  scale_x_log10(
    breaks = c(0.5, 0.75, 1, 1.25, 1.5, 2),
    labels = c("0.50", "0.75", "1.00", "1.25", "1.50", "2.00")
  ) +
  labs(
    x       = "Adjusted Hazard Ratio (log scale)",
    y       = NULL,
    color   = NULL,
    title   = "Adjusted Hazard Ratios — Multivariable Cox Model",
    caption = "Horizontal bars = 95% CI · Dashed line = HR of 1 (no effect)"
  ) +
  theme_survival() +
  theme(legend.position = "top")

Figure 2: Custom ggplot2 forest plot. Teal points are statistically significant (p < 0.05); grey points are not. Points left of HR = 1 indicate reduced hazard; points right indicate increased hazard.

Option 2 — `ggforest()` from `{survminer}`

ggforest() produces a two-panel figure: the left panel shows variable names, group labels, and annotated HR values; the right panel shows the forest plot. This is the format most commonly seen in clinical publications.

Code

# ggforest() works directly on a coxph object
# It automatically labels reference categories and annotates all estimates
ggforest(
  model    = cox_multi,
  data     = lung,
  main     = "Hazard Ratio",
  fontsize = 0.85
)

Figure 3: Forest plot produced by ggforest() from the {survminer} package. The left panel annotates each predictor with its HR, 95% CI, and p-value. The right panel plots the estimate geometrically. Reference levels for categorical variables are labelled explicitly.

When to use which

Use the ggplot2 approach when you need precise visual control — colours, fonts, faceting, or combining the forest plot into a multi-panel figure. Use ggforest() when you want a self-contained publication-ready table-and-plot combination with minimal code, especially for reports or manuscripts where the annotated HR values need to appear alongside the plot.

Fitting a larger model for ggforest()

ggforest() is most informative with a model that includes categorical predictors, because it automatically labels each level and marks the reference category. Here’s an example with a richer colon cancer model:

Code

# Prepare labelled factors
colon_clean <- colon |>
  filter(!is.na(status), !is.na(time)) |>
  mutate(
    sex    = factor(sex,   labels = c("Female", "Male")),
    differ = factor(differ, labels = c("Well", "Moderate", "Poor")),
    extent = factor(extent, labels = c("Submucosa", "Muscle", "Serosa", "Contiguous"))
  )

cox_colon <- coxph(
  Surv(time, status) ~ sex + age + differ + extent + node4,
  data = colon_clean
)

ggforest(
  model    = cox_colon,
  data     = colon_clean,
  main     = "Hazard Ratio — Colon Cancer Survival",
  fontsize = 0.8
)

Figure 4: ggforest() on a larger model with categorical predictors. Each factor level is shown separately with its HR relative to the reference level. This two-panel format is standard in clinical journals.

Plotting Adjusted Survival Curves

The ggadjustedcurves() function from {survminer} is the recommended approach for plotting survival curves from a Cox model. It correctly handles population-level averaging — something that manual predict() calls cannot do without extra work.

Four methods explained

ggadjustedcurves() supports four methods, each answering a subtly different question:

Method	What it shows	When to use
`"single"`	One average curve for the whole population	Overall population summary
`"average"`	Separate curves per group, unbalanced	When groups are already similar
`"conditional"`	Separate curves, balanced by replicating data	Default choice for group comparisons
`"marginal"`	Separate curves, IPW-balanced to reference	When group sizes differ substantially

The "conditional" method (default) replicates each observation \(k\) times (once per group), assigns each copy a different group value, and computes average predicted survival — giving true marginal adjusted curves that are not confounded by the covariate distributions differing across groups.

Fitting the model with `strata()`

For ggadjustedcurves() to know which variable to plot by, include it as strata() in the Cox model. The strata variable gets a separate baseline hazard per group while the other coefficients are shared:

# sex as a stratification variable; age and ECOG as adjustment covariates
cox_strata <- coxph(
  Surv(time, status) ~ age + ph.ecog + strata(sex),
  data = lung
)
cox_strata

Call:
coxph(formula = Surv(time, status) ~ age + ph.ecog + strata(sex), 
    data = lung)

            coef exp(coef) se(coef)     z        p
age     0.010566  1.010622 0.009241 1.143    0.253
ph.ecog 0.462424  1.587919 0.114761 4.029 5.59e-05

Likelihood ratio test=19.48  on 2 df, p=5.895e-05
n= 227, number of events= 164 
   (1 observation deleted due to missingness)

Conditional adjusted curves (recommended default)

Code

ggadjustedcurves(
  fit      = cox_strata,
  data     = lung,
  variable = "sex",
  method   = "conditional",
  palette  = c(clr$blue, clr$red),
  ylab     = "Survival probability",
  ggtheme  = theme_survival()
) +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  labs(
    title    = "Conditional Adjusted Survival Curves by Sex",
    subtitle = "Adjusted for age and ECOG performance score",
    x        = "Days since enrolment",
    caption  = "Method: conditional (data replicated per group, averaged predictions)"
  )

Figure 5: Conditional adjusted survival curves by sex, adjusted for age and ECOG score. The ‘conditional’ method replicates each patient across both groups and averages predictions — giving population-representative curves that are not confounded by covariate imbalance between sexes.

Marginal adjusted curves (IPW-balanced)

The "marginal" method uses inverse-probability weighting to rebalance the covariate distributions before estimating curves. It is more appropriate when groups differ substantially in their confounders:

Code

ggadjustedcurves(
  fit      = cox_strata,
  data     = lung,
  variable = "sex",
  method   = "marginal",
  palette  = c(clr$blue, clr$red),
  ylab     = "Survival probability",
  ggtheme  = theme_survival()
) +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  labs(
    title    = "Marginal Adjusted Survival Curves by Sex",
    subtitle = "IPW-balanced to full cohort distribution",
    x        = "Days since enrolment",
    caption  = "Method: marginal (inverse-probability weighting)"
  )

Figure 6: Marginal (IPW-balanced) adjusted survival curves by sex. The covariate distributions are reweighted to match the reference population before survival is estimated, giving a more robust marginal estimand.

Adjusted curves vs KM curves — the key distinction

KM curves show the observed survival stratified by group, with no adjustment for other covariates. ggadjustedcurves() shows the expected survival if both groups had the same covariate distribution — a counterfactual question. When groups are well balanced, the two will look similar. When they differ in important confounders (age, comorbidities), they can diverge substantially.

Extracting median adjusted survival

surv_adjustedcurves() returns the underlying data used by ggadjustedcurves(). You can use it to extract median adjusted survival times programmatically:

curves <- surv_adjustedcurves(
  fit      = cox_strata,
  data     = lung,
  variable = "sex",
  method   = "conditional"
)

# Find time where survival first drops to or below 0.50 for each group
curves |>
  group_by(variable) |>
  filter(surv <= 0.5) |>
  slice(1) |>
  select(variable, time, surv) |>
  rename(Group = variable,
         `Median survival (days)` = time,
         `S(t)` = surv)

Group	Median survival (days)	S(t)
1	270	0.4968930
2	426	0.4956246

Covariate Selection

One of the most consequential — and most frequently under-discussed — decisions in Cox regression is which variables to include. Poor covariate selection can produce biased HRs, inflated standard errors, or models that fail to replicate.

The fundamental principle: science first

The single most important rule is that covariate selection should be guided by subject-matter knowledge and the research question, not by automated statistical algorithms alone.

Stepwise selection is not recommended

Stepwise procedures (forward, backward, or both-directions selection driven by p-values or AIC) are widely used but have serious problems in survival analysis: they capitalize on chance, produce overfit models, give optimistically narrow CIs, and generate results that fail to replicate. The American Statistical Association and major epidemiology guidelines explicitly caution against stepwise selection as the primary method for confounder identification.

A practical framework for covariate selection

Step 1

Start with a DAG

Draw a Directed Acyclic Graph (DAG) of your assumed causal structure. Identify which variables are confounders (causally prior to both exposure and outcome), mediators (on the causal path), or colliders (sharing common effects). Include confounders; exclude mediators and colliders.

Step 2

Apply the 10-EPV rule

Ensure you have at least 10 Events Per Variable (EPV) — the number of events (not observations) divided by the number of parameters estimated. With fewer than 10 EPV, coefficient estimates become unstable. With fewer than 5 EPV, the model is unreliable regardless of sample size.

Step 3

Check change-in-estimate

For each candidate confounder, compare the adjusted and unadjusted HR for the primary exposure. If including a variable changes the primary HR by more than 10%, it is a meaningful confounder and should remain in the model, regardless of its own p-value.

Step 4

Handle missing data

Variables with substantial missingness (> 5–10%) should not simply be dropped. Use complete-case analysis only when missingness is small and likely MCAR. For larger gaps, consider multiple imputation before fitting the Cox model.

Checking the 10-EPV rule

# Events per variable check before fitting
n_events <- sum(lung$status == 2, na.rm = TRUE)
n_params  <- 4  # sex, age, ph.ecog, ph.karno

cat("Events       :", n_events, "\n")

Events       : 165

cat("Parameters   :", n_params, "\n")

Parameters   : 4

cat("EPV          :", round(n_events / n_params, 1), "\n")

EPV          : 41.2

cat("Minimum EPV  : 10\n")

Minimum EPV  : 10

cat("Status       :", ifelse(n_events / n_params >= 10, "✓ Sufficient", "⚠ Too few events"), "\n")

Status       : ✓ Sufficient

Comparing nested models with the likelihood ratio test

When deciding between a smaller and larger model, the likelihood ratio test (LRT) is the preferred formal test:

lung_complete <- lung |> 
  dplyr::select(time, status, sex, age, ph.ecog) |> 
  drop_na()
# Model 1: sex only
cox_m1 <- coxph(Surv(time, status) ~ sex, data = lung_complete)

# Model 2: sex + age + ECOG
cox_m2 <- coxph(Surv(time, status) ~ sex + age + ph.ecog, data = lung_complete)

# Likelihood ratio test: does adding age and ECOG significantly improve the model?
anova(cox_m1, cox_m2, test = "LRT")

loglik	Chisq	Df	Pr(>\|Chi\|)
-739.3309	NA	NA	NA
-729.2301	20.2016	2	4.1e-05

anova(cox_m1, cox_m2)

loglik	Chisq	Df	Pr(>\|Chi\|)
-739.3309	NA	NA	NA
-729.2301	20.2016	2	4.1e-05

AIC for non-nested model comparison

When models are not nested (e.g., comparing different sets of covariates), use the Akaike Information Criterion (AIC) instead of the LRT. Lower AIC indicates better balance between fit and complexity. Extract it with AIC(cox_model). Note that AIC comparisons are only valid within the same dataset and complete cases.

cat("AIC (sex only)         :", round(AIC(cox_m1), 2), "\n")

AIC (sex only)         : 1480.66

cat("AIC (sex + age + ECOG) :", round(AIC(cox_m2), 2), "\n")

AIC (sex + age + ECOG) : 1464.46

cat("Lower AIC is preferred.\n")

Lower AIC is preferred.

Visualizing variable importance: concordance and partial effects

The concordance statistic (C-statistic) is the survival analogue of the AUC — it measures the model’s ability to correctly rank pairs of patients by their risk. A value of 0.5 is no better than chance; 1.0 is perfect discrimination:

# C-statistic from the summary output
summary(cox_m1)$concordance  # sex only

         C      se(C) 
0.57747511 0.02079997

summary(cox_m2)$concordance  # sex + age + ECOG

         C      se(C) 
0.63713549 0.02506797

Adding age and ECOG score should meaningfully increase the C-statistic, confirming they carry prognostic information beyond sex alone.

Common Mistakes

Mistake 1 — Reporting HR as a risk ratio

The hazard ratio is an instantaneous rate ratio, not a cumulative risk ratio. Saying “the treated group had twice the risk” when the HR = 2 is technically incorrect. Use language like “twice the hazard rate” or “twice the instantaneous event rate.” In practice the distinction matters most when the outcome is common or when follow-up is long.

Mistake 2 — Never checking the PH assumption

Running coxph() and reporting results without verifying proportional hazards is one of the most common errors in applied survival analysis. cox.zph() takes one line of code. Always run it. A significant global p-value demands investigation and reporting.

Mistake 3 — Multiplying the HR for multi-unit differences

If age has HR = 1.035 per year, the HR for a 10-year difference is not \(1.035 \times 10 = 1.35\). It is \(e^{10 \times \hat{\beta}} = e^{10 \times 0.034} \approx 1.41\). Always rescale through the coefficient, not the exponentiated HR.

Mistake 4 — Reporting unadjusted HR as the main result in observational data

In RWE and observational studies, the unadjusted HR conflates the true exposure effect with confounding. The adjusted HR is the primary estimand of interest. Always state explicitly what variables were adjusted for.

Mistake 5 — Treating Cox model curves as population-average survival

Predicted survival curves from Cox regression represent a specific combination of covariate values, not the average survival of the cohort. For population-average adjusted survival curves, use ggadjustedcurves() from {survminer} or inverse-probability-weighting approaches.

Reporting Checklist

State clearly whether HRs are unadjusted or adjusted, and list all adjustment variables
Report HR, 95% CI, and p-value for each predictor
For categorical predictors with \(>\) 2 levels, report a multi-df Wald test using car::Anova()
For continuous predictors, specify the unit of comparison (per year, per 10-year increase, etc.)
Document the results of cox.zph() — state whether the PH assumption was supported
If PH is violated, describe the remediation approach (stratification, time-varying coefficient, etc.)
Present a forest plot for multivariable models with \(\geq\) 3 covariates
Report sample size, number of events, and number of observations removed due to missingness
State the tie-handling method (default: Efron; alternatives: Breslow, Exact)

Summary

Key Takeaways

The Cox model is semi-parametric: it makes no assumption about the baseline hazard \(h_0(t)\), but models covariate effects as multiplicative modifiers via \(e^{\beta_k X_k}\)
The key output is the hazard ratio \(e^{\hat{\beta}}\) — the multiplicative change in instantaneous event rate per unit increase in the covariate
HR < 1 = protective, HR = 1 = no effect, HR > 1 = harmful
The HR is not a risk ratio — it is an instantaneous rate ratio that is assumed constant over time
The proportional hazards assumption must always be tested with cox.zph() and Schoenfeld residual plots
Adjusted HRs are the primary estimand in observational/RWE studies; unadjusted HRs describe the crude association only
For continuous predictors, rescale through \(e^{k \hat{\beta}}\) to get the HR for a \(k\)-unit difference — never multiply the exponentiated HR
Visualize results with two forest plot options: a custom ggplot2 plot for full control, or ggforest() from {survminer} for a two-panel publication-ready figure with annotated estimates
Use ggadjustedcurves() from {survminer} for adjusted survival curves — the "conditional" method is the recommended default as it correctly averages across the population distribution rather than predicting at a single covariate value
Covariate selection should be guided by subject-matter knowledge and a DAG, not stepwise algorithms; always verify the 10-EPV rule before fitting

Extensions not covered here

This article covers the core Cox model. Important extensions include: time-varying covariates (when exposure changes during follow-up), competing risks (when multiple event types compete), frailty models (for clustered/hierarchical survival data), and flexible parametric models (when the baseline hazard shape is of scientific interest). These are natural next steps once the proportional hazards model is well understood.

Session Info

Show R session details

sessionInfo()

R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_India.utf8  LC_CTYPE=English_India.utf8   
[3] LC_MONETARY=English_India.utf8 LC_NUMERIC=C                  
[5] LC_TIME=English_India.utf8    

time zone: Asia/Calcutta
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] broom_1.0.10    scales_1.4.0    patchwork_1.3.2 gtsummary_2.4.0
 [5] ggsurvfit_1.2.0 survminer_0.5.1 ggpubr_0.6.2    survival_3.8-3 
 [9] tidyr_1.3.1     dplyr_1.1.4     ggplot2_4.0.1  

loaded via a namespace (and not attached):
 [1] gtable_0.3.6         xfun_0.54            htmlwidgets_1.6.4   
 [4] rstatix_0.7.3        lattice_0.22-7       vctrs_0.6.5         
 [7] tools_4.5.2          generics_0.1.4       tibble_3.3.0        
[10] pkgconfig_2.0.3      Matrix_1.7-4         data.table_1.17.8   
[13] RColorBrewer_1.1-3   S7_0.2.1             gt_1.1.0            
[16] lifecycle_1.0.4      compiler_4.5.2       farver_2.1.2        
[19] stringr_1.6.0        carData_3.0-5        litedown_0.8        
[22] sass_0.4.10          htmltools_0.5.8.1    yaml_2.3.10         
[25] Formula_1.2-5        pillar_1.11.1        car_3.1-3           
[28] broom.helpers_1.22.0 abind_1.4-8          km.ci_0.5-6         
[31] commonmark_2.0.0     tidyselect_1.2.1     digest_0.6.38       
[34] stringi_1.8.7        purrr_1.2.0          labeling_0.4.3      
[37] forcats_1.0.1        splines_4.5.2        cowplot_1.2.0       
[40] labelled_2.16.0      fastmap_1.2.0        grid_4.5.2          
[43] cli_3.6.5            magrittr_2.0.4       base64enc_0.1-3     
[46] cards_0.7.0          withr_3.0.2          backports_1.5.0     
[49] rmarkdown_2.30       gridExtra_2.3        ggsignif_0.6.4      
[52] zoo_1.8-14           hms_1.1.4            evaluate_1.0.5      
[55] knitr_1.50           haven_2.5.5          KMsurv_0.1-6        
[58] markdown_2.0         survMisc_0.5.6       rlang_1.1.6         
[61] xtable_1.8-4         glue_1.8.0           xml2_1.4.1          
[64] rstudioapi_0.17.1    jsonlite_2.0.0       R6_2.6.1            
[67] fs_1.6.6

References

Kassambara, A., Kosinski, M., & Biecek, P. (2024). survminer: Drawing survival curves using ggplot2. https://rpkgs.datanovia.com/survminer/

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B, 34(2), 187–220.

Collett, D. (2015). Modelling Survival Data in Medical Research (3rd ed.). CRC Press.

Klein, J. P., & Moeschberger, M. L. (2003). Survival Analysis: Techniques for Censored and Truncated Data (2nd ed.). Springer.

Therneau, T. M., & Grambsch, P. M. (2000). Modeling Survival Data: Extending the Cox Model. Springer.

Therneau, T. M., Crowson, C. S., & Atkinson, E. J. (2015). Adjusted survival curves. CRAN vignette. https://cran.r-project.org/web/packages/survival/vignettes/adjcurve.pdf

Sjoberg, D. D., Whiting, K., Curry, M., Lavery, J. A., & Larmarange, J. (2021). Reproducible summary tables with the gtsummary package. The R Journal, 13(1), 570–580.

Introduction

Why Cox Regression?

The Cox Model — Intuition and Formula

What Cox models

The hazard ratio

Interpreting the hazard ratio

Fitting Cox Models in R

Unadjusted model — single categorical predictor

Publication-ready table with {gtsummary}

Unadjusted model — continuous predictor

Multivariable Cox Regression

Fitting the full model

Unadjusted vs adjusted — comparing effect estimates

The Proportional Hazards Assumption

Testing the PH assumption with cox.zph()

Visualizing with Schoenfeld residual plots

Visualizing Results: Forest Plot

Option 1 — Custom {ggplot2} forest plot

Option 2 — ggforest() from {survminer}

Plotting Adjusted Survival Curves

Four methods explained

Fitting the model with strata()

Conditional adjusted curves (recommended default)

Marginal adjusted curves (IPW-balanced)

Covariate Selection

The fundamental principle: science first

A practical framework for covariate selection

Checking the 10-EPV rule

Comparing nested models with the likelihood ratio test

Visualizing variable importance: concordance and partial effects

Common Mistakes

Reporting Checklist

Summary

Key Takeaways

Session Info

References

Publication-ready table with `{gtsummary}`

Testing the PH assumption with `cox.zph()`

Option 1 — Custom `{ggplot2}` forest plot

Option 2 — `ggforest()` from `{survminer}`

Fitting the model with `strata()`