Survival Analysis and Prognostic Modeling in Lung Cancer

Author: Amira Mandour
Biostatistician | Clinical Trials & Statistical Modeling Expert

2023-01-03

Introduction:

This study investigates survival outcomes in patients with advanced lung cancer. Using Kaplan-Meier estimation and Cox proportional hazards regression, we evaluate how clinical factors such as age, sex, performance status, and weight loss relate to patient survival time. By understanding these relationships, we seek to identify key prognostic factors that may guide clinical decision-making and improve patient outcomes in lung cancer care.

Data:

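The data come from a clinical trial of patients with advanced lung cancer (n = 228; 138 males and 90 females). The variables used in this report include survival time in days, vital status, sex, age, ECOG performance status (ph.ecog), physician-rated Karnofsky score (ph.karno), and patient-rated Karnofsky score (pat.karno).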

Methodology:

Kaplan-Meier Survival Analysis: We performed Kaplan-Meier survival analysis to estimate survival curves for different groups based on clinical variables. The log-rank test was used to compare survival between groups (e.g., male vs. female).

Cox Proportional Hazards Model: To investigate the effect of clinical covariates on survival, a Cox proportional hazards regression model was built. Hazard ratios (HR) and 95% confidence intervals (CI) were calculated for each covariate.

Testing the Proportional Hazards Assumption: The proportional hazards assumption was assessed using Schoenfeld residuals. Covariates that violated the assumption were identified, and stratification was applied where necessary.

Stratified Cox Model: In response to the violations of the proportional hazards assumption, we applied a stratified Cox model for covariates with significant violations, such as Karnofsky scores (both physician and patient assessments).

Post-Hoc Analysis of Gender Differences: We performed a subgroup analysis by gender to explore survival differences between male and female patients. The stratified Cox model was also applied to assess the influence of gender on survival while accounting for other covariates.

Statistical Analysis:

The Kaplan-Meier survival curve shown here illustrates the overall survival experience of the study cohort. The curve gives the estimated proportion of patients still alive (as a percentage) plotted against time in days. As expected, the survival probability decreases over time, reflecting the typical course of advanced lung cancer. The median survival time in this study is approximately 310 days; that is, an estimated 50% of patients survived beyond this point (Figure 1).
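The estimator behind a curve of this kind can be sketched in a few lines. The following Python is a minimal illustration using only the standard library. The function names `kaplan_meier` and `median_survival` and the toy data are hypothetical and are not the study's code or data. The median is taken as the first time the estimated survival drops to 0.5 or below, which is one common convention.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time in days for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy data (days, event indicator); NOT the study cohort.
toy_times = [5, 8, 8, 12, 15, 20, 22, 30]
toy_events = [1, 1, 0, 1, 0, 1, 1, 0]

toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

Censored patients never lower the curve directly; they only shrink the number at risk at later event times, which is why the steps grow larger toward the tail of the curve.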

Table 1. Kaplan–Meier Estimates of Survival by Sex
Sex      Number at Risk   Events   Median Survival (days)   95% CI (Lower)   95% CI (Upper)
Male     138              112      270                      212              310
Female   90               53       426                      348              550

The Kaplan–Meier survival estimates are presented in Table 1, which summarizes survival outcomes by sex. The median survival times and event rates (percentage of deaths) for males and females are compared below.

Males (n = 138) had a median survival time of 270 days (95% CI: 212–310), with 112 events (deaths), which corresponds to an event rate of 81.2%. This indicates that 81.2% of male patients in the cohort experienced the event (death) by the end of the study period.

Females (n = 90) had a longer median survival time of 426 days (95% CI: 348–550), with 53 events (deaths), yielding an event rate of 58.9%. This means that 58.9% of female patients in the cohort experienced the event.

The findings from this Kaplan–Meier analysis indicate that sex is associated with survival outcomes. Females in this cohort had both a longer median survival time and a lower event rate, suggesting that sex-related factors could play a role in prognosis.

Survival Test Results:

Table 2. Log-Rank Test Results
Group     N     Observed (O)   Expected (E)   (O − E)²/E   Variance of (O − E)
Males     138   112            91.58          4.55         40.37
Females   90    53             73.42          5.68         40.37
Chi-squared = 10.3 (df = 1), p = 0.00131

The log-rank test was conducted to compare the survival distributions of male and female patients in the lung cancer cohort. The test revealed a statistically significant difference in survival between the two groups, with a chi-square statistic of 10.3 (degrees of freedom = 1) and a p-value of 0.001 (Table 2).

Male Patients (n = 138): Of the 138 male patients, 112 observed events (deaths) were recorded, compared to an expected 91.6 events. This resulted in a test statistic contribution of 4.55.

Female Patients (n = 90): Among 90 female patients, 53 deaths were observed, with an expected number of 73.4 events, contributing 5.68 to the test statistic.

The p-value of 0.001 indicates that the survival distributions between males and females differ significantly. Specifically, females demonstrated significantly better survival compared to males in this cohort. These findings are consistent with existing literature suggesting that female patients with lung cancer tend to have better survival outcomes than male patients, possibly due to biological, treatment-related, or other gender-related factors.
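The reported statistic can be checked by hand from the observed and expected counts. The sketch below is a hedged illustration, assuming that the last column of Table 2 holds the variance of O − E (its value is the same for both groups) and using the table values rounded to two decimals. For two groups the log-rank statistic is (O − E)²/V, referred to a chi-square distribution with 1 degree of freedom.

```python
import math

# Observed and expected deaths from Table 2 (rounded to two decimals).
observed = {"male": 112, "female": 53}
expected = {"male": 91.58, "female": 73.42}
variance = 40.37  # assumed to be the variance of (O - E), same for both groups

# Per-group contributions (O - E)^2 / E, as listed in the table.
contrib = {g: (observed[g] - expected[g]) ** 2 / expected[g] for g in observed}

# Two-group log-rank statistic: (O - E)^2 / V, computed from either group.
chi_sq = (observed["male"] - expected["male"]) ** 2 / variance

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_sq / 2.0))

print(f"chi-square = {chi_sq:.2f}, p = {p_value:.5f}")
```

Under these assumptions the recomputed statistic agrees with the chi-square of 10.3 and the p-value of about 0.001 quoted in the text. Note that the per-group contributions (4.55 and 5.68) do not sum to the test statistic, because the log-rank test divides by the variance rather than by the expected counts.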

Progression-Free Survival:

The Kaplan-Meier curve for Progression-Free Survival (PFS) was calculated for the cohort. The median PFS for the study population was approximately 348 days, indicating that half of the patients in the study experienced disease progression or death within this time frame.

Kaplan-Meier Curves for OS and PFS by Sex:

Survival curves were generated for both Overall Survival (OS) and Progression-Free Survival (PFS), stratified by sex (male vs. female). The goal was to assess whether there are any significant differences in survival outcomes based on gender.

Overall Survival (OS)

The Kaplan-Meier curve for Overall Survival (OS) stratified by sex demonstrates distinct survival patterns for males and females in this cohort. The male survival curve consistently shows a lower survival probability across the study period compared to the female survival curve. Females have a longer median survival time, reflecting a better overall prognosis in this cohort.

Progression-Free Survival (PFS) Stratified by Sex

The Kaplan-Meier survival curves for Progression-Free Survival (PFS) were generated stratified by sex to explore potential differences in survival outcomes between males and females in the cohort. The survival curves for males and females are shown below, and the corresponding median PFS for each group is provided.

Median PFS for males: The median PFS for males was approximately 329 days, indicating that half of the male patients in the cohort remained progression-free for 329 days or longer before experiencing disease progression or death.

Median PFS for females: The median PFS for females was approximately 356 days, indicating that half of the female patients in the cohort remained progression-free for 356 days or longer.

The Kaplan-Meier curves suggest that females tend to have longer progression-free survival than males, with the female cohort maintaining a higher probability of being progression-free at later time points.

The difference in median PFS is modest (356 vs. 329 days), and no formal test of this difference is reported here, so it should be interpreted with caution. If confirmed, a longer time without disease progression in female patients could reflect biological factors such as genetic differences or hormonal influences on disease progression and treatment response.

Cox Proportional Hazards Regression Model

A Cox proportional hazards regression model was fitted to assess the impact of various clinical and demographic factors on survival in lung cancer patients. The following variables were included in the model: sex, age, ECOG performance status (ph.ecog), physician-rated Karnofsky performance score (ph.karno), and patient-rated Karnofsky score (pat.karno).

Model Results:

Table 3. Cox Model for Survival Outcomes: A Multivariable Analysis of Lung Cancer Prognostic Factors
Characteristic              Hazard Ratio (HR)   95% CI       P value
Sex                         0.57                0.41, 0.80   0.001
Age (years)                 1.01                0.99, 1.03   0.2
ECOG performance status     1.76                1.22, 2.54   0.002
Physician Karnofsky score   1.02                1.00, 1.04   0.11
Patient Karnofsky score     0.99                0.98, 1.00   0.14
HR = Hazard Ratio, CI = Confidence Interval. HRs are adjusted for all variables included in the model. P values are from the Wald test.

Overall Model Fit and Significance:

Evaluation of Proportional Hazards Assumption:

The proportional hazards assumption (PHA) was evaluated for the variables included in the Cox proportional hazards regression model using Schoenfeld residuals. The results of the Schoenfeld residuals test for each covariate, as well as the global test, are summarized below (Table 4).

Table 4. Schoenfeld Residuals Test for Proportional Hazards Assumption
Covariate                              Chi-Square   df   p-value
Sex                                    1.704        1    0.19
Age                                    0.001        1    0.97
ECOG Performance Status                2.040        1    0.15
Physician Karnofsky Score (ph.karno)   5.411        1    0.02
Patient Karnofsky Score (pat.karno)    4.737        1    0.03
Global Test                            9.229        5    0.10

The proportional hazards assumption was satisfied for sex, age, and ECOG performance status, whose p-values were all greater than 0.05, and the global test was not significant (p = 0.10). However, there is evidence of a violation of the proportional hazards assumption for physician Karnofsky score (ph.karno) and patient Karnofsky score (pat.karno) (p-values 0.02 and 0.03, respectively). These findings suggest that the effects of these two variables on survival may change over time, and further handling, such as stratification or time-varying coefficients for these variables, may be required.
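The p-values in Table 4 follow from the chi-square statistics and their degrees of freedom. As a hedged consistency check, the sketch below recomputes them with the Python standard library only. The helper `chi2_sf` is written for this example and is not a library function. It uses the closed forms for 1 and 2 degrees of freedom and a standard recurrence for higher integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if df % 2 == 1:
        p, k = math.erfc(math.sqrt(x / 2.0)), 1  # closed form for df = 1
    else:
        p, k = math.exp(-x / 2.0), 2             # closed form for df = 2
    while k < df:
        # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        p += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return p


# (chi-square, df, reported p-value) copied from Table 4.
table4 = {
    "sex": (1.704, 1, 0.19),
    "age": (0.001, 1, 0.97),
    "ph.ecog": (2.040, 1, 0.15),
    "ph.karno": (5.411, 1, 0.02),
    "pat.karno": (4.737, 1, 0.03),
    "global": (9.229, 5, 0.10),
}

recomputed = {name: chi2_sf(chi, df) for name, (chi, df, _) in table4.items()}

for name, p in recomputed.items():
    print(f"{name:10s} p = {p:.3f}")
```

The global statistic is tested on 5 degrees of freedom, one per covariate, which is why a global chi-square of 9.229 is not significant even though two individual covariates fall below 0.05.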

Schoenfeld Residuals for Assessing the Proportional Hazards Assumption:

We fitted a Cox proportional hazards regression model to assess the impact of clinical variables on survival, stratifying by Physician Karnofsky Score (ph.karno) and Patient Karnofsky Score (pat.karno) to account for potential violations of the proportional hazards assumption for these variables.
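In a stratified Cox model, each stratum keeps its own baseline hazard, so a patient who dies is compared only with patients in the same stratum who are still at risk, while the coefficients are shared across strata. The following is a minimal sketch of that idea for a single covariate, assuming the Breslow form of the partial likelihood and a simple ternary search for the maximum. The stratum names, toy data, and function names are hypothetical, and this is not the fitting routine used for the study.

```python
import math


def stratified_loglik(beta, strata):
    """Breslow partial log-likelihood for one covariate, summed over strata."""
    total = 0.0
    for rows in strata.values():
        for t_i, died_i, x_i in rows:
            if not died_i:
                continue  # censored rows contribute only through risk sets
            # Risk set: patients in the SAME stratum still under observation at t_i.
            risk = sum(math.exp(beta * x_j) for t_j, _, x_j in rows if t_j >= t_i)
            total += beta * x_i - math.log(risk)
    return total


def fit_beta(strata, lo=-5.0, hi=5.0, iters=200):
    """Maximise the concave log-likelihood by ternary search on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if stratified_loglik(m1, strata) < stratified_loglik(m2, strata):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Hypothetical toy data: (time in days, death indicator, x), with x = 1 for female.
toy_strata = {
    "karno_high": [(4, 1, 0), (6, 1, 1), (9, 1, 0), (12, 0, 1), (15, 1, 1)],
    "karno_low": [(2, 1, 0), (3, 1, 1), (5, 1, 0), (7, 0, 1), (8, 1, 0)],
}

beta_hat = fit_beta(toy_strata)
hazard_ratio = math.exp(beta_hat)
print(f"beta = {beta_hat:.3f}, HR = {hazard_ratio:.3f}")
```

Because the stratifying variables enter only through the risk sets, no coefficient is estimated for them, which is consistent with the Karnofsky scores being absent from the stratified results table.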

Key Results:

Table 5. Cox Proportional Hazards Regression Model (Stratified) Results After Addressing Proportional Hazards Violations
Covariate   Coefficient (β)   Hazard Ratio (HR)   95% CI (Lower)   95% CI (Upper)   p-value
Sex         −0.638            0.528               0.356            0.784            0.002
Age         0.015             1.016               0.993            1.038            0.171
ph.ecog     0.329             1.389               0.852            2.265            0.188
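The columns of Table 5 are linked by simple relationships: the hazard ratio is the exponential of the coefficient, and the 95% confidence interval is symmetric around the coefficient on the log scale. The sketch below is an illustration under the assumption that the reported p-values are Wald tests. It recovers an approximate standard error from the interval width and recomputes the hazard ratio and p-value for each covariate.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

# (coefficient, lower 95% limit, upper 95% limit) copied from Table 5.
table5 = {
    "sex": (-0.638, 0.356, 0.784),
    "age": (0.015, 0.993, 1.038),
    "ph.ecog": (0.329, 0.852, 2.265),
}

results = {}
for name, (beta, lower, upper) in table5.items():
    hr = math.exp(beta)
    # The CI is exp(beta +/- z * SE), so its log-width recovers the standard error.
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal (Wald) p-value
    results[name] = {"hr": hr, "se": se, "z": z, "p": p}

for name, r in results.items():
    print(f"{name:8s} HR = {r['hr']:.3f}  SE = {r['se']:.3f}  p = {r['p']:.3f}")
```

Because the table entries are rounded to three decimals, the recomputed values only approximate the reported ones. The discrepancy is most visible for age, whose coefficient is small relative to the rounding.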

Model Evaluation:

Conclusion:

In this analysis, the survival distributions for both overall survival (OS) and progression-free survival (PFS) were examined. The Kaplan-Meier curves demonstrate expected survival patterns in advanced lung cancer, with the female cohort exhibiting better survival outcomes compared to males. These findings were statistically supported by the log-rank test, which indicated a significant difference between the survival curves of males and females.

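Further analysis using Cox proportional hazards regression identified ECOG performance status (HR 1.76, 95% CI 1.22–2.54) and sex (HR 0.57, 95% CI 0.41–0.80) as significant predictors of survival in the initial multivariable model, consistent with prior studies that have emphasized the importance of functional status in lung cancer prognosis. After stratification by the physician and patient Karnofsky scores, sex remained significant (HR 0.528, p = 0.002), whereas ECOG performance status no longer reached significance (HR 1.389, p = 0.188).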