Author: Amira Mandour
Biostatistician | Clinical Trials & Statistical Modeling
Introduction:
The goal of this study is to investigate the survival outcomes of patients with advanced lung cancer. Using Kaplan-Meier survival analysis and Cox proportional hazards regression, we evaluate how clinical factors such as age, sex, performance status, and weight loss influence patient survival times. By understanding these relationships, we seek to identify key prognostic factors that may guide clinical decision-making and improve patient outcomes in lung cancer care.
Data:
This study is a clinical trial of lung cancer patients. The dataset includes:
Survival Time: Time from diagnosis to death (or censoring).
Censoring Status: Event indicator for each patient (1 = event occurred, i.e. death; 0 = censored).
Clinical Covariates: Age, sex, ECOG performance status (0, 1, 2, 3), weight loss (percentage), smoking status, and tumor stage.
Methodology:
Kaplan-Meier Survival Analysis: We performed Kaplan-Meier survival analysis to estimate survival curves for different groups based on clinical variables. The log-rank test was used to compare survival between groups (e.g., male vs. female).
Cox Proportional Hazards Model: To investigate the effect of clinical covariates on survival, a Cox proportional hazards regression model was built. Hazard ratios (HR) and 95% confidence intervals (CI) were calculated for each covariate.
Testing the Proportional Hazards Assumption: The proportional hazards assumption was assessed using Schoenfeld residuals. Covariates that violated the assumption were identified, and stratification was applied where necessary.
Stratified Cox Model: In response to the violations of the proportional hazards assumption, we applied a stratified Cox model for covariates with significant violations, such as Karnofsky scores (both physician and patient assessments).
Post-Hoc Analysis of Gender Differences: We performed a subgroup analysis by gender to explore survival differences between male and female patients. The stratified Cox model was also applied to assess the influence of gender on survival while accounting for other covariates.
The Kaplan-Meier survival curve shown here illustrates the overall survival experience of the study cohort. The curve plots the estimated proportion of patients surviving (as a percentage) against time in days. As expected, the survival probability decreases over time, reflecting the typical course of advanced lung cancer. The median survival time in this study is approximately 310 days, meaning that an estimated 50% of patients survived beyond this point (Figure 1).
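The product-limit calculation behind a curve like this can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used for the study: the function names `kaplan_meier` and `median_survival` and the six toy observations are hypothetical, and events are coded as in the Data section (1 = death, 0 = censored).

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where at least one death occurs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t (deaths and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical toy data (days, event indicator); not the study data.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(toy_times, toy_events)
print(curve)
print("median:", median_survival(curve))
```

Censored patients stay in the risk set up to their censoring time and then drop out without lowering the curve, which is why the estimate differs from a simple proportion of deaths.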
| Sex | Number at Risk | Events | Median Survival (days) | 95% CI (Lower) | 95% CI (Upper) |
|---|---|---|---|---|---|
| Male | 138 | 112 | 270 | 212 | 310 |
| Female | 90 | 53 | 426 | 348 | 550 |
Table 1. Kaplan–Meier survival estimates by sex. The table compares the median survival times and the number of events (deaths) for males and females in the study; the event rates quoted below are derived from these counts.
Males (n = 138) had a median survival time of 270 days (95% CI: 212–310), with 112 events (deaths), which corresponds to an event rate of 81.2%. This indicates that 81.2% of male patients in the cohort experienced the event (death) by the end of the study period.
Females (n = 90) had a longer median survival time of 426 days (95% CI: 348–550), with 53 events (deaths), yielding an event rate of 58.9%. This means that 58.9% of female patients in the cohort experienced the event. The confidence intervals for the two medians do not overlap.
The findings from this Kaplan–Meier analysis indicate that gender is an important factor in determining survival outcomes. Females in this cohort had both a longer median survival time and a lower event rate, suggesting that gender-related factors could play a role in influencing prognosis.
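The event rates quoted above are simple proportions of the counts in Table 1. A short sketch of that arithmetic follows; the names `table1` and `event_rate` are illustrative only.

```python
# Counts taken from Table 1: (number of patients, number of deaths) per group.
table1 = {"Male": (138, 112), "Female": (90, 53)}


def event_rate(n_patients, n_events):
    """Crude percentage of patients who experienced the event."""
    return 100.0 * n_events / n_patients


rates = {sex: round(event_rate(n, d), 1) for sex, (n, d) in table1.items()}
print(rates)
```

These crude rates ignore how long each patient was followed, which is why the Kaplan–Meier medians, rather than the rates alone, carry the comparison.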
Survival Test Results:
| Group | N | Observed (O) | Expected (E) | (O − E)²/E | Variance of O − E (V) |
|---|---|---|---|---|---|
| Males | 138 | 112 | 91.58 | 4.55 | 40.37 |
| Females | 90 | 53 | 73.42 | 5.68 | 40.37 |
| Chi-squared = 10.3 (df = 1) | p = 0.00131 | | | | |
The log-rank test was conducted to compare the survival distributions between male and female patients in the lung cancer cohort. The test revealed a statistically significant difference in survival between the two groups, with a chi-square statistic of 10.3 (degrees of freedom = 1) and a p-value of 0.001 (Table 2).
Male Patients (n = 138): Of the 138 male patients, 112 observed events (deaths) were recorded, compared to an expected 91.6 events. This resulted in a test statistic contribution of 4.55.
Female Patients (n = 90): Among 90 female patients, 53 deaths were observed, with an expected number of 73.4 events, contributing 5.68 to the test statistic.
The p-value of 0.001 indicates that the survival distributions between males and females differ significantly. Specifically, females demonstrated significantly better survival compared to males in this cohort. These findings are consistent with existing literature suggesting that female patients with lung cancer tend to have better survival outcomes than male patients, possibly due to biological, treatment-related, or other gender-related factors.
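The reported chi-square statistic can be recovered from the observed count, expected count, and variance for either group. The sketch below assumes the rounded values 91.58, 73.42, and 40.37 from Table 2 and uses the closed form for the upper tail of a chi-square distribution with one degree of freedom; the function name `logrank_from_summary` is hypothetical.

```python
import math


def logrank_from_summary(observed, expected, variance):
    """Log-rank chi-square (1 df) and p-value from one group's O, E and Var(O - E)."""
    chi2 = (observed - expected) ** 2 / variance
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Values for males, rounded from Table 2.
chi2, p_value = logrank_from_summary(observed=112, expected=91.58, variance=40.37)

# Per-group contributions (O - E)^2 / E described in the text.
contrib_males = (112 - 91.58) ** 2 / 91.58
contrib_females = (53 - 73.42) ** 2 / 73.42

print(round(chi2, 2), round(p_value, 5))
print(round(contrib_males, 2), round(contrib_females, 2))
```

With two groups, O − E for females is the negative of the value for males and the variance is shared, so either row yields the same statistic.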
The Kaplan-Meier curve for Progression-Free Survival (PFS) was calculated for the cohort. The median PFS for the study population was approximately 348 days, indicating that half of the patients in the study experienced disease progression or death within this time frame.
Kaplan-Meier Curves for OS and PFS by Sex:
Survival curves were generated for both Overall Survival (OS) and Progression-Free Survival (PFS), stratified by sex (male vs. female). The goal was to assess whether there are any significant differences in survival outcomes based on gender.
Overall Survival (OS)
The Kaplan-Meier curve for Overall Survival (OS) stratified by sex demonstrates distinct survival patterns for males and females in this cohort. The male survival curve consistently shows a lower survival probability across the study period compared to the female survival curve. Females have a longer median survival time, reflecting a better overall prognosis in this cohort.
Progression-Free Survival (PFS) Stratified by Sex
The Kaplan-Meier survival curves for Progression-Free Survival (PFS) were generated stratified by sex to explore potential differences in survival outcomes between males and females in the cohort. The survival curves for males and females are shown below, and the corresponding median PFS for each group is provided.
Median PFS for males: The median PFS for males was approximately 329 days, indicating that half of the male patients in the cohort remained progression-free for 329 days or longer before experiencing disease progression or death.
Median PFS for females: The median PFS for females was approximately 356 days, indicating that half of the female patients in the cohort remained progression-free for 356 days or longer.
The Kaplan-Meier curves indicate that females tend to have somewhat longer progression-free survival than males, with the female cohort maintaining a higher probability of being progression-free at later time points. The difference in median PFS is small (356 vs. 329 days), so it suggests at most a modestly slower rate of disease progression in females. Such a difference could be attributed to biological factors such as genetic differences or hormonal influences on disease progression and treatment response.
A Cox proportional hazards regression model was fitted to assess the impact of various clinical and demographic factors on survival in lung cancer patients. The following variables were included in the model: sex, age, ECOG performance status (ph.ecog), physician-rated Karnofsky performance score (ph.karno), and patient-rated Karnofsky score (pat.karno) (Table 3).
| Characteristic | Hazard Ratio (HR)¹ ² | 95% CI² | P value³ |
|---|---|---|---|
| Sex | 0.57 | 0.41, 0.80 | 0.001 |
| Age (years) | 1.01 | 0.99, 1.03 | 0.2 |
| ECOG performance status | 1.76 | 1.22, 2.54 | 0.002 |
| Physician Karnofsky score | 1.02 | 1.00, 1.04 | 0.11 |
| Patient Karnofsky score | 0.99 | 0.98, 1.00 | 0.14 |
| ¹ HRs are adjusted for all variables included in the model. | | | |
| ² HR = Hazard Ratio, CI = Confidence Interval | | | |
| ³ P values are from the Wald test. | | | |
Age: The hazard ratio (HR) for age was 1.0108 (95% CI: 0.9922 - 1.030, p = 0.257), suggesting that for each additional year of age, the risk of death increases by approximately 1.08%. However, the effect of age on survival was not statistically significant (p > 0.05), indicating that age is not a strong predictor of survival in this cohort.
Sex was found to be significantly associated with overall survival. The estimated hazard ratio (HR) for sex was 0.570 (95% CI: 0.408 - 0.797, p = 0.001), indicating that females have a significantly lower risk of death compared to males.
Specifically, the hazard ratio (HR) of 0.570 means that, holding all other variables constant, females have approximately 43% lower risk of death than males (1 - 0.570 = 0.430). This is a statistically significant finding (p < 0.01), suggesting that sex is an important prognostic factor for overall survival.
ECOG Performance Status (ph.ecog): The HR for ECOG performance status was 1.6373 (95% CI: 1.1359 - 2.360, p = 0.0082), indicating that each increase in the ECOG score (which represents worsening performance status) is associated with a 63.7% increase in the risk of death. This result was statistically significant (p < 0.01), suggesting that ECOG performance status is a key determinant of survival in this patient population.
Physician Karnofsky Score (ph.karno): The HR for the physician-rated Karnofsky performance score was 1.0129 (95% CI: 0.9931 - 1.033, p = 0.20326), indicating a slight increase in the risk of death with each one-point increase in the score. However, this effect was not statistically significant (p > 0.05), suggesting that the physician-rated Karnofsky score does not significantly affect survival in this model.
Patient Karnofsky Score (pat.karno): The HR for the patient-rated Karnofsky score was 0.9893 (95% CI: 0.9764 - 1.002, p = 0.10625), indicating a small decrease in the risk of death with higher scores. Although the estimated effect was in the protective direction, it was not statistically significant (p > 0.05), and hence does not appear to have a meaningful influence on survival in this analysis.
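The percentage interpretations above all follow from the same conversion, (HR − 1) × 100. The sketch below applies it to the hazard ratios quoted in the text; the names `percent_change_in_hazard` and `reported_hr` are illustrative. It also shows that a per-unit hazard ratio compounds multiplicatively over larger differences, using a 10-year age gap as an assumed example.

```python
def percent_change_in_hazard(hr):
    """Percent change in the hazard per one-unit increase in the covariate."""
    return (hr - 1.0) * 100.0


# Hazard ratios as quoted in the text above.
reported_hr = {
    "age": 1.0108,
    "sex": 0.570,
    "ph.ecog": 1.6373,
    "ph.karno": 1.0129,
    "pat.karno": 0.9893,
}

changes = {name: round(percent_change_in_hazard(hr), 2) for name, hr in reported_hr.items()}
print(changes)

# A per-year HR compounds over several years: HR for a 10-year difference in age.
hr_age_10_years = reported_hr["age"] ** 10
print(round(hr_age_10_years, 3))
```

A negative value, as for sex and pat.karno, corresponds to a reduction in the hazard.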
The likelihood ratio test (p = 3e-04), Wald test (p = 1e-04), and score test (p = 1e-04) all indicate that the model is statistically significant overall, suggesting that at least one of the covariates included in the model (sex, age, ECOG, Karnofsky scores) is significantly associated with survival.
The concordance index (C-index) was 0.63 (SE = 0.024), indicating that the model has moderate discriminatory ability. A C-index of 0.63 means that, among comparable pairs of patients, the model assigns the higher predicted risk to the patient who dies earlier in roughly 63% of pairs, compared with 50% expected by chance, so the model's discriminatory power could be improved.
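To make the pairwise meaning of the C-index concrete, here is a simplified sketch of Harrell's concordance calculation. It is an illustration under assumptions, not the routine that produced the 0.63 above: the function name `harrell_c`, the toy follow-up times, and the toy risk scores are hypothetical, and pairs with tied survival times are simply skipped.

```python
def harrell_c(times, events, risk_scores):
    """Share of comparable pairs in which the earlier death has the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i is observed to die
            # strictly before subject j's last follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


# Hypothetical toy data: the third patient is censored at day 12.
toy_times = [5, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 1]
toy_risk = [0.9, 0.7, 0.8, 0.4, 0.1]

c_index = harrell_c(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 1.0 would mean every comparable pair is ranked correctly, and 0.5 is what random risk scores would achieve on average.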
The proportional hazards assumption (PHA) was evaluated for the variables included in the Cox proportional hazards regression model using Schoenfeld residuals. The results of the Schoenfeld residuals test for each covariate, as well as the global test, are summarized below (Table 4).
| Covariate | Chi-Square | df | p-value |
|---|---|---|---|
| Sex | 1.704 | 1 | 0.19 |
| Age | 0.001 | 1 | 0.97 |
| ECOG Performance Status | 2.040 | 1 | 0.15 |
| Physician Karnofsky Score (ph.karno) | 5.411 | 1 | 0.02 |
| Patient Karnofsky Score (pat.karno) | 4.737 | 1 | 0.03 |
| Global Test | 9.229 | 5 | 0.10 |
Global Test (p = 0.10): The global test p-value of 0.10 indicates that, overall, there is no significant evidence of a violation of the proportional hazards assumption for the full model. A non-significant result does not prove that the covariate effects are constant over time; it means only that the data do not show a detectable departure when all covariates are considered jointly.
Sex (p = 0.19): The p-value for sex is 0.19, which is above the 0.05 threshold. There is therefore no evidence that the proportional hazards assumption is violated for sex, and the data are compatible with a constant effect of sex on the hazard of death over time.
Age (p = 0.97): The p-value for age is 0.97, indicating no evidence of a violation of the proportional hazards assumption for age. The data give no indication that the effect of age on the risk of death changes over time in this model.
ECOG Performance Status (p = 0.15): The p-value for ECOG performance status is 0.15, suggesting that there is no significant violation of the proportional hazards assumption for ECOG. The data are compatible with a constant effect of ECOG performance status on survival over time.
Physician Karnofsky Score (ph.karno, p = 0.02): The p-value for physician Karnofsky score is 0.02, which is below 0.05, indicating that the proportional hazards assumption is violated for this covariate. This suggests that the effect of the physician-reported Karnofsky score on the risk of death changes over time.
Patient Karnofsky Score (pat.karno, p = 0.03): The p-value for patient Karnofsky score is 0.03, which is also below 0.05, indicating a violation of the proportional hazards assumption for this variable. Similar to physician Karnofsky score, the effect of patient-reported Karnofsky score on survival is not constant over time.
The proportional hazards assumption was satisfied for most variables in the model, as indicated by the p-values of sex, age, and ECOG performance status, which were all greater than 0.05. However, there is evidence of a violation of the proportional hazards assumption for physician Karnofsky score (ph.karno) and patient Karnofsky score (pat.karno) (p-values 0.02 and 0.03, respectively). These findings suggest that the effects of these two variables on survival may change over time, and further analysis, such as including time-varying covariates for these variables, may be required.
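The p-values in Table 4 can be checked against the chi-square statistics with closed-form tail probabilities. The sketch below covers only the two cases that appear in the table (1 and 5 degrees of freedom); the function name `chi2_upper_tail` is hypothetical, and the code recomputes p-values from the reported statistics rather than performing the Schoenfeld residual test itself.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable; closed forms for df = 1 and df = 5 only."""
    tail_1df = math.erfc(math.sqrt(x / 2.0))
    if df == 1:
        return tail_1df
    if df == 5:
        # Odd-df closed form: the 1-df tail plus a finite correction term.
        correction = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * (1.0 + x / 3.0)
        return tail_1df + correction
    raise ValueError("this sketch only implements df = 1 and df = 5")


# (chi-square statistic, degrees of freedom) as reported in Table 4.
table4 = {
    "Sex": (1.704, 1),
    "Age": (0.001, 1),
    "ECOG": (2.040, 1),
    "ph.karno": (5.411, 1),
    "pat.karno": (4.737, 1),
    "Global": (9.229, 5),
}

p_values = {name: round(chi2_upper_tail(x, df), 2) for name, (x, df) in table4.items()}
print(p_values)
```

The global statistic is compared against 5 degrees of freedom, one per covariate, which is why it can be non-significant even though two individual covariates fall below 0.05.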
Schoenfeld Residuals for Assessing the Proportional Hazards Assumption:
We fitted a Cox proportional hazards regression model to assess the impact of clinical variables on survival, stratifying by Physician Karnofsky Score (ph.karno) and Patient Karnofsky Score (pat.karno) to account for potential violations of the proportional hazards assumption for these variables.
Table 5. Cox Proportional Hazards Regression Model (Stratified) Results After Addressing Proportional Hazards Violations
| Covariate | Coefficient (β) | Hazard Ratio (HR) | 95% CI (Lower) | 95% CI (Upper) | p-value |
|---|---|---|---|---|---|
| Sex | −0.638 | 0.528 | 0.356 | 0.784 | 0.002 |
| Age | 0.015 | 1.016 | 0.993 | 1.038 | 0.171 |
| ph.ecog | 0.329 | 1.389 | 0.852 | 2.265 | 0.188 |
Sex: The hazard ratio for sex is 0.528 (95% CI: 0.356, 0.784), indicating that female patients have a significantly lower risk of the event compared to male patients. This effect is statistically significant with a p-value of 0.00151. The negative coefficient for sex suggests that being female is associated with a decreased risk of the event.
Age: The hazard ratio for age is 1.015 (95% CI: 0.993, 1.038), which suggests a slight increase in the hazard for each additional year of age. However, this effect is not statistically significant (p = 0.17061), implying that age does not have a significant impact on survival in this cohort.
ECOG Performance Status: The hazard ratio for ECOG performance status (ph.ecog) is 1.389 (95% CI: 0.852, 2.265), suggesting that worse performance status is associated with a higher risk of the event. However, the p-value of 0.18790 indicates that this result is not statistically significant, and the effect of ECOG status may not be meaningful after adjustment for other covariates.
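The columns of Table 5 are linked: the hazard ratio is the exponential of the coefficient, and a Wald p-value can be approximated from the coefficient and the width of the 95% confidence interval on the log scale. The sketch below illustrates this with the rounded values from Table 5; the function names are hypothetical, and because the inputs are rounded to three decimals the recomputed p-values match the reported ones only approximately.

```python
import math


def hr_from_coef(beta):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(beta)


def wald_from_ci(beta, ci_lower, ci_upper, z_crit=1.96):
    """Approximate SE, z and two-sided p from a coefficient and its HR confidence limits."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p


# (coefficient, lower 95% limit, upper 95% limit) as reported in Table 5.
table5 = {
    "sex": (-0.638, 0.356, 0.784),
    "age": (0.015, 0.993, 1.038),
    "ph.ecog": (0.329, 0.852, 2.265),
}

results = {}
for name, (beta, lo, hi) in table5.items():
    se, z, p = wald_from_ci(beta, lo, hi)
    results[name] = {"hr": hr_from_coef(beta), "se": se, "z": z, "p": p}
    print(name, round(results[name]["hr"], 3), round(p, 4))
```

The confidence limits for ph.ecog span 1, which is the same information as its non-significant p-value.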
Model Evaluation:
Concordance Index (C-index): The C-index for this model is 0.597, which suggests that the model has only modest discriminatory ability, ranking survival times somewhat better than chance (0.5).
Likelihood Ratio Test: The likelihood ratio test (p = 0.002) indicates that the model is statistically significant overall.
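Wald Test: The Wald test (p = 0.003) likewise indicates that the covariates are jointly associated with survival. Like the likelihood ratio test, it assesses whether all coefficients are zero rather than how well the model fits the data.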
Score (Logrank) Test: The logrank test (p = 0.003) further supports the statistical significance of the model, indicating that the covariates collectively have a significant effect on survival.
Conclusion:
In this analysis, the survival distributions for both overall survival (OS) and progression-free survival (PFS) were examined. The Kaplan-Meier curves demonstrate expected survival patterns in advanced lung cancer, with the female cohort exhibiting better survival outcomes compared to males. These findings were statistically supported by the log-rank test, which indicated a significant difference between the survival curves of males and females.
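Further analysis using Cox proportional hazards regression identified ECOG performance status and gender as significant predictors of survival in the initial model, consistent with prior studies that have emphasized the importance of functional status in lung cancer prognosis. After stratification by the physician and patient Karnofsky scores, gender remained significant (HR = 0.528, p = 0.002), whereas ECOG performance status did not (p = 0.188).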