library(ggplot2)
library(ggpubr)
library(ggrepel)
library(dplyr)
cancer <- read.csv("cancer_data.csv")
# Preview the data
head(cancer)
##      Year     Type Survival
## 1  Year.5 Prostate       99
## 2 Year.10 Prostate       95
## 3 Year.15 Prostate       87
## 4 Year.20 Prostate       81
## 5  Year.5  Thyroid       96
## 6 Year.10  Thyroid       96
# Re-order the Year factor so lines connect in the correct chronological sequence
cancer$Year <- factor(cancer$Year,
                      levels = c("Year.5", "Year.10", "Year.15", "Year.20"))
ggplot(cancer, aes(x = Year, y = Survival, group = Type, color = Type)) +
  # Connect the four time-points for each cancer type with lines
  geom_line(linewidth = 0.7) +
  # Add a point at every time-point
  geom_point(size = 2) +
  # Add boxed Survival values at every point
  geom_label(aes(label = Survival),
             size = 2.5,
             label.padding = unit(0.15, "lines"),
             show.legend = FALSE) +
  # Label cancer-type names at the 5-year endpoint (left side)
  geom_text_repel(
    data = filter(cancer, Year == "Year.5"),
    aes(label = Type),
    nudge_x = -0.35,
    direction = "y",
    hjust = 1,
    segment.size = 0.3,
    size = 3,
    show.legend = FALSE
  ) +
  # Label cancer-type names at the 20-year endpoint (right side)
  geom_text_repel(
    data = filter(cancer, Year == "Year.20"),
    aes(label = Type),
    nudge_x = 0.35,
    direction = "y",
    hjust = 0,
    segment.size = 0.3,
    size = 3,
    show.legend = FALSE
  ) +
  theme_bw() +
  labs(
    title = "Cancer Survival Rates Over Time by Type",
    x = "Year",
    y = "Survival Rate (%)",
    color = "Cancer Type"
  ) +
  # Expand x-axis so repelled labels on both sides are not clipped
  scale_x_discrete(expand = expansion(mult = c(0.4, 0.4))) +
  theme(
    legend.position = "none",
    plot.title = element_text(face = "bold", hjust = 0.5)
  )
This slope graph (interaction plot) displays the survival rates (%) of 24 cancer types at 5, 10, 15, and 20 years. Each line represents one cancer type. The differing slopes and crossing lines indicate an interaction between time and cancer type: Prostate and Thyroid maintain high survival rates throughout, while others such as Multiple Myeloma and Pancreas decline sharply, so the effect of time on survival depends substantially on cancer type.
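The differing slopes can also be summarised numerically. The following is a minimal sketch, not part of the original analysis: `survival_decline()` is a hypothetical helper that computes the drop in survival between the 5-year and 20-year marks for each type. The toy input reuses the Prostate values from the preview above; the "Hypothetical" type and its values are invented purely for illustration.

```r
library(dplyr)

# Hypothetical helper (an assumption, not from the original analysis):
# percentage-point drop in survival from Year.5 to Year.20, per cancer type.
survival_decline <- function(df) {
  df %>%
    group_by(Type) %>%
    summarise(
      first_rate = Survival[Year == "Year.5"],
      last_rate  = Survival[Year == "Year.20"],
      decline    = first_rate - last_rate,
      .groups    = "drop"
    ) %>%
    # Steepest decline first
    arrange(desc(decline))
}

# Toy data: Prostate values are taken from the head() preview;
# the "Hypothetical" rows are made up to show a steeper slope.
toy <- data.frame(
  Year     = rep(c("Year.5", "Year.10", "Year.15", "Year.20"), times = 2),
  Type     = rep(c("Prostate", "Hypothetical"), each = 4),
  Survival = c(99, 95, 87, 81, 60, 40, 30, 20)
)

survival_decline(toy)
```

Applied to the full `cancer` data frame, the same call would rank all types by how steeply their lines fall, which is one way to put numbers on the slopes seen in the plot.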