# Load plotting and data-wrangling packages
library(tidyverse)
# Read the dataset and drop rows with missing values
breast_data <- read.csv("breast_cancer.csv")
breast_data <- na.omit(breast_data)
# Keep only tumour sizes in the range 0 < size <= 100 (mm)
breast_data <- subset(breast_data, Tumor.Size > 0 & Tumor.Size <= 100)
BreastScreen Australia is the national breast cancer screening program. It offers free mammograms to women aged 50–74 to improve early detection and reduce breast cancer mortality. More information: https://www.cancerscreening.gov.au/breastscreen
Based on the analysis, early tumour size screening and estrogen receptor (ER) testing are recommended. In this dataset, patients with smaller tumours and ER-positive status show noticeably better survival outcomes. These findings support increased funding for early detection programs and hormone-based treatment approaches to improve both prognosis and efficiency.
The breast cancer dataset was examined through three key variables: Patient Status (Alive vs Deceased), Tumor Size, and Estrogen Receptor Status. These variables are important indicators for understanding treatment outcomes and guiding therapeutic choices.
# Bar chart of patient outcome counts
ggplot(breast_data, aes(x = Status)) +
  geom_bar(fill = "darkblue") +
  labs(title = "Patient Status Distribution", x = "Status", y = "Count")
The bar plot of ‘Status’ shows that more patients in the dataset survived than died. This seems encouraging at first, but it also means the two outcome groups are imbalanced. Because survivors are overrepresented, apparent relationships between ‘Status’ and other variables should be interpreted with this imbalance in mind.
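The imbalance described above can be quantified directly rather than read off the bar heights. The following is a minimal sketch in base R, assuming the outcome column holds the labels "Alive" and "Dead"; the `toy` data frame is hypothetical and only stands in for `breast_data`.

```r
# Hypothetical toy data standing in for breast_data (not the real dataset).
# The labels "Alive" and "Dead" are assumed values of the Status column.
toy <- data.frame(
  Status = c("Alive", "Alive", "Alive", "Alive", "Dead")
)

# Counts per outcome, equivalent to the bar heights in the plot
status_counts <- table(toy$Status)

# Share of each outcome, which makes the size of the imbalance explicit
status_share <- prop.table(status_counts)

print(status_counts)
print(round(status_share, 2))
```

Replacing `toy` with `breast_data` would give the actual split for the report.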
RQ1: How does tumor size vary between patients who survive and those who do not following a breast cancer diagnosis?
# Boxplot of tumour size for each patient outcome
ggplot(breast_data, aes(x = Status, y = Tumor.Size)) +
  geom_boxplot(fill = "darkred") +
  labs(title = "Tumor Size by Patient Status", x = "Status", y = "Tumor Size (mm)")
The comparative boxplot of ‘Tumor Size’ by ‘Status’ shows a clear difference in tumour size between the two outcome groups. Patients who died tended to have larger tumours, with a distribution extending to higher values. This is consistent with established clinical knowledge that larger tumours are associated with poorer prognosis. The outliers in the deceased group also point to the severity of late-stage tumour progression. Elston & Ellis (1991) report tumour size as one of the most significant independent prognostic variables in breast cancer, affecting both survival and recurrence, which underlines the need for early detection and treatment programs.
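A boxplot shows the difference visually, but a numeric summary and a simple test can help support the comparison. The sketch below uses base R on a hypothetical `toy` data frame (the values are invented for illustration, and the "Alive"/"Dead" labels are assumed). It computes the median tumour size per group and runs a Wilcoxon rank-sum test, chosen here as one reasonable option because tumour sizes are typically skewed; it is not a method used in the original analysis.

```r
# Hypothetical toy data standing in for breast_data (values are invented).
toy <- data.frame(
  Status = rep(c("Alive", "Dead"), each = 6),
  Tumor.Size = c(10, 15, 18, 20, 22, 25,   # assumed sizes (mm), survivors
                 30, 35, 40, 45, 60, 90)   # assumed sizes (mm), deceased
)

# Median tumour size per outcome group (the centre line of each box)
medians <- aggregate(Tumor.Size ~ Status, data = toy, FUN = median)
print(medians)

# Wilcoxon rank-sum test: do the two groups differ in tumour size?
# exact = FALSE uses the normal approximation, which also tolerates ties.
size_test <- wilcox.test(Tumor.Size ~ Status, data = toy, exact = FALSE)
print(size_test$p.value)
```

Running the same two calls on `breast_data` would show whether the difference seen in the plot holds up numerically.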
RQ2: Is there an association between estrogen receptor (ER) status and breast cancer survival outcomes?
# Side-by-side bars of patient outcome within each estrogen receptor status
ggplot(breast_data, aes(x = Estrogen.Status, fill = Status)) +
  geom_bar(position = "dodge") +
  labs(title = "Estrogen Status by Patient Outcome", x = "Estrogen Status", y = "Count")
The double bar plot comparing ‘Estrogen Status’ and ‘Status’ suggests that patients with positive estrogen receptor (ER) status are more likely to survive. This is in line with existing medical research showing that ER-positive patients typically respond well to hormonal therapies. The large meta-analysis by the Early Breast Cancer Trialists' Collaborative Group (2011) found that hormone treatments such as tamoxifen substantially improve long-term survival and reduce recurrence in ER-positive patients. These insights illustrate the practical importance of ER status testing in predicting outcomes and tailoring treatment.
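Because the two ER groups are likely to differ greatly in size, raw counts can be hard to compare, and survival proportions within each group are easier to read. The sketch below uses base R on a hypothetical `toy` data frame (the counts are invented, and the labels "Positive"/"Negative" and "Alive"/"Dead" are assumed). It builds a contingency table, converts it to row proportions, and runs a chi-squared test of association as one possible check; this test was not part of the original analysis.

```r
# Hypothetical toy data standing in for breast_data (counts are invented).
group_sizes <- c(160, 20, 24, 16)
toy <- data.frame(
  Estrogen.Status = rep(c("Positive", "Positive", "Negative", "Negative"),
                        times = group_sizes),
  Status = rep(c("Alive", "Dead", "Alive", "Dead"), times = group_sizes)
)

# Contingency table: rows are ER status, columns are patient outcome
er_table <- table(toy$Estrogen.Status, toy$Status)
print(er_table)

# Proportion of each outcome within each ER group (each row sums to 1)
survival_share <- prop.table(er_table, margin = 1)
print(round(survival_share, 2))

# Chi-squared test of association between ER status and outcome
er_test <- chisq.test(er_table)
print(er_test$p.value)
```

Applied to `breast_data`, the row proportions would give the survival rate in each ER group directly.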
The analysis demonstrates the Shared Value of Accountability through its transparent and replicable methods and its clear communication of limitations. It follows the Ethical Principle of Avoiding Harm by interpreting medical data responsibly, so that the findings can guide decisions without exaggerated conclusions or scope for misuse.
No artificial intelligence was used in the creation of the final version of this project, including research, analysis, and coding.
(Comparative Boxplot): Elston, C. W., & Ellis, I. O. (1991). Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology, 19(5), 403–410.
Link: https://doi.org/10.1111/j.1365-2559.1991.tb00229.x
(Double Barplot): Early Breast Cancer Trialists' Collaborative Group (EBCTCG). (2011). Relevance of breast cancer hormone receptors and other factors to the efficacy of adjuvant tamoxifen: patient-level meta-analysis of randomised trials. The Lancet, 378(9793), 771–784.
Link: https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(11)60993-8/fulltext
Hi, do we need intext citations for research articles?: https://edstem.org/au/courses/19992/discussion/2676788?comment=5955837
Graphical Outputs: https://edstem.org/au/courses/19992/discussion/2676886
Error message?: https://edstem.org/au/courses/19992/discussion/2677945