2025-11-09

Introduction

Medical research on women's health has historically been neglected. Changes in the medical field during the 1990s allowed more of this research to be completed. This presentation aims to raise awareness of a medical condition that has affected women for centuries. We review data from 2010–2022 on women ages 50–60 to see whether the breast cancer rate has increased.

Dataset Preview

head(cancer)
##   Year Age_Group Cancer_Type Cases Rate_Per_100k
## 1 2010     50-60      Breast 18500         240.1
## 2 2011     50-60      Breast 18720         241.5
## 3 2012     50-60      Breast 18950         243.0
## 4 2013     50-60      Breast 19200         245.2
## 5 2014     50-60      Breast 19550         247.8
## 6 2015     50-60      Breast 19800         249.5

Hypothesis Testing: Model Slope

Trend in cancer rate:

We test whether the breast cancer rate for women ages 50–60 shows an upward trend over time.

Null hypothesis (no upward trend): \[H_0:\ \beta_1 = 0\]

Alternative hypothesis (upward trend): \[H_a:\ \beta_1 > 0\]

Model used: \[\text{Rate}_{per\,100k} = \beta_0 + \beta_1(\text{Year}) + \epsilon\]
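As a minimal sketch of how this model can be fit in R, the example below uses only the six rows shown in the dataset preview. The name `cancer_preview` is hypothetical, and because the full 2010–2022 data is not reproduced here, the estimate will differ from the full-data regression output.

```r
# Preview rows only (the six rows shown by head(cancer)); the full
# 2010-2022 dataset is not reproduced here, so estimates will differ.
cancer_preview <- data.frame(
  Year = 2010:2015,
  Rate_Per_100k = c(240.1, 241.5, 243.0, 245.2, 247.8, 249.5)
)

# Simple linear regression: Rate_Per_100k = beta_0 + beta_1 * Year + error
fit_preview <- lm(Rate_Per_100k ~ Year, data = cancer_preview)

# Estimated slope (beta_1): average change in the rate per year
slope_preview <- unname(coef(fit_preview)["Year"])
print(slope_preview)
```

A positive slope here is consistent with an upward trend, but the formal test should rely on the full dataset.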

Regression Output / Slope Test

The regression shows an increase in the breast cancer rate among women ages 50–60. The slope is about 2.25 cases per 100,000 women per year, and it is statistically significant (p = 4.48e-13).

## 
## Call:
## lm(formula = Rate_Per_100k ~ Year, data = cancer)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8615 -0.5126 -0.2659  0.4830  1.4450 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4.282e+03  1.180e+02  -36.29 8.37e-13 ***
## Year         2.249e+00  5.852e-02   38.43 4.48e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7895 on 11 degrees of freedom
## Multiple R-squared:  0.9926, Adjusted R-squared:  0.9919 
## F-statistic:  1477 on 1 and 11 DF,  p-value: 4.479e-13
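
As a check on the output above, the t value for the slope is the estimate divided by its standard error. A 95% confidence interval for the slope follows from the t distribution with 11 degrees of freedom, whose critical value is about 2.201:

\[t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)} = \frac{2.249}{0.05852} \approx 38.43\]

\[\hat{\beta}_1 \pm t_{0.975,\,11}\, SE(\hat{\beta}_1) = 2.249 \pm 2.201(0.05852) \approx (2.12,\ 2.38)\]

The interval lies entirely above zero, which agrees with the small p-value.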

Decision Rule for Hypothesis Test

We decide based on the p-value at significance level \(\alpha = 0.05\):

\[p < 0.05 \Rightarrow \text{Reject } H_0\]

\[p \ge 0.05 \Rightarrow \text{Fail to Reject } H_0\]

The p-value (4.48e-13) is far below the 0.05 threshold; therefore, we reject the null hypothesis.
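
The decision rule can be sketched in R using only the slope estimate, standard error, and degrees of freedom copied from the regression output. The variable names are illustrative. Note that `summary(lm)` reports a two-sided p-value, so this sketch computes the one-sided value that matches the alternative \(\beta_1 > 0\).

```r
# Values copied from the regression output for the Year coefficient
estimate  <- 2.249     # slope estimate
std_error <- 0.05852   # standard error of the slope
df_resid  <- 11        # residual degrees of freedom
alpha     <- 0.05      # significance level

# t statistic for H_0: beta_1 = 0
t_stat <- estimate / std_error

# One-sided p-value for H_a: beta_1 > 0 (upper tail of the t distribution)
p_one_sided <- pt(t_stat, df = df_resid, lower.tail = FALSE)

# Apply the decision rule
decision <- if (p_one_sided < alpha) "Reject H0" else "Fail to reject H0"
print(decision)
```

Because the estimated slope is positive, the one-sided p-value is half of the two-sided value in the output, so the decision is the same either way.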

Breast Cancer Rate Trend

library(ggplot2)

# Line chart of the yearly rate with a point marking each year
ggplot(cancer, aes(x = Year, y = Rate_Per_100k)) +
  geom_line(color = "#8C1D40", linewidth = 1.2) +
  geom_point(size = 2.5) +
  labs(
    title = "Rate per 100,000 Women (Ages 50-60)",
    x = "Year",
    y = "Rate per 100,000"
  )

Annual Breast Cancer Case Counts

Interactive Trend (plotly)

Conclusion

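  • From 2010 to 2022, the breast cancer rate among women ages 50–60 increased, and the upward trend is statistically significant.
  • Hypothesis test results:
    • p-value = 4.48e-13 (two-sided, as reported by lm; the one-sided value for the upward-trend alternative is half of this)
    • Slope ≈ 2.25 cases per 100,000 women per year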