Analysis of Prostate Data

The mean age of patients in this prostate dataset was 61 years. The median pre-operative PSA level was 6.2, while the mean pre-operative PSA level was 8.2. Mean time to recurrence was 32 (units unconfirmed, possibly days).
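
A mean PSA (8.2) well above the median (6.2) suggests a right-skewed distribution. As a minimal sketch of how such summaries could be computed, the block below uses a small made-up data frame (`toy`, with invented values) in place of the real `prostate` data:

```r
# Hypothetical toy data standing in for the real `prostate` data frame;
# the values are invented for illustration only.
toy <- data.frame(
  age       = c(55, 60, 62, 67, 61),
  preop_psa = c(4.1, 5.0, 6.2, NA, 17.5)
)

# na.rm = TRUE drops missing values before summarizing
mean_age   <- mean(toy$age, na.rm = TRUE)
median_psa <- median(toy$preop_psa, na.rm = TRUE)
mean_psa   <- mean(toy$preop_psa, na.rm = TRUE)

# A mean well above the median points to a right-skewed distribution
c(mean_age = mean_age, median_psa = median_psa, mean_psa = mean_psa)
```

A single large value (17.5 here) pulls the mean up while leaving the median nearly unchanged, which is why both are worth reporting for PSA.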

Headline about Prostate data

PSA levels indicate prostate problems

Older patients tend to have prostate problems more frequently

# Recode the 0/1 indicator columns as labeled factors
prostate_factors <- prostate %>% 
  mutate(aa = factor(aa, levels = c(0, 1), 
                     labels = c("White", "African-American"))) %>% 
  mutate(fam_hx = factor(fam_hx, levels = c(0, 1), 
                         labels = c("No Family History", "FHx of Prostate Cancer")))
prostate %>% 
  select(age, p_vol, preop_psa, aa, fam_hx) %>% 
  group_by(aa, fam_hx) %>% 
  summarize(across(age:preop_psa, ~ mean(.x, na.rm=TRUE)))
## `summarise()` has regrouped the output.
## ℹ Summaries were computed grouped by aa and fam_hx.
## ℹ Output is grouped by aa.
## ℹ Use `summarise(.groups = "drop_last")` to silence this message.
## ℹ Use `summarise(.by = c(aa, fam_hx))` for per-operation grouping
##   (`?dplyr::dplyr_by`) instead.
## # A tibble: 4 × 5
## # Groups:   aa [2]
##      aa fam_hx   age p_vol preop_psa
##   <dbl>  <dbl> <dbl> <dbl>     <dbl>
## 1     0      0  61.8  56.9      8.06
## 2     0      1  59.5  57.3      7.22
## 3     1      0  60.7  54.3      9.90
## 4     1      1  60.1  51.4      8.71

Results Interpretation

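African-American patients had higher mean preoperative PSA than White patients (9.90 vs. 8.06 without a family history; 8.71 vs. 7.22 with one). A positive family history was not associated with higher mean PSA in either group; if anything, the means were slightly lower. These are group means of PSA among patients in the dataset, not estimates of prostate cancer risk.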
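
The table above shows `aa` and `fam_hx` as 0/1 codes because the summary was run on `prostate` rather than `prostate_factors`. A minimal sketch of the same kind of summary on labeled factors follows, with `.groups = "drop"` added so the result is ungrouped and the regrouping message is not printed. The `toy` tibble and its values are hypothetical, not the real data:

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration only
toy <- tibble(
  aa        = c(0, 0, 1, 1, 0, 1),
  fam_hx    = c(0, 1, 0, 1, 0, 0),
  age       = c(60, 58, 62, 61, 64, 59),
  preop_psa = c(6.0, 7.0, 9.0, 8.5, NA, 11.0)
)

toy_means <- toy %>%
  mutate(
    aa = factor(aa, levels = c(0, 1),
                labels = c("White", "African-American")),
    fam_hx = factor(fam_hx, levels = c(0, 1),
                    labels = c("No Family History", "FHx of Prostate Cancer"))
  ) %>%
  group_by(aa, fam_hx) %>%
  # .groups = "drop" returns an ungrouped tibble and silences the message
  summarize(across(c(age, preop_psa), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")

toy_means
```

With labeled factors, each row of the summary reads directly as a named subgroup instead of a pair of 0/1 codes.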

Plots

ggplot(prostate_factors)+
  aes(x = p_vol, y = preop_psa, col = aa) +
  geom_point() + 
  geom_smooth(method = "lm") +
  facet_grid(aa~fam_hx) + 
  labs(x = 'Prostate Volume', y = "Preoperative PSA",
       title = 'Relationship Between Prostate Volume and Preop PSA,\nSubdivided by Family History and Race') + 
  theme(legend.position = "bottom")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 11 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 11 rows containing missing values or values outside the scale range
## (`geom_point()`).

Plot Interpretation

Among White patients, pre-op PSA appears to increase with prostate volume, both with and without a family history. Among African-American patients, there appears to be no clear relationship between prostate volume and preoperative PSA in either family-history group.
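
Visual impressions from the facets can be checked numerically by fitting the same linear model within each group and comparing slopes. The block below is a minimal base-R sketch on simulated data. The `toy` data frame is hypothetical, built with one group that has a positive slope and one with none, and it does not reproduce the real `prostate_factors` results:

```r
set.seed(1)

# Simulated toy data (not the real dataset): one group with a built-in
# positive slope, one with no relationship
toy <- data.frame(
  aa    = rep(c("White", "African-American"), each = 50),
  p_vol = runif(100, min = 20, max = 120)
)
toy$preop_psa <- ifelse(toy$aa == "White",
                        2 + 0.08 * toy$p_vol,
                        8) + rnorm(100, sd = 1)

# Fit preop_psa ~ p_vol separately in each group and pull out the slope
slopes <- sapply(split(toy, toy$aa), function(d) {
  coef(lm(preop_psa ~ p_vol, data = d))[["p_vol"]]
})

slopes
```

To mirror the plot's four panels, the split could be done on both variables at once, for example `split(toy, list(toy$aa, toy$fam_hx))`, assuming a `fam_hx` column is present.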

Statistical Testing (T testing)

Hypothesis: African-American patients have a higher mean preoperative PSA level than White patients. The test below is two-sided, so it assesses whether the group means differ in either direction.

result<-prostate_factors %>%
  t_test(formula = preop_psa ~ aa,
         detailed = TRUE)
view(result)
print(result, width = Inf)
## # A tibble: 1 × 15
##   estimate estimate1 estimate2 .y.       group1 group2              n1    n2
## *    <dbl>     <dbl>     <dbl> <chr>     <chr>  <chr>            <int> <int>
## 1    -1.89      7.86      9.75 preop_psa White  African-American   259    54
##   statistic      p    df conf.low conf.high method alternative
## *     <dbl>  <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>      
## 1     -1.96 0.0534  71.7    -3.81    0.0288 T-test two.sided

African-American patients had a higher mean pre-op PSA (9.75) than White patients (7.86). The estimated difference (White minus African-American) was -1.89, with a 95% confidence interval of -3.81 to 0.03. This difference was not statistically significant at the 0.05 level (t = -1.96, df = 71.7, p = 0.0534).
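
Since the hypothesis is directional but the reported test is two-sided, it may help to see how the sign of the estimate and the choice of alternative interact. The block below is a minimal sketch using base R's `t.test()`, which runs a Welch test by default, on simulated data. The `toy` values are hypothetical and do not reproduce the results above:

```r
set.seed(42)

# Simulated toy data (not the real dataset)
toy <- data.frame(
  aa = factor(rep(c("White", "African-American"), times = c(60, 30)),
              levels = c("White", "African-American")),
  preop_psa = c(rnorm(60, mean = 8, sd = 4), rnorm(30, mean = 10, sd = 5))
)

# Two-sided Welch t-test; the difference is first level minus second level,
# so a negative value means the second group (African-American) is higher
two_sided <- t.test(preop_psa ~ aa, data = toy)

# One-sided version: alternative = "less" tests whether
# mean(White) - mean(African-American) < 0
one_sided <- t.test(preop_psa ~ aa, data = toy, alternative = "less")

diff_means <- unname(two_sided$estimate[1] - two_sided$estimate[2])
c(difference  = diff_means,
  p_two_sided = two_sided$p.value,
  p_one_sided = one_sided$p.value)
```

When the t statistic is negative, the one-sided p-value for `alternative = "less"` is half the two-sided p-value. The direction of a one-sided test should be fixed before looking at the data.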