Analysis of Prostate Data

The data set has 20 variables and 316 observations. The mean time to recurrence is 32.9, with a minimum of 0.27 and a maximum of 103.6.
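Summary figures like these can be computed with a few base R calls. The sketch below runs on a small made-up data frame, not the real data: `toy` and its values are invented, and `time_to_recurrence` is an assumed column name that the text above does not confirm.

```r
# Toy stand-in for the prostate data; all values are invented for illustration.
toy <- data.frame(
  age = c(61, 58, 66, 70),
  time_to_recurrence = c(0.5, 12.0, NA, 40.0)  # assumed column name
)

dim(toy)  # number of observations (rows), then number of variables (columns)

# Mean, minimum and maximum, ignoring missing values
ttr_stats <- c(
  mean = mean(toy$time_to_recurrence, na.rm = TRUE),
  min  = min(toy$time_to_recurrence, na.rm = TRUE),
  max  = max(toy$time_to_recurrence, na.rm = TRUE)
)
ttr_stats
```

Setting `na.rm = TRUE` matters here: without it, a single missing value makes all three statistics `NA`.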

Headline about Prostate Data

This data set is both large and small.

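library(dplyr)  # provides %>%, mutate(), select(), group_by(), summarize(), across()

# Recode the 0/1 indicators as labelled factors
prostate_factors <- prostate %>%
  mutate(aa = factor(aa, levels = c(0, 1),
                     labels = c("White", "African-American")),
         fam_hx = factor(fam_hx, levels = c(0, 1),
                         labels = c("No Family History", "FHx of Prostate Cancer")))

# Group means; this uses the unlabelled `prostate`, so aa and fam_hx print as 0/1
prostate %>%
  select(aa, fam_hx, age, p_vol, preop_psa) %>%
  group_by(aa, fam_hx) %>%
  summarize(across(age:preop_psa, ~ mean(.x, na.rm = TRUE)))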
## `summarise()` has grouped output by 'aa'. You can override using the `.groups`
## argument.
## # A tibble: 4 × 5
## # Groups:   aa [2]
##      aa fam_hx   age p_vol preop_psa
##   <dbl>  <dbl> <dbl> <dbl>     <dbl>
## 1     0      0  61.8  56.9      8.06
## 2     0      1  59.5  57.3      7.22
## 3     1      0  60.7  54.3      9.90
## 4     1      1  60.1  51.4      8.71

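In this table, aa = 0 codes White and aa = 1 codes African-American patients; fam_hx = 0 codes no family history and fam_hx = 1 codes a family history of prostate cancer. The mean age of White patients with no family history was 61.8 years. The mean prostate volume (p_vol) of African-American patients with a family history was 51.41.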
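Running the same summary on the labelled factors makes the group rows self-describing. This is a minimal sketch on invented data: `toy_factors` is a hypothetical stand-in for `prostate_factors`, and none of its values come from the real data set.

```r
library(dplyr)

# Toy stand-in for prostate_factors; all values are invented.
toy_factors <- tibble(
  aa = factor(c(0, 0, 1, 1, 0, 1), levels = c(0, 1),
              labels = c("White", "African-American")),
  fam_hx = factor(c(0, 1, 0, 1, 0, 1), levels = c(0, 1),
                  labels = c("No Family History", "FHx of Prostate Cancer")),
  age = c(60, 62, 58, 64, 66, 61),
  p_vol = c(50, 55, 47, 48, NA, 52),
  preop_psa = c(6.0, 7.5, 9.0, 8.0, 5.0, 10.0)
)

group_means <- toy_factors %>%
  group_by(aa, fam_hx) %>%
  summarize(across(age:preop_psa, ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")  # return an ungrouped tibble and silence the message
group_means
```

Passing `.groups = "drop"` also avoids the `summarise()` grouping message shown in the output above.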

Including Plots

The plots compare prostate volume with preoperative PSA across race and family-history groups:

## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 11 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 11 rows containing missing values or values outside the scale range
## (`geom_point()`).

White patients with no family history show a steeper fitted slope between prostate volume and preoperative PSA than the other groups.

These graphs suggest that family history is a less important factor than race.
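The plotting code is not shown above, so the following is only a guess at how such a figure might be built. It assumes ggplot2, a linear fit (`method = "lm"`), prostate volume on the x-axis, and one panel per race and family-history combination. The data frame `toy_plot` is simulated and does not reproduce the real results.

```r
library(ggplot2)

# Simulated toy data; values are invented for illustration only.
set.seed(1)
toy_plot <- data.frame(
  aa = factor(rep(c("White", "African-American"), each = 20)),
  fam_hx = factor(rep(c("No Family History", "FHx of Prostate Cancer"),
                      times = 20)),
  p_vol = runif(40, min = 20, max = 120)
)
toy_plot$preop_psa <- 5 + 0.05 * toy_plot$p_vol + rnorm(40)

p <- ggplot(toy_plot, aes(x = p_vol, y = preop_psa)) +
  geom_point() +
  geom_smooth(method = "lm") +  # straight-line fit per panel, formula y ~ x
  facet_grid(aa ~ fam_hx) +     # rows: race, columns: family history
  labs(x = "Prostate volume (p_vol)", y = "Preoperative PSA (preop_psa)")
p
```

On the real data, rows with a missing `p_vol` or `preop_psa` are dropped from both layers, which is what the warnings above report.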

Statistical Testing

We hypothesize that African-American patients in this dataset may, on average, have a higher preoperative PSA level than White patients.

# t_test() is from the rstatix package; by default it runs Welch's two-sample t-test
prostate_factors %>%
  t_test(formula = preop_psa ~ aa,
         detailed = TRUE)
## # A tibble: 1 × 15
##   estimate estimate1 estimate2 .y.    group1 group2    n1    n2 statistic      p
## *    <dbl>     <dbl>     <dbl> <chr>  <chr>  <chr>  <int> <int>     <dbl>  <dbl>
## 1    -1.89      7.86      9.75 preop… White  Afric…   259    54     -1.96 0.0534
## # ℹ 5 more variables: df <dbl>, conf.low <dbl>, conf.high <dbl>, method <chr>,
## #   alternative <chr>

With p = 0.0534, we cannot reject at the 0.05 significance level the null hypothesis that mean preoperative PSA (preop_psa) does not differ between White and African-American patients in this data set.
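The hypothesis stated above is directional, while the `t_test()` call does not set `alternative` and so uses the two-sided default. The sketch below contrasts the two using base R's `t.test()`, swapped in here for rstatix to keep the example dependency-free. The data frame `toy_psa` is invented, so its p-values say nothing about the real data.

```r
# Toy data; values are invented for illustration only.
toy_psa <- data.frame(
  aa = factor(rep(c("White", "African-American"), times = c(6, 5)),
              levels = c("White", "African-American")),
  preop_psa = c(6.1, 7.4, 5.8, 8.0, 6.9, 7.2, 9.5, 8.8, 10.1, 7.9, 9.0)
)

# Welch two-sample t-test (var.equal = FALSE is the default).
# With the formula interface the difference is first level minus second,
# so "less" tests mean(White) < mean(African-American).
one_sided <- t.test(preop_psa ~ aa, data = toy_psa, alternative = "less")
two_sided <- t.test(preop_psa ~ aa, data = toy_psa)

c(one_sided = one_sided$p.value, two_sided = two_sided$p.value)
```

When the observed difference lies in the hypothesized direction, the one-sided p-value is half the two-sided one. The choice of alternative should be fixed before looking at the data.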