This dataset contains prostate-related clinical data. It includes many variables, such as age, family history, prostate volume, T stage, and more.
In summary, the mean age was 61; summaries of the remaining variables are not reported here.
prostate %>%
  mutate(aa = factor(aa, levels = c(0,1),
                     labels = c("White", "African-American"))) %>%
  mutate(fam_hx = factor(fam_hx, levels = c(0,1),
                         labels = c("No Family History", "FHx of Prostate Cancer"))) ->
  prostate_factors
prostate %>%
  select(age, p_vol, preop_psa, aa, fam_hx) %>%
  group_by(aa, fam_hx) %>%
  summarize(across(age:preop_psa, mean, na.rm=TRUE))
## `summarise()` has grouped output by 'aa'. You can override using the `.groups`
## argument.
## # A tibble: 4 × 5
## # Groups: aa [2]
## aa fam_hx age p_vol preop_psa
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 0 61.8 56.9 8.06
## 2 0 1 59.5 57.3 7.22
## 3 1 0 60.7 54.3 9.90
## 4 1 1 60.1 51.4 8.71
The grouped means suggest that White patients with a positive family history of prostate cancer have the highest mean prostate volume (57.3). This group is also, on average, the youngest (59.5 years) and has the lowest mean preoperative PSA (7.22). The means alone do not show whether these differences are statistically significant, so the next steps would be to check the sample size in each group and run a statistical test.
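Checking the sample size in each group could look like the following minimal sketch. The `toy` tibble and every value in it are invented so the example runs on its own; with the real data one would start from `prostate` instead. The column name `n_p_vol` is also hypothetical.

```r
library(dplyr)

# Toy stand-in for the real `prostate` data. These values are made up
# purely for illustration and are NOT taken from the dataset.
toy <- tibble(
  aa     = c(0, 0, 0, 1, 1, 0),
  fam_hx = c(0, 0, 1, 0, 1, 0),
  p_vol  = c(50.1, NA, 61.3, 48.0, 52.2, 57.9)
)

group_sizes <- toy %>%
  group_by(aa, fam_hx) %>%
  summarize(
    n       = n(),                 # number of rows in the group
    n_p_vol = sum(!is.na(p_vol)),  # rows with a recorded prostate volume
    .groups = "drop"               # return an ungrouped tibble without the message
  )

group_sizes
```

Counting non-missing values separately from `n()` matters here because the group means were computed with `na.rm = TRUE`, so the effective sample size for a variable can be smaller than the group size.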
The plot below shows preoperative PSA against prostate volume, faceted by race and family history:
ggplot(prostate_factors) +
  aes(x = p_vol, y = preop_psa, col = aa) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(aa ~ fam_hx) +
  labs(x = 'Prostate Volume', y = "Preoperative PSA",
       title = 'Relationship Between Prostate Volume and Preop PSA,\nSubdivided by Family History and Race') +
  theme(legend.position = "bottom")
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 11 rows containing non-finite values (stat_smooth).
## Warning: Removed 11 rows containing missing values (geom_point).
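Each panel shows preoperative PSA against prostate volume with a fitted linear trend, with race in the rows and family history in the columns. Eleven rows with missing or non-finite values were dropped from the plot. An interpretation of the trends still needs to be written.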
## Statistical Testing: T-test
prostate_factors %>%
  t_test(formula = preop_psa ~ aa, detailed = TRUE)
## # A tibble: 1 × 15
## estimate estima…¹ estim…² .y. group1 group2 n1 n2 stati…³ p df
## * <dbl> <dbl> <dbl> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
## 1 -1.89 7.86 9.75 preo… White Afric… 261 55 -1.96 0.0534 71.7
## # … with 4 more variables: conf.low <dbl>, conf.high <dbl>, method <chr>,
## # alternative <chr>, and abbreviated variable names ¹estimate1, ²estimate2,
## # ³statistic
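The t-test compares mean preoperative PSA between White patients (n = 261, mean 7.86) and African-American patients (n = 55, mean 9.75). The estimated difference is -1.89 (t = -1.96, df = 71.7, p = 0.0534), which falls just short of significance at the conventional 0.05 level. The non-integer degrees of freedom indicate a Welch (unequal-variance) test. The confidence interval and method columns are cut off in the printed output.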
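The printed tibble hides `conf.low`, `conf.high`, `method`, and `alternative`. One way to bring them into view is to select them explicitly, as in this minimal sketch. The `toy` data are invented so the example is self-contained; with the real data one would pipe `prostate_factors` into the same call.

```r
library(dplyr)
library(rstatix)

# Toy stand-in for `prostate_factors`. These PSA values are made up
# purely for illustration and are NOT taken from the dataset.
toy <- tibble(
  preop_psa = c(5.2, 6.8, 7.1, 9.4, 4.9, 8.3, 10.5, 12.1, 7.7, 9.9),
  aa = factor(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1),
              levels = c(0, 1),
              labels = c("White", "African-American"))
)

res <- toy %>%
  t_test(formula = preop_psa ~ aa, detailed = TRUE) %>%
  # keep only the columns needed to report the result
  select(estimate, conf.low, conf.high, p, method, alternative)

res
```

Reporting the confidence interval alongside the p-value shows the range of differences compatible with the data, which is more informative than the p-value alone when the result sits close to the 0.05 threshold.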