The first code chunk in each R Markdown document is named “setup”.

With include = FALSE, the code chunk is still run when this document is knitted, but neither the code nor its output, warnings, or messages appear in the knitted document.
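As a minimal sketch of what such a setup chunk might contain (this follows the default RStudio template and is not necessarily the chunk used in this document), the body below sets a default option for all later chunks. The chunk header is shown only as a comment.

```r
# Body of a setup chunk. In the .Rmd source its header would read
#   {r setup, include=FALSE}
# so the chunk runs when knitting but nothing from it is shown.
knitr::opts_chunk$set(echo = TRUE)  # default for later chunks: show their code
```

Individual chunks can still override this default, for example by setting echo = FALSE in their own header.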

Analysis of Prostate Data

Prostate dataset by Higgins for practice.

Headline about Prostate data

A practice assignment to see if I can italicise and bold text.

Data wrangling

Mutate the variables aa and fam_hx into categorical variables (factors), then assign the result to a new object.
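The conversion described above can be sketched as follows. This is an illustration on a small made-up data frame, not the code used for this report: the object names `prostate_toy` and `prostate_factors_toy`, the toy values, and the factor labels other than "White" are assumptions.

```r
library(dplyr)

# Toy stand-in for the prostate data (hypothetical values, not the real dataset)
prostate_toy <- tibble(
  aa        = c(0, 0, 1, 1),
  fam_hx    = c(0, 1, 0, 1),
  preop_psa = c(6.1, 7.4, 9.8, 8.2)
)

# Convert the 0/1 codes to labelled factors and store the result in a new object.
# The labels "African American", "No", and "Yes" are assumed for illustration.
prostate_factors_toy <- prostate_toy |>
  mutate(
    aa     = factor(aa, levels = c(0, 1), labels = c("White", "African American")),
    fam_hx = factor(fam_hx, levels = c(0, 1), labels = c("No", "Yes"))
  )

# Inspect the factor levels to confirm the conversion
levels(prostate_factors_toy$aa)
levels(prostate_factors_toy$fam_hx)
```

Giving `levels` and `labels` explicitly keeps the mapping from codes to categories visible and avoids depending on the order in which values happen to appear in the data.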

prostate |> 
  select(age, p_vol, preop_psa, aa, fam_hx) |> 
  group_by(aa, fam_hx) |> 
  summarize(across(age:preop_psa, \(x) mean(x, na.rm = TRUE)))
## `summarise()` has grouped output by 'aa'. You can override using the `.groups`
## argument.
## # A tibble: 4 × 5
## # Groups:   aa [2]
##      aa fam_hx   age p_vol preop_psa
##   <dbl>  <dbl> <dbl> <dbl>     <dbl>
## 1     0      0  61.8  56.9      8.06
## 2     0      1  59.5  57.3      7.22
## 3     1      0  60.7  54.3      9.90
## 4     1      1  60.1  51.4      8.71

White and no family history: mean age = 61.8 years ……..

Data Visualization

You can also embed plots, for example:

## `geom_smooth()` using formula = 'y ~ x'
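The code for this plot is hidden in the knitted output, so the chunk below is only a guess at its general shape: a scatterplot with a fitted line, drawn from made-up values. The choice of `p_vol` and `preop_psa` as axes, the `plot_toy` data frame, and `method = "lm"` are all assumptions. ggplot2 typically prints a message like the one above when a smoothing method is given without a formula.

```r
library(ggplot2)

# Toy data (hypothetical values); column names mirror the prostate variables
plot_toy <- data.frame(
  p_vol     = c(30, 45, 52, 60, 75, 90),
  preop_psa = c(4.2, 6.0, 7.5, 6.8, 9.1, 11.3)
)

# Scatterplot with a linear fit; formula is left unset, so ggplot2 uses y ~ x
p <- ggplot(plot_toy, aes(x = p_vol, y = preop_psa)) +
  geom_point() +
  geom_smooth(method = "lm")

p
```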

Statistical testing (t-testing)

prostate_factors |> 
  t_test(formula = preop_psa ~ aa, detailed = TRUE)
## # A tibble: 1 × 15
##   estimate estimate1 estimate2 .y.    group1 group2    n1    n2 statistic      p
## *    <dbl>     <dbl>     <dbl> <chr>  <chr>  <chr>  <int> <int>     <dbl>  <dbl>
## 1    -1.89      7.86      9.75 preop… White  Afric…   259    54     -1.96 0.0534
## # ℹ 5 more variables: df <dbl>, conf.low <dbl>, conf.high <dbl>, method <chr>,
## #   alternative <chr>

Note that the echo = FALSE option was added to the plot's code chunk to prevent printing of the R code that generated the plot: the results appear in the knitted document, but the code itself does not.