Note about packages: there are many that can be used, but since each package has its own way of calculating things, we’ll stick to tidyverse for now.
library(tidyverse)
Difference between this reading and the practice with Danielle: this dataset is in a folder (data/), so the folder has to be included in the file path. Keep this in mind when coding in the future.
birthw_data <- read.csv(file = "data/birthweight_data.csv")
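A minimal sketch of how the folder fits into the file path, assuming the working directory is the project folder that contains data/. The name csv_path is made up for illustration.

```r
# Build the path from the folder name and the file name.
csv_path <- file.path("data", "birthweight_data.csv")
print(csv_path)

# Check that the file is where we expect before trying to read it.
if (file.exists(csv_path)) {
  birthw_data <- read.csv(file = csv_path)
} else {
  message("File not found: ", csv_path, " (working directory: ", getwd(), ")")
}
```

If the file is not found, the message shows the working directory, which is usually the first thing to check.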
Remember to use the print() function so the data can be seen in the final knitted output.
birthw_summary <- birthw_data %>%
  group_by(plurality) %>%
  summarise(mean_birthw = mean(birthweight)) %>%
  ungroup()
print(birthw_summary)
## # A tibble: 2 × 2
##   plurality mean_birthw
##   <chr>           <dbl>
## 1 singleton       3248.
## 2 twin            2311.
Table 1: mean birth weight of singleton and twin newborns
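The summary above names its output column mean_birthw. As a rough sketch, summarise() can also compute several named summaries in one call. The data frame toy and the column names n_babies and sd_birthw below are made up for illustration and are not part of the real dataset.

```r
library(dplyr)

# Hypothetical toy data, made up for illustration (not the real dataset).
toy <- data.frame(
  plurality   = c("singleton", "singleton", "twin", "twin"),
  birthweight = c(3200, 3400, 2200, 2400)
)

toy_summary <- toy %>%
  group_by(plurality) %>%
  summarise(
    n_babies    = n(),               # number of rows in each group
    mean_birthw = mean(birthweight), # name on the left, calculation on the right
    sd_birthw   = sd(birthweight)
  ) %>%
  ungroup()

print(toy_summary)
```

Each `name = calculation` pair becomes one column in the output, so the names can be whatever is easiest to read later.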
Note about the summarise() function: we can customise the output column names for the summaries (min_gestation_age in this case).
earlyGage <- birthw_data %>%
  group_by(child_ethn) %>%
  summarise(min_gestation_age = min(gestation_age_w)) %>%
  ungroup()
print(earlyGage)
## # A tibble: 10 × 2
##    child_ethn                        min_gestation_age
##    <chr>                             <chr>
##  1 Aboriginal/Torres Strait Islander 33
##  2 African/African-American          26
##  3 Caucasian                         26
##  4 East Asian                        33
##  5 Hispanic/Latino                   37
##  6 Middle-Eastern                    28
##  7 Missing                           36
##  8 Polynesian/Melanesian             28
##  9 South Asian                       28
## 10 South-East Asian                  29
Table 2: minimum gestational age by child ethnicity
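One thing to watch in the output above: min_gestation_age is listed as <chr>, which means gestation_age_w was read as text, not numbers. min() on text compares values character by character, so it can give a different answer from the numeric minimum. The sketch below uses made-up values, and the "Missing" entry is only an assumption about why the column might have been read as text.

```r
# Hypothetical values, made up for illustration.
# "Missing" stands in for any non-numeric entry in the column.
ages_chr <- c("9", "26", "33", "Missing")

# Text comparison: "2" sorts before "9", so "26" comes out as the minimum.
min(ages_chr)
# → "26"

# Convert to numbers first; non-numeric entries become NA (warning silenced).
ages_num <- suppressWarnings(as.numeric(ages_chr))

# Numeric comparison, ignoring the NA values.
min(ages_num, na.rm = TRUE)
# → 9
```

When every value has the same number of digits, the two approaches agree, but converting to numeric first is the safer habit.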
Piping (%>%) is a useful operator that passes the result of one step straight into the next, so the code reads in sequential order from top to bottom. Logically, it’s the easiest to follow.
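To see what the pipe is doing, here is a small sketch comparing the same calculation written with nested function calls and with %>%. The data frame toy and its columns are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data, made up for illustration.
toy <- data.frame(
  group = c("a", "a", "b"),
  value = c(1, 3, 10)
)

# Without the pipe: the steps are nested, so they read from the inside out.
nested <- summarise(group_by(toy, group), mean_value = mean(value))

# With the pipe: the same steps, read from top to bottom.
piped <- toy %>%
  group_by(group) %>%
  summarise(mean_value = mean(value))

# Both versions should give the same table.
print(all.equal(as.data.frame(nested), as.data.frame(piped)))
```

The pipe does not change the result. It only changes the order in which the steps are written.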
This blog has a good explanation of using the group_by() and summarise() functions in R.
Use the write_csv() function to save the summary table as a CSV file.
write_csv(birthw_summary, file = "birthweight_summary.csv")
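A quick way to check that the file was written properly is to read it back in. This is only a sketch: the table toy_summary and the path out_path are made up, and it writes to a temporary folder so it does not touch the project files.

```r
library(readr)

# Hypothetical summary table, made up for illustration.
toy_summary <- data.frame(
  plurality   = c("singleton", "twin"),
  mean_birthw = c(3300, 2300)
)

# Write to a temporary folder so nothing in the project is overwritten.
out_path <- file.path(tempdir(), "toy_summary.csv")
write_csv(toy_summary, file = out_path)

# Read the file back to confirm the contents look right.
check <- read_csv(out_path, show_col_types = FALSE)
print(check)
```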