Learning to summarise
- The number of observations in the data (hint: use n()).
- The number of observations that have missing counts (hint: use is.na()).
- The proportion of observations that have missing counts.
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.3 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
tb_long <- read_rds("https://github.com/datascienceprogram/ids_course_data/raw/master/tb_long.rds")
tb_long %>%
  summarise(n = n(),                      # total number of rows
            missing = sum(is.na(count)),  # rows where count is NA
            prop = missing / n)           # share of rows with a missing count
## # A tibble: 1 x 3
##        n missing  prop
##    <int>   <int> <dbl>
## 1 157820  106846 0.677
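To see why the three summaries fit together, here is a minimal sketch on a small made-up table. The `toy` tibble and its values are hypothetical and only stand in for `tb_long`, so the numbers can be checked by hand. It also shows that the proportion can be computed directly as `mean(is.na(count))`, because `is.na()` returns TRUE/FALSE values that are treated as 1/0 when averaged.

```r
library(dplyr)

# Hypothetical toy data standing in for tb_long: 5 rows, 2 missing counts.
toy <- tibble(
  country = c("A", "A", "B", "B", "B"),
  count   = c(10L, NA, 3L, NA, 7L)
)

toy_summary <- toy %>%
  summarise(
    n           = n(),                  # number of rows
    missing     = sum(is.na(count)),    # TRUE counts as 1, so this counts NAs
    prop        = missing / n,          # reuses the two columns created above
    prop_direct = mean(is.na(count))    # same proportion in a single step
  )

toy_summary
```

With 2 missing values out of 5 rows, both `prop` and `prop_direct` should come out as 0.4. Applied to `tb_long`, the same pattern gives 106846 / 157820, which rounds to the 0.677 shown above.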