The following is a barebones example of what a completed assignment 1 that meets all of the non-Piazza-related criteria looks like.
The code below first loads the tidyverse and knitr libraries and reads in the fluoride and arsenic data. Notice that I don’t have any warnings or messages in my html file. That is because I set the code chunk options to warning=FALSE and message=FALSE.
library(tidyverse)
library(knitr)
fluoride <- read.csv(url("http://jamessuleiman.com/teaching/datasets/fluoride.csv"),
                     stringsAsFactors = FALSE)
arsenic <- read.csv(url("http://jamessuleiman.com/teaching/datasets/arsenic.csv"),
                    stringsAsFactors = FALSE)
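The warning and message options can be set per chunk, in the chunk header, or once for the whole document. As a minimal sketch, and not necessarily how this report was configured, a setup chunk near the top of the .Rmd file could set the defaults for every chunk with `knitr::opts_chunk$set()`:

```r
library(knitr)

# Assumption: this runs in a setup chunk before any other chunk.
# It makes warning = FALSE and message = FALSE the default for all later chunks,
# so package start-up messages and warnings are left out of the knitted html.
opts_chunk$set(warning = FALSE, message = FALSE)
```

The per-chunk equivalent is to put the same two options in an individual chunk header, for example `{r, warning=FALSE, message=FALSE}`.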
The assignment states: “Prepare a report that has an interesting narrative that focuses on a subset of the data you find interesting that includes both arsenic and fluoride data.” It also explicitly requests: “you must create a data frame or tibble that joins both arsenic and fluoride by location.” So first, I’ll join the tables and display a table of the fluoride and arsenic levels for the five locations with the highest combined arsenic + fluoride level. The combined level is the average of the two percentages, weighted by the number of wells tested for each contaminant. I use kable to make the table prettier.
arsenic_fluoride <- arsenic %>% inner_join(fluoride, by = "location")
arsenic_fluoride <- arsenic_fluoride %>%
  mutate(combined_avg_level = (n_wells_tested.x * percent_wells_above_guideline.x +
                               n_wells_tested.y * percent_wells_above_guideline.y) /
                              (n_wells_tested.x + n_wells_tested.y))
combined_wells_top_5 <- arsenic_fluoride %>%
  select(location,
         n_wells_tested_arsenic = n_wells_tested.x,
         n_wells_tested_flouride = n_wells_tested.y,
         percent_high_arsenic = percent_wells_above_guideline.x,
         percent_high_fluoride = percent_wells_above_guideline.y,
         combined_avg_level) %>%
  arrange(desc(combined_avg_level)) %>%
  slice(1:5)
kable(combined_wells_top_5)
| location | n_wells_tested_arsenic | n_wells_tested_flouride | percent_high_arsenic | percent_high_fluoride | combined_avg_level |
|---|---|---|---|---|---|
| Otis | 53 | 60 | 39.6 | 30.0 | 34.50265 |
| Manchester | 275 | 276 | 58.9 | 3.3 | 31.04955 |
| Surry | 181 | 175 | 40.3 | 18.3 | 29.48539 |
| Blue Hill | 241 | 209 | 42.7 | 9.6 | 27.32689 |
| Mercer | 33 | 32 | 36.4 | 15.6 | 26.16000 |
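To see where the `.x` and `.y` column names come from and how the weighted average works, here is a small self-contained sketch. It uses only the Otis and Mercer rows from the table above, split back into two data frames; the names `arsenic_demo`, `fluoride_demo`, and `joined_demo` are made up for this illustration.

```r
library(dplyr)

# Two rows copied from the table above, one data frame per contaminant.
arsenic_demo <- data.frame(
  location = c("Otis", "Mercer"),
  n_wells_tested = c(53, 33),
  percent_wells_above_guideline = c(39.6, 36.4),
  stringsAsFactors = FALSE
)
fluoride_demo <- data.frame(
  location = c("Otis", "Mercer"),
  n_wells_tested = c(60, 32),
  percent_wells_above_guideline = c(30.0, 15.6),
  stringsAsFactors = FALSE
)

# Both inputs share column names, so inner_join() suffixes them:
# .x for the left (arsenic) table and .y for the right (fluoride) table.
joined_demo <- arsenic_demo %>%
  inner_join(fluoride_demo, by = "location") %>%
  mutate(combined_avg_level =
           (n_wells_tested.x * percent_wells_above_guideline.x +
            n_wells_tested.y * percent_wells_above_guideline.y) /
           (n_wells_tested.x + n_wells_tested.y))

names(joined_demo)
round(joined_demo$combined_avg_level, 5)  # 34.50265 26.16000
```

For Otis this is (53 × 39.6 + 60 × 30.0) / (53 + 60) = 3898.8 / 113 ≈ 34.50265, which matches the table.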
The assignment also requires a chart whose generating code is hidden from the report. I set echo = FALSE in the code chunk to accomplish this. To keep things simple, I’ll just do a column chart of the top five table above, ordered by combined average.
Notice that you don’t see the code that generated the chart above.
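Because that chunk is hidden, its code does not appear in this report. As a rough sketch of what such a chunk could contain, assuming ggplot2 (loaded as part of the tidyverse), the example below rebuilds the top five values from the table and draws a column chart ordered by combined average. The name `top_5` and the axis labels are illustrative and not taken from the original chunk.

```r
library(ggplot2)

# Values copied from the top five table; top_5 is a hypothetical stand-in
# for combined_wells_top_5 so that this sketch runs on its own.
top_5 <- data.frame(
  location = c("Otis", "Manchester", "Surry", "Blue Hill", "Mercer"),
  combined_avg_level = c(34.50265, 31.04955, 29.48539, 27.32689, 26.16000)
)

# reorder() with the negated value sorts the columns from highest to lowest.
p <- ggplot(top_5, aes(x = reorder(location, -combined_avg_level),
                       y = combined_avg_level)) +
  geom_col() +
  labs(x = "Location",
       y = "Combined average % of wells above guideline")
p
```

In the .Rmd file, a chunk like this would carry `echo=FALSE` in its header so that only the chart, and not the code, shows up in the knitted html.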