Data Import

First, we read in the two data sets and drop any rows with missing values.

library(tidyverse)
fluoride <- read_csv("http://jamessuleiman.com/teaching/datasets/fluoride.csv")
fluoride <- fluoride %>% drop_na()
arsenic <- read_csv("http://jamessuleiman.com/teaching/datasets/arsenic.csv")
arsenic <- arsenic %>% drop_na()
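
Before dropping rows, it can help to check how many are lost. Below is a minimal sketch on a made-up tibble; `toy` and its values are hypothetical and do not come from the fluoride or arsenic files.

```r
library(tibble)
library(tidyr)

# Hypothetical data: the second row has a missing median
toy <- tibble(
  location = c("A", "B", "C"),
  median   = c(1.2, NA, 0.4)
)

# drop_na() removes every row that has an NA in any column
toy_complete <- drop_na(toy)

# Number of rows removed by drop_na()
rows_dropped <- nrow(toy) - nrow(toy_complete)
rows_dropped
```

The same before-and-after comparison could be run on fluoride and arsenic to see how many towns are removed.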

Next we display the first few rows of fluoride.

head(fluoride)
## # A tibble: 6 x 6
##   location  n_wells_tested percent_wells_above_gui… median percentile_95 maximum
##   <chr>              <dbl>                    <dbl>  <dbl>         <dbl>   <dbl>
## 1 Otis                  60                     30    1.13           3.2      3.6
## 2 Dedham               102                     22.5  0.94           3.27     7  
## 3 Denmark               46                     19.6  0.45           3.15     3.9
## 4 Surry                175                     18.3  0.8            3.52     6.9
## 5 Prospect              57                     17.5  0.785          2.5      2.7
## 6 Eastbrook             31                     16.1  1.29           2.44     3.3

Then we display the first few rows of arsenic.

head(arsenic)
## # A tibble: 6 x 6
##   location    n_wells_tested percent_wells_above_g… median percentile_95 maximum
##   <chr>                <dbl>                  <dbl>  <dbl>         <dbl>   <dbl>
## 1 Manchester             275                   58.9   14            93       200
## 2 Gorham                 467                   50.1   10.5         130       460
## 3 Columbia                42                   50      9.8          65.9     200
## 4 Monmouth               277                   49.5   10           110       368
## 5 Eliot                   73                   49.3    9.7          41.4      45
## 6 Columbia F…             25                   48      8.1          53.8      71

Join data

In the code chunk below, we create a new data frame called chemicals that joins fluoride and arsenic by location.

chemicals <- inner_join(arsenic, fluoride, by = "location", suffix = c(" (arsenic)", " (fluoride)"))

Let's take a look at the join.

head(chemicals)
## # A tibble: 6 x 11
##   location `n_wells_tested… `percent_wells_… `median (arseni… `percentile_95 …
##   <chr>               <dbl>            <dbl>            <dbl>            <dbl>
## 1 Manches…              275             58.9             14               93  
## 2 Gorham                467             50.1             10.5            130  
## 3 Columbia               42             50                9.8             65.9
## 4 Monmouth              277             49.5             10              110  
## 5 Eliot                  73             49.3              9.7             41.4
## 6 Columbi…               25             48                8.1             53.8
## # … with 6 more variables: `maximum (arsenic)` <dbl>, `n_wells_tested
## #   (fluoride)` <dbl>, `percent_wells_above_guideline (fluoride)` <dbl>,
## #   `median (fluoride)` <dbl>, `percentile_95 (fluoride)` <dbl>, `maximum
## #   (fluoride)` <dbl>
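
An inner join keeps only locations that appear in both tables, so towns tested for just one chemical are dropped without warning. The sketch below uses two small made-up tables (`arsenic_toy` and `fluoride_toy` are hypothetical) to show how the suffix argument renames shared columns and how anti_join() lists the rows the join leaves out.

```r
library(dplyr)
library(tibble)

# Hypothetical miniature versions of the two tables
arsenic_toy  <- tibble(location = c("A", "B", "C"), median = c(14, 10.5, 9.8))
fluoride_toy <- tibble(location = c("A", "B", "D"), median = c(0.3, 0.9, 1.1))

# Columns with the same name in both tables get the suffixes appended
joined <- inner_join(arsenic_toy, fluoride_toy, by = "location",
                     suffix = c(" (arsenic)", " (fluoride)"))
names(joined)

# Locations with arsenic data but no matching fluoride row
only_arsenic <- anti_join(arsenic_toy, fluoride_toy, by = "location")
only_arsenic
```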

Subset

In the code chunk below, we create subsets of the data with low and high levels of arsenic and fluoride, based on each town's median. The cutoffs are 10 for arsenic and 2 for fluoride.

high_arsenic <- chemicals %>% filter(`median (arsenic)` >= 10)

low_arsenic <- chemicals %>% filter(`median (arsenic)` < 10)

high_fluoride <- chemicals %>% filter(`median (fluoride)` >= 2)

low_fluoride <- chemicals %>% filter(`median (fluoride)` < 2)
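
As an alternative to keeping separate data frames, the high and low groups can be recorded in a single label column. This is only a sketch on made-up data: `toy`, `arsenic_cutoff`, and `arsenic_level` are hypothetical names, and the cutoff of 10 is taken from the filter calls above.

```r
library(dplyr)
library(tibble)

# Cutoff copied from the filter() calls above; the variable name is ours
arsenic_cutoff <- 10

# Hypothetical towns with made-up medians
toy <- tibble(
  location = c("A", "B", "C"),
  `median (arsenic)` = c(14, 10, 9.8)
)

# Label each town instead of splitting it into its own data frame
labelled <- toy %>%
  mutate(arsenic_level = if_else(`median (arsenic)` >= arsenic_cutoff, "high", "low"))

# Count the towns in each group
level_counts <- labelled %>% count(arsenic_level)
level_counts
```

A town exactly at the cutoff is labelled "high", matching the `>=` used in the filters.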

For our final analysis, let's focus on towns with high arsenic levels and add the county for each town.

high_arsenic_counties <- high_arsenic %>% mutate(county = c("Kennebec","Cumberland","Kennebec"))
head(high_arsenic_counties[,c(1,4,12)])
## # A tibble: 3 x 3
##   location   `median (arsenic)` county    
##   <chr>                   <dbl> <chr>     
## 1 Manchester               14   Kennebec  
## 2 Gorham                   10.5 Cumberland
## 3 Monmouth                 10   Kennebec
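
Assigning counties with a positional vector only works while high_arsenic has exactly three rows in this order. A sketch of a sturdier option is to join against a small lookup table keyed by town. The names `high_toy` and `county_lookup` are hypothetical, and the values are copied from the output above.

```r
library(dplyr)
library(tibble)

# Stand-in for high_arsenic, using the three rows printed above
high_toy <- tibble(
  location = c("Manchester", "Gorham", "Monmouth"),
  `median (arsenic)` = c(14, 10.5, 10)
)

# Hypothetical lookup table: one row per town, matched by name rather than position
county_lookup <- tribble(
  ~location,    ~county,
  "Manchester", "Kennebec",
  "Gorham",     "Cumberland",
  "Monmouth",   "Kennebec"
)

# left_join() keeps every town and fills county with NA when no match is found
with_county <- left_join(high_toy, county_lookup, by = "location")
with_county
```

If the arsenic cutoff changes and more towns qualify, unmatched towns show up with a missing county instead of causing a length error in mutate().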

Visualize your subset

Let's now look at the counties with at least one town or city where the median arsenic level met or exceeded the limit set by Maine's Maximum Exposure Guideline.

As shown in the map above, 2 of Maine's 16 counties had at least one town or city where the median arsenic level met or exceeded the limit set by Maine's Maximum Exposure Guideline. This is especially problematic because high levels of arsenic in water can harm public health. I found elevated arsenic in Manchester and Monmouth (Kennebec County) and in Gorham (Cumberland County).

Please note that I did not consider towns with arsenic levels slightly below 10 micrograms per liter (ug/L), so this analysis may miss towns and counties with potentially high arsenic levels. For example, if I set the cutoff for high arsenic at 9 ug/L, two more towns are added to the list. In summary, arsenic levels are low in the majority of towns and counties, but even one town with high arsenic levels is one too many.
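
The effect of the cutoff can be checked with a short sketch. It uses only the five fully named towns printed by head(chemicals), so it illustrates the comparison and does not rerun the full analysis. `sample_towns` and `count_at()` are hypothetical names.

```r
library(tibble)

# Median arsenic (ug/L) for the five fully named towns shown in head(chemicals)
sample_towns <- tibble(
  location       = c("Manchester", "Gorham", "Columbia", "Monmouth", "Eliot"),
  median_arsenic = c(14, 10.5, 9.8, 10, 9.7)
)

# Hypothetical helper: number of towns at or above a given cutoff
count_at <- function(cutoff) {
  sum(sample_towns$median_arsenic >= cutoff)
}

count_at(10)  # 3 towns: Manchester, Gorham, Monmouth
count_at(9)   # 5 towns: Columbia and Eliot are added
```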