Arsenic and Fluoride Levels in Maine

For this assignment, I set out to determine whether any towns rank in the top thirty for percentage of wells above guidelines for both arsenic and fluoride, and whether those towns have anything in common based on what I know about them.

I started by importing the data from the csv files for arsenic and fluoride and selecting the top 30 results in each, ranked by percentage of wells above the Maine guidelines. I kept only the “location” and “percent_wells_above_guideline” columns, as I wasn’t concerned with the other variables. Note that top_n() keeps ties, so the arsenic list actually has 31 rows: Carrabassett Valley and Minot are tied at 32.5. I then joined the two tables to find which towns appear in the top 30 for both arsenic and fluoride.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(knitr)
arsenic <- read.csv("arsenic.csv", header = TRUE, stringsAsFactors = FALSE)
flouride <- read.csv("flouride.csv", header = TRUE, stringsAsFactors = FALSE)

arsenicf <- arsenic %>%
  select(location, percent_wells_above_guideline) %>%
  arrange(desc(percent_wells_above_guideline)) %>%
  top_n(30)
## Selecting by percent_wells_above_guideline
arsenicf
##               location percent_wells_above_guideline
## 1           Manchester                          58.9
## 2               Gorham                          50.1
## 3             Columbia                          50.0
## 4             Monmouth                          49.5
## 5                Eliot                          49.3
## 6       Columbia Falls                          48.0
## 7             Winthrop                          44.8
## 8            Hallowell                          44.6
## 9               Buxton                          43.4
## 10           Blue Hill                          42.7
## 11          Litchfield                          42.0
## 12              Hollis                          41.4
## 13              Orland                          40.7
## 14               Surry                          40.3
## 15            Danforth                          40.0
## 16          Mariaville                          40.0
## 17           Readfield                          39.8
## 18                Otis                          39.6
## 19              Dayton                          37.7
## 20            Sedgwick                          37.3
## 21              Mercer                          36.4
## 22         Scarborough                          35.2
## 23                Saco                          34.4
## 24              Camden                          34.0
## 25             Trenton                          33.7
## 26               Anson                          33.3
## 27               Wales                          33.3
## 28            Rangeley                          33.1
## 29             Oakland                          33.0
## 30 Carrabassett Valley                          32.5
## 31               Minot                          32.5
flouridef <- flouride %>%
  select(location, percent_wells_above_guideline) %>%
  arrange(desc(percent_wells_above_guideline)) %>%
  top_n(30)
## Selecting by percent_wells_above_guideline
flouridef
##            location percent_wells_above_guideline
## 1              Otis                          30.0
## 2            Dedham                          22.5
## 3           Denmark                          19.6
## 4             Surry                          18.3
## 5          Prospect                          17.5
## 6         Eastbrook                          16.1
## 7            Mercer                          15.6
## 8          Fryeburg                          15.4
## 9        Brownfield                          15.2
## 10 Stockton Springs                          14.3
## 11          Clifton                          14.0
## 12           Starks                          13.6
## 13       Marshfield                          12.9
## 14        Kennebunk                          12.7
## 15        Charlotte                          12.5
## 16             York                          12.4
## 17     Chesterville                          12.3
## 18         Stoneham                          12.0
## 19         Sedgwick                          11.2
## 20   Mechanic Falls                          11.1
## 21     Swans Island                          10.5
## 22         Franklin                          10.3
## 23       Smithfield                          10.1
## 24        Biddeford                           9.7
## 25        Otisfield                           9.7
## 26        Blue Hill                           9.6
## 27          Arundel                           9.5
## 28        Ellsworth                           9.3
## 29            Hiram                           8.9
## 30     Norridgewock                           8.9
af <- inner_join(arsenicf, flouridef, by = "location")
colnames(af) <- c("town", "% wells above-arsenic", "% wells above-fluoride")
kable(af)
town        % wells above-arsenic   % wells above-fluoride
Blue Hill                    42.7                      9.6
Surry                        40.3                     18.3
Otis                         39.6                     30.0
Sedgwick                     37.3                     11.2
Mercer                       36.4                     15.6
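
The arsenic listing above contains 31 rows rather than 30, because top_n() keeps every row tied with the last qualifying value. The sketch below shows that behavior next to slice_max(), which can return exactly n rows. It is only an illustration: the data frame `toy` and its values are made up, and slice_max() assumes dplyr 1.0.0 or later, which may be newer than the version used for this report.

```r
library(dplyr)

# Toy data, invented for illustration (not the Maine figures).
# Towns "C" and "D" tie at the cutoff value.
toy <- data.frame(
  location = c("A", "B", "C", "D", "E"),
  percent_wells_above_guideline = c(50, 40, 30, 30, 10),
  stringsAsFactors = FALSE
)

# top_n() keeps all rows tied with the last qualifying value,
# so asking for 3 rows returns 4 here.
with_ties <- toy %>%
  top_n(3, percent_wells_above_guideline)

# slice_max() sorts in descending order; with_ties = FALSE
# truncates the result to exactly n rows.
exact <- toy %>%
  slice_max(percent_wells_above_guideline, n = 3, with_ties = FALSE)

nrow(with_ties)  # 4
nrow(exact)      # 3
```

Dropping ties is a judgment call. When two towns share the cutoff value, excluding one of them is arbitrary, so keeping the extra row, as the analysis above does, is a defensible choice as long as the row count is reported.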

After finding the five towns returned by the join, I located each one on a map. It turns out that both Surry and Sedgwick border Blue Hill, and Otis is very close to those three. All four are in the Downeast area, very close to Mount Desert Island. The only town that is not close to the others is Mercer. Correlation does not imply causation, but it was still interesting to note that four of the five towns are in the same geographic area.