The existing grading scheme for the Mystic has a few virtues:
It has flaws:
The simplest proposal for a new grading rubric would be one that applies the same calculations used for the old grade, but broken out by water body. To get enough representative data from the Mystic River we need to look to MWRA data from the freshwater section of the Mystic.
The following calculations try to keep as much of the old system as possible, while addressing its weaknesses.
library(RODBC)
library(reshape2)
library(lubridate)
library(dplyr)
library(pander)
library(ggplot2)
library(dplyr)
library(knitr)
library(tidyr)
setwd("\\\\psf/Home/Dropbox/MysticDB")
source("./Rcode/Sandbox/Jeff/wq-precip-scripts/load_wq.R")
wq <- load_wq()
## Loading wq database: \\psf\Home\Dropbox\MysticDB\MysticDB_20150624.accdb
## Fetching tables...done
## Merging tables...done
## Excluding field blanks, duplicates...done
## Excluding flagged results...done
source('./Rcode/Sandbox/Jeff/wq-precip-scripts/load_precip.R')
precip <- load_precip()
## Loading precip file: \\psf\Home\Dropbox\MysticDB\Processed\Precip\LoganPrecip.xlsx
wq2 <- append_weather(wq,precip)
## Computing antecedent precip...done
## Computing DateHour column in wq dataframe...done
## Merging wq and precip...done
# Filter for Baseline and MWRA data only, surface samples only
wq3 <- tbl_df(wq2) %>%
filter(ProjectID %in% c("BASE", "BHWQM", "CSORWM")) %>%
filter(SampleDepthType %in% c("S", NA))
The old grading system assigned a grade by computing percent samples that met boating and swimming standards for the bacteria sampled. It combined E. coli and Enterococcus data by converting concentration data to the question of whether it exceeded the standard or not.
Appealing to the standards, we can associate each result with PassBoat and PassSwim variables (1 if pass, 0 if fail).
The mean of values in these columns for grouped rows (by WaterBody, Year, etc) will give proportion of samples in that group that pass the relevant standard, which is the old yardstick.
# Narrow results to ECOLI and ENT.
# Create new columns for PassBoat and PassSwim for each result, appealing to appropriate standard.
# Values are 1 if Pass and zero if fail. Average for grouped rows will now give
# proportion/percentage of samples that pass the relevant standard for that group
wqgrade <- wq3 %>%
filter(CharacteristicID == "ECOLI" | CharacteristicID == "ENT")%>%
mutate(PassBoat = ifelse(
((CharacteristicID == "ECOLI" & ResultValue < 1260) |
(CharacteristicID =="ENT" & ResultValue < 350))
, 1, 0)) %>%
mutate(PassSwim = ifelse(
((CharacteristicID == "ECOLI" & ResultValue < 235) |
(CharacteristicID =="ENT" & ResultValue < 104))
, 1, 0))
# This is the new dataframe with new columns PassBoat and PassSwim, =1 if pass =0 if not
# View(wqgrade)
In order to weight wet-weather days appropriately, the grade has calculated wet and dry proportions separately and then weights the dry value 75% and the wet value 25%. The last two columns of the table below represent the old scheme applied to each waterbody for 2014. Water bodies arranged in decreasing order of “swimmability.”
# now calculate the percentage of samples passing boat/swim by Weather, WaterBody, and year
# percentage of samples in 2014 meeting Swimming and boating standards in wet weather
wq_wet <- mutate(wqgrade, year = year(Datetime)) %>%
filter(year == 2014) %>%
filter(Weather == "Wet") %>%
group_by(WaterBodyID, year) %>%
summarize(PercPassBoatWet = mean(PassBoat)*100, PercPassSwimWet = mean(PassSwim)*100, n_wet = n())
# percentage of samples in 2014 meeting Swimming and boating standards in dry weather
wq_dry <- mutate(wqgrade, year = year(Datetime)) %>%
filter(year == 2014) %>%
filter(Weather == "Dry") %>%
group_by(WaterBodyID, year) %>%
summarize(PercPassBoatDry = mean(PassBoat)*100, PercPassSwimDry = mean(PassSwim)*100, n_dry = n())
# Join the tables and created a weighted sum column for swimming and boating
wq4 <- inner_join(wq_wet, wq_dry)%>%
mutate(PassBoatWeighted = (PercPassBoatDry*.75 +PercPassBoatWet*.25)) %>%
mutate(PassSwimWeighted = (PercPassSwimDry*.75 +PercPassSwimWet*.25))
wq4 <- tbl_df(wq4) %>%
arrange(desc(PassSwimWeighted))
| WaterBodyID | year | PercPassBoatWet | PercPassSwimWet | n_wet | PercPassBoatDry | PercPassSwimDry | n_dry | PassBoatWeighted | PassSwimWeighted |
|---|---|---|---|---|---|---|---|---|---|
| Upper Mystic Lake | 2014 | 100.00000 | 80.000000 | 5 | 100.00000 | 100.00000 | 7 | 100.00000 | 95.000000 |
| Mystic River (Salt) | 2014 | 75.55556 | 60.000000 | 45 | 98.46154 | 92.30769 | 65 | 92.73504 | 84.230769 |
| Mystic River (Fresh) | 2014 | 75.63025 | 54.621849 | 119 | 99.29078 | 86.52482 | 282 | 93.37565 | 78.549079 |
| Chelsea River | 2014 | 66.66667 | 33.333333 | 3 | 100.00000 | 88.88889 | 9 | 91.66667 | 75.000000 |
| Malden River | 2014 | 45.00000 | 25.000000 | 20 | 100.00000 | 71.87500 | 32 | 86.25000 | 60.156250 |
| Belle Isle Inlet | 2014 | 100.00000 | 66.666667 | 3 | 100.00000 | 55.55556 | 9 | 100.00000 | 58.333333 |
| Aberjona River | 2014 | 60.00000 | 6.666667 | 15 | 100.00000 | 47.36842 | 19 | 90.00000 | 37.192982 |
| Meetinghouse Brook | 2014 | 40.00000 | 40.000000 | 5 | 57.14286 | 28.57143 | 7 | 52.85714 | 31.428571 |
| Little River | 2014 | 43.75000 | 6.250000 | 32 | 73.07692 | 38.46154 | 26 | 65.74519 | 30.408654 |
| Alewife Brook | 2014 | 41.12150 | 5.607477 | 107 | 75.86207 | 34.48276 | 87 | 67.17693 | 27.263938 |
| Mill Creek | 2014 | 66.66667 | 0.000000 | 3 | 77.77778 | 22.22222 | 9 | 75.00000 | 16.666667 |
| Winns Brook | 2014 | 20.00000 | 0.000000 | 5 | 50.00000 | 16.66667 | 6 | 42.50000 | 12.500000 |
| Mill Brook | 2014 | 60.00000 | 0.000000 | 5 | 71.42857 | 14.28571 | 7 | 68.57143 | 10.714286 |
| Island End River | 2014 | 0.00000 | 0.000000 | 3 | 33.33333 | 11.11111 | 9 | 25.00000 | 8.333333 |
If you look at the last three years of data, instead of just one, rank order is largely the same, certainly in the cleanest streams.
wq_wet2 <- mutate(wqgrade, year = year(Datetime)) %>%
filter(year <= 2014 & year >= 2010) %>%
filter(Weather == "Wet") %>%
group_by(LocationID) %>%
summarize(PercPassBoatWet = mean(PassBoat)*100, PercPassSwimWet = mean(PassSwim)*100, n_wet = n())
wq_dry2 <- mutate(wqgrade, year = year(Datetime)) %>%
filter(year <= 2014 & year >= 2010) %>%
filter(Weather == "Dry") %>%
group_by(LocationID) %>%
summarize(PercPassBoatDry = mean(PassBoat)*100, PercPassSwimDry = mean(PassSwim)*100, n_dry = n())
wq5 <- inner_join(wq_wet2, wq_dry2)%>%
mutate(PassBoatWeighted = (PercPassBoatDry*.75 +PercPassBoatWet*.25)) %>%
mutate(PassSwimWeighted = (PercPassSwimDry*.75 +PercPassSwimWet*.25))
wq5 <- tbl_df(wq5) %>%
arrange(desc(PassSwimWeighted))
kable(wq5)
| LocationID | PercPassBoatWet | PercPassSwimWet | n_wet | PercPassBoatDry | PercPassSwimDry | n_dry | PassBoatWeighted | PassSwimWeighted |
|---|---|---|---|---|---|---|---|---|
| MWRA137 | 95.65217 | 91.304348 | 23 | 100.00000 | 96.87500 | 96 | 98.91304 | 95.48234 |
| MWRA059 | 95.83333 | 75.000000 | 48 | 100.00000 | 97.50000 | 160 | 98.95833 | 91.87500 |
| UPL001 | 100.00000 | 83.333333 | 18 | 100.00000 | 94.59459 | 37 | 100.00000 | 91.77928 |
| MWRA067 | 84.61538 | 65.384615 | 52 | 100.00000 | 96.87500 | 160 | 96.15385 | 89.00240 |
| MYR071 | 90.47619 | 71.428571 | 21 | 100.00000 | 94.59459 | 37 | 97.61905 | 88.80309 |
| MWRA167 | 81.03448 | 65.517241 | 58 | 100.00000 | 95.16129 | 186 | 95.25862 | 87.75028 |
| MWRA069 | 82.75862 | 58.620690 | 58 | 98.78049 | 92.68293 | 82 | 94.77502 | 84.16737 |
| MYRMMP | 80.00000 | 60.000000 | 10 | 96.77419 | 90.32258 | 31 | 92.58065 | 82.74194 |
| MWRA083 | 76.78571 | 41.964286 | 112 | 99.42529 | 91.66667 | 348 | 93.76539 | 79.24107 |
| MWRA052 | 75.43860 | 50.877193 | 57 | 94.44444 | 86.66667 | 90 | 89.69298 | 77.71930 |
| MYR275 | 77.77778 | 55.555556 | 9 | 93.75000 | 84.37500 | 32 | 89.75694 | 77.17014 |
| MWRA176 | 54.71698 | 32.075472 | 53 | 98.74214 | 90.56604 | 159 | 87.73585 | 75.94340 |
| CHR95S | 70.00000 | 30.000000 | 10 | 100.00000 | 87.09677 | 31 | 92.50000 | 72.82258 |
| MWRA057 | 75.00000 | 38.461539 | 52 | 98.73418 | 81.64557 | 158 | 92.80063 | 70.84956 |
| MWRA056 | 51.92308 | 19.230769 | 52 | 100.00000 | 84.37500 | 160 | 87.98077 | 68.08894 |
| MWRA066 | 58.53659 | 23.170732 | 82 | 96.96970 | 79.79798 | 198 | 87.36142 | 65.64117 |
| MWRA177 | 55.81395 | 20.930233 | 86 | 98.93617 | 77.12766 | 188 | 88.15562 | 63.07830 |
| BEI093 | 70.00000 | 50.000000 | 10 | 83.87097 | 48.38710 | 31 | 80.40323 | 48.79032 |
| ABR049 | 55.55556 | 0.000000 | 18 | 94.59459 | 59.45946 | 37 | 84.83483 | 44.59459 |
| ABR006 | 57.89474 | 21.052632 | 19 | 97.22222 | 50.00000 | 36 | 87.39035 | 42.76316 |
| MEB001 | 47.61905 | 23.809524 | 21 | 84.21053 | 36.84211 | 38 | 75.06266 | 33.58396 |
| ALB006 | 66.66667 | 4.761905 | 21 | 97.36842 | 39.47368 | 38 | 89.69298 | 30.79574 |
| MWRA174 | 31.25000 | 4.687500 | 128 | 79.76190 | 38.09524 | 168 | 67.63393 | 29.74330 |
| MWRA074 | 39.06250 | 4.687500 | 128 | 85.11905 | 36.90476 | 168 | 73.60491 | 28.85045 |
| ABR028 | 61.90476 | 9.523810 | 21 | 94.59459 | 29.72973 | 37 | 86.42214 | 24.67825 |
| MWRA183 | 37.50000 | 12.500000 | 8 | 50.00000 | 25.00000 | 24 | 46.87500 | 21.87500 |
| MWRA172 | 30.76923 | 2.307692 | 130 | 84.52381 | 27.97619 | 168 | 71.08516 | 21.55907 |
| MIB001 | 50.00000 | 0.000000 | 20 | 76.31579 | 23.68421 | 38 | 69.73684 | 17.76316 |
| MAR036 | 22.22222 | 5.555556 | 18 | 72.97297 | 21.62162 | 37 | 60.28529 | 17.60511 |
| MWRA070 | 21.53846 | 2.307692 | 130 | 80.00000 | 22.35294 | 170 | 65.38462 | 17.34163 |
| MIC004 | 33.33333 | 0.000000 | 9 | 50.00000 | 21.87500 | 32 | 45.83333 | 16.40625 |
| WIB001 | 42.10526 | 10.526316 | 19 | 69.44444 | 16.66667 | 36 | 62.60965 | 15.13158 |
g <- ggplot(wq5, aes(x=reorder(LocationID,-PassBoatWeighted), PassBoatWeighted))
g + geom_bar(stat = "identity", position="dodge") + theme(axis.text.x = element_text(angle = 45, hjust = 1)) + scale_y_continuous(limits = c(0,100)) + xlab("Water Body") + ylab("Weighted Percent Boatable")
g <- ggplot(wq5, aes(x=reorder(LocationID,-PassSwimWeighted), y = PassSwimWeighted))
g + geom_bar(stat = "identity", position="dodge") + theme(axis.text.x = element_text(angle = 45, hjust = 1)) + scale_y_continuous(limits = c(0,100)) + xlab("Water Body") + ylab("Weighted Percent Swimmable")
# combine PassSwimWeighted and PassBoatWeighted into long form: PassType, PassValue
wq5_long <- gather(wq5, PassType, PassValue, PassBoatWeighted:PassSwimWeighted)
#View(wq5_long)
g <- ggplot(wq5_long, aes(x = reorder(LocationID, PassValue), PassValue, fill = PassType))
g + geom_bar(stat = "identity", position="dodge") + theme(axis.text.x = element_text(angle = 45, hjust = 1)) + scale_y_continuous(limits = c(0,100)) + xlab("Water Body") + ylab("Weighted Percent Meeting Standard") + scale_fill_brewer(palette = "Paired") + coord_flip()