In 2020, we tested more Story County streams than ever, and introduced new ways to make sense of the data. Updated 2021-01-11
# 2021-01-11 Updated with December data
# Import csv file provided by City of Ames
# This version includes March thru Dec 2020
# Does not include storm samples
# Skip unneeded columns for tidy format
library(tidyverse) # attaches readr, dplyr, tidyr, and ggplot2
library(lubridate)
ameslab2020 <- read_csv("data/ameslab2020.csv", na = "NULL",
  col_types = cols(CollectionDate = col_date(format = "%m/%d/%Y"),
    CollectionTime = col_skip(), Comment = col_skip(),
    MRL = col_skip(), LabID = col_integer(),
    Method = col_skip(), Note = col_skip(),
    Symbol = col_skip(), Unit = col_skip()))
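# Rename a column for clarity
ameslab2020 <- rename(ameslab2020, site = Description)
# Inspect the import (View() only works in an interactive session such as RStudio)
if (interactive()) View(ameslab2020)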
# Tidy the data so each analyte is in its own column
ames_tidy <- ameslab2020 %>%
  pivot_wider(names_from = Analyte, values_from = Result) %>%
  mutate(Year = year(CollectionDate), Month = month(CollectionDate, label = TRUE),
         Day = day(CollectionDate))
# Add a column from a lookup table so sites can be ordered from upstream to downstream
lookup_US_DS <- read_csv("data/lookup_US_DS.csv")
ames_tidy <- left_join(ames_tidy, lookup_US_DS, by = "site")
If you see a car stopped by a bridge in Story County pulling up a milk jug of water on a rope, there’s a good chance it’s me or volunteer Rick Dietz, doing our monthly monitoring route. We collect water samples from 10 sites, and City of Ames staff cover another five. Laboratory Services for City of Ames Water and Pollution Control tests the samples for nitrate, total phosphorus, total suspended solids, E. coli bacteria, and fecal coliform.
After normal to wet conditions in spring, drought conditions took hold in much of Story, Boone, and Hamilton counties. Ballard Creek dried up completely and water levels were low at many other sites.
Flow comparison, Ioway Creek at Moore Park in Ames
Here are photos for Ioway Creek (formerly known as Squaw Creek) in Ames, with graphs from a nearby USGS gage for reference. 100 cfs is enough water to float a canoe.
Streams with missing samples due to dry or stagnant conditions include Ballard Creek, West Indian Creek upstream of Nevada, Bear Creek in Roland, and Worrell Creek and Clear Creek in Ames. Thick ice prevented sampling of some streams in December.
The support of a certified lab provides a backstop for volunteer monitoring and allows us to make direct comparisons with data collected from larger rivers by the Iowa DNR. It also allows us to test for E. coli bacteria, an indicator of fecal contamination from untreated sewage, livestock, pets, or wildlife.
The Iowa DNR uses two sets of criteria to evaluate E. coli bacteria. For a single sample, a threshold of 235 colonies per 100 mL is used for waters designated for primary contact recreation and children’s play, and a threshold of 2,880 colonies per 100 mL is used for waters designated for secondary contact recreation. Two streams exceeded the secondary contact standard in October, but the water was barely flowing.
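As a minimal sketch of how the single-sample thresholds above could be applied in R: the helper name `classify_ecoli` and the sample values are hypothetical and are not part of our scripts or lab data. Only the 235 and 2,880 thresholds come from the text.

```r
# Hypothetical helper: label single-sample E. coli results (MPN/100 mL)
# against the two single-sample thresholds described above.
# A result exactly at a threshold is treated as not exceeding it.
classify_ecoli <- function(x) {
  cut(x,
      breaks = c(-Inf, 235, 2880, Inf),
      labels = c("meets primary contact",
                 "exceeds primary contact",
                 "exceeds secondary contact"))
}

# Made-up values for illustration only; NA stands for a missed sample
example_results <- c(41, 235, 600, 3100, NA)
classify_ecoli(example_results)
```

Missed samples stay `NA` rather than being assigned to a category, so they are easy to count separately.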
Based on single samples, every stream we tested exceeded the primary contact recreation standard for at least one month in spring or summer. Sites 1, 3, 5, 6, 7, 8, 11, 12, 13, and 14 are located at public parks. For additional context and discussion, read more here.
ggplot(data = ames_tidy) +
  geom_col(mapping = aes(x = US_DS_order, y = `E. Coli`, fill = Month), position = "dodge2") +
  geom_hline(yintercept = 235, color = 'red') +
  annotate("text", x = 4, y = 500, color = 'red', label = "Primary contact standard") +
  geom_hline(yintercept = 2880, color = 'brown') +
  annotate("text", x = 4, y = 3200, color = 'brown', label = "Secondary contact standard") +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Sites, arranged upstream to downstream", y = "E. coli (MPN/100mL)", title = "E. coli in Story County streams, single samples")
Bacteria concentrations can vary dramatically from day to day; for example, a heavy rain may flush manure off a field or raccoon dung out of a storm sewer. To evaluate bacteria concentrations across the recreational season (March 15-November 15), the geometric mean of samples collected from at least 7 weeks of the season is used, which better reflects conditions on a typical day. The thresholds for the seasonal geometric mean are 126 colonies per 100 mL for primary contact recreation (such as swimming) and 630 colonies per 100 mL for secondary contact recreation (such as shore fishing). We have enough data to compare ten sites to the seasonal criteria: all but the South Skunk River at W. Riverside Rd (Sleepy Hollow Access) exceeded the primary contact standard; none exceeded the secondary contact standard.
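To illustrate why the geometric mean is less sensitive to a single extreme sample, here is a small sketch. The values are made up for illustration and are not lab results.

```r
# Made-up E. coli results (MPN/100 mL); the last value mimics a sample
# taken right after a heavy rain.
ecoli <- c(50, 80, 120, 200, 5000)

# Arithmetic mean: pulled far upward by the one extreme sample
arithmetic_mean <- mean(ecoli)

# Geometric mean: average the logs, then back-transform
geometric_mean <- exp(mean(log(ecoli)))

arithmetic_mean        # 1090
round(geometric_mean)  # 217
```

Because `log()` needs positive numbers, any results reported as zero would have to be handled before applying this approach.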
# Seasonal geometric mean of E. coli for each site, April through October
ames_geomean <- ames_tidy %>%
  filter(Month %in% c("Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct")) %>%
  # Drop missed samples so they neither count toward n() nor turn the mean into NA
  filter(!is.na(`E. Coli`)) %>%
  # Keep US_DS_order so the plot can order sites upstream to downstream, as its label says
  group_by(site, US_DS_order) %>%
  summarise(count = n(),
            geomean = round(exp(mean(log(`E. Coli`))), digits = 0),
            .groups = "drop")
# Plot only the sites with at least 7 samples
filter(ames_geomean, count >= 7) %>%
  ggplot() +
  geom_col(aes(x = US_DS_order, y = geomean)) +
  geom_hline(yintercept = 126, color = 'red') +
  annotate("text", x = 2, y = 300, color = 'red', label = "Primary contact standard") +
  geom_hline(yintercept = 630, color = 'brown') +
  annotate("text", x = 2, y = 700, color = 'brown', label = "Secondary contact standard") +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Sites, arranged upstream to downstream", y = "E. coli geomean (MPN/100mL)", title = "E. coli in Story County streams, 2020 recreation season")
The most surprising result so far is the high phosphorus levels in West Indian Creek. Phosphorus is a nutrient that contributes to algae blooms. Phosphorus increases between Lincoln Highway (where the creek enters Nevada) and Jennett Heritage Area (about 5 miles downstream), and we’ve sampled mostly during low flow conditions, so treated wastewater is the most likely source. Fortunately, the City of Nevada has projects underway that could help clean it up. Nevada will be constructing a new wastewater treatment plant starting in 2022, which will include nutrient removal processes and UV disinfection. This led to an SRF-sponsored project, awarded in October, that will fund stream restoration or stormwater treatment projects in and around town. We were happy to be included in the conversation and to provide data that supported the City’s application.
ggplot(data = ames_tidy) +
  geom_col(mapping = aes(x = US_DS_order, y = `Total Phosphorus as P`, fill = Month), position = "dodge2") +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Sites, arranged upstream to downstream", y = "Total Phosphorus (mg/L)", title = "Lab results from Story County monitoring route")
The graphs above were produced with “R”, a free, open-source software package for statistics and data science. Doing an analysis for the first time is difficult (a bit like computer programming), but doing an analysis for the second time (with more recent data, a different site, or a different pollutant) is ten times faster. We like the idea that our work will be transparent and repeatable.
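One way the "second time is faster" idea could look in practice is to wrap the bar chart in a function and pass the pollutant as an argument. This is only a sketch: `plot_analyte` is a hypothetical helper, not something in our scripts, and the `demo` data frame is made up to mimic the column names of `ames_tidy`.

```r
library(ggplot2)

# Hypothetical wrapper around the bar charts in this post: the analyte column
# is passed as a string so the same code can be reused for any pollutant.
plot_analyte <- function(data, analyte, y_label) {
  ggplot(data) +
    geom_col(aes(x = US_DS_order, y = .data[[analyte]], fill = Month),
             position = "dodge2") +
    theme(axis.text.x = element_text(angle = 90)) +
    labs(x = "Sites, arranged upstream to downstream", y = y_label)
}

# Made-up data with the same column names as ames_tidy (illustration only)
demo <- data.frame(
  US_DS_order = c("01 Upstream", "01 Upstream", "02 Downstream", "02 Downstream"),
  Month = c("Jun", "Jul", "Jun", "Jul"),
  `Nitrate Nitrogen as N` = c(12.1, 6.4, 9.8, 3.2),
  check.names = FALSE
)

p <- plot_analyte(demo, "Nitrate Nitrogen as N", "Nitrate-N (mg/L)")
p
```

Switching to another pollutant would then be a one-line change, for example passing "Total Phosphorus as P" and a matching axis label.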
Doing data analysis in R makes it easier to make comparisons between data from different sources. Are downstream waters being affected? Looking at data collected by the Iowa DNR at the nearest downstream monitoring station, Indian Creek at Colfax, it appears that phosphorus can be high there as well. There is no state standard for phosphorus, but 1 mg/L or greater is especially high.
# The "DNR" placeholders will stand in for any csv file downloaded from DNR's AQuIA
# Within a session, the DNR object can be used in other scripts.
DNR_input <- "data/siteSamplingData-10500001.csv"
DNR_output <- "data/output.csv"
# Document the input when you run variations:
# downloaded: 2020-08-12
# sites: Indian Creek at Colfax 10500001
# analytes: all
# dates: all
# Import from csv. Some of the error flags have letter codes.
# They are rare enough that R can't guess the type from the first 1000 rows, so we have to specify it.
DNR <- read_csv(DNR_input,
  col_types = cols(qualFlag = col_character(),
    quantFlag = col_character(), quantLimit = col_double(),
    remark = col_character(),
    # Weird, this has a different date format (ISO, so no need to specify) than the last thing I downloaded from AQuIA!
    sampleDate = col_datetime())) %>%
  # Simplify date-time. For most purposes, time of day is not important.
  mutate(sampleDate = as_date(sampleDate))
# Tidy a subset of the available data
DNR_nutrients <- DNR %>%
  # Pull out just the analytes of interest: flow, E. coli, nitrate+nitrite, total phosphorus,
  # TSS and turbidity
  filter(cas_rn %in% c("FLOW", "68583-22-2", "NIT-NO3-NO2", "PHOSP-PHOSP", "TSS", "TURB")) %>%
  # Then pull out just the relevant columns: site number and name, sample date,
  # the shorthand code for the analyte (cas_rn), and result.
  # Columns dealing with units and detection limits can't easily be pivoted, so we'll deal with those
  # in a separate analysis
  select(siteID, name, sampleDate, cas_rn, result) %>%
  # Tidy the dataset so that results for each variable show up in a separate column.
  pivot_wider(names_from = cas_rn, values_from = result) %>%
  mutate(sampleYear = year(sampleDate), sampleMonth = month(sampleDate, label = TRUE),
         sampleDay = day(sampleDate)) %>%
  rename(`E-COLI` = `68583-22-2`)
# Graph the data
ggplot(DNR_nutrients) +
  geom_point(mapping = aes(x = sampleDate, y = `PHOSP-PHOSP`, color = sampleMonth)) +
  geom_hline(yintercept = 0.5, color = 'coral') +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Date", y = "Total Phosphorus (mg/L)", title = "Phosphorus in Indian Creek at Colfax")
Total suspended solids are a measure of water clarity that involves weighing the sediment (mud) that settles out of the water. Suspended solids were highest in the main channel of the South Skunk River in spring. We are missing data from some sites in October, as dry conditions made it impossible to collect a water sample without stirring up the mud on the bottom.
ggplot(data = ames_tidy) +
  geom_col(mapping = aes(x = US_DS_order, y = `Total Suspended Solids`, fill = Month), position = "dodge2") +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Sites, arranged upstream to downstream", y = "Total Suspended Solids (mg/L)", title = "Lab results from Story County monitoring route")
Nitrogen, a major contributor to the “dead zone” in the Gulf of Mexico, was especially high this June, exceeding the drinking water standard (10 mg/L) in most of the streams we tested. The exception was College Creek, which flows through Ames and drains mostly urban land. Nitrogen losses are highest in watersheds with a lot of tile-drained agriculture, and during times when tile lines are flowing. Read here to see this in action. Due to dry conditions, nitrogen dropped to low levels at most sites this fall.
ggplot(data = ames_tidy) +
  geom_col(mapping = aes(x = US_DS_order, y = `Nitrate Nitrogen as N`, fill = Month), position = "dodge2") +
  geom_hline(yintercept = 10, color = 'coral') +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Sites, arranged upstream to downstream", y = "Nitrate-N (mg/L)", title = "Lab results from Story County monitoring route")