note 0: This document contains overflow text from another RPubs document: Woods Creek Bacteria Monitoring. It probably does not make much sense out of this context.

note 1: The material in this RPub shows some preliminary work focusing on data presentation, with very little focus on interpretation of results, from an ongoing sampling effort in the Woods Creek Watershed (Lexington, Virginia, USA) to assess the concentration of E. coli/coliform bacteria using Coliscan Easygel, incubation, and plate ID/counts. Samples were/will be collected and analyzed by volunteers at 12 sites on Woods Creek and its tributaries on the second Tuesday of each month from July 2017 to June 2019. All E. coli results are reported in cfu/100 mL (cfu = colony-forming units).
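
The sampling schedule described in note 1 (second Tuesday of each month, July 2017 to June 2019) can be generated rather than typed by hand. The following is a minimal sketch using only the Python standard library; the function names `second_tuesday` and `sampling_dates` are made up for illustration and are not part of the original RPub.

```python
from datetime import date, timedelta

def second_tuesday(year: int, month: int) -> date:
    """Return the date of the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday = 0, Tuesday = 1, ..., Sunday = 6
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)

def sampling_dates(start=(2017, 7), end=(2019, 6)):
    """List the second Tuesday of every month from start to end, inclusive."""
    year, month = start
    dates = []
    while (year, month) <= end:
        dates.append(second_tuesday(year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return dates

schedule = sampling_dates()
print(len(schedule), schedule[0], schedule[-1])
```

For October 2017 this gives October 10, 2017, which matches the sample date of the petri dish mosaic described in note 11.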

note 2: This is a map of the study area (Woods Creek in and around Lexington, Virginia, in the USA). Samples are colored according to concentration of E. coli (blue = low, white = med, red = high, gray = NA). For some reason, attaching a key to a leaflet map in Rmarkdown does not work, so the key looks like this:
WCBkey
Click on a sample point to see the numeric concentration. Click on a watershed polygon to see information about its size and land use characteristics. Change background map and sample date on the right, drag the mouse to move the map, and use the mouse wheel to zoom.

More information on the watersheds can be found at “Watershed Info” (data table). Watershed land use (% natural, % pasture, and % developed) is from the 2011 NLCD. Percent paved area is calculated from all road and parking lot surfaces (not including sidewalks). Imperviousness is the mean of the values within the watershed from the NLCD 2011 Percent Developed Imperviousness dataset and has potential values ranging from 0 (all area highly permeable) to 100 (all area highly impermeable). Buffer canopy is the mean of values within 30 meters of the perennial stream channels of a watershed from the NLCD 2011 USFS Tree Canopy cartographic dataset and has potential values ranging from 0 (no canopy for all buffer area) to 100 (thick canopy for all buffer area). Percent carbonate bedrock is calculated from whole rock geochemical analysis from Low and Hewitt (unpublished data).
An alternative version of this map can be found at rpubs.com/fecr2o4/WCbacteria and a 3D model of the Woods Creek Watershed can be found at sketchfab.com.

note 3: Subway maps schematically forgo positional and scaling constraints to focus on communicating the relationships in a network. In this modified version of a subway map of the study area, samples are colored according to E. coli concentration (blue = low, white = med, red = high, gray = NA), and the diameters of the circles are sized according to the area of the watershed for each sample site.
WCBkey2
Move the slider to the right to see later sample dates and hover over a sample point to see the numeric concentration and area. For some reason, the render order of the layers for this plot is reversed for every other time jump. I am not sure how to fix this.

note 4: Sankey maps/diagrams show both the relationships between nodes in a network and the magnitude of flow between nodes. In this version, the size of each node is determined by the area of the watershed and the nodes are colored according to a weighted average of E. coli concentration (blue = low, white = med, red = high, gray = NA) for the 12-month study period. The weighting metric gives more importance to samples that were collected during months in which more samples were collected. The plotly sankey version positions nodes “intelligently”, which provides less control over their default (initial) positioning but allows for more (easier?) control over color. The version below (D3) uses automatically assigned node colors, but I like where the nodes are positioned (the tributaries are on the geographically “correct” sides of the main stem in this example).
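
The weighted average in note 4 could be computed along the following lines. This is only a sketch: the exact weighting used for the plot is not spelled out here, so the code assumes that each sample's weight is simply the number of samples collected (across all sites) in the same month. The function name, the month keys, and all of the numbers are hypothetical.

```python
def weighted_site_mean(site_results, samples_per_month):
    """Weighted mean E. coli concentration (cfu/100 mL) for one site.

    site_results:      {month: concentration, or None if not sampled}
    samples_per_month: {month: number of samples collected at all sites}
    Assumption: weight of a sample = number of samples taken that month.
    """
    weighted_total = 0.0
    weight_sum = 0.0
    for month, concentration in site_results.items():
        if concentration is None:  # skip months with no result (NA)
            continue
        weight = samples_per_month[month]
        weighted_total += weight * concentration
        weight_sum += weight
    return weighted_total / weight_sum if weight_sum else None

# Hypothetical values, for illustration only
site = {"2017-07": 100, "2017-08": 400, "2017-09": None}
counts = {"2017-07": 6, "2017-08": 12, "2017-09": 12}
site_mean = weighted_site_mean(site, counts)
print(site_mean)
```

With these made-up inputs the August result counts twice as much as the July result because twice as many samples were collected that month, and the unsampled month is ignored.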


note 5: The land cover of a watershed has the potential to affect water quality. The Woods Creek Watershed is almost ¾ pasture upstream of Lexington, and pasture is gradually replaced by developed land as the creek flows through town (left). Its tributaries vary considerably, from Town Branch with 100% developed land cover to the 65% forested “GC01” (left). As it flows through town, Woods Creek gains more imperviousness in its watershed but also has more trees near its channels (right). Hover your mouse over the points for the site name and numeric info. More information on the watershed can be found at the “Watershed Info” tab (data table).

note 6: Woods Creek does not have a stream gauge (yet), but the USGS stream gauge at nearby Kerrs Creek can provide an effective proxy for flow variation in Woods Creek. These 12 graphs show the flow (green line) at Kerrs Creek during the week preceding each of the sampling events (red dot). The unit is flow/median flow: 1 (shown with a dotted black line) is median flow, less than 1 is low flow, and greater than 1 is stormflow. Hover your mouse over the line for the date/time and numeric info.
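
The flow/median flow unit in note 6 is just each gauge reading divided by a reference median flow. A minimal sketch follows; the function names and the readings are hypothetical, and which median was used for the plots (for example, the long-term median at the Kerrs Creek gauge) is an assumption that is passed in as a parameter rather than computed.

```python
def relative_flow(flows, median_flow):
    """Divide each flow reading by the reference median flow.

    Both arguments must use the same units; the result is unitless.
    """
    if median_flow <= 0:
        raise ValueError("median flow must be positive")
    return [q / median_flow for q in flows]

def flow_condition(ratio):
    """Label a flow/median-flow ratio using the convention in note 6."""
    if ratio < 1:
        return "low flow"
    if ratio > 1:
        return "stormflow"
    return "median flow"

# Hypothetical readings and reference median, for illustration only
readings = [10.0, 20.0, 60.0]
ratios = relative_flow(readings, median_flow=20.0)
for reading, ratio in zip(readings, ratios):
    print(reading, ratio, flow_condition(ratio))
```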

note 7: This tile plot shows how relative E. coli concentrations (blue = low, white = med, red = high, gray = NA) vary from site to site and from month to month. Hover over the tiles for numeric information, drag to zoom, and double-click to unzoom.

note 8: The left plot shows how the samples vary from site to site, starting from upstream on the left to downstream on the right. Tributaries (shown in green) enter Woods Creek (shown in gray) before they appear in this order (for instance, Sarahs Run (samples SR0x) enters Woods Creek between WC02 and WC03). In a box plot, the black horizontal bar = median, boxes = 25th and 75th percentiles, whiskers (vertical black bars) = 1.5 times the interquartile range, and colored dots = raw data. Hover your mouse over the points for the E. coli concentration (vertical axis) and the month in which the sample was taken. The right plot shows how the samples vary from month to month, starting from the first sample event in July 2017 to the last in June 2018. The horizontal dotted black line marks 235 cfu/100 mL, the concentration used by the VADEQ as the standard that should be exceeded no more than 10.5% of the time at any given sample location. This is a potentially useful (but not perfect) guide for whether the E. coli concentration in a stream is low enough for recreational contact and general stream health. Please keep in mind that the minimum increment measured by the tests used here is 100 cfu/100 mL, meaning that results of 0, 100, or 200 are “acceptable” and results of 300 or more are not. Hover your mouse over the points for the E. coli concentration (vertical axis) and the site name.
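
To compare one site against the guideline in note 8 (235 cfu/100 mL, exceeded no more than 10.5% of the time), the fraction of samples above the standard can be tallied as sketched below. The constant and function names are made up for illustration, the sample values are hypothetical, and treating missing results as excluded from the count is an assumption.

```python
STANDARD_CFU_PER_100ML = 235   # concentration cited in note 8
MAX_EXCEEDANCE_FRACTION = 0.105  # "no more than 10.5% of the time"

def exceedance_fraction(concentrations, standard=STANDARD_CFU_PER_100ML):
    """Fraction of valid samples above the standard (None = not sampled)."""
    valid = [c for c in concentrations if c is not None]
    if not valid:
        return None
    return sum(c > standard for c in valid) / len(valid)

# Hypothetical year of monthly results at one site, in cfu/100 mL
site_results = [0, 100, 200, 300, None, 100, 0, 500, 100, 200, 100, 0]
fraction = exceedance_fraction(site_results)
meets_guideline = fraction <= MAX_EXCEEDANCE_FRACTION
print(round(fraction, 3), meets_guideline)
```

Because results come in steps of 100 cfu/100 mL, any result of 300 or more counts as an exceedance here, and with only about a dozen samples per site a single high result can move a site across the 10.5% line.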

note 9: Tile plot (same format as before) with observed concentrations of general coliform bacteria.

note 10: Box plots (same format as before) with observed concentrations of general coliform bacteria.

note 11: This is an image mosaic of incubated petri dishes from water samples collected on October 10, 2017. Volunteers head out at 8:00 am on the second Tuesday of each month and collect water from their site(s), mix the stream water with some growth media/differential stain (Coliscan EasyGel), let the samples incubate for 24 hours at 98 °F, and then count the purplish blue (but not blueish blue) colonies as E. coli and the fuchsia (but not purple) colonies as general coliform bacteria. Hover over each of the petri dishes to see the number of E. coli colony-forming units (cfu) and drag to zoom to see small red numbers next to counted E. coli colonies. Double-click to unzoom.
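
A plate count from note 11 is scaled to the reporting unit of cfu/100 mL by the volume of stream water plated. The sketch below assumes a 1 mL sample, which is an inference from the 100 cfu/100 mL minimum increment mentioned in note 8 and is not stated in this document; the function name and the counts are hypothetical.

```python
def cfu_per_100ml(colony_count, sample_volume_ml=1.0):
    """Scale a plate count to cfu/100 mL.

    sample_volume_ml is the volume of stream water plated.
    The 1 mL default is an assumption, not a documented value.
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    return colony_count * 100 / sample_volume_ml

# Hypothetical plate: 3 E. coli colonies counted from a 1 mL sample
ecoli_result = cfu_per_100ml(3)
print(ecoli_result)
```

Under this assumption each counted colony adds 100 cfu/100 mL to the reported result, which is why reported values fall on multiples of 100.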

note 12: This is a table that contains all the data that have been reported as of June 20, 2018. You can sort by column header, show up to 100 entries, and see brief descriptions of the sample sites. These data are also available on Google Docs.

note 13: This table contains information on land cover within the watersheds of the study sites. Watershed land use (% natural, % pasture, and % developed) is from the 2011 NLCD. Percent paved area is calculated from all road and parking lot surfaces (not including sidewalks). Imperviousness is the mean of the values within the watershed from the NLCD 2011 Percent Developed Imperviousness dataset and has potential values ranging from 0 (all area highly permeable) to 100 (all area highly impermeable). Buffer canopy is the mean of values within 30 meters of the perennial stream channels of a watershed from the NLCD 2011 USFS Tree Canopy cartographic dataset and has potential values ranging from 0 (no canopy for all buffer area) to 100 (thick canopy for all buffer area). Percent carbonate bedrock is calculated from whole rock geochemical analysis from Low and Hewitt (unpublished data).

note 14: This brief (under 5 minutes), unpolished instructional video is intended for people who are already familiar with this project (or similar projects); it shows how to access the information in this RPub document.

note 15: This is another subway map of the Woods Creek watershed system with additional sampling sites (mostly concentrating on the “in town” part of the watershed). Size corresponds to watershed area and the color palette varies with average specific conductance (uS/cm) from ~50 samples taken during a variety of flow conditions during 2016-2018 (blue = low, white = med, red = high, gray = NA).

note 16: Sankey map of the expanded Woods Creek Watershed. The nodes are colored according to average specific conductance (uS/cm) from ~50 samples taken during a variety of flow conditions during 2016-2018 (blue = low, white = med, red = high, gray = NA). If the initial placement of the nodes bothers you, move them (it’s kind of fun). There is another example below with randomly colored nodes and slightly better “intelligent” node placement (that can be updated).