This report uses data on the presence of raw sewage at beaches in Sydney, Australia, to explore improvement by site across a five-year period from 2013 to 2018. The data were provided by R-Ladies Sydney. The report first examines all eleven beaches in the local area, then narrows its focus to environmental improvements at two specific sites.
#load packages
library(tidyverse)
library(here)
library(ggbeeswarm)
#read in data
plotbeaches <- read.csv(here("data", "cleanbeaches_new.csv"))
#data tidying
#coerce year to be a factor rather than an integer
plotbeaches$year <- as.factor(plotbeaches$year)
#plotting bug levels by site
plotbeaches %>%
  na.omit() %>%
  ggplot(aes(x = site, y = beachbugs, colour = year)) +
  geom_jitter() +
  coord_flip() +
  ggtitle("Beach Bugs: All Sites Year Over Year")
ggsave(here("output", "BeachesAllYearOverYear.png"))
Figure 1.) Beach bugs at all sites year over year. This plot shows enterococci (bug) levels at each of the eleven sites, with each year's data distinguished by color. Of note, Little Bay Beach has an outlier observation of nearly 5000 recorded in 2013, which stretches the bug-level axis for every site.
Comment: This code example uses the “jitter” technique, geom_jitter, from the ggplot2 library, which adds a small amount of random variation to each point's position so that overlapping observations are spread apart. Reading from left to right, a dense cluster of near-zero observations is followed by sparse spikes in bug levels. Year is encoded by color, and the corresponding legend appears to the right of the plot.
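The amount of jitter described above can also be set explicitly. The sketch below is illustrative only: it uses a small made-up data frame (`toy`, with invented site names and values, not the report's data) and assumes the aim is to spread points along the site axis while leaving the bug counts themselves untouched. `position_jitter()` with `height = 0` and a fixed `seed` is one way to do that reproducibly.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration (not the report's dataset)
toy <- data.frame(
  site = rep(c("Site A", "Site B"), each = 6),
  beachbugs = c(2, 5, 5, 8, 40, 300, 1, 3, 3, 10, 60, 120),
  year = factor(rep(c(2013, 2014, 2015), times = 4))
)

# width = 0.2 spreads points only along the discrete site axis;
# height = 0 leaves the bug counts unchanged;
# seed makes the random offsets reproducible between runs
p <- ggplot(toy, aes(x = site, y = beachbugs, colour = year)) +
  geom_jitter(position = position_jitter(width = 0.2, height = 0, seed = 42)) +
  coord_flip()

# Pull out the jittered coordinates to check what was moved
jittered <- layer_data(p)
```

The ggbeeswarm package loaded at the top of the report also provides geom_quasirandom(), which arranges points by density rather than at random and may be worth trying as an alternative.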
#removing outliers to re-plot bug levels by site
plotbeaches %>%
  na.omit() %>%
  filter(beachbugs < 1000) %>%
  ggplot(aes(x = year, y = beachbugs, colour = year)) +
  geom_jitter() +
  facet_wrap(~site) +
  coord_flip() +
  ggtitle("Beach Bugs by Site & Year (Outliers Removed)")
ggsave(here("output", "BeachByYearSansOutLi.png"))
Figure 2.) Beach bugs by site & year: Outliers removed. Here a dplyr “filter” function is added to the previous code to exclude the outlier value at Little Bay Beach and give better definition of bug levels across all sites from 2013 to 2018. The plot is also faceted by site, with year on its own axis. The filter uses the condition “beachbugs < 1000”, which keeps only enterococci observations below a count of one thousand; any other observation at or above that value is dropped as well.
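Because the filter removes every observation at or above the threshold, not only the single extreme value, it can be useful to count what is being dropped. The following is a minimal sketch on made-up data (the tibble `toy` and the name `threshold` are introduced here for illustration and are not part of the report's code).

```r
library(dplyr)

# Hypothetical toy data, invented for illustration (not the report's dataset)
toy <- tibble(
  site = c("Site A", "Site A", "Site B", "Site B", "Site B"),
  beachbugs = c(12, 4900, 30, 1000, NA)
)

threshold <- 1000

# Observations the filter treats as outliers (at or above the threshold).
# filter() silently drops rows where the condition evaluates to NA.
dropped <- toy %>%
  filter(beachbugs >= threshold)

# Observations that remain, mirroring the na.omit() + filter() steps above
kept <- toy %>%
  na.omit() %>%
  filter(beachbugs < threshold)

# Count dropped observations per site
dropped_by_site <- dropped %>%
  count(site, name = "n_dropped")
```

Running the same counts on plotbeaches would show how many observations, and at which sites, are excluded from Figure 2.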
Comment: With the narrowed scope, distinct improvement can be seen at Bondi Beach, where bug observations were high in 2013 and decreased gradually into 2018. Conversely, Bronte Beach shows a “V” pattern in which bugs decreased from 2013 to 2015 but returned sharply in 2016.
#isolating the progression of bugs at Bondi and Coogee Beaches
plotbeaches %>%
  na.omit() %>%
  filter(beachbugs < 1000) %>%
  filter(site %in% c("Coogee Beach", "Bondi Beach")) %>%
  ggplot(aes(x = year, y = beachbugs, colour = site)) +
  geom_jitter() +
  facet_wrap(~ site)
ggsave(here("output", "coogibondi.png"))
Figure 3.) Bondi and Coogee Beach subset. The final plot narrows in on Bondi and Coogee Beaches to examine environmental improvement at each location. Of note, the axes for these plots have changed because coord_flip() is no longer applied: bug observations appear on the y-axis, while years ascend from left to right along the x-axis.
Comment: This figure gives the clearest view of the jittered points in the entire Sydney beach data series. The panel on the left shows the overall decline in bug observations at Bondi Beach, from 600 in 2013 to approximately 200 in 2018. This beach can be said to have improved over the five-year period. The panel on the right shows that bug levels at Coogee Beach have remained high. Although levels were relatively lower in 2018, Coogee is still experiencing high levels, and time will tell whether this downward trend persists.
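A visual reading of the jittered points can be backed up with a per-site, per-year summary. The sketch below is an assumption about how one might do this, not part of the original analysis; it uses invented values in a tibble called `toy` and summarises each site and year by its median, maximum, and number of observations.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration (not the report's dataset)
toy <- tibble(
  site = rep(c("Site A", "Site B"), each = 6),
  year = factor(rep(rep(c(2013, 2018), each = 3), times = 2)),
  beachbugs = c(50, 200, 600, 5, 20, 200, 30, 100, 400, 25, 90, 380)
)

# One row per site and year: the median describes typical levels,
# the maximum describes the upper end of the point cloud
yearly <- toy %>%
  group_by(site, year) %>%
  summarise(
    median_bugs = median(beachbugs),
    max_bugs = max(beachbugs),
    n = n(),
    .groups = "drop"
  )
```

Applied to the filtered plotbeaches data, a table like this would make it possible to state whether a decline refers to typical levels or only to the highest observations in each year.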
Two of the eleven beaches, Bondi and Clovelly, showed either a downward trend in bug observations or consistently low levels across the five-year period from 2013 to 2018, while observations at all other sites remained high. Further analysis is required to determine the external factors contributing to persistent bug observations at the nine remaining sites. One recommendation would be to map all eleven beaches and test for spatial autocorrelation among them.
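As a rough illustration of what a spatial autocorrelation check could look like, the sketch below computes global Moran's I in base R with inverse-distance weights. Everything in it is assumed for illustration: the site labels, coordinates, and mean bug levels are invented, and the choice of inverse-distance weighting is only one of several reasonable options.

```r
# Hypothetical sites, coordinates, and mean bug levels (all invented)
beaches <- data.frame(
  site = c("A", "B", "C", "D", "E"),
  x = c(0, 1, 2, 3, 4),   # made-up positions along a coastline
  y = c(0, 0, 0, 0, 0),
  mean_bugs = c(10, 12, 30, 45, 50)
)

# Inverse-distance spatial weights; a site is not its own neighbour
d <- as.matrix(dist(beaches[, c("x", "y")]))
w <- 1 / d
diag(w) <- 0

# Global Moran's I: (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2),
# where z are deviations from the mean and S0 is the sum of all weights
z <- beaches$mean_bugs - mean(beaches$mean_bugs)
n <- nrow(beaches)
morans_i <- (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)

# Expected value of I under no spatial autocorrelation
expected_i <- -1 / (n - 1)
```

A value of I above the expected value suggests that nearby sites tend to have similar bug levels. For a real analysis, a dedicated package such as spdep, which provides significance tests for Moran's I, would likely be a better choice than this hand-rolled version.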