1 Executive Summary

The data were collected by a government organization to monitor water quality, measured by the level of Enterococci bacteria. The purpose of this project is to answer two research questions about that data, using outside references to support the findings and to help readers understand what the results mean. People with similar questions can use this report as a starting point. Graphical summaries are presented to support the answers to the research questions.


2 Full Report

2.1 Initial Data Analysis (IDA)

##install and load all the libraries
#install.packages("xts")
#install.packages('ggthemes', dependencies = TRUE)
library(ggplot2)
library(scales)
library(xts)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(readxl)
## Warning: package 'readxl' was built under R version 4.0.3
library(ggthemes)
Billarong <- read_excel("Billarong.xlsx", 
     col_types = c("text", "date", "numeric","text"))
## Warning in read_fun(path = enc2native(normalizePath(path)), sheet_i = sheet, :
## Expecting numeric in C6 / R6C3: got 'NA'
## Warning in read_fun(path = enc2native(normalizePath(path)), sheet_i = sheet, :
## Expecting numeric in C36 / R36C3: got 'NA'
AvalonBeach <- read_excel("AvalonBeach.xlsx", 
   col_types = c("text", "date", "numeric","text"))
## Quick look at top 6 rows of Billarong and AvalonBeach
head(Billarong)
## # A tibble: 6 x 4
##   Site                        Date                `Enterococci (cfu/100~ Month  
##   <chr>                       <dttm>                               <dbl> <chr>  
## 1 Bilarong Reserve (Narrabee~ 2018-02-07 00:00:00                     18 Februa~
## 2 Bilarong Reserve (Narrabee~ 2018-02-19 00:00:00                     77 Februa~
## 3 Bilarong Reserve (Narrabee~ 2018-05-21 00:00:00                     23 May    
## 4 Bilarong Reserve (Narrabee~ 2018-06-13 00:00:00                     17 June   
## 5 Bilarong Reserve (Narrabee~ 2018-06-19 00:00:00                     NA June   
## 6 Bilarong Reserve (Narrabee~ 2018-10-31 00:00:00                      8 October
head(AvalonBeach)
## # A tibble: 6 x 4
##   Site         Date                `Enterococci (cfu/100ml)` Month   
##   <chr>        <dttm>                                  <dbl> <chr>   
## 1 Avalon Beach 2018-01-25 00:00:00                         6 January 
## 2 Avalon Beach 2018-02-07 00:00:00                         0 February
## 3 Avalon Beach 2018-02-19 00:00:00                         1 February
## 4 Avalon Beach 2018-01-19 00:00:00                         1 January 
## 5 Avalon Beach 2018-02-01 00:00:00                         2 February
## 6 Avalon Beach 2018-03-08 00:00:00                         0 March
## Size of Billarong and AvalonBeach
dim(Billarong)
## [1] 58  4
dim(AvalonBeach)
## [1] 58  4
## R's classification of Billarong and AvalonBeach
class(Billarong)
## [1] "tbl_df"     "tbl"        "data.frame"
class(AvalonBeach)
## [1] "tbl_df"     "tbl"        "data.frame"
## R's classification of variables
str(Billarong)
## tibble [58 x 4] (S3: tbl_df/tbl/data.frame)
##  $ Site                   : chr [1:58] "Bilarong Reserve (Narrabeen Lagoon)" "Bilarong Reserve (Narrabeen Lagoon)" "Bilarong Reserve (Narrabeen Lagoon)" "Bilarong Reserve (Narrabeen Lagoon)" ...
##  $ Date                   : POSIXct[1:58], format: "2018-02-07" "2018-02-19" ...
##  $ Enterococci (cfu/100ml): num [1:58] 18 77 23 17 NA 8 13 50 180 5 ...
##  $ Month                  : chr [1:58] "February" "February" "May" "June" ...
str(AvalonBeach)
## tibble [58 x 4] (S3: tbl_df/tbl/data.frame)
##  $ Site                   : chr [1:58] "Avalon Beach" "Avalon Beach" "Avalon Beach" "Avalon Beach" ...
##  $ Date                   : POSIXct[1:58], format: "2018-01-25" "2018-02-07" ...
##  $ Enterococci (cfu/100ml): num [1:58] 6 0 1 1 2 0 33 5 0 0 ...
##  $ Month                  : chr [1:58] "January" "February" "February" "January" ...

The beaches around Sydney are tested for the level of Enterococci bacteria by The Environment, Energy and Science (EES). Enterococci belong to Enterococcus, a large genus of lactic acid bacteria in the phylum Firmicutes. They can cause infections that are usually mild but can become severe, even fatal, if they reach certain parts of the body (“Enterococcus Faecalis: Causes, Symptoms, and Treatments”, 2020). The data set records the level of Enterococci in the water. Across the two files the analysis uses six variables: the Enterococci level, the date of sampling and the site name for each of Avalon Beach and Bilarong Beach. The two locations were chosen by observation, based on how active the Enterococci levels were and how popular each site is. Data irrelevant to the research questions were removed in Excel. In both files the variable “Date” was changed from character to date format. In addition, two readings for Bilarong Beach (500 and 700 cfu/100ml) were deleted in Excel because they looked highly unlikely and were probably errors. One limitation of the data is that people in and around the water may make sampling harder, especially on weekends. Another is accuracy: it is impossible to test the whole body of water, even assuming the latest technology is used. The aim of this project is to supply relevant data to vulnerable people, meaning those with an injury or, more specifically, an open wound through which bacteria can enter the body. They can check this data when deciding which beach to visit.
The project picks two areas and then one beach from each area, on the assumption that nearby beaches have similar Enterococci levels. The two beaches chosen are Avalon Beach and Bilarong Beach. The data retrieved from EES are considered highly reliable because EES is trusted and funded by the government (“Enterococci data download”, 2020). In conclusion, the data has six variables across two locations, and the project aims to inform people who are vulnerable to the bacteria by comparing the two beach locations.
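The cleaning described above was done in Excel. As a minimal sketch, the same two steps (converting the date column and dropping the implausible readings) could be reproduced in R so that they are recorded in the report. The data frame `raw`, its values and the `cutoff` of 500 are hypothetical and chosen only for illustration; they are not the EES data.

```r
# Hypothetical raw readings for illustration; NOT the real EES data.
raw <- data.frame(
  Site = "Bilarong Reserve (Narrabeen Lagoon)",
  Date = c("2018-02-07", "2018-02-19", "2018-03-01", "2018-03-15"),
  Enterococci = c(18, 700, 23, 500),
  stringsAsFactors = FALSE
)

# Convert the character column to a proper date type.
raw$Date <- as.Date(raw$Date, format = "%Y-%m-%d")

# Drop readings at or above an assumed cut-off, keeping missing values.
cutoff <- 500
cleaned <- raw[is.na(raw$Enterococci) | raw$Enterococci < cutoff, ]

nrow(cleaned)
```

Filtering in code rather than by hand keeps the removed readings visible to anyone who reruns the analysis.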


2.2 Research Question 1

For each beach near Mona Vale, NSW, Australia, when during the year is the level of Enterococci bacteria lowest, and is it safe to swim at either beach?

#Research Question 1
#This is the summary for Enterococci bacteria in Billarong
    BacteriaB <- Billarong$`Enterococci (cfu/100ml)`
    summary(BacteriaB)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    0.00    3.75   12.00   29.50   38.50  180.00       2
#This is the summary for Enterococci bacteria in AvalonBeach
    BacteriaA <- AvalonBeach$`Enterococci (cfu/100ml)`
    summary(BacteriaA)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00    1.81    1.00   33.00

A radius of 5 km was drawn around Mona Vale, NSW, Australia, and the two beaches furthest from Mona Vale within that radius were chosen by observation. There were 58 tests at each beach, and the bar plots of tests during the year for Avalon Beach and for Bilarong Beach show that the tests were done at similar times.

#Number of tests per month in Billarong
#month.name holds the twelve English month names in calendar order
OrdermonthB = factor(Billarong$Month, levels = month.name)
barplot(table(OrdermonthB),main="Tests during the year for Billarong Beach", col = sapply(11:24/24,hsv,0.9,0.9), las =3)

#Number of tests per month in Avalon Beach
OrdermonthA = factor(AvalonBeach$Month, levels = month.name)
barplot(table(OrdermonthA),main="Tests during the year for Avalon Beach", col = sapply(11:24/24,hsv,0.9,0.3), las =3)


The two scatterplots show the Enterococci level at Bilarong Beach and Avalon Beach throughout the year, from January to December. Comparing the plots answers the research question. At Bilarong Beach, 45 test results from January to December were under 50 cfu/100ml. The level is lowest in April, and October is another good time to visit, since the results in both months were between 0 and 10 cfu/100ml.
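The lowest month and the number of results under 50 cfu/100ml were read off the plot by eye. A minimal sketch of how both could be computed follows. The `readings` data frame is hypothetical, and using the median as the monthly summary and 50 as the reference level are assumptions made for this example.

```r
# Hypothetical readings for illustration; NOT the real EES data.
readings <- data.frame(
  Date = as.Date(c("2018-01-10", "2018-01-24", "2018-04-05",
                   "2018-04-19", "2018-10-03", "2018-10-31")),
  Enterococci = c(60, 40, 2, NA, 8, 6)
)

# Derive the month from the date via month.name, so the result
# does not depend on the locale of the machine.
month_index <- as.integer(format(readings$Date, "%m"))
readings$Month <- factor(month.name[month_index], levels = month.name)

# Median per month; the formula interface drops rows with NA by default.
monthly <- aggregate(Enterococci ~ Month, data = readings, FUN = median)
lowest_month <- as.character(monthly$Month[which.min(monthly$Enterococci)])

# Number of non-missing results below the assumed reference level.
under_50 <- sum(readings$Enterococci < 50, na.rm = TRUE)

lowest_month
under_50
```

The median is used here because a single large reading can dominate a monthly mean; the mean would be an equally valid choice if that is the quantity of interest.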

plot(Billarong$Date, Billarong$`Enterococci (cfu/100ml)`, xlab="Dates tested", ylab="Enterococci Bacteria cfu/100ml",col="blue",pch=21, main="Bacteria in Billarong Beach")
#Horizontal reference lines every 50 cfu/100ml
abline(h=c(0,50,100,150))

plot(AvalonBeach$Date, AvalonBeach$`Enterococci (cfu/100ml)`, xlab="Dates tested", ylab="Enterococci Bacteria cfu/100ml",col="blue",pch=21, main="Bacteria in Avalon Beach")
#Horizontal reference lines every 10 cfu/100ml
abline(h=c(0,10,20,30))

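#Linear model of Avalon levels against Billarong levels
#Note: the two vectors are paired by row position, not by sampling date,
#and rows with an NA in either vector are dropped by lm()
BacteriaAvalon = AvalonBeach$`Enterococci (cfu/100ml)`
BacteriaBillarong = Billarong$`Enterococci (cfu/100ml)`
lm(BacteriaAvalon~BacteriaBillarong)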
## 
## Call:
## lm(formula = BacteriaAvalon ~ BacteriaBillarong)
## 
## Coefficients:
##       (Intercept)  BacteriaBillarong  
##          2.083916          -0.008898
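The model above pairs the two beaches by row position, and the `head()` output shows that the rows of the two files do not share the same dates. One possible alternative, sketched here under the assumption that only days on which both sites were sampled should be compared, is to join the tables on `Date` first. The data frames and values below are hypothetical.

```r
# Hypothetical readings for illustration; NOT the real EES data.
avalon <- data.frame(
  Date = as.Date(c("2018-01-25", "2018-02-07", "2018-02-19", "2018-03-08")),
  Avalon = c(4, 0, 2, 1)
)
bilarong <- data.frame(
  Date = as.Date(c("2018-02-07", "2018-02-19", "2018-03-08", "2018-05-21")),
  Bilarong = c(20, 70, 30, 25)
)

# Inner join: keep only dates on which both sites were sampled.
paired <- merge(avalon, bilarong, by = "Date")

# Regress Avalon on Bilarong using the date-matched pairs.
fit <- lm(Avalon ~ Bilarong, data = paired)
coef(fit)
```

With the real data, the number of rows in `paired` shows how many sampling days the two sites actually have in common.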

Levels at Avalon Beach are much lower than at Bilarong Beach, with a maximum of 33 cfu/100ml. Its mean is also lower than Bilarong's, by 29.50 - 1.81 = 27.69 cfu/100ml. People can visit Avalon Beach at most times of the year.


As further support, a similar finding was reported for the whole State: “Swimmers in NSW can paddle with peace of mind this summer, with beaches and waterways across the State registering the cleanest results on record in the 2019-2020 State of the Beaches report.” (“Record results for State’s swimming spots as we head into summer”, 2020). This includes Avalon Beach and Bilarong Beach. Even though the bacteria level is higher at Bilarong, it is still safe to swim there.

2.3 Research Question 2

Why does Avalon Beach have a lower Enterococci level than Bilarong Beach?

Looking at the two locations on a map, Bilarong has a smaller body of water with an almost closed entrance to the north-east, whereas Avalon Beach is fully open to the ocean on one side. This could explain why the Enterococci level is lower at Avalon than at Bilarong. Bilarong is really more of a lagoon than a beach.
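The one-sample tests later in this section compare each beach with zero rather than with the other beach. A direct comparison could be sketched as follows, using only the first ten readings of each beach as printed by `str()` in the Initial Data Analysis. The choice of a Welch t-test and a Wilcoxon rank-sum test is an assumption made for this example, not the method used in the report.

```r
# First ten readings of each beach as printed by str() earlier in the report.
avalon_cfu   <- c(6, 0, 1, 1, 2, 0, 33, 5, 0, 0)
bilarong_cfu <- c(18, 77, 23, 17, NA, 8, 13, 50, 180, 5)

# Welch two-sample t-test: is the Bilarong mean greater than the Avalon mean?
# t.test() removes missing values itself.
welch <- t.test(bilarong_cfu, avalon_cfu, alternative = "greater")

# Rank-based alternative that does not rely on normality.
# exact = FALSE avoids the warning caused by tied values.
ranks <- wilcox.test(bilarong_cfu, avalon_cfu,
                     alternative = "greater", exact = FALSE)

welch$p.value
ranks$p.value
```

Because the readings are strongly skewed, the rank-based test may be the safer of the two. Neither test says anything about why the levels differ, so the map-based explanation remains a hypothesis.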

#Checking the normality assumption of the t-test for the bacteria data of both beaches
#by bootstrapping the sample mean

BA.sim <- replicate(10000, mean(sample(BacteriaA, size = 58, replace = TRUE)))
hist(BA.sim) 

#na.rm = TRUE because BacteriaB contains 2 NA values
BB.sim <- replicate(10000, mean(sample(BacteriaB, size = 58, replace = TRUE), na.rm = TRUE))
hist(BB.sim) 

#Both histograms show a heavy tail on one side, so the normality assumption is unreliable
teststatforA = (mean(BacteriaA) - 0)/(sd(BacteriaA)/sqrt(length(BacteriaA)))
teststatforA
## [1] 2.821613
teststatforB = (mean(BacteriaB) - 0)/(sd(BacteriaB)/sqrt(length(BacteriaB)))
teststatforB
## [1] NA
#teststatforB is NA because BacteriaB contains 2 NA values; t.test() below removes them (df = 55)
pt(teststatforA, 57, lower.tail = FALSE)
## [1] 0.003283006
#The p-value is low
#t-test for the bacteria at each beach, with full details
t.test(mu = 0, BacteriaA, alternative = "greater")
## 
##  One Sample t-test
## 
## data:  BacteriaA
## t = 2.8216, df = 57, p-value = 0.003283
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
##  0.7375723       Inf
## sample estimates:
## mean of x 
##  1.810345
t.test(mu = 0, BacteriaB, alternative = "greater")
## 
##  One Sample t-test
## 
## data:  BacteriaB
## t = 5.2358, df = 55, p-value = 1.329e-06
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
##  20.07361      Inf
## sample estimates:
## mean of x 
##      29.5
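The manual statistic for Bilarong returned `NA` because `mean()` and `sd()` propagate missing values. A minimal sketch of an NA-aware version follows, using only the first ten Bilarong readings printed by `str()` earlier; the variable names are hypothetical.

```r
# First ten Bilarong readings as printed by str() earlier (includes one NA).
bilarong_cfu <- c(18, 77, 23, 17, NA, 8, 13, 50, 180, 5)

# Work only with the observed values so mean(), sd() and n all agree.
observed <- bilarong_cfu[!is.na(bilarong_cfu)]
n <- length(observed)

# One-sample t statistic against mu = 0.
t_manual <- (mean(observed) - 0) / (sd(observed) / sqrt(n))

# One-sided p-value with n - 1 degrees of freedom.
p_manual <- pt(t_manual, df = n - 1, lower.tail = FALSE)

# Cross-check against t.test(), which drops NAs on its own.
check <- t.test(bilarong_cfu, mu = 0, alternative = "greater")
all.equal(unname(check$statistic), t_manual)
```

Note that `n` counts only the observed values. Using `length()` on the original vector would still count the missing readings and give the wrong standard error.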


3 References

Style: APA

Enterococci data download. (2020). Retrieved 15 November 2020, from https://www.environment.nsw.gov.au/beachapp/report_enterococci.aspx/

Enterococcus Faecalis: Causes, Symptoms, and Treatments. (2020). Retrieved 15 November 2020, from https://www.healthline.com/health/enterococcus-faecalis

Record results for State’s swimming spots as we head into summer. (2020). Retrieved 15 November 2020, from https://www.miragenews.com/record-results-for-state-s-swimming-spots-as-we-head-into-summer/