1 Water Quality Sampling Data

In this assignment we will look at Water Quality Data from the City of Austin’s online data portal: https://data.austintexas.gov/Environment/Water-Quality-Sampling-Data/5tye-7ray.

The dataset contains the results of about 1,000 water quality tests performed on water bodies in Austin in 2020.

We will use tidyverse packages to clean and study the dataset.

1.1 Load libraries and Import Data

library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ───────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

We will import the CSV directly from the City of Austin site and study the data structure before deciding what analysis we would like to perform on it.

water <- read_csv('https://data.austintexas.gov/resource/5tye-7ray.csv')
glimpse(water) # studying the data structure
## Rows: 1,000
## Columns: 24
## $ watershed       <chr> "Lady Bird Lake", "Lady Bird Lake", "Lady Bird Lake",…
## $ sample_date     <dttm> 2020-08-18 15:10:00, 2020-08-18 15:10:00, 2020-08-18…
## $ site_name       <chr> "Lagoon at Festival Beach", "Lagoon at Festival Beach…
## $ site_type       <chr> "Lake", "Lake", "Lake", "Lake", "Lake", "Lake", "Lake…
## $ medium          <chr> "Surface Water", "Surface Water", "Surface Water", "S…
## $ param_type      <chr> "Solids/Conductivity", "Flow/Rainfall", "Alkalinity/H…
## $ parameter       <chr> "CONDUCTIVITY", "DAYS AFTER STORM", "PH", "FLOW SEVER…
## $ qualifier       <chr> NA, ">", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ result          <dbl> 475.40, 14.00, 8.00, 3.00, 22514.00, 9.88, 1.00, 30.7…
## $ unit            <chr> "uS/cm", "Days", "Standard units", "None", "None", "M…
## $ filter          <chr> "Total", "Total", "Total", "Total", "Total", "Dissolv…
## $ sample_id       <chr> "1997-Lagoon @ Festival beach SURF", "1997-Lagoon @ F…
## $ sample_site_no  <dbl> 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1, 1,…
## $ depth_in_meters <dbl> 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 7.3, 7.3, 7.3…
## $ method          <chr> "HYDROLAB", "NONE", "HYDROLAB", "TCEQ FLOW SEVERITY",…
## $ qc_flag         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ project         <chr> "Lady Bird Lake Harmful Algal Bloom Study", "Lady Bir…
## $ location        <chr> "\n,  \n(30.247941716350812, -97.72454604506493)", "\…
## $ ref_no          <dbl> 2794085, 2794103, 2794066, 2794152, 2794194, 2794080,…
## $ lat_dd_wgs84    <dbl> 30.24794, 30.24794, 30.24794, 30.24794, 30.24794, 30.…
## $ lon_dd_wgs84    <dbl> -97.72455, -97.72455, -97.72455, -97.72455, -97.72455…
## $ sample_ref_no   <dbl> 572564, 572564, 572564, 572564, 572564, 572564, 57257…
## $ time_null       <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
## $ qc_type         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…

1.2 Filtering the dataset

After studying the dataset, I have decided to focus my analysis on the pH level and water temperature for these observations. Therefore, I will keep only the fields that I am interested in.

# keep only the columns of interest, renaming them as new_name = old_name
water <- select(water,
                Site_Name = site_name,
                Site_Type = site_type,
                Sample_Time = sample_date,
                Parameter_Type = param_type,
                Parameter = parameter,
                Results = result,
                Unit = unit)
glimpse(water)
## Rows: 1,000
## Columns: 7
## $ Site_Name      <chr> "Lagoon at Festival Beach", "Lagoon at Festival Beach"…
## $ Site_Type      <chr> "Lake", "Lake", "Lake", "Lake", "Lake", "Lake", "Lake"…
## $ Sample_Time    <dttm> 2020-08-18 15:10:00, 2020-08-18 15:10:00, 2020-08-18 …
## $ Parameter_Type <chr> "Solids/Conductivity", "Flow/Rainfall", "Alkalinity/Ha…
## $ Parameter      <chr> "CONDUCTIVITY", "DAYS AFTER STORM", "PH", "FLOW SEVERI…
## $ Results        <dbl> 475.40, 14.00, 8.00, 3.00, 22514.00, 9.88, 1.00, 30.77…
## $ Unit           <chr> "uS/cm", "Days", "Standard units", "None", "None", "MG…

Now that we have narrowed our dataset to the variables of interest, let’s filter it further to the observations where the parameter is pH or water temperature.

unique(water$Parameter) #looking at unique parameter values
##   [1] "CONDUCTIVITY"                                                 
##   [2] "DAYS AFTER STORM"                                             
##   [3] "PH"                                                           
##   [4] "FLOW SEVERITY CODE (1=NONE;2=LOW;3=NORM;4=FLOOD;5=HIGH;6=DRY)"
##   [5] "FIELD INSTRUMENT SERIAL NUMBER"                               
##   [6] "DISSOLVED OXYGEN"                                             
##   [7] "CODE FOR SAMPLE COLLECTION APP"                               
##   [8] "WATER TEMPERATURE"                                            
##   [9] "SECCHI DISK DEPTH"                                            
##  [10] "NUMBER OF AUSTIN BLIND SALAMANDERS PHOTOGRAPHED"              
##  [11] "TOTAL TIME SPENT"                                             
##  [12] "LIGHT INTENSITY"                                              
##  [13] "BSS SALS NOT PHOTGRAPHED >2 INCHES"                           
##  [14] "BSS SALS NOT PHOTGRAPHED <1 INCH"                             
##  [15] "ABS SALS NOT PHOTGRAPHED <1 INCH"                             
##  [16] "ABS SALS NOT PHOTGRAPHED >2 INCHES"                           
##  [17] "BSS SALS NOT PHOTGRAPHED 1-2 INCHES"                          
##  [18] "ABS SALS NOT PHOTGRAPHED 1-2 INCHES"                          
##  [19] "NUMBER OF BARTON SPRINGS SALAMANDERS PHOTOGRAPHED"            
##  [20] "FLOW"                                                         
##  [21] "POECILIIDAE (GAMBUSIA)"                                       
##  [22] "BASS (MICROPTERUS)"                                           
##  [23] "CICHLIDAE"                                                    
##  [24] "SUNFISH (LEPOMIS)"                                            
##  [25] "OTHER FISH"                                                   
##  [26] "OXIDATION-REDUCTION_POTENTIAL"                                
##  [27] "PLANT HEIGHT"                                                 
##  [28] "RUHU RUELLIA HUMILIS"                                         
##  [29] "MEAZ MELIA AZEDARACH"                                         
##  [30] "ACOS ACALYPHA OSTRYIFOLIA"                                    
##  [31] "RUNU RUELLIA NUDIFLORA"                                       
##  [32] "MIJA MIRABILIS JALAPA"                                        
##  [33] "CAVI2 CALYPTOCARPUS VIALIS"                                   
##  [34] "BRCA6 BROMUS CATHARTICUS"                                     
##  [35] "PERCENT COVER"                                                
##  [36] "DEPA6 DESMODIUM PANICULATUM"                                  
##  [37] "ACPH3 ACALYPHA PHLEOIDES"                                     
##  [38] "CAIL2 CARYA ILLINOINENSIS"                                    
##  [39] "CANOPY COVER"                                                 
##  [40] "CELA CELTIS LAEVIGATA"                                        
##  [41] "COCA COCCULUS CAROLINUS"                                      
##  [42] "ABWR ABUTILON WRIGHTII"                                       
##  [43] "OXDI2 OXALIS DILLENII"                                        
##  [44] "PAHY PARTHENIUM HYSTEROPHORUS"                                
##  [45] "PADI3 PASPALUM DILATATUM"                                     
##  [46] "SOHA SORGHUM HALEPENSE"                                       
##  [47] "RHYNCOSIA SP"                                                 
##  [48] "CHPR6 CHAMAESYCE PROSTRATA"                                   
##  [49] "RHPH2 RHYNCHOSIDA PHYSOCALYX"                                 
##  [50] "CAREX CAREX SPP."                                             
##  [51] "IPCOC2 IPOMOEA CORDATOTRILOBA VAR. CORDATOTRILOBA"            
##  [52] "CYES CYPERUS ESCULENTUS"                                      
##  [53] "TOAR TORILIS ARVENSIS"                                        
##  [54] "MOCI MONARDA CITRIODORA"                                      
##  [55] "EUDE4 EUPHORBIA DENTATA"                                      
##  [56] "ULCR ULMUS CRASSIFOLIA"                                       
##  [57] "AMTR AMBROSIA TRIFIDA"                                        
##  [58] "TORA2 TOXICODENDRON RADICANS"                                 
##  [59] "PAQU2 PARTHENOCISSUS QUINQUEFOLIA"                            
##  [60] "SILA20 SIDEROXYLON LANUGINOSUM"                               
##  [61] "VIMU2 VITIS MUSTANGENSIS"                                     
##  [62] "TRDA3 TRIPSACUM DACTYLOIDES"                                  
##  [63] "TAOFO TARAXACUM OFFICINALE"                                   
##  [64] "TRAGI TRAGIA SPP."                                            
##  [65] "ELVI3 ELYMUS VIRGINICUS"                                      
##  [66] "UNKNOWN PLANT 1"                                              
##  [67] "SMBO2 SMILAX BONA-NOX"                                        
##  [68] "VIMO2 VITIS MONTICOLA"                                        
##  [69] "RIHU2 RIVINA HUMILIS"                                         
##  [70] "ULAM ULMUS AMERICANA"                                         
##  [71] "CANOPY COVER CENTER"                                          
##  [72] "LITTER"                                                       
##  [73] "NUMBER OF ROCKS SCRAPED"                                      
##  [74] "VELOCITY/DEPTH REGIMES"                                       
##  [75] "RIPARIAN VEGETATIVE ZONE WIDTH (RIGHT BANK)"                  
##  [76] "PERCENT ALGAE COVER"                                          
##  [77] "VEGETATIVE PROTECTION (LEFT BANK)"                            
##  [78] "BANK STABILITY (RIGHT BANK)"                                  
##  [79] "FREQUENCY OF RIFFLES"                                         
##  [80] "NUMBER OF SURBERS"                                            
##  [81] "BANK STABILITY (LEFT BANK)"                                   
##  [82] "CANOPY COVER UPSTREAM"                                        
##  [83] "CLARITY"                                                      
##  [84] "SEDIMENT DEPOSITION"                                          
##  [85] "SURFACE APPEARANCE"                                           
##  [86] "VEGETATIVE PROTECTION (RIGHT BANK)"                           
##  [87] "EMBEDDEDNESS"                                                 
##  [88] "RIPARIAN VEGETATIVE ZONE WIDTH (LEFT BANK)"                   
##  [89] "# OF GRIDS SUBSAMPLED"                                        
##  [90] "ODOR"                                                         
##  [91] "CANOPY COVER DOWNSTREAM"                                      
##  [92] "EPIFAUNAL SUBSTRATE"                                          
##  [93] "CHANNEL ALTERATION"                                           
##  [94] "CHANNEL FLOW STATUS"                                          
##  [95] "SESC2 SETARIA SCHEEELEI"                                      
##  [96] "DEIL DESMANTHUS ILLINOENSIS"                                  
##  [97] "CYDA CYNODON DACTYLON"                                        
##  [98] "UNKNOWN GRASS 1"                                              
##  [99] "PHVI17 PHYSALIS VISCOSA"                                      
## [100] "FRAXI FRAXINUS SPP."                                          
## [101] "AREA SAMPLED"                                                 
## [102] "HEHE HEDERA HELIX"                                            
## [103] "ARAL3 ARGEMONE ALBIFLORA"                                     
## [104] "HEAN3 HELIANTHUS ANNUUS"                                      
## [105] "UNKNOWN GRASS 2"                                              
## [106] "NALE3 NASSELLA LEUCOTRICHA"
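With this many distinct values, it can also help to see how many observations each parameter has, not just which values exist. The following is a minimal sketch on a small made-up tibble; `toy` and its values are illustrative assumptions, not rows from the dataset.

```r
library(dplyr)

# Hypothetical miniature of the water tibble, for illustration only
toy <- tibble(
  Parameter = c("PH", "WATER TEMPERATURE", "PH", "CONDUCTIVITY", "PH"),
  Results   = c(7.6, 27.8, 7.9, 475.4, 8.0)
)

# Number of rows per parameter, most frequent first
param_counts <- count(toy, Parameter, sort = TRUE)
param_counts
```

Running the same `count()` call on the real `water` tibble should show how many PH and WATER TEMPERATURE rows are available before filtering.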

There are a lot of values stored under Parameter (106 in total). However, I am only interested in water pH and temperature. I will create another tibble, water2, that is a subset of the water tibble and contains only the observations where Parameter is PH or WATER TEMPERATURE.

water2 <- filter(water, Parameter %in% c('PH', 'WATER TEMPERATURE'))
knitr::kable(water2)
| Site_Name | Site_Type | Sample_Time | Parameter_Type | Parameter | Results | Unit |
|:---|:---|:---|:---|:---|---:|:---|
| Lagoon at Festival Beach | Lake | 2020-08-18 15:10:00 | Alkalinity/Hardness/pH | PH | 8.00 | Standard units |
| Lagoon at Festival Beach | Lake | 2020-08-18 15:10:00 | Conventionals | WATER TEMPERATURE | 30.77 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:25:00 | Alkalinity/Hardness/pH | PH | 7.26 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:25:00 | Conventionals | WATER TEMPERATURE | 27.82 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 28.15 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 27.97 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 27.82 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.63 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.93 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 30.33 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 28.81 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 8.00 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 28.24 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.68 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.28 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.38 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.55 | Standard units |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 29.12 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Conventionals | WATER TEMPERATURE | 28.28 | Deg. Celsius |
| Lady Bird Lake @ Basin (AC) | Lake | 2020-08-18 14:15:00 | Alkalinity/Hardness/pH | PH | 7.97 | Standard units |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:45:00 | Alkalinity/Hardness/pH | PH | 7.46 | Standard units |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:45:00 | Conventionals | WATER TEMPERATURE | 27.69 | Deg. Celsius |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Conventionals | WATER TEMPERATURE | 27.84 | Deg. Celsius |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Alkalinity/Hardness/pH | PH | 7.62 | Standard units |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Conventionals | WATER TEMPERATURE | 27.71 | Deg. Celsius |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Alkalinity/Hardness/pH | PH | 7.65 | Standard units |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Alkalinity/Hardness/pH | PH | 7.67 | Standard units |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Alkalinity/Hardness/pH | PH | 7.60 | Standard units |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Conventionals | WATER TEMPERATURE | 27.77 | Deg. Celsius |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Conventionals | WATER TEMPERATURE | 27.73 | Deg. Celsius |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Conventionals | WATER TEMPERATURE | 28.01 | Deg. Celsius |
| Lady Bird Lake @ 1st St (CC) | Lake | 2020-08-18 13:35:00 | Alkalinity/Hardness/pH | PH | 7.61 | Standard units |
| Lady Bird Lake @ Shoal Creek | Lake | 2020-08-18 13:20:00 | Alkalinity/Hardness/pH | PH | 7.88 | Standard units |
| Lady Bird Lake @ Shoal Creek | Lake | 2020-08-18 13:20:00 | Conventionals | WATER TEMPERATURE | 30.17 | Deg. Celsius |
| Lady Bird Lake @ Powerline | Lake | 2020-08-18 12:55:00 | Alkalinity/Hardness/pH | PH | 7.70 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-08-18 12:55:00 | Conventionals | WATER TEMPERATURE | 28.23 | Deg. Celsius |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-08-18 12:30:00 | Alkalinity/Hardness/pH | PH | 7.63 | Standard units |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-08-18 12:30:00 | Conventionals | WATER TEMPERATURE | 26.02 | Deg. Celsius |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:50:00 | Alkalinity/Hardness/pH | PH | 7.61 | Standard units |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:50:00 | Conventionals | WATER TEMPERATURE | 27.20 | Deg. Celsius |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Conventionals | WATER TEMPERATURE | 27.46 | Deg. Celsius |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Alkalinity/Hardness/pH | PH | 7.72 | Standard units |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Alkalinity/Hardness/pH | PH | 7.71 | Standard units |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Conventionals | WATER TEMPERATURE | 27.40 | Deg. Celsius |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Alkalinity/Hardness/pH | PH | 7.77 | Standard units |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Conventionals | WATER TEMPERATURE | 27.47 | Deg. Celsius |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Alkalinity/Hardness/pH | PH | 7.75 | Standard units |
| Lady Bird Lake @ Red Bud Isle (EC) | Lake | 2020-08-18 11:40:00 | Conventionals | WATER TEMPERATURE | 27.49 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-08-18 11:20:00 | Alkalinity/Hardness/pH | PH | 7.57 | Standard units |
| Redbud West of Parking Lot | Lake | 2020-08-18 11:20:00 | Conventionals | WATER TEMPERATURE | 27.04 | Deg. Celsius |
| 6012 Florencia Lane | Spring | 2020-08-14 10:15:00 | Conventionals | WATER TEMPERATURE | 29.08 | Deg. Celsius |
| 6012 Florencia Lane | Spring | 2020-08-14 10:15:00 | Alkalinity/Hardness/pH | PH | 6.95 | Standard units |
| Barton Spring Pool @ Downstream Dam | Stream | 2020-08-12 12:10:00 | Conventionals | WATER TEMPERATURE | 22.17 | Deg. Celsius |
| Barton Spring Pool @ Downstream Dam | Stream | 2020-08-12 12:10:00 | Alkalinity/Hardness/pH | PH | 7.04 | Standard units |
| Barton Spring | Spring | 2020-08-12 12:05:00 | Alkalinity/Hardness/pH | PH | 6.97 | Standard units |
| Barton Spring | Spring | 2020-08-12 12:05:00 | Conventionals | WATER TEMPERATURE | 21.61 | Deg. Celsius |
| Eliza Spring | Spring | 2020-08-12 08:37:00 | Conventionals | WATER TEMPERATURE | 21.56 | Deg. Celsius |
| Eliza Spring | Spring | 2020-08-12 08:37:00 | Alkalinity/Hardness/pH | PH | 7.18 | Standard units |
| Lagoon at Festival Beach | Lake | 2020-08-11 12:40:00 | Conventionals | WATER TEMPERATURE | 29.07 | Deg. Celsius |
| Lagoon at Festival Beach | Lake | 2020-08-11 12:40:00 | Alkalinity/Hardness/pH | PH | 7.69 | Standard units |
| Redbud West of Parking Lot | Lake | 2020-08-11 11:55:00 | Conventionals | WATER TEMPERATURE | 27.74 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-08-11 11:55:00 | Alkalinity/Hardness/pH | PH | 7.58 | Standard units |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-08-11 11:10:00 | Conventionals | WATER TEMPERATURE | 23.07 | Deg. Celsius |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-08-11 11:10:00 | Alkalinity/Hardness/pH | PH | 7.39 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-08-11 10:40:00 | Alkalinity/Hardness/pH | PH | 7.71 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-08-11 10:40:00 | Conventionals | WATER TEMPERATURE | 27.45 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-08-04 12:40:00 | Conventionals | WATER TEMPERATURE | 27.20 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-08-04 12:40:00 | Alkalinity/Hardness/pH | PH | 7.66 | Standard units |
| Lagoon at Festival Beach | Lake | 2020-08-04 12:05:00 | Alkalinity/Hardness/pH | PH | 8.09 | Standard units |
| Lagoon at Festival Beach | Lake | 2020-08-04 12:05:00 | Conventionals | WATER TEMPERATURE | 30.27 | Deg. Celsius |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-08-04 10:55:00 | Conventionals | WATER TEMPERATURE | 23.26 | Deg. Celsius |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-08-04 10:55:00 | Alkalinity/Hardness/pH | PH | 7.09 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-08-04 10:30:00 | Alkalinity/Hardness/pH | PH | 7.54 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-08-04 10:30:00 | Conventionals | WATER TEMPERATURE | 28.06 | Deg. Celsius |
| Lagoon at Festival Beach | Lake | 2020-07-29 11:35:00 | Alkalinity/Hardness/pH | PH | 7.70 | Standard units |
| Lagoon at Festival Beach | Lake | 2020-07-29 11:35:00 | Conventionals | WATER TEMPERATURE | 28.65 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-07-29 10:55:00 | Conventionals | WATER TEMPERATURE | 26.21 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-07-29 10:55:00 | Alkalinity/Hardness/pH | PH | 7.22 | Standard units |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-07-29 10:20:00 | Conventionals | WATER TEMPERATURE | 21.85 | Deg. Celsius |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-07-29 10:20:00 | Alkalinity/Hardness/pH | PH | 6.94 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-07-29 09:45:00 | Conventionals | WATER TEMPERATURE | 27.13 | Deg. Celsius |
| Lady Bird Lake @ Powerline | Lake | 2020-07-29 09:45:00 | Alkalinity/Hardness/pH | PH | 7.60 | Standard units |
| Waller Creek Downstream of Cesar Chavez | Stream | 2020-07-22 14:25:00 | Alkalinity/Hardness/pH | PH | 8.34 | Standard units |
| Waller Creek Downstream of Cesar Chavez | Stream | 2020-07-22 14:25:00 | Conventionals | WATER TEMPERATURE | 28.99 | Deg. Celsius |
| Waller Creek Upstream of 23rd Street | Stream | 2020-07-22 13:50:00 | Alkalinity/Hardness/pH | PH | 7.97 | Standard units |
| Waller Creek Upstream of 23rd Street | Stream | 2020-07-22 13:50:00 | Conventionals | WATER TEMPERATURE | 27.18 | Deg. Celsius |
| Waller Creek @ Shipe Park | Stream | 2020-07-22 13:25:00 | Conventionals | WATER TEMPERATURE | 28.06 | Deg. Celsius |
| Waller Creek @ Shipe Park | Stream | 2020-07-22 13:25:00 | Alkalinity/Hardness/pH | PH | 7.92 | Standard units |
| Spicewood Tributary Downstream of Spicewood Spring | Stream | 2020-07-22 12:00:00 | Alkalinity/Hardness/pH | PH | 7.36 | Standard units |
| Spicewood Tributary Downstream of Spicewood Spring | Stream | 2020-07-22 12:00:00 | Conventionals | WATER TEMPERATURE | 25.09 | Deg. Celsius |
| Lagoon at Festival Beach | Lake | 2020-07-22 11:55:00 | Conventionals | WATER TEMPERATURE | 29.99 | Deg. Celsius |
| Lagoon at Festival Beach | Lake | 2020-07-22 11:55:00 | Alkalinity/Hardness/pH | PH | 7.98 | Standard units |
| Taylor Slough South @ Reed Park (TSS) | Stream | 2020-07-22 11:35:00 | Alkalinity/Hardness/pH | PH | 8.08 | Standard units |
| Taylor Slough South @ Reed Park (TSS) | Stream | 2020-07-22 11:35:00 | Conventionals | WATER TEMPERATURE | 25.83 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-07-22 11:05:00 | Conventionals | WATER TEMPERATURE | 27.52 | Deg. Celsius |
| Redbud West of Parking Lot | Lake | 2020-07-22 11:05:00 | Alkalinity/Hardness/pH | PH | 7.19 | Standard units |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-07-22 10:35:00 | Conventionals | WATER TEMPERATURE | 22.02 | Deg. Celsius |
| Barton Creek Mouth upstream @ Lady Bird Lake | Stream | 2020-07-22 10:35:00 | Alkalinity/Hardness/pH | PH | 7.01 | Standard units |
| Lady Bird Lake @ Powerline | Lake | 2020-07-22 10:00:00 | Conventionals | WATER TEMPERATURE | 27.47 | Deg. Celsius |
| Lady Bird Lake @ Powerline | Lake | 2020-07-22 10:00:00 | Alkalinity/Hardness/pH | PH | 7.55 | Standard units |
| Onion Creek @ South Austin Regional WWTP (SAR) | Stream | 2020-07-16 09:55:00 | Conventionals | WATER TEMPERATURE | 29.46 | Deg. Celsius |
| Onion Creek @ South Austin Regional WWTP (SAR) | Stream | 2020-07-16 09:55:00 | Alkalinity/Hardness/pH | PH | 7.70 | Standard units |
| Lagoon at Festival Beach | Lake | 2020-07-14 12:20:00 | Conventionals | WATER TEMPERATURE | 29.86 | Deg. Celsius |
| Lagoon at Festival Beach | Lake | 2020-07-14 12:20:00 | Alkalinity/Hardness/pH | PH | 6.99 | Standard units |

1.3 Removing blank values

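Blank or missing values can skew our results quite a bit, so we will check the Results column for them. The na.omit call below returns all 104 values, one for each row of water2, so there are no missing results to remove. Note that the call only prints the cleaned vector; it does not modify water2.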

na.omit(water2$Results)
##   [1]  8.00 30.77  7.26 27.82 28.15 27.97 27.82  7.63  7.93 30.33 28.81  8.00
##  [13] 28.24  7.68  7.28  7.38  7.55 29.12 28.28  7.97  7.46 27.69 27.84  7.62
##  [25] 27.71  7.65  7.67  7.60 27.77 27.73 28.01  7.61  7.88 30.17  7.70 28.23
##  [37]  7.63 26.02  7.61 27.20 27.46  7.72  7.71 27.40  7.77 27.47  7.75 27.49
##  [49]  7.57 27.04 29.08  6.95 22.17  7.04  6.97 21.61 21.56  7.18 29.07  7.69
##  [61] 27.74  7.58 23.07  7.39  7.71 27.45 27.20  7.66  8.09 30.27 23.26  7.09
##  [73]  7.54 28.06  7.70 28.65 26.21  7.22 21.85  6.94 27.13  7.60  8.34 28.99
##  [85]  7.97 27.18 28.06  7.92  7.36 25.09 29.99  7.98  8.08 25.83 27.52  7.19
##  [97] 22.02  7.01 27.47  7.55 29.46  7.70 29.86  6.99
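If a column did contain missing values, they would need to be dropped from the tibble itself and the result assigned back. A minimal sketch of that pattern, assuming a made-up tibble `toy` with one missing result (the names and values are illustrative only):

```r
library(dplyr)
library(tidyr)

# Hypothetical data with one missing result, for illustration only
toy <- tibble(
  Parameter = c("PH", "WATER TEMPERATURE", "PH"),
  Results   = c(7.6, NA, 7.9)
)

# Count the missing results before deciding whether to drop anything
n_missing <- sum(is.na(toy$Results))

# drop_na() returns a new tibble, so assign it to keep the change
toy_clean <- drop_na(toy, Results)
toy_clean
```

Counting first makes it explicit how many rows the cleaning step removes.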

1.4 Converting the table from long to wide using Spread

First, we will drop the columns we no longer need, namely unit and parameter type, since we already know what the corresponding values are for water pH and water temperature. We will overwrite our water2 tibble with a copy of itself that excludes these two columns.

water2 <- water2[,-c(4,7)] # corresponding column numbers for parameter type and unit
water2

Next, we will put the water temperature and pH values that were taken at the same time and at the same location in a single row, because they belong to the same observation and are just different variables. We will use tidyr’s spread function for this, calling spread(water2, Parameter, Results).

This call returns an error for some row numbers. Let’s investigate what the issue is for these rows by looking at the first five row numbers listed in the error message.

water2[c(23,27,28,31,32),]
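The failure can be reproduced on a small table. This is a sketch only, assuming a made-up tibble `toy` in which two PH readings share the same site and time; the site name and values are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical long table: two PH readings share the same site and time
toy <- tibble(
  Site_Name   = c("Site A", "Site A", "Site A"),
  Sample_Time = c("13:35", "13:35", "13:35"),
  Parameter   = c("PH", "PH", "WATER TEMPERATURE"),
  Results     = c(7.62, 7.65, 27.84)
)

# spread() needs a single value per output cell, so the repeated PH key fails;
# tryCatch() captures the error message instead of stopping the script
spread_result <- tryCatch(
  spread(toy, Parameter, Results),
  error = function(e) conditionMessage(e)
)
spread_result
```

Here `spread_result` holds the error message text, which names the rows that share a key.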

1.4.1 Duplicated

It looks like there are multiple PH values recorded at the same time (13:35 on 18 August 2020) at the same Lady Bird Lake site. So there are repeated measurements in our dataset. We do not have enough information to determine why this is the case, so we will simply remove the repeats using the duplicated function.

duplicate <- water2[,-5] # drop the 5th column (Results), because repeated measurements differ in their result values
duplicate2 <- which(duplicated(duplicate)) # row numbers of values which are duplicates of earlier observations
duplicate2
##  [1]  6  7  9 10 11 12 13 14 15 16 17 18 19 20 25 26 27 28 29 30 31 32 43 44 45
## [26] 46 47 48
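To make the behaviour of duplicated concrete, here is a minimal base R sketch on a made-up data frame (`keys` and its values are hypothetical). It shows that the first occurrence of a row is not flagged, only its later repeats.

```r
# Hypothetical key columns: rows 1, 2 and 4 describe the same measurement slot
keys <- data.frame(
  Site_Name = c("Site A", "Site A", "Site B", "Site A"),
  Parameter = c("PH", "PH", "PH", "PH")
)

# duplicated() marks every repeat of an earlier row; the first stays FALSE
dup_flags <- duplicated(keys)

# Row numbers of the repeats (rows 2 and 4 here)
dup_rows <- which(dup_flags)
dup_rows
```

Removing the flagged rows therefore keeps the first measurement of each set and discards the rest.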

There seem to be quite a few duplicate observations in our dataset (28 rows). We will remove them from our water2 tibble, which keeps the first measurement of each set, and try the spread again.

water2 <- water2[-duplicate2,]
water2_wide <- spread(water2,Parameter,Results)
water2_wide
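Dropping repeats keeps only the first reading of each set. A different option, not the approach taken in this analysis, would be to average the repeated readings before reshaping. A sketch under that assumption, on a made-up tibble `toy` with hypothetical values:

```r
library(dplyr)
library(tidyr)

# Hypothetical long table with repeated PH readings at one site and time
toy <- tibble(
  Site_Name   = c("Site A", "Site A", "Site A", "Site B", "Site B"),
  Sample_Time = c("13:35", "13:35", "13:35", "12:55", "12:55"),
  Parameter   = c("PH", "PH", "WATER TEMPERATURE", "PH", "WATER TEMPERATURE"),
  Results     = c(7.6, 7.8, 27.8, 7.7, 28.2)
)

# Average repeats within each site/time/parameter, then reshape to wide
toy_wide <- toy %>%
  group_by(Site_Name, Sample_Time, Parameter) %>%
  summarise(Results = mean(Results), .groups = "drop") %>%
  spread(Parameter, Results)
toy_wide
```

Whether averaging is appropriate depends on why the repeats exist, which this dataset extract does not tell us.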

1.4.2 Colnames

It looks like our spread worked this time. As a final cleanup, I’d like to rename the PH column to pH and the WATER TEMPERATURE column to Water_Temperature, using the colnames function.

colnames(water2_wide)[4] <- 'pH'
colnames(water2_wide)[5] <- 'Water_Temperature'
water2_wide
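Renaming by position works as long as the column order never changes. As an alternative sketch, dplyr's rename refers to columns by name instead; the tibble `toy_wide` below is a made-up stand-in with the same column names that spread produces.

```r
library(dplyr)

# Hypothetical wide table with the same column names that spread() produces
toy_wide <- tibble(
  Site_Name           = "Site A",
  PH                  = 7.6,
  `WATER TEMPERATURE` = 27.8
)

# rename(new = old) does not depend on column order
toy_wide <- rename(toy_wide, pH = PH, Water_Temperature = `WATER TEMPERATURE`)
names(toy_wide)
```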

1.5 Exploratory Analysis Using GGPlot

Now that we have a cleaned dataset, let’s look at the temperature and pH statistics:

pH BoxPlot

boxplot(water2_wide$pH)

It looks like the pH levels ranged from roughly 7 to 8.2 across all sites, with a median of about 7.6.
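The numbers behind a box plot can also be read off directly rather than estimated by eye. The sketch below uses a small subset of the pH readings listed earlier in this document, so its values will not match the full plot; the vector name `ph` is an assumption for illustration.

```r
# A few pH readings copied from the table above (a subset, not the full data)
ph <- c(6.95, 7.04, 7.26, 7.46, 7.61, 7.70, 7.88, 8.00, 8.34)

# The five values boxplot() draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker
ph_stats <- boxplot.stats(ph)$stats

# The mean is a separate statistic and can differ from the median
ph_mean <- mean(ph)

ph_stats
ph_mean
```

The thick line inside the box is the median (the third value), which is why the centre of a box plot should not be read as the average.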

Water Temperature Histogram

ggplot(data=water2_wide,aes(x=Water_Temperature,fill=Site_Type))+geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

For water temperature, it looks like the values for lakes may be slightly less varied (the temperature points sit fairly close together) than those for streams and springs.
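A per-group summary can back up this visual impression with numbers. The following is a sketch on a made-up tibble `toy_wide` with hypothetical temperatures; on the real data the same pipeline would be applied to water2_wide.

```r
library(dplyr)

# Hypothetical temperatures by site type, for illustration only
toy_wide <- tibble(
  Site_Type = c("Lake", "Lake", "Lake", "Stream", "Stream", "Spring", "Spring"),
  Water_Temperature = c(27.8, 28.2, 27.5, 22.0, 29.0, 21.6, 29.1)
)

# Count, range and standard deviation of temperature within each site type
temp_spread <- toy_wide %>%
  group_by(Site_Type) %>%
  summarise(
    n        = n(),
    min_temp = min(Water_Temperature),
    max_temp = max(Water_Temperature),
    sd_temp  = sd(Water_Temperature),
    .groups  = "drop"
  )
temp_spread
```

Including the count alongside the spread matters, since a group with only a few observations gives a much less reliable standard deviation.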

1.5.1 Is there a correlation between Water Temperature and pH level?

Let’s draw a scatterplot of pH and water temperature to gauge whether there may be a correlation between the two for each site type. We will also add a linear fit line to help us detect any associations.

ggplot(water2_wide,
       aes(pH,Water_Temperature, color = Site_Type))+
  geom_point()+
  geom_smooth(method = lm)
## `geom_smooth()` using formula 'y ~ x'

It appears that there is almost no correlation between the two, at least for lakes and streams (the fit lines are almost flat, perhaps slightly positive for streams). There may be a slightly negative correlation between the two for springs.
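The slope of a fit line is only a visual cue, so a correlation coefficient per site type gives a more direct check. This is a minimal sketch on a made-up tibble `toy_wide` with hypothetical values, not results from the dataset.

```r
library(dplyr)

# Hypothetical pH and temperature pairs, for illustration only
toy_wide <- tibble(
  Site_Type = rep(c("Lake", "Stream"), each = 4),
  pH = c(7.5, 7.6, 7.7, 7.8, 7.0, 7.4, 7.9, 8.3),
  Water_Temperature = c(27.9, 27.6, 27.8, 27.7, 22.0, 23.1, 27.2, 29.0)
)

# Pearson correlation between pH and temperature within each site type
cor_by_type <- toy_wide %>%
  group_by(Site_Type) %>%
  summarise(r = cor(pH, Water_Temperature), n = n(), .groups = "drop")
cor_by_type
```

With only a handful of points per group, as for springs in this dataset, any coefficient should be treated as a rough indication rather than evidence of a relationship.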