EPSCoR SCTC Aquatic Ecology

Analysis in progress

R Session Info

sessionInfo()
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.2
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] plotrix_3.7-4      lme4_1.1-19        Matrix_1.2-15     
##  [4] ggridges_0.5.1     anytime_0.3.3      knitr_1.21        
##  [7] nnet_7.3-12        cowplot_0.9.4      FSA_0.8.22        
## [10] stringi_1.2.4      janitor_1.1.1      ggpmisc_0.3.0     
## [13] ggrepel_0.8.0      broom_0.5.1        DT_0.5            
## [16] lubridate_1.7.4    forcats_0.3.0      stringr_1.3.1     
## [19] dplyr_0.7.8        purrr_0.2.5        readr_1.3.1       
## [22] tidyr_0.8.2        tibble_2.0.0       ggplot2_3.1.0     
## [25] tidyverse_1.2.1    googlesheets_0.3.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0         lattice_0.20-38    assertthat_0.2.0  
##  [4] digest_0.6.18      R6_2.3.0           cellranger_1.1.0  
##  [7] plyr_1.8.4         backports_1.1.3    RApiDatetime_0.0.4
## [10] evaluate_0.12      httr_1.4.0         pillar_1.3.1      
## [13] rlang_0.3.1        lazyeval_0.2.1     readxl_1.2.0      
## [16] rstudioapi_0.9.0   minqa_1.2.4        nloptr_1.2.1      
## [19] rmarkdown_1.11     splines_3.5.2      htmlwidgets_1.3   
## [22] munsell_0.5.0      compiler_3.5.2     modelr_0.1.2      
## [25] xfun_0.4           pkgconfig_2.0.2    htmltools_0.3.6   
## [28] tidyselect_0.2.5   crayon_1.3.4       withr_2.1.2       
## [31] MASS_7.3-51.1      grid_3.5.2         nlme_3.1-137      
## [34] jsonlite_1.6       gtable_0.2.0       magrittr_1.5      
## [37] scales_1.0.0       cli_1.0.1          bindrcpp_0.2.2    
## [40] xml2_1.2.0         generics_0.0.2     tools_3.5.2       
## [43] glue_1.3.0         hms_0.4.2          yaml_2.2.0        
## [46] colorspace_1.3-2   rvest_0.3.2        bindr_0.1.1       
## [49] haven_2.0.0

This document explores relationships between air and water temperature in the Kenai EPSCoR Aquatic Ecology Project.


a.) Import water temperature data. b.) Import air temperature data. c.) Determine air-water sensitivity values to be used in bioenergetics modeling.



Water temperatures

Import Water Temperature Data Note: this code differs somewhat from that in “season_temp_stats.Rmd”

# links?

### Identify source sheets
# Lower reach sites
# Water temperatures for lower sites have been curated so as to provide maximum constinuous temporal extent
lower_sites_2015 <- gs_title("lower_sites_2015")
lower_sites_2016 <- gs_title("lower_sites_2016")

## Special procedure for Ptarmigan Creek 2015 & 2016
# a.) Ptarmigan Creek 2015: We want to use the file generated/discussed from "2015 Water Temperature Analaysis.Rmd" for both (lower and middle) reaches.  Lower and Middle sites are similar enough (and close enough) that we want to use the same dataset for both sites.

# b.) Ptarmigan Creek 2016: we want to use the MPC-2016 site for both middle and lower sites; they are within tolerance margin (see "2016 Water Temperature Analaysis.Rmd").  MPC-2016 site has most continuous dataset.

# Methods narration:  We replaced missing water temperature data with data from the closest available logger site, if the two sites averaged an annual difference of < 0.2 C .  

# see "2015" and "2016" Water Temperature Data .Rmd files for methods.


# 2015
## lower sites
LBC15 <- lower_sites_2015 %>%
  gs_read(ws = "LBC15", colnames = TRUE) %>%
  transform(DateTime = mdy_hm(DateTime))
LRR15 <- lower_sites_2015 %>%
  gs_read(ws = "LRR15", colnames = TRUE) %>%
  transform(DateTime = mdy_hm(DateTime))
LPC15 <- lower_sites_2015 %>%
  gs_read(ws = "LPC15", colnames = TRUE) %>%
  transform(DateTime = mdy_hm(DateTime))
LKR15 <- lower_sites_2015 %>%
  gs_read(ws = "LKR15", colnames = TRUE) %>%
  transform(DateTime = mdy_hm(DateTime))


# 2016
## lower sites
LBC16 <- lower_sites_2016 %>%
  gs_read(ws = "LBC16", colnames = TRUE) %>%
  transform(Date = as.character(Date),
            DateTime = as.POSIXct(DateTime))
LRR16 <- lower_sites_2016 %>%
  gs_read(ws = "LRR16", colnames = TRUE) %>%
  transform(Date = as.character(Date),
            DateTime = as.POSIXct(DateTime))

LPC16 <- lower_sites_2016 %>%
  gs_read(ws = "MPC16", colnames = TRUE) %>%
  transform(Date = as.character(Date),
            DateTime = as.POSIXct(DateTime)) %>%
 select(-Hydrologic_Segment,-Site) %>%
  mutate(Hydrologic_Segment = "Lower",
         Site = paste(Hydrologic_Segment, Stream_Name))

LKR16 <- lower_sites_2016 %>%
  gs_read(ws = "LKR16", colnames = TRUE) %>%
  transform(Date = as.character(Date),
            DateTime = as.POSIXct(DateTime))

## middle kenai river
MKR15 <- gs_title("2015_MKR_H2OTemp_Final.csv") %>%
  gs_read(ws = "2015_MKR_H2OTemp_Final.csv") %>%
  transform(DateTime = mdy_hm(DateTime))
MKR16 <- gs_title("2016_MKR_H2OTemp_Final.csv") %>%
  gs_read(ws = "2016_MKR_H2OTemp_Final.csv") %>%
  transform(Date = as.character(Date),
            DateTime = as.POSIXct(DateTime))

## middle ptarmigan creek 2015
MPC15 <- lower_sites_2015 %>%
  gs_read(ws = "LPC15", colnames = TRUE) %>%
  select(-Hydrologic_Segment,-Site) %>%
  mutate(Hydrologic_Segment = "Middle",
         Site = paste(Hydrologic_Segment, Stream_Name)) %>%
  transform(Date = as.character(Date),
            DateTime = mdy_hm(DateTime))

## middle ptarmigan creek 2016
MPC16 <- lower_sites_2016 %>%
  gs_read(ws = "MPC16", colnames = TRUE) %>%
  transform(Date = as.character(Date),
            DateTime = as.POSIXct(DateTime))



# bind together lower site dataframes (+ middle kenai river)
lower_sites <- bind_rows(
                         # lower sites
                         LBC15,LRR15,LPC15,LKR15,
                         LBC16,LRR16,LPC16,LKR16,
                         # middle kenai sites
                         MKR15,MKR16,
                         # those special ones
                         MPC15,MPC16) %>%
  transform(DateTime = as.character(DateTime)) %>%
  select(-X1)


Import middle and upper site water temp data


# links?

# other sites (middle and upper reaches)
BC15 <- gs_title("Beaver_Creek_2015_Water_Temperatures")
BC16 <- gs_title("Beaver_Creek_2016_Water_Temperatures")
RR15 <- gs_title("Russian_River_2015_Water_Temperatures")
RR16 <- gs_title("Russian_River_2016_Water_Temperatures")
PC15 <- gs_title("Ptarmigan_Creek_2015_Water_Temperatures")
PC16 <- gs_title("Ptarmigan_Creek_2016_Water_Temperatures")

Sys.sleep(6)

## Beaver Creek
#2015
MBC2015 <- BC15 %>% 
  gs_read(ws = "MBC_WL_10324669",col_names=TRUE)
UBC2015 <- BC15 %>% 
  gs_read(ws = "UBC_WL_10537880",col_names=TRUE)
#2016
MBC2016 <- BC16 %>% 
  gs_read(ws = "MBC-WL 2004069",col_names=TRUE)
UBC2016 <- BC16 %>%
  gs_read(ws = "UBC-WL 10718566",col_names=TRUE)
UBC1_2016 <- BC16 %>%
  gs_read(ws = "UBC1-WL 10886777")
UBC2_2016 <- BC16 %>%
  gs_read(ws = "UBC2-WL 10886775")

Sys.sleep(6)

# Russian River
# 2015
MRR2015 <- RR15 %>% 
  gs_read(ws = "MRR_WL_10324639",col_names=TRUE)
URR2015 <- RR15 %>% 
  gs_read(ws = "URR_WL_10324638",col_names=TRUE)
#2016
MRR2016 <- RR16 %>% 
  gs_read(ws = "MRR-WL 10886776",col_names=TRUE)  # 9/11/18 confirmed no unintentional data dupication in import
MRR1_2016 <- RR16 %>%
  gs_read(ws = "MRR1-WL 10324644", col_names = TRUE)
MRR2_2016 <- RR16 %>%
  gs_read(ws = "MRR2-WL 10324638", col_names = TRUE)
#URR2016 <- RR16 %>% 
#  gs_read(ws = "URR-WL 10886778", col_names=TRUE)     # hashtagged out for now; will use MRR2 site instead; see below
URR1_2016 <- RR16 %>%
  gs_read(ws = "URR1-WL 10767518", col_names = TRUE)

Sys.sleep(6)

##Ptarmigan Creek
#2015
# 2015 MPC already imported above
UPC2015 <- PC15 %>% 
  gs_read(ws = "UPC_WL_10324642",col_names=TRUE)
#2016
# 2016 MPC already imported above
#MPC2016 <- PC16 %>% 
#  gs_read(ws = "MPC-WL 10324639",col_names=TRUE)
UPC2016 <- PC16 %>% 
  gs_read(ws = "UPC-WL 10324648",col_names=TRUE)


# Bind middle & upper sites data
middle_upper_sites <- bind_rows(
  #2015
  MBC2015,UBC2015,
  MRR2015,URR2015,
  UPC2015,
  #2016
  MBC2016,UBC2016,UBC1_2016,UBC2_2016, 
  MRR2016,MRR1_2016,MRR2_2016,URR1_2016,
  UPC2016) 

# format middle & upper site dataframe to match lower sites dataframe
middle_upper_sites <- middle_upper_sites %>%
  select(-one_of("Latitude","Longitude","Logger Type","Sample Interval (min)",
         "Logger Serial #","Site Description","Logger Model")) %>%
  rename(Stream_Name = `Stream Name`,
         Hydrologic_Segment = `Hydrologic Segment`,
         Site_ID = `Site ID`)  %>%
  transform(DateTime = mdy_hm(DateTime)) %>%
  mutate(Date = as.character(date(DateTime)),
         Week = week(DateTime),
         Site = paste(Hydrologic_Segment,Stream_Name)) %>%
  transform(DateTime = as.character(DateTime)) %>%
  select(DateTime,Temp_C,Stream_Name,Hydrologic_Segment,
         Site_ID,Agency,Site,Date,Week)

# bind lower site dataframe with middle & upper site dataframe
fnl_temps <- bind_rows(lower_sites,middle_upper_sites) %>%
  select(-Date) %>%
  mutate(Date = date(DateTime),
         year = year(DateTime)) %>%
  # remove extraneous site label info from some parts of "Site" column
  separate(Site,sep = " ", into = c("reach","river","waterbody")) %>%
  unite(reach,river,waterbody, sep = " ",col = "Site") 


# remove temporary objects
rm(lower_sites,middle_upper_sites,
  MBC2015,UBC2015,
  MRR2015,URR2015,
  MPC2015,UPC2015,
  MBC2016,UBC2016,UBC1_2016,UBC2_2016, 
  MRR2016,MRR1_2016,MRR2_2016,URR2016,URR1_2016,
  MPC2016,UPC2016,
  BC15,BC16,RR15,RR16,PC15,PC16,MKR15,MKR16)
  
# remove temporary objects
rm(lower_sites_2015,lower_sites_2016,LBC15,LRR15,LPC15,LKR15,LBC16,LRR16,MPC16,LKR16)


Eliminate loggers unwanted for growth analyses (e.g. water Temp loggers with no associated air temp dataset nearby)

# Issue: in 2016, there exist multiple temperature loggers in a given reach but only one fish sampling site.
# Resolved: for growth analyses, use only the one logger that was closest to sampling site.

# create function to exclude multiple criteria from same column, opposite of "%in%"
'%notin%' <- Negate('%in%')

# filter out water temperature loggers non-adjacent to fish sampling sites.

# Also: Upper Russian River 2016 (URR-2016) water temp is not available due to logger malfunction.  Replace with closest downstream logger; MRR2-2016:
loggers_excluded <- c("UBC1-WL-2016",
                      "UBC2-WL-2016",
                      "MRR1-WL-2016", # just above lower russian lake
                      #"MRR2-2016",   # mid-way between lower and upper russian lakes; keep this site; will use for upper russian river 2016 because that dataset too is short to use (logger failure).
                      # leaving upper ptarmigan creek site in for now.  lentic site, may not be comparable to others.
                      "URR-WL-2016",   
                      "URR1-WL-2016"
                      #,
                      # "UPC-WL-2015", # upper ptarmigan creek site; lentic
                      # "UPC-WL-2016" # upper ptarmigan creek site; lentic
                      )

# exclude unwanted loggers
fnl_temps_sub <- fnl_temps %>%
  filter(Site_ID %notin% loggers_excluded) %>%
  filter(!is.na(Stream_Name),
         !is.na(Temp_C)) %>%
  transform(DateTime = as.POSIXct(DateTime))

## Special procedure for Upper Russian River 2016
# change MRR2-WL-2016 site from middle to upper reach.
# (upper russian river 2016 temp data largely missing due to logger malfunction)
z <- fnl_temps_sub %>%
  filter(Site_ID == "MRR2-WL-2016") %>%
  mutate_all(funs(str_replace_all(., "Middle", "Upper"))) %>%
  mutate(DateTime = as.POSIXct(DateTime),
         Temp_C = as.numeric(Temp_C),
         Week = as.numeric(Week),
         Date = as.Date(Date),
         year = as.numeric(year))

# reunite MRR2 data with overall dataframe
fnl_temps_sub <- bind_rows(fnl_temps_sub,z) %>%
  # clean up the "Site" column so that only name is present; not Site_ID, is present in this column
  separate(Site,sep = " ", into = c("reach","river","waterbody")) %>%
  unite(reach,river,waterbody, sep = " ",col = "Site")

# remove presence of rows with Site_ID = MRR2-WL-2016 and Site = 'Middle Russian River' to eliminate duplication (we are using MRR-WL-2015 site for MRR2016 data)

# create table of unwanted data and anti_join it to original table
anti_fnl_temps_sub <- fnl_temps_sub %>%
  filter(Site == "Middle Russian River" & Site_ID == "MRR2-WL-2016")

fnl_temps_sub <- anti_join(fnl_temps_sub,anti_fnl_temps_sub)

rm(LPC16,MPC15)


Plot water temperature data


Sys.sleep(6)

# plot raw data to confirm it is sensical...?
fnl_temps_x <- fnl_temps_sub %>%
  mutate(doy = yday(DateTime)) %>%
  separate(Site, sep = " ","reach", remove = F) %>%
  group_by(Stream_Name,reach,Site,Date,year,doy) %>%
  summarise(daily_mean_temp = mean(Temp_C))

ggplot(fnl_temps_x, aes(doy,daily_mean_temp, color = as.factor(year))) +
  geom_line() +
  facet_grid(Stream_Name ~ reach) +
  theme_bw() + 
  ggtitle("2015-2016 Water Temperature Data")


An annual water temperature dataset associated with each site is present in the “fnl_temps_sub” dataframe (subsetted final temperatures).



Air temperatures


Import air temperatures


Sys.sleep(6)

# 2015 Air temps
## Beaver Creek
LBC15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "LBC-AL 10324664") 
MBC15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "MBC-AL 2004069")
UBC15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "UBC-AL 2004060")

Sys.sleep(6)

## Russian River
LRR15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "LRR-AL 10537877")
MRR15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "MRR-AL 10324659")
URR15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "URR-AL 1065516")

Sys.sleep(6)

## Ptarmigan Creek
LPC15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "LPC-AL 10324665")
MPC15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "MPC-AL 10537882")
UPC15_air <- gs_read(gs_title("Air_Temperatures_2015_EPSCoR_SCTC.xlsx"),ws = "UPC-AL 10324647")


# 2016 Air Temps

## Beaver Creek
LBC16_air <- gs_read(gs_title("Beaver_Creek_2016_Air_Temperatures.xlsx"),ws = "LBC-AL 10324647")
MBC16_air <- gs_read(gs_title("Beaver_Creek_2016_Air_Temperatures.xlsx"),ws = "MBC-AL 10324645")
UBC16_air <- gs_read(gs_title("Beaver_Creek_2016_Air_Temperatures.xlsx"),ws = "UBC-AL 2004072")

Sys.sleep(6)

## Russian River
LRR16_air <- gs_read(gs_title("Russian_River_2016_Air_Temperatures.xlsx"),ws = "LRR-AL 10324665")
MRR16_air <- gs_read(gs_title("Russian_River_2016_Air_Temperatures.xlsx"),ws = "MRR-AL 10324640")
URR16_air <- gs_read(gs_title("Russian_River_2016_Air_Temperatures.xlsx"),ws = "URR-AL 10537882")

Sys.sleep(6)

## Ptarmigan Creek
LPC16_air <- gs_read(gs_title("Ptarmigan_Creek_2016_Air_Temperatures.xlsx"),ws = "LPC-AL 2004060")
MPC16_air <- gs_read(gs_title("Ptarmigan_Creek_2016_Air_Temperatures.xlsx"),ws = "MPC-AL 10537875")
UPC16_air <- gs_read(gs_title("Ptarmigan_Creek_2016_Air_Temperatures.xlsx"),ws = "UPC-AL 10537880")

Sys.sleep(6)

# Kenai River sites associated air temps

# Lower Kenai River
# downloaded csv to local directory from NWS LCD service on 12/17/18
LKR_air <- read.csv("/Users/bmeyer/Google Drive/Thesis/R Analysis/Temperature Data/NWS_download/1578886.csv") %>%
  select(DATE,HOURLYDRYBULBTEMPC,LATITUDE,LONGITUDE) %>%
  transform(DATE = round_date(mdy_hm(DATE),"15 minutes")) %>%  # round to nearest 0.25 hrs
  mutate(Agency = "NWS",
         `Stream Name` = "Kenai River",
         Hydrologic.Segment = "Lower",
         Site.ID = "LKR-AL-2015",
         Logger.Type = "Air",
         site = "Lower Kenai River",
         Date = date(DATE),
         year = year(DATE)) %>%
  rename(DateTime = DATE,
         Temp_C = HOURLYDRYBULBTEMPC,
         Latitude = LATITUDE,
         Longitude = LONGITUDE) %>%
  transform(DateTime = as.character(DateTime))

LKR15_air <- LKR_air %>%
  filter(year == 2015)

LKR16_air <- LKR_air %>%
  filter(year == 2016)


# bind 2015 air temps
air15 <- bind_rows(LBC15_air,MBC15_air,UBC15_air,
                   LRR15_air,MRR15_air,URR15_air,
                   LPC15_air,MPC15_air,UPC15_air) %>%
  select(-Notes)

# bind 2016 air temps
air16 <- bind_rows(LBC16_air,MBC16_air,UBC16_air,
                   LRR16_air,MRR16_air,URR16_air,
                   LPC16_air,MPC16_air,UPC16_air) %>%
  select(-X10,-X15,-X16)

# bind 2015 and 2016 air temps
airtemps <- bind_rows(air15,air16) %>%
  transform(DateTime = mdy_hm(DateTime)) %>%
  select(-Logger.Model,-Site.Description,-Sample.Interval..min.,-Logger.Serial..,-Contact) %>%
  mutate(site = as.factor(paste(Hydrologic.Segment,Stream.Name)),
         Date = date(DateTime),
         year = year(DateTime)) %>%
  filter(!is.na(DateTime)) %>%
  transform(DateTime = as.character(DateTime)) %>%
  bind_rows(LKR15_air,LKR16_air)

# subset to date range of interest
## May 1 - Nov 1, 2015 and 2016

z15 <- airtemps %>%
  filter(Date >= as.Date("2015-05-01") & Date <= as.Date("2015-11-01"))
z16 <- airtemps %>%
  filter(Date >= as.Date("2016-05-01") & Date <= as.Date("2016-11-01"))
airtemps <- bind_rows(z15,z16)

# remove temporary objects
#rm(LBC15_air,MBC15_air,UBC15_air,
#   LRR15_air,MRR15_air,URR15_air,
#   LPC15_air,MPC15_air,UPC15_air,
#   LBC16_air,MBC16_air,UBC16_air,
#   LRR16_air,MRR16_air,URR16_air,
#   LPC16_air,MPC16_air,UPC16_air,
#   LKR_air,LKR15_air,LKR16_air)

rm(z,z1,z15,z16)


Plot raw air temperature data


# NOTE: it appears that there is once a day (~midnight) every day that has a missing air temp observation from the kenai airport air temp data set.  These must be ignored in order to calculated daily means.

# plot raw data to confirm it is sensical...?
airtemps_means <- airtemps %>%
  filter(!is.na(Temp_C)) %>%
  mutate(doy = yday(DateTime)) %>%
  group_by(Stream.Name,Hydrologic.Segment,site,Date,year,doy) %>%
  summarise(daily_mean_airtemp = mean(Temp_C))

ggplot(airtemps_means, aes(doy,daily_mean_airtemp, color = as.factor(year))) +
  geom_line() +
  facet_grid(Stream.Name ~ Hydrologic.Segment) +
  theme_bw() + 
  ggtitle("2015-2016 Air Temperature Data")

NOTE: need to QC the data for early middle/upper russian river


Join water and air temperature data in to same dataframe

# join water and air temps in same dataframe by DateTime and site

## prep water temps dataframe for join
fnl_temps_sub <- fnl_temps_sub %>%
  transform(DateTime = as.POSIXct(DateTime)) %>%
  rename(water_temp_C = Temp_C,
         Agency_water = Agency,
         Site_ID_water = Site_ID) %>%
  select(-Week)

## prep air temps dataframe for join
airtemps <- airtemps %>%
  transform(DateTime = as.POSIXct(DateTime)) %>%
  rename(air_temp_C = Temp_C,
         Stream_Name = Stream.Name,
         Hydrologic_Segment = Hydrologic.Segment,
         Site_ID_air = Site.ID,
         Site = site,
         Agency_air = Agency) %>%
  select(-Latitude,-Longitude,-Logger.Type)


# left_join of water and airtemps dfs 
temps <- left_join(fnl_temps_sub,airtemps, by = c("DateTime","year","Date","Hydrologic_Segment","Stream_Name","Site"))


Plot data to visualize extent and overall shape(s)


temps %>%
  ggplot(aes(air_temp_C,water_temp_C, color = as.factor(year))) +
  geom_jitter(width = 2) +
  facet_grid(Stream_Name ~ Hydrologic_Segment) +
  theme_bw()


temps %>%
  filter(!is.na(air_temp_C)) %>%
  group_by(Site) %>%
  summarise(n_obs = n())

Notes: Lower Kenai River site is less data-rich because air temperature data from the Kenai Airport is hourly rather than every 0.25 hrs.

In progress: Seeking a choice on which air temperature dataset to use for Middle Kenai River. Supposedly RAWS data is available for summers 2015 and 2016 but having trouble trying to figure out how to download. Emailed m. mccarthy on 12/21/2018.



Sensitivity relationships

Calculate sensitivity and r2 at annual site level

  
# formula structure
formula <- y ~ x

# formula and r2 labels
formula_labels <- stat_poly_eq(aes(label = paste("atop(",..eq.label.., ",", ..adj.rr.label.., ")", sep = "")),
               formula = formula, 
               rr.digits = 2, 
               coef.digits = 2, 
               parse = TRUE,
               size = 3,
               geom = "label", 
               alpha = 0.8)


# plot water vs air temp fort each site/year
temps %>%
  ggplot(aes(air_temp_C,water_temp_C)) +
  geom_jitter(width = 2) +
  facet_wrap(year~Site) +
  formula_labels +
  theme_bw()

Notes: sensitivity varies greatly when lumped at an annual basis (e.g same site can have strong or weak relationship from year to year). Not a meaningful scale of interpretation.

Also, we ned to have a common time frame for all sites.


Calculate sensitivity (slope) and r2 for each site/week/year


# thought: a.) reduce to common time frame and b.) reduce data to tibble of site/year/week; use data within to calculate slope, r2, and n()

# air-water temp sensitivity formula
air_h20_sens <- function(df) {
  lm(water_temp_C ~ air_temp_C, data = df)}

# calculate slope of air-water relationship at site/year/week level
sens_vals <- temps %>%
  filter(!is.na(water_temp_C),
         !is.na(air_temp_C))%>%
  mutate(week = week(DateTime),
         doy = yday(DateTime)) %>%
  group_by(Site,Hydrologic_Segment,year,week) %>%
  nest() %>%
  # calculate slope for weekly relationship of air-water temps
  mutate(sens = map(data,air_h20_sens),
         tidy_sens = map(sens,broom::tidy),
         weekly_slope = map_dbl(tidy_sens,~.$estimate[2])) %>%
  # calculate r2 for weekly relationship of air-water temps
  mutate(fit = map(data,air_h20_sens),
         summary = map(fit,glance),
         r_sq = map_dbl(summary,"r.squared")) %>%
  select(Site,Hydrologic_Segment,year,week,weekly_slope,r_sq)

# table of values
datatable(sens_vals, caption = "Air-Water Sensitivity Values by Site/Year/Week", 
          filter = 'top', options = list(
  pageLength = 5, autoWidth = TRUE)) %>%
  formatSignif(c("weekly_slope","r_sq"), digits = 3) 


The above approach may employ a temporal scale too specific to be useful in relating to future projected air temperatures. Consider other approaches, e.g.: Mauger et al 2017 uses “3 years of weekly mean July temperatures,” calculating sensitivity from a linear mixed-effects model with year as a random intercept.

“We used a linear mixed effects model with a random intercept for year to account for the temporal correlation between weekly values within a year. We used Akaike’s information criterion (AIC) to compare results from a mixed effects model to one without a random intercept; the mixed effects model resulted in better model fit at 29 of the 48 sites.”

For my work, that would produce only 8 data points (2 years x 4 weeks per month) from which to acquire a slope (monthly site sensitivity value). Thought: using weekly mean values is probably the minimum timeframe we want for average values.

Let’s try this identical approach as in Mauger 2017 to see how things look. (Do we get rational results?):

# calculate weekly mean water and air temp values (if at least 70% of data in that week captured)

# note: a full week of observations for records every 0.25 hrs is 672 observations, a full week of observations every 1.0 hrs is 168 (e.g. mainstem kenai river sites).  Create a column specifying 672 obs per week for all sites except mainstem kenai, 168 for mainstem kenai sites

# create intermediate dataframe to join to specify number of max observations per week
weekly_obs <- temps %>%
  distinct(Site,Stream_Name) %>%
  mutate(max_weeklyobs = ifelse(Stream_Name == "Kenai River",168,672))
  
# calculate mean weekly temperatures by site
meantemps <- temps %>% 
  filter(!is.na(air_temp_C)) %>%
  mutate(month = month(DateTime),
         week = week(DateTime)) %>%
  group_by(month,year,week,Site) %>%
  mutate(n_week_obs = n()) %>%
  group_by(month,week,year,Site,n_week_obs) %>%
  summarise(mean_h20 = mean(water_temp_C),
            mean_air = mean(air_temp_C))
  
# it appears that at the lower kenai iste (maybe the middle one too?) observation frequency was not always just hourly (some have > 168 obs/week).  That should be OK; specifify diff in methods though.

# join and calculate fraction of maximum possible observations present per week/site
meantemps <- left_join(meantemps,weekly_obs) %>%
  mutate(frac_week_obs = n_week_obs/max_weeklyobs) %>%
  # filter out weeks with less than 70% of observations present
  filter(frac_week_obs >= .7)


## prep formula labels for each plot

# formula structure
formula <- y ~ x

# formula and r2 labels
formula_labels <- stat_poly_eq(aes(label = paste("atop(",..eq.label.., ",", ..adj.rr.label.., ")", sep = "")),
               formula = formula, 
               rr.digits = 2, 
               coef.digits = 2, 
               parse = TRUE,
               size = 3,
               geom = "label", 
               alpha = 0.8)

# plot
meantemps %>%
  filter(month != 9) %>%
  # manually order appearance of sites
  transform(Site = factor(Site, levels=c("Lower Beaver Creek",
                                         "Middle Beaver Creek",
                                         "Upper Beaver Creek",
                                         "Lower Russian River",
                                         "Middle Russian River",
                                         "Upper Russian River",
                                         "Lower Ptarmigan Creek",
                                         "Middle Ptarmigan Creek",
                                         "Upper Ptarmigan Creek",
                                         "Lower Kenai River",
                                         "Middle Kenai River"))) %>%
              filter(month != 10) %>%
  ggplot(aes(mean_air,mean_h20), color = year) + 
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(Site~month) +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 360)) +
  ggtitle("Air-Water Temperature Sensitivity Relationships") +
  formula_labels


# need to calculate values as shown in formula labels and interpret; prep for use in modeling future water temps under scernarios....


Calculate slope,r2, and n values for each site/month air water sensitivity relationship


# calculate slope, r2, and n for each site/month from weekly mean air/water temp values
sens_model <- function(df) {
  lm(mean_h20 ~ mean_air, data = df)
}

# create values from nested tibble
sens <- meantemps %>%
  group_by(Site,month) %>%
  nest() %>%
  mutate(
    sens = map(data, sens_model),
    tidy_sens = map(sens, broom::tidy),
    sens_slope = map_dbl(tidy_sens, ~.$estimate[2]),
    sens_int = map_dbl(tidy_sens, ~.$estimate[[1]]),
    n_weeks = map_dbl(data, nrow),
    summary = map(sens,summary),
    r_sq = map_dbl(summary,"r.squared"))

# table of values
sens %>%
  select(Site,month,sens_slope,sens_int,r_sq,n_weeks) %>%
datatable(caption = "Weekly Air-Water Sensitivity Values by Site/Year", 
          filter = 'top', options = list(autoWidth = TRUE, pagelength = 5)) %>%
  formatSignif(c("sens_slope","sens_int","r_sq"), digits = 3) 


Postulation: we can have semi-realistic/useful site/month sensitivity values for site/months with a minimum of 3 data points…?

Filter out site/months with <3 weeks


z <- sens %>%
  filter(n_weeks >= 3)

Question: can FB 4.1 do monthly instead of daily input/output?

–> get: mean decadal monthly air temps available on snap for kenai…

–> possible approach: download monthly 1x1 km SNAP air temp projections, get monthly-mean projections for each site, use slope/intercept from each (current) sensitivity relationship to get projected water temps

–> so we’d have decadal projected water temps as monthly means. possibly 2030–2039 and 2060–2069 (mauger et al method)

files are very big, may need to leave comp overnight @ campus: http://ckan.snap.uaf.edu/dataset/projected-monthly-temperature-1-km-cmip3-ar4/resource/73b9ac65-3343-49f7-8fcc-6f2c6f46c59c

to do: go thru practice exercise of extracting temp data from geotiff file: https://www.earthdatascience.org/courses/earth-analytics/lidar-raster-data-r/introduction-to-spatial-metadata-r/