Readme

This is an R Notebook. If you’d like to pre-rendered figures, read a summary of analysis and view code, please open the html file in a browser. If you’d like to run the code yourself download and open the rmd file in rstudio or any other integrated development environment that supports r and markdown.

This notebook collects any computational work associated with understanding effects to bull trout from drawdowns of the Rimrock Resevoir.

require(cowplot)
require(plotly)
require(effects)
require(mice)
require(zoo)
require(tidyverse)
require(lubridate)
require(magrittr)
require(pracma)

Redd Counts

Redd count data is used to assess population abundance. While (a) observer error, (b) variation in redd characteristics among life-history forms, (c) redd superimposition, and (d) skewed sex ratios can introduce bias or error, redd counts are the best available long term data to estimate trends in abundance. In the Yakima basin redd count data has been consistently monitored in multiple index reaches for nearly 40 years.

This section contains summaries and visualization of redd count data for the Rimrock population complex. Summaries for other populations and several analyses that attempt to parse the spatial and temporal scales at which redd count data are autocorrelated, are available in the Yakima basin-wide analysis notebook

Data are from: Divens (2024) Yakima Basin Bull Trout Spawning Surveys 2023. WDFW.

Data Import and QC

Import

# Data Input

# Divens 2024 provides the raw data (yay), but the format is not ideal for input, we already have completed some manual clean up. The data and associated readmes are provided in the excel workbook titled redd_counts.xlsx in this directory.

# Here we import the data and roll up (or break down) all redd counts to the local population scale. This is the largest spatial scale at which population size is determined by local birth/deaths and may also fit well to the population genetic definition of a population.

redds <- readxl::read_xlsx("data/redd_counts.xlsx", sheet = 2)

# now lets make this data long
redd_long <- redds %>%
  pivot_longer(cols = -c(local_pop, sub_pop, pop_complex, current_life_history), names_to = "year", values_to = "redds_sub_pop")

#now, roll up the small scale counts to the local pops level
redds_up <- redd_long %>%
  group_by(local_pop, year) %>%
  summarise(redds = sum(redds_sub_pop, na.rm = TRUE)) %>%
  ungroup() 

redds_up %<>% 
  left_join(distinct(select(redd_long, local_pop, pop_complex), .keep_all = TRUE)) %>%
  mutate(year = as.numeric(year)) %>%
  filter(pop_complex=="Rimrock")

Filtering

When to start time series?

While the complete Yakima redd count dataset provided by Divens (2024) goes back to 1984, not all populations and index reaches were surveyed at that time.

For Rimrock populations the first reliable redd count data begins in
South Fork Tieton: 1994
Indian Creek: 1988
North Fork Tieton: 2007

Low Counts

The metadata provided with the redd counts includes a list of years where the data are expected to undercount values that would have been observed under normal conditions. For example high flows may reduce redd observation, or fire exclusions reduce redd counts to zero. If these errors are temporally autocorrelated this may produce false trend signals. Errors in individual years will aslo reduce our power to identify correlations between populations or environmental/management variables.

The basin-wide analysis found that temporal autocorrelation in undercounted years is present in the data, but has little impact on basin-wide trends. However, at smaller spatial scales (e.g. this analysis), we should exclude all years with undercounts from further analysis.

# import data 
bad_years <- readxl::read_xlsx("data/redd_counts.xlsx", sheet = 4)

#create an id for matching
bad_years %<>%
  mutate(by_id = paste(year, local_pop, sep = "_") )

by_ids <- bad_years$by_id

# set bad years to missing
redds_up %<>%
  mutate(by_id = paste(year, local_pop, sep = "_"),
         redds_na = case_when(by_id %in% by_ids ~ NA_real_,
                           TRUE ~ redds)) %>%
  select(-by_id)

Filtering Summary
For each population, the filtered dataset excludes all years before surveying in index reaches began, and all years that are noted in the metadata provided in Divens (2024) as potential undercounts. Since 1994, the following years are labeled as undercounts and excluded from further analysis:

SF Tieton: 1995, 2010, 2013
Indian Creek: 2013
NF Tieton: 1994 - 2006, 2010, 2013, 2014, 2018

We will also exclude NF Tieton from further analysis. Using SF Tieton and Indian together provides 27 data points (30 years - 3 bad data years). Including NF Tieton would reduce the data to just 13 points.

Redd Count Summary

Now Let’s plot the filtered redd count data for each population.

ggplot(data = filter(drop_na(redds_up, redds_na)), aes(x = year, y = redds))+
  geom_point()+
  geom_line()+
  geom_smooth(span = 5)+
  theme_bw()+xlab("Year")+ylab("Redds")+
  scale_x_continuous(breaks = seq(1987, 2023, 4), minor_breaks = seq(1987, 2023, 2)) +
  facet_grid(rows = vars(local_pop), scales = "free_y")

Since we are interested in Rimrock-wide trends, let’s also plot the combined redd count data for SF Tieton and Indian together. We will exclude NF Tieton since fewer data are available. We will also exclude years before reliable counts were available in SF Tieton (1994)

redds_sf_i <- redds_up %>%
  filter(local_pop %in% c("SF_Tieton", "Indian")) %>%
  group_by(year) %>%
  summarise(redds_total = sum(redds_na)) %>%
  drop_na(redds_total) %>%
  filter(year > 1993)

ggplot(data = redds_sf_i, aes(x = year, y = redds_total))+
  geom_point()+
  geom_line()+
  geom_smooth(method = "lm")+
  theme_bw()+xlab("Year")+ylab("Redds SF Tieon + Indian Creek")+
  scale_x_continuous(breaks = seq(1993, 2023, 4), minor_breaks = seq(1993, 2023, 2))

Linear trend towards decline is evident in the time series of redd counts. We will need to detrend the data to identify impacts of operations or environmental variables.

Rimrock Operations

Unlike other reservoirs in the Yakima Project, Rimrock Lake was not a natural lake, and operations at Tieton Dam can draw Rimrock down to extremely low levels. Complete drawdowns of Rimrock Lake occurred four times, (1926, 1931, 1973, and 1979), and are associated with collapse of the Rimrock kokanee fishery the following year (Mongillo & Faulconer, 1980). The kokanee fishery did not recover from the 1973 drawdown for six years, despite stocking, and 95 – 99% of the population was lost to entrainment during the 1979 drawdown (Mongillo & Faulconer, 1980, pp. 31, 34). Analysis of kokanee catch records also indicate that deep drawdowns, defined as those below ~30,000 af, measurably reduce kokanee abundance and productivity (Mongillo & Faulconer, 1980, p. 31), prompting the Systems Operation Advisory Committee to recommend maintaining Rimrock above this level in 2001. Rimrock has been drafted beneath 30,000 af eight times since 1981, but only once since the 2001 recommendation

Rimrock has been under so called “flip-flop” operations since 1981. Currently, the pool is maintained at 127,000 acre-feet until August 10th to provide connectivity over a barrier waterfall at SF Tieton River (Thomas, 2001). This corresponds to a pool elevation of 2,894 msl, where a barrier waterfall as SF Tieton forms every year. This elevation keeps the barrier waterfall to 5 feet, a height where bull trout passage is assumed to be unimpaired (Thomas 2001). To avoid any waterfall forming, the pool must be maintained above 131,000 acre feet.

Let’s make a figure of Rimrock pool elevations over time to better understand operations and how they interact with recommendations to maintain connectivity and limit entrainment. Data below is from Jan 1 1981 to Nov 27 2023

# here we import the data, pulled from the BoR hydromet page from 1981 to present (Nov 27 2023)

#rim af = RIMROCK RESERVOIR,TIETON RIVER AND WEATHER STATION Reservoir Water Storage, acre-feet
#rim fb = RIMROCK RESERVOIR,TIETON RIVER AND WEATHER STATION Reservoir Water Surface Elevation, feet
#rim gd = RIMROCK RESERVOIR,TIETON RIVER AND WEATHER STATION Average Stream Stage, feet
#rim qd = RIMROCK RESERVOIR,TIETON RIVER AND WEATHER STATION Average Stream Discharge, cfs
#rim qu = RIMROCK RESERVOIR,TIETON RIVER AND WEATHER STATION Estimated Average Unregulated Flow, cfs

rimrock <- read_tsv("data/rimrock_hydromet.txt", comment = "#")

#next let's clean thing up, we'll create a month-day column to compare data across years
rimrock %<>%
  mutate(yday = as.Date(yday(DateTime), origin = "2050-01-01"),
         year = year(DateTime))

# ggplot(rimrock)+geom_line(aes(yday, rim_af, color = factor(year)), alpha = 0.5)+
#   scale_x_date(date_labels = "%b", breaks = "1 month")+scale_color_viridis_d(guide = "none")+
#   geom_smooth(aes(yday, rim_af), color = "black", size = 2)+
#   theme_classic()+xlab("Date")+ylab("Rimrock Capacity (acre-feet)")+
#   #geom_vline(aes(xintercept = mdy("08-10-2050")))+
#   geom_hline(aes(yintercept = 127000))+
#   annotate(geom= "text", x = mdy("04-10-2050"), y = 123000, label = "Barrier Falls Form")+
#   #annotate(geom= "text", x = mdy("08-08-2050"), y = 50000, label = "Upstream Spawning\nMigration Complete", angle = 90)
#   annotate("rect", xmin = mdy("06-01-2050"), xmax = mdy("08-10-2050"), ymin = -Inf, ymax = Inf, fill = "#FFEA46FF" , alpha = 0.2)+
#   annotate("rect", xmin = mdy("08-10-2050"), xmax = mdy("10-15-2050"), ymin = -Inf, ymax = Inf, fill = "#00336FFF" , alpha = 0.2) +
#   annotate(geom= "text", x = mdy("07-01-2050"), y = 50000, label = "Upstream Spawning\nMigration", angle = 90) +
#   annotate(geom= "text", x = mdy("09-8-2050"), y = 50000, label = "Adult Downstream \n Post-Spawning Migration", angle = 90)

# for BA
ggplot(rimrock)+geom_line(aes(yday, rim_af, color = factor(year)), alpha = 0.5)+
  scale_x_date(date_labels = "%b", breaks = "1 month")+scale_color_viridis_d(guide = "none")+
  geom_smooth(aes(yday, rim_af), color = "black", size = 2)+
  theme_classic()+xlab("Date")+ylab("Rimrock Capacity (acre-feet)")+
  scale_y_continuous(breaks = c(30000, 50000, 100000, 127000, 150000, 200000))+
  #geom_vline(aes(xintercept = mdy("08-10-2050")))+
  geom_hline(aes(yintercept = 127000), linetype =2)+
  geom_hline(aes(yintercept = 30000), linetype =2)+
  annotate(geom= "text", x = mdy("04-10-2050"), y = 123000, label = "Barrier Falls Form")+
  annotate(geom= "text", x = mdy("04-10-2050"), y = 26000, label = "2001 SOAC Recommended Minimum Low Pool")+
  #annotate(geom= "text", x = mdy("08-08-2050"), y = 50000, label = "Upstream Spawning\nMigration Complete", angle = 90)
  annotate("rect", xmin = mdy("06-01-2050"), xmax = mdy("08-10-2050"), ymin = -Inf, ymax = Inf, fill = "#FFEA46FF" , alpha = 0.2)+
  annotate("rect", xmin = mdy("09-15-2050"), xmax = mdy("10-15-2050"), ymin = -Inf, ymax = Inf, fill = "#00336FFF" , alpha = 0.2)# +

#  annotate(geom= "text", x = mdy("07-01-2050"), y = 50000, label = "Upstream Spawning\nMigration", angle = 90) +
#  annotate(geom= "text", x = mdy("09-8-2050"), y = 50000, label = "Adult Downstream \n Post-Spawning Migration", angle = 90)

Figure Caption: Volume of Rimrock Lake through the year demonstrating winter drawdown, and the overlap of flows with migration timing and formation of a passage barrier at the mouth of South Fork Tieton River. Colored lines are individual years from 1981- 2023, with more recent years in lighter (yellow) colors. Yellow period from June to mid-August approximates peak upstream migration, blue period from mid-August to October approximates peak downstream, post-spawn migration of adults. Heavy black line is loess-smooth of all years. Data from Bureau of Reclamation Hydromet.

Correlation of Pool Height and Redd Counts

Next we examine the relationship between Rimrock pool height and redd counts.

Some notes on variables:
(1) Data: Since we are interested in Rimrock-wide effects, we will use the combined, filtered redd count data for Indian Creek and South Fork Tieton River. Years with known data quality issues are excluded (see filtering section above for details), and NF Tieton is excluded because the data set only begins in 2006.
(2) Lags: Drawdown overlaps occurs during spawning season and outmigration. Drawdown is expected to have little impact on redd counts during that year. Instead the potential impacts of drawdowns are expected to be delayed for one or several years, depending on the mechanism of effect. For example, effects to prey base may only be realized in redd count data following 2 or 3 years. We will examine impacts 1, 2 and 3 years after drawdown.
(3) Explanatory Variable: Following the approach of Mongillo et al 1980, we will use the minimum pool reached in a year as the explanatory variable for drawdown depth. This not only has the benefit of precendent, it also makes sense from a mechanistic basis. Entrainment is determined by forebay habitat usage and intake velocity. Intake velocity does not vary much across years, but forebay habtiat usage does because of the bathymetry of Rimrock Lake. The forebay is very deep relative to the rest of the reservoir, concentrating individuals near the intake during deeper drawdowns. (4) Trends: There is a strong declining trend in the available data. This may reduce power to identify relationships between depth of drawdown and redd counts in following years. We will also consider detrended data.

Some notes on statistical approach: Ultimately we are unsure of the relationship between several explanatory variables and detrended redd counts, at this point and this is best approached as a model building exercise. We will build a saturated model, validate it, then conduct model selection, and hypothesis testing for the final predictors.

Data Prep

# create min_pool
rimrock %>%
  group_by(year) %>%
  slice_min(rim_af, with_ties = F) ->min_pool_year

# add lagged pool height data
redds_sf_i %<>%
  left_join(select(min_pool_year, year, rim_af))%>%
  mutate(lag1_year = year -1) %>%
  left_join(select(min_pool_year, year, lag1_rim_af=rim_af), by = c("lag1_year"= "year")) %>%
  mutate(lag2_year = year -2) %>%
  left_join(select(min_pool_year, year, lag2_rim_af=rim_af), by = c("lag2_year"= "year")) %>%
  mutate(lag3_year = year -3) %>% 
  left_join(select(min_pool_year, year, lag3_rim_af=rim_af), by = c("lag3_year"= "year")) 

# add detrended redd counts

lm <- lm(redds_total ~ year, data = redds_sf_i)
redds_sf_i %<>%
  mutate(resid = residuals(lm))


# qc check on detrend function, note: looks good!
# ggplot(data = redds_sf_i)+
#   geom_point( aes(x = year, y = redds_total))+
#   geom_line( aes(x = year, y = redds_total))+
#   geom_smooth( aes(x = year, y = redds_total),method = "lm")+
#   geom_point(aes(x = year, y = resid))+
#   theme_bw()+xlab("Year")+ylab("Redds SF Tieon + Indian Creek")+
#   scale_x_continuous(breaks = seq(1993, 2023, 4), minor_breaks = seq(1993, 2023, 2))+
#   geom_hline(aes(yintercept = 0))

Model Selection

Here we will used backwards stepwise model selection based on the likelihood ratio test to determine which, if any, of the variables is a good predictor of redd count residuals.

Let’s start with the global (saturated) model that includes rimrock pool elvation one, two and three years prior. Then we will do some model validation.

saturated_model <- lm(resid~lag1_rim_af + lag2_rim_af +lag3_rim_af, data = redds_sf_i)
#plot(saturated_model)
DHARMa::simulateResiduals(saturated_model, plot = TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.64 0.64 0.488 0.252 0.8 0.72 0.084 0.352 0.336 0.1 0.244 0.592 0.24 0.9 0.996 0.86 0.892 0.208 0.472 0.56 ...

Model validation looks great using both Pearson and simulated residuals, normal residuals, no outliers with high leverage, independence of residuals and predictors. Let’s move on to model selection

drop1(saturated_model, test = "Chisq")

2 year lag is not a good predictor according to LRT (p = 0.73). Remove and refit.

m2 <- lm(resid~lag1_rim_af  +lag3_rim_af, data = redds_sf_i)
drop1(m2, test = "Chisq")

3 year lag is not a good predictor according to LRT (p = 0.39). Remove and refit.

m3 <- lm(resid~lag1_rim_af , data = redds_sf_i)
drop1(m3, test = "Chisq")

Great, our final model includes only the previous year’s pool volume. Let’s conduct a simple T-test and plot the relationship

Final Model

summary(m3)

## 
## Call:
## lm(formula = resid ~ lag1_rim_af, data = redds_sf_i)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -94.33 -46.64 -17.68  40.84 167.50 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -7.641e+01  3.560e+01  -2.147   0.0417 *
## lag1_rim_af  1.424e-03  6.185e-04   2.302   0.0299 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 66.86 on 25 degrees of freedom
## Multiple R-squared:  0.1749, Adjusted R-squared:  0.1419 
## F-statistic: 5.301 on 1 and 25 DF,  p-value: 0.02992

ggplot(redds_sf_i, aes(lag1_rim_af, resid))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Detrended Indian Creek + SF Tieton Redds \n 1 year lag")

Our final model includes only minimum pool elevation the previous year. The slope is significantly different thatn zero (t-test p value = 0.0299).

Simple Approach

Some folks may prefer a simple approach with model selection, simply test for each relationship. This is not preferred because of multiple comparisons, but let’s have these ready to go.

Year 1 Lag

First let’s look at the impact of drawdown in the year immediately following it.

First we will use the raw data:

ggplot(redds_sf_i, aes(lag1_rim_af, redds_total))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Indian Creek + SF Tieton Redds \n 1 year lag")

summary(lm(redds_total~lag1_rim_af, data = redds_sf_i))

## 
## Call:
## lm(formula = redds_total ~ lag1_rim_af, data = redds_sf_i)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -127.97  -59.52   -9.53   69.78  167.10 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 2.271e+02  4.589e+01   4.949 4.26e-05 ***
## lag1_rim_af 1.142e-03  7.975e-04   1.433    0.164    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 86.21 on 25 degrees of freedom
## Multiple R-squared:  0.07586,    Adjusted R-squared:  0.03889 
## F-statistic: 2.052 on 1 and 25 DF,  p-value: 0.1644

Then we will use the detrended data

ggplot(redds_sf_i, aes(lag1_rim_af, resid))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Detrended Indian Creek + SF Tieton Redds \n 1 year lag")

summary(lm(resid~lag1_rim_af, data = redds_sf_i))

## 
## Call:
## lm(formula = resid ~ lag1_rim_af, data = redds_sf_i)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -94.33 -46.64 -17.68  40.84 167.50 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -7.641e+01  3.560e+01  -2.147   0.0417 *
## lag1_rim_af  1.424e-03  6.185e-04   2.302   0.0299 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 66.86 on 25 degrees of freedom
## Multiple R-squared:  0.1749, Adjusted R-squared:  0.1419 
## F-statistic: 5.301 on 1 and 25 DF,  p-value: 0.02992

Year 2 Lag

Now let’s look at the impact of drawdown 2 years after

First we will use the raw data:

ggplot(redds_sf_i, aes(lag2_rim_af, redds_total))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Indian Creek + SF Tieton Redds \n 2 year lag")

No relationship, let’s look at the detrended data.

ggplot(redds_sf_i, aes(lag2_rim_af, resid))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Detrended Indian Creek + SF Tieton Redds \n 2 year lag")

summary(lm(resid~lag2_rim_af, data = redds_sf_i))

## 
## Call:
## lm(formula = resid ~ lag2_rim_af, data = redds_sf_i)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -90.62 -45.30 -31.32  29.53 172.31 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  8.4292235 38.2709502   0.220    0.827
## lag2_rim_af -0.0001686  0.0007114  -0.237    0.815
## 
## Residual standard error: 73.53 on 25 degrees of freedom
## Multiple R-squared:  0.002243,   Adjusted R-squared:  -0.03767 
## F-statistic: 0.05619 on 1 and 25 DF,  p-value: 0.8145

No relationship.

Year 3 Lag

Now let’s look at the impact of drawdown 2 years after

First we will use the raw data:

ggplot(redds_sf_i, aes(lag3_rim_af, redds_total))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Indian Creek + SF Tieton Redds \n 3 year lag")

No relationship, let’s look at the detrended data.

ggplot(redds_sf_i, aes(lag3_rim_af, resid))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Detrended Indian Creek + SF Tieton Redds \n 3 year lag")

summary(lm(resid~lag3_rim_af, data = redds_sf_i))

## 
## Call:
## lm(formula = resid ~ lag3_rim_af, data = redds_sf_i)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -100.02  -38.90  -21.10   33.12  165.16 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.9682441 34.5193759   0.549    0.588
## lag3_rim_af -0.0003822  0.0006351  -0.602    0.553
## 
## Residual standard error: 73.08 on 25 degrees of freedom
## Multiple R-squared:  0.01428,    Adjusted R-squared:  -0.02515 
## F-statistic: 0.3621 on 1 and 25 DF,  p-value: 0.5528

Compete against environmental variables

There should be strong correlation between environmental conditions and the need for deep drawdowns. Therefore, the relationship between drawdown depth and redd counts the following year could be mediated by a common factor, and not be causal. One way to evaluate this question would be to build a model that includes environmental variables as potential explanatory factors, then examine the degree the effects of these variables on redd counts can be parsed from those of drawdown depth using the variance inflation factor.

For now, let’s use snowpack at a nearby SNOTEL site (White Pass) on Apr 1 of the year BEFORE redd counts. The rationale for using the lagged effect is that we are curious about other factors that might explain why lagged drawdown depth correlates with redd counts, and not that there is a correlation with water quantity available for instream flows in the year of the redd counts.

#data prep
sp <- read_csv("data/snowpack.csv")
sp <- sp[-1,c(1, 13)]

#note water year is designated as the calendar year that it ends, so the data for Apr corresponds to the calendar year

sp %<>%
  rename(SWE_Apr = "Apr...13", year = "Water Year") %>%
  mutate(year_lag = year + 1)

redds_sf_i %<>% 
  left_join(select(sp, year_lag, SWE_Apr_lag1 = SWE_Apr), by = c("year" = "year_lag"), keep = F) # note the consequence of this join is that the SWE data is lagged the same amount as the rimrock lag 1 column 

redds_sf_i %<>%
  mutate(SWE_Apr_lag1 = as.numeric(SWE_Apr_lag1))

#build model
a <- lm(resid~ lag1_rim_af + SWE_Apr_lag1, data = redds_sf_i)
b <- lm(resid~  SWE_Apr_lag1, data = redds_sf_i)
c <- lm(resid~ lag1_rim_af, data = redds_sf_i)
d <- lm(resid~ 1, data = redds_sf_i)

Let’s do some EDA. We will look at correlation among predictors and the extent of multicollinearity using the variance inflation factor.

#correlation
cor(select(redds_sf_i, resid, lag1_rim_af, SWE_Apr_lag1))

##                  resid lag1_rim_af SWE_Apr_lag1
## resid        1.0000000   0.4182495    0.1032171
## lag1_rim_af  0.4182495   1.0000000    0.4868621
## SWE_Apr_lag1 0.1032171   0.4868621    1.0000000

#vif
car::vif(a)

##  lag1_rim_af SWE_Apr_lag1 
##     1.310676     1.310676

Next is model validation. We will use the “plot” function to examine residuals and QQ plots, as well as the simulated resiudals approach in DHARMa.

#model validation
DHARMa::simulateResiduals(a, plot = TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.724 0.772 0.568 0.32 0.772 0.708 0.088 0.272 0.336 0.124 0.216 0.496 0.244 0.908 0.996 0.792 0.828 0.092 0.448 0.384 ...

Great looking model! Residuals look normally distributed, no outliers with outsized influence (leverage), no dependence. DHARMa simulated residuals confirms.

First let’s conduct backwards stepwise model selection based on the likelihood ratio test and the AIC

# backwards stepwise model selection
drop1(a, test = "Chisq")

drop1(c, test = "Chisq")

This approach suggests SWE the year before does not improve the fit to the data (LRT p -value = 0.5).

Same Year SWE Others might question the choice to use lagged SWE. Let’s also briefly look at SWE the same year.

redds_sf_i %<>% 
  left_join(select(sp, year, SWE_Apr), by = c("year" = "year"), keep = F) %>% 
  mutate(SWE_Apr = as.numeric(SWE_Apr))

e <- lm(resid~ lag1_rim_af + SWE_Apr, data = redds_sf_i)
drop1(e, test = "Chisq")

SWE in the same year only marginally improves the fit to the data (delta AIC = 1.4, lrt p-value = 0.063). We will not include it. However, Since this is a close case (delta AIC <2) we should at least explore to ensure it does not drastically change the interpretation of the results. Let’s look at the predicted effect of minimum pool elevation across two models (residual_redds ~ lagged_rimrock_acre_feet, and residual_redds ~ lagged_rimrock_acre_feet + SWE)

#plot prediction from model
require(effects)
eff1 <- predictorEffect("lag1_rim_af", c)
effdf <- as.data.frame(eff1)

eff2 <- predictorEffect("lag1_rim_af", e)
effdf2 <- as.data.frame(eff2)

#ggplot(data = effdf, aes(x = (lag1_rim_af), y = fit))+ 
#  geom_line()+
#  geom_smooth( aes(ymin = lower, ymax = upper), stat = "identity") +
#  theme_bw()+
#  xlab("Low Pool Previous Year")+ylab("Change in Redd Counts from Linear Trend")

In the first model (no SWE), a drawdown to 30kaf is associated with a loss of 33.7 redds the following year. In the second model (includes SWE same year), a drawdown to 30kaf is associated with a loss of 34.3 redds the following year. I do not believe the change in the prediction is substantial enough to warrant the additional complexity needed to explain all of this in the whitepaper. Also, let’s remember (a) delta AIC is less than 2 and lrt p value is > 0.05, and (b) we’re more interested in the previous year. Let’s exclude it.

Summary

Minimum pool and SWE are strongly correlated (r = 0.48). However, when both are used in a lm to model detrended redd counts from 1994 - 2023, the VIF is surprisingly low at 1.31. Model validation looks good. Backward stepwise selection using likelihood ratio tests, and a cutoff of p = 0.05 suggests that pool elevation (p = 0.02269), and not snowpack (p = 0.531), is a good predictor of redds the following year.

#plot prediction from model
require(effects)
eff1 <- predictorEffect("lag1_rim_af", c)
effdf <- as.data.frame(eff1)

#ggplot(data = effdf, aes(x = (lag1_rim_af), y = fit))+ 
#  geom_line()+
#  geom_smooth( aes(ymin = lower, ymax = upper), stat = "identity") +
#  theme_bw()+
#  xlab("Low Pool Previous Year")+ylab("Change in Redd Counts from Linear Trend")

Conclusions

There is a relationship between the depth of drawdown and the number of bull trout redds in SF Tieton and Indian Creek the following year. This relationship is apparent from raw redd count data, but only become statistically significant (p = 0.03) when using the residuals from a linear model that accounts for the overall decline in redd counts over the period analyzed. This relationship is shown below.

ggplot(redds_sf_i, aes(lag1_rim_af, resid))+geom_point()+geom_smooth(method ="lm")+theme_bw()+xlab("Rimrock Low Pool (acre-feet)")+ylab("Detrended Indian Creek + SF Tieton Redds \n 1 year lag") + annotate(geom= "text", x = 25000, y = 100, label = "p = 0.0299")

If we use this trendline to make some generalizations, since 1994 drawdowns below the 2001 SOAC reccomendation to maintain Rimrock pool above 30k acre-feet are associated with a decline the following year of ~53 redds in SF Tieton and Indian Creek. This decline may be due to entrainment, impacts to prey base, changes in connectivity, or some combination of these factors.

Miscellaneous

This section collects additional analyses not included in the whitepaper for clarity.

GAM/ Non-Linear Relationship

There’s no strong reason to assume a linear relationship between entrainment and pool volume. We chose pool volume given the focus in previous studies and it’s utility as an operation benchmark. Entrainment rate generally is a function of proximity to the inlet and inlet velocity. In Rimrock, bathymetry concentrates fish at the forebay non-linearly with reservoir elevation.

Mongillo and Falconer arrived at their 30kaf cutoff by conducting regressions on portions of the data subdivided into different bins of pool elevation, an idea similar to local regression/LOESS. Let’s take a similar approach, but formal. WE will use general additive modeling with cubic regression spline and cross validation.

require(mgcv)
f <-  gam(resid~ s(lag1_rim_af, fx = FALSE, k=-1, bs = "cr"), data = redds_sf_i)
plot(f, se = TRUE)
f_pred <- predict(f, se = TRUE, type = "response")
gam.check(f)

Population Level Response

We chose to focus on the sum of Indian and SFT redds to reduce the influence of population level stochasticity and increase power to focus identify shared Rimrock-wide impacts of Tieton Dam operations. However, there may be sufficient power to identify effects in each population, or we may find that one population is impacted more than the other. For example, genetic stock identification of bull trout in Tieton pool collected in 2003 found that 1.5 fold more Indian Creek fish than SF Tieton fish, despite the fact that the SF Tieton population was approximately 1.7 fold larger than Indian Creek at that time (using redd counts in index reaches).

I predict that when analyzing each population separately, there will be limited power to identify a relationship, but the non-significant association will be much steeper in Indian than SF Tieton.

# we can't just use the existing dataset because we cut it at 1994..
min_pool_year %<>%
  mutate(year1 = year + 1)
ind <- redds_up %>%
  filter(local_pop == "Indian") %>%
  left_join(select(min_pool_year, year1, lag1_rim_af =rim_af), by = c("year" = "year1")) %>%
  drop_na(redds_na) %>%
  mutate(resid_redds = detrend(redds_na))

## Adding missing grouping variables: `year`

ggplot(data = ind, aes(lag1_rim_af, resid_redds))+geom_point()+geom_smooth(method = "lm")

## `geom_smooth()` using formula = 'y ~ x'

summary(lm(resid_redds ~ lag1_rim_af, data = ind))

## 
## Call:
## lm(formula = resid_redds ~ lag1_rim_af, data = ind)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -101.645  -41.307   -3.879   43.358  101.682 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -4.222e+01  2.311e+01  -1.827   0.0768 .
## lag1_rim_af  8.885e-04  4.403e-04   2.018   0.0518 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 58.11 on 33 degrees of freedom
## Multiple R-squared:  0.1098, Adjusted R-squared:  0.08286 
## F-statistic: 4.072 on 1 and 33 DF,  p-value: 0.0518

sft <- redds_up %>%
  filter(local_pop == "SF_Tieton") %>%
  left_join(select(min_pool_year, year1, lag1_rim_af =rim_af), by = c("year" = "year1")) %>%
  drop_na(redds_na) %>%
  mutate(resid_redds = detrend(redds_na))

## Adding missing grouping variables: `year`

ggplot(data = sft, aes(lag1_rim_af, resid_redds))+geom_point()+geom_smooth(method = "lm")

## `geom_smooth()` using formula = 'y ~ x'

summary(lm(resid_redds ~ lag1_rim_af, data = sft))

## 
## Call:
## lm(formula = resid_redds ~ lag1_rim_af, data = sft)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -63.916 -27.146  -0.859  20.742  93.752 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.549e+01  2.193e+01  -1.162    0.256
## lag1_rim_af  4.750e-04  3.811e-04   1.246    0.224
## 
## Residual standard error: 41.2 on 25 degrees of freedom
## Multiple R-squared:  0.0585, Adjusted R-squared:  0.02084 
## F-statistic: 1.553 on 1 and 25 DF,  p-value: 0.2242

Wow, nailed it. not a significant relationship in either pop (although close in Indian), but the slope of the relationship is 2 fold greater in Indian than SF Tieton.

R Notebook Rimrock Drawdown

Readme

Redd Counts

Data Import and QC

Import

Filtering

Redd Count Summary

Rimrock Operations

Correlation of Pool Height and Redd Counts

Data Prep

Model Selection

Final Model

Simple Approach

Year 1 Lag

Year 2 Lag

Year 3 Lag

Compete against environmental variables

Summary

Conclusions

Miscellaneous

GAM/ Non-Linear Relationship

Population Level Response