This assignment will examine various aspects of flood physical and statistical properties in a single watershed-The Big Thompson River at Estes Park in the Colorado Front Range. My research group has studied this watershed in detail-see Yu et al. (2021, Water Resources Research), Connecting Hydrometeorological Processes to Low-Probability Floods in the Mountainous Colorado Front Range. While I make no claim that the paper is an act of genuis, it does, accidentally or not, draw on several key themes from Seminal Papers in Flood Hydrology: connections between flood geophysical processes and flood probability (including the importance of “mixture distributions,” which arguably don’t receive enough explicit emphasis in Seminal Papers), the challenge of estimating very rare flood quantiles, bounded/unboundedness, rainfall and flood regional homogeneity/heterogeneity, and others. You should start by reading the Yu et al. paper in detail. Then, complete the following questions/tasks.

NOTE: The .rmd file is an R Markdown Notebook. If you have never used R/Rstudio, you should spend some time familiarizing yourself with it via a tutorial such as this one. If you have never used R Notebooks, you may wish to do a tutorial such as this one. There are plenty of other examples and tutorials on both R/Rstudio and R Notebooks! When you execute code within the notebook, the results appear beneath the code. You can also “knit” a notebook into a nice PDF, Word, or HTML document (though it can be finicky-keep things as simple as possible!). Please submit your assignment as a PDF, Word, or HTML document created using an R Notebook. You can work in groups of up to three, and submit a single assignment, if you like. Please indicate the names of all group members, but also please create a “Problem Set Group” on Canvas, which I think you can do here.

Question 1-Big Floods in Big Thompson!

Question 1a The Big Thompson watershed is no stranger to big floods. The best known of these are the 31 July 1976 flood that killed 144 people, and the September 2013 front range floods, that were just a huge mess. Briefly describe the key differences between these two floods, based on the following USGS infographic. Pay particular attention to where and how much rain fell. Also, if you examine closely the tabular USGS peak streamflow observations for the site 06733000 BIG THOMPSON RIVER AT ESTES PARK, CO, you’ll note that the 31 July 1976 wasn’t even the largest flood peak for that year at that site. Make sure that point makes it into your comparison. In 1976, one storm event that lasted only a few hours suddenly fell over a small area, and the greatest amount of rainfall fell over the Big Thompson River, which drains through Estes Park. Because this rainfall event was so short and so intense, the flow through the Big Thompson River was incredibly flashy, yet spatially incredibly varied as well. Through the Big Thompson Canyon flood peaks reached unprecedented heights, yet at the Estes Park gage, where the floodplain is more connected, the gage height wasn’t even the greatest of the year. In 2013, heavy rain fell for a week over a large area, impacting several stream systems, including the Big Thompson. While it is clear that 2013 experienced more rainfall, the rate at which rain fell in 1976 cannot be understated.

Question 1b Next, let’s add two more big floods in the watershed. The first is from 1982, which you can see from the graphic below is actually the largest recorded flood peak for that site (note that after 1998, the gage was turned over to the Colorado Division of Water Resources, but trust me that no flood since has been larger than the 1982 one, although 2013 came close!

Returning to the tabular annual peak streamflow data, try to see what caused the 1982 flood (hint: pay attention to the “peak streamflow qualification codes”). Also, Jarrett and Costa (1988; see Yu et al. 2021 for full reference) report paleoflood evidence in the neighborhood of \(85-140 m^3 s^{-1}\) that occurred 8,000-10,000 years ago. What criticisms could you raise about including either the 1982 or the paleoflood data in a quantitative flood frequency analysis (FFA)? Think specifically about the assumption of independent and identically distributed (i.i.d.) flood observations.

This 1982 peak flow occurred due to a breached dam. It would not be appropriate to use the 1982 flood peak or the paleoflood data in a FFA because these events did not occur without outside influence, and therefore does not represent the nature of the watershed and could not occur independently or on an identically distributed timescale.

Question 1c Again examining the tabular annual peak streamflow data, describe when during the year floods in Big Thompson tend to occur, and what you think causes them. The Yu et al. (2021) paper can help in this regard.

Monsoon season in Colorado officially begins on June 15, and the vast majority of peak flows at the Estes Park site occurred in June. This may be because this monsoon season overlaps with snow meltwater runoff; thus on top of an already full river and saturated soils, heavy rains are added.

Question 2-The Times, They are a-Changing (or are they?) Now, let’s read, plot,and examine long-term daily streamflow data for that site:

Question 2a Do you see any changes over time in the streamflow regime (e.g. high flows, low flows, seasonality, etc.)? If yes, describe them. If not, explain what aspects you looked for but failed to find.

Between 2010 and 2020, low flows began to occur with much greater frequency, while high flows generally continued to occur regularly and at expected rates. Additionally, after 2000, instead of flows gradually rising and falling with the seasons, the stream spends much more of the year between 10 and 50 m^3/s, giving the streamflow regime plot a “clumpy” look when viewing the time period after 2000 compared to pre-2000.

Question 2b Briefly describe any deeper analyses that you think might help reveal potential changes in the streamflow observations. You do not need to perform these analyses. Bear in mind that this isn’t a statistics course, so you aren’t expected to know a lot of specific trend analyses or time series methods.

Without knowing the correct terminology to describe this, it really does concern me how “clumpy” the streamflow regime plot becomes post-2000. The stream begins to spend much more time out of the year at lower flows- not extremely low, in fact at higher rates than pre-2000 regular low flows, but perhaps the rate at which you would expect the stream to flow at in October. This coupled with a new trend in extremely low flows occurring every few years post-2000 is very interesting. It is so abrupt, I almost wonder if there was some new development at about 2005? Or, when the gage was turned over to the Colorado Division of Water Resources in 1998, did the methodology of collecting gage reading change (e.g. collected at more concentrated time intervals) that revealed more “irregular” behavior?

Question 3-Assume Crash Position, because Things are About to Get Extreme For this question, you’ll need to use the ‘extRemes’ R package by Eric Gilleland and Rick “Tricky Ricky” Katz at NCAR. We will fire up the library and read in some data, but first you should familiarize yourself with their paper Gilleland and Katz (2016), which introduces the package and tells you enough to survive this question. Pay particular attention to Section 1, Section 2 (excluding 2.1, 2.2, and 2.4), and the early parts of Section 3 (basically, 3.1 up to and including pg. 10/Fig. 2). Also, we’re only going to focus on the GEV distribution, so if you want, you can ignore anything related to the GP distribution (although it is a good thing to know about!).

NOTE: The data we’ll use are precisely those used to fit the dashed (snowmelt peaks) and solid (rainfall peaks) black lines in Figure 9 of Yu et al. (2021).

So there is your data, your mission.

Question 3a Fit GEV distributions to both the snowmelt-related and the rainfall-related peaks. First, comment on whether you think the GEV distribution, and therefore the extreme value theory more broadly, seems to describe these data, within reason. To put this more simply: comment on the goodness of fit revealed by when you used the “plot()” function on a fit produced by the “fevd()” function. (HINT: you should get something that looks like Fig. 2 from Gilleland and Katz, 2016).

Note: contents of chunks are included to show work

#snowpeaks plots
plot(snowpeaks$discharge*0.0283, type = "l", col = "darkblue", lwd = 1.5, cex.lab = 1.25, main= "Snowpeaks", xlab = "Year", ylab = "Maximum Snow Peak (m3/s)")

fit1 <- fevd(snowpeaks$discharge*0.0283, snowpeaks, method=c("Lmoments"), units = "m3/s")
distill(fit1)
##   location      scale      shape 
## 20.6408727  9.1619813 -0.1004036
plot(fit1)

#rainpeaks plots
plot(rainpeaks$discharge*0.0283, type = "l", col = "darkblue", lwd = 1.5, cex.lab = 1.25, main="Rainpeaks", xlab = "Year", ylab = "Maximum Snow Peak(m3/s)")

fit2 <- fevd(rainpeaks$discharge*0.0283, rainpeaks, method=c("Lmoments"), units = "m3/s")
distill(fit2)
##   location      scale      shape 
## 12.4876121  5.5318506  0.3325414
plot(fit2)

Considering the diagnostics from the GEV df fitted to the snowpeak and rainpeak data, it seems a bit premature to claim that the GEV describes the data well. In each dataset, the model overestimates the data (looking at the density plots of empirical data and fitted GEV df). The 95% confidence bands revealed when plotting quantiles from a sample drawn from the fitted GEV df against the empirical data quantiles are wider than I would prefer. I wonder if this poor fit is due to the relatively few number of data points we are working with, and the “abnormal” nature of the 2013 event.

Question 3b Use the “return.level()” function to examine the 2-year, 20-year, and 100-year return period estimates. Compare them across the two distributions, and provide your results.

#snowpeaks return levels
return.level.fevd.lmoments(fit1, return.period = c(2,20,100),do.ci = TRUE)
## fevd(x = snowpeaks$discharge * 0.0283, data = snowpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##              2.5% Estimate    97.5%
## 2-year   20.67369 23.93782 27.73030
## 20-year  36.41636 44.17105 51.76309
## 100-year 41.41906 54.39451 71.45907
#is it necessary to use return.level.fevd.lmoments() instead of return.level()?
#fevd was already run with the L-moments method specified, which may be why both
#syntaxes produce the same result. It still seems good to specify the method

#rainpeaks return levels
return.level.fevd.lmoments(fit2, return.period = c(2,20,100), do.ci = TRUE)
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##              2.5% Estimate     97.5%
## 2-year   12.20828 14.64384  17.72941
## 20-year  25.79468 40.51915  58.75457
## 100-year 34.86103 72.65629 140.67497

The 2-year estimated return period flow for snowpeaks is greater than that of rainpeaks, while the rainpeaks 100-year estimated return period flow surpasses that of the snowpeaks’. This tells me that the rainpeak data is more tail-heavy than the snowpeak data.

Question 3c Finally, compare the values of the resulting parameters: location (indicative of the mean of a distribution), scale (indicative of the spread/variance of a distribution), and shape (indicative of a distribution’s skewness). You should do this estimation using the method of L moments (use the extRemes help manual or help command ‘?fevd’ to see how to do this). Of particular interest with extreme values is the shape parameter. What does the sign of the shape parameters of these two GEV distributions indicate about the existence or nonexistence of an asymptotic upper bound on the underlying process? Try to connect your answer to parts a and b of this question.

#snowpeaks parameters
ci(fit1,type="parameter")
## fevd(x = snowpeaks$discharge * 0.0283, data = snowpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##                2.5%   Estimate      97.5%
## location 17.3913164 20.6408727 24.0925792
## scale     6.6092024  9.1619813 11.4683983
## shape    -0.3748613 -0.1004036  0.1252027
#rainpeaks parameters
ci(fit2, type="parameter")
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##                 2.5%   Estimate      97.5%
## location 10.68327761 12.4876121 14.7825670
## scale     3.75777371  5.5318506  7.7843312
## shape    -0.02702344  0.3325414  0.5686012

For the snowpeaks resulting parameters, the location parameter hovered at about 20, the scale parameter was about 9, and the shape parameter is a rather small negative value. This may indicate that there is an upper bound to snowmelt flooding. For the rainpeaks resulting parameters, the location parameter was about 12 (shifted to a slightly earlier timeframe than that of snowpeaks), the scale parameter was about 5 (more condensed than that of snowpeaks) and the shape parameter was a positive value, indicating a heavy tail and an unbounded nature.

Question 3d The following command will help you find the maximum from each year since 1983, regardless of whether the peak is from snowmelt or rainfall:

newframe<- data.frame(rainpeaks=rainpeaks$discharge,snowpeaks=snowpeaks$discharge)
newframe$maxpeak <- apply(newframe, 1, max)  

Fit a new GEV distribution to this new combined dataset. Now you have three different GEV distributions. Compare the 5-year and 500-year estimates from your three GEV distributions. What does this seem to say about the magnitude of very rare floods from these different types of processes? And what does this say about “business as usual”, i.e ignoring the underlying processes that created floods, instead treating all datapoints as i.i.d.?

#combined peaks plots
plot(newframe$maxpeak*0.0283, type = "l", col = "darkblue", lwd = 1.5, cex.lab = 1.25, xlab = "Year", ylab = "Maximum of Either Peak (m3/s)")

fit3 <- fevd(newframe$maxpeak*0.0283, newframe, method=c("Lmoments"), units = "m3/s")
distill(fit3)
##    location       scale       shape 
## 21.51047837  8.99678829  0.09771038
plot(fit3)

#snowpeaks 5 & 500 year return periods
return.level.fevd.lmoments(fit1, return.period = c(5,500), do.ci = TRUE)
## fevd(x = snowpeaks$discharge * 0.0283, data = snowpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##              2.5% Estimate    97.5%
## 5-year   28.74438 33.39854 38.52143
## 500-year 44.76789 62.99368 98.97361
#rainpeaks 5 & 500 year return periods
return.level.fevd.lmoments(fit2, return.period = c(5,500), do.ci = TRUE)
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##              2.5%  Estimate     97.5%
## 5-year   18.64091  23.24603  28.38856
## 500-year 45.31018 127.19329 349.91943
#combined peaks 5 & 500 year return periods
return.level.fevd.lmoments(fit3, return.period = c(5,500), do.ci = TRUE)
## fevd(x = newframe$maxpeak * 0.0283, data = newframe, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##              2.5% Estimate     97.5%
## 5-year   30.28850 36.04414  42.90463
## 500-year 54.94376 98.40911 195.35682

The 500-year estimated snowpeaks flood is estimated to be about 63m3/s, while the 500-year estimated rainpeaks flood is estimated to be about 127m3/s. When these two datasets are combined, the snowpeaks data dampens the rainpeaks data, and the 500-year combined estimated flood is estimated to be about 98m3/s. Clearly, floods resultant of extreme rain events are greater in magnitude (although it is worth noting that snowmelt generated runoff dominates annual flood peaks). In our changing environment, more precipitation is falling as rain instead of snow. Therefore if we continue to estimate flood return periods and magnitudes based on this combined approach, we may underestimate flood peaks and return periods.

Question 3e Based on these data and extreme value theory, attempt to determine the upper bound of snowmelt flooding in this watershed. (HINT: you can do it entirely by modifying commands you’ve already used in this question.) If you can find an upper bound, or some semblance of one, what do you think is the physical explanation for its existence?

return.level.fevd.lmoments(fit1, return.period = c(500, 5000, 50000, 500000, 5000000), do.ci = TRUE)
## fevd(x = snowpeaks$discharge * 0.0283, data = snowpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##                2.5% Estimate     97.5%
## 500-year   42.90072 62.99368  94.77462
## 5000-year  43.93930 73.09036 139.93932
## 50000-year 44.34253 81.09975 199.94806
## 5e+05-year 44.49918 87.45567 282.65462
## 5e+06-year 44.56004 92.49965 403.91311

According to the above return level data, it seems that the upper limit of snowmelt flooding in this watershed is somewhere below 100m3/s. This upper limit seems to reflect the upper limit of the snowfalls on record- the watershed has never experienced (on record) enough snowfall to ever generate a flood above this threshold.

Question 3f Lastly, the dirty laundry. Comment on the following:

  1. How important is the 2013 flood for determining the upper tail (e.g. 500-year return period) of the rainfall-driven flood peak distribution?

The 2013 flood is very important for determining the upper tail of the rainfall-driven flood peak distribution, although, as no other flood compares to it, the confidence of this estimate is very low.

  1. How important is the choice of parameter estimation technique? To do this, compare fitting using L moments and maximum likelihood estimation (MLE).

First I’d like to visually compare the diagnostics from the GEV df fitted to the rainpeak data for both the L-moments and MLE methods, as well as the estimated parameters and return periods and magnitudes

plot(fit2)

ci(fit2,type="parameter")
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##                 2.5%   Estimate      97.5%
## location 10.64130398 12.4876121 14.8743449
## scale     3.62517509  5.5318506  7.7093713
## shape    -0.03082949  0.3325414  0.6326091
return.level.fevd.lmoments(fit2, return.period = c(5,500), do.ci = TRUE)
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("Lmoments"), 
##     units = "m3/s")
## 
## [1] "Parametric Bootstrap"
## 502  iterations
## 
##              2.5%  Estimate     97.5%
## 5-year   18.51070  23.24603  28.87991
## 500-year 45.13914 127.19329 347.50986
fit4 <- fevd(rainpeaks$discharge*0.0283, rainpeaks, method=c("MLE"), units = "m3/s")
plot(fit4)

ci(fit4,type="parameter")
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("MLE"), 
##     units = "m3/s")
## 
## [1] "Normal Approx."
## 
##          95% lower CI   Estimate 95% upper CI
## location  10.69565384 12.9977974   15.2999410
## scale      4.66147992  6.4360997    8.2107194
## shape     -0.01691118  0.1833998    0.3837108
return.level.fevd.lmoments(fit4, return.period = c(5,500), do.ci = TRUE)
## fevd(x = rainpeaks$discharge * 0.0283, data = rainpeaks, method = c("MLE"), 
##     units = "m3/s")
## 
## [1] "Normal Approx."
## 
##                       95% lower CI Estimate 95% upper CI
## 5-year return level       19.15088 24.10999      29.0691
## 500-year return level     26.58074 87.58637     148.5920

Upon comparison of the above plots and values, it seems that the choice of parameter estimation technique is very important. Due to the 2013 flood, confidence is very low in each, however using the L-moments method, the shape parameter was estimated to be 0.33 and the 500-year flood was estimated to be 127m3/s, while using the MLE method produced a shape parameter estimate of 0.18 and a 500-year flood of 87.59. The heavier tail estimated via the L-moments method suggests the possibility of floods of greater magnitude. If this observation is correct and applies more broadly, I understand why NOAA precipitation frequency analysis is so firmly grounded on L-moment statistics.

  1. Based on your answers to subquestions 1 and 2, state in your own words in no more than one or two paragraphs the potential value of the “process-based” approach to flood frequency analysis described in Yu et al. (2021). Also, provide a few more sentences describing some of the shortcomings of process-based approaches (don’t just quote from Section 5.3 of that paper).

The “process-based” approach to flood frequency analysis described in Yu et. al. (2021) provides something that the above GEV curves lack: time. A key reason the confidence bands in the above analyses are so wide is that the timespan we are working with is only 36 years old, therefore “anomalous” events such as the 2013 flood introduces a great amount of uncertainty in the distributions. Over the span of 36 years, a limited combination of events may occur, and the full range of possibilities (antecedent soil moisture, air temperature changes, rain over snow events, etc.) cannot be recorded and used to develop a complete flood frequency analysis. Yu et. al. (2021) utilizes this 36 years of input data and via stochastic storm transposition, temporally resamples and spatially transposes storms to develop 10 realizations of 1,000 synthetic years, which, after feeding through event-based simulations, provides a far more (synthetically) informed flood frequency distribution.

The transposition of storms and assuming the relationships between precipitation/snowmelt events and behaviors of the basin may limit the accuracy of process-based approaches to flood frequency analysis. Storm transposition is an entire discipline that requires knowledge of likely meteorological behavior and inspires questions of homogeneous areas. Additionally, assumptions must be made regarding the relationship between basin characteristics and meteorological events, and while Yu et. al. (2021) worked hard to address antecedent soil moisture conditions, other factors such as baseflow may be more difficult to parameterize.

Question 4 Suppose that you have been tasked with estimating the largest imaginable flood in the Big Thompson watershed, irrespective of its probability of occurrence. How would you approach this problem? Plan on answering in ~500 words, plus (optionally) any graphics that you think might be helpful.

To estimate any largest imaginable flood in the heavily dammed western United States, it is tempting to simply calculate the effects of all infrastructure failing at once, therefore forcing the volume of all reservoir stores and flood control structures through the watershed at once. That is certainly the largest imaginable flood I can think of that hardly has a chance of occurrence. Barring widespread monkeywrenching or any national security threat, this scenario cannot be taken seriously. So instead, I would suggest a modification of the process-based approach outlined in Yu et. al. (2021) while considering the larger ideas presented in Seminal Paper 5, Vance Myers’ “Meteorology of Hypothetical Flood Sequences in the Mississippi River.”

Within Myers (1959), the issue at hand was how to design flood control infrastructure on the Lower Mississippi River following the Great Mississippi flood of 1927 and 1928 Flood Control Act. The proposed solution was to dream up the most catastrophic series of meteorological events, within reason, and predict how the basin would respond. In the spirit of this idea, I would want to feed a proccess-based approach the most catastrophic meteorological events that behaved similarly to the Big Thompson Floods of 1976 and 2013. The 1972 Black Hills Flood and the 1995 Rapidan Storm may qualify, as each displayed similar characteristics, and may justifiably have a change of occurrence over the Big Thompson. Thus after calibrating a model of the Big Thompson watershed with real forcing variables from 1983 to 2018, I would introduce only the most catastrophic rainfall events via stochastic storm transposition.

Yu et. al. uses random selection of events and parameters to generate a catalog of events; RainyDay randomly selects storms from the catalog to represent synthetic years, and antecedent watershed conditions are randomly selected against the backdrop of these randomly selected storms. I would instead program the process to select for the most catastrophic situations; extreme storms centered directly over the watershed, while the soil moisture and baseflows are already at maximum values.

Given the relatively short period over which we have a reliable series of precipitation and flood data for our watersheds, each of the three principles for improving estimation of extreme flood probabilities must be used. Trading space for time via storm transposition, focusing on the tails by only introducing the largest storms to our models, and introducing more structure to these models via extreme value theory are all important. In today’s climate, it does not seem unreasonable to attempt to model the most catastrophic flood possible, as we very well may need to be prepared for such events in the nearing future.