Background: challenges associated with enforcing distributional support when fitting GEV models

In its location-scale parameterization the GEV has CDF

\[\begin{equation} G(y) = \text{exp}\left\{-\left[1 + \xi\left(\frac{y-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}. \end{equation}\]

The support of the distribution depends on all three parameters and the data:

\[\begin{equation} \tag{(1)} \{ y : 1 + \xi(y-\mu)/\sigma > 0 \} \end{equation}\]

where \(-\infty < \mu < \infty\), \(\sigma > 0\), and \(-\infty < \xi < \infty\).

This is typically handled (see Aki Vehtari’s work with the Generalized Pareto distribution) by forcing the burden of support onto a single parameter such that the bounds for this one parameter are a continuously differentiable function of the other parameters (and, in the case of the GEV, the maximum and minimum data points as well). In the GEV distribution it makes sense to have the \(\xi\) parameter be the one that accommodates the support.

How the sign of \(\xi\) controls the support of the GEV distribution

If \(\xi\) has bounds dependent on the other two parameters we can think of the sign of \(\xi\) as being determined by the sign of the quantity \((y-\mu)/\sigma\). Assuming all \(y\) are positive, we can consider a toy scenario where \(\mu\) is guaranteed to fall between min(\(y\)) and max(\(y\)):

If \((y-\mu)/\sigma > 0\) then \(\xi < 0\)
Then the \(y\) we care about is \(y_{max}\) and \(\xi\) needs a lower bound such that \[\xi > \frac{-1}{(y_{max}-\mu)/\sigma}.\]
If \((y-\mu)/\sigma < 0\) then \(\xi > 0\)
Then the \(y\) we care about is \(y_{min}\) and \(\xi\) needs an upper bound such that \[\xi < \frac{-1}{(y_{min}-\mu)/\sigma}.\]

This works because we can trap \(\xi\) between two bounds, one always dependent on \(y_{max}\) and the other always dependent on \(y_{min}\). But this logic fails when applied to the GEV distribution in its standard location-scale parameterization since \(\mu\) can fall outside the range of the data. This means we cannot definitively tie \(y_{max}\) to the lower bound and \(y_{min}\) to the upper bound, respectively, since we would need to know the value of \(\mu\) ahead of time to know which order statistic we care about in the context of Eqn (1).

This is where the issue arises with the \(\texttt{brms}\) implementation of the GEV.

brms issue: problems enforcing support with the location-scale parameterization

The \(\texttt{brms}\) package transforms an unconstrained parameter \(\kappa\) to \(\xi\) using the link function

\[\begin{equation} \tag{2} \xi = \left(\frac{1}{1 + e^{-\kappa}}\right)*{\gamma(y,\mu,\sigma)} + \text{min}\left\{ \frac{-1}{\text{min}\left(\frac{y-\mu}{\sigma}\right)} , \frac{-1}{\text{max}\left(\frac{y-\mu}{\sigma}\right)}\right\} \end{equation}\]

where

\[\begin{equation} \gamma(y,\mu,\sigma) = \text{max}\left\{ \frac{-1}{\text{min}\left(\frac{y-\mu}{\sigma}\right)} , \frac{-1}{\text{max}\left(\frac{y-\mu}{\sigma}\right)}\right\} - \text{min}\left\{ \frac{-1}{\text{min}\left(\frac{y-\mu}{\sigma}\right)} , \frac{-1}{\text{max}\left(\frac{y-\mu}{\sigma}\right)}\right\}. \end{equation}\]

Problems arise when \(\mu\) is outside the range of the data and both quantities used for the logit shift in Eqn (2) become positive (or negative).

We simulate 50 data points from the GEV distribution where the parameters are chosen to mimic a typical flood event dataset at a small catchment and the number of data points is a typical flood record length.

set.seed(0)
mu <- 3.25
sigma <- 1.41
xi <- -0.17

n <- 50

gev_data <- list(N = n,
                 y = extRemes::revd(n,mu,sigma,xi,type="GEV"))

c(min(gev_data$y),max(gev_data$y))
## [1] 0.8644518 6.8028075

For debugging purposes, we can encode the brms link function as a transformed parameter in a local Stan model (see attached minimum working example in issue report on github). The sampler proposes the following parameter values during warmup:

t_mu = 14.0945 # mu greater than max(data)
t_sigma = 2.33266
t_k = -2.61045

With these values, under the shifted logit transformation in Eqn (2), the resulting \(\xi\) can never be smaller than

bnds <- c(-1/min((gev_data$y-t_mu)/t_sigma),-1/max((gev_data$y-t_mu)/t_sigma))
min(bnds)
## [1] 0.1763153

But we would need \(\xi\) smaller than this value to enforce support:

The \(\xi\) value produced by the transform and these parameter values (purple star in above plot) is clearly outside the support:

t_xi <- (1/(1+exp(-t_k)))^(max(bnds)-min(bnds)) + min(bnds)

# check support of the GEV 
min(1 + t_xi*(gev_data$y-t_mu)/t_sigma) > 0
## [1] FALSE

This results in a \(\texttt{NaN}\) value when the log-likelihood is evaluated.

Having this \(\texttt{NaN}\) value is an issue because we are essentially defining an area of our parameter space to have zero likelihood without reflecting this in the parameter bounds. This can cause fitting issues that manifest as divergences and initialization problems that persist even with good priors and initial values. We noticed this when we started trying to fit the brms implementation of the gev on real flood data.

We can’t fix this distributional support problem by constraining \(\mu\) to be within the range of data because there is no viable reason why it should be so. Instead, we think the solution is to reparameterize the GEV such that we can make claims about one of the parameters that allow us to reliably tie the order statistics (ymin, ymax) to the bounds on \(\xi\).

Enforcing support via a quantile-based reparameterization of the GEV

This reparameterization and the subsequent parameter bounds rely on two properties of the data that are common in extreme event modeling: first, that all events are positive (for example, we could have a series of flood events where the amount of water is measured in cubic meters per second; negative values would not make sense here) and second, that the \(\xi\) parameter falls between -0.5 and 0.5. This second assumption is common in hydrological modeling of extremes (link).

The relationship between the location parameter, \(\mu\), and the median, \(\eta\), is given as \[\begin{equation} \eta = \begin{cases} \mu + \sigma \frac{\text{log}(2)^{-\xi}-1}{\xi} & \text{if}\ \xi \neq 0 \\ \mu - \text{log}\left(\text{log}(2)\right) & \text{if}\ \xi = 0. \end{cases} \label{eqn:medianrelation} \end{equation}\]

In our application we also reparameterize the scale parameter to reduce dependency:

\[\begin{equation} \beta = \text{log}\left(\frac{\sigma}{\eta}\right). \label{eqn:beta} \end{equation}\]

Under this reparameterization the support condition then becomes

\[\begin{equation} \text{min}\left\{1 + \xi \left( \frac{y-\eta + \eta e^{\beta} \left(\frac{\text{log}(2)^{-\xi}-1}{\xi}\right) }{\eta e^{\beta}} \right)\right\} > 0. \end{equation}\]

Ignoring the \(\texttt{min()}\) for the time being, setting the \(1 + \xi(...)\) quantity equal to zero and solving for \(\xi\) gives a solution in the form of the Lambert-W function (analytic continuation of the product log function):

\[\begin{equation} \tag{(3)} \xi = \frac{W_{c_1}\left( \frac{e^{\beta}\eta \text{log}(\text{log}(2))}{\eta - y} \right)}{\text{log}(\text{log}(2))}, \hspace{0.5cm} \eta \neq y, \,\, e^{\beta}\eta \neq 0. \end{equation}\]

The Lambert W function has an infinite number of branches, most of which are in the complex plane. Luckily for us, however, the range of values we are interested in (-0.5 to 0.5) is fully contained within one of the two branches of this function that deal with real numbers:

With our new parameterization it is reasonable to assume

\[\begin{equation} \tag{4} 0 \leq y_{min} < \eta < y_{max} \end{equation}\]

since the median flood for a catchment should not exceed the maximum flood observed or be less than the minimum observed flood. Knowing this, and observing that the Lambert-W function is strictly increasing within (-0.5,0.5), it follows that \(y_{max}\) minimizes Eqn (3) and \(y_{min}\) maximizes Eqn (3), giving us bounds

\[\begin{equation} \tag{5} \frac{W_{0}\left( \frac{e^{\beta}\eta \text{log}(\text{log}(2))}{\eta - y_{max}} \right)}{\text{log}(\text{log}(2))} < \xi < \frac{W_{0}\left( \frac{e^{\beta}\eta \text{log}(\text{log}(2))}{\eta - y_{min}} \right)}{\text{log}(\text{log}(2))}. \end{equation}\]

Since the Lambert-W function is only defined for values \(\geq -1/e\), we need additional bounds on the \(\beta\) parameter. The argument of the Lambert-W function in the lower bound is always positive, so we only need to worry about the argument that depends on \(y_{min}\) in the upper bound when constructing bounds for \(\beta\). Setting

\[\begin{equation} \frac{e^{\beta}\eta \text{log}(\text{log}(2))}{\eta - y_{min}} \geq -\frac{1}{e} \end{equation}\]

we find that \(\beta\) has an upper bound given by

\[\begin{equation} \tag{6} \beta < \text{log}\left( \frac{y_{min}-\eta}{\eta \text{log}(\text{log}(2))} \right) - 1 \end{equation}\]

We can plot the bounds as a function of \(\beta\) for a chosen \(\eta\):

where the green shading in the above figure indicates a prior for \(\xi\) over -0.5 to 0.5 and the grey dashed line is the upper limit for \(\beta\).

So, in sum, this construction requires 4 components:

  1. constraints on \(\eta\) and \(y_{min},y_{max}\) given in Eqn (4)
  2. constraints on \(\xi\) given in Eqn (5)
  3. constraints on \(\beta\) given in Eqn (6)
  4. and a prior on the \(\xi\) parameter such that it can only take values within the interval \((-0.5,0.5)\).

Remaining questions

A minimum working example for both Stan models discussed above is attached to the Github issue report.

The median reparameterization would give us a way to enforce distributional support for an individual (local) GEV model.

However, the open question is if this does us any good if we want to put regressors on the parameters, i.e. for a regional flood model. It’s difficult to think of link functions that would enforce these bounds.

This is an interesting, but difficult, problem.

This content is aligned center.