The support of the GEV changes depending on \(\xi\):

\[\begin{align} \xi > 0&: \hspace{1cm} y \in \left[\mu - \frac{\sigma}{\xi}, +\infty\right) \hspace{0.25cm} \implies \hspace{0.25cm} \text{min(y)} \geq \mu - \frac{\sigma}{\xi}\\[10pt] \xi = 0&: \hspace{1cm} y \in (-\infty, +\infty)\\[10pt] \xi < 0&: \hspace{1cm} y \in \left(-\infty, \mu - \frac{\sigma}{\xi}\right] \hspace{0.25cm} \implies \hspace{0.25cm} \text{max(y)} \leq \mu - \frac{\sigma}{\xi} \end{align}\]

Where \(y\) is our data. This change in support needs to be reflected in the Stan model.

First attempt

Force the bounds on \(\mu\) to satisfy the support inequalities

We place conditional bounds on \(\mu\) in the \(\texttt{parameter}\) block:

functions{
  ## a function that returns the log density of the GEV
  real gev(vector y, real mu, real sigma, real xi){
    int N = rows(y);
    vector[N] lpdf;
    vector[N] b;
    vector[N] a;
    a = (y - mu) / sigma;
    b = 1 + a * xi;
    
    if(fabs(xi) > 1e-15){ ##if xi =/= 0
      for(i in 1:N)
      lpdf[i] = -log(sigma) - (1 + 1/xi)*log(b[i]) - pow(b[i],(-1/xi));
    } else{
      for(i in 1:N)
      lpdf[i] = -log(sigma) - a[i] - exp(-a[i]);
    }
    return sum(lpdf);
  }
}
## The input data is a vector 'y' of length 'N'.
data {
  int<lower=0> N;
  vector[N] y;
}
transformed data {
  real ymin = min(y);
  real ymax = max(y);
}
## The parameters accepted by the model. 
parameters {
  real<lower=0, upper=10> sigma;
  real<lower=-0.5,upper=0.5> xi; 
  real<lower= (xi < 0 ? ymax + sigma/xi : negative_infinity()),
       upper= (xi > 0 ? ymin + sigma/xi : positive_infinity())> mu;
}


## The model to be estimated. We model the output
## 'y' to be distributed GEV with mu, sigma, xi
model {
  ## Priors on the model parameters:
  target += beta_lpdf(-xi + 0.5 | 6, 9);
  
  target += gev(y, mu, sigma, xi);
}

The problem with this model is that the conditional bounds on \(\mu\) as they’re currently written introduce a discontinuity in derivatives at \(\texttt{ymin}\) and \(\texttt{ymax}\): it’s not continuously differentiable at the condition boundaries, which creates problems for Hamiltonian Monte Carlo.

If we run the model for multiple chains, we can observe this behavior; depending on initial value, Stan gets stuck either above or below \(\xi = 0\).

mu = 10
sigma = 2
xi = 0.1

numPts <- 100

# simulate "observed data"
gev_data <- list(N = numPts,
                 y = revd(numPts, mu, sigma, xi))

fit <- stan(
  file = 'GEV_option1.stan',
  data = gev_data,
  chains = 3,
  warmup = 1000,
  iter = 5000,
  cores = 1,
  refresh = 1,
  seed = 42
)
traceplot(fit, inc_warmup = T)

This discontinuity is severe enough that Stan throws a bunch of “divergences” and can’t converge to the true parameter values (note the odd behavior in the trace plot for \(\sigma\)–we can’t trust any values this model is giving us if the sampler is diverging). In Stan terms, a divergence is when “the simulated Hamiltonian trajectory departs from the true trajectory as measured by departure of the Hamiltonian value from its initial value”. In layman’s terms, this means Stan fell off the cliff between \(\xi < 0\) and \(\xi > 0\). We can actually observe this discontinuity if we look at the interaction of \(\xi\), \(\mu\), and the log posterior.