1.2 Question 2) assignment 2 Extremes

CHEVALIER Paolo

2025-02-27

1.2 Question 2)

library(ismev)

Loading the data

england_rainfall = read.table("https://www.mas.ncl.ac.uk/~nlf8/teaching/mas3918/England-Rainfall.txt",header=TRUE)

Attach the data so we don’t have to use england_rainfall$…

attach(england_rainfall)

Overlook on the data

I have to use the Boston data rainfall data.

Let’s look at basic stats from our data.

summary(Boston)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.270   2.111   2.970  38.060

Plotting the series gives the following graph;

plot(ts(Boston))

The data looks somewhat stationary which is great (might have some trend but we aren’t asked to deal with it for now).

hist(Boston)

hist(Boston,ylim=c(0,50))

All those plots show an analysis of extremes (Thresholds) is adequate.

Treshold selection

mrl.plot(Boston)
lines(x=c(16,28),y=c(3.9,4.6),col="green")

16mm seems like a good threshold, linearity can be suggested above that value for u.

Preprocessing the data

threshold=16
boston.above.threshold=Boston[Boston>threshold]
length(boston.above.threshold)
## [1] 242
length(Boston)
## [1] 26662

We end up with 242 observations where we would’ve had 74 with a block maxima approach.

boston.exceedances=boston.above.threshold-threshold

Fitting the GPD

model=gpd.fit(Boston,16)
## $threshold
## [1] 16
## 
## $nexc
## [1] 242
## 
## $conv
## [1] 0
## 
## $nllh
## [1] 573.4325
## 
## $mle
## [1] 3.85315140 0.02062179
## 
## $rate
## [1] 0.009076588
## 
## $se
## [1] 0.37207225 0.07206272

Therefore, we have \(\hat{\sigma}=3.85315140 \, (0.37207225)\) and \(\hat{\xi}=0.02062179 \, (0.07206272)\)

Let’s compute their respective 95% confidence intervals in the usual way

sigma_CI=c(model$mle[1]-1.96*model$se[1],model$mle[1]+1.96*model$se[1])
xi_CI=c(model$mle[2]-1.96*model$se[2],model$mle[2]+1.96*model$se[2])
sigma_CI
## [1] 3.123890 4.582413
xi_CI
## [1] -0.1206211  0.1618647

0 is contained in the 95% confidence interval for \(\xi\) but we will keep it like this assuming the ismev package handles it well.

NB: when setting \(\hat{\xi}=0\) the gpd.prof() function does not work anyway.

Checking model adequacy

gpd.diag(model)

All these plots seems to indicate that our model is adequate.

100 years return levels

Like in the last assignment, let’s create a function to compute return levels using the lesson’s derivations.

return_level=function(u,sigma,xi,rate,ny,r){
  if(xi==0){
    return(u+sigma*log(ny*r*rate))
  }
  return(u+(sigma/xi)*((ny*r*rate)^xi - 1))
}

Let’s now compute the 100 years return level

z100=return_level(threshold,model$mle[1],model$mle[2],model$rate,365.25,100)

Therefore the 100 years return level is \(\hat{z_{100}}=39.76mm\)

Confidence interval for the 100 years return level

Let’s use the profile log-likelihood to derive our 95% confidence interval for \(z_{100}\)

gpd.prof(model,100,xlow=30,xup=150,conf=0.95)
## If routine fails, try changing plotting interval
#to check the maxima is indeed where it should be
abline(v=z100,col="red",lty=2)
#trying to get the bounds for the CI
abline(v=34.7,col="blue",lty=2)
abline(v=51.5,col="blue",lty=2)

Therefore, the confidence interval is \(\text{CI}_{0.95}(z_{100})=[34.7,51.5]mm\)

Conclusion

If we want, say a building to be able to withstand a one in a 100 year rainfall event. We need it to be able to withstand around 52mm of rain.