Compare F of Poisson vs ETAS

Objective:

Generate randomized PIT histograms using two different cumulative mass functions (\(F\)): the Poisson and the empirical distribution for ETAS under simulation.

1. Load simulated data

simmat <- read.csv("../simulated_data/simmat.r")
head(simmat)

##   V1      V2 V3
## 1  0 0.04088  1
## 2  0 0.04121  1
## 3  0 0.04177  1
## 4  0 0.04420  1
## 5  0 0.08061  1
## 6  1 0.34477  1

Where the each row is a pixel, the first column is the observed count, the second is the expected count, and the third is the pattern number.

2. Compute the ETAS e.c.m.f

it <- 50000
NA_vec <- rep(NA, it)
for (i in 1:it) {
  z <- simetas(theta, x1 = 1, y1 = 1, T = endT, sor=1)
  NA_vec[i] <- counts_in_pix(z$lon, z$lat, polygon_list)[64]
}
F_ETAS <- ecdf(NA_vec)
plot(F_ETAS, verticals = TRUE, do.points = FALSE, xlim = c(-2, 15), 
     main = "ETAS ecmf", col = "steelblue")

plot of chunk unnamed-chunk-3

Use these simulated counts \(N(A)\) to write a function that is analogous to ppois().

pETASsub <- function(y, NA_vec) {
 mean(NA_vec <= y) 
}

pETAS <- function(x, NA_vec) {
  sapply(x, pETASsub, NA_vec)
}

3. Plot the histograms with the Poisson F

The important thing to note here is that the histograms have a systematic non-uniformity: the right most bin is consistency too high.

n_pixels <- 400
par(mfrow = c(5, 2))
for(i in 1:10){
  which_sim <- i
  counts <- simmat[simmat[,3] == which_sim, 1]
  meanlambdas <- simmat[simmat[,3] == which_sim, 2]
  
  exp_null <- meanlambdas * (1/n_pixels)
  x_zero <- counts[counts == 0]
  x_nzero <- counts[counts != 0]
  v <- runif(n_pixels)
  
  Fx <- ppois(x_nzero, exp_null[counts != 0])
  Fxm1 <- ppois(x_nzero - 1,exp_null[counts != 0])
  u_nzero <- Fxm1 + v[counts != 0] * (Fx - Fxm1)
  u_zero <- v[counts == 0] * ppois(0, exp_null[counts == 0])
  u <- c(u_nzero, u_zero)
  hist(u)
}

plot of chunk unnamed-chunk-5

4. Plot the histograms with the ETAS F

This method differs from the Poisson F in an important way: for ETAS F, we can simulate a single NA_vec to use on all simulated point patterns. For the previous Poisson F, it is necessary to recalculate E(N(A)) using MC integration on every simulation, which is where the bulk of the computational burden is.

par(mfrow = c(5, 2))
for(i in 1:10){
  which_sim <- i
  counts <- simmat[simmat[,3] == which_sim, 1]

  x_zero <- counts[counts == 0]
  x_nzero <- counts[counts != 0]
  v <- runif(n_pixels)
  
  Fx <- pETAS(x_nzero, NA_vec)
  Fxm1 <- pETAS(x_nzero - 1, NA_vec)
  u_nzero <- Fxm1 + v[counts != 0] * (Fx - Fxm1)
  u_zero <- v[counts == 0] * pETAS(0, NA_vec)
  u <- c(u_nzero, u_zero)
  hist(u)
}

plot of chunk unnamed-chunk-6

These histograms appear be behaving much closer to how we would expect: no systematic deviations from the uniform.