library(BASTION)
library(igraph)
library(ggraph)
library(tidyverse)
library(ggpubr)
library(sf)
library(sp)
library(sn)
library(fdaPDE)
library(mgcv)
library(doParallel)
library(foreach)
library(doRNG)
library(mvtnorm)
library(TruncatedNormal)
library(scoringRules)
library(spatstat)
library(gridExtra)
library(viridis) # scale_color_viridis() is used in the plotting code below
Constrained Spatial Point Process Intensity Estimation using Ensembles of Spanning Trees
Introduction
Analysis of spatial point patterns (event occurrences) is needed for many applications
A reasonable starting point is nonparametric estimation of the first-order moment (the intensity function)
The validity of the most popular approaches (kernel methods¹, LGCP², spline methods³) depends on a Euclidean domain assumption
Many data sets are collected on a complex domain (e.g., road/pipeline networks, domains with interior holes, land separated by water)
Many point processes may have sharp discontinuities in their intensity
Intensity Estimation
We assume we can characterize our point pattern by an inhomogeneous Poisson process
Using our data, we want to estimate the function \(\lambda(s)\) on our entire domain
- \(\lambda(s)\) may not be globally smooth
Since \(\lambda(s)=\lim_{\nu(ds)\rightarrow 0}\frac{E(N(ds))}{\nu(ds)}\), we can treat \(\lambda(s)\nu(ds)\) as the probability of an event occurring in the ‘infinitesimal’ set \(ds\)
- So, in some sense, we want to ‘collapse’ \(ds\) at every point \(s\) in our domain while tracking \(N(ds)\) to estimate \(\lambda(s)\) (see the sketch below)
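To make the limit definition concrete, here is a minimal sketch using spatstat (already loaded above; the intensity function is purely illustrative, not from the example data): simulate an inhomogeneous Poisson process and estimate \(\lambda\) by \(N(ds)/\nu(ds)\) over small cells.
# Minimal sketch: crude intensity estimate as counts per unit area
set.seed(1)
lam = function(x, y) 200 * exp(-3 * x)           # assumed (illustrative) true intensity
X   = rpoispp(lam, lmax = 200, win = square(1))  # simulate the point pattern
qc  = quadratcount(X, nx = 10, ny = 10)          # N(ds) over a 10 x 10 grid of cells
lambda_hat = as.vector(qc) / (0.1 * 0.1)         # divide by nu(ds), the cell area
Shrinking the cells sharpens the target \(\lambda(s)\) but leaves fewer points per cell; the partition-based model below instead lets the data choose the regions.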
Existing Literature
(Ferraccioli et al. 2021) - density estimation as equivalent problem to inhomogeneous PPP intensity estimation (smooth only)
(Baddeley et al. 2020) - kernel estimators on line networks (smooth only)
(Moradi et al. 2019) - resample-smoothing for Voronoi intensity estimator
Additive Tree Based Approach
Idea: Use the Bayesian Random Spanning Tree Partition model framework instead of directly using Voronoi cells
Model \(\lambda(s) = \sum_{m=1}^M \lambda_m(s)\)
Assume that \(\lambda_m(s)\) is piecewise constant on the domain
Which regions are constant (i.e., the partition) varies with \(m\)
This greatly simplifies the normally challenging integral: on each cluster \(A_j\), \(\int_{A_j}\lambda_m(s)\,ds = \lambda_{mj}\,\nu(A_j)\), where \(\lambda_{mj}\) is learner \(m\)’s constant on \(A_j\) (see the sketch after this list)
We’ll call each \(\lambda_m(s)\) a ‘learner’
Across the MCMC, we merge and split the piecewise constant regions for flexible, spatially contiguous partitioning
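To make the additive structure concrete, here is a minimal sketch with hypothetical objects (not the BASTION API): each learner holds a cluster membership per mesh cell plus one constant per cluster, and the normally challenging Poisson integral collapses to a finite sum.
# Hypothetical learner: L$membership gives the cluster id of each mesh cell,
# L$rates gives the constant lambda_mj for each cluster; cell_area holds the
# Lebesgue measure nu() of each cell.
eval_ensemble = function(learners, cell_area) {
  per_learner = sapply(learners, function(L) L$rates[L$membership]) # lambda_m(s) per cell
  lambda_hat  = rowSums(per_learner)                                # lambda(s) = sum_m lambda_m(s)
  integrals   = sapply(learners, function(L) {
    nu_j = tapply(cell_area, L$membership, sum)                     # nu(A_j) per cluster
    sum(L$rates[as.integer(names(nu_j))] * nu_j)                    # sum_j lambda_mj * nu(A_j)
  })
  list(lambda = lambda_hat, integral = sum(integrals))
}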
Model Description
Data \(y = \{y_1,\ldots,y_N\}\) lying in a constrained domain \(A\)
Partition \(\pi = \{A_1,\ldots,A_k\}\) of our domain into \(k\) contiguous subregions \(A_j\), \(j \in \{1,\ldots,k\}\)
- Each of these subregions is a cluster
\(|A_j| = \nu(A_j)\) is the Lebesgue measure of cluster \(A_j\)
- For most problems this is just the area, but it could be a length or a volume
\(n_j = \sum_{i=1}^N I(y_i \in A_j)\), the number of observations in each cluster
- Our likelihood is
\[p(\underline{y}\mid \underline{\lambda}, \pi, k) \propto \prod_{j=1}^{k}\lambda_j^{\,n_j}\exp\{-\lambda_j\,\nu(A_j)\}\]
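For one learner with piecewise constants, this log-likelihood is trivial to evaluate; a minimal sketch with hypothetical vectors n_j, nu_j, and lambda_j holding the per-cluster counts, measures, and constants:
# Piecewise-constant inhomogeneous Poisson log-likelihood, up to a constant:
#   sum_j ( n_j * log(lambda_j) - lambda_j * nu_j )
loglik_pwc = function(n_j, nu_j, lambda_j) sum(n_j * log(lambda_j) - lambda_j * nu_j)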
So, per learner, we’ll use Gibbs sampling⁴ to estimate the posterior distributions:
\[\begin{align} [k^{(t+1)} &\mid k^{(t)}]\\ [\underline{\lambda}^{(t+1)} &\mid \pi^{(t)},\underline{y},k^{(t+1)}]\\ [\pi^{(t+1)} &\mid \underline{\lambda}^{(t+1)},\underline{y},k^{(t+1)}] \end{align}\]
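The \(\underline{\lambda}\) step has a closed-form full conditional: with independent Gamma priors on the piecewise constants (the shape_hp and rate_hp hyperparameters set in the simulation below), the likelihood above yields a conjugate Gamma update. A minimal sketch, not the BASTION internals:
# One Gibbs draw of a learner's piecewise constants, assuming the Gamma prior
# used below: lambda_j | ... ~ Gamma(shape + n_j, rate + nu(A_j))
draw_lambda = function(n_j, nu_j, shape = 1/3, rate = 1e-7) {
  rgamma(length(n_j), shape = shape + n_j, rate = rate + nu_j)
}
The \(\pi\) step has no closed form, which is why the partition update uses Metropolis-within-Gibbs (see Footnotes).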
Computation
# Four diagnostic panels: the domain boundary, the Voronoi mesh over the
# domain, one spanning-tree partition of the mesh nodes, and the simulated
# points colored by cluster membership.
bound_gg = ggraph(example_rsf, layout = example_mesh %>%
                    st_centroid() %>%
                    st_coordinates()) +
  geom_sf(data = boundary_rot)
mesh_gg = ggraph(example_rsf, layout = example_mesh %>%
                   st_centroid() %>%
                   st_coordinates()) +
  geom_sf(data = boundary_rot) +
  geom_sf(data = example_mesh) +
  geom_node_point()
partition_gg = ggraph(example_rsf, layout = example_mesh %>%
                        st_centroid() %>%
                        st_coordinates()) +
  geom_sf(data = boundary_rot) +
  geom_sf(data = example_mesh) +
  geom_edge_link() +
  geom_node_point(aes(color = Partition)) +
  scale_color_discrete()
data_gg = ggplot(data = example_points) +
  geom_sf(data = boundary_rot) +
  geom_sf(data = example_mesh) +
  geom_point(aes(x, y, color = mem))
grid.arrange(bound_gg, mesh_gg, partition_gg, data_gg, nrow = 2)
Simulation Example
# boundary specifies the data domain (can be constrained)
# N_LEARNERS specifies the number of simultaneous chains in the bagging setup
# MESH_REF specifies how fine the meshes should be on the domain
# data is the observations in the form of coordinates (sf points)
learners = makeVoronoiMeshes(boundary = boundary_rot,
                             N_LEARNERS = 48,
                             MESH_REF = 100) %>%
  makeLearners(data = simulated_data_rot)
# How long to run the chain
MCMC = 5000
# Initial iterations to get rid of to reach stationary distribution
BURNIN = 1000
# Sampling interval to reduce autocorrelation
THIN = 20
# Alpha and beta for Gamma prior distribution on piecewise constants
shape_hp = (1/3)
rate_hp = 0.0000001
# Lambda for truncated Poisson prior on number of clusters
lambda_k_hp = 50
# Where to truncate the Poisson prior on number of clusters
max_clusters = 100
densityOutputRot = densityBASTunconditioned(learners = learners,
                                            MCMC,
                                            BURNIN,
                                            THIN,
                                            shape_hp,
                                            rate_hp,
                                            lambda_k_hp,
                                            max_clusters,
                                            seed = 12345,
                                            data_prior = TRUE,
                                            normalized = TRUE,
                                            subsetting = "dirichlet",
                                            parallel_cores = 12,
                                            detailed_output = FALSE)
# Evaluation grid: turn the coordinates of the true-density grid into sf points
grid_points = truth_density_rot[,1:2] %>%
  as.matrix() %>%
  st_multipoint() %>%
  st_sfc() %>%
  st_cast("POINT") %>%
  st_sf()
intensity_samples = getDensitySamples(densityOutputRot,
                                      grid_points,
                                      intensity = TRUE)
# Convert intensity draws to densities: each posterior draw is normalized to sum to one
density_samples = t(apply(intensity_samples, 1, function(x) x/sum(x)))
grid_post_mean = colMeans(density_samples)
grid_post_median = apply(density_samples, 2, median)
grid_post_sd = apply(density_samples, 2, sd)
grid_post_mean_sqerr = (grid_post_mean - truth_density_rot$full_dens)^2
grid_post_mean_abserr = abs(grid_post_mean - truth_density_rot$full_dens)
grid_post_median_sqerr = (grid_post_median - truth_density_rot$full_dens)^2
grid_post_median_abserr = abs(grid_post_median - truth_density_rot$full_dens)
grid_MISE = mean(grid_post_mean_sqerr)
grid_MISE_median = mean(grid_post_median_sqerr)
cat("MISE based on posterior mean:", grid_MISE, "\n")MISE based on posterior mean: 1.981295e-10
cat("MISE based on posterior median:", grid_MISE_median, "\n")MISE based on posterior median: 1.981582e-10
grid_crps = crps_sample(y = truth_density_rot$full_dens,
                        dat = t(density_samples))
cat("Mean CRPS:", mean(grid_crps), "\n")
Mean CRPS: 7.645029e-06
grid_MAE = mean(grid_post_mean_abserr)
grid_MAE_median = mean(grid_post_median_abserr)
cat("MAE based on posterior mean:", grid_MAE, "\n")MAE based on posterior mean: 8.137434e-06
cat("MAE based on posterior median:", grid_MAE_median, "\n")MAE based on posterior median: 8.130804e-06
grid_df = cbind(truth_density_rot[,1:2],
                grid_post_mean,
                grid_post_median,
                grid_post_sd,
                grid_post_mean_abserr,
                grid_post_median_abserr,
                grid_post_mean_sqerr,
                grid_post_median_sqerr)
grid_plot = ggplot(grid_df, aes(x = x, y = y))
# The grid.arrange() wrapper is commented out so each panel prints separately
#grid.arrange(
ggplot(truth_density_rot, aes(x = x, y = y)) +
  geom_point(aes(color = full_dens)) +
  # geom_point(data = as.data.frame(simulated_data_matrix_rotated),
  #            size = 0.1) +
  scale_color_viridis(limits = c(0, 0.0002)) +
  guides(fill = "none", shape = "none") +
  labs(title = "True Intensity Field",
       color = "True Intensity")#,
grid_plot + geom_point(aes(color = grid_post_median)) +
  scale_color_viridis(limits = c(0, 0.0002)) +
  # geom_point(data = as.data.frame(simulated_data_matrix_rotated),
  #            size = 0.1) +
  labs(title = "Posterior Median Intensity Field",
       color = "Est. Intensity")#,
grid_plot + geom_point(aes(color = grid_post_mean_sqerr)) +
  scale_color_viridis() +
  labs(title = "Posterior Mean - Squared Error",
       color = "SE(Intensity)")#, #ncol = 3
#)
Real Data Example
Ongoing Research
Boosting vs Bagging
- So far, bagging has worked better, but boosting using thinning properties should be possible
- Model Averaging
Mixture of Experts
Better transition kernel for the partition based on graph message passing
Line network data
- Lebesgue measure is agnostic to dimension; we just need a reasonable graph construction (see the sketch after this list)
Variational Inference for Speed Improvements
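As one illustration of the graph-construction point above, a minimal sketch using sf and igraph (with a hypothetical sf LINESTRING collection called segments, not part of the current code): segments become graph nodes, segments sharing an endpoint become edges, and segment length plays the role of \(\nu\).
# Hypothetical line-network graph: nodes are segments, edges join segments
# that share an endpoint, and nu() is segment length rather than area
adj = 1 * st_touches(segments, sparse = FALSE)
g = graph_from_adjacency_matrix(adj, mode = "undirected", diag = FALSE)
V(g)$measure = as.numeric(st_length(segments))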
Conclusion
The new method is robust for point patterns on constrained domains
It provides uncertainty quantification through its Bayesian framework
This allows better estimation of discontinuities in the intensity function
Data along the boundary of a discontinuity are needed to capture it effectively
Different meta-model structures are reasonable and possible
References
Footnotes
We have to do Metropolis-within-Gibbs because the posterior of the partition is not available in closed form