INTRODUCTION

This website is designed as a comprehensive resource for graduate students in statistics to gain a deeper understanding of advanced spatial data analysis. It covers a range of topics, including map creation, modeling, and spatiotemporal forecasting. By providing detailed tutorials, practical examples, and case studies, the website aims to enhance students’ skills and knowledge in these crucial areas of spatial statistics. Whether you are looking to learn the fundamentals of mapping or explore complex spatiotemporal models, this platform offers valuable insights and tools to support your academic and research endeavors. Some libraries are required

library(raster)
library(ggplot2)
library(maptools)
library(spdep)
library(geodata)
library(dplyr)
library(janitor)
library(flextable)
library(knitr)
library(webshot2)
library(webshot)
require(moonBook) 
library(webr)
library(tidyr)
library(RColorBrewer)
library(kableExtra)
library(spgwr)
library(tmap)

Topic 1: Mapping

In Topic 1, we will delve into the essential concepts and techniques of mapping in the context of spatial data analysis. This topic will provide students with the knowledge and skills needed to create a wide array of maps using R, a powerful tool for statistical computing and graphics.

By the end of Topic 1, students will have the capability to design and produce various types of maps, including:

Through hands-on exercises and practical examples, students will become proficient in utilizing R to generate these maps, enhancing their ability to visualize and interpret spatial data effectively.

For our example, we will focus on using a detailed map of the city of Bandung, broken down by sub-districts. This map has been sourced from the Raster package in R, which provides extensive tools for spatial data analysis and manipulation. Utilizing this map, we will demonstrate various advanced techniques in spatial statistics, enabling a comprehensive analysis of the geographic and demographic data within Bandung’s sub-districts.

For the first step, we will extract a detailed map of Indonesia using the gadm function from the geodata package. This function allows us to easily obtain spatial data, which we will then refine and analyze for our study. By focusing on the map of Indonesia, we can set the foundation for more localized analyses, such as those involving the city of Bandung.Before proceeding, make sure you have set your working directory to a suitable location where you want to save and access your spatial data. This ensures that all files are organized and easily accessible during the analysis.

# Set your working directory
setwd("/users/mindra/@AdvancedSpatial")

# Download level 3 administrative boundaries for Indonesia
#Indonesia3<-readRDS('gadm36_IDN_3_sp.rds')  
Indonesia3 <- geodata::gadm("IDN", level = 3, path = getwd())
# Filter the data to get the map of Bandung
Bandung3<-Indonesia3[Indonesia3$NAME_2 == "Kota Bandung",]

Next, we will visualize the map of Bandung using the plot function. To do this, we need to filter the administrative boundaries to focus on Bandung.

# Plot the map of Bandung using plot function
plot(Bandung3, axes=T,main = "Map of Bandung",cex.axis=0.8)

# Plot the map of Bandung using ggplot2
Bandung3_sf<-st_as_sf(Bandung3)
ggplot(data = Bandung3_sf) +
  geom_sf() +
  ggtitle("Map of Bandung") +
  theme_bw()

Simulation of COVID-19 Cases and Population Size

Now, we will create a choropleth map using simulated data. For this example, we assume we have a dataset that includes COVID-19 case counts and the population at risk in the city of Bandung. We will generate a simulated dataset for 30 districts within Bandung city, where each district will have a specified number of COVID-19 cases and a corresponding population size.

Generating the Simulated Dataset

In our simulation, we will:

  1. Generate COVID-19 case counts using a Poisson distribution with a mean of 5 cases per district.
  2. Generate the population size for each district using a Poisson distribution with a mean of 1000 people per district.

Creating the Choropleth Map

Using the simulated data, we will create two choropleth maps:

  1. A map showing the distribution of COVID-19 cases across the districts.
  2. A map showing the population size in each district. These maps will help visualize the spatial distribution of COVID-19 cases and the population at risk, providing insights into how the disease spreads and identifying areas that might need more resources or interventions.

Let’s proceed with the data generation and mapping:

n_districts <- 30

# Generate Poisson data for COVID-19 cases with mean 5
set.seed(123)
covid_cases <- rpois(n_districts, lambda = 5)

# Generate population size data with mean 1000
set.seed(123)
population_size <- rpois(n_districts, lambda = 1000)

# Create a data frame

data <- data.frame(id=c(1:30), District = Bandung3_sf$NAME_3, COVID_Cases = covid_cases, Population_Size = population_size)

data %>%
  kable("html", caption = "COVID-19 Cases and Population Size in Bandung Districts") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
COVID-19 Cases and Population Size in Bandung Districts
id District COVID_Cases Population_Size
1 Andir 4 982
2 Antapani 7 1037
3 Arcamanik 4 946
4 Astanaanyar 8 1004
5 Babakan Ciparay 9 1054
6 Bandung Kidul 2 1014
7 Bandung Kulon 5 959
8 Bandung Wetan 8 945
9 Batununggal 5 1038
10 Bojongloa Kaler 5 1011
11 Bojongloa Kidul 9 1012
12 Buahbatu 5 1003
13 Cibeunying Kaler 6 982
14 Cibeunying Kidul 5 1040
15 Cibiru 2 1026
16 Cicendo 8 998
17 Cidadap 3 975
18 Cinambo 2 966
19 Coblong 4 989
20 Gedebage 9 976
21 Kiaracondong 8 1033
22 Lengkong 6 995
23 Mandalajati 6 1004
24 Panyileukan 11 964
25 Rancasari 6 989
26 Regol 6 990
27 Sukajadi 5 995
28 Sukasari 5 1025
29 Sumur Bandung 4 1021
30 Ujung Berung 3 1017

Next, we will combine the data with the Bandung map.

Bandung3_sf$id<-c(1:30) 
Bandung3_merged <- Bandung3_sf %>%
  left_join(data, by = "id") 

Next, we’ll craft Choropleth maps illustrating COVID-19 cases and population density. These maps will offer insightful visualizations, aiding in the identification of correlations and hotspots. By juxtaposing COVID-19 data with population distribution, we can gain a deeper understanding of the pandemic’s impact across different regions. Let’s embark on creating these informative Choropleth maps to illuminate key patterns and trends.

ggplot(Bandung3_merged) +
  geom_sf(aes(fill = COVID_Cases)) +
  theme_bw() +
  labs(title = "Map of Number of COVID-19",
       fill = "Incidence rate of TB")

ggplot(Bandung3_merged) +
  geom_sf(aes(fill = Population_Size)) +
  theme_bw() +
  labs(title = "Map of Population size",
       fill = "Incidence rate of TB")

To further enhance the Choropleth map’s visual impact, we can employ scale_fill_gradient(). This function allows us to define a color gradient for the fill aesthetic, offering flexibility in customizing the map’s color scheme. By adjusting the low and high parameters, you can tailor the colors to represent the range of values effectively. In the example provided, shades of green to red were utilized, but feel free to substitute them with any color names or hexadecimal codes that align with your preferences and convey the desired message.

ggplot(Bandung3_merged) +
  geom_sf(aes(fill = COVID_Cases)) +
  theme_bw() +
  scale_fill_gradient(low = "green", high = "red") +
  labs(title = "Map of Number of COVID-19",
       fill = "Incidence rate of TB")

ggplot(Bandung3_merged) +
  geom_sf(aes(fill = Population_Size)) +
  theme_bw() +
  scale_fill_gradient(low = "green", high = "red") +
  labs(title = "Map of Population size",
       fill = "Incidence rate of TB")

summary(data$COVID_Cases)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   4.000   5.000   5.667   7.750  11.000

We can also create a map with interval values.

ggplot(Bandung3_merged) +
  geom_sf(aes(fill = COVID_Cases)) + 
  scale_fill_gradientn(colors = c("green", "yellow", "red"),  # Adjust colors as needed
                       breaks = c(0, 3, 6, 9),  # Define interval breaks
                       labels = c("0-3", "4-6", "7-9", "9+"),  # Labels for intervals
                       limits = c(0, 11),  # Adjust the range of values
                       na.value = "grey50") +  # Color for missing values
  theme_bw() +
  labs(title = "Choropleth Map with Interval Values",
       fill = "Value")

Topic 2: Geographically Wegithed Regression

Explanation of Geographically Weighted Regression (GWR)

Geographically Weighted Regression (GWR) is a spatial statistical technique used to model spatially varying relationships between a dependent variable and independent variables. Unlike traditional regression models that assume constant relationships across space, GWR recognizes that relationships between variables may vary geographically.

Figure 1. Spatial concept of GWR

Key Concepts:

Spatial Non-Stationarity:

GWR accounts for spatial non-stationarity, which means that the relationships between variables change over space. This is often the case in spatial data where the relationship between variables may vary across different locations.

Localized Parameter Estimates:

Instead of estimating a single set of parameters for the entire study area, GWR estimates separate parameters for each location. This results in localized parameter estimates that reflect the specific relationships observed at each location.

Weighting Scheme:

In GWR, neighboring observations are given different weights based on their proximity to the target location. This allows the model to give more importance to nearby observations when estimating local parameters, effectively capturing the spatial structure of the data.

Implementation in R:

The spgwr package in R provides functions to implement Geographically Weighted Regression. Here’s a basic example of how to fit a GWR model in R:

###GWR Model If we are interested in the influence or effect of variable x1 on variable y, we create a regression model in the following form:

\[y_i = \beta_0 + \beta_1 x_{1i} + \epsilon_i\]

Where: \(y_i\) = dependent variable, \(x_{1i}\) = independent variable, \(\beta_0\) = intercept, \(\beta_1\) = coefficient representing the effect of \(x_{1i}\) on \(y_i\), \(\epsilon_i\) = error term.

The coefficient \(\beta_1\) represents the increase in y due to a one-unit increase in \(x_i\). This equation represents the global model, and \(\beta_1\) is applied to all spatial units in the study area. The assumption is that the relationship between x and y does not change according to the area.

Stationarity is a common term in statistics, generally meaning that parameters do not change over time or space. In spatial statistics, stationarity is equivalent to the homogeneity of effects, or, that a process operates similarly regardless of where we observe the process. This can be a weak assumption, and we may ask whether x affects y differently at different geographic locations, or in terms of parameters.

In the discussion, we explore and measure the uneven spatial distribution of variable y. For instance, we use Moran’s I to measure the extent of global and local spatial autocorrelation. In this lab guide, we will examine the uneven spatial distribution in the relationship between two variables, x and y. The method we will discuss in this guide that captures this spatial heterogeneity is Geographically Weighted Regression (GWR). First proposed by Brundson et al. (1996), GWR estimates \(\beta_p\) at each location \(i\), using centroids for polygon data. The model takes the following form:

\[ y_i = \beta_{0i} + \beta_{1i}x_{1i} + \ldots + \beta_{pi}x_{pi} + \epsilon_i \]

where \(\beta_{pi}\) is the local realization of \(\beta_p\) at location \(i\). This constructs a surface trend of parameter values for each independent variable and model intercept.

GWR is a result of ordinary least squares (OLS) regression and enhances the modeling sophistication by allowing the relationship between independent and dependent variables to vary by location.

Note that the basic OLS regression model above is just a special case of the GWR model where the coefficients are constant across space. Parameters in GWR are estimated using weighted least squares. The weighting matrix is a diagonal matrix, with each diagonal element \(w_{ij}\) being a function of the observation location. The role of the weight matrix is to assign higher values to observations close to \(i\), as it is assumed that nearby observations will influence each other more than those farther away (Tobler’s Law). There are three major decisions to be made when running GWR: (1) the kernel density function determines the weight \(w_{ij}\), (2) the bandwidth function \(h\), which determines the rate of distance decay, and (3) who should be considered as neighbors.

Kernel

The kernel density function determines the weights assigned to neighboring units:

  1. A common density function is the Gaussian weighting function:

\[ w_{ij} = \exp\left(-\frac{d_{ij}^2}{h^2}\right) \]

Where \(d_{ij}\) is the distance between locations \(i\) and \(j\), and \(h\) is the bandwidth.

  1. Exponential function: \[ w_{ij} = \exp\left(-\frac{d_{ij}}{h}\right) \]

  2. Bi-square function: \[ w_{ij} = 1 - \left(-\frac{d_{ij}^2}{h^2}\right)^2 \]

  3. Tri-cube function:

\[ w_{ij} = 1 - \left(-\frac{d_{ij}^3}{h^3}\right)^3 \]

Choosing the optimal bandwidth

Choosing the weighting function also involves selecting the bandwidth \(h\). There are several ways to do this. R uses two methods. The first one employs cross-validation to select the optimal kernel bandwidth. The cross-validation method takes the formula:

\[ CV = \sum_i (y_i - \hat{y}_{(-i)}(\beta))^2 \]

where \(\hat{y}_{(-i)}(\beta)\) is the estimated value of \(y_i\) with the observation for point \(i\) removed from the calibration process. Here, the aim is to find \(h\) that minimizes \(CV\). Essentially, this minimizes the sum of squared errors across all locations \(i\), arriving at the optimal bandwidth.

Another method selects the bandwidth that minimizes the Akaike Information Criterion (AIC).

To run GWR, we need to use the spgwr package.

GWR Estimation

The varying-regression coefficients can be estimated as: \[ \hat{\beta}(u) = (x^{\prime} W(u) x)^{-1} x^{\prime} W(u) y \]

where \(u\) denotes the location of the \(u\)-th observation with

\[ W(u) = \begin{bmatrix} w_1(u) & 0 & \cdots & 0 \\ 0 & w_2(u) & \vdots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 0 & \cdots & 0 & w_n(u) \end{bmatrix} \]

Example using the spgwr Package

his example is obtained from CRD 298 - Spatial Methods in Community Research. You can find more information here.

download.file(url = "https://raw.githubusercontent.com/crd230/data/master/phil_tracts.zip", destfile = "phil_tracts.zip")
unzip(zipfile = "phil_tracts.zip")
philly <- st_read("phil_tracts.shp")
## Reading layer `phil_tracts' from data source 
##   `/Users/mindra/@AdvancedSpatial/phil_tracts.shp' using driver `ESRI Shapefile'
## Simple feature collection with 376 features and 25 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 476461.9 ymin: 4413070 xmax: 503772 ymax: 4443067
## Projected CRS: UTM_Zone_18_Northern_Hemisphere

Fit ols

fit.ols<-glm(usarea ~ lmhhinc  + lpop + pnhblk + punemp + pvac + ph70 + lmhval + phnew + phisp, data = philly)
summary(fit.ols)
## 
## Call:
## glm(formula = usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + 
##     ph70 + lmhval + phnew + phisp, data = philly)
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  534.491    164.270   3.254  0.00124 ** 
## lmhhinc        2.462     12.176   0.202  0.83990    
## lpop          -1.344      6.338  -0.212  0.83216    
## pnhblk        21.158     18.077   1.170  0.24260    
## punemp        -5.097     63.645  -0.080  0.93622    
## pvac         371.699     58.427   6.362 5.96e-10 ***
## ph70         -79.691     35.535  -2.243  0.02552 *  
## lmhval       -45.668     10.458  -4.367 1.64e-05 ***
## phnew         17.958    319.042   0.056  0.95514    
## phisp        -56.308     30.695  -1.834  0.06741 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 4829.927)
## 
##     Null deviance: 2938287  on 375  degrees of freedom
## Residual deviance: 1767753  on 366  degrees of freedom
## AIC: 4268.4
## 
## Number of Fisher Scoring iterations: 2

Bandwidth optimum

philly.sp <- as(philly, "Spatial")
gwr.b1<-gwr.sel(usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + ph70 +lmhval + phnew + phisp, philly.sp)
## Bandwidth: 14224.76 CV score: 1845565 
## Bandwidth: 22993.17 CV score: 1860162 
## Bandwidth: 8805.59 CV score: 1818722 
## Bandwidth: 5456.356 CV score: 1750890 
## Bandwidth: 3386.415 CV score: 1610987 
## Bandwidth: 2107.121 CV score: 1497008 
## Bandwidth: 1316.474 CV score: 1412725 
## Bandwidth: 827.8277 CV score: 1787166 
## Bandwidth: 1618.475 CV score: 1446078 
## Bandwidth: 1129.828 CV score: 1448167 
## Bandwidth: 1377.629 CV score: 1414671 
## Bandwidth: 1298.316 CV score: 1413143 
## Bandwidth: 1324.036 CV score: 1412698 
## Bandwidth: 1322.62 CV score: 1412697 
## Bandwidth: 1322.714 CV score: 1412696 
## Bandwidth: 1322.708 CV score: 1412696 
## Bandwidth: 1322.708 CV score: 1412696 
## Bandwidth: 1322.708 CV score: 1412696 
## Bandwidth: 1322.708 CV score: 1412696
gwr.b1
## [1] 1322.708

Fit GWR

gwr.fit1<-gwr(usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + ph70 + lmhval + phnew + phisp, data = philly.sp, bandwidth = gwr.b1, se.fit=T, hatmatrix=T)

Bisquare Bandwidth

gwr.b2<-gwr.sel(usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + ph70 + lmhval + phnew + phisp, data = philly.sp, gweight = gwr.bisquare) 
## Bandwidth: 14224.76 CV score: 1797387 
## Bandwidth: 22993.17 CV score: 1840913 
## Bandwidth: 8805.59 CV score: 1634064 
## Bandwidth: 5456.356 CV score: 1578492 
## Bandwidth: 1760.637 CV score: NA 
## Bandwidth: 4044.717 CV score: NA 
## Bandwidth: 6735.649 CV score: 1585951 
## Bandwidth: 5535.278 CV score: 1581610 
## Bandwidth: 4917.158 CV score: NA 
## Bandwidth: 5250.4 CV score: 1569337 
## Bandwidth: 5123.113 CV score: 1563084 
## Bandwidth: 5044.445 CV score: NA 
## Bandwidth: 5171.732 CV score: 1565614 
## Bandwidth: 5093.064 CV score: 1561374 
## Bandwidth: 5074.493 CV score: NA 
## Bandwidth: 5104.542 CV score: 1562042 
## Bandwidth: 5085.971 CV score: NA 
## Bandwidth: 5097.448 CV score: 1561631 
## Bandwidth: 5090.355 CV score: NA 
## Bandwidth: 5094.739 CV score: 1561473 
## Bandwidth: 5092.029 CV score: NA 
## Bandwidth: 5093.704 CV score: 1561412 
## Bandwidth: 5092.669 CV score: NA 
## Bandwidth: 5093.309 CV score: 1561389 
## Bandwidth: 5092.913 CV score: 1561365 
## Bandwidth: 5092.82 CV score: NA 
## Bandwidth: 5092.971 CV score: 1561369 
## Bandwidth: 5092.878 CV score: NA 
## Bandwidth: 5092.935 CV score: 1561367 
## Bandwidth: 5092.9 CV score: 1561365 
## Bandwidth: 5092.891 CV score: NA 
## Bandwidth: 5092.905 CV score: 1561365 
## Bandwidth: 5092.897 CV score: NA 
## Bandwidth: 5092.902 CV score: 1561365 
## Bandwidth: 5092.899 CV score: 1561365 
## Bandwidth: 5092.898 CV score: 1561364 
## Bandwidth: 5092.897 CV score: NA 
## Bandwidth: 5092.898 CV score: 1561365 
## Bandwidth: 5092.898 CV score: 1561364 
## Bandwidth: 5092.898 CV score: NA 
## Bandwidth: 5092.898 CV score: 1561364
gwr.fit2<-gwr(usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + ph70 + lmhval + phnew + phisp, data = philly.sp, bandwidth = gwr.b2, gweight = gwr.bisquare, se.fit=T, hatmatrix=T)
gwr.b2
## [1] 5092.898
gwr.fit2
## Call:
## gwr(formula = usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + 
##     ph70 + lmhval + phnew + phisp, data = philly.sp, bandwidth = gwr.b2, 
##     gweight = gwr.bisquare, hatmatrix = T, se.fit = T)
## Kernel function: gwr.bisquare 
## Fixed bandwidth: 5092.898 
## Summary of GWR coefficient estimates at data points:
##                    Min.    1st Qu.     Median    3rd Qu.       Max.   Global
## X.Intercept.  -649.3890    -5.7699   134.6249   512.9574  2336.5957 534.4908
## lmhhinc       -180.3145    -4.4545     1.7487    13.7554    68.2914   2.4616
## lpop           -49.1608     1.2314     6.3430    19.0823    69.7005  -1.3441
## pnhblk        -106.4233     1.3658    41.0256    96.5291   285.2134  21.1576
## punemp        -397.5988  -143.8982    -6.2685    57.4553   729.4700  -5.0966
## pvac          -757.5534     8.8245   209.8576   370.7793   650.3669 371.6993
## ph70          -643.0070  -207.9799   -66.3040   -19.8028   142.9682 -79.6910
## lmhval        -150.2726   -69.5496   -34.8198    -6.7118   107.7625 -45.6676
## phnew        -1844.6086  -418.1211    19.6153   509.9117  7421.2055  17.9575
## phisp         -221.0604   -26.5670    -7.5865    84.2566  1418.3152 -56.3076
## Number of data points: 376 
## Effective number of parameters (residual: 2traceS - traceS'S): 132.4964 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 243.5036 
## Sigma (residual: 2traceS - traceS'S): 62.2312 
## Effective number of parameters (model: traceS): 107.6713 
## Effective degrees of freedom (model: traceS): 268.3287 
## Sigma (model: traceS): 59.2826 
## Sigma (ML): 50.0803 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 4316.932 
## AIC (GWR p. 96, eq. 4.22): 4117.761 
## Residual sum of squares: 943021.6 
## Quasi-global R2: 0.6790573

Fixed or adaptive kernel.

The GWR model we ran above produces a fixed distance to search for neighbors to include in the local regression. However, there are places in our data where tracts are denser. This means that in some areas, particularly in downtown Philadelphia, we will include a larger number of neighbors in the local regression compared to other areas, such as large channels on the city outskirts. In this case, an adaptive kernel is suitable.

Fixed and Adaptive Kernel To specify an adaptive kernel, set adapt = TRUE when determining the optimal bandwidth using gwr.sel().

gwr.b3<-gwr.sel(usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + ph70 + lmhval + phnew + phisp, data = philly.sp, adapt = TRUE)
## Adaptive q: 0.381966 CV score: 1785033 
## Adaptive q: 0.618034 CV score: 1814714 
## Adaptive q: 0.236068 CV score: 1733969 
## Adaptive q: 0.145898 CV score: 1650907 
## Adaptive q: 0.09016994 CV score: 1565518 
## Adaptive q: 0.05572809 CV score: 1471615 
## Adaptive q: 0.03444185 CV score: 1401939 
## Adaptive q: 0.02128624 CV score: 1384273 
## Adaptive q: 0.01588534 CV score: 1431365 
## Adaptive q: 0.02662582 CV score: 1380970 
## Adaptive q: 0.02518866 CV score: 1380056 
## Adaptive q: 0.02491844 CV score: 1380031 
## Adaptive q: 0.02487775 CV score: 1380032 
## Adaptive q: 0.02495914 CV score: 1380032 
## Adaptive q: 0.02491844 CV score: 1380031
gwr.b3
## [1] 0.02491844

The bandwidth distance will vary according to the spatial density of features in the input feature class. The bandwidth becomes a function of the number of nearest neighbors, so each local estimation is based on the same number of features. Instead of a specific distance, the number of neighbors used for analysis is reported. Plug the bandwidth into the gwr() function, which runs the GWR model, using the adapt argument.

Adaptive

gwr.fit3<-gwr(usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + ph70 + lmhval +
phnew + phisp, data = philly.sp, adapt=gwr.b3, se.fit=T, hatmatrix=T)

gwr.fit3
## Call:
## gwr(formula = usarea ~ lmhhinc + lpop + pnhblk + punemp + pvac + 
##     ph70 + lmhval + phnew + phisp, data = philly.sp, adapt = gwr.b3, 
##     hatmatrix = T, se.fit = T)
## Kernel function: gwr.Gauss 
## Adaptive quantile: 0.02491844 (about 9 of 376 data points)
## Summary of GWR coefficient estimates at data points:
##                     Min.     1st Qu.      Median     3rd Qu.        Max.
## X.Intercept. -1413.25718     2.04814   150.67770   593.38119  2856.09861
## lmhhinc        -77.30238    -6.62505     2.08877    20.59832   121.03243
## lpop           -71.53993     0.32328     6.55222    19.42020    93.59455
## pnhblk        -139.33868    -0.35274    39.43998   102.07286   462.87992
## punemp        -592.27650  -109.64202    -3.93096    63.56270   623.38186
## pvac         -1410.12965    11.95427   193.34738   350.39251  1047.77143
## ph70          -975.65611  -190.62161   -67.38336   -13.17506   137.47857
## lmhval        -185.48730   -73.39044   -36.70912    -7.56967    48.91389
## phnew        -2570.54553  -577.37945    29.21937   654.40082  4045.23829
## phisp         -182.91660   -29.72723    -7.23980    65.71058   771.29484
##                Global
## X.Intercept. 534.4908
## lmhhinc        2.4616
## lpop          -1.3441
## pnhblk        21.1576
## punemp        -5.0966
## pvac         371.6993
## ph70         -79.6910
## lmhval       -45.6676
## phnew         17.9575
## phisp        -56.3076
## Number of data points: 376 
## Effective number of parameters (residual: 2traceS - traceS'S): 177.8408 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 198.1592 
## Sigma (residual: 2traceS - traceS'S): 54.21695 
## Effective number of parameters (model: traceS): 135.2358 
## Effective degrees of freedom (model: traceS): 240.7642 
## Sigma (model: traceS): 49.18654 
## Sigma (ML): 39.35938 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 4258.02 
## AIC (GWR p. 96, eq. 4.22): 3964.174 
## Residual sum of squares: 582484.4 
## Quasi-global R2: 0.8017605

Presenting GWR Results

When presenting GWR results, you first want to show the distribution of coefficients for each variable. You also want to present the estimates of the global model. Typing gwr.fit3 in R provides you with this information, particularly the minimum value, 25th percentile, median, 75th percentile, and maximum value of each variable coefficient, along with the global regression coefficients and model fitness measures. The gwr.fit3 object is a GWR object. The object contains several other objects. For example, typing

gwr.fit3$results
## $v1
## [1] 135.2358
## 
## $v2
## [1] 92.63086
## 
## $delta1
## [1] 198.1592
## 
## $delta2
## [1] 150.7794
## 
## $sigma2
## [1] 2939.477
## 
## $sigma2.b
## [1] 1549.161
## 
## $AICb
## [1] 4258.02
## 
## $AICh
## [1] 3964.174
## 
## $AICc
## [1] 4349.03
## 
## $edf
## [1] 198.1592
## 
## $rss
## [1] 582484.4
## 
## $nu1
## [1] 92.63086
## 
## $odelta2
## [1] 115.4856
## 
## $n
## [1] 376

provides you with overall model results such as AIC. Typing gwr.fit3$bandwidth gives you the bandwidth values for each of the 376 tracts in the dataset. The bandwidth object yields only one value, which is the bandwidth used for all tracts.

Compare fixed and adaptive bandwidth

gwr.fit1$bandwidth
## [1] 1322.708
gwr.fit3$bandwidth
##   [1] 1149.6960 1183.1752 1326.2045 1172.4310 1080.9776 1080.4400 1253.3075
##   [8] 1618.3852 1933.3755 1317.3601 1250.3552 1297.9878 1206.6260 1237.7607
##  [15] 1348.3389 1161.8085 1094.2641 1262.8032 1679.6423 1668.6594 1449.5608
##  [22] 1377.3837 1460.3981 1239.4362 1284.6018 1494.5331 1404.9279 1347.7279
##  [29] 1569.7716 1713.0538 1126.3074  936.9146  885.2005 1037.1643 1094.9965
##  [36]  951.7536 1024.4739 1206.9699 1038.2901  966.2943  987.5778  937.3304
##  [43] 1161.6782 1015.1787 1000.2734 4929.8934 2912.4102 3427.6077 2497.9841
##  [50] 2180.7579 2041.0962 2247.6751 1531.2191 1399.2416 1167.3991 1281.4852
##  [57] 1147.1804 1203.5498 1219.8614 1480.6902 1395.4648 1424.8599 1753.3825
##  [64] 1443.1328 1258.2813 1448.0626 1522.8533 1377.6613 1644.8620 1445.4104
##  [71] 1263.7606 1260.3937 1273.1631 1210.8011 1202.9345 1264.6042 1202.4543
##  [78] 1027.8557 1491.4673 1250.6009 2104.3717 1944.9710 1903.3906 2113.4772
##  [85] 1930.2247 1800.0350 1895.7397 2356.5608 2273.4443 2507.9185 1433.7533
##  [92] 1522.7432 1623.1450 1485.4180 1521.6495 1445.0000 1455.7156 1548.0717
##  [99] 1484.3209 1517.3800 1674.9190 1627.1831 1600.2474 1761.9482 1461.7746
## [106] 1375.5426 1344.3843 1345.1753 1543.3719 1498.2537 1495.3136 1685.1020
## [113] 1640.5623 1687.7274 1801.9435 1667.9202 1797.8396 1861.2456 1792.4967
## [120] 1848.9709 1765.1738 2487.6310 2028.3775 2777.8273 2726.9368 2085.7093
## [127] 2749.1080 2148.2730 2068.6457 1624.7288 1415.1939 3580.3917 3317.2191
## [134] 1741.6830 2378.0338 2059.9268 3242.7861 1577.6733 1325.4036 1352.8377
## [141] 1157.3042  915.1007  953.6252 1164.6167 2590.0996 2248.3944 1282.1087
## [148] 2238.4517 2300.8338  863.4088  898.4611  848.7930 1366.5018 1462.4520
## [155] 1299.8048 1309.0455 1509.1945 1517.7368 1542.2995 1336.0976 1369.7717
## [162] 1723.8893 1737.3518 1805.6483 1991.9663 1446.8963 1811.0270 1887.9420
## [169] 2749.8621 1812.7160 1204.4508 1560.6511 3137.4233 2860.0573 2995.4353
## [176] 2649.7392 1632.1079 1269.1912 3977.7382 1113.0079 1227.6887 1266.9417
## [183] 2084.1229 1760.0500 1721.3045 2382.1732 3152.5118 1156.6920 1195.2937
## [190] 1144.2137  913.7131  889.4278  938.7811 1225.6714 1691.2698 1683.5002
## [197] 1142.8835 1347.4642  997.7492 1328.6705 1422.4118 1062.5272 1270.5024
## [204] 1502.9227 1323.8170 1060.8398 1091.3630 1757.2854 1266.7186 1559.6225
## [211] 1090.8856 1173.5343 1451.4422 1223.2996  996.5115 1182.3345 1768.5657
## [218] 1784.1195 1373.5698 1439.4176 1148.8977 1443.8581 1292.8810 1174.2284
## [225] 1616.1746 2145.3811 1629.4297 1969.2547 2252.2645 2372.0125 1203.3670
## [232] 1255.6972 1227.9477 1098.4566 1055.0571 1481.5491 1760.7239 1778.0384
## [239] 1168.1166 1125.1907 1256.6905 1330.0300 1585.8025 2497.3748 2746.0348
## [246] 1509.3036 1931.8575 2627.5873 3017.1983 3781.5675 2010.8207 2197.9499
## [253] 2059.7430 1813.8544 1729.5140 1588.6780 1651.7633 1477.1388 1970.6281
## [260] 1727.7785 2089.6800 1960.2473 1495.3766 1435.0412 2207.3149 1856.5329
## [267] 1460.4975 1456.0199 1805.9800 1427.5722 1597.3591 1666.3611 1528.8778
## [274] 1776.4450 1570.7058 1829.3064 1927.5048 1761.8633 1449.8676 1516.1387
## [281] 1518.9827 1241.1781 1470.8321 1521.9097 3655.6712 1987.0742 2224.5394
## [288] 2241.0749 3449.9692 2855.8109 2788.3023 1740.6684  923.3570  915.2436
## [295] 1061.4771  952.3694 3232.3931 1317.5031  991.8756  998.7429  969.8376
## [302] 1079.1369 2010.8536 1986.9894 1535.4527 1593.9031 1462.7244 1159.8429
## [309] 1207.2627 1455.2354 1484.5447 1267.8352 1231.6060 1268.5112 1066.8641
## [316] 1110.6200 1250.7819 1096.2617 1887.6722 1745.3567 1344.4359 2413.6404
## [323] 2752.6523 1272.9689  939.6277 1085.8393  825.8420 1345.9480 1106.3061
## [330] 3364.2918 1613.1166 1273.4573 2140.7797 2778.9098 3267.8459 1745.7554
## [337] 1464.1753 1794.8908 1403.0026 1565.3360 1581.6820 1519.9556 1312.1269
## [344] 1406.8640 1474.6692 1429.8799 1436.3432 1371.3166 1249.1307 1531.4984
## [351] 1433.9176 1463.0054 1716.8105 1759.2807 1654.2402 1605.6215 2944.9500
## [358] 2050.0637 1910.5561 1643.7374 2553.3518 3397.2850 2548.2420 2559.1329
## [365] 2823.3119 3040.4003 2753.9114 2920.7425 3122.3298 1273.1214 2554.0051
## [372] 1776.5060 1582.0286 1508.4337 2528.0667 1066.8106

Map Adaptive Bandwidth

philly$bwadapt <- gwr.fit3$bandwidth
tm_shape(philly, unit = "mi") + tm_polygons(col = "bwadapt", style = "quantile",palette = "Reds", border.alpha = 0, title = "") +
tm_scale_bar(breaks = c(0, 1, 2), size = 1, position = c("right", "bottom")) +
tm_compass(type = "4star", position = c("left", "top")) +
tm_layout(main.title = "GWR bandwidth", main.title.size = 0.95, frame = FALSE, legend.outside = TRUE)

Testing GWR Regression Coefficients

One item in the GWR object is called SDF, and this is a spatial polygon data frame containing the regression model estimates for 376 tracts. The variable X.Intercept. to phisp provides regression coefficients. X.Intercept.se to phisp_se provide the standard error of the coefficient.

In addition to mapping the coefficient sizes, you also need to map whether these coefficients are statistically significant. Unfortunately, R does not have neatly organized information for you. However, you can use the coefficient sizes and standard errors to obtain the t-statistic, which can then be mapped to the t-distribution to find the p-value. Specifically, when testing H0: β̂ p = 0 against alternative H1: β̂ p≠0, one evaluates the magnitude of the t ratio. Under the null hypothesis, the t ratio follows a t-distribution, and therefore one can compute its probability. To do this in R, first obtain the degrees of freedom from the gwr results object.”

dfree<-gwr.fit3$results$edf
philly$pnhblk.t <- gwr.fit3$SDF$pnhblk/gwr.fit3$SDF$pnhblk_se
philly$pnhblk.t.p<-2*pt(-abs(philly$pnhblk.t), dfree)

Mapping Test Results (p-values)

breaks <- c(0,0.01,0.05,0.1,1)
tm_shape(philly, unit = "mi") +
tm_polygons(col = "pnhblk.t.p",palette = "Reds", breaks = breaks,
border.alpha = 0, title = "") +
tm_scale_bar(breaks = c(0, 1, 2), size = 1, position = c("right", "bottom")) +
tm_compass(type = "4star", position = c("left", "top")) +
tm_layout(main.title = "t-stat", main.title.size = 0.95, frame = FALSE, legend.outside = TRUE)

Topic 3: Bayesian spatiotemporal models

During the past two decades, a diverse array of spatiotemporal models has emerged for disease modeling, with many falling under the intrinsic conditional autoregressive (iCAR) category, extending the well-established Besag, York, and Mollie (BYM) model (Besag et al., 1991; Knorr-Held, 2000; Waller et al., 1997). In this paper, the BYM model is applied to infectious disease, presuming that, given the underlying relative risk, θ_it, the incidence count in each region i and at time t, y_it, follows a Poisson distribution with a mean and variance equal to λ_it = E_itθ_it:

\[ y_{it} \, | \, E_{it}\theta_{it} \sim \text{Poisson}(E_{it}\theta_{it}), \, i = 1, \ldots, n \, \text{and} \, t = 1, \ldots, T, \]

where E_it represents the expected number of incidences and θ_it denotes the relative risk in region i and at time t. For spatiotemporal data, the expected number of incidences can be derived based on reference rates as follows:

  1. The average for each period: \[ E_{it} = N_{it}\left(\frac{\sum_{i=1}^{n} y_{it}}{\sum_{i=1}^{n} N_{it}}\right), \, t = 1, \ldots, T, \]

  2. The overall average across all periods: \[ E_{it} = N_{it}\left(\frac{\sum_{i=1}^{n}\sum_{t=1}^{T} y_{it} / nT}{\sum_{i=1}^{n}\sum_{t=1}^{T} N_{it} / nT}\right), \, i = 1, \ldots, n \, \text{and} \, t = 1, \ldots, T, \]

where N_it represents the population in region i at time t, n is the number of observed regions, and T is the number of observed periods. Following Abente et al. (2018), the overall average is utilized in this study.

The mean of the Poisson distribution, log(λ_it), is decomposed using the natural logarithm link function:

\[ \log(\lambda_{it}) = \log(E_{it}) + \log(\theta_{it}), \, i = 1, \ldots, n \, \text{and} \, t = 1, \ldots, T \]

The second component, log(θ_it), is the subject of further investigation, modeled as a pure model with random effects consisting of structured and unstructured variance components:

\[ \eta_{it} = \log(\theta_{it}) = \alpha + \omega_i + \upsilon_i + \phi_t + \gamma_t + \delta_{it}, \, i = 1, \ldots, n \, \text{and} \, t = 1, \ldots, T, \]

where α represents the intercept denoting the overall relative risk, ω_i, υ_i, ϕ_t, and γ_t are the spatially structured, spatially unstructured, temporally structured, and temporally unstructured random effect components, respectively, and δ_it represents the spatiotemporal interaction.

The estimation of the spatiotemporal model can be accomplished using frequentist or Bayesian methods. Frequentist methods like maximum likelihood (ML) often prove inadequate, especially for small regions, due to Poisson sampling variation (Ugarte et al., 2014). Therefore, Bayesian modeling, which allows for the coherent integration of missing or uncertain data, is adopted here. The hierarchical Bayesian approach is employed, wherein observed data in the i^th region at time t, y_i = (y_i1, …, y_iT)‘, is generated from a probability distribution p(y_i |Φ,τ) with unknown parameters Φ = (α,ω’,υ’,ϕ’,γ’,δ’)’ and hyperparameters τ = (τ_α,τ_ω,τ_υ,τ_ϕ,τ_γ,τ_δ)’ and hyperpriors p(τ). Under the assumption of independence, the joint posterior density of Φ and τ, given y, is determined, with the marginal posterior distributions derived for further analysis.

Consider the observed data in the i^th region at time t, y_i = (y_i1, …, y_iT)‘, generated from a probability distribution p(y_i |Φ,τ) with unknown parameters Φ = (α,ω’,υ’,ϕ’,γ’,δ’)’ and hyperparameters τ = (τ_α,τ_ω,τ_υ,τ_ϕ,τ_γ,τ_δ)’. The unknown parameters Φ are treated as random variables with priors p(Φ|τ), and with unknown hyperparameters τ. The likelihood function of the number of incidences is identical to the joint density of y_i, i = 1, …, n. Therefore, the joint posterior density of Φ and τ, given y, is provided as follows:

\[ p(\Phi, \tau | y) = p(y | \Phi, \tau) p(\Phi | \tau) p(\tau) / p(y | \tau) \]

where p(y | , ) is the likelihood function of the number of incidences y_it, and p(y | ) is the marginal likelihood of the data given hyperparameters τ, typically considered as a normalization constant. Thus, the posterior density can be specified as:

\[ p(\Phi | y) \propto p(y | \Phi, \tau) p(\Phi | \tau) p(\tau) \]

For the Poisson distribution, the likelihood function of the number of incidences y_it can be expressed as:

\[ p(y | \Phi, \tau) = \prod_{i=1}^{n} \prod_{t=1}^{T} \frac{{\exp\left(-E_{it}\theta_{it}\right)(E_{it}\theta_{it})^{y_{it}}}}{{y_{it}!}} \]

The prior distributions are employed to model the spatial and temporal effect components. The spatially structured random effect of region i (ω_i) is modeled using the intrinsic conditional autoregressive prior (Besag et al., 1991):

\[ \omega_i | \omega_{(-i)}, \tau_{\omega}, W \sim N\left(\frac{\sum_{j=1}^{n} w_{ij} \omega_j}{\sum_{i=1}^{n} w_{ij}}, \frac{1}{\tau_{\omega} \sum_{i=1}^{n} w_{ij}}\right) \forall t \, \text{and} \, i = 1, ..., n \]

where ω_{(-i)} indicates all the elements in ω except the i^th element, W = (w_ij) is the adjacency matrix with w_ij = 1 if i and j are adjacent (i.e., are first-order contiguous) and w_ij = 0 otherwise, and τ_ω is the precision parameter of ω_i. The joint prior density function of ω = (ω_1, …, ω_n)’ over time t is:

\[ p(\omega | \tau_{\omega}) \propto \tau_{\omega}^{(n-1)/2} \exp\left(-\frac{\tau_{\omega}}{2} \sum_{(i \sim j)} (\omega_i - \omega_j)^2\right) \propto \tau_{\omega}^{(n-1)/2} \exp\left(-\frac{1}{2} \omega' Q_{\omega} \omega\right) \forall t \]

with the precision matrix Q_{} = {} R{} and R_{} the n × n spatial structure matrix defined as:

\[ R_{\omega} = \text{diag}\left(\frac{1}{n_i}, \frac{1}{n_2}, ..., \frac{1}{n_n}\right) - \frac{1}{n_i}w_{ij} \] Spatial Structure Matrix

where n_i is the number of neighbors of region i, and i ~ j denotes that regions i and j are neighbors.

The spatially unstructured random effect of region i (υ_i) follows an exchangeable normal distribution:

\[ \upsilon_i | \tau_{\upsilon} \sim N(0, 1/\tau_{\upsilon}) \forall t \, \text{and} \, i = 1, ..., n \]

where τ_υ is the precision parameter of υ_i. The joint density function of the vector υ = (υ_1, …, υ_n)’ over time t is:

\[ p(\upsilon | \tau_{\upsilon}) \propto \tau_{\upsilon}^{(n/2)} \exp\left(-\frac{\tau_{\upsilon}}{2} \sum_{t=1}^{n} \upsilon_i^2\right) \propto \tau_{\upsilon}^{(n/2)} \exp\left(-\frac{1}{2} \upsilon' Q_{\upsilon} \upsilon\right) \forall t \]

with Q_{} = _{} I_n being the precision matrix and I_n the n × n identity matrix.

Incidences of infectious disease typically increase when the breeding conditions for the Aedes-spp. mosquito are favorable, usually during the rainy season (Choi et al., 2016). Consequently, endemic outbreaks can have a regular seasonal pattern (CDC, 2014). In addition, there is likely to be a trend over long periods. The temporally structured component (ϕ_t) is thus the sum of a temporal trend (φ_t) and a seasonal component (κ_t):

\[ \phi_t = \phi_t + \kappa_t \]

A common temporal trend (φ_t) is a random walk of order one (RW1):

\[ \phi_{t+1} - \phi_t | \tau_{\phi} \sim N(0, 1/\tau_{\phi}) \forall i \, \text{and} \, t = 1, ..., T-1 \]

with τ_φ being the precision parameter. The joint density function of φ = (φ_1, …, φ_T)’ for region i is:

\[ p(\phi | \tau_{\phi}) \propto \tau_{\phi}^{((T-1))/2} \exp\left(-\frac{\tau_{\phi}}{2} \sum_{t=1}^{T-1} (\phi_{t+1} - \phi_t)^2\right) \propto \tau_{\phi}^{((T-1))/2} \exp\left(-\frac{1}{2} \phi' Q_{\phi(T \times T)^{(RW1)}} \phi\right) \forall i \]

with the precision matrix \(Q_{\phi(T \times T)^{(RW1)}} = \tau_{\phi} R_{\phi(T \times T)^{(RW1)}}\) and \(R_{\phi(T \times T)^{(RW1)}}\) being the T × T temporal trend structure matrix for the RW1 prior.

A random walk of order two (RW2) could apply to φ_t instead of a RW1. This prior is more appropriate if the data have a pronounced linear trend. The temporal trend (φ_t) of a RW2 is:

\[ \phi_t - 2\phi_{t+1} + \phi_{t+2} | \tau_{\phi} \sim N(0, 1/\tau_{\phi}) \forall i \, \text{and} \, t = 1, ..., T-2 \]

with joint density function for region i:

\[ p(\phi | \tau_{\phi}) \propto \tau_{\phi}^(((T-2))/2) \exp\left(-\frac{\tau_{\phi}}{2} \sum_{t=1}^{T-2} (\phi_t - 2\phi_{t+1} + \phi_{t+2})^2\right) \propto \tau_{\phi}^(((T-2))/2) \exp\left(-\frac{1}{2} \phi' Q_{\phi(T \times T)^{(RW2)}} \phi\right) \forall i \]

with \(Q_{\phi(T \times T)^{(RW2)}} = \tau_{\phi} R_{\phi(T \times T)^{(RW2)}}\) being the precision matrix for RW2 and \(R_{\phi(T \times T)^{(RW2)}}\) the T × T temporal trend structure matrix of a RW2 prior. Temporally structured matrix RW1

where all empty cells of \(R_{\phi(T \times T)^{(RW1)}}\) are zero.

A random walk of order two (RW2) could apply to \(\phi_t\) instead of a RW1. This prior is more appropriate if the data has a pronounced linear trend. The temporal trend \((\phi_t)\) of a RW2 is:

\[ \phi_t - 2\phi_{t+1} + \phi_{t+2} | \tau_{\phi} \sim N(0, 1/\tau_{\phi}) \forall i \, \text{and} \, t = 1, ..., T-2 \]

with joint density function for region i:

\[ p(\phi | \tau_{\phi}) \propto \tau_{\phi}^{(((T-2)))/2} \exp\left(-\frac{\tau_{\phi}}{2} \sum_{t=1}^{T-2} (\phi_t - 2\phi_{t+1} + \phi_{t+2})^2\right) \propto \tau_{\phi}^{(((T-2)))/2} \exp\left(-\frac{1}{2} \phi' Q_{\phi(T \times T)^{(RW2)}} \phi\right) \forall i \]

with \(Q_{\phi(T \times T)^{(RW2)}} = \tau_{\phi} R_{\phi(T \times T)^{(RW2)}}\) being the precision matrix and \(R_{\phi(T \times T)^{(RW2)}}\) the T×T temporal trend structure matrix of a RW2 prior:

Temporally structured matrix RW2

The seasonal component (\(\kappa_t\)) with periodicity m (with m being smaller than the number of time periods T) is:

\[ \kappa_t + \kappa_{(t+1)} + \ldots + \kappa_{(t+m-1)} | \tau_{\kappa} \sim N(0, 1/\tau_{\kappa}) \forall i \, \text{and} \, t = 1, \ldots, T-m+1 \]

where \(\tau_{\kappa}\) is the precision parameter of \(\kappa_t\). The joint density function of \(\kappa = (\kappa_1, \ldots, \kappa_T)'\) is:

\[ p(\kappa | \tau_{\kappa}) \propto \tau_{\kappa}^{(((T-m+1)))/2} \exp\left(-\frac{\tau_{\kappa}}{2} \sum_{t=1}^{(T-m+1)} (\kappa_t + \kappa_{(t+1)} + \ldots + \kappa_{(t+m-1)})^2\right) \] \[ \propto \tau_{\kappa}^{(((T-m+1)))/2} \exp\left(-\frac{1}{2} \kappa' Q_{\kappa} \kappa\right) \forall i \]

with \(Q_{\kappa} = \tau_{\kappa} R_{\kappa}\) being the precision matrix and \(R_{\kappa}\) the \(T \times T\) temporal structure matrix for the seasonal prior:

Temporally structured matrix Seasonal

For example, for \(m = 4\) we have:

Temporally structured matrix Seasonal m=4

For the temporally unstructured component (\(\gamma_t\)) for region i, we assume an exchangeable normal distribution:

\[ \gamma_t | \tau_{\gamma} \sim N(0, 1/\tau_{\gamma}) \forall i \, \text{and} \, t = 1, \ldots, T, \]

where \(\tau_{\gamma}\) is the precision parameter of \(\gamma_t\), with joint density function of \(\gamma = (\gamma_1, \ldots, \gamma_T)'\) for region i given by:

\[ p(\gamma | \tau_{\gamma}) \propto \tau_{\gamma}^{(T/2)} \exp\left(-\frac{\tau_{\gamma}}{2} \sum_{t=1}^{T} \gamma_t^2\right) \] \[ \propto \tau_{\gamma}^{(T/2)} \exp\left(-\frac{1}{2} \gamma' Q_{\gamma} \gamma\right) \forall i, \]

with \(Q_{\gamma} = \tau_{\gamma} I_T\) being the precision matrix and \(I_T\) the \(T \times T\) identity matrix.

The random components \(\omega\), \(\nu\), \(\phi\), \(\kappa\), and \(\gamma\) are described as the main effect, while \(\delta\) denotes the spatiotemporal interaction of the spatial and temporal main effects. Four types of interactions with corresponding priors for \(\delta\) have been proposed (Knorr-Held 2000).

Type I Interaction combines the spatially and temporally unstructured main effects (\(\nu_i\) and \(\gamma_t\)). Then \(\delta\) is independent and identically Gaussian distributed with a mean of zero and precision \(\tau_{\delta}\) with a structure matrix given by \(R_{\delta(I)} = R_{\nu} \otimes R_{\gamma} = I_n \otimes I_T = I_{nT}\), where \(\otimes\) denotes the Kronecker product. The density of the interaction component \(\delta\) is then:

\[ p(\delta | \tau_{\delta}) \propto \tau_{\delta}^{(nT/2)} \exp\left(-\frac{\tau_{\delta}}{2} \sum_{i=1}^{n} \sum_{t=1}^{T} \delta_{it}^2\right) \] \[ \propto \tau_{\delta}^{(nT/2)} \exp\left(-\frac{1}{2} \delta' Q_{\delta(I)} \delta\right) \text{ where } i=1, \ldots, n \, \text{ and } \, t=1, \ldots, T \]

where \(\delta = (\delta_{11}, \ldots, \delta_{nT})'\) and the precision matrix \(Q_{\delta(I)} = \tau_{\delta} R_{\delta(I)}\). The interaction can be thought of as unobserved covariates for each observation \(it\) that do not have any structure in space and time.

Type II interaction combines the spatially unstructured main effect (\(\nu_i\)) and one of the temporally structured main effects (\(\phi_t\) or \(\kappa_t\)). Due to the complexity of the computations, the temporal trend (\(\phi_t\)) is usually used. Then \(\delta_i = (\delta_{it}, \ldots, \delta_{iT})'\) for \(i = 1, 2, \ldots, n\) follows an independent RW1 or RW2, and \(\delta\) is independent and identically normally distributed with mean zero. The structure matrix of Type II interaction is \(R_{\delta} = R_{\nu} \otimes R_{\phi}\), where \(R_{\nu} = I_n\) and \(R_{\phi}\) is the temporal structure matrix of RW1 or RW2.

This interaction assumes that the temporal trends are different from region to region but do not have any structure within a region. The structured matrix \(R_{\delta}\) has rank \(n(T-1)\) for RW1 and \(n(T-2)\) for RW2, and the prior for \(\delta\) is:

For RW1: \[ p(\delta | \tau_{\delta}) \propto \tau_{\delta}^{\left(\frac{n(T-1)}{2}\right)} \exp\left(-\frac{\tau_{\delta}}{2} \sum_{i=1}^{n} \sum_{t=1}^{T-1} (\delta_{i,t+1} - \delta_{i,t})^2 \right) \] \[ \propto \tau_{\delta}^{\left(\frac{n(T-1)}{2}\right)} \exp\left(-\frac{1}{2} \delta' Q_{\delta(II)}^{(RW1)} \delta \right) \text{ where } i=1, \ldots, n \, \text{ and } \, t=1, \ldots, T-1 \]

For RW2: \[ p(\delta | \tau_{\delta}) \propto \tau_{\delta}^{\left(\frac{n(T-2)}{2}\right)} \exp\left(-\frac{\tau_{\delta}}{2} \sum_{i=1}^{n} \sum_{t=1}^{T-2} (\delta_{it} - 2\delta_{i,t+1} + \delta_{i,t+2})^2 \right) \] \[ \propto \tau_{\delta}^{\left(\frac{n(T-2)}{2}\right)} \exp\left(-\frac{1}{2} \delta' Q_{\delta(II)}^{(RW2)} \delta \right) \text{ where } i=1, \ldots, n \, \text{ and } \, t=1, \ldots, T-2 \]

where \(\delta = (\delta_1^{'}, \ldots, \delta_n^{'})^{'}\), and \(Q_{\delta(II)}^{(RW1)} = \tau_{\delta} R_{\delta(II)}^{(RW1)}\) is the precision matrix for RW1 and \(Q_{\delta(II)}^{(RW2)} = \tau_{\delta} R_{\delta(II)}^{(RW2)}\) for RW2, and where \(R_{\delta(II)}^{(RW1)}\) and \(R_{\delta(II)}^{(RW2)}\) are the temporal trend structure matrices for RW1 and RW2, respectively.

Type III interaction combines the temporally unstructured main effect (\(\gamma_t\)) and the spatially structured main effect (\(\omega_i\)). Then \(\delta_t = (\delta_{1t}, \ldots, \delta_{nt})^{'}\) for \(t=1,\ldots,T\) follows an independent iCAR prior with a mean of zero. The structure matrix of Type III interaction is \(R_{\delta(III)} = R_{\gamma} \otimes R_{\omega}\), where \(R_{\gamma} = I_T\) and \(R_{\omega}\) is the spatial structure matrix defined through the iCAR prior specification. This interaction assumes that the spatially structured components are independent over time. The structured matrix \(R_{\delta(III)}\) has rank \(T(n-1)\) and the prior for \(\delta\) is provided by:

\[ p(\delta | \tau_{\delta}) \propto \tau_{\delta}^{\left(\frac{T(n-1)}{2}\right)} \exp\left(-\frac{\tau_{\delta}}{2} \sum_{t=1}^{T} \sum_{i \sim j}^{n} (\delta_{it} - \delta_{jt})^2 \right) \] \[ \propto \tau_{\delta}^{\left(\frac{T(n-1)}{2}\right)} \exp\left(-\frac{1}{2} \delta^{'} Q_{\delta(III)} \delta \right) \quad \text{where} \quad i=1,\ldots,n \, \text{and} \, t=1,\ldots,T, \]

with \(\delta = (\delta_1^{'},\ldots,\delta_T^{'})^{'}\) and the precision matrix being \(Q_{\delta(III)} = \tau_{\delta} R_{\delta(III)}\).

Type IV interaction combines the spatially and temporally structured main effects (\(\omega_i\) and \(\phi_t\)), which imply that \(\delta = (\delta_{11}, \ldots, \delta_{nt})^{'}\) for \(i=1,\ldots,n\) and \(t=1,\ldots,T\) is dependent over space and time. The temporal dependency structure for each region therefore depends on the temporal structure of the neighboring regions. Then, \(\delta\) is independent and identically normally distributed with a mean of zero. The structure matrix of Type IV interaction is \(R_{\delta(IV)} = R_{\omega} \otimes R_{\phi}\). \(R_{\omega}\) denotes the spatial structure matrix defined by the iCAR prior with \(R_{\phi}\) being the temporal trend structure matrix defined through RW1 or RW2. The structured matrix \(R_{\delta}\) has rank \((T-1)(n-1)\) for RW1 and \((T-2)(n-1)\) for RW2. The joint density prior for \(\delta\) is provided by:

RW1: \[ p(\delta | \tau_{\delta} ) \propto \tau_{\delta}^{\left( \frac{n(T-1)}{2} \right)} \exp\left( -\frac{\tau_{\delta}}{2} \sum_{t=1}^{T-1} \sum_{i \sim j}^{n} (\delta_{i,t+1} - \delta_{j,t+1} - \delta_{i,t} + \delta_{j,t})^2 \right) \] \[ \propto \tau_{\delta}^{\left( \frac{n(T-1)}{2} \right)} \exp\left( -\frac{1}{2} \delta^{'} Q_{\delta(IV)}^{(RW1)} \delta \right) \quad i=1,\ldots,n \quad \text{and} \quad t=1,\ldots,T-1 \]

RW2: \[ p(\delta | \tau_{\delta} ) \propto \tau_{\delta}^{\left( \frac{n(T-2)}{2} \right)} \exp\left( -\frac{\tau_{\delta}}{2} \sum_{t=1}^{T-2} \sum_{i \sim j}^{n} (\delta_{it} - \delta_{jt} - 2\delta_{i,t+1} + 2\delta_{j,t+1} + \delta_{i,t+2} - \delta_{j,t+2})^2 \right) \] \[ \propto \tau_{\delta}^{\left( \frac{n(T-2)}{2} \right)} \exp\left( -\frac{1}{2} \delta^{'} Q_{\delta(IV)}^{(RW2)} \delta \right) \quad i=1,\ldots,n \quad \text{and} \quad t=1,\ldots,T-2 \]

with \(\delta = (\delta_{11}, \ldots, \delta_{nT})^{'}\), and where \(Q_{\delta(IV)}^{(RW1)} = \tau_{\delta} R_{\delta(IV)}^{(RW1)}\) is the precision matrix for RW1 and \(Q_{\delta(IV)}^{(RW2)} = \tau_{\delta} R_{\delta(IV)}^{(RW2)}\) for RW2, where \(R_{\delta(IV)}^{(RW1)}\) and \(R_{\delta(IV)}^{(RW2)}\) are the temporal trend structure matrices for the RW1 and RW2 priors, respectively.

Up to a proportionality constant, the product of the likelihood (for the Poisson distribution) and the independent prior distributions for the unknown parameters \(p(\alpha), p(\omega), p(v), p(\phi), p(\kappa), p(\gamma), p(\delta), p(\tau_v), p(\tau_{\omega}), p(\tau_{\phi}), p(\tau_{\kappa}), p(\tau_{\gamma}), p(\tau_{\delta})\) yields the joint posterior distribution of the model parameters.

\[ p(\alpha, \omega, v, \phi, \kappa, \gamma, \delta, \tau_v, \tau_{\omega}, \tau_{\phi}, \tau_{\kappa}, \tau_{\gamma}, \tau_{\delta} | y) \propto p(y | \alpha, \omega, v, \phi, \kappa, \gamma, \delta, \tau_v, \tau_{\omega}, \tau_{\phi}, \tau_{\kappa}, \tau_{\gamma}, \tau_{\delta}) \] \[ \times p(\alpha) p(\omega) p(v) p(\phi) p(\kappa) p(\gamma) p(\delta) p(\tau_v) p(\tau_{\omega}) p(\tau_{\phi}) p(\tau_{\kappa}) p(\tau_{\gamma}) p(\tau_{\delta}). \]

The marginal posterior distributions are obtained from joint posterior, and from which summary statistics (e.g. mean, median, mode or quantiles for credible intervals) can be derived.

Application