Introduction
Welcome to Module II of our Spatial Statistics and Disease Mapping
course! Building upon the foundations of spatial statistics from Module
I, we now delve into the different types of spatial data and the
corresponding analytical techniques. This module focuses on the
methods used to analyze each spatial data type. It omits the R code
demonstrations and instead builds conceptual understanding through
practical examples.
1. Types of Spatial Data and Their Analysis Methods
In this module, we’ll explore the main types of spatial data: areal
(lattice), geostatistical, and point pattern data, alongside the
specific analytical methods used to address different spatial
questions.
1.1 Areal Data (Lattice Data) Analysis
Areal data are values aggregated within predefined areas, such as
administrative units or census tracts.
1.1.1 Spatial Autocorrelation
- Definition: Spatial autocorrelation measures how
similar values are at different locations as a function of their
proximity. It assesses whether values in nearby areas tend to be more
similar than those farther apart.
- Measures: Common measures include Moran’s I and
Geary’s C, which quantify the strength and direction of spatial
autocorrelation (for example, whether similar values cluster).
- Real-World Examples:
- Public Health: Analyzing the spatial distribution
of disease rates across regions. For example, mapping areas with higher
rates of diabetes and testing whether high-rate areas cluster
together.
- Socioeconomics: Examining if neighboring census
tracts have similar average income levels, using spatial data from
surveys.
- Environmental Science: Studying if neighboring
areas have similar levels of deforestation based on satellite data.
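As a rough illustration of how Moran’s I is computed, the sketch below implements the global statistic in plain Python. The function name `morans_i`, the four-area layout, and the rate values are all hypothetical; binary adjacency weights are assumed, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for n areal values and an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every ordered pair of areas.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    sum_sq = sum(d * d for d in dev)
    return (n / total_weight) * cross / sum_sq


# Hypothetical data: four areas in a row; adjacent areas are neighbors (weight 1).
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = [10.0, 10.0, 2.0, 2.0]    # similar values sit side by side
alternating = [10.0, 2.0, 10.0, 2.0]  # neighbors always differ

i_clustered = morans_i(clustered, w)      # positive: neighbors are alike
i_alternating = morans_i(alternating, w)  # negative: neighbors are unlike
```

Positive values indicate that neighboring areas tend to have similar values, and negative values indicate that neighbors tend to differ.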
1.1.2 Spatial Regression Models
- Definition: Spatial regression models analyze
relationships between a response variable and explanatory variables
while explicitly accounting for spatial dependencies.
- Models: Common choices include spatial lag models,
spatial error models, and geographically weighted regression (GWR).
- Real-World Examples:
- Public Health: Modeling the relationship between
access to healthcare facilities and mortality rates across
administrative areas, considering spatial clustering of mortality.
- Socioeconomics: Assessing the relationship between
socioeconomic factors and crime rates across police districts,
accounting for spatial spillovers of crime.
- Environmental Science: Modeling the impact of land
use on biodiversity, considering spatial correlation of vegetation
between adjacent regions.
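For reference, the spatial lag and spatial error models are often written as follows. The notation is an assumption of this sketch rather than something fixed by the course: y is the response vector, X the matrix of explanatory variables, W a spatial weights matrix, ε independent errors, and ρ and λ the spatial dependence parameters.

```latex
\begin{align}
\text{Spatial lag:}\quad   & y = \rho W y + X\beta + \varepsilon \\
\text{Spatial error:}\quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align}
```

In the lag model the response in one area depends directly on the responses of its neighbors (spillover), whereas in the error model the spatial dependence sits in the unexplained part of the response.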
1.1.3 Disease Mapping
- Definition: Disease mapping aims to estimate and
visualize the spatial distribution of disease risk. It is about
identifying areas with higher or lower risk and assessing factors
influencing those risks.
- Models: Bayesian hierarchical models are often used
for disease mapping.
- Real-World Examples:
- Public Health: Mapping disease prevalence or
incidence in different areas using survey data.
- Public Health: Creating maps to monitor temporal
change in the geographical distribution of malaria.
- Environmental Health: Mapping areas with increased
risk of environmental hazards based on surveys and environmental
data.
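A common starting point for a disease map is the standardized morbidity (or mortality) ratio, SMR: the observed count in each area divided by the count expected if every area shared the same underlying rate. The sketch below is a minimal illustration with hypothetical counts and populations. It uses a single overall rate for the expected counts, which is a simplification of age- and sex-standardized calculations.

```python
def standardized_ratios(observed, population):
    """SMR per area, using one overall rate to compute expected counts."""
    overall_rate = sum(observed) / sum(population)
    expected = [overall_rate * p for p in population]
    return [o / e for o, e in zip(observed, expected)]


# Hypothetical case counts and populations for three areas.
observed = [12, 30, 3]
population = [1000, 4000, 1000]

smr = standardized_ratios(observed, population)
# Values above 1 suggest higher-than-expected risk; values below 1, lower.
```

Raw SMRs are unstable in areas with small populations, which is one reason Bayesian hierarchical models are used to smooth the estimates.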
1.2 Geostatistical Data Analysis
Geostatistical data represent continuous variables observed at
specific locations across a spatial field. The focus is often on
predicting values at unobserved locations.
1.2.1 Kriging
- Definition: Kriging is a geostatistical
interpolation technique that predicts values at unsampled locations
based on spatial correlation. It provides not just predictions but also
their associated uncertainty.
- Real-World Examples:
- Public Health: Estimating the spatial distribution
of air pollution across a city using measurements from monitoring
stations.
- Environmental Science: Predicting soil
contamination levels in a region using sample data at different
locations.
- Agricultural Science: Mapping soil nutrient
concentrations based on data from field samples.
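To make the idea of prediction with uncertainty concrete, here is a minimal sketch of ordinary kriging in plain Python. It assumes an exponential covariance function with a known sill and range and no nugget effect. The parameter values, station coordinates, and measurements are hypothetical, and in practice the covariance parameters would be estimated from the data.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ordinary_kriging(coords, values, target, sill=1.0, corr_range=2.0):
    """Return (prediction, kriging variance) at target under an exponential covariance."""
    def cov(p, q):
        return sill * math.exp(-math.dist(p, q) / corr_range)

    n = len(coords)
    # Kriging system: covariances between observations, plus the constraint
    # that the weights sum to one (enforced through a Lagrange multiplier).
    a = [[cov(coords[i], coords[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [cov(c, target) for c in coords] + [1.0]
    solution = solve(a, b)
    weights, multiplier = solution[:n], solution[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sill - sum(w * c for w, c in zip(weights, b[:n])) - multiplier
    return prediction, variance


# Hypothetical monitoring stations at the corners of a unit square.
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
pm25 = [12.0, 15.0, 11.0, 18.0]

center_pred, center_var = ordinary_kriging(stations, pm25, (0.5, 0.5))
at_station_pred, at_station_var = ordinary_kriging(stations, pm25, (0.0, 0.0))
```

Because the center of the square is equidistant from all four stations, the weights are equal and the prediction is the plain average. At an observed location the prediction reproduces the measurement and the kriging variance drops to zero, while away from the stations the variance is positive.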
1.2.2 Variogram Analysis
- Definition: Variogram analysis describes spatial
dependence as a function of distance, showing how spatial correlation
decays as the distance between observations increases.
- Real-World Examples:
- Public Health: Analyzing how quickly disease
incidence changes with the distance between observations.
- Environmental Science: Studying how the
concentrations of pollutants in groundwater vary with distance.
- Geoscience: Examining the spatial variability of
soil properties.
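The empirical semivariogram can be sketched in a few lines: for each distance bin, average the squared differences between all pairs of observations whose separation falls in that bin, then halve the result. The function name, the bin settings, and the transect data below are hypothetical.

```python
import math
from itertools import combinations


def empirical_variogram(coords, values, bin_width, n_bins):
    """Return (bin midpoint, semivariance, pair count) for each non-empty distance bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(coords)), 2):
        h = math.dist(coords[i], coords[j])
        k = int(h // bin_width)
        if k < n_bins:
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return [((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]


# Hypothetical measurements along a transect, one unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

variogram = empirical_variogram(coords, values, bin_width=1.0, n_bins=4)
```

In this toy example the semivariance grows with distance, meaning that observations farther apart are less alike.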
1.2.3 Spatial Regression and Prediction
- Definition: Spatial regression methods for
geostatistical data combine explanatory variables with spatial effects
to make predictions, especially useful when local and global trends
exist.
- Real-World Examples:
- Public Health: Modeling the risk of vector-borne
diseases based on environmental factors, accounting for spatial
correlation of risk across locations.
- Environmental Science: Predicting biodiversity
using habitat variables, considering spatial dependence between nearby
areas.
- Agricultural Science: Modeling crop yields using
climate and soil data, with spatial dependence between
observations.
1.3 Point Pattern Data Analysis
Point pattern data represent locations of events in space, often
without reference to area. Analysis focuses on the spatial distribution
of events.
1.3.1 Point Pattern Intensity
- Definition: Intensity refers to the expected number
of events per unit area. Estimating it reveals where events are more or
less concentrated.
- Real-World Examples:
- Public Health: Assessing intensity of emergency
calls in a city to allocate resources.
- Urban Planning: Evaluating the density of
businesses in different areas of a city.
- Ecology: Examining the intensity of animal nests in
a forest to understand distribution.
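One simple way to estimate how intensity varies over space is quadrat counting: divide the study window into equal cells, count the events in each, and divide by the cell area. The sketch below assumes a rectangular window with its corner at the origin and events that all fall inside it. The event coordinates are hypothetical.

```python
def quadrat_intensity(points, width, height, nx, ny):
    """Estimated intensity (events per unit area) in each cell of an nx-by-ny grid."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # Points on the upper or right edge are assigned to the last cell.
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    cell_area = (width / nx) * (height / ny)
    return [[c / cell_area for c in row] for row in counts]


# Hypothetical event locations in a 2-by-2 window.
events = [(0.2, 0.3), (0.5, 0.8), (0.9, 0.1), (1.5, 1.5)]

overall_intensity = len(events) / (2.0 * 2.0)
by_cell = quadrat_intensity(events, width=2.0, height=2.0, nx=2, ny=2)
```

The overall intensity here is one event per unit area, but the cell-level estimates show that most events fall in a single quadrat.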
1.3.2 Nearest Neighbor Analysis
- Definition: Nearest neighbor analysis calculates
distances between points and their nearest neighbors to determine
clustering, dispersion, or randomness.
- Real-World Examples:
- Public Health: Determining if disease cases are
clustered, which may indicate an outbreak or a point source of
infection.
- Retail Analysis: Studying the proximity of
competing retail outlets.
- Ecology: Examining the distribution of different
tree species.
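One widely used nearest neighbor summary is the Clark-Evans index: the observed mean nearest neighbor distance divided by the value expected under complete spatial randomness at the same intensity. The sketch below is a minimal version without edge correction, and the point coordinates are hypothetical.

```python
import math


def clark_evans(points, area):
    """Clark-Evans index: below 1 suggests clustering, above 1 suggests regularity."""
    n = len(points)
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed_mean = sum(nearest) / n
    # Expected mean nearest neighbor distance under complete spatial randomness.
    expected_mean = 0.5 / math.sqrt(n / area)
    return observed_mean / expected_mean


# Hypothetical patterns in a unit square.
regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
clustered = [(0.50, 0.50), (0.51, 0.50), (0.50, 0.51), (0.51, 0.51)]

r_regular = clark_evans(regular, area=1.0)
r_clustered = clark_evans(clustered, area=1.0)
```

With so few points the index is only illustrative. In real analyses, edge effects and sampling variability need to be taken into account before concluding that a pattern is clustered or dispersed.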
1.3.3 Spatial Point Process Models
- Definition: Spatial point process models are
stochastic models that describe the mechanisms generating point
patterns.
- Real-World Examples:
- Public Health: Modeling distribution of infectious
disease cases, revealing factors that influence spread.
- Criminology: Modeling spatial distribution of crime
to understand risk factors and inform police strategies.
- Ecology: Understanding spatial distribution of
animal nests and factors that influence it.
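The simplest point process model is the homogeneous Poisson process, also called complete spatial randomness, which is often used as a baseline against which observed patterns are compared. The sketch below simulates it on a rectangular window: the number of events is Poisson distributed with mean equal to intensity times area, and locations are uniform. The function name and parameter values are hypothetical, and the simple Poisson sampler used here is only suitable for moderate expected counts.

```python
import math
import random


def simulate_csr(intensity, width, height, seed=0):
    """Simulate a homogeneous Poisson process on a width-by-height window."""
    rng = random.Random(seed)
    mean_count = intensity * width * height
    # Draw the Poisson count by multiplying uniforms until the product
    # falls to exp(-mean) or below (Knuth's method).
    limit = math.exp(-mean_count)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    # Given the count, event locations are independent and uniform.
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(count)]


pattern = simulate_csr(intensity=5.0, width=2.0, height=3.0, seed=1)
```

Comparing summaries of an observed pattern with summaries of many such simulated patterns is one way to judge whether the observed events are more clustered or more regular than chance alone would produce.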
2. Example: Disease Analysis
Let’s consider disease as an example:
- Areal Data: We might have disease rates by
administrative region. We can use spatial autocorrelation to assess if
high-risk areas cluster, or spatial regression to see the relationship
between risk and local risk factors.
- Geostatistical Data: If we have disease risk
indicators measured at different points, we can use kriging to predict
and map disease risk in unmeasured areas.
- Point Pattern Data: If we have the locations of
individual cases, we can use point pattern methods to assess whether
cases cluster and to investigate how the disease spreads.
3. Conclusion and Transition to Module III
Module II covered core spatial analysis methods and their
applications based on different spatial data types (areal,
geostatistical, and point patterns). This provides a strong foundation
for analyzing spatial patterns, understanding relationships and
exploring spatial processes in the real world.
In Module III, we will shift our focus to
Regression Analysis using Spatial Methods. We will
review linear regression, introduce spatial regression models (spatial
lag and spatial error models), explore their application with survey
data, and focus on how to interpret and assess these model outputs. We
will combine the spatial analysis concepts from Module II with the
regression methods of Module III.