16/04/2018

Spatial Data

  • Revolutionary new tools for collecting and analyzing information (GPS, electronic noise, brain/neural activity, etc.)

  • Economics: Regional science, geopolitics, geographical economics, urban economics, etc.

  • Spatial statistics: big data, SPDEs, computation. Moreover, most spatial data are, unfortunately, unreplicated.

  • Phenomena that occur at one spatial scale may not be observable at another because of complex interactions across scales. (Activities at the town level, city level, and state level are quite different.)

Spatial Data

  • Let \((x,y)\) be the position coordinate.

  • \(Z(x,y)\): quantities (scalar or vector) measured at location \((x,y)\)

  • Three categories:

  1. Geostatistical data: \(Z(x,y)\) varies continuously in the spatial variables \((x,y)\). The value of \(Z\) is measured at a set of points with coordinates \((x_i, y_i)\), \(i=1,\dots,n\).

  2. Areal data: \(Z\) is defined only on a fixed set of areal units \(\mathcal{D}=\{A, B, C,\dots \}\).

  3. Point Process: A spatial point pattern is a dataset giving the observed spatial locations of things or events.

Image

f = function(x,y){15*(cos(sqrt((x-3)^2+3*(y-3)^2)))^2}  # a deterministic surface over the plane
library(spatstat); A = as.im(f, W=square(6)); plot(A)   # pixel image of f on [0,6] x [0,6]

People in Gordon Square, London

Crimes in Chicago

Random Field

Spatial Data

The objectives:

  1. Estimation of a global, nonspatial statistic such as the value of the mean or CDF.

  2. Characterization of the spatial variation of the sampled quantity.

  3. Creation of a global function for prediction or inference.

Spatial Autocorrelation

  • Spatial autocorrelation: near things are more related than distant things

  • A random variable that is measured at a set of locations is called a random field

  • \(Z(x,y) = T(x,y) + \eta (x,y) + \varepsilon(x,y)\),

  • \(T(x,y)\) is a deterministic trend, \(\eta(x,y)\) a spatially autocorrelated random field, and \(\varepsilon(x,y)\) an uncorrelated random field.

  • For example, the trend may be specified through a logistic function, \(T(x,y)= \frac{2e^{-Z(x,y)}}{1+ e^{-Z(x,y)}}\).

Spatial Autocorrelation

  • Stationarity: \(\mathbb{E}[Z_i] =\mu\), \(\mbox{Var}(Z_i)=\sigma^2\), \[\mbox{Cov}(Z_i, Z_j)=\sigma^2 \mbox{Cor}\{(x_i,y_i)-(x_j,y_j)\}\] where \(\mbox{Cor}\{(x_i,y_i)-(x_j,y_j)\}\) is a correlation function depending only on the relative locations of \((x_i, y_i)\) and \((x_j, y_j)\).
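
A minimal simulation sketch of such a stationary field (my own illustration, assuming an exponential correlation function \(\mbox{Cor}(h)=\exp(-\|h\|/\phi)\) and arbitrary values \(\mu=5\), \(\sigma^2=2\), \(\phi=3\)):

library(MASS)                                           # for mvrnorm
set.seed(1)
grid = expand.grid(x = 1:10, y = 1:10)                  # 10 x 10 lattice of locations
D = as.matrix(dist(grid))                               # pairwise distances between locations
Sigma = 2 * exp(-D / 3)                                 # Cov(Z_i, Z_j) = sigma^2 * exp(-|h|/phi)
Z = mvrnorm(1, mu = rep(5, nrow(grid)), Sigma = Sigma)  # one draw of the stationary field
image(matrix(Z, 10, 10))                                # nearby cells take similar values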

Spatial Autocorrelation

  • Consider the simplest spatially autocorrelated random field on a square \(m\times m\) lattice: \[Z_i = \mu + \eta_i \\ \eta_i = \lambda\left( \sum_{j=1}^{n} w_{ij} \eta_j \right) + \varepsilon_i, \, i=1,2,\dots,n\] where \(n=m^2\) and \(w_{ij}\) measures the connection strength between locations \(i\) and \(j\).

  • \(w_{ij}\) plays an important role in the modeling of the spatial structure.

Spatial Autocorrelation

  • The simplest choice of \(w_{ij}\): the elements of \(W\) are nonzero only for those \(w_{ij}\) representing pairs of cells whose adjoining boundaries have length greater than zero (i.e., cells sharing an edge).

  • Note that \(\eta = (\mathbf{I} - \lambda \mathbf{W})^{-1}\varepsilon\).

  • The quantity \((\mathbf{I} - \lambda \mathbf{W})\) is sometimes called the spatial filter.
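
A small sketch (illustration only, assuming a 4 × 4 rook-contiguity lattice and \(\lambda = 0.6\)) that builds \(W\) explicitly and checks \(\eta = (\mathbf{I} - \lambda \mathbf{W})^{-1}\varepsilon\) against spdep's invIrM:

library(spdep)
nlist = cell2nb(4, 4)                          # rook neighbours on a 4 x 4 lattice
W = nb2mat(nlist, style = "W")                 # row-standardised spatial weight matrix
filt = solve(diag(16) - 0.6 * W)               # (I - lambda W)^{-1} computed by hand
max(abs(filt - invIrM(nlist, 0.6)))            # agrees with spdep's invIrM (difference ~ 0)
set.seed(1)
eta = filt %*% rnorm(16)                       # autocorrelated field eta = (I - lambda W)^{-1} eps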

Spatial Autocorrelation

library(spdep); set.seed(123); X1 = rnorm(100); X2 = rnorm(100)
eps = 0.01 * rnorm(100); b = c(0.5,0.3)
Y = b[1]*X1 + b[2]*X2 + eps  # no spatial structure: OLS recovers b
print(coef(lm(Y ~ X1 + X2)), digits = 2)
## (Intercept)          X1          X2 
##      0.0014      0.4987      0.3002

Spatial Autocorrelation

nlist = cell2nb(10, 10); IrWinv = invIrM(nlist, 0.6)
Y = IrWinv %*% (b[1]*X1 + b[2]*X2 + eps) # with spatial weight
print(coef(lm(Y ~ X1 + X2)), digits = 2)
## (Intercept)          X1          X2 
##       0.021       0.542       0.324

Spatial Autocorrelation - GLM

coords = expand.grid(1:20,20:1) 
nlist = cell2nb(20, 20); 
IrWinv = invIrM(nlist, 0.6) 
X = IrWinv %*% rnorm(400)
p = 1 / (1 + exp(-X)); Y = rbinom(length(p), 1, p)
Y.glm = glm(Y ~ X, family = "binomial")
print(coef(Y.glm))
## (Intercept)           X 
## -0.02401708  0.85820559
error.glm = (sum(c(0,1) - coef(Y.glm)))^2  # squared sum of coefficient errors
print(error.glm, digits = 2)
## [1] 0.027

Spatial Autocorrelation - GLM

A = autocov_dist(Y, as.matrix(coords), nbs = 1, type = "inverse")  # distance-weighted autocovariate
Y.acm = glm(Y ~ X + A, family = "binomial")
error.acm = (sum(c(0,1) - coef(Y.acm)[1:2]))^2
print(c(error.glm, error.acm), digits = 2)
## [1] 0.0275 0.0076

Point Process

  • Are the points spread uniformly over the survey region?

  • Does the density of points depend on an explanatory variable?

  • Interest lies not in the points themselves, but in the way the points were generated.

  • A point process: a random mechanism whose outcome is a point pattern.

  • These methods are not for point data where the \((x, y)\) locations are fixed. New notation is needed.

Point Process

X=ppp(x=1:20,y=20:1, c(-10,40),c(-10,40))
plot(X)

Point Process

  • A point pattern is a set \(\mathbf{x}= \{ x_1,x_2,\dots,x_n \}\).

  • In two-dimensional space \(x_i \in \mathbb{R}^{2}\).

  • The number of points \(n=\#(\mathbf{x})\).

  • If \(\mathcal{G}\) is a region, write \(\mathbf{x} \cap \mathcal{G}\) for the subset of points falling in \(\mathcal{G}\).

Region

Region = owin(poly=list(list(x=c(0,10,0), y=c(0,0,10)), 
                        list(x=c(2,2,4,4), y=c(2,4,4,2)))); plot(Region)

Point Process

A finite point process:

  1. every possible outcome is a point pattern with a finite number of points;

  2. for any region \(\mathcal{M}\), the number \(\#(\mathbf{x} \cap \mathcal{M})\) is a random variable.

If \(\#(\mathbf{x})\) may be infinite, we can consider a locally finite point process, where condition 1 requires only that each outcome is a locally finite point pattern, and condition 2 holds for any bounded region \(\mathcal{M}\).

Uniformly Random Points

library(spatstat)
plot(runifpoint(50, square(1), nsim=2))

Uniformly Random Points

  • The probability of the uniformly random vector \(U=(U_1,U_2)\) falling in \(\mathcal{M} \subset \mathcal{G}\) is \[\Pr \{ U \in \mathcal{M} \} = \int_{\mathcal{M}} p(u_1, u_2)du_1 du_2,\] where \(p(u_1,u_2) = 1/|\mathcal{G}|\) and \(|\mathcal{G}|\) is the area of \(\mathcal{G}\).

  • Equivalently, \(\Pr \{ \mathbf{U} \in \mathcal{M} \}=|\mathcal{M}| / |\mathcal{G}|\).

  • For \(n\) independent uniformly random points, the count \(\#(\mathbf{U} \cap \mathcal{M})\) is the number of successes falling in \(\mathcal{M}\), which follows a binomial distribution with \(p = |\mathcal{M}|/|\mathcal{G}|\): \[\Pr\{ \#(\mathbf{U} \cap \mathcal{M}) = k \} = \left(\begin{array}{c} n\\ k\\ \end{array}\right) p^k (1-p)^{n-k}.\]
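
A quick simulation check (illustration only): with \(n=50\) uniform points in the unit square and \(\mathcal{M}=[0,0.5]\times[0,0.5]\), \(p = 0.25\) and the expected count is \(np = 12.5\).

library(spatstat)
set.seed(1)
M = owin(c(0, 0.5), c(0, 0.5))                  # subregion M of the unit square
counts = sapply(1:1000, function(i) npoints(runifpoint(50, square(1))[M]))
mean(counts)                                    # close to 50 * 0.25 = 12.5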

Poisson process

  • Homogeneous Poisson point process: complete spatial randomness

  • Homogeneity: the points have no preference for any spatial location. The expected number of points falling in \(\mathcal{M}\) should be proportional to \(|\mathcal{M}|\): \[\mathbb{E}[\#(X \cap \mathcal{M})] = \lambda |\mathcal{M}|.\]

Poisson process

  • Independence: information about the outcome in one region of space has no influence on the outcome in other regions. For disjoint sets \(\mathcal{M}_1\) and \(\mathcal{M}_2\), \(\#(X \cap \mathcal{M}_1)\) and \(\#(X \cap \mathcal{M}_2)\) are independent random variables.

  • The count \(\#(X \cap \mathcal{M})\) follows a Poisson distribution with mean \(\mu = \lambda |\mathcal{M}|\): \[\Pr\{ \#(X \cap \mathcal{M}) = k \} = e^{-\mu} \frac{\mu^{k}}{k!}.\]
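
Simulation check (illustration only): for \(\lambda = 100\) and \(\mathcal{M}=[0,0.5]\times[0,0.5]\), the count has mean and variance \(\mu = \lambda|\mathcal{M}| = 25\).

library(spatstat)
set.seed(1)
M = owin(c(0, 0.5), c(0, 0.5))
counts = sapply(1:1000, function(i) npoints(rpoispp(100, win = square(1))[M]))
c(mean(counts), var(counts))        # both close to lambda * |M| = 25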

Remark

  • For a homogeneous Poisson process with intensity \(\lambda\), if each point is retained independently with probability \(p\) (and deleted with probability \(1-p\)), the remaining points still form a homogeneous Poisson process, but with intensity \(p \lambda\) (independent thinning).

  • Combine two independent homogeneous Poisson processes \(\mathcal{X}\) and \(\mathcal{Y}\) with intensities \(\lambda_x\) and \(\lambda_y\): \[\mathcal{Z} = \mathcal{X} \cup \mathcal{Y}.\] The new process is still a homogeneous Poisson process with intensity \(\lambda_z = \lambda_x +\lambda_y\) (superposition).
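
Both facts can be checked by simulation; a sketch (illustration only) using spatstat's rthin for independent thinning and superimpose for superposition:

library(spatstat)
set.seed(1)
X = rpoispp(200)                    # homogeneous Poisson, intensity 200, unit square
Xthin = rthin(X, P = 0.3)           # keep each point independently with probability p = 0.3
Y = rpoispp(50)                     # an independent Poisson process, intensity 50
Z = superimpose(X, Y)               # superposition of X and Y
c(intensity(Xthin), intensity(Z))   # roughly 0.3 * 200 = 60 and 200 + 50 = 250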

Poisson process

X = runifpoint(50, square(c(0,2))) 
Y = runifpoint(50, square(c(1,3)))
plot(superimpose(X,Y))

Poisson process

X=rpoispp(1000)
plot(density(X))

Poisson process

X=rpoispp(100, win=owin(c(0,5),c(0,8))); X
## Planar point pattern: 4039 points
## window: rectangle = [0, 5] x [0, 8] units
intensity(X)/area(Window(X)) #100/40 =2.5
## [1] 2.524375

Poisson process

  • Inhomogeneous Poisson point process: the intensity varies over the spatial region.

  • The mean parameter is given by an intensity function \(\lambda(u)\) of spatial location \(u\).

  • In a region \(\mathcal{M}\), the expected total number of points is \(\int_{\mathcal{M}} \lambda(u)du\).
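
Simulation check of the expected count (illustration only, with the assumed intensity \(\lambda(x,y)=100x\) on the unit square, so \(\int_{\mathcal{M}} \lambda(u)du = 50\)):

library(spatstat)
set.seed(1)
lam = function(x, y) { 100 * x }                                   # inhomogeneous intensity
counts = sapply(1:500, function(i) npoints(rpoispp(lam, lmax = 100)))
mean(counts)                                                       # close to 50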

Stationary Point Process

  • A point process is stationary if, when we view it through a window \(\mathcal{M}\), its statistical properties do not depend on the location of the window.

  • For a point pattern \(\mathbf{x}\), let \(\mathbf{x} + v\) be the pattern obtained by shifting each point by the same translation vector \(v\): \[\mathbf{x}+v = \{x_i +v : i = 1,\dots,n\}.\] Stationarity means that the statistical properties of \(X\) and \(X + v\) are identical, for any choice of the translation vector \(v\). For example: \[\mathbb{E}[\#(X \cap \mathcal{M})]= \mathbb{E}[\#(X \cap (\mathcal{M} - v) )].\]
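
A simulation illustration of stationarity: for a homogeneous Poisson process the mean count in a window \(\mathcal{M}\) and in a translated window \(\mathcal{M}+v\) agree (here \(|\mathcal{M}|=0.04\) and \(\lambda = 100\), both arbitrary choices).

library(spatstat)
set.seed(1)
M1 = owin(c(0.1, 0.3), c(0.1, 0.3)); M2 = shift(M1, c(0.5, 0.5))   # M and M shifted by v = (0.5, 0.5)
counts = t(sapply(1:1000, function(i) {
  X = rpoispp(100); c(npoints(X[M1]), npoints(X[M2]))
}))
colMeans(counts)                     # both close to 100 * 0.04 = 4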

Correlation

  • Consider the distances \(d_{ij} = \| x_i - x_j \|\) between all ordered pairs of distinct points \(x_i\) and \(x_j\).

  • Empirical CDF of the pairwise distances \[\hat{H}(r) =\frac{1}{n(n-1)}\sum_{i=1}^{n} \sum_{j=1, j\neq i}^{n} \mathbf{1}\{ d_{ij} \leq r \}= \frac{1}{n-1} \bar{\mathbb{B}}(r)\] where \(\bar{\mathbb{B}}(r)\) is the average number of \(r\)-neighbours per data point for any distance value \(r\geq 0\).
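
The empirical CDF \(\hat{H}(r)\) can be computed directly from the pairwise distances; a short sketch (illustration only):

library(spatstat)
set.seed(1)
X = runifpoint(100)
d = pairdist(X)                      # matrix of distances d_ij
d = d[row(d) != col(d)]              # keep ordered pairs of distinct points
Hhat = ecdf(d)                       # empirical CDF of the pairwise distances
Hhat(0.25)                           # proportion of pairs with d_ij <= 0.25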

Correlation

  • The expected number of \(r\)-neighbours of a typical point of the process \(X\), divided by the intensity \(\lambda\), is called the \(K\)-function: \[K(r) = \frac{1}{\lambda} \mathbb{E}[\#(\mathbb{B}_r (u)) | u \in X ]\] where \(\#(\mathbb{B}_r (u))\) denotes the number of \(r\)-neighbours of \(u\): \[\mathbb{B}_r (u) = \sum_{j=1}^{\#(\mathbf{x})} \mathbf{1} \{ 0< \|u-x_j\| \leq r \}.\]

Correlation

plot(Kest(X, correction="none")) 

#To assess whether the locations are completely random.

Correlation

  • The Dirichlet tile (also known as the Voronoi polygon) of a point \(x_i\) is \[ \{ u\in\mathbb{R}^2: \|u-x_i\| =\min_j \|u - x_j \| \}.\]
plot(dirichlet(X))

MLE

  • The likelihood function is \[L(\lambda \mid \mathbf{x}) = \lambda^{\#(\mathbf{x})} e^{(1-\lambda)|W|}\] where the point process lies inside a spatial window \(W\). The derivative of the log-likelihood is \[\frac{\#(\mathbf{x})}{\lambda} - |W|.\] So the MLE is \(\hat{\lambda}=\#(\mathbf{x})/|W|\).
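
A quick check of \(\hat{\lambda}=\#(\mathbf{x})/|W|\) (illustration only):

library(spatstat)
set.seed(1)
X = rpoispp(100, win = owin(c(0, 5), c(0, 8)))
npoints(X) / area(Window(X))    # the MLE; equals intensity(X), and ppm(X ~ 1) reports its log as the intercept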

Approximation MLE

  • Quadrature approximation: the integral in the Poisson process likelihood over the window \(W\) can be approximated by dividing \(W\) into small areas of size \(a\): \[\int_W \lambda_\theta (u) du \approx \sum_j \lambda_\theta (u_j) a\] where \(u_j\) is the centre of the \(j\)-th small area.

  • Summing over the small areas, \[\sum_i \log \lambda_\theta (x_i) \approx \sum_j n_j \log \lambda_\theta (u_j),\] where \(n_j\) is the number of data points falling in the \(j\)-th small area.

  • The log-likelihood becomes \[\log L(\theta) = \sum_{j} \left[ n_j \log \lambda_\theta (u_j)-\lambda_\theta (u_j)a \right].\]
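
A minimal sketch of this approximation (my own illustration, assuming a 50 × 50 grid of cells and the model \(\lambda_\theta(x,y)=\exp(\theta_1+\theta_2 x)\) with true \(\theta = (4, 2)\)); ppm carries out a refined version of the same computation:

library(spatstat)
set.seed(1)
X = rpoispp(function(x, y) exp(4 + 2 * x), lmax = exp(6))     # simulate from the assumed model
loglik = function(theta) {
  brks = seq(0, 1, length = 51)                               # 50 x 50 grid of small cells
  mids = (brks[-1] + brks[-51]) / 2
  g = expand.grid(x = mids, y = mids)                         # cell centres u_j
  a = (1 / 50)^2                                              # area a of each cell
  nj = as.vector(table(cut(X$x, brks), cut(X$y, brks)))       # n_j = data points in cell j
  lam = exp(theta[1] + theta[2] * g$x)                        # lambda_theta(u_j)
  sum(nj * log(lam) - lam * a)                                # approximate log-likelihood
}
optim(c(0, 0), function(th) -loglik(th))$par                  # roughly recovers (4, 2)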

Marked Point Process

  • Information attached to the point locations can be treated as covariates.

  • A spatial covariate is a function \(Z(u)\) observable at every spatial location \(u\in \mathcal{G}\).

  • Once these covariates are part of the outcomes of the point pattern, we call them marks.

  • A marked point pattern is an unordered set \[\mathbf{y} = \{ (x_1,m_1), \dots, (x_n,m_n) \}, \quad x_i\in\mathcal{G}, \; m_i \in \mathcal{C}.\]

Marked Point Process

X=ppp(x=1:20,y=20:1, c(-10,40),c(-10,40), marks=rnorm(20))
X$marks 
##  [1]  0.47127077 -0.59186389 -0.65191589 -1.45676385  0.17228460
##  [6]  0.55096552  0.57172204  1.18917935 -2.36533382  0.03322404
## [11] -1.33429768  0.60878579  0.08002488 -1.06872802 -0.37755366
## [16]  1.40288068 -0.54144195 -1.72492125  1.08919484  1.16976361

Marked Point Process

shapley.unmark=unmark(shapley)
plot(shapley.unmark)

Marked Point Process

  • 4215 galaxies in the Shapley Supercluster
head(marks(shapley),3)
##     Mag     V SigV
## 1 15.23 15056   81
## 2 17.22 16995   32
## 3 17.29 21211   81
  • Mag (Galaxy magnitude), V (Recession velocity), SigV (estimated std. V).

Exponential Family

  • Exponential family model for the intensity: \[\lambda_\theta(u) = h(u) \exp \{B(u) + \theta^{T} \mathbf{Z}(u)\} \\ = h(u) \exp\{B(u) + \theta_1 Z_1(u) +\cdots + \theta_p Z_p(u)\},\] where \(h(u)\), \(B(u)\), and \(Z_1 (u),\dots,Z_p (u)\) are known functions, and \(\theta_1, \dots, \theta_p\) are parameters to be estimated.

  • The log-likelihood is \[\log L(\theta) = \sum_{i=1}^{n} B(x_i) \\ + \theta^{T} \sum_{i=1}^{n}\mathbf{Z}(x_i) - \int_W \exp\{B(u) + \theta^{T} \mathbf{Z}(u)\} du.\] The \(\log h(u)\) term is dropped because it does not depend on \(\theta\).

Exponential Family

X=rpoispp(1000, win=owin(c(0,5),c(0,8)))
fit = ppm(X ~ 1); fit
## Stationary Poisson process
## Intensity: 996.95
##             Estimate        S.E.  CI95.lo  CI95.hi Ztest     Zval
## log(lambda) 6.904701 0.005007642 6.894886 6.914515   *** 1378.833

Exponential Family

  • The fitted model is \[\lambda(u) = \exp(\theta_0)\] where \(\theta_0\) is the intercept, so that \(\lambda(u)= \exp(6.93)\approx1000\).
exp(6.93)
## [1] 1022.494

GLM

  • GLM of \(X(u)\) on \(Z(u)\). \(Z(u)\) should be one of five types:
  1. A pixel image giving the values of a spatial covariate.

  2. A function which can be evaluated at any location \((x, y)\) to obtain the value of the spatial covariate.

  3. A window.

  4. A tessellation, which will be interpreted as a factor covariate.

  5. A single number.

GLM

f = function(x,y){50 * exp(2*x*y)}; sim = rpoispp(f); plot(sim) 

GLM

fit.sim = ppm(sim ~ x)
res.sim = residuals(fit.sim); plot(res.sim)

GLM

fit2.sim = ppm(sim ~ f)
res2.sim = residuals(fit2.sim); plot(res2.sim)

Galaxies

fit = ppm(shapley.unmark~1); fit
## Stationary Poisson process
## Intensity: 19.06951
##             Estimate       S.E.  CI95.lo CI95.hi Ztest    Zval
## log(lambda) 2.948091 0.01540285 2.917902 2.97828   *** 191.399
fit.poly = ppm(shapley.unmark~I(x^2)); fit.poly
## Nonstationary Poisson process
## 
## Log intensity:  ~I(x^2)
## 
## Fitted trend coefficients:
##   (Intercept)        I(x^2) 
## 10.5231373026 -0.0001836613 
## 
##                  Estimate         S.E.       CI95.lo       CI95.hi Ztest
## (Intercept) 10.5231373026 2.616657e-01 10.0102819661 11.0359926390   ***
## I(x^2)      -0.0001836613 6.422959e-06 -0.0001962501 -0.0001710726   ***
##                  Zval
## (Intercept)  40.21596
## I(x^2)      -28.59450

Galaxies

Win.s = Window(shapley); 
f=function(x,y){15*(cos(sqrt((x-3)^2+3*(y-3)^2)))^2} 
shapley.w = as.im(f, W=Win.s); plot(shapley.w)

Galaxies

fit.w = ppm(shapley.unmark~shapley.w); fit.w
## Nonstationary Poisson process
## 
## Log intensity:  ~shapley.w
## 
## Fitted trend coefficients:
## (Intercept)   shapley.w 
##  3.04141536 -0.01238761 
## 
##                Estimate        S.E.     CI95.lo      CI95.hi Ztest
## (Intercept)  3.04141536 0.025981231  2.99049309  3.092337643   ***
## shapley.w   -0.01238761 0.002911177 -0.01809341 -0.006681806   ***
##                   Zval
## (Intercept) 117.062017
## shapley.w    -4.255189

Galaxies

plot(fit.w, what=c("intensity"))

Galaxies

  • The null (\(H_0\)): a homogeneous Poisson process; the alternative (\(H_1\)): an inhomogeneous Poisson process
anova(fit, fit.w, test="LR")
## Analysis of Deviance Table
## 
## Model 1: ~1   Poisson
## Model 2: ~shapley.w   Poisson
##   Npar Df Deviance  Pr(>Chi)    
## 1    1                          
## 2    2  1   18.132 2.061e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Galaxies

sub=shapley.extra$UKSTfields[[4]]; S=as.owin(sub);
plot(shapley.unmark[S]); 

Galaxies

fit.sub = ppm(shapley.unmark~sub); fit.sub
## Nonstationary Poisson process
## 
## Log intensity:  ~sub
## 
## Fitted trend coefficients:
## (Intercept)     subTRUE 
##   2.7939342   0.8061887 
## 
##              Estimate       S.E.   CI95.lo   CI95.hi Ztest      Zval
## (Intercept) 2.7939342 0.01792299 2.7588058 2.8290626   *** 155.88550
## subTRUE     0.8061887 0.03505245 0.7374872 0.8748903   ***  22.99949

Galaxies

plot(simulate(fit.sub))

Linear Network

v = ppp(x=(-2):2, y=3*c(0,1,2,1,0), c(-3,3), c(-1,7)) 
edg = matrix(c(1,2,3,4,2,
               2,3,4,5,4), ncol=2) 
letterA = linnet(v, edges=edg); plot(letterA)

Chicago

plot(density(unmark(chicago), 10))

Chicago

in.chicago=intensity(chicago); in.chicago
## 
##      assault     burglary     cartheft       damage      robbery 
## 0.0006741528 0.0001605126 0.0002247176 0.0011235879 0.0001284100 
##        theft     trespass 
## 0.0012198955 0.0001926151
lppm(chicago~1)
## Point process model on linear network
## Stationary multitype Poisson process
## 
## Possible marks: 'assault', 'burglary', 'cartheft', 'damage', 'robbery', 
## 'theft' and 'trespass'
## 
## Uniform intensity for each mark level:   0.0005319845
## 
##              Estimate       S.E.   CI95.lo   CI95.hi Ztest     Zval
## (Intercept) -7.538896 0.09284767 -7.720874 -7.356918   *** -81.1964
## Original data: chicago
## Linear network with 338 vertices and 503 lines
## Enclosing window: rectangle = [0.3894, 1281.9863] x [153.1035, 1276.5602] 
## feet

Chicago

fit.chicago = lppm(chicago~marks)
coef(fit.chicago)
##   (Intercept) marksburglary markscartheft   marksdamage  marksrobbery 
##    -7.3032960    -1.4340387    -1.0979106     0.5124923    -1.6566755 
##    markstheft markstrespass 
##     0.5954129    -1.2513815

Chicago

X.chicago = rpoislpp(in.chicago, domain(chicago))
plot(X.chicago)

Cox Process

  • A Cox process is a Poisson process with a random intensity function \(\Lambda(u)\): given \(\Lambda\), the counts are Poisson with \(\mathbb{E}[\#(X \cap \mathcal{M})|\Lambda] = \int_{\mathcal{M}} \Lambda(u)du\), and the intensity of \(X\) is \[\lambda(u) =\mathbb{E}[\Lambda(u)].\]

  • Log-Gaussian Cox process \[\Lambda(u) = \exp G(u)\] where \(G(u)\) is a Gaussian random field whose mean function is \[\mu(u) =\mathbb{E}[G(u)]\] and the covariance function is \[C(u,v) =\mbox{Cov}(G(u), G(v)) = \mathbb{E}[G(u)G(v)] - \mu(u)\mu(v).\]

Cox Process

  • A simple option for the covariance function of the Gaussian random field is \[C(u,v)=\sigma^2 \exp(-\tau \|u-v\|).\]

  • Matérn cluster process: cluster centres (parents) \(y_i\) form a Poisson process, and the driving intensity is a sum of kernels centred at the parents, \[\lambda(u) = \mathbb{E}\Lambda(u) = \mathbb{E}\sum_i h(u-y_i) = \int k(v) h(u-v)dv\] where \(k(v)\) is the intensity of the parent process and \(h\) is the cluster kernel.

Cox Process

X = rLGCP(model="exp", mu=4, var=0.2, scale=0.1, win = square(1))
plot(density(X)) # scale is 1/\tau

Matern cluster

mu = as.im(function(x,y){ exp(2 * (x-y) + 1) }, owin())
X = rMatClust(10, 0.05, mu); plot(X)

Matern cluster

fit.k = kppm(shapley.unmark ~ shapley.w, "MatClust"); fit.k
## Inhomogeneous cluster point process model
## Fitted to point pattern dataset 'shapley.unmark'
## Fitted by minimum contrast
##  Summary statistic: inhomogeneous K-function
## 
## Log intensity:  ~shapley.w
## 
## Fitted trend coefficients:
## (Intercept)   shapley.w 
##  3.04141536 -0.01238761 
## 
## Cluster model: Matern cluster process
## Fitted cluster parameters:
##      kappa      scale 
## 0.07288465 1.01535339 
## Mean cluster size:  [pixel image]