Point Patterns: Hypothesis Testing

This exercise is the second part of the point pattern analysis series.

The most important concept to understand in dealing with point patterns is the idea that the observed data are conceptualized as a single realization of an underlying intensity surface. When you flip a coin there is some probability of getting heads. Imagine a landscape filled with magic coins that flip themselves when you say, “flip” and magically land the exact same spot whenever flipped. The coins are evenly spaced 1m apart and fill the land. When you say “flip” a map of heads emerges (also magically) in your computer, the map shows you the locations of the heads but not the tails. Each time you say “flip” there will be a different outcome and a different map. All magic coins are equal so heads are equally likely in all places.

When working with a point pattern we often assume that our data is like a single flipping of the magic coins. That is our data is a single realization from some spatially extensive probability surface. On that surface a positive outcome, or an “event,” is equally likely at all locations. If heads are equally likely in all places than the number of heads we'd expect in any given region is proportion to the area of the region. Two randomly selected regions of the same size should have roughly the same number of heads.

We can ruin the fun of being in this magical land by stating these rules formally. The number of points falling in any region \( B \) is a Poisson random variable. For each realization of a Poisson random variable we get a discrete (integer) value. A Poisson distribution is controlled by a single parameter \( \lambda \), which controls both the mean and variance. The fact that the mean and the variance of a Poisson variable are equal is important to remember, we'll exploit this property of Poisson variables in statistical tests.

## Simulate 10 realizations of a Poisson random variable
for (i in 1:10) {
    print(rpois(n = 1, lambda = 5))
}

## [1] 3
## [1] 7
## [1] 3
## [1] 2
## [1] 5
## [1] 4
## [1] 9
## [1] 5
## [1] 1
## [1] 6

Assuming \( \lambda \) does not vary spatially the expected number of points in a region \( B \) is simply \( \lambda*area(B) \). In addition, as long as regions don't overlap the number of points in \( B_1 \) is completely independent from the number of points in \( B_2 \). Finally, the location of points within each region is completely random, the location of the first point has no effect of the location of the second point. If all of these rules hold, and they do in the magical example, a point pattern is said to be a uniform Poisson Point Process or sometimes CSR (complete spatial randomness).

In point pattern analysis \( \lambda \) is called intensity and it is typically estimated from data. In the magical world a 10 by 10 meter area would contain 100 coins and roughly 50 heads. The simplest estimate of \( \lambda \) would be the number of heads divided by the area \( \lambda = 50/100 \). This estimate assumes that lambda is homogenous, that is, it remains constant within a study area.

If a magical Unicorn (or a Leprechaun) blessed some of the coins so that they were more likely to yield heads our process would no longer be CSR. Places with blessed coins would have more dots on the map. Alternatively, if some of the coins were removed we might expect more heads in areas where coins remained. These two perturbations of our coins might be difficult to distinguish using only a map of heads that result from a single flip.

Simple hypothesis tests with point patterns compare an observed distribution of points (i.e. your data) to simulated distributions of points based on an assumption of CSR. One of the nice things about CSR is that its fairly easy so simulate. With one line of R code we can simulate the land of magical coins (minus the Unicorn and the Leprechaun):

library(spatstat)
magicLand <- rpoispp(lambda = 0.5, win = owin(c(0, 10), c(0, 10)))
plot(magicLand)

plot of chunk unnamed-chunk-2

# Note: One departure from the scenario is that with rpoispp() possible
# locations are not fixed.

The corny scenario has served its purpose and we can leave it behind, if you run the line below multiple times you should get a different map each time. Each of the maps will be a single realization of complete spatial randomness (CSR).

plot(rpoispp(lambda = 0.5, win = owin(c(0, 10), c(0, 10))))

One last thing, note that the intensity does not exactly equal .5:

# window area and number of points
summary(magicLand)

## Planar point pattern: 38 points
## Average intensity 0.38 points per square unit  
## 
## Coordinates are given to 7 decimal places
## 
## Window: rectangle = [0, 10] x [0, 10] units 
## Window area =  100 square units


# your number of points might be slightly different.
magicLand$n/100  #this is the intensity of the simulated map

## [1] 0.38

Conceptually, simple hypothesis testing for point patterns involves comparisons of observed patterns (data) to many simulated realizations of CSR. One of the cool things about this approach is that its possible to think of many different point generating processes, we can use the same simulation framework to compare our data to any number of hypothetical point generating processes. THe methods used to test CSR are generalizable to non-uniform maps of \( \lambda \).

Hypothesis Tests of CSR

The most basic hypothesis test for point patterns is called the Quadrat Test. The study region is divided into \( n \times n \) grid. For each of the \( m \) cells in the \( n \times n \) grid the number of observed points is counted and the mean of each quadrat is calculated. For each of the \( m \) quadrats the variance is calculated \( s^2=\sum (x_i-\bar{x})^2 \) where \( x_i \) is the sum of the points in quadrat \( i \). If the data are generated by a Poisson distribution the variance to mean ratio (VMR), \( \frac{s^2}{\bar{x}} \), should equal 1 (because the mean and the variance should both be \( \lambda \)). The VMR follows a \( \chi^2 \) distribution with \( m-1 \) degrees of freedom. This VMR approach works when \( m > 6 \) and \( \bar(x) > 1 \). Spatstst uses a slightly different procedure for the quadrat test. Then a \( \chi ^2 \) test is preformed which is simply:

\[ \chi ^2 = \sum _1 ^m \frac{(observed - expected)^2}{expected} \]

If there are \( m \) quadrats the results can be evaluated using a \( \chi ^2 \) distribution with \( m-1 \) degrees of freedom. The function quadrat.tests does most of the work for us:

# create a random map in a 10 by 10 window
ranMap <- rpoispp(lambda = 10, win = owin(c(0, 10), c(0, 10)))
# divide the window into six quadrants calculate chi-square
qTest <- quadrat.test(ranMap, nx = 3, ny = 3)

# visualize the result which shows observed count, expected count and SD.
plot(qTest)

plot of chunk unnamed-chunk-5

Because the map was randomly generated using CSR the quadrat test should return a high p-value, indicating that the data is not significantly different from CSR:

qTest

## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  ranMap 
## X-squared = 6.518, df = 8, p-value = 0.8212
## alternative hypothesis: two.sided 
## 
## Quadrats: 3 by 3 grid of tiles

We could repeat this test with real data. The data set bei includes data on several thousand Beilschmiedia tree locations in a tropical study site. In this data the location of points is clearly non-random. Thus we hope a quadrat test on bei would reject the null hypothesis of CSR.

data(bei)  #see ?bei for details
plot.ppp(bei)

plot of chunk unnamed-chunk-7

qTest <- quadrat.test(bei, nx = 10, ny = 4)
qTest

## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  bei 
## X-squared = 2388, df = 39, p-value < 2.2e-16
## alternative hypothesis: two.sided 
## 
## Quadrats: 10 by 4 grid of tiles

It works! The small p-value suggests the this data set was not generated under CSR.

The bei data set also includes data for elevation and slope. Visually, its hard to tell if either of these variables explain the distribution of trees:

# extract raster images
bei.elev <- bei.extra$elev
bei.slope <- bei.extra$grad

# plot raster images
plot(bei.slope)
plot(bei, add = T, pch = 16, cex = 0.2)

plot of chunk unnamed-chunk-10

Distance Tests

Complete spatial randomness suggests that the number of points is proportional to the area of a sub region; this property of CSR explicitly tested by the quadrant test. However, CSR also suggests that the locations of points are independent from each other. Two important violations of independence are:

Clustering: Points are clumped. Points are closer together than expected under CSR. In the generating process points “attract” each other. For example, if the trees in bei have seeds that are not very mobile there is reason to suspect that trees would cluster.
Dispersion: Points are farther apart than expected. This can happen when, in the generating process, points repel each other.

There are a variety of statistical tests which examine inter-point distances. Two commonly used tests are the nearest neighbor distance test (sometimes called the G-function) and the reduced second moment measure (usually called the K-function).

The G-function is the cumulative distribution of points as a function of the distance between each point and its nearest neighbor. For example, a G-function would tell us that 100% of points have a neighbor within 100m, 80% have a neighbor within 90m, 50% have a neighbor within 30m and so on. If we plotted the G-function the left would start at zero and the right would end at 1. The G-function for bei is:

plot(Gest(bei))

plot of chunk unnamed-chunk-11

##      lty col  key           label                           meaning
## km     1   1   km   hat(G)[km](r)     Kaplan-Meier estimate of G(r)
## rs     2   2   rs hat(G)[bord](r) border corrected estimate of G(r)
## han    3   3  han  hat(G)[han](r)          Hanisch estimate of G(r)
## theo   4   4 theo      G[pois](r)          theoretical Poisson G(r)

In the above figure the blue line is what one would expect from CSR and the other line(s) is what we observe in the bei data. This plot indicates clustering, a greater proportion of trees have nearest neighbors at each distance than expected under CSR.

The weakness of the G-function is that it only looks at nearest neighbors. The K-function is a “second order” test in that it looks at the spatial dependence among points. The phrase “second order properties” is synonymous with “spatial dependence.”

If we have two locations \( s_i \) and \( s_j \) separated by a distance \( h \) we can denotes the strength of their spatial dependence, or second order intensity as \( \gamma (s_i, s_j) \). A point process is said to be stationary if \( \gamma (s_i, s_j) \) depends only upon the distance \( h \) between the two points. In a stationary process the actual locations of \( s_i \) and \( s_j \) are irrelevant, their spatial interaction is purely a function of the separation distance. Complete spatial randomness implies a constant intensity and hence stationarity.

The K-function calculates the expected number of events within \( h \) units of a randomly selected point, \( \lambda K(h)= E(number\ of\ events\ within\ a\ distance\ h\ of\ an\ arbitrary\ event) \).

The is K-function evaluated using simulation. A function called envelope() will do most of the heavy lifting, including:

Calculating the K-function for our data.
Simulating CSR and calculating the k-function for each simulated pattern.

bei.k <- envelope(bei, Kest, nsim = 99, correction = "border")

## Generating 99 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
## 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
## 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
## 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
## 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
## 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
## 91, 92, 93, 94, 95, 96, 97, 98,  99.
## 
## Done.

plot(bei.k)

plot of chunk unnamed-chunk-12

##      lty col  key      label
## obs    1   1  obs  K[obs](r)
## theo   2   2 theo K[theo](r)
## hi     1   8   hi   K[hi](r)
## lo     1   8   lo   K[lo](r)
##                                                meaning
## obs            observed value of K(r) for data pattern
## theo                 theoretical value of K(r) for CSR
## hi   upper pointwise envelope of K(r) from simulations
## lo   lower pointwise envelope of K(r) from simulations

This plot shows the range of values obtained via simulation as a grey envelope. The observed k-function is much higher than the values obtained via simulation of CSR. This implies clustering, at all distances there are more points near each point than expected under CSR.

A black line inside of the grey envelope implies no significant difference from CSR. One cool thing about the K-function is that results can be scale specific, as scale varies a point pattern can contain no pattern, clustering, or dispersion. A map can be clustered at some scales but dispersed at others.

Another important thing to remember is that in the above simulation the envelope represents a homogeneous Poisson process. However, most interesting questions involve point pattern where CSR is not realistic, we need some new methods…

Inhomogeneous Tests

If events are not equally likely at all points within the window the process under investigation is said to be Inhomogeneous. In a homogeneous process \( \lambda \) is constant at all location within the window, in an inhomogenous process \( \lambda \) varies spatially.

Spatstat allows the use of images to estimate lambda. In the Beilschmiedia data perhaps the intensity of trees at a given location is shaped by slope and elevation.

It is possible to use the raster images of slope and elevation to create a point process model (ppm) which allows the simulation of points based on a raster images (or collections of raster images). The ppm() function uses the input images to estimate a spatially varying \( \lambda \).

ppm.bei <- ppm(bei, ~sl + el, covariates = list(sl = bei.slope, el = bei.elev))

We can use the point process model to generate points:

point.sim <- simulate(ppm.bei, 1)
plot(point.sim)

plot of chunk unnamed-chunk-14

We can view the predicted intensity at each location in the window:

plot(predict.ppm(ppm.bei, type = "lambda"), main = "Predicted Intensity (lambda)")  #predicted intensity

plot of chunk unnamed-chunk-15

Finally, we can use envelope to see if our observed data is a possible a realization from the model of lambda displayed above.

bei.k <- envelope(ppm.bei, Kest, nsim = 99, correction = "border", verbose = F)
plot(bei.k)

plot of chunk unnamed-chunk-16

##       lty col   key     label
## obs     1   1   obs K[obs](r)
## mmean   2   2 mmean bar(K)(r)
## hi      1   8    hi  K[hi](r)
## lo      1   8    lo  K[lo](r)
##                                                 meaning
## obs             observed value of K(r) for data pattern
## mmean              sample mean of K(r) from simulations
## hi    upper pointwise envelope of K(r) from simulations
## lo    lower pointwise envelope of K(r) from simulations

The results show significant clustering up to about 110m. Then the observed k-function enters the grey envelope. This means that at some scales the observed data is a plausible realization of the inhomogeneous intensity surface we created but at scales less than 100m it is too clustered. This simulation can be viewed as a (failed) attempt to explain the observed distribution of trees in the bei data.

Multitype Point Patterns

Another interesting type of test can be done with marked points, where each mark represents a different type of event/object. For example, the Wiki Leaks data contained different types of events that resulted in fatalities. The lansing data set describes the location of six different species of trees. We might want to examine the distribution of species of type \( a \) relative to the location of species type \( b \).

data(lansing)
plot(split(lansing))

plot of chunk unnamed-chunk-17

The “cross-K” function returns the number of points of type \( b \) within \( h \) units of a type \( a \) point. For example, we might want to know if maple trees are clustered (or dispersed) around black oak trees. We can do all of the pairwise comparisons using the alltypes function.

at <- alltypes(lansing, "Kcross", envelope = T)  #this takes awhile
plot(at)

plot of chunk unnamed-chunk-18

These results are interesting. Look at the first column, black oak trees. Reading down the column we see:

Black oak trees are clustered in the study area (upper left plot)
Hickory trees are clustered near black oak trees
Maple trees are dispersed (repulsed) from black oak trees
Miscellaneous trees are also dispersed from black oaks
In general, red oaks are neither clustered or dispersed from black oak trees. At large scales red oaks seem to be dispersed (the black line leaves the grey envelope).
White oak trees are neither clustered or dispersed relative to black oaks. The black line stays entirely within the grey envelope.

We might also want to look at the difference between two K-functions. If red oak and white oak had similar spatial distributions the difference between a K-function for red oak and a K-function for white oak should be zero at all scales. Departures from zero would indicate that one is more or less clustered than the other at that scale. I am not aware of an easy way to assess significance, other than writing your own simulation (which might be fun side project for the lab?).

Kred <- Kest(lansing[lansing$marks == "redoak", ], correction = "best")
Kwhite <- Kest(lansing[lansing$marks == "whiteoak", ], correction = "best")
plot(Kred$iso - Kwhite$iso, type = "o", ylim = c(-0.01, 0.02))

plot of chunk unnamed-chunk-19

A final test is the Kdot() test, which examines one species relative to all other species. It can be run for a multi-type point pattern like lansing in the same manner as the K-cross function.

at <- alltypes(lansing, "Kdot", envelope = T)
plot(at)

plot of chunk unnamed-chunk-20

These results are not interesting, it seems than none of the species has any significant pattern when compared to all other species.

Local K-Functions

The K-function can also be computed for each point in the window (its a computationally intensive process). That is, for each point how many other points are around it as a function of distance. This technique was developed by Getis and Franklin (1987) as an exploratory method to show how spatial structure changes as a function of scale.

lK <- localK(bei, verbose = F)  #takes a long time!!!!
plot(lK, main = "local K functions for bei", legend = FALSE)  #scale seems odd

These local K functions are hard to interpret when viewed as a group. We can plot the k-function for a single tree:

plot(lK, iso1606 ~ r)

We could also map out the values of the local k-Function for all trees for a particular distance:

lK.60 <- localK(bei, rvalue = 60, verbose = F)  #takes a long time!!!! Set verbose=T to track progress
bei.lk.60 <- unmark(bei) %mark% lK.60
lk.bei.smooth <- smooth.ppp(bei.lk.60, sigma = 50)
plot(lk.bei.smooth, col = topo.colors(128), main = "smoothed neighbourhood density")
contour(lk.bei.smooth, add = TRUE)

This map shows the variation in the local K-function when distance (\( h \)) is 60 meters. Not sure how useful this is, the pattern is clear but the units are hard to interpret. The resulting plot is not very different from plot(density(bei, sigma=50)). Getis and franklin argue that maps such as this show the scale specific second order intensity.