After some trial and error, the intensity is estimated with quartic kernel and the bandwidth is chosen to be 10. When \(b=10\), the intensity surface would not be too smooth or too rough. The process seems to be clustering, it’s shown clearly both on the map and on the intensity estimate. Given that the data is for the locations of homes of juvenile offenders living in a housing area, it seems that the environment of neighbourhood may have influence on crimes.
To simulate a Poisson cluster process similar to the cardiff data, the pcp() function in R is used to estimate the parameters, the estimates are \(\hat\rho = 0.01173741\) and \(s^2 = 6.64577990\). Based on these two parameters, we further adjust \(m\) so that the generated point process would have about 168 points on average. Finally, \(m = 2.5\) is chosen. The generated PCP process have 177 points.
As in the last lab, the envelop is simulated using 101 Monte Carlo samples from a CSR process. The \(.025\) and \(.975\) quantiles are plotted as envelop.
From the L-function, we can see that the poisson cluster process is a suitable model for the cardiff data. The L-function have the same trend.
The kppm() function is used to fit a model with intercept, X, Y and interaction. The model is \(\lambda(x, y) = 1 + x + y + x*y\).
library(spatstat)
##
## spatstat 1.41-1 (nickname: 'Ides of March')
## For an introduction to spatstat, type 'beginner'
ugdat <- read.table("ugdat.dat")
ugpoly <- read.table("ugpoly.dat")
poly <- as.points(ugpoly)
## Warning in as.points(ugpoly): List/data.frame components should be named x
## and y
names(ugpoly) <- c("x", "y")
pp <- ppp(ugdat[, 1], ugdat[, 2],
poly = list(x = ugpoly[, 1], y = ugpoly[, 2]))
fit <- kppm(pp ~ x*y)
Parameter estimates is obtained as:
## (Intercept) x y x:y
## -9.692845e+00 -4.085755e-04 -1.046967e-04 1.018167e-07
We can further simulate point pattern based on this model and plot the point map.
pp.sim <- simulate.kppm(fit)
df <- data.frame(x = pp.sim[[1]]$x, y = pp.sim[[1]]$y)
p <- ggplot(data = ugpoly, aes(x, y)) + geom_path() + coord_equal()
p + geom_point(aes(pp.x, pp.y), data = data.frame(pp$x, pp$y))
p + geom_point(aes(x, y), data = df, shape = 3)
The estimated intensity is obtained as in problem 1. Both using quartic kernel and bandwidth \(b = 200\).
lamest <- kernel2d(as.matrix(ugdat), poly, 200, 100, 100)
## Xrange is 264 2813
## Yrange is 752 4248
## Doing quartic kernel
df2 <- expand.grid(x = lamest$x, y = lamest$y)
df2 <- data.frame(df2, z = as.vector(lamest$z))
p <- ggplot(data = df2, aes(x, y, z = z)) +
stat_contour(aes(colour = ..level..), size = 4) +
scale_colour_gradient(low = "white", high = "black")
p + geom_path(aes(x, y, z = 0), data = ugpoly)
lamest <- kernel2d(as.matrix(df), poly, 200, 100, 100)
## Xrange is 264 2813
## Yrange is 752 4248
## Doing quartic kernel
df2 <- expand.grid(x = lamest$x, y = lamest$y)
df2 <- data.frame(df2, z = as.vector(lamest$z))
p <- ggplot(data = df2, aes(x, y, z = z)) +
stat_contour(aes(colour = ..level..), size = 4) +
scale_colour_gradient(low = "white", high = "black")
p + geom_path(aes(x, y, z = 0), data = ugpoly)
It doesn’t fit the intensity well. I think that’s mainly because of the covarites isn’t significance enough.