Points Pattern Homework

What is the distribution of the data?

Looking solely at the density of the data, we see that there are far more incidences of lung cancer than of larynx cancer. This is also evident from the basic plots above.

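library(spatstat)  # the chorley data set ships with spatstat
data(chorley)
both <- split(chorley)  # one point pattern per cancer type
larynx <- both$larynx
lung <- both$lung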

hist(density(lung))

hist(density(larynx))

The perspective density plot (left) and the contour plot (right) indicate that there are two centers of high density. Without a good deal more information, we can only infer that these regions have either a greater population density or a higher exposure rate. I suspect it is the former rather than the latter.

library(spatstat)
data(chorley)
summary(chorley)

## Marked planar point pattern:  1036 points
## Average intensity 3.287268 points per square km
## 
## *Pattern contains duplicated points*
## 
## Coordinates are given to 1 decimal place
## i.e. rounded to the nearest multiple of 0.1 km
## 
## Multitype:
##        frequency proportion intensity
## larynx        58 0.05598456 0.1840363
## lung         978 0.94401540 3.1032320
## 
## Window: polygonal boundary
## single connected closed polygon with 131 vertices
## enclosing rectangle: [343.45, 366.45] x [410.41, 431.79] km
## Window area = 315.155 square km
## Unit of length: 1 km
## Fraction of frame area: 0.641

chorley.den<-density(chorley)
persp(chorley.den,theta = 30, phi = 30)

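plot(chorley.den, main='Larynx and Lung Cancer Density', legend=TRUE)
contour(chorley.den, add=TRUE)
# index pch by mark so each cancer type gets its own symbol (6 = larynx, 8 = lung)
points(chorley, pch=c(6,8)[as.integer(marks(chorley))], cex=1.3)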

You can also split the pattern up by its categorical marks... NEAT! Here I applied the split function and compared kernel bandwidths of 1 (top) and 3 (bottom). The split function is nice and simple for ppp data sets with categorical marks, whereas the subset function is nice for ppp data sets with continuous marks.

plot(density(split(chorley),sigma=1)) #Lower level of smoothing

plot(density(split(chorley),sigma=3)) #Higher level of smoothing
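To put a rough number on what the two bandwidths do, here is a minimal sketch. It assumes the same `chorley` data, and the helper name `peak_intensity` is my own invention, not part of spatstat. It compares the peak of the smoothed lung cancer surface at each `sigma` and prints the rule-of-thumb bandwidths from `bw.scott` as a reference point.

```r
library(spatstat)
data(chorley)

lung <- split(chorley)$lung

# Hypothetical helper: highest pixel value of the kernel estimate at bandwidth s
peak_intensity <- function(pattern, s) max(density(pattern, sigma = s))

peak_sigma1 <- peak_intensity(lung, 1)  # lower smoothing keeps sharp peaks
peak_sigma3 <- peak_intensity(lung, 3)  # higher smoothing flattens them

print(c(sigma1 = peak_sigma1, sigma3 = peak_sigma3))

# Rule-of-thumb bandwidths as a reference point for choosing sigma
print(bw.scott(lung))
```

A larger `sigma` spreads each case over a wider area, so the surface should look smoother and its peak lower. Which bandwidth is "right" remains a judgment call.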

n <- 100
both<-split(chorley)
larynx<-both$larynx
lung<-both$lung

larynxK<- envelope(larynx, fun = Kest, nsim = n)

## Generating 100 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
## 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76,
## 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,  100.
## 
## Done.

lungK<- envelope(lung, fun = Kest, nsim = n)

## Generating 100 simulations of CSR  ...
## 1, 2,  [etd 4:18] 3,  [etd 4:02] 4,
##  [etd 3:56] 5,  [etd 3:54] 6,  [etd 3:54] 7,  [etd 3:50] 8,
##  [etd 3:48] 9,  [etd 3:47] 10,  [etd 3:45] 11,  [etd 3:42] 12,
##  [etd 3:38] 13,  [etd 3:36] 14,  [etd 3:34] 15,  [etd 3:33] 16,
##  [etd 3:29] 17,  [etd 3:28] 18,  [etd 3:26] 19,  [etd 3:24] 20,
##  [etd 3:21] 21,  [etd 3:18] 22,  [etd 3:15] 23,  [etd 3:13] 24,
##  [etd 3:10] 25,  [etd 3:08] 26,  [etd 3:06] 27,  [etd 3:02] 28,
##  [etd 3:00] 29,  [etd 2:57] 30,  [etd 2:55] 31,  [etd 2:52] 32,
##  [etd 2:49] 33,  [etd 2:46] 34,  [etd 2:44] 35,  [etd 2:42] 36,
##  [etd 2:40] 37,  [etd 2:37] 38,  [etd 2:34] 39,  [etd 2:32] 40,
##  [etd 2:30] 41,  [etd 2:27] 42,  [etd 2:24] 43,  [etd 2:22] 44,
##  [etd 2:19] 45,  [etd 2:17] 46,  [etd 2:14] 47,  [etd 2:11] 48,
##  [etd 2:09] 49,  [etd 2:07] 50,  [etd 2:04] 51,  [etd 2:02] 52,
##  [etd 1:59] 53,  [etd 1:57] 54,  [etd 1:54] 55,  [etd 1:51] 56,
##  [etd 1:49] 57,  [etd 1:46] 58,  [etd 1:44] 59,  [etd 1:42] 60,
##  [etd 1:39] 61,  [etd 1:37] 62,  [etd 1:34] 63,  [etd 1:32] 64,
##  [etd 1:29] 65,  [etd 1:27] 66,  [etd 1:24] 67,  [etd 1:22] 68,
##  [etd 1:19] 69,  [etd 1:17] 70,  [etd 1:14] 71,  [etd 1:12] 72,
##  [etd 1:09] 73,  [etd 1:07] 74,  [etd 1:04] 75,  [etd 1:02] 76,
##  [etd 59 sec] 77,  [etd 57 sec] 78,  [etd 54 sec] 79,  [etd 52 sec] 80,
##  [etd 49 sec] 81,  [etd 47 sec] 82,  [etd 44 sec] 83,  [etd 42 sec] 84,
##  [etd 40 sec] 85,  [etd 37 sec] 86,  [etd 35 sec] 87,  [etd 32 sec] 88,
##  [etd 30 sec] 89,  [etd 27 sec] 90,  [etd 25 sec] 91,  [etd 22 sec] 92,
##  [etd 20 sec] 93,  [etd 17 sec] 94,  [etd 15 sec] 95,  [etd 12 sec] 96,
##  [etd 10 sec] 97,  [etd 7 sec] 98,  [etd 5 sec] 99,  [etd 2 sec]  100.
## 
## Done.

plot(larynxK)

plot(lungK)
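The envelope plots can also be read off numerically. The sketch below is only an illustration: it uses 39 simulations instead of 100 to keep the run short, an arbitrary seed, and variable names of my own choosing. It pulls the envelope into a data frame and lists the radii where the observed K function sits above the simulated CSR band.

```r
library(spatstat)
data(chorley)

larynx <- split(chorley)$larynx

set.seed(42)  # arbitrary seed so the simulated envelope is reproducible
# 39 simulations keeps the run short; the analysis above used 100
env <- envelope(larynx, fun = Kest, nsim = 39, verbose = FALSE)

# Columns: r (distance), obs (observed K), theo (CSR value, pi * r^2),
# lo and hi (lower and upper simulation envelopes)
env.df <- as.data.frame(env)

# Radii (km) where the observed K lies above the envelope,
# i.e. where the pattern looks more clustered than CSR
clustered.r <- env.df$r[env.df$obs > env.df$hi]

if (length(clustered.r) > 0) {
  print(range(clustered.r))
} else {
  print("observed K stays inside the envelope")
}
```

Passing `verbose = FALSE` also suppresses the long progress listing that `envelope` prints by default.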

par(mfrow=c(1,2))

plot(larynx,pch=1,cols = "green" )
plot(lung,pch=2,cex=0.4,cols= "deeppink" )

Below are four plots: incidences of larynx cancer plotted over the density of larynx cancer incidence (top left); incidences of lung cancer plotted over the density of larynx cancer incidence (top right); incidences of larynx cancer plotted over the density of lung cancer incidence (bottom left); and incidences of lung cancer plotted over the density of lung cancer incidence (bottom right). No major differences are obvious: both types seem to cluster in basically the same regions. However, the density of lung cancers is simply higher and therefore creates a slightly different point density distribution.

lung.den <- density(lung,sigma=1.7)
larynx.den <- density(larynx,sigma=1.7)

par(mfrow=c(2,2))

plot(larynx.den,main="Larynx upon Larynx")
points(larynx, pch='*',cex=1.5)

plot(larynx.den,main="Lung Upon Larynx")
points(lung, pch='.',cex=1.5)

plot(lung.den,main="Larynx Upon Lung")
points(larynx, pch="*",cex=1.5)

plot(lung.den,main="Lung Upon Lung")
points(lung, pch='.',cex=1.5)
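Since the two surfaces differ mostly in overall height, one way to compare them on equal footing is to map the share of cases that are larynx cancers at each pixel. This is a rough sketch, not part of the original analysis. The name `larynx.frac` is my own, and `positive = TRUE` is added on the assumption that tiny negative density values should be avoided before dividing.

```r
library(spatstat)
data(chorley)

both <- split(chorley)

# Same bandwidth as above; positive = TRUE forces the estimates to stay positive
larynx.den <- density(both$larynx, sigma = 1.7, positive = TRUE)
lung.den   <- density(both$lung,   sigma = 1.7, positive = TRUE)

# Pixel-wise share of all cases that are larynx cancers
larynx.frac <- eval.im(larynx.den / (larynx.den + lung.den))

plot(larynx.frac, main = "Estimated larynx share of cases")
points(both$larynx, pch = "*", cex = 1.5)

# Overall share for comparison: 58 of 1036 cases
overall <- npoints(both$larynx) / npoints(chorley)
print(overall)
```

If both cancers simply follow the population, the map should sit close to the overall share everywhere. Areas well above it would be worth a closer look.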

Below we can see that both lung and larynx cancers are strongly clustered at all distances in the plot window. It is clear that the clustering of lung cancers is much stronger at all distances. The off-diagonal panels of the array (the cross-type K functions) show that larynx and lung cancers taken together are more tightly clustered than larynx cancers alone. Further, lung cancers relative to other lung cancers are the most tightly clustered at all radii. Given more time, I would love to look at the distribution of the population in this area relative to these centers of cancer incidence, and perhaps at the prevailing weather patterns, to see what else could be driving this distribution. However, I must move along… I too digress.

both2K <- alltypes(chorley, "K", envelope = TRUE, verbose=FALSE)
plot(both2K)
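Each panel of that array is a K function for one pair of types, and a single panel can be computed directly with `Kcross`. The sketch below uses the border correction only to keep things quick, and the object names are mine. It computes one off-diagonal panel (lung cases around larynx cases) and one diagonal panel (larynx around larynx).

```r
library(spatstat)
data(chorley)

# Off-diagonal panel: lung cases counted around each larynx case
K.larynx.lung <- Kcross(chorley, i = "larynx", j = "lung",
                        correction = "border")

# Diagonal panel: larynx cases counted around other larynx cases
K.larynx.larynx <- Kcross(chorley, i = "larynx", j = "larynx",
                          correction = "border")

par(mfrow = c(1, 2))
plot(K.larynx.larynx, main = "Larynx around larynx")
plot(K.larynx.lung, main = "Lung around larynx")
```

In both plots the `theo` curve is the value expected under complete spatial randomness, so an estimate above it points toward clustering at that distance.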

Points Pattern Homework - Klinger
