Review

A point is a zero-dimensional object, and a point pattern describes how points occupy space. The overall pattern of the points reflects their collective behaviour. The discussion therefore concerns the whole set of points in the space, not any single point.

There are three point patterns:

  • Random: every point has the same probability of occupying any location and is not influenced by the other points.
  • Uniform: every point is as far as possible from its neighbouring points.
  • Clustered: many points are concentrated in the same part of the space, while the rest of the space holds very few points.

First Approach: Quadrat Method

Steps of the Quadrat Method

  • Divide the study area into cells of equal size. The cell size is determined by the scale of interest.
  • Compute the mean number of points per cell.
  • Compute the variance of the number of points per cell.
  • Compute the variance-to-mean ratio (VMR).

\[ \overline{x} = \frac{N}{m}, \qquad S^2 = \frac{\sum_{i = 1}^{m}(x_i - \overline{x})^2}{m - 1}, \qquad VMR = \frac{S^2}{\overline{x}} \]

\[ N = \text{number of points}\\ m = \text{number of cells}\\ x_i = \text{number of points in cell } i\\ \overline{x} = \text{mean number of points per cell}\\ S^2 = \text{variance of the number of points per cell}\\ VMR = \text{variance-to-mean ratio} \]

VMR = 0: points are spread uniformly (systematic); values between 0 and 1 lean toward uniformity
VMR = 1: points are spread randomly
VMR > 1: points are more clustered
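
The quadrat steps and the VMR interpretation above can be sketched in base R, without spatstat. The helper names quadrat_counts and vmr, the unit-square window, and the two synthetic patterns are assumptions made for illustration only.

```r
# Count points per cell on an nx-by-ny grid over the unit square (assumed window).
quadrat_counts <- function(x, y, nx, ny) {
  cx <- cut(x, breaks = seq(0, 1, length.out = nx + 1), include.lowest = TRUE)
  cy <- cut(y, breaks = seq(0, 1, length.out = ny + 1), include.lowest = TRUE)
  as.vector(table(cx, cy))   # empty cells are kept as zero counts
}

# Variance-to-mean ratio of the cell counts.
vmr <- function(counts) var(counts) / mean(counts)

# Pattern 1: 100 points on a regular 10 x 10 lattice (uniform).
g <- (seq_len(10) - 0.5) / 10
lattice <- expand.grid(x = g, y = g)
vmr_uniform <- vmr(quadrat_counts(lattice$x, lattice$y, nx = 5, ny = 5))

# Pattern 2: 100 points packed into one corner cell (clustered).
set.seed(1)
px <- runif(100, 0, 0.2)
py <- runif(100, 0, 0.2)
vmr_cluster <- vmr(quadrat_counts(px, py, nx = 5, ny = 5))

print(c(uniform = vmr_uniform, clustered = vmr_cluster))
```

With a 5 x 5 grid every lattice cell holds exactly four points, so the VMR is 0. Packing all 100 points into one cell gives a mean of 4 and a variance of 400, so the VMR is 100.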

Hypotheses

H0: The points are randomly distributed
H1: The points are not randomly distributed

If m < 30, the test statistic is approximated by the chi-square distribution with m - 1 degrees of freedom

\[ \chi^2 = (m - 1)VMR = \frac{(m - 1)S^2}{\overline{x}} \]

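If m >= 30, the test statistic is approximated by the Z (standard normal) distribution. A chi-square variable with \(m - 1\) degrees of freedom has mean \(m - 1\) and variance \(2(m - 1)\), so standardising \((m - 1)VMR\) gives

\[ Z = \frac{(m - 1)VMR - (m - 1)}{\sqrt{2(m-1)}} = \sqrt{\frac{m-1}{2}}(VMR - 1) \]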

At alpha = 5%, Z > 1.96 rejects H0 in favour of H1, with the conclusion that the points are clustered
At alpha = 5%, Z < -1.96 rejects H0 in favour of H1, with the conclusion that the points are uniform
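
The decision rule above can be sketched as a small base R function that picks the chi-square or the Z approximation from the number of cells. The function name vmr_test and its return fields are hypothetical. The counts are the twelve quadrat counts from the quadratcount output for swedishpines shown later in this document.

```r
# Index-of-dispersion test of H0 (random pattern) from a vector of cell counts.
vmr_test <- function(counts) {
  m <- length(counts)
  vmr <- var(counts) / mean(counts)
  if (m < 30) {
    stat <- (m - 1) * vmr                      # chi-square with m - 1 df under H0
    p_uniform <- pchisq(stat, df = m - 1)      # small statistic points to uniform
    p_cluster <- pchisq(stat, df = m - 1, lower.tail = FALSE)
    dist_name <- "chi-square"
  } else {
    stat <- sqrt((m - 1) / 2) * (vmr - 1)      # approximately N(0, 1) under H0
    p_uniform <- pnorm(stat)
    p_cluster <- pnorm(stat, lower.tail = FALSE)
    dist_name <- "Z"
  }
  list(m = m, VMR = vmr, distribution = dist_name, statistic = stat,
       p_uniform = p_uniform, p_cluster = p_cluster,
       p_two_sided = min(1, 2 * min(p_uniform, p_cluster)))
}

# Quadrat counts for a 4 x 3 grid (m = 12), so the chi-square branch is used.
counts <- c(7, 3, 6, 5, 5, 9, 7, 7, 4, 3, 6, 9)
res <- vmr_test(counts)
print(round(c(VMR = res$VMR, statistic = res$statistic), 4))
```

For these counts the statistic is (m - 1) x VMR = 11 x 0.6901 = 7.5915 on 11 degrees of freedom, the same X2 value that quadrat.test reports for this grid.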

Application in R

library(spatstat)
data(swedishpines)

help(swedishpines)

Swedish Pines Point Pattern

Description

The data give the locations of pine saplings in a Swedish forest.

Usage

data(swedishpines)

Format

An object of class “ppp” representing the point pattern of tree locations in a rectangular plot 9.6 by 10 metres.

Cartesian coordinates are given in decimetres (multiples of 0.1 metre) rounded to the nearest decimetre. Type rescale(swedishpines) to get an equivalent dataset where the coordinates are expressed in metres.

See ppp.object for details of the format of a point pattern object.


X <- swedishpines
plot(X)

summary(X)
## Planar point pattern:  71 points
## Average intensity 0.007395833 points per square unit (one unit = 0.1 metres)
## 
## Coordinates are integers
## i.e. rounded to the nearest unit (one unit = 0.1 metres)
## 
## Window: rectangle = [0, 96] x [0, 100] units
## Window area = 9600 square units
## Unit of length: 0.1 metres
contour(density(X, 10), axes = FALSE)

q <- quadratcount(X, nx = 4, ny = 3)
q
##              x
## y             [0,24) [24,48) [48,72) [72,96]
##   [66.7,100]       7       3       6       5
##   [33.3,66.7)      5       9       7       7
##   [0,33.3)         4       3       6       9
plot(q)

mu <- mean(q)       # mean number of points per quadrat
s2 <- sd(q)^2       # sample variance of the quadrat counts
VMR <- s2 / mu      # variance-to-mean ratio
VMR
## [1] 0.6901408
quadrat.test(q)
## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  
## X2 = 7.5915, df = 11, p-value = 0.5013
## alternative hypothesis: two.sided
## 
## Quadrats: 4 by 3 grid of tiles
quadrat.test(q, alt = "regular")
## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  
## X2 = 7.5915, df = 11, p-value = 0.2506
## alternative hypothesis: regular
## 
## Quadrats: 4 by 3 grid of tiles

What is your conclusion?

Second Approach: Empirical K-Function

nn <- nndist(swedishpines)
hist(nn)

The Empirical K-Function

\[ \hat{K}(r) = \frac{|W|}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1,\, j\neq i}^{n} 1\{d_{ij}\le r\}\, e_{ij}(r) \]

\[ d_{ij} = \text{observed pairwise distance}\\ r = \text{distance value } (r\ge 0)\\ e_{ij}(r) = \text{edge correction weight}\\ n = \text{number of points}\\ |W| = \text{area of the observation window} \]

In summary, the empirical K-function \(\hat{K}(r)\) is the cumulative average number of data points lying within a distance r of a typical data point, corrected for edge effects, and standardised by dividing by the intensity.
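
To make the double sum concrete, here is a minimal base R sketch of the estimator. It assumes every edge correction weight equals 1 (no edge correction), so it will not match Kest with correction="Ripley". The function name k_naive is hypothetical.

```r
# Naive empirical K-function: all edge weights e_ij(r) are taken as 1 (assumption).
k_naive <- function(x, y, r, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))   # pairwise distances d_ij
  diag(d) <- Inf                      # exclude the i == j terms
  # For each r, count ordered pairs (i, j) with d_ij <= r and rescale.
  sapply(r, function(ri) area / (n * (n - 1)) * sum(d <= ri))
}

# Toy pattern: the four corners of the unit square.
x <- c(0, 1, 0, 1)
y <- c(0, 0, 1, 1)
k_vals <- k_naive(x, y, r = c(0.5, 1, 1.5), area = 1)
print(k_vals)
```

For the four corners, no pair lies within 0.5, eight ordered pairs lie within 1, and all twelve ordered pairs lie within 1.5, so the values are 0, 2/3 and 1.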

Possible pattern for \(\hat{K}(r)\)

Theoretical K-Function

\[ K_{pois}(r) = \pi r^2 \] This is the K-function for a homogeneous Poisson process, the model of complete spatial randomness (CSR).
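
A short derivation of this value, writing \(\lambda\) for the intensity (a symbol introduced here, not used elsewhere in these notes): in a homogeneous Poisson process the points are independent, so the expected number of further points within distance r of a typical point is the intensity times the area of a disc of radius r. Dividing by the intensity, as in the definition of K, gives

\[ K_{pois}(r) = \frac{1}{\lambda}\, E[\text{number of further points within distance } r \text{ of a typical point}] = \frac{\lambda \pi r^2}{\lambda} = \pi r^2 \]

An empirical \(\hat{K}(r)\) above \(\pi r^2\) therefore suggests clustering at scale r, and a value below \(\pi r^2\) suggests regularity.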

Use of empirical K-Function

Application in R

K <- Kest(swedishpines, correction = "Ripley")
plot(K)

E <- envelope(swedishpines, Kest, nsim = 99)
## Generating 99 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,  99.
## 
## Done.
plot(E)

Non-graphical test

mad.test(swedishpines, Kest, nsim=99, alternative="two.sided")
## Generating 99 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,  99.
## 
## Done.
## 
##  Maximum absolute deviation test of CSR
##  Monte Carlo test based on 99 simulations
##  Summary function: K(r)
##  Reference function: theoretical
##  Alternative: two.sided
##  Interval of distance values: [0, 24] units (one unit = 0.1 metres)
##  Test statistic: Maximum absolute deviation
##  Deviation = observed minus theoretical
## 
## data:  swedishpines
## mad = 150.69, rank = 29, p-value = 0.29

Using External Data

Required Data Types

Import Data

The data can be downloaded from: city.rds, crime.rds

library(raster)
library(spatstat)
city <- readRDS('city.rds')
crime <- readRDS('crime.rds')

Converting the data to a point pattern

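border <- city
# Vertex coordinates of the first ring of the first polygon in city
coord.city <- city@polygons[[1]]@Polygons[[1]]@coords
# owin() expects the outer boundary in anticlockwise order, so the vertex order is reversed
window <- owin(poly = data.frame(x = rev(coord.city[, 1]),
                                 y = rev(coord.city[, 2])))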
plot(window)
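
The code above reverses the vertex order with rev() before calling owin(), because owin() expects the outer boundary of a polygon in anticlockwise order. A minimal base R sketch for checking the orientation of a ring uses the signed area from the shoelace formula. The function name signed_area is hypothetical, and the unit square is only a toy example.

```r
# Signed area by the shoelace formula: positive means anticlockwise vertex order.
signed_area <- function(x, y) {
  n <- length(x)
  j <- c(2:n, 1)                       # index of the next vertex, wrapping around
  sum(x * y[j] - x[j] * y) / 2
}

# Unit square listed anticlockwise, then the same vertices in reverse order.
sq_x <- c(0, 1, 1, 0)
sq_y <- c(0, 0, 1, 1)
area_ccw <- signed_area(sq_x, sq_y)
area_cw <- signed_area(rev(sq_x), rev(sq_y))
print(c(anticlockwise = area_ccw, clockwise = area_cw))
```

The anticlockwise square gives 1 and the reversed one gives -1. Applied to the city boundary, a negative value of signed_area(coord.city[, 1], coord.city[, 2]) would indicate clockwise storage and so explain the need for rev(). A ring that repeats its first vertex at the end gives the same result, since the extra term is zero.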

Counting points within quadrats

crime2 <- remove.duplicates(crime)    # drop duplicated locations
crime2 <- crime2[crime,]
crime2.ppp <- ppp(x = crime2@coords[, 1], y = crime2@coords[, 2],
                  window = window)
quad <- quadratcount(crime2.ppp)      # default 5 x 5 grid, clipped to the window
plot(quad, col = "red")
plot(crime2.ppp, add = TRUE, pch = 20, cex = 0.5)

quadrat.test(crime2.ppp,alt="cluster")
## 
##  Chi-squared test of CSR using quadrat counts
## 
## data:  crime2.ppp
## X2 = 480.83, df = 22, p-value < 2.2e-16
## alternative hypothesis: clustered
## 
## Quadrats: 23 tiles (irregular windows)

Using the K-function

K <- Kest(crime2.ppp, correction = "Ripley")
plot(K)

mad.test(crime2.ppp, Kest, nsim = 20, alternative = "greater")
## Generating 20 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,  20.
## 
## Done.
## 
##  Maximum signed deviation test of CSR
##  Monte Carlo test based on 20 simulations
##  Summary function: K(r)
##  Reference function: theoretical
##  Alternative: greater
##  Interval of distance values: [0, 3697.26712973043]
##  Test statistic: Maximum signed deviation
##  Deviation = observed minus theoretical
## 
## data:  crime2.ppp
## mad = 16518845, rank = 1, p-value = 0.04762
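
Both mad.test outputs in this document report a rank and a p-value. The two are linked by the usual Monte Carlo rule p = rank / (nsim + 1), where the observed statistic is ranked among the nsim simulated ones. A minimal sketch, with the hypothetical helper name mc_p_value:

```r
# Monte Carlo p-value: rank of the observed statistic among nsim + 1 values.
mc_p_value <- function(rank, nsim) rank / (nsim + 1)

p_pines <- mc_p_value(rank = 29, nsim = 99)   # swedishpines, two-sided test
p_crime <- mc_p_value(rank = 1, nsim = 20)    # crime data, alternative "greater"
print(round(c(pines = p_pines, crime = p_crime), 5))
```

This reproduces 0.29 and 0.04762. With nsim = 20 the smallest attainable p-value is 1/21, so the crime result is already at the limit of what 20 simulations can show. A larger nsim would give a finer resolution.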

Data Simulation

Poisson process

To create a Poisson process with uniform intensity 50 over the unit square [0, 1] x [0, 1]:

pp0 <- rpoispp(50)
plot(pp0)
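
What a homogeneous Poisson process means can be sketched in base R in two steps: draw the number of points from a Poisson distribution with mean equal to intensity times area, then place the points independently and uniformly. The function name sim_poisson is hypothetical, and this is an illustration of the model rather than a description of how rpoispp is implemented.

```r
# Homogeneous Poisson process on the unit square, simulated in two steps.
sim_poisson <- function(lambda) {
  n <- rpois(1, lambda * 1)                # point count ~ Poisson(lambda * area), area = 1
  data.frame(x = runif(n), y = runif(n))   # independent uniform locations
}

set.seed(42)
one_pattern <- sim_poisson(50)
counts <- replicate(1000, nrow(sim_poisson(50)))
print(c(mean = mean(counts), variance = var(counts)))
```

The number of points changes from run to run. Over many runs its mean and variance should both be close to 50, which is the VMR = 1 behaviour expected of a random pattern.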

Matern Clustering

pp3 <- rMatClust(12, 0.1, 4)
plot(pp3)
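
In the call above, 12 is the intensity of the parent points, 0.1 is the cluster radius, and 4 is the mean number of offspring per parent. The following base R sketch shows the construction of a Matern cluster pattern on the unit square under those three parameters. The function name sim_matern and the way parents near the border are handled (simulating them on a window enlarged by the radius) are assumptions of this sketch, not a description of rMatClust internals.

```r
# Matern cluster process on the unit square (illustrative sketch).
sim_matern <- function(kappa, r, mu) {
  # Parents: Poisson process on the window enlarged by r on every side,
  # so clusters centred just outside the unit square still contribute.
  side <- 1 + 2 * r
  n_par <- rpois(1, kappa * side^2)
  par_x <- runif(n_par, -r, 1 + r)
  par_y <- runif(n_par, -r, 1 + r)
  # Offspring: Poisson(mu) per parent, uniform in a disc of radius r.
  n_off <- rpois(n_par, mu)
  parent <- rep(seq_len(n_par), n_off)
  rad <- r * sqrt(runif(length(parent)))   # sqrt gives uniform density over the disc
  ang <- runif(length(parent), 0, 2 * pi)
  pts <- data.frame(x = par_x[parent] + rad * cos(ang),
                    y = par_y[parent] + rad * sin(ang),
                    par_x = par_x[parent], par_y = par_y[parent])
  # Only offspring inside the unit square are kept; parents are not part of the pattern.
  pts[pts$x >= 0 & pts$x <= 1 & pts$y >= 0 & pts$y <= 1, ]
}

set.seed(7)
sim <- sim_matern(kappa = 12, r = 0.1, mu = 4)
print(nrow(sim))
```

On average such a pattern has about kappa x mu = 48 points, but each run differs, and a single realisation can look more or less clustered than intended.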

Do the generated data always match what was intended?

Other generator functions

  • rThomas
  • rCauchy
  • rVarGamma
  • rNeymanScott
  • rGaussPoisson

Other point pattern data

  • amacrine (rabbit amacrine cells, locations and 2 types)
  • anemones (sea anemones data, locations and sizes)
  • ants (ant nests data, location and 2 types)
  • bei (tropical rainforest trees, locations)
  • betacells (cat retinal ganglia data, locations, 2 types and sizes)
  • bramblecanes (Bramble Canes data, locations and 3 types)
  • cells (biological cells data, locations)
  • chorley (cancer data, locations and 2 types)
  • finpines (Finnish Pines data, locations and 2 size measures)
  • hamster (hamster tumour data, locations and 2 types)
  • japanesepines (Japanese Pines data, locations)
  • lansing (Lansing Woods data, locations and 6 types)
  • longleaf (Longleaf Pines data, locations and sizes)
  • nztrees (trees data, locations)
  • ponderosa (ponderosa pine trees data, locations)
  • redwood (redwood saplings data, locations)
  • spruces (Spruce trees in Saxonia, locations and sizes)

References

  1. Baddeley, A., Rubak, E., and Turner, R. 2016. Spatial Point Patterns: Methodology and Applications with R. Boca Raton: CRC Press.
  2. Other relevant references.