Initial Findings

Getting the information we need

First, read in the data…

oldwd <- getwd()
setwd('MB_2016_QLD')
mesh <- readOGR('.','MB_2016_QLD')
mesh <- st_as_sf(mesh)
setwd(oldwd)

setwd('2016_crimes_ALL_GROUPS_Commerial_final')
crimes <- readOGR('.','2016_crimes_ALL_GROUPS_Commerial_final')
crimes %>% st_as_sf %>% as_tibble -> crimes
setwd(oldwd)

crimes %>% filter(Crime_grp=='Drug') %>% 
  mutate(StartDate=ymd_hms(sub('T',' ',StartDate),tz='Australia/Brisbane'),
         hour=hour(StartDate), 
         MB_CODE16=as.character(MB_CODE16)) %>% 
  group_by(hour,MB_CODE16) %>% 
  summarise(count=n()) %>% 
  arrange(MB_CODE16,hour) -> propdam

table(propdam$MB_CODE16,propdam$hour) -> occ_mat
colnames(occ_mat) <- paste0("T",0:23,"00")

Cluster Analysis

Then we ran a PAM cluster analysis on the data:

cl <- pam(occ_mat,4,metric='manhattan')

Note

We use a Manhattan distance metric
We used 4 clusters after some exploration

Results

Next we link the clusters back to the mesh blocks and compute the probability of events occuring at each hour for each of the groups of clusters.

occ_mat %>% as_tibble %>% transmute(mb=Var1,time=Var2,n=n) %>% arrange(mb) -> occ_tab
tibble(mb=row.names(occ_mat),cl=cl$clustering) -> clusters
occ_tab %>% left_join(clusters) %>% group_by(time,cl) %>% summarise(prob=mean(n),N=n()) %>% ungroup %>%
  mutate(cl=paste0('Clust',cl),time=as.numeric(sub('T','',time)))-> profiles

finally we visualise the results as ‘radar’ plots

ggplot(profiles,aes(x=time,y=prob,fill=N)) + geom_line(stat='identity') + facet_wrap(~cl) + coord_polar()

Initial Findings

Chris Brunsdon and Jon Corcoran

4/10/2017

Getting the information we need

Cluster Analysis

Results