Getting the data in the right format for statistical analysis
* Create a table with one row for each centre (including location, average price, and number of bids)
* Tell R which columns are spatial coordinates
* Convert dataset to a format that spatstat can understand (‘ppp’)
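centres.sp <- tenders.sp %>%
  filter(lat > 0) %>%                                 # keep only rows with a positive latitude
  group_by(centre, lon, lat) %>%                      # one group per centre and location
  summarise(price = mean(priceM2, na.rm = TRUE)) %>%  # average priceM2 per centre
  replace_na(list(price = 0))                         # centres with no price data get 0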
coordinates(centres.sp) <- c('lon','lat')
centres.ppp <- unmark(as.ppp(centres.sp))
plot(centres.ppp)
Next, we read in a shapefile that contains the outline of the country and define it as the ‘window’ for our spatial point pattern.
sg <- readOGR(".", "sg-all")
## OGR data source with driver: ESRI Shapefile
## Source: "D:\School Stuff\02 .221 - Making Maps I\Lab\Lab 9", layer: "sg-all"
## with 1 features
## It has 13 fields
sg.window <- as.owin(sg)
centres.ppp <- centres.ppp[sg.window]
plot(centres.ppp)
Now that we have a proper window defined, let’s look at the clustering of hawker centres.
plot(Kest(centres.ppp))
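To help read the K-function plot, here is a minimal base-R sketch of what an estimate of Ripley's K measures. It is not how `Kest` works internally (spatstat applies edge corrections that this sketch ignores), and the helper `naive_k` and the simulated points are hypothetical, for illustration only. Under complete spatial randomness K(r) is approximately πr², so estimates well above that curve suggest clustering.

```r
# Naive Ripley's K without edge correction (illustrative only; Kest does more).
naive_k <- function(x, y, r, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))  # pairwise distances between points
  diag(d) <- Inf                     # exclude each point's distance to itself
  # For each radius, count ordered pairs within distance r and rescale by area
  sapply(r, function(ri) area * sum(d <= ri) / (n * (n - 1)))
}

set.seed(42)
# Hypothetical data: 200 uniformly random points in the unit square
x <- runif(200)
y <- runif(200)
r <- seq(0.01, 0.1, by = 0.01)

k_hat <- naive_k(x, y, r, area = 1)
k_csr <- pi * r^2  # theoretical K under complete spatial randomness

# Compare estimate and benchmark; values well above k_csr indicate clustering
print(round(cbind(r, k_hat, k_csr), 4))
```

Because this version ignores edge effects, it tends to underestimate K at larger radii, which is one reason to rely on `Kest` for the actual analysis.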
We can also create density plots and contour maps:
plot(density(centres.ppp, sigma = 0.02))
contour(density(centres.ppp, sigma = 0.02))
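The value 0.02 passed to `density` is the kernel bandwidth (`sigma`), expressed in the units of the coordinates. Assuming these are degrees, as the `lon`/`lat` column names suggest, 0.02 corresponds to roughly 2 km. The base-R sketch below shows how a Gaussian kernel estimate at a single location responds to the bandwidth. The helper `kernel_intensity` and the coordinates are hypothetical, and no edge correction is applied, so this is an illustration rather than a reproduction of spatstat's output.

```r
# Gaussian kernel intensity estimate at one location, with no edge correction.
# kernel_intensity is a hypothetical helper, not a spatstat function.
kernel_intensity <- function(px, py, x, y, sigma) {
  d2 <- (x - px)^2 + (y - py)^2  # squared distance to every point
  sum(exp(-d2 / (2 * sigma^2))) / (2 * pi * sigma^2)
}

# Hypothetical coordinates: three centres close together, one far away
x <- c(103.80, 103.81, 103.82, 103.95)
y <- c(1.30, 1.31, 1.30, 1.40)

# A small bandwidth gives a sharper peak near the cluster than a large one
near_small <- kernel_intensity(103.81, 1.30, x, y, sigma = 0.02)
near_large <- kernel_intensity(103.81, 1.30, x, y, sigma = 0.10)
far_small  <- kernel_intensity(103.95, 1.40, x, y, sigma = 0.02)
print(c(near_small = near_small, near_large = near_large, far_small = far_small))
```

A larger `sigma` smooths the surface and spreads each centre's contribution over a wider area; a smaller one highlights local clusters but can look noisy.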
Let’s test the assumption that the clustering of hawker centres is just a function of the underlying population in the area.
* First, we load a raster file with the population of Singapore
* Then we estimate the intensity of the hawker centre point pattern as a function of the population (covariate)
pop <- as.im(readGDAL("sg-pop.tif"))
## sg-pop.tif has GDAL driver GTiff
## and has 37 rows and 58 columns
plot(rhohat(centres.ppp,pop))
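To make the idea behind this plot concrete, the following base-R sketch computes a crude binned analogue: for each population class, the number of points falling in that class divided by the total area of pixels in that class. `rhohat` produces a smoothed curve rather than bins, so this is only an intuition aid. The grid values, point locations, pixel area and class breaks are all hypothetical; only the grid dimensions mirror the raster loaded above.

```r
# Crude binned analogue of an intensity-vs-covariate estimate (illustrative only).
set.seed(1)
# Hypothetical covariate raster: a 37 x 58 grid of population values
pop_grid <- matrix(runif(37 * 58, min = 0, max = 1000), nrow = 37, ncol = 58)
pixel_area <- 1  # assumed area of one pixel, in arbitrary units

# Hypothetical point pattern: the pixel (row, column) each centre falls in
n_pts <- 150
pt_row <- sample(nrow(pop_grid), n_pts, replace = TRUE)
pt_col <- sample(ncol(pop_grid), n_pts, replace = TRUE)
pt_pop <- pop_grid[cbind(pt_row, pt_col)]  # covariate value at each point

# Split the covariate range into classes
breaks <- seq(0, 1000, by = 250)
pix_class <- cut(as.vector(pop_grid), breaks, include.lowest = TRUE)
pt_class  <- cut(pt_pop, breaks, include.lowest = TRUE)

# Intensity per class = points in class / total area of pixels in class
area_by_class  <- as.vector(table(pix_class)) * pixel_area
count_by_class <- as.vector(table(pt_class))
rho_binned <- count_by_class / area_by_class
names(rho_binned) <- levels(pix_class)
print(round(rho_binned, 4))
```

If the intensity of centres depended only on population, the real curve would rise steadily with the covariate; departures from that shape are what the remainder of this section examines.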
We can also see the effect of weighting each centre by its average tender price:
plot(rhohat(centres.ppp, pop, weights = centres.sp$price))
Notice that there are outliers in areas with a high population.
We would have expected a much higher intensity of hawker centres there based on the rest of the pattern.
* Find out where these outliers are located
* Reflect on what makes these centres different
* Why can we expect a ‘dip’ towards the upper end of the population density?
plot(pop)
plot(centres.ppp, add = TRUE)
The outliers are located in the Jurong East, Choa Chu Kang, Woodlands and Sengkang areas.
These centres are different because these areas are residential towns that were developed fairly recently. They are among the ten most populous regions in Singapore (source) and have been the focus of new developments. In fact, Woodlands and Jurong East have been transformed into Regional Centres (commercial districts second to the central business district). These areas were planned with large-scale shopping centres in mind and house some of Singapore’s largest shopping malls.
After independence, when Singapore started developing its urban estates, hawker centres were built as a solution to unlicensed street hawkers (hence the name). Thus the more mature residential estates have many more hawker centres, and the larger the estate, the more hawker centres it has.
However, the most populous residential areas today are those that were developed more recently. These newer residential areas were planned to accommodate the growing population, and shopping malls with their indoor (and air-conditioned) food centres are the ‘modern’ counterparts to hawker centres. Thus, we can expect a ‘dip’ in hawker centre numbers as we reach the population levels of these estates.
Older estates like Toa Payoh or Ang Mo Kio do have huge malls, but those were built after the residential infrastructure had been in place for many decades, rather than together with the core residential areas.