Task - analyze the geographic distribution of childcare and kindergarten services in Singapore by using appropriate map analysis & spatial point patterns analysis techniques
Section A: The supply and demand of childcare and kindergarten services by planning subzone
Section B: Spatial Point Pattern Analysis
More about the datasets used:
According to Early Childhood Development Agency (ECDA), Singapore preschool landscape comprises both kindergartens and childcare centres.
Kindergartens provide pre-school developmental programme for children from about 2 years to below 7 years of age.
Childcare centres provide child care services and pre-school developmental programmes for children aged between 18 months and below 7 years old.Several centres also provide infant care programmes for infants aged between 2 and 18 months old.
Since the age range overlaps for childcare and kindergarten services, this project will assume that childcare services are for children aged 0 to 2 years, while kindergarten services are for children aged 3 to 6
Since the population data set (Singapore Residents by Planning AreaSubzone, Age Group, Sex and Type of Dwelling, June 2011-2019) has a variable ‘AG’ that have ages binned in 4 years, namely ‘0_to_4’, ‘5_to_9’ and so on, the AG count will be divided in half for the sake of getting the count for children aged 0 to 2 years and children aged 3 to 6.
R packages used:
packages = c('sf', 'tidyverse', 'ggplot2', 'ggthemes', 'plotly', 'tmap','rgdal','rvest', 'maptools','mapview', 'raster', 'spatstat')
for (p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p, character.only = T)
}
We will read the available data into both sp object and sf object although we will be using the sf object first.
mpsz_sp <- readOGR(dsn = "data/geospatial", layer= "MP14_SUBZONE_WEB_PL")
## OGR data source with driver: ESRI Shapefile
## Source: "D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial", layer: "MP14_SUBZONE_WEB_PL"
## with 323 features
## It has 15 fields
mpsz_sf <- st_read(dsn = "data/geospatial", layer= "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS: SVY21
summary(mpsz_sf)
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C
## Min. : 1.0 Min. : 1.000 Length:323 Length:323
## 1st Qu.: 81.5 1st Qu.: 2.000 Class :character Class :character
## Median :162.0 Median : 4.000 Mode :character Mode :character
## Mean :162.0 Mean : 4.625
## 3rd Qu.:242.5 3rd Qu.: 6.500
## Max. :323.0 Max. :17.000
## CA_IND PLN_AREA_N PLN_AREA_C REGION_N
## Length:323 Length:323 Length:323 Length:323
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## REGION_C INC_CRC FMEL_UPD_D X_ADDR
## Length:323 Length:323 Min. :2014-12-05 Min. : 5093
## Class :character Class :character 1st Qu.:2014-12-05 1st Qu.:21864
## Mode :character Mode :character Median :2014-12-05 Median :28465
## Mean :2014-12-05 Mean :27257
## 3rd Qu.:2014-12-05 3rd Qu.:31674
## Max. :2014-12-05 Max. :50425
## Y_ADDR SHAPE_Leng SHAPE_Area geometry
## Min. :19579 Min. : 871.5 Min. : 39438 MULTIPOLYGON :323
## 1st Qu.:31776 1st Qu.: 3709.6 1st Qu.: 628261 epsg:NA : 0
## Median :35113 Median : 5211.9 Median : 1229894 +proj=tmer...: 0
## Mean :36106 Mean : 6524.4 Mean : 2420882
## 3rd Qu.:39869 3rd Qu.: 6942.6 3rd Qu.: 2106483
## Max. :49553 Max. :68083.9 Max. :69748299
Checking if there are NA values in the planning area data set:
There are no NA values.
any(is.na(mpsz_sf))
## [1] FALSE
tm_shape(mpsz_sf)+
tm_basemap('OpenStreetMap')+
tm_polygons("PLN_AREA_N")+
tm_text("SUBZONE_N", size = 0.1)+
tm_borders(lwd=0.5)+
tm_layout(legend.height = 0.45,
legend.width = 0.35,
frame = FALSE) +
tm_compass(type="8star", size = 2) +
tm_scale_bar(width = 0.15) +
tm_credits("www.data.gov.sg",
position = c("left", "bottom"))
This is Singapore with the different planning subzones shown, this assignment will be based on this map.
Firstly, we read the child care & kindergarten KML data into R as a Spatial Object (sp) by using readOGR().
childcare_sp <- readOGR("data/geospatial/child-care-services-kml.kml", "CHILDCARE")
## OGR data source with driver: KML
## Source: "D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial\child-care-services-kml.kml", layer: "CHILDCARE"
## with 1545 features
## It has 2 fields
kindergarten_sp <- readOGR("data/geospatial/kindergartens-kml.kml", "KINDERGARTENS")
## OGR data source with driver: KML
## Source: "D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial\kindergartens-kml.kml", layer: "KINDERGARTENS"
## with 448 features
## It has 2 fields
childcare_sf <- st_read("data/geospatial/child-care-services-kml.kml", "CHILDCARE")
## Reading layer `CHILDCARE' from data source `D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial\child-care-services-kml.kml' using driver `KML'
## Simple feature collection with 1545 features and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
## z_range: zmin: 0 zmax: 0
## geographic CRS: WGS 84
kindergarten_sf <- st_read("data/geospatial/kindergartens-kml.kml", "KINDERGARTENS")
## Reading layer `KINDERGARTENS' from data source `D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial\kindergartens-kml.kml' using driver `KML'
## Simple feature collection with 448 features and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 103.6887 ymin: 1.247759 xmax: 103.9717 ymax: 1.455452
## z_range: zmin: 0 zmax: 0
## geographic CRS: WGS 84
preschool_sf <- st_read("data/geospatial/pre-schools-location-kml.kml", "PRESCHOOLS_LOCATION")
## Reading layer `PRESCHOOLS_LOCATION' from data source `D:\SMU Year 3 Sem 1\IS415 Geospatial Analytics and Application\Take-home_Ex01\data\geospatial\pre-schools-location-kml.kml' using driver `KML'
## Simple feature collection with 1925 features and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 103.6824 ymin: 1.247759 xmax: 103.9897 ymax: 1.462134
## z_range: zmin: 0 zmax: 0
## geographic CRS: WGS 84
To make the data more accurate, we should check the coordinate reference systems first and change the projection systems to EPSG 3414 to make it more accurate. We use st_crs() function to check sf object’s crs and crs() function to check sp object’s crs.
st_crs(mpsz_sf)
st_crs(childcare_sf)
st_crs(kindergarten_sf)
st_crs(preschool_sf)
crs(mpsz_sp)
crs(childcare_sp)
crs(kindergarten_sp)
Some of the sf data are not in SVY21 and EPSG3414 so we will use st_transform() & spTransform() functions to change it to EPSG3414.
mpsz_sf_svy21 <- st_transform(mpsz_sf, 3414)
childcare_sf_svy21 <- st_transform(childcare_sf,3414)
kindergarten_sf_svy21 <- st_transform(kindergarten_sf,3414)
preschool_sf_svy21 <- st_transform(preschool_sf, 3414)
mpsz_sp_svy21 <- spTransform(mpsz_sp, CRS("+init=epsg:3414"))
childcare_sp_svy21 <- spTransform(childcare_sp, CRS("+init=epsg:3414"))
kindergarten_sp_svy21 <- spTransform(kindergarten_sp, CRS("+init=epsg:3414"))
Let’s check the CRS again for the sf objects.
st_crs(mpsz_sf_svy21)
st_crs(childcare_sf_svy21)
st_crs(kindergarten_sf_svy21)
st_crs(preschool_sf_svy21)
Let’s check the CRS again for the sp objects.
crs(mpsz_sp_svy21)
## CRS arguments:
## +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +units=m +no_defs
crs(childcare_sp_svy21)
## CRS arguments:
## +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +units=m +no_defs
crs(kindergarten_sp_svy21)
## CRS arguments:
## +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +units=m +no_defs
Now that projection system is done, let’s move on to displaying the KML file and see what we have.
head(childcare_sf_svy21, n=1)
## Simple feature collection with 1 feature and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 27976.73 ymin: 45716.7 xmax: 27976.73 ymax: 45716.7
## z_range: zmin: 0 zmax: 0
## projected CRS: SVY21 / Singapore TM
## Name
## 1 kml_1
## Description
## 1 <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>ADDRESSBLOCKHOUSENUMBER</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSBUILDINGNAME</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSPOSTALCODE</th> <td>760742</td> </tr><tr bgcolor=""> <th>ADDRESSSTREETNAME</th> <td>742, YISHUN AVENUE 5, #01 - 470, SINGAPORE 760742</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSTYPE</th> <td></td> </tr><tr bgcolor=""> <th>DESCRIPTION</th> <td>Child Care Services</td> </tr><tr bgcolor="#E3E3F3"> <th>HYPERLINK</th> <td></td> </tr><tr bgcolor=""> <th>LANDXADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor="#E3E3F3"> <th>LANDYADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor=""> <th>NAME</th> <td>AVERBEL CHILD DEVELOPMENT CENTRE PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>PHOTOURL</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSFLOORNUMBER</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>AEA27114446235CE</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20200826094036</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSUNITNUMBER</th> <td></td> </tr></table></center>
## geometry
## 1 POINT Z (27976.73 45716.7 0)
head(kindergarten_sf_svy21, n=1)
## Simple feature collection with 1 feature and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 28847.97 ymin: 40124.9 xmax: 28847.97 ymax: 40124.9
## z_range: zmin: 0 zmax: 0
## projected CRS: SVY21 / Singapore TM
## Name
## 1 kml_1
## Description
## 1 <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>ADDRESSBLOCKHOUSENUMBER</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSBUILDINGNAME</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSFLOORNUMBER</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSPOSTALCODE</th> <td>560644</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSSTREETNAME</th> <td>644 Ang Mo Kio Ave 4 #01-850 S(560644)</td> </tr><tr bgcolor=""> <th>ADDRESSTYPE</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>DESCRIPTION</th> <td>Kindergartens</td> </tr><tr bgcolor=""> <th>HYPERLINK</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>LANDXADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor=""> <th>LANDYADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor="#E3E3F3"> <th>NAME</th> <td>PCF Sparkletots Preschool @ Yio Chu Kang Blk 644 (KN)</td> </tr><tr bgcolor=""> <th>PHOTOURL</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>904D106E26156265</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20200813015028</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSUNITNUMBER</th> <td></td> </tr></table></center>
## geometry
## 1 POINT Z (28847.97 40124.9 0)
head(preschool_sf_svy21, n=1)
## Simple feature collection with 1 feature and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 13258.34 ymin: 35611.04 xmax: 13258.34 ymax: 35611.04
## z_range: zmin: 0 zmax: 0
## projected CRS: SVY21 / Singapore TM
## Name
## 1 kml_1
## Description
## 1 <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>CENTRE_NAME</th> <td>BRILLIANT TOTS PTE. LTD.</td> </tr><tr bgcolor=""> <th>CENTRE_CODE</th> <td>PT9334</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESS</th> <td>610, JURONG WEST STREET 65, #01 - 534, S 640610</td> </tr><tr bgcolor=""> <th>POSTAL_CODE</th> <td>640610</td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>0523C7904478A63D</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20200812235534</td> </tr></table></center>
## geometry
## 1 POINT Z (13258.34 35611.04 0)
You can see that all the variables that we need are lumped into the Description variable. Hence, we need to split the Description variable to create new columns.
We use the read_html(), html_node() and html_table() functions from the rvest package to split the table in the Description variable and pivot_wider() function to create new columns of Attribute and Value. Credits: Stack Overflow
attributes <- lapply(X = 1:nrow(childcare_sf_svy21),
FUN = function(x) {
childcare_sf_svy21 %>%
slice(x) %>%
pull(Description) %>%
read_html() %>%
html_node("table") %>%
html_table(header = TRUE, trim = TRUE, dec = ".", fill = TRUE) %>%
as_tibble(.name_repair = ~ make.names(c("Attribute", "Value"))) %>%
pivot_wider(names_from = Attribute, values_from = Value)
})
Next, we create a new dataframe that adds new columns from the different Attributes and Values in the respective columns and deselect Description to make the data tidy.
childcare_attr <- childcare_sf_svy21 %>%
bind_cols(bind_rows(attributes)) %>%
dplyr::select(-Description)
Let’s see the new sf childcare dataframe.
head(childcare_attr, n=1)
## Simple feature collection with 1 feature and 16 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 27976.73 ymin: 45716.7 xmax: 27976.73 ymax: 45716.7
## z_range: zmin: 0 zmax: 0
## projected CRS: SVY21 / Singapore TM
## Name ADDRESSBLOCKHOUSENUMBER ADDRESSBUILDINGNAME ADDRESSPOSTALCODE
## 1 kml_1 760742
## ADDRESSSTREETNAME ADDRESSTYPE
## 1 742, YISHUN AVENUE 5, #01 - 470, SINGAPORE 760742
## DESCRIPTION HYPERLINK LANDXADDRESSPOINT LANDYADDRESSPOINT
## 1 Child Care Services 0 0
## NAME PHOTOURL ADDRESSFLOORNUMBER
## 1 AVERBEL CHILD DEVELOPMENT CENTRE PTE LTD
## INC_CRC FMEL_UPD_D ADDRESSUNITNUMBER
## 1 AEA27114446235CE 20200826094036
## geometry
## 1 POINT Z (27976.73 45716.7 0)
any(is.na(childcare_attr))
## [1] FALSE
attributes <- lapply(X = 1:nrow(kindergarten_sf_svy21),
FUN = function(x) {
kindergarten_sf_svy21 %>%
slice(x) %>%
pull(Description) %>%
read_html() %>%
html_node("table") %>%
html_table(header = TRUE, trim = TRUE, dec = ".", fill = TRUE) %>%
as_tibble(.name_repair = ~ make.names(c("Attribute", "Value"))) %>%
pivot_wider(names_from = Attribute, values_from = Value)
})
kindergarten_attr <- kindergarten_sf_svy21 %>%
bind_cols(bind_rows(attributes)) %>%
dplyr::select(-Description)
Let’s see the new sf kindergarten dataframe.
head(kindergarten_attr, n=1)
## Simple feature collection with 1 feature and 16 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 28847.97 ymin: 40124.9 xmax: 28847.97 ymax: 40124.9
## z_range: zmin: 0 zmax: 0
## projected CRS: SVY21 / Singapore TM
## Name ADDRESSBLOCKHOUSENUMBER ADDRESSBUILDINGNAME ADDRESSFLOORNUMBER
## 1 kml_1
## ADDRESSPOSTALCODE ADDRESSSTREETNAME ADDRESSTYPE
## 1 560644 644 Ang Mo Kio Ave 4 #01-850 S(560644)
## DESCRIPTION HYPERLINK LANDXADDRESSPOINT LANDYADDRESSPOINT
## 1 Kindergartens 0 0
## NAME PHOTOURL
## 1 PCF Sparkletots Preschool @ Yio Chu Kang Blk 644 (KN)
## INC_CRC FMEL_UPD_D ADDRESSUNITNUMBER
## 1 904D106E26156265 20200813015028
## geometry
## 1 POINT Z (28847.97 40124.9 0)
any(is.na(kindergarten_attr))
## [1] FALSE
attributes <- lapply(X = 1:nrow(preschool_sf_svy21),
FUN = function(x) {
preschool_sf_svy21 %>%
slice(x) %>%
pull(Description) %>%
read_html() %>%
html_node("table") %>%
html_table(header = TRUE, trim = TRUE, dec = ".", fill = TRUE) %>%
as_tibble(.name_repair = ~ make.names(c("Attribute", "Value"))) %>%
pivot_wider(names_from = Attribute, values_from = Value)
})
preschool_attr <- preschool_sf_svy21 %>%
bind_cols(bind_rows(attributes)) %>%
dplyr::select(-Description)
Let’s see the new sf preschool dataframe.
head(preschool_attr, n=1)
## Simple feature collection with 1 feature and 7 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 13258.34 ymin: 35611.04 xmax: 13258.34 ymax: 35611.04
## z_range: zmin: 0 zmax: 0
## projected CRS: SVY21 / Singapore TM
## Name CENTRE_NAME CENTRE_CODE
## 1 kml_1 BRILLIANT TOTS PTE. LTD. PT9334
## ADDRESS POSTAL_CODE INC_CRC
## 1 610, JURONG WEST STREET 65, #01 - 534, S 640610 640610 0523C7904478A63D
## FMEL_UPD_D geometry
## 1 20200812235534 POINT Z (13258.34 35611.04 0)
any(is.na(preschool_attr), n=1)
## [1] TRUE
summary(childcare_attr)
## Name ADDRESSBLOCKHOUSENUMBER ADDRESSBUILDINGNAME
## Length:1545 Length:1545 Length:1545
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## ADDRESSPOSTALCODE ADDRESSSTREETNAME ADDRESSTYPE DESCRIPTION
## Length:1545 Length:1545 Length:1545 Length:1545
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## HYPERLINK LANDXADDRESSPOINT LANDYADDRESSPOINT NAME
## Length:1545 Length:1545 Length:1545 Length:1545
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## PHOTOURL ADDRESSFLOORNUMBER INC_CRC FMEL_UPD_D
## Length:1545 Length:1545 Length:1545 Length:1545
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## ADDRESSUNITNUMBER geometry
## Length:1545 POINT Z :1545
## Class :character epsg:3414 : 0
## Mode :character +proj=tmer...: 0
summary(kindergarten_attr)
## Name ADDRESSBLOCKHOUSENUMBER ADDRESSBUILDINGNAME
## Length:448 Length:448 Length:448
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## ADDRESSFLOORNUMBER ADDRESSPOSTALCODE ADDRESSSTREETNAME ADDRESSTYPE
## Length:448 Length:448 Length:448 Length:448
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## DESCRIPTION HYPERLINK LANDXADDRESSPOINT LANDYADDRESSPOINT
## Length:448 Length:448 Length:448 Length:448
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## NAME PHOTOURL INC_CRC FMEL_UPD_D
## Length:448 Length:448 Length:448 Length:448
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## ADDRESSUNITNUMBER geometry
## Length:448 POINT Z :448
## Class :character epsg:3414 : 0
## Mode :character +proj=tmer...: 0
summary(preschool_attr)
## Name CENTRE_NAME CENTRE_CODE ADDRESS
## Length:1925 Length:1925 Length:1925 Length:1925
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## POSTAL_CODE INC_CRC FMEL_UPD_D geometry
## Length:1925 Length:1925 Length:1925 POINT Z :1925
## Class :character Class :character Class :character epsg:3414 : 0
## Mode :character Mode :character Mode :character +proj=tmer...: 0
Let’s see the sp objects.
It’s also lumped into the Description variable, but we will leave it first since we alr processed the sf KML data.
head(childcare_sp_svy21, n=1)
## Name
## 1 kml_1
## Description
## 1 <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>ADDRESSBLOCKHOUSENUMBER</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSBUILDINGNAME</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSPOSTALCODE</th> <td>760742</td> </tr><tr bgcolor=""> <th>ADDRESSSTREETNAME</th> <td>742, YISHUN AVENUE 5, #01 - 470, SINGAPORE 760742</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSTYPE</th> <td></td> </tr><tr bgcolor=""> <th>DESCRIPTION</th> <td>Child Care Services</td> </tr><tr bgcolor="#E3E3F3"> <th>HYPERLINK</th> <td></td> </tr><tr bgcolor=""> <th>LANDXADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor="#E3E3F3"> <th>LANDYADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor=""> <th>NAME</th> <td>AVERBEL CHILD DEVELOPMENT CENTRE PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>PHOTOURL</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSFLOORNUMBER</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>AEA27114446235CE</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20200826094036</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSUNITNUMBER</th> <td></td> </tr></table></center>
population <- read_csv('data/aspatial/respopagesextod2011to2019.csv')
head(population)
## # A tibble: 6 x 7
## PA SZ AG Sex TOD Pop Time
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 Ang Mo K~ Ang Mo Kio Town~ 0_to~ Males HDB 1- and 2-Room Flats 0 2011
## 2 Ang Mo K~ Ang Mo Kio Town~ 0_to~ Males HDB 3-Room Flats 10 2011
## 3 Ang Mo K~ Ang Mo Kio Town~ 0_to~ Males HDB 4-Room Flats 30 2011
## 4 Ang Mo K~ Ang Mo Kio Town~ 0_to~ Males HDB 5-Room and Executive F~ 50 2011
## 5 Ang Mo K~ Ang Mo Kio Town~ 0_to~ Males HUDC Flats (excluding thos~ 0 2011
## 6 Ang Mo K~ Ang Mo Kio Town~ 0_to~ Males Landed Properties 0 2011
distinct(population, SZ)
## # A tibble: 323 x 1
## SZ
## <chr>
## 1 Ang Mo Kio Town Centre
## 2 Cheng San
## 3 Chong Boon
## 4 Kebun Bahru
## 5 Sembawang Hills
## 6 Shangri-La
## 7 Tagore
## 8 Townsville
## 9 Yio Chu Kang
## 10 Yio Chu Kang East
## # ... with 313 more rows
Checking if there are NA values in the population data set:
There are no NA values.
any(is.na(population))
## [1] FALSE
population[rowSums(is.na(population))!=0,]
## # A tibble: 0 x 7
## # ... with 7 variables: PA <chr>, SZ <chr>, AG <chr>, Sex <chr>, TOD <chr>,
## # Pop <dbl>, Time <dbl>
To make the choropleth map, we need to first split the AG variable (age) in the population data set to different variable of binned ages.
We are particularly interested in 2 variables, childcare services age, 0 to 2 years old and kindergarten services age, 3 to 6 years old.
pop_preschool <- population %>%
filter(Time == 2019) %>%
spread(AG, Pop) %>%
mutate(`CHILDCARE_POP` = `0_to_4` %/% 2) %>%
mutate(`KINDERGARTEN_POP` = (`0_to_4` %/% 2) + (`5_to_9` %/% (2/5)))%>%
mutate(`YOUNG` = `0_to_4`+`5_to_9`+`10_to_14`+
`15_to_19`+`20_to_24`) %>%
mutate(`ECONOMICALLY_ACTIVE` = rowSums(.[9:13])+rowSums(.[15:17]))%>%
mutate(`AGED`= rowSums(.[18:22])) %>%
mutate(`TOTAL` = rowSums(.[5:22])) %>%
mutate_at(.vars = vars(PA, SZ, Sex), .funs = funs(toupper))%>%
dplyr::select(`PA`, `SZ`, `KINDERGARTEN_POP`, `CHILDCARE_POP`, `YOUNG`, `ECONOMICALLY_ACTIVE`, `AGED`, `TOTAL`) %>%
filter(`ECONOMICALLY_ACTIVE`> 0)
Since we do not need the data on TOD, the residence type of the population in each planning area, we will create a new dataframe that adds up the population in each subzone.
pop_preschool2 <- data.frame()
index = 1
for (i in 1:nrow(pop_preschool)){
if (i == 1){
data1 <- pop_preschool[i,]
df1 <- data.frame(data1)
pop_preschool2 <- rbind(pop_preschool2, df1)
}
if ((i != 1) & (pop_preschool[i,2] %in% pop_preschool2[,2] == 0)){
data2 <- pop_preschool[i,]
df2 <- data.frame(data2)
pop_preschool2 <- rbind(pop_preschool2, df2)
index = index + 1
} else{
for (j in 3:8){
pop_preschool2[index, j] = pop_preschool2[index,j] + pop_preschool[i, j]
}
}
}
From the code above, the first observation of subzone Ang Mo Kio Town Centre is added twice to the final dataframe because of our first condition, so we have to reduce the first row value by the first observation of subzone Ang Mo Kio Town Centre before further manipulating the data.
for (a in 3:8){
pop_preschool2[1,a] = pop_preschool2[1,a] - pop_preschool[1,a]
}
Then, we will create new columns named:
‘PRESCHOOL_POP’ which is the percentage of the population of children aged 0 to 6 who go for childcare and kindergarten services
‘CHILDCARE_POP_PERCENT’ which is the percentage of the population of children aged 0 to 2 who go for childcare services
‘KINDERGARTEN_POP_PERCENT’ which is the percentage of the population of children aged 3 to 6 who go for kindergarten services
pop_preschool_final <- pop_preschool2 %>%
mutate(`PRESCHOOL_POP` = ((`CHILDCARE_POP` + `KINDERGARTEN_POP`)/ `TOTAL`)*100)%>%
mutate(`CHILDCARE_POP_PERCENT` = ((`CHILDCARE_POP`)/ `TOTAL`)*100)%>%
mutate(`KINDERGARTEN_POP_PERCENT` = ((`KINDERGARTEN_POP`)/ `TOTAL`)*100)%>%
dplyr::select(`PA`, `SZ`, `CHILDCARE_POP`,`CHILDCARE_POP_PERCENT`, `KINDERGARTEN_POP`,`KINDERGARTEN_POP_PERCENT`, `PRESCHOOL_POP`, `YOUNG`, `ECONOMICALLY_ACTIVE`, `AGED`, `TOTAL`)
summary(pop_preschool_final)
## PA SZ CHILDCARE_POP CHILDCARE_POP_PERCENT
## Length:233 Length:233 Min. : 0.0 Min. :0.0000
## Class :character Class :character 1st Qu.: 65.0 1st Qu.:0.4448
## Mode :character Mode :character Median : 210.0 Median :0.7490
## Mean : 398.2 Mean :0.8852
## 3rd Qu.: 535.0 3rd Qu.:1.1663
## Max. :2875.0 Max. :4.1201
## KINDERGARTEN_POP KINDERGARTEN_POP_PERCENT PRESCHOOL_POP YOUNG
## Min. : 0 Min. : 0.000 Min. : 0.000 Min. : 0
## 1st Qu.: 453 1st Qu.: 3.167 1st Qu.: 3.579 1st Qu.: 750
## Median : 1527 Median : 5.242 Median : 5.947 Median : 2760
## Mean : 2520 Mean : 5.755 Mean : 6.640 Mean : 4568
## 3rd Qu.: 3437 3rd Qu.: 8.029 3rd Qu.: 9.130 3rd Qu.: 6520
## Max. :16540 Max. :20.554 Max. :24.635 Max. :34260
## ECONOMICALLY_ACTIVE AGED TOTAL
## Min. : 30 Min. : 0 Min. : 4088
## 1st Qu.: 1630 1st Qu.: 590 1st Qu.: 13014
## Median : 5840 Median : 2510 Median : 29240
## Mean : 9623 Mean : 3464 Mean : 33147
## 3rd Qu.:14660 3rd Qu.: 5120 3rd Qu.: 47260
## Max. :75280 Max. :28520 Max. :155628
We can see that ‘pop_preschool_final’ has 233 observations of 11 variables, thus we want to check if it corresponds to the distinct values of subzone areas. From the code, we can see that the distinct value is 233, it corresponds to the length of our final population dataframe.
distinct(pop_preschool, SZ)
## # A tibble: 233 x 1
## SZ
## <chr>
## 1 ANG MO KIO TOWN CENTRE
## 2 CHENG SAN
## 3 CHONG BOON
## 4 KEBUN BAHRU
## 5 SEMBAWANG HILLS
## 6 SHANGRI-LA
## 7 TAGORE
## 8 TOWNSVILLE
## 9 YIO CHU KANG EAST
## 10 YIO CHU KANG WEST
## # ... with 223 more rows
Finally, we will merge the aspatial population data (pop_preschool_final) that we have tidied with the spatial data which is the Master Plan 2014 Planning Area Boundary. For the arguments, we put all.x as TRUE so that all the x dataframe (mpsz_sp_svy21) will be kept even if no matches are found, it’s similar to left_join() function in SQL.
mpsz_preschool <- merge(mpsz_sp_svy21, pop_preschool_final, by.x = "SUBZONE_N", by.y = "SZ", all.x=TRUE)
We can see that there are 90 NA’s from the population data, meaning that not all the subzone planning area are captured in the population data, they have NA population.
summary(mpsz_preschool)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
## min max
## x 2667.538 56396.44
## y 15748.721 50256.33
## Is projected: TRUE
## proj4string :
## [+proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +units=m +no_defs]
## Data attributes:
## SUBZONE_N OBJECTID SUBZONE_NO SUBZONE_C
## Length:323 Min. : 1.0 Min. : 1.000 Length:323
## Class :character 1st Qu.: 81.5 1st Qu.: 2.000 Class :character
## Mode :character Median :162.0 Median : 4.000 Mode :character
## Mean :162.0 Mean : 4.625
## 3rd Qu.:242.5 3rd Qu.: 6.500
## Max. :323.0 Max. :17.000
##
## CA_IND PLN_AREA_N PLN_AREA_C REGION_N
## Length:323 Length:323 Length:323 Length:323
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## REGION_C INC_CRC FMEL_UPD_D X_ADDR
## Length:323 Length:323 Length:323 Min. : 5093
## Class :character Class :character Class :character 1st Qu.:21864
## Mode :character Mode :character Mode :character Median :28465
## Mean :27257
## 3rd Qu.:31674
## Max. :50425
##
## Y_ADDR SHAPE_Leng SHAPE_Area PA
## Min. :19579 Min. : 871.5 Min. : 39438 Length:323
## 1st Qu.:31776 1st Qu.: 3709.6 1st Qu.: 628261 Class :character
## Median :35113 Median : 5211.9 Median : 1229894 Mode :character
## Mean :36106 Mean : 6524.4 Mean : 2420882
## 3rd Qu.:39869 3rd Qu.: 6942.6 3rd Qu.: 2106483
## Max. :49553 Max. :68083.9 Max. :69748299
##
## CHILDCARE_POP CHILDCARE_POP_PERCENT KINDERGARTEN_POP
## Min. : 0.0 Min. :0.0000 Min. : 0
## 1st Qu.: 65.0 1st Qu.:0.4448 1st Qu.: 453
## Median : 210.0 Median :0.7490 Median : 1527
## Mean : 398.2 Mean :0.8852 Mean : 2520
## 3rd Qu.: 535.0 3rd Qu.:1.1663 3rd Qu.: 3437
## Max. :2875.0 Max. :4.1201 Max. :16540
## NA's :90 NA's :90 NA's :90
## KINDERGARTEN_POP_PERCENT PRESCHOOL_POP YOUNG ECONOMICALLY_ACTIVE
## Min. : 0.000 Min. : 0.000 Min. : 0 Min. : 30
## 1st Qu.: 3.167 1st Qu.: 3.579 1st Qu.: 750 1st Qu.: 1630
## Median : 5.242 Median : 5.947 Median : 2760 Median : 5840
## Mean : 5.755 Mean : 6.640 Mean : 4568 Mean : 9623
## 3rd Qu.: 8.029 3rd Qu.: 9.130 3rd Qu.: 6520 3rd Qu.:14660
## Max. :20.554 Max. :24.635 Max. :34260 Max. :75280
## NA's :90 NA's :90 NA's :90 NA's :90
## AGED TOTAL
## Min. : 0 Min. : 4088
## 1st Qu.: 590 1st Qu.: 13014
## Median : 2510 Median : 29240
## Mean : 3464 Mean : 33147
## 3rd Qu.: 5120 3rd Qu.: 47260
## Max. :28520 Max. :155628
## NA's :90 NA's :90
Demand is a lot about population, so we will plot maps of the overall population, preschool age population and describe the spatial patterns.
map_layout <- function(x,y,z){
tm_shape(x)+
tm_fill(y,
style = 'equal',
palette = 'Blues',
legend.hist = TRUE,
legend.is.portrait = TRUE,
legend.hist.z = 0.2)+
tm_borders(alpha = 0.5, col = 'white')+
tm_layout(main.title = z,
main.title.position = 'center',
main.title.size = 1.2,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE)
}
tm_shape(mpsz_preschool)+
tm_fill("TOTAL", style = 'pretty')+
tm_style('natural')+
tm_layout(main.title = "Overall population of Singapore",
main.title.position = 'center',
main.title.size = 1.2,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_borders(alpha = 0.5)
n1 <- order(mpsz_preschool$TOTAL, na.last = TRUE, decreasing = TRUE)[1:4]
((mpsz_preschool$TOTAL)[n1])
## [1] 155628 120259 109526 103208
((mpsz_preschool$PLN_AREA_N)[n1])
## [1] "TAMPINES" "WOODLANDS" "BEDOK" "TAMPINES"
((mpsz_preschool$SUBZONE_N)[n1])
## [1] "TAMPINES EAST" "WOODLANDS EAST" "BEDOK NORTH" "TAMPINES WEST"
From this map, we can see that there are only 4 subzone areas that have a population over 100,000 people and 1 subzone area that has a population over 150,000
The highest populated subzone area is Tampines East with 155,628 people
d1 <- map_layout(mpsz_preschool, "PRESCHOOL_POP", "Overall population of children aged 0 to 6")
d1
n2 <- order(mpsz_preschool$PRESCHOOL_POP, na.last = TRUE, decreasing = TRUE)[1:5]
((mpsz_preschool$PRESCHOOL_POP)[n2])
## [1] 24.63457 24.55684 21.10544 19.74317 19.31104
((mpsz_preschool$PLN_AREA_N)[n2])
## [1] "PUNGGOL" "PUNGGOL" "SENGKANG" "SENGKANG" "YISHUN"
((mpsz_preschool$SUBZONE_N)[n2])
## [1] "WATERWAY EAST" "MATILDA" "FERNVALE" "ANCHORVALE"
## [5] "YISHUN EAST"
Punggol, Waterwest East = 24.63 %
Punggol, Matilda = 24.56 %
Sengkang, Fernvale = 21.10 %
d2 <- map_layout(mpsz_preschool, "CHILDCARE_POP_PERCENT", "Population of children aged 0 to 2 (Childcare services)")
d2
n3 <- order(mpsz_preschool$CHILDCARE_POP_PERCENT, na.last = TRUE, decreasing = TRUE)[1:5]
((mpsz_preschool$CHILDCARE_POP_PERCENT)[n3])
## [1] 4.120092 4.002532 3.377225 3.107460 2.988886
((mpsz_preschool$PLN_AREA_N)[n3])
## [1] "PUNGGOL" "PUNGGOL" "PUNGGOL" "SENGKANG" "SENGKANG"
((mpsz_preschool$SUBZONE_N)[n3])
## [1] "WATERWAY EAST" "MATILDA" "PUNGGOL TOWN CENTRE"
## [4] "FERNVALE" "ANCHORVALE"
Punggol, Waterwest East = 4.12 % %
Punggol, Matilda = 4.00 %
Punggol, Punggol Town Centre = 3.38 %
d3 <- map_layout(mpsz_preschool, "KINDERGARTEN_POP_PERCENT", "Population of children aged 3 to 7 (Kindergarten services)")
d3
n4 <- order(mpsz_preschool$KINDERGARTEN_POP_PERCENT, na.last = TRUE, decreasing = TRUE)[1:5]
((mpsz_preschool$KINDERGARTEN_POP_PERCENT)[n4])
## [1] 20.55431 20.51447 17.99798 16.75428 16.40684
((mpsz_preschool$PLN_AREA_N)[n4])
## [1] "PUNGGOL" "PUNGGOL" "SENGKANG" "SENGKANG" "YISHUN"
((mpsz_preschool$SUBZONE_N)[n4])
## [1] "MATILDA" "WATERWAY EAST" "FERNVALE" "ANCHORVALE"
## [5] "YISHUN EAST"
Punggol, Matilda = 20.55 %
Punggol, Waterway East = 20.51 %
Sengkang, Fernvale = 17.99 %
#library(grid)
#grid.newpage()
#pushViewport(viewport(layout = grid.layout(nrow = 2, ncol = 2)))
#print(d1, vp = viewport(layout.pos.row = 1, layout.pos.col = 1:2))
#print(d2, vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
#print(d3, vp = viewport(layout.pos.row = 2, layout.pos.col = 2))
tmap_arrange(d1,d2,d3)
From this map, we can see that the population of childcare age and kindergarten age does not vary from the expected, the areas with the 2 highest percentage of both kindergarten and childcare population are still Punggol Waterway East and Punggol Matilda
We can also see that the population of children age 0 to 2 are mostly concentrated in the north-east area while the population of children age 3 to 7 are more widely spread, including the east area
Next, we have to identify the pre-schools in each planning subzone area.
Firstly, let’s see if there are any duplicated points events. There seems to be one duplicated point events from the result of the code block. However when we check if there’s any duplicate in Name which contains the block number and building level, it has no duplicate. So, maybe some childcare centres have the same street name.
nrow(childcare_attr)
## [1] 1545
nrow(distinct(childcare_attr, ADDRESSSTREETNAME))
## [1] 1544
any(duplicated(childcare_attr$ADDRESSSTREETNAME))
## [1] TRUE
any(duplicated(childcare_attr$Name))
## [1] FALSE
There are no duplicate data in the kindergarten KML file.
any(duplicated(kindergarten_attr$ADDRESSSTREETNAME))
## [1] FALSE
any(duplicated(kindergarten_attr$Name))
## [1] FALSE
We use the lengths() function and st_intersects() function to calculate the count of preschools in each planning subzone area. For this, we use the sf dataframe because st_intersect() function does not work with sp objects.
mpsz_preschool$`CHILDCARE_COUNT` <- lengths(st_intersects(mpsz_sf_svy21, childcare_sf_svy21))
mpsz_preschool$`KINDERGARTEN_COUNT` <- lengths(st_intersects(mpsz_sf_svy21, kindergarten_sf_svy21))
mpsz_preschool$`PRESCHOOL_COUNT` <- lengths(st_intersects(mpsz_sf_svy21, preschool_sf_svy21))
mpsz_sf_svy21$`CHILDCARE_COUNT` <- lengths(st_intersects(mpsz_sf_svy21, childcare_sf_svy21))
mpsz_sf_svy21$`KINDERGARTEN_COUNT` <- lengths(st_intersects(mpsz_sf_svy21, kindergarten_sf_svy21))
mpsz_sf_svy21$`PRESCHOOL_COUNT` <- lengths(st_intersects(mpsz_sf_svy21, preschool_sf_svy21))
We calculate the area of the planning subzone areas using st_area() from the sf package and it only works on non-Spatial data frame so we have to use the ‘mpsz_sf_svy21’.
mpsz_preschool$AREA <- mpsz_sf_svy21 %>%
st_area()
mpsz_sf_svy21 <- mpsz_sf_svy21 %>%
mutate(`AREA` = mpsz_preschool$AREA)
Next, we can calculate the respective childcare centres density and kindergartens density by diving the count over area of each planning subzone. Then, multiply by 1,000,000 because the projection system is in SVY21 and it’s in metres. To convert m^2 to km^2, we have to multiply by 1,000,000. We also add the childcare density & kindergarten density to the sf data frame ‘mpsz_sf_svy21’ for future map plotting or gg2plots.
mpsz_preschool$CHILDCARE_DENSITY <- (mpsz_preschool$CHILDCARE_COUNT/ mpsz_preschool$AREA * 1000000)
mpsz_preschool$KINDERGARTEN_DENSITY <- (mpsz_preschool$KINDERGARTEN_COUNT/ mpsz_preschool$AREA * 1000000)
mpsz_preschool$PRESCHOOL_DENSITY <- ((mpsz_preschool$KINDERGARTEN_COUNT+ mpsz_preschool$CHILDCARE_COUNT)/ mpsz_preschool$AREA * 1000000)
mpsz_sf_svy21$CHILDCARE_DENSITY <- (mpsz_preschool$CHILDCARE_COUNT/ mpsz_preschool$AREA * 1000000)
mpsz_sf_svy21$KINDERGARTEN_DENSITY <- (mpsz_preschool$KINDERGARTEN_COUNT/ mpsz_preschool$AREA * 1000000)
mpsz_sf_svy21$PRESCHOOL_DENSITY <- ((mpsz_preschool$KINDERGARTEN_COUNT+ mpsz_preschool$CHILDCARE_COUNT)/ mpsz_preschool$AREA * 1000000)
#mpsz_sp_svy21 <- mpsz_preschool %>%
# mutate(`CHILDCARE_DENSITY` = `CHILDCARE_COUNT`/ `AREA` * 1000000) %>%
# mutate(`KINDERGARTEN_DENSITY` = `KINDERGARTEN_COUNT`/ `AREA` * 1000000)
ggplot(data = mpsz_sf_svy21, aes(x= '', y= as.numeric(`CHILDCARE_DENSITY`)))+
geom_boxplot()
ggplot(data = mpsz_sf_svy21, aes(x= '', y= as.numeric(`KINDERGARTEN_DENSITY`))) +
geom_boxplot()
Box plot description: We can see that both boxplots are equally skewed to the left, towards 0 and have some outliers. However, the boxplot for kindergarten_density is more skewed to the left of the graph and to the right of the range (above the middle) compared to the childcare_density data.
Hence, we cannot use quantile method for our tmap plots, since it will be skewed. We will be using ‘pretty’ and ‘equal’ or ‘fixed’ with manually determined breaks.
s1 <- tm_shape(mpsz_preschool)+
tm_fill("PRESCHOOL_COUNT",
style = 'equal',
palette = 'Purples',
legend.hist = TRUE,
legend.is.portrait = TRUE,
legend.hist.z = 0.2)+
tm_layout(main.title = "Number of preschool services",
main.title.position = 'center',
main.title.size = 1.2,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_borders(alpha = 0.5)
s1
Checking out the top 3 planning subzone area with highest preschool count.
n5 <- order(mpsz_preschool$PRESCHOOL_COUNT, na.last = TRUE, decreasing = TRUE)[1:3]
((mpsz_preschool$PRESCHOOL_COUNT)[n5])
## [1] 58 47 31
((mpsz_preschool$PLN_AREA_N)[n5])
## [1] "TAMPINES" "WOODLANDS" "BEDOK"
((mpsz_preschool$SUBZONE_N)[n5])
## [1] "TAMPINES EAST" "WOODLANDS EAST" "BEDOK NORTH"
((mpsz_preschool$PRESCHOOL_POP)[n5])
## [1] 12.459198 15.383464 9.702719
Tampines East: 58 centres with 12.5% of the population being eligible for preschool
Woodlands East: 47 centres with 15.4% of the population being eligible for preschool
Bedok North: 31 centres with 9.7% of the population being eligible for preschool
Checking out the top 3 planning subzone area with highest childcare & kindergarten count.
n6 <- order(mpsz_preschool$CHILDCARE_COUNT, na.last = TRUE, decreasing = TRUE)[1:5]
((mpsz_preschool$CHILDCARE_COUNT)[n6])
## [1] 42 42 27 25 23
((mpsz_preschool$PLN_AREA_N)[n6])
## [1] "TAMPINES" "WOODLANDS" "SENGKANG" "BEDOK" "TAMPINES"
((mpsz_preschool$SUBZONE_N)[n6])
## [1] "TAMPINES EAST" "WOODLANDS EAST" "SENGKANG TOWN CENTRE"
## [4] "BEDOK NORTH" "TAMPINES WEST"
n7 <- order(mpsz_preschool$KINDERGARTEN_COUNT, na.last = TRUE, decreasing = TRUE)[1:3]
((mpsz_preschool$KINDERGARTEN_COUNT)[n7])
## [1] 17 11 10
((mpsz_preschool$PLN_AREA_N)[n7])
## [1] "TAMPINES" "MARINE PARADE" "GEYLANG"
((mpsz_preschool$SUBZONE_N)[n7])
## [1] "TAMPINES EAST" "KATONG" "ALJUNIED"
We can see that both the ranks of highest childcare centres are not similar except Tampines East which has the highest overall population in Singapore too.
However, highest count does not always mean good, because it might be too much for a small area and too few for a big area. So, let’s take a look at the density next.
map_layout_density <- function(x,y,z){
tm_shape(x)+
tm_fill(y,
style = 'fixed',
breaks = c(0,1,5,10,15,20),
palette = "Purples",
legend.hist = TRUE,
legend.is.portrait = TRUE,
legend.hist.z = 0.2)+
tm_borders(alpha = 0.5, col = 'white')+
tm_layout(main.title = z,
main.title.position = 'center',
main.title.size = 1.2,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE)
}
s2 <- map_layout_density(mpsz_preschool, "PRESCHOOL_DENSITY", "Density of preschool services")
s3 <- map_layout_density(mpsz_preschool, "CHILDCARE_DENSITY", "Density of childcare services")
s4 <- map_layout_density(mpsz_preschool, "KINDERGARTEN_DENSITY", "Density of kindergarten services")
par(mfrow= c(2,2))
s2
s3
s4
n8 <- order(mpsz_preschool$PRESCHOOL_DENSITY, na.last = TRUE, decreasing = TRUE)[1:5]
((mpsz_preschool$PRESCHOOL_DENSITY)[n8])
## Units: [1/m^2]
## [1] 35.60169 34.93134 25.35630 23.89766 23.36775
((mpsz_preschool$PLN_AREA_N)[n8])
## [1] "DOWNTOWN CORE" "MANDAI" "DOWNTOWN CORE" "SEMBAWANG"
## [5] "SERANGOON"
((mpsz_preschool$SUBZONE_N)[n8])
## [1] "CECIL" "MANDAI ESTATE" "PHILLIP"
## [4] "SEMBAWANG CENTRAL" "SERANGOON NORTH"
((mpsz_preschool$PRESCHOOL_POP)[n8])
## [1] 0.000000 2.975484 NA 11.749184 8.684998
For the relationship between the childcare and kindergarten services, we can see that childcare services are found at the same planning subzone areas and at more planning subzones but which are still neighbouring, nothing too distant.
General trend: As the childcare/kindergarten service/centres increase, the density also increase, the variation is caused by the difference in area size.
We want to find out whether the preschools are proportional to the population of each planning subzone area so we should plot preschool density vs preschool population.
tmap_mode("view")
## tmap mode set to interactive viewing
tmap_arrange(d1, s2, asp=1, ncol=2)
## legend.postion is used for plot mode. Use view.legend.position in tm_view to set the legend position in view mode.
## legend.postion is used for plot mode. Use view.legend.position in tm_view to set the legend position in view mode.
tmap_mode("plot")
## tmap mode set to plotting
We can see that some areas with high preschool population are served well, with 15-20% of preschool services density such as Woodlands East, Keat Hong and Anchorvale, the north & northeast area.
However, we can also see lacking services in some high populated areas such as Fernvale, Holland road and Frankel, newly rising areas which may need more preschool facilities.
EDA for point maps of location of childcare and kindergarten services in separate point maps
We have done the following in Section A:
Import planning subzone boundary layer, childcare KML file, kindergarten KML file into R as sp objects using readOGR() function
Check the CRS of each sp object and transform it into EPSG3414 using spTransform() function
So, the spatstat package requires the data to be in ppp object form. We need to convert the sp object to ppp object format.
childcare_sp1 <- as(childcare_sp_svy21, "SpatialPoints")
childcare_ppp <- as(childcare_sp1, "ppp")
kindergarten_sp1 <- as(kindergarten_sp_svy21, "SpatialPoints")
kindergarten_ppp <- as(kindergarten_sp1, "ppp")
mpsz_owin <- as(mpsz_sp_svy21, "owin")
mpsz_childcare_ppp = childcare_ppp[mpsz_owin]
mpsz_kindergarten_ppp = kindergarten_ppp[mpsz_owin]
par(mfrow= c(1,2))
plot(mpsz_sp_svy21)
plot(childcare_sp_svy21,lwd= 0.1, add=TRUE)
plot(mpsz_sp_svy21)
plot(kindergarten_sp_svy21,lwd= 0.1, add=TRUE)
par(mfrow = c(1,2))
plot(childcare_ppp)
plot(kindergarten_ppp)
par(mfrow= c(1,2))
plot(mpsz_childcare_ppp)
plot(mpsz_kindergarten_ppp)
Changing the order of variables of the sf dataframe, childcare_attr and kindergarten_attr, so that the names of the centres can be displayed on the interactive map.
childcare_attr2 <- childcare_attr%>%
dplyr::select(NAME, ADDRESSBLOCKHOUSENUMBER, ADDRESSBUILDINGNAME, ADDRESSPOSTALCODE,ADDRESSSTREETNAME, ADDRESSTYPE, DESCRIPTION, LANDXADDRESSPOINT, LANDYADDRESSPOINT, INC_CRC, FMEL_UPD_D, geometry)
kindergarten_attr2 <- kindergarten_attr %>%
dplyr::select(NAME, ADDRESSBLOCKHOUSENUMBER, ADDRESSBUILDINGNAME, ADDRESSPOSTALCODE,ADDRESSSTREETNAME, ADDRESSTYPE, DESCRIPTION, LANDXADDRESSPOINT, LANDYADDRESSPOINT, INC_CRC, FMEL_UPD_D, geometry)
tmap_mode("view")
## tmap mode set to interactive viewing
a <- tm_shape(childcare_attr2)+
tm_basemap("OpenStreetMap")+
tm_bubbles(col='red', size= 0.2)
b <- tm_shape(kindergarten_attr2)+
tm_basemap("OpenStreetMap")+
tm_bubbles(col='blue', size = 0.2)
tmap_arrange(a, b, asp=1, ncol=2)
tmap_mode("plot")
## tmap mode set to plotting
childcare_sk_ppp = childcare_ppp[SK_owin]
kindergarten_sk_ppp = kindergarten_ppp[SK_owin]
childcare_bd_ppp = childcare_ppp[BD_owin]
kindergarten_bd_ppp = kindergarten_ppp[BD_owin]
childcare_bb_ppp = childcare_ppp[BB_owin]
kindergarten_bb_ppp = kindergarten_ppp[BB_owin]
childcare_hg_ppp = childcare_ppp[HG_owin]
kindergarten_hg_ppp = kindergarten_ppp[HG_owin]
Rescaling to kilometres (km)
childcare_sk_ppp.km = rescale(childcare_sk_ppp, 1000, "km")
kindergarten_sk_ppp.km = rescale(kindergarten_sk_ppp, 1000, "km")
childcare_bd_ppp.km = rescale(childcare_bd_ppp, 1000, "km")
kindergarten_bd_ppp.km = rescale(kindergarten_bd_ppp, 1000, "km")
childcare_bb_ppp.km = rescale(childcare_bb_ppp, 1000, "km")
kindergarten_bb_ppp.km = rescale(kindergarten_bb_ppp, 1000, "km")
childcare_hg_ppp.km = rescale(childcare_hg_ppp, 1000, "km")
kindergarten_hg_ppp.km = rescale(kindergarten_hg_ppp, 1000, "km")
par(mfrow= c(1,2))
plot(childcare_sk_ppp.km, main = "Sengkang childcare centres")
plot(kindergarten_sk_ppp.km, main = "Sengkang kindergartens")
Map Description:
Sengkang’s childcare centres are quite clustered especially within that area near the end
Sengkang’s kindergartens are not as clustered, since the point events are less frequent
Bedok Planning Area
par(mfrow= c(1,2))
plot(childcare_bd_ppp.km, main = "Bedok childcare centres")
plot(kindergarten_bd_ppp.km, main = "Bedok kindergartens")
Map Description:
Bedok’s childcare centres are quite randomly located even though there seems to be small clusters
Bedok kindergartens’ spatial pattern is random
Bukit Batok Planning Area
par(mfrow= c(1,2))
plot(childcare_bb_ppp.km, main = "Bukit Batok childcare centres")
plot(kindergarten_bb_ppp.km, main = "Bukit Batok kindergartens")
Map Description:
Bukit Batok childcare centres’ spatial pattern is random
Bukit Batok kindergartens’ spatial pattern is random
Hougang Planning Area
par(mfrow= c(1,2))
plot(childcare_hg_ppp.km, main = "Hougang childcare centres")
plot(kindergarten_hg_ppp.km, main = "Hougang kindergartens")
Map Description:
Hougang’s childcare centres are quite randomly located even though there seems to be small clusters
Hougang kindergartens’ spatial pattern is random
We will be using the L-Function(Besag 1977), to compute the L-function estimation and envelope() function to perform monte carlo simulation test. Both functions are from spatstat package.
We will use L-function, a transformed version of Ripley’s K-function,because it’s the most commonly used function.
L_sk_c = Lest(childcare_sk_ppp, correction = "Ripley")
plot(L_sk_c, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
To confirm our observed spatial patterns from computing the estimation, we can conduct a hypothesis test.
Null Hypothesis, H0: The distribution of childcare services in Sengkang is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Sengkang is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200, 199 because it starts from 0.
L_sk_c.csr <- envelope(childcare_sk_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
Normal way of plotting:
plot(L_sk_c.csr, .-r ~r, xlab="d", ylab= "L(d)-r")
However, we can’t zoom in to see the exact distance when the estimated L-function exceeds the envelope to indicate statistical significance.
So, we will plot a quantum plot that allows us to slide over a range of values and zoom into the estimated L-function. Help is taken from link1, link2.
quantumPlot <- function(x, title, colour=c("#d73027", "#ffffbf", "#91bfdb")){
#env.data <- as.data.frame(tree.data)
#env.data <- env.data[-1,]
df <- as.data.frame(x)
ggquantumplot <- ggplot(df, aes(r, obs-r))+
geom_line(colour=c("#4d4d4d"))+
geom_line(aes(r,theo-r), colour="red", linetype = "dashed")+
geom_ribbon(aes(ymin=lo-r,ymax=hi-r),alpha=0.1, colour=c("#91bfdb")) +
xlab("Distance r (m)") +
ylab("L(r)-r") +
geom_rug(data=df[df$obs > df$hi,], sides="b", colour=colour[1]) +
geom_rug(data=df[df$obs < df$lo,], sides="b", colour=colour[2]) +
geom_rug(data=df[df$obs >= df$lo & df$obs <= df$hi,], sides="b", color=colour[3]) +
theme_tufte()+
ggtitle(title)
txt1<-"Clustering"
txt2<-"Regular/Uniform"
txt3<-"Random"
if (nrow(df[df$obs > df$hi,])==0){
if (nrow(df[df$obs < df$lo,])==0){
ggplotly(ggquantumplot, dynamicTicks=T) %>%
style(text = txt3, traces = 4) %>%
rangeslider()
}else if (nrow(df[df$obs >= df$lo & df$obs <= Lcsr_df$hi,])==0){
ggplotly(ggquantumplot, dynamicTicks=T) %>%
style(text = txt2, traces = 4) %>%
rangeslider()
}else {
ggplotly(ggquantumplot, dynamicTicks=T) %>%
style(text = txt2, traces = 4) %>%
style(text = txt3, traces = 5) %>%
rangeslider()
}
} else if (nrow(df[df$obs < df$lo,])==0){
if (nrow(df[df$obs >= df$lo & df$obs <= df$hi,])==0){
ggplotly(ggquantumplot, dynamicTicks=T) %>%
style(text = txt1, traces = 4) %>%
rangeslider()
} else{
ggplotly(ggquantumplot, dynamicTicks=T) %>%
style(text = txt1, traces = 4) %>%
style(text = txt3, traces = 5) %>%
rangeslider()
}
} else{
ggplotly(ggquantumplot, dynamicTicks=T) %>%
style(text = txt1, traces = 4) %>%
style(text = txt2, traces = 5) %>%
style(text = txt3, traces = 6) %>%
rangeslider()
}
}
quantumPlot(L_sk_c.csr, "Pairwise Distance: L function-Sengkang Childcare")
Conclusion:
Until distance ~66m, childcare centres are not statistically significant and we do not have enough evidence to reject the null hypothesis that the distribution is random, although the estimated L-function is almost one line with the upper envelope even before 150m
After distance ~66m/ when the estimated L-function is above the empirical line and envelope, childcare centres are statistically significant & we have enough evidence to reject the null hypothesis. Hence, the distribution of childcare centres in Sengkang is clustered
L_sk_k = Lest(kindergarten_sk_ppp, correction = "Ripley")
plot(L_sk_k, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Sengkang is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Sengkang is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_sk_k.csr <- envelope(kindergarten_sk_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_sk_k.csr, "Pairwise Distance: L function-Sengkang Kindergarten")
Conclusion:
We can see that the estimated L-function keeps going in and out of the envelope, so from 0 to 188m, the distance is statistically significant there is clustering
Then from 188m to 492m, the distribution is random and it goes to clustering again
L_bd_c = Lest(childcare_bd_ppp, correction = "Ripley")
plot(L_bd_c, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Bedok is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Bedok is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_bd_c.csr <- envelope(childcare_bd_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_bd_c.csr, "Pairwise Distance: L function-Bedok Childcare")
Conclusion:
Until distance ~447m, childcare centres are not statistically significant and we do not have enough evidence to reject the null hypothesis
After distance ~447m/ when the estimated L-function is above the empirical line and envelope, childcare centres are statistically significant & we have enough evidence to reject the null hypothesis. Hence, the distribution of childcare centres in Sengkang is clustered
L_bd_k = Lest(kindergarten_bd_ppp, correction = "Ripley")
plot(L_bd_k, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Sengkang is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Sengkang is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_bd_k.csr <- envelope(kindergarten_bd_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_bd_k.csr, "Pairwise Distance: L function-Bedok Kindergarten")
Conclusion:
This plot is quite different from the rest because from distance 0 to 45m, the estimated L-function is already above the envelope, indicating that they are statistically significant & we have enough evidence to reject the null hypothesis. Hence, the distribution of kindergartens in Bedok is clustered
After distance 45m, the estimated L-function is within the envelope, indicating random distribution
L_bb_c = Lest(childcare_bb_ppp, correction = "Ripley")
plot(L_bb_c, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Bukit Batok is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Bukit Batok is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_bb_c.csr <- envelope(childcare_bb_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_bb_c.csr, "Pairwise Distance: L function-Bukit Batok Childcare")
Conclusion:
L_bb_k = Lest(kindergarten_bb_ppp, correction = "Ripley")
plot(L_bb_k, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Bukit Batok is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Bukit Batok is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_bb_k.csr <- envelope(kindergarten_bb_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_bb_k.csr, "Pairwise Distance: L function-Bukit Batok Kindergarten")
Conclusion:
L_hg_c = Lest(childcare_hg_ppp, correction = "Ripley")
plot(L_hg_c, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Hougang is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Hougang is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_hg_c.csr <- envelope(childcare_hg_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_hg_c.csr, "Pairwise Distance: L function-Hougang Childcare")
Conclusion:
Until distance ~674m, childcare centres are not statistically significant and we do not have enough evidence to reject the null hypothesis that the distribution is random, although the estimated L-function is almost one line with the upper envelope even before 150m
After distance ~674m/ when the estimated L-function is above the empirical line and envelope, childcare centres are statistically significant & we have enough evidence to reject the null hypothesis. Hence, the distribution of childcare centres in Sengkang is clustered
L_hg_k = Lest(kindergarten_hg_ppp, correction = "Ripley")
plot(L_hg_k, .-r ~r,
ylab= "L(d)-r", xlab= "d(m)")
Null Hypothesis, H0: The distribution of childcare services in Hougang is randomly distributed.
Alternative Hypothesis, H1:The distribution of childcare services in Hougang is not randomly distributed.
We set the alpha value to be 0.01. Since it’s a two tailed test, the nsim, number of simulation will be at least 2/0.01 = 200.
L_hg_k.csr <- envelope(kindergarten_hg_ppp, Lest, nsim = 199, rank =1, glocal = TRUE)
quantumPlot(L_hg_k.csr, "Pairwise Distance: L function-Hougang Kindergarten")
Conclusion:
This plot is quite different from the rest because from distance 0 to 58m, the estimated L-function is already above the envelope, indicating that they are statistically significant & we have enough evidence to reject the null hypothesis. Hence, the distribution of kindergartens in Bedok is clustered
After distance 58m, the estimated L-function is within the envelope, indicating random distribution
We can see that most of the distributions of childcare centres and kindergartens are equally clustered and random, and a mix of both derived from the quantum plot map.
To create KDE map, we use the density() function of spatstat and bw.diggle() for automatic bandwidth selection method.
For some of the visualizations, the kernels are manually chosen to visualize the density better.
kde_childcare_sk_bw <- density(childcare_sk_ppp.km, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
kde_kindergartens_sk_bw <- density(kindergarten_sk_ppp.km, sigma = 0.25, edge = TRUE, kernel = "gaussian")
par(mfrow=c(1,2))
plot(kde_childcare_sk_bw, main = "Sengkang's Childcare Density")
plot(kde_kindergartens_sk_bw, main = "Sengkang's Kindergarten Density")
kde_childcare_bd_bw <- density(childcare_bd_ppp.km, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
kde_kindergartens_bd_bw <- density(kindergarten_bd_ppp.km, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
par(mfrow=c(1,2))
plot(kde_childcare_bd_bw, main = "Bedok's Childcare Density")
plot(kde_kindergartens_bd_bw, main = "Bedok's Kindergarten Density")
kde_childcare_bb_bw <- density(childcare_bb_ppp.km, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
kde_kindergartens_bb_bw <- density(kindergarten_bb_ppp.km, sigma = 0.25, edge = TRUE, kernel = "gaussian")
par(mfrow=c(1,2))
plot(kde_childcare_bb_bw, main = "Bukit Batok's Childcare Density")
plot(kde_kindergartens_bb_bw, main = "Bukit Batok's Kindergarten Density")
kde_childcare_hg_bw <- density(childcare_hg_ppp.km, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
kde_kindergartens_hg_bw <- density(kindergarten_hg_ppp.km, sigma = 0.25, edge = TRUE, kernel = "gaussian")
par(mfrow=c(1,2))
plot(kde_childcare_hg_bw, main = "Hougang's Childcare Density")
plot(kde_kindergartens_hg_bw, main = "Hougang's Kindergarten Density")
Firstly, we have to convert the KDE output into grid object using as.SpatialGridDataFrame.im() function so that we can turn it into raster object using raster() function.
gridded_kde_childcare_sk_bw <- as.SpatialGridDataFrame.im(kde_childcare_sk_bw)
kde_childcare_sk_bw_raster <- raster(gridded_kde_childcare_sk_bw)
kde_childcare_sk_bw_raster
## class : RasterLayer
## dimensions : 128, 128, 16384 (nrow, ncol, ncell)
## resolution : 0.05660914, 0.0220703 (x, y)
## extent : 30.12219, 37.36816, 39.76083, 42.58583 (xmin, xmax, ymin, ymax)
## crs : NA
## source : memory
## names : v
## values : -1.133223e-14, 53.42105 (min, max)
Then, we assign the projection system to the raster object using CRS() function so that when we map the raster object, it will return the accurate longtitude & latitude.
crs(kde_childcare_sk_bw_raster) <- CRS('+init=EPSG:3414')
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(kde_childcare_sk_bw_raster) +
tm_basemap("OpenStreetMap")+
tm_raster('v')+
tm_layout(legend.position = c('left', 'bottom'), frame = FALSE)
## legend.postion is used for plot mode. Use view.legend.position in tm_view to set the legend position in view mode.
## Variable(s) "v" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
We will be creating a function to plot the KDE output using tmap package.
tmap_mode("plot")
## tmap mode set to plotting
KDE_tmap <- function(x,y){
tm_shape(x)+
tm_basemap("OpenStreetMap")+
tm_raster(y)+
tm_layout(legend.height = 0.45,
legend.width = 0.35,
frame = FALSE) +
tm_compass(type="8star", size = 2) +
tm_scale_bar(width = 0.15)
}
KDE_tmap(kde_childcare_sk_bw_raster, "v")
## Variable(s) "v" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
gridded_kde_kindergarten_sk_bw <- as.SpatialGridDataFrame.im(kde_kindergartens_sk_bw)
kde_kindergarten_sk_bw_raster <- raster(gridded_kde_kindergarten_sk_bw)
crs(kde_kindergarten_sk_bw_raster) <- CRS('+init=EPSG:3414')
KDE_tmap(kde_kindergarten_sk_bw_raster, "v")
## Variable(s) "v" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
gridded_kde_childcare_bd_bw <- as.SpatialGridDataFrame.im(kde_childcare_bd_bw)
kde_childcare_bd_bw_raster <- raster(gridded_kde_childcare_bd_bw)
gridded_kde_kindergarten_bd_bw <- as.SpatialGridDataFrame.im(kde_kindergartens_bd_bw)
kde_kindergarten_bd_bw_raster <- raster(gridded_kde_kindergarten_bd_bw)
crs(kde_childcare_bd_bw_raster) <- CRS('+init=EPSG:3414')
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum SVY21 in CRS definition
crs(kde_kindergarten_bd_bw_raster) <- CRS('+init=EPSG:3414')
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum SVY21 in CRS definition
tmap_mode("plot")
## tmap mode set to plotting
par(mfrow=c(1,2))
KDE_tmap(kde_childcare_bd_bw_raster, "v")
KDE_tmap(kde_kindergarten_bd_bw_raster, "v")
gridded_kde_childcare_bb_bw <- as.SpatialGridDataFrame.im(kde_childcare_bb_bw)
kde_childcare_bb_bw_raster <- raster(gridded_kde_childcare_bb_bw)
gridded_kde_kindergarten_bb_bw <- as.SpatialGridDataFrame.im(kde_kindergartens_bb_bw)
kde_kindergarten_bb_bw_raster <- raster(gridded_kde_kindergarten_bb_bw)
crs(kde_childcare_bb_bw_raster) <- CRS('+init=EPSG:3414')
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum SVY21 in CRS definition
crs(kde_kindergarten_bb_bw_raster) <- CRS('+init=EPSG:3414')
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum SVY21 in CRS definition
tmap_mode("plot")
## tmap mode set to plotting
par(mfrow=c(1,2))
KDE_tmap(kde_childcare_bb_bw_raster, "v")
KDE_tmap(kde_kindergarten_bb_bw_raster, "v")
gridded_kde_childcare_hg_bw <- as.SpatialGridDataFrame.im(kde_childcare_hg_bw)
kde_childcare_hg_bw_raster <- raster(gridded_kde_childcare_hg_bw)
gridded_kde_kindergarten_hg_bw <- as.SpatialGridDataFrame.im(kde_kindergartens_hg_bw)
kde_kindergarten_hg_bw_raster <- raster(gridded_kde_kindergarten_hg_bw)
crs(kde_childcare_hg_bw_raster) <- CRS('+init=EPSG:3414')
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum SVY21 in CRS definition
crs(kde_kindergarten_hg_bw_raster) <- CRS('+init=EPSG:3414')
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum SVY21 in CRS definition
tmap_mode("plot")
## tmap mode set to plotting
par(mfrow=c(1,2))
KDE_tmap(kde_childcare_hg_bw_raster, "v")
KDE_tmap(kde_kindergarten_hg_bw_raster, "v")
The advantages of KDE over point maps are:
However, point maps also have some aspects which are useful, such as showing the variable text as we hover over the point events, it’s also the accurate location of each point events.
From the assignment, we can see that not all planning subzone areas are sufficiently served by the preschool facilities (from the preschool density data)
From the assignment, we can see some new planning subzone areas that are more appealing for the new families, from the rising children population in certain planning subzone areas
From the assignment, we can see that some CBD and downtown areas have high preschool density, indicating some trends of maybe parents dropping off their children before going to work. There are other trends that influence our data
From the assignment, we can also see that there are more childcare centres than kindergarten, due to the trend of busy workng parents, also stated in ECDA’s website
The author learnt the difference between sp objects and sf dataframe, and the functions used to handle different data objects
The author learnt how to handle KML file, getting the important information from it
The author learnt the hard way about manipulating csv dataframe from dividing the age variable and creating new columns
The author learnt more about the different spatial point pattern analysis functions and the plotting different kind of maps, be it ggplot, tmap and ggmap
Most importantly, the author becomes more confident in coding geospatial data in R