1 Introduction

1.1 Objectives of Study

  1. To gain understanding on the supply and demand of childcare services in 2017 and 2020 at planning subzone level.
  2. To gain understanding on the geographic distribution of childcare in 2017 and 2020 in Singapore.

1.2 Tasks

Section A: The supply and demand of childcare and kindergarten services by planning subzone

  1. Exploratory Spatial Data Analysis:

    o Using appropriate EDA and choropleth mapping techniques to reveal the supply and demand of childcare services in 2017 and 2020 at the planning subzone level. Describe the spatial patterns observed.

  2. Analytics mapping:

    o Using appropriate analytics mapping techniques to reveal the temporal changes of the childcare services at the planning subzone level.

  3. Geocommunication:

    o Describe the results of (1) and (2) and draw statistical conclusions.

Section B: Spatial Point Pattern Analysis

  1. Exploratory Spatial Data Analysis:

    o Using point mapping techniques, display the location of childcare services in 2019 and 2020 at the national level. Describe the spatial patterns reveal by their respective distribution.

  2. With reference to the spatial point patterns observed in (4):

    o Formulate the null hypothesis and alternative hypothesis and select the confidence level.

    o Perform the test by using appropriate 2nd order spatial point patterns analysis technique.

    o With reference to the analysis results, draw statistical conclusions.

  3. With reference to the results derived in (4) and (5):

    o Derive kernel density maps of childcare services in 2017 and 2020.

    o Using appropriate tmap functions, display the kernel density maps on openstreetmap of Singapore.

    o With reference to the analysis results, draw statistical conclusions.

    o Compare the advantages of kernel density maps (6) over point maps (4).

2 Getting Started

2.1 Data Acquisition

  1. Childcare Services Data in 2017 (Provided in hands-on exercise)

  2. Latest Childcare Data Set from data.gov.sg.

    Source: https://data.gov.sg/dataset/child-care-services

  3. URA Master Plan 2014 Subzone Boundary Web from data.gov.sg.

    Source: https://data.gov.sg/dataset/master-plan-2014-subzone-boundary-web

  4. Singapore Resident By Subzone and Type of Dwelling 2011 to 2019 from data.gov.sg.

    Source: https://data.gov.sg/dataset/singapore-residents-by-subzone-and-type-of-dwelling-2011-2019

2.2 Definition of Childcare Centres According to Early Childhood Development Agency (ECDA)

Child care centres provide child care services and pre-school developmental programmes for children aged between 18 months and below 7 years old. Several centres also provide infant care programmes for infants aged between 2 and 18 months old. Centres may offer full-day, half-day and flexible programmes to cater to the different working schedules of parents.

Source: https://www.ecda.gov.sg/pages/aboutus.aspx

2.3 Installing R Packages

These are the packages installed in this assignment.

packages = c('sp', 'rgdal', 'rgeos', 'sf', 'tidyverse','tmap','maptools','raster','spatstat','OpenStreetMap','rpostgis','dplyr','plyr')
for (p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p,character.only = T)
}
## Loading required package: sp
## Loading required package: rgdal
## rgdal: version: 1.5-16, (SVN revision 1050)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
## Path to GDAL shared files: C:/Users/jingwei/Documents/R/win-library/4.0/rgdal/gdal
## GDAL binary built with GEOS: TRUE 
## Loaded PROJ runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION: 631]
## Path to PROJ shared files: C:/Users/jingwei/Documents/R/win-library/4.0/rgdal/proj
## Linking to sp version:1.4-2
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
## Loading required package: rgeos
## rgeos version: 0.5-3, (SVN revision 634)
##  GEOS runtime version: 3.8.0-CAPI-1.13.1 
##  Linking to sp version: 1.4-2 
##  Polygon checking: TRUE
## Loading required package: sf
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
## Loading required package: tidyverse
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.2
## v tidyr   1.1.1     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
## Loading required package: tmap
## Loading required package: maptools
## Checking rgeos availability: TRUE
## Loading required package: raster
## 
## Attaching package: 'raster'
## The following object is masked from 'package:dplyr':
## 
##     select
## The following object is masked from 'package:tidyr':
## 
##     extract
## Loading required package: spatstat
## Loading required package: spatstat.data
## Loading required package: nlme
## 
## Attaching package: 'nlme'
## The following object is masked from 'package:raster':
## 
##     getData
## The following object is masked from 'package:dplyr':
## 
##     collapse
## Loading required package: rpart
## Registered S3 method overwritten by 'spatstat':
##   method     from
##   print.boxx cli
## 
## spatstat 1.64-1       (nickname: 'Help you I can, yes!') 
## For an introduction to spatstat, type 'beginner'
## 
## Note: spatstat version 1.64-1 is out of date by more than 4 months; we recommend upgrading to the latest version.
## 
## Attaching package: 'spatstat'
## The following objects are masked from 'package:raster':
## 
##     area, rotate, shift
## Loading required package: OpenStreetMap
## Loading required package: rpostgis
## Loading required package: RPostgreSQL
## Loading required package: DBI
## Loading required package: plyr
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following object is masked from 'package:purrr':
## 
##     compact

2.4 Importing Geospatial / Aspatial Data into R

There are 4 main datasets that will be imported:

For Geospatial:

  1. mpsz: URA Master Plan 2014 Subzone Boundary Web from data.gov.sg

  2. childcare_2017: Childcare Services Data in 2017 (Provided in hands-on exercise)

  3. childcare_2020: Latest Childcare Data Set from data.gov.sg

For Aspatial:

  1. sg_residents: Singapore Resident By Subzone and Type of Dwelling 2011 to 2019 from data.gov.sg

These data will be checked and transform / assign if necessary.

2.4.1 mpsz

mpsz_sf <- st_read(dsn = "data/geospatial", 
                layer = "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `C:\IS415\Assignment1\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21
mpsz_sf
## Simple feature collection with 323 features and 15 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS:  SVY21
## First 10 features:
##    OBJECTID SUBZONE_NO       SUBZONE_N SUBZONE_C CA_IND      PLN_AREA_N
## 1         1          1    MARINA SOUTH    MSSZ01      Y    MARINA SOUTH
## 2         2          1    PEARL'S HILL    OTSZ01      Y          OUTRAM
## 3         3          3       BOAT QUAY    SRSZ03      Y SINGAPORE RIVER
## 4         4          8  HENDERSON HILL    BMSZ08      N     BUKIT MERAH
## 5         5          3         REDHILL    BMSZ03      N     BUKIT MERAH
## 6         6          7  ALEXANDRA HILL    BMSZ07      N     BUKIT MERAH
## 7         7          9   BUKIT HO SWEE    BMSZ09      N     BUKIT MERAH
## 8         8          2     CLARKE QUAY    SRSZ02      Y SINGAPORE RIVER
## 9         9         13 PASIR PANJANG 1    QTSZ13      N      QUEENSTOWN
## 10       10          7       QUEENSWAY    QTSZ07      N      QUEENSTOWN
##    PLN_AREA_C       REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR
## 1          MS CENTRAL REGION       CR 5ED7EB253F99252E 2014-12-05 31595.84
## 2          OT CENTRAL REGION       CR 8C7149B9EB32EEFC 2014-12-05 28679.06
## 3          SR CENTRAL REGION       CR C35FEFF02B13E0E5 2014-12-05 29654.96
## 4          BM CENTRAL REGION       CR 3775D82C5DDBEFBD 2014-12-05 26782.83
## 5          BM CENTRAL REGION       CR 85D9ABEF0A40678F 2014-12-05 26201.96
## 6          BM CENTRAL REGION       CR 9D286521EF5E3B59 2014-12-05 25358.82
## 7          BM CENTRAL REGION       CR 7839A8577144EFE2 2014-12-05 27680.06
## 8          SR CENTRAL REGION       CR 48661DC0FBA09F7A 2014-12-05 29253.21
## 9          QT CENTRAL REGION       CR 1F721290C421BFAB 2014-12-05 22077.34
## 10         QT CENTRAL REGION       CR 3580D2AFFBEE914C 2014-12-05 24168.31
##      Y_ADDR SHAPE_Leng SHAPE_Area                       geometry
## 1  29220.19   5267.381  1630379.3 MULTIPOLYGON (((31495.56 30...
## 2  29782.05   3506.107   559816.2 MULTIPOLYGON (((29092.28 30...
## 3  29974.66   1740.926   160807.5 MULTIPOLYGON (((29932.33 29...
## 4  29933.77   3313.625   595428.9 MULTIPOLYGON (((27131.28 30...
## 5  30005.70   2825.594   387429.4 MULTIPOLYGON (((26451.03 30...
## 6  29991.38   4428.913  1030378.8 MULTIPOLYGON (((25899.7 297...
## 7  30230.86   3275.312   551732.0 MULTIPOLYGON (((27746.95 30...
## 8  30222.86   2208.619   290184.7 MULTIPOLYGON (((29351.26 29...
## 9  29893.78   6571.323  1084792.3 MULTIPOLYGON (((20996.49 30...
## 10 30104.18   3454.239   631644.3 MULTIPOLYGON (((24472.11 29...
mpsz_sf_3414 <- st_set_crs(mpsz_sf, 3414)
## Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
## that
st_crs(mpsz_sf_3414)
## Coordinate Reference System:
##   User input: EPSG:3414 
##   wkt:
## PROJCRS["SVY21 / Singapore TM",
##     BASEGEOGCRS["SVY21",
##         DATUM["SVY21",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4757]],
##     CONVERSION["Singapore Transverse Mercator",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",1.36666666666667,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",103.833333333333,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",1,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",28001.642,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",38744.572,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["northing (N)",north,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["easting (E)",east,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["unknown"],
##         AREA["Singapore"],
##         BBOX[1.13,103.59,1.47,104.07]],
##     ID["EPSG",3414]]

2.4.2 sg_residents

sg_residents <- read_csv("data/aspatial/planning-area-subzone-age-group-sex-and-type-of-dwelling-june-2011-2019.csv")
## Parsed with column specification:
## cols(
##   planning_area = col_character(),
##   subzone = col_character(),
##   age_group = col_character(),
##   sex = col_character(),
##   type_of_dwelling = col_character(),
##   resident_count = col_double(),
##   year = col_double()
## )

2.4.3 childcare_2017

childcare_2017_sf <- st_read(dsn = "data/geospatial", 
                             layer = "CHILDCARE")
## Reading layer `CHILDCARE' from data source `C:\IS415\Assignment1\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 1312 features and 18 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 11203.01 ymin: 25667.6 xmax: 45404.24 ymax: 49300.88
## projected CRS:  SVY21
childcare_2017_sf_3414 <- st_set_crs(childcare_2017_sf, 3414)
## Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
## that
st_crs(childcare_2017_sf_3414)
## Coordinate Reference System:
##   User input: EPSG:3414 
##   wkt:
## PROJCRS["SVY21 / Singapore TM",
##     BASEGEOGCRS["SVY21",
##         DATUM["SVY21",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4757]],
##     CONVERSION["Singapore Transverse Mercator",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",1.36666666666667,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",103.833333333333,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",1,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",28001.642,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",38744.572,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["northing (N)",north,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["easting (E)",east,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["unknown"],
##         AREA["Singapore"],
##         BBOX[1.13,103.59,1.47,104.07]],
##     ID["EPSG",3414]]

2.4.4 childcare_2020

childcare_2020_sf <- st_read("data/geospatial/child-care-services-kml.kml")
## Reading layer `CHILDCARE' from data source `C:\IS415\Assignment1\Take-home_Ex01\data\geospatial\child-care-services-kml.kml' using driver `KML'
## Simple feature collection with 1545 features and 2 fields
## geometry type:  POINT
## dimension:      XYZ
## bbox:           xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
## z_range:        zmin: 0 zmax: 0
## geographic CRS: WGS 84
childcare_2020_sf_3414 <- st_transform(childcare_2020_sf, 3414)
st_crs(childcare_2020_sf_3414)
## Coordinate Reference System:
##   User input: EPSG:3414 
##   wkt:
## PROJCRS["SVY21 / Singapore TM",
##     BASEGEOGCRS["SVY21",
##         DATUM["SVY21",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4757]],
##     CONVERSION["Singapore Transverse Mercator",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",1.36666666666667,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",103.833333333333,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",1,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",28001.642,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",38744.572,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["northing (N)",north,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["easting (E)",east,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["unknown"],
##         AREA["Singapore"],
##         BBOX[1.13,103.59,1.47,104.07]],
##     ID["EPSG",3414]]

3 Section A

3.1 Overview of Datasets

In this section we are shown some important unfiltered figures in the datasets and an idea of what the current trend looks like.

Number of sg residents over the years

We can see that the number of residents has been increasing over the years (2011 to 2019).

sg_resident_sum <- tapply(sg_residents$resident_count, sg_residents$year, FUN=sum)

sg_resident_sum_df <- as.data.frame.table(sg_resident_sum)

colnames(sg_resident_sum_df) <- c("Year", "Residents")
sg_resident_sum_df
##   Year Residents
## 1 2011   3794150
## 2 2012   3824120
## 3 2013   3849790
## 4 2014   3875700
## 5 2015   3909000
## 6 2016   3940320
## 7 2017   3973220
## 8 2018   4001110
## 9 2019   4033420
ggplot(data = sg_resident_sum_df, aes(x = Year, y = Residents))+
  geom_point(color = "#00AFBB", size = 2) + ggtitle("Number of Residents Over the Years (2011 to 2019)")

Number of childcare centers over the years

We can see that there is an increase in number of childcare between 2017 and 2019.

no_of_childcare <- c(nrow(childcare_2017_sf_3414), nrow(childcare_2020_sf_3414))
childcare_year <- c('2017','2020')

plot(childcare_year,no_of_childcare,"o", main = "Number of Childcare Over the Years (2017 and 2020)")

3.2 Filtering and comparisons

In this section, we will be filtering the datasets and making comparisons with the maps generated.

First, we will be filtering sg_residents dataset by year and resident count that is above 0. So as to generate two sets of data, 1 for residents in 2017 and 1 for residents in 2019. Additionally, we will be categorising the age groups into YOUNG, ECONOMY ACTIVE, AGED and TOTAl.

Do note that YOUNG age group, we will be grouping ‘0_to_4’ and ‘5_to_9’. As according to ECDA’s definition, childcare services are for children aged between 18 months and below 7 years old. So the nearest available grouping will be between ‘0_to_4’ and ‘5_to_9’.

We also added a YOUNGRATIO column which is calculated by YOUNG / TOTAL, so as to give us a perspective of the ratio of young to the total number of residents in an area.

We will then join the mpsz dataset with the new sg resident dataset using left join with the unique identifier being subzone.

3.2.1 Filter sg_residents data and joining with mpsz

sg_residents_2017 <- sg_residents %>%
  filter(resident_count > 0) %>%
  filter(year == 2017) %>%
  spread(age_group,resident_count, fill = 0) %>%
  mutate(YOUNG = `0_to_4`+`5_to_9`) %>%
  mutate(`ECONOMY ACTIVE` = `25_to_29` + `30_to_34` + `35_to_39` + `40_to_44` + `45_to_49` + `50_to_54` + `55_to_59` + `60_to_64`) %>%
  mutate(`AGED`= `65_to_69` + `70_to_74` + `75_to_79` + `80_to_84` + `85_to_89` + `90_and_over`) %>%
  mutate(`TOTAL`=rowSums(.[6:24])) %>%
  mutate(`YOUNGRATIO` = (`YOUNG` )/`TOTAL`) %>%
  mutate_if(is.character, str_to_upper) %>%
  dplyr::select(`planning_area`, `subzone`, `sex`,`type_of_dwelling`,`year`,`YOUNG`, `ECONOMY ACTIVE`, `AGED`,`TOTAL`,`YOUNGRATIO`) %>%
  mutate(YOUNG = as.integer(YOUNG)) %>%
  mutate(TOTAL = as.integer(TOTAL))
sg_residents_2017
## # A tibble: 1,897 x 10
##    planning_area subzone sex   type_of_dwelling  year YOUNG `ECONOMY ACTIVE`
##    <chr>         <chr>   <chr> <chr>            <dbl> <int>            <dbl>
##  1 ANG MO KIO    ANG MO~ FEMA~ CONDOMINIUMS AN~  2017   100              560
##  2 ANG MO KIO    ANG MO~ FEMA~ HDB 3-ROOM FLATS  2017    10              170
##  3 ANG MO KIO    ANG MO~ FEMA~ HDB 4-ROOM FLATS  2017    40              230
##  4 ANG MO KIO    ANG MO~ FEMA~ HDB 5-ROOM AND ~  2017    90              590
##  5 ANG MO KIO    ANG MO~ MALES CONDOMINIUMS AN~  2017   100              500
##  6 ANG MO KIO    ANG MO~ MALES HDB 3-ROOM FLATS  2017    20              140
##  7 ANG MO KIO    ANG MO~ MALES HDB 4-ROOM FLATS  2017    40              190
##  8 ANG MO KIO    ANG MO~ MALES HDB 5-ROOM AND ~  2017    90              530
##  9 ANG MO KIO    CHENG ~ FEMA~ HDB 1- AND 2-RO~  2017    20              220
## 10 ANG MO KIO    CHENG ~ FEMA~ HDB 3-ROOM FLATS  2017   370             4640
## # ... with 1,887 more rows, and 3 more variables: AGED <dbl>, TOTAL <int>,
## #   YOUNGRATIO <dbl>
mpsz_filtered_2017 <- left_join(mpsz_sf_3414, sg_residents_2017, by = c("SUBZONE_N" = "subzone"))
mpsz_filtered_2017$YOUNG[is.na(mpsz_filtered_2017$YOUNG)] <- 0
mpsz_filtered_2017$TOTAL[is.na(mpsz_filtered_2017$TOTAL)] <- 0
mpsz_filtered_2017_1 <- mpsz_filtered_2017 %>%
  filter(YOUNG != 0)
sg_residents_2019 <- sg_residents %>%
  filter(resident_count > 0) %>%
  filter(year == 2019) %>%
  spread(age_group,resident_count, fill = 0) %>%
  mutate(YOUNG = `0_to_4`+`5_to_9`) %>%
  mutate(`ECONOMY ACTIVE` = `25_to_29` + `30_to_34` + `35_to_39` + `40_to_44` + `45_to_49` + `50_to_54` + `55_to_59` + `60_to_64`) %>%
  mutate(`AGED`= `65_to_69` + `70_to_74` + `75_to_79` + `80_to_84` + `85_to_89` + `90_and_over`) %>%
  mutate(`TOTAL`=rowSums(.[6:24])) %>%
  mutate(`YOUNGRATIO` = (`YOUNG` )/`TOTAL`) %>%
  mutate_if(is.character, str_to_upper) %>%
  dplyr::select(`planning_area`, `subzone`, `sex`,`type_of_dwelling`,`year`,`YOUNG`, `ECONOMY ACTIVE`, `AGED`,`TOTAL`,`YOUNGRATIO`) %>%
  mutate(YOUNG = as.integer(YOUNG)) %>%
  mutate(TOTAL = as.integer(TOTAL))
sg_residents_2019
## # A tibble: 1,924 x 10
##    planning_area subzone sex   type_of_dwelling  year YOUNG `ECONOMY ACTIVE`
##    <chr>         <chr>   <chr> <chr>            <dbl> <int>            <dbl>
##  1 ANG MO KIO    ANG MO~ FEMA~ CONDOMINIUMS AN~  2019   100              600
##  2 ANG MO KIO    ANG MO~ FEMA~ HDB 3-ROOM FLATS  2019    20              160
##  3 ANG MO KIO    ANG MO~ FEMA~ HDB 4-ROOM FLATS  2019    30              200
##  4 ANG MO KIO    ANG MO~ FEMA~ HDB 5-ROOM AND ~  2019    60              530
##  5 ANG MO KIO    ANG MO~ MALES CONDOMINIUMS AN~  2019   110              510
##  6 ANG MO KIO    ANG MO~ MALES HDB 3-ROOM FLATS  2019    20              140
##  7 ANG MO KIO    ANG MO~ MALES HDB 4-ROOM FLATS  2019    20              160
##  8 ANG MO KIO    ANG MO~ MALES HDB 5-ROOM AND ~  2019    70              480
##  9 ANG MO KIO    ANG MO~ MALES OTHERS            2019     0               10
## 10 ANG MO KIO    CHENG ~ FEMA~ HDB 1- AND 2-RO~  2019    20              230
## # ... with 1,914 more rows, and 3 more variables: AGED <dbl>, TOTAL <int>,
## #   YOUNGRATIO <dbl>
mpsz_filtered_2019 <- left_join(mpsz_sf_3414, sg_residents_2019, by = c("SUBZONE_N" = "subzone"))
mpsz_filtered_2019$YOUNG[is.na(mpsz_filtered_2019$YOUNG)] <- 0
mpsz_filtered_2019$TOTAL[is.na(mpsz_filtered_2019$TOTAL)] <- 0
mpsz_filtered_2019_1 <- mpsz_filtered_2019 %>%
  filter(YOUNG != 0)

3.2.2 Comparisons of maps for changes in young ratio and distribution

In this section, we will be comparing the maps generated for both young ratio and young distribution in 2017 and 2020 respectively.

young_ratio_map_2017 <- tm_shape(mpsz_filtered_2017_1)+
  tm_layout(main.title = "Distribution of Young Ratio in 2017",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_fill("YOUNGRATIO",
          style = "quantile",
          palette = "Greens",
          title = "Ratio of Young") +
  tm_borders(alpha = 0.5)
young_ratio_map_2019 <-tm_shape(mpsz_filtered_2019_1)+
  tm_layout(main.title = "Distribution of Young Ratio in 2019",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_fill("YOUNGRATIO",
          style = "quantile",
          palette = "Greens",
          title = "Ratio of Young") +
  tm_borders(alpha = 0.5)

Ratio of Young by Planning Subzone in 2017 and 2019 Map Comparisons

tmap_arrange(young_ratio_map_2017,young_ratio_map_2019)

From the 2 maps above, you can see that the distribution of young ratio is not even across the island. Some of the subzones have a change in ratio of young to total residents while other some subzone remains the same as in 2017. The changes in subzone usually affects the neighbouring subzone as it can be seen that usually when a subzone’s distribution decreases, the neighbouring subzone distribution increases, likewise when one increases the other decreases. There are also subzones that are not filled as these areas are usually occupied by forests or water bodies such as reservoirs.

young_map_2017 <- tm_shape(mpsz_filtered_2017_1)+
  tm_layout(main.title = "Distribution of Young in 2017",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Greens",
          title = "Number of Young") +
  tm_borders(alpha = 0.5)
young_map_2019 <- tm_shape(mpsz_filtered_2019_1)+
  tm_layout(main.title = "Distribution of Young in 2019",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Greens",
          title = "Number of Young") +
  tm_borders(alpha = 0.5)

Distribution of Young by Planning Subzone in 2017 and 2019 Map Comparisons

tmap_arrange(young_map_2017,young_map_2019)

From the above 2 maps, we can see that the number of young distribution remains relatively unchanged between 2017 and 2019 as many subzone still reflects the same degree of young in 2019. We can also see that the number of young are not evenly distributed across the island. It seems that there are areas that have more concentration of young. For instance, the north east part of the island. This could be due to more younger families staying in these areas.There are also subzones that are not filled as these areas are usually occupied by forests or water bodies such as reservoirs.

3.2.3 Comparisons of maps for changes in numbers of childcare in subzone

In this section, we will be generating a number of maps and graphs comparing the demand and supply of childcare services in subzone level.

tmap_mode("plot")
## tmap mode set to plotting
mpsz_filtered_2017$`count_subzone`<- lengths(st_intersects(mpsz_filtered_2017, childcare_2017_sf_3414))
summary(mpsz_filtered_2017$`count_subzone`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   2.000   5.000   6.487   9.000  36.000

Distribution of Childcare by Planning Subzone in 2017 and 2019 Map Comparisons

childcare_map_2017 <- tm_shape(mpsz_filtered_2017)+
  tm_layout(main.title = "Distribution of Childcare in 2017",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Greens",
          title = "Counts") +
  tm_borders(alpha = 0.5)
mpsz_filtered_2019$`count_subzone`<- lengths(st_intersects(mpsz_filtered_2019, childcare_2020_sf_3414))
summary(mpsz_filtered_2019$`count_subzone`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   3.000   6.000   7.652  11.000  42.000
childcare_map_2020 <- tm_shape(mpsz_filtered_2019)+
  tm_layout(main.title = "Distribution of Childcare in 2020",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Greens",
          title = "Counts") +
  tm_borders(alpha = 0.5)
tmap_arrange(childcare_map_2017,childcare_map_2020)

From the 2 maps above, we can see the distribution of childcare services across the island. They are not evenly distributed across the island. Certain areas such as the east and north east have a higher distribution of childcare in the subzones. This could be due to the number of younger families situated in these areas which is reflected in the previous 2 maps showing the distribution of young, similarly, it can be seen that there is a high demand for childcare in north east area which also matches the high supply of childcare. However, it seems that the western part of the island has relatively lesser supply of childcare, given that the number of young in certain subzone in the west is still considerable.

3.3 Temporal Changes of Childcare Services at the Planning Subzone Level

As there is many subzones, it is difficult to display the changes in every single one of them. Thus, in order to still give a better visualisation of the subzone changes, the changes are filtered by regions. By filtering by regions, we can better see the changes of childcare services at the planning subzone level as each region has a good amount of subzones in it.

no_of_childcare_by_region_year_2017 = ddply(mpsz_filtered_2017,~REGION_N,summarise,no_of_childcare_2017=length(count_subzone))
no_of_childcare_by_region_year_2020 = ddply(mpsz_filtered_2019,~REGION_N,summarise,no_of_childcare_2020=length(count_subzone))

no_of_childcare_by_region_year <- left_join(no_of_childcare_by_region_year_2017, no_of_childcare_by_region_year_2020, by = c("REGION_N" = "REGION_N"))

no_of_childcare_by_region_year$difference <- no_of_childcare_by_region_year$no_of_childcare_2020 - no_of_childcare_by_region_year$no_of_childcare_2017

Difference in number of childcare services between 2017 and 2020

The table and graphs show the number of childcare services in 2017 and 2019, along with the differences of the years.

no_of_childcare_by_region_year
##            REGION_N no_of_childcare_2017 no_of_childcare_2020 difference
## 1    CENTRAL REGION                  846                  839         -7
## 2       EAST REGION                  186                  195          9
## 3 NORTH-EAST REGION                  356                  362          6
## 4      NORTH REGION                  209                  215          6
## 5       WEST REGION                  390                  402         12
ggplot(data = no_of_childcare_by_region_year, aes(x = REGION_N, y = difference))+
  geom_point(color = "#00AFBB", size = 2) + ggtitle("Difference in Number of Childcare Services between 2017 and 2020")

The table and graph above shows the number of childcare in 2017 and 2020, along with the differences for each region. As there are too many subzones, we grouped them up into regions instead. The subzone within each region will be displayed in the maps below later on.

It can be seen that most regions have experienced an increase in number of childcare available. With the west region gaining the most with an increase of 12. However, there is also a drop for central region which experienced a decrease of 7. This could be due to more families wanting to have childcare services near their homes instead as there is a greater proportion of youngs in regions compared to the central regions.

mpsz_filtered_2017_without_spaces_central_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'CENTRAL REGION', count_subzone != 0)

mpsz_filtered_2017_without_spaces_east_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'EAST REGION', count_subzone != 0)

mpsz_filtered_2017_without_spaces_north_east_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'NORTH-EAST REGION', count_subzone != 0)

mpsz_filtered_2017_without_spaces_north_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'NORTH REGION', count_subzone != 0)

mpsz_filtered_2017_without_spaces_west_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'WEST REGION', count_subzone != 0)


mpsz_filtered_2019_without_spaces_central_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'CENTRAL REGION', count_subzone != 0)

mpsz_filtered_2019_without_spaces_east_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'EAST REGION', count_subzone != 0)

mpsz_filtered_2019_without_spaces_north_east_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'NORTH-EAST REGION', count_subzone != 0)

mpsz_filtered_2019_without_spaces_north_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'NORTH REGION', count_subzone != 0)

mpsz_filtered_2019_without_spaces_west_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'WEST REGION', count_subzone != 0)
tm_shape(mpsz_filtered_2017_without_spaces_central_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2017 in Central Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_central_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2020 in Central Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_east_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2017 in East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_east_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2020 in East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_north_east_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2017 in North-East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_north_east_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2020 in North-East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_north_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2017 in North Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_north_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2020 in North Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_north_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2017 in West Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_north_region) +
  tm_fill("count_subzone",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Childcare Services 2020 in West Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

From the above maps, we can see that the number of childcare services in each region is not evenly distributed. Certain subzones have a greater number of childcare while others have a less concentration. However, between 2017 and 2020, we can see that most regions has seen a increase of childcare. Looking further, we can also see that subzones that have less concentration of childcare has seen an increase in childcare, along with subzone that did not have childcares services previously. There are also subzones that have higher number of childcare concentrated together, this could be due to them being near homes or schools which brings convenience to families.

no_of_young_by_region_year_2017 = ddply(mpsz_filtered_2017,~REGION_N,summarise,no_of_young_2017= sum(YOUNG))
no_of_young_by_region_year_2019 = ddply(mpsz_filtered_2019,~REGION_N,summarise,no_of_young_2019= sum(YOUNG))

no_of_young_region_year <- left_join(no_of_young_by_region_year_2017, no_of_young_by_region_year_2019, by = c("REGION_N" = "REGION_N"))

no_of_young_region_year$difference <- no_of_young_region_year$no_of_young_2019 - no_of_young_region_year$no_of_young_2017

Difference in number of young between 2017 and 2020

The table and graphs show the number of young in 2017 and 2019, along with the differences of the years.

no_of_young_region_year
##            REGION_N no_of_young_2017 no_of_young_2019 difference
## 1    CENTRAL REGION            83530            79050      -4480
## 2       EAST REGION            59680            57020      -2660
## 3 NORTH-EAST REGION           102070           103990       1920
## 4      NORTH REGION            57000            59340       2340
## 5       WEST REGION            86880            84510      -2370
ggplot(data = no_of_young_region_year, aes(x = REGION_N, y = difference))+
  geom_point(color = "#00AFBB", size = 2) + ggtitle("Difference in Number of Young between 2017 and 2020")

The table and graph above shows the number of young in 2017 and 2020, along with the differences for each region. As there are too many subzones, we grouped them up into regions instead. The subzone within each region will be displayed in the maps below later on.

It can be seen that most regions have experienced an decrease in number of young. With the greatest decrease in the central region of 4480, along with decrease in east and west regions. However, there is also an increse in young in regions such as the north and north east of the island.

mpsz_filtered_2017_without_spaces_central_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'CENTRAL REGION', YOUNG != 0)

mpsz_filtered_2017_without_spaces_east_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'EAST REGION', YOUNG != 0)

mpsz_filtered_2017_without_spaces_north_east_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'NORTH-EAST REGION', YOUNG != 0)

mpsz_filtered_2017_without_spaces_north_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'NORTH REGION', YOUNG != 0)

mpsz_filtered_2017_without_spaces_west_region <- mpsz_filtered_2017 %>%
  filter(REGION_N == 'WEST REGION', YOUNG != 0)


mpsz_filtered_2019_without_spaces_central_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'CENTRAL REGION', YOUNG != 0)

mpsz_filtered_2019_without_spaces_east_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'EAST REGION', YOUNG != 0)

mpsz_filtered_2019_without_spaces_north_east_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'NORTH-EAST REGION', YOUNG != 0)

mpsz_filtered_2019_without_spaces_north_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'NORTH REGION', YOUNG != 0)

mpsz_filtered_2019_without_spaces_west_region <- mpsz_filtered_2019 %>%
  filter(REGION_N == 'WEST REGION', YOUNG != 0)
tm_shape(mpsz_filtered_2017_without_spaces_central_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2017 in Central Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_central_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2020 in Central Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_east_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2017 in East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_east_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2020 in East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_north_east_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2017 in North-East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_north_east_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2020 in North-East Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_north_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2017 in North Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_north_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2020 in North Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2017_without_spaces_west_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2017 in West Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

tm_shape(mpsz_filtered_2019_without_spaces_west_region) +
  tm_fill("YOUNG",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = TRUE,
            main.title = "Number of Young 2020 in West Region",
            title.position = c("center", "center"), 
            title.size = 30) +
  tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated

From the above maps, we can see that the number of young in each region is not evenly distributed. Certain subzones have a greater number of young while others have a less concentration. However, between 2017 and 2020, we can see that some subzones remain the same as in 2017 and a few has seen a decrease in young in its subzone.

Most subzones remain relatively the same as before, this could be due to the young still being in the same age group category. Also, this is expected as most families usually stays in a subzone for a long period of time. The difference in years between the 2 maps is only 3 years, thus it cannot be expected to have bigger and more obvious changes compared to the number of childcare services.

In conclusion for the temporary changes in demand and supply of childcare centres, we have seen an increase in overall childcare services of about 1.308% between 2017 and 2020 while there is a decrease in number of young of about 1.349%. This shows that the current differences in number of childcare services and young remains relatively close to each other after 3 years.

However, for certain regions that experienced a drop in young, for instance in central region, there has been a drop of 5.36% of young but the decrease in childcare is only about 0.827% in that region. Also, in another instance, the number of young in the west region decreased by 2.72% but there is an increase of childcare services of about 3.076%. Thus it shows that the supply of childcare is currently not entirely based on the demand of childcare services in the region and certain areas have a bigger disparity between supply and demand of childcare service.

4 Section B

4.1 Check Spatial data and convert into sp format

In this section, we have to convert the data in ppp object form. However, as there is no direct way to convert, we will convert SpatialDataFrame into Spatial objects first.

childcare_2017_sp <- readOGR(dsn = "data/geospatial", layer="CHILDCARE")
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\IS415\Assignment1\Take-home_Ex01\data\geospatial", layer: "CHILDCARE"
## with 1312 features
## It has 18 fields
childcare_2017_sp <- as(childcare_2017_sp, "SpatialPoints")
childcare_2020_sp <- as(childcare_2020_sf_3414, "Spatial") 
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
##  but +towgs84= values preserved
childcare_2020_sp <- as(childcare_2020_sp, "SpatialPoints") 
mpsz_filtered_sp <- as(mpsz_sf_3414, "Spatial") 
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
##  but +towgs84= values preserved
crs(childcare_2017_sp)
## CRS arguments:
##  +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs
crs(childcare_2020_sp)
## CRS arguments:
##  +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0
## +units=m +no_defs
crs(mpsz_filtered_sp)
## CRS arguments:
##  +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0
## +units=m +no_defs

4.2 Location of childcare services in 2017 and 2020 at national level

In this section, we will generate the locations of childcare services in 2017 and 2020 at national level.

Location of childcare services in 2017

plot(mpsz_filtered_sp, border="darkgrey")+
plot(childcare_2017_sp, add=TRUE)

## integer(0)

Location of childcare services in 2020

plot(mpsz_filtered_sp, border="darkgrey")+
plot(childcare_2020_sp, add=TRUE)

## integer(0)

4.3 Description of spatial patterns by their respective distribution

From the 2 maps shown above, we can see the location of childcare services represented by “+” at the national level. There seems to be a greater number of childcare services in terms of location as there are more points seen on the 2020 map. There are also areas where there is no childcare services available. This are areas that are occupied by water bodies such as reservoirs and forests and also heavily concentrated industrial regions. Also, from the map, it can be seen that the childcare services are not evenly distributed and are clustered together especially in the north east region of the island.

4.4 Convert into ppp format and check for duplicated points

In this section, we will convert objects into ppp format. After which we will need to check for duplicated points and handle them using jitter if necessary.

childcare_ppp_2017 <- as(childcare_2017_sp, "ppp")
childcare_ppp_2020 <- as(childcare_2020_sp, "ppp")
any(duplicated(childcare_ppp_2017))
## [1] TRUE
any(duplicated(childcare_ppp_2020))
## [1] TRUE
childcare_ppp_jit_2017 <- rjitter(childcare_ppp_2017, retry=TRUE, nsim=1, drop=TRUE)
childcare_ppp_jit_2020 <- rjitter(childcare_ppp_2020, retry=TRUE, nsim=1, drop=TRUE)
sum(multiplicity(childcare_ppp_jit_2017) > 1)
## [1] 0
sum(multiplicity(childcare_ppp_jit_2020) > 1)
## [1] 0

4.3 Extract Planning Area and Whole of Singapore for 2017 & 2019

In this section, we need to extract 4 planning areas: SENGKANG, BEDOK, BUKIT BATOK, HOUGANG and whole of singapore for the kernel density maps later on.

planning_areas <- mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N %in% c("SENGKANG","BEDOK","BUKIT BATOK", "HOUGANG"),]

whole_sg <- mpsz_filtered_sp


sk = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "SENGKANG",]
bd = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "BEDOK",]
bb = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "BUKIT BATOK",]
hg = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "HOUGANG",]

4.4 Convert spatial point data frame and create owin objects

planning_areas_sp <- as(planning_areas, "SpatialPolygons")
whole_sg_sp <- as(whole_sg, "SpatialPolygons")

childcare_sp_2017 = as(childcare_ppp_jit_2017, "SpatialPoints")
childcare_sp_2020 = as(childcare_2020_sp, "SpatialPoints")
sk_sp = as(sk, "SpatialPolygons")
bd_sp = as(bd, "SpatialPolygons")
bb_sp = as(bb, "SpatialPolygons")
hg_sp = as(hg, "SpatialPolygons")
planning_areas_owin <- as(planning_areas_sp, "owin")
whole_sg_owin <- as(whole_sg_sp, "owin")

sk_owin = as(sk_sp, "owin")
bd_owin = as(bd_sp, "owin")
bb_owin = as(bb_sp, "owin")
hg_owin = as(hg_sp, "owin")
childcare_ppp_2017_pa <- childcare_ppp_jit_2017[planning_areas_owin]
childcare_ppp_2020_pa <- childcare_ppp_jit_2020[planning_areas_owin]

childcare_ppp_2017_sg <- childcare_ppp_jit_2017[whole_sg_owin]
childcare_ppp_2020_sg <- childcare_ppp_jit_2020[whole_sg_owin]

childcare_sk_ppp_2017 = childcare_ppp_jit_2017[sk_owin]
childcare_bd_ppp_2017 = childcare_ppp_jit_2017[bd_owin]
childcare_bb_ppp_2017 = childcare_ppp_jit_2017[bb_owin]
childcare_hg_ppp_2017 = childcare_ppp_jit_2017[hg_owin]

childcare_sk_ppp_2020 = childcare_ppp_jit_2020[sk_owin]
childcare_bd_ppp_2020 = childcare_ppp_jit_2020[bd_owin]
childcare_bb_ppp_2020 = childcare_ppp_jit_2020[bb_owin]
childcare_hg_ppp_2020 = childcare_ppp_jit_2020[hg_owin]
plot(childcare_sk_ppp_2017)

plot(childcare_bd_ppp_2017)

plot(childcare_bb_ppp_2017)

plot(childcare_hg_ppp_2017)

plot(childcare_sk_ppp_2020)

plot(childcare_bd_ppp_2020)

plot(childcare_bb_ppp_2020)

plot(childcare_hg_ppp_2020)

plot(childcare_ppp_2017_pa)

plot(childcare_ppp_2020_pa)

plot(childcare_ppp_2017_sg)

plot(childcare_ppp_2020_sg)

The maps above shows the location of childcare services in 2017 and 2020 at their various areas: SENGKANG, BEDOK, BUKIT BATOK, HOUGANG.

4.5 Planning Areas

Planning Area 2017

In this section, we will compute the L-Function Estimation for Planning Area 2017 which consist of SENGKANG, BEDOK, BUKIT BATOK, HOUGANG.

L_pa_2017 = Lest(childcare_ppp_2017_pa, correction = "Ripley")
plot(L_pa_2017, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(m)")

4.5.1 Test Hypothesis for Planning Area 2017

Childcare Services in 2017

The test hypotheses are:

Ho = The distribution of childcare services in 2017 for planning areas are randomly distributed.

H1= The distribution of childcare services in 2017 at planning areas are not randomly distributed.

The null hypothesis will be rejected if p-value if smaller than alpha value of 0.001.

We will run a simulation of 200 times based on the alpha value.

L_pa_2017.csr <- envelope(childcare_ppp_2017_pa, Lest, nsim = 199, rank = 1, glocal=TRUE)
## Generating 199 simulations of CSR  ...
## 1, 2,  [etd 14:37] 3, 4 [etd 14:52] .6 [etd 14:55] .8
##  [etd 15:29] .10 [etd 14:59] .12 [etd 14:43] .14 [etd 14:21] .16
##  [etd 14:02] .18 [etd 13:52] .20 [etd 13:40] .22 [etd 13:32] .24
##  [etd 13:15] .26 [etd 13:14] .28 [etd 13:05] .30 [etd 13:02] .32
##  [etd 12:53] .34 [etd 12:47] .36 [etd 12:33] .38 [etd 12:22] .40
##  [etd 12:12] .42 [etd 12:03] .44 [etd 11:56] .46 [etd 11:49] .48
##  [etd 11:36] .50 [etd 11:27] .52 [etd 11:24] .54 [etd 11:16] .56
##  [etd 11:07] .58 [etd 10:57] .60 [etd 10:46] .62 [etd 10:37] .64
##  [etd 10:30] .66 [etd 10:19] .68 [etd 10:10] .70 [etd 10:01] .72
##  [etd 9:49] .74 [etd 9:42] .76 [etd 9:34] .78 [etd 9:25] .80
##  [etd 9:16] .82 [etd 9:07] .84 [etd 8:58] .86 [etd 8:50] .88
##  [etd 8:39] .90 [etd 8:31] .92 [etd 8:20] .94 [etd 8:11] .96
##  [etd 8:03] .98 [etd 7:53] .100 [etd 7:45] .102 [etd 7:35] .104
##  [etd 7:25] .106 [etd 7:16] .108 [etd 7:07] .110 [etd 6:57] .112
##  [etd 6:48] .114 [etd 6:38] .116 [etd 6:28] .118 [etd 6:19] .120
##  [etd 6:08] .122 [etd 5:59] .124 [etd 5:49] .126 [etd 5:40] .128
##  [etd 5:30] .130 [etd 5:20] .132 [etd 5:11] .134 [etd 5:02] .136
##  [etd 4:52] .138 [etd 4:44] .140 [etd 4:35] .142 [etd 4:25] .144
##  [etd 4:16] .146 [etd 4:06] .148 [etd 3:57] .150 [etd 3:48] .152
##  [etd 3:38] .154 [etd 3:29] .156 [etd 3:20] .158 [etd 3:11] .160
##  [etd 3:01] .162 [etd 2:52] .164 [etd 2:42] .166 [etd 2:33] .168
##  [etd 2:24] .170 [etd 2:16] .172 [etd 2:07] .174 [etd 1:58] .176
##  [etd 1:49] .178 [etd 1:39] .180 [etd 1:30] .182 [etd 1:21] .184
##  [etd 1:11] .186 [etd 1:02] .188 [etd 53 sec] .190 [etd 43 sec] .192
##  [etd 34 sec] .194 [etd 24 sec] .196 [etd 14 sec] .198 [etd 5 sec]  199.
## 
## Done.
plot(L_pa_2017.csr, . - r ~ r, xlab="d", ylab="L(d)-r", xlim=c(0,500))

Based on the simulation conducted above for L-Function on Planning Area 2017, we reject the null hypothesis as the observed L value is greater than its corresponding L (theo) and cuts above the upper confidence envelop at d = 80. Thus spatial clustering is statistically significant.

Planning Area 2020

In this section, we will compute the L-Function Estimation for Planning Area 2020 which consist of SENGKANG, BEDOK, BUKIT BATOK, HOUGANG.

L_pa_2020 = Lest(childcare_ppp_2020_pa, correction = "Ripley")
plot(L_pa_2020, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(m)")

### 4.5.2 Test Hypothesis for Planning Area 2020

Childcare Services in 2020

The test hypotheses are:

Ho = The distribution of childcare services in 2020 at planning areas are randomly distributed.

H1= The distribution of childcare services in 2020 at planning areas are not randomly distributed.

The null hypothesis will be rejected if p-value if smaller than alpha value of 0.001.

We will run a simulation of 200 times based on the alpha value.

L_pa_2020.csr <- envelope(childcare_ppp_2020_pa, Lest, nsim = 199, rank = 1, glocal=TRUE)
## Generating 199 simulations of CSR  ...
## 1, 2,  [etd 17:06] 3, 4 [etd 16:01] .6 [etd 16:59] .8
##  [etd 18:00] .10 [etd 18:29] .12 [etd 18:08] .14 [etd 18:12] .16
##  [etd 18:08] .18 [etd 17:52] .20 [etd 18:07] .22 [etd 18:11] .24
##  [etd 17:58] .26 [etd 17:42] .28 [etd 17:32] .30 [etd 17:22] .32
##  [etd 17:16] .34 [etd 17:18] .36 [etd 17:00] .38 [etd 17:00] .40
##  [etd 16:45] .42 [etd 16:29] .44 [etd 16:12] .46 [etd 16:00] .48
##  [etd 15:45] .50 [etd 15:29] .52 [etd 15:15] .54 [etd 15:04] .56
##  [etd 14:58] .58 [etd 14:50] .60 [etd 14:43] .62 [etd 14:31] .64
##  [etd 14:16] .66 [etd 14:02] .68 [etd 13:50] .70 [etd 13:39] .72
##  [etd 13:25] .74 [etd 13:11] .76 [etd 13:01] .78 [etd 12:47] .80
##  [etd 12:35] .82 [etd 12:21] .84 [etd 12:08] .86 [etd 11:56] .88
##  [etd 11:41] .90 [etd 11:29] .92 [etd 11:15] .94 [etd 11:03] .96
##  [etd 10:52] .98 [etd 10:37] .100 [etd 10:26] .102 [etd 10:16] .104
##  [etd 10:02] .106 [etd 9:55] .108 [etd 9:42] .110 [etd 9:31] .112
##  [etd 9:18] .114 [etd 9:05] .116 [etd 8:50] .118 [etd 8:36] .120
##  [etd 8:22] .122 [etd 8:09] .124 [etd 7:55] .126 [etd 7:43] .128
##  [etd 7:30] .130 [etd 7:17] .132 [etd 7:05] .134 [etd 6:51] .136
##  [etd 6:41] .138 [etd 6:28] .140 [etd 6:16] .142 [etd 6:05] .144
##  [etd 5:55] .146 [etd 5:42] .148 [etd 5:29] .150 [etd 5:16] .152
##  [etd 5:03] .154 [etd 4:50] .156 [etd 4:37] .158 [etd 4:24] .160
##  [etd 4:12] .162 [etd 4:00] .164 [etd 3:47] .166 [etd 3:34] .168
##  [etd 3:21] .170 [etd 3:08] .172 [etd 2:54] .174 [etd 2:42] .176
##  [etd 2:29] .178 [etd 2:16] .180 [etd 2:03] .182 [etd 1:51] .184
##  [etd 1:39] .186 [etd 1:26] .188 [etd 1:13] .190 [etd 1:00] .192
##  [etd 46 sec] .194 [etd 33 sec] .196 [etd 20 sec] .198 [etd 7 sec]  199.
## 
## Done.
plot(L_pa_2020.csr, . - r ~ r, xlab="d", ylab="L(d)-r", xlim=c(0,500))

Based on the simulation conducted above for L-Function on Planning Area 2020, we reject the null hypothesis as the observed L value is greater than its corresponding L (theo) and cuts above the upper confidence envelop at d = 120. Thus spatial clustering is statistically significant.

4.6 Kernel Density Maps of Childcare Services in 2017 and 2020

In this section, we will generate kernel density maps of childcare services in 2017 and 2020.

We will be using adaptive bandwidth using bw.diggle to compute sigma.

KDE for Childcare 2017 using Adaptive Bandwidth

kde_childcare_2017_bw <- density(childcare_ppp_2017_sg, sigma=bw.diggle, edge=TRUE, kernel="gaussian")
plot(kde_childcare_2017_bw)

KDE for Childcare 2020 using Adaptive Bandwidth

kde_childcare_2020_bw <- density(childcare_ppp_2020_sg, sigma=bw.diggle, edge=TRUE, kernel="gaussian")
plot(kde_childcare_2020_bw)

4.7 Kernel Density Maps on openstreetmap of Singapore

In this section, we will display kernel density maps on openstreetmap of Singapore.

First, we need to convert the ppp object into grid object, after which we need to convert it to raster object.

We also need to convert the crs of the raster object to 3414.

kde_2017 <- density(childcare_ppp_2017_sg, sigma = bw.diggle, edge=TRUE, kernel="gaussian")
gridded_kde_childcare_sg_2017 <- as.SpatialGridDataFrame.im(kde_2017)

kde_childcare_sg_2017_raster <- raster(gridded_kde_childcare_sg_2017)

projection(kde_childcare_sg_2017_raster) <- crs("+init=EPSG:3414 +datum=WGS84 +units=km")
tmap_mode("plot")
## tmap mode set to plotting
tm_shape(kde_childcare_sg_2017_raster) + 
  tm_raster("v", alpha=0.5, title = "Childcare Services 2017",midpoint = NA) +
  tmap_options(basemaps = c('OpenStreetMap'))

kde_2020 <- density(childcare_ppp_2020_sg, sigma = bw.diggle, edge=TRUE, kernel="gaussian")
gridded_kde_childcare_sg_2020 <- as.SpatialGridDataFrame.im(kde_2020)

kde_childcare_sg_2020_raster <- raster(gridded_kde_childcare_sg_2020)

projection(kde_childcare_sg_2020_raster) <- crs("+init=EPSG:3414 +datum=WGS84 +units=km")
tm_shape(kde_childcare_sg_2020_raster) + 
  tm_raster("v", alpha=0.5, title = "Childcare Services 2020",midpoint = NA) +
  tmap_options(basemaps = c('OpenStreetMap'))

4.7.1 Results and conclusion

Based on the simulation of the L-Function test, we can conclude that the distribution of childcare centres in 2017 and 2020 are spatially clustered. Also, based on the Density Kernel Maps we can see clusters of childcare located together, especially in areas located at the north and north east region.

4.7.2 Advantages of Kernel Density Maps over Point Maps

  1. Kernel Density Maps allows us to see the degree of concentration of locations points, thus we are able to better understand and visualise the extent. However, for point maps, there is no sense of degree of concentration, visually, users will only be able to see the individual points on the map but not relative to its location unlike kernel density map where it shows visually if the location is concentrated.

  2. Kernel density also allows a range of colours so that users can better visualise the map relative to the surroundings, however, point maps only display a single colour on the map.

  3. Kernel Density also displays a legend to show the range of available categories, so that users can quickly match a certain area to that particular category unlike point map.