Section A: The supply and demand of childcare and kindergarten services by planning subzone
Exploratory Spatial Data Analysis:
o Using appropriate EDA and choropleth mapping techniques to reveal the supply and demand of childcare services in 2017 and 2020 at the planning subzone level. Describe the spatial patterns observed.
Analytics mapping:
o Using appropriate analytics mapping techniques to reveal the temporal changes of the childcare services at the planning subzone level.
Geocommunication:
o Describe the results of (1) and (2) and draw statistical conclusions.
Section B: Spatial Point Pattern Analysis
Exploratory Spatial Data Analysis:
o Using point mapping techniques, display the location of childcare services in 2019 and 2020 at the national level. Describe the spatial patterns reveal by their respective distribution.
With reference to the spatial point patterns observed in (4):
o Formulate the null hypothesis and alternative hypothesis and select the confidence level.
o Perform the test by using appropriate 2nd order spatial point patterns analysis technique.
o With reference to the analysis results, draw statistical conclusions.
With reference to the results derived in (4) and (5):
o Derive kernel density maps of childcare services in 2017 and 2020.
o Using appropriate tmap functions, display the kernel density maps on openstreetmap of Singapore.
o With reference to the analysis results, draw statistical conclusions.
o Compare the advantages of kernel density maps (6) over point maps (4).
Childcare Services Data in 2017 (Provided in hands-on exercise)
Latest Childcare Data Set from data.gov.sg.
URA Master Plan 2014 Subzone Boundary Web from data.gov.sg.
Source: https://data.gov.sg/dataset/master-plan-2014-subzone-boundary-web
Singapore Resident By Subzone and Type of Dwelling 2011 to 2019 from data.gov.sg.
Source: https://data.gov.sg/dataset/singapore-residents-by-subzone-and-type-of-dwelling-2011-2019
Child care centres provide child care services and pre-school developmental programmes for children aged between 18 months and below 7 years old. Several centres also provide infant care programmes for infants aged between 2 and 18 months old. Centres may offer full-day, half-day and flexible programmes to cater to the different working schedules of parents.
These are the packages installed in this assignment.
packages = c('sp', 'rgdal', 'rgeos', 'sf', 'tidyverse','tmap','maptools','raster','spatstat','OpenStreetMap','rpostgis','dplyr','plyr')
for (p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p,character.only = T)
}
## Loading required package: sp
## Loading required package: rgdal
## rgdal: version: 1.5-16, (SVN revision 1050)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
## Path to GDAL shared files: C:/Users/jingwei/Documents/R/win-library/4.0/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION: 631]
## Path to PROJ shared files: C:/Users/jingwei/Documents/R/win-library/4.0/rgdal/proj
## Linking to sp version:1.4-2
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
## Loading required package: rgeos
## rgeos version: 0.5-3, (SVN revision 634)
## GEOS runtime version: 3.8.0-CAPI-1.13.1
## Linking to sp version: 1.4-2
## Polygon checking: TRUE
## Loading required package: sf
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
## Loading required package: tidyverse
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.2
## v tidyr 1.1.1 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## Loading required package: tmap
## Loading required package: maptools
## Checking rgeos availability: TRUE
## Loading required package: raster
##
## Attaching package: 'raster'
## The following object is masked from 'package:dplyr':
##
## select
## The following object is masked from 'package:tidyr':
##
## extract
## Loading required package: spatstat
## Loading required package: spatstat.data
## Loading required package: nlme
##
## Attaching package: 'nlme'
## The following object is masked from 'package:raster':
##
## getData
## The following object is masked from 'package:dplyr':
##
## collapse
## Loading required package: rpart
## Registered S3 method overwritten by 'spatstat':
## method from
## print.boxx cli
##
## spatstat 1.64-1 (nickname: 'Help you I can, yes!')
## For an introduction to spatstat, type 'beginner'
##
## Note: spatstat version 1.64-1 is out of date by more than 4 months; we recommend upgrading to the latest version.
##
## Attaching package: 'spatstat'
## The following objects are masked from 'package:raster':
##
## area, rotate, shift
## Loading required package: OpenStreetMap
## Loading required package: rpostgis
## Loading required package: RPostgreSQL
## Loading required package: DBI
## Loading required package: plyr
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following object is masked from 'package:purrr':
##
## compact
There are 4 main datasets that will be imported:
For Geospatial:
mpsz: URA Master Plan 2014 Subzone Boundary Web from data.gov.sg
childcare_2017: Childcare Services Data in 2017 (Provided in hands-on exercise)
childcare_2020: Latest Childcare Data Set from data.gov.sg
For Aspatial:
These data will be checked and transform / assign if necessary.
mpsz_sf <- st_read(dsn = "data/geospatial",
layer = "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `C:\IS415\Assignment1\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS: SVY21
mpsz_sf
## Simple feature collection with 323 features and 15 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS: SVY21
## First 10 features:
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C CA_IND PLN_AREA_N
## 1 1 1 MARINA SOUTH MSSZ01 Y MARINA SOUTH
## 2 2 1 PEARL'S HILL OTSZ01 Y OUTRAM
## 3 3 3 BOAT QUAY SRSZ03 Y SINGAPORE RIVER
## 4 4 8 HENDERSON HILL BMSZ08 N BUKIT MERAH
## 5 5 3 REDHILL BMSZ03 N BUKIT MERAH
## 6 6 7 ALEXANDRA HILL BMSZ07 N BUKIT MERAH
## 7 7 9 BUKIT HO SWEE BMSZ09 N BUKIT MERAH
## 8 8 2 CLARKE QUAY SRSZ02 Y SINGAPORE RIVER
## 9 9 13 PASIR PANJANG 1 QTSZ13 N QUEENSTOWN
## 10 10 7 QUEENSWAY QTSZ07 N QUEENSTOWN
## PLN_AREA_C REGION_N REGION_C INC_CRC FMEL_UPD_D X_ADDR
## 1 MS CENTRAL REGION CR 5ED7EB253F99252E 2014-12-05 31595.84
## 2 OT CENTRAL REGION CR 8C7149B9EB32EEFC 2014-12-05 28679.06
## 3 SR CENTRAL REGION CR C35FEFF02B13E0E5 2014-12-05 29654.96
## 4 BM CENTRAL REGION CR 3775D82C5DDBEFBD 2014-12-05 26782.83
## 5 BM CENTRAL REGION CR 85D9ABEF0A40678F 2014-12-05 26201.96
## 6 BM CENTRAL REGION CR 9D286521EF5E3B59 2014-12-05 25358.82
## 7 BM CENTRAL REGION CR 7839A8577144EFE2 2014-12-05 27680.06
## 8 SR CENTRAL REGION CR 48661DC0FBA09F7A 2014-12-05 29253.21
## 9 QT CENTRAL REGION CR 1F721290C421BFAB 2014-12-05 22077.34
## 10 QT CENTRAL REGION CR 3580D2AFFBEE914C 2014-12-05 24168.31
## Y_ADDR SHAPE_Leng SHAPE_Area geometry
## 1 29220.19 5267.381 1630379.3 MULTIPOLYGON (((31495.56 30...
## 2 29782.05 3506.107 559816.2 MULTIPOLYGON (((29092.28 30...
## 3 29974.66 1740.926 160807.5 MULTIPOLYGON (((29932.33 29...
## 4 29933.77 3313.625 595428.9 MULTIPOLYGON (((27131.28 30...
## 5 30005.70 2825.594 387429.4 MULTIPOLYGON (((26451.03 30...
## 6 29991.38 4428.913 1030378.8 MULTIPOLYGON (((25899.7 297...
## 7 30230.86 3275.312 551732.0 MULTIPOLYGON (((27746.95 30...
## 8 30222.86 2208.619 290184.7 MULTIPOLYGON (((29351.26 29...
## 9 29893.78 6571.323 1084792.3 MULTIPOLYGON (((20996.49 30...
## 10 30104.18 3454.239 631644.3 MULTIPOLYGON (((24472.11 29...
mpsz_sf_3414 <- st_set_crs(mpsz_sf, 3414)
## Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
## that
st_crs(mpsz_sf_3414)
## Coordinate Reference System:
## User input: EPSG:3414
## wkt:
## PROJCRS["SVY21 / Singapore TM",
## BASEGEOGCRS["SVY21",
## DATUM["SVY21",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4757]],
## CONVERSION["Singapore Transverse Mercator",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",1.36666666666667,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",103.833333333333,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",1,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",28001.642,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",38744.572,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["northing (N)",north,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["easting (E)",east,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["unknown"],
## AREA["Singapore"],
## BBOX[1.13,103.59,1.47,104.07]],
## ID["EPSG",3414]]
sg_residents <- read_csv("data/aspatial/planning-area-subzone-age-group-sex-and-type-of-dwelling-june-2011-2019.csv")
## Parsed with column specification:
## cols(
## planning_area = col_character(),
## subzone = col_character(),
## age_group = col_character(),
## sex = col_character(),
## type_of_dwelling = col_character(),
## resident_count = col_double(),
## year = col_double()
## )
childcare_2017_sf <- st_read(dsn = "data/geospatial",
layer = "CHILDCARE")
## Reading layer `CHILDCARE' from data source `C:\IS415\Assignment1\Take-home_Ex01\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 1312 features and 18 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 11203.01 ymin: 25667.6 xmax: 45404.24 ymax: 49300.88
## projected CRS: SVY21
childcare_2017_sf_3414 <- st_set_crs(childcare_2017_sf, 3414)
## Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
## that
st_crs(childcare_2017_sf_3414)
## Coordinate Reference System:
## User input: EPSG:3414
## wkt:
## PROJCRS["SVY21 / Singapore TM",
## BASEGEOGCRS["SVY21",
## DATUM["SVY21",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4757]],
## CONVERSION["Singapore Transverse Mercator",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",1.36666666666667,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",103.833333333333,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",1,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",28001.642,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",38744.572,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["northing (N)",north,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["easting (E)",east,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["unknown"],
## AREA["Singapore"],
## BBOX[1.13,103.59,1.47,104.07]],
## ID["EPSG",3414]]
childcare_2020_sf <- st_read("data/geospatial/child-care-services-kml.kml")
## Reading layer `CHILDCARE' from data source `C:\IS415\Assignment1\Take-home_Ex01\data\geospatial\child-care-services-kml.kml' using driver `KML'
## Simple feature collection with 1545 features and 2 fields
## geometry type: POINT
## dimension: XYZ
## bbox: xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
## z_range: zmin: 0 zmax: 0
## geographic CRS: WGS 84
childcare_2020_sf_3414 <- st_transform(childcare_2020_sf, 3414)
st_crs(childcare_2020_sf_3414)
## Coordinate Reference System:
## User input: EPSG:3414
## wkt:
## PROJCRS["SVY21 / Singapore TM",
## BASEGEOGCRS["SVY21",
## DATUM["SVY21",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4757]],
## CONVERSION["Singapore Transverse Mercator",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",1.36666666666667,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",103.833333333333,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",1,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",28001.642,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",38744.572,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["northing (N)",north,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["easting (E)",east,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["unknown"],
## AREA["Singapore"],
## BBOX[1.13,103.59,1.47,104.07]],
## ID["EPSG",3414]]
In this section we are shown some important unfiltered figures in the datasets and an idea of what the current trend looks like.
Number of sg residents over the years
We can see that the number of residents has been increasing over the years (2011 to 2019).
sg_resident_sum <- tapply(sg_residents$resident_count, sg_residents$year, FUN=sum)
sg_resident_sum_df <- as.data.frame.table(sg_resident_sum)
colnames(sg_resident_sum_df) <- c("Year", "Residents")
sg_resident_sum_df
## Year Residents
## 1 2011 3794150
## 2 2012 3824120
## 3 2013 3849790
## 4 2014 3875700
## 5 2015 3909000
## 6 2016 3940320
## 7 2017 3973220
## 8 2018 4001110
## 9 2019 4033420
ggplot(data = sg_resident_sum_df, aes(x = Year, y = Residents))+
geom_point(color = "#00AFBB", size = 2) + ggtitle("Number of Residents Over the Years (2011 to 2019)")
Number of childcare centers over the years
We can see that there is an increase in number of childcare between 2017 and 2019.
no_of_childcare <- c(nrow(childcare_2017_sf_3414), nrow(childcare_2020_sf_3414))
childcare_year <- c('2017','2020')
plot(childcare_year,no_of_childcare,"o", main = "Number of Childcare Over the Years (2017 and 2020)")
In this section, we will be filtering the datasets and making comparisons with the maps generated.
First, we will be filtering sg_residents dataset by year and resident count that is above 0. So as to generate two sets of data, 1 for residents in 2017 and 1 for residents in 2019. Additionally, we will be categorising the age groups into YOUNG, ECONOMY ACTIVE, AGED and TOTAl.
Do note that YOUNG age group, we will be grouping ‘0_to_4’ and ‘5_to_9’. As according to ECDA’s definition, childcare services are for children aged between 18 months and below 7 years old. So the nearest available grouping will be between ‘0_to_4’ and ‘5_to_9’.
We also added a YOUNGRATIO column which is calculated by YOUNG / TOTAL, so as to give us a perspective of the ratio of young to the total number of residents in an area.
We will then join the mpsz dataset with the new sg resident dataset using left join with the unique identifier being subzone.
sg_residents_2017 <- sg_residents %>%
filter(resident_count > 0) %>%
filter(year == 2017) %>%
spread(age_group,resident_count, fill = 0) %>%
mutate(YOUNG = `0_to_4`+`5_to_9`) %>%
mutate(`ECONOMY ACTIVE` = `25_to_29` + `30_to_34` + `35_to_39` + `40_to_44` + `45_to_49` + `50_to_54` + `55_to_59` + `60_to_64`) %>%
mutate(`AGED`= `65_to_69` + `70_to_74` + `75_to_79` + `80_to_84` + `85_to_89` + `90_and_over`) %>%
mutate(`TOTAL`=rowSums(.[6:24])) %>%
mutate(`YOUNGRATIO` = (`YOUNG` )/`TOTAL`) %>%
mutate_if(is.character, str_to_upper) %>%
dplyr::select(`planning_area`, `subzone`, `sex`,`type_of_dwelling`,`year`,`YOUNG`, `ECONOMY ACTIVE`, `AGED`,`TOTAL`,`YOUNGRATIO`) %>%
mutate(YOUNG = as.integer(YOUNG)) %>%
mutate(TOTAL = as.integer(TOTAL))
sg_residents_2017
## # A tibble: 1,897 x 10
## planning_area subzone sex type_of_dwelling year YOUNG `ECONOMY ACTIVE`
## <chr> <chr> <chr> <chr> <dbl> <int> <dbl>
## 1 ANG MO KIO ANG MO~ FEMA~ CONDOMINIUMS AN~ 2017 100 560
## 2 ANG MO KIO ANG MO~ FEMA~ HDB 3-ROOM FLATS 2017 10 170
## 3 ANG MO KIO ANG MO~ FEMA~ HDB 4-ROOM FLATS 2017 40 230
## 4 ANG MO KIO ANG MO~ FEMA~ HDB 5-ROOM AND ~ 2017 90 590
## 5 ANG MO KIO ANG MO~ MALES CONDOMINIUMS AN~ 2017 100 500
## 6 ANG MO KIO ANG MO~ MALES HDB 3-ROOM FLATS 2017 20 140
## 7 ANG MO KIO ANG MO~ MALES HDB 4-ROOM FLATS 2017 40 190
## 8 ANG MO KIO ANG MO~ MALES HDB 5-ROOM AND ~ 2017 90 530
## 9 ANG MO KIO CHENG ~ FEMA~ HDB 1- AND 2-RO~ 2017 20 220
## 10 ANG MO KIO CHENG ~ FEMA~ HDB 3-ROOM FLATS 2017 370 4640
## # ... with 1,887 more rows, and 3 more variables: AGED <dbl>, TOTAL <int>,
## # YOUNGRATIO <dbl>
mpsz_filtered_2017 <- left_join(mpsz_sf_3414, sg_residents_2017, by = c("SUBZONE_N" = "subzone"))
mpsz_filtered_2017$YOUNG[is.na(mpsz_filtered_2017$YOUNG)] <- 0
mpsz_filtered_2017$TOTAL[is.na(mpsz_filtered_2017$TOTAL)] <- 0
mpsz_filtered_2017_1 <- mpsz_filtered_2017 %>%
filter(YOUNG != 0)
sg_residents_2019 <- sg_residents %>%
filter(resident_count > 0) %>%
filter(year == 2019) %>%
spread(age_group,resident_count, fill = 0) %>%
mutate(YOUNG = `0_to_4`+`5_to_9`) %>%
mutate(`ECONOMY ACTIVE` = `25_to_29` + `30_to_34` + `35_to_39` + `40_to_44` + `45_to_49` + `50_to_54` + `55_to_59` + `60_to_64`) %>%
mutate(`AGED`= `65_to_69` + `70_to_74` + `75_to_79` + `80_to_84` + `85_to_89` + `90_and_over`) %>%
mutate(`TOTAL`=rowSums(.[6:24])) %>%
mutate(`YOUNGRATIO` = (`YOUNG` )/`TOTAL`) %>%
mutate_if(is.character, str_to_upper) %>%
dplyr::select(`planning_area`, `subzone`, `sex`,`type_of_dwelling`,`year`,`YOUNG`, `ECONOMY ACTIVE`, `AGED`,`TOTAL`,`YOUNGRATIO`) %>%
mutate(YOUNG = as.integer(YOUNG)) %>%
mutate(TOTAL = as.integer(TOTAL))
sg_residents_2019
## # A tibble: 1,924 x 10
## planning_area subzone sex type_of_dwelling year YOUNG `ECONOMY ACTIVE`
## <chr> <chr> <chr> <chr> <dbl> <int> <dbl>
## 1 ANG MO KIO ANG MO~ FEMA~ CONDOMINIUMS AN~ 2019 100 600
## 2 ANG MO KIO ANG MO~ FEMA~ HDB 3-ROOM FLATS 2019 20 160
## 3 ANG MO KIO ANG MO~ FEMA~ HDB 4-ROOM FLATS 2019 30 200
## 4 ANG MO KIO ANG MO~ FEMA~ HDB 5-ROOM AND ~ 2019 60 530
## 5 ANG MO KIO ANG MO~ MALES CONDOMINIUMS AN~ 2019 110 510
## 6 ANG MO KIO ANG MO~ MALES HDB 3-ROOM FLATS 2019 20 140
## 7 ANG MO KIO ANG MO~ MALES HDB 4-ROOM FLATS 2019 20 160
## 8 ANG MO KIO ANG MO~ MALES HDB 5-ROOM AND ~ 2019 70 480
## 9 ANG MO KIO ANG MO~ MALES OTHERS 2019 0 10
## 10 ANG MO KIO CHENG ~ FEMA~ HDB 1- AND 2-RO~ 2019 20 230
## # ... with 1,914 more rows, and 3 more variables: AGED <dbl>, TOTAL <int>,
## # YOUNGRATIO <dbl>
mpsz_filtered_2019 <- left_join(mpsz_sf_3414, sg_residents_2019, by = c("SUBZONE_N" = "subzone"))
mpsz_filtered_2019$YOUNG[is.na(mpsz_filtered_2019$YOUNG)] <- 0
mpsz_filtered_2019$TOTAL[is.na(mpsz_filtered_2019$TOTAL)] <- 0
mpsz_filtered_2019_1 <- mpsz_filtered_2019 %>%
filter(YOUNG != 0)
In this section, we will be comparing the maps generated for both young ratio and young distribution in 2017 and 2020 respectively.
young_ratio_map_2017 <- tm_shape(mpsz_filtered_2017_1)+
tm_layout(main.title = "Distribution of Young Ratio in 2017",
main.title.position = "center",
main.title.size = 1,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_fill("YOUNGRATIO",
style = "quantile",
palette = "Greens",
title = "Ratio of Young") +
tm_borders(alpha = 0.5)
young_ratio_map_2019 <-tm_shape(mpsz_filtered_2019_1)+
tm_layout(main.title = "Distribution of Young Ratio in 2019",
main.title.position = "center",
main.title.size = 1,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_fill("YOUNGRATIO",
style = "quantile",
palette = "Greens",
title = "Ratio of Young") +
tm_borders(alpha = 0.5)
Ratio of Young by Planning Subzone in 2017 and 2019 Map Comparisons
tmap_arrange(young_ratio_map_2017,young_ratio_map_2019)
From the 2 maps above, you can see that the distribution of young ratio is not even across the island. Some of the subzones have a change in ratio of young to total residents while other some subzone remains the same as in 2017. The changes in subzone usually affects the neighbouring subzone as it can be seen that usually when a subzone’s distribution decreases, the neighbouring subzone distribution increases, likewise when one increases the other decreases. There are also subzones that are not filled as these areas are usually occupied by forests or water bodies such as reservoirs.
young_map_2017 <- tm_shape(mpsz_filtered_2017_1)+
tm_layout(main.title = "Distribution of Young in 2017",
main.title.position = "center",
main.title.size = 1,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_fill("YOUNG",
style = "quantile",
palette = "Greens",
title = "Number of Young") +
tm_borders(alpha = 0.5)
young_map_2019 <- tm_shape(mpsz_filtered_2019_1)+
tm_layout(main.title = "Distribution of Young in 2019",
main.title.position = "center",
main.title.size = 1,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_fill("YOUNG",
style = "quantile",
palette = "Greens",
title = "Number of Young") +
tm_borders(alpha = 0.5)
Distribution of Young by Planning Subzone in 2017 and 2019 Map Comparisons
tmap_arrange(young_map_2017,young_map_2019)
From the above 2 maps, we can see that the number of young distribution remains relatively unchanged between 2017 and 2019 as many subzone still reflects the same degree of young in 2019. We can also see that the number of young are not evenly distributed across the island. It seems that there are areas that have more concentration of young. For instance, the north east part of the island. This could be due to more younger families staying in these areas.There are also subzones that are not filled as these areas are usually occupied by forests or water bodies such as reservoirs.
In this section, we will be generating a number of maps and graphs comparing the demand and supply of childcare services in subzone level.
tmap_mode("plot")
## tmap mode set to plotting
mpsz_filtered_2017$`count_subzone`<- lengths(st_intersects(mpsz_filtered_2017, childcare_2017_sf_3414))
summary(mpsz_filtered_2017$`count_subzone`)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 2.000 5.000 6.487 9.000 36.000
Distribution of Childcare by Planning Subzone in 2017 and 2019 Map Comparisons
childcare_map_2017 <- tm_shape(mpsz_filtered_2017)+
tm_layout(main.title = "Distribution of Childcare in 2017",
main.title.position = "center",
main.title.size = 1,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_fill("count_subzone",
style = "quantile",
palette = "Greens",
title = "Counts") +
tm_borders(alpha = 0.5)
mpsz_filtered_2019$`count_subzone`<- lengths(st_intersects(mpsz_filtered_2019, childcare_2020_sf_3414))
summary(mpsz_filtered_2019$`count_subzone`)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 3.000 6.000 7.652 11.000 42.000
childcare_map_2020 <- tm_shape(mpsz_filtered_2019)+
tm_layout(main.title = "Distribution of Childcare in 2020",
main.title.position = "center",
main.title.size = 1,
legend.height = 0.45,
legend.width = 0.35,
legend.outside = FALSE,
legend.position = c("right", "bottom"),
frame = FALSE) +
tm_fill("count_subzone",
style = "quantile",
palette = "Greens",
title = "Counts") +
tm_borders(alpha = 0.5)
tmap_arrange(childcare_map_2017,childcare_map_2020)
From the 2 maps above, we can see the distribution of childcare services across the island. They are not evenly distributed across the island. Certain areas such as the east and north east have a higher distribution of childcare in the subzones. This could be due to the number of younger families situated in these areas which is reflected in the previous 2 maps showing the distribution of young, similarly, it can be seen that there is a high demand for childcare in north east area which also matches the high supply of childcare. However, it seems that the western part of the island has relatively lesser supply of childcare, given that the number of young in certain subzone in the west is still considerable.
As there is many subzones, it is difficult to display the changes in every single one of them. Thus, in order to still give a better visualisation of the subzone changes, the changes are filtered by regions. By filtering by regions, we can better see the changes of childcare services at the planning subzone level as each region has a good amount of subzones in it.
no_of_childcare_by_region_year_2017 = ddply(mpsz_filtered_2017,~REGION_N,summarise,no_of_childcare_2017=length(count_subzone))
no_of_childcare_by_region_year_2020 = ddply(mpsz_filtered_2019,~REGION_N,summarise,no_of_childcare_2020=length(count_subzone))
no_of_childcare_by_region_year <- left_join(no_of_childcare_by_region_year_2017, no_of_childcare_by_region_year_2020, by = c("REGION_N" = "REGION_N"))
no_of_childcare_by_region_year$difference <- no_of_childcare_by_region_year$no_of_childcare_2020 - no_of_childcare_by_region_year$no_of_childcare_2017
Difference in number of childcare services between 2017 and 2020
The table and graphs show the number of childcare services in 2017 and 2019, along with the differences of the years.
no_of_childcare_by_region_year
## REGION_N no_of_childcare_2017 no_of_childcare_2020 difference
## 1 CENTRAL REGION 846 839 -7
## 2 EAST REGION 186 195 9
## 3 NORTH-EAST REGION 356 362 6
## 4 NORTH REGION 209 215 6
## 5 WEST REGION 390 402 12
ggplot(data = no_of_childcare_by_region_year, aes(x = REGION_N, y = difference))+
geom_point(color = "#00AFBB", size = 2) + ggtitle("Difference in Number of Childcare Services between 2017 and 2020")
The table and graph above shows the number of childcare in 2017 and 2020, along with the differences for each region. As there are too many subzones, we grouped them up into regions instead. The subzone within each region will be displayed in the maps below later on.
It can be seen that most regions have experienced an increase in number of childcare available. With the west region gaining the most with an increase of 12. However, there is also a drop for central region which experienced a decrease of 7. This could be due to more families wanting to have childcare services near their homes instead as there is a greater proportion of youngs in regions compared to the central regions.
mpsz_filtered_2017_without_spaces_central_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'CENTRAL REGION', count_subzone != 0)
mpsz_filtered_2017_without_spaces_east_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'EAST REGION', count_subzone != 0)
mpsz_filtered_2017_without_spaces_north_east_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'NORTH-EAST REGION', count_subzone != 0)
mpsz_filtered_2017_without_spaces_north_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'NORTH REGION', count_subzone != 0)
mpsz_filtered_2017_without_spaces_west_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'WEST REGION', count_subzone != 0)
mpsz_filtered_2019_without_spaces_central_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'CENTRAL REGION', count_subzone != 0)
mpsz_filtered_2019_without_spaces_east_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'EAST REGION', count_subzone != 0)
mpsz_filtered_2019_without_spaces_north_east_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'NORTH-EAST REGION', count_subzone != 0)
mpsz_filtered_2019_without_spaces_north_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'NORTH REGION', count_subzone != 0)
mpsz_filtered_2019_without_spaces_west_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'WEST REGION', count_subzone != 0)
tm_shape(mpsz_filtered_2017_without_spaces_central_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2017 in Central Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_central_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2020 in Central Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_east_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2017 in East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_east_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2020 in East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_north_east_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2017 in North-East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_north_east_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2020 in North-East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_north_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2017 in North Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_north_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2020 in North Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_north_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2017 in West Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_north_region) +
tm_fill("count_subzone",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Childcare Services 2020 in West Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
From the above maps, we can see that the number of childcare services in each region is not evenly distributed. Certain subzones have a greater number of childcare while others have a less concentration. However, between 2017 and 2020, we can see that most regions has seen a increase of childcare. Looking further, we can also see that subzones that have less concentration of childcare has seen an increase in childcare, along with subzone that did not have childcares services previously. There are also subzones that have higher number of childcare concentrated together, this could be due to them being near homes or schools which brings convenience to families.
no_of_young_by_region_year_2017 = ddply(mpsz_filtered_2017,~REGION_N,summarise,no_of_young_2017= sum(YOUNG))
no_of_young_by_region_year_2019 = ddply(mpsz_filtered_2019,~REGION_N,summarise,no_of_young_2019= sum(YOUNG))
no_of_young_region_year <- left_join(no_of_young_by_region_year_2017, no_of_young_by_region_year_2019, by = c("REGION_N" = "REGION_N"))
no_of_young_region_year$difference <- no_of_young_region_year$no_of_young_2019 - no_of_young_region_year$no_of_young_2017
Difference in number of young between 2017 and 2020
The table and graphs show the number of young in 2017 and 2019, along with the differences of the years.
no_of_young_region_year
## REGION_N no_of_young_2017 no_of_young_2019 difference
## 1 CENTRAL REGION 83530 79050 -4480
## 2 EAST REGION 59680 57020 -2660
## 3 NORTH-EAST REGION 102070 103990 1920
## 4 NORTH REGION 57000 59340 2340
## 5 WEST REGION 86880 84510 -2370
ggplot(data = no_of_young_region_year, aes(x = REGION_N, y = difference))+
geom_point(color = "#00AFBB", size = 2) + ggtitle("Difference in Number of Young between 2017 and 2020")
The table and graph above shows the number of young in 2017 and 2020, along with the differences for each region. As there are too many subzones, we grouped them up into regions instead. The subzone within each region will be displayed in the maps below later on.
It can be seen that most regions have experienced an decrease in number of young. With the greatest decrease in the central region of 4480, along with decrease in east and west regions. However, there is also an increse in young in regions such as the north and north east of the island.
mpsz_filtered_2017_without_spaces_central_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'CENTRAL REGION', YOUNG != 0)
mpsz_filtered_2017_without_spaces_east_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'EAST REGION', YOUNG != 0)
mpsz_filtered_2017_without_spaces_north_east_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'NORTH-EAST REGION', YOUNG != 0)
mpsz_filtered_2017_without_spaces_north_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'NORTH REGION', YOUNG != 0)
mpsz_filtered_2017_without_spaces_west_region <- mpsz_filtered_2017 %>%
filter(REGION_N == 'WEST REGION', YOUNG != 0)
mpsz_filtered_2019_without_spaces_central_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'CENTRAL REGION', YOUNG != 0)
mpsz_filtered_2019_without_spaces_east_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'EAST REGION', YOUNG != 0)
mpsz_filtered_2019_without_spaces_north_east_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'NORTH-EAST REGION', YOUNG != 0)
mpsz_filtered_2019_without_spaces_north_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'NORTH REGION', YOUNG != 0)
mpsz_filtered_2019_without_spaces_west_region <- mpsz_filtered_2019 %>%
filter(REGION_N == 'WEST REGION', YOUNG != 0)
tm_shape(mpsz_filtered_2017_without_spaces_central_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2017 in Central Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_central_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2020 in Central Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_east_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2017 in East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_east_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2020 in East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_north_east_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2017 in North-East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_north_east_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2020 in North-East Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_north_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2017 in North Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_north_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2020 in North Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2017_without_spaces_west_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2017 in West Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
tm_shape(mpsz_filtered_2019_without_spaces_west_region) +
tm_fill("YOUNG",
style = "quantile",
palette = "Blues",
thres.poly = 0) +
tm_facets(by="REGION_N",
free.coords=TRUE,
drop.shapes=TRUE) +
tm_layout(legend.show = TRUE,
main.title = "Number of Young 2020 in West Region",
title.position = c("center", "center"),
title.size = 30) +
tm_borders(alpha = 0.5)
## Warning: The argument drop.shapes has been renamed to drop.units, and is
## therefore deprecated
From the above maps, we can see that the number of young in each region is not evenly distributed. Certain subzones have a greater number of young while others have a less concentration. However, between 2017 and 2020, we can see that some subzones remain the same as in 2017 and a few has seen a decrease in young in its subzone.
Most subzones remain relatively the same as before, this could be due to the young still being in the same age group category. Also, this is expected as most families usually stays in a subzone for a long period of time. The difference in years between the 2 maps is only 3 years, thus it cannot be expected to have bigger and more obvious changes compared to the number of childcare services.
In conclusion for the temporary changes in demand and supply of childcare centres, we have seen an increase in overall childcare services of about 1.308% between 2017 and 2020 while there is a decrease in number of young of about 1.349%. This shows that the current differences in number of childcare services and young remains relatively close to each other after 3 years.
However, for certain regions that experienced a drop in young, for instance in central region, there has been a drop of 5.36% of young but the decrease in childcare is only about 0.827% in that region. Also, in another instance, the number of young in the west region decreased by 2.72% but there is an increase of childcare services of about 3.076%. Thus it shows that the supply of childcare is currently not entirely based on the demand of childcare services in the region and certain areas have a bigger disparity between supply and demand of childcare service.
In this section, we have to convert the data in ppp object form. However, as there is no direct way to convert, we will convert SpatialDataFrame into Spatial objects first.
childcare_2017_sp <- readOGR(dsn = "data/geospatial", layer="CHILDCARE")
## OGR data source with driver: ESRI Shapefile
## Source: "C:\IS415\Assignment1\Take-home_Ex01\data\geospatial", layer: "CHILDCARE"
## with 1312 features
## It has 18 fields
childcare_2017_sp <- as(childcare_2017_sp, "SpatialPoints")
childcare_2020_sp <- as(childcare_2020_sf_3414, "Spatial")
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
childcare_2020_sp <- as(childcare_2020_sp, "SpatialPoints")
mpsz_filtered_sp <- as(mpsz_sf_3414, "Spatial")
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
crs(childcare_2017_sp)
## CRS arguments:
## +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs
crs(childcare_2020_sp)
## CRS arguments:
## +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0
## +units=m +no_defs
crs(mpsz_filtered_sp)
## CRS arguments:
## +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0
## +units=m +no_defs
In this section, we will generate the locations of childcare services in 2017 and 2020 at national level.
Location of childcare services in 2017
plot(mpsz_filtered_sp, border="darkgrey")+
plot(childcare_2017_sp, add=TRUE)
## integer(0)
Location of childcare services in 2020
plot(mpsz_filtered_sp, border="darkgrey")+
plot(childcare_2020_sp, add=TRUE)
## integer(0)
From the 2 maps shown above, we can see the location of childcare services represented by “+” at the national level. There seems to be a greater number of childcare services in terms of location as there are more points seen on the 2020 map. There are also areas where there is no childcare services available. This are areas that are occupied by water bodies such as reservoirs and forests and also heavily concentrated industrial regions. Also, from the map, it can be seen that the childcare services are not evenly distributed and are clustered together especially in the north east region of the island.
In this section, we will convert objects into ppp format. After which we will need to check for duplicated points and handle them using jitter if necessary.
childcare_ppp_2017 <- as(childcare_2017_sp, "ppp")
childcare_ppp_2020 <- as(childcare_2020_sp, "ppp")
any(duplicated(childcare_ppp_2017))
## [1] TRUE
any(duplicated(childcare_ppp_2020))
## [1] TRUE
childcare_ppp_jit_2017 <- rjitter(childcare_ppp_2017, retry=TRUE, nsim=1, drop=TRUE)
childcare_ppp_jit_2020 <- rjitter(childcare_ppp_2020, retry=TRUE, nsim=1, drop=TRUE)
sum(multiplicity(childcare_ppp_jit_2017) > 1)
## [1] 0
sum(multiplicity(childcare_ppp_jit_2020) > 1)
## [1] 0
In this section, we need to extract 4 planning areas: SENGKANG, BEDOK, BUKIT BATOK, HOUGANG and whole of singapore for the kernel density maps later on.
planning_areas <- mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N %in% c("SENGKANG","BEDOK","BUKIT BATOK", "HOUGANG"),]
whole_sg <- mpsz_filtered_sp
sk = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "SENGKANG",]
bd = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "BEDOK",]
bb = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "BUKIT BATOK",]
hg = mpsz_filtered_sp[mpsz_filtered_sp@data$PLN_AREA_N == "HOUGANG",]
planning_areas_sp <- as(planning_areas, "SpatialPolygons")
whole_sg_sp <- as(whole_sg, "SpatialPolygons")
childcare_sp_2017 = as(childcare_ppp_jit_2017, "SpatialPoints")
childcare_sp_2020 = as(childcare_2020_sp, "SpatialPoints")
sk_sp = as(sk, "SpatialPolygons")
bd_sp = as(bd, "SpatialPolygons")
bb_sp = as(bb, "SpatialPolygons")
hg_sp = as(hg, "SpatialPolygons")
planning_areas_owin <- as(planning_areas_sp, "owin")
whole_sg_owin <- as(whole_sg_sp, "owin")
sk_owin = as(sk_sp, "owin")
bd_owin = as(bd_sp, "owin")
bb_owin = as(bb_sp, "owin")
hg_owin = as(hg_sp, "owin")
childcare_ppp_2017_pa <- childcare_ppp_jit_2017[planning_areas_owin]
childcare_ppp_2020_pa <- childcare_ppp_jit_2020[planning_areas_owin]
childcare_ppp_2017_sg <- childcare_ppp_jit_2017[whole_sg_owin]
childcare_ppp_2020_sg <- childcare_ppp_jit_2020[whole_sg_owin]
childcare_sk_ppp_2017 = childcare_ppp_jit_2017[sk_owin]
childcare_bd_ppp_2017 = childcare_ppp_jit_2017[bd_owin]
childcare_bb_ppp_2017 = childcare_ppp_jit_2017[bb_owin]
childcare_hg_ppp_2017 = childcare_ppp_jit_2017[hg_owin]
childcare_sk_ppp_2020 = childcare_ppp_jit_2020[sk_owin]
childcare_bd_ppp_2020 = childcare_ppp_jit_2020[bd_owin]
childcare_bb_ppp_2020 = childcare_ppp_jit_2020[bb_owin]
childcare_hg_ppp_2020 = childcare_ppp_jit_2020[hg_owin]
plot(childcare_sk_ppp_2017)
plot(childcare_bd_ppp_2017)
plot(childcare_bb_ppp_2017)
plot(childcare_hg_ppp_2017)
plot(childcare_sk_ppp_2020)
plot(childcare_bd_ppp_2020)
plot(childcare_bb_ppp_2020)
plot(childcare_hg_ppp_2020)
plot(childcare_ppp_2017_pa)
plot(childcare_ppp_2020_pa)
plot(childcare_ppp_2017_sg)
plot(childcare_ppp_2020_sg)
The maps above shows the location of childcare services in 2017 and 2020 at their various areas: SENGKANG, BEDOK, BUKIT BATOK, HOUGANG.
In this section, we will compute the L-Function Estimation for Planning Area 2017 which consist of SENGKANG, BEDOK, BUKIT BATOK, HOUGANG.
L_pa_2017 = Lest(childcare_ppp_2017_pa, correction = "Ripley")
plot(L_pa_2017, . -r ~ r,
ylab= "L(d)-r", xlab = "d(m)")
Childcare Services in 2017
The test hypotheses are:
Ho = The distribution of childcare services in 2017 for planning areas are randomly distributed.
H1= The distribution of childcare services in 2017 at planning areas are not randomly distributed.
The null hypothesis will be rejected if p-value if smaller than alpha value of 0.001.
We will run a simulation of 200 times based on the alpha value.
L_pa_2017.csr <- envelope(childcare_ppp_2017_pa, Lest, nsim = 199, rank = 1, glocal=TRUE)
## Generating 199 simulations of CSR ...
## 1, 2, [etd 14:37] 3, 4 [etd 14:52] .6 [etd 14:55] .8
## [etd 15:29] .10 [etd 14:59] .12 [etd 14:43] .14 [etd 14:21] .16
## [etd 14:02] .18 [etd 13:52] .20 [etd 13:40] .22 [etd 13:32] .24
## [etd 13:15] .26 [etd 13:14] .28 [etd 13:05] .30 [etd 13:02] .32
## [etd 12:53] .34 [etd 12:47] .36 [etd 12:33] .38 [etd 12:22] .40
## [etd 12:12] .42 [etd 12:03] .44 [etd 11:56] .46 [etd 11:49] .48
## [etd 11:36] .50 [etd 11:27] .52 [etd 11:24] .54 [etd 11:16] .56
## [etd 11:07] .58 [etd 10:57] .60 [etd 10:46] .62 [etd 10:37] .64
## [etd 10:30] .66 [etd 10:19] .68 [etd 10:10] .70 [etd 10:01] .72
## [etd 9:49] .74 [etd 9:42] .76 [etd 9:34] .78 [etd 9:25] .80
## [etd 9:16] .82 [etd 9:07] .84 [etd 8:58] .86 [etd 8:50] .88
## [etd 8:39] .90 [etd 8:31] .92 [etd 8:20] .94 [etd 8:11] .96
## [etd 8:03] .98 [etd 7:53] .100 [etd 7:45] .102 [etd 7:35] .104
## [etd 7:25] .106 [etd 7:16] .108 [etd 7:07] .110 [etd 6:57] .112
## [etd 6:48] .114 [etd 6:38] .116 [etd 6:28] .118 [etd 6:19] .120
## [etd 6:08] .122 [etd 5:59] .124 [etd 5:49] .126 [etd 5:40] .128
## [etd 5:30] .130 [etd 5:20] .132 [etd 5:11] .134 [etd 5:02] .136
## [etd 4:52] .138 [etd 4:44] .140 [etd 4:35] .142 [etd 4:25] .144
## [etd 4:16] .146 [etd 4:06] .148 [etd 3:57] .150 [etd 3:48] .152
## [etd 3:38] .154 [etd 3:29] .156 [etd 3:20] .158 [etd 3:11] .160
## [etd 3:01] .162 [etd 2:52] .164 [etd 2:42] .166 [etd 2:33] .168
## [etd 2:24] .170 [etd 2:16] .172 [etd 2:07] .174 [etd 1:58] .176
## [etd 1:49] .178 [etd 1:39] .180 [etd 1:30] .182 [etd 1:21] .184
## [etd 1:11] .186 [etd 1:02] .188 [etd 53 sec] .190 [etd 43 sec] .192
## [etd 34 sec] .194 [etd 24 sec] .196 [etd 14 sec] .198 [etd 5 sec] 199.
##
## Done.
plot(L_pa_2017.csr, . - r ~ r, xlab="d", ylab="L(d)-r", xlim=c(0,500))
Based on the simulation conducted above for L-Function on Planning Area 2017, we reject the null hypothesis as the observed L value is greater than its corresponding L (theo) and cuts above the upper confidence envelop at d = 80. Thus spatial clustering is statistically significant.
In this section, we will compute the L-Function Estimation for Planning Area 2020 which consist of SENGKANG, BEDOK, BUKIT BATOK, HOUGANG.
L_pa_2020 = Lest(childcare_ppp_2020_pa, correction = "Ripley")
plot(L_pa_2020, . -r ~ r,
ylab= "L(d)-r", xlab = "d(m)")
### 4.5.2 Test Hypothesis for Planning Area 2020
Childcare Services in 2020
The test hypotheses are:
Ho = The distribution of childcare services in 2020 at planning areas are randomly distributed.
H1= The distribution of childcare services in 2020 at planning areas are not randomly distributed.
The null hypothesis will be rejected if p-value if smaller than alpha value of 0.001.
We will run a simulation of 200 times based on the alpha value.
L_pa_2020.csr <- envelope(childcare_ppp_2020_pa, Lest, nsim = 199, rank = 1, glocal=TRUE)
## Generating 199 simulations of CSR ...
## 1, 2, [etd 17:06] 3, 4 [etd 16:01] .6 [etd 16:59] .8
## [etd 18:00] .10 [etd 18:29] .12 [etd 18:08] .14 [etd 18:12] .16
## [etd 18:08] .18 [etd 17:52] .20 [etd 18:07] .22 [etd 18:11] .24
## [etd 17:58] .26 [etd 17:42] .28 [etd 17:32] .30 [etd 17:22] .32
## [etd 17:16] .34 [etd 17:18] .36 [etd 17:00] .38 [etd 17:00] .40
## [etd 16:45] .42 [etd 16:29] .44 [etd 16:12] .46 [etd 16:00] .48
## [etd 15:45] .50 [etd 15:29] .52 [etd 15:15] .54 [etd 15:04] .56
## [etd 14:58] .58 [etd 14:50] .60 [etd 14:43] .62 [etd 14:31] .64
## [etd 14:16] .66 [etd 14:02] .68 [etd 13:50] .70 [etd 13:39] .72
## [etd 13:25] .74 [etd 13:11] .76 [etd 13:01] .78 [etd 12:47] .80
## [etd 12:35] .82 [etd 12:21] .84 [etd 12:08] .86 [etd 11:56] .88
## [etd 11:41] .90 [etd 11:29] .92 [etd 11:15] .94 [etd 11:03] .96
## [etd 10:52] .98 [etd 10:37] .100 [etd 10:26] .102 [etd 10:16] .104
## [etd 10:02] .106 [etd 9:55] .108 [etd 9:42] .110 [etd 9:31] .112
## [etd 9:18] .114 [etd 9:05] .116 [etd 8:50] .118 [etd 8:36] .120
## [etd 8:22] .122 [etd 8:09] .124 [etd 7:55] .126 [etd 7:43] .128
## [etd 7:30] .130 [etd 7:17] .132 [etd 7:05] .134 [etd 6:51] .136
## [etd 6:41] .138 [etd 6:28] .140 [etd 6:16] .142 [etd 6:05] .144
## [etd 5:55] .146 [etd 5:42] .148 [etd 5:29] .150 [etd 5:16] .152
## [etd 5:03] .154 [etd 4:50] .156 [etd 4:37] .158 [etd 4:24] .160
## [etd 4:12] .162 [etd 4:00] .164 [etd 3:47] .166 [etd 3:34] .168
## [etd 3:21] .170 [etd 3:08] .172 [etd 2:54] .174 [etd 2:42] .176
## [etd 2:29] .178 [etd 2:16] .180 [etd 2:03] .182 [etd 1:51] .184
## [etd 1:39] .186 [etd 1:26] .188 [etd 1:13] .190 [etd 1:00] .192
## [etd 46 sec] .194 [etd 33 sec] .196 [etd 20 sec] .198 [etd 7 sec] 199.
##
## Done.
plot(L_pa_2020.csr, . - r ~ r, xlab="d", ylab="L(d)-r", xlim=c(0,500))
Based on the simulation conducted above for L-Function on Planning Area 2020, we reject the null hypothesis as the observed L value is greater than its corresponding L (theo) and cuts above the upper confidence envelop at d = 120. Thus spatial clustering is statistically significant.
In this section, we will generate kernel density maps of childcare services in 2017 and 2020.
We will be using adaptive bandwidth using bw.diggle to compute sigma.
kde_childcare_2017_bw <- density(childcare_ppp_2017_sg, sigma=bw.diggle, edge=TRUE, kernel="gaussian")
plot(kde_childcare_2017_bw)
kde_childcare_2020_bw <- density(childcare_ppp_2020_sg, sigma=bw.diggle, edge=TRUE, kernel="gaussian")
plot(kde_childcare_2020_bw)
In this section, we will display kernel density maps on openstreetmap of Singapore.
First, we need to convert the ppp object into grid object, after which we need to convert it to raster object.
We also need to convert the crs of the raster object to 3414.
kde_2017 <- density(childcare_ppp_2017_sg, sigma = bw.diggle, edge=TRUE, kernel="gaussian")
gridded_kde_childcare_sg_2017 <- as.SpatialGridDataFrame.im(kde_2017)
kde_childcare_sg_2017_raster <- raster(gridded_kde_childcare_sg_2017)
projection(kde_childcare_sg_2017_raster) <- crs("+init=EPSG:3414 +datum=WGS84 +units=km")
tmap_mode("plot")
## tmap mode set to plotting
tm_shape(kde_childcare_sg_2017_raster) +
tm_raster("v", alpha=0.5, title = "Childcare Services 2017",midpoint = NA) +
tmap_options(basemaps = c('OpenStreetMap'))
kde_2020 <- density(childcare_ppp_2020_sg, sigma = bw.diggle, edge=TRUE, kernel="gaussian")
gridded_kde_childcare_sg_2020 <- as.SpatialGridDataFrame.im(kde_2020)
kde_childcare_sg_2020_raster <- raster(gridded_kde_childcare_sg_2020)
projection(kde_childcare_sg_2020_raster) <- crs("+init=EPSG:3414 +datum=WGS84 +units=km")
tm_shape(kde_childcare_sg_2020_raster) +
tm_raster("v", alpha=0.5, title = "Childcare Services 2020",midpoint = NA) +
tmap_options(basemaps = c('OpenStreetMap'))
Based on the simulation of the L-Function test, we can conclude that the distribution of childcare centres in 2017 and 2020 are spatially clustered. Also, based on the Density Kernel Maps we can see clusters of childcare located together, especially in areas located at the north and north east region.
Kernel Density Maps allows us to see the degree of concentration of locations points, thus we are able to better understand and visualise the extent. However, for point maps, there is no sense of degree of concentration, visually, users will only be able to see the individual points on the map but not relative to its location unlike kernel density map where it shows visually if the location is concentrated.
Kernel density also allows a range of colours so that users can better visualise the map relative to the surroundings, however, point maps only display a single colour on the map.
Kernel Density also displays a legend to show the range of available categories, so that users can quickly match a certain area to that particular category unlike point map.