Many people in Singapore commute by MRT. Because public transport plays an increasingly important role in modern society, this assignment explores the MRT network in Singapore geographically. The analysis first maps the population density of Singapore's planning areas and then examines the relationship between population density and MRT station locations. It also takes a closer look at how many people in each planning area travel by MRT.
Statistics Singapore: https://www.singstat.gov.sg/find-data/search-by-theme/population/geographic-distribution/latest-data
Kaggle (Singapore Train Station Coordinates): https://www.kaggle.com/yxlee245/singapore-train-station-coordinates?select=mrt_lrt_data.csv
In-class data: geography data provided in class
To analyze Singapore's MRT, the design starts with the population density and the MRT station locations in each planning area, which gives a first look at the relationship between them. A separate chart then shows the population that commutes by MRT only, for a deeper understanding of MRT usage in Singapore.
Difficulty in data preprocessing when combining data
To visualize a map, the data must include the geographic information and link it to all the other attributes, and combining everything is difficult. It took a lot of time to work out how to join the datasets.
Difficulty in adding information to the leaflet chart
When adding the information about the population that travels by MRT in Singapore, it was difficult to fit it into the leaflet chart, and the chart could not emphasize the population ranking. Hence, I moved this information to a separate graph.
Heatmap with interaction
The ideal visualization for this topic is a heatmap of Singapore by population density with the MRT stations added to the same map. However, adding further information to the heatmap that leaflet provides would be too complicated. Hence, I separated them into different maps.
Sketch
# suppressing warnings
defaultW <- getOption("warn")
options(warn = -1)
# importing libraries
library(corrplot)
library(data.table)
library(dplyr)
library(ggplot2)
library(tidyverse)
library(viridis)
library(stringr)
library(leaflet)
library(tmap)
library(sf)
options(warn = defaultW)
Import the data.
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `D:\SMU\Term 2\Visual\Assignment 5\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## projected CRS: SVY21
Prepare the data by grouping the 2019 population into age groups before combining it with the geography data.
popag2019 <- popag %>%
  filter(Time == 2019) %>%
  # one column per age group
  spread(AG, Pop) %>%
  mutate(YOUNG = `0_to_4` + `5_to_9` + `10_to_14` +
           `15_to_19` + `20_to_24`) %>%
  # the positional indices below depend on the column order produced by spread()
  mutate(`ECONOMY ACTIVE` = rowSums(.[9:13]) +
           rowSums(.[15:17])) %>%
  mutate(`AGED` = rowSums(.[18:22])) %>%
  mutate(`TOTAL` = rowSums(.[5:22])) %>%
  mutate(`DEPENDENCY` = (`YOUNG` + `AGED`)
         / `ECONOMY ACTIVE`) %>%
  # upper-case the names so they match the geography data
  mutate_at(.vars = vars(PA, SZ), toupper) %>%
  select(`PA`, `SZ`, `YOUNG`, `ECONOMY ACTIVE`, `AGED`,
         `TOTAL`, `DEPENDENCY`) %>%
  filter(`ECONOMY ACTIVE` > 0)
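The grouping above selects columns by position, which is easy to get wrong if the column order changes. A minimal sketch of the same idea with name-based selection follows. The toy data, the reduced set of four age bands, and the `*_cols` vectors are assumptions made for illustration; they are not the assignment's data or its full age grouping.

```r
library(dplyr)
library(tidyr)

# Toy long-format data; the values are invented for illustration only.
popag_toy <- tibble(
  PA   = "Area A",
  SZ   = "Subzone 1",
  Time = 2019,
  AG   = c("0_to_4", "5_to_9", "25_to_29", "65_to_69"),
  Pop  = c(10, 20, 100, 30)
)

# Hypothetical, shortened age-band lists; extend them to the real bands.
young_cols  <- c("0_to_4", "5_to_9")
active_cols <- c("25_to_29")
aged_cols   <- c("65_to_69")

popag_wide <- popag_toy %>%
  filter(Time == 2019) %>%
  pivot_wider(names_from = AG, values_from = Pop) %>%
  mutate(
    YOUNG            = rowSums(across(all_of(young_cols))),
    `ECONOMY ACTIVE` = rowSums(across(all_of(active_cols))),
    AGED             = rowSums(across(all_of(aged_cols))),
    TOTAL            = YOUNG + `ECONOMY ACTIVE` + AGED,
    DEPENDENCY       = (YOUNG + AGED) / `ECONOMY ACTIVE`
  ) %>%
  mutate(across(c(PA, SZ), toupper)) %>%
  select(PA, SZ, YOUNG, `ECONOMY ACTIVE`, AGED, TOTAL, DEPENDENCY)

print(popag_wide)
```

Selecting by name keeps each group readable and makes the code fail loudly if an expected age band is missing.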
Combine the population data with the geography data by subzone.
mpsz_popag2019 <- left_join(mpsz, popag2019, by = c("SUBZONE_N" = "SZ"))
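Because the population table was filtered to subzones with an economically active population, some subzones in the geography data may have no match and end up with `NA` after the left join. The sketch below shows one way to check this with `anti_join()`. It uses toy tables with hypothetical subzone names, not the real data.

```r
library(dplyr)

# Toy stand-ins for the geography and population tables (hypothetical names).
zones_toy <- tibble(SUBZONE_N = c("SUBZONE A", "SUBZONE B", "SUBZONE C"))
pop_toy <- tibble(
  SZ    = c("Subzone A", "Subzone B"),
  TOTAL = c(1200, 800)
) %>%
  mutate(SZ = toupper(SZ))  # match the upper-case names in the geography data

# The left join keeps every subzone; unmatched ones get NA in TOTAL.
joined <- left_join(zones_toy, pop_toy, by = c("SUBZONE_N" = "SZ"))

# The anti join lists the subzones that found no population record.
unmatched <- anti_join(zones_toy, pop_toy, by = c("SUBZONE_N" = "SZ"))

print(joined)
print(unmatched)
```

An empty `unmatched` table would indicate that every subzone received population figures.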
Calculate the population density
mpsz_2019 <- mpsz_popag2019 %>% mutate(pop_den = TOTAL / SHAPE_Area * 1e6)
map2019 <- mpsz_2019[-c(1:2)]
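As a worked example of the density formula, the sketch below applies it to two toy subzones. It assumes that `SHAPE_Area` is stored in square metres, as is usual for a metre-based projected CRS such as SVY21, so that multiplying by 1e6 yields persons per square kilometre. The figures are invented.

```r
library(dplyr)

# Toy subzones; populations and areas are invented for illustration.
toy <- tibble(
  SUBZONE_N  = c("SUBZONE A", "SUBZONE B"),
  TOTAL      = c(5000, 1200),   # residents
  SHAPE_Area = c(2.5e6, 4e5)    # assumed to be in square metres
)

# persons per m^2, scaled by 1e6 to persons per km^2
toy_den <- toy %>%
  mutate(pop_den = TOTAL / SHAPE_Area * 1e6)

print(toy_den)
```

Under that assumption, 5000 residents on 2.5 square kilometres give 2000 persons per square kilometre, and 1200 residents on 0.4 square kilometres give 3000.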
Draw a density map
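A minimal sketch of a choropleth of population density with tmap is shown here. It uses two toy square polygons in place of the real subzones, and the `sq()` helper, the geometry, and the density values are all hypothetical. It is an illustration of the approach, not the map drawn in this assignment.

```r
library(sf)
library(tmap)

# Helper (hypothetical): a 1 km square whose lower-left corner is (x0, 30000).
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 30000), c(x0 + 1000, 30000), c(x0 + 1000, 31000),
    c(x0, 31000), c(x0, 30000)
  )))
}

# Toy subzones in SVY21 (EPSG:3414) with invented densities.
toy_map <- st_sf(
  SUBZONE_N = c("SUBZONE A", "SUBZONE B"),
  pop_den   = c(2000, 3000),
  geometry  = st_sfc(sq(20000), sq(21000), crs = 3414)
)

# Fill each polygon according to its population density.
density_map <- tm_shape(toy_map) +
  tm_polygons("pop_den")

density_map
```

Calling `tmap_mode("view")` before printing the map switches tmap from a static plot to an interactive view.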
Build interactive MRT map
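The following is a minimal sketch of an interactive station map with leaflet. The station names and coordinates are placeholders rather than values from the Kaggle file, and the column names `station_name`, `lat`, and `lng` are assumptions about its layout.

```r
library(leaflet)

# Placeholder stations; names and coordinates are illustrative only.
stations_toy <- data.frame(
  station_name = c("Station A", "Station B"),
  lat          = c(1.30, 1.35),
  lng          = c(103.80, 103.85)
)

# One circle marker per station, labelled with the station name on hover.
mrt_map <- leaflet(stations_toy) %>%
  addTiles() %>%
  addCircleMarkers(
    lng = ~lng, lat = ~lat,
    label = ~station_name,
    radius = 4, stroke = FALSE, fillOpacity = 0.8
  )

mrt_map
```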
Combine geography data with MRT information
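One way to attach stations to subzones is a spatial join. The sketch below assumes the station file stores longitude and latitude in WGS84 (EPSG:4326), so the points are transformed to the subzones' SVY21 CRS before joining. The polygons, stations, `sq()` helper, and column names are hypothetical, and the toy longitude and latitude values are derived from SVY21 points only so that they fall inside the toy squares.

```r
library(sf)
library(dplyr)

# Helper (hypothetical): a 1 km square whose lower-left corner is (x0, 30000).
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 30000), c(x0 + 1000, 30000), c(x0 + 1000, 31000),
    c(x0, 31000), c(x0, 30000)
  )))
}

# Toy subzones in SVY21 (EPSG:3414).
zones_toy <- st_sf(
  SUBZONE_N = c("SUBZONE A", "SUBZONE B"),
  geometry  = st_sfc(sq(20000), sq(21000), crs = 3414)
)

# Build a toy table with lon/lat columns, mimicking a station CSV.
pts <- st_sfc(st_point(c(20500, 30500)), st_point(c(21500, 30500)), crs = 3414)
lonlat <- st_coordinates(st_transform(pts, 4326))
stations_csv <- data.frame(
  station_name = c("Station A", "Station B"),
  lng = lonlat[, "X"],
  lat = lonlat[, "Y"]
)

# Turn lon/lat into points, then reproject to the subzones' CRS.
stations_sf <- stations_csv %>%
  st_as_sf(coords = c("lng", "lat"), crs = 4326) %>%
  st_transform(st_crs(zones_toy))

# Attach the containing subzone to each station, then count per subzone.
stations_in_zones <- st_join(stations_sf, zones_toy, join = st_within)
station_counts <- stations_in_zones %>%
  st_drop_geometry() %>%
  count(SUBZONE_N, name = "n_stations")

print(station_counts)
```

Joining in a projected CRS avoids mixing degrees with metres, which would otherwise place every station outside the subzone polygons.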
According to the chart above we can observe three main findings: