A short description of not more than 350 words.
Description: The aim of this visualisation is to analyse Singaporeans’ commuting and transport habits. Aside from cars, Singapore’s well-established public transport system offers residents many options for commuting, such as Mass Rapid Transit (MRT) train services and public buses.
This data is taken from Singapore’s General Household Survey 2015, and comprises two tables. The two tables are about Mode of Transport to work and Travelling Time to work. Hence, analysis of the data aims to yield insights about what modes of transport Singaporean residents use to commute to work, and how long their commuting journey to work is.
Insights from visualisation: The first insight from this visualisation is that residents of the planning areas of Jurong, Yishun, Hougang, Bedok and Tampines have the highest MRT & Public Bus Only usage, as well as the highest Car Only usage. The second insight is that Woodlands has the highest number of residents with a long travel time in Singapore, with 21 thousand residents declaring a travel time of more than 60 minutes.
This interactive map allows you to visualise the rate of public transport usage across planning areas in Singapore. The colour intensity on the map represents the level of MRT & Public Bus Only usage, and the data is segmented by quantiles. Hover over each coloured planning area to view the name of the planning area. Click on a planning area to view more details about MRT & Public Bus Only usage.
As we can see, Jurong, Yishun, Hougang, Bedok and Tampines are among the planning areas with the highest MRT & Public Bus Only usage, falling in the highest quantile range of 33.6 - 41.4 thousand residents indicating this option.
Below is another interactive map that shows the number of “Car Only” transport users in Singapore. The colour intensity on the map represents the level of Car Only usage, and the data is segmented by quantiles. Hover over each coloured planning area to view the name of the planning area. Click on a planning area to view more details about Car Only usage.
When compared to the previous map on MRT & Public Bus Only usage, the map on Car Only usage yields some surprising results. Contrary to the expectation that public transport acts as a substitute for car usage, regions such as Jurong, Hougang, Paya Lebar, Bedok and Tampines appear to have both high Car Only usage and high MRT & Public Bus Only usage. One possible explanation is that both maps show the number of residents rather than a proportion, so more populous planning areas would tend to rank highly on both.
In this map, these aforementioned regions fall in the highest quantile of 21.36 - 36.90 thousand residents declaring Car Only usage.
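One way to probe the population explanation would be to express Car Only usage as a share of each planning area's working residents instead of a raw count. The sketch below is only illustrative: the area names and figures are made-up placeholders, not values from the survey, and the columns car_only and mode_total are hypothetical stand-ins for the “Car Only” and “Mode Total” columns.

```r
# Hypothetical figures in thousands (placeholders, NOT survey values)
usage <- data.frame(
  planning_area = c("AREA A", "AREA B", "AREA C"),
  car_only      = c(30, 12, 6),
  mode_total    = c(150, 40, 60),
  stringsAsFactors = FALSE
)

# Share of working residents who commute by car only
usage$car_share <- usage$car_only / usage$mode_total

# Rank the areas by raw count and by share
by_count <- usage$planning_area[order(-usage$car_only)]
by_share <- usage$planning_area[order(-usage$car_share)]
```

In this toy data the largest area leads on raw count but not on share, which is the pattern the population explanation would predict.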
This visualisation looks at how many residents across Singapore have a travel time of more than 60 minutes, which for the purposes of this analysis is defined as a long travel time. Hover over each coloured planning area to view the name of the planning area. Click on a planning area to view more details about the number of residents who have a long travel time.
Units in thousands
We can see that Woodlands has the highest number of residents with a long travelling time, as represented by the red colouring that corresponds to the highest class of 20 - 25 thousand.
The bar chart below is based on travel times for all Singapore residents, and is colour-coded according to the length of travel times. Hover over each bar for more details on the number of residents for each Travel Time category (please note that the Residents figures are in thousands).
Units in thousands
Amongst the Travel Time categories, 16 - 30 Mins appears to be the most common travel time for Singapore residents.
Open R Studio, and select File and then New Project. Name the new project “VA Assignment 5”. Once the project is open, select File again, and then select New File, and select R Markdown from the drop-down menu. Name the R Markdown file “Assignment 5”.
Download the .csv file from the SingStat website and rename it from “OutputFile.csv” to “Transport.csv”. Save it in the same working directory as the R Markdown file.
Open the .csv file in Excel. Remove unnecessary headings and notes from above and below the data.
Since both tables have a “Total” column, rename the first “Total” column (in column B of the .csv file) to “Mode Total”, and rename the second “Total” column (in column N of the .csv file) to “Time Total”. This helps to distinguish the two totals from each of the tables.
Fill in the first cell (cell A1) with “Planning Area”. Save the .csv file.
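The same renaming could also be done in R rather than Excel. The following is a minimal sketch under stated assumptions: the inline csv_text is a made-up, cut-down stand-in for the real file (with cell A1 already filled in), and only the two “Total” headers mirror the actual table.

```r
# Made-up, cut-down stand-in for the combined table (NOT the real data)
csv_text <- "Planning Area,Total,Car Only,Total,Up To 15 Mins
Area A,100,40,100,30
Area B,80,20,80,25"

# check.names = FALSE keeps the duplicated "Total" headers as they are
raw <- read.csv(text = csv_text, check.names = FALSE, stringsAsFactors = FALSE)

# Locate both "Total" columns and give each a distinct name, left to right
total_cols <- which(names(raw) == "Total")
stopifnot(length(total_cols) == 2)
names(raw)[total_cols] <- c("Mode Total", "Time Total")
```

Renaming by position in this way relies on the Mode of Transport table appearing before the Travelling Time table, as it does in the .csv file described above.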
Go to data.gov.sg and download the MP14 file in shp format. Save it to the working directory. Unzip the folder and make sure the unzipped folder is saved to the same working directory. Delete the original zipped folder.
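# Load the packages used for data wrangling, plotting and mapping
library(tidyverse)   # attaches dplyr, ggplot2 and readr, among others
library(plotly)      # ggplotly() for interactive charts
library(data.table)  # setDT()
library(sf)          # st_read() for shapefiles
library(tmap)        # thematic maps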
Read in the data and capitalise the planning area names so that they match the upper-case PLN_AREA_N values in the shapefile.
transport_data <- read_csv("Transport.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## `Planning Area` = col_character(),
## `Mode Total` = col_number(),
## `Public Bus Only` = col_double(),
## `MRT Only` = col_double(),
## `MRT & Public Bus Only` = col_double(),
## `Other Combinations Of MRT Or Public Bus` = col_double(),
## `Taxi Only` = col_double(),
## `Car Only` = col_double(),
## `Private Chartered Bus/Van Only` = col_double(),
## `Lorry/Pickup Only` = col_double(),
## `Motorcycle/ Scooter Only` = col_double(),
## Others = col_double(),
## `No Transport Required` = col_double(),
## `Time Total` = col_number(),
## `Up To 15 Mins` = col_double(),
## `16 - 30 Mins` = col_double(),
## `31 - 45 Mins` = col_double(),
## `46 - 60 Mins` = col_double(),
## `More Than 60 Mins` = col_double()
## )
transport_data$`Planning Area` <- toupper(transport_data$`Planning Area`)
mpsz <- st_read(dsn = "C:/Users/Lynnette/Documents/Class Content/Current folder (change once uploaded to hard disk)/SMU Y3S2/Visual Analytics/Assignment 5/VA Assignment 5/master-plan-2014-subzone-boundary-web-shp",
layer = "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `C:\Users\Lynnette\Documents\Class Content\Current folder (change once uploaded to hard disk)\SMU Y3S2\Visual Analytics\Assignment 5\VA Assignment 5\master-plan-2014-subzone-boundary-web-shp' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## Projected CRS: SVY21
mpsz
## Simple feature collection with 323 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## Projected CRS: SVY21
## First 10 features:
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C CA_IND PLN_AREA_N
## 1 1 1 MARINA SOUTH MSSZ01 Y MARINA SOUTH
## 2 2 1 PEARL'S HILL OTSZ01 Y OUTRAM
## 3 3 3 BOAT QUAY SRSZ03 Y SINGAPORE RIVER
## 4 4 8 HENDERSON HILL BMSZ08 N BUKIT MERAH
## 5 5 3 REDHILL BMSZ03 N BUKIT MERAH
## 6 6 7 ALEXANDRA HILL BMSZ07 N BUKIT MERAH
## 7 7 9 BUKIT HO SWEE BMSZ09 N BUKIT MERAH
## 8 8 2 CLARKE QUAY SRSZ02 Y SINGAPORE RIVER
## 9 9 13 PASIR PANJANG 1 QTSZ13 N QUEENSTOWN
## 10 10 7 QUEENSWAY QTSZ07 N QUEENSTOWN
## PLN_AREA_C REGION_N REGION_C INC_CRC FMEL_UPD_D X_ADDR
## 1 MS CENTRAL REGION CR 5ED7EB253F99252E 2014-12-05 31595.84
## 2 OT CENTRAL REGION CR 8C7149B9EB32EEFC 2014-12-05 28679.06
## 3 SR CENTRAL REGION CR C35FEFF02B13E0E5 2014-12-05 29654.96
## 4 BM CENTRAL REGION CR 3775D82C5DDBEFBD 2014-12-05 26782.83
## 5 BM CENTRAL REGION CR 85D9ABEF0A40678F 2014-12-05 26201.96
## 6 BM CENTRAL REGION CR 9D286521EF5E3B59 2014-12-05 25358.82
## 7 BM CENTRAL REGION CR 7839A8577144EFE2 2014-12-05 27680.06
## 8 SR CENTRAL REGION CR 48661DC0FBA09F7A 2014-12-05 29253.21
## 9 QT CENTRAL REGION CR 1F721290C421BFAB 2014-12-05 22077.34
## 10 QT CENTRAL REGION CR 3580D2AFFBEE914C 2014-12-05 24168.31
## Y_ADDR SHAPE_Leng SHAPE_Area geometry
## 1 29220.19 5267.381 1630379.3 MULTIPOLYGON (((31495.56 30...
## 2 29782.05 3506.107 559816.2 MULTIPOLYGON (((29092.28 30...
## 3 29974.66 1740.926 160807.5 MULTIPOLYGON (((29932.33 29...
## 4 29933.77 3313.625 595428.9 MULTIPOLYGON (((27131.28 30...
## 5 30005.70 2825.594 387429.4 MULTIPOLYGON (((26451.03 30...
## 6 29991.38 4428.913 1030378.8 MULTIPOLYGON (((25899.7 297...
## 7 30230.86 3275.312 551732.0 MULTIPOLYGON (((27746.95 30...
## 8 30222.86 2208.619 290184.7 MULTIPOLYGON (((29351.26 29...
## 9 29893.78 6571.323 1084792.3 MULTIPOLYGON (((20996.49 30...
## 10 30104.18 3454.239 631644.3 MULTIPOLYGON (((24472.11 29...
transport_data1 <- left_join(mpsz, transport_data,
by = c("PLN_AREA_N" = "Planning Area"))
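Because left_join() keeps every polygon and fills unmatched rows with NA, it can be worth checking which planning area names fail to match after capitalisation. A minimal sketch with plain tibbles standing in for the shapefile and the survey table; the object names boundaries and survey and all figures are hypothetical:

```r
library(dplyr)

# Hypothetical stand-ins: boundary names are upper case, survey names are not
boundaries <- tibble(PLN_AREA_N = c("BEDOK", "TAMPINES", "MARINA SOUTH"))
survey <- tibble(
  `Planning Area` = c("Bedok", "Tampines"),
  `Car Only`      = c(1.5, 2.5)  # placeholder figures, NOT survey values
)

# Capitalise the key so that it matches PLN_AREA_N
survey$`Planning Area` <- toupper(survey$`Planning Area`)

# Same join as above: every boundary row is kept, unmatched rows get NA
joined <- left_join(boundaries, survey, by = c("PLN_AREA_N" = "Planning Area"))

# Boundary rows with no survey data
unmatched <- anti_join(boundaries, survey, by = c("PLN_AREA_N" = "Planning Area"))
```

Here unmatched holds the single area without survey figures, which would appear on the map as a polygon with missing data.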
Create an interactive map that visualises public transport use in Singapore. Plot the variable “MRT & Public Bus Only” using the tmap elements tm_shape() and tm_polygons(). In tm_polygons(), set the colour palette to “Blues”, the classification style to “quantile” and the border transparency with border.alpha. Setting id to “PLN_AREA_N” lets users hover over each coloured area to see the name of its planning area.
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(transport_data1)+ tm_polygons("MRT & Public Bus Only",id="PLN_AREA_N", palette="Blues", style="quantile", border.alpha = 0.5) # + tm_layout(title="Number of Residents who use MRT & Public Bus Only (thousands)", title.position = "center")
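To see roughly what the “quantile” style does, the class boundaries can be reproduced by hand with base R. This is a sketch only: the values are made-up placeholders, and it assumes five classes and the default behaviour of quantile(), which may not match tmap's internals exactly.

```r
# Placeholder usage figures in thousands (NOT survey values)
usage <- c(2.1, 5.4, 8.9, 12.3, 17.8, 21.0, 26.5, 33.6, 38.2, 41.4)

# Five quantile classes need boundaries at 0%, 20%, ..., 100%
breaks <- quantile(usage, probs = seq(0, 1, by = 0.2))

# Assign each value to its class; include.lowest keeps the minimum in class 1
classes <- cut(usage, breaks = breaks, include.lowest = TRUE)

# Quantile classes hold (roughly) equal numbers of areas
class_counts <- table(classes)
```

Each class ends up with the same number of areas, so the class widths vary with how the values are spread.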
Create a “Car Only” map using the same elements as in the previous map. Change the col attribute of tm_polygons() to “Car Only”. Change the colour palette to “OrRd” for greater contrast with the previous map.
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(transport_data1)+ tm_polygons("Car Only",id="PLN_AREA_N",palette="OrRd", style = "quantile", border.alpha=0.5) # +tm_layout(title="Number of Residents who use Car Only (thousands)")
Concentration of long travel times (More than 60 mins) plotted on a map
Units in thousands
tmap_mode("view")
## tmap mode set to interactive viewing
# The extra tm_fill() and tm_borders() layers were dropped by tmap with a duplicated-layer warning, so they are removed here; the map keeps tmap's default class breaks
tm_shape(transport_data1) + tm_polygons("More Than 60 Mins",id="PLN_AREA_N",palette="OrRd") # + tm_layout(title= 'Concentration of Long Travel Times (More Than 60 Mins)', title.position = c('right', 'top'))
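The top legend class of 20 - 25 thousand reported for this map looks like equal-width rounded breaks rather than quantile breaks. A rough sketch of how such breaks arise, assuming tmap's default classification behaves like base R's pretty(); the values are placeholders apart from the maximum of 21 thousand quoted for Woodlands:

```r
# Placeholder values in thousands; only the maximum (21) comes from the text
long_travel <- c(0.3, 2.4, 6.1, 9.8, 14.2, 21)

# Rounded, equal-width class boundaries covering the data range
default_breaks <- pretty(long_travel, n = 5)

# The top class is bounded by the last two breaks
top_class <- tail(default_breaks, 2)
```

With a maximum of 21, the rounded boundaries run in steps of 5 up to 25, which is consistent with a top class of 20 - 25 thousand.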
Bar chart of travel times for all Singapore residents, taken from the first row of the data. First, select the five travel time columns (columns 15 to 19) from that row and transpose them. Then, use setDT() with keep.rownames to turn the row names, which hold the travel time categories for the x-axis, into the first column. Plot the graph using ggplot() and ggplotly(). Colour-code the bar chart by length of travel time.
Units in thousands
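# Take the five travel time columns (15 to 19) from the first row and transpose them into one column
total_data <- as.data.frame(t(transport_data[1, 15:19]))
# Move the row names (the travel time categories) into a "Time Taken" column
new1 <- setDT(total_data, keep.rownames = "Time Taken")
Time_Taken <- new1$`Time Taken`
Residents <- new1$V1
# Order in which the bars appear on the x-axis
positions <- c("Up To 15 Mins", "16 - 30 Mins", "31 - 45 Mins", "46 - 60 Mins", "More Than 60 Mins")
# Fill colours are assigned to the categories in alphabetical order, so "Up To 15 Mins" takes the last value
ggplotly(ggplot(new1, aes(x=Time_Taken, y=Residents,fill=Time_Taken)) + geom_bar(stat="identity")+labs(y="Number of residents (thousands)",x="Time Taken")+scale_x_discrete(limits = positions) + scale_fill_manual(values=c("#FACFC6","#F4AA9B","#F67055","#F54C28","#FFF0DC"))+ theme(legend.position = "none"), tooltip = c("Residents"))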
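An alternative way to reshape a single row for bar plotting is tidyr's pivot_longer(), with the category order fixed through factor levels. This is a sketch rather than the method used above: the totals tibble and its figures are hypothetical placeholders, and only the five column names mirror the real data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical one-row table (placeholder figures, NOT survey values)
totals <- tibble(
  `Planning Area`     = "TOTAL",
  `Up To 15 Mins`     = 10,
  `16 - 30 Mins`      = 30,
  `31 - 45 Mins`      = 25,
  `46 - 60 Mins`      = 20,
  `More Than 60 Mins` = 15
)

positions <- c("Up To 15 Mins", "16 - 30 Mins", "31 - 45 Mins",
               "46 - 60 Mins", "More Than 60 Mins")

# One row per category; factor levels fix the bar order
long <- totals %>%
  pivot_longer(cols = all_of(positions),
               names_to = "Time Taken", values_to = "Residents") %>%
  mutate(`Time Taken` = factor(`Time Taken`, levels = positions))

bar <- ggplot(long, aes(x = `Time Taken`, y = Residents, fill = `Time Taken`)) +
  geom_col() +
  labs(x = "Time Taken", y = "Number of residents (thousands)") +
  theme(legend.position = "none")
```

The resulting bar object could then be passed to ggplotly() in the same way as the chart above. Because the categories are an ordered factor, a manual fill scale would assign its colours in the order of positions rather than alphabetically.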
Since I used a dataset that was made up of two data tables, some columns had duplicate names, which made the data difficult to use for analysis. Therefore, I renamed those columns to make it clear what each one referred to. I also had to slice and reshape some of the data, especially for the last graph, a bar plot, because the data was not in a format suitable for bar plotting.
The titles of the graphs did not appear where they should have after knitting, when I used tm_layout() to specify the titles (tm_layout() is commented out in Section 3 for reference). Hence, I had to use headers in place of titles for the maps specifically, and I also added it for the bar plot for the sake of standardisation even though the bar plot’s title attribute was working properly during knitting.
In order to set up the comparison for Modes of Transport in Section 2, I had to differentiate the two Modes of Transport graphs. I chose to do so by choosing the “Blues” colour palette for the first map, and the “OrRd” colour palette for the second map, in order to create visual contrast and prepare the viewer for comparisons between the two.
For the Travel Times map and graphs, I decided to give them a similar colour theme because they are analysing similar aspects of the topic of Travel Times.
Introduction to dataset: The dataset comprises two tables from Singapore’s General Household Survey 2015. The two tables are about mode of transport to work and Travelling Time to work, and are called “Table 146 Resident Working Persons Aged 15 Years and Over by Planning Area and Usual Mode of Transport to Work” and “Table 147 Resident Working Persons Aged 15 Years and Over by Planning Area and Travelling Time to Work”. The data is taken from SingStat. The numerical data is displayed in thousands.