Singapore’s public transport use rose to hit a record high in 2018, with a total of 7.54 million trips made on buses or trains each day.
Here’s what may come across your mind: Do you ever have experiences where a bus ride that is supposed to be short and quick took way longer than expected? Are you frustrated that the bus stops at every stop even though there’s nobody boarding or alighting? And why do we have so many bus stops that almost nobody uses?
What if we can reimagine the public bus network in Singapore through data?
This dataviz provides an Exploratory Data Analysis (EDA) on the dataset, in order to gain intuition on how we can improve public transport in Singapore.
Dataset is available through an API connection to LTA Data Mall. Data is not publically available but available upon a written request. For this project, we will need to write a script in order to make an API call to extract the data we need. Data includes Live data as well as Historical data.
The purpose of this Data Visualization is exploratory, it allows us to take a look at the data visually before deep diving into any potential issues.
This visualization shows the location of all busstops in Singapore, with the proportion being the passenger volume.
This visualization shows the places people from a busstop travels to.
First, load the necessary R packages in RStudio.
tidyverse
contains a set of essential packages for data manipulation and explorationleaflet
is an open-sourced javascript libraries for interactive mapssf
stands for simple-features, it is used for wrangling geospatial datatmap
offers a flexible, layer-based, to create thematic maps such as choropleths and bubble mapsRColorBrewer
offers beautiful colours which we will use in the visualizationleaflet.extras
provides additional functionality map function, such as search barsgeosphere
is used to convert distance between two points and plot out the points in betweenpackages <- c('tidyverse', 'leaflet', 'sf', 'RColorBrewer', 'leaflet.extras', 'geosphere')
for (p in packages){
if (!require(p,character.only = T)){
install.packages(p)
}
library(p,character.only = T)
}
Second, load the necessary data in Rstudio.
This dataset shows the number of tap ins and tap outs per bus stop per time of day, segmented by weekdays/weekends in January 2020. The datasets used here are downloaded directly from LTA DataMall API.
busstop_volume <- read.csv("data/passenger volume by busstop.csv")
colnames(busstop_volume)[5] = "BusStopCode"
head(busstop_volume)
## YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE BusStopCode
## 1 2020-01 WEEKENDS/HOLIDAY 6 BUS 67551
## 2 2020-01 WEEKDAY 6 BUS 67551
## 3 2020-01 WEEKDAY 7 BUS 66541
## 4 2020-01 WEEKENDS/HOLIDAY 7 BUS 66541
## 5 2020-01 WEEKENDS/HOLIDAY 17 BUS 54209
## 6 2020-01 WEEKDAY 17 BUS 54209
## TOTAL_TAP_IN_VOLUME TOTAL_TAP_OUT_VOLUME
## 1 224 22
## 2 3922 122
## 3 648 364
## 4 127 109
## 5 736 325
## 6 1388 920
In addition, we import busstop information information. This dataset includes latitude and longitude. This dataset is extracted by parsing busstop information from LTA Datamall to Onemap.sg in order to extract the planning_area data. As there are too many factors to planning_area, we mapped some of the smallers area to Others
to reduce categorical features.
busstop_information <- read.csv("data/busstops_with_planning_area.csv")[3:8]
busstop_information$planning_area = str_to_title(busstop_information$planning_area)
busstop_information <- filter(busstop_information, planning_area != "Invalid")
busstop_information$planning_area <- as.character(busstop_information$planning_area)
busstop_information$planning_area[busstop_information$planning_area %in% c('Central Water Catchment', 'Mandai', 'Marina South', 'Museum', 'Newton', 'Orchard', 'Outram', 'Seletar', 'Rochor', 'Singapore River', 'Tanglin', 'Southern Islands', 'River Valley', 'Paya Lebar', 'Straits View', 'Tengah')] <- "Others"
head(busstop_information)
## BusStopCode Description Latitude Longitude RoadName
## 1 1012 Hotel Grand Pacific 1.296848 103.8525 Victoria St
## 2 1013 St. Joseph's Ch 1.297710 103.8532 Victoria St
## 3 1019 Bras Basah Cplx 1.296990 103.8530 Victoria St
## 4 1029 Opp Natl Lib 1.296673 103.8544 Nth Bridge Rd
## 5 1039 Bugis Cube 1.298208 103.8555 Nth Bridge Rd
## 6 1059 Bugis Stn Exit B 1.300757 103.8561 Victoria St
## planning_area
## 1 Downtown Core
## 2 Others
## 3 Downtown Core
## 4 Downtown Core
## 5 Downtown Core
## 6 Downtown Core
Now we have latitude and longitude in the busstop_information
, we join it to busstop_volume
dataset through ’BusStopCode`. We also remove busstop_volume from memory to free the memory.
busstop_volume_lat_long <- dplyr::inner_join(busstop_volume, busstop_information, by ='BusStopCode')
str(busstop_volume_lat_long)
## 'data.frame': 197054 obs. of 12 variables:
## $ YEAR_MONTH : Factor w/ 1 level "2020-01": 1 1 1 1 1 1 1 1 1 1 ...
## $ DAY_TYPE : Factor w/ 2 levels "WEEKDAY","WEEKENDS/HOLIDAY": 2 1 1 2 2 1 2 1 2 1 ...
## $ TIME_PER_HOUR : int 6 6 7 7 17 17 12 12 17 17 ...
## $ PT_TYPE : Factor w/ 1 level "BUS": 1 1 1 1 1 1 1 1 1 1 ...
## $ BusStopCode : int 67551 67551 66541 66541 54209 54209 61049 61049 67611 67611 ...
## $ TOTAL_TAP_IN_VOLUME : int 224 3922 648 127 736 1388 92 140 442 765 ...
## $ TOTAL_TAP_OUT_VOLUME: int 22 122 364 109 325 920 72 143 117 284 ...
## $ Description : Factor w/ 4479 levels "112 Katong","18 Woodsville",..: 1526 1526 1901 1901 1047 1047 39 39 1224 1224 ...
## $ Latitude : num 1.4 1.4 1.38 1.38 1.38 ...
## $ Longitude : num 104 104 104 104 104 ...
## $ RoadName : Factor w/ 821 levels "Adam Rd","Admiralty Dr",..: 573 573 115 115 26 26 755 755 565 565 ...
## $ planning_area : chr "Sengkang" "Sengkang" "Hougang" "Hougang" ...
rm(busstop_volume)
In order to reduce dimensions, we filtered the busstop_volume_lat_long
data, setting TIME_PER_HOUR
to 10 and DAY_TYPE
to Weekday. We also created tap_in_out_radius
to set the size of the bubble in the map.
location <- busstop_volume_lat_long %>%
dplyr::filter(DAY_TYPE == c('WEEKDAY'))%>%
dplyr::filter(TIME_PER_HOUR == 10)%>%
dplyr::group_by(BusStopCode)%>%
dplyr::arrange(desc(BusStopCode))%>%
rename(c(lat = Latitude, lon = Longitude))
#location$tap_in_radius <- sqrt(location$TOTAL_TAP_IN_VOLUME)/6
#location$tap_out_radius <- sqrt(location$TOTAL_TAP_OUT_VOLUME)/6
location$tap_in_out_radius <- (location$TOTAL_TAP_IN_VOLUME + location$TOTAL_TAP_OUT_VOLUME)**(1/2)/6
#location <- head(location)
head(location)
## # A tibble: 6 x 13
## # Groups: BusStopCode [6]
## YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE BusStopCode TOTAL_TAP_IN_VO~
## <fct> <fct> <int> <fct> <int> <int>
## 1 2020-01 WEEKDAY 10 BUS 99189 65
## 2 2020-01 WEEKDAY 10 BUS 99181 276
## 3 2020-01 WEEKDAY 10 BUS 99171 5
## 4 2020-01 WEEKDAY 10 BUS 99161 1
## 5 2020-01 WEEKDAY 10 BUS 99139 0
## 6 2020-01 WEEKDAY 10 BUS 99131 747
## # ... with 7 more variables: TOTAL_TAP_OUT_VOLUME <int>, Description <fct>,
## # lat <dbl>, lon <dbl>, RoadName <fct>, planning_area <chr>,
## # tap_in_out_radius <dbl>
As we want this map to be filterable, we need to create many mini datasets, one for each region. To do this, we create a list called planning_area_list
which contains a list of unique planning_area. After which, we do a loop over this list, with each iteration filtering the main dataset by the region, and saving it in a seperate dataset.
planning_area_list <-sort(unique(location$planning_area))
for (i in planning_area_list)
{
df <- filter(location, planning_area == i)
assign(paste(i), df)
}
This map is made such that it is filterable. Some features I added includes:
Set3
#pal <- colorNumeric(palette = "Reds", domain = c(0:11), reverse = FALSE)
pal <- colorFactor(palette = 'Set3', domain = planning_area_list)
map <-
leaflet(width = 800, height = 600)%>%
#addTiles(group = "OSM") %>%
addProviderTiles("CartoDB", group = "CartoDB") %>%
#addProviderTiles("Esri", group = "Esri") %>%
addCircleMarkers(data=`Ang Mo Kio`, group = 'Ang Mo Kio', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Bedok`, group = 'Bedok', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Bishan`, group = 'Bishan', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Boon Lay`, group = 'Boon Lay', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Bukit Batok`, group = 'Bukit Batok', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Bukit Merah`, group = 'Bukit Merah', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Bukit Panjang`, group = 'Bukit Panjang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Bukit Timah`, group = 'Bukit Timah', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Changi`, group = 'Changi', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Choa Chu Kang`, group = 'Choa Chu Kang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Clementi`, group = 'Clementi', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Downtown Core`, group = 'Downtown Core', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Geylang`, group = 'Geylang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Hougang`, group = 'Hougang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Jurong East`, group = 'Jurong East', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Jurong West`, group = 'Jurong West', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Kallang`, group = 'Kallang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Lim Chu Kang`, group = 'Lim Chu Kang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Marine Parade`, group = 'Marine Parade', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Novena`, group = 'Novena', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Pasir Ris`, group = 'Pasir Ris', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Pioneer`, group = 'Pioneer', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Punggol`, group = 'Punggol', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Queenstown`, group = 'Queenstown', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Sembawang`, group = 'Sembawang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Sengkang`, group = 'Sengkang', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Serangoon`, group = 'Serangoon', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Sungei Kadut`, group = 'Sungei Kadut', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Tampines`, group = 'Tampines', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Toa Payoh`, group = 'Toa Payoh', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Tuas`, group = 'Tuas', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Western Water Catchment`, group = 'Western Water Catchment', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Woodlands`, group = 'Woodlands', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Yishun`, group = 'Yishun', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`Others`, group = 'Others', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color = ~pal(planning_area),
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
setView(lat = 1.3004641, lng = 104.4496028, zoom = 11)%>%
setMaxBounds(lng1 = 103.801959 + .25,
lat1 = 1.32270 + .25,
lng2 = 103.801959 - .25,
lat2 = 1.32270 - .25)%>%
#addSearchOSM() %>%
#addReverseSearchOSM()%>%
addLayersControl(baseGroups =unique(planning_area_list))
#layersControlOptions(collapsed = FALSE, autoZIndex = TRUE)
#addSearchFeatures(
# targetGroups = planning_area_list,
# Set the search zoom level to 18
# options = searchFeaturesOptions(zoom = 18))
map
First, we do so by filtering the data to Weekday, while allowing data from all hours of the day to be included.
location_all_time <- busstop_volume_lat_long %>%
dplyr::filter(DAY_TYPE == c('WEEKDAY'))%>%
# dplyr::filter(planning_area == c('BISHAN'))%>%
#dplyr::filter(TIME_PER_HOUR == 6)%>%
dplyr::group_by(BusStopCode)%>%
dplyr::arrange(desc(BusStopCode))%>%
rename(c(lat = Latitude, lon = Longitude))
location_all_time$tap_in_radius <- sqrt(location_all_time$TOTAL_TAP_IN_VOLUME)/6
location_all_time$tap_out_radius <- sqrt(location_all_time$TOTAL_TAP_OUT_VOLUME)/6
location_all_time$tap_in_out_radius <- (location_all_time$TOTAL_TAP_IN_VOLUME + location_all_time$TOTAL_TAP_OUT_VOLUME)**(1/2)/20
location_all_time <-location_all_time[c('TIME_PER_HOUR', 'TOTAL_TAP_IN_VOLUME', 'TOTAL_TAP_OUT_VOLUME', 'Description', 'lat', 'lon', 'planning_area', 'tap_in_out_radius')]
tail(location_all_time)
## # A tibble: 6 x 8
## TIME_PER_HOUR TOTAL_TAP_IN_VO~ TOTAL_TAP_OUT_V~ Description lat lon
## <int> <int> <int> <fct> <dbl> <dbl>
## 1 10 589 1833 Hotel Gran~ 1.30 104.
## 2 17 2375 2215 Hotel Gran~ 1.30 104.
## 3 6 374 642 Hotel Gran~ 1.30 104.
## 4 19 1979 1927 Hotel Gran~ 1.30 104.
## 5 13 1233 2122 Hotel Gran~ 1.30 104.
## 6 16 1692 1934 Hotel Gran~ 1.30 104.
## # ... with 2 more variables: planning_area <chr>, tap_in_out_radius <dbl>
Secondly, we create a unique list of TIME_PER_HOUR and looping over it to create a dataframe for each hour of the day. A filter of TOTAL_TAP_IN_VOLUME >100 is implemented to reduce the data size.
TIME_PER_HOUR_list <-sort(unique(location_all_time$TIME_PER_HOUR))
for (i in TIME_PER_HOUR_list)
{
df <- filter(location_all_time, TIME_PER_HOUR == i)%>%filter(TOTAL_TAP_IN_VOLUME > 200)
assign(paste(i), df)
}
Finally, we create a map of the traffic by time of the day. Reader will be able to choose which time of the day he/she wants to see.
Work in process: To convert this to an animation
#pal <- colorFactor(palette = 'Set3', domain = TIME_PER_HOUR_list)
map_time <-
leaflet(width = 800, height = 600)%>%
addProviderTiles("CartoDB") %>%
addCircleMarkers(data=`6`, group = '06:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`7`, group = '07:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`8`, group = '08:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`9`, group = '09:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`10`, group = '10:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`11`, group = '11:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`12`, group = '12:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`13`, group = '13:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`14`, group = '14:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`15`, group = '15:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`16`, group = '16:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`17`, group = '17:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`18`, group = '18:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`19`, group = '19:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`20`, group = '20:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`21`, group = '21:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`22`, group = '22:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
addCircleMarkers(data=`23`, group = '23:00', stroke = FALSE, fillOpacity = 0.75,
lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
setView(lat = 1.3704641, lng = 103.8496028, zoom = 11)%>%
setMaxBounds(lng1 = 103.801959 + .25,
lat1 = 1.32270 + .25,
lng2 = 103.801959 - .25,
lat2 = 1.32270 - .25)%>%
#addSearchOSM() %>%
#addReverseSearchOSM()%>%
addLayersControl(baseGroups = c('06:00','07:00','08:00','09:00','10:00','11:00','12:00','13:00','14:00','15:00','16:00','17:00','18:00','19:00','20:00','21:00','22:00','23:00'))
#layersControlOptions(collapsed = FALSE, autoZIndex = TRUE)
#addSearchFeatures(
# targetGroups = planning_area_list,
# Set the search zoom level to 18
# options = searchFeaturesOptions(zoom = 18))
map_time
In the second proposed visualization, we want to create a map where I can see traffic flow from busstop A to all other busstop.
The data is stored in another dataset, origin_destination_bus_202001.csv
, also downloaded from LTA Data Mall. This dataset is huge (280MB). So when reading this data in, I have to be careful not to exceed our memory capacity. Hence, I used R pip function to filter the data before reading it into the dataframe origin_destination
.
origin_destination <- read.csv("data/origin_destination_bus_202001.csv") %>%
dplyr::filter(ORIGIN_PT_CODE == c('54009'))%>%
dplyr::filter(DAY_TYPE == c('WEEKDAY'))
origin_destination <- origin_destination[with(origin_destination, order(-TOTAL_TRIPS)), ]%>%slice(0:50)
Similar to previous code chunk, we add in latitude and longtitude by joining it to busstop_information. We need to join them twice as we need once for Origin station, and one for Destination station.
Hence, I create two dataframe, one for origin busstop_information_origin
and one for destination busstop_information_destination
and join them twice.
busstop_information_origin <- busstop_information[c('BusStopCode', 'Description', 'Latitude', 'Longitude', 'planning_area')]%>%
rename(Origin_Description = Description, Origin_Latitude = Latitude, Origin_Longitude = Longitude)
busstop_information_destination <- busstop_information[c('BusStopCode', 'Description', 'Latitude', 'Longitude', 'planning_area' )]%>%
rename(Destination_Description = Description, Destination_Latitude = Latitude, Destination_Longitude = Longitude)
origin_destination <- origin_destination %>%
#rename(Origin_BusStopCode = ORIGIN_PT_CODE, Destination_BusStopCode = DESTINATION_PT_CODE)
dplyr::inner_join(busstop_information_origin, by = c("ORIGIN_PT_CODE" = "BusStopCode"))%>%
dplyr::inner_join(busstop_information_destination, by = c("DESTINATION_PT_CODE" = "BusStopCode"))
head(origin_destination)
## YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
## 1 2020-01 WEEKDAY 19 BUS 54009 54389
## 2 2020-01 WEEKDAY 18 BUS 54009 54389
## 3 2020-01 WEEKDAY 18 BUS 54009 54181
## 4 2020-01 WEEKDAY 20 BUS 54009 54389
## 5 2020-01 WEEKDAY 19 BUS 54009 54181
## 6 2020-01 WEEKDAY 18 BUS 54009 54191
## TOTAL_TRIPS Origin_Description Origin_Latitude Origin_Longitude
## 1 69520 Ang Mo Kio Int 1.369688 103.8486
## 2 65480 Ang Mo Kio Int 1.369688 103.8486
## 3 52760 Ang Mo Kio Int 1.369688 103.8486
## 4 52300 Ang Mo Kio Int 1.369688 103.8486
## 5 51740 Ang Mo Kio Int 1.369688 103.8486
## 6 51500 Ang Mo Kio Int 1.369688 103.8486
## planning_area.x Destination_Description Destination_Latitude
## 1 Ang Mo Kio Blk 465 1.366390
## 2 Ang Mo Kio Blk 465 1.366390
## 3 Ang Mo Kio BLK 258 1.370533
## 4 Ang Mo Kio Blk 465 1.366390
## 5 Ang Mo Kio BLK 258 1.370533
## 6 Ang Mo Kio BLK 170 1.374472
## Destination_Longitude planning_area.y
## 1 103.8566 Ang Mo Kio
## 2 103.8566 Ang Mo Kio
## 3 103.8365 Ang Mo Kio
## 4 103.8566 Ang Mo Kio
## 5 103.8365 Ang Mo Kio
## 6 103.8376 Ang Mo Kio
While searching online, I didn’t find much resources on plotting this type of movement map. I used addPolylines on leaflet map for this task, which prove to be not a very nice solution. Probably need to consult Professor Kam to get some advice on this.
#format data
origin_destination$Origin_Longitude = as.numeric(as.character(origin_destination$Origin_Longitude))
origin_destination$Origin_Latitude = as.numeric(as.character(origin_destination$Origin_Latitude))
origin_destination$Destination_Longitude = as.numeric(as.character(origin_destination$Destination_Longitude))
origin_destination$Destination_Latitude = as.numeric(as.character(origin_destination$Destination_Latitude))
origin_destination$Destination_Description = as.factor(as.numeric(as.factor(origin_destination$Destination_Description)))
pathList = NULL
factpal <- colorFactor(heat.colors(30), pathList$id)
for(i in 1:nrow(origin_destination))
{
tmp = gcIntermediate(c(origin_destination$Origin_Longitude[i],
origin_destination$Origin_Latitude[i]),
c(origin_destination$Destination_Longitude[i],
origin_destination$Destination_Latitude[i]),n = 25,
addStartEnd=TRUE)
tmp = data.frame(tmp)
tmp$id = origin_destination[i,]$Destination_Description
tmp$weight = sqrt(origin_destination[i,]$TOTAL_TRIPS)/50
tmp$color = factpal(origin_destination[i,]$Destination_Description)
pathList = c(pathList,list(tmp))
}
leaflet(width = 800, height = 600)%>%
addProviderTiles("CartoDB") -> lf
#add each entry of pathlist to the leaflet object
for (path in pathList)
{
lf %>% addPolylines(data = path,
lng = ~lon,
lat = ~lat,
weight = ~weight
#color = ~color
) -> lf
}
#show output
lf
In a static visualization, users can only see what you want to show them, but in an interactive data visualization, users have the opportunity to interact with the data, explore, and draw insights for themselves. It gives the user a much better user experience as they are not simply ‘shown what you want them to see’.
With interactivity, we are able to incorporate so much more information into a visualization. We can take care of dimensionality by leaving them all in filters, and in the case of maps, users can zoom in and out to different parts of the map to explore features that interest them.
The purpose of a data viz is to convince users of a certain topic using data. Having interactivity, users can form their own cause and effect analysis which helps them gain better intuition on the reason for the data. Thoughts generated by the user itself stays on longer and stronger, compared to a point being told to them.