• 1 Overview
  • 2 Visualization Planning - Core Purpose, Components, and Interactivity Features
    • 2.1 Location and Passenger Volume of Bus Stops (Proportional Symbol Map)
      • 2.1.1 Potential Insights:
      • 2.1.2 Interactivity Features
    • 2.2 Busstop to Busstop Volume across Time
      • 2.2.1 Potential Insights:
      • 2.2.2 Interactivity Features
  • 3 DataViz Step by Step Guide
    • 3.1 Load R packages
    • 3.2 Load Data
      • 3.2.1 Busstop Volume
      • 3.2.2 Busstop Volume
    • 3.3 Data Processing
      • 3.3.1 Insert Latitude and Longitude
    • 3.4 Visualization 1-a: Proportional Symbol Map by Planning Area
      • 3.4.1 Creating dataframes for each planning region
      • 3.4.2 Plotting the map
      • 3.4.3 Filtering the Data
      • 3.4.4 Creating a list of time of the day
      • 3.4.5 Plotting the Map
    • 3.5 Visualization 2: Busstop to Busstop Volume across Time
      • 3.5.1 Adding in Latitude and Longtitude
      • 3.5.2 Plotting the Map
  • 4 Reflection on Interactivity
    • 4.1 User-Visualization Interaction
    • 4.2 Presenting Dimensionality
    • 4.3 Forming Intuition

1 Overview

Singapore’s public transport use rose to hit a record high in 2018, with a total of 7.54 million trips made on buses or trains each day.

Here’s what may come across your mind: Do you ever have experiences where a bus ride that is supposed to be short and quick took way longer than expected? Are you frustrated that the bus stops at every stop even though there’s nobody boarding or alighting? And why do we have so many bus stops that almost nobody uses?

What if we can reimagine the public bus network in Singapore through data?

This dataviz provides an Exploratory Data Analysis (EDA) on the dataset, in order to gain intuition on how we can improve public transport in Singapore.

Dataset is available through an API connection to LTA Data Mall. Data is not publically available but available upon a written request. For this project, we will need to write a script in order to make an API call to extract the data we need. Data includes Live data as well as Historical data.

2 Visualization Planning - Core Purpose, Components, and Interactivity Features

The purpose of this Data Visualization is exploratory, it allows us to take a look at the data visually before deep diving into any potential issues.

2.1 Location and Passenger Volume of Bus Stops (Proportional Symbol Map)

This visualization shows the location of all busstops in Singapore, with the proportion being the passenger volume.

2.1.1 Potential Insights:

  • Visualize which busstops are overutilized or underutilized
  • Which busstop serves as a possible transit point
  • Density of busstops in different estates

2.1.2 Interactivity Features

  • Ability to zoom in/out of the map to a specific location
  • Choosing the time of the day
  • Choosing of weekday/weekend

2.2 Busstop to Busstop Volume across Time

This visualization shows the places people from a busstop travels to.

2.2.1 Potential Insights:

  • Shows the flow of people from a location to another
  • Suggest bus route improvement
  • Understand movement between regions to regions

2.2.2 Interactivity Features

  • Ability to zoom in/out of the map to a specific location
  • Select a busstop by clicking on the location, the visualization will show all movement from
  • Animate time of the day, showing different location communters travel to at different time of the day
  • Selection of weekday / weekend

3 DataViz Step by Step Guide

3.1 Load R packages

First, load the necessary R packages in RStudio.

  • tidyverse contains a set of essential packages for data manipulation and exploration
  • leaflet is an open-sourced javascript libraries for interactive maps
  • sf stands for simple-features, it is used for wrangling geospatial data
  • tmap offers a flexible, layer-based, to create thematic maps such as choropleths and bubble maps
  • RColorBrewer offers beautiful colours which we will use in the visualization
  • leaflet.extras provides additional functionality map function, such as search bars
  • geosphere is used to convert distance between two points and plot out the points in between
packages <- c('tidyverse', 'leaflet', 'sf', 'RColorBrewer', 'leaflet.extras', 'geosphere')

for (p in packages){
  if (!require(p,character.only = T)){
    install.packages(p)
  }
  library(p,character.only = T)
}

3.2 Load Data

3.2.1 Busstop Volume

Second, load the necessary data in Rstudio.

This dataset shows the number of tap ins and tap outs per bus stop per time of day, segmented by weekdays/weekends in January 2020. The datasets used here are downloaded directly from LTA DataMall API.

busstop_volume <- read.csv("data/passenger volume by busstop.csv")
colnames(busstop_volume)[5] = "BusStopCode"

head(busstop_volume)
##   YEAR_MONTH         DAY_TYPE TIME_PER_HOUR PT_TYPE BusStopCode
## 1    2020-01 WEEKENDS/HOLIDAY             6     BUS       67551
## 2    2020-01          WEEKDAY             6     BUS       67551
## 3    2020-01          WEEKDAY             7     BUS       66541
## 4    2020-01 WEEKENDS/HOLIDAY             7     BUS       66541
## 5    2020-01 WEEKENDS/HOLIDAY            17     BUS       54209
## 6    2020-01          WEEKDAY            17     BUS       54209
##   TOTAL_TAP_IN_VOLUME TOTAL_TAP_OUT_VOLUME
## 1                 224                   22
## 2                3922                  122
## 3                 648                  364
## 4                 127                  109
## 5                 736                  325
## 6                1388                  920

3.2.2 Busstop Volume

In addition, we import busstop information information. This dataset includes latitude and longitude. This dataset is extracted by parsing busstop information from LTA Datamall to Onemap.sg in order to extract the planning_area data. As there are too many factors to planning_area, we mapped some of the smallers area to Others to reduce categorical features.

busstop_information <- read.csv("data/busstops_with_planning_area.csv")[3:8]

busstop_information$planning_area = str_to_title(busstop_information$planning_area)
busstop_information <- filter(busstop_information, planning_area != "Invalid")

busstop_information$planning_area <- as.character(busstop_information$planning_area)
busstop_information$planning_area[busstop_information$planning_area %in% c('Central Water Catchment', 'Mandai', 'Marina South', 'Museum', 'Newton', 'Orchard', 'Outram', 'Seletar', 'Rochor', 'Singapore River', 'Tanglin', 'Southern Islands', 'River Valley', 'Paya Lebar', 'Straits View', 'Tengah')] <- "Others"

head(busstop_information)
##   BusStopCode         Description Latitude Longitude      RoadName
## 1        1012 Hotel Grand Pacific 1.296848  103.8525   Victoria St
## 2        1013     St. Joseph's Ch 1.297710  103.8532   Victoria St
## 3        1019     Bras Basah Cplx 1.296990  103.8530   Victoria St
## 4        1029        Opp Natl Lib 1.296673  103.8544 Nth Bridge Rd
## 5        1039          Bugis Cube 1.298208  103.8555 Nth Bridge Rd
## 6        1059    Bugis Stn Exit B 1.300757  103.8561   Victoria St
##   planning_area
## 1 Downtown Core
## 2        Others
## 3 Downtown Core
## 4 Downtown Core
## 5 Downtown Core
## 6 Downtown Core

3.3 Data Processing

3.3.1 Insert Latitude and Longitude

Now we have latitude and longitude in the busstop_information, we join it to busstop_volume dataset through ’BusStopCode`. We also remove busstop_volume from memory to free the memory.

busstop_volume_lat_long <- dplyr::inner_join(busstop_volume, busstop_information, by ='BusStopCode')
str(busstop_volume_lat_long)
## 'data.frame':    197054 obs. of  12 variables:
##  $ YEAR_MONTH          : Factor w/ 1 level "2020-01": 1 1 1 1 1 1 1 1 1 1 ...
##  $ DAY_TYPE            : Factor w/ 2 levels "WEEKDAY","WEEKENDS/HOLIDAY": 2 1 1 2 2 1 2 1 2 1 ...
##  $ TIME_PER_HOUR       : int  6 6 7 7 17 17 12 12 17 17 ...
##  $ PT_TYPE             : Factor w/ 1 level "BUS": 1 1 1 1 1 1 1 1 1 1 ...
##  $ BusStopCode         : int  67551 67551 66541 66541 54209 54209 61049 61049 67611 67611 ...
##  $ TOTAL_TAP_IN_VOLUME : int  224 3922 648 127 736 1388 92 140 442 765 ...
##  $ TOTAL_TAP_OUT_VOLUME: int  22 122 364 109 325 920 72 143 117 284 ...
##  $ Description         : Factor w/ 4479 levels "112 Katong","18 Woodsville",..: 1526 1526 1901 1901 1047 1047 39 39 1224 1224 ...
##  $ Latitude            : num  1.4 1.4 1.38 1.38 1.38 ...
##  $ Longitude           : num  104 104 104 104 104 ...
##  $ RoadName            : Factor w/ 821 levels "Adam Rd","Admiralty Dr",..: 573 573 115 115 26 26 755 755 565 565 ...
##  $ planning_area       : chr  "Sengkang" "Sengkang" "Hougang" "Hougang" ...
rm(busstop_volume)

3.4 Visualization 1-a: Proportional Symbol Map by Planning Area

In order to reduce dimensions, we filtered the busstop_volume_lat_long data, setting TIME_PER_HOUR to 10 and DAY_TYPE to Weekday. We also created tap_in_out_radius to set the size of the bubble in the map.

location <- busstop_volume_lat_long %>%
  dplyr::filter(DAY_TYPE == c('WEEKDAY'))%>%
  dplyr::filter(TIME_PER_HOUR == 10)%>%
  dplyr::group_by(BusStopCode)%>%
  dplyr::arrange(desc(BusStopCode))%>%
  rename(c(lat = Latitude, lon = Longitude))

#location$tap_in_radius <- sqrt(location$TOTAL_TAP_IN_VOLUME)/6
#location$tap_out_radius <- sqrt(location$TOTAL_TAP_OUT_VOLUME)/6
location$tap_in_out_radius <- (location$TOTAL_TAP_IN_VOLUME + location$TOTAL_TAP_OUT_VOLUME)**(1/2)/6

#location <- head(location)
head(location)
## # A tibble: 6 x 13
## # Groups:   BusStopCode [6]
##   YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE BusStopCode TOTAL_TAP_IN_VO~
##   <fct>      <fct>            <int> <fct>         <int>            <int>
## 1 2020-01    WEEKDAY             10 BUS           99189               65
## 2 2020-01    WEEKDAY             10 BUS           99181              276
## 3 2020-01    WEEKDAY             10 BUS           99171                5
## 4 2020-01    WEEKDAY             10 BUS           99161                1
## 5 2020-01    WEEKDAY             10 BUS           99139                0
## 6 2020-01    WEEKDAY             10 BUS           99131              747
## # ... with 7 more variables: TOTAL_TAP_OUT_VOLUME <int>, Description <fct>,
## #   lat <dbl>, lon <dbl>, RoadName <fct>, planning_area <chr>,
## #   tap_in_out_radius <dbl>

3.4.1 Creating dataframes for each planning region

As we want this map to be filterable, we need to create many mini datasets, one for each region. To do this, we create a list called planning_area_list which contains a list of unique planning_area. After which, we do a loop over this list, with each iteration filtering the main dataset by the region, and saving it in a seperate dataset.

planning_area_list <-sort(unique(location$planning_area))

for (i in planning_area_list)
{
  df <- filter(location, planning_area == i)  
  assign(paste(i), df)
}

3.4.2 Plotting the map

This map is made such that it is filterable. Some features I added includes:

  • Filters - Reader can choose which planning area he/she wants to explore
  • Colour - Using colorbrewer palette Set3
  • Bounds - Readers will not be able to scroll out of Singapore
#pal <- colorNumeric(palette = "Reds", domain = c(0:11), reverse = FALSE)
pal <- colorFactor(palette = 'Set3', domain = planning_area_list)

map <-
  leaflet(width = 800, height = 600)%>%
  #addTiles(group = "OSM") %>% 
  addProviderTiles("CartoDB", group = "CartoDB") %>% 
  #addProviderTiles("Esri", group = "Esri") %>% 
  addCircleMarkers(data=`Ang Mo Kio`, group = 'Ang Mo Kio', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Bedok`, group = 'Bedok', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Bishan`, group = 'Bishan', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Boon Lay`, group = 'Boon Lay', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Bukit Batok`, group = 'Bukit Batok', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Bukit Merah`, group = 'Bukit Merah', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Bukit Panjang`, group = 'Bukit Panjang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Bukit Timah`, group = 'Bukit Timah', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Changi`, group = 'Changi', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Choa Chu Kang`, group = 'Choa Chu Kang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Clementi`, group = 'Clementi', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Downtown Core`, group = 'Downtown Core', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Geylang`, group = 'Geylang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Hougang`, group = 'Hougang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Jurong East`, group = 'Jurong East', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Jurong West`, group = 'Jurong West', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>% 
  addCircleMarkers(data=`Kallang`, group = 'Kallang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Lim Chu Kang`, group = 'Lim Chu Kang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Marine Parade`, group = 'Marine Parade', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Novena`, group = 'Novena', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`Pasir Ris`, group = 'Pasir Ris', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Pioneer`, group = 'Pioneer', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Punggol`, group = 'Punggol', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Queenstown`, group = 'Queenstown', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Sembawang`, group = 'Sembawang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Sengkang`, group = 'Sengkang', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Serangoon`, group = 'Serangoon', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%     
  addCircleMarkers(data=`Sungei Kadut`, group = 'Sungei Kadut', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Tampines`, group = 'Tampines', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Toa Payoh`, group = 'Toa Payoh', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Tuas`, group = 'Tuas', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Western Water Catchment`, group = 'Western Water Catchment', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Woodlands`, group = 'Woodlands', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Yishun`, group = 'Yishun', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  addCircleMarkers(data=`Others`, group = 'Others', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, color =  ~pal(planning_area),  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  setView(lat = 1.3004641, lng = 104.4496028, zoom = 11)%>%
  setMaxBounds(lng1 = 103.801959 + .25, 
             lat1 = 1.32270 + .25, 
             lng2 = 103.801959 - .25, 
             lat2 = 1.32270 - .25)%>%
  #addSearchOSM() %>% 
  #addReverseSearchOSM()%>%
  addLayersControl(baseGroups =unique(planning_area_list))
  #layersControlOptions(collapsed = FALSE, autoZIndex = TRUE)
  #addSearchFeatures(
  #  targetGroups = planning_area_list,
    # Set the search zoom level to 18
  #  options = searchFeaturesOptions(zoom = 18)) 

map

Leaflet | © OpenStreetMap contributors © CARTO
## Visualization 1-b: Proportional Symbol Map by Time Another visualization we want to create is busstop by Time. In the previous map, we show traffic by region, now we want to show traffic by time.

3.4.3 Filtering the Data

First, we do so by filtering the data to Weekday, while allowing data from all hours of the day to be included.

location_all_time <- busstop_volume_lat_long %>%
  dplyr::filter(DAY_TYPE == c('WEEKDAY'))%>%
 # dplyr::filter(planning_area == c('BISHAN'))%>%
  #dplyr::filter(TIME_PER_HOUR == 6)%>%
  dplyr::group_by(BusStopCode)%>%
  dplyr::arrange(desc(BusStopCode))%>%
  rename(c(lat = Latitude, lon = Longitude))

location_all_time$tap_in_radius <- sqrt(location_all_time$TOTAL_TAP_IN_VOLUME)/6
location_all_time$tap_out_radius <- sqrt(location_all_time$TOTAL_TAP_OUT_VOLUME)/6
location_all_time$tap_in_out_radius <- (location_all_time$TOTAL_TAP_IN_VOLUME + location_all_time$TOTAL_TAP_OUT_VOLUME)**(1/2)/20

location_all_time <-location_all_time[c('TIME_PER_HOUR', 'TOTAL_TAP_IN_VOLUME', 'TOTAL_TAP_OUT_VOLUME',  'Description', 'lat', 'lon', 'planning_area', 'tap_in_out_radius')]

tail(location_all_time)
## # A tibble: 6 x 8
##   TIME_PER_HOUR TOTAL_TAP_IN_VO~ TOTAL_TAP_OUT_V~ Description   lat   lon
##           <int>            <int>            <int> <fct>       <dbl> <dbl>
## 1            10              589             1833 Hotel Gran~  1.30  104.
## 2            17             2375             2215 Hotel Gran~  1.30  104.
## 3             6              374              642 Hotel Gran~  1.30  104.
## 4            19             1979             1927 Hotel Gran~  1.30  104.
## 5            13             1233             2122 Hotel Gran~  1.30  104.
## 6            16             1692             1934 Hotel Gran~  1.30  104.
## # ... with 2 more variables: planning_area <chr>, tap_in_out_radius <dbl>

3.4.4 Creating a list of time of the day

Secondly, we create a unique list of TIME_PER_HOUR and looping over it to create a dataframe for each hour of the day. A filter of TOTAL_TAP_IN_VOLUME >100 is implemented to reduce the data size.

TIME_PER_HOUR_list <-sort(unique(location_all_time$TIME_PER_HOUR))

for (i in TIME_PER_HOUR_list)
{
  df <- filter(location_all_time, TIME_PER_HOUR == i)%>%filter(TOTAL_TAP_IN_VOLUME > 200)
  assign(paste(i), df)
}

3.4.5 Plotting the Map

Finally, we create a map of the traffic by time of the day. Reader will be able to choose which time of the day he/she wants to see.

Work in process: To convert this to an animation

#pal <- colorFactor(palette = 'Set3', domain = TIME_PER_HOUR_list)

map_time <-
  leaflet(width = 800, height = 600)%>%
  addProviderTiles("CartoDB") %>% 
  addCircleMarkers(data=`6`, group = '06:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`7`, group = '07:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,   
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`8`, group = '08:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,   
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`9`, group = '09:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`10`, group = '10:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`11`, group = '11:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`12`, group = '12:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`13`, group = '13:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`14`, group = '14:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`15`, group = '15:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`16`, group = '16:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`17`, group = '17:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`18`, group = '18:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`19`, group = '19:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`20`, group = '20:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`21`, group = '21:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`22`, group = '22:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius,  
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%
  addCircleMarkers(data=`23`, group = '23:00', stroke = FALSE, fillOpacity = 0.75,
                   lng = ~lon, lat = ~lat, label = ~Description, radius = ~tap_in_out_radius, 
                   popup = ~paste0("<b>", Description, "</b>", "<br/>", 'Tap in Volume: ', TOTAL_TAP_IN_VOLUME, "<br/>", 'Tap out volume: ', TOTAL_TAP_OUT_VOLUME))%>%  
  setView(lat = 1.3704641, lng = 103.8496028, zoom = 11)%>%
  setMaxBounds(lng1 = 103.801959 + .25, 
             lat1 = 1.32270 + .25, 
             lng2 = 103.801959 - .25, 
             lat2 = 1.32270 - .25)%>%
  #addSearchOSM() %>% 
  #addReverseSearchOSM()%>%
  addLayersControl(baseGroups = c('06:00','07:00','08:00','09:00','10:00','11:00','12:00','13:00','14:00','15:00','16:00','17:00','18:00','19:00','20:00','21:00','22:00','23:00'))
  #layersControlOptions(collapsed = FALSE, autoZIndex = TRUE)
  #addSearchFeatures(
  #  targetGroups = planning_area_list,
    # Set the search zoom level to 18
  #  options = searchFeaturesOptions(zoom = 18)) 

map_time
Leaflet | © OpenStreetMap contributors © CARTO

3.5 Visualization 2: Busstop to Busstop Volume across Time

In the second proposed visualization, we want to create a map where I can see traffic flow from busstop A to all other busstop.

The data is stored in another dataset, origin_destination_bus_202001.csv, also downloaded from LTA Data Mall. This dataset is huge (280MB). So when reading this data in, I have to be careful not to exceed our memory capacity. Hence, I used R pip function to filter the data before reading it into the dataframe origin_destination.

origin_destination <- read.csv("data/origin_destination_bus_202001.csv") %>% 
                      dplyr::filter(ORIGIN_PT_CODE == c('54009'))%>% 
                      dplyr::filter(DAY_TYPE == c('WEEKDAY'))

origin_destination <- origin_destination[with(origin_destination, order(-TOTAL_TRIPS)), ]%>%slice(0:50)

3.5.1 Adding in Latitude and Longtitude

Similar to previous code chunk, we add in latitude and longtitude by joining it to busstop_information. We need to join them twice as we need once for Origin station, and one for Destination station.

Hence, I create two dataframe, one for origin busstop_information_origin and one for destination busstop_information_destination and join them twice.

busstop_information_origin <- busstop_information[c('BusStopCode', 'Description', 'Latitude', 'Longitude', 'planning_area')]%>%
  rename(Origin_Description = Description, Origin_Latitude = Latitude, Origin_Longitude = Longitude)

busstop_information_destination <- busstop_information[c('BusStopCode', 'Description', 'Latitude', 'Longitude', 'planning_area' )]%>%
  rename(Destination_Description = Description, Destination_Latitude = Latitude, Destination_Longitude = Longitude)

origin_destination <- origin_destination %>%
                      #rename(Origin_BusStopCode = ORIGIN_PT_CODE, Destination_BusStopCode = DESTINATION_PT_CODE)
                      dplyr::inner_join(busstop_information_origin, by = c("ORIGIN_PT_CODE" = "BusStopCode"))%>%
                      dplyr::inner_join(busstop_information_destination, by = c("DESTINATION_PT_CODE" = "BusStopCode"))

head(origin_destination)
##   YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
## 1    2020-01  WEEKDAY            19     BUS          54009               54389
## 2    2020-01  WEEKDAY            18     BUS          54009               54389
## 3    2020-01  WEEKDAY            18     BUS          54009               54181
## 4    2020-01  WEEKDAY            20     BUS          54009               54389
## 5    2020-01  WEEKDAY            19     BUS          54009               54181
## 6    2020-01  WEEKDAY            18     BUS          54009               54191
##   TOTAL_TRIPS Origin_Description Origin_Latitude Origin_Longitude
## 1       69520     Ang Mo Kio Int        1.369688         103.8486
## 2       65480     Ang Mo Kio Int        1.369688         103.8486
## 3       52760     Ang Mo Kio Int        1.369688         103.8486
## 4       52300     Ang Mo Kio Int        1.369688         103.8486
## 5       51740     Ang Mo Kio Int        1.369688         103.8486
## 6       51500     Ang Mo Kio Int        1.369688         103.8486
##   planning_area.x Destination_Description Destination_Latitude
## 1      Ang Mo Kio                 Blk 465             1.366390
## 2      Ang Mo Kio                 Blk 465             1.366390
## 3      Ang Mo Kio                 BLK 258             1.370533
## 4      Ang Mo Kio                 Blk 465             1.366390
## 5      Ang Mo Kio                 BLK 258             1.370533
## 6      Ang Mo Kio                 BLK 170             1.374472
##   Destination_Longitude planning_area.y
## 1              103.8566      Ang Mo Kio
## 2              103.8566      Ang Mo Kio
## 3              103.8365      Ang Mo Kio
## 4              103.8566      Ang Mo Kio
## 5              103.8365      Ang Mo Kio
## 6              103.8376      Ang Mo Kio

3.5.2 Plotting the Map

While searching online, I didn’t find much resources on plotting this type of movement map. I used addPolylines on leaflet map for this task, which prove to be not a very nice solution. Probably need to consult Professor Kam to get some advice on this.

#format data 
origin_destination$Origin_Longitude = as.numeric(as.character(origin_destination$Origin_Longitude))
origin_destination$Origin_Latitude = as.numeric(as.character(origin_destination$Origin_Latitude))
origin_destination$Destination_Longitude = as.numeric(as.character(origin_destination$Destination_Longitude))
origin_destination$Destination_Latitude = as.numeric(as.character(origin_destination$Destination_Latitude))
origin_destination$Destination_Description = as.factor(as.numeric(as.factor(origin_destination$Destination_Description)))

pathList = NULL

factpal <- colorFactor(heat.colors(30), pathList$id)

for(i in 1:nrow(origin_destination))
{
tmp = gcIntermediate(c(origin_destination$Origin_Longitude[i],
                 origin_destination$Origin_Latitude[i]),
               c(origin_destination$Destination_Longitude[i],
                 origin_destination$Destination_Latitude[i]),n = 25,
               addStartEnd=TRUE)
tmp = data.frame(tmp)
tmp$id = origin_destination[i,]$Destination_Description
tmp$weight = sqrt(origin_destination[i,]$TOTAL_TRIPS)/50
tmp$color = factpal(origin_destination[i,]$Destination_Description)
pathList = c(pathList,list(tmp))
}

leaflet(width = 800, height = 600)%>%
addProviderTiles("CartoDB") -> lf

#add each entry of pathlist to the leaflet object
for (path in pathList)
{
  lf %>% addPolylines(data = path,
                      lng = ~lon, 
                      lat = ~lat,
                      weight = ~weight
                      #color = ~color
                      ) -> lf
}
#show output
lf
Leaflet | © OpenStreetMap contributors © CARTO

4 Reflection on Interactivity

4.1 User-Visualization Interaction

In a static visualization, users can only see what you want to show them, but in an interactive data visualization, users have the opportunity to interact with the data, explore, and draw insights for themselves. It gives the user a much better user experience as they are not simply ‘shown what you want them to see’.

4.2 Presenting Dimensionality

With interactivity, we are able to incorporate so much more information into a visualization. We can take care of dimensionality by leaving them all in filters, and in the case of maps, users can zoom in and out to different parts of the map to explore features that interest them.

4.3 Forming Intuition

The purpose of a data viz is to convince users of a certain topic using data. Having interactivity, users can form their own cause and effect analysis which helps them gain better intuition on the reason for the data. Thoughts generated by the user itself stays on longer and stronger, compared to a point being told to them.