1 Overview

In this visualisation, we compare the trends in traffic flow across the 3 harbour crossings linking Kowloon and Hong Kong Island: the Cross Harbour Tunnel, Eastern Harbour Crossing and Western Harbour Crossing. We are interested in whether traffic has shifted between the crossings over the years.

(This visualisation is part of Assignment 5 of the ISSS608 Visual Analytics and Applications course.)

1.1 Data Source

The datasets for this visualisation were obtained from data.gov.hk, and the specific datasets used were the Monthly Traffic and Transport Digest - Tunnel, Lantau Link and Vehicular Ferry Services Statistics published by the Transport Department.

1.2 Terminologies

  • CHT: Cross Harbour Tunnel
  • EHC: Eastern Harbour Crossing
  • WHC: Western Harbour Crossing

2 Major Challenges

A sample of the original dataset is shown below:

Some challenges that have been identified include:

2.1 Identifying type of vehicle

In the dataset, the VEHICLE_CLASS_CODE column identifies the type of vehicle that crossed a particular tunnel in a given direction in a given month. However, no data dictionary is available on the data.gov.hk website to explain what each vehicle class code represents, so we need to map the codes to vehicle types ourselves. The Hong Kong Transport Department publishes a similar set of data in PDF form, from which we can match each vehicle class code by comparing the number of vehicles crossing a particular tunnel in a given month.

2.2 Visualising changes in traffic flows over time

We would like to create a visualisation that could best represent the traffic flows between the 3 tunnels over time, with an added condition that it has to be interactive.

After reviewing various examples of interactive visualisations, we decided that the traffic flows are best represented by a ternary plot showing the combined traffic flows in both directions. A ternary plot was chosen because each of its 3 axes can represent one of the 3 harbour tunnels. However, the ternary plot is only viable because Hong Kong has exactly 3 harbour tunnels; should more be built in future, we would have to find an alternative plot.
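
To make the reading of a ternary plot concrete: each point is positioned by the share of the total that each of the three components contributes, so only the proportions matter, not the absolute volumes. The following is a minimal base R sketch using made-up counts (the numbers are illustrative and not taken from the dataset):

```r
# Hypothetical average monthly counts for one vehicle type (illustrative only)
flows <- c(CHT = 60000, EHC = 25000, WHC = 15000)

# A ternary plot positions the point by each tunnel's share of the total
shares <- flows / sum(flows)

print(round(shares, 2))
```

A vehicle type that uses all three tunnels equally would sit at the centre of the triangle, while one that relies mostly on a single tunnel would sit close to that tunnel's corner.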

3 Steps to Visualisation

3.1 Install and Load packages

We will need the following packages:

  • tidyverse packages for reading and wrangling data;
  • ggtern for plotting static ternary chart(s); and
  • plotly for creating interactive charts.

packages = c(
  'ggtern',
  'lubridate',
  'plotly',
  'tidyverse'
)

# Install any package that is missing, then attach it
for(p in packages){
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

3.2 Load the Data

We will load 3 sets of data, each representing a harbour tunnel.

raw_cht = read_csv('data/table31a_eng.csv')
raw_ehc = read_csv('data/table31b_eng.csv')
raw_whc = read_csv('data/table31c_eng.csv')

3.3 Data Wrangling

3.3.1 Combining Data

Because the three datasets share the same columns, we can stack them row-wise using the rbind() function.

raw_traffic_harbour_tunnel = rbind(raw_cht, raw_ehc, raw_whc)

3.3.2 Identifying Vehicle Type Codes

We will need to identify the type of vehicle by comparing the number of vehicles passing through the tunnel with data from Hong Kong’s Transport Department.

We will further use a function that will help us obtain the vehicle type from 3 columns - the VEHICLE_CLASS_CODE, GOODS_VEHICLE_TYPE_CODE and BUS_TYPE_CODE.

get_vehicle_type = function(vehicle, goods_type, bus_type){
  vehicle_class_codes = list(
    '1' = 'Private Cars',
    '2' = 'Motor Cycles',
    '3' = 'Taxis',
    '5' = 'Goods Vehicles',
    '15' = 'Private/Public Light Buses',
    '16' = 'Private/Public Buses'
  )
  
  goods_vehicle_type_code = c(
    '<5.5 ton',
    '5.5-24 ton',
    '>24 ton'
  )
  
  bus_type_code = list(
    'SD' = 'Single Deck',
    'DD' = 'Double Deck'
  )
  
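  # Codes are looked up by name, so convert numeric codes to character first
  vehicle = as.character(vehicle)
  output = vehicle_class_codes[vehicle]
  # Append the weight class for goods vehicles and the deck type for buses
  output = ifelse(is.na(goods_type), output, paste(output, goods_vehicle_type_code[goods_type]))
  output = ifelse(is.na(bus_type), output, paste(output, bus_type_code[bus_type]))
  # Flatten the list produced by the lookups into a character vector
  output = as.character(output)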
  
  return(output)
}
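
The function relies on looking codes up by name rather than by position, which is why the numeric code is converted with as.character() first. A minimal sketch of that idea, using a named character vector and a made-up code (99) that has no mapping:

```r
# Named character vector acting as a lookup table (a subset of the codes above)
vehicle_class_codes <- c("1" = "Private Cars", "2" = "Motor Cycles", "3" = "Taxis")

codes <- c(3, 1, 99)  # 99 is a made-up code with no mapping

# Index by name; as.character() makes 3 mean the name "3", not position 3
labels <- unname(vehicle_class_codes[as.character(codes)])

print(labels)
```

Unmapped codes come back as NA in this sketch, which makes it easy to spot any code that still needs to be matched against the Transport Department figures.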

3.3.3 Cleaning Dataset

We will then clean the dataset by:

  • splitting the YR_MTH column (in YYYYMM form) into separate YEAR and MONTH fields; and
  • using the get_vehicle_type() function to obtain the vehicle type.

raw_traffic_harbour_tunnel = raw_traffic_harbour_tunnel %>%
  mutate(
    YEAR = substr(YR_MTH, 1, 4),
    MONTH = substr(YR_MTH, 5, 6),
    VEHICLE_TYPE = get_vehicle_type(VEHICLE_CLASS_CODE, GOODS_VEHICLE_TYPE_CODE, BUS_TYPE_CODE)
  )
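
The step above keeps YEAR and MONTH as text. If a true Date column is wanted instead, one possible approach is sketched below in base R, assuming YR_MTH always follows the YYYYMM layout; the month_start name is hypothetical and not part of the original workflow:

```r
# Example values in the same YYYYMM layout as the YR_MTH column
yr_mth <- c("201301", "201912")

# Append a day-of-month so the string can be parsed as a full date
month_start <- as.Date(paste0(yr_mth, "01"), format = "%Y%m%d")

print(month_start)
print(format(month_start, "%Y"))  # year as text, matching substr(yr_mth, 1, 4)
```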

Check that the dataset has been processed correctly:

raw_traffic_harbour_tunnel %>%
  filter(TUN_BRIDGE_CODE == 'CHT', YR_MTH == '201301', BOUND_CODE == 'SB') %>%
  arrange(VEHICLE_CLASS_CODE) %>%
  select(-TUN_BRIDGE_CODE, -YR_MTH, -BOUND_CODE)

3.4 Visualisation: Ternary Plot

To prepare the ternary plot, we will need to further process the data by calculating the average monthly traffic flow for each tunnel, direction, vehicle type and year.

sb_traffic = raw_traffic_harbour_tunnel %>%
  select(-YR_MTH, -VEHICLE_CLASS_CODE, -GOODS_VEHICLE_TYPE_CODE, -BUS_TYPE_CODE) %>%
  group_by(TUN_BRIDGE_CODE, BOUND_CODE, VEHICLE_TYPE, YEAR) %>%
  summarise(NO_VEHICLE = mean(NO_VEHICLE)) %>%
  ungroup()

sb_traffic

3.4.1 Visualising Single Direction Traffic Flow

We will need to spread out the data so that each tunnel has its own column, using the spread() function from the tidyr package. We will start with the traffic flow in the Northbound direction.

sb_traffic_nb = sb_traffic %>%
  filter(BOUND_CODE == 'NB') %>%
  select(-BOUND_CODE) %>%
  spread(TUN_BRIDGE_CODE, NO_VEHICLE) %>%
  mutate(
    TOTAL = rowSums(.[3:5])
  )
sb_traffic_nb
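
For reference, tidyr now marks spread() as superseded by pivot_wider(). The sketch below shows the same reshaping on a toy data frame with made-up counts, and selects the tunnel columns by name instead of by position when computing the total:

```r
library(tidyr)

# Toy long-format data with made-up counts (illustrative only)
long <- data.frame(
  TUN_BRIDGE_CODE = c("CHT", "EHC", "WHC"),
  VEHICLE_TYPE = "Taxis",
  YEAR = "2019",
  NO_VEHICLE = c(300, 100, 100)
)

# One column per tunnel, one row per vehicle type and year
wide <- pivot_wider(long, names_from = TUN_BRIDGE_CODE, values_from = NO_VEHICLE)

# Selecting the tunnel columns by name avoids relying on column positions
wide$TOTAL <- rowSums(wide[, c("CHT", "EHC", "WHC")])

print(wide)
```

Selecting by name keeps the total correct even if the column order changes, whereas .[3:5] depends on the tunnel columns being the third to fifth.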

Then create a ternary plot for one specific year - 2019.

ggtern(
  data = sb_traffic_nb %>% 
    filter(YEAR == 2019), 
  aes(x= CHT, y = EHC, z = WHC, col = VEHICLE_TYPE, size = TOTAL)
) +
  geom_point() +
  ggtitle('Average Vehicular Flow Across the Harbour, Northbound, 2019') +
  theme_showarrows()

We will now visualise the traffic patterns for different years using faceting.

ggtern(
  data = sb_traffic_nb, 
  aes(x= CHT, y = EHC, z = WHC, col = VEHICLE_TYPE, size = TOTAL)
) +
  facet_wrap(~YEAR) +
  geom_point() +
  ggtitle('Average Vehicular Flow Across the Harbour, Northbound') +
  theme_rgbw()

We will do the same for Southbound.

sb_traffic_sb = sb_traffic %>%
  filter(BOUND_CODE == 'SB') %>%
  select(-BOUND_CODE) %>%
  spread(TUN_BRIDGE_CODE, NO_VEHICLE) %>%
  mutate(
    TOTAL = rowSums(.[3:5])
  )

ggtern(
  data = sb_traffic_sb, 
  aes(x= CHT, y = EHC, z = WHC, col = VEHICLE_TYPE, size = TOTAL)
) +
  facet_wrap(~YEAR) +
  geom_point() +
  ggtitle('Average Vehicular Flow Across the Harbour, Southbound') +
  theme_rgbw()

3.4.2 Visualising Dual Direction Traffic Flow

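Finally, we will visualise both directions together. For each tunnel, year and vehicle type, we combine the Northbound and Southbound figures by taking their mean. The tunnel shares plotted are the same as they would be if the two directions were summed, provided both directions are present; TOTAL is then the per-direction average and not the two-way sum.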

sb_traffic_all = sb_traffic %>%
  group_by(TUN_BRIDGE_CODE, YEAR, VEHICLE_TYPE) %>%
  summarise(NO_VEHICLE = mean(NO_VEHICLE)) %>%
  ungroup() %>%
  spread(TUN_BRIDGE_CODE, NO_VEHICLE) %>%
  mutate(
    TOTAL = rowSums(.[3:5])
  )
sb_traffic_all

ggtern(
  data = sb_traffic_all, 
  aes(x= CHT, y = EHC, z = WHC, col = VEHICLE_TYPE, size = TOTAL)
) +
  facet_wrap(~YEAR) +
  geom_point() +
  ggtitle('Average Vehicular Flow Across the Harbour, Both Directions') +
  theme_rgbw()
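
As a quick sanity check on how the two directions are combined, the sketch below uses made-up figures to compare the tunnel shares obtained by summing the directions against those obtained by averaging them:

```r
# Made-up northbound and southbound yearly averages for one vehicle type
nb <- c(CHT = 300, EHC = 100, WHC = 100)
sb <- c(CHT = 280, EHC = 120, WHC = 100)

summed <- nb + sb
averaged <- (nb + sb) / 2

# Shares of the total are identical; only the scale of the total differs
print(summed / sum(summed))
print(averaged / sum(averaged))
```

The position of each point on the ternary plot is therefore unaffected by the choice, but the marker size, which is driven by TOTAL, is halved when the mean is used.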

4 Visualisation and Insights

4.1 Final Visualisation

The final visualisation is an interactive ternary plot visualising average monthly traffic flows across Hong Kong harbour tunnels in both directions. The visualisation will be for the period between 2013 and 2019.

axis = function(txt) {
  list(
    title = txt,
    tickformat = "%",
    tickfont = list(size = 10)
  )
}

plot_ly(
  sb_traffic_all %>% filter(YEAR != 2020),
  type = 'scatterternary',
  mode = 'markers',
  a = ~CHT,
  b = ~EHC,
  c = ~WHC,
  frame = ~YEAR,
  color = ~VEHICLE_TYPE,
  size = ~TOTAL,
  text = ~paste(
    '<strong>', VEHICLE_TYPE, '</strong>',
    '<br />Cross Harbour Tunnel: ', round(CHT, 2),
    '<br />Eastern Harbour Crossing: ', round(EHC, 2),
    '<br />Western Harbour Crossing: ', round(WHC, 2)
  ),
  marker = list(
    symbol = 'circle',
    opacity = 0.8,
    size = 16,
    sizemode = 'diameter',
    sizeref = 4,
    line = list(width = 1, color = '#FFFFFF')
  )
) %>%
  layout(
    title = 'Average Monthly Traffic Flows Across Hong Kong Harbour Tunnels (Both Directions), 2013 - 2019',
    ternary = list(
      aaxis = axis('Cross Harbour Tunnel'), 
      baxis = axis('Eastern<br />Harbour Crossing'), 
      caxis = axis('Western<br />Harbour Crossing')
    )
  ) %>%
  animation_slider(
    currentvalue = list(prefix = 'YEAR '),
    font = list(color = 'black')
  ) %>%
  animation_opts(
    2000, 
    redraw = FALSE
  )

4.2 Insights from Visualisation

4.2.2 Shift towards Other Harbour Tunnels

However, we can also observe that, over the years, cross-harbour traffic has started to shift towards the other two tunnels, the Eastern Harbour Crossing and Western Harbour Crossing. On the ternary plot, the Cross Harbour Tunnel's share of traffic has fallen, most dramatically for single deck buses, from 67% in 2013 to 54% in 2019. The shift can possibly be attributed to the constant congestion leading into the Cross Harbour Tunnel, which leads motorists to consider alternative tunnels for crossing the harbour. Note, however, that the drop for single deck buses could also be due to a fall in the vehicle population of single deck buses in Hong Kong.