This report aims to critically discuss the effectiveness of a data visualisation task and how it might be improved.

Original dashboard

Transport for London 2018, London Buses Safety Dashboard - Quarter One Report – January - March 2018, Transport for London, p. 3. Available at: http://content.tfl.gov.uk/q1-18-london-bus-safety-dashboard.pdf [Accessed 1 February 2019].

Critical discussion

Carrying over 6 million passengers per day, London Buses has a crucial mission to ensure the safety of its customers on every journey. To support this mission, a dashboard was created to monitor bus safety performance in London and identify areas for improvement. It is intended to inform the people responsible for transport safety in London and any members of the public who are interested in this area. The selected visualisation presents trends in the number of incidents and injuries from 2015 to Q1/2018 using a four-quadrant dashboard with several chart types (a bar graph, a pie chart, a donut chart and a line graph). In general, the dashboard conveys its key intended messages through appropriate use of chart types, clear titles and legends. Nevertheless, it has some limitations in accuracy.

The bar graph: Overall, it delivers the message that the number of incidents in the latest quarter is the lowest of the last three years, thanks to an appropriate chart type and scale (i.e. the axis starts from 0). However, some minor improvements would make the general trend easier to detect. First, the many horizontal gridlines clutter the graph. Reducing their visibility and adding vertical markers between years to facilitate year-on-year comparison might help. This would make it easier to see that the first quarter is always the lowest within a year, and that this year's first quarter is the lowest of all. In addition, the improperly formatted axis title should be corrected.
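As a minimal sketch of where such vertical markers could be placed, the snippet below assumes 13 quarterly bars (Q1/2015 to Q1/2018) drawn at integer positions 1 to 13 on a discrete axis, which is how ggplot2 positions categories. The names `n_quarters`, `quarters_per_year` and `year_breaks` are illustrative only.

```r
# Assumption: 13 quarterly bars sit at integer positions 1..13 on the x axis.
n_quarters <- 13
quarters_per_year <- 4

# Each marker sits half a step before the first bar of a year,
# so it falls in the gap between Q4 of one year and Q1 of the next.
year_breaks <- seq(from = 0.5, to = n_quarters, by = quarters_per_year)
print(year_breaks)
```

Under these assumptions the resulting positions could be passed to `geom_vline(xintercept = year_breaks)` rather than typed by hand.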

The pie chart (about incident types): It is reasonable to use a pie chart for proportion data. However, when there are too many categories, it becomes very difficult to perceive the differences between them by comparing the angles of the slices. It might therefore be sensible to switch to a bar graph, which relies on a more accurate elementary perceptual task: judging position along a common scale. In addition, if there is no reason to keep the current order, rearranging the categories by value would also make it easier to identify the most common problems, which are collisions, bus fouling and slip trip fall.

The donut chart: It presents the proportion of incidents with and without injuries. As there are only two categories, a donut chart might be acceptable. However, the data appear to be inconsistent: the total number of incidents for the current quarter is lower than 17,500 in the bar graph, yet equal to 18,369 in the donut chart. The chart is also redundant within the visualisation. A single figure for the proportion of incidents with injuries might be sufficient to deliver the same message.
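To illustrate how the donut chart could collapse into a single figure, the short sketch below derives the share of incidents with injuries from the two counts read off the donut chart (18,369 incidents in total, of which 1,456 involved injuries). The variable names are illustrative, and the calculation assumes both counts refer to the same quarter.

```r
# Counts as read from the donut chart of the original dashboard.
incidents_total <- 18369
incidents_with_injuries <- 1456

# Share of incidents that involved at least one injury.
share_with_injuries <- incidents_with_injuries / incidents_total

# Format as a percentage with one decimal place for use as a label.
label <- sprintf("%.1f%%", 100 * share_with_injuries)
print(label)  # "7.9%"
```

This is the value shown as a text annotation on the incidents bar graph in the proposed dashboard.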

The line graph: It successfully shows the general upward trend in the number of injuries. It also reveals a seasonal pattern, in which the first half of the year is always lower than the second. However, as suggested for the bar graph, vertical markers might make year-on-year comparison easier. Furthermore, there is another mismatch, this time between the donut chart and the line graph. The number of incidents with injuries is 1,456 in the donut chart, but higher than 1,500 in the line graph. It might be that some incidents involve more than one injury. If so, the axis label of the line graph should be changed from “number of incidents” to “number of injuries” to avoid misleading readers.

Furthermore, given that the objective is to identify areas where safety can be improved, showing the proportion of incident types without relating it to the number of injuries might be misleading. As can be seen in my redesign, while collision is the most common type of incident (38%), it contributes only 13.5% of the total number of injuries. Meanwhile, slip trip fall accounts for only 7% of incidents but causes 51% of injuries. Even though part of this message is mentioned in the summary, it is not visualised anywhere in the report, which might cause misunderstanding or confusion about the main problem to solve.
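The contrast between the two shares can be made explicit with a simple ratio. The sketch below uses only the four percentages quoted above; the data frame `shares` and the column `injury_to_incident_ratio` are illustrative names, not part of the original analysis.

```r
# Percentages quoted in the discussion for the current quarter.
shares <- data.frame(
  incident_type = c("Collision", "Slip trip fall"),
  pct_incidents = c(38, 7),
  pct_injuries  = c(13.5, 51)
)

# A ratio above 1 means the type causes a larger share of injuries
# than its share of incidents would suggest.
shares$injury_to_incident_ratio <- shares$pct_injuries / shares$pct_incidents
print(shares)
```

The ratio is well below 1 for collisions and above 7 for slip trip fall, which is why showing the two views side by side changes which problem looks most urgent.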

Proposed dashboard

Incident data are approximated from the original graph using WebPlotDigitizer.

Injury data is from Transport for London. Available at: https://tfl.gov.uk/cdn/static/cms/documents/qsr-publication.xlsx [Accessed 1 February 2019].

R codes

knitr::opts_chunk$set(warning = FALSE, message = FALSE, comment = NA, fig.align='center')
# Import library and dataset
library(tidyverse)
library(gridExtra)
library(cowplot)
library(grid)
library(readxl)

Data_visualisation_1_1 <- read_excel("Data visualisation 1.xlsx", sheet = "chart1")
Data_visualisation_1_2 <- read_excel("Data visualisation 1.xlsx", sheet = "chart2")
Data_visualisation_1_3 <- read_excel("Data visualisation 1.xlsx", sheet = "chart3")
Data_visualisation_1_4 <- read_excel("Data visualisation 1.xlsx", sheet = "chart4")
Data_visualisation_1_5 <- read_excel("Data visualisation 1.xlsx", sheet = "chart5")

# Create dashboard
p1 <- ggplot(Data_visualisation_1_1, aes(Quarter_Label, Total_Incidents))+
  geom_col(fill = "darkblue", alpha = 0.8)+
  geom_vline(xintercept = c(0.5, 4.5, 8.5, 12.5), alpha =0.5, linetype ="longdash")+
  # Apply the complete theme first so that later theme() tweaks are not overwritten
  theme_minimal()+
  theme(legend.position="none")+
  theme(axis.text.x = element_text(size=12),
        axis.text.y = element_text(size=12),
        plot.title = element_text(size=14),
        axis.title=element_text(size=14))+
  labs(title="All Reported Incidents On London Buses",
       x =NULL, y = "Number of Incidents")+
  annotate("text", x = 13, y = 21500, label = "7.9%", fontface = "bold", size = 6, col = "red", alpha =.7)+
  annotate("text", x = 13, y = 20000, label = " \n with \n injuries", size = 3.5, fontface = "bold", alpha =.7)+
  ylim(0,22500)

p2 <- ggplot(Data_visualisation_1_2, aes(x = reorder(Incident_Event_Type, Percent), Percent))+
  geom_col(fill = "darkblue", alpha = 0.8)+
  geom_text(aes(y = Percent + 2,label = Percent))+
  coord_flip()+
  theme_minimal()+
  theme(legend.position="none")+
  theme(axis.text.x = element_text(size=12),
        axis.text.y = element_text(size=12),
        plot.title = element_text(size=14),
        axis.title=element_text(size=14))+
  labs(title="Percent of Incident Types for the Current Quarter",
       x =NULL, y = "Percent of Incidents")+
  ylim(0,60)

p3 <- ggplot(Data_visualisation_1_4, aes(Quarter_Label, Injuries))+
  geom_col(fill = "red", alpha=.5)+
  geom_vline(xintercept = c(0.5, 4.5, 8.5, 12.5), alpha =0.5, linetype ="longdash")+
  theme_minimal()+
  theme(legend.position="none")+
  theme(axis.text.x = element_text(size=12),
        axis.text.y = element_text(size=12),
        plot.title = element_text(size=14),
        axis.title=element_text(size=14))+
  labs(title="All Injuries from Incidents Involving London Buses",
       x =NULL, y = "Number of Injuries")+
  ylim(0,2250)
  
p4 <- ggplot(Data_visualisation_1_5, aes(x = reorder(Incident_Event_Type, Percent), Percent))+
  geom_col(fill = "red", alpha=.5)+
  geom_text(aes(y = Percent + 2,label = Percent))+
  coord_flip()+
  theme_minimal()+
  theme(legend.position="none")+
  theme(axis.text.x = element_text(size=12),
        axis.text.y = element_text(size=12),
        plot.title = element_text(size=14),
        axis.title=element_text(size=14))+
  labs(title="Percent of Injuries by Incident Types for the Current Quarter",
       x =NULL, y = "Percent of Injuries")+
  ylim(0,60)

grid.arrange(plot_grid(p1, p2, p3, p4, scale = 0.9), 
             top = textGrob("London Buses Safety Dashboard – Q1 \nIncidents and Injuries",
                            x=0.065,hjust=0,
                            gp=gpar(fontsize=20, fontface="bold"))) 
# ggsave("Data Visualisation and Analysis - Task 1.png", width = 16, height = 9)

Data Visualisation and Analysis - Critical discussion

Xuan Pham

Feb 1, 2019
