RPubs URL: https://rpubs.com/Chuver/699257

library(dplyr)
library(ggplot2)
library(plotly)
ardd <- read.csv("C:/Users/verno/OneDrive/Documents/RMIT/MATH2404/Datasets/ardd.csv")
summary(ardd)
##     Crash.ID           State               Month             Year     
##  Min.   :19892001   Length:30719       Min.   : 1.000   Min.   :1989  
##  1st Qu.:20031438   Class :character   1st Qu.: 3.000   1st Qu.:2003  
##  Median :20082224   Mode  :character   Median : 6.000   Median :2008  
##  Mean   :20076447                      Mean   : 6.497   Mean   :2007  
##  3rd Qu.:20142018                      3rd Qu.:10.000   3rd Qu.:2014  
##  Max.   :20208004                      Max.   :12.000   Max.   :2020  
##    Dayweek              Time            Crash.Type        Bus.Involvement   
##  Length:30719       Length:30719       Length:30719       Length:30719      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Heavy.Rigid.Truck.Involvement Articulated.Truck.Involvement  Speed.Limit    
##  Length:30719                  Length:30719                  Min.   :  5.00  
##  Class :character              Class :character              1st Qu.: 60.00  
##  Mode  :character              Mode  :character              Median : 80.00  
##                                                              Mean   : 82.88  
##                                                              3rd Qu.:100.00  
##                                                              Max.   :130.00  
##   Road.User            Gender               Age        
##  Length:30719       Length:30719       Min.   :  0.00  
##  Class :character   Class :character   1st Qu.: 23.00  
##  Mode  :character   Mode  :character   Median : 37.00  
##                                        Mean   : 41.28  
##                                        3rd Qu.: 57.00  
##                                        Max.   :101.00  
##  National.Remoteness.Areas SA4.Name.2016      National.LGA.Name.2017
##  Length:30719              Length:30719       Length:30719          
##  Class :character          Class :character   Class :character      
##  Mode  :character          Mode  :character   Mode  :character      
##                                                                     
##                                                                     
##                                                                     
##  National.Road.Type Christmas.Period   Easter.Period       Age.Group        
##  Length:30719       Length:30719       Length:30719       Length:30719      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Day.of.week        Time.of.day       
##  Length:30719       Length:30719      
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 
head(ardd)
##   Crash.ID State Month Year  Dayweek  Time Crash.Type Bus.Involvement
## 1 20203092   Qld    10 2020 Saturday 18:00   Multiple              No
## 2 20201114   NSW    10 2020 Saturday  5:55   Multiple              No
## 3 20205025    WA    10 2020 Saturday 19:00   Multiple              No
## 4 20205030    WA    10 2020   Sunday  1:17     Single              No
## 5 20203027   Qld    10 2020 Saturday 18:00     Single              No
## 6 20205008    WA    10 2020   Sunday 23:49     Single              No
##   Heavy.Rigid.Truck.Involvement Articulated.Truck.Involvement Speed.Limit
## 1                            No                            No          60
## 2                            No                            No         100
## 3                           Yes                            No         110
## 4                            No                            No          90
## 5                            No                            No          60
## 6                            No                            No         110
##          Road.User Gender Age National.Remoteness.Areas  SA4.Name.2016
## 1 Motorcycle rider   Male  38 Major Cities of Australia Sunshine Coast
## 2           Driver   Male  25  Inner Regional Australia      Illawarra
## 3           Driver   Male  48                                         
## 4           Driver   Male  41                                         
## 5           Driver   Male  29 Major Cities of Australia Sunshine Coast
## 6        Passenger Female   1                                         
##   National.LGA.Name.2017 National.Road.Type Christmas.Period Easter.Period
## 1     Sunshine Coast (R)     Collector Road               No            No
## 2            Wollondilly      Arterial Road               No            No
## 3                                                         No            No
## 4                                                         No            No
## 5     Sunshine Coast (R)     Collector Road               No            No
## 6                                                         No            No
##   Age.Group Day.of.week Time.of.day
## 1  26_to_39     Weekend       Night
## 2  17_to_25     Weekend       Night
## 3  40_to_64     Weekend       Night
## 4  40_to_64     Weekend       Night
## 5  26_to_39     Weekend       Night
## 6   0_to_16     Weekend       Night

Introduction

As drivers, we take for granted the freedom we have (in a COVID-19 normal world). No place is too far: we are limited only by how long we can drive and how much fuel is in the tank. Road travel has certainly opened up Australia, and we all look forward to the big road trip for the holidays. The downside of this freedom is the “fatality” factor: at the back of our minds there is always the concern that we may not reach our destination.

According to Jerome Carslake, manager of the National Road Safety Partnership Program (NRSPP), in an article published by QBE, the five most common causes of car accidents in Australia are:

1. Fatigue

2. Speed

3. Distractions (including mobile phones)

4. Alcohol

5. Drugs

Beyond the permanent loss and pain that come with losing someone in an accident, there is a tremendous economic cost to every fatality. A study by Economic Connections Pty Ltd (ECON), commissioned by the Australian Automobile Association (AAA) to quantify the cost of road trauma, found that the economic cost of each road fatality in 2015 was $4.34 million.

To better understand the where, who, when and how of road fatalities in Australia, and in doing so raise awareness among all drivers with the ultimate aim of reducing the road toll, we’ll examine a series of graphs covering 2015 to 2019. The data are sourced from the Australian Road Deaths Database (ARDD) and loaded into the ardd data frame, which is used to generate all the graphs.

Where do Australian road fatalities occur?

To form a big picture of where most road fatalities occur in Australia, we start with the breakdown by state and territory. As seen below, from 2015 to 2019 the number of road fatalities in each state has been somewhat stable, declining a little before increasing again in 2019. This is surprising given the enhanced safety features on modern vehicles, such as Autonomous Emergency Braking (AEB), as well as improved road infrastructure. New South Wales, with the largest population (8,157,000 at the end of March 2020, ABS), had the most fatalities, whilst the Australian Capital Territory, with 429,800 (ABS), had the fewest. The number of fatalities appears to be linked to population size and road distances.
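
As a rough numerical companion to the chart, the same state-by-year counts can be tabulated directly. The sketch below is only illustrative: it uses base R and a tiny made-up data frame named toy (a hypothetical stand-in, not ARDD data) that shares the State and Year column names of ardd.

```r
# Toy stand-in for the ardd data frame: one row per fatality (made-up values).
toy <- data.frame(
  State = c("NSW", "NSW", "NSW", "Vic", "Vic", "ACT"),
  Year  = c(2015, 2015, 2019, 2015, 2019, 2019)
)

# Keep 2015 to 2019 using numeric comparisons, then count rows per State and Year.
recent <- subset(toy, Year >= 2015 & Year < 2020)
counts <- table(recent$State, recent$Year)
print(counts)
```

Swapping toy for ardd should give the numbers printed on the bars, assuming the same year filter is applied.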

theme_update(axis.text = element_text(size = 20))
ardd <- read.csv("C:/Users/verno/OneDrive/Documents/RMIT/MATH2404/Datasets/ardd.csv")

# Keep fatalities from 2015 to 2019; Year is numeric, so compare against numbers
f1 <- ardd %>%
  filter(Year >= 2015,
         Year < 2020)

# One panel per state, one bar per year, with the count printed on each bar
g1 <- ggplot(f1, aes(x = as.factor(Year)))
g1 + geom_bar(fill = "#9ebcda", width = 0.7) +
  facet_wrap(~State) +
  theme_gray() +
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = 0.6) +
  labs(x = "Year", y = "Fatalities",
       title = "Vehicle Fatalities in Australia between 2015 and 2019",
       caption = "Source: Australian Road Deaths Database")
Vehicle Fatalities in Australia between 2015 and 2019

Who is involved?

Now that we understand that New South Wales, Victoria, Queensland and Western Australia account for the majority of the cases of road fatalities, let’s further determine the breakdown by gender and age group.

# Drop rows with a blank remoteness area or road type, then keep 2015 to 2019
ardd %>%
  filter(!(National.Remoteness.Areas %in% "")) %>%
  filter(!(National.Road.Type %in% "")) %>%
  filter(Year >= 2015,
         Year < 2020) %>%
  ggplot() +
  aes(x = Age.Group) +
  geom_bar(fill = "#9ebcda", width = 0.4) +
  theme_gray() +
  facet_grid(cols = vars(Gender)) +
  geom_text(stat = "count", aes(label = after_stat(count)), hjust = 1) +
  coord_flip() +
  labs(x = "Age Group", y = "Fatalities",
       title = "Fatalities by Gender and Age Group between 2015 and 2019",
       caption = "Source: Australian Road Deaths Database")
Fatalities by Gender and Age Group between 2015 and 2019

As shown above, males account for more deaths than females in every age category. The highest-risk group is males aged 40 to 64. According to budgetdirect, males accounted for 73% of all Australian road fatalities from 2013 to 2018. This is a scary statistic. Males are by nature more aggressive than females, and aggression is certainly a key ingredient in road rage incidents. This view is echoed by the Queensland Department of Housing and Public Works, whose fact sheet Managing aggressive drivers and road rage states that ‘Aggressive driver behaviour is contributing to crashes and physical conflict between road users’ (Queensland Department of Housing and Public Works 2016, para. 1).

Now let’s examine the gender-age-state breakdown to better understand the state composition via the following boxplot:

# Same filters as before: non-blank remoteness area and road type, 2015 to 2019
g1 <- ardd %>%
  filter(!(National.Remoteness.Areas %in% "")) %>%
  filter(!(National.Road.Type %in% "")) %>%
  filter(Year >= 2015,
         Year < 2020)

# Age distribution per state, split by gender
ggplot(g1, aes(x = State, y = Age, fill = Gender)) +
  geom_boxplot() +
  scale_fill_hue() +
  theme_grey() +
  labs(x = "State",
       title = "Age and Gender Fatalities by State between 2015 and 2019",
       caption = "Source: Australian Road Deaths Database")
Age and Gender Fatalities by State between 2015 and 2019

Each box spans the interquartile range, that is, the ages between the 25th and 75th percentiles of the observations, whilst the line inside each box shows the median age. Basically, what the boxplots tell us is that the median age of male fatalities is in the 40s, whilst for females it is higher, in the 50s. The Northern Territory has the lowest median age for males whilst South Australia has the highest. It is interesting to note that the ACT has the highest median age for females whilst the NT has the lowest.
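
To make the reading of a boxplot concrete, the three values that define each box can be computed by hand. This is a minimal sketch in base R using made-up ages in a hypothetical data frame named toy (not ARDD data); only the Gender and Age column names are borrowed from ardd.

```r
# Made-up ages for two groups; these values are not taken from the ARDD.
toy <- data.frame(
  Gender = c("Male", "Male", "Male", "Male", "Male",
             "Female", "Female", "Female", "Female", "Female"),
  Age    = c(20, 30, 40, 50, 60, 25, 45, 55, 65, 85)
)

# For each gender, the 25th, 50th (median) and 75th percentiles of age.
# These correspond to the lower edge, middle line and upper edge of a box.
box_stats <- tapply(toy$Age, toy$Gender, quantile, probs = c(0.25, 0.5, 0.75))
print(box_stats)
```

Grouping by State as well as Gender in the real data should give the values the boxes above are drawn from.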

When do these car fatalities occur?

The next factor to consider is month, time of day and day of week. One may expect that most road fatalities occur at night or in the early hours of the morning. Is this the case? Let’s have a further look…

# Drop rows with a blank remoteness area or road type, then keep 2015 to 2019
y1 <- ardd %>%
  filter(!(National.Remoteness.Areas %in% "")) %>%
  filter(!(National.Road.Type %in% "")) %>%
  filter(Year >= 2015,
         Year < 2020)

# Months on the flipped axis; rows split by Day.of.week, columns by Time.of.day
p1 <- ggplot(y1, aes(x = as.factor(Month))) +
  geom_bar(fill = "#9ebcda") +
  theme_gray() +
  facet_grid(vars(Day.of.week), vars(Time.of.day)) +
  geom_text(stat = "count", aes(label = after_stat(count)), hjust = 1) +
  coord_flip() +
  labs(x = "Months", y = "Fatalities",
       title = "Fatalities by Month, Day/Night and Weekday/Weekend",
       caption = "Source: Australian Road Deaths Database")
p1
Fatalities by Month, Day/Night and Weekday/Weekend

As shown above, most fatalities actually occur during the day and during the week. Readers with an eye for detail will notice the higher Day numbers for January, March and August. These higher numbers coincide with the New Year, Easter and winter school holidays, so school holidays do appear to be a factor.
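
The day/night and weekday/weekend split can also be checked as a simple cross-tabulation. The sketch below uses base R on a tiny made-up data frame named toy (hypothetical, not ARDD data); the category labels "Weekday" and "Day" are assumed to match those in ardd, of which only "Weekend" and "Night" appear in the head() output.

```r
# Made-up rows, one per fatality; labels are assumed to match the ardd columns.
toy <- data.frame(
  Day.of.week = c("Weekday", "Weekday", "Weekday", "Weekend", "Weekend"),
  Time.of.day = c("Day", "Day", "Night", "Day", "Night")
)

# Count fatalities in each Day.of.week x Time.of.day cell.
cells <- table(toy$Day.of.week, toy$Time.of.day)

# Express each cell as a share of all fatalities in the table.
shares <- prop.table(cells)
print(cells)
print(round(shares, 2))
```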

How does Speed play a role?

The final piece of the puzzle in our examination of Australian road fatalities is speed. Let’s interactively explore how the speed zone relates to the type of road user killed in a fatal accident.

# Bin each fatality by posted speed limit; case_when() uses the first matching condition
r1 <- ardd %>%
  mutate(Zone = case_when(Speed.Limit >= 100 ~ 'Beyond 100 Km/h',
                          Speed.Limit >= 50 ~ 'Between 50-99 Km/h',
                          Speed.Limit < 50 ~ 'Below 50 Km/h')) %>%
  filter(!(National.Remoteness.Areas %in% "")) %>%
  filter(!(National.Road.Type %in% "")) %>%
  filter(Year >= 2015,
         Year < 2020)
s1 <- (ggplot(r1, aes(x = Zone, fill = Road.User))) + geom_bar() + 
facet_grid(Crash.Type~.) + 
scale_fill_manual(values=c("#d73027", "#fc8d59", "#fee090",
"#ffffbf", "#e0f3f8", "#91bfdb","#4575b4")) +
theme_grey() +  theme(axis.title=element_text(size=8.5, face = "bold"), 
axis.text.x = element_text(size = 7,face ="bold"), 
axis.text.y = element_text(size = 6)) + labs(x ="         \nSpeed Zone    
                                                          Source: Australian Road Deaths Database", 
y = "Fatalities\n", 
title = "Fatalities by Speed Zone, Road Users & Crash Type between 2015 & 2019")
ggplotly(s1)

Fatalities by Speed Zone, Road User & Crash Types between 2015 & 2019

Hovering the cursor over the interactive bar chart shows the number of fatalities, the type of road user (e.g. driver, passenger, pedestrian) and the speed zone in which the accident occurred. The Single and Multiple panels indicate whether one or more vehicles were involved.

Upon further analysis of the graphs above, the following can be concluded:

  • Higher speed zones (100 km/h and above) account for more driver and passenger deaths.

  • Lower speed zones (between 50 and 99 km/h) account for more motorcyclist, cyclist and pedestrian deaths.

  • In speed zones below 50 km/h, pedestrians form the majority of casualties.

  • Speed does not appear to be a factor for single or multiple vehicle accidents, as their bar charts are similar.
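
The comparisons above rest on the mix of road users within each speed zone, which can be sketched as row-wise shares. The example below is a base R illustration on a made-up data frame named toy (hypothetical, not ARDD data). It uses cut() as an assumed equivalent of the case_when() thresholds in the plotting code, with the same zone labels.

```r
# Made-up rows, one per fatality; these values are not taken from the ARDD.
toy <- data.frame(
  Speed.Limit = c(40, 40, 60, 80, 100, 110, 110),
  Road.User   = c("Pedestrian", "Pedestrian", "Motorcycle rider", "Pedestrian",
                  "Driver", "Driver", "Passenger")
)

# Same thresholds as the case_when() call, written with cut().
# right = FALSE closes each interval on the left: [0,50), [50,100), [100,Inf).
toy$Zone <- cut(toy$Speed.Limit,
                breaks = c(0, 50, 100, Inf),
                labels = c("Below 50 Km/h", "Between 50-99 Km/h", "Beyond 100 Km/h"),
                right = FALSE)

# Share of each road user type within each zone (each row sums to 1).
zone_share <- prop.table(table(toy$Zone, toy$Road.User), margin = 1)
print(round(zone_share, 2))
```

Looking at shares rather than raw counts makes zones with very different totals easier to compare.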

Conclusion

Given all the information above, we can see that high speed zones account for the majority of driver and passenger fatalities, irrespective of where in Australia the accident occurred. Pedestrian, cyclist and motorbike fatalities are more likely to occur in lower speed zones. The group with the most road fatalities is males aged 40 to 64. From a time perspective, weekends and night time are not real factors, as most fatalities occur during the daylight hours of the work week. Certain months of the year, particularly those with school holidays, are a factor.

Fatalities on the road are an inevitable part of life. To reduce the toll, additional driver education and awareness should be undertaken and focused primarily on middle-aged males. Perhaps an additional demerit loading on high-risk groups could be introduced? The implementation of more point-to-point cameras, which calculate average speed between two points, could change driver speed behaviour and discourage speeding, as they are always “on”. With the rapid pace of technology, smart cars will soon have the ability to automatically limit speeds and even take over from the driver if required.

Ultimately, as drivers we should all exercise care whenever we step inside a vehicle. We should ensure that we are well rested and allow sufficient time to get to our destination. Some courtesy and respect towards other drivers can reduce the opportunity for road rage. Yes, there are technology and laws to “assist” us, but at the end of the day we are masters of our own destiny. If we all do our small bit, a reduction in our road fatality numbers is inevitable.

References