For my final project, I chose to explore a dataset of traffic accident records to better understand what causes accidents. Last year, my sister was in a car accident that left her with a broken leg. She wasn’t able to walk for a while and still has a limp. I have been in a small car accident myself and have friends who have as well. I’ve always been a careful driver, but after my sister’s accident, I’ve been extra vigilant. Sometimes, traffic accidents are truly just accidents, but I think many have causes that could have been avoided. I wanted to look at this data to gain some awareness of the conditions that cause traffic accidents and learn how to avoid them as best as possible.
Description
The dataset has over 200,000 observations of 24 variables. Its variable types include Character, Integer, and Numerical values. The dataset describes information about traffic accidents from 2016 to 2023. The variables describe aspects regarding the accident such as the conditions, type, and outcomes. Based on the initial inspection, “crash_date” stands out as a column of interest. It is stored as a character, but could be more useful as a datetime, so that may be a value to change in the future. The author states that the data for this dataset was “obtained from the internet”. Based on the information recorded and how it was recorded, I believe the data was likely obtained from police traffic incident reports or insurance reports.
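As a minimal sketch of the datetime conversion mentioned above, the character values can be parsed with base R's `as.POSIXct()`. The sample strings are copied from the first rows of the dataset shown later in this report. The `tz = "UTC"` setting is an assumption, since the source does not state a time zone, and the `"C"` time locale is set only so that the AM/PM marker is read the same way on every machine.

```r
# Sample values copied from the first three rows of the dataset
crash_date_raw <- c("07/29/2023 01:00:00 PM",
                    "08/13/2023 12:11:00 AM",
                    "12/09/2021 10:30:00 AM")

# Use the "C" time locale so that %p recognises AM/PM regardless of system language
invisible(Sys.setlocale("LC_TIME", "C"))

# Parse month/day/year with a 12-hour clock; tz = "UTC" is an assumption
crash_datetime <- as.POSIXct(crash_date_raw,
                             format = "%m/%d/%Y %I:%M:%S %p",
                             tz = "UTC")

# Extract the hour (0-23) so it can be compared with the crash_hour column
parsed_hour <- as.integer(format(crash_datetime, "%H"))
print(parsed_hour)
```

The parsed hours should agree with the `crash_hour` values recorded for the same rows, which is a quick sanity check on the format string.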
The type of traffic control device involved (e.g., traffic light, sign)
group_by()
weather_condition | Character | Continuous | The weather conditions at the time of the accident | group_by()
lighting_condition | Character | Continuous | The lighting conditions at the time of the accident | group_by()
first_crash_type | Character | Continuous | The initial type of the crash (e.g., head-on, rear-end) | group_by()
trafficway_type | Character | Continuous | The type of roadway involved in the accident (e.g., highway, local road) | group_by()
alignment | Character | Continuous | The alignment of the road where the accident occurred (e.g., straight, curved) | group_by()
roadway_surface_cond | Character | Continuous | The condition of the roadway surface (e.g., dry, wet, icy) | group_by()
road_defect | Character | Continuous | Any defects present on the road surface | group_by()
crash_type | Character | Continuous | The overall type of the crash | group_by()
intersection_related_i | Character | Discrete | Whether the accident was related to an intersection | filter(), group_by()
damage | Character | Continuous | The extent of the damage caused by the accident | group_by()
prim_contributory_cause | Character | Continuous | The primary cause contributing to the crash | group_by()
num_units | Numerical: Integer | Discrete | The number of vehicles involved in the accident | filter()
most_severe_injury | Character | Continuous | The most severe injury sustained in the crash | group_by()
injuries_total | Numerical | Discrete | The total number of injuries reported | filter(), summary()
injuries_fatal | Numerical | Discrete | The number of fatal injuries resulting from the accident | filter(), summary()
injuries_incapacitating | Numerical | Discrete | The number of incapacitating injuries | filter(), summary()
injuries_non_incapacitating | Numerical | Discrete | The number of non-incapacitating injuries | filter(), summary()
injuries_reported_not_evident | Numerical | Discrete | The number of injuries reported but not visibly evident | filter(), summary()
injuries_no_indication | Numerical | Discrete | The number of cases with no indication of injury | filter(), summary()
crash_hour | Numerical: Integer | Discrete | The hour the accident occurred (0-23) | filter()
crash_day_of_week | Numerical: Integer | Discrete | The day of the week the accident occurred (1-7) | filter()
crash_month | Numerical: Integer | Discrete | The month the accident occurred | filter()
Ethical Considerations
Given that the data was collected from the internet, there could be some concerns regarding how the data was collected. Its source could be unreliable or inaccurate. There is also no indication of what areas or regions this data was collected from. Because of this, information that could impact the number of traffic accidents that occur, like population or the number of drivers in the area, cannot be considered in this analysis.
Initial Hypothesis
Traffic accidents resulting in severe injuries (fatal or incapacitating) occur more often during adverse weather conditions and poor lighting, suggesting a significant positive correlation between poor driving conditions and accident-related injuries.
Data Processing
# Load the needed libraries
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(ggplot2)
library(lubridate)
Warning: package 'lubridate' was built under R version 4.4.2
Attaching package: 'lubridate'
The following objects are masked from 'package:base':
date, intersect, setdiff, union
library(scales)
# Read in the dataset
accidents_data_raw <- read.csv("data/traffic_accidents.csv")
Dataset Inspection
# Inspect the first few rows
head(accidents_data_raw)
crash_date traffic_control_device weather_condition
1 07/29/2023 01:00:00 PM TRAFFIC SIGNAL CLEAR
2 08/13/2023 12:11:00 AM TRAFFIC SIGNAL CLEAR
3 12/09/2021 10:30:00 AM TRAFFIC SIGNAL CLEAR
4 08/09/2023 07:55:00 PM TRAFFIC SIGNAL CLEAR
5 08/19/2023 02:55:00 PM TRAFFIC SIGNAL CLEAR
6 09/06/2023 12:59:00 AM NO CONTROLS RAIN
lighting_condition first_crash_type trafficway_type alignment
1 DAYLIGHT TURNING NOT DIVIDED STRAIGHT AND LEVEL
2 DARKNESS, LIGHTED ROAD TURNING FOUR WAY STRAIGHT AND LEVEL
3 DAYLIGHT REAR END T-INTERSECTION STRAIGHT AND LEVEL
4 DAYLIGHT ANGLE FOUR WAY STRAIGHT AND LEVEL
5 DAYLIGHT REAR END T-INTERSECTION STRAIGHT AND LEVEL
6 DARKNESS, LIGHTED ROAD FIXED OBJECT NOT DIVIDED STRAIGHT AND LEVEL
roadway_surface_cond road_defect crash_type
1 UNKNOWN UNKNOWN NO INJURY / DRIVE AWAY
2 DRY NO DEFECTS NO INJURY / DRIVE AWAY
3 DRY NO DEFECTS NO INJURY / DRIVE AWAY
4 DRY NO DEFECTS INJURY AND / OR TOW DUE TO CRASH
5 UNKNOWN UNKNOWN NO INJURY / DRIVE AWAY
6 WET UNKNOWN INJURY AND / OR TOW DUE TO CRASH
intersection_related_i damage prim_contributory_cause
1 Y $501 - $1,500 UNABLE TO DETERMINE
2 Y OVER $1,500 IMPROPER TURNING/NO SIGNAL
3 Y $501 - $1,500 FOLLOWING TOO CLOSELY
4 Y OVER $1,500 UNABLE TO DETERMINE
5 Y $501 - $1,500 DRIVING SKILLS/KNOWLEDGE/EXPERIENCE
6 N $501 - $1,500 UNABLE TO DETERMINE
num_units most_severe_injury injuries_total injuries_fatal
1 2 NO INDICATION OF INJURY 0 0
2 2 NO INDICATION OF INJURY 0 0
3 3 NO INDICATION OF INJURY 0 0
4 2 NONINCAPACITATING INJURY 5 0
5 2 NO INDICATION OF INJURY 0 0
6 1 NONINCAPACITATING INJURY 2 0
injuries_incapacitating injuries_non_incapacitating
1 0 0
2 0 0
3 0 0
4 0 5
5 0 0
6 0 2
injuries_reported_not_evident injuries_no_indication crash_hour
1 0 3 13
2 0 2 0
3 0 3 10
4 0 0 19
5 0 3 14
6 0 0 0
crash_day_of_week crash_month
1 7 7
2 1 8
3 5 12
4 4 8
5 7 8
6 4 9
# Display the column names
names(accidents_data_raw)
The initial inspection leads me to believe this dataset has few, if any, missing values. To confirm, I will use the base R function is.na() to count missing values in each column.
# Look for missing values in each column
print(colSums(is.na(accidents_data_raw)))
Since this dataset is already fairly clean, there is no need to handle any missing values.
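One caveat is that `is.na()` only counts true `NA` values, while the first rows shown above contain the placeholder string "UNKNOWN". The sketch below counts such placeholders in character columns. The rows are copied from the first six records shown in this report, and treating "UNKNOWN" and empty strings as missing is an assumption about how the source encodes gaps.

```r
# Values copied from the first six rows shown in this report
first_rows <- data.frame(
  roadway_surface_cond = c("UNKNOWN", "DRY", "DRY", "DRY", "UNKNOWN", "WET"),
  road_defect = c("UNKNOWN", "NO DEFECTS", "NO DEFECTS",
                  "NO DEFECTS", "UNKNOWN", "UNKNOWN"),
  num_units = c(2L, 2L, 3L, 2L, 2L, 1L),
  stringsAsFactors = FALSE
)

# is.na() reports zero here because "UNKNOWN" is an ordinary string
na_counts <- colSums(is.na(first_rows))

# Count placeholder strings, looking at character columns only
is_char <- vapply(first_rows, is.character, logical(1))
placeholder_counts <- vapply(
  first_rows[is_char],
  function(col) sum(trimws(col) %in% c("UNKNOWN", "")),
  integer(1)
)

print(na_counts)
print(placeholder_counts)
```

Running the same check on `accidents_data_raw` would show how much of each categorical column is effectively unrecorded, which matters when grouping by those columns later.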
Handling Outliers
Most of the data in this dataset is categorical, and the numerical variables describe the number of cars involved in the accident, the injuries associated with the accident, and when it occurred. Given the nature of the numerical characteristics of the dataset, I’ll check the following variables for any outliers: num_units, injuries_total, injuries_fatal, injuries_incapacitating, injuries_non_incapacitating, injuries_reported_not_evident, and injuries_no_indication.
# Print the summary statistics for the described variables
accidents_data_raw %>%
  select(num_units, injuries_total, injuries_fatal, injuries_incapacitating,
         injuries_non_incapacitating, injuries_reported_not_evident,
         injuries_no_indication) %>%
  summary()
num_units injuries_total injuries_fatal injuries_incapacitating
Min. : 1.000 Min. : 0.0000 Min. :0.000000 Min. :0.0000
1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.:0.000000 1st Qu.:0.0000
Median : 2.000 Median : 0.0000 Median :0.000000 Median :0.0000
Mean : 2.063 Mean : 0.3827 Mean :0.001858 Mean :0.0381
3rd Qu.: 2.000 3rd Qu.: 1.0000 3rd Qu.:0.000000 3rd Qu.:0.0000
Max. :11.000 Max. :21.0000 Max. :3.000000 Max. :7.0000
injuries_non_incapacitating injuries_reported_not_evident
Min. : 0.0000 Min. : 0.0000
1st Qu.: 0.0000 1st Qu.: 0.0000
Median : 0.0000 Median : 0.0000
Mean : 0.2212 Mean : 0.1215
3rd Qu.: 0.0000 3rd Qu.: 0.0000
Max. :21.0000 Max. :15.0000
injuries_no_indication
Min. : 0.000
1st Qu.: 2.000
Median : 2.000
Mean : 2.244
3rd Qu.: 3.000
Max. :49.000
# Construct boxplots for the variables
accidents_data_raw %>%
  select(num_units, injuries_total, injuries_fatal, injuries_incapacitating,
         injuries_non_incapacitating, injuries_reported_not_evident,
         injuries_no_indication) %>%
  tidyr::gather(metric, value) %>%
  ggplot(aes(x = metric, y = value)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "Distribution of Numerical Traffic Accident Variables",
       x = "Metric",
       y = "Value") +
  theme(axis.text.x = element_text(angle = 45))
Looking at the box plot, it seems that all these variables have many outliers. However, for a dataset like this, that is not unusual. There is no “normal” result of a car accident. The number of cars involved or injuries suffered depends on many outside factors, some of which can’t be predicted. For this reason, I will be keeping the outliers in the dataset, as they could be important points of information. It’s possible that some of these variables could have a relationship to be explored. For example, the scatterplot below shows a weak positive correlation between the number of units involved in the accident and the total number of injuries.
# Create a scatter plot
ggplot(accidents_data_raw, aes(x = injuries_total, y = num_units)) +
  geom_point(alpha = 0.5, color = "red") +
  theme_minimal() +
  labs(title = "Number of Units vs Total Injuries",
       subtitle = "Possible Positive Correlation Among Outliers") +
  geom_smooth(method = lm, color = "blue")
`geom_smooth()` using formula = 'y ~ x'
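To put a number on the relationship the scatterplot suggests, a rank correlation can be computed alongside the plot. The sketch below uses hypothetical toy values, not figures from the dataset, and picks Spearman's method as an assumption because count data like this is skewed and full of ties.

```r
# Hypothetical toy values, not taken from the dataset
num_units      <- c(1, 2, 2, 2, 3, 3, 4, 5)
injuries_total <- c(0, 0, 1, 0, 1, 2, 2, 4)

# Spearman's rank correlation measures monotonic association
# and is less sensitive to extreme counts than Pearson's r
rho <- cor(num_units, injuries_total, method = "spearman")
print(round(rho, 2))
```

On the real data the equivalent call would be `cor(accidents_data_raw$num_units, accidents_data_raw$injuries_total, method = "spearman")`; a value close to zero would support describing the correlation as weak.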
Transformations
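The numerical variables are counts, and the summary statistics above show they are right-skewed rather than normally distributed (for example, injuries_total has a median of 0 and a maximum of 21). Because the analysis below relies on sums and ratios rather than on methods that assume normality, no transformations are applied.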
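A quick way to check the shape of a count variable is to compare its mean with its median and to look at the share of zeros; a mean well above the median points to right skew. The sketch below uses a hypothetical toy vector rather than the dataset itself.

```r
# Hypothetical toy injury counts: mostly zeros with a few larger values
injuries_toy <- c(rep(0, 15), 1, 1, 1, 2, 5)

mean_value   <- mean(injuries_toy)      # pulled upward by the few large counts
median_value <- median(injuries_toy)    # unaffected by the tail
share_zero   <- mean(injuries_toy == 0) # proportion of records with no injuries

# A mean above the median is a simple indicator of right skew
right_skewed <- mean_value > median_value

print(c(mean = mean_value, median = median_value, share_zero = share_zero))
print(right_skewed)
```

The same three quantities can be read off the `summary()` output earlier in this section for each injury column.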
Exploratory Analysis and Visualization
Question 1: Of the total number of traffic accidents recorded, how many result in severe injury?
First, it is important to establish how often a traffic accident may result in a severe injury. It’s likely that many will not, which is important to put into perspective to keep from spreading unnecessary fears. For this project, an injury is “severe” if it is incapacitating or fatal.
# Create a new data frame with a column that totals all the reports of
# injuries and no injuries (different from "injuries_total")
accidents_reported <- accidents_data_raw %>%
  mutate(total_injuries_reported = injuries_fatal + injuries_incapacitating +
           injuries_non_incapacitating + injuries_reported_not_evident +
           injuries_no_indication)
accidents_reported %>%
  select(total_injuries_reported, injuries_total) %>%
  slice(1:10)
# Create a dataframe with the injury-to-report ratio for visualization
total_reports <- sum(accidents_reported$total_injuries_reported)
accident_injury_ratio <- data.frame(
  injury_types = c("Fatal Injuries", "Incapacitating Injuries",
                   "Non-Incapacitating Injuries", "Reported, Not Evident", "None"),
  ratio = c(sum(accidents_reported$injuries_fatal) / total_reports,
            sum(accidents_reported$injuries_incapacitating) / total_reports,
            sum(accidents_reported$injuries_non_incapacitating) / total_reports,
            sum(accidents_reported$injuries_reported_not_evident) / total_reports,
            sum(accidents_reported$injuries_no_indication) / total_reports)
)
accident_injury_ratio
# Visualize the ratios
ggplot(accident_injury_ratio,
       aes(x = reorder(injury_types, -ratio), y = ratio, fill = injury_types)) +
  geom_bar(stat = "identity", alpha = 0.8) +
  labs(title = "Majority of Traffic Accidents Report No Injuries",
       x = "Injury Reported",
       y = "Ratio of Report to Total Reports") +
  theme_minimal() +
  theme(plot.title = element_text(size = 18, face = "bold"),
        axis.title = element_text(size = 12),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Spectral")
It’s clear that the vast majority of reports indicate no injury at all. That’s a reassuring thing to keep in mind while exploring this dataset. However, given that this dataset is so large, even a small percentage of reports being fatal injuries still amounts to hundreds of fatalities. Non-incapacitating injuries can also have horrible consequences. It’s also important to note that many traffic accidents are minor, like sideswipes or dings, so it’s likely that certain types of accidents account for the more serious consequences.
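The shares of each injury category can also be computed in one step with `colSums()`. This is a minimal sketch on hypothetical toy rows whose columns are named as in the dataset; the numbers are made up for illustration.

```r
# Hypothetical toy rows; column names follow the dataset
toy_reports <- data.frame(
  injuries_fatal                = c(0, 0, 1, 0),
  injuries_incapacitating       = c(0, 1, 0, 0),
  injuries_non_incapacitating   = c(0, 5, 0, 2),
  injuries_reported_not_evident = c(1, 0, 0, 0),
  injuries_no_indication        = c(3, 2, 3, 2)
)

# Total reports per category, then each category's share of all reports
category_totals <- colSums(toy_reports)
category_share  <- category_totals / sum(category_totals)

print(round(category_share, 3))
```

Because every share is divided by the same grand total, the shares sum to one, which is a convenient check that no category was left out.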
Question 2: How does the type of injury relate to the accident/crash type?
As stated above, the type of accident is an important factor in the possibility and seriousness of a resulting injury. Involvement of a pedestrian, a non-vehicle object, or the crash angle are all variables to consider.
# Rank the crash types by the number of reports made (the first ten rows are the most common)
accidents_reported %>%
  group_by(first_crash_type) %>%
  summarise(reports_made = sum(total_injuries_reported)) %>%
  arrange(-reports_made)
# A tibble: 18 × 2
first_crash_type reports_made
<chr> <dbl>
1 TURNING 173763
2 ANGLE 148323
3 REAR END 112464
4 SIDESWIPE SAME DIRECTION 52284
5 PEDESTRIAN 19974
6 PEDALCYCLIST 11829
7 PARKED MOTOR VEHICLE 7265
8 FIXED OBJECT 6790
9 HEAD ON 5216
10 SIDESWIPE OPPOSITE DIRECTION 4908
11 REAR TO FRONT 2913
12 REAR TO SIDE 2059
13 OTHER OBJECT 1242
14 OTHER NONCOLLISION 366
15 OVERTURNED 160
16 REAR TO REAR 110
17 ANIMAL 107
18 TRAIN 15
# List of the top 10 most common crash types reported
top_10_crash_types <- c("TURNING", "ANGLE", "REAR END", "SIDESWIPE SAME DIRECTION",
                        "PEDESTRIAN", "PEDALCYCLIST", "PARKED MOTOR VEHICLE",
                        "FIXED OBJECT", "HEAD ON", "SIDESWIPE OPPOSITE DIRECTION")
# Create a dataframe with the injury-to-report ratio for visualization
accident_injury_crash_type <- accidents_reported %>%
  group_by(first_crash_type) %>%
  summarise(injuries_fatal_ratio = sum(injuries_fatal) / sum(total_injuries_reported),
            injuries_no_indication_ratio = sum(injuries_no_indication) / sum(total_injuries_reported)) %>%
  filter(first_crash_type %in% top_10_crash_types)
accident_injury_crash_type
# A tibble: 10 × 3
first_crash_type injuries_fatal_ratio injuries_no_indication_ra…¹
<chr> <dbl> <dbl>
1 ANGLE 0.000829 0.804
2 FIXED OBJECT 0.00604 0.758
3 HEAD ON 0.000575 0.756
4 PARKED MOTOR VEHICLE 0.000688 0.926
5 PEDALCYCLIST 0.00127 0.651
6 PEDESTRIAN 0.00476 0.568
7 REAR END 0.000116 0.911
8 SIDESWIPE OPPOSITE DIRECTION 0.000611 0.892
9 SIDESWIPE SAME DIRECTION 0.000172 0.962
10 TURNING 0.000432 0.877
# ℹ abbreviated name: ¹injuries_no_indication_ratio
# Visualize the ratios for fatal crashes
ggplot(accident_injury_crash_type,
       aes(x = reorder(first_crash_type, -injuries_fatal_ratio),
           y = injuries_fatal_ratio, fill = first_crash_type)) +
  geom_bar(stat = "identity", alpha = 0.8) +
  labs(title = "Top 10 Crashes that Result in Fatal Injury",
       x = "Crash Type",
       y = "Ratio of Fatal Injuries to Total Reports") +
  theme_minimal() +
  theme(plot.title = element_text(size = 13, face = "bold"),
        axis.title = element_text(size = 11),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Spectral")
# Visualize the ratios for no injury crashes
ggplot(accident_injury_crash_type,
       aes(x = reorder(first_crash_type, -injuries_no_indication_ratio),
           y = injuries_no_indication_ratio, fill = first_crash_type)) +
  geom_bar(stat = "identity", alpha = 0.8) +
  labs(title = "Top 10 Crashes that Result in No Injury",
       x = "Crash Type",
       y = "Ratio of No-Injury Reports to Total Reports") +
  theme_minimal() +
  theme(plot.title = element_text(size = 14, face = "bold"),
        axis.title = element_text(size = 11),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Spectral")
By looking at the ratios of the types of injuries reported to the total reports, we can get a better idea of which crash types result in which injury. Crashes that result in no injury are more minor ones, like a rear-end, sideswipes, and dinging a parked car. This makes sense as to why these don’t result in any injury. By contrast, crashes that result in fatal injury usually include parties that are not in cars and fixed objects, which also makes sense.
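Ratios and raw counts can tell different stories, so it helps to convert a few ratios back into approximate counts. The sketch below multiplies the fatal ratios by the report totals, both copied from the two tables above. Since the printed ratios are rounded to three significant figures, the resulting counts are approximations only.

```r
# Report totals and fatal ratios copied from the two tables above
crash_summary <- data.frame(
  first_crash_type     = c("FIXED OBJECT", "PEDESTRIAN", "ANGLE", "REAR END"),
  reports_made         = c(6790, 19974, 148323, 112464),
  injuries_fatal_ratio = c(0.00604, 0.00476, 0.000829, 0.000116)
)

# Approximate fatal-injury counts implied by the rounded ratios
crash_summary$approx_fatal <- round(
  crash_summary$reports_made * crash_summary$injuries_fatal_ratio
)

# Sort by the approximate count, largest first
print(crash_summary[order(-crash_summary$approx_fatal), ])
```

Read this way, fixed-object and pedestrian crashes have the highest fatality ratios, while angle crashes, being far more common, still account for a comparable or larger number of fatal injuries in absolute terms.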
Question 3: How does the likelihood of a severe injury differ across different accident conditions?
There are outside factors that can cause an accident, increase the likelihood of one, and impact any resulting injuries. While they can’t be completely avoided, it’s important to note the more dangerous conditions in order to be more aware as a driver. The main conditions I want to examine are weather and lighting, as I think they have the most impact on how a driver drives.
# Create a dataframe with the severe injury to report ratio, grouped by weather conditions
# Using the ratio to get more accurate results across the weather conditions
severe_injuries_weather <- accidents_reported %>%
  group_by(weather_condition) %>%
  summarise(injuries_severe = (sum(injuries_fatal) + sum(injuries_incapacitating)) /
              sum(total_injuries_reported),
            injuries_severe_sum = sum(injuries_fatal) + sum(injuries_incapacitating)) %>%
  filter(injuries_severe > 0.01) # Keep conditions where severe injuries exceed 1% of reports
severe_injuries_weather
# Visualize both conditions to compare
ggplot(severe_injuries_weather,
       aes(x = reorder(weather_condition, -injuries_severe),
           y = injuries_severe, fill = weather_condition)) +
  geom_bar(stat = "identity", alpha = 0.8) +
  labs(title = "Majority of Severe Accident-Related Injuries Occur During Inclement Weather",
       x = "Weather Conditions",
       y = "Ratio of Severe Injuries to Total Reports") +
  theme_minimal() +
  theme(plot.title = element_text(size = 13, face = "bold"),
        axis.title = element_text(size = 12),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Spectral")
ggplot(severe_injuries_lighting,
       aes(x = reorder(lighting_condition, -injuries_severe),
           y = injuries_severe, fill = lighting_condition)) +
  geom_bar(stat = "identity", alpha = 0.8) +
  labs(title = "Majority of Severe Accident-Related Injuries Occur In Poor Lighting",
       x = "Lighting Conditions",
       y = "Ratio of Severe Injuries to Total Reports") +
  theme_minimal() +
  theme(plot.title = element_text(size = 13, face = "bold"),
        axis.title = element_text(size = 12),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 55, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Spectral")
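The lighting plot that follows uses a data frame named severe_injuries_lighting, whose construction is not shown in this report. Presumably it mirrors the weather summary; the sketch below illustrates that idea with a hypothetical helper, `severe_ratio_by()`, run on made-up toy rows. The helper name, the toy values, and the reuse of the 0.01 threshold are all assumptions.

```r
# Hypothetical helper: severe-injury share of all reports within each group
severe_ratio_by <- function(data, group_col, min_ratio = 0.01) {
  severe     <- data$injuries_fatal + data$injuries_incapacitating
  severe_sum <- tapply(severe, data[[group_col]], sum)
  report_sum <- tapply(data$total_injuries_reported, data[[group_col]], sum)
  out <- data.frame(
    group               = names(severe_sum),
    injuries_severe     = as.numeric(severe_sum / report_sum),
    injuries_severe_sum = as.numeric(severe_sum),
    stringsAsFactors    = FALSE
  )
  names(out)[1] <- group_col
  # Keep only groups whose severe share exceeds the threshold
  out[out$injuries_severe > min_ratio, , drop = FALSE]
}

# Made-up toy rows; column names follow the dataset
toy_lighting <- data.frame(
  lighting_condition      = c("DAYLIGHT", "DAYLIGHT", "DARKNESS",
                              "DARKNESS, LIGHTED ROAD"),
  injuries_fatal          = c(0, 0, 1, 0),
  injuries_incapacitating = c(0, 0, 0, 1),
  total_injuries_reported = c(3, 2, 4, 5),
  stringsAsFactors        = FALSE
)

severe_lighting_toy <- severe_ratio_by(toy_lighting, "lighting_condition")
print(severe_lighting_toy)
```

Under these assumptions, `severe_ratio_by(accidents_reported, "lighting_condition")` would produce a table with the columns the lighting plot expects.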
As expected, the more dangerous driving conditions are wind, sleet, hail, and relative darkness or poor lighting. What’s interesting is that more severe accidents occur on lighted roads than in complete darkness. This could indicate some carelessness when driving on lit roads or that some lighted roads are poorly lit.
Question 4: How does the likelihood of a traffic-related injury change as the time of day changes?
Certain times of the day are busier, resulting in more traffic and, likely, more accidents. Examining crash times helps show how traffic accidents relate to the time of day and the time of year.
# Get the total number of injuries by hour of the day
accidents_hours <- accidents_reported %>%
  group_by(crash_hour) %>%
  summarise(injuries_total = sum(injuries_total))
accidents_hours
ggplot(accidents_hours, aes(x = crash_hour, y = injuries_total)) +
  geom_line(color = "#880000", linewidth = 1.25) +
  labs(title = "More Accident-Related Injuries Occur During Rush Hour",
       x = "Time of Day (24 hr)",
       y = "Injuries Recorded") +
  theme_minimal() +
  theme(plot.title = element_text(size = 13, face = "bold"),
        axis.title = element_text(size = 12),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 55, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Spectral")
# Get the total number of injuries by month
accidents_months <- accidents_reported %>%
  group_by(crash_month) %>%
  summarise(injuries_total = sum(injuries_total))
accidents_months
As expected, the number of accident-related injuries peaks around 4-6 pm, which is rush hour. This makes sense, as it’s the busiest part of the day. There is also a clear peak around 8 am, another busy time when many people go to work. What is surprising is that injuries are higher during the summer and fall, with a sharp increase from April to May and a peak in October. There could be many reasons for this, like weather, holidays, or whether school is in session.
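Total injuries per hour mix two things: how many crashes happen and how harmful each crash is. One way to separate them is to divide the injuries in each hour by the number of crashes in that hour. The sketch below does this on hypothetical toy rows; the values are made up for illustration.

```r
# Hypothetical toy rows; column names follow the dataset
toy_hours <- data.frame(
  crash_hour     = c(8, 8, 8, 17, 17, 17, 17, 2),
  injuries_total = c(0, 1, 0, 1, 0, 0, 1, 2)
)

# Total injuries and number of crashes in each hour
injuries_by_hour <- tapply(toy_hours$injuries_total, toy_hours$crash_hour, sum)
crashes_by_hour  <- tapply(toy_hours$injuries_total, toy_hours$crash_hour, length)

# Average injuries per crash: high values flag hours where crashes are
# individually more harmful, regardless of how busy the roads are
injuries_per_crash <- injuries_by_hour / crashes_by_hour

print(round(injuries_per_crash, 2))
```

In this toy example the busiest hour has the most crashes, but the quiet early-morning hour has the most injuries per crash, which is the kind of contrast the totals alone would hide.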
Hypothesis Generation
Hypothesis: More traffic-related injuries occur when more people are present on or near the road, with the most severe occurring during inclement weather, in poor lighting, and in crashes involving non-vehicle entities.
Based on the results of this exploration, while a majority of reported traffic accidents result in no injuries, those that do occur more often during the summer and fall, peaking during the time the roads are the busiest. Of the traffic accidents that result in injuries, the most fatal ones occur during inclement weather, in poor lighting, and involve pedestrians or fixed objects more than other vehicles. This is meaningful for stakeholders because being able to accurately predict when an accident may happen can help people avoid them. This will help reduce the physical, emotional, and monetary tolls that traffic accidents can cause. Data such as this can also help insurance companies adjust their rates and quotes accordingly.
Useful additional data for testing this hypothesis would be location-specific information and vehicle type information (car, truck, sedan, etc.). Vehicle type information could help narrow the analysis to determine which vehicles are more dangerous. Location-specific information would help with finding clear patterns by region and with drawing better conclusions about how to address this issue. If this hypothesis is true, we could take clear and meaningful steps to help prevent injuries and deaths caused by traffic accidents.
Stakeholder Communication
The purpose of this analysis was to investigate the factors that may correlate with traffic accidents that result in injuries. Some key findings are that of the traffic accidents reported, a majority result in no injury at all. However, about 10% result in some injury, which translates to thousands of potentially life-altering injuries and hundreds of deaths. It was found that drivers should be extra cautious around pedestrians, cyclists, and fixed objects, as these types of crashes result in the most fatalities. Drivers should also exercise caution during inclement weather and poor lighting conditions. The analysis also indicates that the more dangerous times to be on the road are also the busiest times, when people are rushing to and from work, and later in the year when people travel much more for the holidays. These results show that driver stakeholders should take extra care when driving in any of these conditions. Insurance stakeholders should take these conditions into account when setting rates and paying out any claims.
This analysis resulted in the following hypothesis: More traffic-related injuries occur when more people are present on or near the road, with the most severe occurring during inclement weather, in poor lighting, and in crashes involving non-vehicle entities. Next steps would be to investigate further by region to examine how these initial trends hold in different areas. Steps should also be taken to spread more awareness about driver safety, with a focus on putting the message out to the public during inclement weather and rush hour. Education about pedestrian safety should also be prioritized. Promoting safety would not only reduce injuries and deaths; it could also save drivers substantial repair costs each year.
# Preparation code for stakeholder visualization
# Get the total injuries for each month, grouped by month and the damage in $
stakeholder_viz <- accidents_reported %>%
  group_by(crash_month, damage) %>%
  summarise(injuries_total = sum(injuries_total))
`summarise()` has grouped output by 'crash_month'. You can override using the
`.groups` argument.
stakeholder_viz
# A tibble: 36 × 3
# Groups: crash_month [12]
crash_month damage injuries_total
<int> <chr> <dbl>
1 1 $500 OR LESS 630
2 1 $501 - $1,500 498
3 1 OVER $1,500 4591
4 2 $500 OR LESS 529
5 2 $501 - $1,500 369
6 2 OVER $1,500 3741
7 3 $500 OR LESS 505
8 3 $501 - $1,500 416
9 3 OVER $1,500 4604
10 4 $500 OR LESS 453
# ℹ 26 more rows
ggplot(stakeholder_viz, aes(x = crash_month, y = injuries_total, fill = damage)) +
  geom_area(stat = "identity") +
  labs(title = "Thousands of Traffic Injuries Mean Thousands of Dollars Gone Each Month",
       x = "Month",
       y = "Injuries Recorded") +
  theme_minimal() +
  theme(plot.title = element_text(size = 13, face = "bold"),
        axis.title = element_text(size = 12),
        axis.text.y = element_text(size = 10),
        axis.text.x = element_text(size = 10, angle = 55, hjust = 1),
        panel.grid.minor = element_blank()) +
  scale_y_continuous(labels = scales::comma) +
  scale_x_continuous(breaks = pretty_breaks()) +
  scale_fill_brewer(palette = "Spectral")