As one of the most prominent electric vehicle makers, Tesla draws a lot of attention to its safety record. This scrutiny leads to many questions about deaths involving Teslas. Are Tesla vehicles dangerous? Are drivers in certain locations more reckless? Is the AutoPilot feature advanced enough to be truly safe? Visualizing this data can help answer these questions and inspire further discovery.
The data for this report was found on Kaggle. It contains information about 254 cases of Tesla vehicle accidents resulting in fatalities. Important fields in this data set include the date, the location, and the type of person(s) who died. This is the link: https://www.kaggle.com/datasets/thedevastator/tesla-accident-fatalities-analysis-and-statistic?resource=download
The analysis of this fatal Tesla accident data set reveals trends about fatalities, including when and where they occur and who they affect. Much of what the data shows is not significant and can be easily explained. Based on this analysis, it cannot be definitively said whether Tesla vehicles are dangerous.
The hidden code loads the necessary libraries and reshapes the Tesla dataset into something usable for visualizations.
#setup wd and file
setwd("/Users/graceaberle/Library/CloudStorage/OneDrive-LoyolaUniversityMaryland/DS736")
teslafile <- "/Users/graceaberle/Library/CloudStorage/OneDrive-LoyolaUniversityMaryland/DS736/R_datafiles/Tesla Deaths - Deaths (3).csv"
#libraries
library(data.table) #provides fread; the file is read with read.csv below
library(ggplot2)
library(dplyr)
library(ggthemes)
library(ggrepel)
library(RColorBrewer)
library(scales)
library(lubridate)
library(plotly)
#read file as df
df0 <- read.csv(teslafile)
#adjusting df, fixing NAs, etc
df0 <- df0[-21]
df0[,c(7:15)] <- lapply(df0[,c(7:15)], as.integer)
df0$Year <- as.factor(df0$Year)
df0[is.na(df0)] <- 0
#adjusting State column
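#trim whitespace first so blank-but-padded values are also caught by the empty check
df0$State <- trimws(df0$State, which = "both")
df0$State[df0$State == ""] <- "non-USA"
#"HA" is not a valid postal code; recode to Hawaii's "HI"
df0$State[df0$State == "HA"] <- "HI"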
df0$Date <- mdy(df0$Date)
This chart shows a count of deaths due to Tesla vehicles spanning from 2013 to 2022. The count of deaths has increased from 2013 to 2022. The largest jump occurred from 2018 to 2019.
#fig changes size of viz
deathbyyeardf <- df0 %>%
group_by(Year) %>%
summarise(nDeath=length(Year), .groups='keep') %>%
data.frame
viz2 <- ggplot(deathbyyeardf, aes(x=Year, y=nDeath, group=1, label = nDeath)) +
geom_line(color='black', linewidth=.5) +
geom_point(shape = 25, size=5, color='red', fill='red') +
geom_text_repel(point.padding = 5) +
labs(x='Year', y='# of Deaths', title='Count of Tesla Deaths from 2013 to 2022') +
theme_light() +
scale_y_continuous(breaks = waiver(), n.breaks = 12)
viz2
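The claim about the largest jump can be checked directly from the yearly counts rather than read off the chart. The following is a minimal sketch, assuming a data frame shaped like deathbyyeardf (columns Year and nDeath). The helper name largest_jump is hypothetical, and the counts in toy_years are made-up placeholders, not values from the data set.

```r
# Hypothetical helper: find the largest year-over-year increase in a yearly count table.
# Assumes columns `Year` (factor or numeric) and `nDeath`, as in deathbyyeardf.
largest_jump <- function(counts) {
  # Year is stored as a factor in the report, so convert via character first
  yrs <- as.integer(as.character(counts$Year))
  counts <- counts[order(yrs), ]
  yrs <- sort(yrs)
  change <- diff(counts$nDeath)   # count in year i+1 minus count in year i
  i <- which.max(change)
  data.frame(from = yrs[i], to = yrs[i + 1], increase = change[i])
}

# Placeholder counts for illustration only (NOT the real data)
toy_years <- data.frame(Year = factor(c(2016, 2017, 2018, 2019)),
                        nDeath = c(5, 8, 9, 20))
largest_jump(toy_years)
```

With the report's data, the call would be largest_jump(deathbyyeardf).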
This bar chart shows the count of deaths based on who the person was in the incident. The majority of victims are those in other vehicles. Tesla drivers are a close second, with Tesla occupants and cyclists/pedestrians almost tied for third and fourth.
sums <- c(sum(df0$Tesla.driver), sum(df0$Tesla.occupant), sum(df0$Other.vehicle), sum(df0$Cyclists..Peds))
df2 <- data.frame(x=c('Tesla Driver', 'Tesla Occupant', 'Other Vehicle', 'Cyclist/Pedestrian'), y=sums)
viz4 <- ggplot(df2, aes(reorder(x, -y, sum), y, fill=x)) +
geom_bar(stat='identity', color='black') +
theme_few() +
labs(title="Count of Each Type of Death", x='Death Category', y='Death Count', fill='') +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_brewer(palette="Accent") +
geom_text(aes(label=y), vjust=2)
viz4
This chart shows how many of each type of death occurred on each day of the week based on who was killed. Comparing weekends to weekdays does not indicate any obvious differences. The highest counts of incidents took place on Saturdays and Mondays, with 43 and 42 respectively. The lowest counts occurred on Tuesdays and Thursdays, with 28 and 27. The proportions of types of deaths are also fairly evenly distributed throughout the week.
allDcount0 <- df0 %>%
select(Tesla.driver, Tesla.occupant, Other.vehicle, Cyclists..Peds, Deaths, Date) %>%
#label each incident by who died; conditions are checked in order
mutate(dtype = case_when(
Tesla.driver != 0 & Tesla.occupant == 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Driver Only",
Tesla.driver == 0 & Tesla.occupant != 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Occupants Only",
Tesla.driver == 0 & Tesla.occupant == 0 & Other.vehicle != 0 & Cyclists..Peds == 0 ~ "Other Vehicle Only",
Tesla.driver == 0 & Tesla.occupant == 0 & Other.vehicle == 0 & Cyclists..Peds != 0 ~ "Cyclist/Peds Only",
Tesla.driver == 0 & Tesla.occupant == 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Other Type",
#driver and occupant(s) both died, and no one outside the Tesla did
Tesla.driver != 0 & Tesla.occupant != 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Inside the Tesla Only",
#anything left mixes victims inside and outside the Tesla, or several outside groups
TRUE ~ "Multiple Types")) %>%
mutate(day = weekdays(ymd(Date), abbreviate = TRUE)) %>%
group_by(Deaths, dtype, day) %>%
summarise(countdtype = length(dtype), .groups='keep') %>%
data.frame()
#total incidents per day, used for the labels above each bar
totalday0 <- allDcount0 %>%
group_by(day) %>%
summarise(totalday = sum(countdtype))
#order days Monday to Sunday instead of alphabetically
dayorder <- factor(allDcount0$day, levels=c('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'))
allDcount0$dtype <- factor(allDcount0$dtype, levels=c('Other Type', 'Multiple Types', 'Occupants Only', 'Cyclist/Peds Only', 'Inside the Tesla Only', 'Driver Only', 'Other Vehicle Only'))
viz1 <- ggplot(allDcount0, aes(x=dayorder, y=countdtype, fill=dtype)) +
geom_bar(stat='identity') +
labs(title = "Death Count by Death Type and Day of Week", x = "Day of the Week", y = "Death Count", fill = "Death Type") +
theme_few() +
scale_fill_brewer(palette = "Accent") +
geom_text(data = totalday0, aes(x=day, y=totalday, label=totalday, fill=NULL), vjust=-.45)
viz1
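One way to go beyond eyeballing the weekday totals is a chi-square goodness-of-fit test against equal counts per day. The following is only a sketch, assuming a table shaped like totalday0 (columns day and totalday) and assuming incidents are independent of each other. The helper name test_day_uniformity is hypothetical, and the totals in toy_days are placeholders, not the values from the data set.

```r
# Hypothetical helper: chi-square goodness-of-fit test against equal counts per day.
# `totals` is assumed to look like totalday0: columns `day` and `totalday`.
test_day_uniformity <- function(totals) {
  res <- chisq.test(totals$totalday)   # default null: every day equally likely
  c(statistic = unname(res$statistic), p.value = res$p.value)
}

# Placeholder totals for illustration only (NOT the values from the data set)
toy_days <- data.frame(day = c('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'),
                       totalday = c(30, 30, 30, 30, 30, 30, 30))
test_day_uniformity(toy_days)
```

A large p-value would be consistent with the reading that no day stands out. A small one would suggest the day-to-day differences are larger than chance alone would explain.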
This choropleth map shows a count of how many deaths occurred in each state in the United States. The highest numbers of deaths occurred in California and Florida. California is a clear outlier with at least 70, while most other states fall in the 1-20 range. White states have no reported Tesla fatalities.
statecounts <- df0 %>%
filter(State != "non-USA") %>%
select(State, Case..) %>%
group_by(State) %>%
summarise(ncase = length(Case..), .groups = 'keep') %>%
data.frame()
viz6 <- plot_ly(type="choropleth", locations=statecounts$State, locationmode="USA-states", z=statecounts$ncase, colorscale = 'Portland') %>%
colorbar(dtick=10) %>%
layout(geo=list(scope="usa"), title="Map: Count of Deaths in each State (USA)")
viz6
For states where deaths occurred, this trellis chart shows the types of deaths in each state. Twenty-eight states are represented in this chart. Most have only one type of death because, as shown before, they have few incidents. States like California, Florida, New York, and Pennsylvania have more variety in victims. Other states, like Alabama, Hawaii, and Iowa, have only one type of victim: occupants of other vehicles. Several states (Idaho, Illinois, Indiana, Maine, Michigan, North Carolina, Oregon, Tennessee, and Washington) had cases where no one outside the Tesla vehicle died.
allDcount <- df0 %>%
select(Tesla.driver, Tesla.occupant, Other.vehicle, Cyclists..Peds, Deaths, State) %>%
#label each incident by who died; conditions are checked in order
mutate(dtype = case_when(
Tesla.driver != 0 & Tesla.occupant == 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Driver Only",
Tesla.driver == 0 & Tesla.occupant != 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Occupants Only",
Tesla.driver == 0 & Tesla.occupant == 0 & Other.vehicle != 0 & Cyclists..Peds == 0 ~ "Other Vehicle Only",
Tesla.driver == 0 & Tesla.occupant == 0 & Other.vehicle == 0 & Cyclists..Peds != 0 ~ "Cyclist/Peds Only",
Tesla.driver == 0 & Tesla.occupant == 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Other Type",
#driver and occupant(s) both died, and no one outside the Tesla did
Tesla.driver != 0 & Tesla.occupant != 0 & Other.vehicle == 0 & Cyclists..Peds == 0 ~ "Inside the Tesla Only",
#anything left mixes victims inside and outside the Tesla, or several outside groups
TRUE ~ "Multiple Types")) %>%
group_by(Deaths, dtype, State) %>%
summarise(countdtype = length(dtype), .groups='keep') %>%
data.frame()
viz3 <- ggplot(data = filter(allDcount, State != "non-USA"), aes(x="", y=countdtype, fill=dtype)) +
geom_bar(stat="identity", position="fill") +
coord_polar(theta="y", start=0) +
labs(fill="Death Type", x=NULL, y=NULL, title="Count of Types of Deaths by State") +
theme_few() +
theme(plot.title = element_text(hjust=0.5),
axis.text = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank()) +
facet_wrap(~State, ncol=7, nrow=4) +
scale_fill_brewer(palette = "Accent")
viz3
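The day-of-week and by-state charts both label each incident by who died. A sketch of one way to express those labels as a single reusable function is shown here. The name classify_death_type is hypothetical, and "Inside the Tesla Only" is taken to mean that both the driver and at least one occupant died with no victims outside the vehicle, which is an assumption about the intended meaning.

```r
# Hypothetical helper: label incidents by who died, given the four victim-count columns.
# All arguments are numeric vectors of equal length (counts of deaths per incident).
classify_death_type <- function(driver, occupant, other, peds) {
  d <- driver != 0
  o <- occupant != 0
  v <- other != 0
  p <- peds != 0
  # Start from the catch-all label, then overwrite the specific patterns
  out <- rep("Multiple Types", length(d))
  out[!d & !o & !v & !p] <- "Other Type"
  out[ d & !o & !v & !p] <- "Driver Only"
  out[!d &  o & !v & !p] <- "Occupants Only"
  out[!d & !o &  v & !p] <- "Other Vehicle Only"
  out[!d & !o & !v &  p] <- "Cyclist/Peds Only"
  out[ d &  o & !v & !p] <- "Inside the Tesla Only"  # assumed meaning, see above
  out
}

# Five made-up incidents for illustration only
labels <- classify_death_type(driver   = c(1, 0, 1, 1, 0),
                              occupant = c(0, 0, 1, 0, 0),
                              other    = c(0, 1, 0, 1, 0),
                              peds     = c(0, 0, 0, 0, 0))
labels
```

In the report's pipelines this could be called inside mutate with the columns Tesla.driver, Tesla.occupant, Other.vehicle, and Cyclists..Peds, so the rules live in one place.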
This nested donut chart compares the percentage of cyclist and pedestrian deaths where AutoPilot is claimed vs not claimed. Of incidents where AutoPilot was claimed, 15.4% of victims were cyclists or pedestrians. Of incidents where there was no AutoPilot claim, 14.5% of victims were cyclists or pedestrians.
autopeds <- df0 %>%
filter(AutoPilot.claimed != 0) %>%
select(AutoPilot.claimed, Cyclists..Peds) %>%
mutate(killed = ifelse(Cyclists..Peds != 0, "Cyclist or Pedestrian", "Other")) %>%
group_by(killed) %>%
summarise(nAutoPilot = length(AutoPilot.claimed), .groups = 'keep') %>%
ungroup() %>%
mutate(percenttotal = round(100*nAutoPilot/sum(nAutoPilot), 1)) %>%
data.frame()
otherpeds <- df0 %>%
filter(AutoPilot.claimed == 0) %>%
select(AutoPilot.claimed, Cyclists..Peds) %>%
mutate(killed = ifelse(Cyclists..Peds != 0, "Cyclist or Pedestrian", "Other")) %>%
group_by(killed) %>%
summarise(nAutoPilot = length(AutoPilot.claimed), .groups = 'keep') %>%
ungroup() %>%
mutate(percenttotal = round(100*nAutoPilot/sum(nAutoPilot), 1)) %>%
data.frame()
viz5 <- plot_ly(hole = .7) %>%
add_trace(data=autopeds, labels = ~killed, values= ~nAutoPilot, type="pie",
hovertemplate = "AutoPilot Claimed <br> Victim Type: %{label}<br>Percent: %{percent}<br>Death Count: %{value}<extra></extra>") %>%
add_trace(data = otherpeds, labels = ~killed, values = ~nAutoPilot, type="pie",
hovertemplate = "AutoPilot Not Claimed <br> Victim Type: %{label}<br>Percent: %{percent}<br>Death Count: %{value}<extra></extra>",
domain = list(x = c(0.16, 0.84), y = c(0.16, 0.84))) %>%
layout(title="AutoPilot Claimed vs Not Claimed Cyclist/Pedestrian Deaths")
viz5
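The two percentages compared in the donut chart can also be computed in one pass instead of two near-identical pipelines. This is a minimal sketch in base R, assuming the columns AutoPilot.claimed and Cyclists..Peds as used in the report. The helper name ped_share_by_autopilot is hypothetical and toy_ap holds made-up rows. Like the report's code, it counts incidents (rows) with at least one cyclist or pedestrian death.

```r
# Hypothetical helper: percent of incidents with a cyclist/pedestrian death,
# split by whether AutoPilot was claimed.
ped_share_by_autopilot <- function(df) {
  group <- ifelse(df$AutoPilot.claimed != 0,
                  "AutoPilot claimed", "AutoPilot not claimed")
  is_ped <- df$Cyclists..Peds != 0
  # The mean of a logical vector is the proportion of TRUE values
  round(100 * tapply(is_ped, group, mean), 1)
}

# Made-up rows for illustration only (NOT the real data)
toy_ap <- data.frame(AutoPilot.claimed = c(1, 1, 0, 0, 0, 0),
                     Cyclists..Peds    = c(1, 0, 0, 0, 1, 0))
ped_share_by_autopilot(toy_ap)
```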
The growth of incidents over the years could possibly be explained by the expansion of Tesla vehicles into new markets and increased production. Occupants of other vehicles tend to be victims more often than those inside the Tesla or cyclists/pedestrians. The death count by day of week does not show any obvious difference between weekends and weekdays, suggesting that commuting traffic does not have a large impact. The types of deaths within each day follow a pattern similar to the overall counts, with the highest number of deaths being those in other vehicles. No definitive conclusion can be made about days of the week and counts of deaths.
The map shows California and Florida with the most fatal incidents. This possibly reflects the popularity of these vehicles in those states more than the driving habits of their inhabitants. The trellis chart of death types by state shows which states have more variety in deaths. The states with more incidents tend to have more types of deaths, indicating that more data brings more diversity.
Comparing AutoPilot to non-AutoPilot claims shows that a fairly even percentage of victims were cyclists or pedestrians. This data cannot show that AutoPilot fails to identify non-vehicle hazards, since nearly the same rate appears without AutoPilot. This analysis of fatal Tesla incidents displays many interesting statistics, but more data is needed to draw a full conclusion about the safety of Teslas and the habits of their drivers.