Electric vehicles have exploded in popularity in recent years, sparking a new arms race among automotive companies. Statistica.com forecasts 1.27 million EV sales in 2024. Age-old industry leaders have been forced to adapt to changing markets.
The Electric Vehicle Population Dataset covers the Battery Electric Vehicles (BEVs) and Plug-in Hybrid Electric Vehicles (PHEVs) registered through the Washington State Department of Licensing. The dataset has a significant number of missing values, notably for the MSRP and electric range of vehicles. The dataset is available on Kaggle (Kaggle.com); this version was published on Jan 26, 2024 and has 166,800 rows.
This analysis brings EV growth to life. In just 12 years, EVs in Washington went from 782 in 2011 to over 51,000 in 2023. Four companies had cars in the market in 2011; 15 companies now hold market share. Tesla quickly shot through the ranks and completely dominated 2023: it accounted for 48.6% of the market, while its closest competitor, Hyundai, held 5.14%.
Basic statistics about the Washington Electric Car dataset.
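The number of missing values in Base MSRP shows up in its mean. The median and third quartile are both 0, so unknown prices appear to be recorded as 0, which drags the mean down to $1,153. That is clearly not the average cost of an EV. The Electric Range is skewed for the same reason, and because plug-in hybrids, which have a much shorter electric range, are included.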
Below are the summary statistics of the dataframe
summary(df)
## VIN (1-10) County City State
## Length:166800 Length:166800 Length:166800 Length:166800
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## Postal Code Model Year Make Model
## Min. : 1730 Min. :1997 Length:166800 Length:166800
## 1st Qu.:98052 1st Qu.:2018 Class :character Class :character
## Median :98122 Median :2021 Mode :character Mode :character
## Mean :98174 Mean :2020
## 3rd Qu.:98371 3rd Qu.:2023
## Max. :99577 Max. :2024
## NA's :5
## Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility
## Length:166800 Length:166800
## Class :character Class :character
## Mode :character Mode :character
##
##
##
##
## Electric Range Base MSRP Legislative District DOL Vehicle ID
## Min. : 0.00 Min. : 0 Min. : 1.00 Min. : 4385
## 1st Qu.: 0.00 1st Qu.: 0 1st Qu.:18.00 1st Qu.:179074064
## Median : 0.00 Median : 0 Median :33.00 Median :224404526
## Mean : 61.51 Mean : 1153 Mean :29.18 Mean :217241994
## 3rd Qu.: 84.00 3rd Qu.: 0 3rd Qu.:42.00 3rd Qu.:251342132
## Max. :337.00 Max. :845000 Max. :49.00 Max. :479254772
## NA's :360
## Vehicle Location Electric Utility 2020 Census Tract
## Length:166800 Length:166800 Min. : 1001020100
## Class :character Class :character 1st Qu.:53033009701
## Mode :character Mode :character Median :53033029602
## Mean :52977091766
## 3rd Qu.:53053073001
## Max. :56033000100
## NA's : 5
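The zero quartiles for Electric Range and Base MSRP in the summary suggest that unknown values are stored as 0 rather than NA. The sketch below is a minimal illustration of how recoding zeros to NA changes the mean. The `msrp` vector is made up for the example and is not taken from the dataset.

```r
# Hypothetical MSRP values (NOT from the dataset); 0 stands in for "unknown"
msrp <- c(0, 0, 0, 0, 0, 0, 0, 0, 39995, 69900)

# Mean with the zeros counted as if they were real prices
mean_with_zeros <- mean(msrp)

# Treat 0 as missing, then average only the known prices
msrp_na <- replace(msrp, msrp == 0, NA)
mean_known <- mean(msrp_na, na.rm = TRUE)

# Fraction of records without a usable price
share_missing <- mean(is.na(msrp_na))

print(c(with_zeros = mean_with_zeros,
        known_only = mean_known,
        share_missing = share_missing))
```

On the real data the same recoding could be applied to `Base MSRP` and `Electric Range` before summarising, assuming 0 really does mean "not recorded" in both columns.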
Below is the structure of the dataframe
str(df)
## Classes 'data.table' and 'data.frame': 166800 obs. of 17 variables:
## $ VIN (1-10) : chr "3C3CFFGE4E" "5YJXCBE40H" "3MW39FS03P" "7PDSGABA8P" ...
## $ County : chr "Yakima" "Thurston" "King" "Snohomish" ...
## $ City : chr "Yakima" "Olympia" "Renton" "Bothell" ...
## $ State : chr "WA" "WA" "WA" "WA" ...
## $ Postal Code : int 98902 98513 98058 98012 98031 98370 98367 98370 98366 98019 ...
## $ Model Year : int 2014 2017 2023 2023 2020 2024 2018 2017 2018 2018 ...
## $ Make : chr "FIAT" "TESLA" "BMW" "RIVIAN" ...
## $ Model : chr "500" "MODEL X" "330E" "R1S" ...
## $ Electric Vehicle Type : chr "Battery Electric Vehicle (BEV)" "Battery Electric Vehicle (BEV)" "Plug-in Hybrid Electric Vehicle (PHEV)" "Battery Electric Vehicle (BEV)" ...
## $ Clean Alternative Fuel Vehicle (CAFV) Eligibility: chr "Clean Alternative Fuel Vehicle Eligible" "Clean Alternative Fuel Vehicle Eligible" "Not eligible due to low battery range" "Eligibility unknown as battery range has not been researched" ...
## $ Electric Range : int 87 200 20 0 322 39 33 238 215 114 ...
## $ Base MSRP : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Legislative District : int 14 2 11 21 33 23 26 23 26 45 ...
## $ DOL Vehicle ID : int 1593721 257167501 224071816 260084653 253771913 259427829 477087012 214494213 280785123 129133343 ...
## $ Vehicle Location : chr "POINT (-120.524012 46.5973939)" "POINT (-122.817545 46.98876)" "POINT (-122.1298876 47.4451257)" "POINT (-122.1873 47.820245)" ...
## $ Electric Utility : chr "PACIFICORP" "PUGET SOUND ENERGY INC" "PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)" "PUGET SOUND ENERGY INC" ...
## $ 2020 Census Tract :integer64 53077000700 53067012331 53033025803 53061051927 53033029305 53035940100 53035092902 53035090502 ...
## - attr(*, ".internal.selfref")=<externalptr>
Below is the number of rows in the dataframe
nrow(df)
## [1] 166800
Vehicle Sales by Year Pie Chart
This chart shows the rapid rise of EV sales in Washington. The innermost circle is sales in 2011. Sales in 2011 highlight the initial dominance Nissan had in the EV space, with an 88.2% market share (690 cars). Its closest competitor was Chevy at a 9.72% market share (76 cars). Tesla was not even a threat in 2011, with only 6 cars. This would change quickly over the 5 short years to 2016.
In 2016, Nissan had grown its sales in the state by roughly 60%, to 1,120 cars, but saw its market share dwindle from 88.2% to 20.3%. The competition had already surpassed it: Tesla went from less than 1% of the market to a 28.8% market share (1,587 cars) in 2016. Tesla was no longer Nissan's only competitor, either. Seven other mainstream manufacturers (Ford, Kia, BMW, Volkswagen, Audi, Volvo, and Hyundai) and many smaller companies were now in the game, each holding roughly 5% to 13% of sales. The dynamics would once again change massively from 2016 to 2023.
In 2023, Tesla fortified its lead, growing from a 28.8% market share (1,587 cars) to a 48.6% market share (24,979 cars), a staggering increase of more than 15x in sales over 7 years. Tesla's closest competitor is no longer Nissan but Hyundai, and the lead is massive: Hyundai holds only a 5.14% share of sales (2,639 cars). Nissan, now a thing of the past, holds a 2.5% share (1,286 cars). In 12 years, Tesla went from last place to the leader. The coming years will show whether Tesla can hold its dominance over the electric car scene, as many car companies are desperate to get a cut of the growing market.
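The shares and the growth multiple quoted above can be re-derived from the car counts given in the text. The following is a small arithmetic sketch, not part of the original analysis. The "Other" count is an assumption, computed as the remainder of the 782 cars reported for 2011.

```r
# 2011 counts quoted in the text; "Other" is derived as the remainder of 782
sales_2011 <- c(Nissan = 690, Chevrolet = 76, Tesla = 6)
sales_2011["Other"] <- 782 - sum(sales_2011)

# Market share in percent, same formula as percent_of_total: 100 * n / sum(n)
share_2011 <- round(100 * sales_2011 / sum(sales_2011), 2)
print(share_2011)

# Tesla's growth multiple between 2016 (1,587 cars) and 2023 (24,979 cars)
tesla_growth <- 24979 / 1587
print(round(tesla_growth, 1))
```

The Nissan and Chevrolet shares round to the 88.2% and 9.72% cited for 2011, and the Tesla multiple comes out a little above 15x.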
# Creating the data frame
library(dplyr)   # select(), mutate(), recode(), group_by(), summarise()
library(plotly)  # plot_ly(), add_trace(), layout()

make_df <- df %>%
  select(Make, `Model Year`) %>%
  mutate(year = `Model Year`,
         # Title-case the 14 brands shown individually; every other make becomes "Other"
         myBrand = recode(Make,
                          "TESLA" = "Tesla", "NISSAN" = "Nissan", "CHEVROLET" = "Chevrolet",
                          "FORD" = "Ford", "BMW" = "BMW", "KIA" = "Kia", "TOYOTA" = "Toyota",
                          "VOLKSWAGEN" = "Volkswagen", "JEEP" = "Jeep", "HYUNDAI" = "Hyundai",
                          "VOLVO" = "Volvo", "RIVIAN" = "Rivian", "AUDI" = "Audi",
                          "CHRYSLER" = "Chrysler", .default = "Other")) %>%
  # Count cars per model year and brand
  group_by(year, myBrand) %>%
  dplyr::summarise(n = n(), .groups = 'keep') %>%
  # Share of each brand within its model year
  group_by(year) %>%
  mutate(percent_of_total = round(100 * n / sum(n), 1)) %>%
  ungroup() %>%
  data.frame()
# Creating the visualization: three nested donuts (outer ring 2023, middle 2016, inner 2011)
plot_ly(hole = 0.7) %>%
  layout(title = "Electric Car Sales (2011, 2016, 2023)") %>%
  add_trace(data = make_df[make_df$year == 2023, ],
            labels = ~myBrand,
            values = ~n,
            type = "pie",
            textposition = "inside",
            hovertemplate = "Year: 2023<br>Brand: %{label}<br>Percent: %{percent}<br>Electric Cars: %{value}<extra></extra>") %>%
  add_trace(data = make_df[make_df$year == 2016, ],
            labels = ~myBrand,
            values = ~n,
            type = "pie",
            textposition = "inside",
            hovertemplate = "Year: 2016<br>Brand: %{label}<br>Percent: %{percent}<br>Electric Cars: %{value}<extra></extra>",
            domain = list(
              x = c(0.16, 0.84),
              y = c(0.16, 0.84))) %>%
  add_trace(data = make_df[make_df$year == 2011, ],
            labels = ~myBrand,
            values = ~n,
            type = "pie",
            textposition = "inside",
            hovertemplate = "Year: 2011<br>Brand: %{label}<br>Percent: %{percent}<br>Electric Cars: %{value}<extra></extra>",
            domain = list(
              x = c(0.27, 0.73),
              y = c(0.27, 0.73)))
This chart shows the top-selling EV models in Washington and how individual models have grown within the state.
In 2020, the Tesla Model Y had sales of 2,335 cars (10th place). This shot up quickly to 6,570 cars in 2021 (4th place). Not stopping its growth there, the 2022 version of the Model Y sold 7,351 units (2nd place). As if growth from ~2,000 to ~7,000 over 2 years wasn't impressive enough, the 2023 Model Y more than doubled sales again, to 16,566 units (1st place).
The closest competitors to Tesla in 2023 were Volkswagen, Hyundai, and Chevrolet. Volkswagen sold 1,956 ID.4 models (11th place), Hyundai sold 1,777 IONIQ 5 models (14th place), and Chevrolet sold 1,570 Bolt models (15th place).
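The Model Y growth described above can be checked with a short year-over-year calculation. This is only an arithmetic sketch using the four counts quoted in the text; `model_y` and `yoy` are names chosen for the example.

```r
# Tesla Model Y counts by model year, as quoted in the text
model_y <- c(`2020` = 2335, `2021` = 6570, `2022` = 7351, `2023` = 16566)

# Year-over-year multiple: each year's count divided by the previous year's
yoy <- model_y[-1] / model_y[-length(model_y)]
print(round(yoy, 2))
```

A multiple above 2 for 2023 is what backs the "more than doubled" claim, while 2022 grew only modestly over 2021.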
# Creating a dataframe with the count of car types
big_tot <- df %>%
select(Make, Model, `Model Year`) %>%
group_by(Make, Model, `Model Year`) %>%
dplyr::summarise(n = n(), .groups = 'keep') %>%
data.frame()
# Creating a new variable for the full car name to be on the plot
big_tot$MakeModel <- paste(big_tot$Make, big_tot$Model, big_tot$Model.Year)
# Ordering the dataframe
model_df <- big_tot[order(big_tot$n, decreasing = TRUE),]
model_df <- model_df[1:15,]
model_df$Model.Year <- as.factor(model_df$Model.Year)
# Creating a ceiling for the graph. (Prevents parts being cut off)
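# round_any() comes from plyr; calling it as plyr::round_any() avoids attaching plyr, which would mask dplyr verbs
max_y <- plyr::round_any(max(model_df$n), 18000, ceiling)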
# Creating the graph
ggplot(model_df, aes(x = reorder(MakeModel, n, sum), y = n, fill = Model.Year)) +
geom_bar(stat = "identity") +
coord_flip() +
theme_fivethirtyeight() +
labs(title = "Top Electric Cars in Washington", x="", y = "Quantity", fill = "Year") +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_brewer(palette = "Dark2") +
geom_line(inherit.aes = FALSE, data=model_df,
aes(x = MakeModel, y = n, group=1), colour = "#FF0000", linewidth=1) +
geom_point(inherit.aes = FALSE, data=model_df,
aes(x = MakeModel, y = n, group = 1),
size = 3, shape = 21, fill = "#4D07D0", color = "black") +
geom_text(data = model_df, aes(x = MakeModel, y = n, label = n, fill = NULL), hjust = -0.15, size=4) +
theme(legend.background = element_rect(fill = "transparent"),
legend.box.background = element_rect(fill = "transparent", colour=NA),
legend.spacing = unit(-1, "lines")) +
scale_y_continuous(labels = comma, limits=c(0, max_y))
Top Battery Distances (Dataset had many NULL values for 2021-2024)
This chart shows the top 10 driving ranges among BEVs. Range data was only available for 2020 and earlier model years; the dataset has no battery ranges for 2021 and later.
The top car was the 2020 Tesla Model S with a range of 333.5 miles (averaged over the different ranges listed for that make, model, and year), almost 35 miles above second place. The 2020 Tesla Model 3 took second place at 298.67 miles. Third place went to the 2020 Tesla Model X at 291 miles.
Further down the list, Hyundai and Chevrolet were the only companies other than Tesla to appear. Chevrolet held 7th place with 259 miles. Hyundai held 8th and 9th place at 258 miles each.
# Data Battery Plot
# Failed dataframe
#bat_df <- df %>%
# select(Make, Model, `Model Year`, `Electric Range`)%>%
# distinct(Make, Model, `Model Year`, `Electric Range`) %>%
# group_by(Make, Model, `Model Year`) %>%
# data.frame()
#bat_df <- bat_df[order(-bat_df$Electric.Range),]
# The following dataframe was created with the assistance of ChatGPT to assist in the manual labor.
# The top cars have different battery ranges, but the same make model and year.
# I could not figure out how to combine and average similar records.
# I pasted data from the top 20 records and asked ChatGPT to hard code the dataframe.
# Combine into a new dataframe
bat_df <- data.frame(
Model = c("TESLA MODEL S 2020", "TESLA MODEL 3 2020", "TESLA MODEL X 2020", "TESLA MODEL X 2019",
"TESLA MODEL S 2019", "TESLA MODEL S 2012", "CHEVROLET BOLT EV 2020", "HYUNDAI KONA 2019",
"HYUNDAI KONA 2020", "TESLA MODEL S 2018", "TESLA ROADSTER 2010", "TESLA ROADSTER 2011",
"KIA NIRO 2019", "KIA NIRO 2020", "CHEVROLET BOLT EV 2017", "CHEVROLET BOLT EV 2019", "TESLA MODEL X 2018"
, "CHEVROLET BOLT EV 2018", "JAGUAR I-PACE 2019", "JAGUAR I-PACE 2020"),
Range = c(round(mean(c(337, 330)), 2), round(mean(c(322, 308, 266)), 2), round(mean(c(293, 289)), 2), 289, 270,
265, 259, 258, 258, 249, 245, 245, 239, 239, 238, 238, 238, 238, 234, 234)
)
bat_df <- head(bat_df, 10)
# This line of code was generated with ChatGPT. I was having errors with my
# color being out of order. ChatGPT suggested to implement this line of code for correction
bat_df$order <- reorder(bat_df$Model, bat_df$Range, sum)
# Creating manual colors from https://r-charts.com/color-palette-generator/
bar_colors <- c("#65ff1d", "#61f236", "#5ee54f", "#5ad968", "#57cc81", "#53bf9b", "#50b2b4", "#4ca6cd", "#4999e6", "#458cff")
ggplot(bat_df, aes(x = order, y = Range, fill = order)) +
geom_bar(stat = 'identity') +
scale_fill_manual(values = bar_colors) +
theme(plot.title = element_text(hjust = 0.5)) +
labs(title = "Top Driving Distances", x = "Car", y = "Distance") +
theme_light() +
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(size = 8)) +
geom_text(aes(label = paste(Range, "Miles")), vjust = -1)
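The step the comments above describe as the sticking point, combining and averaging records that share a make, model, and year, can be done programmatically. The sketch below is one possible approach and not the method used for the chart. It uses base R's `unique()` and `aggregate()` so it runs without extra packages, and the small `ranges` data frame is an assumption: its values are copied from the hard-coded vectors, and its column names mimic what `data.frame()` would produce.

```r
# Illustrative subset: distinct ranges seen for four 2020 models (values from the hard-coded vectors)
ranges <- data.frame(
  Make = c("TESLA", "TESLA", "TESLA", "TESLA", "TESLA", "TESLA", "TESLA", "CHEVROLET"),
  Model = c("MODEL S", "MODEL S", "MODEL 3", "MODEL 3", "MODEL 3", "MODEL X", "MODEL X", "BOLT EV"),
  Model.Year = 2020,
  Electric.Range = c(337, 330, 322, 308, 266, 293, 289, 259)
)
# Simulate a repeated registration of the same car configuration
ranges <- rbind(ranges, ranges[1, ])

# 1. Keep each distinct (make, model, year, range) combination once
uniq <- unique(ranges)

# 2. Average the distinct ranges within each make/model/year
avg <- aggregate(Electric.Range ~ Make + Model + Model.Year, data = uniq, FUN = mean)
avg$Electric.Range <- round(avg$Electric.Range, 2)

# 3. Sort from longest to shortest range and build the plot label
avg <- avg[order(-avg$Electric.Range), ]
avg$MakeModel <- paste(avg$Make, avg$Model, avg$Model.Year)

print(head(avg, 10))
```

On the full dataset, rows with a range of 0 would presumably need to be dropped first, since they appear to mark unknown values and would otherwise pull the averages down.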
Distribution of Electric Vehicle Types Over Years
This heatmap showcases the adoption of Battery Electric Vehicles vs. Plug-in Hybrid Electric Vehicles.
Starting in 2012, there were more PHEVs than BEVs. This trend would reverse in 2015, with a flip to 3,560 BEVs and only 1,273 PHEVs.
Battery electric vehicles would continue to run away with the lead. Throughout 2019-2022 there would be ~4.5 BEVs for every PHEV. This trend would accelerate further with the boom of Tesla sales in 2023. In 2023 there were 6,689 PHEVs, which is significant growth from 870 in 2012. However, in 2023 there were 44,462 BEVs, roughly 6.6x the number of PHEVs. Starting at 760, battery electric vehicles saw staggering growth over just 12 years.
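The BEV-to-PHEV ratios can be reproduced from the counts quoted above. This is a minimal arithmetic sketch rather than a recomputation from the dataset, and it assumes the 760 BEV figure refers to 2012.

```r
# Counts quoted in the text (assumption: the 760 BEVs belong to model year 2012)
counts <- data.frame(
  year = c(2012, 2015, 2023),
  BEV  = c(760, 3560, 44462),
  PHEV = c(870, 1273, 6689)
)

# Number of BEVs for every PHEV in each year
counts$bev_per_phev <- round(counts$BEV / counts$PHEV, 2)
print(counts)
```

A ratio below 1 means PHEVs outnumber BEVs, which is the case only for 2012 here.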
# Counting vehicles by model year and type (BEV vs. PHEV) for model years 2012-2023
notElec <- df %>%
filter(`Model Year`>2011, `Model Year`<2024) %>%
group_by(`Model Year`, `Electric Vehicle Type`) %>%
dplyr::summarize(n = n(), .groups = 'keep')
# Creating the heatmap
breaks <- c(seq(0, max(notElec$n), by=5000))
ggplot(notElec, aes(x=`Model Year`, y = `Electric Vehicle Type`, fill = n)) +
geom_tile(color="black") +
geom_text(aes(label=comma(n))) +
labs(title = "Heatmap: Vehicle Type by Year",
y = "Type of Electric",
x = "Model Year",
fill = "Quantity of Cars") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5)) +
scale_fill_continuous(low="white", high= "#18FF3B", breaks = breaks) +
guides(fill = guide_legend(reverse=TRUE, override.aes = list(colour="black")))
Average Cost of Cars by Year (Data is an approximation due to the large number of NULL MSRP values)
This chart shows the average cost of electric cars per model year. In 2008 there were barely any electric car sales, yet they carried a staggering $94,000 average price tag. Early electric cars were expensive because there was barely a market for them: they were not mainstream, so mainstream brands ignored them. The highest point in the graph is 2011, at $109,000.
One year later, in 2012, when Nissan took market dominance, the average price dropped to $64,000. It would stay level for 2 years until new competitors entered the market.
The lowest point in the graph, $32,280 in 2016, coincides with the period when 10 companies were all desperate for market dominance and began undercutting each other's prices. Additionally, in 2016 hybrid electric vehicles still had ~31% of market share.
The graph finishes in 2020 at ~$81,000. This coincides with Tesla taking market dominance with a 60% market share.
# Creating the cost by year dataframe
cost <- df %>%
mutate(year = `Model Year`) %>%
filter(`Base MSRP`>0) %>%
group_by(year) %>%
dplyr::summarise(AverageCost = mean(`Base MSRP`, na.rm = TRUE),
CarCount = n()
) %>%
data.frame()
# Reordering the cost
cost <- cost[order(-cost$year),]
# Getting rid of unique outliers with low sales
aveCost <- cost %>%
filter(CarCount>5)
# Creating logic for the low and high point in the graph
hi_lo <- aveCost %>%
filter(AverageCost == min(AverageCost) | AverageCost == max(AverageCost)) %>%
data.frame()
ggplot(aveCost, aes(x=year, y=AverageCost)) +
geom_line(color='black', linewidth=1) +
geom_point(shape=21, size=4, color='red', fill='white') +
labs(x="Year", y="Average Electric Car Cost", title="Electric Car Cost by Year", caption="Source: Washington Electric Cars (Kaggle)") +
scale_y_continuous(labels=comma) +
theme_light() +
theme(plot.title = element_text(hjust = 0.5)) +
geom_point(data = hi_lo, aes(x = year, y=AverageCost),shape=21, size=4, fill='red', color='white') +
geom_label_repel(aes(label= ifelse(AverageCost==max(AverageCost) | AverageCost == min(AverageCost),scales::comma(AverageCost), "")),
box.padding = 1,
point.padding = 1,
size=4,
color="Grey50",
segment.color = 'darkblue')
Over the past decade, the electric vehicle market in Washington has seen dramatic shifts. Starting with just 4 companies and fewer than a thousand sales, it exploded over the decade. Tesla beat out the competition and took market dominance: the Tesla Model Y alone sold 16,566 units in 2023, while the closest competing model, the Volkswagen ID.4, sold 1,956. Driving range has increased through the years, with the 2020 Tesla Model S reaching the longest range at 333.5 miles. Hybrids initially outnumbered battery electric vehicles, but BEVs eventually won out and now vastly outnumber them. The average cost of an electric car was above $100,000 in 2011, dropped to $32,280 in 2016 as competition intensified, and rose back to ~$81,000 in 2020. These graphs highlight the dynamic and rapidly evolving landscape of the EV market.