Electric Car Sales Analytics - Rakha Hafish Setiawan
Introduction
In this project, the data set “Electric Car Sales by Model in the US” was used in order to determine the insights and the actionable insights for various users such as investors and engineers to better understand the growth and trend of the Electric Cars from 2012 to 2019. Additionally this project was also initiated in order to practice the author’s skills in pipeline development and analysis with the R Programming Language and the tidyverse library.
Original public data set courtesy of mathurinache on Kaggle.
Library Activation
Here the necessary libraries were activated and the designated data set was read. The raw data set itself contains 99 columns, 3 consists of the make and model data as well as the logo for each of the makes, and the rest contains the actual sales data that needed reformatting from wide to long for the ease of aggregation.
library(tidyverse)library(sqldf)library(readxl)MyCars =read_excel("C:/Users/Rakha Hafish S/Downloads/Electric Car Sales by Model in USA.xlsx")MyCars %>%head(5)
# A tibble: 5 × 99
Make Model Logo `janv-12` `Feb 2012` `mars-12` `Apr 2012` `May 2012`
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Chevrolet Volt http… 603 1626 3915 5377 7057
2 Toyota Prius PHV http… 0 21 912 2566 3652
3 Nissan Leaf http… 676 1154 1733 2103 2613
4 Tesla Model S http… 0 NA NA NA NA
5 Ford C-Max En… http… 0 NA NA NA NA
# ℹ 91 more variables: `juin-12` <dbl>, `juil-12` <dbl>, `Aug 2012` <dbl>,
# `sept-12` <dbl>, `oct-12` <dbl>, `nov-12` <dbl>, `Dec 2012` <dbl>,
# `janv-13` <dbl>, `Feb 2013` <dbl>, `mars-13` <dbl>, `Apr 2013` <dbl>,
# `May 2013` <dbl>, `juin-13` <dbl>, `juil-13` <dbl>, `Aug 2013` <dbl>,
# `sept-13` <dbl>, `oct-13` <dbl>, `nov-13` <dbl>, `Dec 2013` <dbl>,
# `janv-14` <dbl>, `Feb 2014` <dbl>, `mars-14` <dbl>, `Apr 2014` <dbl>,
# `May 2014` <dbl>, `juin-14` <dbl>, `juil-14` <dbl>, `Aug 2014` <dbl>, …
Data Transformation
Reformatting and Replacement
In this code chunk, the data set was reformatted using tidyr and dplyr. The logo column will not be necessary so it was deleted out of the set. The wide format was changed to long format, and all the NA values were replaced with 0.
# A tibble: 5 × 4
Make Model Date Sales
<chr> <chr> <chr> <chr>
1 Chevrolet Volt janv-12 603
2 Toyota Prius PHV janv-12 0
3 Nissan Leaf janv-12 676
4 Tesla Model S janv-12 0
5 Ford C-Max Energi janv-12 0
Isolation and Cleaning
In this chunk, the Date column was isolated using dplyr and stringr in order to better clean it. This particular column was heavily cleaned in order to enable the yearly aggregate for the Electric Car Sales.
MyDates = MyCars %>%select(everything()) %>%gather(Dates, Sales, 4:99) %>%select(Dates) %>%apply(2, FUN = str_remove_all, " ") %>%data.frame() %>%apply(2, FUN = str_remove_all, "-") %>%data.frame() %>%apply(2, FUN = str_to_title) %>%data.frame()MyDates %>%head(3) %>%tibble()
# A tibble: 5 × 4
Make Model Year Sales
<chr> <chr> <dbl> <chr>
1 Chevrolet Volt 2012 603
2 Toyota Prius PHV 2012 0
3 Nissan Leaf 2012 676
4 Tesla Model S 2012 0
5 Ford C-Max Energi 2012 0
Exploratory Data Analysis
Exploratory data analysis (EDA) was done in order to obtain a bird’s eyes view of the sales growth and trends among the various makes and models.
Max and Average Sales of each make and model - overall
In this query, the top 3 sales between 2012-2019 were achieved by Chevrolet, Nissan, and Tesla, lead by Volt.
sqldf("SELECT Make, Model, MAX(Sales), AVG(Sales) FROM CarsFixed GROUP BY Make, Model ORDER BY AVG(Sales) DESC")
Make Model MAX(Sales) AVG(Sales)
1 Chevrolet Volt 99270 81689.218750
2 Nissan Leaf 99646 69983.354167
3 Tesla Model S 99932 68923.708333
4 Tesla Model 3 9944 33512.614583
5 Toyota Prius PHV 9623 33408.177083
6 Ford Fusion Energi 9043 29734.677083
7 Ford C-Max Energi 9999 24178.354167
8 Tesla Model X 9737 20397.968750
9 BMW i3 9997 16632.937500
10 Fiat 500e 9326 13218.468750
11 Toyota Prius Prime 8587 11946.229167
12 Chevrolet Bolt EV 8171 11793.093750
13 VW e-Golf 9800 5803.104167
14 BMW X5 xDrive40e 9813 5466.343750
15 Ford Focus Electric 97 5407.354167
16 Honda Clarity PHEV 9354 4410.041667
17 Chevrolet Spark EV 908 4100.302083
18 Smart ED 923 3863.312500
19 Audi A3 Sprtbk e-tron 9978 3404.052083
20 BMW 530e 9328 2674.760417
21 BMW i8 9 2600.177083
22 Chrysler Pacifica Hybrid 9428 2380.406250
23 Kia Soul EV 981 2324.625000
24 Hyundai Sonata PHEV 810 2170.322917
25 Mercedes B-Class ED 92 2145.333333
26 BMW 330e 999 2096.010417
27 Porsche Cayenne S-E 945 2008.552083
28 Volvo XC90 926 1963.020833
29 Toyota Rav4 EV 940 1884.208333
30 Cadillac ELR 891 1738.177083
31 Mitsubishi i-MiEV 845 1672.677083
32 Porsche Panamera E-Hybrid 965 1481.375000
33 Mitsubishi Outlander PHEV 99 991.875000
34 Honda Fit EV 980 829.447917
35 Honda Accord PHV 975 770.343750
36 Mercedes C350e 988 706.510417
37 Kia Niro PHEV 966 662.833333
38 Volvo XC60 PHEV 962 657.687500
39 Kia Optima PHEV 930 585.177083
40 Honda Clarity BEV 594 541.552083
41 Mercedes S550e 998 501.916667
42 Mercedes GLE 550e 99 471.520833
43 Mini Countryman SE PHEV 882 398.677083
44 Hyundai IONIQ EV 99 393.468750
45 Hyundai IONIQ PHEV 958 335.020833
46 BMW 740e 978 276.781250
47 Mercedes GLC 350e 888 253.479167
48 Audi e-tron 5369 237.635417
49 Jaguar I-Pace 789 228.447917
50 Volvo S90 T8 PHEV 988 148.885417
51 Porsche 918 Spyder 86 118.656250
52 Cadillac CT6 PHEV 89 108.833333
53 Hyundai Kona Electric 897 60.708333
54 Subaru Crosstrek Hybrid 970‬ 48.385417
55 Kia Niro EV 880‬ 43.114583
56 Other Other 35 21.125000
57 Porsche Taycan 130 1.354167
Average Sales by Year
This query shows the growth of the average sales by year.
sqldf("SELECT Year, AVG(Sales) FROM CarsFixed GROUP BY Year")
This query shows the best selling models of each year.
sqldf("SELECT Year, MAX(Sales), Make, Model FROM CarsFixed GROUP BY Year")
Year MAX(Sales) Make Model
1 2012 9819 Nissan Leaf
2 2013 9650 Tesla Model S
3 2014 9999 Ford C-Max Energi
4 2015 9997 BMW i3
5 2016 99270 Chevrolet Volt
6 2017 99932 Tesla Model S
7 2018 9944 Tesla Model 3
8 2019 9978 Audi A3 Sprtbk e-tron
Insights
According to the analysis, it is recommended to invest in Tesla due it’s stability in population of the best selling model in 2013, 2017 and 2018.
Additionally it is imperative if investing in Tesla to evenly distribute the shares over the different models and develop similar machinations, due to their population in the overall yearly analysis.
It is recommended to watch over the other best-selling makes and models such as Ford and Chevrolet.