Introduction

This dataset contains information about cars made in 2022-2023 by major manufacturers around the world. Most of the visualizations use columns such as make, model, price, fuel type, body type, horsepower, MPG, engine, and gearbox. Their purpose is to investigate the relationship between price, body type, and fuel type across major brands, given the increasing popularity of electric and hybrid cars.

df <- read.csv("~/Downloads/Used Car Data/cars_us_2022.csv")

library(lubridate)
library(dplyr)
library(scales)
library(ggthemes)
library(ggplot2)
library(ggrepel)
library(plotly)
unique(df$Brand)
##  [1] "Honda"        "BMW"          "Lexus"        "Hyundai"      "Toyota"      
##  [6] "KIA"          "Nissan"       "Audi"         "Chevrolet"    "Ford"        
## [11] "Mercedes"     "Porsche"      "Infiniti"     "Jaguar"       "Cadillac"    
## [16] "Land Rover"   "Jeep"         "Volkswagen"   "Maserati"     "Subaru"      
## [21] "Dodge"        "Mazda"        "Chrysler"     "Aston Martin" "Ferrari"     
## [26] "Lamborghini"  "Bugatti"      "Bentley"      "Rolls Royce"  "Mclaren"     
## [31] "Lincoln"      "Alfa Romeo"   "Volvo"        "MINI"         "Fiat"        
## [36] "Acura"        "Genesis"      "Buick"        "GMC"          "Tesla"       
## [41] "Rimac"        "Koenigsegg"   "Lotus"        "Renault"      "Suzuki"      
## [46] "MG"           "Skoda"        "JAC"          "Proton"       "Changan"

First Visualization

The first visualization is a collection of pie charts for 9 manufacturers, each showing the percentage of that manufacturer's offerings by fuel type.

pie_chart_brands <- c("Honda", "BMW", "Lexus", "Hyundai", "Toyota", "KIA", "Nissan", "Audi", "Chevrolet")

# Drop unknown fuel types first, then compute each fuel type's share within its brand
df_filtered <- df %>%
  filter(Brand %in% pie_chart_brands, Fuel.Type != "Unknown") %>%
  count(Brand, Fuel.Type) %>%
  group_by(Brand) %>%
  mutate(total_count = sum(n),
         percent = n / total_count,
         percent_of_total = round(percent * 100)) %>%
  ungroup()

ggplot(data = df_filtered, aes(x="", y = percent, fill = Fuel.Type )) +
  geom_bar(stat="identity" , position="fill") +
  coord_polar(theta="y", start=0) +
  labs(fill = "Fuel Type", x = NULL, y = NULL, 
       title = "2022-23 Fuel Type Offerings from Different Manufacturers",
       caption = "Slices under 5% are not labeled") +
  theme_light() +
  theme(plot.title = element_text(hjust=0.5),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        panel.grid = element_blank()) +
  facet_wrap(~Brand, ncol=3, nrow=3) +
  scale_fill_brewer(palette = "Spectral") +
  geom_text(aes(x=1.7, label=ifelse(percent_of_total>=5, paste0(percent_of_total, "%"),"")),
            size = 2.8, 
            position=position_fill(vjust = 0.5))
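As a quick sanity check on the numbers behind the pie charts, the shares within each brand should sum to 1 once unknown fuel types are dropped. The sketch below is only an illustration: `toy` is a made-up data frame (its rows, including the "Petrol" label, are invented and not taken from the dataset) that reuses the `Brand` and `Fuel.Type` column names.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  Brand     = c("Honda", "Honda", "Honda", "BMW", "BMW", "BMW", "BMW"),
  Fuel.Type = c("Petrol", "Hybrid", "Unknown", "Petrol", "Petrol", "Electric", "Hybrid")
)

shares <- toy %>%
  filter(Fuel.Type != "Unknown") %>%  # drop unknowns before computing shares
  count(Brand, Fuel.Type) %>%         # one row per brand and fuel type
  group_by(Brand) %>%
  mutate(share = n / sum(n)) %>%      # share within each brand
  ungroup()

# Each brand's shares should add up to 1
check <- shares %>%
  group_by(Brand) %>%
  summarize(total = sum(share))
print(check)
```

Running the same check on `df_filtered` would confirm that the labels in each pie describe that brand alone rather than all nine brands together.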

Second Visualization

Next is a bar chart that depicts the body styles offered by 6 brands: 3 from the US and 3 from outside the US. It is meant to show the increasing demand in the SUV segment and the departure from coupe and wagon offerings.

bar_models <- df[df$Brand %in% c("Toyota", "KIA", "Honda", "Ford", "Chevrolet", "Dodge") & !is.na(df$Body.Type) & df$Body.Type != "", ]


ggplot(bar_models, aes(x=Brand, fill=Body.Type)) + 
  geom_bar() +
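  # Label each segment with its count (the bare stat_count() call only redrew the bars)
  geom_text(stat = "count", aes(label = after_stat(count)),
            position = position_stack(vjust = 0.5), show.legend = FALSE) +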
  xlab("Brand") +
  ylab("Count of Models") +
  ggtitle("Count of Models and Body Type by Brand") +
  scale_fill_discrete(name="Body Type") +
  theme(plot.title = element_text(hjust = 0.5))
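The exact counts behind a stacked bar chart can be hard to read off the plot, so a cross-tabulation is a useful companion. The following is a minimal sketch using base R's `table()` and `prop.table()` on a made-up data frame; `toy` and its rows are hypothetical, and only the `Brand` and `Body.Type` column names come from the dataset.

```r
# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  Brand     = c("Ford", "Ford", "Ford", "Toyota", "Toyota"),
  Body.Type = c("SUV", "SUV", "Coupe", "SUV", "Sedan")
)

# Count of models for every brand and body type combination
counts <- table(toy$Brand, toy$Body.Type)
print(counts)

# Share of each body type within a brand (each row sums to 1)
props <- prop.table(counts, margin = 1)
print(round(props, 2))
```

Row-wise proportions make brands with different lineup sizes easier to compare than raw counts.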

Third Visualization

Here is a line chart that shows the average price of the models from the same 6 brands. The US brands had noticeably higher average prices. This is likely because they have more sports car and coupe offerings with larger engines, whereas Honda, Toyota, and Kia focus more on affordable economy cars.

line_models <- bar_models


avg_price_by_brand <- line_models %>%
  group_by(Brand) %>%
  summarize(Avg_Price = mean(Price, na.rm = TRUE))  # ignore missing prices


ggplot(avg_price_by_brand, aes(x = Brand, y = Avg_Price, group = 1)) +
  geom_line(color='black' , linewidth=1) +
  geom_point(shape=21, size=4, color='red', fill='white') +
  labs(x="Brand", y="Average Price", title="Average Price of Models")+
  scale_y_continuous(labels=function(x) paste0("$", comma(x))) +
  theme_light()+
  theme(plot.title= element_text(hjust = 0.5)) +
  geom_label_repel(aes(label=sprintf("$%.0f", Avg_Price)), 
                   box.padding= 1, 
                   point.padding= 1, 
                   size = 4, color ='Grey50', 
                   segment.color= 'darkblue')
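A brand average can be pulled upward by a few expensive models, and a single missing price turns a plain `mean()` into `NA`. The sketch below shows one way to check both, assuming a numeric `Price` column; `toy` is a made-up data frame, and `Median_Price` and `n_models` are illustrative column names that do not appear in the analysis above.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  Brand = c("Ford", "Ford", "Ford", "Honda", "Honda"),
  Price = c(30000, 50000, NA, 25000, 35000)
)

# Without na.rm = TRUE, one missing price makes the whole mean NA
print(mean(toy$Price))

price_summary <- toy %>%
  group_by(Brand) %>%
  summarize(
    Avg_Price    = mean(Price, na.rm = TRUE),
    Median_Price = median(Price, na.rm = TRUE),  # less sensitive to outliers
    n_models     = sum(!is.na(Price))            # how many prices back each average
  )
print(price_summary)
```

Comparing the mean with the median gives a rough sense of whether a handful of high-priced models is driving a brand's average.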

Fourth Visualization

The heatmap below shows the relationship between body style and price for electric and hybrid powertrains. These particular brands were chosen because they have announced plans to electrify their lineups in the future.

heat_models <- df[df$Brand %in% c("Audi", "Lexus", "Volvo" ,"Jaguar", "Mercedes", "Porsche" , "Acura" , "Toyota" , "Lincoln" , "Volkswagen"),]


heat_models <- subset(heat_models, Fuel.Type == "Electric" | Fuel.Type == "Hybrid")

heat_models_filtered <- select(heat_models, Brand, Model.Number, Price, Power.hp, Fuel.Type, Body.Type)


ggplot(heat_models_filtered , aes(x = Brand , y = Body.Type , fill = Price )) +
  geom_tile(color = "black") +
  coord_equal(ratio=1) +
  labs(title = "Heatmap: Car Price by Body Type",
       x = "Brand" , 
       y = "Body Type" ,
       fill = "Price") + 
  theme_minimal()+
  theme(plot.title = element_text(hjust=0.5))
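When several models share the same brand and body type, `geom_tile()` draws their tiles on top of one another, so the visible color reflects only one of them. One possible way around this, sketched here on made-up data, is to average the price per brand and body type before plotting. `toy`, `heat_summary`, and `Avg_Price` are hypothetical names, and the mean is just one choice of summary.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  Brand     = c("Audi", "Audi", "Audi", "Volvo", "Volvo"),
  Body.Type = c("SUV", "SUV", "Sedan", "SUV", "Sedan"),
  Price     = c(60000, 80000, 70000, 55000, 50000)
)

# Collapse to one row per brand and body type so each tile has a single value
heat_summary <- toy %>%
  group_by(Brand, Body.Type) %>%
  summarize(Avg_Price = mean(Price, na.rm = TRUE), .groups = "drop")

p <- ggplot(heat_summary, aes(x = Brand, y = Body.Type, fill = Avg_Price)) +
  geom_tile(color = "black") +
  labs(x = "Brand", y = "Body Type", fill = "Average Price") +
  theme_minimal()
print(p)
```

With one row per cell, each tile's color corresponds to a well-defined number, and the legend can be labeled accordingly.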

Fifth Visualization

One of the early adopters of the “zero emissions” pledge was Porsche, which plans to be completely carbon neutral by 2030. Below is a breakdown of its offerings by fuel type as of 2022-2023.

porsche_data <- subset(df, Brand == "Porsche")

# Compute shares over models with a known fuel type so they sum to 1
porsche_pct <- porsche_data %>%
  filter(Fuel.Type != "Unknown") %>%
  count(Fuel.Type) %>%
  mutate(Pct = n / sum(n))

plot_ly(porsche_pct, labels = ~Fuel.Type, values = ~Pct, type = "pie",
        hole = 0.6, textposition = "inside", textinfo = "label+percent") %>%
  layout(title = "Percentage of Porsche Models by Fuel Type (2022-2023)")

Conclusion

It was found that the increasing popularity of hybrid and electric car models has made the SUV segment more popular than before, and that luxury manufacturers such as Audi and Porsche are gradually phasing out their internal combustion offerings in favor of hybrid and electric powertrains.