Introduction

This report presents a descriptive analysis of an electric vehicle dataset containing information on manufacturer, model year, electric vehicle type, county location, and electric range. The goal is to describe patterns in EV adoption over time, manufacturer concentration, geographic distribution, and vehicle range characteristics. The report includes basic descriptive statistics and five accompanying graphs, each explained in detail.

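library(readxl)   # read_excel()
library(janitor)  # clean_names()
library(dplyr)    # %>%, mutate(), filter(), count(), slice_head()
library(tidyr)    # replace_na()
library(tibble)   # tibble()
library(ggplot2)  # ggplot() and related plotting functions

ev_data <- read_excel("U:/R/Monteforte_MidModuleData.xlsx")

# Standardize column names, then coerce the analysis columns to consistent types
ev_data <- clean_names(ev_data) %>%
  mutate(
    model_year = as.integer(model_year),
    make = trimws(as.character(make)),
    county = trimws(as.character(county)),
    electric_vehicle_type = trimws(as.character(electric_vehicle_type)),
    # Non-numeric range entries become NA; the coercion warning is suppressed
    electric_range = suppressWarnings(as.numeric(electric_range))
  )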
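Before plotting, a few summary figures help frame the charts that follow. The sketch below is a minimal example, assuming the cleaned column names used above; `ev_toy` is a small hypothetical stand-in for `ev_data`, not the real dataset, so the numbers it produces are illustrative only.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for ev_data after cleaning (not the real dataset)
ev_toy <- tibble(
  model_year = c(2019L, 2020L, 2020L, 2022L, 2022L, 2022L),
  make = c("A", "A", "B", "B", "C", "A"),
  electric_vehicle_type = c("BEV", "PHEV", "BEV", "BEV", "PHEV", "BEV"),
  electric_range = c(220, 25, 0, 250, 30, NA)
)

# One-row table of basic descriptive statistics
ev_summary <- ev_toy %>%
  summarise(
    n_vehicles = n(),
    n_makes = n_distinct(make),
    first_model_year = min(model_year, na.rm = TRUE),
    last_model_year = max(model_year, na.rm = TRUE),
    # Zero and NA ranges are treated as "not reported"
    n_range_reported = sum(electric_range > 0, na.rm = TRUE),
    median_range = median(electric_range[electric_range > 0], na.rm = TRUE)
  )

print(ev_summary)
```

Replacing `ev_toy` with `ev_data` should give the same table for the full dataset, provided the columns carry the names shown here.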


# Count vehicles per model year; years with no vehicles are added with a count of 0
ev_full <- ev_data %>%
  filter(!is.na(model_year)) %>%
  count(model_year, name = "total_vehicles") %>%
  right_join(
    tibble(model_year = seq(
      min(ev_data$model_year, na.rm = TRUE),
      max(ev_data$model_year, na.rm = TRUE)
    )),
    by = "model_year"
  ) %>%
  replace_na(list(total_vehicles = 0)) %>%
  arrange(model_year)

ggplot(ev_full, aes(model_year, total_vehicles)) +
  geom_line() +
  geom_point() +
  labs(
    title = "EVs by Model Year",
    x = "Model Year",
    y = "Number of Vehicles"
  ) +
  theme_minimal()

This line chart shows the number of vehicles in the dataset for each model year, with years that have no vehicles plotted as zero. Counts rise toward the more recent model years, which is consistent with EV adoption growing over time. Sharp increases in the latest years point to stronger consumer demand and a wider choice of EV models, although the newest model year may be undercounted if the data were collected partway through it. The trend matters to decision makers because rising adoption usually implies greater need for charging infrastructure and EV-related services.
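To put a number on how sharp a rise is, the yearly counts can be differenced. The following is a minimal sketch, assuming a table shaped like `ev_full` (columns `model_year` and `total_vehicles`); `yearly_toy` is hypothetical data used only to make the example runnable.

```r
library(dplyr)
library(tibble)

# Hypothetical yearly counts shaped like ev_full (not the real dataset)
yearly_toy <- tibble(
  model_year = 2018:2022,
  total_vehicles = c(100, 150, 150, 300, 450)
)

yearly_change <- yearly_toy %>%
  arrange(model_year) %>%
  mutate(
    # Absolute change from the previous model year (NA for the first year)
    change = total_vehicles - lag(total_vehicles),
    # Percentage change; not finite when the previous year's count is 0
    pct_change = 100 * change / lag(total_vehicles)
  )

print(yearly_change)
```

Because `ev_full` fills empty years with zero, percentage changes that follow a zero-count year are not meaningful and are better read from the absolute `change` column.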

top_makes <- ev_data %>%
  filter(!is.na(make), make != "") %>%
  count(make, name = "total_vehicles", sort = TRUE) %>%
  slice_head(n = 10)

ggplot(top_makes, aes(reorder(make, total_vehicles), total_vehicles)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 EV Manufacturers",
    x = "Manufacturer",
    y = "Number of Vehicles"
  ) +
  theme_minimal()

This bar chart ranks the ten manufacturers with the most vehicles in the dataset. When the bars drop off steeply after the first few makes, a small number of manufacturers account for a large portion of EVs, which indicates market concentration. Concentration can reflect brand leadership, production scale, consumer preferences, and the availability of models. For planning purposes, knowing the dominant makes can help target support services, parts supply, and dealership or service capacity.
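Concentration can also be expressed as each make's share of all vehicles and the running total of those shares. The sketch below is one way to compute this, assuming a count table with the columns `make` and `total_vehicles`; `make_counts_toy` is hypothetical data.

```r
library(dplyr)
library(tibble)

# Hypothetical per-make counts (not the real dataset)
make_counts_toy <- tibble(
  make = c("A", "B", "C", "D"),
  total_vehicles = c(50, 30, 15, 5)
)

make_shares <- make_counts_toy %>%
  arrange(desc(total_vehicles)) %>%
  mutate(
    # Share of all vehicles held by this make
    share = total_vehicles / sum(total_vehicles),
    # Combined share of this make and every larger make
    cumulative_share = cumsum(share)
  )

print(make_shares)
```

For shares of the whole market, the calculation should run on the full count table before `slice_head(n = 10)` is applied; computed on `top_makes` alone, the shares would be relative to the top ten only.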

# The five makes with the most vehicles overall (not ranked separately per EV type)
top5_makes <- ev_data %>%
  filter(!is.na(make), make != "") %>%
  count(make, sort = TRUE) %>%
  slice_head(n = 5) %>%
  pull(make)

nested_data <- ev_data %>%
  filter(
    make %in% top5_makes,
    !is.na(electric_vehicle_type),
    electric_vehicle_type != ""
  ) %>%
  count(electric_vehicle_type, make, name = "total_vehicles")

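# position = "fill" rescales each facet to proportions, so every ring closes
# fully even when one EV type has far fewer vehicles than the other
ggplot(nested_data, aes(x = 2, y = total_vehicles, fill = make)) +
  geom_col(color = "white", position = "fill") +
  coord_polar(theta = "y") +
  xlim(0.5, 2.5) +
  facet_wrap(~ electric_vehicle_type) +
  labs(
    title = "Top 5 Manufacturers Overall, by EV Type",
    fill = "Make"
  ) +
  theme_void()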

This visualization takes the five manufacturers with the most vehicles overall and shows, in a separate ring chart for each EV type, how that type's vehicles are divided among them. By splitting Battery Electric Vehicles (BEVs) and Plug-in Hybrid Electric Vehicles (PHEVs), it becomes easier to see whether the same manufacturers dominate both categories. Differences between the charts suggest that a manufacturer focuses more heavily on one technology than the other. This is useful for understanding how the market is segmented and how adoption may differ by technology.
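The comparison across EV types can be made numeric by computing each make's share within its type. This is a minimal sketch, assuming a table shaped like `nested_data`; `nested_toy` is hypothetical data with abbreviated type labels.

```r
library(dplyr)
library(tibble)

# Hypothetical counts shaped like nested_data (not the real dataset)
nested_toy <- tibble(
  electric_vehicle_type = c("BEV", "BEV", "PHEV", "PHEV"),
  make = c("A", "B", "A", "B"),
  total_vehicles = c(80, 20, 10, 30)
)

type_shares <- nested_toy %>%
  group_by(electric_vehicle_type) %>%
  # Share of each make among the listed makes of the same EV type
  mutate(share_within_type = total_vehicles / sum(total_vehicles)) %>%
  ungroup()

print(type_shares)
```

In this toy table, make A leads among BEVs while make B leads among PHEVs, which is the kind of contrast the faceted chart is meant to reveal.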

# Keep only positive ranges; filter() also drops rows where electric_range is NA
ev_range_clean <- ev_data %>%
  filter(electric_range > 0)

ggplot(ev_range_clean, aes(electric_range)) +
  geom_histogram(bins = 30) +
  labs(
    title = "Electric Range Distribution (Excluding Zero Values)",
    x = "Electric Range (miles)",
    y = "Count"
  ) +
  theme_minimal()

This histogram shows the distribution of reported electric range after removing zero values, which likely represent missing data rather than a true range of zero miles. If most vehicles cluster in a low-to-mid range band with a thinner tail toward high values, the distribution is right-skewed: high-range EVs exist, but they are less common than mid-range models. This visualization helps decision makers understand typical range capabilities and how common longer-range vehicles are in the market.
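A few summary statistics can back up what the histogram suggests. The sketch below assumes the same `electric_range > 0` filter used for the plot; `range_toy` is hypothetical data. A mean well above the median is a simple indication of right skew.

```r
library(dplyr)
library(tibble)

# Hypothetical range values, including a zero and an NA (not the real dataset)
range_toy <- tibble(electric_range = c(0, 20, 25, 30, 40, 210, 250, 300, NA))

range_summary <- range_toy %>%
  filter(electric_range > 0) %>%  # drops the zero and the NA
  summarise(
    n = n(),
    mean_range = mean(electric_range),
    median_range = median(electric_range),
    q25 = quantile(electric_range, 0.25, names = FALSE),
    q75 = quantile(electric_range, 0.75, names = FALSE)
  )

print(range_summary)
```

Because BEVs and PHEVs are pooled here, it may be worth repeating the summary with `group_by(electric_vehicle_type)` before drawing conclusions about typical range.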

top_counties <- ev_data %>%
  filter(!is.na(county), county != "") %>%
  count(county, name = "total_vehicles", sort = TRUE) %>%
  slice_head(n = 10)

ggplot(top_counties, aes(reorder(county, total_vehicles), total_vehicles)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Counties by EV Count",
    x = "County",
    y = "Number of Vehicles"
  ) +
  theme_minimal()

This bar chart shows which counties have the highest counts of EVs in the dataset. A strong concentration in a few counties often reflects population density, urbanization, income differences, and charging infrastructure availability. Geographic clustering is important because it can guide where new charging stations, incentives, or outreach programs may have the biggest impact. It also provides a starting point for deeper analysis, such as comparing EV counts per capita or exploring differences between urban and rural areas.
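The per-capita comparison mentioned above could be sketched as follows. The dataset described in this report does not include population, so `county_pop_toy` stands in for an external population table that would have to be sourced separately; both tables below are hypothetical, and the county names are placeholders.

```r
library(dplyr)
library(tibble)

# Hypothetical county counts shaped like top_counties (not the real dataset)
county_counts_toy <- tibble(
  county = c("X", "Y", "Z"),
  total_vehicles = c(5000, 1200, 300)
)

# Hypothetical population table; real figures would come from an external source
county_pop_toy <- tibble(
  county = c("X", "Y", "Z"),
  population = c(2000000, 200000, 40000)
)

per_capita <- county_counts_toy %>%
  left_join(county_pop_toy, by = "county") %>%
  # EVs per 1,000 residents
  mutate(evs_per_1000 = 1000 * total_vehicles / population) %>%
  arrange(desc(evs_per_1000))

print(per_capita)
```

In the toy data the county with the most vehicles ranks last per capita, which illustrates why raw counts and per-capita rates can lead to different priorities.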