Introduction
This report presents a descriptive analysis of an electric vehicle (EV) dataset that records manufacturer, model year, electric vehicle type, county, and electric range for each vehicle. The goal is to describe patterns in EV adoption by model year, manufacturer concentration, geographic distribution, and vehicle range. The report combines basic descriptive statistics with five graphs, each followed by an interpretation.
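# Packages used throughout the report
library(readxl)   # read_excel()
library(janitor)  # clean_names()
library(dplyr)    # data manipulation verbs, tibble(), and %>%
library(tidyr)    # replace_na()
library(ggplot2)  # plotting
# Load the workbook and standardise column names to snake_case
ev_data <- read_excel("U:/R/Monteforte_MidModuleData.xlsx")
ev_data <- clean_names(ev_data) %>%
  mutate(
    model_year = as.integer(model_year),
    make = trimws(as.character(make)),
    county = trimws(as.character(county)),
    electric_vehicle_type = trimws(as.character(electric_vehicle_type)),
    # Non-numeric range entries become NA instead of raising warnings
    electric_range = suppressWarnings(as.numeric(electric_range))
  )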
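The introduction mentions basic descriptive statistics, so a small summary helper may be useful before the graphs. The sketch below is an assumption about how such a summary could be computed, not part of the original script: `describe_numeric()` is a hypothetical helper written in base R, and `toy_range` holds made-up values rather than data from the workbook.

```r
# Hypothetical helper (not in the original script): summarise one numeric
# column, reporting missing values separately from the statistics.
describe_numeric <- function(x) {
  x <- suppressWarnings(as.numeric(x))
  observed <- x[!is.na(x)]
  c(
    n       = length(observed),
    missing = sum(is.na(x)),
    mean    = mean(observed),
    median  = median(observed),
    sd      = sd(observed),
    min     = min(observed),
    max     = max(observed)
  )
}

# Made-up values for illustration only; not taken from the EV dataset.
toy_range <- c(0, 25, 84, 215, 238, NA, 322)
range_summary <- describe_numeric(toy_range)
print(round(range_summary, 1))
```

With the data loaded as above, the same helper could be called as `describe_numeric(ev_data$electric_range)` or `describe_numeric(ev_data$model_year)`.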
# Count vehicles per model year, then join against the full year sequence
# so that years with no records appear with a count of zero
ev_full <- ev_data %>%
  filter(!is.na(model_year)) %>%
  count(model_year, name = "total_vehicles") %>%
  right_join(
    tibble(model_year = seq(
      min(ev_data$model_year, na.rm = TRUE),
      max(ev_data$model_year, na.rm = TRUE)
    )),
    by = "model_year"
  ) %>%
  replace_na(list(total_vehicles = 0)) %>%
  arrange(model_year)
# Line chart of vehicle counts by model year
ggplot(ev_full, aes(model_year, total_vehicles)) +
  geom_line() +
  geom_point() +
  labs(
    title = "EVs by Model Year",
    x = "Model Year",
    y = "Number of Vehicles"
  ) +
  theme_minimal()
This line chart plots the number of vehicles in the dataset for each model year, with years that have no records shown as zero. Counts rise toward the more recent model years, indicating that EV adoption has grown over time. Sharp increases in the latest years, where present, are consistent with stronger consumer demand and wider availability of EV models. The trend matters for decision makers because rising adoption usually implies greater need for charging infrastructure and EV-related services.
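To put numbers on the rises described above, the change between consecutive model years can be tabulated. This is a minimal sketch under stated assumptions: `yoy_change()` is a hypothetical base R helper, it assumes one row per consecutive year (as in `ev_full`), and the counts in the example are made up.

```r
# Hypothetical helper: absolute and percentage change between consecutive
# model years. Assumes one count per year with no gaps in the sequence.
yoy_change <- function(years, counts) {
  ord <- order(years)
  years <- years[ord]
  counts <- counts[ord]
  previous <- c(NA, head(counts, -1))  # count for the preceding year
  data.frame(
    model_year     = years,
    total_vehicles = counts,
    change         = counts - previous,
    # Percentage change is left as NA when the previous year has no vehicles
    pct_change     = ifelse(previous > 0, 100 * (counts - previous) / previous, NA)
  )
}

# Made-up counts for illustration only.
toy_trend <- yoy_change(c(2019, 2020, 2021, 2022), c(100, 150, 300, 450))
print(toy_trend)
```

Applied to the report's data, the call would be `yoy_change(ev_full$model_year, ev_full$total_vehicles)`.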
# Ten manufacturers with the most vehicles, ignoring missing or blank makes
top_makes <- ev_data %>%
  filter(!is.na(make), make != "") %>%
  count(make, name = "total_vehicles", sort = TRUE) %>%
  slice_head(n = 10)
# Horizontal bar chart, largest manufacturer at the top
ggplot(top_makes, aes(reorder(make, total_vehicles), total_vehicles)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 EV Manufacturers",
    x = "Manufacturer",
    y = "Number of Vehicles"
  ) +
  theme_minimal()
This bar chart ranks the ten manufacturers with the most vehicles in the
dataset. Where a few bars are much longer than the rest, a small number
of manufacturers account for a large share of EVs, which indicates market
concentration. Concentration can reflect brand leadership, production
scale, consumer preferences, and availability of models. For planning
purposes, knowing the dominant makes can help target support services,
parts supply, and dealership or service capacity.
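Concentration can also be expressed as a single number: the share of all vehicles that belongs to the top n manufacturers. The following is a sketch only; `top_share()` is a hypothetical helper, and the make labels are placeholders rather than values from the dataset.

```r
# Hypothetical helper: share of all records held by the n most common values.
top_share <- function(values, n = 10) {
  values <- values[!is.na(values) & values != ""]  # same filter as the chart
  counts <- sort(table(values), decreasing = TRUE)
  sum(head(counts, n)) / sum(counts)
}

# Placeholder makes for illustration only.
toy_makes <- c(rep("Make A", 6), rep("Make B", 3), "Make C")
share_top1 <- top_share(toy_makes, n = 1)
share_top2 <- top_share(toy_makes, n = 2)
print(c(top1 = share_top1, top2 = share_top2))
```

On the report's data, `top_share(ev_data$make, n = 10)` would give the fraction of vehicles covered by the ten makes in the chart.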
# Names of the five manufacturers with the most vehicles overall
top5_makes <- ev_data %>%
  filter(!is.na(make), make != "") %>%
  count(make, sort = TRUE) %>%
  slice_head(n = 5) %>%
  pull(make)
# Vehicle counts for those five makes, split by EV type
nested_data <- ev_data %>%
  filter(
    make %in% top5_makes,
    !is.na(electric_vehicle_type),
    electric_vehicle_type != ""
  ) %>%
  count(electric_vehicle_type, make, name = "total_vehicles")
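# One ring per EV type. Facets share a single count scale, so without
# position = "fill" the type with fewer vehicles would draw only a partial ring.
ggplot(nested_data, aes(x = 2, y = total_vehicles, fill = make)) +
  geom_col(position = "fill", color = "white") +
  coord_polar(theta = "y") +
  xlim(0.5, 2.5) +
  facet_wrap(~ electric_vehicle_type) +
  labs(
    title = "Top 5 Manufacturers Within Each EV Type",
    fill = "Make"
  ) +
  theme_void()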
This visualization takes the five manufacturers with the most vehicles
overall and shows how they split each EV type, using a separate ring
chart per type. Separating Battery Electric Vehicles (BEVs) from Plug-in
Hybrid Electric Vehicles (PHEVs) makes it easier to see whether the same
manufacturers dominate both categories. Differences between the rings
suggest that a manufacturer focuses more heavily on one technology than
the other. This is useful for understanding how the market is segmented
and how adoption may differ by technology.
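Ring charts make exact shares hard to read, so a small table of percentages within each EV type can accompany the figure. This is a minimal sketch, assuming a hypothetical helper `type_shares()`; the type and make values below are placeholders, not dataset contents.

```r
# Hypothetical helper: percentage of each make within each EV type.
# Rows are EV types, columns are makes, and every row sums to 100.
type_shares <- function(type, make) {
  counts <- table(type, make)
  round(100 * prop.table(counts, margin = 1), 1)
}

# Placeholder records for illustration only.
toy_type <- c("BEV", "BEV", "BEV", "BEV", "PHEV", "PHEV")
toy_make <- c("Make A", "Make A", "Make A", "Make B", "Make B", "Make B")
shares <- type_shares(toy_type, toy_make)
print(shares)
```

For the report's data, the inputs would be the `electric_vehicle_type` and `make` columns of `ev_data`, restricted to the makes in `top5_makes`.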
# Keep positive ranges only; this drops zeros and also rows where the range is NA
ev_range_clean <- ev_data %>%
  filter(electric_range > 0)
# Histogram of reported electric range
ggplot(ev_range_clean, aes(electric_range)) +
  geom_histogram(bins = 30) +
  labs(
    title = "Electric Range Distribution (Excluding Zero Values)",
    x = "Electric Range (miles)",
    y = "Count"
  ) +
  theme_minimal()
This histogram displays the distribution of reported electric ranges
after removing zero values, which likely stand in for missing data, and
rows with no range recorded. Vehicles that cluster in a mid-range band,
with a thinner tail toward very high ranges, produce a right-skewed
shape: high-range EVs exist but are less common than mid-range models.
This visualization helps decision makers understand typical range
capabilities and how common longer-range vehicles are in the market.
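Whether the distribution is right-skewed can be checked numerically rather than by eye. The sketch below uses the moment-based skewness coefficient and the gap between mean and median; `sample_skewness()` is a hypothetical helper, and the range values are made up for illustration.

```r
# Hypothetical helper: moment-based skewness coefficient.
# Positive values indicate a longer right tail, negative a longer left tail.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

# Made-up ranges in miles: mostly mid-range, plus two long-range vehicles.
toy_miles <- c(20, 30, 40, 50, 60, 250, 320)
skew <- sample_skewness(toy_miles)
mean_minus_median <- mean(toy_miles) - median(toy_miles)
print(c(skewness = skew, mean_minus_median = mean_minus_median))
```

A positive coefficient together with a mean above the median supports a right-skew reading; on the report's data the input would be `ev_range_clean$electric_range`.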
top_counties <- ev_data %>%
  filter(!is.na(county), county != "") %>%
  count(county, name = "total_vehicles", sort = TRUE) %>%
  slice_head(n = 10)
ggplot(top_counties, aes(reorder(county, total_vehicles), total_vehicles)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Counties by EV Count",
    x = "County",
    y = "Number of Vehicles"
  ) +
  theme_minimal()
This bar chart shows the ten counties with the highest EV counts in the
dataset. A strong concentration in a few counties often reflects
population density, urbanization, income differences, and charging
infrastructure availability. Geographic clustering is important because
it can guide where new charging stations, incentives, or outreach
programs may have the biggest impact. Because raw counts favor populous
counties, the chart is best treated as a starting point for deeper
analysis, such as comparing EV counts per capita or exploring
differences between urban and rural areas.
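The per capita comparison suggested above could look like the following sketch. County population is not part of the EV dataset, so the population table is an assumption that would have to come from an external source; `evs_per_1000()` is a hypothetical helper, and all county names and figures below are placeholders.

```r
# Hypothetical helper: EVs per 1,000 residents, highest rate first.
# ev_counts needs columns county and total_vehicles;
# population needs columns county and population.
evs_per_1000 <- function(ev_counts, population) {
  merged <- merge(ev_counts, population, by = "county")
  merged$evs_per_1000 <- 1000 * merged$total_vehicles / merged$population
  merged[order(-merged$evs_per_1000), ]
}

# Placeholder counties and numbers for illustration only.
toy_counts <- data.frame(
  county = c("County A", "County B"),
  total_vehicles = c(5000, 400)
)
toy_population <- data.frame(
  county = c("County A", "County B"),
  population = c(2000000, 50000)
)
rates <- evs_per_1000(toy_counts, toy_population)
print(rates)
```

In this toy example the smaller county has fewer EVs but a higher rate per resident, which is the kind of reversal a per capita view is meant to reveal.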