#Dataset
Electric Vehicle Population Data https://catalog.data.gov/dataset/electric-vehicle-population-data
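#Load packages used below (dplyr for %>%, filter(), select(), pull(); ggplot2 for plots)
library(dplyr)
library(ggplot2)
ev <- read.csv("C:/Users/ghale/Downloads/Electric_Vehicle_Population_Data.csv")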
#view dimensions and preview
dim(ev)
## [1] 264628 17
head(ev)
## VIN..1.10. County City State Postal.Code Model.Year Make Model
## 1 WA1E2AFY8R Thurston Olympia WA 98512 2024 AUDI Q5 E
## 2 WAUUPBFF4J Yakima Wapato WA 98951 2018 AUDI A3
## 3 1N4AZ0CP0F King Seattle WA 98125 2015 NISSAN LEAF
## 4 WA1VAAGE5K King Kent WA 98031 2019 AUDI E-TRON
## 5 7SAXCAE57N Snohomish Bothell WA 98021 2022 TESLA MODEL X
## 6 KNDJP3AEXG Snohomish Lynnwood WA 98037 2016 KIA SOUL
## Electric.Vehicle.Type
## 1 Plug-in Hybrid Electric Vehicle (PHEV)
## 2 Plug-in Hybrid Electric Vehicle (PHEV)
## 3 Battery Electric Vehicle (BEV)
## 4 Battery Electric Vehicle (BEV)
## 5 Battery Electric Vehicle (BEV)
## 6 Battery Electric Vehicle (BEV)
## Clean.Alternative.Fuel.Vehicle..CAFV..Eligibility Electric.Range
## 1 Not eligible due to low battery range 23
## 2 Not eligible due to low battery range 16
## 3 Clean Alternative Fuel Vehicle Eligible 84
## 4 Clean Alternative Fuel Vehicle Eligible 204
## 5 Eligibility unknown as battery range has not been researched 0
## 6 Clean Alternative Fuel Vehicle Eligible 93
## Base.MSRP Legislative.District DOL.Vehicle.ID Vehicle.Location
## 1 0 22 263239938 POINT (-122.90787 46.9461)
## 2 0 15 318160860 POINT (-120.42083 46.44779)
## 3 0 46 184963586 POINT (-122.30253 47.72656)
## 4 0 11 259426821 POINT (-122.17743 47.41185)
## 5 0 1 208182236 POINT (-122.18384 47.8031)
## 6 31950 21 209171889 POINT (-122.27734 47.83785)
## Electric.Utility X2020.Census.Tract
## 1 PUGET SOUND ENERGY INC 53067010910
## 2 PACIFICORP 53077940008
## 3 CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA) 53033000700
## 4 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 53033029306
## 5 PUGET SOUND ENERGY INC 53061051922
## 6 PUGET SOUND ENERGY INC 53061051928
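The dotted column names in this preview (for example `VIN..1.10.` and `X2020.Census.Tract`) are what `read.csv()` produces by default, because it passes the header through `make.names()`. The sketch below illustrates that conversion. The raw header strings are assumptions inferred from the dotted names and were not checked against the CSV file itself.

```r
# Raw header names are ASSUMED (inferred from the dotted names in the preview).
raw_headers <- c("VIN (1-10)",
                 "Clean Alternative Fuel Vehicle (CAFV) Eligibility",
                 "2020 Census Tract")

# read.csv() applies make.names() when check.names = TRUE (the default):
# invalid characters become "." and a name starting with a digit gets an "X" prefix.
syntactic <- make.names(raw_headers)
print(syntactic)
```

Passing `check.names = FALSE` to `read.csv()` would keep the original headers, at the cost of needing backticks around names that contain spaces.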
#Introduction
This report analyzes the Electric Vehicle Population Data, which contains 264,628 records and 17 variables. Each record documents an individual electric vehicle registered in Washington State and includes the make, model, model year, electric vehicle type, electric range, location, and electric utility. The key variables used here are:
Electric Vehicle Type — categorical: “Battery Electric Vehicle (BEV)” or “Plug-in Hybrid Electric Vehicle (PHEV)”
Electric Range — numeric: reported electric range in miles
Other variables (make, model year, county, etc.) are available for follow-up analyses. The dataset was obtained from the Washington State Department of Licensing (DOL). It can be accessed at the following link: https://catalog.data.gov/dataset/electric-vehicle-population-data
#Research Question
Is there a significant difference in the average electric range between Battery Electric Vehicles (BEV) and Plug-in Hybrid Electric Vehicles (PHEV)?
#Data Analysis
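In this section I clean the data by removing rows with a missing electric range or vehicle type, and produce EDA visualizations. I use filter() and select() plus summary functions. Records with an electric range of 0 are kept: the summary below shows a minimum and median of 0, and in the preview a range of 0 appears alongside the label "Eligibility unknown as battery range has not been researched" (row 5), so a zero likely marks an unrecorded range rather than a measured one.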
#Remove missing values for Electric Range and Vehicle Type
clean_ev <- ev %>%
  filter(!is.na(Electric.Range),
         !is.na(Electric.Vehicle.Type))
#Select only the variables needed
ev_selected <- clean_ev %>%
  select(Electric.Vehicle.Type, Electric.Range)
summary(ev_selected)
## Electric.Vehicle.Type Electric.Range
## Length:264624 Min. : 0.00
## Class :character 1st Qu.: 0.00
## Mode :character Median : 0.00
## Mean : 41.71
## 3rd Qu.: 34.00
## Max. :337.00
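The minimum and median of 0 in this summary mean that at least half of the retained records report an electric range of 0. A minimal sketch of how those zeros could be counted per vehicle type, and optionally excluded as a sensitivity check, is shown here. The data frame `toy_ev` and its values are hypothetical and only mirror the column names used above.

```r
library(dplyr)

# Hypothetical toy data; the values are made up for illustration only.
toy_ev <- data.frame(
  Electric.Vehicle.Type = c("Battery Electric Vehicle (BEV)",
                            "Battery Electric Vehicle (BEV)",
                            "Battery Electric Vehicle (BEV)",
                            "Plug-in Hybrid Electric Vehicle (PHEV)",
                            "Plug-in Hybrid Electric Vehicle (PHEV)"),
  Electric.Range = c(0, 0, 204, 23, 16)
)

# Count and share of zero-range records within each vehicle type
zero_summary <- toy_ev %>%
  group_by(Electric.Vehicle.Type) %>%
  summarise(n = n(),
            n_zero = sum(Electric.Range == 0),
            share_zero = mean(Electric.Range == 0))
print(zero_summary)

# Optional sensitivity check: keep only records with a positive reported range
toy_positive <- toy_ev %>%
  filter(Electric.Range > 0)
```

Running the same two steps on `ev_selected` would show how much of each group consists of zero-range records before any comparison of means is made.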
#Visualization
#Histogram of Electric Range
ggplot(ev_selected, aes(x = Electric.Range, fill = Electric.Vehicle.Type)) +
  #alpha must lie between 0 and 1; 0.5 keeps overlapping bars visible
  geom_histogram(position = "identity", alpha = 0.5, bins = 25) +
  labs(title = "Distribution of Electric Range for BEVs vs PHEVs")
#Boxplot Comparing BEV and PHEV Range
ggplot(ev_selected, aes(x = Electric.Vehicle.Type, y = Electric.Range)) +
  geom_boxplot() +
  labs(title = "Electric Range Comparison: BEV vs PHEV")
#Interpretation from Visualization
The histogram and boxplot both show a clear difference in electric range between BEVs and PHEVs. BEVs have a distribution shifted to the right, with higher median values and a wider spread in range, indicating longer electric-only driving capabilities and more model variety. In contrast, PHEVs cluster tightly at lower electric ranges, typically between 10 and 40 miles, reflecting their design as hybrid vehicles that rely partially on gasoline. The lack of major overlap between the two groups visually suggests that BEVs deliver more electric range than PHEVs. Because records with a reported range of 0 are retained (the overall median in the summary is 0), both plots also include a large group of vehicles at zero, which should be kept in mind when comparing the groups.
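One way to back a visual comparison like this with numbers is to compute the median and interquartile range for each vehicle type. The sketch below does so in base R on a small made-up data frame. The name `toy_ranges` and all of its values are hypothetical and are not taken from the dataset.

```r
# Hypothetical toy data; the values are made up for illustration only.
toy_ranges <- data.frame(
  Electric.Vehicle.Type = rep(c("BEV", "PHEV"), each = 4),
  Electric.Range = c(84, 204, 215, 238, 16, 23, 32, 38)
)

# Median per group: the centre line of each box in a boxplot
medians <- tapply(toy_ranges$Electric.Range,
                  toy_ranges$Electric.Vehicle.Type, median)

# Interquartile range per group: the height of each box
iqrs <- tapply(toy_ranges$Electric.Range,
               toy_ranges$Electric.Vehicle.Type, IQR)

print(data.frame(median = medians, iqr = iqrs))
```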
#Statistical Analysis
To answer the research question, I conduct an independent two-sample t-test comparing the mean electric range of BEVs and PHEVs. Because t.test() does not assume equal variances by default, the test reported is Welch's t-test.
#Hypotheses
Null Hypothesis (H₀): μ_BEV = μ_PHEV (There is no difference in average electric range between BEVs and PHEVs.)
Alternative Hypothesis (Hₐ): μ_BEV ≠ μ_PHEV (There is a difference in average electric range between BEVs and PHEVs.)
#Separate groups
bev <- clean_ev %>% filter(Electric.Vehicle.Type == "Battery Electric Vehicle (BEV)") %>% pull(Electric.Range)
phev <- clean_ev %>% filter(Electric.Vehicle.Type == "Plug-in Hybrid Electric Vehicle (PHEV)") %>% pull(Electric.Range)
#Sample sizes
n_bev <- length(bev)
n_phev <- length(phev)
data.frame(group = c("BEV", "PHEV"), n = c(n_bev, n_phev))
## group n
## 1 BEV 210575
## 2 PHEV 54049
#T-test
#bev and phev are the range vectors created in the Separate groups step
t.test(bev, phev, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: bev and phev
## t = 62.357, df = 244421, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 12.35578 13.15772
## sample estimates:
## mean of x mean of y
## 44.31870 31.56195
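To make the quantities in this output more concrete, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with `t.test()`. The vectors `x` and `y` are made-up toy samples, not values from the dataset.

```r
# Hypothetical toy samples (made-up values, not from the dataset)
x <- c(84, 204, 215, 238, 259, 151)
y <- c(16, 23, 32, 38, 25, 21)

# Per-group variance of the sample mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch's t statistic: difference in means over the unpooled standard error
t_manual <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation for the degrees of freedom
df_manual <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Compare with R's built-in test (var.equal = FALSE is the default)
res <- t.test(x, y, alternative = "two.sided")
print(c(t_manual = t_manual, t_builtin = unname(res$statistic)))
print(c(df_manual = df_manual, df_builtin = unname(res$parameter)))
```

The non-integer degrees of freedom that Welch's method produces explain why the reported df is not simply the combined sample size minus two.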
#Conclusion
The results show a statistically significant difference between the two groups (Welch's t = 62.357, df = 244,421, p < 2.2e-16). Because the p-value is far below α = 0.05, we reject the null hypothesis and conclude that the mean electric ranges of BEVs and PHEVs are not equal. BEVs have a higher average range (44.32 miles) than PHEVs (31.56 miles), an estimated difference of about 12.76 miles. The 95% confidence interval for the difference in means (12.36 to 13.16 miles) does not include zero, which is consistent with BEVs having a greater recorded electric range than PHEVs on average. These means include records with a reported range of 0, so they describe the values recorded in the dataset and may understate the range of vehicles whose range has not been researched.
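As a quick arithmetic check on the figures quoted above, the sketch below rebuilds the confidence interval from the reported means, t statistic, and degrees of freedom. The numbers are copied from the t-test output. The variable names are illustrative, and the standard error is implied from the rounded output rather than reported by R.

```r
# Values copied from the t-test output
mean_bev  <- 44.31870
mean_phev <- 31.56195
t_stat    <- 62.357
df_welch  <- 244421

# Estimated difference in means
diff_means <- mean_bev - mean_phev

# Standard error implied by t = difference / standard error
se <- diff_means / t_stat

# 95% margin of error from the t distribution
half_width <- qt(0.975, df_welch) * se

# Rebuilt interval; it should agree with the reported one up to rounding
ci_rebuilt <- diff_means + c(-1, 1) * half_width
print(round(c(diff = diff_means, se = se,
              lower = ci_rebuilt[1], upper = ci_rebuilt[2]), 3))
```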