#Dataset

Electric Vehicle Population Data https://catalog.data.gov/dataset/electric-vehicle-population-data

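library(dplyr)    # %>%, filter(), select(), pull()
library(ggplot2)  # ggplot(), geom_histogram(), geom_boxplot()

ev <- read.csv("C:/Users/ghale/Downloads/Electric_Vehicle_Population_Data.csv")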

#View dimensions and preview

dim(ev)
## [1] 264628     17
head(ev)
##   VIN..1.10.    County     City State Postal.Code Model.Year   Make   Model
## 1 WA1E2AFY8R  Thurston  Olympia    WA       98512       2024   AUDI    Q5 E
## 2 WAUUPBFF4J    Yakima   Wapato    WA       98951       2018   AUDI      A3
## 3 1N4AZ0CP0F      King  Seattle    WA       98125       2015 NISSAN    LEAF
## 4 WA1VAAGE5K      King     Kent    WA       98031       2019   AUDI  E-TRON
## 5 7SAXCAE57N Snohomish  Bothell    WA       98021       2022  TESLA MODEL X
## 6 KNDJP3AEXG Snohomish Lynnwood    WA       98037       2016    KIA    SOUL
##                    Electric.Vehicle.Type
## 1 Plug-in Hybrid Electric Vehicle (PHEV)
## 2 Plug-in Hybrid Electric Vehicle (PHEV)
## 3         Battery Electric Vehicle (BEV)
## 4         Battery Electric Vehicle (BEV)
## 5         Battery Electric Vehicle (BEV)
## 6         Battery Electric Vehicle (BEV)
##              Clean.Alternative.Fuel.Vehicle..CAFV..Eligibility Electric.Range
## 1                        Not eligible due to low battery range             23
## 2                        Not eligible due to low battery range             16
## 3                      Clean Alternative Fuel Vehicle Eligible             84
## 4                      Clean Alternative Fuel Vehicle Eligible            204
## 5 Eligibility unknown as battery range has not been researched              0
## 6                      Clean Alternative Fuel Vehicle Eligible             93
##   Base.MSRP Legislative.District DOL.Vehicle.ID            Vehicle.Location
## 1         0                   22      263239938  POINT (-122.90787 46.9461)
## 2         0                   15      318160860 POINT (-120.42083 46.44779)
## 3         0                   46      184963586 POINT (-122.30253 47.72656)
## 4         0                   11      259426821 POINT (-122.17743 47.41185)
## 5         0                    1      208182236  POINT (-122.18384 47.8031)
## 6     31950                   21      209171889 POINT (-122.27734 47.83785)
##                                Electric.Utility X2020.Census.Tract
## 1                        PUGET SOUND ENERGY INC        53067010910
## 2                                    PACIFICORP        53077940008
## 3  CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA)        53033000700
## 4 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)        53033029306
## 5                        PUGET SOUND ENERGY INC        53061051922
## 6                        PUGET SOUND ENERGY INC        53061051928

#Introduction

This report analyzes the Electric Vehicle Population Data, a dataset of 264,628 records and 17 variables in which each record documents one electric vehicle registered in Washington State. The dataset includes details about each vehicle, such as make, model, model year, electric vehicle type, electric range, location, and electric utility. The key variables used here are:

Electric Vehicle Type — categorical: “Battery Electric Vehicle (BEV)” or “Plug-in Hybrid Electric Vehicle (PHEV)”

Electric Range — numeric: reported electric range in miles

Other variables (make, model year, county, etc.) are available for follow-up analyses. The dataset was obtained from the Washington State Department of Licensing (DOL). It can be accessed at the following link: https://catalog.data.gov/dataset/electric-vehicle-population-data

#Research Question

Is there a significant difference in the average electric range between Battery Electric Vehicles (BEV) and Plug-in Hybrid Electric Vehicles (PHEV)?

#Data Analysis

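In this section I clean the data by removing rows with a missing electric range or vehicle type, then produce EDA visualizations. Records with an electric range of 0 are kept, as the summary output in this section shows. I use dplyr's filter() and select() together with summary().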

#Remove missing values for Electric Range and Vehicle Type

clean_ev <- ev %>%
  filter(!is.na(Electric.Range),
         !is.na(Electric.Vehicle.Type))

#Select only the variables needed

ev_selected <- clean_ev %>%
  select(Electric.Vehicle.Type, Electric.Range)

summary(ev_selected)
##  Electric.Vehicle.Type Electric.Range  
##  Length:264624         Min.   :  0.00  
##  Class :character      1st Qu.:  0.00  
##  Mode  :character      Median :  0.00  
##                        Mean   : 41.71  
##                        3rd Qu.: 34.00  
##                        Max.   :337.00
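
The summary above reports a minimum, first quartile, and median of 0, so at least half of the retained records have a recorded range of 0. One way to see how those zeros split across vehicle types is sketched below. It is a minimal base R example that uses the six rows from the head(ev) preview as stand-in data; the names preview, zero_counts, and zero_share are illustrative and not part of the original analysis.

```r
# Stand-in data: the six rows shown by head(ev) earlier in this report.
preview <- data.frame(
  Electric.Vehicle.Type = c(
    "Plug-in Hybrid Electric Vehicle (PHEV)",
    "Plug-in Hybrid Electric Vehicle (PHEV)",
    "Battery Electric Vehicle (BEV)",
    "Battery Electric Vehicle (BEV)",
    "Battery Electric Vehicle (BEV)",
    "Battery Electric Vehicle (BEV)"
  ),
  Electric.Range = c(23, 16, 84, 204, 0, 93)
)

# Number of records with a recorded range of 0, per vehicle type.
zero_counts <- tapply(preview$Electric.Range == 0,
                      preview$Electric.Vehicle.Type,
                      sum)

# Share of records with a recorded range of 0, per vehicle type.
zero_share <- tapply(preview$Electric.Range == 0,
                     preview$Electric.Vehicle.Type,
                     mean)

print(zero_counts)
print(zero_share)
```

Replacing preview with ev_selected would apply the same tabulation to the full dataset. The results for the full data are not shown in this report.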

#Visualization

#Histogram of Electric Range
ggplot(ev_selected, aes(x = Electric.Range, fill = Electric.Vehicle.Type)) +
  # alpha must lie in [0, 1]; 0.5 keeps the overlapping bars visible
  geom_histogram(position = "identity", alpha = 0.5, bins = 25) +
  labs(title = "Distribution of Electric Range for BEVs vs PHEVs")

#Boxplot Comparing BEV and PHEV Range

ggplot(ev_selected, aes(x = Electric.Vehicle.Type, y = Electric.Range)) +
  geom_boxplot() +
  labs(title = "Electric Range Comparison: BEV vs PHEV")

#Interpretation from Visualization

The histogram and boxplot both show a clear difference in electric range between BEVs and PHEVs. BEVs have a distribution shifted to the right, with higher median values and a wider spread in range, indicating longer electric-only driving capability and more model variety. In contrast, PHEVs cluster tightly at lower electric ranges, typically between 10 and 40 miles, reflecting their design as hybrid vehicles that rely partially on gasoline. The limited overlap between the two groups visually suggests that BEVs tend to deliver more electric range than PHEVs, although the many records with a recorded range of 0 (the overall median is 0) should be kept in mind when reading the plots.

#Statistical Analysis

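To answer the research question, I compare the mean electric range of BEVs and PHEVs with a two-sample t-test for independent groups. R's t.test() defaults to Welch's version of the test, which does not assume equal variances in the two groups.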

#Hypotheses

Null Hypothesis (H₀): μ_BEV = μ_PHEV (There is no difference in average electric range between BEVs and PHEVs.)

Alternative Hypothesis (Hₐ): μ_BEV ≠ μ_PHEV (There is a difference in average electric range between BEVs and PHEVs.)
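
For reference, the test statistic used to evaluate these hypotheses can be written out explicitly. The form below is the Welch statistic with its approximate degrees of freedom, matching the "Welch Two Sample t-test" header in the output later in this report. The symbols for each group's sample mean, sample variance, and sample size are notation introduced here rather than taken from the report.

```latex
\[
t = \frac{\bar{x}_{\mathrm{BEV}} - \bar{x}_{\mathrm{PHEV}}}
         {\sqrt{\dfrac{s_{\mathrm{BEV}}^{2}}{n_{\mathrm{BEV}}}
              + \dfrac{s_{\mathrm{PHEV}}^{2}}{n_{\mathrm{PHEV}}}}},
\qquad
\nu \approx
\frac{\left(\dfrac{s_{\mathrm{BEV}}^{2}}{n_{\mathrm{BEV}}}
          + \dfrac{s_{\mathrm{PHEV}}^{2}}{n_{\mathrm{PHEV}}}\right)^{2}}
     {\dfrac{\left(s_{\mathrm{BEV}}^{2}/n_{\mathrm{BEV}}\right)^{2}}{n_{\mathrm{BEV}} - 1}
    + \dfrac{\left(s_{\mathrm{PHEV}}^{2}/n_{\mathrm{PHEV}}\right)^{2}}{n_{\mathrm{PHEV}} - 1}}
\]
```

Because the degrees of freedom are estimated from the data, they are generally not a whole number and never exceed n_BEV + n_PHEV − 2.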

#Separate groups

bev <- clean_ev %>% filter(Electric.Vehicle.Type == "Battery Electric Vehicle (BEV)") %>% pull(Electric.Range)
phev <- clean_ev %>% filter(Electric.Vehicle.Type == "Plug-in Hybrid Electric Vehicle (PHEV)") %>% pull(Electric.Range)

#Sample sizes

n_bev <- length(bev)
n_phev <- length(phev)
data.frame(group = c("BEV", "PHEV"), n = c(n_bev, n_phev))
##   group      n
## 1   BEV 210575
## 2  PHEV  54049
#T-test

t.test(bev, phev, alternative = "two.sided")
## 
##  Welch Two Sample t-test
## 
## data:  bev and phev
## t = 62.357, df = 244421, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  12.35578 13.15772
## sample estimates:
## mean of x mean of y 
##  44.31870  31.56195

#Conclusion

The results show a statistically significant difference between the two groups (t = 62.357, df = 244,421, p < 2.2e-16). Because the p-value is far below α = 0.05, we reject the null hypothesis and conclude that the true mean electric ranges of BEVs and PHEVs are not equal. BEVs have a higher average recorded range (44.32 miles) than PHEVs (31.56 miles), a difference of about 12.76 miles. The 95% confidence interval for the difference in means (12.36 to 13.16 miles) does not include zero, which is consistent with rejecting the null hypothesis. Both means include records with a recorded range of 0, so they describe the ranges as recorded in the dataset.
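
The reported confidence interval can be cross-checked from the printed test output alone. The sketch below assumes only the four values copied from the t.test() output above; the variable names are illustrative. Since t equals the difference in means divided by its standard error, the standard error can be recovered as the difference divided by t.

```r
# Values copied from the t.test() output in this report.
mean_bev  <- 44.31870
mean_phev <- 31.56195
t_stat    <- 62.357
df        <- 244421

diff_means <- mean_bev - mean_phev   # difference in sample means
se_diff    <- diff_means / t_stat    # standard error implied by t = diff / SE
t_crit     <- qt(0.975, df)          # two-sided 95% critical value

# 95% confidence interval for the difference in means.
ci <- diff_means + c(-1, 1) * t_crit * se_diff

print(round(diff_means, 2))
print(round(ci, 2))
```

Rounded to two decimals, the recomputed bounds agree with the 12.36 to 13.16 mile interval in the test output. Small differences in later decimals are expected because the inputs are themselves rounded.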