Introduction

Transportation is currently the largest source of greenhouse gas emissions in the United States, and electrifying the transportation sector is widely recognized as one of the most effective strategies for reducing these emissions. Invented as early as the mid-19th century, electric vehicles (EVs) have become a leading option for cutting carbon emissions and promoting sustainable growth. In recent years, EV adoption has grown rapidly worldwide.

In the US, EV registrations reached a record market share of 1.8% in 2020. Regionally, New York City’s share of 2% is driving the Northeast. According to the 2020 Renewables on the Rise report, New York State ranks #2 in the United States for EVs sold and for available EV charging infrastructure, with more than 53,000 registered EVs on the road and an estimated 1,750 public EV charging stations.

Using New York State as an example, our project analyzes the historical trend and current status of the EV market and looks ahead to the future of the EV industry. The project consists of three parts. First, we analyze the historical trend of the EV industry in New York State, together with the contributing factors. Second, we map the current distribution of EV registrations and charging stations in New York State at the county level. Finally, we employ sentiment analysis and LDA topic modeling to explore customers’ needs and expectations regarding EVs.

Part 1: Exploring the development of the EV market and its major drivers

In this part, we visualize the historical development of EVs in New York State and analyze the major drivers of their rapid growth. We start with a historical overview of EV registrations.

The rapid growth of EV registration in New York State

library(readxl)
library(dplyr)
library(stringi)
library(highcharter)

xl_data <- "EV-Registration-Tables.xlsx"

# Monthly registrations; the last row of the sheet is dropped
EV_time <- read_excel(path = xl_data, sheet = 5)
EV_time <- head(EV_time, -1)

# Axis label (month name plus two-digit year) and a running total over all months
EV_time <- EV_time %>%
  mutate(Time = paste(`Month Name`, stri_sub(Year, 3, 4), sep = " '"),
         cumulative_EVs = cumsum(`Total EVs`))

p1 <- EV_time %>%
  hchart("area", hcaes(x = Time, y = cumulative_EVs)) %>%
  hc_title(text = "Original EV Registration in New York State (2011-2019)") %>%
  hc_subtitle(text = "Source: New York State Energy Research and Development Authority") %>%
  hc_yAxis(title = list(text = "Cumulative EV registrations"))
p1

The above graph shows the cumulative number of EV registrations in New York State since 2011. Registrations grew 100-fold between December 2011 and December 2015 and have continued to grow steadily since. New Yorkers have shown strong and sustained interest in EVs.
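The growth multiple quoted above can be checked directly from the monthly series. The following is a minimal sketch, assuming a data frame with a `Time` label and a `Total EVs` column as in the chunk above; the `growth_multiple()` helper and the toy numbers are hypothetical illustrations, not values from the NYSERDA table.

```r
# Hypothetical helper: ratio of the cumulative total at `end` to the
# cumulative total at `start`, where both are labels in the Time column.
growth_multiple <- function(df, start, end) {
  cum <- cumsum(df[["Total EVs"]])
  cum[df$Time == end] / cum[df$Time == start]
}

# Toy data for illustration only (not NYSERDA figures).
toy <- data.frame(
  Time = c("December '11", "January '12", "February '12", "March '12"),
  `Total EVs` = c(10, 20, 30, 40),
  check.names = FALSE
)

# Cumulative totals are 10, 30, 60, 100, so the multiple is 100 / 10
growth_multiple(toy, start = "December '11", end = "March '12")
```

Applied to `EV_time`, the same call with the labels for December 2011 and December 2015 would reproduce the figure cited in the text, provided the labels match the `Time` column exactly.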

Major drivers of New York State EV growth

Next, we’ll explore several key driving factors of rapid EV growth in New York State.

1st factor: Growing population

New York State’s population kept growing during the first half of the 2010s, which created more demand for automobiles, including EVs.

library(ggplot2)
library(plotly)

population <- read_excel("ny_pop.xlsx")

# Population is a stock, so each year's total is plotted directly
# rather than summed across years
p3 <- ggplot(population,
             aes(x = year, y = total_population / 1e6,
                 text = paste("Year: ", year,
                              "<br>Population (millions): ", round(total_population / 1e6, 2)))) +
  geom_bar(stat = "identity", alpha = .6, width = 0.8, fill = "#0882c7") +
  labs(title = "New York State Population", subtitle = "Source: U.S. Census Bureau",
       x = "Year", y = "Population (in millions)") +
  theme_classic() +
  theme(plot.title = element_text(size = 12, face = "bold"),
        axis.title.x = element_text(size = 10),
        axis.title.y = element_text(size = 10)) +
  scale_x_continuous(breaks = 2010:2019)

ggplotly(p3, tooltip = "text")

2nd factor: Carbon emission reduction and policy support

Over the past three decades, gasoline and diesel have remained the top two sources of greenhouse gas emissions in the U.S. transportation sector. In response, New York State has issued a package of policies to accelerate transportation electrification. For example, New York was one of the first states in the U.S. to sign a Zero Emission Vehicle Memorandum of Understanding (ZEV MOU) to enact policies that will ensure the deployment of 3.3 million light-duty ZEVs by 2025. Furthermore, under New York State’s Charge NY initiative, EV buyers can receive a rebate of up to $2,000 for new car purchases or leases, together with a federal tax credit of up to $7,500. In addition, automotive manufacturers are offering more EV models for consumers to choose from, ranging from cost-effective to luxury EVs.

Therefore, policy support and the increasing variety of EVs make this a good time for consumers to purchase an EV.

emission <- read_excel("ny_emission.xlsx")

# Grouped bars: one bar per emission type within each year
p4 <- ggplot(emission,
             aes(x = as.factor(year), y = inventory, fill = type,
                 text = paste("Year: ", year,
                              "<br>Emissions inventory: ", inventory,
                              "<br>Type: ", type))) +
  geom_bar(position = "dodge", stat = "identity", alpha = .6, width = 0.8) +
  labs(title = "Transportation Sector Greenhouse Gas Emissions Inventory, 1990–2016",
       subtitle = "Source: New York State Energy Research and Development Authority",
       x = "Year", y = "Emissions Inventory (MMtCO2e)") +
  theme_bw() +
  theme(plot.title = element_text(size = 10, face = "bold"),
        axis.title.x = element_text(size = 9),
        axis.title.y = element_text(size = 9))

# Render interactively so the tooltip text defined above is used
ggplotly(p4, tooltip = "text")

3rd factor: Increasing number and variety of EV models offered

Thanks to advances in vehicle design and battery technology, automotive manufacturers now offer consumers a wide range of EVs. It is estimated that the number of battery electric (BEV) and plug-in hybrid (PHEV) passenger vehicle models available to U.S. consumers will increase from 60 to 83 between 2020 and 2022. Meanwhile, numerous manufacturers have publicly signaled their investment plans and commitments to an electric-vehicle future.

In addition, EVs have become more affordable because of dramatic reductions in the cost of batteries. The cost of battery packs has fallen from approximately $1,000/kilowatt-hour (kWh) in 2010 to approximately $156/kWh in 2019. These factors all contribute to the growth of EV sales.
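To put the battery cost figures above in perspective, the following minimal sketch works out the total decline and the implied average annual rate of change. It uses only the two costs quoted in the text and assumes a constant compound rate between 2010 and 2019, which is a simplification; the variable names are illustrative.

```r
# Battery pack costs quoted in the text (USD per kWh)
cost_2010 <- 1000
cost_2019 <- 156
years     <- 2019 - 2010

# Total decline over the period, as a fraction of the 2010 cost
total_decline <- 1 - cost_2019 / cost_2010

# Implied constant annual rate of change (compound annual growth rate);
# negative because costs fell
annual_rate <- (cost_2019 / cost_2010)^(1 / years) - 1

round(100 * total_decline, 1)  # 84.4 (percent)
round(100 * annual_rate, 1)    # percent per year
```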

EV_make <- read_excel(path = xl_data, sheet = 3)
EV_make <- head(EV_make, -1)
EV_make <- mutate(EV_make, Name = paste(Make, Model))

# Treemap tiles are sized and colored by the number of registrations
p5 <- EV_make %>%
  hchart("treemap", hcaes(x = Name, value = Registrations, color = Registrations)) %>%
  hc_title(text = "EV make-models in New York State by popularity") %>%
  hc_subtitle(text = "Source: New York State Energy Research and Development Authority")
p5

The above treemap shows the wide variety of EV models registered in New York State. There are 66 EV models (PHEV and BEV) from 32 automotive manufacturers available to U.S. consumers. The Tesla Model 3 and Toyota Prius Prime are by far the most popular among New York consumers, followed by the Tesla Model S, Chevrolet Volt, Ford Fusion Energi, and Tesla Model X.

In addition, New York consumers prefer domestic brands (Tesla, Chevrolet, Ford) the most, followed by Japanese brands (Toyota, Honda, and Nissan). Except for the two luxury EV models from Tesla (Model S and Model X), 8 of the top 10 most popular EVs are priced under $50,000 (MSRP). This suggests that affordability and energy efficiency are key factors in EV purchases for New York consumers.

metric <- read_excel("ev_info.xlsx")

# Price versus range, with each model drawn as a label colored by brand
p6 <- ggplot(metric, aes(x = range, y = MSRP / 1000, label = type)) +
  # geom_point(aes(size = cost_per_mile), alpha = .6) +
  geom_text(aes(color = factor(brand)), alpha = 0.6, vjust = -0.2, check_overlap = TRUE, size = 3.5) +
  geom_label() +
  labs(title = "Features of EV by Make-models", subtitle = "Source: Visual Capitalist",
       x = "Range on a Single Charge (miles)", y = "Manufacturer's Suggested Retail Price ($k)") +
  theme_bw() +
  theme(plot.title = element_text(size = 12, face = "bold"),
        axis.title.x = element_text(size = 9),
        axis.title.y = element_text(size = 9))
ggplotly(p6)

The above graph offers more detailed make-model information on the current U.S. EV market. In terms of affordability, there are six EV models available for under $50,000 (MSRP) with a driving range of up to 250 miles. EV models from Japanese and Korean carmakers (Nissan, Hyundai) place the most emphasis on affordability: all 5 EV models from these manufacturers are under $40,000 (MSRP). Even more models will have a net cost of under $50,000 once current federal and state incentives and rebates are factored in.

Part 2: Mapping the locations of EV charging stations in New York State

In this part, we map the current distribution of EV registrations and charging stations in New York State to examine whether charging-station availability is related to EV sales. Our working assumption is that the number of charging stations is positively associated with EV sales. Both the charging station and registration data were downloaded from the New York State Energy Research and Development Authority and were last updated in 2021.

library(leaflet)

charging <- read.csv("EV_Charging_Stations.csv")

charging1 <- charging %>%
  select(City, ZIP, Latitude, Longitude, EV.Connector.Types, Access.Days.Time)

# Base map centered on New York State
m <- leaflet(charging1) %>%
  addTiles('http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png') %>%
  setView(-75.3167, 43.09287, zoom = 6)

# Popup text shown when a station is clicked
content <- paste("City name:", charging1$City, "<br/>",
                 "Zipcode:", charging1$ZIP, "<br/>",
                 "EV Connector Types:", charging1$EV.Connector.Types, "<br/>",
                 "Access Days Time:", charging1$Access.Days.Time, "<br/>")

# One circle per station, plus clustered markers that expand on zoom
m2 <- m %>% addCircles(popup = content)
mclust <- m2 %>% addCircleMarkers(popup = content, clusterOptions = markerClusterOptions())
mclust

The map above shows that EV charging stations cover much of New York State, with most stations concentrated in the southeastern part of the state. The clusters of charging stations closely follow population density: bigger cities have more charging stations. New York City has the most charging stations, followed by other large cities such as Albany, Buffalo, Rochester, Yonkers, and Syracuse. Next, let’s further explore the distribution within NYC.

nybb <- readOGR("nybb_21a", layer="nybb")
# Keep stations whose City field names a borough (or Long Island City)
# and map each value to its borough name
charging2 <- charging1 %>%
  filter(City %in% c("New York", "Bronx", "Queens", "Brooklyn", "Staten Island", "Long Island City")) %>%
  rename(BoroName = City) %>%
  mutate(BoroName = as.character(BoroName),
         BoroName = recode(BoroName, "New York" = "Manhattan", "Long Island City" = "Queens"))

# Number of stations per borough
charging3 <- charging2 %>%
  count(BoroName)

# Match counts to the shapefile by borough name rather than by row position,
# since the two tables are not guaranteed to list boroughs in the same order
nybb$number_of_charging <- charging3$n[match(nybb$BoroName, charging3$BoroName)]

p7 <- tm_shape(nybb) +
  tm_fill("number_of_charging", title = "Number of charging stations in NYC", style = "pretty") +
  tm_borders(alpha = 0.5) +
  tm_style("white")

p7

The above map is a choropleth of EV charging stations across New York City’s five boroughs. There are 326 EV charging stations in Manhattan, followed by 75 in Brooklyn, 21 in Queens, 19 in Staten Island, and 14 in the Bronx. Drivers in Manhattan have the most charging options, since the borough has the most charging stations available in the state.

nyzip <- read_excel("ny_zipcode.xlsx")

# ZIP code centroids, with the ZIP stored as text so it can be used as a join key
nyzip <- nyzip %>%
  select(Zip, Latitude, Longitude)
nyzip$Zip <- as.character(nyzip$Zip)

EV_registration <- read_excel(path = xl_data, sheet = 2)
EV_registration <- head(EV_registration, -1)
EV_registration <- EV_registration %>% rename(Zip = `ZIP Code`)

# Attach registrations to every ZIP code; ZIP codes without a record get 0
total <- left_join(nyzip, EV_registration, by = "Zip")
total[is.na(total)] <- 0

mm <- leaflet(total) %>%
  addTiles('http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png') %>%
  setView(-75.3167, 43.09287, zoom = 6)

content2 <- paste("Zipcode:", total$Zip, "<br/>",
                  "Number of PHEV/EREV:", total$`PHEV/EREV`, "<br/>",
                  "Number of BEV:", total$BEV, "<br/>",
                  "Total EVs:", total$`Total EVs`, "<br/>")

m3 <- mm %>% addCircles(popup = content2)
mclust2 <- m3 %>% addCircleMarkers(popup = content2, clusterOptions = markerClusterOptions())
mclust2

The above map displays the distribution of EV registrations in New York State by ZIP code. The data were downloaded from the New York State Energy Research and Development Authority and were last updated in 2021. At first glance, EV registrations appear to be spread roughly evenly across the state, but on closer inspection they are concentrated in southeastern New York, the same pattern shown in the map of charging stations.

Comparing the two maps, the ratio of EVs to charging stations is relatively high in northeastern and central New York, where more EVs are registered but fewer charging stations are available. By contrast, the ratio is lowest in New York City.
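The comparison above is made by eye; it could also be quantified. The following is a minimal sketch, assuming station data with a `ZIP` column (one row per station) and registration data with `Zip` and `Total EVs` columns, as in the data frames used in this part. The `ev_station_ratio()` helper and the toy values are hypothetical and serve only as an illustration.

```r
library(dplyr)

# Hypothetical helper: EVs per charging station by ZIP code.
ev_station_ratio <- function(stations, registrations) {
  # Count stations in each ZIP code
  station_counts <- stations %>%
    mutate(Zip = as.character(ZIP)) %>%
    count(Zip, name = "n_stations")

  registrations %>%
    mutate(Zip = as.character(Zip)) %>%
    left_join(station_counts, by = "Zip") %>%
    mutate(
      n_stations = coalesce(n_stations, 0L),
      # ZIP codes without any station get NA rather than Inf
      evs_per_station = ifelse(n_stations > 0, `Total EVs` / n_stations, NA_real_)
    )
}

# Toy data for illustration only (not NYSERDA figures)
toy_stations <- data.frame(ZIP = c(10001, 10001, 14850))
toy_regs <- data.frame(
  Zip = c("10001", "14850", "12207"),
  `Total EVs` = c(50, 40, 30),
  check.names = FALSE
)

ratios <- ev_station_ratio(toy_stations, toy_regs)
ratios

# Rank correlation between station counts and registrations across ZIP codes
cor(ratios$n_stations, ratios$`Total EVs`, method = "spearman")
```

With the real data, `ev_station_ratio(charging, EV_registration)` would give a per-ZIP ratio that can be mapped or summarized by region, and the rank correlation offers a simple check of the assumption that charging stations and EV registrations are positively associated.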

