Analysis of Electric Vehicles

The dataset contains records on electric vehicles, including model year, make, electric range, location, and MSRP. We’ll start by filtering the data to model years 2010 through 2024, a period of significant EV adoption. I have provided a line chart, bar chart, nested pie chart, waterfall chart, and heatmap. Together, these five charts tell a story of EV adoption trends, brand competition, range capabilities, regional preferences, and pricing evolution. They give a broad view of the EV market over this period, highlighting the advancements, challenges, and consumer trends within this transformative sector.


import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

# Load the dataset
file_path = "/Users/jordyn/Documents/Python Projects/Electric_Vehicle_Population_Data.csv"  # Update this path if needed
ev_data = pd.read_csv(file_path)

# Filter data for model years between 2010 and 2024
filtered_data = ev_data[(ev_data['Model Year'] >= 2010) & (ev_data['Model Year'] <= 2024)]

# 1. Time-based Analysis: Count of EVs by Model Year within the filtered range
model_year_counts_filtered = filtered_data['Model Year'].value_counts().sort_index()

# 2. Distribution Analysis: Top 10 most popular EV Makes within the filtered range
make_counts_filtered = filtered_data['Make'].value_counts().head(10)

# Formatter function to add commas to the y-axis
def comma_formatter(x, pos):
    return f"{int(x):,}"

In the first visualization, we examine the count of electric vehicles by model year. This line chart helps us understand how adoption has grown over time and highlights the model years with the most vehicles. Observing how adoption accelerates or decelerates can point to key periods of growth, potentially linked to external factors like government subsidies or technological breakthroughs. This trend helps stakeholders understand the pace of EV adoption and the possible impact of strategic incentives on consumer behavior.

# Visualization 1: Line Chart for EV Adoption by Year (2010-2024) with Max Data Point Label
plt.figure(figsize=(10, 5))
plt.plot(model_year_counts_filtered.index, model_year_counts_filtered.values, marker='o')
plt.title('Number of Electric Vehicles by Model Year (2010-2024)')
plt.xlabel('Model Year')
plt.ylabel('Number of Vehicles')
plt.gca().yaxis.set_major_formatter(FuncFormatter(comma_formatter))
plt.grid(True)

# Find the max data point and annotate it
max_value_filtered = model_year_counts_filtered.max()
max_year_filtered = model_year_counts_filtered.idxmax()
plt.annotate(f'{max_value_filtered:,}', xy=(max_year_filtered, max_value_filtered),
             xytext=(max_year_filtered, max_value_filtered + 1000),
             ha='center', arrowprops=dict(arrowstyle='->', color='black'))

plt.tight_layout()
plt.show()
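
To put numbers on "accelerates or decelerates", one option is to compute the year-over-year percentage change of the model-year counts. The sketch below is only an illustration: the helper name `year_over_year_growth` and the toy counts are hypothetical and not part of the original analysis. In the notebook, `model_year_counts_filtered` could be passed in instead.

```python
import pandas as pd


def year_over_year_growth(counts: pd.Series) -> pd.DataFrame:
    """Return the counts alongside their year-over-year percentage change."""
    counts = counts.sort_index()
    return pd.DataFrame({
        "count": counts,
        # pct_change compares each year with the previous row; the first year is NaN
        "yoy_pct": counts.pct_change().mul(100).round(1),
    })


# Hypothetical toy counts, NOT taken from the dataset
toy_counts = pd.Series({2019: 100, 2020: 150, 2021: 300, 2022: 270})
growth = year_over_year_growth(toy_counts)
print(growth)
```

A negative `yoy_pct` marks a model year with fewer vehicles than the one before it, which is the kind of deceleration the line chart can only show visually.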

The bar chart below shows the top 10 most popular EV makes, illustrating which brands are most dominant in the EV market. Understanding which brands dominate can reveal consumer preferences, the impact of brand reputation, and each brand’s ability to capture market share. It can also guide potential buyers or investors looking for the most popular EV brands and suggest which brands are driving the transition to electric mobility.

# Visualization 2: Bar Chart for Top 10 EV Makes (2010-2024) with Data Labels
plt.figure(figsize=(10, 5))
bars_filtered = make_counts_filtered.plot(kind='bar')
plt.title('Top 10 Electric Vehicle Makes (2010-2024)')
plt.xlabel('Make')
plt.ylabel('Number of Vehicles')
plt.xticks(rotation=45)
plt.gca().yaxis.set_major_formatter(FuncFormatter(comma_formatter))

# Add data labels on top of each bar
for bar in bars_filtered.patches:
    height = bar.get_height()
    # Bar heights are floats, so cast to int to avoid labels like "1,234.0"
    bars_filtered.annotate(f'{int(height):,}', xy=(bar.get_x() + bar.get_width() / 2, height),
                           xytext=(0, 5), textcoords="offset points", ha='center', va='bottom')

plt.tight_layout()
plt.show()
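
Raw counts show the ranking, but "market share" is easier to read as a percentage of all vehicles. The following is a minimal sketch, assuming a Series of make names like `filtered_data['Make']`. The helper `make_share` and the toy values are hypothetical.

```python
import pandas as pd


def make_share(makes: pd.Series, top_n: int = 10) -> pd.Series:
    """Percentage of all rows held by each of the top_n most common makes."""
    # normalize=True turns counts into fractions of the total row count
    share = makes.value_counts(normalize=True).mul(100)
    return share.head(top_n).round(1)


# Hypothetical toy data, NOT taken from the dataset
toy_makes = pd.Series(["A"] * 6 + ["B"] * 3 + ["C"] * 1)
shares = make_share(toy_makes, top_n=2)
print(shares)
```

Because the percentages are computed before taking the top `top_n`, they are shares of the whole filtered dataset rather than shares of the top 10 only.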

This nested pie chart looks at EV types and cities together. The inner ring shows the split of EV types (like BEVs and PHEVs) across the top 10 cities combined, and the outer ring shows how many of those vehicles are in each city. Because the two rings are computed independently, the chart gives overall context rather than a type breakdown for each city. Regional preferences for different types of EVs can reflect local policies, infrastructure, or consumer priorities. For instance, a city with many BEVs might have strong support for charging infrastructure, while a preference for PHEVs might indicate areas where charging infrastructure is still developing. This kind of view is significant for urban planners and policymakers who need to address specific needs or preferences in their regions.

# Visualization 3: Nested Pie Chart
# Group by City to get total counts and select the top 10 cities
top_cities = filtered_data['City'].value_counts().nlargest(10).index
filtered_top_cities_data = filtered_data[filtered_data['City'].isin(top_cities)]

# Group by Electric Vehicle Type to get counts for the inner ring
ev_type_counts = filtered_top_cities_data['Electric Vehicle Type'].value_counts()

# Re-group by State and City for the nested pie chart with only top 10 cities
state_city_counts_top = filtered_top_cities_data.groupby(['State', 'City']).size().reset_index(name='Count')

# Labels and sizes for the inner and outer rings
# Inner Ring: Electric Vehicle Types
inner_labels = ev_type_counts.index
inner_sizes = ev_type_counts.values

# Outer Ring: Top 10 Cities within each State
outer_labels = state_city_counts_top.apply(lambda x: f"{x['City']} ({x['State']})", axis=1)
outer_sizes = state_city_counts_top['Count']

# Plotting the nested pie chart
fig, ax = plt.subplots(figsize=(10, 10))

# Inner ring (electric vehicle types)
inner_pie = ax.pie(inner_sizes, labels=inner_labels, radius=1,
                   wedgeprops=dict(width=0.3, edgecolor='w'), labeldistance=0.8)

# Add text labels (counts) to each segment in the inner ring
for i, pie_wedge in enumerate(inner_pie[0]):
    # Get the angle of the wedge to place the label correctly
    angle = (pie_wedge.theta2 - pie_wedge.theta1) / 2 + pie_wedge.theta1
    x = 0.7 * np.cos(np.radians(angle))  # Adjust position for inner circle radius
    y = 0.7 * np.sin(np.radians(angle))
    ax.text(x, y, f'{inner_sizes[i]:,}', ha='center', va='center', fontsize=10, color='black')

# Outer ring (top 10 cities within states)
ax.pie(outer_sizes, labels=outer_labels, radius=1.3, wedgeprops=dict(width=0.3, edgecolor='w'))
plt.title("Electric Vehicle Count by Type and Top 10 Cities (2010-2024)", y=1.1)
plt.tight_layout()
plt.show()
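
To see whether individual cities lean toward one EV type, a cross-tabulation of city against type is one possible complement to the pie chart. This is a sketch under assumptions: the helper `type_mix_by_city` is hypothetical, the column names are the ones used in the code above, and the toy rows are made up for illustration.

```python
import pandas as pd


def type_mix_by_city(df: pd.DataFrame, city_col: str = "City",
                     type_col: str = "Electric Vehicle Type",
                     top_n: int = 10) -> pd.DataFrame:
    """Percentage of each EV type within each of the top_n cities by vehicle count."""
    top = df[city_col].value_counts().nlargest(top_n).index
    subset = df[df[city_col].isin(top)]
    # normalize="index" makes every row (city) sum to 100%
    mix = pd.crosstab(subset[city_col], subset[type_col], normalize="index")
    return mix.mul(100).round(1)


# Hypothetical toy data, NOT taken from the dataset
toy = pd.DataFrame({
    "City": ["X", "X", "X", "X", "Y", "Y", "Z"],
    "Electric Vehicle Type": ["BEV", "BEV", "BEV", "PHEV", "BEV", "PHEV", "BEV"],
})
mix = type_mix_by_city(toy, top_n=2)
print(mix)
```

Each row of the result is one city, so a row with a much higher BEV percentage than the others would point to the kind of regional preference discussed above.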

The waterfall chart below compares average electric range across the 13 makes with the highest average range. Each bar is stacked on the running total of the bars before it, so the height of a bar is that make's average range. A longer average range might indicate a focus on battery efficiency and range, appealing to consumers with long commutes or those needing reliable long-distance travel options. This comparison highlights which manufacturers are leading in range, a critical factor for many EV buyers. Range anxiety is a common concern, and brands with higher average ranges can better appeal to consumers with range requirements. This chart can help stakeholders see where different brands fall on the spectrum of battery efficiency and range capability.

You can see that the most popular car makes (Tesla, Nissan) are not the makes with the most electric range. This suggests that electric range is not the main deciding factor for consumers. One possible explanation is that consumers with long commutes do not select EVs as their car of choice.

# Visualization 4: Waterfall Chart
# Group by Make and calculate the average Electric Range for each Make, then select the top 13
range_by_make = filtered_data.groupby('Make')['Electric Range'].mean().nlargest(13).sort_values(ascending=False)

# Initialize lists for Waterfall Chart data
cumulative_sum = 0
starts = []
ends = []
makes = []

# Loop through each Make's contribution to calculate cumulative values
for make, value in range_by_make.items():
    starts.append(cumulative_sum)
    cumulative_sum += value
    ends.append(cumulative_sum)
    makes.append(make)

# Plotting the Waterfall Chart
fig, ax = plt.subplots(figsize=(12, 6))

# Calculate the changes as differences between starts and ends
changes = np.array(ends) - np.array(starts)

# Create the waterfall bars
colors = ['green' if change > 0 else 'red' for change in changes]
bars = ax.bar(makes, changes, bottom=starts, color=colors)

# Format y-axis to include commas
ax.yaxis.set_major_formatter(FuncFormatter(lambda x, pos: f'{int(x):,}'))
ax.set_title("Average Electric Range Contribution for Top 13 Makes (2010-2024)")
ax.set_xlabel("Make")
ax.set_ylabel("Average Electric Range (miles)")

# Rotate x-axis labels for readability
plt.xticks(rotation=45, ha='right')
# Add data labels for each bar
for bar, change in zip(bars, changes):
    ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + bar.get_y(), 
            f'{int(change):,}', ha='center', va='bottom')

plt.tight_layout()
plt.show()
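
The observation that the most popular makes are not the longest-range makes can be checked numerically by putting vehicle counts and average range side by side. The sketch below is only an illustration: `popularity_vs_range` is a hypothetical helper, the toy rows are made up, and the rank correlation is computed by ranking both columns and taking their Pearson correlation (which is the definition of Spearman's coefficient).

```python
import pandas as pd


def popularity_vs_range(df: pd.DataFrame, make_col: str = "Make",
                        range_col: str = "Electric Range"):
    """Return per-make vehicle counts and average range, plus their rank correlation."""
    summary = df.groupby(make_col)[range_col].agg(vehicles="count", avg_range="mean")
    # Spearman correlation = Pearson correlation of the ranks
    rank_corr = summary["vehicles"].rank().corr(summary["avg_range"].rank())
    return summary.sort_values("vehicles", ascending=False), rank_corr


# Hypothetical toy data, NOT taken from the dataset
toy = pd.DataFrame({
    "Make": ["P", "P", "P", "Q", "Q", "R"],
    "Electric Range": [100, 100, 100, 200, 200, 300],
})
summary, rank_corr = popularity_vs_range(toy)
print(summary)
print(f"Rank correlation: {rank_corr:.2f}")
```

A correlation near zero or below would be consistent with range not driving popularity. If the range column contains placeholder zeros (an assumption worth checking in the data), they would pull averages down and should probably be filtered out first.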

My final visualization is a heatmap showing the average MSRP of EVs by model year. This chart helps us understand pricing trends and changes over time in the EV market. Warmer colors for high prices and cooler colors for low prices can indicate trends in affordability or premium pricing. Tracking the changes in average MSRP over time can show how the affordability of EVs has evolved. Rising prices might point to an increase in premium features or newer technology, while decreasing prices could indicate that EVs are becoming more accessible to the general public. This chart can help policymakers and manufacturers gauge whether EVs are becoming more accessible or remain a premium option.

*I also want to point out an interesting outlier in the data: 2015, where there was only one electric vehicle entry. The extreme jump in average price that year is due to the Porsche plug-in hybrid race car, with a price tag of $845K.

# Visualization 5: Heat Map

# Filter data for model years between 2010 and 2024 and exclude rows with Base MSRP = $0.
# A separate name is used so the earlier filtered_data is not overwritten.
msrp_data = ev_data[(ev_data['Model Year'] >= 2010) &
                    (ev_data['Model Year'] <= 2024) &
                    (ev_data['Base MSRP'] > 0)]

# Group by Model Year to calculate the average MSRP for each year
avg_msrp_per_year = msrp_data.groupby('Model Year')['Base MSRP'].mean()

# Identify the most expensive EV in 2015 and retrieve its make
most_expensive_2015_row = msrp_data[msrp_data['Model Year'] == 2015].nlargest(1, 'Base MSRP')
most_expensive_2015_price = most_expensive_2015_row['Base MSRP'].values[0]
most_expensive_2015_make = most_expensive_2015_row['Make'].values[0]

# Convert to a DataFrame for heat map plotting
avg_msrp_df = avg_msrp_per_year.to_frame().T  # Transpose to have years on x-axis

# Plotting the heat map
fig, ax = plt.subplots(figsize=(10, 2))
cax = ax.imshow(avg_msrp_df, cmap='coolwarm', aspect='auto')

# Adding titles and labels
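# Build the year range in the title from the data so it matches the years actually plotted
plt.title(f"Average Base MSRP by Model Year ({avg_msrp_df.columns.min()}-{avg_msrp_df.columns.max()})")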
plt.xlabel("Model Year")
plt.ylabel("Average MSRP")

# Setting ticks and labels for years
ax.set_xticks(range(len(avg_msrp_df.columns)))
ax.set_xticklabels(avg_msrp_df.columns, rotation=45, ha='right')

# Hide y-axis ticks as there's only one row
ax.set_yticks([])
# Adding color bar to show the scale of MSRP
cbar = fig.colorbar(cax, ax=ax, format='$%.0f')
cbar.set_label('Average Base MSRP ($)')

# Annotate each year with its average MSRP value, rotated vertically
for i, year in enumerate(avg_msrp_df.columns):
    avg_price = avg_msrp_df.iloc[0, i]
    ax.text(i, 0, f'${int(avg_price):,}', ha='center', va='center', color='black', fontsize=10, rotation=90)

# Special annotation for the most expensive EV in 2015 with Make and MSRP
if 2015 in avg_msrp_df.columns:
    i_2015 = avg_msrp_df.columns.get_loc(2015)
    ax.text(i_2015, 0.4, f'{most_expensive_2015_make}\nHighest: ${int(most_expensive_2015_price):,}', 
            ha='center', va='center', color='red', fontsize=10, fontweight='bold', rotation=0)

plt.tight_layout()
plt.show()
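
Since a single expensive entry can dominate a yearly average, it may help to look at the number of priced entries and the median next to the mean. The sketch below assumes the same `Model Year` and `Base MSRP` columns used above. The helper `msrp_summary_by_year` and the toy rows are hypothetical, with one $845K row included only to mimic the kind of outlier described for 2015.

```python
import pandas as pd


def msrp_summary_by_year(df: pd.DataFrame, year_col: str = "Model Year",
                         price_col: str = "Base MSRP") -> pd.DataFrame:
    """Entry count, mean, and median MSRP per model year, ignoring $0 placeholders."""
    priced = df[df[price_col] > 0]
    return priced.groupby(year_col)[price_col].agg(
        entries="count", mean_msrp="mean", median_msrp="median"
    )


# Hypothetical toy data, NOT taken from the dataset
toy = pd.DataFrame({
    "Model Year": [2014, 2014, 2014, 2015, 2016, 2016, 2016],
    "Base MSRP": [30000, 40000, 0, 845000, 30000, 50000, 100000],
})
msrp_summary = msrp_summary_by_year(toy)
print(msrp_summary)
```

A year with very few entries, or a mean far above its median, is a sign that the heatmap color for that year reflects one or two vehicles rather than a market-wide price level.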



These visualizations provide a comprehensive look at the EV market from 2010 to 2024, highlighting trends in adoption, brand popularity, vehicle range, regional preferences, and pricing evolution. The electric vehicle industry is moving towards broader consumer acceptance and innovation, paving the way for a sustainable future.