library(data.table)
ev_2 <- fread("C:\\Users\\eplas\\OneDrive\\Desktop\\ev2.csv")
state_counts <- ev_2[, .N, by = .(State)]
total_cars <- sum(state_counts$N)
state_counts[, Proportion := N/total_cars]
state_counts
We can visualize this data using ggplot2.
ggplot(state_counts, aes(x=State, y= Proportion, fill = State))+
geom_bar(stat = "identity")+
theme_minimal()+
labs(title= "Proportional Distribution of EVs by State",
x = "State",
y="Proportion of Total EVs",
fill = "State")
My video presentation was getting a bit long, but I wanted to add one more bar chart to better visualize the dataset in R.
First, I filter out WA, since such a high concentration of the EVs in this dataset is located there that it dwarfs every other state.
state_counts_filtered <- state_counts[State != "WA"]
state_counts_filtered
Now I would like to create a bar chart of this data.
ggplot(state_counts_filtered, aes(x= State, y= Proportion, fill = State)) +
geom_bar(stat = "identity")+
theme_minimal()+
labs(title = "Proportional Distribution of EVs by State (excluding WA)",
x = "State",
y= "Proportion of Total EVs",
fill = "State")
We can see that, after WA, CA has the highest number of EVs in this dataset.
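One thing to keep in mind: the Proportion column in the filtered table is still each state's share of all EVs, WA included, because it was computed before filtering. If the goal were instead to compare the remaining states against each other, the shares could be recomputed after dropping WA. Below is a minimal sketch of that idea. The table toy_counts and its values are made up for illustration, and the column name Proportion_excl_WA is my own; neither comes from the dataset.

```r
library(data.table)

# Toy counts standing in for state_counts (values are invented for illustration)
toy_counts <- data.table(
  State = c("WA", "CA", "VA", "MD"),
  N     = c(900L, 50L, 30L, 20L)
)

# Share of all EVs, computed the same way as Proportion in the analysis above
toy_counts[, Proportion := N / sum(N)]

# Drop WA (this subset is a new table), then recompute shares
# so that the remaining states sum to 1
toy_filtered <- toy_counts[State != "WA"]
toy_filtered[, Proportion_excl_WA := N / sum(N)]

toy_filtered
```

Plotting Proportion_excl_WA instead of Proportion would change the y-axis scale but not the ranking of the states, so the conclusion about CA would stay the same.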