Objective
The objective of the original data visualisation was to highlight the production of renewable energy by US states between 1990 and 2017. Renewable energy is measured in megawatt hours (MWh) generated by the producers in each state. The original visualisation was posted in the Reddit community “dataisbeautiful”, which is a place for visual representations of data. The primary target audience appears to be the general public in North America, especially environmentalists and green activists.
The visualisation chosen had the following three main issues:
The title of the data visualisation was “How Green is USA”, which does not help viewers contextualise the visualisation. It fails to explain that the subject is the production of renewable energy by US states. Additionally, no subtitle or annotations were used to provide essential explanations or to emphasise key data points. The visualisation is also poorly labelled, as it has no legend describing what the hues/colours mean.
The colours had no stated association with the data, which made it difficult to draw conclusions. Although contrasting colours were used to compare the states, the individual hues were difficult to distinguish.
The type of graph chosen is not appropriate for the data. A regional map is usually used to show how a feature is distributed across regions. Using the wrong type of graph can mislead and confuse viewers. A bar chart is better suited to comparing the states over a period of time.
Reference
The following code was used to fix the issues identified in the original.
library(tidyverse)  # attaches tidyr, readr, dplyr and ggplot2
library(gganimate)  # animated transitions between years
library(gifski)     # GIF renderer used by gganimate
library(knitr)
## Preprocessing data - Cleaning and preparing data for visualization
energy <- read_csv("test.csv")
energy$State <- as.factor(energy$State)
energy$`Energy Type` <- as.factor(energy$`Energy Type`)
# One column per energy type ("Renewable", "Non Renewable")
energy <- energy %>% spread(key = `Energy Type`, value = `Generation (Mwh)`)
# Missing generation values are treated as 0
energy[is.na(energy)] <- 0
# Total generation per state and year
energy <- energy %>%
  group_by(State, Year) %>%
  summarise("Renewable Energy Generated (Mwh)" = sum(Renewable),
            "Non-Renewable Energy Generated (Mwh)" = sum(`Non Renewable`),
            .groups = "drop")
# Renewable share of total generation, in percent
energy <- energy %>%
  mutate("Share of Renewable Energy" = 100 * `Renewable Energy Generated (Mwh)` /
           (`Renewable Energy Generated (Mwh)` + `Non-Renewable Energy Generated (Mwh)`))
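## Using ggplot2 to visualize the data
# Rank states within each year by renewable share and keep the top 10.
# ties.method = "first" keeps ranks as distinct integers, so tied states
# never share a bar position and exactly one state has EnergyRank == 1.
Ranking <- energy %>%
  group_by(Year) %>%
  mutate(EnergyRank = rank(-`Share of Renewable Energy`, ties.method = "first"),
         Relative_Rank = `Renewable Energy Generated (Mwh)` /
           `Renewable Energy Generated (Mwh)`[EnergyRank == 1],
         Label = paste0(" ", round(`Share of Renewable Energy`, digits = 1))) %>%
  filter(EnergyRank <= 10) %>%
  ungroup()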
Graph_Animation <- ggplot(Ranking, aes(EnergyRank, group = State, color = as.factor(State), fill = as.factor(State))) +
  # Bars drawn as tiles centred at half their height
  geom_tile(aes(y = `Share of Renewable Energy` / 2,
                height = `Share of Renewable Energy`,
                width = 0.9), alpha = 0.9, color = "black") +
  # State name to the left of each bar
  geom_text(aes(y = 0, label = paste(State, " ")), vjust = 0.2, hjust = 1, size = 5) +
  # Share value at the end of each bar; size and hjust are fixed settings, so they sit outside aes()
  geom_text(aes(y = `Share of Renewable Energy`, label = Label), hjust = 0, size = 5) +
  coord_flip(clip = "off", expand = FALSE) +
  scale_y_continuous(labels = scales::comma) +
  scale_x_reverse() +
  guides(color = "none", fill = "none") +
theme(panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
panel.grid.major.x = element_line( size=.1, color="grey" ),
panel.grid.minor.x = element_line( size=.1, color="grey" ),
axis.line=element_blank(),
axis.text.y=element_blank(),
axis.text.x=element_blank(),
axis.ticks=element_blank(),
axis.title.y=element_blank(),
axis.title.x=element_blank(),
legend.position="none",
plot.title=element_text(hjust=0.5, colour="black", vjust=2, size=25, face="bold"),
plot.subtitle=element_text(hjust=0.5, size=18, face="italic", color="black"),
plot.caption =element_text(size=15, hjust=0.5, face="italic", color="black"),
plot.background=element_blank(),
plot.margin = margin(2,2, 2, 4, "cm")) +
transition_states(Year, transition_length = 1, state_length = 1) +
view_follow(fixed_x = TRUE) +
labs(title = 'Share of Renewable Energy in Total Energy Produced by U.S. States: {closest_state}',
subtitle = "Top 10 States",
caption = "% Share of Renewable Energy | Data Source: U.S. Energy Information Administration")
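To make the share and ranking steps easier to follow, here is a minimal sketch on a small made-up table. The states "A", "B" and "C" and their generation figures are hypothetical and are not taken from the EIA data; only the column names mirror those used in the script above. The shorter column name `Share` is also just for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data: one row per state, year and energy type
toy <- tibble(
  State = c("A", "A", "B", "B", "C"),
  Year = 2017,
  `Energy Type` = c("Renewable", "Non Renewable", "Renewable",
                    "Non Renewable", "Non Renewable"),
  `Generation (Mwh)` = c(30, 70, 10, 10, 50)
)

shares <- toy %>%
  # One column per energy type; a missing type (state C has no renewables) becomes 0
  spread(key = `Energy Type`, value = `Generation (Mwh)`, fill = 0) %>%
  # Renewable share of total generation, in percent
  mutate(Share = 100 * Renewable / (Renewable + `Non Renewable`)) %>%
  # Rank within each year, highest share first
  group_by(Year) %>%
  mutate(EnergyRank = rank(-Share, ties.method = "first")) %>%
  ungroup() %>%
  arrange(EnergyRank)

print(shares)
```

In this toy table, state B generates less renewable energy than state A (10 versus 30 MWh) but ranks first because half of its total generation is renewable. That is the reason the chart ranks states by share rather than by absolute generation.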
Data Reference
The following plot fixes the main issues in the original data visualization.
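The code listed does not include the step that renders the animation to a file. The sketch below shows one way this could be done with `animate()` and `anim_save()` from gganimate, using a small hypothetical plot so that it runs on its own. The toy data, frame count, dimensions and output path are all illustrative assumptions, not the settings used for the original plot.

```r
library(ggplot2)
library(gganimate)
library(gifski)

# Hypothetical stand-in for the ranked data: two states over two years
toy <- data.frame(
  State = c("A", "B", "A", "B"),
  Year = c(2016, 2016, 2017, 2017),
  Share = c(40, 20, 35, 45)
)

toy_plot <- ggplot(toy, aes(State, Share, fill = State)) +
  geom_col() +
  transition_states(Year, transition_length = 1, state_length = 1) +
  labs(title = "Year: {closest_state}")

# Render the frames and assemble them into a GIF (sizes are in pixels)
anim <- animate(toy_plot, nframes = 20, fps = 10, width = 400, height = 300,
                renderer = gifski_renderer())

# Write the GIF to a temporary location chosen for this example
out_file <- file.path(tempdir(), "toy_animation.gif")
anim_save(out_file, animation = anim)
```

The same two calls could be applied to `Graph_Animation`, probably with more frames and a larger canvas so that 28 years of transitions and the long title remain readable.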