The following markdown shows a poorly made visualization and then walks through ways to improve it.
The figure attempts to visualize a poll where people voted on where they would want a new NHL team to be. The purpose of this visualization is to show where the most people want a new NHL team.
There are a lot of reasons I don’t like this figure:
Let’s start out by making the data frame so we can hold the data.
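#load the packages the code below relies on
library(tidyverse) # dplyr, forcats, ggplot2
library(ggtext)    # element_markdown()

#make data.frame
df <- data.frame(
  place = as.factor(c("Houston", "Quebec City", "Arizona", "Atlanta", "Toronto", "Austin", "Saskatoon", "San Diego")),
  votes = c(54, 47, 24, 17, 8, 4, 3, 3),
  color = as.factor(c(1, 0, 0, 0, 0, 0, 0, 0)) # I've included a color factor for the figure
)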
#view dataframe!
head(df)
##         place votes color
## 1     Houston    54     1
## 2 Quebec City    47     0
## 3     Arizona    24     0
## 4     Atlanta    17     0
## 5     Toronto     8     0
## 6      Austin     4     0
Now that we have the data, let’s visualize it!
df |>
  # order the bars by vote count
  mutate(place = fct_reorder(place, votes)) |>
  ggplot(aes(x = place, y = votes, fill = color)) +
  geom_col() +
  theme_classic() +
  # print each vote count just past the end of its bar
  geom_text(aes(label = votes), position = position_dodge(width = 0.9), hjust = -1, color = "#f7f8fa") +
  labs(
    title = '"Where do you want to see a new NHL team play?"',
    subtitle = "<span style='color:#f7f8fa'> Houston was voted as</span> <span style='color:#2951e3'>the most desirable new location for an NHL team.</span> ",
    caption =
      "Teams receiving two votes: Kansas City, Helsinki, Oklahoma City, and Miami \nTeams receiving a single vote: Boise, Dubai, Green Bay, Halifax, Jackson Hole, Milwaukee, \nand Orlando \n \nSource: The Athletic NHL Staff (Anonymous NHL Player Poll: 175 votes) from Sept. 27 - Nov. 10"
  ) +
  # the axis starts at zero, so bar lengths match the vote counts
  ylim(c(0, 60)) +
  # 0 = off-white bars, 1 = blue highlight for Houston
  scale_fill_manual(values = c("#f7f8fa", "#2951e3")) +
  coord_flip() +
  theme(
    plot.subtitle = element_markdown(),
    plot.title = element_text(color = "white", size = 16),
    plot.caption = element_text(color = "white", hjust = 0),
    axis.text.y = element_text(color = "white"),
    axis.ticks.y = element_line(color = "white"),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.line.x = element_blank(),
    axis.line.y = element_blank(),
    plot.background = element_rect(fill = 'black', color = "black"),
    panel.background = element_rect(fill = 'black'),
    legend.position = "none"
  )
Okay, now we have fixed the scale! You can see that Quebec City is really not all that far behind Houston in the poll. This is why the scale you use makes such a big difference!
I have another gripe with this figure: why did they not include all of the other teams? They arbitrarily decided not to include the places that have fewer than 3 votes.
Let’s see what it looks like if we include all of the teams!
#make data.frame
df2 <- data.frame(
  place = as.factor(c("Houston", "Quebec City", "Arizona", "Atlanta", "Toronto", "Austin", "Saskatoon", "San Diego", "Kansas City", "Helsinki", "Oklahoma City", "Miami", "Boise", "Dubai", "Green Bay", "Halifax", "Jackson Hole", "Milwaukee", "Orlando")),
  votes = c(54, 47, 24, 17, 8, 4, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1),
  color = as.factor(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)) # 1 marks the highlighted bar (Houston)
)
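As a quick sanity check (a sketch that is not part of the original analysis), the vote counts can be totalled to confirm they match the 175 votes cited in the source caption, and to see how many votes the original figure left without a bar. The names `votes_all` and `shown` are just illustrative.

```r
# Vote counts copied from df2, in the same order
votes_all <- c(54, 47, 24, 17, 8, 4, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1)

# The original figure only drew bars for places with 3 or more votes
shown <- votes_all >= 3

total_votes  <- sum(votes_all)          # compare with the 175 votes in the caption
hidden_votes <- sum(votes_all[!shown])  # votes that only appeared in the caption text

total_votes
hidden_votes
round(100 * hidden_votes / total_votes, 1)  # percent of votes without a bar
```

If the total did not match the caption, that would be a sign a place was mistyped or left out when building the data frame.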
Now that we have a new data frame, let’s visualize the figure with all of the votes!
df2 |>
  # order the bars by vote count
  mutate(place = fct_reorder(place, votes)) |>
  ggplot(aes(x = place, y = votes, fill = color)) +
  geom_col() +
  theme_classic() +
  # print each vote count just past the end of its bar
  geom_text(aes(label = votes), position = position_dodge(width = 0.9), hjust = -1, color = "#f7f8fa") +
  labs(
    title = '"Where do you want to see a new NHL team play?"',
    subtitle = "<span style='color:#f7f8fa'> Houston was voted as</span> <span style='color:#2951e3'>the most desirable new location for an NHL team.</span> ",
    caption =
      "Source: The Athletic NHL Staff (Anonymous NHL Player Poll: 175 votes) from Sept. 27 - Nov. 10"
  ) +
  # the axis starts at zero, so bar lengths match the vote counts
  ylim(c(0, 60)) +
  # 0 = off-white bars, 1 = blue highlight for Houston
  scale_fill_manual(values = c("#f7f8fa", "#2951e3")) +
  coord_flip() +
  theme(
    plot.subtitle = element_markdown(),
    plot.title = element_text(color = "white", size = 16),
    plot.caption = element_text(color = "white", hjust = 1),
    axis.text.y = element_text(color = "white"),
    axis.ticks.y = element_line(color = "white"),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.line.x = element_blank(),
    axis.line.y = element_blank(),
    plot.background = element_rect(fill = 'black', color = "black"),
    panel.background = element_rect(fill = 'black'),
    legend.position = "none"
  )
There you have it! We have our finalized figure USING A CORRECT SCALE.
When you visualize data, you should always make sure the scale and the lengths of your bars match the actual values you are plotting!!!
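To put a number on that, here is a minimal sketch comparing how long the Quebec City bar is drawn relative to Houston's when the axis starts at zero versus when it starts somewhere else. A truncated axis is only one common way bar lengths can drift from the data; the baseline of 40 is a made-up value for illustration, not a claim about how the original figure was drawn, and `bar_ratio()` is a hypothetical helper.

```r
# Drawn length of one bar relative to another for a given axis baseline.
# bar_ratio() is a hypothetical helper written for this illustration.
bar_ratio <- function(value, reference, baseline = 0) {
  (value - baseline) / (reference - baseline)
}

houston <- 54
quebec  <- 47

# Zero baseline: bar lengths are proportional to the votes themselves
bar_ratio(quebec, houston)                 # 47 / 54, about 0.87

# Truncated baseline (40 is an assumed example value): the gap is exaggerated
bar_ratio(quebec, houston, baseline = 40)  # 7 / 14 = 0.5
```

With a zero baseline the Quebec City bar is about 87% as long as Houston's, which is the same as the ratio of the votes. With the assumed baseline of 40 it would be drawn only half as long, even though the data has not changed.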