Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Larry Kim (2019).


Objective

The objective of the original data visualisation is to show which keywords are the most expensive (highest cost per click) in Google AdWords. The target audience could be any organisation that provides online advertising, any organisation that would like to do online marketing via a search engine or similar platform, or a general audience interested in online advertising.

The visualisation chosen had the following three main issues:

  • The pie chart uses 20 different colours. Although the smaller slices are annotated, it is normally recommended that pie charts have 5-7 colours (based on short-term memory; MacDonald 1999) and no more than 1 colours (Buts 2012).
  • The percentages and slice areas are not in proportion. For example, #1 is labelled 24% and #2 is labelled 12.8%, which is 36.8% in total, yet together they take up more than 50% of the pie chart area. This percentage information is also not really relevant to the audience, as the focus of the visualisation is the cost of the keywords rather than their frequency.
  • The colour choices are also problematic. For example, #1 is red and the adjacent #2 is green, which can be an issue for an audience with red/green colour blindness. The colour associations are also not ideal and do not reflect the keywords (e.g. red is used for keyword #7, Donate).
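
The proportion issue in the second point can be checked with a little arithmetic. The sketch below is only an illustration in base R: it takes the two percentages quoted above and converts their combined share into the central angle that the two slices should cover. The variable names are made up for this example.

```r
# Shares printed on the original pie chart for keywords #1 and #2 (percent).
shares <- c(first = 24, second = 12.8)

# Combined share of the two slices.
combined_share <- sum(shares)

# A slice's central angle is proportional to its share of the 360-degree circle.
combined_angle <- combined_share / 100 * 360

# Half of the pie corresponds to 180 degrees.
exceeds_half <- combined_angle > 180

print(combined_share)
print(combined_angle)
print(exceeds_half)
```

If the slices were drawn to scale, the two together should span roughly 132 degrees, well short of the 180 degrees that make up half of the pie.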

Reference

Code

The following code was used to fix the issues identified in the original.

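library(rvest)      # read_html(), html_nodes(), html_table()
library(ggplot2)    # plotting
library(tidyverse)  # rename() and the %>% pipe

# Download the article page that contains the keyword table.
webtables <- read_html("https://www.wordstream.com/articles/most-expensive-keywords")

# Parse the first <table> element on the page into a data frame.
keywordsdata <- html_table(html_nodes(webtables, "table")[[1]])

# Give the cost column a shorter name.
plotdata <- keywordsdata %>% rename(CPC = "Cost per Click (CPC)")

# Remove the dollar sign and convert the cost to a number.
plotdata$CPC <- as.numeric(sub("\\$", "", plotdata$CPC))

# Order the keywords from highest to lowest CPC so the bars are sorted.
plotdata$Keyword <- plotdata$Keyword %>% factor(levels = plotdata$Keyword[order(-plotdata$CPC)])

# Bar chart in a single colour, with each bar labelled with its CPC.
p1 <- ggplot(plotdata, aes(x = Keyword, y = CPC)) +
  geom_bar(stat = "identity", fill = "dodgerblue4") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  geom_text(aes(label = CPC), vjust = -0.5, size = 3) +
  labs(title = "The 20 Most Expensive Keywords in Google Ads", x = "Keywords", y = "Costs per Click (CPC)")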
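
The two cleaning steps in the code above (stripping the dollar sign and reordering the keywords) can be tried without downloading the page. The following is a minimal sketch in base R, assuming a small stand-in data frame; the keywords `alpha`, `beta` and `gamma` and their prices are invented for illustration and are not taken from the source table.

```r
# Hypothetical stand-in for the scraped table; keywords and prices are invented.
toy <- data.frame(
  Keyword = c("alpha", "beta", "gamma"),
  CPC = c("$1.50", "$3.25", "$2.00"),
  stringsAsFactors = FALSE
)

# Strip the dollar sign and convert the text to numbers.
toy$CPC <- as.numeric(sub("\\$", "", toy$CPC))

# Reorder the factor levels from highest to lowest CPC so that a bar chart
# draws the bars in descending order rather than alphabetically.
toy$Keyword <- factor(toy$Keyword, levels = toy$Keyword[order(-toy$CPC)])

print(levels(toy$Keyword))
```

Because ggplot2 places a factor on a discrete axis in the order of its levels, setting the levels this way is what sorts the bars by cost.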

Data Reference

Reconstruction

The following plot fixes the main issues in the original.