This is my first attempt at using R Markdown to tackle the Makeover Monday challenges set by @VizWizBi and @acotgreave at http://vizwiz.blogspot.co.uk/p/makeover-monday-challenges.html.
The challenges have emerged out of the Tableau community and I’m a huge Tableau fan, but they also seem like a great way to see what I can manage in R and to learn some new techniques at the same time.
This week’s challenge is to make over this vis.
I could load the data from csv, but it’s very small so let’s just create it here.
# Recreate raw data
source_of_traffic <- c("Netflix", "YouTube", "HTTP", "Amazon Video", "iTunes",
                       "BitTorrent", "Hulu", "Facebook", "Other")
percent_of_traffic <- c(37.1, 17.9, 6.1, 3.1, 2.8, 2.7, 2.6, 2.5, 25.4)
df <- data.frame(source_of_traffic, percent_of_traffic)
print(df)
##   source_of_traffic percent_of_traffic
## 1           Netflix               37.1
## 2           YouTube               17.9
## 3              HTTP                6.1
## 4      Amazon Video                3.1
## 5            iTunes                2.8
## 6        BitTorrent                2.7
## 7              Hulu                2.6
## 8          Facebook                2.5
## 9             Other               25.4
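If the data were larger, reading it from csv would be the sensible route. Here is a minimal sketch of how that might look; the temporary file and the name `df_from_csv` are assumptions made up for illustration, not the challenge's real data file.

```r
# Hypothetical example: write the same figures to a temporary csv, then read them back.
# The file path is invented here so that the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "source_of_traffic,percent_of_traffic",
  "Netflix,37.1",
  "YouTube,17.9",
  "HTTP,6.1",
  "Amazon Video,3.1",
  "iTunes,2.8",
  "BitTorrent,2.7",
  "Hulu,2.6",
  "Facebook,2.5",
  "Other,25.4"
), csv_path)

# stringsAsFactors = FALSE keeps the names as plain text,
# so the factor order can be chosen explicitly later on
df_from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)
print(df_from_csv)
```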
I wonder what the original would look like, rendered in ggplot? I’ve never drawn a donut chart in ggplot before and it turns out that it’s trickier than I expected. Comments in the code below note the awkward bits.
library(ggplot2)
library(dplyr)
# Adding value labels to each segment means working out where the centre of the segment will be drawn
df <- df %>%
  mutate(pos = cumsum(percent_of_traffic) - (0.5 * percent_of_traffic))
original_colours <- c("#7D0808", "#D91A00", "#FFB200", "#F78200", "#0F283E", "#919191", "#87BD24", "#005CB0", "#D6D6D6")
# The colour palette doesn't work unless you force the data to sort in its original order. ggplot will default to alphabetical order
df$source_of_traffic <- factor(df$source_of_traffic, levels = as.character(df$source_of_traffic))
ggplot(df, aes(x = "", y = percent_of_traffic, fill = source_of_traffic)) +
  geom_bar(width = 0.25, stat = "identity") +
  geom_text(aes(label = paste(percent_of_traffic, "%", sep = ""), y = pos), size = 3) +
  coord_polar(theta = "y") +
  scale_fill_manual(values = original_colours, name = "Source of Traffic") +
  ggtitle("Percentage of peak period downstream internet traffic in North America") +
  # Turn off all of ggplot's default formatting
  theme(panel.grid = element_blank(),
        panel.background = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(hjust = 0, size = 12))
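The label positions are the awkward bit, so here is a small sketch of the arithmetic on its own, followed by an alternative that I believe is available in ggplot2 2.2.0 and later: `position_stack(vjust = 0.5)`, which asks ggplot to place each label in the middle of its stacked segment. The names `df_labels` and `p_labels` are made up for this example. One reason to consider it: if the installed ggplot2 stacks the segments in a different order from the running total, hand-computed positions can land on the wrong segments.

```r
library(ggplot2)

# Same figures as above, rebuilt so that the sketch is self-contained
df_labels <- data.frame(
  source_of_traffic = c("Netflix", "YouTube", "HTTP", "Amazon Video", "iTunes",
                        "BitTorrent", "Hulu", "Facebook", "Other"),
  percent_of_traffic = c(37.1, 17.9, 6.1, 3.1, 2.8, 2.7, 2.6, 2.5, 25.4),
  stringsAsFactors = FALSE
)

# Keep the original order rather than ggplot's alphabetical default
df_labels$source_of_traffic <- factor(df_labels$source_of_traffic,
                                      levels = df_labels$source_of_traffic)

# Manual midpoint: the running total marks where each segment ends,
# so subtracting half the segment's own width gives its centre
df_labels$pos <- cumsum(df_labels$percent_of_traffic) -
  0.5 * df_labels$percent_of_traffic

# Alternative (assumes ggplot2 2.2.0 or later): let ggplot work out the centres
p_labels <- ggplot(df_labels,
                   aes(x = "", y = percent_of_traffic, fill = source_of_traffic)) +
  geom_bar(width = 0.25, stat = "identity") +
  geom_text(aes(label = paste0(percent_of_traffic, "%")),
            position = position_stack(vjust = 0.5), size = 3) +
  coord_polar(theta = "y")
```

With this version the `pos` column is not needed for plotting; it is only kept here to show what the manual calculation produces.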
Well, that was quite a lot of effort just to produce a very average-looking donut chart, and I’m still not happy with the chart’s rendered size, which seems to be very difficult to change for a donut. It’s the downside of using R: you can do almost anything, so you often have to specify in minute detail what you actually want.
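One approach that might help with the size, sketched here under assumptions rather than tested against the chart above: map x to a number instead of an empty string, then use `xlim()` to control how much empty space sits inside the ring. The names `df_donut` and `p_donut` and the limits chosen are illustrative only.

```r
library(ggplot2)

# Same figures as above, rebuilt so that the sketch is self-contained
df_donut <- data.frame(
  source_of_traffic = c("Netflix", "YouTube", "HTTP", "Amazon Video", "iTunes",
                        "BitTorrent", "Hulu", "Facebook", "Other"),
  percent_of_traffic = c(37.1, 17.9, 6.1, 3.1, 2.8, 2.7, 2.6, 2.5, 25.4),
  stringsAsFactors = FALSE
)
df_donut$source_of_traffic <- factor(df_donut$source_of_traffic,
                                     levels = df_donut$source_of_traffic)

# x = 2 places the bar at a numeric position (it spans 1.5 to 2.5 with width = 1).
# xlim(0.5, 2.5) leaves the range 0.5 to 1.5 empty, which becomes the hole
# once the axis is bent into a circle.
p_donut <- ggplot(df_donut,
                  aes(x = 2, y = percent_of_traffic, fill = source_of_traffic)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  xlim(0.5, 2.5) +
  theme_void()
```

Lowering the first `xlim()` value should make the hole larger relative to the ring. For the size of the rendered image itself, the knitr chunk options `fig.width` and `fig.height` are the usual place to look.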
In ggplot’s defence, the help files do note that coord_polar() exists so that you can draw pie charts, and then warn against their use. Reproducing the original vis isn’t what ggplot is good at.
The original on this week’s challenge strikes me as a classic of the type you often get in marketing. The dataset is very simple, but if you draw a simple chart then you fear that nobody will think it’s clever.
Bar chart? Too basic.
Pie chart? Too common.
Donut chart!
Simplicity is good. A bar chart was a good answer, and a donut doesn’t improve on it.
# Sort the data: reorder() sets the factor levels by percent_of_traffic, smallest first
df <- transform(df, source_of_traffic = reorder(source_of_traffic, percent_of_traffic))
# Drop "Other". It's needed for a pie or donut, but not interesting in a bar.
df <- df[df$source_of_traffic != "Other", ]
ggplot(df, aes(x = source_of_traffic, y = percent_of_traffic)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  geom_text(aes(label = paste(percent_of_traffic, "%", sep = ""), y = percent_of_traffic),
            hjust = 1, size = 3, color = "white") +
  ggtitle("Percentage of peak period downstream internet traffic in North America") +
  # Turn off all of ggplot's default formatting
  theme(text = element_text(size = 15),
        panel.grid = element_blank(),
        panel.background = element_blank(),
        axis.ticks = element_blank(),
        axis.text.x = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(hjust = 0, size = 12))
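The sorting step is easy to miss, so here is a small sketch of what it does, using just three of the rows. The name `traffic` and the cut-down data are for illustration only. Bars are drawn in factor-level order, and after a coord_flip() the first level sits at the bottom, so sorting smallest first puts the biggest bar at the top.

```r
# Three rows from the data above, enough to show the ordering
traffic <- data.frame(
  source_of_traffic = c("Netflix", "YouTube", "HTTP"),
  percent_of_traffic = c(37.1, 17.9, 6.1),
  stringsAsFactors = FALSE
)

# By default, factor levels are alphabetical
default_levels <- levels(factor(traffic$source_of_traffic))
print(default_levels)
# "HTTP" "Netflix" "YouTube"

# reorder() sorts the levels by the second argument instead, smallest first
sorted_levels <- levels(reorder(traffic$source_of_traffic,
                                traffic$percent_of_traffic))
print(sorted_levels)
# "HTTP" "YouTube" "Netflix"
```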
That’ll do. Seeing as I’ve already been a killjoy about the original vis overcomplicating a simple dataset, I’m not even going to colour it in. Corporate, analytical grey. Lovely!
(seriously, for publication, I promise I’d colour it in)
You might have spotted that I filtered out the “Other” category. Once you move away from a donut or pie, I don’t think it serves much purpose.
Why not? This has been a fun way to learn some new things and brush up on my R Markdown at the same time. Complicated or really fun builds are still likely to happen in Tableau for the time being, though, because I can make a lot more happen in Tableau, much faster.