As I was trying to understand The Economist’s visualization, I really just wanted to see a scatterplot of success rates by amount pledged. Creating this plot clarified what I was seeing in The Economist’s visualization. At first I thought there would be a fairly clear negative correlation between amount pledged and success rate, but the relationship turns out to be more nuanced. (I did add a line of best fit, but the negative correlation was not as clear as I expected, and the line was ugly.) What this plot ends up communicating is the success rate of each project type against the amount pledged.
I decided to add one more dimension, average pledge, represented by the size of the points. According to Few, people are generally not very good at judging small differences in size, such as the difference between the average pledge for technology projects and for design projects, but this element can at least point out gross differences. For instance, it is obvious that technology projects have a much larger average pledge than comics projects, which may partly explain why the total amount pledged for technology is also greater than that for comics. Since the range of average pledge values is not terribly large, this element doesn’t add a lot, but I think it efficiently adds information that supports more inferences about crowdfunding by project type. I did experiment with mapping other variables to point size, such as the number of projects launched. I liked that result, but in an effort to stay true to the original visualization, I landed on average pledge.
What this visualization does not achieve is any ranking of success rates or amount pledged by project type, which seemed to be key elements of The Economist’s final result. I decided to forgo any ranking because, thanks to the point labels, this plot still lets the reader see which project types had the highest amounts pledged or the highest success rates. As in The Economist’s visualization, I have distinct colors for each project type, but no legend. I could have made every point black, but I think color looks better and distinguishes the points more. I didn’t find a color legend to be necessary for this plot. I regret that there is some overlap in the labels for “Design” and “Film & Video,” but it turns out there are some constraints on labeling points in ggplot2. I experimented with directlabels, as suggested in the Wickham reading, but the result was not much better. In summary, I actually think this plot is much easier to understand than The Economist’s visualization.
Update: By using ggrepel, I was able to fix the overlapping point labels. Thanks, Tim!
# ggplot2 draws the plot; ggrepel provides geom_text_repel() for non-overlapping labels
library(ggplot2)
library(ggrepel)

ggplot(kickstarter, aes(x = Money_pledged,
                        y = Success_rate,
                        label = Project_type)) +
  # Color by project type, size by average pledge. Transparency is a fixed
  # setting, not a mapped variable, so alpha goes outside aes()
  geom_point(aes(color = Project_type,
                 size = Average_pledge),
             alpha = 0.7, na.rm = TRUE) +
  # Label each point, nudging labels apart so they do not overlap
  geom_text_repel(size = 3.5, point.padding = 0.2, na.rm = TRUE) +
  labs(x = "Amount Pledged (millions of US Dollars)",
       y = "Success Rate (%)*",
       title = "Success Rates of Crowdfunded Projects on Kickstarter by Amount Pledged (2012)",
       caption = "*Success rate is the number of fully funded projects as a percent of all projects",
       size = "Average pledge amount (US dollars)") +
  # Money_pledged is in dollars; show the ticks in millions
  scale_x_continuous(breaks = c(1000000, 20000000, 40000000, 60000000, 80000000),
                     labels = c("1m", "20m", "40m", "60m", "80m")) +
  scale_y_continuous(breaks = seq(10, 100, by = 10)) +
  scale_size_continuous(breaks = c(60, 80, 100), labels = c("$60", "$80", "$100")) +
  # Keep the size legend; the point labels make a color legend unnecessary
  guides(size = guide_legend(),
         color = "none") +
  theme(legend.position = "top",
        plot.caption = element_text(hjust = 0))
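For anyone curious about the experiments I mentioned (the line of best fit, and sizing points by a different variable), here is a rough sketch of how they could be reproduced. It is not the exact code I ran: the helper name plot_fit_variant is made up for illustration, and it assumes the same Money_pledged, Success_rate, Project_type, and Average_pledge columns used in the plot.

```r
library(ggplot2)

# Hypothetical helper (not part of the original analysis): scatterplot of
# success rate against amount pledged, with a straight-line fit added and
# the point size mapped to a column chosen by name.
plot_fit_variant <- function(df, size_var = "Average_pledge") {
  ggplot(df, aes(x = Money_pledged, y = Success_rate)) +
    geom_point(aes(color = Project_type, size = .data[[size_var]]),
               alpha = 0.7, na.rm = TRUE) +
    # Ordinary least-squares line across all project types, no confidence band
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
                color = "grey40", na.rm = TRUE) +
    # Hide the color legend, as in the main plot
    guides(color = "none") +
    labs(size = size_var)
}
```

Calling plot_fit_variant(kickstarter) gives the best-fit version sized by average pledge. Passing a different column name as size_var, for example a column holding the number of projects launched (whatever it is called in your data), gives the other variant.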