After reviewing the data and examining the charts from The Economist, I thought about what we would want to show in a visualization - what data, which relationships, etc. To me, the “output” of the Kickstarter data is the success rate, and we want to look at the factors that contribute to it. We can examine other components along the way, but this is the variable I wanted to highlight most.
data: kickstart - the data set created from the CSV file.
Aesthetics
y-variable: I mapped the success rate to the y axis to highlight this variable the most.
x-variable: There are 13 categories, which I thought was too many to distinguish with shapes or some other identifier within the chart. Conversely, with only one data point per category, there was not enough data to justify a facet/grid layout. This led me to map the categories to the x axis. I also ordered them along the axis by success rate, again letting readers scan from the most successful to the least successful category.
size: I mapped the number of pledges to point size. This gives a ballpark idea of how popular any given category was. It shows where the masses invested and where few did - I think that’s an interesting consideration when looking at which categories were more successful.
color: I mapped the amount pledged to color. While color is not the professionals' favorite encoding for this purpose, it shows investment dollars clearly enough to give the viewer a general idea of how much money went into any given category. Again, with fewer categories, I probably would have plotted pledged amounts on an axis and used symbols for the categories.
geometry: I chose geom_point because it showed the information I was interested in as a simple-to-read plot, and allowed me enough options (below) to add detail.
labs: Labeled the chart and all mapped components.
theme: I used themes to make the chart “prettier” - some text was faint, and I thought with a number of components on the plot, I needed to help clarify axes, labels, etc. I also changed background colors to make the data points pop a bit better.
scales: I used scales to make the variables easier for the reader to interpret, including formatting labels and breaks. Of note, I changed the scale for the size legend (number of pledges). The default breaks were 500k and 1,000k, but several categories have fewer than 100k pledges, so the default legend made it hard to judge how many pledges the lower-participation categories really received. This could still be improved - the challenge is choosing a legend that represents values ranging from 25k to 1,400k. I also set the success rate scale to run from 20% to 80% to use more of the plot and emphasize the change in success rate from category to category. While I was concerned that the scale should start at 0%, I felt better after reading this article from Junk Charts.
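The mapping described in the list above can be sketched as follows. This is a minimal illustration rather than the final chart: the data frame `toy`, its category names, and all of its values are placeholders made up for the example, and the millions unit on the color legend is my own choice. Note that the final code further down swaps the y and color mappings in response to feedback.

```r
library(ggplot2)
library(scales)

# Placeholder values for illustration only; NOT the real Kickstarter figures.
toy <- data.frame(
  Category     = c("Category A", "Category B", "Category C"),
  Success.Rate = c(74, 26, 34),           # percent
  Pledged.Amt  = c(2e6, 6e6, 8e7),        # dollars
  No.Pledges   = c(25e3, 80e3, 1.4e6)     # count
)

sketch <- ggplot(toy, aes(
    x     = reorder(Category, -Success.Rate),  # most successful category first
    y     = Success.Rate / 100,                # success rate on the y axis
    size  = No.Pledges / 1000,                 # popularity as point size
    color = Pledged.Amt / 1e6                  # dollars pledged as color
  )) +
  geom_point() +
  # Run the axis from 20% to 80% instead of starting at 0%
  scale_y_continuous(labels = percent, limits = c(0.2, 0.8),
                     breaks = seq(0.2, 0.8, by = 0.2)) +
  # Add a 100k break so the low-participation categories are readable
  scale_size_continuous(breaks = c(100, 500, 1000), range = c(5, 15),
                        labels = comma) +
  labs(x = "Category (Ranked by Success Rate)", y = "Success Rate",
       size = "Number of Pledges (in thousands)",
       color = "Amount Pledged (in millions)")
```

Printing `sketch` renders the plot. The negative sign inside `reorder()` is what sorts the categories from highest to lowest success rate.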
Neither my chart nor The Economist’s shows the number of projects launched. While this is interesting, I don’t think it has a direct impact on success. What it would show is which categories have more start-ups, which is not something I thought was necessary in this visualization.
I agree with The Economist's decision to eliminate the average amount pledged. The variation between the lowest and highest categories is not huge, making it fairly unexciting.
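One rough way to check how much the average pledge varies is to compare the highest and lowest category averages. The sketch below assumes the average is simply the total pledged divided by the number of pledges; the data frame `toy` and its values are placeholders made up for the example, not the real figures.

```r
# Placeholder values for illustration only; NOT the real Kickstarter figures.
toy <- data.frame(
  Category    = c("Category A", "Category B", "Category C"),
  Pledged.Amt = c(2.0e6, 6.0e6, 8.0e7),   # dollars
  No.Pledges  = c(25e3, 80e3, 1.0e6)      # count
)

# Average pledge per category (assumed: total pledged / number of pledges)
toy$Avg.Pledge <- toy$Pledged.Amt / toy$No.Pledges

# Ratio of the highest to the lowest average; values near 1 mean little variation
spread_ratio <- max(toy$Avg.Pledge) / min(toy$Avg.Pledge)

# Categories sorted from highest to lowest average pledge
toy[order(-toy$Avg.Pledge), c("Category", "Avg.Pledge")]
```

Running the same calculation on the real `kickstart` data frame would put a number on "not huge" before dropping the variable.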
#read CSV
kickstart <- read.csv("A2_kickstart.csv", stringsAsFactors = FALSE)
#Libraries
library(dplyr)
library(ggplot2)
library(RColorBrewer)
library(scales)
library(ggthemes)
library(grid)
#Clean data: give the columns short, consistent names
colnames(kickstart) <- c("Category","No.Launched", "No.Successful", "Pledged.Amt", "No.Pledges", "Success.Rate", "Avg.Pledge$")
#Reusable text styles for the theme
blue.bold.text <- element_text(face = "bold", color = "darkblue")
black.text <- element_text(color = "black")
black.45angle.text <- element_text(angle = 45, hjust = 1, color = "black")
#Plot: categories ranked by success rate on x, amount pledged on y,
#number of pledges as point size, success rate as color
kickstart.plot <- ggplot(kickstart, aes(x = reorder(Category, -Success.Rate),
                                        y = Pledged.Amt/1000,
                                        size = No.Pledges/1000,
                                        color = Success.Rate/100)) +
  geom_point() +
  labs(title = "Crowdfunded Projects on Kickstarter in 2012",
       x = "Category (Ranked by Success Rate)",
       y = "Amount Pledged (in thousands)",
       color = "Success Rate",
       size = "Number of Pledges (in thousands)") +
  theme(axis.text.x = black.45angle.text, axis.text.y = black.text,
        plot.title = element_text(hjust = 0.5), axis.title = blue.bold.text,
        panel.background = element_rect(fill = "lightgray", color = "black"),
        plot.background = element_rect(fill = "gray95", color = "black"),
        aspect.ratio = 3.5/4) +
  #Add a 100k break so categories with few pledges are readable in the legend
  scale_size_continuous(breaks = c(100, 500, 1000), range = c(5, 15), labels = comma_format()) +
  scale_y_continuous(labels = dollar_format(), limits = c(0, 100000),
                     breaks = c(0, 20000, 40000, 60000, 80000, 100000)) +
  scale_color_gradient(low = "gold", high = "red", labels = percent)
#Suggestions to change:
#More contrasting Colors - yellow to red (hot)
#Increase horizontal width - used aspect.ratio
#Background lighter - done
#Amount pledged on Y; Success as color - done
kickstart.plot