I loaded the libraries I needed and read in the Kickstarter data from a downloaded CSV file.
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
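The loading code itself is not shown above, so here is a minimal sketch of what that step might look like. It assumes the packages are dplyr (named in the output above) and ggplot2 (used for the plots below), and that the file is called `kickstarter.csv`, which is a hypothetical name since the real path is not given. Wrapping the `library()` calls in `suppressPackageStartupMessages()` is one way to keep the masking messages out of the rendered output.

```r
# Load the packages quietly so the "Attaching package" messages are not printed.
suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

# Hypothetical file name; substitute the path of the downloaded CSV.
csv_path <- "kickstarter.csv"

if (file.exists(csv_path)) {
  # By default read.csv() makes column headers syntactic (check.names = TRUE),
  # replacing spaces and symbols with dots.
  Kickstarter <- read.csv(csv_path, stringsAsFactors = FALSE)
  str(Kickstarter)
} else {
  message("File not found: ", csv_path)
}
```

The default header conversion in `read.csv()` is presumably where dotted column names such as `Success.rate...` in the code further down come from.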
I felt the data would be best represented as points on a scatterplot, which let me put a meaningful measure of the Kickstarter projects on each axis. My first variation had Money Pledged on the x-axis and Success (as a percentage) on the y-axis, with the categories as the point color. I also wanted to include the total number of projects, and the size of the points seemed like a great fit for that metric: it makes it easy to see which category is the largest and which is the smallest. I tried swapping the variables between the axes, but ended up keeping my x- and y-variables the same as in my original plot. That seemed the easiest way to compare money pledged with success, which were the factors I (and, it seems, the Economist) found most interesting to incorporate.
I only wanted to include one additional variable, which would determine the size of the points. I tried adding a fourth as the fill, but felt it made the plot too busy; my preference was concise and clear. The size variable I initially chose was total projects, but I also tried average amount pledged (a key factor in the Economist’s chart). Although that was interesting, I felt the total number of projects was more impactful, especially as the overall amount of money was already featured on the x-axis.
Once I had settled on my variables, I focused on the axis labels and the geom_text labels for the categories. I wanted each category name to be easy to read next to its point without obscuring the points themselves.
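As one possible way to keep labels readable without hiding the points, the sketch below makes the points semi-transparent and lets `geom_text()` skip labels that would overlap (`check_overlap = TRUE`). This is not what the plots in this write-up do; it is an alternative under stated assumptions. The `toy` data frame and its values are made up purely for illustration and are not the Kickstarter figures.

```r
library(ggplot2)

# Made-up values for illustration only; not the Kickstarter data.
toy <- data.frame(
  Category     = c("A", "B", "C"),
  MoneyPledged = c(5, 20, 60),      # millions of USD (invented)
  Success      = c(30, 45, 60),     # % of projects (invented)
  Launched     = c(500, 2000, 4000) # number of projects (invented)
)

p <- ggplot(toy, aes(x = MoneyPledged, y = Success, label = Category)) +
  # Semi-transparent points so a label drawn on top stays legible
  geom_point(aes(size = Launched), colour = "grey", alpha = 0.6) +
  # Nudge labels above the points and drop any label that would overlap another
  geom_text(size = 3.5, vjust = 0, nudge_y = 2, check_overlap = TRUE)

print(p)
```

Note that `check_overlap` silently omits overlapping labels, so it suits a quick look more than a final chart where every category must be named.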
# Copy the columns into shorter-named vectors for ease of use
Success <- Kickstarter$Success.rate...
AveragePledge <- Kickstarter$Average.pledge...
# Convert total money pledged from USD to millions of USD
MoneyPledged <- (Kickstarter$Money.pledged...) / 1000000
# Initial plot
ggplot(Kickstarter, aes(x = MoneyPledged, y = Success, col = Category)) +
  geom_point()
## Warning: Removed 2 rows containing missing values (geom_point).
# Switching my variables
ggplot(Kickstarter, aes(y = MoneyPledged, x = Success, label = Category)) +
  geom_point(aes(size = Launched), col = "grey") +
  geom_text(size = 4, angle = 45, vjust = 0, nudge_y = 2) +
  # MoneyPledged was divided by 1,000,000 above, so the unit is millions
  ylab("Money Pledged (Millions of USD$)") +
  xlab("Successful Projects (% of total)") +
  labs(size = "Total Projects Launched")
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_text).
# Adding a title and solidifying the variables I want to include
ggplot(Kickstarter, aes(x = MoneyPledged, y = Success, label = Category)) +
  geom_point(aes(size = Launched), col = "grey") +
  scale_size_continuous(range = c(4, 16)) +
  geom_text(size = 3.5, angle = 45, vjust = 0, nudge_y = 2) +
  xlab("Total Money Pledged (Millions of USD$)") +
  ylab("Successful Projects (% of total)") +
  labs(size = "Total Projects", title = "Crowdfunded Kickstarter Projects in 2012")
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_text).
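The warnings above mean ggplot2 dropped two rows with missing values before drawing. A hedged sketch of one way to make that step explicit is to build the derived columns inside the data frame with dplyr and filter out incomplete rows before plotting. The `toy` data frame, its simplified column names (`Money.pledged`, `Success.rate`), and its values are all invented for illustration; which rows are incomplete in the real data is not shown in this write-up.

```r
suppressPackageStartupMessages(library(dplyr))

# Made-up rows for illustration; column names are simplified stand-ins.
toy <- data.frame(
  Category      = c("A", "B", "C"),
  Money.pledged = c(5e6, NA, 6e7),   # USD (invented)
  Success.rate  = c(30, 45, NA),     # % (invented)
  Launched      = c(500, 2000, 4000)
)

plot_data <- toy %>%
  mutate(MoneyPledged = Money.pledged / 1e6) %>%  # convert USD to millions
  rename(Success = Success.rate) %>%              # shorter name, kept in the data frame
  filter(!is.na(MoneyPledged), !is.na(Success))   # drop rows ggplot2 would otherwise drop with a warning

print(plot_data)
```

Keeping the derived columns in the data frame, rather than as separate vectors, also means the data passed to `ggplot()` and the variables named in `aes()` cannot drift out of sync.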