library(tidyverse)
library(knitr)
library(plotly)
library(lubridate)
options(scipen = 9999)
Kickstarter is American company based in New York, that maintains a global crowdfunding platform focused on creativity and merchandising. The company’s stated mission is to “help bring creative projects to life”. Kickstarter has reportedly received more than $4 billion in pledges from 15.5 million backers to fund 257,000 creative projects, such as films, music, stage shows, comics, journalism, video games, technology and food-related projects. (https://en.wikipedia.org/wiki/Kickstarter)
Some things I want to analyze:
ksdata <- read.csv("ks-projects-201801.csv")
str(ksdata)
## 'data.frame': 378661 obs. of 15 variables:
## $ ID : int 1000002330 1000003930 1000004038 1000007540 1000011046 1000014025 1000023410 1000030581 1000034518 100004195 ...
## $ name : Factor w/ 375765 levels ""," IT’S A HOT CAPPUCCINO NIGHT ",..: 332493 135633 364946 344770 77274 206067 293430 69281 284103 290686 ...
## $ category : Factor w/ 159 levels "3D Printing",..: 109 94 94 91 56 124 59 42 114 40 ...
## $ main_category : Factor w/ 15 levels "Art","Comics",..: 13 7 7 11 7 8 8 8 5 7 ...
## $ currency : Factor w/ 14 levels "AUD","CAD","CHF",..: 6 14 14 14 14 14 14 14 14 14 ...
## $ deadline : Factor w/ 3164 levels "2009-05-03","2009-05-16",..: 2288 3042 1333 1017 2247 2463 1996 2448 1790 1863 ...
## $ goal : num 1000 30000 45000 5000 19500 50000 1000 25000 125000 65000 ...
## $ launched : Factor w/ 378089 levels "1970-01-01 01:00:00",..: 243292 361975 80409 46557 235943 278600 187500 274014 139367 153766 ...
## $ pledged : num 0 2421 220 1 1283 ...
## $ state : Factor w/ 6 levels "canceled","failed",..: 2 2 2 2 1 4 4 2 1 1 ...
## $ backers : int 0 15 3 1 14 224 16 40 58 43 ...
## $ country : Factor w/ 23 levels "AT","AU","BE",..: 10 23 23 23 23 23 23 23 23 23 ...
## $ usd.pledged : num 0 100 220 1 1283 ...
## $ usd_pledged_real: num 0 2421 220 1 1283 ...
## $ usd_goal_real : num 1534 30000 45000 5000 19500 ...
I’m using kable() function from knitr package to make simple table viewing
kable(head(ksdata))
| ID | name | category | main_category | currency | deadline | goal | launched | pledged | state | backers | country | usd.pledged | usd_pledged_real | usd_goal_real |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000002330 | The Songs of Adelaide & Abullah | Poetry | Publishing | GBP | 2015-10-09 | 1000 | 2015-08-11 12:12:28 | 0 | failed | 0 | GB | 0 | 0 | 1533.95 |
| 1000003930 | Greeting From Earth: ZGAC Arts Capsule For ET | Narrative Film | Film & Video | USD | 2017-11-01 | 30000 | 2017-09-02 04:43:57 | 2421 | failed | 15 | US | 100 | 2421 | 30000.00 |
| 1000004038 | Where is Hank? | Narrative Film | Film & Video | USD | 2013-02-26 | 45000 | 2013-01-12 00:20:50 | 220 | failed | 3 | US | 220 | 220 | 45000.00 |
| 1000007540 | ToshiCapital Rekordz Needs Help to Complete Album | Music | Music | USD | 2012-04-16 | 5000 | 2012-03-17 03:24:11 | 1 | failed | 1 | US | 1 | 1 | 5000.00 |
| 1000011046 | Community Film Project: The Art of Neighborhood Filmmaking | Film & Video | Film & Video | USD | 2015-08-29 | 19500 | 2015-07-04 08:35:03 | 1283 | canceled | 14 | US | 1283 | 1283 | 19500.00 |
| 1000014025 | Monarch Espresso Bar | Restaurants | Food | USD | 2016-04-01 | 50000 | 2016-02-26 13:38:27 | 52375 | successful | 224 | US | 52375 | 52375 | 50000.00 |
kable(tail(ksdata))
| ID | name | category | main_category | currency | deadline | goal | launched | pledged | state | backers | country | usd.pledged | usd_pledged_real | usd_goal_real | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 378656 | 999976312 | Angela’s Poetry (Canceled) | Poetry | Publishing | CAD | 2014-09-20 | 5500 | 2014-08-06 03:46:07 | 0 | canceled | 0 | CA | 0 | 0 | 4949.6 |
| 378657 | 999976400 | ChknTruk Nationwide Charity Drive 2014 (Canceled) | Documentary | Film & Video | USD | 2014-10-17 | 50000 | 2014-09-17 02:35:30 | 25 | canceled | 1 | US | 25 | 25 | 50000.0 |
| 378658 | 999977640 | The Tribe | Narrative Film | Film & Video | USD | 2011-07-19 | 1500 | 2011-06-22 03:35:14 | 155 | failed | 5 | US | 155 | 155 | 1500.0 |
| 378659 | 999986353 | Walls of Remedy- New lesbian Romantic Comedy feature unlike any other!! | Narrative Film | Film & Video | USD | 2010-08-16 | 15000 | 2010-07-01 19:40:30 | 20 | failed | 1 | US | 20 | 20 | 15000.0 |
| 378660 | 999987933 | BioDefense Education Kit | Technology | Technology | USD | 2016-02-13 | 15000 | 2016-01-13 18:13:53 | 200 | failed | 6 | US | 200 | 200 | 15000.0 |
| 378661 | 999988282 | Nou Renmen Ayiti! We Love Haiti! | Performance Art | Art | USD | 2011-08-16 | 2000 | 2011-07-19 09:07:47 | 524 | failed | 17 | US | 524 | 524 | 2000.0 |
sapply(ksdata, function(x) sum(is.na(x)))
## ID name category main_category
## 0 0 0 0
## currency deadline goal launched
## 0 0 0 0
## pledged state backers country
## 0 0 0 0
## usd.pledged usd_pledged_real usd_goal_real
## 3797 0 0
I will remove the column usd.pledged because it contains NA values, and also I don’t need it for data analysis.
ksdata <- ksdata[, -13]
colnames(ksdata)[13] <- "usd_pledged"
colnames(ksdata)[14] <- "usd_goal"
By Main Category
project.category <- ksdata %>%
group_by(main_category) %>%
summarize(count = n()) %>%
arrange(desc(count))
project.category$main_category <- factor(project.category$main_category, levels = project.category$main_category)
ggplot(project.category, aes(x = main_category, y = count)) +
geom_bar(stat = "identity", aes(fill = main_category), show.legend = F) +
labs(title = "Total Projects per Category", x = "Category Name", y = "Total") +
theme(axis.text.x = element_text(angle=90, hjust=1),
plot.title=element_text(hjust=0.5)) +
geom_text(aes(label = paste0(round(count/1000, 1), "K")), vjust = -0.5)
From this graph, Film & Video is the most popular main category in Kickstarter with a total of 63.6K projects were released all the time. Music, Publishing, Games, and Technology make it under top 5 popular main category in Kickstarter.
By Subcategory
project.subcategory <- ksdata %>%
group_by(category) %>%
summarize(count = n()) %>%
arrange(desc(count))
project.subcategory$category <- factor(project.subcategory$category, levels = project.subcategory$category)
ggplot(head(project.subcategory, 15), aes(x = category, y = count)) +
geom_bar(stat = "identity", aes(fill = category)) +
labs(title = "Top 10 Projects by Sub-Category", x = "Sub-Category Name", y = "Total") +
theme(axis.text.x = element_text(angle=90, hjust=1),
plot.title=element_text(hjust=0.5),
legend.position = "bottom") +
geom_text(aes(label = paste0(round(count/1000, 1), "K")), vjust = -0.5) +
scale_y_continuous(limits = c(0,25000))
From subcategory, Product Design becomes the most popular project with over 22.3K projects were released all time.
Top 20 highest funded projects
kable(head(ksdata[order(-ksdata$usd_pledged), c(2,3,13)], 20))
| name | category | usd_pledged | |
|---|---|---|---|
| 157271 | Pebble Time - Awesome Smartwatch, No Compromises | Product Design | 20338986 |
| 250255 | COOLEST COOLER: 21st Century Cooler that’s Actually Cooler | Product Design | 13285226 |
| 216630 | Pebble 2, Time 2 + All-New Pebble Core | Product Design | 12779843 |
| 289916 | Kingdom Death: Monster 1.5 | Tabletop Games | 12393140 |
| 282417 | Pebble: E-Paper Watch for iPhone and Android | Product Design | 10266846 |
| 293862 | The World’s Best TRAVEL JACKET with 15 Features || BAUBAX | Product Design | 9192056 |
| 187653 | Exploding Kittens | Tabletop Games | 8782572 |
| 6666 | OUYA: A New Kind of Video Game Console | Gaming Hardware | 8596475 |
| 309631 | THE 7th CONTINENT – What Goes Up, Must Come Down. | Tabletop Games | 7072757 |
| 271277 | The Everyday Backpack, Tote, and Sling | Product Design | 6565782 |
| 75901 | Fidget Cube: A Vinyl Desk Toy | Product Design | 6465690 |
| 368574 | Shenmue 3 | Video Games | 6333296 |
| 30042 | Pono Music - Where Your Soul Rediscovers Music | Sound | 6225355 |
| 89482 | Bring Back MYSTERY SCIENCE THEATER 3000 | Television | 5764229 |
| 148586 | The Veronica Mars Movie Project | Narrative Film | 5702153 |
| 14851 | Bloodstained: Ritual of the Night | Video Games | 5545992 |
| 301827 | Dark Souls™ - The Board Game | Tabletop Games | 5494493 |
| 292245 | Bring Reading Rainbow Back for Every Child, Everywhere! | Web | 5408917 |
| 143662 | ZeTime: World’s first smartwatch with hands over touchscreen | Wearables | 5333793 |
| 1874 | Redefining Italian Luxury Watches - Filippo Loreti | Design | 5020667 |
Top 20 project with highest total backers
kable(head(ksdata[order(-ksdata$backers), c(2,3,11)],20))
| name | category | backers | |
|---|---|---|---|
| 187653 | Exploding Kittens | Tabletop Games | 219382 |
| 75901 | Fidget Cube: A Vinyl Desk Toy | Product Design | 154926 |
| 292245 | Bring Reading Rainbow Back for Every Child, Everywhere! | Web | 105857 |
| 148586 | The Veronica Mars Movie Project | Narrative Film | 91585 |
| 182658 | Double Fine Adventure | Video Games | 87142 |
| 23405 | Bears vs Babies - A Card Game | Tabletop Games | 85581 |
| 157271 | Pebble Time - Awesome Smartwatch, No Compromises | Product Design | 78471 |
| 239176 | Torment: Tides of Numenera | Video Games | 74405 |
| 272925 | Project Eternity | Video Games | 73986 |
| 38292 | Yooka-Laylee - A 3D Platformer Rare-vival! | Video Games | 73206 |
| 215085 | ZNAPS -The $9 Magnetic Adapter for your mobile devices | Technology | 70122 |
| 368574 | Shenmue 3 | Video Games | 69320 |
| 282417 | Pebble: E-Paper Watch for iPhone and Android | Product Design | 68929 |
| 293644 | Mighty No. 9 | Video Games | 67226 |
| 216630 | Pebble 2, Time 2 + All-New Pebble Core | Product Design | 66673 |
| 14851 | Bloodstained: Ritual of the Night | Video Games | 64867 |
| 97372 | Joking Hazard | Tabletop Games | 63758 |
| 6666 | OUYA: A New Kind of Video Game Console | Gaming Hardware | 63416 |
| 250255 | COOLEST COOLER: 21st Century Cooler that’s Actually Cooler | Product Design | 62642 |
| 196550 | Wasteland 2 | Video Games | 61290 |
Main Category Visualization
category.pledged <- ksdata %>%
group_by(main_category) %>%
summarize(total = sum(usd_pledged)) %>%
arrange(desc(total))
category.pledged$main_category <- factor(category.pledged$main_category, levels = category.pledged$main_category)
ggplot(category.pledged, aes(x = main_category, y = total / 1000000)) +
geom_bar(stat = "identity", aes(fill = main_category)) +
labs(title = "Total Amount Pledged by Category", x = "Project Category", y = "Amount Pledged (in Millions USD)") +
theme(plot.title = element_text(hjust = 0.5), axis.text.x = element_text(angle = 90, hjust = 1), legend.position="null") +
geom_text(aes(label = paste0("$", round(total/1000000, 1))), vjust = -0.5)
Games, Design, and Technology were the most funded categories with total funds from these three categories more than 2 Billion USD.
Sub-Category Visualization
category.pledged <- ksdata %>%
group_by(category) %>%
summarize(total = sum(usd_pledged)) %>%
arrange(desc(total))
category.pledged$category <- factor(category.pledged$category, levels = category.pledged$category)
plot2b <- ggplot(head(category.pledged, 10), aes(x = category, y = total / 1000000)) +
geom_bar(stat = "identity", aes(fill = category)) +
labs(title = "Top 10 - Total Amount Pledged by Sub Category", x = "Sub Category", y = "Amount Pledged in USD") +
theme(plot.title = element_text(hjust = 0.5), axis.text.x = element_text(angle = 90, hjust = 1), legend.position="null") +
geom_text(aes(label = paste0(round(total/1000000, 1), "M")), vjust = -0.5)
plot2b
Product Design and Tabletop Games become the top 2 most popular sub-category with a total from both categories more than 1 Billion USD.
state.freq <- ksdata %>%
group_by(state) %>%
summarize(count = n()) %>%
arrange(desc(count))
state.freq$state <- factor(state.freq$state, levels = state.freq$state)
ggplot(state.freq, aes(x = state, y = count)) +
geom_bar(stat = "identity", aes(fill = state), show.legend = F) +
ggtitle("Project by Status") + xlab("Status") + ylab("Total") +
geom_text(aes(label = paste0(round(count/1000,1), "K")), vjust = -0.5)
But in reality, Kickstarter projects have more failed projects compare to successful projects.
state.grp <- ksdata %>%
filter(state!="undefined") %>%
mutate(grp=ifelse(state %in% c("successful", "failed"), "complete", "incomplete")) %>%
group_by(grp, state) %>%
summarize(count=n()) %>%
mutate(pct=count/sum(count)) %>%
arrange(grp, desc(-state))
state.grp$state <- factor(state.grp$state, levels=state.grp$state)
plot <- ggplot(state.grp, aes(grp, pct, fill=state)) + geom_bar(stat="identity") +
ggtitle("Project Status by Completion") + xlab("Project Completion") + ylab("Percentage") +
geom_text(aes(label=paste0(round(pct*100,1),"%")),
position=position_stack(vjust=0.5),
colour="white", size=5) +
theme(plot.title=element_text(hjust=0.5),
axis.title=element_text(size=12, face="bold"),
axis.text.x=element_text(size=12), legend.position="bottom",
legend.title=element_text(size=12, face="bold")) +
scale_y_continuous(labels=scales::percent)
ggplotly(plot)
As a comparison, the failed project has a 59.6% proportion compare to 40.4% successful projects. And surprisingly, 89.3% of projects were canceled before the funding time was finished.
state.pct <- ksdata %>%
filter(state %in% c("successful", "failed")) %>%
group_by(main_category, state) %>%
summarize(count=n()) %>%
mutate(pct=count/sum(count)) %>%
arrange(desc(state), pct)
state.pct$main_category <- factor(state.pct$main_category,
levels=state.pct$main_category[1:(nrow(state.pct)/2)])
ggplot(state.pct, aes(main_category, pct, fill=state)) + geom_bar(stat="identity") +
ggtitle("Success vs. Failure Rate by Project Category") +
xlab("Project Category") + ylab("Percentage") + scale_y_continuous(labels=scales::percent) +
scale_fill_discrete(name="Project Status", breaks=c("successful", "failed"),
labels=c("Success", "Failure")) +
geom_text(aes(label=paste0(round(pct*100,1),"%")), position=position_stack(vjust=0.5),
colour="white", size=5) +
theme(plot.title=element_text(hjust=0.5), axis.title=element_text(size=12, face="bold"),
axis.text.x=element_text(size=12), legend.position="bottom",
legend.title=element_text(size=12, face="bold")) + coord_flip()
Dance, Theater, Comics become the best success rate in Kickstarter.
year.freq <- ksdata %>%
filter(year(launched)!="1970") %>%
group_by(year=year(launched)) %>%
summarize(count=n())
ggplot(year.freq, aes(year, count, fill=count)) + geom_bar(stat="identity") +
ggtitle("Number of Projects by Launch Year") + xlab("Year") + ylab("Frequency") +
scale_x_discrete(limits=c(2009:2018)) +
geom_text(aes(label=paste0(count)), vjust=-0.5) +
theme(plot.title=element_text(hjust=0.5), axis.title=element_text(size=12, face="bold"),
axis.text.x=element_text(size=12), legend.position="null")
2015 is the top year in Kickstarter history with over 70k projects were released at that time.
state.pct2 <- ksdata %>%
filter(year(launched)!="1970", state %in% c("successful", "failed")) %>%
group_by(year=year(launched), state) %>%
summarize(count=n()) %>%
mutate(pct=count/sum(count)) %>%
arrange(desc(state))
ggplot(state.pct2, aes(year, pct, fill=state)) + geom_bar(stat="identity") +
ggtitle("Success vs. Failure Rate by Year Launched") +
xlab("Year") + ylab("Percentage") + scale_x_discrete(limits=c(2009:2017)) +
scale_y_continuous(labels=scales::percent) +
geom_text(aes(label=paste0(round(pct*100,1),"%")), position=position_stack(vjust=0.5),
colour="white", size=5) +
theme(plot.title=element_text(hjust=0.5), axis.title=element_text(size=12, face="bold"),
axis.text.x=element_text(size=12), legend.position="bottom",
legend.title=element_text(size=12, face="bold"))
Kickstarter’s failed projects were increasing from 2014-2015, but it’s getting better from 2016-2017 with the increasing trend in successful projects. Project owner needs to learn from the mistake in the past.
Kickstarter will be the top choice among crowdfunding platform in the creative industry, with Product Design, Tabletop Games, Video Games, and also Technology become popular category every year. Project owner need to make it more appealing to current market and learn from previous failed project in the past.