Jarek Kupisz
30th of October 2017
These questions are among the most important for game studios to answer when designing a new game.
Game development is such a complicated and resource-intensive process that, as a game maker, you cannot afford to get them wrong!
Fortunately, we have a solution for your headaches: the Video Game Sales Prediction app!
Access the app here (link to be added), and with a few clicks you will get to know your game’s sales potential down to the hardware platform level!
Just select your game’s genre, the publisher you wish to work with, and the ESRB rating. The responsive interface allows fast comparison between multiple scenarios.
We used data from an industry-leading source: vgchartz.com. We cleaned and pre-processed the data and limited it to current platforms and the years 2011-2016, so our predictions are as accurate as possible! For the full code, see the technical documentation.
# The initial web scrape of the vgchartz.com data, used here as a .csv, comes from:
# https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings
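vg <- read.csv("data/Video_Games_Sales_as_at_22_Dec_2016.csv", stringsAsFactors = FALSE)
# Keep current platforms and releases from 2011 onwards. Year_of_Release is read
# as text (it contains "N/A"), so missing years are dropped explicitly.
vg <- subset(vg, Year_of_Release >= 2011 &
                 Platform %in% c("3DS", "PC", "PS4", "PSV", "WiiU", "XOne") &
                 Year_of_Release != "N/A")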
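# Strip a leading space from game names
first_space <- grep("^ ", vg$Name)
vg$Name[first_space] <- gsub("^ ", "", vg$Name[first_space])
# Games released on several platforms: reuse a known rating for all their entries
mp_games <- names(table(vg$Name)[table(vg$Name) > 1])
for (g in mp_games) {
  rating <- max(vg$Rating[vg$Name == g])  # "" sorts first, so any known rating wins
  if (nchar(rating) != 0) {
    vg$Rating[vg$Name == g] <- rating
  }
}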
# Fill the remaining missing ratings by sampling from each genre's observed ratings
for (g in unique(vg$Genre)) {
  ratings <- table(vg$Rating[vg$Genre == g & vg$Rating != ""])
  ratings_prob <- ratings / sum(ratings)  # share of each rating within the genre
  to_fill <- vg$Rating == "" & vg$Genre == g
  set.seed(1)
  vg$Rating[to_fill] <- sample(names(ratings), sum(to_fill), replace = TRUE,
                               prob = ratings_prob)
}
We use the power of a gradient boosting machine to create many small “rule of thumb” predictors (e.g. when the publisher is EA and the genre is Shooter, sales will on average exceed 900,000) and combine them into the ultimate prediction app!
For more information, see the technical documentation.
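To make the “rule of thumb” idea more concrete, here is a minimal base-R sketch of gradient boosting with one-split trees (stumps) under squared error. It is only an illustration, not the code behind the app: the `toy` data is made up, the helper names (`fit_stump`, `predict_stump`, `boost`, `predict_boost`) are hypothetical, and the real `gbm` package is considerably more sophisticated.

```r
# Toy data, invented for illustration (not taken from the vgchartz.com dataset)
toy <- data.frame(
  publisher = c("A", "A", "B", "B", "A", "B", "A", "B"),
  genre     = c("Shooter", "Sports", "Shooter", "Sports",
                "Shooter", "Sports", "Sports", "Shooter"),
  sales     = c(2.0, 0.8, 1.1, 0.3, 2.2, 0.4, 0.9, 1.0),
  stringsAsFactors = FALSE
)

# One "rule of thumb": the single `feature == level` split that best explains
# the current residuals (smallest sum of squared errors).
fit_stump <- function(data, resid, features) {
  best <- NULL
  for (f in features) {
    for (lv in unique(data[[f]])) {
      inside <- data[[f]] == lv
      if (all(inside) || !any(inside)) next  # not a real split
      pred <- ifelse(inside, mean(resid[inside]), mean(resid[!inside]))
      sse <- sum((resid - pred)^2)
      if (is.null(best) || sse < best$sse) {
        best <- list(feature = f, level = lv, sse = sse,
                     yes = mean(resid[inside]), no = mean(resid[!inside]))
      }
    }
  }
  best
}

# Apply one rule: mean residual of the matching or the non-matching group
predict_stump <- function(stump, data) {
  ifelse(data[[stump$feature]] == stump$level, stump$yes, stump$no)
}

# Boosting loop: start from the overall mean, then repeatedly fit a stump to
# the residuals and add a shrunken version of its prediction.
boost <- function(data, target, features, n_trees = 50, shrinkage = 0.1) {
  base <- mean(data[[target]])
  pred <- rep(base, nrow(data))
  stumps <- vector("list", n_trees)
  for (i in seq_len(n_trees)) {
    resid <- data[[target]] - pred
    stumps[[i]] <- fit_stump(data, resid, features)
    pred <- pred + shrinkage * predict_stump(stumps[[i]], data)
  }
  list(base = base, stumps = stumps, shrinkage = shrinkage)
}

# Combine all rules into one prediction
predict_boost <- function(model, data) {
  pred <- rep(model$base, nrow(data))
  for (s in model$stumps) {
    pred <- pred + model$shrinkage * predict_stump(s, data)
  }
  pred
}

model <- boost(toy, "sales", c("publisher", "genre"))
baseline_mse <- mean((toy$sales - mean(toy$sales))^2)
boosted_mse  <- mean((toy$sales - predict_boost(model, toy))^2)
```

Comparing `boosted_mse` with `baseline_mse` shows how much the combined rules improve on simply predicting the average. The `n_trees` and `shrinkage` arguments play the same role as `n.trees` and `shrinkage` in the tuning grid used for the real model.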
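library(caret)  # train(), trainControl(); method = "gbm" also needs the gbm package
# Resampling setup; verboseIter prints progress while tuning
train_ctrl <- trainControl(verboseIter = TRUE)
# Tuning grid: number of trees, learning rate and tree depth
train_grid <- expand.grid(n.trees = seq(50, 300, by = 50),
                          shrinkage = c(0.01, 0.05, 0.1),
                          interaction.depth = 1:3, n.minobsinnode = 10)
set.seed(1)
fit_gbm <- train(Global_Sales ~ ., data = vg, method = "gbm",
                 distribution = "gaussian", verbose = FALSE, trControl = train_ctrl,
                 tuneGrid = train_grid)
# Save the fitted model to disk
saveRDS(fit_gbm, "data/fit_gbm.rds")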