A look at the Pokemon games in the IGN data.

data file

df<-read.csv('/Users/jonathanbouchet/Desktop/WORK/PROJECT/gaming/IGN/ign.csv',sep=',')
#fix one entry having release_year = 1970
#df[df$release_year<1980,]
#compute the row selection once: after release_year is overwritten,
#the condition release_year<1980 would no longer match for month and day
badDate<-df$release_year<1980
df[badDate,'release_year']<-2012
df[badDate,'release_month']<-4
df[badDate,'release_day']<-27
library(ggplot2)
library(gridExtra)
library(dplyr)

I regrouped the genres and platforms into more general groups, for example:

#sony<-c('PlayStation','PlayStation 2','PlayStation 3','PlayStation 4' ,'PlayStation Portable','PlayStation Vita')
#and
#action<-c('Action','Action, Adventure','Action, Compilation','Action, Platformer','Action, Puzzle','Action, RPG','Action, Simulation','Action, Editor','Action, Strategy')

I create new columns for these two new features:

df$newGenre<-sapply(df$genre, genreBetter)
df$newPlatform<-sapply(df$platform, newManufacturer)
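The definitions of genreBetter and newManufacturer are not shown in this notebook. As a rough idea of what such a mapping function might look like, here is a minimal sketch. The sony vector is the one quoted above; the nintendo vector, the returned labels, the 'Other' fallback and the name newManufacturerSketch are all assumptions made for illustration, not the original code.

```r
# Hypothetical sketch: the real newManufacturer is not shown in this notebook.
sony <- c('PlayStation', 'PlayStation 2', 'PlayStation 3', 'PlayStation 4',
          'PlayStation Portable', 'PlayStation Vita')
# Assumed group, built from platform names that appear later in this notebook
nintendo <- c('Nintendo DS', 'Nintendo 3DS', 'Game Boy', 'Game Boy Color',
              'Game Boy Advance', 'Nintendo 64', 'GameCube', 'Wii', 'Wii U')

newManufacturerSketch <- function(x) {
    x <- as.character(x)  # the platform column may be stored as a factor
    if (x %in% sony) {
        'Sony'
    } else if (x %in% nintendo) {
        'Nintendo'
    } else {
        'Other'  # assumed fallback so that every input returns a value
    }
}

# vapply checks that each call returns exactly one character string
vapply(c('PlayStation 2', 'Wii U', 'PC'), newManufacturerSketch,
       character(1), USE.NAMES = FALSE)
```

Returning a value in every branch matters here: a function that returns nothing for unmatched input makes sapply produce a list instead of a character vector.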

I filter the initial data to keep only the Pokemon games. I also create another feature that separates portable from home console games.

pokeDf<-filter(df, grepl("Pokemon|Pokken",title))
#after using filter on factor level, we need to drop the unused ones
pokeDf <- droplevels(pokeDf)

pokeGamePort<-c('Nintendo DS','Game Boy','Game Boy Color','Game Boy Advance','Nintendo 3DS','iPhone','Android')
pokeGameHome<-c('Nintendo 64','GameCube','Wii','Wii U')

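PORT<-function(x){
    if (x %in% pokeGamePort) {return('PORTABLE')}
    else if (x %in% pokeGameHome) {return('HOME')}
    #fallback for any other platform: without it the function returns NULL
    #and sapply gives back a list instead of a character vector
    else {return(NA_character_)}
}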

pokeDf$TYPE<-sapply(pokeDf$platform,PORT)

Genre and Platform

g0<-ggplot(pokeDf,aes(x=factor(release_year))) + geom_bar(aes(fill=platform),color='black') + theme(axis.text.x = element_text(angle=90, hjust=1)) + xlab('Year')+ theme(legend.position="top")+ theme(legend.title=element_blank())
print(g0)

g1<-ggplot(pokeDf,aes(x=factor(release_year))) + geom_bar(aes(fill=newPlatform),color='black') + theme(axis.text.x = element_text(angle=90, hjust=1)) + xlab('Year')+ theme(legend.position="top")+ theme(legend.title=element_blank())
print(g1)

g2<-ggplot(pokeDf,aes(x=reorder(genre,genre,function(x)-length(x)))) + geom_bar(aes(fill=platform),color='black') + theme(axis.text.x = element_text(angle=90, hjust=1)) + xlab('genre') + theme(legend.title=element_blank())
print(g2)

g3<-ggplot(pokeDf,aes(x=factor(TYPE))) + geom_bar() + theme(axis.text.x = element_text(angle=90, hjust=1)) + xlab('TYPE')
print(g3)

Comments

  • Initially I was surprised to not see a game on the most recent platform (Wii U), only to find that my regex was incomplete.
  • From the top plot, we see that at least one Pokemon game was released each year.
  • The split between portable and home consoles is uneven: there are roughly 2.5 times as many games on portables as on home consoles.
  • Making the same plot for the regrouped platforms (the newPlatform plot), we see the shift in recent years towards new machines (Apple, Android).
  • Games are predominantly RPGs.
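
The portable-to-home ratio mentioned above can be read off the bar chart, but it can also be computed from the TYPE column. A minimal sketch, assuming the column holds the labels 'PORTABLE' and 'HOME'; the helper name typeRatio is hypothetical, and the toy input is made up to show the call, not taken from the IGN data.

```r
# Hypothetical helper: ratio of portable to home console games
typeRatio <- function(type) {
    type <- unlist(type)  # sapply may have returned a list rather than a vector
    sum(type == 'PORTABLE', na.rm = TRUE) / sum(type == 'HOME', na.rm = TRUE)
}

# made-up toy input (3 portable, 1 home), only to illustrate the call
typeRatio(c('PORTABLE', 'PORTABLE', 'PORTABLE', 'HOME'))

# on the data of this notebook one would call:
# typeRatio(pokeDf$TYPE)
```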

Scores

g4<-ggplot(pokeDf,aes(score)) + geom_histogram(aes(fill=(platform)),bins=100) + theme(legend.position=c(.2, .75)) + xlab('scores') + theme(legend.title=element_blank())
print(g4)

g5<-ggplot(pokeDf,aes(score)) + geom_histogram(aes(fill=(editors_choice)),bins=100) + theme(legend.position=c(.2, .75)) + xlab('scores') + theme(legend.title=element_blank())
print(g5)

#grid.arrange(g3,g4,ncol=2)

Comments

  • The games from the first generations (on Game Boy and Game Boy Color) are the ones that get the highest scores, according to IGN's ratings.
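
The histogram shows this visually; a per-platform summary would back it with numbers. A minimal sketch using base R's aggregate, assuming a data frame with platform and score columns as in pokeDf; the helper name meanScoreByPlatform is hypothetical, and the toy data frame uses made-up platforms and scores only to show the call.

```r
# Hypothetical helper: mean score per platform, highest first
meanScoreByPlatform <- function(d) {
    out <- aggregate(score ~ platform, data = d, FUN = mean)
    out[order(-out$score), ]
}

# made-up toy data, not IGN scores
toy <- data.frame(platform = c('A', 'A', 'B'), score = c(8, 9, 6))
meanScoreByPlatform(toy)

# on the data of this notebook one would call:
# meanScoreByPlatform(pokeDf)
```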