Data introduction:

In 20 years, the gaming industry has evolved immensely, both in quantity and in quality. This dataset, available on Kaggle (https://www.kaggle.com/egrinstein/20-years-of-games), provides information and overall online scoring of game releases from the past two decades.

By exploring this dataset, I’m looking to review the most and least popular game platforms and genres.

Data Structure:

## 'data.frame':    18625 obs. of  11 variables:
##  $ X             : int  0 1 2 3 4 5 6 7 8 9 ...
##  $ score_phrase  : Factor w/ 11 levels "Amazing","Awful",..: 1 1 6 6 6 5 2 1 2 5 ...
##  $ title         : Factor w/ 12589 levels "'Splosion Man",..: 5702 5703 9767 7249 7249 11406 2908 4446 2908 11406 ...
##  $ url           : Factor w/ 18577 levels "/games/0-d-beat-drop/xbox-360-14342395",..: 8390 8387 14319 10813 10812 16931 4271 6526 4270 16932 ...
##  $ platform      : Factor w/ 59 levels "Android","Arcade",..: 39 39 15 58 36 20 58 33 36 33 ...
##  $ score         : num  9 9 8.5 8.5 8.5 7 3 9 3 7 ...
##  $ genre         : Factor w/ 112 levels "Action","Action, Adventure",..: 64 64 69 94 94 105 38 82 38 105 ...
##  $ editors_choice: Factor w/ 2 levels "N","Y": 2 2 1 1 1 1 1 2 1 1 ...
##  $ release_year  : int  2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
##  $ release_month : int  9 9 9 9 9 9 9 9 9 9 ...
##  $ release_day   : int  12 12 12 11 11 11 11 11 11 11 ...

This data set consists of 18,625 rows of data, with 36 missing values (all in the genre column), which were converted to “NA” and dropped. The two tables below show the number of missing values per column before and after this step.

##              X   score_phrase          title            url       platform 
##              0              0              0              0              0 
##          score          genre editors_choice   release_year  release_month 
##              0             36              0              0              0 
##    release_day 
##              0
##              X   score_phrase          title            url       platform 
##              0              0              0              0              0 
##          score          genre editors_choice   release_year  release_month 
##              0              0              0              0              0 
##    release_day 
##              0
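
The missing-value handling described above could be sketched roughly as follows. This is a minimal illustration on an invented toy data frame, not the original analysis code: the `games` object, its values, and the assumption that missing genres appear as empty strings are all hypothetical.

```r
# Toy stand-in for the IGN data; all values are invented for illustration.
games <- data.frame(
  title = c("Game A", "Game B", "Game C", "Game D"),
  genre = c("Action", "", "Puzzle", NA),
  score = c(9.0, 7.5, 6.0, 8.2),
  stringsAsFactors = FALSE
)

# Assumption: blank strings mark a missing genre, so convert them to NA.
games$genre[which(games$genre == "")] <- NA

# Count missing values per column before dropping anything.
na_before <- colSums(is.na(games))

# Drop every row that has at least one NA, then count again.
games_clean <- na.omit(games)
na_after <- colSums(is.na(games_clean))

print(na_before)
print(na_after)
print(nrow(games_clean))
```

Counting before and after, as in the two tables above, makes it easy to confirm that only the intended rows were removed.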

The data set contains the following game information:

Column          Details
Title           Game’s title
Url             Link to the online review (will not be used in this analysis)
Score           Game’s score according to the IGN website (Imagine Games Network)
Score_Phrase    Categorical scoring
Editors_choice  Editor’s choice vote (will not be used in this analysis)
Release_year    Game’s release year
Release_month   Game’s release month (will not be used in this analysis)
Release_day     Game’s release day (will not be used in this analysis)
Platform        Game’s platform
Genre           Game’s genre

The most important parameter in the data, in my opinion, is the IGN score. Other game features (such as platform, release year and genre) can be used to review the best and worst scored games, as well as to check correlations between these variables and a game’s score.

Initial data review:

##        X           score_phrase 
##  Min.   :    0   Great   :4765  
##  1st Qu.: 4651   Good    :4725  
##  Median : 9298   Okay    :2940  
##  Mean   : 9306   Mediocre:1957  
##  3rd Qu.:13959   Amazing :1803  
##  Max.   :18624   Bad     :1268  
##                  (Other) :1131  
##                                      title      
##  Cars                                   :   10  
##  Madden NFL 07                          :   10  
##  Open Season                            :   10  
##  Brain Challenge                        :    9  
##  LEGO Star Wars II: The Original Trilogy:    9  
##  Madden NFL 08                          :    9  
##  (Other)                                :18532  
##                                             url       
##  /games/aladdin/gba-566703                    :    2  
##  /games/big-league-sports/wii-14275098        :    2  
##  /games/blur/xbox-360-14222096                :    2  
##  /games/call-of-duty-modern-warfare-2/ps3-2550:    2  
##  /games/crash-twinsanity/ps2-667247           :    2  
##  /games/defiance/pc-71832                     :    2  
##  (Other)                                      :18577  
##           platform        score              genre      editors_choice
##  PC           :3367   Min.   : 0.500   Action   :3797   N:15074       
##  PlayStation 2:1684   1st Qu.: 6.000   Sports   :1916   Y: 3515       
##  Xbox 360     :1631   Median : 7.300   Shooter  :1610                 
##  Wii          :1362   Mean   : 6.951   Racing   :1228                 
##  PlayStation 3:1355   3rd Qu.: 8.200   Adventure:1175                 
##  Nintendo DS  :1044   Max.   :10.000   Strategy :1071                 
##  (Other)      :8146                    (Other)  :7792                 
##   release_year  release_month    release_day  
##  Min.   :1970   Min.   : 1.00   Min.   : 1.0  
##  1st Qu.:2003   1st Qu.: 4.00   1st Qu.: 8.0  
##  Median :2007   Median : 8.00   Median :16.0  
##  Mean   :2007   Mean   : 7.14   Mean   :15.6  
##  3rd Qu.:2010   3rd Qu.:10.00   3rd Qu.:23.0  
##  Max.   :2016   Max.   :12.00   Max.   :31.0  
## 

My analysis:

As mentioned above, not all the columns were used. A new data set was created with the relevant columns for this analysis: The main game features I’ll focus on are release year, genre and platform. These will serve as the explanatory variables for the scoring.

##                                      title                platform   
##  Cars                                   :   10   PC           :3367  
##  Madden NFL 07                          :   10   PlayStation 2:1684  
##  Open Season                            :   10   Xbox 360     :1631  
##  Brain Challenge                        :    9   Wii          :1362  
##  LEGO Star Wars II: The Original Trilogy:    9   PlayStation 3:1355  
##  Madden NFL 08                          :    9   Nintendo DS  :1044  
##  (Other)                                :18532   (Other)      :8146  
##      score          score_phrase        genre       release_year 
##  Min.   : 0.500   Great   :4765   Action   :3797   Min.   :1970  
##  1st Qu.: 6.000   Good    :4725   Sports   :1916   1st Qu.:2003  
##  Median : 7.300   Okay    :2940   Shooter  :1610   Median :2007  
##  Mean   : 6.951   Mediocre:1957   Racing   :1228   Mean   :2007  
##  3rd Qu.: 8.200   Amazing :1803   Adventure:1175   3rd Qu.:2010  
##  Max.   :10.000   Bad     :1268   Strategy :1071   Max.   :2016  
##                   (Other) :1131   (Other)  :7792

I’m basing my analysis on the assumption that this data set is comprised of all the games released in the gaming industry. In other words, this analysis will be on the gaming industry in its entirety.

Game Titles:

At first there seemed to be duplicate games (repeated game titles), but after reviewing the data I found that the titles are the same while the platform differs. In other words, these are the same games released on different platforms. However, the score of each of these games was the same on all the platforms on which it was released. The duplicates were removed, leaving 12,555 rows of data (~67% of the original data set).

Note: The duplicate selection was automatic: the first row for each title was kept and the following rows with the same title were dropped. Since the data is not sorted by platform, the platform that was kept for each title is effectively arbitrary.
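
A minimal sketch of this de-duplication step, assuming base R’s `duplicated()` was (or could be) used. The toy `games` data frame and its values are invented for illustration; only the column names come from the data set.

```r
# Toy data: the same title reviewed on several platforms (invented values).
games <- data.frame(
  title    = c("Cars", "Cars", "Open Season", "Cars", "Open Season", "Blur"),
  platform = c("PC", "Wii", "Xbox 360", "PlayStation 2", "PC", "PlayStation 3"),
  score    = c(6.5, 6.5, 5.0, 6.5, 5.0, 7.0),
  stringsAsFactors = FALSE
)

# Check the premise first: does each title have a single distinct score?
scores_per_title <- tapply(games$score, games$title,
                           function(s) length(unique(s)))
same_score <- all(scores_per_title == 1)

# Keep the first row of each title; later rows with the same title are dropped.
unique_games <- games[!duplicated(games$title), ]

print(same_score)
print(unique_games)
```

Because `duplicated()` keeps the first occurrence, the surviving platform depends only on row order, which is why the kept platform is arbitrary when the data is not sorted by platform.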

Game release years:

The “oldest” game in the data set was from 1970, but this was the only release until the 90s, so the data for that game was dropped from the data set. We will focus on games released from 1996 onward.

[Figure 1]

We can see that the number of games released increased continuously from 2001 until it peaked in 2008. Since then, the number has decreased steadily. This is somewhat surprising, as one would expect the technological developments in the field over the last decade to be reflected in the number of games released across platforms. Although this might seem to be a result of removing the duplicates (same game, different platforms), an analysis of the raw data revealed the same trends.
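
The year filter and the per-year release counts behind a plot like Figure 1 could be sketched as follows. The toy `games` data frame is invented for illustration; only the `release_year` column name and the 1996 cutoff come from the text.

```r
# Toy data with one very old outlier (invented values).
games <- data.frame(
  title        = c("Old Game", "Game A", "Game B", "Game C", "Game D"),
  release_year = c(1970, 1996, 2008, 2008, 2012),
  stringsAsFactors = FALSE
)

# Keep only games released from 1996 onward.
recent <- subset(games, release_year >= 1996)

# Count releases per year; the table names are the years as strings.
releases_per_year <- table(recent$release_year)

# Year with the most releases in this toy sample.
peak_year <- names(which.max(releases_per_year))

print(releases_per_year)
print(peak_year)
```

Running the same count on the raw data and on the de-duplicated data is one way to check that a trend is not an artifact of removing duplicates.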

Game genres:

There is a large number of genres in the data set because, for some games, the genre includes a sub-genre in addition to the main one. For example, there is an “Action” genre and an “Action, Adventure” genre. I decided to narrow down the genres by grouping them according to the main “value”, without the sub-category. This reduced the number of genres from 112 to 30.

##   [1] "Action"                    "Action, Adventure"        
##   [3] "Action, Compilation"       "Action, Editor"           
##   [5] "Action, Platformer"        "Action, Puzzle"           
##   [7] "Action, RPG"               "Action, Simulation"       
##   [9] "Action, Strategy"          "Adult, Card"              
##  [11] "Adventure"                 "Adventure, Adult"         
##  [13] "Adventure, Adventure"      "Adventure, Compilation"   
##  [15] "Adventure, Episodic"       "Adventure, Platformer"    
##  [17] "Adventure, RPG"            "Baseball"                 
##  [19] "Battle"                    "Board"                    
##  [21] "Board, Compilation"        "Card"                     
##  [23] "Card, Battle"              "Card, Compilation"        
##  [25] "Card, RPG"                 "Casino"                   
##  [27] "Compilation"               "Compilation, Compilation" 
##  [29] "Compilation, RPG"          "Educational"              
##  [31] "Educational, Action"       "Educational, Adventure"   
##  [33] "Educational, Card"         "Educational, Productivity"
##  [35] "Educational, Puzzle"       "Educational, Simulation"  
##  [37] "Educational, Trivia"       "Fighting"                 
##  [39] "Fighting, Action"          "Fighting, Adventure"      
##  [41] "Fighting, Compilation"     "Fighting, RPG"            
##  [43] "Fighting, Simulation"      "Flight"                   
##  [45] "Flight, Action"            "Flight, Racing"           
##  [47] "Flight, Simulation"        "Hardware"                 
##  [49] "Hunting"                   "Hunting, Action"          
##  [51] "Hunting, Simulation"       "Music"                    
##  [53] "Music, Action"             "Music, Adventure"         
##  [55] "Music, Compilation"        "Music, Editor"            
##  [57] "Music, RPG"                "Other"                    
##  [59] "Other, Action"             "Other, Adventure"         
##  [61] "Party"                     "Pinball"                  
##  [63] "Pinball, Compilation"      "Platformer"               
##  [65] "Platformer, Action"        "Platformer, Adventure"    
##  [67] "Productivity"              "Productivity, Action"     
##  [69] "Puzzle"                    "Puzzle, Action"           
##  [71] "Puzzle, Adventure"         "Puzzle, Compilation"      
##  [73] "Puzzle, Platformer"        "Puzzle, RPG"              
##  [75] "Puzzle, Word Game"         "Racing"                   
##  [77] "Racing, Action"            "Racing, Compilation"      
##  [79] "Racing, Editor"            "Racing, Shooter"          
##  [81] "Racing, Simulation"        "RPG"                      
##  [83] "RPG, Action"               "RPG, Compilation"         
##  [85] "RPG, Editor"               "RPG, Simulation"          
##  [87] "Shooter"                   "Shooter, Adventure"       
##  [89] "Shooter, First-Person"     "Shooter, Platformer"      
##  [91] "Shooter, RPG"              "Simulation"               
##  [93] "Simulation, Adventure"     "Sports"                   
##  [95] "Sports, Action"            "Sports, Baseball"         
##  [97] "Sports, Compilation"       "Sports, Editor"           
##  [99] "Sports, Fighting"          "Sports, Golf"             
## [101] "Sports, Other"             "Sports, Party"            
## [103] "Sports, Racing"            "Sports, Simulation"       
## [105] "Strategy"                  "Strategy, Compilation"    
## [107] "Strategy, RPG"             "Strategy, Simulation"     
## [109] "Trivia"                    "Virtual Pet"              
## [111] "Wrestling"                 "Wrestling, Simulation"
##  [1] "Action"       "Adult"        "Adventure"    "Baseball"    
##  [5] "Battle"       "Board"        "Card"         "Casino"      
##  [9] "Compilation"  "Educational"  "Fighting"     "Flight"      
## [13] "Hardware"     "Hunting"      "Music"        "Other"       
## [17] "Party"        "Pinball"      "Platformer"   "Productivity"
## [21] "Puzzle"       "Racing"       "RPG"          "Shooter"     
## [25] "Simulation"   "Sports"       "Strategy"     "Trivia"      
## [29] "Virtual Pet"  "Wrestling"
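
One way to do the grouping shown in the two listings above is to keep only the text before the first comma. This is a sketch under that assumption, not necessarily the method used in the original analysis; the `genre` vector is a small hand-picked sample of the levels listed above.

```r
# A few genre levels taken from the listing above.
genre <- c("Action", "Action, Adventure", "Puzzle, Word Game",
           "Shooter, First-Person", "Virtual Pet", "RPG")

# Drop the first comma and everything after it, keeping the main genre.
main_genre <- sub(",.*$", "", genre)

print(main_genre)
print(length(unique(main_genre)))
```

Genres without a comma, such as “Virtual Pet”, pass through unchanged because the pattern only matches from a comma onward.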

This confirmed that most game releases have been in the Action genre.

Game platforms:

Platforms were also narrowed down in the same way as the genres (since we are not looking into specific versions of the platforms, but at the main platforms in general). This reduced the number of platforms from 59 to 34.

Note: Some of the names were slightly altered in this process (for example: “Game Boy” is now “Game”), but this was disregarded since it has no impact on the analysis.

##  [1] "Android"              "Arcade"               "Atari 2600"          
##  [4] "Atari 5200"           "Commodore 64/128"     "Dreamcast"           
##  [7] "Dreamcast VMU"        "DVD / HD Video Game"  "Game Boy"            
## [10] "Game Boy Advance"     "Game Boy Color"       "Game.Com"            
## [13] "GameCube"             "Genesis"              "iPad"                
## [16] "iPhone"               "iPod"                 "Linux"               
## [19] "Lynx"                 "Macintosh"            "Master System"       
## [22] "N-Gage"               "NeoGeo"               "NeoGeo Pocket Color" 
## [25] "NES"                  "New Nintendo 3DS"     "Nintendo 3DS"        
## [28] "Nintendo 64"          "Nintendo 64DD"        "Nintendo DS"         
## [31] "Nintendo DSi"         "Ouya"                 "PC"                  
## [34] "PlayStation"          "PlayStation 2"        "PlayStation 3"       
## [37] "PlayStation 4"        "PlayStation Portable" "PlayStation Vita"    
## [40] "Pocket PC"            "Saturn"               "Sega 32X"            
## [43] "Sega CD"              "SteamOS"              "Super NES"           
## [46] "TurboGrafx-16"        "TurboGrafx-CD"        "Vectrex"             
## [49] "Web Games"            "Wii"                  "Wii U"               
## [52] "Windows Phone"        "Windows Surface"      "Wireless"            
## [55] "WonderSwan"           "WonderSwan Color"     "Xbox"                
## [58] "Xbox 360"             "Xbox One"
##  [1] "Android"     "Arcade"      "Atari"       "Commodore"   "Dreamcast"  
##  [6] "DVD"         "Game"        "GameCube"    "Genesis"     "iPad"       
## [11] "iPhone"      "iPod"        "Linux"       "Lynx"        "Macintosh"  
## [16] "Master"      "N-Gage"      "NeoGeo"      "NES"         "Nintendo"   
## [21] "PC"          "PlayStation" "Pocket"      "Saturn"      "Sega"       
## [26] "Super"       "TurboGraf"   "Vectrex"     "Web"         "Wii"        
## [31] "Windows"     "Wireless"    "WonderSwan"  "Xbox"

After consulting with a gaming expert, I decided to narrow down both the genre and platform variables even further, by manually creating fewer categories:

Platform:

  1. PlayStation
  2. Xbox
  3. Nintendo (includes: ‘Game’,‘GameCube’, ‘NES’, ‘New’ ,‘Nintendo’,‘Super’,‘Wii’)
  4. Sega (includes: ‘Dreamcast’,‘Genesis’, ‘Master’,‘Saturn’ ,‘Sega’)
  5. Atari (includes: ‘Atari’,‘Lynx’)
  6. Apple (includes: ‘iPad’,‘iPhone’,‘iPod’,‘Macintosh’)
  7. PC (includes:‘Pocket’,‘PC’,‘Windows’,‘SteamOS’,‘linux’)
  8. Android (includes: ‘Android’,‘Ouya’)
  9. Small_Platforms (includes: ‘WonderSwan’, ‘N-Gage’, ‘TurboGraf’, ‘TurboGrafx-16’, ‘TurboGrafx-CD’, ‘NeoGeo’)
  10. Wireless
  11. Other (includes other assorted platforms)

Genre:

  1. Action (includes: ‘Action’, ‘Strategy’,‘Fighting’)
  2. Adventure (includes: ‘Adventure’)
  3. Sport (includes: ‘Baseball’,‘Sports’,‘Wrestling’,‘Hunting’)
  4. Platform (includes: ‘Platform’)
  5. Racing (includes: ‘Racing’)
  6. RPG (includes: ‘RPG’)
  7. Sim (includes: ‘Simulation’,‘Productivity’)
  8. Strategy (includes: ‘Puzzle’,‘Strategy’)
  9. Music (includes: ‘Music’)
  10. Flight (includes: ‘Flight’)
  11. Shooter (includes:‘Shooter’)
  12. Assorted_genre (includes: ‘Party’,‘Card’,‘Pinball’,‘Compilation’,‘Educational’)
  13. Other (includes other genres)
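
A manual regrouping like the two lists above can be expressed as a named lookup vector, with anything unmapped falling into “Other”. The sketch below covers only part of the platform list, and the names `platform_map` and `new_platform` are hypothetical rather than taken from the original code.

```r
# Partial lookup table: narrowed platform name -> manual category.
# Only a subset of the mappings listed above is included here.
platform_map <- c(
  PlayStation = "PlayStation", Xbox = "Xbox",
  Game = "Nintendo", GameCube = "Nintendo", NES = "Nintendo", Wii = "Nintendo",
  Dreamcast = "Sega", Genesis = "Sega", Saturn = "Sega",
  iPad = "Apple", iPhone = "Apple",
  PC = "PC", Linux = "PC"
)

# Example input values (already narrowed to the main platform name).
platform <- c("PlayStation", "Wii", "Game", "Saturn", "iPhone", "Linux", "Vectrex")

# Look each platform up by name; unknown names come back as NA.
new_platform <- unname(platform_map[platform])

# Everything without an explicit mapping goes to the catch-all category.
new_platform[is.na(new_platform)] <- "Other"

print(new_platform)
```

Keeping the mapping in one table makes it easy to review with a domain expert and to change a single assignment without touching the rest of the code. The same pattern applies to the genre grouping.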

[Figure 2]

[Figure 3]

This made the data a lot clearer for analysis and confirmed that most games released are from the Nintendo platform and most games were from the Action genre. We will look into these parameters later on. No additional “data cleaning” is needed at this point.

Scoring:

[Figure 4]

A preliminary review of the data revealed that the average score of all games in this data set is 6.95, while the median is 7.3. Since the median is higher than the mean, more than half of the scores are above the average, and the distribution is skewed toward the lower scores.
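
As a quick illustration of that reasoning, the sketch below compares mean and median on a small invented score vector with a low outlier. The numbers are hypothetical and are not taken from the data set.

```r
# Invented scores with one low outlier pulling the mean down.
score <- c(3.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0)

avg_score <- mean(score)
med_score <- median(score)

# Share of scores strictly above the mean.
share_above_mean <- mean(score > avg_score)

print(c(mean = avg_score, median = med_score, share_above = share_above_mean))
```

When the median exceeds the mean, at least half of the values sit at or above the median and therefore above the mean, which is what this check shows.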

Let’s take a look at games with a score of 10:

[Figure 5]

It’s interesting to see the amount of games per platform and genre that received a perfect 10 score, but since we are interested in the average overall score, we’ll continue on with our analysis.

Scoring phrases are also available in this data set:

##  [1] "Amazing"     "Awful"       "Bad"         "Disaster"    "Good"       
##  [6] "Great"       "Masterpiece" "Mediocre"    "Okay"        "Painful"    
## [11] "Unbearable"

However, I decided to create my own scoring categories: 6 equally divided levels covering the score range (0.5-10).

##          Very Low               Low     Below Average Average and Above 
##               167               563              1539              2515 
##              High         Very High 
##              5271              2500
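
Equal-width categories like these can be built with `cut()`. The sketch below assumes six equal-width bins between 0.5 and 10 with the labels shown above; the exact break points used in the original analysis are not stated, so the `breaks` vector here is an assumption, and the scores are invented.

```r
# Invented example scores.
score <- c(0.5, 3.0, 4.5, 6.0, 7.5, 9.0, 10)

# Assumption: six equal-width bins spanning the full score range 0.5-10.
breaks <- seq(0.5, 10, length.out = 7)
labels <- c("Very Low", "Low", "Below Average",
            "Average and Above", "High", "Very High")

# include.lowest = TRUE keeps the minimum score (0.5) in the first bin.
score_cat <- cut(score, breaks = breaks, labels = labels, include.lowest = TRUE)

print(table(score_cat))
```

Passing explicit break points, rather than `breaks = 6`, keeps the bin edges fixed and independent of the scores that happen to be in the sample.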

Below I show the new scoring categories’ frequencies by release year, platform and genre:

[Figure 6]

[Figure 7]

[Figure 8]

We see that games released in 2015 and 2005 had the most “high” and “very high” scores, while the 1996 games were given the most “very low” rankings. As a fan of the classics, I found this counterintuitive.
The “high” rankings were also given mostly to Android and Apple games, while the “very high” rankings were awarded to Sega, Xbox and smaller, lesser-known platform games. There also seem to be a lot of “high” rankings in the “other platform” category, which we will explore later on. Lastly, and surprisingly, the well-known Nintendo platform has the most “very low” scores. Regarding the genre rankings, the highest rankings were awarded to RPG games while the lowest rankings were given to racing games.

Bivariate/Multivariate Analysis:

In this section I will review the games by platforms and genres:

[Figure 9]

As mentioned earlier, most games are from the action genre, mainly on the PC, Nintendo and Playstation platforms. The RPG genre stands out because, despite its relatively low number of games, they are evenly distributed among the mentioned platforms. The Nintendo platform is widely present in all genres, and there is a high number of games in the “other genre” category, which further emphasizes the need to review this category. Most Xbox, Sega and smaller platform games are from the action genre, and these were the platforms with the highest number of “very high” rankings.

Let’s review the numerical scorings:

##    newGenre AVG_Score
## 8       RPG  7.490524
## 9   Shooter  7.087476
## 12 Strategy  7.077423
## 5     Music  6.984559
## 4    Flight  6.916783
## 1    Action  6.915130
## 6     Other  6.890380
## 2   Advent.  6.856034
## 11    Sport  6.772688
## 7    Racing  6.740217
## 10      Sim  6.708039
## 3  Assorted  6.487437
##    newPlatform AVG_Score
## 2        Apple  7.366534
## 1      Android  7.350000
## 8         Sega  7.272862
## 10    Wireless  7.217537
## 6           PC  7.093757
## 11        Xbox  7.087042
## 3        Atari  7.052941
## 9   SmallPlat.  7.023377
## 7  PlayStation  6.904976
## 4     Nintendo  6.546212
## 5        Other  6.064286
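
Tables like the two above can be produced with `aggregate()` followed by a descending sort. This is a sketch on invented data; the column names `newGenre` and `AVG_Score` follow the output above, but the code itself is an assumption about how such a table might be built.

```r
# Invented scores for three genres.
games <- data.frame(
  newGenre = c("RPG", "RPG", "Racing", "Racing", "Action"),
  score    = c(8.0, 7.0, 6.0, 7.0, 7.0),
  stringsAsFactors = FALSE
)

# Mean score per genre.
avg <- aggregate(score ~ newGenre, data = games, FUN = mean)
names(avg)[2] <- "AVG_Score"

# Sort from the best to the worst rated genre.
avg <- avg[order(-avg$AVG_Score), ]

print(avg)
```

Replacing `newGenre` with a platform column gives the per-platform table in the same way.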

[Figure 10]

[Figure 11]

The best rated genre is RPG and, as mentioned above, the big releasing platforms have similar numbers of games in this genre. Looking at the average scores, the best rated platforms are Apple, which ranks fifth in the number of releases, and Android, which has the lowest number of releases. Sega has the highest median score, but the number of releases on this platform is too low to deduce anything from this observation. We can also see that Nintendo not only had the highest number of very low scores, but also has the lowest overall average score.

Scoring categories:

[Figure 12]

These are the platform and genre combinations with the most “very high” ratings:

Platform     Genre
Android      Strategy
Apple        Strategy
Apple        Other
Apple        Music
Apple        Adventure
Atari        Flight
Nintendo     Adventure
Other        Music
Other        Action
PC           Racing
PC           Music
PC           Action
Playstation  Strategy
Playstation  Sport
Playstation  Other
Playstation  Action
Sega         RPG
Sega         Assorted
Sega         Music
Wireless     Shooter
Wireless     RPG
Wireless     Action
Xbox         RPG
Xbox         Other
Xbox         Music

I will now plot the genre-platform combinations against the numerical score (different coloring was used in order to emphasize the differences between the results):

[Figure 13]

Here we can see that the combination Playstation - Strategy has the highest numerical score.

Below I review the average score of the platforms with the largest number of releases. These are not the best rated platforms, but it is more adequate to analyze the platforms with the most data (as the highest rated platforms have fewer releases):

Nintendo, Playstation, PC and Xbox:

[Figure 14]

We might have an answer to the never-ending debate about which platform is best: Xbox or Playstation. These consoles have been in direct competition throughout the years and fans constantly debate which one is superior. Since our data shows that Xbox’s median scores are higher across most genres, Xbox wins! The biggest platform, Nintendo, has numerous games in all genres, but they are mostly rated just above average.

Smaller Platforms

[Figure 15]

A short review of the smaller platform games reveals that there is little variation in the genres released by each company. “NeoGeo” stands out as having the largest number of releases and the greatest variety of genres.

Other platform category

[Figure 16]

The “Other” category had two highly rated combinations, in the “Action” and “Music” genres. Only the “Action” games are numerous enough to be visible in this plot, and their scores are not as high as expected.

RPG and Racing

[Figure 17]

RPG is the best rated genre, and the best RPG games are from the Sega platform, while Nintendo’s racing games still stand out as the worst rated. Let’s take a closer look at the RPG genre:

[Figure 18]

In 1996, “Playstation” had almost complete control of the RPG genre. The PC, Nintendo and Sega platforms quickly closed the gap. Sega released games in this successful genre in 2001, but by 2002 it had left this market, while Nintendo and PC continued their releases throughout the years (joined by Xbox and Apple alternately).

Summary:

By analyzing this data set, we learned that the gaming industry, mostly PC, Nintendo and Playstation, has flooded the market with action games in the past 20 years. However, RPG games have been better received, and even though the big platforms have all released RPG games, Sega has actually dominated this genre (not in quantity, but in quality).

The Nintendo platform is widely present in all genres and has a fair share of the highest scores, but its overall performance was rather average. Smaller platforms tend not to diversify and mainly focus on one or two specific genres, and their overall scoring is also average. Regarding the release years, 2008 was the golden year in number of releases, but 2015 and 2005 had the most highly scored games.

Finally, the biggest revelation we found is that Xbox wins the Playstation-Xbox battle, by having the most highly scored games.