Data introduction:

In 20 years, the gaming industry has evolved immensely in both quantity and quality. This dataset, available on Kaggle (https://www.kaggle.com/egrinstein/20-years-of-games), provides information and overall online scores for games released over the past two decades.

By exploring this dataset, I’m looking to review the most and least popular game platforms and genres.

Data Structure:

## 'data.frame':    18625 obs. of  11 variables:
##  $ X             : int  0 1 2 3 4 5 6 7 8 9 ...
##  $ score_phrase  : Factor w/ 11 levels "Amazing","Awful",..: 1 1 6 6 6 5 2 1 2 5 ...
##  $ title         : Factor w/ 12589 levels "'Splosion Man",..: 5702 5703 9767 7249 7249 11406 2908 4446 2908 11406 ...
##  $ url           : Factor w/ 18577 levels "/games/0-d-beat-drop/xbox-360-14342395",..: 8390 8387 14319 10813 10812 16931 4271 6526 4270 16932 ...
##  $ platform      : Factor w/ 59 levels "Android","Arcade",..: 39 39 15 58 36 20 58 33 36 33 ...
##  $ score         : num  9 9 8.5 8.5 8.5 7 3 9 3 7 ...
##  $ genre         : Factor w/ 112 levels "Action","Action, Adventure",..: 64 64 69 94 94 105 38 82 38 105 ...
##  $ editors_choice: Factor w/ 2 levels "N","Y": 2 2 1 1 1 1 1 2 1 1 ...
##  $ release_year  : int  2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
##  $ release_month : int  9 9 9 9 9 9 9 9 9 9 ...
##  $ release_day   : int  12 12 12 11 11 11 11 11 11 11 ...

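This data set consists of 18,625 rows. The only missing values are in the genre column (36 rows); these were converted to “NA” and the affected rows were dropped. The two tables below show the number of missing values per column before and after this step.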

##              X   score_phrase          title            url       platform 
##              0              0              0              0              0 
##          score          genre editors_choice   release_year  release_month 
##              0             36              0              0              0 
##    release_day 
##              0
##              X   score_phrase          title            url       platform 
##              0              0              0              0              0 
##          score          genre editors_choice   release_year  release_month 
##              0              0              0              0              0 
##    release_day 
##              0
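
The missing-value check above can be sketched roughly as follows. This is a minimal illustration on a made-up data frame, not the original code; the data frame name `games` and the assumption that missing genres arrive as empty strings are mine.

```r
# Toy data frame standing in for the real data set (values are made up).
games <- data.frame(
  title = c("Game A", "Game B", "Game C", "Game D"),
  genre = c("Action", "", "RPG", ""),
  score = c(9.0, 7.5, 8.0, 6.0),
  stringsAsFactors = FALSE
)

# Assumption: missing genres are stored as empty strings in the raw file.
games$genre[games$genre == ""] <- NA

# Count missing values per column (corresponds to the first table).
na_before <- colSums(is.na(games))

# Drop every row that contains at least one NA, then count again.
games_clean <- na.omit(games)
na_after <- colSums(is.na(games_clean))

print(na_before)
print(na_after)
```

On the real data the first count would show 36 for genre and the second would show zeros everywhere.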

The data set contains the following game information:

Column         | Details
-------------- | -------
Title          | Game’s title
Url            | Link to the online review (will not be used in this analysis)
Score          | Game’s score according to the IGN website (Imagine Games Network)
Score_Phrase   | Categorical scoring
Editors_choice | Editor’s choice vote (will not be used in this analysis)
Release_year   | Game’s release year
Release_month  | Game’s release month (will not be used in this analysis)
Release_day    | Game’s release day (will not be used in this analysis)
Platform       | Game’s platform
Genre          | Game’s genre

The main focus of the data, in my opinion, is the IGN scoring. Other game features (such as platform, release year and genre) can be used to review the best and worst scored games, as well as to check correlations between these parameters and a game’s score.

My analysis:

The main game features I’ll focus on in this analysis are release year, genre and platform. These will serve as the explanatory variables for the scoring.

I’m basing my analysis on the assumption that this data set is comprised of all the games released in the gaming industry. In other words, this analysis will be on the gaming industry in its entirety.

Initial data review:

##        X           score_phrase 
##  Min.   :    0   Great   :4765  
##  1st Qu.: 4651   Good    :4725  
##  Median : 9298   Okay    :2940  
##  Mean   : 9306   Mediocre:1957  
##  3rd Qu.:13959   Amazing :1803  
##  Max.   :18624   Bad     :1268  
##                  (Other) :1131  
##                                      title      
##  Cars                                   :   10  
##  Madden NFL 07                          :   10  
##  Open Season                            :   10  
##  Brain Challenge                        :    9  
##  LEGO Star Wars II: The Original Trilogy:    9  
##  Madden NFL 08                          :    9  
##  (Other)                                :18532  
##                                             url       
##  /games/aladdin/gba-566703                    :    2  
##  /games/big-league-sports/wii-14275098        :    2  
##  /games/blur/xbox-360-14222096                :    2  
##  /games/call-of-duty-modern-warfare-2/ps3-2550:    2  
##  /games/crash-twinsanity/ps2-667247           :    2  
##  /games/defiance/pc-71832                     :    2  
##  (Other)                                      :18577  
##           platform        score              genre      editors_choice
##  PC           :3367   Min.   : 0.500   Action   :3797   N:15074       
##  PlayStation 2:1684   1st Qu.: 6.000   Sports   :1916   Y: 3515       
##  Xbox 360     :1631   Median : 7.300   Shooter  :1610                 
##  Wii          :1362   Mean   : 6.951   Racing   :1228                 
##  PlayStation 3:1355   3rd Qu.: 8.200   Adventure:1175                 
##  Nintendo DS  :1044   Max.   :10.000   Strategy :1071                 
##  (Other)      :8146                    (Other)  :7792                 
##   release_year  release_month    release_day  
##  Min.   :1970   Min.   : 1.00   Min.   : 1.0  
##  1st Qu.:2003   1st Qu.: 4.00   1st Qu.: 8.0  
##  Median :2007   Median : 8.00   Median :16.0  
##  Mean   :2007   Mean   : 7.14   Mean   :15.6  
##  3rd Qu.:2010   3rd Qu.:10.00   3rd Qu.:23.0  
##  Max.   :2016   Max.   :12.00   Max.   :31.0  
## 


Game Titles:

At first there seemed to be duplicate games (repeated game titles), but a closer look showed that the titles are the same while the platform differs: these are the same games released on different platforms. However, each of these games received the same score on every platform on which it was released. The duplicates were therefore removed, leaving 12,555 rows of data (~67% of the original data set).
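
One way to remove such duplicates is sketched below. The original analysis does not show which columns were used to identify a duplicate, so keeping one row per title and score pair is an assumption, and the data here is made up.

```r
# Toy data: the same title appears on several platforms with the same score.
games <- data.frame(
  title    = c("Game A", "Game A", "Game B", "Game C", "Game C"),
  platform = c("Xbox 360", "PlayStation 3", "PC", "Wii", "PC"),
  score    = c(9.0, 9.0, 7.5, 6.0, 6.0),
  stringsAsFactors = FALSE
)

# Assumption: a duplicate is a repeated (title, score) pair.
# duplicated() flags every repeat after the first occurrence.
is_repeat <- duplicated(games[, c("title", "score")])
games_unique <- games[!is_repeat, ]

print(games_unique)
```

Note that this keeps only the first platform listed for each game, so later per-platform counts depend on row order.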

Game release years:

The “oldest” game in the data set is from 1970, but it is the only release before the 90s, so it was dropped from the data set. We will focus on games released from 1996 onward.

We can see that the number of games released increased continuously from 2001 until it peaked in 2008. After that year, the number declined steadily. This was partly surprising, as one would expect the technological developments of the last decade to show up in the number of games released across platforms. This might seem to be a result of removing duplicates (same game, different platforms), but an analysis of the raw data revealed the same trends.
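
The year filter and the per-year counts can be sketched as follows, using made-up rows. The column name `release_year` comes from the data set; everything else is illustrative.

```r
# Toy data with one early outlier, mirroring the 1970 entry.
games <- data.frame(
  title        = paste("Game", 1:6),
  release_year = c(1970, 1996, 2001, 2008, 2008, 2016)
)

# Keep only games released from 1996 onward.
recent <- games[games$release_year >= 1996, ]

# Number of releases per year, and the year with the most releases.
releases_per_year <- table(recent$release_year)
peak_year <- names(releases_per_year)[which.max(releases_per_year)]

print(releases_per_year)
print(peak_year)
```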

Game genres:

There is a large number of genres in the data set because, for some games, the genre includes a sub-genre in addition to the main one. For example, there is an “Action” genre and an “Action, Adventure” genre. I decided to narrow down the genres by grouping them according to the main “value”, without the sub-category. This reduced the number of genres from 112 to 30.

##   [1] "Action"                    "Action, Adventure"        
##   [3] "Action, Compilation"       "Action, Editor"           
##   [5] "Action, Platformer"        "Action, Puzzle"           
##   [7] "Action, RPG"               "Action, Simulation"       
##   [9] "Action, Strategy"          "Adult, Card"              
##  [11] "Adventure"                 "Adventure, Adult"         
##  [13] "Adventure, Adventure"      "Adventure, Compilation"   
##  [15] "Adventure, Episodic"       "Adventure, Platformer"    
##  [17] "Adventure, RPG"            "Baseball"                 
##  [19] "Battle"                    "Board"                    
##  [21] "Board, Compilation"        "Card"                     
##  [23] "Card, Battle"              "Card, Compilation"        
##  [25] "Card, RPG"                 "Casino"                   
##  [27] "Compilation"               "Compilation, Compilation" 
##  [29] "Compilation, RPG"          "Educational"              
##  [31] "Educational, Action"       "Educational, Adventure"   
##  [33] "Educational, Card"         "Educational, Productivity"
##  [35] "Educational, Puzzle"       "Educational, Simulation"  
##  [37] "Educational, Trivia"       "Fighting"                 
##  [39] "Fighting, Action"          "Fighting, Adventure"      
##  [41] "Fighting, Compilation"     "Fighting, RPG"            
##  [43] "Fighting, Simulation"      "Flight"                   
##  [45] "Flight, Action"            "Flight, Racing"           
##  [47] "Flight, Simulation"        "Hardware"                 
##  [49] "Hunting"                   "Hunting, Action"          
##  [51] "Hunting, Simulation"       "Music"                    
##  [53] "Music, Action"             "Music, Adventure"         
##  [55] "Music, Compilation"        "Music, Editor"            
##  [57] "Music, RPG"                "Other"                    
##  [59] "Other, Action"             "Other, Adventure"         
##  [61] "Party"                     "Pinball"                  
##  [63] "Pinball, Compilation"      "Platformer"               
##  [65] "Platformer, Action"        "Platformer, Adventure"    
##  [67] "Productivity"              "Productivity, Action"     
##  [69] "Puzzle"                    "Puzzle, Action"           
##  [71] "Puzzle, Adventure"         "Puzzle, Compilation"      
##  [73] "Puzzle, Platformer"        "Puzzle, RPG"              
##  [75] "Puzzle, Word Game"         "Racing"                   
##  [77] "Racing, Action"            "Racing, Compilation"      
##  [79] "Racing, Editor"            "Racing, Shooter"          
##  [81] "Racing, Simulation"        "RPG"                      
##  [83] "RPG, Action"               "RPG, Compilation"         
##  [85] "RPG, Editor"               "RPG, Simulation"          
##  [87] "Shooter"                   "Shooter, Adventure"       
##  [89] "Shooter, First-Person"     "Shooter, Platformer"      
##  [91] "Shooter, RPG"              "Simulation"               
##  [93] "Simulation, Adventure"     "Sports"                   
##  [95] "Sports, Action"            "Sports, Baseball"         
##  [97] "Sports, Compilation"       "Sports, Editor"           
##  [99] "Sports, Fighting"          "Sports, Golf"             
## [101] "Sports, Other"             "Sports, Party"            
## [103] "Sports, Racing"            "Sports, Simulation"       
## [105] "Strategy"                  "Strategy, Compilation"    
## [107] "Strategy, RPG"             "Strategy, Simulation"     
## [109] "Trivia"                    "Virtual Pet"              
## [111] "Wrestling"                 "Wrestling, Simulation"
##  [1] "Action"       "Adult"        "Adventure"    "Baseball"    
##  [5] "Battle"       "Board"        "Card"         "Casino"      
##  [9] "Compilation"  "Educational"  "Fighting"     "Flight"      
## [13] "Hardware"     "Hunting"      "Music"        "Other"       
## [17] "Party"        "Pinball"      "Platformer"   "Productivity"
## [21] "Puzzle"       "Racing"       "RPG"          "Shooter"     
## [25] "Simulation"   "Sports"       "Strategy"     "Trivia"      
## [29] "Virtual Pet"  "Wrestling"
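
A minimal sketch of this narrowing, assuming the main genre is simply everything before the first comma (the original code is not shown):

```r
# A few genre values taken from the list above.
genre <- c("Action", "Action, Adventure", "Sports, Golf",
           "Virtual Pet", "RPG, Editor")

# Drop the first comma and everything after it, then trim stray whitespace.
main_genre <- trimws(sub(",.*$", "", genre))

print(unique(main_genre))
```

Values without a comma, such as “Virtual Pet”, pass through unchanged.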

This confirmed that most games reviewed are from the Action genre.

Game platforms:

Platforms were narrowed down in the same way as the genres, since we are not looking into specific versions of a platform but at the main platform in general. This reduced the number of platforms from 59 to 34.

Note: Some of the names are slightly altered (for example, “Game Boy” is now “Game”), but this was disregarded because it has no impact on the analysis.

##  [1] "Android"              "Arcade"               "Atari 2600"          
##  [4] "Atari 5200"           "Commodore 64/128"     "Dreamcast"           
##  [7] "Dreamcast VMU"        "DVD / HD Video Game"  "Game Boy"            
## [10] "Game Boy Advance"     "Game Boy Color"       "Game.Com"            
## [13] "GameCube"             "Genesis"              "iPad"                
## [16] "iPhone"               "iPod"                 "Linux"               
## [19] "Lynx"                 "Macintosh"            "Master System"       
## [22] "N-Gage"               "NeoGeo"               "NeoGeo Pocket Color" 
## [25] "NES"                  "New Nintendo 3DS"     "Nintendo 3DS"        
## [28] "Nintendo 64"          "Nintendo 64DD"        "Nintendo DS"         
## [31] "Nintendo DSi"         "Ouya"                 "PC"                  
## [34] "PlayStation"          "PlayStation 2"        "PlayStation 3"       
## [37] "PlayStation 4"        "PlayStation Portable" "PlayStation Vita"    
## [40] "Pocket PC"            "Saturn"               "Sega 32X"            
## [43] "Sega CD"              "SteamOS"              "Super NES"           
## [46] "TurboGrafx-16"        "TurboGrafx-CD"        "Vectrex"             
## [49] "Web Games"            "Wii"                  "Wii U"               
## [52] "Windows Phone"        "Windows Surface"      "Wireless"            
## [55] "WonderSwan"           "WonderSwan Color"     "Xbox"                
## [58] "Xbox 360"             "Xbox One"
##  [1] "Android"     "Arcade"      "Atari"       "Commodore"   "Dreamcast"  
##  [6] "DVD"         "Game"        "GameCube"    "Genesis"     "iPad"       
## [11] "iPhone"      "iPod"        "Linux"       "Lynx"        "Macintosh"  
## [16] "Master"      "N-Gage"      "NeoGeo"      "NES"         "Nintendo"   
## [21] "PC"          "PlayStation" "Pocket"      "Saturn"      "Sega"       
## [26] "Super"       "TurboGraf"   "Vectrex"     "Web"         "Wii"        
## [31] "Windows"     "Wireless"    "WonderSwan"  "Xbox"
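
A rough sketch of this step, assuming the platform name is cut at the first space. The exact rule used in the original analysis is not shown, and entries such as “TurboGraf” in the list above suggest it differed slightly, so treat this only as an approximation.

```r
# A few platform values taken from the list above.
platform <- c("Game Boy", "Game Boy Advance", "PlayStation 2",
              "Xbox 360", "Wii U", "PC")

# Assumption: keep only the first word of each platform name.
main_platform <- sub(" .*$", "", platform)

print(unique(main_platform))
```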

After consulting with a gaming expert, I decided to narrow both the genre and platform variables even further by manually creating fewer categories:

Platform:

  1. PlayStation
  2. Xbox
  3. Nintendo (includes: ‘Game’,‘GameCube’,‘NES’,‘New’,‘Nintendo’,‘Super’,‘Wii’)
  4. Sega (includes: ‘Dreamcast’,‘Genesis’, ‘Master’,‘Saturn’ ,‘Sega’)
  5. Atari (includes: ‘Atari’,‘Lynx’)
  6. Apple (includes: ‘iPad’,‘iPhone’,‘iPod’,‘Macintosh’)
  7. PC (includes: ‘Pocket’,‘PC’,‘Windows’,‘SteamOS’,‘Linux’)
  8. Android (includes: ‘Android’,‘Ouya’)
  9. Small_Platforms (includes: ‘WonderSwan’,‘N-Gage’,‘TurboGrafx-16’,‘TurboGrafx-CD’,‘NeoGeo’)

Genre:

  1. Action (includes: ‘Action’, ‘Strategy’,‘Fighting’)
  2. Adventure (includes: ‘Adventure’)
  3. Sport (includes: ‘Baseball’,‘Sports’,‘Wrestling’,‘Hunting’)
  4. Platform (includes: ‘Platform’)
  5. Racing (includes: ‘Racing’)
  6. RPG (includes: ‘RPG’)
  7. Sim (includes: ‘Simulation’,‘Productivity’)
  8. Strategy (includes: ‘Puzzle’,‘Strategy’)
  9. Music (includes: ‘Music’)
  10. Flight (includes: ‘Flight’)
  11. Shooter (includes: ‘Shooter’)
  12. Assorted_genre (includes: ‘Party’,‘Card’,‘Pinball’,‘Compilation’,‘Educational’)
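
A manual grouping like the ones listed above can be expressed as a named lookup vector. The sketch below covers only a handful of platforms, and sending unlisted values to an “Other” category is my assumption (an “Other” level does appear in the later score tables).

```r
# Partial lookup table: narrowed platform name -> manual group.
platform_groups <- c(
  PlayStation = "PlayStation",
  Xbox        = "Xbox",
  Game        = "Nintendo",
  Wii         = "Nintendo",
  Dreamcast   = "Sega",
  iPad        = "Apple",
  Linux       = "PC"
)

main_platform <- c("Wii", "Game", "iPad", "Dreamcast", "Vectrex")

# Look up each platform; names not in the table come back as NA.
new_platform <- unname(platform_groups[main_platform])

# Assumption: anything not explicitly listed falls into "Other".
new_platform[is.na(new_platform)] <- "Other"

print(new_platform)
```

The same pattern applies to the genre groups; only the lookup table changes.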

This made the data much clearer for analysis and confirmed that most games were released on Nintendo platforms and that most games belong to the Action genre. We will look into these parameters later on.

No additional “data cleaning” is needed at this point.

Scoring:

A preliminary review of the data revealed that the average score of all games in this data set is 6.95, while the median is 7.3. Since the median is higher than the mean, more than half of the scores are above the average, and the mean is pulled down by a tail of low scores.

Scoring phrases are also available in this data set:

##  [1] "Amazing"     "Awful"       "Bad"         "Disaster"    "Good"       
##  [6] "Great"       "Masterpiece" "Mediocre"    "Okay"        "Painful"    
## [11] "Unbearable"

However, I decided to create my own scoring categories: six equal-width levels spanning the score range (0.5 to 10).

##          Very Low               Low     Below Average Average and Above 
##               167               563              1539              2515 
##              High         Very High 
##              5271              2500

Below I show the new scoring categories’ frequencies by release year, platform and genre:

We see that the high and very high scores were given mostly to games released in 2015 and 2005, while the very low scores were given mostly to the oldest games, from 1996. High scores were also given mostly to Android and Apple games, while very high scores went to Sega, Xbox and smaller, lesser-known platforms. There also seem to be many high scores in the “other platform” category, which we will explore later on. The Nintendo platform surprisingly has the most very low scores. Regarding genres, the highest scores were awarded to RPG games, while the lowest were given to racing games.

Bivariate Analysis:

Bivariate Plots Section

In this section I will review the games by platforms and genres:

As mentioned earlier, most games are from the action genre, mainly on the PC, Nintendo and PlayStation platforms. The highly rated RPG genre is evenly distributed among these platforms. The Nintendo platform is widely present in all genres, and there is a high number of games in the “other genre” category, which further emphasizes the need to review this category. Most Xbox, Sega and smaller-platform games are from the action genre, and these were the platforms that had the highest number of very high rankings.

Let’s review the numerical scorings:

##    newGenre AVG_Score
## 8       RPG  7.490524
## 9   Shooter  7.087476
## 12 Strategy  7.077423
## 5     Music  6.984559
## 4    Flight  6.916783
## 1    Action  6.915130
## 6     Other  6.890380
## 2   Advent.  6.856034
## 11    Sport  6.772688
## 7    Racing  6.740217
## 10      Sim  6.708039
## 3  Assorted  6.487437
##    newPlatform AVG_Score
## 2        Apple  7.366534
## 1      Android  7.350000
## 8         Sega  7.272862
## 5        Other  7.191210
## 6           PC  7.093362
## 10        Xbox  7.087042
## 3        Atari  7.052941
## 9   SmallPlat.  7.042857
## 7  PlayStation  6.904976
## 4     Nintendo  6.546212
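
Tables like the two above can be produced by averaging the score within each group and sorting in descending order. The sketch below uses made-up scores; the column names `newGenre` and `AVG_Score` are taken from the output, while the data frame name `games` is an assumption.

```r
# Toy data: a few scores per grouped genre.
games <- data.frame(
  newGenre = c("RPG", "RPG", "Racing", "Racing", "Action"),
  score    = c(8.0, 7.0, 6.0, 7.0, 7.0)
)

# Mean score per genre.
avg_by_genre <- aggregate(score ~ newGenre, data = games, FUN = mean)
names(avg_by_genre)[2] <- "AVG_Score"

# Sort from the highest to the lowest average.
avg_by_genre <- avg_by_genre[order(-avg_by_genre$AVG_Score), ]

print(avg_by_genre)
```

Replacing `newGenre` with `newPlatform` gives the per-platform version.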


Scoring categories:

These are the platform and genre combinations with the highest number of very high ratings:

  - Android - Strategy
  - Apple - Strategy
  - Apple - Other
  - Apple - Music
  - Apple - Adventure
  - Atari - Flight
  - Nintendo - Adventure
  - Other - RPG
  - Other - Action
  - PC - Racing
  - PC - Music
  - PC - Action
  - PlayStation - Strategy
  - PlayStation - Sport
  - PlayStation - Other
  - PlayStation - Action
  - Sega - RPG
  - Sega - Assorted
  - Small Platforms - RPG
  - Xbox - RPG
  - Xbox - Other
  - Xbox - Music

I will now plot the same combinations with the numerical scoring (different coloring was used in order to emphasize the difference between the results):

Here we can see that the PlayStation - Strategy combination has the highest numerical score.

Below I review the average score of the platforms and genres that stood out:

Playstation, Xbox, Sega, Smaller Platforms, Nintendo

It is interesting to see that the smaller platforms have an extremely low average rating in the highly rated RPG genre, while being highly rated in the strategy genre. Sega’s best average score is actually in the music genre, but it is also highly rated for RPG games. Xbox’s best rated games are shooters.

Smaller Platforms

Other category

RPG and Racing

The best RPG games are definitely from the Sega platform, and Nintendo’s racing games still stand out as the worst rated.

Final Plots and Summary

We can see that Nintendo not only had the highest number of very low scores, but also has the lowest overall average score. Sega was among the platforms with the highest number of very high scores and also has a high overall average score, whereas Xbox and the smaller platforms are not among the highest averages. This places Sega as the top rated console platform; in the average-score table above, only Apple and Android rank higher. RPG continues to be the highest rated genre, and racing remains among the lowest rated.



Reflection