Over the past 20 years, the gaming industry has evolved immensely in both quantity and quality. This dataset, available on Kaggle (https://www.kaggle.com/egrinstein/20-years-of-games), provides information and overall online scores for game releases from the past two decades.
By exploring this dataset, I’m looking to review the most and least popular game platforms and genres.
## 'data.frame': 18625 obs. of 11 variables:
## $ X : int 0 1 2 3 4 5 6 7 8 9 ...
## $ score_phrase : Factor w/ 11 levels "Amazing","Awful",..: 1 1 6 6 6 5 2 1 2 5 ...
## $ title : Factor w/ 12589 levels "'Splosion Man",..: 5702 5703 9767 7249 7249 11406 2908 4446 2908 11406 ...
## $ url : Factor w/ 18577 levels "/games/0-d-beat-drop/xbox-360-14342395",..: 8390 8387 14319 10813 10812 16931 4271 6526 4270 16932 ...
## $ platform : Factor w/ 59 levels "Android","Arcade",..: 39 39 15 58 36 20 58 33 36 33 ...
## $ score : num 9 9 8.5 8.5 8.5 7 3 9 3 7 ...
## $ genre : Factor w/ 112 levels "Action","Action, Adventure",..: 64 64 69 94 94 105 38 82 38 105 ...
## $ editors_choice: Factor w/ 2 levels "N","Y": 2 2 1 1 1 1 1 2 1 1 ...
## $ release_year : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
## $ release_month : int 9 9 9 9 9 9 9 9 9 9 ...
## $ release_day : int 12 12 12 11 11 11 11 11 11 11 ...
This data set consists of 18,625 rows. The only missing values are 36 entries in the genre column; these were converted to “NA” and the affected rows were dropped.
## X score_phrase title url platform
## 0 0 0 0 0
## score genre editors_choice release_year release_month
## 0 36 0 0 0
## release_day
## 0
## X score_phrase title url platform
## 0 0 0 0 0
## score genre editors_choice release_year release_month
## 0 0 0 0 0
## release_day
## 0
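The two outputs above are per-column counts of missing values before and after the cleanup. A minimal sketch of that step on a toy data frame is shown here; it assumes the missing genres were stored as empty strings (the source only says they were "converted to NA"), and the object names `games` and `games_clean` are hypothetical.

```r
# Toy stand-in for the full data set; the values are invented for illustration.
games <- data.frame(
  title = c("Game A", "Game B", "Game C"),
  genre = c("Action", "", "RPG"),
  score = c(9.0, 7.5, 6.0),
  stringsAsFactors = FALSE
)

# Assumption: missing genres appear as empty strings, so mark them as NA.
games[games == ""] <- NA

# Count missing values per column (only genre should be non-zero).
na_before <- colSums(is.na(games))

# Drop every row with at least one missing value, then recount.
games_clean <- na.omit(games)
na_after <- colSums(is.na(games_clean))

print(na_before)
print(na_after)
```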
The data set contains the following game information:
Column | Details |
---|---|
Title | Game’s title |
Url | Link to the online review (will not be used in this analysis) |
Score | Game’s score according to the IGN website (Imagine Games Network) |
Score_Phrase | Categorical scoring |
Editors_choice | Editor’s choice vote (will not be used in this analysis) |
Release_year | Game’s release year |
Release_month | Game’s release month (will not be used in this analysis) |
Release_day | Game’s release day (will not be used in this analysis) |
Platform | Game’s platform |
Genre | Game’s genre |
The most important parameter in the data, in my opinion, is the IGN score. Other game features (such as platform, release year and genre) can be used to review the best and worst scored games, as well as to check for correlations between these variables and a game’s score.
## X score_phrase
## Min. : 0 Great :4765
## 1st Qu.: 4651 Good :4725
## Median : 9298 Okay :2940
## Mean : 9306 Mediocre:1957
## 3rd Qu.:13959 Amazing :1803
## Max. :18624 Bad :1268
## (Other) :1131
## title
## Cars : 10
## Madden NFL 07 : 10
## Open Season : 10
## Brain Challenge : 9
## LEGO Star Wars II: The Original Trilogy: 9
## Madden NFL 08 : 9
## (Other) :18532
## url
## /games/aladdin/gba-566703 : 2
## /games/big-league-sports/wii-14275098 : 2
## /games/blur/xbox-360-14222096 : 2
## /games/call-of-duty-modern-warfare-2/ps3-2550: 2
## /games/crash-twinsanity/ps2-667247 : 2
## /games/defiance/pc-71832 : 2
## (Other) :18577
## platform score genre editors_choice
## PC :3367 Min. : 0.500 Action :3797 N:15074
## PlayStation 2:1684 1st Qu.: 6.000 Sports :1916 Y: 3515
## Xbox 360 :1631 Median : 7.300 Shooter :1610
## Wii :1362 Mean : 6.951 Racing :1228
## PlayStation 3:1355 3rd Qu.: 8.200 Adventure:1175
## Nintendo DS :1044 Max. :10.000 Strategy :1071
## (Other) :8146 (Other) :7792
## release_year release_month release_day
## Min. :1970 Min. : 1.00 Min. : 1.0
## 1st Qu.:2003 1st Qu.: 4.00 1st Qu.: 8.0
## Median :2007 Median : 8.00 Median :16.0
## Mean :2007 Mean : 7.14 Mean :15.6
## 3rd Qu.:2010 3rd Qu.:10.00 3rd Qu.:23.0
## Max. :2016 Max. :12.00 Max. :31.0
##
As mentioned above, not all the columns were used. A new data set was created with only the columns relevant to this analysis. The main game features I’ll focus on are release year, genre and platform, which will serve as the explanatory variables for the score.
## title platform
## Cars : 10 PC :3367
## Madden NFL 07 : 10 PlayStation 2:1684
## Open Season : 10 Xbox 360 :1631
## Brain Challenge : 9 Wii :1362
## LEGO Star Wars II: The Original Trilogy: 9 PlayStation 3:1355
## Madden NFL 08 : 9 Nintendo DS :1044
## (Other) :18532 (Other) :8146
## score score_phrase genre release_year
## Min. : 0.500 Great :4765 Action :3797 Min. :1970
## 1st Qu.: 6.000 Good :4725 Sports :1916 1st Qu.:2003
## Median : 7.300 Okay :2940 Shooter :1610 Median :2007
## Mean : 6.951 Mediocre:1957 Racing :1228 Mean :2007
## 3rd Qu.: 8.200 Amazing :1803 Adventure:1175 3rd Qu.:2010
## Max. :10.000 Bad :1268 Strategy :1071 Max. :2016
## (Other) :1131 (Other) :7792
I’m basing my analysis on the assumption that this data set is comprised of all the games released in the gaming industry. In other words, this analysis will be on the gaming industry in its entirety.
At first there seemed to be duplicate games (repeated game titles), but after reviewing the data I found that the titles are the same while the platform differs. In other words, these are the same games released on different platforms. However, the score of each of these games was the same on all the platforms on which it was released. The duplicates were removed, and we are now left with 12,555 rows of data (~67% of the original data set).
Note: The duplicates were selected automatically: the first row for each title was kept and any later rows with the same title were dropped. Since the data is not sorted by platform, which platform’s row was kept is effectively arbitrary.
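The "keep the first row per title" rule can be sketched with base R's `duplicated()`. The data frame below is a toy example with invented scores, and `unique_games` is a hypothetical name; the original code may have used a different function.

```r
# Toy example: the same titles reviewed on several platforms.
games <- data.frame(
  title    = c("Cars", "Cars", "Blur", "Cars", "Blur"),
  platform = c("PC", "Wii", "Xbox 360", "Nintendo DS", "PlayStation 3"),
  score    = c(6.5, 6.5, 7.0, 6.5, 7.0),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later occurrences of each title,
# so negating it keeps only the first row seen for every title.
unique_games <- games[!duplicated(games$title), ]

print(unique_games)
```

Which platform survives depends only on row order, which is why the note above calls the selection arbitrary.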
The “oldest” game in the data set was from 1970, but this was the only release until the 90s, so the data for that game was dropped from the data set. We will focus on games released from 1996 onward.
[Figure 1]
We can see that the number of games released increased continuously from 2001 until it peaked in 2008. After that year, the number declined steadily. This is somewhat surprising, as one would expect the technological developments of the last decade to show up as more games released across platforms. Although this might seem to be a result of removing the duplicates (same game, different platforms), an analysis of the raw data revealed the same trends.
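The year filter and the per-year release counts behind this kind of plot could be computed roughly as follows. This is a sketch on a made-up vector of release years, not the real data, and the names `recent` and `releases_per_year` are hypothetical.

```r
# Invented release years, including one isolated early entry.
release_year <- c(1970, 1996, 1996, 2001, 2008, 2008, 2008, 2012)

# Keep only games released from 1996 onward.
recent <- release_year[release_year >= 1996]

# Count releases per year and find the year with the most releases.
releases_per_year <- table(recent)
peak_year <- names(releases_per_year)[which.max(releases_per_year)]

print(releases_per_year)
print(peak_year)
```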
There is a large number of genres in the data set because, for some games, the genre includes a sub-genre in addition to the main one. For example, there is an “Action” genre and an “Action, Adventure” genre. I decided to narrow down the genres by grouping them according to the main “value”, without the sub-category. This reduced the number of genres from 112 to 30.
## [1] "Action" "Action, Adventure"
## [3] "Action, Compilation" "Action, Editor"
## [5] "Action, Platformer" "Action, Puzzle"
## [7] "Action, RPG" "Action, Simulation"
## [9] "Action, Strategy" "Adult, Card"
## [11] "Adventure" "Adventure, Adult"
## [13] "Adventure, Adventure" "Adventure, Compilation"
## [15] "Adventure, Episodic" "Adventure, Platformer"
## [17] "Adventure, RPG" "Baseball"
## [19] "Battle" "Board"
## [21] "Board, Compilation" "Card"
## [23] "Card, Battle" "Card, Compilation"
## [25] "Card, RPG" "Casino"
## [27] "Compilation" "Compilation, Compilation"
## [29] "Compilation, RPG" "Educational"
## [31] "Educational, Action" "Educational, Adventure"
## [33] "Educational, Card" "Educational, Productivity"
## [35] "Educational, Puzzle" "Educational, Simulation"
## [37] "Educational, Trivia" "Fighting"
## [39] "Fighting, Action" "Fighting, Adventure"
## [41] "Fighting, Compilation" "Fighting, RPG"
## [43] "Fighting, Simulation" "Flight"
## [45] "Flight, Action" "Flight, Racing"
## [47] "Flight, Simulation" "Hardware"
## [49] "Hunting" "Hunting, Action"
## [51] "Hunting, Simulation" "Music"
## [53] "Music, Action" "Music, Adventure"
## [55] "Music, Compilation" "Music, Editor"
## [57] "Music, RPG" "Other"
## [59] "Other, Action" "Other, Adventure"
## [61] "Party" "Pinball"
## [63] "Pinball, Compilation" "Platformer"
## [65] "Platformer, Action" "Platformer, Adventure"
## [67] "Productivity" "Productivity, Action"
## [69] "Puzzle" "Puzzle, Action"
## [71] "Puzzle, Adventure" "Puzzle, Compilation"
## [73] "Puzzle, Platformer" "Puzzle, RPG"
## [75] "Puzzle, Word Game" "Racing"
## [77] "Racing, Action" "Racing, Compilation"
## [79] "Racing, Editor" "Racing, Shooter"
## [81] "Racing, Simulation" "RPG"
## [83] "RPG, Action" "RPG, Compilation"
## [85] "RPG, Editor" "RPG, Simulation"
## [87] "Shooter" "Shooter, Adventure"
## [89] "Shooter, First-Person" "Shooter, Platformer"
## [91] "Shooter, RPG" "Simulation"
## [93] "Simulation, Adventure" "Sports"
## [95] "Sports, Action" "Sports, Baseball"
## [97] "Sports, Compilation" "Sports, Editor"
## [99] "Sports, Fighting" "Sports, Golf"
## [101] "Sports, Other" "Sports, Party"
## [103] "Sports, Racing" "Sports, Simulation"
## [105] "Strategy" "Strategy, Compilation"
## [107] "Strategy, RPG" "Strategy, Simulation"
## [109] "Trivia" "Virtual Pet"
## [111] "Wrestling" "Wrestling, Simulation"
## [1] "Action" "Adult" "Adventure" "Baseball"
## [5] "Battle" "Board" "Card" "Casino"
## [9] "Compilation" "Educational" "Fighting" "Flight"
## [13] "Hardware" "Hunting" "Music" "Other"
## [17] "Party" "Pinball" "Platformer" "Productivity"
## [21] "Puzzle" "Racing" "RPG" "Shooter"
## [25] "Simulation" "Sports" "Strategy" "Trivia"
## [29] "Virtual Pet" "Wrestling"
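One way to get from the 112 combined labels to the 30 main genres is to cut each label at its first comma. The sketch below assumes that rule (it is consistent with the two lists above, but the original code is not shown) and uses a handful of labels taken from the first list; `main_genre` is a hypothetical name.

```r
# A few of the original genre labels listed above.
genre <- c("Action", "Action, Adventure", "Adult, Card",
           "Puzzle, Word Game", "Virtual Pet")

# Remove the first comma and everything after it, keeping the main genre.
main_genre <- sub(",.*$", "", genre)

print(main_genre)
print(length(unique(main_genre)))
```

Labels without a comma, such as "Virtual Pet", pass through unchanged.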
This confirmed that Action is the most common genre among game releases.
Platforms were also narrowed down, in the same way as the genres (since we are not looking into specific versions of the platforms, but the main platforms generally). This way we reduced the amount of platforms significantly.
Note: Some of the names were slightly altered in this process (for example, “Game Boy” is now “Game”), but this was disregarded because it has no impact on the analysis.
## [1] "Android" "Arcade" "Atari 2600"
## [4] "Atari 5200" "Commodore 64/128" "Dreamcast"
## [7] "Dreamcast VMU" "DVD / HD Video Game" "Game Boy"
## [10] "Game Boy Advance" "Game Boy Color" "Game.Com"
## [13] "GameCube" "Genesis" "iPad"
## [16] "iPhone" "iPod" "Linux"
## [19] "Lynx" "Macintosh" "Master System"
## [22] "N-Gage" "NeoGeo" "NeoGeo Pocket Color"
## [25] "NES" "New Nintendo 3DS" "Nintendo 3DS"
## [28] "Nintendo 64" "Nintendo 64DD" "Nintendo DS"
## [31] "Nintendo DSi" "Ouya" "PC"
## [34] "PlayStation" "PlayStation 2" "PlayStation 3"
## [37] "PlayStation 4" "PlayStation Portable" "PlayStation Vita"
## [40] "Pocket PC" "Saturn" "Sega 32X"
## [43] "Sega CD" "SteamOS" "Super NES"
## [46] "TurboGrafx-16" "TurboGrafx-CD" "Vectrex"
## [49] "Web Games" "Wii" "Wii U"
## [52] "Windows Phone" "Windows Surface" "Wireless"
## [55] "WonderSwan" "WonderSwan Color" "Xbox"
## [58] "Xbox 360" "Xbox One"
## [1] "Android" "Arcade" "Atari" "Commodore" "Dreamcast"
## [6] "DVD" "Game" "GameCube" "Genesis" "iPad"
## [11] "iPhone" "iPod" "Linux" "Lynx" "Macintosh"
## [16] "Master" "N-Gage" "NeoGeo" "NES" "Nintendo"
## [21] "PC" "PlayStation" "Pocket" "Saturn" "Sega"
## [26] "Super" "TurboGraf" "Vectrex" "Web" "Wii"
## [31] "Windows" "Wireless" "WonderSwan" "Xbox"
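A comparable reduction for platforms can be sketched by keeping only the text before the first space or dot. This is an assumed rule, not the original one: it reproduces cases such as "Game Boy" becoming "Game", but the list above shows that the author's pattern also shortened some hyphenated names (for example "TurboGrafx-16" became "TurboGraf"), which this sketch does not do. `main_platform` is a hypothetical name.

```r
# A few of the original platform labels listed above.
platform <- c("Game Boy", "Game Boy Advance", "Game.Com",
              "Xbox 360", "PlayStation 2", "Wii U", "PC")

# Keep the text before the first space or literal dot.
main_platform <- sub("[ .].*$", "", platform)

print(main_platform)
```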
After consulting with a gaming expert, I decided to narrow down both the genre and platform variables even further, by manually creating fewer categories:
[Figure 2]
[Figure 3]
This made the data a lot clearer for analysis and confirmed that Nintendo is the platform with the most releases and Action is the genre with the most games. We will look into these parameters later on. No additional “data cleaning” is needed at this point.
[Figure 4]
A preliminary review of the data revealed that the average score of all games in this data set is 6.95, while the median is 7.3. Since the median is higher than the mean, more than half of the scores are above the average.
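The reasoning here (a median above the mean implies that more than half of the values exceed the mean) can be checked directly. The scores below are invented to mimic a distribution with a few low outliers; they are not taken from the data set.

```r
# Invented scores with one low outlier pulling the mean down.
score <- c(3.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0)

avg <- mean(score)
med <- median(score)

# Share of games scoring strictly above the average.
share_above_avg <- mean(score > avg)

print(c(mean = avg, median = med, share_above_avg = share_above_avg))
```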
Let’s take a look at games with a score of 10:
[Figure 5]
It’s interesting to see the number of games per platform and genre that received a perfect score of 10, but since we are interested in the average overall score, we’ll continue with our analysis.
Scoring phrases are also available in this data set:
## [1] "Amazing" "Awful" "Bad" "Disaster" "Good"
## [6] "Great" "Masterpiece" "Mediocre" "Okay" "Painful"
## [11] "Unbearable"
However, I decided to create my own scoring categories: 6 equally divided levels (0.5-10).
## Very Low Low Below Average Average and Above
## 167 563 1539 2515
## High Very High
## 5271 2500
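Six equal-width levels over the range 0.5 to 10 can be built with `cut()`. The sketch below assumes the breaks are evenly spaced between 0.5 and 10 and that each interval includes its upper bound; the exact boundaries used in the original analysis are not shown, so treat these as an assumption. The scores are invented.

```r
# Invented scores spanning the full range.
score <- c(0.5, 2.0, 4.0, 6.0, 7.3, 8.5, 10)

# Seven evenly spaced break points define six equal-width bins.
breaks <- seq(0.5, 10, length.out = 7)
level_names <- c("Very Low", "Low", "Below Average",
                 "Average and Above", "High", "Very High")

# include.lowest = TRUE keeps the minimum score (0.5) inside the first bin.
score_level <- cut(score, breaks = breaks, labels = level_names,
                   include.lowest = TRUE)

print(table(score_level))
```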
Below I show the new scoring categories’ frequencies by release year, platform and genre:
[Figure 6]
[Figure 7]
[Figure 8]
We see that games released in 2015 and 2005 had the most “high” and “very high” scores, while the 1996 games were given the most “very low” rankings. As a fan of the classics, I found this counterintuitive.
The “high” rankings were also given mostly to Android and Apple games, while the “very high” rankings were awarded to games on Sega, Xbox and smaller, lesser-known platforms. There also seem to be a lot of “high” rankings in the “other platform” category, which we will explore later on. Lastly, and surprisingly, the well-known Nintendo platform has the most “very low” scores. Regarding the genre rankings, the highest rankings were awarded to RPG games, while the lowest rankings were given to racing games.
In this section I will review the games by platforms and genres:
[Figure 9]
As mentioned earlier, most games are from the action genre, mainly on the PC, Nintendo and Playstation platforms. The RPG genre stands out because, despite its relatively low number of games, they are evenly distributed among these platforms. The Nintendo platform is widely present in all genres, and there is a high number of games in the “other genre” category, which further emphasizes the need to review this category. Most Xbox, Sega and smaller-platform games are from the action genre, and these were the platforms with the highest number of very high rankings.
Let’s review the numerical scorings:
## newGenre AVG_Score
## 8 RPG 7.490524
## 9 Shooter 7.087476
## 12 Strategy 7.077423
## 5 Music 6.984559
## 4 Flight 6.916783
## 1 Action 6.915130
## 6 Other 6.890380
## 2 Advent. 6.856034
## 11 Sport 6.772688
## 7 Racing 6.740217
## 10 Sim 6.708039
## 3 Assorted 6.487437
## newPlatform AVG_Score
## 2 Apple 7.366534
## 1 Android 7.350000
## 8 Sega 7.272862
## 10 Wireless 7.217537
## 6 PC 7.093757
## 11 Xbox 7.087042
## 3 Atari 7.052941
## 9 SmallPlat. 7.023377
## 7 PlayStation 6.904976
## 4 Nintendo 6.546212
## 5 Other 6.064286
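Tables like the two above (average score per group, sorted from highest to lowest) can be produced with `aggregate()` and `order()`. The sketch below uses the column names `newGenre` and `AVG_Score` from the output, but the data values are invented and the original code may have been written differently.

```r
# Invented scores for three genres.
games <- data.frame(
  newGenre = c("RPG", "RPG", "Racing", "Racing", "Action"),
  score    = c(8.0, 7.0, 6.0, 7.0, 7.0)
)

# Mean score per genre.
avg_by_genre <- aggregate(score ~ newGenre, data = games, FUN = mean)
names(avg_by_genre)[2] <- "AVG_Score"

# Sort from the highest to the lowest average score.
avg_by_genre <- avg_by_genre[order(-avg_by_genre$AVG_Score), ]

print(avg_by_genre)
```

Because the rows are reordered after aggregation, the row labels stay out of sequence, as in the output above.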
[Figure 10]
[Figure 11]
The best rated genre is RPG and, as mentioned above, the big releasing platforms have similar numbers of games in this genre. Looking at the average scores, the best rated platforms are Apple, which ranks fifth in number of releases, and Android, which has the fewest releases. Sega has the highest median score, but this platform has too few releases to deduce anything from this observation. We can also see that Nintendo not only had the highest number of very low scores, but also has the lowest overall average score of any named platform (only the “Other” category is lower).
[Figure 12]
These are the platform and genre combinations with the most “very high” ratings:
Platform | Genre |
---|---|
Android | Strategy |
Apple | Strategy |
Apple | Other |
Apple | Music |
Apple | Adventure |
Atari | Flight |
Nintendo | Adventure |
Other | Music |
Other | Action |
PC | Racing |
PC | Music |
PC | Action |
Playstation | Strategy |
Playstation | Sport |
Playstation | Other |
Playstation | Action |
Sega | RPG |
Sega | Assorted |
Sega | Music |
Wireless | Shooter |
Wireless | RPG |
Wireless | Action |
Xbox | RPG |
Xbox | Other |
Xbox | Music |
I will now plot the genre-platform combinations against the numerical score (different colors were used to emphasize the differences between the results):
[Figure 13]
Here we can see that the combination Playstation - Strategy has the highest numerical score.
Below I review the average score of the platforms with the largest number of releases. These are not the best rated platforms, but it is more adequate to analyze the platforms with the most data (as the highest rated platforms have fewer releases):
[Figure 14]
We might have an answer to the never-ending debate over which is the best platform: Xbox or Playstation. These consoles have been in direct competition throughout the years, and fans constantly debate which one is superior. Since our data shows that its median scores are higher across most genres, Xbox wins! Nintendo, the biggest platform, has numerous games in all genres, but they are mostly just above average.
[Figure 15]
A short review of the smaller-platform games reveals that there is little variation in the genres released by each company. “NeoGeo” stands out as having the largest number of releases and the most variety in genres.
[Figure 16]
The “Other” category had two highly rated combinations, in the “Action” and “Music” genres. Only the “Action” games are visible in this plot, and their scores are not as high as expected.
[Figure 17]
RPG is the best genre, and the best RPG games are from the Sega Platform, while Nintendo’s racing games still stand out as the worst rated. Let’s take a look at this genre more closely:
[Figure 18]
In 1996, “Playstation” had almost complete control of the RPG genre. The PC, Nintendo and Sega platforms quickly closed the gap. Sega released games in this successful genre in 2001, but by 2002 it had left this market, while Nintendo and PC continued their releases throughout the years (joined by Xbox and Apple alternately).
By analyzing this data set, we learned that the gaming industry, mostly PC, Nintendo and Playstation, has flooded the market with action games in the past 20 years. However, RPG games have scored higher, and even though the big platforms have all released RPG games, Sega has actually dominated this genre (not in quantity, but in quality).
The Nintendo platform is widely present in all genres and has a fair number of the highest scores, but its overall performance was rather average. Smaller platforms tend not to diversify and mainly focus on one or two specific genres, and their overall scores are also average. Regarding the release years, 2008 was the golden year in number of releases, but 2015 and 2005 had the most highly scored games.
Finally, the biggest revelation we found is that Xbox wins the Playstation-Xbox battle, by having the most highly scored games.