Over the past 20 years, the gaming industry has grown immensely in both quantity and quality. This dataset, available on Kaggle (https://www.kaggle.com/egrinstein/20-years-of-games), provides information and overall online scores for a range of games from the past two decades.
By exploring this dataset, I’m looking to review the most and least popular game platforms and genres.
## 'data.frame': 18625 obs. of 11 variables:
## $ X : int 0 1 2 3 4 5 6 7 8 9 ...
## $ score_phrase : Factor w/ 11 levels "Amazing","Awful",..: 1 1 6 6 6 5 2 1 2 5 ...
## $ title : Factor w/ 12589 levels "'Splosion Man",..: 5702 5703 9767 7249 7249 11406 2908 4446 2908 11406 ...
## $ url : Factor w/ 18577 levels "/games/0-d-beat-drop/xbox-360-14342395",..: 8390 8387 14319 10813 10812 16931 4271 6526 4270 16932 ...
## $ platform : Factor w/ 59 levels "Android","Arcade",..: 39 39 15 58 36 20 58 33 36 33 ...
## $ score : num 9 9 8.5 8.5 8.5 7 3 9 3 7 ...
## $ genre : Factor w/ 112 levels "Action","Action, Adventure",..: 64 64 69 94 94 105 38 82 38 105 ...
## $ editors_choice: Factor w/ 2 levels "N","Y": 2 2 1 1 1 1 1 2 1 1 ...
## $ release_year : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
## $ release_month : int 9 9 9 9 9 9 9 9 9 9 ...
## $ release_day : int 12 12 12 11 11 11 11 11 11 11 ...
This data set consists of 18,625 rows. The only missing values are 36 entries in the genre column; these were converted to “NA” and the affected rows were dropped.
## X score_phrase title url platform
## 0 0 0 0 0
## score genre editors_choice release_year release_month
## 0 36 0 0 0
## release_day
## 0
## X score_phrase title url platform
## 0 0 0 0 0
## score genre editors_choice release_year release_month
## 0 0 0 0 0
## release_day
## 0
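The two outputs above show the missing-value counts before and after the clean-up. A minimal sketch of that step on a toy data frame is shown here; the data frame name `games`, the toy rows, and the idea that missing genres arrive as empty strings are my assumptions, not details taken from the original code.

```r
# Toy stand-in for the games data; column names follow the str() output above.
games <- data.frame(
  title = c("Game A", "Game B", "Game C", "Game D"),
  genre = c("Action", "", "RPG", ""),
  score = c(9.0, 7.5, 8.5, 3.0),
  stringsAsFactors = FALSE
)

# Assumption: missing genres arrive as empty strings and are recoded to NA.
games$genre[games$genre == ""] <- NA

# Count missing values per column (same shape as the output above).
na_counts <- colSums(is.na(games))
print(na_counts)

# Drop every row that has at least one NA, then recount.
games_clean <- na.omit(games)
print(colSums(is.na(games_clean)))
```

`na.omit()` drops whole rows, which is reasonable here because only one column has gaps.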
The data set contains the following game information:
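| Variable | Description |
|---|---|
| X | Row index (0 to 18,624) |
| score_phrase | Scoring phrase (11 levels, e.g. "Amazing", "Awful") |
| title | Game title |
| url | Relative URL path of the game entry |
| platform | Platform of the game (59 levels) |
| score | Numeric score, from 0.5 to 10 |
| genre | Genre, sometimes with a sub-genre (112 levels) |
| editors_choice | Editors' choice flag ("Y" or "N") |
| release_year, release_month, release_day | Release date components |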
The main focus of the data, in my opinion, is the IGN score. Other game features (such as platform, release year and genre) can be used to review the best- and worst-scored games, as well as to check correlations between these parameters and a game’s score.
The main game features I’ll focus on in this analysis are release year, genre and platform. These will serve as the explanatory variables for the scoring.
I’m basing my analysis on the assumption that this data set is comprised of all the games released in the gaming industry. In other words, this analysis will be on the gaming industry in its entirety.
## X score_phrase
## Min. : 0 Great :4765
## 1st Qu.: 4651 Good :4725
## Median : 9298 Okay :2940
## Mean : 9306 Mediocre:1957
## 3rd Qu.:13959 Amazing :1803
## Max. :18624 Bad :1268
## (Other) :1131
## title
## Cars : 10
## Madden NFL 07 : 10
## Open Season : 10
## Brain Challenge : 9
## LEGO Star Wars II: The Original Trilogy: 9
## Madden NFL 08 : 9
## (Other) :18532
## url
## /games/aladdin/gba-566703 : 2
## /games/big-league-sports/wii-14275098 : 2
## /games/blur/xbox-360-14222096 : 2
## /games/call-of-duty-modern-warfare-2/ps3-2550: 2
## /games/crash-twinsanity/ps2-667247 : 2
## /games/defiance/pc-71832 : 2
## (Other) :18577
## platform score genre editors_choice
## PC :3367 Min. : 0.500 Action :3797 N:15074
## PlayStation 2:1684 1st Qu.: 6.000 Sports :1916 Y: 3515
## Xbox 360 :1631 Median : 7.300 Shooter :1610
## Wii :1362 Mean : 6.951 Racing :1228
## PlayStation 3:1355 3rd Qu.: 8.200 Adventure:1175
## Nintendo DS :1044 Max. :10.000 Strategy :1071
## (Other) :8146 (Other) :7792
## release_year release_month release_day
## Min. :1970 Min. : 1.00 Min. : 1.0
## 1st Qu.:2003 1st Qu.: 4.00 1st Qu.: 8.0
## Median :2007 Median : 8.00 Median :16.0
## Mean :2007 Mean : 7.14 Mean :15.6
## 3rd Qu.:2010 3rd Qu.:10.00 3rd Qu.:23.0
## Max. :2016 Max. :12.00 Max. :31.0
##
At first there seemed to be duplicate games (repeated game titles), but after reviewing the data I found that the titles are the same while the platform differs. These are the same games released on different platforms. However, each of these games received the same score on every platform on which it was released. The duplicates were removed, leaving 12,555 rows of data (~67% of the original data set).
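The duplicate check and removal described above can be sketched roughly as follows. The titles, platforms and scores are made up for illustration, and keeping the first row per title is my assumption about how the duplicates were resolved.

```r
# Toy data: the same title appears once per platform (values are made up).
games <- data.frame(
  title    = c("Game A", "Game A", "Game A", "Game B", "Game B", "Game C"),
  platform = c("PC", "Wii", "Xbox 360", "PC", "Xbox 360", "PC"),
  score    = c(6.5, 6.5, 6.5, 7.0, 7.0, 5.9),
  stringsAsFactors = FALSE
)

# Check that every repeated title carries a single score across platforms.
scores_per_title <- tapply(games$score, games$title,
                           function(s) length(unique(s)))
all_same_score <- all(scores_per_title == 1)

# Keep the first row for each title; later platform rows are dropped.
games_unique <- games[!duplicated(games$title), ]

# Share of rows that survive de-duplication.
kept_share <- nrow(games_unique) / nrow(games)
```

Because the score is identical across platforms, dropping the extra rows loses platform information but not score information.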
The “oldest” game in the data set was released in 1970, but it was the only release before the 1990s, so it was dropped from the data set. We will focus on games released from 1996 onward.
We can see that the number of games released increased continuously from 2001 until it peaked in 2008. After that year, the number declined steadily. This was partly surprising, as one would expect the technological developments of the last decade to be reflected in the number of games released across platforms. The decline might seem to be an artifact of removing duplicates (same game, different platforms), but an analysis of the raw data revealed the same trends.
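A small sketch of how the yearly counts and the peak year could be computed is given here. The vector of years is made up, so the peak it finds is not the 2008 peak reported above.

```r
# Made-up release years, including one early outlier like the 1970 game.
release_year <- c(1970, 1996, 1997, 1997, 1998, 1998, 1998, 1999, 1999)

# Keep only games released from 1996 onward.
recent <- release_year[release_year >= 1996]

# Number of games per release year.
games_per_year <- table(recent)
print(games_per_year)

# Year with the largest number of releases.
peak_year <- as.integer(names(games_per_year)[which.max(games_per_year)])
```

Running the same count on both the raw and the de-duplicated data is a simple way to check that a trend is not caused by the clean-up itself.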
There is a large number of genres in the data set because, for some games, the genre includes a sub-genre in addition to the main one. For example, there is an “Action” genre and an “Action, Adventure” genre. I decided to narrow down the genres by grouping them according to the main “value”, without the sub-category. This reduced the number of genres from 112 to 30.
## [1] "Action" "Action, Adventure"
## [3] "Action, Compilation" "Action, Editor"
## [5] "Action, Platformer" "Action, Puzzle"
## [7] "Action, RPG" "Action, Simulation"
## [9] "Action, Strategy" "Adult, Card"
## [11] "Adventure" "Adventure, Adult"
## [13] "Adventure, Adventure" "Adventure, Compilation"
## [15] "Adventure, Episodic" "Adventure, Platformer"
## [17] "Adventure, RPG" "Baseball"
## [19] "Battle" "Board"
## [21] "Board, Compilation" "Card"
## [23] "Card, Battle" "Card, Compilation"
## [25] "Card, RPG" "Casino"
## [27] "Compilation" "Compilation, Compilation"
## [29] "Compilation, RPG" "Educational"
## [31] "Educational, Action" "Educational, Adventure"
## [33] "Educational, Card" "Educational, Productivity"
## [35] "Educational, Puzzle" "Educational, Simulation"
## [37] "Educational, Trivia" "Fighting"
## [39] "Fighting, Action" "Fighting, Adventure"
## [41] "Fighting, Compilation" "Fighting, RPG"
## [43] "Fighting, Simulation" "Flight"
## [45] "Flight, Action" "Flight, Racing"
## [47] "Flight, Simulation" "Hardware"
## [49] "Hunting" "Hunting, Action"
## [51] "Hunting, Simulation" "Music"
## [53] "Music, Action" "Music, Adventure"
## [55] "Music, Compilation" "Music, Editor"
## [57] "Music, RPG" "Other"
## [59] "Other, Action" "Other, Adventure"
## [61] "Party" "Pinball"
## [63] "Pinball, Compilation" "Platformer"
## [65] "Platformer, Action" "Platformer, Adventure"
## [67] "Productivity" "Productivity, Action"
## [69] "Puzzle" "Puzzle, Action"
## [71] "Puzzle, Adventure" "Puzzle, Compilation"
## [73] "Puzzle, Platformer" "Puzzle, RPG"
## [75] "Puzzle, Word Game" "Racing"
## [77] "Racing, Action" "Racing, Compilation"
## [79] "Racing, Editor" "Racing, Shooter"
## [81] "Racing, Simulation" "RPG"
## [83] "RPG, Action" "RPG, Compilation"
## [85] "RPG, Editor" "RPG, Simulation"
## [87] "Shooter" "Shooter, Adventure"
## [89] "Shooter, First-Person" "Shooter, Platformer"
## [91] "Shooter, RPG" "Simulation"
## [93] "Simulation, Adventure" "Sports"
## [95] "Sports, Action" "Sports, Baseball"
## [97] "Sports, Compilation" "Sports, Editor"
## [99] "Sports, Fighting" "Sports, Golf"
## [101] "Sports, Other" "Sports, Party"
## [103] "Sports, Racing" "Sports, Simulation"
## [105] "Strategy" "Strategy, Compilation"
## [107] "Strategy, RPG" "Strategy, Simulation"
## [109] "Trivia" "Virtual Pet"
## [111] "Wrestling" "Wrestling, Simulation"
## [1] "Action" "Adult" "Adventure" "Baseball"
## [5] "Battle" "Board" "Card" "Casino"
## [9] "Compilation" "Educational" "Fighting" "Flight"
## [13] "Hardware" "Hunting" "Music" "Other"
## [17] "Party" "Pinball" "Platformer" "Productivity"
## [21] "Puzzle" "Racing" "RPG" "Shooter"
## [25] "Simulation" "Sports" "Strategy" "Trivia"
## [29] "Virtual Pet" "Wrestling"
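One way to get from the first list to the second is to keep only the text before the first comma. The sketch below uses `sub()` for this; it is my guess at the approach, applied to a handful of genre values taken from the list above.

```r
# A few genre values from the original list.
genre <- c("Action", "Action, Adventure", "Puzzle, Word Game",
           "RPG", "Virtual Pet", "Sports, Golf")

# Drop the first comma and everything after it, keeping the main genre.
main_genre <- sub(",.*$", "", genre)

# Distinct main genres that remain.
print(sort(unique(main_genre)))
```

Genres without a comma, such as "Virtual Pet", pass through unchanged.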
This confirmed that Action is the most common genre among the games reviewed.
Platforms were narrowed down in the same way as the genres, since we are looking at the main platform in general rather than at specific versions of it. This reduced the number of platforms from 59 to 34.
Note: Some of the names are slightly altered (for example, “Game Boy” is now “Game”), but this was disregarded because it has no impact on the analysis.
## [1] "Android" "Arcade" "Atari 2600"
## [4] "Atari 5200" "Commodore 64/128" "Dreamcast"
## [7] "Dreamcast VMU" "DVD / HD Video Game" "Game Boy"
## [10] "Game Boy Advance" "Game Boy Color" "Game.Com"
## [13] "GameCube" "Genesis" "iPad"
## [16] "iPhone" "iPod" "Linux"
## [19] "Lynx" "Macintosh" "Master System"
## [22] "N-Gage" "NeoGeo" "NeoGeo Pocket Color"
## [25] "NES" "New Nintendo 3DS" "Nintendo 3DS"
## [28] "Nintendo 64" "Nintendo 64DD" "Nintendo DS"
## [31] "Nintendo DSi" "Ouya" "PC"
## [34] "PlayStation" "PlayStation 2" "PlayStation 3"
## [37] "PlayStation 4" "PlayStation Portable" "PlayStation Vita"
## [40] "Pocket PC" "Saturn" "Sega 32X"
## [43] "Sega CD" "SteamOS" "Super NES"
## [46] "TurboGrafx-16" "TurboGrafx-CD" "Vectrex"
## [49] "Web Games" "Wii" "Wii U"
## [52] "Windows Phone" "Windows Surface" "Wireless"
## [55] "WonderSwan" "WonderSwan Color" "Xbox"
## [58] "Xbox 360" "Xbox One"
## [1] "Android" "Arcade" "Atari" "Commodore" "Dreamcast"
## [6] "DVD" "Game" "GameCube" "Genesis" "iPad"
## [11] "iPhone" "iPod" "Linux" "Lynx" "Macintosh"
## [16] "Master" "N-Gage" "NeoGeo" "NES" "Nintendo"
## [21] "PC" "PlayStation" "Pocket" "Saturn" "Sega"
## [26] "Super" "TurboGraf" "Vectrex" "Web" "Wii"
## [31] "Windows" "Wireless" "WonderSwan" "Xbox"
After consulting with a gaming expert, I decided to narrow both the genre and platform variables even further by manually grouping them into fewer categories, leaving 12 genres and 10 platforms.
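A manual regrouping like this can be expressed as a lookup table. The sketch below is only illustrative: the specific assignments in `platform_groups` are my assumptions, not the actual mapping used in this analysis.

```r
# Hypothetical lookup: names are simplified platforms, values are broader groups.
platform_groups <- c(
  "iPad" = "Apple", "iPhone" = "Apple", "iPod" = "Apple", "Macintosh" = "Apple",
  "Genesis" = "Sega", "Saturn" = "Sega", "Dreamcast" = "Sega", "Sega" = "Sega",
  "Wii" = "Nintendo", "NES" = "Nintendo", "GameCube" = "Nintendo",
  "Nintendo" = "Nintendo"
)

# A few simplified platform values to regroup.
platform <- c("iPad", "Genesis", "Wii", "Vectrex", "Nintendo")

# Look up each platform; anything without an entry falls into "Other".
new_platform <- unname(platform_groups[platform])
new_platform[is.na(new_platform)] <- "Other"
print(new_platform)
```

Keeping the mapping in one named vector makes it easy to review and adjust the groups.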
This made the data a lot clearer for analysis and showed that Nintendo is the platform with the most games and Action is the genre with the most games. We will look into these parameters later on.
No additional “data cleaning” is needed at this point.
A preliminary review of the data revealed that the average score of all games in this data set is 6.95, while the median is 7.3. Because the median is higher than the mean, more than half of the scores are above the average.
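To illustrate why a median above the mean implies that most scores sit above the average, here is a tiny example with made-up scores (not values from the data set).

```r
# Made-up scores with one low outlier pulling the mean down.
score <- c(2, 6, 7.5, 8, 8.5)

avg_score <- mean(score)    # pulled down by the low outlier
med_score <- median(score)  # middle value, unaffected by the outlier

# Share of scores that are above the mean.
share_above_mean <- mean(score > avg_score)
```

A few very low scores drag the mean down while the median stays put, which is the same pattern seen in the summary above.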
Scoring phrases are also available in this data set:
## [1] "Amazing" "Awful" "Bad" "Disaster" "Good"
## [6] "Great" "Masterpiece" "Mediocre" "Okay" "Painful"
## [11] "Unbearable"
However, I decided to create my own scoring categories: 6 equal levels spanning the score range (0.5-10).
## Very Low Low Below Average Average and Above
## 167 563 1539 2515
## High Very High
## 5271 2500
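A sketch of how such categories could be built with `cut()` follows. I am assuming that "equal levels" means six equal-width bins between 0.5 and 10; the scores are made up and the labels are taken from the table above.

```r
# Made-up scores spanning the full range.
score <- c(0.5, 3.0, 5.0, 6.0, 7.3, 9.0, 10)

# Labels taken from the frequency table above.
level_labels <- c("Very Low", "Low", "Below Average",
                  "Average and Above", "High", "Very High")

# Assumption: six equal-width bins, so seven break points from 0.5 to 10.
breaks <- seq(0.5, 10, length.out = 7)

# include.lowest = TRUE keeps the minimum score (0.5) in the first bin.
score_level <- cut(score, breaks = breaks, labels = level_labels,
                   include.lowest = TRUE)
print(table(score_level))
```

Each bin is about 1.58 points wide, so a score of 7.3 lands in "High" and anything above roughly 8.42 in "Very High".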
Below I show the new scoring categories’ frequencies by release year, platform and genre:
We see that the high and very high scores were mostly given to games released in 2015 and 2005, while the very low scores went to the oldest games, from 1996. High scores were also given mostly to Android and Apple games, while very high scores were awarded to Sega, Xbox and smaller, lesser-known platforms. There also seem to be a lot of high scores in the “other platform” category, which we will explore later on. The Nintendo platform surprisingly has the most very low scores. Regarding genres, the highest scores were awarded to RPG games, while the lowest were given to racing games.
In this section I will review the games by platforms and genres:
As mentioned earlier, most games are from the action genre, mainly on the PC, Nintendo and PlayStation platforms. The highly rated RPG genre is evenly distributed among these platforms. The Nintendo platform is widely present in all genres, and there is a large number of games in the “other genre” category, which further emphasizes the need to review this category. Most Xbox, Sega and smaller-platform games are from the action genre, and these were the platforms that had the highest number of very high ratings.
Let’s review the numerical scorings:
## newGenre AVG_Score
## 8 RPG 7.490524
## 9 Shooter 7.087476
## 12 Strategy 7.077423
## 5 Music 6.984559
## 4 Flight 6.916783
## 1 Action 6.915130
## 6 Other 6.890380
## 2 Advent. 6.856034
## 11 Sport 6.772688
## 7 Racing 6.740217
## 10 Sim 6.708039
## 3 Assorted 6.487437
## newPlatform AVG_Score
## 2 Apple 7.366534
## 1 Android 7.350000
## 8 Sega 7.272862
## 5 Other 7.191210
## 6 PC 7.093362
## 10 Xbox 7.087042
## 3 Atari 7.052941
## 9 SmallPlat. 7.042857
## 7 PlayStation 6.904976
## 4 Nintendo 6.546212
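Tables like the two above can be produced by averaging the score within each group and sorting in descending order. The sketch below uses `aggregate()` on made-up data with placeholder genre names, so the numbers are not results from the data set.

```r
# Made-up data with placeholder genre names.
games <- data.frame(
  newGenre = c("GenreA", "GenreA", "GenreB", "GenreB", "GenreC", "GenreC"),
  score    = c(8.0, 7.0, 6.0, 7.0, 7.0, 6.8),
  stringsAsFactors = FALSE
)

# Mean score per genre.
avg_by_genre <- aggregate(score ~ newGenre, data = games, FUN = mean)
names(avg_by_genre)[2] <- "AVG_Score"

# Sort from highest to lowest average score.
avg_by_genre <- avg_by_genre[order(-avg_by_genre$AVG_Score), ]
print(avg_by_genre)
```

The row names in the printed result keep the pre-sort positions, which is why the first column of the tables above is not in order.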
These are the platform and genre combinations with the most “very high” ratings:

- Android - Strategy
- Apple - Strategy
- Apple - Other
- Apple - Music
- Apple - Adventure
- Atari - Flight
- Nintendo - Adventure
- Other - RPG
- Other - Action
- PC - Racing
- PC - Music
- PC - Action
- PlayStation - Strategy
- PlayStation - Sport
- PlayStation - Other
- PlayStation - Action
- Sega - RPG
- Sega - Assorted
- Small Platforms - RPG
- Xbox - RPG
- Xbox - Other
- Xbox - Music
I will now plot the same combinations with the numerical scores (different coloring was used to emphasize the difference between the results):
There we can see that the PlayStation - Strategy combination has the highest numerical score.
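The average score for every platform and genre pair can be laid out as a matrix and searched for its maximum. This is a rough sketch on made-up data with placeholder names; it is not the code behind the plot.

```r
# Made-up data with placeholder platform and genre names.
games <- data.frame(
  newPlatform = c("PlatA", "PlatA", "PlatB", "PlatB", "PlatC", "PlatC"),
  newGenre    = c("GenreX", "GenreY", "GenreY", "GenreY", "GenreX", "GenreY"),
  score       = c(7.0, 8.0, 9.0, 8.5, 6.5, 7.5),
  stringsAsFactors = FALSE
)

# Matrix of mean scores: rows are platforms, columns are genres.
# Combinations with no games come out as NA.
avg_matrix <- tapply(games$score,
                     list(games$newPlatform, games$newGenre), mean)
print(avg_matrix)

# Locate the combination with the highest average score.
best <- which(avg_matrix == max(avg_matrix, na.rm = TRUE), arr.ind = TRUE)
best_platform <- rownames(avg_matrix)[best[1, "row"]]
best_genre    <- colnames(avg_matrix)[best[1, "col"]]
```

The NA cells are worth keeping in view, since a combination with only a handful of games can produce a misleadingly high or low average.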
Below I review the average score of the platforms and genres that stood out:
It is interesting to see that the smaller platforms have an extremely low average rating in the highly rated RPG genre, while being highly rated in the strategy genre. Sega’s best average score is actually in the music genre, but it is also highly rated in RPG games. Xbox’s best-rated games are shooters.
The best RPG games are definitely from the Sega Platform, and Nintendo’s racing games still stand out as the worst rated.
We can see that Nintendo not only had the highest number of very low scores but also has the lowest overall average score (6.55). Sega was among the platforms with the highest number of very high scores and also has a high overall average (7.27), whereas Xbox and the smaller platforms are not among the highest averages. This makes Sega the strongest of the platforms that stood out for very high ratings, although Apple (7.37) and Android (7.35) average slightly higher overall. RPG continues to be the highest rated genre (7.49), and racing remains among the lowest rated (6.74), with only Sim (6.71) and Assorted (6.49) averaging lower.