All-Time Video Game Statistics
Project One
Initial Data Exploration and Motivation
Reading in the Dataset
videogame_data <- read.csv("games.csv")
Inspecting the Dataset
# Checking dimensions, first and last few rows
dim(videogame_data)
[1] 195209     10
head(videogame_data)
       id                              name       date rating reviews plays
1 1000001 Cathode Ray Tube Amusement Device 1947-12-31    3.6      85   149
2 1000002                  Bertie the Brain 1950-08-25    3.0      26    46
3 1000003                               Nim 1951-12-31    1.9       9    26
4 1000004                          Draughts 1952-08-31    2.8       9    30
5 1000005                               OXO 1952-12-31    3.1      22    80
6 1000006                              Pool 1954-06-26    3.1      12    33
  playing backlogs wishlists
1       1       42        72
2       0        9        17
3       0        2         8
4       0        4         7
5       0       11        15
6       0        3         4
description
1 The cathode ray tube amusement device is the earliest known interactive electronic game to use a cathode ray tube (CRT). It is a device that records and controls the quality of an electronic signal. The strength of the electronic signals produced by the amusement device is controlled by knobs which influences the trajectory of the CRT's light beam. The device is purely electromechanical and does not use any memory device, computer, or programming. The player turns a control knob to position the CRT beam on the screen; to the player, the beam appears as a dot, which represents a reticle or scope. The player has a restricted amount of time in which to maneuver the dot so that it overlaps an airplane, and then to fire at the airplane by pressing a button. If the beam's gun falls within the predefined mechanical coordinates of a target when the user presses the button, then the CRT beam defocuses, simulating an explosion.
2 Currently considered the first videogame in history. A tic-tac-toe clone.
3 The Nimrod was a special purpose computer that played the game of Nim, designed and built by Ferranti and displayed at the Exhibition of Science during the 1951 Festival of Britain. It was the first digital computer exclusively designed to play a game, though its true intention was to illustrate the principles of the (then novel) digital computer for the public.
4 A game of draughts (a.k.a. checkers) written for the Ferranti Mark 1 computer by Christopher Strachey at the University of Manchester between 1951 and 1952. In the summer of 1952, the program was able to "play a complete game of Draughts at a reasonable speed".
5 OXO was a computer game developed by Alexander S. Douglas in 1952 for the EDSAC computer, which simulates a game of Noughts and crosses, also sometimes called Tic-tac-toe. OXO is the earliest known game to display visuals on a video monitor. To play OXO, the player would enter input using a rotary telephone controller, and output was displayed on the computer's 35×16 dot matrix cathode ray tube. Each game was played against an artificially intelligent opponent.
6 A game of pool (billiards) developed by William George Brown and Ted Lewis in 1954 on the MIDSAC computer, intended primarily to showcase the computing power of the MIDSAC. "The game displayed a 2-inch rendition of the pool cue for the players to line up their shots and ran a simulation of the colliding and ricocheting balls in real-time, implementing a full game of a cue ball and 15 frame balls for two players. Graphics were drawn in real-time on a monochrome 13" point plotting X-Y display, the screen being updated by the program 40 times a second (that is, in a normal in-game situations with 2 to 4 balls moving at once). However, for time constraints, the table and its pockets weren’t drawn by the computer graphics, but were rather drawn manually onto the display using a grease pencil." - Norbert Landsteiner for masswerk.at
tail(videogame_data)
            id                                name date rating reviews plays
195204 1204007                 Hometown Poker Hero          NA       0     0
195205 1204008            Duke Nukem II Remastered         4.5       0     4
195206 1204009                            Deadoxer          NA       0     0
195207 1204010                     Lottso! Express          NA       0     0
195208 1204011              Little White Man vs. X          NA       0     0
195209 1204012 Lost Lagoon 2: Cursed and Forgotten          NA       0     1
       playing backlogs wishlists
195204       0        0         0
195205       0        1         2
195206       0        1         0
195207       0        0         0
195208       0        1         0
195209       0        0         0
description
195204
195205 A remaster by Blaze Entertainment as part of Duke Nukem 1+2 Remastered which is set to be released exclusively for Evercade in Duke Nukem Collection 1.
195206 This is a small horror game in which you have to escape.
195207
195208 Attention: Attention, attention, attention!!! Sometimes the game automatically locks enemies, press the tab key to unlock it. (At the beginning, pressing tab is to lock the enemy) Other operations are in the legal terms
195209 Escape from a mysterious and dangerous island in Lost Lagoon 2: Cursed & Forgotten! After waking up shipwrecked, you realize that you have been cursed by powers beyond your understanding. Break the curse quickly because malevolent islanders lurk in the lush landscape and are dead set on making you their next victim! Find a way to return home before it's too late.
nrow(videogame_data)
[1] 195209
ncol(videogame_data)
[1] 10
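The tail() output above shows blank release dates and NA ratings, so it may be worth counting both kinds of missing value before going further. The sketch below is one possible way to do that. It runs on a small hypothetical data frame (`toy`) that only mimics a few of the columns shown above; it is not the real dataset, and the helper names `na_counts` and `blank_counts` are my own.

```r
# Hypothetical toy rows that mimic a few columns of videogame_data (not the real data)
toy <- data.frame(
  id     = c(1L, 2L, 3L),
  name   = c("Game A", "Game B", "Game C"),
  date   = c("1947-12-31", "", ""),
  rating = c(3.6, NA, 4.5),
  stringsAsFactors = FALSE
)

# Count NA values in every column
na_counts <- colSums(is.na(toy))

# Count empty strings in the character columns; is.na() does not catch these
is_char <- sapply(toy, is.character)
blank_counts <- colSums(toy[, is_char, drop = FALSE] == "", na.rm = TRUE)

print(na_counts)
print(blank_counts)
```

Swapping `toy` for `videogame_data` should give the same two summaries for the full file, assuming blank dates are stored as empty strings as the printed rows suggest.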
Brief Description
The dataset contains information about various video games, including their titles, release dates, ratings, review and play counts, backlogs, wishlists, and descriptions.
(This is only the first file. I will add the genres, gaming platforms, and individual gaming scores to the main dataset later; I couldn’t get them to download.)
The dataset was acquired through Kaggle and can be accessed through this link: https://www.kaggle.com/datasets/gsimonx37/backloggd?select=games.csv
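Once the additional files are available, they could be joined to the main table by game id. The sketch below shows one way that might look. Everything in it is a hypothetical stand-in: the `games` and `genres` data frames, the `genre` column name, and the genre values are made up for illustration, and I am assuming the extra file has one row per game and genre pair keyed by the same `id`.

```r
# Hypothetical stand-ins; the real files and their column names may differ
games <- data.frame(
  id   = c(1L, 2L, 3L),
  name = c("Game A", "Game B", "Game C"),
  stringsAsFactors = FALSE
)
genres <- data.frame(
  id    = c(2L, 3L, 3L),
  genre = c("Puzzle", "Puzzle", "Strategy"),
  stringsAsFactors = FALSE
)

# Collapse to one row per game so the join does not duplicate game rows
genres_per_game <- aggregate(genre ~ id, data = genres, FUN = paste, collapse = ", ")

# Left join: keep every game, even those without a genre row
games_with_genres <- merge(games, genres_per_game, by = "id", all.x = TRUE)
print(games_with_genres)
```

Using `all.x = TRUE` keeps games that have no match, which show up with NA in the new column instead of being dropped.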
Motivation
Understanding video game trends can help answer key industry questions, such as:
- What factors contribute to a game’s success?
- Are certain platforms or genres more successful than others?
- How do different genre trends change over time?
- What types of games have higher/lower scores?
- What are the most and least popular games of all time, and is there a correlation between those types of games and certain critic scores?
Hypothesis
Potential hypotheses to test:
- “Games released on multiple platforms tend to have higher total sales compared to platform-exclusive games.”
- “Games released on newer platforms (e.g., PS5, Xbox Series X) have higher average user scores than games released on older platforms (e.g., PS2, Xbox 360).”
Ethical Considerations
- Bias Awareness: Recognizing potential biases, such as favoring certain game genres or assuming popularity equals quality.
- Data Integrity: Ensuring the dataset is accurate and not misrepresented.
- Representation: Considering how indie and AAA developers are each represented in the data.
Table Creation / Data Dictionary
# Build a data dictionary with one row per column of videogame_data
data_dictionary <- data.frame(
  Variable_Name = colnames(videogame_data),
  Class = sapply(videogame_data, class),
  Continuity = sapply(videogame_data, function(x) ifelse(is.numeric(x), "Continuous", "Discrete")),
  Description = c(
    "Game ID Number",
    "Game Title",
    "When it was Released",
    "What is the Viewership Rating",
    "How many Reviews",
    "How many Plays the game has",
    "How many are Playing",
    "How many people put it to the side to play later, Backlogs",
    "How many people added to the Wishlist",
    "Description of The game"
  )
)
summary(videogame_data)
       id               name               date               rating
 Min.   :1000001   Length:195209      Length:195209      Min.   :-0.30
 1st Qu.:1051405   Class :character   Class :character   1st Qu.: 2.50
 Median :1102074   Mode  :character   Mode  :character   Median : 3.00
 Mean   :1102931                                         Mean   : 2.97
 3rd Qu.:1154452                                         3rd Qu.: 3.50
 Max.   :1204012                                         Max.   : 6.20
                                                         NA's   :131670
    reviews              plays             playing             backlogs
 Min.   :  -1.000   Min.   :   -1.0   Min.   :   -1.000   Min.   :   -1.00
 1st Qu.:   0.000   1st Qu.:    0.0   1st Qu.:    0.000   1st Qu.:    0.00
 Median :   0.000   Median :    2.0   Median :    0.000   Median :    1.00
 Mean   :   9.211   Mean   :  135.6   Mean   :    4.331   Mean   :   39.89
 3rd Qu.:   1.000   3rd Qu.:   10.0   3rd Qu.:    0.000   3rd Qu.:    5.00
 Max.   :8814.000   Max.   :83000.0   Max.   :10000.000   Max.   :17000.00
   wishlists        description
 Min.   :   -1.00   Length:195209
 1st Qu.:    0.00   Class :character
 Median :    0.00   Mode  :character
 Mean   :   20.11
 3rd Qu.:    3.00
 Max.   :11000.00
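The summary above shows a minimum of -1 for every count column and ratings that range from -0.30 to 6.20, which suggests some values may be placeholders or errors. The sketch below shows one way these could be flagged. It rests on two assumptions that the source does not confirm: that negative counts stand in for missing data, and that valid ratings fall on a 0 to 5 scale. The `toy` data frame is hypothetical.

```r
# Hypothetical toy values echoing the extremes seen in summary() (not the real data)
toy <- data.frame(
  rating  = c(3.6, -0.3, NA, 6.2),
  reviews = c(85L, -1L, 0L, 9L),
  plays   = c(149L, -1L, 4L, 26L)
)
count_cols <- c("reviews", "plays")

# How many negative values does each count column contain?
neg_counts <- colSums(toy[count_cols] < 0, na.rm = TRUE)

# Assumption: a negative count means "missing", so recode it as NA
cleaned <- toy
cleaned[count_cols] <- lapply(cleaned[count_cols], function(x) {
  replace(x, !is.na(x) & x < 0, NA)
})

# Assumption: ratings should lie between 0 and 5; flag anything outside that range
cleaned$rating_out_of_range <- !is.na(cleaned$rating) &
  (cleaned$rating < 0 | cleaned$rating > 5)

print(neg_counts)
print(cleaned)
```

Flagging the out-of-range ratings instead of deleting them leaves room to check where they come from before deciding how to handle them.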
data_dictionary
            Variable_Name     Class Continuity
id                     id   integer Continuous
name                 name character   Discrete
date                 date character   Discrete
rating             rating   numeric Continuous
reviews           reviews   integer Continuous
plays               plays   integer Continuous
playing           playing   integer Continuous
backlogs         backlogs   integer Continuous
wishlists       wishlists   integer Continuous
description   description character   Discrete
            Description
id          Game ID Number
name        Game Title
date        When it was Released
rating      What is the Viewership Rating
reviews     How many Reviews
plays       How many Plays the game has
playing     How many are Playing
backlogs    How many people put it to the side to play later, Backlogs
wishlists   How many people added to the Wishlist
description Description of The game
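The dictionary above lists date as a character column and labels every integer column, including id, as Continuous. A possible refinement is sketched below: parse the dates, then classify columns with a rule that treats integers as discrete. The `toy` data frame and the `classify_column` helper are hypothetical, and the rule itself is only one reasonable convention, not the one used to build the table above.

```r
# Hypothetical toy rows (not the real data)
toy <- data.frame(
  id     = c(1L, 2L, 3L),
  date   = c("1947-12-31", "1950-08-25", ""),
  rating = c(3.6, 3.0, NA),
  stringsAsFactors = FALSE
)

# Parse dates with an explicit format; blank strings become NA
toy$date <- as.Date(toy$date, format = "%Y-%m-%d")

# Derive the release year for later trend analysis
toy$year <- as.integer(format(toy$date, "%Y"))

# Hypothetical rule: dates get their own label, integers are discrete,
# other numerics are continuous, everything else is discrete
classify_column <- function(x) {
  if (inherits(x, "Date")) {
    "Date"
  } else if (is.integer(x)) {
    "Discrete"
  } else if (is.numeric(x)) {
    "Continuous"
  } else {
    "Discrete"
  }
}

continuity <- vapply(toy, classify_column, character(1))
print(continuity)
```

Passing `format` explicitly matters here because `as.Date()` can stop with an error on strings it cannot interpret when no format is given.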