All-Time Video Game Statistics

Project One

Author

DSA_406_001_SP25_project_vmbaxi

Published

February 18, 2025

Initial Data Exploration and Motivation

Reading in the Dataset

videogame_data <- read.csv("games.csv")

Inspecting the Dataset

# Checking dimensions, first and last few rows
dim(videogame_data)
[1] 195209     10
head(videogame_data)
       id                              name       date rating reviews plays
1 1000001 Cathode Ray Tube Amusement Device 1947-12-31    3.6      85   149
2 1000002                  Bertie the Brain 1950-08-25    3.0      26    46
3 1000003                               Nim 1951-12-31    1.9       9    26
4 1000004                          Draughts 1952-08-31    2.8       9    30
5 1000005                               OXO 1952-12-31    3.1      22    80
6 1000006                              Pool 1954-06-26    3.1      12    33
  playing backlogs wishlists
1       1       42        72
2       0        9        17
3       0        2         8
4       0        4         7
5       0       11        15
6       0        3         4
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          description
1 The cathode ray tube amusement device is the earliest known interactive electronic game to use a cathode ray tube (CRT). It is a device that records and controls the quality of an electronic signal. The strength of the electronic signals produced by the amusement device is controlled by knobs which influences the trajectory of the CRT's light beam. The device is purely electromechanical and does not use any memory device, computer, or programming. The player turns a control knob to position the CRT beam on the screen; to the player, the beam appears as a dot, which represents a reticle or scope. The player has a restricted amount of time in which to maneuver the dot so that it overlaps an airplane, and then to fire at the airplane by pressing a button. If the beam's gun falls within the predefined mechanical coordinates of a target when the user presses the button, then the CRT beam defocuses, simulating an explosion.
2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Currently considered the first videogame in history. A tic-tac-toe clone.
3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        The Nimrod was a special purpose computer that played the game of Nim, designed and built by Ferranti and displayed at the Exhibition of Science during the 1951 Festival of Britain. It was the first digital computer exclusively designed to play a game, though its true intention was to illustrate the principles of the (then novel) digital computer for the public.
4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               A game of draughts (a.k.a. checkers) written for the Ferranti Mark 1 computer by Christopher Strachey at the University of Manchester between 1951 and 1952. In the summer of 1952, the program was able to "play a complete game of Draughts at a reasonable speed".
5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   OXO was a computer game developed by Alexander S. Douglas in 1952 for the EDSAC computer, which simulates a game of Noughts and crosses, also sometimes called Tic-tac-toe. OXO is the earliest known game to display visuals on a video monitor. To play OXO, the player would enter input using a rotary telephone controller, and output was displayed on the computer's 35×16 dot matrix cathode ray tube. Each game was played against an artificially intelligent opponent.
6                                                                                               A game of pool (billiards) developed by William George Brown and Ted Lewis in 1954 on the MIDSAC computer, intended primarily to showcase the computing power of the MIDSAC. "The game displayed a 2-inch rendition of the pool cue for the players to line up their shots and ran a simulation of the colliding and ricocheting balls in real-time, implementing a full game of a cue ball and 15 frame balls for two players. Graphics were drawn in real-time on a monochrome 13" point plotting X-Y display, the screen being updated by the program 40 times a second (that is, in a normal in-game situations with 2 to 4 balls moving at once). However, for time constraints, the table and its pockets weren’t drawn by the computer graphics, but were rather drawn manually onto the display using a grease pencil." - Norbert Landsteiner for masswerk.at
tail(videogame_data)
            id                                name date rating reviews plays
195204 1204007                 Hometown Poker Hero          NA       0     0
195205 1204008            Duke Nukem II Remastered         4.5       0     4
195206 1204009                            Deadoxer          NA       0     0
195207 1204010                     Lottso! Express          NA       0     0
195208 1204011              Little White Man vs. X          NA       0     0
195209 1204012 Lost Lagoon 2: Cursed and Forgotten          NA       0     1
       playing backlogs wishlists
195204       0        0         0
195205       0        1         2
195206       0        1         0
195207       0        0         0
195208       0        1         0
195209       0        0         0
                                                                                                                                                                                                                                                                                                                                                                         description
195204                                                                                                                                                                                                                                                                                                                                                                              
195205                                                                                                                                                                                                                       A remaster by Blaze Entertainment as part of Duke Nukem 1+2 Remastered which is set to be released exclusively for Evercade in Duke Nukem Collection 1.
195206                                                                                                                                                                                                                                                                                                                      This is a small horror game in which you have to escape.
195207                                                                                                                                                                                                                                                                                                                                                                              
195208                                                                                                                                                   Attention: Attention, attention, attention!!! Sometimes the game automatically locks enemies, press the tab key to unlock it. (At the beginning, pressing tab is to lock the enemy) Other operations are in the legal terms
195209 Escape from a mysterious and dangerous island in Lost Lagoon 2: Cursed & Forgotten! After waking up shipwrecked, you realize that you have been cursed by powers beyond your understanding. Break the curse quickly because malevolent islanders lurk in the lush landscape and are dead set on making you their next victim! Find a way to return home before it's too late.
nrow(videogame_data)
[1] 195209
ncol(videogame_data)
[1] 10
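
The tail() output above shows rows with an empty date and an NA rating, so it may be worth counting missing values before any analysis. The sketch below is one possible way to do that. It uses a small made-up data frame (`toy_games`, a hypothetical stand-in, not the real file) with a few of the same column names, and it assumes that a blank `date` string means the release date is unknown, which the dataset itself does not document.

```r
# Toy stand-in for videogame_data (values are illustrative, not from games.csv)
toy_games <- data.frame(
  id     = c(1L, 2L, 3L, 4L),
  name   = c("A", "B", "C", "D"),
  date   = c("1947-12-31", "", "1952-08-31", ""),
  rating = c(3.6, NA, 2.8, NA),
  stringsAsFactors = FALSE
)

# Assumption: an empty string in `date` means the release date is missing
toy_games$date[toy_games$date == ""] <- NA

# Parse the remaining ISO-formatted strings into Date objects
toy_games$date <- as.Date(toy_games$date, format = "%Y-%m-%d")

# Count missing values in each column
missing_counts <- colSums(is.na(toy_games))
print(missing_counts)
```

The same three steps could be applied to videogame_data once the meaning of blank dates has been confirmed.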

Brief Description

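This file contains one row per video game (195,209 rows and 10 columns): an ID, title, release date, average rating, counts of reviews, plays, current players, backlogs, and wishlists, and a text description. Genres are not part of this file.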

(This is only the first file. I couldn’t get the other files to download, so I will add the genres, gaming platforms, and individual game scores to the main dataset later.)

The dataset was acquired from Kaggle and can be accessed through this link: https://www.kaggle.com/datasets/gsimonx37/backloggd?select=games.csv
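
Since genres and platforms are meant to be added to the main dataset later, here is a rough sketch of how such a join might look with base R's merge(). Everything in it is hypothetical: the `toy_genres` table, its `genre` column, and the idea that the companion file shares the `id` key are assumptions that would need checking against the actual Kaggle files.

```r
# Hypothetical stand-ins: the real companion file and its columns may differ
toy_games <- data.frame(
  id   = c(1L, 2L, 3L),
  name = c("Game A", "Game B", "Game C"),
  stringsAsFactors = FALSE
)
toy_genres <- data.frame(
  id    = c(2L, 3L, 3L),
  genre = c("Genre X", "Genre X", "Genre Y"),
  stringsAsFactors = FALSE
)

# Left join on id: keep every game, even one with no genre row (genre becomes NA)
games_with_genres <- merge(toy_games, toy_genres, by = "id", all.x = TRUE)
print(games_with_genres)
```

A game with several genres appears on several rows after the join, so row counts will no longer equal the number of games.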

Motivation

Understanding video game trends can help answer key industry questions, such as:

  • What factors contribute to a game’s success?
  • Are certain platforms or genres more successful than others?
  • How do genre trends change over time?
  • What types of games have higher/lower scores?
  • What are the most and least popular games of all time, and is popularity correlated with critic scores?

Hypothesis

Potential hypotheses to test:

  • “Games released on multiple platforms tend to have higher total sales compared to platform-exclusive games.”
  • “Games released on newer platforms (e.g., PS5, Xbox Series X) have higher average user scores than games released on older platforms (e.g., PS2, Xbox 360).”
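
As a rough illustration of how the second hypothesis might eventually be tested, the sketch below runs a Welch two-sample t-test on made-up ratings. The `toy_scores` data frame, the `platform_era` grouping column, and every value in it are invented for illustration; the current file has no platform information, so this assumes a later merge supplies it.

```r
# Made-up ratings for two hypothetical platform groups (illustration only)
toy_scores <- data.frame(
  platform_era = rep(c("older", "newer"), each = 5),
  rating       = c(3.1, 2.8, 3.4, 3.0, 2.9, 3.3, 3.6, 3.2, 3.5, 3.1)
)

# Welch two-sample t-test (two-sided by default): do mean ratings differ?
result <- t.test(rating ~ platform_era, data = toy_scores)
print(result$p.value)
```

Because the hypothesis is directional ("higher"), a one-sided version using the `alternative` argument of t.test() may be the better fit on real data.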

Ethical Considerations

  • Bias Awareness: Recognizing potential biases, such as favoring certain game genres or assuming popularity equals quality.
  • Data Integrity: Ensuring the dataset is accurate and not misrepresented.
  • Representation: Considering how indie and AAA developers are each represented in the data.

Table Creation / Data Dictionary

# Build a data dictionary: one row per column of videogame_data
data_dictionary <- data.frame(
  Variable_Name = colnames(videogame_data),
  Class = sapply(videogame_data, class),
  # Note: is.numeric() is TRUE for integer columns too, so id and the
  # count columns are labelled "Continuous" here
  Continuity = sapply(videogame_data, function(x) ifelse(is.numeric(x), "Continuous", "Discrete")),
  # Descriptions are listed in the same order as the columns
  Description = c(
    "Game ID Number",
    "Game Title",
    "When it was Released",
    "What is the Viewership Rating",
    "How many Reviews",
    "How many Plays the game has",
    "How many are Playing",
    "How many people put it to the side to play later, Backlogs",
    "How many people added to the Wishlist",
    "Description of The game"
  )
)
summary(videogame_data)
       id              name               date               rating      
 Min.   :1000001   Length:195209      Length:195209      Min.   :-0.30   
 1st Qu.:1051405   Class :character   Class :character   1st Qu.: 2.50   
 Median :1102074   Mode  :character   Mode  :character   Median : 3.00   
 Mean   :1102931                                         Mean   : 2.97   
 3rd Qu.:1154452                                         3rd Qu.: 3.50   
 Max.   :1204012                                         Max.   : 6.20   
                                                         NA's   :131670  
    reviews             plays            playing             backlogs       
 Min.   :  -1.000   Min.   :   -1.0   Min.   :   -1.000   Min.   :   -1.00  
 1st Qu.:   0.000   1st Qu.:    0.0   1st Qu.:    0.000   1st Qu.:    0.00  
 Median :   0.000   Median :    2.0   Median :    0.000   Median :    1.00  
 Mean   :   9.211   Mean   :  135.6   Mean   :    4.331   Mean   :   39.89  
 3rd Qu.:   1.000   3rd Qu.:   10.0   3rd Qu.:    0.000   3rd Qu.:    5.00  
 Max.   :8814.000   Max.   :83000.0   Max.   :10000.000   Max.   :17000.00  
                                                                            
   wishlists        description       
 Min.   :   -1.00   Length:195209     
 1st Qu.:    0.00   Class :character  
 Median :    0.00   Mode  :character  
 Mean   :   20.11                     
 3rd Qu.:    3.00                     
 Max.   :11000.00                     
                                      
data_dictionary
            Variable_Name     Class Continuity
id                     id   integer Continuous
name                 name character   Discrete
date                 date character   Discrete
rating             rating   numeric Continuous
reviews           reviews   integer Continuous
plays               plays   integer Continuous
playing           playing   integer Continuous
backlogs         backlogs   integer Continuous
wishlists       wishlists   integer Continuous
description   description character   Discrete
                                                           Description
id                                                      Game ID Number
name                                                        Game Title
date                                              When it was Released
rating                                   What is the Viewership Rating
reviews                                               How many Reviews
plays                                      How many Plays the game has
playing                                           How many are Playing
backlogs    How many people put it to the side to play later, Backlogs
wishlists                        How many people added to the Wishlist
description                                    Description of The game
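
The summary above shows minimums of -1 for the count columns and a negative minimum rating, and the dictionary labels integer columns such as id as "Continuous". The sketch below is one possible refinement, not a definitive rule: it assumes that only double-valued columns should count as continuous, and it counts negative values that may be sentinel codes for missing data (the dataset does not confirm this). `toy_games` and `classify_column` are hypothetical names, and the values are made up.

```r
# Toy stand-in for videogame_data (values are illustrative)
toy_games <- data.frame(
  id      = c(1L, 2L, 3L),
  name    = c("A", "B", "C"),
  rating  = c(3.6, NA, -0.3),
  reviews = c(85L, -1L, 0L),
  stringsAsFactors = FALSE
)

# Assumption: doubles are continuous; integers and text are discrete
classify_column <- function(x) {
  if (is.double(x)) "Continuous" else "Discrete"
}
continuity <- vapply(toy_games, classify_column, character(1))
print(continuity)

# Count negative values per numeric column; these may be sentinel codes
numeric_cols <- vapply(toy_games, is.numeric, logical(1))
negative_counts <- colSums(toy_games[numeric_cols] < 0, na.rm = TRUE)
print(negative_counts)
```

Under this rule only rating would be labelled continuous in the real data, while id and the count columns would be discrete.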