2025-10-20

Dataset

This project uses the “games” data set from Kaggle, which covers the most popular video games released from 1980 to 2023.

The data were collected from Backloggd, a video game collection website where users can rate and review games.

The data set has 14 columns, including each game's release date, the company that made it, its rating, its number of current players, and the number of times it has been played.

Summary

In this project I analyze the data set by producing four graphs and then giving a statistical analysis of the results. The following is a summary of the four graphs and what they visualize:

Bar Graph: Shows the 10 highest-rated video games in the data set as a horizontal bar graph. Ratings are measured out of 5 stars.

Scatter Plot: A 3D plot of the 10 most played games together with their ratings. It shows how much the ratings differ even among the most played games.

Pie Chart: Takes the 50 highest-rated games and shows the percentage produced by each company. This visualizes which companies produce the highest-rated games.

Line Graph: Shows the trend over time in the average number of players and the total amount played for each game from 1990 to 2023.

Ggplot Top Ranked Video Games

This horizontal bar graph visualizes the 10 highest-rated games, with ratings measured out of 5 stars. It shows which games rank highest according to Backloggd.
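The code for this graph is not shown in the report, so the following is only a minimal sketch of how such a chart could be built. It assumes the `games` data frame has the `Title` and `Rating` columns that appear elsewhere in this report; the function name `plot_top_rated` is made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: horizontal bar chart of the n highest-rated games.
# Assumes `games` has columns Title and Rating.
plot_top_rated <- function(games, n = 10) {
  top_games <- games %>%
    arrange(desc(Rating)) %>%
    slice_head(n = n)

  ggplot(top_games, aes(x = reorder(Title, Rating), y = Rating)) +
    geom_col() +
    coord_flip() +                           # flip so the bars run horizontally
    scale_y_continuous(limits = c(0, 5)) +   # ratings are out of 5 stars
    labs(x = "Game", y = "Rating (out of 5)")
}
```

Calling `plot_top_rated(games)` would draw the chart; `reorder()` sorts the bars by rating so the highest-rated game appears at the top.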

Plotly 3D

This 3D plot uses three variables to show the 10 most played games alongside their ratings. We can see that a game being popular does not mean that it is rated well.
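The report does not say which three variables are plotted. As a rough sketch, one plausible choice is the title, the number of plays, and the rating; that axis assignment, and the function name `plot_most_played_3d`, are assumptions rather than the original code.

```r
library(dplyr)
library(plotly)

# Hypothetical helper: 3D scatter of the n most played games.
# Assumed axes: x = Title, y = Plays_num, z = Rating.
plot_most_played_3d <- function(games, n = 10) {
  most_played <- games %>%
    arrange(desc(Plays_num)) %>%
    slice_head(n = n)

  plot_ly(
    most_played,
    x = ~Title, y = ~Plays_num, z = ~Rating,
    type = "scatter3d", mode = "markers"
  )
}
```

With this layout, games that sit high on the plays axis but low on the rating axis are the popular but less well-rated ones.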

Pie Chart Code

To make a pie chart showing which companies produce the top-rated video games, I first had to find the 50 highest-rated games. I could then group these games by company and count them to find how many each company produced.

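library(dplyr)

# Keep the 50 highest-rated games
top_teams <- games %>%
  arrange(desc(Rating)) %>%
  slice_head(n = 50)

# Count how many of those games each company (Team) produced
company <- top_teams %>%
  group_by(Team) %>%
  summarise(Count = n()) %>%
  ungroup()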

Plotly

The pie chart shows which companies produce the highest-rated games. Each slice is the percentage of the top 50 games produced by one company, which makes every company's share easy to read. For example, ZA/UM produced 12% of the top 50 video games.
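The plotting call itself is not shown in the report. A minimal sketch, assuming the `company` data frame built in the pie chart code section (columns `Team` and `Count`), might look like this; the function name `plot_company_share` is hypothetical.

```r
library(plotly)

# Hypothetical helper: pie chart of each company's share of the top games.
# Assumes `company` has columns Team and Count.
plot_company_share <- function(company) {
  plot_ly(
    company,
    labels = ~Team, values = ~Count,
    type = "pie",
    textinfo = "label+percent"   # show the company name and its percentage
  )
}
```

Plotly converts the counts to percentages itself, so `company` only needs the raw number of games per company.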

Ggplot

This line plot shows that video games have grown in popularity over the years: both the average number of current players and the total amount played have increased.
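The report does not include the code for this plot or name the columns it uses. The sketch below is one possible reading: it averages current players and totals plays per release year. The column names `Year` and `Playing_num`, and the function name `plot_yearly_trend`, are hypothetical; only `Plays_num` appears elsewhere in the report.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical helper: yearly trend in average current players and total plays.
# Assumes columns Year and Playing_num (both hypothetical names) and Plays_num.
plot_yearly_trend <- function(games) {
  yearly <- games %>%
    filter(Year >= 1990, Year <= 2023) %>%
    group_by(Year) %>%
    summarise(
      avg_playing = mean(Playing_num, na.rm = TRUE),
      total_plays = sum(Plays_num, na.rm = TRUE)
    ) %>%
    pivot_longer(c(avg_playing, total_plays),
                 names_to = "Measure", values_to = "Value")

  ggplot(yearly, aes(x = Year, y = Value)) +
    geom_line() +
    facet_wrap(~Measure, scales = "free_y") +  # the two measures differ in scale
    labs(x = "Release year", y = "Value")
}
```

The two measures are drawn in separate panels because a total and an average would otherwise share an axis with very different ranges.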

Statistical analysis

This summary prints the 10 highest-rated video games along with the total number of plays each has. Some of the highest-rated games have few plays while others have many; for example, the top entry has a rating of 4.8 but only 1 recorded play. This could introduce a bias: games with more players tend to have more reviews, which makes it harder for them to keep a very high average rating.

##                                Title Rating Plays_num
## 1  Elden Ring: Shadow of the Erdtree    4.8         1
## 2       Disco Elysium: The Final Cut    4.6      6000
## 3                        Outer Wilds    4.6      7700
## 4                      Disco Elysium    4.6      4000
## 5       Umineko: When They Cry Chiru    4.6      1700
## 6        Bloodborne: The Old Hunters    4.6      4400
## 7      Hitman World of Assassination    4.6       167
## 8       Final Fantasy XIV: Endwalker    4.6      2500
## 9    Metal Gear Solid 3: Subsistence    4.6      3700
## 10 Final Fantasy XIV: Shadowbringers    4.6      3000
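
A table like the one above could be produced along the following lines. This is a sketch rather than the original code: it assumes `games` has the `Title`, `Rating`, and `Plays_num` columns shown in the output, and both function names are made up. The second function is an additional, assumed check that is not part of the original analysis: a Spearman rank correlation between rating and plays across the whole data set, as one way to test the relationship described above.

```r
library(dplyr)

# Hypothetical helper: the n highest-rated games with their play counts.
top_rated_summary <- function(games, n = 10) {
  games %>%
    arrange(desc(Rating)) %>%
    slice_head(n = n) %>%
    select(Title, Rating, Plays_num)
}

# Hypothetical helper (not in the original analysis): rank correlation
# between rating and number of plays, ignoring rows with missing values.
rating_plays_correlation <- function(games) {
  cor(games$Rating, games$Plays_num,
      method = "spearman", use = "complete.obs")
}
```

A correlation near zero or below would be consistent with the observation that heavily played games are not necessarily the best rated, while a clearly positive value would count against it.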