1.Overview

Image Credit: Leviathyn

The video game industry has always been fascinating, and it generates billions of dollars in profit every year.

Nintendo, one of the most prestigious video game companies, boosted its revenue by releasing a new video game console, the Switch, in March 2017. Nintendo published its first platform video game, Donkey Kong, in 1981. Since then, it has published many popular games, such as Mario, The Legend of Zelda, Super Smash Bros. and Pokémon. Take-Two Interactive has also published popular games such as NBA 2K and Grand Theft Auto. However, we are curious: is the game industry still booming compared with the past? How does the video game market vary across regions and platforms?

Therefore, we did some research, and this report aims to provide an analysis of video game sales from 1980 to 2008.

1.1.Purpose of Visualization

To give our audience a better understanding of the relationship between video game sales and other features, we have transformed the data into a set of well-organized charts. The charts simplify the figures and make them easier to read.

Image Credit: npd

The ranking above shows the top 10 video games of 2020. Although our dataset only contains meaningful data up to 2016, we can still find similarities in the historical data, such as earlier entries in the Grand Theft Auto and Mario series, and draw some insights about the video game industry.

1.2.Sketch of Proposed DataViz Design

2.Suggestions

In this report, we use tidyverse to preprocess the raw data and produce the basic graphs with ggplot2 and ggtern. Lastly, plotly is used to make the graphs interactive.

3.DataViz Step-by-Step

3.1. Install and Load R Packages

Before we start, we need to install and load the necessary packages in RStudio.

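# tidyverse supplies dplyr (%>%, group_by, summarise) and readr (read_csv)
packages = c('tidyverse', 'ggplot2', 'ggtern', 'plotly', 'readr', 'formattable', 'DT', 'ggpubr')

for(p in packages){
  # Install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}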

3.2.Loading & Preprocessing

Load the video game sales data, vgsales.csv, and create some data tables for further analysis:

1. sales_genre: Global sales by genre.
2. sales_genre_year: Global sales by genre and year.
3. sales_year: Global sales by year.
4. sales_platform: Global sales by platform.
5. new_sales: Global sales by region.

Note: All sales figures are in millions (USD).

data <- read_csv("/Users/jcheah/Documents/MITB/Term 3/Visual Analytics/Assignment 4/dataset/vgsales.csv")

sales_genre <- data %>% group_by(Genre) %>% summarise(Global_Sales = sum(Global_Sales))

sales_genre_year <- data %>% group_by(Genre, Year) %>% summarise(Sales = sum(Global_Sales))
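# Drop rows with a missing year; the raw file marks them with the text "N/A"
sales_genre_year <- sales_genre_year[!is.na(sales_genre_year$Year) & sales_genre_year$Year != "N/A",]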

sales_year <- data %>% group_by(Year) %>% summarise(total_sales = sum(Global_Sales))
# Convert Year from text to numeric; "N/A" entries become NA (with a warning)
sales_year$Year <- as.numeric(sales_year$Year)

Image Credit: Statista

From the graph above, it is easy to see that the global video game market had its largest revenue in 2008. After 2008, however, revenue dropped drastically. We speculate that one reason is that this dataset does not include digital purchases made through online video game distribution services such as Steam and GOG. Since the number of subscribers on these platforms has also grown rapidly, a large share of revenue may have shifted to these service providers, where this dataset does not capture it.

For this reason, data after 2008 is excluded from this report so that the results do not contradict the real-world trend of the video game market.

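# Year is read as text (missing years appear as "N/A"), so convert it before comparing
year_num <- suppressWarnings(as.numeric(data$Year))
data1 <- data[!is.na(year_num) & year_num <= 2008, ]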
sales_genre1 <- data1 %>% group_by(Genre) %>% summarise(Global_Sales = sum(Global_Sales))

sales_genre_year1 <- data1 %>% group_by(Genre, Year) %>% summarise(Sales = sum(Global_Sales))
sales_genre_year1 <- sales_genre_year1[which(!is.na(sales_genre_year1$Year)),]

sales_year1 <- data1 %>% group_by(Year) %>% summarise(total_sales = sum(Global_Sales))
sales_year1$Year <- as.numeric(sales_year1$Year)

Create the sales_platform table, which contains data from 1999 to 2008:

sales_platform <-  data1 %>% group_by(Platform, Year) %>% summarise(sales = sum(Global_Sales))
sales_platform <- sales_platform[!is.na(sales_platform$Year),]
sales_platform <- sales_platform[sales_platform$Year != "N/A",]
sales_platform_1999 <- sales_platform[sales_platform$Year >= 1999,]

Create the new_sales table by turning the regional sales columns into rows:

# Transform each region's sales column into rows
na_sales <- data1[, 1:7]
na_sales$Region <- 'North America'
eu_sales <- data1[, c(1:6, 8)]
eu_sales$Region <- 'Europe'
jp_sales <- data1[, c(1:6, 9)]
jp_sales$Region <- 'Japan'
other_sales <- data1[, c(1:6, 10)]
other_sales$Region <- 'Other'

#Align the column names
colnames(na_sales)[7] <- 'Sales'
colnames(jp_sales)[7] <- 'Sales'
colnames(eu_sales)[7] <- 'Sales'
colnames(other_sales)[7] <- 'Sales'
new_sales <- rbind(na_sales, eu_sales, jp_sales, other_sales)
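The same wide-to-long reshape can also be sketched with pivot_longer() from tidyr, which selects columns by name rather than by position. This is only an illustration: the toy data frame is invented, and the column names NA_Sales, EU_Sales, JP_Sales and Other_Sales are assumed to match those in vgsales.csv.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for data1 (values are invented for illustration)
toy <- tibble(
  Name        = c("Game A", "Game B"),
  NA_Sales    = c(1.0, 0.5),
  EU_Sales    = c(0.4, 0.3),
  JP_Sales    = c(0.2, 0.0),
  Other_Sales = c(0.1, 0.1)
)

# Map the assumed column names to the region labels used in this report
region_labels <- c(NA_Sales = "North America", EU_Sales = "Europe",
                   JP_Sales = "Japan", Other_Sales = "Other")

# One row per game and region, with the sales figure in a single column
toy_long <- toy %>%
  pivot_longer(cols = c(NA_Sales, EU_Sales, JP_Sales, Other_Sales),
               names_to = "Region", values_to = "Sales") %>%
  mutate(Region = unname(region_labels[Region]))

toy_long
```

Selecting by name keeps the reshape working even if the column order in the file changes.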

4.Final Visualization and Insights

4.1.Explore the data

The data table below provides a convenient way for users to explore the data. Search for any video game you are interested in and all the relevant information appears immediately.
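The code for the table is not shown in this report. A minimal sketch using the DT package loaded earlier might look like the following; the toy data frame and the chosen options are assumptions, not the report's actual settings.

```r
library(DT)

# Toy stand-in for data1 (values are invented for illustration)
toy <- data.frame(
  Name         = c("Game A", "Game B", "Game C"),
  Platform     = c("Wii", "PS2", "DS"),
  Year         = c(2006, 2004, 2005),
  Global_Sales = c(10.2, 5.1, 3.3)
)

# Searchable, sortable table with a filter box above each column
tbl <- datatable(toy, filter = "top", rownames = FALSE,
                 options = list(pageLength = 10))
tbl
```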

The trend below shows that the video game market is growing. Each year's growth is larger than the previous year's, and it shows no sign of slowing down.
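The plotting code for the trend is not shown here. One possible sketch, assuming a table shaped like sales_year1 (columns Year and total_sales), draws a line chart and computes the year-over-year change. The numbers below are invented.

```r
library(ggplot2)
library(plotly)

# Toy stand-in for sales_year1 (values are invented for illustration)
toy_year <- data.frame(
  Year        = 2000:2004,
  total_sales = c(200, 330, 395, 358, 420)
)

# Year-over-year change; the first year has no predecessor
toy_year$change <- c(NA, diff(toy_year$total_sales))

p_trend <- ggplot(toy_year, aes(x = Year, y = total_sales)) +
  geom_line() +
  geom_point() +
  labs(title = "Global sales by year", x = "Year", y = "Global sales")

# Convert the static chart into an interactive one
ggplotly(p_trend)
```

Inspecting the change column is a quick way to check whether growth really accelerates from year to year.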

4.2.Under the surface

Which kinds of platforms contribute the most to video game sales?

The graph below shows that video game consoles, such as the PS series and Xbox, dominated the video game market compared with portable game consoles. Hence, we can expect a spike in sales when new video consoles are released.

p3 <- ggplot(data = sales_platform_1999, aes(x = Year, y = sales, fill = Platform)) +     
  geom_bar(stat = 'identity', alpha = 0.8) +
  labs(title='Global sales by platform')

ggplotly(p3)

Sales also differ hugely between genres. As the trend graphs below show, although sales are increasing in every genre, the growth rates differ considerably. For example, Action and Sports have very steep growth curves, whereas growth in Puzzle, Platform and Strategy is relatively flat. One reason might be the release of some popular games such as Diablo 2 and Code Red.

The box plot gives a similar result to the matrix plots: Action and Sports have the highest global sales.
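The code behind the per-genre trend graphs and the box plot is not shown in this report. A rough sketch, assuming a table shaped like sales_genre_year1 (columns Genre, Year and Sales) and using invented values, could be:

```r
library(ggplot2)

# Toy stand-in for sales_genre_year1 (values are invented for illustration)
toy_gy <- data.frame(
  Genre = rep(c("Action", "Puzzle"), each = 4),
  Year  = rep(2001:2004, times = 2),
  Sales = c(60, 80, 110, 150, 10, 11, 9, 12)
)

# One small trend panel per genre
p_genre <- ggplot(toy_gy, aes(x = Year, y = Sales)) +
  geom_line() +
  facet_wrap(~ Genre) +
  labs(title = "Global sales by genre and year")

# Distribution of yearly sales within each genre
p_box <- ggplot(toy_gy, aes(x = Genre, y = Sales)) +
  geom_boxplot() +
  labs(title = "Yearly global sales by genre")

p_genre
p_box
```

By default facet_wrap() shares the y-axis across panels, which makes steep and flat genres directly comparable.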

According to the table below, North America is the biggest video game market. Its sales (2881.83) exceed those of all other markets combined (2746.79).

Region          Sales     Rank
North America   2881.83   1
Europe          1378.66   2
Japan            923.62   3
Other            444.51   4
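A ranking table like this one can be derived from a long table such as new_sales. The following is a sketch under that assumption, with invented values; the column names Region and Sales follow the preprocessing step.

```r
library(dplyr)

# Toy stand-in for new_sales (values are invented for illustration)
toy_sales <- data.frame(
  Region = rep(c("North America", "Europe", "Japan", "Other"), times = 2),
  Sales  = c(3.0, 1.5, 1.0, 0.5, 2.0, 1.0, 0.5, 0.2)
)

# Total sales per region, sorted from largest to smallest, with a rank column
region_rank <- toy_sales %>%
  group_by(Region) %>%
  summarise(Sales = sum(Sales)) %>%
  arrange(desc(Sales)) %>%
  mutate(Rank = row_number())

region_rank
```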

North America, Europe and Japan are the three biggest video game markets. It is therefore interesting to visualize the sales across these three markets in a ternary chart.

In the chart, it is clear that a large volume of sales is heavily skewed towards the North American market, and that the EU has more sales in the range of 20 to 40 million. Interestingly, sales in Japan are distributed more evenly than in the other regions.

Interactive ternary chart for global video game sales:
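The chart code is not included in this report. One way to sketch an interactive ternary chart is plotly's scatterternary trace type. The toy data and the column names NA_Sales, EU_Sales and JP_Sales are assumptions, and this is not necessarily how the chart in this report was built.

```r
library(plotly)

# Toy stand-in for data1 (values are invented for illustration)
toy <- data.frame(
  Name     = c("Game A", "Game B", "Game C"),
  NA_Sales = c(5.0, 0.2, 1.0),
  EU_Sales = c(2.0, 0.1, 1.0),
  JP_Sales = c(0.5, 1.5, 1.0)
)

# Each point is placed by its share of sales across the three regions
fig <- plot_ly(
  toy,
  type = "scatterternary",
  mode = "markers",
  a = ~NA_Sales, b = ~EU_Sales, c = ~JP_Sales,
  text = ~Name
) %>%
  layout(ternary = list(
    aaxis = list(title = "North America"),
    baxis = list(title = "Europe"),
    caxis = list(title = "Japan")
  ))

fig
```

A game sold equally in all three regions (Game C here) lands at the centre of the triangle, while a game sold mostly in one region sits near that region's corner.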

4.3. Pokemon Series

Image Credit: Pokemongolive

Pokemon is part of everyone’s childhood memories, especially for people of my generation (born after 1990). However, we want to know whether this phenomenon exists only in East Asia (Japan can be seen as a representative country from the perspective of video games). Out of this curiosity, I plotted a ternary chart of video game sales for the Pokemon series.

In the chart, some of the points in the high-sales area lean towards the NA market. In addition, several points fall in the middle range of sales (20-50) for all regions. Only a few points have really high sales in one specific region.

Interactive ternary chart for Pokemon series sales:
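How the Pokemon titles were selected is not shown. A simple sketch is to filter on the game name; this assumes the titles contain the unaccented text "Pokemon" in a Name column, and the rows below are invented.

```r
library(dplyr)

# Toy stand-in for data1 (values are invented for illustration)
toy <- data.frame(
  Name     = c("Pokemon Red/Pokemon Blue", "Super Mario Bros.",
               "Pokemon Gold/Pokemon Silver", "Tetris"),
  NA_Sales = c(11.0, 29.0, 9.0, 23.0),
  EU_Sales = c(8.9, 3.6, 6.2, 2.3),
  JP_Sales = c(10.2, 6.8, 7.2, 4.2)
)

# Keep only rows whose name mentions Pokemon, ignoring letter case
pokemon <- toy %>%
  filter(grepl("pokemon", Name, ignore.case = TRUE))

pokemon
```

The filtered table can then be passed to the same ternary chart code as the full dataset.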

5.Data Storytelling

To summarise the graphs, I have wrapped up three points below:

1.Video Console vs. Handheld Console

Comparing sales between handheld consoles and video consoles, it is obvious that video consoles brought in more game sales. One reason is that games for video consoles are usually more expensive. Moreover, some multiplayer games, such as Mario Kart and Just Dance, are better suited to a big screen than to a handheld console.

3.Region Market Comparison

There are games that have 100% of their sales in the North American or Japanese market. However, far fewer such cases are found in the European market. I personally think this is because fewer games come from European game development companies than from the other two major markets. One more fun fact: in each of the NA, EU and JP markets, the top 5 games are all published by Nintendo, one of the largest game development companies in the world, based in Japan.
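The "100% sales in one region" observation can be checked by computing each game's regional share. The sketch below assumes the column names NA_Sales, EU_Sales and JP_Sales and uses invented values.

```r
library(dplyr)

# Toy stand-in for data1 (values are invented for illustration)
toy <- data.frame(
  Name     = c("Game A", "Game B", "Game C"),
  NA_Sales = c(1.0, 0.0, 0.5),
  EU_Sales = c(0.0, 0.0, 0.3),
  JP_Sales = c(0.0, 0.8, 0.2)
)

# Share of each game's three-region sales that falls in each region
shares <- toy %>%
  mutate(total = NA_Sales + EU_Sales + JP_Sales) %>%
  filter(total > 0) %>%   # avoid dividing by zero
  mutate(NA_share = NA_Sales / total,
         EU_share = EU_Sales / total,
         JP_share = JP_Sales / total)

# Number of games sold entirely in a single region
single_region <- shares %>%
  summarise(NA_only = sum(NA_share == 1),
            EU_only = sum(EU_share == 1),
            JP_only = sum(JP_share == 1))

single_region
```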