Video Game Sales Analytics: A Business Perspective

Author

Hadrien

Published

March 3, 2025

Introduction

This report analyzes the “vgsales” dataset from Kaggle to provide actionable insights for video game publishers. Using R and Python within a Quarto document, we explore sales trends, platform performance, and genre popularity to optimize development and marketing strategies.


1) Topic Selection

Problem Statement:
How can video game publishers optimize their development and marketing strategies based on historical sales data across platforms, genres, and regions?

Relevance:
Understanding sales patterns enables publishers to allocate resources efficiently, target high-profit markets, and maximize return on investment in a competitive industry.


Data Acquisition & Preparation

Load and Clean the Dataset

# A tibble: 6 × 11
   Rank Name           Platform Year  Genre Publisher NA_Sales EU_Sales JP_Sales
  <dbl> <chr>          <chr>    <chr> <chr> <chr>        <dbl>    <dbl>    <dbl>
1     1 Wii Sports     Wii      2006  Spor… Nintendo      41.5    29.0      3.77
2     2 Super Mario B… NES      1985  Plat… Nintendo      29.1     3.58     6.81
3     3 Mario Kart Wii Wii      2008  Raci… Nintendo      15.8    12.9      3.79
4     4 Wii Sports Re… Wii      2009  Spor… Nintendo      15.8    11.0      3.28
5     5 Pokemon Red/P… GB       1996  Role… Nintendo      11.3     8.89    10.2 
6     6 Tetris         GB       1989  Puzz… Nintendo      23.2     2.26     4.22
# ℹ 2 more variables: Other_Sales <dbl>, Global_Sales <dbl>
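The preview shows Year parsed as text (`<chr>`), which suggests that some entries are not numeric. The following is a minimal Python sketch of a load-and-clean step under that assumption, using only the standard library. The inline sample reuses the fully visible columns from the preview (rounded as displayed); the last row, the `"N/A"` marker, and the helper names `parse_year` and `load_sales` are hypothetical and only serve the illustration.

```python
import csv
import io

# Rows copied from the preview above (only columns that are shown in full).
# The last row is a made-up placeholder that demonstrates a non-numeric year.
SAMPLE_CSV = """Rank,Platform,Year,NA_Sales,EU_Sales,JP_Sales
1,Wii,2006,41.5,29.0,3.77
2,NES,1985,29.1,3.58,6.81
3,Wii,2008,15.8,12.9,3.79
4,Wii,2009,15.8,11.0,3.28
5,GB,1996,11.3,8.89,10.2
6,GB,1989,23.2,2.26,4.22
7,PS2,N/A,1.0,0.5,0.1
"""

SALES_COLUMNS = ("NA_Sales", "EU_Sales", "JP_Sales")


def parse_year(text):
    """Return the year as an int, or None when the field is not numeric."""
    text = text.strip()
    return int(text) if text.isdigit() else None


def load_sales(handle):
    """Read CSV rows, converting Year to int-or-None and sales columns to float."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        row["Year"] = parse_year(raw["Year"])
        for col in SALES_COLUMNS:
            row[col] = float(raw[col])
        rows.append(row)
    return rows


rows = load_sales(io.StringIO(SAMPLE_CSV))
# Drop rows whose year could not be parsed before any year-based analysis.
clean_rows = [r for r in rows if r["Year"] is not None]
print(len(rows), "rows read,", len(clean_rows), "kept after dropping missing years")
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle, and the remaining sales columns would be added to `SALES_COLUMNS`.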

Exploratory Data Analysis (EDA)

Year vs. Global Sales

Calculate total sales by region
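As an illustration of this step, here is a small Python sketch that sums each regional column. It runs on the six preview rows only (values rounded as displayed in the preview), so the totals are not those of the full dataset. The function name `total_sales_by_region` is an assumption, and Other_Sales is omitted because its values are not visible in the preview.

```python
# Regional figures for the six titles in the dataset preview
# (rounded as displayed there; this is not the full dataset).
preview_rows = [
    {"Platform": "Wii", "NA_Sales": 41.5, "EU_Sales": 29.0, "JP_Sales": 3.77},
    {"Platform": "NES", "NA_Sales": 29.1, "EU_Sales": 3.58, "JP_Sales": 6.81},
    {"Platform": "Wii", "NA_Sales": 15.8, "EU_Sales": 12.9, "JP_Sales": 3.79},
    {"Platform": "Wii", "NA_Sales": 15.8, "EU_Sales": 11.0, "JP_Sales": 3.28},
    {"Platform": "GB", "NA_Sales": 11.3, "EU_Sales": 8.89, "JP_Sales": 10.2},
    {"Platform": "GB", "NA_Sales": 23.2, "EU_Sales": 2.26, "JP_Sales": 4.22},
]

REGION_COLUMNS = ("NA_Sales", "EU_Sales", "JP_Sales")


def total_sales_by_region(rows, region_columns=REGION_COLUMNS):
    """Sum each regional sales column over all rows."""
    return {col: sum(row[col] for row in rows) for col in region_columns}


region_totals = total_sales_by_region(preview_rows)
# Print regions from largest to smallest total.
for region, total in sorted(region_totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region}: {total:.2f}")
```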

Calculate total global sales per year

We analyze global sales as a function of year to identify trends over time.
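The per-year aggregation can be sketched in Python as follows. This is an assumption-laden illustration rather than the report's own code: `yearly_totals` is a hypothetical helper, rows are assumed to carry an integer `Year` (or `None` when missing), and `NA_Sales` stands in for `Global_Sales` in the demo because the preview does not show global figures.

```python
from collections import defaultdict


def yearly_totals(rows, value_col="Global_Sales"):
    """Return (year, total) pairs in chronological order, skipping rows with no year."""
    totals = defaultdict(float)
    for row in rows:
        if row["Year"] is None:
            continue  # a missing year cannot be placed on the time axis
        totals[row["Year"]] += row[value_col]
    return sorted(totals.items())


# Year and NA_Sales for the six preview rows (values rounded as displayed).
# Each year appears once here, so every total equals a single row's value.
rows = [
    {"Year": 2006, "NA_Sales": 41.5},
    {"Year": 1985, "NA_Sales": 29.1},
    {"Year": 2008, "NA_Sales": 15.8},
    {"Year": 2009, "NA_Sales": 15.8},
    {"Year": 1996, "NA_Sales": 11.3},
    {"Year": 1989, "NA_Sales": 23.2},
]

trend = yearly_totals(rows, value_col="NA_Sales")
for year, total in trend:
    print(year, total)
```

On the full dataset the call would use the default `value_col="Global_Sales"`, and the resulting pairs can be passed directly to a line plot.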

3) Business Analytics

Analytical Techniques:
Aggregation: Summarize sales by platform and genre to identify top performers.
Correlation Analysis: Examine relationships between regional sales to inform market strategies.

Justification:
Aggregations highlight profitable platforms and genres, while correlation analysis reveals market similarities. Both support business decisions without requiring advanced modeling.

Sales per platform

# A tibble: 10 × 2
   Platform Total_Global_Sales
   <chr>                 <dbl>
 1 PS2                   1256.
 2 X360                   980.
 3 PS3                    958.
 4 Wii                    927.
 5 DS                     822.
 6 PS                     731.
 7 GBA                    318.
 8 PSP                    296.
 9 PS4                    278.
10 PC                     259.

Sales by genre

# A tibble: 6 × 2
  Genre        Total_Global_Sales
  <chr>                     <dbl>
1 Action                    1751.
2 Sports                    1331.
3 Shooter                   1037.
4 Role-Playing               927.
5 Platform                   831.
6 Misc                       810.

Correlation between regional sales

          NA_Sales  EU_Sales  JP_Sales
NA_Sales 1.0000000 0.7677267 0.4497874
EU_Sales 0.7677267 1.0000000 0.4355845
JP_Sales 0.4497874 0.4355845 1.0000000
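For readers who want to see how such a matrix is computed, the sketch below implements the Pearson coefficient by hand in Python. It assumes the matrix above uses Pearson correlation, which the report does not state. The helper names `pearson` and `correlation_matrix` are hypothetical. The demo uses only the six preview rows, which is far too few for a reliable estimate, so its output will not match the full-dataset matrix.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each (row_name, col_name) pair to its Pearson coefficient."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Regional sales for the six preview rows (values rounded as displayed).
regional = {
    "NA_Sales": [41.5, 29.1, 15.8, 15.8, 11.3, 23.2],
    "EU_Sales": [29.0, 3.58, 12.9, 11.0, 8.89, 2.26],
    "JP_Sales": [3.77, 6.81, 3.79, 3.28, 10.2, 4.22],
}

matrix = correlation_matrix(regional)
for a in regional:
    print(a, " ".join(f"{matrix[(a, b)]:.3f}" for b in regional))
```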

Interpretation

Platforms:

High sales on PS2 and Xbox 360 suggest targeting these ecosystems.

Genres:

Action and Sports lead, indicating strong demand.

Regional Correlation:

NA and EU markets align closely (r ≈ 0.77), while Japan correlates only moderately with both (r ≈ 0.45 with NA, r ≈ 0.44 with EU), necessitating tailored strategies.

5) Using Functions and Error Handling

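library(dplyr)  # provides %>%, group_by(), summarise(), arrange()

# Sum Global_Sales within each level of `group_var` (a column name given as a string).
calculate_total_sales <- function(data, group_var) {
  # Fail early with a clear message if the column is missing
  if (!group_var %in% names(data)) {
    stop("Grouping variable '", group_var, "' does not exist in the data.")
  }
  data %>%
    group_by(.data[[group_var]]) %>%
    summarise(Total_Global_Sales = sum(Global_Sales, na.rm = TRUE)) %>%
    arrange(desc(Total_Global_Sales))
}

# `data` is the cleaned vgsales tibble loaded earlier
tryCatch({
  platform_sales <- calculate_total_sales(data, "Platform")
  print(head(platform_sales))
}, error = function(e) {
  # The "Error: " prefix is added here only, so it is not printed twice
  message("Error: ", e$message)
})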
# A tibble: 6 × 2
  Platform Total_Global_Sales
  <chr>                 <dbl>
1 PS2                   1256.
2 X360                   980.
3 PS3                    958.
4 Wii                    927.
5 DS                     822.
6 PS                     731.

6) Visualization and Interpretation

Visualization 1: Top 10 Platforms by Global Sales
A bar chart is chosen to rank platforms clearly by sales volume.
Interpretation: PS2 and Xbox 360 dominate, suggesting a focus on these platforms for maximum reach.

Visualization 2: Global Sales by Genre
A bar chart effectively compares genre performance.
Interpretation: Action and Sports genres lead, indicating high market potential.

Visualization 3: Regional Sales Correlation
A scatter plot matrix (pairs plot) is selected to visualize correlations between regional sales intuitively.
Interpretation: NA and EU sales correlate strongly, while JP sales diverge, highlighting the need for region-specific approaches.

8) Conclusion and Business Recommendations

Key Insights

Focus on dominant platforms (e.g., PS2, Xbox 360) and popular genres (Action, Sports). Japan’s unique sales patterns require distinct strategies.

Recommendations

Prioritize multi-platform releases on top platforms. Develop Action and Sports games to leverage demand. Tailor marketing for Japan with localized campaigns.

Limitations & Future Steps

Historical data may not reflect current trends; newer data would enhance relevance. Future work could use predictive modeling for deeper insights.