data <- read.csv("C:\\Users\\gajaw\\OneDrive\\Desktop\\STATS\\vgsales.csv")
col1_summary <- summary(data$Global_Sales)
print(col1_summary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0100 0.0600 0.1700 0.5374 0.4700 82.7400
Insight:
Global sales range from 0.01 million to 82.74 million units, with a
median of 0.17 million. The mean (0.54 million) is higher than the third
quartile (0.47 million), so the distribution is strongly right-skewed: a
small percentage of games have far higher global sales than the bulk.
Significance: The wide gap between the median and maximum sales points to a few blockbuster games accounting for a disproportionate share of total sales in the market. Knowing which games fit into this group will make it easier to focus on popular features or genres.
Further Question: What traits do these blockbuster games share (genre, publisher, platform, etc.) and how are they different from titles that have sold less?
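One way to put a number on the skew described above is to compute the share of total sales held by the top fraction of titles. The sketch below is only illustrative: `top_share` is a hypothetical helper (not part of the original analysis) and `toy_sales` holds made-up values rather than rows from vgsales.csv.

```r
# Hypothetical helper: share of total sales contributed by the
# top `frac` of titles (e.g. frac = 0.01 for the top 1%).
top_share <- function(sales, frac = 0.01) {
  sales <- sort(sales[!is.na(sales)], decreasing = TRUE)
  n_top <- max(1, ceiling(length(sales) * frac))  # always keep at least one title
  sum(sales[seq_len(n_top)]) / sum(sales)
}

# Made-up toy values in millions of units (NOT taken from vgsales.csv)
toy_sales <- c(82.74, 5.00, 1.00, 0.50, 0.20, 0.17, 0.10, 0.06, 0.02, 0.01)
print(top_share(toy_sales, frac = 0.20))  # share held by the top 20% of toy titles
```

On the real data the call would be `top_share(data$Global_Sales, frac = 0.01)`; a value close to 1 would support the blockbuster reading, while a small value would weaken it.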
col2_summary <- summary(data$Other_Sales)
print(col2_summary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00000 0.00000 0.01000 0.04806 0.04000 10.57000
Insight: Other_Sales values range from 0.00 to 10.57 million, with a median of only 0.01 million, showing that, with very few exceptions, most games have relatively modest sales outside of the primary markets (NA, EU, and JP).
Significance: This implies that the primary markets account for the great bulk of game sales, with other regions contributing very little to the total. To enter these smaller markets more successfully, businesses might need tailored strategies.
Further Question: Which games have sold better in these
other regions, and what special qualities or tactics helped them
succeed?
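A first pass at this question could rank titles by Other_Sales and look at how much of each title's global total comes from those regions. This is a minimal sketch on a made-up data frame; the `Name` column is assumed to exist in the real file, and `Other_Share` is a name introduced here for illustration.

```r
# Made-up toy rows (NOT from vgsales.csv); the Name column is an assumption
toy <- data.frame(
  Name         = c("A", "B", "C", "D"),
  Global_Sales = c(10.0, 2.0, 0.5, 0.1),
  Other_Sales  = c(1.0, 0.8, 0.0, 0.01),
  stringsAsFactors = FALSE
)

# Fraction of each title's global sales that comes from other regions
toy$Other_Share <- toy$Other_Sales / toy$Global_Sales

# Titles ordered from highest to lowest Other_Sales
top_other <- toy[order(-toy$Other_Sales), ]
print(head(top_other, 3))
```

Sorting by `Other_Share` instead of `Other_Sales` would highlight titles that depend unusually heavily on other regions, which is a different question from raw volume.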
col3_val_count <- table(Genre = data$Genre)
# Categorical summary for column Genre
# table() already pairs each genre with its count. Combining it with unique()
# mislabels rows, because unique() returns genres in order of appearance
# while table() sorts them alphabetically.
print(as.data.frame(col3_val_count, responseName = "Count"))
##           Genre Count
## 1        Action  3316
## 2     Adventure  1286
## 3      Fighting   848
## 4          Misc  1739
## 5      Platform   886
## 6        Puzzle   582
## 7        Racing  1249
## 8  Role-Playing  1488
## 9       Shooter  1310
## 10   Simulation   867
## 11       Sports  2346
## 12     Strategy   681
Insight: The Action genre has the most games
(3,316), followed by Sports (2,346), suggesting that these are the most
frequently produced categories in the dataset.
Significance: A robust and steady market demand could
be the reason so many Action titles are released. Knowing the
reasons behind the popularity of this genre could be useful in creating
new games or improving ones that already exist to appeal to this
market.
Further Question: Does the greater number of Action
games correspond to higher worldwide sales figures for this genre, or
does production volume not translate into sales success?
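Whether title count tracks sales can be checked by putting the number of titles, total sales, and average sales per title side by side for each genre. The following is a rough sketch on made-up rows; `genre_stats` and `Mean_Per_Title` are names introduced here, not part of the original analysis.

```r
# Made-up toy rows (NOT from vgsales.csv)
toy <- data.frame(
  Genre        = c("Action", "Action", "Action", "Sports", "Sports", "Puzzle"),
  Global_Sales = c(1.0, 0.5, 0.3, 4.0, 2.0, 0.2),
  stringsAsFactors = FALSE
)

# tapply() returns groups in the same (alphabetical) order on every call,
# so the two vectors below line up genre by genre.
titles <- tapply(toy$Global_Sales, toy$Genre, length)
total  <- tapply(toy$Global_Sales, toy$Genre, sum)

genre_stats <- data.frame(
  Genre  = names(titles),
  Titles = as.vector(titles),
  Total  = as.vector(total)
)
genre_stats$Mean_Per_Title <- genre_stats$Total / genre_stats$Titles

# Genres ordered by average sales per title
print(genre_stats[order(-genre_stats$Mean_Per_Title), ])
```

A genre with many titles but a low `Mean_Per_Title` would suggest that volume of releases, rather than per-game performance, drives its total.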
How do different game genres impact global sales?
What sales patterns can be observed over time for different genres?
Are there certain genres with unusually high or low sales that could skew overall market analysis?
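Aggregate function for question 1 (total global sales by genre):
# The aggregation uses sum, so the result holds totals rather than means
total_global_sales <- aggregate(Global_Sales ~ Genre, data = data, FUN = sum)
print(total_global_sales)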
## Genre Global_Sales
## 1 Action 1751.18
## 2 Adventure 239.04
## 3 Fighting 448.91
## 4 Misc 809.96
## 5 Platform 831.37
## 6 Puzzle 244.95
## 7 Racing 732.04
## 8 Role-Playing 927.37
## 9 Shooter 1037.37
## 10 Simulation 392.20
## 11 Sports 1330.93
## 12 Strategy 175.12
Insights:
Top-Performing Genres Globally:
Action games lead with total global sales of 1,751.18 million, indicating the broadest appeal worldwide.
Sports (1,330.93 million) and Shooter games (1,037.37 million) also perform exceptionally well, reflecting strong market demand.
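Moderate-Performing Genres:
Role-Playing (927.37 million), Platform (831.37 million), Misc (809.96 million), and Racing (732.04 million) sit in the middle of the ranking.
Lower-Performing Genres:
Fighting (448.91 million), Simulation (392.20 million), Puzzle (244.95 million), Adventure (239.04 million), and Strategy (175.12 million) record the lowest totals.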
Significance:
Action, Sports, and Shooter games dominate global sales, suggesting they are the highest-selling and most widely appealing genres; unit sales alone do not show which genres are most profitable.
Lower-performing genres could still be profitable within niche markets or through innovative strategies.
A mix of high-performing and niche genres could offer a balanced approach for game developers.
Further Questions:
How do sales vary by region or change over time?
What impact do marketing strategies have on different genres?
Can we expand the audience for lower-performing genres?
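The regional question could be approached by summing each regional sales column per genre and converting the totals to shares. This is a sketch under assumptions: the column names `NA_Sales`, `EU_Sales`, and `JP_Sales` are assumed to exist alongside `Other_Sales` in the real file, and the rows below are made up.

```r
# Made-up toy rows (NOT from vgsales.csv); regional column names are assumed
toy <- data.frame(
  Genre       = c("Action", "Action", "Sports", "Sports"),
  NA_Sales    = c(1.0, 0.5, 2.0, 1.0),
  EU_Sales    = c(0.5, 0.2, 1.5, 0.5),
  JP_Sales    = c(0.1, 0.0, 0.2, 0.1),
  Other_Sales = c(0.1, 0.1, 0.3, 0.1),
  stringsAsFactors = FALSE
)

# Total sales per genre in each region
regional <- aggregate(
  cbind(NA_Sales, EU_Sales, JP_Sales, Other_Sales) ~ Genre,
  data = toy, FUN = sum
)

# Convert totals to each region's share of the genre's sales
region_cols <- c("NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales")
regional_share <- regional
regional_share[region_cols] <- regional[region_cols] / rowSums(regional[region_cols])
print(regional_share)
```

Comparing shares rather than totals keeps large genres from dominating the table, so regional preferences are easier to see.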
Box plot visualization between 2 columns - Global sales & Genre
library(ggplot2)
ggplot(data, aes(x = Genre, y = Global_Sales)) +
  geom_boxplot(fill = "skyblue", color = "black") +
  labs(title = "Distribution of global sales by genre", x = "Genre", y = "Global sales in millions") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Insights:
Sports, Platform, Racing, and Shooter genres have the most variability in sales, with several games reaching very high global sales. For example, a game in the Sports genre has reached around 80 million units sold, indicating some blockbuster hits in these categories.
Adventure, Puzzle, Strategy, and Simulation genres show lower and more consistent sales, with few games achieving very high sales. These genres tend to have fewer top-sellers and more steady performance across games.
Almost every genre has a few outlier games that significantly exceed the typical sales for that genre, showing that standout successes can occur in any genre.
Significance :
Genres like Sports and Shooter can produce big hits, but also show wide variability, meaning they are higher risk.
Genres like Adventure and Puzzle are more predictable but generally achieve lower sales, suggesting a safer but less lucrative market.
Further Questions:
What factors make some games in high-variability genres more successful?
Are there regional preferences that affect sales by genre?
How have these trends changed over time?
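The variability visible in the box plot can also be summarised numerically, for example with the median, interquartile range, and maximum per genre. The sketch below uses base R on made-up rows; `spread` is a name introduced here for illustration.

```r
# Made-up toy rows (NOT from vgsales.csv)
toy <- data.frame(
  Genre = c("Sports", "Sports", "Sports", "Sports",
            "Puzzle", "Puzzle", "Puzzle", "Puzzle"),
  Global_Sales = c(0.1, 0.5, 2.0, 40.0, 0.1, 0.2, 0.3, 0.4),
  stringsAsFactors = FALSE
)

# Per-genre summaries; tapply() returns genres in the same order each time
med <- tapply(toy$Global_Sales, toy$Genre, median)
iqr <- tapply(toy$Global_Sales, toy$Genre, IQR)
mx  <- tapply(toy$Global_Sales, toy$Genre, max)

spread <- data.frame(
  Genre  = names(med),
  Median = as.vector(med),
  IQR    = as.vector(iqr),
  Max    = as.vector(mx)
)

# Genres ordered from most to least variable
print(spread[order(-spread$IQR), ])
```

Because the sales distribution is heavily skewed, the IQR is a more robust measure of spread here than the standard deviation, which a single blockbuster can inflate.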
Scatter Plot for correlation between columns - Global sales and other sales by Genre
ggplot(data, aes(x = Global_Sales, y = Other_Sales, color = Genre)) +
  geom_point(alpha = 0.7) +
  # Axis labels: fixed the unbalanced closing parentheses
  labs(title = "Correlation between global sales and other sales by Genre", x = "Global sales (in millions)", y = "Other sales (in millions)") +
  theme_minimal()
Insight:
The scatter plot indicates a positive association between
global sales and sales in other regions, with the majority
of titles clustered at low values on both axes. Certain genres, such as “Action,”
“Shooter,” and “Platform,” have unusually high sales outliers,
suggesting a wider audience.
Significance: Games with high global sales also tend to sell more in other regions. Part of this association is expected if Other_Sales is one of the components of Global_Sales. Nonetheless, the majority of games sell little on either measure.
Further Questions:
Which areas make up the majority of “Other Sales”?
Do sales follow any patterns over time or vary depending on the platform?
What characteristics distinguish certain games as sales outliers?
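The association seen in the scatter plot can be quantified with a correlation coefficient. The sketch below uses made-up values and computes both Pearson and Spearman correlations, since the rank-based Spearman version is less sensitive to extreme outliers. It also assumes, as the regional description suggests but the source does not state outright, that Global_Sales already includes Other_Sales, and so repeats the check against the remainder.

```r
# Made-up toy rows (NOT from vgsales.csv)
toy <- data.frame(
  Global_Sales = c(10.0, 5.0, 2.0, 1.0, 0.5, 0.1),
  Other_Sales  = c(1.2, 0.4, 0.3, 0.05, 0.02, 0.0)
)

pearson  <- cor(toy$Global_Sales, toy$Other_Sales, method = "pearson")
spearman <- cor(toy$Global_Sales, toy$Other_Sales, method = "spearman")

# Assumption: Global_Sales is the sum of all regions, including Other_Sales.
# Correlating Other_Sales with the remainder avoids counting it on both axes.
rest_of_world <- toy$Global_Sales - toy$Other_Sales
spearman_rest <- cor(rest_of_world, toy$Other_Sales, method = "spearman")

print(c(pearson = pearson, spearman = spearman, spearman_rest = spearman_rest))
```

If the correlation stays high against the remainder, the relationship is not just an artifact of one column being part of the other.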
Line plot for the trends in global sales over the time by genre
ggplot(data, aes(x = Year, y = Global_Sales, color = Genre, group = Genre)) +
  geom_line() +
  labs(title = "Trend of global sales over time by Genre", x = "Year", y = "Global Sales in millions") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
Insights:
Sales Fluctuate: Sales patterns vary greatly between genres, with some
exhibiting strong spikes; the clearest peak is in the “Sports”
category.
Consistently Low Sales: Over time, some genres have had consistently
low sales, which may indicate limited popularity or little growth.
Significance:
By identifying popular genres over time, this data can assist publishers
and game developers in scheduling upcoming releases.
Further Questions:
What led to the sudden increases in sales?
Why do some genres always have greater or lower popularity than others?
Do these tendencies vary based on the platform or region?
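Because the line plot is drawn from one row per game, each genre's line passes through every individual title in a year. A trend is usually easier to read if sales are first summed per year and genre. The sketch below shows that step on made-up rows; it assumes Year may have been read as text (for example, if missing years are stored as a string such as "N/A"), which is an assumption about the file rather than something the source states.

```r
# Made-up toy rows (NOT from vgsales.csv); "N/A" stands in for a missing year
toy <- data.frame(
  Year         = c("2006", "2006", "2006", "2007", "2007", "N/A"),
  Genre        = c("Sports", "Sports", "Action", "Sports", "Action", "Action"),
  Global_Sales = c(82.74, 1.00, 2.00, 3.00, 4.00, 0.50),
  stringsAsFactors = FALSE
)

# Convert Year to integer; non-numeric entries become NA
toy$Year <- suppressWarnings(as.integer(toy$Year))

# One row per Year/Genre pair; the formula interface drops rows with NA Year
yearly <- aggregate(Global_Sales ~ Year + Genre, data = toy, FUN = sum)
print(yearly)

# The aggregated frame could then be plotted with, for example:
# ggplot(yearly, aes(x = Year, y = Global_Sales, color = Genre)) + geom_line()
```

With Year numeric and one value per genre per year, each line shows a genre's yearly total instead of zigzagging across individual games.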
Stacked bar plot showing Interaction between variables: Sales distribution by Genre and publisher
ggplot(data, aes(x = Genre, y = Global_Sales, fill = Publisher)) +
  geom_bar(stat = "identity") +
  labs(title = "Sales distribution by Genre and Publisher", x = "Genre", y = "Total global sales in millions") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Insights:
Diverse Publishers for Every Genre: Some genres have numerous publishers
contributing to sales, while others have only a small number, indicating
that some genres are more competitive than others.
Sales Dominance: Publishers that account for most of the sales in a
given genre hold a strong market position there.
Significance:
This can direct new publishers to less competitive sectors and aid in
identifying which publishers are successful in specific genres.
Further Questions:
Which publishers lead in which genres, and why?
What distinguishes more successful publishers?
Why is publisher diversity higher in some genres than in
others?
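With many distinct publishers, a stacked bar chart filled by Publisher can become hard to read. One option is to keep only the top few publishers by total sales and lump the rest into an "Other" group before plotting. This is a sketch on made-up rows; `Publisher_Group` and the cut-off `n_top` are choices introduced here for illustration.

```r
# Made-up toy rows (NOT from vgsales.csv)
toy <- data.frame(
  Publisher    = c("P1", "P1", "P2", "P3", "P4", "P5"),
  Genre        = c("Action", "Sports", "Action", "Sports", "Action", "Puzzle"),
  Global_Sales = c(10.0, 8.0, 5.0, 1.0, 0.5, 0.2),
  stringsAsFactors = FALSE
)

n_top <- 2  # hypothetical cut-off; something like 10 may suit the real data

# Total sales per publisher, largest first
pub_totals <- tapply(toy$Global_Sales, toy$Publisher, sum)
pub_totals <- pub_totals[order(pub_totals, decreasing = TRUE)]
top_publishers <- names(pub_totals)[seq_len(min(n_top, length(pub_totals)))]

# Keep top publishers by name, lump everyone else together
toy$Publisher_Group <- ifelse(toy$Publisher %in% top_publishers,
                              toy$Publisher, "Other")

# One row per Genre/Publisher_Group pair, ready for a stacked bar chart
stacked <- aggregate(Global_Sales ~ Genre + Publisher_Group, data = toy, FUN = sum)
print(stacked)
```

Plotting `stacked` with `fill = Publisher_Group` keeps the legend short while still showing which leading publishers dominate each genre.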