Case Study
1 “Unlocking Insights: Can You Tell Me a Story with Data?” - Compiled by: Joshua Lizardi
Today, we will explore the intersection of data analysis and the ever-evolving video game industry.
As budding industry professionals, you are about to step into the shoes of a team of modern IT professionals tasked with tackling a series of real-world scenarios and challenges.
Throughout this case study, you will apply both statistical concepts and data analysis techniques using tools of your choice. But remember, while crunching numbers is essential, it’s equally crucial to interpret your findings correctly and draw meaningful insights that can guide decision-makers.
Each scenario presents a statistical challenge that mirrors real-life situations commonly faced in the industry.
By the end of this case study, you will have honed your statistical skills and, hopefully, gained a deeper understanding of how statistics and data-driven insights can drive success in business.
2 The Task: Statistics Case Study
The goal today is to analyze and draw insights from the dataset and present key findings from each question.
Your team must address each of the 20 questions and prepare a presentation of the key findings from the analysis. Key findings may consist of solutions, visuals, arguments, or anything else you find interesting about the results of each question. “Try to find the story the data is telling, and tell it to the rest of us!”
Your Roles:
- You will work in teams, simulating a cross-functional work environment, to analyze the dataset and address various business scenarios and questions.
- Teams will consist of members from different majors, such as Information Studies, Business Analytics, MBA, and Engineering Management, to provide diverse perspectives and expertise.
Presentations and Discussions:
- The top 3 teams will have the opportunity to present their findings, methodologies, and conclusions to the class.
- The 1st-place team will be chosen by peer vote. If that team’s findings are of good quality, they will be published on the RPubs website with credit given to each team member.
- After each presentation, there will be time for class discussions, questions, and peer evaluations.
Time Management: - You will have a limited time frame to complete your analysis, so effective time management and teamwork are crucial.
Simulation of a Work Environment: - The group formations and roles assigned aim to simulate a work-like environment where cross-functional teams collaborate on complex business problems. This should help prepare you for collaborative work in your future careers.
Tools and Techniques: - As you dive into the case study, you may find that the questions present a variety of challenges that call for tools and methods not easily handled in Microsoft Excel. You may use statistical software such as R or Python, as well as online calculators such as wolframalpha.com and statskingdom.com, for your analysis. Yes, you can use ChatGPT as well.
Feel free to choose the tools that best align with your preferences, but keep in mind the requirements of each scenario. Your choice should be based on the complexity of the analysis you need to perform and your ability to perform the task with that tool in a timely manner.
Remember, the goal today is not to learn the tools themselves. Mastery of a tool will matter at some point in your careers, but today the goal is your ability to interpret and communicate your findings effectively. Document your analysis, assumptions, and interpretations meticulously, as this is an essential part of any real-world data-driven investigation.
Although you are under time constraints, I encourage you to explore and experiment, as each of these tools can provide a different level of insight when applied to this case study. This will hopefully be a small window into the types of statistical challenges faced in real-world settings.
Be creative. Have fun! Cite your sources.
Support and Resources: - Your instructor is available for guidance and support throughout the case study.
3 A Note on Assumption Checking & Remedial Measures:
It’s important to remember that statistical assumptions should be checked during your analysis to ensure the validity of your results.
I understand that assumption checking and remedial measures may not have been covered extensively in our textbook. I want to highlight their importance: assumption checking is a fundamental aspect of statistical analysis, and it should be applied correctly in every analysis that uses statistical techniques. In fact, if it is not done, you are not doing statistics.
Assumptions are specific conditions that must be met for various statistical tests and models to produce valid and reliable results. They serve as the foundation of our analyses and help us make accurate inferences about the data. Violating assumptions can lead to incorrect conclusions. Checking to make sure our data follows the assumptions helps us understand the quality and reliability of our results.
Common assumptions and data conditions to check include, but are not limited to: normality, homoscedasticity, independence, the absence of influential outliers, and adequate sample size.
If your data violates any necessary assumptions, interpret the results cautiously. Consider whether the violation is severe and whether remedial measures are warranted.
Remedial measures can include alternative statistical methods, data transformations, and non-statistical methods.
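As one illustration of what an assumption check can look like, the sketch below flags potential outliers with the common 1.5 × IQR rule, using only Python's standard library. The function name `iqr_outliers`, the multiplier of 1.5, and the sample values are illustrative assumptions and are not taken from the case study data. The other assumptions listed above (normality, homoscedasticity, independence) need their own diagnostics.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical values, for illustration only (not from the datasets).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_outliers(sample))
```

A flagged value is a prompt to investigate, not an automatic reason to delete the observation. Record what you found and what you decided to do about it.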
Seek Guidance: If you encounter challenges with assumption checking and remedial measures, don’t hesitate to seek guidance from your instructor or consult additional resources.
Documentation: It’s crucial to document your assumption-checking process and any actions taken to address violations in your analysis report.
Remember that assumption checking is a skill that develops with practice. It’s okay to encounter challenges along the way. What’s important is that you’re aware of the concept, and you make a sincere effort to apply it to your statistical analyses.
If you have any questions or need further clarification on assumption checking as you progress through the case study, please feel free to reach out. I’m here to support your learning journey.
4 Accessing Data for the Gaming Industry Statistics Case Study
Data Access: - You can download the datasets below by right-clicking each link and opening it in a new tab:
Main Dataset 1 - Game Profit Data: - To access this data, please follow this link: Dataset 1 - Game Profit Data.
Dataset 2 - Game Playtime Data: - To access this data, please follow this link: Dataset 2 - Game Playtime Data.
Dataset 3 - Player Demographic Data: - To access this data, please follow this link: Dataset 3 - Player Demographic Data.
Again, right-click each link and open it in a new tab. If you encounter any issues with downloading or accessing the datasets, feel free to reach out for assistance.
Main Dataset Overview:
- The dataset contains data for games created by a video game publisher over a 15-year period.
- There are 50 observations (video games) and 13 features, including marketing spend and research and development spend.
- No missing values are present in the dataset.
Aggregated Variables:
- The features “Sales,” “Cost,” and “Profit” are aggregates of other variables in the dataset.
- Sales is calculated as the product of “Unit Price” and “Units Sold.”
- Cost is the sum of “R.D. Spend,” “Administration,” and “Marketing Spend.”
- Profit is calculated as Sales minus Cost.
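The definitions above can be turned into a quick consistency check before any analysis begins. The following is a minimal sketch: the column names are taken from the description above, but the exact headers in the downloaded file may differ, and the function name `check_aggregates` and the example row are hypothetical.

```python
import math

def check_aggregates(row, tol=1e-6):
    """Check one game's Sales, Cost and Profit against their definitions."""
    sales = row["Unit Price"] * row["Units Sold"]
    cost = row["R.D. Spend"] + row["Administration"] + row["Marketing Spend"]
    profit = sales - cost
    return (
        math.isclose(row["Sales"], sales, abs_tol=tol)
        and math.isclose(row["Cost"], cost, abs_tol=tol)
        and math.isclose(row["Profit"], profit, abs_tol=tol)
    )

# Hypothetical row, for illustration only (not a record from Dataset 1).
example = {
    "Unit Price": 40.0, "Units Sold": 10_000,
    "R.D. Spend": 150_000.0, "Administration": 50_000.0,
    "Marketing Spend": 100_000.0,
    "Sales": 400_000.0, "Cost": 300_000.0, "Profit": 100_000.0,
}
print(check_aggregates(example))
```

Because Sales, Cost, and Profit are exact functions of the other columns, keep this dependence in mind when using them together as predictors in a model.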
Data Collection Methods:
- Games were categorized into groups based on game type.
- Games with multiple expansions, parts, or related releases were either removed or averaged to create representative data.
- For games released on multiple platforms, the best-performing platform versions were selected, and others were removed.
- Games in the dataset were chosen randomly from the groups using a cluster sampling algorithm.
Game ID and ID Number:
- The “Game ID” column contains a unique identifier for each game.
- The “ID Number” column contains coded information about each game: multi-part games of the same series have similar IDs, and versions of a game on different platforms differ only slightly in their IDs.
Datasets 2 & 3: - The games selected above were also used to help create Datasets 2 and 3.
Gameplay data from players of the games selected above was collected. Players were selected using simple random sampling, and the data was anonymized and then aggregated by country and time of day to create Dataset 2. The same player data was also aggregated by demographic, player IGP use, and platform to create Dataset 3.
5 Case Study Questions & Scenarios
Introduction: Stepping into the well-lit offices of Olympus Interactive, you feel a mix of nerves and excitement. As the latest hires in a range of IT and data roles, you’re about to embark on a journey with a mid-sized yet iconic game publisher known for its trailblazing titles and industry innovations. You’ve barely had a moment to admire the wall of game posters when you’re ushered into a conference room. You recognize a few faces from industry events and news: the key decision-makers of Olympus Interactive. They’re deep in discussion, and as you pull out your laptop, you’re informed this is a critical strategy meeting. And so, on your very first day, eyes wide… you dive right in…
Scenario 1: The Indie Project Dilemma Background: Olympus Interactive, after dominating the AAA market, decides to fund indie projects. They’ve supported 36 indie titles so far. However, the finance team, upon reviewing the indie division’s figures, expresses concern regarding profit projections being potentially less than anticipated.
Question: What’s the probability that a random sample of 36 games averages less than $57,000 in profit?
1. Probability of Average Profit: - Given: Sample size (n) = 36, Target average profit (x̄) = $57,000 - Find the probability that a random sample of 36 games averages less than $57,000 in profit.
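One common way to approach a question like this is the Central Limit Theorem: for a reasonably large sample, the sample mean is approximately normal with mean μ and standard error σ/√n, so P(x̄ < x) ≈ Φ((x − μ)/(σ/√n)). The sketch below assumes that approach. The values of `mu` and `sigma` are placeholders only and must be replaced with the mean and standard deviation of profit estimated from Dataset 1; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_below(threshold, mu, sigma, n):
    """Approximate P(sample mean < threshold) using the normal distribution."""
    standard_error = sigma / sqrt(n)
    z = (threshold - mu) / standard_error
    return NormalDist().cdf(z)

# Placeholder inputs: mu and sigma are NOT taken from the dataset.
# With these numbers, z = (57000 - 60000) / (9000 / 6) = -2.
p = prob_sample_mean_below(threshold=57_000, mu=60_000, sigma=9_000, n=36)
print(round(p, 4))
```

Before relying on the result, check whether the normal approximation is reasonable for your data, for example by looking at the shape of the profit distribution and the sample size.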
Scenario 2: The Annual Report Background: At Olympus Interactive’s annual shareholder meeting, the CEO confidently claimed their games average at least $600,000 in profit. Some of the veteran data analysts have expressed skepticism, recalling instances where profits weren’t as stellar.
Task: Test the CEO’s claim against your suspicion at the 5% level of significance.
2. CEO’s Profit Claim: - Given: Sample size (n) = 50, Claimed average profit (μ) = $600,000, Significance level (α) = 0.05 - Test the CEO’s claim that the average profit of the company’s games is at least $600,000, against your suspicion, at the 5% level of significance.
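A claim of “at least $600,000” tested against the suspicion that profits are lower is usually set up as a left-tailed test: H0: μ ≥ 600,000 versus H1: μ < 600,000. The sketch below assumes a one-sample t-test and computes t = (x̄ − μ0)/(s/√n) with the standard library. The function names and the profit values are hypothetical, and the critical value must be looked up for n − 1 degrees of freedom (roughly 1.68 for 49 degrees of freedom at α = 0.05, one-tailed).

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    return (mean(sample) - mu0) / standard_error, n - 1

def reject_left_tailed(t_stat, t_critical):
    """Left-tailed decision: reject H0 when t is below -t_critical."""
    return t_stat < -abs(t_critical)

# Hypothetical profits, for illustration only (not from Dataset 1).
profits = [520_000, 610_000, 480_000, 590_000, 550_000, 630_000]
t_stat, df = one_sample_t(profits, mu0=600_000)
# 2.015 is the one-tailed 5% critical value for df = 5.
print(t_stat, df, reject_left_tailed(t_stat, t_critical=2.015))
```

The t-test assumes the profits are roughly normal or the sample is large enough for the sample mean to be approximately normal, so pair it with the assumption checks described earlier.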
Scenario 3: The IGP Revolution Background: A shift in Olympus Interactive’s monetization strategy sees a stronger emphasis on In-Game Purchases (IGP). After preliminary research suggesting 20% player engagement with IGP, the monetization team wonders how common high IGP engagement rates might be among smaller player groups.
Question: What’s the probability that for a random sample of 10 gamers, 6 will engage in IGP?
3. IGP Participation Probability: - In-game purchases (IGP) are a growing revenue stream for many video game companies. One survey showed that up to 20% of players take part in IGP. Assuming this survey’s results hold for our players, what is the probability that for a random sample of 10 gamers, 6 will take part in IGP?
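If each sampled gamer is treated as an independent trial with a 20% chance of taking part in IGP, the count of participants follows a binomial distribution with P(X = k) = C(n, k) p^k (1 − p)^(n − k). The sketch below assumes that model. Reading “6 will take part” as exactly 6 is an interpretation, so an “at least 6” helper is included as well; both function names are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

# n = 10 gamers and p = 0.20 come from the question above.
print(binomial_pmf(6, 10, 0.20))       # exactly 6 participants
print(binomial_at_least(6, 10, 0.20))  # 6 or more participants
```

State which interpretation you used when presenting the result, since the two probabilities differ.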
Scenario 4: The Nintendo Collaboration Background: Olympus Interactive announces a monumental partnership with Nintendo. Initial excitement soon gives way to industry whispers questioning the profitability of the collaboration’s initial game releases.
Question: What’s the probability that profit from a random sample of 3 of these Nintendo games averages less than $200,000?
4. Nintendo Games: - Find the probability that profit from a random sample of 3 of these games averages less than $200,000. What does this suggest about the claim made?
Scenario 5: Budget Allocations Background: A leak from Olympus Interactive’s financial department hints at disproportionate budgets allocated to Research & Development, Marketing, and Administration. The board demands clarity, fearing potential resource mismanagement.
Task: Check if the budget claims made by the board can be backed by the data.
5. Operations Manager’s Budget Claim: - How can we check the claim that Research & Development, Marketing, and Administration have roughly the same budget on every project?
Scenario 6: Marketing’s Gambit Background: A bold marketing campaign for Olympus Interactive’s upcoming game raises eyebrows. Internal debates question the hefty budget, which seems comparable to the game’s actual Research & Development costs.
Question: How can you validate the claim about the company’s expenditure?
6. Marketing vs. Research & Development: - How can we check the claim that on average, the company spends the same amount on marketing as on Research & Development per game?
Scenario 7: The MOBA Era Background: The golden era of MOBA games sees Olympus Interactive releasing multiple titles pre-2012. However, post-2012 titles haven’t garnered as much attention, leading to speculations about their comparative success.
Question: How could you verify the claim about the sales of MOBA games?
7. Sales of MOBA Games: - How could we check the claim that the average sales of MOBA & RTS games before 2012 were significantly higher than sales of MOBA & RTS games after 2012?
Scenario 8: The CFO’s Vision Background: Olympus Interactive’s CFO unveils a new strategy, banking heavily on IGP as the future sales driver. The data analytics team is brought on board to either validate or question this vision.
Question: How would you validate the CFO’s claim about IGPs driving sales?
8. CFO’s IGP Claim: - How could we check the CFO’s claim that IGP is a main driver of sales among all games? What statistical methods could be used?
Scenario 9: The Global Gaming Pattern Background: As Olympus Interactive expands globally, user data reveals varied playtime patterns. An ambitious analyst suggests that playtime remains consistent across time zones, but this theory is up for debate.
Question: How would you test the analyst’s claim about playtime?
9. Country and Time Effects on Playtime: - A new analyst is convinced that time of day affects playtime regardless of country. Test whether country and time of day have an effect on playtime. Is the analyst’s claim justified? Explain.
Scenario 10: The Playtime Myth Background: Olympus Interactive’s new mobile RTS game has been a hit in the US market. During a press release, the Operations Manager offhandedly comments about American players not typically playing marathon sessions. The data analytics department decides to delve into playtime statistics to verify this.
Question: Can you reject the Operations Manager’s claim about the median playtime in the US given your data?
10. Median Playtime in the US: - The Operations Manager claimed that the median number of minutes played in the US can’t be more than 750 minutes. Can you reject the claim given the data?
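A claim about a median can be tested without assuming normality. One option is the sign test, sketched below: under H0: median ≤ 750 the number of observations above 750 behaves like a Binomial(n, 0.5) count at the boundary, so a large count of values above 750 is evidence for H1: median > 750. The choice of the sign test, the function name, and the playtime values are assumptions for illustration; a Wilcoxon signed-rank test is another candidate if its symmetry assumption is plausible.

```python
from math import comb

def sign_test_upper(values, m0):
    """Upper-tailed sign test of H0: median <= m0 against H1: median > m0."""
    above = sum(v > m0 for v in values)
    below = sum(v < m0 for v in values)
    n = above + below  # observations equal to m0 are dropped
    # P(X >= above) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(above, n + 1)) / 2**n
    return above, n, p_value

# Hypothetical US playtimes in minutes (not from Dataset 2).
playtimes = [700, 760, 770, 780, 790, 800, 810, 820]
above, n, p_value = sign_test_upper(playtimes, m0=750)
print(above, n, p_value)
```

Reject the claim at the 5% level only if the p-value computed from the real US playtime data falls below 0.05.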
Scenario 11: Quality Over Quantity Debate Background: In a strategy meeting at Olympus Interactive, the Production Head emphasizes producing fewer but higher-quality games. He argues that higher IGN ratings, indicating better quality, lead to better sales. This statement becomes a hot topic, demanding a data-backed response.
Question: Can you confirm the Production Head’s assertion about the correlation between IGN ratings and sales?
11. Game Quality and Sales: - The Production department claims that games with higher IGN ratings have significantly higher sales. Investigate this claim.
Scenario 12: The Genre Bias Inquiry Background: After their latest RPG received a lower-than-expected IGN rating, Olympus Interactive’s team raises concerns about potential biases in the rating system favoring certain genres.
Question: How would you investigate the team’s concerns about IGN’s rating system bias towards certain genres?
12. Game Genre Preferences: - The Marketing department believes that the IGN rating system has preferences for game genres. Investigate this claim.
Scenario 13: Price Drop Speculations Background: With the holiday season approaching, Olympus Interactive’s pricing team speculates that a small discount on premium games could lead to a substantial boost in sales. Before finalizing this, they need a detailed statistical analysis.
Question: How would you determine if a $10 price drop for games priced $20 and above will significantly increase units sold?
13. Game Price Elasticity: - The Pricing team wants to determine whether reducing the price of every game priced at $20 or more by $10 will result in a significant increase in the number of units sold. Come up with a way to test this at a 5% significance level.
Scenario 14: Cross-Platform Player Preferences Background: Olympus Interactive is set to release its first cross-platform game. The marketing department wants to get ahead of the curve by understanding player preferences for gaming platforms to better target their promotions.
Task: Outline an approach for this survey, emphasizing data collection methods and potential biases.
14. Platform Preference: - Write out the steps and methods you would use to conduct a survey among gamers to determine their preferred gaming platform (e.g., PC, console, mobile) and whether their choice is influenced by factors like graphics quality, game availability, or price. How would you collect results? What questions would you ask? Explain the effect that data collection methods and question types have on data quality and the overall analysis.
Scenario 15: The Development Time Conundrum Background: A controversial article claims that Olympus Interactive’s best-selling games are those with shorter development times. This sparks internal discussions, leading to a comprehensive analysis to discern any potential correlation.
Question: Can you find any correlation between game development time and its sales?
15. Game Development Time: - Investigate if there’s a correlation between the time it takes to develop a game and its eventual sales figures.
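A natural starting point for a question like this is the Pearson correlation coefficient, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). The sketch below computes it with the standard library. The function name and the development-time and sales values are hypothetical, and Pearson’s r only measures linear association, so a rank-based measure such as Spearman’s may be a better fit if the relationship is monotonic but not linear.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values, for illustration only (not from the datasets).
dev_months = [12, 18, 24, 30, 36]
sales = [900_000, 750_000, 800_000, 600_000, 650_000]
print(pearson_r(dev_months, sales))
```

A scatter plot alongside the coefficient helps show whether a single number summarizes the relationship fairly, and a strong correlation on its own does not show that development time causes sales to change.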
Scenario 16: Engagement Equals Spending? Background: Olympus Interactive’s monetization team observes a potential trend: Players who spend more time in game seem less inclined to make in-game purchases. They task the data analytics team with validating this observation.
Question: Are new players more likely to make in-game purchases?
16. Player Engagement and In-Game Purchases: - Examine whether players who spend more time in a game are more likely to make in-game purchases. Analyze player engagement (playtime) and IGP data to assess this relationship.
Scenario 17: Dream RPG Project Background: After the success of its previous RPGs, Olympus Interactive aims to create the ultimate RPG. Preliminary budget discussions begin, but the finance team needs a more concrete estimate to allocate resources.
Question: Can you provide a statistically backed estimate on the cost of developing a high-rated RPG game?
17. Minimum Cost of High-Performance Product: - The Technical team wants to estimate the minimum cost of making a highly rated RPG game. Provide a statistically valid estimate.
Scenario 18: The IGP Rating Allegation Background: The mood at Olympus Interactive is a blend of tension and curiosity. A recently retired and respected game designer, during a casual media interview, dropped a potential bombshell, suggesting that IGN’s game ratings might be influenced by the inclusion of in-game purchases (IGP). The gaming community latched onto this hint, leading to heated debates and speculations across various platforms.
In a bid to clarify the issue and safeguard Olympus Interactive’s reputation for shareholders, the management quickly commissions an internal review. This pivotal task is assigned to a recently on-boarded product analyst known for his investigative prowess.
Weeks into the analysis, during a preliminary presentation, the product analyst makes a shocking claim. He alleges that IGN Entertainment Inc. indeed has a bias for games with IGPs, skewing the metrics used by Olympus Interactive to measure game performance. The analyst passionately argues, “Our direction could be drastically affected by this bias. We might end up designing games purely for higher ratings, compromising on profit and customer value. Just incorporating IGP could artificially boost our IGN ratings.”
The room, filled with seasoned professionals, is in a state of shock. They await the concrete data and analysis that could substantiate such a groundbreaking claim. But, in a twist of fate, as the young analyst tries to pull up his meticulously prepared presentation, his laptop crashes. It soon becomes clear that, in his rookie oversight, he hadn’t established a backup system for his work. The crucial evidence that could validate his claim is now lost.
Doubt and suspicion fill the room. While some sympathize with the young analyst’s unfortunate mishap, others question the authenticity of his claims, given the lack of concrete evidence.
The company is now left in a delicate position, balancing on the edge of a potential PR crisis with only the word of a greenhorn analyst to guide them.
Question: Is there any evidence of such bias?
18. IGP & IGN Rating Bias: - Is there any significant association between IGP and IGN Ratings?
Scenario 19: The Sales-Predicting Formula Background: As the evening draws near, the founder of Olympus Interactive steps up. Reflecting on the company’s journey, she wonders aloud if there’s a formula — a secret sauce — linking game development costs, marketing, distribution budgets, and the sales they achieve. This becomes your next big assignment.
Question: The founder wants to understand the relationship between sales and the amount spent to develop, market, and distribute a game. She suspects the sales of new games can be predicted from the amount of time, money, and effort spent on the game, regardless of game type or console. Can we confirm or refute her suspicions given our data?
19. Impact of Department Budgets: - Analyze and confirm or debunk the founder’s belief regarding the relationship between expenditure and sales.
Scenario 20: Targeting the Right Gamers Background: Wrapping up the session, the marketing head presents a new campaign tailored for in-game purchase promotions. Before they roll it out, they want to ensure it targets the right demographics. They believe that factors like age, gender, and location might influence in-game purchase decisions, but they need data to back it up.
Question: Do demographics influence in-game purchase behavior?
20. Player Demographics and IGP: - Investigate whether player demographics, such as age, gender, or location, influence participation in in-game purchases.
Conclusion:
As the last slide dims and the projector whirs down, the weight of the day’s discussions and your crucial role at Olympus Interactive settles in. Every insight you have provided could potentially guide the next big decision or game launch. … the thrill of shaping experiences for millions worldwide.