Dataset Selection

Board Games

This analysis examines data from my board game collection to identify specific trends. It will help with selecting board games for guests and alert me to any oversights in my categorizations. It should also offer insight that could help broaden my board gaming horizons.

Dataset

I created a board game catalog that gives data on many aspects of the board games I own, from player count, to cost, to mechanics.

Variables

Title: Title of the board game, expansion or accessory

Type: Base Game or Expansion, where accessories count as expansions

ExpansionFor: Lists the base game of an expansion

MinPlayers and MaxPlayers: Recommended number of players

AvgPlayTime: Calculated average playtime in minutes

Genre: Broad board game genre

Mechanics: Primary mechanic(s) of the board game

Publisher: Primary publisher of the board game

Complexity: Calculated numerical complexity on a 1 to 10 scale

Cost: Average retail cost for the board game in USD (the CostUSD column in the code)

3D Plotly: Average Play Time, Complexity, Cost

3D Scatter Analysis

We begin here because all three of these variables are numerical and matter when deciding on board game purchases.
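As a rough sketch of how a 3D scatterplot of this kind can be built with plotly, the example below uses a small made-up data frame, `bg_demo`, in place of the real catalog. The column names follow the regression code in this analysis (`AvgPlayTime`, `Complexity`, `CostUSD`), and coloring the markers by `Genre` is an assumption about how the original plot was drawn.

```r
library(plotly)

# Hypothetical stand-in for the catalog: four invented games, one per row.
bg_demo <- data.frame(
  Title       = c("Game A", "Game B", "Game C", "Game D"),
  Genre       = c("Family", "Party", "Strategy", "RPG"),
  AvgPlayTime = c(30, 20, 90, 150),  # minutes
  Complexity  = c(2, 2, 5, 7),       # 1 to 10 scale
  CostUSD     = c(20, 15, 60, 100)
)

# One marker per game, colored by genre, with the title as hover text.
fig <- plot_ly(
  bg_demo,
  x = ~AvgPlayTime, y = ~Complexity, z = ~CostUSD,
  color = ~Genre, text = ~Title,
  type = "scatter3d", mode = "markers"
)

# Label the three axes of the 3D scene.
fig <- layout(fig, scene = list(
  xaxis = list(title = "Avg play time (min)"),
  yaxis = list(title = "Complexity"),
  zaxis = list(title = "Cost (USD)")
))
fig
```

Swapping `bg_demo` for the real `bg` data frame should give the full plot, provided the column names match.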

Genre relations: Games of the same genre seem to cluster together more tightly at lower complexity.

RPG Genre: This genre seems to have the widest range of cost, complexity and playtime combinations. It also seems to have the largest variance and no apparent linear trend. This may be a result of not owning many games in this genre.

A clear regression: There appears to be a general linear relation between cost, playtime and complexity. While cost and playtime could be read from the box and the reseller, complexity was calculated from a combination of factors such as cost, playtime, publisher and genre. Such a clean linear relation is consistent with our complexity calculations, although part of it is expected because cost and playtime were inputs to the complexity score.

### The following code fits a linear regression of cost on complexity
### and play time, then extracts the coefficient table to test the
### relation described above.
reg1 = lm(CostUSD ~ Complexity + AvgPlayTime, data = bg)
summary(reg1)["coefficients"]   # estimates, std. errors, t-values and p-values

3D Scatter Analysis Cont.

## $coefficients
##               Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) -7.4654663 2.19639205 -3.398968 8.041711e-04
## Complexity  11.0433585 0.80836212 13.661400 4.168205e-31
## AvgPlayTime  0.1441576 0.03661674  3.936931 1.110359e-04

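All three coefficients have very small p-values (below 0.001) and large t-values, so both complexity and play time are statistically significant predictors of cost. Holding the other variable fixed, each additional complexity point is associated with about $11 more in cost, and each additional minute of play time with about $0.14. Significance alone does not measure how well the model fits, but this is a reasonable model to start from in future cost, complexity and time analyses.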
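To make the fitted model concrete, the coefficients from the regression output above can be plugged into the linear equation by hand. The sketch below does this in base R. `predict_cost` is a hypothetical helper name, and the input values (complexity 5, 73 minutes) are chosen only for illustration.

```r
# Coefficients copied from the regression output above.
b0     <- -7.4654663  # intercept (USD)
b_cplx <- 11.0433585  # USD per complexity point
b_time <-  0.1441576  # USD per minute of average play time

# Hypothetical helper: estimated cost for a given complexity and play time.
predict_cost <- function(complexity, avg_play_time) {
  b0 + b_cplx * complexity + b_time * avg_play_time
}

# A game with complexity 5 and a 73-minute average play time.
est <- predict_cost(5, 73)
round(est, 2)  # 58.27
```

With the fitted model object available, `predict(reg1, newdata = data.frame(Complexity = 5, AvgPlayTime = 73))` should return the same estimate without copying the coefficients.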

Strategy Genre: The strategy genre appears to be the largest genre we own, but the scatterplot may be misleading if many family and party games overlap in the small region of low cost, complexity and playtime.

Plotly Histogram

To check whether the previous scatterplot is misleading, we can use a histogram of genres to compare the number of games in each.
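A minimal sketch of a genre count and the matching plotly chart is shown below. It uses a made-up `bg_demo` data frame, so the counts are illustrative only and are not the real catalog figures.

```r
library(dplyr)
library(plotly)

# Hypothetical stand-in for the catalog: only the Genre column is needed.
bg_demo <- data.frame(
  Genre = c("Strategy", "Strategy", "Strategy",
            "Family", "Family", "Party", "Card Game")
)

# Tabulate the number of games per genre, largest first.
genre_counts <- bg_demo |>
  count(Genre, sort = TRUE)
genre_counts

# plotly counts the rows per category when given a categorical x.
fig <- plot_ly(bg_demo, x = ~Genre, type = "histogram")
fig
```

Printing `genre_counts` next to the chart makes it easy to spot a genre with only one or two entries.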

Histogram Notes

This confirms that I do, in fact, own a majority of Strategy games. It also shows that, somewhere along the line, we stopped categorizing games as card games and started assigning them to other genres. We may have moved “card game” from a genre to a mechanic. To account for this possibility, I removed the single Card Game entry from later data analyses with dplyr:

library(dplyr)   # provides filter()

## bg is the current data frame, so we filter out
## the single Card Game entry by genre name
bg = bg |>
  filter(Genre != "Card Game")

Note: The histogram was originally a pie chart, but was changed to a bar graph to better illustrate the extraneous Card Game genre.

ggplot Boxplot

We usually choose games based on our guests and their comfort level, so the following plot looks for trends in those terms.
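A boxplot of this kind can be sketched with ggplot2 as follows. Plotting `Complexity` against `MaxPlayers` is an assumption about which variables the original plot used, and `bg_demo` is a made-up data frame for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the catalog.
bg_demo <- data.frame(
  MaxPlayers = c(2, 2, 4, 4, 4, 6, 6, 8),
  Complexity = c(6, 7, 4, 5, 3, 3, 2, 2)  # 1 to 10 scale
)

# Treat player count as a category so each count gets its own box.
p <- ggplot(bg_demo, aes(x = factor(MaxPlayers), y = Complexity)) +
  geom_boxplot() +
  labs(x = "Maximum players", y = "Complexity (1 to 10)")
p
```

Converting the player count with `factor()` matters here: left as a number, ggplot2 would treat it as continuous and draw a single box.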

ggplot Bar Chart

This time, let’s see if there is a similar correlation between the minimum players and the play time, a variable closely related to complexity.
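One way to draw such a bar chart is to let ggplot2 compute the mean play time for each minimum player count. The sketch below assumes the bars show means (the original chart may have aggregated differently) and uses a made-up `bg_demo` data frame.

```r
library(ggplot2)

# Hypothetical stand-in for the catalog.
bg_demo <- data.frame(
  MinPlayers  = c(1, 1, 2, 2, 2, 3, 4),
  AvgPlayTime = c(120, 90, 45, 60, 30, 100, 25)  # minutes
)

# stat_summary() collapses each group to its mean before drawing the bar.
p <- ggplot(bg_demo, aes(x = factor(MinPlayers), y = AvgPlayTime)) +
  stat_summary(fun = mean, geom = "bar") +
  labs(x = "Minimum players", y = "Mean play time (min)")
p
```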

Player Counts, Complexity and Play Time

Comparing the two previous graphs, we can generalize that a lower recommended player count usually goes with higher complexity. Games that offer a 1-player mode tend to be more complex, which explains why a lower player count comes with a longer play time and higher complexity. It is also commonly held that 3-player games are relatively long and complex because of the uneven player count and added decision making. Both graphs support these observations.

General Statistics

I wanted to calculate some simple averages for a final statistical analysis for general conclusions.

## # A tibble: 5 × 6
##   Genre    Count avgCost avgComplexity avgTime sdComplexity
##   <chr>    <int>   <dbl>         <dbl>   <dbl>        <dbl>
## 1 Abstract    10    27             2.6    29.5        0.516
## 2 Family      58    22.2           2      26.6        0.558
## 3 Party       28    19.6           2.2    23.9        0.431
## 4 RPG         10    98             6.7   134.         1.49 
## 5 Strategy   114    55.9           5      73          1.41
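A per-genre summary with these column names could be produced with dplyr along the following lines. This is a sketch on a made-up `bg_demo` data frame rather than the original code, and any rounding applied to the table above is not reproduced.

```r
library(dplyr)

# Hypothetical stand-in for the catalog.
bg_demo <- data.frame(
  Genre       = c("Family", "Family", "Strategy", "Strategy", "Strategy"),
  CostUSD     = c(20, 24, 50, 60, 70),
  Complexity  = c(2, 2, 4, 5, 6),
  AvgPlayTime = c(25, 35, 60, 75, 90)
)

# One row per genre: game count, mean cost, mean complexity,
# mean play time, and standard deviation of complexity.
genre_stats <- bg_demo |>
  group_by(Genre) |>
  summarise(
    Count         = n(),
    avgCost       = mean(CostUSD),
    avgComplexity = mean(Complexity),
    avgTime       = mean(AvgPlayTime),
    sdComplexity  = sd(Complexity)
  )
genre_stats
```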

Reliability: Given the number of party games (28) and their small standard deviation (0.431), the complexity ratings for that genre appear relatively consistent. By contrast, the complexity ratings for RPGs are less likely to be reliable, because the genre combines the largest standard deviation (1.49) with a small sample of 10 games. This agrees with our qualitative analysis of the cost, complexity and play time scatterplot.
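One way to put a number on this comparison is the standard error of the mean, sd / sqrt(n), which shrinks as the sample grows. The sketch below applies it to the counts and standard deviations from the summary table. Using the standard error here is an added illustration rather than part of the original analysis, and it describes the uncertainty of each genre's average complexity, not the accuracy of any individual rating.

```r
# Counts and complexity standard deviations from the summary table.
genre <- c("Party", "RPG", "Strategy")
n     <- c(28, 10, 114)
sd_c  <- c(0.431, 1.49, 1.41)

# Standard error of the mean complexity for each genre.
se <- sd_c / sqrt(n)

data.frame(genre, n, sd_c, se = round(se, 3))
# se: Party 0.081, RPG 0.471, Strategy 0.132
```

The RPG average is the least certain of the three by a wide margin, while Strategy, despite a similar standard deviation, is pinned down much more tightly by its 114 games.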

Conclusions

1. There is a positive linear relationship between the cost of a board game and both its complexity and its play time.

2. Board game genres tend to cluster together in terms of their complexities and play times, and these clusters become more compact as complexity and play time decrease. Strategy board games possibly have the widest cluster.

3. The lower a game's maximum number of players, the more complex it is likely to be. RPGs tend to be the most complex, but some strategy games are particularly complex and rate higher than the average RPG. The complexities of Abstract, Family and Party games are relatively similar.

4. The theory that board games with lower player counts are more complex and have longer play times generally holds. In the case of 3-player games, we may also conclude that they take more time to play, probably because they require more decisions and have an odd number of players.

Future Studies

Things to Rework: In the future, more meaningful results may be possible if the genres were less broad. Expanding to ten consistently used genres would probably give more insight into whether clustering is an actual phenomenon, and would probably provide less skewed data for RPGs and strategy games. The mechanics, on the other hand, would probably benefit from being consolidated into broader categories and limited to a primary and a secondary mechanic per game. We did not analyze mechanics in this study because there were too many distinct values.