Introduction

When I was first introduced to R, it was through data on animals - penguins, to be specific. Now, since I am a big fan of fantasy role-playing games, I want to perform a similar kind of ecological analysis, only on the fictional monsters from Dungeons and Dragons. The data for this project comes from Kaggle, in the form of a .csv file at https://www.kaggle.com/datasets/mrpantherson/dnd-5e-monsters?resource=download

dnd_monsters <- read.csv("/Users/MicahIsser/downloads/dnd_monsters.csv")
head(dnd_monsters)
##              name                                                  url  cr
## 1       aarakocra https://www.aidedd.org/dnd/monstres.php?vo=aarakocra 1/4
## 2         abjurer                                                        9
## 3         aboleth   https://www.aidedd.org/dnd/monstres.php?vo=aboleth  10
## 4 abominable-yeti                                                        9
## 5        acererak                                                       23
## 6         acolyte   https://www.aidedd.org/dnd/monstres.php?vo=acolyte 1/4
##                   type   size ac  hp speed         align legendary
## 1 humanoid (aarakocra) Medium 12  13   fly  neutral good          
## 2  humanoid (any race) Medium 12  84       any alignment          
## 3           aberration  Large 17 135  swim   lawful evil Legendary
## 4          monstrosity   Huge 15 137        chaotic evil          
## 5               undead Medium 21 285        neutral evil          
## 6  humanoid (any race) Medium 10   9       any alignment          
##                              source str dex con int wis cha
## 1               Monster Manual (BR)  10  14  10  11  12  11
## 2          Volo's Guide to Monsters  NA  NA  NA  NA  NA  NA
## 3              Monster Manual (SRD)  21   9  15  18  15  18
## 4                    Monster Manual  NA  NA  NA  NA  NA  NA
## 5 Adventures (Tomb of Annihilation)  NA  NA  NA  NA  NA  NA
## 6              Monster Manual (SRD)  10  10  10  10  14  11

Beginning Exploratory Data Analysis

I can see that there are a number of columns for different qualities of each monster, including size, alignment, challenge rating (cr), armor class (ac), and various attributes. I wonder which of these qualities follow a normal (bell-shaped) distribution, and which are skewed in one direction or another. For example, I hypothesize that there may be more monsters with low cr than high, since there are more beginning characters than high-level ones.
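As a minimal sketch of how a categorical column such as size can be tabulated, the snippet below uses only the six rows printed by head() above instead of the full file. The object names `monsters` and `size_levels` are chosen here for illustration, and the smallest-to-largest ordering of the six size categories is an assumption based on the usual D&D size scale.

```r
# Toy subset: the six rows shown by head(dnd_monsters) above.
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth",
           "abominable-yeti", "acererak", "acolyte"),
  size = c("Medium", "Medium", "Large", "Huge", "Medium", "Medium"),
  stringsAsFactors = FALSE
)

# Assumed ordering of the size categories, smallest to largest.
size_levels <- c("Tiny", "Small", "Medium", "Large", "Huge", "Gargantuan")
monsters$size <- factor(monsters$size, levels = size_levels, ordered = TRUE)

# Count monsters per size. Categories with no monsters are kept with a
# count of zero, so gaps in the distribution stay visible.
size_counts <- table(monsters$size)
print(size_counts)

# Bar chart of the counts, in size order rather than alphabetical order.
barplot(size_counts, xlab = "Size", ylab = "Number of monsters")
```

Declaring the levels explicitly keeps the bars in size order. Without it, R would sort the categories alphabetically.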

So the majority of the monsters are medium size, followed by large and then huge. This makes sense, since there are many humanoid creatures such as orcs, goblins, or bugbears, who would be roughly equal in stature to adventurers. Now let’s take a look at the distribution of moral alignments among the monsters.

This provides some interesting information: there are far more evil monsters than good ones, and chaotic evil monsters are the most common ones of all. It also appears that there are a number of monsters (over 175) whose alignment is listed as NA. I wonder if this is an accident, or if it is intentional.

##  [1] name      url       cr        type      size      ac        hp       
##  [8] speed     align     legendary source    str       dex       con      
## [15] int       wis       cha      
## <0 rows> (or 0-length row.names)

Based on this list, and on the entries in this pdf copy of the monster manual - https://dn790005.ca.archive.org/0/items/dungeon-masters-guide/Monster%20Manual.pdf - it seems that there are a number of creatures that are listed as ‘unaligned.’ Most of these, like the allosaurus or badger, are animals who are not capable of speaking, and therefore do not have a clear moral disposition.
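One possible explanation for an NA group, offered only as an assumption and not as what happened in this analysis, is that the alignment strings were converted to a factor whose level list did not include every value in the data. The sketch below uses alignment values that appear in this report. The level list `nine_alignments` is hypothetical.

```r
# Alignment strings that appear in the output shown in this report.
align <- c("neutral good", "any alignment", "lawful evil",
           "chaotic evil", "unaligned", "unaligned")

# In the raw strings nothing is missing.
n_missing_raw <- sum(is.na(align))

# Hypothetical level list covering only the nine classic alignments.
nine_alignments <- c("lawful good", "neutral good", "chaotic good",
                     "lawful neutral", "neutral", "chaotic neutral",
                     "lawful evil", "neutral evil", "chaotic evil")
align_factor <- factor(align, levels = nine_alignments)

# Any value outside the level list is silently converted to NA.
print(table(align_factor, useNA = "ifany"))

# Recover which original strings were turned into NA.
dropped <- unique(align[is.na(align_factor)])
print(dropped)
```

If this is the cause, adding the dropped strings to the level list would give them their own bars instead of an NA bar.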

There also seem to be monster subtypes. Let’s look at the distribution of monster types to see which is the most common.

So the most common types of monsters are beasts, monstrosities, and humanoids, although the ‘fiend’ class may appear smaller than it actually is because it is divided between the chaotic evil demons and the lawful evil devils.

Now I’m curious to learn more about how difficult these monsters are to fight, by plotting their challenge rating (cr).
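Challenge ratings such as 1/4 are stored as text, so counting them in a sensible order needs an explicit level list. The sketch below is one way to do that, using the six cr values printed by head() above. The level list `cr_levels` is an assumption chosen to match the 33 ordered levels that begin "1/8" < "1/4" < "1/2" in the str() output later in this report.

```r
# The cr values of the six monsters printed by head() above.
cr <- c("1/4", "9", "10", "9", "23", "1/4")

# Assumed level list: three fractional ratings, then 1 through 30.
cr_levels <- c("1/8", "1/4", "1/2", as.character(1:30))

# An ordered factor sorts "9" before "10", unlike plain text sorting.
cr_ordered <- factor(cr, levels = cr_levels, ordered = TRUE)

# Count monsters per rating, dropping ratings that do not occur here.
cr_counts <- table(droplevels(cr_ordered))
print(cr_counts)
```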

Multivariable Correlations

As I suspected, the graph of challenge ratings heavily skews towards easier monsters, with the median being at cr 2 (equal to a level 2 adventurer). Now I would like to begin finding relationships between the values of multiple variables. Unfortunately, several of the columns are stored as character strings rather than numbers, so I need to convert them before they can be used in calculations.

## Warning: NAs introduced by coercion
## 'data.frame':    762 obs. of  20 variables:
##  $ name        : chr  "aarakocra" "abjurer" "aboleth" "abominable-yeti" ...
##  $ url         : chr  "https://www.aidedd.org/dnd/monstres.php?vo=aarakocra" "" "https://www.aidedd.org/dnd/monstres.php?vo=aboleth" "" ...
##  $ cr          : Ord.factor w/ 33 levels "1/8"<"1/4"<"1/2"<..: 2 12 13 12 26 2 17 20 19 16 ...
##  $ type        : Factor w/ 14 levels "beast","monstrosity",..: NA 3 8 2 5 3 4 5 4 4 ...
##  $ size        : Factor w/ 6 levels "Tiny","Small",..: 3 3 4 5 3 3 5 5 5 5 ...
##  $ ac          : int  12 12 17 15 21 10 19 19 19 18 ...
##  $ hp          : int  13 84 135 137 285 9 195 225 225 172 ...
##  $ speed       : chr  "fly" "" "swim" "" ...
##  $ align       : chr  "neutral good" "any alignment" "lawful evil" "chaotic evil" ...
##  $ legendary   : chr  "" "" "Legendary" "" ...
##  $ source      : chr  "Monster Manual (BR)" "Volo's Guide to Monsters" "Monster Manual (SRD)" "Monster Manual" ...
##  $ str         : num  10 NA 21 NA NA 10 23 NA 25 23 ...
##  $ dex         : num  14 NA 9 NA NA 10 14 NA 10 10 ...
##  $ con         : num  10 NA 15 NA NA 10 21 NA 23 21 ...
##  $ int         : num  11 NA 18 NA NA 10 14 NA 16 14 ...
##  $ wis         : num  12 NA 15 NA NA 14 13 NA 15 13 ...
##  $ cha         : num  11 NA 18 NA NA 11 17 NA 19 17 ...
##  $ cr_numeric  : num  0.25 9 10 9 23 0.25 14 17 16 13 ...
##  $ size_numeric: num  3 3 4 5 3 3 5 5 5 5 ...
##  $ size_factor : Factor w/ 6 levels "Tiny","Small",..: 3 3 4 5 3 3 5 5 5 5 ...
## [1] 0.5451912
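A conversion and correlation like the one above could be sketched as follows. This is not necessarily the code that produced the output. The helper `cr_to_numeric` and the vector `size_levels` are hypothetical names, the eleven rows are the monsters that appear by name in this report, and Pearson correlation (the default of cor()) is assumed.

```r
# Convert challenge ratings such as "1/4" or "9" to numbers.
# Plain as.numeric("1/4") would return NA with a coercion warning.
cr_to_numeric <- function(cr) {
  vapply(strsplit(cr, "/", fixed = TRUE), function(parts) {
    nums <- as.numeric(parts)
    if (length(nums) == 2) nums[1] / nums[2] else nums[1]
  }, numeric(1))
}

# Monsters named in this report, with their cr and size as printed.
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth", "abominable-yeti",
           "acererak", "acolyte", "demilich", "brontosaurus",
           "zariel", "sibriex", "solar"),
  cr   = c("1/4", "9", "10", "9", "23", "1/4", "18", "5", "26", "18", "21"),
  size = c("Medium", "Medium", "Large", "Huge", "Medium", "Medium",
           "Tiny", "Gargantuan", "Large", "Huge", "Large"),
  stringsAsFactors = FALSE
)

# Map sizes to 1 (Tiny) through 6 (Gargantuan) via the factor level order.
size_levels <- c("Tiny", "Small", "Medium", "Large", "Huge", "Gargantuan")
monsters$cr_numeric   <- cr_to_numeric(monsters$cr)
monsters$size_numeric <- as.numeric(factor(monsters$size, levels = size_levels))

# Pearson correlation, ignoring any rows with missing values.
r <- cor(monsters$cr_numeric, monsters$size_numeric, use = "complete.obs")
print(r)
```

Because size is an ordinal scale, not a measured quantity, passing method = "spearman" to cor() is a reasonable alternative to compare against.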

The correlation coefficient for the relationship between challenge rating and size is about 0.55, which indicates a moderate positive relationship. But now let’s look at that relationship graphically, so we can better see how bigger monsters tend to be tougher.

This graph shows how most of the tiny monsters have a low CR, while most of the gargantuan monsters have a higher CR. But it makes me wonder about the outliers. Let’s print just the row with the maximum cr_numeric among the monsters whose size_numeric is 1 (Tiny) and, conversely, the row with the minimum cr_numeric among those whose size_numeric is 6 (Gargantuan).

## [1] "Row with size_numeric = 1 and maximum cr_numeric:"
##         name url cr   type size ac hp speed        align legendary
## 169 demilich     18 undead Tiny 20 80   fly neutral evil Legendary
##             source str dex con int wis cha cr_numeric size_numeric size_factor
## 169 Monster Manual  NA  NA  NA  NA  NA  NA         18            1        Tiny
## [1] "Row with size_numeric = 6 and minimum cr_numeric:"
##             name url cr  type       size ac  hp speed     align legendary
## 106 brontosaurus      5 beast Gargantuan 15 121       unaligned          
##                                source str dex con int wis cha cr_numeric
## 106 Adventures (Tomb of Annihilation)  NA  NA  NA  NA  NA  NA          5
##     size_numeric size_factor
## 106            6  Gargantuan
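A lookup like the one above can be sketched in base R with which.max() and which.min(). The helper `extreme_by_size` is a hypothetical name, and the rows are the monsters named in this report, not the full data set.

```r
# Monsters named in this report, with numeric size (1 = Tiny, 6 = Gargantuan).
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth", "abominable-yeti",
           "acererak", "acolyte", "demilich", "brontosaurus",
           "zariel", "sibriex", "solar"),
  cr_numeric   = c(0.25, 9, 10, 9, 23, 0.25, 18, 5, 26, 18, 21),
  size_numeric = c(3, 3, 4, 5, 3, 3, 1, 6, 4, 5, 4),
  stringsAsFactors = FALSE
)

# Return the row with the extreme cr_numeric within one size class.
# `pick` is which.max or which.min; both return the first row on ties.
extreme_by_size <- function(data, size_value, pick = which.max) {
  rows <- data[data$size_numeric == size_value, ]
  rows[pick(rows$cr_numeric), ]
}

toughest_tiny      <- extreme_by_size(monsters, 1, which.max)
weakest_gargantuan <- extreme_by_size(monsters, 6, which.min)
toughest_medium    <- extreme_by_size(monsters, 3, which.max)

print(toughest_tiny)
print(weakest_gargantuan)
print(toughest_medium)
```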

The poor brontosaurus, though gargantuan, only has a challenge rating of 5, whereas the demilich, while tiny, has a challenge rating of 18. Let’s recalculate the correlation coefficient without these two outliers, to see how they might be skewing the data.

## [1] 0.5591973
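A recalculation of this kind amounts to dropping the two rows by name and calling cor() again. The sketch below assumes the same eleven monsters named in this report, so its coefficients will not match the full-data values printed above.

```r
# Monsters named in this report, with numeric cr and size.
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth", "abominable-yeti",
           "acererak", "acolyte", "demilich", "brontosaurus",
           "zariel", "sibriex", "solar"),
  cr_numeric   = c(0.25, 9, 10, 9, 23, 0.25, 18, 5, 26, 18, 21),
  size_numeric = c(3, 3, 4, 5, 3, 3, 1, 6, 4, 5, 4),
  stringsAsFactors = FALSE
)

# Logical mask that is FALSE for the two outliers.
outliers <- c("demilich", "brontosaurus")
keep <- !(monsters$name %in% outliers)

# Correlation with and without the outliers, side by side.
r_all      <- cor(monsters$cr_numeric, monsters$size_numeric)
r_filtered <- cor(monsters$cr_numeric[keep], monsters$size_numeric[keep])
print(c(all = r_all, without_outliers = r_filtered))
```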

So the correlation coefficient does change without those two monsters, but only modestly - from about 0.545 to about 0.559.

Now I’m interested in examining the relationship between the six different attributes - Strength, Dexterity, Constitution, Intelligence, Wisdom, and Charisma - to find patterns in how they vary together. Do older, more powerful creatures like dragons have time to develop both their strength and intelligence, so that those two attributes show a direct correlation? Or will the stronger creatures be less intelligent, showing an inverse correlation?

##            str        dex        con       int       wis       cha
## str  1.0000000 -0.1927045  0.8378196 0.3453803 0.3727750 0.4614201
## dex -0.1927045  1.0000000 -0.1560511 0.2487943 0.3439224 0.2354319
## con  0.8378196 -0.1560511  1.0000000 0.4801935 0.4511550 0.5833508
## int  0.3453803  0.2487943  0.4801935 1.0000000 0.6495574 0.8997981
## wis  0.3727750  0.3439224  0.4511550 0.6495574 1.0000000 0.7255545
## cha  0.4614201  0.2354319  0.5833508 0.8997981 0.7255545 1.0000000
## corrplot 0.92 loaded
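A matrix like the one above can be produced by passing the six attribute columns to cor(). The sketch below uses only the monsters whose attribute scores are printed in this report, plus one row of missing scores, so its values will differ from the full-data matrix. The object names `attrs` and `attr_cor` are chosen here for illustration, and the corrplot call is optional and runs only if that package is installed.

```r
# Attribute scores as printed in this report; abjurer has no scores.
attrs <- data.frame(
  str = c(10, NA, 21, 10, 27, 10, 26),
  dex = c(14, NA,  9, 10, 24,  3, 22),
  con = c(10, NA, 15, 10, 28, 23, 26),
  int = c(11, NA, 18, 10, 26, 25, 25),
  wis = c(12, NA, 15, 14, 27, 24, 25),
  cha = c(11, NA, 18, 11, 30, 25, 30),
  row.names = c("aarakocra", "abjurer", "aboleth", "acolyte",
                "zariel", "sibriex", "solar")
)

# use = "complete.obs" drops rows containing NA before computing.
# Without it, every coefficient involving a missing value would be NA.
attr_cor <- cor(attrs, use = "complete.obs")
print(round(attr_cor, 2))

# Optional visualisation, skipped when corrplot is not installed.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(attr_cor, method = "circle")
}
```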

This correlation plot has some fascinating implications. Monsters who are agile (high Dexterity) tend to have slightly lower Strength and Constitution (correlations of about -0.19 and -0.16), whereas those who are strong (high Strength) tend to also be tough (high Constitution, about 0.84). But the most surprising implication comes from the correlation of 0.9 between Intelligence and Charisma. It seems that creatures that are stupid are also not very socially smooth and, conversely, the smartest monsters are also the best talkers. Let’s use a scatterplot to see how monsters fall along these two variables.

Looking at the data, there seem to be three creatures with abnormally high intelligence. Let’s check out what these three monsters are.

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
##      name                                                url cr          type
## 1  zariel  https://www.aidedd.org/dnd/monstres.php?vo=zariel 26 fiend (devil)
## 2 sibriex https://www.aidedd.org/dnd/monstres.php?vo=sibriex 18 fiend (demon)
## 3   solar   https://www.aidedd.org/dnd/monstres.php?vo=solar 21          <NA>
##    size ac  hp speed        align legendary                      source str dex
## 1 Large 21 580   fly  lawful evil Legendary Mordenkainen's Tome of Foes  27  24
## 2  Huge 19 150   fly chaotic evil Legendary Mordenkainen's Tome of Foes  10   3
## 3 Large 21 243   fly  lawful good Legendary        Monster Manual (SRD)  26  22
##   con int wis cha cr_numeric size_numeric size_factor
## 1  28  26  27  30         26            4       Large
## 2  23  25  24  25         18            5        Huge
## 3  26  25  25  30         21            4       Large

These three creatures - a devil, a demon, and a celestial - hold to the general correlation. They are all very intelligent and also very charismatic.

Part of being a dungeon master for Dungeons and Dragons involves observing trends within the world of the fantasy game. The patterns present in the Monster Manual have important implications for those who want to lead their own games - whether the DM hopes to play into these trends, or purposefully subvert them. Players, too, may optimize their characters by noticing which types of monsters are the most frequent and the most dangerous. Overall, data analysis is a powerful tool for making the game of D&D more enjoyable and compelling.