When I was first introduced to R, it was through data on animals - penguins, to be specific. Now, since I am a big fan of fantasy role-playing games, I am curious to perform a similar kind of ecological analysis, only on the fictional monsters from Dungeons and Dragons. The data for this project comes from Kaggle, in the form of a .csv file at https://www.kaggle.com/datasets/mrpantherson/dnd-5e-monsters?resource=download
dnd_monsters <- read.csv("/Users/MicahIsser/downloads/dnd_monsters.csv")
head(dnd_monsters)
## name url cr
## 1 aarakocra https://www.aidedd.org/dnd/monstres.php?vo=aarakocra 1/4
## 2 abjurer 9
## 3 aboleth https://www.aidedd.org/dnd/monstres.php?vo=aboleth 10
## 4 abominable-yeti 9
## 5 acererak 23
## 6 acolyte https://www.aidedd.org/dnd/monstres.php?vo=acolyte 1/4
## type size ac hp speed align legendary
## 1 humanoid (aarakocra) Medium 12 13 fly neutral good
## 2 humanoid (any race) Medium 12 84 any alignment
## 3 aberration Large 17 135 swim lawful evil Legendary
## 4 monstrosity Huge 15 137 chaotic evil
## 5 undead Medium 21 285 neutral evil
## 6 humanoid (any race) Medium 10 9 any alignment
## source str dex con int wis cha
## 1 Monster Manual (BR) 10 14 10 11 12 11
## 2 Volo's Guide to Monsters NA NA NA NA NA NA
## 3 Monster Manual (SRD) 21 9 15 18 15 18
## 4 Monster Manual NA NA NA NA NA NA
## 5 Adventures (Tomb of Annihilation) NA NA NA NA NA NA
## 6 Monster Manual (SRD) 10 10 10 10 14 11
I can see that there are a number of columns for different qualities of each monster, including size, alignment, challenge rating (cr), armor class (ac), and various attributes. I wonder which of these qualities follow a normal (bell-shaped) distribution, and which are skewed in one direction or another. For example, I hypothesize that there may be more monsters with low cr than high, since there are more beginning characters than high-level ones.
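A minimal sketch of how the size distribution might be tabulated in base R. The `monsters` data frame here is only a stand-in built from the six rows printed by `head()` above, and the ordering in `size_levels` is an assumption based on the usual D&D size categories; the real analysis would use the full `dnd_monsters` data.

```r
# Stand-in data: names and sizes copied from the head() output above.
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth",
           "abominable-yeti", "acererak", "acolyte"),
  size = c("Medium", "Medium", "Large", "Huge", "Medium", "Medium")
)

# Assumed ordering of size categories, smallest to largest.
size_levels <- c("Tiny", "Small", "Medium", "Large", "Huge", "Gargantuan")

# Fixing the levels keeps the sizes in a sensible order and also
# shows categories with a count of zero.
size_counts <- table(factor(monsters$size, levels = size_levels))
print(size_counts)

# A bar chart of the counts could then be drawn with:
# barplot(size_counts, xlab = "Size", ylab = "Number of monsters")
```

Setting the levels explicitly matters here because a plain `table()` on a character column sorts the categories alphabetically, which would put Gargantuan first and Tiny last.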
So the majority of the monsters are medium size, followed by large and then huge. This makes sense, since there are many humanoid creatures such as orcs, goblins, or bugbears, who would be roughly equal in stature to adventurers. Now let’s take a look at the distribution of moral alignments among the monsters.
This provides some interesting information: there are far more evil
monsters than good ones, and chaotic evil monsters are the most common
ones of all. It also appears that there are a number of monsters (over
175) whose alignment is listed as NA. I wonder if this is an accident,
or if it is intentional.
## [1] name url cr type size ac hp
## [8] speed align legendary source str dex con
## [15] int wis cha
## <0 rows> (or 0-length row.names)
Based on this list, and on the entries in this pdf copy of the monster manual - https://dn790005.ca.archive.org/0/items/dungeon-masters-guide/Monster%20Manual.pdf - it seems that there are a number of creatures that are listed as ‘unaligned.’ Most of these, like the allosaurus or badger, are animals who are not capable of speaking, and therefore do not have a clear moral disposition.
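One possible explanation for seeing NA in a plot while a check for NA rows comes back empty is factor conversion: any value that is not among a factor's declared levels silently becomes NA. The sketch below illustrates that mechanism only. The `known_levels` vector is hypothetical, and I have not confirmed that this is what happened in the original plot.

```r
# Alignment strings that appear in the output above.
align <- c("neutral good", "any alignment", "lawful evil",
           "chaotic evil", "neutral evil", "unaligned")

# The raw character column has no missing values at all.
n_missing_raw <- sum(is.na(align))

# Hypothetical set of levels covering only the nine classic alignments.
known_levels <- c("lawful good", "neutral good", "chaotic good",
                  "lawful neutral", "neutral", "chaotic neutral",
                  "lawful evil", "neutral evil", "chaotic evil")

# Values outside known_levels ("any alignment", "unaligned") become NA.
align_factor <- factor(align, levels = known_levels)
n_missing_factor <- sum(is.na(align_factor))

print(table(align_factor, useNA = "ifany"))
```

If this is the cause, adding "unaligned" and "any alignment" to the levels would move those monsters out of the NA bar and into categories of their own.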
There also seem to be monster subtypes. Let’s look at the distribution of monster types to see which is the most common.
So the most common types of monsters are beasts, monstrosities, and humanoids, although the ‘fiend’ class may appear smaller than it actually is because it is divided between the chaotic evil demons and the lawful evil devils.
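To count fiends (or humanoids) as a single group, the parenthesised subtype can be stripped before tabulating. This is a sketch under the assumption that subtypes always appear as a trailing "(...)" suffix, as they do in the rows shown in this post; the `type` vector is a small stand-in taken from those rows.

```r
# Type strings copied from rows printed elsewhere in this post.
type <- c("humanoid (aarakocra)", "humanoid (any race)", "aberration",
          "monstrosity", "undead", "fiend (devil)", "fiend (demon)", "beast")

# Remove an optional space plus a trailing "(subtype)" suffix.
base_type <- sub("\\s*\\(.*\\)$", "", type)

# Count monsters per base type, most common first.
type_counts <- sort(table(base_type), decreasing = TRUE)
print(type_counts)
```

With the suffix removed, devils and demons are both counted under "fiend", which gives a fairer comparison against types that have no subtypes.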
Now I’m curious to learn more about how difficult these monsters are to fight, by plotting their challenge rating (cr).
As I suspected, the graph of challenge ratings heavily skews towards easier monsters, with the median being at cr 2 (equal to a level 2 adventurer). Now I would like to begin finding relationships between the values of multiple variables. Unfortunately, several of the column values are stored as characters, rather than numbers, so I need to convert them before they can be manipulated.
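A rough sketch of one way the conversion could be done. The helper `cr_to_numeric` is a hypothetical name of my own, not a function from any package, and the stand-in values come from the `head()` output above. Calling `as.numeric()` directly on a string like "1/4" returns NA with a coercion warning, so the fraction is split and divided by hand.

```r
# Stand-in values copied from the head() output above.
cr   <- c("1/4", "9", "10", "9", "23", "1/4")
size <- c("Medium", "Medium", "Large", "Huge", "Medium", "Medium")

# Hypothetical helper: turn "1/4" into 0.25 and "9" into 9.
cr_to_numeric <- function(cr) {
  parts <- strsplit(cr, "/", fixed = TRUE)
  vapply(parts, function(p) {
    nums <- as.numeric(p)
    if (length(nums) == 2) nums[1] / nums[2] else nums[1]
  }, numeric(1))
}

cr_numeric <- cr_to_numeric(cr)

# Size becomes a factor with an assumed smallest-to-largest ordering,
# and its integer codes (1 = Tiny ... 6 = Gargantuan) give size_numeric.
size_levels  <- c("Tiny", "Small", "Medium", "Large", "Huge", "Gargantuan")
size_factor  <- factor(size, levels = size_levels)
size_numeric <- as.numeric(size_factor)

print(data.frame(cr, cr_numeric, size, size_numeric))
```

Treating size as the numbers 1 to 6 assumes the steps between categories are evenly spaced, which is a simplification worth keeping in mind when reading any correlation built on it.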
## Warning: NAs introduced by coercion
## 'data.frame': 762 obs. of 20 variables:
## $ name : chr "aarakocra" "abjurer" "aboleth" "abominable-yeti" ...
## $ url : chr "https://www.aidedd.org/dnd/monstres.php?vo=aarakocra" "" "https://www.aidedd.org/dnd/monstres.php?vo=aboleth" "" ...
## $ cr : Ord.factor w/ 33 levels "1/8"<"1/4"<"1/2"<..: 2 12 13 12 26 2 17 20 19 16 ...
## $ type : Factor w/ 14 levels "beast","monstrosity",..: NA 3 8 2 5 3 4 5 4 4 ...
## $ size : Factor w/ 6 levels "Tiny","Small",..: 3 3 4 5 3 3 5 5 5 5 ...
## $ ac : int 12 12 17 15 21 10 19 19 19 18 ...
## $ hp : int 13 84 135 137 285 9 195 225 225 172 ...
## $ speed : chr "fly" "" "swim" "" ...
## $ align : chr "neutral good" "any alignment" "lawful evil" "chaotic evil" ...
## $ legendary : chr "" "" "Legendary" "" ...
## $ source : chr "Monster Manual (BR)" "Volo's Guide to Monsters" "Monster Manual (SRD)" "Monster Manual" ...
## $ str : num 10 NA 21 NA NA 10 23 NA 25 23 ...
## $ dex : num 14 NA 9 NA NA 10 14 NA 10 10 ...
## $ con : num 10 NA 15 NA NA 10 21 NA 23 21 ...
## $ int : num 11 NA 18 NA NA 10 14 NA 16 14 ...
## $ wis : num 12 NA 15 NA NA 14 13 NA 15 13 ...
## $ cha : num 11 NA 18 NA NA 11 17 NA 19 17 ...
## $ cr_numeric : num 0.25 9 10 9 23 0.25 14 17 16 13 ...
## $ size_numeric: num 3 3 4 5 3 3 5 5 5 5 ...
## $ size_factor : Factor w/ 6 levels "Tiny","Small",..: 3 3 4 5 3 3 5 5 5 5 ...
## [1] 0.5451912
The correlation coefficient for the relationship between challenge rating and size is about 0.55, a moderate positive correlation. But now let’s look at that relationship graphically, so we can better see how bigger monsters tend to be tougher.
This graph shows how most of the tiny monsters have a low CR, while most of the gargantuan monsters have a higher CR. But it makes me wonder about the outliers. Let’s print just the row with the maximum cr_numeric among monsters where size_numeric is 1 and, conversely, the row with the minimum cr_numeric among monsters where size_numeric is 6.
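One way such a lookup might be written in base R. The function `extreme_row` is a hypothetical helper of my own, and `monsters` is a small stand-in using CR and size values that appear in this post's output.

```r
# Stand-in data with values taken from the printed rows in this post.
monsters <- data.frame(
  name         = c("aarakocra", "aboleth", "abominable-yeti",
                   "demilich", "brontosaurus"),
  cr_numeric   = c(0.25, 10, 9, 18, 5),
  size_numeric = c(3, 4, 5, 1, 6)
)

# Hypothetical helper: among rows of a given size, return the row that
# `pick` selects (which.max for the toughest, which.min for the weakest).
extreme_row <- function(df, size_value, pick = which.max) {
  # which() drops NA comparisons instead of producing NA rows.
  rows <- df[which(df$size_numeric == size_value), ]
  rows[pick(rows$cr_numeric), ]
}

strongest_tiny     <- extreme_row(monsters, 1, which.max)
weakest_gargantuan <- extreme_row(monsters, 6, which.min)

print(strongest_tiny)
print(weakest_gargantuan)
```

Note that `which.max` and `which.min` return only the first match, so ties for the highest or lowest CR would show just one monster.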
## [1] "Row with size_numeric = 1 and maximum cr_numeric:"
## name url cr type size ac hp speed align legendary
## 169 demilich 18 undead Tiny 20 80 fly neutral evil Legendary
## source str dex con int wis cha cr_numeric size_numeric size_factor
## 169 Monster Manual NA NA NA NA NA NA 18 1 Tiny
## [1] "Row with size_numeric = 6 and minimum cr_numeric:"
## name url cr type size ac hp speed align legendary
## 106 brontosaurus 5 beast Gargantuan 15 121 unaligned
## source str dex con int wis cha cr_numeric
## 106 Adventures (Tomb of Annihilation) NA NA NA NA NA NA 5
## size_numeric size_factor
## 106 6 Gargantuan
The poor brontosaurus, though gargantuan, only has a challenge rating of 5, whereas the demilich, while tiny, has a challenge rating of 18. Let’s recalculate the correlation coefficient without these two outliers, to see how they might be skewing the data.
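A sketch of the comparison, assuming Pearson correlation via `cor()` and excluding the two monsters by name. The stand-in `monsters` data frame holds only the eleven monsters whose CR and size are printed in this post, so its coefficients will not match the values for the full dataset.

```r
# Stand-in data: CR and size for monsters printed elsewhere in this post.
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth", "abominable-yeti",
           "acererak", "acolyte", "demilich", "brontosaurus",
           "zariel", "sibriex", "solar"),
  cr_numeric   = c(0.25, 9, 10, 9, 23, 0.25, 18, 5, 26, 18, 21),
  size_numeric = c(3, 3, 4, 5, 3, 3, 1, 6, 4, 5, 4)
)

# Correlation on all rows; "complete.obs" skips rows with missing values.
cor_all <- cor(monsters$cr_numeric, monsters$size_numeric,
               use = "complete.obs")

# Drop the two outliers by name and recompute.
outliers <- c("demilich", "brontosaurus")
trimmed  <- monsters[!(monsters$name %in% outliers), ]
cor_trimmed <- cor(trimmed$cr_numeric, trimmed$size_numeric,
                   use = "complete.obs")

print(c(all = cor_all, trimmed = cor_trimmed))
```

Because size is an ordered category rather than a true measurement, `method = "spearman"` could be passed to `cor()` as a rank-based alternative.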
## [1] 0.5591973
So the correlation coefficient does change, though only slightly - from 0.545 to 0.559 - without the inclusion of those two monsters.
Now I’m interested in examining the relationship between the six different attributes - Strength, Dexterity, Constitution, Intelligence, Wisdom, and Charisma - to find patterns in how these qualities may differ. Do older, more powerful creatures like dragons have time to develop both their strength and intelligence, so that those two attributes show a direct correlation? Or will the stronger creatures be less intelligent, showing an inverse correlation?
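A minimal sketch of building such a correlation matrix with base R's `cor()`. The `attrs` data frame is a stand-in holding only the attribute rows printed in this post (including the all-NA rows), so its numbers will differ from those of the full dataset.

```r
# Stand-in attribute scores copied from rows printed in this post.
# Rows of NA mirror monsters whose attributes are missing in the data.
attrs <- data.frame(
  str = c(10, NA, 21, NA, NA, 10, 27, 10, 26),
  dex = c(14, NA,  9, NA, NA, 10, 24,  3, 22),
  con = c(10, NA, 15, NA, NA, 10, 28, 23, 26),
  int = c(11, NA, 18, NA, NA, 10, 26, 25, 25),
  wis = c(12, NA, 15, NA, NA, 14, 27, 24, 25),
  cha = c(11, NA, 18, NA, NA, 11, 30, 25, 30)
)

# "complete.obs" uses only monsters that have all six scores recorded.
attr_cor <- cor(attrs, use = "complete.obs")
print(round(attr_cor, 2))
```

Without the `use` argument, a single NA in a column would turn every correlation involving that column into NA.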
## str dex con int wis cha
## str 1.0000000 -0.1927045 0.8378196 0.3453803 0.3727750 0.4614201
## dex -0.1927045 1.0000000 -0.1560511 0.2487943 0.3439224 0.2354319
## con 0.8378196 -0.1560511 1.0000000 0.4801935 0.4511550 0.5833508
## int 0.3453803 0.2487943 0.4801935 1.0000000 0.6495574 0.8997981
## wis 0.3727750 0.3439224 0.4511550 0.6495574 1.0000000 0.7255545
## cha 0.4614201 0.2354319 0.5833508 0.8997981 0.7255545 1.0000000
## corrplot 0.92 loaded
This correlation plot has some fascinating implications. Monsters who are agile (high Dexterity) tend to have somewhat lower Strength and Constitution, though those negative correlations are weak (about -0.19 and -0.16); whereas those who are strong (high Strength) tend to also be tough (high Constitution), with a correlation of about 0.84. But the most surprising implication comes from the correlation of 0.9 between Intelligence and Charisma. It seems that creatures that are stupid are also not very socially smooth and, conversely, the smartest monsters are also the best talkers. Let’s use a scatterplot to see how monsters fall along these two variables.
Looking at the data, there seem to be three creatures with abnormally high intelligence. Let’s check out what these three monsters are.
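A sketch of one way to pull out those rows, using base R's `subset()` rather than dplyr. The cutoff `int_threshold` of 25 is my own assumption, chosen because the three monsters in question have Intelligence of 25 or more, and `monsters` is a stand-in built from rows printed in this post.

```r
# Stand-in data: Intelligence and Charisma from rows printed in this post.
monsters <- data.frame(
  name = c("aarakocra", "abjurer", "aboleth", "acolyte",
           "zariel", "sibriex", "solar"),
  int  = c(11, NA, 18, 10, 26, 25, 25),
  cha  = c(11, NA, 18, 11, 30, 25, 30)
)

# Assumed cutoff for "abnormally high" Intelligence.
int_threshold <- 25

# subset() drops rows where the condition is NA, so monsters with
# missing Intelligence are excluded automatically.
smartest <- subset(monsters, int >= int_threshold)

# Show the highest Intelligence first.
smartest <- smartest[order(-smartest$int), ]
print(smartest)
```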
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## name url cr type
## 1 zariel https://www.aidedd.org/dnd/monstres.php?vo=zariel 26 fiend (devil)
## 2 sibriex https://www.aidedd.org/dnd/monstres.php?vo=sibriex 18 fiend (demon)
## 3 solar https://www.aidedd.org/dnd/monstres.php?vo=solar 21 <NA>
## size ac hp speed align legendary source str dex
## 1 Large 21 580 fly lawful evil Legendary Mordenkainen's Tome of Foes 27 24
## 2 Huge 19 150 fly chaotic evil Legendary Mordenkainen's Tome of Foes 10 3
## 3 Large 21 243 fly lawful good Legendary Monster Manual (SRD) 26 22
## con int wis cha cr_numeric size_numeric size_factor
## 1 28 26 27 30 26 4 Large
## 2 23 25 24 25 18 5 Huge
## 3 26 25 25 30 21 4 Large
These three creatures - a devil, a demon, and a celestial - hold to the general correlation. They are all very intelligent and also very charismatic.
Part of being a dungeon master for Dungeons and Dragons involves observing trends within the world of the fantasy game. The patterns present in the Monster Manual have important implications for those who want to lead their own games - whether the DM hopes to play into these trends, or purposefully subvert them. And players may optimize their characters by noticing which types of monsters are the most frequent and most dangerous. Overall, data analysis is a powerful tool for making the game of D&D more enjoyable and compelling.