Data 607 - Project 2

To begin, here is a summary of the tasks I’ll be completing for this project:

Project 2 - Data 607: Summary of Tasks

Dataset Selection:

Choose any three “wide” datasets from the Week 6 Discussion items. Avoid using the Sample Post dataset (already used in Week 6).

Data Preparation:

For each chosen dataset, create a CSV file (or optionally a MySQL database) that contains all the dataset information. Use a “wide” structure that mirrors how the data appears in the discussion item.

Data Import & Transformation:

Read each CSV file into R. Use the tidyr and dplyr packages to tidy and transform your data. (Note: This step carries the most weight in your grade.)

Analysis & Documentation:

Perform the analysis as requested in the discussion item for each dataset. Prepare an R Markdown file that includes your code, narrative descriptions of your data cleanup, analysis, and conclusions. Publish the R Markdown file to rpubs.com.

library(dplyr)
library(tidyverse)

With the correct packages installed, it’s time to bring in the Data. I will be using:

D&D Data I shared in Discussion Cheeses Data shared by a classmate Rolling Stone Top Album data shared by a classmate.

I downloaded all the datasets in their untidy forms and loaded them into my Github for easy access. Time to pull the untidy data into this project and check that it loaded in properly:

DungeonsAndDragons <- "https://raw.githubusercontent.com/tcgraham-data/Data-607-Project-2/refs/heads/main/dnd_chars_all.csv"
Cheese <- "https://raw.githubusercontent.com/tcgraham-data/Data-607-Project-2/refs/heads/main/cheeses.csv"
ROllingStoneTopAlbums <- "https://raw.githubusercontent.com/tcgraham-data/Data-607-Project-2/refs/heads/main/Rolling%20Stone%20500%20(public)%20-%20Data.csv"

dnddata <- read.csv(DungeonsAndDragons)
cheesedata <- read.csv(Cheese)
albumdata <- read.csv(ROllingStoneTopAlbums)

head(dnddata)
##        ip  finger                             hash     name           race
## 1         ed15f9d fe3ed6570067d2cd808bcee0a4396824 ee1e382c     Hill Dwarf
## 2         ed15f9d aa656cef94740b1da714d82782d8a8c4 ee1e382c     Hill Dwarf
## 3 6b5d3f4 d922658 04b9abba9ce759ef13319c520bf0eb69 f1f6ff43          Human
## 4 9b7218f b5d19a0 ba92372c716a832e03e866447cb2d102 f92bdd74 Fallen Aasimar
## 5 9b7218f b5d19a0 2f4a637dd42b9aa945abf870fe34edcf f92bdd74 Fallen Aasimar
## 6 bf0845c 6594cbd de16fbfebad327876582875339ad1e1f 5b8cf3ca        Variant
##               background                 date                class
## 1 Guild Member - Justice 2022-08-23T20:02:11Z Sorcerer 13|Cleric 1
## 2 Guild Member - Justice 2022-08-23T19:43:25Z Sorcerer 13|Cleric 1
## 3                  Noble 2022-08-22T14:57:09Z           Fighter 13
## 4              Outlander 2022-08-22T12:12:53Z   Sorcerer 3|Druid 2
## 5              Outlander 2022-08-22T12:07:21Z   Sorcerer 3|Druid 2
## 6            Entertainer 2022-08-22T03:19:30Z               Bard 1
##         justClass                        subclass level
## 1 Sorcerer|Cleric     Clockwork Soul|Order Domain    14
## 2 Sorcerer|Cleric     Clockwork Soul|Order Domain    14
## 3         Fighter                 Eldritch Knight    13
## 4  Sorcerer|Druid Shadow Magic|Circle of Wildfire     5
## 5  Sorcerer|Druid Shadow Magic|Circle of Wildfire     5
## 6            Bard                                     1
##                                    feats  HP AC Str Dex Con Int Wis Cha
## 1 Fey Touched|War Caster|Metamagic Adept 146 10   9  11  20  14  14  20
## 2                                        133 10   9  11  18  14  14  20
## 3                     Heavy Armor Master 140 21  20  12  19  14  11  10
## 4                                         34 16   8  10  14  10  16  16
## 5                                         34 16   8  10  14  10  16  16
## 6                                         10 13  10  16  14  12   8  16
##        alignment                                                      skills
## 1                                               Arcana|Religion|Intimidation
## 2                                               Arcana|Religion|Intimidation
## 3             CG                       History|Insight|Perception|Persuasion
## 4 Caltico Neutro                           Athletics|Arcana|Insight|Survival
## 5 Caltico Neutro                           Athletics|Arcana|Insight|Survival
## 6                Acrobatics|Sleight of Hand|Perception|Deception|Performance
##                                                   weapons
## 1                                  Crossbow, light|Dagger
## 2                                  Crossbow, light|Dagger
## 3 Longsword +1|Dagger|Crossbow, light|Mace Of Disruption 
## 4                                                  Dagger
## 5                                                  Dagger
## 6                                Longsword|Dagger|Longbow
##                                                                                                                                                                                                                                         spells
## 1                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 2                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                             
##   castingStat
## 1         Cha
## 2         Cha
## 3         Int
## 4         Cha
## 5         Cha
## 6         Cha
##                                                               choices
## 1 metamagic/Twinned Spell*Subtle Spell*Quickened Spell*Extended Spell
## 2                metamagic/Twinned Spell*Subtle Spell*Quickened Spell
## 3                                              fighting style/Defense
## 4                            metamagic/Transmuted Spell*Seeking Spell
## 5                            metamagic/Transmuted Spell*Seeking Spell
## 6                                                                    
##         country countryCode processedAlignment good lawful processedRace
## 1        Canada          CA                      NA     NA         Dwarf
## 2        Canada          CA                      NA     NA         Dwarf
## 3 United States          US                 CG   NA     NA         Human
## 4        Brazil          BR                      NA     NA       Aasimar
## 5        Brazil          BR                      NA     NA       Aasimar
## 6        Canada          CA                      NA     NA      Half-Elf
##                                                                                                                                                                                                                                processedSpells
## 1                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 2                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                             
##                        processedWeapons               alias
## 1                Crossbow, Light|Dagger     thirsty_davinci
## 2                Crossbow, Light|Dagger     thirsty_davinci
## 3 Longsword|Dagger|Crossbow, Light|Mace         cool_bhabha
## 4                                Dagger vigilant_mcclintock
## 5                                Dagger vigilant_mcclintock
## 6              Longsword|Dagger|Longbow        quirky_hertz
head(cheesedata)
##               cheese                                        url  milk
## 1         Aarewasser         https://www.cheese.com/aarewasser/   cow
## 2   Abbaye de Belloc   https://www.cheese.com/abbaye-de-belloc/ sheep
## 3   Abbaye de Belval   https://www.cheese.com/abbaye-de-belval/   cow
## 4  Abbaye de Citeaux  https://www.cheese.com/abbaye-de-citeaux/   cow
## 5    Abbaye de Tamié              https://www.cheese.com/tamie/   cow
## 6 Abbaye de Timadeuc https://www.cheese.com/abbaye-de-timadeuc/   cow
##       country               region family                       type
## 1 Switzerland                 <NA>   <NA>                  semi-soft
## 2      France          Pays Basque   <NA>         semi-hard, artisan
## 3      France                 <NA>   <NA>                  semi-hard
## 4      France             Burgundy   <NA> semi-soft, artisan, brined
## 5      France               Savoie   <NA>              soft, artisan
## 6      France province of Brittany   <NA>                  semi-hard
##   fat_content calcium_content               texture    rind       color
## 1        <NA>            <NA>               buttery  washed      yellow
## 2        <NA>            <NA>   creamy, dense, firm natural      yellow
## 3      40-46%            <NA>               elastic  washed       ivory
## 4        <NA>            <NA> creamy, dense, smooth  washed       white
## 5        <NA>            <NA>  creamy, open, smooth  washed       white
## 6        <NA>            <NA>                  soft  washed pale yellow
##                  flavor             aroma vegetarian vegan
## 1                 sweet           buttery      FALSE FALSE
## 2         burnt caramel          lanoline       TRUE FALSE
## 3                  <NA>          aromatic      FALSE FALSE
## 4 acidic, milky, smooth barnyardy, earthy      FALSE FALSE
## 5         fruity, nutty perfumed, pungent      FALSE FALSE
## 6         salty, smooth             nutty      FALSE FALSE
##                      synonyms                             alt_spellings
## 1                        <NA>                                      <NA>
## 2 Abbaye Notre-Dame de Belloc                                      <NA>
## 3                        <NA>                                      <NA>
## 4                        <NA>                                      <NA>
## 5                        <NA> Tamié, Trappiste de Tamie, Abbey of Tamie
## 6                        <NA>                                      <NA>
##                                    producers
## 1                                       Jumi
## 2                                       <NA>
## 3                                       <NA>
## 4                                       <NA>
## 5                                       <NA>
## 6 Abbaye Cistercienne NOTRE-DAME DE TIMADEUC
head(albumdata)
##        Sort.Name     Clean.Name                      Album X2003.Rank.Old
## 1 Sinatra, Frank  Frank Sinatra     In the Wee Small Hours            101
## 2    Diddley, Bo     Bo Diddley Bo Diddley / Go Bo Diddley            212
## 3 Presley, Elvis  Elvis Presley              Elvis Presley             56
## 4 Sinatra, Frank  Frank Sinatra Songs for Swingin' Lovers!            302
## 5 Little Richard Little Richard      Here's Little Richard             50
## 6        Beyonce        Beyonce                   Lemonade             NA
##   X2003.Rank X2012.Rank X2020.Rank X2020.2003.Differential Release.Year
## 1        100        101        282                    -182         1955
## 2        214        216        455                    -241         1955
## 3         55         56        332                    -277         1956
## 4        306        308         NA                    -195         1956
## 5         50         50        227                    -177         1957
## 6         NA         NA         32                     469         2016
##                   Album.Genre Album.Type Wks.on.Billboard
## 1               Big Band/Jazz     Studio               14
## 2 Rock n' Roll/Rhythm & Blues     Studio                -
## 3 Rock n' Roll/Rhythm & Blues     Studio              100
## 4               Big Band/Jazz     Studio                ?
## 5                                 Studio                5
## 6                                 Studio               87
##   Peak.Billboard.Position Spotify.Popularity
## 1                       2                 48
## 2                     201                 50
## 3                       1                 58
## 4                       2                 62
## 5                      13                 64
## 6                       1                 73
##                            Spotify.URI
## 1 spotify:album:3GmwKB1tgPZgXeRJZSm9WX
## 2 spotify:album:1cbtDEwxCjMhglb49OgNBR
## 3 spotify:album:7GXP5OhYyPVLmcVfO9Iqin
## 4 spotify:album:4kca7vXd1Wo5GE2DMafvMc
## 5 spotify:album:18tV6PLXYvVjsdOVk0S7M8
## 6 spotify:album:7dK54iZuOxXFarGhXwEXfF
##                                  Chartmetric.Link Artist.Member.Count
## 1 https://app.chartmetric.com/album/3115110/about                   1
## 2 https://app.chartmetric.com/album/4137525/about                   1
## 3 https://app.chartmetric.com/album/1338334/about                   1
## 4 https://app.chartmetric.com/album/1075976/about                   1
## 5 https://app.chartmetric.com/album/3128391/about                   1
## 6 https://app.chartmetric.com/album/4160994/about                   1
##   Artist.Gender Artist.Birth.Year.Sum Debut.Album.Release.Year
## 1          Male                  1915                     1946
## 2          Male                  1928                     1955
## 3          Male                  1935                     1956
## 4          Male                  1915                     1946
## 5          Male                  1932                     1957
## 6        Female                  1981                     2003
##   Avg..Age.at.Top.500.Album Years.Between.Debut.and.Top.500.Album
## 1                        40                                     9
## 2                        27                                     0
## 3                        21                                     0
## 4                        41                                    10
## 5                        25                                     0
## 6                        35                                    13
##                 Album.ID           Album.ID.Quoted X X.1
## 1 3GmwKB1tgPZgXeRJZSm9WX "3GmwKB1tgPZgXeRJZSm9WX",      
## 2 1cbtDEwxCjMhglb49OgNBR "1cbtDEwxCjMhglb49OgNBR",      
## 3 7GXP5OhYyPVLmcVfO9Iqin "7GXP5OhYyPVLmcVfO9Iqin",      
## 4 4kca7vXd1Wo5GE2DMafvMc "4kca7vXd1Wo5GE2DMafvMc",      
## 5 18tV6PLXYvVjsdOVk0S7M8 "18tV6PLXYvVjsdOVk0S7M8",      
## 6 7dK54iZuOxXFarGhXwEXfF "7dK54iZuOxXFarGhXwEXfF",

Now that our data is loaded, we need to clean it up with the intent of doing something with it. Since a lot of the discussions didn’t necessarily have a request built in, I have developed my own questions to answer:

  1. D&D Data - What are the top Character classes and races by percent? Further, can we figure out what the top class/race pairs are by percent?

  2. Cheese Data - Who on earth makes the most cheese styles?

  3. Top Album Data - Which albums have had the most percent change on the rank list from 2003 to 2020?

Dungeons And Dragons Data

First thing we need to do is remove columns that aren’t providing us with any real data related to the game itself. This can be done by manually look at the columns and striking those which aren’t relevant using dplyr:

dnd_data_clean <- dnddata %>%
  select(-matches("^(name|ip|finger|hash|date|alias)"))

head(dnd_data_clean)
##             race             background                class       justClass
## 1     Hill Dwarf Guild Member - Justice Sorcerer 13|Cleric 1 Sorcerer|Cleric
## 2     Hill Dwarf Guild Member - Justice Sorcerer 13|Cleric 1 Sorcerer|Cleric
## 3          Human                  Noble           Fighter 13         Fighter
## 4 Fallen Aasimar              Outlander   Sorcerer 3|Druid 2  Sorcerer|Druid
## 5 Fallen Aasimar              Outlander   Sorcerer 3|Druid 2  Sorcerer|Druid
## 6        Variant            Entertainer               Bard 1            Bard
##                          subclass level                                  feats
## 1     Clockwork Soul|Order Domain    14 Fey Touched|War Caster|Metamagic Adept
## 2     Clockwork Soul|Order Domain    14                                       
## 3                 Eldritch Knight    13                     Heavy Armor Master
## 4 Shadow Magic|Circle of Wildfire     5                                       
## 5 Shadow Magic|Circle of Wildfire     5                                       
## 6                                     1                                       
##    HP AC Str Dex Con Int Wis Cha      alignment
## 1 146 10   9  11  20  14  14  20               
## 2 133 10   9  11  18  14  14  20               
## 3 140 21  20  12  19  14  11  10             CG
## 4  34 16   8  10  14  10  16  16 Caltico Neutro
## 5  34 16   8  10  14  10  16  16 Caltico Neutro
## 6  10 13  10  16  14  12   8  16               
##                                                        skills
## 1                                Arcana|Religion|Intimidation
## 2                                Arcana|Religion|Intimidation
## 3                       History|Insight|Perception|Persuasion
## 4                           Athletics|Arcana|Insight|Survival
## 5                           Athletics|Arcana|Insight|Survival
## 6 Acrobatics|Sleight of Hand|Perception|Deception|Performance
##                                                   weapons
## 1                                  Crossbow, light|Dagger
## 2                                  Crossbow, light|Dagger
## 3 Longsword +1|Dagger|Crossbow, light|Mace Of Disruption 
## 4                                                  Dagger
## 5                                                  Dagger
## 6                                Longsword|Dagger|Longbow
##                                                                                                                                                                                                                                         spells
## 1                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 2                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                             
##   castingStat
## 1         Cha
## 2         Cha
## 3         Int
## 4         Cha
## 5         Cha
## 6         Cha
##                                                               choices
## 1 metamagic/Twinned Spell*Subtle Spell*Quickened Spell*Extended Spell
## 2                metamagic/Twinned Spell*Subtle Spell*Quickened Spell
## 3                                              fighting style/Defense
## 4                            metamagic/Transmuted Spell*Seeking Spell
## 5                            metamagic/Transmuted Spell*Seeking Spell
## 6                                                                    
##         country countryCode processedAlignment good lawful processedRace
## 1        Canada          CA                      NA     NA         Dwarf
## 2        Canada          CA                      NA     NA         Dwarf
## 3 United States          US                 CG   NA     NA         Human
## 4        Brazil          BR                      NA     NA       Aasimar
## 5        Brazil          BR                      NA     NA       Aasimar
## 6        Canada          CA                      NA     NA      Half-Elf
##                                                                                                                                                                                                                                processedSpells
## 1                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 2                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                             
##                        processedWeapons
## 1                Crossbow, Light|Dagger
## 2                Crossbow, Light|Dagger
## 3 Longsword|Dagger|Crossbow, Light|Mace
## 4                                Dagger
## 5                                Dagger
## 6              Longsword|Dagger|Longbow

With that done, we need to get an “NA” added to any cell that is currently empty. That way I don’t have to worry about cells getting weird on me:

dnd_data_clean <- dnd_data_clean %>%
  mutate(across(where(is.character), ~na_if(., "")))

head(dnd_data_clean)
##             race             background                class       justClass
## 1     Hill Dwarf Guild Member - Justice Sorcerer 13|Cleric 1 Sorcerer|Cleric
## 2     Hill Dwarf Guild Member - Justice Sorcerer 13|Cleric 1 Sorcerer|Cleric
## 3          Human                  Noble           Fighter 13         Fighter
## 4 Fallen Aasimar              Outlander   Sorcerer 3|Druid 2  Sorcerer|Druid
## 5 Fallen Aasimar              Outlander   Sorcerer 3|Druid 2  Sorcerer|Druid
## 6        Variant            Entertainer               Bard 1            Bard
##                          subclass level                                  feats
## 1     Clockwork Soul|Order Domain    14 Fey Touched|War Caster|Metamagic Adept
## 2     Clockwork Soul|Order Domain    14                                   <NA>
## 3                 Eldritch Knight    13                     Heavy Armor Master
## 4 Shadow Magic|Circle of Wildfire     5                                   <NA>
## 5 Shadow Magic|Circle of Wildfire     5                                   <NA>
## 6                            <NA>     1                                   <NA>
##    HP AC Str Dex Con Int Wis Cha      alignment
## 1 146 10   9  11  20  14  14  20           <NA>
## 2 133 10   9  11  18  14  14  20           <NA>
## 3 140 21  20  12  19  14  11  10             CG
## 4  34 16   8  10  14  10  16  16 Caltico Neutro
## 5  34 16   8  10  14  10  16  16 Caltico Neutro
## 6  10 13  10  16  14  12   8  16           <NA>
##                                                        skills
## 1                                Arcana|Religion|Intimidation
## 2                                Arcana|Religion|Intimidation
## 3                       History|Insight|Perception|Persuasion
## 4                           Athletics|Arcana|Insight|Survival
## 5                           Athletics|Arcana|Insight|Survival
## 6 Acrobatics|Sleight of Hand|Perception|Deception|Performance
##                                                   weapons
## 1                                  Crossbow, light|Dagger
## 2                                  Crossbow, light|Dagger
## 3 Longsword +1|Dagger|Crossbow, light|Mace Of Disruption 
## 4                                                  Dagger
## 5                                                  Dagger
## 6                                Longsword|Dagger|Longbow
##                                                                                                                                                                                                                                         spells
## 1                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 2                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                         <NA>
##   castingStat
## 1         Cha
## 2         Cha
## 3         Int
## 4         Cha
## 5         Cha
## 6         Cha
##                                                               choices
## 1 metamagic/Twinned Spell*Subtle Spell*Quickened Spell*Extended Spell
## 2                metamagic/Twinned Spell*Subtle Spell*Quickened Spell
## 3                                              fighting style/Defense
## 4                            metamagic/Transmuted Spell*Seeking Spell
## 5                            metamagic/Transmuted Spell*Seeking Spell
## 6                                                                <NA>
##         country countryCode processedAlignment good lawful processedRace
## 1        Canada          CA               <NA>   NA     NA         Dwarf
## 2        Canada          CA               <NA>   NA     NA         Dwarf
## 3 United States          US                 CG   NA     NA         Human
## 4        Brazil          BR               <NA>   NA     NA       Aasimar
## 5        Brazil          BR               <NA>   NA     NA       Aasimar
## 6        Canada          CA               <NA>   NA     NA      Half-Elf
##                                                                                                                                                                                                                                processedSpells
## 1                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 2                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                         <NA>
##                        processedWeapons
## 1                Crossbow, Light|Dagger
## 2                Crossbow, Light|Dagger
## 3 Longsword|Dagger|Crossbow, Light|Mace
## 4                                Dagger
## 5                                Dagger
## 6              Longsword|Dagger|Longbow

Since I’m really only interested in the “race” and “justClass” columns for my observation, I want to make sure we remove any rows where those are empty. Since you can’t functionally play the game without those two elements, there’s not really a reason to measure them, even as “empty”:

dnd_data_clean <- dnd_data_clean %>%
  filter(!is.na(race) & race != "empty",
         !is.na(justClass) & justClass != "empty")

# Display the cleaned dataset
head(dnd_data_clean)
##             race             background                class       justClass
## 1     Hill Dwarf Guild Member - Justice Sorcerer 13|Cleric 1 Sorcerer|Cleric
## 2     Hill Dwarf Guild Member - Justice Sorcerer 13|Cleric 1 Sorcerer|Cleric
## 3          Human                  Noble           Fighter 13         Fighter
## 4 Fallen Aasimar              Outlander   Sorcerer 3|Druid 2  Sorcerer|Druid
## 5 Fallen Aasimar              Outlander   Sorcerer 3|Druid 2  Sorcerer|Druid
## 6        Variant            Entertainer               Bard 1            Bard
##                          subclass level                                  feats
## 1     Clockwork Soul|Order Domain    14 Fey Touched|War Caster|Metamagic Adept
## 2     Clockwork Soul|Order Domain    14                                   <NA>
## 3                 Eldritch Knight    13                     Heavy Armor Master
## 4 Shadow Magic|Circle of Wildfire     5                                   <NA>
## 5 Shadow Magic|Circle of Wildfire     5                                   <NA>
## 6                            <NA>     1                                   <NA>
##    HP AC Str Dex Con Int Wis Cha      alignment
## 1 146 10   9  11  20  14  14  20           <NA>
## 2 133 10   9  11  18  14  14  20           <NA>
## 3 140 21  20  12  19  14  11  10             CG
## 4  34 16   8  10  14  10  16  16 Caltico Neutro
## 5  34 16   8  10  14  10  16  16 Caltico Neutro
## 6  10 13  10  16  14  12   8  16           <NA>
##                                                        skills
## 1                                Arcana|Religion|Intimidation
## 2                                Arcana|Religion|Intimidation
## 3                       History|Insight|Perception|Persuasion
## 4                           Athletics|Arcana|Insight|Survival
## 5                           Athletics|Arcana|Insight|Survival
## 6 Acrobatics|Sleight of Hand|Perception|Deception|Performance
##                                                   weapons
## 1                                  Crossbow, light|Dagger
## 2                                  Crossbow, light|Dagger
## 3 Longsword +1|Dagger|Crossbow, light|Mace Of Disruption 
## 4                                                  Dagger
## 5                                                  Dagger
## 6                                Longsword|Dagger|Longbow
##                                                                                                                                                                                                                                         spells
## 1                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 2                                Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|Summon Construct*4|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                         <NA>
##   castingStat
## 1         Cha
## 2         Cha
## 3         Int
## 4         Cha
## 5         Cha
## 6         Cha
##                                                               choices
## 1 metamagic/Twinned Spell*Subtle Spell*Quickened Spell*Extended Spell
## 2                metamagic/Twinned Spell*Subtle Spell*Quickened Spell
## 3                                              fighting style/Defense
## 4                            metamagic/Transmuted Spell*Seeking Spell
## 5                            metamagic/Transmuted Spell*Seeking Spell
## 6                                                                <NA>
##         country countryCode processedAlignment good lawful processedRace
## 1        Canada          CA               <NA>   NA     NA         Dwarf
## 2        Canada          CA               <NA>   NA     NA         Dwarf
## 3 United States          US                 CG   NA     NA         Human
## 4        Brazil          BR               <NA>   NA     NA       Aasimar
## 5        Brazil          BR               <NA>   NA     NA       Aasimar
## 6        Canada          CA               <NA>   NA     NA      Half-Elf
##                                                                                                                                                                                                                                processedSpells
## 1                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 2                                                 Alarm*1|Protection from Evil and Good*1|Command*1|Heroism*1|Aid*2|Lesser Restoration*2|Dispel Magic*3|Protection from Energy*3|Freedom of Movement*4|*|Greater Restoration*5|Wall of Force*5
## 3 Prestidigitation*0|Mage Hand*0|Blade Ward*0|Light*0|Alarm*1|Burning Hands*1|Unseen Servant*1|Chromatic Orb*1|Magic Missile*1|Identify*1|Detect Magic*1|Absorb Elements*1|Armor of Agathys*1|Shatter*2|Misty Step*2|Enlarge/Reduce*2|Catnap*3
## 4                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 5                                                                                                                                                                                                     Burning Hands*1|Cure Wounds*1|Darkness*2
## 6                                                                                                                                                                                                                                         <NA>
##                        processedWeapons
## 1                Crossbow, Light|Dagger
## 2                Crossbow, Light|Dagger
## 3 Longsword|Dagger|Crossbow, Light|Mace
## 4                                Dagger
## 5                                Dagger
## 6              Longsword|Dagger|Longbow

I do want to make it clear I want to keep all this data because you never know when you might need it. You never know when you just NEED to know what percentage of DnD players are Eldritch Knights…But since I’m not going to use all of it, I want to make a unique datafram with only the information I’m interested in:

dnd_subset <- dnd_data_clean %>% select(race, justClass)

head(dnd_subset)
##             race       justClass
## 1     Hill Dwarf Sorcerer|Cleric
## 2     Hill Dwarf Sorcerer|Cleric
## 3          Human         Fighter
## 4 Fallen Aasimar  Sorcerer|Druid
## 5 Fallen Aasimar  Sorcerer|Druid
## 6        Variant            Bard

Now we’re getting somewhere. The last thing I see that’s a problem is that some characters have dual-classed. How do I treat that? In DnD parlance, the first listed class is the primary class, and the second is a secondary classing. Now we’ll want to make a new column that splits up any dual classes by looking for a pipe and extracting info after the pipe:

dnd_subset <- dnd_subset %>%
  mutate(`Secondary Class` = ifelse(grepl("\\|", justClass),
                                      sub(".*\\|", "", justClass),
                                      NA))

head(dnd_subset)
##             race       justClass Secondary Class
## 1     Hill Dwarf Sorcerer|Cleric          Cleric
## 2     Hill Dwarf Sorcerer|Cleric          Cleric
## 3          Human         Fighter            <NA>
## 4 Fallen Aasimar  Sorcerer|Druid           Druid
## 5 Fallen Aasimar  Sorcerer|Druid           Druid
## 6        Variant            Bard            <NA>

That bit of research worked well. But I still have the extra secondary data in justClass and need to remove it. And I want to rename “justClass” to “Primary Class”

dnd_subset <- dnd_subset %>%
  mutate(justClass = sub("\\|.*", "", justClass)) %>%
  rename(`Primary Class` = justClass)
head(dnd_subset)
##             race Primary Class Secondary Class
## 1     Hill Dwarf      Sorcerer          Cleric
## 2     Hill Dwarf      Sorcerer          Cleric
## 3          Human       Fighter            <NA>
## 4 Fallen Aasimar      Sorcerer           Druid
## 5 Fallen Aasimar      Sorcerer           Druid
## 6        Variant          Bard            <NA>

And to just clean this up a little bit more, so we can capture who does and doesn’t have a Second class:

dnd_subset <- dnd_subset %>%
  mutate(`Secondary Class` = ifelse(is.na(`Secondary Class`) | `Secondary Class` == "", "none", `Secondary Class`))
head(dnd_subset)
##             race Primary Class Secondary Class
## 1     Hill Dwarf      Sorcerer          Cleric
## 2     Hill Dwarf      Sorcerer          Cleric
## 3          Human       Fighter            none
## 4 Fallen Aasimar      Sorcerer           Druid
## 5 Fallen Aasimar      Sorcerer           Druid
## 6        Variant          Bard            none

Let’s start answering some questions. We have 10,867 records here. So this is a pretty solid statistical sample.

##DnD Question 1 - What’s the most popular race?

library(ggplot2)

ggplot(dnd_subset, aes(x = race)) +
  geom_bar(fill = "skyblue", color = "black") +
  labs(title = "Distribution of Race", x = "Race", y = "Count") +
  theme_minimal()

That was unhelpful. I had no idea there were that many race options in the game. Color me impressed. I guess we’ll just tabulate this differently. Let’s make a new dataset that lists all races and then counts each instance and descends from most to least. Then we can graph the top ten.

dnd_race_count <- dnd_subset %>%
  count(race) %>%       
  rename(Count = n) %>%   
  arrange(desc(Count))    

print(dnd_race_count)
##                                   race Count
## 1                                Human  2068
## 2                             Half-Elf   688
## 3                             Wood Elf   572
## 4                           Dragonborn   486
## 5                             Tiefling   449
## 6                             Half-Orc   425
## 7                             High Elf   378
## 8                       Mountain Dwarf   368
## 9                               Tabaxi   320
## 10                          Hill Dwarf   319
## 11                  Lightfoot Halfling   318
## 12                        Forest Gnome   219
## 13                             Goliath   198
## 14                           Warforged   194
## 15                              Goblin   181
## 16                            Dark Elf   179
## 17                             Firbolg   167
## 18                              Turtle   139
## 19                   Protector Aasimar   138
## 20                             Variant   118
## 21                          Rock Gnome   116
## 22                               Kenku   106
## 23                          Lizardfolk   106
## 24                              Triton   103
## 25                              Kobold   102
## 26                    Ghastly Halfling    97
## 27                            Minotaur    95
## 28                      Stout Halfling    90
## 29                      Fallen Aasimar    89
## 30                             Bugbear    82
## 31                         Fire Genasi    77
## 32                     Scourge Aasimar    70
## 33                          Deep Gnome    69
## 34                        Serpentblood    69
## 35                          Shadow Elf    68
## 36                                 Orc    63
## 37                           Aarakocra    60
## 38                          Changeling    60
## 39                         Eladrin ToF    59
## 40                              Custom    52
## 41                      Lineage - Glas    41
## 42                        Water Genasi    41
## 43                         Elephantine    40
## 44                             Sea Elf    40
## 45                          Air Genasi    39
## 46                            Birdfolk    39
## 47                             Aasimar    38
## 48                             Centaur    37
## 49                           Hobgoblin    34
## 50                        Earth Genasi    33
## 51                      Lineage - Zari    31
## 52                       Animal Hybrid    28
## 53                        Dreamtouched    28
## 54                      Lineage - Meph    28
## 55                               Satyr    25
## 56                             Leonine    23
## 57                      Lineage - Levi    22
## 58                      Lineage - Fier    20
## 59                           Chromatic    19
## 60                          Gray Dwarf    19
## 61                         Eladrin DMG    18
## 62                           Detection    17
## 63                             Dhampir    17
## 64                               Fairy    15
## 65                          Geth - Yan    15
## 66                            Harefolk    15
## 67                                 Gem    12
## 68                      Lineage - Disp    12
## 69                            Pale Elf    12
## 70                      Reborn Lineage    11
## 71                            Sentinel    11
## 72                              Tortle    11
## 73                            Half-elf    10
## 74                              Stride    10
## 75                          Asgardian      9
## 76                             Healing     9
## 77                       Hexed Lineage     9
## 78                      Lineage - Baal     9
## 79                      Lineage - Mamm     9
## 80                      Lotus Halfling     9
## 81                             Passage     9
## 82                       Human Variant     8
## 83                               Storm     8
## 84                           Violetken     8
## 85                           Dire wolf     7
## 86                                Hunt     7
## 87                              Making     7
## 88                                Drow     6
## 89                          Geth - Zer     6
## 90                           Githzerai     6
## 91                       Half-Dark Elf     6
## 92                         Hospitality     6
## 93                           Kalashtar     6
## 94                             Loxodon     6
## 95                            Metallic     6
## 96                               Tooth     6
## 97            Dragonborn (Draconblood)     5
## 98                               Envoy     5
## 99                    Lineage - Zariel     5
## 100                             Shadow     5
## 101                Shifter (Wildhunt)      5
## 102                            Warding     5
## 103                            Eladrin     4
## 104                           Half Elf     4
## 105                             Humana     4
## 106                   Variant Half Elf     4
## 107                           Vedalken     4
## 108                                 CA     3
## 109                     Copper Dragon      3
## 110                              Dwarf     3
## 111                 Ghostwise Halfling     3
## 112                             Goblyn     3
## 113                              Grung     3
## 114                    Half-Elf (Wood)     3
## 115                       Half-Khindra     3
## 116                               Hide     3
## 117                     Imaskari Human     3
## 118                             Leonin     3
## 119                        Meadowguard     3
## 120                            Myconid     3
## 121                    Ratfolk (white)     3
## 122                           Scribing     3
## 123                            Shifter     3
## 124                           Succubus     3
## 125                   Variant Half-Elf     3
## 126                      Variant Human     3
## 127                                 BR     2
## 128                 Dhampir (Illithid)     2
## 129                          Div Spawn     2
## 130                            Finding     2
## 131                  Giant Wolf Spider     2
## 132                              Gnoll     2
## 133                           Gray-Elf     2
## 134                       Grove Cervan     2
## 135                         Half-Dwarf     2
## 136                     Half-Elf(Drow)     2
## 137                      High Moon Elf     2
## 138                      Human (Dwarf)     2
## 139            Human (Mark of Finding)     2
## 140                     Human - Custom     2
## 141                       Human Hybrid     2
## 142            Human Tiefling (Glasya)     2
## 143                 Human Undertownian     2
## 144        Human Undertownian Ancestor     2
## 145                         Juggeenaut     2
## 146                         Juggernaut     2
## 147                        Juggernaut      2
## 148                                Kor     2
## 149                          Longtooth     2
## 150                  Longtooth Shifter     2
## 151                            Luxodon     2
## 152                  Making Mark Human     2
## 153                           Meio-Orc     2
## 154                         Minotaurus     2
## 155                               Ogre     2
## 156                            Owlfolk     2
## 157                           Plasmoid     2
## 158                             Reborn     2
## 159                  Reborn (high elf)     2
## 160                        Sprite Lord     2
## 161                                 US     2
## 162                     Variant Human      2
## 163                           Wald Elf     2
## 164                  Wand Golem Thingy     2
## 165                               Wolf     2
## 166                   Wood Elf(Aereni)     2
## 167                      Wooden Puppet     2
## 168                            Yuan-Ti     2
## 169                  Yuan-ti pureblood     2
## 170                            firbolg     2
## 171                      1/2 Elf--Var.     1
## 172                          Aaracokra     1
## 173                           Aassling     1
## 174               Animal Hybrid (Bear)     1
## 175                     Asgardian (MD)     1
## 176                Augmented Combatant     1
## 177                     Axolotl Hybrid     1
## 178                           Bearfolk     1
## 179                              Beast     1
## 180                        Bird Hybrid     1
## 181                             Blefus     1
## 182                 Bonedevil Tiefling     1
## 183                           Boofreen     1
## 184                            Caliban     1
## 185                       Cambiaformas     1
## 186                 Caprien (Goatfolk)     1
## 187                     Changeling 2.0     1
## 188              Child Of The Old Ones     1
## 189                            Chitine     1
## 190                   Chromatic(green)     1
## 191                            Cloaker     1
## 192                          Clockwork     1
## 193                   Clockwork Dancer     1
## 194        Custom - Desert Lizardfolk      1
## 195 Custom - Fish in a suit of armour      1
## 196                         Desert Elf     1
## 197                           Dhampire     1
## 198              Dire Wolf Hengenyokai     1
## 199                           Direwolf     1
## 200                               Doll     1
## 201                       Dragonforged     1
## 202                             Drakin     1
## 203                              Drow      1
## 204                            Duergar     1
## 205                Earth Genasi (MotM)     1
## 206                        Eladrin Elf     1
## 207                                Elf     1
## 208            Fallen Aasimar (goblin)     1
## 209                       Flying Snake     1
## 210                      Forest Dragon     1
## 211                       Freed Undead     1
## 212                           Frog man     1
## 213                     Gem (Sapphire)     1
## 214                        Genasi-null     1
## 215                          Gethzerai     1
## 216                              Ghost     1
## 217            Glowing One (Celestial)     1
## 218                              Goron     1
## 219                        Grovetender     1
## 220                           Halb-Ork     1
## 221                Half Elf/Half Siren     1
## 222                   Half Orc Finding     1
## 223               Half Orc Fire Gensai     1
## 224                  Half Sea High Elf     1
## 225                    Half-Elf (Drow)     1
## 226                   Half-Elf Variant     1
## 227                         Half-Giant     1
## 228          Half-Orc (Custom lineage)     1
## 229                     Halfling Gnome     1
## 230  Hengenyokai- Sabre Toothed Tiger      1
## 231                  High Elf/Skeleton     1
## 232                   High Elf/Vampire     1
## 233                       Higher Elfin     1
## 234                         Hill Giant     1
## 235                                Hin     1
## 236             Human (Elven Ancestry)     1
## 237                    Human (Variant)     1
## 238                   Human (Variant)      1
## 239              Human (dragon blood?)     1
## 240                    Human (variant)     1
## 241                      Human 1 inspo     1
## 242                       Human Age 31     1
## 243                         Human Var.     1
## 244                  Human Walldweller     1
## 245                 Human/Marsh-Wiggle     1
## 246                         Human/Shou     1
## 247                         Jackalfolk     1
## 248               Juggernaut Warforged     1
## 249                     Kenku/half elf     1
## 250                            Kitsune     1
## 251                              Kitty     1
## 252                   Kobold (Volvo's)     1
## 253          Kobold Scout (Blue Scale)     1
## 254                              Krijg     1
## 255                      Lightfoot Hin     1
## 256                   Lineage - Fierna     1
## 257                   Lineage - Glasya     1
## 258                           Locathah     1
## 259                           Loxodon      1
## 260                           Luxodont     1
## 261              Mark of Warding Dwarf     1
## 262                           Mechlock     1
## 263                         Medio Orco     1
## 264                             Medusa     1
## 265                 Meio-Elfo Variante     1
## 266            Minotaur (Death Knight)     1
## 267                           Moon Elf     1
## 268                           Mothfolk     1
## 269                    Mountain Dwarf      1
## 270                            Mycanid     1
## 271                               Naga     1
## 272           Nature Tiefling (Dryad?)     1
## 273                           Nautilid     1
## 274                            Octopus     1
## 275                     Ogrillon Zombi     1
## 276                     Olympian, Ares     1
## 277                               Oryk     1
## 278                           Pandaren     1
## 279                            Penguin     1
## 280 Protector Aasimar: Iluskan Parents     1
## 281                Ratfolk(underfolk)      1
## 282                     Shadar-Kai Elf     1
## 283                     Shadar-kai elf     1
## 284                          Sharkfolk     1
## 285                            Sheikah     1
## 286                              Shiya     1
## 287                       Simic Hybrid     1
## 288                           Skeleton     1
## 289                        Skellington     1
## 290                         SpiderFolk     1
## 291                          Staemmige     1
## 292                Swiftstride Shifter     1
## 293                    Synthetic Human     1
## 294                             Tauren     1
## 295                Teifling - Levistus     1
## 296                          Tieffelin     1
## 297                  Tiefling (Mammon)     1
## 298                  Tiefling - Zariel     1
## 299                  Tiefling Levistus     1
## 300            Tiefling Lineage - Baal     1
## 301           Tortle - Razorback Tribe     1
## 302                              Troll     1
## 303                         Turtlebear     1
## 304                         UA Eladrin     1
## 305                             Undead     1
## 306                            Unlucky     1
## 307                            Vampire     1
## 308                 Vampire (Zendikar)     1
## 309                    Variant HalfElf     1
## 310                          Vedalken      1
## 311                     Warforged Bear     1
## 312                 Warforged Official     1
## 313                Water Genasi 27 Ans     1
## 314                         Winter Orc     1
## 315                             Yordle     1
## 316                  Yuan-ti Pureblood     1
## 317                           moon Elf     1
## 318                               naga     1
## 319                               nyad     1
## 320                             tortle     1
## 321                  warforged (envoy)     1
library(ggplot2)

top_races <- dnd_race_count %>% 
  slice_max(Count, n = 10)

ggplot(top_races, aes(x = reorder(race, Count), y = Count)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Races by Count", x = "Race", y = "Count") +
  theme_minimal()

This is cool. And it makes a lot of sense. In the game, humans get +1 to all stats for a total of +6 overall. Other races have a total of +3 to a stat. The “upside” to choosing a race other than human is that while humans have a blanket +1 to every stat, each race has one stat that receives a +2 perk. In general, this makes humans easier to play for new players. Most new players are encouraged to start with a human and then choose a more interesting race on the “next” game. I wonder what the percentage is between “human” and “all others:

human_vs_other <- dnd_race_count %>%
  mutate(race_group = ifelse(tolower(race) == "human", "Human", "Other")) %>%
  group_by(race_group) %>%
  summarise(Total_Count = sum(Count))

human_vs_other
## # A tibble: 2 × 2
##   race_group Total_Count
##   <chr>            <int>
## 1 Human             2068
## 2 Other             8808
total_count <- sum(human_vs_other$Total_Count)

human_count <- human_vs_other %>% 
  filter(race_group == "Human") %>% 
  pull(Total_Count)

human_percentage <- (human_count / total_count) * 100

human_percentage
## [1] 19.01434

That’s much more in line with what I thought. The bar chart makes it look like Humans are far and away the top choice. Which they are for a single racial class. However, the reality is that it’s going to be the largest instance because it is the most friendly to beginner players. So it’s almost better to think of “human” as beginners and all others as veteran players. I would believe that 81% of DnD players more likely to play more complex and challenging characters.

D&D Data Question 2 - What’s the most played character class?

dnd_class_count <- dnd_subset %>%
  count(`Primary Class`) %>%       
  rename(Count = n) %>%           
  arrange(desc(Count))             

dnd_class_count
##        Primary Class Count
## 1            Fighter  1463
## 2              Rogue  1308
## 3             Cleric  1003
## 4          Barbarian   956
## 5            Paladin   891
## 6             Ranger   749
## 7             Wizard   747
## 8              Druid   722
## 9               Monk   716
## 10              Bard   701
## 11          Sorcerer   658
## 12           Warlock   649
## 13         Artificer   218
## 14    Revised Ranger    44
## 15      Blood Hunter    26
## 16            Mystic    13
## 17             Dwarf     2
## 18        Gunslinger     2
## 19          Half-Elf     2
## 20      Battle Clown     1
## 21 Crafting Commoner     1
## 22        Dragonborn     1
## 23            Genasi     1
## 24             Human     1
## 25          commoner     1
top_classes <- dnd_class_count %>%
  slice_max(Count, n = 10)

top_classes
##    Primary Class Count
## 1        Fighter  1463
## 2          Rogue  1308
## 3         Cleric  1003
## 4      Barbarian   956
## 5        Paladin   891
## 6         Ranger   749
## 7         Wizard   747
## 8          Druid   722
## 9           Monk   716
## 10          Bard   701
ggplot(top_classes, aes(x = reorder(`Primary Class`, Count), y = Count)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Primary Classes", x = "Primary Class", y = "Count") +
  theme_minimal()

Answer: Fighter. That makes a lot of sense. It’s one of the most flexible classes to generally do what you want to do. About all you can’t do well as a “fighter” is cast magic. But magic users are hard because they get real squishy, real fast. OK. I’ve spent way too much time on DnD data. Time to tackle CHeese.

Cheese Data

OK. I have to get more focused on what I’m doing here. I don’t want this to ramble too much. My main question here was:

Which nation makes the most chesses on earth?

Let’s begin be tidying up our data. There are a few columns here that don’t provide us with much information, nor do we really need it because of how little infor we get out of it. So let’s scrub some data:

cheese_data_clean <- cheesedata %>%
  select(-matches("^(url|region|family|fat_content|calcium_content|synonyms|alt_spellings|producers|vegetarian|vegan)"))

head(cheese_data_clean)
##               cheese  milk     country                       type
## 1         Aarewasser   cow Switzerland                  semi-soft
## 2   Abbaye de Belloc sheep      France         semi-hard, artisan
## 3   Abbaye de Belval   cow      France                  semi-hard
## 4  Abbaye de Citeaux   cow      France semi-soft, artisan, brined
## 5    Abbaye de Tamié   cow      France              soft, artisan
## 6 Abbaye de Timadeuc   cow      France                  semi-hard
##                 texture    rind       color                flavor
## 1               buttery  washed      yellow                 sweet
## 2   creamy, dense, firm natural      yellow         burnt caramel
## 3               elastic  washed       ivory                  <NA>
## 4 creamy, dense, smooth  washed       white acidic, milky, smooth
## 5  creamy, open, smooth  washed       white         fruity, nutty
## 6                  soft  washed pale yellow         salty, smooth
##               aroma
## 1           buttery
## 2          lanoline
## 3          aromatic
## 4 barnyardy, earthy
## 5 perfumed, pungent
## 6             nutty

Now we need to update this so that any empty cells are given the null of “NA”

cheese_data_clean <- cheese_data_clean %>%
  mutate(across(where(is.character), ~na_if(., "")))

head(cheese_data_clean)
##               cheese  milk     country                       type
## 1         Aarewasser   cow Switzerland                  semi-soft
## 2   Abbaye de Belloc sheep      France         semi-hard, artisan
## 3   Abbaye de Belval   cow      France                  semi-hard
## 4  Abbaye de Citeaux   cow      France semi-soft, artisan, brined
## 5    Abbaye de Tamié   cow      France              soft, artisan
## 6 Abbaye de Timadeuc   cow      France                  semi-hard
##                 texture    rind       color                flavor
## 1               buttery  washed      yellow                 sweet
## 2   creamy, dense, firm natural      yellow         burnt caramel
## 3               elastic  washed       ivory                  <NA>
## 4 creamy, dense, smooth  washed       white acidic, milky, smooth
## 5  creamy, open, smooth  washed       white         fruity, nutty
## 6                  soft  washed pale yellow         salty, smooth
##               aroma
## 1           buttery
## 2          lanoline
## 3          aromatic
## 4 barnyardy, earthy
## 5 perfumed, pungent
## 6             nutty

Perfect. Now we have solid data. But it’s still wide and we need to make it long to get the answer we seek. And I need to make the question harder so I can have fun exploding columns. Let’s keep cheese, country, and type:

cheese_data_subset <- cheese_data_clean %>%
  select(cheese, country, type)


head(cheese_data_subset)
##               cheese     country                       type
## 1         Aarewasser Switzerland                  semi-soft
## 2   Abbaye de Belloc      France         semi-hard, artisan
## 3   Abbaye de Belval      France                  semi-hard
## 4  Abbaye de Citeaux      France semi-soft, artisan, brined
## 5    Abbaye de Tamié      France              soft, artisan
## 6 Abbaye de Timadeuc      France                  semi-hard

Great. Now I want to generate new columns for type such that each column only has one descriptor:

cheese_data_subset <- cheese_data_subset %>%
  mutate(type = trimws(type)) %>% 
  separate(type, into = c("type1", "type2", "type3"), sep = ",", fill = "right", extra = "drop")
  
head(cheese_data_subset)
##               cheese     country     type1    type2   type3
## 1         Aarewasser Switzerland semi-soft     <NA>    <NA>
## 2   Abbaye de Belloc      France semi-hard  artisan    <NA>
## 3   Abbaye de Belval      France semi-hard     <NA>    <NA>
## 4  Abbaye de Citeaux      France semi-soft  artisan  brined
## 5    Abbaye de Tamié      France      soft  artisan    <NA>
## 6 Abbaye de Timadeuc      France semi-hard     <NA>    <NA>

OK this is getting fun. type1 really seams to be about cheese hardness. So let’s rename the columns and then sort data accordingly:

cheese_data_subset <- cheese_data_subset %>%
  rename(hardness = type1,
         category = type2,
         type = type3)

head(cheese_data_subset)
##               cheese     country  hardness category    type
## 1         Aarewasser Switzerland semi-soft     <NA>    <NA>
## 2   Abbaye de Belloc      France semi-hard  artisan    <NA>
## 3   Abbaye de Belval      France semi-hard     <NA>    <NA>
## 4  Abbaye de Citeaux      France semi-soft  artisan  brined
## 5    Abbaye de Tamié      France      soft  artisan    <NA>
## 6 Abbaye de Timadeuc      France semi-hard     <NA>    <NA>
unique_hardness <- unique(cheese_data_subset$hardness)
unique_category <- unique(cheese_data_subset$category)
unique_type <- unique(cheese_data_subset$type)

cat("Unique Hardness values:\n")
## Unique Hardness values:
print(unique_hardness)
##  [1] "semi-soft"  "semi-hard"  "soft"       "hard"       "firm"      
##  [6] "fresh soft" "semi-firm"  NA           "fresh firm" "artisan"
cat("\nUnique Category values:\n")
## 
## Unique Category values:
print(unique_category)
##  [1] NA               " artisan"       " soft-ripened"  " brined"       
##  [5] " blue-veined"   " smear-ripened" " semi-soft"     " processed"    
##  [9] " firm"          " hard"          " whey"          " semi-hard"    
## [13] " organic"
cat("\nUnique Type values:\n")
## 
## Unique Type values:
print(unique_type)
## [1] NA               " brined"        " organic"       " blue-veined"  
## [5] " artisan"       " soft-ripened"  " processed"     " smear-ripened"

Now we need to start moving data around to be in the correct columns.

cheese_data_subset <- cheese_data_subset %>%
  mutate(
    category = if_else(
      hardness == "artisan",
      if_else(category %in% c("empty", "", NA_character_), 
              "artisan", 
              paste(category, "artisan", sep = ", ")),
      category
    ),
    hardness = if_else(hardness == "artisan", NA_character_, hardness)
  )

head(cheese_data_subset)
##               cheese     country  hardness category    type
## 1         Aarewasser Switzerland semi-soft     <NA>    <NA>
## 2   Abbaye de Belloc      France semi-hard  artisan    <NA>
## 3   Abbaye de Belval      France semi-hard     <NA>    <NA>
## 4  Abbaye de Citeaux      France semi-soft  artisan  brined
## 5    Abbaye de Tamié      France      soft  artisan    <NA>
## 6 Abbaye de Timadeuc      France semi-hard     <NA>    <NA>

Ok this worked so well that I decided I wanted to scan all three columns for all keywords. Then I wanted to seperate each key word into the right column. I thought this would be easy. It was not. I turned to the internet for help and discovered some new libraries to get this done. I don’t know if this is OK or not…but it tidies my data how I wanted it. So I guess there’s that.

library(stringr)
library(purrr)

hardness_keywords <- c("semi-soft", "semi-hard", "soft", "hard", "firm", "fresh soft", "semi-firm", "fresh firm")
category_keywords <- c("artisan", "processed", "organic", "smear-ripened")
type_keywords     <- c("soft-ripened", "brined", "blue-veined", "smear-ripened", "whey")

reassign_keywords <- function(hardness_val, category_val, type_val) {
  all_vals <- c(hardness_val, category_val, type_val)
  all_vals <- unlist(str_split(all_vals, ","))
  all_vals <- str_trim(all_vals)
  all_vals <- all_vals[all_vals != ""]
  
  new_hardness <- paste(intersect(all_vals, hardness_keywords), collapse = ", ")
  new_category <- paste(intersect(all_vals, category_keywords), collapse = ", ")
  new_type     <- paste(intersect(all_vals, type_keywords), collapse = ", ")
  
  tibble(new_hardness = new_hardness,
         new_category = new_category,
         new_type = new_type)
}

new_values <- cheese_data_subset %>% 
  select(hardness, category, type) %>%
  pmap_dfr(reassign_keywords)

cheese_data_subset_updated <- cheese_data_subset %>%
  bind_cols(new_values) %>%
  select(-hardness, -category, -type) %>%
  rename(
    hardness = new_hardness,
    category = new_category,
    type = new_type
  )

cheese_data_subset_updated <- cheese_data_subset_updated %>%
  mutate(across(everything(), ~ifelse(. == "", NA_character_, .)))

head(cheese_data_subset_updated)
##               cheese     country  hardness category   type
## 1         Aarewasser Switzerland semi-soft     <NA>   <NA>
## 2   Abbaye de Belloc      France semi-hard  artisan   <NA>
## 3   Abbaye de Belval      France semi-hard     <NA>   <NA>
## 4  Abbaye de Citeaux      France semi-soft  artisan brined
## 5    Abbaye de Tamié      France      soft  artisan   <NA>
## 6 Abbaye de Timadeuc      France semi-hard     <NA>   <NA>

PERFECT. This is the coolest thing I’ve done in R. I’m really happy with this. OK. Now to get a bar chart of cheeses by country of origin:

country_counts <- cheese_data_subset_updated %>%
  count(country) %>%          
  arrange(desc(n)) %>%        
  slice_max(n, n = 20)        

ggplot(country_counts, aes(x = reorder(country, n), y = n)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 20 Countries by Number of Rows", 
       x = "Country", 
       y = "Count") +
  theme_minimal()

Oh crud. Turns our the country column was also comma dilineated. Let’s truncate that and redo the graph:

cheese_data_subset_updated <- cheese_data_subset_updated %>%
  mutate(country = sub(",.*", "", country))

head(cheese_data_subset_updated$country)
## [1] "Switzerland" "France"      "France"      "France"      "France"     
## [6] "France"
country_counts <- cheese_data_subset_updated %>%
  count(country) %>%    
  arrange(desc(n)) %>%      
  slice_max(n, n = 20)        


ggplot(country_counts, aes(x = reorder(country, n), y = n)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 20 Countries by Number of Rows", 
       x = "Country", 
       y = "Count") +
  theme_minimal()

Better. I suppose if I’m being thorough. I should collapse “Ireland” “Scotland” and “England” all into “United Kingdom” and see how it looks. It’ll have an effect…

cheese_data_subset_updated <- cheese_data_subset_updated %>%
  mutate(country = case_when(
    country %in% c("Ireland", "Scotland", "England") ~ "United Kingdom",
    TRUE ~ country
  ))

head(cheese_data_subset_updated$country)
## [1] "Switzerland" "France"      "France"      "France"      "France"     
## [6] "France"

Let’s try the chart one more time:

country_counts <- cheese_data_subset_updated %>%
  count(country) %>%        
  arrange(desc(n)) %>%        
  slice_max(n, n = 20)        

ggplot(country_counts, aes(x = reorder(country, n), y = n)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 20 Countries by Number of Rows", 
       x = "Country", 
       y = "Count") +
  theme_minimal()

Who in the world makes the most cheese? The United States. King of Cheese.

Onto the last bit of data.

##Rolling Stone Top Album Data

I am refusing to go off on any wild data goose chases. I’m like 5 hours into this project and ready to wrap it up. So here’s the question we want to answer, knowing that we have the top album data from 2003 to 2020. I want to know what the differential in movement is and I want to know the top movers up and top movers down. Like, over the course of 17 years, who got more popular and who got less popular.

##Question - Who were the top movers up and down over the course of 17 years?

Let’s begin by scrubbing data that has no real meaning for us:

album_data_clean <- albumdata %>%
  select(-matches("^(Album.ID|Album.ID.Quoted)"))

head(album_data_clean)
##        Sort.Name     Clean.Name                      Album X2003.Rank.Old
## 1 Sinatra, Frank  Frank Sinatra     In the Wee Small Hours            101
## 2    Diddley, Bo     Bo Diddley Bo Diddley / Go Bo Diddley            212
## 3 Presley, Elvis  Elvis Presley              Elvis Presley             56
## 4 Sinatra, Frank  Frank Sinatra Songs for Swingin' Lovers!            302
## 5 Little Richard Little Richard      Here's Little Richard             50
## 6        Beyonce        Beyonce                   Lemonade             NA
##   X2003.Rank X2012.Rank X2020.Rank X2020.2003.Differential Release.Year
## 1        100        101        282                    -182         1955
## 2        214        216        455                    -241         1955
## 3         55         56        332                    -277         1956
## 4        306        308         NA                    -195         1956
## 5         50         50        227                    -177         1957
## 6         NA         NA         32                     469         2016
##                   Album.Genre Album.Type Wks.on.Billboard
## 1               Big Band/Jazz     Studio               14
## 2 Rock n' Roll/Rhythm & Blues     Studio                -
## 3 Rock n' Roll/Rhythm & Blues     Studio              100
## 4               Big Band/Jazz     Studio                ?
## 5                                 Studio                5
## 6                                 Studio               87
##   Peak.Billboard.Position Spotify.Popularity
## 1                       2                 48
## 2                     201                 50
## 3                       1                 58
## 4                       2                 62
## 5                      13                 64
## 6                       1                 73
##                            Spotify.URI
## 1 spotify:album:3GmwKB1tgPZgXeRJZSm9WX
## 2 spotify:album:1cbtDEwxCjMhglb49OgNBR
## 3 spotify:album:7GXP5OhYyPVLmcVfO9Iqin
## 4 spotify:album:4kca7vXd1Wo5GE2DMafvMc
## 5 spotify:album:18tV6PLXYvVjsdOVk0S7M8
## 6 spotify:album:7dK54iZuOxXFarGhXwEXfF
##                                  Chartmetric.Link Artist.Member.Count
## 1 https://app.chartmetric.com/album/3115110/about                   1
## 2 https://app.chartmetric.com/album/4137525/about                   1
## 3 https://app.chartmetric.com/album/1338334/about                   1
## 4 https://app.chartmetric.com/album/1075976/about                   1
## 5 https://app.chartmetric.com/album/3128391/about                   1
## 6 https://app.chartmetric.com/album/4160994/about                   1
##   Artist.Gender Artist.Birth.Year.Sum Debut.Album.Release.Year
## 1          Male                  1915                     1946
## 2          Male                  1928                     1955
## 3          Male                  1935                     1956
## 4          Male                  1915                     1946
## 5          Male                  1932                     1957
## 6        Female                  1981                     2003
##   Avg..Age.at.Top.500.Album Years.Between.Debut.and.Top.500.Album X X.1
## 1                        40                                     9      
## 2                        27                                     0      
## 3                        21                                     0      
## 4                        41                                    10      
## 5                        25                                     0      
## 6                        35                                    13

OK….Now all this is interesting, but we really only need a dataframe that covers what we’re interested in, which is Clean.Name, Album, and X2020.2003.Differential

album_subset <- album_data_clean %>%
  select(Clean.Name, Album, X2020.2003.Differential)

head(album_subset)
##       Clean.Name                      Album X2020.2003.Differential
## 1  Frank Sinatra     In the Wee Small Hours                    -182
## 2     Bo Diddley Bo Diddley / Go Bo Diddley                    -241
## 3  Elvis Presley              Elvis Presley                    -277
## 4  Frank Sinatra Songs for Swingin' Lovers!                    -195
## 5 Little Richard      Here's Little Richard                    -177
## 6        Beyonce                   Lemonade                     469

Terrific. Let’s organize by descending.

top_positive <- album_subset %>%
  filter(X2020.2003.Differential > 0) %>%     
  slice_max(order_by = X2020.2003.Differential, n = 15)

top_negative <- album_subset %>%
  filter(X2020.2003.Differential < 0) %>%     
  slice_min(order_by = X2020.2003.Differential, n = 15)

top_positive
##        Clean.Name                             Album X2020.2003.Differential
## 1      Kanye West My Beautiful Dark Twisted Fantasy                     484
## 2  Kendrick Lamar               To Pimp a Butterfly                     482
## 3         Beyonce                          Lemonade                     469
## 4   Amy Winehouse                     Back to Black                     468
## 5        D'Angelo                            Voodoo                     460
## 6         OutKast                          Aquemini                     452
## 7       Kate Bush                    Hounds of Love                     433
## 8      Kanye West               The College Dropout                     427
## 9     Frank Ocean                            Blonde                     422
## 10        Beyonce                           Beyonce                     420
## 11          Jay-Z                     The Blueprint                     414
## 12    Erykah Badu                           Baduizm                     412
## 13  Missy Elliott                     Supa Dupa Fly                     408
## 14      Radiohead                             Kid A                     408
## 15          Drake                         Take Care                     406
top_negative
##                      Clean.Name                                         Album
## 1                Robert Johnson                       The Complete Recordings
## 2                    Boz Scaggs                                    Boz Scaggs
## 3   Ian Dury and the Blockheads                         New Boots & Panties!!
## 4                  Bonnie Raitt                                    Give It Up
## 5                     B.B. King                      Live in Cook County Jail
## 6                     Bob Dylan                                  Modern Times
## 7             Bruce Springsteen                                    The Rising
## 8                     Radiohead                                      Amnesiac
## 9               Vampire Weekend                               Vampire Weekend
## 10                       M.I.A.                                          Kala
## 11               Arctic Monkeys Whatever People Say I Am, That's What I'm Not
## 12                         MGMT                          Oracular Spectacular
## 13            The White Stripes                             White Blood Cells
## 14            My Morning Jacket                                             Z
## 15 Creedence Clearwater Revival               Chronicle: The 20 Greatest Hits
## 16                    Manu Chao                   Proxima estacion: Esperanza
## 17               The Beach Boys                            The Smile Sessions
##    X2020.2003.Differential
## 1                     -501
## 2                     -501
## 3                     -501
## 4                     -501
## 5                     -501
## 6                     -501
## 7                     -501
## 8                     -501
## 9                     -501
## 10                    -501
## 11                    -501
## 12                    -501
## 13                    -501
## 14                    -501
## 15                    -501
## 16                    -501
## 17                    -501

OK. Well that was interesting. I had to take a second look at the data, because I don’t understand how we can have 15 records at -501. Turns out that if an album was not on the chart in 2003 OR again in 2020 it had a net score of -501. This is possible because they are on the list from the stats pulled in 2012. So all 15 albums listed in top_negative all popped onto the list in 2012 only. They weren’t there in 2003 and were gone again by 2012. I guess the differntial column is less useful than I would have hoped?

SO let’s rerun our dataframe with the 2003 and 2020 data. I’m not interested in albums that were on and off between those two periods. I also want to calculate my own differential. So really, I’m asking that of the 2003 albums that were still on the list by 2020…who were the top movers. So we need to reclean our data a bit:

album_subset <- album_data_clean %>%
  select(Clean.Name, Album, X2003.Rank, X2020.Rank)

head(album_subset)
##       Clean.Name                      Album X2003.Rank X2020.Rank
## 1  Frank Sinatra     In the Wee Small Hours        100        282
## 2     Bo Diddley Bo Diddley / Go Bo Diddley        214        455
## 3  Elvis Presley              Elvis Presley         55        332
## 4  Frank Sinatra Songs for Swingin' Lovers!        306         NA
## 5 Little Richard      Here's Little Richard         50        227
## 6        Beyonce                   Lemonade         NA         32

OK. Let’s make this an exploration of classic rock rather than the new stuff. If you aren’t on the list in 2003, I’m not interested. Also…if you aren’t on the list in 2020, I’m not interested. So let’s remove those:

album_subset <- album_subset %>%
  filter(!is.na(X2003.Rank) & !is.na(X2020.Rank))

head(album_subset)
##       Clean.Name                      Album X2003.Rank X2020.Rank
## 1  Frank Sinatra     In the Wee Small Hours        100        282
## 2     Bo Diddley Bo Diddley / Go Bo Diddley        214        455
## 3  Elvis Presley              Elvis Presley         55        332
## 4 Little Richard      Here's Little Richard         50        227
## 5    Miles Davis               Kind of Blue         12         31
## 6  John Coltrane                Giant Steps        102        232

Now that we have that, let’s calculate the differential between 2003 and 2020:

album_subset <- album_subset %>%
  mutate(Difference = X2003.Rank - X2020.Rank)

head(album_subset)
##       Clean.Name                      Album X2003.Rank X2020.Rank Difference
## 1  Frank Sinatra     In the Wee Small Hours        100        282       -182
## 2     Bo Diddley Bo Diddley / Go Bo Diddley        214        455       -241
## 3  Elvis Presley              Elvis Presley         55        332       -277
## 4 Little Richard      Here's Little Richard         50        227       -177
## 5    Miles Davis               Kind of Blue         12         31        -19
## 6  John Coltrane                Giant Steps        102        232       -130

Alright…not let’s look at the top movers on the high and low sides:

top_positive_movers <- album_subset %>%
  filter(Difference > 0) %>%  
  slice_max(order_by = Difference, n = 15)

top_negative_movers <- album_subset %>%
  filter(Difference < 0) %>%
  slice_min(order_by = Difference, n = 15)

top_positive_movers
##              Clean.Name                           Album X2003.Rank X2020.Rank
## 1              D'Angelo                          Voodoo        488         28
## 2                 Jay-Z                   The Blueprint        464         50
## 3             Radiohead                           Kid A        428         20
## 4                  Hole               Live Through This        466        106
## 5          Wu-Tang Clan  Enter the Wu-Tang: 36 Chambers        386         27
## 6                   Nas                        Illmatic        400         44
## 7            Funkadelic                    Maggot Brain        486        136
## 8                Fugees                       The Score        477        134
## 9        George Michael                           Faith        480        151
## 10         Cyndi Lauper                She's So Unusual        494        184
## 11          John Prince                      John Prine        458        149
## 12 The Notorious B.I.G.                Life After Death        483        179
## 13          Lauryn Hill The Miseducation of Lauryn Hill        312         10
## 14           The Police                   Synchronicity        455        159
## 15              OutKast                       Stankonia        359         64
##    Difference
## 1         460
## 2         414
## 3         408
## 4         360
## 5         359
## 6         356
## 7         350
## 8         343
## 9         329
## 10        310
## 11        309
## 12        304
## 13        302
## 14        296
## 15        295
top_negative_movers
##                     Clean.Name                           Album X2003.Rank
## 1                 Muddy Waters                   The Anthology         38
## 2                 Phil Spector        Back to Mono (1958-1969)         64
## 3                     Al Green                   Greatest Hits         52
## 4               Robert Johnson King of the Delta Blues Singers         27
## 5               Linda Ronstadt              Heart Like a Wheel        164
## 6           Jefferson Airplane             Surrealistic Pillow        146
## 7                 Howlin' Wolf        Moanin' in the Moonlight        153
## 8                  The Stooges                     The Stooges        185
## 9       Sly & The Family Stone                   Greatest Hits         60
## 10                     Ramones                Rocket to Russia        105
## 11               Elvis Presley                   Elvis Presley         55
## 12 The Flying Burrito Brothers        The Gilded Palace of Sin        192
## 13                Bonnie Raitt                    Nick of Time        229
## 14              Elvis Costello                  My Aim Is True        168
## 15                 Marvin Gaye                 Let's Get It On        165
##    X2020.Rank Difference
## 1         483       -445
## 2         489       -425
## 3         456       -404
## 4         374       -347
## 5         490       -326
## 6         471       -325
## 7         477       -324
## 8         488       -303
## 9         343       -283
## 10        385       -280
## 11        332       -277
## 12        462       -270
## 13        492       -263
## 14        430       -262
## 15        422       -257

OK, this is very cool.

##Answer

We can see how D’Angelo, Jay-Z and Radiohead had massive resurgences from 2003 to 2020. And looking at the other direction, Muddy Waters, Phil Spector and Al Green lost a lot of ground.

I honestly don’t know what to make of most of this information, but there are the top album movers over a fun 17 year span.

##Conclusion

This was a cool exercise. I feel so much more comfortable cleaning data. To the extent that I have a much better understanding of what that even means. More crucially, I have total confidence in my ability to just move data around to get the kinds of answers I want out of it, or better, the answers paying clients will want out of it. This was neat. It was hard and challenging, but neat. Thanks for the exercise!!