Alexander Hamilton: The Breakdown

Sentiment and other analyses of lyrics from the hit musical Hamilton

Ethan Milne

2019-07-31

Introduction

I made this document because I’m in the process of getting more familiar with the RMarkdown publishing format, and I wanted to do a deep dive into the world of sentiment analysis in R. There’s a lot going on behind the scenes to produce this: RMarkdown requires a basic understanding of LaTeX and some CSS (in particular, I’m using the Tufte CSS package to get this nice page formatting, with big margins for notes like these), and the sentiment analysis itself requires learning a few different R packages, particularly the tidyverse family.

I chose to look at a dataset of lyrics from Hamilton (see the Kaggle Hamilton Dataset for the original data I used), a musical I like that has enough name recognition that other people will generally understand what I’m talking about. The questions I’m trying to answer are: How does the general mood of the musical shift over time? Which negative or positive words have the most impact on the overall mood? Is there a way to see the relationships between different characters based on the order of their lines?

Sentiment Analysis, the Bing Method

Loading Packages and Getting Data

To start, I need to load the requisite packages for this analysis and import my data into R. I have a local copy of the Hamilton dataset downloaded from Kaggle in my R workspace already, but referencing its URL should work for importing the data as well.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(stringr)
library(tidytext)
library(textdata)
library(tidyr)
library(ggplot2)
library(readxl)

Hamilton_Songs <- read_excel("~/Desktop/Hamilton_songs.xlsx")
Hamilton_Songs
## # A tibble: 3,634 x 5
##    total_line_number song_line_number title     speaker lines              
##                <dbl>            <dbl> <chr>     <chr>   <chr>              
##  1                 1                1 Alexande… BURR    How does a bastard…
##  2                 2                2 Alexande… BURR    Scotsman, dropped …
##  3                 3                3 Alexande… BURR    Spot in the Caribb…
##  4                 4                4 Alexande… BURR    Grow up to be a he…
##  5                 5                5 Alexande… LAURENS The ten-dollar Fou…
##  6                 6                6 Alexande… LAURENS Got a lot farther …
##  7                 7                7 Alexande… LAURENS By being a lot sma…
##  8                 8                8 Alexande… LAURENS By being a self-st…
##  9                 9                9 Alexande… LAURENS By fourteen, they …
## 10                10               10 Alexande… JEFFER… And every day whil…
## # … with 3,624 more rows

Cleaning Data

You’ll notice that the data above stores each line as a character string. However, if we want to analyze individual words, we need to break the lines down into their individual components. This puts the data in a “tidy” format, i.e. every observation gets its own row, which makes the process of evaluating sentiment much easier to handle:

#splits sentences into units of 1 word each
tidy_hamilton <- Hamilton_Songs %>%
  unnest_tokens(word, lines)


#removes common stop words that don't carry any sentiment,
#like "and" or "how"
cleaned_hamilton <- tidy_hamilton %>%
  anti_join(get_stopwords())
## Joining, by = "word"
cleaned_hamilton
## # A tibble: 10,433 x 5
##    total_line_number song_line_number title              speaker word     
##                <dbl>            <dbl> <chr>              <chr>   <chr>    
##  1                 1                1 Alexander Hamilton BURR    bastard  
##  2                 1                1 Alexander Hamilton BURR    orphan   
##  3                 1                1 Alexander Hamilton BURR    son      
##  4                 1                1 Alexander Hamilton BURR    whore    
##  5                 2                2 Alexander Hamilton BURR    scotsman 
##  6                 2                2 Alexander Hamilton BURR    dropped  
##  7                 2                2 Alexander Hamilton BURR    middle   
##  8                 2                2 Alexander Hamilton BURR    forgotten
##  9                 3                3 Alexander Hamilton BURR    spot     
## 10                 3                3 Alexander Hamilton BURR    caribbean
## # … with 10,423 more rows

Bing Sentiment Analysis

Next, I’ll be sorting the words in the object “cleaned_hamilton” into positive or negative buckets. To do that, I’ll use a lexicon called “bing”, which functions as a dictionary of words, each tagged as either positive or negative. This lexicon is purely binary in how it treats the sentiment of words, so later on I’ll look at how using other lexicons changes the analysis. For starters, here’s how to get the bing lexicon:

bing <- get_sentiments("bing")

Here are some examples of what the bing lexicon considers positive vs negative words (at the bottom of each table you can see how many rows are left out of the sample). For whatever reason, the bing lexicon has far more negative words than positive ones.
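
The tables below come from simply filtering the lexicon on its sentiment column, something along these lines:

bing %>%
  filter(sentiment == "positive")

bing %>%
  filter(sentiment == "negative")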

## # A tibble: 2,005 x 2
##    word        sentiment
##    <chr>       <chr>    
##  1 abound      positive 
##  2 abounds     positive 
##  3 abundance   positive 
##  4 abundant    positive 
##  5 accessable  positive 
##  6 accessible  positive 
##  7 acclaim     positive 
##  8 acclaimed   positive 
##  9 acclamation positive 
## 10 accolade    positive 
## # … with 1,995 more rows
## # A tibble: 4,781 x 2
##    word        sentiment
##    <chr>       <chr>    
##  1 2-faces     negative 
##  2 abnormal    negative 
##  3 abolish     negative 
##  4 abominable  negative 
##  5 abominably  negative 
##  6 abominate   negative 
##  7 abomination negative 
##  8 abort       negative 
##  9 aborted     negative 
## 10 aborts      negative 
## # … with 4,771 more rows

So now I want to take this dataset and apply the bing lexicon to it. I’m going to use an inner join, which keeps only the words that also appear in the bing lexicon and appends each word’s positive or negative label. If you aren’t familiar with inner joins, here’s a quick example: Table1 has columns A, B, and C, and Table2 has columns A, B, and D. Inner joining Table1 to Table2 matches rows on the shared columns A and B, keeps only the rows whose values appear in both tables, and carries along columns C and D.
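
To make the join behavior concrete, here’s a tiny made-up example (both toy tables are invented purely for illustration):

#two toy tables: a few words from a song, and a miniature lexicon
toy_words <- tibble(word = c("helpless", "satisfied", "burr"))
toy_lexicon <- tibble(word = c("helpless", "satisfied"),
                      sentiment = c("negative", "positive"))

#only rows whose "word" appears in both tables survive, and the
#sentiment column comes along for the ride; "burr" is dropped
inner_join(toy_words, toy_lexicon, by = "word")

Now I’ll do the same thing with the real data, focusing only on the first song in the musical, “Alexander Hamilton”, and see what sort of output I get: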

alexander_hamilton_sentiment <- cleaned_hamilton %>%
  filter(title=="Alexander Hamilton") %>%
  inner_join(bing) %>%
  count(song_line_number, sentiment) %>%
  spread(sentiment, n, fill=0) %>%
  mutate(sentiment = positive - negative)
## Joining, by = "word"
sentimentGraph <- ggplot(alexander_hamilton_sentiment, aes(x=song_line_number, y=sentiment)) + geom_bar(stat="identity", show.legend = FALSE) 

sentimentGraph + ggtitle("Alexander Hamilton", "Sourced from the Kaggle Dataset Library") 

This isn’t all that impressive. Lots of lines are missing, likely because they were composed solely of words not in the bing lexicon, and the line-by-line net sentiment doesn’t really follow the wave-like pattern you’d expect if sentiment flowed smoothly through the song. Instead, I’m going to look at how net sentiment changes over the course of the musical, using song order as a proxy for time.

I’m also going to filter for only those songs that have more than 20 words registering in the bing lexicon (i.e. the sum of their positive and negative counts exceeds 20). This excludes some extremely brief songs, like “A Winter’s Ball”, that are too small to really have an impact on the overall mood of the musical.

hamilton_sentiment <- cleaned_hamilton %>%
  inner_join(bing) %>%
  count(title, sentiment) %>%
  spread(sentiment, n, fill=0) %>%
  mutate(sentiment = positive-negative) %>%
  filter((negative+positive) > 20)
## Joining, by = "word"
hamilton_levels <- c("Alexander Hamilton", "Aaron Burr, Sir", "My Shot", "The Story of Tonight", "The Schuyler Sisters", "Farmer Refuted", "You'll Be Back", "Right Hand Man", "A Winter's Ball", "Helpless", "Satisfied", "The Story of Tonight (Reprise)", "Wait For It", "Stay Alive", "Ten Duel Commandments", "Meet Me Inside", "That Would Be Enough", "Guns and Ships", "History Has Its Eyes On You", "Yorktown (The World Turned Upside Down)", "What Comes Next?", "Dear Theodosia", "Non-Stop", "What'd I Miss", "Cabinet Battle #1", "Take A Break", "Say No To This", "The Room Where It Happens", "Schuyler Defeated", "Cabinet Battle #2", "Washington On Your Side", "One Last Time", "I Know Him", "The Adams Administration", "We Know", "Hurricane", "Burn", "Blow Us All Away", "Stay Alive (Reprise)", "It's Quiet Uptown", "The Election of 1800", "Your Obedient Servant", "Best of Wives and Best of Women", "The World Was Wide Enough", "Who Lives, Who Dies, Who Tells Your Story")

I’ve also included an object called “hamilton_levels” (there’s likely an easier way to do this than writing out each song in order; one possible shortcut is sketched below). ggplot2 alphabetizes non-numeric x axes by default, so by creating a vector with all the song names in order, I can make any graph I create put the songs in their chronological order.
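
That shortcut, assuming total_line_number already follows the show’s running order, is to pull the distinct song titles out of the data in the order they first appear (hamilton_levels_from_data is just my name for the result):

#distinct() keeps the first appearance of each title, so the titles
#come out in show order without typing anything by hand
hamilton_levels_from_data <- Hamilton_Songs %>%
  arrange(total_line_number) %>%
  distinct(title) %>%
  pull(title)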

#Time to Graph
AllSongsGraph <- ggplot(hamilton_sentiment, aes(x=factor(title, levels=hamilton_levels), y=sentiment, fill=sentiment)) + geom_bar(stat="identity", show.legend = FALSE) 

#now to make the graph look pretty
AllSongs <- AllSongsGraph +  
  ggtitle("Net Sentiment of Songs in Hamilton") + 
  xlab("Song") + ylab("Net Sentiment") + 
  theme(axis.text.x = element_text(angle = 90)) 
AllSongs

This is interesting. We can see a clear difference in the net sentiment of songs in the first half of the musical (Non-Stop is the last song in the first act), and there is a relatively clear downward trend away from overall positivity toward a much more bittersweet ending.

I did mention that the method I used with the bing lexicon has some problems. For starters, every word classified as positive or negative gets an equal weight. This doesn’t seem right; “unhappy” is far milder than “devastated” or “distraught”, yet bing classifies all of these as carrying the same amount of negative sentiment.
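
We can check this directly by looking those words up in the lexicon; the lookup simply returns whichever of them bing contains, each with only a flat positive/negative label and no sense of intensity.

bing %>%
  filter(word %in% c("unhappy", "devastated", "distraught"))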

Thankfully, we aren’t limited to just one lexicon. There’s another one called “afinn” that assigns each word a score from -5 to 5, which allows for a far more nuanced analysis of net sentiment. (The last lexicon I’m aware of is “nrc”, which takes a different approach and sorts words into emotional categories like “fear” or “joy”. This can be useful when you have extremely large quantities of words, like when analyzing a book; I used it in an analysis of Alice in Wonderland. Musicals tend to have far fewer total words to work with, though, so any analysis that splits words into as many categories as nrc does would end up being relatively crude.)

AFINN Sentiment

Here’s a quick look at the afinn lexicon:
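
Like bing, it comes from get_sentiments() (the textdata package may prompt you to download the lexicon the first time); assigning it to an object here also makes afinn available for the join further down.

afinn <- get_sentiments("afinn")
afinn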

## # A tibble: 2,477 x 2
##    word       value
##    <chr>      <dbl>
##  1 abandon       -2
##  2 abandoned     -2
##  3 abandons      -2
##  4 abducted      -2
##  5 abduction     -2
##  6 abductions    -2
##  7 abhor         -3
##  8 abhorred      -3
##  9 abhorrent     -3
## 10 abhors        -3
## # … with 2,467 more rows

This lexicon seems a lot more nuanced. Let’s see what sort of output we get using this lexicon instead.

#joining afinn with the dataset to get scores
hamilton_sentiment2 <- cleaned_hamilton %>%
  inner_join(afinn) %>%
  group_by(title) %>%
  summarise(sentiment = sum(value)) 
## Joining, by = "word"
#initial graph
AllSongsGraph2 <- ggplot(hamilton_sentiment2, aes(x=factor(title, levels=hamilton_levels), y=sentiment, fill=sentiment)) + geom_bar(stat="identity", show.legend = FALSE) 

#now to make the graph look pretty
AllSongs2 <- AllSongsGraph2 +  
  ggtitle("Net Sentiment of Songs in Hamilton") + 
  xlab("Song") + ylab("Net Sentiment") + 
  theme(axis.text.x = element_text(angle = 90)) 
AllSongs2

This chart tells a slightly different story. The downward trend isn’t nearly as pronounced, and there are some songs with extremely high net sentiment (Satisfied, one of the highest-sentiment songs, is a notable case I’ll get into soon). The results are similar to what we got with the bing lexicon, just with more nuance. This is not to say afinn is better than bing, just different.

Sentiment Contribution

Now that we’ve graphed sentiment over time, let’s look at the words that most contribute to positive or negative sentiment.

bing_word_counts <- cleaned_hamilton %>%
  inner_join(bing) %>%
  count(word, sentiment, sort=TRUE) %>%
  filter(n>10) %>%
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col() +
  coord_flip() +
  labs(y = "Contribution to sentiment") +
  ggtitle("Which Words Contributed Most to Net Sentiment?")
## Joining, by = "word"
bing_word_counts

While interesting, I think this graph reveals a flaw in many lexicon-based sentiment analyses. For example, “helpless” is the word that contributes most to negative sentiment, but at least half the time it’s used, Eliza is referring to feeling helplessly in love with Alexander Hamilton. Likewise, “satisfied” seems to be a big contributor to positive sentiment, despite being used primarily by Angelica to declare that she will “never be satisfied” or to sarcastically ask Hamilton whether he is satisfied. In context, these words can mean the opposite of what bing or afinn think they do.
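
A rough way to sanity-check this is to pull up the raw lines and count who is actually singing these words; this is illustrative rather than a proper context analysis.

#which speakers have lines containing "helpless" or "satisfied"?
Hamilton_Songs %>%
  filter(str_detect(str_to_lower(lines), "helpless|satisfied")) %>%
  count(speaker, sort = TRUE)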

Line Networks

Another question I wanted to answer is how we can visualize the relationships between characters based on the order of their lines. To do this, I’ll use the circlize package in R. I modified the initial Kaggle dataset of lines significantly in Excel beforehand, because Excel is extremely good at offsetting relational formulas in the way I needed to get the data into a shape circlize can handle. See below for a sample of the new dataset I created.

## ========================================
## circlize version 0.4.6
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: http://jokergoo.github.io/circlize_book/book/
## 
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization 
##   in R. Bioinformatics 2014.
## ========================================
## # A tibble: 136 x 2
##    From     To      
##    <chr>    <chr>   
##  1 BURR     HAMILTON
##  2 HAMILTON BURR    
##  3 BURR     HAMILTON
##  4 HAMILTON BURR    
##  5 BURR     ENSEMBLE
##  6 ENSEMBLE HAMILTON
##  7 HAMILTON ENSEMBLE
##  8 ENSEMBLE HAMILTON
##  9 HAMILTON BURR    
## 10 BURR     HAMILTON
## # … with 126 more rows

As you can see, this data only has two columns, “From” and “To”. I’ve also edited much of the data into cleaner, broader categories. (For example, there were about ten categories covering various permutations of “BURR & ENSEMBLE”, “BURR & MEN & WOMEN”, or “COMPANY (EXCEPT HAMILTON)”. I used my best judgement to sort these into broader categories of Ensemble, Company, or, if the line was primarily a character singing with the ensemble supporting, that character’s name.)
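
For anyone who would rather skip the spreadsheet step, here’s a rough all-R sketch of the same pre-processing. It’s only an approximation: consolidate_speaker and its collapsing rules are my own stand-ins, the “To” speaker is simply taken to be whoever sings the next line, and the table actually used below was still built by hand in Excel with case-by-case judgement calls.

#blunt consolidation of grouped speakers into broader categories
consolidate_speaker <- function(x) {
  case_when(
    str_detect(x, "ENSEMBLE") ~ "ENSEMBLE",
    str_detect(x, "COMPANY") ~ "COMPANY",
    TRUE ~ x
  )
}

#pair each line's speaker with the next line's speaker, dropping
#self-transitions so only hand-offs between parts remain
Hamilton_lines_sketch <- Hamilton_Songs %>%
  arrange(total_line_number) %>%
  mutate(speaker = consolidate_speaker(speaker)) %>%
  transmute(From = speaker, To = lead(speaker)) %>%
  filter(!is.na(To), From != To)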

Now to visualize this with the circlize package. Because I pre-cleaned my data, the call is very simple. (Note that some of these categories overlap; I’m still learning how to use the circlize package well, and will update this when I can. circlize is very powerful, but it was initially designed around music notes that typically only have two characters in their notation, like E5, which makes full names harder to fit without heavy modification.)

chordDiagram(Hamilton_lines)

We can see the dense connections between Hamilton and Burr, and just how involved the general ensemble is throughout the show.

Conclusion

I think I accomplished what I set out to do at the beginning of this project. I’ve shown how net sentiment changes over time using different lexicons, identified the specific words that contribute most to net sentiment, and learned a cool way to visualize networks with the circlize package.

There are flaws in what I’ve done; the commentary on the limitations of lexicon-based sentiment analysis shows that. However, it’s a first step towards more accurate sentiment analysis tools like the sentimentr package or more recent natural language processing (NLP) methods.

Here’s a recap of all the interesting visualizations I made in this document: