Introduction
The musical Hamilton opened on Broadway in 2015 and became an instant classic, blending hip-hop, jazz, R&B, and Broadway styles on stage in a way that had never been done before. The show has been renowned for its music and lyrics since its first performance. Lin-Manuel Miranda, the composer and the original actor in the title role, wrote the show over the course of seven years. Hamilton was nominated for a record-breaking 16 Tony Awards in 13 different categories and won 11 of them, including Best Original Score (music and lyrics).

The show follows the life of the founding father Alexander Hamilton, beginning with his immigration to the United States and continuing through the nation’s first major sex scandal to his eventual death in a duel. The other main characters discussed in this analysis are Aaron Burr, Hamilton’s friend turned enemy, and Eliza Hamilton, his wife and the mother of his children. Another unique feature of the show is that the actors who play some of the supporting characters switch roles from Act I to Act II.

Throughout this analysis I will be using the musical term ostinato, which means a melodic phrase repeated throughout a composition.

To begin this text analysis we need to first load the following packages.

library(tidyverse)
library(tidytext)
library(textdata)
library(wordcloud2)
library(gridExtra)
library(readr)
ham_lyrics <- read_csv("ham_lyrics.csv")

Analysis by Lyric
The first step is to split all of the lyrics in the show into individual words and filter out common words such as “the” and “me” that do not add any value to the analysis.

all_ham_lyrics <- ham_lyrics %>%
  unnest_tokens(word, lines)%>%
  anti_join(stop_words)
## Joining, by = "word"
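
To make the two steps concrete, here is a minimal sketch on two made-up lines. The toy_lyrics table and its contents are assumptions for illustration and are not taken from ham_lyrics.csv. unnest_tokens() lowercases the text and returns one row per word, and anti_join() then drops every word found in tidytext's stop_words table. Passing by = "word" names the join column explicitly, which also avoids the "Joining" message.

```r
library(dplyr)
library(tidytext)

# Made-up lines for illustration only
toy_lyrics <- tibble(
  title = c("Toy Song", "Toy Song"),
  lines = c("Wait for me", "Take the shot")
)

toy_words <- toy_lyrics %>%
  unnest_tokens(word, lines) %>%       # one lowercase word per row
  anti_join(stop_words, by = "word")   # drop common words such as "the" and "me"

toy_words
```

Words like “wait” and “shot” survive this step, which is why they can show up in the counts that follow.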

Using this filter we can look at the 20 most common words in the musical.

all_ham_lyrics %>%
  count(word, sort = TRUE) %>%
  head(20)
## # A tibble: 20 × 2
##    word          n
##    <chr>     <int>
##  1 da           89
##  2 wait         81
##  3 time         77
##  4 hamilton     75
##  5 hey          69
##  6 burr         63
##  7 shot         58
##  8 sir          56
##  9 alexander    50
## 10 whoa         42
## 11 gonna        38
## 12 rise         37
## 13 world        36
## 14 em           35
## 15 story        35
## 16 alive        34
## 17 satisfied    33
## 18 york         33
## 19 helpless     32
## 20 home         32

We can also visualize the most common words of the musical in the following word cloud.

all_ham_lyrics %>%
  count(word, sort = TRUE) %>%
  wordcloud2()

From the list and the word cloud we can see that the word “da” appears the most in the musical. This was a surprise to me because I know that it is only used in a couple of songs. King George III is the character who sings the word, and he is in only three songs of the entire musical, so I wanted to see in which song he sang it the most.

all_ham_lyrics %>%
  filter(word == "da") %>%
  count(title, sort = TRUE)
## # A tibble: 2 × 2
##   title              n
##   <chr>          <int>
## 1 You'll Be Back    76
## 2 I Know Him        13

It appears that “You’ll Be Back” was the most popular song for that lyric, but it was alarming that only two of his three songs were listed. The missing song is “What Comes Next?”, so I listened to it and heard him sing his ostinato “da da da da da”. I believe that this shows an error in the original data source.
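
count() only returns songs in which the word appears at least once, so a song with no matches is simply absent from the table. One way to make such a gap visible is to join the counts onto the full list of songs that were expected. The sketch below retypes the two counts from the table above and assumes a hypothetical king_songs table listing King George's three songs.

```r
library(dplyr)
library(tidyr)

# Counts copied from the output above
da_counts <- tibble(
  title = c("You'll Be Back", "I Know Him"),
  n = c(76L, 13L)
)

# Hypothetical reference list of the songs we expected to see
king_songs <- tibble(
  title = c("You'll Be Back", "What Comes Next?", "I Know Him")
)

da_by_song <- king_songs %>%
  left_join(da_counts, by = "title") %>%  # unmatched songs get NA
  mutate(n = replace_na(n, 0L)) %>%       # show them as an explicit zero
  arrange(desc(n))

da_by_song
```

A zero here only says the word was not found in the data for that song; it does not say whether the lyric is missing from the source or spelled differently.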


The next most popular word from the musical is “wait.” In the musical, Aaron Burr is known for being patient and waiting for his time to take action, whereas Alexander Hamilton is more impulsive. I assumed that Burr would be the most common speaker of the word because it is what he is most known for, and I wanted to check.

all_ham_lyrics %>%
  filter(word == "wait") %>%
  separate_rows(sep = "/") %>%  # no column is named here, so speaker labels are left unsplit
  count(speaker, sort = TRUE)
## # A tibble: 12 × 2
##    speaker                       n
##    <chr>                     <int>
##  1 BURR                         21
##  2 ENSEMBLE                     21
##  3 BURR & ENSEMBLE              11
##  4 HAMILTON                      8
##  5 COMPANY                       4
##  6 COMPANY (EXCEPT HAMILTON)     4
##  7 HAMILTON & COMPANY            3
##  8 ANGELICA                      2
##  9 ELIZA                         2
## 10 FULL COMPANY                  2
## 11 MEN                           2
## 12 LAURENS                       1

Burr and the Ensemble sing this word the same number of times, which is not what I originally predicted, but it makes sense. When I hear the word “wait,” I think of Burr’s song “Wait for It,” and in that song whenever he sings the line “I’m willing to wait for it,” the ensemble sings it as well. I wanted to confirm that this song is where the word occurs most often and decided to check.

all_ham_lyrics %>%
  filter(word == "wait") %>%
  count(title, sort = TRUE)
## # A tibble: 11 × 2
##    title                                         n
##    <chr>                                     <int>
##  1 Wait For It                                  40
##  2 Hurricane                                    14
##  3 Alexander Hamilton                            8
##  4 Non-Stop                                      7
##  5 Take A Break                                  3
##  6 The Room Where It Happens                     3
##  7 Satisfied                                     2
##  8 My Shot                                       1
##  9 Schuyler Defeated                             1
## 10 The World Was Wide Enough                     1
## 11 Who Lives, Who Dies, Who Tells Your Story     1

This confirms that the song “Wait for It” has significantly more instances of the word “wait” than any other song. I thought it was interesting that “Hurricane” came second, because that song is sung mainly by Alexander Hamilton, and he is not known for waiting. Hamilton’s ostinato throughout the show is about not having enough time to write and “not throwing away my shot.” The last common word I will look at is “time,” because I am not sure whether it will come more from Hamilton or from Eliza.

all_ham_lyrics %>%
  filter(word == "time") %>%
  separate_rows(sep = "/") %>%  # no column is named here, so speaker labels are left unsplit
  count(speaker, sort = TRUE)
## # A tibble: 21 × 2
##    speaker                                 n
##    <chr>                               <int>
##  1 HAMILTON                               13
##  2 BURR                                    8
##  3 ELIZA & COMPANY                         7
##  4 ENSEMBLE                                7
##  5 WASHINGTON                              7
##  6 ELIZA                                   6
##  7 COMPANY                                 5
##  8 HAMILTON/LAFAYETTE/LAURENS/MULLIGAN     5
##  9 LAURENS                                 4
## 10 BURR & ALL WOMEN                        2
## # … with 11 more rows

It appears that Hamilton sings the lyric the most and is followed by Burr, which was unexpected. Burr and Hamilton have many conversations about their different approaches to the way they spend their time, but it makes sense that Hamilton talks about it a couple more times than Burr. I also noticed that Washington is tied for the third most common speaker, which makes sense because he sings a song entitled “One Last Time.”
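
One detail in the table above is that HAMILTON/LAFAYETTE/LAURENS/MULLIGAN is still listed as a single speaker, which suggests the separate_rows() call did not split anything. separate_rows() expects the column to split as an argument. Here is a minimal sketch on made-up rows (toy_speakers is hypothetical) that names the speaker column explicitly:

```r
library(dplyr)
library(tidyr)

# Made-up rows standing in for the filtered lyrics
toy_speakers <- tibble(
  speaker = c("BURR", "HAMILTON/LAURENS", "BURR"),
  word = c("wait", "wait", "wait")
)

split_counts <- toy_speakers %>%
  separate_rows(speaker, sep = "/") %>%  # one row per individual speaker
  count(speaker, sort = TRUE)

split_counts
```

This only splits on "/". Labels joined with " & ", such as BURR & ENSEMBLE, would need a different separator pattern, and splitting them changes what is being counted from "solo lines" to "lines the character takes part in".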

Analysis by Same Character
The next thing I wanted to investigate was each character’s most common words per act of the musical. There are two acts, each containing 23 songs, and they cover two different periods in Hamilton’s life. The first act is about Hamilton coming to America, fighting in the Revolutionary War, and getting married to Eliza, and it ends with the birth of his son and the end of the war. The second act is about creating a stable government, Hamilton’s public affair, the death of his son, and his own death in a duel with Aaron Burr. To analyze each act I first had to create two lists of song titles, one per act.

# Song titles for each act, written to match the title column of ham_lyrics.
ACT_1 <- c(
  "Alexander Hamilton", "Aaron Burr, Sir", "My Shot", "The Story of Tonight",
  "The Schuyler Sisters", "Farmer Refuted", "You'll Be Back", "Right Hand Man",
  "A Winter's Ball", "Helpless", "Satisfied", "The Story of Tonight (Reprise)",
  "Wait For It", "Stay Alive", "Ten Duel Commandments", "Meet Me Inside",
  "That Would Be Enough", "Guns and Ships", "History Has Its Eyes On You",
  "Yorktown (The World Turned Upside Down)", "What Comes Next?",
  "Dear Theodosia", "Non-Stop"
)
# "The Reynolds Pamphlet" is assumed to match the title spelling used in ham_lyrics.
ACT_2 <- c(
  "What'd I Miss", "Cabinet Battle #1", "Take A Break", "Say No To This",
  "The Room Where It Happens", "Schuyler Defeated", "Cabinet Battle #2",
  "Washington On Your Side", "One Last Time", "I Know Him",
  "The Adams Administration", "We Know", "Hurricane", "The Reynolds Pamphlet",
  "Burn", "Blow Us All Away", "Stay Alive (Reprise)", "It's Quiet Uptown",
  "The Election of 1800", "Your Obedient Servant",
  "Best of Wives and Best of Women", "The World Was Wide Enough",
  "Who Lives, Who Dies, Who Tells Your Story"
)
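
Hand-typed song lists are easy to get slightly wrong, and a repeated or missing title silently changes which rows `%in%` keeps. A quick sanity check is to compare the number of unique titles with the number expected and to list any repeats. The helper below is a hypothetical sketch in base R, shown on a short made-up vector rather than the real act lists.

```r
# Hypothetical helper: not part of the original analysis
check_song_list <- function(songs, expected_n) {
  list(
    n_unique = length(unique(songs)),               # distinct titles
    repeated = unique(songs[duplicated(songs)]),    # titles typed more than once
    complete = length(unique(songs)) == expected_n  # TRUE when the count matches
  )
}

# Made-up four-entry list with one title typed twice
example_act <- c("Hurricane", "Burn", "Stay Alive (Reprise)", "Stay Alive (Reprise)")

result <- check_song_list(example_act, expected_n = 4)
result
```

Running the same check on ACT_1 and ACT_2 with expected_n = 23 would confirm that each list has 23 distinct titles. It would also be worth checking the titles against the title column of the data, since a spelling mismatch fails just as quietly.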

Once I had the list of songs in each act, I used them to build the following graphs of the most common words Hamilton sings in each act of the show.
Alexander Hamilton

ham_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "HAMILTON") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Hamilton ACT I Top Ten Words")
ham_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "HAMILTON") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Hamilton ACT II Top Ten Words")


With both graphs saved, I can display them together to compare Hamilton’s most common lyrics per act.

grid.arrange(ham_actI_graph, ham_actII_graph)


Hamilton had the same first and third most common words in both acts. This is really interesting because those words are often spoken together: Hamilton will usually say “Burr, sir.” It is also interesting to see that the other common words follow the overall plot of each act. The first act for Hamilton is about fighting in the war, and his most common words are words like shot, war, and command. The second act is about forming the government, and his most common words are writing, plan, and time. I expected the word time to be more common in the first act because throughout the show he is worried about not having enough time to do what he wants in his life, but it does make sense that it is more common in the second act because he is reaching the end of his life and that is when he is most worried about running out of time. The only word that has a sentiment analysis value is war. I was surprised that the word shot did not have a sentiment value attached as well. I believe that Hamilton’s most common words follow the theme of each act of the show.
Aaron Burr
The next character we will look at is Burr, and we will follow the same structure for the following characters.
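
Because the same pipeline is repeated for every character, it could also be wrapped in a pair of small helpers. The sketch below is one possible way to do that, not the code used for the graphs in this analysis. The names top_words() and plot_top_words() are hypothetical, and a made-up two-word lexicon stands in for get_sentiments("afinn") so the example runs without downloading anything.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: count one speaker's words within a set of songs
top_words <- function(lyrics, songs, who, lexicon, n_words = 10) {
  lyrics %>%
    filter(title %in% songs, speaker == who) %>%
    count(word, sort = TRUE) %>%
    left_join(lexicon, by = "word") %>%  # words missing from the lexicon get NA
    head(n_words)
}

# Hypothetical helper: draw the bar chart used throughout this analysis
plot_top_words <- function(word_counts, label) {
  ggplot(word_counts, aes(x = n, y = reorder(word, n), fill = value)) +
    geom_col() +
    xlab("Count") + ylab("Word") + ggtitle(label)
}

# Made-up rows standing in for all_ham_lyrics
toy_lyrics <- tibble(
  title = c("My Shot", "My Shot", "My Shot", "My Shot", "Hurricane"),
  speaker = c("HAMILTON", "HAMILTON", "HAMILTON", "BURR", "HAMILTON"),
  word = c("shot", "shot", "war", "wait", "time")
)
# Made-up values standing in for get_sentiments("afinn")
toy_lexicon <- tibble(word = c("war", "helpless"), value = c(-2, -2))

ham_toy <- top_words(toy_lyrics, "My Shot", "HAMILTON", toy_lexicon)
ham_toy_graph <- plot_top_words(ham_toy, "Hamilton Toy Example")
```

With helpers like these, each character would need only two calls per act, for example top_words(all_ham_lyrics, ACT_1, "BURR", get_sentiments("afinn")) followed by plot_top_words().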

burr_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "BURR") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Burr ACT I Top Ten Words")
burr_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "BURR") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Burr ACT II Top Ten Words")
grid.arrange(burr_actI_graph, burr_actII_graph)


Aaron Burr is one of the clearest examples of a character who goes from friend to enemy. He and Hamilton start out as friends with a common interest in winning the war, but after Hamilton beats Burr for many political positions, Burr becomes upset with him and feels that the two are now fighting against each other. I thought that it was interesting that Hamilton is such a common word for Burr, but I realized it’s because he is playing the part of the narrator. While the other characters are talking to Hamilton, they don’t say his name as often as Burr does when he is talking about Hamilton. Burr’s sentiment analysis has only a few values, and they are not always accurate in context. For example, when Burr says nice in the second act, he is sarcastically saying that Washington is too nice of a President. Even though Burr has the clearest arc from friend to enemy, it is not really reflected in the sentiment analysis.
Eliza Schuyler/Hamilton

eliza_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "ELIZA") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Eliza ACT I Top Ten Words")
eliza_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "ELIZA") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Eliza ACT II Top Ten Words")
grid.arrange(eliza_actI_graph, eliza_actII_graph)


Eliza Schuyler/Hamilton is Alexander Hamilton’s main love interest throughout the show. They fall in love, get married, and have a child in the first act of the show, and then go through a rough time when Hamilton’s affair becomes public in the second act, but eventually she forgives him. I think Eliza follows the friends to enemies theme of the show because they are in love in the first act, and then she is obviously upset with him in the second act. In the first act Eliza has a few words with values for the sentiment analysis, however they are not completely accurate in context. Eliza’s song “Helpless” is about how she feels helplessly in love with Hamilton, which is a good thing, however this word has a negative value associated with it. I thought that it was interesting that Eliza’s most common words in the second act are seven, eight, and nine in French, but then I remembered that she teaches her son to count to ten in French multiple times during the show. I was surprised that the words burn and break did not have a negative value from the sentiment analysis in act II as well. I think that it is clear that Eliza follows the friends to enemies plot given the context of the show, but it could not be determined solely from the lyrics.
Angelica

angelica_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "ANGELICA") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Angelica ACT I Top Ten Words")
angelica_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "ANGELICA") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Angelica ACT II Top Ten Words")
grid.arrange(angelica_actI_graph, angelica_actII_graph)


Angelica Schuyler is Eliza’s sister who also wants to be with Hamilton, but stepped aside so her sister could marry him. She is not in as much of the second act because she marries someone who lives in London and moves away, but she returns when Hamilton’s affair becomes public to help her sister. In the first act Angelica’s most common word is “satisfied,” which is not surprising because that is the title of her solo song. Two of her other most common words are about Eliza (sister and bride), which shows how much she cares about her sister. Alexander is in the list for both acts, which makes sense, but I was surprised to see that it was her most common word in the second act. There are only a few values for the sentiment analysis. Similarly to Eliza and the term helpless, when Angelica sings satisfied she is talking about how she would have had a more satisfying life if she had married Hamilton. Even though the word on its own has positive connotations, in this context I believe that it is more negative and should have a lower value. The second act only has one value for the sentiment analysis, and it is positive for the word reach, which I thought was interesting because I can’t recall when she sings that word. I think that Angelica is more of a friend to her sister and supports her no matter what, so she follows the friends to enemies track only because Eliza does.
George Washington

washington_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "WASHINGTON") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Washington ACT I Top Ten Words")
washington_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "WASHINGTON") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Washington ACT II Top Ten Words")
grid.arrange(washington_actI_graph, washington_actII_graph)


George Washington acts as a role model for Hamilton during both acts. I am not really sure how he would follow the theme that your friends can be your enemies, because he always supported Hamilton, up until the affair went public. However, Hamilton was the one who publicized the affair, so he would be responsible for any consequences of that action. Washington’s most common words in the first act are mostly about Hamilton. His most common word is Hamilton, and he also calls Hamilton his son and Alexander; all three words are in his top ten. In the second act he actually says Hamilton more times than in the first act, but it is not his most common word. The reason that Jefferson has made it onto the list is that during the Cabinet Battle songs, where Hamilton and Jefferson are arguing, Washington acts as a mediator (with a slight bias towards Hamilton) and says their names a lot. There are only two words with a sentiment analysis value for each act, but I am surprised that the word “goodbye” from the second act didn’t have a value as well. I think that it is fair to say that Washington’s most common words make sense for his character, but do not follow the friends and enemies theme.
Analysis by Different Character
When Lin-Manuel Miranda wrote this show, he wanted some of the actors to play different characters in Act I and Act II. For example, the actor who plays Hercules Mulligan in Act I also plays James Madison in Act II. He did this because he wanted the casting to follow the theme that the same people who are your friends can also be your enemies. Mulligan is one of Hamilton’s main friends who helps him fight in the war in the first act, and Madison is Jefferson’s second in command; together Jefferson and Madison try to take down Hamilton. I always thought that this was an interesting concept and wanted to see if there were any similarities or differences between the lyrics for the actors who play different characters.
The first set of characters on that list are Mulligan/Madison, and I followed the same filter steps as I did above.
Hercules Mulligan/James Madison

mulligan_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "MULLIGAN") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Mulligan ACT I Top Ten Words")
madison_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "MADISON") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Madison ACT II Top Ten Words")
grid.arrange(mulligan_actI_graph, madison_actII_graph)


Hercules Mulligan is one of Alexander Hamilton’s friends who helps him fight the war in the first act, and James Madison is Thomas Jefferson’s advisor, who helps him fight against Hamilton politically. Neither Mulligan nor Madison has many words in his act because each only really sings by himself in one song. In the first act Mulligan’s most common words come from the song “Aaron Burr, Sir,” and in the second act Madison’s come from “Washington on Your Side.” There is only one word with a sentiment value, which is nice, and that word is actually said sarcastically about Washington being too nice of a President (which also happened with Burr). I don’t think that this lyric analysis shows the differences between the characters because there is not enough information from the lyrics alone.
Marquis de Lafayette/Thomas Jefferson

Lafayette_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "LAFAYETTE") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Lafayette ACT I Top Ten Words")
jefferson_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "JEFFERSON") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Jefferson ACT II Top Ten Words")
grid.arrange(Lafayette_actI_graph, jefferson_actII_graph)


Marquis de Lafayette is one of Hamilton’s three main friends in the first act who help him fight the war. Lafayette is from France, so it is no surprise that his most common word is France. In the second act, as Thomas Jefferson, he gets into many political fights with Hamilton about Washington, so it makes sense that those names are some of his most common words. There are only two values for the sentiment analysis in each act, and they both have the same value, so it does not appear that the characters Lafayette/Jefferson differ depending on whether they are friends or enemies.
Peggy Schuyler/Maria Reynolds

Peggy_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "PEGGY") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Peggy ACT I Top Ten Words")
Maria_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "MARIA") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Maria ACT II Top Ten Words")
grid.arrange(Peggy_actI_graph, Maria_actII_graph)


The characters Peggy and Maria are polar opposites, even though each only really sings in one song in her act. Peggy Schuyler is the younger sister of Eliza and Angelica, and her most popular words reflect her innocence. Her ostinato is saying “and Peggy” whenever the sisters introduce themselves, so it makes sense that this is her most common word. After that her most common words only have a count of one because she does not sing in many other songs. In the second act the character Maria Reynolds is the person that Alexander Hamilton has an affair with. This character is sexualized and is the exact opposite of Peggy from the first act. I think that the lyrics from the second act don’t really represent the character without that piece of context. I was surprised that certain words had a sentiment analysis value, but others did not. The words from the first act have negative connotations and make sense, but I was shocked that the lyrics “cheatin” and “beatin” in the second act were not associated with a negative value.
John Laurens/Philip Hamilton

Laurens_actI_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_1 & speaker %in% "LAURENS") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Laurens ACT I Top Ten Words")
Philip_actII_graph <-all_ham_lyrics %>%
  filter(title %in% ACT_2 & speaker %in% "PHILIP") %>% 
  count(word, sort = TRUE) %>% 
  left_join(get_sentiments("afinn")) %>% 
  head(10) %>%
  ggplot(aes(x=n, y=reorder(word, n), fill = value)) + 
  geom_col() +
  xlab("Count") + ylab("Word") + ggtitle("Philip ACT II Top Ten Words")
grid.arrange(Laurens_actI_graph, Philip_actII_graph)


The final character of the analysis is John Laurens/Philip Hamilton. Laurens was a close friend of Hamilton and Philip was his oldest son. In the first act Laurens works with Hamilton during the war. His ostinato is to “rise up,” so it makes sense that his most popular word is rise. Another phrase that he repeats a lot is “raise a glass to freedom.” We can see that raise and glass are two of the most common words, but I am a little surprised freedom did not make the list because all three of those words are usually said together as a phrase. In the second act Philip’s most common words are seven, eight, and nine in French, just like Eliza’s. They both count to ten while playing the piano multiple times during this act and again while Philip is dying, so it is not surprising that these are among the most common words for both of them. I think that the lyrics for Laurens/Philip match each character’s persona, but I don’t think that they necessarily follow the friends and enemies theme per act. Laurens is fighting a war in the first act and his words follow that theme with rise up and raise a glass. In the second act the actor is playing Hamilton’s son as a child, and his words match the innocence of that character in counting numbers and saying father.

Conclusion
Overall, it could not be determined whether the characters followed the theme that your friends can become your enemies based on the sentiment analysis alone. The most common lyrics became clearer when context for the show was given, but it is extremely difficult to prove that the theme is followed without the sentiment analysis values. This investigation did show that each character’s ostinato was one of their most commonly said words throughout the show. I found it really interesting to see the different words that were most common for each character because I would often recognize which song they came from. I also thought it was interesting that the word immigrant wasn’t more common throughout the show, especially in the first act, because most of the characters are immigrants fighting for freedom in America, which is another prominent theme of the show.
I believe that part of the reason this investigation was unsuccessful in proving the theme of the show is that it is a staged production and there is more to it than just the lyrics. There are ostinatos for each character in the music and stage directions. When that is all combined, the theme can become clearer. But when we isolate one aspect, the lyrics, it can be difficult to prove something about the show as a whole.

Limitations
The most significant limitation was the lack of words with a value for the sentiment analysis. The characters Peggy/Maria had the most sentiment values, and there were only five. Every single main character had another character’s name as one of their most common words, but there are no values associated with names. It would be interesting to see the sentiment analysis for other characters’ names, especially for the characters who play different roles each act. If we had those values, then we could really see if the lyrics follow the theme that your friends can also be your enemies. Burr would also be a character to include in that analysis, even though he is the same character in both acts, because he is Hamilton’s first friend and enemy and Hamilton is in his top ten most common words for both acts. Additionally, the sentiment analysis was not always accurate for the context of the show. Eliza’s ostinato is that she feels helplessly in love with Hamilton when he is around, but the sentiment analysis assigns the word helpless a value of -2. The word usually has a negative connotation, but given the context of the show, it has a more positive meaning.
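
The size of this gap can be measured directly. left_join() keeps every word and leaves value as NA when the lexicon has no entry, so the share of non-missing values is the lexicon's coverage. The sketch below uses made-up counts and a made-up two-word lexicon in place of get_sentiments("afinn"). All names and numbers in it are assumptions for illustration.

```r
library(dplyr)

# Made-up top words for one character
toy_top_words <- tibble(
  word = c("burr", "sir", "shot", "war", "time"),
  n = c(30L, 25L, 20L, 10L, 9L)
)
# Made-up lexicon standing in for get_sentiments("afinn")
toy_lexicon <- tibble(word = c("war", "helpless"), value = c(-2, -2))

scored <- toy_top_words %>%
  left_join(toy_lexicon, by = "word")  # value is NA for unscored words

# Share of the top words that received any sentiment value
coverage <- mean(!is.na(scored$value))
coverage
```

Reporting this share next to each graph would make it easier to see how little of a character's vocabulary the sentiment colors actually describe.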
Another limitation of this analysis was that words only counted for a character if the character sang them alone. Lyrics that multiple characters sang together did not count in any individual character’s analysis. Additionally, this data source needs a closer review because I found an error in the King’s lyrics, which could mean there are more errors in the data that were not found.
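
One possible way to bring shared lines into a character's counts is to match the character's name anywhere in the speaker label instead of requiring an exact match. The sketch below uses made-up rows with speaker labels in the style seen in the tables earlier. It assumes the stringr package, and it also shows why this approach needs care.

```r
library(dplyr)
library(stringr)

# Made-up rows with speaker labels in the style of the data
toy_lines <- tibble(
  speaker = c("BURR", "BURR & ENSEMBLE", "HAMILTON", "COMPANY (EXCEPT HAMILTON)"),
  word = c("wait", "wait", "time", "wait")
)

# Exact match: solo lines only, as in the analysis above
solo_burr <- toy_lines %>% filter(speaker == "BURR")

# Name match: solo lines plus lines shared with others
any_burr <- toy_lines %>% filter(str_detect(speaker, "\\bBURR\\b"))

# Caveat: a plain name match also catches labels that exclude the character
any_hamilton <- toy_lines %>% filter(str_detect(speaker, "\\bHAMILTON\\b"))
```

Here any_hamilton picks up the COMPANY (EXCEPT HAMILTON) row, so labels like that would have to be filtered out first. Group labels such as COMPANY or ENSEMBLE also do not say which characters are singing.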


Another analysis that could be done using this data is a deeper look into the ensemble. Most of the time the ensemble is repeating words that the main characters say, but it would be interesting to see if there are certain characters they repeat more, or whether they have parts of their own. There are certain characters in the ensemble that have featured roles, but this can’t always be seen from the lyrics. One ensemble member acts as the omen of death throughout the show, but never sings about it, because it is all expressed through the choreography.