There are certain things you expect from a musical: a large dance number before the act ends, a sad song right before the climax, a dramatic duet between the love interests, and a happy ending, just to name a few. But do all musicals follow the same format? Is the last song in Act 1 always the happiest? Does the saddest song always happen in Act 2? For my final project, I decided to track emotion throughout four musicals to see if there are any trends.
I had analyzed lyric sentiment for the most streamed musicals on Spotify for an earlier project. You can read about that project here. I decided to reanalyze the same musicals for this project. Since I already had some data, I knew what to expect and did not have to find new musicals.
In terms of trends, I wanted to see if certain types of songs appear in similar places in the soundtracks. Each soundtrack had a different number of songs, ranging from 14 songs for Dear Evan Hansen to 46 songs for Hamilton. Because of that, I focused on where a song falls within its act rather than on the exact track number. Say positive songs tend to occur at the halfway point of act one. If that hypothesis is correct, then track four for Dear Evan Hansen and track twelve for Hamilton would have higher indicators of positivity. If track two for Dear Evan Hansen and track twenty for Hamilton had the highest indicators of positivity, then there is no trend in terms of where emotion is used.
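To make "placement rather than exact number" concrete, here is a minimal sketch of how a track number can be turned into a relative position within its act. The helper `relative_position` is hypothetical (it is not part of spotifyr), and the act lengths are assumptions taken from the Act One cutoffs used in the code later in this post: 8 tracks for Dear Evan Hansen and 23 for Hamilton.

```r
# Hypothetical helper: relative position of a track within its act.
# Values near 0 are at the start of the act, values near 1 at the end.
relative_position <- function(track_in_act, tracks_in_act) {
  track_in_act / tracks_in_act
}

# Assumed Act One lengths: 8 tracks (Dear Evan Hansen), 23 tracks (Hamilton).
deh_mid   <- relative_position(4, 8)    # track four of eight
ham_mid   <- relative_position(12, 23)  # track twelve of twenty-three
deh_early <- relative_position(2, 8)    # track two of eight
ham_late  <- relative_position(20, 23)  # track twenty of twenty-three

print(round(c(deh_mid, ham_mid, deh_early, ham_late), 2))
```

Under these assumptions, track four and track twelve both land near the middle of their acts, while track two and track twenty land at opposite ends, which is why the second pair would point to no shared trend.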
The first step was getting the Spotify data, and I ran into a couple of roadblocks. First, musical albums typically do not credit just one artist. Each member of the main cast gets a credit, as well as the ensemble, which is listed as “Musical Ensemble” or “Original Broadway Cast of Musical.” When I used the get_album_data function, I got an error whether I listed only one artist or all of them. To get around the issue, I had to get each track’s individual information and then compile it into one data frame. The exception, thankfully, was Hamilton. Each performer was credited on the tracks, but only Lin-Manuel Miranda was given credit for the album as a whole. I then split each soundtrack into act one and act two to get a better picture of where the songs fall.
get_album_data("Lin-Manuel Miranda", "Hamilton (Original Broadway Cast Recording)") -> hamiltonmusical
## Warning: `mutate_()` was deprecated in dplyr 0.7.0.
## Please use `mutate()` instead.
## See vignette('programming') for more help
## Warning: `html_session()` was deprecated in rvest 1.0.0.
## Please use `session()` instead.
## Warning: All elements of `...` must be named.
## Did you want `data = c(line, lyric)`?
# Hamilton: tracks 1-23 are act one, tracks 24-46 are act two
hamiltonmusical %>%
  filter(track_n <= 23) -> hamiltionact1
hamiltonmusical %>%
  filter(track_n > 23 & track_n < 47) -> hamiltionact2

# Dear Evan Hansen: pull the track list by album ID, then attach audio features
get_album_tracks("0LhDyJXelg31FKLW5GDcKi") -> deh
deh %>%
  mutate(get_track_audio_features(deh$id)) -> deh
deh %>%
  filter(track_number <= 8) -> dehact1
deh %>%
  filter(track_number > 8) -> dehact2

# Wicked: same approach, act one ends at track 11
get_album_tracks("2MWo0RliwlkObUN13r5ITR") -> wickedmusical
wickedmusical %>%
  mutate(get_track_audio_features(wickedmusical$id)) -> wickedmusical
wickedmusical %>%
  filter(track_number <= 11) -> wickedact1
wickedmusical %>%
  filter(track_number > 11) -> wickedact2

# Be More Chill: same approach, act one ends at track 13
get_album_tracks("6emT6Wf0qivQ0Hyx0gruyr") -> bemorechill
bemorechill %>%
  mutate(get_track_audio_features(bemorechill$id)) -> bemorechill
bemorechill %>%
  filter(track_number <= 13) -> bmcact1
bemorechill %>%
  filter(track_number > 13) -> bmcact2
The breakdown in terms of songs per show and act looks like this:
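A per-act song count can be derived directly from the act cutoffs. The sketch below uses a stand-in table that contains only track numbers (no Spotify data) for the two shows whose totals are stated in this post, Hamilton with 46 tracks and Dear Evan Hansen with 14. The names `toy_tracks`, `act_one_last`, and `songs_per_act` are illustrative assumptions, not objects from the original analysis.

```r
library(dplyr)

# Stand-in table: track numbers only, with the last track of act one per show.
toy_tracks <- bind_rows(
  tibble(show = "Hamilton", track_number = 1:46, act_one_last = 23),
  tibble(show = "Dear Evan Hansen", track_number = 1:14, act_one_last = 8)
)

# Label each track with its act, then count songs per show and act.
songs_per_act <- toy_tracks %>%
  mutate(act = if_else(track_number <= act_one_last, "Act One", "Act Two")) %>%
  count(show, act, name = "songs")

print(songs_per_act)
```

The same `mutate()` and `count()` steps should work on the real data frames, as long as the track-number column name matches (the Hamilton data uses `track_n` rather than `track_number`).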
Danceability measures how suitable the song is for dancing. In this case, a more danceable song indicates the song is more upbeat and lively. These types of songs are more likely to be positive, so I treated a high danceability score as a sign that the song is more positive.
As stated earlier, I am looking for the placement of the highs and lows rather than the exact track. If the highest point is in the middle of the graph for all four shows, then that means the middle track tends to be the most upbeat.
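One way to locate the highs and lows is to find the track with the maximum and minimum value of a feature and express where it sits within the act. This is a minimal sketch on made-up danceability values for an eight-track act, not real Spotify data; `toy_act` and `extremes` are hypothetical names.

```r
library(dplyr)

# Made-up danceability values for illustration only; not real Spotify data.
toy_act <- tibble(
  track_number = 1:8,
  danceability = c(0.41, 0.55, 0.48, 0.72, 0.60, 0.52, 0.66, 0.35)
)

# Sort by track order, then record which track is highest and lowest and
# where that track falls within the act (row index divided by act length).
extremes <- toy_act %>%
  arrange(track_number) %>%
  summarise(
    highest_track    = track_number[which.max(danceability)],
    highest_position = which.max(danceability) / n(),
    lowest_track     = track_number[which.min(danceability)],
    lowest_position  = which.min(danceability) / n()
  )

print(extremes)
```

Using the row index rather than the raw track number keeps the position meaningful for act two, where track numbers do not start at one.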
In general, the most danceable songs occurred in the back half of the act. The least danceable songs, however, had no noticeable trends. Hamilton and Wicked had their least danceable songs towards the end, Dear Evan Hansen closed the act on the least danceable song, and Be More Chill opened on the least danceable song. To keep track of the highest and lowest ranking tracks, I marked them down in the following list:
Act Two was the opposite of Act One: least danceable songs had noticeable trends and most danceable songs had no distinct trends. Hamilton and Wicked ended on the least danceable song and Dear Evan Hansen had the least danceable song in the back half of the act. Be More Chill was the exception to this rule. Like Act One, the act opened on the least danceable song. Most danceable songs occurred throughout the act. Dear Evan Hansen opened on the most danceable song, Hamilton and Be More Chill had the most danceable songs in the first half, and Wicked had the most danceable song in the back half. The list for Act Two looked like this:
Just because a song is the most danceable does not automatically mean the song is the most positive. If other indicators align, then that means that section of the show is more positive. If nothing else aligns, then that means the other indicators are not correlated with one another.
Energy measures the intensity of the song. High-energy songs are louder and faster, while low-energy songs are calmer and quieter. While highest energy does not necessarily mean most positive and vice versa, upbeat songs are more likely to be positive. I expected the trends to be similar to danceability, since more upbeat songs are easier to dance to.
For Act One, there was no definitive trend for most energetic songs. Be More Chill and Dear Evan Hansen had their most energetic songs in the middle, whereas Wicked and Hamilton had their most energetic songs at the end. Least energetic songs were more likely to be towards the end of the act. Only Be More Chill had the least energetic song in the first half of the act. For Act One, the most and least energetic songs are as follows:
Right away, I noticed almost every song changed. Only track four from Dear Evan Hansen was most danceable and most energetic. Track eight from Wicked is both the least danceable and most energetic song of Act One, and almost every other high/low point shifted by two or more places in the soundtrack. I was curious to see if Act Two would produce similar results.
In Act Two, the most energetic songs were in the front half of the act. Be More Chill and Hamilton had the most energetic songs within the first two songs, Dear Evan Hansen had the most energetic song in the first half, and Wicked was the exception since the most energetic song was third to last on the soundtrack. Least energetic songs had no noticeable trends. Be More Chill had the least energetic song right after the most energetic one, Wicked had theirs in the first half of the act, Hamilton had theirs in the second half of the act, and Dear Evan Hansen ended on the least energetic song. The track numbers are as follows:
Like Act One, only one song matched in terms of danceability and energy. This time, it was track seventeen from Wicked. Everything else shifted by two or more tracks, and there was again a song that was the least danceable and most energetic. For Act Two, it was track fourteen in Be More Chill. Because there was so little overlap between the two indicators, I theorized these two were not as closely related as I thought. It would come down to my last indicator to see if there were trends for emotions or not.
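The overlap check described here can be sketched in two parts: compare which track holds the peak for each indicator, and compute a rank correlation across the whole act. The values below are made up for illustration and are not real Spotify data. Spearman correlation is my assumption of a reasonable measure here, not something the original analysis used.

```r
# Made-up values for one six-track act; not real Spotify data.
toy_act <- data.frame(
  track_number = 1:6,
  danceability = c(0.40, 0.62, 0.55, 0.71, 0.48, 0.33),
  energy       = c(0.52, 0.47, 0.80, 0.66, 0.30, 0.58)
)

# Track holding the maximum of each indicator.
most_danceable <- toy_act$track_number[which.max(toy_act$danceability)]
most_energetic <- toy_act$track_number[which.max(toy_act$energy)]

# TRUE only when the same track is the peak for both indicators.
peaks_match <- most_danceable == most_energetic

# Spearman rank correlation: do the two indicators rise and fall together?
rank_cor <- cor(toy_act$danceability, toy_act$energy, method = "spearman")

print(c(most_danceable = most_danceable, most_energetic = most_energetic))
print(peaks_match)
print(rank_cor)
```

A correlation near zero alongside mismatched peaks would support the idea that the two indicators are only loosely related, though with so few tracks per act any such number should be read cautiously.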
Valence measures the positivity of a song. The closer the valence is to 1.0, the more positive the song is. Valence only measures the sound of the song, not the actual lyrics.
Valence could indicate one of two things. The first option is that valence could match up with neither of the two indicators. If this option happened, then that means there are no trends for emotions in songs, and emotion relies more on the plot and dialogue than on where the song occurs in the act. The second option is that valence matches up with one of the two indicators, which would mean the other indicator is weaker in terms of determining emotion.
Like energy, the most positive songs in Act One were in the back half of the soundtrack. The exception to the rule was Dear Evan Hansen, which opened the show with the most positive song of the act. Least positive songs were more likely to be in the very beginning of the show. The exception to this trend was Hamilton, with track seventeen being the least positive. The track numbers are as follows:
There was some overlap between danceability and positivity. Both track fifteen of Hamilton and track seven of Dear Evan Hansen were considered the most danceable and most positive songs. Track seventeen of Hamilton and track one of Be More Chill were also considered the least danceable and least positive songs. There was no overlap between positivity and energy.
In Act Two, unlike Act One, the most positive songs were more likely to be in the front half of the act. Be More Chill opened on the most positive song, Hamilton and Dear Evan Hansen had theirs in the first half, and Wicked was the exception. The least positive songs of Act Two were all more likely to be in the back half of the show. The track numbers are as follows:
Like Act One, danceability and positivity overlapped. Track seventeen of Wicked was both most danceable and positive, while track nineteen was both least danceable and positive. Track forty-six of Hamilton was also the least danceable and positive. Track fourteen of Be More Chill was the most positive and least danceable. There was one song that overlapped between positivity and energy, and that was track fifteen of Be More Chill.
I thought I had chosen metrics that were similar enough to one another to show noticeable overlap. Clearly, I was mistaken. Very few metrics overlapped with one another, and most times the high and low points did not match up. Because of this observation, I have come to the conclusion that emotion does not influence the placement of a song within a soundtrack. Emotional songs, whether super positive or super negative, will be placed where they make sense in the plot.