When people think of country music, they often think of country radio. The stereotypes that go along with it deter a lot of people from the genre altogether. One of the biggest is that country music is only about trucks, beer, and girls. While this may be true for some artists, I want to show the range of country music by comparing two very different artists: Morgan Wallen and Zach Bryan. Morgan Wallen is a well-known country artist who fits into the “trucks, beer, and girls” category. In contrast, Zach Bryan focuses on bringing country music back to storytelling and leans more towards country folk. I decided to analyze their most recent albums, “Dangerous: The Double Album” (30 songs) and “American Heartbreak” (34 songs).
The data was collected with the Genius API and provided by my instructor. The two albums’ Genius pages are “Morgan Wallen - Dangerous: The Double Album Lyrics and Tracklist | Genius” and “Zach Bryan - American Heartbreak Lyrics and Tracklist | Genius”.
I first converted the list of lyrics into a data frame.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(tidytext)
library(jsonlite)
##
## Attaching package: 'jsonlite'
##
## The following object is masked from 'package:purrr':
##
## flatten
library(ggthemes)
AmericanHeartbreak <- fromJSON("C:/Users/stilt/OneDrive/Desktop/Lyrics_AmericanHeartbreak.json")
Dangerous <- fromJSON("C:/Users/stilt/OneDrive/Desktop/Lyrics_DangerousTheDoubleAlbum.json")
# Pull the per-song table out of each album and tag every row with its album name
as.data.frame(AmericanHeartbreak$tracks$song) -> AmericanHeartbreak_df
AmericanHeartbreak_df %>%
mutate(album = "AmericanHeartbreak") -> AmericanHeartbreak_df
as.data.frame(Dangerous$tracks$song) -> Dangerous_df
Dangerous_df %>%
mutate(album = "Dangerous") -> Dangerous_df
library(tidytext)
Dangerous_df %>%
unnest_tokens(word, lyrics) -> Dangerous_words
Dangerous_words %>%
filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
#count(word, sort = TRUE) %>%
inner_join(get_sentiments('afinn')) -> Dangerous_sentiment
## Joining, by = "word"
mean(Dangerous_sentiment$value)
## [1] -0.0951586
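The overall sentiment of “Dangerous: The Double Album,” measured as the mean AFINN score of every scored word in the lyrics, was -0.0951586. AFINN scores run from -5 (most negative) to 5 (most positive), so the album sits just below neutral.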
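To make clear what this number measures: the mean is taken over every scored token, so a word sung three times counts three times. Here is a minimal sketch on a made-up lyric. The names `toy_lyrics` and `toy_afinn` are invented for this illustration, and the three-word lexicon is only a stand-in for `get_sentiments('afinn')`; its values for “love,” “damn,” and “hell” match the AFINN scores printed in the tables later in this post.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Made-up lyric for illustration only (not taken from either album)
toy_lyrics <- tibble(lyrics = "Love love love that damn truck, hell yeah")

# Tiny stand-in for get_sentiments('afinn'), with the same column names
toy_afinn <- tibble(word  = c("love", "damn", "hell"),
                    value = c(3, -4, -4))

toy_scored <- toy_lyrics %>%
  unnest_tokens(word, lyrics) %>%      # one row per lowercased token
  inner_join(toy_afinn, by = "word")   # keep only tokens that have a score

# Token-weighted mean: "love" contributes three times
toy_mean <- mean(toy_scored$value)
toy_mean
```

Words without a score (“that,” “truck,” “yeah”) are dropped by the inner join, so they neither raise nor lower the mean.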
library(wordcloud2)
Dangerous_words %>%
anti_join(stop_words) %>%
# drop section labels, page residue, and the artist's name
filter(!word %in% c('chorus', 'verse', 'tickets', 'yeah', 'lyrics', 'outro', 'intro', '1', '2', '138you', 'liveget', 'morgan', 'wallen', 'ooh')) %>%
count(word, sort = TRUE) %>%
#head(100) %>%
wordcloud2(shape = 'circle')
## Joining, by = "word"
I made a word cloud for “Dangerous: The Double Album.” The bigger the word, the more often it appears on the album. As I expected, Morgan Wallen uses words like “beer” and “girl” a lot. “Country” is the most used word on the entire album, which I did not initially expect, but it makes sense given his songs: the titles include “Somethin’ Country,” “Country A$$ Shit,” and “Whatcha Think of Country Now.”
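The exclusion lists in this post are typed out by hand before each plot and differ slightly from one plot to the next. One way to keep them in sync is sketched below. `noise_tokens` and `clean_words()` are hypothetical names, not part of the original analysis, and the digit pattern is broader than the hand-typed lists because it drops every all-digit token rather than a chosen few, so the plots could change slightly.

```r
library(dplyr)
library(tibble)

# Hypothetical shared list: section labels, page residue, and artist names
noise_tokens <- c("chorus", "verse", "intro", "outro", "lyrics", "tickets",
                  "liveget", "138you", "173you", "yeah", "ooh", "woah",
                  "morgan", "wallen", "zach", "bryan")

# Drop listed tokens and any token made only of digits (e.g. "1", "10")
clean_words <- function(words, noise = noise_tokens) {
  words %>%
    filter(!word %in% noise,
           !grepl("^[0-9]+$", word))
}

# Small made-up token table to show the effect
toy_words <- tibble(word = c("chorus", "country", "1", "138you", "beer", "10"))
cleaned <- clean_words(toy_words)
cleaned$word
```

With a helper like this, each plot could start from something like `Dangerous_words %>% anti_join(stop_words, by = "word") %>% clean_words()` rather than repeating the list.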
Dangerous_words %>%
anti_join(stop_words) %>%
filter(!word %in% c('chorus', 'verse', '10', 'ooh', 'tickets', 'outro', '4', '5', 'yeah', '1', '2', '3', 'lyrics', '138you', 'liveget', 'tickets', 'morgan', 'wallen')) %>%
count(word, sort = TRUE) %>%
left_join(get_sentiments('afinn')) %>%
head(10) %>%
ggplot(aes(reorder(word, n) ,n, fill = value)) + geom_col() + coord_flip() + theme_economist()
## Joining, by = "word"
## Joining, by = "word"
Dangerous_words %>%
anti_join(stop_words) %>%
filter(!word %in% c('chorus', 'verse', '10', 'woah', 'tickets', 'outro', '4', '5', 'yeah', '1', '2', '3', 'lyrics', '138you', 'liveget', 'tickets', 'morgan', 'wallen')) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments('afinn')) %>%
head(10) %>%
ggplot(aes(reorder(word, n) ,n, fill = value)) + geom_col() + coord_flip() + theme_economist()
## Joining, by = "word"
## Joining, by = "word"
I made two bar graphs of the 10 most frequent words. The first covers all words (minus stop words). I found it interesting that although “girl” is #2, “love” does not appear until #9. “Love” is also the only word with a sentiment score that made it into the overall top 10. The second graph shows the top 10 words that have a sentiment score. I was surprised to see that “ass” is #2. It is also clear that the overall sentiment of these top 10 words leans negative.
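The only difference between the two graphs is the join: `left_join()` keeps every word and leaves `value` as `NA` where AFINN has no score, while `inner_join()` keeps only the scored words. A small sketch of that difference follows. The counts for “country” and “girl” are made up, and `toy_counts` and `toy_afinn` are illustrative names; the values for “love” and “ass” come from the tables in this post.

```r
library(dplyr)
library(tibble)

# Illustrative word counts; "country" and "girl" counts are invented
toy_counts <- tibble(word = c("country", "girl", "love", "ass"),
                     n    = c(40, 35, 27, 23))

# Stand-in for get_sentiments('afinn'); only some words carry a score
toy_afinn <- tibble(word  = c("love", "ass"),
                    value = c(3, -4))

# left_join: every word survives, unscored words get value = NA
all_words <- left_join(toy_counts, toy_afinn, by = "word")

# inner_join: only words found in the lexicon survive
scored_words <- inner_join(toy_counts, toy_afinn, by = "word")

all_words
scored_words
```

In the first graph the `NA` rows are still plotted, just without a sentiment color, which is why only “love” stands out there.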
Dangerous_df %>%
unnest_tokens(word, lyrics) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(value) -> Dangerous_afinn_neg
## Joining, by = "word"
## Joining, by = "word"
Dangerous_afinn_neg %>%
head(10)
## word n value
## 1 ass 23 -4
## 2 damn 13 -4
## 3 hell 11 -4
## 4 shit 8 -4
## 5 damned 1 -4
## 6 kill 11 -3
## 7 hate 5 -3
## 8 bad 4 -3
## 9 dumb 4 -3
## 10 liar 3 -3
Dangerous_df %>%
unnest_tokens(word, lyrics) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(desc(value)) -> Dangerous_afinn_pos
## Joining, by = "word"
## Joining, by = "word"
Dangerous_afinn_pos %>%
head(10)
## word n value
## 1 funny 1 4
## 2 win 1 4
## 3 love 27 3
## 4 perfect 2 3
## 5 luck 1 3
## 6 strong 8 2
## 7 top 7 2
## 8 kiss 6 2
## 9 care 5 2
## 10 ha 5 2
I then looked at the 10 most negative and the 10 most positive words on this album, ranked by AFINN value. I felt that his negative words were tame for what they could have been. The table is sorted by score first, with ties left in order of frequency, so “ass” is ranked #1. Among the positive words, I find it interesting that “funny” has a higher value than “love.” “Funny” is said only once, and not in an overly positive way: it appears in “More Surprised than Me,” where he is saying that all of the assumptions about him are funny.
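Sorting by AFINN value alone lets a word used once outrank a word used 27 times. An alternative ranking, which is not the one used in this post, weights each score by its count. The sketch below copies the counts and values from the positive-word table above; `dangerous_pos` and the `contribution` column are names introduced only for this illustration.

```r
library(dplyr)
library(tibble)

# Counts and AFINN values copied from the positive-word table above
dangerous_pos <- tibble(
  word  = c("funny", "win", "love", "perfect", "luck",
            "strong", "top", "kiss", "care", "ha"),
  n     = c(1, 1, 27, 2, 1, 8, 7, 6, 5, 5),
  value = c(4, 4, 3, 3, 3, 2, 2, 2, 2, 2)
)

# Weight each word's score by how often it is used, then rank by that
pos_ranked <- dangerous_pos %>%
  mutate(contribution = n * value) %>%
  arrange(desc(contribution))

head(pos_ranked, 3)
```

Ranked this way, “love” (27 × 3 = 81) moves to the top and “funny” (1 × 4 = 4) falls near the bottom, which matches how the album actually sounds more closely than the raw score does.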
library(tidytext)
AmericanHeartbreak_df %>%
unnest_tokens(word, lyrics) -> AmericanHeartbreak_words
AmericanHeartbreak_words %>%
filter(!word %in% c('chorus', 'verse', 'ooh', 'yeah')) %>%
#count(word, sort = TRUE) %>%
inner_join(get_sentiments('afinn')) -> AmericanHeartbreak_sentiment
## Joining, by = "word"
mean(AmericanHeartbreak_sentiment$value)
## [1] 0.2118959
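The overall sentiment of the “American Heartbreak” album, measured the same way as the mean AFINN score of its scored words, is 0.2118959: slightly positive, compared with the slightly negative -0.0951586 for “Dangerous: The Double Album.”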
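The two means were computed in separate pipelines. They could also be produced side by side from one combined data frame, assuming each album's data frame carries `album` and `lyrics` columns like the ones used earlier. The sketch below uses two made-up one-line “albums” and a five-word stand-in lexicon (values match the AFINN scores printed in this post); `toy_a`, `toy_b`, `toy_afinn`, and `album_means` are illustrative names.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Made-up one-line "albums" standing in for the real data frames
toy_a <- tibble(album = "A", lyrics = "damn that hell of a love")
toy_b <- tibble(album = "B", lyrics = "happy love in a beautiful damn town")

# Tiny stand-in for get_sentiments('afinn')
toy_afinn <- tibble(word  = c("love", "happy", "beautiful", "damn", "hell"),
                    value = c(3, 3, 3, -4, -4))

album_means <- bind_rows(toy_a, toy_b) %>%
  unnest_tokens(word, lyrics) %>%          # keeps the album column
  inner_join(toy_afinn, by = "word") %>%   # scored tokens only
  group_by(album) %>%
  summarise(scored_tokens = n(),
            mean_value    = mean(value))

album_means
```

Reporting the number of scored tokens next to each mean also shows how much text the average is based on.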
library(wordcloud2)
AmericanHeartbreak_words %>%
anti_join(stop_words) %>%
# drop section labels, page residue, and the artist's name
filter(!word %in% c('chorus', 'verse', '10', 'tickets', 'outro', '4', '5', 'yeah', '1', '2', '3', 'lyrics', '173you', 'liveget', 'zach', 'bryan')) %>%
count(word, sort = TRUE) %>%
head(400) %>%
wordcloud2(shape = 'circle')
## Joining, by = "word"
I made another word cloud for the “American Heartbreak” album. “Night,” “time,” and “home” look to be some of the most prominent words. The typical country words seen in the Morgan Wallen album are not as large in this word cloud.
AmericanHeartbreak_words %>%
anti_join(stop_words) %>%
filter(!word %in% c('chorus', 'verse', '10', 'tickets', 'outro', '4', '5', 'yeah', '1', '2', '3', 'lyrics', '173you', 'liveget', 'tickets', 'zach', 'bryan')) %>%
count(word, sort = TRUE) %>%
left_join(get_sentiments('afinn')) %>%
head(10) %>%
ggplot(aes(reorder(word, n) ,n, fill = value)) + geom_col() + coord_flip() + theme_economist()
## Joining, by = "word"
## Joining, by = "word"
AmericanHeartbreak_words %>%
anti_join(stop_words) %>%
filter(!word %in% c('chorus', 'verse', '10', 'tickets', 'outro', '4', '5', 'yeah', '1', '2', '3', 'lyrics', '173you', 'liveget', 'tickets', 'zach', 'bryan')) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments('afinn')) %>%
head(10) %>%
ggplot(aes(reorder(word, n) ,n, fill = value)) + geom_col() + coord_flip() + theme_economist()
## Joining, by = "word"
## Joining, by = "word"
I again made two bar graphs of the 10 most frequent words. The first covers all words (minus stop words). I was very surprised to see that “time” was the most frequent word. Looking back through the song titles, “Poems and Closing Time” and “Morning Time” have contributed to this. The second graph shows the top 10 words that have a sentiment score, and the sentiment looks “good” overall. I feel that this top 10 is a good representation of the overall vibe of the album: happy, yet sad, but also comforting.
AmericanHeartbreak_df %>%
unnest_tokens(word, lyrics) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(value) -> AmericanHeartbreak_afinn_neg
## Joining, by = "word"
## Joining, by = "word"
AmericanHeartbreak_afinn_neg %>%
head(10)
## word n value
## 1 bastards 1 -5
## 2 damn 21 -4
## 3 hell 20 -4
## 4 ass 5 -4
## 5 fucking 5 -4
## 6 shit 3 -4
## 7 damned 2 -4
## 8 asshole 1 -4
## 9 fuck 1 -4
## 10 pissed 1 -4
AmericanHeartbreak_df %>%
unnest_tokens(word, lyrics) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(desc(value)) -> AmericanHeartbreak_afinn_pos
## Joining, by = "word"
## Joining, by = "word"
AmericanHeartbreak_afinn_pos %>%
head(10)
## word n value
## 1 fun 2 4
## 2 heavenly 2 4
## 3 win 1 4
## 4 love 35 3
## 5 happy 23 3
## 6 loved 7 3
## 7 beautiful 6 3
## 8 glad 3 3
## 9 luck 3 3
## 10 perfect 3 3
I also looked at the 10 most negative and the 10 most positive words on this album. He uses strong negative words. Although they are rather harsh, he uses them for emphasis and to add emotion to his lyrics. Among his positive words, I want to highlight “heavenly” and “beautiful.” These are worth noting because they are more complex ways to describe a person or place than the words often used in country songs.
In conclusion, the data seems very representative of each album and of the category of country music it falls into. On the Morgan Wallen album, the most popular words are very basic and stereotypical. I felt the graph of words with sentiment captured his album best. Its top five words, “love,” “ass,” “wasted,” “dirt,” and “damn,” summarize a lot of the songs on this album. I also feel that these are the kinds of words that often steer people away from country music.
Comparing this album to the Zach Bryan album was very interesting to me. Right from the beginning, his album had a higher sentiment value than Morgan Wallen’s: Bryan’s mean is positive (0.21) and Wallen’s is slightly negative (-0.10). “Time” was the most frequent word, which I feel is very unique to Bryan. As with Wallen, the graph of words with sentiment is the most representative of his album. Words like “happy,” “sunshine,” “die,” “safe,” and “pray” show the range of his lyrics and Bryan’s ability to make the listener feel multiple emotions.
The biggest thing that stuck out to me was the difference between the two sets of negative words. Bryan used strong negative words that evoke emotion, while Wallen used tamer words that packed less of a punch. I initially would have thought that Wallen would use worse words because his music is harsher, but I was incorrect. Overall, I would say that I have succeeded in demonstrating the range of country music.