library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.0.6 ✓ dplyr 1.0.4
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(tidytext)
library(genius)
library(wordcloud2)
library(rmarkdown)
library(devtools)
## Loading required package: usethis
Taylor Swift is one of the world’s best-known singers. She has 9 successful, award-winning albums filled with catchy songs. Swift was originally known for country music until fans noticed a genre switch on her 4th album, Red. From then on, Taylor Swift was a pop artist. Pop and country are very different, but one thing that has stayed fairly constant in her music is the theme of love. No matter what stage of her life she was in, Taylor centered most of her songs on love. I thought it would be interesting to analyze which of the 9 albums focused most on the theme of love, and how the sentiment around that theme differs across those 9 albums.
I obtained data on Taylor Swift’s 9 albums from Genius, a website that holds the biggest collection of song lyrics and musical knowledge. Her artist page, https://genius.com/artists/Taylor-swift, links to her songs, playlists, and albums. When I clicked “show all albums by Taylor Swift”, it listed many variations of her 9 albums, so I used Wikipedia’s discography page to settle on her 9 distinct studio albums and focused only on those: https://en.wikipedia.org/wiki/Taylor_Swift_albums_discography
I predict that as Taylor’s albums go on, the theme of love fades. I think that love was most prominent in her first few albums, and as Taylor got progressively more famous, she took on a more mature and sophisticated perspective on the “fairy-tale” love she wrote about when she was younger. When analyzing the sentiment behind each album, I expect to find that her more recent albums have more negative words than her earlier albums. To complete this project, I used the following packages: tidyverse, tidytext, genius, wordcloud2, ggplot2, rmarkdown and devtools. I created the love words list by looking up synonyms of love and reading definitions from https://www.merriam-webster.com/dictionary/love
The layout of my project is as follows: I run through the albums one by one, in chronological order. For each album, I look at the top words said in the album in a wordcloud and a table. From there, I am able to determine the number of distinct words said in the album. Next, I apply my love words list to the album and get a table of which love words are said in the album and how often. Lastly, at the individual album level, I look at the nrc sentiment for each album. The nrc lexicon puts words into the categories positive, negative, anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. This helps me determine the theme of her music over the years and whether it has shifted.
After analyzing each album on its own, I look at the 9 albums as a whole. I first find each album’s average sentiment using afinn. Using this data, I create a dataset to see the albums side by side with their average sentiment value in a table format. I then create a visualization of the albums and their average sentiment in order to compare the 9 of them.
My list of “love words”:
love_words <- c("love", "loved", "loving", "lover", "feel", "felt", "feeling", "heart", "boyfriend", "girlfriend", "husband", "wife", "marriage", "marry", "married", "baby", "kiss", "touch", "personal", "tender", "affectionate", "affection", "passion", "passionate", "desire", "desirable", "desired", "want", "wanted", "wanting", "devoted", "devotion", "devote", "endearment", "endear", "admire", "admiring", "admired", "adore", "adoring", "adornment", "sex", "sexual", "attached", "attachment", "appreciate", "appreciated", "appreciating", "cherish", "cherishing")
TaylorSwift <- genius_album("Taylor Swift", "Taylor Swift")
## Joining, by = c("album_name", "track_n", "track_url")
TaylorSwift %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Taylor Swift. The most popular word of the album was “wanna” at 26.
TaylorSwift %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 641 x 2
## word n
## <chr> <int>
## 1 you 213
## 2 i 159
## 3 the 155
## 4 and 130
## 5 a 95
## 6 me 85
## 7 to 84
## 8 my 70
## 9 that 70
## 10 on 60
## # … with 631 more rows
The tibble has 641 rows, so the album Taylor Swift contains 641 distinct words (each word is counted once, however often it is sung).
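The row count of `count(word)` is the number of distinct words, which is not the same as the total number of words sung. A minimal sketch of the difference, assuming a made-up two-line lyric (`toy_album` is hypothetical data, not something pulled from Genius):

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical stand-in for a genius_album() result: one row per lyric line.
toy_album <- tibble(
  track_n = c(1, 1),
  lyric   = c("You belong with me", "You belong, you belong")
)

word_summary <- toy_album %>%
  unnest_tokens(word, lyric) %>%        # one lowercase token per row
  summarise(
    total_words    = n(),               # every token, repeats included
    distinct_words = n_distinct(word)   # matches the row count of count(word)
  )

word_summary
```

Here the total is larger than the distinct count because "you" and "belong" repeat.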
TaylorSwift %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 371 x 2
## word n
## <chr> <int>
## 1 wanna 26
## 2 beautiful 20
## 3 should've 20
## 4 love 19
## 5 song 18
## 6 baby 15
## 7 time 15
## 8 hope 13
## 9 eyes 12
## 10 smile 12
## # … with 361 more rows
This code enabled me to look at the top 10 words of the album, Taylor Swift, in a table view.
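Since the same tokenize, filter, and count steps are repeated for every album, they could be wrapped in a small helper. This is only a sketch: `count_album_words()` and `toy_album` are hypothetical names, and the helper assumes the input has a `lyric` column like the `genius_album()` results above.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical helper: tokenize an album, drop stop words, count what is left.
count_album_words <- function(album) {
  album %>%
    unnest_tokens(word, lyric) %>%
    anti_join(stop_words, by = "word") %>%   # by = "word" silences the join message
    count(word, sort = TRUE)
}

# Made-up lyrics standing in for a real album.
toy_album <- tibble(lyric = c("Dancing in the rain", "Dancing all night"))

toy_counts <- count_album_words(toy_album)
toy_counts
```

With a helper like this, each album section would reduce to one call, for example `count_album_words(Fearless)`.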
Now, I want to analyze the love words that were said in the album, Taylor Swift.
TaylorSwift %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
## word n
## <chr> <int>
## 1 love 19
## 2 baby 15
## 3 heart 11
## 4 feel 5
## 5 kiss 4
## 6 adore 1
## 7 feeling 1
## 8 loving 1
After comparing the love words to the words said in the album, I found that 8 distinct love words were said, with “love” being the most frequent at 19 times.
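Raw counts depend on how long an album is, so one way to compare albums fairly is the share of all words that are love words. The sketch below assumes a shortened list (`toy_love_words`) and made-up lyrics (`toy_album`); both are hypothetical and only illustrate the calculation.

```r
library(dplyr)
library(tibble)
library(tidytext)

toy_love_words <- c("love", "heart", "kiss")   # shortened, hypothetical list

toy_album <- tibble(
  lyric = c("I love you", "You broke my heart", "Drive away")
)

love_share <- toy_album %>%
  unnest_tokens(word, lyric) %>%
  summarise(
    total_words = n(),                           # all tokens in the album
    love_hits   = sum(word %in% toy_love_words), # tokens found in the list
    love_rate   = love_hits / total_words        # share of love words
  )

love_share
```

Swapping in a real album and the full `love_words` vector would give one rate per album that can be compared directly.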
Now I’ll look into the sentiment of the album, Taylor Swift.
TaylorSwift%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anticipation, joy, positive, surprise, trust.
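The chart above shows only the 10 most frequent word and sentiment pairs, so one word can appear under several sentiments. An alternative view is to total the counts per sentiment category. The sketch below assumes a tiny hand-made lexicon (`toy_nrc`) with the same `word` and `sentiment` columns as `get_sentiments("nrc")`; its entries are invented for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical mini-lexicon shaped like get_sentiments("nrc").
toy_nrc <- tribble(
  ~word,   ~sentiment,
  "love",  "joy",
  "love",  "positive",
  "smile", "joy",
  "smile", "positive",
  "cry",   "sadness",
  "cry",   "negative"
)

# Made-up tokens standing in for an album after unnest_tokens().
toy_tokens <- tibble(word = c("love", "love", "smile", "cry", "car"))

sentiment_totals <- toy_tokens %>%
  count(word) %>%                           # one row per distinct word
  inner_join(toy_nrc, by = "word") %>%      # words outside the lexicon drop out
  count(sentiment, wt = n, sort = TRUE)     # add up word counts per category

sentiment_totals
```

This counts every matched word, not just the top 10 pairs, so it gives a fuller picture of which categories dominate an album.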
Now, I’ll analyze her second album, Fearless.
Fearless <- genius_album("Taylor Swift", "Fearless")
## Joining, by = c("album_name", "track_n", "track_url")
Fearless %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Fearless. The most popular word of the album was “la” at 26.
Fearless %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 745 x 2
## word n
## <chr> <int>
## 1 you 245
## 2 and 206
## 3 i 203
## 4 the 128
## 5 to 96
## 6 me 87
## 7 a 69
## 8 know 61
## 9 you're 59
## 10 in 58
## # … with 735 more rows
The tibble has 745 rows, so Fearless contains 745 distinct words.
Fearless %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 467 x 2
## word n
## <chr> <int>
## 1 la 26
## 2 feel 22
## 3 baby 16
## 4 love 16
## 5 time 15
## 6 belong 12
## 7 rains 12
## 8 feeling 11
## 9 loved 10
## 10 run 10
## # … with 457 more rows
This code enabled me to look at the top 10 words of the album, Fearless, in a table view.
Now, I want to analyze the love words that were said in the album, Fearless.
Fearless %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
## word n
## <chr> <int>
## 1 feel 22
## 2 baby 16
## 3 love 16
## 4 feeling 11
## 5 loved 10
## 6 kiss 6
## 7 marry 2
## 8 girlfriend 1
It was interesting to see that 8 distinct love words appeared again in her next album. “Feel” was the most frequent love word in this album, at 22.
Now I’ll look into the sentiment of the album, Fearless.
Fearless%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anger, anticipation, disgust, fear, joy, positive. This album showed more negative sentiments than her previous album.
Next, I’ll look at her third album, Speak Now.
SpeakNow <- genius_album("Taylor Swift", "Speak Now")
## Joining, by = c("album_name", "track_n", "track_url")
SpeakNow %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Speak Now. The most popular word of the album was “time” at 28.
SpeakNow %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 925 x 2
## word n
## <chr> <int>
## 1 you 295
## 2 i 209
## 3 the 209
## 4 and 157
## 5 to 102
## 6 me 94
## 7 a 88
## 8 your 82
## 9 on 77
## 10 it 75
## # … with 915 more rows
The tibble has 925 rows, so Speak Now contains 925 distinct words.
SpeakNow %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 643 x 2
## word n
## <chr> <int>
## 1 time 28
## 2 grow 21
## 3 meet 17
## 4 mind 17
## 5 gonna 13
## 6 live 13
## 7 night 13
## 8 remember 13
## 9 eyes 12
## 10 forever 12
## # … with 633 more rows
This code enabled me to look at the top 10 words of the album, Speak Now, in a table view.
Now, I want to analyze the love words that were said in the album, Speak Now.
SpeakNow %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
## word n
## <chr> <int>
## 1 love 12
## 2 kiss 8
## 3 feel 7
## 4 baby 6
## 5 loved 4
## 6 heart 3
## 7 feeling 2
## 8 touch 2
Speak Now again contains 8 distinct love words, the same number as the first two albums, although the words and their counts differ. “Love” was the most frequent at 12.
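One way to check how much two albums’ love words overlap is with base R set operations. The vectors below are copied from the printed love word tables for Taylor Swift and Speak Now; the variable names are my own.

```r
# Love words copied from the printed tables for the two albums.
taylor_swift_love <- c("love", "baby", "heart", "feel", "kiss",
                       "adore", "feeling", "loving")
speak_now_love    <- c("love", "kiss", "feel", "baby", "loved",
                       "heart", "feeling", "touch")

shared         <- intersect(taylor_swift_love, speak_now_love)  # in both albums
only_debut     <- setdiff(taylor_swift_love, speak_now_love)    # debut only
only_speak_now <- setdiff(speak_now_love, taylor_swift_love)    # Speak Now only

shared
only_debut
only_speak_now
```

The two albums share six love words, while “adore” and “loving” appear only on the debut and “loved” and “touch” only on Speak Now.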
Now I’ll look into the sentiment of the album, Speak Now.
SpeakNow%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anticipation, joy, positive, surprise, trust. These are exactly the same sentiments as her first album, Taylor Swift.
Next, I’ll look at her fourth album, Red.
Red <- genius_album("Taylor Swift", "Red")
## Joining, by = c("album_name", "track_n", "track_url")
Red %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Red. The most popular word of the album was “time” at 66. It is noted that “time” was also the most popular word of Taylor’s album before this, Speak Now.
Red %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 917 x 2
## word n
## <chr> <int>
## 1 you 355
## 2 i 261
## 3 and 226
## 4 the 184
## 5 oh 108
## 6 me 107
## 7 to 104
## 8 a 102
## 9 in 98
## 10 like 91
## # … with 907 more rows
The tibble has 917 rows, so Red contains 917 distinct words.
Red %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 630 x 2
## word n
## <chr> <int>
## 1 time 66
## 2 ooh 60
## 3 stay 32
## 4 trouble 32
## 5 yeah 28
## 6 home 25
## 7 love 25
## 8 starlight 22
## 9 wanna 22
## 10 dancing 18
## # … with 620 more rows
This code enabled me to look at the top 10 words of the album, Red, in a table view.
Now, I want to analyze the love words that were said in the album, Red.
Red %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 13 x 2
## word n
## <chr> <int>
## 1 love 25
## 2 loving 12
## 3 feeling 10
## 4 feel 6
## 5 heart 5
## 6 loved 4
## 7 baby 2
## 8 kiss 2
## 9 touch 2
## 10 attached 1
## 11 lover 1
## 12 married 1
## 13 passionate 1
Red finally showed new data, with 13 distinct love words in the album. “Love” was the most frequent love word at 25.
Sentiment for Red:
Red%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anger, anticipation, disgust, fear, joy, negative, positive, sadness. It is noted that this album had similar sentiments as Fearless, with the addition of negative and sadness.
Data for the fifth album, 1989
genius_album("Taylor Swift", "1989") -> Taylor1989
## Joining, by = c("album_name", "track_n", "track_url")
Taylor1989 %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, 1989. The most popular word of the album was “shake” at 78.
Taylor1989 %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 698 x 2
## word n
## <chr> <int>
## 1 i 358
## 2 you 261
## 3 the 196
## 4 oh 180
## 5 and 160
## 6 we 127
## 7 to 109
## 8 it 106
## 9 in 93
## 10 are 83
## # … with 688 more rows
The tibble has 698 rows, so 1989 contains 698 distinct words.
Taylor1989 %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 457 x 2
## word n
## <chr> <int>
## 1 shake 78
## 2 love 77
## 3 woods 39
## 4 stay 33
## 5 baby 32
## 6 gonna 30
## 7 york 30
## 8 girl 25
## 9 bad 23
## 10 hey 22
## # … with 447 more rows
This code enabled me to look at the top 10 words of the album, 1989, in a table view.
Now, I want to analyze the love words that were said in the album, 1989.
Taylor1989 %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 5 x 2
## word n
## <chr> <int>
## 1 love 77
## 2 baby 32
## 3 heart 3
## 4 kiss 2
## 5 girlfriend 1
1989 only had 5 distinct love words, but it had the highest count of the word “love” at 77, noticeably more than in any of the other albums.
Sentiment for 1989:
Taylor1989 %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anger, disgust, fear, joy, negative, positive, sadness. I found it interesting that more of these sentiment categories are negative than positive.
Next, I’ll look at her sixth album, Reputation.
Reputation <- genius_album("Taylor Swift", "Reputation")
## Joining, by = c("album_name", "track_n", "track_url")
Reputation %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Reputation. The most popular word of the album was “di” at 81.
Reputation %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 932 x 2
## word n
## <chr> <int>
## 1 you 325
## 2 i 295
## 3 the 226
## 4 it 213
## 5 me 190
## 6 and 161
## 7 my 143
## 8 a 137
## 9 to 107
## 10 so 104
## # … with 922 more rows
The tibble has 932 rows, so Reputation contains 932 distinct words.
Reputation %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 659 x 2
## word n
## <chr> <int>
## 1 di 81
## 2 call 46
## 3 ooh 39
## 4 wanna 37
## 5 ha 34
## 6 time 34
## 7 ah 33
## 8 baby 33
## 9 yeah 32
## 10 bad 31
## # … with 649 more rows
This code enabled me to look at the top 10 words of the album, Reputation, in a table view.
Now, I want to analyze the love words that were said in the album, Reputation.
Reputation %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 11 x 2
## word n
## <chr> <int>
## 1 baby 33
## 2 feel 20
## 3 love 19
## 4 heart 14
## 5 feeling 8
## 6 touch 7
## 7 kiss 3
## 8 loved 3
## 9 boyfriend 1
## 10 girlfriend 1
## 11 lover 1
The top love word in the Reputation album was “baby”, said 33 times, and there was a total of 11 distinct love words in the album.
Sentiment for Reputation:
Reputation%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anger, anticipation, disgust, fear, joy, negative, positive, sadness. As with the previous album, more of these sentiment categories are negative than positive.
Next, I’ll look at her seventh album, Lover.
Lover <- genius_album("Taylor Swift", "Lover")
## Joining, by = c("album_name", "track_n", "track_url")
Lover %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Lover. The most popular word of the album was “ooh” at 69.
Lover %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 1,029 x 2
## word n
## <chr> <int>
## 1 i 396
## 2 you 263
## 3 the 243
## 4 and 155
## 5 my 148
## 6 me 132
## 7 a 117
## 8 to 115
## 9 oh 102
## 10 in 96
## # … with 1,019 more rows
The tibble has 1,029 rows, so Lover contains 1,029 distinct words.
Lover %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 759 x 2
## word n
## <chr> <int>
## 1 ooh 69
## 2 love 44
## 3 wanna 42
## 4 daylight 40
## 5 ah 29
## 6 baby 29
## 7 yeah 25
## 8 street 23
## 9 walk 19
## 10 home 18
## # … with 749 more rows
This code enabled me to look at the top 10 words of the album, Lover, in a table view.
Now, I want to analyze the love words that were said in the album, Lover.
Lover %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 10 x 2
## word n
## <chr> <int>
## 1 love 44
## 2 baby 29
## 3 lover 8
## 4 touch 6
## 5 feeling 5
## 6 heart 4
## 7 kiss 4
## 8 marry 4
## 9 adore 1
## 10 loved 1
Once again, “love” is the top love word, this time in the album Lover, at 44. 10 distinct love words appear in this album.
Sentiment for Lover:
Lover%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anger, anticipation, disgust, fear, joy, negative, positive, sadness. These are exactly the same sentiments as the previous album, Reputation.
Next, I’ll look at her eighth album, Folklore.
Folklore <- genius_album("Taylor Swift", "Folklore")
## Joining, by = c("album_name", "track_n", "track_url")
Folklore %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Folklore. The most popular word of the album was “time” at 34.
Folklore %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 1,020 x 2
## word n
## <chr> <int>
## 1 you 243
## 2 i 234
## 3 the 178
## 4 and 120
## 5 me 97
## 6 a 92
## 7 to 90
## 8 in 85
## 9 my 85
## 10 your 64
## # … with 1,010 more rows
The tibble has 1,020 rows, so Folklore contains 1,020 distinct words.
Folklore %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 738 x 2
## word n
## <chr> <int>
## 1 time 34
## 2 ooh 21
## 3 love 13
## 4 mad 13
## 5 ah 11
## 6 call 11
## 7 hope 11
## 8 woman 11
## 9 mine 10
## 10 heart 9
## # … with 728 more rows
This code enabled me to look at the top 10 words of the album, Folklore, in a table view.
Now, I want to analyze the love words that were said in the album, Folklore.
Folklore %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 6 x 2
## word n
## <chr> <int>
## 1 love 13
## 2 heart 9
## 3 baby 5
## 4 kiss 5
## 5 loved 2
## 6 feel 1
“Love” is said 13 times, but only 6 distinct love words appear in Folklore.
Sentiment for Folklore:
Folklore%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anger, anticipation, disgust, fear, joy, negative, positive, sadness. This is now the third album in a row with the same sentiments.
Lastly, I’ll look at her ninth album, Evermore.
Evermore <- genius_album("Taylor Swift", "Evermore")
## Joining, by = c("album_name", "track_n", "track_url")
Evermore %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words) %>%
wordcloud2()
## Joining, by = "word"
This code gave me the top words said in the album, Evermore. The most popular word of the album was “ooh” at 26.
Evermore %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE)
## # A tibble: 1,038 x 2
## word n
## <chr> <int>
## 1 the 222
## 2 i 206
## 3 you 200
## 4 and 143
## 5 my 119
## 6 it 113
## 7 to 101
## 8 your 90
## 9 in 85
## 10 a 78
## # … with 1,028 more rows
The tibble has 1,038 rows, so Evermore contains 1,038 distinct words.
Evermore %>%
unnest_tokens(word, lyric) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 750 x 2
## word n
## <chr> <int>
## 1 ooh 26
## 2 love 17
## 3 ah 14
## 4 died 13
## 5 eyes 13
## 6 hand 13
## 7 stay 13
## 8 time 13
## 9 yeah 13
## 10 alive 12
## # … with 740 more rows
This code enabled me to look at the top 10 words of the album, Evermore, in a table view.
Now, I want to analyze the love words that were said in the album, Evermore.
Evermore %>%
unnest_tokens(word, lyric) %>%
filter(word %in% love_words) %>%
count(word, sort = TRUE) %>%
anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
## word n
## <chr> <int>
## 1 love 17
## 2 feel 5
## 3 touch 5
## 4 feeling 4
## 5 baby 2
## 6 heart 2
## 7 loved 2
## 8 wife 1
“Love” is said 17 times, and there are 8 love words in Evermore.
Sentiment for Evermore:
Evermore%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('nrc')) %>%
count(word, sentiment, sort = TRUE) %>%
head(10) %>%
ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"
It was found that the sentiments of this album in reference to the top 10 words were anticipation, joy, positive, and trust. This is a big change from her last three albums, which were more negative and all took on the exact same sentiments.
Now I will look at the 9 albums’ average sentiment as a whole.
TaylorSwift %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "TaylorSwift") -> TaylorSwift_afinn
## Joining, by = "word"
## Joining, by = "word"
TaylorSwift_afinn %>%
mutate(mean(TaylorSwift_afinn$value) ) -> TaylorSwift_mean
The mean sentiment value for her first album was 0.6747967. AFINN scores each word from -5 (most negative) to 5 (most positive), so a mean of about 0.67 is close to neutral but leans positive.
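Note that `mutate(mean(...))` repeats the mean on every row of the album’s data. If only the single number is needed, `summarise()` returns it directly. The sketch below assumes a tiny invented lexicon (`toy_afinn`) with the same `word` and `value` columns as `get_sentiments("afinn")`; the scores in it are illustrative, not the real AFINN values.

```r
library(dplyr)
library(tibble)

# Hypothetical mini-lexicon shaped like get_sentiments("afinn").
toy_afinn <- tribble(
  ~word,   ~value,
  "love",   3,
  "smile",  2,
  "cry",   -1,
  "hate",  -3
)

# Made-up tokens standing in for an album after stop words are removed.
toy_tokens <- tibble(word = c("love", "love", "cry", "car"))

album_mean <- toy_tokens %>%
  inner_join(toy_afinn, by = "word") %>%   # keeps only words the lexicon scores
  summarise(
    mean_sentiment = mean(value),          # one number for the whole album
    scored_words   = n()                   # how many tokens contributed
  )

album_mean
```

The mean covers only the words found in the lexicon, so "car" does not pull the average toward zero.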
Fearless %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Fearless") -> Fearless_afinn
## Joining, by = "word"
## Joining, by = "word"
Fearless_afinn %>%
mutate(mean(Fearless_afinn$value) ) -> Fearless_mean
The mean sentiment value for Fearless was 0.5091743. Similar to Taylor Swift, this is close to neutral but leans positive.
SpeakNow %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "SpeakNow") -> SpeakNow_afinn
## Joining, by = "word"
## Joining, by = "word"
SpeakNow_afinn %>%
mutate(mean(SpeakNow_afinn$value) ) -> SpeakNow_mean
The mean for Speak Now is -0.1115242. This slightly negative number suggests it took on a more negative sentiment than her previous two albums.
Red %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Red") -> Red_afinn
## Joining, by = "word"
## Joining, by = "word"
Red_afinn %>%
mutate(mean(Red_afinn$value) ) -> Red_mean
The mean for Red is 0.2614213, which indicates that Red took on a fairly neutral sentiment.
Taylor1989 %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Taylor1989") -> Taylor1989_afinn
## Joining, by = "word"
## Joining, by = "word"
Taylor1989_afinn %>%
mutate(mean(Taylor1989_afinn$value) ) -> Taylor1989_mean
The mean sentiment value for 1989 was -0.1903409. This negative number indicates that it took on a slightly negative sentiment.
Reputation %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Reputation") -> Reputation_afinn
## Joining, by = "word"
## Joining, by = "word"
Reputation_afinn %>%
mutate(mean(Reputation_afinn$value) ) -> Reputation_mean
The mean sentiment value for Reputation was 0.03287671, very close to 0, which suggests Reputation had a neutral sentiment.
Lover%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Lover") -> Lover_afinn
## Joining, by = "word"
## Joining, by = "word"
Lover_afinn %>%
mutate(mean(Lover_afinn$value) ) -> Lover_mean
The mean sentiment value for Lover was 0.1123348. Like Reputation, it is close to 0, so it has a fairly neutral sentiment.
Folklore%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Folklore") -> Folklore_afinn
## Joining, by = "word"
## Joining, by = "word"
Folklore_afinn %>%
mutate(mean(Folklore_afinn$value) ) -> Folklore_mean
The mean sentiment value for Folklore was -0.4354244, which means the sentiment leaned negative.
Evermore%>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
inner_join(get_sentiments('afinn')) %>%
mutate(album = "Evermore") -> Evermore_afinn
## Joining, by = "word"
## Joining, by = "word"
Evermore_afinn %>%
mutate(mean(Evermore_afinn$value) ) -> Evermore_mean
The mean sentiment value for Evermore was -0.5076453, the most negative of the 9 albums.
Now, I’ll look at the albums as a whole. I first had to create a dataset, which I did here by labeling each row with the name of the album’s mean variable and pairing it with the matching average, rounded to two decimal places.
all_albums <- data.frame("Album" = c("TaylorSwift_mean", "Fearless_mean","SpeakNow_mean", "Red_mean", "Taylor1989_mean", "Reputation_mean","Lover_mean", "Folklore_mean" ,"Evermore_mean" ), "Average Sentiment" = c(0.67, 0.51, -0.11, 0.26,-0.19, 0.03, 0.11, -0.44, -0.51 ))
This enabled me to view the albums side by side with their average sentiment value in a table format.
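Instead of typing the rounded means by hand, the table could also be built from the per-album data. The sketch below assumes two small made-up tibbles (`album_a_afinn`, `album_b_afinn`) that stand in for the `*_afinn` objects created earlier; the names and values are hypothetical.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for the per-album *_afinn tibbles.
album_a_afinn <- tibble(
  word  = c("love", "cry"),
  value = c(3, -1),
  album = "AlbumA"
)
album_b_afinn <- tibble(
  word  = c("hate", "sad", "smile"),
  value = c(-3, -2, 2),
  album = "AlbumB"
)

album_means <- bind_rows(album_a_afinn, album_b_afinn) %>%  # stack the albums
  group_by(album) %>%
  summarise(
    average_sentiment = round(mean(value), 2),  # one mean per album
    .groups = "drop"
  )

album_means
```

Because each tibble already carries an `album` column, stacking all nine and grouping by that column would produce the whole comparison table in one step.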
I wanted to visualize the albums and their average sentiment in a column chart, but first I needed to tell R that my album names were already listed in the order I’d like them to appear:
all_albums$Album <- factor(all_albums$Album, levels = all_albums$Album)
library(ggplot2)
ggplot(all_albums, aes(Album, Average.Sentiment, fill= Album)) + geom_col()
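The `factor()` step matters because a plain character column would be ordered alphabetically on the axis. A minimal sketch of the difference, using three of the album names in release order (the variable names are my own):

```r
# Three album names in chronological order.
albums <- c("Taylor Swift", "Fearless", "Speak Now")

# Default factor levels are sorted alphabetically.
default_levels <- levels(factor(albums))

# Passing the vector itself as levels keeps the original order.
explicit_levels <- levels(factor(albums, levels = albums))

default_levels
explicit_levels
```

With explicit levels, the columns appear in release order instead of alphabetical order.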
This visualization enabled me to see how many albums were on the negative side of the sentiment scale. A total of 4 albums were on the negative side of the column chart: Evermore, Folklore, Speak Now, and 1989. The remaining 5 albums were on the positive side. It’s important to note that her first 2 albums, Taylor Swift and Fearless, scored noticeably higher on positive sentiment than the other albums.
In conclusion, by looking at each album on its own, I traced how Taylor Swift has progressed over the years as an artist. I first found the top words said in each album, then narrowed them down to the top 10 words. From there, I was able to view the number of distinct words in the album. Then, I matched the words of the album against my list of love words. Lastly, I determined the sentiment of the album using “nrc” because I wanted to see the words broken into positive and negative categories.
All in all, after looking at the most popular love word in each album, I found that “love” was the most frequently said love word in 7 of her albums. Reputation and Fearless had different top love words. When looking for a reason why this might have occurred, I noticed these two albums had noticeably negative sentiments. Both Fearless and Reputation shared the sentiments of anger, anticipation, disgust, fear, joy, positive. I looked into Taylor’s dating timeline and found that Taylor and Joe Jonas broke up just before the release of her album, Fearless, in 2008. https://www.billboard.com/photos/1484087/taylor-swifts-boyfriend-timeline-12-relationships-their-songs
As for her album Reputation, Taylor actually made some references as to what or who each song is about. This album came out around the time of the famous Taylor and Kanye West feud, so that was definitely referenced, as well as her then-current relationship and some exes. https://www.popsugar.com/entertainment/Who-Songs-Taylor-Swift-Reputation-About-44244774
On the other hand, since “love” was the top love word in 7 of the 9 albums, it supports the idea that Taylor’s songwriting consistently centers on the theme of love. As she grew older and produced more and more songs, the theme of love was still apparent in the majority of her albums. But just because the same word “love” was used doesn’t mean that it was used in the same positive way. This is where sentiments came into play. By looking at the “nrc” sentiment of each album, I was able to see the positive and negative tones of each album.
After doing all of the steps for each album, I wanted to look at the 9 albums as a whole in order to focus more on the sentiments. In the ggplot column chart, Taylor’s most positively scored album was her first album, Taylor Swift. Her most negatively scored album was her latest album, Evermore. This was an interesting find because it shows how different her earliest and latest albums are. This exemplifies the point that Taylor Swift has grown up in the public eye. She has been a popstar since she was 14, and is now 31 years old. We have all watched her, and her music, grow up - and this column chart of the different sentiments of her music shows how the positive and negative themes range. Taylor loves writing about relatable things and the relationships going on in her life, so these things definitely factor into her song lyrics.
My original hypothesis predicted that as Taylor’s albums go on, the theme of love fades. I still think that love was most prominent in her first few albums, and as Taylor got progressively more famous, she took on a more mature and sophisticated perspective on the “fairy-tale” love she wrote about when she was younger.
I originally believed that her more recent albums would have more negative words than her earlier albums. This wasn’t entirely the case. I found that some of her earlier albums included negative words as well. One interesting find was that her latest 2 albums, Folklore and Evermore, were the most negative even though “love” was still the top love word in both (13 and 17 times) when I analyzed their individual love words. This could suggest that Taylor Swift’s view of love has changed since her first album, Taylor Swift, which was found to hold the themes of anticipation, joy, positive, surprise, trust and also scored the highest on the average sentiment chart.
At the end of the day, Taylor Swift’s music has changed a lot over the years, but one thing that remains constant is the theme of love. Whether she’s talking about it in a positive or negative way, you can rely on all 9 of her albums to include some sort of love words. If you’re looking to listen to a more positively averaged album, her first two - Taylor Swift and Fearless - are for you. If you’re looking for a more negative outlook, her latest albums - Folklore and Evermore - should do the trick.