library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.6     ✓ dplyr   1.0.4
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(tidytext)
library(genius)
library(wordcloud2)
library(rmarkdown)
library(devtools)
## Loading required package: usethis

Introduction

Taylor Swift is one of the world’s best-known singers. She has 9 successful, award-winning albums filled with catchy songs. Swift was originally known for country music, until fans noticed a genre switch on her 4th album, Red. From then on, Taylor Swift was a pop artist. Pop and country are very different, but one thing that stays fairly constant in Taylor Swift’s music is the theme of love. No matter what stage of her life she was in, Taylor centered most of her songs on love. I thought it would be interesting to analyze which of the 9 albums focuses most on the theme of love, and how the sentiment around that theme differs across those 9 albums.

Data

I obtained data on Taylor Swift’s 9 albums from Genius, a website that holds the biggest collection of song lyrics and musical knowledge. Her artist page, https://genius.com/artists/Taylor-swift, lists her songs, playlists, and albums. When I clicked “show all albums by Taylor Swift”, it showed many variations of her 9 albums, so I used Wikipedia to settle on her 9 distinct studio albums and focused only on those: https://en.wikipedia.org/wiki/Taylor_Swift_albums_discography

Predictions

I predict that the theme of love fades as Taylor’s albums go on. I think that love was most prominent in her first few albums, and that as Taylor became more and more famous, she took a more mature and sophisticated view of the “fairy-tale” love she wrote about when she was younger. When analyzing the sentiment behind each album, I expect to find that her more recent albums have more negative words than her earlier albums. To complete this project, I used the following packages: tidyverse, tidytext, genius, wordcloud2, ggplot2, rmarkdown and devtools. I created the love words list by looking up synonyms of love and reading definitions from https://www.merriam-webster.com/dictionary/love

Layout of Project

The layout of my project is as follows: I run through the albums one by one, in chronological order. For each album, I look at the top words in the album in a wordcloud and a table. From there, I can determine the number of distinct words used in the album. Next, I apply my love words list to the album and get a table of which love words appear and how often. Lastly, at the individual album level, I look at the nrc sentiment for each album. The nrc lexicon sorts words into the categories positive, negative, anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. This helps me determine the theme of her music over the years and whether it has shifted. After analyzing each album on its own, I look at the 9 albums as a whole. I first find each album’s average sentiment using afinn. Using these values, I create a dataset that shows the albums side by side with their average sentiment in a table. I then plot the albums and their average sentiment in order to compare the 9 of them visually.
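Because the same steps repeat for every album, they could also be wrapped in a small function. The sketch below is only an illustration: `top_words()` is a hypothetical helper that is not used elsewhere in this project, and `toy` is a made-up tibble standing in for the output of `genius_album()`, which has a `lyric` column.

```r
library(dplyr)
library(tidytext)

# Hypothetical helper: most frequent non-stop-words in a tibble with a `lyric` column
top_words <- function(album, n_top = 10) {
  album %>%
    unnest_tokens(word, lyric) %>%          # one row per word
    anti_join(stop_words, by = "word") %>%  # drop common stop words
    count(word, sort = TRUE) %>%            # frequency per distinct word
    head(n_top)
}

# Made-up lyrics, for illustration only
toy <- tibble::tibble(lyric = c("love love love", "the heart and the love"))
toy_top <- top_words(toy)
toy_top
```

With a helper like this, each album section would reduce to a single call such as `top_words(Fearless)`.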

My list of “love words”:

love_words <- c("love", "loved", "loving", "lover", "feel", "felt", "feeling",
                "heart", "boyfriend", "girlfriend", "husband", "wife",
                "marriage", "marry", "married", "baby", "kiss", "touch",
                "personal", "tender", "affectionate", "affection", "passion",
                "passionate", "desire", "desirable", "desired", "want",
                "wanted", "wanting", "devoted", "devotion", "devote",
                "endearment", "endear", "admire", "admiring", "admired",
                "adore", "adoring", "adornment", "sex", "sexual", "attached",
                "attachment", "appreciate", "appreciated", "appreciating",
                "cherish", "cherishing")
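A hand-typed word list can easily pick up repeated entries. A minimal sketch of a duplicate check in base R, using a short hypothetical vector (`example_words`) rather than the full list:

```r
# Hypothetical short list, for illustration only
example_words <- c("love", "feel", "heart", "kiss", "feel")

# Entries that appear more than once
repeated <- unique(example_words[duplicated(example_words)])
repeated

# De-duplicated version of the list
clean_words <- unique(example_words)
length(clean_words)
```

A repeated entry does not change the results of `word %in% love_words`, so this check is only about keeping the list tidy.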

Taylor Swift Album

TaylorSwift <- genius_album("Taylor Swift", "Taylor Swift")
## Joining, by = c("album_name", "track_n", "track_url")
TaylorSwift %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Taylor Swift. The most popular word of the album was “wanna” at 26.

TaylorSwift %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 641 x 2
##    word      n
##    <chr> <int>
##  1 you     213
##  2 i       159
##  3 the     155
##  4 and     130
##  5 a        95
##  6 me       85
##  7 to       84
##  8 my       70
##  9 that     70
## 10 on       60
## # … with 631 more rows

The tibble has 641 rows, so the album Taylor Swift uses 641 distinct words. The total number of words sung is higher, because repeats are collapsed into the n column.
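Note that `count()` returns one row per distinct word, so the row count is a vocabulary size rather than a total word count. A small sketch with made-up lyric lines (the `toy` tibble is hypothetical, not Genius data) shows how both numbers could be computed:

```r
library(dplyr)
library(tidytext)

# Made-up lyric lines standing in for an album
toy <- tibble::tibble(lyric = c("you and me", "you and I", "me oh my"))

tokens <- toy %>% unnest_tokens(word, lyric)

word_totals <- tokens %>%
  summarise(total_words = n(),                  # every occurrence counts
            distinct_words = n_distinct(word))  # each word counts once
word_totals
```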

TaylorSwift %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 371 x 2
##    word          n
##    <chr>     <int>
##  1 wanna        26
##  2 beautiful    20
##  3 should've    20
##  4 love         19
##  5 song         18
##  6 baby         15
##  7 time         15
##  8 hope         13
##  9 eyes         12
## 10 smile        12
## # … with 361 more rows

This code enabled me to look at the top 10 words of the album, Taylor Swift, in a table view.

Now, I want to analyze the love words that were said in the album, Taylor Swift.

TaylorSwift %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
##   word        n
##   <chr>   <int>
## 1 love       19
## 2 baby       15
## 3 heart      11
## 4 feel        5
## 5 kiss        4
## 6 adore       1
## 7 feeling     1
## 8 loving      1

After comparing the love words to the words in the album, I found that 8 distinct love words appear, with “love” being the most frequent at 19 times.
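Since the albums differ in length, raw counts can be hard to compare across albums. One possible refinement, sketched here with made-up lyrics and a shortened hypothetical list (`toy_love_words`), is to express love words as a share of all words in the album:

```r
library(dplyr)
library(tidytext)

# Made-up lyrics and a shortened love list, for illustration only
toy <- tibble::tibble(lyric = c("i love you baby",
                                "you broke my heart",
                                "we drove all night"))
toy_love_words <- c("love", "baby", "heart")

love_share <- toy %>%
  unnest_tokens(word, lyric) %>%
  summarise(love_tokens = sum(word %in% toy_love_words),  # love-word occurrences
            total_tokens = n(),                           # all word occurrences
            share = love_tokens / total_tokens)
love_share
```

This is an assumption about how the comparison could be normalized, not part of the analysis reported here.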

Now I’ll look into the sentiment of the album, Taylor Swift.

TaylorSwift%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

It was found that the sentiments of this album in reference to the top 10 words were anticipation, joy, positive, surprise, trust.
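In the nrc lexicon a single word can belong to several categories, so the same word can take up more than one of the top 10 rows. The sketch below uses a tiny made-up lexicon (`toy_nrc`, not the real nrc data) to show how counts could instead be totalled per sentiment category:

```r
library(dplyr)

# Made-up word counts and a made-up lexicon shaped like get_sentiments("nrc")
toy_counts <- tibble::tibble(word = c("love", "smile", "cry"),
                             n = c(19L, 12L, 4L))
toy_nrc <- tibble::tibble(
  word = c("love", "love", "smile", "smile", "cry", "cry"),
  sentiment = c("joy", "positive", "joy", "positive", "sadness", "negative")
)

sentiment_totals <- toy_counts %>%
  inner_join(toy_nrc, by = "word") %>%                    # one row per word-category pair
  count(sentiment, wt = n, sort = TRUE, name = "total")   # sum word counts per category
sentiment_totals
```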

Fearless Album

Now, I’ll analyze her second album, Fearless.

Fearless <- genius_album("Taylor Swift", "Fearless")
## Joining, by = c("album_name", "track_n", "track_url")
Fearless %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Fearless. The most popular word of the album was “la” at 26.

Fearless %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 745 x 2
##    word       n
##    <chr>  <int>
##  1 you      245
##  2 and      206
##  3 i        203
##  4 the      128
##  5 to        96
##  6 me        87
##  7 a         69
##  8 know      61
##  9 you're    59
## 10 in        58
## # … with 735 more rows

The tibble has 745 rows, so Fearless uses 745 distinct words.

Fearless %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 467 x 2
##    word        n
##    <chr>   <int>
##  1 la         26
##  2 feel       22
##  3 baby       16
##  4 love       16
##  5 time       15
##  6 belong     12
##  7 rains      12
##  8 feeling    11
##  9 loved      10
## 10 run        10
## # … with 457 more rows

This code enabled me to look at the top 10 words of the album, Fearless, in a table view.

Now, I want to analyze the love words that were said in the album, Fearless.

Fearless %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
##   word           n
##   <chr>      <int>
## 1 feel          22
## 2 baby          16
## 3 love          16
## 4 feeling       11
## 5 loved         10
## 6 kiss           6
## 7 marry          2
## 8 girlfriend     1

It was interesting to see that 8 distinct love words appear again in her next album. “Feel” was the most frequent love word in this album, at 22.

Now I’ll look into the sentiment of the album, Fearless.

Fearless%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anger, anticipation, disgust, fear, joy, and positive. This album shows more negative sentiment categories than her previous album.

Speak Now

Next, I’ll look at her third album, Speak Now.

SpeakNow <- genius_album("Taylor Swift", "Speak Now")
## Joining, by = c("album_name", "track_n", "track_url")
SpeakNow %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Speak Now. The most popular word of the album was “time” at 28.

SpeakNow %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 925 x 2
##    word      n
##    <chr> <int>
##  1 you     295
##  2 i       209
##  3 the     209
##  4 and     157
##  5 to      102
##  6 me       94
##  7 a        88
##  8 your     82
##  9 on       77
## 10 it       75
## # … with 915 more rows

The tibble has 925 rows, so Speak Now uses 925 distinct words.

SpeakNow %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 643 x 2
##    word         n
##    <chr>    <int>
##  1 time        28
##  2 grow        21
##  3 meet        17
##  4 mind        17
##  5 gonna       13
##  6 live        13
##  7 night       13
##  8 remember    13
##  9 eyes        12
## 10 forever     12
## # … with 633 more rows

This code enabled me to look at the top 10 words of the album, Speak Now, in a table view.

Now, I want to analyze the love words that were said in the album, Speak Now.

SpeakNow %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
##   word        n
##   <chr>   <int>
## 1 love       12
## 2 kiss        8
## 3 feel        7
## 4 baby        6
## 5 loved       4
## 6 heart       3
## 7 feeling     2
## 8 touch       2

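Speak Now also has 8 distinct love words, the same number as the first album, Taylor Swift, although the words and their counts differ: “loved” and “touch” appear here, while “adore” and “loving” do not.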

Now I’ll look into the sentiment of the album, Speak Now.

SpeakNow%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anticipation, joy, positive, surprise, and trust. These are exactly the same sentiments as on her first album, Taylor Swift.

Red

Next, I’ll look at her fourth album, Red.

Red <- genius_album("Taylor Swift", "Red")
## Joining, by = c("album_name", "track_n", "track_url")
Red %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Red. The most popular word of the album was “time” at 66. It is noted that “time” was also the most popular word of Taylor’s album before this, Speak Now.

Red %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 917 x 2
##    word      n
##    <chr> <int>
##  1 you     355
##  2 i       261
##  3 and     226
##  4 the     184
##  5 oh      108
##  6 me      107
##  7 to      104
##  8 a       102
##  9 in       98
## 10 like     91
## # … with 907 more rows

The tibble has 917 rows, so Red uses 917 distinct words.

Red %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 630 x 2
##    word          n
##    <chr>     <int>
##  1 time         66
##  2 ooh          60
##  3 stay         32
##  4 trouble      32
##  5 yeah         28
##  6 home         25
##  7 love         25
##  8 starlight    22
##  9 wanna        22
## 10 dancing      18
## # … with 620 more rows

This code enabled me to look at the top 10 words of the album, Red, in a table view.

Now, I want to analyze the love words that were said in the album, Red.

Red %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 13 x 2
##    word           n
##    <chr>      <int>
##  1 love          25
##  2 loving        12
##  3 feeling       10
##  4 feel           6
##  5 heart          5
##  6 loved          4
##  7 baby           2
##  8 kiss           2
##  9 touch          2
## 10 attached       1
## 11 lover          1
## 12 married        1
## 13 passionate     1

Red finally showed new data, with 13 love words in the album. “Love” was seen as the most said word at 25.

Sentiment for Red:

Red%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

It was found that the sentiments of this album in reference to the top 10 words were anger, anticipation, disgust, fear, joy, negative, positive, sadness. It is noted that this album had similar sentiments as Fearless, with the addition of negative and sadness.

1989 Album

Data for the fifth album, 1989

genius_album("Taylor Swift", "1989") -> Taylor1989
## Joining, by = c("album_name", "track_n", "track_url")
Taylor1989 %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, 1989. The most popular word of the album was “shake” at 78.

Taylor1989 %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 698 x 2
##    word      n
##    <chr> <int>
##  1 i       358
##  2 you     261
##  3 the     196
##  4 oh      180
##  5 and     160
##  6 we      127
##  7 to      109
##  8 it      106
##  9 in       93
## 10 are      83
## # … with 688 more rows

The tibble has 698 rows, so 1989 uses 698 distinct words.

Taylor1989 %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 457 x 2
##    word      n
##    <chr> <int>
##  1 shake    78
##  2 love     77
##  3 woods    39
##  4 stay     33
##  5 baby     32
##  6 gonna    30
##  7 york     30
##  8 girl     25
##  9 bad      23
## 10 hey      22
## # … with 447 more rows

This code enabled me to look at the top 10 words of the album, 1989, in a table view.

Now, I want to analyze the love words that were said in the album, 1989.

Taylor1989 %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 5 x 2
##   word           n
##   <chr>      <int>
## 1 love          77
## 2 baby          32
## 3 heart          3
## 4 kiss           2
## 5 girlfriend     1

1989 only had 5 love words, but it had the highest count of the word “love” at 77, which is noticeably higher than the other albums saying the word love.

Sentiment for 1989:

Taylor1989  %>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anger, disgust, fear, joy, negative, positive, and sadness. I found it interesting that there are more negative sentiment categories than positive ones.

Reputation Album

Next, I’ll look at her sixth album, Reputation.

Reputation <- genius_album("Taylor Swift", "Reputation")
## Joining, by = c("album_name", "track_n", "track_url")
Reputation %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Reputation. The most popular word of the album was “di” at 81.

Reputation %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 932 x 2
##    word      n
##    <chr> <int>
##  1 you     325
##  2 i       295
##  3 the     226
##  4 it      213
##  5 me      190
##  6 and     161
##  7 my      143
##  8 a       137
##  9 to      107
## 10 so      104
## # … with 922 more rows

The tibble has 932 rows, so Reputation uses 932 distinct words.

Reputation %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 659 x 2
##    word      n
##    <chr> <int>
##  1 di       81
##  2 call     46
##  3 ooh      39
##  4 wanna    37
##  5 ha       34
##  6 time     34
##  7 ah       33
##  8 baby     33
##  9 yeah     32
## 10 bad      31
## # … with 649 more rows

This code enabled me to look at the top 10 words of the album, Reputation, in a table view.

Now, I want to analyze the love words that were said in the album, Reputation.

Reputation %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 11 x 2
##    word           n
##    <chr>      <int>
##  1 baby          33
##  2 feel          20
##  3 love          19
##  4 heart         14
##  5 feeling        8
##  6 touch          7
##  7 kiss           3
##  8 loved          3
##  9 boyfriend      1
## 10 girlfriend     1
## 11 lover          1

The top love word in the Reputation album was “baby”, said 33 times, and a total of 11 distinct love words appear in the album.

Sentiment for Reputation:

Reputation%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anger, anticipation, disgust, fear, joy, negative, positive, and sadness. As with the previous album, there are more negative sentiment categories than positive ones.

Lover

Next, I’ll look at her seventh album, Lover.

Lover <- genius_album("Taylor Swift", "Lover")
## Joining, by = c("album_name", "track_n", "track_url")
Lover %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Lover. The most popular word of the album was “ooh” at 69.

Lover %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 1,029 x 2
##    word      n
##    <chr> <int>
##  1 i       396
##  2 you     263
##  3 the     243
##  4 and     155
##  5 my      148
##  6 me      132
##  7 a       117
##  8 to      115
##  9 oh      102
## 10 in       96
## # … with 1,019 more rows

The tibble has 1,029 rows, so Lover uses 1,029 distinct words.

Lover %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 759 x 2
##    word         n
##    <chr>    <int>
##  1 ooh         69
##  2 love        44
##  3 wanna       42
##  4 daylight    40
##  5 ah          29
##  6 baby        29
##  7 yeah        25
##  8 street      23
##  9 walk        19
## 10 home        18
## # … with 749 more rows

This code enabled me to look at the top 10 words of the album, Lover, in a table view.

Now, I want to analyze the love words that were said in the album, Lover.

Lover %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 10 x 2
##    word        n
##    <chr>   <int>
##  1 love       44
##  2 baby       29
##  3 lover       8
##  4 touch       6
##  5 feeling     5
##  6 heart       4
##  7 kiss        4
##  8 marry       4
##  9 adore       1
## 10 loved       1

Once again, “love” is the top word in the album Lover, at 44. 10 love words are expressed in this album.

Sentiment for Lover:

Lover%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anger, anticipation, disgust, fear, joy, negative, positive, and sadness. These are exactly the same sentiments as on the previous album.

Folklore

Next, I’ll look at her eighth album, Folklore.

Folklore <- genius_album("Taylor Swift", "Folklore")
## Joining, by = c("album_name", "track_n", "track_url")
Folklore %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Folklore. The most popular word of the album was “time” at 34.

Folklore %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) 
## # A tibble: 1,020 x 2
##    word      n
##    <chr> <int>
##  1 you     243
##  2 i       234
##  3 the     178
##  4 and     120
##  5 me       97
##  6 a        92
##  7 to       90
##  8 in       85
##  9 my       85
## 10 your     64
## # … with 1,010 more rows

The tibble has 1,020 rows, so Folklore uses 1,020 distinct words.

Folklore %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 738 x 2
##    word      n
##    <chr> <int>
##  1 time     34
##  2 ooh      21
##  3 love     13
##  4 mad      13
##  5 ah       11
##  6 call     11
##  7 hope     11
##  8 woman    11
##  9 mine     10
## 10 heart     9
## # … with 728 more rows

This code enabled me to look at the top 10 words of the album, Folklore, in a table view.

Now, I want to analyze the love words that were said in the album, Folklore.

Folklore %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 6 x 2
##   word      n
##   <chr> <int>
## 1 love     13
## 2 heart     9
## 3 baby      5
## 4 kiss      5
## 5 loved     2
## 6 feel      1

“Love” is said 13 times, but only 6 love words are said in Folklore.

Sentiment for Folklore:

Folklore%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anger, anticipation, disgust, fear, joy, negative, positive, and sadness. This is now the third album in a row with the same sentiments.

Evermore

Lastly, I’ll look at her ninth album, Evermore.

Evermore <- genius_album("Taylor Swift", "Evermore")
## Joining, by = c("album_name", "track_n", "track_url")
Evermore %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words) %>% 
  wordcloud2()
## Joining, by = "word"

This code gave me the top words said in the album, Evermore. The most popular word of the album was “ooh” at 26.

Evermore %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE)  
## # A tibble: 1,038 x 2
##    word      n
##    <chr> <int>
##  1 the     222
##  2 i       206
##  3 you     200
##  4 and     143
##  5 my      119
##  6 it      113
##  7 to      101
##  8 your     90
##  9 in       85
## 10 a        78
## # … with 1,028 more rows

The tibble has 1,038 rows, so Evermore uses 1,038 distinct words.

Evermore %>% 
  unnest_tokens(word, lyric) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 750 x 2
##    word      n
##    <chr> <int>
##  1 ooh      26
##  2 love     17
##  3 ah       14
##  4 died     13
##  5 eyes     13
##  6 hand     13
##  7 stay     13
##  8 time     13
##  9 yeah     13
## 10 alive    12
## # … with 740 more rows

This code enabled me to look at the top 10 words of the album, Evermore, in a table view.

Now, I want to analyze the love words that were said in the album, Evermore.

Evermore %>% 
  unnest_tokens(word, lyric) %>%
  filter(word %in% love_words) %>% 
  count(word, sort = TRUE) %>% 
  anti_join(stop_words)
## Joining, by = "word"
## # A tibble: 8 x 2
##   word        n
##   <chr>   <int>
## 1 love       17
## 2 feel        5
## 3 touch       5
## 4 feeling     4
## 5 baby        2
## 6 heart       2
## 7 loved       2
## 8 wife        1

“Love” is said 17 times, and there are 8 love words in Evermore.

Sentiment for Evermore:

Evermore%>% 
  unnest_tokens(word, lyric) %>% 
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('nrc')) %>% 
  count(word, sentiment, sort = TRUE) %>% 
  head(10) %>% 
  ggplot(aes(word, n, fill = sentiment)) + geom_col()
## Joining, by = "word"
## Joining, by = "word"

The sentiments attached to the top 10 words of this album were anticipation, joy, positive, and trust. This is a big change from her last three albums, which were more negative and all took on exactly the same sentiments.

Now I will look at the 9 albums’ average sentiment as a whole.

TaylorSwift %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "TaylorSwift") -> TaylorSwift_afinn
## Joining, by = "word"
## Joining, by = "word"
TaylorSwift_afinn %>%
  mutate(mean_sentiment = mean(value)) -> TaylorSwift_mean

The mean sentiment value for her first album was 0.6747967. On the afinn scale, which runs from -5 to +5, that is close to neutral but leans positive.
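The afinn lexicon scores each word from -5 (most negative) to +5 (most positive), so an album mean is the average score of the words that matched the lexicon. A minimal sketch with a made-up lexicon (`toy_afinn`, standing in for `get_sentiments('afinn')`) that returns the mean as a single row:

```r
library(dplyr)

# Made-up tokens and scores; the real lexicon also has `word` and `value` columns
toy_tokens <- tibble::tibble(word = c("love", "love", "hate", "smile", "road"))
toy_afinn <- tibble::tibble(word = c("love", "hate", "smile"),
                            value = c(3, -3, 2))

toy_mean <- toy_tokens %>%
  inner_join(toy_afinn, by = "word") %>%   # "road" has no score and drops out
  summarise(matched_words = n(),
            mean_sentiment = mean(value))
toy_mean
```

Words without a score are ignored, so the mean describes only the scored words rather than the whole album.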

Fearless %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Fearless") -> Fearless_afinn
## Joining, by = "word"
## Joining, by = "word"
Fearless_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Fearless_mean

The mean sentiment value for Fearless was 0.5091743, similar to Taylor Swift: close to neutral but leaning positive.

SpeakNow %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "SpeakNow") -> SpeakNow_afinn
## Joining, by = "word"
## Joining, by = "word"
SpeakNow_afinn %>%
  mutate(mean_sentiment = mean(value)) -> SpeakNow_mean

The mean for Speak Now is -0.1115242. The negative value suggests a slightly more negative sentiment than her previous two albums.

Red %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Red") -> Red_afinn
## Joining, by = "word"
## Joining, by = "word"
Red_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Red_mean

The mean for Red is 0.2614213, which indicates a fairly neutral sentiment.

Taylor1989 %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Taylor1989") -> Taylor1989_afinn
## Joining, by = "word"
## Joining, by = "word"
Taylor1989_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Taylor1989_mean

The mean sentiment value for 1989 was -0.1903409, which suggests a slightly negative sentiment.

Reputation %>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Reputation") -> Reputation_afinn
## Joining, by = "word"
## Joining, by = "word"
Reputation_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Reputation_mean

The mean sentiment value for Reputation was 0.03287671, very close to 0, so Reputation reads as neutral.

Lover%>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Lover") -> Lover_afinn
## Joining, by = "word"
## Joining, by = "word"
Lover_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Lover_mean

The mean sentiment value for Lover was 0.1123348, which is close to 0, so it is fairly neutral as well.

Folklore%>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Folklore") -> Folklore_afinn
## Joining, by = "word"
## Joining, by = "word"
Folklore_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Folklore_mean

The mean sentiment value for Folklore was -0.4354244, so its sentiment leans negative.

Evermore%>%
  unnest_tokens(word, lyric) %>%
  anti_join(stop_words) %>% 
  inner_join(get_sentiments('afinn')) %>% 
  mutate(album = "Evermore") -> Evermore_afinn
## Joining, by = "word"
## Joining, by = "word"
Evermore_afinn %>%
  mutate(mean_sentiment = mean(value)) -> Evermore_mean

The mean sentiment for Evermore was -0.5076453, the most negative value of the 9 albums.

Now, I’ll look at the albums as a whole. I first created a dataset that pairs each album’s label with its mean sentiment, rounded to two decimals.

all_albums <- data.frame("Album" = c("TaylorSwift_mean", "Fearless_mean","SpeakNow_mean", "Red_mean", "Taylor1989_mean", "Reputation_mean","Lover_mean", "Folklore_mean" ,"Evermore_mean" ), "Average Sentiment" = c(0.67, 0.51, -0.11, 0.26,-0.19, 0.03, 0.11, -0.44, -0.51 ))

This enabled me to view the albums side by side with their average sentiment value in a table format.
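The averages in this table were typed in by hand. As an alternative sketch, assuming each scored album is a tibble with a `value` column (as the `*_afinn` objects are), the table could be computed in one step. The two small tibbles here (`album_a`, `album_b`) are made up for illustration:

```r
library(dplyr)

# Made-up stand-ins for two of the *_afinn objects
album_a <- tibble::tibble(word = c("love", "smile"), value = c(3, 2))
album_b <- tibble::tibble(word = c("hate", "love", "cry"), value = c(-3, 3, -1))

album_means <- bind_rows(list(AlbumA = album_a, AlbumB = album_b),
                         .id = "Album") %>%   # list names become the Album column
  group_by(Album) %>%
  summarise(Average.Sentiment = round(mean(value), 2))
album_means
```

Computing the values this way avoids copying numbers by hand if the lyrics or lexicon change.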

I wanted to visualize the albums and their average sentiment in a column chart, but first I needed to tell R that my album names were already listed in the order I wanted them to appear:

all_albums$Album <- factor(all_albums$Album, levels = all_albums$Album)
library(ggplot2)            
ggplot(all_albums, aes(Album, Average.Sentiment, fill= Album)) + geom_col()

This visualization enabled me to see how many albums were on a negative sentiment scale. A total of 4 albums were on the negative side of the column chart: Evermore, Folklore, Speak Now, and 1989. The remaining 5 albums were on the positive spectrum of the sentiment column chart. It’s important to note that her first 2 albums, Taylor Swift and Fearless scored very high on the positive sentiment in comparison to the other albums.
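One way to make the positive/negative split easier to read is to color the columns by the sign of the average instead of by album. This is only a sketch of an alternative chart: `plot_data`, the `Direction` column and the shortened album labels are assumptions introduced here, while the values are the same rounded averages used in the table.

```r
library(ggplot2)

# Same rounded averages as in the table, with shorter hypothetical labels
plot_data <- data.frame(
  Album = c("Taylor Swift", "Fearless", "Speak Now", "Red", "1989",
            "Reputation", "Lover", "Folklore", "Evermore"),
  Average.Sentiment = c(0.67, 0.51, -0.11, 0.26, -0.19, 0.03, 0.11, -0.44, -0.51)
)
plot_data$Album <- factor(plot_data$Album, levels = plot_data$Album)  # keep release order
plot_data$Direction <- ifelse(plot_data$Average.Sentiment < 0, "negative", "positive")

p <- ggplot(plot_data, aes(Album, Average.Sentiment, fill = Direction)) +
  geom_col() +
  geom_hline(yintercept = 0) +                               # mark the neutral line
  theme(axis.text.x = element_text(angle = 45, hjust = 1))   # readable album labels
p
```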

Conclusion

In conclusion, by looking at each album on its own, I traced how Taylor Swift has changed over the years as an artist. I first found the top words said in each album, then narrowed them down to the top 10 words. From there, I was able to see the number of distinct words in the album. Then I filtered the album’s words against my list of love words. Lastly, I determined the sentiment of the album using “nrc” because I wanted to see the words sorted into positive, negative and emotion categories.

All in all, after looking at the most popular love word in each album, I found that “love” was the most frequently used love word in 7 of her albums. Reputation and Fearless had different top love words. When looking for a reason why this might have occurred, I noticed that both albums included noticeably negative “nrc” sentiments: Fearless and Reputation shared the sentiments of anger, anticipation, disgust, fear, joy, and positive. I looked into Taylor’s dating timeline and found that Taylor and Joe Jonas broke up just before the release of her album Fearless in 2008. https://www.billboard.com/photos/1484087/taylor-swifts-boyfriend-timeline-12-relationships-their-songs

As for her album Reputation, Taylor actually made some references to what or who each song is about. This album came out around the time of the famous Taylor and Kanye West feud, so that was definitely referenced, as well as her then-current relationship and some exes. https://www.popsugar.com/entertainment/Who-Songs-Taylor-Swift-Reputation-About-44244774

On the other hand, since “love” was the top love word in 7 of the 9 albums, this supports the idea that Taylor’s songwriting consistently centers on the theme of love. As she grew older and produced more and more songs, the theme of love was still apparent in the majority of her albums. But just because the same word “love” was used doesn’t mean that it was used in the same positive way. This is where sentiments came into play. By looking at the “nrc” sentiment of each album, I was able to see the positive and negative tones of each album.

After doing all of the steps for each album, I wanted to look at the 9 albums as a whole in order to focus more on the sentiments. In the ggplot column chart, Taylor’s most positive album by average sentiment was her first album, Taylor Swift. Her most negative was her latest album, Evermore. This was an interesting find because it shows how different her earliest and latest albums are. It illustrates the point that Taylor Swift has grown up in the public eye. She has been a popstar since she was 14, and is now 31 years old. We have all watched her, and her music, grow up, and this column chart shows how widely the average sentiment of her music ranges from positive to negative. Taylor loves writing about relatable things and relationships going on in her life, so these things definitely factor into her song lyrics.

My original hypothesis predicted that as Taylor’s albums go on, the theme of love fades. I still think that love was most prominent in her first few albums, and that as Taylor progressively got more and more famous, she took on a more mature and sophisticated perspective of the once “fairy-tale” love she wrote about when she was younger.

I originally believed that the sentiment analysis would show her more recent albums having more negative words than her earlier albums. This wasn’t entirely the case. I found that some of her earlier albums included negative words as well. One interesting find was that her latest 2 albums, Folklore and Evermore, were very negative even though they both held a high count of the word “love” when I analyzed their individual love words. This could suggest that Taylor Swift’s opinion on love has changed since her entirely positive first album, Taylor Swift, which was found to hold the themes of anticipation, joy, positive, surprise, and trust, and which also scored the highest on the average sentiment chart.

At the end of the day, Taylor Swift’s music has changed a lot over the years, but one thing that remains constant is the theme of love. Whether she’s talking about it in a positive or negative way, you can rely on all 9 of her albums to include some sort of love words. If you’re looking to listen to an album that is more positive on average, her first two - Taylor Swift and Fearless - are for you. If you’re looking for a more negative outlook, her latest albums - Folklore and Evermore - should do the trick.