Vampire Weekend is an indie band that formed in 2006. Since then, they have released four albums: “Vampire Weekend” in 2008, “Contra” in 2010, “Modern Vampires of the City” in 2013, and “Father of the Bride” in 2019. Fans generally sense that the band’s image and tone started out happy and light on the first two albums, turned very dark on the third, and then became more positive again on the fourth, though not as positive as where they began.
vw_word_album1 %>%
head(10) ->album1top10
album1top10 %>%
ggplot(aes(reorder(word, -n), n, fill=word)) + geom_col() + scale_fill_tableau() + ggtitle("Top 10 Words of 'Vampire Weekend'") + labs(y= "Number of Appearances", x = "Words")

vw_word_album2 %>%
head(10) ->album2top10
album2top10 %>%
ggplot(aes(reorder(word, -n), n, fill=word)) + geom_col() + scale_fill_tableau() + ggtitle("Top 10 Words of 'Contra'") + labs(y= "Number of Mentions", x = "Words")

The sentiment analysis was accomplished with the AFINN lexicon. AFINN scores words on a scale of -5 to 5 based on positive or negative connotation: -5 is the most negative rating and 5 is the most positive.
The tables below are the 10 most positive and 10 most negative words from each respective album. If multiple words have the same ranking in the lexicon, they are then ordered in the table based on frequency.
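The ordering rule described above (highest score first, ties broken by frequency) can be made explicit with a small base R sketch. The five rows are copied from the first “Vampire Weekend” table; the data frame and variable names are hypothetical and are not part of the original pipeline.

```r
# A few scored words copied from the "Vampire Weekend" positive table
# (names such as `scored` are illustrative, not from the original analysis)
scored <- data.frame(
  word  = c("fine", "funny", "chance", "perfect", "charm"),
  n     = c(2, 1, 3, 1, 2),
  value = c(2, 4, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Sort by score, highest first; break ties by frequency, highest first
most_positive <- scored[order(-scored$value, -scored$n), ]
print(most_positive, row.names = FALSE)
```

Sorting on both columns at once states the tie-break directly instead of relying on the order the rows happened to arrive in.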
vw_song_lyrics %>%
filter(title %in% "Vampire Weekend") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(desc(value)) -> vw_word_album_afinn_pos
vw_word_album_afinn_pos %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| funny | 1 | 4 |
| charm | 2 | 3 |
| praise | 2 | 3 |
| perfect | 1 | 3 |
| chance | 3 | 2 |
| fine | 2 | 2 |
| smile | 2 | 2 |
| tops | 2 | 2 |
| true | 2 | 2 |
| cares | 1 | 2 |
There are 18 occurrences of the 10 unique most positive words. The median value is 2 and mean is 2.5.
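As a check on how these summary figures are derived, the sketch below recomputes them in base R from the table above. It assumes, as the reported numbers suggest, that the mean and median treat each of the ten words once rather than weighting by frequency; the frequency-weighted mean is included only as a hypothetical alternative.

```r
# Top 10 positive words for "Vampire Weekend", copied from the table above
top_pos <- data.frame(
  word  = c("funny", "charm", "praise", "perfect", "chance",
            "fine", "smile", "tops", "true", "cares"),
  n     = c(1, 2, 2, 1, 3, 2, 2, 2, 2, 1),
  value = c(4, 3, 3, 3, 2, 2, 2, 2, 2, 2),
  stringsAsFactors = FALSE
)

total_occurrences <- sum(top_pos$n)     # how often the ten words appear in total
median_value <- median(top_pos$value)   # each word counted once
mean_value   <- mean(top_pos$value)     # each word counted once

# Alternative (an assumption, not used in the text): weight each score by frequency
weighted_mean_value <- weighted.mean(top_pos$value, top_pos$n)

cat(total_occurrences, median_value, mean_value, "\n")
```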
vw_song_lyrics %>%
filter(title %in% "Vampire Weekend") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(value) -> vw_word_album_afinn_neg
vw_word_album_afinn_neg %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| fuck | 5 | -4 |
| shit | 1 | -4 |
| cruel | 3 | -3 |
| dumb | 2 | -3 |
| evil | 1 | -3 |
| lost | 1 | -3 |
| murdering | 1 | -3 |
| racist | 1 | -3 |
| insane | 6 | -2 |
| collapse | 1 | -2 |
There are 22 occurrences of the 10 unique most negative words. The mean and median values are both -3.
vw_song_lyrics %>%
filter(title %in% "Contra") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(desc(value)) -> c_word_album_afinn_pos
c_word_album_afinn_pos %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| funny | 2 | 4 |
| love | 1 | 3 |
| lovely | 1 | 3 |
| chance | 3 | 2 |
| care | 2 | 2 |
| honest | 2 | 2 |
| brave | 1 | 2 |
| enjoy | 1 | 2 |
| fair | 1 | 2 |
| fine | 1 | 2 |
There are 15 occurrences of the 10 unique most positive words. The median value is 2 and the mean is 2.4.
vw_song_lyrics %>%
filter(title %in% "Contra") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(value) -> c_word_album_afinn_neg
c_word_album_afinn_neg %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| cruel | 2 | -3 |
| lost | 2 | -3 |
| worse | 2 | -3 |
| bad | 1 | -3 |
| desperate | 1 | -3 |
| die | 1 | -3 |
| fake | 1 | -3 |
| horrified | 1 | -3 |
| victim | 1 | -3 |
| bitter | 4 | -2 |
There are 16 occurrences of the 10 unique most negative words. The median is -3 and the mean is -2.9.
vw_song_lyrics %>%
filter(title %in% "Modern Vampires of the City") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(desc(value)) -> mvotc_word_album_afinn_pos
mvotc_word_album_afinn_pos %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| rejoicing | 1 | 4 |
| love | 20 | 3 |
| praise | 4 | 3 |
| excited | 3 | 3 |
| blessing | 2 | 3 |
| charming | 1 | 3 |
| luck | 1 | 3 |
| pleasant | 1 | 3 |
| won | 1 | 3 |
| stronger | 4 | 2 |
There are 38 occurrences of the 10 unique most positive words. The median and mean values are both 3.
vw_song_lyrics %>%
filter(title %in% "Modern Vampires of the City") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(value) -> mvotc_word_album_afinn_neg
mvotc_word_album_afinn_neg %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| damn | 2 | -4 |
| hell | 1 | -4 |
| die | 8 | -3 |
| idiot | 3 | -3 |
| died | 2 | -3 |
| bad | 1 | -3 |
| hate | 1 | -3 |
| lost | 1 | -3 |
| fire | 14 | -2 |
| fool | 5 | -2 |
There are 38 occurrences of the 10 unique most negative words. The mean and median values are both -3.
vw_song_lyrics %>%
filter(title %in% "Father of the Bride") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(desc(value)) -> fotb_word_album_afinn_pos
fotb_word_album_afinn_pos %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| win | 5 | 4 |
| funny | 1 | 4 |
| triumph | 1 | 4 |
| love | 7 | 3 |
| affection | 4 | 3 |
| grand | 1 | 3 |
| loved | 1 | 3 |
| perfect | 1 | 3 |
| sympathy | 6 | 2 |
| proud | 4 | 2 |
There are 31 occurrences of the 10 unique most positive words. The median value is 3 and mean is 3.1.
vw_song_lyrics %>%
filter(title %in% "Father of the Bride") %>%
unnest_tokens(word, lyric) %>%
anti_join(stop_words) %>%
count(word, sort = TRUE) %>%
inner_join(get_sentiments("afinn")) %>%
arrange(value) -> fotb_word_album_afinn_neg
fotb_word_album_afinn_neg %>%
head(10) %>% kable()

| word | n | value |
|---|---|---|
| die | 8 | -3 |
| worried | 4 | -3 |
| violence | 3 | -3 |
| anger | 2 | -3 |
| cruel | 2 | -3 |
| evil | 2 | -3 |
| fake | 2 | -3 |
| hate | 2 | -3 |
| kill | 2 | -3 |
| lost | 2 | -3 |
There are 29 occurrences of the 10 unique most negative words. The mean and median values are both -3.
Most of the top 10 most common words across the four albums are filler like “ooh” and “la” or simply have no connotation either way.
Across the 10 most positive words on each album, the number of occurrences does not rise or fall steadily: it goes from 18 to 15 to 38 to 31. A line of best fit through those four totals would still be positively sloped, meaning the most positive words generally appear more often as time goes on.
As for negative words, the total occurrences go from 22 to 16 to 38 to 29, which follows the same pattern of a dip, a jump, and a partial retreat. It appears that language with strong connotations, be it positive or negative, tends to appear more often with each album.
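The claim about a positively sloped line of best fit can be checked with a short base R sketch. The totals are summed from the `n` columns of the eight tables above, and the album index 1 to 4 stands in for release order. With only four points this is an illustration, not a statistical test.

```r
# Total occurrences of the top 10 words per album, summed from the tables above
album      <- 1:4                    # release order: 2008, 2010, 2013, 2019
pos_counts <- c(18, 15, 38, 31)
neg_counts <- c(22, 16, 38, 29)

# Ordinary least squares slope of occurrences against album order
pos_slope <- unname(coef(lm(pos_counts ~ album))[2])
neg_slope <- unname(coef(lm(neg_counts ~ album))[2])

cat("positive slope:", pos_slope, "\n")
cat("negative slope:", neg_slope, "\n")
```

Both slopes come out positive, which matches the reading that strongly connoted words of either sign become more frequent on later albums.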
The absolute values of the means and medians for all eight tables are also very similar to each other. On every album but one, the absolute values of the median and mean of the negative scores are greater than or equal to their positive counterparts. The exception is the last album: both medians have an absolute value of 3, but the negative mean has an absolute value of 3 while the positive mean is 3.1. Close, but not equal.
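The comparison above can be reproduced from the `value` columns of the eight tables. The following base R sketch is one way to do it; the list and column names are hypothetical, and each word is counted once regardless of its frequency.

```r
# AFINN scores of the top 10 positive and negative words per album,
# copied from the `value` columns of the tables above
pos <- list(
  "Vampire Weekend"             = c(4, 3, 3, 3, 2, 2, 2, 2, 2, 2),
  "Contra"                      = c(4, 3, 3, 2, 2, 2, 2, 2, 2, 2),
  "Modern Vampires of the City" = c(4, 3, 3, 3, 3, 3, 3, 3, 3, 2),
  "Father of the Bride"         = c(4, 4, 4, 3, 3, 3, 3, 3, 2, 2)
)
neg <- list(
  "Vampire Weekend"             = c(-4, -4, -3, -3, -3, -3, -3, -3, -2, -2),
  "Contra"                      = c(-3, -3, -3, -3, -3, -3, -3, -3, -3, -2),
  "Modern Vampires of the City" = c(-4, -4, -3, -3, -3, -3, -3, -3, -2, -2),
  "Father of the Bride"         = rep(-3, 10)
)

summary_tbl <- data.frame(
  album          = names(pos),
  pos_mean       = unname(sapply(pos, mean)),
  neg_mean_abs   = abs(unname(sapply(neg, mean))),
  pos_median     = unname(sapply(pos, median)),
  neg_median_abs = abs(unname(sapply(neg, median))),
  stringsAsFactors = FALSE
)

# TRUE where the negative side is at least as strong on both measures
summary_tbl$neg_at_least_pos <-
  summary_tbl$neg_mean_abs >= summary_tbl$pos_mean &
  summary_tbl$neg_median_abs >= summary_tbl$pos_median

print(summary_tbl, row.names = FALSE)
```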
Because the counts of positive and negative words rise in the same pattern and the measures of central tendency are similar overall, the lyrics do not support the hypothesis. While the lyrics have become more negative over time, they have also become more positive.
The spotifyr package returns a mix of standard metrics and Spotify-specific metrics from the Spotify API. One Spotify-specific measurement, valence, assigns each track a number from 0.0 to 1.0 based on “musical positiveness” (per the API’s documentation). The closer to 1.0, the higher the valence and the more positive the track sounds.
The ten highest-valence tracks are queried here, rather than a shorter list, because many tracks appear more than once.
| track_name | valence |
|---|---|
| Oxford Comma | 0.974 |
| Oxford Comma | 0.973 |
| M79 | 0.948 |
| White Sky | 0.944 |
| M79 | 0.940 |
| Sunflower (feat. Steve Lacy) | 0.933 |
| Sunflower (feat. Steve Lacy) | 0.932 |
| White Sky | 0.908 |
| The Kids Don’t Stand A Chance | 0.906 |
| Holiday | 0.893 |
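Because several tracks appear twice in the table above, one option is to keep only the highest-valence row for each track name. The base R sketch below does this on the ten rows shown; the data frame is rebuilt by hand for illustration, and the apostrophe in one title is written as a plain ASCII character.

```r
# The ten rows from the table above, rebuilt by hand for illustration
top_valence <- data.frame(
  track_name = c("Oxford Comma", "Oxford Comma", "M79", "White Sky", "M79",
                 "Sunflower (feat. Steve Lacy)", "Sunflower (feat. Steve Lacy)",
                 "White Sky", "The Kids Don't Stand A Chance", "Holiday"),
  valence = c(0.974, 0.973, 0.948, 0.944, 0.940,
              0.933, 0.932, 0.908, 0.906, 0.893),
  stringsAsFactors = FALSE
)

# Sort by valence (highest first), then drop later repeats of the same title
sorted        <- top_valence[order(-top_valence$valence), ]
unique_tracks <- sorted[!duplicated(sorted$track_name), ]

print(unique_tracks, row.names = FALSE)
```

On these rows the ten entries collapse to six distinct tracks, with “Oxford Comma” still at the top.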
The mean and median valence of each album are displayed below, ordered from highest to lowest value rather than by release chronology.
vampireweekend %>%
group_by(album_name) %>%
summarise(mean(valence)) %>%
arrange(desc(`mean(valence)`)) %>%
kable()

| album_name | mean(valence) |
|---|---|
| Contra | 0.7397500 |
| Vampire Weekend | 0.7362727 |
| Father of the Bride | 0.5201026 |
| Modern Vampires of the City | 0.4884615 |
vampireweekend %>%
group_by(album_name) %>%
summarise(median(valence)) %>%
arrange(desc(`median(valence)`)) %>%
kable()

| album_name | median(valence) |
|---|---|
| Contra | 0.8130 |
| Vampire Weekend | 0.7795 |
| Father of the Bride | 0.5250 |
| Modern Vampires of the City | 0.4910 |
vampireweekend %>%
group_by(album_name) %>%
ggplot(aes(x = valence, y = album_name, fill = after_stat(x))) +
geom_density_ridges_gradient() +
xlim(0,1) +
theme(legend.position = "none") +
ggtitle("Density Plot of Valences For Each Album") +
labs(y= "Album Name", x = "Valence")

vampireweekend %>%
group_by(album_name) %>%
ggplot(aes(x = valence, fill = album_name)) +
geom_density(alpha=.4, color=NA) +
xlim(0,1) +
ggtitle("Density Plot of Valences Across Albums") +
labs(y= "Density", x = "Valence", fill = "Album Name")

The means and medians show that the hypothesis is correct. Both measures are very high for the first two albums and much lower for the third album. The fourth album’s values fall in between, though much closer to the third album’s than to the first two.
On the density plots, the first two albums both cluster at high valences. The third album does not truly peak at all. The last album follows a similar pattern, peaking only slightly about halfway along the valence scale.
The lyrics suggested that polarized language in both directions increased over time, but the music does not follow the same pattern. If it did, the valence density plots would have started with a single peak in the center, and later albums would have shown two peaks of similar size, one on each side of the chart. It turned out to be the opposite.