Pitchfork is a website that publishes music reviews across all genres. It launched in the summer of 1996, and 26 years later it is still keeping up with music trends and reviewing new releases. I find this website very interesting because of the wide range of genres it reviews. In this analysis, I focus on five genres: jazz, folk/country, rap, rock, and pop/r&b. I feel that these five give a comprehensive picture of the data. For each genre, I look at the number of reviews per year from 1999 to 2017, the most common words used in the reviews, and the sentiment of the reviews. I hypothesize that the words used in the reviews will resemble words found in the lyrics of the corresponding genre. I also believe that the words used in the reviews will reflect the tone of that genre’s music.
First, I loaded my packages.
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## Loading required package: usethis
The data set I used is available on Kaggle: https://www.kaggle.com/datasets/nolanbconaway/pitchfork-data
With the help of my professor and class tutor, I converted this data set into a csv file.
## New names:
## • `` -> `...1`
## • `reviewid` -> `reviewid...2`
## • `reviewid` -> `reviewid...15`
## • `reviewid` -> `reviewid...17`
## • `reviewid` -> `reviewid...19`
## Rows: 23633 Columns: 20
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (8): title, artist, url, author, author_type, pub_date, genre, content
## dbl (12): ...1, reviewid...2, score, best_new_music, pub_weekday, pub_day, p...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
To start off this analysis, I wanted to look at the average ratings out of 10 for each genre.
## Warning in genre == c("jazz", "folk/country", "rap", "rock", "pop/r&b"): longer
## object length is not a multiple of shorter object length
## # A tibble: 5 × 2
## genre avg
## <chr> <dbl>
## 1 folk/country 7.36
## 2 jazz 7.34
## 3 rock 6.99
## 4 rap 6.91
## 5 pop/r&b 6.87
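The recycling warning printed above this table is what R emits when a column is compared with `==` against a vector of a different length. The source code is not shown, so the following is only a sketch under that assumption: if the genre filter was written with `==`, each row is checked against just one of the five genre names in turn, and many matching rows are silently dropped. Testing membership with `%in%` avoids this. The toy data frame and its values are made up for illustration.

```r
# Toy data: six reviews; genres and scores are invented for this example.
reviews <- data.frame(
  genre = c("rap", "rap", "rap", "jazz", "rock", "metal"),
  score = c(7.0, 8.0, 6.0, 9.0, 5.0, 4.0)
)
genres <- c("jazz", "folk/country", "rap", "rock", "pop/r&b")

# `==` compares element by element and recycles `genres`:
# row 1 is tested against "jazz", row 2 against "folk/country", and so on.
# R also warns here because 6 rows is not a multiple of 5 genre names.
kept_eq <- suppressWarnings(reviews[reviews$genre == genres, ])

# `%in%` asks, for every row, whether its genre appears anywhere in `genres`.
kept_in <- reviews[reviews$genre %in% genres, ]

nrow(kept_eq)  # rows that happened to line up with the recycled vector
nrow(kept_in)  # all rows whose genre is one of the five

# Average score per genre, computed on the correctly filtered rows.
avg_by_genre <- aggregate(score ~ genre, data = kept_in, FUN = mean)
avg_by_genre
```

If the averages in the table were computed after an `==` filter, they would be based on only a fraction of the reviews in each genre, so it may be worth re-running them with `%in%`.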
Next, I looked at the number of reviews per genre, per year. Pay attention to the y-axis of each graph, which shows the number of reviews. The graphs are all drawn at the same size, but each has its own y-axis range, so lines of equal height do not mean equal review counts.
## Warning: Removed 1 row(s) containing missing values (geom_path).
In this graph, jazz reviews peak around 2003. They drop around 2006 and rebound in 2008. From 2008 until 2015 there is a steady decline. From 2015 to 2017 the number of jazz reviews skyrockets, which suggests renewed interest in the genre.
This graph shows a steady increase in folk/country reviews until around 2007, followed by a drop that lasts until around 2012. From 2012 to 2017, the number of reviews gradually increases again.
In this graph, reviews of rap music increase gradually from 2000 to 2017, with two small dips around 2006 and 2014. From 2014 to 2017 the number of rap reviews climbs higher than ever before.
This graph shows rock reviews gradually increasing and then decreasing, in an almost perfect hump shape. The number of reviews rises until around 2007 and then gradually declines.
In this graph, the number of pop/r&b reviews increases until around 2011. It then declines until around 2014, when it shoots back up, reaching more reviews than ever before in 2017.
After looking at all of the graphs, you can get a sense of when certain types of music became more popular. For example, between 2008 and 2010 there is a spike in reviews for pop music. I remember this time period being huge for pop music. Similarly, from 2014 to 2017 there is an increase in rap music reviews, which also makes sense, as rap was very popular then. Looking at folk/country music, there is a big drop from 2007 until 2012. This makes sense to me because that era was overtaken by pop music. It would be interesting to print these graphs out on transparent paper, lay them on top of one another, and see how the lines match up.
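One way to compare the genres directly is to draw all five lines on a single set of axes. The following is a minimal sketch in base R, not the code behind the graphs above. The column name `pub_year` is an assumption (the column list in the import message is truncated), and the toy rows are invented.

```r
# Toy stand-in for the review table: one row per review.
# `genre` appears in the data set; `pub_year` is an assumed column name.
reviews <- data.frame(
  genre    = c("rap", "rap", "rock", "rock", "rock", "jazz", "rap"),
  pub_year = c(2015, 2016, 2015, 2015, 2016, 2016, 2016)
)

# Cross-tabulate: rows are years, columns are genres, cells are review counts.
counts_mat <- unclass(table(reviews$pub_year, reviews$genre))
years <- as.numeric(rownames(counts_mat))

# Draw one line per genre on shared axes so the trends can be compared directly.
line_cols <- seq_len(ncol(counts_mat))
matplot(years, counts_mat, type = "l", lty = 1, col = line_cols,
        xlab = "Year", ylab = "Number of reviews")
legend("topleft", legend = colnames(counts_mat), lty = 1, col = line_cols)
```

Because every line shares the same y-axis here, a genre with many reviews can flatten the others. Dividing each column by its own maximum before plotting is one option if only the shape of each trend matters.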
In the next section of my analysis, I looked at the most common words in the reviews for each genre. I chose to visualize them with word clouds: the bigger the word, the more often it is used.
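The word counts behind a word cloud can be sketched as follows. This is an illustration in base R under stated assumptions, not the pipeline used for the clouds in this report: the two review snippets and the short stop-word list are made up, and a real analysis would use a full stop-word list.

```r
# Two invented review snippets standing in for the `content` column.
content <- c("The guitar playing is warm and the bass is loud.",
             "A guitar record with playing that never drags.")

# A tiny illustrative stop-word list (hypothetical; far shorter than a real one).
stop_words <- c("the", "is", "and", "a", "with", "that", "never")

# Lower-case the text and split on anything that is not a letter, digit,
# or apostrophe, so tokens such as "60s" survive as single words.
words <- unlist(strsplit(tolower(content), "[^a-z0-9']+"))

# Drop empty strings and stop words.
words <- words[words != "" & !(words %in% stop_words)]

# Count each remaining word and sort from most to least frequent.
word_counts <- sort(table(words), decreasing = TRUE)
head(word_counts, 3)
```

The names and counts in `word_counts` are the two inputs a word cloud needs: the words to draw and the frequency that sets each word's size.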
## Joining, by = "word"
In this jazz word cloud, I find it interesting that “playing” is one of the main words used. I am led to believe that it refers to playing instruments. Other words that stick out to me, such as “bass,” “guitar,” “horn,” “sax,” “funk,” “60s,” and “70s,” all immediately make me think of jazz music. This word cloud shows what I was anticipating.
The most common word in the folk/country word cloud is “guitar.” “Acoustic” is also high on the list. I always think of acoustic guitars when I think of country music, so this makes sense. “Home,” “road,” and “career” are three words that stick out to me as especially specific to country. I also like that “pop” and “rock” are present, hinting at pop country and rock country.
This rap word cloud is very different from the previous two. “Fuck” and “shit” are the first swear words to appear in any of the word clouds. There is a lot of swearing in rap, so it makes sense that swear words would be referenced in the reviews as well. “90s” also catches my eye. I believe that the 90s were an iconic time for rap music, and I am sure the decade is still being referenced in today’s reviews. Lastly, “beats” and “black” have a big presence in this word cloud. “Beats” sets the tone of the word cloud and gives it that rap feeling. “Black” is another prominent word here, as rapping originated in Black culture.
In this rock music word cloud, “indie,” “punk,” “art,” and “dark” stick out to me as words I would associate with rock music. “Indie” and “punk” cover both sides of the rock spectrum. I associate “art” with the indie side of rock and “dark” with the punk side. I think this word cloud does a good job of showing the different parts of rock music.
The most common word in the pop/r&b word cloud is “love.” This makes sense, as every pop song I can think of is about love. One word that makes me happy to see is “dance.” In my opinion, pop music is the easiest to dance to, so it makes sense that reviewers write about it. “Summer” and “radio” also stick out to me. A lot of the pop music I know comes from listening to the radio, and even genres like country have a pop spin when played on the radio. “Summer” stands out because pop music reminds me of summer nights, in the way that it sounds almost airy.
The last part of my analysis is about sentiment. The graph below shows the average sentiment for each genre, on a scale from negative to positive.
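An average sentiment per genre can be computed by matching each word against a sentiment lexicon and averaging the matches. The sketch below is one possible approach in base R and is not taken from the source, which does not say which lexicon or scoring rule was used. The mini-lexicon, the toy word table, and the +1/-1 scoring are all assumptions made for illustration.

```r
# Hypothetical mini-lexicon; real lexicons label thousands of words.
lexicon <- data.frame(
  word      = c("love", "pretty", "sweet", "lost", "bad", "dead"),
  sentiment = c("positive", "positive", "positive",
                "negative", "negative", "negative")
)

# Toy word table: one row per word occurrence, tagged with its genre.
tokens <- data.frame(
  genre = c("pop/r&b", "pop/r&b", "pop/r&b", "rap", "rap", "rap"),
  word  = c("love", "pretty", "lost", "dead", "bad", "love")
)

# Inner join: keep only words found in the lexicon and attach their labels.
scored <- merge(tokens, lexicon, by = "word")

# Assumed scoring rule: +1 for a positive word, -1 for a negative word.
scored$value <- ifelse(scored$sentiment == "positive", 1, -1)

# Average the scores within each genre.
avg_sentiment <- aggregate(value ~ genre, data = scored, FUN = mean)
avg_sentiment
```

With this rule, a genre scores above zero when its matched words are mostly positive and below zero when they are mostly negative. Words missing from the lexicon do not count either way.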
## Warning in genre == c("jazz", "folk/country", "rap", "rock", "pop/r&b"): longer
## object length is not a multiple of shorter object length
In the pop/r&b graph, “love” is the most positive word, and “lost” and “bad” are the most negative.
This graph of jazz music stands out to me because the most common word is not “love.” This is very unusual. In this case, the most common word is “free,” which I find interesting. “Strange” is another word that I do not see often.
The folk/country graph shows two evenly matched extremes of sentiment. “Death,” “lost,” and “dead” are used often, but so are “pretty,” “sweet,” and “beautiful.” I think that this does a good job of showing the range of tone in folk/country reviews.
The sentiment in the top words for rap music is overwhelmingly negative. “Shit,” “fuck,” “dead,” and “cut” are all words that stand out in this graph. As in the word cloud, this graph highlights the use of swear words.
This rock graph looks very similar to the pop/r&b graph. Six of the words overlap, and both graphs start with the same order: “love,” “hard,” “pretty,” “easy.”
In conclusion, I would say that overall, the words used in reviews reflect the tone and feeling of the genre being reviewed. Apart from some overlap between rock and pop/r&b, the word clouds and sentiment graphs made it very clear which genre they belonged to. Many of the words in the word clouds especially could even be lyrics that would fit into that genre. I also believe that the graphs in the first set, which show the number of reviews per genre per year, are related to one another. As I said before, I would love to overlay the lines to see whether some genres consistently increase while others decrease. In the future, it would be interesting to split the word clouds by year to see if reviewers used slang that was popular at the time. Another route with this data set would be to look at specific artists in each genre. There are endless opportunities with this data set, but I am pleased with my findings.