Load Libraries

# Load any R Packages you may need
library(mosaic)
library(tidyverse)

Load Data

The following exercises use the spotify_songs.csv data set, which is hosted on GitHub. Load it into R directly from the URL below and name the resulting data frame spotify_songs.

https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv

#Type your code here...
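One possible way to do the load described above, as a minimal sketch: it assumes readr's read_csv() (attached with tidyverse) and a working internet connection. The variable name spotify_url is only an illustrative helper, not part of the assignment.

```r
library(readr)  # attached by library(tidyverse); loaded directly so this block stands alone

# URL given in the assignment text (spotify_url is an illustrative name)
spotify_url <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv"

# Read the CSV straight from GitHub into a data frame named spotify_songs
spotify_songs <- read_csv(spotify_url)

# Quick check of the first few rows
head(spotify_songs)
```

Reading from the URL rather than a local path keeps the document reproducible on another machine.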

Exercise 1

(a)

###Use favstats() to find several descriptive statistics for speechiness, separated by playlist_genre.

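# read_csv() comes from readr (attached with tidyverse); forward slashes avoid the invalid "\U" escape in a Windows path
spotifydata <- read_csv("C:/Users/sowpa/OneDrive/Documents/spotify_songs.csv")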

favstats(speechiness ~ playlist_genre, data = spotifydata, na.rm=TRUE)

gf_point(speechiness ~ playlist_genre, data = spotifydata) + labs(x = "playlist_genre", y = "speechiness", title = "Speechiness by Genre")

(b)

###Which genre has the highest average level of speechiness?

###The genre that has the highest average level of speechiness is Rap.

(c)

###Which genre has the lowest average level of speechiness?

###The genre that has the lowest average level of speechiness is Rock.
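The answers to (b) and (c) can be read off the mean column of the favstats() output. As a cross-check, the sketch below ranks genres by mean speechiness with dplyr. It runs on a small made-up data frame (toy_songs, with hypothetical values) so that it is self-contained; with the real data, spotifydata would take its place.

```r
library(dplyr)

# Hypothetical toy rows standing in for the real data (values are made up)
toy_songs <- data.frame(
  playlist_genre = c("rap", "rap", "rock", "rock", "pop", "pop"),
  speechiness    = c(0.20, 0.30, 0.04, 0.06, 0.08, 0.12)
)

# Mean speechiness per genre, sorted from highest to lowest
genre_means <- toy_songs %>%
  group_by(playlist_genre) %>%
  summarise(mean_speechiness = mean(speechiness, na.rm = TRUE)) %>%
  arrange(desc(mean_speechiness))

genre_means
```

The first row of the sorted result is the genre with the highest mean and the last row is the genre with the lowest.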

Exercise 2

###The band Queen has the most songs in this particular data set. Let’s compare Queen’s songs to another artist of your choosing.

###Using the code chunk below, filter the spotify_songs data set to contain only the songs by Queen and your chosen artist. Call this filtered data set spotify_songs_QueenV[your-artist], replacing [your-artist] with the name of the artist (e.g., if I choose Taylor Swift, the name would be spotify_songs_QueenVTaylorSwift):

Filter Data

###NOTE: Be sure to match exactly the spelling and case sensitivity when selecting categories or your data set might not filter properly!
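To illustrate the note above, here is a small sketch of how exact, case-sensitive matching behaves in dplyr's filter(). The data frame toy_songs and its rows are made up for illustration only.

```r
library(dplyr)

# Hypothetical toy rows (made up) to show how exact matching behaves
toy_songs <- data.frame(
  track_artist = c("Queen", "Queen", "Ed Sheeran", "queen"),
  track_name   = c("Song A", "Song B", "Song C", "Song D")
)

# List the distinct spellings present before filtering
sort(unique(toy_songs$track_artist))

# Exact, case-sensitive match: the lowercase "queen" row is NOT kept
two_artists <- toy_songs %>%
  filter(track_artist %in% c("Queen", "Ed Sheeran"))

nrow(two_artists)  # 3 of the 4 toy rows remain
```

Checking unique() on the column first is a quick way to copy the spelling exactly as it appears in the data.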

#Type your code here

spotify_subset <- filter(spotifydata, track_artist == "Queen" | track_artist == "Ed Sheeran")

print(spotify_subset)

tally(~ playlist_genre | track_artist, data = spotify_subset, format = "percent")

my_tbl <- tally(track_artist ~ playlist_genre, data = spotify_subset)
print(my_tbl)
mosaicplot(my_tbl, color = c("skyblue", "orange", "red", "brown", "green"), main = "% of Artist's Song By Playlist Genre")

gf_point(danceability ~ playlist_genre, color = ~ track_artist, data = spotify_subset) + labs(x = "playlist_genre", y = "danceability", title = "Danceability by Genre and Artist")

gf_point(energy ~ playlist_genre, color = ~ track_artist, data = spotify_subset) + labs(x = "playlist_genre", y = "energy", title = "Energy by Genre and Artist")

###Examine the filtered data set in R and compare Queen’s songs to those of your chosen artist by creating one chart or graph that helps compare the two artists.
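One chart that could serve this comparison is a side-by-side boxplot of a single numeric feature, split by artist. The sketch below assumes ggformula's gf_boxplot() (attached with mosaic) and uses a made-up data frame, toy_subset, with hypothetical danceability values; the filtered data set would replace it in practice.

```r
library(ggformula)  # attached by library(mosaic)
library(ggplot2)    # for labs()

# Hypothetical toy rows (made up); swap in the filtered data set for the real comparison
toy_subset <- data.frame(
  track_artist = rep(c("Queen", "Ed Sheeran"), each = 4),
  danceability = c(0.45, 0.55, 0.60, 0.50, 0.70, 0.65, 0.80, 0.75)
)

# Side-by-side boxplots: one box per artist on a shared danceability scale
artist_plot <- gf_boxplot(danceability ~ track_artist, data = toy_subset) +
  labs(x = "Artist", y = "Danceability", title = "Danceability: Queen vs. Ed Sheeran")

artist_plot
```

A boxplot puts both artists on the same axis, so differences in median and spread are visible in a single figure; the same formula works for energy or any other numeric column.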