To compare the popularity of the oldest album, Taylor Swift, with the newest album, Fearless (Taylor’s Version), a new data frame was created containing only the songs from these two albums and each song's popularity. This data frame, called oldvsnew, was then used to compute the five-number summary and the mean of popularity for each album. All of this was done with the dplyr package, which was used to filter the data, add a new column, and summarize the song data within each album.
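# load dplyr for %>%, filter(), select(), mutate(), group_by(), summarize(), arrange()
library(dplyr)

# data frame with just the old and new album, their songs, and popularity
oldvsnew <- music %>%
  filter(album %in% c("Taylor Swift", "Fearless (Taylor's Version)")) %>%
  select(name, album, popularity) %>%
  # assign one plotting color per album
  mutate(colors = ifelse(album == "Taylor Swift", "lightblue", "goldenrod3"))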
# five-number summary and mean of popularity for both albums
stats <- oldvsnew %>%
  group_by(album) %>%
  summarize(min = min(popularity),
            q1 = quantile(popularity, 0.25),
            median = median(popularity),
            mean = mean(popularity),
            q3 = quantile(popularity, 0.75),
            max = max(popularity)) %>%
  arrange(min) # order the albums by minimum popularity, ascending
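
As a quick check of how the filter, group, and summarize steps fit together, the same pipeline can be run on a small made-up data set. This is only a sketch: `toy_music`, `toy_stats`, the album names, and the popularity scores are hypothetical and are not taken from the actual `music` data.

```r
library(dplyr)

# Hypothetical stand-in for the `music` data frame; all values are invented
# for illustration and do not come from the real data.
toy_music <- data.frame(
  name = c("a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4", "c1"),
  album = c(rep("Album A", 4), rep("Album B", 4), "Album C"),
  popularity = c(10, 20, 30, 40, 50, 60, 70, 80, 99)
)

toy_stats <- toy_music %>%
  # keep only the two albums being compared (drops "Album C")
  filter(album %in% c("Album A", "Album B")) %>%
  group_by(album) %>%
  summarize(min = min(popularity),
            q1 = quantile(popularity, 0.25, names = FALSE),
            median = median(popularity),
            mean = mean(popularity),
            q3 = quantile(popularity, 0.75, names = FALSE),
            max = max(popularity)) %>%
  arrange(min)

print(toy_stats)
```

The result has one row per album and one column per statistic. Note that `quantile()` uses its default interpolation method here, so the quartiles can differ slightly from the hinges returned by base R's `fivenum()` for some sample sizes.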