Wrangling HW 4

Homework 4

First, we load the tidyverse library and our data. The data here is the Spotify data set that I am also using for the midterm and final project in this course.

library(tidyverse)
spotify_songs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')

This visualization explores the danceability of the most popular songs in the Spotify data. Track popularity is measured on a scale of 0 to 100, from least to most popular. The danceability variable describes how suitable a song is for dancing based on a combination of factors (tempo, rhythm, beat, regularity), on a scale from 0.0 (least danceable) to 1.0 (most danceable). In this scatter plot, I zoomed in on the most popular songs, those with popularity between 80 and 100, to see the genre breakdown among songs of high popularity and danceability. It is not surprising, given the genre breakdown, that these most popular songs are predominantly very danceable. It also makes sense that more popular songs would be songs that people want to dance to. While the insights from this visual are not particularly groundbreaking, I am pleased that the information comes across clearly and is easy to understand.

spotify_songs %>%
  ggplot(aes(x = track_popularity, y = danceability, color = playlist_genre)) +
  # Jitter the points and make them translucent to reduce overplotting
  geom_jitter(alpha = .25) +
  # Zoom in on popularity 80-100 without dropping any rows
  coord_cartesian(xlim = c(80, 100)) +
  scale_y_continuous(name = "Danceability") +
  scale_x_continuous(name = "Popularity") +
  ggtitle("Track Popularity vs Danceability")
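The genre breakdown that the plot shows visually could also be checked with a small numeric summary. The following is only a sketch under stated assumptions: `summarise_popular()` and `toy_songs` are hypothetical names introduced for illustration and are not part of the original homework, and the toy rows are made up to show the shape of the output, not taken from the Spotify data. Unlike `coord_cartesian()`, which only changes the visible window, `filter()` actually drops the less popular tracks before summarising.

```r
library(dplyr)

# Hypothetical helper (not part of the original homework): keep tracks at or
# above a popularity cutoff and summarise danceability per genre.
summarise_popular <- function(songs, min_popularity = 80) {
  songs %>%
    filter(track_popularity >= min_popularity) %>%
    group_by(playlist_genre) %>%
    summarise(
      n_tracks = n(),
      median_danceability = median(danceability),
      .groups = "drop"
    ) %>%
    arrange(desc(n_tracks))
}

# Toy rows made up purely to show the shape of the output; these are not
# values from the Spotify data.
toy_songs <- tibble(
  track_popularity = c(85, 92, 40, 81, 99),
  danceability     = c(0.70, 0.80, 0.50, 0.60, 0.90),
  playlist_genre   = c("pop", "pop", "rock", "rap", "pop")
)

toy_summary <- summarise_popular(toy_songs)
print(toy_summary)
```

With the real data loaded as above, the same helper could be called as `summarise_popular(spotify_songs)` to see how many tracks each genre contributes to the 80-100 popularity range.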

Emily Thie

11/13/2020