library(ggplot2)
library(readxl) # provides read_excel()
# Import the Excel file
albums <- read_excel("../00_data/data/myData.xlsx")
albums
## # A tibble: 500 × 21
## sort_name clean_name album rank_2003 rank_2012 rank_2020 differential
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Gaye, Marvin Marvin Ga… What… 6 6 1 5
## 2 Beach Boys The Beach… Pet … 2 2 2 0
## 3 Mitchell, Joni Joni Mitc… Blue 30 30 3 27
## 4 Wonder, Stevie Stevie Wo… Song… 56 57 4 52
## 5 Beatles The Beatl… Abbe… 14 14 5 9
## 6 Nirvana Nirvana Neve… 17 17 6 11
## 7 Fleetwood Mac Fleetwood… Rumo… 25 26 7 18
## 8 Prince and the R… Prince Purp… 72 76 8 64
## 9 Dylan, Bob Bob Dylan Bloo… 16 16 9 7
## 10 Hill, Lauryn Lauryn Hi… The … 312 314 10 302
## # ℹ 490 more rows
## # ℹ 14 more variables: release_year <dbl>, genre <chr>, type <chr>,
## # weeks_on_billboard <dbl>, peak_billboard_position <dbl>,
## # spotify_popularity <dbl>, spotify_url <chr>, artist_member_count <dbl>,
## # artist_gender <chr>, artist_birth_year_sum <dbl>,
## # debut_album_release_year <dbl>, ave_age_at_top_500 <dbl>,
## # years_between <dbl>, album_id <chr>
Hypothesis: the longer ago an album was released, the higher it is ranked (that is, the smaller its rank number).
ggplot(data = albums) +
  geom_point(mapping = aes(x = rank_2020, y = release_year))
Based on the scatterplot, there appears to be no correlation between the year an album was released and its rank in 2020.
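One way to back up the visual impression with a number is a rank correlation. The sketch below is not part of the original analysis: `rank_year_correlation()` is a hypothetical helper name, and Spearman's method is an assumed choice, picked because `rank_2020` is itself a ranking. It uses only base R's `cor()` and skips rows with missing values.

```r
# Hypothetical helper (an illustration, not the original author's method):
# Spearman rank correlation between two numeric columns of a data frame,
# using only rows where both values are present.
rank_year_correlation <- function(df,
                                  rank_col = "rank_2020",
                                  year_col = "release_year") {
  cor(df[[rank_col]], df[[year_col]],
      method = "spearman", use = "complete.obs")
}

# Usage, assuming `albums` has been loaded as above:
# rank_year_correlation(albums)
```

A result near 0 would be consistent with the "no correlation" reading of the plot. A clearly positive value would mean newer albums tend to have larger rank numbers (a lower ranking), which would support the hypothesis instead.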