Import data

library(ggplot2)
library(readxl)  # provides read_excel()
# excel file
albums <- read_excel("../00_data/data/myData.xlsx")
albums
## # A tibble: 500 × 21
##    sort_name         clean_name album rank_2003 rank_2012 rank_2020 differential
##    <chr>             <chr>      <chr>     <dbl>     <dbl>     <dbl>        <dbl>
##  1 Gaye, Marvin      Marvin Ga… What…         6         6         1            5
##  2 Beach Boys        The Beach… Pet …         2         2         2            0
##  3 Mitchell, Joni    Joni Mitc… Blue         30        30         3           27
##  4 Wonder, Stevie    Stevie Wo… Song…        56        57         4           52
##  5 Beatles           The Beatl… Abbe…        14        14         5            9
##  6 Nirvana           Nirvana    Neve…        17        17         6           11
##  7 Fleetwood Mac     Fleetwood… Rumo…        25        26         7           18
##  8 Prince and the R… Prince     Purp…        72        76         8           64
##  9 Dylan, Bob        Bob Dylan  Bloo…        16        16         9            7
## 10 Hill, Lauryn      Lauryn Hi… The …       312       314        10          302
## # ℹ 490 more rows
## # ℹ 14 more variables: release_year <dbl>, genre <chr>, type <chr>,
## #   weeks_on_billboard <dbl>, peak_billboard_position <dbl>,
## #   spotify_popularity <dbl>, spotify_url <chr>, artist_member_count <dbl>,
## #   artist_gender <chr>, artist_birth_year_sum <dbl>,
## #   debut_album_release_year <dbl>, ave_age_at_top_500 <dbl>,
## #   years_between <dbl>, album_id <chr>

State one question

Do older albums rank higher? Hypothesis: the longer ago an album was released, the higher (closer to 1) it ranks in 2020.

Plot data

ggplot(data = albums) +
    geom_point(mapping = aes(x = rank_2020, y = release_year))

Interpret

Based on the scatterplot, there appears to be no correlation between the year an album was released and its rank in 2020.
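
One way to back up the visual impression with a number is a rank correlation. The sketch below is not part of the original analysis: `rank_year_cor()` is a hypothetical helper name, and it assumes the `release_year` and `rank_2020` columns shown in the import step. It uses Spearman's method because `rank_2020` is an ordinal rank, and `use = "complete.obs"` in case either column has missing values.

```r
# Hypothetical helper (not in the original report): Spearman rank
# correlation between release year and 2020 rank.
rank_year_cor <- function(df) {
  cor(df$release_year, df$rank_2020,
      method = "spearman",
      use = "complete.obs")  # drop rows where either value is NA
}

# Apply it only if the `albums` tibble from the import step is loaded.
if (exists("albums")) {
  print(rank_year_cor(albums))
}
```

A value near 0 would be consistent with the reading above. A clearly positive value would mean newer albums tend to have larger rank numbers (that is, lower placements), which would support the hypothesis.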