Import data

library(tidyverse) # provides read_csv() and ggplot()
# csv file
data <- read_csv("../00_data/myData.csv")
data
## # A tibble: 691 × 22
##     ...1 sort_name   clean_name album rank_2003 rank_2012 rank_2020 differential
##    <dbl> <chr>       <chr>      <chr>     <dbl>     <dbl>     <dbl>        <dbl>
##  1     1 Sinatra, F… Frank Sin… "In …       100       101       282         -182
##  2     2 Diddley, Bo Bo Diddley "Bo …       214       216       455         -241
##  3     3 Presley, E… Elvis Pre… "Elv…        55        56       332         -277
##  4     4 Sinatra, F… Frank Sin… "Son…       306       308        NA         -195
##  5     5 Little Ric… Little Ri… "Her…        50        50       227         -177
##  6     6 Beyonce     Beyonce    "Lem…        NA        NA        32          469
##  7     7 Winehouse,… Amy Wineh… "Bac…        NA       451        33          468
##  8     8 Crickets    Buddy Hol… "The…       421       420        NA          -80
##  9     9 Bush, Kate  Kate Bush  "Hou…        NA        NA        68          433
## 10    10 Davis, Mil… Miles Dav… "Kin…        12        12        31          -19
## # ℹ 681 more rows
## # ℹ 14 more variables: release_year <dbl>, genre <chr>, type <chr>,
## #   weeks_on_billboard <dbl>, peak_billboard_position <dbl>,
## #   spotify_popularity <dbl>, spotify_url <chr>, artist_member_count <dbl>,
## #   artist_gender <chr>, artist_birth_year_sum <dbl>,
## #   debut_album_release_year <dbl>, ave_age_at_top_500 <dbl>,
## #   years_between <dbl>, album_id <chr>

State one question

How were artists ranked in 2003?

Plot data

ggplot(data = data) + 
  geom_point(mapping = aes(x = rank_2003, y = sort_name))

Interpret

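With every artist in the 691-row data set placed on the y-axis, the plot is too crowded to read. The data needs to be limited to a smaller subset before the plot can show the relationship between artists and their 2003 rankings.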
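
One possible way to limit the data is to keep only the best-ranked albums of 2003 before plotting. The sketch below is an illustration, not part of the original analysis: the helper name `plot_top_2003` and the cutoff of 20 albums are assumptions, and it uses `clean_name` for the axis labels because that column holds the readable artist names.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: keep the n best-ranked albums of 2003 and plot them.
# The default n = 20 is an arbitrary choice for readability.
plot_top_2003 <- function(df, n = 20) {
  top <- df %>%
    filter(!is.na(rank_2003)) %>%   # drop albums that were not on the 2003 list
    slice_min(rank_2003, n = n)     # rank 1 is best, so keep the smallest ranks

  ggplot(data = top) +
    # reorder() sorts the artists so the best rank appears at the top
    geom_point(mapping = aes(x = rank_2003,
                             y = reorder(clean_name, -rank_2003))) +
    labs(x = "Rank in 2003", y = "Artist")
}
```

Calling `plot_top_2003(data)` on the imported data should give a plot with a readable number of artist labels. A different subset, such as one genre, could be chosen with another `filter()` condition.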