The Report:

What is the dataset?

The dataset I chose for this project contains the top 1000 most played Spotify songs of all time. I chose it because it looked like good data for a project and because I enjoy music.

The dataset has 1000 rows, one per song, and 8 variables:

  1. track_name: The title of the song or track
  2. artist: Name(s) of the track’s performing artist(s)
  3. album: The album in which the track was initially released
  4. release_date: The official release date of the track or album
  5. popularity: A score (0-100) based on how frequently the track is streamed, shared, and added to playlists; higher means more popular
  6. spotify_url: Direct URL to the track on Spotify, useful for previewing or sharing
  7. id: ID given to the song by Spotify
  8. duration_min: Duration of the track in minutes (converted from milliseconds)

Interesting Finding

What I wanted to look at with this data was whether there is any connection between a song's release date and its popularity. The first thing I looked at was how many of the 1000 songs were released in each year.

library(tidyverse)  # provides ggplot2 and dplyr, used throughout this report

spotify_data <- read.csv(file="spotify_top_1000_tracks.csv")
# Parse the release date and keep only the four-digit year
spotify_dates <- as.Date(spotify_data$release_date, format = "%Y-%m-%d")
spotify_dates <- format(spotify_dates, "%Y")
years_table <- as.data.frame(table(spotify_dates))

# Horizontal bar chart: one bar per release year
print(ggplot(years_table, aes(x=Freq, y=spotify_dates)) +
  geom_bar(stat="identity") +
  labs(title = "How many Top 1000 songs were released in each year",
       x = "Number of songs in Top 1000", y = "Year of Release"))

As the graph shows, either we are coming out of one of the greatest years of music ever, or there is some recency bias toward songs released last year. I decided to exclude 2024 and 2025 because of that recency bias.

spotify_data <- read.csv(file="spotify_top_1000_tracks.csv")
spotify_dates <- as.Date(spotify_data$release_date, format = "%Y-%m-%d")
# Convert the year to an integer so the comparison below is numeric, not text
spotify_dates <- as.integer(format(spotify_dates, "%Y"))
# Drop 2024 and 2025 (subset() also drops dates that failed to parse)
spotify_dates <- subset(spotify_dates, spotify_dates < 2024)
years_table <- as.data.frame(table(spotify_dates))
years_table <- arrange(years_table, spotify_dates)

print(ggplot(years_table, aes(x=Freq, y=spotify_dates)) +
  geom_bar(stat="identity") +
  labs(title = "How many Top 1000 songs were released in each year (before 2024)",
       x = "Number of songs in Top 1000", y = "Year of Release"))

In the graph, we can see a clear, gradual increase from 2008 to 2018, followed by a gradual decrease after that.
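When filtering by year, it matters whether the year is stored as text or as a number. The sketch below uses a few made-up values (`years_chr` and `months_chr` are hypothetical, not taken from the dataset) and base R only. When one side of `<` is text, R converts the number to text and compares the two as strings, which happens to work for four-digit years but not for other values such as month codes.

```r
# Hypothetical values for illustration; not taken from the dataset
years_chr  <- c("2018", "2023", "2024", "2025")
months_chr <- c("01", "07", "12")

# Text vs. number: 2024 is converted to "2024" and compared as strings
text_year_filter  <- years_chr < 2024
text_month_filter <- months_chr < 2024   # month strings all sort before "2024"

# Converting to integer first makes the comparison numeric and explicit
num_year_filter <- as.integer(years_chr) < 2024

print(text_year_filter)
print(text_month_filter)
print(num_year_filter)
```

Converting to integer before filtering avoids relying on string ordering at all.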

spotify_data <- read.csv(file="spotify_top_1000_tracks.csv")
spotify_dates <- as.Date(spotify_data$release_date, format = "%Y-%m-%d")
# Keep only the two-digit month ("01" to "12")
spotify_dates <- format(spotify_dates, "%m")
# Drop dates that failed to parse; all release years are kept for the month view
spotify_dates <- spotify_dates[!is.na(spotify_dates)]
months_table <- as.data.frame(table(spotify_dates))
months_table <- arrange(months_table, spotify_dates)

print(ggplot(months_table, aes(x=spotify_dates, y=Freq)) +
  geom_bar(stat="identity") +
  labs(title = "How many Top 1000 songs were released in each month",
       x = "Month of Release", y = "Number of songs in Top 1000"))

This graph, like the last one, looks at release timing, but by month instead of by year. The months are in numerical form, so 01 is January and so on. February can be expected to have fewer releases because it is the shortest month. What surprised me most is that December is the second-lowest month. With the release of Christmas music, you would think December would be at least average. After looking it up, I learned that many Christmas songs are actually released in late November in preparation for December.

Next, I wanted to see the average popularity for each month. Every song here is in the top 1000, so simply being on the list is a sign of major popularity. But if the songs from one month are much less popular on average than those from other months, that could change the picture.
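One rough way to check the month-length explanation for February is to divide each month's count by the number of days in that month. The sketch below does this with base R; the counts in `month_counts` are made-up placeholders, not the values from the graph, and February is treated as 28 days as a simplifying assumption.

```r
# Placeholder counts for illustration only; replace with the real monthly table
month_counts <- c("01" = 80, "02" = 60, "03" = 90, "04" = 85, "05" = 88, "06" = 84,
                  "07" = 70, "08" = 82, "09" = 86, "10" = 92, "11" = 95, "12" = 65)

# Days per month, assuming a non-leap year
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# Songs released per calendar day in each month
songs_per_day <- month_counts / days_in_month
print(round(songs_per_day, 2))
```

If February still ranks near the bottom after this adjustment, month length alone does not explain the gap.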

spotify_data <- read.csv(file="spotify_top_1000_tracks.csv")
spotify_data$release_date <- as.Date(spotify_data$release_date, format = "%Y-%m-%d")
# Replace the full date with its two-digit month; unparsed dates stay NA
spotify_data$release_date <- format(spotify_data$release_date, "%m")

# Count the songs and average their popularity within each release month
testing <- spotify_data %>% 
  group_by(release_date) %>%
  summarize(
    count = n(),
    mean_popularity = mean(popularity, na.rm = TRUE)
  )

print(ggplot(testing, aes(x=release_date, y=mean_popularity, group = 1)) +
  geom_line() +
  labs(title = "Average popularity of Top 1000 songs by release month",
       x = "Month of Release", y = "Average Popularity of Songs"))

NA represents songs whose release date was missing or could not be parsed as a full year-month-day date. It is interesting that this group is about average in popularity compared with the actual months, which leaves room for a connection between release date and popularity. February is again at the bottom of the list, this time together with July.
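To see where an NA month can come from, the sketch below parses a few hypothetical release_date strings (made up for illustration, not rows from the dataset) with the same format string used in the analysis. `as.Date` returns NA for any value that does not match `%Y-%m-%d`, such as a year-only string or an empty string, so those rows end up in the NA group.

```r
# Hypothetical release_date values; not taken from the dataset
raw_dates <- c("2018-02-09", "2012", "2016-11", "")

parsed <- as.Date(raw_dates, format = "%Y-%m-%d")
months <- format(parsed, "%m")

# Only the first value is a complete date, so only it yields a month
print(data.frame(raw_dates, parsed, months))
print(sum(is.na(months)))
```

Counting the NA values this way on the real column would show how many songs fall into that group.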

library(tidyverse)

spotify_data <- read.csv(file="spotify_top_1000_tracks.csv")
spotify_data$release_date <- as.Date(spotify_data$release_date, format = "%Y-%m-%d")
spotify_data$release_date <- format(spotify_data$release_date, "%m")

testing <- spotify_data %>% 
  group_by(release_date) %>%
  summarize(
    count = n(),
    mean_popularity = mean(popularity, na.rm = TRUE)
  )

# Bars show the number of songs per month; the red line shows mean popularity.
# The line is drawn unscaled on the primary axis, so the secondary axis
# uses the identity transformation to label it correctly.
ggp <- ggplot(testing) +
  geom_bar(aes(x=release_date, y=count, group = 1), stat="identity") +
  geom_line(aes(x=release_date, y=mean_popularity, group = 1), color="red", linewidth=1) +
  labs(title = "Number and average popularity of Top 1000 songs by release month",
       x = "Month of Release", y = "Number of Songs") +
  scale_y_continuous(sec.axis = sec_axis(~ ., name = "Mean Popularity"))
print(ggp)

This graph combines the number of songs released in each month with the average popularity of those songs. The two measures follow a somewhat similar trajectory, and February and July are consistently near the bottom. The pattern does not hold everywhere, though: April's average popularity drops much more sharply than its song count does. This leads me to believe that there is probably a small connection between a song's popularity and when it is released. In this dataset, songs released in early spring and late fall tend to be more popular than songs released in the middle of summer.
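One way to put a number on how closely the two series move together is a correlation between the monthly song count and the monthly mean popularity. The sketch below is an assumption about how that check could look, not part of the original analysis; `monthly` is a made-up stand-in for the `testing` summary table, so the printed value says nothing about the real data.

```r
# Made-up monthly summary standing in for the `testing` table
monthly <- data.frame(
  release_date    = sprintf("%02d", 1:12),
  count           = c(80, 60, 90, 85, 88, 84, 70, 82, 86, 92, 95, 65),
  mean_popularity = c(58, 50, 60, 52, 59, 57, 49, 56, 58, 61, 62, 55)
)

# Pearson correlation between number of songs and average popularity
count_pop_cor <- cor(monthly$count, monthly$mean_popularity)
print(count_pop_cor)
```

With only twelve monthly points, a value like this is a rough indicator at best, and it cannot show that release month causes popularity.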

Skills Demonstration

1. Read in data from a .csv file.

library(tidyverse)
spotify_data <- read.csv(file="spotify_top_1000_tracks.csv")

2. Illustrate the use of summary on a data frame.

summary(spotify_data)
##   track_name           artist             album           release_date      
##  Length:1000        Length:1000        Length:1000        Length:1000       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    popularity    spotify_url             id             duration_min   
##  Min.   : 0.00   Length:1000        Length:1000        Min.   :0.9691  
##  1st Qu.:37.00   Class :character   Class :character   1st Qu.:2.7540  
##  Median :68.00   Mode  :character   Mode  :character   Median :3.2938  
##  Mean   :56.67                                         Mean   :3.3185  
##  3rd Qu.:79.00                                         3rd Qu.:3.7457  
##  Max.   :97.00                                         Max.   :9.4979

3. Illustrate the use of table on an attribute of a dataframe.

artist_table <- as.data.frame(table(spotify_data$artist))
artist_table <- arrange(artist_table, desc(Freq))
print(artist_table)
##                          Var1 Freq
## 1                  The Weeknd   26
## 2                Taylor Swift   25
## 3               Avril Lavigne   21
## 4                 Alan Walker   14
## 5               Ariana Grande   12
## 6                Selena Gomez   12
## 7                  Ed Sheeran   11
## 8               Justin Bieber   11
## 9                  Badscandal    9
## 10                Demi Lovato    9
## 11            Imagine Dragons    9
## 12                 Katy Perry    9
## 13                    Rihanna    9
## 14              Billie Eilish    8
## 15             Camila Cabello    8
## 16               David Guetta    8
## 17               Lana Del Rey    8
## 18                   Maroon 5    8
## 19                     PACANI    8
## 20                      Drake    7
## 21           The Chainsmokers    7
## 22                  Arc North    6
## 23               Brent Faiyaz    6
## 24                   Doja Cat    6
## 25             Meghan Trainor    6
## 26               Shawn Mendes    6
## 27                        Sia    6
## 28                      Adele    5
## 29                 Anne-Marie    5
## 30                    Ava Max    5
## 31                 Bebe Rexha    5
## 32                    Beyoncé    5
## 33              Calvin Harris    5
## 34                   Dua Lipa    5
## 35             Ellie Goulding    5
## 36                     Eminem    5
## 37                Goodscandal    5
## 38                   Harddope    5
## 39                 Marshmello    5
## 40                Miley Cyrus    5
## 41                  Nito-Onna    5
## 42              One Direction    5
## 43                Post Malone    5
## 44          Sabrina Carpenter    5
## 45          The Neighbourhood    5
## 46                   54GODART    4
## 47                       Aqua    4
## 48                Blvck Cobrv    4
## 49                 Bruno Mars    4
## 50       Cigarettes After Sex    4
## 51          Justin Timberlake    4
## 52                   Kavinsky    4
## 53                  Lil Nas X    4
## 54                     Miguel    4
## 55          Nightcore Reality    4
## 56                       P!nk    4
## 57                    Shakira    4
## 58                      TEZIS    4
## 59                 The Wanted    4
## 60                    5lowers    3
## 61                 A$AP Rocky    3
## 62               Clean Bandit    3
## 63                   DJ Snake    3
## 64                      EQRIC    3
## 65                Frank Ocean    3
## 66                     Halsey    3
## 67         Into The Nightcore    3
## 68             Jennifer Lopez    3
## 69                       Joji    3
## 70             Kendrick Lamar    3
## 71                  LexMorris    3
## 72                 Little Mix    3
## 73               Majid Jordan    3
## 74             Olivia Rodrigo    3
## 75                     The xx    3
## 76                       Twin    3
## 77                Yellow Pvnk    3
## 78                       ZAYN    3
## 79                       Zedd    3
## 80                   347aidan    2
## 81                       ABBA    2
## 82               Alessia Cara    2
## 83                  Ali Gatie    2
## 84                      Arash    2
## 85                   B Martin    2
## 86                      B.o.B    2
## 87            Backstreet Boys    2
## 88               Bella Poarch    2
## 89            Black Eyed Peas    2
## 90                  blackbear    2
## 91              Bryson Tiller    2
## 92                    Cascada    2
## 93               Charlie Puth    2
## 94           Childish Gambino    2
## 95         Christina Aguilera    2
## 96            Christina Perri    2
## 97                   Coldplay    2
## 98               Daddy Yankee    2
## 99                    Doechii    2
## 100         Empire Of The Sun    2
## 101                 Ericovich    2
## 102               Evanescence    2
## 103                  Flo Rida    2
## 104                    GIVĒON    2
## 105             Glass Animals    2
## 106          Gym Class Heroes    2
## 107                    Indila    2
## 108             Isabel LaRosa    2
## 109                  J Balvin    2
## 110                  Jay Sean    2
## 111                  Jessie J    2
## 112               Just Lowkey    2
## 113                Kali Uchis    2
## 114                     Kesha    2
## 115                  Labrinth    2
## 116                 Lady Gaga    2
## 117                  Lil Peep    2
## 118                Luis Fonsi    2
## 119              Lukas Graham    2
## 120                Mac Miller    2
## 121                     MANSA    2
## 122               Maria Beyer    2
## 123               Mark Ronson    2
## 124                Max Martis    2
## 125                    Medusa    2
## 126                   MetaBoy    2
## 127                     Ne-Yo    2
## 128             Nick Giardino    2
## 129                      NUUD    2
## 130               OneRepublic    2
## 131                Phantogram    2
## 132                     PHURS    2
## 133                      RAYE    2
## 134                  Route 94    2
## 135                 Sam Smith    2
## 136             SAMMY & LESEN    2
## 137          Sasha Alex Sloan    2
## 138  Selena Gomez & The Scene    2
## 139               Skylar Grey    2
## 140       Swedish House Mafia    2
## 141             The Kid LAROI    2
## 142        The Pussycat Dolls    2
## 143                 The Vamps    2
## 144                 Timbaland    2
## 145               Tones And I    2
## 146                Tory Lanez    2
## 147              Travis Scott    2
## 148                  Two Feet    2
## 149                  vaultboy    2
## 150              XXXTENTACION    2
## 151              Yohan Gerber    2
## 152             ✝✝✝ (Crosses)    1
## 153                  24kGoldn    1
## 154                      2WEI    1
## 155       5 Seconds of Summer    1
## 156                    5UNDER    1
## 157                     6LACK    1
## 158           7 Hills Worship    1
## 159                    Aanysa    1
## 160                  AFROUZEN    1
## 161                     Ainae    1
## 162                    Akcent    1
## 163                      Akon    1
## 164              Alan Jackson    1
## 165               Alban Chela    1
## 166             Alec Benjamin    1
## 167             Alex & Sierra    1
## 168                   Alex C.    1
## 169           Alexander Rybak    1
## 170            Alexandra Stan    1
## 171                    Alfons    1
## 172               Alicia Keys    1
## 173                     Alosa    1
## 174                       ANN    1
## 175                      ANRY    1
## 176              Anson Seabra    1
## 177            Arctic Monkeys    1
## 178            Arizona Zervas    1
## 179          Armin van Buuren    1
## 180                   Artemas    1
## 181                      Ashe    1
## 182                  Astrid S    1
## 183                Asura Ghai    1
## 184                    AURORA    1
## 185                    AVAION    1
## 186                    Avicii    1
## 187            Axel Johansson    1
## 188                 Ayo & Teo    1
## 189                 Baby Tate    1
## 190            Bad Meets Evil    1
## 191                    Baltra    1
## 192                     Bazzi    1
## 193                 BEATSMASH    1
## 194                   Becky G    1
## 195                 Ben Delay    1
## 196                Ben Leuman    1
## 197              benny blanco    1
## 198              Benson Boone    1
## 199                  Besomage    1
## 200                 Besomorph    1
## 201                     BICEP    1
## 202                   BIMONTE    1
## 203             Bishop Briggs    1
## 204                 Blackjack    1
## 205                      Blue    1
## 206              Blue Violets    1
## 207                Bo Burnham    1
## 208                  Bon Jovi    1
## 209                BoyWithUke    1
## 210           Bridgit Mendler    1
## 211            Britney Spears    1
## 212              Bronski Beat    1
## 213                   Bruklin    1
## 214             Capri Everitt    1
## 215                   Cardi B    1
## 216          Carly Rae Jepsen    1
## 217                   Cartoon    1
## 218                     CARYS    1
## 219                    Catiso    1
## 220             Celina Sharma    1
## 221               Céline Dion    1
## 222                  Cesqeaux    1
## 223              Charly Black    1
## 224                Cher Lloyd    1
## 225              chicago city    1
## 226          Chord Overstreet    1
## 227                 citrulinq    1
## 228                      CKay    1
## 229                 Clara Mae    1
## 230                Conan Gray    1
## 231                    Corm!!    1
## 232                    Corona    1
## 233                      Cour    1
## 234                     CRÜPO    1
## 235                   CryJaxx    1
## 236           Crystal Castles    1
## 237              Culture Code    1
## 238                     CYRIL    1
## 239                Dan + Shay    1
## 240            Daniel Bellomo    1
## 241   Daryl Hall & John Oates    1
## 242                Davy Fresh    1
## 243                      Daya    1
## 244                Dean Lewis    1
## 245              Depeche Mode    1
## 246            Destiny Rogers    1
## 247                  Devinity    1
## 248               Dillin Hoox    1
## 249 Dimitri Vegas & Like Mike    1
## 250        Divine Deluxe Hitz    1
## 251                DJ Fronteo    1
## 252                   DJ Goja    1
## 253                 DJ Shadow    1
## 254                       Djo    1
## 255                 Dmitrii G    1
## 256                      dnvn    1
## 257                  Don Omar    1
## 258               Don Toliver    1
## 259              Dove Cameron    1
## 260               Dream Chaos    1
## 261           Duncan Laurence    1
## 262                       E S    1
## 263               Edward Maya    1
## 264                     Egzod    1
## 265                Elley Duhé    1
## 266                   ElyOtto    1
## 267                Em Beihold    1
## 268                   EMELINE    1
## 269                     Eniru    1
## 270          Enrique Iglesias    1
## 271                Faul & Wad    1
## 272             Fifth Harmony    1
## 273              Flame Runner    1
## 274         Foster The People    1
## 275                    Future    1
## 276                    G-Eazy    1
## 277               Gabry Ponte    1
## 278            Gang of Youths    1
## 279                   Gaullin    1
## 280                     GAYLE    1
## 281             Gesaffelstein    1
## 282                     gnash    1
## 283                     GonSu    1
## 284                  Gorillaz    1
## 285                     Gotye    1
## 286             Gracie Abrams    1
## 287                 Green Day    1
## 288                    GYMBRO    1
## 289                  Haddaway    1
## 290                  Harmless    1
## 291              Harry Styles    1
## 292                     HAWK.    1
## 293                      Home    1
## 294                     HUGEL    1
## 295                     HXDES    1
## 296                 HXPETRAIN    1
## 297               Hyper VIPER    1
## 298               iamnotshane    1
## 299                      IDER    1
## 300              Idina Menzel    1
## 301               Iggy Azalea    1
## 302              Infinity Ink    1
## 303                     Iniko    1
## 304                     ISAEV    1
## 305              Isaiah Falls    1
## 306                   J.Tajor    1
## 307                     Jaden    1
## 308              Jai Waetford    1
## 309               James Blunt    1
## 310                       Jax    1
## 311                     JAYEM    1
## 312              Jaymes Young    1
## 313                   JAZMYNE    1
## 314               Jean Dawson    1
## 315            Jessica Mauboy    1
## 316                Joel Adams    1
## 317            Johnny Orlando    1
## 318                   Joinnus    1
## 319                Jon Lajoie    1
## 320            Jonas Brothers    1
## 321             Jordan Clarke    1
## 322             Josh Caballes    1
## 323                JubyPhonic    1
## 324                Juice WRLD    1
## 325                      JVKE    1
## 326                    K'NAAN    1
## 327                     K-391    1
## 328                   K3YN0T3    1
## 329           Kacey Musgraves    1
## 330                Kaia Jette    1
## 331                    KALUMA    1
## 332                   KAROL G    1
## 333                 Kate Bush    1
## 334                KAYTRANADA    1
## 335            Kelly Clarkson    1
## 336                       Ken    1
## 337               Kenya Grace    1
## 338                  Kid Cudi    1
## 339                Kid Travis    1
## 340                      KiDi    1
## 341                   KiLLTEQ    1
## 342                      Kina    1
## 343       Kurt Hugo Schneider    1
## 344                       kwn    1
## 345                      Kygo    1
## 346                 Kyle Hume    1
## 347                      Lauv    1
## 348         League of Legends    1
## 349                     LEDUC    1
## 350               Leona Lewis    1
## 351                 Lew Heart    1
## 352             Lewis Capaldi    1
## 353                Leyla Blue    1
## 354               Liam Dakota    1
## 355                  Libianca    1
## 356                 Lil Skies    1
## 357                 Lil Tecca    1
## 358                 Lil Wayne    1
## 359  Lilly Wood and The Prick    1
## 360          Lindsey Stirling    1
## 361                  Lintrepy    1
## 362                     Lorde    1
## 363                Loren Gray    1
## 364               LØST SIGNAL    1
## 365                  Lost Sky    1
## 366             Louis Theroux    1
## 367          Luke Christopher    1
## 368                  Lykke Li    1
## 369                       M83    1
## 370            Madilyn Bailey    1
## 371              Madison Beer    1
## 372                   Madonna    1
## 373                    MAGIC!    1
## 374               Magnus Gunn    1
## 375                   Mahalia    1
## 376                  Mandrazo    1
## 377               Marco Nobel    1
## 378                     Mario    1
## 379                Mark Ambor    1
## 380             Martin Garrix    1
## 381               Masked Wolf    1
## 382          Melanie Martinez    1
## 383                 Melodream    1
## 384                    Mentol    1
## 385              Milky Chance    1
## 386                     MNA55    1
## 387            Modern Talking    1
## 388                      MOHA    1
## 389                   Mohombi    1
## 390              Montell Fish    1
## 391               Mr Saxobeat    1
## 392                  Mr.Kitty    1
## 393                 Mura Masa    1
## 394                      Muse    1
## 395                      MXZI    1
## 396                      n$vd    1
## 397                     nashi    1
## 398               Naughty Boy    1
## 399                    NEFFEX    1
## 400                     Neoni    1
## 401             Nessa Barrett    1
## 402                  New West    1
## 403                     NEWER    1
## 404               Nicky Youre    1
## 405                 Night Inn    1
## 406     Nightcore Collectives    1
## 407          Nightcore Dreams    1
## 408             Nightcore Red    1
## 409                Niklas Dee    1
## 410                  Nitecore    1
## 411                  NO FEELS    1
## 412                Noah Cyrus    1
## 413                     Noizy    1
## 414                     NoMBe    1
## 415                 NxWorries    1
## 416               Ocean Roses    1
## 417                    Oleria    1
## 418             PARTYNEXTDOOR    1
## 419                 Passenger    1
## 420                     PHARØ    1
## 421                 Phil Good    1
## 422               Pink Sweat$    1
## 423                     Pixia    1
## 424                  Pokaraet    1
## 425                  Pop Mage    1
## 426                    Poylow    1
## 427            Princess Nokia    1
## 428                   PRIYANX    1
## 429                 ptasinski    1
## 430                    PUBLIC    1
## 431                    Raaban    1
## 432            Rachel Platten    1
## 433             Ramin Djawadi    1
## 434                   Rawanne    1
## 435               Reed Deming    1
## 436                      Rema    1
## 437                 Retronaut    1
## 438                  RezaDead    1
## 439          Ricky Montgomery    1
## 440                      Ridi    1
## 441                    Rixton    1
## 442                 Rosa Linn    1
## 443                      ROSÉ    1
## 444                  ROY KNOX    1
## 445       Royal & the Serpent    1
## 446              Rui Da Silva    1
## 447                   Ruth B.    1
## 448                 SABRINA G    1
## 449                     Saige    1
## 450               salem ilese    1
## 451               Sam Tinnesz    1
## 452                   Sandëro    1
## 453                   SANDICE    1
## 454                   Santana    1
## 455             Sarah Jeffery    1
## 456                  Saweetie    1
## 457                     Scity    1
## 458               Scythermane    1
## 459             Sean Kingston    1
## 460                 Sean Paul    1
## 461                 Sevdaliza    1
## 462                 Shontelle    1
## 463   Sidewalks and Skeletons    1
## 464                       SiR    1
## 465                      Skan    1
## 466    Ski Mask The Slump God    1
## 467                 SKYXLINER    1
## 468                   SLANDER    1
## 469                 Snakehips    1
## 470              Snoh Aalegra    1
## 471                 Soap&Skin    1
## 472            SouthmadeVelly    1
## 473         sped up nightcore    1
## 474               Spice Girls    1
## 475                Stela Cole    1
## 476           Stephen Sanchez    1
## 477                STICKY KEY    1
## 478                 Sub Urban    1
## 479    Subspace Super Highway    1
## 480               Sugar Jesus    1
## 481                       SZA    1
## 482                  t.A.T.u.    1
## 483       Tasha Cobbs Leonard    1
## 484                Tate McRae    1
## 485                 Tech N9ne    1
## 486                  Telenova    1
## 487                    Tesher    1
## 488               THE ANXIETY    1
## 489                   The Cab    1
## 490     The Chemical Brothers    1
## 491                The Cramps    1
## 492                 The Score    1
## 493                The Script    1
## 494          The Tech Thieves    1
## 495           The Temper Trap    1
## 496                 TheFatRat    1
## 497                    Tiësto    1
## 498            Tommee Profitt    1
## 499                   Tove Lo    1
## 500                     Train    1
## 501             Trevor Daniel    1
## 502                      TRFN    1
## 503          Trinidad Cardona    1
## 504         Twenty One Pilots    1
## 505                  UNKLFNKL    1
## 506                      Used    1
## 507                Van Snyder    1
## 508                  Vanillaz    1
## 509                 VIC MENSA    1
## 510             We Architects    1
## 511                  WEEDMANE    1
## 512                    WILLOW    1
## 513                Witt Lowry    1
## 514               Wiz Khalifa    1
## 515                  WizTheMc    1
## 516                 Xanemusic    1
## 517                      Xizt    1
## 518                Yagih Mael    1
## 519                 YG Marley    1
## 520                    Yooniq    1
## 521                  yung kai    1
## 522                 Yung Lean    1
## 523                 Zac Efron    1
## 524              Zara Larsson    1
## 525                    ZODIVK    1
## 526                       Zum    1

4. Output all the column names in a data frame.

colnames(spotify_data)
## [1] "track_name"   "artist"       "album"        "release_date" "popularity"  
## [6] "spotify_url"  "id"           "duration_min"

5. Output the min, max, average, and standard deviation of a variable from a data frame.

"Shortest song in Top 1000: "
## [1] "Shortest song in Top 1000: "
min(spotify_data$duration_min)
## [1] 0.96915
"Longest song in Top 1000: "
## [1] "Longest song in Top 1000: "
max(spotify_data$duration_min)
## [1] 9.497883
"Average song length in Top 1000: "
## [1] "Average song length in Top 1000: "
mean(spotify_data$duration_min)
## [1] 3.318516
"Standard Deviation of song length in Top 1000: "
## [1] "Standard Deviation of song length in Top 1000: "
sd(spotify_data$duration_min)
## [1] 0.8495905
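
The same four statistics can also be collected in one table with dplyr's summarise. A minimal sketch, using a made-up `toy_durations` data frame rather than the real file:

```r
library(dplyr)

# Made-up durations in minutes; not taken from the dataset
toy_durations <- data.frame(duration_min = c(2.5, 3.0, 3.5, 4.0))

# One row with the minimum, maximum, mean, and standard deviation
duration_stats <- toy_durations %>%
  summarise(
    shortest = min(duration_min),
    longest  = max(duration_min),
    average  = mean(duration_min),
    std_dev  = sd(duration_min)
  )
print(duration_stats)
```

Swapping `toy_durations` for `spotify_data` would give the values printed above in a single row.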

6. Illustrate how you can select columns of a data frame into a new data frame. Show your result by executing a summary or glimpse of the new data frame.

new_dataframe <- select(spotify_data, "artist", "album", 
                        "release_date")
summary(new_dataframe)
##     artist             album           release_date      
##  Length:1000        Length:1000        Length:1000       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character

7. Illustrate renaming a column.

spotify_rename <- rename(spotify_data, length_min = duration_min)
glimpse(spotify_rename)
## Rows: 1,000
## Columns: 8
## $ track_name   <chr> "All The Stars (with SZA)", "Starboy", "Señorita", "Heat …
## $ artist       <chr> "Kendrick Lamar", "The Weeknd", "Shawn Mendes", "Glass An…
## $ album        <chr> "Black Panther The Album Music From And Inspired By", "St…
## $ release_date <chr> "2018-02-09", "2016-11-25", "2019-06-21", "2020-08-07", "…
## $ popularity   <int> 95, 90, 80, 87, 87, 77, 73, 80, 84, 88, 88, 81, 85, 86, 6…
## $ spotify_url  <chr> "https://open.spotify.com/track/3GCdLUSnKSMJhs4Tj6CV3s", …
## $ id           <chr> "3GCdLUSnKSMJhs4Tj6CV3s", "7MXVkk9YMctZqd1Srtv4MB", "0TK2…
## $ length_min   <dbl> 3.869767, 3.840883, 3.182667, 3.980083, 3.432433, 3.67965…

8. Illustrate the use of filter (from tidyverse) on a data frame.

filter_spotify <- filter(spotify_data, duration_min > 4)
glimpse(filter_spotify)
## Rows: 155
## Columns: 8
## $ track_name   <chr> "Apocalypse", "Love Me Like You Do", "Summertime Sadness"…
## $ artist       <chr> "Cigarettes After Sex", "Ellie Goulding", "Lana Del Rey",…
## $ album        <chr> "Cigarettes After Sex", "Fifty Shades Freed (Original Mot…
## $ release_date <chr> "2017-06-09", "2018-02-09", "2012-01-01", "2016-11-25", "…
## $ popularity   <int> 73, 81, 85, 81, 87, 83, 88, 89, 83, 41, 75, 87, 81, 71, 7…
## $ spotify_url  <chr> "https://open.spotify.com/track/0yc6Gst2xkRu0eMLeRMGCX", …
## $ id           <chr> "0yc6Gst2xkRu0eMLeRMGCX", "0Cy7wt6IlRfBPHXXjmZbcP", "3BJe…
## $ duration_min <dbl> 4.843600, 4.225333, 4.423783, 4.337533, 4.179333, 4.74776…

9. Illustrate the use of arrange on a dataframe. Use any column of your choosing, but sort from greatest to least.

arrange_spotify <- arrange(spotify_data, desc(popularity))
glimpse(arrange_spotify)
## Rows: 1,000
## Columns: 8
## $ track_name   <chr> "That’s So True", "APT.", "All The Stars (with SZA)", "I …
## $ artist       <chr> "Gracie Abrams", "ROSÉ", "Kendrick Lamar", "Arctic Monkey…
## $ album        <chr> "The Secret of Us (Deluxe)", "APT.", "Black Panther The A…
## $ release_date <chr> "2024-10-18", "2024-10-18", "2018-02-09", "2013-09-09", "…
## $ popularity   <int> 97, 96, 95, 93, 93, 92, 92, 92, 92, 90, 90, 90, 90, 90, 9…
## $ spotify_url  <chr> "https://open.spotify.com/track/7ne4VBA60CxGM75vw0EYad", …
## $ id           <chr> "7ne4VBA60CxGM75vw0EYad", "5vNRhkKd0yEAg8suGBpjeY", "3GCd…
## $ duration_min <dbl> 2.771667, 2.831950, 3.869767, 3.065933, 4.078067, 4.44621…

10. Use slice_max to output the top 4 rows of a data frame. Use any column of your choosing.

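# slice_max() keeps ties by default (with_ties = TRUE), so 5 rows are returned here:
# the 4th and 5th tracks share a popularity of 93.
slice_max(spotify_data, popularity, n = 4)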
##                                       track_name         artist
## 1                                 That’s So True  Gracie Abrams
## 2                                           APT.           ROSÉ
## 3                       All The Stars (with SZA) Kendrick Lamar
## 4                               I Wanna Be Yours Arctic Monkeys
## 5 One Of The Girls (with JENNIE, Lily Rose Depp)     The Weeknd
##                                                     album release_date
## 1                               The Secret of Us (Deluxe)   2024-10-18
## 2                                                    APT.   2024-10-18
## 3      Black Panther The Album Music From And Inspired By   2018-02-09
## 4                                                      AM   2013-09-09
## 5 The Idol Episode 4 (Music from the HBO Original Series)   2023-06-23
##   popularity                                           spotify_url
## 1         97 https://open.spotify.com/track/7ne4VBA60CxGM75vw0EYad
## 2         96 https://open.spotify.com/track/5vNRhkKd0yEAg8suGBpjeY
## 3         95 https://open.spotify.com/track/3GCdLUSnKSMJhs4Tj6CV3s
## 4         93 https://open.spotify.com/track/5XeFesFbtLpXzIVDNQP22n
## 5         93 https://open.spotify.com/track/7CyPwkp0oE8Ro9Dd5CUDjW
##                       id duration_min
## 1 7ne4VBA60CxGM75vw0EYad     2.771667
## 2 5vNRhkKd0yEAg8suGBpjeY     2.831950
## 3 3GCdLUSnKSMJhs4Tj6CV3s     3.869767
## 4 5XeFesFbtLpXzIVDNQP22n     3.065933
## 5 7CyPwkp0oE8Ro9Dd5CUDjW     4.078067
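
The output above has five rows even though n = 4 was requested, because slice_max() keeps rows tied on the cut-off value by default. The following is a small sketch of how exactly four rows could be returned instead. It uses a toy data frame rebuilt from the five rows printed above; the names top_tracks, with_ties_kept and exactly_four are made up for the example.

```r
library(dplyr)

# Toy data frame: track_name and popularity of the five rows printed above.
top_tracks <- data.frame(
  track_name = c("That’s So True", "APT.", "All The Stars (with SZA)",
                 "I Wanna Be Yours",
                 "One Of The Girls (with JENNIE, Lily Rose Depp)"),
  popularity = c(97L, 96L, 95L, 93L, 93L)
)

# Default (with_ties = TRUE): both tracks with popularity 93 are kept.
with_ties_kept <- slice_max(top_tracks, popularity, n = 4)

# with_ties = FALSE drops the extra tied row, so exactly n rows come back.
exactly_four <- slice_max(top_tracks, popularity, n = 4, with_ties = FALSE)

print(nrow(with_ties_kept))
print(nrow(exactly_four))
```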

11. Illustrate the use of a pipe operation (%>%)

spotify_data %>%
  arrange(desc(duration_min)) %>%
  slice_head(n = 5) %>%
  select(duration_min, track_name, album, artist)
##   duration_min                                      track_name
## 1     9.497883                                        Lost Boy
## 2     8.436667                               I'm Getting Ready
## 3     8.069100                                         Mirrors
## 4     7.476217 What Goes Around.../...Comes Around (Interlude)
## 5     7.035767                              Achilles Come Down
##                                            album              artist
## 1                                           SYRE               Jaden
## 2                       Heart. Passion. Pursuit. Tasha Cobbs Leonard
## 3 The 20/20 Experience - The Complete Experience   Justin Timberlake
## 4                           FutureSex/LoveSounds   Justin Timberlake
## 5                        Go Farther In Lightness      Gang of Youths
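
To show what the pipe is doing, the sketch below writes the same kind of chain twice: once as nested function calls and once with %>%. The pipe passes the result on its left as the first argument of the function on its right, so both versions should give the same data frame. The toy data frame songs is rebuilt from three rows of the output above; the variable names are assumptions for the example.

```r
library(dplyr)

# Toy data frame: three of the tracks printed above, deliberately unsorted.
songs <- data.frame(
  track_name   = c("Mirrors", "Lost Boy", "I'm Getting Ready"),
  duration_min = c(8.069100, 9.497883, 8.436667)
)

# Nested calls: read from the inside out.
nested <- select(
  slice_head(arrange(songs, desc(duration_min)), n = 2),
  duration_min, track_name
)

# Piped calls: read from top to bottom, one step per line.
piped <- songs %>%
  arrange(desc(duration_min)) %>%
  slice_head(n = 2) %>%
  select(duration_min, track_name)

# Check whether the two approaches produce the same result.
print(identical(nested, piped))
```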

12. Use ggplot to create 2 different visualizations of your data

# Scatter plot of popularity score against track length
ggplot(spotify_data, aes(x = popularity, y = duration_min)) +
  geom_point() +
  labs(title = "Popularity vs Duration of Song",
       x = "Popularity of Song (0-100)", y = "Duration of Song (minutes)")

glimpse(spotify_data)
## Rows: 1,000
## Columns: 8
## $ track_name   <chr> "All The Stars (with SZA)", "Starboy", "Señorita", "Heat …
## $ artist       <chr> "Kendrick Lamar", "The Weeknd", "Shawn Mendes", "Glass An…
## $ album        <chr> "Black Panther The Album Music From And Inspired By", "St…
## $ release_date <chr> "2018-02-09", "2016-11-25", "2019-06-21", "2020-08-07", "…
## $ popularity   <int> 95, 90, 80, 87, 87, 77, 73, 80, 84, 88, 88, 81, 85, 86, 6…
## $ spotify_url  <chr> "https://open.spotify.com/track/3GCdLUSnKSMJhs4Tj6CV3s", …
## $ id           <chr> "3GCdLUSnKSMJhs4Tj6CV3s", "7MXVkk9YMctZqd1Srtv4MB", "0TK2…
## $ duration_min <dbl> 3.869767, 3.840883, 3.182667, 3.980083, 3.432433, 3.67965…

# Count how many Top 1000 songs each artist has
artist_table <- as.data.frame(table(spotify_data$artist))
artist_table <- arrange(artist_table, desc(Freq))
# Keep only artists with more than 6 songs so the chart stays readable
artist_table <- filter(artist_table, Freq > 6)
ggplot(artist_table, aes(x = Freq, y = Var1)) +
  geom_bar(stat = "identity") +
  labs(title = "Artists* and the number of Top 1000 songs they have",
       caption = "*Only artists with more than 6 songs are included",
       x = "# of Songs",
       y = "Artist Name")
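
One thing to note about the bar chart above: sorting the data frame with arrange() does not change the order of the bars, because ggplot orders a discrete axis by the factor levels of Var1, which table() leaves in alphabetical order. A possible way to sort the bars by song count is reorder(), sketched below. The artist names and counts in toy_counts are hypothetical placeholders for illustration, not values from the dataset.

```r
library(ggplot2)

# Hypothetical counts, for illustration only (not taken from the dataset).
toy_counts <- data.frame(
  Var1 = factor(c("Artist A", "Artist B", "Artist C")),
  Freq = c(7, 12, 9)
)

# reorder() re-levels the factor by Freq (ascending),
# so the bars are drawn in order of song count rather than alphabetically.
sorted_levels <- levels(reorder(toy_counts$Var1, toy_counts$Freq))
print(sorted_levels)

sorted_plot <- ggplot(toy_counts, aes(x = Freq, y = reorder(Var1, Freq))) +
  geom_col() +
  labs(x = "# of Songs", y = "Artist Name")
```

Calling print(sorted_plot) would draw the chart; with a horizontal bar chart, the largest count ends up at the top.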