Using Machine Learning to Predict Song Popularity
Goal: Use ML to identify which attributes most influence song popularity.
**What did we do?**

We used machine learning (ML) techniques to predict song popularity from attributes such as energy, danceability, and acousticness. Two models were applied: Linear Regression as a simple, interpretable baseline, and Random Forest for potentially higher prediction accuracy.
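As a rough illustration of the two models named above, the sketch below fits both to synthetic data. The data frame, the coefficients used to generate `popularity`, and the `ntree` value are all made up for the example; they are not the project's data or settings. The Random Forest part assumes the `randomForest` package and is skipped if that package is not installed.

```r
# Synthetic stand-in data: NOT the Spotify dataset.
set.seed(1)
n <- 300
songs <- data.frame(
  energy       = runif(n),
  danceability = runif(n),
  acousticness = runif(n)
)
# Made-up relationship plus noise, so the fitted coefficients can be checked.
songs$popularity <- 40 + 25 * songs$danceability + 10 * songs$energy -
  15 * songs$acousticness + rnorm(n, sd = 5)

# Linear Regression: coefficients show the direction and size of each effect.
lin_fit <- lm(popularity ~ energy + danceability + acousticness, data = songs)
print(round(coef(lin_fit), 2))

# Random Forest: assumes the randomForest package; skipped when unavailable.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_fit <- randomForest::randomForest(
    popularity ~ energy + danceability + acousticness,
    data = songs, ntree = 200, importance = TRUE
  )
  # Variable importance gives a model-based ranking of the attributes.
  print(randomForest::importance(rf_fit))
}
```

The linear model's coefficients and the forest's importance scores are two different ways of approaching the stated goal of finding which attributes matter most; they need not agree exactly.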
## Values of popularity, energy, danceability, and acousticness

This code prepares the Spotify dataset for analysis:

- Load Libraries: Uses dplyr and ggplot2 for data handling and visuals.
- Import Data: Reads the Spotify dataset from a CSV file.
- Fill Missing Data: Adds random placeholder values for popularity, energy, danceability, and acousticness.
- Clean Data: Keeps only the necessary columns and removes rows with missing values.
- Split Data: Divides the dataset into 80% training and 20% testing for machine learning.

These steps put the data into a consistent shape for analysis and modeling.
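The column selection, cleaning, and 80/20 split described above could look roughly like the following. This is a minimal sketch, not the project's actual code: it uses base R only so it runs without extra packages (the original used dplyr), it builds a small synthetic data frame in place of the CSV, and it leaves out the random placeholder fill. The column name `track_name` and the seed are assumptions.

```r
# Synthetic stand-in for the imported CSV; values are random, not real songs.
set.seed(42)
n <- 200
songs <- data.frame(
  track_name   = paste0("track_", seq_len(n)),  # hypothetical extra column
  popularity   = sample(0:100, n, replace = TRUE),
  energy       = runif(n),
  danceability = runif(n),
  acousticness = runif(n)
)
# Blank out a few values so the cleaning step has something to remove.
songs$energy[c(3, 17, 58)] <- NA
songs$popularity[c(5, 99)] <- NA

# Clean: keep only the modeling columns, then drop rows with any missing value.
model_cols <- c("popularity", "energy", "danceability", "acousticness")
clean <- na.omit(songs[, model_cols])

# Split: sample 80% of the row indices for training, keep the rest for testing.
train_idx <- sample(seq_len(nrow(clean)), size = floor(0.8 * nrow(clean)))
train <- clean[train_idx, ]
test  <- clean[-train_idx, ]

cat("rows after cleaning:", nrow(clean), "\n")
cat("train rows:", nrow(train), " test rows:", nrow(test), "\n")
```

Setting a seed before `sample()` makes the split reproducible, which helps when comparing models on the same train and test rows.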
## Key Takeaways

Data Preparation is Crucial:
- Missing or incomplete data can skew results. Adding placeholder values keeps the dataset complete for analysis, but random placeholders carry no real signal, so results that depend on them should be read with caution.

Feature Selection:
- Using energy, danceability, and acousticness as predictors, with popularity as the target, focuses the analysis on the attributes expected to be most relevant to song popularity.

Data Cleaning:
- Removing rows with missing values keeps incomplete records out of the model's input.

Training and Testing Split:
- Dividing the data means the model is trained on one subset and evaluated on another, which helps detect overfitting and gives a more reliable estimate of how well the predictions generalize.
## How This Improves Predictions

- Consistency: Cleaning and standardizing the data means the machine learning model learns from consistent inputs.
- Focus: Limiting the dataset to the selected features lets the model concentrate on the relationships between those attributes and popularity.
- Validation: Testing on separate data shows how well the model predicts songs it has not seen, which increases confidence in its predictions.
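To make the validation point concrete, the sketch below scores a linear model on held-out data and compares it with a naive baseline that always predicts the training mean. The data is synthetic, and root mean squared error (RMSE) is an assumed metric here; the write-up does not say which metric was used. The helper `rmse` is defined only for this example.

```r
# Synthetic stand-in data with a made-up relationship; NOT the Spotify dataset.
set.seed(7)
n <- 400
songs <- data.frame(
  energy       = runif(n),
  danceability = runif(n),
  acousticness = runif(n)
)
songs$popularity <- 40 + 25 * songs$danceability + 10 * songs$energy -
  15 * songs$acousticness + rnorm(n, sd = 8)

# 80/20 split into training and test rows.
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- songs[train_idx, ]
test  <- songs[-train_idx, ]

fit <- lm(popularity ~ energy + danceability + acousticness, data = train)

# Root mean squared error: typical size of a prediction error, in popularity points.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

train_rmse    <- rmse(train$popularity, predict(fit, newdata = train))
test_rmse     <- rmse(test$popularity, predict(fit, newdata = test))
# Baseline: ignore the features and always predict the training mean.
baseline_rmse <- rmse(test$popularity, mean(train$popularity))

cat(sprintf("train RMSE:    %.2f\n", train_rmse))
cat(sprintf("test RMSE:     %.2f\n", test_rmse))
cat(sprintf("baseline RMSE: %.2f\n", baseline_rmse))
```

A test error far above the training error points to overfitting, while a test error no better than the baseline suggests the features add little predictive value.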