2023-04-01

Introduction

  • This presentation gives an overview of the Developing Data Products Course Project.

  • The web application is available here: Project Website.

  • The source code for the project is available here: Source Code.

Overview

  • This project uses the Customers dataset from Kaggle to predict a Spending Score from demographic data.

  • The application trains a random forest model (randomForest) on the dataset and predicts a score from the values the user enters.

  • Once the user has entered values, it also draws a plot of Age vs. Spending Score by Gender.
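
The plot described in the last bullet could be sketched roughly as follows. This is a minimal illustration in base R graphics, which is an assumption: the app's actual plotting code is not shown in this presentation. The helper name plot_age_vs_score is hypothetical, and the four sample rows are the first four records visible in the dataset's str() output.

```r
# First four records as shown in the str() output of the dataset.
sample_rows <- data.frame(
  Gender = c("Male", "Male", "Female", "Female"),
  Age = c(19, 21, 20, 23),
  Spending.Score..1.100. = c(39, 81, 6, 77)
)

# Hypothetical helper: scatter plot of Age vs. Spending Score, one colour
# per gender. Returns the number of points per gender, invisibly.
plot_age_vs_score <- function(df) {
  gender <- factor(df$Gender)
  group_cols <- rainbow(nlevels(gender))
  plot(df$Age, df$Spending.Score..1.100.,
       col = group_cols[as.integer(gender)], pch = 19,
       xlab = "Age", ylab = "Spending Score (1-100)",
       main = "Age vs. Spending Score by Gender")
  legend("topright", legend = levels(gender), col = group_cols, pch = 19)
  invisible(table(gender))
}

counts <- plot_age_vs_score(sample_rows)
```

In a Shiny app, a function like this would typically be called inside renderPlot() so the plot is redrawn when the inputs change.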

Customer Dataset

  • This dataset is a detailed analysis of an imaginary shop’s ideal customers.
  • It helps a business better understand its customers.
  • The shop owner collects information about customers through membership cards.
  • The dataset consists of 2000 records and 8 columns:
'data.frame':   2000 obs. of  8 variables:
 $ CustomerID            : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Gender                : chr  "Male" "Male" "Female" "Female" ...
 $ Age                   : int  19 21 20 23 31 22 35 23 64 30 ...
 $ Annual.Income....     : int  15000 35000 86000 59000 38000 58000 31000 84000 97000 98000 ...
 $ Spending.Score..1.100.: int  39 81 6 77 40 76 6 94 3 72 ...
 $ Profession            : chr  "Healthcare" "Engineer" "Engineer" "Lawyer" ...
 $ Work.Experience       : int  1 3 1 0 2 0 1 1 0 1 ...
 $ Family.Size           : int  4 3 1 2 6 2 3 3 3 4 ...
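
The column names in the str() output above look like what read.csv produces when a header contains spaces and symbols. The sketch below shows how such data might be loaded and prepared for modelling. The raw header names and the helper prepare_customers are assumptions, not taken from the project source; the four data rows are the first four records shown above.

```r
# Header names are an assumption about the raw CSV file; the four rows are
# the first four records visible in the str() output.
csv_text <- 'CustomerID,Gender,Age,Annual Income ($),Spending Score (1-100),Profession,Work Experience,Family Size
1,Male,19,15000,39,Healthcare,1,4
2,Male,21,35000,81,Engineer,3,3
3,Female,20,86000,6,Engineer,1,1
4,Female,23,59000,77,Lawyer,0,2'

# read.csv replaces characters that are not valid in R names with dots.
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Hypothetical preparation step: drop the identifier and store the
# categorical columns as factors.
prepare_customers <- function(df) {
  df$CustomerID <- NULL
  df$Gender <- factor(df$Gender)
  df$Profession <- factor(df$Profession)
  df
}

cust_data <- prepare_customers(raw)
str(cust_data)
```

Dropping CustomerID matters when a model formula uses `~ .`, because every remaining column is then treated as a predictor.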

Prediction Model and the R Code

  • This application was built as a Shiny app with ui.R and server.R files. It is hosted on shinyapps.io.

  • The prediction model is built with randomForest, as shown below:

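library(randomForest)

# cust_data is assumed to be loaded and prepared before this function is called.
predict_score <- function(input1, input2, input3, input4, input5, input6) {
  # Name the predictors explicitly so that CustomerID is never used as a
  # feature and the model matches the columns of new_data below.
  model <- randomForest(Spending.Score..1.100. ~ Gender + Age +
                          Annual.Income.... + Profession +
                          Work.Experience + Family.Size,
                        data = cust_data)

  # Prepare input data for scoring
  new_data <- data.frame(Gender = input1, Age = input2,
                         Annual.Income.... = input3, Profession = input4,
                         Work.Experience = input5, Family.Size = input6)

  # Make prediction and return result
  prediction <- predict(model, new_data)
  return(prediction)
}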
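
The predict_score function above retrains the forest on every call. One possible variation, sketched below, trains the model once and reuses it, and builds the new row with the same factor levels as the training data. The data frame toy is synthetic stand-in data, not the Kaggle dataset. The function name predict_score_once and its argument names are hypothetical, and ntree = 100 is an arbitrary choice to keep the example fast.

```r
library(randomForest)

# Synthetic stand-in for cust_data (NOT the Kaggle data), using the same
# column names, with Gender and Profession stored as factors.
set.seed(42)
n <- 200
toy <- data.frame(
  Gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  Age = sample(18:90, n, replace = TRUE),
  Annual.Income.... = sample(seq(15000, 150000, by = 1000), n, replace = TRUE),
  Profession = factor(sample(c("Engineer", "Healthcare", "Lawyer"), n,
                             replace = TRUE)),
  Work.Experience = sample(0:15, n, replace = TRUE),
  Family.Size = sample(1:7, n, replace = TRUE),
  Spending.Score..1.100. = sample(1:100, n, replace = TRUE)
)

# Train once, outside the prediction function.
model <- randomForest(Spending.Score..1.100. ~ ., data = toy, ntree = 100)

# Hypothetical variant of predict_score that reuses a fitted model.
predict_score_once <- function(model, train, gender, age, income,
                               profession, experience, family_size) {
  new_data <- data.frame(
    # Reuse the training levels so the factor columns match the model.
    Gender = factor(gender, levels = levels(train$Gender)),
    Age = age,
    Annual.Income.... = income,
    Profession = factor(profession, levels = levels(train$Profession)),
    Work.Experience = experience,
    Family.Size = family_size
  )
  unname(predict(model, new_data))
}

score <- predict_score_once(model, toy, "Female", 30, 60000, "Engineer", 5, 3)
print(score)
```

In a Shiny app, the training step could run once when server.R is loaded, so that each user input only triggers the predict call.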