This project uses a churn dataset to build and deploy a machine learning model that predicts churn. The application offers two versions of the model: the user can update the earlier model so that the updated model incorporates the information in a new dataset. The project is implemented in R Shiny and published online.
From the given dataset, we built a CSV generator that randomly splits the data 50-50 and writes CSV files for use in the application. The first 50% of the data is used to train “Version 1” of the prediction model, while the second 50% is used to train the updated “Version 2” model.
library(RWeka)
library(caTools)

# Assumes churndata is already loaded as a data frame with a Churn column

# Version 1 data: 50-50 split that preserves the ratio of Churn labels
sample <- sample.split(churndata$Churn, SplitRatio = 0.5)
ChurnTrain <- subset(churndata, sample == TRUE)
# The test file drops the Churn column (the label to be predicted)
ChurnTest <- subset(churndata, sample == FALSE, select = -Churn)
write.csv(ChurnTest, "ChurnTest.csv", row.names = FALSE)
write.csv(ChurnTrain, "ChurnTrain.csv", row.names = FALSE)

# Version 2 data: a second, independent random 50-50 split of the same data
# (its rows can overlap with the Version 1 training rows)
sample_v2 <- sample.split(churndata$Churn, SplitRatio = 0.5)
ChurnTrain_v2 <- subset(churndata, sample_v2 == TRUE)
ChurnTest_v2 <- subset(churndata, sample_v2 == FALSE, select = -Churn)
write.csv(ChurnTest_v2, "ChurnTest_v2.csv", row.names = FALSE)
write.csv(ChurnTrain_v2, "ChurnTrain_v2.csv", row.names = FALSE)
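The text describes a "first 50%" and a "second 50%", which suggests two halves that do not share rows. A minimal sketch of such a disjoint split in base R is shown below. The toy data frame, its values, and the seed are assumptions made for illustration, and unlike `sample.split` this simple shuffle does not preserve the ratio of Churn labels.

```r
# Toy stand-in for the churn data; column values are illustrative only.
set.seed(123)  # assumed seed, only to make the sketch reproducible
toy <- data.frame(
  CustServCalls = sample(0:9, 20, replace = TRUE),
  Churn = factor(rep(c("no", "yes"), times = c(14, 6)))
)

# Shuffle the row indices once, then cut them into two disjoint halves.
idx <- sample(nrow(toy))
half <- floor(nrow(toy) / 2)
v1_rows <- idx[seq_len(half)]
v2_rows <- idx[(half + 1):nrow(toy)]

toy_v1 <- toy[v1_rows, ]  # would feed the Version 1 model
toy_v2 <- toy[v2_rows, ]  # would feed the Version 2 update

# No row index appears in both halves.
length(intersect(v1_rows, v2_rows))  # 0
```

With this layout, the Version 2 data contains only rows the Version 1 model has not seen during training.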
We then created two model functions, load_and_train_model and train_and_create_model, for the Version 1 and Version 2 models respectively. Both functions build the model with Random Forest, and in both the number of features considered at each split, m, is set to 2*sqrt(p), where p is the number of features, as this setting gave the most optimized model. We also created an evaluate_model function that uses k = 5 folds for cross-validation and a model_summary function that displays the summary of the model.
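As a quick check of the 2*sqrt(p) setting, the sketch below computes m from a data frame instead of hard-coding it. The helper `features_per_split` and the toy data frame are hypothetical and only serve the illustration. The functions in this report pass 18 (the total column count, including Churn) where the number of predictors is 17; both inputs round down to the same value.

```r
# Hypothetical helper: features tried at each split, m = floor(2 * sqrt(p)),
# where p counts the predictor columns only.
features_per_split <- function(data, target = "Churn") {
  p <- ncol(data) - as.integer(target %in% names(data))
  floor(2 * sqrt(p))
}

# Toy frame with 17 predictor columns plus the Churn label (placeholder values).
toy <- as.data.frame(matrix(0, nrow = 3, ncol = 17))
toy$Churn <- factor(c("no", "yes", "no"))

features_per_split(toy)  # 8, since 2 * sqrt(17) is about 8.25
floor(2 * sqrt(18))      # also 8, since 2 * sqrt(18) is about 8.49
```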
library(RWeka)
library(caTools)

# Version 1: read the full dataset, take a 50% training split, fit a Random Forest
load_and_train_model <- function() {
  churndata <- read.csv("churndata.csv", stringsAsFactors = TRUE)
  new_column_names <- c("AccountLength", "IntlPlan", "VMailPlan",
                        "VMailMessage", "DayMins", "DayCalls", "DayCharge",
                        "EveMins", "EveCalls", "EveCharge", "NightMins",
                        "NightCalls", "NightCharge", "IntlMins", "IntlCalls",
                        "IntlCharge", "CustServCalls", "Churn")
  colnames(churndata) <- new_column_names

  # Fixed seed so the training split is reproducible
  set.seed(123)
  sample <- sample.split(churndata$Churn, SplitRatio = 0.5)
  ChurnTrain <- subset(churndata, sample == TRUE)

  # K is the number of features considered at each split: floor(2*sqrt(18)) = 8
  RF <- make_Weka_classifier("weka/classifiers/trees/RandomForest")
  rfmodel <- RF(Churn ~ ., data = ChurnTrain,
                control = Weka_control(K = floor(2 * sqrt(18))))
  return(rfmodel)
}

# Version 2: fit a Random Forest on training data supplied by the caller
train_and_create_model <- function(training_data) {
  new_column_names <- c("AccountLength", "IntlPlan", "VMailPlan",
                        "VMailMessage", "DayMins", "DayCalls", "DayCharge",
                        "EveMins", "EveCalls", "EveCharge", "NightMins",
                        "NightCalls", "NightCharge", "IntlMins", "IntlCalls",
                        "IntlCharge", "CustServCalls", "Churn")
  colnames(training_data) <- new_column_names

  RF <- make_Weka_classifier("weka/classifiers/trees/RandomForest")
  rfmodel <- RF(Churn ~ ., data = training_data,
                control = Weka_control(K = floor(2 * sqrt(18))))
  return(rfmodel)
}

# Evaluate a model with numFolds = 5, per-class statistics, and a fixed seed
evaluate_model <- function(model, test_data) {
  evaluation_results <- evaluate_Weka_classifier(
    model, newdata = test_data,
    numFolds = 5, class = TRUE, seed = 1
  )
  return(evaluation_results)
}

# Return the summary of a fitted model
model_summary <- function(model) {
  return(summary(model))
}
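To illustrate what k = 5 folds means for the evaluation, the sketch below assigns rows to folds in base R. It is a conceptual illustration only, not a description of how RWeka or Weka partitions the data internally; the helper `make_folds`, the row count of 23, and the seed are assumptions chosen for the example.

```r
# Hypothetical helper: assign each of n rows to one of k folds at random.
make_folds <- function(n, k = 5, seed = 1) {
  set.seed(seed)
  sample(rep(seq_len(k), length.out = n))
}

folds <- make_folds(n = 23, k = 5)
table(folds)  # fold sizes differ by at most one row

# In round i, fold i is held out for testing and the other folds are used for training.
for (i in 1:5) {
  test_rows  <- which(folds == i)
  train_rows <- which(folds != i)
  stopifnot(length(intersect(test_rows, train_rows)) == 0)
}
```

Each row is held out exactly once across the five rounds, so every observation contributes to the evaluation without being scored by a model that was trained on it.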
The application lets the user choose between two versions of the model. It accepts new data for prediction and can use either version to predict the new rows. Each model has its own specifications:
The application has the following general functions:
This section demonstrates how the application works and the information it provides to the user. The demo is provided for each version of the model.
The prediction table shows the head of the dataset with the corresponding churn values.
The application also provides a model summary and an evaluation of the model to the user.
The prediction table shows the head of the dataset with the corresponding churn values.
The application also provides a model summary and an evaluation of the model to the user.
To test our Churn Prediction Application, you may also visit the link below:
https://qyu1db-keana0francheska-bautista.shinyapps.io/developingDataProducts_Sy_Bautista/