Overview

Given the Churn Dataset, we were tasked with deploying a new ML model that predicts churn.

Instructions

Version 1: The churn dataset is split, with 50% used for training.
The model is integrated into an app using R Shiny or Python Dash. In our case, we deployed it in R Shiny with the help of the RWeka library.
The deployed app should accept new training data, allowing the model to be updated based on the latest churn data provided by the user.
Users can upload CSV files to obtain predictions for their new churn data. The model selection process involves automatic parameter tuning to optimize performance.
The chosen model’s evaluation is based on relevant Key Performance Indicators (KPIs), ensuring its accuracy and reliability.
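
As a rough illustration of the KPIs the app reports (accuracy, precision, recall, and F1-score, per the classification report described under the app's features), the sketch below computes them from actual and predicted labels in base R. The function name `classification_report` and the choice of "True" as the positive class are assumptions made for this example, not details taken from the app's code.

```r
# Compute accuracy, precision, recall, and F1 for a binary churn label.
# `positive` names the class treated as "churned" (assumed to be "True").
classification_report <- function(actual, predicted, positive = "True") {
  tp <- sum(predicted == positive & actual == positive)
  fp <- sum(predicted == positive & actual != positive)
  fn <- sum(predicted != positive & actual == positive)
  tn <- sum(predicted != positive & actual != positive)

  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)

  c(
    accuracy = (tp + tn) / length(actual),
    precision = precision,
    recall = recall,
    f1 = 2 * precision * recall / (precision + recall)
  )
}

# Toy labels, for illustration only.
actual <- c("True", "False", "True", "False", "True")
predicted <- c("True", "False", "False", "False", "True")
report <- classification_report(actual, predicted)
print(report)
```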

Summary

In the developed Shiny app for churn prediction, we created an interface that integrates machine learning models to predict customer churn. The app is structured with a UI and a reactive server logic that together facilitate the prediction process. We utilized the Churn Dataset and deployed a machine learning model using the Random Forest algorithm.

R Shiny Application

User Interface (ui.R):
- Organized the UI using fluidPage, creating distinct panels for model training, churn prediction, and result display.
- Incorporated interactive input elements like file upload widgets, radio buttons, and action buttons for user interaction.
- Enabled users to select different “Versions” of the model to use for prediction.
- Displayed prediction outcomes and evaluation metrics.
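
A layout along these lines might look like the following minimal sketch. All input and output ids (`train_file`, `train_mode`, `train`, `predict_file`, `version`, `report`, `predictions`) and the label texts are illustrative assumptions; the actual ui.R may be organized differently.

```r
library(shiny)

# Sketch of a UI with training, prediction, and result panels.
# Ids and labels are hypothetical, chosen only for this example.
ui <- fluidPage(
  titlePanel("Churn Prediction"),
  sidebarLayout(
    sidebarPanel(
      h4("Model Training"),
      fileInput("train_file", "Upload training data (CSV)", accept = ".csv"),
      radioButtons("train_mode", "Train on", choices = c("All Data", "New Data")),
      actionButton("train", "Train"),
      h4("Churn Prediction"),
      fileInput("predict_file", "Upload data for prediction (CSV)", accept = ".csv"),
      selectInput("version", "Model version", choices = "Version 1")
    ),
    mainPanel(
      # Evaluation metrics as text, predictions as a table.
      verbatimTextOutput("report"),
      tableOutput("predictions")
    )
  )
)
```
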
Server Logic (server.R):
- Loaded and preprocessed the Churn Dataset, preparing it for model training and prediction.
- Split the dataset into training and testing sets to build and evaluate the initial “Version 1” machine learning model.
- Created reactive expressions to handle new training data and new data for prediction.
- Updated the model when new training data is uploaded.
- Dynamically updated the version dropdown with the available models.
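
The 50% train/test split used for the initial “Version 1” model could be sketched in base R as below. The helper name `split_half`, the fixed seed, and the toy data frame are assumptions for illustration; the real dataset and its column names are not shown in this document.

```r
# Split a data frame into two halves by random row sampling.
split_half <- function(data, seed = 42) {
  set.seed(seed)  # fixed seed so the split is reproducible
  train_idx <- sample(nrow(data), size = floor(0.5 * nrow(data)))
  list(
    train = data[train_idx, , drop = FALSE],
    test = data[-train_idx, , drop = FALSE]
  )
}

# Toy stand-in for the churn data.
churn <- data.frame(id = 1:10, Churn = factor(rep(c("True", "False"), 5)))
parts <- split_half(churn)
```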

Machine Learning Model

Model Selection: Why Random Forest Was Deployed

We chose Random Forest for churn prediction because of its ensemble nature, which mitigates overfitting and captures non-linear relationships, especially in a complex dataset like the churn data. In our tests it provided the highest accuracy and the most precise prediction of true negatives and true positives among the models we compared, which included SVM, decision trees, and AdaBoost. Its ability to handle imbalanced data, identify feature importance, and perform well with minimal tuning makes it suitable for accurate and robust predictions. The algorithm’s versatility and consistent performance across domains make it a reliable solution for businesses aiming to predict customer churn effectively.

Model Training:
- Utilized the Random Forest algorithm from the RWeka library to create a predictive model.
- Trained the initial “Version 1” model using the training data.
- Implemented K-fold cross-validation with the hyperparameter K = 2*sqrt(n), where n = 18 is the total number of predictors, and evaluated model performance using the evaluate_Weka_classifier function.
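
The training and evaluation steps above can be sketched as follows, reading K as the number of cross-validation folds, as the text describes. Rounding 2*sqrt(18) to a whole number, the toy data frame, and its column names (`tenure`, `charges`) are assumptions for this example. The Weka calls run only when RWeka and its Java backend are installed.

```r
# Number of predictors reported in the text.
n_predictors <- 18

# 2 * sqrt(18) is about 8.49; rounding to a whole number of folds is an assumption.
k <- round(2 * sqrt(n_predictors))

# Toy stand-in for the training split (hypothetical columns).
set.seed(1)
train <- data.frame(
  tenure = rnorm(100),
  charges = rnorm(100),
  Churn = factor(sample(c("True", "False"), 100, replace = TRUE))
)

# Only run the Weka part when RWeka is available.
if (requireNamespace("RWeka", quietly = TRUE)) {
  # Build an R interface to Weka's RandomForest classifier.
  RF <- RWeka::make_Weka_classifier("weka/classifiers/trees/RandomForest")
  model <- RF(Churn ~ ., data = train)

  # k-fold cross-validation with per-class statistics.
  print(RWeka::evaluate_Weka_classifier(model, numFolds = k, class = TRUE, seed = 1))
}
```
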
Model Upload and Update:
- Enabled users to upload new training data to update the model.
- Used reactive expressions to create a new model based on the updated training data.
- Maintained a list of models, appending the new model to the list.
- Dynamically updated the version dropdown to reflect the available models.
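
One way the update flow above might be wired together is sketched below. The helper `add_version`, the factory `make_server`, the `train_model` argument, and the input ids (`train`, `train_file`, `version`) are hypothetical names introduced for illustration; the app's actual server.R may differ.

```r
library(shiny)

# Append a fitted model to the version list and name it "Version <n>".
add_version <- function(models, model) {
  models[[paste("Version", length(models) + 1)]] <- model
  models
}

# `train_model` is a hypothetical function(data) returning a fitted classifier;
# `initial_model` is the already trained "Version 1" model.
make_server <- function(train_model, initial_model) {
  function(input, output, session) {
    state <- reactiveValues(models = list("Version 1" = initial_model))

    # Retrain when the (assumed) "train" button is clicked.
    observeEvent(input$train, {
      req(input$train_file)
      new_data <- read.csv(input$train_file$datapath, stringsAsFactors = TRUE)
      state$models <- add_version(state$models, train_model(new_data))

      # Refresh the dropdown so the newest version is listed and selected.
      updateSelectInput(
        session, "version",
        choices = names(state$models),
        selected = tail(names(state$models), 1)
      )
    })
  }
}
```

Keeping `add_version` as a plain function makes the version-naming logic easy to check outside a running Shiny session.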

In summary, this documentation gives an overview of the Shiny app’s structure and functionality and of how the machine learning model is created and updated, so that users can understand and effectively use the app for customer churn prediction.

Features of RShiny Application

Churn Prediction: Users can upload the test data (in CSV file format) using the “Choose File” button.
Version Selection: Users can upload new data, train a new model on it, and append that model as a new version.
Classification Report: The app displays a classification report with evaluation metrics such as accuracy, precision, recall, and F1-score. This report summarizes the performance of the selected model on the test data.
Predictions Table: The table shows the index of each entry and the corresponding predicted churn outcome (e.g., “True” or “False”).
Updating the Model:
- To update the model, return to the “Model Training” section.
- Upload a new training dataset by clicking the “Choose File” button.
- Select the desired option (“All Data” or “New Data”) using the radio buttons.
- Click the “Train” button to create an updated model.
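
The choice between “All Data” and “New Data” could be handled as in the sketch below. It assumes that “All Data” means the existing training data combined with the upload, and that “New Data” means the upload alone; the document does not spell this out, and the helper name `build_training_set` is hypothetical.

```r
# Assemble the training set according to the selected radio-button option.
# Assumption: "All Data" = existing + new rows, "New Data" = new rows only.
build_training_set <- function(existing, new, mode = c("All Data", "New Data")) {
  mode <- match.arg(mode)
  if (mode == "All Data") rbind(existing, new) else new
}

# Toy data frames with matching columns.
existing <- data.frame(x = 1:3, Churn = c("True", "False", "True"))
new <- data.frame(x = 4:5, Churn = c("False", "True"))

all_rows <- build_training_set(existing, new, "All Data")
new_rows <- build_training_set(existing, new, "New Data")
```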

R Shiny Churn Model Application

Churn Data Model Prediction

Jayce Jocson and Brian Aguilar

August 15, 2023