Overview
Given the Churn Dataset, we were tasked with deploying a new ML model
that predicts customer churn.
Instructions
- Version 1: The churn dataset is split, with 50% used for training.
- The model is integrated into an app built with R Shiny or Python Dash.
In our case, we deployed it in R Shiny with the help of the RWeka
library.
- The deployed app should accept new training data, allowing the model
to be updated based on the latest churn data provided by the user.
- Users can upload CSV files to get predictions for their new churn
data.
- The model selection process involves automatic parameter tuning to
optimize performance.
- The chosen model’s evaluation is based on relevant Key Performance
Indicators (KPIs), ensuring its accuracy and reliability.
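
As a rough illustration of the Version 1 split described in the instructions above, the sketch below sets aside half of the rows for training using base R. The data frame churn, its columns, and the seed are made-up stand-ins rather than the actual Churn Dataset, and the real app may split the data differently (for example, stratified by class).

```r
# Toy stand-in for the Churn Dataset (hypothetical columns and values).
set.seed(42)
churn <- data.frame(
  day_minutes   = runif(20, min = 0, max = 300),
  service_calls = sample(0:5, 20, replace = TRUE),
  churn         = factor(sample(c("True", "False"), 20, replace = TRUE))
)

# Randomly pick 50% of the row indices for training; the rest is the test set.
train_idx <- sample(seq_len(nrow(churn)), size = floor(0.5 * nrow(churn)))
train_set <- churn[train_idx, ]
test_set  <- churn[-train_idx, ]

cat("training rows:", nrow(train_set), " test rows:", nrow(test_set), "\n")
```

Sampling row indices instead of taking the first half avoids any ordering that may be present in the source file.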
Summary
In the developed Shiny app for churn prediction, we created an
interface that integrates machine learning models to predict customer
churn. The app is structured with a UI and reactive server logic that
together facilitate the prediction process. We utilized the Churn
Dataset and deployed a machine learning model using the Random Forest
algorithm.
R Shiny Application
- User Interface (ui.R):
  - Organized the UI using fluidPage, creating distinct panels for model
    training, churn prediction, and result display.
  - Incorporated interactive input elements like file upload widgets,
    radio buttons, and action buttons for user interaction.
  - Enabled users to select different “Versions” of the model to use for
    prediction.
  - Displayed prediction outcomes and evaluation metrics.
- Server Logic (server.R):
  - Loaded and preprocessed the Churn Dataset, preparing it for model
    training and prediction.
  - Split the dataset into training and testing sets to build and
    evaluate the initial “Version 1” machine learning model.
  - Created reactive expressions to handle new training data and new
    data for prediction.
  - Updated the model when new training data is uploaded.
  - Dynamically updated the version dropdown with the available models.
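
The UI and server responsibilities listed above can be sketched as a minimal Shiny skeleton. This is only an illustration under stated assumptions, not the app's actual code: the input IDs, the helper select_training_rows, and the reading of “All Data” as original plus uploaded rows are all hypothetical, and the sketch records row counts in place of real fitted models.

```r
library(shiny)

# Hypothetical helper. Assumption: "All Data" retrains on the original rows
# plus the upload, while "New Data" retrains on the upload only.
select_training_rows <- function(base_data, new_data,
                                 mode = c("All Data", "New Data")) {
  mode <- match.arg(mode)
  if (mode == "All Data") rbind(base_data, new_data) else new_data
}

ui <- fluidPage(
  titlePanel("Churn Prediction"),
  sidebarLayout(
    sidebarPanel(
      h4("Model Training"),
      fileInput("train_file", "New training data (CSV)", accept = ".csv"),
      radioButtons("train_mode", "Train on",
                   choices = c("All Data", "New Data")),
      actionButton("train_btn", "Train"),
      h4("Churn Prediction"),
      fileInput("predict_file", "Data to predict (CSV)", accept = ".csv"),
      selectInput("version", "Model version", choices = "Version 1")
    ),
    mainPanel(
      # Shows the stored entry for the selected version in this sketch.
      verbatimTextOutput("report")
    )
  )
)

server <- function(input, output, session) {
  # Original training rows; empty here, the real app would load the churn data.
  base_data <- data.frame()

  # One entry per trained model. The real app would store fitted models;
  # this sketch only records how many rows each version was trained on.
  models <- reactiveVal(list("Version 1" = list(n_rows = nrow(base_data))))

  observeEvent(input$train_btn, {
    req(input$train_file)
    new_data <- read.csv(input$train_file$datapath)
    rows <- select_training_rows(base_data, new_data, input$train_mode)

    # Append the new version to the list of models.
    current <- models()
    version_name <- paste("Version", length(current) + 1)
    current[[version_name]] <- list(n_rows = nrow(rows))
    models(current)

    # Refresh the dropdown so the new version can be selected.
    updateSelectInput(session, "version",
                      choices = names(current), selected = version_name)
  })

  output$report <- renderPrint({
    models()[[input$version]]
  })
}

# shinyApp(ui, server)  # uncomment to launch the app interactively
```

Keeping the models in a reactiveVal means the dropdown choices and the stored list stay in step: every successful training run appends one entry and refreshes the selector in the same observer.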
Machine Learning Model
Model Selection: Why Random Forest Was Deployed as the Model
We chose Random Forest as the model for churn prediction because of its
ensemble nature, which mitigates overfitting and captures non-linear
relationships, especially in a complex dataset like the churn data. In
our tests it gave the highest accuracy and the most precise prediction
of True Negatives and True Positives among the models we compared, which
included SVM, decision trees, and AdaBoost. Its ability to handle
imbalanced data, identify feature importance, and work with minimal
tuning makes it suitable for accurate and robust predictions. The
algorithm’s versatility and consistent performance across domains make
it a reliable solution for businesses aiming to predict customer churn
effectively.
- Model Training:
  - Used the Random Forest algorithm from the RWeka library to create a
    predictive model.
  - Trained the initial “Version 1” model using the training data.
  - Implemented K-fold cross-validation with the
    evaluate_Weka_classifier function to evaluate model performance,
    using K = 2*sqrt(n) folds, where n = 18 is the total number of
    predictors.
- Model Upload and Update:
  - Enabled users to upload new training data to update the model.
  - Used reactive functions to create a new model based on the updated
    training data.
  - Maintained a list of models, appending each new model to the list.
  - Dynamically updated the version dropdown to reflect the available
    models.
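
To make the cross-validation step above concrete, here is a small sketch of the fold-count rule and of one way to call evaluate_Weka_classifier on a Random Forest built through RWeka. Rounding K to the nearest whole number, the toy data frame, and the seed are assumptions of this sketch; the source does not say how K was rounded. The Weka part is skipped when RWeka (or the Java runtime it depends on) is not available.

```r
# Fold count from the rule stated above: K = 2 * sqrt(n) with n = 18 predictors.
n_predictors <- 18
k_raw   <- 2 * sqrt(n_predictors)  # about 8.49
k_folds <- round(k_raw)            # assumption: round to the nearest whole fold

cat("K =", k_folds, "\n")

# The Weka part runs only if RWeka can be loaded.
if (requireNamespace("RWeka", quietly = TRUE)) {
  # Build an R interface to Weka's Random Forest classifier.
  RF <- RWeka::make_Weka_classifier("weka/classifiers/trees/RandomForest")

  # Toy stand-in data (hypothetical columns); the real app uses the churn data.
  set.seed(1)
  toy <- data.frame(
    day_minutes   = runif(60, 0, 300),
    service_calls = sample(0:5, 60, replace = TRUE)
  )
  toy$churn <- factor(ifelse(toy$service_calls > 3, "True", "False"))

  model <- RF(churn ~ ., data = toy)

  # K-fold cross-validation; class = TRUE adds per-class statistics.
  cv <- RWeka::evaluate_Weka_classifier(model, numFolds = k_folds,
                                        class = TRUE, seed = 1)
  print(cv)
}
```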
In summary, the documentation provides a comprehensive overview of
the Shiny app’s structure, functionality, and the machine learning
model’s creation and update process, enabling users to understand and
effectively use the app for customer churn prediction.
Features of the R Shiny Application
- Churn Prediction: Users can upload the test data (in CSV file format)
  using the “Choose File” button.
- Version Selection: Users can upload new data, train a new model on it,
  and append that model as a new version.
- Classification Report: The app displays a classification report with
  evaluation metrics such as accuracy, precision, recall, and F1-score.
  This report summarizes the performance of the selected model on the
  test data.
- Predictions Table: The table shows the index of each entry and the
  corresponding predicted churn outcome (e.g., “True” or “False”).
- Updating the Model:
  - To update the model, return to the “Model Training” section.
  - Upload a new training dataset by clicking the “Choose File” button.
  - Select the desired option (“All Data” or “New Data”) using the radio
    buttons.
  - Click the “Train” button to create an updated model.
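
The metrics named in the classification report feature above (accuracy, precision, recall, and F1-score) can be computed from actual and predicted labels with a few lines of base R. The function name classification_report, the choice of “True” as the positive class, and the example labels are assumptions made for illustration; the app itself may build its report differently.

```r
# Compute accuracy, precision, recall, and F1 for a binary churn label.
# "True" is treated as the positive class (an assumption of this sketch).
# Precision or recall is NaN when its denominator is zero.
classification_report <- function(actual, predicted, positive = "True") {
  tp <- sum(actual == positive & predicted == positive)  # true positives
  tn <- sum(actual != positive & predicted != positive)  # true negatives
  fp <- sum(actual != positive & predicted == positive)  # false positives
  fn <- sum(actual == positive & predicted != positive)  # false negatives

  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  c(
    accuracy  = (tp + tn) / length(actual),
    precision = precision,
    recall    = recall,
    f1        = 2 * precision * recall / (precision + recall)
  )
}

# Made-up labels for illustration only.
actual    <- c("True", "True", "True", "False", "False", "False", "False", "True")
predicted <- c("True", "False", "True", "False", "False", "True", "False", "True")

report <- classification_report(actual, predicted)
print(round(report, 2))
```

With these example labels there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.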
R Shiny Churn Model Application
