LAB 3 — Classification, Model Comparison & Ensembling Weight: (as per syllabus) Tier: Core Modeling Dataset: week2_churn_data.csv Purpose This lab develops your ability to build, evaluate, compare, and combine classification models for a realistic business problem. You will move from individual models to model selection and ensembling, emphasizing performance tradeoffs and business interpretation. ________________________________________ Learning Goals By completing this lab, you will be able to: • Build and evaluate multiple classification models • Compare models using appropriate performance metrics • Select predictors intentionally and justify modeling choices • Apply simple ensemble techniques to improve performance • Translate model results into business recommendations ________________________________________ Tasks Task 1 — Load Data & Define Outcome • Load week2_churn_data.csv

#setwd("C:/Users/kelle/OneDrive/Documents/MSBA/TBANLT560W26/R working directory")
library(readr)
churnL3 <- read_csv("week2_churn_data.csv")
#View(week2_churn_data)

• Identify the business outcome variable

dim(df)
## NULL
str(df)
## function (x, df1, df2, ncp, log = FALSE)
head(df, 3)
##                                            
## 1 function (x, df1, df2, ncp, log = FALSE) 
## 2 {                                        
## 3     if (missing(ncp))

• Confirm outcome class balance (churn rate)

churnL3$churn <- factor(churnL3$churn, levels = c("No","Yes"))
table(churnL3$churn)
## 
##  No Yes 
## 290 110

Expected output: • Code showing dataset structure • 2–3 sentences describing the outcome and churn rate This mean the data set is imbalance and customers that churn are observed more rarely.If we want to capture more instances where customers are likely to churn we may want to oversample records of customers that churn for the model to increase sensitivity. We may also want address the issue by setting a lower threshold for our main class.

Checkpoint (write 2–4 sentences): Your answer: - Business question: [WRITE HERE]
- What is the business question?
Can business decision makers use a ML model to determine if a customer will churn based on attributes like monthy_charges or tenure_months.


Task 2 — Predictor Selection

You may not automatically use all variables.

Choose 4–8 predictors that make business sense for churn.
You may include both numeric and categorical predictors. • Review available predictors

Write your predictor list here (exact column names):

• Select a subset of predictors for modeling

# TODO: Replace the vector below with your chosen predictor column names
churnL3predictor_cols <- c("tenure_months", "monthly_charges","online_security", "tech_support")
  # "tenure_months",
  # "monthly_charges",
  # "contract_type"


# Keep outcome + predictors only
churnL3keep_cols <- c("churn", churnL3predictor_cols)
churnL3_df2 <- churnL3[ , churnL3keep_cols ]

• Justify inclusion/exclusion decisions (e.g., relevance, redundancy, interpretability)

# Confirm structure
str(churnL3_df2)
## tibble [400 × 5] (S3: tbl_df/tbl/data.frame)
##  $ churn          : Factor w/ 2 levels "No","Yes": 2 2 1 1 1 1 1 1 1 1 ...
##  $ tenure_months  : num [1:400] 49 29 35 1 23 31 24 9 3 11 ...
##  $ monthly_charges: num [1:400] 68.6 77.4 80.2 58.4 50.8 ...
##  $ online_security: chr [1:400] "Yes" "Yes" "Yes" "Yes" ...
##  $ tech_support   : chr [1:400] "Yes" "Yes" "No" "Yes" ...

Expected output: • Short written justification (≈1 paragraph) These variables were selected because they showed some correlation to the outcome variable while total_chares was excluded because of high correlation with tenture_months


Task 3 — Train/Test Split • Create a 70/30 train–test split

n <- nrow(churnL3_df2)
train_size <- floor(0.70 * n)

train_idx <- sample(1:n, size = train_size, replace = FALSE)
train <- churnL3_df2[train_idx, ]
test  <- churnL3_df2[-train_idx, ]

• Verify that outcome proportions are similar across splits

c(nrow(train), nrow(test))
## [1] 280 120
prop.table(table(train$churn))
## 
##        No       Yes 
## 0.7321429 0.2678571
prop.table(table(test$churn))
## 
##        No       Yes 
## 0.7083333 0.2916667

Expected output: Your answer: [WRITE HERE] Checkpoint (1–2 sentences): - Are the churn rates similar in train vs test? The results show that the churn rates are similar but they also seem to include a significant amount of variance. This could mean the training data did not include observations with very similar predictor values.


• Code showing split

calc_metrics <- function(actual, pred_class, positive = "Yes") {
  actual <- factor(actual, levels = c("No","Yes"))
  pred_class <- factor(pred_class, levels = c("No","Yes"))
  cm <- table(Predicted = pred_class, Actual = actual)

  TP <- cm[positive, positive]
  TN <- cm[setdiff(levels(actual), positive), setdiff(levels(actual), positive)]
  FP <- cm[positive, setdiff(levels(actual), positive)]
  FN <- cm[setdiff(levels(actual), positive), positive]

  accuracy  <- (TP + TN) / sum(cm)
  precision <- if ((TP + FP) == 0) NA else TP / (TP + FP)
  recall    <- if ((TP + FN) == 0) NA else TP / (TP + FN)
  f1        <- if (is.na(precision) | is.na(recall) | (precision + recall) == 0) NA else 2 * precision * recall / (precision + recall)

  list(cm = cm, accuracy = accuracy, precision = precision, recall = recall, f1 = f1)
}

make_roc_auc <- function(actual, prob_yes) {
  roc_obj <- roc(response = actual, predictor = prob_yes, levels = c("No","Yes"), direction = "<")
  auc_val <- as.numeric(auc(roc_obj))
  list(roc = roc_obj, auc = auc_val)
}

• Confirmation of sample sizes and churn rates


Task 4 — Exploratory Model Diagnostics Create at least two plots that help you understand churn risk.
Each plot must include 1–3 sentences explaining what it suggests.

• Produce at least two student-chosen plots that help you understand: o Predictor–outcome relationships, or

# TODO: Choose one predictor and plot churn rate by group (if categorical)
# Example ideas:
# - barplot of churn proportion by contract type
# - boxplot of numeric predictor split by churn
#
# PLOT 1 CODE HERE

Interpretation (1–3 sentences): [WRITE HERE]

o Potential modeling challenges (e.g., separation, scale)

# TODO: A second plot, different from Plot 1
# PLOT 2 CODE HERE

Interpretation (1–3 sentences): [WRITE HERE]

Expected output: • Plots generated by you

• Brief interpretation for each plot


Task 5 — Logistic Regression (Baseline Model)

• Fit a logistic regression using selected predictors

# Logistic regression (uses selected predictors only)
logit_fit <- glm(churn ~ ., data = train, family = binomial)
summary(logit_fit)
## 
## Call:
## glm(formula = churn ~ ., family = binomial, data = train)
## 
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)  
## (Intercept)        -1.345371   0.569427  -2.363   0.0181 *
## tenure_months       0.010169   0.008218   1.237   0.2159  
## monthly_charges    -0.002821   0.006407  -0.440   0.6598  
## online_securityYes  0.299793   0.272327   1.101   0.2710  
## tech_supportYes     0.164144   0.275591   0.596   0.5514  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 325.42  on 279  degrees of freedom
## Residual deviance: 322.22  on 275  degrees of freedom
## AIC: 332.22
## 
## Number of Fisher Scoring iterations: 4

• Generate predictions on the test set

logit_prob <- predict(logit_fit, newdata = test, type = "response")

• Compute: o Confusion matrix

# TODO: Choose a threshold (start with 0.50). You may adjust later.
threshold <- 0.50
logit_pred <- ifelse(logit_prob >= threshold, "Yes", "No")

o Accuracy, Precision, Recall, F1

logit_m <- calc_metrics(test$churn, logit_pred)
logit_m$cm
##          Actual
## Predicted No Yes
##       No  85  35
##       Yes  0   0
logit_m$accuracy; logit_m$precision; logit_m$recall; logit_m$f1
## [1] 0.7083333
## [1] NA
## [1] 0
## [1] NA

o ROC curve and AUC

logit_ra <- make_roc_auc(test$churn, logit_prob)
logit_ra$auc
## [1] 0.5842017

Expected output: Checkpoint (2–4 sentences): - Which predictors look influential (direction/sign), and does that match your EDA?

Your answer: [WRITE HERE] • Metrics and ROC plot

• Short interpretation of coefficients and performance


Task 6 — k-Nearest Neighbors (kNN) kNN needs: - numeric design matrix (dummy variables for categorical predictors) - scaling using train statistics only

• Prepare data appropriately (dummy variables + scaling)

• Fit a kNN model (choose and report k)

x_train <- model.matrix(churn ~ . - 1, data = train)
x_test  <- model.matrix(churn ~ . - 1, data = test)

y_train <- train$churn
y_test  <- test$churn

# Scale using TRAIN stats
train_center <- apply(x_train, 2, mean)
train_scale  <- apply(x_train, 2, sd)
train_scale[train_scale == 0] <- 1

x_train_s <- scale(x_train, center = train_center, scale = train_scale)
x_test_s  <- scale(x_test,  center = train_center, scale = train_scale)

• Evaluate using the same metrics as Task 5

# TODO: Choose k (try at least two values and record what happens)
k <- 15

knn_pred <- knn(train = x_train_s, test = x_test_s, cl = y_train, k = k, prob = TRUE)

# Convert "winning prob" into prob(Yes) for ROC
winprob <- attr(knn_pred, "prob")
knn_prob_yes <- ifelse(knn_pred == "Yes", winprob, 1 - winprob)

knn_m <- calc_metrics(y_test, knn_pred)
knn_m$cm
##          Actual
## Predicted No Yes
##       No  79  34
##       Yes  6   1
knn_m$accuracy; knn_m$precision; knn_m$recall; knn_m$f1
## [1] 0.6666667
## [1] 0.1428571
## [1] 0.02857143
## [1] 0.04761905
knn_ra <- make_roc_auc(y_test, knn_prob_yes)
knn_ra$auc
## [1] 0.4905882

Expected output: Checkpoint (1–3 sentences): - What k did you choose and why? What changed when you tried another k?

Your answer: [WRITE HERE] • Metrics and ROC curve

• Comparison to logistic regression


Task 7 — Naive Bayes • Fit a Naive Bayes classifier

# NOTE: e1071::naiveBayes can handle factors directly.
nb_fit <- naiveBayes(churn ~ ., data = train)
nb_fit
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##        No       Yes 
## 0.7321429 0.2678571 
## 
## Conditional probabilities:
##      tenure_months
## Y         [,1]     [,2]
##   No  29.12195 16.64302
##   Yes 31.77333 17.47166
## 
##      monthly_charges
## Y         [,1]     [,2]
##   No  69.16922 20.97912
##   Yes 67.75907 22.62381
## 
##      online_security
## Y            No       Yes
##   No  0.5707317 0.4292683
##   Yes 0.4933333 0.5066667
## 
##      tech_support
## Y            No       Yes
##   No  0.5073171 0.4926829
##   Yes 0.4800000 0.5200000

• Evaluate using the same metrics

# Predicted class
nb_pred <- predict(nb_fit, newdata = test, type = "class")

# Predicted probability of "Yes" for ROC
nb_prob <- predict(nb_fit, newdata = test, type = "raw")[, "Yes"]

nb_m <- calc_metrics(test$churn, nb_pred)
nb_m$cm
##          Actual
## Predicted No Yes
##       No  85  35
##       Yes  0   0
nb_m$accuracy; nb_m$precision; nb_m$recall; nb_m$f1
## [1] 0.7083333
## [1] NA
## [1] 0
## [1] NA
nb_ra <- make_roc_auc(test$churn, nb_prob)
nb_ra$auc
## [1] 0.6070588

• Discuss when Naive Bayes may outperform other models

Expected output: Checkpoint (1–3 sentences): - When might Naive Bayes be a good choice, even if AUC is not the highest?

Your answer: [WRITE HERE]

• Metrics and ROC curve

• Short interpretive discussion


Task 8 — Classification Tree • Fit a classification tree

tree_fit <- rpart(churn ~ ., data = train, method = "class")
print(tree_fit)
## n= 280 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
##   1) root 280 75 No (0.73214286 0.26785714)  
##     2) monthly_charges>=29.45 271 69 No (0.74538745 0.25461255)  
##       4) tenure_months< 48.5 218 48 No (0.77981651 0.22018349)  
##         8) tenure_months>=41.5 33  3 No (0.90909091 0.09090909) *
##         9) tenure_months< 41.5 185 45 No (0.75675676 0.24324324)  
##          18) online_security=No 100 19 No (0.81000000 0.19000000)  
##            36) monthly_charges< 77.98 70  8 No (0.88571429 0.11428571) *
##            37) monthly_charges>=77.98 30 11 No (0.63333333 0.36666667)  
##              74) monthly_charges>=89.105 18  3 No (0.83333333 0.16666667) *
##              75) monthly_charges< 89.105 12  4 Yes (0.33333333 0.66666667) *
##          19) online_security=Yes 85 26 No (0.69411765 0.30588235)  
##            38) monthly_charges>=78.435 31  5 No (0.83870968 0.16129032) *
##            39) monthly_charges< 78.435 54 21 No (0.61111111 0.38888889)  
##              78) monthly_charges< 49.315 11  2 No (0.81818182 0.18181818) *
##              79) monthly_charges>=49.315 43 19 No (0.55813953 0.44186047)  
##               158) monthly_charges>=56.03 34 13 No (0.61764706 0.38235294)  
##                 316) tenure_months>=27.5 11  2 No (0.81818182 0.18181818) *
##                 317) tenure_months< 27.5 23 11 No (0.52173913 0.47826087)  
##                   634) tenure_months< 12.5 13  4 No (0.69230769 0.30769231) *
##                   635) tenure_months>=12.5 10  3 Yes (0.30000000 0.70000000) *
##               159) monthly_charges< 56.03 9  3 Yes (0.33333333 0.66666667) *
##       5) tenure_months>=48.5 53 21 No (0.60377358 0.39622642)  
##        10) monthly_charges< 60.79 19  3 No (0.84210526 0.15789474) *
##        11) monthly_charges>=60.79 34 16 Yes (0.47058824 0.52941176)  
##          22) tenure_months>=50.5 25 11 No (0.56000000 0.44000000)  
##            44) online_security=Yes 9  2 No (0.77777778 0.22222222) *
##            45) online_security=No 16  7 Yes (0.43750000 0.56250000) *
##          23) tenure_months< 50.5 9  2 Yes (0.22222222 0.77777778) *
##     3) monthly_charges< 29.45 9  3 Yes (0.33333333 0.66666667) *

• Evaluate using the same metrics

tree_prob <- predict(tree_fit, newdata = test, type = "prob")[, "Yes"]
tree_pred <- ifelse(tree_prob >= 0.50, "Yes", "No")

tree_m <- calc_metrics(test$churn, tree_pred)
tree_m$cm
##          Actual
## Predicted No Yes
##       No  60  26
##       Yes 25   9
tree_m$accuracy; tree_m$precision; tree_m$recall; tree_m$f1
## [1] 0.575
## [1] 0.2647059
## [1] 0.2571429
## [1] 0.2608696
tree_ra <- make_roc_auc(test$churn, tree_prob)
tree_ra$auc
## [1] 0.5368067
# Optional (requires extra package)
# install.packages("rpart.plot")
# library(rpart.plot)
# rpart.plot(tree_fit)

• Comment on interpretability vs performance

Expected output: Checkpoint (1–3 sentences): - What do you gain/lose with a tree vs logistic regression?

Your answer: [WRITE HERE]

• Metrics (tree visualization optional)

• Interpretation of strengths/weaknesses


Task 9 — Model Comparison • Create a single comparison table including: o Accuracy, Precision, Recall, F1, AUC for all models

results <- data.frame(
  Model = c("Logistic", paste0("kNN (k=", k, ")"), "Naive Bayes", "CART Tree"),
  Accuracy  = c(logit_m$accuracy,  knn_m$accuracy,  nb_m$accuracy,  tree_m$accuracy),
  Precision = c(logit_m$precision, knn_m$precision, nb_m$precision, tree_m$precision),
  Recall    = c(logit_m$recall,    knn_m$recall,    nb_m$recall,    tree_m$recall),
  F1        = c(logit_m$f1,        knn_m$f1,        nb_m$f1,        tree_m$f1),
  AUC       = c(logit_ra$auc,      knn_ra$auc,      nb_ra$auc,      tree_ra$auc)
)
results
##         Model  Accuracy Precision     Recall         F1       AUC
## 1    Logistic 0.7083333        NA 0.00000000         NA 0.5842017
## 2  kNN (k=15) 0.6666667 0.1428571 0.02857143 0.04761905 0.4905882
## 3 Naive Bayes 0.7083333        NA 0.00000000         NA 0.6070588
## 4   CART Tree 0.5750000 0.2647059 0.25714286 0.26086957 0.5368067

• Plot all ROC curves on one chart

plot(logit_ra$roc, main = "ROC Curves (Test Set)", legacy.axes = TRUE)
plot(knn_ra$roc, add = TRUE)
plot(nb_ra$roc, add = TRUE)
plot(tree_ra$roc, add = TRUE)

legend(
  "bottomright",
  legend = c(
    paste0("Logit AUC=", round(logit_ra$auc, 3)),
    paste0("kNN AUC=", round(knn_ra$auc, 3)),
    paste0("NB AUC=", round(nb_ra$auc, 3)),
    paste0("Tree AUC=", round(tree_ra$auc, 3))
  ),
  lty = 1,
  bty = "n"
)

Expected output:

• Comparison table

• Combined ROC plot ________________________________________ Task 10 — Model Selection & Business Scenario Assume false negatives are more costly than false positives (missing likely churners).

Select one model as “best” based on the business goal. • Identify the best-performing model under this assumption

Write 1 short paragraph: - Which metric did you prioritize and why? (e.g., Recall if missing churners is costly) - Which model did you choose?

• Justify your choice using metrics and business reasoning Expected output: Your answer: [WRITE HERE] • 1–2 paragraphs written explanation ________________________________________ Task 11 — Ensemble Extension Using your best individual model: Pick one improvement path:

Option A (recommended): Bagging
- If your best model is a tree, bagging is a natural choice. - If your best model is not a tree, you can still bag trees as a “strong alternative” and compare.

Option B (optional): Boosting (advanced)
Option C (optional/bonus): Stacking (advanced) • Apply one ensembling approach (e.g., bagging, boosting, or stacking) `

• Compare ensemble performance to the single model

• State whether ensembling improved results and why

Expected output: Write 3–6 sentences: - What ensemble method did you use? - Did it improve your chosen metric vs the single best model? - Why might it help (or not) on this dataset?

Your answer: [WRITE HERE] • Updated metrics • Brief interpretation of ensemble effectiveness ________________________________________ # Final task — Recommendation memo (≤ 1 page)

Write a memo to a non-technical manager:

  1. Best single model (metrics + business rationale)
  2. Best threshold / metric tradeoff for your scenario
  3. Whether your ensemble improved outcomes and whether you recommend it Deliverable Submit one knitted R Markdown (HTML, PDF, or Doc) that includes: • All code and outputs • Tables and plots • Clear written answers for each task – either generated by the code or you type in Notes • These can all be included in one knitted output • Metrics must be computed on the test set only • Plots must be interpreted, not just displayed • Code clarity and reasoning matter more than perfect accuracy