Performance evaluation is an important part of the machine learning process: we need to know how well a model performs on unseen data.
In this study, I will evaluate my model based on the following classification metrics (a small R sketch computing them from raw counts follows the list):
1. Accuracy Score: the proportion of observations whose actual class the model predicts correctly.
2. Misclassification Score: 1 - Accuracy Score.
3. Confusion Matrix: describes the performance of the model across all classes.
4. Precision: what proportion of positive identifications was actually correct?
\[ Precision = \frac{True\ Pos}{(True\ Pos+False\ Pos)} \]
5. Recall (Sensitivity): what proportion of actual positives was identified correctly?
\[ Recall = \frac{True\ Pos}{(True\ Pos+False\ Neg)} \]
6. Specificity: how good the model is at avoiding false alarms.
\[ Specificity = \frac{True\ Neg}{(True\ Neg+False\ Pos)} \]
7. F1 Score: the harmonic mean of precision and recall.
\[ F1 = 2 * \frac{Precision * Recall}{(Precision + Recall)} \]
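To make the definitions concrete, here is a minimal R sketch (not part of the original analysis) that computes each metric from the four cells of a binary confusion matrix. The counts used are the ones the decision tree below ends up producing, with "L" taken as the positive class:
# Cells of a 2x2 confusion matrix (values from the model fitted below)
TP <- 37; FP <- 5; TN <- 29; FN <- 2
accuracy    <- (TP + TN) / (TP + FP + TN + FN)
precision   <- TP / (TP + FP)
recall      <- TP / (TP + FN)          # a.k.a. sensitivity
specificity <- TN / (TN + FP)
f1          <- 2 * precision * recall / (precision + recall)
round(c(accuracy = accuracy, precision = precision, recall = recall,
        specificity = specificity, f1 = f1), 3)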
Note that for a skewed data set the accuracy score may be highly misleading: if 95% of the observations belong to one class, a model that always predicts that class achieves 95% accuracy while learning nothing.
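A tiny illustration with made-up labels (hypothetical data, purely to show the point above):
# 95/5 class imbalance: always predicting the majority class scores 95% accuracy
y <- factor(c(rep("neg", 95), rep("pos", 5)))
always_neg <- factor(rep("neg", 100), levels = levels(y))
mean(always_neg == y)  # 0.95, yet the "pos" class is never detected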
Data:
Auto data set: A data frame with 392 observations on the following 9 variables.
mpg: miles per gallon
cylinders: number of cylinders, between 4 and 8
displacement: engine displacement (cu. inches)
horsepower: engine horsepower
weight: vehicle weight (lbs.)
acceleration: time to accelerate from 0 to 60 mph (sec.)
year: model year (modulo 100)
origin: origin of car (1. American, 2. European, 3. Japanese)
name: vehicle name
The original data contained 408 observations, but 16 observations with missing values were removed. (The CSV read in below appears to be the 398-row version in which missing horsepower values are still present as non-numeric markers, which is why sd() returns NA for that column further down.)
Objective:
Fit a classification model (a decision tree is used below), evaluate it with the metrics above, and choose the best cut-off point using the ROC curve.
Data Preparation
#Loading necessary libraries
library(tidyverse) # For data manipulation
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.4 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(dplyr) # already attached via tidyverse; kept for explicitness
library(caret) # confusion matrix
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(e1071) # miscellaneous statistical functions used alongside caret
library(GGally) # pairwise-plot extensions for ggplot2
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
library(rpart) # decision trees
library(ROCR) # ROC curves
library(rpart.plot) # tree visualization
library(kableExtra) # table styling
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
library(rattle) # fancy tree plots and other helpers
## Loading required package: bitops
## Rattle: A free graphical interface for data science with R.
## Version 5.4.0 Copyright (c) 2006-2020 Togaware Pty Ltd.
## Type 'rattle()' to shake, rattle, and roll your data.
library(yardstick) # tidy model metrics
## For binary classification, the first factor level is assumed to be the event.
## Use the argument `event_level = "second"` to alter this as needed.
##
## Attaching package: 'yardstick'
## The following objects are masked from 'package:caret':
##
## precision, recall, sensitivity, specificity
## The following object is masked from 'package:readr':
##
## spec
library(formattable) # formatted tables
#PREPARING WORK SPACE
# Clear the workspace:
rm(list = ls())
# Load data
df <- read.csv("auto-mpg.csv", header = TRUE)
head(df)
## mpg cylinders displacement horsepower weight acceleration model.year origin
## 1 18 8 307 130 3504 12.0 70 1
## 2 15 8 350 165 3693 11.5 70 1
## 3 18 8 318 150 3436 11.0 70 1
## 4 16 8 304 150 3433 12.0 70 1
## 5 17 8 302 140 3449 10.5 70 1
## 6 15 8 429 198 4341 10.0 70 1
## car.name
## 1 chevrolet chevelle malibu
## 2 buick skylark 320
## 3 plymouth satellite
## 4 amc rebel sst
## 5 ford torino
## 6 ford galaxie 500
nrow(df)
## [1] 398
col <-colnames(df)
col
## [1] "mpg" "cylinders" "displacement" "horsepower" "weight"
## [6] "acceleration" "model.year" "origin" "car.name"
#Summary Statistics
summary(df)
## mpg cylinders displacement horsepower
## Min. : 9.00 Min. :3.000 Min. : 68.0 Length:398
## 1st Qu.:17.50 1st Qu.:4.000 1st Qu.:104.2 Class :character
## Median :23.00 Median :4.000 Median :148.5 Mode :character
## Mean :23.51 Mean :5.455 Mean :193.4
## 3rd Qu.:29.00 3rd Qu.:8.000 3rd Qu.:262.0
## Max. :46.60 Max. :8.000 Max. :455.0
## weight acceleration model.year origin
## Min. :1613 Min. : 8.00 Min. :70.00 Min. :1.000
## 1st Qu.:2224 1st Qu.:13.82 1st Qu.:73.00 1st Qu.:1.000
## Median :2804 Median :15.50 Median :76.00 Median :1.000
## Mean :2970 Mean :15.57 Mean :76.01 Mean :1.573
## 3rd Qu.:3608 3rd Qu.:17.18 3rd Qu.:79.00 3rd Qu.:2.000
## Max. :5140 Max. :24.80 Max. :82.00 Max. :3.000
## car.name
## Length:398
## Class :character
## Mode :character
##
##
##
sapply(df, sd)
## mpg cylinders displacement horsepower weight acceleration
## 7.8159843 1.7010042 104.2698382 NA 846.8417742 2.7576889
## model.year origin car.name
## 3.6976266 0.8020549 NA
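The NA values appear because horsepower and car.name are stored as character. A quick check (assuming the raw file marks missing horsepower with a non-numeric placeholder such as "?", as the UCI version of this data does):
# Coerce to numeric for a provisional sd; placeholder entries become NA
suppressWarnings(sd(as.numeric(df$horsepower), na.rm = TRUE))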
#Data Structure
str(df)
## 'data.frame': 398 obs. of 9 variables:
## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : int 8 8 8 8 8 8 8 8 8 8 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : chr "130" "165" "150" "150" ...
## $ weight : int 3504 3693 3436 3433 3449 4341 4354 4312 4425 3850 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ model.year : int 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : int 1 1 1 1 1 1 1 1 1 1 ...
## $ car.name : chr "chevrolet chevelle malibu" "buick skylark 320" "plymouth satellite" "amc rebel sst" ...
#Convert to Categorical Variable & to Numeric
df <- df %>%
  select(-car.name) %>%
  mutate(across(c(cylinders, model.year, origin), factor)) %>%  # funs() is deprecated; across() is the modern equivalent
  mutate(across(horsepower, as.numeric))
str(df)
## 'data.frame': 398 obs. of 8 variables:
## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : Factor w/ 5 levels "3","4","5","6",..: 5 5 5 5 5 5 5 5 5 5 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : num 130 165 150 150 140 198 220 215 225 190 ...
## $ weight : int 3504 3693 3436 3433 3449 4341 4354 4312 4425 3850 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ model.year : Factor w/ 13 levels "70","71","72",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ origin : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
#Create binary class based on MPG
m0 <- min(df$mpg)
m1 <- mean(df$mpg)
m2 <- max(df$mpg)
# include.lowest = TRUE keeps the minimum-mpg row from falling outside the bins (NA)
df$mpg_con <- cut(x = df$mpg, breaks = c(m0, m1, m2), labels = c("L", "H"), include.lowest = TRUE)
df <- df %>%
  select(-mpg)
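A quick sanity check, not in the original code, that the binning labelled every row (without include.lowest = TRUE the minimum-mpg row would be NA):
# Every row should now be labelled L or H, with no NA
table(df$mpg_con, useNA = "ifany")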
Data Partition
# set.seed(2021)  # uncomment (any seed) to make the split reproducible; the run below used none
ind <- sample(2, nrow(df), replace = TRUE, prob = c(0.8, 0.2))
train <- df[ind==1,]
test <- df[ind==2,]
head(train)
## cylinders displacement horsepower weight acceleration model.year origin
## 1 8 307 130 3504 12.0 70 1
## 3 8 318 150 3436 11.0 70 1
## 4 8 304 150 3433 12.0 70 1
## 5 8 302 140 3449 10.5 70 1
## 6 8 429 198 4341 10.0 70 1
## 7 8 454 220 4354 9.0 70 1
## mpg_con
## 1 L
## 3 L
## 4 L
## 5 L
## 6 L
## 7 L
head(test)
## cylinders displacement horsepower weight acceleration model.year origin
## 2 8 350 165 3693 11.5 70 1
## 8 8 440 215 4312 8.5 70 1
## 13 8 400 150 3761 9.5 70 1
## 23 4 104 95 2375 17.5 70 2
## 24 4 121 113 2234 12.5 70 2
## 25 6 199 90 2648 15.0 70 1
## mpg_con
## 2 L
## 8 L
## 13 L
## 23 H
## 24 H
## 25 L
Checking Class Proportions After the Split
#Overall
prop.table(table(df$mpg_con))
##
## L H
## 0.5239295 0.4760705
#Train
prop.table(table(train$mpg_con))
##
## L H
## 0.5216049 0.4783951
#Test
prop.table(table(test$mpg_con))
##
## L H
## 0.5342466 0.4657534
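The random split above happens to preserve the class proportions well. If it did not, a genuinely stratified split could be drawn with caret's createDataPartition (caret is already loaded); a sketch, not the method used above:
# Stratified 80/20 split on the class label
idx <- createDataPartition(df$mpg_con, p = 0.8, list = FALSE)
train_s <- df[idx, ]
test_s  <- df[-idx, ]
prop.table(table(train_s$mpg_con))  # proportions closely match the full data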
Decision Tree Model
model_fit1 <- rpart(mpg_con ~ ., data=train)
rpart.plot(model_fit1)
# For probabilities
p <- predict(model_fit1, newdata = test, type = "prob")
head(p)
## L H
## 2 1.00000000 0.000000
## 8 1.00000000 0.000000
## 13 1.00000000 0.000000
## 23 0.05755396 0.942446
## 24 0.05755396 0.942446
## 25 1.00000000 0.000000
#For classes
pred <- predict(model_fit1, newdata = test, type = "class")
Baseline accuracy (always predicting the majority class)
t<-table(test$mpg_con)
t
##
## L H
## 39 34
39/73
## [1] 0.5342466
To call our model successful, it should achieve a noticeably higher accuracy than this majority-class baseline.
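The same baseline can also be computed programmatically rather than by hand (a small sketch, not in the original):
# Accuracy of always predicting the most frequent class in the test set
max(prop.table(table(test$mpg_con)))  # 39/73, about 0.534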
Accuracy Score
acc<-mean(test$mpg_con == pred)
acc
## [1] 0.9041096
Misclassification Score
1- acc
## [1] 0.09589041
Confusion Matrix
table(pred, test$mpg_con)
##
## pred L H
## L 37 5
## H 2 29
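caret (loaded above) can produce the confusion matrix together with accuracy, sensitivity, specificity, and more in a single call; a sketch, with "L" chosen as the positive class to match yardstick's default of treating the first factor level as the event:
confusionMatrix(data = pred, reference = test$mpg_con, positive = "L")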
#prediction data frame
df1 <- data.frame(pred, mpg_con=test$mpg_con)
df1
## pred mpg_con
## 2 L L
## 8 L L
## 13 L L
## 23 H H
## 24 H H
## 25 L L
## 34 L L
## 38 L L
## 40 L L
## 55 H H
## 61 L L
## 65 L L
## 68 L L
## 69 L L
## 109 L L
## 114 L L
## 115 L H
## 120 L L
## 123 L H
## 127 L L
## 133 H H
## 135 L L
## 138 L L
## 146 H H
## 150 H H
## 154 L L
## 157 L L
## 161 L L
## 179 L L
## 181 L H
## 185 H H
## 192 L L
## 194 L H
## 199 H H
## 201 L L
## 210 H L
## 219 H H
## 223 L L
## 228 L L
## 240 H H
## 244 L L
## 246 H H
## 250 L L
## 253 L L
## 254 L L
## 261 L L
## 265 L L
## 267 H H
## 272 H L
## 276 L L
## 284 L L
## 285 L L
## 287 L L
## 289 L L
## 295 H H
## 296 H H
## 297 H H
## 298 H H
## 299 L L
## 306 H H
## 310 H H
## 311 H H
## 312 H H
## 326 H H
## 327 H H
## 328 H H
## 335 H H
## 347 H H
## 348 H H
## 381 H H
## 385 H H
## 388 L H
## 394 H H
#with Yardstick library
conf_mat(estimate=pred, truth= mpg_con, data=df1)
## Truth
## Prediction L H
## L 37 5
## H 2 29
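yardstick's conf_mat object also has a ggplot2 autoplot method, handy for reports (a sketch):
# Heatmap view of the confusion matrix
autoplot(conf_mat(data = df1, truth = mpg_con, estimate = pred), type = "heatmap")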
Precision
pre <- precision(data=df1, estimate=pred, truth=mpg_con)
pre
## # A tibble: 1 x 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 precision binary 0.881
Recall (Sensitivity)
rec <- recall(data=df1, estimate=pred, truth=mpg_con)
rec
## # A tibble: 1 x 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 recall binary 0.949
Specificity
spe <- spec(data=df1, estimate=pred, truth=mpg_con)
spe
## # A tibble: 1 x 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 spec binary 0.853
F1 Score
f1 <- f_meas(data=df1, estimate=pred, truth=mpg_con)
f1
## # A tibble: 1 x 3
## .metric .estimator .estimate
## <chr> <chr> <dbl>
## 1 f_meas binary 0.914
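As a sanity check (not in the original), the F1 value can be reproduced from the precision and recall computed above:
# Harmonic mean of precision and recall, matching f_meas()
p_hat <- pre$.estimate
r_hat <- rec$.estimate
2 * p_hat * r_hat / (p_hat + r_hat)  # ~0.914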
#Collect the scores in a tibble (data_frame() is deprecated); a new name avoids
#overwriting the data frame df
scores_df <- tibble(
  id = 1:5,
  Model = c('Accuracy', 'Precision', 'Sensitivity', 'Specificity', 'F1 Score'),
  Scores = c(acc, pre$.estimate, rec$.estimate, spe$.estimate, f1$.estimate)
)
formattable(scores_df,
            list(
              Model = formatter("span",
                                style = ~ style(color = ifelse(Scores > mean(Scores),
                                                               "green", "red"))),
              area(col = c(Scores)) ~ normalize_bar("lightgreen", 0.2)
            ))
| id | Model | Scores |
|---|---|---|
| 1 | Accuracy | 0.9041096 |
| 2 | Precision | 0.8809524 |
| 3 | Sensitivity | 0.9487179 |
| 4 | Specificity | 0.8529412 |
| 5 | F1 Score | 0.9135802 |
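The objective also called for choosing the best cut-off point with an ROC curve, and the ROCR package was loaded for exactly that. A sketch using the class-H probabilities p computed earlier, with Youden's J statistic (TPR - FPR) as the selection rule; J is one common choice, not the only one:
# ROC curve over the predicted probability of class "H"
pred_obj <- prediction(p[, "H"], test$mpg_con, label.ordering = c("L", "H"))
perf <- performance(pred_obj, measure = "tpr", x.measure = "fpr")
plot(perf, colorize = TRUE)  # ROC curve, colored by cut-off
# Area under the curve
performance(pred_obj, measure = "auc")@y.values[[1]]
# Cut-off maximizing Youden's J = TPR - FPR
j <- perf@y.values[[1]] - perf@x.values[[1]]
perf@alpha.values[[1]][which.max(j)]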
References:
CSU Applied Statistics Course Notes
https://developers.google.com/machine-learning/crash-course/classification/precision-and-recall
https://uberpython.wordpress.com/2012/01/01/precision-recall-sensitivity-and-specificity/