560_ch5

Data

Build your table in R

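# Each pair is (propensity of class 1, actual class); byrow = TRUE fills one record per row
table5.7 <- matrix(c(.03, 0, .52, 0, .38, 0, .82, 1, .33, 0,
                     .42, 0, .55, 1, .59, 0, .09, 0, .21, 0,
                     .43, 0, .04, 0, .08, 0, .13, 0, .01, 0,
                     .79, 1, .42, 0, .29, 0, .08, 0, .02, 0),
                   ncol = 2, byrow = TRUE)
colnames(table5.7) <- c("Propensity of 1", "Actual")
table5.7 <- as.data.frame(table5.7)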
table5.7

##    Propensity of 1 Actual
## 1             0.03      0
## 2             0.52      0
## 3             0.38      0
## 4             0.82      1
## 5             0.33      0
## 6             0.42      0
## 7             0.55      1
## 8             0.59      0
## 9             0.09      0
## 10            0.21      0
## 11            0.43      0
## 12            0.04      0
## 13            0.08      0
## 14            0.13      0
## 15            0.01      0
## 16            0.79      1
## 17            0.42      0
## 18            0.29      0
## 19            0.08      0
## 20            0.02      0
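As a quick sanity check on the table, the class balance can be computed directly. This is a minimal base-R sketch that re-enters the same 20 records as two plain vectors; the names `propensity`, `actual`, `n_records`, `class_count`, and `base_rate` are illustrative and not part of the original solution.

```r
# The 20 records from the table above, re-entered as two vectors
# (illustrative names, values copied from the table).
propensity <- c(0.03, 0.52, 0.38, 0.82, 0.33, 0.42, 0.55, 0.59, 0.09, 0.21,
                0.43, 0.04, 0.08, 0.13, 0.01, 0.79, 0.42, 0.29, 0.08, 0.02)
actual     <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0, 0)

n_records   <- length(actual)  # number of records
class_count <- table(actual)   # how many actual 0s and actual 1s
base_rate   <- mean(actual)    # proportion of actual 1s

print(n_records)
print(class_count)
print(base_rate)
```

The proportion of actual 1s is the same quantity as `mean(table5.7$Actual)`, which Part B uses as the denominator of the lift.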

Part A

Calculate the error rate, sensitivity, and specificity using cutoffs of 0.25, 0.5, and 0.75.

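NOTE 1: You will need to load the “caret” and “e1071” packages.

NOTE 2: “reference” = “actual”, i.e. the second argument to confusionMatrix() holds the actual classes.

NOTE 3: confusionMatrix() treats the first factor level, here 0, as the positive class by default, so the Sensitivity and Specificity printed below describe class 0. To report them for class 1, pass positive = '1'; the two values then swap. The error rate is 1 - Accuracy.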

library(caret)

## Loading required package: lattice

## Loading required package: ggplot2

library(e1071)

# Cutoff = 0.25: predict class 1 when the propensity exceeds the cutoff
confusionMatrix(as.factor(ifelse(table5.7$`Propensity of 1`>0.25, '1', '0')), 
                as.factor(table5.7$Actual))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction 0 1
##          0 9 0
##          1 8 3
##                                           
##                Accuracy : 0.6             
##                  95% CI : (0.3605, 0.8088)
##     No Information Rate : 0.85            
##     P-Value [Acc > NIR] : 0.99867         
##                                           
##                   Kappa : 0.2523          
##                                           
##  Mcnemar's Test P-Value : 0.01333         
##                                           
##             Sensitivity : 0.5294          
##             Specificity : 1.0000          
##          Pos Pred Value : 1.0000          
##          Neg Pred Value : 0.2727          
##              Prevalence : 0.8500          
##          Detection Rate : 0.4500          
##    Detection Prevalence : 0.4500          
##       Balanced Accuracy : 0.7647          
##                                           
##        'Positive' Class : 0               
##

# Cutoff = 0.5: predict class 1 when the propensity exceeds the cutoff
confusionMatrix(as.factor(ifelse(table5.7$`Propensity of 1`>0.5, '1', '0')), 
                as.factor(table5.7$Actual))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  0  1
##          0 15  0
##          1  2  3
##                                          
##                Accuracy : 0.9            
##                  95% CI : (0.683, 0.9877)
##     No Information Rate : 0.85           
##     P-Value [Acc > NIR] : 0.4049         
##                                          
##                   Kappa : 0.6923         
##                                          
##  Mcnemar's Test P-Value : 0.4795         
##                                          
##             Sensitivity : 0.8824         
##             Specificity : 1.0000         
##          Pos Pred Value : 1.0000         
##          Neg Pred Value : 0.6000         
##              Prevalence : 0.8500         
##          Detection Rate : 0.7500         
##    Detection Prevalence : 0.7500         
##       Balanced Accuracy : 0.9412         
##                                          
##        'Positive' Class : 0              
##

# Cutoff = 0.75: predict class 1 when the propensity exceeds the cutoff
confusionMatrix(as.factor(ifelse(table5.7$`Propensity of 1`>0.75, '1', '0')), 
                as.factor(table5.7$Actual))

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  0  1
##          0 17  1
##          1  0  2
##                                           
##                Accuracy : 0.95            
##                  95% CI : (0.7513, 0.9987)
##     No Information Rate : 0.85            
##     P-Value [Acc > NIR] : 0.1756          
##                                           
##                   Kappa : 0.7727          
##                                           
##  Mcnemar's Test P-Value : 1.0000          
##                                           
##             Sensitivity : 1.0000          
##             Specificity : 0.6667          
##          Pos Pred Value : 0.9444          
##          Neg Pred Value : 1.0000          
##              Prevalence : 0.8500          
##          Detection Rate : 0.8500          
##    Detection Prevalence : 0.9000          
##       Balanced Accuracy : 0.8333          
##                                           
##        'Positive' Class : 0               
##
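The caret output above reports sensitivity and specificity with class 0 as the positive class, and it shows accuracy rather than the error rate. As a cross-check, the same quantities can be computed by hand with class 1 as the positive class. This is a minimal base-R sketch; `metrics_at` is a hypothetical helper written for this illustration, and the data are re-entered as plain vectors so the block runs on its own.

```r
# The 20 records from the Data section, re-entered as two vectors.
propensity <- c(0.03, 0.52, 0.38, 0.82, 0.33, 0.42, 0.55, 0.59, 0.09, 0.21,
                0.43, 0.04, 0.08, 0.13, 0.01, 0.79, 0.42, 0.29, 0.08, 0.02)
actual     <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0, 0)

# Hypothetical helper: metrics at one cutoff, treating class 1 as positive.
metrics_at <- function(cutoff) {
  predicted <- as.integer(propensity > cutoff)  # same rule as the ifelse() calls above
  tp <- sum(predicted == 1 & actual == 1)       # true positives
  tn <- sum(predicted == 0 & actual == 0)       # true negatives
  fp <- sum(predicted == 1 & actual == 0)       # false positives
  fn <- sum(predicted == 0 & actual == 1)       # false negatives
  c(cutoff      = cutoff,
    error_rate  = (fp + fn) / length(actual),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

# One row per cutoff, one column per metric.
results <- t(sapply(c(0.25, 0.5, 0.75), metrics_at))
print(round(results, 4))
```

Read against the caret output, the error rate in each row is 1 minus the reported Accuracy, and the sensitivity and specificity columns are the caret values swapped, because the positive class differs.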

Part B

Create a decile-wise lift chart in R.

NOTE 1: You will need to use the “gains” package to compute the deciles; with the “caret” package the deciles would have to be computed manually.

NOTE 2: The percentiles on the x-axis do not match the deciles exactly because the sample is small and discrete, so multiple records share the same decile boundary.

library(gains)

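# gains(actual, predicted) groups the records by predicted propensity
gain <- gains(table5.7$Actual, table5.7$`Propensity of 1`)
# Each bar is the group's mean response divided by the overall mean response, i.e. the lift
barplot(gain$mean.resp / mean(table5.7$Actual), names.arg = gain$depth, xlab = "Percentile", 
        ylab = "Lift (mean response / overall mean)", main = "Decile-wise lift chart")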
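To see what the bars represent, the decile-wise lift can also be sketched by hand in base R. This version assumes equal-sized bins of two records each after sorting by propensity. It is not the grouping rule of the “gains” package, which may place tied scores differently, so the bars need not match the chart exactly. The names `ord`, `sorted_actual`, `decile`, `decile_mean`, and `lift` are illustrative.

```r
# The 20 records from the Data section, re-entered as two vectors.
propensity <- c(0.03, 0.52, 0.38, 0.82, 0.33, 0.42, 0.55, 0.59, 0.09, 0.21,
                0.43, 0.04, 0.08, 0.13, 0.01, 0.79, 0.42, 0.29, 0.08, 0.02)
actual     <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 1, 0, 0, 0, 0)

# Sort the actual classes from highest to lowest propensity.
ord           <- order(propensity, decreasing = TRUE)
sorted_actual <- actual[ord]

# Assumption: ten equal bins of two records each (20 records / 10 deciles).
decile <- rep(1:10, each = length(actual) / 10)

# Lift = mean response within the decile / overall mean response.
decile_mean <- tapply(sorted_actual, decile, mean)
lift        <- decile_mean / mean(actual)
print(round(lift, 2))
```

Under this binning only the first two deciles contain any actual 1s, so all of the lift is concentrated there; `lift` could be passed to `barplot()` in the same way as the “gains” output.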

Problem 5.7 | Predictive Model Validation

C.E.Weiner

1/27/2020
