Objective

Predict cheaters in the Affairs data from the AER package with a classification tree built by the rpart package.

Data Prep
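
The preparation code is not shown in this report, so the following is only a sketch of one plausible way to do it. It assumes a binary target `cheater` derived from the `affairs` count and a random hold-out of 176 rows (the total in the validation table further down). The seed and the name `training_tree` are hypothetical; `validation_tree` and `cheater` are the names that appear in the output.

```r
# Sketch only: the split method and seed are assumptions, not the author's code.
library(AER)  # provides the Affairs data set

data("Affairs", package = "AER")

# Binary target: 1 if the respondent reported at least one affair, else 0.
Affairs$cheater <- factor(ifelse(Affairs$affairs > 0, 1, 0))

# Hold out 176 rows for validation; the remaining rows are used for training.
set.seed(42)  # arbitrary seed, for reproducibility only
validation_rows <- sample(nrow(Affairs), size = 176)
validation_tree <- Affairs[validation_rows, ]
training_tree   <- Affairs[-validation_rows, ]  # hypothetical name

# Check the class balance in both sets.
table(training_tree$cheater)
table(validation_tree$cheater)
```

A different seed or split rule will give different class counts than the ones reported in the validation table.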

The Tree
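
The fitting code is also not shown. A minimal sketch of growing a classification tree with `rpart` and scoring the validation rows might look like this. It assumes default `rpart` settings, the same hypothetical split as a random 176-row hold-out, and that the raw `affairs` count is excluded from the predictors because it defines the target.

```r
# Sketch only: formula, defaults, and split are assumptions, not the author's code.
library(AER)
library(rpart)

data("Affairs", package = "AER")
Affairs$cheater <- factor(ifelse(Affairs$affairs > 0, 1, 0))

set.seed(42)  # arbitrary seed
validation_rows <- sample(nrow(Affairs), size = 176)
validation_tree <- Affairs[validation_rows, ]
training_tree   <- Affairs[-validation_rows, ]  # hypothetical name

# Classification tree on every column except the raw affairs count.
fit <- rpart(cheater ~ . - affairs, data = training_tree, method = "class")

# Text view of the splits; each terminal node is one rule that can be read off.
print(fit)

# Predicted class for each validation row.
validation_tree$cheater_predicted <- predict(fit, newdata = validation_tree,
                                             type = "class")
```

Reading the printed nodes from the root down to a leaf gives the kind of rule summarised under the interpretations heading.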

Two Interpretations from the Tree

  1. The most likely cheaters
  2. The least likely cheaters

Variable Importance

It is interesting that many people perceive gender as a predictor of cheating, yet the data show that it is not a good one.
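
One way to inspect this is through the `variable.importance` component that `rpart` stores on a fitted tree. The sketch below fits on the full data set for brevity, which is an assumption; the report's own tree was presumably fitted on a training subset, so the exact values will differ.

```r
# Sketch only: fitting on all rows is a simplification, not the author's setup.
library(AER)
library(rpart)

data("Affairs", package = "AER")
Affairs$cheater <- factor(ifelse(Affairs$affairs > 0, 1, 0))

fit <- rpart(cheater ~ . - affairs, data = Affairs, method = "class")

# Named vector of importance scores, largest first (NULL if the tree has no splits).
importance <- fit$variable.importance

# Rescale to percentages so the predictors are easier to compare.
round(100 * importance / sum(importance), 1)
```

A predictor that is missing from this vector, or sits near the bottom of it, contributed little or nothing to the splits.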

Model Accuracy

On the validation data (176 observations), the model:
- Correctly identifies 27.1% of the cheaters (13 of 48)
- Misclassifies faithful people as cheaters 8.6% of the time (11 of 128)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  176 
## 
##  
##                         | validation_tree$cheater_predicted 
## validation_tree$cheater |         0 |         1 | Row Total | 
## ------------------------|-----------|-----------|-----------|
##                       0 |       117 |        11 |       128 | 
##                         |     0.377 |     2.387 |           | 
##                         |     0.914 |     0.086 |     0.727 | 
##                         |     0.770 |     0.458 |           | 
##                         |     0.665 |     0.062 |           | 
## ------------------------|-----------|-----------|-----------|
##                       1 |        35 |        13 |        48 | 
##                         |     1.005 |     6.365 |           | 
##                         |     0.729 |     0.271 |     0.273 | 
##                         |     0.230 |     0.542 |           | 
##                         |     0.199 |     0.074 |           | 
## ------------------------|-----------|-----------|-----------|
##            Column Total |       152 |        24 |       176 | 
##                         |     0.864 |     0.136 |           | 
## ------------------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  10.13359     d.f. =  1     p =  0.001455916 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  8.624406     d.f. =  1     p =  0.003316886 
## 
##
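
The rates quoted in this section can be recomputed from the four cell counts in the table. The sketch below uses base R only; the object names are illustrative. It also adds overall accuracy and the accuracy of always predicting "faithful", which help put the 27% figure in context.

```r
# Cell counts copied from the cross table (rows = actual, columns = predicted).
cm <- matrix(c(117, 35, 11, 13), nrow = 2,
             dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

# Share of actual cheaters that the model flags: 13 / 48.
sensitivity <- cm["1", "1"] / sum(cm["1", ])

# Share of faithful people wrongly flagged as cheaters: 11 / 128.
false_positive_rate <- cm["0", "1"] / sum(cm["0", ])

# Share of predicted cheaters who actually cheated: 13 / 24.
precision <- cm["1", "1"] / sum(cm[, "1"])

# Overall share of correct predictions: (117 + 13) / 176.
accuracy <- sum(diag(cm)) / sum(cm)

# Accuracy of a rule that labels everyone as faithful: 128 / 176.
baseline_accuracy <- sum(cm["0", ]) / sum(cm)

round(c(sensitivity = sensitivity,
        false_positive_rate = false_positive_rate,
        precision = precision,
        accuracy = accuracy,
        baseline_accuracy = baseline_accuracy), 3)
# sensitivity 0.271, false_positive_rate 0.086, precision 0.542,
# accuracy 0.739, baseline_accuracy 0.727

# Pearson chi-squared test without continuity correction, as in the output above.
chisq.test(cm, correct = FALSE)
```

Overall accuracy (73.9%) is only slightly higher than the 72.7% obtained by predicting that nobody cheats, so the tree's value lies mainly in the cheaters it does flag: about 54% of its positive predictions are correct.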