Objective

Predict cheaters using the Affairs data from the AER package and the classification tree algorithm from the rpart package.
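A minimal sketch of this workflow, under stated assumptions: a "cheater" is taken to be anyone reporting at least one affair, and the validation set is a random hold-out of 176 rows (the size of the validation table in Model Accuracy). The seed, the split method, and the name `training_tree` are illustrative; `validation_tree`, `cheater` and `cheater_predicted` follow the names in the cross-tabulation output.

```r
library(rpart)

# Affairs ships with the AER package (601 rows)
data("Affairs", package = "AER")

# Assumption: cheater = 1 if at least one affair was reported, else 0
Affairs$cheater <- factor(as.integer(Affairs$affairs > 0), levels = c(0, 1))

# Assumption: random hold-out of 176 rows; the seed is arbitrary
set.seed(123)
validation_rows <- sample(nrow(Affairs), 176)
training_tree   <- Affairs[-validation_rows, ]
validation_tree <- Affairs[validation_rows, ]

# Classification tree; the raw affairs count is left out of the predictors
tree <- rpart(
  cheater ~ gender + age + yearsmarried + children +
    religiousness + education + occupation + rating,
  data = training_tree,
  method = "class"
)

# Predicted class (0 or 1) for each validation row
validation_tree$cheater_predicted <-
  predict(tree, newdata = validation_tree, type = "class")
```

A different seed or split will generally produce a different tree, so the splits and counts reported in this document should not be expected to reproduce exactly.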

Data Prep

The Tree

Two Interpretations from the Tree

  1. The most likely cheaters:
    • Have been married more than 2.8 years
    • Are not very religious (religiousness < 4)
    • Are educated (education >= 17 years)
    • Rate their marriage poorly (rating < 4)
  2. The least likely cheaters:
    • Are newlyweds (married < 2.8 years)
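The two interpretations above can be written as an explicit rule. This is only a sketch of those two branches, not the full fitted tree: the function name `classify_by_rules` and the label "not covered" (for cases the two interpretations do not describe) are hypothetical.

```r
# Apply the two interpretations as a vectorised rule.
# Arguments use the Affairs column names; thresholds come from the list above.
classify_by_rules <- function(yearsmarried, religiousness, education, rating) {
  ifelse(
    yearsmarried < 2.8,
    "unlikely",                                   # newlyweds
    ifelse(
      religiousness < 4 & education >= 17 & rating < 4,
      "likely",                                   # all four conditions hold
      "not covered"                               # branch not described above
    )
  )
}

# Example: married 10 years, religiousness 2, 18 years of education, rating 3
classify_by_rules(10, 2, 18, 3)
```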

Variable Importance

Many people perceive gender as a predictor of cheating. However, the data shows that it is not a good predictor.
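One way to inspect variable importance is the `variable.importance` component of a fitted rpart object. The sketch below fits on the full Affairs data for brevity, which is an assumption, so its values will not match a tree fitted on a training subset; the name `fit` is illustrative.

```r
library(rpart)

data("Affairs", package = "AER")

# Assumption: cheater = 1 if at least one affair was reported, else 0
Affairs$cheater <- factor(as.integer(Affairs$affairs > 0), levels = c(0, 1))

# Assumption: fitted on all rows, only to illustrate the accessor
fit <- rpart(
  cheater ~ gender + age + yearsmarried + children +
    religiousness + education + occupation + rating,
  data = Affairs,
  method = "class"
)

# Named numeric vector of importance scores; NULL if the tree made no splits
fit$variable.importance
```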

Model Accuracy

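Using the validation data (176 observations), the cross-tabulation below shows that the model is able to:

    • Correctly classify 130 of 176 cases (73.9% overall)
    • Correctly identify 117 of 128 non-cheaters (91.4%)
    • Correctly identify only 13 of 48 cheaters (27.1%)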

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  176 
## 
##  
##                         | validation_tree$cheater_predicted 
## validation_tree$cheater |         0 |         1 | Row Total | 
## ------------------------|-----------|-----------|-----------|
##                       0 |       117 |        11 |       128 | 
##                         |     0.377 |     2.387 |           | 
##                         |     0.914 |     0.086 |     0.727 | 
##                         |     0.770 |     0.458 |           | 
##                         |     0.665 |     0.062 |           | 
## ------------------------|-----------|-----------|-----------|
##                       1 |        35 |        13 |        48 | 
##                         |     1.005 |     6.365 |           | 
##                         |     0.729 |     0.271 |     0.273 | 
##                         |     0.230 |     0.542 |           | 
##                         |     0.199 |     0.074 |           | 
## ------------------------|-----------|-----------|-----------|
##            Column Total |       152 |        24 |       176 | 
##                         |     0.864 |     0.136 |           | 
## ------------------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  10.13359     d.f. =  1     p =  0.001455916 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  8.624406     d.f. =  1     p =  0.003316886 
## 
##
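The summary rates can be recomputed from the four cell counts in the table above. A minimal sketch in base R; the names `cm`, `accuracy`, `sensitivity`, `specificity`, `precision` and `baseline` are illustrative.

```r
# Confusion matrix copied from the cross-tabulation (rows: actual, cols: predicted)
cm <- matrix(
  c(117, 35, 11, 13),          # filled column by column
  nrow = 2,
  dimnames = list(actual = c("0", "1"), predicted = c("0", "1"))
)

accuracy    <- sum(diag(cm)) / sum(cm)          # (117 + 13) / 176, about 0.739
sensitivity <- cm["1", "1"] / sum(cm["1", ])    # 13 / 48, about 0.271
specificity <- cm["0", "0"] / sum(cm["0", ])    # 117 / 128, about 0.914
precision   <- cm["1", "1"] / sum(cm[, "1"])    # 13 / 24, about 0.542

# Share of the majority class: the accuracy of always predicting 0
baseline <- max(rowSums(cm)) / sum(cm)          # 128 / 176, about 0.727

# Pearson chi-squared test without continuity correction
chisq.test(cm, correct = FALSE)

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, precision = precision,
        baseline = baseline), 3)
```

Comparing `accuracy` with `baseline` puts the overall figure in context: most of the correct predictions come from the majority (non-cheater) class.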