Predict cheaters with the Affairs data from the AER
package, using the classification tree algorithm from the rpart
package.
The outcome variable cheater is created by classifying
observations with 0 affairs as 0, and those with one or more
affairs as 1.

Interestingly, many people perceive gender as a predictor of cheating. However, the data show that it is not a good predictor.
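
The steps described above can be sketched roughly as follows. This is a minimal sketch, not the original code: the seed, the 70/30 split, and the names `dat`, `train_idx`, `train_tree`, and `fit` are assumptions made for illustration, so the counts it produces will not match the output shown in this post exactly. Only `validation_tree`, `cheater`, and `cheater_predicted` are taken from that output.

```r
library(AER)    # provides the Affairs data set
library(rpart)  # recursive partitioning (classification trees)

data("Affairs", package = "AER")

# Binary outcome: 0 = no affairs, 1 = one or more affairs
Affairs$cheater <- factor(ifelse(Affairs$affairs > 0, 1, 0))

# Drop the raw count so the tree cannot simply read the answer off it
dat <- subset(Affairs, select = -affairs)

# Assumed split: the seed and the 70/30 ratio are illustrative only
set.seed(42)
train_idx <- sample(nrow(dat), size = round(0.7 * nrow(dat)))
train_tree <- dat[train_idx, ]
validation_tree <- dat[-train_idx, ]

# Fit a classification tree on all remaining predictors
fit <- rpart(cheater ~ ., data = train_tree, method = "class")

# Predict classes for the held-out rows and cross-tabulate
validation_tree$cheater_predicted <- predict(fit, validation_tree, type = "class")
table(actual = validation_tree$cheater,
      predicted = validation_tree$cheater_predicted)

# Importance scores for the variables the tree used (NULL if it made no splits);
# one way to check whether gender contributes anything
fit$variable.importance
```

If gender is absent from `fit$variable.importance`, or sits near the bottom of it, that would be consistent with the observation that it is not a good predictor here.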
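On the validation data (176 observations), the model correctly classifies 117 of the 128 non-cheaters (91.4%) but only 13 of the 48 cheaters (27.1%), for an overall accuracy of 130/176 (73.9%):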
##
##
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 176
##
##
##                         | validation_tree$cheater_predicted
## validation_tree$cheater |         0 |         1 | Row Total |
## ------------------------|-----------|-----------|-----------|
##                       0 |       117 |        11 |       128 |
##                         |     0.377 |     2.387 |           |
##                         |     0.914 |     0.086 |     0.727 |
##                         |     0.770 |     0.458 |           |
##                         |     0.665 |     0.062 |           |
## ------------------------|-----------|-----------|-----------|
##                       1 |        35 |        13 |        48 |
##                         |     1.005 |     6.365 |           |
##                         |     0.729 |     0.271 |     0.273 |
##                         |     0.230 |     0.542 |           |
##                         |     0.199 |     0.074 |           |
## ------------------------|-----------|-----------|-----------|
##            Column Total |       152 |        24 |       176 |
##                         |     0.864 |     0.136 |           |
## ------------------------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 10.13359 d.f. = 1 p = 0.001455916
##
## Pearson's Chi-squared test with Yates' continuity correction
## ------------------------------------------------------------
## Chi^2 = 8.624406 d.f. = 1 p = 0.003316886
##
##
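
To make the table easier to interpret, the usual summary metrics can be recomputed from the four cell counts. The sketch below uses base R only and copies the counts from the output above. The names `cm`, `metrics`, and `baseline` are illustrative and not part of the original analysis.

```r
# Confusion matrix copied from the validation output
# (rows = actual class, columns = predicted class)
cm <- matrix(c(117, 11,
                35, 13),
             nrow = 2, byrow = TRUE,
             dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

metrics <- c(
  accuracy    = sum(diag(cm)) / sum(cm),        # (117 + 13) / 176
  sensitivity = cm["1", "1"] / sum(cm["1", ]),  # cheaters correctly flagged
  specificity = cm["0", "0"] / sum(cm["0", ]),  # non-cheaters correctly cleared
  precision   = cm["1", "1"] / sum(cm[, "1"]),  # flagged cases that really are cheaters
  baseline    = max(rowSums(cm)) / sum(cm)      # accuracy of always predicting "0"
)
round(metrics, 3)

# Reproduce the two chi-squared tests reported above
chisq.test(cm, correct = FALSE)  # Pearson
chisq.test(cm, correct = TRUE)   # with Yates' continuity correction
```

The accuracy of about 0.739 is only slightly above the 0.727 obtained by always predicting "not a cheater", and the sensitivity of about 0.271 shows that the tree misses most actual cheaters, even though the chi-squared tests indicate that predictions and actual classes are associated.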