Nathaniel Phillips, Economic Psychology, University of Basel
BaselR Meeting, March 2017




The tree dramatically outperformed the doctors' clinical judgments, producing far fewer false positives and substantial cost savings.
To this day, the tree is still used at the hospital.

A fast and frugal decision tree (FFT) is a very simple, highly restricted decision tree.
In an FFT, each node has exactly two branches, at least one of which is an exit branch (Martignon et al., 2008).
Because of these restrictions, FFTs are even faster and require even less information than standard (non-FFT) decision trees.
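
To make the structure concrete, here is a hand-written sketch (not output from any package) of how an FFT classifies a patient. The cues come from the heart disease data used below, but the thresholds and exit decisions are hypothetical.

# A hand-written FFT sketch with hypothetical thresholds: each node asks one
# question, and at least one answer exits the tree with a decision
classify.patient <- function(thal, cp, thalach) {
  if (thal %in% c("rd", "fd")) return("sick")   # node 1: exit on "sick"
  if (cp != "a") return("healthy")              # node 2: exit on "healthy"
  if (thalach < 150) "sick" else "healthy"      # final node: both branches exit
}

classify.patient(thal = "normal", cp = "a", thalach = 130)
## [1] "sick"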

There is no off-the-shelf method for constructing FFTs from data.
The FFTrees package fills this gap.
# Available on CRAN
install.packages("FFTrees")
# Or the development version from GitHub
devtools::install_github("ndphillips/FFTrees", build_vignettes = TRUE)
library(FFTrees)
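# The heartdisease data: 303 patients, 13 cues, and a binary criterion (diagnosis)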
head(heartdisease)
## age sex cp trestbps chol fbs restecg thalach exang oldpeak slope
## 257 67 0 a 106 223 0 normal 142 0 0.3 up
## 118 35 0 a 138 183 0 normal 182 0 1.4 up
## 242 41 0 aa 126 306 0 normal 163 0 0.0 up
## 12 56 0 aa 140 294 0 hypertrophy 153 0 1.3 flat
## 131 54 1 np 120 258 0 hypertrophy 147 0 0.4 flat
## 108 57 1 np 128 229 0 hypertrophy 150 0 0.4 flat
## ca thal diagnosis
## 257 2 normal 0
## 118 0 normal 0
## 242 0 normal 0
## 12 0 normal 0
## 131 0 rd 0
## 108 1 rd 1
# Step 1: Create training and test data
set.seed(100)
heartdisease <- heartdisease[sample(nrow(heartdisease)),]
heart.train <- heartdisease[1:150,]
heart.test <- heartdisease[151:303,]
# Step 2: Create heart.fft
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heart.train,
                     data.test = heart.test)

# Step 3: Summary statistics
heart.fft
## [1] "7 FFTs using up to 4 of 13 cues"
## [1] "FFT #5 uses 3 cues {thal,thalach,cp} with the following performance:"
## train test
## n 150.00 153.00
## pci 0.88 0.87
## mcu 1.67 1.78
## acc 0.81 0.73
## bacc 0.82 0.73
## sens 0.89 0.81
## spec 0.74 0.66
plot(heart.fft, what = "cues", main = "Heart Disease")
plot(heart.fft, main = "Heart Disease",
     decision.names = c("healthy", "sick"),
     stats = FALSE)
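
Once the trees are built, you can also get classification decisions for new cases from the fitted object. The call below is a sketch: recent versions of FFTrees accept a newdata argument in predict(), but the argument name has changed across versions, so check ?predict.FFTrees for your installed version.

# Predicted classes for the test cases (newdata is an assumption; older
# package versions may use a different argument name)
heart.pred <- predict(heart.fft, newdata = heart.test)
head(heart.pred)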



| dataset | cases | cues | base.rate |
|---|---|---|---|
| arrhythmia | 68 | 280 | 0.29 |
| audiology | 226 | 70 | 0.10 |
| breast | 683 | 10 | 0.35 |
| bridges | 92 | 10 | 0.39 |
| cmc | 1473 | 10 | 0.35 |
Table: 5 of the 10 prediction datasets
The FFTrees algorithm builds each tree in four steps:

1. Calculate a decision threshold for each cue.
2. Select cues.
3. Order the cues.
4. Determine the exit branch at each node.
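
As an illustration of step 1, the sketch below searches for a numeric cue's decision threshold by brute force, scoring each candidate by balanced accuracy (the mean of sensitivity and specificity). This is a simplification of what the package does internally; among other things, FFTrees also searches over cue directions (e.g., <, >, =) and handles factor cues.

# Simplified sketch of step 1: pick the threshold for one numeric cue that
# maximizes balanced accuracy on the training data (direction fixed to ">")
best.threshold <- function(cue, criterion) {
  candidates <- sort(unique(cue))
  bacc <- sapply(candidates, function(thresh) {
    pred <- cue > thresh
    sens <- mean(pred[criterion == 1])   # hit rate
    spec <- mean(!pred[criterion == 0])  # correct rejection rate
    (sens + spec) / 2
  })
  candidates[which.max(bacc)]
}

best.threshold(heart.train$age, heart.train$diagnosis)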
The FFForest() function creates a forest of FFTs by growing trees from random subsets of the data.
You can then examine cue importance and co-occurrence in the resulting FFForest object.
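
A sketch of that workflow is below; the ntree argument name is an assumption, and the FFForest() interface has changed across package versions, so check ?FFForest before running it.

# Grow a forest of FFTs from random subsets of the heart disease data
# (ntree is assumed here; see ?FFForest for the exact arguments)
heart.fforest <- FFForest(formula = diagnosis ~ .,
                          data = heartdisease,
                          ntree = 50)

# The plot method summarizes how often each cue is used (importance) and
# which cues tend to appear together in the same trees (co-occurrence)
plot(heart.fforest)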