The legal system of the United States operates at the state level and at the federal level.
Federal courts hear cases that go beyond the scope of state law.
Federal courts are divided into district courts, which conduct initial trials; circuit courts, which hear appeals; and the Supreme Court of the United States (SCOTUS), the highest level.
Legal academics and political scientists regularly predict SCOTUS decisions through detailed study of the cases and of the individual justices.
In 2002, Andrew Martin, a professor of political science at Washington University in St. Louis, decided instead to predict decisions using a statistical model built from data.
Together with his colleagues, he set out to test this model against a panel of experts.
Martin used a method called Classification and Regression Trees (CART).
Dependent variable: did Justice Stevens vote to reverse the lower court decision? (1 = reverse, 0 = affirm)
CART repeatedly splits the data on the independent variables; in each resulting subset, we have a bucket of observations, which may contain both outcomes (i.e., affirm and reverse), and CART predicts the most frequent outcome in the bucket, as sketched below.
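A minimal illustration of this majority rule on a toy bucket (base R; the data are made up for the example):

bucket = c(1, 1, 0, 1, 0)        # hypothetical outcomes in one bucket
names(which.max(table(bucket)))  # "1": the bucket predicts reverse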
Random forests were designed to improve the prediction accuracy of CART by building a large number of trees.
To make a prediction for a new observation, each tree "votes" on the outcome, and we pick the outcome that receives the majority of the votes.
Each tree can split on only a random subset of the variables.
Each tree is also built from a bootstrapped ("bagged") sample of the data, drawn with replacement, so an observation can appear more than once; for example, with five observations (a sketch follows):
observations 2, 4, 5, 2, 1 -> 1st tree
observations 3, 5, 1, 5, 2 -> 2nd tree
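A minimal sketch of that bagging step in base R (toy indices, not the actual case data):

set.seed(1)
sample(1:5, size = 5, replace = TRUE)  # e.g. bootstrap sample for the 1st tree
sample(1:5, size = 5, replace = TRUE)  # e.g. bootstrap sample for the 2nd tree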
In CART, the value of the minbucket parameter can affect the model's out-of-sample accuracy.
How should we set this parameter? The standard answer is k-fold cross-validation on the training set.
Before, we limited our tree using minbucket; when cross-validating, we instead tune the complexity parameter cp.
A smaller cp leads to a bigger tree (which might overfit), while a larger cp leads to a smaller tree (which might miss real structure); a sketch follows.
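As a sketch of the cp effect, assuming the Train split built in the code below (exact node counts will vary with the data):

library(rpart)
bigTree = rpart(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst,
                data = Train, method = "class", cp = 0.01)
smallTree = rpart(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst,
                  data = Train, method = "class", cp = 0.30)
nrow(bigTree$frame)    # more nodes: a low cp lets the tree keep splitting
nrow(smallTree$frame)  # fewer nodes: a high cp prunes aggressively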
Martin and his colleagues used 628 previous SCOTUS cases decided between 1994 and 2001 to build the model.
They then made predictions for the 68 cases to be decided in the October 2002 term, before the term started.
The experts were asked to predict only within their area of expertise, and more than one expert was assigned to each case.
The experts were allowed to consider any source of information, but were not allowed to communicate with each other about their predictions.
For the 68 cases in the October 2002 term, the model correctly predicted about 75% of the case outcomes, while the experts got about 59% right.
Predicting Supreme Court decisions is very valuable to firms, politicians, and non-governmental organizations.
Here, a model that predicts these decisions was both more accurate and faster than the experts.
# Read in the data
stevens = read.csv("stevens.csv")
# Output structure
str(stevens)
## 'data.frame': 566 obs. of 9 variables:
## $ Docket : Factor w/ 566 levels "00-1011","00-1045",..: 63 69 70 145 97 181 242 289 334 436 ...
## $ Term : int 1994 1994 1994 1994 1995 1995 1996 1997 1997 1999 ...
## $ Circuit : Factor w/ 13 levels "10th","11th",..: 4 11 7 3 9 11 13 11 12 2 ...
## $ Issue : Factor w/ 11 levels "Attorneys","CivilRights",..: 5 5 5 5 9 5 5 5 5 3 ...
## $ Petitioner: Factor w/ 12 levels "AMERICAN.INDIAN",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Respondent: Factor w/ 12 levels "AMERICAN.INDIAN",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ LowerCourt: Factor w/ 2 levels "conser","liberal": 2 2 2 1 1 1 1 1 1 1 ...
## $ Unconst : int 0 0 0 0 0 1 0 1 0 0 ...
## $ Reverse : int 1 1 1 1 1 0 1 1 1 1 ...

# Split the data
library(caTools)
set.seed(3000)
spl = sample.split(stevens$Reverse, SplitRatio = 0.7)
Train = subset(stevens, spl==TRUE)
Test = subset(stevens, spl==FALSE)
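# Optional sanity check (not in the original analysis): sample.split
# stratifies on the outcome, so both sets keep roughly the same reverse rate
mean(Train$Reverse)
mean(Test$Reverse)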
# Load CART tree packages
library(rpart)
library(rpart.plot)

# CART model
StevensTree = rpart(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = Train, method="class", minbucket=25)
# Plot CART tree
prp(StevensTree)

# Make predictions
PredictCART = predict(StevensTree, newdata = Test, type = "class")
z = table(Test$Reverse, PredictCART)
kable(z)

|   |  0 |  1 |
|:--|---:|---:|
| 0 | 41 | 36 |
| 1 | 22 | 71 |

Rows are the actual outcomes; columns are the CART predictions.
# Compute Accuracy
sum(diag(z))/sum(z)
## [1] 0.6588235
# (for reference, always predicting "reverse" would get 93/170 ≈ 0.547 here)

# ROC curve
library(ROCR)
# Make predictions on test set
PredictROC = predict(StevensTree, newdata = Test)
# Show just the first few rows; each row gives the predicted probability
# of affirm (column "0") and reverse (column "1") for one test case
head(PredictROC)
##            0         1
## 1  0.3035714 0.6964286
## 3  0.3035714 0.6964286
## 4  0.4000000 0.6000000
## 6  0.4000000 0.6000000
## 8  0.4000000 0.6000000
## 21 0.3035714 0.6964286
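# Optional aside (not in the original analysis): the class predictions above
# implicitly used a 0.5 cutoff; with these probabilities we can pick another
table(Test$Reverse, PredictROC[,2] >= 0.7)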
# Plot ROC curve
pred = prediction(PredictROC[,2], Test$Reverse)
perf = performance(pred, "tpr", "fpr")
plot(perf)
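# As an aside (not in the original code), ROCR can also report the AUC,
# the area under the curve just plotted
as.numeric(performance(pred, "auc")@y.values)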
# Load randomForest package
library(randomForest)

# Build random forest model
# Reverse is still numeric 0/1 here, so randomForest warns that it is
# doing regression rather than classification
StevensForest = randomForest(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = Train, ntree=200, nodesize=25 )
# Convert outcome to factor
Train$Reverse = as.factor(Train$Reverse)
Test$Reverse = as.factor(Test$Reverse)

# Try again, now as a classification forest
StevensForest = randomForest(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = Train, ntree=200, nodesize=25 )
# Make predictions
PredictForest = predict(StevensForest, newdata = Test)
# Compute Accuracy
z = table(Test$Reverse, PredictForest)
kable(z)

|   |  0 |  1 |
|:--|---:|---:|
| 0 | 42 | 35 |
| 1 | 18 | 75 |
sum(diag(z))/sum(z)
## [1] 0.6882353
# (random forests involve randomness, so this accuracy can vary from run to run)

# Load cross-validation packages
library(caret)
library(e1071)
# Define cross-validation experiment
numFolds = trainControl( method = "cv", number = 10 )
cpGrid = expand.grid( .cp = seq(0.01,0.5,0.01))
# Perform the cross validation
train(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = Train, method = "rpart", trControl = numFolds, tuneGrid = cpGrid )
## CART
##
## 396 samples
## 6 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 356, 356, 357, 356, 356, 357, ...
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa
## 0.01 0.6087821 0.189707219
## 0.02 0.6216667 0.223453071
## 0.03 0.6267949 0.239192228
## 0.04 0.6368590 0.266178297
## 0.05 0.6443590 0.283030759
## 0.06 0.6443590 0.283030759
## 0.07 0.6443590 0.283030759
## 0.08 0.6443590 0.283030759
## 0.09 0.6443590 0.283030759
## 0.10 0.6443590 0.283030759
## 0.11 0.6443590 0.283030759
## 0.12 0.6443590 0.283030759
## 0.13 0.6443590 0.283030759
## 0.14 0.6443590 0.283030759
## 0.15 0.6443590 0.283030759
## 0.16 0.6443590 0.283030759
## 0.17 0.6443590 0.283030759
## 0.18 0.6443590 0.283030759
## 0.19 0.6443590 0.283030759
## 0.20 0.6038462 0.185123111
## 0.21 0.5631410 0.078289037
## 0.22 0.5528846 0.051089037
## 0.23 0.5403846 0.004897294
## 0.24 0.5378846 -0.008808290
## 0.25 0.5378846 -0.008808290
## 0.26 0.5453846 0.000000000
## 0.27 0.5453846 0.000000000
## 0.28 0.5453846 0.000000000
## 0.29 0.5453846 0.000000000
## 0.30 0.5453846 0.000000000
## 0.31 0.5453846 0.000000000
## 0.32 0.5453846 0.000000000
## 0.33 0.5453846 0.000000000
## 0.34 0.5453846 0.000000000
## 0.35 0.5453846 0.000000000
## 0.36 0.5453846 0.000000000
## 0.37 0.5453846 0.000000000
## 0.38 0.5453846 0.000000000
## 0.39 0.5453846 0.000000000
## 0.40 0.5453846 0.000000000
## 0.41 0.5453846 0.000000000
## 0.42 0.5453846 0.000000000
## 0.43 0.5453846 0.000000000
## 0.44 0.5453846 0.000000000
## 0.45 0.5453846 0.000000000
## 0.46 0.5453846 0.000000000
## 0.47 0.5453846 0.000000000
## 0.48 0.5453846 0.000000000
## 0.49 0.5453846 0.000000000
## 0.50 0.5453846 0.000000000
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.19.
# Create a new CART model with a cross-validated cp
# (cp = 0.18 lies in the 0.05-0.19 plateau of top CV accuracy above,
# so it should select the same tree as the reported optimum of 0.19)
StevensTreeCV = rpart(Reverse ~ Circuit + Issue + Petitioner + Respondent + LowerCourt + Unconst, data = Train, method="class", cp = 0.18)
# Make predictions
PredictCV = predict(StevensTreeCV, newdata = Test, type = "class")
z = table(Test$Reverse, PredictCV)
kable(z)

|   |  0 |  1 |
|:--|---:|---:|
| 0 | 59 | 18 |
| 1 | 29 | 64 |
# Compute Accuracy
sum(diag(z))/sum(z)
## [1] 0.7235294
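In this run, the cross-validated tree (72.4% test accuracy) beats both the default CART model (65.9%) and the random forest (68.8%).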