Multiple linear regression is used to show the relationship between a dependent variable and multiple (two or more) independent variables. We will be using this method of analysis on a dataset from R called USJudgeRatings, which contains lawyers’ ratings of state judges in the US Superior Court in 1977.
Formula
A multiple linear regression model follows the formula:
y = b0 + b1x1 + b2x2… + b11*x11
where b0 is the intercept, and b1 to b11 are the coefficients of the variables x to x11.
Exploring the dataset
Firstly, load the dplyr and moderndive packages
library(dplyr)
library(moderndive)
Let’s use glimpse() to look inside the dataset:
glimpse(USJudgeRatings)
## Observations: 43
## Variables: 12
## $ CONT <dbl> 5.7, 6.8, 7.2, 6.8, 7.3, 6.2, 10.6, 7.0, 7.3, 8.2, 7.0, 6...
## $ INTG <dbl> 7.9, 8.9, 8.1, 8.8, 6.4, 8.8, 9.0, 5.9, 8.9, 7.9, 8.0, 8....
## $ DMNR <dbl> 7.7, 8.8, 7.8, 8.5, 4.3, 8.7, 8.9, 4.9, 8.9, 6.7, 7.6, 7....
## $ DILG <dbl> 7.3, 8.5, 7.8, 8.8, 6.5, 8.5, 8.7, 5.1, 8.7, 8.1, 7.4, 7....
## $ CFMG <dbl> 7.1, 7.8, 7.5, 8.3, 6.0, 7.9, 8.5, 5.4, 8.6, 7.9, 7.3, 7....
## $ DECI <dbl> 7.4, 8.1, 7.6, 8.5, 6.2, 8.0, 8.5, 5.9, 8.5, 8.0, 7.5, 7....
## $ PREP <dbl> 7.1, 8.0, 7.5, 8.7, 5.7, 8.1, 8.5, 4.8, 8.4, 7.9, 7.1, 6....
## $ FAMI <dbl> 7.1, 8.0, 7.5, 8.7, 5.7, 8.0, 8.5, 5.1, 8.4, 8.1, 7.2, 7....
## $ ORAL <dbl> 7.1, 7.8, 7.3, 8.4, 5.1, 8.0, 8.6, 4.7, 8.4, 7.7, 7.1, 7....
## $ WRIT <dbl> 7.0, 7.9, 7.4, 8.5, 5.3, 8.0, 8.4, 4.9, 8.5, 7.8, 7.2, 7....
## $ PHYS <dbl> 8.3, 8.5, 7.9, 8.8, 5.5, 8.6, 9.1, 6.8, 8.8, 8.5, 8.4, 6....
## $ RTEN <dbl> 7.8, 8.7, 7.8, 8.7, 4.8, 8.6, 9.0, 5.0, 8.8, 7.9, 7.7, 7....
We can see that the dataset contains 43 observations on the 12 numeric variables, which are:
Analysis
We want to see if there is any correlation between any of the first 11 variables and how much lawyers believe a particular judge is worthy of staying on in the US Superior Court. To do this, we will need to fit a multiple regression model with the lm() function, using RTEN as the dependent variable.
characteristics <- lm(RTEN ~ CONT + INTG + DMNR + DILG + CFMG + DECI + PREP + FAMI + ORAL + WRIT + PHYS, data = USJudgeRatings)
Next, we can get a regression table using the get_regression_table() function.
get_regression_table(characteristics)
## # A tibble: 12 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.12 0.519 -4.08 0 -3.18 -1.06
## 2 CONT 0.013 0.026 0.495 0.624 -0.04 0.066
## 3 INTG 0.365 0.129 2.82 0.008 0.101 0.629
## 4 DMNR 0.125 0.09 1.40 0.172 -0.058 0.308
## 5 DILG 0.067 0.143 0.466 0.644 -0.225 0.358
## 6 CFMG -0.195 0.148 -1.32 0.198 -0.496 0.107
## 7 DECI 0.278 0.138 2.01 0.053 -0.004 0.56
## 8 PREP -0.002 0.24 -0.008 0.994 -0.491 0.488
## 9 FAMI -0.136 0.267 -0.508 0.615 -0.681 0.409
## 10 ORAL 0.548 0.277 1.98 0.057 -0.018 1.11
## 11 WRIT -0.068 0.315 -0.216 0.83 -0.71 0.574
## 12 PHYS 0.269 0.062 4.33 0 0.142 0.396
The table tells us the intercept estimate which has no practical meaning as a negative number. However, it tells us where the RTEN score crosses the x axis - that is, when lawyers’ ratings for a judge’s retention is 0.
We can also see the coefficients for each of the variables, and their p-values. We will be using the conventional p < 0.05 as the threshold test for the statistical significance of variables. The smaller the p-value, the more statistically significant the variable will be.
Refining the regression model
To refine our model, let’s remove the biggest p-value which is PREP (preparation for trial) at 0.994, and run our model again to see how the remaining variables relate to RTEN:
characteristics_1 <- lm(RTEN ~ CONT + INTG + DMNR + DILG + CFMG + DECI + FAMI + ORAL + WRIT + PHYS, data = USJudgeRatings)
get_regression_table(characteristics_1)
## # A tibble: 11 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.12 0.51 -4.15 0 -3.16 -1.08
## 2 CONT 0.013 0.025 0.509 0.615 -0.038 0.064
## 3 INTG 0.365 0.124 2.95 0.006 0.113 0.617
## 4 DMNR 0.125 0.085 1.47 0.152 -0.049 0.299
## 5 DILG 0.066 0.114 0.581 0.566 -0.166 0.298
## 6 CFMG -0.195 0.144 -1.36 0.185 -0.487 0.098
## 7 DECI 0.278 0.135 2.07 0.047 0.004 0.553
## 8 FAMI -0.137 0.213 -0.643 0.525 -0.571 0.297
## 9 ORAL 0.547 0.268 2.04 0.049 0.002 1.09
## 10 WRIT -0.068 0.306 -0.221 0.827 -0.691 0.556
## 11 PHYS 0.269 0.06 4.45 0 0.146 0.392
```
Note the p-values change when we remove a variable due to changes in relationship between the remaining variables. Unfortunately, WRIT at 0.827 is the next largest p-value. You would think sound written rulings would be critical for a judge’s retention rating! Let’s remove it from our model, and run it again.
characteristics_2 <- lm(RTEN ~ CONT + INTG + DMNR + DILG + CFMG + DECI + FAMI + ORAL + PHYS, data = USJudgeRatings)
get_regression_table(characteristics_2)
## # A tibble: 10 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.13 0.502 -4.24 0 -3.15 -1.10
## 2 CONT 0.014 0.024 0.594 0.556 -0.034 0.063
## 3 INTG 0.361 0.121 3.00 0.005 0.116 0.606
## 4 DMNR 0.123 0.083 1.47 0.15 -0.047 0.292
## 5 DILG 0.071 0.11 0.641 0.526 -0.153 0.295
## 6 CFMG -0.198 0.141 -1.40 0.17 -0.485 0.089
## 7 DECI 0.272 0.13 2.10 0.044 0.008 0.536
## 8 FAMI -0.168 0.158 -1.06 0.295 -0.489 0.153
## 9 ORAL 0.516 0.225 2.30 0.028 0.059 0.974
## 10 PHYS 0.274 0.055 5.01 0 0.163 0.386
It looks like CONT or the number of contacts a lawyer has with the judge seems the next largest p-value at 0.556. Let’s see what happens when it’s gone.
characteristics_3 <- lm(RTEN ~ INTG + DMNR + DILG + CFMG + DECI + FAMI + ORAL + PHYS, data = USJudgeRatings)
get_regression_table(characteristics_3)
## # A tibble: 9 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.04 0.478 -4.28 0 -3.02 -1.07
## 2 INTG 0.372 0.118 3.16 0.003 0.132 0.612
## 3 DMNR 0.106 0.078 1.36 0.182 -0.052 0.263
## 4 DILG 0.056 0.106 0.524 0.604 -0.16 0.271
## 5 CFMG -0.165 0.129 -1.28 0.208 -0.426 0.096
## 6 DECI 0.267 0.128 2.08 0.045 0.006 0.527
## 7 FAMI -0.191 0.152 -1.25 0.218 -0.499 0.118
## 8 ORAL 0.541 0.219 2.47 0.019 0.096 0.986
## 9 PHYS 0.27 0.054 5.02 0 0.161 0.379
Then let’s sadly remove diligence DILG which has a p-value of 0.604.
characteristics_4 <- lm(RTEN ~ INTG + DMNR + CFMG + DECI + FAMI + ORAL + PHYS, data = USJudgeRatings)
get_regression_table(characteristics_4)
## # A tibble: 8 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.12 0.451 -4.71 0 -3.04 -1.21
## 2 INTG 0.398 0.106 3.74 0.001 0.182 0.613
## 3 DMNR 0.099 0.076 1.31 0.199 -0.055 0.253
## 4 CFMG -0.137 0.116 -1.18 0.245 -0.372 0.098
## 5 DECI 0.283 0.123 2.30 0.028 0.033 0.532
## 6 FAMI -0.168 0.144 -1.17 0.252 -0.461 0.125
## 7 ORAL 0.526 0.215 2.45 0.02 0.09 0.963
## 8 PHYS 0.264 0.052 5.08 0 0.158 0.369
And then remove familiarity with the law FAMI at 0.252.
characteristics_5 <- lm(RTEN ~ INTG + DMNR + CFMG + DECI + ORAL + PHYS, data = USJudgeRatings)
get_regression_table(characteristics_5)
## # A tibble: 7 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.25 0.44 -5.11 0 -3.14 -1.35
## 2 INTG 0.384 0.106 3.62 0.001 0.168 0.599
## 3 DMNR 0.147 0.064 2.3 0.027 0.017 0.277
## 4 CFMG -0.1 0.112 -0.893 0.378 -0.326 0.127
## 5 DECI 0.249 0.12 2.07 0.045 0.005 0.492
## 6 ORAL 0.306 0.103 2.96 0.005 0.096 0.516
## 7 PHYS 0.289 0.047 6.10 0 0.193 0.385
Next to go is case file management CFMG at 0.378.
characteristics_6 <- lm(RTEN ~ INTG + DMNR + DECI + ORAL + PHYS, data = USJudgeRatings)
get_regression_table(characteristics_6)
## # A tibble: 6 x 7
## term estimate std_error statistic p_value lower_ci upper_ci
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 intercept -2.20 0.436 -5.06 0 -3.09 -1.32
## 2 INTG 0.378 0.106 3.58 0.001 0.164 0.592
## 3 DMNR 0.152 0.064 2.39 0.022 0.023 0.281
## 4 DECI 0.167 0.077 2.16 0.037 0.011 0.323
## 5 ORAL 0.292 0.102 2.86 0.007 0.085 0.498
## 6 PHYS 0.283 0.047 6.05 0 0.188 0.378
Finally, we are left with variables where p < 0.05. And the order of the variables seems pretty odd: the strongest positive correlation between lawyers’ scores on whether judges should stay is related to 1) a judge’s physical strength (which seems perfectly correlated), then 2) their integrity, 3) sound oral rulings, 4) demeanour, and finally 5) prompt decisions. But how do we know this model is actually any good? How well does it predict lawyers’ opinions on whether US Judges should stay?
Ensuring our model is robust
The sum of square residuals measures the lack of fit of a model to a set of points. However, the most appropriate test for a regression with multiple variables is the adjusted R-squared test, using summary.lm().
summary.lm(characteristics_6)
##
## Call:
## lm(formula = RTEN ~ INTG + DMNR + DECI + ORAL + PHYS, data = USJudgeRatings)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.240656 -0.069026 -0.009474 0.068961 0.246402
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.20433 0.43611 -5.055 1.19e-05 ***
## INTG 0.37785 0.10559 3.579 0.000986 ***
## DMNR 0.15199 0.06354 2.392 0.021957 *
## DECI 0.16672 0.07702 2.165 0.036928 *
## ORAL 0.29169 0.10191 2.862 0.006887 **
## PHYS 0.28292 0.04678 6.048 5.40e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1119 on 37 degrees of freedom
## Multiple R-squared: 0.9909, Adjusted R-squared: 0.9897
## F-statistic: 806.1 on 5 and 37 DF, p-value: < 2.2e-16
As you can see, the adjusted R-squared of 0.9897 is very close to one which is the ideal fit. Let’s compare this to if we hadn’t removed the last variable.
summary.lm(characteristics_5)
##
## Call:
## lm(formula = RTEN ~ INTG + DMNR + CFMG + DECI + ORAL + PHYS,
## data = USJudgeRatings)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.24640 -0.06378 -0.00994 0.06123 0.25233
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.24575 0.43976 -5.107 1.08e-05 ***
## INTG 0.38363 0.10607 3.617 0.000907 ***
## DMNR 0.14710 0.06395 2.300 0.027345 *
## CFMG -0.09977 0.11173 -0.893 0.377826
## DECI 0.24884 0.12009 2.072 0.045475 *
## ORAL 0.30632 0.10350 2.960 0.005419 **
## PHYS 0.28924 0.04744 6.097 5.15e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1122 on 36 degrees of freedom
## Multiple R-squared: 0.9911, Adjusted R-squared: 0.9896
## F-statistic: 668.2 on 6 and 36 DF, p-value: < 2.2e-16
The adjusted R-squared is slightly smaller at 0.9896, meaning we were on the right track… Let’s check what would have happened to the adjusted R-squared if we continued to remove the next largest p-value variable DECI or prompt decisions (despite being under the p< 0.05 threshold).
characteristics_7 <- lm(RTEN ~ INTG + DMNR + ORAL + PHYS, data = USJudgeRatings)
summary.lm(characteristics_7)
##
## Call:
## lm(formula = RTEN ~ INTG + DMNR + ORAL + PHYS, data = USJudgeRatings)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.24455 -0.06517 -0.01191 0.05901 0.24095
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.73086 0.39518 -4.380 9.01e-05 ***
## INTG 0.33169 0.10831 3.062 0.00402 **
## DMNR 0.13983 0.06629 2.109 0.04157 *
## ORAL 0.45847 0.06987 6.562 9.70e-08 ***
## PHYS 0.28710 0.04895 5.865 8.73e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1172 on 38 degrees of freedom
## Multiple R-squared: 0.9898, Adjusted R-squared: 0.9887
## F-statistic: 917.4 on 4 and 38 DF, p-value: < 2.2e-16
Conclusion
We can see the p-value drops back down to 0.9887, suggesting our 5 variable model is the best predictor of a retention score for US judges.