Each jurisdiction in the United States, whether a state or territory, regulates its own licensing requirements for the practice of law, and one of the primary requirements is the state bar exam. The exam consists of several components, including the Multistate Bar Examination (MBE), the Multistate Essay Examination (MEE), and the Multistate Performance Test (MPT). These components test a wide range of knowledge and reasoning skills, including but not limited to Constitutional Law, Contracts, Family Law, and Torts.
The objective of this study is to identify the traits that help predict whether a law student will pass the bar examination. The dataset contains student-level records for everyone who sat for the bar exam between 2021 and 2024, including both successful and unsuccessful candidates. The paper uses logistic regression to analyze the effect of academic, preparatory, and demographic factors on the probability that a candidate passed. The findings are intended to inform practical measures the law school can take to improve bar performance.
Exploratory data analysis (EDA) is an essential first step before modeling. It reveals potential problems such as missing values, outliers, or collinearity among the variables. EDA also makes it possible to visualize the distribution of each variable and so choose the most promising predictors for the logistic regression model.
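
The three checks named above (missing values, outliers, collinearity) can be sketched in a few lines of base R. This is a minimal illustration on invented toy data, not the report's dataset: the column names mirror the report, but the values and the helper `is_outlier()` are assumptions made for the example.

``` r
# Hypothetical toy data; column names mirror the report but values are invented
toy <- data.frame(
  LSAT   = c(152, 155, NA, 160, 149, 158),
  UGPA   = c(3.4, 2.8, 3.5, 3.9, 3.1, NA),
  GPA_1L = c(3.2, 2.4, 2.6, 3.8, 2.5, 3.3)
)

# Missing values per column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Pairwise correlations among numeric predictors (a quick collinearity screen)
cor_mat <- cor(toy, use = "pairwise.complete.obs")
print(round(cor_mat, 2))

# Flag values more than 1.5 * IQR beyond the quartiles (a common outlier rule)
is_outlier <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * diff(q)
  !is.na(x) & (x < q[1] - fence | x > q[2] + fence)
}
outlier_counts <- sapply(toy, function(x) sum(is_outlier(x)))
print(outlier_counts)
```

High pairwise correlations (for example between GPA measures) would suggest that some predictors carry overlapping information.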
The data cover multiple academic and preparation variables, including LSAT score, UGPA, GPA_1L, and bar preparation information. Preliminary checks were conducted to understand the distribution of these variables and to identify patterns that could inform the subsequent modeling. The main variables are:
| Variable | Description |
|---|---|
| LSAT | Score on the LSAT entrance examination |
| UGPA | Undergraduate GPA |
| Year | Year the student entered law school |
| CivPro, LPI, LPII | Scores in 1L core courses |
| GPA_1L, GPA_Final | Cumulative GPA at end of 1L year and 3L year, respectively |
| Accommodations | Received accommodations from Student Disability Services (Yes/No) |
| Probation | Ever placed on academic probation (Yes/No) |
| LegalAnalysis_TexasPractice, AdvLegalPerfSkills, AdvLegalAnalysis | Enrollment in various elective courses (Yes/No) |
| BarPrepCompany | Type of bar preparation course taken |
| BarPrepCompletion | Percent of bar prep course completed |
| X.LawSchoolBarPrepWorkshops | Number of bar prep workshops attended |
| StudentSuccessInitiative | Participation in academic support program (Yes/No) |
| BarPrepMentor | Whether the student had a mentor for bar prep (Yes/No) |
| MPRE, MPT, MEE, MBE | Component scores of the bar exam |
| UBE | Composite Uniform Bar Exam score |
| PassFail | Final Outcome - did the student pass the bar exam? (Yes/No) |
Data preprocessing is one of the most important stages for the soundness and robustness of the model. Several cleaning and transformation steps were performed to put the dataset in a form suitable for logistic regression.
The dataset was cleaned by removing rows with a missing PassFail value and then dropping rows with missing values in any remaining variable, which yields a complete-case dataset. Binary categorical variables such as Accommodations and Probation were recoded numerically (0 = No, 1 = Yes). Letter grades in CivPro, LPI, and LPII were converted into factors whose levels are listed from lowest to highest grade; because the code sets ordered = FALSE, the model treats them as nominal categories with one coefficient per grade level. The non-ordinal variable BarPrepCompany was converted to a factor and then to its integer level codes, so it enters the regression as a single numeric term; note that these codes impose an arbitrary numeric ordering on the companies.
Finally, irrelevant predictors and those that could introduce multicollinearity, such as the detailed scores for different bar exam components (e.g., MPRE, MEE), were removed from the dataset to ensure the focus remained on the primary academic and preparatory factors influencing bar passage.
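
For comparison with integer level codes, the sketch below shows what treatment (dummy) coding of a nominal variable looks like. It is an alternative shown for illustration only, not the encoding used in this report, and the company labels other than "Barbri" are hypothetical.

``` r
# Hypothetical company labels, used only for illustration
company <- factor(c("Barbri", "Themis", "Kaplan", "Barbri", "Themis"))

# Integer codes follow alphabetical level order, not any real ranking
codes <- as.numeric(company)
print(codes)

# Treatment (dummy) coding: one 0/1 column per non-reference level
dummies <- model.matrix(~ company)
print(dummies)
```

With dummy coding each company gets its own coefficient relative to the reference level, at the cost of more parameters than a single numeric term.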
#load libraries
library(MASS)
library(tidyverse)
library(ggplot2)
library(corrplot)
library(knitr)
library(broom)
library(dplyr)
library(pROC)
df<-read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/Updated_Bar_Data_For_Review_Final.csv")
# Glimpse: scrollable in preformatted text
cat("### Structure of the Data (glimpse)\n")
cat("<div style='overflow-x: auto; max-width: 100%;'><pre>")
``` r
cat(paste(capture.output(glimpse(df)), collapse = "\n"))
```

```
Rows: 476
Columns: 28
$ Year                        2021, 2021, 2021, 2021, 2021, 2021, 2021, …
$ PassFail                    "F", "F", "F", "F", "F", "F", "F", "F", "F…
$ Age                         29.1, 29.6, 29.0, 36.2, 28.9, 30.8, 29.1, …
$ LSAT                        152, 155, 157, 156, 145, 154, 149, 160, 15…
$ UGPA                        3.42, 2.82, 3.46, 3.13, 3.49, 2.85, 3.43, …
$ CivPro                      "B+", "B+", "C", "D+", "C", "B+", "C", "C"…
$ LPI                         "A", "B", "B", "C", "C+", "F", "C", "C+", …
$ LPII                        "A", "B", "B", "C+", "C+", "CR", "B", "B",…
$ GPA_1L                      3.206, 2.431, 2.620, 2.275, 2.293, 2.538, …
$ GPA_Final                   3.29, 3.20, 2.91, 2.77, 2.90, 2.82, 3.00, …
$ FinalRankPercentile         0.46, 0.33, 0.08, 0.02, 0.08, 0.05, 0.15, …
$ Accommodations              "N", "Y", "N", "N", "N", "N", "N", "N", "N…
$ Probation                   "N", "Y", "N", "Y", "Y", "N", "Y", "Y", "N…
$ LegalAnalysis_TexasPractice "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y…
$ AdvLegalPerfSkills          "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y…
$ AdvLegalAnalysis            "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y…
$ BarPrepCompany              "Barbri", "Barbri", "Barbri", "Barbri", "T…
$ BarPrepCompletion           0.96, 0.98, 0.48, 1.00, 0.77, 0.02, 0.90, …
$ OptIntoWritingGuide         "", "", "", "", "", "", "", "", "", "", ""…
$ X.LawSchoolBarPrepWorkshops 3, 0, 3, 0, 5, 1, 5, 5, 1, 5, 0, 5, 5, 1, …
$ StudentSuccessInitiative    "N", "Cochran", "Smith", "Baldwin", "Baldw…
$ BarPrepMentor               "N", "N", "N", "N", "N", "N", "N", "N", "N…
$ MPRE                        103, 76, 99, 81, 99, NA, 90, 97, 100, 78, …
$ MPT                         3.0, 3.0, 3.0, 2.5, 3.5, 3.0, 2.5, 2.5, 3.…
$ MEE                         2.67, 3.17, 2.67, 3.00, 2.67, 2.00, 3.50, …
$ WrittenScaledScore          125.5, 133.1, 125.5, 125.5, 130.5, 115.4, …
$ MBE                         133.3, 132.7, 118.2, 140.1, 125.4, 113.5, …
$ UBE                         258.8, 265.8, 243.7, 265.6, 255.9, 228.9, …
```

``` r
cat("</pre></div>")
```
</pre></div>
``` r
# Summary: show in scrollable table format
cat("\n\n### Summary Statistics (first 10 columns)\n")
summary_df <- summary(df)
cat("<div style='overflow-x: auto; max-width: 100%;'>")
kable(summary_df, format = "html", table.attr = "style='width:auto;'")
```
| Year | PassFail | Age | LSAT | UGPA | CivPro | LPI | LPII | GPA_1L | GPA_Final | FinalRankPercentile | Accommodations | Probation | LegalAnalysis_TexasPractice | AdvLegalPerfSkills | AdvLegalAnalysis | BarPrepCompany | BarPrepCompletion | OptIntoWritingGuide | X.LawSchoolBarPrepWorkshops | StudentSuccessInitiative | BarPrepMentor | MPRE | MPT | MEE | WrittenScaledScore | MBE | UBE | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Min. :2021 | Length:476 | Min. :23.10 | Min. :141.0 | Min. :2.010 | Length:476 | Length:476 | Length:476 | Min. :2.200 | Min. :2.44 | Min. :0.0000 | Length:476 | Length:476 | Length:476 | Length:476 | Length:476 | Length:476 | Min. :0.0200 | Length:476 | Min. :0.000 | Length:476 | Length:476 | Min. : 76.00 | Min. :1.000 | Min. :2.000 | Min. :111.7 | Min. :103.6 | Min. :227.3 | |
| 1st Qu.:2022 | Class :character | 1st Qu.:26.70 | 1st Qu.:153.0 | 1st Qu.:3.250 | Class :character | Class :character | Class :character | 1st Qu.:2.781 | 1st Qu.:3.05 | 1st Qu.:0.2600 | Class :character | Class :character | Class :character | Class :character | Class :character | Class :character | 1st Qu.:0.8000 | Class :character | 1st Qu.:0.000 | Class :character | Class :character | 1st Qu.: 89.50 | 1st Qu.:3.000 | 1st Qu.:3.330 | 1st Qu.:138.0 | 1st Qu.:138.7 | 1st Qu.:278.5 | |
| Median :2023 | Mode :character | Median :28.20 | Median :156.0 | Median :3.490 | Mode :character | Mode :character | Mode :character | Median :3.083 | Median :3.27 | Median :0.5150 | Mode :character | Mode :character | Mode :character | Mode :character | Mode :character | Mode :character | Median :0.8900 | Mode :character | Median :0.000 | Mode :character | Mode :character | Median : 99.00 | Median :3.500 | Median :3.670 | Median :146.9 | Median :147.1 | Median :293.5 | |
| Mean :2023 | NA | Mean :29.13 | Mean :155.3 | Mean :3.451 | NA | NA | NA | Mean :3.086 | Mean :3.28 | Mean :0.5067 | NA | NA | NA | NA | NA | NA | Mean :0.8635 | NA | Mean :1.532 | NA | NA | Mean : 99.46 | Mean :3.651 | Mean :3.719 | Mean :146.6 | Mean :146.2 | Mean :292.9 | |
| 3rd Qu.:2024 | NA | 3rd Qu.:30.10 | 3rd Qu.:157.0 | 3rd Qu.:3.710 | NA | NA | NA | 3rd Qu.:3.383 | 3rd Qu.:3.52 | 3rd Qu.:0.7500 | NA | NA | NA | NA | NA | NA | 3rd Qu.:0.9800 | NA | 3rd Qu.:3.000 | NA | NA | 3rd Qu.:107.00 | 3rd Qu.:4.000 | 3rd Qu.:4.000 | 3rd Qu.:155.7 | 3rd Qu.:154.0 | 3rd Qu.:306.8 | |
| Max. :2024 | NA | Max. :65.70 | Max. :168.0 | Max. :4.140 | NA | NA | NA | Max. :4.000 | Max. :3.99 | Max. :0.9900 | NA | NA | NA | NA | NA | NA | Max. :1.0000 | NA | Max. :5.000 | NA | NA | Max. :145.00 | Max. :5.500 | Max. :5.330 | Max. :181.2 | Max. :187.9 | Max. :358.7 | |
| NA | NA | NA | NA | NA | NA | NA | NA | NA’s :4 | NA | NA | NA | NA | NA | NA | NA | NA | NA’s :23 | NA | NA | NA | NA | NA’s :273 | NA | NA | NA | NA | NA |
cat("</div>")
# Drop unnecessary score columns
df<-df[,!(names(df) %in% c("Year","MPRE","MPT","MEE","WrittenScaledScore","MBE","UBE"))]
# Drop all-NA columns
df <- df[, colSums(is.na(df)) < nrow(df)]
# Convert binary character variables to 0/1
binary_vars <- c("PassFail", "Accommodations", "Probation",
"LegalAnalysis_TexasPractice", "AdvLegalPerfSkills", "AdvLegalAnalysis")
df[binary_vars] <- lapply(df[binary_vars], function(x) as.numeric(x == "Y" | x == "P"))
# Convert certain character fields to factor
factor_vars <- c("CivPro", "LPI", "LPII", "BarPrepCompany")
df[factor_vars] <- lapply(df[factor_vars], as.factor)
# Process additional binary indicators
df$BarPrepMentor<-as.factor(df$BarPrepMentor)
levels(df$BarPrepMentor)
[1] “AbbeyCoufal” “AmberBeard” “AmberRich”
[4] “AshleyPirtle” “AshleySanders” “BenIvey”
[7] “BrendaJohnson” “BryanGreer” “CadyMello”
[10] “ChrisRhodes” “ClayElliott” “ColeShooter”
[13] “ColleenByrom” “ColleenElbe(Potts)” “ColleenPotts”
[16] “DanielleSaavedra” “DavidHutchens” “DavidRice”
[19] “DeirdreWard” “DenetteVaughn” “DolphWenzel”
[22] “GrantCoffey” “HaleyHickey” “HolleyMcDaniel”
[25] “HollyHaseloff” “HoltonWestbrook” “JacquelynnMayes”
[28] “JessicaAycock” “JohnMoore” “JordanChavez”
[31] “JosephAustin” “JulieDavis” “JustinPlescha”
[34] “KathleenGoegel” “KatyCrocker” “KimberlyKelley”
[37] “LauraFidelie” “LauraMcDivitt” “LaurenWelch”
[40] “LeenaAl-Souki” “MadelynDeviney” “MariaOviedo”
[43] “MelissaWaggoner” “MerylBenham” “MichaelEconomidis”
[46] “MikelaBryant” “MistyPratt” “MonicaReyes”
[49] “N” “PaulaMilan” “PaulaMillan”
[52] “PaulBarkhurst” “QuentinWetsel” “RebekahLuna”
[55] “RebekaLuna” “ReidLollis” “SaraThornton”
[58] “ScottKeffer” “ScoutBlosser” “TasiaEaslon”
[61] “TomHall” “TravisWeibold” “TylynnPayne”
[64] “VictoriaWhitehead” “VictorMellinger” “WilliamWells”
[67] “WillRaftis” “Y-DanielleSaavedra”
df$BarPrepMentor<-as.character(df$BarPrepMentor)
df$BarPrepMentor<-as.numeric(ifelse(df$BarPrepMentor=="N",0,1))
df$StudentSuccessInitiative<-as.factor(df$StudentSuccessInitiative)
levels(df$StudentSuccessInitiative)
[1] “Arrington” “Aycock” “Baldwin” “Beyer” “Chapman”
[6] “Christopher” “Cochran” “Corn” “Gonzalez” “Hardberger”
[11] “Humphrey” “Keffer” “Lauriat” “Lux” “McDonald”
[16] “N” “Rosen” “RSherwin” “Saavedra” “Sherwin”
[21] “Smith” “Stafford”
df$StudentSuccessInitiative<-as.numeric(ifelse(df$StudentSuccessInitiative=="N",0,1))
df$OptIntoWritingGuide<-as.factor(df$OptIntoWritingGuide)
levels(df$OptIntoWritingGuide)
[1] “” “N” “Y”
df$OptIntoWritingGuide<-ifelse(df$OptIntoWritingGuide=="",0,ifelse(df$OptIntoWritingGuide=="Y",1,0))
df$OptIntoWritingGuide<-as.numeric(df$OptIntoWritingGuide)
# Convert non-ordinal factor to numeric
df$BarPrepCompany <- as.numeric(df$BarPrepCompany)
# Define and convert ordinal grades
grade_levels <- c("", "F", "D", "D+", "C", "C+", "B", "B+", "A")
df$CivPro <- factor(df$CivPro, levels = grade_levels, ordered = FALSE)
df$LPI <- factor(df$LPI, levels = grade_levels, ordered = FALSE)
# Clean and convert LPII
df$LPII <- gsub("^CR$", "C", as.character(df$LPII))
df$LPII <- factor(df$LPII, levels = grade_levels[2:9], ordered = FALSE) # remove "" from LPII if not used
# Drop rows with missing PassFail
df <- df %>% filter(!is.na(PassFail))
# Complete case filter
df <- df[complete.cases(df), ]
The distribution of the letter-grade variables was examined through bar plots showing the frequency of each grade in CivPro, LPI, and LPII. These variables, which represent grades in key 1L courses, enter the model as factors. The analysis revealed that most students received grades in the higher range (B+ to A), with very few students in the lowest grade categories.
This distribution matters for modeling. Bar plots of grades alone do not show how grades relate to bar passage; that relationship is estimated by the regression. Sparse low-grade categories also leave little information for estimating their coefficients, which is consistent with the very large standard errors reported for the CivPro, LPI, and LPII levels in the full model. The grade levels are listed in order from F to A, but because the factors are created with ordered = FALSE, the model estimates a separate effect for each grade rather than a trend across grades.
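
The practical difference between an unordered and an ordered factor shows up in the design matrix R builds for the model. The sketch below uses invented grade values and the variable names `unordered` and `ordered_f`, which are assumptions for illustration; it relies only on R's default contrasts.

``` r
grade_levels <- c("F", "D", "D+", "C", "C+", "B", "B+", "A")
grades <- c("B+", "C", "A", "B", "C+", "D", "F", "D+")  # invented values

unordered <- factor(grades, levels = grade_levels, ordered = FALSE)
ordered_f <- factor(grades, levels = grade_levels, ordered = TRUE)

# Unordered factors get one dummy column per non-reference level
cols_unordered <- colnames(model.matrix(~ unordered))
print(cols_unordered)

# Ordered factors get polynomial contrasts (.L, .Q, .C, ^4, ...)
cols_ordered <- colnames(model.matrix(~ ordered_f))
print(cols_ordered)
```

Coefficient names such as CivProD+ or LPIB in a model summary indicate the dummy-coded (unordered) form, whereas names ending in .L or .Q would indicate polynomial contrasts from an ordered factor.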
# Select character or factor columns
cat_vars <- df %>% select(where(~ is.character(.x) || is.factor(.x)))
# Convert to long format
cat_long <- cat_vars %>%
  pivot_longer(cols = everything(), names_to = "Variable", values_to = "Value") %>%
  filter(Value != "") # Optional: remove blanks
# Plot
ggplot(cat_long, aes(x = Value)) +
  geom_bar(fill = "lightcoral", color = "black") +
  facet_wrap(~ Variable, scales = "free_x", ncol = 3) +
  labs(
    title = "Distributions of Ordinal Categorical Variables",
    x = "Category",
    y = "Count"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    strip.text = element_text(face = "bold")
  )
# Select only numeric columns
num_vars <- df %>% select(where(is.numeric))
# Keep only numeric columns with more than 2 unique values (i.e., exclude binary 0/1)
non_binary_vars <- num_vars %>%
  select(where(~ length(unique(na.omit(.))) > 2))
# Pivot to long format
pv_long <- pivot_longer(non_binary_vars, cols = everything(), names_to = "Variable", values_to = "Value")
The distribution of continuous variables, such as LSAT, UGPA, and GPA_1L, was assessed through histograms. LSAT scores were approximately normally distributed, with the bulk of scores sitting slightly above the mean (in the summary table above, mean 155.3 and median 156.0). UGPA and GPA_1L were also roughly bell-shaped. UGPA showed a slight left (negative) skew: its mean (3.451) falls below its median (3.490), reflecting a concentration of students with relatively high undergraduate GPAs and a longer tail toward lower values.
Understanding these distributions is important because the continuous variables are primary predictors in the logistic regression model. LSAT and GPA_1L enter the model directly as numeric terms in the log-odds of bar passage, and their distributions show no extreme skew that would call for transformation. Whether they are significant predictors is determined by the fitted model, not by their distributions.
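
Skew direction can be checked numerically rather than read off a histogram. The sketch below defines a moment-based skewness helper, `sample_skewness()`, which is a hypothetical function written for this example, and applies it to invented GPA-like values.

``` r
# Moment-based sample skewness:
# negative = longer left tail, positive = longer right tail
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

# Invented GPA-like values with a long tail toward low scores
gpa_like <- c(2.0, 3.2, 3.4, 3.5, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0)

skew <- sample_skewness(gpa_like)
print(skew)

# A mean below the median is a quick informal sign of left skew
print(mean(gpa_like) < median(gpa_like))
```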
library(dplyr)
# Plot histograms
ggplot(pv_long, aes(x = Value)) +
  geom_histogram(bins = 30, color = "black", fill = "lightblue") +
  facet_wrap(~ Variable, scales = "free", ncol = 3) +
  labs(title = "Histograms of Non-Binary Numeric Variables", x = "Value", y = "Frequency") +
  theme_minimal()
Bar plots were used to examine the distribution of binary factors such as Accommodations and Probation. The two categories (0 = No, 1 = Yes) were fairly evenly represented, with a slight skew toward 0 (No) for Accommodations and toward 1 (Yes) for Probation.
These binary predictors describe student characteristics that could affect bar passage. The skew in the Accommodations distribution does not by itself show that accommodated students are more likely to fail: a marginal distribution says nothing about the outcome. In the full model reported below, the Accommodations coefficient is small and not statistically significant (estimate 0.152, p = 0.785). The variable is still included as a candidate predictor so that the regression can test whether it helps identify at-risk students.
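
To see whether a binary predictor is associated with the outcome, the predictor has to be cross-tabulated against PassFail. The sketch below does this on invented toy data; the values are assumptions and do not come from the bar dataset.

``` r
# Invented toy data: 1 = yes / pass, 0 = no / fail
toy <- data.frame(
  Accommodations = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1),
  PassFail       = c(1, 1, 1, 1, 0, 1, 1, 0, 1, 1)
)

# Row-wise proportions: pass rate within each accommodation group
tab <- table(Accommodations = toy$Accommodations, PassFail = toy$PassFail)
pass_rates <- prop.table(tab, margin = 1)
print(round(pass_rates, 2))

# Fisher's exact test suits small cell counts
ft <- fisher.test(tab)
print(ft$p.value)
```

Comparing pass rates across groups, rather than group sizes, is what speaks to differential risk; the regression then adjusts that comparison for the other predictors.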
# Step 1: Select numeric columns
num_vars <- df %>% select(where(is.numeric))
# Step 2: Keep only binary fields (0/1 or only 2 unique values)
binary_vars <- num_vars %>%
  select(where(~ all(na.omit(.) %in% c(0, 1)) && length(unique(na.omit(.))) == 2))
# Step 3: Convert to long format for plotting
binary_long <- pivot_longer(binary_vars, cols = everything(), names_to = "Variable", values_to = "Value")
# Step 4: Plot bar charts
ggplot(binary_long, aes(x = factor(Value))) +
  geom_bar(fill = "lightgreen", color = "black") +
  facet_wrap(~ Variable, scales = "free", ncol = 3) +
  labs(title = "Bar Plots of Binary (0/1) Variables", x = "Value", y = "Count") +
  theme_minimal()
A binomial generalized linear model (GLM) was employed for this analysis, as the outcome variable, PassFail, is binary. Logistic regression, a type of binomial GLM, is well suited to binary outcomes because it estimates the probability of the outcome as a function of the predictor variables. The model uses the logit link function to relate the log-odds of passing the bar exam to the predictors.
The binomial distribution is appropriate for this scenario, as the bar exam outcome (pass/fail) is a dichotomous variable. The logistic regression model allows for the estimation of the log-odds, which can then be transformed into probabilities. The coefficients of the logistic regression model represent the change in the log-odds of passing the bar exam for each unit increase in the predictor variables.
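
The transformations described above can be made concrete. The sketch below exponentiates the LSAT coefficient reported in this report's full-model summary to obtain an odds ratio, and converts a few arbitrary log-odds values to probabilities with the inverse logit. The log-odds values are illustrative assumptions, not model predictions.

``` r
# Coefficient for LSAT as reported in the full-model summary
beta_lsat <- 0.21356

# Odds ratio per one-point LSAT increase, other predictors held fixed
odds_ratio <- exp(beta_lsat)
print(round(odds_ratio, 3))

# Converting log-odds to probabilities with the inverse logit
log_odds <- c(-2, 0, 2)        # arbitrary illustrative values
prob <- plogis(log_odds)       # same as 1 / (1 + exp(-log_odds))
print(round(prob, 3))
```

An odds ratio above 1 means the odds of passing rise with the predictor; a log-odds of 0 corresponds to a probability of 0.5.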
As a consequence of not using a typical linear regression model, model selection is guided by a set of alternative criteria tailored to the characteristics of generalized linear models, as outlined below:
Akaike Information Criterion (AIC): A measure of model quality that balances goodness of fit against model complexity. In the formula below, log(L) is the model’s maximized log-likelihood, which indicates how well the model fits the data, and K is the number of estimated parameters (including the intercept). The first term, -2 log(L), rewards a good fit, while the second term, 2K, penalizes variables that do not improve the likelihood enough to justify their inclusion. When candidate models are compared, the one with the lower AIC is preferred.
\[ \mathrm{AIC} = -2\log(L) + 2K \]
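
As a worked check of the formula, the sketch below recomputes the AIC of the full model from figures printed in its summary output. It assumes that the residual deviance equals -2 log(L), which holds for ungrouped 0/1 responses; the helper name `aic_from_loglik()` is hypothetical.

``` r
# AIC = -2 * log(L) + 2 * K
aic_from_loglik <- function(log_lik, k) -2 * log_lik + 2 * k

# Values read from the full-model output in this report
residual_deviance <- 172.79   # equals -2 * log(L) for 0/1 responses
n_obs <- 447 + 1              # null degrees of freedom + 1
k     <- n_obs - 410          # observations minus residual degrees of freedom

aic <- aic_from_loglik(-residual_deviance / 2, k)
print(k)
print(aic)
```

The result matches the AIC of 248.79 reported by `summary(mod)`, with K = 38 estimated coefficients.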
Stepwise Model Selection: Stepwise Regression Methods
Stepwise regression was employed to select the most significant predictors for the final model. The three main approaches to stepwise selection are forward selection, backward elimination, and both directions (bidirectional stepwise). Each approach is explained below:
The first model was built with all available predictors. The coefficients from this initial model indicated the importance of each candidate predictor. The stepwise procedure then removed variables whose exclusion lowered the AIC, which yields a more parsimonious model and guards against overfitting.
# Run logistic regression
mod <- glm(PassFail ~ ., data = df, family = binomial(link = "logit"))
summary(mod)
##
## Call:
## glm(formula = PassFail ~ ., family = binomial(link = "logit"),
## data = df)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -48.18643 5839.73804 -0.008 0.993416
## Age -0.10573 0.04851 -2.180 0.029275 *
## LSAT 0.21356 0.07175 2.976 0.002917 **
## UGPA 0.99046 0.70055 1.414 0.157413
## CivProD -30.28588 5594.88415 -0.005 0.995681
## CivProD+ -11.20839 3956.18083 -0.003 0.997739
## CivProC -11.94402 3956.18072 -0.003 0.997591
## CivProC+ -12.26230 3956.18067 -0.003 0.997527
## CivProB -11.25721 3956.18069 -0.003 0.997730
## CivProB+ -11.72957 3956.18069 -0.003 0.997634
## CivProA -11.59134 3956.18079 -0.003 0.997662
## LPID 17.05356 3956.18119 0.004 0.996561
## LPID+ 13.48904 3956.18071 0.003 0.997280
## LPIC 13.15715 3956.18066 0.003 0.997346
## LPIC+ 12.22904 3956.18061 0.003 0.997534
## LPIB 13.05641 3956.18059 0.003 0.997367
## LPIB+ 11.58617 3956.18066 0.003 0.997663
## LPIA 11.07918 3956.18072 0.003 0.997766
## LPIID+ -3.09390 2363.64903 -0.001 0.998956
## LPIIC -20.66144 1673.17729 -0.012 0.990147
## LPIIC+ -20.07982 1673.17718 -0.012 0.990425
## LPIIB -21.25965 1673.17732 -0.013 0.989862
## LPIIB+ -20.67828 1673.17733 -0.012 0.990139
## LPIIA -21.60781 1673.17747 -0.013 0.989696
## GPA_1L 1.54144 1.33387 1.156 0.247839
## GPA_Final 9.00455 3.57983 2.515 0.011891 *
## FinalRankPercentile -3.95262 4.24611 -0.931 0.351915
## Accommodations 0.15234 0.55712 0.273 0.784514
## Probation -0.43399 0.68563 -0.633 0.526747
## LegalAnalysis_TexasPractice -1.32937 0.71630 -1.856 0.063471 .
## AdvLegalPerfSkills -0.42303 0.78154 -0.541 0.588318
## AdvLegalAnalysis 0.34144 0.62192 0.549 0.582999
## BarPrepCompany 0.36060 0.10759 3.352 0.000804 ***
## BarPrepCompletion 6.62858 1.50306 4.410 1.03e-05 ***
## OptIntoWritingGuide -0.30474 0.58143 -0.524 0.600196
## X.LawSchoolBarPrepWorkshops 0.04322 0.11808 0.366 0.714377
## StudentSuccessInitiative 0.16120 0.68908 0.234 0.815037
## BarPrepMentor -0.70051 0.60596 -1.156 0.247661
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 172.79 on 410 degrees of freedom
## AIC: 248.79
##
## Number of Fisher Scoring iterations: 16
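
The very large standard errors on the grade-level coefficients above (in the thousands) and the 16 Fisher scoring iterations are consistent with quasi-complete separation, where some factor level contains only passes or only failures. The sketch below reproduces that pattern on invented data; it is an illustration of the symptom, not a diagnosis of the bar dataset.

``` r
# Invented data: every student in grade group "A" passes
toy <- data.frame(
  grade = factor(rep(c("C", "B", "A"), each = 10), levels = c("C", "B", "A")),
  pass  = c(rep(c(0, 1), 5),            # C: mixed outcomes
            rep(c(0, 1, 1, 1, 1), 2),   # B: mixed outcomes
            rep(1, 10))                 # A: all pass
)

# Cross-tabulation exposes the empty cell (no failures among "A")
print(table(toy$grade, toy$pass))

# glm() may warn about fitted probabilities of 0 or 1; suppressed here
fit <- suppressWarnings(glm(pass ~ grade, data = toy, family = binomial))
se  <- summary(fit)$coefficients[, "Std. Error"]
print(se)
```

Cross-tabulating each grade variable against PassFail in the real data would show whether empty cells are behind the unstable estimates.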
Forward selection begins with no predictors in the model and adds variables one at a time, choosing at each step the variable that most improves the model’s fit (usually measured by AIC or p-value). It stops when no further addition provides a substantial improvement.
Results of Forward Selection: The final model selected LSAT, BarPrepCompletion, BarPrepCompany, GPA_Final, and LegalAnalysis_TexasPractice. This approach highlighted the importance of these variables but could have missed other potential important interactions between predictors.
Backward elimination starts with all predictors and removes the least significant ones, one at a time, based on p-value or AIC, until only significant variables remain. This method is more suitable when most predictors are expected to be important, but it can be prone to retaining collinear or redundant variables.
Results of Backward Elimination: The final model retained LSAT, BarPrepCompletion, BarPrepCompany, and GPA_Final. The variable LegalAnalysis_TexasPractice was removed, indicating that its inclusion did not contribute significantly to explaining the outcome once other variables were accounted for.
The bidirectional stepwise method combines both forward and backward selection, adding and removing variables at each step based on their significance. This approach tends to result in a more balanced model, as it allows for the inclusion of variables that improve the model while simultaneously removing those that do not.
Results of Both Directions: A final model that combined the selection conducted in both directions included LSAT, BarPrepCompletion, BarPrepCompany, GPA_Final, and LegalAnalysis_TexasPractice. This approach provided the most balanced model by considering both the addition and removal of predictors, ultimately resulting in a robust and well-fit model.
Given the results of each stepwise method, both directions (bidirectional stepwise) is recommended for this analysis. This approach balances the strengths of both forward and backward selection, ensuring that the most relevant predictors are included while removing those that add little to the model. This method is particularly useful when there are potentially significant interactions between predictors or when the set of predictors is moderately large.
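
The three directions map onto the `direction` argument of `step()`. The sketch below runs all three on R's built-in `mtcars` data, which stands in for the bar data purely for illustration; the outcome `am` and the predictors `wt` and `hp` are assumptions chosen because they ship with R.

``` r
# Built-in mtcars data stands in for the bar data; `am` is a 0/1 outcome
null_mod <- glm(am ~ 1, data = mtcars, family = binomial)
full_mod <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Forward: start from the intercept-only model and add terms up to `upper`
fwd <- step(null_mod, scope = list(lower = ~ 1, upper = ~ wt + hp),
            direction = "forward", trace = 0)

# Backward: start from the full model and drop terms
bwd <- step(full_mod, direction = "backward", trace = 0)

# Both: terms may be added or dropped at each step
both <- step(null_mod, scope = list(lower = ~ 1, upper = ~ wt + hp),
             direction = "both", trace = 0)

aics <- sapply(list(forward = fwd, backward = bwd, both = both), AIC)
print(aics)
```

Forward and bidirectional searches need an explicit `scope` because they start from a model that does not yet contain the candidate terms; backward elimination can use the full model's own terms.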
selected_model1 <- step(mod, direction = "backward")
## Start: AIC=248.79
## PassFail ~ Age + LSAT + UGPA + CivPro + LPI + LPII + GPA_1L +
## GPA_Final + FinalRankPercentile + Accommodations + Probation +
## LegalAnalysis_TexasPractice + AdvLegalPerfSkills + AdvLegalAnalysis +
## BarPrepCompany + BarPrepCompletion + OptIntoWritingGuide +
## X.LawSchoolBarPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor
##
## Df Deviance AIC
## - CivPro 7 178.46 240.46
## - LPI 7 184.10 246.10
## - StudentSuccessInitiative 1 172.84 246.84
## - Accommodations 1 172.86 246.86
## - X.LawSchoolBarPrepWorkshops 1 172.92 246.92
## - OptIntoWritingGuide 1 173.06 247.06
## - AdvLegalPerfSkills 1 173.08 247.08
## - AdvLegalAnalysis 1 173.09 247.09
## - Probation 1 173.19 247.19
## - FinalRankPercentile 1 173.66 247.66
## - BarPrepMentor 1 174.11 248.11
## - GPA_1L 1 174.16 248.16
## - UGPA 1 174.78 248.78
## <none> 172.79 248.79
## - LegalAnalysis_TexasPractice 1 176.36 250.36
## - Age 1 177.14 251.14
## - LPII 6 187.96 251.96
## - GPA_Final 1 179.51 253.51
## - LSAT 1 182.69 256.69
## - BarPrepCompany 1 185.72 259.72
## - BarPrepCompletion 1 195.92 269.92
##
## Step: AIC=240.46
## PassFail ~ Age + LSAT + UGPA + LPI + LPII + GPA_1L + GPA_Final +
## FinalRankPercentile + Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + OptIntoWritingGuide + X.LawSchoolBarPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor
##
## Df Deviance AIC
## - LPI 7 189.34 237.34
## - X.LawSchoolBarPrepWorkshops 1 178.46 238.46
## - Accommodations 1 178.48 238.48
## - OptIntoWritingGuide 1 178.51 238.51
## - StudentSuccessInitiative 1 178.57 238.57
## - AdvLegalAnalysis 1 178.78 238.78
## - Probation 1 178.95 238.95
## - AdvLegalPerfSkills 1 179.00 239.00
## - BarPrepMentor 1 179.13 239.13
## - UGPA 1 179.51 239.51
## - FinalRankPercentile 1 179.63 239.63
## <none> 178.46 240.46
## - GPA_1L 1 180.82 240.82
## - LegalAnalysis_TexasPractice 1 181.02 241.02
## - Age 1 182.66 242.66
## - LPII 6 195.79 245.79
## - GPA_Final 1 185.90 245.90
## - LSAT 1 186.76 246.76
## - BarPrepCompany 1 189.96 249.96
## - BarPrepCompletion 1 201.56 261.56
##
## Step: AIC=237.34
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + OptIntoWritingGuide + X.LawSchoolBarPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor
##
## Df Deviance AIC
## - X.LawSchoolBarPrepWorkshops 1 189.35 235.35
## - OptIntoWritingGuide 1 189.40 235.40
## - Accommodations 1 189.40 235.40
## - StudentSuccessInitiative 1 189.45 235.45
## - AdvLegalPerfSkills 1 189.59 235.59
## - FinalRankPercentile 1 189.68 235.68
## - Probation 1 189.83 235.83
## - AdvLegalAnalysis 1 190.07 236.07
## - BarPrepMentor 1 190.11 236.11
## - UGPA 1 190.32 236.32
## - GPA_1L 1 190.34 236.34
## - Age 1 191.07 237.07
## <none> 189.34 237.34
## - LegalAnalysis_TexasPractice 1 192.09 238.09
## - GPA_Final 1 193.45 239.45
## - LPII 6 206.84 242.84
## - LSAT 1 199.54 245.54
## - BarPrepCompany 1 201.37 247.37
## - BarPrepCompletion 1 215.28 261.28
##
## Step: AIC=235.36
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + OptIntoWritingGuide + StudentSuccessInitiative +
## BarPrepMentor
##
## Df Deviance AIC
## - OptIntoWritingGuide 1 189.42 233.42
## - Accommodations 1 189.43 233.43
## - StudentSuccessInitiative 1 189.49 233.49
## - AdvLegalPerfSkills 1 189.62 233.62
## - FinalRankPercentile 1 189.69 233.69
## - Probation 1 189.87 233.87
## - AdvLegalAnalysis 1 190.07 234.07
## - BarPrepMentor 1 190.21 234.21
## - UGPA 1 190.32 234.32
## - GPA_1L 1 190.43 234.43
## - Age 1 191.16 235.16
## <none> 189.35 235.35
## - LegalAnalysis_TexasPractice 1 192.16 236.16
## - GPA_Final 1 193.47 237.47
## - LPII 6 206.89 240.89
## - LSAT 1 199.68 243.68
## - BarPrepCompany 1 201.38 245.38
## - BarPrepCompletion 1 215.43 259.43
##
## Step: AIC=233.42
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + StudentSuccessInitiative + BarPrepMentor
##
## Df Deviance AIC
## - Accommodations 1 189.49 231.49
## - StudentSuccessInitiative 1 189.57 231.57
## - AdvLegalPerfSkills 1 189.63 231.63
## - FinalRankPercentile 1 189.86 231.86
## - Probation 1 189.90 231.90
## - AdvLegalAnalysis 1 190.14 232.14
## - BarPrepMentor 1 190.29 232.29
## - UGPA 1 190.35 232.35
## - GPA_1L 1 190.62 232.62
## - Age 1 191.17 233.17
## <none> 189.42 233.42
## - LegalAnalysis_TexasPractice 1 192.26 234.26
## - GPA_Final 1 193.97 235.97
## - LPII 6 207.07 239.07
## - LSAT 1 199.96 241.96
## - BarPrepCompany 1 201.49 243.49
## - BarPrepCompletion 1 215.45 257.45
##
## Step: AIC=231.49
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Probation + LegalAnalysis_TexasPractice + AdvLegalPerfSkills +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion + StudentSuccessInitiative +
## BarPrepMentor
##
## Df Deviance AIC
## - StudentSuccessInitiative 1 189.65 229.65
## - AdvLegalPerfSkills 1 189.69 229.69
## - FinalRankPercentile 1 189.91 229.91
## - Probation 1 189.96 229.96
## - AdvLegalAnalysis 1 190.18 230.18
## - BarPrepMentor 1 190.38 230.38
## - UGPA 1 190.43 230.43
## - GPA_1L 1 190.65 230.65
## - Age 1 191.20 231.20
## <none> 189.49 231.49
## - LegalAnalysis_TexasPractice 1 192.44 232.44
## - GPA_Final 1 194.03 234.03
## - LPII 6 207.11 237.11
## - LSAT 1 199.96 239.96
## - BarPrepCompany 1 201.56 241.56
## - BarPrepCompletion 1 215.49 255.49
##
## Step: AIC=229.65
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Probation + LegalAnalysis_TexasPractice + AdvLegalPerfSkills +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - AdvLegalPerfSkills 1 189.88 227.88
## - FinalRankPercentile 1 189.96 227.96
## - Probation 1 190.05 228.05
## - AdvLegalAnalysis 1 190.37 228.37
## - BarPrepMentor 1 190.51 228.51
## - UGPA 1 190.56 228.56
## - GPA_1L 1 191.04 229.04
## - Age 1 191.35 229.35
## <none> 189.65 229.65
## - LegalAnalysis_TexasPractice 1 192.72 230.72
## - GPA_Final 1 194.12 232.12
## - LPII 6 207.38 235.38
## - LSAT 1 200.03 238.03
## - BarPrepCompany 1 201.75 239.75
## - BarPrepCompletion 1 215.99 253.99
##
## Step: AIC=227.88
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Probation + LegalAnalysis_TexasPractice + AdvLegalAnalysis +
## BarPrepCompany + BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - FinalRankPercentile 1 190.10 226.10
## - Probation 1 190.27 226.27
## - AdvLegalAnalysis 1 190.38 226.38
## - BarPrepMentor 1 190.67 226.67
## - UGPA 1 190.85 226.85
## - GPA_1L 1 191.25 227.25
## - Age 1 191.87 227.87
## <none> 189.88 227.88
## - GPA_Final 1 194.14 230.14
## - LegalAnalysis_TexasPractice 1 195.77 231.77
## - LPII 6 207.64 233.64
## - LSAT 1 200.58 236.58
## - BarPrepCompany 1 201.86 237.86
## - BarPrepCompletion 1 216.10 252.10
##
## Step: AIC=226.1
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + Probation +
## LegalAnalysis_TexasPractice + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - Probation 1 190.51 224.51
## - AdvLegalAnalysis 1 190.80 224.80
## - BarPrepMentor 1 190.95 224.95
## - UGPA 1 190.96 224.96
## - GPA_1L 1 191.29 225.29
## - Age 1 192.08 226.08
## <none> 190.10 226.10
## - LegalAnalysis_TexasPractice 1 195.81 229.81
## - LPII 6 207.68 231.68
## - LSAT 1 200.59 234.59
## - BarPrepCompany 1 202.25 236.25
## - GPA_Final 1 205.00 239.00
## - BarPrepCompletion 1 216.10 250.10
##
## Step: AIC=224.51
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - BarPrepMentor 1 191.23 223.23
## - AdvLegalAnalysis 1 191.24 223.24
## - UGPA 1 191.31 223.31
## <none> 190.51 224.51
## - Age 1 192.64 224.64
## - GPA_1L 1 192.97 224.97
## - LegalAnalysis_TexasPractice 1 196.65 228.65
## - LPII 6 207.68 229.68
## - LSAT 1 201.03 233.03
## - BarPrepCompany 1 202.40 234.40
## - GPA_Final 1 205.00 237.00
## - BarPrepCompletion 1 216.29 248.29
##
## Step: AIC=223.23
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion
##
## Df Deviance AIC
## - UGPA 1 191.93 221.93
## - AdvLegalAnalysis 1 192.10 222.10
## - Age 1 193.22 223.22
## <none> 191.23 223.23
## - GPA_1L 1 194.14 224.14
## - LegalAnalysis_TexasPractice 1 197.46 227.46
## - LPII 6 208.52 228.52
## - LSAT 1 201.61 231.61
## - BarPrepCompany 1 202.62 232.62
## - GPA_Final 1 205.05 235.05
## - BarPrepCompletion 1 216.39 246.39
##
## Step: AIC=221.93
## PassFail ~ Age + LSAT + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion
##
## Df Deviance AIC
## - AdvLegalAnalysis 1 192.56 220.56
## <none> 191.93 221.93
## - GPA_1L 1 194.93 222.93
## - Age 1 195.03 223.03
## - LegalAnalysis_TexasPractice 1 197.69 225.69
## - LPII 6 209.15 227.15
## - LSAT 1 201.66 229.66
## - BarPrepCompany 1 203.61 231.61
## - GPA_Final 1 206.00 234.00
## - BarPrepCompletion 1 217.60 245.60
##
## Step: AIC=220.56
## PassFail ~ Age + LSAT + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## BarPrepCompany + BarPrepCompletion
##
## Df Deviance AIC
## <none> 192.56 220.56
## - Age 1 195.35 221.35
## - GPA_1L 1 195.72 221.72
## - LegalAnalysis_TexasPractice 1 197.90 223.90
## - LPII 6 209.34 225.34
## - LSAT 1 202.33 228.33
## - BarPrepCompany 1 204.78 230.78
## - GPA_Final 1 207.34 233.34
## - BarPrepCompletion 1 218.04 244.04
summary(selected_model1)
##
## Call:
## glm(formula = PassFail ~ Age + LSAT + LPII + GPA_1L + GPA_Final +
## LegalAnalysis_TexasPractice + BarPrepCompany + BarPrepCompletion,
## family = binomial(link = "logit"), data = df)
##
## Coefficients:
##                              Estimate Std. Error z value Pr(>|z|)
## (Intercept)                 -27.53251 1707.41785  -0.016 0.987134
## Age                          -0.07328    0.04195  -1.747 0.080621 .
## LSAT                          0.17653    0.05812   3.038 0.002385 **
## LPIID+                       -1.97323 2430.58374  -0.001 0.999352
## LPIIC                       -19.46438 1707.38862  -0.011 0.990904
## LPIIC+                      -18.93889 1707.38859  -0.011 0.991150
## LPIIB                       -19.82845 1707.38862  -0.012 0.990734
## LPIIB+                      -19.78111 1707.38866  -0.012 0.990756
## LPIIA                       -20.70301 1707.38877  -0.012 0.990325
## GPA_1L                        1.55461    0.89317   1.741 0.081764 .
## GPA_Final                     4.46326    1.21079   3.686 0.000228 ***
## LegalAnalysis_TexasPractice  -0.97980    0.43890  -2.232 0.025588 *
## BarPrepCompany                0.30204    0.09058   3.334 0.000855 ***
## BarPrepCompletion             5.97021    1.28994   4.628 3.69e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 192.56 on 434 degrees of freedom
## AIC: 220.56
##
## Number of Fisher Scoring iterations: 16
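Because the estimates above are on the log-odds scale, exponentiating them gives odds ratios, which are often easier to read. The sketch below does this arithmetic for a few rows copied from the summary table. The 1.96 multiplier assumes an approximate Wald 95% interval, and the names `est`, `se`, and `odds_ratios` are illustrative rather than part of the original analysis.

```r
# Estimates and standard errors copied from the summary(selected_model1) table
est <- c(LSAT = 0.17653, GPA_Final = 4.46326,
         LegalAnalysis_TexasPractice = -0.97980, BarPrepCompletion = 5.97021)
se  <- c(LSAT = 0.05812, GPA_Final = 1.21079,
         LegalAnalysis_TexasPractice = 0.43890, BarPrepCompletion = 1.28994)

# Odds ratio per one-unit increase, with an approximate Wald 95% interval
odds_ratios <- data.frame(
  odds_ratio = exp(est),
  lower_95   = exp(est - 1.96 * se),
  upper_95   = exp(est + 1.96 * se)
)
print(round(odds_ratios, 3))
```

A one-unit change is large for some predictors (a full grade point for GPA_Final), so a rescaled quantity such as `exp(0.1 * est["GPA_Final"])` may be more natural to report.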
# Run forward stepwise selection starting from the null model
null_model <- glm(PassFail ~ 1, data = df, family = binomial)
selected_model2 <- step(null_model,
                        scope = list(lower = null_model, upper = mod),
                        direction = "forward")
## Start: AIC=319.6
## PassFail ~ 1
##
## Df Deviance AIC
## + FinalRankPercentile 1 249.11 253.11
## + GPA_Final 1 250.24 254.24
## + GPA_1L 1 257.80 261.80
## + StudentSuccessInitiative 1 267.77 271.77
## + CivPro 7 277.61 293.61
## + BarPrepCompletion 1 293.92 297.92
## + Probation 1 301.30 305.30
## + LSAT 1 307.76 311.76
## + LegalAnalysis_TexasPractice 1 311.02 315.02
## + Age 1 314.69 318.69
## + UGPA 1 314.82 318.82
## + LPI 7 303.29 319.29
## <none> 317.60 319.60
## + X.LawSchoolBarPrepWorkshops 1 316.36 320.36
## + OptIntoWritingGuide 1 316.59 320.59
## + AdvLegalPerfSkills 1 317.20 321.20
## + BarPrepMentor 1 317.27 321.27
## + BarPrepCompany 1 317.54 321.54
## + Accommodations 1 317.56 321.56
## + AdvLegalAnalysis 1 317.59 321.59
## + LPII 6 308.57 322.57
##
## Step: AIC=253.11
## PassFail ~ FinalRankPercentile
##
## Df Deviance AIC
## + BarPrepCompletion 1 239.19 245.19
## + LSAT 1 241.15 247.15
## + Age 1 245.88 251.88
## + GPA_1L 1 246.64 252.64
## <none> 249.11 253.11
## + LegalAnalysis_TexasPractice 1 247.24 253.24
## + BarPrepCompany 1 247.84 253.84
## + LPII 6 238.10 254.10
## + Probation 1 248.13 254.13
## + GPA_Final 1 248.46 254.46
## + StudentSuccessInitiative 1 248.57 254.57
## + AdvLegalPerfSkills 1 248.73 254.73
## + BarPrepMentor 1 248.87 254.87
## + OptIntoWritingGuide 1 248.94 254.94
## + Accommodations 1 248.99 254.99
## + X.LawSchoolBarPrepWorkshops 1 249.04 255.04
## + AdvLegalAnalysis 1 249.09 255.09
## + UGPA 1 249.11 255.11
## + LPI 7 238.57 256.57
## + CivPro 7 240.06 258.06
##
## Step: AIC=245.19
## PassFail ~ FinalRankPercentile + BarPrepCompletion
##
## Df Deviance AIC
## + LSAT 1 227.15 235.15
## + BarPrepCompany 1 234.98 242.98
## + GPA_1L 1 235.28 243.28
## + LegalAnalysis_TexasPractice 1 235.68 243.68
## + Age 1 235.98 243.98
## <none> 239.19 245.19
## + Probation 1 237.66 245.66
## + AdvLegalPerfSkills 1 238.57 246.57
## + BarPrepMentor 1 238.60 246.60
## + LPII 6 228.63 246.63
## + OptIntoWritingGuide 1 238.68 246.68
## + GPA_Final 1 238.72 246.72
## + X.LawSchoolBarPrepWorkshops 1 238.72 246.72
## + StudentSuccessInitiative 1 238.91 246.91
## + Accommodations 1 239.11 247.11
## + UGPA 1 239.14 247.14
## + AdvLegalAnalysis 1 239.19 247.19
## + CivPro 7 229.10 249.10
## + LPI 7 230.70 250.70
##
## Step: AIC=235.15
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT
##
## Df Deviance AIC
## + BarPrepCompany 1 219.57 229.57
## + Age 1 224.70 234.70
## + LegalAnalysis_TexasPractice 1 225.11 235.11
## <none> 227.15 235.15
## + GPA_Final 1 225.35 235.35
## + GPA_1L 1 225.81 235.81
## + UGPA 1 226.48 236.48
## + Probation 1 226.58 236.58
## + BarPrepMentor 1 226.75 236.75
## + StudentSuccessInitiative 1 226.75 236.75
## + OptIntoWritingGuide 1 226.77 236.77
## + AdvLegalAnalysis 1 227.04 237.04
## + AdvLegalPerfSkills 1 227.11 237.11
## + X.LawSchoolBarPrepWorkshops 1 227.14 237.14
## + Accommodations 1 227.14 237.14
## + LPII 6 217.81 237.81
## + CivPro 7 220.12 242.12
## + LPI 7 220.54 242.54
##
## Step: AIC=229.57
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT + BarPrepCompany
##
## Df Deviance AIC
## + LegalAnalysis_TexasPractice 1 215.79 227.79
## + GPA_1L 1 217.47 229.47
## + Age 1 217.51 229.51
## <none> 219.57 229.57
## + LPII 6 207.88 229.88
## + GPA_Final 1 218.50 230.50
## + Probation 1 218.53 230.53
## + BarPrepMentor 1 218.66 230.66
## + UGPA 1 218.99 230.99
## + StudentSuccessInitiative 1 219.03 231.03
## + AdvLegalPerfSkills 1 219.20 231.20
## + OptIntoWritingGuide 1 219.27 231.27
## + X.LawSchoolBarPrepWorkshops 1 219.53 231.53
## + AdvLegalAnalysis 1 219.54 231.54
## + Accommodations 1 219.57 231.57
## + CivPro 7 212.18 236.18
## + LPI 7 214.47 238.47
##
## Step: AIC=227.79
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT + BarPrepCompany +
## LegalAnalysis_TexasPractice
##
## Df Deviance AIC
## + GPA_Final 1 213.19 227.19
## + LPII 6 203.32 227.32
## <none> 215.79 227.79
## + Age 1 213.97 227.97
## + GPA_1L 1 214.22 228.22
## + OptIntoWritingGuide 1 214.60 228.60
## + BarPrepMentor 1 214.84 228.84
## + AdvLegalPerfSkills 1 214.87 228.87
## + AdvLegalAnalysis 1 214.99 228.99
## + UGPA 1 215.22 229.22
## + Probation 1 215.29 229.29
## + StudentSuccessInitiative 1 215.56 229.56
## + X.LawSchoolBarPrepWorkshops 1 215.62 229.62
## + Accommodations 1 215.79 229.79
## + CivPro 7 208.87 234.87
## + LPI 7 210.25 236.25
##
## Step: AIC=227.19
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT + BarPrepCompany +
## LegalAnalysis_TexasPractice + GPA_Final
##
## Df Deviance AIC
## + LPII 6 198.39 224.39
## + Age 1 210.55 226.55
## <none> 213.19 227.19
## + GPA_1L 1 211.95 227.95
## + UGPA 1 212.00 228.00
## + BarPrepMentor 1 212.40 228.40
## + X.LawSchoolBarPrepWorkshops 1 212.71 228.71
## + Probation 1 212.76 228.76
## + OptIntoWritingGuide 1 212.89 228.89
## + StudentSuccessInitiative 1 212.97 228.97
## + AdvLegalAnalysis 1 213.00 229.00
## + AdvLegalPerfSkills 1 213.03 229.03
## + Accommodations 1 213.19 229.19
## + CivPro 7 206.05 234.05
## + LPI 7 206.47 234.47
##
## Step: AIC=224.39
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT + BarPrepCompany +
## LegalAnalysis_TexasPractice + GPA_Final + LPII
##
## Df Deviance AIC
## + GPA_1L 1 195.16 223.16
## + Age 1 195.68 223.68
## <none> 198.39 224.39
## + Probation 1 196.42 224.42
## + UGPA 1 196.81 224.81
## + BarPrepMentor 1 197.52 225.52
## + X.LawSchoolBarPrepWorkshops 1 197.75 225.75
## + AdvLegalAnalysis 1 197.90 225.90
## + StudentSuccessInitiative 1 198.16 226.16
## + AdvLegalPerfSkills 1 198.26 226.26
## + OptIntoWritingGuide 1 198.36 226.36
## + Accommodations 1 198.38 226.38
## + CivPro 7 191.06 231.06
## + LPI 7 191.47 231.47
##
## Step: AIC=223.16
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT + BarPrepCompany +
## LegalAnalysis_TexasPractice + GPA_Final + LPII + GPA_1L
##
## Df Deviance AIC
## + Age 1 192.21 222.21
## <none> 195.16 223.16
## + UGPA 1 193.44 223.44
## + Probation 1 194.72 224.72
## + BarPrepMentor 1 194.73 224.73
## + AdvLegalPerfSkills 1 194.89 224.89
## + AdvLegalAnalysis 1 194.94 224.94
## + X.LawSchoolBarPrepWorkshops 1 195.04 225.04
## + StudentSuccessInitiative 1 195.08 225.08
## + OptIntoWritingGuide 1 195.12 225.12
## + Accommodations 1 195.14 225.14
## + LPI 7 187.10 229.10
## + CivPro 7 189.88 231.88
##
## Step: AIC=222.21
## PassFail ~ FinalRankPercentile + BarPrepCompletion + LSAT + BarPrepCompany +
## LegalAnalysis_TexasPractice + GPA_Final + LPII + GPA_1L +
## Age
##
## Df Deviance AIC
## <none> 192.21 222.21
## + UGPA 1 191.55 223.55
## + BarPrepMentor 1 191.56 223.56
## + AdvLegalAnalysis 1 191.77 223.77
## + Probation 1 191.96 223.96
## + StudentSuccessInitiative 1 192.12 224.12
## + Accommodations 1 192.15 224.15
## + X.LawSchoolBarPrepWorkshops 1 192.16 224.16
## + OptIntoWritingGuide 1 192.18 224.18
## + AdvLegalPerfSkills 1 192.20 224.20
## + LPI 7 181.18 225.18
## + CivPro 7 187.84 231.84
summary(selected_model2)
##
## Call:
## glm(formula = PassFail ~ FinalRankPercentile + BarPrepCompletion +
## LSAT + BarPrepCompany + LegalAnalysis_TexasPractice + GPA_Final +
## LPII + GPA_1L + Age, family = binomial, data = df)
##
## Coefficients:
##                              Estimate Std. Error z value Pr(>|z|)
## (Intercept)                 -32.04679 1652.75109  -0.019 0.984530
## FinalRankPercentile          -1.95462    3.31011  -0.591 0.554855
## BarPrepCompletion             6.03466    1.29351   4.665 3.08e-06 ***
## LSAT                          0.18068    0.05877   3.074 0.002109 **
## BarPrepCompany                0.29990    0.09088   3.300 0.000967 ***
## LegalAnalysis_TexasPractice  -1.03321    0.44823  -2.305 0.021161 *
## GPA_Final                     5.92335    2.76481   2.142 0.032160 *
## LPIID+                       -2.25284 2389.13860  -0.001 0.999248
## LPIIC                       -19.72982 1652.70304  -0.012 0.990475
## LPIIC+                      -19.20407 1652.70301  -0.012 0.990729
## LPIIB                       -20.11521 1652.70305  -0.012 0.990289
## LPIIB+                      -20.01055 1652.70306  -0.012 0.990340
## LPIIA                       -20.95519 1652.70319  -0.013 0.989884
## GPA_1L                        1.66898    0.91786   1.818 0.069012 .
## Age                          -0.07536    0.04186  -1.800 0.071801 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 192.21 on 433 degrees of freedom
## AIC: 222.21
##
## Number of Fisher Scoring iterations: 16
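The forward-selected model differs from the backward-selected one only by the extra FinalRankPercentile term, so the two fits are nested and their printed summaries can be compared by hand. For these binary-response fits, AIC equals the residual deviance plus twice the number of estimated coefficients, and the drop in deviance gives a likelihood-ratio test for the extra term. The sketch below uses only values printed in the two summaries; the variable names are illustrative.

```r
# Residual deviance and number of estimated coefficients, read off the summaries
dev_backward <- 192.56; k_backward <- 14   # backward model (no FinalRankPercentile)
dev_forward  <- 192.21; k_forward  <- 15   # forward model (with FinalRankPercentile)

# AIC = deviance + 2 * (number of coefficients) for these binary-response fits
aic_backward <- dev_backward + 2 * k_backward
aic_forward  <- dev_forward  + 2 * k_forward

# Likelihood-ratio test for the one extra term in the forward model
lr_stat <- dev_backward - dev_forward
lr_df   <- k_forward - k_backward
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

cat("AIC backward:", aic_backward, " AIC forward:", aic_forward, "\n")
cat("LR statistic:", lr_stat, "on", lr_df, "df, p =", round(p_value, 3), "\n")
```

The resulting p-value is well above 0.05, which agrees with AIC preferring the smaller model and with the non-significant Wald test for FinalRankPercentile in the table above.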
# Perform stepwise selection in both directions
selected_model3 <- step(mod, direction = "both")
## Start: AIC=248.79
## PassFail ~ Age + LSAT + UGPA + CivPro + LPI + LPII + GPA_1L +
## GPA_Final + FinalRankPercentile + Accommodations + Probation +
## LegalAnalysis_TexasPractice + AdvLegalPerfSkills + AdvLegalAnalysis +
## BarPrepCompany + BarPrepCompletion + OptIntoWritingGuide +
## X.LawSchoolBarPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor
##
## Df Deviance AIC
## - CivPro 7 178.46 240.46
## - LPI 7 184.10 246.10
## - StudentSuccessInitiative 1 172.84 246.84
## - Accommodations 1 172.86 246.86
## - X.LawSchoolBarPrepWorkshops 1 172.92 246.92
## - OptIntoWritingGuide 1 173.06 247.06
## - AdvLegalPerfSkills 1 173.08 247.08
## - AdvLegalAnalysis 1 173.09 247.09
## - Probation 1 173.19 247.19
## - FinalRankPercentile 1 173.66 247.66
## - BarPrepMentor 1 174.11 248.11
## - GPA_1L 1 174.16 248.16
## - UGPA 1 174.78 248.78
## <none> 172.79 248.79
## - LegalAnalysis_TexasPractice 1 176.36 250.36
## - Age 1 177.14 251.14
## - LPII 6 187.96 251.96
## - GPA_Final 1 179.51 253.51
## - LSAT 1 182.69 256.69
## - BarPrepCompany 1 185.72 259.72
## - BarPrepCompletion 1 195.92 269.92
##
## Step: AIC=240.46
## PassFail ~ Age + LSAT + UGPA + LPI + LPII + GPA_1L + GPA_Final +
## FinalRankPercentile + Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + OptIntoWritingGuide + X.LawSchoolBarPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor
##
## Df Deviance AIC
## - LPI 7 189.34 237.34
## - X.LawSchoolBarPrepWorkshops 1 178.46 238.46
## - Accommodations 1 178.48 238.48
## - OptIntoWritingGuide 1 178.51 238.51
## - StudentSuccessInitiative 1 178.57 238.57
## - AdvLegalAnalysis 1 178.78 238.78
## - Probation 1 178.95 238.95
## - AdvLegalPerfSkills 1 179.00 239.00
## - BarPrepMentor 1 179.13 239.13
## - UGPA 1 179.51 239.51
## - FinalRankPercentile 1 179.63 239.63
## <none> 178.46 240.46
## - GPA_1L 1 180.82 240.82
## - LegalAnalysis_TexasPractice 1 181.02 241.02
## - Age 1 182.66 242.66
## - LPII 6 195.79 245.79
## - GPA_Final 1 185.90 245.90
## - LSAT 1 186.76 246.76
## + CivPro 7 172.79 248.79
## - BarPrepCompany 1 189.96 249.96
## - BarPrepCompletion 1 201.56 261.56
##
## Step: AIC=237.34
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + OptIntoWritingGuide + X.LawSchoolBarPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor
##
## Df Deviance AIC
## - X.LawSchoolBarPrepWorkshops 1 189.35 235.35
## - OptIntoWritingGuide 1 189.40 235.40
## - Accommodations 1 189.40 235.40
## - StudentSuccessInitiative 1 189.45 235.45
## - AdvLegalPerfSkills 1 189.59 235.59
## - FinalRankPercentile 1 189.68 235.68
## - Probation 1 189.83 235.83
## - AdvLegalAnalysis 1 190.07 236.07
## - BarPrepMentor 1 190.11 236.11
## - UGPA 1 190.32 236.32
## - GPA_1L 1 190.34 236.34
## - Age 1 191.07 237.07
## <none> 189.34 237.34
## - LegalAnalysis_TexasPractice 1 192.09 238.09
## - GPA_Final 1 193.45 239.45
## + LPI 7 178.46 240.46
## - LPII 6 206.84 242.84
## - LSAT 1 199.54 245.54
## + CivPro 7 184.10 246.10
## - BarPrepCompany 1 201.37 247.37
## - BarPrepCompletion 1 215.28 261.28
##
## Step: AIC=235.36
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + OptIntoWritingGuide + StudentSuccessInitiative +
## BarPrepMentor
##
## Df Deviance AIC
## - OptIntoWritingGuide 1 189.42 233.42
## - Accommodations 1 189.43 233.43
## - StudentSuccessInitiative 1 189.49 233.49
## - AdvLegalPerfSkills 1 189.62 233.62
## - FinalRankPercentile 1 189.69 233.69
## - Probation 1 189.87 233.87
## - AdvLegalAnalysis 1 190.07 234.07
## - BarPrepMentor 1 190.21 234.21
## - UGPA 1 190.32 234.32
## - GPA_1L 1 190.43 234.43
## - Age 1 191.16 235.16
## <none> 189.35 235.35
## - LegalAnalysis_TexasPractice 1 192.16 236.16
## + X.LawSchoolBarPrepWorkshops 1 189.34 237.34
## - GPA_Final 1 193.47 237.47
## + LPI 7 178.46 238.46
## - LPII 6 206.89 240.89
## - LSAT 1 199.68 243.68
## + CivPro 7 184.13 244.13
## - BarPrepCompany 1 201.38 245.38
## - BarPrepCompletion 1 215.43 259.43
##
## Step: AIC=233.42
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Accommodations + Probation + LegalAnalysis_TexasPractice +
## AdvLegalPerfSkills + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + StudentSuccessInitiative + BarPrepMentor
##
## Df Deviance AIC
## - Accommodations 1 189.49 231.49
## - StudentSuccessInitiative 1 189.57 231.57
## - AdvLegalPerfSkills 1 189.63 231.63
## - FinalRankPercentile 1 189.86 231.86
## - Probation 1 189.90 231.90
## - AdvLegalAnalysis 1 190.14 232.14
## - BarPrepMentor 1 190.29 232.29
## - UGPA 1 190.35 232.35
## - GPA_1L 1 190.62 232.62
## - Age 1 191.17 233.17
## <none> 189.42 233.42
## - LegalAnalysis_TexasPractice 1 192.26 234.26
## + OptIntoWritingGuide 1 189.35 235.35
## + X.LawSchoolBarPrepWorkshops 1 189.40 235.40
## - GPA_Final 1 193.97 235.97
## + LPI 7 178.51 236.51
## - LPII 6 207.07 239.07
## - LSAT 1 199.96 241.96
## + CivPro 7 184.27 242.27
## - BarPrepCompany 1 201.49 243.49
## - BarPrepCompletion 1 215.45 257.45
##
## Step: AIC=231.49
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Probation + LegalAnalysis_TexasPractice + AdvLegalPerfSkills +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion + StudentSuccessInitiative +
## BarPrepMentor
##
## Df Deviance AIC
## - StudentSuccessInitiative 1 189.65 229.65
## - AdvLegalPerfSkills 1 189.69 229.69
## - FinalRankPercentile 1 189.91 229.91
## - Probation 1 189.96 229.96
## - AdvLegalAnalysis 1 190.18 230.18
## - BarPrepMentor 1 190.38 230.38
## - UGPA 1 190.43 230.43
## - GPA_1L 1 190.65 230.65
## - Age 1 191.20 231.20
## <none> 189.49 231.49
## - LegalAnalysis_TexasPractice 1 192.44 232.44
## + Accommodations 1 189.42 233.42
## + OptIntoWritingGuide 1 189.43 233.43
## + X.LawSchoolBarPrepWorkshops 1 189.46 233.46
## - GPA_Final 1 194.03 234.03
## + LPI 7 178.54 234.54
## - LPII 6 207.11 237.11
## - LSAT 1 199.96 239.96
## + CivPro 7 184.30 240.30
## - BarPrepCompany 1 201.56 241.56
## - BarPrepCompletion 1 215.49 255.49
##
## Step: AIC=229.65
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Probation + LegalAnalysis_TexasPractice + AdvLegalPerfSkills +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - AdvLegalPerfSkills 1 189.88 227.88
## - FinalRankPercentile 1 189.96 227.96
## - Probation 1 190.05 228.05
## - AdvLegalAnalysis 1 190.37 228.37
## - BarPrepMentor 1 190.51 228.51
## - UGPA 1 190.56 228.56
## - GPA_1L 1 191.04 229.04
## - Age 1 191.35 229.35
## <none> 189.65 229.65
## - LegalAnalysis_TexasPractice 1 192.72 230.72
## + StudentSuccessInitiative 1 189.49 231.49
## + OptIntoWritingGuide 1 189.56 231.56
## + Accommodations 1 189.57 231.57
## + X.LawSchoolBarPrepWorkshops 1 189.60 231.60
## - GPA_Final 1 194.12 232.12
## + LPI 7 178.63 232.63
## - LPII 6 207.38 235.38
## - LSAT 1 200.03 238.03
## + CivPro 7 184.59 238.59
## - BarPrepCompany 1 201.75 239.75
## - BarPrepCompletion 1 215.99 253.99
##
## Step: AIC=227.88
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + FinalRankPercentile +
## Probation + LegalAnalysis_TexasPractice + AdvLegalAnalysis +
## BarPrepCompany + BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - FinalRankPercentile 1 190.10 226.10
## - Probation 1 190.27 226.27
## - AdvLegalAnalysis 1 190.38 226.38
## - BarPrepMentor 1 190.67 226.67
## - UGPA 1 190.85 226.85
## - GPA_1L 1 191.25 227.25
## - Age 1 191.87 227.87
## <none> 189.88 227.88
## + AdvLegalPerfSkills 1 189.65 229.65
## + StudentSuccessInitiative 1 189.69 229.69
## + Accommodations 1 189.82 229.82
## + X.LawSchoolBarPrepWorkshops 1 189.82 229.82
## + OptIntoWritingGuide 1 189.86 229.86
## - GPA_Final 1 194.14 230.14
## + LPI 7 179.10 231.10
## - LegalAnalysis_TexasPractice 1 195.77 231.77
## - LPII 6 207.64 233.64
## - LSAT 1 200.58 236.58
## + CivPro 7 184.60 236.60
## - BarPrepCompany 1 201.86 237.86
## - BarPrepCompletion 1 216.10 252.10
##
## Step: AIC=226.1
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + Probation +
## LegalAnalysis_TexasPractice + AdvLegalAnalysis + BarPrepCompany +
## BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - Probation 1 190.51 224.51
## - AdvLegalAnalysis 1 190.80 224.80
## - BarPrepMentor 1 190.95 224.95
## - UGPA 1 190.96 224.96
## - GPA_1L 1 191.29 225.29
## - Age 1 192.08 226.08
## <none> 190.10 226.10
## + FinalRankPercentile 1 189.88 227.88
## + AdvLegalPerfSkills 1 189.96 227.96
## + OptIntoWritingGuide 1 190.02 228.02
## + StudentSuccessInitiative 1 190.03 228.03
## + Accommodations 1 190.05 228.05
## + X.LawSchoolBarPrepWorkshops 1 190.06 228.06
## - LegalAnalysis_TexasPractice 1 195.81 229.81
## + LPI 7 180.45 230.45
## - LPII 6 207.68 231.68
## - LSAT 1 200.59 234.59
## + CivPro 7 184.78 234.78
## - BarPrepCompany 1 202.25 236.25
## - GPA_Final 1 205.00 239.00
## - BarPrepCompletion 1 216.10 250.10
##
## Step: AIC=224.51
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion + BarPrepMentor
##
## Df Deviance AIC
## - BarPrepMentor 1 191.23 223.23
## - AdvLegalAnalysis 1 191.24 223.24
## - UGPA 1 191.31 223.31
## <none> 190.51 224.51
## - Age 1 192.64 224.64
## - GPA_1L 1 192.97 224.97
## + Probation 1 190.10 226.10
## + FinalRankPercentile 1 190.27 226.27
## + AdvLegalPerfSkills 1 190.38 226.38
## + X.LawSchoolBarPrepWorkshops 1 190.46 226.46
## + OptIntoWritingGuide 1 190.47 226.47
## + Accommodations 1 190.47 226.47
## + StudentSuccessInitiative 1 190.48 226.48
## - LegalAnalysis_TexasPractice 1 196.65 228.65
## + LPI 7 180.94 228.94
## - LPII 6 207.68 229.68
## + CivPro 7 184.93 232.93
## - LSAT 1 201.03 233.03
## - BarPrepCompany 1 202.40 234.40
## - GPA_Final 1 205.00 237.00
## - BarPrepCompletion 1 216.29 248.29
##
## Step: AIC=223.23
## PassFail ~ Age + LSAT + UGPA + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion
##
## Df Deviance AIC
## - UGPA 1 191.93 221.93
## - AdvLegalAnalysis 1 192.10 222.10
## - Age 1 193.22 223.22
## <none> 191.23 223.23
## - GPA_1L 1 194.14 224.14
## + BarPrepMentor 1 190.51 224.51
## + FinalRankPercentile 1 190.94 224.94
## + Probation 1 190.95 224.95
## + X.LawSchoolBarPrepWorkshops 1 191.10 225.10
## + OptIntoWritingGuide 1 191.15 225.15
## + AdvLegalPerfSkills 1 191.16 225.16
## + Accommodations 1 191.17 225.17
## + StudentSuccessInitiative 1 191.22 225.22
## - LegalAnalysis_TexasPractice 1 197.46 227.46
## + LPI 7 181.58 227.58
## - LPII 6 208.52 228.52
## - LSAT 1 201.61 231.61
## + CivPro 7 186.20 232.20
## - BarPrepCompany 1 202.62 232.62
## - GPA_Final 1 205.05 235.05
## - BarPrepCompletion 1 216.39 246.39
##
## Step: AIC=221.93
## PassFail ~ Age + LSAT + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## AdvLegalAnalysis + BarPrepCompany + BarPrepCompletion
##
## Df Deviance AIC
## - AdvLegalAnalysis 1 192.56 220.56
## <none> 191.93 221.93
## - GPA_1L 1 194.93 222.93
## - Age 1 195.03 223.03
## + UGPA 1 191.23 223.23
## + BarPrepMentor 1 191.31 223.31
## + Probation 1 191.68 223.68
## + FinalRankPercentile 1 191.77 223.77
## + AdvLegalPerfSkills 1 191.81 223.81
## + Accommodations 1 191.86 223.86
## + X.LawSchoolBarPrepWorkshops 1 191.87 223.87
## + OptIntoWritingGuide 1 191.90 223.90
## + StudentSuccessInitiative 1 191.91 223.91
## - LegalAnalysis_TexasPractice 1 197.69 225.69
## + LPI 7 182.25 226.25
## - LPII 6 209.15 227.15
## - LSAT 1 201.66 229.66
## + CivPro 7 187.45 231.45
## - BarPrepCompany 1 203.61 231.61
## - GPA_Final 1 206.00 234.00
## - BarPrepCompletion 1 217.60 245.60
##
## Step: AIC=220.56
## PassFail ~ Age + LSAT + LPII + GPA_1L + GPA_Final + LegalAnalysis_TexasPractice +
## BarPrepCompany + BarPrepCompletion
##
## Df Deviance AIC
## <none> 192.56 220.56
## - Age 1 195.35 221.35
## - GPA_1L 1 195.72 221.72
## + BarPrepMentor 1 191.82 221.82
## + AdvLegalAnalysis 1 191.93 221.93
## + UGPA 1 192.10 222.10
## + FinalRankPercentile 1 192.21 222.21
## + Probation 1 192.29 222.29
## + OptIntoWritingGuide 1 192.43 222.43
## + Accommodations 1 192.52 222.52
## + X.LawSchoolBarPrepWorkshops 1 192.54 222.54
## + StudentSuccessInitiative 1 192.55 222.55
## + AdvLegalPerfSkills 1 192.56 222.56
## - LegalAnalysis_TexasPractice 1 197.90 223.90
## + LPI 7 182.57 224.57
## - LPII 6 209.34 225.34
## - LSAT 1 202.33 228.33
## + CivPro 7 188.11 230.11
## - BarPrepCompany 1 204.78 230.78
## - GPA_Final 1 207.34 233.34
## - BarPrepCompletion 1 218.04 244.04
# Show summary of the final selected model
summary(selected_model3)
##
## Call:
## glm(formula = PassFail ~ Age + LSAT + LPII + GPA_1L + GPA_Final +
## LegalAnalysis_TexasPractice + BarPrepCompany + BarPrepCompletion,
## family = binomial(link = "logit"), data = df)
##
## Coefficients:
##                              Estimate Std. Error z value Pr(>|z|)
## (Intercept)                 -27.53251 1707.41785  -0.016 0.987134
## Age                          -0.07328    0.04195  -1.747 0.080621 .
## LSAT                          0.17653    0.05812   3.038 0.002385 **
## LPIID+                       -1.97323 2430.58374  -0.001 0.999352
## LPIIC                       -19.46438 1707.38862  -0.011 0.990904
## LPIIC+                      -18.93889 1707.38859  -0.011 0.991150
## LPIIB                       -19.82845 1707.38862  -0.012 0.990734
## LPIIB+                      -19.78111 1707.38866  -0.012 0.990756
## LPIIA                       -20.70301 1707.38877  -0.012 0.990325
## GPA_1L                        1.55461    0.89317   1.741 0.081764 .
## GPA_Final                     4.46326    1.21079   3.686 0.000228 ***
## LegalAnalysis_TexasPractice  -0.97980    0.43890  -2.232 0.025588 *
## BarPrepCompany                0.30204    0.09058   3.334 0.000855 ***
## BarPrepCompletion             5.97021    1.28994   4.628 3.69e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 192.56 on 434 degrees of freedom
## AIC: 220.56
##
## Number of Fisher Scoring iterations: 16
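The LPII rows in the summary have estimates near -20 with standard errors above 1,700, and the fit needed 16 Fisher scoring iterations. That pattern commonly arises when some factor level contains only passes or only fails (quasi-complete separation), although the output alone does not confirm the cause. The toy example below, built on made-up data with the hypothetical names `toy`, `grade`, and `pass`, reproduces the symptom and shows the cross-tabulation that would reveal it. On the real data, the analogous check would be `table(df$LPII, df$PassFail)`.

```r
# Made-up data: every student in the reference grade level "F" passes,
# so that level is perfectly predicted (quasi-complete separation)
toy <- data.frame(
  grade = factor(rep(c("F", "C", "B"), times = c(3, 20, 20)),
                 levels = c("F", "C", "B")),
  pass  = c(rep(1, 3),                       # F: 3 passes, 0 fails
            rep(c(1, 0), each = 10),         # C: 10 passes, 10 fails
            rep(c(1, 0), times = c(15, 5)))  # B: 15 passes, 5 fails
)

# A zero cell in this table is the warning sign
tab <- table(toy$grade, toy$pass)
print(tab)

# glm() may warn about fitted probabilities of 0 or 1; that is expected here
fit <- suppressWarnings(glm(pass ~ grade, data = toy, family = binomial))
print(round(coef(summary(fit)), 3))
```

When a zero cell does appear, the affected coefficients are not reliably estimated, and their individual z-tests should not be read as evidence that the variable is unimportant.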
The final model’s coefficients describe how each predictor relates to the log-odds of passing the bar exam: a positive estimate raises the odds of passing and a negative estimate lowers them, holding the other predictors fixed. A Residuals vs. Fitted Values plot was used to evaluate the model’s fit. The residuals showed no obvious systematic pattern around zero, which suggests that the model captures the main relationships between the predictors and the outcome.
The lack of patterns in the residuals is consistent with an adequate fit, although for a binary outcome this plot is only a rough check and does not by itself confirm that every assumption of logistic regression is met. Within that limit, it supports using the model for prediction.
# Residuals vs Fitted Values plot
plot(selected_model1$fitted.values, residuals(selected_model1, type = "deviance"),
     xlab = "Fitted Values", ylab = "Deviance Residuals",
     main = "Residuals vs Fitted Values")
abline(h = 0, lty = 2)
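For a 0/1 outcome, raw residuals fall on two curves (one for passes, one for fails), so a residuals-versus-fitted scatter can be hard to read. One common complement is a binned residual plot, which averages the response residuals within groups of similar fitted probability. The sketch below illustrates the idea on simulated data; the data frame `sim`, its columns, and the choice of 10 bins are assumptions for illustration and are not part of the original analysis.

```r
set.seed(1)

# Simulated stand-in data (not the bar exam dataset)
n   <- 400
sim <- data.frame(x = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * sim$x))

fit <- glm(y ~ x, data = sim, family = binomial)

# Fitted probabilities and response residuals (observed minus fitted)
p_hat <- fitted(fit)
r_hat <- residuals(fit, type = "response")

# Group observations into 10 bins of roughly equal size by fitted probability
breaks <- quantile(p_hat, probs = seq(0, 1, length.out = 11))
bins   <- cut(p_hat, breaks = breaks, include.lowest = TRUE)

# Average fitted value and average residual within each bin
bin_fitted <- tapply(p_hat, bins, mean)
bin_resid  <- tapply(r_hat, bins, mean)

plot(bin_fitted, bin_resid,
     xlab = "Mean fitted probability (per bin)",
     ylab = "Mean response residual (per bin)",
     main = "Binned Residual Plot")
abline(h = 0, lty = 2)
```

Bin averages that stay close to zero across the range of fitted probabilities are what an adequate fit would tend to produce; a trend or curve in the averages would point to a missing term or a misspecified relationship.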
This analysis identified several major predictors of bar performance: LSAT score, bar prep completion percentage (BarPrepCompletion), and final law school GPA (GPA_Final). These results emphasize the importance of strong academic performance and sufficient preparation for the bar.
As an exercise for future work, the model could be strengthened by adding interaction terms between variables (e.g., the interaction between LSAT and GPA) or by using regularization techniques such as Lasso or Ridge to avoid overfitting. Additionally, a cross-validation approach would provide a more robust estimate of model performance and generalizability. These improvements would further refine the model’s accuracy, helping to better predict bar passage outcomes and guide future interventions for at-risk students.
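The cross-validation suggestion can be sketched in base R as follows. This is a minimal illustration on simulated data rather than the analysis dataset; the names `sim`, `k`, and `fold_id`, the 5 folds, and the 0.5 classification cutoff are all assumptions made for the example.

```r
set.seed(42)

# Simulated stand-in data (not the bar exam dataset)
n   <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(0.3 + 1.0 * sim$x1 - 0.8 * sim$x2))

# Randomly assign each row to one of k folds of near-equal size
k       <- 5
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other folds, then score the held-out fold
fold_accuracy <- sapply(seq_len(k), function(i) {
  train <- sim[fold_id != i, ]
  test  <- sim[fold_id == i, ]
  fit   <- glm(y ~ x1 + x2, data = train, family = binomial)
  p     <- predict(fit, newdata = test, type = "response")
  mean((p > 0.5) == (test$y == 1))
})

print(round(fold_accuracy, 3))
cat("Mean held-out accuracy:", round(mean(fold_accuracy), 3), "\n")
```

With the real data, a categorical predictor that has rare levels (such as LPII) may have a level that is absent from a training fold, in which case `predict()` fails on the held-out rows; stratifying the folds or merging sparse levels beforehand is one way around that.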
This section lists the complete code used for the analysis; it is included for reference and is not run.
# Load necessary libraries
library(MASS)
library(tidyverse)
library(ggplot2)
library(corrplot)
library(knitr)
library(broom)
library(dplyr)
library(pROC)
# Load the dataset
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/Updated_Bar_Data_For_Review_Final.csv")
# Glimpse: scrollable in preformatted text
cat("### Structure of the Data (glimpse)\n")
cat("<div style='overflow-x: auto; max-width: 100%;'><pre>")
cat(paste(capture.output(glimpse(df)), collapse = "\n"))
cat("</pre></div>")
# Summary: show in scrollable table format
cat("\n\n### Summary Statistics\n")
summary_df <- summary(df)
cat("<div style='overflow-x: auto; max-width: 100%;'>")
kable(summary_df, format = "html", table.attr = "style='width:auto;'")
cat("</div>")
# Drop unnecessary score columns
df <- df[, !(names(df) %in% c("Year", "MPRE", "MPT", "MEE", "WrittenScaledScore", "MBE", "UBE"))]
# Drop all-NA columns
df <- df[, colSums(is.na(df)) < nrow(df)]
# Convert binary character variables to 0/1
binary_vars <- c("PassFail", "Accommodations", "Probation",
"LegalAnalysis_TexasPractice", "AdvLegalPerfSkills", "AdvLegalAnalysis")
df[binary_vars] <- lapply(df[binary_vars], function(x) as.numeric(x == "Y" | x == "P"))
# Convert certain character fields to factor
factor_vars <- c("CivPro", "LPI", "LPII", "BarPrepCompany")
df[factor_vars] <- lapply(df[factor_vars], as.factor)
# Process additional binary indicators
df$BarPrepMentor <- as.factor(df$BarPrepMentor)
levels(df$BarPrepMentor)
df$BarPrepMentor <- as.character(df$BarPrepMentor)
df$BarPrepMentor <- as.numeric(ifelse(df$BarPrepMentor == "N", 0, 1))
df$StudentSuccessInitiative <- as.factor(df$StudentSuccessInitiative)
levels(df$StudentSuccessInitiative)
df$StudentSuccessInitiative <- as.numeric(ifelse(df$StudentSuccessInitiative == "N", 0, 1))
df$OptIntoWritingGuide <- as.factor(df$OptIntoWritingGuide)
levels(df$OptIntoWritingGuide)
# Only "Y" maps to 1; blank and any other value map to 0
df$OptIntoWritingGuide <- as.numeric(df$OptIntoWritingGuide == "Y")
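# Encode BarPrepCompany by its integer level codes. Note: this treats a nominal
# variable as numeric, so its single coefficient depends on the level order.
df$BarPrepCompany <- as.numeric(df$BarPrepCompany)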
# Define grade levels in ascending order; the factors are kept unordered so
# that glm() uses treatment (dummy) contrasts for them
grade_levels <- c("", "F", "D", "D+", "C", "C+", "B", "B+", "A")
df$CivPro <- factor(df$CivPro, levels = grade_levels, ordered = FALSE)
df$LPI <- factor(df$LPI, levels = grade_levels, ordered = FALSE)
# Clean and convert LPII
df$LPII <- gsub("^CR$", "C", as.character(df$LPII))
df$LPII <- factor(df$LPII, levels = grade_levels[2:9], ordered = FALSE) # remove "" from LPII if not used
# Drop rows with missing PassFail
df <- df %>% filter(!is.na(PassFail))
# Complete case filter
df <- df[complete.cases(df), ]
# Select character or factor columns for categorical distributions
cat_vars <- df %>% select(where(~ is.character(.x) || is.factor(.x)))
# Convert to long format
cat_long <- cat_vars %>%
  pivot_longer(cols = everything(), names_to = "Variable", values_to = "Value") %>%
  filter(Value != "") # Optional: remove blanks
# Plot categorical variable distributions
ggplot(cat_long, aes(x = Value)) +
  geom_bar(fill = "lightcoral", color = "black") +
  facet_wrap(~ Variable, scales = "free_x", ncol = 3) +
  labs(
    title = "Distributions of Ordinal Categorical Variables",
    x = "Category",
    y = "Count"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    strip.text = element_text(face = "bold")
  )
# Select only numeric columns
num_vars <- df %>% select(where(is.numeric))
# Keep only numeric columns with more than 2 unique values (i.e., exclude binary 0/1)
non_binary_vars <- num_vars %>%
  select(where(~ length(unique(na.omit(.x))) > 2))
# Pivot to long format for numeric distributions
pv_long <- pivot_longer(
  non_binary_vars,
  cols = everything(),
  names_to = "Variable",
  values_to = "Value"
)
# Plot histograms for non-binary numeric variables
ggplot(pv_long, aes(x = Value)) +
  geom_histogram(bins = 30, color = "black", fill = "lightblue") +
  facet_wrap(~ Variable, scales = "free", ncol = 3) +
  labs(title = "Histograms of Non-Binary Numeric Variables", x = "Value", y = "Frequency") +
  theme_minimal()
# Select numeric columns for binary variable distributions
binary_vars <- num_vars %>%
  select(where(~ all(na.omit(.x) %in% c(0, 1)) && length(unique(na.omit(.x))) == 2))
# Convert to long format for binary distributions
binary_long <- pivot_longer(
  binary_vars,
  cols = everything(),
  names_to = "Variable",
  values_to = "Value"
)
# Plot binary variable distributions
ggplot(binary_long, aes(x = factor(Value))) +
  geom_bar(fill = "lightgreen", color = "black") +
  facet_wrap(~ Variable, scales = "free", ncol = 3) +
  labs(title = "Bar Plots of Binary (0/1) Variables", x = "Value", y = "Count") +
  theme_minimal()
# Run logistic regression
mod <- glm(PassFail ~ ., data = df, family = binomial(link = "logit"))
# Show model summary
summary(mod)
# Perform stepwise model selection (backward)
selected_model1 <- step(mod, direction = "backward")
summary(selected_model1)
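# Run forward stepwise selection starting from the null (intercept-only) model.
# step() documents the scope bounds as formulas, so extract them from the fitted models.
null_model <- glm(PassFail ~ 1, data = df, family = binomial(link = "logit"))
selected_model2 <- step(null_model,
                        scope = list(lower = formula(null_model), upper = formula(mod)),
                        direction = "forward")
summary(selected_model2)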
# Perform stepwise selection in both directions
selected_model3 <- step(mod, direction = "both")
# Show summary of the model selected by bidirectional stepwise search
summary(selected_model3)
# Plot deviance residuals against fitted probabilities for the backward-selected model
plot(fitted(selected_model1), residuals(selected_model1, type = "deviance"),
     xlab = "Fitted Probabilities", ylab = "Deviance Residuals",
     main = "Residuals vs Fitted Values")
abline(h = 0, lty = 2)
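# A minimal sketch, not part of the original analysis: logistic regression
# coefficients are on the log-odds scale, so exponentiating them gives odds
# ratios that are easier to read. The helper name odds_ratio_table is
# hypothetical, and the Wald intervals from confint.default() are an assumed
# choice rather than the method used in the paper.
odds_ratio_table <- function(model) {
  est <- coef(model)
  ci <- confint.default(model) # Wald intervals on the log-odds scale
  data.frame(
    term = names(est),
    odds_ratio = unname(exp(est)),
    ci_lower = unname(exp(ci[, 1])),
    ci_upper = unname(exp(ci[, 2])),
    row.names = NULL
  )
}
# Usage, assuming selected_model1 from the backward selection above:
# odds_ratio_table(selected_model1)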