Passing the bar exam is a critical milestone for law school graduates and a key indicator of academic readiness and program effectiveness. Within a traditional three-year law curriculum, students advance from foundational legal instruction in the first and second years (1L and 2L) to more specialized preparation in the final year (3L), culminating in a standardized licensure exam.
As part of an internal academic review, this study analyzes data from students who took the bar exam between 2021 and 2024. The dataset includes both successful and unsuccessful candidates, offering a comprehensive view of academic, preparatory, and background factors potentially associated with bar passage.
The primary objective is to build a logistic regression model that identifies statistically significant predictors of exam success. To achieve this, the analysis is structured as follows:
Data Loading and Preparation: The dataset is cleaned and preprocessed by transforming character fields into categorical variables, resolving inconsistencies, creating binary indicators for program participation, and removing records with critical missing values.
Exploratory and Correlation Analysis: Summary statistics, data type inspection, and correlation matrices are used to assess redundancy among variables, evaluate scale and distribution, and guide the selection of valid predictors by avoiding post-outcome variables.
Model Construction: A logistic regression model is specified, including both main effects and theoretically justified interaction terms, to quantify the relationship between key predictors (e.g., GPA, LSAT, academic support) and bar passage.
Model Optimization: A backward stepwise selection process based on the Akaike Information Criterion (AIC) is employed to refine the model, retaining only the most informative and statistically robust terms.
This structured approach enables an interpretable analysis of bar exam performance across multiple student groups.
This section presents an exploratory data analysis (EDA) to prepare the dataset for modeling. It begins with loading and inspecting the raw data, identifying data types, and evaluating the presence of missing values. Subsequent steps involve data cleaning and transformation to correct formatting issues, recode inconsistent levels, and generate derived variables that enhance interpretability. Relevant features are then selected and renamed to align with the project’s structure. Finally, correlation analysis is conducted to assess linear relationships among numerical variables, helping to identify redundant predictors and ensuring the inclusion of only valid predictors.
In this section, we load and prepare the dataset for analysis. The process involves importing the data from an external source, inspecting its structure, identifying missing values, and converting key variables to their appropriate types. These steps ensure the dataset is properly formatted and ready for further statistical exploration and modeling.
DATA LOADING
# Create the dataframe
url <- "https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/Updated_Bar_Data_For_Review_Final.csv"
data <- read.csv(url)
rmarkdown::paged_table(data)
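Some of the cleaning performed later in this report (trailing whitespace in `Probation`, empty strings in the grade columns) could alternatively be handled at import time. The sketch below shows the idea on a small inline CSV. The objects `csv_text` and `toy` are made up for illustration, and treating every blank as `NA` is an assumption: for `BarPrepCompany`, this report recodes blanks to "None" instead.

```r
# Inline CSV standing in for the real file (hypothetical values)
csv_text <- "Probation,CivPro,LSAT\nN,A,152\nN ,,155\nY,B+,157"

toy <- read.csv(
  text = csv_text,
  na.strings = c("", "NA"),   # treat empty fields as missing
  strip.white = TRUE,         # drop leading/trailing spaces in unquoted fields
  stringsAsFactors = FALSE
)

unique(toy$Probation)   # the "N " variant is gone after stripping
is.na(toy$CivPro)       # the blank grade is now NA
```

If this approach were used on the real file, columns where a blank carries meaning would still need separate handling.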
INITIAL DATA ANALYSIS
# Initial Data Exploration
str(data)
## 'data.frame': 476 obs. of 28 variables:
## $ Year : int 2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
## $ PassFail : chr "F" "F" "F" "F" ...
## $ Age : num 29.1 29.6 29 36.2 28.9 30.8 29.1 42.9 28.3 27.1 ...
## $ LSAT : int 152 155 157 156 145 154 149 160 152 150 ...
## $ UGPA : num 3.42 2.82 3.46 3.13 3.49 2.85 3.43 3.29 3.62 3.07 ...
## $ CivPro : chr "B+" "B+" "C" "D+" ...
## $ LPI : chr "A" "B" "B" "C" ...
## $ LPII : chr "A" "B" "B" "C+" ...
## $ GPA_1L : num 3.21 2.43 2.62 2.27 2.29 ...
## $ GPA_Final : num 3.29 3.2 2.91 2.77 2.9 2.82 3 3.09 3.21 2.74 ...
## $ FinalRankPercentile : num 0.46 0.33 0.08 0.02 0.08 0.05 0.15 0.22 0.34 0.01 ...
## $ Accommodations : chr "N" "Y" "N" "N" ...
## $ Probation : chr "N" "Y" "N" "Y" ...
## $ LegalAnalysis_TexasPractice: chr "Y" "Y" "Y" "Y" ...
## $ AdvLegalPerfSkills : chr "Y" "Y" "Y" "Y" ...
## $ AdvLegalAnalysis : chr "Y" "Y" "Y" "Y" ...
## $ BarPrepCompany : chr "Barbri" "Barbri" "Barbri" "Barbri" ...
## $ BarPrepCompletion : num 0.96 0.98 0.48 1 0.77 0.02 0.9 0.76 0.77 0.88 ...
## $ OptIntoWritingGuide : chr "" "" "" "" ...
## $ X.LawSchoolBarPrepWorkshops: int 3 0 3 0 5 1 5 5 1 5 ...
## $ StudentSuccessInitiative : chr "N" "Cochran" "Smith" "Baldwin" ...
## $ BarPrepMentor : chr "N" "N" "N" "N" ...
## $ MPRE : num 103 76 99 81 99 NA 90 97 100 78 ...
## $ MPT : num 3 3 3 2.5 3.5 3 2.5 2.5 3 2.5 ...
## $ MEE : num 2.67 3.17 2.67 3 2.67 2 3.5 3 2.67 3.83 ...
## $ WrittenScaledScore : num 126 133 126 126 130 ...
## $ MBE : num 133 133 118 140 125 ...
## $ UBE : num 259 266 244 266 256 ...
summary(data)
## Year PassFail Age LSAT
## Min. :2021 Length:476 Min. :23.10 Min. :141.0
## 1st Qu.:2022 Class :character 1st Qu.:26.70 1st Qu.:153.0
## Median :2023 Mode :character Median :28.20 Median :156.0
## Mean :2023 Mean :29.13 Mean :155.3
## 3rd Qu.:2024 3rd Qu.:30.10 3rd Qu.:157.0
## Max. :2024 Max. :65.70 Max. :168.0
##
## UGPA CivPro LPI LPII
## Min. :2.010 Length:476 Length:476 Length:476
## 1st Qu.:3.250 Class :character Class :character Class :character
## Median :3.490 Mode :character Mode :character Mode :character
## Mean :3.451
## 3rd Qu.:3.710
## Max. :4.140
##
## GPA_1L GPA_Final FinalRankPercentile Accommodations
## Min. :2.200 Min. :2.44 Min. :0.0000 Length:476
## 1st Qu.:2.781 1st Qu.:3.05 1st Qu.:0.2600 Class :character
## Median :3.083 Median :3.27 Median :0.5150 Mode :character
## Mean :3.086 Mean :3.28 Mean :0.5067
## 3rd Qu.:3.383 3rd Qu.:3.52 3rd Qu.:0.7500
## Max. :4.000 Max. :3.99 Max. :0.9900
## NA's :4
## Probation LegalAnalysis_TexasPractice AdvLegalPerfSkills
## Length:476 Length:476 Length:476
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## AdvLegalAnalysis BarPrepCompany BarPrepCompletion OptIntoWritingGuide
## Length:476 Length:476 Min. :0.0200 Length:476
## Class :character Class :character 1st Qu.:0.8000 Class :character
## Mode :character Mode :character Median :0.8900 Mode :character
## Mean :0.8635
## 3rd Qu.:0.9800
## Max. :1.0000
## NA's :23
## X.LawSchoolBarPrepWorkshops StudentSuccessInitiative BarPrepMentor
## Min. :0.000 Length:476 Length:476
## 1st Qu.:0.000 Class :character Class :character
## Median :0.000 Mode :character Mode :character
## Mean :1.532
## 3rd Qu.:3.000
## Max. :5.000
##
## MPRE MPT MEE WrittenScaledScore
## Min. : 76.00 Min. :1.000 Min. :2.000 Min. :111.7
## 1st Qu.: 89.50 1st Qu.:3.000 1st Qu.:3.330 1st Qu.:138.0
## Median : 99.00 Median :3.500 Median :3.670 Median :146.9
## Mean : 99.46 Mean :3.651 Mean :3.719 Mean :146.6
## 3rd Qu.:107.00 3rd Qu.:4.000 3rd Qu.:4.000 3rd Qu.:155.7
## Max. :145.00 Max. :5.500 Max. :5.330 Max. :181.2
## NA's :273
## MBE UBE
## Min. :103.6 Min. :227.3
## 1st Qu.:138.7 1st Qu.:278.5
## Median :147.1 Median :293.5
## Mean :146.2 Mean :292.9
## 3rd Qu.:154.0 3rd Qu.:306.8
## Max. :187.9 Max. :358.7
##
colSums(is.na(data))
## Year PassFail
## 0 0
## Age LSAT
## 0 0
## UGPA CivPro
## 0 0
## LPI LPII
## 0 0
## GPA_1L GPA_Final
## 4 0
## FinalRankPercentile Accommodations
## 0 0
## Probation LegalAnalysis_TexasPractice
## 0 0
## AdvLegalPerfSkills AdvLegalAnalysis
## 0 0
## BarPrepCompany BarPrepCompletion
## 0 23
## OptIntoWritingGuide X.LawSchoolBarPrepWorkshops
## 0 0
## StudentSuccessInitiative BarPrepMentor
## 0 0
## MPRE MPT
## 273 0
## MEE WrittenScaledScore
## 0 0
## MBE UBE
## 0 0
Define the Target Variable
# Define the Target Variable (PassFail)
data$PassFail <- factor(data$PassFail, levels = c("F", "P"))
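The level order chosen here matters: when the response of a binomial `glm` is a factor, R treats the first level as failure and all other levels as success, so listing "F" first means the fitted model describes the probability of passing. A minimal sketch on a made-up vector (`toy` and `toy_f` are illustrative names, not part of the dataset):

```r
# Toy outcome vector (hypothetical values, not from the bar dataset)
toy <- c("P", "F", "P", "P", "F")

# Explicit level order: "F" is the reference (failure), "P" is the event
toy_f <- factor(toy, levels = c("F", "P"))

levels(toy_f)            # "F" "P"
as.integer(toy_f) - 1L   # 1 0 1 1 0, where 1 marks a pass
```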
Convert Key Categorical Variables to Factors
# Convert Key Categorical Variables to Factors
categorical_vars <- c("Accommodations", "Probation",
"LegalAnalysis_TexasPractice",
"AdvLegalPerfSkills", "AdvLegalAnalysis",
"BarPrepCompany", "StudentSuccessInitiative", "BarPrepMentor",
"CivPro", "LPI", "LPII")
data[categorical_vars] <- lapply(data[categorical_vars], factor)
Check Levels of Categorical Variables
# Check Levels of Categorical Variables
str(data)
## 'data.frame': 476 obs. of 28 variables:
## $ Year : int 2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
## $ PassFail : Factor w/ 2 levels "F","P": 1 1 1 1 1 1 1 1 1 2 ...
## $ Age : num 29.1 29.6 29 36.2 28.9 30.8 29.1 42.9 28.3 27.1 ...
## $ LSAT : int 152 155 157 156 145 154 149 160 152 150 ...
## $ UGPA : num 3.42 2.82 3.46 3.13 3.49 2.85 3.43 3.29 3.62 3.07 ...
## $ CivPro : Factor w/ 9 levels "","A","B","B+",..: 4 4 5 8 5 4 5 5 6 5 ...
## $ LPI : Factor w/ 9 levels "","A","B","B+",..: 2 3 3 5 6 9 5 6 3 3 ...
## $ LPII : Factor w/ 9 levels "","A","B","B+",..: 2 3 3 6 6 7 3 3 3 5 ...
## $ GPA_1L : num 3.21 2.43 2.62 2.27 2.29 ...
## $ GPA_Final : num 3.29 3.2 2.91 2.77 2.9 2.82 3 3.09 3.21 2.74 ...
## $ FinalRankPercentile : num 0.46 0.33 0.08 0.02 0.08 0.05 0.15 0.22 0.34 0.01 ...
## $ Accommodations : Factor w/ 2 levels "N","Y": 1 2 1 1 1 1 1 1 1 1 ...
## $ Probation : Factor w/ 3 levels "N","N ","Y": 1 3 1 3 3 1 3 3 1 3 ...
## $ LegalAnalysis_TexasPractice: Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
## $ AdvLegalPerfSkills : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
## $ AdvLegalAnalysis : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
## $ BarPrepCompany : Factor w/ 7 levels "","Barbri","Helix",..: 2 2 2 2 7 7 7 7 7 7 ...
## $ BarPrepCompletion : num 0.96 0.98 0.48 1 0.77 0.02 0.9 0.76 0.77 0.88 ...
## $ OptIntoWritingGuide : chr "" "" "" "" ...
## $ X.LawSchoolBarPrepWorkshops: int 3 0 3 0 5 1 5 5 1 5 ...
## $ StudentSuccessInitiative : Factor w/ 22 levels "Arrington","Aycock",..: 16 7 21 3 3 17 19 6 16 6 ...
## $ BarPrepMentor : Factor w/ 68 levels "AbbeyCoufal",..: 49 49 49 49 49 49 49 49 49 49 ...
## $ MPRE : num 103 76 99 81 99 NA 90 97 100 78 ...
## $ MPT : num 3 3 3 2.5 3.5 3 2.5 2.5 3 2.5 ...
## $ MEE : num 2.67 3.17 2.67 3 2.67 2 3.5 3 2.67 3.83 ...
## $ WrittenScaledScore : num 126 133 126 126 130 ...
## $ MBE : num 133 133 118 140 125 ...
## $ UBE : num 259 266 244 266 256 ...
sapply(data[, c("Accommodations", "Probation",
"LegalAnalysis_TexasPractice", "AdvLegalPerfSkills", "AdvLegalAnalysis",
"BarPrepCompany", "StudentSuccessInitiative", "BarPrepMentor", "CivPro", "LPI", "LPII")], levels)
## $Accommodations
## [1] "N" "Y"
##
## $Probation
## [1] "N" "N " "Y"
##
## $LegalAnalysis_TexasPractice
## [1] "N" "Y"
##
## $AdvLegalPerfSkills
## [1] "N" "Y"
##
## $AdvLegalAnalysis
## [1] "N" "Y"
##
## $BarPrepCompany
## [1] "" "Barbri" "Helix" "JD Advising" "Kaplan"
## [6] "Quimbee" "Themis"
##
## $StudentSuccessInitiative
## [1] "Arrington" "Aycock" "Baldwin" "Beyer" "Chapman"
## [6] "Christopher" "Cochran" "Corn" "Gonzalez" "Hardberger"
## [11] "Humphrey" "Keffer" "Lauriat" "Lux" "McDonald"
## [16] "N" "Rosen" "RSherwin" "Saavedra" "Sherwin"
## [21] "Smith" "Stafford"
##
## $BarPrepMentor
## [1] "AbbeyCoufal" "AmberBeard" "AmberRich"
## [4] "AshleyPirtle" "AshleySanders" "BenIvey"
## [7] "BrendaJohnson" "BryanGreer" "CadyMello"
## [10] "ChrisRhodes" "ClayElliott" "ColeShooter"
## [13] "ColleenByrom" "ColleenElbe(Potts)" "ColleenPotts"
## [16] "DanielleSaavedra" "DavidHutchens" "DavidRice"
## [19] "DeirdreWard" "DenetteVaughn" "DolphWenzel"
## [22] "GrantCoffey" "HaleyHickey" "HolleyMcDaniel"
## [25] "HollyHaseloff" "HoltonWestbrook" "JacquelynnMayes"
## [28] "JessicaAycock" "JohnMoore" "JordanChavez"
## [31] "JosephAustin" "JulieDavis" "JustinPlescha"
## [34] "KathleenGoegel" "KatyCrocker" "KimberlyKelley"
## [37] "LauraFidelie" "LauraMcDivitt" "LaurenWelch"
## [40] "LeenaAl-Souki" "MadelynDeviney" "MariaOviedo"
## [43] "MelissaWaggoner" "MerylBenham" "MichaelEconomidis"
## [46] "MikelaBryant" "MistyPratt" "MonicaReyes"
## [49] "N" "PaulaMilan" "PaulaMillan"
## [52] "PaulBarkhurst" "QuentinWetsel" "RebekahLuna"
## [55] "RebekaLuna" "ReidLollis" "SaraThornton"
## [58] "ScottKeffer" "ScoutBlosser" "TasiaEaslon"
## [61] "TomHall" "TravisWeibold" "TylynnPayne"
## [64] "VictoriaWhitehead" "VictorMellinger" "WilliamWells"
## [67] "WillRaftis" "Y-DanielleSaavedra"
##
## $CivPro
## [1] "" "A" "B" "B+" "C" "C+" "D" "D+" "F"
##
## $LPI
## [1] "" "A" "B" "B+" "C" "C+" "D" "D+" "F"
##
## $LPII
## [1] "" "A" "B" "B+" "C" "C+" "CR" "D" "D+"
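The listing above shows that `Probation` has both "N" and "N " as levels, which differ only by a trailing space. Spotting such pairs by eye becomes harder for long lists such as `BarPrepMentor`, so a small check can help. The helper `find_whitespace_dupes` below is a hypothetical function written for this illustration, applied to a toy vector rather than the real column:

```r
# Hypothetical helper: report factor levels that collapse to the same
# value once leading/trailing whitespace is trimmed.
find_whitespace_dupes <- function(x) {
  lv <- levels(factor(x))
  trimmed <- trimws(lv)
  lv[trimmed %in% trimmed[duplicated(trimmed)]]
}

# Toy column mimicking the Probation levels shown above
toy_probation <- c("N", "N ", "Y", "N", "Y")
find_whitespace_dupes(toy_probation)   # returns the two "N" variants
```

This check only catches whitespace differences. Spelling variants such as "PaulaMilan" and "PaulaMillan" would need a different approach.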
Check Number of Unique Values for Numerical Variables
# Check Number of Unique Values for Numerical Variables
sapply(data[, c("LSAT", "UGPA", "GPA_1L", "GPA_Final", "FinalRankPercentile",
"BarPrepCompletion", "X.LawSchoolBarPrepWorkshops",
"MPRE", "MPT", "MEE", "MBE", "UBE")], function(x) length(unique(x)))
## LSAT UGPA
## 25 143
## GPA_1L GPA_Final
## 220 133
## FinalRankPercentile BarPrepCompletion
## 100 56
## X.LawSchoolBarPrepWorkshops MPRE
## 6 54
## MPT MEE
## 9 21
## MBE UBE
## 173 348
Observations
In this section, we perform essential data cleaning and preparation tasks to ensure consistency and analytical integrity. This includes correcting issues in categorical variables, generating new binary indicators for student participation, selecting relevant features, renaming variables for clarity, and addressing missing values by removing irrelevant columns and incomplete records.
In this section, we apply targeted data cleaning and transformation steps to improve variable consistency and prepare the dataset for analysis. This includes trimming white spaces, recoding empty strings, handling missing values in key categorical variables, and creating new binary indicators to better capture student participation in mentoring and support programs.
Clean Probation Variable
# Clean Probation Variable
data$Probation <- trimws(data$Probation)
data$Probation <- factor(data$Probation)
Replace Empty Strings in BarPrepCompany
# Replace Empty Strings in BarPrepCompany
levels(data$BarPrepCompany) <- c(levels(data$BarPrepCompany), "None")
data$BarPrepCompany[data$BarPrepCompany == ""] <- "None"
data$BarPrepCompany <- factor(data$BarPrepCompany)
Handle Missing Values in CivPro, LPI, and LPII
# Handle Missing Values in CivPro, LPI, and LPII
data$CivPro[data$CivPro == ""] <- NA
data$LPI[data$LPI == ""] <- NA
data$LPII[data$LPII == ""] <- NA
data$CivPro <- factor(data$CivPro)
data$LPI <- factor(data$LPI)
data$LPII <- factor(data$LPII)
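The same recoding can be written once and applied to all grade columns, which avoids repeating the three steps per variable. The following is a sketch on made-up data, assuming the blank level should always become `NA`; `toy` and `grade_cols` are illustrative names:

```r
# Toy data frame (made-up grades); "" marks a missing grade
toy <- data.frame(
  CivPro = factor(c("A", "", "B+")),
  LPI    = factor(c("", "B", "C"))
)

grade_cols <- c("CivPro", "LPI")
toy[grade_cols] <- lapply(toy[grade_cols], function(g) {
  g[g == ""] <- NA    # blank level becomes a missing value
  droplevels(g)       # remove the now-unused "" level
})

sapply(toy[grade_cols], levels)   # no "" level remains
colSums(is.na(toy))               # one missing value per column
```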
Create Binary Variables for Mentor and Student Support Participation
# Create Binary Variables for Mentor and Student Support Participation
data$HadMentor <- ifelse(data$BarPrepMentor == "N", "No", "Yes")
data$HadMentor <- factor(data$HadMentor)
data$StudentSuccessParticipated <- ifelse(data$StudentSuccessInitiative == "N", "No", "Yes")
data$StudentSuccessParticipated <- as.factor(data$StudentSuccessParticipated)
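One side effect of collapsing these columns to Yes/No is that inconsistent spellings of the same mentor name no longer matter, because every value other than "N" maps to "Yes". A quick cross-tabulation can confirm the mapping. The sketch below uses a made-up vector (`toy_mentor`) that borrows two spellings from the level listing above:

```r
# Toy mentor column: "N" means no mentor; the two names are spelling variants
toy_mentor <- factor(c("N", "PaulaMilan", "PaulaMillan", "N"))

had_mentor <- factor(
  ifelse(toy_mentor == "N", "No", "Yes"),
  levels = c("No", "Yes")   # fix the level order explicitly
)

# Each original value should fall in exactly one indicator column
table(toy_mentor, had_mentor)
```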
In this section, we refine the dataset by selecting only the variables relevant to the analysis and renaming them according to the project’s naming convention. We also assess the presence of missing values across all selected features and apply cleaning steps accordingly. This includes removing irrelevant columns and excluding records with missing values in important variables.
Select and Rename Relevant Variables
# Select and Rename Relevant Variables
data2 <- data[c(
"Year", "PassFail", "Age", "LSAT", "UGPA",
"CivPro", "LPI", "LPII", "GPA_1L", "GPA_Final", "FinalRankPercentile",
"Accommodations", "Probation",
"LegalAnalysis_TexasPractice", "AdvLegalPerfSkills", "AdvLegalAnalysis",
"BarPrepCompany", "BarPrepCompletion",
"OptIntoWritingGuide", "X.LawSchoolBarPrepWorkshops",
"MPRE", "MPT", "MEE", "WrittenScaledScore", "MBE", "UBE",
"StudentSuccessParticipated", "HadMentor"
)]
# Rename Variables in data2 According to Provided Naming Convention
colnames(data2) <- c(
"Class", "Pass", "Age", "LSAT", "UGPA",
"CivPro", "LP1", "LP2", "OneCum", "FGPA", "FinalRankPercentile",
"Accom", "Probation",
"LegalAnalysis", "AdvLegalPerf", "AdvLegalAnalysis",
"BarPrep", "PctBarPrepComplete",
"OptIntoWritingGuide", "NumPrepWorkshops",
"MPRE", "MPT", "MEE", "WrittenScaledScore", "MBE", "UBE",
"StudentSuccessInitiative", "BarPrepMentor"
)
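Assigning `colnames()` by position works as long as the two vectors stay in exactly the same order. A name-based mapping is one way to make the renaming less fragile. This is a sketch on a three-column toy frame, not the report's actual procedure; `rename_map` is an illustrative object:

```r
# Toy frame with three of the original column names (values are made up)
toy <- data.frame(Year = 2021, PassFail = "P", GPA_1L = 3.21)

# old name = "new name"
rename_map <- c(Year = "Class", PassFail = "Pass", GPA_1L = "OneCum")

# Fail early if any column to be renamed is absent
missing_cols <- setdiff(names(rename_map), names(toy))
stopifnot(length(missing_cols) == 0)

names(toy)[match(names(rename_map), names(toy))] <- unname(rename_map)
names(toy)   # "Class" "Pass" "OneCum"
```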
Assess Missing Values
# Assess Missing Values
colSums(is.na(data2))
## Class Pass Age
## 0 0 0
## LSAT UGPA CivPro
## 0 0 4
## LP1 LP2 OneCum
## 5 3 4
## FGPA FinalRankPercentile Accom
## 0 0 0
## Probation LegalAnalysis AdvLegalPerf
## 0 0 0
## AdvLegalAnalysis BarPrep PctBarPrepComplete
## 0 0 23
## OptIntoWritingGuide NumPrepWorkshops MPRE
## 0 0 273
## MPT MEE WrittenScaledScore
## 0 0 0
## MBE UBE StudentSuccessInitiative
## 0 0 0
## BarPrepMentor
## 0
Observations
# Remove unnecessary columns: MPRE, OptIntoWritingGuide, WrittenScaledScore and FinalRankPercentile
data2 <- subset(data2, select = -c(MPRE, OptIntoWritingGuide, WrittenScaledScore, FinalRankPercentile))
# Remove rows with missing values in critical academic and preparation variables
critical_vars <- c("CivPro", "LP1", "LP2", "OneCum", "PctBarPrepComplete")
data2 <- data2[complete.cases(data2[, critical_vars]), ]
# Verify that all missing values are removed
colSums(is.na(data2))
## Class Pass Age
## 0 0 0
## LSAT UGPA CivPro
## 0 0 0
## LP1 LP2 OneCum
## 0 0 0
## FGPA Accom Probation
## 0 0 0
## LegalAnalysis AdvLegalPerf AdvLegalAnalysis
## 0 0 0
## BarPrep PctBarPrepComplete NumPrepWorkshops
## 0 0 0
## MPT MEE MBE
## 0 0 0
## UBE StudentSuccessInitiative BarPrepMentor
## 0 0 0
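It is often useful to record how many rows a listwise deletion removes. The raw data has 476 observations, and the model summary later in this report shows 447 null degrees of freedom, which is consistent with 448 rows remaining after this step. The sketch below shows how such a count could be obtained, using a made-up frame (`toy`) and a shortened list of critical variables:

```r
# Toy data (hypothetical values) with missing entries in two critical columns
toy <- data.frame(
  CivPro             = c("A", NA, "B", "C"),
  PctBarPrepComplete = c(0.90, 0.80, NA, 1.00),
  LSAT               = c(150, 152, 155, 160)
)

critical <- c("CivPro", "PctBarPrepComplete")
keep <- complete.cases(toy[, critical])

n_dropped <- sum(!keep)     # rows removed by the filter
toy_clean <- toy[keep, ]

n_dropped          # 2
nrow(toy_clean)    # 2
```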
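In this section, we examine the relationships among numeric variables through a correlation matrix. This analysis helps identify redundant (highly collinear) predictors and flag variables that are components of the exam result itself, such as MPT, MEE, MBE, and UBE. Because those scores are determined together with the outcome, they cannot serve as valid predictors.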
# Correlation Matrix Visualization
library(ggcorrplot)  # provides ggcorrplot()
numeric_vars <- data2[, sapply(data2, is.numeric)]
cor_matrix <- cor(numeric_vars, use = "complete.obs")
ggcorrplot(cor_matrix,
lab = TRUE,
lab_size = 4,
colors = c("red", "white", "#4A90E2"),
outline.color = "black",
show.legend = TRUE,
title = "Correlation Matrix of Numeric Variables",
ggtheme = ggplot2::theme_minimal()
)
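A dense correlation plot can be hard to read pair by pair. As a complement, the sketch below lists the variable pairs whose absolute correlation exceeds a cutoff. The data frame `toy`, the object names, and the 0.8 threshold are illustrative assumptions, not values used in this report.

```r
# Toy data: b is nearly a copy of a, c is unrelated (all values made up)
toy <- data.frame(
  a = 1:10,
  b = 1:10 + rep(c(0.1, -0.1), 5),
  c = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
)

cm <- cor(toy)

# Look only at the upper triangle so each pair is reported once
high <- which(abs(cm) > 0.8 & upper.tri(cm), arr.ind = TRUE)

pairs_high <- data.frame(
  var1 = rownames(cm)[high[, "row"]],
  var2 = colnames(cm)[high[, "col"]],
  r    = cm[high]
)
pairs_high   # one row: a and b
```

Applied to `cor_matrix`, the same pattern would give a table of candidate redundant predictors to review alongside the plot.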
Observations
This section outlines the development and refinement of a logistic regression model designed to predict exam outcomes based on academic, preparatory, and support-related variables. We first construct a comprehensive model that incorporates both main effects and theoretically motivated interaction terms. Given the potential for overfitting, a stepwise selection process is then employed to simplify the model while maintaining predictive performance. The goal is to identify the most influential predictors and interactions that explain students’ likelihood of passing the exam.
Logistic regression is a widely used statistical method for modeling the relationship between a binary outcome variable and one or more independent variables. It allows us to estimate the probability of a specific outcome (e.g., passing the exam) as a function of several predictors, while interpreting the effect of each variable in terms of odds.
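To make the odds interpretation concrete, the sketch below fits a one-predictor logistic regression to made-up data and exponentiates the coefficients. The variables `hours` and `pass` are hypothetical and unrelated to the bar dataset. The exponentiated slope is the multiplicative change in the odds of the outcome for a one-unit increase in the predictor.

```r
# Toy data (made up): hours studied and a 0/1 outcome
toy <- data.frame(
  hours = 1:10,
  pass  = c(0, 0, 1, 0, 0, 1, 1, 0, 1, 1)
)

fit <- glm(pass ~ hours, family = binomial, data = toy)

coef(fit)        # coefficients on the log-odds scale
exp(coef(fit))   # odds ratios; a value above 1 means higher odds per extra hour
```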
Beyond the individual effects of each predictor, certain combinations of factors may interact and jointly influence the probability that a student passes the bar exam. Therefore, interaction terms were included in the logistic regression model based on theoretical reasoning and the academic context.
The selected interactions fall into the following conceptual categories:
The general form of the logistic regression model is:
\[ \log\left(\frac{P}{1 - P}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \beta_a (X_i \cdot X_j) + \beta_b (X_k \cdot X_l) + \cdots \]
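Where:
\(P\) is the probability that a student passes the bar exam.
\(\beta_0\) is the intercept.
\(\beta_1, \ldots, \beta_p\) are the coefficients of the main-effect predictors \(X_1, \ldots, X_p\).
\(\beta_a, \beta_b, \ldots\) are the coefficients of the interaction terms, each formed as the product of two predictors (e.g., \(X_i \cdot X_j\)).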
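The left-hand side of the model equation is a log-odds value, so converting a linear predictor back to a probability requires the inverse logit, P = 1 / (1 + exp(-eta)). A short sketch with arbitrary log-odds values (chosen for illustration only) shows the conversion by hand and with the base R functions `plogis()` and `qlogis()`:

```r
# Arbitrary log-odds values (illustrative only)
log_odds <- c(-2, 0, 1.5)

p_manual  <- 1 / (1 + exp(-log_odds))   # inverse logit by hand
p_builtin <- plogis(log_odds)           # same transformation in base R

all.equal(p_manual, p_builtin)   # TRUE
qlogis(p_builtin)                # recovers the original log-odds
```

A log-odds of 0 corresponds to a probability of 0.5, which is a convenient reference point when reading the sign of a coefficient.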
We fit a logistic regression model to examine how various academic and preparatory factors influence the probability of passing the bar exam.
model <- glm(
Pass ~ LSAT + UGPA + FGPA +
CivPro + LP1 + LP2 +
Accom + Probation +
LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
BarPrep + PctBarPrepComplete + NumPrepWorkshops +
StudentSuccessInitiative + BarPrepMentor + Age +
PctBarPrepComplete * FGPA +
NumPrepWorkshops * FGPA +
BarPrep * FGPA +
Probation * StudentSuccessInitiative +
Probation * BarPrepMentor +
Probation * LegalAnalysis +
AdvLegalPerf * FGPA +
AdvLegalAnalysis * PctBarPrepComplete +
LegalAnalysis * BarPrepMentor +
Accom * BarPrepMentor +
Accom * NumPrepWorkshops +
Age * BarPrep +
Age * AdvLegalPerf,
family = binomial,
data = data2
)
summary(model)
##
## Call:
## glm(formula = Pass ~ LSAT + UGPA + FGPA + CivPro + LP1 + LP2 +
## Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
## BarPrep + PctBarPrepComplete + NumPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor + Age + PctBarPrepComplete * FGPA + NumPrepWorkshops *
## FGPA + BarPrep * FGPA + Probation * StudentSuccessInitiative +
## Probation * BarPrepMentor + Probation * LegalAnalysis + AdvLegalPerf *
## FGPA + AdvLegalAnalysis * PctBarPrepComplete + LegalAnalysis *
## BarPrepMentor + Accom * BarPrepMentor + Accom * NumPrepWorkshops +
## Age * BarPrep + Age * AdvLegalPerf, family = binomial, data = data2)
##
## Coefficients: (2 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -121.11168 32.06595 -3.777 0.000159
## LSAT 0.40098 0.09879 4.059 4.92e-05
## UGPA 1.13283 0.86303 1.313 0.189311
## FGPA 16.61607 8.23642 2.017 0.043655
## CivProB -0.24246 1.69992 -0.143 0.886584
## CivProB+ -0.91517 1.76946 -0.517 0.605016
## CivProC -0.21065 1.75584 -0.120 0.904505
## CivProC+ -1.87381 1.79275 -1.045 0.295923
## CivProD -18.83462 3956.18089 -0.005 0.996201
## CivProD+ 1.22296 2.25717 0.542 0.587949
## CivProF 0.57058 3956.18255 0.000 0.999885
## LP1B 3.09940 1.12904 2.745 0.006048
## LP1B+ 0.90304 1.04438 0.865 0.387223
## LP1C 2.19372 1.34275 1.634 0.102312
## LP1C+ 1.78451 1.09281 1.633 0.102477
## LP1D 17.32865 313.11414 0.055 0.955865
## LP1D+ 5.17139 2.27400 2.274 0.022958
## LP1F -5.50236 3956.18199 -0.001 0.998890
## LP2B 0.55607 1.11542 0.499 0.618110
## LP2B+ 1.38538 1.19002 1.164 0.244357
## LP2C 1.59572 1.48744 1.073 0.283362
## LP2C+ 2.13734 1.36794 1.562 0.118182
## LP2CR 1.03683 1.32008 0.785 0.432204
## LP2D 17.19828 1734.11582 0.010 0.992087
## LP2D+ 19.23622 1553.99695 0.012 0.990124
## AccomY -1.88392 0.86315 -2.183 0.029064
## ProbationY 8.14584 2.76681 2.944 0.003239
## LegalAnalysisY -2.84095 1.12966 -2.515 0.011908
## AdvLegalPerfY 17.98095 9.54971 1.883 0.059717
## AdvLegalAnalysisY -1.34158 3.15973 -0.425 0.671137
## BarPrepHelix 15.95202 3956.18087 0.004 0.996783
## BarPrepKaplan -9.63249 26.93799 -0.358 0.720658
## BarPrepThemis 3.59691 7.83419 0.459 0.646141
## PctBarPrepComplete 26.36757 26.31618 1.002 0.316366
## NumPrepWorkshops -2.55186 1.98903 -1.283 0.199504
## StudentSuccessInitiativeYes 0.59664 0.84688 0.705 0.481110
## BarPrepMentorYes -4.02863 1.36571 -2.950 0.003179
## Age -0.07865 0.14412 -0.546 0.585243
## FGPA:PctBarPrepComplete -5.70865 8.83416 -0.646 0.518149
## FGPA:NumPrepWorkshops 0.83027 0.66560 1.247 0.212246
## FGPA:BarPrepHelix NA NA NA NA
## FGPA:BarPrepKaplan -5.08698 5.79570 -0.878 0.380098
## FGPA:BarPrepThemis -3.77535 2.63088 -1.435 0.151283
## ProbationY:StudentSuccessInitiativeYes -3.24184 1.91577 -1.692 0.090610
## ProbationY:BarPrepMentorYes -2.64625 2.15073 -1.230 0.218549
## ProbationY:LegalAnalysisY -6.27705 2.66977 -2.351 0.018715
## FGPA:AdvLegalPerfY -1.02776 3.16203 -0.325 0.745157
## AdvLegalAnalysisY:PctBarPrepComplete 2.63073 3.85218 0.683 0.494657
## LegalAnalysisY:BarPrepMentorYes 3.98768 1.50018 2.658 0.007858
## AccomY:BarPrepMentorYes 3.96404 4.25750 0.931 0.351817
## AccomY:NumPrepWorkshops 1.49991 0.83849 1.789 0.073645
## BarPrepHelix:Age NA NA NA NA
## BarPrepKaplan:Age 0.84048 0.80120 1.049 0.294164
## BarPrepThemis:Age 0.36413 0.15942 2.284 0.022366
## AdvLegalPerfY:Age -0.52593 0.20203 -2.603 0.009235
##
## (Intercept) ***
## LSAT ***
## UGPA
## FGPA *
## CivProB
## CivProB+
## CivProC
## CivProC+
## CivProD
## CivProD+
## CivProF
## LP1B **
## LP1B+
## LP1C
## LP1C+
## LP1D
## LP1D+ *
## LP1F
## LP2B
## LP2B+
## LP2C
## LP2C+
## LP2CR
## LP2D
## LP2D+
## AccomY *
## ProbationY **
## LegalAnalysisY *
## AdvLegalPerfY .
## AdvLegalAnalysisY
## BarPrepHelix
## BarPrepKaplan
## BarPrepThemis
## PctBarPrepComplete
## NumPrepWorkshops
## StudentSuccessInitiativeYes
## BarPrepMentorYes **
## Age
## FGPA:PctBarPrepComplete
## FGPA:NumPrepWorkshops
## FGPA:BarPrepHelix
## FGPA:BarPrepKaplan
## FGPA:BarPrepThemis
## ProbationY:StudentSuccessInitiativeYes .
## ProbationY:BarPrepMentorYes
## ProbationY:LegalAnalysisY *
## FGPA:AdvLegalPerfY
## AdvLegalAnalysisY:PctBarPrepComplete
## LegalAnalysisY:BarPrepMentorYes **
## AccomY:BarPrepMentorYes
## AccomY:NumPrepWorkshops .
## BarPrepHelix:Age
## BarPrepKaplan:Age
## BarPrepThemis:Age *
## AdvLegalPerfY:Age **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 130.56 on 395 degrees of freedom
## AIC: 236.56
##
## Number of Fisher Scoring iterations: 16
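Several rows in this output have standard errors in the thousands (for example `CivProD`, `CivProF`, `LP1F`, and `BarPrepHelix`), and two interaction terms are reported as `NA`. Patterns like these typically arise when a factor level contains very few students or when everyone in a level has the same outcome, although that cause is an assumption here and has not been verified against the data. A cross-tabulation is one simple way to check. The sketch below uses made-up data; `toy` and `tab` are illustrative names.

```r
# Toy data (made up): every student with grade "D" failed
toy <- data.frame(
  grade = factor(c("A", "A", "A", "B", "B", "B", "D", "D")),
  pass  = factor(c("P", "F", "P", "P", "F", "P", "F", "F"),
                 levels = c("F", "P"))
)

tab <- table(toy$grade, toy$pass)
tab

# Levels with an empty cell, i.e. all students share one outcome
rownames(tab)[rowSums(tab == 0) > 0]   # "D"

# For the fitted model in this report, the undefined terms could be listed with:
# names(coef(model))[is.na(coef(model))]
```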
Observations
To refine the model and improve its simplicity, a stepwise selection procedure was applied. Specifically, we used the backward elimination approach, which begins with the full model, including all main effects and theoretically justified interaction terms, and iteratively removes non-contributing predictors based on the Akaike Information Criterion (AIC). This process balances model fit and complexity, aiming to retain only the most informative variables.
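For a binary (0/1) response, the residual deviance equals minus twice the log-likelihood, so AIC is the residual deviance plus twice the number of estimated coefficients. The arithmetic below reproduces the AIC of the full model from the figures printed in its summary, and the AIC that `step()` reports for dropping `CivPro`. The variable names are illustrative; the numbers are copied from the output in this report.

```r
# Figures taken from the summary() output of the full model
residual_deviance <- 130.56
null_df           <- 447
residual_df       <- 395

# Observations minus residual df gives the number of estimated coefficients
n_params <- (null_df + 1) - residual_df        # 53

aic_full <- residual_deviance + 2 * n_params   # 236.56

# Dropping CivPro removes 7 coefficients; the step() trace reports
# a deviance of 140.01 for that candidate model
aic_without_civpro <- 140.01 + 2 * (n_params - 7)   # 232.01
```

Because 232.01 is lower than 236.56, removing `CivPro` is the first move the backward search makes.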
modelo_stepwise <- step(model, direction = "backward")
## Start: AIC=236.56
## Pass ~ LSAT + UGPA + FGPA + CivPro + LP1 + LP2 + Accom + Probation +
## LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis + BarPrep +
## PctBarPrepComplete + NumPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor + Age + PctBarPrepComplete * FGPA + NumPrepWorkshops *
## FGPA + BarPrep * FGPA + Probation * StudentSuccessInitiative +
## Probation * BarPrepMentor + Probation * LegalAnalysis + AdvLegalPerf *
## FGPA + AdvLegalAnalysis * PctBarPrepComplete + LegalAnalysis *
## BarPrepMentor + Accom * BarPrepMentor + Accom * NumPrepWorkshops +
## Age * BarPrep + Age * AdvLegalPerf
##
## Df Deviance AIC
## - CivPro 7 140.01 232.01
## - LP2 7 141.17 233.17
## - FGPA:AdvLegalPerf 1 130.67 234.67
## - FGPA:BarPrep 2 132.95 234.95
## - FGPA:PctBarPrepComplete 1 131.00 235.00
## - AdvLegalAnalysis:PctBarPrepComplete 1 131.03 235.03
## - Probation:BarPrepMentor 1 132.13 236.13
## - Accom:BarPrepMentor 1 132.16 236.16
## - UGPA 1 132.32 236.32
## - FGPA:NumPrepWorkshops 1 132.32 236.32
## <none> 130.56 236.56
## - LP1 7 145.35 237.35
## - Probation:StudentSuccessInitiative 1 134.07 238.07
## - BarPrep:Age 2 137.63 239.63
## - Probation:LegalAnalysis 1 136.06 240.06
## - AdvLegalPerf:Age 1 138.41 242.41
## - LegalAnalysis:BarPrepMentor 1 138.68 242.68
## - Accom:NumPrepWorkshops 1 138.97 242.97
## - LSAT 1 152.92 256.92
##
## Step: AIC=232.01
## Pass ~ LSAT + UGPA + FGPA + LP1 + LP2 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:PctBarPrepComplete + FGPA:NumPrepWorkshops + FGPA:BarPrep +
## Probation:StudentSuccessInitiative + Probation:BarPrepMentor +
## Probation:LegalAnalysis + FGPA:AdvLegalPerf + AdvLegalAnalysis:PctBarPrepComplete +
## LegalAnalysis:BarPrepMentor + Accom:BarPrepMentor + Accom:NumPrepWorkshops +
## BarPrep:Age + AdvLegalPerf:Age
##
## Df Deviance AIC
## - LP2 7 148.44 226.44
## - LP1 7 150.75 228.75
## - FGPA:BarPrep 2 141.20 229.20
## - FGPA:PctBarPrepComplete 1 140.34 230.34
## - FGPA:AdvLegalPerf 1 140.43 230.43
## - UGPA 1 140.59 230.59
## - AdvLegalAnalysis:PctBarPrepComplete 1 140.86 230.86
## - Accom:BarPrepMentor 1 141.52 231.52
## - FGPA:NumPrepWorkshops 1 141.56 231.56
## <none> 140.01 232.01
## - Probation:BarPrepMentor 1 142.38 232.38
## - BarPrep:Age 2 145.87 233.87
## - Probation:StudentSuccessInitiative 1 143.96 233.96
## - Probation:LegalAnalysis 1 145.54 235.54
## - AdvLegalPerf:Age 1 145.56 235.56
## - LegalAnalysis:BarPrepMentor 1 146.34 236.34
## - Accom:NumPrepWorkshops 1 147.84 237.84
## - LSAT 1 158.28 248.28
##
## Step: AIC=226.44
## Pass ~ LSAT + UGPA + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:PctBarPrepComplete + FGPA:NumPrepWorkshops + FGPA:BarPrep +
## Probation:StudentSuccessInitiative + Probation:BarPrepMentor +
## Probation:LegalAnalysis + FGPA:AdvLegalPerf + AdvLegalAnalysis:PctBarPrepComplete +
## LegalAnalysis:BarPrepMentor + Accom:BarPrepMentor + Accom:NumPrepWorkshops +
## BarPrep:Age + AdvLegalPerf:Age
##
## Df Deviance AIC
## - FGPA:BarPrep 2 149.11 223.11
## - UGPA 1 148.62 224.62
## - FGPA:AdvLegalPerf 1 148.79 224.79
## - FGPA:PctBarPrepComplete 1 148.90 224.90
## - AdvLegalAnalysis:PctBarPrepComplete 1 148.98 224.98
## - LP1 7 162.38 226.38
## <none> 148.44 226.44
## - FGPA:NumPrepWorkshops 1 150.77 226.77
## - Accom:BarPrepMentor 1 151.06 227.06
## - Probation:BarPrepMentor 1 151.66 227.66
## - BarPrep:Age 2 153.88 227.88
## - Probation:StudentSuccessInitiative 1 153.35 229.35
## - AdvLegalPerf:Age 1 153.88 229.88
## - LegalAnalysis:BarPrepMentor 1 156.29 232.29
## - Probation:LegalAnalysis 1 157.84 233.84
## - Accom:NumPrepWorkshops 1 158.56 234.56
## - LSAT 1 169.72 245.72
##
## Step: AIC=223.11
## Pass ~ LSAT + UGPA + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:PctBarPrepComplete + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
## Probation:BarPrepMentor + Probation:LegalAnalysis + FGPA:AdvLegalPerf +
## AdvLegalAnalysis:PctBarPrepComplete + LegalAnalysis:BarPrepMentor +
## Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
## AdvLegalPerf:Age
##
## Df Deviance AIC
## - FGPA:PctBarPrepComplete 1 149.25 221.25
## - UGPA 1 149.26 221.26
## - FGPA:AdvLegalPerf 1 149.45 221.45
## - AdvLegalAnalysis:PctBarPrepComplete 1 149.61 221.61
## <none> 149.11 223.11
## - FGPA:NumPrepWorkshops 1 151.22 223.22
## - LP1 7 163.50 223.50
## - Accom:BarPrepMentor 1 151.71 223.71
## - BarPrep:Age 2 153.90 223.90
## - Probation:BarPrepMentor 1 152.02 224.02
## - Probation:StudentSuccessInitiative 1 154.58 226.58
## - AdvLegalPerf:Age 1 154.64 226.64
## - LegalAnalysis:BarPrepMentor 1 157.01 229.01
## - Probation:LegalAnalysis 1 157.88 229.88
## - Accom:NumPrepWorkshops 1 159.02 231.02
## - LSAT 1 170.22 242.22
##
## Step: AIC=221.25
## Pass ~ LSAT + UGPA + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
## Probation:BarPrepMentor + Probation:LegalAnalysis + FGPA:AdvLegalPerf +
## AdvLegalAnalysis:PctBarPrepComplete + LegalAnalysis:BarPrepMentor +
## Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
## AdvLegalPerf:Age
##
## Df Deviance AIC
## - UGPA 1 149.42 219.42
## - AdvLegalAnalysis:PctBarPrepComplete 1 149.62 219.62
## - FGPA:AdvLegalPerf 1 149.64 219.64
## - FGPA:NumPrepWorkshops 1 151.23 221.23
## <none> 149.25 221.25
## - LP1 7 163.60 221.60
## - Accom:BarPrepMentor 1 151.90 221.90
## - BarPrep:Age 2 153.90 221.90
## - Probation:BarPrepMentor 1 152.12 222.12
## - AdvLegalPerf:Age 1 154.68 224.68
## - Probation:StudentSuccessInitiative 1 154.93 224.93
## - LegalAnalysis:BarPrepMentor 1 157.17 227.17
## - Probation:LegalAnalysis 1 157.89 227.89
## - Accom:NumPrepWorkshops 1 159.07 229.07
## - LSAT 1 170.28 240.28
##
## Step: AIC=219.42
## Pass ~ LSAT + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
## Probation:BarPrepMentor + Probation:LegalAnalysis + FGPA:AdvLegalPerf +
## AdvLegalAnalysis:PctBarPrepComplete + LegalAnalysis:BarPrepMentor +
## Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
## AdvLegalPerf:Age
##
## Df Deviance AIC
## - AdvLegalAnalysis:PctBarPrepComplete 1 149.83 217.83
## - FGPA:AdvLegalPerf 1 149.85 217.85
## - FGPA:NumPrepWorkshops 1 151.40 219.40
## <none> 149.42 219.42
## - LP1 7 163.67 219.67
## - Accom:BarPrepMentor 1 152.08 220.08
## - BarPrep:Age 2 154.37 220.37
## - Probation:BarPrepMentor 1 152.47 220.47
## - AdvLegalPerf:Age 1 154.81 222.81
## - Probation:StudentSuccessInitiative 1 155.16 223.16
## - LegalAnalysis:BarPrepMentor 1 157.51 225.51
## - Probation:LegalAnalysis 1 158.83 226.83
## - Accom:NumPrepWorkshops 1 159.37 227.37
## - LSAT 1 170.49 238.49
##
## Step: AIC=217.83
## Pass ~ LSAT + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
## Probation:BarPrepMentor + Probation:LegalAnalysis + FGPA:AdvLegalPerf +
## LegalAnalysis:BarPrepMentor + Accom:BarPrepMentor + Accom:NumPrepWorkshops +
## BarPrep:Age + AdvLegalPerf:Age
##
## Df Deviance AIC
## - AdvLegalAnalysis 1 149.97 215.97
## - FGPA:AdvLegalPerf 1 150.30 216.30
## <none> 149.83 217.83
## - FGPA:NumPrepWorkshops 1 151.96 217.96
## - LP1 7 164.28 218.28
## - Accom:BarPrepMentor 1 152.51 218.51
## - BarPrep:Age 2 155.01 219.01
## - Probation:BarPrepMentor 1 153.01 219.01
## - AdvLegalPerf:Age 1 154.82 220.82
## - Probation:StudentSuccessInitiative 1 155.28 221.28
## - LegalAnalysis:BarPrepMentor 1 157.97 223.97
## - Probation:LegalAnalysis 1 159.59 225.59
## - Accom:NumPrepWorkshops 1 159.67 225.67
## - LSAT 1 170.49 236.49
## - PctBarPrepComplete 1 185.01 251.01
##
## Step: AIC=215.97
## Pass ~ LSAT + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor + Age + FGPA:NumPrepWorkshops +
## Probation:StudentSuccessInitiative + Probation:BarPrepMentor +
## Probation:LegalAnalysis + FGPA:AdvLegalPerf + LegalAnalysis:BarPrepMentor +
## Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
## AdvLegalPerf:Age
##
## Df Deviance AIC
## - FGPA:AdvLegalPerf 1 150.43 214.43
## <none> 149.97 215.97
## - FGPA:NumPrepWorkshops 1 152.01 216.01
## - LP1 7 164.28 216.28
## - Accom:BarPrepMentor 1 152.75 216.75
## - BarPrep:Age 2 155.07 217.07
## - Probation:BarPrepMentor 1 153.46 217.46
## - AdvLegalPerf:Age 1 154.87 218.87
## - Probation:StudentSuccessInitiative 1 155.46 219.46
## - LegalAnalysis:BarPrepMentor 1 157.99 221.99
## - Accom:NumPrepWorkshops 1 159.72 223.72
## - Probation:LegalAnalysis 1 159.90 223.90
## - LSAT 1 170.69 234.69
## - PctBarPrepComplete 1 185.30 249.30
##
## Step: AIC=214.43
## Pass ~ LSAT + FGPA + LP1 + Accom + Probation + LegalAnalysis +
## AdvLegalPerf + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor + Age + FGPA:NumPrepWorkshops +
## Probation:StudentSuccessInitiative + Probation:BarPrepMentor +
## Probation:LegalAnalysis + LegalAnalysis:BarPrepMentor + Accom:BarPrepMentor +
## Accom:NumPrepWorkshops + BarPrep:Age + AdvLegalPerf:Age
##
## Df Deviance AIC
## <none> 150.43 214.43
## - FGPA:NumPrepWorkshops 1 152.55 214.55
## - LP1 7 165.14 215.14
## - Accom:BarPrepMentor 1 153.24 215.24
## - BarPrep:Age 2 155.93 215.93
## - Probation:BarPrepMentor 1 153.98 215.98
## - Probation:StudentSuccessInitiative 1 155.58 217.58
## - AdvLegalPerf:Age 1 156.93 218.93
## - LegalAnalysis:BarPrepMentor 1 158.13 220.13
## - Probation:LegalAnalysis 1 160.63 222.63
## - Accom:NumPrepWorkshops 1 160.66 222.66
## - LSAT 1 171.04 233.04
## - PctBarPrepComplete 1 186.82 248.82
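Each row of the trace above can be checked by hand. For a binomial GLM fitted to individual 0/1 outcomes, AIC equals the residual deviance plus twice the number of estimated coefficients. The sketch below reproduces two rows of the last step using only numbers printed in this output. The helper name `aic_from_deviance` is introduced here for illustration, and the count of 448 observations is inferred from the 447 null degrees of freedom reported in the model summary.

```r
# AIC for a binomial GLM on 0/1 responses: residual deviance + 2 * k,
# where k is the number of estimated coefficients.
# (aic_from_deviance is a hypothetical helper, not part of the original analysis.)
aic_from_deviance <- function(deviance, k) deviance + 2 * k

n_obs    <- 448               # null degrees of freedom (447) + 1
resid_df <- 416               # residual degrees of freedom of the selected model
k_final  <- n_obs - resid_df  # number of estimated coefficients

# Row "<none>": keep the selected model as it is
aic_final <- aic_from_deviance(150.43, k_final)

# Row "- LP1": dropping LP1 removes 7 coefficients and raises the deviance to 165.14
aic_drop_lp1 <- aic_from_deviance(165.14, k_final - 7)

print(c(k = k_final, final = aic_final, drop_LP1 = aic_drop_lp1))
```

Both values agree with the trace (214.43 and 215.14). Dropping LP1 would raise the AIC, which is why the backward search stops at this model.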
summary(modelo_stepwise)
##
## Call:
## glm(formula = Pass ~ LSAT + FGPA + LP1 + Accom + Probation +
## LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
## Probation:BarPrepMentor + Probation:LegalAnalysis + LegalAnalysis:BarPrepMentor +
## Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
## AdvLegalPerf:Age, family = binomial, data = data2)
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -77.26688 16.45683 -4.695 2.66e-06
## LSAT 0.32312 0.07873 4.104 4.06e-05
## FGPA 7.62862 1.90623 4.002 6.28e-05
## LP1B 2.45949 0.85710 2.870 0.004111
## LP1B+ 0.86828 0.84795 1.024 0.305847
## LP1C 2.25387 1.12478 2.004 0.045088
## LP1C+ 1.53055 0.82329 1.859 0.063018
## LP1D 12.77530 182.11723 0.070 0.944075
## LP1D+ 4.52225 1.75318 2.579 0.009896
## LP1F -6.86233 1455.39855 -0.005 0.996238
## AccomY -2.02271 0.74651 -2.710 0.006737
## ProbationY 8.93183 2.41907 3.692 0.000222
## LegalAnalysisY -2.05678 0.85430 -2.408 0.016060
## AdvLegalPerfY 10.24174 4.08860 2.505 0.012247
## BarPrepHelix 14.70974 1455.39790 0.010 0.991936
## BarPrepKaplan -23.65909 18.72741 -1.263 0.206467
## BarPrepThemis -4.76433 3.46276 -1.376 0.168860
## PctBarPrepComplete 9.73694 1.89435 5.140 2.75e-07
## NumPrepWorkshops -2.43445 1.73304 -1.405 0.160101
## StudentSuccessInitiativeYes 0.56483 0.76293 0.740 0.459097
## BarPrepMentorYes -3.14512 1.10756 -2.840 0.004516
## Age -0.10560 0.12115 -0.872 0.383411
## FGPA:NumPrepWorkshops 0.77463 0.57763 1.341 0.179908
## ProbationY:StudentSuccessInitiativeYes -3.46016 1.70536 -2.029 0.042460
## ProbationY:BarPrepMentorYes -3.49753 1.89029 -1.850 0.064275
## ProbationY:LegalAnalysisY -6.16617 2.07631 -2.970 0.002980
## LegalAnalysisY:BarPrepMentorYes 3.52039 1.32205 2.663 0.007749
## AccomY:BarPrepMentorYes 4.04478 2.93011 1.380 0.167458
## AccomY:NumPrepWorkshops 1.49695 0.82649 1.811 0.070106
## BarPrepHelix:Age NA NA NA NA
## BarPrepKaplan:Age 0.79700 0.68414 1.165 0.244031
## BarPrepThemis:Age 0.24211 0.11986 2.020 0.043388
## AdvLegalPerfY:Age -0.35209 0.14393 -2.446 0.014431
##
## (Intercept) ***
## LSAT ***
## FGPA ***
## LP1B **
## LP1B+
## LP1C *
## LP1C+ .
## LP1D
## LP1D+ **
## LP1F
## AccomY **
## ProbationY ***
## LegalAnalysisY *
## AdvLegalPerfY *
## BarPrepHelix
## BarPrepKaplan
## BarPrepThemis
## PctBarPrepComplete ***
## NumPrepWorkshops
## StudentSuccessInitiativeYes
## BarPrepMentorYes **
## Age
## FGPA:NumPrepWorkshops
## ProbationY:StudentSuccessInitiativeYes *
## ProbationY:BarPrepMentorYes .
## ProbationY:LegalAnalysisY **
## LegalAnalysisY:BarPrepMentorYes **
## AccomY:BarPrepMentorYes
## AccomY:NumPrepWorkshops .
## BarPrepHelix:Age
## BarPrepKaplan:Age
## BarPrepThemis:Age *
## AdvLegalPerfY:Age *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 150.43 on 416 degrees of freedom
## AIC: 214.43
##
## Number of Fisher Scoring iterations: 14
model_final <- glm(
  formula = Pass ~ LSAT + FGPA + LP1 + Accom + Probation +
    LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
    NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
    Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
    Probation:BarPrepMentor + Probation:LegalAnalysis + LegalAnalysis:BarPrepMentor +
    Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
    AdvLegalPerf:Age,
  family = binomial, data = data2
)
summary(model_final)
##
## Call:
## glm(formula = Pass ~ LSAT + FGPA + LP1 + Accom + Probation +
## LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
## NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
## Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
## Probation:BarPrepMentor + Probation:LegalAnalysis + LegalAnalysis:BarPrepMentor +
## Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
## AdvLegalPerf:Age, family = binomial, data = data2)
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -77.26688 16.45683 -4.695 2.66e-06
## LSAT 0.32312 0.07873 4.104 4.06e-05
## FGPA 7.62862 1.90623 4.002 6.28e-05
## LP1B 2.45949 0.85710 2.870 0.004111
## LP1B+ 0.86828 0.84795 1.024 0.305847
## LP1C 2.25387 1.12478 2.004 0.045088
## LP1C+ 1.53055 0.82329 1.859 0.063018
## LP1D 12.77530 182.11723 0.070 0.944075
## LP1D+ 4.52225 1.75318 2.579 0.009896
## LP1F -6.86233 1455.39855 -0.005 0.996238
## AccomY -2.02271 0.74651 -2.710 0.006737
## ProbationY 8.93183 2.41907 3.692 0.000222
## LegalAnalysisY -2.05678 0.85430 -2.408 0.016060
## AdvLegalPerfY 10.24174 4.08860 2.505 0.012247
## BarPrepHelix 14.70974 1455.39790 0.010 0.991936
## BarPrepKaplan -23.65909 18.72741 -1.263 0.206467
## BarPrepThemis -4.76433 3.46276 -1.376 0.168860
## PctBarPrepComplete 9.73694 1.89435 5.140 2.75e-07
## NumPrepWorkshops -2.43445 1.73304 -1.405 0.160101
## StudentSuccessInitiativeYes 0.56483 0.76293 0.740 0.459097
## BarPrepMentorYes -3.14512 1.10756 -2.840 0.004516
## Age -0.10560 0.12115 -0.872 0.383411
## FGPA:NumPrepWorkshops 0.77463 0.57763 1.341 0.179908
## ProbationY:StudentSuccessInitiativeYes -3.46016 1.70536 -2.029 0.042460
## ProbationY:BarPrepMentorYes -3.49753 1.89029 -1.850 0.064275
## ProbationY:LegalAnalysisY -6.16617 2.07631 -2.970 0.002980
## LegalAnalysisY:BarPrepMentorYes 3.52039 1.32205 2.663 0.007749
## AccomY:BarPrepMentorYes 4.04478 2.93011 1.380 0.167458
## AccomY:NumPrepWorkshops 1.49695 0.82649 1.811 0.070106
## BarPrepHelix:Age NA NA NA NA
## BarPrepKaplan:Age 0.79700 0.68414 1.165 0.244031
## BarPrepThemis:Age 0.24211 0.11986 2.020 0.043388
## AdvLegalPerfY:Age -0.35209 0.14393 -2.446 0.014431
##
## (Intercept) ***
## LSAT ***
## FGPA ***
## LP1B **
## LP1B+
## LP1C *
## LP1C+ .
## LP1D
## LP1D+ **
## LP1F
## AccomY **
## ProbationY ***
## LegalAnalysisY *
## AdvLegalPerfY *
## BarPrepHelix
## BarPrepKaplan
## BarPrepThemis
## PctBarPrepComplete ***
## NumPrepWorkshops
## StudentSuccessInitiativeYes
## BarPrepMentorYes **
## Age
## FGPA:NumPrepWorkshops
## ProbationY:StudentSuccessInitiativeYes *
## ProbationY:BarPrepMentorYes .
## ProbationY:LegalAnalysisY **
## LegalAnalysisY:BarPrepMentorYes **
## AccomY:BarPrepMentorYes
## AccomY:NumPrepWorkshops .
## BarPrepHelix:Age
## BarPrepKaplan:Age
## BarPrepThemis:Age *
## AdvLegalPerfY:Age *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.60 on 447 degrees of freedom
## Residual deviance: 150.43 on 416 degrees of freedom
## AIC: 214.43
##
## Number of Fisher Scoring iterations: 14
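Several rows in the table above have very large standard errors (LP1D at about 182, LP1F and BarPrepHelix at about 1455), the BarPrepHelix:Age coefficient is not estimable, and fitting took 14 Fisher scoring iterations. This pattern is often seen when a factor level contains very few students or only one outcome, although the output alone does not confirm that cause. A minimal sketch of the usual check, cross-tabulating a factor against the outcome, is shown below. The toy data frame and its level names (`A`, `B`, `C`) are hypothetical stand-ins for `data2$BarPrep` and `data2$Pass`.

```r
# Hypothetical toy data: 'prep' stands in for BarPrep, 'pass' for Pass.
toy <- data.frame(
  prep = factor(c("A", "A", "A", "A", "B", "B", "B", "C")),
  pass = factor(c("P", "F", "P", "P", "P", "F", "P", "P"),
                levels = c("F", "P"))
)

# Count passes and failures within each factor level.
counts <- table(toy$prep, toy$pass)
print(counts)

# Flag levels in which one of the two outcomes never occurs.
# Dummy coefficients for such levels tend to diverge during fitting.
sparse_levels <- rownames(counts)[apply(counts == 0, 1, any)]
print(sparse_levels)
```

On the real data the same two lines would be applied as `table(data2$BarPrep, data2$Pass)` and `table(data2$LP1, data2$Pass)`. Levels flagged this way are candidates for merging with a neighbouring category before refitting.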
Observations
Stepwise Logistic Regression Model:
Key insights from the final model:
Marginal Predictors (p between 0.05 and 0.10): LP1C+, Accom × NumPrepWorkshops, Probation × BarPrepMentor
Removed variables: The procedure eliminated UGPA, CivPro, LP2, AdvLegalAnalysis, and several interaction terms (e.g., FGPA × BarPrep, FGPA × AdvLegalPerf), suggesting that they add little predictive value beyond the retained terms in the current dataset.
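Model Quality: The residual deviance falls from 317.60 for the null model (447 degrees of freedom) to 150.43 for the final model (416 degrees of freedom), with a final AIC of 214.43.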
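The fit statistics printed in the model summary can be turned into two common summary measures: McFadden's pseudo-R² and a likelihood-ratio test against the intercept-only model. The sketch below uses only the deviances and degrees of freedom reported in the output; the variable names are chosen here for illustration. Because the terms were chosen by stepwise selection on the same data, the resulting p-value should be read as optimistic.

```r
# Values copied from the summary of the final model
null_dev  <- 317.60; null_df  <- 447
resid_dev <- 150.43; resid_df <- 416

# McFadden's pseudo-R^2. For 0/1 responses the deviance equals -2 * logLik,
# so the log-likelihood ratio reduces to a ratio of deviances.
mcfadden_r2 <- 1 - resid_dev / null_dev

# Likelihood-ratio test of the final model against the intercept-only model
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(round(c(mcfadden_r2 = mcfadden_r2, lr_stat = lr_stat, lr_df = lr_df), 3))
print(lr_p)
```

The pseudo-R² comes out at roughly 0.53, and the deviance reduction of 167.17 on 31 degrees of freedom is far larger than chance alone would produce.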
# SECTION 1: EXPLORATORY DATA ANALYSIS (EDA) ----------------------------------------------------------
# DATA LOADING AND INITIAL ANALYSIS
# Create the dataframe
url <- "https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/Updated_Bar_Data_For_Review_Final.csv"
data <- read.csv(url)
head(data)  # preview the first rows instead of printing the full data frame
# Initial Data Exploration
str(data)
summary(data)
colSums(is.na(data))
# Define the Target Variable (PassFail)
data$PassFail <- factor(data$PassFail, levels = c("F", "P"))
# Convert Key Categorical Variables to Factors
categorical_vars <- c(
  "Accommodations", "Probation",
  "LegalAnalysis_TexasPractice",
  "AdvLegalPerfSkills", "AdvLegalAnalysis",
  "BarPrepCompany", "StudentSuccessInitiative", "BarPrepMentor",
  "CivPro", "LPI", "LPII"
)
data[categorical_vars] <- lapply(data[categorical_vars], factor)
# Check Levels of Categorical Variables
str(data)
sapply(data[categorical_vars], levels)
# Check Number of Unique Values for Numerical Variables
sapply(
  data[, c("LSAT", "UGPA", "GPA_1L", "GPA_Final", "FinalRankPercentile",
           "BarPrepCompletion", "X.LawSchoolBarPrepWorkshops",
           "MPRE", "MPT", "MEE", "MBE", "UBE")],
  function(x) length(unique(x))
)
# DATA FILTERING AND PREPARATION
# Clean Probation Variable
data$Probation <- trimws(data$Probation)
data$Probation <- factor(data$Probation)
# Replace Empty Strings in BarPrepCompany
levels(data$BarPrepCompany) <- c(levels(data$BarPrepCompany), "None")
data$BarPrepCompany[data$BarPrepCompany == ""] <- "None"
data$BarPrepCompany <- factor(data$BarPrepCompany)
# Handle Missing Values in CivPro, LPI, and LPII
data$CivPro[data$CivPro == ""] <- NA
data$LPI[data$LPI == ""] <- NA
data$LPII[data$LPII == ""] <- NA
data$CivPro <- factor(data$CivPro)
data$LPI <- factor(data$LPI)
data$LPII <- factor(data$LPII)
# Create Binary Variables for Mentor and Student Support Participation
data$HadMentor <- ifelse(data$BarPrepMentor == "N", "No", "Yes")
data$HadMentor <- factor(data$HadMentor)
data$StudentSuccessParticipated <- ifelse(data$StudentSuccessInitiative == "N", "No", "Yes")
data$StudentSuccessParticipated <- as.factor(data$StudentSuccessParticipated)
# Create Filtered Dataset data2 by Selecting Relevant Variables
# Select and Rename Relevant Variables
data2 <- data[c(
  "Year", "PassFail", "Age", "LSAT", "UGPA",
  "CivPro", "LPI", "LPII", "GPA_1L", "GPA_Final", "FinalRankPercentile",
  "Accommodations", "Probation",
  "LegalAnalysis_TexasPractice", "AdvLegalPerfSkills", "AdvLegalAnalysis",
  "BarPrepCompany", "BarPrepCompletion",
  "OptIntoWritingGuide", "X.LawSchoolBarPrepWorkshops",
  "MPRE", "MPT", "MEE", "WrittenScaledScore", "MBE", "UBE",
  "StudentSuccessParticipated", "HadMentor"
)]
# Rename Variables in data2 According to Provided Naming Convention
colnames(data2) <- c(
  "Class", "Pass", "Age", "LSAT", "UGPA",
  "CivPro", "LP1", "LP2", "OneCum", "FGPA", "FinalRankPercentile",
  "Accom", "Probation",
  "LegalAnalysis", "AdvLegalPerf", "AdvLegalAnalysis",
  "BarPrep", "PctBarPrepComplete",
  "OptIntoWritingGuide", "NumPrepWorkshops",
  "MPRE", "MPT", "MEE", "WrittenScaledScore", "MBE", "UBE",
  "StudentSuccessInitiative", "BarPrepMentor"
)
# Assess Missing Values
colSums(is.na(data2))
# Remove unnecessary columns: MPRE, OptIntoWritingGuide, WrittenScaledScore and FinalRankPercentile
data2 <- subset(data2, select = -c(MPRE, OptIntoWritingGuide, WrittenScaledScore, FinalRankPercentile))
# Remove rows with missing values in critical academic and preparation variables
critical_vars <- c("CivPro", "LP1", "LP2", "OneCum", "PctBarPrepComplete")
data2 <- data2[complete.cases(data2[, critical_vars]), ]
# Verify that all missing values are removed
colSums(is.na(data2))
# CORRELATION ANALYSIS
# Correlation Matrix Visualization
library(ggcorrplot)  # provides ggcorrplot(); depends on ggplot2
numeric_vars <- data2[, sapply(data2, is.numeric)]
# Pearson correlations, using only rows that are complete across all numeric columns
cor_matrix <- cor(numeric_vars, use = "complete.obs")
ggcorrplot(cor_matrix,
  lab = TRUE,
  lab_size = 4,
  colors = c("red", "white", "#4A90E2"),
  outline.color = "black",
  show.legend = TRUE,
  title = "Correlation Matrix of Numeric Variables",
  ggtheme = ggplot2::theme_minimal()
)
# -----------------------------------------------------------------------------------------------------
# SECTION 2: MODEL IMPLEMENTATION ---------------------------------------------------------------------
# Initial model
model <- glm(
  Pass ~ LSAT + UGPA + FGPA +
    CivPro + LP1 + LP2 +
    Accom + Probation +
    LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
    BarPrep + PctBarPrepComplete + NumPrepWorkshops +
    StudentSuccessInitiative + BarPrepMentor + Age +
    PctBarPrepComplete * FGPA +
    NumPrepWorkshops * FGPA +
    BarPrep * FGPA +
    Probation * StudentSuccessInitiative +
    Probation * BarPrepMentor +
    Probation * LegalAnalysis +
    AdvLegalPerf * FGPA +
    AdvLegalAnalysis * PctBarPrepComplete +
    LegalAnalysis * BarPrepMentor +
    Accom * BarPrepMentor +
    Accom * NumPrepWorkshops +
    Age * BarPrep +
    Age * AdvLegalPerf,
  family = binomial,
  data = data2
)
summary(model)
# MODEL REFINEMENT USING STEPWISE SELECTION
modelo_stepwise <- step(model, direction = "backward")
summary(modelo_stepwise)
# Final Model
model_final <- glm(
  formula = Pass ~ LSAT + FGPA + LP1 + Accom + Probation +
    LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
    NumPrepWorkshops + StudentSuccessInitiative + BarPrepMentor +
    Age + FGPA:NumPrepWorkshops + Probation:StudentSuccessInitiative +
    Probation:BarPrepMentor + Probation:LegalAnalysis + LegalAnalysis:BarPrepMentor +
    Accom:BarPrepMentor + Accom:NumPrepWorkshops + BarPrep:Age +
    AdvLegalPerf:Age,
  family = binomial, data = data2
)
summary(model_final)
# -----------------------------------------------------------------------------------------------------
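The coefficients reported by `summary(model_final)` are on the log-odds scale, so exponentiating them gives odds ratios that are easier to read. The sketch below does this for LSAT and FGPA, two predictors that enter the final model without a retained interaction involving only them (FGPA also appears in FGPA:NumPrepWorkshops, so its odds ratio applies at zero workshops). The estimates and standard errors are copied from the printed table. The increments (1 LSAT point, 0.1 grade point) and the variable names are assumptions made here for illustration.

```r
# Estimates and standard errors copied from the final model's coefficient table
est <- c(LSAT = 0.32312, FGPA = 7.62862)
se  <- c(LSAT = 0.07873, FGPA = 1.90623)

# Assumed increments: 1 LSAT point and 0.1 grade point
increment <- c(LSAT = 1, FGPA = 0.1)

# Odds ratio per increment, with 95% Wald intervals
z          <- qnorm(0.975)
odds_ratio <- exp(est * increment)
wald_lower <- exp((est - z * se) * increment)
wald_upper <- exp((est + z * se) * increment)

print(round(cbind(odds_ratio, wald_lower, wald_upper), 3))
```

Under these assumptions, each additional LSAT point multiplies the odds of passing by about 1.38, and each additional 0.1 in final GPA multiplies them by about 2.14, holding the other terms fixed. With the fitted object available, `exp(coef(model_final))` gives the same point estimates for a one-unit change in every term.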