This assignment uses data from TTU Law School to explore what factors are related to bar exam performance. The data set includes information on students’ background, their academic performance during law school, and their involvement in bar preparation activities. The goal is to better understand what might help students achieve higher scores and successfully pass the bar exam.
Passing the bar exam is an important step for law students, so it is useful to understand which factors are connected to success. This can help the school figure out where students might need more support and what kinds of resources are most helpful.
In this analysis, I focus on two outcomes: the overall UBE score and whether a student passes or fails. I will use both linear regression and logistic regression to study these outcomes and see how different variables relate to them.
I will look at a mix of factors, including things like LSAT scores, GPA, and different types of bar preparation activities. I also want to see how academic challenges during law school might be related to bar exam results.
Before doing the analysis, I expect that students with stronger academic performance and those who spend more time on bar preparation will generally do better on the exam. Students who had academic difficulties may have lower scores or a lower chance of passing.
The analysis will include exploring the data, fitting models, and checking whether the models are reasonable. The main goal is to understand the results clearly and use them to suggest practical ways the law school could help improve student outcomes.
Several variables were cleaned and reformatted before modeling. Letter grade variables (CivPro, LPI, and LPII) were converted to numeric values using a standard 4.0 GPA scale. Binary variables (Probation, Accommodations, and the participation indicators) were converted to factors. Categorical variables such as BarPrepCompany were also treated as factors.
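The recoding described above can be sketched as follows. This is a minimal illustration, not the code used for the report: the report only says "standard 4.0 GPA scale", so the plus/minus point values in `grade_points` are an assumption, and the helper names `letter_to_points` and `yn_to_factor` are hypothetical.

```r
# Assumed grade-point map; the exact plus/minus values are not given in the report.
grade_points <- c(
  "A"  = 4.00, "A-" = 3.67,
  "B+" = 3.33, "B"  = 3.00, "B-" = 2.67,
  "C+" = 2.33, "C"  = 2.00, "C-" = 1.67,
  "D+" = 1.33, "D"  = 1.00, "D-" = 0.67,
  "F"  = 0.00
)

# Convert letter grades to numeric grade points.
# Blank or unrecognized grades become NA.
letter_to_points <- function(x) {
  unname(grade_points[trimws(as.character(x))])
}

# Convert a "Y"/"N" character column to a factor with "N" as the reference level.
yn_to_factor <- function(x) {
  factor(x, levels = c("N", "Y"))
}

# Small illustrative input: the first four CivPro and Probation values
# shown in the str() output.
civpro    <- c("B+", "B+", "C", "D+")
probation <- c("N", "Y", "N", "Y")

civpro_num  <- letter_to_points(civpro)   # 3.33 3.33 2.00 1.33
probation_f <- yn_to_factor(probation)
```

Applied to the full data, the same helpers would produce columns such as `LPII_Num`, the name that appears later in the logistic model formula.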
## 'data.frame': 600 obs. of 28 variables:
## $ Year : int 2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
## $ PassFail : chr "F" "F" "F" "F" ...
## $ Age : num 29.1 29.6 29 36.2 28.9 30.8 29.1 42.9 28.3 27.1 ...
## $ LSAT : int 152 155 157 156 145 154 149 160 152 150 ...
## $ UGPA : chr "3.42" "2.82" "3.46" "3.13" ...
## $ CivPro : chr "B+" "B+" "C" "D+" ...
## $ LPI : chr "A" "B" "B" "C" ...
## $ LPII : chr "A" "B" "B" "C+" ...
## $ GPA_1L : num 3.21 2.43 2.62 2.27 2.29 ...
## $ GPA_Final : num 3.29 3.2 2.91 2.77 2.9 2.82 3 3.09 3.21 2.74 ...
## $ FinalRankPercentile : num 0.46 0.33 0.08 0.02 0.08 0.05 0.15 0.22 0.34 0.01 ...
## $ Accommodations : chr "N" "Y" "N" "N" ...
## $ Probation : chr "N" "Y" "N" "Y" ...
## $ LegalAnalysis_TexasPractice: chr "Y" "Y" "Y" "Y" ...
## $ AdvLegalPerfSkills : chr "Y" "Y" "Y" "Y" ...
## $ AdvLegalAnalysis : chr "Y" "Y" "Y" "Y" ...
## $ BarPrepCompany : chr "Barbri" "Barbri" "Barbri" "Barbri" ...
## $ BarPrepCompletion : num 0.96 0.98 0.48 1 0.77 0.02 0.9 0.76 0.77 0.88 ...
## $ OptIntoWritingGuide : chr "" "" "" "" ...
## $ X.LawSchoolBarPrepWorkshops: int 3 0 3 0 5 1 5 5 1 5 ...
## $ StudentSuccessInitiative : chr "N" "Cochran" "Smith" "Baldwin" ...
## $ BarPrepMentor : chr "N" "N" "N" "N" ...
## $ MPRE : num 103 76 99 81 99 NA 90 97 100 78 ...
## $ MPT : num 3 3 3 2.5 3.5 3 2.5 2.5 3 2.5 ...
## $ MEE : num 2.67 3.17 2.67 3 2.67 2 3.5 3 2.67 3.83 ...
## $ WrittenScaledScore : num 126 133 126 126 130 ...
## $ MBE : num 133 133 118 140 125 ...
## $ UBE : num 259 266 244 266 256 ...
## Warning: NAs introduced by coercion
Before building the models, I explored the relationships between the response variables and the potential predictors. This step helps give an initial understanding of the data and shows which variables may be more strongly related to the outcomes.
To do this, I used simple scatterplots for numeric variables and boxplots when comparing numeric variables with the pass/fail outcome. These plots helped visualize patterns, trends, and differences between groups.
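A minimal sketch of these exploratory plots, using base R graphics. The function name `explore_plots` is hypothetical, GPA_Final is used only as an example predictor, and `df` is assumed to contain the columns listed in the `str()` output.

```r
# Sketch of the exploratory plots for one numeric predictor.
# `df` is assumed to have the columns UBE, GPA_Final, and PassFail.
explore_plots <- function(df) {
  # Scatterplot: numeric predictor against the UBE score
  plot(df$GPA_Final, df$UBE,
       xlab = "Final GPA", ylab = "UBE score",
       main = "UBE vs. final GPA")

  # Boxplot: distribution of the numeric predictor by pass/fail outcome
  boxplot(GPA_Final ~ PassFail, data = df,
          xlab = "Bar result", ylab = "Final GPA",
          main = "Final GPA by pass/fail")

  # Return the correlation (complete cases only) as a numeric summary
  invisible(cor(df$GPA_Final, df$UBE, use = "complete.obs"))
}

# Usage (not run here): r <- explore_plots(df)
```

The same pattern can be repeated for LSAT, BarPrepCompletion, and the other numeric variables.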
In this section, I build several regression models to understand what factors are related to bar exam performance. I use linear regression to examine how different variables are associated with UBE scores and logistic regression to examine how they are associated with the probability of passing the exam.
The models include variables related to academic performance as well as bar preparation activities. I also try different combinations of variables to see how the results change and which factors seem most important.
The first linear regression model was built to examine how GPA_Final, LSAT, and BarPrepCompletion are related to UBE scores. These variables were selected based on their strong relationships observed in the premodeling analysis and their logical importance.
GPA_Final reflects academic performance during law school, LSAT captures pre-law ability, and BarPrepCompletion represents effort in preparing for the exam. This model provides a simple baseline for understanding how ability and preparation are associated with bar exam performance.
Diagnostic plots were used to assess model adequacy and showed no major violations of model assumptions. Variance Inflation Factors (VIF) were also checked and indicated no concerns with multicollinearity.
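The fit and checks described above could look roughly like this. The formula is taken from the model output in the report; the function names are hypothetical, and the use of `car::vif` is an assumption based on the `car` package's companion data package being loaded.

```r
# Fit the baseline linear model (formula as shown in the summary output).
fit_baseline <- function(df) {
  lm(UBE ~ GPA_Final + LSAT + BarPrepCompletion, data = df)
}

# Draw the four standard lm diagnostic plots and return the VIFs.
check_model <- function(fit) {
  op <- par(mfrow = c(2, 2))
  on.exit(par(op))
  plot(fit)  # residuals vs fitted, normal Q-Q, scale-location, leverage

  # car::vif is only called if the car package is installed
  if (requireNamespace("car", quietly = TRUE)) car::vif(fit) else NULL
}

# Usage (not run here):
# fit1 <- fit_baseline(df)
# summary(fit1)
# check_model(fit1)
```

VIF values close to 1, as in the output below, suggest that the predictors are not strongly collinear.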
##
## Call:
## lm(formula = UBE ~ GPA_Final + LSAT + BarPrepCompletion, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -56.794 -11.403 0.165 10.807 52.612
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1381 28.7965 -0.005 0.996
## GPA_Final 32.0421 2.3118 13.860 < 2e-16 ***
## LSAT 1.0529 0.1825 5.769 1.31e-08 ***
## BarPrepCompletion 30.0545 4.8531 6.193 1.13e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.71 on 570 degrees of freedom
## (26 observations deleted due to missingness)
## Multiple R-squared: 0.3825, Adjusted R-squared: 0.3793
## F-statistic: 117.7 on 3 and 570 DF, p-value: < 2.2e-16
## GPA_Final LSAT BarPrepCompletion
## 1.104345 1.035072 1.097704
The results of the first linear regression model show that GPA_Final, LSAT, and BarPrepCompletion are all statistically significant predictors of UBE scores.
GPA_Final has a strong positive association with UBE: each one-point increase in final GPA is associated with an estimated increase of about 32 points in UBE score, holding LSAT and bar prep completion constant. LSAT is also positively related to UBE, with each additional LSAT point associated with about 1.05 more UBE points. Similarly, BarPrepCompletion shows a positive association. Because it is measured as a proportion from 0 to 1, its coefficient of about 30 means that a 10-percentage-point increase in completion is associated with about 3 more UBE points.
The model explains approximately 38% of the variation in UBE scores (R-squared = 0.3825), suggesting a moderate fit. The overall model is statistically significant (F = 117.7 on 3 and 570 degrees of freedom, p < 2.2e-16), indicating that the predictors, taken together, are useful in explaining bar exam performance.
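To make the coefficients concrete, the following sketch plugs them into the fitted equation for one hypothetical student. The coefficients are copied from the Model 1 output; the student profile is invented for illustration and does not come from the data.

```r
# Coefficients copied from the Model 1 summary output
b <- c(intercept = -0.1381, GPA_Final = 32.0421,
       LSAT = 1.0529, BarPrepCompletion = 30.0545)

# Hypothetical student profile (illustrative values only)
student <- c(GPA_Final = 3.0, LSAT = 155, BarPrepCompletion = 0.80)

# Predicted UBE = intercept + sum of coefficient * value
predicted_ube <- b[["intercept"]] + sum(b[names(student)] * student)
# about 283.2

# Change in predicted UBE if completion rises from 0.80 to 0.90,
# with GPA and LSAT held fixed
completion_gain <- b[["BarPrepCompletion"]] * 0.10
# about 3.0
```

Because BarPrepCompletion is a proportion, a 0.10 step is a more natural unit for interpretation than a full one-unit change from 0 to 1.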
In the second model, additional variables were included to explore whether factors beyond basic academic performance could help explain UBE scores. Initially, some multicollinearity was observed, as indicated by higher VIF values. To address this, the bar preparation variable was centered, which helped stabilize the model and reduce multicollinearity.
Model adequacy was then assessed using diagnostic plots, which did not show any major violations of assumptions such as non-linearity or unequal variance. After making these adjustments, a revised model was developed that included GPA, LSAT, centered bar preparation completion, participation in advanced legal courses, probation status and its interaction with bar preparation completion, along with year effects.
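A sketch of the centering step and the revised model. The formula is taken from the model output in the report. Mean-centering is an assumption, since the report says only that the variable "was centered", and the function names are hypothetical.

```r
# Mean-center bar prep completion (assumed form of centering).
# Missing values stay missing.
add_centered_completion <- function(df) {
  df$BarPrepCompletion_c <-
    df$BarPrepCompletion - mean(df$BarPrepCompletion, na.rm = TRUE)
  df
}

# Fit the extended model; Year is treated as a factor so that each
# cohort gets its own coefficient relative to the first year.
fit_extended <- function(df) {
  df <- add_centered_completion(df)
  df$Year <- factor(df$Year)
  lm(UBE ~ GPA_Final + LSAT + AdvLegalAnalysis + AdvLegalPerfSkills +
       Year + Probation + BarPrepCompletion_c +
       Probation:BarPrepCompletion_c,
     data = df)
}

# Usage (not run here): fit2 <- fit_extended(df); summary(fit2)
```

Centering changes only the location of the variable, so its slope keeps the same meaning. What changes is the meaning of the Probation main effect, which then refers to a student with average completion rather than zero completion, and the correlation between the main effects and the interaction term is typically reduced.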
##
## Call:
## lm(formula = UBE ~ GPA_Final + LSAT + AdvLegalAnalysis + AdvLegalPerfSkills +
## Year + Probation + BarPrepCompletion_c + Probation:BarPrepCompletion_c,
## data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -55.065 -10.254 0.039 9.863 53.989
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 38.9666 28.5051 1.367 0.172171
## GPA_Final 34.7367 2.4149 14.384 < 2e-16 ***
## LSAT 0.8355 0.1829 4.567 6.08e-06 ***
## AdvLegalAnalysisY 4.1558 1.8187 2.285 0.022682 *
## AdvLegalPerfSkillsY 5.4018 2.4679 2.189 0.029021 *
## Year2022 -4.5134 2.2376 -2.017 0.044169 *
## Year2023 11.0679 3.1490 3.515 0.000476 ***
## Year2024 10.0439 3.0515 3.291 0.001059 **
## Year2025 15.3380 3.0575 5.016 7.07e-07 ***
## ProbationY -0.9593 2.7617 -0.347 0.728456
## BarPrepCompletion_c 30.0509 4.8428 6.205 1.06e-09 ***
## ProbationY:BarPrepCompletion_c -36.1442 19.9433 -1.812 0.070466 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.1 on 562 degrees of freedom
## (26 observations deleted due to missingness)
## Multiple R-squared: 0.4345, Adjusted R-squared: 0.4234
## F-statistic: 39.25 on 11 and 562 DF, p-value: < 2.2e-16
## GVIF Df GVIF^(1/(2*Df))
## GPA_Final 1.297217 1 1.138954
## LSAT 1.119586 1 1.058105
## AdvLegalAnalysis 1.756073 1 1.325169
## AdvLegalPerfSkills 3.349535 1 1.830173
## Year 3.774343 4 1.180607
## Probation 1.220033 1 1.104551
## BarPrepCompletion_c 1.176752 1 1.084782
## Probation:BarPrepCompletion_c 1.144500 1 1.069813
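In the second linear model, GPA_Final, LSAT, centered BarPrepCompletion, Advanced Legal Analysis, Advanced Legal Performance Skills, Year, Probation, and a Probation-by-BarPrepCompletion interaction were included to explain UBE scores. GPA_Final, LSAT, and BarPrepCompletion remained important predictors of UBE scores. Students with higher final GPA, higher LSAT scores, and higher bar prep completion tended to have higher UBE scores. Probation itself was not significant (p = 0.73), and its interaction with bar prep completion was only marginally significant (p = 0.07).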
The model also showed that participation in Advanced Legal Analysis and Advanced Legal Performance Skills was positively associated with UBE scores. This is important because these are variables related to coursework and preparation that the law school may be able to influence. Some year effects were also significant, suggesting that UBE scores differed across cohorts.
Recommendation
Based on this model, the law school should continue encouraging students to complete more of their bar preparation program. The results also suggest that students may benefit from taking Advanced Legal Analysis and Advanced Legal Performance Skills, so the school could promote these courses more strongly or recommend them for students who may need additional bar support.
A practical recommendation is to identify students with lower GPA or lower preparation engagement early and encourage them to complete bar prep activities and enroll in bar-aligned courses before the exam.
## Analysis of Variance Table
##
## Model 1: UBE ~ GPA_Final + LSAT + BarPrepCompletion
## Model 2: UBE ~ GPA_Final + LSAT + AdvLegalAnalysis + AdvLegalPerfSkills +
## Year + Probation + BarPrepCompletion_c + Probation:BarPrepCompletion_c
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 570 159085
## 2 562 145700 8 13385 6.4536 4.761e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
An ANOVA test was conducted to compare the baseline linear model (Model 1) with the extended model (Model 2). The results show a statistically significant improvement in model fit (F = 6.45 on 8 and 562 degrees of freedom, p = 4.76e-08), indicating that the additional variables included in Model 2 contribute to explaining variation in UBE scores.
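The F statistic in the ANOVA table can be reproduced by hand from the residual sums of squares and degrees of freedom reported there. This is a worked check using only numbers from the table, not additional analysis.

```r
# Values copied from the ANOVA table
rss1 <- 159085; res_df1 <- 570   # Model 1
rss2 <- 145700; res_df2 <- 562   # Model 2

# Partial F test for nested linear models:
# (drop in RSS per added parameter) / (residual variance of the larger model)
num_df <- res_df1 - res_df2                       # 8 added parameters
f_stat <- ((rss1 - rss2) / num_df) / (rss2 / res_df2)
# about 6.45, matching the table

# Upper-tail probability under the F distribution
p_value <- pf(f_stat, num_df, res_df2, lower.tail = FALSE)
```

The comparison is valid here because Model 1 is nested in Model 2 and both were fitted to the same 574 complete observations (each summary reports 26 observations deleted).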
A logistic regression model was used to examine factors associated with the probability of passing the bar exam. The model included LSAT, LPII performance, first-year GPA (GPA_1L), centered bar preparation completion, class rank percentile, and Year.
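A sketch of how such a model could be fitted. The formula is the one shown in the `glm` call in the output; the function name is hypothetical, and the passing label `"P"` is an assumption, since only `"F"` is visible in the `str()` output.

```r
# Fit the logistic model for passing the bar exam.
# PassFail is coded as a factor with "F" as the first level, so glm()
# models the probability of the second level ("P", assumed to mean pass).
fit_pass_model <- function(df) {
  df$PassFail <- factor(df$PassFail, levels = c("F", "P"))
  df$Year <- factor(df$Year)
  glm(PassFail ~ LSAT + LPII_Num + GPA_1L + BarPrepCompletion_c +
        Year + FinalRankPercentile,
      family = binomial, data = df)
}

# Usage (not run here): fit3 <- fit_pass_model(df); summary(fit3)
```

Setting the factor levels explicitly makes the direction of the coefficients unambiguous: positive estimates correspond to higher odds of passing.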
##
## Call:
## glm(formula = PassFail ~ LSAT + LPII_Num + GPA_1L + BarPrepCompletion_c +
## Year + FinalRankPercentile, family = binomial, data = df)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -22.51322 8.14837 -2.763 0.005729 **
## LSAT 0.12398 0.05489 2.259 0.023897 *
## LPII_Num -1.30955 0.38750 -3.380 0.000726 ***
## GPA_1L 2.96994 1.00471 2.956 0.003116 **
## BarPrepCompletion_c 4.50047 1.03618 4.343 1.4e-05 ***
## Year2022 -1.55582 0.60879 -2.556 0.010601 *
## Year2023 0.53111 0.64904 0.818 0.413186
## Year2024 -1.00266 0.55166 -1.818 0.069138 .
## Year2025 0.88115 0.73374 1.201 0.229791
## FinalRankPercentile 4.49966 1.53634 2.929 0.003403 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 325.10 on 521 degrees of freedom
## Residual deviance: 192.85 on 512 degrees of freedom
## (78 observations deleted due to missingness)
## AIC: 212.85
##
## Number of Fisher Scoring iterations: 7
## [1] 0
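The results indicate that GPA_1L is a strong and statistically significant predictor of passing the bar exam (estimate 2.97, p = 0.003), suggesting that students with higher first-year academic performance are more likely to pass. Bar preparation completion has the largest test statistic in the model (estimate 4.50, p < 0.001), and final rank percentile (estimate 4.50, p = 0.003) and LSAT (estimate 0.12, p = 0.024) are also positively and significantly related to passing. LPII performance is significant as well, but its coefficient is negative (estimate -1.31, p < 0.001): holding GPA_1L, class rank, and the other predictors fixed, a higher LPII grade is associated with lower odds of passing. Because LPII grades overlap with GPA_1L and class rank, this sign may reflect correlation among the academic predictors rather than a harmful effect of doing well in the course, so it should be interpreted with caution. The very small p-value from the overall chi-square test (printed as 0 above) indicates that the predictors, taken together, significantly improve the model's ability to explain the probability of passing.
The model fits substantially better than the null model, with the deviance decreasing from 325.10 to 192.85 and an AIC of 212.85. AIC is only comparable among models fitted to the same response and the same observations, so it is useful for comparing alternative logistic specifications, not for comparing this model against the linear models. Year is included as a factor. Relative to 2021, only the 2022 cohort differs significantly at the 5% level (estimate -1.56, p = 0.011), which suggests that cohort differences may affect bar exam outcomes.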
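Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to interpret. The sketch below uses coefficients copied from the output above; the 0.10 step for completion is an illustrative choice.

```r
# Coefficients copied from the logistic regression output (log-odds scale)
coefs <- c(LSAT = 0.12398, LPII_Num = -1.30955, GPA_1L = 2.96994,
           BarPrepCompletion_c = 4.50047, FinalRankPercentile = 4.49966)

# Odds ratio for a one-unit increase in each predictor,
# holding the other predictors fixed
odds_ratio_per_unit <- exp(coefs)

# BarPrepCompletion_c is a proportion, so a 0.10 step
# (10 percentage points) is a more realistic change than a full unit
or_completion_10pts <- exp(coefs[["BarPrepCompletion_c"]] * 0.10)
# about 1.57: roughly 57% higher odds of passing

# An odds ratio below 1 marks a negative adjusted association,
# as for LPII_Num in this output
or_lpii <- odds_ratio_per_unit[["LPII_Num"]]
```

For example, each additional LSAT point multiplies the odds of passing by about 1.13, other predictors held constant.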
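The overall test of the model against the null model can be reproduced from the null and residual deviances printed in the glm output (325.10 on 521 degrees of freedom and 192.85 on 512). This is a worked check; the suggestion that the printed `[1] 0` came from the `1 - pchisq(...)` form is an assumption.

```r
# Values copied from the glm summary output
null_dev  <- 325.10; null_df  <- 521
resid_dev <- 192.85; resid_df <- 512

# Likelihood ratio (deviance) test: the drop in deviance is compared
# with a chi-square distribution on the number of added parameters
lr_stat <- null_dev - resid_dev   # 132.25
lr_df   <- null_df - resid_df     # 9

# Upper-tail p-value computed directly, which keeps precision
p_direct <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The subtraction form loses precision and returns exactly 0 here
p_subtract <- 1 - pchisq(lr_stat, df = lr_df)
```

Using `lower.tail = FALSE` gives a tiny but nonzero p-value, which is more informative to report than 0 (for example, as p < 0.001).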
Recommendations
Based on these results, early academic performance should be closely monitored, as it is a strong indicator of bar exam success. Students with lower first-year GPA may benefit from early academic support and targeted interventions. Performance in LPII is also significantly related to passing outcomes, but because its coefficient is negative once first-year GPA and class rank are held fixed, the role of this course in bar preparation should be examined further before acting on it.
This analysis used both linear and logistic regression models to examine factors associated with bar exam performance, including UBE scores and the probability of passing the exam. Across the models, academic performance variables were consistently among the strongest predictors of success. Final GPA and LSAT were significant in both linear models of UBE score, and first-year GPA, final rank percentile, and LSAT were positively and significantly related to passing in the logistic model. Bar preparation completion was also significant in every model. LPII performance was significant in the logistic model, but with a negative coefficient once the other academic measures were held fixed.
The logistic regression model showed a substantial reduction in deviance compared to the null model, and the chi-square test confirmed that the predictors collectively provide a significantly better fit for explaining the probability of passing the bar exam.
Overall, the results indicate that academic performance is the factor most strongly associated with bar exam success, with bar preparation completion also making a consistent contribution. Students who perform well early in law school and maintain strong academic standing are more likely to pass the bar exam.
Based on these findings, an important recommendation is to identify students with lower first-year GPA early and provide targeted academic support. In addition, LPII performance is significantly related to passing outcomes, but the negative sign of its adjusted coefficient means its role in preparation should be investigated further rather than taken at face value.
In conclusion, focusing on academic performance and early intervention for at-risk students may be the most effective strategy for improving bar exam outcomes.