Bar Examination Analysis

Introduction

The Uniform Bar Examination (UBE) is the final step before someone can actually practice law, so naturally it’s one of the most important outcomes for a law school. Because of that, bar passage rates end up being a major way schools are judged.

At TTU Law, the goal is not just getting students admitted, but actually getting them through the program and passing the bar. That leads to a pretty straightforward question: what actually matters more for bar success, what students come in with or what happens while they’re in law school?

This project looks at that question by comparing pre-admission factors like LSAT and UGPA with things that happen during law school, like GPA, class rank, and bar prep behavior. The point is to figure out which group actually explains bar outcomes better, and which ones the school can realistically do something about.

Research Question

Are bar examination outcomes better explained by pre-admission metrics (LSAT, UGPA), or by law school performance and bar preparation variables?

Hypotheses

Before running anything, my expectation is pretty simple.

LSAT and UGPA should matter to some extent, since they measure general academic ability. But I don’t expect them to explain much once students actually go through law school.

Law school performance (GPA and class rank) should matter more because it reflects how students actually perform in the environment that prepares them for the bar.

Bar prep behavior should also matter a lot. Students who actually complete prep programs and show up to workshops are probably more prepared and more consistent overall.

Overall, I expect that once you include law school performance and bar prep variables, LSAT and UGPA won’t really carry much weight anymore.

Data and Methods

Data Description

The data set includes 600 TTU Law students from 2021–2025. It tracks students from before they enter law school all the way through bar exam results.

The variables fall into three main stages:

- Pre-admission (LSAT, UGPA, age)

- Law school performance (GPA, class rank)

- Bar prep behavior and final bar outcomes

Data Cleaning

Before running models, I cleaned up a few things.

UGPA was stored as text, so I converted it to numeric to make sure the models treat it as a continuous variable. The conversion produced a coercion warning, which means at least one entry could not be read as a number and became NA.

Several law school course grades were originally stored as letter grades, which are not suitable for regression analysis. To fix this, I created a standard 4.0 grading scale conversion.

Then I applied this mapping to key first-year law courses (CivPro, LPI, and LPII). This step was necessary so that academic performance could be analyzed quantitatively instead of as categorical letter grades.

To better understand differences between students who passed and failed, I created separate subsets of the dataset. These subsets were used for exploratory analysis to visually and statistically compare performance patterns before building regression models.
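Conversions like the ones described above can silently turn unexpected entries into NA, so it may be worth checking what was lost before modeling. The sketch below uses made-up values (`ugpa_raw` and `grades_raw` are hypothetical stand-ins, not columns read from the real file, and the shortened `grademap` is only for illustration) to show one way of listing entries that fail to convert.

```r
# Hypothetical raw values standing in for a character UGPA column
ugpa_raw <- c("3.42", "2.82", "N/A", "3.13")

# suppressWarnings() hides the "NAs introduced by coercion" warning
ugpa_num <- suppressWarnings(as.numeric(ugpa_raw))

# Entries that were present but could not be read as numbers
bad_ugpa <- ugpa_raw[is.na(ugpa_num) & !is.na(ugpa_raw)]
print(bad_ugpa)

# Same idea for a letter-grade lookup: grades missing from the map become NA
grademap <- c("A" = 4.0, "B+" = 3.3, "B" = 3.0, "C+" = 2.3, "F" = 0)
grades_raw <- c("A", "B+", "CR", "F")
grades_num <- unname(grademap[grades_raw])

# Letter grades with no numeric equivalent in the map
unmapped <- unique(grades_raw[is.na(grades_num)])
print(unmapped)
```

Running the same two checks on the real columns would show which UGPA entry triggered the warning and which course grades (such as credit-only marks) drop out of the numeric versions.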

df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/BarData_2025.csv")
str(df)

## 'data.frame':    600 obs. of  28 variables:
##  $ Year                       : int  2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
##  $ PassFail                   : chr  "F" "F" "F" "F" ...
##  $ Age                        : num  29.1 29.6 29 36.2 28.9 30.8 29.1 42.9 28.3 27.1 ...
##  $ LSAT                       : int  152 155 157 156 145 154 149 160 152 150 ...
##  $ UGPA                       : chr  "3.42" "2.82" "3.46" "3.13" ...
##  $ CivPro                     : chr  "B+" "B+" "C" "D+" ...
##  $ LPI                        : chr  "A" "B" "B" "C" ...
##  $ LPII                       : chr  "A" "B" "B" "C+" ...
##  $ GPA_1L                     : num  3.21 2.43 2.62 2.27 2.29 ...
##  $ GPA_Final                  : num  3.29 3.2 2.91 2.77 2.9 2.82 3 3.09 3.21 2.74 ...
##  $ FinalRankPercentile        : num  0.46 0.33 0.08 0.02 0.08 0.05 0.15 0.22 0.34 0.01 ...
##  $ Accommodations             : chr  "N" "Y" "N" "N" ...
##  $ Probation                  : chr  "N" "Y" "N" "Y" ...
##  $ LegalAnalysis_TexasPractice: chr  "Y" "Y" "Y" "Y" ...
##  $ AdvLegalPerfSkills         : chr  "Y" "Y" "Y" "Y" ...
##  $ AdvLegalAnalysis           : chr  "Y" "Y" "Y" "Y" ...
##  $ BarPrepCompany             : chr  "Barbri" "Barbri" "Barbri" "Barbri" ...
##  $ BarPrepCompletion          : num  0.96 0.98 0.48 1 0.77 0.02 0.9 0.76 0.77 0.88 ...
##  $ OptIntoWritingGuide        : chr  "" "" "" "" ...
##  $ X.LawSchoolBarPrepWorkshops: int  3 0 3 0 5 1 5 5 1 5 ...
##  $ StudentSuccessInitiative   : chr  "N" "Cochran" "Smith" "Baldwin" ...
##  $ BarPrepMentor              : chr  "N" "N" "N" "N" ...
##  $ MPRE                       : num  103 76 99 81 99 NA 90 97 100 78 ...
##  $ MPT                        : num  3 3 3 2.5 3.5 3 2.5 2.5 3 2.5 ...
##  $ MEE                        : num  2.67 3.17 2.67 3 2.67 2 3.5 3 2.67 3.83 ...
##  $ WrittenScaledScore         : num  126 133 126 126 130 ...
##  $ MBE                        : num  133 133 118 140 125 ...
##  $ UBE                        : num  259 266 244 266 256 ...

##Data manipulation and Cleaning
#Convert UGPA to numeric
df$UGPA <- as.numeric(df$UGPA)

## Warning: NAs introduced by coercion

#convert PassFail
df$PassFail<-factor(df$PassFail,levels=c("F","P"))
View(df)

#convert letter grades to numeric scale
grademap <- c("A"=4.0, "A-"=3.7,"B+"=3.3,"B"=3.0,"B-"=2.7, "C+"=2.3,"C"=2.0,
              "C-"=1.7,"D+"=1.3,"D"=1.0,"D-"=0.7,"F"=0)
df$GPA_Final <- as.numeric(df$GPA_Final)

#map key first year law courses (grades not in grademap, e.g. "CR", become NA)
df$CivPro_Num <- grademap[df$CivPro]
df$LPI_Num <- grademap[df$LPI]
df$LPII_Num <- grademap[df$LPII]
head(df)

##   Year PassFail  Age LSAT UGPA CivPro LPI LPII GPA_1L GPA_Final
## 1 2021        F 29.1  152 3.42     B+   A    A  3.206      3.29
## 2 2021        F 29.6  155 2.82     B+   B    B  2.431      3.20
## 3 2021        F 29.0  157 3.46      C   B    B  2.620      2.91
## 4 2021        F 36.2  156 3.13     D+   C   C+  2.275      2.77
## 5 2021        F 28.9  145 3.49      C  C+   C+  2.293      2.90
## 6 2021        F 30.8  154 2.85     B+   F   CR  2.538      2.82
##   FinalRankPercentile Accommodations Probation LegalAnalysis_TexasPractice
## 1                0.46              N         N                           Y
## 2                0.33              Y         Y                           Y
## 3                0.08              N         N                           Y
## 4                0.02              N         Y                           Y
## 5                0.08              N         Y                           Y
## 6                0.05              N         N                           Y
##   AdvLegalPerfSkills AdvLegalAnalysis BarPrepCompany BarPrepCompletion
## 1                  Y                Y         Barbri              0.96
## 2                  Y                Y         Barbri              0.98
## 3                  Y                Y         Barbri              0.48
## 4                  Y                Y         Barbri              1.00
## 5                  Y                Y         Themis              0.77
## 6                  Y                Y         Themis              0.02
##   OptIntoWritingGuide X.LawSchoolBarPrepWorkshops StudentSuccessInitiative
## 1                                               3                        N
## 2                                               0                  Cochran
## 3                                               3                    Smith
## 4                                               0                  Baldwin
## 5                                               5                  Baldwin
## 6                                               1                    Rosen
##   BarPrepMentor MPRE MPT  MEE WrittenScaledScore   MBE   UBE CivPro_Num LPI_Num
## 1             N  103 3.0 2.67              125.5 133.3 258.8        3.3     4.0
## 2             N   76 3.0 3.17              133.1 132.7 265.8        3.3     3.0
## 3             N   99 3.0 2.67              125.5 118.2 243.7        2.0     3.0
## 4             N   81 2.5 3.00              125.5 140.1 265.6        1.3     2.0
## 5             N   99 3.5 2.67              130.5 125.4 255.9        2.0     2.3
## 6             N   NA 3.0 2.00              115.4 113.5 228.9        3.3     0.0
##   LPII_Num
## 1      4.0
## 2      3.0
## 3      3.0
## 4      2.3
## 5      2.3
## 6       NA

colnames(df)

##  [1] "Year"                        "PassFail"                   
##  [3] "Age"                         "LSAT"                       
##  [5] "UGPA"                        "CivPro"                     
##  [7] "LPI"                         "LPII"                       
##  [9] "GPA_1L"                      "GPA_Final"                  
## [11] "FinalRankPercentile"         "Accommodations"             
## [13] "Probation"                   "LegalAnalysis_TexasPractice"
## [15] "AdvLegalPerfSkills"          "AdvLegalAnalysis"           
## [17] "BarPrepCompany"              "BarPrepCompletion"          
## [19] "OptIntoWritingGuide"         "X.LawSchoolBarPrepWorkshops"
## [21] "StudentSuccessInitiative"    "BarPrepMentor"              
## [23] "MPRE"                        "MPT"                        
## [25] "MEE"                         "WrittenScaledScore"         
## [27] "MBE"                         "UBE"                        
## [29] "CivPro_Num"                  "LPI_Num"                    
## [31] "LPII_Num"

#Change to Workshops
colnames(df)[colnames(df) == "X.LawSchoolBarPrepWorkshops"] <- "Workshops"
colnames(df)

##  [1] "Year"                        "PassFail"                   
##  [3] "Age"                         "LSAT"                       
##  [5] "UGPA"                        "CivPro"                     
##  [7] "LPI"                         "LPII"                       
##  [9] "GPA_1L"                      "GPA_Final"                  
## [11] "FinalRankPercentile"         "Accommodations"             
## [13] "Probation"                   "LegalAnalysis_TexasPractice"
## [15] "AdvLegalPerfSkills"          "AdvLegalAnalysis"           
## [17] "BarPrepCompany"              "BarPrepCompletion"          
## [19] "OptIntoWritingGuide"         "Workshops"                  
## [21] "StudentSuccessInitiative"    "BarPrepMentor"              
## [23] "MPRE"                        "MPT"                        
## [25] "MEE"                         "WrittenScaledScore"         
## [27] "MBE"                         "UBE"                        
## [29] "CivPro_Num"                  "LPI_Num"                    
## [31] "LPII_Num"

#subset dataframe by PassFail#
df_pass <- df[df$PassFail=="P",]
df_fail <- subset(df,PassFail=="F")

Exploratory Data Analysis

Before building any regression models, I conducted exploratory data analysis to get a basic sense of how the key variables relate to bar exam outcomes. This step is important because it helps identify initial patterns, potential relationships, and any obvious differences between students who passed and those who failed. I focused on comparing academic performance measures like GPA and class rank, along with bar preparation behaviors such as completion rates and workshop participation. I also looked at the overall distribution of UBE scores to understand how outcomes are spread across the sample. Overall, the exploratory analysis provides a first look at which factors appear to matter most before moving into formal modeling.

GPA and Bar Passage

##Boxplot: Final GPA vs Bar Passage
boxplot(GPA_Final ~ PassFail,
        data = df,
        main = "Final GPA vs Bar Passage",
        xlab = "Pass/Fail",
        ylab = "Final GPA",
        col = "lightgray")

Students who pass the bar exam tend to have higher final GPAs, suggesting a strong relationship between academic performance and success.
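A boxplot shows the pattern visually, and a small numeric summary by outcome can back it up. The sketch below runs on six made-up students (the `toy` data frame is hypothetical, not drawn from the real file) and computes the mean and median final GPA within each group.

```r
# Made-up data: six hypothetical students, not drawn from the real file
toy <- data.frame(
  PassFail  = factor(c("F", "F", "P", "P", "P", "P"), levels = c("F", "P")),
  GPA_Final = c(2.8, 2.9, 3.1, 3.3, 3.4, 3.6)
)

# Mean and median final GPA within each outcome group
gpa_means   <- tapply(toy$GPA_Final, toy$PassFail, mean)
gpa_medians <- tapply(toy$GPA_Final, toy$PassFail, median)

# Group sizes, since a gap between very unequal groups deserves extra caution
group_n <- table(toy$PassFail)

print(rbind(mean = gpa_means, median = gpa_medians, n = group_n))
```

The same calls with `df` in place of `toy` would give the group summaries for the actual sample.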

Bar Prep Completion

##Histogram: Bar Prep Completion by Pass/Fail (overlaid)
# hist() colors bars by bin, not by student, so each group is drawn separately
# Shared bins keep the two histograms comparable
brks <- seq(min(df$BarPrepCompletion, na.rm = TRUE),
            max(df$BarPrepCompletion, na.rm = TRUE),
            length.out = 21)

# Students who passed (blue)
hist(df$BarPrepCompletion[df$PassFail == "P"],
     breaks = brks,
     col = rgb(0,0,1,0.5),
     xlab = "Bar Prep Completion",
     main = "Bar Prep Completion by Pass/Fail")

# Students who failed (red), drawn on top of the first histogram
hist(df$BarPrepCompletion[df$PassFail == "F"],
     breaks = brks,
     col = rgb(1,0,0,0.5),
     add = TRUE)

# Add legend
legend("topright",
       legend = c("Pass", "Fail"),
       fill = c(rgb(0,0,1,0.5), rgb(1,0,0,0.5)))

Students with higher completion rates appear more likely to pass, supporting the hypothesis that preparation intensity matters.

Workshop Participation

#Bar Plot: Workshops by Pass/Fail (side‑by‑side)
length(df$Workshops)

## [1] 600

length(df$PassFail)

## [1] 600

# Rows = outcome, columns = workshop count, so bars are grouped by workshops
tab <- table(df$PassFail, df$Workshops)

barplot(tab,
        beside = TRUE,
        col = c("skyblue", "salmon"),
        xlab = "Workshops",
        ylab = "Count",
        main = "Workshops by Pass/Fail")

legend("topright",
       legend = rownames(tab),
       fill = c("skyblue", "salmon"))

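Workshop attendance appears positively associated with passing in the raw counts, although the regression models below do not find a statistically significant workshop effect.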
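Raw counts can be hard to read when far more students pass than fail, so a pass rate within each workshop count may be a clearer summary. The sketch below is a minimal illustration on ten made-up students (the `toy` data frame is hypothetical) using `prop.table()` with column margins.

```r
# Made-up data: ten hypothetical students, not drawn from the real file
toy <- data.frame(
  Workshops = c(0, 0, 0, 1, 1, 5, 5, 5, 5, 5),
  PassFail  = c("F", "P", "P", "P", "F", "P", "P", "P", "P", "F")
)

# Rows = outcome, columns = number of workshops attended
tab <- table(toy$PassFail, toy$Workshops)

# margin = 2 divides each column by its total,
# giving the share of F and P within each workshop count
rates <- prop.table(tab, margin = 2)

# Pass rate for each workshop count
pass_rate <- rates["P", ]
print(round(pass_rate, 2))
```

If the pass rate rises steadily with the number of workshops in the real data, that would support the visual impression; if it is flat, the count-based plot may mostly be reflecting group sizes.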

UBE Score Distribution

#Histogram: UBE Scores
hist(df$UBE,
     breaks = 30,
     col = "lightblue",
     main = "Distribution of UBE Scores",
     xlab = "UBE Score")

The distribution of UBE scores appears approximately normal, with a visible cutoff near the passing threshold.

Results

This section lays out the results from the regression models used to figure out what actually explains bar exam outcomes. I ran separate models for pre-admission variables, law school performance, and bar preparation behaviors, and then a full model that includes everything together. The goal here is to see which group of variables actually carries the most weight when it comes to predicting who passes the bar. I also compare the models using basic fit measures and significance levels to see whether outcomes are more tied to what students come in with, or what they do during law school and bar prep.

Model 1: Pre-Admission Variables Only

#Model 1: Pre-Admission Variables Only
df$PassFail<-factor(df$PassFail,levels=c("F","P"))
model1 <- glm(PassFail ~ LSAT + UGPA + Age,
                 data = df,
                 family = binomial)

summary(model1)

## 
## Call:
## glm(formula = PassFail ~ LSAT + UGPA + Age, family = binomial, 
##     data = df)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -25.17526    6.46142  -3.896 9.77e-05 ***
## LSAT          0.16514    0.03731   4.426 9.60e-06 ***
## UGPA          0.83891    0.39235   2.138   0.0325 *  
## Age          -0.03759    0.02807  -1.339   0.1806    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 394.26  on 598  degrees of freedom
## Residual deviance: 368.59  on 595  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 376.59
## 
## Number of Fisher Scoring iterations: 5

Interpretation

Based on the logistic regression, both LSAT score and undergraduate GPA are positive, statistically significant predictors of passing the bar exam, meaning that higher LSAT scores and higher UGPAs are linked to a higher chance of passing. LSAT has the stronger evidence of the two (p < 0.001, versus p = 0.03 for UGPA). Age, on the other hand, doesn’t seem to matter much in this model since its effect isn’t statistically significant. Overall, the results suggest that academic preparation, especially LSAT performance and college GPA, plays a meaningful role in bar exam success, although the drop in deviance from the null model is modest (394.26 to 368.59).
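Logistic regression coefficients are on the log-odds scale, which is hard to read directly. As a rough aid, the sketch below converts the Model 1 estimates into odds ratios with approximate 95% Wald intervals. The numbers are typed in from the summary output above; the variable names (`est`, `se`, `odds_ratio`) are just illustrative.

```r
# Estimates and standard errors copied from the Model 1 summary output
est <- c(LSAT = 0.16514, UGPA = 0.83891, Age = -0.03759)
se  <- c(LSAT = 0.03731, UGPA = 0.39235, Age = 0.02807)

# exp() turns a log-odds coefficient into an odds ratio per one-unit increase
odds_ratio <- exp(est)

# Approximate 95% Wald interval, also on the odds-ratio scale
lower <- exp(est - 1.96 * se)
upper <- exp(est + 1.96 * se)

print(round(cbind(odds_ratio, lower, upper), 3))
```

Read this way, each additional LSAT point is associated with roughly 18% higher odds of passing in this model, while the interval for Age includes 1, which matches its non-significant p-value. With the fitted object available, `exp(coef(model1))` gives the same odds ratios.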

Model 2: Law School Performance

##Model 2: Law School Performance
model2 <- glm(PassFail ~ GPA_Final + FinalRankPercentile,
                 data = df,
                 family = binomial)

summary(model2)

## 
## Call:
## glm(formula = PassFail ~ GPA_Final + FinalRankPercentile, family = binomial, 
##     data = df)
## 
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)
## (Intercept)           -6.643      5.722  -1.161    0.246
## GPA_Final              2.542      2.111   1.204    0.228
## FinalRankPercentile    2.538      2.493   1.018    0.309
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 394.48  on 599  degrees of freedom
## Residual deviance: 311.93  on 597  degrees of freedom
## AIC: 317.93
## 
## Number of Fisher Scoring iterations: 6

Interpretation

In this model, I looked at whether a student’s final GPA and final class rank percentile could predict whether they passed the bar exam. Neither coefficient is individually statistically significant, but that does not mean law school performance is unimportant. Taken together, the two predictors reduce the deviance from 394.48 to 311.93 using only 2 degrees of freedom, a much larger drop than in Model 1, and the AIC (317.93) is well below Model 1’s (376.59). The most likely explanation is that GPA and class rank carry nearly the same information, since rank is largely determined by GPA. When two predictors overlap this much, the model can’t separate their effects, which inflates the standard errors (2.111 and 2.493 here) and pushes both p‑values up. The coefficients are positive, so the general trend still points in the expected direction (higher GPA or better rank → higher chance of passing).
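Whether the two predictors matter as a pair can be checked with a likelihood-ratio test on the drop in deviance. The sketch below is a back-of-the-envelope version that uses only the deviances and degrees of freedom printed in the Model 2 summary; the variable names are illustrative.

```r
# Deviances and degrees of freedom copied from the Model 2 summary output
null_dev  <- 394.48; null_df  <- 599
resid_dev <- 311.93; resid_df <- 597

# Likelihood-ratio statistic: drop in deviance from adding both predictors
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df

# Upper-tail chi-square probability for the joint test
p_joint <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(statistic = lr_stat, df = lr_df, p_value = p_joint))
```

The joint p-value is far below 0.05 even though neither individual coefficient is significant, which is the usual signature of two highly correlated predictors. With the data loaded, `cor(df$GPA_Final, df$FinalRankPercentile)` would be a quick way to confirm how strongly the two overlap.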

Model 3: Bar Preparation Behaviors

##Model 3: Bar Preparation Behaviors
model3 <- glm(PassFail ~ BarPrepCompletion + Workshops +
                    OptIntoWritingGuide +
                    AdvLegalPerfSkills + AdvLegalAnalysis,
                  data = df,
                  family = binomial)

summary(model3)

## 
## Call:
## glm(formula = PassFail ~ BarPrepCompletion + Workshops + OptIntoWritingGuide + 
##     AdvLegalPerfSkills + AdvLegalAnalysis, family = binomial, 
##     data = df)
## 
## Coefficients:
##                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)          -1.69984    0.92783  -1.832   0.0669 .  
## BarPrepCompletion     4.34815    0.77007   5.646 1.64e-08 ***
## Workshops            -0.07615    0.07674  -0.992   0.3211    
## OptIntoWritingGuideN  0.81895    0.65583   1.249   0.2118    
## OptIntoWritingGuideY  0.32867    0.56925   0.577   0.5637    
## AdvLegalPerfSkillsY  -0.15413    0.55023  -0.280   0.7794    
## AdvLegalAnalysisY     0.30810    0.42472   0.725   0.4682    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 375.82  on 573  degrees of freedom
## Residual deviance: 341.17  on 567  degrees of freedom
##   (26 observations deleted due to missingness)
## AIC: 355.17
## 
## Number of Fisher Scoring iterations: 5

Interpretation

In this model, I looked at whether different types of bar prep activities, like completing the bar prep program, attending workshops, opting into a writing guide, or taking advanced legal skills courses, help predict who passes the bar exam. The results show that BarPrepCompletion is by far the strongest and the only statistically significant predictor. Its coefficient is large and highly significant, meaning students who completed more of the bar prep program had a much higher chance of passing. The other variables (workshops, the writing guide, Advanced Legal Performance Skills, and Advanced Legal Analysis) didn’t come out significant. Their p‑values are high, so there isn’t enough evidence to say they meaningfully affect bar passage in this dataset. Note that OptIntoWritingGuide has three levels (blank, N, and Y) and the blank level is the reference category, so the N and Y coefficients are comparisons against students with no recorded response rather than against each other. Overall, the model suggests that actually completing the bar prep program matters a lot, while the other support activities don’t show a clear impact on passing the bar.
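Because R picks the first factor level as the reference category, a blank string ends up as the baseline for a variable like OptIntoWritingGuide. The sketch below uses a made-up vector (`guide_raw` is hypothetical) to show two possible ways of getting a direct Y versus N comparison. Which one is appropriate depends on what the blanks actually mean, which the data alone do not say.

```r
# Made-up responses; "" mimics the blank entries seen in OptIntoWritingGuide
guide_raw <- c("", "", "N", "Y", "Y", "N", "")

# Default factor(): levels sort alphabetically, so "" becomes the reference
guide_default <- factor(guide_raw)
print(levels(guide_default))

# Option 1: keep blanks as their own labeled level and make "N" the reference
guide_labeled <- factor(guide_raw,
                        levels = c("N", "Y", ""),
                        labels = c("N", "Y", "NotRecorded"))
print(levels(guide_labeled))

# Option 2: treat blanks as missing so only Y vs N is compared
guide_yn <- factor(ifelse(guide_raw == "", NA, guide_raw),
                   levels = c("N", "Y"))
print(table(guide_yn, useNA = "ifany"))
```

With Option 1 the Y coefficient would be read as opting in versus opting out. With Option 2, `glm()` would drop the rows with blanks, so the sample would shrink to students who have a recorded answer.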

Model 4: Full Model

##Model 4: Full Model
model4 <- glm(PassFail ~ LSAT + UGPA + Age +
                    GPA_Final + FinalRankPercentile +
                    BarPrepCompletion + Workshops +
                    OptIntoWritingGuide +
                    AdvLegalPerfSkills + AdvLegalAnalysis,
                  data = df,
                  family = binomial)
summary(model4)

## 
## Call:
## glm(formula = PassFail ~ LSAT + UGPA + Age + GPA_Final + FinalRankPercentile + 
##     BarPrepCompletion + Workshops + OptIntoWritingGuide + AdvLegalPerfSkills + 
##     AdvLegalAnalysis, family = binomial, data = df)
## 
## Coefficients:
##                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)          -57.57493   12.44555  -4.626 3.73e-06 ***
## LSAT                   0.17994    0.04814   3.738 0.000185 ***
## UGPA                   0.47052    0.50027   0.941 0.346946    
## Age                   -0.07309    0.03935  -1.858 0.063208 .  
## GPA_Final              9.45366    2.96677   3.187 0.001440 ** 
## FinalRankPercentile   -4.98075    3.31788  -1.501 0.133307    
## BarPrepCompletion      3.88202    0.92825   4.182 2.89e-05 ***
## Workshops              0.02211    0.08975   0.246 0.805394    
## OptIntoWritingGuideN   2.25339    0.83537   2.697 0.006987 ** 
## OptIntoWritingGuideY   1.54418    0.71090   2.172 0.029845 *  
## AdvLegalPerfSkillsY    0.53760    0.64611   0.832 0.405377    
## AdvLegalAnalysisY      0.39769    0.48691   0.817 0.414064    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 375.61  on 572  degrees of freedom
## Residual deviance: 249.03  on 561  degrees of freedom
##   (27 observations deleted due to missingness)
## AIC: 273.03
## 
## Number of Fisher Scoring iterations: 7

Interpretation

In this final model, I included all the academic and bar‑prep variables at once to see which ones actually matter when they’re competing against each other. A few predictors really stood out. LSAT, GPA_Final, and BarPrepCompletion all came out statistically significant with positive coefficients. This means that higher LSAT scores, higher final law school GPA, and completing more of the bar prep program were each associated with a higher chance of passing the bar exam, even after controlling for everything else in the model.

Both OptIntoWritingGuide coefficients are also significant, but they need careful reading. The reference category is the blank level, so the model is saying that students with a recorded N or Y had higher odds of passing than students with no recorded response. The N coefficient (2.25) is actually larger than the Y coefficient (1.54), so this model does not show that opting in beats opting out. The blank entries may simply mark cohorts for which the option was not recorded, which would make this a cohort difference rather than an effect of the guide itself.

UGPA, which was significant in Model 1, is no longer significant here. Class rank percentile, workshops, and the advanced legal skills courses were not significant in the earlier models either, and they remain non-significant. Age was borderline (p = 0.06) but still not strong enough to count as significant at the 0.05 level.

Overall, this model fits much better than the earlier ones (residual deviance of 249.03, compared with 311.93 to 368.59 for Models 1 through 3), and it shows that bar passage is most strongly linked to LSAT performance, final law school GPA, and completing the bar prep program, while the other academic support activities don’t show clear effects when everything is considered together.

Model Comparison

The models weren’t all fit on the exact same students, because rows with missing values are dropped (599, 600, 574, and 573 observations for Models 1 through 4). That means the AIC comparison isn’t perfectly one‑to‑one, but the overall pattern is still pretty clear: Model 4 has the lowest AIC (273.03), followed by Model 2 (317.93), Model 3 (355.17), and Model 1 (376.59).
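One way to make an AIC comparison like-for-like is to refit every model on the same set of complete rows. The sketch below does this on simulated data. The data frame `sim`, its coefficients, and the model names `m_pre`, `m_law`, and `m_full` are all made up for illustration, and this is not the analysis that produced the numbers reported here.

```r
set.seed(1)  # simulated data only; none of these values come from the real file
n <- 300
sim <- data.frame(
  LSAT              = round(rnorm(n, 155, 5)),
  GPA_Final         = round(rnorm(n, 3.1, 0.3), 2),
  BarPrepCompletion = round(runif(n), 2)
)

# Simulated outcome that depends on all three predictors
eta <- -2 + 0.1 * (sim$LSAT - 155) + 2 * (sim$GPA_Final - 3.1) +
  3 * sim$BarPrepCompletion
sim$Pass <- rbinom(n, 1, plogis(eta))

# Knock out some values so the models would otherwise use different rows
sim$BarPrepCompletion[sample(n, 15)] <- NA

# Keep only rows that are complete for every variable used in any model
vars   <- c("Pass", "LSAT", "GPA_Final", "BarPrepCompletion")
common <- sim[complete.cases(sim[, vars]), ]

m_pre  <- glm(Pass ~ LSAT, data = common, family = binomial)
m_law  <- glm(Pass ~ GPA_Final, data = common, family = binomial)
m_full <- glm(Pass ~ LSAT + GPA_Final + BarPrepCompletion,
              data = common, family = binomial)

# All three models now share the same observations, so AIC is comparable
aic_table <- AIC(m_pre, m_law, m_full)
print(aic_table)
```

Applied to the real data, the same `complete.cases()` filter over every variable in Model 4 would put all four models on one common sample and remove the caveat about differing sample sizes.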

Discussion

Looking across all the models, the biggest takeaway is that bar exam outcomes seem to be shaped more by what students do during law school than by the credentials they had coming in. The pre-admission variables reduce the deviance by only about 26 points (Model 1), while law school performance alone reduces it by about 83 (Model 2). Once law‑school performance and bar‑prep behaviors are added, UGPA stops being significant, although LSAT remains a significant predictor. This suggests that students aren’t locked into bar outcomes based on where they started, even though admissions credentials still carry some information.

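The academic predictors that hold up in the full model are final law school GPA and LSAT. GPA makes sense as a major factor because it reflects how well students handled the actual law school curriculum. LSAT still matters in the full model too. Because the two are measured on very different scales (a one-point change in GPA is far larger than a one-point change in LSAT), their raw coefficients can’t be compared directly to say which one matters more.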

Bar‑prep behavior also plays a big role. BarPrepCompletion is highly significant in both models that include it (Models 3 and 4). Students who complete more of their bar prep program have a much higher chance of passing. The writing guide coefficients are significant in the full model, but because they are measured against students with no recorded response, and the opt-out coefficient is larger than the opt-in one, they don’t show a clear benefit of opting in. The BarPrepCompletion result suggests that engagement and follow-through during bar prep really do matter.

Overall, the results point to a pretty clear conclusion: students’ choices and performance during law school, and especially during bar prep, appear to be more important than their admissions stats, with LSAT as the one admissions measure that keeps its predictive value.

Limitations

There are a few limitations to keep in mind. First, this is observational data, so we can only talk about associations, not definite cause‑and‑effect relationships.

Second, the dataset doesn’t include personal factors that could influence bar performance, like stress, financial pressure, work obligations, or family responsibilities.

Third, the data comes from one law school (TTU Law), so the findings might not generalize perfectly to other institutions with different student populations or bar‑prep structures.

Recommendations

Prioritize Bar Prep Completion

Since BarPrepCompletion is one of the strongest predictors of passing, the school should monitor completion rates more closely and reach out early when students fall behind.

Strengthen Writing Support

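Because the writing guide variable was significant in the full model, writing support deserves a closer look. The current results don’t show that opting in outperforms opting out, so a useful first step would be to compare those two groups directly before expanding or promoting the guide.

Identify At‑Risk Students Early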

GPA and LSAT both show meaningful relationships with bar passage. Students with lower GPAs or weaker academic indicators should be identified earlier so they can receive targeted support before bar prep even begins.
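One possible way to put early identification into practice is to score students with a fitted model and flag those with a low predicted probability of passing. The sketch below is only an illustration on simulated data. It assumes first-year GPA and LSAT as the early indicators, and the 0.70 cutoff and all coefficients are arbitrary choices, not values estimated from the TTU data.

```r
set.seed(2)  # simulated students; not the real data
n <- 200
sim <- data.frame(
  GPA_1L = round(rnorm(n, 3.0, 0.4), 2),
  LSAT   = round(rnorm(n, 155, 5))
)

# Simulated outcome with made-up effects of first-year GPA and LSAT
sim$Pass <- rbinom(n, 1,
                   plogis(1.5 + 2 * (sim$GPA_1L - 3.0) + 0.1 * (sim$LSAT - 155)))

fit <- glm(Pass ~ GPA_1L + LSAT, data = sim, family = binomial)

# Predicted probability of passing for each student
sim$p_pass <- predict(fit, type = "response")

# Hypothetical cutoff: flag students whose predicted probability is below 0.70
threshold <- 0.70
at_risk <- sim[sim$p_pass < threshold, ]

print(nrow(at_risk))
```

In practice the cutoff would need to be chosen with the school’s support capacity in mind, and the model should be checked on a cohort it was not fit on before being used to flag anyone.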

Conclusion

Overall, the analysis suggests that bar exam success is influenced more by what students do during law school than by their admissions credentials. Final GPA, LSAT, and especially bar‑prep completion were the strongest predictors of passing, while UGPA and class rank were not significant once everything was considered together. LSAT is the one admissions credential that kept its predictive value. These findings suggest that consistent academic performance and active engagement in bar preparation play a larger role in bar outcomes than where students started when they entered law school.