Introduction

The Uniform Bar Examination (UBE) is a standardized licensing exam used in Texas and many other jurisdictions. It combines multiple‑choice, essay, and performance components into a single 400‑point score. Because bar passage determines whether graduates can enter the profession, understanding what predicts UBE performance is essential for improving student outcomes. In this analysis, I examine which academic and preparation‑related factors best predict bar exam performance. I focus on two outcomes: the continuous UBE score and the Pass/Fail classification. These outcomes capture both overall performance and the practical threshold for licensure. Before modeling, I expected that LSAT, undergraduate GPA, and performance in key 1L courses would be positively associated with bar outcomes. These variables reflect early academic preparation and foundational skills that map closely onto bar exam demands.

df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/BarData_2025.csv")
str(df)
## 'data.frame':    600 obs. of  28 variables:
##  $ Year                       : int  2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
##  $ PassFail                   : chr  "F" "F" "F" "F" ...
##  $ Age                        : num  29.1 29.6 29 36.2 28.9 30.8 29.1 42.9 28.3 27.1 ...
##  $ LSAT                       : int  152 155 157 156 145 154 149 160 152 150 ...
##  $ UGPA                       : chr  "3.42" "2.82" "3.46" "3.13" ...
##  $ CivPro                     : chr  "B+" "B+" "C" "D+" ...
##  $ LPI                        : chr  "A" "B" "B" "C" ...
##  $ LPII                       : chr  "A" "B" "B" "C+" ...
##  $ GPA_1L                     : num  3.21 2.43 2.62 2.27 2.29 ...
##  $ GPA_Final                  : num  3.29 3.2 2.91 2.77 2.9 2.82 3 3.09 3.21 2.74 ...
##  $ FinalRankPercentile        : num  0.46 0.33 0.08 0.02 0.08 0.05 0.15 0.22 0.34 0.01 ...
##  $ Accommodations             : chr  "N" "Y" "N" "N" ...
##  $ Probation                  : chr  "N" "Y" "N" "Y" ...
##  $ LegalAnalysis_TexasPractice: chr  "Y" "Y" "Y" "Y" ...
##  $ AdvLegalPerfSkills         : chr  "Y" "Y" "Y" "Y" ...
##  $ AdvLegalAnalysis           : chr  "Y" "Y" "Y" "Y" ...
##  $ BarPrepCompany             : chr  "Barbri" "Barbri" "Barbri" "Barbri" ...
##  $ BarPrepCompletion          : num  0.96 0.98 0.48 1 0.77 0.02 0.9 0.76 0.77 0.88 ...
##  $ OptIntoWritingGuide        : chr  "" "" "" "" ...
##  $ X.LawSchoolBarPrepWorkshops: int  3 0 3 0 5 1 5 5 1 5 ...
##  $ StudentSuccessInitiative   : chr  "N" "Cochran" "Smith" "Baldwin" ...
##  $ BarPrepMentor              : chr  "N" "N" "N" "N" ...
##  $ MPRE                       : num  103 76 99 81 99 NA 90 97 100 78 ...
##  $ MPT                        : num  3 3 3 2.5 3.5 3 2.5 2.5 3 2.5 ...
##  $ MEE                        : num  2.67 3.17 2.67 3 2.67 2 3.5 3 2.67 3.83 ...
##  $ WrittenScaledScore         : num  126 133 126 126 130 ...
##  $ MBE                        : num  133 133 118 140 125 ...
##  $ UBE                        : num  259 266 244 266 256 ...
summary(df)
##       Year        PassFail              Age             LSAT      
##  Min.   :2021   Length:600         Min.   :22.80   Min.   :141.0  
##  1st Qu.:2022   Class :character   1st Qu.:26.30   1st Qu.:153.0  
##  Median :2023   Mode  :character   Median :27.85   Median :156.0  
##  Mean   :2023                      Mean   :28.71   Mean   :155.6  
##  3rd Qu.:2024                      3rd Qu.:29.52   3rd Qu.:158.0  
##  Max.   :2025                      Max.   :65.70   Max.   :171.0  
##                                                                   
##      UGPA              CivPro              LPI                LPII          
##  Length:600         Length:600         Length:600         Length:600        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##      GPA_1L        GPA_Final     FinalRankPercentile Accommodations    
##  Min.   :2.200   Min.   :2.440   Min.   :0.0000      Length:600        
##  1st Qu.:2.783   1st Qu.:3.050   1st Qu.:0.2600      Class :character  
##  Median :3.084   Median :3.263   Median :0.5100      Mode  :character  
##  Mean   :3.091   Mean   :3.275   Mean   :0.5059                        
##  3rd Qu.:3.383   3rd Qu.:3.500   3rd Qu.:0.7500                        
##  Max.   :4.000   Max.   :3.990   Max.   :0.9900                        
##  NA's   :8                                                             
##   Probation         LegalAnalysis_TexasPractice AdvLegalPerfSkills
##  Length:600         Length:600                  Length:600        
##  Class :character   Class :character            Class :character  
##  Mode  :character   Mode  :character            Mode  :character  
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##  AdvLegalAnalysis   BarPrepCompany     BarPrepCompletion OptIntoWritingGuide
##  Length:600         Length:600         Min.   :0.000     Length:600         
##  Class :character   Class :character   1st Qu.:0.800     Class :character   
##  Mode  :character   Mode  :character   Median :0.900     Mode  :character   
##                                        Mean   :0.865                        
##                                        3rd Qu.:0.980                        
##                                        Max.   :1.000                        
##                                        NA's   :26                           
##  X.LawSchoolBarPrepWorkshops StudentSuccessInitiative BarPrepMentor     
##  Min.   :0.000               Length:600               Length:600        
##  1st Qu.:0.000               Class :character         Class :character  
##  Median :1.000               Mode  :character         Mode  :character  
##  Mean   :1.588                                                          
##  3rd Qu.:3.000                                                          
##  Max.   :5.000                                                          
##                                                                         
##       MPRE             MPT             MEE       WrittenScaledScore
##  Min.   : 76.00   Min.   :1.000   Min.   :2.00   Min.   :111.7     
##  1st Qu.: 89.50   1st Qu.:3.000   1st Qu.:3.33   1st Qu.:139.7     
##  Median : 99.00   Median :3.500   Median :3.83   Median :148.2     
##  Mean   : 99.46   Mean   :3.649   Mean   :3.74   Mean   :147.4     
##  3rd Qu.:107.00   3rd Qu.:4.000   3rd Qu.:4.17   3rd Qu.:156.5     
##  Max.   :145.00   Max.   :5.500   Max.   :5.33   Max.   :181.2     
##  NA's   :397                                                       
##       MBE             UBE       
##  Min.   :103.6   Min.   :227.3  
##  1st Qu.:139.4   1st Qu.:280.4  
##  Median :147.9   Median :295.3  
##  Mean   :147.3   Mean   :294.7  
##  3rd Qu.:155.4   3rd Qu.:309.6  
##  Max.   :187.9   Max.   :358.7  
## 

Data and Methods

The UBE score is the sum of the scaled written score and the scaled MBE score. The written score is a weighted combination of the MEE essays and MPT tasks, which are scaled each year to align with the MBE’s 200‑point metric. A candidate passes with a UBE score of 270 or higher. The dataset's variables follow the students’ progression through law school: pre-admission metrics, 1L grades, cumulative GPA and rank, status indicators, bar-aligned electives, commercial bar prep engagement, institutional support programs, and bar exam component scores. This structure helps guide which predictors are appropriate for each model. Letter grades were converted to numeric values on a 4.0 scale, PassFail was recoded as a binary factor, and rows missing LSAT, UGPA, or UBE were removed. Rows with a missing or unmapped course grade remain in the data at this stage and are dropped automatically when the models are fit. I used linear regression to predict UBE scores and logistic regression to predict Pass/Fail, matching each method to a continuous or a binary outcome.
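
The scoring rule described above can be checked directly. The following is a minimal sketch rather than part of the original analysis: it uses a toy data frame built from the rounded values of the first four records shown in the str() output, and the names toy, cut_score, UBE_check, and Pass_check are introduced here for illustration only. A tolerance of 1 point is an assumption to absorb rounding.

```r
# Toy rows: rounded values copied from the first four records in the str() output
toy <- data.frame(
  WrittenScaledScore = c(126, 133, 126, 126),
  MBE                = c(133, 133, 118, 140),
  UBE                = c(259, 266, 244, 266),
  PassFail           = c("F", "F", "F", "F")
)

cut_score <- 270  # passing threshold stated in the text

# Recompute the total score and the pass/fail label from the components
toy$UBE_check  <- toy$WrittenScaledScore + toy$MBE
toy$Pass_check <- ifelse(toy$UBE >= cut_score, "P", "F")

# Both checks should be TRUE if the data follow the stated rule
all(abs(toy$UBE_check - toy$UBE) < 1)
all(toy$Pass_check == toy$PassFail)
```

The same two checks could be run on the full df to confirm that UBE and PassFail are internally consistent before modeling.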

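# UGPA is read as character; entries that cannot be parsed as numbers become NA
df$UGPA <- as.numeric(df$UGPA)
# Code PassFail as a factor with "F" as the reference level
df$PassFail <- factor(df$PassFail, levels = c("F", "P"))
# Map letter grades to a 4.0 scale; grades not listed here become NA
grademap <- c(
  "A" = 4.0, "A-" = 3.7, "B+" = 3.3, "B" = 3.0, "B-" = 2.7,
  "C+" = 2.3, "C" = 2.0, "C-" = 1.7, "D+" = 1.3, "D" = 1.0,
  "D-" = 0.7, "F" = 0
)
df$CivPro_Num <- unname(grademap[df$CivPro])
df$LPI_Num    <- unname(grademap[df$LPI])
df$LPII_Num   <- unname(grademap[df$LPII])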
colSums(is.na(df))
##                        Year                    PassFail 
##                           0                           0 
##                         Age                        LSAT 
##                           0                           0 
##                        UGPA                      CivPro 
##                           1                           0 
##                         LPI                        LPII 
##                           0                           0 
##                      GPA_1L                   GPA_Final 
##                           8                           0 
##         FinalRankPercentile              Accommodations 
##                           0                           0 
##                   Probation LegalAnalysis_TexasPractice 
##                           0                           0 
##          AdvLegalPerfSkills            AdvLegalAnalysis 
##                           0                           0 
##              BarPrepCompany           BarPrepCompletion 
##                           0                          26 
##         OptIntoWritingGuide X.LawSchoolBarPrepWorkshops 
##                           0                           0 
##    StudentSuccessInitiative               BarPrepMentor 
##                           0                           0 
##                        MPRE                         MPT 
##                         397                           0 
##                         MEE          WrittenScaledScore 
##                           0                           0 
##                         MBE                         UBE 
##                           0                           0 
##                  CivPro_Num                     LPI_Num 
##                           7                           9 
##                    LPII_Num 
##                          56
df_model <- df[!is.na(df$LSAT) & !is.na(df$UGPA) & !is.na(df$UBE), ]
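
The NA counts for CivPro_Num, LPI_Num, and LPII_Num (7, 9, and 56) arise wherever a recorded grade is not a name in grademap. A minimal sketch for listing such entries is shown below. The grades vector is hypothetical, and "P" and the empty string are assumed examples of unmapped values; the actual unmapped entries in the bar data are not shown in this report.

```r
grademap <- c(
  "A" = 4.0, "A-" = 3.7, "B+" = 3.3, "B" = 3.0, "B-" = 2.7,
  "C+" = 2.3, "C" = 2.0, "C-" = 1.7, "D+" = 1.3, "D" = 1.0,
  "D-" = 0.7, "F" = 0
)

# Hypothetical grade column; "P" and "" are assumed examples of unmapped entries
grades <- c("A", "B+", "P", "", "C", "P")

# Entries that have no numeric equivalent in grademap
unmapped <- grades[!(grades %in% names(grademap))]
table(unmapped, useNA = "ifany")

# The numeric conversion yields NA exactly where the grade is unmapped
grade_num <- unname(grademap[grades])
sum(is.na(grade_num))
```

Applied to the real data, for example with df$LPII in place of grades, this would show whether the 56 missing LPII values are blanks, pass/credit marks, or something else, which matters for deciding whether dropping those rows is reasonable.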

Exploratory Data Analysis and Results

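Descriptive statistics and plots showed the expected separation between passing and failing students; because Pass/Fail is defined by the UBE cut score, passing students necessarily sit at higher UBE scores. Of the 599 students in df_model, 538 passed and 61 failed. LSAT, UGPA, and several course grades displayed positive relationships with bar performance. In the linear model for UBE score, LSAT, UGPA, Civil Procedure, and LPII grades were significant predictors, while LPI was not; together the predictors explained about 22% of the variance in UBE scores. In the logistic model, LSAT, UGPA, and Civil Procedure grades were associated with higher odds of passing, while neither LPI nor LPII was significant. The residual diagnostic plots did not indicate major assumption violations.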

summary(df_model[, c("UBE","PassFail","LSAT","UGPA","CivPro_Num","LPI_Num","LPII_Num")])
##       UBE        PassFail      LSAT            UGPA         CivPro_Num   
##  Min.   :227.3   F: 61    Min.   :141.0   Min.   :2.010   Min.   :0.000  
##  1st Qu.:280.3   P:538    1st Qu.:153.0   1st Qu.:3.280   1st Qu.:2.300  
##  Median :295.1            Median :156.0   Median :3.540   Median :3.000  
##  Mean   :294.7            Mean   :155.6   Mean   :3.478   Mean   :2.985  
##  3rd Qu.:309.7            3rd Qu.:158.0   3rd Qu.:3.740   3rd Qu.:3.300  
##  Max.   :358.7            Max.   :171.0   Max.   :4.140   Max.   :4.000  
##                                                           NA's   :7      
##     LPI_Num         LPII_Num    
##  Min.   :0.000   Min.   :1.000  
##  1st Qu.:2.300   1st Qu.:2.300  
##  Median :3.000   Median :3.000  
##  Mean   :2.957   Mean   :3.007  
##  3rd Qu.:3.300   3rd Qu.:3.300  
##  Max.   :4.000   Max.   :4.000  
##  NA's   :9       NA's   :56
# UBE by Pass/Fail
boxplot(UBE ~ PassFail, data = df_model,
        main = "UBE Score by Pass/Fail",
        xlab = "Pass/Fail", ylab = "UBE Score",
        col = c("red", "green"))

# Histogram of UBE
hist(df_model$UBE, breaks = 30, main = "Distribution of UBE Scores",
     xlab = "UBE Score")

# LSAT vs UBE
plot(df_model$LSAT, df_model$UBE,
     xlab = "LSAT", ylab = "UBE",
     main = "LSAT vs UBE")
abline(lm(UBE ~ LSAT, data = df_model), col = "blue")

Model 1: Linear Regression for UBE

model1 <- lm(UBE ~ LSAT + UGPA + CivPro_Num + LPI_Num + LPII_Num, data=df_model)
summary(model1)
## 
## Call:
## lm(formula = UBE ~ LSAT + UGPA + CivPro_Num + LPI_Num + LPII_Num, 
##     data = df_model)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -65.280 -12.279   1.659  12.895  54.470 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  53.6273    36.8556   1.455  0.14624    
## LSAT          1.1218     0.2242   5.004 7.63e-07 ***
## UGPA          7.5707     2.3606   3.207  0.00142 ** 
## CivPro_Num    9.2092     1.3328   6.910 1.38e-11 ***
## LPI_Num      -1.1107     1.6351  -0.679  0.49726    
## LPII_Num      5.5260     1.5623   3.537  0.00044 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.96 on 535 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.2238, Adjusted R-squared:  0.2165 
## F-statistic: 30.85 on 5 and 535 DF,  p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(model1)

par(mfrow=c(1,1))
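
To gauge the precision of the Model 1 estimates, approximate 95% confidence intervals can be computed from the reported estimates and standard errors. This is a minimal sketch: the numbers are copied from the summary(model1) output, and the data frame coefs and its column names are introduced here for illustration. With the fitted object available, confint(model1) would give the same kind of intervals directly.

```r
# Estimates and standard errors copied from the summary(model1) output
coefs <- data.frame(
  term     = c("LSAT", "UGPA", "CivPro_Num", "LPI_Num", "LPII_Num"),
  estimate = c(1.1218, 7.5707, 9.2092, -1.1107, 5.5260),
  se       = c(0.2242, 2.3606, 1.3328, 1.6351, 1.5623)
)

# t critical value for the residual degrees of freedom reported by the model
crit <- qt(0.975, df = 535)

coefs$lower <- coefs$estimate - crit * coefs$se
coefs$upper <- coefs$estimate + crit * coefs$se

# Flag intervals that exclude zero
coefs$excludes_zero <- coefs$lower > 0 | coefs$upper < 0
coefs
```

An interval that excludes zero corresponds to a coefficient that is significant at the 5% level, so the flags should agree with the significance stars in the model summary.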

Model 2: Logistic Regression for PassFail

model2 <- glm(PassFail ~ LSAT + UGPA + CivPro_Num + LPI_Num + LPII_Num,
              data=df_model, family=binomial)
summary(model2)
## 
## Call:
## glm(formula = PassFail ~ LSAT + UGPA + CivPro_Num + LPI_Num + 
##     LPII_Num, family = binomial, data = df_model)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -25.53222    7.59696  -3.361 0.000777 ***
## LSAT          0.13591    0.04469   3.041 0.002358 ** 
## UGPA          0.93384    0.45865   2.036 0.041742 *  
## CivPro_Num    1.10112    0.24188   4.552 5.31e-06 ***
## LPI_Num       0.14823    0.29275   0.506 0.612620    
## LPII_Num      0.02427    0.28816   0.084 0.932887    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 337.92  on 540  degrees of freedom
## Residual deviance: 290.95  on 535  degrees of freedom
##   (58 observations deleted due to missingness)
## AIC: 302.95
## 
## Number of Fisher Scoring iterations: 6
anova(model2, test="Chisq")
## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: PassFail
## 
## Terms added sequentially (first to last)
## 
## 
##            Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                         540     337.92              
## LSAT        1  12.6672       539     325.25 0.0003721 ***
## UGPA        1   6.7939       538     318.46 0.0091469 ** 
## CivPro_Num  1  27.1068       537     291.35 1.925e-07 ***
## LPI_Num     1   0.3894       536     290.96 0.5325998    
## LPII_Num    1   0.0071       535     290.95 0.9329294    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
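
Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios that are easier to interpret. The sketch below uses the estimates copied from the summary(model2) output; the vector names log_odds and odds_ratios are introduced here for illustration. With the fitted object available, exp(coef(model2)) would give the same quantities.

```r
# Log-odds estimates copied from the summary(model2) output
log_odds <- c(
  LSAT       = 0.13591,
  UGPA       = 0.93384,
  CivPro_Num = 1.10112,
  LPI_Num    = 0.14823,
  LPII_Num   = 0.02427
)

# Odds ratio: multiplicative change in the odds of passing per one-unit increase
odds_ratios <- exp(log_odds)
round(odds_ratios, 2)
```

Read this way, each additional LSAT point multiplies the odds of passing by roughly 1.15, and a one-point increase in the Civil Procedure grade (for example, from C to B) roughly triples the odds, holding the other predictors constant.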

Model 3: A More Targeted Model

model3 <- lm(WrittenScaledScore ~ LSAT + UGPA + CivPro_Num + LPI_Num + LPII_Num,
             data=df_model)
summary(model3)
## 
## Call:
## lm(formula = WrittenScaledScore ~ LSAT + UGPA + CivPro_Num + 
##     LPI_Num + LPII_Num, data = df_model)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.757  -7.400   0.586   7.420  31.263 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  41.9941    22.7889   1.843 0.065920 .  
## LSAT          0.4713     0.1386   3.400 0.000723 ***
## UGPA          3.7216     1.4596   2.550 0.011057 *  
## CivPro_Num    4.5566     0.8241   5.529 5.04e-08 ***
## LPI_Num      -1.0667     1.0111  -1.055 0.291905    
## LPII_Num      2.9905     0.9660   3.096 0.002067 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.72 on 535 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.1459, Adjusted R-squared:  0.1379 
## F-statistic: 18.28 on 5 and 535 DF,  p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(model3)

par(mfrow=c(1,1))

Diagnostics and Assumptions

# Keep only rows with complete data for every variable used in the models
df_model <- df[complete.cases(df[, c("PassFail","LSAT","UGPA",
                                     "CivPro_Num","LPI_Num","LPII_Num",
                                     "UBE","WrittenScaledScore")]), ]
# Refit the logistic model on the complete-case data so predictions align row by row
model2 <- glm(PassFail ~ LSAT + UGPA + CivPro_Num + LPI_Num + LPII_Num,
              data = df_model, family = binomial)

# Predicted pass probabilities, classified at a 0.5 threshold
df_model$pred_prob <- predict(model2, type="response")
df_model$pred_class <- ifelse(df_model$pred_prob > 0.5, "P", "F")
# In-sample classification accuracy
mean(df_model$pred_class == df_model$PassFail)
## [1] 0.9001848
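
The accuracy figure reported above is best read against a majority-class baseline. In the 599-row sample summarized earlier, 538 students passed (about 90%), so a rule that predicts "pass" for everyone would score similarly. A confusion matrix shows how well the failing group is identified. The sketch below uses hypothetical outcomes and probabilities, not the bar data, and the names actual, cm, baseline, and specificity are introduced here for illustration.

```r
# Hypothetical outcomes and predicted probabilities (not from the bar data)
actual    <- factor(c("P","P","P","P","P","P","P","P","F","F"),
                    levels = c("F", "P"))
pred_prob <- c(0.95, 0.90, 0.85, 0.80, 0.97, 0.70, 0.60, 0.92, 0.75, 0.30)

# Classify at 0.5; fixing the levels keeps both rows in the table
pred_class <- factor(ifelse(pred_prob > 0.5, "P", "F"), levels = c("F", "P"))

# Rows are predictions, columns are true outcomes
cm <- table(Predicted = pred_class, Actual = actual)
cm

accuracy    <- sum(diag(cm)) / sum(cm)            # share classified correctly
baseline    <- max(prop.table(table(actual)))     # accuracy of always predicting the majority class
specificity <- cm["F", "F"] / sum(cm[, "F"])      # share of true fails that are caught

c(accuracy = accuracy, baseline = baseline, specificity = specificity)
```

On the real data, substituting df_model$PassFail and df_model$pred_prob would show whether the model adds anything beyond the baseline and how many failing students it flags at the 0.5 threshold; a lower threshold may be worth considering if the goal is early identification of at-risk students.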

Discussion and Recommendations

Overall, the results largely supported the initial hypotheses. LSAT, UGPA, and Civil Procedure grades were significant predictors in all three models, suggesting that foundational academic preparation continues to influence performance at graduation. Writing-focused coursework played a more limited role: LPII predicted the UBE and written scores but not Pass/Fail, and LPI was not significant in any model. The models explain only a modest share of the variation (R-squared of about 0.22 for UBE), and they omit study habits, personal circumstances, and the bar prep engagement variables that are available in the dataset, so they identify associations rather than causal effects. Based on the findings, the school could strengthen bar outcomes by providing early academic support for students entering with lower LSAT scores and UGPAs and by enhancing writing-focused practice opportunities. These recommendations align with the predictors that showed the strongest effects in the models.