Introduction & Background

The consulting firm of Penick & Rust, a wholly-owned subsidiary of Dewey, Cheatham, and Howe, was hired to develop a strategy for the Law School of Major University (home of the Major Miners) to improve passage rates for the bar exam. The Law School provided Penick & Rust with data for the past two Academic Years (20/21 & 21/22). Licensure to practice law requires a candidate to demonstrate proficiency in law (usually with a graduate degree from an accredited law school) and to earn a passing score on the bar examination [1]. “The Bar,” as it is frequently called, combines three rounds of testing: multiple-choice questions in the Multistate Bar Examination (MBE), essays in the Multistate Essay Examination (MEE), and more generalized questions in the Multistate Performance Test (MPT). The scores of the three sections are then combined into the Uniform Bar Exam (UBE) score. The typical weighting of the three portions making up the UBE is 50% MBE, 30% MEE, and 20% MPT [2]. However, each jurisdiction defines its own passing score, as outlined in Table 1.

Table 1 - Minimum Passing UBE Score by Jurisdiction [3]

Minimum UBE Score | Jurisdiction
260 | Alabama, Minnesota, Missouri, New Mexico, North Dakota
264 | Indiana, Oklahoma
266 | Connecticut, D.C., Illinois, Iowa, Kansas, Kentucky, Maryland, Montana, New Jersey, New York, South Carolina, Virgin Islands
268 | Michigan
270 | Alaska, Arkansas, Colorado, Maine, Massachusetts, Nebraska, New Hampshire, North Carolina, Ohio, Oregon, Rhode Island, Tennessee, Texas, Utah, Vermont, Washington, West Virginia, Wyoming
272 | Idaho, Pennsylvania
273 | Arizona

Understanding the UBE Score

The great state of Texas, home of Major University, uses the UBE total score to determine whether students pass the Bar Exam. Like most states, Texas calculates the UBE by weighting the three components of the UBE score differently, as shown in Figure 1. Texas also requires a passing score of 85 or higher on the Multistate Professional Responsibility Exam (MPRE), but this score is separate from, and not included in, the calculation of the UBE [4].

Calculating the UBE is not straightforward. The UBE is a score out of 400 possible points. The MBE portion is scored out of 200 points and carries a 50% weight in the UBE. The MEE and MPT, however, are scored on a 6-point scale, then scaled and equated to the MBE before being combined to form the UBE. The simplest way to approximate a UBE score is to calculate each component as a percentage of its maximum, multiply each percentage by its weight, add the results into a total percentage, and then multiply that total by 400 points [5]. However, scaling also comes into play so that scores can be compared from examinee to examinee from the same test [6]. Therefore, the UBE cannot be reproduced exactly from the three values of MEE, MPT, and MBE. The three components are nonetheless related to one another, and some collinearity is expected if they are used together as factors within a model.
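As a rough illustration of the simple percentage-based calculation described above, the following R sketch combines the three components using the typical 50/30/20 weights. The function name approx_ube and its arguments are hypothetical, and the sketch ignores the scaling and equating that jurisdictions actually apply, so it should not be expected to reproduce official UBE scores.

```r
# Rough UBE approximation: percentage of each component, weighted, times 400.
# Assumes the MBE is out of 200 and the MEE/MPT are raw scores on a 6-point scale.
# approx_ube() is a hypothetical helper, not part of the original analysis.
approx_ube <- function(mbe, mee, mpt,
                       weights = c(mbe = 0.5, mee = 0.3, mpt = 0.2)) {
  stopifnot(abs(sum(weights) - 1) < 1e-8)  # weights must sum to 1
  pct <- weights[["mbe"]] * (mbe / 200) +
         weights[["mee"]] * (mee / 6) +
         weights[["mpt"]] * (mpt / 6)
  400 * pct
}

# Example with made-up component scores
approx_ube(mbe = 140, mee = 4, mpt = 4)
```

Because the official score depends on scaling that varies by administration, this approximation is best used only to build intuition about how the weights combine.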

Understanding the Data and Variables

The data was provided in a Microsoft Excel file with four sheets. The first sheet contained the list of students that did not pass the bar exam in 2022, and the second sheet contained the list of students that did pass the bar in 2022. Similarly, the third sheet contained the list of students that did not pass the bar exam in 2021, and the fourth sheet contained the list of students that did pass the bar in 2021. This is the extent of the data provided. Next, the data was read into RStudio and combined into a single data frame for further data wrangling and the subsequent analysis.
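The combining step can be sketched as follows. This is a minimal illustration, not the code used for the report: combine_sheets, the Sheet column, the sheet names, and the file name in the comment are all hypothetical, and the tiny data frames are made up so the example runs without the workbook.

```r
# Hypothetical helper: stack a named list of data frames (one per sheet)
# and record which sheet each row came from.
combine_sheets <- function(sheets) {
  tagged <- Map(function(df, nm) {
    df$Sheet <- nm
    df
  }, sheets, names(sheets))
  combined <- do.call(rbind, tagged)
  rownames(combined) <- NULL
  combined
}

# With the real workbook, the list could be built with the readxl package
# (the file name is an assumption):
# path   <- "bar_exam_data.xlsx"
# sheets <- lapply(setNames(nm = readxl::excel_sheets(path)),
#                  function(s) as.data.frame(readxl::read_excel(path, sheet = s)))

# Tiny in-memory stand-in with made-up rows
sheets <- list(
  Fail2022 = data.frame(LSAT = c(150, 152), PASS = c(0, 0)),
  Pass2022 = data.frame(LSAT = c(158, 161), PASS = c(1, 1))
)
All_Data <- combine_sheets(sheets)
All_Data
```

Keeping the source sheet as a column makes it possible to check later that the pass/fail and year labels survived the merge.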

A summary sheet defining each of the variables was provided with the data. The following discussion describes each variable, classifies it by type (continuous, ordinal, categorical, etc.), and speculates on its usefulness and on any anticipated interactions with the other variables.

LSAT

This variable is the student’s score on the LSAT entrance examination. Referencing Dr. Timothy Bond, Associate Professor of Economics at Purdue University, scores from standardized tests are typically considered ordinal variables (whether that is the right thing to do or not, in his opinion) because higher test scores are inherently better than lower test scores [7]. Therefore, this analysis will treat the LSAT variable as an ordinal variable. It is considered a stand-alone variable and is not expected to interact with other variables in the data set, although one could anticipate a correlation between LSAT score and undergraduate Grade Point Average (GPA).

UGPA

The variable UGPA is the student’s undergraduate GPA reported to the law school. Per the Statistics Wiki page at Rochester University, a GPA can be treated as a continuous variable [8]. A GPA will typically indicate the capabilities of an individual, so it would be reasonable to expect a GPA to interact with other variables related to academic proficiency (like course scores, standardized tests, etc.). However, a GPA measures past performance, and past performance is not a perfect indicator of future performance.

Class

The Class variable indicates the year the student entered law school. The year a student started would be a categorical variable because the “year started” is just a category for dividing the students. There is no indication that the Class variable will interact with other variables as it is merely a label. However, the capabilities of student cohorts can vary wildly year over year.

Another factor to consider regarding the Class variable is that both years presented in the data were affected by the COVID-19 pandemic, which could affect the analysis of the data.

CivPro

The CivPro variable represents the student’s score in a Civil Procedures course taken during their first year in Law School. The data is reported as a letter grade and will, therefore, be treated as ordinal, with the levels ordered as F < D < D+ < C < C+ < B < B+ < A.
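The ordinal treatment of letter grades can be expressed in R as an ordered factor. The sketch below uses made-up grades; grade_levels and civpro_raw are hypothetical names introduced for illustration.

```r
# Letter grades as an ordered factor: F < D < D+ < C < C+ < B < B+ < A
grade_levels <- c("F", "D", "D+", "C", "C+", "B", "B+", "A")

civpro_raw <- c("B+", "C", "A", "D+", "B")   # made-up grades
CivPro <- factor(civpro_raw, levels = grade_levels, ordered = TRUE)

# Ordered factors support comparisons against a level
CivPro > "C"

# The highest grade observed
max(CivPro)
```

The same levels vector can be reused for the other letter-graded courses so that all of them share one ordering.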

LP1

The LP1 variable represents the student’s score in the Legal Practice 1 course. This course is taken during the Fall semester of the student’s first year in Law School. The course is writing intensive and may have some bearing on the essay components of the bar exam. Again, the data is reported as a grade score and will, therefore, be treated as ordinal, where the levels will be treated as such: F < D < D+ < C < C+ < B < B+ < A.

LP2

The LP2 variable represents the student’s score in the Legal Practice 2 course taken during the Spring semester of the student’s first year in Law School. The course is, again, writing-intensive and may have some bearing on the essay components of the bar exam. Once again, the data is reported as a letter grade. However, during the Spring of 2020, many students were allowed to protect their GPA by receiving Credit (CR) for the class instead of the letter grade they would otherwise have been assigned. This became a common procedure for many schools during the Spring of 2020 and the COVID-19 pandemic. Therefore, the LP2 variable has a different ordinal structure than the other course grades reported in the data.

There are two ways to approach this. Both options are viable, and both are used in this report. The first is to count all Credits (CR) as a “C,” which is the required “passing” grade for the course to count. By doing this, the data remains ordinal and matches the levels and structure of the other course grades, with the levels ordered as F < D < D+ < C < C+ < B < B+ < A. The second option is to treat all LP2 scores as a factor with two levels: “Pass” (A = B+ = B = C+ = C = CR) or “Fail” (D+ = D = F). The advantage of treating the data this way is that it is fast and easy to apply. The downside is that it removes some fidelity from the data. With this option, all other course grades could also be treated as “pass/fail” so the data aligns better.

Each of the following analyses states how LP2 was treated.
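Both LP2 treatments can be sketched in a few lines of R. The grades below are made up, and the object names (lp2_raw, LP2_ordinal, LP2_passfail) are hypothetical.

```r
grade_levels <- c("F", "D", "D+", "C", "C+", "B", "B+", "A")
lp2_raw <- c("A", "CR", "B+", "D", "CR", "C+")   # made-up grades

# Option 1: count every Credit (CR) as a "C" and keep the ordinal scale
lp2_as_c <- ifelse(lp2_raw == "CR", "C", lp2_raw)
LP2_ordinal <- factor(lp2_as_c, levels = grade_levels, ordered = TRUE)

# Option 2: collapse to a two-level Pass/Fail factor (C or better, or CR, passes)
passing <- c("A", "B+", "B", "C+", "C", "CR")
LP2_passfail <- factor(ifelse(lp2_raw %in% passing, "Pass", "Fail"),
                       levels = c("Fail", "Pass"))

data.frame(lp2_raw, LP2_ordinal, LP2_passfail)
```

Option 1 assumes a CR student would have earned exactly a C, which may understate stronger students; Option 2 avoids that assumption at the cost of discarding the grade ordering.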

OneLCUM

This variable represents the cumulative GPA of first-year law students at the end of their first year of law school. Like the UGPA, the first-year cumulative GPA will be treated as a continuous variable.

FGPA

This variable represents the final GPA of a student as they graduate from law school and go on to take the bar exam. Like the other GPA values used in the analysis, this value will also be treated as a continuous variable.

Accom

The Accom variable is a binary “Yes” or “No” that indicates whether a law school student received educational accommodations from the Student Disability Services office. These accommodations can include extra time on tests, extended due dates on homework, authority to record lectures, or having an assigned note taker during class time, among other things. Therefore, the Accom variable will be treated as a factor with two levels.

Probation

This variable is also a binary “Yes” or “No” that indicates whether the student was ever on academic probation due to a poor GPA. A student may go on academic probation and still complete the program successfully. What the data does not show, however, is how many students went on academic probation and did not complete the program. This may or may not have a bearing on the forthcoming data analysis. Like the Accom variable, Probation will be treated as a factor with two levels.

LegalAnalysis

Legal Analysis is an elective course offered in law school. This variable indicates whether the student did, or did not, take the Legal Analysis course. The student’s grade in the course is not considered, just whether they took it for credit. Therefore, LegalAnalysis will be treated as a factor with two levels.

AdvLegalPerf

Legal Analysis Performance is another elective course offered by the law school. Like the previous elective, it indicates whether a student did, or did not, take the Legal Analysis Performance course. Again, student scores are not reported, only whether they took the course for credit. Therefore, AdvLegalPerf will be treated as a factor with two levels.

AdvLegalAnalysis

Advanced Legal Analysis is another elective course offered at the law school. Similar to the other two electives outlined, it indicates whether the student took the course for credit or not. The treatment of AdvLegalAnalysis will be no different than the other electives covered; it will be a factor with two levels.

BarPrep

Each law student is encouraged to take a preparatory course for the bar exam. However, it is not a requirement for a law degree. Instead, students self-report their involvement in a bar exam prep course. Themis and Barbri offer the two most popular courses. Since not all students will take a bar prep course or may forget to report it, there are some “NA” blanks in the data for the BarPrep variable. These “NA” blanks have been replaced with “unknown,” so the BarPrep variable is treated as a factor with three levels: Themis, Barbri, and unknown.
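The replacement of missing BarPrep values can be sketched as below. The values are made up and bar_prep_raw is a hypothetical name.

```r
# Self-reported prep course, with NA where nothing was reported (made-up data)
bar_prep_raw <- c("Themis", NA, "Barbri", "Themis", NA)

# Replace NA with an explicit "unknown" level, then build a three-level factor
bar_prep <- ifelse(is.na(bar_prep_raw), "unknown", bar_prep_raw)
BarPrep <- factor(bar_prep, levels = c("Themis", "Barbri", "unknown"))

table(BarPrep)
```

Making "unknown" an explicit level keeps those students in the model instead of having them dropped for missingness.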

PctBarPrepComplete

The variable PctBarPrepComplete is reported to the law school by the companies that administer the bar exam prep courses (namely Themis and Barbri). The data is a percentage of the prep course that the student completed. Therefore, the data values are considered continuous over a range from 0 to 1. A 0 in this variable indicates that a student did not take a bar prep course or did not have their percent complete reported to the correct school when filling out their paperwork with the course administrator.

NumPrepWorkshops

The law school at Major University offers bar exam prep workshops during a student’s final semester. There are five workshops offered. The NumPrepWorkshops variable counts the number of workshops the student attended. It is presumed that attending one workshop is better than attending no workshops. Likewise, it is assumed that attending two workshops is better than attending one workshop, and so on. Because of this, this variable is treated as ordinal, where the levels are treated as 0 < 1 < 2 < 3 < 4 < 5.

StudentSuccessInitiative

The StudentSuccessInitiative variable tracks students participating in the law school’s student success initiative program. To be considered for the program, the student must have had a low cumulative GPA at some point during their law school career. This variable is treated as a factor with two levels, either “Yes” or “No.”

BarPrepMentor

A bar prep mentor is a law professional from the community local to Major University. These individuals want to give back and mentor law students in many aspects of their careers. In particular, the mentor is to spend some time on how they prepared for the bar exam and what it was like taking the bar exam. Because the program is voluntary, this variable is treated as a factor with two levels, “Yes” or “No.”

MPRE

The Multistate Professional Responsibility Exam (MPRE) is a standardized ethics exam for the law profession. Not all jurisdictions in the United States require the MPRE to become a professional lawyer, but some do. For instance, Texas requires an MPRE score of 85 or better, in addition to passing the bar exam, to become a professional lawyer [9]. The MPRE variable is treated as a continuous variable from 0 to 150. A zero indicates a student did not take the MPRE, probably because the jurisdiction they will be going to does not require it.

MPT

The Multistate Performance Test (MPT) is made up of two 90-minute sessions that cover a range of topics from legal analysis to ethical dilemmas [10]. The MPT score is reported on a 6.0 scale and could be treated as continuous within that range. However, an inspection of the data reveals that the scores appear to have been rounded to the nearest whole or half number. Based on this, the MPT variable is treated as ordinal, where the levels are 1.0 < 1.5 < 2.0 < 2.5 < 3.0 < 3.5 < 4.0 < 4.5 < 5.0 < 5.5 < 6.0.

MEE

The Multistate Essay Exam (MEE) tests for the ability to “identify legal issues raised by a hypothetical factual situation; separate material which is relevant from that which is not; present a reasoned analysis of the relevant issues in a clear, concise, and well-organized composition; and demonstrate an understanding of the fundamental legal principles relevant to the probable solution of the issues raised by the factual situation [11].” Most jurisdictions calculate the MEE score on a 1.0 to 6.0 scale [12]. Therefore, one would likely assume the data would be continuous on that scale. However, upon inspection, the MEE data is discretized in steps of one sixth and should be treated as ordinal data where the levels are 1.0000 < 1.1667 < 1.3333 < 1.5000 < 1.6667 < 1.8333 < 2.0000 < 2.1667 < 2.3333 < 2.5000 < 2.6667 < 2.8333 < 3.0000 < 3.1667 < 3.3333 < 3.5000 < 3.6667 < 3.8333 < 4.0000 < 4.1667 < 4.3333 < 4.5000 < 4.6667 < 4.8333 < 5.0000 < 5.1667 < 5.3333 < 5.5000 < 5.6667 < 5.8333 < 6.0000.
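The MEE levels listed above are spaced one sixth of a point apart, so they can be generated rather than typed out. The sketch below builds the 31 levels and an ordered factor from made-up scores; mee_levels and mee_raw are hypothetical names, and rounding to four decimals is an assumption about how the values appear in the spreadsheet.

```r
# All MEE levels from 1 to 6 in steps of 1/6, rounded to four decimals
mee_levels <- round(seq(1, 6, by = 1/6), 4)
length(mee_levels)   # number of ordinal levels

# Made-up MEE scores as they might appear in the data
mee_raw <- c(3.5, 4.1667, 2.8333, 4.1667)
MEE <- factor(round(mee_raw, 4), levels = mee_levels, ordered = TRUE)
MEE
```

With this many levels and only a couple of hundred students, many levels will hold very few observations, which is worth keeping in mind when MEE is used as an ordinal predictor.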

MBE

The Multistate Bar Exam (MBE) determines how well a law student applies legal principles and reasoning to analyze given circumstances [13]. The MBE has a maximum score of 200 points and can be considered continuous throughout the scoring range. Upon inspection of the data provided, this is a safe assumption.

UBE

The Uniform Bar Exam (UBE) score is the scaled composite of the MPT, MEE, and MBE scores. The math behind the calculation is discussed in the section Understanding the UBE Score. The maximum UBE score is 400 points, and the score could be considered continuous from 0 (a highly unlikely score) to 400. After inspection of the data, this is a reasonable treatment.

PASS

The PASS variable identifies whether the student passed the bar exam (1) or did not pass the bar exam (0). It will be treated as a factor with two levels.

Methods

The Law School of Major University contracted us, Penick & Rust, to help improve the passage rates for the bar. The university provided us with an Excel spreadsheet about students’ passing rates for the bar exam.

To make recommendations, Penick & Rust analyzed the spreadsheet using the R programming language, building linear regression and generalized linear regression models to help predict the exam scores and keep the best first-year students.

Data Collected

The data was collected for the admitted classes of 2018 and 2019, a total of 224 students. Each student record includes various academic attributes, which are listed in Table 2. The table is arranged into three columns: Miscellaneous Attributes, Yes/No Attributes, and Numerical Attributes.

Table 2 - Summary of Attributes

Misc. Attributes | Yes/No Attributes | Numerical Attributes
Class | Probation | MPRE
CivPro | LegalAnalysis | MPT
LP1 | AdvLegalPerf | MEE
LP2 | AdvLegalAnalysis | MBE
BarPrep | StudentSuccessInitiative | PctBarPrepComplete
 | BarPrepMentor | NumPrepWorkshops
 | Accom | FGPA
 | PASS | LSAT
 | | UGPA
 | | OneLCUM
 | | UBE

The miscellaneous attributes include the grades awarded to students for three courses (CivPro, LP1, and LP2), the bar prep course taken (BarPrep), and the year of admission (Class). Importantly, LP2 was graded during the height of the COVID-19 pandemic, and some students chose to receive credit for the course instead of a letter grade. Unfortunately, the credit option threw off the ordinal scale of the data, so we decided to set a pass/fail cut point at a letter grade of C: a student had to earn at least a C to pass LP2, and anything less than a C counted as a failing grade for the course.

The Yes/No attributes are binary fields: whether the student took certain classes, needed exam accommodations, was ever on academic probation, passed the bar exam, had a bar prep mentor, or took part in the student success initiative program. We considered these to be binary attributes that may affect the Numerical attributes of the student.

The Numerical attributes consisted of the required law examinations (MPT, MBE, MEE, UBE, and MPRE), the percentage of the bar prep course completed, the number of prep workshops attended, the undergraduate, first-year, and final GPAs, and the Law School Admission Test (LSAT) score. We considered the major law examinations to be outcome variables because the other attributes could affect whether a student becomes a lawyer or not.

Method of the Study

We analyzed the dataset in R using linear and generalized linear regression to help predict the outcome of law exam scores and keep the best first-year students. To investigate the difference between the models, we analyzed the VIF of factors, the significance of the factors/levels, the AIC of the different models, the \(R^2\) value of the current model, and the diagnostic plots for the individual models.

To determine the VIFs for the predictors, we used the vif function in R, which returned a GVIF value. The GVIF is a generalized version of the VIF for terms with more than one degree of freedom, such as factors with several levels. To analyze all predictors on the same scale, we transformed each value using \(\mathrm{GVIF}^{1/(2 \cdot df)}\). Squaring this transformed value gives a quantity that is comparable to the ordinary VIF of a predictor with one degree of freedom [14].
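The GVIF adjustment can be illustrated with a small base-R sketch. The GVIF values below are made up, and the data frame mimics the GVIF and Df columns that the vif function in the car package reports for models containing factor terms; gvif_tab and the column names adj and vif_scale are hypothetical.

```r
# Hypothetical GVIF table for a model with one numeric and two factor terms
gvif_tab <- data.frame(
  term = c("UGPA", "CivPro", "LP1"),
  GVIF = c(1.20, 2.50, 3.10),   # made-up values
  Df   = c(1, 6, 7)
)

# Adjusted value GVIF^(1/(2*Df)): comparable to sqrt(VIF) for a 1-df term
gvif_tab$adj <- gvif_tab$GVIF^(1 / (2 * gvif_tab$Df))

# Squaring puts every term on the ordinary VIF scale
gvif_tab$vif_scale <- gvif_tab$adj^2

gvif_tab
```

For a term with one degree of freedom the squared value equals the GVIF itself, so the usual rules of thumb (for example, a VIF above five) can then be applied to every row.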

The AIC was used to differentiate between the models when they were run in a stepwise regression. The goal was to minimize the AIC for the many different combinations of predictors that models could generate in stepwise regression.
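As a minimal sketch of AIC-driven stepwise selection, the example below uses R's built-in step function on the built-in mtcars data, which stands in for the law school data. The report does not show its exact stepwise call, so the starting model and the direction argument here are assumptions.

```r
# Full starting model on a stand-in data set (mtcars ships with R)
full_fit <- lm(mpg ~ wt + hp + disp + qsec + drat, data = mtcars)

# Stepwise search that adds or drops terms only when doing so lowers the AIC
step_fit <- step(full_fit, direction = "both", trace = 0)

formula(step_fit)                                   # terms that were kept
c(full = AIC(full_fit), stepwise = AIC(step_fit))   # compare the two models
```

Because each step is accepted only when the AIC decreases, the selected model never has a higher AIC than the starting model.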

The other two types of analysis, the significance of the factors/levels and \(R^2\), were done on each model we deemed important. First, the \(R^2\) was used to see how much of the variation in the outcome variable the model’s predictors explain. Then, we looked at the p-values for each factor and level to see whether they had a significant effect on the outcome variables.

Analysis

Keeping the Right Students

This section presents a quick analysis of the first-year cumulative GPA compared with the LSAT score, undergraduate GPA, and grades in the Civil Procedures, Legal Practice 1, and Legal Practice 2 courses. For this analysis, any “Credit” (CR) grades for the Legal Practice 2 course have been converted to a grade of “C.”

The model is a standard linear model, CMP_mod1<-lm(OneLCUM ~ LSAT + UGPA + CivPro + LP1 + LP2, data=CMP_Mod1_dat). The model summary (below) indicates the model has an \(R^2\) value of 0.7819 and a p-value of \(3.7015606 \times 10^{-40}\), indicating a good model fit. Each variable in the model shows some level of significance. A quick check with the dredge function shows that this model has the lowest AICc value among the candidate models.

## 
## Call:
## lm(formula = OneLCUM ~ LSAT + UGPA + CivPro + LP1 + LP2, data = CMP_Mod1_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6703 -0.1436  0.0000  0.1445  0.5743 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.061835   0.180024  11.453  < 2e-16 ***
## LSAT.L       0.055707   0.185398   0.300 0.764169    
## LSAT.Q      -0.105076   0.168196  -0.625 0.532959    
## LSAT.C      -0.527429   0.157138  -3.356 0.000966 ***
## LSAT^4       0.072351   0.155511   0.465 0.642328    
## LSAT^5      -0.266490   0.154276  -1.727 0.085848 .  
## LSAT^6      -0.241611   0.164032  -1.473 0.142540    
## LSAT^7      -0.160006   0.168868  -0.948 0.344665    
## LSAT^8      -0.205677   0.160804  -1.279 0.202551    
## LSAT^9       0.229857   0.158002   1.455 0.147503    
## LSAT^10     -0.181393   0.161998  -1.120 0.264349    
## LSAT^11      0.150779   0.156978   0.961 0.338108    
## LSAT^12     -0.279199   0.143419  -1.947 0.053149 .  
## LSAT^13     -0.086459   0.127388  -0.679 0.498210    
## LSAT^14     -0.189479   0.119827  -1.581 0.115600    
## LSAT^15     -0.020633   0.120053  -0.172 0.863736    
## LSAT^16      0.058724   0.118460   0.496 0.620697    
## LSAT^17      0.101878   0.106605   0.956 0.340548    
## LSAT^18      0.298997   0.090145   3.317 0.001105 ** 
## LSAT^19      0.121162   0.074044   1.636 0.103542    
## LSAT^20      0.140622   0.064466   2.181 0.030476 *  
## LSAT^21     -0.003122   0.051539  -0.061 0.951771    
## UGPA         0.165408   0.048593   3.404 0.000821 ***
## CivPro.L     0.807330   0.142150   5.679 5.45e-08 ***
## CivPro.Q    -0.012055   0.131660  -0.092 0.927151    
## CivPro.C     0.180432   0.112918   1.598 0.111848    
## CivPro^4    -0.129982   0.093111  -1.396 0.164467    
## CivPro^5     0.168355   0.068392   2.462 0.014790 *  
## CivPro^6    -0.156734   0.048788  -3.213 0.001563 ** 
## LP1.L        0.607291   0.146387   4.149 5.19e-05 ***
## LP1.Q       -0.077906   0.139471  -0.559 0.577156    
## LP1.C        0.101057   0.135388   0.746 0.456399    
## LP1^4       -0.196313   0.105312  -1.864 0.063962 .  
## LP1^5        0.102784   0.100669   1.021 0.308645    
## LP1^6       -0.089637   0.104332  -0.859 0.391419    
## LP1^7        0.082339   0.075804   1.086 0.278859    
## LP2.L        0.602996   0.149889   4.023 8.50e-05 ***
## LP2.Q       -0.317728   0.136827  -2.322 0.021365 *  
## LP2.C        0.339262   0.118106   2.873 0.004570 ** 
## LP2^4       -0.205777   0.101782  -2.022 0.044710 *  
## LP2^5       -0.065976   0.072393  -0.911 0.363348    
## LP2^6        0.155022   0.047738   3.247 0.001393 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2236 on 177 degrees of freedom
## Multiple R-squared:  0.7819, Adjusted R-squared:  0.7313 
## F-statistic: 15.47 on 41 and 177 DF,  p-value: < 2.2e-16
## Fixed term is "(Intercept)"
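The coefficient names in the summary above (for example LP1.L, LP1.Q, LP1.C, and LP1^4) arise because R codes ordered factors with orthogonal polynomial contrasts by default, so each ordinal predictor contributes one term per level beyond the first (linear, quadratic, cubic, and so on). The sketch below reproduces that naming on made-up data; toy, grade, and gpa are hypothetical names.

```r
grade_levels <- c("F", "D", "D+", "C", "C+", "B", "B+", "A")

# Made-up data: five students at each grade level
toy <- data.frame(
  grade = factor(rep(grade_levels, each = 5),
                 levels = grade_levels, ordered = TRUE)
)
set.seed(42)
toy$gpa <- 2 + 0.2 * as.integer(toy$grade) + rnorm(nrow(toy), sd = 0.2)

# An ordered factor with 8 levels yields 7 polynomial contrast terms
fit <- lm(gpa ~ grade, data = toy)
names(coef(fit))

# The contrast matrix R uses for a three-level ordered factor
contr.poly(3)
```

This also explains why LSAT, treated as ordinal, consumes so many degrees of freedom in the model: every distinct LSAT score beyond the first adds another polynomial term.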

Figures 2 and 3 show the Residuals vs. Fitted and Normality plots, respectively, for this model. Figure 4 shows the result of the boxcox function. There are no concerns over the normality of the data or the variance, and there is no indication that the data needs to be transformed.

Figure 2 - Residuals vs. Fitted (CMP_mod1<-lm(OneLCUM ~ LSAT + UGPA + CivPro + LP1 + LP2, data=CMP_Mod1_dat))

## Warning: not plotting observations with leverage one:
##   17, 75, 83, 92, 124, 164, 199, 212

Figure 3 - Normality Plot (CMP_mod1<-lm(OneLCUM ~ LSAT + UGPA + CivPro + LP1 + LP2, data=CMP_Mod1_dat))

Figure 4 - Boxcox Plot (CMP_mod1<-lm(OneLCUM ~ LSAT + UGPA + CivPro + LP1 + LP2, data=CMP_Mod1_dat))

This analysis indicates that the Law School at Major University monitors the variables that predict a successful first-year law student. The subsequent analysis will examine whether the cumulative GPA from the first year predicts further success in the program.

Improving the MBE

The MBE counts as 50% of the total UBE score, so a low score on the MBE could keep a student from accumulating the points needed to pass the UBE. We therefore tried to predict the MBE score and to see which courses or factors help the student pass the UBE.

To predict the MBE score, we used the courses CivPro, LP1, LP2, LegalAnalysis, AdvLegalPerf, and AdvLegalAnalysis. We chose these courses because they seem directly related to the MBE.

## 
## Call:
## lm(formula = MBE ~ CivPro + LP1 + LP2 + LegalAnalysis + AdvLegalPerf + 
##     AdvLegalAnalysis, data = All_Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -40.619  -6.328   0.000   6.379  23.726 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       142.1143     3.7145  38.259   <2e-16 ***
## CivPro.L           10.5467     6.6243   1.592   0.1129    
## CivPro.Q            4.5119     5.9169   0.763   0.4466    
## CivPro.C            2.2507     4.9287   0.457   0.6484    
## CivPro^4           -1.8479     4.0089  -0.461   0.6453    
## CivPro^5            5.0327     2.9769   1.691   0.0925 .  
## CivPro^6           -4.4996     2.0644  -2.180   0.0304 *  
## LP1.L              20.2635     6.2124   3.262   0.0013 ** 
## LP1.Q             -14.2183     5.9712  -2.381   0.0182 *  
## LP1.C              13.0508     5.8020   2.249   0.0256 *  
## LP1^4             -10.4723     4.5179  -2.318   0.0215 *  
## LP1^5               5.8115     4.3350   1.341   0.1816    
## LP1^6              -2.9777     4.6045  -0.647   0.5186    
## LP1^7               2.8805     3.2808   0.878   0.3810    
## LP2.L               4.1659     3.6320   1.147   0.2527    
## LegalAnalysisY     -2.4406     3.6707  -0.665   0.5069    
## AdvLegalPerfY       3.2736     2.7593   1.186   0.2369    
## AdvLegalAnalysisY  -0.1687     1.3968  -0.121   0.9040    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.946 on 201 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.2713, Adjusted R-squared:  0.2097 
## F-statistic: 4.403 on 17 and 201 DF,  p-value: 1.075e-07

The model is MBE_model <- lm(data = All_Data,MBE~CivPro+LP1+LP2+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis). It is fitted with main effects only and no interaction terms. In the summary, some coefficient estimates look implausible, which could indicate collinearity or the influence of outliers.

Overall, the model explains little of the variation, with an \(R^2\) of 0.2713, although the overall F-test p-value is very low at \(1.0751399 \times 10^{-7}\). This means that the predictors jointly explain significantly more variation than an intercept-only model would, but most of the variation in MBE scores remains unexplained.
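The two quantities discussed here, the \(R^2\) and the overall p-value, can be pulled directly from a fitted model's summary. The sketch below uses the built-in mtcars data as a stand-in for the law school data; the object names are hypothetical.

```r
# Stand-in model on a built-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

# Share of the response variance explained by the predictors
r_squared <- s$r.squared

# Overall F-test: do the predictors jointly beat an intercept-only model?
f <- s$fstatistic   # named vector: value, numdf, dendf
p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

c(r_squared = r_squared, p_value = p_value)
```

The two numbers answer different questions: the p-value says whether the predictors matter at all, while the \(R^2\) says how much of the outcome they account for, so a very small p-value can coexist with a low \(R^2\).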

Figure 5 - VIF Plot

The bar graphic shows predictor variables on the y-axis and VIFs on the x-axis. The graphic can have up to two colors, reddish and turquoise. The reddish color indicates that the VIF is less than five, while the turquoise color indicates that the VIF is greater than five.

Since the VIFs of most of our predictor variables are close to one, the predictors are only slightly collinear, not enough to affect the model. Next, we will look at diagnostic plots for the MBE model.

Figure 6 - MBE Model Residuals vs. Fitted

The Residuals vs. Fitted diagnostic plot shows one value that is far away from the rest of the fitted values. This point could be an outlier, and it could be affecting the model.

## 
## Call:
## lm(formula = MBE ~ CivPro + LP1 + LP2 + LegalAnalysis + AdvLegalPerf + 
##     AdvLegalAnalysis, data = data_outlier_removal)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -40.619  -6.360  -0.038   6.480  23.726 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       146.1573     3.5390  41.299   <2e-16 ***
## CivPro.L           10.5467     6.6243   1.592   0.1129    
## CivPro.Q            4.5119     5.9169   0.763   0.4466    
## CivPro.C            2.2507     4.9287   0.457   0.6484    
## CivPro^4           -1.8479     4.0089  -0.461   0.6453    
## CivPro^5            5.0327     2.9769   1.691   0.0925 .  
## CivPro^6           -4.4996     2.0644  -2.180   0.0304 *  
## LP1.L               3.4238     4.6358   0.739   0.4610    
## LP1.Q               2.6190     3.5036   0.748   0.4556    
## LP1.C              -2.0664     3.9970  -0.517   0.6057    
## LP1^4               0.6296     4.4502   0.141   0.8876    
## LP1^5              -1.0681     3.6586  -0.292   0.7706    
## LP1^6               1.9799     2.2941   0.863   0.3892    
## LP2.L               4.1659     3.6320   1.147   0.2527    
## LegalAnalysisY     -2.4406     3.6707  -0.665   0.5069    
## AdvLegalPerfY       3.2736     2.7593   1.186   0.2369    
## AdvLegalAnalysisY  -0.1687     1.3968  -0.121   0.9040    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.946 on 201 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.2422, Adjusted R-squared:  0.1819 
## F-statistic: 4.016 on 16 and 201 DF,  p-value: 1.301e-06

After investigation, the outlier was found at index 202, and it strongly affected the coefficient estimates for LP1. With the point included, the estimates implied that earning a B+ in the course lowered the predicted MBE score, which did not make sense. Removing the point made the LP1 terms insignificant, so LP1 should be removed from the model.
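One way to locate and remove an influential observation is sketched below. The report identified its outlier from the Residuals vs. Fitted plot; this example instead uses Cook's distance, a different but related diagnostic, and runs on the built-in mtcars data as a stand-in. All object names are hypothetical.

```r
# Stand-in model on a built-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Cook's distance measures how much each observation shifts the fitted model
cooks <- cooks.distance(fit)
worst <- which.max(cooks)
names(worst)   # label of the most influential observation

# Refit without that observation and compare the coefficient estimates
trimmed <- mtcars[-worst, ]
refit <- update(fit, data = trimmed)
round(cbind(original = coef(fit), refit = coef(refit)), 4)
```

Comparing the two coefficient columns shows directly whether a single point is driving an estimate, which is the check described above for the LP1 terms.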

Figure 7 - MBE Model 2 Residuals vs. Fitted

After removing the point at index 202, the Residuals vs. Fitted plot indicates that points 205, 222, and 223 are outliers and should be investigated.

The next plot is the Normal Q-Q plot, which gives a better understanding of why the points are considered outliers.

## Warning: not plotting observations with leverage one:
##   211

Figure 8 - MBE Model 2 Normal Q-Q

R indicates that points 205, 222, and 223 are still considered outliers, but point 205 seems the most significant outlier and requires more attention.

## 
## Call:
## lm(formula = MBE ~ CivPro + LP1 + LP2 + LegalAnalysis + AdvLegalPerf + 
##     AdvLegalAnalysis, data = data_outlier_removal2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.0550  -6.4869   0.0967   6.1121  23.7688 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       145.8604     3.3908  43.016  < 2e-16 ***
## CivPro.L           10.8471     6.3461   1.709  0.08895 .  
## CivPro.Q            3.6563     5.6714   0.645  0.51987    
## CivPro.C            2.5344     4.7219   0.537  0.59204    
## CivPro^4           -1.3538     3.8419  -0.352  0.72493    
## CivPro^5            5.0661     2.8517   1.777  0.07716 .  
## CivPro^6           -5.3955     1.9882  -2.714  0.00723 ** 
## LP1.L               3.7408     4.4415   0.842  0.40066    
## LP1.Q               2.5363     3.3563   0.756  0.45072    
## LP1.C              -2.3961     3.8296  -0.626  0.53224    
## LP1^4               0.7091     4.2631   0.166  0.86807    
## LP1^5              -0.8338     3.5052  -0.238  0.81223    
## LP1^6               2.4933     2.2008   1.133  0.25861    
## LP2.L               4.0568     3.4793   1.166  0.24501    
## LegalAnalysisY     -2.4970     3.5163  -0.710  0.47845    
## AdvLegalPerfY       2.9374     2.6444   1.111  0.26799    
## AdvLegalAnalysisY   0.1650     1.3403   0.123  0.90217    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.527 on 200 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.2571, Adjusted R-squared:  0.1977 
## F-statistic: 4.327 on 16 and 200 DF,  p-value: 2.962e-07

After removing point 205, the predictor estimates are approximately the same as in the previous models. We conclude that removing more outliers would not significantly change the model, so we proceeded with the investigation.

The \(R^2\) did increase by about 0.02, but it is still low. This indicates that factors not present in the data set might contribute to MBE performance.

## 
## Call:
## lm(formula = MBE ~ CivPro, data = data_outlier_removal2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -24.7022  -6.2986   0.3978   5.7014  24.3014 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  143.402      1.604  89.394  < 2e-16 ***
## CivPro.L      13.735      5.798   2.369  0.01874 *  
## CivPro.Q       3.951      5.365   0.736  0.46233    
## CivPro.C       1.499      4.537   0.330  0.74149    
## CivPro^4      -1.254      3.733  -0.336  0.73722    
## CivPro^5       5.335      2.747   1.942  0.05343 .  
## CivPro^6      -5.276      1.899  -2.778  0.00595 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.534 on 214 degrees of freedom
## Multiple R-squared:  0.2359, Adjusted R-squared:  0.2145 
## F-statistic: 11.01 on 6 and 214 DF,  p-value: 1.089e-10

Using the step-wise regression function, the selected model was MBE ~ CivPro, which had the lowest AIC at 1003.5.
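The selection step can be sketched as follows. This is a minimal illustration on simulated data, assuming the base R step() function was used (the report does not name the function); the variables x1, x2, x3, and y are hypothetical stand-ins, not columns from the law school data.

```r
set.seed(1)
n <- 200
# Simulated stand-in data: only x1 truly drives the response.
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 140 + 5 * sim$x1 + rnorm(n, sd = 9)

full_fit <- lm(y ~ x1 + x2 + x3, data = sim)
# Backward elimination by AIC; trace = 0 suppresses the step-by-step log.
step_fit <- step(full_fit, direction = "backward", trace = 0)

print(formula(step_fit))
print(c(full = AIC(full_fit), selected = AIC(step_fit)))
```

For linear models, step() ranks candidates using extractAIC(), which differs from AIC() by a constant that depends only on the sample size, so the ordering of models fit to the same rows is the same under either function.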

The overall p-value (1.089e-10) shows that the model is statistically significant, but the model does not do a good job of explaining the variation in MBE results. The \(R^2\) of 0.2359 is low, which makes the model unsuitable for prediction.

The two terms that are significant at the 0.05 level are CivPro.L and CivPro^6, which are polynomial contrast terms for the ordered grade factor rather than individual grade levels. The beta value for CivPro.L is 13.735, and the beta value for CivPro^6 is -5.276.
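The .L, .Q, .C, and ^4 through ^6 suffixes in these summaries are the names R gives to orthogonal polynomial contrasts when a predictor is stored as an ordered factor. The sketch below shows where they come from, assuming a seven-level grade scale; the level labels and sample grades are hypothetical and are not taken from the law school data.

```r
# Hypothetical seven-level grade scale; the real CivPro levels may differ.
grade_levels <- c("C", "C+", "B-", "B", "B+", "A-", "A")
grades <- factor(c("B", "A-", "C+", "B+", "A", "C", "B-"),
                 levels = grade_levels, ordered = TRUE)

# Ordered factors use polynomial contrasts by default, so a factor with
# seven levels contributes six terms: .L, .Q, .C, ^4, ^5, ^6.
poly_contrasts <- contrasts(grades)
term_names <- colnames(poly_contrasts)
print(term_names)

# The linear contrast (.L) rises steadily across the ordered levels.
linear_scores <- poly_contrasts[, ".L"]
print(round(linear_scores, 3))
```

Under this coding, a positive .L coefficient indicates that the response tends to rise as grades improve, while the higher-order terms describe departures from a straight-line trend.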

Figure 9 - Final MBE Model Residuals vs. Fitted

The residuals of the MBE model appear to have roughly constant variance. The fitted values fall into discrete columns because the only predictor is categorical. Overall, we would say the plot shows no problem with the fit.

## Warning: not plotting observations with leverage one:
##   214

Figure 10 - Final MBE Model Normal QQ plot

The residuals look approximately normally distributed.

R flags points 222, 223, and 188 as outliers, but they do not appear to affect the normality assumption significantly.

## Warning: not plotting observations with leverage one:
##   214

Figure 11 - Final MBE Model Residuals vs. Leverage

In the leverage plot for the MBE model, four points with leverage around 0.25 sit well apart from the rest. We removed these points, but doing so did not significantly affect model performance.

Overall, the model does not do a good job of predicting the MBE score. Although we were provided with data on a list of courses, outside factors such as study time for the MBE could significantly affect MBE scores.
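The warning "not plotting observations with leverage one" typically appears when a factor level is represented by a single observation, because that row then determines its own fitted value. A minimal sketch with made-up data (the grades and scores below are hypothetical):

```r
# Made-up data: grade "A" occurs once, so that row determines its own fit.
toy <- data.frame(
  grade = factor(c("B", "B", "B", "C", "C", "C", "A")),
  score = c(150, 146, 152, 138, 141, 135, 160)
)
toy_fit <- lm(score ~ grade, data = toy)

# Hat values (leverages); for a one-factor model each equals 1 / (group size).
lev <- hatvalues(toy_fit)
print(round(lev, 3))

# Rows with leverage (numerically) equal to one.
leverage_one <- which(lev > 1 - 1e-8)
print(leverage_one)
```

Checking table() on each factor predictor is a quick way to see whether a level with one observation is the source of such a warning.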

## 
## Call:
## glm(formula = PASS ~ BarPrep + PctBarPrepComplete + NumPrepWorkshops + 
##     StudentSuccessInitiative + BarPrepMentor + Probation, family = binomial(), 
##     data = All_Data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6841   0.1701   0.2896   0.4358   1.5290  
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)               -1.65484    1.57626  -1.050  0.29379    
## BarPrepThemis              0.98069    0.56637   1.732  0.08336 .  
## PctBarPrepComplete         4.55331    1.75403   2.596  0.00943 ** 
## NumPrepWorkshops1          0.40212    0.85489   0.470  0.63808    
## NumPrepWorkshops2          1.00770    1.15099   0.876  0.38130    
## NumPrepWorkshops3         -0.44556    1.03859  -0.429  0.66792    
## NumPrepWorkshops4          1.19070    1.29381   0.920  0.35741    
## NumPrepWorkshops5         -0.08386    0.67365  -0.124  0.90093    
## StudentSuccessInitiativeY -1.82888    0.55339  -3.305  0.00095 ***
## BarPrepMentorY             1.40690    1.08528   1.296  0.19486    
## ProbationY                -1.39727    0.65241  -2.142  0.03222 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 157.53  on 210  degrees of freedom
## Residual deviance: 114.99  on 200  degrees of freedom
##   (12 observations deleted due to missingness)
## AIC: 136.99
## 
## Number of Fisher Scoring iterations: 6

The model was fit with glm(data = All_Data,PASS~BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor+Probation, family = binomial()). We used a binomial distribution with the logit link function to model the probability of a student passing given the predictor variables.

Comparing the deviance explained by the model with the \(\chi^2\) critical value gives \(D_{null}-D_{Res}=157.53-114.99=42.54\) on 10 degrees of freedom, which is greater than the 5% critical value of 18.307. The model therefore fits significantly better than the intercept-only model.

No level of the number of prep workshops attended is significant, so the next model is refit without that term. Note that the formula of the next model also omits Probation.
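The deviance comparison can be reproduced directly from the numbers printed in the summary. The sketch below copies the null and residual deviances and their degrees of freedom from the output above; the 5% significance level is an assumption, since the report does not state the level used.

```r
# Deviances and residual degrees of freedom copied from the summary above.
null_dev <- 157.53; null_df <- 210
res_dev  <- 114.99; res_df  <- 200

dev_drop <- null_dev - res_dev      # deviance explained by the predictors
df_used  <- null_df - res_df        # number of estimated slope coefficients

# Assumed 5% significance level for the chi-squared comparison.
crit    <- qchisq(0.95, df = df_used)
p_value <- pchisq(dev_drop, df = df_used, lower.tail = FALSE)

print(c(dev_drop = dev_drop, df = df_used, crit = crit, p_value = p_value))
```

The same four inputs can be swapped for the values printed with each later model to repeat the check.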

## 
## Call:
## glm(formula = PASS ~ BarPrep + PctBarPrepComplete + StudentSuccessInitiative + 
##     BarPrepMentor, family = binomial(), data = All_Data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6946   0.1576   0.3254   0.4362   1.4401  
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                -1.4258     1.5367  -0.928   0.3535    
## BarPrepThemis               0.8305     0.5357   1.550   0.1210    
## PctBarPrepComplete          4.3375     1.7189   2.523   0.0116 *  
## StudentSuccessInitiativeY  -2.0860     0.4977  -4.191 2.78e-05 ***
## BarPrepMentorY              1.4918     1.0691   1.395   0.1629    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 157.53  on 210  degrees of freedom
## Residual deviance: 121.73  on 206  degrees of freedom
##   (12 observations deleted due to missingness)
## AIC: 131.73
## 
## Number of Fisher Scoring iterations: 6

As with the previous model, this model is significant: \(D_{null}-D_{Res}=157.53-121.73=35.80\) on 4 degrees of freedom, which is greater than 9.487729. The AIC (131.73) is also lower than that of the previous model (136.99).

The Bar Prep Course (BarPrep) predictor variable is not significant (p = 0.121) and needs to be removed from the model. Therefore, the next model removes this predictor.

## 
## Call:
## glm(formula = PASS ~ PctBarPrepComplete + StudentSuccessInitiative + 
##     BarPrepMentor, family = binomial(), data = All_Data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6671   0.1540   0.3322   0.4205   1.4299  
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                -0.3339     1.3138  -0.254   0.7994    
## PctBarPrepComplete          3.4441     1.5460   2.228   0.0259 *  
## StudentSuccessInitiativeY  -1.8957     0.4690  -4.042 5.31e-05 ***
## BarPrepMentorY              1.7123     1.0598   1.616   0.1062    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 157.53  on 210  degrees of freedom
## Residual deviance: 124.24  on 207  degrees of freedom
##   (12 observations deleted due to missingness)
## AIC: 132.24
## 
## Number of Fisher Scoring iterations: 6

As with the previous two models, this model is significant because \(D_{null}-D_{Res}=33.29\), which is greater than 7.8147279. The AIC increased by about 0.5, which we did not consider a meaningful change. The Bar Prep mentor term still does not contribute significantly to the model, so the next model removes it.

## 
## Call:
## glm(formula = PASS ~ StudentSuccessInitiative + PctBarPrepComplete, 
##     family = binomial(), data = All_Data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.5522   0.2601   0.3192   0.4056   1.4135  
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                -0.4348     1.3183  -0.330   0.7415    
## StudentSuccessInitiativeY  -1.9310     0.4646  -4.157 3.23e-05 ***
## PctBarPrepComplete          3.8046     1.5433   2.465   0.0137 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 157.53  on 210  degrees of freedom
## Residual deviance: 128.28  on 208  degrees of freedom
##   (12 observations deleted due to missingness)
## AIC: 134.28
## 
## Number of Fisher Scoring iterations: 5

The AIC for this model went up by 2.04 points compared to the previous model, but all the predictor terms are now significant based on their p-values. The model is also significant because \(D_{null}-D_{Res}=157.53-128.28=29.25\), which is greater than 5.9914645.

As a result, participation in the student success initiative is associated with a lower probability of passing the exam. Because the initiative enrolls students who are already struggling, this may reflect who participates rather than an effect of the program itself. The law school should still examine this program to see whether improving it can raise the passing rate. The law school should also focus on getting more students to complete their Bar prep courses, since a higher completion percentage is associated with a higher probability of passing the UBE.
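To make the size of these effects concrete, the fitted coefficients of the final model can be converted to an odds ratio and to predicted probabilities. This is a sketch only: the coefficients are copied from the final summary above, and it assumes PctBarPrepComplete is recorded as a fraction between 0 and 1, which the output does not confirm. The helper pass_prob is introduced here for illustration.

```r
# Coefficients copied from the final model summary above.
b0 <- -0.4348; b_ssi <- -1.9310; b_pct <- 3.8046

# Inverse logit gives the predicted probability of passing.
# Assumption: PctBarPrepComplete is a fraction between 0 and 1.
pass_prob <- function(ssi, pct) plogis(b0 + b_ssi * ssi + b_pct * pct)

odds_ratio_ssi <- exp(b_ssi)   # odds multiplier for initiative participants
probs <- c(
  no_ssi_full_prep = pass_prob(ssi = 0, pct = 1.0),
  ssi_full_prep    = pass_prob(ssi = 1, pct = 1.0),
  ssi_half_prep    = pass_prob(ssi = 1, pct = 0.5)
)
print(round(odds_ratio_ssi, 3))
print(round(probs, 3))
```

An odds ratio below one for the initiative and rising probabilities with completion match the signs of the two coefficients in the summary.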

Student Success in the Law School Program

It is one thing to complete the first year of law school successfully. It is another to complete the entire program successfully. As noted in the Understanding the Data and Variables section, the Law School at Major University has a program to help its students succeed. Most notable are the academic accommodations, the student success initiative, and probation. Academic accommodations are not a negative thing; they help students who might not otherwise be able to perform well in a demanding program. Accommodations are a means of leveling the playing field. The student success initiative, however, is only available to students with a poor grade point average (GPA). It works to keep students off probation. While it carries a negative connotation, the probation program is also a means to incentivize students to work harder and stay until the completion of the program.

The model is a standard linear model, CMP_mod2<-lm(FGPA ~ OneLCUM + Accom + Probation + StudentSuccessInitiative, data=CMP_Mod2_dat). It includes only main effects, with no interaction terms. Given the nature of the variables (Accom, Probation, StudentSuccessInitiative), it is easy to see how they could be correlated with one another and drive up the Variance Inflation Factors (VIF). The model summary (below) indicates the model has an \(R^2\) value of 0.7805 and a p-value of \(1.3993568\times10^{-70}\), indicating a good model fit. Every variable in the model shows some level of significance, although Probation is significant only at the 0.1 level. A quick check with the dredge function shows that this model is the best model, with the lowest AICc value. Additionally, a brief review of the VIFs, all of which are below 2, shows no concerns about multicollinearity among the variables as they are used in this model.

## 
## Call:
## lm(formula = FGPA ~ ., data = CMP_Mod2_dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.47518 -0.08739  0.00571  0.10389  0.27619 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1.87132    0.10142  18.452  < 2e-16 ***
## OneLCUM                    0.48161    0.03055  15.767  < 2e-16 ***
## AccomY                     0.06835    0.02923   2.339   0.0203 *  
## ProbationY                 0.06819    0.03824   1.783   0.0760 .  
## StudentSuccessInitiativeY -0.21387    0.02796  -7.650  6.4e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1428 on 218 degrees of freedom
## Multiple R-squared:  0.7805, Adjusted R-squared:  0.7765 
## F-statistic: 193.8 on 4 and 218 DF,  p-value: < 2.2e-16
## Fixed term is "(Intercept)"
##                  OneLCUM                    Accom                Probation 
##                 1.907064                 1.056881                 1.364469 
## StudentSuccessInitiative 
##                 1.663198

Figures 12 and 13 show the Residuals vs. Fitted and Normality plots, respectively, for this model. Figure 14 shows the result of the boxcox function. There are no concerns over the normality of the data or the variance, and no indication that the data need to be transformed. Figure 15 shows the plot of the Standardized Residuals vs. the Fitted Values. A couple of points are approaching the \(\sqrt{3}\) rule-of-thumb for outliers and leverage points, but they are not concerning enough to adjust anything.
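The VIF values reported for this model can be understood from their definition, \(VIF_j = 1/(1-R_j^2)\). The sketch below computes them by hand on simulated predictors; x1, x2, and x3 are hypothetical names rather than variables from the law school data, and the report does not show which function produced its VIF output.

```r
set.seed(42)
n <- 220
# Simulated stand-ins; x2 is deliberately correlated with x1.
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n, sd = 0.7)
x3 <- rnorm(n)
sim <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing predictor j
# on the remaining predictors.
vif_by_hand <- sapply(names(sim), function(v) {
  others <- setdiff(names(sim), v)
  r2 <- summary(lm(reformulate(others, response = v), data = sim))$r.squared
  1 / (1 - r2)
})
print(round(vif_by_hand, 3))
```

A value near 1 means a predictor is nearly uncorrelated with the others, which is the situation the report describes for its four predictors.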

Figure 12 - Residuals vs. Fitted (CMP_mod2<-lm(FGPA ~ OneLCUM + Accom + Probation + StudentSuccessInitiative, data=CMP_Mod2_dat))

Figure 13 - Normality Plot (CMP_mod2<-lm(FGPA ~ OneLCUM + Accom + Probation + StudentSuccessInitiative, data=CMP_Mod2_dat))

Figure 14 - Boxcox Plot (CMP_mod2<-lm(FGPA ~ OneLCUM + Accom + Probation + StudentSuccessInitiative, data=CMP_Mod2_dat))

Figure 15 - Standardized Residuals vs. Fitted Values (CMP_mod2<-lm(FGPA ~ OneLCUM + Accom + Probation + StudentSuccessInitiative, data=CMP_Mod2_dat))

This analysis also indicates that the Law School at Major University is doing all the right things to help students succeed. The following analysis looks at how well the performance in some classes might help with components of the UBE.

Improving Essay Results on UBE

Essays can be daunting for students, especially when they are part of a serious test like the bar exam. However, law students must be able to convey meaningful information in concise writing. Therefore, this section looks at the impact of the Legal Practice 1 and 2 classes on the bar exam’s MEE portion. For this analysis, the “Credit” received in Legal Practice 2 in the Spring of 2020 is treated as a grade of C. Because MEE is the response in this regression, it cannot be treated as an ordinal value, so its raw values were saved and are used as numerical data.

The model is a standard linear model, CMP_mod3<-lm(MEE ~ LP1 + LP2 + LP1:LP2, data=CMP_Mod3_dat). This time, the model takes into account single factors and interactions. The model summary (below) indicates the model has an \(R^2\) value of 0.1889 and a p-value of \(3.0077433\times10^{-5}\). However, not every term in the model is significant. A quick check with the dredge function shows that a model containing only LP1 is the best model, with the lowest AICc value.

## 
## Call:
## lm(formula = MEE ~ ., data = CMP_Mod3_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.2634 -0.2634  0.0000  0.2965  1.4722 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.43147    0.11740  29.229  < 2e-16 ***
## LP1.L        1.20290    0.30715   3.916 0.000122 ***
## LP1.Q       -0.80770    0.29043  -2.781 0.005923 ** 
## LP1.C        0.48061    0.28179   1.706 0.089605 .  
## LP1^4       -0.35506    0.21990  -1.615 0.107935    
## LP1^5       -0.06405    0.21270  -0.301 0.763611    
## LP1^6        0.11355    0.22140   0.513 0.608600    
## LP1^7       -0.05961    0.15805  -0.377 0.706441    
## LP2.L       -0.11259    0.30800  -0.366 0.715083    
## LP2.Q        0.49265    0.27813   1.771 0.077997 .  
## LP2.C       -0.08295    0.24314  -0.341 0.733333    
## LP2^4        0.12106    0.20948   0.578 0.563944    
## LP2^5       -0.17623    0.15113  -1.166 0.244913    
## LP2^6        0.02416    0.09885   0.244 0.807145    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4884 on 205 degrees of freedom
## Multiple R-squared:  0.1889, Adjusted R-squared:  0.1375 
## F-statistic: 3.673 on 13 and 205 DF,  p-value: 3.008e-05
## Fixed term is "(Intercept)"

The reduced model is a standard linear model, CMP_mod3_2<-lm(MEE ~ LP1, data=CMP_Mod3_dat). The model summary (below) indicates the model has an \(R^2\) value of 0.1377 and a p-value of \(4.7777067\times10^{-5}\). The linear and quadratic LP1 terms are now significant at the 0.05 level. However, the \(R^2\) value indicates that only a small portion of the variation in MEE results is explained by the Legal Practice 1 course. Since there is only one factor in the model, there is no point in checking the VIFs.

## 
## Call:
## lm(formula = MEE ~ LP1, data = CMP_Mod3_dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.32226 -0.31901  0.01111  0.34444  1.48958 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.36346    0.08740  38.482  < 2e-16 ***
## LP1.L        1.28452    0.30620   4.195 4.02e-05 ***
## LP1.Q       -0.65993    0.29055  -2.271   0.0241 *  
## LP1.C        0.51315    0.28195   1.820   0.0702 .  
## LP1^4       -0.37608    0.22094  -1.702   0.0902 .  
## LP1^5       -0.04134    0.21404  -0.193   0.8470    
## LP1^6        0.10431    0.22385   0.466   0.6417    
## LP1^7       -0.06998    0.15933  -0.439   0.6610    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4964 on 211 degrees of freedom
## Multiple R-squared:  0.1377, Adjusted R-squared:  0.1091 
## F-statistic: 4.813 on 7 and 211 DF,  p-value: 4.778e-05

Figures 16 and 17 show the Residuals vs. Fitted and Normality plots, respectively, for this model. The shapes in these plots cast doubt on the normality and constant variance assumptions usually made for linear regression models. If this model were of greater significance, more time should be spent determining the cause of these issues, which is likely the lack of good data to support the model. Figure 18 shows the result of the boxcox function; it raises no concerns about a need to transform the data.

Figure 16 - Residuals vs. Fitted (CMP_mod3_1<-lm(MEE ~ LP1, data=CMP_Mod3_dat))

## Warning: not plotting observations with leverage one:
##   199

Figure 17 - Normality Plot (CMP_mod3_1<-lm(MEE ~ LP1, data=CMP_Mod3_dat))

Figure 18 - Boxcox Plot (CMP_mod3_1<-lm(MEE ~ LP1, data=CMP_Mod3_dat))

This analysis suggests only a weak linkage between performance in the Legal Practice courses and student scores on the MEE portion of the bar exam. However, a deeper look could be taken by obtaining grades for the Legal Analysis, Advanced Legal Analysis, and Advanced Legal Performance classes to see whether the writing content in those classes, if added to the model, would contribute to it.

Evaluating UBE Against (Nearly) Everything

The ultimate goal of this assessment is to determine if a model can be developed to predict bar passage rates based on various attributes of law school students. These attributes have already been covered in some detail elsewhere in the report and relate to the students’ performance in law school, the types of electives they took, and how much they prepared for the bar exam.

The model is built around the UBE score and will omit the MPT, MEE, and MBE scores since they are obviously directly related to the UBE. Likewise, the PASS variable will not be in this model since it directly correlates to the UBE. Otherwise, the UBE will be regressed on all the other single-factor variables and a handful of judiciously selected interactions.

The first set of interactions includes the items that most directly measure the students’ academic aptitude: the LSAT score, their undergraduate GPA, their first year of law school cumulative GPA, and their final GPA. Each combination of interactions from this group is included in the model.

The second set of interactions includes the items that identify if a student is struggling: whether they need accommodations, whether they have ever been on academic probation, and whether they were included in the Student Success Initiative. Also included in this set of interactions is the final GPA because each of these variables should impact the final GPA. Each combination of interactions from this group is included in the model.

The third set of interactions includes the items related to coursework: how they did in the Civil Procedures, Legal Practice 1, and Legal Practice 2 courses, and whether they took the Legal Analysis, Advanced Legal Performance, and Advanced Legal Analysis classes. Each combination of interactions from this group is included in the model.

The fourth set of interactions includes the items related to the students’ preparation for the bar exam: whether the student took a Bar Prep course and how much of it they completed, whether they attended the Prep Workshops, and whether they made use of a local mentor to help in their preparation. Each combination of interactions from this group is included in the model.

The initial model is a standard linear model of the form:

CMP_mod6<-lm(UBE~.+LSAT:UGPA+LSAT:OneLCUM+LSAT:FGPA+UGPA:OneLCUM+UGPA:FGPA+Accom:Probation+Accom:FGPA+Probation:FGPA+StudentSuccessInitiative:FGPA+CivPro:LP1+CivPro:LP2+LP1:LP2+LegalAnalysis:AdvLegalPerf+LegalAnalysis:AdvLegalAnalysis+AdvLegalPerf:AdvLegalAnalysis+BarPrep:PctBarPrepComplete+BarPrep:NumPrepWorkshops+BarPrep:BarPrepMentor+PctBarPrepComplete:NumPrepWorkshops+PctBarPrepComplete:BarPrepMentor+NumPrepWorkshops:BarPrepMentor+Accom:StudentSuccessInitiative+Probation:StudentSuccessInitiative, data = Mod6_dat)
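One way to generate such a formula without typing every term is to build the pairwise interactions for a group programmatically. The sketch below does this for the first group only, using the variable names shown above; it constructs a formula and fits no model, and the object names are introduced here for illustration. Note that combn produces all six pairs, whereas the formula above lists five of them (OneLCUM:FGPA does not appear).

```r
# Pairwise interactions within one group of predictors, built programmatically.
aptitude <- c("LSAT", "UGPA", "OneLCUM", "FGPA")
pair_terms <- combn(aptitude, 2, paste, collapse = ":")
print(pair_terms)

# Append the interaction terms to the "all main effects" shorthand (the dot).
rhs <- paste(c(".", pair_terms), collapse = " + ")
model_formula <- as.formula(paste("UBE ~", rhs))
print(model_formula)
```

Repeating the same steps for the other three groups and concatenating the resulting terms would yield the full set of within-group interactions.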

The model summary (below) indicates the model has an \(R^2\) value of 0.8663 and a p-value of \(4.1402838\times10^{-6}\), which both seem reasonable. However, few of the individual terms are significant, and R reports 57 coefficients as not defined because of singularities. Figures 19, 20, 21, and 22 show the Boxcox function plot, the constant variance plot, the normality plot, and the Cook’s Distance plot, respectively. Each plot is acceptable except for Cook’s Distance, which indicates some concerns with outliers and leverage points. Before eliminating these points, we will use stepwise regression to determine the best model and then check for outliers and leverage points once a better model has been selected.
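Two of the diagnostics mentioned here can be illustrated on a small simulated example: how lm() reports a coefficient it cannot estimate, and how Cook's distance is used to flag influential rows. The data and variable names below are made up, and the 4/n cutoff is one common rule of thumb rather than the threshold used in this report.

```r
set.seed(7)
n <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2          # exact linear combination of x1 and x2
y  <- 1 + 2 * x1 - x2 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x3)
# lm() drops the redundant column and reports its coefficient as NA.
print(coef(fit))
n_undefined <- sum(is.na(coef(fit)))

# Cook's distance per row; 4/n is an assumed rule-of-thumb cutoff.
cd <- cooks.distance(fit)
flagged <- which(cd > 4 / n)
print(flagged)
```

Coefficients shown as NA in a summary arise the same way: the corresponding columns of the design matrix carry no information beyond the columns already in the model.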

## 
## Call:
## lm(formula = UBE ~ . + LSAT:UGPA + LSAT:OneLCUM + LSAT:FGPA + 
##     UGPA:OneLCUM + UGPA:FGPA + Accom:Probation + Accom:FGPA + 
##     Probation:FGPA + StudentSuccessInitiative:FGPA + CivPro:LP1 + 
##     CivPro:LP2 + LP1:LP2 + LegalAnalysis:AdvLegalPerf + LegalAnalysis:AdvLegalAnalysis + 
##     AdvLegalPerf:AdvLegalAnalysis + BarPrep:PctBarPrepComplete + 
##     BarPrep:NumPrepWorkshops + BarPrep:BarPrepMentor + PctBarPrepComplete:NumPrepWorkshops + 
##     PctBarPrepComplete:BarPrepMentor + NumPrepWorkshops:BarPrepMentor + 
##     Accom:StudentSuccessInitiative + Probation:StudentSuccessInitiative, 
##     data = Mod6_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -22.755  -3.394   0.000   3.810  27.324 
## 
## Coefficients: (57 not defined because of singularities)
##                                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                          -2.710e+08  8.094e+08  -0.335 0.738871    
## LSAT.L                               -2.051e+09  6.173e+09  -0.332 0.740757    
## LSAT.Q                               -2.340e+09  7.099e+09  -0.330 0.742764    
## LSAT.C                               -2.262e+09  6.980e+09  -0.324 0.746992    
## LSAT^4                               -1.978e+09  6.252e+09  -0.316 0.752729    
## LSAT^5                               -1.553e+09  5.053e+09  -0.307 0.759631    
## LSAT^6                               -1.123e+09  3.832e+09  -0.293 0.770517    
## LSAT^7                               -7.330e+08  2.617e+09  -0.280 0.780356    
## LSAT^8                               -4.385e+08  1.702e+09  -0.258 0.797507    
## LSAT^9                               -2.366e+08  9.754e+08  -0.243 0.809155    
## LSAT^10                              -1.154e+08  5.399e+08  -0.214 0.831392    
## LSAT^11                              -5.159e+07  2.516e+08  -0.205 0.838165    
## LSAT^12                              -2.116e+07  1.157e+08  -0.183 0.855452    
## LSAT^13                              -9.025e+06  4.073e+07  -0.222 0.825381    
## LSAT^14                              -4.023e+06  1.481e+07  -0.272 0.786840    
## LSAT^15                              -2.156e+06  3.449e+06  -0.625 0.534224    
## LSAT^16                              -1.114e+06  1.094e+06  -1.018 0.312652    
## LSAT^17                              -5.558e+05  3.968e+05  -1.401 0.166170    
## LSAT^18                              -2.304e+05  1.667e+05  -1.382 0.171870    
## LSAT^19                              -8.039e+04  6.155e+04  -1.306 0.196272    
## LSAT^20                              -2.178e+04  1.753e+04  -1.243 0.218506    
## LSAT^21                              -4.011e+03  3.620e+03  -1.108 0.272077    
## UGPA                                  1.003e+08  2.073e+08   0.484 0.630299    
## Class2019                            -1.293e+01  3.730e+00  -3.466 0.000956 ***
## CivPro.L                             -1.891e+03  6.130e+03  -0.308 0.758770    
## CivPro.Q                              1.309e+03  4.752e+03   0.275 0.783911    
## CivPro.C                             -1.273e+03  2.774e+03  -0.459 0.647865    
## CivPro^4                              7.048e+02  1.205e+03   0.585 0.560733    
## CivPro^5                             -1.107e+01  2.333e+02  -0.047 0.962292    
## CivPro^6                              1.785e+01  9.414e+00   1.896 0.062591 .  
## LP1.L                                -3.778e+03  7.089e+03  -0.533 0.595891    
## LP1.Q                                 9.003e+02  5.673e+03   0.159 0.874414    
## LP1.C                                -1.644e+03  3.511e+03  -0.468 0.641262    
## LP1^4                                 1.100e+02  1.445e+03   0.076 0.939559    
## LP1^5                                -1.847e+02  4.671e+02  -0.395 0.693923    
## LP1^6                                 5.534e+01  1.798e+02   0.308 0.759249    
## LP1^7                                -3.335e+01  4.928e+01  -0.677 0.501058    
## LP2.L                                 2.867e+01  1.345e+02   0.213 0.831862    
## OneLCUM                               2.016e+07  2.528e+07   0.797 0.428175    
## FGPA                                 -4.259e+07  5.067e+07  -0.841 0.403799    
## AccomY                               -1.504e+01  9.301e+01  -0.162 0.872018    
## ProbationY                           -2.232e+02  6.423e+02  -0.348 0.729362    
## LegalAnalysisY                        3.136e+01  3.279e+01   0.956 0.342480    
## AdvLegalPerfY                         4.620e+00  8.282e+00   0.558 0.578988    
## AdvLegalAnalysisY                     3.351e+00  4.123e+00   0.813 0.419439    
## BarPrepThemis                         1.362e+01  3.061e+01   0.445 0.657809    
## PctBarPrepComplete                    2.532e+01  3.448e+01   0.734 0.465571    
## NumPrepWorkshops1                     4.795e+00  4.991e+01   0.096 0.923763    
## NumPrepWorkshops2                     1.976e+01  6.235e+01   0.317 0.752363    
## NumPrepWorkshops3                    -3.587e+01  5.108e+01  -0.702 0.485160    
## NumPrepWorkshops4                    -1.342e+02  2.479e+02  -0.541 0.590194    
## NumPrepWorkshops5                     1.140e+01  5.142e+01   0.222 0.825274    
## StudentSuccessInitiativeY             4.139e+01  1.126e+02   0.368 0.714430    
## BarPrepMentorY                       -8.340e-01  5.130e+01  -0.016 0.987080    
## MPRE                                  1.974e-02  5.715e-02   0.345 0.730991    
## LSAT.L:UGPA                           7.545e+08  1.590e+09   0.475 0.636677    
## LSAT.Q:UGPA                           8.551e+08  1.831e+09   0.467 0.642028    
## LSAT.C:UGPA                           8.145e+08  1.817e+09   0.448 0.655564    
## LSAT^4:UGPA                           7.023e+08  1.636e+09   0.429 0.669169    
## LSAT^5:UGPA                           5.386e+08  1.340e+09   0.402 0.689030    
## LSAT^6:UGPA                           3.811e+08  1.023e+09   0.372 0.710861    
## LSAT^7:UGPA                           2.406e+08  7.096e+08   0.339 0.735675    
## LSAT^8:UGPA                           1.395e+08  4.648e+08   0.300 0.765073    
## LSAT^9:UGPA                           7.129e+07  2.702e+08   0.264 0.792777    
## LSAT^10:UGPA                          3.270e+07  1.502e+08   0.218 0.828327    
## LSAT^11:UGPA                          1.284e+07  7.064e+07   0.182 0.856321    
## LSAT^12:UGPA                          4.241e+06  3.236e+07   0.131 0.896132    
## LSAT^13:UGPA                          1.122e+06  1.136e+07   0.099 0.921656    
## LSAT^14:UGPA                          1.832e+05  4.047e+06   0.045 0.964044    
## LSAT^15:UGPA                          1.651e+04  8.452e+05   0.020 0.984473    
## LSAT^16:UGPA                         -7.535e+03  2.050e+05  -0.037 0.970794    
## LSAT^17:UGPA                                 NA         NA      NA       NA    
## LSAT^18:UGPA                                 NA         NA      NA       NA    
## LSAT^19:UGPA                                 NA         NA      NA       NA    
## LSAT^20:UGPA                                 NA         NA      NA       NA    
## LSAT^21:UGPA                                 NA         NA      NA       NA    
## LSAT.L:OneLCUM                        1.480e+08  1.822e+08   0.812 0.419701    
## LSAT.Q:OneLCUM                        1.653e+08  2.079e+08   0.795 0.429412    
## LSAT.C:OneLCUM                        1.492e+08  1.841e+08   0.810 0.420945    
## LSAT^4:OneLCUM                        1.234e+08  1.562e+08   0.790 0.432574    
## LSAT^5:OneLCUM                        8.626e+07  1.071e+08   0.806 0.423399    
## LSAT^6:OneLCUM                        5.668e+07  7.270e+07   0.780 0.438506    
## LSAT^7:OneLCUM                        3.083e+07  3.860e+07   0.799 0.427527    
## LSAT^8:OneLCUM                        1.588e+07  2.086e+07   0.761 0.449268    
## LSAT^9:OneLCUM                        6.307e+06  8.014e+06   0.787 0.434272    
## LSAT^10:OneLCUM                       2.395e+06  3.299e+06   0.726 0.470457    
## LSAT^11:OneLCUM                       5.624e+05  7.328e+05   0.767 0.445667    
## LSAT^12:OneLCUM                       1.332e+05  2.070e+05   0.643 0.522314    
## LSAT^13:OneLCUM                              NA         NA      NA       NA    
## LSAT^14:OneLCUM                              NA         NA      NA       NA    
## LSAT^15:OneLCUM                              NA         NA      NA       NA    
## LSAT^16:OneLCUM                              NA         NA      NA       NA    
## LSAT^17:OneLCUM                              NA         NA      NA       NA    
## LSAT^18:OneLCUM                              NA         NA      NA       NA    
## LSAT^19:OneLCUM                              NA         NA      NA       NA    
## LSAT^20:OneLCUM                              NA         NA      NA       NA    
## LSAT^21:OneLCUM                              NA         NA      NA       NA    
## LSAT.L:FGPA                          -3.135e+08  3.642e+08  -0.861 0.392606    
## LSAT.Q:FGPA                          -3.491e+08  4.168e+08  -0.837 0.405506    
## LSAT.C:FGPA                          -3.158e+08  3.682e+08  -0.858 0.394385    
## LSAT^4:FGPA                          -2.602e+08  3.136e+08  -0.830 0.409853    
## LSAT^5:FGPA                          -1.824e+08  2.143e+08  -0.851 0.397871    
## LSAT^6:FGPA                          -1.192e+08  1.462e+08  -0.815 0.417929    
## LSAT^7:FGPA                          -6.509e+07  7.742e+07  -0.841 0.403660    
## LSAT^8:FGPA                          -3.327e+07  4.210e+07  -0.790 0.432369    
## LSAT^9:FGPA                          -1.328e+07  1.611e+07  -0.824 0.412946    
## LSAT^10:FGPA                         -4.976e+06  6.698e+06  -0.743 0.460242    
## LSAT^11:FGPA                         -1.180e+06  1.479e+06  -0.797 0.428230    
## LSAT^12:FGPA                         -2.705e+05  4.250e+05  -0.637 0.526699    
## LSAT^13:FGPA                                 NA         NA      NA       NA    
## LSAT^14:FGPA                                 NA         NA      NA       NA    
## LSAT^15:FGPA                                 NA         NA      NA       NA    
## LSAT^16:FGPA                                 NA         NA      NA       NA    
## LSAT^17:FGPA                                 NA         NA      NA       NA    
## LSAT^18:FGPA                                 NA         NA      NA       NA    
## LSAT^19:FGPA                                 NA         NA      NA       NA    
## LSAT^20:FGPA                                 NA         NA      NA       NA    
## LSAT^21:FGPA                                 NA         NA      NA       NA    
## UGPA:OneLCUM                         -5.203e+00  2.790e+01  -0.186 0.852669    
## UGPA:FGPA                             2.272e+01  3.881e+01   0.586 0.560294    
## AccomY:ProbationY                    -3.732e+01  7.576e+01  -0.493 0.624002    
## FGPA:AccomY                           2.039e+00  2.697e+01   0.076 0.939986    
## FGPA:ProbationY                       5.827e+01  2.132e+02   0.273 0.785483    
## FGPA:StudentSuccessInitiativeY       -1.164e+01  3.653e+01  -0.319 0.751093    
## CivPro.L:LP1.L                        1.568e+04  2.472e+04   0.634 0.528055    
## CivPro.Q:LP1.L                       -1.066e+04  1.841e+04  -0.579 0.564751    
## CivPro.C:LP1.L                        8.849e+03  1.131e+04   0.782 0.436958    
## CivPro^4:LP1.L                       -3.583e+03  4.919e+03  -0.728 0.469030    
## CivPro^5:LP1.L                        9.864e+01  4.269e+02   0.231 0.818021    
## CivPro^6:LP1.L                               NA         NA      NA       NA    
## CivPro.L:LP1.Q                       -3.845e+03  1.890e+04  -0.203 0.839427    
## CivPro.Q:LP1.Q                        2.056e+03  1.458e+04   0.141 0.888329    
## CivPro.C:LP1.Q                       -3.590e+03  8.307e+03  -0.432 0.667119    
## CivPro^4:LP1.Q                        1.623e+03  3.547e+03   0.458 0.648854    
## CivPro^5:LP1.Q                        1.247e+02  6.113e+02   0.204 0.838992    
## CivPro^6:LP1.Q                               NA         NA      NA       NA    
## CivPro.L:LP1.C                        4.826e+03  1.028e+04   0.469 0.640490    
## CivPro.Q:LP1.C                       -3.802e+03  7.581e+03  -0.502 0.617762    
## CivPro.C:LP1.C                        3.275e+03  4.721e+03   0.694 0.490430    
## CivPro^4:LP1.C                       -1.238e+03  2.047e+03  -0.605 0.547462    
## CivPro^5:LP1.C                               NA         NA      NA       NA    
## CivPro^6:LP1.C                               NA         NA      NA       NA    
## CivPro.L:LP1^4                        1.302e+03  3.963e+03   0.328 0.743658    
## CivPro.Q:LP1^4                       -3.060e+02  2.537e+03  -0.121 0.904372    
## CivPro.C:LP1^4                       -3.335e+02  1.343e+03  -0.248 0.804689    
## CivPro^4:LP1^4                        2.139e+02  5.389e+02   0.397 0.692755    
## CivPro^5:LP1^4                               NA         NA      NA       NA    
## CivPro^6:LP1^4                               NA         NA      NA       NA    
## CivPro.L:LP1^5                       -6.877e+02  7.783e+02  -0.884 0.380289    
## CivPro.Q:LP1^5                               NA         NA      NA       NA    
## CivPro.C:LP1^5                               NA         NA      NA       NA    
## CivPro^4:LP1^5                               NA         NA      NA       NA    
## CivPro^5:LP1^5                               NA         NA      NA       NA    
## CivPro^6:LP1^5                               NA         NA      NA       NA    
## CivPro.L:LP1^6                        3.719e+02  2.736e+02   1.359 0.178868    
## CivPro.Q:LP1^6                               NA         NA      NA       NA    
## CivPro.C:LP1^6                               NA         NA      NA       NA    
## CivPro^4:LP1^6                               NA         NA      NA       NA    
## CivPro^5:LP1^6                               NA         NA      NA       NA    
## CivPro^6:LP1^6                               NA         NA      NA       NA    
## CivPro.L:LP1^7                               NA         NA      NA       NA    
## CivPro.Q:LP1^7                               NA         NA      NA       NA    
## CivPro.C:LP1^7                               NA         NA      NA       NA    
## CivPro^4:LP1^7                               NA         NA      NA       NA    
## CivPro^5:LP1^7                               NA         NA      NA       NA    
## CivPro^6:LP1^7                               NA         NA      NA       NA    
## CivPro.L:LP2.L                        4.722e+02  7.483e+02   0.631 0.530260    
## CivPro.Q:LP2.L                        4.328e+01  3.293e+02   0.131 0.895847    
## CivPro.C:LP2.L                        3.544e+02  3.573e+02   0.992 0.325057    
## CivPro^4:LP2.L                               NA         NA      NA       NA    
## CivPro^5:LP2.L                               NA         NA      NA       NA    
## CivPro^6:LP2.L                               NA         NA      NA       NA    
## LP1.L:LP2.L                                  NA         NA      NA       NA    
## LP1.Q:LP2.L                                  NA         NA      NA       NA    
## LP1.C:LP2.L                                  NA         NA      NA       NA    
## LP1^4:LP2.L                                  NA         NA      NA       NA    
## LP1^5:LP2.L                                  NA         NA      NA       NA    
## LP1^6:LP2.L                                  NA         NA      NA       NA    
## LP1^7:LP2.L                                  NA         NA      NA       NA    
## LegalAnalysisY:AdvLegalPerfY          1.137e+01  8.385e+01   0.136 0.892533    
## LegalAnalysisY:AdvLegalAnalysisY             NA         NA      NA       NA    
## AdvLegalPerfY:AdvLegalAnalysisY       1.053e+01  1.198e+01   0.878 0.383074    
## BarPrepThemis:PctBarPrepComplete     -8.107e-01  3.317e+01  -0.024 0.980579    
## BarPrepThemis:NumPrepWorkshops1      -1.349e+01  1.068e+01  -1.264 0.211041    
## BarPrepThemis:NumPrepWorkshops2      -1.898e+01  1.204e+01  -1.576 0.120000    
## BarPrepThemis:NumPrepWorkshops3      -4.519e+00  1.225e+01  -0.369 0.713387    
## BarPrepThemis:NumPrepWorkshops4       6.138e+00  6.355e+01   0.097 0.923368    
## BarPrepThemis:NumPrepWorkshops5      -3.048e+01  1.263e+01  -2.413 0.018736 *  
## BarPrepThemis:BarPrepMentorY          1.851e+01  9.538e+00   1.941 0.056735 .  
## PctBarPrepComplete:NumPrepWorkshops1 -6.194e-01  5.545e+01  -0.011 0.991123    
## PctBarPrepComplete:NumPrepWorkshops2 -8.965e+00  6.835e+01  -0.131 0.896058    
## PctBarPrepComplete:NumPrepWorkshops3  4.715e+01  5.558e+01   0.848 0.399506    
## PctBarPrepComplete:NumPrepWorkshops4  1.518e+02  2.462e+02   0.617 0.539678    
## PctBarPrepComplete:NumPrepWorkshops5  1.432e+01  5.448e+01   0.263 0.793485    
## PctBarPrepComplete:BarPrepMentorY    -6.951e+00  5.592e+01  -0.124 0.901470    
## NumPrepWorkshops1:BarPrepMentorY     -2.211e+01  2.095e+01  -1.055 0.295386    
## NumPrepWorkshops2:BarPrepMentorY     -2.484e+00  1.256e+01  -0.198 0.843804    
## NumPrepWorkshops3:BarPrepMentorY     -1.175e+01  1.320e+01  -0.890 0.376773    
## NumPrepWorkshops4:BarPrepMentorY             NA         NA      NA       NA    
## NumPrepWorkshops5:BarPrepMentorY     -6.076e+00  1.063e+01  -0.572 0.569610    
## AccomY:StudentSuccessInitiativeY      6.151e+00  2.282e+01   0.269 0.788447    
## ProbationY:StudentSuccessInitiativeY  4.084e+01  5.385e+01   0.758 0.451083    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.29 on 63 degrees of freedom
## Multiple R-squared:  0.8663, Adjusted R-squared:  0.5606 
## F-statistic: 2.834 on 144 and 63 DF,  p-value: 4.14e-06

Figure 19 - Box-Cox Plot (CMP_mod6)

Figure 20 - Residuals vs. Fitted (CMP_mod6)

## Warning: not plotting observations with leverage one:
##   9, 16, 35, 52, 60, 66, 71, 72, 73, 76, 81, 82, 83, 86, 88, 89, 119, 139, 150, 156, 161, 165, 172, 174, 182, 185, 188, 189, 197, 199, 200, 202

Figure 21 - Normality Plot (CMP_mod6)

Figure 22 - Cook’s Distance (CMP_mod6)
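
The leverage-one warning and the NA coefficient rows in the summary are both typical of a model with more parameters than the data can support. The following is a minimal R sketch of how the two symptoms could be counted. It uses the built-in mtcars data as a stand-in because Mod6_dat is not reproduced in this report, so the formula and the object names (over_mod, n_undefined, lev_one) are illustrative assumptions, not the code behind CMP_mod6.

```r
# Stand-in data: built-in mtcars (Mod6_dat is not available here).
dat <- mtcars
dat$cyl  <- factor(dat$cyl)
dat$gear <- factor(dat$gear)
dat$carb <- factor(dat$carb)

# Deliberately over-parameterized model: a full three-way factor interaction.
over_mod <- lm(mpg ~ cyl * gear * carb, data = dat)

# Coefficients reported as NA cannot be estimated (singularities).
n_undefined <- sum(is.na(coef(over_mod)))

# Observations with leverage one are reproduced exactly by the fit,
# so they carry no information about the residual error.
lev_one <- which(hatvalues(over_mod) > 1 - 1e-8)

cat("Undefined coefficients:", n_undefined, "\n")
cat("Leverage-one observations:", length(lev_one), "\n")
```

A non-zero count for either quantity suggests the model should be reduced before its coefficients are interpreted.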

This model has clear problems. It is over-parameterized, estimating 144 coefficients from 208 observations, which leaves many coefficients undefined (NA), gives 32 observations a leverage of one, and pulls the adjusted \(R^2\) down to 0.5606 against a multiple \(R^2\) of 0.8663. Nevertheless, it is a good starting point for eliminating variables to reach a more manageable and meaningful model. The process used is step-wise regression, run in the forward, reverse, and both directions, to drive out parameters that do not add to the meaningfulness of the model. To determine whether a parameter is meaningful, each step-wise process adds or removes one parameter at a time and checks the resulting Akaike Information Criterion (AIC). The AIC estimates the relative quality of a set of statistical models fitted to the same data, with lower values preferred. The preferred model is therefore the one at which a series of step-wise changes reaches the lowest AIC value in the set.
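
For reference, the general definition of the AIC and the form it takes for a least-squares model can be written as follows. This assumes the step-wise search was run with R's step function, which the report does not state explicitly; that function compares models using the second expression.

\[
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad\Longrightarrow\qquad
\mathrm{AIC}_{\text{least squares}} = n\,\ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + \text{constant}
\]

Here \(k\) is the number of estimated coefficients, \(\hat{L}\) is the maximized likelihood, \(n\) is the number of observations, and RSS is the residual sum of squares. The constant is the same for every model fitted to the same data, so it does not affect the ranking. Adding a term always lowers RSS, but the term is kept only if that reduction outweighs the penalty of 2 per extra coefficient.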

Step-wise regression in the forward direction requires an initial model. This initial model is just the UBE score regressed on the constant one (1), that is, an intercept-only model. The process starts with this initial model, adds parameters one at a time, and keeps a parameter only if the AIC value goes down. The process continues until no further addition lowers the AIC or the full scope of the model has been reached. The full scope is the full linear model supplied to the forward process.
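
The forward search described above can be sketched in R as follows. This is an illustration under stated assumptions, not the report's own code: the built-in mtcars data stands in for Mod6_dat, mpg stands in for UBE, and the object names (null_mod, full_mod, fwd_mod) are hypothetical.

```r
# Stand-in data: built-in mtcars replaces Mod6_dat, and mpg replaces UBE.
dat <- mtcars

# Initial model: the response regressed on the constant one (intercept only).
null_mod <- lm(mpg ~ 1, data = dat)

# Full scope: every candidate first-order term.
full_mod <- lm(mpg ~ ., data = dat)

# Forward search: add one term at a time for as long as the AIC keeps dropping.
fwd_mod <- step(null_mod,
                scope = list(lower = ~ 1, upper = formula(full_mod)),
                direction = "forward",
                trace = 0)

print(formula(fwd_mod))
print(AIC(fwd_mod))
```

Setting trace = 1 instead would print the AIC of every candidate addition at each step, which is useful for checking why a term was or was not admitted.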

When the forward step-wise process is complete, the ideal model (shown below) includes the variables FGPA, PctBarPrepComplete, BarPrep, LP2, AdvLegalPerf, Class, OneLCUM, StudentSuccessInitiative, and BarPrepMentor. Some of these make sense. FGPA is one example: a student who performs well in law school would be expected to perform well on a standardized test. The variables that describe preparation for the bar exam (BarPrep, PctBarPrepComplete, and BarPrepMentor) also fit. The most surprising variable kept in the model is Class. One cohort of students would not normally be expected to outperform another by any significant degree. However, the COVID-19 pandemic, which occurred during these students’ time in law school, could have had more of an effect than expected.

## UBE ~ FGPA + PctBarPrepComplete + BarPrep + LP2 + AdvLegalPerf + 
##     Class + OneLCUM + StudentSuccessInitiative + BarPrepMentor

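The model summary (below) indicates the model has an \(R^2\) value of 0.5693 and an overall p-value of \(7.81 \times 10^{-32}\) (printed as < 2.2e-16 in the summary), both of which seem reasonable. However, not every variable in the model is individually significant: StudentSuccessInitiative (p = 0.1208) and BarPrepMentor (p = 0.1539) do not reach the 0.05 level. Before we analyze these results further, let’s see what the other step-wise processes produce.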

## 
## Call:
## lm(formula = UBE ~ FGPA + PctBarPrepComplete + BarPrep + LP2 + 
##     AdvLegalPerf + Class + OneLCUM + StudentSuccessInitiative + 
##     BarPrepMentor, data = Mod6_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -52.187  -8.472   0.051   8.877  36.192 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                104.821     17.139   6.116 5.04e-09 ***
## FGPA                        35.388      6.911   5.121 7.19e-07 ***
## PctBarPrepComplete          41.748      7.608   5.487 1.24e-07 ***
## BarPrepThemis                9.011      2.069   4.355 2.13e-05 ***
## LP2.L                       12.820      4.998   2.565   0.0111 *  
## AdvLegalPerfY                9.355      3.778   2.476   0.0141 *  
## Class2019                   -6.005      2.032  -2.956   0.0035 ** 
## OneLCUM                     11.783      4.405   2.675   0.0081 ** 
## StudentSuccessInitiativeY    4.853      3.115   1.558   0.1208    
## BarPrepMentorY               3.390      2.368   1.431   0.1539    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.45 on 198 degrees of freedom
## Multiple R-squared:  0.5693, Adjusted R-squared:  0.5497 
## F-statistic: 29.08 on 9 and 198 DF,  p-value: < 2.2e-16
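
The summary prints the overall p-value only as < 2.2e-16 because R floors very small p-values at machine precision when printing. The exact value can be recovered from the F-statistic and its degrees of freedom. The sketch below does this with the rounded numbers printed in the summary, so the result agrees with the value quoted in the text only up to that rounding; the variable names are illustrative.

```r
# Values read from the printed summary of the forward model.
f_stat <- 29.08   # F-statistic
df_num <- 9       # numerator degrees of freedom (model terms)
df_den <- 198     # denominator (residual) degrees of freedom

# The upper-tail area of the F distribution is the overall p-value.
p_val <- pf(f_stat, df_num, df_den, lower.tail = FALSE)
print(p_val)

# With a fitted lm object, the same three inputs are available from
# summary(fit)$fstatistic, a named vector: "value", "numdf", "dendf".
```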

Step-wise regression in the reverse direction does not require an intercept-only initial model. Instead, the initial model is the full model with every variable and interaction. The process starts with this model, removes parameters one at a time, and keeps a removal only if the AIC value goes down. The process continues until no further removal lowers the AIC.
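
A minimal R sketch of the reverse search follows. As an assumption for illustration, it again uses the built-in mtcars data in place of Mod6_dat and a small set of two-way interactions in place of the report's full model; the object names are hypothetical.

```r
# Stand-in data: built-in mtcars replaces Mod6_dat, and mpg replaces UBE.
dat <- mtcars

# Initial model: four main effects plus all of their two-way interactions.
full_mod <- lm(mpg ~ (wt + hp + disp + qsec)^2, data = dat)

# One round of candidate removals: the AIC that results from dropping each
# eligible term (main effects inside an interaction are not eligible).
print(drop1(full_mod))

# Reverse search: repeat the removal with the lowest AIC until none helps.
bwd_mod <- step(full_mod, direction = "backward", trace = 0)

print(formula(bwd_mod))
```

Because a main effect cannot be dropped while an interaction containing it remains, a reverse search that starts from a very large model can stay large, which is consistent with the number of terms retained in the reverse model here.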

When the reverse step-wise process is complete, the ideal model (shown below) includes the variables LSAT, UGPA, Class, CivPro, LP1, LP2, OneLCUM, FGPA, Accom, Probation, LegalAnalysis, AdvLegalPerf, AdvLegalAnalysis, BarPrep, PctBarPrepComplete, NumPrepWorkshops, and BarPrepMentor, plus the interactions LSAT:UGPA, LSAT:OneLCUM, LSAT:FGPA, UGPA:FGPA, CivPro:LP1, CivPro:LP2, AdvLegalPerf:AdvLegalAnalysis, BarPrep:NumPrepWorkshops, and BarPrep:BarPrepMentor. Interestingly, this model includes many interaction terms that the forward step-wise process did not select.

## UBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 + OneLCUM + FGPA + 
##     Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis + 
##     BarPrep + PctBarPrepComplete + NumPrepWorkshops + BarPrepMentor + 
##     LSAT:UGPA + LSAT:OneLCUM + LSAT:FGPA + UGPA:FGPA + CivPro:LP1 + 
##     CivPro:LP2 + AdvLegalPerf:AdvLegalAnalysis + BarPrep:NumPrepWorkshops + 
##     BarPrep:BarPrepMentor

The model summary (below) indicates the model has an \(R^2\) value of 0.8549 and an overall p-value of \(6.60 \times 10^{-11}\), both of which seem reasonable at first glance. However, 48 coefficients are undefined because of singularities, the adjusted \(R^2\) is only 0.6424, and the significance of most individual variables is questionable. Therefore, before we analyze these results further, let’s see what the final step-wise process (both directions) produces.

## 
## Call:
## lm(formula = UBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 + 
##     OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf + 
##     AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops + 
##     BarPrepMentor + LSAT:UGPA + LSAT:OneLCUM + LSAT:FGPA + UGPA:FGPA + 
##     CivPro:LP1 + CivPro:LP2 + AdvLegalPerf:AdvLegalAnalysis + 
##     BarPrep:NumPrepWorkshops + BarPrep:BarPrepMentor, data = Mod6_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.868  -4.521   0.000   4.149  26.076 
## 
## Coefficients: (48 not defined because of singularities)
##                                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     -3.947e+08  5.898e+08  -0.669 0.505177    
## LSAT.L                          -3.011e+09  4.495e+09  -0.670 0.504812    
## LSAT.Q                          -3.436e+09  5.170e+09  -0.665 0.508125    
## LSAT.C                          -3.363e+09  5.076e+09  -0.662 0.509478    
## LSAT^4                          -2.962e+09  4.545e+09  -0.652 0.516291    
## LSAT^5                          -2.369e+09  3.667e+09  -0.646 0.519989    
## LSAT^6                          -1.741e+09  2.778e+09  -0.627 0.532436    
## LSAT^7                          -1.167e+09  1.894e+09  -0.617 0.539229    
## LSAT^8                          -7.203e+08  1.229e+09  -0.586 0.559508    
## LSAT^9                          -4.023e+08  7.031e+08  -0.572 0.568693    
## LSAT^10                         -2.064e+08  3.885e+08  -0.531 0.596653    
## LSAT^11                         -9.496e+07  1.807e+08  -0.525 0.600654    
## LSAT^12                         -4.073e+07  8.300e+07  -0.491 0.624928    
## LSAT^13                         -1.607e+07  2.930e+07  -0.549 0.584748    
## LSAT^14                         -6.510e+06  1.070e+07  -0.608 0.544562    
## LSAT^15                         -2.720e+06  2.562e+06  -1.062 0.291406    
## LSAT^16                         -1.258e+06  8.367e+05  -1.503 0.136524    
## LSAT^17                         -5.643e+05  3.011e+05  -1.874 0.064352 .  
## LSAT^18                         -2.326e+05  1.268e+05  -1.834 0.070268 .  
## LSAT^19                         -8.020e+04  4.704e+04  -1.705 0.091920 .  
## LSAT^20                         -2.158e+04  1.348e+04  -1.601 0.113205    
## LSAT^21                         -3.938e+03  2.807e+03  -1.403 0.164305    
## UGPA                             1.325e+08  1.527e+08   0.868 0.388018    
## Class2019                       -1.114e+01  2.944e+00  -3.784 0.000288 ***
## CivPro.L                         1.754e+03  4.034e+03   0.435 0.664886    
## CivPro.Q                        -9.275e+02  3.116e+03  -0.298 0.766686    
## CivPro.C                         2.940e+02  1.832e+03   0.161 0.872863    
## CivPro^4                        -8.718e+00  8.255e+02  -0.011 0.991599    
## CivPro^5                        -1.080e+02  1.614e+02  -0.669 0.505101    
## CivPro^6                         1.309e+01  6.816e+00   1.921 0.058126 .  
## LP1.L                            2.371e+03  4.938e+03   0.480 0.632404    
## LP1.Q                           -2.342e+03  3.872e+03  -0.605 0.546909    
## LP1.C                            1.593e+03  2.461e+03   0.647 0.519087    
## LP1^4                           -8.968e+02  1.032e+03  -0.869 0.387397    
## LP1^5                            2.637e+02  3.112e+02   0.847 0.399244    
## LP1^6                           -1.151e+02  1.217e+02  -0.946 0.347086    
## LP1^7                            1.338e+01  3.321e+01   0.403 0.687962    
## LP2.L                           -3.872e+01  7.515e+01  -0.515 0.607722    
## OneLCUM                          1.536e+07  1.888e+07   0.814 0.418192    
## FGPA                            -3.500e+07  3.518e+07  -0.995 0.322745    
## AccomY                          -7.124e+00  4.181e+00  -1.704 0.092069 .  
## ProbationY                      -1.915e+01  1.130e+01  -1.694 0.093886 .  
## LegalAnalysisY                   3.354e+01  1.974e+01   1.700 0.092885 .  
## AdvLegalPerfY                    4.582e+00  6.845e+00   0.669 0.505052    
## AdvLegalAnalysisY                6.107e-01  2.862e+00   0.213 0.831553    
## BarPrepThemis                    1.432e+01  3.807e+00   3.760 0.000313 ***
## PctBarPrepComplete               2.908e+01  1.145e+01   2.540 0.012914 *  
## NumPrepWorkshops1                2.604e+00  5.851e+00   0.445 0.657454    
## NumPrepWorkshops2                1.260e+01  5.793e+00   2.175 0.032454 *  
## NumPrepWorkshops3                4.087e+00  7.943e+00   0.515 0.608225    
## NumPrepWorkshops4                1.671e+01  1.066e+01   1.567 0.120811    
## NumPrepWorkshops5                2.318e+01  6.912e+00   3.354 0.001195 ** 
## BarPrepMentorY                  -1.095e+01  5.318e+00  -2.058 0.042653 *  
## LSAT.L:UGPA                      1.007e+09  1.169e+09   0.861 0.391591    
## LSAT.Q:UGPA                      1.142e+09  1.347e+09   0.848 0.398635    
## LSAT.C:UGPA                      1.107e+09  1.334e+09   0.830 0.408820    
## LSAT^4:UGPA                      9.636e+08  1.199e+09   0.803 0.423959    
## LSAT^5:UGPA                      7.585e+08  9.789e+08   0.775 0.440603    
## LSAT^6:UGPA                      5.479e+08  7.464e+08   0.734 0.464948    
## LSAT^7:UGPA                      3.595e+08  5.156e+08   0.697 0.487631    
## LSAT^8:UGPA                      2.166e+08  3.369e+08   0.643 0.522120    
## LSAT^9:UGPA                      1.172e+08  1.951e+08   0.601 0.549628    
## LSAT^10:UGPA                     5.777e+07  1.081e+08   0.534 0.594574    
## LSAT^11:UGPA                     2.490e+07  5.068e+07   0.491 0.624522    
## LSAT^12:UGPA                     9.596e+06  2.315e+07   0.415 0.679526    
## LSAT^13:UGPA                     3.042e+06  8.103e+06   0.375 0.708310    
## LSAT^14:UGPA                     8.323e+05  2.878e+06   0.289 0.773173    
## LSAT^15:UGPA                     1.555e+05  5.994e+05   0.259 0.795912    
## LSAT^16:UGPA                     2.298e+04  1.452e+05   0.158 0.874605    
## LSAT^17:UGPA                            NA         NA      NA       NA    
## LSAT^18:UGPA                            NA         NA      NA       NA    
## LSAT^19:UGPA                            NA         NA      NA       NA    
## LSAT^20:UGPA                            NA         NA      NA       NA    
## LSAT^21:UGPA                            NA         NA      NA       NA    
## LSAT.L:OneLCUM                   1.140e+08  1.365e+08   0.835 0.405878    
## LSAT.Q:OneLCUM                   1.258e+08  1.552e+08   0.811 0.419814    
## LSAT.C:OneLCUM                   1.148e+08  1.379e+08   0.833 0.407378    
## LSAT^4:OneLCUM                   9.358e+07  1.165e+08   0.803 0.423990    
## LSAT^5:OneLCUM                   6.628e+07  8.011e+07   0.827 0.410350    
## LSAT^6:OneLCUM                   4.272e+07  5.410e+07   0.790 0.431912    
## LSAT^7:OneLCUM                   2.361e+07  2.885e+07   0.819 0.415384    
## LSAT^8:OneLCUM                   1.183e+07  1.546e+07   0.765 0.446551    
## LSAT^9:OneLCUM                   4.805e+06  5.977e+06   0.804 0.423694    
## LSAT^10:OneLCUM                  1.738e+06  2.429e+06   0.716 0.476215    
## LSAT^11:OneLCUM                  4.246e+05  5.447e+05   0.779 0.437906    
## LSAT^12:OneLCUM                  8.948e+04  1.500e+05   0.597 0.552346    
## LSAT^13:OneLCUM                         NA         NA      NA       NA    
## LSAT^14:OneLCUM                         NA         NA      NA       NA    
## LSAT^15:OneLCUM                         NA         NA      NA       NA    
## LSAT^16:OneLCUM                         NA         NA      NA       NA    
## LSAT^17:OneLCUM                         NA         NA      NA       NA    
## LSAT^18:OneLCUM                         NA         NA      NA       NA    
## LSAT^19:OneLCUM                         NA         NA      NA       NA    
## LSAT^20:OneLCUM                         NA         NA      NA       NA    
## LSAT^21:OneLCUM                         NA         NA      NA       NA    
## LSAT.L:FGPA                     -2.604e+08  2.534e+08  -1.028 0.307017    
## LSAT.Q:FGPA                     -2.864e+08  2.894e+08  -0.990 0.325089    
## LSAT.C:FGPA                     -2.619e+08  2.561e+08  -1.023 0.309362    
## LSAT^4:FGPA                     -2.127e+08  2.175e+08  -0.978 0.331094    
## LSAT^5:FGPA                     -1.509e+08  1.490e+08  -1.013 0.313983    
## LSAT^6:FGPA                     -9.676e+07  1.013e+08  -0.955 0.342395    
## LSAT^7:FGPA                     -5.359e+07  5.376e+07  -0.997 0.321733    
## LSAT^8:FGPA                     -2.663e+07  2.911e+07  -0.915 0.363038    
## LSAT^9:FGPA                     -1.085e+07  1.117e+07  -0.971 0.334354    
## LSAT^10:FGPA                    -3.867e+06  4.614e+06  -0.838 0.404335    
## LSAT^11:FGPA                    -9.510e+05  1.024e+06  -0.929 0.355589    
## LSAT^12:FGPA                    -1.926e+05  2.905e+05  -0.663 0.509240    
## LSAT^13:FGPA                            NA         NA      NA       NA    
## LSAT^14:FGPA                            NA         NA      NA       NA    
## LSAT^15:FGPA                            NA         NA      NA       NA    
## LSAT^16:FGPA                            NA         NA      NA       NA    
## LSAT^17:FGPA                            NA         NA      NA       NA    
## LSAT^18:FGPA                            NA         NA      NA       NA    
## LSAT^19:FGPA                            NA         NA      NA       NA    
## LSAT^20:FGPA                            NA         NA      NA       NA    
## LSAT^21:FGPA                            NA         NA      NA       NA    
## UGPA:FGPA                        1.815e+01  1.524e+01   1.191 0.237085    
## CivPro.L:LP1.L                  -5.972e+03  1.721e+04  -0.347 0.729509    
## CivPro.Q:LP1.L                   4.089e+03  1.295e+04   0.316 0.752947    
## CivPro.C:LP1.L                  -8.834e+02  7.981e+03  -0.111 0.912130    
## CivPro^4:LP1.L                   5.797e+02  3.486e+03   0.166 0.868336    
## CivPro^5:LP1.L                   2.043e+02  3.001e+02   0.681 0.497890    
## CivPro^6:LP1.L                          NA         NA      NA       NA    
## CivPro.L:LP1.Q                   6.152e+03  1.272e+04   0.484 0.629874    
## CivPro.Q:LP1.Q                  -3.705e+03  9.630e+03  -0.385 0.701396    
## CivPro.C:LP1.Q                   7.839e+02  5.633e+03   0.139 0.889662    
## CivPro^4:LP1.Q                  -3.983e+02  2.465e+03  -0.162 0.872026    
## CivPro^5:LP1.Q                  -2.041e+02  4.092e+02  -0.499 0.619285    
## CivPro^6:LP1.Q                          NA         NA      NA       NA    
## CivPro.L:LP1.C                  -4.491e+03  7.188e+03  -0.625 0.533753    
## CivPro.Q:LP1.C                   2.494e+03  5.468e+03   0.456 0.649412    
## CivPro.C:LP1.C                  -9.096e+02  3.415e+03  -0.266 0.790627    
## CivPro^4:LP1.C                   5.390e+02  1.493e+03   0.361 0.718994    
## CivPro^5:LP1.C                          NA         NA      NA       NA    
## CivPro^6:LP1.C                          NA         NA      NA       NA    
## CivPro.L:LP1^4                   2.647e+03  2.726e+03   0.971 0.334425    
## CivPro.Q:LP1^4                  -1.080e+03  1.711e+03  -0.631 0.529606    
## CivPro.C:LP1^4                   4.038e+02  9.485e+02   0.426 0.671387    
## CivPro^4:LP1^4                  -1.822e+02  3.981e+02  -0.458 0.648351    
## CivPro^5:LP1^4                          NA         NA      NA       NA    
## CivPro^6:LP1^4                          NA         NA      NA       NA    
## CivPro.L:LP1^5                  -7.338e+02  5.971e+02  -1.229 0.222535    
## CivPro.Q:LP1^5                          NA         NA      NA       NA    
## CivPro.C:LP1^5                          NA         NA      NA       NA    
## CivPro^4:LP1^5                          NA         NA      NA       NA    
## CivPro^5:LP1^5                          NA         NA      NA       NA    
## CivPro^6:LP1^5                          NA         NA      NA       NA    
## CivPro.L:LP1^6                   3.023e+02  2.056e+02   1.471 0.145094    
## CivPro.Q:LP1^6                          NA         NA      NA       NA    
## CivPro.C:LP1^6                          NA         NA      NA       NA    
## CivPro^4:LP1^6                          NA         NA      NA       NA    
## CivPro^5:LP1^6                          NA         NA      NA       NA    
## CivPro^6:LP1^6                          NA         NA      NA       NA    
## CivPro.L:LP1^7                          NA         NA      NA       NA    
## CivPro.Q:LP1^7                          NA         NA      NA       NA    
## CivPro.C:LP1^7                          NA         NA      NA       NA    
## CivPro^4:LP1^7                          NA         NA      NA       NA    
## CivPro^5:LP1^7                          NA         NA      NA       NA    
## CivPro^6:LP1^7                          NA         NA      NA       NA    
## CivPro.L:LP2.L                   6.373e+02  4.543e+02   1.403 0.164383    
## CivPro.Q:LP2.L                  -1.147e+02  1.932e+02  -0.594 0.554180    
## CivPro.C:LP2.L                   3.821e+02  2.253e+02   1.696 0.093587 .  
## CivPro^4:LP2.L                          NA         NA      NA       NA    
## CivPro^5:LP2.L                          NA         NA      NA       NA    
## CivPro^6:LP2.L                          NA         NA      NA       NA    
## AdvLegalPerfY:AdvLegalAnalysisY  1.250e+01  9.702e+00   1.288 0.201218    
## BarPrepThemis:NumPrepWorkshops1 -1.195e+01  8.406e+00  -1.422 0.158843    
## BarPrepThemis:NumPrepWorkshops2 -2.021e+01  8.314e+00  -2.430 0.017216 *  
## BarPrepThemis:NumPrepWorkshops3 -9.667e+00  9.938e+00  -0.973 0.333476    
## BarPrepThemis:NumPrepWorkshops4 -2.609e+01  1.397e+01  -1.867 0.065343 .  
## BarPrepThemis:NumPrepWorkshops5 -3.171e+01  8.909e+00  -3.560 0.000614 ***
## BarPrepThemis:BarPrepMentorY     1.907e+01  6.354e+00   3.002 0.003536 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.99 on 84 degrees of freedom
## Multiple R-squared:  0.8549, Adjusted R-squared:  0.6424 
## F-statistic: 4.023 on 123 and 84 DF,  p-value: 6.602e-11

Step-wise regression in both directions again requires an initial model. Here, the initial model is the regression of UBE on all first-order, single-factor variables. The process starts with this model, adds or removes parameters one at a time, and keeps a change only if the AIC value goes down. The process continues until no single addition or removal lowers the AIC or the full scope of the model has been reached. As in the forward process, the full scope is the full linear model just created.
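
The search in both directions can be sketched in R as follows. As before, this is an assumption-laden illustration: mtcars stands in for Mod6_dat, the four predictors and their two-way interactions stand in for the report's scope, and the object names are hypothetical.

```r
# Stand-in data: built-in mtcars replaces Mod6_dat, and mpg replaces UBE.
dat <- mtcars

# Initial model: first-order, single-factor terms only.
main_mod <- lm(mpg ~ wt + hp + disp + qsec, data = dat)

# Search in both directions: at each step, try every eligible addition
# (up to the two-way interactions) and every eligible removal, then make
# the single change that lowers the AIC the most.
both_mod <- step(main_mod,
                 scope = list(lower = ~ 1,
                              upper = ~ (wt + hp + disp + qsec)^2),
                 direction = "both",
                 trace = 0)

print(formula(both_mod))
```

Starting from the main-effects model lets the search both discard weak single-factor terms and admit interactions, which is why its result can differ from the forward and reverse results.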

When the step-wise process in both directions is complete, the ideal model (shown below) includes the variables Class, LP2, OneLCUM, FGPA, Probation, AdvLegalPerf, AdvLegalAnalysis, BarPrep, PctBarPrepComplete, StudentSuccessInitiative, and BarPrepMentor, plus the interactions Probation:StudentSuccessInitiative, AdvLegalPerf:AdvLegalAnalysis, and BarPrep:BarPrepMentor. Unlike the reverse model, it drops LSAT, UGPA, CivPro, LP1, Accom, LegalAnalysis, and NumPrepWorkshops along with their interactions. Unlike the forward model, it adds Probation and AdvLegalAnalysis and retains three interaction terms.

## UBE ~ Class + LP2 + OneLCUM + FGPA + Probation + AdvLegalPerf + 
##     AdvLegalAnalysis + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + 
##     BarPrepMentor + Probation:StudentSuccessInitiative + AdvLegalPerf:AdvLegalAnalysis + 
##     BarPrep:BarPrepMentor

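The model summary (below) indicates the model has an \(R^2\) value of 0.5933 and an overall p-value of \(1.26 \times 10^{-30}\), both of which seem reasonable. But, again, several individual terms are not significant, including the main effects AdvLegalPerfY, AdvLegalAnalysisY, and BarPrepMentorY and the interaction BarPrepThemis:BarPrepMentorY.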

## 
## Call:
## lm(formula = UBE ~ Class + LP2 + OneLCUM + FGPA + Probation + 
##     AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete + 
##     StudentSuccessInitiative + BarPrepMentor + Probation:StudentSuccessInitiative + 
##     AdvLegalPerf:AdvLegalAnalysis + BarPrep:BarPrepMentor, data = Mod6_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -50.223  -8.001  -0.243   8.653  37.460 
## 
## Coefficients:
##                                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                           95.8183    17.9123   5.349 2.48e-07 ***
## Class2019                             -6.8763     2.0338  -3.381 0.000874 ***
## LP2.L                                 10.4308     5.1853   2.012 0.045652 *  
## OneLCUM                               12.0597     4.5770   2.635 0.009101 ** 
## FGPA                                  37.0624     7.0199   5.280 3.47e-07 ***
## ProbationY                            14.0541     7.3331   1.917 0.056774 .  
## AdvLegalPerfY                         -1.0857     6.1793  -0.176 0.860712    
## AdvLegalAnalysisY                     -1.6602     1.9674  -0.844 0.399814    
## BarPrepThemis                          8.1448     2.2690   3.590 0.000420 ***
## PctBarPrepComplete                    44.0560     7.5749   5.816 2.47e-08 ***
## StudentSuccessInitiativeY              6.6866     3.3728   1.983 0.048842 *  
## BarPrepMentorY                         0.5571     3.5731   0.156 0.876253    
## ProbationY:StudentSuccessInitiativeY -17.6488     8.1655  -2.161 0.031897 *  
## AdvLegalPerfY:AdvLegalAnalysisY       17.1852     7.8950   2.177 0.030716 *  
## BarPrepThemis:BarPrepMentorY           6.4183     4.6498   1.380 0.169083    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.24 on 193 degrees of freedom
## Multiple R-squared:  0.5933, Adjusted R-squared:  0.5637 
## F-statistic: 20.11 on 14 and 193 DF,  p-value: < 2.2e-16
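The printed summary floors the overall p-value at "< 2.2e-16", while the text quotes an exact figure. One way to recover such a figure is to evaluate the upper tail of the F distribution directly. The sketch below does this with the rounded F-statistic and degrees of freedom printed above, so it only approximates the value quoted in the text. The helper name `overall_p` is hypothetical, and the mtcars fit is for illustration only.

```r
# Helper (hypothetical name): exact overall F-test p-value for a fitted lm.
overall_p <- function(fit) {
  f <- summary(fit)$fstatistic            # c(value, numdf, dendf)
  unname(pf(f[1], f[2], f[3], lower.tail = FALSE))
}

# Using the rounded values printed in the summary above.
p_from_summary <- pf(20.11, 14, 193, lower.tail = FALSE)
p_from_summary                            # a very small positive number
p_from_summary < .Machine$double.eps      # TRUE: below the 2.2e-16 print floor

# Same helper on a built-in dataset, for illustration only.
demo_fit <- lm(mpg ~ wt, data = mtcars)
overall_p(demo_fit)
```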

Table 3 summarizes the \(R^2\) and p-values from the three step-wise processes. The “both” direction process yields a slightly better model overall. Its \(R^2\) is well below that of the reverse process (0.5933 versus 0.8549), but the reverse model needs 123 coefficients to get there and its adjusted \(R^2\) falls to 0.6424. The “both” model has a far smaller overall p-value than the reverse model while using only 14 coefficients, and it edges out the forward model on \(R^2\).

Table 3 - Summary of Step-wise Results

Model | \(R^2\) | p-value
Step-wise Forward | 0.5693 | \(7.8149028 \times 10^{-32}\)
Step-wise Reverse | 0.8549 | \(6.6017219 \times 10^{-11}\)
Step-wise Both | 0.5933 | \(1.2600333 \times 10^{-30}\)
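Raw \(R^2\) always rises as terms are added, so comparing models of very different sizes is easier with adjusted \(R^2\). The sketch below recomputes it from the rounded \(R^2\) values and degrees of freedom printed in the reverse and both-direction summaries. The sample size of 208 is an assumption inferred from those summaries (residual degrees of freedom plus estimated coefficients), and the function name is hypothetical.

```r
# Adjusted R^2 from R^2, sample size n, and number of slope coefficients p.
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

n_obs <- 208   # assumption: residual df + estimated coefficients in the summaries

adj_reverse <- adj_r2(0.8549, n_obs, 123)   # reverse model: 123 slope coefficients
adj_both    <- adj_r2(0.5933, n_obs, 14)    # both-direction model: 14 slope coefficients

round(c(reverse = adj_reverse, both = adj_both), 4)
```

The gap between the two models shrinks considerably once the number of coefficients is taken into account.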

Upon review of the “Step-wise Both” model summary, there is no reason to keep the BarPrepMentor variable (p = 0.876) or its interaction with BarPrep (p = 0.169), since neither is significant. The new final model is given as follows:

CMP_Final_mod<-lm(UBE ~ Class + LP2 + OneLCUM + FGPA + Probation + AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + Probation:StudentSuccessInitiative + AdvLegalPerf:AdvLegalAnalysis, data = Mod6_dat)
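Dropping two terms at once can also be checked with a partial F-test on the nested models. The sketch below approximates that test from the rounded \(R^2\) values and residual degrees of freedom printed in the two model summaries, so the result is indicative only. It assumes both models were fitted to the same rows of Mod6_dat.

```r
# Values taken from the printed summaries (rounded).
r2_full <- 0.5933   # step-wise "both" model
df_full <- 193      # residual degrees of freedom of that model
r2_red  <- 0.5824   # model without BarPrepMentor and BarPrep:BarPrepMentor
q       <- 2        # number of coefficients dropped

# Partial F statistic for the dropped terms and its upper-tail p-value.
f_stat <- ((r2_full - r2_red) / q) / ((1 - r2_full) / df_full)
p_val  <- pf(f_stat, q, df_full, lower.tail = FALSE)
c(F = f_stat, p = p_val)

# With both fitted objects in memory, anova(reduced_fit, full_fit)
# performs the same comparison without rounding error.
```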

The model summary (below) indicates the final accepted model has an \(R^2\) value of 0.5824 and a p-value of \(6.28 \times 10^{-31}\), which both seem reasonable. Dropping the two mentor terms lowers \(R^2\) by only about 0.011.

## 
## Call:
## lm(formula = UBE ~ Class + LP2 + OneLCUM + FGPA + Probation + 
##     AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete + 
##     StudentSuccessInitiative + Probation:StudentSuccessInitiative + 
##     AdvLegalPerf:AdvLegalAnalysis, data = Mod6_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -51.992  -8.195  -0.643   8.072  36.122 
## 
## Coefficients:
##                                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                           96.3054    18.0296   5.342 2.55e-07 ***
## Class2019                             -6.3576     2.0330  -3.127  0.00204 ** 
## LP2.L                                 12.0778     5.1672   2.337  0.02043 *  
## OneLCUM                               11.7226     4.5579   2.572  0.01086 *  
## FGPA                                  36.8650     7.0174   5.253 3.89e-07 ***
## ProbationY                            12.5072     7.3545   1.701  0.09061 .  
## AdvLegalPerfY                         -0.0691     6.2079  -0.011  0.99113    
## AdvLegalAnalysisY                     -1.3944     1.9769  -0.705  0.48142    
## BarPrepThemis                          9.9452     2.0514   4.848 2.54e-06 ***
## PctBarPrepComplete                    46.5127     7.5272   6.179 3.69e-09 ***
## StudentSuccessInitiativeY              6.2454     3.3920   1.841  0.06711 .  
## ProbationY:StudentSuccessInitiativeY -16.8127     8.2207  -2.045  0.04218 *  
## AdvLegalPerfY:AdvLegalAnalysisY       15.3319     7.9003   1.941  0.05374 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 13.35 on 195 degrees of freedom
## Multiple R-squared:  0.5824, Adjusted R-squared:  0.5567 
## F-statistic: 22.66 on 12 and 195 DF,  p-value: < 2.2e-16

Results

The results are not conclusive. The initial models were promising in that the Law School at Major University tracks the right data to predict student success in the first year and across the program. However, the models relating this data to performance on the bar exam were less helpful.

The final model, selected from a regression of UBE against nearly every available variable, explains only about 58% of the variance in UBE scores (\(R^2\) = 0.5824). This could be improved.

Conclusion & Discussion

Changes to the types and amounts of data collected would improve a future study of this question. For instance, more pertinent coursework should be included in the data. This coursework data should include the grades received, not just whether the class was taken. It would also be essential to look at how many times a course was repeated before a passing grade was earned. Regarding the students’ preparation for the bar exam, more data could be collected on the time spent preparing (inside and outside of workshops and prep courses).

Additionally, more data would be helpful. With only 204 rows of usable data, very little can be drawn conclusively. Continued collection of data over future semesters could help remedy this. And, since the years in this study were affected by the COVID-19 pandemic, it would be interesting to see what “normal” data look like in a few years.

With more and better data, a future study of this kind would prove more useful.

Resources

[1] Americanbar.org. [Online]. Available: https://www.americanbar.org/groups/legal_education/resources/bar_admissions/basic_overview/. [Accessed: 14-Apr-2023].

[2] “What is the bar exam?,” Kaplan Test Prep. [Online]. Available: https://www.kaptest.com/bar-exam/what-is-the-bar-exam. [Accessed: 14-Apr-2023].

[3] “Minimum scores,” NCBE. [Online]. Available: https://www.ncbex.org/exams/ube/score-portability/minimum-scores/. [Accessed: 14-Apr-2023].

[4] “Texas bar exam information for [year]: Format, dates, Statistics,” AmeriBar Bar Review, 16-Jul-2021. [Online]. Available: https://ameribar.com/texas-bar-exam/#:~:text=Scoring%2FGrading%20and%20Results,%25%2C%20and%20MPT%2020%25. [Accessed: 20-Apr-2023].

[5] “Essential facts of the ube,” 7Sage bar. [Online]. Available: https://7sage.com/bar/essential-facts-of-the-ube/. [Accessed: 20-Apr-2023].

[6] K. W. Boyd, “Bar exam scoring: How is the bar exam scored?,” Accounting Institute of Success - CPA Exam Prep, 13-Jan-2023. [Online]. Available: https://www.ais-cpa.com/bar-exam-scoring/#ftoc-heading-4. [Accessed: 20-Apr-2023].

[7] T. N. Bond, “How can we measure achievement with tests? problems of scales and measurement error,” How can we measure achievement with tests? Problems of scales and measurement error - Purdue Business, 16-Apr-2018. [Online]. Available: https://business.purdue.edu/centers/purce/blogs/2018-4-16.php#:~:text=The%20first%20problem%20is%20that,better%20scores%20than%20lower%20numbers. [Accessed: 20-Apr-2023].

[8] “Predicting value from another measured variable,” TwoVariables - StatsWiki. [Online]. Available: https://wiki.bcs.rochester.edu/StatsWiki/TwoVariables. [Accessed: 20-Apr-2023].

[9] “MPRE scores by State 2023,” Wisevoter, 26-Mar-2023. [Online]. Available: https://wisevoter.com/state-rankings/mpre-scores-by-state/. [Accessed: 24-Apr-2023].

[10] “Research guides: Bar exam resources: MPT (multi-state performance test),” MPT (Multi-State Performance Test) - Bar Exam Resources - Research Guides at University of Cincinnati. [Online]. Available: https://guides.libraries.uc.edu/c.php?g=222391&p=4488058#:~:text=on%20this%20Guide-,Multistate%20Performance%20Test%20(MPT),a%20lawyering%20task%2C%20and%20communication. [Accessed: 24-Apr-2023].

[11] “Multistate essay examination,” NCBE. [Online]. Available: https://www.ncbex.org/exams/mee/#:~:text=The%20purpose%20of%20the%20MEE,organized%20composition%3B%20and%20(4). [Accessed: 24-Apr-2023].

[12] JD Advising, “Topic 10: Mee grading & scoring: What you need to know,” JD Advising, 20-Oct-2020. [Online]. Available: https://jdadvising.com/mee-grading-scoring-what-you-need-to-know/#:~:text=Scaling%20of%20MEE%20scores&text=Most%20MEE%20jurisdictions%20use%20a,a%2020%E2%80%9380%20scale). [Accessed: 24-Apr-2023].

[13] “Multistate Bar Examination,” NCBE. [Online]. Available: https://www.ncbex.org/exams/mbe/#:~:text=The%20purpose%20of%20the%20MBE,to%20analyze%20given%20fact%20patterns. [Accessed: 24-Apr-2023].

[14] J. Fox and G. Monette, “Generalized Collinearity Diagnostics,” Journal of the American Statistical Association, vol. 87, no. 417, pp. 178–183, Mar. 1992.

Full Code

knitr::opts_chunk$set(echo = TRUE)
options(na.action = "na.fail")   #  prevent fitting models to different datasets
library(openxlsx)
library(tibble)
library(knitr)
library(kableExtra)
library(flextable)
library(car)
library(stringr)
library(dplyr)
library(MASS)
library(MuMIn)
library(ggplot2)
library(lattice)
library(caret)
library(broom)
set_flextable_defaults(fonts_ignore = TRUE)

# Read in Data Files by sheet
Pass_21<-read.xlsx("https://raw.githubusercontent.com/c-penick/ProjectFile/main/BarDataSet.xlsx", sheet = 4)
Pass_22<-read.xlsx("https://raw.githubusercontent.com/c-penick/ProjectFile/main/BarDataSet.xlsx", sheet = 2)
Fail_21<-read.xlsx("https://raw.githubusercontent.com/c-penick/ProjectFile/main/BarDataSet.xlsx", sheet = 3)
Fail_22<-read.xlsx("https://raw.githubusercontent.com/c-penick/ProjectFile/main/BarDataSet.xlsx", sheet = 1)

# Data wrangling - treatment of data discussed in Understanding the Data and Variables section
All_Data<-rbind(Pass_21, Pass_22, Fail_21, Fail_22)
All_Data$MPRE <- as.numeric(All_Data$MPRE)
Orderinal_array <- c("LP2","CivPro","LP1","LSAT")
Nominal_array <- c("NumPrepWorkshops","Class","Accom","Probation","LegalAnalysis","AdvLegalPerf","AdvLegalAnalysis","BarPrep","BarPrepMentor",
                  "StudentSuccessInitiative","PASS")

All_Data$LP1<-as.character(All_Data$LP1)
All_Data$LP1[All_Data$LP1=="B "]<-"B"
All_Data$LP1[All_Data$LP1=="D "]<-"D"
All_Data$LP1<-as.ordered(All_Data$LP1)
All_Data$LP1<-ordered(All_Data$LP1, levels = c("F", "D", "D+", "C", "C+", "B", "B+", "A"))

Raw_All_Data_LP2<-All_Data$LP2
Raw_All_Data_LP2<-as.character(Raw_All_Data_LP2)
Raw_All_Data_LP2[Raw_All_Data_LP2=="CR"]<-"C"
Raw_All_Data_LP2<-as.ordered(Raw_All_Data_LP2)
Raw_All_Data_LP2<-ordered(Raw_All_Data_LP2, levels = c("F", "D", "D+", "C", "C+", "B", "B+", "A"))
All_Data$LP2[All_Data$LP2=="A"]<-"Pass"
All_Data$LP2[All_Data$LP2=="B+"]<-"Pass"
All_Data$LP2[All_Data$LP2=="B"]<-"Pass"
All_Data$LP2[All_Data$LP2=="C+"]<-"Pass"
All_Data$LP2[All_Data$LP2=="C"]<-"Pass"
All_Data$LP2[All_Data$LP2=="CR"]<-"Pass"
All_Data$LP2[All_Data$LP2=="D+"]<-"Fail"
All_Data$LP2[All_Data$LP2=="D"]<-"Fail"
All_Data$LP2[All_Data$LP2=="F"]<-"Fail"
All_Data$LP2<-as.ordered(All_Data$LP2)
#All_Data$LP2<-ordered(All_Data$LP2, levels = c("F", "D", "D+", "C", "C+", "B", "B+", "A"))
All_Data$LP2<-ordered(All_Data$LP2, levels = c("Pass", "Fail"))

All_Data$CivPro
All_Data$CivPro<-as.ordered(All_Data$CivPro)
levels(All_Data$CivPro)
All_Data$CivPro<-ordered(All_Data$CivPro, levels = c("F", "D", "D+", "C", "C+", "B", "B+", "A"))

All_Data$LSAT<-as.ordered(All_Data$LSAT)

  for (i in Nominal_array) {
    All_Data[,i] <- factor(All_Data[,i])
  } 

All_Data$BarPrep<-as.character(All_Data$BarPrep)
All_Data$BarPrep<-str_replace_na(All_Data$BarPrep, replacement = "unknown")
All_Data$BarPrep<-as.factor(All_Data$BarPrep)

All_Data$MPRE<-as.numeric(All_Data$MPRE)
All_Data$MPRE[is.na(All_Data$MPRE)]<-0

All_Data$MPT<-as.ordered(All_Data$MPT)
All_Data$MPT<-ordered(All_Data$MPT, levels = c(1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0))

Raw_All_Data_MEE<-All_Data$MEE
All_Data$MEE<-as.ordered(All_Data$MEE)
All_Data$MEE<-ordered(All_Data$MEE, levels = c(1.0000 , 1.1667 , 1.3333 , 1.5000 , 1.6667 , 1.8333 , 2.0000 , 2.1667 , 2.3333 , 2.5000 , 2.6667 , 2.8333 , 3.0000 , 3.1667 , 3.3333 , 3.5000 , 3.6667 , 3.8333 , 4.0000 , 4.1667 , 4.3333 , 4.5000 , 4.6667 , 4.8333 , 5.0000 , 5.1667 , 5.3333 , 5.5000 , 5.6667 , 5.8333 , 6.000))

#inital_model<-lm(data = All_Data,UBE~.)
#summary(inital_model)
#VIF_check<- data.frame(vif(inital_model))
#VIF_check$GVIF_Standardized <- VIF_check$GVIF..1..2.Df..^2

min_UBE<-c(260, 264, 266, 268, 270, 272, 273)
Jurisdiction<-c("Alabama, Minnesota, Missouri, New Mexico, North Dakota", "Indiana, Oklahoma", "Connecticut, D.C., Illinois, Iowa, Kansas, Kentucky, Maryland, Montana, New Jersey, New York, South Carolina, Virgin Islands", "Michigan", "Alaska, Arkansas, Colorado, Maine, Massachusetts, Nebraska, New Hampshire, North Carolina, Ohio, Oregon, Rhode Island, Tennessee, Texas, Utah, Vermont, Washington, West Virginia, Wyoming", "Idaho, Pennsylvania", "Arizona")
T1_data<-rbind(min_UBE, Jurisdiction)
df_T1<-data.frame(min_UBE, Jurisdiction)
colnames(df_T1)<-c('Minimum UBE Score', 'Jurisdiction')
ft<-flextable(data=df_T1, col_keys = c("Minimum UBE Score", "Jurisdiction"))
ft<-width(ft, j = 1, width = 2, unit = "in")
ft<-width(ft, j = 2, width = 4, unit = "in")
ft<-align_nottext_col(ft, align = "center", header = TRUE, footer = TRUE)
ft<-padding(ft, padding.left = 25, padding.right = 25, part = "header")
ft

slices<-c(20, 30, 50)
lbls<-c("MPT", "MEE", "MBE")
pct<-round(slices/sum(slices)*100)
lbls<-paste(lbls, pct)
lbls<-paste(lbls, "%", sep = " ")
pie(slices, labels=lbls, col=rainbow(length(lbls)), main = "Scoring & Weighting of Texas Bar Exam")

Misc<-c("Class", "CivPro", "LP1", "LP2", "BarPrep", "", "", "", "", "", "")
YN<-c("Probation", "LegalAnalysis", "AdvLegalPerf", "AdvLegalAnalysis", "StudentSuccessInitiative", "BarPrepMentor", "Accom", "PASS", "", "", "")
Num_att<-c("MPRE", "MPT", "MEE", "MBE", "PctBarPrepComplete", "NumPrepWorkshops", "FGPA", "LSAT", "UGPA", "OneLCUM", "UBE")
T2_data<-rbind(Misc, YN, Num_att)
df_T2<-data.frame(Misc, YN, Num_att)
colnames(df_T2)<-c('Misc. Attributes', 'Yes/No Attributes', 'Numerical Attributes')
ft2<-flextable(data=df_T2, col_keys = c("Misc. Attributes", "Yes/No Attributes", "Numerical Attributes"))
ft2<-width(ft2, j = 1, width = 2, unit = "in")
ft2<-width(ft2, j = 2, width = 2, unit = "in")
ft2<-width(ft2, j = 3, width = 2, unit = "in")
ft2<-align_nottext_col(ft2, align = "center", header = TRUE, footer = TRUE)
ft2<-padding(ft2, padding.left = 0, padding.right = 0, part = "header")
ft2

# Keeping the right students
# CMP first model OneLCum ~ LSAT, UGPA, CivPro, LP1, LP2
# Treating LP2 as CR = C
CMP_Mod1_dat<-All_Data
CMP_Mod1_dat<-CMP_Mod1_dat[,!names(CMP_Mod1_dat) %in% c("Class", "LP2", "FGPA", "Accom", "Probation", "LegalAnalysis", "AdvLegalPerf", "AdvLegalAnalysis", "BarPrep", "PctBarPrepComplete", "NumPrepWorkshops", "StudentSuccessInitiative", "BarPrepMentor", "MPRE", "MPT", "MEE", "MBE", "UBE", "PASS")]
#Raw_All_Data_LP2[Raw_All_Data_LP2=="CR"]<-"C"
#Raw_All_Data_LP2<-ordered(Raw_All_Data_LP2, levels = c("F", "D", "D+", "C", "C+", "B", "B+", #"A"))
CMP_Mod1_dat$LP2<-Raw_All_Data_LP2
CMP_Mod1_dat<-na.omit(CMP_Mod1_dat)
CMP_mod1<-lm(OneLCUM ~ LSAT + UGPA + CivPro + LP1 + LP2, data=CMP_Mod1_dat)
CMP_mod1_r2<-summary(CMP_mod1)$r.squared
CMP_mod1_p<-glance(CMP_mod1)$p.value

summary(CMP_mod1)
dredge(CMP_mod1)

plot(CMP_mod1, which = 1)

plot(CMP_mod1, which = 2)

boxcox(CMP_mod1)

## Improving the MBE
options(na.action = "na.omit") 
MBE_model <- lm(data = All_Data,MBE~CivPro+LP1+LP2+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis)
MBE_VIFs <- vif(MBE_model)
MBE_VIFs<- data.frame(vif(MBE_model))
MBE_VIFs$VIF <- round(MBE_VIFs$GVIF..1..2.Df..^2,2)
MBE_VIFs$Color <- cut(MBE_VIFs$VIF,c(-Inf,5,Inf))
summary(MBE_model)

ggplot(data = MBE_VIFs)+
  geom_col(aes(VIF,reorder(row.names(MBE_VIFs),VIF),fill=Color,))+ geom_text(aes(VIF,row.names(MBE_VIFs),label=VIF,hjust=-.4))+
  xlab("VIF")+ylab("Predictor Variables")+ggtitle("VIF's for the MBE Model")+scale_x_continuous(breaks=seq(0,4),limits = c(0,4))+
  scale_fill_discrete(name="VIF Legend",labels=c("VIF<5","VIF>5"))

plot(MBE_model,which = 1)

outlier_RF <- data.frame(MBE_model$fitted.values) %>% rownames_to_column()

outlier_RF <- outlier_RF[outlier_RF$MBE_model.fitted.values<120,1]
outlier_RF <- as.numeric(outlier_RF)
data_outlier_removal <- All_Data[-outlier_RF,]

MBE_model2 <- lm(data = data_outlier_removal,MBE~CivPro+LP1+LP2+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis)
summary(MBE_model2)

plot(MBE_model2,which = 1)

plot(MBE_model2,which = 2)

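# Flag rows of the second model whose residual is below -30 (these are row names, not positions)
outlier_RF2 <- data.frame(MBE_model2$residuals) %>% rownames_to_column()
outlier_RF2 <- outlier_RF2[outlier_RF2$MBE_model2.residuals< -30,1]
outlier_RF2 <- as.numeric(outlier_RF2)
# The row is then dropped by hard-coded position 204 rather than via outlier_RF2
data_outlier_removal2 <- data_outlier_removal[-204,]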

MBE_model3 <- lm(data = data_outlier_removal2,MBE~CivPro+LP1+LP2+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis)
summary(MBE_model3)

# step(MBE_model3)
MBE_model8 <- lm(data = data_outlier_removal2,MBE~CivPro)
summary(MBE_model8)

plot(MBE_model8, which = 1)

plot(MBE_model8, which = 2)

plot(MBE_model8, which = 5)

PASS_MODEL <- glm(data = All_Data , PASS~BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor+Probation, family = binomial())
summary(PASS_MODEL)

PASS_MODEL2 <- glm(data = All_Data , PASS~BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor, family = binomial())
summary(PASS_MODEL2)

PASS_MODEL3 <- glm(data = All_Data, PASS~PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor, family = binomial())
summary(PASS_MODEL3)

PASS_MODEL4 <- glm(data = All_Data, PASS~StudentSuccessInitiative+PctBarPrepComplete, family = binomial())
summary(PASS_MODEL4)


## Student Success in the Law School Program
# My SECOND model FGPA ~ Probation, Accom, OneLCUM, StudentSuccessInitiative
CMP_Mod2_dat<-All_Data
CMP_Mod2_dat<-CMP_Mod2_dat[,!names(CMP_Mod2_dat) %in% c("LSAT", "UGPA", "Class", "CivPro", "LP1", "LP2", "LegalAnalysis", "AdvLegalAnalysis", "AdvLegalPerf", "BarPrep", "PctBarPrepComplete", "NumPrepWorkshops", "BarPrepMentor", "MPRE", "MPT", "MEE", "MBE", "UBE", "PASS")]
CMP_Mod2_dat<-na.omit(CMP_Mod2_dat)
CMP_mod2<-lm(FGPA~., data = CMP_Mod2_dat)
dummy_variable<-summary(CMP_mod2)
CMP_mod2_r2<-summary(CMP_mod2)$r.squared
CMP_mod2_p<-glance(CMP_mod2)$p.value

options(na.action = "na.fail")
summary(CMP_mod2)
dredge(CMP_mod2)
vif(CMP_mod2)

plot(CMP_mod2, which = 1)

plot(CMP_mod2, which = 2)

boxcox(CMP_mod2)

plot(CMP_mod2, which = 3)

## Improving Essay Results on UBE
# My third model MEE ~ LP1, LP2
CMP_Mod3_dat<-All_Data
CMP_Mod3_dat<-CMP_Mod3_dat[,!names(CMP_Mod3_dat) %in% c("LSAT","UGPA","Class","CivPro", "LP2","OneLCUM", "FGPA", "Accom", "Probation", "LegalAnalysis", "AdvLegalPerf", "AdvLegalAnalysis", "BarPrep", "PctBarPrepComplete", "NumPrepWorkshops", "StudentSuccessInitiative", "BarPrepMentor", "MPRE", "MPT", "MBE", "UBE", "PASS")]
CMP_Mod3_dat$LP2<-Raw_All_Data_LP2
CMP_Mod3_dat$MEE<-Raw_All_Data_MEE
CMP_Mod3_dat<-na.omit(CMP_Mod3_dat)
CMP_mod3<-lm(MEE~., data = CMP_Mod3_dat)
summary(CMP_mod3)
CMP_mod3_r2<-summary(CMP_mod3)$r.squared
CMP_mod3_p<-glance(CMP_mod3)$p.value
dredge(CMP_mod3)
CMP_mod3_2<-lm(MEE~LP1, data = CMP_Mod3_dat)
CMP_mod3_r22<-summary(CMP_mod3_2)$r.squared
CMP_mod3_p2<-glance(CMP_mod3_2)$p.value

summary(CMP_mod3_2)

plot(CMP_mod3_2, which = 1)

plot(CMP_mod3_2, which = 2)

boxcox(CMP_mod3_2)


## Evaluating UBE Against (Nearly) Everything
Mod6_dat<-All_Data
Mod6_dat<-Mod6_dat[,!names(Mod6_dat) %in% c("MPT", "MEE", "MBE", "PASS")]
#str_replace_na(Mod6_dat$BarPrep, replacement = "unknown")
Mod6_dat$MPRE<-as.numeric(Mod6_dat$MPRE)
Mod6_dat$MPRE[is.na(Mod6_dat$MPRE)]<-0
Mod6_dat<-na.omit(Mod6_dat)
CMPmod6lm<-lm(UBE~.+LSAT:UGPA+LSAT:OneLCUM+LSAT:FGPA+UGPA:OneLCUM+UGPA:FGPA+Accom:Probation+Accom:FGPA+Probation:FGPA+StudentSuccessInitiative:FGPA+CivPro:LP1+CivPro:LP2+LP1:LP2+LegalAnalysis:AdvLegalPerf+LegalAnalysis:AdvLegalAnalysis+AdvLegalPerf:AdvLegalAnalysis+BarPrep:PctBarPrepComplete+BarPrep:NumPrepWorkshops+BarPrep:BarPrepMentor+PctBarPrepComplete:NumPrepWorkshops+PctBarPrepComplete:BarPrepMentor+NumPrepWorkshops:BarPrepMentor+Accom:StudentSuccessInitiative+Probation:StudentSuccessInitiative, data = Mod6_dat)
CMP_mod6_summary<-summary(CMPmod6lm)
CMP_mod6_r2<-summary(CMPmod6lm)$r.squared
CMP_mod6_p<-glance(CMPmod6lm)$p.value

summary(CMPmod6lm)

boxcox(CMPmod6lm)

plot(CMPmod6lm, which = 1)

plot(CMPmod6lm, which = 2)

plot(CMPmod6lm, which = 4)

SW_init_mod_lm<-lm(UBE~1, data = Mod6_dat)
SW_CMPmod6lm_fwd<-step(SW_init_mod_lm, scope = ~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor+MPRE+LSAT:UGPA+LSAT:OneLCUM+LSAT:FGPA+UGPA:OneLCUM+UGPA:FGPA+Accom:Probation+CivPro:LP1+CivPro:LP2+LP1:LP2+LegalAnalysis:AdvLegalPerf+LegalAnalysis:AdvLegalAnalysis+AdvLegalPerf:AdvLegalAnalysis+BarPrep:PctBarPrepComplete+BarPrep:NumPrepWorkshops+BarPrep:BarPrepMentor+PctBarPrepComplete:NumPrepWorkshops+PctBarPrepComplete:BarPrepMentor+NumPrepWorkshops:BarPrepMentor+Accom:StudentSuccessInitiative+Probation:StudentSuccessInitiative, direction = "forward", trace = FALSE)
SW_CMPmod6lm_fwd_fin<-lm(UBE~FGPA + PctBarPrepComplete + BarPrep + LP2 + AdvLegalPerf + Class + OneLCUM + StudentSuccessInitiative + BarPrepMentor, data = Mod6_dat)
SW_CMPmod6lm_fwd_fin_r2<-summary(SW_CMPmod6lm_fwd_fin)$r.squared
SW_CMPmod6lm_fwd_fin_p<-glance(SW_CMPmod6lm_fwd_fin)$p.value
formula(SW_CMPmod6lm_fwd)

summary(SW_CMPmod6lm_fwd_fin)

SW_CMPmod6lm_rev<-step(CMPmod6lm, direction = "backward", trace = FALSE)
SW_CMPmod6lm_rev_fin<-lm(UBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 + OneLCUM + FGPA + 
    Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis + 
    BarPrep + PctBarPrepComplete + NumPrepWorkshops + BarPrepMentor + 
    LSAT:UGPA + LSAT:OneLCUM + LSAT:FGPA + UGPA:FGPA + CivPro:LP1 + 
    CivPro:LP2 + AdvLegalPerf:AdvLegalAnalysis + BarPrep:NumPrepWorkshops + 
    BarPrep:BarPrepMentor, data = Mod6_dat)
SW_CMPmod6lm_rev_fin_r2<-summary(SW_CMPmod6lm_rev_fin)$r.squared
SW_CMPmod6lm_rev_fin_p<-glance(SW_CMPmod6lm_rev_fin)$p.value
formula(SW_CMPmod6lm_rev)

summary(SW_CMPmod6lm_rev_fin)

SW_CMPmod6lm_both_init<-lm(UBE~.,data = Mod6_dat)
SW_CMPmod6lm_both<-step(SW_CMPmod6lm_both_init, scope = ~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor+MPRE+LSAT:UGPA+LSAT:OneLCUM+LSAT:FGPA+UGPA:OneLCUM+UGPA:FGPA+Accom:Probation+CivPro:LP1+CivPro:LP2+LP1:LP2+LegalAnalysis:AdvLegalPerf+LegalAnalysis:AdvLegalAnalysis+AdvLegalPerf:AdvLegalAnalysis+BarPrep:PctBarPrepComplete+BarPrep:NumPrepWorkshops+BarPrep:BarPrepMentor+PctBarPrepComplete:NumPrepWorkshops+PctBarPrepComplete:BarPrepMentor+NumPrepWorkshops:BarPrepMentor+Accom:StudentSuccessInitiative+Probation:StudentSuccessInitiative, direction = "both", trace = FALSE)
SW_CMPmod6lm_both_fin<-lm(UBE~Class + LP2 + OneLCUM + FGPA + Probation + AdvLegalPerf + 
    AdvLegalAnalysis + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + 
    BarPrepMentor + Probation:StudentSuccessInitiative + AdvLegalPerf:AdvLegalAnalysis + 
    BarPrep:BarPrepMentor, data = Mod6_dat)
SW_CMPmod6lm_both_fin_r2<-summary(SW_CMPmod6lm_both_fin)$r.squared
SW_CMPmod6lm_both_fin_p<-glance(SW_CMPmod6lm_both_fin)$p.value
formula(SW_CMPmod6lm_both)

summary(SW_CMPmod6lm_both_fin)

CMP_Final_mod<-lm(UBE ~ Class + LP2 + OneLCUM + FGPA + Probation + AdvLegalPerf + AdvLegalAnalysis + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + Probation:StudentSuccessInitiative + AdvLegalPerf:AdvLegalAnalysis, data = Mod6_dat)
CMP_Final_mod_r2<-summary(CMP_Final_mod)$r.squared
CMP_Final_mod_p<-glance(CMP_Final_mod)$p.value

summary(CMP_Final_mod)