This project is for IE5344, Statistical Data Analysis. Our main objective is to find significant predictors of bar exam pass rates, so that the law school can focus on those factors when preparing students for the bar exam and improve its pass rate. We ran a regression analysis on UBE and identified the significant variables from the fitted model. Because UBE depends on the MPT, MEE, and MBE scores, we ran regression analyses on those responses as well and identified the significant predictors from each model. Throughout the process we checked for multicollinearity and outliers, checked model adequacy, applied transformations where required, and compared subset models by AIC.
The data for this analysis concern a bar exam. They include all students in the program who took the exam in 2021 and 2022, both those who passed and those who failed. We need to recommend which variables the school should focus on to increase the bar exam pass rate.
In our regression analysis, “LSAT” (the LSAT exam score), “Accom”, and “NumPrepWorkshops” (the number of workshops attended) turned out to be of little importance. The LSAT score is only an entrance score, so it has little impact on law school exams. “Accom” is not compulsory for students and applies mainly to students with disabilities, so we can set this variable aside from the regression.
Because “OneLCUM” is the cumulative value of CivPro, LP1, and LP2, we can set those three aside and regress on “OneLCUM” directly.
“Probation” is another important variable, because a student’s result may depend on the student’s behaviour, which may affect their focus on study.
“LegalAnalysis”, “AdvLegalPerf”, and “AdvLegalAnalysis” are elective courses. These three may or may not be significant, depending on which response we use in the regression analysis.
“BarPrep” relates to the compulsory bar preparation course, so it will affect our analysis.
“PctBarPrepComplete” is the proportion of the bar preparation course that the student completed. Every law student wants to complete this course, so it is also a main predictor.
“NumPrepWorkshops” is optional; not all students attend the workshops. Including it as a predictor may therefore make a difference in our regression.
“StudentSuccessInitiative” does not apply to all students either. It is intended to benefit students who have received poor marks.
“BarPrepMentor” indicates whether or not the student has a bar preparation mentor.
Below are the responses:
MPRE: Multistate Professional Responsibility Examination
MPT: Multistate Performance Test
MEE: Multistate Essay Examination
MBE: Multistate Bar Examination
UBE: Uniform Bar Exam. A composite score from the MPT, MEE, and MBE examinations.
PASS: Whether or not the student passed the bar exam. The minimum UBE score for bar passage varies slightly from year to year.
Looking at the data, most students passed, but their performance was not uniform across the exams: some did well on some exams and poorly on others. Overall, few students failed (roughly 12% in the cleaned data).
library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.1 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)
library(dplyr)
library(tidyr)
library(purrr)
library(MASS)
##
## Attaching package: 'MASS'
##
## The following object is masked from 'package:dplyr':
##
## select
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
sheet1<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2022Fail")
sheet1
## # A tibble: 18 × 24
## LSAT UGPA Class CivPro LP1 LP2 OneLCUM FGPA Accom Probation
## <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
## 1 154 3.18 2019 C+ C+ C+ 2.47 2.92 N Y
## 2 152 2.56 2019 B C+ C 2.5 3.06 N N
## 3 158 2.78 2019 C B+ CR 3.22 3.14 Y N
## 4 153 3.47 2019 A B+ A 3.5 3.42 N N
## 5 159 2.97 2019 C C C+ 2.28 2.64 N Y
## 6 148 3.59 2019 B A B+ 2.72 3.05 N N
## 7 155 3.45 2019 B D+ CR 2.76 2.99 N N
## 8 152 3.39 2019 C A B 2.65 3.15 N N
## 9 151 3.61 2019 C D C+ 2.58 2.77 N N
## 10 157 3.11 2019 C B+ B 2.67 3.19 N N
## 11 155 3.95 2019 D B B 2.43 2.88 N Y
## 12 151 3.94 2019 C B CR 2.69 3.15 N N
## 13 149 3.41 2019 C B B 2.93 3.27 N N
## 14 153 3.5 2019 C+ C+ C+ 2.67 3.06 N N
## 15 149 4.06 2019 B A A 2.88 3.36 N N
## 16 149 3.6 2019 C+ C+ CR 3.06 3.25 Y N
## 17 157 2.94 2019 B+ B+ CR 3.66 3.20 N N
## 18 153 3.55 2019 B B CR 2.84 2.55 N N
## # ℹ 14 more variables: LegalAnalysis <chr>, AdvLegalPerf <chr>,
## # AdvLegalAnalysis <chr>, BarPrep <chr>, PctBarPrepComplete <chr>,
## # NumPrepWorkshops <dbl>, StudentSuccessInitiative <chr>,
## # BarPrepMentor <chr>, MPRE <chr>, MPT <dbl>, MEE <dbl>, MBE <dbl>,
## # UBE <dbl>, PASS <dbl>
sheet2<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2022Pass")
sheet2
## # A tibble: 89 × 24
## LSAT UGPA Class CivPro LP1 LP2 OneLCUM FGPA Accom Probation
## <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
## 1 155 3.29 2019 C+ C+ CR 3.5 3.09 N N
## 2 153 3.73 2019 B A B+ 3.28 3.40 N N
## 3 154 3.39 2019 C+ B+ CR 3.09 2.99 N N
## 4 155 3.7 2019 C C+ CR 2.84 3.28 N N
## 5 158 2.82 2019 C+ C B+ 3.12 3.45 N N
## 6 156 3.54 2019 B B+ CR 3.21 3.43 N N
## 7 160 3.87 2019 B B CR 3.45 3.36 N N
## 8 157 3.2 2019 A A A 3.62 3.60 N N
## 9 156 3.09 2019 B C CR 3 3.18 N N
## 10 153 3.66 2019 B+ A A 3.62 3.72 N N
## # ℹ 79 more rows
## # ℹ 14 more variables: LegalAnalysis <chr>, AdvLegalPerf <chr>,
## # AdvLegalAnalysis <chr>, BarPrep <chr>, PctBarPrepComplete <chr>,
## # NumPrepWorkshops <dbl>, StudentSuccessInitiative <chr>,
## # BarPrepMentor <chr>, MPRE <chr>, MPT <dbl>, MEE <dbl>, MBE <dbl>,
## # UBE <dbl>, PASS <dbl>
sheet3<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2021Fail")
sheet3
## # A tibble: 9 × 24
## LSAT UGPA Class CivPro LP1 LP2 OneLCUM FGPA Accom Probation
## <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
## 1 152 3.42 2018 B+ A A 3.21 3.29 N N
## 2 155 2.82 2018 B+ B B 2.43 3.20 Y Y
## 3 157 3.46 2018 C B B 2.62 2.91 N N
## 4 156 3.13 2018 D+ C C+ 2.28 2.77 N Y
## 5 145 3.49 2018 C C+ C+ 2.29 2.90 N Y
## 6 154 2.85 2018 B+ F CR 2.54 2.82 N N
## 7 149 3.43 2018 C C B 2.28 3.00 N Y
## 8 160 3.29 2018 C C+ B 2.66 3.09 N Y
## 9 152 3.62 2018 C+ B B 2.60 3.21 N N
## # ℹ 14 more variables: LegalAnalysis <chr>, AdvLegalPerf <chr>,
## # AdvLegalAnalysis <chr>, BarPrep <chr>, PctBarPrepComplete <dbl>,
## # NumPrepWorkshops <dbl>, StudentSuccessInitiative <chr>,
## # BarPrepMentor <chr>, MPRE <chr>, MPT <dbl>, MEE <dbl>, MBE <dbl>,
## # UBE <dbl>, PASS <dbl>
sheet4<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2021Pass")
sheet4
## # A tibble: 107 × 24
## LSAT UGPA Class CivPro LP1 LP2 OneLCUM FGPA Accom Probation
## <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
## 1 150 3.07 2018 C B C 2.36 2.74 N Y
## 2 148 3.57 2018 C+ A B+ 2.95 3.44 N N
## 3 155 3.26 2018 C+ B C 2.83 3.17 N N
## 4 152 3.96 2018 A B+ A 3.60 3.80 N N
## 5 156 3.41 2018 B B C+ 2.88 3.26 N N
## 6 153 3.64 2018 B C+ B 2.52 3.32 Y Y
## 7 151 3.67 2018 B B+ B 2.79 3.26 N N
## 8 156 3.52 2018 B+ B A 3.16 3.58 Y N
## 9 157 2.62 2018 B A A 3.22 3.35 N N
## 10 163 3.45 2018 A B B 3.60 3.76 N N
## # ℹ 97 more rows
## # ℹ 14 more variables: LegalAnalysis <chr>, AdvLegalPerf <chr>,
## # AdvLegalAnalysis <chr>, BarPrep <chr>, PctBarPrepComplete <chr>,
## # NumPrepWorkshops <dbl>, StudentSuccessInitiative <chr>,
## # BarPrepMentor <chr>, MPRE <chr>, MPT <dbl>, MEE <dbl>, MBE <dbl>,
## # UBE <dbl>, PASS <dbl>
sheet1[sheet1=='NA']<-NA
sheet2[sheet2=='NA']<-NA
sheet3[sheet3=='NA']<-NA
sheet4[sheet4=='NA']<-NA
sheet1<-na.omit(sheet1)
sheet2<-na.omit(sheet2)
sheet3<-na.omit(sheet3)
sheet4<-na.omit(sheet4)
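The cleaning step above can be illustrated on a small made-up data frame (the values below are invented, not taken from BarDataSet.xlsx). The sketch shows that the literal string "NA" is not treated as missing until it is replaced by a real NA, after which na.omit() drops the affected rows.

```r
# Toy data frame; values are invented for illustration only.
toy <- data.frame(
  LSAT = c(154, 152, 158),
  MPRE = c("79", "NA", "86"),
  stringsAsFactors = FALSE
)

# The string "NA" is an ordinary character value, so nothing is missing yet.
n_missing_before <- sum(is.na(toy))

# Replace the literal string "NA" with a real missing value.
toy[toy == "NA"] <- NA
n_missing_after <- sum(is.na(toy))

# Drop every row that contains at least one missing value.
toy_clean <- na.omit(toy)
n_rows_clean <- nrow(toy_clean)
```

As an alternative, read_excel() accepts an na argument (for example na = "NA"), which would mark these cells as missing at import time.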
view(sheet1)
view(sheet2)
view(sheet3)
view(sheet4)
data<-rbind(sheet1,sheet2,sheet3,sheet4)
View(data)
str(data)
## tibble [195 × 24] (S3: tbl_df/tbl/data.frame)
## $ LSAT : num [1:195] 154 152 158 153 159 148 155 152 151 157 ...
## $ UGPA : num [1:195] 3.18 2.56 2.78 3.47 2.97 3.59 3.45 3.39 3.61 3.11 ...
## $ Class : num [1:195] 2019 2019 2019 2019 2019 ...
## $ CivPro : chr [1:195] "C+" "B" "C" "A" ...
## $ LP1 : chr [1:195] "C+" "C+" "B+" "B+" ...
## $ LP2 : chr [1:195] "C+" "C" "CR" "A" ...
## $ OneLCUM : num [1:195] 2.47 2.5 3.23 3.5 2.28 ...
## $ FGPA : num [1:195] 2.92 3.06 3.14 3.42 2.64 ...
## $ Accom : chr [1:195] "N" "N" "Y" "N" ...
## $ Probation : chr [1:195] "Y" "N" "N" "N" ...
## $ LegalAnalysis : chr [1:195] "N" "N" "N" "N" ...
## $ AdvLegalPerf : chr [1:195] "N" "N" "N" "N" ...
## $ AdvLegalAnalysis : chr [1:195] "N" "N" "Y" "N" ...
## $ BarPrep : chr [1:195] "Themis" "Themis" "Barbri" "Themis" ...
## $ PctBarPrepComplete : chr [1:195] "0.76700000000000002" "0.58199999999999996" "0.83" "0.69620000000000004" ...
## $ NumPrepWorkshops : num [1:195] 4 0 0 0 5 1 0 5 5 0 ...
## $ StudentSuccessInitiative: chr [1:195] "Y" "Y" "N" "N" ...
## $ BarPrepMentor : chr [1:195] "N" "N" "N" "N" ...
## $ MPRE : chr [1:195] "79" "95" "86" "95" ...
## $ MPT : num [1:195] 3 2.5 2 2 3 2 2.5 3.5 3 2.5 ...
## $ MEE : num [1:195] 3.17 3.67 3 3.17 3 ...
## $ MBE : num [1:195] 135 126 135 140 129 ...
## $ UBE : num [1:195] 266 260 254 262 258 ...
## $ PASS : num [1:195] 0 0 0 0 0 0 0 0 0 0 ...
## - attr(*, "na.action")= 'omit' Named int [1:2] 12 17
## ..- attr(*, "names")= chr [1:2] "12" "17"
summary(data)
## LSAT UGPA Class CivPro
## Min. :145.0 Min. :2.210 Min. :2018 Length:195
## 1st Qu.:152.0 1st Qu.:3.225 1st Qu.:2018 Class :character
## Median :155.0 Median :3.440 Median :2018 Mode :character
## Mean :154.7 Mean :3.428 Mean :2018
## 3rd Qu.:157.0 3rd Qu.:3.660 3rd Qu.:2019
## Max. :165.0 Max. :4.140 Max. :2019
## LP1 LP2 OneLCUM FGPA
## Length:195 Length:195 Min. :2.200 Min. :2.551
## Class :character Class :character 1st Qu.:2.808 1st Qu.:3.132
## Mode :character Mode :character Median :3.145 Median :3.359
## Mean :3.132 Mean :3.346
## 3rd Qu.:3.463 3rd Qu.:3.554
## Max. :4.000 Max. :3.983
## Accom Probation LegalAnalysis AdvLegalPerf
## Length:195 Length:195 Length:195 Length:195
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## AdvLegalAnalysis BarPrep PctBarPrepComplete NumPrepWorkshops
## Length:195 Length:195 Length:195 Min. :0.000
## Class :character Class :character Class :character 1st Qu.:0.000
## Mode :character Mode :character Mode :character Median :1.000
## Mean :1.708
## 3rd Qu.:3.000
## Max. :5.000
## StudentSuccessInitiative BarPrepMentor MPRE MPT
## Length:195 Length:195 Length:195 Min. :2.000
## Class :character Class :character Class :character 1st Qu.:3.000
## Mode :character Mode :character Mode :character Median :4.000
## Mean :3.705
## 3rd Qu.:4.500
## Max. :5.500
## MEE MBE UBE PASS
## Min. :2.167 Min. :103.6 Min. :229.1 Min. :0.0000
## 1st Qu.:3.333 1st Qu.:139.4 1st Qu.:279.9 1st Qu.:1.0000
## Median :3.667 Median :146.3 Median :292.5 Median :1.0000
## Mean :3.667 Mean :145.8 Mean :291.9 Mean :0.8769
## 3rd Qu.:4.000 3rd Qu.:153.1 3rd Qu.:305.4 3rd Qu.:1.0000
## Max. :5.333 Max. :171.8 Max. :344.8 Max. :1.0000
We need to convert the data types: some columns are stored as character strings and some hold letter grades, so they must be converted to a form suitable for regression.
data$CivPro<-factor(data$CivPro,c("D","D+","C","C+","B","B+","A"),ordered=TRUE)
data$LP1<-factor(data$LP1,c("D","D+","C","C+","B","B+","A"),ordered=TRUE)
data$LP2<-factor(data$LP2,c("CR","D","D+","C","C+","B","B+","A"),ordered=TRUE)
data$Accom<-as.factor(data$Accom)
data$Probation<-as.factor(data$Probation)
data$LegalAnalysis<-as.factor(data$LegalAnalysis)
data$AdvLegalPerf<-as.factor(data$AdvLegalPerf)
data$AdvLegalAnalysis<-as.factor(data$AdvLegalAnalysis)
data$BarPrep<-as.factor(data$BarPrep)
data$PctBarPrepComplete<-as.numeric(data$PctBarPrepComplete)
data$NumPrepWorkshops<-factor(data$NumPrepWorkshops,c("0","1","2","3","4","5"),ordered=TRUE)
data$StudentSuccessInitiative<-as.factor(data$StudentSuccessInitiative)
data$BarPrepMentor<-as.factor(data$BarPrepMentor)
Row 80 contains a question mark instead of a value, so we omit this row.
data<-data[-80,]
view(data)
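A stray value such as "?" can also be located programmatically rather than by eye. The following is a minimal sketch on invented values; `find_non_numeric` is a hypothetical helper and is not part of the original analysis.

```r
# Hypothetical helper: positions of entries that are not missing
# but cannot be parsed as numbers.
find_non_numeric <- function(x) {
  parsed <- suppressWarnings(as.numeric(x))
  which(!is.na(x) & is.na(parsed))
}

# Invented example column with one stray question mark.
mpre_toy <- c("79", "95", "?", "86")
bad_rows <- find_non_numeric(mpre_toy)
```

Applied to a character column of `data`, the returned indices could be used to drop rows, after checking that the result is not empty, since negative indexing with an empty vector returns no rows at all.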
data$MPRE<-as.numeric(data$MPRE)
data[,1:23]<-sapply(data[,1:23],as.numeric)
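The numeric codes produced by this conversion follow the order of the factor levels, not the original labels. A small sketch with invented grades shows the mapping, and also why a count stored as an ordered factor with levels "0" to "5" comes out as 1 to 6.

```r
# Invented grades, using the same level order as CivPro above.
grade_levels <- c("D", "D+", "C", "C+", "B", "B+", "A")
grades <- factor(c("C+", "B", "A", "D"), levels = grade_levels, ordered = TRUE)
grade_codes <- as.numeric(grades)      # position of each grade in grade_levels

# A count turned into a factor is coded from 1, so 0 workshops becomes 1.
workshops <- factor(c(4, 0, 5), levels = c("0", "1", "2", "3", "4", "5"),
                    ordered = TRUE)
workshop_codes <- as.numeric(workshops)

# Going through as.character() first recovers the original counts.
workshop_counts <- as.numeric(as.character(workshops))
```

This matches the str() output, where NumPrepWorkshops values such as 4, 0, 5 appear as 5, 1, 6. Because that column is centered and scaled before modelling, the shift by one does not affect the fit. Note also that a label missing from the level list (for example a grade of "F") would silently become NA.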
The data now has a consistent numeric structure, so we can start our analysis.
str(data)
## tibble [194 × 24] (S3: tbl_df/tbl/data.frame)
## $ LSAT : num [1:194] 154 152 158 153 159 148 155 152 151 157 ...
## $ UGPA : num [1:194] 3.18 2.56 2.78 3.47 2.97 3.59 3.45 3.39 3.61 3.11 ...
## $ Class : num [1:194] 2019 2019 2019 2019 2019 ...
## $ CivPro : num [1:194] 4 5 3 7 3 5 5 3 3 3 ...
## $ LP1 : num [1:194] 4 4 6 6 3 7 2 7 1 6 ...
## $ LP2 : num [1:194] 5 4 1 8 5 7 1 6 5 6 ...
## $ OneLCUM : num [1:194] 2.47 2.5 3.23 3.5 2.28 ...
## $ FGPA : num [1:194] 2.92 3.06 3.14 3.42 2.64 ...
## $ Accom : num [1:194] 1 1 2 1 1 1 1 1 1 1 ...
## $ Probation : num [1:194] 2 1 1 1 2 1 1 1 1 1 ...
## $ LegalAnalysis : num [1:194] 1 1 1 1 2 1 1 1 2 1 ...
## $ AdvLegalPerf : num [1:194] 1 1 1 1 2 1 1 1 2 1 ...
## $ AdvLegalAnalysis : num [1:194] 1 1 2 1 2 2 1 1 2 1 ...
## $ BarPrep : num [1:194] 2 2 1 2 2 2 1 1 1 2 ...
## $ PctBarPrepComplete : num [1:194] 0.767 0.582 0.83 0.696 0.749 ...
## $ NumPrepWorkshops : num [1:194] 5 1 1 1 6 2 1 6 6 1 ...
## $ StudentSuccessInitiative: num [1:194] 2 2 1 1 2 2 2 2 2 1 ...
## $ BarPrepMentor : num [1:194] 1 1 1 1 1 1 1 1 1 2 ...
## $ MPRE : num [1:194] 79 95 86 95 79 87 86 85 85 87 ...
## $ MPT : num [1:194] 3 2.5 2 2 3 2 2.5 3.5 3 2.5 ...
## $ MEE : num [1:194] 3.17 3.67 3 3.17 3 ...
## $ MBE : num [1:194] 135 126 135 140 129 ...
## $ UBE : num [1:194] 266 260 254 262 258 ...
## $ PASS : num [1:194] 0 0 0 0 0 0 0 0 0 0 ...
## - attr(*, "na.action")= 'omit' Named int [1:2] 12 17
## ..- attr(*, "names")= chr [1:2] "12" "17"
view(data)
We need to standardize the continuous predictors to bring them onto the same scale.
datanew<-data.frame(scale(data[,c(1,2,3,7,8,15,16)],center=TRUE,scale=TRUE),data[,c(4,5,6,9,10,11,12,13,14,17,18,19,20,21,22,23)])
view(datanew)
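As a quick illustration of what scale() does, the sketch below standardizes an invented numeric column by hand and compares the result with scale(). Each value is centered by the column mean and divided by the sample standard deviation.

```r
# Invented LSAT-like scores.
x <- c(154, 152, 158, 153, 159)

# Manual z-scores: subtract the mean, divide by the sample standard deviation.
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix; as.numeric() drops the attributes.
z_scale <- as.numeric(scale(x, center = TRUE, scale = TRUE))

same_result <- isTRUE(all.equal(z_manual, z_scale))
```

Standardizing a predictor changes the units of its coefficient (to "per one standard deviation") but leaves its t value and p-value unchanged in a model with an intercept.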
Now that the data are on a common scale, we can run the regression analysis.
fit.UBE<-lm(UBE~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.UBE)
##
## Call:
## lm(formula = UBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf +
## AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -50.929 -8.035 0.296 9.556 38.130
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 269.5151 13.3463 20.194 < 2e-16 ***
## LSAT 1.3444 1.1220 1.198 0.232460
## UGPA -0.6114 1.1249 -0.544 0.587473
## Class -4.0198 1.2181 -3.300 0.001171 **
## CivPro 0.7758 1.1065 0.701 0.484153
## LP1 -2.2246 1.0901 -2.041 0.042785 *
## LP2 -0.2754 0.5024 -0.548 0.584319
## OneLCUM 5.9401 2.5704 2.311 0.022000 *
## FGPA 11.3045 2.4032 4.704 5.16e-06 ***
## Accom -3.3516 3.3587 -0.998 0.319704
## Probation 5.9178 4.6300 1.278 0.202890
## LegalAnalysis -6.4304 6.1636 -1.043 0.298255
## AdvLegalPerf 6.0221 4.1357 1.456 0.147149
## AdvLegalAnalysis 1.3725 2.0957 0.655 0.513394
## BarPrep 8.6507 2.2296 3.880 0.000148 ***
## PctBarPrepComplete 4.0463 1.1245 3.598 0.000417 ***
## NumPrepWorkshops -0.4260 1.1814 -0.361 0.718833
## StudentSuccessInitiative 5.6121 3.4151 1.643 0.102107
## BarPrepMentor 5.3007 2.6255 2.019 0.045018 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.69 on 175 degrees of freedom
## Multiple R-squared: 0.577, Adjusted R-squared: 0.5335
## F-statistic: 13.26 on 18 and 175 DF, p-value: < 2.2e-16
plot(fit.UBE)
The residuals are scattered randomly around zero against the fitted values and appear approximately normally distributed, so the model is adequate. There is no need for a transformation, and there are no influential points. Next, we check the VIF values to assess multicollinearity.
vif(fit.UBE)
## LSAT UGPA Class
## 1.296326 1.303024 1.527821
## CivPro LP1 LP2
## 2.263784 1.955095 1.505454
## OneLCUM FGPA Accom
## 6.803458 5.947323 1.220199
## Probation LegalAnalysis AdvLegalPerf
## 2.051705 1.922441 1.106845
## AdvLegalAnalysis BarPrep PctBarPrepComplete
## 1.119117 1.281510 1.302029
## NumPrepWorkshops StudentSuccessInitiative BarPrepMentor
## 1.437288 2.247827 1.251218
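OneLCUM (6.80) and FGPA (5.95) have the largest VIF values, both above the common rule-of-thumb threshold of 5, which suggests that they are correlated with each other and with the other predictors. We therefore remove those two terms and regress again.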
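For reference, the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the other predictors. The sketch below computes it with base R on the built-in mtcars data, used here only as a stand-in for the bar data. `vif_manual` is a hypothetical helper; for numeric predictors it should agree with car::vif().

```r
# Hypothetical helper: VIF of each column in a data frame of numeric predictors.
vif_manual <- function(predictors) {
  sapply(names(predictors), function(v) {
    # Regress predictor v on all remaining predictors.
    aux <- lm(reformulate(setdiff(names(predictors), v), response = v),
              data = predictors)
    1 / (1 - summary(aux)$r.squared)
  })
}

# Stand-in example with three predictors from mtcars.
vifs <- vif_manual(mtcars[, c("wt", "hp", "disp")])
```

Because the formula involves only the predictors, the VIF values stay the same when a different response is regressed on the same set of predictors.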
fit.UBE1<-lm(UBE~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.UBE1)
##
## Call:
## lm(formula = UBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
## BarPrep + PctBarPrepComplete + NumPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -58.156 -9.266 0.341 9.991 37.861
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 252.6157 15.1887 16.632 < 2e-16 ***
## LSAT 3.3451 1.2532 2.669 0.008309 **
## UGPA 1.2026 1.2730 0.945 0.346103
## Class -2.9105 1.3577 -2.144 0.033420 *
## CivPro 5.4750 1.0387 5.271 3.92e-07 ***
## LP1 0.9127 1.1403 0.800 0.424542
## LP2 0.3080 0.5682 0.542 0.588473
## Accom -2.5323 3.8105 -0.665 0.507195
## Probation 1.3910 5.2330 0.266 0.790687
## LegalAnalysis -0.7131 7.0908 -0.101 0.920009
## AdvLegalPerf 3.0358 4.7709 0.636 0.525388
## AdvLegalAnalysis -0.3234 2.3733 -0.136 0.891768
## BarPrep 7.4908 2.5691 2.916 0.004007 **
## PctBarPrepComplete 4.9531 1.2937 3.829 0.000179 ***
## NumPrepWorkshops -0.6713 1.3520 -0.497 0.620128
## StudentSuccessInitiative -8.9822 3.2783 -2.740 0.006776 **
## BarPrepMentor 2.8271 3.0122 0.939 0.349229
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.86 on 177 degrees of freedom
## Multiple R-squared: 0.4259, Adjusted R-squared: 0.374
## F-statistic: 8.207 on 16 and 177 DF, p-value: 1.526e-14
vif(fit.UBE1)
## LSAT UGPA Class
## 1.205068 1.243400 1.414449
## CivPro LP1 LP2
## 1.486551 1.593903 1.435223
## Accom Probation LegalAnalysis
## 1.170321 1.953007 1.895951
## AdvLegalPerf AdvLegalAnalysis BarPrep
## 1.097585 1.069523 1.267820
## PctBarPrepComplete NumPrepWorkshops StudentSuccessInitiative
## 1.284262 1.402588 1.543548
## BarPrepMentor
## 1.227225
After removing those two terms, R-squared drops from 0.577 to 0.4259 (adjusted R-squared from 0.5335 to 0.374) and the p-value of the overall F-test increases. It is therefore better to keep OneLCUM and FGPA. We build the next model from these two, the significant predictors, and some near-significant ones. Removing the other terms has no significant impact on the fit.
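The comparison between a model and a smaller model nested inside it can also be made formally with a partial F-test. A minimal sketch on the built-in mtcars data (a stand-in, not the bar data) is shown below; the choice of predictors is arbitrary.

```r
# Stand-in models: the reduced model drops two predictors from the full model.
full    <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
reduced <- lm(mpg ~ wt + hp, data = mtcars)

# Partial F-test: a small p-value suggests the dropped terms improve the fit.
cmp <- anova(reduced, full)
p_value <- cmp[["Pr(>F)"]][2]

# Adjusted R-squared of both models, for a side-by-side look.
adj_r2 <- c(reduced = summary(reduced)$adj.r.squared,
            full    = summary(full)$adj.r.squared)
```

With the objects in this report, the analogous call would be anova(fit.UBE1, fit.UBE), since fit.UBE1 is fit.UBE without OneLCUM and FGPA.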
fit.UBE2<-lm(UBE~Class+OneLCUM+UGPA+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.UBE2)
##
## Call:
## lm(formula = UBE ~ Class + OneLCUM + UGPA + FGPA + Probation +
## LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -50.817 -8.547 -0.220 9.747 37.988
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 258.961 9.075 28.534 < 2e-16 ***
## Class -3.410 1.091 -3.126 0.002061 **
## OneLCUM 6.317 2.195 2.878 0.004479 **
## UGPA -1.352 1.060 -1.275 0.203843
## FGPA 10.169 2.248 4.524 1.09e-05 ***
## Probation 6.386 4.581 1.394 0.164998
## LegalAnalysis -6.716 5.853 -1.147 0.252739
## AdvLegalPerf 6.702 4.107 1.632 0.104464
## BarPrep 9.047 2.210 4.093 6.39e-05 ***
## PctBarPrepComplete 4.162 1.101 3.781 0.000211 ***
## StudentSuccessInitiative 4.949 3.290 1.504 0.134243
## BarPrepMentor 4.669 2.485 1.879 0.061841 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.82 on 182 degrees of freedom
## Multiple R-squared: 0.5518, Adjusted R-squared: 0.5247
## F-statistic: 20.37 on 11 and 182 DF, p-value: < 2.2e-16
vif(fit.UBE2)
## Class OneLCUM UGPA
## 1.202470 4.868515 1.135282
## FGPA Probation LegalAnalysis
## 5.107159 1.971100 1.701459
## AdvLegalPerf BarPrep PctBarPrepComplete
## 1.071283 1.235934 1.224628
## StudentSuccessInitiative BarPrepMentor
## 2.047741 1.100122
The VIF values are now lower: OneLCUM is 4.87 and FGPA is 5.11. We cannot remove FGPA because it is important for the analysis; it has the smallest p-value of all predictors in this model.
Next, we select a model for the UBE response based on AIC.
library(MuMIn)
fullmodelUBE<-lm(UBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew,na.action = "na.fail")
summary.fitdredgeUBE<-dredge(fullmodelUBE)
## Fixed term is "(Intercept)"
plot(summary.fitdredgeUBE)
modelAICUBE736<-lm(UBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.736<-summary(modelAICUBE736)
summary.736
##
## Call:
## lm(formula = UBE ~ Class + UGPA + OneLCUM + FGPA + AdvLegalPerf +
## BarPrep + PctBarPrepComplete + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -51.803 -8.806 -0.440 9.404 37.136
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 259.624 7.664 33.874 < 2e-16 ***
## Class -3.556 1.087 -3.271 0.001278 **
## UGPA -1.542 1.052 -1.465 0.144534
## OneLCUM 5.788 2.059 2.811 0.005477 **
## FGPA 10.371 2.226 4.659 6.08e-06 ***
## AdvLegalPerf 6.080 4.079 1.490 0.137837
## BarPrep 9.198 2.209 4.164 4.80e-05 ***
## PctBarPrepComplete 4.245 1.099 3.864 0.000154 ***
## StudentSuccessInitiative 5.006 3.239 1.545 0.123997
## BarPrepMentor 4.410 2.472 1.784 0.076077 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.83 on 184 degrees of freedom
## Multiple R-squared: 0.5463, Adjusted R-squared: 0.5242
## F-statistic: 24.62 on 9 and 184 DF, p-value: < 2.2e-16
modelAICUBE224<-lm(UBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+BarPrepMentor,data=datanew)
summary.224<-summary(modelAICUBE224)
summary.224
##
## Call:
## lm(formula = UBE ~ Class + UGPA + OneLCUM + FGPA + AdvLegalPerf +
## BarPrep + PctBarPrepComplete + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -54.137 -8.306 -0.138 9.452 40.165
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 266.361 6.328 42.095 < 2e-16 ***
## Class -3.530 1.091 -3.236 0.001436 **
## UGPA -1.527 1.056 -1.447 0.149722
## OneLCUM 5.582 2.062 2.706 0.007439 **
## FGPA 9.082 2.072 4.384 1.95e-05 ***
## AdvLegalPerf 5.650 4.085 1.383 0.168270
## BarPrep 9.496 2.209 4.299 2.77e-05 ***
## PctBarPrepComplete 4.200 1.102 3.810 0.000189 ***
## BarPrepMentor 4.010 2.467 1.625 0.105803
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.88 on 185 degrees of freedom
## Multiple R-squared: 0.5405, Adjusted R-squared: 0.5206
## F-statistic: 27.2 on 8 and 185 DF, p-value: < 2.2e-16
modelAICUBE223<-lm(UBE~Class+UGPA+OneLCUM+FGPA+BarPrep+PctBarPrepComplete+BarPrepMentor,data=datanew)
summary.223<-summary(modelAICUBE223)
summary.223
##
## Call:
## lm(formula = UBE ~ Class + UGPA + OneLCUM + FGPA + BarPrep +
## PctBarPrepComplete + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -54.618 -8.290 -0.183 9.581 39.864
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 272.696 4.376 62.310 < 2e-16 ***
## Class -3.462 1.092 -3.169 0.001789 **
## UGPA -1.717 1.050 -1.636 0.103511
## OneLCUM 5.419 2.064 2.625 0.009373 **
## FGPA 9.123 2.077 4.393 1.87e-05 ***
## BarPrep 9.356 2.212 4.230 3.67e-05 ***
## PctBarPrepComplete 4.038 1.099 3.675 0.000311 ***
## BarPrepMentor 3.935 2.473 1.591 0.113255
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.91 on 186 degrees of freedom
## Multiple R-squared: 0.5357, Adjusted R-squared: 0.5182
## F-statistic: 30.66 on 7 and 186 DF, p-value: < 2.2e-16
Based on the AIC selection, our best model for the UBE response contains Class + UGPA + OneLCUM + FGPA + AdvLegalPerf + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + BarPrepMentor.
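The idea behind dredge() is to fit every subset of the candidate predictors and rank the fits by an information criterion. The following base-R sketch does the same by brute force with AIC, on the built-in mtcars data as a stand-in. Note that dredge() ranks by AICc by default, so its ordering may differ slightly; `all_subsets_aic` is a hypothetical helper.

```r
# Hypothetical helper: fit all non-empty subsets of `candidates`, rank by AIC.
all_subsets_aic <- function(response, candidates, data) {
  # List of character vectors, one per subset of the candidate predictors.
  subsets <- unlist(
    lapply(seq_along(candidates),
           function(k) combn(candidates, k, simplify = FALSE)),
    recursive = FALSE
  )
  aic <- sapply(subsets, function(vars) {
    AIC(lm(reformulate(vars, response = response), data = data))
  })
  out <- data.frame(
    predictors = sapply(subsets, paste, collapse = " + "),
    n_terms    = lengths(subsets),
    AIC        = aic,
    stringsAsFactors = FALSE
  )
  out[order(out$AIC), ]   # lowest AIC first
}

ranking <- all_subsets_aic("mpg", c("wt", "hp", "disp", "qsec"), mtcars)
best <- ranking$predictors[1]
```

The number of subsets doubles with every added candidate, which is one reason to trim the full model before running an exhaustive search.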
Next, we find the “best” subset among all possible models using ols_step_best_subset.
library(olsrr)
##
## Attaching package: 'olsrr'
## The following object is masked from 'package:MASS':
##
## cement
## The following object is masked from 'package:datasets':
##
## rivers
modelUBEsubset <- lm(UBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.fitUBEsubset<-ols_step_best_subset(modelUBEsubset)
summary.fitUBEsubset
## Best Subsets Regression
## ---------------------------------------------------------------------------------------------------------------------------------------------
## Model Index Predictors
## ---------------------------------------------------------------------------------------------------------------------------------------------
## 1 FGPA
## 2 FGPA BarPrep
## 3 FGPA BarPrep PctBarPrepComplete
## 4 Class FGPA BarPrep PctBarPrepComplete
## 5 Class OneLCUM FGPA BarPrep PctBarPrepComplete
## 6 Class UGPA OneLCUM FGPA BarPrep PctBarPrepComplete
## 7 Class UGPA OneLCUM FGPA BarPrep PctBarPrepComplete BarPrepMentor
## 8 Class OneLCUM FGPA AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 9 Class UGPA OneLCUM FGPA AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 10 Class UGPA OneLCUM FGPA Probation AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 11 Class UGPA OneLCUM FGPA Probation LegalAnalysis AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## ---------------------------------------------------------------------------------------------------------------------------------------------
##
## Subsets Regression Summary
## -----------------------------------------------------------------------------------------------------------------------------------------
## Adj. Pred
## Model R-Square R-Square R-Square C(p) AIC SBIC SBC MSEP FPE HSP APC
## -----------------------------------------------------------------------------------------------------------------------------------------
## 1 0.4370 0.4341 0.4253 38.5953 1607.3001 1056.1008 1617.1037 44111.1882 229.7212 1.1905 0.5747
## 2 0.4577 0.4520 0.4406 32.1891 1602.0316 1050.7415 1615.1031 42712.7035 223.5670 1.1587 0.5593
## 3 0.4949 0.4869 0.4721 19.0908 1590.2510 1039.2644 1606.5903 39994.3970 210.3956 1.0907 0.5264
## 4 0.5072 0.4968 0.4791 16.0867 1587.4588 1036.6046 1607.0660 39226.1032 207.3903 1.0754 0.5188
## 5 0.5228 0.5101 0.4883 11.7657 1583.2312 1032.6975 1606.1062 38190.0285 202.9214 1.0526 0.5077
## 6 0.5294 0.5143 0.4913 11.0926 1582.5362 1032.2006 1608.6791 37865.6714 202.1982 1.0492 0.5059
## 7 0.5357 0.5182 0.4927 10.5262 1581.9131 1031.8278 1611.3238 37559.0564 201.5529 1.0463 0.5042
## 8 0.5410 0.5212 0.487 10.3555 1581.6664 1031.8533 1614.3450 37328.3793 201.3008 1.0455 0.5036
## 9 0.5463 0.5242 0.489 10.2058 1581.4156 1031.9252 1617.3621 37099.4182 201.0457 1.0448 0.5030
## 10 0.5485 0.5239 0.4847 11.3164 1582.4767 1033.2039 1621.6910 37123.1480 202.1544 1.0511 0.5057
## 11 0.5518 0.5247 0.4832 12.0000 1583.0785 1034.1040 1625.5606 37060.1861 202.7898 1.0551 0.5073
## -----------------------------------------------------------------------------------------------------------------------------------------
## AIC: Akaike Information Criteria
## SBIC: Sawa's Bayesian Information Criteria
## SBC: Schwarz Bayesian Criteria
## MSEP: Estimated error of prediction, assuming multivariate normality
## FPE: Final Prediction Error
## HSP: Hocking's Sp
## APC: Amemiya Prediction Criteria
plot(summary.fitUBEsubset)
modelbestUBE<-lm(UBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(modelbestUBE)
##
## Call:
## lm(formula = UBE ~ Class + UGPA + OneLCUM + FGPA + AdvLegalPerf +
## BarPrep + PctBarPrepComplete + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -51.803 -8.806 -0.440 9.404 37.136
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 259.624 7.664 33.874 < 2e-16 ***
## Class -3.556 1.087 -3.271 0.001278 **
## UGPA -1.542 1.052 -1.465 0.144534
## OneLCUM 5.788 2.059 2.811 0.005477 **
## FGPA 10.371 2.226 4.659 6.08e-06 ***
## AdvLegalPerf 6.080 4.079 1.490 0.137837
## BarPrep 9.198 2.209 4.164 4.80e-05 ***
## PctBarPrepComplete 4.245 1.099 3.864 0.000154 ***
## StudentSuccessInitiative 5.006 3.239 1.545 0.123997
## BarPrepMentor 4.410 2.472 1.784 0.076077 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.83 on 184 degrees of freedom
## Multiple R-squared: 0.5463, Adjusted R-squared: 0.5242
## F-statistic: 24.62 on 9 and 184 DF, p-value: < 2.2e-16
The 9th subset model is the best because it has the lowest AIC value (1581.4156). This model has the predictors Class, UGPA, OneLCUM, FGPA, AdvLegalPerf, BarPrep, PctBarPrepComplete, StudentSuccessInitiative, and BarPrepMentor.
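The AIC column in the table can be reproduced from the model summary. For a linear model with n observations, k predictors, and residual sum of squares RSS, R's AIC() equals n log(2π) + n log(RSS/n) + n + 2(k + 2), where k + 2 counts the slopes, the intercept, and the error variance. The sketch checks this formula against AIC() on a stand-in model, then plugs in the rounded figures reported for the 9-predictor model (n = 194, residual standard error 13.83 on 184 degrees of freedom). `aic_from_rss` is a hypothetical helper.

```r
# Hypothetical helper: AIC of a Gaussian linear model from its RSS.
aic_from_rss <- function(rss, n, k) {
  n * log(2 * pi) + n * log(rss / n) + n + 2 * (k + 2)
}

# Check against stats::AIC() on a stand-in model (built-in mtcars data).
fit <- lm(mpg ~ wt + hp, data = mtcars)
manual  <- aic_from_rss(sum(residuals(fit)^2), n = nrow(mtcars), k = 2)
matches <- isTRUE(all.equal(manual, AIC(fit)))

# Figures reported above for the 9-predictor UBE model (rounded in the output).
rss_ube <- 13.83^2 * 184   # residual standard error squared times residual df
aic_ube <- aic_from_rss(rss_ube, n = 194, k = 9)
```

The value of aic_ube lands within about 0.1 of the tabulated 1581.4156; the small gap comes from the rounding of the residual standard error in the printed summary.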
Now we regress MBE on the same predictors.
fit.MBE<-lm(MBE~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MBE)
##
## Call:
## lm(formula = MBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf +
## AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.382 -5.524 0.284 4.662 19.520
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.350e+02 7.928e+00 17.030 < 2e-16 ***
## LSAT 7.667e-01 6.665e-01 1.150 0.25155
## UGPA -5.302e-01 6.682e-01 -0.793 0.42858
## Class -2.303e+00 7.236e-01 -3.183 0.00172 **
## CivPro -3.694e-01 6.573e-01 -0.562 0.57488
## LP1 -1.469e+00 6.476e-01 -2.269 0.02450 *
## LP2 3.583e-04 2.984e-01 0.001 0.99904
## OneLCUM 4.623e+00 1.527e+00 3.028 0.00284 **
## FGPA 5.998e+00 1.428e+00 4.202 4.22e-05 ***
## Accom -2.429e+00 1.995e+00 -1.218 0.22505
## Probation 5.056e+00 2.750e+00 1.838 0.06770 .
## LegalAnalysis -1.141e+00 3.661e+00 -0.312 0.75572
## AdvLegalPerf 3.987e+00 2.457e+00 1.623 0.10644
## AdvLegalAnalysis 2.902e-01 1.245e+00 0.233 0.81598
## BarPrep 3.984e+00 1.325e+00 3.008 0.00302 **
## PctBarPrepComplete 1.996e+00 6.680e-01 2.988 0.00321 **
## NumPrepWorkshops -5.614e-02 7.018e-01 -0.080 0.93634
## StudentSuccessInitiative 3.723e+00 2.029e+00 1.835 0.06818 .
## BarPrepMentor 2.595e+00 1.560e+00 1.664 0.09792 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.133 on 175 degrees of freedom
## Multiple R-squared: 0.5156, Adjusted R-squared: 0.4658
## F-statistic: 10.35 on 18 and 175 DF, p-value: < 2.2e-16
plot(fit.MBE)
Adequacy check:
The residuals are scattered randomly around zero against the fitted values and appear approximately normally distributed, so the model is adequate. No transformation is needed, and there are no influential points.
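The visual checks from plot() can be backed up with a few numbers. The following is a minimal sketch on simulated stand-in data (the data frame `toy` and its columns are hypothetical, not the bar exam data); the 4/n cutoff for Cook's distance is a common rule of thumb, not a formal test.

```r
# Simulated stand-in data (hypothetical; not the bar exam data set)
set.seed(1)
n <- 194
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 130 + 4 * toy$x1 + 2 * toy$x2 + rnorm(n, sd = 8)

fit <- lm(y ~ x1 + x2, data = toy)

# Normality of residuals: Shapiro-Wilk test on the standardized residuals
sw <- shapiro.test(rstandard(fit))

# Influence: Cook's distance, flagged with the 4/n rule of thumb
cd <- cooks.distance(fit)
flagged <- which(cd > 4 / n)

# Constant variance: correlation between |residuals| and fitted values
spread_cor <- cor(abs(residuals(fit)), fitted(fit))

print(sw$p.value)
print(flagged)
print(spread_cor)
```

A large Shapiro-Wilk p-value, few flagged observations and a spread correlation near zero would be consistent with the conclusions drawn from the diagnostic plots.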
We will check the VIF values to assess multicollinearity.
vif(fit.MBE)
## LSAT UGPA Class
## 1.296326 1.303024 1.527821
## CivPro LP1 LP2
## 2.263784 1.955095 1.505454
## OneLCUM FGPA Accom
## 6.803458 5.947323 1.220199
## Probation LegalAnalysis AdvLegalPerf
## 2.051705 1.922441 1.106845
## AdvLegalAnalysis BarPrep PctBarPrepComplete
## 1.119117 1.281510 1.302029
## NumPrepWorkshops StudentSuccessInitiative BarPrepMentor
## 1.437288 2.247827 1.251218
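As a cross-check on what vif() reports, the variance inflation factor of a predictor can be computed by hand as 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing that predictor on all the others. The sketch below uses simulated data; `toy`, its columns and the helper `manual_vif` are hypothetical names introduced only for illustration.

```r
# Simulated stand-in data (hypothetical; not the bar exam data set)
set.seed(2)
n <- 194
toy <- data.frame(a = rnorm(n))
toy$b <- 0.9 * toy$a + rnorm(n, sd = 0.4)  # deliberately correlated with a
toy$c <- rnorm(n)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j on the rest
manual_vif <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

vifs <- manual_vif(toy, c("a", "b", "c"))
print(round(vifs, 3))
```

Because the response never enters the calculation, the same set of predictors gives the same VIF values whichever exam score is being modelled.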
OneLCUM and FGPA have the highest VIF values (6.80 and 5.95), which suggests that they are strongly correlated with the other predictors. We therefore remove these two terms and regress again.
fit.MBE1<-lm(MBE~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MBE1)
##
## Call:
## lm(formula = MBE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
## BarPrep + PctBarPrepComplete + NumPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -39.392 -5.625 0.809 5.947 20.597
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 124.2822 9.0684 13.705 < 2e-16 ***
## LSAT 2.0396 0.7482 2.726 0.00705 **
## UGPA 0.5328 0.7600 0.701 0.48420
## Class -1.5072 0.8106 -1.859 0.06464 .
## CivPro 2.5741 0.6202 4.151 5.14e-05 ***
## LP1 0.5282 0.6808 0.776 0.43890
## LP2 0.3184 0.3393 0.939 0.34924
## Accom -2.2194 2.2750 -0.976 0.33061
## Probation 1.9714 3.1243 0.631 0.52888
## LegalAnalysis 2.3182 4.2335 0.548 0.58467
## AdvLegalPerf 2.1769 2.8484 0.764 0.44574
## AdvLegalAnalysis -0.5691 1.4170 -0.402 0.68843
## BarPrep 3.2096 1.5339 2.092 0.03782 *
## PctBarPrepComplete 2.5157 0.7724 3.257 0.00135 **
## NumPrepWorkshops -0.2807 0.8072 -0.348 0.72844
## StudentSuccessInitiative -4.8665 1.9573 -2.486 0.01383 *
## BarPrepMentor 1.0250 1.7984 0.570 0.56943
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.469 on 177 degrees of freedom
## Multiple R-squared: 0.3359, Adjusted R-squared: 0.2759
## F-statistic: 5.596 on 16 and 177 DF, p-value: 1.206e-09
vif(fit.MBE1)
## LSAT UGPA Class
## 1.205068 1.243400 1.414449
## CivPro LP1 LP2
## 1.486551 1.593903 1.435223
## Accom Probation LegalAnalysis
## 1.170321 1.953007 1.895951
## AdvLegalPerf AdvLegalAnalysis BarPrep
## 1.097585 1.069523 1.267820
## PctBarPrepComplete NumPrepWorkshops StudentSuccessInitiative
## 1.284262 1.402588 1.543548
## BarPrepMentor
## 1.227225
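Removing the two terms makes the fit clearly worse: R-squared drops from 0.5156 to 0.3359 (adjusted R-squared from 0.4658 to 0.2759), the residual standard error rises from 8.133 to 9.469, and the p-value of the overall F-test rises from < 2.2e-16 to 1.206e-09. It is therefore better to keep OneLCUM and FGPA in the model, together with the significant and near-significant predictors. Dropping the remaining terms has no significant impact on the fit.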
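The comparison between a full model and a reduced one can also be made formally with a partial F-test, which anova() performs for two nested lm fits. This is a sketch on simulated data, with `g1` and `g2` standing in for a pair of correlated predictors; all names here are hypothetical.

```r
# Simulated stand-in data (hypothetical; not the bar exam data set)
set.seed(3)
n <- 194
toy <- data.frame(g1 = rnorm(n), z = rnorm(n))
toy$g2 <- 0.9 * toy$g1 + rnorm(n, sd = 0.4)  # correlated with g1
toy$y <- 130 + 4 * toy$g1 + 5 * toy$g2 + 2 * toy$z + rnorm(n, sd = 8)

full <- lm(y ~ g1 + g2 + z, data = toy)
reduced <- lm(y ~ z, data = toy)

# Partial F-test: H0 says the coefficients of g1 and g2 are both zero
cmp <- anova(reduced, full)
print(cmp)

# Adjusted R-squared with and without the pair
adj <- c(full = summary(full)$adj.r.squared,
         reduced = summary(reduced)$adj.r.squared)
print(round(adj, 4))
```

A small p-value in the second row of the anova table indicates that the dropped pair carries information the remaining predictors cannot replace, even when the two are correlated with each other.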
fit.MBE2<-lm(UBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MBE2)
##
## Call:
## lm(formula = UBE ~ Class + UGPA + OneLCUM + FGPA + Probation +
## LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -50.817 -8.547 -0.220 9.747 37.988
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 258.961 9.075 28.534 < 2e-16 ***
## Class -3.410 1.091 -3.126 0.002061 **
## UGPA -1.352 1.060 -1.275 0.203843
## OneLCUM 6.317 2.195 2.878 0.004479 **
## FGPA 10.169 2.248 4.524 1.09e-05 ***
## Probation 6.386 4.581 1.394 0.164998
## LegalAnalysis -6.716 5.853 -1.147 0.252739
## AdvLegalPerf 6.702 4.107 1.632 0.104464
## BarPrep 9.047 2.210 4.093 6.39e-05 ***
## PctBarPrepComplete 4.162 1.101 3.781 0.000211 ***
## StudentSuccessInitiative 4.949 3.290 1.504 0.134243
## BarPrepMentor 4.669 2.485 1.879 0.061841 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.82 on 182 degrees of freedom
## Multiple R-squared: 0.5518, Adjusted R-squared: 0.5247
## F-statistic: 20.37 on 11 and 182 DF, p-value: < 2.2e-16
vif(fit.MBE2)
## Class UGPA OneLCUM
## 1.202470 1.135282 4.868515
## FGPA Probation LegalAnalysis
## 5.107159 1.971100 1.701459
## AdvLegalPerf BarPrep PctBarPrepComplete
## 1.071283 1.235934 1.224628
## StudentSuccessInitiative BarPrepMentor
## 2.047741 1.100122
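The VIF values are now lower (at most 5.11, for FGPA), so multicollinearity is less of a concern. We keep FGPA because it is an important predictor in the analysis. Note that fit.MBE2 above was fitted with UBE as the response; since VIF depends only on the predictors, the values apply equally to the MBE model.
Now we select a model for MBE based on AIC.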
library(MuMIn)
fullmodelMBE<-lm(MBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew,na.action = "na.fail")
summary.fit1MBE<-dredge(fullmodelMBE)
## Fixed term is "(Intercept)"
plot(summary.fit1MBE)
modelAICMBE992<-lm(MBE~Class+OneLCUM+FGPA+AdvLegalPerf+Probation+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.992<-summary(modelAICMBE992)
summary.992
##
## Call:
## lm(formula = MBE ~ Class + OneLCUM + FGPA + AdvLegalPerf + Probation +
## BarPrep + PctBarPrepComplete + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.895 -5.117 0.549 4.839 18.952
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 122.2251 5.1346 23.804 < 2e-16 ***
## Class -1.9669 0.6458 -3.046 0.002662 **
## OneLCUM 4.1047 1.2986 3.161 0.001841 **
## FGPA 4.9808 1.3167 3.783 0.000209 ***
## AdvLegalPerf 4.7994 2.4002 2.000 0.047015 *
## Probation 5.0554 2.3426 2.158 0.032225 *
## BarPrep 4.0335 1.3021 3.098 0.002256 **
## PctBarPrepComplete 2.0339 0.6511 3.124 0.002075 **
## StudentSuccessInitiative 3.1850 1.9425 1.640 0.102779
## BarPrepMentor 2.1738 1.4753 1.473 0.142331
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.204 on 184 degrees of freedom
## Multiple R-squared: 0.4817, Adjusted R-squared: 0.4564
## F-statistic: 19 on 9 and 184 DF, p-value: < 2.2e-16
modelAICMBE988<-lm(MBE~Class+OneLCUM+FGPA+AdvLegalPerf+BarPrep+Probation+PctBarPrepComplete+StudentSuccessInitiative,data=datanew)
summary.988<-summary(modelAICMBE988)
summary.988
##
## Call:
## lm(formula = MBE ~ Class + OneLCUM + FGPA + AdvLegalPerf + BarPrep +
## Probation + PctBarPrepComplete + StudentSuccessInitiative,
## data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -35.624 -4.817 0.122 4.860 18.591
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 125.3400 4.6940 26.702 < 2e-16 ***
## Class -1.8545 0.6433 -2.883 0.004409 **
## OneLCUM 3.8181 1.2881 2.964 0.003434 **
## FGPA 5.1590 1.3152 3.922 0.000123 ***
## AdvLegalPerf 4.6908 2.4066 1.949 0.052796 .
## BarPrep 4.2898 1.2945 3.314 0.001107 **
## Probation 4.6802 2.3361 2.003 0.046593 *
## PctBarPrepComplete 2.2121 0.6418 3.447 0.000702 ***
## StudentSuccessInitiative 2.9360 1.9412 1.512 0.132125
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.23 on 185 degrees of freedom
## Multiple R-squared: 0.4756, Adjusted R-squared: 0.453
## F-statistic: 20.98 on 8 and 185 DF, p-value: < 2.2e-16
modelAICMBE476<-lm(MBE~Class+OneLCUM+FGPA+BarPrep+PctBarPrepComplete+BarPrepMentor+Probation,data=datanew)
summary.476<-summary(modelAICMBE476)
summary.476
##
## Call:
## lm(formula = MBE ~ Class + OneLCUM + FGPA + BarPrep + PctBarPrepComplete +
## BarPrepMentor + Probation, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -36.673 -5.279 -0.094 5.245 21.901
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 131.0963 3.7565 34.898 < 2e-16 ***
## Class -1.8891 0.6524 -2.896 0.004236 **
## OneLCUM 3.9414 1.3113 3.006 0.003014 **
## FGPA 4.1270 1.2263 3.365 0.000929 ***
## BarPrep 4.0601 1.3099 3.099 0.002240 **
## PctBarPrepComplete 1.8669 0.6540 2.854 0.004802 **
## BarPrepMentor 1.8939 1.4860 1.274 0.204086
## Probation 5.5319 2.3439 2.360 0.019306 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.298 on 186 degrees of freedom
## Multiple R-squared: 0.4641, Adjusted R-squared: 0.4439
## F-statistic: 23.01 on 7 and 186 DF, p-value: < 2.2e-16
Based on AIC selection, our best model contains Class + OneLCUM + FGPA + AdvLegalPerf + Probation + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + BarPrepMentor.
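The idea behind an exhaustive search can be sketched in base R: fit every subset of the candidate predictors and rank the fits by AIC. This illustrates the principle only and is not MuMIn's implementation; note that dredge ranks by AICc unless a different `rank` is supplied. The data frame `toy` and the predictors `p1` to `p4` are hypothetical.

```r
# Simulated stand-in data (hypothetical; not the bar exam data set)
set.seed(4)
n <- 194
toy <- data.frame(p1 = rnorm(n), p2 = rnorm(n), p3 = rnorm(n), p4 = rnorm(n))
toy$y <- 120 + 5 * toy$p1 + 4 * toy$p2 + rnorm(n, sd = 8)  # p3, p4 are noise

candidates <- c("p1", "p2", "p3", "p4")

# Enumerate every non-empty subset of the candidate predictors
subsets <- unlist(
  lapply(seq_along(candidates),
         function(k) combn(candidates, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit each subset and record its AIC
aic_table <- data.frame(
  model = sapply(subsets, paste, collapse = " + "),
  k = lengths(subsets),
  AIC = sapply(subsets, function(s) {
    AIC(lm(reformulate(s, response = "y"), data = toy))
  })
)

# Rank by AIC and report the gap to the best model
aic_table <- aic_table[order(aic_table$AIC), ]
aic_table$delta <- aic_table$AIC - min(aic_table$AIC)

print(head(aic_table, 3))
```

Models whose delta is small (a frequently quoted guideline is below 2) are hard to distinguish on AIC alone, which is why several of the top-ranked fits are worth inspecting side by side.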
Next, we find the “best” subset of all possible models using ols_step_best_subset for the MBE response.
library(olsrr)
modelMBEsubset <- lm(MBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.fitMBEsubset<-ols_step_best_subset(modelMBEsubset)
summary.fitMBEsubset
## Best Subsets Regression
## ---------------------------------------------------------------------------------------------------------------------------------------------
## Model Index Predictors
## ---------------------------------------------------------------------------------------------------------------------------------------------
## 1 FGPA
## 2 Class FGPA
## 3 FGPA BarPrep PctBarPrepComplete
## 4 Class FGPA BarPrep PctBarPrepComplete
## 5 Class OneLCUM FGPA BarPrep PctBarPrepComplete
## 6 Class OneLCUM FGPA Probation BarPrep PctBarPrepComplete
## 7 Class OneLCUM FGPA Probation AdvLegalPerf BarPrep PctBarPrepComplete
## 8 Class OneLCUM FGPA Probation AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative
## 9 Class OneLCUM FGPA Probation AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 10 Class UGPA OneLCUM FGPA Probation AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 11 Class UGPA OneLCUM FGPA Probation LegalAnalysis AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## ---------------------------------------------------------------------------------------------------------------------------------------------
##
## Subsets Regression Summary
## ---------------------------------------------------------------------------------------------------------------------------------------
## Adj. Pred
## Model R-Square R-Square R-Square C(p) AIC SBIC SBC MSEP FPE HSP APC
## ---------------------------------------------------------------------------------------------------------------------------------------
## 1 0.3763 0.3730 0.3633 31.4546 1398.8232 847.7485 1408.6268 15060.7258 78.4329 0.4065 0.6367
## 2 0.3906 0.3842 0.3714 28.3656 1396.3132 845.1234 1409.3846 14792.0795 77.4248 0.4013 0.6286
## 3 0.4163 0.4071 0.392 21.2409 1389.9541 838.8867 1406.2934 14243.2228 74.9283 0.3884 0.6083
## 4 0.4308 0.4187 0.4004 18.0954 1387.0766 836.1247 1406.6838 13963.4666 73.8255 0.3828 0.5993
## 5 0.4452 0.4304 0.4084 14.9775 1384.1004 833.3704 1406.9754 13682.6323 72.7022 0.3771 0.5902
## 6 0.4594 0.4421 0.4168 11.9335 1381.0680 830.6703 1407.2109 13403.9419 71.5755 0.3714 0.5811
## 7 0.4691 0.4492 0.4189 10.4742 1379.5395 829.4588 1408.9503 13233.5038 71.0148 0.3687 0.5765
## 8 0.4756 0.4530 0.4141 10.1722 1379.1554 829.3606 1411.8340 13142.9138 70.8758 0.3681 0.5754
## 9 0.4817 0.4564 0.4157 10.0011 1378.8797 829.4125 1414.8262 13060.6219 70.7769 0.3678 0.5746
## 10 0.4873 0.4593 0.4159 10.0367 1378.7974 829.6865 1418.0117 12992.1828 70.7490 0.3679 0.5744
## 11 0.4874 0.4564 0.4082 12.0000 1380.7584 831.7839 1423.2405 13061.3310 71.4704 0.3719 0.5802
## ---------------------------------------------------------------------------------------------------------------------------------------
## AIC: Akaike Information Criteria
## SBIC: Sawa's Bayesian Information Criteria
## SBC: Schwarz Bayesian Criteria
## MSEP: Estimated error of prediction, assuming multivariate normality
## FPE: Final Prediction Error
## HSP: Hocking's Sp
## APC: Amemiya Prediction Criteria
plot(summary.fitMBEsubset)
modelbestMBE<-lm(MBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(modelbestMBE)
##
## Call:
## lm(formula = MBE ~ Class + UGPA + OneLCUM + FGPA + AdvLegalPerf +
## BarPrep + PctBarPrepComplete + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -35.966 -4.993 0.606 5.467 21.660
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 127.4675 4.5769 27.850 < 2e-16 ***
## Class -2.0346 0.6491 -3.134 0.00200 **
## UGPA -0.9410 0.6283 -1.498 0.13592
## OneLCUM 3.1007 1.2296 2.522 0.01253 *
## FGPA 5.6649 1.3293 4.262 3.24e-05 ***
## AdvLegalPerf 4.2860 2.4360 1.759 0.08016 .
## BarPrep 4.3612 1.3191 3.306 0.00114 **
## PctBarPrepComplete 2.0696 0.6561 3.154 0.00188 **
## StudentSuccessInitiative 3.8181 1.9344 1.974 0.04990 *
## BarPrepMentor 1.8402 1.4761 1.247 0.21410
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.257 on 184 degrees of freedom
## Multiple R-squared: 0.475, Adjusted R-squared: 0.4494
## F-statistic: 18.5 on 9 and 184 DF, p-value: < 2.2e-16
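We take the 9th subset model as the best: its AIC (1378.88) is within 0.1 of the lowest value in the table (1378.80, model 10) and it has the lowest C(p) (10.00). Its predictors are Class, OneLCUM, FGPA, Probation, AdvLegalPerf, BarPrep, PctBarPrepComplete, StudentSuccessInitiative and BarPrepMentor. Note that the modelbestMBE call above includes UGPA in place of Probation, so its R-squared (0.4750) differs from the 0.4817 listed for subset 9; the subset-9 fit itself is the one shown earlier as modelAICMBE992.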
Now we will regress on MEE
fit.MEE<-lm(MEE~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MEE)
##
## Call:
## lm(formula = MEE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf +
## AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.02832 -0.25106 -0.01273 0.29375 1.23924
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.3294561 0.4045347 8.230 4.14e-14 ***
## LSAT 0.0091621 0.0340085 0.269 0.787935
## UGPA -0.0230543 0.0340963 -0.676 0.499836
## Class -0.0079563 0.0369204 -0.215 0.829630
## CivPro 0.0008258 0.0335382 0.025 0.980384
## LP1 -0.0035320 0.0330432 -0.107 0.914997
## LP2 -0.0201943 0.0152271 -1.326 0.186501
## OneLCUM 0.0838415 0.0779104 1.076 0.283352
## FGPA 0.2848538 0.0728436 3.910 0.000132 ***
## Accom -0.1398260 0.1018037 -1.373 0.171358
## Probation 0.0604397 0.1403389 0.431 0.667239
## LegalAnalysis -0.3564462 0.1868219 -1.908 0.058035 .
## AdvLegalPerf -0.0153986 0.1253553 -0.123 0.902375
## AdvLegalAnalysis 0.0806325 0.0635222 1.269 0.205998
## BarPrep 0.2594135 0.0675821 3.838 0.000173 ***
## PctBarPrepComplete 0.0558548 0.0340832 1.639 0.103057
## NumPrepWorkshops -0.0157097 0.0358099 -0.439 0.661422
## StudentSuccessInitiative 0.1807875 0.1035129 1.747 0.082475 .
## BarPrepMentor 0.1535731 0.0795801 1.930 0.055250 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.415 on 175 degrees of freedom
## Multiple R-squared: 0.4233, Adjusted R-squared: 0.3639
## F-statistic: 7.135 on 18 and 175 DF, p-value: 1.81e-13
plot(fit.MEE)
Adequacy check:
The residuals are scattered randomly around zero against the fitted values and appear approximately normally distributed, so the model is adequate. No transformation is needed, and there are no influential points.
We will check the VIF values to assess multicollinearity.
vif(fit.MEE)
## LSAT UGPA Class
## 1.296326 1.303024 1.527821
## CivPro LP1 LP2
## 2.263784 1.955095 1.505454
## OneLCUM FGPA Accom
## 6.803458 5.947323 1.220199
## Probation LegalAnalysis AdvLegalPerf
## 2.051705 1.922441 1.106845
## AdvLegalAnalysis BarPrep PctBarPrepComplete
## 1.119117 1.281510 1.302029
## NumPrepWorkshops StudentSuccessInitiative BarPrepMentor
## 1.437288 2.247827 1.251218
OneLCUM and FGPA again have the highest VIF values (6.80 and 5.95), which suggests that they are strongly correlated with the other predictors. We therefore remove these two terms and regress again.
fit.MEE1<-lm(MEE~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MEE1)
##
## Call:
## lm(formula = MEE ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
## BarPrep + PctBarPrepComplete + NumPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.10486 -0.29226 -0.03909 0.31969 1.24177
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.982699 0.431031 6.920 7.98e-11 ***
## LSAT 0.050122 0.035563 1.409 0.160481
## UGPA 0.018156 0.036124 0.503 0.615881
## Class 0.010701 0.038529 0.278 0.781529
## CivPro 0.099100 0.029477 3.362 0.000948 ***
## LP1 0.060635 0.032359 1.874 0.062603 .
## LP2 -0.005876 0.016125 -0.364 0.715990
## Accom -0.109113 0.108135 -1.009 0.314331
## Probation -0.023065 0.148504 -0.155 0.876752
## LegalAnalysis -0.231417 0.201224 -1.150 0.251676
## AdvLegalPerf -0.080565 0.135389 -0.595 0.552563
## AdvLegalAnalysis 0.036085 0.067351 0.536 0.592792
## BarPrep 0.237314 0.072906 3.255 0.001358 **
## PctBarPrepComplete 0.076981 0.036713 2.097 0.037430 *
## NumPrepWorkshops -0.017667 0.038367 -0.460 0.645752
## StudentSuccessInitiative -0.149126 0.093033 -1.603 0.110732
## BarPrepMentor 0.102768 0.085480 1.202 0.230877
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4501 on 177 degrees of freedom
## Multiple R-squared: 0.3138, Adjusted R-squared: 0.2518
## F-statistic: 5.059 on 16 and 177 DF, p-value: 1.381e-08
vif(fit.MEE1)
## LSAT UGPA Class
## 1.205068 1.243400 1.414449
## CivPro LP1 LP2
## 1.486551 1.593903 1.435223
## Accom Probation LegalAnalysis
## 1.170321 1.953007 1.895951
## AdvLegalPerf AdvLegalAnalysis BarPrep
## 1.097585 1.069523 1.267820
## PctBarPrepComplete NumPrepWorkshops StudentSuccessInitiative
## 1.284262 1.402588 1.543548
## BarPrepMentor
## 1.227225
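Removing the two terms makes the fit clearly worse: R-squared drops from 0.4233 to 0.3138 (adjusted R-squared from 0.3639 to 0.2518), the residual standard error rises from 0.415 to 0.4501, and the p-value of the overall F-test rises from 1.81e-13 to 1.381e-08. It is therefore better to keep OneLCUM and FGPA in the model, together with the significant and near-significant predictors. Dropping the remaining terms has no significant impact on the fit.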
fit.MEE2<-lm(MEE~Class+OneLCUM+UGPA+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MEE2)
##
## Call:
## lm(formula = MEE ~ Class + OneLCUM + UGPA + FGPA + Probation +
## LegalAnalysis + AdvLegalPerf + BarPrep + PctBarPrepComplete +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.07631 -0.25627 -0.01437 0.26746 1.20848
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.27504 0.27091 12.089 < 2e-16 ***
## Class 0.01540 0.03256 0.473 0.636736
## OneLCUM 0.11241 0.06552 1.716 0.087919 .
## UGPA -0.02033 0.03164 -0.643 0.521352
## FGPA 0.23023 0.06710 3.431 0.000744 ***
## Probation 0.04286 0.13674 0.313 0.754339
## LegalAnalysis -0.33196 0.17472 -1.900 0.059024 .
## AdvLegalPerf -0.03825 0.12260 -0.312 0.755393
## BarPrep 0.27041 0.06598 4.098 6.26e-05 ***
## PctBarPrepComplete 0.06340 0.03286 1.929 0.055246 .
## StudentSuccessInitiative 0.13858 0.09822 1.411 0.159968
## BarPrepMentor 0.11944 0.07418 1.610 0.109102
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4125 on 182 degrees of freedom
## Multiple R-squared: 0.4072, Adjusted R-squared: 0.3714
## F-statistic: 11.37 on 11 and 182 DF, p-value: 5.786e-16
vif(fit.MEE2)
## Class OneLCUM UGPA
## 1.202470 4.868515 1.135282
## FGPA Probation LegalAnalysis
## 5.107159 1.971100 1.701459
## AdvLegalPerf BarPrep PctBarPrepComplete
## 1.071283 1.235934 1.224628
## StudentSuccessInitiative BarPrepMentor
## 2.047741 1.100122
The VIF values are now lower (at most 5.11, for FGPA), so multicollinearity is less of a concern. We keep FGPA because it is an important predictor in the analysis.
Now we select a model for MEE based on AIC.
fullmodelMEE<-lm(MEE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew,na.action = "na.fail")
summary.fit<-dredge(fullmodelMEE)
## Fixed term is "(Intercept)"
plot(summary.fit)
modelAICMEE759<-lm(MEE~OneLCUM+FGPA+BarPrep+LegalAnalysis+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.759<-summary(modelAICMEE759)
summary.759
##
## Call:
## lm(formula = MEE ~ OneLCUM + FGPA + BarPrep + LegalAnalysis +
## PctBarPrepComplete + StudentSuccessInitiative + BarPrepMentor,
## data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.0472 -0.2414 -0.0225 0.2657 1.1840
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.25526 0.22391 14.538 < 2e-16 ***
## OneLCUM 0.11858 0.05796 2.046 0.042176 *
## FGPA 0.21716 0.06192 3.507 0.000568 ***
## BarPrep 0.26881 0.06464 4.159 4.89e-05 ***
## LegalAnalysis -0.31238 0.14827 -2.107 0.036478 *
## PctBarPrepComplete 0.06306 0.03231 1.952 0.052463 .
## StudentSuccessInitiative 0.14327 0.09678 1.480 0.140455
## BarPrepMentor 0.12128 0.07261 1.670 0.096543 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.409 on 186 degrees of freedom
## Multiple R-squared: 0.4046, Adjusted R-squared: 0.3822
## F-statistic: 18.06 on 7 and 186 DF, p-value: < 2.2e-16
modelAICMEE247<-lm(MEE~OneLCUM+FGPA+BarPrep+LegalAnalysis+PctBarPrepComplete+BarPrepMentor,data=datanew)
summary.247<-summary(modelAICMEE247)
summary.247
##
## Call:
## lm(formula = MEE ~ OneLCUM + FGPA + BarPrep + LegalAnalysis +
## PctBarPrepComplete + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.04209 -0.23681 -0.01204 0.26660 1.27480
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.39617 0.20332 16.704 < 2e-16 ***
## OneLCUM 0.11694 0.05814 2.011 0.04571 *
## FGPA 0.17950 0.05663 3.170 0.00178 **
## BarPrep 0.27766 0.06457 4.300 2.74e-05 ***
## LegalAnalysis -0.27770 0.14688 -1.891 0.06021 .
## PctBarPrepComplete 0.06273 0.03241 1.935 0.05445 .
## BarPrepMentor 0.11132 0.07253 1.535 0.12648
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4103 on 187 degrees of freedom
## Multiple R-squared: 0.3976, Adjusted R-squared: 0.3783
## F-statistic: 20.57 on 6 and 187 DF, p-value: < 2.2e-16
modelAICMEE243<-lm(MEE~OneLCUM+FGPA+BarPrep+LegalAnalysis+PctBarPrepComplete,data=datanew)
summary.243<-summary(modelAICMEE243)
summary.243
##
## Call:
## lm(formula = MEE ~ OneLCUM + FGPA + BarPrep + LegalAnalysis +
## PctBarPrepComplete, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.01392 -0.23721 -0.01792 0.25315 1.24464
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.52804 0.18494 19.077 < 2e-16 ***
## OneLCUM 0.10917 0.05812 1.878 0.06189 .
## FGPA 0.18769 0.05658 3.317 0.00109 **
## BarPrep 0.29100 0.06421 4.532 1.04e-05 ***
## LegalAnalysis -0.29264 0.14708 -1.990 0.04808 *
## PctBarPrepComplete 0.07208 0.03195 2.256 0.02522 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4117 on 188 degrees of freedom
## Multiple R-squared: 0.39, Adjusted R-squared: 0.3738
## F-statistic: 24.04 on 5 and 188 DF, p-value: < 2.2e-16
Based on AIC, our best model contains OneLCUM + FGPA + BarPrep + LegalAnalysis + PctBarPrepComplete + StudentSuccessInitiative + BarPrepMentor.
We will find the “best” subset of all possible models using ols_step_best_subset for MEE
library(olsrr)
modelMEEsubset <- lm(MEE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.fitMEEsubset<-ols_step_best_subset(modelMEEsubset)
summary.fitMEEsubset
## Best Subsets Regression
## ---------------------------------------------------------------------------------------------------------------------------------------------
## Model Index Predictors
## ---------------------------------------------------------------------------------------------------------------------------------------------
## 1 FGPA
## 2 FGPA BarPrep
## 3 FGPA LegalAnalysis BarPrep
## 4 FGPA LegalAnalysis BarPrep PctBarPrepComplete
## 5 OneLCUM FGPA LegalAnalysis BarPrep PctBarPrepComplete
## 6 OneLCUM FGPA LegalAnalysis BarPrep PctBarPrepComplete BarPrepMentor
## 7 OneLCUM FGPA LegalAnalysis BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 8 UGPA OneLCUM FGPA LegalAnalysis BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 9 Class UGPA OneLCUM FGPA LegalAnalysis BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 10 Class UGPA OneLCUM FGPA Probation LegalAnalysis BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## 11 Class UGPA OneLCUM FGPA Probation LegalAnalysis AdvLegalPerf BarPrep PctBarPrepComplete StudentSuccessInitiative BarPrepMentor
## ---------------------------------------------------------------------------------------------------------------------------------------------
##
## Subsets Regression Summary
## ---------------------------------------------------------------------------------------------------------------------------------
## Adj. Pred
## Model R-Square R-Square R-Square C(p) AIC SBIC SBC MSEP FPE HSP APC
## ---------------------------------------------------------------------------------------------------------------------------------
## 1 0.2947 0.2910 0.2796 26.5553 234.3221 -316.6644 244.1256 37.2355 0.1939 0.0010 0.7200
## 2 0.3414 0.3345 0.3205 14.2068 223.0224 -327.7725 236.0938 34.9513 0.1829 9e-04 0.6793
## 3 0.3647 0.3547 0.3388 9.0575 218.0390 -332.5526 234.3783 33.8940 0.1783 9e-04 0.6620
## 4 0.3786 0.3654 0.3475 6.8090 215.7669 -334.6145 235.3741 33.3321 0.1762 9e-04 0.6543
## 5 0.3900 0.3738 0.3496 5.2943 214.1601 -335.9610 237.0351 32.8931 0.1748 9e-04 0.6489
## 6 0.3976 0.3783 0.3496 4.9639 213.7311 -336.1379 239.8740 32.6585 0.1744 9e-04 0.6475
## 7 0.4046 0.3822 0.3485 4.8098 213.4586 -336.1177 242.8693 32.4526 0.1742 9e-04 0.6466
## 8 0.4059 0.3803 0.3457 6.4003 215.0235 -334.3880 247.7021 32.5559 0.1756 9e-04 0.6519
## 9 0.4066 0.3775 0.3403 8.2110 216.8221 -332.4405 252.7685 32.6998 0.1772 9e-04 0.6579
## 10 0.4069 0.3745 0.3306 10.0973 218.7010 -330.4177 257.9153 32.8590 0.1789 9e-04 0.6644
## 11 0.4072 0.3714 0.3255 12.0000 220.5973 -328.3772 263.0794 33.0228 0.1807 9e-04 0.6709
## ---------------------------------------------------------------------------------------------------------------------------------
## AIC: Akaike Information Criteria
## SBIC: Sawa's Bayesian Information Criteria
## SBC: Schwarz Bayesian Criteria
## MSEP: Estimated error of prediction, assuming multivariate normality
## FPE: Final Prediction Error
## HSP: Hocking's Sp
## APC: Amemiya Prediction Criteria
plot(summary.fitMEEsubset)
modelbestMEE<-lm(MEE~OneLCUM+FGPA+LegalAnalysis+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(modelbestMEE)
##
## Call:
## lm(formula = MEE ~ OneLCUM + FGPA + LegalAnalysis + BarPrep +
## PctBarPrepComplete + StudentSuccessInitiative + BarPrepMentor,
## data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.0472 -0.2414 -0.0225 0.2657 1.1840
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.25526 0.22391 14.538 < 2e-16 ***
## OneLCUM 0.11858 0.05796 2.046 0.042176 *
## FGPA 0.21716 0.06192 3.507 0.000568 ***
## LegalAnalysis -0.31238 0.14827 -2.107 0.036478 *
## BarPrep 0.26881 0.06464 4.159 4.89e-05 ***
## PctBarPrepComplete 0.06306 0.03231 1.952 0.052463 .
## StudentSuccessInitiative 0.14327 0.09678 1.480 0.140455
## BarPrepMentor 0.12128 0.07261 1.670 0.096543 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.409 on 186 degrees of freedom
## Multiple R-squared: 0.4046, Adjusted R-squared: 0.3822
## F-statistic: 18.06 on 7 and 186 DF, p-value: < 2.2e-16
The best subset for the MEE response is model 7, which has the lowest AIC (213.46) and the lowest C(p) (4.81): OneLCUM + FGPA + LegalAnalysis + BarPrep + PctBarPrepComplete + StudentSuccessInitiative + BarPrepMentor.
Now we will regress on MPT.
fit.MPT<-lm(MPT~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MPT)
##
## Call:
## lm(formula = MPT ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf +
## AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7329 -0.5170 0.0261 0.5329 1.5204
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9821967 0.6980346 4.272 3.17e-05 ***
## LSAT 0.0531737 0.0586826 0.906 0.3661
## UGPA 0.0284974 0.0588340 0.484 0.6287
## Class 0.0011638 0.0637071 0.018 0.9854
## CivPro 0.1129540 0.0578710 1.952 0.0526 .
## LP1 -0.0694845 0.0570168 -1.219 0.2246
## LP2 0.0003579 0.0262748 0.014 0.9891
## OneLCUM 0.0061277 0.1344364 0.046 0.9637
## FGPA 0.1333364 0.1256935 1.061 0.2902
## Accom 0.0994589 0.1756648 0.566 0.5720
## Probation 0.0178081 0.2421583 0.074 0.9415
## LegalAnalysis -0.0196917 0.3223658 -0.061 0.9514
## AdvLegalPerf 0.2276680 0.2163036 1.053 0.2940
## AdvLegalAnalysis -0.0058959 0.1096091 -0.054 0.9572
## BarPrep 0.1050084 0.1166146 0.900 0.3691
## PctBarPrepComplete 0.1273084 0.0588115 2.165 0.0318 *
## NumPrepWorkshops -0.0193981 0.0617908 -0.314 0.7539
## StudentSuccessInitiative -0.0656857 0.1786141 -0.368 0.7135
## BarPrepMentor 0.0608127 0.1373175 0.443 0.6584
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.716 on 175 degrees of freedom
## Multiple R-squared: 0.1712, Adjusted R-squared: 0.08593
## F-statistic: 2.008 on 18 and 175 DF, p-value: 0.01149
plot(fit.MPT)
The model does not seem adequate: the residuals are not randomly scattered against the fitted values and are not normally distributed. We therefore apply a Box-Cox transformation to the response.
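Before committing to a transformed response, it helps to look at an approximate confidence interval for the Box-Cox parameter rather than only the maximizing value. The sketch below computes the profile log-likelihood by hand in base R on simulated data; `toy` and the helper `boxcox_loglik` are hypothetical names, and the interval uses the usual chi-squared approximation.

```r
# Simulated stand-in data (hypothetical; not the bar exam data set)
set.seed(5)
n <- 194
toy <- data.frame(x = rnorm(n))
toy$y <- exp(1 + 0.8 * toy$x + rnorm(n, sd = 0.3))  # positive response

# Profile log-likelihood of the Box-Cox parameter for a linear model
boxcox_loglik <- function(lambda, formula, data) {
  y <- model.response(model.frame(formula, data))
  yt <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  mm <- model.matrix(formula, data)
  rss <- sum(lm.fit(mm, yt)$residuals^2)
  -length(y) / 2 * log(rss / length(y)) + (lambda - 1) * sum(log(y))
}

grid <- seq(-2, 2, length.out = 100)
ll <- sapply(grid, boxcox_loglik, formula = y ~ x, data = toy)

lambda_hat <- grid[which.max(ll)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
ci <- range(grid[ll > max(ll) - qchisq(0.95, 1) / 2])

print(lambda_hat)
print(ci)
```

If the interval contains 1, the untransformed response is compatible with the data and the transformation may bring little benefit; whether that is the case for the MPT model is not shown in this report.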
b<-boxcox(fit.MPT)
lambda<-b$x
likelihood<-b$y
q<-lambda[which.max(likelihood)]
q
## [1] 1.232323
newMPT<-datanew$MPT^1.232323
newMPT
## [1] 3.872287 3.093076 2.349450 2.349450 3.872287 2.349450 3.093076 4.682390
## [9] 3.872287 3.093076 3.093076 5.519915 3.872287 4.682390 3.872287 3.872287
## [17] 4.682390 5.519915 6.382176 3.872287 4.682390 7.267028 3.093076 6.382176
## [25] 4.682390 6.382176 5.519915 4.682390 3.872287 5.519915 3.872287 4.682390
## [33] 3.872287 6.382176 6.382176 6.382176 6.382176 4.682390 7.267028 7.267028
## [41] 4.682390 5.519915 6.382176 6.382176 6.382176 5.519915 6.382176 3.872287
## [49] 5.519915 3.872287 3.872287 6.382176 6.382176 8.172708 7.267028 6.382176
## [57] 5.519915 4.682390 5.519915 5.519915 3.093076 3.872287 6.382176 5.519915
## [65] 5.519915 3.872287 6.382176 3.872287 4.682390 3.093076 5.519915 7.267028
## [73] 5.519915 4.682390 5.519915 7.267028 3.093076 3.872287 6.382176 3.872287
## [81] 6.382176 3.872287 4.682390 5.519915 6.382176 4.682390 3.093076 6.382176
## [89] 6.382176 6.382176 4.682390 5.519915 3.872287 3.872287 3.872287 3.093076
## [97] 4.682390 3.093076 3.093076 3.872287 3.093076 6.382176 4.682390 5.519915
## [105] 6.382176 5.519915 4.682390 6.382176 3.093076 4.682390 5.519915 6.382176
## [113] 6.382176 6.382176 5.519915 3.872287 5.519915 4.682390 7.267028 3.093076
## [121] 5.519915 5.519915 5.519915 3.872287 4.682390 5.519915 6.382176 5.519915
## [129] 4.682390 6.382176 5.519915 4.682390 5.519915 7.267028 3.872287 3.872287
## [137] 6.382176 5.519915 5.519915 5.519915 6.382176 3.093076 4.682390 5.519915
## [145] 4.682390 4.682390 4.682390 6.382176 6.382176 5.519915 3.093076 5.519915
## [153] 6.382176 4.682390 3.872287 6.382176 3.872287 3.872287 5.519915 6.382176
## [161] 3.872287 3.093076 4.682390 3.872287 5.519915 6.382176 5.519915 4.682390
## [169] 3.093076 4.682390 4.682390 3.872287 5.519915 5.519915 6.382176 4.682390
## [177] 4.682390 5.519915 7.267028 5.519915 6.382176 4.682390 6.382176 3.093076
## [185] 5.519915 5.519915 5.519915 7.267028 4.682390 5.519915 5.519915 6.382176
## [193] 3.872287 3.093076
fit.MPT<-lm(newMPT~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MPT)
##
## Call:
## lm(formula = newMPT ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf +
## AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.75435 -0.87171 0.02102 0.88294 2.65838
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.8428980 1.1587802 3.316 0.00111 **
## LSAT 0.0894566 0.0974166 0.918 0.35973
## UGPA 0.0412026 0.0976680 0.422 0.67364
## Class 0.0084816 0.1057578 0.080 0.93617
## CivPro 0.1857383 0.0960694 1.933 0.05480 .
## LP1 -0.1182039 0.0946514 -1.249 0.21339
## LP2 0.0024190 0.0436178 0.055 0.95584
## OneLCUM 0.0098561 0.2231726 0.044 0.96482
## FGPA 0.2296870 0.2086588 1.101 0.27251
## Accom 0.1704424 0.2916144 0.584 0.55965
## Probation 0.0319370 0.4019975 0.079 0.93677
## LegalAnalysis -0.0372415 0.5351470 -0.070 0.94460
## AdvLegalPerf 0.3594853 0.3590773 1.001 0.31814
## AdvLegalAnalysis -0.0001692 0.1819578 -0.001 0.99926
## BarPrep 0.1778639 0.1935874 0.919 0.35948
## PctBarPrepComplete 0.2097745 0.0976307 2.149 0.03304 *
## NumPrepWorkshops -0.0352871 0.1025765 -0.344 0.73125
## StudentSuccessInitiative -0.0945023 0.2965104 -0.319 0.75032
## BarPrepMentor 0.1021870 0.2279555 0.448 0.65451
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.189 on 175 degrees of freedom
## Multiple R-squared: 0.1707, Adjusted R-squared: 0.08536
## F-statistic: 2.001 on 18 and 175 DF, p-value: 0.01187
plot(fit.MPT)
After the Box-Cox transformation, we check the VIF of each predictor.
vif(fit.MPT)
## LSAT UGPA Class
## 1.296326 1.303024 1.527821
## CivPro LP1 LP2
## 2.263784 1.955095 1.505454
## OneLCUM FGPA Accom
## 6.803458 5.947323 1.220199
## Probation LegalAnalysis AdvLegalPerf
## 2.051705 1.922441 1.106845
## AdvLegalAnalysis BarPrep PctBarPrepComplete
## 1.119117 1.281510 1.302029
## NumPrepWorkshops StudentSuccessInitiative BarPrepMentor
## 1.437288 2.247827 1.251218
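For reference, the VIF of a numeric predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below reproduces this by hand on simulated data; the variables `x1`, `x2` and `x3` are hypothetical, with `x3` built from the other two in the same way that OneLCUM is a cumulative value of other grade variables.

```r
set.seed(42)
n <- 200
# Hypothetical predictors: x3 is built from x1 and x2, so it is collinear with both
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.5)
X <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the others
vif_manual <- sapply(names(X), function(v) {
  f <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})

# Equivalent shortcut: the diagonal of the inverse correlation matrix
vif_from_cor <- diag(solve(cor(X)))

round(vif_manual, 3)
round(vif_from_cor, 3)
```

A predictor that is largely a combination of others gets a large VIF, which is why dropping it usually lowers the VIF of the remaining predictors as well.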
Looking at the VIF values, OneLCUM (6.80) and FGPA (5.95) are the highest, so we refit the model without those two predictors.
fit.MPT1<-lm(newMPT~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MPT1)
##
## Call:
## lm(formula = newMPT ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## Accom + Probation + LegalAnalysis + AdvLegalPerf + AdvLegalAnalysis +
## BarPrep + PctBarPrepComplete + NumPrepWorkshops + StudentSuccessInitiative +
## BarPrepMentor, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.91286 -0.88023 -0.03875 0.89349 2.74294
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.632655 1.138282 3.191 0.00168 **
## LSAT 0.114191 0.093917 1.216 0.22565
## UGPA 0.070486 0.095399 0.739 0.46097
## Class 0.015375 0.101749 0.151 0.88007
## CivPro 0.247318 0.077843 3.177 0.00176 **
## LP1 -0.079523 0.085454 -0.931 0.35333
## LP2 0.013630 0.042584 0.320 0.74930
## Accom 0.204038 0.285566 0.715 0.47586
## Probation -0.008588 0.392174 -0.022 0.98255
## LegalAnalysis 0.046877 0.531399 0.088 0.92981
## AdvLegalPerf 0.315784 0.357540 0.883 0.37832
## AdvLegalAnalysis -0.037681 0.177864 -0.212 0.83246
## BarPrep 0.166295 0.192533 0.864 0.38891
## PctBarPrepComplete 0.225298 0.096954 2.324 0.02127 *
## NumPrepWorkshops -0.033160 0.101322 -0.327 0.74385
## StudentSuccessInitiative -0.327334 0.245685 -1.332 0.18446
## BarPrepMentor 0.071330 0.225739 0.316 0.75239
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.189 on 177 degrees of freedom
## Multiple R-squared: 0.1613, Adjusted R-squared: 0.08553
## F-statistic: 2.128 on 16 and 177 DF, p-value: 0.009011
vif(fit.MPT1)
## LSAT UGPA Class
## 1.205068 1.243400 1.414449
## CivPro LP1 LP2
## 1.486551 1.593903 1.435223
## Accom Probation LegalAnalysis
## 1.170321 1.953007 1.895951
## AdvLegalPerf AdvLegalAnalysis BarPrep
## 1.097585 1.069523 1.267820
## PctBarPrepComplete NumPrepWorkshops StudentSuccessInitiative
## 1.284262 1.402588 1.543548
## BarPrepMentor
## 1.227225
After removing them, the VIF decreased for every remaining predictor (all are now below 2). However, only CivPro and PctBarPrepComplete are significant at the 0.05 level, so we proceed with forward stepwise regression.
library(MuMIn)
modelMPTforw<-lm(newMPT~1,data=datanew)
formula(modelMPTforw)
## newMPT ~ 1
step(modelMPTforw,scope=~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew,direction="forward")
## Start: AIC=85.36
## newMPT ~ 1
##
## Df Sum of Sq RSS AIC
## + CivPro 1 31.0904 267.05 65.996
## + StudentSuccessInitiative 1 16.5587 281.58 76.276
## + PctBarPrepComplete 1 10.4621 287.68 80.431
## + LegalAnalysis 1 6.0437 292.09 83.388
## + Probation 1 5.6812 292.46 83.629
## + LSAT 1 4.8057 293.33 84.209
## <none> 298.14 85.361
## + BarPrepMentor 1 2.3761 295.76 85.809
## + LP1 1 1.9281 296.21 86.103
## + NumPrepWorkshops 1 1.9144 296.22 86.112
## + LP2 1 1.2580 296.88 86.541
## + BarPrep 1 0.9414 297.20 86.748
## + AdvLegalAnalysis 1 0.8318 297.31 86.819
## + UGPA 1 0.8064 297.33 86.836
## + Accom 1 0.4377 297.70 87.076
## + AdvLegalPerf 1 0.3441 297.79 87.137
## + Class 1 0.0017 298.14 87.360
##
## Step: AIC=66
## newMPT ~ CivPro
##
## Df Sum of Sq RSS AIC
## + PctBarPrepComplete 1 6.7908 260.26 62.999
## + StudentSuccessInitiative 1 3.7275 263.32 65.269
## <none> 267.05 65.996
## + BarPrepMentor 1 1.2418 265.81 67.092
## + LSAT 1 1.1996 265.85 67.123
## + AdvLegalPerf 1 0.7973 266.25 67.416
## + LP1 1 0.7114 266.34 67.479
## + Accom 1 0.4833 266.56 67.645
## + UGPA 1 0.2322 266.81 67.828
## + LegalAnalysis 1 0.2036 266.84 67.848
## + NumPrepWorkshops 1 0.1577 266.89 67.882
## + Probation 1 0.0459 267.00 67.963
## + LP2 1 0.0263 267.02 67.977
## + AdvLegalAnalysis 1 0.0139 267.03 67.986
## + BarPrep 1 0.0104 267.04 67.989
## + Class 1 0.0000 267.05 67.996
##
## Step: AIC=63
## newMPT ~ CivPro + PctBarPrepComplete
##
## Df Sum of Sq RSS AIC
## <none> 260.26 62.999
## + StudentSuccessInitiative 1 2.50802 257.75 63.121
## + LSAT 1 2.32141 257.94 63.261
## + AdvLegalPerf 1 1.46760 258.79 63.902
## + LP1 1 0.81213 259.44 64.393
## + BarPrep 1 0.61718 259.64 64.539
## + Accom 1 0.60573 259.65 64.547
## + BarPrepMentor 1 0.47851 259.78 64.642
## + NumPrepWorkshops 1 0.45469 259.80 64.660
## + UGPA 1 0.09217 260.17 64.931
## + AdvLegalAnalysis 1 0.05160 260.20 64.961
## + LegalAnalysis 1 0.03836 260.22 64.971
## + LP2 1 0.03047 260.23 64.977
## + Class 1 0.02621 260.23 64.980
## + Probation 1 0.00665 260.25 64.994
##
## Call:
## lm(formula = newMPT ~ CivPro + PctBarPrepComplete, data = datanew)
##
## Coefficients:
## (Intercept) CivPro PctBarPrepComplete
## 3.5830 0.2832 0.1889
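The AIC values in the trace above can be checked by hand. For `lm` fits, `step()` relies on `extractAIC()`, which computes n * log(RSS / n) + 2 * (number of coefficients). The sketch below plugs in the RSS values as printed in the trace; the helper name `aic_step` is introduced here for illustration only, and n = 194 is inferred from the residual degrees of freedom reported earlier.

```r
# step() ranks lm fits with extractAIC(): n * log(RSS / n) + 2 * n_coef
# (aic_step is a hypothetical helper name, not part of any package)
aic_step <- function(rss, n, n_coef) n * log(rss / n) + 2 * n_coef

n_obs <- 194  # 175 residual df + 19 coefficients in the full model

# RSS values copied from the step() trace (rounded as printed there)
aic_null   <- aic_step(298.14, n_obs, 1)  # newMPT ~ 1
aic_civpro <- aic_step(267.05, n_obs, 2)  # newMPT ~ CivPro
aic_final  <- aic_step(260.26, n_obs, 3)  # newMPT ~ CivPro + PctBarPrepComplete

round(c(null = aic_null, civpro = aic_civpro, final = aic_final), 2)
```

Because the RSS values are rounded in the printed trace, the results agree with the reported AIC values only to about two decimal places. Note that this is the `extractAIC()` scale, which differs from `AIC()` by an additive constant for a given data set.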
After forward stepwise regression for MPT, the model with the lowest AIC (63.00) is:
modelMPTforw<-lm(newMPT ~ CivPro + PctBarPrepComplete, data = datanew)
summary(modelMPTforw)
##
## Call:
## lm(formula = newMPT ~ CivPro + PctBarPrepComplete, data = datanew)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.93817 -0.86198 0.00356 0.88741 2.75210
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.58301 0.33855 10.583 < 2e-16 ***
## CivPro 0.28319 0.06313 4.486 1.25e-05 ***
## PctBarPrepComplete 0.18885 0.08459 2.232 0.0267 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.167 on 191 degrees of freedom
## Multiple R-squared: 0.1271, Adjusted R-squared: 0.1179
## F-statistic: 13.9 on 2 and 191 DF, p-value: 2.312e-06
So, for MPT the significant predictors are CivPro and PctBarPrepComplete, although the model explains only about 13% of the variance (multiple R-squared 0.1271).
Logistic regression with PASS as the response
datanew2<-data.frame(scale(data[,c(1,2,3,7,8,15,16)],center=TRUE,scale=TRUE),data[,c(4,5,6,9,10,11,12,13,14,17,18,19,20,21,22,23,24)])
view(datanew2)
mod<-glm(PASS~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew2,family=binomial(link="logit"))
summary(mod)
##
## Call:
## glm(formula = PASS ~ LSAT + UGPA + Class + CivPro + LP1 + LP2 +
## OneLCUM + FGPA + Accom + Probation + LegalAnalysis + AdvLegalPerf +
## AdvLegalAnalysis + BarPrep + PctBarPrepComplete + NumPrepWorkshops +
## StudentSuccessInitiative + BarPrepMentor, family = binomial(link = "logit"),
## data = datanew2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.42283 0.02875 0.10340 0.31007 1.84562
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.2416 3.8162 -0.325 0.74491
## LSAT 0.6671 0.4563 1.462 0.14374
## UGPA 0.3206 0.4246 0.755 0.45024
## Class -1.3771 0.4327 -3.182 0.00146 **
## CivPro 0.4698 0.3383 1.389 0.16489
## LP1 -0.2596 0.3434 -0.756 0.44968
## LP2 -0.4393 0.2072 -2.120 0.03403 *
## OneLCUM 1.4609 0.8364 1.747 0.08069 .
## FGPA 0.6312 0.6692 0.943 0.34557
## Accom -1.2808 0.9934 -1.289 0.19728
## Probation 0.5020 1.2510 0.401 0.68825
## LegalAnalysis -0.1482 1.2999 -0.114 0.90925
## AdvLegalPerf 1.6165 1.5084 1.072 0.28385
## AdvLegalAnalysis 0.4194 0.7539 0.556 0.57796
## BarPrep 2.0040 0.8346 2.401 0.01635 *
## PctBarPrepComplete 0.5490 0.3101 1.771 0.07662 .
## NumPrepWorkshops -0.1779 0.3965 -0.449 0.65371
## StudentSuccessInitiative -0.5428 1.0153 -0.535 0.59288
## BarPrepMentor 2.2938 1.2897 1.779 0.07530 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 145.211 on 193 degrees of freedom
## Residual deviance: 73.265 on 175 degrees of freedom
## AIC: 111.26
##
## Number of Fisher Scoring iterations: 7
In the logistic regression, the variables significant at the 0.05 level are Class, LP2 and BarPrep. OneLCUM, PctBarPrepComplete and BarPrepMentor are significant only at the 0.1 level.
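The logistic coefficients are on the log-odds scale, so exponentiating them gives odds ratios that are easier to interpret. The sketch below uses the estimates and deviances copied from the `summary(mod)` output above; each odds ratio is per one-unit increase in the predictor as coded in `datanew2` (some columns are standardized, so for those a unit is one standard deviation).

```r
# Coefficients copied from the summary(mod) output (log-odds scale)
log_odds <- c(Class = -1.3771, LP2 = -0.4393, BarPrep = 2.0040)

# exp() turns a log-odds coefficient into a multiplicative change in the odds of passing
odds_ratio <- exp(log_odds)
round(odds_ratio, 3)

# Overall likelihood-ratio test: drop from null to residual deviance on 193 - 175 = 18 df
lr_stat <- 145.211 - 73.265
lr_p <- pchisq(lr_stat, df = 193 - 175, lower.tail = FALSE)
lr_stat
lr_p
```

An odds ratio above 1 (BarPrep) means higher odds of passing as the predictor increases, while a ratio below 1 (Class, LP2) means lower odds, holding the other predictors fixed.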
Discussion:
Combining the significant variables from the best models, the bar school should focus on Class, UGPA, OneLCUM, FGPA, AdvLegalPerf, BarPrep, PctBarPrepComplete, StudentSuccessInitiative, BarPrepMentor, LegalAnalysis and CivPro. These were identified by regressing UBE and also its components MBE, MEE and MPT, since those three scores are combined to determine the UBE score.
Unevaluated code:
library(ggplot2)
library(tidyverse)
library(readxl)
library(dplyr)
library(tidyr)
library(purrr)
library(MASS)
library(car)
sheet1<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2022Fail")
sheet1
sheet2<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2022Pass")
sheet2
sheet3<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2021Fail")
sheet3
sheet4<-read_excel("C:/Users/sampa/OneDrive - Texas Tech University/Courses/Statistical Data Analysis/Project/BarDataSet.xlsx", sheet = "2021Pass")
sheet4
sheet1[sheet1=='NA']<-NA
sheet2[sheet2=='NA']<-NA
sheet3[sheet3=='NA']<-NA
sheet4[sheet4=='NA']<-NA
sheet1<-na.omit(sheet1)
sheet2<-na.omit(sheet2)
sheet3<-na.omit(sheet3)
sheet4<-na.omit(sheet4)
view(sheet1)
view(sheet2)
view(sheet3)
view(sheet4)
data<-rbind(sheet1,sheet2,sheet3,sheet4)
View(data)
str(data)
summary(data)
data$CivPro<-factor(data$CivPro,c("D","D+","C","C+","B","B+","A"),ordered=TRUE)
data$LP1<-factor(data$LP1,c("D","D+","C","C+","B","B+","A"),ordered=TRUE)
data$LP2<-factor(data$LP2,c("CR","D","D+","C","C+","B","B+","A"),ordered=TRUE)
data$Accom<-as.factor(data$Accom)
data$Probation<-as.factor(data$Probation)
data$LegalAnalysis<-as.factor(data$LegalAnalysis)
data$AdvLegalPerf<-as.factor(data$AdvLegalPerf)
data$AdvLegalAnalysis<-as.factor(data$AdvLegalAnalysis)
data$BarPrep<-as.factor(data$BarPrep)
data$PctBarPrepComplete<-as.numeric(data$PctBarPrepComplete)
data$NumPrepWorkshops<-factor(data$NumPrepWorkshops,c("0","1","2","3","4","5"),ordered=TRUE)
data$StudentSuccessInitiative<-as.factor(data$StudentSuccessInitiative)
data$BarPrepMentor<-as.factor(data$BarPrepMentor)
data<-data[-80,]
view(data)
data$MPRE<-as.numeric(data$MPRE)
data[,1:23]<-sapply(data[,1:23],as.numeric)
str(data)
view(data)
datanew<-data.frame(scale(data[,c(1,2,3,7,8,15,16)],center=TRUE,scale=TRUE),data[,c(4,5,6,9,10,11,12,13,14,17,18,19,20,21,22,23)])
view(datanew)
fit.UBE<-lm(UBE~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.UBE)
plot(fit.UBE)
vif(fit.UBE)
fit.UBE1<-lm(UBE~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.UBE1)
vif(fit.UBE1)
fit.UBE2<-lm(UBE~Class+OneLCUM+UGPA+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.UBE2)
vif(fit.UBE2)
library(MuMIn)
fullmodelUBE<-lm(UBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew,na.action = "na.fail")
summary.fitdredgeUBE<-dredge(fullmodelUBE)
## Fixed term is "(Intercept)"
plot(summary.fitdredgeUBE)
modelAICUBE736<-lm(UBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.736<-summary(modelAICUBE736)
summary.736
modelAICUBE224<-lm(UBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+BarPrepMentor,data=datanew)
summary.224<-summary(modelAICUBE224)
summary.224
modelAICUBE223<-lm(UBE~Class+UGPA+OneLCUM+FGPA+BarPrep+PctBarPrepComplete+BarPrepMentor,data=datanew)
summary.223<-summary(modelAICUBE223)
summary.223
library(olsrr)
modelUBEsubset <- lm(UBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.fitUBEsubset<-ols_step_best_subset(modelUBEsubset)
summary.fitUBEsubset
plot(summary.fitUBEsubset)
modelbestUBE<-lm(UBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(modelbestUBE)
fit.MBE<-lm(MBE~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MBE)
plot(fit.MBE)
vif(fit.MBE)
fit.MBE1<-lm(MBE~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MBE1)
vif(fit.MBE1)
fit.MBE2<-lm(MBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MBE2)
vif(fit.MBE2)
library(MuMIn)
fullmodelMBE<-lm(MBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew,na.action = "na.fail")
summary.fit1MBE<-dredge(fullmodelMBE)
## Fixed term is "(Intercept)"
plot(summary.fit1MBE)
modelAICMBE992<-lm(MBE~Class+OneLCUM+FGPA+AdvLegalPerf+Probation+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.992<-summary(modelAICMBE992)
summary.992
modelAICMBE988<-lm(MBE~Class+OneLCUM+FGPA+AdvLegalPerf+BarPrep+Probation+PctBarPrepComplete+StudentSuccessInitiative,data=datanew)
summary.988<-summary(modelAICMBE988)
summary.988
modelAICMBE476<-lm(MBE~Class+OneLCUM+FGPA+BarPrep+PctBarPrepComplete+BarPrepMentor+Probation,data=datanew)
summary.476<-summary(modelAICMBE476)
summary.476
library(olsrr)
modelMBEsubset <- lm(MBE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.fitMBEsubset<-ols_step_best_subset(modelMBEsubset)
summary.fitMBEsubset
plot(summary.fitMBEsubset)
modelbestMBE<-lm(MBE~Class+UGPA+OneLCUM+FGPA+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(modelbestMBE)
fit.MEE<-lm(MEE~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MEE)
plot(fit.MEE)
vif(fit.MEE)
fit.MEE1<-lm(MEE~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MEE1)
vif(fit.MEE1)
fit.MEE2<-lm(MEE~Class+OneLCUM+UGPA+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MEE2)
vif(fit.MEE2)
fullmodelMEE<-lm(MEE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew,na.action = "na.fail")
summary.fit<-dredge(fullmodelMEE)
## Fixed term is "(Intercept)"
plot(summary.fit)
modelAICMEE759<-lm(MEE~OneLCUM+FGPA+BarPrep+LegalAnalysis+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.759<-summary(modelAICMEE759)
summary.759
modelAICMEE247<-lm(MEE~OneLCUM+FGPA+BarPrep+LegalAnalysis+PctBarPrepComplete+BarPrepMentor,data=datanew)
summary.247<-summary(modelAICMEE247)
summary.247
modelAICMEE243<-lm(MEE~OneLCUM+FGPA+BarPrep+LegalAnalysis+PctBarPrepComplete,data=datanew)
summary.243<-summary(modelAICMEE243)
summary.243
library(olsrr)
modelMEEsubset <- lm(MEE~Class+UGPA+OneLCUM+FGPA+Probation+LegalAnalysis+AdvLegalPerf+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary.fitMEEsubset<-ols_step_best_subset(modelMEEsubset)
summary.fitMEEsubset
plot(summary.fitMEEsubset)
modelbestMEE<-lm(MEE~OneLCUM+FGPA+LegalAnalysis+BarPrep+PctBarPrepComplete+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(modelbestMEE)
fit.MPT<-lm(MPT~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MPT)
plot(fit.MPT)
b<-boxcox(fit.MPT)
lambda<-b$x
likelihood<-b$y
q<-lambda[which.max(likelihood)]
q
newMPT<-datanew$MPT^q
newMPT
fit.MPT<-lm(newMPT~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MPT)
plot(fit.MPT)
vif(fit.MPT)
fit.MPT1<-lm(newMPT~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew)
summary(fit.MPT1)
vif(fit.MPT1)
library(MuMIn)
modelMPTforw<-lm(newMPT~1,data=datanew)
formula(modelMPTforw)
step(modelMPTforw,scope=~LSAT+UGPA+Class+CivPro+LP1+LP2+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew,direction="forward")
modelMPTforw<-lm(newMPT ~ CivPro + PctBarPrepComplete, data = datanew)
summary(modelMPTforw)
datanew2<-data.frame(scale(data[,c(1,2,3,7,8,15,16)],center=TRUE,scale=TRUE),data[,c(4,5,6,9,10,11,12,13,14,17,18,19,20,21,22,23,24)])
view(datanew2)
mod<-glm(PASS~LSAT+UGPA+Class+CivPro+LP1+LP2+OneLCUM+FGPA+Accom+Probation+LegalAnalysis+AdvLegalPerf+AdvLegalAnalysis+BarPrep+PctBarPrepComplete+NumPrepWorkshops+StudentSuccessInitiative+BarPrepMentor,data=datanew2,family=binomial(link="logit"))
summary(mod)