Introduction

In this assignment, I predict the probability of admission into a master’s program for undergraduate students. The dataset was taken from kaggle.com. It includes factors that might explain what helps a student increase his or her chance of admission into a master’s program.

I created a new binary variable “Admission” with value “1” if the chance of admission is equal to or greater than 80%, and “0” otherwise. I use this variable as my outcome variable and use the student’s GRE score, CGPA, research experience, and strength of statement of purpose as predictor variables.
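
The recoding described above could be done along the following lines. This is a minimal sketch on a made-up data frame: `toy` and its values are hypothetical, and it assumes the chance of admission is stored as a proportion between 0 and 1 in a column named ChanceofAdmit (the column name appears in the data preview; the scale is an assumption).

```r
# Hypothetical stand-in for the admissions data; values are made up.
toy <- data.frame(ChanceofAdmit = c(0.92, 0.76, 0.80, 0.65))

# Admission is 1 when the chance of admission is at least 80%, else 0.
toy$Admission <- as.integer(toy$ChanceofAdmit >= 0.80)

print(toy)
```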

Importing Necessary Packages

library(Zelig)
library(texreg)
## Warning: package 'texreg' was built under R version 3.5.3
library(mvtnorm)
library(car)
library(sjmisc)
## Warning: package 'sjmisc' was built under R version 3.5.3
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.5.3
library(readr)
MastersAdmission1 <- read_csv("C:/Users/Nusrat/Desktop/MA - 3rd semester, Spring 19/SOC 712 - Advanced Analytics (R)/Assignment 6 - Zelig 5/MastersAdmission1.csv")
head(MastersAdmission1)
## # A tibble: 6 x 12
##   `Serial No.` GREScore GREgrp TOEFLScore `University Rat~   SOP   LOR
##          <dbl>    <dbl>  <dbl>      <dbl>            <dbl> <dbl> <dbl>
## 1            1      337      1        118                4   4.5   4.5
## 2            2      324      1        107                4   4     4.5
## 3            3      316      0        104                3   3     3.5
## 4            4      322      1        110                3   3.5   2.5
## 5            5      314      0        103                2   2     3  
## 6            6      330      1        115                5   4.5   3  
## # ... with 5 more variables: CGPA <dbl>, CGPAgrp <dbl>, Research <dbl>,
## #   ChanceofAdmit <dbl>, Admission <dbl>

Multivariate Logit Model

z5.2 <- zlogit$new()
z5.2$zelig(Admission ~ SOP + GREScore*CGPA + Research, data = MastersAdmission1)
summary(z5.2)
## Model: 
## 
## Call:
## z5.2$zelig(formula = Admission ~ SOP + GREScore * CGPA + Research, 
##     data = MastersAdmission1)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.26595  -0.04739  -0.00519  -0.00016   3.12047  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)   303.0617   859.5307   0.353 0.724396
## SOP             1.9620     0.5422   3.618 0.000296
## GREScore       -1.2422     2.6571  -0.468 0.640136
## CGPA          -37.4840    94.8235  -0.395 0.692620
## Research        1.0162     0.7840   1.296 0.194895
## GREScore:CGPA   0.1463     0.2931   0.499 0.617599
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 524.36  on 499  degrees of freedom
## Residual deviance: 100.17  on 494  degrees of freedom
## AIC: 112.17
## 
## Number of Fisher Scoring iterations: 11
## 
## Next step: Use 'setx' method

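Here I fitted a multivariate logit model with strength of statement of purpose, research experience, and an interaction between GRE score and CGPA as independent variables to predict a student’s chance of admission into a master’s program. Only the strength of the statement of purpose is statistically significant (estimate 1.96, p < 0.001). Research experience, GRE score, CGPA, and the interaction between GRE score and CGPA are not significant at the 0.05 level.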
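
The z value and p-value in the coefficient table follow directly from each estimate and its standard error. As a quick check, the sketch below recomputes them for the SOP row using the numbers printed in the summary, and also converts the logit coefficient into an odds ratio. The variable names are introduced here for illustration only.

```r
# Estimate and standard error for SOP, copied from the summary table above.
est <- 1.9620
se  <- 0.5422

z <- est / se             # Wald z statistic
p <- 2 * pnorm(-abs(z))   # two-sided p-value from the standard normal
odds_ratio <- exp(est)    # multiplicative change in odds per one-unit SOP increase

print(round(c(z = z, p = p, odds_ratio = odds_ratio), 4))
```

Under this model, a one-unit increase in SOP strength multiplies the odds of admission by roughly 7, holding the other predictors fixed.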

Effect of Strength of Statement of Purpose

z5SOP = min(MastersAdmission1$SOP):max(MastersAdmission1$SOP)
x <- setx(z5.2, SOP = z5SOP)
s <- sim(z5.2, x = x)
s$graph()

The graph suggests that students who submit a statement of purpose (SOP) with a strength of 4 or less have an expected value of approximately 0, meaning that they have almost no predicted chance of being admitted into a master’s program. When the strength of the SOP increases to 5, the expected value rises to approximately 0.05, but the uncertainty around the predicted chance of admission increases as well.
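
For readers unfamiliar with simulated expected values, the sketch below shows the general idea in base R: draw coefficient vectors from their estimated sampling distribution, then convert each draw into a predicted probability at a chosen predictor value. It is an illustration on made-up data and not Zelig's internal code. The data frame `toy`, the coefficients used to generate it, and the number of draws are all assumptions.

```r
set.seed(712)

# Made-up data: SOP strength from 1 to 5 and a binary admission outcome.
n   <- 300
toy <- data.frame(SOP = sample(1:5, n, replace = TRUE))
toy$Admission <- rbinom(n, 1, plogis(-3 + 0.8 * toy$SOP))

fit <- glm(Admission ~ SOP, data = toy, family = binomial)

# Draw 1000 coefficient vectors from N(beta_hat, V_hat) using a Cholesky factor.
n_draws <- 1000
beta    <- coef(fit)
R       <- chol(vcov(fit))                        # t(R) %*% R equals vcov(fit)
z       <- matrix(rnorm(n_draws * length(beta)), n_draws)
draws   <- sweep(z %*% R, 2, beta, "+")

# Expected probability of admission at each SOP value, one column per value.
sop_values <- 1:5
X  <- cbind(1, sop_values)                        # rows are (intercept, SOP)
ev <- 1 / (1 + exp(-(draws %*% t(X))))            # inverse logit of each draw

# Mean and 95% simulation interval for each SOP value.
ev_summary <- t(apply(ev, 2, function(p) {
  c(mean = mean(p), quantile(p, c(0.025, 0.975)))
}))
rownames(ev_summary) <- paste0("SOP=", sop_values)
print(round(ev_summary, 3))
```

The width of each interval reflects the uncertainty in the estimated coefficients, which is what the shaded bands in a plot of simulated expected values convey.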

Effect of Having Research Experience

z5R = min(MastersAdmission1$Research):max(MastersAdmission1$Research)
x <- setx(z5.2, Research = z5R)
R <- sim(z5.2, x = x)
R$graph()

The graph suggests that students with no research experience have an expected value of approximately 0, meaning that they have almost no predicted chance of being admitted into a master’s program. For those with research experience, the expected value rises only to approximately 0.005, a negligible increase, and the uncertainty around the predicted chance of admission increases as well. Overall, research experience does not appear to improve the chance of admission in this model, which is consistent with the non-significant Research coefficient (p = 0.195).

First Difference: GRE Score Difference

z5GRE <- zlogit$new()
z5GRE$zelig(Admission ~ SOP + GREgrp*CGPAgrp + Research, data = MastersAdmission1)
z5GRE$setx(GREgrp = 0)
z5GRE$setx1(GREgrp = 1)
z5GRE$sim()
summary(z5GRE)
## 
##  sim x :
##  -----
## ev
##           mean        sd       50%         2.5% 97.5%
## [1,] 0.5008772 0.5001236 0.9385738 2.220446e-16     1
## pv
##        0   1
## [1,] 0.5 0.5
## 
##  sim x1 :
##  -----
## ev
##           mean        sd          50%         2.5% 97.5%
## [1,] 0.4878581 0.4998928 2.220446e-16 2.220446e-16     1
## pv
##          0     1
## [1,] 0.512 0.488
## fd
##             mean      sd 50% 2.5% 97.5%
## [1,] -0.01301907 0.70221   0   -1     1

Prior to running the above analysis, GRE score and CGPA were recoded into new variables called GREgrp and CGPAgrp, respectively. GREgrp is “1” for students with a GRE score of 320 or more (out of 340) and “0” for students with a GRE score below 320. CGPAgrp is “1” for students with a CGPA of 8.5 or more (out of 10) and “0” otherwise.
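
The grouping step is not shown in this report. A minimal sketch of how it might look in base R is given below, using a made-up data frame (`toy` and its values are hypothetical) with the cutoffs stated above.

```r
# Hypothetical scores; values are made up for illustration.
toy <- data.frame(GREScore = c(337, 316, 320, 314),
                  CGPA     = c(9.65, 8.00, 8.50, 8.21))

# 1 = at or above the cutoff, 0 = below it.
toy$GREgrp  <- as.integer(toy$GREScore >= 320)
toy$CGPAgrp <- as.integer(toy$CGPA >= 8.5)

# Cross-tabulate the two groups to check cell sizes before fitting an interaction.
print(table(GREgrp = toy$GREgrp, CGPAgrp = toy$CGPAgrp))
```

Checking the cross-tabulation is worthwhile because very small or empty cells can make the coefficients of an interaction model unstable.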

The analysis of the difference between GRE groups shows that the mean first difference in the chance of being admitted into a master’s program is about -0.013 in the output above, which is very small, and the simulated 95% interval (-1 to 1) spans zero.

GRE Group Difference in Chance of Admission by CGPA Group

Difference in GRE score among students with CGPA < 8.5

z5zero <- zlogit$new()
z5zero$zelig(Admission ~ SOP + GREgrp*CGPAgrp + Research, data = MastersAdmission1)
z5zero$setx(GREgrp = 0, CGPAgrp = 0)
z5zero$setx1(GREgrp = 1, CGPAgrp = 0)
z5zero$sim()
summary(z5zero)
## 
##  sim x :
##  -----
## ev
##           mean       sd 50%         2.5% 97.5%
## [1,] 0.5199965 0.499846   1 2.220446e-16     1
## pv
##         0    1
## [1,] 0.48 0.52
## 
##  sim x1 :
##  -----
## ev
##           mean        sd          50%         2.5% 97.5%
## [1,] 0.4980002 0.5002458 2.544521e-06 2.220446e-16     1
## pv
##          0     1
## [1,] 0.502 0.498
## fd
##             mean        sd 50% 2.5% 97.5%
## [1,] -0.02199625 0.6840932   0   -1     1

Here, I estimated the first difference between GRE groups among students with a CGPA below 8.5 out of 10. In the output above, the mean first difference in the chance of admission is about -0.022, with a simulated 95% interval from -1 to 1. The simulation therefore shows no reliable difference in the chance of admission between students with a GRE score of 320 or more and those with a lower GRE score in this CGPA group.

z5zero$graph()

Difference in GRE score among students with CGPA >= 8.5

z5one <- zlogit$new()
z5one$zelig(Admission ~ SOP + GREgrp*CGPAgrp + Research, data = MastersAdmission1)
z5one$setx(GREgrp = 0, CGPAgrp = 1)
z5one$setx1(GREgrp = 1, CGPAgrp = 1)
z5one$sim()
summary(z5one)
## 
##  sim x :
##  -----
## ev
##           mean        sd         50%         2.5% 97.5%
## [1,] 0.4972816 0.4997769 0.001367331 2.220446e-16     1
## pv
##          0     1
## [1,] 0.503 0.497
## 
##  sim x1 :
##  -----
## ev
##            mean         sd        50%       2.5%     97.5%
## [1,] 0.07431933 0.03213858 0.06819159 0.02862625 0.1504754
## pv
##          0     1
## [1,] 0.931 0.069
## fd
##            mean        sd        50%       2.5%     97.5%
## [1,] -0.4229622 0.5018746 0.02323236 -0.9667188 0.1387635

I ran the same analysis among students with a CGPA of 8.5 or more out of 10. In the output above, the mean first difference in the chance of admission is about -0.423, with a simulated 95% interval from about -0.967 to 0.139. Because the interval includes zero and the median first difference is small (about 0.023), the output does not show a clear advantage for students with a GRE score of 320 or more in this category.

z5one$graph()
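
As a consistency check on the three simulation summaries, the mean first difference should equal the mean expected value under x1 minus the mean expected value under x, since the mean of a difference is the difference of the means. The sketch below verifies this using only the numbers printed in the outputs above. The data frame and its column names are introduced here for illustration.

```r
# Mean expected values and first differences copied from the three summaries above.
ev <- data.frame(
  model       = c("z5GRE", "z5zero", "z5one"),
  ev_x        = c(0.5008772, 0.5199965, 0.4972816),    # GREgrp = 0
  ev_x1       = c(0.4878581, 0.4980002, 0.07431933),   # GREgrp = 1
  fd_reported = c(-0.01301907, -0.02199625, -0.4229622)
)

# First difference = expected value under x1 minus expected value under x.
ev$fd_check <- ev$ev_x1 - ev$ev_x
print(ev)
```

The recomputed values agree with the reported `fd` means up to rounding in the printed output.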

Combining the Results in a Single Table

d0<-z5zero$get_qi(xvalue="x1", qi="fd")
d1<-z5one$get_qi(xvalue="x1", qi="fd")
dfd <- as.data.frame(cbind(d0, d1))
head(dfd)
##              V1          V2
## 1  1.000000e+00 -0.92948326
## 2 -6.743426e-08  0.08104536
## 3  0.000000e+00  0.05204978
## 4  0.000000e+00 -0.85880365
## 5 -1.000000e+00  0.04281451
## 6  0.000000e+00 -0.91150175

Tables and GGPLOTS

library(tidyr)
## 
## Attaching package: 'tidyr'
## The following object is masked from 'package:sjmisc':
## 
##     replace_na
## The following object is masked from 'package:texreg':
## 
##     extract
tidd <- dfd %>% 
  gather(CGPAgrp, simv, 1:2)
head(tidd)
##   CGPAgrp          simv
## 1      V1  1.000000e+00
## 2      V1 -6.743426e-08
## 3      V1  0.000000e+00
## 4      V1  0.000000e+00
## 5      V1 -1.000000e+00
## 6      V1  0.000000e+00
library(dplyr)
tidd %>% 
  group_by(CGPAgrp) %>% 
  summarise(mean = mean(simv), sd = sd(simv))
## # A tibble: 2 x 3
##   CGPAgrp    mean    sd
##   <chr>     <dbl> <dbl>
## 1 V1      -0.0220 0.684
## 2 V2      -0.423  0.502
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following object is masked from 'package:Zelig':
## 
##     stat
ggplot(tidd, aes(simv)) + geom_histogram() + facet_grid(~CGPAgrp)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
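
Alongside the mean and standard deviation, a percentile interval for each group helps in judging whether the simulated first differences are distinguishable from zero. The sketch below shows one way to compute it in base R. The values in `toy` are made up, so the numbers it prints say nothing about the admissions data.

```r
set.seed(1)

# Made-up simulated first differences for two groups (hypothetical values).
toy <- data.frame(
  CGPAgrp = rep(c("V1", "V2"), each = 1000),
  simv    = c(rnorm(1000, 0, 0.3), rnorm(1000, -0.2, 0.3))
)

# Mean and 95% percentile interval of the simulated values within each group.
fd_summary <- do.call(rbind, lapply(split(toy$simv, toy$CGPAgrp), function(v) {
  c(mean = mean(v), quantile(v, c(0.025, 0.975)))
}))
print(round(fd_summary, 3))

# An interval that contains 0 means the simulation does not rule out "no difference".
includes_zero <- fd_summary[, "2.5%"] <= 0 & fd_summary[, "97.5%"] >= 0
print(includes_zero)
```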

Conclusion

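The comparison of the histograms indicates that the simulated first differences are distributed differently in the two CGPA groups (mean -0.022 for CGPA below 8.5 versus -0.423 for CGPA of 8.5 or more). Overall, a higher GRE score and CGPA are expected to improve a student’s chance of getting admission into a master’s program, but the simulated intervals shown in this report include zero, so these results do not confirm that on their own.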