Fitting a logistic regression model to check the impact of the number of exit exams on students’ performance

## 
## Call:
## glm(formula = factor(Pass) ~ factor(Grade) + factor(NumExit), 
##     family = binomial(link = "logit"), data = exit2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8978   0.6009   0.6009   0.7221   0.8667  
## 
## Coefficients:
##                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.62029    0.05424  29.872  < 2e-16 ***
## factor(Grade)PS   -0.40908    0.06724  -6.084 1.17e-09 ***
## factor(NumExit)2  -0.01716    0.08234  -0.208   0.8349    
## factor(NumExit)3  -0.37083    0.21605  -1.716   0.0861 .  
## factor(NumExit)4  -0.42556    0.53536  -0.795   0.4267    
## factor(NumExit)5  10.94577  187.49087   0.058   0.9534    
## factor(NumExit)6 -14.18635  324.74370  -0.044   0.9652    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 5706.9  on 5698  degrees of freedom
## Residual deviance: 5661.7  on 5692  degrees of freedom
## AIC: 5675.7
## 
## Number of Fisher Scoring iterations: 11

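According to the results, grade has a significant effect on students’ final exam performance (p < 0.001), while NumExit = 3 (three exit exams) is only marginally significant (p = 0.086, below 0.1 but not below 0.05). The other NumExit levels are not significant, and the very large standard errors for NumExit = 5 and 6 suggest that too few students fall into those groups to estimate their effects. This model does not take the effect of different plans into account.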
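
As a rough illustration of how the coefficient table translates into odds ratios, the sketch below exponentiates two of the reported estimates and builds approximate 95% Wald intervals from the reported standard errors. The numbers are typed in by hand from the output above; the names `est`, `se`, and `or_table` are illustrative, and the normal-approximation interval is an assumption rather than something the original analysis reports.

```r
# Estimates and standard errors copied from the summary output above
est <- c(GradePS = -0.40908, NumExit3 = -0.37083)
se  <- c(GradePS =  0.06724, NumExit3 =  0.21605)

# Approximate 95% Wald interval on the log-odds scale, then exponentiate
z <- qnorm(0.975)
or_table <- cbind(
  odds_ratio = exp(est),
  lower      = exp(est - z * se),
  upper      = exp(est + z * se)
)
print(round(or_table, 3))
```

An interval that excludes 1 corresponds to significance at roughly the 5% level; here that holds for grade but not for NumExit = 3.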

Next, we consider several frequently used plans to investigate the relationship between the number of exit exams and students’ final exam performance.

Building a logistic regression model on a subset of the data that includes only the frequently used plans: T1 rhyme, T1 word awareness, T1 rote counting, and T1 one-to-one
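
A minimal sketch of the subsetting step, assuming the plan name is stored in a column called `Plan` (the actual column name is not shown here). The small `exit2_toy` data frame is made up purely so the snippet runs on its own; it is not the real data.

```r
# Made-up stand-in for exit2; the column name `Plan` is an assumption
exit2_toy <- data.frame(
  Plan    = c("T1 rhyme", "T1 word awareness", "Other plan",
              "T1 rote counting", "T1 one-to-one", "Other plan"),
  Grade   = c("PK", "PS", "PK", "PS", "PK", "PS"),
  NumExit = c(1, 2, 1, 3, 1, 2),
  Pass    = c(1, 0, 1, 1, 1, 0),
  stringsAsFactors = FALSE
)

# The four frequently used plans named in the text
frequent_plans <- c("T1 rhyme", "T1 word awareness",
                    "T1 rote counting", "T1 one-to-one")

# Keep only rows whose plan is one of the frequent plans
data1_toy <- exit2_toy[exit2_toy$Plan %in% frequent_plans, ]
print(data1_toy)
```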

## 
## Call:
## glm(formula = factor(Pass) ~ factor(Grade) + factor(NumExit), 
##     family = binomial(link = "logit"), data = data1)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0440  -1.2574   0.6646   0.8746   1.0995  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.3978     0.1040  13.445  < 2e-16 ***
## factor(Grade)PS   -0.6339     0.1250  -5.073 3.92e-07 ***
## factor(NumExit)2   0.2072     0.1466   1.413    0.158    
## factor(NumExit)3   0.5590     0.4955   1.128    0.259    
## factor(NumExit)4  -0.5779     0.5951  -0.971    0.331    
## factor(NumExit)6 -13.9639   324.7437  -0.043    0.966    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1650.3  on 1462  degrees of freedom
## Residual deviance: 1615.3  on 1457  degrees of freedom
## AIC: 1627.3
## 
## Number of Fisher Scoring iterations: 11

The results show that, in this subset, only grade significantly influences students’ performance; none of the NumExit levels is significant.
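
The NumExit = 6 row above (an estimate near -14 with a standard error above 300) looks like the usual symptom of a sparse level in which every student has the same outcome. The sketch below reproduces that pattern on made-up data, so the counts are illustrative only; `toy` and `fit_toy` are hypothetical names, and treating NumExit = 1 as the baseline level is an assumption.

```r
# Made-up data: NumExit = 6 has only two students and both fail
toy <- data.frame(
  NumExit = c(rep(1, 20), rep(2, 10), rep(6, 2)),
  Pass    = c(rep(1, 15), rep(0, 5),   # NumExit = 1: 15 pass, 5 fail
              rep(1, 7),  rep(0, 3),   # NumExit = 2: 7 pass, 3 fail
              0, 0)                    # NumExit = 6: 0 pass, 2 fail
)

# Cross-tabulate first: a zero cell is the warning sign
counts <- table(NumExit = toy$NumExit, Pass = toy$Pass)
print(counts)

# glm() may warn about fitted probabilities of 0 or 1; warnings are suppressed here
fit_toy <- suppressWarnings(
  glm(Pass ~ factor(NumExit), family = binomial(link = "logit"), data = toy)
)

# The sparse level gets a large negative estimate and a very large standard error
print(coef(summary(fit_toy)))
```

Checking such a cross-tabulation on the real data before interpreting the coefficients would show whether the extreme NumExit rows can simply be ignored or merged with a neighbouring level.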

Let’s build a similar model separately for each grade.

## 
## Call:
## glm(formula = factor(Pass) ~ factor(NumExit), family = binomial(link = "logit"), 
##     data = data1[data1$Grade == "PS", ])
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6719  -1.5015   0.8849   0.8849   1.2735  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)       0.73571    0.08966   8.206  2.3e-16 ***
## factor(NumExit)2  0.37794    0.19552   1.933   0.0532 .  
## factor(NumExit)3  0.36291    0.67267   0.540   0.5895    
## factor(NumExit)4 -0.95885    0.67679  -1.417   0.1565    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 946.85  on 766  degrees of freedom
## Residual deviance: 940.41  on 763  degrees of freedom
## AIC: 948.41
## 
## Number of Fisher Scoring iterations: 4
## 
## Call:
## glm(formula = factor(Pass) ~ factor(NumExit), family = binomial(link = "logit"), 
##     data = data1[data1$Grade == "PK", ])
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.1460   0.6512   0.6512   0.6512   0.6576  
## 
## Coefficients:
##                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.44316    0.11468  12.585   <2e-16 ***
## factor(NumExit)2  -0.02177    0.22050  -0.099    0.921    
## factor(NumExit)3   0.75407    0.75413   1.000    0.317    
## factor(NumExit)4  13.12291  509.65214   0.026    0.979    
## factor(NumExit)6 -16.00922  882.74338  -0.018    0.986    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 676.13  on 695  degrees of freedom
## Residual deviance: 670.29  on 691  degrees of freedom
## AIC: 680.29
## 
## Number of Fisher Scoring iterations: 13

After splitting by grade, we can see that for prekindergarten (PK) students the number of exit exams has no obvious effect on passing the final exam. For preschool (PS) students the number of exit exams has a marginal effect (p = 0.053): NumExit = 2 (two exit exams) multiplies the odds of passing by about 1.46 relative to the baseline NumExit level, since exp(0.378) ≈ 1.46.
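
The odds ratio quoted above can be checked directly from the preschool coefficient table, and the reported deviances give a rough overall test of NumExit within each grade. This is a sketch using numbers typed in from the output; the variable names and the helper `lr_p` are illustrative, and the chi-square approximation may be unreliable given the sparse NumExit levels.

```r
# Preschool (PS) model: coefficients copied from the output above
b0 <- 0.73571   # intercept (baseline NumExit level)
b2 <- 0.37794   # factor(NumExit)2

# Odds ratio for NumExit = 2 versus the baseline level
or_2 <- exp(b2)

# Implied pass probabilities at baseline and at NumExit = 2
p_base <- plogis(b0)
p_two  <- plogis(b0 + b2)
print(round(c(odds_ratio = or_2, p_base = p_base, p_two = p_two), 3))

# Likelihood-ratio (deviance) test of NumExit as a whole, per grade:
# null deviance minus residual deviance, on the difference in degrees of freedom
lr_p <- function(null_dev, resid_dev, null_df, resid_df) {
  pchisq(null_dev - resid_dev, df = null_df - resid_df, lower.tail = FALSE)
}
p_ps <- lr_p(946.85, 940.41, 766, 763)  # preschool
p_pk <- lr_p(676.13, 670.29, 695, 691)  # prekindergarten
print(round(c(PS = p_ps, PK = p_pk), 3))
```

Under these assumptions the preschool test falls between the 5% and 10% levels and the prekindergarten test does not reach the 10% level, which is consistent with the reading of the individual coefficients.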