Probability of NYC Public School Delay by Borough, School Age or PreK, and Run Type

For this week's homework, I will be using the Bus Breakdown and Delay dataset found on Kaggle. The dataset is hosted by the City of New York and collects information in real time from school bus vendors operating in the field. The dependent variable will be bus delay and the independent variables will be Borough, School Age or PreK, and Run Type.

Packages

I loaded the packages I will be using or might use.

library(readr)
library(dplyr)
library(Zelig)
library(texreg)
library(pander)
library(visreg)
library(effects)

Opening the dataset

I imported the dataset to R.

busdelay<-read_csv("C:/Users/wroni/Downloads/ny-bus-breakdown-and-delays/bus-breakdown-and-delays.csv")

Recoding the dataset

Since I will be focusing on the probability of bus delay, I recoded the dependent variable as a binary indicator: 1 when the bus was reported as running late and 0 when it was reported as a breakdown.

busdelay2 <- mutate(busdelay, busdelay_binary= recode(Breakdown_or_Running_Late,`Running Late` = 1, `Breakdown` = 0))
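A quick way to confirm that a recode behaved as intended is to cross-tabulate the original column against the new one. The sketch below uses a small made-up data frame (`toy` is hypothetical, not the real dataset) and base R's `ifelse()` in place of `recode()`, so it runs without any packages.

```r
# Hypothetical stand-in for the real column; values mirror the two categories recoded above.
toy <- data.frame(
  Breakdown_or_Running_Late = c("Running Late", "Breakdown", "Running Late", NA),
  stringsAsFactors = FALSE
)

# 1 = running late, 0 = breakdown; missing values stay missing.
toy$busdelay_binary <- ifelse(toy$Breakdown_or_Running_Late == "Running Late", 1, 0)

# Each original category should map to exactly one value of the new variable.
check <- table(toy$Breakdown_or_Running_Late, toy$busdelay_binary, useNA = "ifany")
print(check)
```

Running the same `table()` call on `busdelay2` would show how many rows fall into each category and how many are missing.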

Summary of dataset

This shows the first six rows of the dataset with all of its variables, including the recoded variable from the previous step.

head(busdelay2)

Model 1

The first model estimates the log-odds of a bus delay by borough.

m1 <- glm(busdelay_binary ~ Boro, family = binomial, data = busdelay2)

summary(m1)
## 
## Call:
## glm(formula = busdelay_binary ~ Boro, family = binomial, data = busdelay2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7957   0.3093   0.4784   0.5287   0.7399  
## 
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)    
## (Intercept)          2.99800    0.21845  13.724  < 2e-16 ***
## BoroBronx           -1.10098    0.21872  -5.034 4.81e-07 ***
## BoroBrooklyn        -0.88810    0.21881  -4.059 4.93e-05 ***
## BoroConnecticut      0.01753    0.39067   0.045 0.964205    
## BoroManhattan        0.01788    0.21913   0.082 0.934981    
## BoroNassau County   -1.33911    0.22222  -6.026 1.68e-09 ***
## BoroNew Jersey       0.15775    0.25111   0.628 0.529872    
## BoroQueens          -1.84233    0.21877  -8.421  < 2e-16 ***
## BoroRockland County -0.05356    0.26667  -0.201 0.840809    
## BoroStaten Island    0.39824    0.22355   1.781 0.074836 .  
## BoroWestchester      0.88962    0.23390   3.803 0.000143 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 199544  on 287612  degrees of freedom
## Residual deviance: 187666  on 287602  degrees of freedom
##   (14376 observations deleted due to missingness)
## AIC: 187688
## 
## Number of Fisher Scoring iterations: 6
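The estimates in this output are on the log-odds scale, which can be hard to read directly. As a rough sketch, the snippet below copies a few of the Model 1 estimates by hand (the vector `m1_coef` is only an illustration) and converts them to odds ratios and a baseline probability. With the fitted object available, `exp(coef(m1))` would do the same for every coefficient.

```r
# Model 1 estimates copied from the summary output above (log-odds scale).
m1_coef <- c(
  "(Intercept)"     = 2.99800,
  "BoroBronx"       = -1.10098,
  "BoroQueens"      = -1.84233,
  "BoroWestchester" = 0.88962
)

# exp() turns a log-odds coefficient into an odds ratio against the reference borough.
odds_ratios <- exp(m1_coef[-1])

# plogis() is the inverse logit: it maps the intercept's log-odds to a probability.
p_reference <- plogis(m1_coef[["(Intercept)"]])

print(round(odds_ratios, 3))
print(round(p_reference, 3))
```

An odds ratio below 1 (the Bronx, Queens) means lower odds of the outcome than in the reference borough; above 1 (Westchester) means higher odds.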

Model 2

This model adds a second independent variable, School Age or Pre-K. The second model estimates the log-odds of a bus delay by Borough and School Age or Pre-K.

m2 <- glm(busdelay_binary ~ Boro + School_Age_or_PreK, family = binomial, data = busdelay2)

summary(m2)
## 
## Call:
## glm(formula = busdelay_binary ~ Boro + School_Age_or_PreK, family = binomial, 
##     data = busdelay2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.4230   0.2037   0.3093   0.5078   0.7500  
## 
## Coefficients:
##                              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                   5.45975    0.22194  24.600  < 2e-16 ***
## BoroBronx                    -1.59527    0.21876  -7.292 3.04e-13 ***
## BoroBrooklyn                 -1.01483    0.21882  -4.638 3.52e-06 ***
## BoroConnecticut               0.01753    0.39067   0.045 0.964205    
## BoroManhattan                 0.01781    0.21913   0.081 0.935208    
## BoroNassau County            -1.33911    0.22222  -6.026 1.68e-09 ***
## BoroNew Jersey                0.15775    0.25111   0.628 0.529872    
## BoroQueens                   -1.87333    0.21877  -8.563  < 2e-16 ***
## BoroRockland County          -0.05356    0.26667  -0.201 0.840809    
## BoroStaten Island             0.39571    0.22355   1.770 0.076704 .  
## BoroWestchester               0.88962    0.23390   3.803 0.000143 ***
## School_Age_or_PreKSchool-Age -2.46175    0.03918 -62.829  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 199544  on 287612  degrees of freedom
## Residual deviance: 179849  on 287601  degrees of freedom
##   (14376 observations deleted due to missingness)
## AIC: 179873
## 
## Number of Fisher Scoring iterations: 6
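Because Model 1 is nested in Model 2 and both were fit to the same 287613 observations, the drop in residual deviance can be checked with a likelihood-ratio test. The sketch below does the arithmetic by hand from the printed summaries. With the fitted objects, `anova(m1, m2, test = "Chisq")` should give the same comparison.

```r
# Residual deviances and degrees of freedom copied from the two summaries above.
dev_m1 <- 187666; df_m1 <- 287602
dev_m2 <- 179849; df_m2 <- 287601

# The likelihood-ratio statistic is the drop in deviance; its degrees of freedom
# equal the number of added coefficients (here one, for School-Age).
lr_stat <- dev_m1 - dev_m2
lr_df   <- df_m1 - df_m2
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(statistic = lr_stat, df = lr_df, p = p_value))
```

A statistic this large on one degree of freedom indicates that adding School Age or Pre-K improves the fit well beyond what chance would explain.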

Model 3

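This model adds a third independent variable, Run Type. The third model estimates the log-odds of a bus delay by Borough, School Age or Pre-K, and Run Type. This model also includes an interaction between School Age or Pre-K and Run Type. However, the summary reports NA for Run Type Pre-K/EI and for every interaction term: these 10 coefficients are not defined because of singularities, which means their columns are exact linear combinations of other predictors and cannot be estimated separately.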

m3 <- glm(busdelay_binary ~ Boro + School_Age_or_PreK * Run_Type, family = binomial, data = busdelay2)

summary(m3)
## 
## Call:
## glm(formula = busdelay_binary ~ Boro + School_Age_or_PreK * Run_Type, 
##     family = binomial, data = busdelay2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.4412   0.2035   0.3393   0.5751   0.9954  
## 
## Coefficients: (10 not defined because of singularities)
##                                                              Estimate
## (Intercept)                                                   5.21563
## BoroBronx                                                    -1.34831
## BoroBrooklyn                                                 -0.78379
## BoroConnecticut                                               0.11510
## BoroManhattan                                                 0.25517
## BoroNassau County                                            -1.21267
## BoroNew Jersey                                                0.15601
## BoroQueens                                                   -1.64646
## BoroRockland County                                           0.06861
## BoroStaten Island                                             0.70274
## BoroWestchester                                               0.89665
## School_Age_or_PreKSchool-Age                                 -2.64456
## Run_TypeGeneral Ed Field Trip                                -0.32177
## Run_TypeGeneral Ed PM Run                                    -0.44231
## Run_TypePre-K/EI                                                   NA
## Run_TypeProject Read AM Run                                   0.22859
## Run_TypeProject Read Field Trip                              -0.59623
## Run_TypeProject Read PM Run                                   0.95148
## Run_TypeSpecial Ed AM Run                                     0.49293
## Run_TypeSpecial Ed Field Trip                                -0.48014
## Run_TypeSpecial Ed PM Run                                    -0.33142
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed Field Trip         NA
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed PM Run             NA
## School_Age_or_PreKSchool-Age:Run_TypePre-K/EI                      NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read AM Run           NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read Field Trip       NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read PM Run           NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed AM Run             NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed Field Trip         NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed PM Run             NA
##                                                              Std. Error
## (Intercept)                                                     0.22234
## BoroBronx                                                       0.21917
## BoroBrooklyn                                                    0.21923
## BoroConnecticut                                                 0.39169
## BoroManhattan                                                   0.21953
## BoroNassau County                                               0.22264
## BoroNew Jersey                                                  0.25148
## BoroQueens                                                      0.21917
## BoroRockland County                                             0.26728
## BoroStaten Island                                               0.22416
## BoroWestchester                                                 0.23424
## School_Age_or_PreKSchool-Age                                    0.04142
## Run_TypeGeneral Ed Field Trip                                   0.07936
## Run_TypeGeneral Ed PM Run                                       0.02967
## Run_TypePre-K/EI                                                     NA
## Run_TypeProject Read AM Run                                     0.30058
## Run_TypeProject Read Field Trip                                 1.24128
## Run_TypeProject Read PM Run                                     0.15421
## Run_TypeSpecial Ed AM Run                                       0.01742
## Run_TypeSpecial Ed Field Trip                                   0.07580
## Run_TypeSpecial Ed PM Run                                       0.02040
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed Field Trip           NA
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed PM Run               NA
## School_Age_or_PreKSchool-Age:Run_TypePre-K/EI                        NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read AM Run             NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read Field Trip         NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read PM Run             NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed AM Run               NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed Field Trip           NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed PM Run               NA
##                                                              z value
## (Intercept)                                                   23.458
## BoroBronx                                                     -6.152
## BoroBrooklyn                                                  -3.575
## BoroConnecticut                                                0.294
## BoroManhattan                                                  1.162
## BoroNassau County                                             -5.447
## BoroNew Jersey                                                 0.620
## BoroQueens                                                    -7.512
## BoroRockland County                                            0.257
## BoroStaten Island                                              3.135
## BoroWestchester                                                3.828
## School_Age_or_PreKSchool-Age                                 -63.848
## Run_TypeGeneral Ed Field Trip                                 -4.055
## Run_TypeGeneral Ed PM Run                                    -14.909
## Run_TypePre-K/EI                                                  NA
## Run_TypeProject Read AM Run                                    0.760
## Run_TypeProject Read Field Trip                               -0.480
## Run_TypeProject Read PM Run                                    6.170
## Run_TypeSpecial Ed AM Run                                     28.303
## Run_TypeSpecial Ed Field Trip                                 -6.334
## Run_TypeSpecial Ed PM Run                                    -16.246
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed Field Trip        NA
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed PM Run            NA
## School_Age_or_PreKSchool-Age:Run_TypePre-K/EI                     NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read AM Run          NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read Field Trip      NA
## School_Age_or_PreKSchool-Age:Run_TypeProject Read PM Run          NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed AM Run            NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed Field Trip        NA
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed PM Run            NA
##                                                              Pr(>|z|)    
## (Intercept)                                                   < 2e-16 ***
## BoroBronx                                                    7.65e-10 ***
## BoroBrooklyn                                                 0.000350 ***
## BoroConnecticut                                              0.768863    
## BoroManhattan                                                0.245096    
## BoroNassau County                                            5.13e-08 ***
## BoroNew Jersey                                               0.535022    
## BoroQueens                                                   5.82e-14 ***
## BoroRockland County                                          0.797407    
## BoroStaten Island                                            0.001719 ** 
## BoroWestchester                                              0.000129 ***
## School_Age_or_PreKSchool-Age                                  < 2e-16 ***
## Run_TypeGeneral Ed Field Trip                                5.02e-05 ***
## Run_TypeGeneral Ed PM Run                                     < 2e-16 ***
## Run_TypePre-K/EI                                                   NA    
## Run_TypeProject Read AM Run                                  0.446959    
## Run_TypeProject Read Field Trip                              0.630993    
## Run_TypeProject Read PM Run                                  6.82e-10 ***
## Run_TypeSpecial Ed AM Run                                     < 2e-16 ***
## Run_TypeSpecial Ed Field Trip                                2.38e-10 ***
## Run_TypeSpecial Ed PM Run                                     < 2e-16 ***
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed Field Trip         NA    
## School_Age_or_PreKSchool-Age:Run_TypeGeneral Ed PM Run             NA    
## School_Age_or_PreKSchool-Age:Run_TypePre-K/EI                      NA    
## School_Age_or_PreKSchool-Age:Run_TypeProject Read AM Run           NA    
## School_Age_or_PreKSchool-Age:Run_TypeProject Read Field Trip       NA    
## School_Age_or_PreKSchool-Age:Run_TypeProject Read PM Run           NA    
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed AM Run             NA    
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed Field Trip         NA    
## School_Age_or_PreKSchool-Age:Run_TypeSpecial Ed PM Run             NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 199539  on 287609  degrees of freedom
## Residual deviance: 176362  on 287590  degrees of freedom
##   (14379 observations deleted due to missingness)
## AIC: 176402
## 
## Number of Fisher Scoring iterations: 6
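One plausible reason for the NA coefficients, stated here as an assumption rather than something verified on the real data, is that the Pre-K/EI run type only occurs for Pre-K buses, so Run Type and School Age or Pre-K are nested. The toy example below (all data simulated, all names hypothetical) reproduces that pattern and shows `glm()` returning NA for the redundant terms.

```r
set.seed(1)  # arbitrary seed so the toy data are reproducible

# Hypothetical toy data: Pre-K rows only ever have the "Pre-K/EI" run type,
# and School-Age rows never do.
age  <- factor(rep(c("Pre-K", "School-Age"), times = c(100, 200)))
run  <- factor(c(rep("Pre-K/EI", 100), rep(c("AM Run", "PM Run"), each = 100)))
late <- rbinom(300, size = 1, prob = 0.7)

# Empty cells in the cross-tabulation are the warning sign.
cells <- table(age, run)
print(cells)

# Only three age-by-run cells contain data, so only three of the six
# coefficients can be estimated; the rest come back as NA.
toy_fit <- glm(late ~ age * run, family = binomial)
print(coef(toy_fit))
n_undefined <- sum(is.na(coef(toy_fit)))
```

Running `table(busdelay2$School_Age_or_PreK, busdelay2$Run_Type)` on the real data would show whether the same kind of empty cells are present.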

Information Criteria

Both the AIC and BIC show that Model 3 seems to be the best-fitting model (lower values indicate better fit).
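The reported criteria can be reproduced by hand from the deviances. The sketch below uses the deviance, the number of estimated coefficients (`k`, counting only non-NA coefficients), and the number of observations (`n`) reported for each model in this document. The data frame `fits` is just a container for this illustration.

```r
# Deviance, estimated coefficients (k), and sample size (n) for each model.
fits <- data.frame(
  model    = c("Model 1", "Model 2", "Model 3"),
  deviance = c(187666.49, 179849.39, 176362.47),
  k        = c(11, 12, 20),
  n        = c(287613, 287613, 287610)
)

# For a 0/1 outcome the deviance equals -2 * log-likelihood, so:
fits$AIC <- fits$deviance + 2 * fits$k          # penalty of 2 per coefficient
fits$BIC <- fits$deviance + log(fits$n) * fits$k  # penalty of log(n) per coefficient

print(fits)
```

One caveat: Model 3 was fit to three fewer observations than Models 1 and 2, so the comparison is not strictly like-for-like, although the gap in AIC and BIC is far larger than three rows could account for.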

Interpreting the results (Model 3)

The coefficients are on the log-odds scale and are measured relative to the reference borough. The results show that the log-odds of a bus delay increase by 0.90 for Westchester, 0.70 for Staten Island, 0.07 for Rockland County, 0.16 for New Jersey, 0.12 for Connecticut, and 0.26 for Manhattan, although only the Westchester and Staten Island estimates are statistically significant. The log-odds of a bus delay decrease by 1.35 for the Bronx, 0.78 for Brooklyn, 1.21 for Nassau County, and 1.65 for Queens. Please note that the dataset contains more than just the five major NYC boroughs even though the variable is labeled Borough.

Once the School Age or Pre-K variable is factored in with Borough, the log-odds of a bus delay are 2.64 lower for School-Age than for Pre-K.

Once the third variable, Run Type, is factored in with Borough and School Age or Pre-K, the log-odds of a bus delay decrease, relative to the reference run type, by 0.32 for General Education Field Trip, 0.44 for General Education PM Run, 0.48 for Special Education Field Trip, and 0.33 for Special Education PM Run. The log-odds of a bus delay increase by 0.49 for Special Education AM Run and by 0.95 for Project Read PM Run.
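To make these log-odds easier to picture, they can be added up for a given profile and passed through the inverse logit. The sketch below copies four Model 3 estimates by hand and computes the predicted probability for a School-Age bus on the reference run type in three boroughs. The variable names are illustrative only.

```r
# Model 3 estimates copied from the summary output (log-odds scale).
intercept   <- 5.21563
school_age  <- -2.64456
westchester <- 0.89665
queens      <- -1.64646

# Linear predictor for a School-Age bus on the reference run type.
eta <- c(
  reference   = intercept + school_age,
  Westchester = intercept + school_age + westchester,
  Queens      = intercept + school_age + queens
)

# plogis() converts each log-odds value into a probability between 0 and 1.
prob <- plogis(eta)
print(round(prob, 3))
```

With the fitted model, `predict(m3, newdata = ..., type = "response")` would return the same kind of probabilities for any combination of predictors.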

table1 <- htmlreg(list(m1, m2, m3), doctype= FALSE)

pander(table1)
Statistical models

                                   Model 1       Model 2       Model 3
(Intercept)                        3.00***       5.46***       5.22***
                                   (0.22)        (0.22)        (0.22)
BoroBronx                          -1.10***      -1.60***      -1.35***
                                   (0.22)        (0.22)        (0.22)
BoroBrooklyn                       -0.89***      -1.01***      -0.78***
                                   (0.22)        (0.22)        (0.22)
BoroConnecticut                    0.02          0.02          0.12
                                   (0.39)        (0.39)        (0.39)
BoroManhattan                      0.02          0.02          0.26
                                   (0.22)        (0.22)        (0.22)
BoroNassau County                  -1.34***      -1.34***      -1.21***
                                   (0.22)        (0.22)        (0.22)
BoroNew Jersey                     0.16          0.16          0.16
                                   (0.25)        (0.25)        (0.25)
BoroQueens                         -1.84***      -1.87***      -1.65***
                                   (0.22)        (0.22)        (0.22)
BoroRockland County                -0.05         -0.05         0.07
                                   (0.27)        (0.27)        (0.27)
BoroStaten Island                  0.40          0.40          0.70**
                                   (0.22)        (0.22)        (0.22)
BoroWestchester                    0.89***       0.89***       0.90***
                                   (0.23)        (0.23)        (0.23)
School_Age_or_PreKSchool-Age                     -2.46***      -2.64***
                                                 (0.04)        (0.04)
Run_TypeGeneral Ed Field Trip                                  -0.32***
                                                               (0.08)
Run_TypeGeneral Ed PM Run                                      -0.44***
                                                               (0.03)
Run_TypeProject Read AM Run                                    0.23
                                                               (0.30)
Run_TypeProject Read Field Trip                                -0.60
                                                               (1.24)
Run_TypeProject Read PM Run                                    0.95***
                                                               (0.15)
Run_TypeSpecial Ed AM Run                                      0.49***
                                                               (0.02)
Run_TypeSpecial Ed Field Trip                                  -0.48***
                                                               (0.08)
Run_TypeSpecial Ed PM Run                                      -0.33***
                                                               (0.02)
AIC                                187688.49     179873.39     176402.47
BIC                                187804.75     180000.23     176613.86
Log Likelihood                     -93833.24     -89924.70     -88181.24
Deviance                           187666.49     179849.39     176362.47
Num. obs.                          287613        287613        287610

*** p < 0.001, ** p < 0.01, * p < 0.05

Plotting

This plot shows that the probability of bus delay is higher in Westchester and Staten Island, especially for School-Age.

visreg(m3,"Boro", by = "School_Age_or_PreK", scale="response")

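This plot shows the same predictions grouped the other way: within each borough, the probability of bus delay is lower for School-Age than for Pre-K, consistent with the negative School-Age coefficient, and it remains highest in Westchester.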

visreg(m3,"School_Age_or_PreK", by = "Boro", scale="response")

This plot again shows that the probability of bus delay increases for Run Type Special Education AM Run, especially in Westchester and Staten Island.

visreg(m3,"Run_Type", by = "Boro", scale="response")