How does zip code influence child care quality?
I am interested in understanding the relationship between zip code and the quality of child care programming throughout the New York City boroughs. Prior research indicates a strong correlation between zip codes and both school quality and educational attainment: zip codes can help predict test scores, college graduation rates, and overall academic progress. For many parents in New York with long and inflexible working hours, child care services are not optional. Parents may rely on child care programs to satisfy their child’s academic, social, and emotional needs. Child care facilities are usually close to home, to avoid inconvenient travel from school to the program and from the program to the household. I want to determine whether zip code can predict the public health hazard violation rate. I would also like to examine how age range and zip code interact with the dependent variable, public health hazard violation rate.
The “childcare” dataset was downloaded from Kaggle; it consists of inspections conducted, and the violations found, by the Department of Health and Mental Hygiene at child care programs throughout the New York City boroughs.
library(readr)
library(tidyverse)
## -- Attaching packages ------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0 v purrr 0.3.1
## v tibble 2.0.1 v dplyr 0.8.0.1
## v tidyr 0.8.3 v stringr 1.4.0
## v ggplot2 3.1.0 v forcats 0.4.0
## -- Conflicts ---------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(Zelig)
## Loading required package: survival
##
## Attaching package: 'Zelig'
## The following object is masked from 'package:purrr':
##
## reduce
## The following object is masked from 'package:ggplot2':
##
## stat
library(pander)
library(texreg)
## Version: 1.36.23
## Date: 2017-03-03
## Author: Philip Leifeld (University of Glasgow)
##
## Please cite the JSS article in your publications -- see citation("texreg").
##
## Attaching package: 'texreg'
## The following object is masked from 'package:tidyr':
##
## extract
library(visreg)
library(effects)
## Loading required package: carData
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
childcare <- read_csv("C:/Users/Skippz/Desktop/dohmh-childcare-center-inspections.csv")
## Parsed with column specification:
## cols(
## .default = col_character(),
## ZipCode = col_double(),
## `Permit Number` = col_double(),
## `Permit Expiration` = col_datetime(format = ""),
## `Maximum Capacity` = col_double(),
## `Building Identification Number` = col_double(),
## `Date Permitted` = col_datetime(format = ""),
## `Violation Rate Percent` = col_double(),
## `Average Violation Rate Percent` = col_double(),
## `Total Educational Workers` = col_double(),
## `Average Total Educational Workers` = col_double(),
## `Public Health Hazard Violation Rate` = col_double(),
## `Average Public Health Hazard Violation Rate` = col_double(),
## `Critical Violation Rate` = col_double(),
## `Average Critical Violation Rate` = col_double(),
## `Inspection Date` = col_datetime(format = "")
## )
## See spec(...) for full column specifications.
head(childcare)
## # A tibble: 6 x 34
## `Center Name` `Legal Name` Building Street Borough ZipCode Phone
## <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 ESF, INC ESF Camp Ri~ 1 SPAUL~ BRONX 10471 718-~
## 2 ALL SEASONS ~ ALL SEASONS~ 190 E 162~ BRONX 10451 914-~
## 3 ALL SEASONS ~ ALL SEASONS~ 190 E 162~ BRONX 10451 914-~
## 4 ALL SEASONS ~ ALL SEASONS~ 190 E 162~ BRONX 10451 914-~
## 5 ALL SEASONS ~ ALL SEASONS~ 190 E 162~ BRONX 10451 914-~
## 6 YESHIVAT OHR~ YESHIVAT OH~ 86-06 135TH~ QUEENS 11418 718-~
## # ... with 27 more variables: `Permit Number` <dbl>, `Permit
## # Expiration` <dttm>, Status <chr>, `Age Range` <chr>, `Maximum
## # Capacity` <dbl>, `Day Care ID` <chr>, `Program Type` <chr>, `Facility
## # Type` <chr>, `Child Care Type` <chr>, `Building Identification
## # Number` <dbl>, URL <chr>, `Date Permitted` <dttm>, Actual <chr>,
## # `Violation Rate Percent` <dbl>, `Average Violation Rate
## # Percent` <dbl>, `Total Educational Workers` <dbl>, `Average Total
## # Educational Workers` <dbl>, `Public Health Hazard Violation
## # Rate` <dbl>, `Average Public Health Hazard Violation Rate` <dbl>,
## # `Critical Violation Rate` <dbl>, `Average Critical Violation
## # Rate` <dbl>, `Inspection Date` <dttm>, `Regulation Summary` <chr>,
## # `Violation Category` <chr>, `Health Code Sub Section` <chr>,
## # `Violation Status` <chr>, `Inspection Summary Result` <chr>
qualitycc<-sjlabelled::remove_all_labels(childcare)
head(qualitycc)
## Center.Name Legal.Name Building
## 1 ESF, INC ESF Camp Riverdale 1
## 2 ALL SEASONS ABC DAY CARE, LLC ALL SEASONS DAY CARE, LLC 190
## 3 ALL SEASONS ABC DAY CARE, LLC ALL SEASONS DAY CARE, LLC 190
## 4 ALL SEASONS ABC DAY CARE, LLC ALL SEASONS DAY CARE, LLC 190
## 5 ALL SEASONS ABC DAY CARE, LLC ALL SEASONS DAY CARE, LLC 190
## 6 YESHIVAT OHR HAIIM YESHIVAT OHR HAIIM 86-06
## Street Borough ZipCode Phone Permit.Number
## 1 SPAULDING LANE BRONX 10471 718-432-1013 104551
## 2 E 162ND ST BRONX 10451 914-490-5231 104386
## 3 E 162ND ST BRONX 10451 914-490-5231 104386
## 4 E 162ND ST BRONX 10451 914-490-5231 104386
## 5 E 162ND ST BRONX 10451 914-490-5231 104386
## 6 135TH STREET QUEENS 11418 718-658-7066 NA
## Permit.Expiration Status Age.Range Maximum.Capacity
## 1 2018-09-15 Expired-In Renewal 0 YEARS - 16 YEARS 0
## 2 2020-04-26 Permitted 0 YEARS - 2 YEARS 14
## 3 2020-04-26 Permitted 0 YEARS - 2 YEARS 14
## 4 2020-04-26 Permitted 0 YEARS - 2 YEARS 14
## 5 2020-04-26 Permitted 0 YEARS - 2 YEARS 14
## 6 2115-01-23 Active 3 YEARS - 5 YEARS 0
## Day.Care.ID Program.Type Facility.Type Child.Care.Type
## 1 DC37193 ALL AGE CAMP Camp Camp
## 2 DC35866 INFANT TODDLER GDC Child Care - Infants/Toddlers
## 3 DC35866 INFANT TODDLER GDC Child Care - Infants/Toddlers
## 4 DC35866 INFANT TODDLER GDC Child Care - Infants/Toddlers
## 5 DC35866 INFANT TODDLER GDC Child Care - Infants/Toddlers
## 6 DC20398 PRESCHOOL SBCC School Based Child Care
## Building.Identification.Number URL
## 1 2090707 www.esfcamps.com/riverdale
## 2 2002804 www.allseasondaycare.com
## 3 2002804 www.allseasondaycare.com
## 4 2002804 www.allseasondaycare.com
## 5 2002804 www.allseasondaycare.com
## 6 4206444 <NA>
## Date.Permitted Actual Violation.Rate.Percent
## 1 2018-08-07 11:38:39 Y NA
## 2 2018-04-26 16:28:50 Y 0.0000
## 3 2018-04-26 16:28:50 Y 0.0000
## 4 2018-04-26 16:28:50 Y 0.0000
## 5 2018-04-26 16:28:50 Y 0.0000
## 6 <NA> <NA> 66.6667
## Average.Violation.Rate.Percent Total.Educational.Workers
## 1 NA 0
## 2 NA 0
## 3 NA 0
## 4 NA 0
## 5 NA 0
## 6 NA 0
## Average.Total.Educational.Workers Public.Health.Hazard.Violation.Rate
## 1 1 NA
## 2 0 0.0000
## 3 0 0.0000
## 4 0 0.0000
## 5 0 0.0000
## 6 0 33.3333
## Average.Public.Health.Hazard.Violation.Rate Critical.Violation.Rate
## 1 NA NA
## 2 NA 0.0000
## 3 NA 0.0000
## 4 NA 0.0000
## 5 NA 0.0000
## 6 NA 66.6667
## Average.Critical.Violation.Rate Inspection.Date
## 1 NA <NA>
## 2 NA 2018-12-08
## 3 NA 2018-12-08
## 4 NA 2018-08-01
## 5 NA 2018-05-23
## 6 NA 2019-01-31
## Regulation.Summary
## 1 <NA>
## 2 There were no new violations observed at the time of this inspection/visit.
## 3 There were no new violations observed at the time of this inspection/visit.
## 4 There were no new violations observed at the time of this inspection/visit.
## 5 At time of inspection floors/walls ceilings were observed not maintained; in disrepair or covered in a toxic finish.
## 6 All staff who will have unsupervised contact with children has undergone child abuse and criminal justice screening and reference checks
## Violation.Category Health.Code.Sub.Section Violation.Status
## 1 <NA> <NA> <NA>
## 2 <NA> <NA> N/A
## 3 <NA> <NA> N/A
## 4 <NA> <NA> N/A
## 5 GENERAL 47.41(j) CORRECTED
## 6 GENERAL 43.13(a) CORRECTED
## Inspection.Summary.Result
## 1 <NA>
## 2 Monitoring Inspection Non-Routine - Passed inspection with no violations
## 3 Compliance Inspection of Open Violations - Reinspection Required; Violations corrected at time of inspection
## 4 Initial Annual Inspection - Passed inspection with no violations
## 5 Initial Annual Inspection - Reinspection Not Required
## 6 Initial Annual Inspection - Reinspection Required
Model 1 has an intercept of 28.93, which is the predicted public health hazard violation rate when zip code equals 0. Because no such zip code exists, the intercept has no substantive meaning on its own. For each one-unit increase in zip code (treated here as a numeric variable), the predicted public health hazard violation rate decreases by 0.0002838 units. This relationship is not statistically significant, as the p-value is 0.137.
model1 <- lm(Public.Health.Hazard.Violation.Rate ~ ZipCode, data = qualitycc)
summary(model1)
##
## Call:
## lm(formula = Public.Health.Hazard.Violation.Rate ~ ZipCode, data = qualitycc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -26.093 -25.706 -5.742 13.992 74.314
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 28.9310850 2.0730054 13.956 <2e-16 ***
## ZipCode -0.0002838 0.0001910 -1.486 0.137
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24.34 on 55630 degrees of freedom
## (6292 observations deleted due to missingness)
## Multiple R-squared: 3.967e-05, Adjusted R-squared: 2.169e-05
## F-statistic: 2.207 on 1 and 55630 DF, p-value: 0.1374
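To make the Model 1 coefficients concrete, the following minimal sketch plugs the estimates reported in the summary above into the fitted line by hand. The helper name `predict_rate` is hypothetical, and the two zip codes (10451 and 11418) are simply taken from the first rows of the data for illustration.

```r
# Coefficients copied from the summary(model1) output above
intercept <- 28.9310850
slope_zip <- -0.0002838

# Hypothetical helper: predicted violation rate for a numeric zip code
predict_rate <- function(zip) intercept + slope_zip * zip

# At ZipCode = 0 the prediction is just the intercept (no such zip code exists)
predict_rate(0)

# Prediction for a Bronx zip code that appears in the data
rate_10451 <- predict_rate(10451)
rate_10451

# Predicted difference between a Queens (11418) and a Bronx (10451) zip code
diff_rate <- predict_rate(11418) - predict_rate(10451)
diff_rate
```

The predicted difference between these two zip codes is well under one unit, which is in line with the near-zero R-squared reported for Model 1.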
According to Model 2, holding zip code constant and relative to the reference category (0 - 16 years, the age range that does not appear in the coefficient table), the public health hazard violation rate is 6.66 units lower for age range 0 - 2 years and 1.32 units lower for age range 2 - 5 years. When age range is 3 - 5 years, the rate is 13.89 units higher, and when it is 6 - 16 years, the rate is 25.08 units higher. The increase for the 3 - 5 group may be due to children crossing a significant mental and social growth threshold, as they vocalize their feelings and act out more in this age range; child care providers may therefore not be sufficiently trained or up to date with resources to handle the needs of this age group.
model2 <- lm(Public.Health.Hazard.Violation.Rate ~ ZipCode + Age.Range, data = qualitycc)
summary(model2)
##
## Call:
## lm(formula = Public.Health.Hazard.Violation.Rate ~ ZipCode +
## Age.Range, data = qualitycc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -52.899 -21.246 -5.173 13.281 74.611
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.3838919 2.2049548 16.501 < 2e-16 ***
## ZipCode -0.0008456 0.0001981 -4.269 1.97e-05 ***
## Age.Range0 YEARS - 2 YEARS -6.6634331 0.5452699 -12.220 < 2e-16 ***
## Age.Range2 YEARS - 5 YEARS -1.3249315 0.4737039 -2.797 0.00516 **
## Age.Range3 YEARS - 5 YEARS 13.8904813 0.6880768 20.187 < 2e-16 ***
## Age.Range6 YEARS - 16 YEARS 25.0793438 1.8750443 13.375 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24.19 on 51877 degrees of freedom
## (10041 observations deleted due to missingness)
## Multiple R-squared: 0.02664, Adjusted R-squared: 0.02654
## F-statistic: 283.9 on 5 and 51877 DF, p-value: < 2.2e-16
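The Age.Range coefficients in Model 2 are all differences from a baseline category. As a small sketch of how that baseline is chosen, the snippet below rebuilds the factor from the five labels visible in the output above (the vector `age_labels` is an illustrative stand-in, not the actual column) and shows how `relevel()` could pick a different baseline.

```r
# Age range labels as they appear in the data and in the coefficient table
age_labels <- c("0 YEARS - 2 YEARS", "2 YEARS - 5 YEARS",
                "3 YEARS - 5 YEARS", "6 YEARS - 16 YEARS",
                "0 YEARS - 16 YEARS")

# lm() treats a character predictor as a factor; the levels are sorted
# and the first level becomes the reference category
age <- factor(age_labels)
baseline <- levels(age)[1]
baseline

# relevel() moves a chosen level to the front, making it the new baseline
age_releveled <- relevel(age, ref = "3 YEARS - 5 YEARS")
new_baseline <- levels(age_releveled)[1]
new_baseline
```

Changing the baseline would not change the model's fit, only which group the other coefficients are compared against.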
model3 <- lm(Public.Health.Hazard.Violation.Rate ~ ZipCode*Critical.Violation.Rate + Age.Range + Violation.Status, data = qualitycc)
summary(model3)
##
## Call:
## lm(formula = Public.Health.Hazard.Violation.Rate ~ ZipCode *
## Critical.Violation.Rate + Age.Range + Violation.Status, data = qualitycc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -75.806 -13.711 -3.338 11.255 96.637
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.206e+01 3.786e+00 -3.185 0.00145 **
## ZipCode 2.258e-03 3.456e-04 6.534 6.45e-11 ***
## Critical.Violation.Rate 1.277e+00 7.733e-02 16.511 < 2e-16 ***
## Age.Range0 YEARS - 2 YEARS -6.932e+00 4.861e-01 -14.259 < 2e-16 ***
## Age.Range2 YEARS - 5 YEARS -3.184e+00 4.230e-01 -7.527 5.26e-14 ***
## Age.Range3 YEARS - 5 YEARS 1.156e+01 6.141e-01 18.819 < 2e-16 ***
## Age.Range6 YEARS - 16 YEARS 1.917e+01 1.671e+00 11.470 < 2e-16 ***
## Violation.StatusMORE INFO 3.812e-01 6.600e-01 0.578 0.56349
## Violation.StatusN/A -5.022e+00 2.267e-01 -22.150 < 2e-16 ***
## Violation.StatusOPEN 2.889e+00 7.399e-01 3.904 9.47e-05 ***
## ZipCode:Critical.Violation.Rate -8.173e-05 7.102e-06 -11.508 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21.51 on 51872 degrees of freedom
## (10041 observations deleted due to missingness)
## Multiple R-squared: 0.2307, Adjusted R-squared: 0.2305
## F-statistic: 1555 on 10 and 51872 DF, p-value: < 2.2e-16
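Model 3 is the most complex of the three models and fits the data best: its adjusted R-squared (0.23) is the highest and its residual standard error (21.51) is the lowest. R-squared nevertheless remains relatively low, so most of the variation in public health hazard violation rate is left unexplained.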
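One way to check whether the extra terms in Model 3 improve on Model 2 is an F-test for nested models. Both models were fit to the same 51,883 observations and Model 3 contains every term in Model 2, so the test applies. The sketch below reconstructs it from the rounded values printed in the summaries above, so the result is only approximate; with the fitted objects available, `anova(model2, model3)` would give the exact test. Model 1 is left out because it was fit to a different number of observations.

```r
# Values copied from the summary() output of Model 2 and Model 3 above
rse2 <- 24.19; df2 <- 51877   # residual standard error and residual df, Model 2
rse3 <- 21.51; df3 <- 51872   # residual standard error and residual df, Model 3

# Residual sum of squares = RSE^2 * residual df (approximate: RSE is rounded)
rss2 <- rse2^2 * df2
rss3 <- rse3^2 * df3

# F statistic for the additional terms in Model 3
f_stat <- ((rss2 - rss3) / (df2 - df3)) / (rss3 / df3)

# Upper-tail p-value from the F distribution
p_value <- pf(f_stat, df2 - df3, df3, lower.tail = FALSE)
f_stat
p_value
```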
table <- htmlreg(list(model1, model2, model3), doctype=FALSE)
pander(table)
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 28.93*** | 36.38*** | -12.06** |
| | (2.07) | (2.20) | (3.79) |
| ZipCode | -0.00 | -0.00*** | 0.00*** |
| | (0.00) | (0.00) | (0.00) |
| Age.Range0 YEARS - 2 YEARS | | -6.66*** | -6.93*** |
| | | (0.55) | (0.49) |
| Age.Range2 YEARS - 5 YEARS | | -1.32** | -3.18*** |
| | | (0.47) | (0.42) |
| Age.Range3 YEARS - 5 YEARS | | 13.89*** | 11.56*** |
| | | (0.69) | (0.61) |
| Age.Range6 YEARS - 16 YEARS | | 25.08*** | 19.17*** |
| | | (1.88) | (1.67) |
| Critical.Violation.Rate | | | 1.28*** |
| | | | (0.08) |
| Violation.StatusMORE INFO | | | 0.38 |
| | | | (0.66) |
| Violation.StatusN/A | | | -5.02*** |
| | | | (0.23) |
| Violation.StatusOPEN | | | 2.89*** |
| | | | (0.74) |
| ZipCode:Critical.Violation.Rate | | | -0.00*** |
| | | | (0.00) |
| R² | 0.00 | 0.03 | 0.23 |
| Adj. R² | 0.00 | 0.03 | 0.23 |
| Num. obs. | 55632 | 51883 | 51883 |
| RMSE | 24.34 | 24.19 | 21.51 |
| ***p < 0.001, **p < 0.01, *p < 0.05 | | | |
visreg(model1, "ZipCode", scale = "response")
visreg(model2,"ZipCode", by = "Age.Range", scale = "response")