Part 2: COVID-19 Student Case Rates Across Texas per School District Continued

Author

B.A. Flores

Published

May 4, 2023

Introduction

This analysis extends the previous study of COVID-19 student case rates across Texas school districts. That study found no truly strong association. Although occupants per room was statistically significant, the actual size of the effect was extremely small, and the models explained little of the variation in the dependent variable. The earlier article, with its statistical findings and a review of the literature, is available here.

One of the biggest findings of that study was the composition of the 10 school districts with the highest COVID-19 student infection rates: the majority were in the San Antonio area. This suggests that, in terms of student COVID-19 cases, the pandemic hit the city of San Antonio, Texas especially hard.

The current analysis examines the same key outcome variable against additional predictor variables to see what may help explain trends across the 134 school districts in the state of Texas between August 2, 2021 and March 13, 2022.

Data & Methods

Sample

For this class research project I analyze a dataset that combines 3 different sources of data on COVID-19 cases among those infected while attending public school, by district. The COVID-19 case data come from the Texas Department of State Health Services and cover August 2, 2021 - March 13, 2022 for both students and staff.

I then add information on school characteristics from the Common Core of Data (CCD) for 2020-2021. These characteristics include student/teacher ratio and type of school, among others.

Finally, district-level data from the National Center for Education Statistics (NCES) were used for both children and parents in those school districts. These added predictors give a more detailed demographic description of the school districts. The NCES data are 5-year estimates covering 2015 – 2019.

Observing these 3 separate sources of data together makes it possible to look for health disparities regarding COVID-19. The data set itself is Bexar County focused: all public school districts in Bexar County are included, as are those in all counties immediately surrounding Bexar.

For the rest of the state of Texas, a stratified random sample was drawn, with school districts outside the Bexar County area selected at random, while making sure to include all major metropolitan areas in the state: Austin, Dallas, El Paso, Houston, and San Antonio (the last already captured in the Bexar County portion).

Analysis

I calculated the student COVID infection rate by dividing the number of confirmed student cases by the total enrollment of the school district and multiplying by 1,000. Expressing student COVID cases as a rate per 1,000 enrolled students allows easier comparisons among school districts across the state of Texas.
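As a quick illustration of the rate calculation described above, here is a minimal sketch using three made-up districts. The district names, case counts, and enrollment figures are hypothetical and are not taken from the study's dataset.

```r
# Hypothetical toy data: three invented districts (not from the study)
districts <- data.frame(
  district   = c("District A", "District B", "District C"),
  cases      = c(120, 45, 900),
  enrollment = c(2400, 500, 30000)
)

# Confirmed student cases per 1,000 enrolled students
districts$rate_per_1000 <- districts$cases / districts$enrollment * 1000

print(districts)
```

In this toy example District C has by far the most cases but the lowest rate, which is why the rate, rather than the raw count, is used for comparisons between districts of different sizes.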

Numerous Pearson correlations are performed to see which of the observed variables have a statistically significant relationship with COVID infection rates among students.

Finally, I conduct multiple linear regression models using the key variables identified from the correlation matrix, controlling for the other observed predictors in the analysis.

The outcome variable for this analysis is the rate of COVID student cases, measured against key predictor variables supported by the literature:
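The following is a minimal sketch of how a set of Pearson correlations with their p-values could be computed in base R with `cor.test()`. The data are simulated and the variable names (`pct_a`, `pct_b`, `covid_rate`) are hypothetical stand-ins, so the numbers it prints say nothing about the actual school districts.

```r
# Illustrative only: simulated data, not the study's dataset
set.seed(42)
n <- 134  # matches the number of districts in the study; values are made up

toy <- data.frame(
  pct_a = runif(n, 0, 40),  # hypothetical predictor related to the outcome
  pct_b = runif(n, 0, 60)   # hypothetical predictor unrelated to the outcome
)
toy$covid_rate <- 120 - 0.8 * toy$pct_a + rnorm(n, sd = 10)

# Run one Pearson correlation test per predictor against the outcome
predictors <- c("pct_a", "pct_b")
results <- do.call(rbind, lapply(predictors, function(v) {
  ct <- cor.test(toy$covid_rate, toy[[v]], method = "pearson")
  data.frame(variable = v, r = unname(ct$estimate), p_value = ct$p.value)
}))

print(results)
```

Each row reports the correlation coefficient and its p-value for one predictor, which is the information a correlation matrix with significance markers summarizes.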

the percent of the child population that is foreign born, the percent of homes that were built before the year 1970, the percent of children that are not US citizens, the percent of the child population that is non-Hispanic black, the percentage of the child population that is American Indian/Alaskan, the percentage of renters that pay a monthly rent of less than $1000, the percentage of the child population that is non-Hispanic white, the percentage of workers currently in the labor force who do not have health insurance, the percentage of the parents who are 25 years of age and older who did not finish high school, and median earnings for workers.

These demographic and socio-economic measures are examined for associations with the rate of COVID cases among students per school district throughout the state of Texas from August 2, 2021 - March 13, 2022.

Variable Recode

library(dplyr)  # provides %>%, mutate(), and select()

TXCOVID2 <- TXCOVIDSCHOOLS %>%

mutate(
COVIDrateStudents=(Total.Student.Cases/Total.District.Enrollment.as.of.SeptemBer.29..2021)*1000, #COVID.Student.District.infection.rate.

childforeign= (ChildPlace.of.Birth..Foreign...CDP02.5.PLACE.OF.BIRTH/ChildPlace.of.Birth..Total.Population...CDP02.5.PLACE.OF.BIRTH)*100,
 #Percent.children.thats.foreign.born

homesbefore1970= (
Year.Built..1960.to.1969...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1950.to.1959...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1940.to.1949...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1939.or.earlier...CDP04.2.YEAR.STRUCTURE.BUILT)/
(Year.Built..2014.or.later...CDP04.2.YEAR.STRUCTURE.BUILT   +Year.Built..2010.to.2013...CDP04.2.YEAR.STRUCTURE.BUILT    +Year.Built..2000.to.2009...CDP04.2.YEAR.STRUCTURE.BUILT    +Year.Built..1990.to.1999...CDP04.2.YEAR.STRUCTURE.BUILT    +Year.Built..1980.to.1989...CDP04.2.YEAR.STRUCTURE.BUILT    +Year.Built..1970.to.1979...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1960.to.1969...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1950.to.1959...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1940.to.1949...CDP04.2.YEAR.STRUCTURE.BUILT
+Year.Built..1939.or.earlier...CDP04.2.YEAR.STRUCTURE.BUILT)*100, 
#Percent of those homes that were built earlier than 1970

childnotcitizen= (ChildUS.Citizen..Not.a.US.citizen....CDP02.6.US.CITIZENSHIP.STATUS/Children..Total.Population...CDP05.1)*100,
#Percentage of the child population that is not a US Citizen

percentchildnhblack= (ChildsRace..Black.AA...CDP05.2/Children..Total.Population...CDP05.1)*100,
#Percent of the child population that is non-Hispanic black

percentparentnhblack= (ADTRace..Black.AA...PDP05.2/ADT.Total.Pop...PDP05.1)*100, 
#Percent of the parent population that is non-Hispanic black

percentchildwhite= (ChildsRace..White...CDP05.2/Children..Total.Population...CDP05.1)*100,
#Percent of the child population that is non-Hispanic white

percentamindian= (ChildsRace..American.Ind..Alaskan...CDP05.2/Children..Total.Population...CDP05.1)*100,
#Percent of the child population that is American Indian / Alaskan

nohsparent= ((EDUC..HS...PDP02.5+EDUC.9th.12th.no.diploma...PDP02.5)/EDUC.pop..25yrs.....PDP02.5)*100,
#Percent of the parents age 25 years and older that did not finish high school

workernotinsured= (Employ.No.HLTH.COV...PDP03.7/Employ.Stat.Employed...PDP03.1)*100,
#Percent of parents employed in the laborforce that does not have health insurance 

rentunder1000= ((Gross.Rent...500GROSS.RENT +
                 Gross.Rent..500.999GROSS.RENT)
/Occupied.units.paying.rentGROSS.RENT)*100)
#Percentage of renters that are paying under 1000 monthly

TXCOVID3<- TXCOVID2 %>% 
  select(District.Name, COVIDrateStudents, childforeign, homesbefore1970, Gross.Rent..MedianGROSS.RENT, percentchildnhblack, nohsparent, workernotinsured, percentamindian, Median.earnings.for.workers...PDP03.6, childnotcitizen, percentchildwhite, rentunder1000)

TXCORRELATION<- TXCOVID2 %>% 
  select(COVIDrateStudents, childforeign, homesbefore1970, percentchildnhblack, percentamindian, rentunder1000, childnotcitizen, percentchildwhite, workernotinsured)

Pearson’s Correlation Matrix

Looking at the Pearson correlations, it is immediately apparent that there are many more statistically significant results than in the last study of COVID-19 student case rates per school district. The strongest relationship is between those COVID-19 rates and the percentage of the child population that is non-Hispanic black. The data suggest that school districts with higher percentages of non-Hispanic black children tend to have lower COVID-19 student case rates. This negative relationship is moderately weak, with a p-value of .01.

The second-strongest relationship is with the percentage of the child population that is not a US citizen. This too is a negative relationship: as COVID-19 student case rates increase, the percentage of the child population in those school districts without US citizenship decreases. This relationship is statistically significant at the .05 level.

Marginally significant relationships were also found in the correlation matrix. There is a negative relationship with the percentage of the child population that is foreign born, suggesting that as COVID-19 student case rates increase, the percentage of students who are foreign born decreases. There is a positive relationship with the percentage of the child population that is non-Hispanic white, suggesting that as COVID-19 student case rates per school district increase, so does the percentage of non-Hispanic white children in those school districts.

Multiple Linear Regression Models

Models focused on non-Hispanic black children as main predictor
summ(modelchildall)
Observations          133 (1 missing obs. deleted)
Dependent variable    COVIDrateStudents
Type                  OLS linear regression

F(6,126)    2.50
R²          0.11
Adj. R²     0.06

                                          Est.    S.E.   t val.      p
(Intercept)                             180.34   40.62     4.44   0.00
percentchildnhblack                      -1.62    0.60    -2.68   0.01
workernotinsured                         -2.39    1.15    -2.09   0.04
homesbefore1970                           0.06    0.38     0.17   0.87
nohsparent                                0.39    1.12     0.35   0.73
rentunder1000                             0.32    0.34     0.94   0.35
Median.earnings.for.workers...PDP03.6    -0.00    0.00    -1.30   0.20

Standard errors: OLS

In this multiple linear regression model, the key predictor, the percentage of children per school district who are non-Hispanic black, remained statistically significant at the .01 level.

Controlling for all other variables in the model, for every 10 percentage point increase in the share of the child population that is non-Hispanic black, we can expect the COVID-19 student infection rate to decrease by about 16.2 cases per 1,000 enrolled students per school district in Texas.
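To make the direction of this interpretation concrete, here is a small sketch. The first part simply multiplies the reported coefficient by a 10-point change in the predictor. The second part fits a model to simulated data (hypothetical variables `pct_black` and `pct_uninsured`, not the study's dataset) to show that, holding the other predictor fixed, the change in the predicted outcome equals the coefficient times the change in the predictor.

```r
# Part 1: arithmetic on the reported estimate for percentchildnhblack
b_reported <- -1.62                  # change in cases per 1,000 per 1 point
reported_change <- b_reported * 10   # change for a 10-point difference
print(reported_change)

# Part 2: same logic on simulated data (illustrative only)
set.seed(1)
n <- 133
toy <- data.frame(
  pct_black     = runif(n, 0, 40),
  pct_uninsured = runif(n, 5, 35)
)
toy$covid_rate <- 180 - 1.6 * toy$pct_black -
  2.4 * toy$pct_uninsured + rnorm(n, sd = 40)

fit <- lm(covid_rate ~ pct_black + pct_uninsured, data = toy)

# Two hypothetical districts that differ only in pct_black, by 10 points
low  <- data.frame(pct_black = 10, pct_uninsured = 20)
high <- data.frame(pct_black = 20, pct_uninsured = 20)
diff_pred <- unname(predict(fit, high) - predict(fit, low))

# The predicted difference equals 10 times the fitted coefficient
print(c(diff_pred, 10 * unname(coef(fit)["pct_black"])))
```

The outcome (the case rate) is what changes in response to the predictor, not the other way around.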

The R-squared is 0.11, meaning that the predictors in this model together explain about 11% of the variance in the outcome variable.
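As a rough sketch of what R-squared measures, the code below computes it by hand on simulated data (not the study's dataset) and compares it to the value reported by `summary()`. It then checks that the R² and adjusted R² in the table above are consistent with the reported F statistic, using the standard identities for an OLS model with an intercept.

```r
# Illustrative only: simulated data
set.seed(7)
n <- 133
x <- runif(n, 0, 40)
y <- 150 - 1.5 * x + rnorm(n, sd = 50)
fit <- lm(y ~ x)

# R-squared = 1 - (residual sum of squares / total sum of squares)
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((y - mean(y))^2)
r2_manual <- 1 - ss_res / ss_tot
print(c(manual = r2_manual, from_summary = summary(fit)$r.squared))

# Consistency check on the reported table: F(6,126) = 2.50
f_stat   <- 2.50
k        <- 6     # number of predictors
df_resid <- 126   # residual degrees of freedom
r2_from_f <- (f_stat * k) / (f_stat * k + df_resid)
adj_r2    <- 1 - (1 - r2_from_f) * (k + df_resid) / df_resid
print(round(c(R2 = r2_from_f, adj_R2 = adj_r2), 2))
```

The second part recovers the 0.11 and 0.06 shown in the regression output from the F statistic alone.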

Models focused on children that are not US citizens as main predictor
summ(modelUSall)
Observations          133 (1 missing obs. deleted)
Dependent variable    COVIDrateStudents
Type                  OLS linear regression

F(6,126)    1.35
R²          0.06
Adj. R²     0.02

                                          Est.    S.E.   t val.      p
(Intercept)                             140.93   38.55     3.66   0.00
childnotcitizen                          -2.30    2.87    -0.80   0.42
workernotinsured                         -2.13    1.23    -1.73   0.09
homesbefore1970                          -0.00    0.39    -0.01   0.99
nohsparent                                0.67    1.16     0.58   0.56
rentunder1000                             0.47    0.37     1.25   0.21
Median.earnings.for.workers...PDP03.6    -0.00    0.00    -0.56   0.57

Standard errors: OLS

In the multiple linear regression model, the percentage of children who are not US citizens loses the statistical significance it had in its one-on-one relationship with the outcome variable.

This suggests that the association between the percentage of the child population per school district without US citizenship and the COVID-19 student case rate may be explained away by the other variables in the model: poor access to health insurance, living in older homes, not understanding the available resources due to low educational attainment, and/or economic strain due to low income.

The R-squared for the model is 0.06, meaning that the model explains about 6% of the variance in the outcome variable.

None of the predictors in the model is statistically significant at the .05 level, so this model provides no statistically reliable findings.
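The general mechanism by which a one-on-one association can disappear once other variables are controlled for can be sketched with simulated data. In this hypothetical setup, a third variable `z` drives both the predictor `x` and the outcome `y`, while `x` has no direct effect on `y`. None of these variables come from the study's dataset, and the sketch does not show that this is what happened in the model above.

```r
# Illustrative only: simulated confounding, not the study's data
set.seed(123)
n <- 133
z <- rnorm(n)                 # hypothetical confounder
x <- z + rnorm(n, sd = 0.5)   # predictor of interest, driven by z
y <- -2 * z + rnorm(n)        # outcome depends on z only, not on x

bivariate <- lm(y ~ x)        # one-on-one relationship
adjusted  <- lm(y ~ x + z)    # same relationship, controlling for z

# Estimate and p-value for x in each model
print(coef(summary(bivariate))["x", c("Estimate", "Pr(>|t|)")])
print(coef(summary(adjusted))["x", c("Estimate", "Pr(>|t|)")])
```

In the bivariate model `x` appears strongly related to `y`; once `z` is included, the estimate for `x` shrinks toward zero because `z` accounts for the shared variation.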

Conclusion

With continued observation, the percentage of the child population that is non-Hispanic black showed the strongest and most statistically significant results of the models presented in this study and in the previous study, Part 1.

The regression models presented in Part 1 of this research (see the link provided) each had about half the R-squared of the model shown here in Part 2.

Despite the literature reporting that Hispanics and non-Hispanic blacks, adults and children alike, tended to be hit the hardest during the COVID-19 pandemic for various reasons (Artiga et al., 2021; Kim et al., 2020; McCormick et al., 2021; Moreira et al., 2021), this study of the state of Texas found that the only racial or ethnic group with a positive association with COVID-19 student case rates by school district was non-Hispanic white children, measured as a percentage of the child population.

From the data, it seems that from August 2, 2021 - March 13, 2022, Hispanic children and children who are foreign born (who may also be less likely to have US citizenship than their non-Hispanic white counterparts) fared relatively well during this period of the pandemic in the state of Texas, judging by school district data.

Even more surprising was the statistically significant association showing that school districts with higher non-Hispanic black student populations tended to have lower reported COVID-19 student case rates.

Overall, across these school districts in the state of Texas, the data show few trends indicating true racial disparities in COVID-19 student cases per school district during August 2, 2021 - March 13, 2022.

However, as found in Part 1 of this continued study, these racial health disparities may be more spatial in nature: among the Top 10 COVID-19 student case rates across the 134 school districts, 7 of the 10 school districts were in the San Antonio area.

Further research must be conducted to spatially analyze this data to see if any clustering can be found among different racial minority groups across the state of Texas per school district.

References

Artiga, S., Hill, L., & Ndugga, N. (2021). Racial Disparities in COVID-19 Impacts and Vaccinations for Children. Racial Equity and Health Policy. Kaiser Family Foundation.

Kim, L., Whitaker, M., O’Halloran, A., Kambhampati, A., Chai, S. J., Reingold, A., Armistead, I., Kawasaki, B., Meek, J., Yousey-Hindes, K., Anderson, E. J., Openo, K. P., Weigel, A., Ryan, P., Monroe, M. L., Fox, K., Kim, S., Lynfield, R., Bye, E., Shrum Davis, S., et al., COVID-NET Surveillance Team (2020). Hospitalization Rates and Characteristics of Children Aged <18 Years Hospitalized with Laboratory-Confirmed COVID-19 - COVID-NET, 14 States, March 1-July 25, 2020. MMWR. Morbidity and mortality weekly report, 69(32), 1081–1088. https://doi.org/10.15585/mmwr.mm6932e3

McCormick, D. W., Richardson, L. C., Young, P. R., Viens, L. J., Gould, C. V., Kimball, A., … & Koumans, E.H. (2021). Deaths in children and adolescents associated with COVID-19 and MIS-C in the United States. Pediatrics, 148(5).

Moreira, A., Chorath, K., Rajasekaran, K., et al. (2021). Demographic predictors of hospitalization and mortality in US children with COVID-19. Eur J Pediatr, 180, 1659–1663. https://doi.org/10.1007/s00431-021-03955-x