Preliminary Findings: COVID-19 Student Case Rates Across Texas per School District

Author

B.A. Flores

Introduction

Health disparities appear throughout the literature on pediatric COVID-19 cases: non-Hispanic blacks and Hispanics are more likely to die from COVID-19 than non-Hispanic whites; Hispanics and non-Hispanic blacks have the highest prevalence of underlying conditions, further increasing their risk of death from COVID-19; Hispanics were the majority of decedents both among those who had MIS-C symptoms and among those who did not; and, compared to non-Hispanic whites, Hispanics and non-Hispanic blacks were hospitalized at much higher rates (Artiga et al., 2021; Kim et al., 2020; McCormick et al., 2021; Moreira et al., 2021).

Between February 2020 and July 2020, Hispanic children made up almost half of the pediatric deaths due to COVID-19 during that initial wave (McCormick et al., 2021). These Hispanic children tended to have underlying conditions, with obesity the most prevalent, and their death was almost 9 times more likely than that of children who did not have a comorbidity (McCormick et al., 2021; Moreira et al., 2021). Hispanic children also accounted for 31% of COVID-19 deaths among those with no underlying conditions (McCormick et al., 2021). This runs contrary to the “Hispanic Paradox,” the finding that Hispanics tend to have the same if not better health outcomes than non-Hispanic whites (Hummer et al., 2007; Markides and Coreil, 1986; Markides and Eschbach, 2011; Palloni and Morenoff, 2001; Riosmena et al., 2015), a pattern that holds even for the U.S. infant mortality rate (Perez-Patron & Desalvo, 2019). No such buffer is found in the COVID-19 pandemic. The “Hispanic Paradox” is also known to fade as duration of residence within the U.S. increases (Cho and Hummer, 2001; Landale et al., 2000; Palloni and Arias, 2004). The factors that erode this epidemiological paradox within the U.S. context may be similar to the factors behind the disproportionate effect of the COVID-19 pandemic on Hispanic children.

Several explanations have been offered for the high infection and death rates among Hispanic children, such as obesity, parents working frontline or “viral-contact” jobs, the implications of poverty, and larger households linked to crowding (Moreira et al., 2021; Kim et al., 2020).

Data & Methods

Sample

For this class project I analyze a dataset that synthesizes three sources of data in order to observe, per district, COVID-19 cases among those who were infected while attending public school. The COVID-19 case data come from the Texas Department of State Health Services and cover August 2, 2021 - March 13, 2022 for both students and staff.

I then add school characteristics from the Common Core of Data (CCD) for 2020-2021. These characteristics include student/teacher ratio and type of school, among others.

Finally, data from the National Center for Education Statistics (NCES) provide district-level information on both the children and the parents of those school districts. These added predictors give a more detailed demographic description of the school districts. The NCES data are 5-year estimates covering 2015 – 2019.

Observing these three sources together makes it possible to identify any health disparities regarding COVID-19. The dataset is focused on Bexar County: all public school districts in Bexar County are included, as are those in all immediately surrounding counties.

For the rest of the state of Texas, a stratified random sample was conducted in which school districts outside the Bexar County area were randomly selected, while making sure to include all major metropolitan areas in the state: Austin, Dallas, El Paso, Houston, and San Antonio, the last of which was captured in the Bexar County analysis.
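The sampling step described above can be sketched in base R. This is only an illustration under stated assumptions: the report does not say which strata were used, so the `region` column, the district names, and the number drawn per stratum are all hypothetical.

```r
# Hypothetical sampling frame: 60 made-up districts in 4 made-up strata.
# The actual strata used for this report are not stated in the text.
set.seed(2022)
frame <- data.frame(
  district = paste("District", 1:60),
  region   = rep(c("North", "South", "East", "West"), each = 15),
  stringsAsFactors = FALSE
)

# Draw the same number of districts at random within each stratum
n_per_stratum <- 5
picked <- do.call(rbind, lapply(split(frame, frame$region), function(g) {
  g[sample(nrow(g), n_per_stratum), ]
}))

# Each stratum contributes exactly n_per_stratum districts
print(table(picked$region))
```

Districts that must be included regardless of the draw (here, the Bexar County area and the major metropolitan areas) would be appended to `picked` after sampling.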

Analysis

To begin evaluating the 134 school districts in the dataset, we observe the top 10 school districts with the highest student COVID infection rates.

I calculated the student COVID infection rate by dividing the number of confirmed student cases by the total enrollment of the school district and multiplying the result by 1,000. Expressing student COVID cases as a rate per 1,000 enrolled students allows for easier comparisons among school districts across the state of Texas.

Pearson correlations are then computed to see which of the observed variables have a statistically significant relationship with COVID infection rates among students.

Finally, nested modeling is performed using multiple linear regression models built around the key variable identified in the correlation matrix, each controlling for one of the other observed predictors in the analysis.

The outcome variable for this analysis is the rate of student COVID cases. The key predictor variables, supported by the literature, are: occupants per room of 1.51 or more, to measure overcrowding; the percentage of the school district that is Hispanic, to measure the effect COVID had on this population; the city in which the school district is located; median earnings for workers per school district; and the percentage of the population within those school districts that does not have a high school diploma.

These demographic and socioeconomic measures are examined for associations with the rate of student COVID cases per school district throughout the state of Texas from August 2, 2021 - March 13, 2022.
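As a small worked example of the rate calculation, the sketch below uses three made-up districts. The district names and counts are hypothetical illustration values, not figures from the Texas DSHS data.

```r
# Hypothetical districts; the counts are illustration values only
districts <- data.frame(
  district   = c("District A", "District B", "District C"),
  cases      = c(120, 45, 300),
  enrollment = c(4000, 900, 25000),
  stringsAsFactors = FALSE
)

# Cases per 1,000 enrolled students
districts$rate_per_1000 <- districts$cases / districts$enrollment * 1000

# Rank districts from highest to lowest rate
ranked <- districts[order(-districts$rate_per_1000), ]
print(ranked)
# District B (50 per 1,000) ranks above District A (30) and District C (12),
# even though District C has the most cases in absolute terms.
```

The example shows why a rate is used instead of raw counts: the largest district has the most cases but the lowest rate.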

Variable Recode

Code
library(dplyr)   # %>%, mutate(), select(), arrange(), desc()
library(jtools)  # summ(), used for the regression summaries

TXCOVID2 <- TXCOVIDSCHOOLS %>%
  mutate(
    # COVID student district infection rate (cases per 1,000 enrolled students)
    COVIDrateStudents = (Total.Student.Cases / Total.District.Enrollment.as.of.SeptemBer.29..2021) * 1000,

    # Percent of the child population within the school district that is Hispanic
    Hispanicpercent = (ChildHispanic..Total...CDP05.3 / Children..Total.Population...CDP05.1) * 100,

    # Percent of the population aged 25 years or older within the school district
    # that does not have a high school diploma
    nohspercent = (EDUC.9th.12th.no.diploma...PDP02.5 / EDUC.pop..25yrs.....PDP02.5) * 100
  )

TXCOVID3<- TXCOVID2 %>% 
  select(District.Name, COVIDrateStudents, Hispanicpercent, City...DISTRICT.DATA, Median.earnings.for.workers...PDP03.6,nohspercent, Occupants.Per.Room..1.51.or.more...CDP04.1)

TXCORRELATION<- TXCOVID2 %>% 
  select(COVIDrateStudents, Hispanicpercent, Median.earnings.for.workers...PDP03.6,nohspercent, Occupants.Per.Room..1.51.or.more...CDP04.1)

Results

Descriptives - Top 10 highest COVID student case rates in Texas

Code
TXCOVIDTABLE<- head(arrange(TXCOVID3, desc(COVIDrateStudents)), n=10)
| District.Name | COVIDrateStudents | Hispanicpercent | City...DISTRICT.DATA | Median.earnings.for.workers...PDP03.6 | nohspercent | Occupants.Per.Room..1.51.or.more...CDP04.1 |
|---|---|---|---|---|---|---|
| SOMERSET ISD TOTAL | 434.3835 | 88.786280 | SOMERSET | 35271 | 13.917526 | 55 |
| SOUTH SAN ANTONIO ISD TOTAL | 392.5420 | 95.300593 | SAN ANTONIO | 27162 | 19.625073 | 255 |
| RED LICK ISD TOTAL | 288.0000 | 1.587302 | TEXARKANA | 60268 | 0.620155 | 0 |
| HARLANDALE ISD TOTAL | 275.5387 | 92.805990 | SAN ANTONIO | 28233 | 14.487471 | 225 |
| SOUTHSIDE ISD TOTAL | 253.4113 | 89.299461 | SAN ANTONIO | 31751 | 21.025105 | 105 |
| SOUTHWEST ISD TOTAL | 242.3265 | 84.316506 | SAN ANTONIO | 30943 | 15.623722 | 240 |
| EAST CENTRAL ISD TOTAL | 234.1614 | 67.620183 | SAN ANTONIO | 39534 | 7.849133 | 60 |
| FLORESVILLE ISD TOTAL | 228.5784 | 56.653992 | FLORESVILLE | 47528 | 6.263499 | 160 |
| FLOUR BLUFF ISD TOTAL | 221.3727 | 33.720930 | CORPUS CHRISTI | 40270 | 5.550146 | 0 |
| LAKE TRAVIS ISD TOTAL | 219.0726 | 18.521898 | AUSTIN | 80164 | 2.034760 | 25 |

Several trends appear among the top 10 school districts with the highest student COVID infection rates.

Of those 10 school districts, out of 134 across the state of Texas, 7 are from the San Antonio area.

5 of those 10 school districts have a student population that is over 80% Hispanic.

5 of those 10 school districts have median earnings for workers of less than $36,000.

In 5 of the 10 school districts, over 12% of the population aged 25 years or older did not earn a high school diploma.

For occupants per room of 1.51 or more, 5 of the top 10 school districts have over 100 child respondents living in households with more than 1.5 persons per room.
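The counts above can be checked directly against the top 10 table. The sketch below re-enters the table values by hand, rounded to two decimals; the short column names are chosen here for readability and are not the names used in the dataset.

```r
# Values copied from the top 10 table (percentages rounded to two decimals);
# column names are shortened, hypothetical labels
top10 <- data.frame(
  district = c("SOMERSET", "SOUTH SAN ANTONIO", "RED LICK", "HARLANDALE",
               "SOUTHSIDE", "SOUTHWEST", "EAST CENTRAL", "FLORESVILLE",
               "FLOUR BLUFF", "LAKE TRAVIS"),
  hispanic_pct    = c(88.79, 95.30, 1.59, 92.81, 89.30, 84.32, 67.62, 56.65, 33.72, 18.52),
  median_earnings = c(35271, 27162, 60268, 28233, 31751, 30943, 39534, 47528, 40270, 80164),
  no_hs_pct       = c(13.92, 19.63, 0.62, 14.49, 21.03, 15.62, 7.85, 6.26, 5.55, 2.03),
  crowded         = c(55, 255, 0, 225, 105, 240, 60, 160, 0, 25),
  stringsAsFactors = FALSE
)

# Number of top 10 districts meeting each threshold named in the text
counts <- c(
  hispanic_over_80   = sum(top10$hispanic_pct > 80),
  earnings_under_36k = sum(top10$median_earnings < 36000),
  no_hs_over_12      = sum(top10$no_hs_pct > 12),
  crowded_over_100   = sum(top10$crowded > 100)
)
print(counts)  # each count is 5
```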

Pearson’s Correlation Matrix

To help understand which of the selected variables is most strongly associated with the outcome variable, student COVID infection rates, we examine Pearson correlations using a correlation matrix.

From this analysis we can see that only Occupants Per Room has a statistically significant relationship with the outcome variable at the .05 level. The relationship is negative: as student COVID infection rates go up, the number of students living in households with 1.51 or more persons per room goes down.
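The code behind the correlation matrix is not shown in this report. One possible way to obtain Pearson correlations with p-values is base R's `cor.test()`, sketched below on simulated data. The data frame `toy` and its columns are hypothetical stand-ins, not the `TXCORRELATION` data.

```r
# Simulated stand-in data; values are random and carry no real-world meaning
set.seed(42)
n <- 134
toy <- data.frame(
  crowding = rpois(n, lambda = 40),
  earnings = rnorm(n, mean = 45000, sd = 12000)
)
toy$covid_rate <- 120 - 0.3 * toy$crowding + rnorm(n, sd = 40)

# Pearson correlation of each predictor with the outcome, plus its p-value
predictors <- c("crowding", "earnings")
results <- do.call(rbind, lapply(predictors, function(v) {
  ct <- cor.test(toy$covid_rate, toy[[v]], method = "pearson")
  data.frame(variable = v, r = unname(ct$estimate), p_value = ct$p.value)
}))
print(results)
```

A predictor would be flagged as significant at the .05 level when its `p_value` falls below 0.05.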

Multiple Linear Regression Models

Occupants Per Room X Median Earnings for Workers - Model 1
Code
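model1  <- lm(COVIDrateStudents ~Occupants.Per.Room..1.51.or.more...CDP04.1 + Median.earnings.for.workers...PDP03.6, data = TXCOVID3)

summ(model1)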
Observations 133 (1 missing obs. deleted)
Dependent variable COVIDrateStudents
Type OLS linear regression
F(2,130) 2.07
R² 0.03
Adj. R² 0.02
Est. S.E. t val. p
(Intercept) 119.02 17.81 6.68 0.00
Occupants.Per.Room..1.51.or.more...CDP04.1 -0.01 0.01 -2.03 0.04
Median.earnings.for.workers...PDP03.6 -0.00 0.00 -0.28 0.78
Standard errors: OLS

In model 1 the key predictor, Occupants Per Room, is statistically significant at the .05 level. For every 100-unit increase in Occupants Per Room, we can expect the student COVID infection rate to decrease by about 1 case per 1,000 students, controlling for median earnings for workers per school district.
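When reading a slope from a linear regression, the estimate is the expected change in the outcome for a one-unit increase in that predictor, with the other predictors held fixed. The sketch below illustrates this on simulated data; `toy`, `crowding`, and `earnings` are hypothetical names, and the numbers are not from this report.

```r
# Simulated data for illustration only
set.seed(1)
n <- 133
toy <- data.frame(
  crowding = rpois(n, lambda = 60),
  earnings = rnorm(n, mean = 45000, sd = 10000)
)
toy$covid_rate <- 120 - 0.01 * toy$crowding + rnorm(n, sd = 30)

fit <- lm(covid_rate ~ crowding + earnings, data = toy)
slope <- coef(fit)["crowding"]

# Two hypothetical districts that differ only by 100 units of crowding
base    <- data.frame(crowding = 50,  earnings = 45000)
plus100 <- data.frame(crowding = 150, earnings = 45000)

# The predicted difference in the outcome equals 100 times the slope
change <- predict(fit, plus100) - predict(fit, base)
print(c(change = unname(change), slope_x_100 = unname(100 * slope)))
```

With an estimate of -0.01, as in model 1, a 100-unit increase in the predictor corresponds to a change of 100 × (-0.01) = -1 in the outcome.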

Occupants Per Room X Hispanic Percentage - Model 2
Code
model2  <- lm(COVIDrateStudents ~Occupants.Per.Room..1.51.or.more...CDP04.1 +Hispanicpercent, data = TXCOVID3)

summ(model2)
Observations 133 (1 missing obs. deleted)
Dependent variable COVIDrateStudents
Type OLS linear regression
F(2,130) 2.91
R² 0.04
Adj. R² 0.03
Est. S.E. t val. p
(Intercept) 99.67 13.36 7.46 0.00
Occupants.Per.Room..1.51.or.more...CDP04.1 -0.02 0.01 -2.23 0.03
Hispanicpercent 0.31 0.23 1.31 0.19
Standard errors: OLS

In model 2 the key predictor, Occupants Per Room, was again statistically significant. For every 100-unit increase in Occupants Per Room, we can expect the student COVID infection rate to decrease by about 2 cases per 1,000 students, controlling for the percentage of the school district that is Hispanic.

Occupants Per Room X Percentage without a High School Diploma - Model 3
Code
model3  <- lm(COVIDrateStudents ~Occupants.Per.Room..1.51.or.more...CDP04.1 + nohspercent, data = TXCOVID3)

summ(model3)
Observations 133 (1 missing obs. deleted)
Dependent variable COVIDrateStudents
Type OLS linear regression
F(2,130) 2.10
R² 0.03
Adj. R² 0.02
Est. S.E. t val. p
(Intercept) 118.08 12.14 9.72 0.00
Occupants.Per.Room..1.51.or.more...CDP04.1 -0.01 0.01 -1.91 0.06
nohspercent -0.39 1.07 -0.37 0.71
Standard errors: OLS

In model 3 the key predictor, Occupants Per Room, is not statistically significant at the .05 level (p = 0.06). The significance is lost when the percentage of the population aged 25 years or older within the school district that does not have a high school diploma is added to the model.

Conclusion

The data show that the student population of the San Antonio area was disproportionately affected by COVID in comparison with the rest of the state of Texas. Of the 134 observed school districts, only 15 are from Bexar County, about 11% of the total sample. Yet 7 of the top 10 highest student COVID case rates are found in the San Antonio area.

Of the variables observed, the only one that had a statistically significant relationship with the outcome variable in the correlation matrix was Occupants Per Room of 1.51 or more. This variable held statistical significance in the multiple linear regression models until the percentage of the adult population without a high school diploma within the school district was added in model 3. Only then is the statistically significant relationship between the outcome variable and Occupants Per Room lost.

Although this variable may explain away the relationship, model 3 has an R² of only .03, meaning that the variables in the model explain only about 3% of the variance in the outcome variable. Model 2 had the highest R² at .04, and Occupants Per Room still held its statistical significance at the .05 level.

This analysis shows that other variables in the dataset should be considered when building these regression models.

As a next step, a spatial analysis of the student COVID case rates would help further illustrate any other possible associations.
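For readers less familiar with R², the sketch below computes it by hand as one minus the ratio of residual to total sum of squares, and checks the result against what `summary()` reports for an `lm` fit. The data are simulated and the variable names are hypothetical.

```r
# Simulated data for illustration only
set.seed(7)
n <- 133
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 100 - 2 * toy$x1 + rnorm(n, sd = 20)

fit <- lm(y ~ x1 + x2, data = toy)

# R-squared: share of the outcome's variance explained by the model
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((toy$y - mean(toy$y))^2)
r2 <- 1 - ss_res / ss_tot

# Adjusted R-squared penalizes for the number of predictors (k)
k <- 2
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(c(r2 = r2, adj_r2 = adj_r2))
print(c(summary(fit)$r.squared, summary(fit)$adj.r.squared))  # same values
```

An R² of .03 therefore means the residual sum of squares is still about 97% of the total sum of squares.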

References

Artiga, Samantha, Latoya Hill, and Nambi Ndugga (2021). Racial Disparities in COVID-19 Impacts and Vaccinations for Children. Racial Equity and Health Policy. Kaiser Family Foundation.

Cho, Y. and Hummer, R.A. (2001). Disability Status Differentials Across Fifteen Asian and Pacific Islander Groups and the Effect of Nativity and Duration of Residence in the U.S. Social Biology, 48(3–4): 171–195.

Hummer, R. A., Powers, D. A., Pullum, S. G., Gossman, G. L., & Frisbie, W. P. (2007). Paradox found (again): infant mortality among the Mexican-origin population in the United States. Demography, 44(3), 441–457.

Kim, L., Whitaker, M., O’Halloran, A., Kambhampati, A., Chai, S. J., Reingold, A., Armistead, I., Kawasaki, B., Meek, J., Yousey-Hindes, K., Anderson, E. J., Openo, K. P., Weigel, A., Ryan, P., Monroe, M. L., Fox, K., Kim, S., Lynfield, R., Bye, E., Shrum Davis, S., et al., COVID-NET Surveillance Team (2020). Hospitalization Rates and Characteristics of Children Aged <18 Years Hospitalized with Laboratory-Confirmed COVID-19 - COVID-NET, 14 States, March 1-July 25, 2020. MMWR. Morbidity and mortality weekly report, 69(32), 1081–1088. https://doi.org/10.15585/mmwr.mm6932e3

Landale, N. S., Oropesa, R. S., & Gorman, B. K. (2000). Migration and infant death: Assimilation or selective migration among Puerto Ricans? American Sociological Review, 888–909.

Markides, K. S., & Coreil, J. (1986). The health of Hispanics in the southwestern United States: an epidemiologic paradox. Public health reports, 101(3), 253.

Markides, Kyriakos S. and Karl Eschbach (2011). Hispanic Paradox in Adult Mortality in the United States. Richard G. Rogers and Eileen M. Crimmins (eds.), International Handbook of Adult Mortality. New York: Springer, pp. 225–238.

McCormick, D. W., Richardson, L. C., Young, P. R., Viens, L. J., Gould, C. V., Kimball, A., … & Koumans, E.H. (2021). Deaths in children and adolescents associated with COVID-19 and MIS-C in the United States. Pediatrics, 148(5).

Moreira, A., Chorath, K., Rajasekaran, K. et al. (2021). Demographic predictors of hospitalization and mortality in US children with COVID-19. Eur J Pediatr 180, 1659–1663. https://doi.org/10.1007/s00431-021-03955-x

Palloni, Alberto and Elizabeth Arias (2004). “Paradox Lost: Explaining the Hispanic Adult Mortality Advantage.” Demography, 41, pp. 385–415.

Palloni, Alberto and Jeffrey D. Morenoff (2001). Interpreting the Paradoxical in the “Hispanic Paradox”: Demographic and Epidemiological Approaches. M. Weinstein, A. Hermalin, and M. Stoto (eds.), Population Health and Aging. New York: New York Academy of Sciences, pp. 140–174.

Perez-Patron, & Desalvo, B. (2019). Infant Mortality. In Handbook of Population (pp. 343–354).

Riosmena, Fernando, Bethany G. Everett, Richard G. Rogers, and Jeff A. Dennis (2015). Negative Acculturation and Nothing More? Cumulative Disadvantage and Mortality during the Immigrant Adaptation Process among Latinos in the United States. International Migration Review, 49, pp. 443–478.