Introduction and Methods

An Exploratory Data Report

Purpose Statement

The purpose of this cross-sectional study is to understand how Covid-19 deaths vary by ethnicity in the United States. Data has shown that the Latinx community, for example, has been disproportionately affected by Covid-19, with the percentage of positive cases within this subpopulation surpassing the proportion of Latinx people in the United States (Macias Gil et al., 2020; Gross et al., 2020). However, the most recently published papers that delve into these topics were approved in November 2020, meaning that data from November 2020 to March 2021 has yet to be analyzed.

Using data from the Centers for Disease Control and Prevention (CDC) on Covid-19 deaths in the United States from October 2020 to March 2021, this research focuses on the exploratory analysis of (1) deaths per ethnicity/race as proportional to the population percentages in the United States, (2) deaths per ethnicity/race accounting for confounding demographics and factors, and (3) a cross-comparison of deaths per confounding demographics and factors across the ethnicities/races with the highest deaths.

The final deliverables include an analysis of which ethnicities/races are dying at a higher proportion, a discussion of how factors and demographics confound within each ethnicity/race, and a brief cross-comparison summary of deaths for the White, Black, and Latino/Hispanic populations in the United States.

Citations and Data Sets

Gross, Cary P., Utibe R. Essien, Saamir Pasha, Jacob R. Gross, Shi-yi Wang, and Marcella Nunez-Smith. “Racial and Ethnic Disparities in Population-Level Covid-19 Mortality.” Journal of General Internal Medicine 35, no. 10 (2020): 3097–99. https://doi.org/10.1007/s11606-020-06081-w.

Macias Gil, Raul, Jasmine R Marcelin, Brenda Zuniga-Blanco, Carina Marquez, Trini Mathew, and Damani A Piggott. “COVID-19 Pandemic: Disparate Health Impact on the Hispanic/Latinx Population in the United States.” The Journal of Infectious Diseases 222, no. 10 (2020): 1592–95. https://doi.org/10.1093/infdis/jiaa474.

CDC Data Set

United States Census

Methods

This exploratory analysis uses RStudio and CDC data on Covid-19 deaths from October 2020 to March 2021. The data was tidied and wrangled within RStudio for consistency. Additionally, the flexdashboard package was used to create this interface, the plotly package to create the interactive graphics, and several packages such as dplyr, tidyr, yarrr, and tidyverse for data manipulation. No other programming languages or analytics software were used in the creation of this final deliverable.

To begin the project, data was imported into R and wrangled. Subsequently, the original data set was subdivided into the three Race/Ethnicity subdivisions utilized throughout the dashboard: White, Black, and Hispanic/Latino. Within each category, comorbidity, age group, and sex (male or female) were utilized for cross-comparison analysis.
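The subdivision step can be illustrated with a short sketch. It is written in Python purely for illustration (the dashboard itself was built in R), and the field names `race_ethnicity`, `age_group`, and `deaths`, as well as the sample rows, are hypothetical placeholders rather than the actual layout of the CDC file.

```python
from collections import defaultdict

# Hypothetical rows: the real CDC export uses different column names
# and contains many more records.
SAMPLE_ROWS = [
    {"race_ethnicity": "White", "age_group": "80+ Years", "deaths": 5},
    {"race_ethnicity": "White", "age_group": "70-79 Years", "deaths": 3},
    {"race_ethnicity": "Black", "age_group": "80+ Years", "deaths": 2},
    {"race_ethnicity": "Hispanic/Latino", "age_group": "80+ Years", "deaths": 4},
    {"race_ethnicity": "White", "age_group": "80+ Years", "deaths": 1},
]


def deaths_by_factor(rows, group, factor):
    """Sum deaths per level of `factor` for a single race/ethnicity group."""
    totals = defaultdict(int)
    for row in rows:
        if row["race_ethnicity"] == group:
            totals[row[factor]] += row["deaths"]
    return dict(totals)


white_by_age = deaths_by_factor(SAMPLE_ROWS, "White", "age_group")
print(white_by_age)
```

The same helper could be called with a sex or comorbidity field to build the other per-group tables.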

As data visualization outputs, bar graphs were created for (1) a general understanding of deaths across the ethnicities, (2) deaths per ethnicity/race as proportional to the population percentages in the United States, (3) deaths within each ethnicity per age group, sex, and comorbidity, and (4) a comparison of deaths across the ethnicities per factor. Additionally, a dygraph was created as a visual output for the deaths across ethnicities/races through time.

As statistical outputs, chi-square tests were used throughout to assess whether the differences in comorbidities and sex were statistically significant in the White, Black, and Hispanic/Latino communities, respectively. ANOVA tests were also attempted to establish the statistical significance of the age group divisions. However, because the CDC data is divided by age “group” rather than by exact age, those tests were not possible on that subset of the data.

For each deliverable, which is visually represented as a new tab in the Dashboard, a brief discussion of the data and the statistical tests has been given. Although not exhaustive, the purpose of this discussion is to begin to peel back the complex factors that affect Covid-19 deaths across race/ethnicity in the United States.

A model of the Sars-Cov-2 Virus

253,623

Deaths Per Ethnicity/Race

Figure 1: Covid-19 Deaths per Ethnicity. Oct 2020-Mar 2021

Figure 2: Covid-19 Deaths proportional to the Population

Discussion

Looking at both Figure 1 and Figure 2, there are clear differences between the raw number of deaths and the proportional number of deaths due to Covid-19 in the United States. Figure 1 showcases the number of deaths per ethnicity/race for 8 distinct categories. The White population has the highest overall count with 115,152 deaths. Hispanic/Latino comes in second with 33,505, while Black comes in third with 18,474 deaths. However, these groups do not make up equal shares of the United States population. As a result, they cannot be directly compared until the deaths have been adjusted to the proper proportions. Figure 2 adjusts the count by the number of people within each ethnicity/race in the United States. Surprisingly, the category of Multiple/Other comes out on top, with 0.11% of this population dying, compared with 0.07% of the White population and 0.06% of the Latino/Hispanic population. Further chi-square analysis cannot be performed on the percentages given the formatting of the CDC data.
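The adjustment behind Figure 2 amounts to dividing each group's deaths by that group's population. The sketch below shows the arithmetic in Python (an illustration only, not the dashboard's R code). The function name is hypothetical, and the population figure in the example is a made-up round number, not a Census value.

```python
def deaths_as_percent_of_population(deaths, population):
    """Return deaths expressed as a percentage of the group's population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * deaths / population


# Hypothetical group: 700 deaths in a population of 1,000,000 people.
example = deaths_as_percent_of_population(700, 1_000_000)
print(f"{example:.2f}%")
```

Because the result depends entirely on the population denominator, small groups can show a high percentage even when their raw death count is low.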

With this information and without a statistical test, the original assumption that the share of deaths in the Latinx population would surpass the proportion of Hispanic/Latino people in the United States, much as it did for positive cases of Covid-19, cannot be proven. From the visual output alone, it can be suggested that the White population and the Hispanic/Latino population are in fact dying at similar rates. However, the Multiple/Other category is dying at a faster rate than the other ethnicities, although the difference is not necessarily statistically significant. Further analysis needs to be done on the confounding factors behind this result.

Washing Your Hands: A Security Measure to Prevent Covid-19 Infection

Race: White

Figure 1: Per Age Group

Figure 2: Per Sex

Figure 3: Per Comorbidity

Discussion

Chi-square outputs:
Age group: X(8, N = 115142) = 273087.26, p < 0.01 (2-tailed)
Sex: X(2, N = 114945) = 58653.93, p < 0.01 (2-tailed)
Comorbidity: X(1, N = 18173) = 10459.55, p < 0.01 (2-tailed)

Focusing only on the White population subgroup of the data, these graphs and statistical outputs are aimed at understanding the confounding factors of age group, sex (male or female), and comorbidity.

Figure 1 visually suggests that within the White population, people aged 80+ are dying at a higher rate than other age groups. A one-sample chi-square test concludes that the difference between the age groups with respect to death counts in the White population is statistically significant (X(8, N = 138350) = 329801.43, p < 0.01 (2-tailed)). Looking at the table below, the age group 80+ contributes the most to the chi-square statistic, statistically suggesting that people 80+ within the White population are dying at a higher rate.

Age Group Deaths
0-9 Years 34
10-19 Years 62
20-29 Years 302
30-39 Years 683
40-49 Years 1861
50-59 Years 6220
60-69 Years 18332
70-79 Years 37081
80+ Years 73775
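How much each age group contributes to the chi-square statistic can be checked directly from the table above. The sketch below is in Python for illustration only (the dashboard used R) and assumes the null hypothesis of equal expected counts in every age group; the function and variable names are hypothetical.

```python
# Death counts copied from the White age-group table above.
WHITE_DEATHS_BY_AGE = {
    "0-9 Years": 34,
    "10-19 Years": 62,
    "20-29 Years": 302,
    "30-39 Years": 683,
    "40-49 Years": 1861,
    "50-59 Years": 6220,
    "60-69 Years": 18332,
    "70-79 Years": 37081,
    "80+ Years": 73775,
}


def chi_square_contributions(observed):
    """Per-category (O - E)^2 / E, assuming equal expected counts."""
    total = sum(observed.values())
    expected = total / len(observed)
    return {
        category: (count - expected) ** 2 / expected
        for category, count in observed.items()
    }


contributions = chi_square_contributions(WHITE_DEATHS_BY_AGE)
statistic = sum(contributions.values())
largest = max(contributions, key=contributions.get)
print(largest, round(statistic, 2))
```

Under that equal-counts assumption, the summed contributions agree with the statistic quoted in the paragraph above (329801.43), and the 80+ group is the largest single contributor.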

Figure 2 visually suggests that within the White population, males are dying at a higher rate than females. A one-sample chi-square test concludes that the difference between males and females with respect to death counts in the White population is statistically significant (X(2, N = 114945) = 58653.93, p < 0.01 (2-tailed)). Looking at the table below, males contribute the most to the chi-square statistic, statistically suggesting that males within the White population are dying at a higher rate.

Sex Deaths
Male 62241
Female 52702

Figure 3 visually suggests that within the White population, people with comorbidities are dying at a higher rate than people who do not have comorbidities. A one-sample chi-square test concludes that the difference between people with comorbidities and people with no comorbidities with respect to death counts in the White population is statistically significant (X(1, N = 18173) = 10459.55, p < 0.01 (2-tailed)). Looking at the table below, people with comorbidities contribute the most to the chi-square statistic, statistically suggesting that people with comorbidities within the White population are dying at a higher rate.

Co-morbidities Deaths
Yes 15980
No 2193

In conclusion, based on both the visual and statistical outputs, it appears that within the White population, people 80+, males, and people with comorbidities are dying at a higher rate than their counterparts.

Race: Black

Figure 4: Per Age Group

Figure 5: Per Gender

Figure 6: Per Previous Comorbidity

Discussion

Chi-square outputs:
Age group: X(8, N = 18473) = 22012.63, p < 0.01 (2-tailed)
Sex: X(2, N = 18446) = 9288.98, p < 0.01 (2-tailed)
Comorbidity: X(1, N = 3718) = 2466.05, p < 0.01 (2-tailed)

Focusing only on the Black population subgroup of the data, these graphs and statistical outputs are aimed at understanding the confounding factors of age group, sex (male or female), and comorbidity.

Figure 4 visually suggests that within the Black population, people aged 80+ are dying at a higher rate than other age groups. A one-sample chi-square test concludes that the difference between the age groups with respect to death counts in the Black population is statistically significant (X(8, N = 18473) = 22012.63, p < 0.01 (2-tailed)). Looking at the table below, the age group 80+ contributes the most to the chi-square statistic, statistically suggesting that people 80+ within the Black population are dying at a higher rate.

Age Group Deaths
0-9 Years 13
10-19 Years 27
20-29 Years 146
30-39 Years 305
40-49 Years 714
50-59 Years 2075
60-69 Years 4267
70-79 Years 5094
80+ Years 5832

Figure 5 visually suggests that within the Black population, males are dying at a higher rate than females. A one-sample chi-square test concludes that the difference between males and females with respect to death counts in the Black population is statistically significant (X(2, N = 18446) = 9288.98, p < 0.01 (2-tailed)). Looking at the table below, males contribute the most to the chi-square statistic, statistically suggesting that males within the Black population are dying at a higher rate.

Sex Deaths
Male 9683
Female 8762

Figure 6 visually suggests that within the Black population, people with comorbidities are dying at a higher rate than people who do not have comorbidities. A one-sample chi-square test concludes that the difference between people with comorbidities and people with no comorbidities with respect to death counts in the Black population is statistically significant (X(1, N = 3718) = 2466.05, p < 0.01 (2-tailed)). Looking at the table below, people with comorbidities contribute the most to the chi-square statistic, statistically suggesting that people with comorbidities within the Black population are dying at a higher rate. However, looking at the visual output, it is also important to notice that in this population, 14,756 people did not have their comorbidities accounted for.

Co-morbidities Deaths
Yes 3373
No 345
Not recorded 14756

In conclusion, based on both the visual and statistical outputs, it appears that within the Black population, people 80+, males, and people with comorbidities are dying at a higher rate than their counterparts.

Ethnicity: Hispanic/Latino

Figure 7: Per Age Group

Figure 8: Per Gender

Figure 9: Per Previous Comorbidity

Discussion

Chi-square outputs:
Age group: X(8, N = 33502) = 32532.14, p < 0.01 (2-tailed)
Sex: X(1, N = 33450) = 1785.41, p < 0.01 (2-tailed)
Comorbidity: X(1, N = 2618) = 16.25, p < 0.01 (2-tailed)

Focusing only on the Hispanic/Latino population subgroup of the data, these graphs and statistical outputs are aimed at understanding the confounding factors of age group, sex (male or female), and comorbidity.

Figure 7 visually suggests that within the Hispanic/Latino population, people aged 80+ are dying at a higher rate than other age groups. A one-sample chi-square test concludes that the difference between the age groups with respect to death counts in the Hispanic/Latino population is statistically significant (X(8, N = 33502) = 32532.14, p < 0.01 (2-tailed)). Looking at the table below, the age group 80+ contributes the most to the chi-square statistic, statistically suggesting that people 80+ within the Hispanic/Latino population are dying at a higher rate.

Age Group Deaths
0-9 Years 19
10-19 Years 45
20-29 Years 274
30-39 Years 772
40-49 Years 2074
50-59 Years 4806
60-69 Years 8001
70-79 Years 8307
80+ Years 9204

Figure 8 visually suggests that within the Hispanic/Latino population, males are dying at a higher rate than females. A one-sample chi-square test concludes that the difference between males and females with respect to death counts in the Hispanic/Latino population is statistically significant (X(1, N = 33450) = 1785.41, p < 0.01 (2-tailed)). Looking at the table below, males contribute the most to the chi-square statistic, statistically suggesting that males within the Hispanic/Latino population are dying at a higher rate.

Sex Deaths
Male 20589
Female 12861
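The sex statistic for this group can be reproduced from the two counts in the table above. The sketch below is in Python for illustration only (the dashboard used R); it assumes a one-sample (goodness-of-fit) chi-square test against equal expected counts for males and females, and the function names are hypothetical.

```python
import math


def one_sample_chi_square(counts):
    """Goodness-of-fit statistic against equal expected counts.

    Returns the statistic and its degrees of freedom.
    """
    total = sum(counts)
    expected = total / len(counts)
    statistic = sum((count - expected) ** 2 / expected for count in counts)
    return statistic, len(counts) - 1


def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


male, female = 20589, 12861  # counts from the Hispanic/Latino sex table
stat, df = one_sample_chi_square([male, female])
print(f"X({df}, N = {male + female}) = {stat:.2f}, p = {p_value_df1(stat):.3g}")
```

With these inputs the statistic matches the value quoted in the paragraph above (1785.41 with 1 degree of freedom). The closed-form p-value used here only holds for 1 degree of freedom; tests with more categories need the general chi-square distribution.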

Figure 9 visually suggests that within the Hispanic/Latino population, people with comorbidities are dying at a higher rate than people who do not have comorbidities. A one-sample chi-square test concludes that the difference between people with comorbidities and people with no comorbidities with respect to death counts in the Hispanic/Latino population is statistically significant (X(1, N = 2618) = 16.25, p < 0.01 (2-tailed)). Looking at the table below, people with comorbidities contribute the most to the chi-square statistic, statistically suggesting that people with comorbidities within the Hispanic/Latino population are dying at a higher rate.

Co-morbidities Deaths
Yes 2454
No 165

In conclusion, based on both the visual and statistical outputs, it appears that within the Hispanic/Latino population, people 80+, males, and people with comorbidities are dying at a higher rate than their counterparts.

Confounding Factors Across Ethnicities

Figure 10: Deaths by Age Group Across Ethnicities

Figure 11: Deaths by Sex Across Ethnicities

Figure 12: Deaths by Comorbidity Across Ethnicities

Discussion

This section is meant as a visual output that puts together all the previous graphics into one for comparison across the ethnicities/races (White, Black, Hispanic/Latino) based on Age Group, Sex, and Comorbidities.

As seen in the previous data outputs, all three races/ethnicities have the highest death rate among people in the 80+ age group (see Figure 10). At the same time, all three races/ethnicities have a higher death rate among males than among females (see Figure 11). Finally, all three races/ethnicities have a higher death rate among people with comorbidities than among people without comorbidities (see Figure 12). In this last visual output, though, it is relevant to point out that the Black population and the Hispanic/Latino population have a higher number of deaths with no output for comorbidity; 4,901 Black people and 10,277 Hispanic/Latino people died without the CDC data recording whether comorbidities were present. While this exploratory data analysis does not look for a cause behind this statistic, there is justification for further analysis.

A chi-square test for independence on each of these factors across ethnicities suggests that the results are statistically significant (by age group and ethnicity: X(16, N = 65981) = 5782.43, p < 0.01 (2-tailed); by sex and ethnicity: X(4, N = 65878) = 257.66, p < 0.01 (2-tailed); by comorbidity and ethnicity: X(2, N = 10379) = 13.71, p < 0.01 (2-tailed)) and that certain populations, based on these factors, are dying at higher rates than others. However, given that this data is not proportional to population percentages, the chi-square result for White males with comorbidities is statistically significant but not meaningful.
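For readers unfamiliar with the test of independence, the general computation on a contingency table can be sketched as follows. This is a Python illustration, not the dashboard's R code. The table uses the Male/Female counts listed in the three earlier tabs, and the result is not expected to match the statistics reported above, which were computed on a different table layout (for example, df = 4 for sex). The function name is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table.

    `table` is a list of rows; each row is a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: White, Black, Hispanic/Latino. Columns: Male, Female.
SEX_BY_ETHNICITY = [
    [62241, 52702],
    [9683, 8762],
    [20589, 12861],
]

stat, df = chi_square_independence(SEX_BY_ETHNICITY)
print(f"X({df}) = {stat:.2f}")
```

Unlike the one-sample test, the expected counts here come from the row and column totals, so the test asks whether the male/female split differs between groups rather than whether it departs from an even split.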

Deaths Through Time

Discussion

This dygraph is aimed at visualizing how deaths rise and drop by ethnicity/race (White, Black, Hispanic/Latino) from October 2020 to March 2021. The visualization showcases January to February 2021, but the interactive graphic can be expanded to see a full output of the 5-month period. This visual output helps identify when deaths peaked in the United States, so that a qualitative analysis can be done on the reasoning behind these peaks. For example, January 5 is a peak for all three ethnicities/races, with 1,160 White, 298 Black, and 592 Hispanic/Latino people dying. Although this is an exploratory analysis, a quick look at the date in relation to holidays would suggest that this peak comes after the Christmas holiday season, in which travel is more common across the United States.

Using a Face Mask: A Security Measure to Prevent Covid-19 Infection

Deaths Through Time

Conclusion

Conclusion

Through this exploratory cross-sectional study, CDC data on Covid-19 deaths in the United States from October 2020 to March 2021 was analyzed. The original purpose was to understand which ethnicities/races were dying at a higher rate than other ethnicities in the United States. Additionally, factors such as age group, sex, and presence of comorbidities were included in the analysis. Finally, the principal investigator was interested in analyzing whether the proportional deaths of Hispanic/Latino people surpassed the proportion of Latino/Hispanic people in the United States, given that the rate of infection did surpass this statistic.

Regarding the first objective, a visual output was created to analyze the deaths across ethnicities/races both by counts and proportions. The proportions graph showcased that the “Multiple/Other” population is dying at a faster rate than other proportional populations in the United States. In terms of the second objective, age group, sex, and presence of comorbidities were analyzed separately for the White population, the Black population, and the Hispanic/Latino population. Based on bar graphs and one-sample chi-square tests, the White population has higher deaths among people aged 80+, among males, and among people with comorbidities. The Black population and the Hispanic/Latino population have the same statistical and visual outputs. Finally, for the third objective, this research does not find support for the hypothesis that Hispanic/Latino people are dying at a higher proportional rate than other ethnicities/races in the United States. Further quantitative and qualitative analysis needs to be done on a new set of data based on this question to reframe the results.

As deliverables, this exploratory research analysis showcases multiple graphs per Ethnicity/Race as well as confounded graphs per factor across the ethnicities/races. Moreover, a dygraph showcasing deaths per ethnicity/race (White, Black, Hispanic/Latino) from October 2020 to March 2021 has been produced. This graph is meant as a public-facing graphic aimed at helping the understanding of peaks and troughs of the Covid-19 pandemic.

In conclusion, this exploratory data analysis suggests that (1) the “Multiple/Other” population is proportionally dying at a faster rate than other populations in the United States, (2) factors such as age group, sex, and presence of comorbidities seem to be affecting ethnicities/races equally across the United States, and (3) there is justification to further research why the Black and Hispanic/Latino communities have such a high count of deaths with no record of whether comorbidities were present. With this in mind, the researcher hopes to continue qualitative and mixed methods work on the effects of Covid-19 and Covid-19 related deaths across the ethnicities analyzed here.

Most Effective Prevention Method for Covid-19 Related Deaths: The Covid-19 Vaccine

Silvana Montanola

About the Researcher

Silvana Montanola is a first-semester PhD student in the Medical Anthropology program at the University of Maryland. Born and raised in Honduras, Silvana has had first-hand contact with a mixture of cultures and lived experiences, which motivated her to pursue her studies with the Latinx community in the United States. She completed a double major in Anthropology and English Literature with a minor in Latin American Studies at Rollins College, which further solidified her interests. As part of her undergraduate thesis, Silvana worked with the Farmworkers Association of Florida on a project related to HPV vaccination rates and barriers among the Latinx community. For her PhD, she is planning on continuing her health work with the Latinx community, while moving her focus to the state of Maryland and to pregnant undocumented mothers’ access to prenatal and neonatal care. However, given the Covid-19 pandemic, Silvana is thinking of refocusing her work to deal with the aftermath of the pandemic. This Dashboard is her first step in that direction.