Introduction
West Virginia, like most of Appalachia, has long been recognized as experiencing significant health disparities, elevated risks of adverse health outcomes, and higher mortality rates than the rest of the United States (Borak et al., 2012; Hendryx et al., 2010; Krometis et al., 2017). The state has higher rates of various cancers, chronic diseases, and drug-related mortality, as well as increased exposure to environmental health risks such as coal mining and mountaintop removal mining (MTM) (Borak et al., 2012; Hendryx et al., 2010; Monnat, 2018). Given these higher mortality rates, the increased risk of environmental exposure, historically lower socioeconomic status, and limited or no access to health care sites due to the mountainous terrain, the relationship between mortality rates and spatial accessibility to health care in West Virginia needs to be evaluated to identify best practices and locations for public health strategies. This project has three objectives: to automate the production of maps of mortality rates in West Virginia from 2011 to 2018, to calculate the ratio of healthcare sites to each county’s population, and to explore the use of linear regression.
Background
Previous research on exposure to hazardous land uses, such as mountaintop removal mining (MTM) and coal mining, found that people living near these land uses or within the same counties experience elevated mortality after controlling for socioeconomic, health services, and behavioral variables (e.g., Hendryx and Ahern, 2008; Hendryx et al., 2010; Krometis et al., 2017; Salm and Benson, 2019). Hendryx et al. (2010) concluded that coal mining exposure and proximity to mining boundaries are significantly correlated with cancer mortality in West Virginia, with respiratory cancers the most common, a potential effect of air pollution from the coal mines. The number of mining sites per county is strongly correlated with mortality from neurological and cognitive conditions, such as dementia, attributed to ingestion of particulate matter (PM) from the MTM sites (Salm and Benson, 2019). However, environmental exposure research has limitations: cancer development is extremely variable and can take a long time, and results can reflect unaccounted-for variables or fail to support conclusions accurate enough to build an effective public health strategy (Borak et al., 2012; Hendryx et al., 2010; Krometis et al., 2017). Socioeconomic and community characteristics should therefore be included as variables when assessing spatial patterns in environmental health, as they can help identify geographic hot spots of adverse health outcomes and improve predictive models.
Drug-related mortality is extremely prevalent in West Virginia, alongside an increased risk of polypharmacy, which is defined as the “simultaneous use of drugs from five or more different drug classes on a daily basis for at least 60 consecutive days a year” (Feng et al., 2017; Monnat, 2018). West Virginia had the highest rate of drug overdose deaths of any state in 2018, yet few studies have examined drug-related mortality rates in West Virginia in a spatial setting that accounts for social, economic, and environmental factors (CDC, 2020a). Spatial accessibility of healthcare sites and rehabilitation centers has not been widely researched for West Virginia (Moody et al., 2018). Spatial regression techniques that identify relationships among polypharmacy, drug-related deaths, and community characteristics can inform policies that increase accessibility to needed healthcare sites and work to decrease mortality rates (Moody et al., 2018).
Obesity, obesity-related diseases, and cardiovascular mortality are a growing public health crisis in the United States and are of particular concern in West Virginia, which consistently has one of the highest obesity prevalences in the country. The Centers for Disease Control and Prevention (CDC) reported that West Virginia had the second-highest obesity prevalence of all states in 2019, behind Mississippi (CDC, 2020b). Although the prevalence of obesity and other non-communicable diseases, such as heart disease and diabetes, is high in WV and has increased over the past decade, few studies have effectively identified the spatial relationships between chronic disease mortality and socioeconomic, environmental, and demographic factors (Amarasinghe et al., 2006; Annie et al., 2020).
However, existing work concludes that attributes of the built environment and natural amenities are spatially patterned, and that health events such as obesity and chronic disease, as well as healthcare accessibility, are related to socioeconomic, demographic, and environmental factors (Amarasinghe et al., 2006; Boehmer et al., 2006; Dashputre et al., 2020). Some research addresses how the immediate built environment, geographic location, proximity to recreational facilities, and socioeconomic factors are associated with obesity and obesogenic neighborhoods, but it has not addressed the environment-obesity relationship across levels of urbanization or spatial access to healthcare sites (e.g., Boehmer et al., 2006; Dashputre et al., 2020). Identifying factors that increase risk while also evaluating spatial access to healthcare services is crucial to preventing obesity, diabetes, cardiovascular disease, and related health effects or mortality.
Spatial accessibility to primary health care services and mammography centers has previously been evaluated for the Appalachian regions of Pennsylvania, Ohio, Kentucky, and North Carolina (Donohoe et al., 2015; Donohoe et al., 2016a; Donohoe et al., 2016b). These three articles examine spatial accessibility to primary care and to mammography centers in order to evaluate the impact of accessibility on breast cancer diagnosis. However, all three exclude West Virginia from their study areas.
Access to healthcare, in terms of both spatial access and socioeconomic factors, has been shown to reduce morbidity and mortality rates in Appalachia (Donohoe et al., 2016b). However, accessing healthcare is a challenge in West Virginia, as in Appalachia as a whole, and especially in rural areas. Previous studies have identified the complex socioeconomic, environmental, cultural, and political relationships that continue to drive health disparities throughout Appalachia, and specifically in West Virginia. These include the difficult, mountainous terrain, which makes healthcare harder to reach, influences settlement patterns, and poses barriers that contribute to economic and health disparities (Borak et al., 2012; Krometis et al., 2017). Accurately measuring spatial access to health care services and its relationship to mortality is crucial for informing health policies and interventions in complex regions like Appalachia, which often lack healthcare resources (Donohoe et al., 2016b).
Study Area and Data
The study area is the entire state of West Virginia, located in the central region of the Appalachian Mountain range (Figure 1). The complex topography of the Appalachians makes health services harder to access by hindering transportation and travel for health care needs. Socioeconomic characteristics, topography, and demographics vary across the state, and this variation also influences accessibility to healthcare and its relationship to mortality rates.
County-level data on population and community characteristics, such as age group, gender, unemployment, poverty, education level, and income, along with more specific information such as the percentages of people moving into and out of the county, are obtained from the SEER*Stat database (Surveillance, Epidemiology, and End Results Program). The 2014-2018 County Attributes dataset in SEER*Stat is derived from the U.S. Census Bureau American Community Survey (ACS) 5-year direct estimates (National Cancer Institute, n.d.). Mortality data for all causes of death (COD) for all 55 West Virginia counties are obtained from the SEER*Stat database for the years 2011 through 2018. This project works at the county level because it is the finest spatial scale at which SEER*Stat provides mortality data and geographic area attribute data. To assess accessibility to healthcare infrastructure, geocoded locations of 1,028 healthcare service sites are obtained from the ongoing West Virginia (WV) Healthlink project. The site location file contains each site’s name and address components and indicates whether it has the following attributes or services: federally funded, correctional facility, VA, hospital designated, critical access health site, primary care, pediatrics, behavioral health, rehabilitation, OBGYN, birthing, and specialty services.
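# Packages used for the maps and table operations in this project
library(sf)
library(dplyr)
library(tmap)
library(tmaptools)
# county_wm (county polygons) and sites_wm (healthcare site points) are assumed to have been loaded with sf beforehand
tm_shape(county_wm) +
  tm_polygons(fill = "black", col = "gray", lwd = 1) +
  tm_shape(sites_wm) +
  tm_bubbles(col = "red", size = 0.05) +
  tm_compass(position = c("right", "bottom")) +
  tm_scale_bar(position = c("right", "bottom")) +
  tm_layout(main.title = "West Virginia Counties and Healthcare Sites", main.title.size = 1.25, main.title.position = "center")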
Methods
This project uses several R packages, including tmap, tmaptools, sf, dplyr, and rgdal. The sf package is used to load the shapefiles, and tmap is first applied in the Study Area and Data section to map the study area, including West Virginia county boundaries and the health care sites within the state.
Number of Healthcare Sites Per County
To find the number of healthcare sites in each of the 55 West Virginia counties, st_join() performs a spatial join of the healthcare site locations to the county shapefile, which contains the population data. The group_by() and count() functions from the dplyr package then count the number of healthcare sites in each county. st_drop_geometry() removes the geometric information from the point output and returns a table with no spatial data attached. left_join(), a table join, then joins this non-spatial table back to the county boundaries shapefile on the common column “County”. Finally, tmap is used to map the number of sites in each county.
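# Rename the fifth column of the county layer so it can serve as the join key
names(county_wm)[5] <- "County"
# Spatial join: attach county attributes to each healthcare site point
sites_in_county <- st_join(sites_wm, county_wm)
# Count the sites that fall within each county
sites_cnt <- sites_in_county %>%
  group_by(County.x) %>%
  count()
names(sites_cnt)[1] <- "County"
# Drop the point geometry to leave a plain table of counts
sites_cnt1 <- st_drop_geometry(sites_cnt)
# Table join of the counts back onto the county polygons
cnty_cnt <- left_join(county_wm, sites_cnt1, by = "County")
names(cnty_cnt)[33] <- "sites_num"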
tm_shape(cnty_cnt) +
tm_polygons(col="sites_num", title = "Number of Healthcare Sites") +
tm_compass(position = c("right", "bottom")) +
tm_scale_bar(position = c("right", "bottom"))+
tm_layout(main.title = "Number of Healthcare Sites Per County", inner.margins = c(0.1, 0.05 ,0.075, 0.2), main.title.size = 1.25, main.title.position = "center", main.title.fontface = "bold")
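One detail worth checking in a count-then-join workflow like this one: left_join() keeps every county, so a county with no healthcare sites would receive NA rather than 0 in the count column. Whether that occurs in the project data is not stated here. The sketch below uses small hypothetical data frames (the county names, populations, and site table are invented for illustration and are not the project data) to show one way to replace missing counts with zero before mapping or computing ratios.

```r
library(dplyr)

# Hypothetical toy data: three counties, with sites in only two of them
counties <- data.frame(County = c("A", "B", "C"),
                       Population = c(1000, 2000, 500))
sites <- data.frame(site_id = 1:3,
                    County = c("A", "A", "B"))

# Count sites per county; county "C" has no rows, so it is absent from this table
sites_cnt <- sites %>%
  count(County, name = "sites_num")

# left_join() keeps all counties; the unmatched county gets NA,
# which coalesce() replaces with 0 before the ratio is computed
cnty_cnt <- counties %>%
  left_join(sites_cnt, by = "County") %>%
  mutate(sites_num = coalesce(sites_num, 0L),
         siteratio = sites_num / Population)

print(cnty_cnt)
```

Without the coalesce() step, the ratio for the county with no sites would be NA, and it would be drawn as missing on the map rather than as zero.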
Healthcare Site Ratio Per County
The dplyr mutate() function is used to create a new column named siteratio, calculated as the number of healthcare sites in each county divided by the county’s population. This gives the ratio of healthcare sites to residents for each county. tmap is then used to map the healthcare site ratio for each county in West Virginia.
site_ratio <- cnty_cnt %>%
  mutate(siteratio = sites_num / Population)
tm_shape(site_ratio) +
tm_polygons(col="siteratio", title = "Healthcare Site Ratio", palette=get_brewer_pal(palette="YlGn", n=7, plot=FALSE)) +
tm_compass(position = c("right", "bottom")) +
tm_scale_bar(position = c("right", "bottom"))+
tm_layout(inner.margins = c(0.1, 0.05 ,0.075, 0.2), main.title = "Healthcare Sites Ratio", main.title.size = 1.25, main.title.fontface = "bold", main.title.position = "center")
All-Cause Mortality Rates
Deaths from all causes are summed for each county and normalized by the number of years and the total county population to identify patterns in mortality rates per 100 persons. The table containing the total number of deaths from all causes from 2011 to 2018 for each county is joined to the county shapefile with the inner_join() function on the “County” column. mutate() is then used to calculate the average annual mortality rate per 100 persons at the county level for 2011 to 2018.
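Written as a formula, the normalization described here is as follows, where the symbols are introduced only for illustration: D_c is the total number of deaths from all causes in county c over 2011 to 2018 (the COD_Freq column), P_c is the county population, and 8 is the number of years.

```latex
\[
\text{rate}_c = \frac{D_c}{P_c \times 8},
\qquad
\text{rate100}_c = 100 \times \text{rate}_c
\]
```

As a purely hypothetical example, a county with 20,000 residents and 2,080 deaths over the eight years would have rate = 2080 / (20000 × 8) = 0.013, or 1.3 deaths per 100 persons per year.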
counties_cod <- inner_join(county_wm, cod_counts, by = "County")
counties_cod$COD_Freq <- as.numeric(counties_cod$COD_Freq)
all_cod_rate <- counties_cod %>%
  mutate(rate = (COD_Freq / Population) / 8) %>%
  mutate(rate100 = (rate * 100))
tm_shape(all_cod_rate) +
tm_polygons(col="rate100", title = "Avg. Mortality Rate per 100", style = "quantile", palette=get_brewer_pal(palette="Blues", n=7, plot=FALSE)) +
tm_compass(position = c("right", "bottom")) +
tm_scale_bar(position = c("right", "bottom"))+
tm_layout(inner.margins = c(0.1, 0.05 ,0.075, 0.2), main.title = "Annual Average Mortality Rate in WV per 100 individuals", main.title.size = 1, main.title.position = "center", main.title.fontface = "bold")
Exploring Relationships Between Explanatory Variables Using Linear Regression
Linear regression and multiple linear regression are used in this paper as exploratory tools to evaluate the relationships between the mortality rate and several explanatory variables. For simple linear regression, four separate models are created with the average annual mortality rate as the dependent variable and, in turn, health care site ratio, poverty rate, per capita income, and percent coal hectares as the independent explanatory variable. One multiple linear regression model is also created with the average annual mortality rate as the dependent variable and the following explanatory variables:
- Healthcare Site Ratio
- Poverty Rate
- Per capita income
- Percent Unemployed
- Percent Bachelors Degree
- Percent Coal Hectares
all_cod_rate <- left_join(all_cod_rate, sites_cnt1, by = "County")
all_cod_rate <- all_cod_rate %>%
  mutate(siteratio = n / Population)
### Linear Regression
# Annual Average Mortality Rate and Site Ratio
ratio_rate <- lm(rate ~ siteratio, data = all_cod_rate)
# Annual Average Mortality Rate and Poverty Rate
avgmo_pov <- lm(rate ~ PovrtyRate, data = all_cod_rate)
# Annual Average Mortality Rate and Per capita Income
avgmo_pci <- lm(rate ~ PCIncome, data = all_cod_rate)
# Annual Average Mortality Rate and Percent Coal Hectares
avgmo_coal <- lm(rate ~ PctCoalHet, data = all_cod_rate)
### Multiple Linear Regression
mr <- lm(rate ~ siteratio + PovrtyRate + PCIncome + PctUnemplo + PctBachDeg + PctCoalHet, data = all_cod_rate)
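As a minimal sketch of how a p-value and adjusted R-squared can be read from a fitted lm object, the example below fits a single-predictor model to synthetic data. The data frame toy and all of its values are hypothetical stand-ins for the county table; only the column names rate and PctCoalHet are borrowed from the models above, and the numbers it produces say nothing about the actual results.

```r
set.seed(42)

# Hypothetical synthetic data standing in for the 55 county records
n <- 55
toy <- data.frame(PctCoalHet = runif(n, min = 0, max = 20))
toy$rate <- 0.01 + 0.0005 * toy$PctCoalHet + rnorm(n, sd = 0.002)

# Fit a simple linear regression of mortality rate on percent coal hectares
fit <- lm(rate ~ PctCoalHet, data = toy)
s <- summary(fit)

# Adjusted R-squared of the model
adj_r2 <- s$adj.r.squared

# p-value for the slope, taken from the coefficient table
p_slope <- coef(s)["PctCoalHet", "Pr(>|t|)"]

print(adj_r2)
print(p_slope)
```

The same two lookups apply to a multiple regression fit such as mr, where coef(summary(mr)) returns one row per explanatory variable.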
Results and Discussion
McDowell County had the highest mortality rate, 3.65 per 100 individuals, and differed the most from the other WV counties. The next highest annual mortality rate was in Logan County, at 1.64 per 100 individuals. Monongalia County had the lowest mortality rate, at 0.65 per 100 individuals. The average annual all-cause mortality rate across West Virginia counties for the 8 years was 1.3 per 100 individuals per year. Generally, counties in southern West Virginia had higher mortality rates than those in the northern half, except for the far northern panhandle. All-cause mortality rates were highest in southern counties, including McDowell, Mingo, Logan, Mercer, Boone, and Fayette, and the surrounding counties also had higher mortality rates than the north central and northeast counties.
The linear regression model for healthcare site ratio and mortality rate returned a non-significant p-value and an extremely low adjusted R-squared, which suggests that the site ratio explains little of the variation in mortality rate. The models for poverty rate and for per capita income each returned a significant p-value (p < 0.005) but a low adjusted R-squared, suggesting that these variables also explain only a small portion of the variation in mortality rate. Percent coal hectares returned the most significant p-value and the highest adjusted R-squared, 0.47, of the four models, which suggests that it explains more of the variation in mortality rates than the other independent variables. The multiple linear regression model has an adjusted R-squared of 0.56 and identifies percent coal hectares as the most significant variable, with poverty rate the next most significant, followed by per capita income and percent bachelor’s degree.
Conclusion
Identifying spatial patterns of higher average mortality rates, and the variables that contribute to and can potentially predict them, can help pinpoint where health policies should be focused and inform mitigation and prevention strategies. Exploratory spatial analysis methods are essential tools for developing public health strategies and can be used in West Virginia to help reduce high average all-cause mortality rates across the state. Automating these processes makes it faster to analyze the most recent and accurate data, which helps ensure that the most relevant policies are developed.
References
Amarasinghe, A., Souza, G. D., Brown, C., and Borisova, T. (2006). The Impact of Socioeconomic and Spatial Differences on Obesity in West Virginia. Graduate Theses, Dissertations, and Problem Reports, 2502. https://doi.org/10.33915/etd.2502
Annie, F., Bates, M., Nanjundappa, A., Farooq, A., Anderson, E., and Wood, M. (2020). Spatial Outcomes of Myocardial Infarction (Heart Attack) from 2000 to 2018 in Southern West Virginia Using an Collective Analysis. Authorea. DOI: 10.22541/au.160443765.56446423/v1
Boehmer, T.K., Lovegreen, S.L., Haire-Joshu, D., and Brownson, R.C. (2006). What Constitutes an Obesogenic Environment in Rural Communities. American Journal of Health Promotion, 20(6), 411-421. https://doi.org/10.4278/0890-1171-20.6.411
Borak, J., Salipante-Zaidel, C., Slade, M.D., and Fields, C.A. (2012). Mortality Disparities in Appalachia: Reassessment of Major Risk Factors. JOEM, 54(2), 146-156. DOI: 10.1097/JOM.0b013e318246f395
Centers for Disease Control and Prevention. (2020a). Drug Overdose Deaths. Retrieved February 25, 2021 from https://www.cdc.gov/drugoverdose/data/statedeaths.html.
Centers for Disease Control and Prevention. (2020b). Adult Obesity Prevalence Maps. Retrieved February 29, 2021 from https://www.cdc.gov/obesity/data/prevalence-maps.html.
Dashputre, A.A., Surbhi, S., Podila, P.S.B., Shuvo, S.A., and Bailey, J.E. (2020). Can primary care access reduce health care utilization for patients with obesity-associated chronic condition in medically underserved areas? Journal of Evaluation in Clinical Practice, 26(6), 1689-1698. DOI: 10.1111/jep.13360
Donohoe, J., Marshall, V., Tan, X., Camacho, F.T., Anderson, R.T., and Balkrishnan, R. (2015). Predicting Late-stage Breast Cancer Diagnosis and Receipt of Adjuvant Therapy: Applying Current Spatial Access to Care Methods in Appalachia (Revised Version). Med Care, 53(11), 980-989. doi: 10.1097/MLR.0000000000000432
Donohoe, J., Marshall, V., Tan, X., Camacho, F.T., Anderson, R.T., and Balkrishnan, R. (2016a). Evaluating and comparing methods for measuring spatial access to mammography centers in Appalachia. Health Serv Outcomes Res Method, 16, 22-40. DOI: 10.1007/s10742-016-0143-y
Donohoe, J., Marshall, V., Tan, X., Camacho, F.T., Anderson, R.T., and Balkrishnan, R. (2016b). Spatial Access to Primary Care Providers in Appalachia: Evaluating Current Methodology. Journal of Primary Care & Community Health, 7(3), 149-158. DOI: 10.1177/2150131916632554
Feng, X., Tan, X., Riley, B., Zheng, T., Bias, T.K., Becker, J.B., and Sambamoorthi, U. (2017). Prevalence and Geographic Variations of Polypharmacy Among West Virginia Medicaid Beneficiaries. Annals of Pharmacotherapy, 51(11), 981-989. https://doi.org/10.1177/1060028017717017.
Hendryx, M. and Ahern, M.M. (2008). Relations between Health Indicators and Residential Proximity to Coal Mining in West Virginia. Am J Public Health, 98(4), 669-671. DOI: 10.2105/AJPH.2007.113472
Hendryx, M., Fedorko, E., and Anesetti-Rothermel, A. (2010). A geographical information system-based analysis of cancer mortality and population exposure to coal mining activities in West Virginia, United States of America. Geospatial Health, 4(2), 243-256. https://doi.org/10.4081/gh.2010.204.
Krometis, L.A., Gohlke, J., Kolivras, K., Satterwhite, E., Marmagas, S.W., and Marr, L.C. (2017). Environmental health disparities in the Central Appalachian region of the United States. Rev Environ Health, 32(3), 253-266. DOI: 10.1515/reveh-2017-0012
Moody, L., Satterwhite, E., and Bickel, W. K. (2018). Substance Use in Rural Central Appalachia: Current Status and Treatment Considerations. Rural Mental Health, 41(2), 123-135. DOI: 10.1037/rmh0000064
National Cancer Institute. (n.d.). Static County Attributes. National Cancer Institute. Retrieved February 12, 2021 from https://seer.cancer.gov/seerstat/variables/countyattribs/static.html#14-18
Salm, A. K. and Benson, M. J. (2019). Increased Dementia Mortality in West Virginia Counties with Mountaintop Removal Mining. Int J Environ Res Public Health, 16(21), 4278. DOI: 10.3390/ijerph16214278
Surveillance Research Program, National Cancer Institute SEER*Stat software (www.seer.cancer.gov/seerstat) version 8.3.8. Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov) SEER*Stat Database: County Attributes - Total U.S., 1969-2018 Counties (www.seer.cancer.gov/seerstat/variables/countyattribs). National Cancer Institute, DCCPS, Surveillance Research Program.
Surveillance Research Program, National Cancer Institute SEER*Stat software (www.seer.cancer.gov/seerstat) version 8.3.8. Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov) SEER*Stat Database: Mortality - All COD, Aggregated With County, Total U.S. (1969-2018)