Nebraska is a large American state with an estimated population of 1,920,076 people (as of 2017). About 666,000 people live in rural areas. Ensuring access to healthcare has always been a problem in small-town America. As this piece in the National Public Radio indicates, there are plenty of challenges that need to be tackled: baby boomer physicians are retiring and rural hospitals are declining. Younger doctors are not allured by the prospect of purchasing practices in smaller communities, and do not integrate well into the local population. Most often take advantage of signing bonuses to pay off medical school debt and then leave to work in urban hospitals. The high turnover leads to an almost complete absence of the personal relationship between a patient and doctor.
These are all certainly pressing issues. However, I was most intrigued by these lines at the start of the piece.
Taylor Walker is wiping down tables after the lunch rush at the Bunkhouse Bar and Grill in remote Arthur, Nebraska, a tiny dot of a town ringed by cattle ranches.
The 25-year-old has her young son in tow, and she is expecting another baby in August.
“I was just having some terrible pain with this pregnancy and I couldn’t get in with my doctor,” she says.
Visiting her obstetrician in North Platte is a four-hour, round-trip endeavor that usually means missing a day of work. She arrived to a recent visit only to learn that another doctor was on call and hers wasn’t available.
“So then we had to make three trips down there just to get into my regular doctor,” Walker says.
This inconvenience is part of life in Arthur County, a 700-square-mile slice of western Nebraska prairie that’s home to only 465 people. According to census figures, it’s the fifth least-populated county in the nation.
Map of travel times that we will be creating.
It took Taylor Walker four hours of travel to get care from an obstetrician. I find it regrettable that someone would have to wait for so long to obtain medical care. The February 2018 University of Nebraska Medical Center report The Status of the Healthcare Worforce in the state of Nebraska points out that 13 counties in Nebraska have no primary care doctor.
Accessibility is definitely an issue. I would like to quantify this using R to see if I can substantiate Taylor Walker’s claims, and in the process, identify other regions that might face similar issues of lack of access to hospitals.
In this document, I will provide a brief walkthrough of using publicly available data and tools to create a data visualization that represents accessibility to medical services in the different parts of Nebraska.
Some of the resources/blog posts that I used are:
While I try to annotate my code wherever possible, I will assume prior experience with R.
This document can be divided into three broad steps:
These are the libraries we will be using.
library(gdistance)
library(abind)
library(rje)
library(ggplot2)
library(malariaAtlas)
library(tidyverse)
library(knitr)
library(tmap)
library(sp)
Download the rds file available at Amelia Bertozzi-Villa’s github repository here.
US <- readRDS("gadm36_USA_1_sp.rds") #Obtaining the bulk file
analysis <- US[US@data$NAME_1=="Nebraska", ]
#Subsetting the RDS file to select only the Nebraska shapefile
plot(analysis, main="Nebraska")
#Plotting the Neraska shapefile
Let us project this shapefile.
proj4string(analysis) <- CRS("+proj=longlat +datum=WGS84")
#Write the projection file
analysis_proj <- spTransform(analysis, CRS("+proj=longlat +datum=WGS84"))
#Assign the written projection
Here’s a list of hospitals registered with Medicare. Download it, and save it in your working directory.
hospitals <- read.csv("Hospitals.csv", stringsAsFactors = FALSE )
Subset for hospitals in Nebraska.
hospitals_neb <- hospitals %>% filter(State == "NE")
#There are 89 Medicare [articipating hospitals in the state of Nebraska.
Take a look at rows 29 to 39 of the data.
kable((hospitals_neb)[29:39, c(2,29)])
| Hospital.Name | Location | |
|---|---|---|
| 29 | CRETE AREA MEDICAL CENTER | P O BOX 220, 2910 BETTEN DR CRETE, NE 68333 |
| 30 | BOX BUTTE GENERAL HOSPITAL | P O BOX 810, 2101 BOX BUTTE AVE ALLIANCE, NE 69301 |
| 31 | LEXINGTON REGIONAL HEALTH CENTER | P O BOX 980, 1201 NORTH ERIE ST LEXINGTON, NE 68850 |
| 32 | COMMUNITY HOSPITAL | P O BOX 1328, 1301 EAST H ST MCCOOK, NE 69001 |
| 33 | VALLEY COUNTY HEALTH SYSTEM | 2707 L STREET ORD, NE 68862 (41.603213, -98.942676) |
| 34 | THAYER COUNTY HEALTH SERVICES | 120 PARK AVE HEBRON, NE 68370 (40.160123, -97.593267) |
| 35 | WARREN MEMORIAL HOSPITAL | 905 SECOND ST FRIEND, NE 68359 (40.653693, -97.282052) |
| 36 | FAITH REGIONAL HEALTH SERVICES | 2700 WEST NORFOLK AVE NORFOLK, NE 68701 (42.032526, -97.450942) |
| 37 | MEMORIAL COMMUNITY HOSPITAL MCH & HEALTH SYSTEM | 810 NORTH 22ND ST BLAIR, NE 68008 (41.551152, -96.145519) |
| 38 | HENDERSON HEALTH CARE SERVICES | 1621 FRONT STREET HENDERSON, NE 68371 (40.782888, -97.807361) |
| 39 | GOTHENBURG MEMORIAL HOSPITAL | 910 20TH ST GOTHENBURG, NE 69138 (40.938132, -100.153352) |
Look at the location column. Some hospitals do not have a latitude and longitude geocoded. We will need this information to represent the hospitals as point files over the map of Nebraska. We will use the ggmap package in R to geocode addresses.
library("ggmap")
#load the ggmap package
Create an empty dataframe to temporarily hold the geocoded values that will be generated.
geocoded <- data.frame(stringsAsFactors = FALSE)
Note: You need a Google API key to use the geocode function from ggmap. This link at Google Cloud will walk you through the steps needed to generate a key. You will be asked to provide your card details, but you will not be charged a fee, now or in the future. In fact, you will be provided free credits to get you started.
Once you have your key, run this line of code:
register_google(key ='paste your key here')
#Use single quotes
Then, run this line of code.
for (i in 1:nrow(hospitals_neb)) {
result <- geocode(hospitals_neb$Location[i], output = "latlon", source = "google")
#The geocode function looks at the Location column from each row of hospitals_neb in turn
#The output argument tells it to create a vector of 2 objects, a lat and a long.
hospitals_neb$lon[i] <- as.numeric(result[1])
hospitals_neb$lat[i] <- as.numeric(result[2])
#The first line of code takes the first object from the two-object vector (the long)
#and adds it as a numeric variable to hospitals_neb under a new column called lon.
#Similarly, the second line creates a column called lat and adds the geocoded latitude.
}
Save the geodcoded data in your working directory.
write.csv(hospitals_neb, "cleangc.csv", row.names = FALSE)
#Creates a file in your working directory called cleangc, which is basically
#hospitals_neb, but with a lat and a long column attached,
Now, read in the csv file with the geocoded locations.
hospitals_geocoded <- read.csv("cleangc.csv")
Let us take a look at rows 70 to 80 of this dataset.
kable(hospitals_geocoded[70:80, c(2, 29, 30, 31)])
As can be seen from the NA in the lon and lat columns, the geocoding failed for row 75. This is not an infrequent occurrence with ggmap, and could be because the address listed did not provide sufficient detail for Google to complete the geocoding. However, we are lucky in this instance— the Location column here (derived from the original dataset) already had a lat and long listed. Let us manually enter the lat and long for row 75.
hospitals_geocoded[75, 31] = 40.592416 #lat
hospitals_geocoded[75, 30] = -98.387474 #long
Let’s pull the coordinates into an object called xy.
xy <- hospitals_geocoded[, c(30, 31)]
#create an object called xy to hold the lat and long
#order is always long before lat to facilitate mapping
Let us spatialize this dataset.
hosp_spat <- SpatialPointsDataFrame(coords = xy, data = hospitals_geocoded)
Let us assign the points the same projection as the shapefile of Nebraska.
proj4string(hosp_spat) <- CRS("+proj=longlat +datum=WGS84")
#Write the projection file
hosp_proj <- spTransform(hosp_spat, CRS("+proj=longlat +datum=WGS84"))
#Assign the written projection
Let’s visualize the hospitals in Nebraska that accept Medicare.
tm_shape(analysis_proj) + tm_borders() +
tm_shape(hosp_proj) +
tm_dots(col = "black", size = 0.08)+
tm_layout( main.title = "Hospitals in Nebraska accepting Medicare",
main.title.position = "center",
frame = FALSE) +
tm_scale_bar(position = c("right", "top")) +
tm_compass()
## Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
Notice that the legend and compass overlap the map. We need to adjust the bounding box for the analysis_proj shape to make sure that these extra elements do not obscure the image.
new_box <- st_bbox(analysis_proj) # current bounding box
#it is a vector of 4 values specifying the left, right,
#top and bottom margins of the map.
xrange <- new_box$xmax - new_box$xmin # spread of x values
yrange <- new_box$ymax - new_box$ymin # spread of y values
new_box[1] <- new_box[1] - (0.1 * xrange) # xmin - left
new_box[3] <- new_box[3] + (0.1 * xrange) # xmax - right
new_box[2] <- new_box[2] - (0.1 * yrange) # ymin - bottom
new_box[4] <- new_box[4] + (0.1 * yrange) # ymax - top
new_box <- new_box %>% st_as_sfc()
#make the box a spatial object
#use it in your map to get an overlap free image
tm_shape(analysis_proj, bbox = new_box) +
tm_fill(col = "white") +
tm_borders() +
tm_shape(hosp_proj) +
tm_dots(col = "black", size = 0.08)+
tm_layout( main.title = "Distribution of Hospitals in Nebraska accepting Medicare",
main.title.position = "center",
frame = FALSE,
bg.color = "lightgreen") +
tm_scale_bar(position = c("right", "top")) +
tm_compass()
Let us now acess the malariaAtlas dataset to obtain a map that ranks every pixel in Nebraska based on the land based travel speed possible in that pixel ( A higher value indicates that you can travel across that pixel faster ). Remember, each pixel represents an area that is one square kilometer across. Nebraska is 200,000 square kilometers (70,000 square miles), so we should have around 200,000 pixels in this raster.
#quote the text in the surface argument exactlly as it is given or else it will not work.
friction <- malariaAtlas::getRaster(
surface = "A global friction surface enumerating land-based travel speed for a nominal year 2015",
shp = analysis_proj) #restrict that surface to Nebraska
#a function from the malariaAtlas library that quickly plots rasters.
malariaAtlas::autoplot_MAPraster(friction, plot_titles = FALSE)
Now, we run these two lines of code on the friction surface.
Trans<- gdistance::transition(friction, function(x) 1/mean(x), 8)
Trans.GC <- gdistance::geoCorrection(Trans)
What are these lines of code doing? As Amelia Bertozzi-Villa explains, R is calculating the ‘ease’ of traveling from any one areal unit to another. The 8 in the first line of code comes from the fact that any pixel is surrounded by 8 pixels and therefore, there are 8 possible ways to travel out of any pixel (this is what geospatial analysts would recognize as a combination of a queen and a rook configuration.) The second line of code corrects for the distortion that results when you try to represent a three dimensional surface of the earth on a two dimensional computer screen.
These next two lines of code are where the all-important time-to-hospital calculation happens. Line two takes the geocorrected transition matrix and a matrix of lat longs of all the hospitals in Nebraska accepting Medicaid, and feeds it to the accCost function, which calculates the ‘cost’ or the time taken in minutes to travel from any given point to the nearest hospital. The output is a raster, which we will visualize in the next chunk of code.
pointsmatrix <- as.matrix(hosp_proj@coords)
access.raster <- gdistance::accCost(Trans.GC, pointsmatrix)
Let us map out the accessibility of Nebraska Medicare hospitals.
plot <- malariaAtlas::autoplot_MAPraster(access.raster,
shp_df=analysis_proj, printed=F)
full_plot <- plot[[1]] + geom_point(data=data.frame(hosp_proj@coords),
aes(x=lon, y=lat)) +
scale_fill_gradientn(colors = rev(rje::cubeHelix(gamma=1.0,
start=1.5,
r=-1.0,
hue=1.5,
n=16)),
name="Minutes \n of Travel") +
ggtitle("Travel Time to Nearest Medicare Hospital in Nebraska") +
theme(axis.text=element_blank(),
panel.border=element_rect(fill=NA, color="white"))
print(full_plot)
This is a very insightful map. The whitish regions represent points from which it is very easy to access healthcare, so the roads are clearly seen mapped out in white. You can see that there are certain areas in the central/western parts of Nebraska that require upto 250 minutes of traveling to reach the nearest Medicare hospital.
This obvious lack of hospitals in the region is probably reflecting the low population density in these counties. We can visualize population density in these areas using census data, but that is a walkthrough for another day.
An area with a longer traveling time despite being close to two hospitals
This could be because of physical structures like hills or poorly constructed roads that contribute to an increased travel time.
Nebraska Counties. Taylor Walker lives in Arthur County, and our analysis supports her statement that it takes 4 hours one way to reach a doctor. In addition, we found that people in counties such as McPherson, Logan, or Thomas might face the same issues.
While this was an interesting analysis, we must not jump to conclusions. The fact that there are issues of accessibility to healthcare in rural Nebraska is undeniable, but our dataset used only Medicare hospitals to measure access. Other options to mitigate this issue could include freestanding emergency departments, use of telemedicine and improving mergers between rural and larger healthcare clinics. A deeper discussion of these policies is beyond the scope of this document, but you can refer to this document by the Rural Health Research and Policy Centers to learn more.
This anaysis could be a powerful way of quantifying access to healthcare in remote areas all over the world. From malaria clinics in sub-Saharan Africa to HIV testing centers in the forests of Myanmar, the applications of the wonderful tools provided by the Malaria Atlas Project are limitless.