Introduction

One of the critical inputs to the Estimated Ground Evacuation Time layer (EGET) is point locations of medical facilities. They represent the end points of the travel time analysis that forms the foundation of the dataset. Given that we are in the process of updating and improving EGET, we need to ensure that we are using the best and most relevant medical facility spatial data. One thing we do know is that we will be using this dataset:

Hospitals from the Homeland Infrastructure Foundation-Level Data (HIFLD), produced by the U.S. Department of Homeland Security’s Geospatial Management Office

There are several attributes within this dataset that will help determine whether or not individual medical facilities will be useful and should be used in the new generation of EGET. This document will provide a basic spatial and thematic overview of the data to inform a conversation about how best to use this dataset.

Overview

First, we can take a look at the spatial extent and distribution of the entire dataset:

# load the necessary libraries
library(sf)
library(rmarkdown)
library(knitr)
library(terra)
library(viridis)
library(magrittr) # provides the %>% pipe used in the chunks below

# read in the data
hosp <- st_read("S:/ursa2/campbell/ground_evac/pilot_areas/pilot_area_1/data_in/hospitals/Hospitals.shp")
## Reading layer `Hospitals' from data source 
##   `S:\ursa2\campbell\ground_evac\pilot_areas\pilot_area_1\data_in\hospitals\Hospitals.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 8013 features and 32 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -176.6403 ymin: -14.29024 xmax: 145.7245 ymax: 71.29773
## Geodetic CRS:  WGS 84
state <- st_read("S:/ursa2/campbell/ground_evac/full_area/data_in/boundaries/census/tl_2022_us_state.shp")
## Reading layer `tl_2022_us_state' from data source 
##   `S:\ursa2\campbell\ground_evac\full_area\data_in\boundaries\census\tl_2022_us_state.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 56 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -179.2311 ymin: -14.60181 xmax: 179.8597 ymax: 71.43979
## Geodetic CRS:  NAD83
# match coordinate systems
crs <- st_crs(state)
hosp <- st_transform(hosp, crs)

# map the entire dataset out
plot(st_geometry(hosp), pch = 16, cex = 0.25)
plot(st_geometry(state), border = "red", add = T)

# subset just conus facilities and map them out
abbv <- state$STUSPS
abbv.conus <- abbv[!abbv %in% c("AK", "AS", "HI", "MP", "GU", "PR", "VI")]
state.conus <- state[state$STUSPS %in% abbv.conus,]
hosp.conus <- st_intersection(hosp, state.conus)
plot(st_geometry(hosp.conus), pch = 16, cex = 0.25)
plot(st_geometry(state.conus), border = "red", add = T)

Geographic summary

  • In total, there are 8013 medical facilities in this dataset
  • Within the contiguous US (CONUS), there are 7858

Henceforth in this document, all counts will be based only on CONUS hospitals as that is the focal extent of EGET.

Attributes of interest

I will now take a look at some of the attributes within the dataset that may be of use in determining whether or not a medical facility should be included in the new generation of EGET. To be clear, these are attributes that I am interpreting to be of use. That said, I am not an expert in medical geography. So, I will also include the first 100 rows of the full table below for those interested in taking a deeper dive:

# print paged table of hospitals data
paged_table(hosp.conus[1:100,])

Attribute #1: “STATUS”

This should be an easy one. It simply defines whether a facility is open or closed. Assuming the data are reliable and up to date, we would want to remove closed facilities.

# create and print table of STATUS
df.status <- table(st_drop_geometry(hosp.conus$STATUS)) %>%
  as.data.frame()
colnames(df.status) <- c("STATUS", "Count")
kable(df.status)
STATUS Count
CLOSED 377
OPEN 7481
# remove closed facilities from further consideration
hosp.open <- hosp.conus[hosp.conus$STATUS == "OPEN",]

After this filter, there are now 7481 facilities remaining.

Attributes #2 and #3: “TYPE” and “NAICS_DESC”

These two attributes give some indication of the types of services offered by the facilities. TYPE appears to be somewhat less descriptive than NAICS_DESC, but it’s not immediately clear which makes the most sense to use for filtering purposes. So, I will take a look at both, and how they relate to one another. Note that the TYPE and NAICS_DESC tables below are tabulated from hosp.conus, so their counts still include the 377 closed facilities.

# create and print table of TYPE
df.type <- table(st_drop_geometry(hosp.conus$TYPE)) %>%
  as.data.frame()
colnames(df.type) <- c("TYPE", "Count")
df.type <- df.type[order(df.type$Count, decreasing = T),]
kable(df.type)
TYPE Count
4 GENERAL ACUTE CARE 4140
3 CRITICAL ACCESS 1241
7 PSYCHIATRIC 970
8 REHABILITATION 453
5 LONG TERM CARE 381
6 MILITARY 272
9 SPECIAL 211
1 CHILDREN 159
10 WOMEN 19
2 CHRONIC DISEASE 12
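
The tabulate, rename, and sort pattern used for this table recurs for most of the attributes in this document. As a minimal sketch, it could be wrapped in a small helper. The name count.table and the toy input are hypothetical and not part of the original analysis; only base R is used.

```r
# hypothetical helper (not part of the original analysis): tabulate a vector
# and return a data frame sorted by descending count
count.table <- function(x, name) {
  df <- as.data.frame(table(x, useNA = "ifany"), stringsAsFactors = FALSE)
  colnames(df) <- c(name, "Count")
  df <- df[order(df$Count, decreasing = TRUE), ]
  rownames(df) <- NULL  # reset row names so they follow the sorted order
  df
}

# toy input standing in for hosp.conus$TYPE (illustrative values only)
toy.type <- c("GENERAL ACUTE CARE", "CRITICAL ACCESS", "GENERAL ACUTE CARE",
              "PSYCHIATRIC", "GENERAL ACUTE CARE", "CRITICAL ACCESS")
df.toy <- count.table(toy.type, "TYPE")
print(df.toy)
```

Resetting the row names also avoids the out-of-order index column that otherwise appears when a sorted data frame is passed to kable.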

Based on TYPE alone, it would appear that every type except for the two most abundant (“GENERAL ACUTE CARE” and “CRITICAL ACCESS”) should be removed from consideration. But let’s also take a look at NAICS_DESC (the description field from the North American Industry Classification System):

# create and print table of NAICS_DESC
df.naics <- table(st_drop_geometry(hosp.conus$NAICS_DESC)) %>%
  as.data.frame()
colnames(df.naics) <- c("NAICS_DESC", "Count")
df.naics <- df.naics[order(df.naics$Count, decreasing = T),]
kable(df.naics)
NAICS_DESC Count
6 GENERAL MEDICAL AND SURGICAL HOSPITALS 5979
15 PSYCHIATRIC AND SUBSTANCE ABUSE HOSPITALS 801
17 SPECIALTY (EXCEPT PSYCHIATRIC AND SUBSTANCE ABUSE) HOSPITALS 430
16 REHABILITATION HOSPITALS (EXCEPT ALCOHOLISM, DRUG ADDICTION) 414
1 CHILDREN’S HOSPITALS, GENERAL 100
5 EXTENDED CARE HOSPITALS (EXCEPT MENTAL, SUBSTANCE ABUSE) 33
3 CHILDREN’S HOSPITALS, SPECIALTY (EXCEPT PSYCHIATRIC, SUBSTANCE ABUSE) 20
8 HOSPITALS, PSYCHIATRIC (EXCEPT CONVALESCENT) 16
4 CHRONIC DISEASE HOSPITALS 12
14 ORTHOPEDIC HOSPITALS 12
11 HOSPITALS, SUBSTANCE ABUSE 11
7 HOSPITALS, ADDICTION 10
2 CHILDREN’S HOSPITALS, PSYCHIATRIC OR SUBSTANCE ABUSE 7
10 HOSPITALS, SPECIALTY (EXCEPT PSYCHIATRIC, SUBSTANCE ABUSE) 7
12 MATERNITY HOSPITALS 3
9 HOSPITALS, PSYCHIATRIC PEDIATRIC 2
13 MENTAL HEALTH HOSPITALS 1

Based on NAICS_DESC alone, I would likely retain only the first, most abundant category (“GENERAL MEDICAL AND SURGICAL HOSPITALS”). With a total of 5979, it appears somewhat more inclusive than the top two TYPE categories (“GENERAL ACUTE CARE” and “CRITICAL ACCESS”), which sum to 5381. I’m curious how these two attributes overlap, so I’ll see how the most abundant NAICS_DESC class cross-tabulates with all of the TYPE domain values:

# cross-tabulate NAICS_DESC and TYPE
df.gmsh <- hosp.open[hosp.open$NAICS_DESC == "GENERAL MEDICAL AND SURGICAL HOSPITALS",]
df.naics.vs.type <- table(df.gmsh$TYPE) %>%
  as.data.frame()
colnames(df.naics.vs.type) <- c("TYPE", "Count")
df.naics.vs.type <- df.naics.vs.type[order(df.naics.vs.type$Count, decreasing = T),]
kable(df.naics.vs.type)
TYPE Count
3 GENERAL ACUTE CARE 3835
2 CRITICAL ACCESS 1215
5 MILITARY 270
6 PSYCHIATRIC 147
4 LONG TERM CARE 86
8 SPECIAL 76
7 REHABILITATION 45
1 CHILDREN 33
9 WOMEN 10
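
The table above covers a single NAICS_DESC class. As a rough sketch of how the comparison could be extended to all classes at once, a two-way table with margins works in base R. The data frame below is a toy stand-in with illustrative values, not a sample of the HIFLD data.

```r
# toy stand-in for the TYPE and NAICS_DESC columns of the open facilities;
# the values are illustrative only, not drawn from the HIFLD data
toy <- data.frame(
  TYPE = c("GENERAL ACUTE CARE", "GENERAL ACUTE CARE", "CRITICAL ACCESS",
           "PSYCHIATRIC", "PSYCHIATRIC", "MILITARY"),
  NAICS_DESC = c("GENERAL MEDICAL AND SURGICAL HOSPITALS",
                 "GENERAL MEDICAL AND SURGICAL HOSPITALS",
                 "GENERAL MEDICAL AND SURGICAL HOSPITALS",
                 "PSYCHIATRIC AND SUBSTANCE ABUSE HOSPITALS",
                 "GENERAL MEDICAL AND SURGICAL HOSPITALS",
                 "GENERAL MEDICAL AND SURGICAL HOSPITALS")
)

# two-way table: rows are TYPE, columns are NAICS_DESC
xt <- table(toy$TYPE, toy$NAICS_DESC)

# add row and column totals so the overlap is easy to read
xt.tot <- addmargins(xt)
print(xt.tot)
```

Reading along a row shows how one TYPE is spread across NAICS_DESC classes, and reading down a column shows the reverse.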

Several hospitals within the NAICS_DESC category of “GENERAL MEDICAL AND SURGICAL HOSPITALS” have TYPE values that I would consider unsuitable for EGET (e.g., “PSYCHIATRIC” and “REHABILITATION”). So, my inclination is to use the TYPE attribute instead of the NAICS_DESC attribute for this round of filtering. Specifically, I will retain only those facilities labeled as either “GENERAL ACUTE CARE” or “CRITICAL ACCESS”.

# filter by type
hosp.type <- hosp.open[hosp.open$TYPE %in% c("GENERAL ACUTE CARE", "CRITICAL ACCESS"),]

After this filter, there are now 5136 facilities remaining.

Attribute #4: “NAME”

The previous version of EGET from WFDSS, as far as I can tell, was solely based on the following filter:

Excluded urgent care clinics and clipped this layer to the US States layer to exclude Canadian and Mexican hospitals to get at “definitive” care.

So, besides geography, the only attribute-driven filter was based on whether a facility was considered an “urgent care clinic”. There are no attributes in the hospitals dataset we are relying on that specifically make this distinction. But perhaps, by looking at the names of the facilities, we can see if a similar filter could be crafted. Of course, this is not a straightforward task, as urgent care clinics can have a variety of names. In Salt Lake, for example, some facilities are called “InstaCare”, others “Urgent Care”, and perhaps others yet that might be considered urgent care facilities only have “Clinic” in the name… Perhaps the best starting point is to look at common terms within hospital names. I’ll do this by counting how often each word appears across the facility names. Since there will be a huge number of words, I’ll filter to only those that show up at least 50 times (~1% of remaining facilities).

# get word count
word.count <- table(unlist(strsplit(tolower(hosp.type$NAME), " "))) %>%
  as.data.frame()
colnames(word.count) <- c("Word", "Count")
word.count <- word.count[word.count$Count >= 50,]
word.count <- word.count[order(word.count$Count, decreasing = T),]
kable(word.count)
Word Count
1529 hospital 2730
573 center 1538
2060 medical 1481
1427 health 741
1 - 677
2077 memorial 540
2727 regional 420
816 county 350
745 community 274
2362 of 229
3061 st 206
96 and 183
3352 valley 160
1582 inc 156
3062 st. 153
497 campus 148
3320 university 138
1252 general 134
1430 healthcare 105
2089 mercy 105
3195 the 94
2304 north 91
3157 system 86
1876 llc 77
3016 south 77
3475 west 77
1535 hospital, 74
215 baptist 73
2854 saint 71
660 city 68
149 ascension 62
517 care 60
963 district 60
2102 methodist 57
3191 texas 57

Surprisingly, “urgent” does not even show up in this list… In fact, it shows up in only 1 hospital name in the filtered set, and likewise in only 1 name in the original, unfiltered dataset. Perhaps this has to do with the fact that urgent care clinics may be contained within other medical facilities? Difficult to say… But clearly names alone will not be useful for filtering urgent care clinics.
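
One caveat on the word count above: splitting on single spaces leaves punctuation attached, which is why “hospital” and “hospital,”, “st” and “st.”, and a bare “-” appear as separate entries. A minimal sketch of a slightly cleaner tokenization follows, using base R only. The facility names are made up for illustration.

```r
# toy facility names (illustrative, not taken from the HIFLD data)
toy.names <- c("ST. MARY'S HOSPITAL, INC",
               "ST MARY MEDICAL CENTER - NORTH CAMPUS",
               "VALLEY URGENT CARE HOSPITAL")

# lower-case, drop apostrophes, replace any remaining character that is not a
# letter, digit, or space with a space, then split on runs of spaces
clean <- gsub("[^a-z0-9 ]", " ", gsub("'", "", tolower(toy.names)))
words <- unlist(strsplit(trimws(clean), " +"))

# tabulate and sort by descending count
word.count.toy <- sort(table(words), decreasing = TRUE)
print(word.count.toy)

# count names containing "urgent", independent of how words are split
n.urgent <- sum(grepl("urgent", toy.names, ignore.case = TRUE))
print(n.urgent)
```

Applied to the real names, this would merge a few of the rows in the table above, but it would not change the conclusion about “urgent”.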

Looking back at the TYPE attribute a little deeper, perhaps the most relevant distinction is between “GENERAL ACUTE CARE” and “CRITICAL ACCESS” hospitals. After a little Googling, I found a definition for each provided by the Madison County Memorial Hospital:

Acute Care Hospitals (ACH) are hospitals that provide short-term patient care, whereas Critical Access Hospitals (CAH) are small facilities that give limited outpatient and inpatient hospital services to people in rural areas. Acute care is being a patient in a Hospital rather than an Urgent Care center. Critical care is a unit for serious cases that need more one on one care and are normally part of emergency room care.

These two TYPE categories may be the best way to approximate the urgent care classification, with “GENERAL ACUTE CARE” representing what one might typically consider a “hospital” and “CRITICAL ACCESS” representing what one might typically consider an “urgent care” center. For later comparisons, I will create an additional filter that distinguishes between these two types of facilities.

# create an additional filter for just general acute care facilities
hosp.acut <- hosp.type[hosp.type$TYPE == "GENERAL ACUTE CARE",]

Inclusive of both types, there are 5136 facilities remaining. Inclusive only of general acute care facilities, there are only 3919.

I will retain both for further investigation.

Attribute #5: “TRAUMA”

According to the American Trauma Society, there are five levels of trauma center categorization. Rather than copy/paste all of the details here, I will point readers to the American Trauma Society’s description of the classification system. Unfortunately, this attribute is quite complex within the dataset… Where one might want a simple I, II, III, etc. classification, there are many more categories and combinations of categories. Here is a tabulation of the counts of each unique value within the “TRAUMA” field of the inclusive “GENERAL ACUTE CARE” and “CRITICAL ACCESS” data:

# count unique TRAUMA values
df.trauma <- table(hosp.type$"TRAUMA") %>%
  as.data.frame()
colnames(df.trauma) <- c("TRAUMA", "Count")
df.trauma <- df.trauma[order(df.trauma$Count, decreasing = T),]
kable(df.trauma)
TRAUMA Count
20 NOT AVAILABLE 2933
18 LEVEL IV 957
16 LEVEL III 515
9 LEVEL II 324
3 LEVEL I 211
26 TRH 37
25 TRF 32
2 CTH 29
6 LEVEL I ADULT, LEVEL II PEDIATRIC 18
5 LEVEL I ADULT, LEVEL I PEDIATRIC 17
19 LEVEL V 17
1 ATH 8
12 LEVEL II ADULT, LEVEL II PEDIATRIC 8
17 LEVEL III ADULT 7
4 LEVEL I ADULT 5
23 RTC 5
8 LEVEL I, LEVEL II PEDIATRIC 2
10 LEVEL II / PEDIATRIC 2
11 LEVEL II ADULT 2
7 LEVEL I, LEVEL I PEDIATRIC 1
13 LEVEL II REHAB 1
14 LEVEL II, LEVEL II PEDIATRIC 1
15 LEVEL II, LEVEL III PEDIATRIC, LEVEL II REHAB 1
21 PARC 1
22 REGIONAL 1
24 RTH 1

After some sleuthing, I found this page with definitions specific to this dataset:

https://hifld-geoplatform.opendata.arcgis.com/datasets/geoplatform::trauma-levels-1/explore

Unfortunately, it’s incomplete, and the definitions appear somewhat inconsistent between states. But, at least it helped me define some unknown abbreviations:

  • ATH: Area trauma center (seemingly comparable to Level III?)
  • CTH: Community trauma facility (seemingly comparable to Level IV?)
  • PARC: Primary adult resource center (can’t find trauma center level comparison)
  • REGIONAL: Regional trauma center (possibly comparable to Level II?)
  • RTC: Also regional trauma center (also possibly comparable to Level II?)
  • RTH: Regional trauma facility (also possibly comparable to Level II?)
  • TRF: Trauma receiving facility (seemingly comparable to Level V?)
  • TRH: Trauma receiving hospital (also seemingly comparable to Level V?)

So, most of them can be matched up with a trauma level. But there remains a challenge where there are multiple classifications (e.g., “LEVEL I ADULT, LEVEL II PEDIATRIC”). I guess we can assume that it’s safe to ignore the pediatric and rehab levels. Taking that into account, I will create a manual lookup table to attempt to assign as many facilities to a trauma center level as possible:

# build trauma lookup table
trauma.lut <- matrix(c("ATH", 3,
                       "CTH", 4,
                       "LEVEL I", 1,
                       "LEVEL I ADULT", 1,
                       "LEVEL I ADULT, LEVEL I PEDIATRIC", 1,
                       "LEVEL I ADULT, LEVEL II PEDIATRIC", 1,
                       "LEVEL I, LEVEL I PEDIATRIC", 1,
                       "LEVEL I, LEVEL II PEDIATRIC", 1,
                       "LEVEL II", 2,
                       "LEVEL II / PEDIATRIC", 2,
                       "LEVEL II ADULT", 2,
                       "LEVEL II ADULT, LEVEL II PEDIATRIC", 2,
                       "LEVEL II REHAB", NA,
                       "LEVEL II, LEVEL II PEDIATRIC", 2,
                       "LEVEL II, LEVEL III PEDIATRIC, LEVEL II REHAB", 2,
                       "LEVEL III", 3,
                       "LEVEL III ADULT", 3,
                       "LEVEL IV", 4,
                       "LEVEL V", 5,
                       "NOT AVAILABLE", NA,
                       "PARC", NA,
                       "REGIONAL", 2,
                       "RTC", 2,
                       "RTH", 2,
                       "TRF", 5,
                       "TRH", 5),
                     ncol = 2, byrow = T)
trauma.lut <- as.data.frame(trauma.lut)
colnames(trauma.lut) <- c("TRAUMA", "TRAUMA_RC")
trauma.lut$TRAUMA_RC <- as.numeric(trauma.lut$TRAUMA_RC)
kable(trauma.lut)
TRAUMA TRAUMA_RC
ATH 3
CTH 4
LEVEL I 1
LEVEL I ADULT 1
LEVEL I ADULT, LEVEL I PEDIATRIC 1
LEVEL I ADULT, LEVEL II PEDIATRIC 1
LEVEL I, LEVEL I PEDIATRIC 1
LEVEL I, LEVEL II PEDIATRIC 1
LEVEL II 2
LEVEL II / PEDIATRIC 2
LEVEL II ADULT 2
LEVEL II ADULT, LEVEL II PEDIATRIC 2
LEVEL II REHAB NA
LEVEL II, LEVEL II PEDIATRIC 2
LEVEL II, LEVEL III PEDIATRIC, LEVEL II REHAB 2
LEVEL III 3
LEVEL III ADULT 3
LEVEL IV 4
LEVEL V 5
NOT AVAILABLE NA
PARC NA
REGIONAL 2
RTC 2
RTH 2
TRF 5
TRH 5
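
As a cross-check on the manual table, the “LEVEL …” values could also be parsed programmatically. The sketch below is one possible approach, assuming the same rule as above (ignore PEDIATRIC and REHAB designations, then take the best remaining level). The function name parse.trauma is hypothetical, and the state-specific abbreviations (ATH, CTH, etc.) are deliberately left as NA because they still need a manual mapping.

```r
# hypothetical helper: derive a numeric trauma level from a TRAUMA string,
# ignoring PEDIATRIC and REHAB designations; state-specific abbreviations
# (ATH, CTH, ...) are not handled here and come back as NA
parse.trauma <- function(x) {
  pattern <- "LEVEL [IV]+( (ADULT|PEDIATRIC|REHAB))?"
  hits <- regmatches(x, gregexpr(pattern, x))
  vapply(hits, function(m) {
    # drop designations that are explicitly pediatric or rehab
    keep <- m[!grepl("PEDIATRIC|REHAB", m)]
    if (length(keep) == 0) return(NA_integer_)
    # pull out the Roman numeral and keep the lowest (most capable) level
    roman <- sub("^LEVEL ([IV]+).*$", "\\1", keep)
    min(as.integer(utils::as.roman(roman)))
  }, integer(1))
}

# a few of the values tabulated above
vals <- c("LEVEL I ADULT, LEVEL II PEDIATRIC", "LEVEL II / PEDIATRIC",
          "LEVEL II REHAB", "LEVEL II, LEVEL III PEDIATRIC, LEVEL II REHAB",
          "LEVEL IV", "NOT AVAILABLE", "TRH")
print(data.frame(TRAUMA = vals, TRAUMA_RC = parse.trauma(vals)))
```

On these values the parsed levels agree with the manual lookup table, which is some reassurance that the hand-built mapping follows a consistent rule.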

I will join this table to the filtered hospitals data to get counts according to this new modified trauma center level classification:

# join the tables together
hosp.type <- merge(hosp.type, trauma.lut)

# summarize trauma center levels based on reclassified data
df.trauma.rc <- table(hosp.type$"TRAUMA_RC", useNA = "ifany") %>%
  as.data.frame()
colnames(df.trauma.rc) <- c("TRAUMA_RC", "Count")
df.trauma.rc <- df.trauma.rc[order(df.trauma.rc$Count, decreasing = T),]
kable(df.trauma.rc)
TRAUMA_RC | Count
NA | 2935
4 | 986
3 | 530
2 | 345
1 | 254
5 | 86

OK, much simpler. But, there are a few issues:

  1. There are a lot of NAs. Does this mean they are just not considered trauma centers? Or does this mean they do not have reliable trauma center attribution? No way to know, I suppose.
  2. It’s still not clear what a good trauma center level threshold would be with respect to EGET.

Nevertheless, it’s probably worth understanding the relationship between the two facility types described above (“GENERAL ACUTE CARE” and “CRITICAL ACCESS”) and trauma center levels. To do this, I will cross-tabulate the totals between the two fields:

# cross-tabulate TYPE and TRAUMA_RC
df.type.vs.trauma <- table(hosp.type$"TRAUMA_RC", hosp.type$"TYPE", useNA = "ifany") %>%
  as.data.frame()
colnames(df.type.vs.trauma) <- c("TRAUMA_RC", "TYPE", "Count")
df.type.vs.trauma <- df.type.vs.trauma[,c("TYPE", "TRAUMA_RC", "Count")]
df.type.vs.trauma <- df.type.vs.trauma[order(df.type.vs.trauma$TYPE, df.type.vs.trauma$TRAUMA_RC, decreasing = T),]
kable(df.type.vs.trauma)
TYPE | TRAUMA_RC | Count
GENERAL ACUTE CARE | 5 | 3
GENERAL ACUTE CARE | 4 | 443
GENERAL ACUTE CARE | 3 | 504
GENERAL ACUTE CARE | 2 | 343
GENERAL ACUTE CARE | 1 | 253
GENERAL ACUTE CARE | NA | 2373
CRITICAL ACCESS | 5 | 83
CRITICAL ACCESS | 4 | 543
CRITICAL ACCESS | 3 | 26
CRITICAL ACCESS | 2 | 2
CRITICAL ACCESS | 1 | 1
CRITICAL ACCESS | NA | 562

OK, so “CRITICAL ACCESS” facilities tend to be level IV and level V centers, whereas “GENERAL ACUTE CARE” facilities are spread across levels I through IV and are almost never level V. In both types, though, a large share of facilities have no trauma level at all (NA). Still not a very clear distinction, at least quantitatively. I suppose we would need some incident commander or safety officer or even medical professional to provide insight here?

Nevertheless, to provide a spatial comparison of how applying different trauma center level thresholds would affect EGET, I will still generate a series of subsets:

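# create subsets
# which() keeps only rows where the comparison is TRUE, so facilities with an
# NA trauma level are explicitly excluded
hosp.trauma.5 <- hosp.type[which(hosp.type$TRAUMA_RC <= 5),]
hosp.trauma.4 <- hosp.type[which(hosp.type$TRAUMA_RC <= 4),]
hosp.trauma.3 <- hosp.type[which(hosp.type$TRAUMA_RC <= 3),]
hosp.trauma.2 <- hosp.type[which(hosp.type$TRAUMA_RC <= 2),]
hosp.trauma.1 <- hosp.type[which(hosp.type$TRAUMA_RC <= 1),]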
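
One thing to watch with threshold subsets like these is that TRAUMA_RC contains many NA values, and in R a comparison against NA yields NA rather than FALSE. The sketch below shows, on a toy data frame with made-up values, how rows with NA behave under plain logical indexing compared with which(). This is base data frame behavior; I have not verified exactly how sf objects treat such rows.

```r
# toy data frame standing in for hosp.type (values are illustrative)
toy <- data.frame(NAME = c("A", "B", "C", "D"),
                  TRAUMA_RC = c(1, 4, NA, 5))

# a comparison against NA yields NA, not FALSE
print(toy$TRAUMA_RC <= 4)

# logical indexing keeps an all-NA placeholder row for each NA in the index
sub.lgl <- toy[toy$TRAUMA_RC <= 4, ]

# which() returns only the positions that are TRUE, so NA rows are dropped
sub.which <- toy[which(toy$TRAUMA_RC <= 4), ]

print(nrow(sub.lgl))    # counts the NA placeholder row as well
print(nrow(sub.which))  # counts only rows with a known level of 4 or lower
```

Either which() or an explicit !is.na() condition makes the intent clear that facilities without a trauma level are left out of the thresholded subsets.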

Spatial comparisons

To understand how using different hospital subsets will affect EGET, I will create a series of maps representing the CONUS-wide Euclidean distance to the nearest medical facility for each subset. What we know for sure (or at least are quite confident in…) is that we want medical facilities that are:

  1. Open (as opposed to closed)
  2. Belonging to either “GENERAL ACUTE CARE” or “CRITICAL ACCESS” type classifications (as opposed to psychiatric, rehab, etc.)

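So, those will be the bare minimum requirements. Starting from that baseline, I will compare the following subsets:

  1. Subset 1: open “GENERAL ACUTE CARE” and “CRITICAL ACCESS” facilities (hosp.type)
  2. Subset 2: open “GENERAL ACUTE CARE” facilities only (hosp.acut)
  3. Subset 3: Subset 1 facilities with a reclassified trauma level of V or better (TRAUMA_RC <= 5)
  4. Subset 4: Subset 1 facilities with a reclassified trauma level of IV or better (TRAUMA_RC <= 4)
  5. Subset 5: Subset 1 facilities with a reclassified trauma level of III or better (TRAUMA_RC <= 3)
  6. Subset 6: Subset 1 facilities with a reclassified trauma level of II or better (TRAUMA_RC <= 2)
  7. Subset 7: Subset 1 facilities with a reclassified trauma level of I (TRAUMA_RC <= 1)

Before generating the maps, I will lay some spatial groundwork. This includes reprojecting the data into a projection suitable for CONUS-wide distance analyses (EPSG:5070) and creating a template Euclidean distance raster. I will also define a function, applied to each subset, that maps the distance surface and tabulates the proportion of CONUS area within 10, 20, 50, 100, and 200 miles of the nearest facility, plus the proportion beyond 200 miles (the row labeled Inf).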

# reproject data
subset.1 <- st_transform(hosp.type, 5070)
subset.2 <- st_transform(hosp.acut, 5070)
subset.3 <- st_transform(hosp.trauma.5, 5070)
subset.4 <- st_transform(hosp.trauma.4, 5070)
subset.5 <- st_transform(hosp.trauma.3, 5070)
subset.6 <- st_transform(hosp.trauma.2, 5070)
subset.7 <- st_transform(hosp.trauma.1, 5070)
state.conus <- st_transform(state.conus, 5070)

# create euclidean distance template raster
state.conus$RASTERID <- 1
v <- vect(state.conus)
r <- rast(v, resolution = 1000)
ed.temp <- rasterize(v, r, "RASTERID")

# define function
ed.fun <- function(subset){
  
  # calculate and plot euclidean distance (in miles) to subset
  d <- distance(ed.temp, subset, rasterize = T) %>%
    mask(v)
  d <- d / 1609.34
  plot(d, range = c(0,575), col = viridis(256))
  lines(v)
  
  # get total area
  area <- ifel(!is.na(d), 1, NA) %>%
    global(fun = sum, na.rm = T)
  
  # get threshold counts
  d.010 <- ifel(d <= 10, 1, NA)
  p.010 <- global(d.010, fun = sum, na.rm = T) / area
  p.010 <- p.010[[1]]
  d.020 <- ifel(d <= 20, 1, NA)
  p.020 <- global(d.020, fun = sum, na.rm = T) / area
  p.020 <- p.020[[1]]
  d.050 <- ifel(d <= 50, 1, NA)
  p.050 <- global(d.050, fun = sum, na.rm = T) / area
  p.050 <- p.050[[1]]
  d.100 <- ifel(d <= 100, 1, NA)
  p.100 <- global(d.100, fun = sum, na.rm = T) / area
  p.100 <- p.100[[1]]
  d.200 <- ifel(d <= 200, 1, NA)
  p.200 <- global(d.200, fun = sum, na.rm = T) / area
  p.200 <- p.200[[1]]
  d.inf <- ifel(d > 200, 1, NA)
  p.inf <- global(d.inf, fun = sum, na.rm = T) / area
  p.inf <- p.inf[[1]]
  
  # compile into table
  result <- data.frame(dist = c(10,20,50,100,200,Inf),
                       prop = c(p.010,p.020,p.050,p.100,p.200,p.inf))
  
  # print table
  kable(result)
}
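
The threshold bookkeeping in the function above repeats the same three lines for each distance. As a rough sketch of the same idea in compact form, the proportions can be computed from a plain vector of distances, since the mean of a logical vector is the proportion of TRUE values. The vector d.toy below is made up for illustration and stands in for the non-NA cell values of the distance raster.

```r
# toy distances in miles standing in for the non-NA cell values of the
# distance raster (illustrative values only)
d.toy <- c(3, 8, 15, 18, 42, 60, 95, 120, 180, 250)

# cumulative thresholds used in ed.fun
thresholds <- c(10, 20, 50, 100, 200)

# mean() of a logical vector is the proportion of TRUE values, so each entry
# is the share of cells within that distance
prop.within <- vapply(thresholds, function(t) mean(d.toy <= t), numeric(1))

# share of cells beyond the largest threshold; this is 0 (not NaN) when no
# cell exceeds it
prop.beyond <- mean(d.toy > max(thresholds))

result <- data.frame(dist = c(thresholds, Inf),
                     prop = c(prop.within, prop.beyond))
print(result)
```

Because all cells share the same size in an equal-area projection, proportions of cells and proportions of area are equivalent here.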

Subset 1

# apply function
ed.fun(subset.1)

dist prop
10 0.2980930
20 0.6617513
50 0.9556925
100 0.9993678
200 1.0000000
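Inf NaN

(The NaN in the last row arises because no cell is more than 200 miles from a Subset 1 facility, so the function sums over an all-NA raster. It should be read as a proportion of 0.)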

Subset 2

# apply function
ed.fun(subset.2)

dist prop
10 0.1924894
20 0.4471360
50 0.7811645
100 0.9472940
200 0.9999970
Inf 0.0000030

Subset 3

# apply function
ed.fun(subset.3)

dist prop
10 0.1629902
20 0.4429677
50 0.8626250
100 0.9725907
200 0.9998984
Inf 0.0001016

Subset 4

# apply function
ed.fun(subset.4)

dist prop
10 0.1547315
20 0.4187674
50 0.8430527
100 0.9682738
200 0.9994935
Inf 0.0005065

Subset 5

# apply function
ed.fun(subset.5)

dist prop
10 0.0728470
20 0.2197222
50 0.6358646
100 0.8989022
200 0.9979838
Inf 0.0020162

Subset 6

# apply function
ed.fun(subset.6)

dist prop
10 0.0378014
20 0.1207872
50 0.4445640
100 0.8002556
200 0.9950198
Inf 0.0049802

Subset 7

# apply function
ed.fun(subset.7)

dist prop
10 0.0169285
20 0.0577828
50 0.2573693
100 0.5562324
200 0.8260210
Inf 0.1739790
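
To compare the seven subsets side by side, the proportions reported above can be gathered into a single table. The sketch below simply copies the 10-, 50-, and 100-mile rows from the tables in this section and converts them to percentages. The column names are my own.

```r
# proportions copied from the seven tables above (10-, 50-, and 100-mile rows)
comp <- data.frame(
  subset    = 1:7,
  within10  = c(0.2980930, 0.1924894, 0.1629902, 0.1547315,
                0.0728470, 0.0378014, 0.0169285),
  within50  = c(0.9556925, 0.7811645, 0.8626250, 0.8430527,
                0.6358646, 0.4445640, 0.2573693),
  within100 = c(0.9993678, 0.9472940, 0.9725907, 0.9682738,
                0.8989022, 0.8002556, 0.5562324)
)

# express as percentages rounded to one decimal place for easier reading
comp.pct <- comp
comp.pct[, -1] <- round(100 * comp[, -1], 1)
print(comp.pct)
```

Laid out this way, it is easier to see that Subset 2 (general acute care only) covers less area within 50 miles than Subsets 3 and 4, even though those subsets exclude facilities without a trauma level.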

Concluding remarks and questions

Hopefully this was a useful exercise for better understanding the hospitals data upon which we intend to base the next generation of EGET. As can be expected, the more restrictive the filtering applied to the hospitals, the more pessimistic the resulting EGET will appear. It is worth noting that these Euclidean distances are best-case scenarios. EGET, which will take into account road networks and off-road terrain/vegetation conditions, will produce much higher effective travel distances to these facilities. But this provides a fast approximation that will hopefully inform our conversation about how best to filter the hospitals data.

As mentioned earlier, two filters appear clear:

  1. Facilities must have STATUS == “OPEN”
  2. Facilities must belong to TYPE in [“GENERAL ACUTE CARE”, “CRITICAL ACCESS”]

As for additional filtering beyond those fairly generous thresholds, the questions remain:

  1. Should we filter to just “GENERAL ACUTE CARE” facilities, which tend to be larger hospitals? Or should we also include “CRITICAL ACCESS” facilities, which tend to include smaller, urgent care-type facilities?
  2. Or, alternatively, should we base this thresholding on trauma center levels? If so, what is an appropriate level?