Description

The covid-19 cases have reached over one million cases in the United States, and it is very concerning that the hospitalization begins to overwhelm for some counties where the proportion of elderly population is high and the hospitalization capacity and ICU care is limited. This analysis aims to stress the importance of early “Stay at Home” Order on relieving hospitalization saturation status and try to provide policymakers some implication on when to reopen the economy.

Research Question

Objective

Why do we use hospital and ICU beds as important indicators instead of ventilators?

  • Because hospitals can leverage the resources available to prepare more beds for potential cases in need of ICU care. There are more beds in a hospital than those of the ICU, and if we predict a surge in cases that will need ICU care, the hospital can prepare ahead and clear floors in the facilities to serve those patients.
  • Data regarding existing ICU beds and hospital beds are stable, while data regarding the existence and availability of ventilators may not be up to date, and may fluctuate, especially considering the current political climate regarding the competition for ventilators by state governors.
  • A patient that has an ICU need may not need a ventilator but may need other services that can be provided in an ICU setting (such as increased round-the-clock care). Patients who may be recovering may need to be in a “step-down” ICU setting, for which hospitals can be proactive and create such settings to address the proposed increase in demand.

LOADING ALL ORIGINAL DATA SETS HERE

hospital <- read.csv("/Users/kaiqingy/Desktop/95778 R for DS/final project/Hospitals.csv")
names(hospital)
##  [1] "X"          "Y"          "OBJECTID"   "ID"         "NAME"      
##  [6] "ADDRESS"    "CITY"       "STATE"      "ZIP"        "ZIP4"      
## [11] "TELEPHONE"  "TYPE"       "STATUS"     "POPULATION" "COUNTY"    
## [16] "COUNTYFIPS" "COUNTRY"    "LATITUDE"   "LONGITUDE"  "NAICS_CODE"
## [21] "NAICS_DESC" "SOURCE"     "SOURCEDATE" "VAL_METHOD" "VAL_DATE"  
## [26] "WEBSITE"    "STATE_ID"   "ALT_NAME"   "ST_FIPS"    "OWNER"     
## [31] "TTL_STAFF"  "BEDS"       "TRAUMA"     "HELIPAD"
icu_bed <- read.csv("/Users/kaiqingy/Desktop/95778 R for DS/final project/KHN_ICU_bed_county_analysis_2.csv")
names(icu_bed)
##  [1] "cnty_fips"                 "cnty_name"                
##  [3] "st"                        "state"                    
##  [5] "hospitals_in_cost_reports" "Hospitals_in_HC"          
##  [7] "all_icu"                   "Total_pop"                
##  [9] "X60plus"                   "X60plus_pct"              
## [11] "X60plus_per_each_icu_bed"
nyt <- read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv", stringsAsFactors = FALSE)

Variables of Interests

NYT case source variables: date, county, state, fips, cases

  • The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the county level over time.The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. The html linkage will keep regular updates to the data and compile time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
  • County-level data can be found in the data resource > date,county,state,fips,cases,deaths > 2020-01-21,Snohomish,Washington,53061,1,0 > …

TO DO TASK - Data visualizations for 1 county (by fips code) and visualize the number of cases present in the whole date range available for New York State - Do NYCs fips code to sense check–sense check : reason why we used New York - Is there any linear/exponential/logarithmic trending? We see exponential trending

US hospital variables: beds, country (US), state, county, county fips, owner, population, state, status, type

  • This feature class/shapefile contains locations of Hospitals for 50 US states from Homeland Infrastructure Foundation-Level Data. The dataset only includes hospital facilities based on data acquired from various state departments or federal sources which has been referenced in the SOURCE field. Hospital facilities which do not occur in these sources will be not present in the database. Hospitals have been categorized into children, chronic disease, critical access, general acute care, long term care, military, psychiatric, rehabilitation, special, and women based on the range of the available values from the various sources after removing similarities.

EDA ON HOSPITAL DATA SET

# I output a summary of the hospital data set
summary(hospital)
##        X                 Y             OBJECTID          ID           
##  Min.   :-176.64   Min.   :-14.29   Min.   :   1   Min.   :        4  
##  1st Qu.: -98.20   1st Qu.: 33.46   1st Qu.:1896   1st Qu.:  3983422  
##  Median : -90.07   Median : 37.97   Median :3791   Median :  9656520  
##  Mean   : -92.36   Mean   : 37.33   Mean   :3791   Mean   : 25587207  
##  3rd Qu.: -81.77   3rd Qu.: 41.32   3rd Qu.:5686   3rd Qu.: 22634748  
##  Max.   : 145.72   Max.   : 71.29   Max.   :7581   Max.   :182120889  
##                                                                       
##                                   NAME                    ADDRESS    
##  MEMORIAL HOSPITAL                  :  12   100 HOSPITAL DRIVE:   8  
##  COMMUNITY MEMORIAL HOSPITAL        :   8   CENTRO MEDICO     :   6  
##  COMMUNITY HOSPITAL                 :   7   200 HOSPITAL DRIVE:   5  
##  MERCY HOSPITAL                     :   7   ONE HOSPITAL DRIVE:   4  
##  BAYLOR EMERGENCY MEDICAL CENTER    :   6   1 HOSPITAL DRIVE  :   3  
##  HEALTHSOUTH REHABILITATION HOSPITAL:   6   500 HOSPITAL DRIVE:   3  
##  (Other)                            :7535   (Other)           :7552  
##            CITY          STATE           ZIP                   ZIP4     
##  HOUSTON     :  70   TX     : 798   Min.   :  605   NOT AVAILABLE:7471  
##  SAN ANTONIO :  45   CA     : 570   1st Qu.:32025   7213         :   3  
##  CHICAGO     :  41   FL     : 349   Median :54971   8497         :   2  
##  DALLAS      :  40   OH     : 290   Mean   :53616   0010         :   1  
##  PHILADELPHIA:  36   PA     : 279   3rd Qu.:76825   0030         :   1  
##  COLUMBUS    :  35   NY     : 277   Max.   :99929   0159         :   1  
##  (Other)     :7314   (Other):5018                   (Other)      : 102  
##           TELEPHONE                    TYPE         STATUS    
##  NOT AVAILABLE :1084   GENERAL ACUTE CARE:4524   CLOSED: 435  
##  (603) 893-2900:   5   CRITICAL ACCESS   :1031   OPEN  :7146  
##  (205) 715-5309:   4   PSYCHIATRIC       : 741                
##  (513) 636-4200:   4   LONG TERM CARE    : 431                
##  (207) 872-1000:   3   REHABILITATION    : 366                
##  (215) 612-4000:   3   MILITARY          : 213                
##  (Other)       :6478   (Other)           : 275                
##    POPULATION              COUNTY       COUNTYFIPS   COUNTRY   
##  Min.   :-999.00   LOS ANGELES: 131   06037  : 131   ASM:   1  
##  1st Qu.:  25.00   HARRIS     : 108   48201  :  93   GUM:   3  
##  Median :  65.00   JEFFERSON  :  94   04013  :  73   MNP:   1  
##  Mean   :  40.68   MONTGOMERY :  80   17031  :  73   PLW:   1  
##  3rd Qu.: 177.00   COOK       :  75   48113  :  58   PRI:  75  
##  Max.   :1592.00   MARICOPA   :  73   48439  :  54   USA:7498  
##                    (Other)    :7020   (Other):7099   VIR:   2  
##     LATITUDE        LONGITUDE         NAICS_CODE    
##  Min.   :-14.29   Min.   :-176.64   Min.   :622110  
##  1st Qu.: 33.46   1st Qu.: -98.20   1st Qu.:622110  
##  Median : 37.97   Median : -90.07   Median :622110  
##  Mean   : 37.33   Mean   : -92.36   Mean   :622141  
##  3rd Qu.: 41.32   3rd Qu.: -81.77   3rd Qu.:622110  
##  Max.   : 71.29   Max.   : 145.72   Max.   :622310  
##                                                     
##                                                         NAICS_DESC  
##  GENERAL MEDICAL AND SURGICAL HOSPITALS                      :5994  
##  PSYCHIATRIC AND SUBSTANCE ABUSE HOSPITALS                   : 581  
##  SPECIALTY (EXCEPT PSYCHIATRIC AND SUBSTANCE ABUSE) HOSPITALS: 425  
##  REHABILITATION HOSPITALS (EXCEPT ALCOHOLISM, DRUG ADDICTION): 330  
##  CHILDREN'S HOSPITALS, GENERAL                               :  88  
##  EXTENDED CARE HOSPITALS (EXCEPT MENTAL, SUBSTANCE ABUSE)    :  42  
##  (Other)                                                     : 121  
##                                                                     SOURCE    
##  http://www.dshs.texas.gov/facilities/find-a-licensee.aspx             : 740  
##  http://www.oshpd.ca.gov/HID/Facility-Listing.html                     : 531  
##  http://www.floridahealthfinder.gov/facilitylocator/facilitysearch.aspx: 340  
##  http://publicapps.odh.ohio.gov/EID/Provider_Search.aspx               : 277  
##  https://profiles.health.ny.gov/hospital/county_or_region/             : 277  
##  http://new.dhh.louisiana.gov/index.cfm/directory/category/169         : 258  
##  (Other)                                                               :5158  
##                     SOURCEDATE           VAL_METHOD  
##  2018-08-09T00:00:00.000Z:2397   IMAGERY      :4497  
##  2018-08-10T00:00:00.000Z:1164   IMAGERY/OTHER:3084  
##  2018-08-11T00:00:00.000Z:1036                       
##  2018-08-16T00:00:00.000Z: 699                       
##  2018-08-08T00:00:00.000Z: 637                       
##  2018-08-13T00:00:00.000Z: 345                       
##  (Other)                 :1303                       
##                      VAL_DATE   
##  2014-02-10T00:00:00.000Z:3725  
##  2014-03-12T00:00:00.000Z: 949  
##  2019-05-09T00:00:00.000Z: 137  
##  2016-04-15T00:00:00.000Z:  84  
##  2014-02-12T00:00:00.000Z:  83  
##  2015-05-22T00:00:00.000Z:  71  
##  (Other)                 :2532  
##                                                                     WEBSITE    
##  NOT AVAILABLE                                                          : 384  
##  http://www.kaiserpermanente.org                                        :  39  
##  http://www.aurorahealthcare.org                                        :  12  
##  http://www.bannerhealth.com/Locations/Arizona/Banner+Behavioral +Health:  10  
##  http://www.iuhealth.org                                                :  10  
##  http://www.providence.org                                              :  10  
##  (Other)                                                                :7116  
##           STATE_ID                                   ALT_NAME   
##  NOT AVAILABLE:3655   NOT AVAILABLE                      :6677  
##  10030        :  10   CHARLESTON AREA MEDICAL CENTER     :   4  
##  10010        :   8   SWEDISH MEDICAL CENTER - FIRST HILL:   4  
##  10020        :   8   CAPITAL MEDICAL CENTER             :   3  
##  10040        :   8   SKAGIT VALLEY HOSPITAL             :   3  
##  10070        :   7   CENTENNIAL MEDICAL CENTER          :   2  
##  (Other)      :3885   (Other)                            : 888  
##     ST_FIPS                                  OWNER        TTL_STAFF   
##  Min.   : 1.00   GOVERNMENT - DISTRICT/AUTHORITY: 525   Min.   :-999  
##  1st Qu.:17.00   GOVERNMENT - FEDERAL           : 266   1st Qu.:-999  
##  Median :29.00   GOVERNMENT - LOCAL             : 507   Median :-999  
##  Mean   :29.47   GOVERNMENT - STATE             : 295   Mean   :-999  
##  3rd Qu.:44.00   NON-PROFIT                     :3536   3rd Qu.:-999  
##  Max.   :78.00   NOT AVAILABLE                  : 371   Max.   :-999  
##                  PROPRIETARY                    :2081                 
##       BEDS                   TRAUMA     HELIPAD 
##  Min.   :-999.00   NOT AVAILABLE:5427   N:3380  
##  1st Qu.:  25.00   LEVEL IV     : 886   Y:4201  
##  Median :  66.00   LEVEL III    : 477           
##  Mean   :  46.53   LEVEL II     : 332           
##  3rd Qu.: 178.00   LEVEL I      : 207           
##  Max.   :1592.00   TRH          :  35           
##                    (Other)      : 217

I want to visualize the variables we have selected–get an understanding as to what states have the most beds

From the summary statistics, we can tell there are a few hospitals that have bed values of -999. I found that -999 means that there is data missing for that hospital. Info available at the metadata of the dataset source (https://www.arcgis.com/sharing/rest/content/items/6ac5e325468c4cb9b905f1728d6fbf0f/info/metadata/metadata.xml?format=default&output=html)

Hence, I want to remove those hospitals with -999 values in the beds, and refine the hospital dataset further. Now, I will call it hospitals and graph the number of beds by state to make sure that now the dataframe hospitals only has hospitals for which we know for certain the amount of beds present.

TO DO TASK - See what types of hospitals are present - Get rid of hospital types that don’t have ICU capability–long term care facilities would not provide ICU capability (do research on types of hospitals that could produce ICU) in the data wrangling section

Kaiser Family News ICU Bed Count variables: Cnty_fips, All_ICU, Total_pop

  • Kaiser Health News evaluated the capacity of intensive care unit (ICU) beds around the nation by first identifying the number of ICU beds each hospital reported in its most recent financial cost report, filed annually to the Centers for Medicare & Medicaid Services. KHN included beds reported in the categories of intensive care unit, surgical intensive care unit, coronary care unit and burn intensive care unit. KHN then totaled the ICU beds per county and matched the data with county population figures from the Census Bureau’s American Community Survey. KHN focused on the number of people 60 and older in each county because older people are considered the most likely group to require hospitalization, given their increased frailty and existing health conditions compared with younger people. For each county, KHN calculated the number of people 60 and older for each ICU bed. KHN also calculated the percentage of county residents who were 60 or older.
  • Cnty_fips: five-digit code for each county (or incorporated city); first two digits are the state FIPS (Federal Information Processing Standards) and the last three are for the county.
  • Hospitals_in_HC: the number of hospitals in the Hospital Compare general information file, for each county.
  • All_ICU: the number of ICU beds reported in the most recent cost report for each hospital, including the categories “intensive care unit,” “coronary care unit,” “burn intensive care unit” and “surgical intensive care unit.” Aggregated by county.
  • Total_pop: the total population for each county according to the 2017 five-year ACS. -Total_pop: the total population for each county according to the 2017 five-year ACS.
  • 60plus: the total population for each county that is 60 or older.
  • 60plus_pct: the percent of the total population that is 60 or older.
  • 60plus_per_each_icu_bed: The population 60 and older divided by the total number of ICU_beds,

EDA ON ICU BED DATA SET ### How many counties have no hospital ICU beds? How many counties have no hospitals?

##summarize counties without hospitals and check their population over 60
no_hos <- icu_bed%>%
  group_by(cnty_name, state)%>%
  filter(hospitals_in_cost_reports == 0)%>%
  select(cnty_name, state, X60plus, X60plus_pct)
summary(no_hos)
##    cnty_name        state        X60plus       X60plus_pct    
##  Lee    :  6   Texas   : 78   Min.   :   27   Min.   :0.0580  
##  Monroe :  6   Virginia: 69   1st Qu.: 1388   1st Qu.:0.2280  
##  Carroll:  5   Georgia : 55   Median : 2733   Median :0.2620  
##  Clay   :  5   Missouri: 46   Mean   : 4079   Mean   :0.2655  
##  Jackson:  5   Kentucky: 43   3rd Qu.: 4912   3rd Qu.:0.2975  
##  Perry  :  5   Nebraska: 28   Max.   :74936   Max.   :0.6420  
##  (Other):667   (Other) :380
##summarize counties without hospitals and check their population over 60
no_icu <- icu_bed%>%
  group_by(cnty_name, state)%>%
  filter(all_icu == 0)%>%
  select(cnty_name, state, X60plus, X60plus_pct)
summary(no_icu)
##       cnty_name         state         X60plus       X60plus_pct    
##  Lincoln   :  16   Texas   : 166   Min.   :   27   Min.   :0.0580  
##  Washington:  15   Georgia :  95   1st Qu.: 1667   1st Qu.:0.2290  
##  Clay      :  13   Nebraska:  81   Median : 3113   Median :0.2600  
##  Jackson   :  13   Missouri:  78   Mean   : 4250   Mean   :0.2626  
##  Grant     :  12   Kansas  :  76   3rd Qu.: 5243   3rd Qu.:0.2940  
##  Franklin  :  11   Virginia:  76   Max.   :74936   Max.   :0.6420  
##  (Other)   :1585   (Other) :1093
  • It is quite concerning that 699 counties have no hospitals, and 1655 counties have no icu bes. That’s quite concerning in some state like Texas, Virgina where elderly population percentage accounts for 20% of the state. It’s even alarming for some states where elderly accounts for 60% of the population without icu intensive care nor hospital. We might assume those counties will be vulnerable for the high case and fatality rate if there’s no prompt shutdown policy and supplementary medical care.

  • Initially, we were interested in all the variables of interest above. Over time and after some EDA and discussions, we decided to narrow down our scope from this data set to only include: Cnty_fips, All_ICU, Total_pop

TO DO TASK -Change Cnty_fips to FIPS to make sure we have primary key to merge the three datasets -Select the above variables and create subset -Remove missing or NA values

Data Wrangling

Clean Hospital Dataset

I create a data set “hospitals” that has my variables of interest in place, as well as renaimg the variable COUNTY_FIPS to FIPS, as this is going to be the variable that we will use to link the data sets by.

hospital <- select(hospital, NAME, BEDS, CITY, STATE, TYPE, STATUS, POPULATION, COUNTY, COUNTYFIPS, COUNTRY, NAICS_CODE, NAICS_DESC, ST_FIPS, BEDS, HELIPAD)

hospital <- rename(hospital, FIPS = COUNTYFIPS)

Clean Kaiser Family ICU dataset

Creating the ICU data set and renaming the variables. This data is aggregated by FIPS (County).

icu_bed <- rename(icu_bed, FIPS = cnty_fips)

icu_bed <- select(icu_bed, FIPS, all_icu, Total_pop)

Narrow Kaiser Family ICU dataset

Handling on the NYT coronavirus data set, first we need to convert the date strings from charcter into date format.

nyt$date <- as.Date(nyt$date, format= "%Y-%m-%d")

nyt <- rename(nyt, FIPS = fips)

nyt <- select(nyt, FIPS, county, state, date, cases, deaths)

I also want to make sure that the hospitals that are in the data set can potentially provide ICU care. So, I will need to analyse the types of hospitals present. There are a variety of hospitals in this data set, but only a few would have potential to work at a level close to or as an ICU if needed: General Acute Care, Children, Women, Military, Critical Access. Now, the subset of hospitals that I have created only has our hospitals of interest, have > 0 beds.

hospitals <- filter(hospital, BEDS > 0)

hospitals <- subset(hospitals, TYPE %in% c("GENERAL ACUTE CARE","CHILDREN", "MILITARY", "CRITICAL ACCESS", "WOMEN"))

qplot(hospitals$STATE, hospitals$BEDS)

plot(hospitals$STATE, hospitals$BEDS)

The minimum number of beds available are 2 and the maximum number of beds in a hospital are 1592.

summary(hospitals$BEDS)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     2.0    26.0    98.0   163.4   229.0  1592.0

Aggregate number of beds at the county level. Group by county, add the total number beds.

FIPS_BEDS <- hospitals %>% group_by(FIPS) %>% summarize(total_beds = sum(BEDS))

qplot(FIPS_BEDS$FIPS, FIPS_BEDS$total_beds)

plot(FIPS_BEDS$FIPS, FIPS_BEDS$total_beds)

summary(FIPS_BEDS)
##       FIPS        total_beds     
##  01001  :   1   Min.   :    4.0  
##  01003  :   1   1st Qu.:   25.0  
##  01005  :   1   Median :   79.0  
##  01007  :   1   Mean   :  350.6  
##  01009  :   1   3rd Qu.:  245.0  
##  01011  :   1   Max.   :26472.0  
##  (Other):2526

We found that Randall Co, TX has 135,000 people, but 4 beds. It is next to Amarillo, TX. This rose a question about how do we filter. If the number of beds are lower than 10, we will assume that people in the county are going to other counties for hospitalizations.

EDA ON NYT CASE DATA SET

theme_set(theme_minimal())
## create two new coloumns to caculate the fatality rate and accumulative cases per 10,000 people in the New York City.
nyc_cases<-nyt%>%
  filter(state == "New York", county == "New York City", cases >= 100) %>%
  mutate(ft_rate = deaths/cases*100,
         cases_per_100k = cases/100000)%>%
  arrange(date, cases)

## visulaize changes of NYC covid-19 cases per 100K people over time
  p1 <- nyc_cases %>%
  ggplot()+
  geom_line(aes(x = date, y = cases_per_100k), color = "darkblue")+
  geom_line(aes(x = date, y = ft_rate), color = "darkred")+
  theme_minimal()+
  scale_x_date(date_breaks = "1 week", date_labels = "%b %d %Y")+ ##create one week as break to visualize the time series trend clearly.
  ggtitle("the confirmed Covid-19 cases per 100K population and fatality rate of New York City over time")

## create interactive plots so we can track the value changes
ggplotly(p1)

Takeaway: + I took New York City, the city where 50% of serious pandemic cases to understand the cumulative trend, it seems that the curve of cases per 100k population followed the pattern of exponential curve, whereas the fatility rate curve experieced many bumps but increase rapidly, which indicates the overwhelmed icu care and oversaturation of hospitalization. The fatality has reached up from 1.5% to 7.43% in less than 45 days. That’s pretty alarming! New York ciry is populated but also have great intensive care…

Now let’s discover the counties where cases over 2000 other than New York city in the New York State.We filter the cases over 1000 to select representative counties.

ny_state <- nyt%>%
  filter(state == "New York", cases >= 1000, county != "Unknown", county != "New York City") %>%
  mutate(ft_rate = deaths/cases*100)%>%
    arrange(date, cases)
  p2 <- ny_state %>%
    ggplot(aes(x = date, y = cases, group = county))+
    geom_line()+
    theme_minimal()+
    scale_x_date(date_breaks = "1 week", date_labels = "%b %d %Y")+
    facet_wrap(~county)
    ggtitle("the confirmed Covid-19 cases in New York State by county over time")
## $title
## [1] "the confirmed Covid-19 cases in New York State by county over time"
## 
## attr(,"class")
## [1] "labels"
  ggplotly(p2)

Summary: + Each county face different challenges depends on the population, hospital beds utlization. Similar cumulative cases increase pattern counties like Nassau, Suffolk, and Westchester are the places where people of color scattered and they are high populated counties. Other countis has relatively flat curves, which means we cannot use similar exponential model to predict their trend because that vary by each county’s lockdown policy, demographics, and hospitalization situation.

Before merging the icu beds and hospital data set

ICU_BED has no leading zero, and this will inhibit merging when I am ready to merge the hospitals data set and the ICU_BEDS data set. I need to remove the leading zero in the hospitals data set.

FIPS_BEDS <- FIPS_BEDS %>% mutate(newFIPS = str_remove(FIPS_BEDS$FIPS, "^0+"))

FIPS_BEDS <- select(FIPS_BEDS, -FIPS)
FIPS_BEDS <- rename(FIPS_BEDS, FIPS = newFIPS)

NOW I WILL MERGE HOSPITALS DATA SET AND ICU_BED

HOSP_ICU <- merge(FIPS_BEDS, icu_bed, by = "FIPS")

HERE I MERGE COUNTY WIDE BEDS AND ICU BEDS TO THE CORONAVIRUS CASES BY FIPS REPORTED BY THE NEW YORK TIMES DAILY

FINAL_SET <- merge( nyt, HOSP_ICU, by = "FIPS")
FINAL_SET1 <- FINAL_SET %>%
  mutate(icu_avail = floor(.35 * all_icu),
         beds_avail =  floor(.465 * total_beds),
         case_over65_hos = ceiling(cases * 0.31 * 0.45) )

Visualization of three counties

The selection criteria for these three counties was # of hospital beds < average(# of age 65+ hospitalization)

Glenn, CA; Yates, NY; and Mitchell, GA are similar in three ways illustrated below:

County Glenn, CA Yates, NY Mitchell, GA San Francisco, CA Nassau, NY Fulton, GA
Population Size 27,935 25,083 22,574 864,263 1,363,069 1,010,420
No. ICU Beds in Co. 0 0 0 337 461 538
Total Hospital beds in Co. 47 25 23 4310 4343 4585
# Plotting Nassau County beds available and over 65 hospitalizations. 
nassau <- FINAL_SET1 %>%
  filter(county == "Nassau", state == "New York")
ggplot(nassau, aes(date)) + 
  geom_line(aes(y = beds_avail, colour = " Total hospital beds available")) + 
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))

# Subsetting counties in Georgia that have a high number of cases of COVID-19 to visualize 

georgia <- FINAL_SET1 %>%
  filter(state == "Georgia", beds_avail < mean(case_over65_hos))

ggplot(georgia, aes(date)) + 
  geom_line(aes(y = beds_avail, colour = " Total hospital beds available")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

# Now, we know what we want our outputs to be--we need to create a calculation to depict the number of beds taken decreasing over time, as an increasing number of patients who are 65+ need to be hospitalized.

FINAL_SET2 <- FINAL_SET1 %>%
  mutate(beds_remain = (beds_avail - case_over65_hos))


# We plot beds available as cases of over 65year olds in hospitalization increase in Allegheny Co, PA.
pitts2 <- FINAL_SET2%>%
filter(county == "Allegheny", state == "Pennsylvania")

ggplot(pitts2, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

# Subsetting counties in Georgia that have a high number of cases of COVID-19 to visualize beds remaining as >65 hospitalizations increase

# We are interested in visualizing 3 counties in different states: Georgia, California and New York. The counties have similar population sizes, 

georgia2 <- FINAL_SET2 %>%
  filter(state == "Georgia", beds_avail < mean(case_over65_hos))

ggplot(georgia2, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

Now, I will select 3 counties that are comparable. ### County #1: Mitchell, GA

mitchell <- FINAL_SET2%>%
filter(county == "Mitchell", state == "Georgia")

County #2: Glenn, CA

cali <- FINAL_SET2 %>%
  filter(state == "California", beds_avail < mean(case_over65_hos))

ggplot(cali, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

glenn <- FINAL_SET2%>%
filter(county == "Glenn", state == "California")

County #3: Yates, NY

nys <- FINAL_SET2 %>%
  filter(state == "New York", beds_avail < mean(case_over65_hos))

ggplot(nys, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

yates <- FINAL_SET2%>%
filter(county == "Yates", state == "New York")

Plotting the three counties side-by-side.

ggplot(mitchell, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

ggplot(yates, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

ggplot(glenn, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

Takeaways for Three Counties–Glenn, CA; Yates, NY; and Mitchell, GA

Glenn county, CA instated stay-at-home orders on March 20th, 2020. At the time, there were no known COVID-19 cases in the county. (https://www.countyofglenn.net/news/emergency-preparedness-press-release-public-health-public-information-public-notice-public-2)

Yates, NY instated stay-at-home orders on March 23rd, 2020. On March 29th, Yates Co. reported a confirmed case in the county.(https://www.yatescounty.org/535/COVID-19)

Mitchell County, GA is located in SW Georgia. The county website does not report any date for a stay-at-home order. So, we will go with the general Georgia order. Georgia instated stay-at-home orders on April 1st. By April 28th, Mitchell Co. 293 cases and 29 deaths in total had been reported. (https://www.cnn.com/2020/04/02/politics/fact-check-georgia-gov-brian-kemp-coronavirus-no-symptoms-stay-at-home/index.html)

When we consider that the three counties are similar in population size, ICU beds, and total hospital beds in the county and the differences are in the start of the stay-at-home orders, we see that there is a huge impact having instated a stay-at-home order a few days (or weeks) before other states. Our estimates regarding the number of patients over 65 who would need to be hospitalized due to COVID-19 are higher in counties where the orders to stay home were placed later in the calendar year than those who ordered people to stay home the earliest (CA vs GA, or even NY)

Choose counties in the same three State where the available beds greater than the number of elderly who needs hospitalization

County #1: San Francisco, CA

cali2 <- FINAL_SET2 %>%
  filter(state == "California", beds_avail > mean(case_over65_hos))

ggplot(cali2, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

San_Francisco <- FINAL_SET2%>%
filter(county == "San Francisco", state == "California")

County #2: New York, NY

nys2 <- FINAL_SET2 %>%
  filter(state == "New York", beds_avail > mean(case_over65_hos))

ggplot(nys2, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

Nassau <- FINAL_SET2%>%
filter(county == "Nassau", state == "New York")

County #3: Fulton, GA

geo2 <- FINAL_SET2 %>%
  filter(state == "Georgia", beds_avail > mean(case_over65_hos))

ggplot(geo2, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Remaining hospital beds for elderly")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

Fulton <- FINAL_SET2%>%
filter(county == "Fulton", state == "Georgia")

Compare among these selected states where available beds is greater than the # of people over 65 needs hospitalization

ggplot(San_Francisco, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Total hospital beds available")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

ggplot(Nassau, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Total hospital beds available")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

ggplot(Fulton, aes(date)) + 
  geom_line(aes(y = beds_remain, colour = " Total hospital beds available")) +
  geom_line(aes(y = icu_avail, colour = "Total ICU beds available"))+
  geom_line(aes(y = case_over65_hos, colour = " Over 65 who will be hospitalized"))+
  facet_wrap(facets =  vars(county))

## Takeaways for Three Counties where available beds is greater than the # of people over 65 needs hospitalization

  • San Francisco, stay-at-home health orders have been extended through May 31 to maintain progress in slowing the spread of COVID-19. The latest order also eases restrictions on some lower-risk activities starting May. San Francisco joined five other Bay Area counties on March 16 as the first region in the U.S. to order residents to “shelter in place,” that is, avoid leaving their homes except for essential business such as grocery shopping and medical visits. On March 19, the entire state of California followed suit. (https://sf.gov/news/san-francisco-and-bay-area-extend-stay-home-order-through-end-may) (https://www.cnbc.com/2020/04/27/san-francisco-extends-stay-at-home-order-through-may-amid-coronavirus.html)

  • We weren’t able to fins county–specific information regarding Nassau Co. New York Gov. Andrew Cuomo ordered all nonessential businesses to keep 100% of their workforce at home starting March 22 (https://www.cnbc.com/2020/04/30/coronavirus-states-lifting-stay-at-home-orders-reopening-businesses.html) On March 5th, Nassau Co. reported the first confirmed case in the county, by March 21st, Nassau Co. had 1234 cases and 7 deaths reported.

  • The Board of Fulton County in Georgia ratified a countywide “Shelter in Place” order on April 1st, 2020 until further notice. This includes the majority of the city of Atlanta. Mayor of Atlanta also issued the order to stay at home for at least two weeks. Three days later, Georgia State announced the state-wide lockdown policy. (https://www.fultoncountyga.gov/news/2020/04/06/updates-on-fulton-county-facility-closures-and-service-changes) (https://atlanta.eater.com/2020/4/1/21203264/fulton-county-georgia-board-of-health-commands-resident-shelter-in-place-covid19)

  • We can tell the hospital beds overwelming situation is related to the lockdown policy start-date. San Francisco,Nassau and Fulton are three counties with smilar population size, similar amount of hospital beds and size of intensive care units and their available beds should be enough for the elderly population who need it.
  • But the overwhelming status vary: San Franciscso issued the stay at home order very early in March, so the curve of remaining beds is likely to be flatter than Fulton Co. Fulton Co established the stay at home order on April 1st. On the contrary when we look at Nassau, the county government reaction is really slow, there’s no county-wide order being placed until the state-wide order issued on March 23rd. But the first case in Nassau started in March 5th and 98th case in March 15th. Cases in Nassau reached 1000 when the state-wide stay at home policy was annouced, that’s the potential reason cause the overwelming situation.
  • The exponential curve tends to cross the available beds in late March but it’s already late to take other preventative policies.

Final Takeaways

  • California set stay-at-home orders on March 19th, Georgia on April 1, and New York on March 22nd. Our analyses show the impact of local government leadership on the coronavirus situation. The counties who set stay-at-home order before the state orders, such as San Francisco, benefitted from the foresight and have not experienced overwhelming of the hospital services. This is impressive, especially when comparing to the other two counties that have similar number of beds, ICU beds, and population sizes.

  • Smaller counties are likely more vulnerable to the virus than larger counties, based on their capacity to take on patients who need hospitalization. For a county such as Mitchell, GA, the curve of patients who need hospitalization is dramatically steep. This may be due to a variety of factors that include: individual behavior and touchpoints for disease spread, the county’s capacity to process (ie. beds available in both hospitals and ICUs). In Fulton Co, GA, there is a much incrased capacity to serve those who contract COVID-19–Atlanta, the biggest metro area, houses a variety of hospitals and provider services. This capacity increase, as well as the early stay-at-home order makes Fulton Co more likely to not experience an exponential surge in cases.

Future Work

We would like our estimates to be revised, as we applied a country-wide prevalence of the disease among people older than 60 years. This may not be true, depending on the county and the exposure of older people to touchpoints to the virus. This model can inform policymakers of the impact of reactionary policies vs preventative policies. Our model assumes that only people over 60 will take the hospital and ICU beds. In reality, COVID-19 impacts people of all ages, who may also need to be hospitalized. This reduces the capacity of many counties (such as Mitchell, GA) even further. This would lead to the cross between available beds and beds taken by COVID-19 patients to cross even earlier.