Explanation of the template

Update the title with your information. For example, it could be “GTECH_78520_2023S_Week12_JohnDoe25” for John Doe, 2022 Spring semester, week 12.

Also update the author name and date accordingly.

R Spaital Lab Assignment # 1

Don’t use a single chunk for the entire assignment. Break it into multiple chunks.

task 1:

1.Set up a R project for the R-spatial section.

Done

  1. Read the NYC postal areas in Shapefiles into sf objects. As NYC DOH publishes COVID-19 data by zip code, we will utlize the potal area data later.
#read in the shapefile 
zip_sf <- st_read("R-Spatial_I_lab/ZIP_CODE_040114/ZIP_CODE_040114.shp")
## Reading layer `ZIP_CODE_040114' from data source 
##   `/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/R-Spatial_I_Lab/ZIP_CODE_040114/ZIP_CODE_040114.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 263 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 913129 ymin: 120020.9 xmax: 1067494 ymax: 272710.9
## Projected CRS: NAD83 / New York Long Island (ftUS)

3.Read and process the NYS health facilities spreadsheet data. Create sf objects from geographic coordinates.

NYS_Health_Facility <- read_csv("R-Spatial_I_Lab/NYS_Health_Facility.csv",
                                show_col_types = FALSE)
str(NYS_Health_Facility)
## spc_tbl_ [3,990 × 36] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Facility ID                 : num [1:3990] 204 620 654 1156 2589 ...
##  $ Facility Name               : chr [1:3990] "Hospice at Lourdes" "Charles T Sitrin Health Care Center Inc" "Central Park Rehabilitation and Nursing Center" "East Side Nursing Home" ...
##  $ Short Description           : chr [1:3990] "HSPC" "NH" "NH" "NH" ...
##  $ Description                 : chr [1:3990] "Hospice" "Residential Health Care Facility - SNF" "Residential Health Care Facility - SNF" "Residential Health Care Facility - SNF" ...
##  $ Facility Open Date          : chr [1:3990] "06/01/1985" "02/01/1989" "02/01/1989" "08/01/1979" ...
##  $ Facility Address 1          : chr [1:3990] "4102 Old Vestal Road" "2050 Tilden Avenue" "116 Martin Luther King East" "62 Prospect St" ...
##  $ Facility Address 2          : chr [1:3990] NA NA NA NA ...
##  $ Facility City               : chr [1:3990] "Vestal" "New Hartford" "Syracuse" "Warsaw" ...
##  $ Facility State              : chr [1:3990] "New York" "New York" "New York" "New York" ...
##  $ Facility Zip Code           : chr [1:3990] "13850" "13413" "13205" "14569" ...
##  $ Facility Phone Number       : num [1:3990] 6.08e+09 3.16e+09 3.15e+09 5.86e+09 5.86e+09 ...
##  $ Facility Fax Number         : num [1:3990] NA NA NA NA NA ...
##  $ Facility Website            : chr [1:3990] NA NA NA NA ...
##  $ Facility County Code        : num [1:3990] 3 32 33 60 2 ...
##  $ Facility County             : chr [1:3990] "Broome" "Oneida" "Onondaga" "Wyoming" ...
##  $ Regional Office ID          : num [1:3990] 3 3 3 1 1 1 7 1 7 5 ...
##  $ Regional Office             : chr [1:3990] "Central New York Regional Office" "Central New York Regional Office" "Central New York Regional Office" "Western Regional Office - Buffalo" ...
##  $ Main Site Name              : chr [1:3990] NA NA NA NA ...
##  $ Main Site Facility ID       : num [1:3990] NA NA NA NA NA ...
##  $ Operating Certificate Number: chr [1:3990] "0301501F" "3227304N" "3301326N" "6027303N" ...
##  $ Operator Name               : chr [1:3990] "Our Lady of Lourdes Memorial Hospital Inc" "Charles T Sitrin Health Care Center, Inc" "CPRNC, LLC" "East Side Nursing Home Inc" ...
##  $ Operator Address 1          : chr [1:3990] "169 Riverside Drive" "Box 1000 Tilden Avenue" "116 Martin Luther King East" "62 Prospect Street" ...
##  $ Operator Address 2          : chr [1:3990] NA NA NA NA ...
##  $ Operator City               : chr [1:3990] "Binghamton" "New Hartford" "Syracuse" "Warsaw" ...
##  $ Operator State              : chr [1:3990] "New York" "New York" "New York" "New York" ...
##  $ Operator Zip Code           : chr [1:3990] "13905" "13413" "13205" "14569" ...
##  $ Cooperator Name             : chr [1:3990] NA NA NA NA ...
##  $ Cooperator Address          : chr [1:3990] NA NA NA NA ...
##  $ Cooperator Address 2        : chr [1:3990] NA NA NA NA ...
##  $ Cooperator City             : chr [1:3990] NA NA NA NA ...
##  $ Cooperator State            : chr [1:3990] "New York" "New York" "New York" "New York" ...
##  $ Cooperator Zip Code         : chr [1:3990] NA NA NA NA ...
##  $ Ownership Type              : chr [1:3990] "Not for Profit Corporation" "Not for Profit Corporation" "LLC" "Business Corporation" ...
##  $ Facility Latitude           : num [1:3990] 42.1 43.1 NA 42.7 42.1 ...
##  $ Facility Longitude          : num [1:3990] -76 -75.2 NA -78.1 -78 ...
##  $ Facility Location           : chr [1:3990] "(42.097095, -75.975243)" "(43.05497, -75.228828)" NA "(42.738979, -78.12867)" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   `Facility ID` = col_double(),
##   ..   `Facility Name` = col_character(),
##   ..   `Short Description` = col_character(),
##   ..   Description = col_character(),
##   ..   `Facility Open Date` = col_character(),
##   ..   `Facility Address 1` = col_character(),
##   ..   `Facility Address 2` = col_character(),
##   ..   `Facility City` = col_character(),
##   ..   `Facility State` = col_character(),
##   ..   `Facility Zip Code` = col_character(),
##   ..   `Facility Phone Number` = col_double(),
##   ..   `Facility Fax Number` = col_double(),
##   ..   `Facility Website` = col_character(),
##   ..   `Facility County Code` = col_double(),
##   ..   `Facility County` = col_character(),
##   ..   `Regional Office ID` = col_double(),
##   ..   `Regional Office` = col_character(),
##   ..   `Main Site Name` = col_character(),
##   ..   `Main Site Facility ID` = col_double(),
##   ..   `Operating Certificate Number` = col_character(),
##   ..   `Operator Name` = col_character(),
##   ..   `Operator Address 1` = col_character(),
##   ..   `Operator Address 2` = col_character(),
##   ..   `Operator City` = col_character(),
##   ..   `Operator State` = col_character(),
##   ..   `Operator Zip Code` = col_character(),
##   ..   `Cooperator Name` = col_character(),
##   ..   `Cooperator Address` = col_character(),
##   ..   `Cooperator Address 2` = col_character(),
##   ..   `Cooperator City` = col_character(),
##   ..   `Cooperator State` = col_character(),
##   ..   `Cooperator Zip Code` = col_character(),
##   ..   `Ownership Type` = col_character(),
##   ..   `Facility Latitude` = col_double(),
##   ..   `Facility Longitude` = col_double(),
##   ..   `Facility Location` = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>

Create sf objects based on geography

# Remove rows with NA in either "Facility Longitude" or "Facility Latitude"
NYS_Health_Facility_clean <- NYS_Health_Facility[!is.na(NYS_Health_Facility$`Facility Longitude`) & !is.na(NYS_Health_Facility$`Facility Latitude`), ]

Now attempt to create sf

NYS_Health_Facility_sf <- st_as_sf(NYS_Health_Facility_clean,
                                   coords = c("Facility Longitude", "Facility Latitude"))

st_crs(NYS_Health_Facility_sf) <- 4326
                                
str(NYS_Health_Facility_sf)
## sf [3,848 × 35] (S3: sf/tbl_df/tbl/data.frame)
##  $ Facility ID                 : num [1:3848] 204 620 1156 2589 3455 ...
##  $ Facility Name               : chr [1:3848] "Hospice at Lourdes" "Charles T Sitrin Health Care Center Inc" "East Side Nursing Home" "Wellsville Manor Care Center" ...
##  $ Short Description           : chr [1:3848] "HSPC" "NH" "NH" "NH" ...
##  $ Description                 : chr [1:3848] "Hospice" "Residential Health Care Facility - SNF" "Residential Health Care Facility - SNF" "Residential Health Care Facility - SNF" ...
##  $ Facility Open Date          : chr [1:3848] "06/01/1985" "02/01/1989" "08/01/1979" "02/01/1989" ...
##  $ Facility Address 1          : chr [1:3848] "4102 Old Vestal Road" "2050 Tilden Avenue" "62 Prospect St" "4192A Bolivar Road" ...
##  $ Facility Address 2          : chr [1:3848] NA NA NA NA ...
##  $ Facility City               : chr [1:3848] "Vestal" "New Hartford" "Warsaw" "Wellsville" ...
##  $ Facility State              : chr [1:3848] "New York" "New York" "New York" "New York" ...
##  $ Facility Zip Code           : chr [1:3848] "13850" "13413" "14569" "14895" ...
##  $ Facility Phone Number       : num [1:3848] 6.08e+09 3.16e+09 5.86e+09 5.86e+09 7.17e+09 ...
##  $ Facility Fax Number         : num [1:3848] NA NA NA NA NA ...
##  $ Facility Website            : chr [1:3848] NA NA NA NA ...
##  $ Facility County Code        : num [1:3848] 3 32 60 2 14 ...
##  $ Facility County             : chr [1:3848] "Broome" "Oneida" "Wyoming" "Allegany" ...
##  $ Regional Office ID          : num [1:3848] 3 3 1 1 1 7 1 7 5 7 ...
##  $ Regional Office             : chr [1:3848] "Central New York Regional Office" "Central New York Regional Office" "Western Regional Office - Buffalo" "Western Regional Office - Buffalo" ...
##  $ Main Site Name              : chr [1:3848] NA NA NA NA ...
##  $ Main Site Facility ID       : num [1:3848] NA NA NA NA NA ...
##  $ Operating Certificate Number: chr [1:3848] "0301501F" "3227304N" "6027303N" "0228305N" ...
##  $ Operator Name               : chr [1:3848] "Our Lady of Lourdes Memorial Hospital Inc" "Charles T Sitrin Health Care Center, Inc" "East Side Nursing Home Inc" "Wellsville Manor LLC" ...
##  $ Operator Address 1          : chr [1:3848] "169 Riverside Drive" "Box 1000 Tilden Avenue" "62 Prospect Street" "4192a Bolivar Road" ...
##  $ Operator Address 2          : chr [1:3848] NA NA NA NA ...
##  $ Operator City               : chr [1:3848] "Binghamton" "New Hartford" "Warsaw" "Wellsville" ...
##  $ Operator State              : chr [1:3848] "New York" "New York" "New York" "New York" ...
##  $ Operator Zip Code           : chr [1:3848] "13905" "13413" "14569" "14897" ...
##  $ Cooperator Name             : chr [1:3848] NA NA NA NA ...
##  $ Cooperator Address          : chr [1:3848] NA NA NA NA ...
##  $ Cooperator Address 2        : chr [1:3848] NA NA NA NA ...
##  $ Cooperator City             : chr [1:3848] NA NA NA NA ...
##  $ Cooperator State            : chr [1:3848] "New York" "New York" "New York" "New York" ...
##  $ Cooperator Zip Code         : chr [1:3848] NA NA NA NA ...
##  $ Ownership Type              : chr [1:3848] "Not for Profit Corporation" "Not for Profit Corporation" "Business Corporation" "LLC" ...
##  $ Facility Location           : chr [1:3848] "(42.097095, -75.975243)" "(43.05497, -75.228828)" "(42.738979, -78.12867)" "(42.126461, -77.967834)" ...
##  $ geometry                    :sfc_POINT of length 3848; first list element:  'XY' num [1:2] -76 42.1
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:34] "Facility ID" "Facility Name" "Short Description" "Description" ...
  1. Read and process the NYS retail food stores daa. Create sf objects from geographic coordinates for NYC.
NYS_Retail_Food_Stores <- read_csv("R-Spatial_I_Lab/NYS_Retail_Food_Stores.csv",
                                show_col_types = FALSE)

#Extract NYC zipcoodes
nyc_zip_codes <- as.list(as.character(zip_sf$ZIPCODE))

# Now apply the filter
NYC_Retail_Food_Stores <- NYS_Retail_Food_Stores %>%
  filter(`Zip Code` %in% nyc_zip_codes)

Extract coordinates

NYC_Retail_Food_Stores %>%
  tidyr::separate(Location, c(NA, "Both", NA), sep = "(\\(|\\))") %>%
  tidyr::separate(Both, c("Lat", "Long"), sep = ",") %>%
  dplyr::mutate(Lat = as.numeric(Lat), Long = as.numeric(Long)) %>%
  tidyr::drop_na(Lat, Long) %>%
  sf::st_as_sf(coords = c('Long', 'Lat')) %>%
  st_set_crs(4326) -> NYC_Retail_Food_Stores_sf
## Warning: Expected 3 pieces. Additional pieces discarded in 1 rows [9525].
## Warning: Expected 3 pieces. Missing pieces filled with `NA` in 2899 rows [31, 43, 46,
## 356, 750, 959, 985, 1005, 1113, 1125, 1174, 1186, 1201, 1206, 1243, 1244, 1254,
## 1296, 1310, 1408, ...].
## Warning: There were 2 warnings in `dplyr::mutate()`.
## The first warning was:
## ℹ In argument: `Lat = as.numeric(Lat)`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
str(NYC_Retail_Food_Stores_sf)
## sf [11,372 × 15] (S3: sf/tbl_df/tbl/data.frame)
##  $ County            : chr [1:11372] "Bronx" "Bronx" "Bronx" "Bronx" ...
##  $ License Number    : chr [1:11372] "734149" "606221" "606228" "723375" ...
##  $ Operation Type    : chr [1:11372] "Store" "Store" "Store" "Store" ...
##  $ Establishment Type: chr [1:11372] "JAC" "JAC" "JAC" "JAC" ...
##  $ Entity Name       : chr [1:11372] "7 ELEVEN FOOD STORE #37933H" "1001 SAN MIGUEL FOOD CENTER INC" "1029 FOOD PLAZA INC" "1078 DELI GROCERY CORP" ...
##  $ DBA Name          : chr [1:11372] NA "1001 SAN MIGUEL FD CNTR" "1029 FOOD PLAZA" "1078 DELI GROCERY" ...
##  $ Street Number     : chr [1:11372] "500" "1001" "122" "1078" ...
##  $ Street Name       : chr [1:11372] "BAYCHESTER AVE" "SHERIDAN AVE" "E 181ST ST" "EAST 165TH STREET" ...
##  $ Address Line 2    : logi [1:11372] NA NA NA NA NA NA ...
##  $ Address Line 3    : logi [1:11372] NA NA NA NA NA NA ...
##  $ City              : chr [1:11372] "BRONX" "BRONX" "BRONX" "BRONX" ...
##  $ State             : chr [1:11372] "NY" "NY" "NY" "NY" ...
##  $ Zip Code          : num [1:11372] 10475 10456 10453 10459 10456 ...
##  $ Square Footage    : num [1:11372] 0 1100 2000 1200 1500 2400 1000 1200 3400 500 ...
##  $ geometry          :sfc_POINT of length 11372; first list element:  'XY' num [1:2] -73.8 40.9
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:14] "County" "License Number" "Operation Type" "Establishment Type" ...
  1. Use simple mapping method, either based on ggmap+ggplot or mapview, with a basemap to verify the above datasets in terms of their geometry locations.

Postal areas

plot(st_geometry(zip_sf), main = 'NYC ZIP CODES')

Health Facilities

ggplot(data = NYS_Health_Facility_sf) +
  geom_sf(aes(color=`Short Description`)) +
  theme_minimal() +
  coord_sf(xlim = c(-79.8, -71.8), ylim = c(40.5,45.0)) +
  labs(title = "Map of New York State Health Facilities") +
  xlab("Longitude") +
  ylab("Latitude") + 
  theme(legend.text = element_text(size = 5),
        legend.key.size = unit(0.5, "cm"),
        legend.spacing.y = unit(0.1, "cm"))

Retail Food Stores

ggplot(data = NYC_Retail_Food_Stores_sf) +
  geom_sf(aes(color=County)) +
  theme_minimal() +
  coord_sf(xlim = c(-74.3, -73.7), ylim = c(40.9,40.4)) +
  labs(title = "Map of New York City Retail Food Stores") +
  xlab("Longitude") +
  ylab("Latitude") + 
  theme(legend.text = element_text(size = 5),
        legend.key.size = unit(0.5, "cm"),
        legend.spacing.y = unit(0.1, "cm"))

I was unsure how to add a baasemap in ggplot, so to verify geometry locations I will switch to mapview.

mapview(zip_sf,
        layer.name = 'NYC Zip Codes') + 
  mapview(NYS_Health_Facility_sf,
          layer.name = 'Health Facilities',
          homebutton = FALSE,
          legend = TRUE,
          color = 'cyan',
          cex = 2) + 
  mapview(NYC_Retail_Food_Stores_sf,
          layer.name = 'Retail Food Stores',
          homebutton= FALSE,
          legend = TRUE,
          color = 'pink',
          cex = 2)

Save layers into a Geopackage

st_write(NYS_Health_Facility_sf,
         dsn = '/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/data/lab1.gpkg',
         layer = 'NYS_Health_Facility',
         delete_layer = TRUE) %>%
  st_write(NYC_Retail_Food_Stores_sf,
         dsn = '/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/data/lab1.gpkg',
         layer = 'NYC_Retail_Food_Stores',
         delete_layer = TRUE) %>%
  st_write(zip_sf,
         dsn = '/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/data/lab1.gpkg',
         layer = 'NYC_ZIP_Codes',
         delete_layer = TRUE)
## Deleting layer `NYS_Health_Facility' using driver `GPKG'
## Writing layer `NYS_Health_Facility' to data source 
##   `/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/data/lab1.gpkg' using driver `GPKG'
## Writing 3848 features with 34 fields and geometry type Point.
## Deleting layer `NYC_Retail_Food_Stores' failed
## Writing layer `NYC_Retail_Food_Stores' to data source 
##   `/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/data/lab1.gpkg' using driver `GPKG'
## Writing 3848 features with 34 fields and geometry type Point.
## Deleting layer `NYC_ZIP_Codes' using driver `GPKG'
## Writing layer `NYC_ZIP_Codes' to data source 
##   `/Users/jamesreo/Desktop/R Independent Study/Weeks 7-9/R-spatial/data/lab1.gpkg' using driver `GPKG'
## Writing 3848 features with 34 fields and geometry type Point.