R Spatial Lab Assignment #1

Task 1: Set up the R Project

I created an R Project for the Week 7 spatial lab and organized the working directory with folders for data, scripts, and output. I placed the assignment scripts and template in the scripts folder and stored the required datasets in the appropriate subfolders under data.

Task 2: Read the NYC postal code shapefile into an sf object

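The ZIP code shapefile was read into R with st_read() as an sf object. It contains 263 polygon features and 12 attribute fields, and it uses the projected CRS NAD83 / New York Long Island (ftUS).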

Output:

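# load the packages used throughout the lab
library(sf)         # st_read(), st_as_sf()
library(tidyverse)  # read_csv(), %>%, filter(), mutate(), str_extract(), separate()
library(mapview)    # mapview()

# read NYC ZIP code shapefile
nyc_zip_sf <- st_read("data/zip_codes/ZIP_CODE_040114.shp")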
## Reading layer `ZIP_CODE_040114' from data source 
##   `C:\Users\danyu\OneDrive - Hunter - CUNY\week7\data\zip_codes\ZIP_CODE_040114.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 263 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 913129 ymin: 120020.9 xmax: 1067494 ymax: 272710.9
## Projected CRS: NAD83 / New York Long Island (ftUS)
# inspect the sf object
str(nyc_zip_sf)
## Classes 'sf' and 'data.frame':   263 obs. of  13 variables:
##  $ ZIPCODE   : chr  "11436" "11213" "11212" "11225" ...
##  $ BLDGZIP   : chr  "0" "0" "0" "0" ...
##  $ PO_NAME   : chr  "Jamaica" "Brooklyn" "Brooklyn" "Brooklyn" ...
##  $ POPULATION: num  18681 62426 83866 56527 72280 ...
##  $ AREA      : num  22699295 29631004 41972104 23698630 36868799 ...
##  $ STATE     : chr  "NY" "NY" "NY" "NY" ...
##  $ COUNTY    : chr  "Queens" "Kings" "Kings" "Kings" ...
##  $ ST_FIPS   : chr  "36" "36" "36" "36" ...
##  $ CTY_FIPS  : chr  "081" "047" "047" "047" ...
##  $ URL       : chr  "http://www.usps.com/" "http://www.usps.com/" "http://www.usps.com/" "http://www.usps.com/" ...
##  $ SHAPE_AREA: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ SHAPE_LEN : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ geometry  :sfc_POLYGON of length 263; first list element: List of 1
##   ..$ : num [1:159, 1:2] 1038098 1038142 1038171 1038280 1038521 ...
##   ..- attr(*, "class")= chr [1:3] "XY" "POLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:12] "ZIPCODE" "BLDGZIP" "PO_NAME" "POPULATION" ...
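
The ZIP code layer is in a projected CRS measured in US survey feet, while the point layers in Tasks 3 and 4 are created in EPSG:4326. The following is a minimal sketch of how the CRS could be checked and transformed before combining layers. It uses a toy square polygon (toy_zip_sf, a hypothetical stand-in for nyc_zip_sf) with made-up coordinates, and it assumes EPSG:2263 is the code that matches the CRS name printed above.

```r
library(sf)

# toy square in EPSG:2263 (assumed code for NAD83 / New York Long Island, ftUS);
# the coordinates are illustrative and do not come from the shapefile
toy_poly <- st_sfc(
  st_polygon(list(rbind(
    c(980000, 190000), c(990000, 190000), c(990000, 200000),
    c(980000, 200000), c(980000, 190000)
  ))),
  crs = 2263
)
toy_zip_sf <- st_sf(ZIPCODE = "00000", geometry = toy_poly)

# inspect the CRS, then reproject to WGS 84 longitude/latitude
st_crs(toy_zip_sf)
toy_zip_wgs84 <- st_transform(toy_zip_sf, 4326)

# confirm that the result is in geographic coordinates
st_is_longlat(toy_zip_wgs84)
toy_bbox <- st_bbox(toy_zip_wgs84)
toy_bbox
```

The same two calls, st_crs() and st_transform(), would apply to nyc_zip_sf if the layers need a common CRS for later spatial joins.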

Task 3: Read NYS health facilities data into an sf object

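The health facilities file was imported with read_csv(). Rows with missing coordinates were removed, and records outside a rough New York State bounding box (longitude -80 to -71, latitude 40 to 45) were filtered out. The remaining 3,843 records were converted into an sf point object in WGS 84 (EPSG:4326).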

Output:

# read the health facilities csv file
health_fac_raw <- read_csv(
  "data/health/NYS_Health_Facility.csv",
  show_col_types = FALSE,
  lazy = FALSE
)

# remove rows with missing coordinates and keep only reasonable NY coordinates
health_fac_raw <- health_fac_raw %>%
  filter(!is.na(`Facility Latitude`) & !is.na(`Facility Longitude`)) %>%
  filter(
    `Facility Longitude` >= -80, `Facility Longitude` <= -71,
    `Facility Latitude`  >= 40,  `Facility Latitude`  <= 45
  )

# convert the table to an sf object
health_fac_sf <- st_as_sf(
  health_fac_raw,
  coords = c("Facility Longitude", "Facility Latitude"),
  crs = 4326
)

# inspect the sf object
str(health_fac_sf)
## sf [3,843 × 35] (S3: sf/tbl_df/tbl/data.frame)
##  $ Facility ID                 : num [1:3843] 204 620 1156 2589 3455 ...
##  $ Facility Name               : chr [1:3843] "Hospice at Lourdes" "Charles T Sitrin Health Care Center Inc" "East Side Nursing Home" "Wellsville Manor Care Center" ...
##  $ Short Description           : chr [1:3843] "HSPC" "NH" "NH" "NH" ...
##  $ Description                 : chr [1:3843] "Hospice" "Residential Health Care Facility - SNF" "Residential Health Care Facility - SNF" "Residential Health Care Facility - SNF" ...
##  $ Facility Open Date          : chr [1:3843] "06/01/1985" "02/01/1989" "08/01/1979" "02/01/1989" ...
##  $ Facility Address 1          : chr [1:3843] "4102 Old Vestal Road" "2050 Tilden Avenue" "62 Prospect St" "4192A Bolivar Road" ...
##  $ Facility Address 2          : chr [1:3843] NA NA NA NA ...
##  $ Facility City               : chr [1:3843] "Vestal" "New Hartford" "Warsaw" "Wellsville" ...
##  $ Facility State              : chr [1:3843] "New York" "New York" "New York" "New York" ...
##  $ Facility Zip Code           : chr [1:3843] "13850" "13413" "14569" "14895" ...
##  $ Facility Phone Number       : num [1:3843] 6.08e+09 3.16e+09 5.86e+09 5.86e+09 7.17e+09 ...
##  $ Facility Fax Number         : num [1:3843] NA NA NA NA NA ...
##  $ Facility Website            : chr [1:3843] NA NA NA NA ...
##  $ Facility County Code        : num [1:3843] 3 32 60 2 14 ...
##  $ Facility County             : chr [1:3843] "Broome" "Oneida" "Wyoming" "Allegany" ...
##  $ Regional Office ID          : num [1:3843] 3 3 1 1 1 7 1 7 5 7 ...
##  $ Regional Office             : chr [1:3843] "Central New York Regional Office" "Central New York Regional Office" "Western Regional Office - Buffalo" "Western Regional Office - Buffalo" ...
##  $ Main Site Name              : chr [1:3843] NA NA NA NA ...
##  $ Main Site Facility ID       : num [1:3843] NA NA NA NA NA ...
##  $ Operating Certificate Number: chr [1:3843] "0301501F" "3227304N" "6027303N" "0228305N" ...
##  $ Operator Name               : chr [1:3843] "Our Lady of Lourdes Memorial Hospital Inc" "Charles T Sitrin Health Care Center, Inc" "East Side Nursing Home Inc" "Wellsville Manor LLC" ...
##  $ Operator Address 1          : chr [1:3843] "169 Riverside Drive" "Box 1000 Tilden Avenue" "62 Prospect Street" "4192a Bolivar Road" ...
##  $ Operator Address 2          : chr [1:3843] NA NA NA NA ...
##  $ Operator City               : chr [1:3843] "Binghamton" "New Hartford" "Warsaw" "Wellsville" ...
##  $ Operator State              : chr [1:3843] "New York" "New York" "New York" "New York" ...
##  $ Operator Zip Code           : chr [1:3843] "13905" "13413" "14569" "14897" ...
##  $ Cooperator Name             : chr [1:3843] NA NA NA NA ...
##  $ Cooperator Address          : chr [1:3843] NA NA NA NA ...
##  $ Cooperator Address 2        : chr [1:3843] NA NA NA NA ...
##  $ Cooperator City             : chr [1:3843] NA NA NA NA ...
##  $ Cooperator State            : chr [1:3843] "New York" "New York" "New York" "New York" ...
##  $ Cooperator Zip Code         : chr [1:3843] NA NA NA NA ...
##  $ Ownership Type              : chr [1:3843] "Not for Profit Corporation" "Not for Profit Corporation" "Business Corporation" "LLC" ...
##  $ Facility Location           : chr [1:3843] "(42.097095, -75.975243)" "(43.05497, -75.228828)" "(42.738979, -78.12867)" "(42.126461, -77.967834)" ...
##  $ geometry                    :sfc_POINT of length 3843; first list element:  'XY' num [1:2] -76 42.1
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:34] "Facility ID" "Facility Name" "Short Description" "Description" ...
nrow(health_fac_sf)
## [1] 3843
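
To illustrate what the two filter steps remove, here is a small sketch on a toy table. The table toy_fac is hypothetical: only its first row uses coordinates that appear in the output above, and the other rows are made up to show a missing value and a (0, 0) placeholder. It uses between() as a compact equivalent of the paired >= and <= comparisons.

```r
library(dplyr)
library(sf)

# toy table: one valid upstate record, one missing latitude,
# one (0, 0) placeholder, one valid NYC-area record
toy_fac <- tibble(
  `Facility ID`        = 1:4,
  `Facility Latitude`  = c(42.097095, NA, 0, 40.75),
  `Facility Longitude` = c(-75.975243, -73.9, 0, -73.99)
)

toy_clean <- toy_fac %>%
  # drop rows with a missing coordinate
  filter(!is.na(`Facility Latitude`), !is.na(`Facility Longitude`)) %>%
  # keep only rows inside the rough New York State bounding box
  filter(
    between(`Facility Longitude`, -80, -71),
    between(`Facility Latitude`, 40, 45)
  )

# convert the cleaned table to an sf point object in WGS 84
toy_fac_sf <- st_as_sf(
  toy_clean,
  coords = c("Facility Longitude", "Facility Latitude"),
  crs = 4326
)

nrow(toy_fac_sf)
```

In this toy case, rows 2 and 3 are dropped and rows 1 and 4 are kept.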

Task 4: Read NYS retail food stores data and convert NYC records into an sf object

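The retail food store file has no separate coordinate columns, so latitude and longitude were extracted from the parenthesized pair at the end of the Location field. Records without coordinates were dropped, and only records from the five NYC counties (Bronx, Kings, New York, Queens, Richmond) were kept. The remaining 11,301 records were converted into an sf point object in WGS 84 (EPSG:4326).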

Output:

# read the retail food stores csv file
retail_food_raw <- read_csv(
  "data/retail_food/NYS_Retail_Food_Stores.csv",
  show_col_types = FALSE,
  lazy = FALSE
)

# extract latitude and longitude from the Location field
retail_food_sf <- retail_food_raw %>%
  mutate(
    coords_text = str_extract(Location, "\\([-0-9\\.]+,\\s*[-0-9\\.]+\\)$"),
    coords_text = str_remove_all(coords_text, "[()]")
  ) %>%
  separate(coords_text, into = c("Latitude", "Longitude"), sep = ",\\s*", remove = FALSE) %>%
  mutate(
    Latitude = as.numeric(Latitude),
    Longitude = as.numeric(Longitude)
  ) %>%
  filter(!is.na(Longitude) & !is.na(Latitude)) %>%
  # keep only NYC counties
  filter(County %in% c("Bronx", "Kings", "New York", "Queens", "Richmond")) %>%
  # convert to sf object
  st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)

# inspect the sf object
str(retail_food_sf)
## sf [11,301 × 17] (S3: sf/tbl_df/tbl/data.frame)
##  $ County            : chr [1:11301] "Bronx" "Bronx" "Bronx" "Bronx" ...
##  $ License Number    : chr [1:11301] "734149" "606221" "606228" "723375" ...
##  $ Operation Type    : chr [1:11301] "Store" "Store" "Store" "Store" ...
##  $ Establishment Type: chr [1:11301] "JAC" "JAC" "JAC" "JAC" ...
##  $ Entity Name       : chr [1:11301] "7 ELEVEN FOOD STORE #37933H" "1001 SAN MIGUEL FOOD CENTER INC" "1029 FOOD PLAZA INC" "1078 DELI GROCERY CORP" ...
##  $ DBA Name          : chr [1:11301] NA "1001 SAN MIGUEL FD CNTR" "1029 FOOD PLAZA" "1078 DELI GROCERY" ...
##  $ Street Number     : chr [1:11301] "500" "1001" "122" "1078" ...
##  $ Street Name       : chr [1:11301] "BAYCHESTER AVE" "SHERIDAN AVE" "E 181ST ST" "EAST 165TH STREET" ...
##  $ Address Line 2    : logi [1:11301] NA NA NA NA NA NA ...
##  $ Address Line 3    : logi [1:11301] NA NA NA NA NA NA ...
##  $ City              : chr [1:11301] "BRONX" "BRONX" "BRONX" "BRONX" ...
##  $ State             : chr [1:11301] "NY" "NY" "NY" "NY" ...
##  $ Zip Code          : num [1:11301] 10475 10456 10453 10459 10456 ...
##  $ Square Footage    : num [1:11301] 0 1100 2000 1200 1500 2400 1000 1200 3400 500 ...
##  $ Location          : chr [1:11301] "500 BAYCHESTER AVE\nBRONX, NY 10475\n(40.869156, -73.831875)" "1001 SHERIDAN AVE\nBRONX, NY 10456\n(40.829061, -73.919613)" "122 E 181ST ST\nBRONX, NY 10453\n(40.854755, -73.902853)" "1078 EAST 165TH STREET\nBRONX, NY 10459\n(40.825105, -73.890589)" ...
##  $ coords_text       : chr [1:11301] "40.869156, -73.831875" "40.829061, -73.919613" "40.854755, -73.902853" "40.825105, -73.890589" ...
##  $ geometry          :sfc_POINT of length 11301; first list element:  'XY' num [1:2] -73.8 40.9
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:16] "County" "License Number" "Operation Type" "Establishment Type" ...
# count the number of NYC retail food records
nrow(retail_food_sf)
## [1] 11301
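
The extraction step can be checked on a couple of strings before running it on the full file. The sketch below applies the same regular expression to a toy table (toy_food, a hypothetical name). The first Location value is copied from the output above, and the second is a made-up record with no coordinate pair, included to show that it yields NA.

```r
library(dplyr)
library(stringr)
library(tidyr)

toy_food <- tibble(
  Location = c(
    "500 BAYCHESTER AVE\nBRONX, NY 10475\n(40.869156, -73.831875)",
    "1 MAIN ST\nALBANY, NY 12207"  # hypothetical record without coordinates
  )
)

toy_coords <- toy_food %>%
  mutate(
    # grab the "(lat, lon)" pair at the end of the string, then strip the parentheses
    coords_text = str_extract(Location, "\\([-0-9\\.]+,\\s*[-0-9\\.]+\\)$"),
    coords_text = str_remove_all(coords_text, "[()]")
  ) %>%
  # split the pair into two columns on the comma
  separate(coords_text, into = c("Latitude", "Longitude"), sep = ",\\s*", remove = FALSE) %>%
  mutate(across(c(Latitude, Longitude), as.numeric))

toy_coords %>% select(Latitude, Longitude)
```

Rows where the pattern does not match carry NA in both columns, which is why the pipeline above filters on !is.na() before calling st_as_sf().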

Task 5: Verify the spatial objects with quick maps

I used mapview() to verify that the spatial layers were displayed in the expected locations.

Map 1: NYC ZIP code polygons

# map the ZIP code polygons
mapview(nyc_zip_sf)

Map 2: NYS health facilities points

# map the health facilities points
mapview(health_fac_sf)

Map 3: NYC retail food store points

# map the NYC retail food store points
mapview(retail_food_sf)
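
Because the interactive maps do not appear in a static report, a numeric check can complement them. The following is a minimal sketch that compares the bounding box of a point layer against a rough NYC extent. The toy points reuse two coordinate pairs from the retail food output above, and the extent limits are an assumption chosen for illustration, not an official boundary.

```r
library(sf)

# toy point layer built from two coordinate pairs shown in the Task 4 output
toy_pts <- st_as_sf(
  data.frame(
    id  = 1:2,
    lon = c(-73.831875, -73.919613),
    lat = c(40.869156, 40.829061)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

bb <- st_bbox(toy_pts)

# rough NYC extent in degrees (assumed limits for illustration)
in_nyc <- bb[["xmin"]] >= -74.3 && bb[["xmax"]] <= -73.6 &&
  bb[["ymin"]] >= 40.4 && bb[["ymax"]] <= 41.0

in_nyc
```

Running the same comparison on retail_food_sf would flag any record whose extracted coordinates fall far outside the city.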

Task 6: Save the three sf objects

The three final spatial objects were saved into an .RData file.

Output:

# save the final sf objects
save(
  nyc_zip_sf,
  health_fac_sf,
  retail_food_sf,
  file = "output/week7_spatial_objects.RData"
)
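
One way to confirm that a saved file contains the expected objects is to load it into a fresh environment and compare. This is a small sketch with toy objects and a temporary file (toy_a, toy_b, tmp_file, and check_env are hypothetical names), not the actual lab output file.

```r
# toy objects standing in for the three sf objects
toy_a <- data.frame(id = 1:3)
toy_b <- c("x", "y")

# save to a temporary .RData file
tmp_file <- tempfile(fileext = ".RData")
save(toy_a, toy_b, file = tmp_file)

# load into a separate environment so existing objects are not overwritten;
# load() returns the names of the restored objects
check_env <- new.env()
restored <- load(tmp_file, envir = check_env)

restored
identical(check_env$toy_a, toy_a)
```

Loading into a separate environment keeps the check from silently replacing objects that are already in the workspace.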
