This assignment introduces basic spatial data processing and visualization in R. The goal is to work with several spatial datasets related to New York City and convert them into spatial objects that can be mapped and analyzed.

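Three datasets are used in this lab: the NYC ZIP code boundary shapefile (ZIP_CODE_040114.shp), the NYS health facilities table (NYS_Health_Facility.csv), and the NYS retail food store table (nys_retail_food_store_xy.csv).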

Task 1 – R Project Setup

For this assignment I created an R Project for the R-Spatial section. Using an R Project keeps all scripts and datasets organized in one folder and ensures the working directory is set correctly when running the code.
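
One quick way to confirm that the project's working directory is set as expected is to check that the input files resolve from it. The following is a minimal sketch rather than part of the original workflow: the helper name `missing_inputs` is hypothetical, and the paths are the relative paths used in the later tasks.

```r
# Hypothetical helper: return the subset of paths that do not exist
# relative to the current working directory (the project root when an
# R Project is open).
missing_inputs <- function(paths) {
  paths[!file.exists(paths)]
}

# Relative paths used in the tasks below
lab_inputs <- c(
  "R-Spatial_I_Lab/ZIP_CODE_040114/ZIP_CODE_040114.shp",
  "R-Spatial_I_Lab/NYS_Health_Facility.csv",
  "R-Spatial_I_Lab/nys_retail_food_store_xy.csv"
)

getwd()                     # should print the project folder
missing_inputs(lab_inputs)  # character(0) when every file is found
```

An empty result suggests the scripts can be run from the project without calling setwd().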

Task 2 – Read NYC Postal Areas (ZIP Code Shapefile)

The NYC ZIP code boundary shapefile is read into R using the st_read() function from the sf package. This converts the shapefile into a spatial sf object, which allows the geographic features to be mapped and analyzed.

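# Packages used in this lab
library(sf)       # st_read(), st_transform(), st_as_sf(), st_filter()
library(readr)    # read_csv(), locale()
library(dplyr)    # filter(), %>%
library(mapview)  # mapview()

zip_sf <- st_read("R-Spatial_I_Lab/ZIP_CODE_040114/ZIP_CODE_040114.shp")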
## Reading layer `ZIP_CODE_040114' from data source 
##   `/Users/amallali/Desktop/GTECH 78520/Section_07/R-Spatial_I_Lab/ZIP_CODE_040114/ZIP_CODE_040114.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 263 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 913129 ymin: 120020.9 xmax: 1067494 ymax: 272710.9
## Projected CRS: NAD83 / New York Long Island (ftUS)
# Convert projection to WGS84 so it matches the other datasets
zip_sf <- st_transform(zip_sf, 4326)

zip_sf
## Simple feature collection with 263 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -74.25576 ymin: 40.49584 xmax: -73.6996 ymax: 40.91517
## Geodetic CRS:  WGS 84
## First 10 features:
##    ZIPCODE BLDGZIP  PO_NAME POPULATION     AREA STATE COUNTY ST_FIPS CTY_FIPS
## 1    11436       0  Jamaica      18681 22699295    NY Queens      36      081
## 2    11213       0 Brooklyn      62426 29631004    NY  Kings      36      047
## 3    11212       0 Brooklyn      83866 41972104    NY  Kings      36      047
## 4    11225       0 Brooklyn      56527 23698630    NY  Kings      36      047
## 5    11218       0 Brooklyn      72280 36868799    NY  Kings      36      047
## 6    11226       0 Brooklyn     106132 39408598    NY  Kings      36      047
## 7    11219       0 Brooklyn      92561 42002738    NY  Kings      36      047
## 8    11210       0 Brooklyn      67067 47887023    NY  Kings      36      047
## 9    11230       0 Brooklyn      80857 49926703    NY  Kings      36      047
## 10   11204       0 Brooklyn      77354 43555185    NY  Kings      36      047
##                     URL SHAPE_AREA SHAPE_LEN                       geometry
## 1  http://www.usps.com/          0         0 POLYGON ((-73.80585 40.6829...
## 2  http://www.usps.com/          0         0 POLYGON ((-73.9374 40.67973...
## 3  http://www.usps.com/          0         0 POLYGON ((-73.90294 40.6708...
## 4  http://www.usps.com/          0         0 POLYGON ((-73.95797 40.6706...
## 5  http://www.usps.com/          0         0 POLYGON ((-73.97208 40.6506...
## 6  http://www.usps.com/          0         0 POLYGON ((-73.9619 40.65487...
## 7  http://www.usps.com/          0         0 POLYGON ((-73.98906 40.6441...
## 8  http://www.usps.com/          0         0 POLYGON ((-73.9584 40.63633...
## 9  http://www.usps.com/          0         0 POLYGON ((-73.96451 40.6366...
## 10 http://www.usps.com/          0         0 POLYGON ((-73.98108 40.6352...
plot(st_geometry(zip_sf))
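
The reprojection step can also be checked programmatically rather than by reading the printed header. The sketch below assumes that the source CRS reported above, NAD83 / New York Long Island (ftUS), corresponds to EPSG:2263; the single point and its coordinates are illustrative and are not taken from the lab data.

```r
library(sf)

# Toy point in NAD83 / New York Long Island (ftUS), assumed to be EPSG:2263
# (illustrative coordinates inside the bounding box printed above)
pt_2263 <- st_sfc(st_point(c(990000, 200000)), crs = 2263)

# Reproject to WGS84 longitude/latitude, as done for zip_sf
pt_4326 <- st_transform(pt_2263, 4326)

st_crs(pt_2263)$epsg    # EPSG code before the transform
st_crs(pt_4326)$epsg    # EPSG code after the transform
st_is_longlat(pt_4326)  # TRUE for geographic (lon/lat) coordinates
```

The same two calls, `st_crs(zip_sf)$epsg` and `st_is_longlat(zip_sf)`, could be run on the real object before combining it with the point layers.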

Task 3 – Process NYS Health Facilities Dataset

The NYS health facilities dataset contains geographic coordinates for each facility. These coordinates can be converted into spatial points so that the facility locations can be visualized on a map.

Before creating the spatial object, rows with missing coordinates were removed, because st_as_sf() cannot build point geometries from NA values. After conversion, st_filter() was used to keep only the points that intersect the NYC ZIP code polygons (its default predicate is st_intersects()), so that every mapped facility lies within the study area.

health_df <- read_csv(
  "R-Spatial_I_Lab/NYS_Health_Facility.csv",
  show_col_types = FALSE
)

# Remove rows that have missing coordinates
health_df <- health_df %>%
  filter(!is.na(`Facility Longitude`), !is.na(`Facility Latitude`))

# Convert to spatial object
health_sf <- st_as_sf(
  health_df,
  coords = c("Facility Longitude", "Facility Latitude"),
  crs = 4326
)

# Keep only facilities located within NYC ZIP code boundaries
health_sf <- st_filter(health_sf, zip_sf)

health_sf
## Simple feature collection with 1293 features and 34 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -74.19681 ymin: 40.51677 xmax: -73.70332 ymax: 40.91062
## Geodetic CRS:  WGS 84
## # A tibble: 1,293 × 35
##    `Facility ID` `Facility Name`                 `Short Description` Description
##  *         <dbl> <chr>                           <chr>               <chr>      
##  1          6230 NYU Langone Rutherford          HOSP-EC             Hospital E…
##  2          7257 Park Ridge Family Health Center HOSP-EC             Hospital E…
##  3          9006 FedCare, Inc.                   DTC                 Diagnostic…
##  4          9970 Parkmed NYC, LLC                DTC                 Diagnostic…
##  5          1217 St Patricks Home                NH                  Residentia…
##  6          1820 Prime Home Health Services, LLC CHHA                Certified …
##  7          4326 Segundo Ruiz Belvis D & T Cent… DTC                 Diagnostic…
##  8          5574 Fort Greene District Health Ce… DTC-EC              Diagnostic…
##  9          7579 Queens Dialysis Center          DTC-EC              Diagnostic…
## 10          1653 Corona Health Center            DTC-EC              Diagnostic…
## # ℹ 1,283 more rows
## # ℹ 31 more variables: `Facility Open Date` <chr>, `Facility Address 1` <chr>,
## #   `Facility Address 2` <chr>, `Facility City` <chr>, `Facility State` <chr>,
## #   `Facility Zip Code` <chr>, `Facility Phone Number` <dbl>,
## #   `Facility Fax Number` <dbl>, `Facility Website` <chr>,
## #   `Facility County Code` <dbl>, `Facility County` <chr>,
## #   `Regional Office ID` <dbl>, `Regional Office` <chr>, …
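
The clean, convert, and filter sequence used above can be illustrated on a small toy dataset. This is a sketch under stated assumptions: the square polygon, the point table, and the names `area_sf`, `pts_df`, and `pts_sf` are all made up for the example and stand in for zip_sf and the facility table.

```r
library(sf)

# Toy study area: one square polygon in WGS84 (illustrative coordinates)
square <- st_polygon(list(rbind(
  c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)
)))
area_sf <- st_sf(id = 1, geometry = st_sfc(square, crs = 4326))

# Toy point table: one point outside the square, one row without a longitude
pts_df <- data.frame(
  name = c("inside_a", "outside", "inside_b", "no_coords"),
  lon  = c(0.5, 2.0, 0.2, NA),
  lat  = c(0.5, 2.0, 0.8, 0.5)
)

# Drop incomplete rows first: st_as_sf() rejects missing coordinates by default
pts_df <- pts_df[!is.na(pts_df$lon) & !is.na(pts_df$lat), ]
pts_sf <- st_as_sf(pts_df, coords = c("lon", "lat"), crs = 4326)

# st_filter() keeps the rows of the first argument that intersect the second
pts_in <- st_filter(pts_sf, area_sf)
pts_in$name  # "inside_a" "inside_b"
```

Both layers share the same CRS here, which mirrors why the ZIP code polygons were transformed to WGS84 before filtering.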

Task 4 – Process NYS Retail Food Store Dataset

The retail food store dataset stores its coordinates in columns labeled X and Y, which hold longitude and latitude respectively. These columns give the location of each food store and can be converted into spatial points.

As with the health facilities dataset, rows with missing coordinates were removed first. After conversion to a spatial object, the points were filtered so that only locations within the NYC ZIP code boundaries are kept.

food_df <- read_csv(
  "R-Spatial_I_Lab/nys_retail_food_store_xy.csv",
  locale = locale(encoding = "latin1"),
  show_col_types = FALSE
)

# Remove rows with missing coordinates
food_df <- food_df %>%
  filter(!is.na(X), !is.na(Y))

# Convert to spatial object
food_sf <- st_as_sf(
  food_df,
  coords = c("X","Y"),
  crs = 4326
)

# Keep only food stores located within NYC ZIP code boundaries
food_sf <- st_filter(food_sf, zip_sf)

food_sf
## Simple feature collection with 11306 features and 16 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -74.2484 ymin: 40.50782 xmax: -73.70069 ymax: 40.91008
## Geodetic CRS:  WGS 84
## # A tibble: 11,306 × 17
##    ï..County License.Number Operation.Type Establishment.Type Entity.Name       
##  * <chr>              <dbl> <chr>          <chr>              <chr>             
##  1 Bronx             734149 Store          JAC                7 ELEVEN FOOD STO…
##  2 Bronx             606221 Store          JAC                1001 SAN MIGUEL F…
##  3 Bronx             606228 Store          JAC                1029 FOOD PLAZA I…
##  4 Bronx             723375 Store          JAC                1078 DELI GROCERY…
##  5 Bronx             724807 Store          JAC                1086 LUNA DELI GR…
##  6 Bronx             712943 Store          JAC                109 AJ DELI GROCE…
##  7 Bronx             703060 Store          JAC                10 NEIGHBORHOOD C…
##  8 Bronx             609065 Store          JAC                1105 TINTON DELI …
##  9 Bronx             722972 Store          A                  1150 WEBSTER PHAR…
## 10 Bronx             609621 Store          JAC                1158 GROCERY & DE…
## # ℹ 11,296 more rows
## # ℹ 12 more variables: DBA.Name <chr>, Street.Number <chr>, Street.Name <chr>,
## #   Address.Line.2 <lgl>, Address.Line.3 <lgl>, City <chr>, State <chr>,
## #   Zip.Code <dbl>, Square.Footage <dbl>, Location <chr>, Coords <chr>,
## #   geometry <POINT [°]>
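
The first column in the printed output is named `ï..County`, which looks like encoding debris attached to the County header. The cause is not confirmed by anything in this lab; a byte-order mark picked up at some earlier processing step is one plausible explanation. A minimal sketch of one way to tidy the name is shown below on a toy table (the table `toy_food` and its values are illustrative), renaming by position so the garbled name never has to be typed.

```r
library(dplyr)

# Toy table whose first column name carries the same kind of debris
# ("\u00ef" is the character shown as the first letter of the garbled name)
toy_food <- data.frame(a = c("Bronx", "Kings"), License.Number = c(1, 2))
names(toy_food)[1] <- "\u00ef..County"

# Rename the first column by position
toy_food <- rename(toy_food, County = 1)
names(toy_food)  # "County" "License.Number"
```

Applied to the real data, the equivalent call would be `rename(food_df, County = 1)`, assuming the county column stays first as in the output above.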

Task 5 – Verify Spatial Locations with Mapview

The mapview package is used to quickly visualize spatial datasets on an interactive map. Displaying the layers together helps verify that the spatial data were created correctly and appear in the correct geographic locations.

mapview(zip_sf, layer.name = "NYC ZIP Codes") +
  mapview(health_sf, layer.name = "Health Facilities") +
  mapview(food_sf, layer.name = "Retail Food Stores")

The map allows the datasets to be viewed together and confirms that the facilities and food store locations are correctly positioned within the NYC ZIP code boundaries.
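
The visual check can be complemented with a simple numeric one: the bounding boxes of the two point layers should fall inside the bounding box of the ZIP code layer. The sketch below uses the bounding box values printed earlier in this report; the helper `bbox_within` is hypothetical, and it works on plain named vectors that use the same four names st_bbox() returns.

```r
# Bounding boxes copied from the printed output of zip_sf, health_sf and food_sf
zip_bb    <- c(xmin = -74.25576, ymin = 40.49584, xmax = -73.6996,  ymax = 40.91517)
health_bb <- c(xmin = -74.19681, ymin = 40.51677, xmax = -73.70332, ymax = 40.91062)
food_bb   <- c(xmin = -74.2484,  ymin = 40.50782, xmax = -73.70069, ymax = 40.91008)

# Hypothetical helper: TRUE when `inner` lies entirely inside `outer`
bbox_within <- function(inner, outer) {
  inner[["xmin"]] >= outer[["xmin"]] &&
    inner[["ymin"]] >= outer[["ymin"]] &&
    inner[["xmax"]] <= outer[["xmax"]] &&
    inner[["ymax"]] <= outer[["ymax"]]
}

bbox_within(health_bb, zip_bb)  # TRUE
bbox_within(food_bb, zip_bb)    # TRUE
```

A bounding box test is only a coarse check, since a point can sit inside the box but outside every polygon; the st_filter() step is what actually enforces containment.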

Task 6 – Save Spatial Objects

The three spatial datasets are saved so they can be reused later without repeating all of the data processing steps. Saving them as an .RData file allows the spatial objects to be quickly loaded in future analyses.

save(
  zip_sf,
  health_sf,
  food_sf,
  file = "spatial_lab1_objects.RData"
)
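
To reuse the saved objects, the file can be read back with load(), which restores each object under its original name. The sketch below demonstrates the round trip with small stand-in objects and a temporary file, so it can be run without the lab data; the names `obj_a`, `obj_b`, and `restored` are illustrative.

```r
# Stand-in objects; in the lab these would be zip_sf, health_sf and food_sf
obj_a <- data.frame(id = 1:3)
obj_b <- letters[1:5]

# Save to a temporary file so the sketch leaves the project folder untouched
rdata_path <- tempfile(fileext = ".RData")
save(obj_a, obj_b, file = rdata_path)

# Load into a fresh environment to see exactly which names the file restores
restored <- new.env()
loaded_names <- load(rdata_path, envir = restored)

loaded_names                      # "obj_a" "obj_b"
identical(restored$obj_a, obj_a)  # TRUE
```

In a later session, `load("spatial_lab1_objects.RData")` would bring zip_sf, health_sf, and food_sf back into the workspace, provided the sf package is loaded so the objects print and plot as spatial data.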