This assignment introduces basic spatial data processing and visualization in R. The goal is to work with several spatial datasets related to New York City and convert them into spatial objects that can be mapped and analyzed.
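Three datasets are used in this lab: the NYC ZIP code boundary shapefile, the NYS health facilities dataset, and the NYS retail food store dataset.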
For this assignment I created an R Project for the R-Spatial section. Using an R Project keeps all scripts and datasets organized in one folder and ensures the working directory is set correctly when running the code.
The NYC ZIP code boundary shapefile is read into R using the
st_read() function from the sf package.
This converts the shapefile into a spatial sf object,
which allows the geographic features to be mapped and analyzed.
library(sf)
zip_sf <- st_read("R-Spatial_I_Lab/ZIP_CODE_040114/ZIP_CODE_040114.shp")
## Reading layer `ZIP_CODE_040114' from data source
## `/Users/amallali/Desktop/GTECH 78520/Section_07/R-Spatial_I_Lab/ZIP_CODE_040114/ZIP_CODE_040114.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 263 features and 12 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 913129 ymin: 120020.9 xmax: 1067494 ymax: 272710.9
## Projected CRS: NAD83 / New York Long Island (ftUS)
# Convert projection to WGS84 so it matches the other datasets
zip_sf <- st_transform(zip_sf, 4326)
zip_sf
## Simple feature collection with 263 features and 12 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -74.25576 ymin: 40.49584 xmax: -73.6996 ymax: 40.91517
## Geodetic CRS: WGS 84
## First 10 features:
## ZIPCODE BLDGZIP PO_NAME POPULATION AREA STATE COUNTY ST_FIPS CTY_FIPS
## 1 11436 0 Jamaica 18681 22699295 NY Queens 36 081
## 2 11213 0 Brooklyn 62426 29631004 NY Kings 36 047
## 3 11212 0 Brooklyn 83866 41972104 NY Kings 36 047
## 4 11225 0 Brooklyn 56527 23698630 NY Kings 36 047
## 5 11218 0 Brooklyn 72280 36868799 NY Kings 36 047
## 6 11226 0 Brooklyn 106132 39408598 NY Kings 36 047
## 7 11219 0 Brooklyn 92561 42002738 NY Kings 36 047
## 8 11210 0 Brooklyn 67067 47887023 NY Kings 36 047
## 9 11230 0 Brooklyn 80857 49926703 NY Kings 36 047
## 10 11204 0 Brooklyn 77354 43555185 NY Kings 36 047
## URL SHAPE_AREA SHAPE_LEN geometry
## 1 http://www.usps.com/ 0 0 POLYGON ((-73.80585 40.6829...
## 2 http://www.usps.com/ 0 0 POLYGON ((-73.9374 40.67973...
## 3 http://www.usps.com/ 0 0 POLYGON ((-73.90294 40.6708...
## 4 http://www.usps.com/ 0 0 POLYGON ((-73.95797 40.6706...
## 5 http://www.usps.com/ 0 0 POLYGON ((-73.97208 40.6506...
## 6 http://www.usps.com/ 0 0 POLYGON ((-73.9619 40.65487...
## 7 http://www.usps.com/ 0 0 POLYGON ((-73.98906 40.6441...
## 8 http://www.usps.com/ 0 0 POLYGON ((-73.9584 40.63633...
## 9 http://www.usps.com/ 0 0 POLYGON ((-73.96451 40.6366...
## 10 http://www.usps.com/ 0 0 POLYGON ((-73.98108 40.6352...
plot(st_geometry(zip_sf))
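The reprojection step can be checked on a small example. The following is a minimal sketch, not part of the lab code: it assumes the shapefile's projected CRS corresponds to EPSG:2263 (NAD83 / New York Long Island, ftUS), and the point coordinates are illustrative values chosen to fall inside the bounding box printed above.

```r
library(sf)

# Hypothetical point in EPSG:2263 (assumed to match the shapefile's CRS);
# the coordinates are illustrative, in US survey feet
pt_2263 <- st_sfc(st_point(c(990000, 200000)), crs = 2263)

# Reproject to WGS 84 (EPSG:4326), as done for zip_sf
pt_wgs84 <- st_transform(pt_2263, 4326)

# Compare the CRS before and after, then inspect the new coordinates
st_crs(pt_2263)$epsg
st_crs(pt_wgs84)$epsg
st_coordinates(pt_wgs84)
```

After the transform the coordinates are in decimal degrees (longitude, latitude) rather than feet, which is what the point datasets below use.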
The NYS health facilities dataset contains geographic coordinates for each facility. These coordinates can be converted into spatial points so that the facility locations can be visualized on a map.
Before creating the spatial object, rows with missing coordinates were removed, because st_as_sf() does not accept missing values in the coordinate columns by default. After conversion, the points were filtered with st_filter() so that only facilities falling within the NYC ZIP code boundaries remain. This restricts the mapped facilities to the study area and leaves 1,293 of them.
# readr provides read_csv(); dplyr provides %>% and filter()
library(readr)
library(dplyr)
health_df <- read_csv(
  "R-Spatial_I_Lab/NYS_Health_Facility.csv",
  show_col_types = FALSE
)
# Remove rows that have missing coordinates
health_df <- health_df %>%
  filter(!is.na(`Facility Longitude`), !is.na(`Facility Latitude`))
# Convert to spatial object
health_sf <- st_as_sf(
  health_df,
  coords = c("Facility Longitude", "Facility Latitude"),
  crs = 4326
)
# Keep only facilities located within NYC ZIP code boundaries
health_sf <- st_filter(health_sf, zip_sf)
health_sf
## Simple feature collection with 1293 features and 34 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -74.19681 ymin: 40.51677 xmax: -73.70332 ymax: 40.91062
## Geodetic CRS: WGS 84
## # A tibble: 1,293 × 35
## `Facility ID` `Facility Name` `Short Description` Description
## * <dbl> <chr> <chr> <chr>
## 1 6230 NYU Langone Rutherford HOSP-EC Hospital E…
## 2 7257 Park Ridge Family Health Center HOSP-EC Hospital E…
## 3 9006 FedCare, Inc. DTC Diagnostic…
## 4 9970 Parkmed NYC, LLC DTC Diagnostic…
## 5 1217 St Patricks Home NH Residentia…
## 6 1820 Prime Home Health Services, LLC CHHA Certified …
## 7 4326 Segundo Ruiz Belvis D & T Cent… DTC Diagnostic…
## 8 5574 Fort Greene District Health Ce… DTC-EC Diagnostic…
## 9 7579 Queens Dialysis Center DTC-EC Diagnostic…
## 10 1653 Corona Health Center DTC-EC Diagnostic…
## # ℹ 1,283 more rows
## # ℹ 31 more variables: `Facility Open Date` <chr>, `Facility Address 1` <chr>,
## # `Facility Address 2` <chr>, `Facility City` <chr>, `Facility State` <chr>,
## # `Facility Zip Code` <chr>, `Facility Phone Number` <dbl>,
## # `Facility Fax Number` <dbl>, `Facility Website` <chr>,
## # `Facility County Code` <dbl>, `Facility County` <chr>,
## # `Regional Office ID` <dbl>, `Regional Office` <chr>, …
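The clean, convert, and filter sequence can be seen more easily on a toy dataset. This is a minimal sketch under stated assumptions: the data frame, its column names, and the unit-square polygon are all hypothetical stand-ins for the facility table and the ZIP code boundaries. Note that st_filter() relies on dplyr being installed.

```r
library(sf)

# Hypothetical toy data: one point inside the polygon, one outside,
# and one with a missing longitude
pts_df <- data.frame(
  name = c("inside", "outside", "missing"),
  lon  = c(0.5, 2.0, NA),
  lat  = c(0.5, 2.0, 0.5)
)

# Drop rows with missing coordinates before building geometries
pts_df <- pts_df[!is.na(pts_df$lon) & !is.na(pts_df$lat), ]

# Convert the remaining rows to POINT features in WGS 84
pts_sf <- st_as_sf(pts_df, coords = c("lon", "lat"), crs = 4326)

# A unit square standing in for the ZIP code boundaries
square <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
  crs = 4326
)

# Keep only points that intersect the polygon (st_filter's default predicate)
inside_sf <- st_filter(pts_sf, square)
inside_sf$name
```

Both layers share the same CRS here, which st_filter() requires; this is why zip_sf was transformed to WGS 84 earlier.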
The retail food store dataset contains coordinate columns labeled X and Y. These coordinates represent the geographic location of each food store and can be converted into spatial points.
As with the health facilities dataset, rows with missing coordinates were removed before conversion. After converting the dataset to a spatial object, the points were filtered so that only stores within the NYC ZIP code boundaries are included, which leaves 11,306 stores.
food_df <- read_csv(
  "R-Spatial_I_Lab/nys_retail_food_store_xy.csv",
  locale = locale(encoding = "latin1"),
  show_col_types = FALSE
)
# Remove rows with missing coordinates
food_df <- food_df %>%
  filter(!is.na(X), !is.na(Y))
# Convert to spatial object
food_sf <- st_as_sf(
  food_df,
  coords = c("X", "Y"),
  crs = 4326
)
# Keep only food stores located within NYC ZIP code boundaries
food_sf <- st_filter(food_sf, zip_sf)
food_sf
## Simple feature collection with 11306 features and 16 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -74.2484 ymin: 40.50782 xmax: -73.70069 ymax: 40.91008
## Geodetic CRS: WGS 84
## # A tibble: 11,306 × 17
## ï..County License.Number Operation.Type Establishment.Type Entity.Name
## * <chr> <dbl> <chr> <chr> <chr>
## 1 Bronx 734149 Store JAC 7 ELEVEN FOOD STO…
## 2 Bronx 606221 Store JAC 1001 SAN MIGUEL F…
## 3 Bronx 606228 Store JAC 1029 FOOD PLAZA I…
## 4 Bronx 723375 Store JAC 1078 DELI GROCERY…
## 5 Bronx 724807 Store JAC 1086 LUNA DELI GR…
## 6 Bronx 712943 Store JAC 109 AJ DELI GROCE…
## 7 Bronx 703060 Store JAC 10 NEIGHBORHOOD C…
## 8 Bronx 609065 Store JAC 1105 TINTON DELI …
## 9 Bronx 722972 Store A 1150 WEBSTER PHAR…
## 10 Bronx 609621 Store JAC 1158 GROCERY & DE…
## # ℹ 11,296 more rows
## # ℹ 12 more variables: DBA.Name <chr>, Street.Number <chr>, Street.Name <chr>,
## # Address.Line.2 <lgl>, Address.Line.3 <lgl>, City <chr>, State <chr>,
## # Zip.Code <dbl>, Square.Footage <dbl>, Location <chr>, Coords <chr>,
## # geometry <POINT [°]>
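The first column in the output above prints as `ï..County`. This looks like byte-order-mark residue carried in the CSV header rather than an intended field name, although that is an assumption since the file itself is not inspected here. A minimal sketch of renaming the column by position, using a hypothetical two-row stand-in for food_df:

```r
# Hypothetical stand-in for food_df, using two license numbers from the output
food_demo <- data.frame(
  County         = c("Bronx", "Bronx"),
  License.Number = c(734149, 606221)
)

# Reproduce the mangled first column name seen in the printed tibble
names(food_demo)[1] <- "\u00ef..County"

# Rename by position so the fix does not depend on typing the odd characters
names(food_demo)[1] <- "County"
names(food_demo)
```

Applied to the real data, the same positional rename would go right after read_csv(), before the column is used in any later step.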
The mapview package is used to quickly visualize spatial
datasets on an interactive map. Displaying the layers together helps
verify that the spatial data were created correctly and appear in the
correct geographic locations.
library(mapview)
mapview(zip_sf, layer.name = "NYC ZIP Codes") +
  mapview(health_sf, layer.name = "Health Facilities") +
  mapview(food_sf, layer.name = "Retail Food Stores")
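Alongside the visual check, the bounding boxes printed earlier give a quick numeric sanity check: both point layers should sit inside the ZIP code extent. The sketch below copies the bounding-box values from the output above; `bbox_within()` is a hypothetical helper written for this illustration, not an sf function. With the real objects, st_bbox(zip_sf) and its counterparts would supply the same numbers.

```r
# Bounding boxes copied from the printed output above (WGS 84, degrees)
zip_bbox    <- c(xmin = -74.25576, ymin = 40.49584, xmax = -73.6996,  ymax = 40.91517)
health_bbox <- c(xmin = -74.19681, ymin = 40.51677, xmax = -73.70332, ymax = 40.91062)
food_bbox   <- c(xmin = -74.2484,  ymin = 40.50782, xmax = -73.70069, ymax = 40.91008)

# Hypothetical helper: TRUE when `inner` lies entirely inside `outer`
bbox_within <- function(inner, outer) {
  unname(
    inner["xmin"] >= outer["xmin"] && inner["ymin"] >= outer["ymin"] &&
      inner["xmax"] <= outer["xmax"] && inner["ymax"] <= outer["ymax"]
  )
}

bbox_within(health_bbox, zip_bbox)
bbox_within(food_bbox, zip_bbox)
```

A bounding-box check is only a coarse test, since a point can fall inside the box but outside every polygon; the st_filter() step is what enforces the actual boundary.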