In this first installment we want to compute and map aggregates by area using a municipality's open data.
We chose New York City and, within it, the dataset of Privately Owned Public Spaces (POPS). These are indoor and outdoor spaces intended for public use but privately owned, provided in exchange for tax exemptions or additional buildable floor area.
We start by loading the packages we are going to need.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.6.1
## -- Attaching packages --------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.0 v purrr 0.3.2
## v tibble 2.1.3 v dplyr 0.8.3
## v tidyr 0.8.3 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## Warning: package 'tibble' was built under R version 3.6.1
## Warning: package 'tidyr' was built under R version 3.6.1
## Warning: package 'readr' was built under R version 3.6.1
## Warning: package 'purrr' was built under R version 3.6.1
## Warning: package 'dplyr' was built under R version 3.6.1
## Warning: package 'stringr' was built under R version 3.6.1
## Warning: package 'forcats' was built under R version 3.6.1
## -- Conflicts ------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(sf)
## Warning: package 'sf' was built under R version 3.6.1
## Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
We will use the functions in these packages to assign each georeferenced record to the neighborhood it belongs to. First we load the datasets we are going to work with.
mapa_NYC <- st_read("E:/Documentos/Ciencia de Datos II/NYC base/nynta.shp")
## Reading layer `nynta' from data source `E:\Documentos\Ciencia de Datos II\NYC base\nynta.shp' using driver `ESRI Shapefile'
## Simple feature collection with 195 features and 7 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 913175.1 ymin: 120121.9 xmax: 1067383 ymax: 272844.3
## epsg (SRID): NA
## proj4string: +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
POPS_NYC <- st_read("E:/Documentos/Ciencia de Datos II/POPS - Privately Owned Public Spaces/nycpops_20180815.shp")
## Reading layer `nycpops_20180815' from data source `E:\Documentos\Ciencia de Datos II\POPS - Privately Owned Public Spaces\nycpops_20180815.shp' using driver `ESRI Shapefile'
## Simple feature collection with 354 features and 27 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 979859 ymin: 189776 xmax: 1021888 ymax: 225509
## epsg (SRID): NA
## proj4string: +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
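Both st_read() reports show the same proj4string, which is what later allows the two layers to be overlaid and joined without reprojecting. As a rough sketch, the same check can be done in code with st_crs(). The two toy layers below (capa_a and capa_b, hypothetical names) stand in for mapa_NYC and POPS_NYC so that the snippet runs without the shapefiles.

```r
library(sf)

# proj4string copied from the st_read() output above
crs_nyc <- paste(
  "+proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333",
  "+lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0",
  "+datum=NAD83 +units=us-ft +no_defs"
)

# Toy geometries (hypothetical stand-ins for the two real layers)
capa_a <- st_sfc(st_point(c(987364, 190811)), crs = crs_nyc)
capa_b <- st_sfc(st_point(c(990230, 190089)), crs = crs_nyc)

# TRUE when both layers use the same coordinate reference system
mismo_crs <- st_crs(capa_a) == st_crs(capa_b)
mismo_crs
```

With the real data the equivalent call would be st_crs(mapa_NYC) == st_crs(POPS_NYC).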
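Once the layers are loaded, we look at their attributes, components, and variable names. Called without further arguments, summarize() collapses the layer into a single feature, so it mainly shows the overall extent and projection; names() lists the variables.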
summarize(mapa_NYC)
## Simple feature collection with 1 feature and 0 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 913175.1 ymin: 120121.9 xmax: 1067383 ymax: 272844.3
## epsg (SRID): NA
## proj4string: +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
## geometry
## 1 MULTIPOLYGON (((969488.2 14...
names(mapa_NYC)
## [1] "BoroCode" "BoroName" "CountyFIPS" "NTACode" "NTAName"
## [6] "Shape_Leng" "Shape_Area" "geometry"
names(POPS_NYC)
## [1] "POPS_Numbe" "Borough_Na" "Community_" "Address_Nu" "Street_Nam"
## [6] "Zip_Code" "Building_A" "Tax_Block" "Tax_Lot" "Building_N"
## [11] "Building_L" "Year_Compl" "Building_C" "Public_Spa" "Developer"
## [16] "Building_1" "Principal_" "Size_Requi" "Hour_Of_Ac" "Amenities_"
## [21] "Other_Requ" "Permitted_" "Physically" "Latitude" "Longitude"
## [26] "XCoordinat" "YCoordinat" "geometry"
Now we plot both shapefiles together to look at them:
ggplot() +
  geom_sf(data = mapa_NYC) +
  geom_point(data = POPS_NYC,
             aes(x = Longitude, y = Latitude),
             alpha = .3,
             color = "green")
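The map and the locations of the public spaces come out dissociated because the coordinates do not match: geom_point() draws Longitude and Latitude in decimal degrees, while the map's geometry is in projected coordinates measured in US feet. Since POPS_NYC was read with st_read(), it is already a spatial (sf) object whose point geometry uses the same projection as the map, so a call to st_as_sf() would return it unchanged. We only drop the records without coordinates and then draw the plot again with geom_sf().
POPS_NYC <- POPS_NYC %>%
  filter(!is.na(Latitude), !is.na(Longitude))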
ggplot() +
  geom_sf(data = mapa_NYC) +
  geom_sf(data = POPS_NYC, color = "green", alpha = .3) +
  theme_minimal()
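When point data arrives as a plain table with longitude and latitude columns rather than as a shapefile, a conversion is needed before it can share a map with the neighborhoods. The following is a minimal sketch of that case, not the procedure used in this post. It takes the coordinates of one POPS record, assumes they are WGS84 degrees (EPSG:4326), and reprojects them to the proj4string reported by st_read(). The names tabla, puntos and puntos_nyc are hypothetical.

```r
library(sf)

# proj4string reported by st_read() for both shapefiles
crs_nyc <- paste(
  "+proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333",
  "+lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0",
  "+datum=NAD83 +units=us-ft +no_defs"
)

# One record in decimal degrees (values of POPS K020001 in the table)
tabla <- data.frame(POPS_Numbe = "K020001",
                    Latitude   = 40.69041,
                    Longitude  = -73.98877)

# coords takes x first and y second, that is, longitude and then latitude
puntos <- st_as_sf(tabla,
                   coords = c("Longitude", "Latitude"),
                   crs = 4326)  # assumption: coordinates are WGS84 degrees

# Reproject to the same CRS as the neighborhood map
puntos_nyc <- st_transform(puntos, crs = crs_nyc)
xy <- st_coordinates(puntos_nyc)
xy
```

The resulting X and Y should land close to the XCoordinat and YCoordinat values stored for that record (987364 and 190811), which is a quick way to confirm that the column order and the CRS were specified correctly.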
Now the points for each space fall on the map of New York, but to work with them spatially we still need to join the two datasets.
We join them with the spatial join function, st_join().
Parques_NYC <- st_join(POPS_NYC, mapa_NYC)
Then we inspect the result of the join.
Parques_NYC %>% head
## Simple feature collection with 6 features and 34 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 987077 ymin: 190089 xmax: 990230 ymax: 192148
## epsg (SRID): NA
## proj4string: +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
## POPS_Numbe Borough_Na Community_ Address_Nu Street_Nam Zip_Code
## 1 K020001 Brooklyn 2 130 LIVINGSTON STREET 11201
## 2 K020002 Brooklyn 2 350 JAY STREET 11201
## 3 K020003 Brooklyn 2 1 METROTECH CENTER 11201
## 4 K020006 Brooklyn 2 111 LIVINGSTON STREET 11201
## 5 K020007 Brooklyn 2 230 ASHLAND PLACE 11217
## 6 K020008 Brooklyn 2 343 GOLD STREET 11201
## Building_A Tax_Block Tax_Lot
## 1 130 LIVINGSTON STREET, Brooklyn, NY 11201 163 1
## 2 350 JAY STREET, Brooklyn, NY 11201 140 7501
## 3 1 METROTECH CENTER, Brooklyn, NY 11201 147 4
## 4 111 LIVINGSTON STREET, Brooklyn, NY 11201 266 1
## 5 230 ASHLAND PLACE, Brooklyn, NY 11217 2095 7502
## 6 343 GOLD STREET, Brooklyn, NY 11201 2049 2
## Building_N
## 1 Livingston Plaza
## 2 Renaissance Plaza
## 3 <NA>
## 4 <NA>
## 5 Forte
## 6 Avalon Fort Greene
## Building_L
## 1 Full block bounded by Livingston Street, Schermerhorn Street, Smith Street, and Boerem Place
## 2 West side of Jay Street between Willoughby Street and Tech Place
## 3 Bounded by Jay Street, Tech Place, Bridge Street, Tillary Street, Flatbush Avenue, Duffield Street, and Willoughby Street
## 4 Northwest corner of Livingston Street and Boerem Place
## 5 North of Fulton Street between Ashland Place and Rockwell Place
## 6 North of Myrtle Avenue between Flatbush Avenue Extension and Prince Street
## Year_Compl Building_C Public_Spa
## 1 1991 Completed Plaza
## 2 1998 Completed Urban Plaza
## 3 1990 Completed Commons and Public Areas
## 4 1969 Completed Plaza
## 5 2007 Completed Residential Plaza
## 6 2008 Completed Residential Plaza
## Developer
## 1 Cohen Brothers
## 2 Muss Development Company
## 3 Forest City MetroTech Associates
## 4 <NA>
## 5 The Clarett Group
## 6 Avalon Bay Communities, Inc.
## Building_1
## 1 Murphy/Jahn Architects
## 2 William B. Tabler Architects
## 3 Haines Lundberg Waehler/Skidmore, Owings & Merrill/Swanke Hayden Connell
## 4 <NA>
## 5 FXFowle Architects
## 6 Perkins Eastman
## Principal_
## 1 Murphy/Jahn Architects
## 2 William B. Tabler Architects/Office of William B. Kuhl; William B. Tabler Architects/Moss Gilday Group (for alteration in 2003)
## 3 The Ehrenkrantz Group & Eckstut
## 4 <NA>
## 5 Sullivan Group Design
## 6 Moss Gilday Group
## Size_Requi
## 1 Plaza 19201.00 sf
## 2 Urban Plaza 28805.00 sf
## 3 Commons and Public Areas 0.00 sf
## 4 Plaza 9920.00 sf
## 5 Residential Plaza 3960.00 sf
## 6 Residential Plaza 17475.00 sf
## Hour_Of_Ac
## 1 Plaza: 24 Hours
## 2 Urban Plaza: 24 Hours
## 3 Commons and Public Areas: 24 Hours (except Thanksgiving, Christmas, and New Years)
## 4 Plaza: 24 Hours
## 5 Residential Plaza: 24 Hours
## 6 Residential Plaza: 24 Hours
## Amenities_
## 1 Plaza: Lighting; Plaza: Litter Receptacles; Plaza: Other Required (decorative pavement near intersection of Boerem Place and Livingston Street); Plaza: Planting (10 grade-level planting beds, 9 raised planters); Plaza: Seating (6 benches); Plaza: Trees o
## 2 Urban Plaza: Lighting; Urban Plaza: Litter Receptacles (14.4 cubic feet required, a minimum of 20 cf provided); Urban Plaza: Planting; Urban Plaza: Plaque/Sign; Urban Plaza: Retail Frontage (14% of building frontage); Urban Plaza: Seating (420 linear fee
## 3 Commons and Public Areas: Artwork (The Commons: sculpture pedestal); Commons and Public Areas: Drinking Fountain (The Commons: 1); Commons and Public Areas: Lighting (The Commons: 32 post lanterns; Flatbush Court: 10 post lanterns; Jay Plaza: 5 post lant
## 4 Plaza: None
## 5 Residential Plaza: Bicycle Parking (9 spaces required, 12 spaces provided); Residential Plaza: Drinking Fountain (1); Residential Plaza: Lighting; Residential Plaza: Litter Receptacles (8.28 cubic feet required, 2 litter receptacles at 5 cf each provided
## 6 Residential Plaza: Bicycle Parking (Primary Space: 25 spaces required, 13 bicycle racks with 2 spaces per rack provided); Residential Plaza: Drinking Fountain (Primary Space: 1); Residential Plaza: Lighting; Residential Plaza: Litter Receptacles (Primary
## Other_Requ
## 1 decorative pavement near intersection of Boerem Place and Livingston Street
## 2 <NA>
## 3 Jay Plaza: newsstand, trellis; security
## 4 <NA>
## 5 adjacent wall of Brooklyn Academy of Music covered in planting
## 6 <NA>
## Permitted_
## 1 <NA>
## 2 Urban Plaza: Other Permitted (North Terrace will have the ability to accommodate lunch vendor cart; "planter/sculpture area" on eastern end of through-block section of Urban Plaza)
## 3 <NA>
## 4 <NA>
## 5 <NA>
## 6 <NA>
## Physically Latitude Longitude XCoordinat YCoordinat BoroCode BoroName
## 1 Full/Partial 40.69041 -73.98877 987364 190811 3 Brooklyn
## 2 Full/Partial 40.69381 -73.98799 987580 192051 3 Brooklyn
## 3 Full/Partial 40.69322 -73.98672 987933 191836 3 Brooklyn
## 4 Full/Partial 40.69148 -73.98981 987077 191200 3 Brooklyn
## 5 Full/Partial 40.68825 -73.97864 990230 190089 3 Brooklyn
## 6 Unknown 40.69408 -73.98288 988996 192148 3 Brooklyn
## CountyFIPS NTACode NTAName
## 1 047 BK38 DUMBO-Vinegar Hill-Downtown Brooklyn-Boerum Hill
## 2 047 BK38 DUMBO-Vinegar Hill-Downtown Brooklyn-Boerum Hill
## 3 047 BK38 DUMBO-Vinegar Hill-Downtown Brooklyn-Boerum Hill
## 4 047 BK09 Brooklyn Heights-Cobble Hill
## 5 047 BK68 Fort Greene
## 6 047 BK38 DUMBO-Vinegar Hill-Downtown Brooklyn-Boerum Hill
## Shape_Leng Shape_Area geometry
## 1 32542.90 28477919 POINT (987364 190811)
## 2 32542.90 28477919 POINT (987580 192051)
## 3 32542.90 28477919 POINT (987933 191836)
## 4 14264.79 9983621 POINT (987077 191200)
## 5 19825.52 16482822 POINT (990230 190089)
## 6 32542.90 28477919 POINT (988996 192148)
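The spatial join appends to each point the attributes of the polygon that contains it, as the NTACode and NTAName columns in the output show. A minimal sketch of that behavior on toy data follows; the squares, the points and the names barrios, puntos and unidos are invented for illustration. With the defaults of st_join(), a point that falls in no polygon is kept and receives NA.

```r
library(sf)

# Helper: unit square whose lower-left corner is at (x0, 0)
cuadrado <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Two toy "neighborhoods" that do not touch each other
barrios <- st_sf(NTAName = c("A", "B"),
                 geometry = st_sfc(cuadrado(0), cuadrado(2)))

# Three toy points: one inside each square and one outside both
puntos <- st_sf(id = 1:3,
                geometry = st_sfc(st_point(c(0.5, 0.5)),
                                  st_point(c(2.5, 0.5)),
                                  st_point(c(5, 5))))

# Left join by intersection (the defaults of st_join)
unidos <- st_join(puntos, barrios)
as.character(unidos$NTAName)
```

The result keeps the point geometry and one row per point, which is why Parques_NYC still has geometry type POINT after the join.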
Now we can plot the public spaces and the neighborhoods with ggplot.
ggplot() +
  geom_sf(data = mapa_NYC) +
  geom_sf(data = Parques_NYC, aes(color = NTAName))
What we need, though, is a map whose colors show the total number of public spaces found in each neighborhood, so we count them per neighborhood.
We also turn the dataset with the count back into a plain table by dropping the geometry variable, so that it can be joined to the spatial dataset of neighborhoods. Then we join the New York map with the per-neighborhood count.
conteo_NYC <- Parques_NYC %>%
  group_by(NTAName) %>%
  summarise(cantidad = n()) %>%
  st_set_geometry(NULL)
The join is done on the variable the two tables have in common, in this case “NTAName”.
mapa_NYC2 <- left_join(mapa_NYC, conteo_NYC)
## Joining, by = "NTAName"
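The count-and-join step can be sketched on toy data as follows. The neighborhoods, the points and the names barrios, puntos, conteo and barrios2 are invented. Unlike the code above, this variant drops the geometry before counting, which avoids combining the point geometries of each group, and names the join key explicitly. Neighborhoods without any point keep NA in cantidad after the left join.

```r
library(sf)
library(dplyr)

# Helper: unit square whose lower-left corner is at (x0, 0)
cuadrado <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Three toy neighborhoods; "C" will contain no points
barrios <- st_sf(NTAName = c("A", "B", "C"),
                 geometry = st_sfc(cuadrado(0), cuadrado(2), cuadrado(4)))

# Toy points that already carry their neighborhood name
puntos <- st_sf(NTAName = c("A", "A", "B"),
                geometry = st_sfc(st_point(c(0.2, 0.2)),
                                  st_point(c(0.8, 0.8)),
                                  st_point(c(2.5, 0.5))))

# Plain table with one row per neighborhood and its number of points
conteo <- puntos %>%
  st_set_geometry(NULL) %>%
  count(NTAName, name = "cantidad")

# Attach the counts to the polygons; the result is still an sf object
barrios2 <- left_join(barrios, conteo, by = "NTAName")
barrios2$cantidad
```

On a choropleth, the NA rows are drawn with the fill scale's NA color, so they remain distinguishable from neighborhoods that do have a count.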
To read the information more easily, we draw a bar chart.
ggplot(conteo_NYC) +
  geom_bar(aes(x = NTAName, weight = cantidad), color = NA) +
  coord_flip() +
  labs(title = "Distribution of POPS by neighborhood",
       x = "Neighborhood",
       y = "Count") +
  theme_minimal()
Finally, we draw a new map in which the fill color is the indicator.
ggplot() +
  geom_sf(data = mapa_NYC2,
          aes(fill = cantidad), color = NA) +
  labs(title = "Total POPS by neighborhood",
       subtitle = "Number of POPS",
       fill = "Total count",
       caption = "Data as of 2018") +
  scale_fill_viridis_c() +
  theme_minimal()
We can see that this type of public space is concentrated in the boroughs of Manhattan, Queens, and Brooklyn, with Manhattan having the highest count.
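A reading by borough like this one can also be backed with a table, since the joined data carries a BoroName column. The sketch below shows the idea with count() on a handful of invented rows; ejemplo and por_distrito are hypothetical names and the values are not taken from the POPS dataset.

```r
library(dplyr)

# Invented rows standing in for the joined table without its geometry
ejemplo <- data.frame(
  BoroName = c("Manhattan", "Manhattan", "Manhattan", "Brooklyn", "Queens"),
  NTAName  = c("n1", "n1", "n2", "n3", "n4")
)

# Number of rows per borough, largest first
por_distrito <- ejemplo %>%
  count(BoroName, sort = TRUE, name = "cantidad")
por_distrito
```

With the real data, the same pipeline would start from Parques_NYC %>% st_set_geometry(NULL).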