New York City offers job opportunities across many industries and is one of the most popular tourist destinations in the world, so it attracts millions of people to live in or visit the city each year. However, New York City is also known as the most expensive city in the United States. According to a biannual report by the Economist Intelligence Unit in 2021, the Big Apple ranks as the 6th most expensive city globally. What factors might affect property prices in New York City? With a staggering 56% of New York City residents relying on city public transit (New York Public Transit Association), public transportation is the first factor to consider.
It is no secret that public transportation systems play a primary role in, and provide convenience for, the business, economic, and social development of modern cities. Many studies find that public transport plays a significant role in property values. A nearby public transit station should make residential properties more valuable than those further afield. In other words, the closer a residential property is to a rail station, the more expensive it will be when other factors known to influence residential property prices, such as property condition, crime rate, and tax rates, are similar.
This project investigates the relationship between property sale prices in Brooklyn and Queens and access to the subway in these two boroughs. I analyze geospatial data and build several maps that show property prices in Brooklyn and Queens alongside the distribution of subway lines and subway stations in New York City. The project uses two datasets from the New York City Department of Finance (https://www.nyc.gov/site/finance/taxes/property-annualized-sales-update.page). These two datasets contain 2021 New York City property sales data for Brooklyn and Queens.
Before importing the datasets into R, I first deleted in Excel the columns that are not relevant to my project: tax class at present, block, lot, easement, building class at present, apartment number, residential units, total units, land square feet, year built, tax class at the time of sale, building class at the time of sale, and sale date. I also created a new column in both datasets, sale price per gross square foot, by dividing the sale price by the gross square feet. Because this new column contained many missing values, I excluded every row in which it was missing.
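The same derived column could also be computed in R rather than in Excel, which keeps the preparation step reproducible. The following is only a minimal sketch on a hypothetical toy data frame (`sales` and its values are invented for illustration); it assumes the column names match those used elsewhere in this report.

```r
library(dplyr)

# Hypothetical toy rows standing in for the Department of Finance sales data
sales <- data.frame(
  `ZIP CODE` = c(11201, 11201, 11203, 11203),
  `SALE PRICE` = c(1200000, 0, 500000, 750000),
  `GROSS SQUARE FEET` = c(1500, 2000, NA, 0),
  check.names = FALSE
)

sales_clean <- sales %>%
  # Keep only rows where the price-per-square-foot ratio is defined and meaningful
  filter(!is.na(`SALE PRICE`), !is.na(`GROSS SQUARE FEET`),
         `SALE PRICE` > 0, `GROSS SQUARE FEET` > 0) %>%
  # Derive the price per gross square foot
  mutate(`SALE PRICE PER GROSS SQUARE FEET` = `SALE PRICE` / `GROSS SQUARE FEET`)

print(sales_clean)
```

Filtering before dividing avoids creating infinite or missing ratios in the first place, so no separate clean-up of the derived column is needed.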
I imported the two updated datasets into RStudio and excluded any rows with a sale price of 0 or gross square feet of 0.
library(readxl)
library(dplyr)

# Read the cleaned Brooklyn file, skipping the first 6 rows of the sheet
brooklyn_2021 <- read_excel("~/Desktop/MEA Independent Study/Data/2021_brooklyn_clean.xlsx",
                            skip = 6)
colnames(brooklyn_2021)
# Set the names of columns 6 and 7
colnames(brooklyn_2021)[6] <- "RESIDENTIAL UNITS"
colnames(brooklyn_2021)[7] <- "GROSS SQUARE FEET"
View(brooklyn_2021)

# Same steps for the cleaned Queens file
queens_2021 <- read_excel("~/Desktop/MEA Independent Study/Data/2021_queens_clean.xlsx",
                          skip = 6)
colnames(queens_2021)
colnames(queens_2021)[6] <- "RESIDENTIAL UNITS"
colnames(queens_2021)[7] <- "GROSS SQUARE FEET"
View(queens_2021)

# Drop rows with a zero sale price or zero gross square feet
brooklyn_2021 <- brooklyn_2021 %>%
  filter(`SALE PRICE` != 0) %>%
  filter(`GROSS SQUARE FEET` != 0)
View(brooklyn_2021)
queens_2021 <- queens_2021 %>%
  filter(`SALE PRICE` != 0) %>%
  filter(`GROSS SQUARE FEET` != 0)
View(queens_2021)
I created two tables showing the mean sale price per gross square foot by ZIP code in Brooklyn and Queens. I also found the highest and lowest ZIP-code averages in each borough.
# Mean sale price per gross square foot for each Brooklyn ZIP code
avg_brooklyn_2021 <- brooklyn_2021 %>%
  group_by(`ZIP CODE`) %>%
  summarise(`AVG PRICE PER GROSS SQUARE FEET` = mean(`SALE PRICE PER GROSS SQUARE FEET`))
View(avg_brooklyn_2021)
knitr::kable(avg_brooklyn_2021)
| ZIP CODE | AVG PRICE PER GROSS SQUARE FEET |
|---|---|
| 11201 | 862.7936 |
| 11203 | 364.7076 |
| 11204 | 581.5831 |
| 11205 | 752.3495 |
| 11206 | 526.1113 |
| 11207 | 356.1703 |
| 11208 | 344.5969 |
| 11209 | 633.1645 |
| 11210 | 497.0832 |
| 11211 | 693.1178 |
| 11212 | 350.4087 |
| 11213 | 404.6472 |
| 11214 | 536.4122 |
| 11215 | 1066.6118 |
| 11216 | 895.3750 |
| 11217 | 1010.4147 |
| 11218 | 681.1163 |
| 11219 | 519.1548 |
| 11220 | 466.7412 |
| 11221 | 499.2058 |
| 11222 | 1229.4274 |
| 11223 | 679.9615 |
| 11224 | 327.7796 |
| 11225 | 629.5159 |
| 11226 | 553.5148 |
| 11228 | 592.0438 |
| 11229 | 530.0481 |
| 11230 | 692.1706 |
| 11231 | 1069.4003 |
| 11232 | 666.3988 |
| 11233 | 440.8665 |
| 11234 | 453.6013 |
| 11235 | 503.9871 |
| 11236 | 365.2623 |
| 11237 | 598.3377 |
| 11238 | 908.1895 |
| 11239 | 315.4400 |
| 11249 | 2550.8628 |
max(avg_brooklyn_2021$`AVG PRICE PER GROSS SQUARE FEET`)
## [1] 2550.863
min(avg_brooklyn_2021$`AVG PRICE PER GROSS SQUARE FEET`)
## [1] 315.44
In Brooklyn, the highest ZIP-code average sale price per gross square foot is 2550.863 (ZIP code 11249), and the lowest is 315.44 (ZIP code 11239).
# Mean sale price per gross square foot for each Queens ZIP code
avg_queens_2021 <- queens_2021 %>%
  group_by(`ZIP CODE`) %>%
  summarise(`AVG PRICE PER GROSS SQUARE FEET` = mean(`SALE PRICE PER GROSS SQUARE FEET`))
View(avg_queens_2021)
knitr::kable(avg_queens_2021)
| ZIP CODE | AVG PRICE PER GROSS SQUARE FEET |
|---|---|
| 11001 | 503.3111 |
| 11004 | 565.9519 |
| 11040 | 594.8567 |
| 11101 | 985.5806 |
| 11102 | 553.3970 |
| 11103 | 612.1626 |
| 11104 | 659.4729 |
| 11105 | 614.4458 |
| 11106 | 558.1881 |
| 11354 | 600.2637 |
| 11355 | 577.7160 |
| 11356 | 483.7733 |
| 11357 | 611.4370 |
| 11358 | 586.7649 |
| 11360 | 571.7917 |
| 11361 | 597.9003 |
| 11362 | 584.2241 |
| 11363 | 565.6871 |
| 11364 | 650.1251 |
| 11365 | 638.6149 |
| 11366 | 582.4770 |
| 11367 | 549.3428 |
| 11368 | 497.7642 |
| 11369 | 463.4678 |
| 11370 | 525.2667 |
| 11372 | 537.1479 |
| 11373 | 497.5879 |
| 11374 | 563.0498 |
| 11375 | 660.0694 |
| 11377 | 507.9552 |
| 11378 | 551.6695 |
| 11379 | 550.1387 |
| 11385 | 5398.3107 |
| 11411 | 398.2550 |
| 11412 | 369.9270 |
| 11413 | 406.3184 |
| 11414 | 416.5484 |
| 11415 | 436.2913 |
| 11416 | 406.0620 |
| 11417 | 457.1963 |
| 11418 | 388.2766 |
| 11419 | 456.9580 |
| 11420 | 482.9122 |
| 11421 | 426.5173 |
| 11422 | 407.2822 |
| 11423 | 435.4351 |
| 11426 | 519.6030 |
| 11427 | 506.2591 |
| 11428 | 489.2448 |
| 11429 | 405.9665 |
| 11430 | 346.7061 |
| 11432 | 567.7322 |
| 11433 | 405.4162 |
| 11434 | 415.5773 |
| 11435 | 562.8604 |
| 11436 | 471.5388 |
| 11691 | 358.2270 |
| 11692 | 301.4881 |
| 11693 | 364.6227 |
| 11694 | 487.0282 |
max(avg_queens_2021$`AVG PRICE PER GROSS SQUARE FEET`)
## [1] 5398.311
min(avg_queens_2021$`AVG PRICE PER GROSS SQUARE FEET`)
## [1] 301.4881
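In Queens, the highest ZIP-code average sale price per gross square foot is 5398.311 (ZIP code 11385), and the lowest is 301.4881 (ZIP code 11692). The 11385 average is more than five times the next-highest Queens value (985.5806 for ZIP code 11101), so it may be driven by a small number of extreme sales.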
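Because a mean is sensitive to extreme sales, one way to check whether a ZIP-code average is representative is to report the median and the number of sales next to it. The sketch below uses a hypothetical toy data frame (`toy`, `zip`, and `price_sqft` are invented names and values, not taken from the sales data) to show the idea.

```r
library(dplyr)

# Hypothetical toy data: ZIP "B" contains one extreme sale
toy <- data.frame(
  zip = c("A", "A", "A", "B", "B", "B"),
  price_sqft = c(400, 500, 600, 450, 550, 15000)
)

toy_summary <- toy %>%
  group_by(zip) %>%
  summarise(
    n_sales = n(),                      # how many sales the average rests on
    mean_price = mean(price_sqft),      # pulled upward by extreme values
    median_price = median(price_sqft)   # robust to a few extreme values
  )

print(toy_summary)
```

A large gap between the mean and the median for a ZIP code, as in the toy ZIP "B", would suggest that the average reflects a few unusual sales rather than typical prices.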
library(tigris)
library(tidyverse)
library(sf)
library(leaflet)
library(mapview)
read_sf("Borough Boundaries.geojson") -> boroughs
read_sf("Subway Stations.geojson") -> stations
read_sf("Subway Lines.geojson") -> lines
mapview(boroughs) + mapview(lines, color="red") + mapview(stations, color = "green")
This map provides an overview of New York City subway lines and subway stations.
leaflet() %>%
  addTiles() %>%
  addPolygons(data = boroughs$geometry,
              color = 'purple',
              stroke = TRUE,  # stroke is a logical flag (draw borders or not), not a color
              popup = paste0(
                boroughs$boro_name
              )
  )
This is a map displaying the five boroughs of New York City. This project focuses on Brooklyn and Queens.
leaflet() %>%
  addTiles() %>%
  addCircleMarkers(data = stations$geometry,
                   radius = 2,
                   color = 'purple',
                   stroke = TRUE,  # stroke is a logical flag (draw borders or not), not a color
                   popup = paste0("Name: ",
                                  stations$name,
                                  "<br/>",
                                  "Line: ", stations$line
                   )
  )
leaflet() %>%
  addTiles() %>%
  # Subway lines are line geometries, so draw them with addPolylines rather than addPolygons
  addPolylines(data = lines$geometry,
               popup = paste0(
                 lines$name
               )
  )
Based on the distribution of subway stations and subway lines in New York City, most of Brooklyn's subway service runs through the borough's west side, so people there have easier access to the subway than residents in other areas. In Queens, the subway mainly covers the northwest part of the borough, so people there have better access to the subway.
read_sf("~/Desktop/MEA Independent Study/nyc_zip.geojson") -> nyc
colnames(nyc)
colnames(nyc)[2] <- "ZIP CODE"
as.character(avg_brooklyn_2021$`ZIP CODE`) -> avg_brooklyn_2021$`ZIP CODE`
as.character(avg_queens_2021$`ZIP CODE`) -> avg_queens_2021$`ZIP CODE`
avg_brooklyn_2021 %>%
full_join(avg_queens_2021) -> both
## Joining, by = c("ZIP CODE", "AVG PRICE PER GROSS SQUARE FEET")
nyc %>%
inner_join(both, by="ZIP CODE") -> merged
labels <- sprintf(
  "<strong>%s</strong><br/> Average Price / SqFt: $ %g",
  merged$PO_NAME, merged$`AVG PRICE PER GROSS SQUARE FEET`
) %>% lapply(htmltools::HTML)
qpal <- colorQuantile(rev(viridis::viridis(10)), merged$`AVG PRICE PER GROSS SQUARE FEET`, n = 10)
leaflet() %>%
  addTiles() %>%
  addPolygons(
    data = merged,
    label = labels,
    smoothFactor = 0.3,
    fillOpacity = 0.7,
    weight = 1,
    color = "white",
    fillColor = ~qpal(merged$`AVG PRICE PER GROSS SQUARE FEET`)
  )
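An `inner_join` silently drops any ZIP code that appears in the price table but not in the boundary file, so it may be worth checking which codes did not match before reading the map. The following is a minimal sketch with hypothetical stand-ins for the two tables (`prices`, `shapes`, and the code "99999" are invented for illustration).

```r
library(dplyr)

# Hypothetical stand-ins for the price table and the ZIP-code boundary table
prices <- data.frame(
  `ZIP CODE` = c("11201", "11385", "99999"),
  avg_price = c(800, 600, 500),
  check.names = FALSE
)
shapes <- data.frame(
  `ZIP CODE` = c("11201", "11385"),
  check.names = FALSE
)

# Rows of `prices` that have no matching ZIP code in `shapes`
unmatched <- anti_join(prices, shapes, by = "ZIP CODE")

print(unmatched)
```

Applied to the real tables, an empty result would mean that every ZIP code with an average price is drawn on the map.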
The distribution of subway stations and subway lines in Brooklyn and Queens mirrors the pattern of average property prices in these two boroughs. Based on this map, average property prices on the west side of Brooklyn and in the northwest part of Queens are higher than in the rest of the two boroughs.
Based on my study, access to the subway appears to affect property prices in Brooklyn and Queens. A nearby public transit station should make residential properties more valuable than those further afield.
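The visual comparison of the maps could be complemented by a numeric check: measure the distance from each ZIP code to its nearest subway station and correlate it with the average price. The sketch below is not part of the original analysis; it uses hypothetical toy points (`zips`, `stations_toy`, and all coordinates and prices are invented) and assumes that, with the real data, ZIP-code centroids would take the place of the toy points.

```r
library(sf)

# Hypothetical toy data: three ZIP-code centroids and two stations (lon/lat)
zips <- st_as_sf(
  data.frame(zip = c("A", "B", "C"),
             avg_price = c(900, 600, 350),
             lon = c(-73.99, -73.95, -73.85),
             lat = c(40.69, 40.68, 40.66)),
  coords = c("lon", "lat"), crs = 4326
)
stations_toy <- st_as_sf(
  data.frame(name = c("S1", "S2"),
             lon = c(-73.99, -73.96),
             lat = c(40.692, 40.681)),
  coords = c("lon", "lat"), crs = 4326
)

# Project to UTM zone 18N (EPSG:32618) so distances are in metres
zips_m <- st_transform(zips, 32618)
stations_m <- st_transform(stations_toy, 32618)

# Index of the nearest station for each ZIP-code point
nearest <- st_nearest_feature(zips_m, stations_m)

# Distance from each point to its own nearest station
zips_m$dist_m <- as.numeric(
  st_distance(zips_m, stations_m[nearest, ], by_element = TRUE)
)

# Rank correlation between distance to the subway and average price
rho <- cor(zips_m$dist_m, zips_m$avg_price, method = "spearman")
print(rho)
```

A clearly negative rank correlation on the real data would be consistent with the pattern seen on the maps, although it would still not control for other factors such as property condition, crime rate, or tax rates.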
Living near public transit can improve people's quality of life and supports the business, economic, and social development of modern cities. Access to public transportation is therefore highly valued by people living in the New York City metropolitan area and in many other urban areas in the country. Studies throughout the U.S. also indicate that people are willing to pay more for housing near a public transit station because it reduces the time they spend commuting. According to a study examining condominiums in San Diego, distance to a transit station can be a significant predictor of property values in the city (https://www.jstor.org/stable/43081599#metadata_info_tab_contents). The same pattern appears worldwide. According to a study of the impact of urban rail systems on property values in the city of Naples, light rail, metro, and other urban rail transit systems can directly influence the attractiveness of properties near their stations (https://www.sciencedirect.com/science/article/pii/S0966692310000153).