CUNY MSDS Capstone Project

library(tidyverse)

Sales Data

Here we are refining the sales data of properties that have sold in the last 12 months.

Our initial thought was to develop an expected price per square foot in a building to segment the buildings into poor, average and wealthy, and to compare properties tax outcomes, however we have learned through our literature and domain dives that NYC essentially has a regressive property tax system where buildings tend to pay the same in property taxes if they are the same size. NYC demonstrably does not have vertical equity, wherein properties of higher value pay proportionally or progressively more, so our focus is more on horizontal equity, if two relatively similar sized properties have roughly the same market value.

Our task has shifted from determining if there are inequities in the NYC property tax system to how to tell if a specific property is under, over, or paying approximately equal to what they should be, before the application of abatements or exemptions.

To that end, this dataset may not be useful because if the policy of the DOF is to not review actual sales data in valuing class 2 properties market value while valuing them as a hypothetical rental property, then we can’t use actual sales data to contest the market value assessed by the DOF.

Load Sales Data

# Load subset Sales data
df <- readRDS("~/Documents/D698/SALES.rds")

Domain Dive

At least looking at our classic example, Block-Lot 1046-23, it looks like the only information we would need to collect is BLOCK, LOT, and SALE PRICE. The rest is duplicated in our previous databases.

If we wanted to do a rough valuation we could take the average sales price ($356,250 from the four sales for 1046-23 of $350k, $300k, $400k and $375k) and multiple it by the number of units (42) and divide by the total square footage of the building (26,959) to arrive at a price per square foot of $555.

This isn’t an accurate measure because likely the square footage of the building includes hallways, lobbies and common space not captured in our count of units, however it should still return a price-per-square-building-foot that can help distinguish between relative wealth among the buildings, albeit even if lower than the true price-per-square-apartment-foot.

For reference in NYC property tends to be between $800 to $1200 per square foot with prices near or below $800 being affordable or cheap and near or above $1200 being exclusive or luxury.

Example Sales Record

df2 <- df %>%
  filter(BLOCK == 1046, LOT == 23)

t(head(df2,1))

##                                [,1]                            
## BOROUGH                        "1"                             
## NEIGHBORHOOD                   "MIDTOWN WEST"                  
## BUILDING CLASS CATEGORY        "10 COOPS - ELEVATOR APARTMENTS"
## TAX CLASS AT PRESENT           "2"                             
## BLOCK                          "1046"                          
## LOT                            "23"                            
## EASEMENT                       NA                              
## BUILDING CLASS AT PRESENT      "D4"                            
## ADDRESS                        "315 WEST 55 ST, 1E"            
## APARTMENT NUMBER               NA                              
## ZIP CODE                       "10019"                         
## RESIDENTIAL UNITS              NA                              
## COMMERCIAL UNITS               NA                              
## TOTAL UNITS                    NA                              
## LAND SQUARE FEET               NA                              
## GROSS SQUARE FEET              NA                              
## YEAR BUILT                     "1945"                          
## TAX CLASS AT TIME OF SALE      "2"                             
## BUILDING CLASS AT TIME OF SALE "D4"                            
## SALE PRICE                     "350000"                        
## SALE DATE                      "2024-08-06"

Subsetting

Here we subset for the three fields we are interested. All we need are SALE PRICE and BLOCK & LOT for matching.

selected_vars <- c("BLOCK", "LOT", "SALE PRICE")

df3 <- df[, selected_vars]

Writing

Here we write the file for use in the subsequent merge file

# Write file
saveRDS(df3, file="sales2.rds")

CUNY MSDS Capstone Project

Taxation Equity for NYC Co-ops

PK O’Flaherty

Coding Chunk 7

2025-06-08