library(readr)
library(tidyverse)
library(readxl)

Rolling Sales Data

We should be able to subset for properties that have sold in the last 12 months to train and test our model

https://www.nyc.gov/site/finance/property/property-rolling-sales-data.page

“The Department of Finance’s rolling sales files list tax class 1, 2, and 4 properties that have sold in the last 12-month period in New York City. These files include the neighborhood, building type, square footage, and other data.”




Load Data

Here we load the sales data that we previously downloaded from the website above.

The excel file read with zero parsing issues.

df <- read_excel("~/Documents/D698/rollingsales_manhattan.xlsx", skip = 4)
#No more columns with parsing issues
# code to review parsing issues if any
#problems <- problems(df)

#problems %>%
#  distinct(col) %>%
#  pull(col)



Subset the data

Here we filter the data to only look at properties in Manhattan.

This also has a more complex tax class structure than what we’ve seen before, so we’re going to filter for tax classes that include 2 2A 2B or 2C.

# Subset the data
df2 <- df %>%
  filter(`BOROUGH` == "1",
         `TAX CLASS AT PRESENT` %in% c("2", "2A", "2B", "2C"))
# Check for non standard tax class designations
df %>%
  distinct(`TAX CLASS AT PRESENT`) %>%
  pull(`TAX CLASS AT PRESENT`)
## [1] "1"  "2B" "2"  "2A" "2C" "4"  "1C" "1A"



Write Data

Here we save the data for later

#No more writing to csv
#write.csv(df2, file="PVADex.csv")

saveRDS(df2, file="SALES.rds")



Reference

Here are the fields for reference

t(head(df, n=1))
##                                [,1]                     
## BOROUGH                        "1"                      
## NEIGHBORHOOD                   "ALPHABET CITY"          
## BUILDING CLASS CATEGORY        "01 ONE FAMILY DWELLINGS"
## TAX CLASS AT PRESENT           "1"                      
## BLOCK                          "376"                    
## LOT                            "43"                     
## EASEMENT                       NA                       
## BUILDING CLASS AT PRESENT      "S1"                     
## ADDRESS                        "743 EAST 6 STREET"      
## APARTMENT NUMBER               NA                       
## ZIP CODE                       "10009"                  
## RESIDENTIAL UNITS              "1"                      
## COMMERCIAL UNITS               "1"                      
## TOTAL UNITS                    "2"                      
## LAND SQUARE FEET               "2090"                   
## GROSS SQUARE FEET              "3680"                   
## YEAR BUILT                     "1940"                   
## TAX CLASS AT TIME OF SALE      "1"                      
## BUILDING CLASS AT TIME OF SALE "S1"                     
## SALE PRICE                     "0"                      
## SALE DATE                      "2025-01-23"