library(tidyverse)

PLUTO Data

Here we do a domain dive on the PLUTO data and produce a data dictionary. We’ll also reduce the number of columns so that only meaningful fields carry into the subsequent merge.

The PLUTO (“Primary Land Use Tax Lot Output”) data file compiles information from the Departments of City Planning (DCP), Finance (DOF), and Citywide Administrative Services (DCAS) and the Landmarks Preservation Commission (LPC) in one place. It covers characteristics of the Tax Lots, Buildings and Geographic/Political/Administrative Districts.

If the DOF used this information to calculate Market Total Value, it should already be in the DOF’s Property Valuation Assessment Data file. Our goal here is therefore to look for any features correlated with assessed Market Total Value that can help argue whether a DOF assessment is too high or too low.

I read the April 2025 PLUTO README document (25v1.1), and the only takeaway was this: a tax lot is normally a parcel of land, but for condominiums each condo is its own tax lot. For consistency and comparability between buildings, all condo tax lots in a given condo complex are consolidated into one record.




Load PLUTO Data

Since we saved the subset PLUTO data from Code Chunk 3 as an .rds file, it’s easy to load.

# Load subset PLUTO data
df <- readRDS("~/Documents/D698/PLUTO.rds")



Domain Dive

The PLUTO database has longitude and latitude, so we could add a later-stage model that looks at nearby properties to reinforce Market Total Value assessments. I question this method in a city where a one-dollar pizza place sits next to a $40 pizza place, which sits next to a $300-per-person omakase tasting menu. Still, it should have some impact, especially since propinquity is part of the verbal description of the DOF’s otherwise opaque methodology for assessing market value. Note that the longitude and latitude reported are WGS 84 (World Geodetic System 1984).
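
As a rough illustration of the "nearby properties" idea, here is a minimal sketch that computes great-circle (haversine) distances from WGS 84 latitude and longitude and keeps lots within a radius. The function name `haversine_m`, the 250 m radius, and the two extra lots are assumptions made up for this example; only the first row uses the example lot from the data dictionary.

```r
# Great-circle (haversine) distance in metres between WGS 84 points.
# radius_m is the mean Earth radius; the function name is hypothetical.
haversine_m <- function(lat1, lon1, lat2, lon2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_m * asin(pmin(1, sqrt(a)))
}

# Toy data: row 1 is the example lot from the data dictionary;
# rows 2 and 3 are made-up coordinates for illustration only.
lots <- data.frame(
  bbl = c("1010460023", "hypothetical_A", "hypothetical_B"),
  latitude = c(40.76597, 40.76650, 40.77500),
  longitude = c(-73.98455, -73.98400, -73.97000),
  stringsAsFactors = FALSE
)

# Distance from the first lot to every lot in the table
lots$dist_m <- haversine_m(lots$latitude[1], lots$longitude[1],
                           lots$latitude, lots$longitude)

# Neighbours within an assumed 250 m radius (the threshold is arbitrary)
neighbours <- lots[lots$dist_m > 0 & lots$dist_m <= 250, ]
```

On the full data a pairwise comparison like this would be slow, so a spatial index or a grid lookup would probably be needed; that choice is left open here.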

As an alternative to longitude and latitude, we have x and y coordinates on the New York-Long Island State Plane coordinate system; they mark the point within the lot closest to the lot’s centroid. We could also center them on their means, so that (0,0) is the average location, for interpretability.
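
Centering on the means could look like the sketch below. The first row uses the example lot's `xcoord` and `ycoord` from the data dictionary; the other rows and the new column names (`x_c`, `y_c`, `dist_from_centre`) are made up for illustration. Because State Plane coordinates are planar, a plain Euclidean distance works, in whatever units the coordinates use (feet, as we understand the PLUTO documentation, though that is worth verifying).

```r
# Toy data: row 1 is the example lot from the data dictionary;
# rows 2 and 3 are made-up values for illustration only.
lots <- data.frame(
  xcoord = c(988531, 989031, 988031),
  ycoord = c(218341, 218841, 217841)
)

# Centre on the sample means so that (0, 0) is the average location
lots$x_c <- lots$xcoord - mean(lots$xcoord)
lots$y_c <- lots$ycoord - mean(lots$ycoord)

# Planar (Euclidean) distance from the centre, in the coordinates' own units
lots$dist_from_centre <- sqrt(lots$x_c^2 + lots$y_c^2)
```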

I started to compile a fourth column recording which department each PLUTO field came from, to help distinguish which fields are duplicates. However, the Department of City Planning credited itself for a lot of DOF fields, so we’ll proceed with a selection based on our domain dive.

For our purposes with Class 2 properties, Building Classes should map to either C or D, and we expect whether the building is a walk-up or an elevator building to have a big impact on the assessed market value.

C. WALK UP APARTMENTS
0. Three Families
1. Over Six Families Without Stores
2. Five to Six Families
3. Four Families
4. Old Law Tenements
5. Converted Dwelling or Rooming House
6. Cooperative
7. Over Six Families with Stores
8. Co-Op Conversion from Loft/Warehouse
9. Garden Apartments
M. Mobile Homes/Trailer Parks

D. ELEVATOR APARTMENTS
0. Co-op Conversion from Loft/Warehouse
1. Semi-fireproof (Without Stores)
2. Artists in Residence
3. Fireproof (Without Stores)
4. Cooperatives (Other Than Condominiums)
5. Converted
6. Fireproof with Stores
7. Semi-Fireproof with Stores
8. Luxury Type
9. Miscellaneous
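
Since the first character of `bldgclass` carries the walk-up versus elevator distinction, a simple indicator can be derived from it. The sketch below is one way to do that, not necessarily how the later model will encode it; the variable names and the `"other"` bucket are assumptions. Note that this first-character rule would also label CM (Mobile Homes/Trailer Parks) as walk-up, since it sits under C in the list above.

```r
# Example building class codes: "D4" is the data dictionary example,
# the rest are illustrative values
bldgclass <- c("D4", "C6", "C1", "D8", "R4", NA)

# The first character is the class family: C = walk-up, D = elevator
family <- substr(bldgclass, 1, 1)

# Anything outside C or D goes to "other"; missing codes stay NA
apt_type <- ifelse(family == "C", "walk_up",
                   ifelse(family == "D", "elevator", "other"))
apt_type <- factor(apt_type, levels = c("walk_up", "elevator", "other"))

# Quick frequency check, including any missing values
table(apt_type, useNA = "ifany")
```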




Data Dictionary

Column Example Description
borough “MN” Borough, ‘MN’ is for Manhattan
block “1046” Tax Block
lot “23” Tax Lot (consolidated for condos)
cd “104” Community district (59 in NYC and 12 in Manhattan, 104 is cd#4)
bct2020 “1013900” Census tract 2020
bctcb2020 “1.0139e+10” Census block 2020
ct2010 “139” Census tract 2010
cb2010 “5000” Census block 2010
schooldist “02” School district
council “6” City council district (#6 represented by Gale Brewer)
zipcode “10019” Zip code
firecomp “E040” Fire company (Engine 40)
policeprct “18” Police precinct
healthcenterdistrict “15” Health center district
healtharea “4600” Health area
sanitboro “1” Sanitation district boro
sanitdistrict “04” Sanitation district number
sanitsub “3A” Sanitation subsection
address “315 WEST 55 STREET” Address
zonedist1 “R8” Zoning District 1
zonedist2 NA (Residence, Commercial or Manufacturing)
zonedist3 NA higher numbers are higher density
zonedist4 NA Additional ZD is for a lot divided by zoning boundary lines
overlay1 NA Commercial Overlay
overlay2 NA For when C-zoning is allowed in R-zoning
spdist1 “CL” Special Purpose District
spdist2 NA “CL” is for Clinton District
spdist3 NA Created to preserve residences from new development
ltdheight NA Limited height district
splitzone “N” Split boundary indicator
bldgclass “D4” Building Class
landuse “03” Land use category (03 is Multi-Family Elevator Buildings)
easements “0” Number of easements
ownertype NA Type of ownership code
ownername “315 W 55TH OWNERS CORP” Owner name
lotarea “5725” Lot area
bldgarea “26959” Total building floor area
comarea “1000” Commercial floor area
resarea “25959” Residential floor area
officearea “1000” Office floor area
retailarea “0” Retail floor area
garagearea “0” Garage floor area
strgearea “0” Storage floor area
factryarea “0” Factory floor area
otherarea “0” Other floor area
areasource “2” Total building floor area source code
numbldgs “1” Number of buildings
numfloors “7” Number of floors
unitsres “42” Residential Units
unitstotal “44” Total units
lotfront “57” Lot frontage
lotdepth “100.42” Lot depth
bldgfront “57” Building frontage
bldgdepth “71” Building depth
ext “N” Extension code
proxcode “3” Proximity code (3 means attached to neighboring buildings)
irrlotcode “N” Irregular lot code
lottype “5” Lot type (“5”: lot has frontage on only one street)
bsmtcode “1” Basement type/grade
assessland “311850” Assessed land value
assesstot “3518550” Assessed total value
exempttot “17200” Exempt total value
yearbuilt “1945” Year built
yearalter1 “1972” Year altered
yearalter2 “0” Year altered 2
histdist NA Historic District Name
landmark NA Landmark Status
builtfar “4.71” Built floor area ratio (total building floor area/lot area)
residfar “6.02” Maximum allowable residential FAR (floor area ratio)
commfar “0” Maximum allowable commercial FAR (floor area ratio)
facilfar “6.5” Maximum allowable community facility FAR
borocode “1” Boro Code
bbl “1010460023” Borough, tax block & lot
condono NA Condominium number
tract2010 “0139” Census tract 2
xcoord “988531” x coordinate in the NY-LI State Plane Coordinate System
ycoord “218341” y coordinate
zonemap “8c” Zoning Map #
zmcode NA Zoning Map Code
sanborn “106W022” Sanborn Map #
taxmap “10403” Tax Map #
edesignum NA E-designation number
appbbl NA Apportionment BBL (before the lot was apportioned into condos)
appdate NA Apportionment Date
plutomapid “1” PLUTO - DTM Base map indicator
firm07_flag NA 2007 Flood Insurance rate map indicator
pfirm15_flag NA 2015 preliminary flood insurance rate map indicator
version “25v1.1” version number
dcpedited NA changed by DCP (City Planning changed any field values for this tax lot)
latitude “40.76597” Latitude (WGS 84)
longitude “-73.98455” Longitude (WGS 84)
notes NA Notes (specific to Inwood)
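
The Column and Example parts of a dictionary like this can be generated from the data, leaving only the descriptions to fill in by hand. The sketch below is a minimal version under stated assumptions: `toy` is a small stand-in for the PLUTO data frame, `first_non_na` is a hypothetical helper, and the example shown for each column is simply its first non-missing value.

```r
# Toy stand-in for the PLUTO data frame (values taken from the dictionary above)
toy <- data.frame(
  borough = "MN", block = "1046", lot = "23", zonedist2 = NA_character_,
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-missing value of a column, as text
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else as.character(x[1])
}

# One row per column; Description is left blank to be written by hand
dictionary <- data.frame(
  Column = names(toy),
  Example = unname(vapply(toy, first_non_na, character(1))),
  Description = "",
  stringsAsFactors = FALSE
)
```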



Subsetting

Here we subset for the 32 fields we are interested in. We’re primarily concerned with ascertaining relative locations of properties, as well as fields that could affect value, such as Landmark Status or possibly School District.

selected_vars <- c("block", "lot", "cd", "bct2020", "bctcb2020", "schooldist", "council", "firecomp", "policeprct", "healthcenterdistrict", "healtharea", "zonedist1", "spdist1", "ltdheight", "bldgclass", "landuse", "easements", "histdist", "landmark", "builtfar", "residfar", "commfar", "facilfar", "bbl", "condono", "tract2010", "xcoord", "ycoord", "firm07_flag", "pfirm15_flag", "latitude", "longitude")

df2 <- df[, selected_vars]
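
Subsetting by name stops with an "undefined columns selected" error if any name is misspelled or absent in a given PLUTO release. A small guard like the sketch below reports which names are missing before subsetting. The data frame `toy` and the vector `wanted` (with its deliberately misspelled `landmrk`) are made up for illustration.

```r
# Toy stand-in for the PLUTO data frame
toy <- data.frame(block = "1046", lot = "23", bbl = "1010460023",
                  stringsAsFactors = FALSE)

# Hypothetical selection that includes one misspelled column name
wanted <- c("block", "lot", "bbl", "landmrk")

# Names requested but absent from the data
missing_vars <- setdiff(wanted, names(toy))
if (length(missing_vars) > 0) {
  message("Not found in data: ", paste(missing_vars, collapse = ", "))
}

# Subset only on the names that exist; drop = FALSE keeps a data frame
toy2 <- toy[, intersect(wanted, names(toy)), drop = FALSE]
```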



Writing

Here we write the file for use in the subsequent merge.

# Write file
saveRDS(df2, file="pluto2.rds")
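
If the subsequent merge keys on `bbl` (an assumption; the key is not chosen in this section), it may be worth confirming that `bbl` agrees with its parts. A BBL is a 10-digit identifier: a 1-digit borough code, a 5-digit block, and a 4-digit lot. The helper `make_bbl` below is hypothetical and is checked against the example lot from the data dictionary.

```r
# Hypothetical helper: build a 10-digit BBL from its parts
# (1-digit borough code, 5-digit zero-padded block, 4-digit zero-padded lot)
make_bbl <- function(borocode, block, lot) {
  sprintf("%d%05d%04d",
          as.integer(borocode), as.integer(block), as.integer(lot))
}

# Example lot from the data dictionary: borocode 1, block 1046, lot 23
bbl_from_parts <- make_bbl("1", "1046", "23")

# If bbl was read as a number, format it without scientific notation
bbl_as_text <- sprintf("%.0f", 1010460023)

# TRUE when the stored bbl matches the one rebuilt from its parts
bbl_matches <- bbl_from_parts == bbl_as_text
```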