West 3rd and MacDougal St



Abstract:




Introduction:

We want to build a model that predicts assessed tax values and compare its predictions to the actual tax values. If the richer co-ops have larger residuals (the gap between actual and predicted values) than the mid and poor co-ops, that suggests they are underassessed.

It could also be that there is systemic bias (?). For example, NYC uses statistical methods to produce assessed property values for tax purposes. That may tend to lump poor, mid and rich co-ops together, so that poor buildings pay a disproportionately larger share of taxes and rich buildings pay a disproportionately lower share. Basically, we could ask whether the statistical methods used by the NYC DOF lump all buildings into the mid rich building category.

Some apartments have a low property value because the maintenance is so high: maybe there is a large underlying mortgage on the building itself and the interest on it makes up a big proportion of the maintenance, or it’s a building with a large staff. If the sale price of individual units alone is used to produce assessed tax values for whole buildings, then buildings with heavily leveraged underlying mortgages or with large staff would be underpaying taxes.

Why is this important? Tax inequities drive people to game the system to benefit themselves…

Property taxes are over a trillion dollars every year with high percent increases every year.




Literature Review:

I started out with the premise that sophisticated owners were able to successfully appeal property tax assessments. As I learned about the NYC Department of Finance’s methodology for calculating taxes, I assumed some of the over/under assessment for poor/rich buildings would be due to taking statistical averages. The literature review supported these ideas and suggests that my contribution is in educating the reader on how to understand property taxes in NYC and helping them determine whether they should appeal their property taxes.

I restricted articles to those written after 1981 when the current rules for taxation were last codified in NYC.

📌📌📌
What data sets, methodology and models did they use?
📌📌📌

Synthesis of Past Research

I think it makes sense that rentals pay more taxes than home owners - rental units are businesses that generate more money than a home you live in and if you don’t live in your own home you don’t receive tax abatements.

The DOF uses artificially high capitalization rates. Maybe this causes people to think they are getting a deal, and they are scared to inquire because they don’t want a reassessment to go against their interest.

The DOF gets Net Operating Income from owner-reported financials. It then uses a modeling process to correct for underreporting, missing data and inconsistency.

NOI is reported gross income minus reported expenses (other than property taxes). lol, then at least my building’s NOI = property tax * 1.05.

So one of the papers addressed vertical and horizontal inequities and how we’re stuck using cap rates by asset type.

The capitalization rate is net operating income divided by market value.

So for example, say my building has a net operating income of $1,000,000 and assume that for its class of asset the capitalization rate is 0.18; then the market valuation for my building is $5,555,555. However, if they used actual market data, the capitalization rate might be 0.05, giving a fair market valuation of $20,000,000.
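The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the income-capitalization formula (value = NOI / cap rate); the function name `market_value` is illustrative and both cap rates are the assumed figures from the example, not DOF-published values.

```python
def market_value(noi: float, cap_rate: float) -> float:
    """Income-capitalization estimate: market value = NOI / cap rate."""
    if cap_rate <= 0:
        raise ValueError("cap_rate must be positive")
    return noi / cap_rate


noi = 1_000_000  # net operating income from the example above

value_high_cap = market_value(noi, 0.18)  # assumed class-wide cap rate
value_low_cap = market_value(noi, 0.05)   # assumed market-derived cap rate

print(f"cap rate 0.18 -> ${value_high_cap:,.0f}")
print(f"cap rate 0.05 -> ${value_low_cap:,.0f}")
print(f"valuation gap: {value_low_cap / value_high_cap:.1f}x")
```

Because the cap rate sits in the denominator, a higher cap rate always produces a lower valuation for the same income.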

One of the papers talks about the low legal salience of overassessed taxation, where it’s hard for someone to understand or quantify the financial injury of overtaxation and so they never appeal their overtaxation in court. This high capitalization rate might be one way the city does that. Owners look at their statements, see the low market valuation for their property, and are scared to contact the DOF for clarification because they mistakenly believe they are underassessed.

The City acknowledges that there are taxation inequities and divides them into horizontal inequity (two similarly priced buildings have different assessed values) and vertical inequity (where a high-priced building has a proportionately lower assessment than a low-priced building). These inequities arise because capitalization rates don’t truly reflect market value.

OK, this is a really subtle point about horizontal inequities. It sounds like there are outdated cap rates, but that would only affect comparisons between different building classes. There are also reporting issues.

For instance, you and I could both own rental buildings next to each other. If yours is better managed, more aesthetic, and attracts tenants paying higher rents, you will have a higher Net Operating Income (NOI) than me and a higher assessed value than me. Capitalization rates aren’t tied to how efficiently buildings use their capital, and we can’t assign different capitalization rates to two buildings with identical property characteristics without violating equal protection under the law. Now let’s say you don’t report your income and the city applies an average to determine yours, so you end up with the same assessed value as me: now we have horizontal inequity. (If a building doesn’t report there should be a penalty above the average, but how would they detect false reporting?)

Another aspect that could distort sale price and market value is the resale price caps that exist for some co-ops in NYC. (The linked article, “Brozan”, is from 2002, so the prevalence of resale price caps may have diminished.) Also on the theme of complexity in the calculation, there was a (currently hidden) paper asking why, if condos are more valuable than coops, coops persist when they could convert into condos. Say a 600k coop unit would be worth 800k as a condo; it remains to be seen whether that valuation difference between coops and condos affects pricing. From attending the CNYC’s 2025 annual meeting I found out that there is a lot of legislation for rental buildings that applies to coops but not condos, so it could be that the city is trying to disincentivize coops, or to squeeze money out of coops through increased compliance requirements and the greater expenses needed to meet them. It may be that coops can’t convert to condos without paying off their underlying mortgage, and other hurdles mean Albany has a captive target.

We could abandon the system and switch to market valuation, but that would have to come from Albany writing sweeping property tax legislation to replace the 1981 law.

The IBO suggests small changes to the cap rates to be more fair: use median instead of average transactions to bring down the artificially high capitalization rates, and to adjust cap rates within property types, for example use a slightly lower cap rate for office buildings in prime midtown versus a low-tier office neighborhood.

However, they can only adjust capitalization rates based on property characteristics (such as building type, location, and usage), not the wealth of the owners. If the DOF tried to tweak cap rates based on perceived market outcomes, it could invite lawsuits for violating equal protection or property rights.

There are some advantages to this system. If we used true market valuation, people could be rapidly pushed out of their homes due to rising property taxes in areas of gentrification. (There must be some redress: if there is gentrification, they can lower the property tax rate so they don’t collect 30% more from an area in one year.)

Past Research Descriptions


Large Investors

Xiao, S. W. (2022). Investor Scale and Property Taxation.
DOI Link

Identifies that larger investors (100+ units owned) have a tax assessment discount compared to small investors and individual home owners. They found the large investor tax assessment discount was larger in areas with any one of three characteristics: 1) A high tax burden, 2) a high concentration of large investors, and 3) a fairer property tax administration.

📌📌📌
How did they assess fairer? And wouldn’t a fairer property tax administration decrease the discount for large investors?
📌📌📌


Unpredictable Assessments

Goor, R. M. (2017). Only the little people pay taxes.
URI Link

Finds that the NYC DOF deviates significantly from its publicized process when calculating property taxes and that property taxes are poorly correlated with land, market and assessed values.

A lot of Class 2c buildings are luxury condos and the DOF estimates rental income by looking at the income of nearby buildings, but many of them have rent controls leading to deep undervaluations for luxury condos.

Property tax exemptions are granted for people’s ability to organize and lobby the state legislature, not because it is sound tax policy.

Homeowners comprise 46% of the estimated market value while paying 15% of the total property tax burden, while rental properties comprise 24% but pay 37% of the property tax. This is 4.73 times more tax paid per dollar of market value for a rental building than for a homeowner.
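The 4.73 figure follows directly from the four shares quoted above. The sketch below divides each group's share of the tax burden by its share of market value and then compares the two; the function name `burden_per_value` is illustrative, not taken from Goor's paper.

```python
def burden_per_value(share_of_tax: float, share_of_value: float) -> float:
    """Tax share carried per unit of market-value share."""
    return share_of_tax / share_of_value


homeowner_burden = burden_per_value(share_of_tax=0.15, share_of_value=0.46)
rental_burden = burden_per_value(share_of_tax=0.37, share_of_value=0.24)

# How many times more tax a rental pays per dollar of market value.
rental_vs_homeowner = rental_burden / homeowner_burden
print(f"{rental_vs_homeowner:.2f}")
```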


Overview

Berry, C. (2021). An Evaluation of the Residential Property Tax Equity in New York City PDF Link

📌📌📌
Review for an overview of how assessments work and what he found.
📌📌📌


Inequity Fix

NYC Independent Budget Office (2022). Does NYC’s Method for Assessing Commercial Property Values Result in Inequities PDF Link

The DOF uses high capitalization rates (cap rates) that result in lower market valuations. Since the DOF doesn’t use actual market data this can lead to tax inequities.

The Independent Budget Office (IBO) confirmed there are horizontal inequities (where similarly priced properties have different assessments) and vertical inequities (where wealthier property owners pay proportionally less in property taxes).

They can’t change from cap rates to fair market value because they operate under the S7000A law from 1981, however they could lower cap rates for high value properties and raise them for low-value properties to narrow vertical inequities.


Hodge, T. R., Komarek, T. M., & McAllister, A. (2024). A Double Negative: Capitalizing on Assessment Regressivity. DOI Link

Found overassessed lower value properties sold at a 13% discount and underassessed higher value properties sold for a 10% premium. This tax inequity drives a widening of the wealth gap.


Nadine Brozan (2002) For Co-op Complexes, Complex Choices NYT Archival Link

Some NYC co-ops have resale price caps which distort sale price and market value.


Scanlon and Cohen (2011) Distribution of the Burden of New York City’s Property Tax - The Furman Center for Real Estate & Urban Policy PSU Link

Class 1 properties (1-3 unit family homes) have an effective tax rate of 0.67% compared to the 3.31% of Class 2 (co-ops, condos and rentals) or the 3.85% of Class 4 (commercial/industrial). So while Class 1 properties account for 49% of property value, they contribute only 15% of tax revenue.

Both Class 1 and Class 2 buildings have renters but vastly more renters live in Class 2 housing resulting in renters paying significantly more of passed through property tax than the fewer, wealthier Class 1 renters.

In 1975, ‘Hellerstein v. Assessor of Islip’ determined that assessing property at a fraction of full market value for property tax purposes was illegal. Complying would have dramatically shifted the property tax burden from commercial properties onto homeowners, so the New York State Legislature, to prevent significant disruptions, wrote S. 7000A into law in 1981, which is still our taxation system today.

While there are Co-op & Condo tax abatements to bring Class 2 property tax closer in line to Class 1, these abatements mostly benefit owners, not renters.

Confirmed wealthier properties as more likely to appeal assessments, and more likely to win.


Furman Center Policy Brief (2013) Shifting the Burden: Examining the Undertaxation of Some of the Most Valuable Properties in New York City - The Furman Center for Real Estate & Urban Policy
Furman Center Link

Section 581 of New York’s Real Property Tax Law values Class 2 properties as if they were rental properties even though they don’t generate rental income. This leads to inaccurate assessments especially for high-value properties because no true rental comparables exist, especially when compared to rent-regulated buildings.

For example, in 2012, 50 individual co-op units sold for more than their entire building’s official DOF market value estimate.

Also, in the four-class property tax system the city specifies how much property tax will be assessed from each class and sets the rates accordingly. When high-value properties are underassessed it leads to larger increases on low and mid-value properties, shifting the tax burden on to poorer people. This is clearly a regressive taxation system. The city can only pick one rate that when applied to everyone’s assessed property tax value produces the amount of taxes they need from Class 2. If my neighbor appeals and gets a lower assessed property tax value then the city has to pick a higher tax rate for everyone in Class 2 to get the same amount of property taxes from Class 2.
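The rate-setting mechanism described above can be illustrated with a toy class of three buildings. All figures here are hypothetical (the levy, the building names, and the assessed values are made up for illustration); the only thing taken from the text is the rule that one rate must raise a fixed levy from the class.

```python
def class_tax_rate(levy: float, assessed_values: dict) -> float:
    """Single rate that raises `levy` from the class's total assessed value."""
    return levy / sum(assessed_values.values())


levy = 300_000  # hypothetical amount the city needs from this class

before = {"my_building": 1_000_000, "neighbor": 1_000_000, "third_building": 1_000_000}
# Hypothetical: the neighbor wins an appeal that halves their assessed value.
after = dict(before, neighbor=500_000)

rate_before = class_tax_rate(levy, before)
rate_after = class_tax_rate(levy, after)

my_tax_before = before["my_building"] * rate_before
my_tax_after = after["my_building"] * rate_after

print(f"rate: {rate_before:.2%} -> {rate_after:.2%}")
print(f"my tax: ${my_tax_before:,.0f} -> ${my_tax_after:,.0f}")
```

My assessed value never changed, yet my bill rises because the same levy is spread over a smaller assessed base.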


Lee, Lizzie (Yea Won) (2023) Evaluating the ‘Road to Reform’ for New York City’s Property Tax System
DOI Link

Supports machine learning (XGBoost) as a viable tool for improving fairness and transparency in tax assessments; was able to flag valuation anomalies.

Characterized NYC’s property tax system as extremely regressive, with low-income neighborhoods often facing higher effective tax rates. Agrees with the 2021 final report of the New York City Advisory Commission on Property Tax Reform that Class 2 should be split into buildings with 10 or fewer units and buildings with more than 10 units, and that the city should switch to a sales-based market valuation. (A concern with sales-based market valuation is that it accelerates gentrification.)

Suggested NYC may be reluctant to reform the property tax because it makes up one third of NYC revenue, or because of fear of high-income outmigration, and encourages NYC to lower the share of revenue it takes from property. Discourages carve-out plans, which lower revenue in the hope of stimulating construction but have mixed results.

The paper supported issues I was running into on the data side, that the DOF’s methodology is not disclosed or reproducible and so it’s difficult to verify if a DOF valuation is accurate or how it compares to equivalent properties or equity between property types.

The New York City Advisory Commission on Property Tax Reform used quantiles to group properties together to understand impact but that might mean an $800k home and an $80M luxury penthouse might both be in the same top quantile with drastically different impacts. The Commission didn’t specify their quantiles but it could be treating the top 1% or the top 0.1% the same as the top 25%.

This paper also clarified the terminology of horizontal and vertical equity. Horizontal equity means two people in two equal homes should pay equal property tax. Vertical equity, to me, would be a third person in a more expensive home paying the same percentage of property tax as the first two people. From some perspectives, though, an equal percentage puts a disproportionate tax burden on the first two, and a more equal outcome is achieved with a progressive property tax system. For example, if the first two homes are worth 1M and the third is worth 10M, then instead of the tax rate being 1% on everything, it could be 1% on the first million and 2% on value above the first million.
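To make the flat versus progressive comparison concrete, here is a minimal sketch using the 1% and 2% rates and the 1M threshold from the example above. The function names are illustrative, and the two-bracket schedule is just the example's hypothetical, not an actual NYC proposal.

```python
def flat_tax(value: float, rate: float = 0.01) -> float:
    """Same rate on every dollar of value."""
    return value * rate


def progressive_tax(value: float, threshold: float = 1_000_000,
                    base_rate: float = 0.01, top_rate: float = 0.02) -> float:
    """base_rate up to the threshold, top_rate on value above it."""
    below = min(value, threshold)
    above = max(value - threshold, 0)
    return below * base_rate + above * top_rate


for home_value in (1_000_000, 10_000_000):
    flat = flat_tax(home_value)
    prog = progressive_tax(home_value)
    print(f"${home_value:,}: flat ${flat:,.0f} ({flat / home_value:.2%}), "
          f"progressive ${prog:,.0f} ({prog / home_value:.2%})")
```

Under the progressive schedule the 1M homes pay exactly what they paid under the flat rate, while the 10M home's effective rate rises toward 2%.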


Extended Bibliography

New York City Independent Budget Office (?). The Coop/Condo Abatement and Residential Property Tax Reform in New York City Link

New York City Independent Budget Office (2006). Twenty-Five Years After S7000A: How Property Tax Burdens Have Shifted in New York City Link

New York City Independent Budget Office (2013). The Coop & Condo Tax Break Has Expired, Giving Albany Chance for Long-Promised Fix Link

New York City Independent Budget Office (2022). Does NYC’s Method for Assessing Commercial Property Values Result in Inequities Link

Shi, Boicourt, Ng, et al. (2024). An Assessment of NYC Cooperative Housing’s Climate Vulnerability and Barriers to Adaptation Link

Cetrino, Benjamin (2014) Classification of Property for Taxation in New York State Link




Domain Dive

NYC Dept of Finance What are class 2 properties?

“we use statistical modeling to calculate the typical income and expenses for properties similar to yours in size, location, age, and number of units. The process varies depending upon whether your property has more or less than 10 units.”

For class 2 they don’t look at recent sales data; they look at operating income.

If they go off of true operating income, then a resident-owner building is going to have significantly less operating income, because it only collects maintenance and not rent.
Operating income: $1,000,000
Capitalization rate: 0.18
Market valuation (OI/CR): $5,555,555
Assessment ratio: 0.45
Assessment: $2,500,000
Tax rate: 12.86% (12.855% before rounding)
Assessed tax: $321,375
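The chain of calculations in this example can be sketched as one function. This is a reading of the worked numbers above, not the DOF's published procedure, and the function name `assessed_tax` is illustrative. It uses a tax rate of 12.855%, which is the rate implied by dividing the $321,375 tax by the $2,500,000 assessment; a rate of exactly 12.86% would give $321,500 instead.

```python
def assessed_tax(operating_income: float, cap_rate: float,
                 assessment_ratio: float, tax_rate: float):
    """Return (market value, assessed value, tax) for the income approach."""
    market_value = operating_income / cap_rate
    assessed_value = market_value * assessment_ratio
    tax = assessed_value * tax_rate
    return market_value, assessed_value, tax


market_value, assessed_value, tax = assessed_tax(
    operating_income=1_000_000,
    cap_rate=0.18,
    assessment_ratio=0.45,
    tax_rate=0.12855,  # assumption: unrounded rate implied by the example
)

print(f"market valuation: ${market_value:,.0f}")
print(f"assessment:       ${assessed_value:,.0f}")
print(f"assessed tax:     ${tax:,.0f}")
```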

📌Out of place📌 If assessments were based on sale price data then luxury units with large staff may have lower purchase prices but higher maintenance.




Research Question:

Richer co-op buildings may pay lower property taxes than expected because they tend to have sophisticated owners who can appeal their assessed values and win.

Hypothesis Statement
What we’re trying to do is model property tax assessments for Co-ops in NYC. By comparing the output of the model to actual assessments we’re able to look for properties with a high difference between expected and actual taxes. Our theory is that some particularly rich buildings will be paying lower taxes than we’d expect due to the sophistication of the owners who were able to appeal and reduce their assessed property taxes.

Null Hypothesis
Assessed property taxes are fair and there is no advantage afforded to a few socioeconomically sophisticated buildings. We would confirm that by showing that residuals are random, normally distributed, and not moderated by a building being identified as high-end. Another way to look at it: if the null hypothesis is false, we would expect the rich buildings to have a higher variability in assessed taxes than the not-rich buildings.




Data and Variables:

We’ve found NYC property tax data from NYC Open Data. Discuss what I’ve found.

I’m a little concerned that I need to do individual building valuations and that there are multiple buildings on any block/lot combination, with no way to primary key individual buildings across the publicly available NYC data.

I also need to do a better domain dive of what NYC discloses of their methodology

What we don’t have is a good sophistication flag. I tried pulling educational attainment from the US Census Bureau at the zip code level, but my zip code in NYC has tens of thousands of buildings (citation needed). GH suggested I look at how granular that data is, maybe by voting precinct. I think the sophistication flag is price per square foot for the average apartment in that building: less than $1200 is low, $1200-$1300 is mid, and above $1300/sqft is high. I could use a decision tree to come up with better cutoffs.
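The proposed flag is simple enough to write down as a function. This sketch uses the provisional $1200 and $1300 cutoffs from the paragraph above; the function name `sophistication_tier` is illustrative, and treating exactly $1200 and exactly $1300 as "mid" is an assumption about how the boundaries should fall.

```python
def sophistication_tier(price_per_sqft: float,
                        low_cutoff: float = 1200,
                        high_cutoff: float = 1300) -> str:
    """Bucket a building by average price per square foot."""
    if price_per_sqft < low_cutoff:
        return "low"
    if price_per_sqft <= high_cutoff:
        return "mid"
    return "high"


# Hypothetical average prices per square foot for three buildings.
for price in (950, 1250, 1800):
    print(price, sophistication_tier(price))
```

Keeping the cutoffs as parameters makes it easy to swap in values learned from a decision tree later.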

Could possibly do some data visualization with an NYC map.

There are 71k 4+ residence units (over how many years?) in the Manhattan zip code of 10019 alone. Zip code doesn’t give us the granularity or resolution we need to view wealth differences in individual buildings.

We’re also having an issue with block and lot number not being granular enough. On any given lot in a block there are 1-10 buildings so we need a building ID code that’s consistent across datasets

📌 Compare market valuations to sale price ratios

DATA EXPLORATION

Three values: 1) the model’s estimated value, 2) the actual value as provided in our data, 3) the formulaic model. We went to the data to look at this. Not sure if the TotVal column is the assessed tax. I can narrow it down to look at my building and compare.

Describe each dataset and where it comes from, with proper citation. Find a list from the NYC DOF with the variable name descriptions. Narrow the number of records to Taxtype=2. Can narrow the database to Manhattan only, and that could be enough for the project.

But could even initially narrow the database to a particular zipcode in the beginning so that I can build an end to end model with the smallest amount of data and then build up from there.

I also want to use Census data to get average income per zip code to predict value based on zip code… except NYC is so dense. You have a $1 pizza shop next to a $20 a person Italian restaurant next to a $200 a person Italian restaurant. If you charged all three an average tax based on earnings potential, knowing only the average income of the street, the $1 pizza shop would be disadvantaged and the $200/person Italian restaurant would be advantaged.

Also, check the warnings and issues from read_csv() to see if there is anything that would give us pause and make us want to find a different route of processing. For example, if there is a parsing issue with a specific delimiter…

Which columns are of importance to us? The basic values we’ll use in our regression. Also -> run missing statistics on the columns.
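A first pass at the missing statistics could look like the sketch below, which counts blank and NA cells per column using only the standard library. The first data row reuses values from the sample record in this section; the other two rows and the helper name `missing_counts` are hypothetical, and the real file would be read from disk rather than from an inline string.

```python
import csv
import io

# First row mirrors the sample record; the other rows are made up.
SAMPLE_CSV = """BBLE,TAXCLASS,FULLVAL,AVTOT,POSTCODE
1000163859,2,354180,159381,NA
1000160001,2,NA,120000,10280
1000160002,2,410000,,10280
"""


def missing_counts(rows, na_tokens=("", "NA")):
    """Count cells per column that are empty or hold an NA token."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value is None or value.strip() in na_tokens:
                counts[column] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing = missing_counts(rows)
for column, n_missing in missing.items():
    print(f"{column}: {n_missing} of {len(rows)} missing")
```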

Steps for next time I look this up: the basic regression I want to run; which columns I want to focus on; pick one of the literature review links.

Data Display

BLDGCL: what do these values mean (R0, D8, C6, R4)? I need to pull out a data dictionary.

I should be putting my summary of work here and this more detailed work in another document

LTFRONT and LTDEPTH to get lot footprint; do the same for building footprint (BLDFRONT and BLDDEPTH) and STORIES to get square footage of building.
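A minimal sketch of that footprint calculation follows. It assumes rectangular lots and buildings and that the front and depth columns share the same length unit, neither of which is confirmed by a data dictionary yet. The function names and the example dimensions are hypothetical. Zero dimensions, as in the sample record in this section, are treated as missing rather than as a real zero-area lot.

```python
from typing import Optional


def lot_area(lt_front: float, lt_depth: float) -> Optional[float]:
    """Lot footprint as front x depth; None when a dimension is missing (0)."""
    if lt_front <= 0 or lt_depth <= 0:
        return None
    return lt_front * lt_depth


def building_area(bld_front: float, bld_depth: float, stories: float) -> Optional[float]:
    """Rough gross floor area: footprint x number of stories."""
    if bld_front <= 0 or bld_depth <= 0 or stories <= 0:
        return None
    return bld_front * bld_depth * stories


# Hypothetical 25 x 100 lot with a 25 x 60 building of 5 stories.
print(lot_area(25, 100))
print(building_area(25, 60, 5))

# The sample record has 0 for all four dimensions and 31 stories.
print(lot_area(0, 0), building_area(0, 0, 31))
```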

FULLVAL, AVLAND, AVTOT -> go through and do individual calculations to see how these are related.
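As a starting point for those calculations, the values from the sample record in this section can be compared directly. In that one record AVTOT divided by FULLVAL comes out to 0.45, which matches the assessment ratio used in the Domain Dive example and would be consistent with AVTOT being the total assessed value. One record proves nothing, so this is only a sketch of the check to run across the whole dataset.

```python
# Values copied from the sample record in this section.
record = {"FULLVAL": 354180, "AVLAND": 3310, "AVTOT": 159381, "AVTOT2": 148953}

ratio_total = record["AVTOT"] / record["FULLVAL"]
ratio_total2 = record["AVTOT2"] / record["FULLVAL"]
land_share = record["AVLAND"] / record["AVTOT"]

print(f"AVTOT / FULLVAL  = {ratio_total:.4f}")
print(f"AVTOT2 / FULLVAL = {ratio_total2:.4f}")
print(f"AVLAND / AVTOT   = {land_share:.4f}")
```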

Issue with one data set: it only goes up to the 2018-2019 plan year, so I need to grab later years from another, similar dataset.

can I calculate an assessed value per square foot-> compare that to buildings in the same block-lot and then what about adjacent lots?

List of data I’m looking at: NYC Open Data, NYC DOF.

I have a database of appeals information. Can I compare that to the price per square foot and see if wealthy buildings are more likely to appeal? Also, only successful appeals are in the database; if I could find a database with unsuccessful appeals, that could be interesting.

Basically there are three mechanisms for inequity that we are looking at: 1) rich people are better able to appeal, 2) part of the statistical calculation for tax assessments used by the NYC DOF penalizes poorer buildings and subsidizes richer buildings by tending to assume an average value, or 3) the price per square foot is distorted, either by an underlying building mortgage, which shouldn’t affect the taxable value but would affect the sale value, or by high maintenance due to luxury amenities and staff. Given two equal units, if one building has higher maintenance, its price is lower. But is that true? Does the price accurately capture the value of having a doorman? Sigh, this also means price per square foot isn’t a great indicator of richness…

I need to start over, grab more datasets and do basic data exploration and evaluating missingness and imputation strategies.

No matter what we’re only looking at tax class 2 buildings. Maybe that can be part of the domain dive section: where we discuss how the DOF assesses tax values.

There’s also the Pluto database and that has a building ID but is that consistent across other databases?

BBLE “1000163859”
BORO “1”
BLOCK “16”
LOT “3859”
EASEMENT NA
OWNER “CHEN, QI TOM”
BLDGCL “R4”
TAXCLASS “2”
LTFRONT “0”
LTDEPTH “0”
EXT NA
STORIES “31”
FULLVAL “354180”
AVLAND “3310”
AVTOT “159381”
EXLAND “3310”
EXTOT “159381”
EXCD1 “6800”
STADDR “1 RIVER TERRACE”
POSTCODE NA
EXMPTCL NA
BLDFRONT “0”
BLDDEPTH “0”
AVLAND2 “3310”
AVTOT2 “148953”
EXLAND2 “3310”
EXTOT2 “148953”
EXCD2 NA
PERIOD “FINAL”
YEAR “2018/19”
VALTYPE “AC-TR”
Borough NA
Latitude NA
Longitude NA
Community Board NA
Council District NA
Census Tract NA
BIN NA
NTA NA
New Georeferenced Column NA
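In the record above, BBLE looks like the borough, block and lot concatenated: one digit for BORO, BLOCK padded to five digits, and LOT padded to four, with any trailing character being the easement code. The sketch below splits and rebuilds the key under that assumption. The function names are illustrative. Note that this key identifies a tax lot, not a building, so it does not solve the multiple-buildings-per-lot problem on its own.

```python
def parse_bble(bble: str) -> dict:
    """Split a BBLE string into borough, block, lot and optional easement."""
    if len(bble) < 10 or not bble[:10].isdigit():
        raise ValueError(f"unexpected BBLE format: {bble!r}")
    return {
        "boro": int(bble[0]),
        "block": int(bble[1:6]),
        "lot": int(bble[6:10]),
        "easement": bble[10:] or None,
    }


def make_bbl(boro: int, block: int, lot: int) -> str:
    """Build the 10-digit key from its parts, zero-padding block and lot."""
    return f"{boro}{block:05d}{lot:04d}"


parsed = parse_bble("1000163859")  # BBLE from the sample record
print(parsed)
print(make_bbl(parsed["boro"], parsed["block"], parsed["lot"]))
```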

Where can I find the NYC DOF glossary? Is AVTOT Average Total value or Assessed Value Total? Is FULLVAL market value? Is EXTOT exempt value? Is AVTOT2 assessed value total after exemptions? Need to use summary(), look for missing values, use skimr::skim(), and look for outliers and ranges.

Use Census data to come up with average salary per zipcode to help determine Richness by zipcode…




Statistical Methods:

This section describes the methods you used to analyze the data.

Goal: design and implement a model or set of models to measure or explore these relationships

Ridge Regression

Try linear regression -> I may know what my features are because of the domain dive but if not I could try lasso regression. For a nonlinear model maybe I could do random forest. How would the residuals from a linear vs nonlinear model be a diagnostic in and of itself?
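To see what the ridge penalty does before reaching for a library, here is a single-feature sketch in plain Python. It uses the closed form for one centered predictor with an unpenalized intercept, where the slope is Sxy / (Sxx + lambda). The data and the name `fit_ridge_1d` are hypothetical; the real model would have many features and would more likely be fit with a library such as scikit-learn.

```python
from statistics import fmean


def fit_ridge_1d(x, y, lam=0.0):
    """Ridge fit for one feature; lam=0 reduces to ordinary least squares."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + lam)          # the penalty shrinks the slope toward 0
    intercept = y_bar - slope * x_bar  # the intercept is not penalized
    return slope, intercept


# Hypothetical data lying exactly on y = 2x.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]

ols_slope, ols_intercept = fit_ridge_1d(x, y, lam=0.0)
ridge_slope, ridge_intercept = fit_ridge_1d(x, y, lam=5.0)

print(f"OLS:   y = {ols_slope:.2f} x + {ols_intercept:.2f}")
print(f"ridge: y = {ridge_slope:.2f} x + {ridge_intercept:.2f}")
```

Because the penalty flattens the fitted line, ridge residuals grow at the extremes of x, which is worth keeping in mind when reading residuals for the richest and poorest buildings.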

Use a test/train split and k-fold cross-validation.
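The fold bookkeeping behind cross-validation can be sketched without any libraries. The generator below shuffles row indices once and then yields train and test index lists for each fold. The name `k_fold_indices` and the fixed seed are illustrative choices; in practice a library routine would normally do this.

```python
import random


def k_fold_indices(n: int, k: int, seed: int = 0):
    """Yield (train_indices, test_indices) for each of k folds over n rows."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)      # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # deal rows into k folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n=10, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train rows, test rows {sorted(test)}")
```

Every row lands in exactly one test fold, so each building gets one out-of-sample prediction whose residual can be analyzed.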

Residual Analysis

Analyze residuals across building types.

Visualize anomalies - what we’re expecting to see is greater residuals for wealthy and poor buildings w
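One simple way to start the residual comparison is to average actual minus predicted within each wealth tier. The numbers below are entirely hypothetical and the function name `mean_residual_by_group` is illustrative. With residuals defined as actual minus predicted, a clearly negative mean for the high tier would mean those buildings are assessed below what the model expects.

```python
from collections import defaultdict
from statistics import fmean


def mean_residual_by_group(actual, predicted, groups):
    """Average residual (actual - predicted) for each group label."""
    by_group = defaultdict(list)
    for a, p, g in zip(actual, predicted, groups):
        by_group[g].append(a - p)
    return {g: fmean(residuals) for g, residuals in by_group.items()}


# Hypothetical assessed values (in $1000s) and model predictions.
actual = [100, 110, 200, 210, 300, 260]
predicted = [100, 100, 200, 200, 300, 300]
tiers = ["low", "low", "mid", "mid", "high", "high"]

tier_means = mean_residual_by_group(actual, predicted, tiers)
for tier, mean_residual in tier_means.items():
    print(f"{tier}: mean residual {mean_residual:+.1f}")
```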

Statistical Validity

How sound are the research design and methods

Is the sample of observations selected for the test reflective of the population? Are the values of the independent variables not dependent on each other? Are there significant confounding or exogenous factors influencing the dependent variable that need to be controlled for in the model?

Internal Validity

Do the data sets and variables accurately represent the phenomena being explored

External validity

Can the results of the study be generalized

Limitations

Potential concerns: It’s possible that if we train the model on the historically available data, and unfairness is present with richer buildings not paying their fair share, then the model would learn to predict relatively lower property tax assessments for them, and so the residuals would look normal for the higher-end buildings. A nonlinear model may capture the unfairness anomaly as an expected part of the model, so if we use a linear model, and the actual tax assessment methodology is linear, then we will avoid this problem. If we can’t use differences in the residuals to identify the tax anomaly, then we may have to find other aspects of the model to look for taxation anomalies. [Later, go through and standardize the nomenclature]

Model Selection

How did the type of relationships among the variables or end result influence which statistical or machine learning model was most appropriate

consider the scikit-learn algorithm cheat-sheet OR https://www.analyticssteps.com/blogs/5-statistical-data-analysis-techniques-statistical-modelling-machine-learning

Main classifications are: regression / classification / clustering / dimensionality reduction

Model Fit

Discuss my work in relationship to overfitting and underfitting

need to discuss feature importance and partial dependence plots - what other visuals can I interpret?

Data Analysis




Discussion of Results:

One concern is that the model might learn the systemic bias and so maybe that’s another way to look at this. If the model is showing even residuals between poor, mid and rich buildings… I guess that would indicate it’s more systemic unfairness than sophisticated owners being able to appeal.

Also, it’s easy to see with linear regression, but if we have some low-interpretability non-linear model then I’m not sure what conclusions I’ll be able to draw.




Conclusion:




Final Presentation


Include a link to the youtube presentation ***


write an article for The Co-operator ***


include link to data repository Github or elsewhere or where on the NYC websites I can find the data or, it should already be processed. maybe I can have links to other RPubs documents where the data scrubbing is visible ***


Simple website where you enter your building’s Block and Lot and are able to see a Tax Anomaly-ometer where:
Green - fine/under
Yellow - fine
Orange - over paying
Red - definitely appeal
Maybe the website can also produce a small report that a board member can bring back to their board to discuss.
Share as a marketing tool for Daisy, our property management company.
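The color logic for the meter could be as small as the sketch below, which compares a building's actual assessment to the model's prediction. The four colors come from the list above, but the function name `anomaly_color` and every threshold (0.95, 1.05, 1.20) are placeholders invented for illustration; real cutoffs would have to come from the residual analysis.

```python
def anomaly_color(actual: float, predicted: float,
                  yellow_from: float = 0.95,
                  orange_from: float = 1.05,
                  red_from: float = 1.20) -> str:
    """Map the actual/predicted assessment ratio to a meter color."""
    if predicted <= 0:
        raise ValueError("predicted must be positive")
    ratio = actual / predicted
    if ratio < yellow_from:
        return "green"   # fine/under
    if ratio < orange_from:
        return "yellow"  # fine
    if ratio < red_from:
        return "orange"  # over paying
    return "red"         # definitely appeal


# Hypothetical assessments against a prediction of 100.
for actual_value in (90, 100, 110, 130):
    print(actual_value, anomaly_color(actual_value, 100))
```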



make the final presentation off of slides and not the final document. ***