DCP

PULL DCP HOUSING DATABASE

import DCP relevant building class info

Pulling from this list here.

all_dcp_classes <- read_csv("MWG Building Classification - All DOB Classes.csv") %>% clean_names() %>% 
  rename(class_description = description)

## Rows: 253 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): Building Code, Description, Code Category
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

relevant_dcp_classes <- read_csv("MWG Building Classification - Class Cats for R.csv") %>%  clean_names()

## Rows: 68 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): CATEGORY, CLASS, CODE TITLE, CODE CAT
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
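
The readr messages above suggest declaring column types. A minimal sketch of doing so, using inline literal data in place of the real CSV: the header matches the file read above, but the two rows and the `_demo` object name are made-up placeholders.

```r
library(readr)
library(dplyr)
library(janitor)

# Hypothetical stand-in for "MWG Building Classification - All DOB Classes.csv";
# only the header row mirrors the real file, the data rows are placeholders.
csv_text <- I("Building Code,Description,Code Category\nX1,Example description one,Example category\nX2,Example description two,Example category\n")

all_dcp_classes_demo <- read_csv(
  csv_text,
  col_types = cols(.default = col_character())  # declaring types also quiets the message
) %>%
  clean_names() %>%
  rename(class_description = description)

names(all_dcp_classes_demo)
```

Reading every column as character matches the `chr (3)` specification readr guessed above, so the resulting table should be the same, just without the message.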

PLUTO

# Select relevant columns
# (zone_dist1 is the zoning district that occupies the greatest percentage of the lot area)
mappluto_clean <- mappluto %>%
  select(address, bbl, zone_dist1, land_use, units_res, units_total, year_built, lot_area, bldg_class) %>%
  mutate(
    bbl = as.character(bbl),
    # units_res = replace_na(units_res, 0),  # Ensure missing unit counts are 0
    year_built = as.integer(year_built),   # Convert year_built to integer
    lot_area = as.numeric(lot_area)        # Ensure lot_area is numeric
  )

FILTERING & JOIN

DCP

# Define relevant job types
relevant_job_types <- c("New Building", "Alteration")  # Can be modified based on analysis needs

# Define relevant job statuses (if you only want completed projects, for example)
relevant_statuses <- c("5. Completed Construction")  # Modify as needed

# Filter the dataset, add categories, create address var
dcp_housing_filtered <- dcp_housing %>%
  filter(
    # bldg_class %in% relevant_dcp_classes$`CLASS`,  # Filter by building class
    job_type %in% relevant_job_types,  # Filter by job type
    job_status %in% relevant_statuses,  # Filter by job status
    class_a_net > 0  # Only keep records with a positive net change in Class A units - adding permanent residences
   ) %>% 
  left_join(all_dcp_classes, by = c("bldg_class" = "building_code")) %>% 
  rename(
    bldg_category = code_category
  ) %>% 
  mutate(address = paste(address_num, address_st, sep = " "))

# write_csv(dcp_housing_filtered, "dcp_housing_filtered.csv")

#look at "other"s
na_dcp <- dcp_housing_filtered %>% 
  filter((is.na(bldg_class) | bldg_class == "Other")) 
  # %>% write_csv("dcp_na_others.csv")

PLUTO

# Add building class descriptions and categories
mappluto_clean <- mappluto_clean %>%
  left_join(all_dcp_classes, by = c("bldg_class" = "building_code")) %>% 
  rename(
    bldg_category = code_category)

join

#drop pluto geoms
mappluto_nogeom <- st_drop_geometry(mappluto_clean)

combined_df <- dcp_housing_filtered %>% 
  left_join(mappluto_nogeom, by = "bbl") %>% 
  mutate(
    address.dcp = address.x,
    address.pluto = address.y,
    class.dcp = bldg_class.x,
    class.pluto = bldg_class.y,
    category.dcp = bldg_category.x,
    category.pluto = bldg_category.y,
    class_a_net.dcp = class_a_net,
    units_res.pluto = units_res,
    class_description = class_description.x
  )

combined_df_classes_units <- combined_df %>% 
  select(bbl, address.dcp, address.pluto, class.dcp, class.pluto, category.pluto, category.dcp, class_a_net.dcp, units_res.pluto, class_description)
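
As a possible alternative to copying the `.x`/`.y` columns by hand, `left_join()` accepts a `suffix` argument. The sketch below uses toy stand-ins for `dcp_housing_filtered` and `mappluto_nogeom` (all values are made up). Only columns whose names collide get a suffix, so `class_a_net` and `units_res` would still need renaming if the `.dcp`/`.pluto` naming is wanted for them too.

```r
library(dplyr)

# Toy stand-ins for the two tables joined above (values are hypothetical)
dcp_demo <- tibble(
  bbl = c("1", "2"),
  address = c("1 Example St", "2 Example St"),
  bldg_class = c("X1", "X2"),
  class_a_net = c(10, 4)
)
pluto_demo <- tibble(
  bbl = c("1", "2"),
  address = c("1 EXAMPLE STREET", "2 EXAMPLE STREET"),
  bldg_class = c("X1", "X3"),
  units_res = c(10, 6)
)

# Colliding columns get source-specific suffixes instead of .x / .y
combined_demo <- dcp_demo %>%
  left_join(pluto_demo, by = "bbl", suffix = c(".dcp", ".pluto"))

names(combined_demo)
```

Note that this yields `bldg_class.dcp` and `bldg_class.pluto` rather than the shorter `class.dcp` and `class.pluto` used above.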

digging into classes

Using DCP HDB listed classes

by project

## Warning in sf_column %in% names(g): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 15309 of `x` matches multiple rows in `y`.
## ℹ Row 2 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
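
The chunk that produced this warning is not shown, so the following is only a generic sketch of how the duplicated join keys could be located. The tables `x` and `y` and the column `key` are hypothetical placeholders for whatever was joined.

```r
library(dplyr)

# Toy tables; `key` stands in for the real join column
x <- tibble(key = c("a", "a", "b"), x_val = 1:3)
y <- tibble(key = c("a", "a", "c"), y_val = 4:6)

# Keys repeated in x that also occur in y:
# a row of y with such a key matches multiple rows in x
dup_in_x <- x %>%
  count(key, name = "n_x") %>%
  filter(n_x > 1) %>%
  semi_join(y, by = "key")

# Keys repeated in y that also occur in x:
# a row of x with such a key matches multiple rows in y
dup_in_y <- y %>%
  count(key, name = "n_y") %>%
  filter(n_y > 1) %>%
  semi_join(x, by = "key")

dup_in_x
dup_in_y
```

If both results are non-empty, the join is many-to-many. Once the duplicates are understood, one option is to deduplicate the lookup table (for example with `distinct()`); another, as the warning itself says, is to pass `relationship = "many-to-many"` to the join.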

by units (net Class A)

## Warning in sf_column %in% names(g): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 15309 of `x` matches multiple rows in `y`.
## ℹ Row 2 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.

NOTE: Projects were first filtered to Status = Completed, Job Type = New Building or Alteration, and Net Class A Unit Change > 0. This chart shows projects whose classes fall outside our list of expected residential classes.

combined dataset

Using classes from PLUTO

NOTE: Filtered for projects with: Status = Completed, Job Type = New Building or Alteration, and Net Class A Unit Change > 0

Iterative classification

Class is determined first by the DCP class. If the DCP class is missing or not in our list of relevant classes, the PLUTO class is used instead. If neither class is relevant, the project is assigned “irrelevant class.”
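
The fallback rule described above can be sketched with `case_when()`. This is an illustration under assumptions, not the notebook's hidden chunk: `relevant_classes`, `final_class`, `class_source`, and the toy class codes are hypothetical, and the real list of relevant classes would come from `relevant_dcp_classes`.

```r
library(dplyr)

# Hypothetical list of relevant classes (placeholder codes)
relevant_classes <- c("X1", "X2")

# Toy rows covering each branch of the rule
demo <- tibble(
  class.dcp   = c("X1", "Z9", NA,   "Z9"),
  class.pluto = c("X2", "X2", "X1", NA)
)

demo_classified <- demo %>%
  mutate(
    # Conditions are checked in order, so the DCP class wins when it is relevant
    final_class = case_when(
      class.dcp %in% relevant_classes   ~ class.dcp,
      class.pluto %in% relevant_classes ~ class.pluto,
      TRUE                              ~ "irrelevant class"
    ),
    # Record which source supplied the class, for later auditing
    class_source = case_when(
      class.dcp %in% relevant_classes   ~ "DCP",
      class.pluto %in% relevant_classes ~ "PLUTO",
      TRUE                              ~ "none"
    )
  )

demo_classified
```

Because `NA %in% relevant_classes` is `FALSE`, a missing DCP class falls through to the PLUTO check without any special handling.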

All Classes

NOTE: Filtered for projects with: Status = Completed, Job Type = New Building or Alteration, and Net Class A Unit Change > 0

Problem/Irrelevant Classes

exporting combined df for analysis in sheets

Written to sheet here.

Started looking through the problem classes, beginning with those with the largest number of units:

N2: Supportive/low-income housing - designate as rental?
I7: Permanent supportive housing - designate as rental or condo?
V1: Vacant land, so hard to make a single determination, but the ones I looked up are all multifamily rental.
H8: Dorms - remove from count? Classify as rental?
N9: Supportive housing
K4: Mixed use, mixed income - classify as rental, take net Class A unit number
H3: Hotels, some with rental units - throw out or take Class A?
W6: Fordham Residential Campus

class_digging

2025-03-28
