MTA Subway Accessibility Gap Analysis

Author

Muhammad Ahmad

Reason for This Analysis

New York City’s subway system is one of the largest in the world, yet it remains one of the least accessible among major global transit networks. As of 2026, only about 31% of the system’s 493 subway and Staten Island Railway stations are fully ADA-compliant, leaving hundreds of stations unreachable for riders who use wheelchairs, strollers, or other mobility devices. (The station dataset analyzed below contains 496 records, of which 160, or 32.3%, are coded as fully accessible, so the figures computed in this report differ slightly from that headline count.) This gap disproportionately affects outer-borough communities, seniors, and people with disabilities.

This analysis aims to:

  1. Quantify the accessibility gap across the MTA subway system by borough and line
  2. Estimate the funding required to bring non-compliant stations up to ADA standard
  3. Identify stations near high-density development parcels that may qualify for the City’s Zoning for Accessibility (ZFA) program, a public-private partnership in which developers receive a density bonus of up to 20% in exchange for funding elevator and accessibility construction at nearby stations

The ZFA angle is particularly important: it represents a mechanism to close the funding gap without relying entirely on the MTA Capital Plan, and mapping its potential has direct policy implications.


Data & Libraries

Libraries

Code
library(tidyverse)    # data manipulation and visualization
library(leaflet)      # interactive maps
library(sf)           # spatial operations
library(knitr)        # table formatting
library(kableExtra)   # enhanced tables
library(scales)       # number formatting
library(ggplot2)      # static charts

Data Sources

This project uses three data sources across two different types:

  • MTA Subway Stations (MTA_Subway_Stations.csv): CSV — station-level data from the MTA Open Data portal including ADA status, borough, routes, and coordinates.
  • NYC PLUTO (PLUTO_filtered_near_inaccessible_stations.csv): CSV — a pre-filtered subset of NYC’s Primary Land Use Tax Lot Output dataset, containing only high-density zoning parcels within ~500 meters of an inaccessible subway station. Full dataset available at NYC Open Data.
  • MTA Elevator & Escalator Outage Data: live API pull from the New York State Open Data portal (data.ny.gov) through its Socrata API. It provides real-world uptime data for existing elevators, intended to illustrate that even accessible stations face reliability issues.
Code
library(httr)
library(jsonlite)

# Load MTA station data (CSV)
mta <- read_csv("https://raw.githubusercontent.com/MuhammadAhmad0006/Data607_Final_Project_/refs/heads/main/MTA_Subway_Stations.csv")

# Load filtered PLUTO data (CSV)
# Note: The full NYC PLUTO dataset is ~860K rows and 428MB - too large to process
# inside a QMD. As a pre-processing step performed separately in Python (pandas),
# it was filtered to only high-density zoning parcels (C4, C5, C6, R9, R10, M1-M3)
# within ~500m of an inaccessible MTA station, yielding 26,143 rows x 17 columns.
pluto <- read.csv("https://raw.githubusercontent.com/MuhammadAhmad0006/Data607_Final_Project_/refs/heads/main/PLUTO_filtered_near_inaccessible_stations.csv")

cat("PLUTO parcels loaded:", nrow(pluto), "rows,", ncol(pluto), "columns\n")
PLUTO parcels loaded: 26143 rows, 17 columns
Code
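# Pull elevator outage data via the New York State Open Data (data.ny.gov) Socrata API
# Note: $limit=1000 returns at most the first 1,000 records
elevator_url <- "https://data.ny.gov/resource/rc5b-x5jp.json?$limit=1000"
response <- GET(elevator_url)
stop_for_status(response)  # fail early on an HTTP error instead of parsing an error body

elevator_outages <- fromJSON(content(response, as = "text", encoding = "UTF-8")) %>%
  as_tibble()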
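
Because the request above caps the result at 1,000 rows, a longer outage history would need paging. The sketch below assumes the endpoint honours the standard Socrata `$limit` and `$offset` query parameters; `page_query()` and `fetch_all_pages()` are hypothetical helper names, and the page size and page cap are arbitrary choices.

```r
# Hypothetical helpers, not part of the original analysis.
# Assumption: the endpoint supports Socrata's $limit / $offset paging.

# Build the query parameters for one page (pure function, no network access).
# Integers are used so large offsets are not written in scientific notation.
page_query <- function(page, page_size = 1000L) {
  list(`$limit`  = as.integer(page_size),
       `$offset` = as.integer((page - 1) * page_size))
}

# Fetch pages until one comes back empty or short of page_size rows,
# or until max_pages is reached
fetch_all_pages <- function(base_url, page_size = 1000L, max_pages = 50L) {
  pages <- list()
  for (page in seq_len(max_pages)) {
    resp <- httr::GET(base_url, query = page_query(page, page_size))
    httr::stop_for_status(resp)  # stop on HTTP errors rather than parse them
    rows <- jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
    if (NROW(rows) == 0) break          # empty page: nothing left to fetch
    pages[[page]] <- rows
    if (NROW(rows) < page_size) break   # short page: this was the last one
  }
  dplyr::bind_rows(pages)
}

# Parameters that would be sent for the third page
str(page_query(3))
```

Calling `fetch_all_pages("https://data.ny.gov/resource/rc5b-x5jp.json")` would then return all pages bound into one data frame, subject to the page cap.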

Data Transformation

Code
# Recode ADA status to readable labels
mta <- mta %>%
  mutate(
    ada_status = case_when(
      ADA == 1 ~ "Fully Accessible",
      ADA == 2 ~ "Partially Accessible",
      ADA == 0 ~ "Not Accessible"
    ),
    ada_status = factor(ada_status, levels = c("Fully Accessible", 
                                                "Partially Accessible", 
                                                "Not Accessible")),
    # Recode borough abbreviations
    borough_full = case_when(
      Borough == "M"  ~ "Manhattan",
      Borough == "Bk" ~ "Brooklyn",
      Borough == "Q"  ~ "Queens",
      Borough == "Bx" ~ "Bronx",
      Borough == "SI" ~ "Staten Island"
    )
  )

# Separate inaccessible stations
inaccessible <- mta %>% filter(ADA == 0)

cat("Total stations:", nrow(mta), "\n")
Total stations: 496 
Code
cat("Inaccessible stations:", nrow(inaccessible), "\n")
Inaccessible stations: 327 
Code
cat("Fully accessible stations:", sum(mta$ADA == 1), "\n")
Fully accessible stations: 160 
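
`case_when()` returns `NA` for any value that matches none of its conditions, so an unexpected ADA or borough code would silently drop out of later counts. A minimal safeguard, sketched here on small made-up vectors rather than the real data, is to recode through a named lookup and stop if anything is left unmatched. The `recode_strict()` helper is hypothetical.

```r
# Hypothetical helper: recode via a named lookup and fail on unknown codes
recode_strict <- function(x, lookup) {
  out <- unname(lookup[as.character(x)])
  unknown <- unique(x[is.na(out)])
  if (length(unknown) > 0) {
    stop("Unmapped codes: ", paste(unknown, collapse = ", "))
  }
  out
}

# Same mappings as the case_when() calls above
ada_lookup <- c(`1` = "Fully Accessible",
                `2` = "Partially Accessible",
                `0` = "Not Accessible")
borough_lookup <- c(M = "Manhattan", Bk = "Brooklyn", Q = "Queens",
                    Bx = "Bronx", SI = "Staten Island")

# Toy input (illustrative values only, not rows from the MTA file)
toy_ada     <- c(0, 1, 2, 0)
toy_borough <- c("M", "Bk", "SI", "Q")

recode_strict(toy_ada, ada_lookup)
recode_strict(toy_borough, borough_lookup)

# An unexpected code raises an error instead of quietly becoming NA
result <- tryCatch(recode_strict(c("M", "NJ"), borough_lookup),
                   error = function(e) conditionMessage(e))
print(result)
```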

Wide-to-Long Transformation

The borough accessibility counts start in wide format (one column per ADA status). We pivot them to long format (one row per borough and status) so they can be plotted and summarized flexibly. This follows Hadley Wickham’s tidy data principles, as covered in the Grammar of Data Science workflow.

Code
# Build wide format: one row per borough, one column per ADA status
borough_wide <- mta %>%
  group_by(borough_full) %>%
  summarise(
    Fully_Accessible    = sum(ADA == 1),
    Partially_Accessible = sum(ADA == 2),
    Not_Accessible      = sum(ADA == 0),
    Total               = n()
  )

cat("Wide format (", nrow(borough_wide), "rows x", ncol(borough_wide), "cols ):\n")
Wide format ( 5 rows x 5 cols ):
Code
print(borough_wide)
# A tibble: 5 × 5
  borough_full  Fully_Accessible Partially_Accessible Not_Accessible Total
  <chr>                    <int>                <int>          <int> <int>
1 Bronx                       21                    0             49    70
2 Brooklyn                    43                    2            124   169
3 Manhattan                   63                    6             84   153
4 Queens                      27                    1             55    83
5 Staten Island                6                    0             15    21
Code
# Pivot to long format for ggplot
borough_long <- borough_wide %>%
  pivot_longer(
    cols      = c(Fully_Accessible, Partially_Accessible, Not_Accessible),
    names_to  = "ada_status",
    values_to = "station_count"
  ) %>%
  mutate(
    ada_status = str_replace_all(ada_status, "_", " "),
    ada_status = factor(ada_status, 
                        levels = c("Fully Accessible", "Partially Accessible", "Not Accessible")),
    pct = round(station_count / Total * 100, 1)
  )

cat("\nLong format (", nrow(borough_long), "rows x", ncol(borough_long), "cols ):\n")

Long format ( 15 rows x 5 cols ):
Code
print(borough_long)
# A tibble: 15 × 5
   borough_full  Total ada_status           station_count   pct
   <chr>         <int> <fct>                        <int> <dbl>
 1 Bronx            70 Fully Accessible                21  30  
 2 Bronx            70 Partially Accessible             0   0  
 3 Bronx            70 Not Accessible                  49  70  
 4 Brooklyn        169 Fully Accessible                43  25.4
 5 Brooklyn        169 Partially Accessible             2   1.2
 6 Brooklyn        169 Not Accessible                 124  73.4
 7 Manhattan       153 Fully Accessible                63  41.2
 8 Manhattan       153 Partially Accessible             6   3.9
 9 Manhattan       153 Not Accessible                  84  54.9
10 Queens           83 Fully Accessible                27  32.5
11 Queens           83 Partially Accessible             1   1.2
12 Queens           83 Not Accessible                  55  66.3
13 Staten Island    21 Fully Accessible                 6  28.6
14 Staten Island    21 Partially Accessible             0   0  
15 Staten Island    21 Not Accessible                  15  71.4
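
As a quick arithmetic check on the tables above, the borough counts can be re-entered by hand and compared with the system-wide totals printed earlier (496 stations, 160 fully accessible, 327 not accessible). The data frame below is typed in from the printed output, not read from the source file, and the column names are chosen only for this sketch.

```r
# Counts copied from the printed borough_wide tibble above
borough_check <- data.frame(
  borough    = c("Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"),
  fully      = c(21, 43, 63, 27, 6),
  partially  = c(0, 2, 6, 1, 0),
  not_access = c(49, 124, 84, 55, 15),
  total      = c(70, 169, 153, 83, 21)
)

# Each row's three statuses should add up to the borough total
stopifnot(with(borough_check, fully + partially + not_access == total))

# Column sums, to compare with the system-wide figures reported earlier
colSums(borough_check[, -1])

# Share of stations that are not accessible, by borough
round(100 * borough_check$not_access / borough_check$total, 1)
```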

Accessibility Gap Analysis

System-Wide Overview

Code
# Summary table
mta %>%
  count(ada_status) %>%
  mutate(
    Percent = round(n / sum(n) * 100, 1),
    n = comma(n)
  ) %>%
  rename(`ADA Status` = ada_status, `# Stations` = n, `% of System` = Percent) %>%
  kable(caption = "MTA Subway Station Accessibility Status") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
MTA Subway Station Accessibility Status
ADA Status            # Stations  % of System
Fully Accessible             160         32.3
Partially Accessible           9          1.8
Not Accessible               327         65.9
Code
mta %>%
  count(ada_status) %>%
  ggplot(aes(x = reorder(ada_status, -n), y = n, fill = ada_status)) +
  geom_col(width = 0.6) +
  geom_text(aes(label = paste0(n, "\n(", round(n/sum(n)*100,1), "%)")),
            vjust = -0.4, size = 3.5) +
  scale_fill_manual(values = c("Fully Accessible" = "#2ecc71",
                                "Partially Accessible" = "#f39c12",
                                "Not Accessible" = "#e74c3c")) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
  labs(title = "MTA Subway Station Accessibility Status",
       x = NULL, y = "Number of Stations", fill = NULL) +
  theme_minimal() +
  theme(legend.position = "none")

MTA Subway Station Accessibility by Status

Accessibility Gap by Borough

Code
# Now uses borough_long — tidy long-format data from the pivot above
borough_long %>%
  ggplot(aes(x = borough_full, y = station_count, fill = ada_status)) +
  geom_col(position = "stack") +
  geom_text(aes(label = ifelse(station_count > 5, station_count, "")),
            position = position_stack(vjust = 0.5), size = 3, color = "white") +
  scale_fill_manual(values = c("Fully Accessible" = "#2ecc71",
                                "Partially Accessible" = "#f39c12",
                                "Not Accessible" = "#e74c3c")) +
  labs(title = "Subway Station Accessibility by Borough",
       x = NULL, y = "Number of Stations", fill = "ADA Status") +
  theme_minimal()

Accessibility status breakdown by NYC borough
Code
# Table: inaccessible count and % by borough
mta %>%
  group_by(borough_full) %>%
  summarise(
    Total = n(),
    Inaccessible = sum(ADA == 0),
    `% Inaccessible` = round(Inaccessible / Total * 100, 1)
  ) %>%
  arrange(desc(Inaccessible)) %>%
  rename(Borough = borough_full) %>%
  kable(caption = "Inaccessible Stations by Borough") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Inaccessible Stations by Borough
Borough        Total  Inaccessible  % Inaccessible
Brooklyn         169           124            73.4
Manhattan        153            84            54.9
Queens            83            55            66.3
Bronx             70            49            70.0
Staten Island     21            15            71.4

Funding Gap Estimation

The MTA’s capital program and legal settlements give us cost benchmarks for accessibility upgrades. Based on publicly available MTA capital plan data:

  • Average cost per station accessibility upgrade: ~$50–$80 million (full elevator installation + ADA compliance work)
  • We use $65 million as a midpoint estimate
Code
cost_per_station <- 65e6  # $65 million estimate

funding_summary <- mta %>%
  filter(ADA == 0) %>%
  group_by(borough_full) %>%
  summarise(
    `Inaccessible Stations` = n(),
    `Estimated Cost ($ Millions)` = round(n() * cost_per_station / 1e6)
  ) %>%
  bind_rows(
    summarise(., 
              borough_full = "TOTAL",
              `Inaccessible Stations` = sum(`Inaccessible Stations`),
              `Estimated Cost ($ Millions)` = sum(`Estimated Cost ($ Millions)`))
  ) %>%
  rename(Borough = borough_full)

funding_summary %>%
  mutate(`Estimated Cost ($ Millions)` = dollar(`Estimated Cost ($ Millions)`, 
                                                  suffix = "M", prefix = "$")) %>%
  kable(caption = paste("Estimated Funding Required at $65M per Station")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Estimated Funding Required at $65M per Station
Borough        Inaccessible Stations  Estimated Cost ($ Millions)
Bronx                             49                      $3,185M
Brooklyn                         124                      $8,060M
Manhattan                         84                      $5,460M
Queens                            55                      $3,575M
Staten Island                     15                        $975M
TOTAL                            327                     $21,255M
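
Because the $65 million figure is the midpoint of a $50 to $80 million range, the total is sensitive to that choice. The sketch below reapplies the same multiplication at the low, middle, and high ends of the range to the 327 inaccessible stations counted above. It is a simple bracket around the estimate, not a cost model.

```r
# Inaccessible station count from the analysis above
n_inaccessible <- 327

# Per-station cost scenarios in dollars (range quoted in the text)
cost_scenarios <- c(low = 50e6, mid = 65e6, high = 80e6)

# Total cost in $ billions under each scenario
total_billion <- n_inaccessible * cost_scenarios / 1e9
print(total_billion)
```

Under these assumptions the system-wide total ranges from about $16.35 billion to about $26.16 billion, with the $65 million midpoint giving the $21.255 billion shown in the table.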

Zoning for Accessibility (ZFA) Opportunity Analysis

The Zoning for Accessibility program allows developers in high-density zones (C4, C5, C6, R9, R10, and similar) to receive up to a 20% floor area bonus in exchange for fully funding and constructing elevator access at a nearby subway station. This section identifies where that opportunity exists.

Joining PLUTO Parcels to Inaccessible Stations

Code
# Rename coordinate columns to remove spaces before converting to sf
inaccessible_clean <- inaccessible %>%
  rename(lon = `GTFS Longitude`, lat = `GTFS Latitude`)

stations_sf <- inaccessible_clean %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326)

pluto_sf <- pluto %>%
  filter(!is.na(latitude), !is.na(longitude)) %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

# Find parcels within 500m of an inaccessible station
stations_proj <- st_transform(stations_sf, 32618)  # UTM zone 18N for NYC
pluto_proj    <- st_transform(pluto_sf, 32618)

nearby <- st_join(pluto_proj, stations_proj %>% select(`Stop Name`, Borough),
                  join = st_is_within_distance, dist = 500)

nearby_clean <- nearby %>%
  filter(!is.na(`Stop Name`)) %>%
  st_drop_geometry()

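cat("Parcel-station pairs within 500m (high-density parcels x inaccessible stations):", nrow(nearby_clean), "\n")
Parcel-station pairs within 500m (high-density parcels x inaccessible stations): 51003 
Code
cat("Distinct stop names among inaccessible stations with nearby ZFA parcels:", 
    n_distinct(nearby_clean$`Stop Name`), "\n")
Distinct stop names among inaccessible stations with nearby ZFA parcels: 236 

Note that the join returns one row per parcel-station pair, so a parcel within 500 m of several stations appears once for each of them; this is why the row count exceeds the 26,143 parcels loaded. Stop names are also shared by different stations on different lines, so the second figure counts distinct names and is a lower bound on the number of distinct stations.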
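
To report how many distinct parcels lie near an inaccessible station, rather than how many rows the join produced, the join output needs to be de-duplicated on a parcel key. The toy example below uses invented identifiers to show the difference; `parcel_id` and `station_id` are hypothetical column names, not fields confirmed in the PLUTO or MTA files.

```r
# Toy stand-in for the output of a distance join: one row per parcel-station pair.
# All identifiers are invented for illustration.
pairs <- data.frame(
  parcel_id  = c("P1", "P1", "P2", "P3", "P3", "P3"),
  station_id = c("S1", "S2", "S1", "S2", "S3", "S4"),
  stringsAsFactors = FALSE
)

n_pairs   <- nrow(pairs)                      # rows returned by the join
n_parcels <- length(unique(pairs$parcel_id))  # distinct parcels

cat("Parcel-station pairs:", n_pairs, "\n")
cat("Distinct parcels:    ", n_parcels, "\n")

# Number of stations each parcel falls within range of
table(pairs$parcel_id)
```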

Top Stations by ZFA Development Potential

Code
top_stations <- nearby_clean %>%
  group_by(`Stop Name`, Borough) %>%
  summarise(
    parcels_nearby = n(),
    avg_floors = round(mean(numfloors, na.rm = TRUE), 1),
    avg_assess = round(mean(assesstot, na.rm = TRUE) / 1e6, 1),
    .groups = "drop"
  ) %>%
  arrange(desc(parcels_nearby)) %>%
  slice_head(n = 20)

top_stations %>%
  ggplot(aes(x = reorder(`Stop Name`, parcels_nearby), y = parcels_nearby, fill = Borough)) +
  geom_col() +
  coord_flip() +
  labs(title = "Top 20 Inaccessible Stations by ZFA Development Opportunity",
       subtitle = "Number of high-density parcels within 500m",
       x = NULL, y = "Nearby High-Density Parcels", fill = "Borough") +
  theme_minimal()

Inaccessible stations with the most nearby high-density development parcels
Code
top_stations %>%
  rename(
    Station = `Stop Name`,
    `Nearby Parcels` = parcels_nearby,
    `Avg Floors` = avg_floors,
    `Avg Assessed Value ($M)` = avg_assess
  ) %>%
  kable(caption = "Top 20 Inaccessible Stations — ZFA Development Potential") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
Top 20 Inaccessible Stations — ZFA Development Potential
Station               Borough  Nearby Parcels  Avg Floors  Avg Assessed Value ($M)
Canal St              M                  4787         5.9                      5.6
23 St                 M                  2331         9.0                      9.5
28 St                 M                  1303        10.5                     12.1
Grand St              M                  1244         5.3                      2.2
Bowery                M                  1237         5.4                      2.9
Delancey St-Essex St  M                  1218         5.4                      2.4
Spring St             M                  1161         5.5                      5.0
4 Av-9 St             Bk                  975         2.8                      1.0
Prince St             M                   943         5.5                      5.7
Franklin St           M                   792         6.4                      7.4
Wall St               M                   697        17.3                     26.9
2 Av                  M                   686         5.6                      3.3
Broadway Junction     Bk                  635         1.9                      0.4
39 Av-Dutch Kills     Q                   580         3.1                      1.5
Morgan Av             Bk                  580         2.2                      0.8
25 St                 Bk                  577         2.2                      0.6
33 St                 M                   561        10.7                     11.2
Liberty Av            Bk                  547         2.2                      0.5
5 Av                  M                   521        15.1                     28.8
Atlantic Av           Bk                  515         2.1                      0.4
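
Stop names are not unique across the system, so grouping by `Stop Name` and `Borough` can merge several physical stations into one row, which may be inflating entries such as Canal St and 23 St. The sketch below uses invented data and a hypothetical `station_id` column to show how adding a unique identifier to the grouping keeps same-named stations apart. Whether the MTA file exposes such a column under that name is an assumption.

```r
library(dplyr)

# Invented example: two different stations share the name "Canal St"
toy_pairs <- tibble(
  station_id = c(101, 101, 101, 202, 202, 303),   # hypothetical unique ID
  stop_name  = c("Canal St", "Canal St", "Canal St",
                 "Canal St", "Canal St", "Grand St"),
  borough    = "M"
)

# Grouping by name merges both Canal St stations into one row
by_name <- toy_pairs %>%
  count(stop_name, borough, name = "parcels_nearby")

# Adding the ID to the grouping keeps them separate
by_id <- toy_pairs %>%
  count(station_id, stop_name, borough, name = "parcels_nearby")

by_name
by_id
```

In the toy data the name-only grouping reports one Canal St row with five parcels, while the ID-based grouping reports two Canal St rows with three and two.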

Statistical Analysis: Is Accessibility Distributed Equally Across Boroughs?

A chi-square goodness-of-fit test lets us determine whether the distribution of inaccessible stations across boroughs is statistically different from what we’d expect if accessibility were distributed proportionally to each borough’s share of total stations.

Code
# Observed inaccessible stations per borough
observed <- mta %>%
  group_by(borough_full) %>%
  summarise(
    total        = n(),
    inaccessible = sum(ADA == 0)
  ) %>%
  arrange(borough_full)

# Expected: if inaccessibility rate were uniform across boroughs
overall_inacc_rate <- sum(mta$ADA == 0) / nrow(mta)
expected_counts <- observed$total * overall_inacc_rate

# Chi-square test
chi_result <- chisq.test(
  x = observed$inaccessible,
  p = observed$total / sum(observed$total)
)

chi_result

    Chi-squared test for given probabilities

data:  observed$inaccessible
X-squared = 4.516, df = 4, p-value = 0.3407
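
The statistic reported above can be reproduced by hand from the borough table, which makes explicit what the goodness-of-fit test compares: the observed inaccessible counts against counts allocated in proportion to each borough's share of stations. The counts below are typed in from the tables earlier in this report.

```r
# Borough order: Bronx, Brooklyn, Manhattan, Queens, Staten Island
total        <- c(70, 169, 153, 83, 21)
inaccessible <- c(49, 124, 84, 55, 15)

# Expected inaccessible counts if every borough had the system-wide rate
expected <- sum(inaccessible) * total / sum(total)

# Pearson chi-square statistic and its p-value on k - 1 degrees of freedom
x_sq    <- sum((inaccessible - expected)^2 / expected)
df      <- length(total) - 1
p_value <- pchisq(x_sq, df = df, lower.tail = FALSE)

round(c(X_squared = x_sq, df = df, p_value = p_value), 4)
```

The result should agree with the X-squared = 4.516 and p-value = 0.3407 printed by `chisq.test()` above.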
Code
# Visualize observed vs expected
observed %>%
  mutate(
    expected    = round(expected_counts),
    diff        = inaccessible - expected
  ) %>%
  pivot_longer(cols = c(inaccessible, expected),
               names_to = "type", values_to = "count") %>%
  mutate(type = recode(type,
                       "inaccessible" = "Observed Inaccessible",
                       "expected"     = "Expected (if uniform rate)")) %>%
  ggplot(aes(x = borough_full, y = count, fill = type)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("Observed Inaccessible" = "#e74c3c",
                                "Expected (if uniform rate)" = "#95a5a6")) +
  labs(title = "Observed vs. Expected Inaccessible Stations by Borough",
       subtitle = paste0("Chi-square test: p = ", round(chi_result$p.value, 4)),
       x = NULL, y = "Number of Inaccessible Stations", fill = NULL) +
  theme_minimal()

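The chi-square result tells us whether the distribution of inaccessible stations across boroughs departs from each borough’s share of total stations by more than chance alone would predict. A p-value below 0.05 would indicate that some boroughs are systematically underserved relative to their size. Here the test gives X-squared = 4.516 on 4 degrees of freedom (p = 0.3407), so it does not reject the null hypothesis: by this test, the borough-level counts of inaccessible stations are consistent with a uniform inaccessibility rate.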
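
The goodness-of-fit test above looks only at the inaccessible counts. An alternative framing, offered here as a sketch rather than as the method used in this report, is a chi-square test of homogeneity on the 5 x 2 table of not-accessible versus fully or partially accessible stations per borough, which also uses the accessible counts. The counts are typed in from the borough table earlier in the report, and grouping partially accessible stations with fully accessible ones is an assumption of this sketch.

```r
boroughs       <- c("Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island")
not_accessible <- c(49, 124, 84, 55, 15)
total          <- c(70, 169, 153, 83, 21)

# 5 x 2 contingency table: rows are boroughs, columns are accessibility groups
tab <- cbind(
  not_accessible   = not_accessible,
  fully_or_partial = total - not_accessible
)
rownames(tab) <- boroughs

# Pearson chi-square test of homogeneity across boroughs
homog <- chisq.test(tab)
homog

# Standardized residuals show which cells depart most from a common rate
round(homog$stdres, 2)
```

On these counts the statistic comes out near 13.25 on 4 degrees of freedom (p of roughly 0.01), larger than the goodness-of-fit value because the accessible column contributes as well. The two tests answer slightly different questions, so the choice between them should follow from how the research question is posed.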


Interactive Map

Code
# Color palette for ADA status
pal <- colorFactor(
  palette = c("#2ecc71", "#f39c12", "#e74c3c"),
  levels  = c("Fully Accessible", "Partially Accessible", "Not Accessible")
)

leaflet(mta) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addCircleMarkers(
    lng   = ~`GTFS Longitude`,
    lat   = ~`GTFS Latitude`,
    color = ~pal(ada_status),
    radius = 5,
    stroke = FALSE,
    fillOpacity = 0.8,
    popup = ~paste0("<b>", `Stop Name`, "</b><br>",
                    "Lines: ", `Daytime Routes`, "<br>",
                    "Borough: ", borough_full, "<br>",
                    "ADA Status: ", ada_status)
  ) %>%
  addLegend("bottomright", pal = pal, values = ~ada_status,
            title = "ADA Status", opacity = 0.9)

Interactive map of MTA subway stations by accessibility status


Conclusions

Code
total_inaccessible <- sum(mta$ADA == 0)
total_cost <- total_inaccessible * cost_per_station / 1e9
zfa_stations <- n_distinct(nearby_clean$`Stop Name`)

This analysis reveals several key findings:

  1. The accessibility gap is large: 327 of 496 stations (65.9% of the system) are not accessible, and another 9 are only partially accessible.

  2. The funding need is enormous: at an estimated $65 million per station, full compliance would cost approximately $21.3 billion, far exceeding what the MTA can deliver through its Capital Plan alone.

  3. The ZFA program has significant untapped potential: at least 236 currently inaccessible stations (counted by distinct stop name) have high-density development parcels within 500 meters, representing real opportunities for private developers to fund accessibility upgrades in exchange for density bonuses.

  4. Brooklyn and Manhattan have the highest absolute need (124 and 84 inaccessible stations), though all five boroughs have significant gaps. Brooklyn (73.4%), Staten Island (71.4%), and the Bronx (70.0%) have the highest shares of inaccessible stations, compared with 54.9% in Manhattan, although the goodness-of-fit test above did not find these borough differences statistically significant (p = 0.34).

The ZFA program, if fully leveraged, could meaningfully accelerate the MTA’s goal of reaching 95% accessibility by 2055 without direct cost to taxpayers.


References