0. Packages

I used the kableExtra package to create tables and the sf and nycgeo packages to create maps. If needed, you can install them using the commands below.

install.packages("kableExtra")
install.packages("sf")
# nycgeo is installed from GitHub, which requires the remotes package
install.packages("remotes")
remotes::install_github("mfherman/nycgeo")
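
To avoid reinstalling packages that are already present, the install step could be made conditional. The following is a minimal sketch, not part of the original analysis; the helper names missing_packages() and install_if_missing() are hypothetical, and the installation call is left commented out so that nothing is downloaded unless you choose to run it.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
# system.file() returns "" when a package cannot be found.
missing_packages <- function(pkgs) {
  is_installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!is_installed]
}

# Hypothetical wrapper: install only what is missing.
install_if_missing <- function(cran_pkgs, github_repo = NULL, github_pkg = NULL) {
  to_install <- missing_packages(cran_pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # GitHub-only packages need remotes::install_github()
  if (!is.null(github_repo) && length(missing_packages(github_pkg)) > 0) {
    if (length(missing_packages("remotes")) > 0) {
      install.packages("remotes")
    }
    remotes::install_github(github_repo)
  }
  invisible(NULL)
}

# Example call (commented out so that nothing is installed by accident):
# install_if_missing(c("kableExtra", "sf"),
#                    github_repo = "mfherman/nycgeo",
#                    github_pkg = "nycgeo")
```
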


1. Introduction

Long-term exposure to air pollutants can harm health and increase rates of respiratory and cardiovascular disease. Common air pollutants include particulate matter, nitrogen oxides, and ozone. Particulate matter (PM) consists of very small particles of dust, dirt, or soot from sources such as construction or fires. Particles that are less than 2.5 micrometers in diameter (approximately 20 times smaller than the width of a human hair) are known as PM2.5. Nitrogen oxides, including nitrogen dioxide (NO2), are released when fuel (e.g., oil, natural gas) is burned, as in vehicle emissions. Ozone is produced by a chemical reaction between nitrogen oxides and volatile organic compounds in the presence of sunlight. As a result, ozone pollution tends to be worse in urban environments during the summer.

Because of the potential health impacts and costs of air pollution, New York City tracks the levels of common air pollutants across the city over time. Here, I analyze Anthony Conrardy’s NYC air quality dataset to compare air pollutant levels by time of year.


2. Data

2.1. Source

The dataset is from NYC OpenData, a publicly available collection of datasets generated by New York City government projects. The raw data was available as a CSV file, so I downloaded it and saved it to my GitHub repository.

2.2. Input

I read the CSV file into a data frame (tibble). Due to the file size, only the first 100 rows are shown.

nycaq_data_raw <- read_csv('https://media.githubusercontent.com/media/alexandersimon1/Data607/main/Project2/Dataset3/Air_Quality_20240226.csv', show_col_types = FALSE)
kbl_display(head(nycaq_data_raw, 100), "100%", "500px")
Unique ID Indicator ID Name Measure Measure Info Geo Type Name Geo Join ID Geo Place Name Time Period Start_Date Data Value Message
172653 375 Nitrogen dioxide (NO2) Mean ppb UHF34 203 Bedford Stuyvesant - Crown Heights Annual Average 2011 12/01/2010 25.30 NA
172585 375 Nitrogen dioxide (NO2) Mean ppb UHF34 203 Bedford Stuyvesant - Crown Heights Annual Average 2009 12/01/2008 26.93 NA
336637 375 Nitrogen dioxide (NO2) Mean ppb UHF34 204 East New York Annual Average 2015 01/01/2015 19.09 NA
336622 375 Nitrogen dioxide (NO2) Mean ppb UHF34 103 Fordham - Bronx Pk Annual Average 2015 01/01/2015 19.76 NA
172582 375 Nitrogen dioxide (NO2) Mean ppb UHF34 104 Pelham - Throgs Neck Annual Average 2009 12/01/2008 22.83 NA
667327 375 Nitrogen dioxide (NO2) Mean ppb UHF34 104 Pelham - Throgs Neck Annual Average 2020 01/01/2020 16.19 NA
172607 375 Nitrogen dioxide (NO2) Mean ppb UHF34 306308 Chelsea-Village Annual Average 2009 12/01/2008 38.16 NA
172675 375 Nitrogen dioxide (NO2) Mean ppb UHF34 306308 Chelsea-Village Annual Average 2011 12/01/2010 34.96 NA
175345 375 Nitrogen dioxide (NO2) Mean ppb UHF42 206 Borough Park Winter 2010-11 12/01/2010 30.10 NA
176689 375 Nitrogen dioxide (NO2) Mean ppb UHF42 206 Borough Park Annual Average 2013 12/01/2012 20.23 NA
176682 375 Nitrogen dioxide (NO2) Mean ppb UHF42 106 High Bridge - Morrisania Annual Average 2013 12/01/2012 23.73 NA
336507 375 Nitrogen dioxide (NO2) Mean ppb UHF42 106 High Bridge - Morrisania Winter 2014-15 12/01/2014 26.00 NA
740910 375 Nitrogen dioxide (NO2) Mean ppb UHF42 106 High Bridge - Morrisania Annual Average 2021 01/01/2021 18.04 NA
175348 375 Nitrogen dioxide (NO2) Mean ppb UHF42 209 Bensonhurst - Bay Ridge Winter 2010-11 12/01/2010 28.44 NA
175894 375 Nitrogen dioxide (NO2) Mean ppb UHF42 209 Bensonhurst - Bay Ridge Summer 2009 06/01/2009 18.95 NA
175895 375 Nitrogen dioxide (NO2) Mean ppb UHF42 210 Coney Island - Sheepshead Bay Summer 2009 06/01/2009 15.22 NA
175349 375 Nitrogen dioxide (NO2) Mean ppb UHF42 210 Coney Island - Sheepshead Bay Winter 2010-11 12/01/2010 25.70 NA
176693 375 Nitrogen dioxide (NO2) Mean ppb UHF42 210 Coney Island - Sheepshead Bay Annual Average 2013 12/01/2012 16.36 NA
741006 375 Nitrogen dioxide (NO2) Mean ppb UHF42 410 Rockaways Annual Average 2021 01/01/2021 11.41 NA
550028 375 Nitrogen dioxide (NO2) Mean ppb CD 201 Mott Haven and Melrose (CD1) Annual Average 2017 01/01/2017 21.25 NA
336723 375 Nitrogen dioxide (NO2) Mean ppb CD 101 Financial District (CD1) Winter 2014-15 12/01/2014 30.40 NA
741126 375 Nitrogen dioxide (NO2) Mean ppb CD 101 Financial District (CD1) Annual Average 2021 01/01/2021 21.61 NA
165858 375 Nitrogen dioxide (NO2) Mean ppb CD 102 Greenwich Village and Soho (CD2) Winter 2010-11 12/01/2010 36.79 NA
166625 375 Nitrogen dioxide (NO2) Mean ppb CD 102 Greenwich Village and Soho (CD2) Summer 2009 06/01/2009 31.63 NA
167746 375 Nitrogen dioxide (NO2) Mean ppb CD 102 Greenwich Village and Soho (CD2) Annual Average 2013 12/01/2012 29.27 NA
336852 375 Nitrogen dioxide (NO2) Mean ppb CD 402 Woodside and Sunnyside (CD2) Winter 2014-15 12/01/2014 26.28 NA
741255 375 Nitrogen dioxide (NO2) Mean ppb CD 402 Woodside and Sunnyside (CD2) Annual Average 2021 01/01/2021 20.21 NA
171599 375 Nitrogen dioxide (NO2) Mean ppb UHF34 203 Bedford Stuyvesant - Crown Heights Winter 2009-10 12/01/2009 28.86 NA
549907 375 Nitrogen dioxide (NO2) Mean ppb UHF34 204 East New York Summer 2017 06/01/2017 13.39 NA
602850 375 Nitrogen dioxide (NO2) Mean ppb UHF34 204 East New York Summer 2018 06/01/2018 11.97 NA
171595 375 Nitrogen dioxide (NO2) Mean ppb UHF34 103 Fordham - Bronx Pk Winter 2009-10 12/01/2009 23.87 NA
549895 375 Nitrogen dioxide (NO2) Mean ppb UHF34 104 Pelham - Throgs Neck Summer 2017 06/01/2017 14.14 NA
643458 375 Nitrogen dioxide (NO2) Mean ppb UHF34 201 Greenpoint Winter 2018-19 12/01/2018 26.69 NA
211654 375 Nitrogen dioxide (NO2) Mean ppb UHF34 101 Kingsbridge - Riverdale Summer 2014 06/01/2014 13.55 NA
643446 375 Nitrogen dioxide (NO2) Mean ppb UHF34 101 Kingsbridge - Riverdale Winter 2018-19 12/01/2018 18.32 NA
211655 375 Nitrogen dioxide (NO2) Mean ppb UHF34 102 Northeast Bronx Summer 2014 06/01/2014 15.40 NA
643449 375 Nitrogen dioxide (NO2) Mean ppb UHF34 102 Northeast Bronx Winter 2018-19 12/01/2018 19.11 NA
211674 375 Nitrogen dioxide (NO2) Mean ppb UHF34 402 West Queens Summer 2014 06/01/2014 16.64 NA
643506 375 Nitrogen dioxide (NO2) Mean ppb UHF34 402 West Queens Winter 2018-19 12/01/2018 24.36 NA
172662 375 Nitrogen dioxide (NO2) Mean ppb UHF34 301 Washington Heights Annual Average 2011 12/01/2010 25.56 NA
667399 375 Nitrogen dioxide (NO2) Mean ppb UHF34 301 Washington Heights Annual Average 2020 01/01/2020 18.08 NA
175885 375 Nitrogen dioxide (NO2) Mean ppb UHF42 107 Hunts Point - Mott Haven Summer 2009 06/01/2009 23.00 NA
176683 375 Nitrogen dioxide (NO2) Mean ppb UHF42 107 Hunts Point - Mott Haven Annual Average 2013 12/01/2012 21.54 NA
740913 375 Nitrogen dioxide (NO2) Mean ppb UHF42 107 Hunts Point - Mott Haven Annual Average 2021 01/01/2021 18.52 NA
175892 375 Nitrogen dioxide (NO2) Mean ppb UHF42 207 East Flatbush - Flatbush Summer 2009 06/01/2009 21.39 NA
336531 375 Nitrogen dioxide (NO2) Mean ppb UHF42 207 East Flatbush - Flatbush Winter 2014-15 12/01/2014 25.70 NA
175347 375 Nitrogen dioxide (NO2) Mean ppb UHF42 208 Canarsie - Flatlands Winter 2010-11 12/01/2010 26.24 NA
175913 375 Nitrogen dioxide (NO2) Mean ppb UHF42 407 Southwest Queens Summer 2009 06/01/2009 18.59 NA
165871 375 Nitrogen dioxide (NO2) Mean ppb CD 203 Morrisania and Crotona (CD3) Winter 2010-11 12/01/2010 30.92 NA
167759 375 Nitrogen dioxide (NO2) Mean ppb CD 203 Morrisania and Crotona (CD3) Annual Average 2013 12/01/2012 22.32 NA
549998 375 Nitrogen dioxide (NO2) Mean ppb CD 103 Lower East Side and Chinatown (CD3) Annual Average 2017 01/01/2017 23.65 NA
172663 375 Nitrogen dioxide (NO2) Mean ppb UHF34 302 Central Harlem - Morningside Heights Annual Average 2011 12/01/2010 28.23 NA
211670 375 Nitrogen dioxide (NO2) Mean ppb UHF34 302 Central Harlem - Morningside Heights Summer 2014 06/01/2014 19.54 NA
336664 375 Nitrogen dioxide (NO2) Mean ppb UHF34 302 Central Harlem - Morningside Heights Annual Average 2015 01/01/2015 23.15 NA
643494 375 Nitrogen dioxide (NO2) Mean ppb UHF34 302 Central Harlem - Morningside Heights Winter 2018-19 12/01/2018 24.69 NA
549903 375 Nitrogen dioxide (NO2) Mean ppb UHF34 202 Downtown - Heights - Slope Winter 2016-17 12/01/2016 26.27 NA
602844 375 Nitrogen dioxide (NO2) Mean ppb UHF34 202 Downtown - Heights - Slope Summer 2018 06/01/2018 16.37 NA
602829 375 Nitrogen dioxide (NO2) Mean ppb UHF34 101 Kingsbridge - Riverdale Summer 2018 06/01/2018 11.69 NA
549886 375 Nitrogen dioxide (NO2) Mean ppb UHF34 101 Kingsbridge - Riverdale Summer 2017 06/01/2017 13.02 NA
211757 375 Nitrogen dioxide (NO2) Mean ppb UHF34 102 Northeast Bronx Annual Average 2014 12/01/2013 18.97 NA
549889 375 Nitrogen dioxide (NO2) Mean ppb UHF34 102 Northeast Bronx Summer 2017 06/01/2017 14.37 NA
602832 375 Nitrogen dioxide (NO2) Mean ppb UHF34 102 Northeast Bronx Summer 2018 06/01/2018 13.05 NA
175297 375 Nitrogen dioxide (NO2) Mean ppb UHF42 107 Hunts Point - Mott Haven Winter 2009-10 12/01/2009 26.48 NA
175381 375 Nitrogen dioxide (NO2) Mean ppb UHF42 107 Hunts Point - Mott Haven Winter 2011-12 12/01/2011 25.17 NA
211390 375 Nitrogen dioxide (NO2) Mean ppb Borough 1 Bronx Annual Average 2014 12/01/2013 19.80 NA
336652 375 Nitrogen dioxide (NO2) Mean ppb UHF34 209 Bensonhurst - Bay Ridge Annual Average 2015 01/01/2015 18.62 NA
643482 375 Nitrogen dioxide (NO2) Mean ppb UHF34 209 Bensonhurst - Bay Ridge Winter 2018-19 12/01/2018 20.53 NA
667381 375 Nitrogen dioxide (NO2) Mean ppb UHF34 209 Bensonhurst - Bay Ridge Annual Average 2020 01/01/2020 14.81 NA
172593 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Annual Average 2009 12/01/2008 27.13 NA
172661 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Annual Average 2011 12/01/2010 25.75 NA
336658 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Annual Average 2015 01/01/2015 21.27 NA
211668 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Summer 2014 06/01/2014 17.00 NA
336709 375 Nitrogen dioxide (NO2) Mean ppb UHF34 501502 Northern SI Annual Average 2015 01/01/2015 16.00 NA
643539 375 Nitrogen dioxide (NO2) Mean ppb UHF34 501502 Northern SI Winter 2018-19 12/01/2018 18.62 NA
667495 375 Nitrogen dioxide (NO2) Mean ppb UHF34 501502 Northern SI Annual Average 2020 01/01/2020 13.11 NA
549782 375 Nitrogen dioxide (NO2) Mean ppb UHF42 201 Greenpoint Annual Average 2017 01/01/2017 21.19 NA
740916 375 Nitrogen dioxide (NO2) Mean ppb UHF42 201 Greenpoint Annual Average 2021 01/01/2021 20.55 NA
175333 375 Nitrogen dioxide (NO2) Mean ppb UHF42 101 Kingsbridge - Riverdale Winter 2010-11 12/01/2010 25.48 NA
175371 375 Nitrogen dioxide (NO2) Mean ppb UHF42 501 Port Richmond Winter 2010-11 12/01/2010 27.72 NA
336606 375 Nitrogen dioxide (NO2) Mean ppb UHF42 501 Port Richmond Winter 2014-15 12/01/2014 22.69 NA
549875 375 Nitrogen dioxide (NO2) Mean ppb UHF42 501 Port Richmond Annual Average 2017 01/01/2017 16.98 NA
336744 375 Nitrogen dioxide (NO2) Mean ppb CD 108 Upper East Side (CD8) Winter 2014-15 12/01/2014 29.30 NA
741147 375 Nitrogen dioxide (NO2) Mean ppb CD 108 Upper East Side (CD8) Annual Average 2021 01/01/2021 20.37 NA
336750 375 Nitrogen dioxide (NO2) Mean ppb CD 110 Central Harlem (CD10) Winter 2014-15 12/01/2014 26.84 NA
550025 375 Nitrogen dioxide (NO2) Mean ppb CD 112 Washington Heights and Inwood (CD12) Annual Average 2017 01/01/2017 20.14 NA
165890 375 Nitrogen dioxide (NO2) Mean ppb CD 310 Bay Ridge and Dyker Heights (CD10) Winter 2010-11 12/01/2010 29.44 NA
166657 375 Nitrogen dioxide (NO2) Mean ppb CD 310 Bay Ridge and Dyker Heights (CD10) Summer 2009 06/01/2009 20.38 NA
167778 375 Nitrogen dioxide (NO2) Mean ppb CD 310 Bay Ridge and Dyker Heights (CD10) Annual Average 2013 12/01/2012 19.48 NA
336822 375 Nitrogen dioxide (NO2) Mean ppb CD 310 Bay Ridge and Dyker Heights (CD10) Winter 2014-15 12/01/2014 25.17 NA
741225 375 Nitrogen dioxide (NO2) Mean ppb CD 310 Bay Ridge and Dyker Heights (CD10) Annual Average 2021 01/01/2021 16.42 NA
550097 375 Nitrogen dioxide (NO2) Mean ppb CD 312 Borough Park (CD12) Annual Average 2017 01/01/2017 19.03 NA
741231 375 Nitrogen dioxide (NO2) Mean ppb CD 312 Borough Park (CD12) Annual Average 2021 01/01/2021 16.92 NA
549924 375 Nitrogen dioxide (NO2) Mean ppb UHF34 209 Bensonhurst - Bay Ridge Winter 2016-17 12/01/2016 23.78 NA
602865 375 Nitrogen dioxide (NO2) Mean ppb UHF34 209 Bensonhurst - Bay Ridge Summer 2018 06/01/2018 11.47 NA
667383 375 Nitrogen dioxide (NO2) Mean ppb UHF34 209 Bensonhurst - Bay Ridge Summer 2020 06/01/2020 9.80 NA
171607 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Winter 2009-10 12/01/2009 28.63 NA
211770 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Annual Average 2014 12/01/2013 22.20 NA
602871 375 Nitrogen dioxide (NO2) Mean ppb UHF34 211 Williamsburg - Bushwick Summer 2018 06/01/2018 15.84 NA
549897 375 Nitrogen dioxide (NO2) Mean ppb UHF34 104 Pelham - Throgs Neck Winter 2016-17 12/01/2016 25.91 NA
211777 375 Nitrogen dioxide (NO2) Mean ppb UHF34 403 Flushing - Clearview Annual Average 2014 12/01/2013 19.45 NA
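
The kbl_display() helper used above is not defined in this part of the report. For readers who want to run the code, the following is a hypothetical reconstruction that assumes the arguments are the object to display, the box width, and an optional box height. It relies on kbl(), kable_styling(), and scroll_box() from kableExtra; the styling options are guesses, and the original helper may differ.

```r
# Hypothetical reconstruction of kbl_display(); the original definition is not shown.
# x:      a data frame, tibble, or table() object
# width:  CSS width of the scroll box, e.g. "100%"
# height: optional CSS height of the scroll box, e.g. "500px"
kbl_display <- function(x, width = "100%", height = NULL) {
  # Coerce table() objects so they render with Var1 and Freq columns
  x <- as.data.frame(x)
  out <- kableExtra::kbl(x)
  # Styling choices here are assumptions, not taken from the source
  out <- kableExtra::kable_styling(out, bootstrap_options = c("striped", "condensed"))
  kableExtra::scroll_box(out, width = width, height = height)
}
```

With a definition along these lines, both call patterns in this report (with and without a height) would be valid.
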


2.3. Dimensions

The data frame has 16,218 rows (measurements) and 12 columns (variables).

dim(nycaq_data_raw)
## [1] 16218    12


3. Data checks and transformations

3.1. Understanding the column names

Some of the column names were confusing (e.g., what is the difference between “Measure” and “Measure Info”?). To figure out what kind of data was in these columns, I examined the distinct values in each column.

3.1.1. “Measure Info” column

The “Measure Info” column appears to show the unit of each measurement value.

kbl_display(table(nycaq_data_raw$`Measure Info`), "50%")
Var1                    Freq
mcg/m3                  5499
number                   288
per 100,000              192
per 100,000 adults      1152
per 100,000 children     576
per km2                  632
ppb                     7473
µg/m3                    406
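
The frequency table lists both “mcg/m3” and “µg/m3”, which are two spellings of the same unit (micrograms per cubic meter). The following is a minimal sketch of how the two labels could be merged before grouping by unit. It is not a step taken in the original analysis, the helper name harmonize_units() is hypothetical, and the input is toy data rather than the real column.

```r
# Toy values mirroring the unit labels in the frequency table (not the real data).
# "\u00b5" is the micro sign and "\u03bc" is the Greek letter mu; files may use either.
measure_info <- c("mcg/m3", "\u00b5g/m3", "ppb", "\u03bcg/m3", "per km2")

# Hypothetical helper: map both micro-sign spellings onto "mcg/m3".
harmonize_units <- function(x) {
  x[x %in% c("\u00b5g/m3", "\u03bcg/m3")] <- "mcg/m3"
  x
}

harmonized <- harmonize_units(measure_info)
table(harmonized)
```
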


To figure out what types of data had a unit of “number”, I examined the rows that had this value. Only rows with a “Measure” of “Number per km2” appear to have a “Measure Info” of “number”.

kbl_display(
  nycaq_data_raw %>%
    filter(`Measure Info` == "number"),
  "100%", "500px")
Unique ID Indicator ID Name Measure Measure Info Geo Type Name Geo Join ID Geo Place Name Time Period Start_Date Data Value Message
179789 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 202 Downtown - Heights - Slope 2015 01/01/2015 1.2 NA
130443 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 4 Queens 2013 01/01/2013 2.2 NA
179793 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 107 Hunts Point - Mott Haven 2015 01/01/2015 1.7 NA
179807 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 307 Gramercy Park - Murray Hill 2015 01/01/2015 41.5 NA
179792 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 206 Borough Park 2015 01/01/2015 1.1 NA
179772 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 409 Southeast Queens 2015 01/01/2015 0.3 NA
179809 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 3 Manhattan 2015 01/01/2015 26.8 NA
179802 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 309 Union Square - Lower East Side 2015 01/01/2015 14.1 NA
130408 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 205 Sunset Park 2013 01/01/2013 0.1 NA
130399 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 103 Fordham - Bronx Pk 2013 01/01/2013 24.5 NA
130427 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 403 Flushing - Clearview 2013 01/01/2013 5.1 NA
130419 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 305 Upper East Side 2013 01/01/2013 95.0 NA
179803 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 308 Greenwich Village - SoHo 2015 01/01/2015 17.8 NA
130425 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 401 Long Island City - Astoria 2013 01/01/2013 6.7 NA
130412 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 209 Bensonhurst - Bay Ridge 2013 01/01/2013 1.7 NA
130434 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 410 Rockaways 2013 01/01/2013 0.0 NA
179769 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 502 Stapleton - St. George 2015 01/01/2015 0.2 NA
179777 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 406 Fresh Meadows 2015 01/01/2015 3.1 NA
179799 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 302 Central Harlem - Morningside Heights 2015 01/01/2015 10.7 NA
179804 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 306 Chelsea - Clinton 2015 01/01/2015 36.4 NA
179825 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 406 Fresh Meadows 2015 01/01/2015 0.4 NA
179729 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 406 Fresh Meadows 2015 01/01/2015 14.3 NA
179852 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 306 Chelsea - Clinton 2015 01/01/2015 4.9 NA
179839 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 102 Northeast Bronx 2015 01/01/2015 0.3 NA
130430 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 406 Fresh Meadows 2013 01/01/2013 4.3 NA
179767 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 503 Willowbrook 2015 01/01/2015 0.0 NA
179781 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 405 Ridgewood - Forest Hills 2015 01/01/2015 1.3 NA
179779 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 403 Flushing - Clearview 2015 01/01/2015 3.0 NA
179795 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 303 East Harlem 2015 01/01/2015 4.8 NA
130416 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 302 Central Harlem - Morningside Heights 2013 01/01/2013 15.7 NA
130420 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 306 Chelsea - Clinton 2013 01/01/2013 67.4 NA
179733 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 405 Ridgewood - Forest Hills 2015 01/01/2015 22.5 NA
179829 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 405 Ridgewood - Forest Hills 2015 01/01/2015 0.2 NA
179749 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 103 Fordham - Bronx Pk 2015 01/01/2015 65.0 NA
179768 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 501 Port Richmond 2015 01/01/2015 0.0 NA
179787 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 401 Long Island City - Astoria 2015 01/01/2015 5.0 NA
179732 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 201 Greenpoint 2015 01/01/2015 18.8 NA
179828 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 201 Greenpoint 2015 01/01/2015 0.0 NA
179784 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 402 West Queens 2015 01/01/2015 1.1 NA
130435 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 501 Port Richmond 2013 01/01/2013 0.0 NA
179776 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 407 Southwest Queens 2015 01/01/2015 0.5 NA
179774 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 408 Jamaica 2015 01/01/2015 0.4 NA
179763 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 1 Bronx 2015 01/01/2015 39.5 NA
179759 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 307 Gramercy Park - Murray Hill 2015 01/01/2015 256.2 NA
179855 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 307 Gramercy Park - Murray Hill 2015 01/01/2015 5.5 NA
179742 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 207 East Flatbush - Flatbush 2015 01/01/2015 33.2 NA
179838 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 207 East Flatbush - Flatbush 2015 01/01/2015 0.2 NA
179745 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 107 Hunts Point - Mott Haven 2015 01/01/2015 35.5 NA
179841 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 107 Hunts Point - Mott Haven 2015 01/01/2015 0.3 NA
130432 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 408 Jamaica 2013 01/01/2013 0.5 NA
130431 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 407 Southwest Queens 2013 01/01/2013 0.8 NA
179770 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 410 Rockaways 2015 01/01/2015 0.0 NA
179801 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 310 Lower Manhattan 2015 01/01/2015 8.3 NA
179785 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 209 Bensonhurst - Bay Ridge 2015 01/01/2015 1.2 NA
130422 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 308 Greenwich Village - SoHo 2013 01/01/2013 32.9 NA
130421 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 307 Gramercy Park - Murray Hill 2013 01/01/2013 78.8 NA
130411 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 208 Canarsie - Flatlands 2013 01/01/2013 0.0 NA
130459 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 208 Canarsie - Flatlands 2013 01/01/2013 0.0 NA
179849 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 310 Lower Manhattan 2015 01/01/2015 1.2 NA
179753 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 310 Lower Manhattan 2015 01/01/2015 114.9 NA
179840 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 206 Borough Park 2015 01/01/2015 0.2 NA
179744 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 206 Borough Park 2015 01/01/2015 34.4 NA
179750 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 106 High Bridge - Morrisania 2015 01/01/2015 72.2 NA
130437 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 503 Willowbrook 2013 01/01/2013 0.0 NA
130429 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 405 Ridgewood - Forest Hills 2013 01/01/2013 2.7 NA
130417 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 303 East Harlem 2013 01/01/2013 11.6 NA
130433 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 409 Southeast Queens 2013 01/01/2013 0.3 NA
130424 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 310 Lower Manhattan 2013 01/01/2013 13.7 NA
130423 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 309 Union Square - Lower East Side 2013 01/01/2013 24.9 NA
130460 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 209 Bensonhurst - Bay Ridge 2013 01/01/2013 0.2 NA
179766 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 504 South Beach - Tottenville 2015 01/01/2015 0.0 NA
130415 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 301 Washington Heights 2013 01/01/2013 51.0 NA
130436 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 502 Stapleton - St. George 2013 01/01/2013 0.4 NA
130426 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 402 West Queens 2013 01/01/2013 2.3 NA
130454 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 203 Bedford Stuyvesant - Crown Heights 2013 01/01/2013 0.1 NA
130471 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 309 Union Square - Lower East Side 2013 01/01/2013 3.0 NA
130519 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 309 Union Square - Lower East Side 2013 01/01/2013 126.1 NA
130457 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 206 Borough Park 2013 01/01/2013 0.2 NA
130498 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 106 High Bridge - Morrisania 2013 01/01/2013 78.0 NA
130450 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 106 High Bridge - Morrisania 2013 01/01/2013 2.4 NA
130490 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 3 Manhattan 2013 01/01/2013 5.9 NA
130505 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 206 Borough Park 2013 01/01/2013 34.8 NA
130482 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 410 Rockaways 2013 01/01/2013 0.0 NA
130481 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 409 Southeast Queens 2013 01/01/2013 0.0 NA
130529 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 409 Southeast Queens 2013 01/01/2013 8.1 NA
130528 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 408 Jamaica 2013 01/01/2013 13.2 NA
130480 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 408 Jamaica 2013 01/01/2013 0.1 NA
130527 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 407 Southwest Queens 2013 01/01/2013 14.5 NA
130479 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 407 Southwest Queens 2013 01/01/2013 0.1 NA
130517 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 307 Gramercy Park - Murray Hill 2013 01/01/2013 284.7 NA
130469 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 307 Gramercy Park - Murray Hill 2013 01/01/2013 9.0 NA
130458 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 207 East Flatbush - Flatbush 2013 01/01/2013 0.3 NA
130518 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 308 Greenwich Village - SoHo 2013 01/01/2013 132.5 NA
130470 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 308 Greenwich Village - SoHo 2013 01/01/2013 4.1 NA
130506 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 207 East Flatbush - Flatbush 2013 01/01/2013 33.5 NA
130507 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 208 Canarsie - Flatlands 2013 01/01/2013 6.7 NA
130500 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 201 Greenpoint 2013 01/01/2013 18.9 NA
130511 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 301 Washington Heights 2013 01/01/2013 115.3 NA
130493 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 101 Kingsbridge - Riverdale 2013 01/01/2013 42.5 NA
130494 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 102 Northeast Bronx 2013 01/01/2013 33.8 NA
130446 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 102 Northeast Bronx 2013 01/01/2013 0.3 NA
130501 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 202 Downtown - Heights - Slope 2013 01/01/2013 32.0 NA
130453 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 202 Downtown - Heights - Slope 2013 01/01/2013 0.1 NA
130504 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 205 Sunset Park 2013 01/01/2013 13.4 NA
130456 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 205 Sunset Park 2013 01/01/2013 0.0 NA
130515 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 305 Upper East Side 2013 01/01/2013 269.8 NA
130475 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 403 Flushing - Clearview 2013 01/01/2013 0.6 NA
130523 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 403 Flushing - Clearview 2013 01/01/2013 18.7 NA
130488 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 1 Bronx 2013 01/01/2013 1.2 NA
130536 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 1 Bronx 2013 01/01/2013 42.7 NA
179858 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 2 Brooklyn 2015 01/01/2015 0.1 NA
179762 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 2 Brooklyn 2015 01/01/2015 22.8 NA
179728 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 407 Southwest Queens 2015 01/01/2015 14.3 NA
179824 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 407 Southwest Queens 2015 01/01/2015 0.1 NA
179842 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 101 Kingsbridge - Riverdale 2015 01/01/2015 1.3 NA
179746 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 101 Kingsbridge - Riverdale 2015 01/01/2015 35.8 NA
130464 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 302 Central Harlem - Morningside Heights 2013 01/01/2013 2.0 NA
130512 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 302 Central Harlem - Morningside Heights 2013 01/01/2013 82.1 NA
130509 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 210 Coney Island - Sheepshead Bay 2013 01/01/2013 23.7 NA
130461 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 210 Coney Island - Sheepshead Bay 2013 01/01/2013 0.1 NA
130508 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 209 Bensonhurst - Bay Ridge 2013 01/01/2013 26.5 NA
130538 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 3 Manhattan 2013 01/01/2013 161.1 NA
179724 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 409 Southeast Queens 2015 01/01/2015 8.1 NA
179818 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 410 Rockaways 2015 01/01/2015 0.0 NA
179820 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 409 Southeast Queens 2015 01/01/2015 0.0 NA
179830 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 210 Coney Island - Sheepshead Bay 2015 01/01/2015 0.1 NA
179734 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 210 Coney Island - Sheepshead Bay 2015 01/01/2015 23.5 NA
179827 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 403 Flushing - Clearview 2015 01/01/2015 0.4 NA
179731 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 403 Flushing - Clearview 2015 01/01/2015 17.0 NA
179822 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 408 Jamaica 2015 01/01/2015 0.1 NA
179859 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 1 Bronx 2015 01/01/2015 0.9 NA
179832 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 402 West Queens 2015 01/01/2015 0.2 NA
179847 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 302 Central Harlem - Morningside Heights 2015 01/01/2015 1.6 NA
179751 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 302 Central Harlem - Morningside Heights 2015 01/01/2015 77.8 NA
179756 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 306 Chelsea - Clinton 2015 01/01/2015 181.5 NA
130532 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 502 Stapleton - St. George 2013 01/01/2013 4.7 NA
130484 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 502 Stapleton - St. George 2013 01/01/2013 0.0 NA
130522 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 402 West Queens 2013 01/01/2013 24.6 NA
130474 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 402 West Queens 2013 01/01/2013 0.3 NA
130409 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 206 Borough Park 2013 01/01/2013 1.5 NA
130472 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 310 Lower Manhattan 2013 01/01/2013 1.7 NA
130520 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 310 Lower Manhattan 2013 01/01/2013 118.7 NA
130530 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 410 Rockaways 2013 01/01/2013 6.1 NA
130404 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 201 Greenpoint 2013 01/01/2013 0.4 NA
130463 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 301 Washington Heights 2013 01/01/2013 5.8 NA
130445 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 101 Kingsbridge - Riverdale 2013 01/01/2013 2.0 NA
130521 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 401 Long Island City - Astoria 2013 01/01/2013 30.6 NA
130452 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 201 Greenpoint 2013 01/01/2013 0.1 NA
130531 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 501 Port Richmond 2013 01/01/2013 2.8 NA
130483 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 501 Port Richmond 2013 01/01/2013 0.0 NA
130473 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 401 Long Island City - Astoria 2013 01/01/2013 0.8 NA
130495 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 103 Fordham - Bronx Pk 2013 01/01/2013 71.0 NA
130525 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 405 Ridgewood - Forest Hills 2013 01/01/2013 23.6 NA
130477 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 405 Ridgewood - Forest Hills 2013 01/01/2013 0.3 NA
130502 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 203 Bedford Stuyvesant - Crown Heights 2013 01/01/2013 31.7 NA
130485 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 503 Willowbrook 2013 01/01/2013 0.0 NA
130533 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 503 Willowbrook 2013 01/01/2013 2.1 NA
130513 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 303 East Harlem 2013 01/01/2013 55.8 NA
130465 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 303 East Harlem 2013 01/01/2013 1.3 NA
130447 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 103 Fordham - Bronx Pk 2013 01/01/2013 3.0 NA
130467 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 305 Upper East Side 2013 01/01/2013 10.8 NA
130405 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 202 Downtown - Heights - Slope 2013 01/01/2013 0.8 NA
130478 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 406 Fresh Meadows 2013 01/01/2013 0.5 NA
130526 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 406 Fresh Meadows 2013 01/01/2013 15.3 NA
130468 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 306 Chelsea - Clinton 2013 01/01/2013 7.7 NA
130516 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 306 Chelsea - Clinton 2013 01/01/2013 204.8 NA
130491 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 4 Queens 2013 01/01/2013 0.3 NA
130539 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 4 Queens 2013 01/01/2013 16.1 NA
130410 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 207 East Flatbush - Flatbush 2013 01/01/2013 2.3 NA
130537 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 2 Brooklyn 2013 01/01/2013 22.8 NA
179773 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 404 Bayside - Little Neck 2015 01/01/2015 0.9 NA
179786 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 211 Williamsburg - Bushwick 2015 01/01/2015 0.3 NA
179805 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 304 Upper West Side 2015 01/01/2015 50.9 NA
179718 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 504 South Beach - Tottenville 2015 01/01/2015 2.0 NA
179834 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 211 Williamsburg - Bushwick 2015 01/01/2015 0.1 NA
179738 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 211 Williamsburg - Bushwick 2015 01/01/2015 27.8 NA
130438 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 504 South Beach - Tottenville 2013 01/01/2013 0.0 NA
179860 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 4 Queens 2015 01/01/2015 0.2 NA
179764 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 4 Queens 2015 01/01/2015 15.6 NA
179848 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 301 Washington Heights 2015 01/01/2015 4.2 NA
179752 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 301 Washington Heights 2015 01/01/2015 100.6 NA
179816 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 501 Port Richmond 2015 01/01/2015 0.0 NA
179835 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 401 Long Island City - Astoria 2015 01/01/2015 0.7 NA
179739 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 401 Long Island City - Astoria 2015 01/01/2015 29.3 NA
179720 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 501 Port Richmond 2015 01/01/2015 2.8 NA
179743 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 102 Northeast Bronx 2015 01/01/2015 33.3 NA
179837 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 202 Downtown - Heights - Slope 2015 01/01/2015 0.2 NA
179741 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 202 Downtown - Heights - Slope 2015 01/01/2015 32.5 NA
179740 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 203 Bedford Stuyvesant - Crown Heights 2015 01/01/2015 31.9 NA
179836 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 203 Bedford Stuyvesant - Crown Heights 2015 01/01/2015 0.2 NA
179719 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 503 Willowbrook 2015 01/01/2015 2.1 NA
179815 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 503 Willowbrook 2015 01/01/2015 0.0 NA
179747 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 303 East Harlem 2015 01/01/2015 50.1 NA
179843 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 303 East Harlem 2015 01/01/2015 0.7 NA
179727 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 205 Sunset Park 2015 01/01/2015 13.9 NA
179823 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 205 Sunset Park 2015 01/01/2015 0.1 NA
179845 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 103 Fordham - Bronx Pk 2015 01/01/2015 2.3 NA
179758 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 305 Upper East Side 2015 01/01/2015 225.9 NA
179819 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 208 Canarsie - Flatlands 2015 01/01/2015 0.0 NA
179723 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 208 Canarsie - Flatlands 2015 01/01/2015 6.6 NA
179755 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 308 Greenwich Village - SoHo 2015 01/01/2015 121.3 NA
179851 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 308 Greenwich Village - SoHo 2015 01/01/2015 2.6 NA
179726 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 408 Jamaica 2015 01/01/2015 13.1 NA
179736 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 402 West Queens 2015 01/01/2015 23.7 NA
179817 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 502 Stapleton - St. George 2015 01/01/2015 0.0 NA
179721 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 502 Stapleton - St. George 2015 01/01/2015 4.6 NA
179854 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 305 Upper East Side 2015 01/01/2015 5.7 NA
179844 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 105 Crotona -Tremont 2015 01/01/2015 1.3 NA
179748 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 105 Crotona -Tremont 2015 01/01/2015 56.0 NA
179800 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 301 Washington Heights 2015 01/01/2015 32.0 NA
179780 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 201 Greenpoint 2015 01/01/2015 0.3 NA
179794 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 101 Kingsbridge - Riverdale 2015 01/01/2015 9.1 NA
179850 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 309 Union Square - Lower East Side 2015 01/01/2015 2.0 NA
179737 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 209 Bensonhurst - Bay Ridge 2015 01/01/2015 26.1 NA
179833 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 209 Bensonhurst - Bay Ridge 2015 01/01/2015 0.2 NA
179722 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 410 Rockaways 2015 01/01/2015 6.1 NA
179857 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 3 Manhattan 2015 01/01/2015 3.7 NA
179754 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 309 Union Square - Lower East Side 2015 01/01/2015 117.5 NA
179846 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 106 High Bridge - Morrisania 2015 01/01/2015 1.9 NA
179761 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 3 Manhattan 2015 01/01/2015 142.8 NA
179806 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 305 Upper East Side 2015 01/01/2015 39.4 NA
179775 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 205 Sunset Park 2015 01/01/2015 0.5 NA
179788 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 203 Bedford Stuyvesant - Crown Heights 2015 01/01/2015 1.0 NA
179797 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 103 Fordham - Bronx Pk 2015 01/01/2015 16.6 NA
130497 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 105 Crotona -Tremont 2013 01/01/2013 62.5 NA
130449 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 105 Crotona -Tremont 2013 01/01/2013 2.0 NA
130406 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 203 Bedford Stuyvesant - Crown Heights 2013 01/01/2013 0.9 NA
179790 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 207 East Flatbush - Flatbush 2015 01/01/2015 1.7 NA
130451 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 107 Hunts Point - Mott Haven 2013 01/01/2013 0.4 NA
130499 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 107 Hunts Point - Mott Haven 2013 01/01/2013 36.8 NA
130489 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 2 Brooklyn 2013 01/01/2013 0.1 NA
179782 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 210 Coney Island - Sheepshead Bay 2015 01/01/2015 0.7 NA
130413 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 210 Coney Island - Sheepshead Bay 2013 01/01/2013 0.9 NA
130428 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 404 Bayside - Little Neck 2013 01/01/2013 1.6 NA
130418 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 304 Upper West Side 2013 01/01/2013 99.7 NA
130534 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 504 South Beach - Tottenville 2013 01/01/2013 2.0 NA
179791 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 102 Northeast Bronx 2015 01/01/2015 1.6 NA
130462 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 211 Williamsburg - Bushwick 2013 01/01/2013 0.1 NA
130510 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 211 Williamsburg - Bushwick 2013 01/01/2013 27.8 NA
130503 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 204 East New York 2013 01/01/2013 14.7 NA
130403 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 107 Hunts Point - Mott Haven 2013 01/01/2013 3.3 NA
130440 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 1 Bronx 2013 01/01/2013 10.3 NA
130398 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 102 Northeast Bronx 2013 01/01/2013 2.2 NA
130401 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 105 Crotona -Tremont 2013 01/01/2013 16.9 NA
130535 642 Boiler Emissions- Total NOx Emissions Number per km2 number Citywide 1 New York City 2013 01/01/2013 29.4 NA
130487 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Citywide 1 New York City 2013 01/01/2013 0.7 NA
130439 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Citywide 1 New York City 2013 01/01/2013 6.1 NA
130455 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 204 East New York 2013 01/01/2013 0.0 NA
130448 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 104 Pelham - Throgs Neck 2013 01/01/2013 0.5 NA
130496 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 104 Pelham - Throgs Neck 2013 01/01/2013 24.9 NA
130476 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 404 Bayside - Little Neck 2013 01/01/2013 0.2 NA
130524 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 404 Bayside - Little Neck 2013 01/01/2013 10.4 NA
130514 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 304 Upper West Side 2013 01/01/2013 247.9 NA
130466 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 304 Upper West Side 2013 01/01/2013 11.4 NA
179810 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 2 Brooklyn 2015 01/01/2015 0.7 NA
179798 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 106 High Bridge - Morrisania 2015 01/01/2015 12.9 NA
130402 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 106 High Bridge - Morrisania 2013 01/01/2013 19.7 NA
130442 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 3 Manhattan 2013 01/01/2013 50.6 NA
179765 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 5 Staten Island 2015 01/01/2015 2.7 NA
179826 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 204 East New York 2015 01/01/2015 0.0 NA
179730 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 204 East New York 2015 01/01/2015 14.7 NA
179735 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 104 Pelham - Throgs Neck 2015 01/01/2015 23.6 NA
179831 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 104 Pelham - Throgs Neck 2015 01/01/2015 0.4 NA
179814 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 504 South Beach - Tottenville 2015 01/01/2015 0.0 NA
179725 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 404 Bayside - Little Neck 2015 01/01/2015 9.8 NA
179821 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 404 Bayside - Little Neck 2015 01/01/2015 0.2 NA
179771 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 208 Canarsie - Flatlands 2015 01/01/2015 0.0 NA
179811 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 1 Bronx 2015 01/01/2015 6.3 NA
130441 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 2 Brooklyn 2013 01/01/2013 0.8 NA
179796 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 105 Crotona -Tremont 2015 01/01/2015 8.8 NA
179812 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 4 Queens 2015 01/01/2015 1.4 NA
130397 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 101 Kingsbridge - Riverdale 2013 01/01/2013 17.6 NA
179760 642 Boiler Emissions- Total NOx Emissions Number per km2 number Citywide 1 New York City 2015 01/01/2015 27.4 NA
179856 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Citywide 1 New York City 2015 01/01/2015 0.5 NA
179808 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Citywide 1 New York City 2015 01/01/2015 3.5 NA
130486 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 504 South Beach - Tottenville 2013 01/01/2013 0.0 NA
130492 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 5 Staten Island 2013 01/01/2013 0.0 NA
130540 642 Boiler Emissions- Total NOx Emissions Number per km2 number Borough 5 Staten Island 2013 01/01/2013 2.7 NA
179757 642 Boiler Emissions- Total NOx Emissions Number per km2 number UHF42 304 Upper West Side 2015 01/01/2015 210.5 NA
179853 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number UHF42 304 Upper West Side 2015 01/01/2015 7.0 NA
179861 641 Boiler Emissions- Total PM2.5 Emissions Number per km2 number Borough 5 Staten Island 2015 01/01/2015 0.0 NA
179778 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 204 East New York 2015 01/01/2015 0.1 NA
130407 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 204 East New York 2013 01/01/2013 0.1 NA
130414 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 211 Williamsburg - Bushwick 2013 01/01/2013 0.3 NA
179783 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 104 Pelham - Throgs Neck 2015 01/01/2015 2.8 NA
179813 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 5 Staten Island 2015 01/01/2015 0.1 NA
130400 640 Boiler Emissions- Total SO2 Emissions Number per km2 number UHF42 104 Pelham - Throgs Neck 2013 01/01/2013 4.4 NA
130444 640 Boiler Emissions- Total SO2 Emissions Number per km2 number Borough 5 Staten Island 2013 01/01/2013 0.1 NA


3.1.2. “Measure” column

The “Measure” column appeared to be a combination of units, which are redundant with the “Measure Info” column, and statistical descriptions (eg, average). This column does not seem necessary.

kbl_display(table(nycaq_data_raw$Measure), "50%")
Var1 Freq
Annual average concentration 406
Estimated annual rate 576
Estimated annual rate (age 18+) 576
Estimated annual rate (age 30+) 192
Estimated annual rate (under age 18) 576
Mean 12972
million miles 632
Number per km2 288


3.1.3. “Name” column

This column revealed a mixture of pollutant names (eg, NO2, ozone) and unrelated statistics, such as vehicle miles traveled, hospitalizations, and deaths.

kbl_display(table(nycaq_data_raw$`Name`), "50%", "500px")
Var1 Freq
Annual vehicle miles traveled 209
Annual vehicle miles travelled (cars) 214
Annual vehicle miles travelled (trucks) 209
Asthma emergency department visits due to PM2.5 384
Asthma emergency departments visits due to Ozone 384
Asthma hospitalizations due to Ozone 384
Boiler Emissions- Total NOx Emissions 96
Boiler Emissions- Total PM2.5 Emissions 96
Boiler Emissions- Total SO2 Emissions 96
Cardiac and respiratory deaths due to Ozone 192
Cardiovascular hospitalizations due to PM2.5 (age 40+) 192
Deaths due to PM2.5 192
Fine particles (PM 2.5) 5499
Nitrogen dioxide (NO2) 5499
Outdoor Air Toxics - Benzene 203
Outdoor Air Toxics - Formaldehyde 203
Ozone (O3) 1974
Respiratory hospitalizations due to PM2.5 (age 20+) 192


3.1.4. “Time Period” column

This column shows the year(s) of the observation, either as a single number (eg, 2005) or as a range (eg, 2005-2007). Since there is also a “Start_Date” column, the year in the “Time Period” column may be redundant.

Some rows also include the season (summer or winter). The dataset does not appear to have data from spring or fall. It is not clear whether this is because no measurements were collected in spring and fall, or because each year is divided into 6-month intervals (ie, summer and winter).

kbl_display(table(nycaq_data_raw$`Time Period`), "50%", "500px")
Var1 Freq
2-Year Summer Average 2009-2010 141
2005 407
2005-2007 480
2009-2011 480
2011 214
2012-2014 480
2013 144
2014 96
2015 144
2015-2017 480
2016 321
Annual Average 2009 282
Annual Average 2010 282
Annual Average 2011 282
Annual Average 2012 282
Annual Average 2013 282
Annual Average 2014 282
Annual Average 2015 282
Annual Average 2016 282
Annual Average 2017 282
Annual Average 2018 282
Annual Average 2019 282
Annual Average 2020 282
Annual Average 2021 282
Summer 2009 423
Summer 2010 423
Summer 2011 423
Summer 2012 423
Summer 2013 423
Summer 2014 423
Summer 2015 423
Summer 2016 423
Summer 2017 423
Summer 2018 423
Summer 2019 423
Summer 2020 423
Summer 2021 423
Winter 2008-09 282
Winter 2009-10 282
Winter 2010-11 282
Winter 2011-12 282
Winter 2012-13 282
Winter 2013-14 282
Winter 2014-15 282
Winter 2015-16 282
Winter 2016-17 282
Winter 2017-18 282
Winter 2018-19 282
Winter 2019-20 282
Winter 2020-21 282


3.1.5. “Start_Date” column

This column shows the start date of the period given in the “Time Period” column; however, examining the months and days shows that there are only 3 different month values (1, 6, and 12). Presumably these correspond to the start of the year (January) for the annual averages and to June and December for the summer and winter seasons.

nycaq_data_dates <- nycaq_data_raw %>%
  select(Start_Date) %>%
  mutate(
    Start_Date = as.Date(Start_Date, format = "%m/%d/%Y"),    
    Start_month = as.numeric(format(Start_Date, "%m")),
    Start_day = as.numeric(format(Start_Date, "%d"))    
  )
kbl_display(table(nycaq_data_dates$Start_month), "20%")
Var1 Freq
1 4938
6 5640
12 5640


Similarly, there are only a few different values for the day. Nearly all the day values are 1, with relatively few values of 2 or 31, suggesting that the exceptions could be typos or the result of data entry on days adjacent to holidays (eg, December 31, January 2).

kbl_display(table(nycaq_data_dates$Start_day), "20%")
Var1 Freq
1 15456
2 480
31 282


Together, these results show that the “Start_Date” column is mostly redundant with the “Time Period” column.
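
One way to probe the presumed link between start month and period type is to cross-tabulate the two. The sketch below uses only four toy rows whose (Start_Date, Time Period) pairs appear elsewhere in this report; the data frame `toy` and its column names are illustrative assumptions, and the full dataset would be needed to draw any conclusion.

```r
# Toy rows: (Start_Date, Time Period) pairs visible in the tables of this report
toy <- data.frame(
  Start_Date  = c("01/01/2015", "01/01/2013", "12/01/2010", "12/01/2008"),
  Time_Period = c("2015", "2013", "Annual Average 2011", "Annual Average 2009"),
  stringsAsFactors = FALSE
)

# Extract the start month, as in the code above
toy$start_month <- as.numeric(
  format(as.Date(toy$Start_Date, format = "%m/%d/%Y"), "%m")
)

# Classify each label as an annual average or a plain year
toy$label_type <- ifelse(grepl("^Annual Average", toy$Time_Period),
                         "Annual Average", "Year only")

# Cross-tabulate label type against start month
month_by_type <- table(toy$label_type, toy$start_month)
print(month_by_type)
```

In these four rows, both “Annual Average” labels have a December start month, so the mapping between start month and period type may be worth checking on the full data.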


3.1.6. “Geo Join ID” column

This column contains NYC neighborhood identifier codes. I explain how these IDs can be used to map the air pollutant data in Section 4.4. Choropleth maps.


3.2. Select columns

Based on these explorations and the more easily understood column names, I selected the columns needed to perform exploratory data analysis and the analysis that Anthony proposed. I also renamed and reordered columns to make them more intuitive and easier to work with.

nycaq_data <- nycaq_data_raw %>%
  select(
    ID = `Unique ID`,
    pollutant = Name,
    value = `Data Value`,
    unit = `Measure Info`,
    date = `Start_Date`,
    season = `Time Period`,
    geo_join_ID = `Geo Join ID`    
  )
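
As a minimal illustration of how `select()` keeps and renames columns in one step (using the `new_name = old_name` form), the sketch below applies the same pattern to a single row copied from the raw table. The object names `raw_row` and `tidy_row` are hypothetical, and only three of the columns are included.

```r
library(dplyr)

# One raw row copied from the table shown earlier (three columns only)
raw_row <- tibble(
  `Unique ID` = 179830,
  Name = "Boiler Emissions- Total PM2.5 Emissions",
  `Data Value` = 0.1
)

# select() keeps only the listed columns and renames them (new = old)
tidy_row <- raw_row %>%
  select(ID = `Unique ID`, pollutant = Name, value = `Data Value`)

names(tidy_row)
```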


3.3. Filter rows

Because this analysis focuses on air pollutants, I filtered the dataset for pollutants.

nycaq_data <- nycaq_data %>%
  filter(pollutant %in% c("Boiler Emissions- Total NOx Emissions",
                          "Boiler Emissions- Total PM2.5 Emissions",
                          "Boiler Emissions- Total SO2 Emissions",
                          "Fine particles (PM 2.5)",
                          "Nitrogen dioxide (NO2)",
                          "Outdoor Air Toxics - Benzene",
                          "Outdoor Air Toxics - Formaldehyde",
                          "Ozone (O3)"))

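Note that the boiler emissions rows, which have units of “number”, pass this filter. However, their “Time Period” values are plain years (eg, 2013, 2015), so they are excluded from the seasonal and annual-average subsets created in Section 3.7, and their units do not need to be corrected for this analysis.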


3.4. Data types

A glimpse of the data frame shows that the “date” column needs to be converted to date format. The data types of the other columns are appropriate.

glimpse(nycaq_data)
## Rows: 13,666
## Columns: 7
## $ ID          <dbl> 172653, 172585, 336637, 336622, 172582, 667327, 172607, 17…
## $ pollutant   <chr> "Nitrogen dioxide (NO2)", "Nitrogen dioxide (NO2)", "Nitro…
## $ value       <dbl> 25.30, 26.93, 19.09, 19.76, 22.83, 16.19, 38.16, 34.96, 30…
## $ unit        <chr> "ppb", "ppb", "ppb", "ppb", "ppb", "ppb", "ppb", "ppb", "p…
## $ date        <chr> "12/01/2010", "12/01/2008", "01/01/2015", "01/01/2015", "1…
## $ season      <chr> "Annual Average 2011", "Annual Average 2009", "Annual Aver…
## $ geo_join_ID <dbl> 203, 203, 204, 103, 104, 104, 306308, 306308, 206, 206, 10…

I also reformatted the date to only show the year. As noted previously, the month and day are not informative because there are only a few discrete values in the dataset.

nycaq_data <- nycaq_data %>%
  mutate(
    # Parse the month/day/year string, then keep only the year
    date = as.Date(date, format = "%m/%d/%Y"),
    date = as.numeric(format(date, "%Y"))
  ) %>%
  rename(year = date)
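
To make the effect of this conversion concrete, the sketch below applies the same two steps to three start dates copied from the `glimpse()` output. The variable names are illustrative only.

```r
# Start dates copied from the glimpse() output above
start_date <- c("12/01/2010", "12/01/2008", "01/01/2015")

# Same two steps as the mutate() call: parse, then keep the year
parsed <- as.Date(start_date, format = "%m/%d/%Y")
start_year <- as.numeric(format(parsed, "%Y"))

start_year
```

The resulting value is the year in which the period starts. In the glimpse output, for example, the first row pairs the start date 12/01/2010 with the label “Annual Average 2011”, so that row is assigned the year 2010.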


3.5. Duplicate rows

There were no duplicate rows.

sprintf("Total rows: %s", nrow(nycaq_data))
## [1] "Total rows: 13666"
sprintf("Distinct rows: %s", nrow(distinct(nycaq_data)))
## [1] "Distinct rows: 13666"
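
Because `ID` is presumably unique for every row, `distinct()` applied to all columns cannot flag repeated measurements. A complementary check, sketched here on hypothetical toy values rather than the real data, drops `ID` before counting distinct rows.

```r
library(dplyr)

# Toy data: rows 1 and 2 repeat the same measurement under different IDs
toy <- tibble(
  ID = c(1, 2, 3),
  pollutant = c("Ozone (O3)", "Ozone (O3)", "Nitrogen dioxide (NO2)"),
  value = c(30.5, 30.5, 16.2),
  year = c(2015, 2015, 2015)
)

# Distinct rows with and without the ID column
n_all <- nrow(distinct(toy))
n_without_id <- nrow(distinct(select(toy, -ID)))

c(all_columns = n_all, without_ID = n_without_id)
```

A lower count without `ID` would indicate measurements that are repeated under different identifiers.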


3.6. Missing values

There were no null (NA) values in any columns.

map(nycaq_data, ~ sum(is.na(.)))
## $ID
## [1] 0
## 
## $pollutant
## [1] 0
## 
## $value
## [1] 0
## 
## $unit
## [1] 0
## 
## $year
## [1] 0
## 
## $season
## [1] 0
## 
## $geo_join_ID
## [1] 0
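
The same check can be written more compactly in base R with `colSums(is.na(...))`, which returns one named count per column. This is a sketch on a toy data frame with one deliberately missing value, not output from the real dataset.

```r
# Toy data frame with one missing value in the "value" column
toy <- data.frame(
  value = c(25.3, NA, 19.09),
  unit = c("ppb", "ppb", "ppb")
)

# Count NA values per column
na_counts <- colSums(is.na(toy))
na_counts
```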


3.7. Create data subsets

3.7.1. Seasonal

This dataframe includes rows that have “Summer” or “Winter” in the season column. Because the year is already in the year column, I removed the year from the season column.

nycaq_data_seasonal <- nycaq_data %>%
  filter(
    grepl("(Summer|Winter) \\d{4}", season),
  ) %>%
  mutate(
    season = str_extract(season, ".*(?= \\d{4})")
  )

Only 3 pollutants met these criteria.

kbl_display(table(nycaq_data_seasonal$pollutant), "30%")
Var1 Freq
Fine particles (PM 2.5) 3666
Nitrogen dioxide (NO2) 3666
Ozone (O3) 1833
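
The season filter and the lookahead extraction used above can be checked on a handful of labels taken from the “Time Period” table. The vector name `labels` is illustrative.

```r
library(stringr)

# Labels copied from the "Time Period" table
labels <- c("Summer 2009", "Winter 2008-09", "Annual Average 2011",
            "2-Year Summer Average 2009-2010", "2015")

# Keep only labels where "Summer" or "Winter" is directly followed by a 4-digit year
is_seasonal <- grepl("(Summer|Winter) \\d{4}", labels)

# The lookahead (?= \\d{4}) matches the text before " <year>" without consuming the year
season <- str_extract(labels[is_seasonal], ".*(?= \\d{4})")

is_seasonal
season
```

Note that the “2-Year Summer Average 2009-2010” label is not matched, because “Summer” is followed by “Average” rather than a year.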

I then created separate dataframes for each pollutant.

fine_particles_seasonal <- nycaq_data_seasonal %>%
  filter(pollutant == "Fine particles (PM 2.5)") %>%
  select(ID, PM2.5 = value, unit, year, season, geo_join_ID)

no2_seasonal <- nycaq_data_seasonal %>%
  filter(pollutant == "Nitrogen dioxide (NO2)") %>%
  select(ID, NO2 = value, unit, year, season, geo_join_ID)  

ozone_seasonal <- nycaq_data_seasonal %>%
  filter(pollutant == "Ozone (O3)") %>%
  select(ID, O3 = value, unit, year, season, geo_join_ID)  
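
As an alternative to writing one `filter()` call per pollutant, base R's `split()` returns a named list with one data frame per pollutant. This is only a sketch on toy values and is not the approach used in this report; it does not rename the value column.

```r
# Toy long-format data (values are illustrative)
toy <- data.frame(
  pollutant = c("Fine particles (PM 2.5)", "Nitrogen dioxide (NO2)",
                "Ozone (O3)", "Ozone (O3)"),
  value = c(9.67, 16.51, 30.32, 30.54),
  season = c("Summer", "Summer", "Summer", "Summer")
)

# One list element per pollutant, named by the pollutant label
by_pollutant <- split(toy, toy$pollutant)

names(by_pollutant)
nrow(by_pollutant[["Ozone (O3)"]])
```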


3.7.2. Annual averages

This dataframe includes rows that have “Annual Average” in the season column.

nycaq_data_annual_avg <- nycaq_data %>%
  filter(
    grepl("Annual Average \\d{4}", season),
  ) %>%
  # Season is not applicable
  select(ID, pollutant, value, unit, year, geo_join_ID)

Only 2 pollutants met these criteria.

kbl_display(table(nycaq_data_annual_avg$pollutant), "30%")
Var1 Freq
Fine particles (PM 2.5) 1833
Nitrogen dioxide (NO2) 1833

I then created separate dataframes for each pollutant.

fine_particles_annual <- nycaq_data_annual_avg %>%
  filter(pollutant == "Fine particles (PM 2.5)") %>%
  select(ID, PM2.5 = value, unit, year, geo_join_ID)

no2_annual <- nycaq_data_annual_avg %>%
  filter(pollutant == "Nitrogen dioxide (NO2)") %>%
  select(ID, NO2 = value, unit, year, geo_join_ID)


4. Analysis

4.1. Summary statistics

The table below shows summary statistics for the 3 pollutants by season. Ozone appears only in the summer rows.

  • The mean concentration of fine particles (PM2.5) was similar in summer (9.67 \(µg/m^3\)) and winter (9.80 \(µg/m^3\)); however, the spread (SD) was larger in the winter (2.38 vs 1.70 \(µg/m^3\)).

  • The mean (SD) concentration of nitrogen dioxide (NO2) was higher in the winter (25.79 [4.57] ppb) than in the summer (16.51 [5.43] ppb).

kbl_display(
  nycaq_data_seasonal %>%
    group_by(pollutant, season) %>%
    summarise(
      n = n(),
      min = min(value),
      max = max(value),
      mean = round(mean(value), 2),
      SD = round(sd(value), 2),
      median = median(value),
      IQR = IQR(value),
      .groups = "keep"
    ), "100%")
pollutant season n min max mean SD median IQR
Fine particles (PM 2.5) Summer 1833 6.27 16.12 9.67 1.70 9.59 2.62
Fine particles (PM 2.5) Winter 1833 5.92 18.84 9.80 2.38 9.32 3.64
Nitrogen dioxide (NO2) Summer 1833 4.85 45.00 16.51 5.43 15.84 6.00
Nitrogen dioxide (NO2) Winter 1833 13.88 50.56 25.79 4.57 25.54 5.39
Ozone (O3) Summer 1833 14.38 40.40 30.32 3.19 30.54 3.69
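
To put the seasonal comparison in numbers, the sketch below recomputes two derived quantities from the means and SDs in the table above: the winter-minus-summer difference in means and the coefficient of variation (SD divided by the mean). The data frame `stats` is typed in by hand from the table and is not part of the original analysis.

```r
# Means and SDs copied from the summary table above
stats <- data.frame(
  pollutant = c("PM2.5", "PM2.5", "NO2", "NO2"),
  season = c("Summer", "Winter", "Summer", "Winter"),
  mean = c(9.67, 9.80, 16.51, 25.79),
  sd = c(1.70, 2.38, 5.43, 4.57)
)

# Coefficient of variation: spread relative to the mean
stats$cv <- round(stats$sd / stats$mean, 2)

# Winter minus summer difference in means for each pollutant
winter <- stats[stats$season == "Winter", ]
summer <- stats[stats$season == "Summer", ]
mean_diff <- setNames(round(winter$mean - summer$mean, 2), winter$pollutant)

stats
mean_diff
```

On these figures, the winter-minus-summer difference in means is 0.13 µg/m^3 for PM2.5 and 9.28 ppb for NO2.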


4.2. Distributions

4.2.1. Fine particles (PM2.5)

The distribution of fine particle concentration during the summer and winter is similar. The winter distribution is a little more right-skewed than the summer distribution.

# Overlay the summer and winter histograms with partial transparency
ggplot(fine_particles_seasonal, aes(x = PM2.5, fill = season)) +
  geom_histogram(binwidth = 0.15, alpha = 0.5, position = "identity") +
  scale_x_continuous(breaks = seq(6, 20, by = 2)) +
  xlab(bquote(bold("Concentration of fine particles (PM2.5) (µg/" * m^3 * ")"))) +
  ylab("Count") +
  theme(
    axis.title.y = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"))


4.2.2. Nitrogen dioxide (NO2)

The distributions of nitrogen dioxide concentrations during the summer and winter are both bell-shaped, but the mean concentration is higher in the winter than in the summer.

# Overlay the summer and winter histograms with partial transparency
ggplot(no2_seasonal, aes(x = NO2, fill = season)) +
  geom_histogram(binwidth = 0.5, alpha = 0.5, position = "identity") +
  scale_x_continuous(breaks = seq(0, 50, by = 5)) +
  xlab(bquote(bold("Concentration of nitrogen dioxide (" * NO[2] * ") (ppb)"))) +
  ylab("Count") +
  theme(
    axis.title.y = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"))


4.2.3. Ozone (O3)

The distribution of ozone concentrations during the summer is bell-shaped and skewed a little to the left.

# Ozone has summer data only, so a single fill color is used
ggplot(ozone_seasonal, aes(x = O3)) +
  geom_histogram(binwidth = 0.3, fill = "#F8766D", alpha = 0.5) +
  scale_x_continuous(breaks = seq(15, 40, by = 5)) +
  xlab(bquote(bold("Concentration of ozone (" * O[3] * ") (ppb)"))) +
  ylab("Count") +
  theme(
    axis.title.y = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"))
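
The visual impressions of skew in this section can be compared with a rough numeric heuristic: when the mean exceeds the median, the distribution tends to be right-skewed, and vice versa. The sketch below applies that heuristic to the means and medians from the summary table in Section 4.1. It is not a formal skewness test, and the data frame `shape` is typed in by hand.

```r
# Means and medians copied from the summary table in Section 4.1
shape <- data.frame(
  group = c("PM2.5 Summer", "PM2.5 Winter", "NO2 Summer", "NO2 Winter", "O3 Summer"),
  mean = c(9.67, 9.80, 16.51, 25.79, 30.32),
  median = c(9.59, 9.32, 15.84, 25.54, 30.54)
)

# Positive difference suggests a longer right tail; negative suggests a longer left tail
shape$mean_minus_median <- round(shape$mean - shape$median, 2)
shape$direction <- ifelse(shape$mean_minus_median > 0, "right", "left")

shape
```

The differences are largest in the positive direction for winter PM2.5 and summer NO2, and negative only for ozone.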


4.3. Change over time

4.3.1. Fine particles (PM2.5), annual

The average concentration1 of fine particles (PM2.5) trended downward from 2008 to 2020. During this time, levels decreased approximately two-fold.

# One boxplot per year; color is mapped to year only for visual separation
ggplot(fine_particles_annual, aes(x = factor(year), y = PM2.5, color = factor(year))) +
  geom_boxplot() +
  scale_y_continuous(breaks = seq(4, 18, by = 2)) +
  xlab("Year") +
  ylab(bquote(bold("Concentration of fine particles (PM2.5) (µg/" * m^3 * ")"))) +
  theme(
    axis.title.x = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"),
    legend.position = "none")


4.3.2. Fine particles (PM2.5), seasonal

First, I reshaped the fine particles seasonal data to create side-by-side boxplots.

temp_df <- fine_particles_seasonal %>%
  pivot_wider(
    id_cols = c(ID, year),
    names_from = season,
    values_from = PM2.5
  )

fine_particles_seasonal2 <- temp_df %>%
  pivot_longer(
    cols = c(Winter, Summer),
    names_to = "season",
    values_to = "PM2.5")

Then I removed the rows with null (NA) values.

fine_particles_seasonal2 <- drop_na(fine_particles_seasonal2)
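
Because each `ID` appears in only one row, the widen-then-lengthen round trip followed by `drop_na()` appears to return the original long-format rows, minus the columns not listed in `id_cols`. The sketch below checks this on hypothetical toy values; `toy`, `round_trip`, and `direct` are illustrative names, and the check has not been run on the full dataset.

```r
library(dplyr)
library(tidyr)

# Toy seasonal data in the same long format (values are hypothetical)
toy <- tibble(
  ID = c(1, 2, 3, 4),
  PM2.5 = c(9.1, 10.4, 8.7, 9.9),
  year = c(2010, 2010, 2011, 2011),
  season = c("Summer", "Winter", "Summer", "Winter"),
  geo_join_ID = 101
)

# Same reshaping steps as above: widen, lengthen, then drop the NA rows
round_trip <- toy %>%
  pivot_wider(id_cols = c(ID, year), names_from = season, values_from = PM2.5) %>%
  pivot_longer(cols = c(Winter, Summer), names_to = "season", values_to = "PM2.5") %>%
  drop_na()

# Direct route: keep the same four columns from the long data
direct <- toy %>%
  select(ID, year, season, PM2.5)

all.equal(as.data.frame(round_trip), as.data.frame(direct))
```

If the same holds for the real data, the long-format seasonal dataframe could be passed to `ggplot()` directly, since mapping `season` to `color` or `fill` already produces side-by-side boxplots within each year.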

I omitted outliers from the boxplots to reduce clutter. The boxplots generally show that the average summer and winter concentrations of fine particulates decreased from 2010 to 2020. Winter levels fluctuated more from year to year than summer levels, and this difference was more pronounced before 2013 than after.

There does not seem to be a pattern in the average concentration of fine particulates from season to season. The average concentration in some seasons is lower than in the preceding season (eg, winter 2011 vs summer 2011), whereas in other seasons it is higher than in the preceding season (eg, winter 2013 vs summer 2013). However, the season-to-season difference appears to become smaller over time.

ggplot(fine_particles_seasonal2, 
       aes(x = factor(year), y = PM2.5, color = season, fill = season)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.2) +
  scale_y_continuous(breaks = seq(0, 20, by = 2)) +
  xlab("Year") +
  ylab(bquote(bold("Concentration of fine particles (PM2.5) (µg/" * m^3 * ")"))) +
  theme(
    axis.title.x = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"))
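
The impression that the season-to-season difference shrinks over time could be checked numerically by computing the gap between the winter and summer medians for each year. This is only a sketch on made-up data (toy_seasonal_pm25 and its values are hypothetical); the same steps could be applied to fine_particles_seasonal2.

```r
library(dplyr)
library(tidyr)

# Made-up values for two years, two measurements per season
toy_seasonal_pm25 <- tibble(
  year   = rep(c(2011, 2019), each = 4),
  season = rep(c("Winter", "Winter", "Summer", "Summer"), times = 2),
  PM2.5  = c(13, 15, 10, 12, 7, 8, 7, 7)
)

# Median per year and season, then one column per season to take the difference
season_gap <- toy_seasonal_pm25 %>%
  group_by(year, season) %>%
  summarise(median_PM2.5 = median(PM2.5), .groups = "drop") %>%
  pivot_wider(names_from = season, values_from = median_PM2.5) %>%
  mutate(winter_minus_summer = Winter - Summer)

print(season_gap)
```

A gap that moves toward zero in later years would support the visual impression from the boxplots.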


4.3.3. Nitrogen dioxide (NO2), annual

The average concentration of nitrogen dioxide has decreased from 2008 to 2021; however, there is wide variation in levels (ie, between the whiskers) as well as spikes (ie, high outlier values) each year.

ggplot(no2_annual, aes(x = factor(year), y = NO2, color = factor(year))) +
  geom_boxplot() +
  scale_y_continuous(breaks = seq(10, 50, by = 5)) +
  xlab("Year") +
  ylab(bquote(bold("Concentration of nitrogen dioxide (" * NO[2] * ") (ppb)"))) +
  theme(
    axis.title.x = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"),
    legend.position = "none")
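
For reference, the points drawn beyond the whiskers are, by default in geom_boxplot(), values more than 1.5 times the interquartile range (IQR) away from the box. The base R sketch below applies that rule to a made-up vector (toy_no2 is hypothetical) to show which values would be flagged as outliers. It assumes the default quartile definition used by quantile().

```r
# Made-up NO2 values (ppb) for a single year
toy_no2 <- c(18, 20, 21, 22, 23, 24, 25, 45)

# First and third quartiles (the edges of the box)
quartiles <- unname(quantile(toy_no2, probs = c(0.25, 0.75)))
iqr <- quartiles[2] - quartiles[1]

# Fences at 1.5 * IQR beyond the box; values outside them are outliers
lower_fence <- quartiles[1] - 1.5 * iqr
upper_fence <- quartiles[2] + 1.5 * iqr
outliers <- toy_no2[toy_no2 < lower_fence | toy_no2 > upper_fence]

print(c(lower_fence, upper_fence))
print(outliers)
```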


4.3.4. Nitrogen dioxide (NO2), seasonal

I reshaped the NO2 seasonal data to create side-by-side boxplots.

temp_df <- no2_seasonal %>%
  pivot_wider(
    id_cols = c(ID, year),
    names_from = season,
    values_from = NO2
  )

no2_seasonal2 <- temp_df %>%
  pivot_longer(
    cols = c(Winter, Summer),
    names_to = "season",
    values_to = "NO2"
  )

Then I removed the rows with null (NA) values.

no2_seasonal2 <- drop_na(no2_seasonal2)

I omitted outliers from the boxplots to reduce clutter. The boxplots show that the average concentration of nitrogen dioxide is higher in the winter than in the summer for all years in the dataset. In addition, the average summer levels have steadily decreased from 2009 to 2021. In contrast, the average winter levels remained relatively unchanged during this period.

Across all years, the average concentration of NO2 alternates from being low in the summer to high in the winter. This pattern is consistent with the fact that NO2 is produced from burning fuel since more fuel is used in the winter for heating.

ggplot(no2_seasonal2, 
       aes(x = factor(year), y = NO2, color = season, fill = season)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.2) +
  xlab("Year") +
  ylab(bquote(bold("Concentration of nitrogen dioxide (" * NO[2] * ") (ppb)"))) +
  theme(
    axis.title.x = element_text(face = "bold"),
    axis.text = element_text(size = 10),
    axis.line = element_line(linewidth = 0.5, color = "darkgrey"))
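
One way to put numbers on the different summer and winter trends would be to fit a straight line to the yearly medians of each season and compare the slopes. The base R sketch below does this on made-up medians (toy_medians is hypothetical, and a linear trend is an assumption made only for illustration).

```r
# Made-up yearly medians: summer falls by 1 ppb per year, winter stays flat
toy_medians <- data.frame(
  year       = rep(2009:2013, times = 2),
  season     = rep(c("Summer", "Winter"), each = 5),
  median_NO2 = c(24, 23, 22, 21, 20, 28, 28, 28, 28, 28)
)

# Fit median_NO2 ~ year separately for each season and keep the slope (ppb per year)
slope_by_season <- sapply(
  split(toy_medians, toy_medians$season),
  function(d) coef(lm(median_NO2 ~ year, data = d))[["year"]]
)

print(slope_by_season)
```

A clearly negative slope for one season and a slope near zero for the other would match the pattern described above.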


4.4. Choropleth maps

4.4.1. Understanding “Geo Join ID”

Since the raw dataset included names and geographic IDs of NYC neighborhoods, I wanted to see if I could plot the air pollutant data on a map. I had never done this before and found it challenging to figure out. The description of the field on the NYC Open Data website for the air pollutant data wasn’t very helpful: it only vaguely describes “Geo Join ID” as “an identifier to join to mapping geography files to make thematic maps”. As a beginner, I didn’t know what that meant. Moreover, a Google search for the term “Geo Join ID” did not yield any useful results. However, after extensive research about how maps can be created in R,[2] I figured out that “Geo Join ID” corresponds to the borough and community district ID codes (borough_cd_id) that are used to define geographic boundaries in the nycgeo R package.

With this key insight, I was able to create “choropleth” maps of the air pollutant data. A choropleth map is a type of thematic map that uses color scales to depict differences in data values across geographic areas. Now the explanation of “Geo Join ID” on the NYC Open Data website makes sense!


4.4.2. Tidy “Geo Join ID”

Geo Join IDs are 3-digit identifiers; however, some values in the air pollutant dataset have 6 or 9 digits and appear to be concatenations of multiple identifiers. Since I don’t know which is the correct identifier, I omitted all rows with Geo Join IDs that have fewer or more than 3 digits (implemented in the next section).


4.4.3. Data subsets

Because the fill color encodes a single set of values, a choropleth map shows only a snapshot in time. To test its application to the air pollutant data, I created two subsets of the seasonal nitrogen dioxide data, one with data from summer 2009 and the other with data from summer 2020. The boxplots in section 4.3.4 showed that the median concentration of NO2 across the entire city was lower in the summer of 2020 than in the summer of 2009; however, the boxplots don’t show changes at the neighborhood level.

no2_summer2009 <- no2_seasonal %>%
  filter(year == 2009 & season == "Summer" & (geo_join_ID >= 100 & geo_join_ID < 1000))
no2_summer2020 <- no2_seasonal %>%
  filter(year == 2020 & season == "Summer" & (geo_join_ID >= 100 & geo_join_ID < 1000))
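
The effect of the 3-digit condition can be seen on a few made-up IDs (toy_ids and its values are hypothetical). The sketch uses dplyr's between(), which is equivalent to the range test in the filter above, and also lists the rows that would be discarded.

```r
library(dplyr)

# Made-up IDs: two valid 3-digit IDs, two concatenated IDs, and one that is too short
toy_ids <- tibble(
  geo_join_ID = c(101, 205, 306308, 501502503, 5),
  NO2         = c(30, 25, 22, 18, 20)
)

# Keep only 3-digit IDs (100 to 999 inclusive)
kept <- toy_ids %>%
  filter(between(geo_join_ID, 100, 999))

# Rows that the filter removes
dropped <- toy_ids %>%
  filter(!between(geo_join_ID, 100, 999))

print(kept)
print(dropped)
```

Inspecting the dropped rows in the real data would show how many measurements are lost by excluding the concatenated identifiers.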


4.4.4. Creating the maps

First, I used functions from the nycgeo and sf packages to import the geographic boundaries of NYC community districts and transform them to the New York coordinate reference system (crs 2263).[3]

nyc_cd <- nyc_boundaries(geography = "cd") %>%
  st_transform(2263)

Then I combined the NYC boundaries dataframe with the air pollutants dataframe by matching the borough and community district ID (borough_cd_id) in the former to the Geo Join ID in the latter.

# IDs have different data types in the 2 dataframes
nyc_cd <- nyc_cd %>%
  mutate(
    borough_cd_id = as.numeric(borough_cd_id)
  )
# Join the dataframes
nyc_cd_comb_no2_summer2009 <- nyc_cd %>%
  left_join(no2_summer2009, by = c("borough_cd_id" = "geo_join_ID"))

nyc_cd_comb_no2_summer2020 <- nyc_cd %>%
  left_join(no2_summer2020, by = c("borough_cd_id" = "geo_join_ID"))
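
The join logic can be illustrated without the spatial data. In the sketch below, toy_boundaries and toy_no2 are hypothetical plain tibbles, and the ID column in toy_boundaries is assumed to be stored as text, as the type conversion above implies. Because left_join() keeps every row of the boundaries table, districts without a match get NA for NO2, and anti_join() lists them.

```r
library(dplyr)

# Made-up stand-ins: boundary IDs as text, pollutant IDs as numbers
toy_boundaries <- tibble(borough_cd_id = c("101", "102", "103"))
toy_no2 <- tibble(geo_join_ID = c(101, 103), NO2 = c(30.1, 25.4))

# Convert the key to a common type before joining
toy_boundaries <- toy_boundaries %>%
  mutate(borough_cd_id = as.numeric(borough_cd_id))

# Every boundary row is kept; unmatched districts get NA for NO2
toy_joined <- toy_boundaries %>%
  left_join(toy_no2, by = c("borough_cd_id" = "geo_join_ID"))

# Districts that have no pollutant data
toy_unmatched <- toy_boundaries %>%
  anti_join(toy_no2, by = c("borough_cd_id" = "geo_join_ID"))

print(toy_joined)
print(toy_unmatched)
```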

In summer 2009, NO2 levels were highest in the Midtown area and lowest along the Long Beach Barrier Island and around Howard Beach (Queens). Note that areas without any data are shown as gray.

ggplot(nyc_cd_comb_no2_summer2009, aes(fill = NO2)) +
  geom_sf(color = NA) +
  scale_fill_distiller(palette = "RdBu", direction = -1, name = "NO2 (ppb)") +  
  theme_void() +
  ggtitle(bquote(bold(NO[2] * " Levels in NYC Neighborhoods in Summer 2009")))

In summer 2020, NO2 levels were highest around Carroll Gardens (Brooklyn) and lowest along the Long Beach Barrier Island and around Howard Beach.

Comparing the two maps shows that NO2 levels in the Midtown area decreased substantially from 2009 to 2020. On the other hand, there was not much change on the Long Beach Barrier Island, Staten Island, and most of Queens. However, NO2 levels in these areas were already relatively low in 2009. These differences are consistent with the greater urban density of Manhattan compared with less urban areas of NYC, since the amount of fuel burned, which is the primary source of NO2, would be expected to increase with traffic and population density.

ggplot(nyc_cd_comb_no2_summer2020, aes(fill = NO2)) +
  geom_sf(color = NA) +
  scale_fill_distiller(palette = "RdBu", direction = -1, name = "NO2 (ppb)") +  
  theme_void() +
  ggtitle(bquote(bold(NO[2] * " Levels in NYC Neighborhoods in Summer 2020")))
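
Because each map computes its own color scale, the same color may represent different concentrations in the two maps. A possible refinement, sketched here on made-up data (toy_2009, toy_2020, and their values are hypothetical), is to compute one shared range for both years and to calculate the change per district directly.

```r
library(dplyr)

# Made-up summer NO2 values (ppb) for three districts
toy_2009 <- tibble(geo_join_ID = c(105, 306, 414), NO2 = c(40, 24, 12))
toy_2020 <- tibble(geo_join_ID = c(105, 306, 414), NO2 = c(22, 18, 10))

# One range covering both years; it could be passed to both maps, e.g.
# scale_fill_distiller(..., limits = shared_limits)
shared_limits <- range(c(toy_2009$NO2, toy_2020$NO2))

# Change per district between the two years (negative means a decrease)
toy_change <- toy_2009 %>%
  inner_join(toy_2020, by = "geo_join_ID", suffix = c("_2009", "_2020")) %>%
  mutate(NO2_change = NO2_2020 - NO2_2009)

print(shared_limits)
print(toy_change)
```

The NO2_change column could also be mapped as the fill variable to show the change in a single map.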


5. Conclusions

I cleaned and analyzed the NYC air quality dataset. I did my best to analyze Anthony’s proposal to compare air pollutant levels versus time of year. However, the data were not as granular as I expected and many pollutants did not have seasonal data, which limited the comparisons that could be performed.

My analyses showed that PM2.5 and NO2 levels in NYC have decreased during the time frame of the dataset. The seasonal pattern of pollutant levels was clearest for NO2, which showed a clear difference between summer and winter that is consistent with the greater amount of fuel that is burned during the winter.

I also created choropleth maps to compare NO2 levels in NYC neighborhoods in 2009 versus 2020. The maps showed that the city-wide decrease from 2009 to 2020 was mainly driven by decreases in Manhattan. This could be due to changes in traffic volume, vehicle fuel efficiency, and/or residential heating needs.

Choropleth maps are visually appealing; however, because each map only represents data from a specific time point, many maps would need to be created to analyze changes over time. It may be possible to show this using an animation, which may be a future project.


  1. To be specific, the median of the annual averages, but this is a mouthful.

  2. Key websites that helped me create choropleth maps of the air pollutant data:

    https://justinmorganwilliams.medium.com/basics-of-gis-mapping-with-r-using-grow-nyc-markets-75adcdd9b0

    https://rdvark.net/2021/12/29/pretty-choropleth-maps-with-sf-and-ggplot2/

  3. The code block produces a message that the coordinate reference system object is outdated. I found a Stack Overflow webpage that explains the issue, and as far as I can tell, it is due to how the coordinate reference system was generated. Since this is an external resource, I don’t know how to update it. In any case, the code seems to work fine as is.