Objective

To examine fine particulate matter (PM2.5) pollution in the United States over the ten-year period from 1999 to 2008, using the National Emissions Inventory (NEI) database.

Loading data and packages:

NEI <- readRDS("summarySCC_PM25.rds")
SCC <- readRDS("Source_Classification_Code.rds")
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Basic checking

str(NEI)
## 'data.frame':    6497651 obs. of  6 variables:
##  $ fips     : chr  "09001" "09001" "09001" "09001" ...
##  $ SCC      : chr  "10100401" "10100404" "10100501" "10200401" ...
##  $ Pollutant: chr  "PM25-PRI" "PM25-PRI" "PM25-PRI" "PM25-PRI" ...
##  $ Emissions: num  15.714 234.178 0.128 2.036 0.388 ...
##  $ type     : chr  "POINT" "POINT" "POINT" "POINT" ...
##  $ year     : int  1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
unique(NEI$type)
## [1] "POINT"    "NONPOINT" "ON-ROAD"  "NON-ROAD"
unique(NEI$Pollutant)
## [1] "PM25-PRI"
tail(NEI)
##           fips        SCC Pollutant   Emissions     type year
## 75051171 56011 2282020005  PM25-PRI 0.028598300 NON-ROAD 2008
## 75051181 53009 2265003020  PM25-PRI 0.003152410 NON-ROAD 2008
## 75051191 41057 2260002006  PM25-PRI 0.046869500 NON-ROAD 2008
## 75051201 38015 2270006005  PM25-PRI 1.012890000 NON-ROAD 2008
## 75051211 46105 2265004075  PM25-PRI 0.000486488 NON-ROAD 2008
## 75051221 53005 2270004076  PM25-PRI 0.001622670 NON-ROAD 2008
str(SCC)
## 'data.frame':    11717 obs. of  15 variables:
##  $ SCC                : Factor w/ 11717 levels "10100101","10100102",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ Data.Category      : Factor w/ 6 levels "Biogenic","Event",..: 6 6 6 6 6 6 6 6 6 6 ...
##  $ Short.Name         : Factor w/ 11238 levels "","2,4-D Salts and Esters Prod /Process Vents, 2,4-D Recovery: Filtration",..: 3283 3284 3293 3291 3290 3294 3295 3296 3292 3289 ...
##  $ EI.Sector          : Factor w/ 59 levels "Agriculture - Crops & Livestock Dust",..: 18 18 18 18 18 18 18 18 18 18 ...
##  $ Option.Group       : Factor w/ 25 levels "","C/I Kerosene",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Option.Set         : Factor w/ 18 levels "","A","B","B1A",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ SCC.Level.One      : Factor w/ 17 levels "Brick Kilns",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ SCC.Level.Two      : Factor w/ 146 levels "","Agricultural Chemicals Production",..: 32 32 32 32 32 32 32 32 32 32 ...
##  $ SCC.Level.Three    : Factor w/ 1061 levels "","100% Biosolids (e.g., sewage sludge, manure, mixtures of these matls)",..: 88 88 156 156 156 156 156 156 156 156 ...
##  $ SCC.Level.Four     : Factor w/ 6084 levels "","(NH4)2 SO4 Acid Bath System and Evaporator",..: 4455 5583 4466 4458 1341 5246 5584 5983 4461 776 ...
##  $ Map.To             : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Last.Inventory.Year: int  NA NA NA NA NA NA NA NA NA NA ...
##  $ Created_Date       : Factor w/ 57 levels "","1/27/2000 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Revised_Date       : Factor w/ 44 levels "","1/27/2000 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Usage.Notes        : Factor w/ 21 levels ""," ","includes bleaching towers, washer hoods, filtrate tanks, vacuum pump exhausts",..: 1 1 1 1 1 1 1 1 1 1 ...

NEI is a data frame that records, in tons, the PM2.5 emitted from each source over an entire year, for the years 1999, 2002, 2005 and 2008.

SCC provides detailed information about PM2.5 sources. Its source codes map to the “SCC” column in NEI.
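The mapping between the two tables can be sketched as a join on the shared code column. The snippet below is a minimal illustration on invented toy data (the names `nei_toy`, `scc_toy` and the "Hypothetical source" descriptions are assumptions, not values from the real files), and it uses base R `merge` so that it runs without extra packages.

```r
# Toy stand-ins for NEI and SCC; all values are invented for illustration.
nei_toy <- data.frame(
  SCC = c("10100401", "10100404", "2282020005"),
  Emissions = c(15.7, 234.2, 0.03),
  year = c(1999L, 1999L, 2008L),
  stringsAsFactors = FALSE
)
scc_toy <- data.frame(
  SCC = factor(c("10100401", "10100404", "2282020005")),
  Short.Name = c("Hypothetical source A", "Hypothetical source B",
                 "Hypothetical source C"),
  stringsAsFactors = FALSE
)

# SCC is stored as a factor in the lookup table; convert it to character
# so that both key columns have the same type.
scc_toy$SCC <- as.character(scc_toy$SCC)

# Left join: keep every emission record and attach its source description.
joined <- merge(nei_toy, scc_toy, by = "SCC", all.x = TRUE)
print(joined)
```

The analysis below does not build a joined table; it selects codes from SCC first and then subsets NEI with `%in%`, which avoids carrying 15 extra columns through 6.5 million rows.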

Exploration

  1. Have total emissions from PM2.5 decreased in the United States from 1999 to 2008?
by.year <- NEI %>% group_by(year) %>% summarise(sum(Emissions)) 
plot(by.year$year, by.year$`sum(Emissions)`, type = "o", pch = 15, lty = 1, lwd = 2, frame = FALSE, xlab = "Year", ylab = "Total Emissions", main = "Annual Emissions")

dev.copy(png, file = "plot1.png", height = 496, width = 496)
## quartz_off_screen 
##                 3
dev.off()
## quartz_off_screen 
##                 2

From 1999 to 2008, total PM2.5 emissions in the United States decreased.
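To put a number on a trend like this, the yearly totals can be turned into a percentage change between the first and last year. The sketch below uses made-up emissions (not the NEI figures) and base R `aggregate`; the variable names are illustrative only.

```r
# Invented emissions records for two years.
toy <- data.frame(
  year = c(1999, 1999, 2008, 2008),
  Emissions = c(60, 40, 30, 20)
)

# Total emissions per year.
totals <- aggregate(Emissions ~ year, data = toy, FUN = sum)

# Percentage change from the earliest to the latest year.
first <- totals$Emissions[which.min(totals$year)]
last <- totals$Emissions[which.max(totals$year)]
pct_change <- 100 * (last - first) / first
pct_change  # -50 for these toy numbers
```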

  2. Have total emissions from PM2.5 decreased in Baltimore City, Maryland (fips == "24510") from 1999 to 2008?
baltimore <- filter(NEI, fips == "24510") %>% group_by(year) %>% summarise(sum(Emissions)) 
plot(baltimore$year, baltimore$`sum(Emissions)`, type = "o", pch = 15, lty = 1, lwd = 2, frame = FALSE, xlab = "Year", ylab = "Total Emissions", main = "Annual Emissions in Baltimore City")

dev.copy(png, file = "plot2.png", height = 496, width = 496)
## quartz_off_screen 
##                 3
dev.off()
## quartz_off_screen 
##                 2

In Baltimore City, PM2.5 emissions decreased overall from 1999 to 2008, with a temporary rise in 2005.

  3. Of the four types of sources indicated by the type (point, nonpoint, onroad, nonroad) variable, which of these four sources have seen decreases in emissions from 1999–2008 for Baltimore City? Which have seen increases in emissions from 1999–2008?
baltimore2 <- filter(NEI, fips == "24510") %>% group_by(type, year) %>% summarise(sum(Emissions))
ggplot(baltimore2, aes(year, `sum(Emissions)`)) + 
  geom_point(aes(color = type), size = 6, alpha = .5) +
  geom_line(aes(color = type)) + 
  labs(title = "Annual Emissions of each type in Baltimore",y = "Total Emissions")

ggsave("plot3.png", height = 5, width = 5)

Emissions from the “NON-ROAD”, “NONPOINT” and “ON-ROAD” source types decreased, while emissions from the “POINT” type increased slightly overall, with a sharp peak in 2005.

  4. Across the United States, how have emissions from coal combustion-related sources changed from 1999–2008?
comb <- grepl("comb", SCC$SCC.Level.One, ignore.case=TRUE)
coal <- grepl("coal", SCC$SCC.Level.Four, ignore.case=TRUE) 
CoalCombustion <- (comb & coal)
ccRelated <- NEI[NEI$SCC %in% SCC[CoalCombustion,]$SCC,] %>% group_by(year) %>% summarise(sum(Emissions))

ggplot(ccRelated, aes(year, `sum(Emissions)`)) +
  geom_point(size = 6, alpha=.5) +
  geom_line() +
  labs(title = "Emissions of Coal combustion-related sources", y = "Annual Emission")

ggsave("plot4.png", height = 5, width = 5)

PM2.5 emissions from coal combustion-related sources decreased over the period 1999–2008.
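The selection rule used here (a source counts as coal combustion-related only if its level-one description mentions combustion and its level-four description mentions coal) can be checked on a small example. The table below is invented for illustration; the codes "A1" to "A4" and the descriptions are hypothetical, not rows from the real SCC file.

```r
# Toy classification table with the two columns used for matching.
scc_toy <- data.frame(
  SCC = c("A1", "A2", "A3", "A4"),
  SCC.Level.One = c("External Combustion Boilers", "External Combustion Boilers",
                    "Industrial Processes", "Mobile Sources"),
  SCC.Level.Four = c("Pulverized Coal", "Natural Gas",
                     "Coal Crushing", "Diesel"),
  stringsAsFactors = FALSE
)

# Case-insensitive pattern matches on each column.
is_comb <- grepl("comb", scc_toy$SCC.Level.One, ignore.case = TRUE)
is_coal <- grepl("coal", scc_toy$SCC.Level.Four, ignore.case = TRUE)

# Keep only the codes that satisfy both conditions.
coal_codes <- scc_toy$SCC[is_comb & is_coal]
coal_codes
```

Only "A1" satisfies both conditions: "A2" burns gas rather than coal, and "A3" handles coal without combustion. Other definitions of "coal combustion-related" (for example, matching on EI.Sector) are possible and may select a different set of codes.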

  5. How have emissions from motor vehicle sources changed from 1999–2008 in Baltimore City?
motor <- grepl("motor", SCC$Short.Name, ignore.case = TRUE)
mt <- NEI[NEI$SCC %in% SCC[motor,]$SCC,] %>% filter(fips == "24510") %>% group_by(year) %>% summarise(sum(Emissions))
ggplot(mt, aes(year, `sum(Emissions)`)) +
  geom_point(size = 6, alpha=.5) +
  geom_line() +
  labs(title = "Emissions from motor vehicle sources in Baltimore", y = "Annual Emission")

ggsave("plot5.png", height = 5, width = 5)

Motor vehicle sources emitted high levels of PM2.5 in 2002 and 2005. By 2008, emissions had returned to the 1999 level.

  6. Compare emissions from motor vehicle sources in Baltimore City with emissions from motor vehicle sources in Los Angeles County, California (fips == "06037"). Which city has seen greater changes over time in motor vehicle emissions?
# Use %in% for membership; fips == c(...) would recycle the vector and silently drop rows
ba <- NEI[NEI$SCC %in% SCC[motor,]$SCC,] %>% filter(fips %in% c("24510","06037")) %>% group_by(fips, year) %>% summarise(sum(Emissions))

ggplot(ba, aes(year, `sum(Emissions)`)) + 
  geom_point(aes(color = fips), size = 6, alpha = .5) +
  geom_line(aes(color = fips)) + 
  scale_colour_discrete(name = "U.S. County",
                       breaks = c("06037", "24510"),
                       labels = c("Los Angeles County", "Baltimore City")) +
  labs(title = "Annual Emissions from motor vehicle sources in Baltimore and Los Angeles",y = "Total Emissions")

ggsave("plot6.png", height = 5, width = 5)

PM2.5 emissions from motor vehicle sources in Los Angeles County rose sharply between 2005 and 2008.
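A note on selecting rows for several counties at once: comparing a column to a vector with `==` recycles the vector element by element, whereas `%in%` tests membership. The difference is easy to miss because `==` raises no error when the lengths divide evenly. A small sketch with a made-up `fips` vector:

```r
# Four records, two per county (invented for illustration).
fips <- c("24510", "24510", "06037", "06037")
wanted <- c("24510", "06037")

# `==` recycles `wanted` to "24510", "06037", "24510", "06037" and compares
# position by position, so only records 1 and 4 match.
by_equals <- fips == wanted

# `%in%` asks, for each record, whether its code is one of the wanted codes.
by_in <- fips %in% wanted

sum(by_equals)  # 2 records kept
sum(by_in)      # 4 records kept
```

With `==`, half of the matching records are dropped here, which would understate the totals for both counties.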