Assignment

The overall goal of this assignment is to explore the National Emissions Inventory database and see what it says about fine particulate matter pollution in the United States over the 10-year period 1999 to 2008.

setwd("C:/Users/angul/OneDrive/R/ExploreData/Data")

library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(RColorBrewer)

NEI <- readRDS("summarySCC_PM25.rds")
SCC <- readRDS("Source_Classification_Code.rds")

Q1. Have total emissions from PM2.5 decreased in the United States from 1999 to 2008? Using the base plotting system, make a plot showing the total PM2.5 emission from all sources for each of the years 1999, 2002, 2005, and 2008. Upload a PNG file containing your plot addressing this question.

# emissions by year 
total_emmissions <- aggregate(Emissions ~ year, NEI, sum)

png("Q1_53pcdropspike.png", width=480, height=480)
barplot(
  (total_emmissions$Emissions)/10^6,
  names.arg=total_emmissions$year,
  xlab="Year",
  ylab="PM2.5 Emissions (10^6 Tons)",
  ylim = c(0, 7.5),
  col="black", 
  border="red",
  main="53% PM2.5 emission drop from 1999 to 2008, over the USA")
dev.off()
## png 
##   2
print("Percent Total emissions change: ")
## [1] "Percent Total emissions change: "
pcdiff <- ((total_emmissions[4,2] - total_emmissions[1,2])/total_emmissions[1,2])*100
print(pcdiff) 
## [1] -52.75847
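The percent change printed above is the 2008 total minus the 1999 total, divided by the 1999 total. As a minimal sketch, the same arithmetic can be wrapped in a small helper. `pct_change` is a hypothetical name that is not part of the original analysis, and the two totals are copied by hand from the `total_emmissions` table printed in this report.

```r
# Hypothetical helper (not in the original analysis): percent change
# from a starting value to an ending value.
pct_change <- function(first, last) {
  (last - first) / first * 100
}

# Yearly totals as printed in this report (tons, rounded to whole numbers).
us_1999 <- 7332967
us_2008 <- 3464206

# A negative result means emissions fell.
us_change <- pct_change(us_1999, us_2008)
print(round(us_change, 1))
```

Keeping the sign convention (last minus first) in one place avoids mixing negative percentages and positive fractions between sections.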

scratch space

NEI <- readRDS("summarySCC_PM25.rds")
SCC <- readRDS("Source_Classification_Code.rds")

head(total_emmissions)
##   year Emissions
## 1 1999   7332967
## 2 2002   5635780
## 3 2005   5454703
## 4 2008   3464206
totUS1999 <- total_emmissions[1,2]
print("The total US PM2.5 emissions in 1999 were:") 
## [1] "The total US PM2.5 emissions in 1999 were:"
print(totUS1999, useSource=TRUE)
## [1] 7332967
totUS2008 <- total_emmissions[4,2]
print("while that of 2008 stood at:") 
## [1] "while that of 2008 stood at:"
print(totUS2008, useSource=TRUE)
## [1] 3464206
difftotUS <- total_emmissions[1,2] - total_emmissions[4,2]
pcdiff <- (total_emmissions[1,2] - total_emmissions[4,2])/total_emmissions[1,2]

Q2. Have total emissions from PM2.5 decreased in the Baltimore City, Maryland ( fips == 24510) from 1999 to 2008? Use the base plotting system to make a plot answering this question.

setwd("C:/Users/angul/OneDrive/R/ExploreData/Data")

library(ggplot2)
library(dplyr)
library(RColorBrewer)

NEI <- readRDS("summarySCC_PM25.rds")
SCC <- readRDS("Source_Classification_Code.rds")

baltimore <- subset(NEI, fips=="24510")
totPM25_Baltimore <- aggregate(Emissions ~ year, baltimore, sum)

png("Q2_43dropBalt2005spike.png", width=480, height=480)
barplot(
  totPM25_Baltimore$Emissions,
  names.arg=totPM25_Baltimore$year,
  xlab="Year",
  ylab="PM2.5 Emissions, in Tons",
  ylim=c(0,3500),
  col="black", 
  border="red",
  main="43% drop: despite 2005 spike in Baltimore City" )
dev.off()
## png 
##   2
pcdiffBalt <- (totPM25_Baltimore[1,2] - totPM25_Baltimore[4,2])/totPM25_Baltimore[1,2]
print("Fractional drop in total emissions, 1999 to 2008: ")
## [1] "Fractional drop in total emissions, 1999 to 2008: "
print(pcdiffBalt)
## [1] 0.431222

scratch space

head(totPM25_Baltimore)
##   year Emissions
## 1 1999  3274.180
## 2 2002  2453.916
## 3 2005  3091.354
## 4 2008  1862.282
totBalt1999 <- totPM25_Baltimore[1,2]
print("The total Baltimore PM2.5 emissions in 1999 were:") 
## [1] "The total Baltimore PM2.5 emissions in 1999 were:"
print(totBalt1999, useSource=TRUE)
## [1] 3274.18
totBalt2008 <- totPM25_Baltimore[4,2]
print("while that of 2008 stood at:") 
## [1] "while that of 2008 stood at:"
print(totBalt2008, useSource=TRUE)
## [1] 1862.282
difftotBalt <- totPM25_Baltimore[1,2] - totPM25_Baltimore[4,2]
pcdiffBalt <- (totPM25_Baltimore[1,2] - totPM25_Baltimore[4,2])/totPM25_Baltimore[1,2]
print("Fractional drop in total emissions, 1999 to 2008: ")
## [1] "Fractional drop in total emissions, 1999 to 2008: "
print(pcdiffBalt)
## [1] 0.431222

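Q3. Of the four types of sources indicated by the type (point, nonpoint, onroad, nonroad) variable, which of these four sources have seen decreases in emissions from 1999 to 2008 for Baltimore City? Which have seen increases in emissions from 1999 to 2008? Use the ggplot2 plotting system to make a plot to answer this question.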

setwd("C:/Users/angul/OneDrive/R/ExploreData/Data")

library(ggplot2)
library(dplyr)
library(RColorBrewer)

NEI <- readRDS("summarySCC_PM25.rds")
SCC <- readRDS("Source_Classification_Code.rds")

tot_emi_24510_by_type <- NEI %>%
        filter(fips == "24510") %>%
        select(fips, type, Emissions, year) %>%
        group_by(year, type) %>%
        summarise(Total_Emissions = sum(Emissions, na.rm = TRUE))
## `summarise()` regrouping output by 'year' (override with `.groups` argument)
SourcePM25Baltimore <- ggplot(tot_emi_24510_by_type, aes(x = factor(year), y = Total_Emissions, fill = type)) +
        geom_bar(stat = "identity") +
        facet_grid(.~type) + 
        labs(x = "Year", y = "PM2.5 emissions in Tons", title = "Sources of PM2.5 emissions in Baltimore City, 1999-2008") +
        scale_fill_hue(c=45, l=80) +
        # theme_dark() is a complete theme, so it comes before the theme() tweaks it would otherwise overwrite
        theme_dark() +
        theme(plot.title = element_text(size = 14),
              axis.title.x = element_text(size = 12),
              axis.title.y = element_text(size = 12))

# Save with a separate ggsave() call; recent ggplot2 versions do not allow adding ggsave() with `+`
ggsave("Q3_SourcePM25Baltimore.png", plot = SourcePM25Baltimore, width = 25, height = 20, units = "cm")

print(SourcePM25Baltimore)

scratch space

head(tot_emi_24510_by_type)
## # A tibble: 6 x 3
## # Groups:   year [2]
##    year type     Total_Emissions
##   <int> <chr>              <dbl>
## 1  1999 NON-ROAD            523.
## 2  1999 NONPOINT           2108.
## 3  1999 ON-ROAD             347.
## 4  1999 POINT               297.
## 5  2002 NON-ROAD            241.
## 6  2002 NONPOINT           1510.
print("Nonpoint sources remained the largest contributor throughout, even though PM2.5 emissions dropped from 1999 to 2008. Point-source emissions peaked in 2005, which may indicate a pollution incident that year. Both non-road and on-road sources are relatively low and show a downward trend.")
## [1] "Nonpoint sources remained the largest contributor throughout, even though PM2.5 emissions dropped from 1999 to 2008. Point-source emissions peaked in 2005, which may indicate a pollution incident that year. Both non-road and on-road sources are relatively low and show a downward trend."
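The statement above about which source types rose or fell can be checked numerically rather than read off the plot. The following is a rough sketch, assuming a table with columns `year`, `type`, and `Total_Emissions` like the one built in this section. `change_by_type` is a hypothetical helper name, not part of the original analysis.

```r
# Hypothetical helper: difference between the last and first year for each
# source type. Assumes columns `year`, `type`, and `Total_Emissions`.
change_by_type <- function(df) {
  df <- as.data.frame(df)  # drop tibble grouping, if any
  first <- df[df$year == min(df$year), c("type", "Total_Emissions")]
  last <- df[df$year == max(df$year), c("type", "Total_Emissions")]
  out <- merge(first, last, by = "type", suffixes = c("_first", "_last"))
  out$Change <- out$Total_Emissions_last - out$Total_Emissions_first
  out$Direction <- ifelse(out$Change < 0, "decrease",
                          ifelse(out$Change > 0, "increase", "no change"))
  out
}

# Runs only if the summary table from this section exists in the session.
if (exists("tot_emi_24510_by_type")) {
  print(change_by_type(tot_emi_24510_by_type))
}
```

Comparing only the first and last year ignores the intermediate years, so a spike such as the one in 2005 would not show up in this table.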

Q5. How have emissions from motor vehicle sources changed from 1999 to 2008 in Baltimore City?

setwd("C:/Users/angul/OneDrive/R/ExploreData/Data")

library(ggplot2)
library(dplyr)
library(RColorBrewer)

NEI <- readRDS("summarySCC_PM25.rds")
SCC <- readRDS("Source_Classification_Code.rds")

SCC_Vehicles <- SCC %>%
        filter(grepl('[Vv]ehicle', SCC.Level.Two)) %>%
        select(SCC, SCC.Level.Two)

Tot_Emi_24510_V <- NEI %>%
        filter(fips == "24510") %>%
        select(SCC, fips, Emissions, year) %>%
        inner_join(SCC_Vehicles, by = "SCC") %>%
        group_by(year) %>%
        summarise(Total_Emissions = sum(Emissions, na.rm = TRUE)) %>%
        select(Total_Emissions, year)
## `summarise()` ungrouping output (override with `.groups` argument)
BaltimoreVehicles_bar <- ggplot(Tot_Emi_24510_V, aes(factor(year), Total_Emissions)) +
        geom_bar(stat = "identity", fill = "black", width = 0.5,
        col="yellow") + labs(x = "Year", y = "PM2.5 emissions, in Tons", 
        title = "PM2.5 emissions in Baltimore City, 1999-2008", 
        subtitle = "from vehicle-related sources") +
        theme(plot.title = element_text(size = 14),
              axis.title.x = element_text(size = 12),
              axis.title.y = element_text(size = 12))

# Save with a separate ggsave() call; recent ggplot2 versions do not allow adding ggsave() with `+`
ggsave("Q5_BaltimoreVehicles_bar.png", plot = BaltimoreVehicles_bar, width = 25, height = 20, 
        units = "cm")

print(BaltimoreVehicles_bar)

scratch space

head(Tot_Emi_24510_V)
## # A tibble: 4 x 2
##   Total_Emissions  year
##             <dbl> <int>
## 1            404.  1999
## 2            192.  2002
## 3            185.  2005
## 4            138.  2008
print("Motor vehicle emissions dropped from 404 in 1999 to 138 Tons in 2008, in Baltimore City.")
## [1] "Motor vehicle emissions dropped from 404 in 1999 to 138 Tons in 2008, in Baltimore City."
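The vehicle filter in this section depends on how `grepl` matches the word "vehicle" in `SCC.Level.Two`. As a small illustration, the sketch below compares the character-class pattern used above with a case-insensitive match. The three labels are made up for the example and are not taken from the SCC table.

```r
# Illustrative labels only; they are NOT taken from the SCC table.
level_two <- c("Highway Vehicles - Gasoline",
               "Off-highway vehicle",
               "Fuel Combustion")

# Character class: matches "Vehicle" or "vehicle", but not "VEHICLE".
class_match <- grepl("[Vv]ehicle", level_two)

# Case-insensitive match: also catches any other capitalisation.
loose_match <- grepl("vehicle", level_two, ignore.case = TRUE)

print(level_two[class_match])
print(identical(class_match, loose_match))
```

Either pattern selects every category whose second-level label mentions vehicles, so it is worth inspecting the matched labels before treating the result as "motor vehicle sources".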

Q6. Compare emissions from motor vehicle sources in Baltimore City with emissions from motor vehicle sources in Los Angeles County, California ( fips == 06037). Which city has seen greater changes over time in motor vehicle emissions?

setwd("C:/Users/angul/OneDrive/R/ExploreData/Data")

library(ggplot2)
library(dplyr)
library(RColorBrewer)

SCC_Vehicles <- SCC %>%
        filter(grepl('[Vv]ehicle', SCC.Level.Two)) %>%
        select(SCC, SCC.Level.Two)

Tot_Emi_Two_Locs <- NEI %>%
        filter(fips == "24510" | fips == "06037") %>%
        select(fips, SCC, Emissions, year) %>%
        inner_join(SCC_Vehicles, by = "SCC") %>%
        group_by(fips, year) %>%
        summarise(Total_Emissions = sum(Emissions, na.rm = TRUE)) %>%
        select(Total_Emissions, fips, year)
## `summarise()` regrouping output by 'fips' (override with `.groups` argument)
Tot_Emi_Two_Locs$fips <- gsub("24510", "Baltimore City", Tot_Emi_Two_Locs$fips)
Tot_Emi_Two_Locs$fips <- gsub("06037", "Los Angeles County", Tot_Emi_Two_Locs$fips)

Vehicle_BaltLA_bar <- ggplot(Tot_Emi_Two_Locs, aes(x = factor(year), y = Total_Emissions, fill = fips )) +
        geom_bar(stat = "identity", width = 0.7) +
        facet_grid(.~fips) + 
        labs(x = "Year", y = "PM2.5 emissions, in Tons", title = "Vehicle related PM2.5 emissions, 1999-2008") +
        scale_fill_hue(c=45, l=80) +
        # theme_dark() is a complete theme, so it comes before the theme() tweaks it would otherwise overwrite
        theme_dark() +
        theme(plot.title = element_text(size = 14),
              axis.title.x = element_text(size = 12),
              axis.title.y = element_text(size = 12),
              strip.text.x = element_text(size = 12))

# Save with a separate ggsave() call; recent ggplot2 versions do not allow adding ggsave() with `+`
ggsave("Q6_Vehicle_BaltLA_bar.png", plot = Vehicle_BaltLA_bar, width = 25, height = 20, units = "cm")

print(Vehicle_BaltLA_bar)

scratch space

print(summary(Tot_Emi_Two_Locs))
##  Total_Emissions      fips                year     
##  Min.   : 138.2   Length:8           Min.   :1999  
##  1st Qu.: 190.4   Class :character   1st Qu.:2001  
##  Median :3256.7   Mode  :character   Median :2004  
##  Mean   :3492.9                      Mean   :2004  
##  3rd Qu.:6612.9                      3rd Qu.:2006  
##  Max.   :7304.1                      Max.   :2008
Tot_Emi_Two_Locs
## # A tibble: 8 x 3
## # Groups:   fips [2]
##   Total_Emissions fips                year
##             <dbl> <chr>              <int>
## 1           6110. Los Angeles County  1999
## 2           7189. Los Angeles County  2002
## 3           7304. Los Angeles County  2005
## 4           6421. Los Angeles County  2008
## 5            404. Baltimore City      1999
## 6            192. Baltimore City      2002
## 7            185. Baltimore City      2005
## 8            138. Baltimore City      2008

scratch space

print((Tot_Emi_Two_Locs[4,1]/Tot_Emi_Two_Locs[8,1] +
Tot_Emi_Two_Locs[3,1]/Tot_Emi_Two_Locs[7,1] +
Tot_Emi_Two_Locs[2,1]/Tot_Emi_Two_Locs[6,1]+
Tot_Emi_Two_Locs[1,1]/Tot_Emi_Two_Locs[5,1])/4)
##   Total_Emissions
## 1        34.60322
print("Vehicle-related emissions in Los Angeles County were on average 35 times those of Baltimore City during the period 1999-2008")
## [1] "Vehicle-related emissions in Los Angeles County were on average 35 times those of Baltimore City during the period 1999-2008"
print("The change from 1999 to 2008 followed a steady downward trend and resulted in a reduction of: ")
## [1] "The change from 1999 to 2008 followed a steady downward trend and resulted in a reduction of: "
print(Tot_Emi_Two_Locs[5,1]-Tot_Emi_Two_Locs[8,1]) 
##   Total_Emissions
## 1        265.5298
print("in Baltimore City, while in Los Angeles County vehicle related emissions rose from 6110 to 7304 Tons from 1999 to 2005, then went down to 6421 Tons in 2008, this is an increase of: ")
## [1] "in Baltimore City, while in Los Angeles County vehicle related emissions rose from 6110 to 7304 Tons from 1999 to 2005, then went down to 6421 Tons in 2008, this is an increase of: "
print(Tot_Emi_Two_Locs[4,1]-Tot_Emi_Two_Locs[1,1]) 
##   Total_Emissions
## 1         311.327
print("311 Tons of vehicle related emissions")
## [1] "311 Tons of vehicle related emissions"
print("Vehicle-related emissions in Los Angeles County were on average 35 times those of Baltimore City during the period 1999-2008. In Baltimore City there was a steady downward trend, resulting in a reduction of 266 Tons of PM2.5 emissions. In Los Angeles County, vehicle-related emissions rose from 6110 to 7304 Tons between 1999 and 2005, then fell to 6421 Tons in 2008, a net increase of 311 Tons when comparing 1999 to 2008.")
## [1] "Vehicle-related emissions in Los Angeles County were on average 35 times those of Baltimore City during the period 1999-2008. In Baltimore City there was a steady downward trend, resulting in a reduction of 266 Tons of PM2.5 emissions. In Los Angeles County, vehicle-related emissions rose from 6110 to 7304 Tons between 1999 and 2005, then fell to 6421 Tons in 2008, a net increase of 311 Tons when comparing 1999 to 2008."
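The comparison above is in absolute tons, which favours the much larger Los Angeles totals. As a rough sketch of a relative comparison, the block below uses the 1999 and 2008 totals rounded to whole tons as printed in the table in this section. The data frame and column names are made up for the illustration.

```r
# Totals rounded to whole tons, copied from the table printed in this section.
# Data frame and column names are hypothetical, chosen for this sketch.
vehicle_totals <- data.frame(
  county = c("Los Angeles County", "Baltimore City"),
  tons_1999 = c(6110, 404),
  tons_2008 = c(6421, 138)
)

# Absolute change (2008 minus 1999) and the same change as a percent of 1999.
vehicle_totals$abs_change <- vehicle_totals$tons_2008 - vehicle_totals$tons_1999
vehicle_totals$pct_change <- round(100 * vehicle_totals$abs_change /
                                     vehicle_totals$tons_1999)

print(vehicle_totals)
```

On these rounded inputs Los Angeles County rises by roughly 5% while Baltimore City falls by roughly 66%, so Baltimore City shows the larger relative change even though the absolute changes are of similar size.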

Get the full EPA report here