Emissions trends over time (by city and source)

Plot 1: total emissions trends

Loading in emissions data:

setwd("C:/Users/dhnsingh/Downloads/wk4_case_study/exdata_data_NEI_data")

x <- readRDS("summarySCC_PM25.rds")
y <- readRDS("Source_Classification_Code.rds")

Checking variable types:

str(x)
## 'data.frame':    6497651 obs. of  6 variables:
##  $ fips     : chr  "09001" "09001" "09001" "09001" ...
##  $ SCC      : chr  "10100401" "10100404" "10100501" "10200401" ...
##  $ Pollutant: chr  "PM25-PRI" "PM25-PRI" "PM25-PRI" "PM25-PRI" ...
##  $ Emissions: num  15.714 234.178 0.128 2.036 0.388 ...
##  $ type     : chr  "POINT" "POINT" "POINT" "POINT" ...
##  $ year     : int  1999 1999 1999 1999 1999 1999 1999 1999 1999 1999 ...
summary(x)
##      fips               SCC             Pollutant        
##  Length:6497651     Length:6497651     Length:6497651    
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##    Emissions            type                year     
##  Min.   :     0.0   Length:6497651     Min.   :1999  
##  1st Qu.:     0.0   Class :character   1st Qu.:2002  
##  Median :     0.0   Mode  :character   Median :2005  
##  Mean   :     3.4                      Mean   :2004  
##  3rd Qu.:     0.1                      3rd Qu.:2008  
##  Max.   :646952.0                      Max.   :2008

Collapsing to totals by year:

ag <- aggregate(Emissions~year, data = x, FUN = sum)
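
To make the formula interface of aggregate() concrete, here is a minimal sketch on a made-up data frame. The name `toy` and its values are hypothetical and are not taken from the NEI data.

```r
# Hypothetical toy data standing in for the NEI table
toy <- data.frame(
  year      = c(1999, 1999, 2002, 2002, 2002),
  Emissions = c(10, 5, 4, 3, 1)
)

# Emissions ~ year: sum Emissions within each distinct year
toy_ag <- aggregate(Emissions ~ year, data = toy, FUN = sum)
print(toy_ag)

# tapply() computes the same totals, returned as a named vector
toy_tot <- tapply(toy$Emissions, toy$year, sum)
print(toy_tot)
```

The result has one row per year, which is the shape the plotting calls expect.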

Plotting totals by year:

par(bg = "yellow")
plot(ag$year, ag$Emissions, cex = 3, pch = 18, col = "darkgreen",
     main = "Total Emissions over Time", 
     xlab = "Year", ylab = "Total Emissions (tons)")
lines(ag$year, ag$Emissions, col = "red", lwd = 2)

Total PM2.5 emissions from all source types decreased between 1999 and 2008.

Plot 2: total emissions trends in Baltimore, MD

Subsetting data to Baltimore, MD:

x_sub2 <- subset(x, fips == "24510")

Collapsing to totals by year:

ag2 <- aggregate(Emissions~year, data = x_sub2, FUN = sum)

Plotting emissions totals in Baltimore, MD:

par(bg = "lightblue")
plot(ag2$year, ag2$Emissions, cex = 3, pch = 18, col = "maroon",
     main = "Baltimore: Total Emissions over Time", 
     xlab = "Year", ylab = "Total Emissions (tons)")
lines(ag2$year, ag2$Emissions, col = "white", lwd = 2)

Total emissions from all source types in Baltimore City decreased between 1999 and 2008.

Loading ggplot2 and ggthemes:

library(ggplot2)
library(ggthemes)

Plot 3: total emissions trends in Baltimore, MD by emission type

Subsetting data to Baltimore, MD:

x_sub3 <- subset(x, fips == "24510")

Collapsing to totals by year and type:

ag3 <- aggregate(Emissions~year+type, data = x_sub3, FUN = sum)
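
A two-variable formula produces one row per year and type combination. As a rough sketch on made-up data (the values and the names `toy`, `toy_ag`, and `wide` are hypothetical), xtabs() can spread those totals into a year-by-type table, which makes the per-type trend easier to read off:

```r
# Hypothetical toy data: two years, two source types
toy <- data.frame(
  year      = c(1999, 1999, 1999, 2008, 2008),
  type      = c("POINT", "POINT", "ON-ROAD", "POINT", "ON-ROAD"),
  Emissions = c(3, 1, 7, 5, 2)
)

# One total per (year, type) pair, in long format
toy_ag <- aggregate(Emissions ~ year + type, data = toy, FUN = sum)

# Same totals as a table: rows are years, columns are types
wide <- xtabs(Emissions ~ year + type, data = toy_ag)
print(wide)
```

The long format in `toy_ag` is what ggplot() needs; the wide table is only a convenience for inspection.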

Plotting emissions totals in Baltimore, MD by type:

g3 <- ggplot(ag3, aes(year, Emissions, color = type))
g3 + geom_point(size = 2.5) + geom_line(size = 1.25) + 
  xlab("Year") + ylab("Emissions (tons)") + ggtitle("Baltimore: Emissions by Type")  + 
  theme_wsj()

All emission types decreased between 1999 and 2008 except point-source emissions, which spiked around 2005 and then fell back toward their earlier level.

Plot 4: coal emissions

Subsetting columns:

x4 <- x[c("SCC", "Emissions", "year")]
y4 <- y[c("SCC", "Short.Name", "SCC.Level.Three")]

Subsetting to labels capturing coal caused pollution:

# Keep codes whose short name or level-three label mentions "Coal" or "coal"
y_coal <- y4[grepl("[Cc]oal", y4$Short.Name) | grepl("[Cc]oal", y4$SCC.Level.Three), ]
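
Because grepl() matches substrings, it is worth checking what a coal pattern actually picks up. A small sketch on invented labels (the strings below are hypothetical, only loosely modeled on Short.Name entries):

```r
# Hypothetical labels, not taken from the classification table
labels <- c("Ext Comb /Electric Gen /Anthracite Coal",
            "Charcoal Grilling",
            "Residential Wood Burning",
            "COAL MINING")

# Matches "Coal" or "coal" anywhere in the string
m_case <- grepl("[Cc]oal", labels)

# Case-insensitive variant: also matches "COAL"
m_any <- grepl("coal", labels, ignore.case = TRUE)

print(data.frame(labels, m_case, m_any))
```

Note that "Charcoal" matches both patterns. Whether such entries should count as coal sources is a judgment call, so inspecting unique(y_coal$Short.Name) before merging is a reasonable sanity check.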

Merging emissions for rows caused by coal:

# Inner join on the column subset: keeps only emission rows with a coal-related SCC
mrg4 <- merge(x4, y_coal, by = "SCC")
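
The choice of join matters for what ends up in the merged table. By default merge() keeps only keys present in both inputs, while all.y = TRUE also keeps codes from the second table that have no emission rows, filling them with NA. A minimal sketch with made-up tables (`emis` and `codes` are hypothetical stand-ins):

```r
# Hypothetical stand-ins for the emissions and classification tables
emis  <- data.frame(SCC = c("A", "A", "B"),
                    Emissions = c(1, 2, 5),
                    year = c(1999, 2002, 1999))
codes <- data.frame(SCC = c("A", "C"),
                    Short.Name = c("Coal boiler", "Coal furnace"))

inner <- merge(emis, codes, by = "SCC")                # rows for SCC "A" only
right <- merge(emis, codes, by = "SCC", all.y = TRUE)  # adds SCC "C" with NA values

n_inner <- nrow(inner)
n_right <- nrow(right)
n_na    <- sum(is.na(right$Emissions))

# The formula interface of aggregate() drops NA rows by default,
# so the yearly totals are the same either way
tot_inner <- aggregate(Emissions ~ year, data = inner, FUN = sum)
tot_right <- aggregate(Emissions ~ year, data = right, FUN = sum)
```

The totals agree here only because aggregate() silently omits the NA rows; the inner join makes that intent explicit.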

Collapsing to totals by year:

ag4 <- aggregate(Emissions~year, data = mrg4, FUN = sum)

Plotting coal emissions trends:

g4 <- ggplot(ag4, aes(year, Emissions))
g4 + geom_point(size = 3, color = "black") + geom_line(size = 1.25, color = "gray") + 
  xlab("Year") + ylab("Emissions (tons)") + ggtitle("Coal Emissions Trends")  + 
  theme_economist()

Coal-related emissions declined between 1999 and 2008.

Plot 5: motor vehicle emissions

Subsetting data to Baltimore, MD:

x_sub5 <- subset(x, fips == "24510")

Treating the ON-ROAD type as motor vehicle sources:

x_sub5 <- x_sub5[x_sub5$type == "ON-ROAD",]

Collapsing to totals by year:

ag5 <- aggregate(Emissions~year, data = x_sub5, FUN = sum)

Plotting motor emissions trends:

g5 <- ggplot(ag5, aes(year, Emissions))
g5 + geom_point(size = 3, shape = 22, color = "red", stroke = 2, fill = "gray") + geom_line(size = 1.25, color = "darkblue") + 
  xlab("Year") + ylab("Emissions (tons)") + ggtitle("Baltimore: Motor Emissions Trends")  + 
  theme_stata()

Emissions from on-road (motor vehicle) sources in Baltimore fell steadily between 1999 and 2008.

Plot 6: motor vehicles, comparative measure across cities

Subsetting data to Baltimore City and Los Angeles County:

x_sub6 <- subset(x, fips == "24510" | fips == "06037")

Treating the ON-ROAD type as motor vehicle sources:

x_sub6 <- x_sub6[x_sub6$type == "ON-ROAD",]

Collapsing to totals by year and area:

ag6 <- aggregate(Emissions~year+fips, data = x_sub6, FUN = sum)

Plotting motor emissions trends comparison:

# One panel per fips code: "06037" is Los Angeles County, "24510" is Baltimore City
g6 <- ggplot(ag6, aes(year, Emissions))
g6 + geom_point(size = 3, shape = 22, color = "red", stroke = 2, fill = "gray") + geom_line(size = 1.25, color = "darkblue") + 
  xlab("Year") + ylab("Emissions (tons)") + ggtitle("Los Angeles County vs. Baltimore City: Motor Emissions Trends")  + 
  theme_stata() + facet_wrap(~fips)

Los Angeles County has seen far larger changes in motor vehicle emissions over the period than Baltimore City. This may be due to its population size, density, or sprawl.
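
Since the two areas differ greatly in scale, one way to compare them on a common footing is to express each year's total as a percentage change from that area's first year. A minimal sketch on invented totals (the values and the column name `pct_change` are hypothetical, not results from the NEI data):

```r
# Hypothetical yearly totals for two areas
toy <- data.frame(
  fips      = rep(c("06037", "24510"), each = 2),
  year      = rep(c(1999, 2008), times = 2),
  Emissions = c(1000, 1100, 200, 50)
)

# Sort so the first row within each fips is its earliest year
toy <- toy[order(toy$fips, toy$year), ]

# Baseline: the first-year total, repeated for every row of that area
base <- ave(toy$Emissions, toy$fips, FUN = function(v) v[1])

# Percentage change relative to the baseline year
toy$pct_change <- 100 * (toy$Emissions - base) / base
print(toy)
```

Applied to ag6, the same steps would give a relative measure that can be plotted alongside the absolute totals.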