Reading the data & loading necessary libraries
NEI <- readRDS("F:/Exploratory Data Analysis/Week 4/summarySCC_PM25.rds")
SCC <- readRDS("F:/Exploratory Data Analysis/Week 4/Source_Classification_Code.rds")
dim(NEI)
## [1] 6497651 6
dim(SCC)
## [1] 11717 15
library(ggplot2)
library(plyr)
Have total emissions from PM2.5 decreased in the United States from 1999 to 2008?
totalemissionsyearwise <- tapply(NEI$Emissions,NEI$year,sum,na.rm=FALSE)
plot(names(totalemissionsyearwise),totalemissionsyearwise,type ="l",col="blue", lwd = 3 ,xlab="Year",ylab = expression("Total"~PM[2.5]~"Emissions(tons)"))
title(main = list(expression("Total US" ~ PM[2.5] ~ "Emissions by Year"),col = "blue", cex= 1.5))

Answer –
Yes. PM2.5 emissions in the United States fell sharply from 1999 to 2002, declined more slowly from 2002 to 2005, and then fell faster again from 2005 to 2008. These conclusions are drawn from the graph titled “Total US PM2.5 Emissions by Year”.
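The "sharp" versus "slower" decline described above can be checked numerically rather than read off the plot. The sketch below computes the absolute and percent change between consecutive inventory years. The vector `totals` holds invented numbers for illustration only, not the NEI results.

```r
# Toy yearly totals (made-up numbers, NOT the NEI results)
totals <- c("1999" = 100, "2002" = 80, "2005" = 76, "2008" = 50)

# Absolute change between consecutive inventory years
abs_change <- diff(totals)

# Percent change relative to the earlier year of each pair
pct_change <- 100 * abs_change / head(totals, -1)

# Label each interval, e.g. "1999-2002"
names(pct_change) <- paste(head(names(totals), -1),
                           tail(names(totals), -1), sep = "-")

print(round(pct_change, 1))
```

The same steps could presumably be applied to `totalemissionsyearwise`, since `tapply()` returns a named one-dimensional array.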
Have total emissions from PM2.5 decreased in the Baltimore City, Maryland (fips == “24510”) from 1999 to 2008?
Baltimore <- subset(NEI, fips == "24510")
dim(Baltimore)
## [1] 2096 6
Baltimoreemissionsyearwise <- tapply(Baltimore$Emissions,Baltimore$year,sum, na.rm=TRUE)
plot(names(Baltimoreemissionsyearwise),Baltimoreemissionsyearwise,type ="l",col="blue", lwd = 3 ,xlab="Year",ylab = expression("Total"~PM[2.5]~"Emissions(tons) in Baltimore"))
title(main = list(expression("Total Baltimore" ~ PM[2.5] ~ "Emissions by Year"),col = "blue", cex= 1.5))

Answer –
For Baltimore City, PM2.5 emissions follow a zigzag trend from 1999 to 2008: a sharp decrease from 1999 to 2002, a reversal with an increase from 2002 to 2005, and then another sharp reduction from 2005 to 2008.
Of the four types of sources indicated by the type (point, nonpoint, onroad, nonroad) variable, which of these four sources have seen decreases in emissions from 1999-2008 for Baltimore City? Which have seen increases in emissions from 1999-2008?
Baltimore <- subset(NEI, fips == "24510")
Baltimoresource <- ddply(Baltimore,.(year,type),function(x) sum(x$Emissions))
colnames(Baltimoresource)[3] <- "Emissions"
ggplot(Baltimoresource, aes(x=year, y=Emissions, group = type, colour = type)) +geom_line() + ggtitle(expression("Baltimore City" ~ PM[2.5] ~ "Emissions by source, type and year")) + xlab("Year") + ylab(expression("Total" ~ PM[2.5] ~ "Emissions (in tons)"))

Answer –
Nonpoint (green line): Emissions decreased sharply from 1999 to 2002, held steady at about 1,500 tons of \(PM_{2.5}\) from 2002 to 2005, and then decreased slightly from 2005 to 2008.
Point (purple line): Emissions increased slightly from 1999 to 2002, increased sharply from 2002 to 2005, and then decreased sharply from 2005 to 2008.
Onroad (blue line): Emissions decreased slightly from 1999 to 2002 and remained approximately steady from 2002 through 2008. Overall, onroad emissions were lower than nonroad emissions.
Nonroad (red line): Emissions followed the same path as the onroad values, only slightly higher: a slight decrease from 1999 to 2002, then approximately steady from 2002 to 2005 and again from 2005 to 2008.
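To answer "which types decreased and which increased" directly, the first and last years can be compared for each type. The sketch below uses a two-way `tapply()` for this. The data frame `toy` and its values are invented stand-ins for the Baltimore subset, not real results.

```r
# Toy stand-in for the Baltimore subset (values are invented for illustration)
toy <- data.frame(
  year      = rep(c(1999, 2008), each = 4),
  type      = rep(c("POINT", "NONPOINT", "ON-ROAD", "NON-ROAD"), times = 2),
  Emissions = c(10, 40, 8, 12,   # 1999
                14, 30, 3,  2)   # 2008
)

# Sum emissions into a type-by-year matrix
wide <- tapply(toy$Emissions, list(toy$type, toy$year), sum)

# Net change from the first to the last year, per type
net_change <- wide[, "2008"] - wide[, "1999"]

# Classify each type by the sign of its net change
direction <- ifelse(net_change < 0, "decrease", "increase")

print(data.frame(net_change, direction))
```

A net change only compares the endpoints, so a type that rose and then fell in between (as the point line does in the plot) still needs the full line chart to be described properly.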
Answer –
Total (black line): Total PM2.5 emissions from coal combustion sources declined from 1999 to 2002, increased slightly from 2002 to 2005, and fell sharply from 2005 to 2008.
Nonpoint (orange line): Emissions increased sharply from 1999 to 2002, remained steady from 2002 to 2005, and decreased sharply from 2005 to 2008.
Point (green line): Point-source PM2.5 emissions from coal combustion declined from 1999 to 2002, increased slightly from 2002 to 2005, and fell sharply from 2005 to 2008.
How have emissions from motor vehicle sources changed from 1999-2008 in Baltimore City?
Assumption –
1 ) Emissions from each type of source are considered separately as the motor vehicle emissions of that source type for that particular year.
2 ) The consolidated emissions from all source types are considered the total motor vehicle emissions for that particular year.
Baltimore <- subset(NEI, fips == "24510")
# Motorscc is built here but is not used to filter Baltimore in the aggregation below
Motorscc <- subset(SCC, grepl("Motor",Short.Name))
dim(Motorscc)
## [1] 138 15
Baltimoremotor <- ddply(Baltimore,.(year,type),function(x) sum(x$Emissions))
colnames(Baltimoremotor)[3] <- "Emissions"
dim(Baltimoremotor)
## [1] 16 3
qplot(year, Emissions, data=Baltimoremotor, color=type, geom="line") + stat_summary(fun.y = "sum", fun.ymin = "sum", fun.ymax = "sum", color = "black", lwd=2, aes(shape="total"), geom="line") + geom_line(aes(size="total", shape = NA)) + ggtitle(expression("Motor Vehicle" ~ PM[2.5] ~ "Emissions by Source Type and Year")) + xlab("Year") + ylab(expression("Motor Vehicle" ~ PM[2.5] ~ "Emissions (tons)"))
## Warning: Ignoring unknown aesthetics: shape
## Warning: Ignoring unknown aesthetics: shape
## Warning: Using size for a discrete variable is not advised.

Answer –
Total (black line): Total PM2.5 emissions attributed to motor vehicles declined from 1999 to 2002, increased substantially from 2002 to 2005, and fell sharply from 2005 to 2008.
Nonroad (orange line): Nonroad emissions decreased from 1999 to 2002, remained steady from 2002 to 2005, and decreased sharply from 2005 to 2008.
Onroad (sky blue line): Onroad emissions decreased from 1999 to 2002, remained steady from 2002 to 2005, and decreased slightly from 2005 to 2008.
Nonpoint (green line): Nonpoint emissions declined from 1999 to 2002, showed no change from 2002 to 2005, and fell marginally from 2005 to 2008.
Point (purple line): Point-source emissions increased from 1999 to 2002, increased sharply from 2002 to 2005, and fell sharply from 2005 to 2008.
Compare emissions from motor vehicle sources in Baltimore City with emissions from motor vehicle sources in Los Angeles County, California (fips == “06037”). Which city has seen greater changes over time in motor vehicle emissions?
Assumption – The consolidated emissions from all source types are considered the total motor vehicle emissions for each year. The PNG graph plots only these yearly totals for each county.
Baltimore <- subset(NEI,fips=="24510")
Losangeles <- subset(NEI,fips=="06037")
# Motorscc is built here but is not used to filter either county in the aggregation below
Motorscc <- subset(SCC, grepl("Motor",Short.Name))
Baltimoremotor1 <- ddply(Baltimore,.(year),function(x) sum(x$Emissions))
Losangelesmotor1 <- ddply(Losangeles,.(year),function(x) sum(x$Emissions))
colnames(Baltimoremotor1)[2] <- "Emissions"
colnames(Losangelesmotor1)[2] <- "Emissions"
dim(Baltimoremotor1)
## [1] 4 2
dim(Losangelesmotor1)
## [1] 4 2
BAM <- mutate(Baltimoremotor1, County = "Baltimore")
LAM <- mutate(Losangelesmotor1, County = "Losangeles")
BALAM <- rbind(BAM,LAM)
ggplot(BALAM, aes(x=year, y=Emissions, group = County, colour = County)) +geom_line(lwd = 2) + ggtitle(expression("Yearwise Comparison of Total Motor Vehicles" ~ PM[2.5] ~ "Emissions in Baltimore & Los Angeles")) + xlab("Year") + ylab(expression("Total Motor Vehicles" ~ PM[2.5] ~ "Emissions (in tons)"))

Answer –
The graph and data show that motor vehicle pollution was far higher in Los Angeles (LA) than in Baltimore (BA).
From 1999 to 2002, LA saw a substantial decrease in motor vehicle pollution relative to BA.
From 2002 to 2005, motor vehicle pollution increased in BA, whereas it decreased in LA.
From 2005 to 2008, motor vehicle pollution fell in BA, whereas it rose abruptly in LA.
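"Greater change" can mean a larger change in tons or a larger change relative to the starting level, and the two can point to different counties. The sketch below computes both for each county. The data frame `toy` and its values are invented for illustration and do not reproduce the figures plotted above.

```r
# Toy yearly totals per county (invented numbers, NOT the plotted results)
toy <- data.frame(
  County    = rep(c("Baltimore", "Los Angeles"), each = 4),
  year      = rep(c(1999, 2002, 2005, 2008), times = 2),
  Emissions = c(300, 150, 140, 90, 4000, 4300, 4600, 4100)
)

# For each county, compare the first and last inventory year
change <- sapply(split(toy, toy$County), function(d) {
  d     <- d[order(d$year), ]          # make sure rows are in year order
  first <- d$Emissions[1]
  last  <- d$Emissions[nrow(d)]
  c(absolute = last - first,                  # change in tons
    percent  = 100 * (last - first) / first)  # change relative to first year
})

print(round(change, 1))
```

In this toy case one county has the larger percent change while the other could easily have the larger absolute change, which is why it helps to state which measure the comparison relies on.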