Introduction

The goal of this project is to look at United States carbon emissions from 2013 to 2017. The question the data will try to answer is whether, after signing the Paris Agreement, the US reduced its carbon emissions by the agreed amount. The UN states that the rise in global temperature must be kept below 2.0 degrees Celsius, and the Climate Action Tracker shows that in order to meet this goal, the US will need to reduce its total carbon emissions from 6500 Mt to around 3000 Mt. The data that this project works with is focused on power plant emissions, and while it does not show the total carbon emissions for the US, it gives us an idea of how many power plants run on fossil fuels and how much CO2 these power plants emit. From this, the project estimates the current CO2 emissions of fossil fuel power plants in order to show that there needs to be an effort to decrease the reliance on them. It also shows which fossil fuel is used most by power plants and how the energy output of renewable sources compares with that of fossil fuel power plants.

Loading in the Data

First, the data needs to be loaded into R and then tidied. Since we are working with power plants in the US, I selected the US rows and then, for each column, removed any data that is NA.
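The tidying step described above is not shown in the report, so the following is only a minimal sketch of what it could look like. The `plants` tibble, its values, and the `country` column with the code "USA" are assumptions made for illustration; only `primary_fuel` and the `generation_gwh_*` columns appear in the report's own code.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the raw data. The `country` column and all values are
# assumptions for illustration, not the report's actual data.
plants <- tibble(
  country = c("USA", "USA", "CAN", "USA"),
  primary_fuel = c("Coal", "Solar", "Hydro", "Gas"),
  generation_gwh_2013 = c(4100, NA, 900, 650),
  generation_gwh_2014 = c(4000, 12, 950, 700)
)

us_energy_sketch <- plants %>%
  filter(country == "USA") %>%                              # keep US plants only
  select(primary_fuel, starts_with("generation_gwh_")) %>%  # fuel type and yearly generation
  drop_na()                                                 # drop rows with any NA

us_energy_sketch
```

In this toy data, the solar plant is dropped because its 2013 value is missing, which leaves the coal and gas rows.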

us_energy %>% head(10)

This is the data that we are working with. It contains the type of fuel and the energy production of each power plant for the years 2013 - 2017. I then computed the average for each fuel type and removed the fuel types that did not make sense for this comparison, such as waste or other.

# Getting the average energy produced per plant, by fuel type, for each year
us_energy = us_energy %>% group_by(primary_fuel) %>% 
  summarise("Avg 2013 Energy gwh" = mean(generation_gwh_2013),
            "Avg 2014 Energy gwh" = mean(generation_gwh_2014),
            "Avg 2015 Energy gwh" = mean(generation_gwh_2015),
            "Avg 2016 Energy gwh" = mean(generation_gwh_2016),
            "Avg 2017 Energy gwh" = mean(generation_gwh_2017))

# getting rid of the other and waste categories by keeping only the named fuels
us_energy = us_energy %>% 
  filter(primary_fuel %in% c("Biomass", "Coal", "Gas", "Geothermal", "Hydro",
                             "Nuclear", "Oil", "Petcoke", "Solar", "Wind"))

us_energy
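As a side note, the per-fuel averaging above repeats the same `mean()` call for every year. A more compact way to express the same idea is `across()`, sketched here on made-up numbers. The `toy` tibble and its values are assumptions for illustration, and the output keeps the original `generation_gwh_*` column names rather than the renamed ones used in the report.

```r
library(dplyr)

# Made-up plant-level data: two coal plants and one gas plant.
toy <- tibble(
  primary_fuel = c("Coal", "Coal", "Gas"),
  generation_gwh_2013 = c(4000, 5000, 600),
  generation_gwh_2014 = c(3000, 5000, 700)
)

# Average every generation column within each fuel type in one call.
avg_by_fuel <- toy %>%
  group_by(primary_fuel) %>%
  summarise(across(starts_with("generation_gwh_"), mean))

avg_by_fuel
```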

Organizing the data for each visualization

For each graph that I created, I had to organize the data in a matrix or tibble so that it is readable by the plotting functions in base R or ggplot. The base R barplot required the data to be in a matrix, while ggplot required the data to be in a tibble. I also had to reorganize the tibbles so that there is one column that holds the energy production and another column that holds the fuel type. This was necessary in order to graph the data in groups, either for the facet wrap or for the grouped bar plot.
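The wide-to-long reshaping described above can also be sketched with `tidyr::pivot_longer()`. This is an alternative to the manual matrix approach used in this report, not the method the report's code follows. The `wide` tibble and its values are made up for illustration; the column names mirror the ones used in the report.

```r
library(dplyr)
library(tidyr)

# Made-up wide table: one row per fuel type, one column per year.
wide <- tibble(
  primary_fuel = c("Coal", "Gas"),
  `Avg 2013 Energy gwh` = c(4000, 600),
  `Avg 2014 Energy gwh` = c(3900, 620)
)

# One row per (fuel type, year) pair, with the year pulled out of the
# column name by the regular expression in names_pattern.
long <- wide %>%
  pivot_longer(
    cols = -primary_fuel,
    names_to = "years",
    names_pattern = "Avg (\\d{4}) Energy gwh",
    values_to = "energy in gwh"
  ) %>%
  mutate(years = as.integer(years)) %>%
  rename(`fuel type` = primary_fuel)

long
```

With this approach the year and energy columns keep their numeric types, so no `as.numeric()` conversion is needed afterwards.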

# create a function to convert gwh to twh
convert_to_twh = function(data_set){
  i = 1
  num = length(data_set)
  while (i <= num){
    data_set[,i] = (data_set[,i] / 1000)
    i = i + 1
  }
  return(data_set)
} 

# convert gwh to co2 emissions using 0.699 metric ton CO2 / mwh, then convert
# metric tons to megatonnes; works element-wise on a numeric vector or on a
# one-column tibble
get_co2 = function(data_set){
  mwh = data_set * 1000                  # 1 gwh = 1000 mwh
  co2_metric_tons = 0.699 * mwh          # metric tons of CO2
  co2_emission = co2_metric_tons / 1000000   # metric tons to megatonnes
  return(co2_emission)
}

us_energy_for_graph = us_energy
us_energy_for_graph = as.matrix(us_energy_for_graph)
# reorganize the us_energy graph into a new matrix
new_matrix = matrix(, 50, 3)
years = c(2013:2017)
new_matrix[,1] = years
new_matrix[1:5, 2] = us_energy_for_graph[1, 1]
new_matrix[6:10, 2] = us_energy_for_graph[2,1]
new_matrix[11:15, 2] = us_energy_for_graph[3, 1]
new_matrix[16:20, 2] = us_energy_for_graph[4, 1]
new_matrix[21:25, 2] = us_energy_for_graph[5, 1]
new_matrix[26:30, 2] = us_energy_for_graph[6, 1]
new_matrix[31:35, 2] = us_energy_for_graph[7,1]
new_matrix[36:40, 2] = us_energy_for_graph[8, 1]
new_matrix[41:45, 2] = us_energy_for_graph[9,1]
new_matrix[46:50, 2] = us_energy_for_graph[10, 1]

i = 1
j = 2
k = 1
while (i <= 50){
  while (j <= 6){
    new_matrix[i, 3] = us_energy_for_graph[k, j]
    i = i + 1
    j = j + 1
  }
  j = 2
  k = k + 1
}

colnames(new_matrix) = c("years", "fuel type", "energy in gwh")
new_tibble = as_tibble(new_matrix)
new_tibble$`energy in gwh`= as.numeric(new_tibble$`energy in gwh`)

co2_fossil_fuel = new_tibble
co2_fossil_fuel = co2_fossil_fuel %>% 
  filter(`fuel type` == "Biomass" | `fuel type` == "Coal" |
           `fuel type` == "Gas" | `fuel type` == "Oil" |
           `fuel type` == "Petcoke")
x = get_co2(co2_fossil_fuel[,3])
x = as.matrix(x)
co2_fossil_fuel = co2_fossil_fuel %>% 
  add_column("CO2 Emissions in Megatonnes" = x[,1])

# creating a table just for fossil fuel
us_fossil_fuel = us_energy %>% 
  filter(primary_fuel == "Biomass" | primary_fuel == "Coal" |
           primary_fuel == "Gas" | primary_fuel == "Oil" |
           primary_fuel == "Petcoke")

# creating a table for renewable energy
us_renewable_energy = us_energy %>% 
  filter(primary_fuel == "Geothermal" | primary_fuel == "Hydro" |
           primary_fuel == "Nuclear" | primary_fuel == "Solar" |
           primary_fuel == "Wind")

# getting the average of energy production for fossil fuel
avg_fossil_fuel = us_fossil_fuel %>% 
  summarise("2013 Avg Energy gwh" = mean(`Avg 2013 Energy gwh`), 
            "2014 Avg Energy gwh" = mean(`Avg 2014 Energy gwh`),
            "2015 Avg Energy gwh" = mean(`Avg 2015 Energy gwh`),
            "2016 Avg Energy gwh" = mean(`Avg 2016 Energy gwh`),
            "2017 Avg Energy gwh" = mean(`Avg 2017 Energy gwh`))

# getting the average renewable energy 
avg_renewable_energy = us_renewable_energy %>% 
  summarise("2013 Avg Energy gwh" = mean(`Avg 2013 Energy gwh`), 
            "2014 Avg Energy gwh" = mean(`Avg 2014 Energy gwh`),
            "2015 Avg Energy gwh" = mean(`Avg 2015 Energy gwh`),
            "2016 Avg Energy gwh" = mean(`Avg 2016 Energy gwh`),
            "2017 Avg Energy gwh" = mean(`Avg 2017 Energy gwh`))

# convert the units to twh and renaming the columns
avg_renewable_energy = convert_to_twh(avg_renewable_energy)
avg_renewable_energy = avg_renewable_energy %>% 
  rename("Avg 2013 Energy twh" = `2013 Avg Energy gwh`, 
         "Avg 2014 Energy twh" = `2014 Avg Energy gwh`,
         "Avg 2015 Energy twh" = `2015 Avg Energy gwh`,
         "Avg 2016 Energy twh" = `2016 Avg Energy gwh`,
         "Avg 2017 Energy twh" = `2017 Avg Energy gwh`)
avg_fossil_fuel = convert_to_twh(avg_fossil_fuel)
avg_fossil_fuel = avg_fossil_fuel %>% 
  rename("Avg 2013 Energy twh" = `2013 Avg Energy gwh`, 
         "Avg 2014 Energy twh" = `2014 Avg Energy gwh`,
         "Avg 2015 Energy twh" = `2015 Avg Energy gwh`,
         "Avg 2016 Energy twh" = `2016 Avg Energy gwh`,
         "Avg 2017 Energy twh" = `2017 Avg Energy gwh`)

# Convert the avg renewable energy and fossil fuel to a matrix
avg_renewable_energy = as.matrix(avg_renewable_energy)
avg_fossil_fuel = as.matrix(avg_fossil_fuel)

# Stores the data into a matrix, so it can be used in a bar plot
years = c(2013:2017)
both_energy_matrix = matrix(,5, 3)
both_energy_matrix[,1] = years
both_energy_matrix[,2] = avg_renewable_energy[1,]
both_energy_matrix[,3] = avg_fossil_fuel[1,]
colnames(both_energy_matrix) =
  c("Years", "Avg Renewable Energy in twh", "Avg Fossil Fuel in twh")


both_energy_matrix
##      Years Avg Renewable Energy in twh Avg Fossil Fuel in twh
## [1,]  2013                    2.722160              1.0849950
## [2,]  2014                    2.741538              1.0520123
## [3,]  2015                    2.759774              1.0046132
## [4,]  2016                    2.768830              0.9219338
## [5,]  2017                    2.797552              1.0678354
co2_fossil_fuel
new_tibble

Data Visualization

The first graph shows the difference in energy production between power plants that use renewable energy and power plants that use fossil fuels.

# Bar plot renewable energy vs fossil fuel 

opar = par(oma = c(2,0,0,0))
barplot(cbind(both_energy_matrix[,2], both_energy_matrix[,3])
        ~ both_energy_matrix[,1], beside = TRUE, col = c("#33cc66", "#663333"), 
        legend = T, xlab = "Years", ylab = "Avg Energy in twh", ylim = c(0,3),
        main = "Average Energy Consumption for Renewable and Non Renewable\nEnergy Sources")
par(opar)
opar = par(oma = c(0,0,0,0), mar = c(0,0,0,0), new = TRUE)
legend(x = "bottom", legend = c("Renewable Energy", "Fossil Fuel"), 
       fill = c("#33cc66", "#663333"), bty = "o", ncol = 2, 
       inset = -.35, cex = 1.2)

par(opar)

The first thing that is noticeable is that, on average, power plants that use renewable energy produce more energy than power plants that use fossil fuels. However, what makes the number that much higher is that nuclear power plants are counted among the renewable energy sources. The next graph will show that while nuclear power plants have a higher energy output, the other renewable energies have a much lower energy output.
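To illustrate how a single large category can pull up a group average, here is a small sketch with made-up numbers. The values below are purely illustrative and are not taken from the data set. It compares the mean across the renewable fuel types with and without nuclear.

```r
# Purely illustrative per-fuel averages in gwh (made-up values, not the
# report's data).
avg_gwh <- c(Geothermal = 400, Hydro = 200, Nuclear = 12000,
             Solar = 20, Wind = 150)

# Mean across all five categories, as in the grouped average.
mean_with_nuclear <- mean(avg_gwh)

# Mean after leaving nuclear out.
mean_without_nuclear <- mean(avg_gwh[names(avg_gwh) != "Nuclear"])

c(with_nuclear = mean_with_nuclear, without_nuclear = mean_without_nuclear)
```

With these toy values the group mean drops from 2554 gwh to 192.5 gwh once nuclear is excluded, which is the kind of effect the facet plot makes visible.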

# Facet wrap using us energy
ggplot(data = new_tibble) + aes(x = years, y = `energy in gwh`, 
                                group = `fuel type`, color = `fuel type`) +
  geom_line(lwd = 1) + facet_wrap(`fuel type`~.) + theme_bw(base_size = 14) +
  ggtitle("US Energy Consumption over Time") + 
  theme(plot.title = element_text(hjust = 0.5), 
        axis.text.x = element_text(angle = 60, hjust = 1)) + xlab("Year") + 
  ylab("Energy Consumption in gwh") + theme(text = element_text(size = 10)) +
  scale_y_continuous(trans = "identity", breaks = seq(1000, 14000, 2000), 
                     limits = c(1, 14000))

This graph shows the average energy production for each fuel type from 2013 to 2017 in the US. Even though the average for renewable energy is higher than the average for fossil fuels, the second highest fuel type is coal, which has an average energy output of just under 5000 gwh. The graph also highlights that gas and petcoke have a higher energy output than the other renewable energies. This demonstrates that nuclear energy is inflating the average for renewable energy and making it seem that renewable energy production is higher than fossil fuel production. Overall, it shows that the US is not entirely removed from fossil fuel consumption and has not made any progress in reducing its reliance on power plants that consume fossil fuels. The next graph demonstrates the problem with this reliance by showing the amount of CO2 emissions that are produced on average from burning fossil fuels. The calculation uses the average emission factor for the US that is stated by the EPA. This factor is 0.699 metric tons CO2 / mwh.
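The unit conversion behind the CO2 figures can be worked through in a few lines. This is a sketch that assumes the factor is applied per mwh, as the comment on `get_co2` indicates; `gwh_to_mt_co2` is a hypothetical helper name used only for this example.

```r
# Emission factor from the report, assumed to be in metric tons CO2 per mwh.
emission_factor_t_per_mwh <- 0.699

# Hypothetical helper: generation in gwh -> CO2 emissions in megatonnes (Mt).
gwh_to_mt_co2 <- function(gwh) {
  mwh <- gwh * 1000                             # 1 gwh = 1000 mwh
  metric_tons <- mwh * emission_factor_t_per_mwh
  metric_tons / 1e6                             # 1 Mt = 1,000,000 metric tons
}

# 1000 gwh gives 0.699 Mt and 5000 gwh gives 3.495 Mt.
gwh_to_mt_co2(c(1000, 5000))
```

In other words, every 1000 gwh of generation corresponds to roughly 0.7 Mt of CO2 under this factor.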

# CO2 emissions graph practice
theme_update(plot.title = element_text(hjust = 0.5))
ggplot(data = co2_fossil_fuel) + 
  aes(x = years, y = `CO2 Emissions in Megatonnes`, fill = `fuel type`) +
  geom_bar(stat = "identity", color = "black", position = position_dodge()) +
  ggtitle("CO2 Emissions in megatonnes in the US per year") + xlab("Year") + 
  ylab("CO2 Emissions in megatonne (Mt)") + theme_bw(base_size = 14) +
  scale_y_continuous(trans = "identity", breaks = seq(0, 3, .2), 
                     limits = c(0, 3)) +  
  theme(plot.title = element_text(hjust = 0.5))

This graph shows the average CO2 emissions for each fossil fuel, again giving a picture of the problem with still using these power plants, even after signing the Paris Agreement in 2015. Because the emissions are calculated from the per-plant generation averages, the values are averages per power plant. It shows that the highest fuel type, coal, emits an average of about 2.6 megatonnes of CO2 per year, and summing the averages for all the fossil fuels gives a total of about 3.8 megatonnes of CO2 per year. This graph highlights the original problem, where I wanted to look at US power plant emissions and see if and how the US can meet the Paris Agreement. The graph shows that the biggest CO2 emitters are power plants that burn coal, and that the US needs to find ways to reduce its reliance on this type of fuel.

Conclusion

The goal of this project was to look at US emissions from its power plants and to see if and how it can meet the standards for lowering CO2 emissions set by the Paris Agreement. The data shows that on average the US relies more on renewable energy than on fossil fuels, with nuclear energy having the highest energy output. The data also shows that besides nuclear energy, coal, gas, and petcoke have a higher energy output than the other renewable energies, which means that there is a greater reliance on these power plants, or more of them in production, than on power plants that run on hydro, solar, or wind. This is not conclusive, since there could be various reasons for the higher energy output from fossil fuels, such as the power plants for those renewable energies not being optimized for their type of energy extraction, or even weather getting in the way. The main thing to note is that 3 out of the 5 fossil fuels have a higher energy output than the power plants running on renewable energy other than nuclear. The CO2 emissions chart highlights the importance of finding ways to reduce our carbon emissions, because the amount of fuel required to run those power plants leads to a high amount of CO2 being emitted every year. The Paris Agreement targets the entire country's CO2 emissions, with power plants being only a part of them. Reducing our carbon emissions is a joint effort, but this report shows that even though our reliance on fossil fuels is lower than our reliance on renewable energy, fossil fuels are still responsible for almost 4 megatonnes of CO2 emissions every year.