DS Labs HW

Author

Iris Wu

Load libraries and packages

library(tidyverse)
library("dslabs")
library(extrafont)
data(package = "dslabs")
list.files(system.file("script", package = "dslabs"))
 [1] "make-admissions.R"                   
 [2] "make-brca.R"                         
 [3] "make-brexit_polls.R"                 
 [4] "make-calificaciones.R"               
 [5] "make-death_prob.R"                   
 [6] "make-divorce_margarine.R"            
 [7] "make-gapminder-rdas.R"               
 [8] "make-greenhouse_gases.R"             
 [9] "make-historic_co2.R"                 
[10] "make-mice_weights.R"                 
[11] "make-mnist_127.R"                    
[12] "make-mnist_27.R"                     
[13] "make-movielens.R"                    
[14] "make-murders-rda.R"                  
[15] "make-na_example-rda.R"               
[16] "make-nyc_regents_scores.R"           
[17] "make-olive.R"                        
[18] "make-outlier_example.R"              
[19] "make-polls_2008.R"                   
[20] "make-polls_us_election_2016.R"       
[21] "make-pr_death_counts.R"              
[22] "make-reported_heights-rda.R"         
[23] "make-research_funding_rates.R"       
[24] "make-results_us_election_2012.R"     
[25] "make-stars.R"                        
[26] "make-temp_carbon.R"                  
[27] "make-tissue-gene-expression.R"       
[28] "make-trump_tweets.R"                 
[29] "make-weekly_us_contagious_diseases.R"
[30] "save-gapminder-example-csv.R"        

Load the dataset

data("greenhouse_gases")

Clean the dataset

#pivot wider so the data is easier to read 
gases1 <- greenhouse_gases |>
  pivot_wider(names_from = gas, values_from = concentration)
# The dslabs documentation for this dataset
# (https://www.rdocumentation.org/packages/dslabs/versions/0.8.0/topics/greenhouse_gases)
# gives the units: CO2 is in parts per million (ppm), while N2O and CH4 are in
# parts per billion (ppb). Convert CO2 to ppb by multiplying by 1,000 so that
# all three gases share one unit.
gases2 <- gases1 |>
  mutate(CO2_ppb = CO2 *1000) |>
  #delete redundant CO2 column 
  select(-CO2)
#filter for years from 1500 to 2000 
gases3 <- gases2 |>
  filter(year >= 1500 & year <= 2000) 
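# pivot longer so the data is easier to plot; select the gas columns by
# excluding year rather than by position
gases4 <- gases3 |>
  pivot_longer(cols = -year, names_to = "gas", values_to = "concentration")
# rename CO2_ppb back to CO2 for the legend (values remain in ppb)
gases4$gas <- gsub("CO2_ppb", "CO2", gases4$gas)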
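
As a cross-check on the steps above, the same table can be sketched without the wide-to-long round trip by converting CO2 in place while the data is still in long form. This is an alternative sketch, not the approach used above. The name `gases_alt` is introduced here for illustration, and the code assumes the `gas` column holds the labels "CO2", "CH4", and "N2O" as described in the dslabs documentation.

```r
library(dplyr)
library(dslabs)

data("greenhouse_gases")

# Assumption: CO2 rows are in ppm and the other gases are already in ppb,
# as described in the dslabs documentation for greenhouse_gases.
gases_alt <- greenhouse_gases |>
  filter(year >= 1500, year <= 2000) |>
  mutate(concentration = if_else(gas == "CO2",
                                 concentration * 1000,  # ppm -> ppb
                                 concentration))
```

The result should hold the same year, gas, and concentration values as gases4, though possibly in a different row order, so either table can be passed to ggplot().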

Data Visualization

Please grade the graph below.

ggplot(gases4, aes(x = year, y = concentration, color = gas)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(linewidth = .8) +
  geom_point(size = 2.5) +
  theme_gray(base_family = "serif") +
  geom_vline(xintercept = 1760, col = "black", linetype = "dotdash", alpha = .6) +
  # annotate() draws the label once; geom_text() with constant aes() values
  # would redraw it for every row of the data
  annotate("text", x = 1725, y = 35000,
           label = "Start of \n Industrial \n Revolution",
           color = "black", size = 3.5) +
  scale_y_log10("Parts per billion, log scale", limits = c(100, 500000)) +
  scale_x_continuous("Year", limits = c(1500, 2000)) +
  ggtitle("Concentrations of Greenhouse Gases, 1500-2000") +
  # ColorBrewer palettes are applied with scale_color_brewer()
  scale_color_brewer(name = "Gas", palette = "Set2") +
  theme(plot.title = element_text(hjust = .5, face = "bold"))

Description of graph

I used the greenhouse_gases dataset from the dslabs package to create a line graph. The graph shows the concentrations of carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O) from 1500 to 2000. In the dataset, CO2 is measured in parts per million, while CH4 and N2O are measured in parts per billion. For consistency, I converted CO2 concentrations to parts per billion. Since CO2 concentrations are then orders of magnitude higher than those of the other two gases, I used a logarithmic scale for the y-axis so that data points for all three gases would appear on the graph.

I focused on the period from 1500 to 2000 because it encompasses the development of the modern age. The rapid progression of science and technology in this era brought irreparable changes to the natural environment. On my graph, I marked the start of the Industrial Revolution in 1760. I wanted to see how concentrations of greenhouse gases changed from the start of the modern era to the Industrial Revolution and beyond. Because of the log scale, the changes in concentration for each gas do not look dramatic. However, it is clear that concentrations of all three gases increased in the decades after the Industrial Revolution began.
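
Because the log scale compresses the visible change, the increase described above can also be put into numbers. The following is a minimal sketch that computes the percent change in each gas between 1760 and 2000. The names `change_since_1760`, `conc_1760`, `conc_2000`, and `pct_change` are introduced here for illustration, and the code assumes the dataset contains a measurement for each gas in both years.

```r
library(dplyr)
library(dslabs)

data("greenhouse_gases")

# Percent change is unit-free, so the original ppm/ppb values can be used as-is.
# Assumption: the dataset has one row per gas for both 1760 and 2000.
change_since_1760 <- greenhouse_gases |>
  group_by(gas) |>
  summarise(
    conc_1760 = concentration[year == 1760],
    conc_2000 = concentration[year == 2000],
    pct_change = 100 * (conc_2000 - conc_1760) / conc_1760
  )

print(change_since_1760)
```

A positive pct_change for every gas would be consistent with the upward trend visible in the graph.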

Just for fun: plotting each gas individually

With each gas in its own panel and on its own y-axis scale, the changes in concentration before and after the start of the Industrial Revolution are more apparent.

ggplot(gases4, aes(x = year, y = concentration, color = gas)) +
  facet_wrap(~gas, scales = "free_y") +
  geom_line() +
  geom_point() +
  geom_vline(xintercept = 1760, col = "black", linetype = "dotdash") +
  scale_color_brewer(palette = "Set2") +
  # gases4 stores CO2 in ppb (converted during cleaning), so every panel
  # uses the same unit
  labs(
    title = "Concentrations of Greenhouse Gases, 1500-2000",
    subtitle = "Dot-dash line marks the start of the Industrial Revolution, 1760",
    x = "Year",
    y = "Concentration (ppb)",
    caption = "All concentrations in parts per billion; CO2 converted from parts per million"
  ) +
  theme_grey(base_family = "serif") +
  theme(
    plot.title = element_text(hjust = .6, face = "bold"),
    plot.subtitle = element_text(hjust = .6, face = "italic")
  )