dslabs Assignment

Author

Paul D-O

Load the tidyverse

library(tidyverse)

Load dslabs: The dslabs package contains a plethora of datasets. For this assignment I will load the temp_carbon dataset: global temperature anomaly and carbon emissions, 1751-2018.

library("dslabs")
data("temp_carbon") # load the temp_carbon dataset into the global environment
head(temp_carbon)   # displays the first six rows to understand its structure.
  year temp_anomaly land_anomaly ocean_anomaly carbon_emissions
1 1880        -0.11        -0.48         -0.01              236
2 1881        -0.08        -0.40          0.01              243
3 1882        -0.10        -0.48          0.00              256
4 1883        -0.18        -0.66         -0.04              272
5 1884        -0.26        -0.69         -0.14              275
6 1885        -0.25        -0.56         -0.17              277

Summary statistics of the dataset

summary(temp_carbon)
      year       temp_anomaly     land_anomaly      ocean_anomaly     
 Min.   :1751   Min.   :-0.450   Min.   :-0.69000   Min.   :-0.46000  
 1st Qu.:1818   1st Qu.:-0.180   1st Qu.:-0.31500   1st Qu.:-0.17000  
 Median :1884   Median :-0.030   Median :-0.05000   Median :-0.01000  
 Mean   :1884   Mean   : 0.060   Mean   : 0.07086   Mean   : 0.05273  
 3rd Qu.:1951   3rd Qu.: 0.275   3rd Qu.: 0.30500   3rd Qu.: 0.25500  
 Max.   :2018   Max.   : 0.980   Max.   : 1.50000   Max.   : 0.79000  
                NA's   :129      NA's   :129        NA's   :129       
 carbon_emissions 
 Min.   :   3.00  
 1st Qu.:  13.75  
 Median : 264.00  
 Mean   :1522.98  
 3rd Qu.:1431.50  
 Max.   :9855.00  
 NA's   :4        
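
The NA's rows in the summary can also be obtained directly as per-column counts. The sketch below is a minimal illustration on a made-up data frame (toy and its values are assumptions, not taken from temp_carbon), so it runs on its own.

```r
# Toy data frame with made-up values standing in for temp_carbon
toy <- data.frame(
  year             = 2000:2003,
  temp_anomaly     = c(NA, NA, 0.5, 0.6),
  carbon_emissions = c(10, 11, 12, NA)
)

# is.na() marks each missing cell; colSums() adds the marks up per column
na_counts <- colSums(is.na(toy))
na_counts
```

Replacing toy with temp_carbon should give the same counts that summary() reports.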

Filter out rows with missing values using the dplyr function filter()

# Keep only rows with no missing values in the relevant columns
temp_carbon_filter <- temp_carbon |>
  filter(!is.na(temp_anomaly) & !is.na(carbon_emissions) &
           !is.na(ocean_anomaly) & !is.na(land_anomaly) & !is.na(year))
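
The same filtering could likely be written more compactly with tidyr's drop_na(). The sketch below compares the two approaches on a small made-up tibble (toy and its values are assumptions, used only so the example is self-contained); it is an alternative, not the method used in this assignment.

```r
library(tidyverse)

# Toy data (made-up values) standing in for temp_carbon
toy <- tibble(
  year             = c(1879, 1880, 1881, 2015),
  temp_anomaly     = c(NA, -0.11, -0.08, 0.93),
  carbon_emissions = c(210, 236, 243, NA)
)

# Explicit version: keep rows where every listed column is non-missing
by_filter <- toy |>
  filter(!is.na(temp_anomaly) & !is.na(carbon_emissions))

# Shorter alternative: drop_na() removes rows with an NA in the named columns
by_drop_na <- toy |>
  drop_na(temp_anomaly, carbon_emissions)

# Both approaches should keep the same years
by_filter$year
by_drop_na$year
```

Calling drop_na() with no column names checks every column, which would match the filter above since all five columns are tested there.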

Reshape the dataset to long format for easier plotting

temp_carbon_long <- temp_carbon_filter |>
  pivot_longer(cols = c(temp_anomaly,carbon_emissions,ocean_anomaly,land_anomaly),
               names_to = "anomaly_type",
               values_to = "values")
temp_carbon_long
# A tibble: 540 × 3
    year anomaly_type     values
   <dbl> <chr>             <dbl>
 1  1880 temp_anomaly      -0.11
 2  1880 carbon_emissions 236   
 3  1880 ocean_anomaly     -0.01
 4  1880 land_anomaly      -0.48
 5  1881 temp_anomaly      -0.08
 6  1881 carbon_emissions 243   
 7  1881 ocean_anomaly      0.01
 8  1881 land_anomaly      -0.4 
 9  1882 temp_anomaly      -0.1 
10  1882 carbon_emissions 256   
# ℹ 530 more rows
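
To make the reshaping step concrete, here is a small sketch using only the 1880 and 1881 rows shown in the head(temp_carbon) output. The object names wide and long are hypothetical. Each wide row becomes one long row per pivoted column, which is consistent with the 540 rows above: 135 complete years times 4 variables.

```r
library(tidyverse)

# Two rows copied from the head(temp_carbon) output
wide <- tibble(
  year             = c(1880, 1881),
  temp_anomaly     = c(-0.11, -0.08),
  land_anomaly     = c(-0.48, -0.40),
  ocean_anomaly    = c(-0.01, 0.01),
  carbon_emissions = c(236, 243)
)

# Pivot every column except year into name/value pairs
long <- wide |>
  pivot_longer(cols = -year,
               names_to = "anomaly_type",
               values_to = "values")

# Each wide row yields one long row per pivoted column (2 rows x 4 columns)
nrow(long)
long
```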

Create a faceted line plot

ggplot(temp_carbon_long, aes(x = year, y = values, color = anomaly_type)) + 
  geom_line(linewidth = 1) +  # Add a line for each variable (linewidth replaces size for lines in ggplot2 >= 3.4.0)
  facet_wrap(~ anomaly_type, scales = "free_y") +  # One panel per variable, each with its own y-axis range
  labs(title = "Global temperature anomaly and\ncarbon emissions, 1880-2014",  # Plot title; years match the filtered data
       x = "Year",  # X-axis label
       y = "Value",  # Y-axis label
       color = "anomaly_type") +  # Legend title
  theme_minimal(base_size = 15) +  # Minimal theme with a larger base font size
  theme(legend.position = "right",  # Position the legend on the right
        panel.grid.major = element_line(colour = "grey80"),  # Light grey major grid lines
        panel.grid.minor = element_blank())  # Remove minor grid lines

Summary

The plot was created from a dataset in the dslabs package, which offers a variety of datasets for analysis. The dataset chosen for this project, temp_carbon, contains global temperature anomalies and carbon emissions. Temperature anomalies span 1880 to 2018 and carbon emissions span 1751 to 2014; after rows with missing values are removed, the plotted data cover 1880 to 2014, the years for which every variable is available. The temperature anomalies are deviations from the 20th-century average and are given for the globe as a whole and separately for land and ocean.
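
The year ranges quoted above can be checked by finding the first and last year with a non-missing value for each variable. The sketch below assumes a hypothetical helper, year_coverage(), which is not part of dslabs, and demonstrates it on made-up toy data so that it runs on its own.

```r
library(tidyverse)

# Hypothetical helper (not part of dslabs): first and last year with a
# non-missing value in each measurement column of a data frame with a year column
year_coverage <- function(df) {
  df |>
    pivot_longer(cols = -year, names_to = "variable", values_to = "value") |>
    filter(!is.na(value)) |>
    group_by(variable) |>
    summarise(first_year = min(year), last_year = max(year), .groups = "drop")
}

# Toy data with made-up values; replace with temp_carbon to check the real ranges
toy <- tibble(
  year             = 2000:2004,
  temp_anomaly     = c(NA, NA, 0.5, 0.6, 0.7),
  carbon_emissions = c(10, 11, 12, NA, NA)
)

coverage <- year_coverage(toy)
coverage
```

Running year_coverage(temp_carbon) after loading dslabs should reproduce the ranges stated in the paragraph above.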

To visualize this dataset, a faceted line plot was generated using ggplot2. The data was reshaped to long format using pivot_longer(), so that all variables could be plotted and faceted. Aesthetic mappings were applied, with year on the x-axis, values on the y-axis, and anomaly_type represented by color. The facet_wrap() function separated each variable into its own panel, and free y-axis scales let each panel use its own range, which matters because carbon emissions are on a much larger scale than the temperature anomalies. Every code chunk was accompanied by explanations outlining the steps for filtering, reshaping, plotting, and customizing the visualization.
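
The effect of the y-axis scaling choice can be seen on a very small example. The sketch below uses the 1880-1882 values for two of the variables, copied from the temp_carbon_long output; the object names toy_long, p_fixed, and p_free are hypothetical.

```r
library(tidyverse)

# Values for 1880-1882 copied from the temp_carbon_long output
toy_long <- tibble(
  year         = rep(c(1880, 1881, 1882), each = 2),
  anomaly_type = rep(c("temp_anomaly", "carbon_emissions"), times = 3),
  values       = c(-0.11, 236, -0.08, 243, -0.10, 256)
)

# Fixed scales (the default): both panels share one y-axis, so the
# temperature line is flattened near zero next to emissions in the hundreds
p_fixed <- ggplot(toy_long, aes(x = year, y = values)) +
  geom_line() +
  facet_wrap(~ anomaly_type)

# Free y scales: each panel gets a y-axis fitted to its own values
p_free <- ggplot(toy_long, aes(x = year, y = values)) +
  geom_line() +
  facet_wrap(~ anomaly_type, scales = "free_y")

# Range of values per variable, showing why one shared axis hides the anomalies
value_ranges <- toy_long |>
  group_by(anomaly_type) |>
  summarise(min = min(values), max = max(values))
value_ranges
```

Printing p_fixed and p_free side by side shows the difference: only the free-scale version makes the year-to-year changes in temp_anomaly visible.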