DSLab Assignment
Load the tidyverse
library(tidyverse)
Load dslabs: The dslabs package contains a plethora of datasets. For this assignment, I will load the temp_carbon dataset: global temperature anomaly and carbon emissions, 1751-2018.
library("dslabs")
data("temp_carbon") # load the temp_carbon dataset into the global environment
head(temp_carbon) # display the first six rows to understand its structure
  year temp_anomaly land_anomaly ocean_anomaly carbon_emissions
1 1880        -0.11        -0.48         -0.01              236
2 1881        -0.08        -0.40          0.01              243
3 1882        -0.10        -0.48          0.00              256
4 1883        -0.18        -0.66         -0.04              272
5 1884        -0.26        -0.69         -0.14              275
6 1885        -0.25        -0.56         -0.17              277
Summary statistics of the above dataset
summary(temp_carbon)
      year       temp_anomaly     land_anomaly       ocean_anomaly
 Min.   :1751   Min.   :-0.450   Min.   :-0.69000   Min.   :-0.46000
 1st Qu.:1818   1st Qu.:-0.180   1st Qu.:-0.31500   1st Qu.:-0.17000
 Median :1884   Median :-0.030   Median :-0.05000   Median :-0.01000
 Mean   :1884   Mean   : 0.060   Mean   : 0.07086   Mean   : 0.05273
 3rd Qu.:1951   3rd Qu.: 0.275   3rd Qu.: 0.30500   3rd Qu.: 0.25500
 Max.   :2018   Max.   : 0.980   Max.   : 1.50000   Max.   : 0.79000
                NA's   :129      NA's   :129        NA's   :129
 carbon_emissions
 Min.   :   3.00
 1st Qu.:  13.75
 Median : 264.00
 Mean   :1522.98
 3rd Qu.:1431.50
 Max.   :9855.00
 NA's   :4
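Before filtering, it can help to confirm the missing-value counts directly rather than reading them off the summary. The following is a minimal sketch in base R, assuming the dslabs package is installed; the names na_counts and n_complete are illustrative and not part of the original analysis.

```r
library(dslabs)

data("temp_carbon")

# Count missing values in each column; these should agree with the
# NA's row printed by summary(temp_carbon)
na_counts <- colSums(is.na(temp_carbon))
na_counts

# Count the rows that have no missing value in any column; these are
# the rows that survive a filter on all five columns
n_complete <- sum(complete.cases(temp_carbon))
n_complete
```

The count of complete rows gives the number of years that remain once every row with a missing value is removed.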
Filter out missing values using the dplyr function filter()
temp_carbon_filter <- temp_carbon |>
  filter(!is.na(temp_anomaly) & !is.na(carbon_emissions) & !is.na(ocean_anomaly) & !is.na(land_anomaly) & !is.na(year)) # Keep only rows with no missing values in the relevant columns
Reshape the dataset to long format for easier plotting
temp_carbon_long <- temp_carbon_filter |>
  pivot_longer(cols = c(temp_anomaly, carbon_emissions, ocean_anomaly, land_anomaly),
               names_to = "anomaly_type",
               values_to = "values")
temp_carbon_long
# A tibble: 540 × 3
year anomaly_type values
<dbl> <chr> <dbl>
1 1880 temp_anomaly -0.11
2 1880 carbon_emissions 236
3 1880 ocean_anomaly -0.01
4 1880 land_anomaly -0.48
5 1881 temp_anomaly -0.08
6 1881 carbon_emissions 243
7 1881 ocean_anomaly 0.01
8 1881 land_anomaly -0.4
9 1882 temp_anomaly -0.1
10 1882 carbon_emissions 256
# ℹ 530 more rows
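A quick sanity check on the reshaped table is that each of the four variables contributes one row per retained year. The sketch below rebuilds the long table from scratch so that it runs on its own. It uses tidyr's drop_na() as a shorthand for the explicit is.na() filter, which is a substitution made for this sketch rather than the approach used above, and rows_per_type is an illustrative name.

```r
library(dplyr)
library(tidyr)
library(dslabs)

data("temp_carbon")

# Rebuild the long table; drop_na() with no arguments removes every row
# that has a missing value in any column (a shorthand assumed for this sketch)
temp_carbon_long <- temp_carbon |>
  drop_na() |>
  pivot_longer(cols = -year,
               names_to = "anomaly_type",
               values_to = "values")

# Count rows per variable; every variable should have the same count
rows_per_type <- count(temp_carbon_long, anomaly_type)
rows_per_type
```

If the counts differ between variables, some rows were dropped or duplicated during reshaping and the facets would not cover the same years.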
Create a faceted line plot
ggplot(temp_carbon_long, aes(x = year, y = values, color = anomaly_type)) +
  geom_line(linewidth = 1) + # Add a line for each variable (linewidth replaces size for lines in ggplot2 >= 3.4.0)
  facet_wrap(~ anomaly_type, scales = "free_y") + # Create a separate panel for each variable, each with its own y-axis
  labs(title = "Global temperature anomaly and\n carbon emissions, 1751-2018.", # Title for the plot
       x = "Year", # X-axis label
       y = "Value", # Y-axis label
       color = "anomaly_type") + # Legend title
  theme_minimal(base_size = 15) + # Change the theme to minimal with larger base font size
  theme(legend.position = "right", # Position the legend on the right
        panel.grid.major = element_line(colour = "grey80"), # Customize grid lines
        panel.grid.minor = element_blank()) # Remove minor grid lines
Summary
The plot was created with data from the dslabs package, which offers a variety of datasets for analysis. The dataset chosen for this project, temp_carbon, contains global temperature anomalies and carbon emissions. Temperature anomalies span 1880 to 2018, and carbon emissions span 1751 to 2014. The temperature anomalies are deviations from the 20th-century average and are given for the globe overall as well as separately for land and ocean.
To visualize this dataset, a faceted line plot was generated using ggplot2. Rows with missing values were removed first, so the plotted series cover 1880 to 2014, the years in which all four variables are available. The data was then reshaped to long format using pivot_longer(), so that all variables could be plotted and faceted. Aesthetic mappings were applied, with year on the x-axis, values on the y-axis, and anomaly_type represented by color. The facet_wrap() function separated each variable into its own panel, allowing independent y-axis scaling for clearer comparisons. Every code chunk was accompanied by detailed explanations, outlining the steps for filtering, reshaping, plotting, and customizing the visualization.
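The reason independent y-axes matter here can be checked by comparing the value ranges of the four variables. The sketch below is self-contained and again uses drop_na() as a shorthand for the explicit filter, which is an assumption of this example. The names value_ranges, min_value, and max_value are illustrative.

```r
library(dplyr)
library(tidyr)
library(dslabs)

data("temp_carbon")

# Compute the smallest and largest value of each variable after removing
# rows with missing values and reshaping to long format
value_ranges <- temp_carbon |>
  drop_na() |>
  pivot_longer(cols = -year,
               names_to = "anomaly_type",
               values_to = "values") |>
  group_by(anomaly_type) |>
  summarise(min_value = min(values),
            max_value = max(values))
value_ranges
```

Carbon emissions run into the thousands while the anomalies stay within a couple of units of zero, so on a shared y-axis the three anomaly series would be flattened into nearly straight lines.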