---
title: "LAB 2 - Climate Change Analysis and Insights"
author: "Jijo George"
date: "`r Sys.Date()`"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: fill
    social: menu
    source: embed
  html_document:
    df_print: paged
  pdf_document: default
---
```{r setup, include=FALSE}
# Set default repository for the session
options(repos = c(CRAN = "https://cloud.r-project.org/"))
# Install required packages
if (!require("dygraphs")) install.packages("dygraphs", quiet = TRUE)
if (!require("xts")) install.packages("xts", quiet = TRUE)
if (!require("dplyr")) install.packages("dplyr", quiet = TRUE)
if (!require("readr")) install.packages("readr", quiet = TRUE)
if (!require("plotly")) install.packages("plotly", quiet = TRUE)
if (!require("tidyverse")) install.packages("tidyverse", quiet = TRUE)
# Load required libraries
library(ggplot2)
library(dplyr)
library(readr)
library(dygraphs)
library(xts)
library(plotly)
library(tidyverse)
```
# Table of Contents {.sidebar}
* [Overview](#overview)
* [Introduction](#introduction)
* Global Temperature and CO2 Analysis
    - [Global Temperature Trends](#global-temperature-trends)
    - [Seasonal Variations in Temperature](#seasonal-variations-in-temperature)
    - [Regional Temperature Anomalies](#regional-temperature-anomalies)
    - [Correlation Between Temperature and CO2](#temperature-and-co2)
* [Conclusion & Insights](#conclusions-and-implications)
# **Introduction**
Row {data-height=230}
-------------------------------------
### **Overview**
Climate change and climate science have been subjects of discussion and debate for at least the last decade. Climate data are currently being collected for areas all over the world. Policy decisions are based on the most recent analyses of data extracted from huge online repositories. Because the electronic production and storage of information keeps growing, quantitative decision making often brings a feeling of "information overload" or inundation. As an analyst, your job will often be to explore large data sets and develop questions or ideas from visualizations of those data sets.
The ability to synthesize large data sets using visualizations is a skill that all data scientists should have. In addition, data scientists are called upon to present data syntheses and develop questions or ideas based on their data exploration. This lab takes you through the major steps in data exploration and presentation.
Row
-------------------------------------
### **Objective**
The objective of this laboratory is to survey the available data and then plan, design, and create an information dashboard/presentation that not only explores the data but also helps you develop questions based on that exploration. To accomplish this task, you will complete the following steps:
1. Identify what information interests you about climate change.
2. Find, collect, organize, and summarize the data necessary to create your data exploration plan.
3. Design and create the most appropriate visualizations (at least five) to explore the data and present that information.
4. Finally, organize the layout of those visualizations into a dashboard (use the flexdashboard package) in a way that shows your path of data exploration.
5. Develop four questions or ideas about climate change from your visualizations.
### **Four Questions to answer through this analysis**
1. How have global average temperatures changed over the past century?
2. What are the seasonal variations in global temperature trends over the last 50 years?
3. Is there a measurable correlation between global temperature changes and CO2 emissions over time?
4. How do temperature anomalies vary across different regions of the world?
# **Global Temperature Trends**
Row {data-height=150}
------------------------
### **Trend of the Global Temperatures**
The dataset, obtained from the **Goddard Institute for Space Studies (GISS)**, provides global temperature anomalies from 1880 to 2022. Each anomaly is the deviation from a baseline average temperature, which makes long-term climate trends easier to compare.
The visualizations reveal a steady increase in global temperatures, with a sharp rise starting around 1970. This period marks a significant acceleration in warming, likely driven by industrialization and increased greenhouse gas emissions. A slight dip around 2020 is also visible; it may be related to reduced human activity during the COVID-19 pandemic lockdowns, although natural year-to-year variability is an equally plausible explanation. Either way, the dip is a short-term fluctuation and does not alter the long-term warming trend.
The data highlight the importance of understanding global temperature changes and their implications for climate policy, environmental management, and sustainability efforts.
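As a minimal sketch of how an anomaly is derived, the example below subtracts a baseline mean from a made-up series of absolute temperatures. The `temps` values are hypothetical, and the 1951-1980 reference period is an assumption based on how GISTEMP describes its baseline; check the documentation of your download for the period that actually applies.

```r
# Hypothetical absolute annual mean temperatures (°C); illustrative values only.
temps <- data.frame(
  Year = 1951:1990,
  Temp = 14 + 0.01 * (1951:1990 - 1951)
)

# Baseline: mean over the reference period (1951-1980 is assumed here).
baseline_years <- 1951:1980
baseline <- mean(temps$Temp[temps$Year %in% baseline_years])

# Anomaly = observed value minus the baseline mean.
temps$Anomaly <- temps$Temp - baseline

# By construction, anomalies average to zero over the baseline period.
mean(temps$Anomaly[temps$Year %in% baseline_years])
tail(temps, 3)
```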
Row {data-height=450}
------------------------
### **Global Temperature Change in Celsius (1880 - 2022)**
```{r}
# Load the temperature data; GISS marks missing values with "***", so treat those as NA
temp_data <- read_csv("C:\\Harrisburg_University\\Semester 3_ANLY 512-90-Data Visualization\\Lab 2\\data\\GLB.Ts+dSST.csv", na = c("", "NA", "***"))
# Select relevant columns and clean the data
temp_data <- temp_data %>%
select(Year, `J-D`) %>%
rename(Annual_Mean = `J-D`) %>%
filter(!is.na(Annual_Mean))
# Convert the data to an xts object
temp_xts <- xts(temp_data$Annual_Mean, order.by = as.Date(paste0(temp_data$Year, "-01-01")))
# Create interactive dygraph for global temperature anomalies
dygraph(temp_xts, main = "Interactive Visualization: Global Temperature Anomalies (1880–2022)") %>%
dyAxis("y", label = "Temperature Anomaly (°C)", valueRange = c(-1, 2)) %>%
dyAxis("x", label = "Year") %>%
dyOptions(colors = "darkblue", strokeWidth = 2, drawGrid = TRUE) %>%
dyRangeSelector(height = 30, strokeColor = "gray") %>%
dyHighlight(highlightCircleSize = 6, highlightSeriesBackgroundAlpha = 0.3, hideOnMouseOut = FALSE) %>%
dyLegend(show = "follow", width = 300)
```
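One way to put a number on the sharp rise visible in the chart is to compare least-squares slopes before and after 1970. The sketch below is illustrative only: `trend_per_decade()` is a hypothetical helper, and `demo` is synthetic data shaped like `temp_data` (columns `Year` and `Annual_Mean`), not the GISS series. Passing the real `temp_data` to the helper should work the same way.

```r
# Slope of a least-squares line over [from, to], expressed in °C per decade.
trend_per_decade <- function(df, from, to) {
  window <- df[df$Year >= from & df$Year <= to, ]
  fit <- lm(Annual_Mean ~ Year, data = window)
  unname(coef(fit)["Year"]) * 10
}

# Synthetic stand-in for temp_data: flat until 1970, then +0.02 °C per year.
demo <- data.frame(Year = 1880:2022)
demo$Annual_Mean <- ifelse(demo$Year < 1970, 0, 0.02 * (demo$Year - 1970))

early <- trend_per_decade(demo, 1880, 1969)
late  <- trend_per_decade(demo, 1970, 2022)
c(early = early, late = late)
```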
# **Seasonal Variations in Temperature**
Row {data-height=300}
------------------------
### **Seasonal Temperature Trends**
The visualizations illustrate seasonal temperature variations and trends from 1975 to the present. The data is categorized into four seasons:
- **DJF** (December, January, February)
- **MAM** (March, April, May)
- **JJA** (June, July, August)
- **SON** (September, October, November)
The graphs provide insights into how temperature anomalies have evolved over time for each season and how they vary across decades.
| **Observation** | **Description** | **Potential Implications** |
|:---------------------------|:-------------------------------------------------------------|:--------------------------------------------------------|
| **Global Warming Evidence** | Both graphs provide compelling evidence of global warming, with a consistent upward trend in temperature anomalies across all seasons and decades. | Rising sea levels, increased droughts, and ecosystem shifts. |
| **Seasonal Impacts** | The warming is not uniform, with Summer (**JJA**) and Autumn (**SON**) experiencing slightly higher anomalies. | Potential disruptions in agriculture, water resources, and increased heat-related illnesses. |
| **Increased Variability** | The wider range of anomalies in recent decades indicates greater climate variability. | More frequent and severe weather events, impacting infrastructure and public safety. |
Row {.tabset .tabset-fade data-width=550}
------------------------
### **Seasonal Temperature Variations (1975–Present)**
```{r}
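# Load the temperature data; reading "***" as NA keeps the seasonal columns numeric for pivot_longer()
temp_data_seasonal <- read_csv("C:\\Harrisburg_University\\Semester 3_ANLY 512-90-Data Visualization\\Lab 2\\data\\GLB.Ts+dSST.csv", na = c("", "NA", "***"))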
# Filter data
temp_data_seasonal <- temp_data_seasonal %>%
filter(Year >= 1975) %>%
select(Year, DJF, MAM, JJA, SON) %>%
pivot_longer(cols = c(DJF, MAM, JJA, SON), names_to = "Season", values_to = "Anomaly")
# Create interactive line plot for seasonal temperature anomalies
line_plot <- plot_ly(temp_data_seasonal, x = ~Year, y = ~Anomaly, color = ~Season, type = 'scatter', mode = 'lines+markers') %>%
layout(
title = "Seasonal Temperature Variations (1975–Present)",
xaxis = list(title = "Year"),
yaxis = list(title = "Temperature Anomaly (°C)"),
legend = list(title = list(text = "Season"))
)
line_plot
```
### Seasonal Temperature Variations Across Decades {data-width=900}
```{r}
temp_data_seasonal <- temp_data_seasonal %>%
mutate(Decade = floor(Year / 10) * 10) %>%
filter(!is.na(Anomaly)) %>%
mutate(Anomaly = as.numeric(Anomaly))
trend_data <- temp_data_seasonal %>%
group_by(Season, Decade) %>%
summarize(Mean_Anomaly = mean(Anomaly, na.rm = TRUE), .groups = "drop")
# Create an interactive scatter plot with trend lines
scatter_plot <- plot_ly(temp_data_seasonal,
x = ~as.factor(Decade),
y = ~Anomaly,
color = ~Season,
type = 'scatter',
mode = 'markers') %>%
add_trace(data = trend_data,
x = ~as.factor(Decade),
y = ~Mean_Anomaly,
color = ~Season,
type = 'scatter',
mode = 'lines',
line = list(dash = 'solid')) %>%
layout(
title = "Seasonal Temperature Variations Across Decades",
xaxis = list(title = "Decade"),
yaxis = list(title = "Temperature Anomaly (°C)"),
legend = list(title = list(text = "Season"))
)
scatter_plot
```
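The claim that the range of anomalies widens in recent decades can be checked numerically as well as visually. The following is a rough sketch on synthetic data shaped like `temp_data_seasonal` (columns `Year`, `Season`, `Anomaly`); the `demo` and `spread` names are hypothetical. Note that the spread within a decade also contains the warming trend inside that decade, so detrending first may give a fairer comparison.

```r
# Synthetic stand-in for temp_data_seasonal; not real measurements.
set.seed(1)
demo <- expand.grid(Year = 1980:2019, Season = c("DJF", "MAM", "JJA", "SON"))
demo$Anomaly <- 0.02 * (demo$Year - 1980) + rnorm(nrow(demo), sd = 0.1)
demo$Decade <- floor(demo$Year / 10) * 10

# Spread of anomalies within each decade: standard deviation and max-min range.
sd_by_decade <- tapply(demo$Anomaly, demo$Decade, sd)
range_by_decade <- tapply(demo$Anomaly, demo$Decade, function(x) diff(range(x)))

spread <- data.frame(
  Decade = as.integer(names(sd_by_decade)),
  SD     = as.numeric(sd_by_decade),
  Range  = as.numeric(range_by_decade)
)
spread
```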
# **Regional Temperature Anomalies**
Row {data-height=300}
------------------------
### **Understanding Seasonal Temperature Trends**
The visualizations illustrate seasonal temperature variations and trends from 1880 to the present. The data is categorized into four seasons:
- **DJF** (December, January, February)
- **MAM** (March, April, May)
- **JJA** (June, July, August)
- **SON** (September, October, November)
The graphs provide insights into how temperature anomalies have evolved over time for each season and how they vary across regions and decades.
The **Seasonal Temperature Anomalies by Region** graph shows clear warming trends across all regions, with pronounced increases in temperature anomalies from the mid-20th century onward. Winter (**DJF**) and summer (**JJA**) display stronger variations compared to spring (**MAM**) and fall (**SON**). Northern Hemisphere regions demonstrate steeper warming trends than Southern Hemisphere regions. The consistent upward trend highlights rising global temperatures, particularly after 1970.
The **Line Plot for Selected Regions by Season** highlights increasing variability in recent decades. While all seasons show warming trends, winter (**DJF**) and summer (**JJA**) have experienced more pronounced temperature spikes since the 1970s. The increasing spread in data points indicates greater uncertainty and more extreme seasonal temperature patterns.
The **Time Series Decomposition Analysis** graph breaks the data into three components:
- **Trend:** Displays a clear upward pattern in temperature anomalies, reinforcing the evidence of global warming.
- **Seasonal Component:** Shows stable cyclic patterns, indicating consistent seasonal behavior.
- **Residual Component:** Reveals rising variability, especially in recent years, pointing to increasing climate extremes and unpredictable conditions.
These insights highlight the accelerating pace of climate change, especially in winter and summer, with a notable rise in temperature variability. The growing unpredictability in temperature patterns emphasizes the increasing frequency of extreme climate events.
Row {.tabset .tabset-fade data-width=550}
------------------------
### Seasonal Temperature Anomalies by Region {data-width=900}
```{r}
# Load required libraries
library(tidyverse)
library(plotly)
# Load the data
data <- read.csv("C:\\Harrisburg_University\\Semester 3_ANLY 512-90-Data Visualization\\Lab 2\\data\\ZonAnn.Ts+dSST.csv", fileEncoding = "UTF-8")
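# Reshape the zonal bands to long format. Glob, NHem, and SHem are excluded by name;
# selecting on the "X" prefix would skip EQU.24N, which read.csv() leaves without an "X".
data_cleaned <- data %>%
pivot_longer(
cols = -c(Year, Glob, NHem, SHem),
names_to = "Region",
values_to = "Anomaly"
) %>%
mutate(
Year = as.integer(Year),
Anomaly = as.numeric(Anomaly),
Region = sub("^X", "", Region),
# ZonAnn holds annual means, so this label is a four-year cycle taken from
# Year %% 4 rather than a meteorological season
Season = case_when(
Year %% 4 == 0 ~ "Winter",
Year %% 4 == 1 ~ "Spring",
Year %% 4 == 2 ~ "Summer",
Year %% 4 == 3 ~ "Fall"
)
) %>%
drop_na(Anomaly)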
# Facet Grid Plot for Seasonal Anomalies
facet_grid_plot <- ggplot(data_cleaned, aes(x = Year, y = Anomaly, color = Season)) +
geom_line() +
facet_wrap(~ Region, scales = "free_y") +
labs(
title = "Seasonal Temperature Anomalies by Region",
x = "Year",
y = "Temperature Anomaly (°C)",
color = "Season"
) +
theme_minimal()
ggplotly(facet_grid_plot)
```
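`read.csv()` passes column names through `make.names()` by default, which matters for any selection based on a name prefix. The sketch below shows what happens to latitude-band headers; the header strings are an assumption about the standard GISTEMP zonal file and should be checked against your copy.

```r
# Assumed zonal-band headers as they appear in the raw CSV.
zonal_names <- c("24N-90N", "24S-24N", "90S-24S", "64N-90N", "44N-64N",
                 "24N-44N", "EQU-24N", "24S-EQU", "44S-24S", "64S-44S", "90S-64S")

# make.names() prefixes names that start with a digit with "X" and turns "-" into ".".
mangled <- make.names(zonal_names)

# Names that start with a letter (EQU-24N) get no "X" prefix.
name_map <- data.frame(
  original   = zonal_names,
  mangled    = mangled,
  x_prefixed = startsWith(mangled, "X")
)
name_map
```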
### Line plot for selected regions by season {data-width=900}
```{r}
# data_cleaned was built in the previous chunk and is reused here
# Line plot for selected regions by season
line_plot <- plot_ly(
data_cleaned,
x = ~Year,
y = ~Anomaly,
color = ~Season,
type = "scatter",
mode = "lines"
) %>%
layout(
title = "Temperature Anomalies for Selected Regions by Season",
xaxis = list(title = "Year"),
yaxis = list(title = "Temperature Anomaly (°C)"),
legend = list(title = list(text = "Season"))
)
line_plot
```
### Time Series Decomposition: Trend, Seasonality & Residual Analysis {data-width=900}
```{r}
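# Decompose the monthly global series into trend, seasonal, and residual components.
# The zonal file holds one value per year, so the Jan-Dec columns of the global file
# (assumed standard GISTEMP layout) are used; ts() and stl() come from the stats package.
monthly <- read_csv("C:\\Harrisburg_University\\Semester 3_ANLY 512-90-Data Visualization\\Lab 2\\data\\GLB.Ts+dSST.csv", na = c("", "NA", "***")) %>%
select(Year, Jan:Dec) %>%
pivot_longer(-Year, names_to = "Month", values_to = "Anomaly")
# na.omit() trims the trailing months of an incomplete final year
ts_data <- na.omit(ts(monthly$Anomaly, start = c(min(monthly$Year), 1), frequency = 12))
stl_decomp <- stl(ts_data, s.window = "periodic")
plot(stl_decomp)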
```
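To make the three components concrete, here is a small sketch that runs `stl()` on a synthetic monthly series with a known trend and annual cycle. The series and the `ts_demo`, `fit`, and `components` names are illustrative assumptions, not part of the lab data. Because the decomposition is additive, the components should sum back to the input.

```r
# Synthetic monthly series: linear trend + annual cycle + noise (illustrative only).
set.seed(42)
n <- 12 * 30
t_idx <- seq_len(n)
series <- 0.002 * t_idx + 0.3 * sin(2 * pi * t_idx / 12) + rnorm(n, sd = 0.05)
ts_demo <- ts(series, start = c(1990, 1), frequency = 12)

# stl() returns a multivariate series with one column per component.
fit <- stl(ts_demo, s.window = "periodic")
components <- fit$time.series
colnames(components)
components[1:3, ]

# The three components add back up to the original series.
recon_error <- max(abs(rowSums(components) - as.numeric(ts_demo)))
recon_error
```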
# **Temperature and CO2**
Row {data-height=220}
------------------------
### **Correlation Between CO₂ Concentration and Temperature Anomalies**
The visualizations collectively highlight the relationship between rising CO₂ concentrations and global temperature anomalies, as well as regional variations over time.
**Key Observations**:
1. **Correlation**: A strong positive correlation exists between CO₂ levels and temperature anomalies: years with higher CO₂ concentrations tend to have higher temperature anomalies.
2. **Trends Over Time**: CO₂ concentrations have steadily increased from 2015 to 2024, while temperature anomalies show a fluctuating but upward trend.
3. **Regional Variations**: The Northern Hemisphere experiences more pronounced temperature anomalies compared to the Southern Hemisphere, with global anomalies intensifying significantly in recent decades.
Row {.tabset .tabset-fade data-width=550}
------------------------
### Correlation Between CO2 Concentration and Temperature Anomalies {data-width=900}
```{r}
# Load required libraries
library(tidyverse)
library(plotly)
# Load the datasets
co2_data <- read.csv("C:\\Harrisburg_University\\Semester 3_ANLY 512-90-Data Visualization\\Lab 2\\data\\co2_trend_gl.csv")
temp_data <- read_csv("C:\\Harrisburg_University\\Semester 3_ANLY 512-90-Data Visualization\\Lab 2\\data\\ZonAnn.Ts+dSST.csv")
# Data Cleaning and Preparation
co2_data_clean <- co2_data %>%
rename(Year = year, CO2 = trend) %>%
select(Year, CO2) %>%
group_by(Year) %>%
summarize(CO2 = mean(CO2, na.rm = TRUE))
temp_data_clean <- temp_data %>%
rename(Temperature_Anomaly = Glob) %>%
select(Year, Temperature_Anomaly) %>%
mutate(
Year = as.numeric(Year),
Temperature_Anomaly = as.numeric(Temperature_Anomaly)
) %>%
drop_na()
combined_data <- merge(co2_data_clean, temp_data_clean, by = "Year") %>%
drop_na()
# Visualization
# Interactive Scatter Plot: CO2 vs Temperature Anomalies
scatter_plot <- plot_ly(
data = combined_data,
x = ~CO2,
y = ~Temperature_Anomaly,
type = 'scatter',
mode = 'markers',
marker = list(color = 'blue', size = 10, opacity = 0.6)
) %>%
layout(
title = "Correlation Between CO2 Concentration and Temperature Anomalies",
xaxis = list(title = "CO2 Concentration (ppm)"),
yaxis = list(title = "Temperature Anomaly (°C)")
)
scatter_plot
```
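The strength of the relationship in the scatter plot can be quantified with a Pearson correlation. The sketch below uses synthetic values shaped like `combined_data` (columns `Year`, `CO2`, `Temperature_Anomaly`); the numbers are made up for illustration, and running the same call on the real `combined_data` is left as an assumption about how the check would be applied. With only about ten annual points, and both series trending upward, a high coefficient is expected and does not by itself establish causation.

```r
# Synthetic stand-in for combined_data; not real measurements.
set.seed(7)
demo <- data.frame(Year = 2015:2024)
demo$CO2 <- 400 + 2.4 * (demo$Year - 2015)
demo$Temperature_Anomaly <- 0.9 + 0.01 * (demo$CO2 - 400) + rnorm(10, sd = 0.05)

# Pearson correlation with a confidence interval and p-value.
ct <- cor.test(demo$CO2, demo$Temperature_Anomaly, method = "pearson")
ct$estimate   # correlation coefficient r
ct$conf.int   # 95% confidence interval for r
ct$p.value    # p-value for the null hypothesis r = 0
```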
### CO2 Concentration and Temperature Anomalies Over Time {data-width=900}
```{r}
# Interactive Line Plot: CO2 and Temperature Anomalies Over Time
line_plot <- plot_ly() %>%
add_lines(
data = combined_data,
x = ~Year,
y = ~CO2,
name = "CO2 Concentration (ppm)",
line = list(color = 'blue')
) %>%
add_lines(
data = combined_data,
x = ~Year,
y = ~Temperature_Anomaly,
name = "Temperature Anomaly (°C)",
# Plot against the secondary axis instead of rescaling the values by 100
yaxis = "y2",
line = list(color = 'red')
) %>%
layout(
title = "CO2 Concentration and Temperature Anomalies Over Time",
xaxis = list(title = "Year"),
yaxis = list(title = "CO2 Concentration (ppm)"),
yaxis2 = list(
title = "Temperature Anomaly (°C)",
overlaying = "y",
side = "right"
)
)
line_plot
```
### Regional Temperature Anomalies {data-width=900}
```{r}
# Interactive Heatmap: Regional Temperature Anomalies
regional_data_long <- temp_data %>%
select(Year, NHem, SHem, Glob) %>%
pivot_longer(cols = c(NHem, SHem, Glob), names_to = "Region", values_to = "Anomaly") %>%
filter(!is.na(Anomaly))
heatmap_plot <- plot_ly(
data = regional_data_long,
x = ~Year,
y = ~Region,
z = ~Anomaly,
type = "heatmap",
colorscale = "Viridis",
# The colorbar title is a trace attribute, not a layout attribute
colorbar = list(title = "Temperature Anomaly (°C)")
) %>%
layout(
title = "Regional Temperature Anomalies",
xaxis = list(title = "Year"),
yaxis = list(title = "Region")
)
heatmap_plot
```
# **Conclusions and Implications**
---
### **Conclusions**
**Global Warming Evidence**
Both seasonal graphs provide compelling evidence of global warming, with a consistent upward trend in temperature anomalies across all seasons and decades.
- **Implications**: Rising sea levels, increased droughts, and ecosystem shifts.
**Seasonal Impacts**
The warming is not uniform, with **Summer (JJA)** and **Autumn (SON)** experiencing slightly higher anomalies.
- **Implications**: Potential disruptions in agriculture, water resources, and increased heat-related illnesses.
**Increased Variability**
The wider range of anomalies in recent decades indicates greater climate variability.
- **Implications**: More frequent and severe weather events, impacting infrastructure and public safety.
**CO₂ and Temperature Correlation**
The scatter plot demonstrates a strong positive correlation between CO₂ concentration (ppm) and global temperature anomalies (°C).
- **Key Observations**:
    1. The scatter plot suggests a strong linear relationship between CO₂ levels and temperature anomalies over the years the two datasets share.
    2. It visually reinforces the connection between human-induced CO₂ emissions and rising global temperatures.
**Conclusion**: This trend supports the need for policies to reduce CO₂ emissions to mitigate climate change. The findings align with the scientific consensus on the greenhouse effect and its impact on global warming.