ORIGINAL

The original data visualization selected for the assignment is shown below:


Source: National Oceanic and Atmospheric Administration (2022).
(https://www.epa.gov/climate-indicators/climate-change-indicators-us-and-global-temperature)


The image above shows how annual average temperatures in the contiguous 48 states of the United States of America have changed over time. Surface temperatures are derived from land-based weather stations, while satellites are used to measure the lower troposphere, the lowest layer of the Earth’s atmosphere. The “UAH” and “RSS” labels denote two different analyses of the original satellite data.

OBJECTIVE AND AUDIENCE

The objective and audience of the original data visualization chosen can be summarized as follows:

Objective The objective of the visualization is to communicate the long-term trend in United States temperatures and the causes and effects of climate change over time. To give a clear picture of temperature change, the graph covers the period from 1901 to 2020. The visualization communicates the following:

Audience The audience for the visualization is:

CRITIQUE/DECONSTRUCT

The visualization chosen had the following issues:

CODE & DATA

The data provided (NOAA 2022) consists of 121 records and 4 columns. The data is limited by the smaller number of weather stations in the early 20th century, so uncertainty in the surface temperature data increases further back in time. The following code was used to clean the data and fix the issues identified in the dataset.

# Load the 'readxl' library for reading Excel files in R.
library(readxl)

# Load the 'ggplot2' library for creating data visualizations using the Grammar of Graphics framework.
library(ggplot2)

# Load the 'dplyr' library for efficient data manipulation tasks such as filtering, sorting, and summarizing.
library(dplyr)

# Load the 'gridExtra' library for additional functionality in arranging and combining multiple plots into layouts.
library(gridExtra)

# Load the 'reshape2' library for data reshaping and restructuring, particularly useful for transforming data formats.
library(reshape2)


# Read data from CSV file
data <- read.csv("~/LEARNING/R/Temperature_USA.csv")

# Review the data types, first few rows, column names, and data values
glimpse(data)
## Rows: 121
## Columns: 4
## $ Year                                            <int> 1901, 1902, 1903, 1904…
## $ Earth.s.surface..land.and.ocean.                <dbl> -0.270, -0.468, -0.666…
## $ Lower.troposphere..measured.by.satellite...UAH. <chr> "null", "null", "null"…
## $ Lower.troposphere..measured.by.satellite...RSS. <chr> "null", "null", "null"…
# Reviewing the structure of the data
str(data)
## 'data.frame':    121 obs. of  4 variables:
##  $ Year                                           : int  1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 ...
##  $ Earth.s.surface..land.and.ocean.               : num  -0.27 -0.468 -0.666 -0.828 -0.504 -0.378 -0.684 -0.774 -0.792 -0.72 ...
##  $ Lower.troposphere..measured.by.satellite...UAH.: chr  "null" "null" "null" "null" ...
##  $ Lower.troposphere..measured.by.satellite...RSS.: chr  "null" "null" "null" "null" ...
# Clean the data by replacing the "null" with NA in the dataset
data[data == "null"] <- NA

# Convert the measurement columns to numeric (lapply keeps the data frame structure)
data[2:4] <- lapply(data[2:4], as.numeric)

#Confirm the structure of the dataset after cleaning
str(data)
## 'data.frame':    121 obs. of  4 variables:
##  $ Year                                           : int  1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 ...
##  $ Earth.s.surface..land.and.ocean.               : num  -0.27 -0.468 -0.666 -0.828 -0.504 -0.378 -0.684 -0.774 -0.792 -0.72 ...
##  $ Lower.troposphere..measured.by.satellite...UAH.: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Lower.troposphere..measured.by.satellite...RSS.: num  NA NA NA NA NA NA NA NA NA NA ...
#Reviewing post cleaning
glimpse(data) 
## Rows: 121
## Columns: 4
## $ Year                                            <int> 1901, 1902, 1903, 1904…
## $ Earth.s.surface..land.and.ocean.                <dbl> -0.270, -0.468, -0.666…
## $ Lower.troposphere..measured.by.satellite...UAH. <dbl> NA, NA, NA, NA, NA, NA…
## $ Lower.troposphere..measured.by.satellite...RSS. <dbl> NA, NA, NA, NA, NA, NA…
# List the column names in the dataset
colnames(data)
## [1] "Year"                                           
## [2] "Earth.s.surface..land.and.ocean."               
## [3] "Lower.troposphere..measured.by.satellite...UAH."
## [4] "Lower.troposphere..measured.by.satellite...RSS."
# Rename the columns with shorter, easier-to-type names
colnames(data) <- c("Year", "earth_surface", "uah_lower_troposphere", "rss_lower_troposphere")
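
The replace-then-convert cleaning steps above could also be folded into the read step. The following is a minimal sketch, assuming the missing values are always spelled exactly "null"; the inline sample, its values, and the names `csv_text` and `sample_data` are made up for illustration and are not part of the assignment data.

```r
# Made-up sample in the same layout as the source file (illustrative values only)
csv_text <- "Year,Surface,UAH,RSS
1978,0.10,null,null
1979,0.20,0.05,0.07
1980,0.30,0.15,null"

# na.strings tells read.csv to treat the text "null" as NA while parsing,
# so the satellite columns are read as numeric rather than character
sample_data <- read.csv(text = csv_text, na.strings = "null")

# Check the column types and count the missing values per column
str(sample_data)
colSums(is.na(sample_data))
```

With this approach no separate conversion to numeric is needed, because the columns never become character in the first place.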

RECONSTRUCTION

In order to reconstruct the visualization, we carry out the following activities:

The following plots fix the overlapping lines in the original visualization by plotting the Earth’s surface, the UAH lower-troposphere, and the RSS lower-troposphere series separately.

# Create a scatterplot with a linear trend line (geom_smooth with method = "lm")
ggplot(data, aes(x = Year, y = earth_surface)) +
  geom_point() +  # Add points for the scatterplot
  geom_smooth(method = "lm", se = FALSE, color = "blue") +  # Add linear regression line
  labs(title = "Scatterplot with Smoothed Line of Earth Surface Temperatures", x = "Year", y = "Temperature (°F)")

ggplot(data, aes(x = Year, y = uah_lower_troposphere)) +
  geom_point() +  # Add points for the scatterplot
  geom_smooth(method = "lm", se = FALSE, color = "black") +  # Add linear regression line
  labs(title = "Scatterplot with Smoothed Line of UAH Lower Troposphere Temperatures", x = "Year", y = "Temperature (°F)")

ggplot(data, aes(x = Year, y = rss_lower_troposphere)) +
  geom_point() +  # Add points for the scatterplot
  geom_smooth(method = "lm", se = FALSE, color = "orange") +  # Add linear regression line
  labs(title = "Scatterplot with Smoothed Line of RSS Lower Troposphere Temperatures", x = "Year", y = "Temperature (°F)")
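
The trend lines drawn by geom_smooth(method = "lm") in the plots above are ordinary linear regressions of temperature on year, so the slope can also be read off numerically. The following is a rough sketch on made-up data, not the assignment dataset; `toy`, `anomaly`, and the slope variables are hypothetical names, and the values are constructed to rise by exactly 0.02 per year.

```r
# Made-up series that rises by exactly 0.02 units per year (illustrative only)
toy <- data.frame(Year = 1990:1999)
toy$anomaly <- 0.02 * (toy$Year - 1990) - 0.1

# Introduce one missing value, similar to the gaps in the satellite columns;
# lm() drops incomplete rows by default before fitting
toy$anomaly[3] <- NA

# Fit the same kind of model that geom_smooth(method = "lm") fits
fit <- lm(anomaly ~ Year, data = toy)

# The coefficient on Year is the estimated change per year
slope_per_year <- unname(coef(fit)["Year"])
slope_per_decade <- 10 * slope_per_year
print(slope_per_decade)
```

Applied to the real columns, the same call would give a per-decade rate of change for each series, which may be easier to compare than the slopes of three separate plots.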

Facet wrapping the visualization - We facet wrap the visualization to allow comparison of the columns that make up the dataset.

# Create a combined dataset with melted data for easy plotting
melted_data <- melt(data, id.vars = "Year")

# Create a scatterplot with a linear trend line, faceted by variable
ggplot(data = melted_data, aes(x = Year, y = value, color = variable)) +
  geom_point() +  # Add points
  geom_smooth(method = "lm", se = FALSE) +  # Add linear regression line
  labs(title = "Scatterplots with Smoothed Lines for Temperature Data", x = "Year", y = "Temperature (°F)") +
  facet_wrap(~ variable, scales = "free_y", ncol = 1) +  # Facet by variable, allow y-axis to vary, 1 column
  scale_color_manual(values = c("earth_surface" = "blue", "uah_lower_troposphere" = "black", "rss_lower_troposphere" = "orange")) +  # Define custom colors
  theme_minimal() +  # Minimal theme
  theme(legend.position = "bottom")  # Legend position
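
Faceting by variable relies on the data being in long format, with one row per year and series. The following base-R sketch shows that shape on a tiny made-up table; it is an illustration of the idea rather than a description of how melt() works internally, and `wide`, `long`, and the values are hypothetical.

```r
# Tiny made-up wide table: one column per temperature series (illustrative values)
wide <- data.frame(
  Year = c(2019, 2020, 2021),
  earth_surface = c(1.0, 1.1, 1.2),
  uah_lower_troposphere = c(0.5, NA, 0.7)
)

# stack() puts the measurement columns into a single 'values' column and
# records the source column name in a factor called 'ind'
stacked <- stack(wide[, c("earth_surface", "uah_lower_troposphere")])

# Repeat the Year column once per series and use the column names
# Year, variable, value, as in the melted data frame
long <- data.frame(
  Year = rep(wide$Year, times = 2),
  variable = stacked$ind,
  value = stacked$values
)

print(long)
```

Each level of `variable` becomes one panel when the long table is passed to facet_wrap(~ variable), and rows with a missing `value` simply contribute no point to their panel.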

REFERENCES

Baglin, J 2023, ‘Data Visualisation: From Theory to Practice’, RMIT Online, RMIT University Melbourne,

MATH2402, ‘Data Visualisation and Communication’, RMIT Online, RMIT University Melbourne, https://rmit.instructure.com/courses/95340/pages/1-dot-3-1-data-visualisation-critique?module_item_id=3696530, accessed 12 September 2023

US EPA 2016, Climate Change Indicators: U.S. and Global Temperature, US EPA, viewed 10 September 2023, https://www.epa.gov/climate-indicators/climate-change-indicators-us-and-global-temperature

NOAA (National Oceanic and Atmospheric Administration) 2022, Climate at a glance, accessed March 2022.

RStudio 2021, RStudio Cheatsheets, RStudio, retrieved 22 July 2021, https://www.rstudio.com/resources/cheatsheets/