---
title: "ANLY 512: Lab 2: Data Exploration and Analysis Laboratory"
author: "Mithil Kashyap Vyas"
date: "04/10/2023"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    social: menu
    source_code: embed
    html_document:
    df_print: paged
    pdf_document: default
---
```{r setup, include=FALSE}
# Load the packages used by the dashboard. include=FALSE keeps the
# package start-up messages out of the rendered page.
library(quantmod)
library(flexdashboard)
library(plyr)
library(DT)
library(ipred)
library(dplyr)
library(highcharter)
library(viridisLite)
library(ggplot2)
library(broom)
library(xts)
library(zoo)
library(dygraphs)
library(lubridate)
library(maps)
library(ggmap)
library(maptools)
library(rnoaa)
```

#####################################################################################################################

# **Introduction**

## Table of Contents {.sidebar}

**Table of Contents:**

* Introduction

* Global CO~2~ Levels
  
* Global Temperatures

* Lower Tropospheric Global Temperature

* U.S. Statewide Min/Max Temperatures

* Global Sea Ice Coverage

* Sea Ice Coverage by Hemispheres

* Final Observations & Questions

Row {data-height=230}
-----------------------------------------------------------------------
    
### **Overview** 
Climate change and climate science have been subjects of discussion and debate for at least the last decade. Climate data are currently being collected for areas all over the world, and policy decisions are based on the most recent analyses of data extracted from huge online repositories. Because of the rapid growth in the electronic production and storage of information, quantitative decision making often comes with a feeling of “information overload.” As an analyst, your job will often be to explore large data sets and develop questions or ideas from visualizations of those data sets.

The ability to synthesize large data sets using visualizations is a skill that all data scientists should have. In addition, data scientists are called upon to present data syntheses and develop questions or ideas based on their data exploration. This lab should take you through the major steps in data exploration and presentation.

### **Objective** 

The objective of this laboratory is to survey the available data and then plan, design, and create an information dashboard/presentation that not only explores the data but also helps you develop questions based on that exploration. To accomplish this task you will have to complete a number of steps:

1. Identify what information interests you about climate change.
2. Find, collect, organize, and summarize the data necessary to create your data exploration plan.
3. Design and create the most appropriate visualizations (no fewer than 5) to explore the data and present that information.
4. Finally, organize the layout of those visualizations into a dashboard (use the flexdashboard package) in a way that shows your path of data exploration.
5. Develop four questions or ideas about climate change from your visualizations.

Row
-----------------------------------------------------------------------
    


### **Dates & Deliverables**

You are responsible for submitting a link to your dashboard hosted on the RPubs site. The dashboard must include the `source_code = embed` parameter.
The due date for this project is XX at the start of class. This assignment is worth 75 points, 3x a normal homework; the additional time should allow you to give it the necessary effort.

You are welcome to work in groups of ≤2 people. However, each person in a group must submit their own link to the assignment on Moodle for grading! Team members can submit the same link to a single RPubs account; however, it may be a good idea for each of you to post your own copy to RPubs in case you want to share it with prospective employers, etc.


### **Getting data**

There are lots of places to get climate data to answer your questions. The simplest would be to go to the NOAA National Centers for Environmental Information (https://www.ncdc.noaa.gov/), which hosts all kinds of data (regional, global, marine). The front page of the NOAA website also links to other websites that have climate data, such as (https://www.climate.gov/), (https://www.weather.gov/), (https://www.drought.gov/drought/), and (https://www.globalchange.gov/). You don’t have to use all of them, but it might be helpful to browse them to get ideas for developing your questions.

Alternatively, and more professionally, there are many packages that let you access data directly from R. See here for a great primer on accessing NOAA data with ‘R’. It is also a good introduction to API keys and their use.
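
As a rough illustration of the API-key idea mentioned above, the sketch below reads a key from an environment variable instead of hard-coding it, which matters here because the dashboard source is embedded in the published page. The variable name `NOAA_API_KEY` and the helper `get_api_key()` are hypothetical names chosen for this example; they are not part of any NOAA package.

```r
# Hypothetical helper: look up an API key stored in an environment variable.
# The variable name "NOAA_API_KEY" is an assumption made for this sketch.
get_api_key <- function(var = "NOAA_API_KEY") {
  key <- Sys.getenv(var, unset = NA_character_)
  if (is.na(key) || !nzchar(key)) {
    stop("Set the ", var, " environment variable before requesting data.")
  }
  key
}

# Usage: set a dummy value for this session only, then read it back.
Sys.setenv(NOAA_API_KEY = "demo-key-not-real")
key <- get_api_key()
```

In practice the key would typically be set once outside the document (for example in a user-level `.Renviron` file), so it never appears in the embedded source.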

#####################################################################################################################

# **Global CO~2~ Levels**

Column {data-width=250}
------------------------
### **Trend of the Global CO~2~ levels**

The visualization shows annual mean atmospheric CO~2~ levels from 1959 to 2022. The data are provided by the National Oceanic and Atmospheric Administration (NOAA), and the CO~2~ levels are measured in parts per million (ppm).

From the graph we can conclude that global CO~2~ levels have risen sharply over this period. This is likely due to the rapid growth of the global economy and the global supply chain, coupled with the steadily increasing use of fossil fuels to power them.
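
To put a number on a trend like this, one option is to read the slope off a fitted straight line. The sketch below uses made-up placeholder values rather than the NOAA series; `co2_demo` and `ppm_per_year` are names invented for the example.

```r
# Illustrative placeholder values only (NOT real NOAA measurements):
# a concentration that rises by exactly 1.5 ppm per year.
co2_demo <- data.frame(
  year = 2000:2009,
  mean = 370 + 1.5 * (0:9)
)

# Fit a straight line; the slope is the average change in ppm per year.
fit <- lm(mean ~ year, data = co2_demo)
ppm_per_year <- unname(coef(fit)["year"])
ppm_per_year
```

Applied to the real data frame, the same two lines would report the average yearly increase that the trend line in the chart represents.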

Column {data-width=850}
------------------------
### **Global CO~2~ Levels**

```{r}
# Annual mean CO2 (ppm) at Mauna Loa; comment.char skips the "#" file header
data = read.csv(url("https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_annmean_mlo.csv"), comment.char = "#")

# Annual means with a fitted linear trend line
ggplot(data, aes(x = year, y = mean)) +
  geom_point(color = "red") +
  stat_smooth(method = lm) +
  ggtitle("Average Global CO2 Levels from 1959-2022") +
  labs(x = "Year", y = "Mean CO2 Level (ppm)")
```


#####################################################################################################################


# **Global Temperatures**

Column {data-width=250}
------------------------
### **Trend of the Global Temperatures**

The dataset, obtained from NASA's Goddard Institute for Space Studies (GISS), provides global temperature anomalies from 1880 to 2022. The visualization shows an estimate of global surface temperature change over both land and ocean.

From the graph, we can conclude that the global temperature has been increasing steadily, with a sharper rise beginning around 1976. We can also note a slight dip around 2020, possibly related to the quarantines and lockdowns associated with the Covid-19 pandemic.
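
An anomaly is simply an observed temperature minus the average over a reference (baseline) period. The sketch below works through that arithmetic on placeholder numbers; the four-year baseline window and the data frame `temps` are assumptions made to keep the example small, not the base period used by GISS.

```r
# Illustrative placeholder temperatures in degrees Celsius (NOT real data).
temps <- data.frame(
  year   = 1951:1956,
  temp_c = c(14.0, 14.2, 13.8, 14.0, 14.4, 14.6)
)

# Assumed baseline window for this sketch: the first four years.
baseline <- mean(temps$temp_c[temps$year %in% 1951:1954])

# Anomaly = observed value minus the baseline mean.
temps$anomaly <- temps$temp_c - baseline
temps
```

A positive anomaly means the year was warmer than the baseline average, and a negative one means it was cooler.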


Column {data-width=1150}
------------------------
### **Global Temperature Change in Celsius (1880 - 2022)**

```{r}
# GISS land-ocean temperature index: year, annual mean, and lowess-smoothed series
data2 = read.table("https://data.giss.nasa.gov/gistemp/graphs/graph_data/Global_Mean_Estimates_based_on_Land_and_Ocean_Data/graph.txt", header = FALSE, col.names = c("Year", "No_Smoothing", "Lowess_5"), skip = 5)
smoothing = ts(data2$Lowess_5, frequency = 1, start = 1880)
anMean = ts(data2$No_Smoothing, frequency = 1, start = 1880)
temp = cbind(smoothing, anMean)

# Interactive time series with a range selector
dygraph(temp, main = "Global Temperature Anomaly from 1880 - 2022 in Degrees Celsius", xlab = "Year", ylab = "Temperature Anomaly") %>%
  dyRangeSelector() %>%
  dyLegend(width = 250, show = "onmouseover") %>%
  dyOptions(colors = RColorBrewer::brewer.pal(3, "Set1"))
```


#####################################################################################################################


# **Lower Tropospheric Global Temperature**

Column {data-width=250}
------------------------
### **Trend of Lower Tropospheric Global Temperature**

Lower tropospheric global temperature is a common index used to track the Earth's warming.

This visualization shows annual lower tropospheric global temperature anomalies from 1850 to 2022. From the graph, we can conclude that the anomalies have increased over the last century. The spikes around 2000 and 2019 show a very sharp increase, which could be alarming.

Column {data-width=850}
------------------------
### **Annual Lower Tropospheric Global Temperature Anomalies (1850 - 2022)**

```{r}
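# NOAA Climate at a Glance annual series for 1850-2022
# (the URL path requests the global land_ocean data set)
data3 = read.csv(url("https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series/globe/land_ocean/ytd/12/1850-2022/data.csv"), skip = 4)

# Reshape to long format so each value column is drawn as its own line
data3_melt = reshape2::melt(data3, id.vars = "Year")

ggplot(data3_melt, aes(x = Year, y = value, col = variable)) +
  geom_line() +
  labs(title = "Annual Lower Tropospheric Global Temperature Anomalies", x = "Year", y = "Anomalies", color = NULL)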
```


#####################################################################################################################

# **U.S. Statewide Min/Max Temperatures**

Column {data-width=200}
------------------------

### **Trend of the U.S. Statewide Min/Max Temperatures**

This visualization shows the statewide minimum and maximum temperatures across the US over a five-year period (July 2014 to June 2019). From the maps, we can conclude that temperatures rise steadily as we move south, as seen in the color gradient of both maps.
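
Maps like these are built by joining one value per state onto the polygon outline of each state. The sketch below shows that join on a tiny hand-made outline standing in for `map_data("state")`; the column names mirror the ones that function returns, while the regions and values are placeholders invented for the example.

```r
# Tiny stand-in for map_data("state"): two "states", three vertices each.
# `order` records the sequence in which the vertices must be drawn.
outline <- data.frame(
  region = c("beta", "beta", "beta", "alpha", "alpha", "alpha"),
  long   = c(2, 3, 2.5, 0, 1, 0.5),
  lat    = c(0, 0, 1, 0, 0, 1),
  order  = 1:6
)

# Placeholder values per state (NOT real temperatures).
values <- data.frame(Location = c("Alpha", "Beta"), Value = c(70, 80))
values$region <- tolower(values$Location)

# merge() sorts by the key column, so the vertex sequence can change...
joined <- merge(outline, values, by = "region", all.x = TRUE)

# ...and sorting by `order` restores the drawing sequence for geom_polygon().
joined <- joined[order(joined$order), ]
```

Re-sorting after the merge is a cheap safeguard: polygons drawn from out-of-sequence vertices show up as stray lines across the map.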


Column {.tabset .tabset-fade data-width=850}
------------------------
### **Statewide Maximum Temperatures**

```{r}
# Statewide maximum temperatures for the 60 months ending June 2019
data4 = read.csv(url("https://www.ncdc.noaa.gov/cag/statewide/mapping/110-tmax-201906-60.csv"), skip = 3)

# Join the values onto the state outlines by lower-case state name
data4$region = tolower(data4$Location)
usstates = map_data("state")
data4 = merge(usstates, data4, by = "region", all = TRUE)
# merge() can reorder rows; restore the vertex order so polygons draw correctly
data4 = data4[order(data4$order), ]

ggplot(data4, aes(x = long, y = lat, group = group, fill = Value)) +
  geom_polygon(color = "white") +
  scale_fill_gradient(name = "Degrees Fahrenheit", low = "#feceda", high = "#c81f49", guide = "colorbar", na.value = "black") +
  labs(title = "Statewide Maximum Temperature [July 2014 - June 2019]", x = "Longitude", y = "Latitude") +
  coord_map()
```


### **Statewide Minimum Temperatures**

```{r}
# Statewide minimum temperatures for the same 60-month period
data5 = read.csv(url("https://www.ncdc.noaa.gov/cag/statewide/mapping/110-tmin-201906-60.csv"), skip = 3)

# Join onto the state outlines from the previous chunk, then restore vertex order
data5$region = tolower(data5$Location)
data5 = merge(usstates, data5, by = "region", all = TRUE)
data5 = data5[order(data5$order), ]

ggplot(data5, aes(x = long, y = lat, group = group, fill = Value)) +
  geom_polygon(color = "white") +
  scale_fill_gradient(name = "Degrees Fahrenheit", na.value = "black") +
  labs(title = "Statewide Minimum Temperature [July 2014 - June 2019]", x = "Longitude", y = "Latitude") +
  coord_map()
```


#####################################################################################################################

# **Global Sea Ice Coverage**


Column {data-width=200}
------------------------
### **Trend of Global Sea Ice Coverage**

The dataset was obtained through NOAA. The visualization shows the extent of global sea ice, in millions of square kilometers, in the month of February for every year from 1979 through 2023.

From the graph, we can see that the extent of global sea ice remained fairly consistent until the year 2000, hovering around 18 to 19 million square kilometers. After that, the extent declined from about 19 to 16 million square kilometers. The lowest point appears to be February 2023, at almost exactly 16 million square kilometers.
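
Readings like "lowest point" and "before versus after 2000" can also be computed rather than read off the chart. The sketch below does both on placeholder values; the data frame `ice` and its numbers are invented for the example and are not NOAA measurements.

```r
# Illustrative placeholder extents in million sq km (NOT real NOAA values).
ice <- data.frame(
  year   = c(1998, 1999, 2000, 2001, 2002, 2003),
  extent = c(18.5, 18.7, 18.6, 17.9, 17.5, 17.0)
)

# Year with the smallest extent.
lowest_year <- ice$year[which.min(ice$extent)]

# Mean extent up to and including 2000 ("FALSE") versus after 2000 ("TRUE").
mean_by_period <- tapply(ice$extent, ice$year > 2000, mean)
```

With the real series, `lowest_year` would confirm which February had the smallest extent, and `mean_by_period` would show how far the average dropped after 2000.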


Column {data-width=850}
------------------------
### **Global Sea Ice Extent in February (1979 - 2023)**

```{r}
# February global sea ice extent (millions of sq km) from NOAA NCEI
data6 = read.csv(url("https://www.ncei.noaa.gov/access/monitoring/snow-and-ice-extent/sea-ice/G/2/data.csv"), skip = 4)

# Area chart of the yearly values (no jitter, which would add random noise)
ggplot(data6, aes(x = Date, y = Value)) +
  geom_area(alpha = 0.2, fill = "blue") +
  scale_y_continuous(breaks = seq(0, 20, by = 2)) +
  theme_minimal() +
  ylab("Extent of Global Sea Ice (Millions of Sq Km)") +
  ggtitle("Global Sea Ice Extent in February (1979-2023)")
```



#####################################################################################################################


# **Sea Ice Coverage by Hemispheres**

Column {data-width=200}
------------------------
### **Trend of Sea Ice Coverage by Hemispheres**

NOAA provides datasets on sea ice extent for both the Northern and Southern Hemispheres, which supplied the data for the two graphs in this section. Both datasets show the extent of sea ice, in millions of square kilometers, in the month of February for every year from 1979 through 2023.

From the visualizations we can conclude that the Northern Hemisphere has far more sea ice than the Southern Hemisphere in February: 14-16 million square kilometers versus only 2-3 million. The graphs also show that the Northern Hemisphere has fairly predictable ice levels and a fairly consistent decline, while the Southern Hemisphere varies far more from year to year. Both hemispheres show a decline, with the lowest ice level in the North coming in 2018 and the lowest in the South coming in 2023. Both hemispheres have been affected by the changing climate; rising temperatures have had different impacts on each, but the result is the same: steadily falling ice levels.


Column {.tabset .tabset-fade}
------------------------
### **Northern Hemisphere Global Ice Extent**

```{r}
# February sea ice extent for the Northern Hemisphere (millions of sq km)
data7 = read.csv(url("https://www.ncei.noaa.gov/access/monitoring/snow-and-ice-extent/sea-ice/N/2/data.csv"), skip = 4)

# Area chart of the yearly values (no jitter, which would add random noise)
ggplot(data7, aes(x = Date, y = Value)) +
  geom_area(alpha = 0.2, fill = "cyan") +
  scale_y_continuous(breaks = seq(0, 18, by = 2)) +
  theme_minimal() +
  ylab("Extent in Millions of Sq Km") +
  ggtitle("Extent of Northern Hemisphere Sea Ice in February (1979-2023)")
```

### **Southern Hemisphere Global Ice Extent**

```{r}
# February sea ice extent for the Southern Hemisphere (millions of sq km)
data8 = read.csv(url("https://www.ncei.noaa.gov/access/monitoring/snow-and-ice-extent/sea-ice/S/2/data.csv"), skip = 4)

# Area chart of the yearly values (no jitter, which would add random noise)
ggplot(data8, aes(x = Date, y = Value)) +
  geom_area(alpha = 0.2, fill = "purple") +
  scale_y_continuous(breaks = 0:4) +
  theme_minimal() +
  ylab("Extent in Millions of Sq Km") +
  ggtitle("Extent of Southern Hemisphere Sea Ice in February (1979-2023)")
```

#####################################################################################################################

# **Final Observations & Questions**

Row
-----------------------------------------------------------------------

### **Observations and Questions**

**What can we tell from all 6 visualizations?**

The visualizations let us ask six fundamental questions. We used open-source data sets and analyzed the atmospheric data to show trends, and for each question we raised, a visualization provides an answer. The answers may help us understand our current situation and make better data-driven decisions.

+ **Question 1:** What is the trend for the global CO~2~ levels?
**Answer:** Global CO~2~ levels are dangerously high, and the data show that they are still trending higher.

+ **Question 2:** What trends can we see in the global temperature?
**Answer:** Global temperatures are high, and the data show that they are trending even higher with no signs of slowing.

+ **Question 3:** What trends can we see in the annual lower tropospheric global temperature anomalies?
**Answer:** The annual lower tropospheric temperature anomalies have increased over the last century, with especially visible spikes between 2000 and 2019.

+ **Question 4:** What is the trend of temperature across the states in the US?
**Answer:** Temperatures rise steadily as we move south, as seen in the color gradient of both maps.

+ **Question 5:** What is the trend of global sea ice coverage?
**Answer:** The extent of global sea ice remained fairly consistent until the year 2000, hovering around 18 to 19 million square kilometers. After that, the extent declined from about 19 to 16 million square kilometers. The lowest point appears to be February 2023, at almost exactly 16 million square kilometers.

+ **Question 6:** What is the trend of global sea ice coverage by hemispheres?
**Answer:** The Northern Hemisphere has far more sea ice than the Southern Hemisphere in February: 14-16 million square kilometers versus only 2-3 million. Both hemispheres show a decline, with the lowest ice level in the North coming in 2018 and the lowest in the South coming in 2023.


**Conclusion:**

Combining all of these observations from the data analysis, we can conclude that rising global CO~2~ levels are the major cause of the rising temperatures on land and sea, which in turn have resulted in drastically lower global sea ice levels. This showcases the urgency of closely monitoring CO~2~ in the atmosphere and the dire need to curb global CO~2~ levels.
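
One way to check how closely two of these series move together is to align them by year and compute a correlation coefficient. The sketch below uses placeholder values; `co2`, `temp`, and `both` are names invented for the example, and a correlation measures association only, not which series drives the other.

```r
# Illustrative placeholder series (NOT real measurements).
co2  <- data.frame(year = 2001:2005, ppm = c(371, 373, 376, 377, 380))
temp <- data.frame(year = 2002:2006, anomaly = c(0.60, 0.62, 0.55, 0.68, 0.64))

# Keep only the years present in both series.
both <- merge(co2, temp, by = "year")

# Pearson correlation between the two aligned columns.
r <- cor(both$ppm, both$anomaly)
```

The same pattern would apply to any pair of the dashboard's data frames once their year columns share a name.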






#####################################################################################################################