---
title: "ANLY 512 Lab 2: Data Exploration and Analysis Laboratory"
author: "Wenxue Sun, Arju Begum"
date: "`r Sys.Date()`"
output:
  flexdashboard::flex_dashboard:
    source_code: embed
    orientation: rows
    vertical_layout: fill
---
```{r setup, include=FALSE}
library(flexdashboard)
library(ggplot2)
library(dplyr)
library(ggmap)      # provides theme_nothing()
library(gridExtra)
library(plotly)
library(reshape2)   # provides melt() and dcast()

# State-level land temperatures: parse the date, then derive year and month.
# Year and Month are stored as character strings.
temperatures_state <- read.csv("data/GlobalLandTemperaturesByState.csv")
temperatures_state$dt <- as.Date(temperatures_state$dt)
temperatures_state$Year <- format(temperatures_state$dt, format = "%Y")
temperatures_state$Month <- format(temperatures_state$dt, format = "%m")
temperatures_state.usa <- temperatures_state %>% filter(Country == "United States")

# Sea ice extent: keep the columns used below and label months 1-12 by name
seaice <- read.csv("data/seaice.csv")
seaice <- seaice %>%
  select(Year, Month, Extent, hemisphere)
seaice$Month <- factor(seaice$Month, levels = 1:12, labels = month.name)
```
Introduction
================================
Row {}
-------------------------
### Overview
Climate change and climate science have been a subject of discussion and debate for at least the last decade. Climate data are currently being collected for areas all over the world. Policy decisions are based on the most recent analyses of data extracted from huge online repositories. Because of the growth in the electronic production and storage of information, quantitative decision making often comes with a feeling of "information overload". As an analyst, your job will often be to explore large data sets and develop questions or ideas from visualizations of those data sets.
The ability to synthesize large data sets using visualizations is a skill that all data scientists should have. In addition, data scientists are called upon to present data syntheses and develop questions or ideas based on their data exploration. This lab should take you through the major steps in data exploration and presentation.
Row {}
-------------------------
### Objective
The objective of this laboratory is to survey the available data, plan, design, and create an information dashboard/presentation that not only explores the data but helps you develop questions based on that data exploration.
Four questions or ideas about climate change from our visualizations:
1. How does the climate change over time?
2. Can we visualize the change in sea ice over time?
3. What is the pattern of the change in sea ice over time?
4. Do changes in sea ice differ between the two hemispheres?
Global Temperature
================================
Inputs {.sidebar}
-----------------------------------------------------------------------
### Global Temperature Analysis
One of the most obvious indications of climate change is the rise in global temperature over the past several decades.
We also chose to study the Global Temperature Anomalies data from the Climate.gov website, which created this dataset by blending land and ocean data. We used this data to understand the trend in temperature anomalies over time.
Source : https://www.climate.gov/maps-data/dataset/global-temperature-anomalies-graphing-tool
The graph shows a clear progression of increasing temperature anomalies, especially in the 20th century, which supports the conclusion that climate change is real.
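
A temperature anomaly is the departure of an observed temperature from a reference-period average. The sketch below shows that calculation on a small made-up series; the data frame `toy_temps`, its values, and the choice of reference period are assumptions for illustration only, not the Climate.gov data or method.

```r
# Hypothetical yearly mean temperatures in degrees C (values are made up)
toy_temps <- data.frame(
  Year    = 1900:1909,
  AvgTemp = c(8.5, 8.6, 8.4, 8.7, 8.5, 8.8, 8.9, 8.7, 9.0, 9.1)
)

# Baseline: mean over an assumed reference period (here the first five years)
baseline <- mean(toy_temps$AvgTemp[toy_temps$Year <= 1904])

# Anomaly: each year's departure from that baseline
toy_temps$Anomaly <- toy_temps$AvgTemp - baseline
```

Positive anomalies mark years warmer than the reference period, so a rising anomaly series corresponds to the warming trend described above.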
Row {data-height=450}
-----------------------------------------------------------------------
```{r}
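# Mean temperature per state in the 1850 baseline year
avg.temp1850 = temperatures_state.usa %>% filter(Year == 1850) %>% group_by(State) %>%
  summarise(avg.temp.1850 = mean(AverageTemperature, na.rm = TRUE))
# State polygons; region names are lower case
usa.map = map_data("state")
a = unique(usa.map$region)
# Row 11's label does not match the map's region names, so it is overwritten
# with the 10th map region ("georgia")
avg.temp1850$State = tolower(avg.temp1850$State)
avg.temp1850$State[11] = a[10]
usa.map = merge(x = usa.map, y = avg.temp1850, by.x = "region", by.y = "State", all.x = TRUE)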
```
```{r}
# Same per-state means for 2000 and 2012, joined onto the map data
avg.temp2000 = temperatures_state.usa %>% filter(Year == 2000) %>% group_by(State) %>%
  summarise(avg.temp.2000 = mean(AverageTemperature, na.rm = TRUE))
avg.temp2000$State = tolower(avg.temp2000$State)
avg.temp2000$State[11] = a[10]
usa.map = merge(x = usa.map, y = avg.temp2000, by.x = "region", by.y = "State", all.x = TRUE)
avg.temp2012 = temperatures_state.usa %>% filter(Year == 2012) %>% group_by(State) %>%
  summarise(avg.temp.2012 = mean(AverageTemperature, na.rm = TRUE))
avg.temp2012$State = tolower(avg.temp2012$State)
avg.temp2012$State[11] = a[10]
usa.map = merge(x = usa.map, y = avg.temp2012, by.x = "region", by.y = "State", all.x = TRUE)
# Percentage change in average temperature relative to 1850
usa.map$change2000 <- (usa.map$avg.temp.2000 - usa.map$avg.temp.1850) * 100 / usa.map$avg.temp.1850
usa.map$change2012 <- (usa.map$avg.temp.2012 - usa.map$avg.temp.1850) * 100 / usa.map$avg.temp.1850
# merge() does not preserve row order; restore the polygon drawing order
usa.map = usa.map[order(usa.map$order), ]
```
### Temperature in USA - 2000
```{r}
# Choropleth of the percentage change from 1850 to 2000
p1 <- ggplot() +
  geom_polygon(data = usa.map, aes(x = long, y = lat, group = group, fill = change2000), col = "white") +
  scale_fill_continuous(low = "light blue", high = "red", limits = c(-0.15, 122),
                        name = "Percentage Change in Average Temperature") +
  theme_nothing(legend = TRUE) +
  theme(legend.position = "bottom") +
  labs(title = "Temperature in USA - 2000") +
  coord_map("albers", lat0 = 45.5, lat1 = 29.5)
print(p1)
```
### Temperature in USA - 2012
```{r}
# Choropleth of the percentage change from 1850 to 2012
p2 <- ggplot() +
  geom_polygon(data = usa.map, aes(x = long, y = lat, group = group, fill = change2012), col = "white") +
  scale_fill_continuous(low = "sky blue", high = "red", limits = c(-0.15, 122),
                        name = "Percentage Change in Average Temperature") +
  theme_nothing(legend = TRUE) +
  theme(legend.position = "bottom") +
  labs(title = "Temperature in USA - 2012") +
  coord_map("albers", lat0 = 45.5, lat1 = 29.5)
print(p2)
```
Row {data-height=550}
-------------
### Average Temperature in USA 1849-2012
```{r}
# Year is stored as text; convert it so the filter and the axis treat it as a number
temperatures_state.usa %>%
  mutate(Year = as.integer(Year)) %>%
  filter(Year >= 1849) %>%
  group_by(Year) %>%
  summarise(Avg.Temp = mean(AverageTemperature, na.rm = TRUE)) %>%
  ggplot(aes(x = Year, y = Avg.Temp, col = Avg.Temp)) +
  geom_point() +
  geom_smooth(aes(group = 1)) +
  scale_color_continuous(low = "sky blue", high = "red") +
  scale_x_continuous(breaks = seq(1849, 2012, 10)) +
  theme(panel.background = element_blank(),
        legend.position = "bottom") +
  labs(title = "Average Temperature in USA 1849-2012")
```
Sea Ice
==============================
Inputs {.sidebar}
-----------------------------------------------------------------------
### Sea Ice Analysis
The area under ice in the Northern Hemisphere is greater than in the Southern Hemisphere throughout the period 1978-2015.
Comparing 1980, 1990, 2000, and 2010, the average ice extent during the first half of the year is noticeably lower in 2000 and 2010, while the pattern appears to reverse gradually in the second half of the year.
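
A comparison like this rests on averaging the individual extent records by hemisphere, year, and month. The sketch below shows that aggregation step on a few made-up records; the data frame `toy_ice` and its values are hypothetical and only mirror the column names of the sea ice file.

```r
library(dplyr)

# Hypothetical daily records: two days in March and two in September per hemisphere
toy_ice <- data.frame(
  hemisphere = rep(c("north", "south"), each = 4),
  Year       = 1980,
  Month      = rep(c(3, 3, 9, 9), times = 2),
  Extent     = c(16.0, 16.2, 7.4, 7.6, 4.0, 4.2, 18.5, 18.7)
)

# One mean extent per hemisphere, year, and month
toy_monthly <- toy_ice %>%
  group_by(hemisphere, Year, Month) %>%
  summarise(Extent = mean(Extent)) %>%
  ungroup()
```

Each row of `toy_monthly` is then directly comparable across hemispheres and years, which is what the month-wise charts rely on.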
```{r}
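# Northern-hemisphere records, reshaped to one row per year and one column per month
north <- seaice %>%
  filter(hemisphere == "north") %>%
  select(Year, Month, Extent)
melt_north <- melt(north, id.vars = c("Year", "Month"), measure.vars = "Extent")
# Each cell holds the mean extent for that year and month
case_north <- dcast(melt_north, Year ~ Month, mean)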
```
Row {.tabset .tabset-fade}
-----------------------------------------------------------------------
### Average Sea Ice Extent by Month & Year
```{r}
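# Axis titles and a legend heading placed just outside the plot area
x_axis <- list(title = "Year")
y_axis <- list(title = "Average Extent (10^6 sq km)")
legend_title <- list(
  x = 1.19, y = 1.02, text = "Month",
  xref = "paper", yref = "paper", showarrow = FALSE
)
# One line per month; case_north has a Year column plus one column per month
plot_ly(case_north, x = ~Year) %>%
  add_trace(y = ~January, name = "January", type = "scatter", mode = "lines") %>%
  add_trace(y = ~February, name = "February", type = "scatter", mode = "lines") %>%
  add_trace(y = ~March, name = "March", type = "scatter", mode = "lines") %>%
  add_trace(y = ~April, name = "April", type = "scatter", mode = "lines") %>%
  add_trace(y = ~May, name = "May", type = "scatter", mode = "lines") %>%
  add_trace(y = ~June, name = "June", type = "scatter", mode = "lines") %>%
  add_trace(y = ~July, name = "July", type = "scatter", mode = "lines") %>%
  add_trace(y = ~August, name = "August", type = "scatter", mode = "lines") %>%
  add_trace(y = ~September, name = "September", type = "scatter", mode = "lines") %>%
  add_trace(y = ~October, name = "October", type = "scatter", mode = "lines") %>%
  add_trace(y = ~November, name = "November", type = "scatter", mode = "lines") %>%
  add_trace(y = ~December, name = "December", type = "scatter", mode = "lines") %>%
  layout(title = "Average Sea Ice Extent by Month & Year (Northern Hemisphere)",
         xaxis = x_axis,
         yaxis = y_axis,
         annotations = legend_title)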
```
### Comparison of Ice Cover
```{r}
# Reload the full sea ice file and make the column types explicit
ice <- read.csv("data/seaice.csv")
ice$Year <- as.integer(ice$Year)
ice$Month <- as.integer(ice$Month)
ice$Extent <- as.numeric(ice$Extent)
ice$Missing <- as.numeric(ice$Missing)
# Mean extent per hemisphere and month for four reference years
ice_monthly <- ice %>%
  filter(Year %in% c(1980, 1990, 2000, 2010)) %>%
  group_by(hemisphere, Year, Month) %>%
  summarise(Extent = mean(Extent))
# Refer to Extent by name inside aes() so faceting subsets it correctly
ggplot(ice_monthly, aes(x = as.factor(Month), y = Extent, fill = hemisphere)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ Year, ncol = 2) +
  ylab("Extent (in 10^6 sq km)") +
  xlab("Months") +
  ggtitle("Comparison of Month-wise Ice Cover") +
  theme_minimal()
```
Summary
==============================
Global climate change is real. Global temperatures are now far more anomalous than in earlier years, and the average temperature in the United States has clearly changed.
Summer ice extent in the Northern Hemisphere is lower than in the Southern Hemisphere, while in winter the opposite holds.
In the Northern Hemisphere, the average ice extent decreases from year to year in every month.
In the Southern Hemisphere, the average ice extent decreases only slightly from year to year and stays relatively stable in every month.
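
One way to put a number on "decreases from year to year" is to fit a straight line to the yearly extents of each month and read off the slope. The sketch below does this with `lm()` on made-up values; `toy_trend` and its numbers are hypothetical, and a linear trend is an assumption of this sketch rather than an analysis performed in the dashboard.

```r
# Hypothetical yearly mean extents (10^6 sq km) for two months; values are made up
toy_trend <- data.frame(
  Month  = rep(c("March", "September"), each = 5),
  Year   = rep(2000:2004, times = 2),
  Extent = c(15.5, 15.4, 15.3, 15.2, 15.1, 7.0, 6.8, 6.6, 6.4, 6.2)
)

# Fit Extent ~ Year separately for each month and keep the slope (change per year)
slopes <- sapply(split(toy_trend, toy_trend$Month), function(d) {
  unname(coef(lm(Extent ~ Year, data = d))["Year"])
})
```

A negative slope indicates a decline in that month, and comparing slopes across months or hemispheres shows where the decline is steeper.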