Exercise for Module 6

Choose any previous exercise.

  1. Create a new R markdown file.
  2. Describe what the code is doing.
  3. Convert to HTML with knitr.
  4. Publish the HTML page on RPubs.com.
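
Steps 1 and 3 can also be done from the R console. The sketch below is only one way to do it: the file name exercise.Rmd and its contents are placeholders, and it assumes the rmarkdown package (which calls knitr) and pandoc are installed. Step 4 is usually done with the Publish button in the RStudio preview window.

```r
# Hypothetical file name; written to a temporary folder for this sketch.
rmd_file <- file.path(tempdir(), "exercise.Rmd")

# Build the chunk delimiter (three backticks) programmatically.
fence <- strrep("`", 3)

# A minimal R Markdown document: YAML header, prose, and one code chunk.
rmd_lines <- c(
  "---",
  "title: \"Exercise for Module 6\"",
  "output: html_document",
  "---",
  "",
  "The chunk below computes the mean of three made-up values.",
  "",
  paste0(fence, "{r}"),
  "mean(c(-0.2, 0.1, 0.4))",
  fence
)
writeLines(rmd_lines, rmd_file)

# Knit to HTML only if rmarkdown and pandoc are available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(html_file)
}
```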

Let’s look at the GISS temperature data set. First, load in the libraries used in Module 6, then read the .csv file for the temperature data.

library(rpart)
library(rpart.plot)

giss <- read.csv("giss_temp.csv")

The yearly mean temperature anomaly is calculated by taking the mean of the monthly temperature anomalies for each year.

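# Mean anomaly per year; tapply() returns the groups in sorted year order
ann_temp <- tapply(giss$TempAnom, giss$Year, mean)
# Take the years from the group names so they stay aligned with the means
years <- as.numeric(names(ann_temp))
tempCol <- ifelse(ann_temp > 0, "red", "blue")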

plot(ann_temp ~ years, type = 'h',
     main = "Annual Mean Temperature Anomaly",
     xlab = "Year", ylab = "Temperature Anomaly",
     col = tempCol, lwd = 2)

Before 1930, the annual mean temperature anomaly was negative. Between 1930 and 1980 it fluctuated around zero. After 1980 it became increasingly positive.
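
The annual-mean step can be checked on a small data frame. The values below are made up for illustration and are not GISS data; only the column names (Year, Month, TempAnom) follow the file used above. Taking the years from the names of the tapply() result keeps each mean paired with its own year even when the rows are not sorted.

```r
# Made-up monthly anomalies for two years, deliberately out of order.
toy <- data.frame(
  Year     = c(1951, 1951, 1950, 1950),
  Month    = c(1, 2, 1, 2),
  TempAnom = c(0.10, 0.30, -0.20, -0.40)
)

# Mean anomaly per year; tapply() orders the groups by sorted year.
toy_ann <- tapply(toy$TempAnom, toy$Year, mean)

# Recover the years from the group names rather than from row order.
toy_years <- as.numeric(names(toy_ann))

# Same red/blue rule as the plot: red for positive, blue otherwise.
toy_col <- ifelse(toy_ann > 0, "red", "blue")

print(data.frame(year = toy_years,
                 mean_anom = as.vector(toy_ann),
                 colour = as.vector(toy_col)))
```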

Can a tree-based model be applied to the temperature anomaly data? Because TempAnom is numeric, rpart() fits a regression tree rather than a classification tree.

# Numeric response, so rpart() uses method = "anova" (regression tree)
giss.rpart <- rpart(TempAnom ~ Month + Year, data = giss)
prp(giss.rpart, type = 3, extra = 1)

The regression tree gives results similar to those in the previous plot. The mean temperature anomaly was around -0.25 for years before 1930 and closer to zero between 1930 and 1980. After 1980 it was around 0.22, rising further from 1997 onward.
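
With a numeric response, rpart() defaults to method "anova", so each leaf reports a mean anomaly rather than a class label. The sketch below shows this on simulated data, not the GISS file: the step at 1950 and the noise level are arbitrary choices made only for illustration.

```r
library(rpart)

set.seed(1)  # reproducible noise

# Simulated monthly data: the anomaly steps from -0.3 to 0.3 at 1950.
sim <- expand.grid(Month = 1:12, Year = 1900:1999)
sim$TempAnom <- ifelse(sim$Year < 1950, -0.3, 0.3) +
  rnorm(nrow(sim), sd = 0.05)

# Same model formula as above; the method is chosen from the response type.
sim_tree <- rpart(TempAnom ~ Month + Year, data = sim)

print(sim_tree$method)                      # fitting method rpart selected
print(as.character(sim_tree$frame$var[1]))  # variable used for the root split
print(sim_tree)                             # split points and leaf means
```

In the printed tree, the yval column holds the mean response in each node, which is how the leaf values quoted above should be read.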

There appears to be a relationship between the temperature anomaly and year, but what about month? The Pearson correlation coefficient (r), computed with the cor() function, gives some insight into the strength and direction of the linear relationship between two variables. A value of 0 means there is no linear relationship; a value of 1 or -1 means a perfect positive or negative linear relationship.

yearR <- cor(giss$Year, giss$TempAnom)
monthR <- cor(giss$Month, giss$TempAnom)

There is a strong positive correlation between the temperature anomaly and year (r = 0.8104959).

However, there is essentially no linear correlation between the temperature anomaly and month (r = 0.0042272).
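
Pearson's r, which cor() computes by default, runs from -1 to 1 and only measures linear association. The made-up vectors below (not GISS data) illustrate the two extremes, and also a case where y depends entirely on x yet r is zero because the relationship is symmetric rather than linear.

```r
x <- 1:12  # stand-in for month numbers

r_up   <- cor(x, 2 * x + 1)    # perfectly increasing line
r_down <- cor(x, -x)           # perfectly decreasing line
r_sym  <- cor(x, (x - 6.5)^2)  # U-shape centred on mid-year

print(round(c(up = r_up, down = r_down, symmetric = r_sym), 3))
```

A near-zero r for month therefore rules out a linear trend across the calendar year, not every kind of seasonal pattern.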