My Work

These data are gathered by Berkely Earth, a non-profit founded in 2013. These data display the global land and ocean temperature monthly, from 1850 up to December of 2015. I have read in the data, and created a time series object.
berk<-read.csv("~/Desktop/ESCI597A/Week04/berk.glob.csv", header=T)
berk.ts<-ts(berk$Series,start=c(1850,1),end=c(2015,2),frequency=12)

Monthly Average

The following includes filters for three moving averages. Filters include: three years (36 months), ten years (120 months),and twenty years (240 months).
ma36 <- filter(x=berk.ts, filter=rep(x=1/36,times=36), sides=2)
ma120 <- filter(x=berk.ts, filter=rep(x=1/120,times=120), sides=2)
ma240 <- filter(x=berk.ts, filter=rep(x=1/240,times=240), sides=2)

par(tcl=0.5,mar=rep(3,4),mgp=c(1.1,0.1,0),xaxs="i")
plot(berk.ts,type="l",xlab="Year",ylab="Land and Ocean Temperature",col="cadetblue3")
abline(h=-0.15)
lines(ma36, col = "chocolate3", lwd = 2)
lines(ma120, col = "gold2", lwd = 2)
lines(ma240, col = "firebrick4", lwd = 2)
axis(3);axis(4)

#####Note how low frequencies retain the most value (three years), while higher frequency (twenty years) looses some of the values on the ends. We get a smoother line, but at the expense of lost data at the ends. However, there seems to be an upward trend. Interesting.

Hanning Filter

Using a Hanning Filter, the filters still retain more value at a lower frequency, but, with more smoother lines. It still looks like there is an upward trend in global temperature.
require(dplR)
## Loading required package: dplR
series<-berk$Series
date<-berk$Date.Decimal

han36 <- hanning(series,n=36)
han120 <- hanning(series,n=120)
han240 <- hanning(series,n=240)

par(tcl=0.5,mar=rep(3,4),mgp=c(1.1,0.1,0),xaxs="i")
plot(date,series,type="l",xlab="Year",ylab="Land and Ocean Temperature",col="cadetblue3")
abline(h=-0.15)
lines(date,han36, col = "chocolate3", lwd = 2)
lines(date,han120, col = "gold2", lwd = 2)
lines(date,han240, col = "firebrick4", lwd = 2)
axis(3);axis(4)

The Hanning filter smooths out the line, though it still is very similar to the moving average. However, the ends are still truncated. So we end up loosing some information. To adjust for this, we can use a smoothing spline to make these lines.

Smoothing Splines

spl.2 <- smooth.spline(berk$Series,spar=0.2)$y
spl.4 <- smooth.spline(berk$Series,spar=0.4)$y
spl.8 <- smooth.spline(berk$Series,spar=0.8)$y

par(tcl=0.5,mar=rep(3,4),mgp=c(1.1,0.1,0),xaxs="i")
plot(date,series,type="l",xlab="Year",ylab="Land and Ocean Temperature",col="cadetblue3")
abline(h=-0.15)
lines(date,spl.2, col = "chocolate3", lwd = 2)
lines(date,spl.4, col = "gold2", lwd = 2)
lines(date,spl.8, col = "firebrick4", lwd = 2)
axis(3);axis(4)

Check out these values of spar, the smoothing parameter. These values typically range from (0,1].
spl.1 <- smooth.spline(berk$Series,spar=0.1)$y
spl.5 <- smooth.spline(berk$Series,spar=0.5)$y
spl.9 <- smooth.spline(berk$Series,spar=0.9)$y

par(tcl=0.5,mar=rep(3,4),mgp=c(1.1,0.1,0),xaxs="i")
plot(date,series,type="l",xlab="Year",ylab="Land and Ocean Temperature",col="cadetblue3")
abline(h=-0.15)
lines(date,spl.1, col = "chocolate3", lwd = 2)
lines(date,spl.5, col = "gold2", lwd = 2)
lines(date,spl.9, col = "firebrick4", lwd = 2)
axis(3);axis(4)

Check out those end regions! We have some knowledge that the planet is warming overall, and the smoothing spline is similar to the Hanning filter in that is averagine data from the years around each point, but the smoothing spline is removing low frequency variability. Spar values closer to 1 smooth the line out more, while values closer to 0 do not smooth out the data as much. This is a result of the math involved in calculating the the smoothing parameter. Spar is the value of the coefficient of lambda.

Lowess

Removing low-frequency variability
36 Months
n<-length(series)
yrs<-1:n
f36 <- 36/n
f36.lo <- lowess(x = yrs, y = series, f = f36)
residual.36<-series/f36.lo$y
par(mfcol=c(2,1),mar=c(0.1,3,3,3),mgp=c(1.2,0.2,0),tcl=0.5)
plot(yrs,series,type="l",xlab="Year",ylab="Temperature",col="cadetblue3")
abline(h=-0.15)
lines(yrs,f36.lo$y, col = "chocolate3", lwd = 2)
axis(3);axis(4)
par(mar=c(3,3,0.1,3))
plot(yrs,residual.36,type="l",ylab="Residuals",xlab="Years",col="cadetblue3")
abline(h=1500)

120 Months
n<-length(series)
yrs<-1:n
f120 <- 120/n
f120.lo <- lowess(x = yrs, y = series, f = f120)
residual.120<-series/f120.lo$y
par(mfcol=c(2,1),mar=c(0.1,3,3,3),mgp=c(1.2,0.2,0),tcl=0.5)
plot(yrs,series,type="l",xlab="Year",ylab="Temperature",col="cadetblue3")
abline(h=-0.15)
lines(yrs,f120.lo$y, col = "chocolate3", lwd = 2)
axis(3);axis(4)
par(mar=c(3,3,0.1,3))
plot(yrs,residual.120,type="l",ylab="Residuals",xlab="Years",col="cadetblue3")
abline(h=1500)

240 Months
n<-length(series)
yrs<-1:n
f240 <- 240/n
f240.lo <- lowess(x = yrs, y = series, f = f240)
residual.240<-series/f240.lo$y
par(mfcol=c(2,1),mar=c(0.1,3,3,3),mgp=c(1.2,0.2,0),tcl=0.5)
plot(yrs,series,type="l",xlab="Year",ylab="Temperature",col="cadetblue3")
abline(h=-0.15)
lines(yrs,f240.lo$y, col = "chocolate3", lwd = 2)
axis(3);axis(4)
par(mar=c(3,3,0.1,3))
plot(yrs,residual.240,type="l",ylab="Residuals",xlab="Years",col="cadetblue3")

I’m not sure what these residuals mean…I’ll revisit this idea.

Aggregation Work

Here I am going to aggregate my monthly data into yearly data. This will remove a lot of extra information. It makes sense to do that with global temperatures, since temperature varies over seasons, we can take an average of one year, and see if there is a trend. Note, I am not truncating my data since the last value was collected in December of 2015.
years<-1850:2015
berk.filter<-filter(x=berk$Series, filter=rep(x=1/12,times=12),sides=1)
berk.yr.avg<-berk.filter[seq(from=12, to=length(berk$Series),by=12)]
yearly.avg<-data.frame(years,berk.yr.avg)

mean(berk$Series[1:12]) #Check to make sure the mean is the same
## [1] -0.4525833
berk.yr.avg[1] #Whoop! It is! 
## [1] -0.4525833
Plot of aggregated data
par(tcl=0.5,mar=rep(3,4),mgp=c(1.1,0.1,0),xaxs="i")
plot(years,berk.yr.avg,,type="l",ylab="Annual Temperature Increase",xlab="Years")

Spectral Analysis

Three Oceanic Patterns

library(repmis)
githubURL <- "https://github.com/AndyBunn/TeachingData/raw/master/climate_Time_Series_Extravaganza.Rdata"
source_data(githubURL) 
## Downloading data from: https://github.com/AndyBunn/TeachingData/raw/master/climate_Time_Series_Extravaganza.Rdata
## SHA-1 hash of the downloaded data file is:
## 5f5b467b42520fdfbb85cfc8a1fbcc45197e0e56
##  [1] "loti.ts"      "loti.zoo"     "ice.ts"       "ice.zoo"     
##  [5] "sealevel.ts"  "sealevel.zoo" "sunspots.ts"  "sunspots.zoo"
##  [9] "enso.ts"      "enso.zoo"     "amo.ts"       "amo.zoo"     
## [13] "pdo.ts"       "pdo.zoo"      "ohc.ts"       "ohc.zoo"     
## [17] "co2.ts"       "co2.zoo"
oceans3<-cbind(ENSO=enso.ts,PDO=pdo.ts,AMO=amo.ts)
par(tcl=0.5,mar=rep(2.5,4),mgp=c(1.1,0.1,0),xaxs="i")
plot(oceans3,main="Three Ocean Patterns", col="mediumpurple4")

par(mfrow=c(2,1))
spectrum(enso.ts,log="no")
spectrum(enso.ts,log="no", xlim=c(0,1))

enso.vec<-as.vector(enso.ts)
enso.time <- as.vector(time(enso.ts))
wavelet.plot(morlet(y1=enso.vec,x1=enso.time),
             crn.lab= "", useRaster= T, reverse.y=T)

#We have the greatest power between the periods 16-64. This corresponds to the cycle in which El Nino occurs. This is the dominant frequency. 
From my understanding of El Nino, it is occurs every few years, so it makes sense that there are spikes, and sharp drops.
par(mfrow=c(2,1))
spectrum(pdo.ts,log="no")
spectrum(pdo.ts,log="no", xlim=c(0,1))

pdo.vec<-as.vector(pdo.ts)
pdo.time <- as.vector(time(pdo.ts))
wavelet.plot(morlet(y1=pdo.vec,x1=pdo.time),
             crn.lab= "", useRaster= T, reverse.y=T)

#Not sure about this oceanic pattern, there is some power between 32-64 months, but it isn't as powerful of a dominant frequency.

Looks like there is a memory around 0.2 (~2 years), which probaly is a result of the giant peak at 0.1. However, I’m not sure how the Pacific Decadal Oscillation. Though, it is a recurring ocean-atmosphere pattern.

par(mfrow=c(2,1))
spectrum(amo.ts,log="no")
spectrum(amo.ts,log="no", xlim=c(0,0.4))

amo.vec<-as.vector(amo.ts)
amo.time <- as.vector(time(amo.ts))
wavelet.plot(morlet(y1=amo.vec,x1=amo.time),
             crn.lab= "", useRaster= T, reverse.y=T)

#Note the dark red region just before 1900. This is a region of greatest power which occurs right around 128 months, or ~10 years. This is the dominant frequency. 

Yeah, I have an inclination that we don’t know enough about this event. A large peak at the first of the periodogram, and maybe a memory around 1 year?