library(moments)
library(repmis)
githubURL <- "https://github.com/AndyBunn/TeachingData/raw/master/climate_Time_Series_Extravaganza.Rdata"
source_data(githubURL)
## Downloading data from: https://github.com/AndyBunn/TeachingData/raw/master/climate_Time_Series_Extravaganza.Rdata
## SHA-1 hash of the downloaded data file is:
## 5f5b467b42520fdfbb85cfc8a1fbcc45197e0e56
## [1] "loti.ts" "loti.zoo" "ice.ts" "ice.zoo"
## [5] "sealevel.ts" "sealevel.zoo" "sunspots.ts" "sunspots.zoo"
## [9] "enso.ts" "enso.zoo" "amo.ts" "amo.zoo"
## [13] "pdo.ts" "pdo.zoo" "ohc.ts" "ohc.zoo"
## [17] "co2.ts" "co2.zoo"
plot(ice.ts)
summary(ice.ts)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.620 9.255 12.390 11.740 14.500 16.480
class(ice.ts)
## [1] "ts"
Just looking at the plot, I can see a somewhat negative trend in the data over time.
kt <-kurtosis(ice.ts)
sk <- skewness(ice.ts)
my.xlim <- range(ice.ts)
h<-hist(ice.ts, breaks=10, col="lightblue", xlab="ice extent (sq km)",main="",xlim=my.xlim)
xfit<-seq(min(ice.ts),max(ice.ts),length=100)
yfit<-dnorm(xfit,mean=mean(ice.ts),sd=sd(ice.ts))
yfit <- yfit*diff(h$mids[1:2])*length(ice.ts)
lines(xfit, yfit, col="darkblue", lwd=2)
boxplot(ice.ts, horizontal=TRUE, outline=TRUE, axes=FALSE, ylim=my.xlim, col = "lightgreen", add = TRUE,boxwex=3)
text(x = 4, y=60, labels = paste("Kurtosis=",round(kt,2)),pos = 4)
text(x = 4, y=55, labels = paste("Skewness=",round(sk,2)),pos = 4)
icetime <- time(ice.ts)
ice.lm <- lm(ice.ts ~ icetime)
summary(ice.lm)
##
## Call:
## lm(formula = ice.ts ~ icetime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.3134 -2.5664 0.7485 2.8378 4.2998
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 118.86216 27.18806 4.372 1.54e-05 ***
## icetime -0.05362 0.01361 -3.940 9.46e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.094 on 445 degrees of freedom
## Multiple R-squared: 0.03371, Adjusted R-squared: 0.03154
## F-statistic: 15.52 on 1 and 445 DF, p-value: 9.457e-05
The results show that there was an annual decrease in sea ice of 0.05, or 5%. I then used a moving average filtering method in order to smooth the data.
ma10 <- filter(x=ice.ts, filter=rep(x=1/10,times=10), sides=2)
ma5 <- filter(x=ice.ts, filter=rep(x=1/5,times=5), sides=2)
plot(ice.ts,col="grey")
lines(ma10,col="red",lwd=2)
lines(ma5,col="blue",lwd=2)
abline(ice.lm, col="black",lwd=2, lty="dashed")
ice.ts<- decompose(ice.ts)
plot(ice.ts)
Decomposing the data shows a very strong seasonal component to the trends through time. You can also really see the negative trend over time without the seasonal component.
data(lynx)
plot(lynx)
summary(lynx)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 39.0 348.2 771.0 1538.0 2567.0 6991.0
class(lynx)
## [1] "ts"
The data was collected over a roughly 100 year period from 1821-1934. The plot reveals some very large fluctuations in trappings of Canadian lynx, with sharp increases and decreases every decade or so. class(lynx) reveals that the data set is in fact a time series (‘ts’).
kt <-kurtosis(lynx)
sk <- skewness(lynx)
my.xlim <- range(lynx)
h<-hist(lynx, breaks=10, col="lightblue", xlab="lynx numbers",
main="",xlim=my.xlim)
xfit<-seq(min(lynx),max(lynx),length=100)
yfit<-dnorm(xfit,mean=mean(lynx),sd=sd(lynx))
yfit <- yfit*diff(h$mids[1:2])*length(lynx)
lines(xfit, yfit, col="darkblue", lwd=2)
boxplot(lynx, horizontal=TRUE, outline=TRUE, axes=FALSE,
ylim=my.xlim, col = "lightgreen", add = TRUE, boxwex=3)
text(x =3200 , y=40, labels = paste("Kurtosis=",round(kt,2)),pos = 4)
text(x =3200 , y=33, labels = paste("Skewness=",round(sk,2)),pos = 4)
lynxtime <- time(lynx)
lynx.lm <- lm(lynx ~ lynxtime)
summary(lynx.lm)
##
## Call:
## lm(formula = lynx ~ lynxtime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1594 -1211 -755 1032 5366
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4630.034 8493.112 -0.545 0.587
## lynxtime 3.285 4.523 0.726 0.469
##
## Residual standard error: 1589 on 112 degrees of freedom
## Multiple R-squared: 0.004689, Adjusted R-squared: -0.004198
## F-statistic: 0.5276 on 1 and 112 DF, p-value: 0.4691
I interpreted the output to mean that the number of trapped lynx increased annually by 3 animals over the length of the data set.
I then used a moving average filtering method in order to smooth the data.
ma10 <- filter(x=lynx, filter=rep(x=1/10,times=10), sides=2)
ma5 <- filter(x=lynx, filter=rep(x=1/5,times=5), sides=2)
plot(lynx,col="grey")
lines(ma10,col="red",lwd=2)
lines(ma5,col="blue",lwd=2)
abline(lynx.lm, col="black",lwd=2, lty="dashed")
The 10 year filter does a pretty good job of smoothing out the data. You can see a very slight increase in Lynx numbers over time.
I played around with the filter lengths a little.
ma8 <- filter(x=lynx, filter=rep(x=1/8,times=8), sides=2)
ma16 <- filter(x=lynx, filter=rep(x=1/16,times=16), sides=2)
plot(lynx,col="grey")
lines(ma8,col="red",lwd=2)
lines(ma16,col="blue",lwd=2)
abline(lynx.lm, col="black",lwd=2, lty="dashed")
I used an 8 and a 16 year filter and found some interesting oscillations in the smoothing filter. I think because an 8 year filter does a good job of smoothing the peaks out, if you double that filter, it really shows the cyclic nature of the data trend.For every peak in the 8 year filter, the 16 year filter has a valley. Hm.
lynx.train <- window(lynx, end = 1934)
lynx.test <- window(lynx, start = 1821)
lynx.train.ar<-ar(lynx.train)
lynx.train.forecast <- predict(lynx.train.ar, n.ahead=30)
plot(lynx,xlim=c(1820,1940),col="pink",lwd=1.5, ylab="Lynx")
lines(lynx.train,col="red",lty="dotted",lwd=2)
lines(lynx.test,col="blue",lty="dotted",lwd=2)
lines(lynx.train.forecast$pred,col="green",lwd=1.5)
lynx.train.forecast <- predict(lynx.train.ar, n.ahead=100)
plot(lynx,xlim=c(1820,2040),col="pink",lwd=1.5, ylab="Lynx")
lines(lynx.train,col="red",lty="dotted",lwd=2)
lines(lynx.test,col="blue",lty="dotted",lwd=2)
lines(lynx.train.forecast$pred,col="green",lwd=1.5)
After a decade or so, you can definitely start to see that the forecast has an exponential decay pattern…as it loses its ‘memory’.