Analysis of Spatio-Temporal Data

Exercise #04

1) Analyze the auto- and cross-correlograms for the meteo data set, for the variables temperature, pressure, humidity and wind speed; give an interpretation of the results.

## Load meteo dataset
load("/home/jan/R/meteo.RData")
## store temperature, pressure, humidity and wind speed from dataset as
## matrix
observation = data.frame(date = as.POSIXct(meteo$date), temp = meteo$T.outside, 
    pres = meteo$pressure, humi = meteo$humidity, wind = meteo$windspeed)
summary(observation)
##       date                          temp            pres      
##  Min.   :2007-06-14 14:55:00   Min.   : 7.43   Min.   :-76.8  
##  1st Qu.:2007-06-16 08:06:15   1st Qu.:14.90   1st Qu.:-71.5  
##  Median :2007-06-18 01:18:30   Median :18.45   Median :-70.2  
##  Mean   :2007-06-18 01:18:30   Mean   :18.13   Mean   :-70.4  
##  3rd Qu.:2007-06-19 18:30:45   3rd Qu.:20.86   3rd Qu.:-69.0  
##  Max.   :2007-06-21 11:43:00   Max.   :29.51   Max.   :-65.5  
##       humi            wind      
##  Min.   : 32.4   Min.   :0.000  
##  1st Qu.: 61.2   1st Qu.:0.355  
##  Median : 77.0   Median :0.793  
##  Mean   : 74.9   Mean   :1.115  
##  3rd Qu.: 90.9   3rd Qu.:1.584  
##  Max.   :100.1   Max.   :6.876
rnames = colnames(observation[, 2:5])
# split graphics window
par(mfrow = (c(2, 2)))
# plot 4 ACF
for (i in rnames) acf(observation[, i], main = i)

plot of chunk unnamed-chunk-1

obs_matrix = observation[, 2:5]
row.names(obs_matrix) = as.numeric(observation$date)
## Warning: non-unique value when setting 'row.names': '1181833080'
## Error: duplicate 'row.names' are not allowed
acf(obs_matrix)

plot of chunk unnamed-chunk-1

4) How are lattice data usually represented in GIS?

Lattice date in GIS is represented as raster data, stored in a relational database.



Excercise #02

1) Load the meteo dataset and fit a model for the seasonal component using LOESS. Plot ACF and PACF of the residuals. Try to fit a suitable AR(p) to the residuals using Box-Jenkins.

load("c:/meteo.RData")
## Warning: cannot open compressed file 'c:/meteo.RData', probable reason 'No
## such file or directory'
## Error: cannot open the connection
tempData = data.frame(date = as.numeric(meteo$date), temp = meteo$T.outside)
tempLoess = loess(temp ~ date, span = 0.05, tempData)
tempPredict = predict(tempLoess, data.frame(date = tempData$date))
plot(x = meteo$date, y = tempData$temp, type = "p", pch = 16, main = "LOESS for T.outside of meteo.RData", 
    ylab = "Temperature", xlab = "")
lines(tempData$date, tempPredict, col = "red")

plot of chunk unnamed-chunk-2

tempResiduals = tempData$temp - tempPredict
plot(tempResiduals, type = "l")

plot of chunk unnamed-chunk-2

ar(tempResiduals)
## 
## Call:
## ar(x = tempResiduals)
## 
## Coefficients:
##      1       2       3       4       5       6       7       8       9  
##  1.801  -1.087   0.388  -0.192   0.089  -0.013  -0.039   0.049  -0.021  
##     10      11      12      13      14      15      16      17      18  
##  0.008  -0.001   0.007  -0.029   0.000   0.047  -0.002  -0.067   0.100  
##     19      20      21  
## -0.087   0.056  -0.028  
## 
## Order selected 21  sigma^2 estimated as  0.00257
pacf(tempResiduals)

plot of chunk unnamed-chunk-2

2) Compute the periodogram for the CO2 data without trend removal. What do you find?

N = length(co2)
I = abs(fft(co2)/sqrt(N))^2
P = (4/N) * I
f = (0:floor(N/2))/N
plot(f, I[1:((N/2) + 1)], type = "o", xlab = "frequency", ylab = "", main = "Raw Periodogram CO2")

plot of chunk unnamed-chunk-3

3) Use function stl() to remove the trend/seasonal component from the CO2 data. Recompute the periodogram and compare it to that in the previous exercise.

model = stl(co2, s.window = "periodic")
trend = model$time.series[, "trend"]
co2.ts.cor = co2 - trend
N = length(co2.ts.cor)
I = abs(fft(co2.ts.cor)/sqrt(N))^2
P = (4/N) * I
f = (0:floor(N/2))/N
plot(f, I[1:((N/2) + 1)], type = "o", xlab = "frequency", ylab = "", main = "De-trended Periodogram")

plot of chunk unnamed-chunk-4

4) Draw 500 samples from the standart normal distribution. Compute the periodogram of this white noise time series.

noise = rnorm(1:500)
N = 500
I = abs(fft(noise)/sqrt(N))^2
P = (4/N) * I
f = (0:floor(N/2))/N
plot(f, I[1:((N/2) + 1)], type = "o", xlab = "frequency", ylab = "", main = "Periodogram of White Noise")

plot of chunk unnamed-chunk-5

5) Simulate from the following cosine model Y(t) = A * cos(2 πωt + φ) + W(t) with ω = 1/50, A = 2, φ = 0.6π and W(t) ≈ N(0,σ²) Simulate once with σ² = 0 and once with σ² = 5.

noise = rnorm(500, 0, sqrt(5))
t = 1:500
model = 2 * cos(1/50 * 2 * pi * t + 0.6 * pi)
ysim = model + noise
plot(ysim)
lines(model, type = "l", col = "red")

plot of chunk unnamed-chunk-6

7) Periodograms are difficult to interpret and may be biased. Find out what is meant by “leakage” and “antialiasing”. Write down a brief explaination of these phenomena.



Exercise #01

1) Questions about time series

2) Carry out example of HoltWinters and explain what the partial plots obtained represent

require(graphics)
# Seasonal Holt-Winters
m <- HoltWinters(co2)
plot(m)

plot of chunk unnamed-chunk-7

plot(fitted(m))

plot of chunk unnamed-chunk-8

m <- HoltWinters(AirPassengers, seasonal = "mult")
plot(m)

plot of chunk unnamed-chunk-9

## Non-Seasonal Holt-Winters
x <- uspop + rnorm(uspop, sd = 5)
m <- HoltWinters(x, gamma = FALSE)
plot(m)

## Exponential Smoothing
m2 <- HoltWinters(x, gamma = FALSE, beta = FALSE)
lines(fitted(m2)[, 1], col = 3)

plot of chunk unnamed-chunk-10

3) Describe what the package forecast does

The forecast package is a framework to implement automatic forecast algorithms for time series in R. Therefore, different models (mainly exponetial smoothing and ARIMA) are used for parameter optimization for a set of cases. An appropriate model is selected from the set, according to a penalized likelihood. Then point forecasts are produced from which the prediction intervals are obtained.