Libraries to use.

1 Time Series

1.1 Creating Time Series.

          Qtr1      Qtr2      Qtr3      Qtr4
2010            1.995615  1.763062  1.452173
2011  4.779934  5.718965  8.779271  8.389820
2012 10.365962  5.398546  5.021821  6.601211
2013  8.987069 10.687979 13.110406 10.572172
2014  8.693758  7.936813 11.620355 11.202458
2015 18.759441 12.399895 13.134812 11.559743
2016 14.003463 13.301188 15.671638 14.750984
2017 15.016441 11.195322 14.881908          
Time Series:
Start = 1980 
End = 2014 
Frequency = 1 
 [1]  1.995615  1.763062  1.452173  4.779934  5.718965  8.779271  8.389820
 [8] 10.365962  5.398546  5.021821  6.601211  8.987069 10.687979 13.110406
[15] 10.572172  8.693758  7.936813 11.620355 11.202458 18.759441 12.399895
[22] 13.134812 11.559743 14.003463 13.301188 15.671638 14.750984 15.016441
[29] 11.195322 14.881908  1.995615  1.763062  1.452173  4.779934  5.718965
Loading required package: zoo

Attaching package: 'zoo'
The following objects are masked from 'package:base':

    as.Date, as.Date.numeric
                [,1]
2016-01-01  1.995615
2016-01-02  1.763062
2016-01-03  1.452173
2016-01-04  4.779934
2016-01-05  5.718965
2016-01-06  8.779271
2016-01-07  8.389820
2016-01-08 10.365962
2016-01-09  5.398546
2016-01-10  5.021821
2016-01-11  6.601211
2016-01-12  8.987069
2016-01-13 10.687979
2016-01-14 13.110406
2016-01-15 10.572172
2016-01-16  8.693758
2016-01-17  7.936813
2016-01-18 11.620355
2016-01-19 11.202458
2016-01-20 18.759441
2016-01-21 12.399895
2016-01-22 13.134812
2016-01-23 11.559743
2016-01-24 14.003463
2016-01-25 13.301188
2016-01-26 15.671638
2016-01-27 14.750984
2016-01-28 15.016441
2016-01-29 11.195322
2016-01-30 14.881908

1.3 Visualisation

1.3.2 ggplot

1.3.3 Plotly

1.3.4 dygraphs

1.4 Data For Example

1.4.1 Lag, Difference, and Autocorrelation

           Jan       Feb       Mar       Apr       May       Jun       Jul
2009                                                                      
2010  1.000000 -1.839818 -3.017105 -1.842691  5.234162  9.841046  8.316776
2011 22.237127 16.637862  3.658659  7.686553 12.333570 17.485469 24.962183
2012 25.156549 26.243030 23.531001 20.708705 11.434616 12.531440 22.490719
2013 24.744799 24.425151 24.882243 36.510455 46.747396 47.942016 47.605089
2014 56.182945                                                            
           Aug       Sep       Oct       Nov       Dec
2009                                          0.000000
2010  6.511892  6.177134  8.671431 11.451401 19.952744
2011 28.333312 28.007451 30.626409 35.790274 38.104631
2012 37.753707 36.636157 36.766465 35.771098 25.376246
2013 56.433489 61.217564 48.260981 35.207093 40.722804
2014                                                  
             Jan         Feb         Mar         Apr         May
2010                           1.0000000  -2.8398178  -1.1772875
2011   2.7799700   8.5013429   2.2843828  -5.5992647 -12.9792034
2012   5.1638647   2.3143578 -12.9480828   1.0864811  -2.7120285
2013  -0.9953678 -10.3948518  -0.6314469  -0.3196474   0.4570921
2014 -13.0538878   5.5157105  15.4601410                        
             Jun         Jul         Aug         Sep         Oct
2010   1.1744144   7.0768531   4.6068836  -1.5242693  -1.8048846
2011   4.0278940   4.6470171   5.1518996   7.4767133   3.3711296
2012  -2.8222960  -9.2740888   1.0968231   9.9592793  15.2629881
2013  11.6282120  10.2369407   1.1946202  -0.3369278   8.8284007
2014                                                            
             Nov         Dec
2010  -0.3347581   2.4942974
2011  -0.3258617   2.6189584
2012  -1.1175503   0.1303086
2013   4.7840752 -12.9565837
2014                        

Autocorrelations of series 'mytsdata', by lag

0.0000 0.0833 0.1667 0.2500 0.3333 0.4167 0.5000 0.5833 0.6667 0.7500 
 1.000  0.859  0.708  0.609  0.519  0.405  0.337  0.325  0.314  0.278 
0.8333 0.9167 1.0000 1.0833 1.1667 1.2500 1.3333 
 0.245  0.223  0.186  0.156  0.122  0.049 -0.016 

1.4.3 Trend


Call:
lm(formula = mytsdata ~ time(mytsdata))

Residuals:
     Min       1Q   Median       3Q      Max 
-17.0685  -5.6420   0.6023   6.3084  16.8171 

Coefficients:
                  Estimate  Std. Error t value Pr(>|t|)    
(Intercept)    -23966.4357   1958.8331  -12.23 2.30e-16 ***
time(mytsdata)     11.9230      0.9735   12.25 2.22e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.278 on 48 degrees of freedom
Multiple R-squared:  0.7576,    Adjusted R-squared:  0.7525 
F-statistic:   150 on 1 and 48 DF,  p-value: 2.221e-16
Analysis of Variance Table

Response: mytsdata
               Df  Sum Sq Mean Sq F value    Pr(>F)    
time(mytsdata)  1 10279.2 10279.2     150 2.221e-16 ***
Residuals      48  3289.4    68.5                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


    Shapiro-Wilk normality test

data:  linearMod$residuals
W = 0.98465, p-value = 0.7566

1.4.4 Regression with Lagged Time Series


Time series regression with "ts" data:
Start = 2010(3), End = 2014(3)

Call:
dynlm(formula = mytsdata ~ L(mytsdata, 1))

Residuals:
     Min       1Q   Median       3Q      Max 
-14.5796  -3.5823   0.0598   3.2617  15.4817 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     2.72080    1.68965    1.61    0.114    
L(mytsdata, 1)  0.93266    0.05967   15.63   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.675 on 47 degrees of freedom
Multiple R-squared:  0.8387,    Adjusted R-squared:  0.8352 
F-statistic: 244.3 on 1 and 47 DF,  p-value: < 2.2e-16

1.4.5 Seasonality

1.4.6 Making Stationary


    Augmented Dickey-Fuller Test

data:  mytsdata
Dickey-Fuller = -3.3154, Lag order = 3, p-value = 0.07932
alternative hypothesis: stationary
[1] 0
[1] 1

    Augmented Dickey-Fuller Test

data:  stationaryTS
Dickey-Fuller = -3.9373, Lag order = 3, p-value = 0.01973
alternative hypothesis: stationary

1.4.7 Forecast

Series: ts.stl$time.series[, 3] 
ARIMA(2,0,0) with zero mean 

Coefficients:
         ar1      ar2
      0.8735  -0.6074
s.e.  0.1167   0.1158

sigma^2 estimated as 18.39:  log likelihood=-143.36
AIC=292.72   AICc=293.24   BIC=298.45

Training set error measures:
                    ME     RMSE      MAE      MPE    MAPE      MASE
Training set 0.1290637 4.201792 3.438581 124.7528 161.216 0.4287052
                   ACF1
Training set 0.03800343

1.4.8 Entir Process

The data are clearly non-stationary, as the series wanders up and down for long periods. Consequently, we will take a first difference of the data.

The PACF shown is suggestive of an AR(2) model. So an initial candidate model is an ARIMA(2,1,0). There are no other obvious candidate models.

Series: mytsdatadj 
ARIMA(2,1,0) 

Coefficients:
         ar1      ar2
      0.3716  -0.4716
s.e.  0.1341   0.1319

sigma^2 estimated as 28.68:  log likelihood=-151.02
AIC=308.04   AICc=308.57   BIC=313.71


    Ljung-Box test

data:  Residuals from ARIMA(2,1,0)
Q* = 7.5718, df = 8, p-value = 0.4764

Model df: 2.   Total lags used: 10

1.4.9 Seasonal Package

1.5 Examples

1.5.1 White Noise

Smoothing

1.5.2 Trend and White Noise

1.5.3 Autoregresive

1.5.4 Seasonality and White Noise

1.5.5 Cross-correlation


Attaching package: 'astsa'
The following object is masked from 'package:seasonal':

    unemp
The following object is masked from 'package:forecast':

    gas

1.6 Real Life Examples

data(jj)
plot(jj, type="o", ylab="Quarterly Earnings per Share")

jjdata<-data.frame(Y=as.matrix(jj), date=as.Date(time(jj)))
jjdata %>%
plot_ly(x = ~date, y = ~Y, mode = 'lines+markers', type = "scatter", name="Confirm", width = 850, height = 750)

library(dygraphs)
dygraph(jj)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector()

dygraph(globtemp)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector()

dygraph(speech)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector()

#as.xts(AirPassengers)
dygraph(AirPassengers)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector()

lungDeaths <- cbind(mdeaths, fdeaths)
dygraph(lungDeaths)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector()

SoiRec <- cbind(soi, rec) # Southern Oscillation Index and Recruitment
dy_graph <- list(dygraph(soi)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector(),
  dygraph(rec)%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector())
htmltools::browsable(htmltools::tagList(dy_graph))

SoiRecdata<-data.frame(as.matrix(SoiRec), date=as.Date(time(SoiRec)))
subplot(SoiRecdata %>%
plot_ly(x = ~date, y = ~soi, mode = 'lines+markers', type = "scatter", name="soi"),
SoiRecdata %>%
plot_ly(x = ~date, y = ~rec, mode = 'lines+markers', type = "scatter", name="rec"), nrows = 2, shareX = TRUE)

dy_graph <- list(
dygraph(fmri1[,2:5])%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector(),

dygraph(fmri1[,6:9])%>% 
  dyOptions(drawPoints = TRUE, pointSize = 4)%>%
  dyHighlight(highlightSeriesOpts = list(strokeWidth = 3))%>% 
  dyRangeSelector())
htmltools::browsable(htmltools::tagList(dy_graph))

1.7 Exercise

Response mytsX :

Call:
lm(formula = mytsX ~ time(mytsdata))

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5583 -0.6055  0.1441  0.8457  2.5433 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)    426.38967   32.54121   13.10   <2e-16 ***
time(mytsdata)  -0.21471    0.01633  -13.15   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.179 on 98 degrees of freedom
Multiple R-squared:  0.6382,    Adjusted R-squared:  0.6345 
F-statistic: 172.9 on 1 and 98 DF,  p-value: < 2.2e-16


Response mytsY :

Call:
lm(formula = mytsY ~ time(mytsdata))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.5659 -1.0276 -0.2509  1.0234  3.0951 

Coefficients:
                  Estimate  Std. Error t value Pr(>|t|)    
(Intercept)    -1189.71594    39.87865  -29.83   <2e-16 ***
time(mytsdata)     0.60070     0.02001   30.02   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.444 on 98 degrees of freedom
Multiple R-squared:  0.9019,    Adjusted R-squared:  0.9009 
F-statistic: 900.9 on 1 and 98 DF,  p-value: < 2.2e-16
Analysis of Variance Table

               Df  Pillai approx F num Df den Df    Pr(>F)    
(Intercept)     1 0.96816  1474.56      2     97 < 2.2e-16 ***
time(mytsdata)  1 0.92730   618.61      2     97 < 2.2e-16 ***
Residuals      98                                             
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


    Augmented Dickey-Fuller Test

data:  mytsdata[, 1]
Dickey-Fuller = -3.4149, Lag order = 4, p-value = 0.05601
alternative hypothesis: stationary
[1] 0
[1] 1

    Augmented Dickey-Fuller Test

data:  stationaryTS
Dickey-Fuller = -5.7598, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

    Augmented Dickey-Fuller Test

data:  mytsdata[, 2]
Dickey-Fuller = -2.801, Lag order = 4, p-value = 0.2448
alternative hypothesis: stationary
[1] 0
[1] 1

    Augmented Dickey-Fuller Test

data:  stationaryTS
Dickey-Fuller = -5.3845, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

The data are clearly non-stationary, as the series wanders up and down for long periods. Consequently, we will take a first difference of the data.

The PACF shown is suggestive of an AR(1) model. So an initial candidate model is an ARIMA(1,1,0). There are no other obvious candidate models.

Series: mytsdatadj 
ARIMA(1,1,0) 

Coefficients:
          ar1
      -0.5128
s.e.   0.0861

sigma^2 estimated as 1.542:  log likelihood=-161.56
AIC=327.12   AICc=327.24   BIC=332.31


    Ljung-Box test

data:  Residuals from ARIMA(1,1,0)
Q* = 12.871, df = 7, p-value = 0.07533

Model df: 1.   Total lags used: 8

Series: mytsdatadj 
ARIMA(0,1,1) with drift 

Coefficients:
          ma1    drift
      -0.7453  -0.0545
s.e.   0.1093   0.0305

sigma^2 estimated as 1.362:  log likelihood=-155.16
AIC=316.33   AICc=316.58   BIC=324.11


    Ljung-Box test

data:  Residuals from ARIMA(0,1,1) with drift
Q* = 8.1736, df = 6, p-value = 0.2257

Model df: 2.   Total lags used: 8