Data Sources and References

Borrowed from: ryankelly

Loading Necessay Libraries

library(astsa, quietly=TRUE, warn.conflicts=FALSE)
library(ggplot2)
library(knitr)
library(printr)
library(plyr)
library(dplyr)
library(lubridate)
library(gridExtra)
library(reshape2)
library(TTR)

Reading Time Series

Here we use the file http://robjhyndman.com/tsdldata/misc/kings.dat contains data on the age of death of successive kings of England, starting with William the Conqueror (original source: Hipel and Mcleod, 1994). You can read data into R using the scan() function, which assumes that your data for successive time points is in a simple text file with one column. The first three lines contain some comment on the data, and we want to ignore this when we read the data into R.

kings <- scan('http://robjhyndman.com/tsdldata/misc/kings.dat', skip=3)
head(kings)
## [1] 60 43 67 50 56 42

Converting the data into a time series

Once you have read the time series data into R, the next step is to store the data in a time series object in R, so that you can use R’s many functions for analysing time series data. To store the data in a time series object, we use the ts() function in R.

kings <- ts(kings)
kings
## Time Series:
## Start = 1 
## End = 42 
## Frequency = 1 
##  [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43 69
## [24] 59 48 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

However, it is common to come across time series that have been collected at regular intervals that are less than the one year of the kings dataset, for example, monthly, weekly or quarterly. In these cases we can specify the number of times that data was collected per year by using the frequency parameter in the ts( ) function. For monthly data, we set frequency = 12. We can also specify the first year that the data were collected and the first interval in that year by using the ‘stat’ parameter. For example, the third quarter of 1909 would be `start = c(1909, 3).

Next we load in a dataset of number of births per month in New York city, from January 1946 to December 1958.

births <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")

births <- ts(births, frequency = 12, start = c(1946, 1))
births
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1946 26.663 23.598 26.931 24.740 25.806 24.364 24.477 23.901 23.175 23.227
## 1947 21.439 21.089 23.709 21.669 21.752 20.761 23.479 23.824 23.105 23.110
## 1948 21.937 20.035 23.590 21.672 22.222 22.123 23.950 23.504 22.238 23.142
## 1949 21.548 20.000 22.424 20.615 21.761 22.874 24.104 23.748 23.262 22.907
## 1950 22.604 20.894 24.677 23.673 25.320 23.583 24.671 24.454 24.122 24.252
## 1951 23.287 23.049 25.076 24.037 24.430 24.667 26.451 25.618 25.014 25.110
## 1952 23.798 22.270 24.775 22.646 23.988 24.737 26.276 25.816 25.210 25.199
## 1953 24.364 22.644 25.565 24.062 25.431 24.635 27.009 26.606 26.268 26.462
## 1954 24.657 23.304 26.982 26.199 27.210 26.122 26.706 26.878 26.152 26.379
## 1955 24.990 24.239 26.721 23.475 24.767 26.219 28.361 28.599 27.914 27.784
## 1956 26.217 24.218 27.914 26.975 28.527 27.139 28.982 28.169 28.056 29.136
## 1957 26.589 24.848 27.543 26.896 28.878 27.390 28.065 28.141 29.048 28.484
## 1958 27.132 24.924 28.963 26.589 27.931 28.009 29.229 28.759 28.405 27.945
## 1959 26.076 25.286 27.660 25.951 26.398 25.565 28.865 30.000 29.261 29.012
##         Nov    Dec
## 1946 21.672 21.870
## 1947 21.759 22.073
## 1948 21.059 21.573
## 1949 21.519 22.025
## 1950 22.084 22.991
## 1951 22.964 23.981
## 1952 23.162 24.707
## 1953 25.246 25.180
## 1954 24.712 25.688
## 1955 25.693 26.881
## 1956 26.291 26.987
## 1957 26.634 27.735
## 1958 25.912 26.619
## 1959 26.992 27.897

Next loading data on beach town souvenir shop.

gift <- scan("http://robjhyndman.com/tsdldata/data/fancy.dat")
gift<- ts(gift, frequency=12, start=c(1987,1))
gift
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 1987   1664.81   2397.53   2840.71   3547.29   3752.96   3714.74   4349.61
## 1988   2499.81   5198.24   7225.14   4806.03   5900.88   4951.34   6179.12
## 1989   4717.02   5702.63   9957.58   5304.78   6492.43   6630.80   7349.62
## 1990   5921.10   5814.58  12421.25   6369.77   7609.12   7224.75   8121.22
## 1991   4826.64   6470.23   9638.77   8821.17   8722.37  10209.48  11276.55
## 1992   7615.03   9849.69  14558.40  11587.33   9332.56  13082.09  16732.78
## 1993  10243.24  11266.88  21826.84  17357.33  15997.79  18601.53  26155.15
##            Aug       Sep       Oct       Nov       Dec
## 1987   3566.34   5021.82   6423.48   7600.60  19756.21
## 1988   4752.15   5496.43   5835.10  12600.08  28541.72
## 1989   8176.62   8573.17   9690.50  15151.84  34061.01
## 1990   7979.25   8093.06   8476.70  17914.66  30114.41
## 1991  12552.22  11637.39  13606.89  21822.11  45060.69
## 1992  19888.61  23933.38  25391.35  36024.80  80721.71
## 1993  28586.52  30505.41  30821.33  46634.38 104660.67

Plotting Time Series

Plot the kings data.

plot.ts(kings)

At this point we could guess that this time series could be described using an additive model, since the random fluctuations in the data are roughly constant in size over time.

Plotting the births data.

plot.ts(births)

We can see from this time series that there is certainly some seasonal variation in the number of births per month; there is a peak every summer, and a trough every winter. Again the it seems like this could be described using an additive model, as the seasonal fluctuations are roughly constant in size over time and do not seem to depend on the level of the time series, and the random fluctuations seem constant over time.

plot.ts(gift)

In this case, an additive model is not appropriate since the size of the seasonal and random fluctuations change over time and the level of the time series. It is then appropriate to transform the time series so that we can model the data with a classic additive model.

logGift <- log(gift)
plot.ts(logGift)

Decomposing Time Series

Decomposing a time series means separating it into it’s constituent components, which are often a trend component and a random component, and if the data is seasonal, a seasonal component.

Decomposing non-Seasonal Data

Recall that non-seasonal time series consist of a trend component and a random component. Decomposing the time series involves tying to separate the time series into these individual components.

One way to do this is using some smoothing method, such as a simple moving average. The SMA() function in the TTR R package can be used to smooth time series data using a moving average. The SMA function takes a span argument as n order. To calculate the moving average of order 5, we set n = 5.

Lets start with n=3 to see a clearer picture of the Kings dataset trend component

kingsSMA3 <- SMA(kings, n=3)
plot.ts(kingsSMA3)

It seems like there is still some random fluctuations in the data, we might want to try a big larger of a smoother. Lets try n=8.

kingsSMA3 <- SMA(kings, n=8)
plot.ts(kingsSMA3)

This is better, we can see that the death of English kings has declined from ~55 years to ~40 years for a brief period, followed by a rapid increase in the next 20 years to ages in the 70’s.

Decomposing Seasonal Data

A seasonal time series, in addition to the trend and random components, also has a seasonal component. Decomposing a seasonal time series means separating the time series into these three components. In R we can use the decompose() function to estimate the three components of the time series.

Lets estimate the trend, seasonal, and random components of the New York births dataset.

birthsComp <- decompose(births)

birthsComp
## $x
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1946 26.663 23.598 26.931 24.740 25.806 24.364 24.477 23.901 23.175 23.227
## 1947 21.439 21.089 23.709 21.669 21.752 20.761 23.479 23.824 23.105 23.110
## 1948 21.937 20.035 23.590 21.672 22.222 22.123 23.950 23.504 22.238 23.142
## 1949 21.548 20.000 22.424 20.615 21.761 22.874 24.104 23.748 23.262 22.907
## 1950 22.604 20.894 24.677 23.673 25.320 23.583 24.671 24.454 24.122 24.252
## 1951 23.287 23.049 25.076 24.037 24.430 24.667 26.451 25.618 25.014 25.110
## 1952 23.798 22.270 24.775 22.646 23.988 24.737 26.276 25.816 25.210 25.199
## 1953 24.364 22.644 25.565 24.062 25.431 24.635 27.009 26.606 26.268 26.462
## 1954 24.657 23.304 26.982 26.199 27.210 26.122 26.706 26.878 26.152 26.379
## 1955 24.990 24.239 26.721 23.475 24.767 26.219 28.361 28.599 27.914 27.784
## 1956 26.217 24.218 27.914 26.975 28.527 27.139 28.982 28.169 28.056 29.136
## 1957 26.589 24.848 27.543 26.896 28.878 27.390 28.065 28.141 29.048 28.484
## 1958 27.132 24.924 28.963 26.589 27.931 28.009 29.229 28.759 28.405 27.945
## 1959 26.076 25.286 27.660 25.951 26.398 25.565 28.865 30.000 29.261 29.012
##         Nov    Dec
## 1946 21.672 21.870
## 1947 21.759 22.073
## 1948 21.059 21.573
## 1949 21.519 22.025
## 1950 22.084 22.991
## 1951 22.964 23.981
## 1952 23.162 24.707
## 1953 25.246 25.180
## 1954 24.712 25.688
## 1955 25.693 26.881
## 1956 26.291 26.987
## 1957 26.634 27.735
## 1958 25.912 26.619
## 1959 26.992 27.897
## 
## $seasonal
##             Jan        Feb        Mar        Apr        May        Jun
## 1946 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1947 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1948 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1949 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1950 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1951 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1952 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1953 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1954 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1955 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1956 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1957 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1958 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
## 1959 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
##             Jul        Aug        Sep        Oct        Nov        Dec
## 1946  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1947  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1948  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1949  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1950  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1951  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1952  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1953  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1954  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1955  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1956  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1957  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1958  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 1959  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 
## $trend
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1946       NA       NA       NA       NA       NA       NA 23.98433
## 1947 22.35350 22.30871 22.30258 22.29479 22.29354 22.30562 22.33483
## 1948 22.43038 22.43667 22.38721 22.35242 22.32458 22.27458 22.23754
## 1949 22.06375 22.08033 22.13317 22.16604 22.17542 22.21342 22.27625
## 1950 23.21663 23.26967 23.33492 23.42679 23.50638 23.57017 23.63888
## 1951 24.00083 24.12350 24.20917 24.28208 24.35450 24.43242 24.49496
## 1952 24.27204 24.27300 24.28942 24.30129 24.31325 24.35175 24.40558
## 1953 24.78646 24.84992 24.92692 25.02362 25.16308 25.26963 25.30154
## 1954 25.92446 25.92317 25.92967 25.92137 25.89567 25.89458 25.92963
## 1955 25.64612 25.78679 25.93192 26.06388 26.16329 26.25388 26.35471
## 1956 27.21104 27.21900 27.20700 27.26925 27.35050 27.37983 27.39975
## 1957 27.44221 27.40283 27.44300 27.45717 27.44429 27.48975 27.54354
## 1958 27.68642 27.76067 27.75963 27.71037 27.65783 27.58125 27.49075
## 1959 26.96858 27.00512 27.09250 27.17263 27.26208 27.36033       NA
##           Aug      Sep      Oct      Nov      Dec
## 1946 23.66213 23.42333 23.16112 22.86425 22.54521
## 1947 22.31167 22.26279 22.25796 22.27767 22.35400
## 1948 22.21988 22.16983 22.07721 22.01396 22.02604
## 1949 22.35750 22.48862 22.70992 22.98563 23.16346
## 1950 23.75713 23.86354 23.89533 23.87342 23.88150
## 1951 24.48379 24.43879 24.36829 24.29192 24.27642
## 1952 24.44475 24.49325 24.58517 24.70429 24.76017
## 1953 25.34125 25.42779 25.57588 25.73904 25.87513
## 1954 25.98246 26.01054 25.88617 25.67087 25.57312
## 1955 26.40496 26.45379 26.64933 26.95183 27.14683
## 1956 27.44150 27.45229 27.43354 27.44488 27.46996
## 1957 27.56933 27.63167 27.67804 27.62579 27.61212
## 1958 27.46183 27.42262 27.34175 27.25129 27.08558
## 1959       NA       NA       NA       NA       NA
## 
## $random
##               Jan          Feb          Mar          Apr          May
## 1946           NA           NA           NA           NA           NA
## 1947 -0.237305288  0.863252404  0.543893429  0.175887019 -0.793193109
## 1948  0.183819712 -0.318705929  0.340268429  0.121262019 -0.354234776
## 1949  0.161444712  0.002627404 -0.571689904 -0.749362981 -0.666068109
## 1950  0.064569712 -0.292705929  0.479560096  1.047887019  1.561973558
## 1951 -0.036638622  1.008460737  0.004310096  0.556595353 -0.176151442
## 1952  0.203153045  0.079960737 -0.376939904 -0.853612981 -0.576901442
## 1953  0.254736378 -0.122955929 -0.224439904 -0.159946314  0.016265224
## 1954 -0.590263622 -0.536205929  0.189810096  1.079303686  1.062681891
## 1955  0.021069712  0.535169071 -0.073439904 -1.787196314 -1.647943109
## 1956 -0.316846955 -0.918039263 -0.155523237  0.507428686  0.924848558
## 1957 -0.176013622 -0.471872596 -0.762523237  0.240512019  1.182056891
## 1958  0.122778045 -0.753705929  0.340851763 -0.319696314  0.021515224
## 1959 -0.215388622  0.363835737 -0.295023237 -0.419946314 -1.115734776
##               Jun          Jul          Aug          Sep          Oct
## 1946           NA -0.963379006 -0.925718750 -0.939949519 -0.709369391
## 1947 -1.391369391 -0.311879006  0.347739583  0.150592147  0.076797276
## 1948  0.001672276  0.256412660  0.119531250 -0.623449519  0.289547276
## 1949  0.813838942  0.371704327  0.225906250  0.081758814 -0.578161058
## 1950  0.166088942 -0.423920673 -0.467718750 -0.433157853 -0.418577724
## 1951  0.387838942  0.499995994 -0.030385417 -0.116407853 -0.033536058
## 1952  0.538505609  0.414370994  0.206656250  0.025133814 -0.161411058
## 1953 -0.481369391  0.251412660  0.100156250  0.148592147  0.110880609
## 1954  0.380672276 -0.679670673 -0.269052083 -0.550157853 -0.282411058
## 1955  0.118380609  0.550245994  1.029447917  0.768592147  0.359422276
## 1956 -0.087577724  0.126204327 -0.437093750 -0.087907853  0.927213942
## 1957  0.053505609 -0.934587340 -0.592927083  0.724717147  0.030713942
## 1958  0.581005609  0.282204327  0.132572917  0.290758814 -0.171994391
## 1959 -1.642077724           NA           NA           NA           NA
##               Nov          Dec
## 1946 -0.082484776 -0.298388622
## 1947  0.591098558  0.095819712
## 1948  0.154806891 -0.076221955
## 1949 -0.356859776 -0.761638622
## 1950 -0.679651442 -0.513680288
## 1951 -0.218151442  0.081403045
## 1952 -0.432526442  0.323653045
## 1953  0.616723558 -0.318305288
## 1954  0.150890224  0.491694712
## 1955 -0.149068109  0.110986378
## 1956 -0.044109776 -0.106138622
## 1957  0.117973558  0.499694712
## 1958 -0.229526442 -0.089763622
## 1959           NA           NA
## 
## $figure
##  [1] -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556
##  [7]  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
## 
## $type
## [1] "additive"
## 
## attr(,"class")
## [1] "decomposed.ts"

Now lets plot the components .

plot(birthsComp)

Seasonally Adjusting

If you have a seasonal time series, you can seasonally adjust the series by estimating the seasonal component, and subtracting it from the original time series. We can see below that time time series simply consists of the trend and random components.

birthsSeasonAdj <- births - birthsComp$seasonal
plot.ts(birthsSeasonAdj)