library(fpp2)
library(zoo)
library(plotly)
Use the help function to explore what the series ‘gold’, ‘woolyrnq’ and ‘gas’ represent.
‘gold’: Daily morning gold prices in US dollars. 1 January 1985 – 31 March 1989.
‘woolyrnq’: Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994.
‘gas’: Australian monthly gas production: 1956–1995.
Use autoplot() to plot each of these in separate plots.
autoplot(gold)
autoplot(woolyrnq)
autoplot(gas)
What is the frequency of each series? Hint: apply the frequency() function.
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
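As a rough illustration of what frequency() reports, the sketch below builds three small series with base R's ts(). The values and the names annual, quarterly and monthly are made up for this example. The function returns the number of observations per seasonal period stored with the object, which is why a series stored without a seasonal period, as gold is here, reports 1.

```r
# Made-up values; only the time-series attributes matter here.
annual    <- ts(1:5,  start = 2000)                       # no seasonal period
quarterly <- ts(1:8,  start = c(2000, 1), frequency = 4)  # 4 observations per year
monthly   <- ts(1:24, start = c(2000, 1), frequency = 12) # 12 observations per year

freqs <- c(annual    = frequency(annual),
           quarterly = frequency(quarterly),
           monthly   = frequency(monthly))
print(freqs)

# cycle() gives the position of each observation within its period.
print(cycle(quarterly))
```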
Use which.max() to spot the outlier in the gold series. Which observation was it?
paste0("Observation: ", which.max(gold), ", Value: ", gold[which.max(gold)])
## [1] "Observation: 770, Value: 593.7"
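To make the behaviour of which.max() concrete, here is a minimal sketch on a made-up series (the name prices and its values are hypothetical). It assumes only base R, and shows that the function returns the position of the first maximum and skips missing values, which matters when a series has gaps.

```r
# Made-up daily prices with one gap and one spike (hypothetical values).
prices <- ts(c(310.2, 311.0, NA, 309.8, 593.7, 312.4))

idx  <- which.max(prices)   # position of the first maximum; NAs are ignored
peak <- prices[idx]         # value at that position
when <- time(prices)[idx]   # time index of that observation

print(c(index = idx, value = peak, time = when))
```

With a series that carries calendar information, time() at the same index would give the date of the outlier rather than just its position.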
You can read the data into R with the following script:
tute1 <- read.csv("tute1.csv", header=TRUE)
Convert the data to time series
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
(The [,-1] removes the first column which contains the quarters as we don’t need them now.)
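The conversion step can be tried without the CSV file. The sketch below uses a hypothetical stand-in data frame, tute1_demo, with made-up values and assumed column names (a label column followed by three numeric columns); it is not the real tute1 data.

```r
# Hypothetical stand-in for tute1.csv: a label column plus three numeric columns.
tute1_demo <- data.frame(
  X        = c("Mar-81", "Jun-81", "Sep-81", "Dec-81", "Mar-82", "Jun-82"),
  Sales    = c(1000, 900, 800, 1000, 1050, 950),
  AdBudget = c(650, 600, 500, 600, 650, 600),
  GDP      = c(250, 290, 290, 290, 280, 255)
)

# Drop the label column, then attach a quarterly time index starting in 1981 Q1.
demo_ts <- ts(tute1_demo[, -1], start = 1981, frequency = 4)

print(colnames(demo_ts))  # the three remaining series
print(time(demo_ts))      # 1981.00, 1981.25, ... one step per quarter
```

The time index comes entirely from start and frequency, so the quarter labels in the first column are not needed once the rows are known to be in order.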
Construct time series plots of each of the three series
autoplot(mytimeseries, facets=TRUE)
Check what happens when you don’t include facets=TRUE.
autoplot(mytimeseries)
You can read the data into R with the following script:
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
The second argument (skip=1) is required because the Excel sheet has two header rows.
Select one of the time series as follows (but replace the column name with your own chosen column):
myts <- ts(retaildata[,"A3349791W"],
frequency=12, start=c(1982,4))
Explore your chosen retail time series using the following functions:
autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()
autoplot(myts)
ggseasonplot(myts)
The seasonal plot shows a clear seasonal pattern, and the upward shift of the lines from year to year points to a positive trend.
ggsubseriesplot(myts)
The subseries plot shows a positive trend within each month, with peaks in
November and December and slight lows in February and March, consistent
with the seasonal plot.
gglagplot(myts)
The lag plot shows positive linear relationships at all lags. The
strongest is at lag 12, which confirms an annual seasonality.
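Each panel of a lag plot is a scatter of the series against a lagged copy of itself, so the strength of each panel can be summarised by a correlation. The sketch below is one way to compute that with base R. It uses the built-in AirPassengers series as a stand-in for myts, and lag_cor is a hypothetical helper written for this example.

```r
y <- AirPassengers  # built-in monthly series, used only as a stand-in for myts

# Correlation between y_t and y_{t-k}: the relationship one lag-plot panel displays.
lag_cor <- function(x, k) {
  x <- as.numeric(x)
  n <- length(x)
  cor(x[(k + 1):n], x[1:(n - k)])
}

cors <- sapply(1:12, function(k) lag_cor(y, k))
names(cors) <- paste0("lag", 1:12)
print(round(cors, 3))
```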
ggAcf(myts)
The autocorrelation plot shows that the autocorrelation at lag 12 (r12) is higher than at any other lag, which confirms the annual seasonality. All spikes lie outside the bounds marked by the blue lines, which confirms that the series is not white noise.
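The comparison against the blue lines can also be done numerically. The following sketch assumes the usual 95% bounds of approximately ±1.96/√T and uses base R's acf() on the built-in AirPassengers series as a stand-in for myts, so the numbers it prints are not those of the retail series.

```r
y   <- AirPassengers                     # built-in monthly series, stand-in for myts
fit <- acf(y, lag.max = 24, plot = FALSE)

r     <- as.numeric(fit$acf)[-1]         # drop lag 0, which is always 1
bound <- qnorm(0.975) / sqrt(length(y))  # approx. 1.96 / sqrt(T)

significant <- abs(r) > bound            # TRUE where a spike is outside the bounds
print(round(r[c(1, 12, 24)], 3))
print(sum(significant))
```

For a white noise series, roughly 95% of the spikes would be expected to fall inside the bounds, so a large count of significant lags is evidence against white noise.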
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
We can see seasonality, with peaks in the last two months of the year. We do not see any signs of cyclicity, but there is a positive trend throughout the series. The ACF plot confirms that the series is not white noise.
Create time plots of the following time series: ‘bicoal’, ‘chicken’, ‘dole’, ‘usdeaths’, ‘lynx’, ‘goog’, ‘writing’, ‘fancy’, ‘a10’, ‘h02’.
Use help() to find out about the data in each series.
‘bicoal’: Annual bituminous coal production in the USA: 1920–1968.
‘chicken’: Price of chicken in US (constant dollars): 1924–1993.
‘dole’: Monthly total of people on unemployment benefits in Australia (Jan 1965 – Jul 1992).
‘usdeaths’: Monthly accidental deaths in USA.
‘lynx’: Annual Canadian lynx trappings: 1821–1934.
‘goog’: Closing stock prices of GOOG from the NASDAQ exchange, for 1000 consecutive trading days between 25 February 2013 and 13 February 2017. Adjusted for splits. goog200 contains the first 200 observations from goog.
‘writing’: Industry sales for printing and writing paper (in thousands of French francs): Jan 1963 – Dec 1972.
‘fancy’: Monthly sales for a souvenir shop on the wharf at a beach resort town in Queensland, Australia.
‘a10’: Monthly antidiabetic drug subsidy in Australia from 1991 to 2008.
‘h02’: Monthly corticosteroid drug subsidy in Australia from 1991 to 2008.
autoplot(bicoal)
autoplot(chicken)
autoplot(dole)
autoplot(usdeaths)
autoplot(lynx)
For the goog plot, modify the axis labels and title.
autoplot(goog) + ggtitle("Closing Stock Prices of GOOG") +
ylab("Closing Price (US$)") +
xlab("Year")
autoplot(writing)
autoplot(fancy)
autoplot(a10)
autoplot(h02)
Use the ggseasonplot() and ggsubseriesplot() functions to explore the seasonal patterns in the following time series: writing, fancy, a10, h02.
ggseasonplot(writing)
ggseasonplot(fancy)
ggseasonplot(a10)
ggseasonplot(h02)
ggsubseriesplot(writing)
ggsubseriesplot(fancy)
ggsubseriesplot(a10)
ggsubseriesplot(h02)
What can you say about the seasonal patterns?
Writing: We see a sharp valley in the middle of the year, with a steep decline from July to August and a steep increase from August to September. This might be due to the school year ending around June/July and the next one beginning around August/September. Outside this dip, sales are fairly stable, with regular smaller highs and lows.
Fancy: We see a large increase in sales from October to December, with sales reaching their lowest point in January, from where a gradual increase starts again. This can be attributed to end-of-year vacations.
a10: We see a large jump in January, likely reflecting sales made at the end of December but registered in January. February has consistently been the lowest point.
h02: We see a large jump in January, followed by a sharp decline in February and a gradual increase over the following months to December. The January jump is likely due to end-of-year sales being registered in January, after which sales rise steadily through the year.
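Month-by-month patterns like those described above can be checked numerically by averaging each calendar month across years, which is what the horizontal lines in a subseries plot show. This is a minimal sketch in base R; it uses the built-in USAccDeaths series purely as a stand-in, so its output says nothing about writing, fancy, a10 or h02.

```r
y <- USAccDeaths  # built-in monthly series (1973-1978), used only as a stand-in

# Mean of each calendar month across all years.
monthly_means <- as.vector(tapply(as.numeric(y), as.integer(cycle(y)), mean))
names(monthly_means) <- month.abb

print(round(monthly_means))
print(names(which.max(monthly_means)))  # month with the highest average
print(names(which.min(monthly_means)))  # month with the lowest average
```

Replacing y with one of the series above (after loading fpp2) would give the corresponding monthly averages.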
Can you identify any unusual years?
We cannot find any years that stand out for any of the series.
Match the time plots with the ACF plots
data.frame(Time_Plot = c(1,2,3,4), ACF = c("B","A","D","C"))
## Time_Plot ACF
## 1 1 B
## 2 2 A
## 3 3 D
## 4 4 C