library(fpp2)

Question 1 - Examine Datasets

Use the help function to explore what the series gold, woolyrnq and gas represent.

Use autoplot() to plot each of these in separate plots. What is the frequency of each series? Hint: apply the frequency() function. Use which.max() to spot the outlier in the gold series. Which observation was it?

Gold Dataset

We can use the help function to learn more about datasets. The gold dataset is a time series which holds daily morning gold prices in US dollars from January 1985 to March 1989.

help(gold)

When using autoplot, we can see the time series in full. There is an anomaly at around time 770.

autoplot(gold)

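The frequency of this dataset is 1. The help text describes daily data, so this does not mean annual observations; a frequency of 1 means no seasonal period has been set, and the series is indexed by observation number rather than by calendar date.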

frequency(gold)
## [1] 1
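
To see what frequency() is reporting, here is a minimal sketch using base R's ts() on made-up numbers (the toy vectors are illustrative only and are not part of the exercise). A series built without a frequency argument defaults to 1, which is the value we get for gold.

```r
# Toy series (made-up values) built with base R's ts()
no_season <- ts(c(5, 7, 6, 8, 9, 7))                       # frequency defaults to 1
quarterly <- ts(1:8, start = c(2000, 1), frequency = 4)    # 4 observations per year
monthly   <- ts(1:24, start = c(2000, 1), frequency = 12)  # 12 observations per year

frequency(no_season)
frequency(quarterly)
frequency(monthly)

# With frequency 1 the time index is simply 1, 2, 3, ...
time(no_season)
```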

We can use which.max() to find the position of the outlier.

which.max(gold)
## [1] 770

Wow! I was exactly right earlier when I guessed 770.
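
which.max() only returns a position. Below is a small sketch, on a made-up series with a planted spike and a missing value (not the gold data), of how that position can be turned into the value itself and its time index.

```r
# Made-up series: one obvious spike and one missing value
x <- ts(c(5, 6, 7, 50, 6, NA, 5, 7))

idx <- which.max(x)   # position of the largest value; NAs are skipped
idx
x[idx]                # the value at that position
time(x)[idx]          # its time index
```

Replacing x with gold should give the price recorded at observation 770 in the same way.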

woolyrnq Dataset

help("woolyrnq")

Here we can see a general trend going down.

autoplot(woolyrnq)

The frequency of this dataset is 4, which suggests quarterly frequency.

frequency(woolyrnq)
## [1] 4

We can use which.max() to find the position of the largest observation.

which.max(woolyrnq)
## [1] 21

gas Dataset

help(gas)

Here we have a clear upward trend over time. There are major dips and peaks for each cycle, but this is because of varying gas usage across various seasons. The frequency as we will see below is 12, indicating a monthly frequency.

autoplot(gas)

The frequency of this dataset is 12. From the help text, this describes a monthly frequency.

frequency(gas)
## [1] 12

We can use which.max() to find the position of the largest observation.

which.max(gas)
## [1] 475
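
For a monthly series, a raw position such as 475 is easier to read as a month and year. This is a rough sketch on a hypothetical monthly series starting in January 2000 (the values and start date are assumptions for illustration, not the gas data), using time() and cycle() to translate the position.

```r
# Hypothetical monthly series starting January 2000 (not the gas data)
x <- ts(c(1:19, 100, 21:30), start = c(2000, 1), frequency = 12)

idx <- which.max(x)               # position of the largest observation
yr  <- floor(time(x)[idx])        # calendar year of that observation
mon <- month.abb[cycle(x)[idx]]   # month name from the position within the year
paste(mon, yr)
```

The same three lines could be applied to gas to see which month the maximum falls in.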

Question 2 - Tute Dataset

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

tute1 <- read.csv("tute1.csv", header=TRUE)
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
autoplot(mytimeseries, facets=TRUE)

By setting facets = FALSE, each of the time series will be graphed on the same vertical axis. This can be useful to compare the absolute scale of different series.

autoplot(mytimeseries, facets=FALSE)
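
The ts() call above drops the first column with tute1[,-1] and tells R that the remaining columns are quarterly and start in 1981. A minimal sketch on a made-up data frame (the column name quarter and all values are hypothetical stand-ins, not the contents of tute1.csv) shows what that conversion produces.

```r
# Hypothetical stand-in for the CSV: a label column plus three numeric series
toy <- data.frame(
  quarter  = c("Q1", "Q2", "Q3", "Q4", "Q1", "Q2"),
  Sales    = c(100, 90, 80, 100, 105, 95),
  AdBudget = c(60, 55, 50, 60, 62, 58),
  GDP      = c(250, 255, 260, 258, 262, 265)
)

# Drop the label column and declare quarterly data starting in 1981 Q1
toy_ts <- ts(toy[, -1], start = 1981, frequency = 4)

colnames(toy_ts)                      # the three series names are kept
start(toy_ts)                         # first year and quarter
end(toy_ts)                           # last year and quarter
window(toy_ts, start = c(1982, 1))    # rows from 1982 Q1 onwards
```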

Question 3 - Australian Retail Data

Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in an MS-Excel file.

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))
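
In the ts() call, frequency=12 with start=c(1982,4) means the first observation is the fourth month of 1982, that is, April. Here is a quick sketch on a placeholder vector (not the retail figures) to confirm how R reads that start.

```r
# Placeholder values 1..12 with the same frequency and start as myts
x <- ts(1:12, frequency = 12, start = c(1982, 4))

start(x)                  # year and month position of the first observation
month.abb[cycle(x)[1]]    # month name of the first observation
end(x)                    # twelve months later the series ends in the next year
```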

Autoplot results show a clear positive trend in the series. There are seasonal peaks which likely correspond to end-of-year shopping trends.

autoplot(myts)

Our previous theory is validated when we look at the season plot below. December and November (end of year) have the largest share of sales. We can also see that this share has increased over the years as more and more people shop end of year.

ggseasonplot(myts)

While all months have seen increased spend over time, we can see that November and December are the ‘outlier’ months, with the largest change over time.

ggsubseriesplot(myts)
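
The horizontal lines in a subseries plot mark the mean of each month. As a rough numeric counterpart, the sketch below computes those monthly means with base R on a made-up series (the values are illustrative, not the retail data); the same tapply() call could be pointed at myts.

```r
# Made-up monthly series: flat for ten months, higher in November and December,
# with the whole pattern scaled up each year
pattern <- c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 15, 20)
x <- ts(c(pattern, 2 * pattern, 3 * pattern), start = c(2000, 1), frequency = 12)

# Average all Januaries, all Februaries, and so on
monthly_means <- tapply(as.numeric(x), as.integer(cycle(x)), mean)
names(monthly_means) <- month.abb
monthly_means

# Month with the highest average
names(which.max(monthly_means))
```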

gglagplot(myts)

ggAcf(myts)
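
To put numbers on what an ACF plot shows, here is a small sketch using base R's acf() with plot = FALSE on a made-up, purely seasonal monthly series (not the retail data). For a series with a 12-month pattern we would expect the autocorrelation at lag 12 to stand out against a lag such as 6.

```r
# Made-up series: the same 12-month pattern repeated for five years
pattern <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 10)
x <- ts(rep(pattern, 5), frequency = 12)

# Sample autocorrelations; element k + 1 holds the value for lag k
r <- acf(x, lag.max = 24, plot = FALSE)$acf[, 1, 1]

r[1]    # lag 0, always 1
r[7]    # lag 6, half a season apart
r[13]   # lag 12, one full season apart
```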

Question 4

Create time plots of the following time series: bicoal, chicken, dole, usdeaths, lynx, goog, writing, fancy, a10, h02

autoplot(bicoal)

autoplot(chicken)

autoplot(dole)

autoplot(usdeaths)

autoplot(lynx)

autoplot(goog)

autoplot(writing)

autoplot(fancy)

autoplot(a10)

autoplot(h02)

Question 5

Use the ggseasonplot() and ggsubseriesplot() functions to explore the seasonal patterns in the following time series: writing, fancy, a10, h02.

Writing Dataset

help(writing)

We can see a clear seasonal pattern and trend in the “Sales of printing and writing paper” dataset. While sales have been rising over the years, there has always been a sharp drop in sales in August. Typically sales will increase from November to December. There are a few years which break this trend, namely 1970.

ggseasonplot(writing)

ggsubseriesplot(writing)

Fancy Dataset

help(fancy)

In the “Sales for a souvenir shop” dataset, there is clear seasonality, with spend increasing towards the end of the year. Sales begin to spike around November and into December from year to year. Additionally, there seems to be a small bump in sales every March, except in the year 1993.

ggseasonplot(fancy)

ggsubseriesplot(fancy)

A10 Dataset

help(a10)

It is interesting to note that in the “Monthly anti-diabetic drug subsidy in Australia from 1991 to 2008” dataset, there tends to be a seasonal pattern of decreased spending from the first to the second month of each year. This could be due to general increased awareness in the first month and not necessarily decreased awareness in the second month.

Past the second month, there is a general trend upwards in spend.

Over the course of multiple years, there is a clear upward trend in spend.

ggseasonplot(a10)

ggsubseriesplot(a10)

H02 Dataset

help(h02)

There is a clear drop between January and February in the “Monthly corticosteroid drug subsidy in Australia from 1991 to 2008” for every year on record. Spend increases over the course of the year, matching or surpassing initial spend.

ggseasonplot(h02)

ggsubseriesplot(h02)
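
A claim like "there is a drop between January and February every year" can also be checked numerically. This is a minimal sketch on a made-up monthly series (the values are hypothetical, not the h02 data) that pulls out the January and February values with cycle() and compares them year by year.

```r
# Made-up monthly series over three years with a dip every February
pattern <- c(10, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
x <- ts(c(pattern, pattern + 2, pattern + 4), start = c(2000, 1), frequency = 12)

jan <- x[cycle(x) == 1]   # all January values, one per year
feb <- x[cycle(x) == 2]   # all February values, one per year

feb - jan                 # negative in every year means a consistent drop
all(feb < jan)
```

If this is applied to h02, the two vectors need to cover the same years for the subtraction to line up, so it is worth checking the start and end months of the series first.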

Question 8

The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.

1 -> B

2 -> A

3 -> D

4 -> C