Homework 1

Exercises: 2.1, 2.2, 2.3, 2.4, 2.5 and 2.8

Exercise 2.1: ts objects

Use the help function to explore what the series gold, woolyrnq and gas represent.

Loading libraries

library(fpp2)
## Warning: package 'fpp2' was built under R version 4.4.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## ── Attaching packages ──────────────────────────────────────────── fpp2 2.5.1 ──
## ✔ ggplot2   4.0.0      ✔ fma       2.5   
## ✔ forecast  8.24.0     ✔ expsmooth 2.3
## 

Using the Help Function to Explore Series Type

help(gold)
help(woolyrnq)
help(gas)

2.1 (A) Use autoplot() to plot each of these in separate plots.

Gold

autoplot(gold) + 
  ggtitle("Daily Morning Gold Prices") + 
  xlab("Day (1 January 1985 – 31 March 1989)") + 
  ylab("Gold Price") + 
  theme_light()

The gold series appears to be stored as a ts object whose time index is the observation number rather than a calendar date, which is why no dates are displayed on the x-axis.

Woolyrnq

autoplot(woolyrnq) + 
  ggtitle("Quarterly Production of Woollen Yarn in Australia") + 
  xlab("Year (Mar 1965 – Sep 1994)") + 
  ylab("Woollen Yarn Production (tonnes)") + 
  theme_light()

Gas

autoplot(gas) + 
  ggtitle("Australian Monthly Gas Production") + 
  xlab("Year (1956–1995)") + 
  ylab("Monthly Gas Production") + 
  theme_light()

2.1 (B) What is the frequency of each series? Hint: apply the frequency() function.

Gold

frequency(gold)
## [1] 1

A frequency of 1 means no seasonal period is defined for this series: the daily observations are simply indexed 1, 2, 3, and so on.

Woolyrnq

frequency(woolyrnq)
## [1] 4

This is quarterly data.

Gas

frequency(gas)
## [1] 12

This is monthly data.
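As a rough illustration of what frequency() reports, the sketch below builds two toy series with base R's ts(). The values and the names daily_like and quarterly are made up for this example and are not the fpp2 data.

```r
# Toy data only: these values are placeholders, not the fpp2 series.
daily_like <- ts(c(5, 7, 6, 9, 8, 10))             # no frequency supplied
quarterly  <- ts(c(5, 7, 6, 9, 8, 10, 7, 11),
                 start = c(1965, 1), frequency = 4)

frequency(daily_like)  # defaults to 1: no seasonal period is defined
frequency(quarterly)   # 4 observations per year

# The time index follows from start and frequency.
time(daily_like)       # observation numbers
time(quarterly)        # fractional years, one step per quarter
```

A series built without a frequency argument gets a plain observation-number index, which is consistent with what autoplot(gold) shows on its x-axis.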

2.1 (C) Use which.max() to spot the outlier in the gold series. Which observation was it?

which.max(gold)
## [1] 770

It is observation number 770.
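which.max() returns only the position of the maximum. A minimal sketch of how the value and its time index could be read off as well, using a made-up series (toy_prices is hypothetical and is not the gold data):

```r
# Toy series with one obvious spike; placeholder values, not gold prices.
toy_prices <- ts(c(310, 312, 309, 315, 590, 314, 311))

idx        <- which.max(toy_prices)   # position of the largest value
peak_value <- toy_prices[idx]         # the value at that position
peak_time  <- time(toy_prices)[idx]   # its time index

c(index = idx, value = peak_value, time = peak_time)
```

The same three lines should work on gold by substituting it for toy_prices.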

Exercise 2.2

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

2.2 (A) You can read the data into R with the following script:

## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (3): Sales, AdBudget, GDP
## date (1): Quarter
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
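The script itself is not shown above. The column-specification message looks like output from readr::read_csv(); the sketch below uses base R's read.csv() instead so that it runs without extra packages. The file written to a temporary path is a made-up stand-in with the same four column names. Its values are placeholders, and in practice path would point at the downloaded tute1.csv.

```r
# Stand-in file: placeholder values, NOT the real tute1.csv contents.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Quarter,Sales,AdBudget,GDP",
  "1981-03-01,100,50,200",
  "1981-06-01,110,55,205",
  "1981-09-01,105,52,210",
  "1981-12-01,120,60,215"
), path)

# Read the file; the first row supplies the column names.
tute1 <- read.csv(path, header = TRUE)
str(tute1)   # one Quarter column plus three numeric series
```

Dropping the first column with tute1[,-1] then leaves only the three numeric series for ts().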

2.2 (B) Convert the data to time series

mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)

2.2 (C.1) Construct time series plots of each of the three series

autoplot(mytimeseries, facets=TRUE) + 
  theme_light()

2.2 (C.2) Check what happens when you don’t include facets=TRUE.

autoplot(mytimeseries) + 
  theme_light()

Without facets=TRUE, all three series are drawn in a single plot, with a legend to distinguish between them.

Exercise 2.3

Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.

2.3 (A) You can read the data into R with the following script:

2.3 (B) Select one of the time series as follows (but replace the column name with your own chosen column):

myts <- ts(retaildata[,"A3349399C"],
  frequency=12, start=c(1982,4))

The column I selected is: Turnover ; New South Wales ; Clothing retailing.

2.3 (C) Explore your chosen retail time series using the following functions:

autoplot

autoplot(myts)

ggseasonplot

ggseasonplot(myts)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

ggsubseriesplot

ggsubseriesplot(myts)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

gglagplot

gglagplot(myts)

ggAcf

ggAcf(myts)

All of the graphs show that clothing purchases peak in December. Working through them taught me that the same time series can be displayed in several kinds of plots, each presenting the same information in a different way to make the same point. The plots also show a smaller peak in clothes shopping around May/June.
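One way to check a seasonal peak numerically rather than by eye is to average the series by calendar month. The sketch below uses AirPassengers from base R's datasets package purely as a stand-in monthly series; swapping in myts should work the same way. These are raw averages with no detrending, so they only indicate which months tend to be high or low.

```r
x <- AirPassengers  # stand-in monthly ts from base R, not the retail data

# cycle() gives each observation's position in the year (1 = Jan, ..., 12 = Dec).
monthly_means <- tapply(x, cycle(x), mean)
names(monthly_means) <- month.abb
monthly_means

# Month with the highest average value.
peak_month <- names(which.max(monthly_means))
peak_month
```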

Exercise 2.4

Create time plots of the following time series: bicoal, chicken, dole, usdeaths, lynx, goog, writing, fancy, a10, h02. Use help() to find out about the data in each series. For the goog plot, modify the axis labels and title.

bicoal

help(bicoal)
autoplot(bicoal) + 
  ggtitle("Annual Bituminous Coal Production") + 
  xlab("Year (1920–1968)") + 
  ylab("Coal Production") + 
  theme_light()

chicken

help(chicken)
autoplot(chicken) + 
  ggtitle("Price of Chicken in the US") + 
  xlab("Year (1924–1993)") + 
  ylab("Price of Chicken in US Dollars") + 
  theme_light()

dole

help(dole)
autoplot(dole) + 
  ggtitle("Unemployment Benefits in Australia") + 
  xlab("Year (Jan 1956 – Jul 1992)") + 
  ylab("Monthly total of people on unemployment benefits") + 
  theme_light()

usdeaths

help(usdeaths)
autoplot(usdeaths) + 
  ggtitle("Accidental deaths in USA") + 
  xlab("Year") + 
  ylab("Monthly accidental deaths in USA") + 
  theme_light()

lynx

help(lynx)
## Help on topic 'lynx' was found in the following packages:
## 
##   Package               Library
##   fma                   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
##   datasets              /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## 
## 
## Using the first match ...
autoplot(lynx) + 
  ggtitle("Annual Canadian Lynx Trappings in McKenzie River") + 
  xlab("Year (1821–1934)") + 
  ylab("Annual number of lynx trapped") + 
  theme_light()

goog

help(goog)
autoplot(goog) + 
  ggtitle("Daily closing stock prices of Google Inc") + 
  xlab("Trading day (25 February 2013 – 13 February 2017)") + 
  ylab("Closing stock prices") + 
  theme_light()

writing

help(writing)
autoplot(writing) + 
  ggtitle("Sales of Printing and Writing Paper") + 
  xlab("Year (Jan 1963 – Dec 1972)") + 
  ylab("Industry sales (in thousands of French francs)") + 
  theme_light()

fancy

help(fancy)
autoplot(fancy) + 
  ggtitle("Souvenir Shop Sales: Queensland, Australia") + 
  xlab("Year") + 
  ylab("Monthly sales") + 
  theme_light()

a10

help(a10)
autoplot(a10) + 
  ggtitle("Monthly Anti-diabetic Drug Subsidy in Australia") + 
  xlab("Year (July 1991 - June 2008)") + 
  ylab("Monthly government expenditure (millions of dollars)") + 
  theme_light()

h02

help(h02)
autoplot(h02) + 
  ggtitle("Monthly Corticosteroid Drug Subsidy in Australia") + 
  xlab("Year (July 1991 - June 2008)") + 
  ylab("Monthly government expenditure (millions of dollars)") + 
  theme_light()

Exercise 2.5

Use the ggseasonplot() and ggsubseriesplot() functions to explore the seasonal patterns in the following time series: writing, fancy, a10, h02.

What can you say about the seasonal patterns? Can you identify any unusual years?

writing: ggseasonplot

ggseasonplot(writing)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

writing: ggsubseriesplot

ggsubseriesplot(writing)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

Sales of printing and writing paper show a clear dip in August. The decline from May to August is plausible: May and June usually mark the end of the school semester, which could reduce the demand for writing and printing paper.

fancy: ggseasonplot

ggseasonplot(fancy)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

fancy: ggsubseriesplot

ggsubseriesplot(fancy)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

I’m a little surprised that sales for a souvenir shop don’t rise more prominently in the middle of the year, but the high sales in November and especially December make sense: it is the holiday season, many people take vacation then, and December is summer in Queensland.

a10: ggseasonplot

ggseasonplot(a10)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

a10: ggsubseriesplot

ggsubseriesplot(a10)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

help(a10)

The single peak in January makes sense, as the beginning of the year is when government subsidies would generally start.

h02: ggseasonplot

ggseasonplot(h02)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

h02: ggsubseriesplot

ggsubseriesplot(h02)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?

help(h02)

There appears to be seasonality here: the subsidy is highest in January, drops in February, and then increases as the year progresses. The overall demand could be explained by the versatility of corticosteroid drugs, which can treat a wide range of conditions.

Exercise 2.8

The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.

Plot 1 matches plot B. -> it shows a slow decrease
Plot 2 matches plot A. -> it shows a fluctuating plot with an overall decline
Plot 3 matches plot D. -> it shows a fluctuating graph, probably very seasonal, that is increasing
Plot 4 matches plot C. -> it shows a fluctuating plot
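A rough sketch of the reasoning behind this kind of matching, using two made-up series and base R's acf(): a trended series gives autocorrelations that start near 1 and decay slowly, while a seasonal series gives peaks at multiples of the seasonal period and troughs halfway between. The names trend_series and seasonal_series are hypothetical, and neither is one of the four series in the exercise.

```r
t_idx <- 1:120

# Toy series: a pure linear trend and a pure 12-period seasonal cycle.
trend_series    <- as.numeric(t_idx)
seasonal_series <- sin(2 * pi * t_idx / 12)

trend_acf    <- acf(trend_series,    lag.max = 24, plot = FALSE)$acf
seasonal_acf <- acf(seasonal_series, lag.max = 24, plot = FALSE)$acf

# Element 1 is lag 0, so element k + 1 holds the autocorrelation at lag k.
trend_acf[2]      # lag 1: close to 1 for a trended series
seasonal_acf[7]   # lag 6: strongly negative, half a cycle apart
seasonal_acf[13]  # lag 12: strongly positive, one full cycle apart
```

Real series usually mix both effects, so the ACF of a trended seasonal series tends to show a slow decay with seasonal ripples on top.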