Exercise 2.2

Download the file tute1.csv from [the book website](https://raw.githubusercontent.com/waheeb123/Datasets/main/tute1.csv), open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

a. You can read the data into R with the following script:

library(fpp2)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## ── Attaching packages ────────────────────────────────────────────── fpp2 2.5 ──
## ✔ ggplot2   3.5.1     ✔ fma       2.5  
## ✔ forecast  8.20      ✔ expsmooth 2.3
## 
library(readxl)
tute1 <- read.csv("https://raw.githubusercontent.com/waheeb123/Datasets/main/tute1.csv", header = TRUE)
View(tute1)
head(tute1)
##        X  Sales AdBudget   GDP
## 1 Mar-81 1020.2    659.2 251.8
## 2 Jun-81  889.2    589.0 290.9
## 3 Sep-81  795.0    512.5 290.8
## 4 Dec-81 1003.9    614.1 292.4
## 5 Mar-82 1057.7    647.2 279.1
## 6 Jun-82  944.4    602.0 254.0
summary(tute1)
##       X                 Sales           AdBudget          GDP       
##  Length:100         Min.   : 735.1   Min.   :489.9   Min.   :249.3  
##  Class :character   1st Qu.: 871.1   1st Qu.:569.5   1st Qu.:271.4  
##  Mode  :character   Median : 960.6   Median :608.5   Median :282.6  
##                     Mean   : 948.7   Mean   :591.9   Mean   :281.2  
##                     3rd Qu.:1018.7   3rd Qu.:635.0   3rd Qu.:290.3  
##                     Max.   :1115.5   Max.   :665.9   Max.   :330.6
str(tute1)
## 'data.frame':    100 obs. of  4 variables:
##  $ X       : chr  "Mar-81" "Jun-81" "Sep-81" "Dec-81" ...
##  $ Sales   : num  1020 889 795 1004 1058 ...
##  $ AdBudget: num  659 589 512 614 647 ...
##  $ GDP     : num  252 291 291 292 279 ...
# Convert the first column to Date format to extract the start year
start_date <- as.Date(paste0("01-", tute1$X[1]), format="%d-%b-%y")
start_year <- as.numeric(format(start_date, "%Y"))
start_year
## [1] 1981

b. Convert the data to a time series.

# tute1[, -1] drops the first column (the quarter labels); ts() supplies the time index instead
mytimeseries <- ts(tute1[, -1], start = 1981, frequency = 4)
# tsdisplay() is meant for a single series, so pass one column at a time
tsdisplay(mytimeseries[, "Sales"])
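
A quick way to check that the conversion lined up the quarters as intended is to inspect the time attributes of the result. The sketch below rebuilds only the six rows printed by head(tute1) above, so it runs without downloading anything. The names tute1_head and ts_head are introduced here purely for illustration.

```r
# First six rows, copied from the head(tute1) output shown earlier
tute1_head <- data.frame(
  X        = c("Mar-81", "Jun-81", "Sep-81", "Dec-81", "Mar-82", "Jun-82"),
  Sales    = c(1020.2, 889.2, 795.0, 1003.9, 1057.7, 944.4),
  AdBudget = c(659.2, 589.0, 512.5, 614.1, 647.2, 602.0),
  GDP      = c(251.8, 290.9, 290.8, 292.4, 279.1, 254.0)
)

# Same conversion as for the full data: drop the label column, quarterly from 1981 Q1
ts_head <- ts(tute1_head[, -1], start = c(1981, 1), frequency = 4)

start(ts_head)      # 1981 1  (first observation is 1981 Q1, i.e. "Mar-81")
end(ts_head)        # 1982 2  (sixth observation is 1982 Q2, i.e. "Jun-82")
frequency(ts_head)  # 4
colnames(ts_head)   # the three series names survive the conversion
```

The same three calls on mytimeseries should report a start of 1981 Q1 and a frequency of 4 if the full file was read as shown.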

c. Construct time series plots of each of the three series.

# facets = TRUE draws each of the three series in its own panel
autoplot(mytimeseries, facets = TRUE)

Exercise 2.3

Download some monthly Australian retail data from OTexts.org/fpp2/extrafiles/retail.xlsx. These represent retail sales in various categories for different Australian states, and are stored in an MS-Excel file.

a. Read the data into R.
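
The write-up does not show the code for this step, so the following is only a minimal sketch of one way to do it. The helper name read_retail, the local file name, and the https:// prefix on the URL are assumptions made here. skip = 1 assumes the sheet has two header rows.

```r
# Hypothetical helper (not from the exercise): download the workbook once,
# then read it. read_excel() needs a local file, not a URL.
read_retail <- function(path = "retail.xlsx",
                        url = "https://OTexts.org/fpp2/extrafiles/retail.xlsx") {
  if (!file.exists(path)) {
    # mode = "wb" keeps the binary .xlsx intact on Windows
    download.file(url, path, mode = "wb")
  }
  # skip = 1: assumes the sheet has two header rows, so the first is dropped
  readxl::read_excel(path, skip = 1)
}
```

With the helper defined, `retaildata <- read_retail()` creates the retaildata object that the next step indexes by column name.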

b. Select one of the time series as follows (but replace the column name with your own chosen column):

myts <- ts(retaildata[, "A3349337W"], frequency = 12, start = c(1982, 4))

c. Explore your chosen retail time series using the following functions:

autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()

Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

autoplot(myts) + ggtitle("A3349337W") +
  xlab("Year") + ylab("Sales")

The time plot shows strong seasonality and an upward trend. Although there is a dip in the 1990-2000 period, there is not yet any evidence that it is part of a cycle.

ggseasonplot(myts, year.labels = TRUE, year.labels.left = TRUE) +
  ylab("Sales") + ggtitle("Seasonal Plot of A3349337W")

The seasonal plot emphasizes the seasonality of the data. Sales start to rise from about September, spike sharply between November and December, and fall off again in January, which is consistent with Christmas shopping and holiday sales.

ggsubseriesplot(myts) + ylab("Sales") +
  ggtitle("Seasonal Subseries Plot of A3349337W")

Again, the subseries plot highlights the seasonality of the data, and shows it more clearly than the seasonal plot. Although sales rise from September, the lowest values in those months stay at about the same level as in the rest of the year. The only real difference is December, where both the highest and the lowest values are higher than in any other month.
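
The "floor" and "ceiling" read off the subseries plot can also be computed directly as the minimum and maximum for each calendar month. The sketch below uses a made-up monthly series (flat level, a December boost, random noise) rather than the retail data, so the numbers only illustrate the technique. The name demo and all of its parameters are assumptions.

```r
set.seed(1)

# Hypothetical stand-in for myts: 10 years, level 100, +40 every December
n_years <- 10
month   <- rep(1:12, times = n_years)
demo    <- ts(100 + 40 * (month == 12) + rnorm(12 * n_years, sd = 5),
              frequency = 12, start = c(1982, 1))

# cycle() gives the month position (1-12) of every observation
monthly_min <- tapply(demo, cycle(demo), min)  # the "floor" per month
monthly_max <- tapply(demo, cycle(demo), max)  # the "ceiling" per month

round(rbind(min = monthly_min, max = monthly_max), 1)
```

Replacing demo with myts gives the per-month ranges for the chosen retail series, which can then be compared against what the plot suggests.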

gglagplot(myts)

The lag plot is hard to read for this series. Some panels suggest a positive relationship and others a negative one, but the number of panels, combined with the monthly frequency, makes it difficult to discern much.

ggAcf(myts)

The slow decay of the autocorrelations as the lag increases reflects the trend, while the scalloped shape reflects the seasonality of the sales data.
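
To see why trend and seasonality produce that pattern, it can help to compute the autocorrelations for a series where both are known to be present. This is a sketch on synthetic data (a linear trend plus a December spike plus noise), not on the retail series. The name synthetic and its parameters are assumptions chosen for illustration, and base acf() is used in place of ggAcf() so that the values can be inspected as numbers.

```r
set.seed(42)

# Synthetic monthly series: 20 years starting April 1982
n          <- 240
time_index <- ts(seq_len(n), frequency = 12, start = c(1982, 4))
december   <- as.numeric(cycle(time_index) == 12)
synthetic  <- 0.5 * time_index + 30 * december + rnorm(n, sd = 2)

# Autocorrelations at lags 1..36 (element 1 of $acf is lag 0, so drop it)
r <- acf(synthetic, lag.max = 36, plot = FALSE)$acf[-1]

round(r[c(11, 12, 13)], 3)  # local peak at the seasonal lag 12
round(r[c(12, 24, 36)], 3)  # seasonal peaks shrink as the lag grows
```

The local peaks at lags 12, 24 and 36 are the "scallops", and their gradual decline comes from the trend. Calling ggAcf(synthetic) draws the same values in the style used above.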