library(knitr)
library(kableExtra)
#install.packages("fpp2")
library(fpp2)

Exercise 2.1

Use the help function to explore what the series gold, woolyrnq and gas represent.

# help(gold)
# help(woolyrnq)
# help(gas)
head(gold)
## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
## [1] 306.25 299.50 303.45 296.75 304.40 298.35
tsdisplay(gold)

head(woolyrnq)
##      Qtr1 Qtr2 Qtr3 Qtr4
## 1965 6172 6709 6633 6660
## 1966 6786 6800
tsdisplay(woolyrnq)

head(gas)
##       Jan  Feb  Mar  Apr  May  Jun
## 1956 1709 1646 1794 1878 2173 2321
tsdisplay(gas)

a. Use autoplot() to plot each of these in separate plots.

autoplot(gold) + ggtitle("Daily Morning Gold Prices ($) Jan 1 1985 - Mar 31 1989") +
  ylab("$") + xlab("Days") 

autoplot(woolyrnq) + ggtitle("Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994") +
  ylab("Tonnes") + xlab("")

autoplot(gas) + ggtitle("Australian monthly gas production: 1956 - 1995") +
  ylab("") + xlab("Year")

b. What is the frequency of each series? Hint: apply the frequency() function.

paste0("Gold Frequency is: ", frequency(gold))
## [1] "Gold Frequency is: 1"
paste0("Woolyrnq Frequency is: ", frequency(woolyrnq))
## [1] "Woolyrnq Frequency is: 4"
paste0("Gas Frequency is: ", frequency(gas))
## [1] "Gas Frequency is: 12"
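
As a hedged illustration of what frequency() reports, the sketch below rebuilds the first six woolyrnq values printed above as a quarterly series using only base R. The object name wool_head is an assumption made for this example and is not part of fpp2.

```r
# First six woolyrnq observations, copied from the head(woolyrnq) output above.
# wool_head is a hypothetical name used only in this sketch.
wool_head <- ts(c(6172, 6709, 6633, 6660, 6786, 6800),
                start = c(1965, 1), frequency = 4)

# frequency() returns the number of observations per unit of time (here, per year).
frequency(wool_head)

# cycle() gives the position of each observation within the year (quarter 1 to 4).
cycle(wool_head)
```

A frequency of 1, as reported for gold, means the object records no seasonal period, even though the observations themselves are daily.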

c. Use which.max() to spot the outlier in the gold series. Which observation was it?

# Locate the observation with the largest value (the outlier)
gold.outlier.when <- which.max(gold)

paste0("Gold reaches its maximum value at observation: ", gold.outlier.when)
## [1] "Gold reaches its maximum value at observation: 770"
paste0("Gold's maximum value was: ", gold[gold.outlier.when])
## [1] "Gold's maximum value was: 593.7"
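
The lookup above can be sketched on a small toy series in base R. The values and the names toy, peak_index, peak_value and peak_time are assumptions chosen for illustration; only the which.max() pattern comes from the exercise.

```r
# Toy series with a missing value and one obvious spike (illustrative values only).
toy <- ts(c(306.25, 299.50, NA, 593.70, 304.40), start = 1)

# which.max() discards NA values and returns the index of the first maximum.
peak_index <- which.max(toy)

# Look up the value and the time stamp stored at that index.
peak_value <- toy[peak_index]
peak_time <- time(toy)[peak_index]

c(index = peak_index, value = peak_value, time = peak_time)
```

Because the index is a position within the series, time() is the step that turns it into a date-like value when the series has a meaningful start.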

Exercise 2.2

Download the file tute1.csv from OTexts.org/fpp2/extrafiles/tute1.csv, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

  1. You can read the data into R with the following script:
tute1 <- read.csv("tute1.csv", header=TRUE)
kable(head(tute1))
|X      |  Sales| AdBudget|   GDP|
|:------|------:|--------:|-----:|
|Mar-81 | 1020.2|    659.2| 251.8|
|Jun-81 |  889.2|    589.0| 290.9|
|Sep-81 |  795.0|    512.5| 290.8|
|Dec-81 | 1003.9|    614.1| 292.4|
|Mar-82 | 1057.7|    647.2| 279.1|
|Jun-82 |  944.4|    602.0| 254.0|
  2. Convert the data to time series:
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
  3. Construct time series plots of each of the three series:
autoplot(mytimeseries, facets=TRUE)

Check what happens when you don’t include facets=TRUE.

autoplot(mytimeseries)

Without facets=TRUE, the three series are drawn together on a single panel with a shared y-axis instead of being split into individual plots.
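
The conversion step in this exercise can be checked without the CSV file. The sketch below rebuilds only the six rows printed by kable() above and converts them with the same ts() pattern; the names tute1_head, small_ts and sub are assumptions for this example.

```r
# The six rows shown in the table above, typed in by hand (not the full file).
tute1_head <- data.frame(
  X        = c("Mar-81", "Jun-81", "Sep-81", "Dec-81", "Mar-82", "Jun-82"),
  Sales    = c(1020.2, 889.2, 795.0, 1003.9, 1057.7, 944.4),
  AdBudget = c(659.2, 589.0, 512.5, 614.1, 647.2, 602.0),
  GDP      = c(251.8, 290.9, 290.8, 292.4, 279.1, 254.0)
)

# Drop the date label column and declare quarterly data starting in 1981 Q1.
small_ts <- ts(tute1_head[, -1], start = c(1981, 1), frequency = 4)

# The time index is now carried by the ts object rather than by column X.
time(small_ts)

# window() selects a date range: here 1981 Q3 through 1982 Q1.
sub <- window(small_ts, start = c(1981, 3), end = c(1982, 1))
sub
```

Dropping the first column matters because ts() expects numeric data; the quarter labels are replaced by the start and frequency arguments.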

Exercise 2.3

Download some monthly Australian retail data from OTexts.org/fpp2/extrafiles/retail.xlsx. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.

  1. You can read the data into R with the following script:
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
  2. Select one of the time series as follows (but replace the column name with your own chosen column):
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
  3. Explore your chosen retail time series using the following functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()
autoplot(myts)

ggseasonplot(myts)

ggsubseriesplot(myts)

gglagplot(myts, lags = 12)

ggAcf(myts)

Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

The series shows a positive trend: sales rise over the period, and the ACF values are positive and decrease slowly as the lag increases, which is typical of trended data. The seasonal lags do not stand out clearly in the ACF, and there is no evident cyclic behaviour.
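
To make the ACF reading above concrete, here is a minimal sketch using base R's acf() on AirPassengers. That dataset is a stand-in chosen because it ships with R; it is an assumption of this example and not the retail series analysed here.

```r
# AirPassengers ships with base R; it stands in for the retail series (assumption).
r <- acf(AirPassengers, lag.max = 24, plot = FALSE)

# Drop lag 0 so that position k holds the autocorrelation at lag k.
rho <- as.numeric(r$acf)[-1]

# Trend: autocorrelations stay positive and shrink slowly as the lag grows.
all(rho > 0)

# Seasonality: compare lag 12 with its neighbours; a bump suggests a 12-month pattern.
rho[11:13]
```

The same two checks can be applied to myts: slowly decaying positive values point to trend, while a bump at lags 12 and 24 would point to monthly seasonality.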

Exercise 2.6

Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.

Let’s explore hsales

head(hsales)
##      Jan Feb Mar Apr May Jun
## 1973  55  60  68  63  65  61
autoplot(hsales)

ggseasonplot(hsales)

ggsubseriesplot(hsales)

gglagplot(hsales)

ggAcf(hsales, lag.max = 400)

The plots above show seasonality and cyclicity, with no trend in the data. The seasonal peak occurs in spring, around March. The cyclic component shows up as troughs and crests with a period of roughly 10 years.

Let’s explore usdeaths

head(usdeaths)
##        Jan   Feb   Mar   Apr   May   Jun
## 1973  9007  8106  8928  9137 10017 10826
autoplot(usdeaths)

ggseasonplot(usdeaths)

ggsubseriesplot(usdeaths)

gglagplot(usdeaths)

ggAcf(usdeaths, lag.max = 60)

From the plots above, July appears to have the highest number of deaths. The usdeaths series shows clear seasonality.
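
The peak month can also be checked numerically. The sketch below uses base R's USAccDeaths, whose first values match the head(usdeaths) output above. Treating the two as the same data is an assumption of this example, and the names monthly_mean, peak_month and low_month are hypothetical.

```r
# Average the series by calendar month; cycle() gives the month number (1 to 12).
monthly_mean <- tapply(as.numeric(USAccDeaths),
                       as.integer(cycle(USAccDeaths)),
                       mean)

# Translate the position of the largest and smallest averages into month names.
peak_month <- month.abb[which.max(monthly_mean)]
low_month  <- month.abb[which.min(monthly_mean)]

c(peak = peak_month, low = low_month)
```

Averaging by month is a numeric counterpart to the horizontal mean lines that ggsubseriesplot() draws for each month.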

Let’s explore bricksq

head(bricksq)
##      Qtr1 Qtr2 Qtr3 Qtr4
## 1956  189  204  208  197
## 1957  187  214
autoplot(bricksq)

ggseasonplot(bricksq)

ggsubseriesplot(bricksq)

gglagplot(bricksq)

ggAcf(bricksq, lag.max = 200)

The plots above show a strong trend in the bricksq series, with little seasonality.

Let’s explore sunspotarea

head(sunspotarea)
## Time Series:
## Start = 1875 
## End = 1880 
## Frequency = 1 
## [1] 213.13333 109.28333  92.85833  22.21667  36.33333 446.75000
autoplot(sunspotarea)

# ggseasonplot(sunspotarea)     # skipped: seasonal plots do not work with a frequency of 1
# ggsubseriesplot(sunspotarea)  # skipped: annual data have no seasonal subseries to plot
gglagplot(sunspotarea)

ggAcf(sunspotarea, lag.max = 50)

The plots above show no clear evidence of trend or seasonality in this series. They do show strong cyclicity.
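
One hedged way to estimate the length of such a cycle is to look for the lag at which the ACF peaks again after its first dip. The sketch uses base R's sunspot.year (yearly sunspot numbers), which is a related but different series from sunspotarea. The search range of 5 to 15 lags is an assumption of this example.

```r
# sunspot.year is a base R dataset used as a stand-in for sunspotarea (assumption).
rho <- as.numeric(acf(sunspot.year, lag.max = 20, plot = FALSE)$acf)[-1]

# Search a plausible range of lags for the strongest positive autocorrelation.
# The 5:15 range is an illustrative choice, not taken from the exercise.
candidate_lags <- 5:15
cycle_length <- candidate_lags[which.max(rho[candidate_lags])]

# Approximate cycle length in years.
cycle_length
```

Unlike a seasonal period, this length is only an average: individual cycles vary in duration, which is why the series is described as cyclic and not seasonal.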

Let’s explore gasoline

head(gasoline)
## Time Series:
## Start = 1991.1 
## End = 1991.19582477755 
## Frequency = 52.1785714285714 
## [1] 6.621 6.433 6.582 7.224 6.875 6.947
autoplot(gasoline)

ggseasonplot(gasoline)

# ggsubseriesplot(gasoline)  # skipped: the call fails for this series, which has a non-integer frequency (about 52.18)
gglagplot(gasoline)

ggAcf(gasoline, lag.max = 1000)

The plots above show clear seasonality and trend in the gasoline series. The lag plot shows positive correlation, which is consistent with a seasonal component. The series trends upward over the period, flattening somewhat around and after 2005.
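
The non-integer frequency printed by head(gasoline) is numerically equal to 365.25 / 7, the average number of weeks in a year. That reading is inferred from the printed value and is not stated in this document. The sketch below reproduces it in base R with a placeholder series named weekly, which is an assumption of this example.

```r
# Average number of weeks per year, allowing for leap years.
weeks_per_year <- 365.25 / 7

# Placeholder weekly series (values 1 to 104) with the same start and frequency
# as the head(gasoline) output above.
weekly <- ts(seq_len(104), start = 1991.1, frequency = weeks_per_year)

# The frequency matches the value printed for gasoline.
frequency(weekly)

# The time stamp of the sixth observation matches the "End" printed by head(gasoline).
time(weekly)[6]

# TRUE when the frequency is a whole number; functions that split the data into
# complete seasons rely on that.
frequency(weekly) %% 1 == 0
```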