Please submit exercises 2.1, 2.2, 2.3 and 2.6 from the Hyndman online Forecasting book. Please submit your Rpubs link and attach the .Rmd file with your code.

2.1

Use the help function to explore what the series gold, woolyrnq and gas represent.

help(gold) # Daily morning gold prices in US dollars. 1 January 1985 – 31 March 1989.
help(woolyrnq) # Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994.
help(gas)  # Australian monthly gas production: 1956–1995.

head(gold)
## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
## [1] 306.25 299.50 303.45 296.75 304.40 298.35
head(woolyrnq)
##      Qtr1 Qtr2 Qtr3 Qtr4
## 1965 6172 6709 6633 6660
## 1966 6786 6800
head(gas)
##       Jan  Feb  Mar  Apr  May  Jun
## 1956 1709 1646 1794 1878 2173 2321
  a. Use autoplot() to plot each of these in separate plots.
library(ggplot2)
library(forecast)  # provides autoplot() for ts objects and the gold, woolyrnq and gas data
autoplot(gold)+
  ggtitle("Daily morning gold prices in US dollars. 1 January 1985 – 31 March 1989")+
  xlab("1 January 1985 - 31 March 1989")+
  ylab("US dollars")

autoplot(woolyrnq)+
  ggtitle("Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994")+
  xlab("Mar 1965 – Sep 1994")+
  ylab("Production")

autoplot(gas)+
  ggtitle("Australian monthly gas production: 1956–1995")+
  xlab("1956–1995")+
  ylab("Production")

  b. What is the frequency of each series? Hint: apply the frequency() function.
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
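
As a small illustration of what frequency() reports, the sketch below builds three toy series in base R. The values are placeholders, not observations from gold, woolyrnq or gas. It suggests that the reported frequency is simply the number of observations per seasonal period that was set when the ts object was created.

```r
# Toy data only: the numbers are placeholders, not values from the book's series.
daily     <- ts(c(306, 300, 303, 297, 304, 298))           # no seasonal period set
quarterly <- ts(1:8,  start = c(1965, 1), frequency = 4)   # 4 observations per year
monthly   <- ts(1:24, start = c(1956, 1), frequency = 12)  # 12 observations per year

frequency(daily)      # 1
frequency(quarterly)  # 4
frequency(monthly)    # 12

# cycle() gives the position of each observation within its period
cycle(quarterly)
```

A frequency of 1, as reported for gold, means the object carries no seasonal period, even though the underlying data are daily.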
  c. Use which.max() to spot the outlier in the gold series. Which observation was it?
which.max(gold)
## [1] 770

Observation 770 holds the maximum value of the series and stands out as an outlier.
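
which.max() returns only the position. A minimal sketch of how one might also recover the value and time index at that position, using a made-up series (toy is hypothetical; substitute gold to apply the same steps to the exercise):

```r
# Hypothetical toy series with one obvious spike at position 4
toy <- ts(c(5, 7, 6, 40, 8, 6))

i <- which.max(toy)   # position of the largest observation
toy[i]                # its value
time(toy)[i]          # its time index

# One possible treatment (an assumption, not part of the exercise):
# mark the spike as missing so it does not distort later summaries
cleaned <- toy
cleaned[i] <- NA
max(cleaned, na.rm = TRUE)
```

Whether a maximum is really an outlier is a judgment call; the time plot is what shows that it sits far from its neighbours.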

2.2

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

  a. You can read the data into R with the following script:
tute1 <- read.csv("https://otexts.com/fpp2/extrafiles/tute1.csv", header=TRUE)
str(tute1)
## 'data.frame':    100 obs. of  4 variables:
##  $ X       : Factor w/ 100 levels "Dec-00","Dec-01",..: 57 32 82 7 58 33 83 8 59 34 ...
##  $ Sales   : num  1020 889 795 1004 1058 ...
##  $ AdBudget: num  659 589 512 614 647 ...
##  $ GDP     : num  252 291 291 292 279 ...
View(tute1)
  b. Convert the data to time series
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)

(The [,-1] removes the first column which contains the quarters as we don’t need them now.)
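
To see what the conversion does, here is a hedged sketch on a hypothetical data frame df laid out like tute1 (a label column followed by numeric columns). The values are made up. It shows that ts() rebuilds the quarterly dates from start and frequency alone, which is why the label column can be dropped.

```r
# Hypothetical data frame laid out like tute1: a label column plus numeric series
df <- data.frame(
  X        = c("Mar-81", "Jun-81", "Sep-81", "Dec-81", "Mar-82", "Jun-82"),
  Sales    = c(10, 12, 11, 15, 14, 16),   # made-up values
  AdBudget = c(5, 6, 5, 7, 6, 8)          # made-up values
)

# Drop the label column; ts() rebuilds the dates from start and frequency
m <- ts(df[, -1], start = 1981, frequency = 4)

colnames(m)   # "Sales" "AdBudget"
start(m)      # 1981 1  (first quarter of 1981)
end(m)        # 1982 2  (second quarter of 1982)
window(m, start = c(1982, 1))   # just the 1982 quarters
```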

  c. Construct time series plots of each of the three series
autoplot(mytimeseries, facets=TRUE)

Check what happens when you don’t include facets=TRUE.

autoplot(mytimeseries)

2.3

Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.

  a. You can read the data into R with the following script:
retaildata <- readxl::read_excel("retail.xlsx", skip=1)

The second argument (skip=1) is required because the Excel sheet has two header rows.
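
The effect of skipping a header row can be tried without the Excel file. The sketch below is a base-R analogue using read.csv() on an inline string with made-up contents; it is not the retail data, and it assumes that skip in readxl::read_excel() works the same way, dropping rows before the column names are read.

```r
# Inline stand-in for a sheet with two header rows (made-up contents)
raw <- "Retail turnover,,
Series ID,A1,A2
1982-04-01,10.5,20.1
1982-05-01,11.0,19.8"

# Without skip, the first line supplies the names and the real header becomes data
bad  <- read.csv(text = raw, header = TRUE)
# With skip = 1, the first line is dropped and the second line supplies the names
good <- read.csv(text = raw, header = TRUE, skip = 1)

names(good)   # "Series.ID" "A1" "A2"
str(good)     # A1 and A2 are numeric
str(bad)      # the columns are not numeric because the header row was read as data
```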

  b. Select one of the time series as follows (but replace the column name with your own chosen column):

myts <- ts(retaildata[,"A3349882C"], frequency = 12, start =c(1982,4))
head(myts)
##        Apr   May   Jun   Jul   Aug   Sep
## 1982 139.3 136.0 143.5 150.2 144.0 146.9
View(myts)
  c. Explore your chosen retail time series using the following functions:

autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()

Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

autoplot(myts)+
  ggtitle("A3349882C")+
  xlab("April 1982 to December 2013")+
  ylab("Sales")

ggseasonplot(myts)

ggsubseriesplot(myts)

gglagplot(myts)

ggAcf(myts)

The time plot shows an upward trend in sales, especially between 2008 and 2010. The seasonality is more pronounced than in the 1980s and 1990s, with sales rising in February, August, and November. The seasonal spike also grows year by year and peaks in December. This makes sense because the biggest holidays, Thanksgiving and Christmas, fall at the end of the year. The last plot suggests a negative relationship.
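
One way to check a reading of the lag and ACF plots is to look at the numbers behind them. The sketch below uses a simulated monthly series (sim is made up, with a linear trend and a 12-month cycle; it is not the retail data) and base R's acf(). It illustrates the usual pattern: a trend keeps the autocorrelations positive and slowly decaying, while differencing exposes the seasonal spikes at multiples of 12.

```r
set.seed(1)                                   # reproducible made-up data
n <- 240                                      # 20 years of monthly observations
t <- seq_len(n)
sim <- ts(0.5 * t + 10 * sin(2 * pi * t / 12) + rnorm(n),
          start = c(1982, 4), frequency = 12)

# Trend dominates: autocorrelations stay positive and decay slowly
r_level <- acf(sim, lag.max = 24, plot = FALSE)$acf[-1]        # drop lag 0

# Differencing removes the trend, so the 12-month pattern stands out
r_diff <- acf(diff(sim), lag.max = 24, plot = FALSE)$acf[-1]   # drop lag 0

round(r_level[c(6, 12, 24)], 2)
round(r_diff[c(6, 12, 24)], 2)    # negative at lag 6, positive at lags 12 and 24
```

Replacing sim with myts would give the values plotted by ggAcf(myts), which can then be compared against the interpretation above.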

2.6

Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.

Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

hsales

autoplot(hsales)

ggseasonplot(hsales)

gglagplot(hsales)

ggAcf(hsales)

The data show a seasonal peak in sales between February and March, with lower sales during the rest of the year; this is clear in the seasonal plot. There is no clear trend, but cyclicity is visible in the time plot and the seasonal plot. Sales appear to have been strong in the 1970s and to have dropped sharply in the 1980s.
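
A seasonal peak read off a plot can be confirmed by averaging each calendar month across years. This is a minimal sketch on a made-up monthly series (toy and its profile are hypothetical and chosen to peak in March); swapping in hsales would apply the same check to the exercise data.

```r
set.seed(42)                                         # reproducible made-up data
pattern <- c(1, 2, 9, 3, 2, 1, 1, 1, 1, 1, 1, 1)     # hypothetical profile peaking in March
toy <- ts(rep(pattern, times = 5) + rnorm(60, sd = 0.1),
          start = c(1973, 1), frequency = 12)

# Average each calendar month across years; cycle() gives the month number 1..12
monthly_mean <- tapply(as.numeric(toy), cycle(toy), mean)
names(monthly_mean) <- month.abb

round(monthly_mean, 1)
names(which.max(monthly_mean))   # month with the highest average
```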

usdeaths

autoplot(usdeaths)

ggseasonplot(usdeaths)

gglagplot(usdeaths)

ggsubseriesplot(usdeaths)

ggAcf(usdeaths)

The plots show seasonality, but no trend and no cyclicity.

bricksq

autoplot(bricksq)

ggseasonplot(bricksq)

ggsubseriesplot(bricksq)

gglagplot(bricksq)

ggAcf(bricksq)

Production is higher in Q2 and Q3 than in Q1 and Q4. During the recession of 1982 to 1983, brick production dropped sharply. The series has an upward trend and weak seasonality. The lag plots show positive relationships, reflecting both the seasonality and the trend.

sunspotarea

autoplot(sunspotarea)

# sunspotarea is not seasonal, so the seasonal plots are skipped
#ggseasonplot(sunspotarea)
#ggsubseriesplot(sunspotarea)
gglagplot(sunspotarea)

ggAcf(sunspotarea)

The series is cyclic, but it is not seasonal.
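
The seasonal plots are commented out above because they need more than one observation per period. Assuming sunspotarea is stored as an annual series (frequency 1), a small guard like the one sketched below could decide whether seasonal plots make sense before calling them. has_seasonal_period() is a hypothetical helper, and both test series are made up.

```r
# Hypothetical helper: TRUE when a series has more than one observation per period
has_seasonal_period <- function(x) {
  stats::is.ts(x) && stats::frequency(x) > 1
}

annual  <- ts(c(12, 30, 55, 41, 18, 9, 25, 60), start = 1875)     # made-up yearly values
monthly <- ts(seq_len(24), start = c(1973, 1), frequency = 12)    # made-up monthly values

has_seasonal_period(annual)    # FALSE: one value per year, nothing to split into seasons
has_seasonal_period(monthly)   # TRUE
```

A cycle like the one in the sunspot data spans several years and varies in length, which is why it shows up in autoplot() and ggAcf() rather than in a seasonal plot.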

gasoline

autoplot(gasoline)

ggseasonplot(gasoline)

# ggsubseriesplot() is skipped here: gasoline is weekly data
#ggsubseriesplot(gasoline)
gglagplot(gasoline)

ggAcf(gasoline)

The series shows seasonality and a trend, but the seasonal period is too long for the lag plot to be useful.