library(fpp2)
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth
library(data.table)
library(readxl)

  1. Use the help function to explore what the series gold, woolyrnq and gas represent.
## time series gold
class(gold)
## [1] "ts"
summary(gold)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   285.0   337.7   403.2   392.5   443.7   593.7      34
head(gold,2)
## Time Series:
## Start = 1 
## End = 2 
## Frequency = 1 
## [1] 306.25 299.50
tail(gold,2)
## Time Series:
## Start = 1107 
## End = 1108 
## Frequency = 1 
## [1] 384.0 382.3
## time series woolyrnq
class(woolyrnq)
## [1] "ts"
summary(woolyrnq)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3324    4882    5466    5658    6646    7819
head(woolyrnq,2)
##      Qtr1 Qtr2
## 1965 6172 6709
tail(woolyrnq,2)
##      Qtr2 Qtr3
## 1994 6135 6396
## time series gas
class(gas)
## [1] "ts"
summary(gas)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1646    2675   16788   21415   38629   66600
head(gas,2)
##       Jan  Feb
## 1956 1709 1646
tail(gas,2)
##        Jul   Aug
## 1995 66600 60054
  1. Use autoplot() to plot each of these in separate plots.
## ts gold
autoplot(gold) +ggtitle("Daily morning gold prices in US dollars. 1 January 1985 - 31 March 1989") 

## ts woolyrnq
autoplot(woolyrnq) +ggtitle("Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 - Sep 1994")

## ts gas
autoplot(gas) + ggtitle("Australian monthly gas production: 1956-1995")

  1. What is the frequency of each series? Hint: apply the frequency() function.
## Frequency of ts gold
frequency(gold)
## [1] 1
# A frequency of 1 means no seasonal period is set; gold holds daily prices, not annual data
## Frequency of ts woolyrnq
frequency(woolyrnq)
## [1] 4
# Frequency of woolyrnq is Quarterly
## Frequency of ts gas
frequency(gas)
## [1] 12
# Frequency of gas is Monthly
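
A minimal sketch of what frequency() reports, using small made-up series rather than the fpp2 data (the object names daily_like, quarterly and monthly are hypothetical). The value is the number of observations per seasonal period stored in the ts object, so a series created without a frequency argument reports 1 whatever its real sampling interval is.

```r
# Hypothetical data: a short run of daily-style prices with no seasonal period declared
daily_like <- ts(c(306.2, 299.5, 303.4, 296.7, 304.4, 298.3, 303.2, 302.7))
# Hypothetical quarterly and monthly series with the period declared explicitly
quarterly  <- ts(1:8,  start = c(1965, 1), frequency = 4)
monthly    <- ts(1:24, start = c(1956, 1), frequency = 12)

# Collect the stored frequencies side by side
freqs <- c(daily_like = frequency(daily_like),
           quarterly  = frequency(quarterly),
           monthly    = frequency(monthly))
print(freqs)
```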
  1. Use which.max() to spot the outlier in the gold series. Which observation was it?
which.max(gold)
## [1] 770
## The maximum occurs at record 770

max(gold, na.rm = TRUE)
## [1] 593.7
## The maximum value of gold is 593.7
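
To judge whether a maximum is an isolated outlier, it can help to look at the observations on either side of it. A small sketch on a made-up series with missing values (x is hypothetical, not the gold data):

```r
# Hypothetical series with two missing values and one spike
x <- ts(c(300, 302, NA, 305, 304, 450, 303, NA, 301, 299))

idx  <- which.max(x)   # position of the maximum; NA values are skipped
peak <- x[idx]         # value at that position

# Neighbouring observations, clipped so the index stays inside the series
neighbours <- x[max(1, idx - 1):min(length(x), idx + 1)]
print(idx)
print(neighbours)
```

A peak that sits far above both neighbours, as in this toy series, points to a single unusual observation rather than a gradual rise.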
  1. Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

a. You can read the data into R with the following script:

## fread - (library(data.table))
tute1 <- fread("C:/Users/Gurpreet/Documents/Data624/tute1.csv", header = TRUE, stringsAsFactors = FALSE)
b. Convert the data to time series
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
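
A quick way to check such a conversion is to confirm that the resulting time span matches the data description. The sketch below uses a made-up data frame (tute1_demo and its values are hypothetical stand-ins for the CSV contents) with 100 rows, that is 25 years of quarters.

```r
# Hypothetical stand-in for tute1.csv: an index column plus three series
n <- 100  # 25 years x 4 quarters, 1981-2005
tute1_demo <- data.frame(
  Quarter  = seq_len(n),
  Sales    = 1000 + 2 * seq_len(n),
  AdBudget = 500 + seq_len(n),
  GDP      = 280 + 0.1 * seq_len(n)
)

# Drop the first column by position, then declare quarterly data starting in 1981 Q1
demo_ts <- ts(tute1_demo[, -1], start = c(1981, 1), frequency = 4)

print(start(demo_ts))     # first period as (year, quarter)
print(end(demo_ts))       # last period as (year, quarter)
print(colnames(demo_ts))  # series names carried over from the data frame
```

If end() does not land on the last quarter of 2005, either the start or the frequency argument does not match the data.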

c. Construct time series plots of each of the three series

## autoplot with facets
autoplot(mytimeseries, facets = TRUE) + ggtitle("Quarterly Sales, AdBudget and GDP for a small company, 1981-2005")

## autoplot without facets
autoplot(mytimeseries) + ggtitle("Quarterly Sales, AdBudget and GDP for a small company, 1981-2005")

Setting facets = TRUE draws Sales, AdBudget and GDP in separate panels, one per series. Without it, all three series are drawn in a single panel on a common y-axis.

  1. Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
  1. You can read the data into R with the following script:
retaildata <- readxl::read_excel("C:/Users/Gurpreet/Documents/Data624/retail.xlsx", skip=1)
  1. Select one of the time series as follows (but replace the column name with your own chosen column):
myts <- ts(retaildata[,"A3349401C"],
  frequency=12, start=c(1982,4))
  1. Explore your chosen retail time series using the following functions:

autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()

Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

autoplot(myts) + ggtitle("Australian Retail Data")

ggseasonplot(myts)

ggsubseriesplot(myts)

gglagplot(myts, lags = 12)

ggAcf(myts)

The series has an increasing trend and clear seasonality, but no obvious cyclic pattern. Retail sales peak at the end of each year, which suggests high December sales driven by holiday and Christmas shopping.
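
The year-end peak can also be checked numerically by averaging the series within each calendar month. This is only a sketch on a made-up monthly series (a linear trend plus a December bump; the names and numbers are hypothetical, not the retail data):

```r
# Hypothetical monthly series starting April 1982: linear trend only
n <- 120
trend_ts <- ts(100 + 0.5 * seq_len(n), start = c(1982, 4), frequency = 12)

# Calendar month (1-12) of every observation
month <- as.integer(cycle(trend_ts))

# Add a made-up December bump and rebuild the series
values <- as.numeric(trend_ts) + ifelse(month == 12, 40, 0)
demo   <- ts(values, start = c(1982, 4), frequency = 12)

# Average level per calendar month; the largest mean marks the seasonal peak
monthly_mean <- tapply(as.numeric(demo), month, mean)
peak_month   <- unname(which.max(monthly_mean))
print(round(monthly_mean, 1))
print(peak_month)
```

On a real series with a strong trend, averaging the detrended values instead would keep the trend from distorting the monthly comparison.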

  1. Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline. Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
## hsales
autoplot(hsales) +ggtitle("Monthly sales of new one-family houses sold in the USA since 1973")

ggseasonplot(hsales)

ggsubseriesplot(hsales)

gglagplot(hsales)

ggAcf(hsales)

There is no increasing or decreasing trend in the housing sales data. The series shows a strong seasonal pattern, with sales peaking around March and April each year, which suggests that most buyer activity falls in these months. Sales fall in December, possibly because of winter. The seasonality is evident in the seasonal and subseries plots. The data also suggest cyclic behaviour around 1982 and 1991. The 1991 movement can be related to the early-1990s recession: after Iraq invaded Kuwait, oil prices rose and the Dow fell about 18% in three months.

## usdeaths
autoplot(usdeaths)

ggseasonplot(usdeaths)

ggsubseriesplot(usdeaths)

gglagplot(usdeaths)

ggAcf(usdeaths)

There is no trend in the usdeaths series and no apparent cyclic pattern, but there is clear seasonality. Accidental deaths rise in the middle of the year (summer, June-July). The subseries plot supports this: deaths peak mid-year and then decline slowly. One possible explanation is that more people go to the beach, and drink and drive, in these months.

## bricksq
autoplot(bricksq) +ggtitle("Australian quarterly clay brick production: 1956-1994")

ggseasonplot(bricksq)

ggsubseriesplot(bricksq)

gglagplot(bricksq)

ggAcf(bricksq)

The data show seasonality and a positive trend. No clear cyclic pattern appears in the series.
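
Trend and seasonality can be separated with a classical additive decomposition. This is a different technique from the plots used above, shown here only as a sketch on a made-up quarterly series (the pattern and all values are hypothetical, not the bricksq data):

```r
# Hypothetical quarterly series: linear trend plus a fixed quarterly pattern
n <- 40
pattern <- c(-10, 5, 10, -5)   # made-up Q1..Q4 effects, summing to zero
values  <- 200 + 3 * seq_len(n) + rep(pattern, length.out = n)
demo_q  <- ts(values, start = c(1956, 1), frequency = 4)

# Classical decomposition into trend, seasonal and remainder components
dec <- decompose(demo_q, type = "additive")

print(round(dec$figure, 2))   # estimated seasonal effect for each quarter
print(which.max(dec$figure))  # quarter with the largest seasonal effect
```

On real data the remainder component is where any cyclic behaviour not captured by the smooth trend would show up.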

## sunspotarea
autoplot(sunspotarea)

#ggseasonplot(sunspotarea)
#ggsubseriesplot(sunspotarea)
gglagplot(sunspotarea)

ggAcf(sunspotarea)

There is no trend in the series. The data are annual (frequency 1), so there can be no seasonality, which is why ggseasonplot() and ggsubseriesplot() cannot be drawn. The regular rises and falls are instead a cyclic pattern, with a period of roughly 11 years.
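
For an annual series, the length of a cycle can be read off the autocorrelation function. The sketch below uses sunspot.year from base R (yearly sunspot numbers) as a stand-in; it is not the sunspotarea series from fpp2, so the result is only indicative.

```r
# sunspot.year (base R datasets) is a stand-in here, NOT fpp2's sunspotarea
ac <- acf(sunspot.year, lag.max = 20, plot = FALSE)

# Search a window of lags (in years) for the highest autocorrelation;
# ac$acf[1] is lag 0, hence the + 1 when indexing
lags     <- 5:15
peak_lag <- lags[which.max(ac$acf[lags + 1])]
print(peak_lag)
```

The lag with the highest autocorrelation in that window gives an estimate of the cycle length in years.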

## gasoline
autoplot(gasoline) + ggtitle("Weekly data beginning 2 February 1991, ending 20 January 2017, in million barrels per day")

ggseasonplot(gasoline)

#ggsubseriesplot(gasoline)
gglagplot(gasoline)

ggAcf(gasoline)

There is an increasing trend in the gasoline data (million barrels per day). No seasonality is apparent in the data. Cyclic patterns might be expected because of market crashes and other economic factors, but they are difficult to detect from these graphs. Aggregating the data to annual frequency might make any cycles easier to see.
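
One way to move a weekly series to annual frequency is to average the observations within each calendar year. The sketch below uses a made-up weekly series (weekly and its values are hypothetical, not the gasoline data). It groups by year with tapply() because, as far as I understand, aggregate() on a ts object expects the old frequency to be a whole multiple of the new one, which weekly data with about 52.18 observations per year does not satisfy.

```r
# Hypothetical weekly series with a steady upward drift, about 52.18 weeks per year
weekly <- ts(6 + 0.005 * seq_len(520), start = 1991, frequency = 365.25 / 7)

# Calendar year of each observation, taken from the time index
year <- floor(as.numeric(time(weekly)))

# Mean level per calendar year
annual_mean <- tapply(as.numeric(weekly), year, mean)
print(round(annual_mean, 2))
```

The first and last years may be only partly covered, so their averages rest on fewer weeks than the others.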