suppressMessages(suppressWarnings(library(fpp2)))
suppressMessages(suppressWarnings(library(readxl)))
2.1. Use the help function to explore what the series gold, woolyrnq and gas represent.
str(gold)
## Time-Series [1:1108] from 1 to 1108: 306 300 303 297 304 ...
str(woolyrnq)
## Time-Series [1:119] from 1965 to 1994: 6172 6709 6633 6660 6786 ...
str(gas)
## Time-Series [1:476] from 1956 to 1996: 1709 1646 1794 1878 2173 ...
# a. Use autoplot() to plot each of these in separate plots.
autoplot(gold)
autoplot(woolyrnq)
autoplot(gas)
# b. What is the frequency of each series? Hint: apply the frequency() function.
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
# c. Use which.max() to spot the outlier in the gold series. Which observation was it?
which.max(gold)
## [1] 770
gold[which.max(gold)]
## [1] 593.7
2.2. Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
# a. You can read the data into R with the following script:
tute1 <- read.csv("C:/Users/rites/Documents/GitHub/Data624_Assignment1/tute1.csv", header=TRUE)
View(tute1)
# b. Convert the data to time series
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
# (The [,-1] removes the first column which contains the quarters as we don't need them now.)
# c. Construct time series plots of each of the three series
autoplot(mytimeseries, facets=TRUE)
# Check what happens when you don't include facets=TRUE.
autoplot(mytimeseries)
#The labels are seperated and colored
2.3. Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
# a. You can read the data into R with the following script:
retaildata <- readxl::read_excel("C:/Users/rites/Documents/GitHub/Data624_Assignment1/retail.xlsx", skip=1)
## readxl works best with a newer version of the tibble package.
## You currently have tibble v1.4.2.
## Falling back to column name repair from tibble <= v1.4.2.
## Message displays once per session.
# The second argument (skip=1) is required because the Excel sheet has two header rows.
# b. Select one of the time series as follows (but replace the column name with your own chosen column):
myts <- ts(retaildata[,"A3349873A"],frequency=12, start=c(1982,4))
#c. Explore your chosen retail time series using the following functions:
#Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
autoplot(myts)
ggseasonplot(myts)
ggsubseriesplot(myts)
gglagplot(myts, lags = 12)
ggAcf(myts)
# WE can see seasonality and trend here
2.6. Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
# Can you spot any seasonality, cyclicity and trend?
# What do you learn about the series?
# hsales:
autoplot(hsales)
ggseasonplot(hsales)
ggsubseriesplot(hsales)
gglagplot(hsales)
ggAcf(hsales, lag.max = 400)
# We can see seasonality and cyclicity. The cycle period is ~100 months.
# usdeaths:
autoplot(usdeaths)
ggseasonplot(usdeaths)
ggsubseriesplot(usdeaths)
gglagplot(usdeaths)
ggAcf(usdeaths, lag.max = 60)
# We can see seasonality.
# bricksq:
autoplot(bricksq)
ggseasonplot(bricksq)
ggsubseriesplot(bricksq)
gglagplot(bricksq)
ggAcf(bricksq, lag.max = 200)
# We can see slight seasonality and very good trend.
# sunspotarea:
autoplot(sunspotarea)
# ggseasonplot(sunspotarea)
# Data are not seasonal
# ggsubseriesplot(sunspotarea)
# Data are not seasonal
gglagplot(sunspotarea)
ggAcf(sunspotarea, lag.max = 50)
# We can see strong cyclicity.
# gasoline:
autoplot(gasoline)
ggseasonplot(gasoline)
#ggsubseriesplot(gasoline)
# Each season requires at least 2 observations. That is not present
gglagplot(gasoline)
ggAcf(gasoline, lag.max = 1000)
# We can see seasonality and trend.