Exercises 2.1, 2.2, 2.3 and 2.6 from Hyndman's online forecasting book. The RPubs version of this work can be found here, and the source and data can be found on GitHub here.
#clear the workspace
rm(list = ls())
#load required packages
library(forecast)
library(readxl)
library(RCurl)
library(fpp2)
Use the help function to explore what the series gold, woolyrnq and gas represent.
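#return the frequency of a series and the index of its largest observation
describe.data <- function(data) {
  freq <- frequency(data)
  outlier <- which.max(data)  #position of the maximum, treated here as the outlier
  return(c(freq, outlier))
}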
#help(gold)
autoplot(gold)
question1 <- describe.data(gold)
#help(woolyrnq)
autoplot(woolyrnq)
question2 <- describe.data(woolyrnq)
#help(gas)
autoplot(gas)
question3 <- describe.data(gas)
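The helper above reports the position of the largest value, not its date. As a minimal sketch on a made-up quarterly series (`toy` and its values are hypothetical, not one of the book's datasets), that index could be mapped back to a time stamp with base R's `time()` and `cycle()`:

```r
# Hypothetical quarterly series standing in for gold, woolyrnq or gas
toy <- ts(c(5, 7, 6, 20, 8, 9, 7, 6), start = c(2000, 1), frequency = 4)

toy_freq  <- frequency(toy)          # observations per year (4 = quarterly)
max_index <- which.max(toy)          # position of the largest value
max_time  <- time(toy)[max_index]    # the same position in time units
max_cycle <- cycle(toy)[max_index]   # period within the year (here, the quarter)

print(c(frequency = toy_freq, index = max_index,
        time = max_time, quarter = max_cycle))
```

The same two extra calls should work on any `ts` object, so they could be added to `describe.data()` if the date of the outlier is wanted as well.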
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
You can read the data into R with the following script:
tute1 <- read.csv("http://otexts.com/fpp2/extrafiles/tute1.csv", header=TRUE)
View(tute1)
Convert the data to time series
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
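To see what `start=1981, frequency=4` does, a small sketch with a made-up data frame may help. Here `toy_df` and its values are hypothetical stand-ins for tute1, assuming a label column followed by numeric series. Dropping the first column leaves only the numeric series, and successive rows are assigned to successive quarters starting at 1981 Q1:

```r
# Hypothetical stand-in for tute1: a label column followed by numeric series
toy_df <- data.frame(
  X        = c("Q1", "Q2", "Q3", "Q4", "Q5", "Q6"),
  Sales    = c(10, 12, 9, 14, 11, 13),
  AdBudget = c(5, 6, 4, 7, 5, 6)
)

# Drop the label column and treat the rows as quarterly observations
toy_ts <- ts(toy_df[, -1], start = 1981, frequency = 4)

print(start(toy_ts))  # year and quarter of the first row
print(end(toy_ts))    # year and quarter of the last row
print(time(toy_ts))   # fractional-year time stamp for each row
```

Checking `start()` and `end()` against the dates in the file is a quick way to confirm that the chosen start and frequency line up with the data.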
Construct time series plots of each of the three series & check what happens when you don’t include facets=TRUE.
autoplot(mytimeseries, facets=TRUE,main="With 'Facets' Argument")
autoplot(mytimeseries,main="Without 'Facets' Argument")
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
#create a temp file
temp_file <- tempfile(fileext = ".xlsx")
#grab a copy of the Excel file from my GitHub and save it to the temp file created above
download.file(url = "https://github.com/plb2018/DATA624/raw/master/Homework1/retail.xlsx",
              destfile = temp_file,
              mode = "wb",
              quiet = TRUE)
#load the Excel file from the temp file, skipping the extra header row
retaildata <- readxl::read_excel(temp_file, skip = 1)
#monthly series starting in April 1982
my.ts <- ts(retaildata[,"A3349388W"],
            frequency = 12, start = c(1982,4))
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
For this question I just picked a column at random and ended up with “Turnover ; Total (State) ; Takeaway food services ;”.
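Picking a column at random can also be done in code. The following is only a sketch on a made-up table: `toy_retail`, its series IDs and the seed are all hypothetical, and the real spreadsheet has many more columns. It draws one series ID with `sample()` and builds a monthly series starting in April 1982:

```r
set.seed(624)  # arbitrary seed so the "random" pick is repeatable

# Hypothetical stand-in for the retail spreadsheet: a date column plus series
toy_retail <- data.frame(
  Series.ID = seq(as.Date("1982-04-01"), by = "month", length.out = 24),
  A0000001X = 100 + 1:24,
  A0000002X = 200 + 1:24,
  A0000003X = 300 + 1:24
)

# Candidate columns are everything except the date column
candidate_ids <- setdiff(names(toy_retail), "Series.ID")
picked_id <- sample(candidate_ids, 1)

# Monthly series whose first observation is April 1982
picked_ts <- ts(toy_retail[[picked_id]], frequency = 12, start = c(1982, 4))

print(picked_id)
print(start(picked_ts))
print(end(picked_ts))
```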
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
my.ts <- hsales
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
my.ts <- usdeaths
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
* Once again, these data are clearly seasonal and show little trend, on aggregate.
* The seasonality plots show a dip in Feb, then a clear rise to a peak in Jul. Thereafter, the series drops off slightly and seems to flatten out towards the end of the year.
* The lag plots are also informative here, with Feb consistently appearing at the bottom and Jul near the top. The 12-month panel suggests an annual seasonality.
* The ACF almost looks like a sine wave, indicative of a pattern in the data. The peaks are at lags 12 and 24, suggesting an annual season.
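The reading of the ACF above can be checked numerically. This is a minimal sketch on a simulated monthly series (`toy` is hypothetical: a repeating sine-shaped monthly pattern plus a little noise, not the usdeaths data), using base R's `acf()` with `plot = FALSE` to find the positive lag with the largest autocorrelation:

```r
set.seed(42)  # arbitrary seed for reproducible noise

# Hypothetical monthly series: the same 12-month pattern repeated for 20 years
n_years <- 20
monthly_pattern <- sin(2 * pi * (1:12) / 12)
toy <- ts(rep(monthly_pattern, n_years) + rnorm(12 * n_years, sd = 0.05),
          start = c(2000, 1), frequency = 12)

# acf() reports lags in years for a ts object, so rescale them to months
toy_acf    <- acf(toy, lag.max = 24, plot = FALSE)
lag_months <- round(toy_acf$lag[, 1, 1] * frequency(toy))
acf_values <- toy_acf$acf[, 1, 1]

# Ignore lag 0 (always 1) and find the lag with the largest autocorrelation
positive_lags <- lag_months > 0
peak_lag <- lag_months[positive_lags][which.max(acf_values[positive_lags])]

print(peak_lag)
print(round(acf_values[lag_months %in% c(6, 12, 18, 24)], 2))
```

With a strictly annual pattern, the autocorrelation is high at multiples of 12 and negative half a cycle away, which is the sine-wave shape described above.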
my.ts <- bricksq
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
my.ts <- sunspotarea
autoplot(my.ts)
#ggseasonplot(my.ts)
#ggsubseriesplot(my.ts)
gglagplot(my.ts,lags=12)
ggAcf(my.ts)
my.ts <- gasoline
autoplot(my.ts)
ggseasonplot(my.ts)
#ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
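Some of the seasonal plots are commented out for sunspotarea and gasoline, presumably because sunspotarea is annual (frequency 1) and gasoline is weekly with a non-integer frequency. A hedged sketch of how that could be checked before plotting follows. The helper names `can_seasonplot` and `can_subseriesplot` are hypothetical, the toy series are made up, and the rules encoded here are an assumption about what those plots need, not the forecast package's own checks:

```r
# Assumption: a seasonal plot needs more than one observation per year
can_seasonplot <- function(x) frequency(x) > 1

# Assumption: a subseries plot also needs a whole number of periods per year
can_subseriesplot <- function(x) {
  f <- frequency(x)
  f > 1 && isTRUE(all.equal(f, round(f)))
}

# Hypothetical series with annual, weekly and monthly frequencies
annual_toy  <- ts(1:10,  start = 1900, frequency = 1)
weekly_toy  <- ts(1:120, start = 1991, frequency = 365.25 / 7)
monthly_toy <- ts(1:36,  start = c(2000, 1), frequency = 12)

for (x in list(annual_toy, weekly_toy, monthly_toy)) {
  cat("frequency:", round(frequency(x), 2),
      "| seasonplot:", can_seasonplot(x),
      "| subseriesplot:", can_subseriesplot(x), "\n")
}
```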