Exercises 2.1, 2.2, 2.3 and 2.6 from Hyndman and Athanasopoulos's online book Forecasting: Principles and Practice (2nd edition).
# Load the required packages
library(forecast)
library(readxl)
library(RCurl)
library(fpp2)
Use the help function to explore what the series gold, woolyrnq and gas represent.
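# Return the frequency of a series and the position of its largest observation
describe.data <- function(data) {
  freq <- frequency(data)
  outlier <- which.max(data)  # index of the maximum, treated here as the outlier
  return(c(freq, outlier))
}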
autoplot(gold)
question1 <- describe.data(gold)
autoplot(woolyrnq)
question2 <- describe.data(woolyrnq)
autoplot(gas)
question3 <- describe.data(gas)
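Because which.max() returns a position rather than a date, it can help to translate that index back into a time stamp. The following is a minimal sketch, not part of the original solution: describe_series is a hypothetical helper, and the built-in AirPassengers series is used only as a stand-in so the example runs without fpp2.

```r
# Hypothetical helper: frequency plus the position, time and value of the maximum
describe_series <- function(x) {
  idx <- which.max(x)                          # position of the largest observation
  list(frequency = frequency(x),
       max_index = idx,
       max_time  = as.numeric(time(x)[idx]),   # time stamp in decimal years
       max_value = as.numeric(x[idx]))
}

# AirPassengers ships with base R and is used here only as a stand-in series
res <- describe_series(AirPassengers)
print(res)
```

The same call should work on gold, woolyrnq or gas once fpp2 is loaded, since it relies only on generic ts functions.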
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
You can read the data into R with the following script:
tute1 <- read.csv("http://otexts.com/fpp2/extrafiles/tute1.csv", header = TRUE)
View(tute1)
Convert the data to a time series, dropping the first column (the quarter labels):
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
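To see what start and frequency do in the call above, here is a small sketch on made-up numbers. The data frame toy and its values are invented for illustration and are not the contents of tute1.csv.

```r
# Made-up quarterly values for illustration only; not the contents of tute1.csv
toy <- data.frame(
  Quarter  = c("Mar-81", "Jun-81", "Sep-81", "Dec-81", "Mar-82", "Jun-82"),
  Sales    = c(100, 110, 105, 120, 102, 115),
  AdBudget = c(50, 55, 52, 60, 51, 57),
  GDP      = c(270, 271, 272, 273, 274, 275)
)

# Drop the label column; frequency = 4 marks the rows as quarters starting in 1981 Q1
toy_ts <- ts(toy[, -1], start = 1981, frequency = 4)
print(toy_ts)

# Time-based subsetting works once the index is attached
print(window(toy_ts, start = c(1981, 3), end = c(1982, 1)))
```

Printing the object shows the rows labelled 1981 Q1, 1981 Q2 and so on, which is a quick way to confirm the index was attached as intended.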
Construct time series plots of each of the three series and check what happens when you don’t include facets=TRUE.
autoplot(mytimeseries, facets = TRUE, main = "With 'Facets' Argument")
autoplot(mytimeseries, main = "Without 'Facets' Argument")
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
# Create a temporary file to hold the download
temp_file <- tempfile(fileext = ".xlsx")
download.file(url = "https://github.com/omerozeren/DATA624/raw/master/HMW1/retail.xlsx",
              destfile = temp_file,
              mode = "wb",
              quiet = TRUE)
# Read the spreadsheet from the temporary file, skipping the first row
retaildata <- readxl::read_excel(temp_file, skip = 1)
my.ts <- ts(retaildata[, "A3349388W"],
            frequency = 12, start = c(1982, 4))
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
Reference: https://otexts.com/fpp2/autocorrelation.html
“Trend: A trend exists when there is a long-term increase or decrease in the data. It does not have to be linear. Sometimes we will refer to a trend as ‘changing direction’, when it might go from an increasing trend to a decreasing trend.”
“Seasonal: A seasonal pattern occurs when a time series is affected by seasonal factors such as the time of the year or the day of the week. Seasonality is always of a fixed and known frequency. The monthly sales of antidiabetic drugs above shows seasonality which is induced partly by the change in the cost of the drugs at the end of the calendar year.”
“Cyclic: A cycle occurs when the data exhibit rises and falls that are not of a fixed frequency. These fluctuations are usually due to economic conditions, and are often related to the ‘business cycle’. The duration of these fluctuations is usually at least 2 years.”
“When data have a trend, the autocorrelations for small lags tend to be large and positive because observations nearby in time are also nearby in size. So the ACF of trended time series tend to have positive values that slowly decrease as the lags increase.
When data are seasonal, the autocorrelations will be larger for the seasonal lags (at multiples of the seasonal frequency) than for other lags.
When data are both trended and seasonal, you see a combination of these effects.”
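The ACF behaviour quoted above can be checked on synthetic data. This is a rough sketch using base R's acf() rather than ggAcf(), so it runs without the forecast package. The two series, the seed and the noise level are arbitrary assumptions chosen for illustration.

```r
set.seed(1)
n <- 120

# Synthetic monthly series: one with a linear trend, one with a 12-month cycle
trended  <- ts(1:n + rnorm(n), frequency = 12)
seasonal <- ts(10 * sin(2 * pi * (1:n) / 12) + rnorm(n), frequency = 12)

# Sample autocorrelations at lags 1..24 (the first element is lag 0, so drop it)
acf_trend <- acf(trended,  lag.max = 24, plot = FALSE)$acf[-1]
acf_seas  <- acf(seasonal, lag.max = 24, plot = FALSE)$acf[-1]

# Trend: large positive values at small lags that decay slowly
print(round(acf_trend[c(1, 6, 12, 24)], 2))

# Seasonality: positive peaks at lags 12 and 24, negative troughs at 6 and 18
print(round(acf_seas[c(6, 12, 18, 24)], 2))
```

Replacing plot = FALSE with the default draws the correlograms, which should resemble the shapes ggAcf() produces for trended and seasonal series.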
For this question I just picked a column at random and ended up with “Turnover ; Total (State) ;
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
my.ts <- hsales
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
my.ts <- usdeaths
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
my.ts <- bricksq
autoplot(my.ts)
ggseasonplot(my.ts)
ggsubseriesplot(my.ts)
gglagplot(my.ts)
ggAcf(my.ts)
my.ts <- sunspotarea
autoplot(my.ts)
# Seasonal plots are skipped: sunspotarea is annual, so it has no seasonal period
#ggseasonplot(my.ts)
#ggsubseriesplot(my.ts)
gglagplot(my.ts,lags=12)
ggAcf(my.ts)
my.ts <- gasoline
autoplot(my.ts)
ggseasonplot(my.ts)
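# Skipped: gasoline is weekly with a non-integer frequency, which ggsubseriesplot() appears not to handle
#ggsubseriesplot(my.ts)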
gglagplot(my.ts)
ggAcf(my.ts)
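Some of the seasonal plot calls above are commented out for sunspotarea and gasoline. That is consistent with sunspotarea being annual and gasoline being weekly with a non-integer frequency, although this reading is an assumption rather than something stated in the exercise. A small sketch of checking a series before choosing plots, using a hypothetical helper plot_checks and base R series as stand-ins so it runs without fpp2:

```r
# Hypothetical helper: basic checks before calling the seasonal plot functions
plot_checks <- function(x) {
  f <- frequency(x)
  c(seasonal = f > 1,                 # annual data (frequency 1) has no seasonal period
    integer_frequency = f %% 1 == 0)  # weekly data is often stored with frequency 365.25 / 7
}

# Stand-in series: two base R data sets and a simulated weekly series
weekly <- ts(rnorm(200), frequency = 365.25 / 7)

checks <- rbind(
  monthly = plot_checks(USAccDeaths),   # monthly, frequency 12
  annual  = plot_checks(sunspot.year),  # annual, frequency 1
  weekly  = plot_checks(weekly)
)
print(checks)
```

A series that fails the first check has nothing for ggseasonplot() or ggsubseriesplot() to show, and one that fails only the second may still suit a seasonal plot but not a subseries plot.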