library(fpp2)
## -- Attaching packages ---------------------------------------------------------------------------- fpp2 2.4 --
## v ggplot2 3.3.2 v fma 2.4
## v forecast 8.13 v expsmooth 2.3
library(ggplot2)
library(ggfortify)
Use the help function to explore what the series gold, woolyrnq and gas represent.
help(gold)
help(woolyrnq)
help(gas)
autoplot(gold)+ggtitle("Daily morning gold prices")
autoplot(woolyrnq)+ggtitle("Quarterly production of woollen yarn in Australia")
autoplot(gas)+ggtitle("Australian monthly gas production")
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
The frequencies of gold, woolyrnq, and gas are 1, 4 and 12, respectively.
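The value returned by frequency() is the number of observations per seasonal period that was declared when the ts object was built. A minimal sketch with made-up values (the series `quarterly_demo` is hypothetical and is not one of the fpp2 data sets):

```r
# Hypothetical quarterly series: 8 made-up values starting in 2000 Q1.
quarterly_demo <- ts(c(5, 7, 6, 9, 6, 8, 7, 10),
                     start = c(2000, 1), frequency = 4)

frequency(quarterly_demo)  # observations per year, as declared in ts()
cycle(quarterly_demo)      # position of each observation within its year
time(quarterly_demo)       # time index in fractional years
```

A frequency of 1, as reported for gold, means that no seasonal period was declared for the series.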
which.max(gold)
## [1] 770
gold[which.max(gold)]
## [1] 593.7
The maximum of the gold series is observation 770, with a value of 593.7.
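which.max() only returns a position, so it helps to look at the neighbouring values to judge whether the maximum is an isolated spike. A small sketch on a made-up series (the object `demo` and its values are hypothetical):

```r
# Hypothetical series with one spike; the values are made up.
demo <- ts(c(310, 312, 311, 590, 313, 314))

i <- which.max(demo)                                   # index of the maximum
window_idx <- max(1, i - 2):min(length(demo), i + 2)   # two points either side
neighbours <- demo[window_idx]                         # values around the maximum

i
neighbours
```

For the exercise, the same check would be gold[768:772].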
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
tute1 <- read.csv("tute1.csv", header=TRUE)
View(tute1)
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
autoplot(mytimeseries, facets=TRUE)+ ggtitle("With Facets=TRUE")
autoplot(mytimeseries)+ ggtitle("Without Facets=TRUE")
If we don’t include facets=TRUE, the three series are drawn together in a single panel, and the line color distinguishes Sales, AdBudget and GDP.
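The ts() call above drops the first column and turns the remaining three columns into one multivariate quarterly series. The sketch below reproduces that construction on a hypothetical data frame laid out like tute1.csv; the name `fake_tute1`, the column `X` and all values are made up for illustration.

```r
# Hypothetical stand-in for tute1.csv: an index column plus three quarterly series.
# All numbers are made up.
set.seed(1)
n <- 100                                   # 25 years x 4 quarters
fake_tute1 <- data.frame(
  X        = seq_len(n),                   # first column, dropped below
  Sales    = 1000 + rnorm(n, sd = 50),
  AdBudget = 500 + rnorm(n, sd = 30),
  GDP      = 280 + rnorm(n, sd = 10)
)

# Same construction as in the exercise: drop column 1, start in 1981, 4 obs per year.
demo_ts <- ts(fake_tute1[, -1], start = 1981, frequency = 4)

colnames(demo_ts)   # one column per series
start(demo_ts)      # first observation: year and quarter
end(demo_ts)        # last observation: year and quarter
```

When the columns sit on very different scales, a single shared panel can flatten the smaller series, which is the practical reason to ask for facets.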
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349727C"],
frequency=12, start=c(1982,4))
autoplot(myts)
ggseasonplot(myts)
ggsubseriesplot(myts)
gglagplot(myts)
ggAcf(myts)
I chose column A3349727C to examine seasonality, cyclicity and trend.
autoplot(): The plot shows a strong increasing trend from 1982 to 2013, with strong seasonality.
ggseasonplot(): Sales dip in February every year and rise sharply from November to December. This is not surprising, because December is the holiday season.
ggsubseriesplot(): The plot confirms this finding: average sales are lowest in February and highest in December.
gglagplot(): The scatterplots show a positive relationship at every lag from 1 to 16, and it is strongest at lag 12.
ggAcf(): The autocorrelations are positive at every lag from 1 to 24, which confirms what the lag plots showed. When data are seasonal or cyclic, the ACF peaks around the seasonal lags or at the average cycle length. Here the autocorrelation for the Australian retail data peaks at lags 12 and 24, which matches the annual December peak in sales.
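The combination of all-positive autocorrelations with local peaks at the seasonal lags can be reproduced on a synthetic series. This is only a sketch: the series `demo` below is made up (a linear trend plus a 12-month sine wave) and is not the retail data.

```r
# Made-up monthly series: linear trend plus a 12-month seasonal pattern.
t_idx <- 1:240
demo  <- ts(0.5 * t_idx + 30 * sin(2 * pi * t_idx / 12), frequency = 12)

# Sample autocorrelations; the first element of $acf is lag 0, so drop it.
r <- as.numeric(acf(demo, lag.max = 24, plot = FALSE)$acf)[-1]

round(r[c(11, 12, 13)], 3)   # compare lag 12 with its neighbours
all(r > 0)                   # check whether every autocorrelation is positive
```

The trend keeps the autocorrelations positive, while the seasonal component lifts lag 12 above lags 11 and 13.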
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
autoplot(hsales)
ggseasonplot(hsales)
ggsubseriesplot(hsales)
gglagplot(hsales)
ggAcf(hsales)
We can detect a seasonal pattern in the time plot, but no upward or downward trend. The seasonal plot shows sales rising in March and then gradually declining over the rest of the year. ggsubseriesplot() indicates that average sales are lowest in December. The lag plots show positive correlation, most clearly at lag 1. The ACF plot also confirms the seasonality, with peaks at lags 1, 12 and 24. There is also negative correlation from lag 18 to lag 21, but these values lie within the blue dashed lines, meaning they are not significantly different from zero. From the autocorrelation plot alone, we cannot detect a trend or cyclicity.
autoplot(usdeaths)
ggseasonplot(usdeaths)
ggsubseriesplot(usdeaths)
gglagplot(usdeaths)
ggAcf(usdeaths)
The time plot shows strong seasonality but no apparent upward or downward trend. The seasonal plot and seasonal subseries plot clearly indicate that accidental deaths are highest in July and lowest in February. Lags 1 and 12 show strong positive correlation and lag 6 shows strong negative correlation. In the autocorrelation plot we can see a repeating pattern with a period of one year: the shape from lag 1 to lag 12 repeats from lag 13 to lag 24, with strong positive correlation at lags 1, 12 and 24 and strong negative correlation at lags 6 and 18.
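The alternating signs can also be read directly from the sample autocorrelations rather than from the plot. The sketch below uses base R's `USAccDeaths`, which we assume holds the same monthly accidental-deaths series as `usdeaths`; that equivalence is an assumption and should be checked against help(usdeaths).

```r
# Base R's built-in USAccDeaths (monthly, 1973-1978); assumed here to be the
# same series as fpp2's usdeaths.
r <- as.numeric(acf(USAccDeaths, lag.max = 24, plot = FALSE)$acf)[-1]  # drop lag 0

# Autocorrelations at the half-year and full-year lags.
round(r[c(6, 12, 18, 24)], 2)
```

A negative value half a year apart and a positive value a full year apart is what a single summer peak and winter trough would produce.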
autoplot(bricksq)
ggseasonplot(bricksq)
ggsubseriesplot(bricksq)
gglagplot(bricksq)
ggAcf(bricksq)
Except around 1975 and 1983, the time series plot shows a strong increasing trend. There may also be two cycles, from 1975 to 1983 and from 1984 to 1992. The seasonal plot and seasonal subseries plot indicate slightly higher clay brick production in Q3, and all the lag scatterplots show positive correlation. In the autocorrelation plot, positive values that decrease slowly as the lag increases indicate an upward-trending series, which confirms our finding.
autoplot(sunspotarea)
gglagplot(sunspotarea)
ggAcf(sunspotarea)
The time series plot shows a regular pattern, moving from trough to peak every few years. We also notice that the peaks trend upward from 1875 to around 1960. We cannot produce a seasonal plot because the data are annual. The lag scatterplots show both positive and negative correlation. The ACF plot shows a clear cycle of about ten lags.
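One rough way to put a number on the cycle length is to find the lag of the first positive ACF peak after the initial decay. The sketch below does this on base R's `sunspot.year`, a related but different series (yearly sunspot numbers, not sunspot area), used only as a stand-in. The search window of lags 5 to 15 is an arbitrary choice made for this illustration.

```r
# Base R's sunspot.year: yearly sunspot numbers, 1700-1988. This is NOT
# fpp2's sunspotarea; it is used here only as a stand-in.
r <- as.numeric(acf(sunspot.year, lag.max = 30, plot = FALSE)$acf)[-1]  # drop lag 0

# Search an assumed window (lags 5 to 15) for the first positive peak.
candidate_lags <- 5:15
peak_lag <- candidate_lags[which.max(r[candidate_lags])]
peak_lag   # rough estimate of the average cycle length, in years
```

The same lines could be run with sunspotarea in place of sunspot.year once fpp2 is loaded.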
autoplot(gasoline)
ggseasonplot(gasoline)
gglagplot(gasoline)
ggAcf(gasoline)
autoplot() reveals a clear upward trend from 1991 to 2007, followed by a decrease after 2007 and a renewed increase from 2014. Because the data are weekly, the seasonal plot is cluttered and hard to interpret. The lag plot reveals relatively strong positive autocorrelations, and the pattern repeats every 52 weeks.