require(fpp2)
## Loading required package: fpp2
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth
Explore what the series gold, woolyrnq, and gas represent.
Use autoplot() to plot each of these in separate plots.
gold is the daily morning gold price in US dollars from 1 January 1985 to 31 March 1989.
autoplot(gold) + ggtitle("Daily morning gold prices (1985 - 1989)") +
xlab("Year") + ylab("Price")
woolyrnq is the quarterly production of woollen yarn in Australia (tonnes) from March 1965 - September 1994.
autoplot(woolyrnq) + ggtitle("Australian woollen yarn production (1965 - 1994)") +
xlab("Year") + ylab("Tonnes")
gas is the Australian monthly gas production from 1956 - 1995.
autoplot(gas) + ggtitle("Australian gas production (1956 - 1995)") +
xlab("Year") + ylab("Gas")
Use the frequency() function to find the frequency of each series.
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
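frequency() returns 1 for gold, which does not mean annual here: the data are daily, and the series is simply stored without a seasonal period. woolyrnq is quarterly (4 observations per year) and gas is monthly (12 observations per year).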
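The value returned by frequency() is the number of observations per seasonal period that was recorded when the series was built, not a label for the sampling interval. A minimal sketch with made-up numbers (all object names below are hypothetical, and only base R is used) shows how a daily series can still report 1:

```r
# Hypothetical data: 10 consecutive daily prices, built without a frequency argument.
daily_prices <- ts(c(306, 300, 303, 297, 304, 308, 302, 305, 309, 301))

# ts() defaults to frequency = 1, so the series carries no seasonal period.
frequency(daily_prices)

# Quarterly and monthly series declare their period explicitly.
quarterly <- ts(1:8, start = c(1965, 1), frequency = 4)
monthly <- ts(1:24, start = c(1956, 1), frequency = 12)
frequency(quarterly)
frequency(monthly)
```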
Use which.max() to spot the outlier in the gold series. Which observation was it?
which.max(gold)
## [1] 770
It was observation 770, which had a price of $593.70.
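which.max() returns a position, so the value and its time index have to be looked up separately. A small sketch on invented data (the series `prices` is hypothetical, not the gold data):

```r
# Hypothetical series with one obvious spike at position 4.
prices <- ts(c(310, 312, 309, 593.7, 311, 308), start = 1)

peak_pos <- which.max(prices)         # position of the largest value
peak_value <- prices[peak_pos]        # the value at that position
peak_time <- time(prices)[peak_pos]   # the time index of that position

peak_pos
peak_value
peak_time
```

Applied to the real series, the same lookup would be gold[which.max(gold)].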
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series labelled Sales, AdBudget, and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget, and GDP is the gross domestic product. All series have been adjusted for inflation.
tute1 <- read.csv("tute1.csv", header = TRUE)
View(tute1)
# The [,-1] removes the first column which contains the quarters we don't need
mytimeseries <- ts(tute1[, -1], start = 1981, frequency = 4)
autoplot(mytimeseries, facets = TRUE)
Without facets = TRUE:
autoplot(mytimeseries)
All three series are drawn on a single set of axes with a shared scale, which makes it harder to read each one.
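The effect of dropping the first column, and the reason a shared axis is cramped, can both be checked without the CSV file. The sketch below uses an invented data frame (the values are made up; only the column names mirror tute1):

```r
# Hypothetical stand-in for tute1: a label column plus three numeric series.
fake <- data.frame(
  Quarter  = c("Q1", "Q2", "Q3", "Q4"),
  Sales    = c(1000, 900, 800, 1100),
  AdBudget = c(650, 600, 500, 620),
  GDP      = c(250, 260, 270, 280)
)

# Drop the label column; ts() supplies the quarterly time index instead.
fake_ts <- ts(fake[, -1], start = 1981, frequency = 4)

colnames(fake_ts)          # the three series that remain
time(fake_ts)              # quarterly index starting at 1981
apply(fake_ts, 2, range)   # series with very different ranges share one axis poorly
```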
# The second argument (skip = 1) is required because the Excel sheet has two header rows
retaildata <- readxl::read_excel("retail.xlsx", skip = 1)
myts <- ts(retaildata[, "A3349337W"], frequency = 12, start = c(1982, 4))
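In ts(), start = c(1982, 4) combined with frequency = 12 means the fourth period of 1982, that is, April. A quick check on placeholder values (the vector is hypothetical, not the retail data):

```r
# Hypothetical monthly values; only the time index matters here.
demo <- ts(1:12, frequency = 12, start = c(1982, 4))

start(demo)      # 1982 4 -> April 1982
end(demo)        # 1983 3 -> March 1983
cycle(demo)[1]   # 4, the month number of the first observation
```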
Explore the series using autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), and ggAcf(). Can you spot any seasonality, cyclicity, and trend? What do you learn about the series?
autoplot(myts) + ggtitle("A3349337W") +
xlab("Year") + ylab("Sales")
The autoplot shows strong seasonality and an upward trend. There is a dip somewhere between 1990 and 2000, but a single dip is not enough evidence to call it part of a cycle.
ggseasonplot(myts, year.labels = TRUE, year.labels.left = TRUE) +
ylab("Sales") + ggtitle("Seasonal Plot of A3349337W")
The seasonal plot emphasizes the seasonality of the data. Sales start to rise around September, spike sharply between November and December, then fall off after January, which coincides with Christmas shopping and holiday sales.
ggsubseriesplot(myts) + ylab("Sales") +
ggtitle("Seasonal Subseries Plot of A3349337W")
Again, the subseries plot highlights the seasonality of the data, and shows it more clearly than the seasonal plot. Although sales rise from September, the floor (the lowest values for each month) stays about the same. The only real difference is in December, which has not only a higher ceiling but a higher floor as well.
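The horizontal lines in a subseries plot mark the mean of each month, and both those means and the per-month "floor" can be computed directly. A sketch on simulated data (the series below is invented, with a deliberate December spike):

```r
# Hypothetical monthly series: 11 flat months and a December bump,
# repeated for 5 years with a small year-on-year increase.
sim <- ts(rep(c(rep(100, 11), 160), times = 5) + rep(0:4, each = 12),
          frequency = 12, start = c(2000, 1))

# Group observations by position in the year (1 = January, ..., 12 = December).
month_means <- tapply(sim, cycle(sim), mean)
month_mins  <- tapply(sim, cycle(sim), min)

month_means
which.max(month_means)   # month with the highest average
which.max(month_mins)    # month with the highest floor
```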
gglagplot(myts)
The lag plot is not very readable for this series. Some panels show a negative relationship and some a positive one, but the number of panels, combined with the monthly frequency, makes it difficult to discern much.
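Each panel of a lag plot is a scatter of the series against itself shifted by k periods, so the direction of the relationship in a panel matches the sign of the lag-k correlation. A sketch on a simulated, purely seasonal series (`lag_cor` is a hypothetical helper, not a function from the forecast package):

```r
# Hypothetical series: a clean 12-month sine wave over 10 years.
t_idx <- 1:120
wave <- sin(2 * pi * t_idx / 12)

# Correlation between y[t] and y[t + k], i.e. what the "lag k" panel displays.
lag_cor <- function(y, k) cor(head(y, -k), tail(y, -k))

lag_cor(wave, 12)   # close to 1: same month, one year apart
lag_cor(wave, 6)    # close to -1: opposite points in the seasonal cycle
```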
ggAcf(myts)
The slow decay of the autocorrelations as the lag increases reflects the trend, while the scalloped shape, with peaks at multiples of 12 months, reflects the seasonality of the sales data.
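Both features of the ACF can be reproduced on a series whose structure is known. The sketch below builds a hypothetical series from a linear trend plus a 12-month sine wave (the coefficients are arbitrary choices) and inspects its autocorrelations with base R's acf():

```r
# Hypothetical series: linear trend plus a 12-month seasonal pattern.
t_idx <- 1:120
trend_seasonal <- 0.5 * t_idx + 20 * sin(2 * pi * t_idx / 12)

# Autocorrelations at lags 1..24 (drop lag 0, which is always 1).
r <- as.numeric(acf(trend_seasonal, lag.max = 24, plot = FALSE)$acf)[-1]

round(r[c(6, 12, 18, 24)], 2)
r[12] > r[6]     # peak at the seasonal lag: the scallop
r[24] > r[18]    # and again one year later
r[12] > r[24]    # peaks shrink as the lag grows: the trend
```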
Use autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), and ggAcf() to explore features of the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline. Can you spot any seasonality, cyclicity, and trends? What do you learn about the series?
autoplot(hsales)
ggseasonplot(hsales, year.labels = TRUE, year.labels.left = TRUE)
ggsubseriesplot(hsales)
gglagplot(hsales)
ggAcf(hsales)
There is clear seasonality in the hsales data, along with cycles of roughly five years. The seasonal and subseries plots show that sales peak around March before tapering off for the rest of the year. The autoplot does not show a clear trend. The autocorrelations do shrink as the lag increases, which could hint at one, although they also dip below zero.
autoplot(usdeaths)
ggseasonplot(usdeaths, year.labels = TRUE, year.labels.left = TRUE)
ggsubseriesplot(usdeaths)
gglagplot(usdeaths)
ggAcf(usdeaths)
There is clear seasonality in the usdeaths data, although I’m not sure if there is a cycle. It might just be strong seasonality. Deaths spike during the summer and dip in the winter, which makes sense as people are less likely to be out and about in the winter and more likely to be out and sociable in the summer. There is, however, no trend through this period, which is a bit disappointing as you’d like to see a downward trend.
autoplot(bricksq)
ggseasonplot(bricksq, year.labels = TRUE, year.labels.left = TRUE)
ggsubseriesplot(bricksq)
gglagplot(bricksq)
ggAcf(bricksq)
bricksq had an upward trend until about the 1980s and may have plateaued since. The autocorrelation plot shows a trend, but the autocorrelations shrink less sharply than in the other series. Since 1975 or so, there is some evidence of cyclicity at roughly five-year intervals. Seasonality is not strong, but it is there, as shown by the scalloping in the autocorrelation plot and by the seasonal plot. Again, it is less pronounced than in the other series.
autoplot(sunspotarea)
#ggseasonplot(sunspotarea, year.labels = TRUE, year.labels.left = TRUE)
#ggsubseriesplot(sunspotarea)
gglagplot(sunspotarea)
ggAcf(sunspotarea)
The sunspotarea data is strongly cyclic, with a cycle of roughly 11 years, and the autocorrelation plot shows no discernible trend. The series is annual (its frequency is 1), so it cannot show seasonality, which is why ggseasonplot() and ggsubseriesplot() refuse to run and report that the data are not seasonal.
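The seasonal plotting functions need more than one observation per period, so it can help to check the frequency before calling them. A minimal sketch on invented annual values (`has_seasonal_period` is a hypothetical helper, not part of any package):

```r
# Hypothetical annual series: one observation per year, so frequency is 1.
annual <- ts(c(5, 11, 16, 23, 36, 58, 29, 20, 10, 8), start = 1875)

# Hypothetical helper: seasonal plots only make sense when frequency > 1.
has_seasonal_period <- function(y) frequency(y) > 1

has_seasonal_period(annual)                      # FALSE
has_seasonal_period(ts(1:24, frequency = 12))    # TRUE
```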
autoplot(gasoline)
ggseasonplot(gasoline, year.labels = TRUE, year.labels.left = TRUE)
#ggsubseriesplot(gasoline)
gglagplot(gasoline)
ggAcf(gasoline)
The gasoline data has an upward trend but, as far as the seasonal plot shows, very little seasonality. In fact, there’s so much data that it’s really hard to parse it on the seasonal plot. The weekly values seem to fluctuate wildly throughout the course of the year.
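Weekly data are awkward for seasonal plots because a year does not contain a whole number of weeks. The sketch below assumes a weekly series stored with a frequency of 365.25 / 7; that value is an assumption about how such a series is typically built, not something shown in the output above, and the data are placeholders:

```r
# Hypothetical weekly series; 365.25 / 7 is the average number of weeks per year.
weekly <- ts(seq_len(104), start = 1991, frequency = 365.25 / 7)

frequency(weekly)              # about 52.18, not a whole number
frequency(weekly) %% 1 != 0    # TRUE: weeks do not line up exactly from year to year
```

A non-integer frequency like this is a plausible reason for the subseries plot being commented out above.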