Exercises: 2.1, 2.2, 2.3, 2.4, 2.5 and 2.8
Use the help function to explore what the series gold, woolyrnq and gas represent.
library(fpp2)
## Warning: package 'fpp2' was built under R version 4.4.3
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## ── Attaching packages ──────────────────────────────────────────── fpp2 2.5.1 ──
## ✔ ggplot2 4.0.0 ✔ fma 2.5
## ✔ forecast 8.24.0 ✔ expsmooth 2.3
##
help(gold)
help(woolyrnq)
help(gas)
autoplot(gold) +
ggtitle("Daily Morning Gold Prices") +
xlab("Day (1 January 1985 – 31 March 1989)") +
ylab("Gold Price") +
theme_light()
The gold series appears to be stored as a ts object with a plain
numeric time index rather than calendar dates, since no dates are
displayed on the x axis.
autoplot(woolyrnq) +
ggtitle("Quarterly Production of Woollen Yarn in Australia") +
xlab("Year (Mar 1965 – Sep 1994)") +
ylab("Woollen Yarn Production (tonnes)") +
theme_light()
autoplot(gas) +
ggtitle("Australian Monthly Gas Production") +
xlab("Year (1956–1995)") +
ylab("Monthly Gas Production") +
theme_light()
frequency(gold)
## [1] 1
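A frequency of 1 means that no seasonal period is defined for the series. This makes sense since it is daily data stored with a plain numeric index.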
frequency(woolyrnq)
## [1] 4
This is quarterly data.
frequency(gas)
## [1] 12
This is monthly data.
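To see how frequency() relates to the way a series is built, here is a minimal base R sketch. The three series are made-up placeholders, not the fpp2 data; the only point is that ts() assumes frequency = 1 unless a seasonal period is supplied.

```r
# Hypothetical toy series (not the fpp2 data), built with base R's ts()
daily_like <- ts(c(382.1, 383.4, 381.9, 385.0, 384.2))     # no frequency given
quarterly  <- ts(1:8,  start = c(1965, 1), frequency = 4)  # 4 observations per year
monthly    <- ts(1:24, start = c(1956, 1), frequency = 12) # 12 observations per year

# Collect the seasonal period recorded in each object
freqs <- c(daily_like = frequency(daily_like),
           quarterly  = frequency(quarterly),
           monthly    = frequency(monthly))
print(freqs)
```

The value reported is whatever was stored when the object was created, so it describes the object rather than the underlying sampling rate.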
which.max(gold)
## [1] 770
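which.max() returns only the position of the largest observation. A small sketch of how the position can be turned into a value and a time stamp follows; the series x is a made-up example, not gold, and the variable names are assumptions for illustration.

```r
# Hypothetical monthly series with one obvious spike (not the gold data)
x <- ts(c(5, 7, 6, 42, 8, 6), start = c(2020, 1), frequency = 12)

peak_index <- which.max(x)        # position of the largest value (NAs are skipped)
peak_value <- x[peak_index]       # the value at that position
peak_time  <- time(x)[peak_index] # the time index at that position

print(c(index = peak_index, value = peak_value, time = peak_time))
```

Applied to gold, the same pattern would give the price at observation 770, with the time index reported as an observation number because the series carries no calendar dates.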
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
autoplot(mytimeseries, facets=TRUE) +
theme_light()
autoplot(mytimeseries) +
theme_light()
Without facets=TRUE, autoplot() draws all three series in a single
panel and adds a legend to distinguish between them.
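The step from a CSV file to a multi-column ts object can be sketched in base R as follows. This is an illustration under stated assumptions: the file is a tiny stand-in written to a temporary path, the numbers in it are made up, and read.csv() is used here in place of whichever reader produced the column specification shown above.

```r
# Write a small stand-in for tute1.csv (values are invented for illustration)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Quarter,Sales,AdBudget,GDP",
  "1981-03-01,1020.2,659.2,251.8",
  "1981-06-01,889.2,589.0,290.9",
  "1981-09-01,795.0,512.5,290.8",
  "1981-12-01,1003.9,614.1,292.4",
  "1982-03-01,1057.7,647.2,279.1",
  "1982-06-01,944.4,602.0,254.0",
  "1982-09-01,778.5,530.7,295.6",
  "1982-12-01,932.5,608.4,271.7"
), csv_path)

toy <- read.csv(csv_path)

# Drop the date column and declare quarterly data starting in 1981 Q1
toy_ts <- ts(toy[, -1], start = 1981, frequency = 4)
print(toy_ts)
```

In base graphics, plot(toy_ts) draws one panel per series and plot(toy_ts, plot.type = "single") overlays them, which mirrors the facets=TRUE distinction in autoplot().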
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
myts <- ts(retaildata[,"A3349399C"],
frequency=12, start=c(1982,4))
The column I selected is: Turnover ; New South Wales ; Clothing retailing.
autoplot(myts)
ggseasonplot(myts)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
ggsubseriesplot(myts)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
gglagplot(myts)
ggAcf(myts)
All of the graphs show that clothing purchases peak in December. I
learned that the same time series can be drawn in several kinds of
graph, each presenting the same information in a different way and
reinforcing the same point: December is the peak month. The graphs also
show a smaller peak around May/June for clothes shopping.
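The visual impression of a peak month can be checked numerically by averaging the observations for each calendar month. The sketch below uses a made-up monthly series with a December spike, not the retail data, and the names pattern, x and monthly_means are assumptions for illustration.

```r
# Hypothetical seasonal pattern for Jan..Dec, repeated over three years
pattern <- c(10, 10, 11, 12, 14, 14, 11, 10, 10, 11, 13, 25)
x <- ts(rep(pattern, times = 3) + rep(c(0, 2, 4), each = 12),
        start = c(1982, 1), frequency = 12)

# cycle() gives the month number (1..12) of every observation;
# tapply() then averages the values month by month
monthly_means <- setNames(
  as.numeric(tapply(as.numeric(x), as.integer(cycle(x)), mean)),
  month.abb
)

peak_month <- names(which.max(monthly_means))
print(round(monthly_means, 1))
print(peak_month)
```

The same two lines built around cycle() and tapply() could be pointed at myts to confirm which month has the highest average turnover.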
Create time plots of the following time series: bicoal, chicken, dole, usdeaths, lynx, goog, writing, fancy, a10, h02. Use help() to find out about the data in each series. For the goog plot, modify the axis labels and title.
help(bicoal)
autoplot(bicoal) +
ggtitle("Annual Bituminous Coal Production") +
xlab("Year (1920–1968)") +
ylab("Coal Production") +
theme_light()
help(chicken)
autoplot(chicken) +
ggtitle("Price of Chicken in the US") +
xlab("Year (1924–1993)") +
ylab("Price of Chicken in US Dollars") +
theme_light()
help(dole)
autoplot(dole) +
ggtitle("Unemployment Benefits in Australia") +
xlab("Year (Jan 1965 – Jul 1992)") +
ylab("Monthly total of people on unemployment benefits") +
theme_light()
help(usdeaths)
autoplot(usdeaths) +
ggtitle("Accidental deaths in USA") +
xlab("Year") +
ylab("Monthly accidental deaths in USA") +
theme_light()
help(lynx)
## Help on topic 'lynx' was found in the following packages:
##
## Package Library
## fma /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## datasets /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
##
##
## Using the first match ...
autoplot(lynx) +
ggtitle("Annual Canadian Lynx Trappings in McKenzie River") +
xlab("Year (1821–1934)") +
ylab("Annual number of lynx trapped") +
theme_light()
help(goog)
autoplot(goog) +
ggtitle("Daily closing stock prices of Google Inc") +
xlab("Trading day (25 February 2013 – 13 February 2017)") +
ylab("Closing stock prices") +
theme_light()
help(writing)
autoplot(writing) +
ggtitle("Sales of Printing and Writing Paper") +
xlab("Year (Jan 1963 – Dec 1972)") +
ylab("Industry sales (in thousands of French francs)") +
theme_light()
help(fancy)
autoplot(fancy) +
ggtitle("Souvenir Shop Sales: Queensland, Australia") +
xlab("Year") +
ylab("Monthly sales") +
theme_light()
help(a10)
autoplot(a10) +
ggtitle("Monthly Anti-diabetic Drug Subsidy in Australia") +
xlab("Year (July 1991 – June 2008)") +
ylab("Monthly government expenditure (millions of dollars)") +
theme_light()
help(h02)
autoplot(h02) +
ggtitle("Monthly Corticosteroid Drug Subsidy in Australia") +
xlab("Year (July 1991 – June 2008)") +
ylab("Monthly government expenditure (millions of dollars)") +
theme_light()
Use the ggseasonplot() and ggsubseriesplot() functions to explore the seasonal patterns in the following time series: writing, fancy, a10, h02.
What can you say about the seasonal patterns? Can you identify any unusual years?
ggseasonplot(writing)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
ggsubseriesplot(writing)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
Sales of printing and writing paper show a clear dip in August. The
decline from May to August could make sense because May and June
usually mark the end of school semesters, which would reduce the need
for writing and printing paper.
ggseasonplot(fancy)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
ggsubseriesplot(fancy)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
I’m a little surprised that sales at a souvenir shop don’t rise more
prominently during the summer months, but it makes sense that they are
high in November and especially December, as it is holiday season and a
lot of people take vacations during that time.
ggseasonplot(a10)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
ggsubseriesplot(a10)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
help(a10)
January is the highest month and the only peak, which makes sense because it is the beginning of the year, when government subsidies would generally start.
ggseasonplot(h02)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
ggsubseriesplot(h02)
## Warning in fortify(data, ...): Arguments in `...` must be used.
## ✖ Problematic argument:
## • na.rm = TRUE
## ℹ Did you misspell an argument name?
help(h02)
The series looks seasonal: the subsidy increases as the year progresses from February, and January is the highest month. This could be explained by the versatility of corticosteroid drugs, which can treat a wide range of conditions.
The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.
Plot 1 matches plot B. -> it shows a slow decrease
Plot 2 matches plot A. -> it shows a fluctuating plot with an overall decline
Plot 3 matches plot D. -> it shows a fluctuating graph, probably very seasonal, that is increasing
Plot 4 matches plot C. -> it shows a fluctuating plot
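Matching a time plot to its ACF relies on two rules of thumb: a trend produces autocorrelations that start high and decay slowly, and seasonality produces spikes at multiples of the seasonal lag. The sketch below illustrates both with synthetic series in base R; the series and variable names are assumptions for illustration and are not the four series from the exercise.

```r
idx <- 1:120

trend_series    <- 0.5 * idx              # pure linear trend
seasonal_series <- sin(2 * pi * idx / 12) # pure 12-period seasonality

# acf(..., plot = FALSE) returns the autocorrelations without drawing;
# element k + 1 of each vector holds the autocorrelation at lag k
trend_acf    <- acf(trend_series,    lag.max = 24, plot = FALSE)$acf[, 1, 1]
seasonal_acf <- acf(seasonal_series, lag.max = 24, plot = FALSE)$acf[, 1, 1]

# Trend: large at lag 1 and still clearly positive at lag 24
print(round(trend_acf[c(2, 13, 25)], 2))

# Seasonality: strongly negative at lag 6, strongly positive at lag 12
print(round(seasonal_acf[c(7, 13)], 2))
```

A real series usually mixes both effects, so its ACF shows a slow decay with seasonal scallops superimposed.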