library(tidyverse)
library(kableExtra)
library(fpp2)
library(gridExtra)
Use the help function to explore what the series gold, woolyrnq and gas represent.
gold contains daily morning gold prices in US dollars from 1 January 1985 to 31 March 1989. woolyrnq is the quarterly production of woollen yarn in Australia, in tonnes, from March 1965 to September 1994. gas is monthly gas production in Australia from 1956 to 1995.
Use autoplot() to plot each of these in separate plots.
autoplot(gold)
autoplot(woolyrnq)
autoplot(gas)
Find the frequency of each series using the frequency() function.
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
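gold is recorded with frequency 1. This does not make it an annual series: it is daily data stored without a seasonal period. woolyrnq is a quarterly time series (frequency 4) and gas is a monthly time series (frequency 12).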
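The frequency reported by frequency() is simply the attribute stored when the ts object was created, not something inferred from the data. The sketch below illustrates this with toy series (the values and start dates are hypothetical, not the fpp2 data), using only base R.

```r
# Toy series with hypothetical values; only the stored frequency matters here
daily_like <- ts(c(5, 7, 6, 8, 9, 7))                     # no frequency given, so it defaults to 1
quarterly  <- ts(1:8,  start = c(1965, 1), frequency = 4)  # 4 observations per year
monthly    <- ts(1:24, start = c(1956, 1), frequency = 12) # 12 observations per year

freqs <- c(daily_like = frequency(daily_like),
           quarterly  = frequency(quarterly),
           monthly    = frequency(monthly))
print(freqs)
```

A frequency of 1 therefore only says that no seasonal period was declared; the sampling interval has to come from the help page.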
Use which.max() to spot the outlier in the gold series. Which observation was it?
goldoutlier <- which.max(gold)
It is the 770th observation. The price of gold was 593.7 US dollars.
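which.max() returns a position, so the value itself has to be looked up by indexing the series. A minimal sketch on a toy series (hypothetical prices, with one NA included to show that which.max() skips missing values):

```r
# Toy prices (hypothetical values); the NA stands in for a missing observation
prices <- ts(c(310.2, NA, 312.5, 593.7, 311.0))

idx  <- which.max(prices)  # position of the largest non-missing value
peak <- prices[idx]        # value at that position

cat("observation", idx, "has the largest value:", peak, "\n")
```

With fpp2 loaded, the same indexing pattern applied to gold would be gold[which.max(gold)].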
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
if (!file.exists("tute1.csv")) {
  download.file("http://otexts.com/fpp2/extrafiles/tute1.csv", "tute1.csv")
}
tute1 <- read.csv("tute1.csv", header=TRUE)
View(tute1)
mytimeseries <- ts(tute1[,-1], start=1981, frequency=4)
(The [,-1] removes the first column which contains the quarters as we don’t need them now.)
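To see what the conversion does without downloading the file, here is a small sketch on a made-up data frame. The column names mirror tute1.csv, but the quarter labels and all numbers are hypothetical.

```r
# Hypothetical stand-in for tute1.csv: a label column plus three numeric series
toy <- data.frame(
  Quarter  = c("Mar-81", "Jun-81", "Sep-81", "Dec-81", "Mar-82", "Jun-82"),
  Sales    = c(1020, 889, 795, 1003, 1057, 944),
  AdBudget = c(659, 589, 512, 614, 647, 602),
  GDP      = c(251, 290, 290, 292, 279, 254)
)

# Drop the label column and attach the quarterly time index instead
toy_ts <- ts(toy[, -1], start = c(1981, 1), frequency = 4)

print(toy_ts)      # rows are now labelled 1981 Q1, 1981 Q2, ...
print(end(toy_ts)) # last period covered by the six observations
```

The text labels are dropped because ts() generates the time index from start and frequency, so the remaining columns must all be numeric.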
autoplot(mytimeseries, facets=TRUE)
Check what happens when you don’t include facets=TRUE.
autoplot(mytimeseries)
Without facets=TRUE, all three series are drawn in a single panel on a shared y-axis instead of as small multiples, one panel per series.
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
temp = tempfile(fileext = ".xlsx")
dataURL <- "https://otexts.com/fpp2/extrafiles/retail.xlsx"
download.file(dataURL, destfile=temp, mode='wb')
retaildata <- readxl::read_excel(temp, skip=1)
The second argument (skip=1) is required because the Excel sheet has two header rows.
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
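The start=c(1982,4) argument means "period 4 of 1982", which for frequency=12 is April 1982. A quick base R check on a toy monthly series (the values are placeholders, not the retail data):

```r
# Twelve placeholder observations starting in period 4 of 1982
toy_monthly <- ts(seq_len(12), frequency = 12, start = c(1982, 4))

first_last <- rbind(start = start(toy_monthly), end = end(toy_monthly))
print(first_last)

# cycle() gives the position within the year for each observation
month_names <- month.abb[cycle(toy_monthly)]
print(month_names)
```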
Explore the chosen retail time series using the following functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()
autoplot(myts)
ggseasonplot(myts)
ggsubseriesplot(myts)
gglagplot(myts)
ggAcf(myts)
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
There is clear seasonality: retail sales increase from October to the end of the year, which is the Christmas shopping season. There is also an upward trend in retail sales over time. The trend rose until the 2000s, when it flattened out for roughly a decade. Since 2010 it looks like the trend is increasing again.
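The seasonal pattern that ggAcf() shows can also be read off numerically. The sketch below uses base R's acf() on a simulated monthly series (a hypothetical trend plus a 12-month sine wave plus noise, not the retail data) and compares the autocorrelation at lag 12 with that at lag 6.

```r
set.seed(1)
n <- 120                       # ten years of simulated monthly data
t <- seq_len(n)

# Hypothetical series: mild trend + yearly seasonal wave + noise
toy <- ts(0.1 * t + 10 * sin(2 * pi * t / 12) + rnorm(n), frequency = 12)

a <- acf(toy, lag.max = 24, plot = FALSE)
r <- as.numeric(a$acf)         # r[1] is lag 0, so r[k + 1] is lag k

cat("ACF at lag 6: ", round(r[7], 2), "\n")
cat("ACF at lag 12:", round(r[13], 2), "\n")
```

A spike at the seasonal lag that stands out from its neighbours is the numerical counterpart of the peaks seen in a seasonal ACF plot.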
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
autoplot(hsales)
ggseasonplot(hsales)
ggsubseriesplot(hsales)
gglagplot(hsales)
ggAcf(hsales)
One-family home sales in the US tend to be highest in March. The ACF plot suggests that there is some annual cycle, but it is noisy.
A realtor can expect the winter months to be quiet; early spring (March through May) is the busy time of the year.
autoplot(usdeaths)
ggseasonplot(usdeaths)
ggsubseriesplot(usdeaths)
gglagplot(usdeaths)
ggAcf(usdeaths)
There appears to be a seasonal pattern in the data.
Accidental deaths in the US tend to be highest in July.
autoplot(bricksq)
ggseasonplot(bricksq)
ggsubseriesplot(bricksq)
gglagplot(bricksq)
ggAcf(bricksq)
There isn’t much variation from Q2 through Q4. The trend was generally increasing until about the 1980s.
Q1 is a slow quarter for Australian clay brick producers. There has been quite a bit more irregularity since 1975.
autoplot(sunspotarea)
# sunspotarea is annual (frequency 1), so the seasonal plots are skipped
#ggseasonplot(sunspotarea)
#ggsubseriesplot(sunspotarea)
gglagplot(sunspotarea)
ggAcf(sunspotarea)
There appears to be a cycle in the data. It looks like it’s about a 10 to 11 year cycle.
If the pattern holds, 2020 should be a year with low sunspot area.
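A cycle length can be estimated from the lag at which the autocorrelation peaks again after lag 0. The sketch below uses sunspot.year, the yearly sunspot numbers that ship with base R, purely as a stand-in for sunspotarea (an assumption made so the example runs without fpp2); the search window of 5 to 20 years is also an arbitrary choice.

```r
# sunspot.year comes with base R (package datasets); it is a stand-in here
a   <- acf(sunspot.year, lag.max = 30, plot = FALSE)
rho <- as.numeric(a$acf)[-1]   # drop lag 0, so rho[k] is the ACF at lag k

candidates <- 5:20             # plausible cycle lengths in years (arbitrary window)
peak_lag   <- candidates[which.max(rho[candidates])]

cat("ACF peaks at a lag of", peak_lag, "years\n")
```

With fpp2 loaded, replacing sunspot.year by sunspotarea would apply the same check to the series discussed above.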
autoplot(gasoline)
ggseasonplot(gasoline)
gasoline %>%
  as.vector() %>%
  ts(frequency = 52) %>%
  ggsubseriesplot()
gglagplot(gasoline)
ggAcf(gasoline)
There is a trend and some seasonality in the data. I expected to see some cyclical behavior coinciding with U.S. recession dates, but that is not present.
The trend of the supply of gasoline has been generally increasing. It increases slightly during the summer months.
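The gasoline series is rebuilt with frequency=52 before calling ggsubseriesplot(). A likely reason, assuming gasoline is stored with the non-integer weekly frequency 365.25/7 (about 52.18), is that the subseries plot needs a whole number of periods per year. The sketch below reproduces the conversion on simulated weekly data (random placeholder values) in base R.

```r
set.seed(42)
weekly_freq <- 365.25 / 7                       # average number of weeks per year

# Placeholder weekly series with a non-integer frequency (assumed to mimic gasoline)
toy_weekly <- ts(rnorm(160), start = 1991, frequency = weekly_freq)
cat("original frequency:", frequency(toy_weekly), "\n")

# Strip the time attributes, then re-declare the series with exactly 52 periods per year
rebuilt <- ts(as.vector(toy_weekly), frequency = 52)
cat("rebuilt frequency: ", frequency(rebuilt), "\n")
```

The rebuilt series loses its original start date and drifts slightly against the calendar, which is acceptable for a descriptive plot but worth keeping in mind.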