suppressPackageStartupMessages(library(fpp2))
suppressPackageStartupMessages(library(feasts))
suppressPackageStartupMessages(library(tidyr))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(tibble))
suppressPackageStartupMessages(library(tsibble))
suppressPackageStartupMessages(library(tsibbledata))
Homework #1
Use the help function to explore what the series gafa_stock, PBS, vic_elec and pelt represent.
Use autoplot() to plot some of the series in these data sets.
What is the time interval of each series?
# help('gafa_stock')
# help('PBS')
# help('vic_elec')
# help('pelt')
# PBS is already a tsibble, so plot it directly; coercing it with ts() turns
# the key columns into NAs and imposes dates that are not in the data.
PBS %>%
  summarise(Cost = sum(Cost)) %>%
  autoplot(Cost) + ggtitle("Total monthly prescription costs") + ylab("Cost")
Time interval of each series: gafa_stock is daily (trading days only, so the tsibble is irregular), PBS is monthly, vic_elec is half-hourly (every 30 minutes), and pelt is yearly.
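The intervals can also be read programmatically rather than from the help pages. The following is a minimal sketch using `interval()` and `is_regular()` from tsibble; the `series` and `regular` names are only illustrative.

```r
library(tsibble)
library(tsibbledata)

# Collect the four data sets in a named list (names are illustrative)
series <- list(gafa_stock = gafa_stock, PBS = PBS,
               vic_elec = vic_elec, pelt = pelt)

# TRUE when observations are equally spaced, FALSE for irregular series
regular <- sapply(series, is_regular)
print(regular)

# The interval object of each tsibble, as shown in its printed header
print(lapply(series, interval))
```

An irregular tsibble is the one whose header shows `[!]`, as in the gafa_stock output further down.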
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
gafa_stock %>% group_by(Symbol) %>% filter(Close == max(Close))
## # A tsibble: 4 x 8 [!]
## # Key: Symbol [4]
## # Groups: Symbol [4]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2018-10-03 230. 233. 230. 232. 230. 28654800
## 2 AMZN 2018-09-04 2026. 2050. 2013 2040. 2040. 5721100
## 3 FB 2018-07-25 216. 219. 214. 218. 218. 58954200
## 4 GOOG 2018-07-26 1251 1270. 1249. 1268. 1268. 2405600
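To see why the grouped `filter()` returns one row per stock, here is a minimal sketch on a made-up table. The `toy` data and its values are hypothetical and only illustrate that `max(Close)` is evaluated within each `Symbol` group.

```r
library(dplyr)

# Hypothetical prices for two symbols
toy <- tibble::tibble(
  Symbol = c("A", "A", "A", "B", "B"),
  Date   = as.Date("2020-01-01") + c(0, 1, 2, 0, 1),
  Close  = c(10, 12, 11, 50, 48)
)

# Within each Symbol, keep the row(s) whose Close equals that group's maximum
peaks <- toy %>%
  group_by(Symbol) %>%
  filter(Close == max(Close)) %>%
  ungroup()
print(peaks)
```

If a symbol reached the same peak on several days, every such day would be kept.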
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
tute1 <- readr::read_csv("tute1.csv")
View(tute1)
mytimeseries <- tute1 %>%
  mutate(Quarter = yearmonth(Quarter)) %>%
  as_tsibble(index = Quarter)
mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid().
tute1 <- readr::read_csv("tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(tute1)
## # A tibble: 6 × 4
## Quarter Sales AdBudget GDP
## <date> <dbl> <dbl> <dbl>
## 1 1981-03-01 1020. 659. 252.
## 2 1981-06-01 889. 589 291.
## 3 1981-09-01 795 512. 291.
## 4 1981-12-01 1004. 614. 292.
## 5 1982-03-01 1058. 647. 279.
## 6 1982-06-01 944. 602 254
mytimeseries <- tute1 %>%
  mutate(Quarter = yearmonth(Quarter)) %>%
  as_tsibble(index = Quarter)
head(mytimeseries)
## # A tsibble: 6 x 4 [3M]
## Quarter Sales AdBudget GDP
## <mth> <dbl> <dbl> <dbl>
## 1 1981 Mar 1020. 659. 252.
## 2 1981 Jun 889. 589 291.
## 3 1981 Sep 795 512. 291.
## 4 1981 Dec 1004. 614. 292.
## 5 1982 Mar 1058. 647. 279.
## 6 1982 Jun 944. 602 254
mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")
mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()
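Without facet_grid(), the three series are not separated into panels: they are drawn together in a single panel on a shared y-axis, distinguished only by colour.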
The USgas package contains data on the demand for natural gas in the US.
Install the USgas package.
Create a tsibble from us_total with year as the index and state as the key.
Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
suppressPackageStartupMessages(library(USgas))
tsibble_us <- us_total %>%
  as_tsibble(index = year, key = state)
# Keep only the six New England states
new_england <- tsibble_us %>%
  filter(state %in% c('Maine', 'Vermont', 'New Hampshire', 'Massachusetts',
                      'Connecticut', 'Rhode Island'))
autoplot(new_england, y)
Download tourism.xlsx from the book website and read it into R using readxl::read_excel().
Create a tsibble which is identical to the tourism tsibble from the tsibble package.
Find what combination of Region and Purpose had the maximum number of overnight trips on average.
Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
tourism <- readxl::read_excel("tourism.xlsx")
tsibble_tourism <- tourism %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))
head(tsibble_tourism)
## # A tsibble: 6 x 5 [1Q]
## # Key: Region, State, Purpose [1]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
tsibble_tourism %>%
  group_by(Region, Purpose) %>%
  summarise(Trips = mean(Trips)) %>%
  ungroup() %>%
  filter(Trips == max(Trips))
## # A tsibble: 1 x 4 [1Q]
## # Key: Region, Purpose [1]
## Region Purpose Quarter Trips
## <chr> <chr> <qtr> <dbl>
## 1 Melbourne Visiting 2017 Q4 985.
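Note that `summarise()` on a tsibble keeps the time index as an implicit grouping, so the result above (a single row dated 2017 Q4) is the largest value in any one quarter, not an average over time. A minimal sketch of averaging across all quarters follows. It assumes the built-in `tsibble::tourism` data set in place of the Excel file, and the `avg_trips` and `top_combo` names are only illustrative.

```r
library(dplyr)
library(tsibble)

# Drop the tsibble structure first so that Quarter no longer acts as a
# grouping variable, then average Trips over all quarters.
avg_trips <- tsibble::tourism %>%
  as_tibble() %>%
  group_by(Region, Purpose) %>%
  summarise(Trips = mean(Trips), .groups = "drop")

# Region/Purpose combination with the highest average number of trips
top_combo <- avg_trips %>%
  filter(Trips == max(Trips))
print(top_combo)
```

The same pipeline should work on `tsibble_tourism`, provided it matches the built-in data set.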
new_tsibble <- tsibble_tourism %>%
  group_by(State) %>%
  summarise(Trips = sum(Trips)) %>%
  ungroup()
head(new_tsibble)
## # A tsibble: 6 x 3 [1Q]
## # Key: State [1]
## State Quarter Trips
## <chr> <qtr> <dbl>
## 1 ACT 1998 Q1 551.
## 2 ACT 1998 Q2 416.
## 3 ACT 1998 Q3 436.
## 4 ACT 1998 Q4 450.
## 5 ACT 1999 Q1 379.
## 6 ACT 1999 Q2 558.
Monthly Australian retail data is provided in aus_retail. Select one of the time series as follows (but choose your own seed value):
set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
Explore your chosen retail time series using the following functions:
autoplot(), gg_season(), gg_subseries(), gg_lag(),
ACF() %>% autoplot()
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
autoplot(myseries,Turnover)
gg_season(myseries, Turnover)
gg_subseries(myseries, Turnover)
gg_lag(myseries, Turnover)
ACF(myseries, Turnover) %>% autoplot()
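As a numeric complement to the plots, the strength of trend and seasonality can be summarised with STL-based features. This is a hedged sketch, not part of the exercise: it assumes `feat_stl` from feasts is an acceptable summary here, and the `strengths` name is illustrative. Values near 1 would suggest a strong trend or strong yearly seasonality.

```r
library(dplyr)
library(tsibble)
library(tsibbledata)
library(feasts)

# Same series selection as in the exercise
set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

# STL-based features; trend_strength and seasonal_strength_year lie in [0, 1]
strengths <- myseries %>%
  features(Turnover, feat_stl)
print(select(strengths, State, Industry, trend_strength, seasonal_strength_year))
```

These numbers do not replace the visual inspection asked for above; cyclic behaviour in particular still has to be judged from the plots.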