2.10 Exercises

1.

Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

# Load the required R package and datasets
library(fpp3)
data("aus_production")
data("pelt")
data("gafa_stock")
data("vic_elec")

a

Use ? (or help()) to find out about the data in each series.

help("aus_production")

b

What is the time interval of each series?

aus_production: quarterly (every three months)
pelt: yearly (every 12 months)
gafa_stock: daily, but only on trading days, so the tsibble reports an irregular interval
vic_elec: half-hourly (every 30 minutes)
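The intervals can also be read off programmatically instead of from the help pages. The sketch below assumes the tsibble and tsibbledata packages are installed (both are attached by fpp3) and uses interval() and is_regular() from tsibble; the names intervals and regular are illustrative only.

```r
library(tsibble)
library(tsibbledata)

# Collect the interval of each tsibble in a named list (illustrative name)
intervals <- list(
  aus_production = interval(aus_production),
  pelt           = interval(pelt),
  gafa_stock     = interval(gafa_stock),
  vic_elec       = interval(vic_elec)
)

# Print a compact label for each interval
print(sapply(intervals, format))

# gafa_stock only has rows for trading days, so it is stored as irregular
regular <- c(
  aus_production = is_regular(aus_production),
  pelt           = is_regular(pelt),
  gafa_stock     = is_regular(gafa_stock),
  vic_elec       = is_regular(vic_elec)
)
print(regular)
```

An irregular interval is what the `[!]` marker in a tsibble header indicates.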

c

Use autoplot() to produce a time plot of each series.

autoplot(aus_production, Bricks) +
  ggtitle("Quarterly Production of Bricks in Australia")

autoplot(pelt, Lynx) +
  ggtitle("Canadian Lynx Pelts Traded 1845-1935")

autoplot(gafa_stock, Close) +
  ggtitle("Closing GAFA Stock Prices from 2014-2018")

autoplot(vic_elec, Demand) +
  ggtitle("Electricity Demand for Victoria, Australia")

d

For the last plot, modify the axis labels and title.

library(ggplot2)
# Time plot of electricity demand in Victoria with a custom title and axis labels
autoplot(vic_elec, Demand, colour = "blue") +
  labs(title = "Electricity Demand in Victoria", x = "Time (half-hourly)", y = "Demand (MWh)")

2.

Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

library(dplyr)
data(gafa_stock)
gafa_stock %>%
  group_by(Symbol) %>%
  filter(Close == max(Close)) %>% #Keeps rows where Close value = max close value
  select(Symbol, Date, Close)

## # A tsibble: 4 x 3 [!]
## # Key:       Symbol [4]
## # Groups:    Symbol [4]
##   Symbol Date       Close
##   <chr>  <date>     <dbl>
## 1 AAPL   2018-10-03  232.
## 2 AMZN   2018-09-04 2040.
## 3 FB     2018-07-25  218.
## 4 GOOG   2018-07-26 1268.
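To see why the grouped filter returns one row per stock, here is a minimal sketch of the same pattern on a toy data frame. The symbols, dates and prices are made up for illustration and are not taken from gafa_stock.

```r
library(dplyr)

# Hypothetical prices for two made-up symbols
toy <- data.frame(
  Symbol = c("A", "A", "B", "B"),
  Date   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02")),
  Close  = c(10, 12, 7, 5)
)

# max(Close) is evaluated separately within each Symbol group,
# so each group keeps only its own peak row
peaks <- toy %>%
  group_by(Symbol) %>%
  filter(Close == max(Close)) %>%
  ungroup()

print(peaks)
```

If a stock hit its peak close on more than one day, this pattern would return every tied row for that stock.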

3.

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

a.

You can read the data into R with the following script:

tute1 <- readr::read_csv("tute1.csv")
View(tute1)

b.

Convert the data to a time series (a tsibble indexed by quarter).

mytimeseries <- tute1 %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)
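Since tute1.csv has to be downloaded first, the conversion can be tried on a small stand-in. This is a sketch only: the toy data frame and its Sales values are invented, and it assumes the Quarter column arrives as dates, as it does when readr parses the file.

```r
library(dplyr)
library(tsibble)

# Made-up stand-in for tute1: one date per quarter plus one measured column
toy <- data.frame(
  Quarter = as.Date(c("1981-03-01", "1981-06-01", "1981-09-01", "1981-12-01")),
  Sales   = c(100, 90, 80, 95)
)

# yearquarter() turns each date into a quarterly period;
# as_tsibble() then registers that column as the time index
toy_ts <- toy %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)

print(toy_ts)
print(is_tsibble(toy_ts))
```

The index is what lets functions such as autoplot() and gg_season() treat the table as a time series rather than as a plain data frame.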

c.

Construct time series plots of each of the three series.

mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

Check what happens when you don’t include facet_grid().

mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()

Without facet_grid(), all three series are drawn on one shared y-axis, which makes the plot difficult to interpret, especially when the series have different magnitudes or trends.
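The scale problem can be checked numerically by reshaping to long format and comparing the range of each series. The sketch below uses a made-up data frame with the same column names as tute1; the values are illustrative only, and lowest and highest are hypothetical column names.

```r
library(dplyr)
library(tidyr)

# Made-up values on deliberately different scales
toy <- data.frame(
  Quarter  = 1:4,
  Sales    = c(1000, 900, 800, 1000),
  AdBudget = c(600, 550, 500, 580),
  GDP      = c(280, 282, 285, 290)
)

# pivot_longer() stacks the three series into name/value pairs,
# which is the shape ggplot() needs to colour or facet by series
long <- toy %>% pivot_longer(-Quarter)

# Range of each series: widely different ranges mean a shared
# y-axis will flatten the series with the smallest variation
spread <- long %>%
  group_by(name) %>%
  summarise(lowest = min(value), highest = max(value))

print(spread)
```

With scales = "free_y", each facet gets its own axis, so a series with a narrow range is no longer squashed by one with a wide range.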

6.

The aus_arrivals data set comprises quarterly international arrivals to Australia from Japan, New Zealand, UK and the US.

a

Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries.

autoplot(aus_arrivals) +
  xlab("Year") + ylab("") +
  ggtitle("Total Quarterly Arrivals to Australia")

gg_season(aus_arrivals)

gg_subseries(aus_arrivals)

b

Can you identify any unusual observations?

The UK series appears to be the most seasonal, with arrivals highest in Q1 and Q4. US arrivals also peak in Q1 and Q4.

DATA 624 - HW 1

Shana Green

9-10-2023
