library(tsibble)
library(fpp3)
library(tidyverse)
library(USgas)
library(readxl)
library(ggplot2)

Exercises 2.1

Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

Use ? (or help()) to find out about the data in each series. What is the time interval of each series? Use autoplot() to produce a time plot of each series. For the last plot, modify the axis labels and title.

Bricks from aus_production:

data:
- Beer: Beer production in megalitres.
- Tobacco: Tobacco and cigarette production in tonnes.
- Bricks: Clay brick production in millions of bricks.
- Cement: Portland cement production in thousands of tonnes.
- Electricity: Electricity production in gigawatt hours.
- Gas: Gas production in petajoules.

Time Interval: Quarterly

?aus_production
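# lubridate::interval() masks tsibble::interval() once fpp3/tidyverse are attached,
# so call the tsibble version explicitly to get the series interval
tsibble::interval(aus_production)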
aus_production %>%
  autoplot(Bricks)

Lynx from pelt:

data:
- Hare: The number of Snowshoe Hare pelts traded.
- Lynx: The number of Canadian Lynx pelts traded.

Time Interval: Yearly

?pelt
tsibble::interval(pelt)
pelt %>%
  autoplot(Lynx)

Close from gafa_stock:

data:
- Open: The opening price for the stock.
- High: The stock’s highest trading price.
- Low: The stock’s lowest trading price.
- Close: The closing price for the stock.
- Adj_Close: The adjusted closing price for the stock.
- Volume: The amount of stock traded.

Time Interval: Daily, trading days only (irregular, because weekends and market holidays have no observations)

?gafa_stock
tsibble::interval(gafa_stock)
gafa_stock %>%
  autoplot(Close)

Demand from vic_elec:

data:
- Demand: Total electricity demand in MWh.
- Temperature: Temperature of Melbourne (BOM site 086071).
- Holiday: Indicator for if that day is a public holiday.

Time Interval: Half-Hourly

?vic_elec
tsibble::interval(vic_elec)
vic_elec %>%
  autoplot(Demand) +
  labs(title = "Electricity Demand in Victoria (Half-Hourly)",
       y = "Electricity Demand (MWh)",
       x = "Time (Half-Hourly)")
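
The time interval of a tsibble can also be checked programmatically. The following is a minimal sketch on a hypothetical four-row quarterly series (`toy` and its values are made up for illustration, not taken from any of the datasets above). It calls `tsibble::interval()` with the namespace spelled out, since lubridate exports a function with the same name.

```r
library(tsibble)

# Hypothetical quarterly series, only used to illustrate the interval check
toy <- tsibble(
  Quarter = yearquarter("2020 Q1") + 0:3,
  value   = c(10, 12, 11, 13),
  index   = Quarter
)

# Spell out the namespace so the tsibble method is used
toy_interval <- tsibble::interval(toy)
format(toy_interval)

# A regular tsibble has a fixed spacing between observations
is_regular(toy)
```

The same call on `aus_production`, `pelt`, `gafa_stock` and `vic_elec` should confirm the intervals listed above.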

2.2

Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

gafa_stock %>%
  group_by(Symbol) %>%
  filter(Close == max(Close, na.rm = TRUE)) %>%
  select(Symbol, Date, Close)
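
To make the group-wise filter easier to follow, here is a minimal sketch on a hypothetical price table (the symbols, dates and prices are invented). Inside `filter()`, `max(Close)` is evaluated separately within each group, so one peak row per symbol survives.

```r
library(dplyr)

# Hypothetical prices for two made-up symbols
prices <- tibble(
  Symbol = c("A", "A", "A", "B", "B", "B"),
  Date   = as.Date("2020-01-01") + c(0, 1, 2, 0, 1, 2),
  Close  = c(10, 12, 11, 5, 4, 6)
)

# Keep, for each symbol, the row(s) where Close equals that symbol's maximum
peaks <- prices |>
  group_by(Symbol) |>
  filter(Close == max(Close, na.rm = TRUE)) |>
  ungroup()

peaks
```

If two days tie for the maximum, both rows are kept; `slice_max(Close, n = 1, with_ties = FALSE)` is one alternative when exactly one row per group is wanted.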

2.3

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

  1. You can read the data into R with the following script:
tute1 <- readr::read_csv("https://otexts.com/fpp3/extrafiles/tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (3): Sales, AdBudget, GDP
## date (1): Quarter
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#View(tute1)
  2. Convert the data to time series
mytimeseries <- tute1 |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)
  3. Construct time series plots of each of the three series
mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

Check what happens when you don’t include facet_grid().

mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() 
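
Without `facet_grid()`, all series are drawn in one panel and share a single y-axis, so a series with a much smaller range than the others can look almost flat. The sketch below uses a hypothetical wide table (the numbers are invented, not the tute1 values) to show what `pivot_longer(-Quarter)` produces and how different the per-series ranges can be.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table with two series on very different scales
wide <- tibble(
  Quarter = 1:4,
  Sales   = c(1000, 1100, 1050, 1200),
  GDP     = c(1, 2, 3, 4)
)

# One row per (Quarter, series) pair; default column names are `name` and `value`
long <- wide |>
  pivot_longer(-Quarter)

# Range of each series: on a shared axis the small one would be squashed
ranges <- long |>
  group_by(name) |>
  summarise(min = min(value), max = max(value))

ranges
```

With `scales = "free_y"` in `facet_grid()`, each panel gets its own y-axis, which is why the faceted version shows the shape of every series.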

2.4

The USgas package contains data on the demand for natural gas in the US.

Install the USgas package. Create a tsibble from us_total with year as the index and state as the key. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).

us_gas_tsibble <- us_total |>
  as_tsibble(key = state, index = year)

glimpse(us_gas_tsibble)
## Rows: 1,266
## Columns: 3
## Key: state [53]
## $ year  <int> 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007…
## $ state <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Alabama"…
## $ y     <int> 324158, 329134, 337270, 353614, 332693, 379343, 350345, 382367, …
new_england_states <- c("Maine", "Vermont", "New Hampshire", "Massachusetts", "Connecticut", "Rhode Island")

new_england_gas <- us_gas_tsibble |>
  filter(state %in% new_england_states)

new_england_gas |>
  ggplot(aes(x = year, y = y, color = state)) +
  geom_line(linewidth = 1) +
  labs(title = "Annual Natural Gas Consumption in New England by State",
       x = "Year",
       y = "Natural Gas Consumption (Million Cubic Feet)",
       color = "State")

2.5

tourism_ds <- read_excel("/Users/benson/Downloads/tourism.xlsx")

head(tourism_ds,10)
tourism_tsibble <- tourism_ds |>
  mutate(Quarter = yearquarter(Quarter))

head(tourism_tsibble,10)
max_avg_trips <- tourism_tsibble |>
  group_by(Region, Purpose) |>
  summarise(Avg_Trips = mean(Trips, na.rm = TRUE), .groups = "drop") |>
  arrange(desc(Avg_Trips))

head(max_avg_trips,10)
tourism_state_tsibble <- tourism_tsibble |>
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))

head(tourism_state_tsibble, 10)
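
The tsibble above is still keyed by Region, State and Purpose. If state-level totals are wanted, one possible approach is to group by State and sum the trips. This is a minimal sketch on hypothetical data (the regions, states and trip counts are invented, and `state_totals` is an illustrative name), assuming the same column names as the tourism data.

```r
library(dplyr)
library(tsibble)

# Hypothetical data: four regions in two states, two quarters each
toy_tourism <- tibble(
  Quarter = yearquarter("2020 Q1") + rep(0:1, times = 4),
  Region  = rep(c("R1", "R2", "R3", "R4"), each = 2),
  State   = rep(c("S1", "S1", "S2", "S2"), each = 2),
  Purpose = "Holiday",
  Trips   = c(1, 2, 3, 4, 5, 6, 7, 8)
) |>
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))

# On a tsibble, summarise() aggregates within each time point,
# so this gives total trips per State and Quarter
state_totals <- toy_tourism |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

state_totals
```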

2.8

Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series:

- “Total Private” Employed from us_employment
- Bricks from aus_production
- Hare from pelt
- “H02” Cost from PBS
- Barrels from us_gasoline

Total Private Employed from us_employment

  • Can you spot any seasonality, cyclicity and trend? The autoplot, seasonal plot, subseries plot and lag plot all indicate a strong upward trend: employment levels rise steadily, consistent with long-run economic growth.

  • What do you learn about the series? Peaks and troughs recur over the years, mostly caused by recessions, which points to cyclic behaviour on top of the trend.

  • What can you say about the seasonal patterns? The subseries plot shows that employment tends to rise in specific months across years, most likely because of hiring seasons.

  • Can you identify any unusual years? Yes, the autoplot and subseries plots show a sharp dip between 2008 and 2010.

employment_tsibble <- us_employment |>
  filter(Title == "Total Private")

autoplot(employment_tsibble, Employed) +
  labs(title = "Autoplot: Total Private Employment",
       y = "Number Employed")

gg_season(employment_tsibble, Employed) +
  labs(title = "Seasonal Plot: Total Private Employment")

gg_subseries(employment_tsibble, Employed) +
  labs(title = "Subseries Plot: Total Private Employment")

gg_lag(employment_tsibble, Employed, lags = 12) +
  labs(title = "Lag Plot: Total Private Employment")

employment_tsibble |>
  ACF(Employed) |>
  autoplot() + 
  labs(title = "ACF Plot: Total Private Employment")

Bricks from aus_production

  • Can you spot any seasonality, cyclicity and trend? There is a clear upward trend in brick production up to a peak, followed by a sharp decline. This rise-and-fall pattern repeats several times, which suggests cyclic behaviour.

  • What do you learn about the series? Quarter 3 tends to have the highest brick production and Quarter 1 the lowest, which suggests that demand for bricks varies by quarter.

  • What can you say about the seasonal patterns? The seasonal plot shows that certain quarters consistently have higher or lower production, and the subseries plot tells the same story.

  • Can you identify any unusual years? Yes. Up to the end of the available data (the last 20 quarters of Bricks are missing, as the warnings below show), brick production had not returned to its peak around 1980. With the world population growing, demand for bricks in construction should be rising, so this may suggest that Australia imports bricks to lower costs or has lost export markets.

autoplot(aus_production, Bricks) +
  labs(title = "Autoplot: Bricks Production in Australia",
       y = "Bricks Produced")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_season(aus_production, Bricks)  +
  labs(title = "Seasonal Plot: Bricks Production in Australia")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_subseries(aus_production, Bricks)  +
  labs(title = "Subseries Plot: Bricks Production in Australia")
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_lag(aus_production, Bricks)  +
  labs(title = "Lag Plot: Bricks Production in Australia")
## Warning: Removed 20 rows containing missing values (gg_lag).

aus_production |>
  ACF(Bricks) |>
  autoplot() + 
  labs(title = "ACF Plot: Bricks Production in Australia")

Hare from pelt

  • Can you spot any seasonality, cyclicity and trend? The hare population shows repeating rises and falls over time, with no consistent upward or downward long-term trend.

  • What do you learn about the series? The series appears to peak roughly every 10 years, which is cyclic rather than seasonal behaviour.

  • What can you say about the seasonal patterns? There is no seasonal pattern to observe, since the data is yearly.

  • Can you identify any unusual years? Yes, 1864 and 1886 had unusually high peaks.

Note: gg_season() requires at least one observation per seasonal period, so it cannot be used on this yearly data. The call is left commented out below.

autoplot(pelt, Hare) +
  labs(title = "Autoplot: Hare Population Over Time",
       y = "Hare Pelts Traded")

#gg_season(pelt, Hare)  +
#  labs(title = "Seasonal Plot: Hare Population Over Time")

gg_subseries(pelt, Hare)  +
  labs(title = "Subseries Plot: Hare Population Over Time")

gg_lag(pelt, Hare)  +
  labs(title = "Lag Plot: Hare Population Over Time")

pelt |>
  ACF(Hare) |>
  autoplot() + 
  labs(title = "ACF Plot: Hare Population Over Time")
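
A cycle of roughly 10 years should show up in the ACF as a positive spike near lag 10 and a negative one near lag 5. The sketch below illustrates this on a hypothetical yearly series with an exact 10-year sine cycle (`toy_cycle` is synthetic, not the pelt data), so the real ACF plot above will only follow this pattern approximately.

```r
library(tsibble)
library(feasts)

# Hypothetical yearly series with an exact 10-year cycle (not the pelt data)
toy_cycle <- tsibble(
  Year  = 1900:1989,
  value = sin(2 * pi * (0:89) / 10),
  index = Year
)

# Autocorrelations for lags 1 to 20
acf_tbl <- toy_cycle |>
  ACF(value, lag_max = 20)

# Half a cycle apart the series moves in opposite directions,
# a full cycle apart it moves together
acf_tbl$acf[c(5, 10)]
```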

“H02” Cost from PBS

The filtered data contains more than one time series, which causes gg_lag() to throw an error.

Solution: I break it down into four distinct time series based on the combinations of Concession and Type: General & Co-payments, General & Safety net, Concessional & Co-payments, and Concessional & Safety net.

  • Can you spot any seasonality, cyclicity and trend?
    - General & Co-payments: no clear seasonality, cyclicity or trend.
    - General & Safety net: yearly seasonality in the monthly data, with a peak in December and a sharp fall in January.
    - Concessional & Co-payments: no clear seasonality, cyclicity or trend.
    - Concessional & Safety net: yearly seasonality in the monthly data, with a peak in December and a sharp fall in January.

  • What do you learn about the series? The overall trend suggests that the cost has been rising over the years.

  • What can you say about the seasonal patterns? The Seasonal Plot and Subseries Plot highlight that certain months consistently have higher or lower costs.

  • Can you identify any unusual years? No. 

PBS_tsibble <- PBS |>
  filter(ATC2 == "H02")

autoplot(PBS_tsibble, Cost) +
  labs(title = "Autoplot: H02 Cost Over Time",
       y = "Cost")

gg_season(PBS_tsibble, Cost) +
  labs(title = "Seasonal Plot: H02 Cost Over Time")

gg_subseries(PBS_tsibble, Cost) +
  labs(title = "Subseries Plot: H02 Cost Over Time")

PBS_tsibble_general_copayments <- PBS_tsibble |>
  filter(Concession == "General", Type == "Co-payments")

gg_lag(PBS_tsibble_general_copayments, Cost, lags = 12) +
  labs(title = "Lag Plot: H02 Cost in General and Co-Payments Over Time")

PBS_tsibble_general_safetynet <- PBS_tsibble |>
  filter(Concession == "General", Type == "Safety net")

gg_lag(PBS_tsibble_general_safetynet, Cost, lags = 12) +
  labs(title = "Lag Plot: H02 Cost in General and Safety Net Over Time")

PBS_tsibble_concessional_copayments <- PBS_tsibble |>
  filter(Concession == "Concessional", Type == "Co-payments")

gg_lag(PBS_tsibble_concessional_copayments, Cost, lags = 12) +
  labs(title = "Lag Plot: H02 Cost in Concessional and Co-Payments Over Time")

PBS_tsibble_concessional_safetynet <- PBS_tsibble |>
  filter(Concession == "Concessional", Type == "Safety net")

gg_lag(PBS_tsibble_concessional_safetynet, Cost, lags = 12) +
  labs(title = "Lag Plot: H02 Cost in Concessional and Safety Net Over Time")

PBS_tsibble |>
  ACF(Cost) |>
  autoplot() + 
  labs(title = "ACF Plot: H02 Cost Over Time")
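
Splitting by Concession and Type is one way to give gg_lag() a single series. Another option, if the total cost is of interest rather than the four components, is to sum Cost across the keys first. The following is a minimal sketch on hypothetical data (the months and costs are invented, and `toy_pbs` and `total_cost` are illustrative names), assuming the same key columns as the filtered PBS data.

```r
library(dplyr)
library(tsibble)

# Hypothetical data: four Concession/Type combinations, three months each
toy_pbs <- tibble(
  Month      = yearmonth("2000 Jan") + rep(0:2, times = 4),
  Concession = rep(c("General", "Concessional"), each = 6),
  Type       = rep(rep(c("Co-payments", "Safety net"), each = 3), times = 2),
  Cost       = 1:12
) |>
  as_tsibble(index = Month, key = c(Concession, Type))

# Number of separate series before aggregation
n_keys(toy_pbs)

# summarise() on a tsibble sums over the keys within each month,
# leaving a single series that single-series plots can accept
total_cost <- toy_pbs |>
  summarise(Cost = sum(Cost))

total_cost
```

The trade-off is that the aggregated series hides the differences between the Co-payments and Safety net patterns described above.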

Barrels from us_gasoline

  • Can you spot any seasonality, cyclicity and trend? Consumption levels vary consistently throughout the year, with certain weeks or months showing higher or lower consumption.

  • What do you learn about the series? Consumption levels have not consistently increased or decreased over the observed period.

  • What can you say about the seasonal patterns? June and July had slightly higher consumption on average, probably because of more road trips in summer.

  • Can you identify any unusual years? From 2007 to 2012, US gasoline consumption was on the decline. This might be due to social changes that occurred during and after the Great Recession.

autoplot(us_gasoline, Barrels) +
  labs(title = "Autoplot: US Gasoline Consumption",
       y = "Barrels (Millions per Day)")

gg_season(us_gasoline, Barrels) +
  labs(title = "Seasonal Plot: US Gasoline Consumption")

gg_subseries(us_gasoline, Barrels) +
  labs(title = "Subseries Plot: US Gasoline Consumption")

gg_lag(us_gasoline, Barrels, lags = 12) +
  labs(title = "Lag Plot: US Gasoline Consumption")

us_gasoline |>
  ACF(Barrels) |>
  autoplot() + 
  labs(title = "ACF Plot: US Gasoline Consumption")