1. Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

Loading Libraries:

library(fpp3)
library(tidyverse)
library(reactable)
library(tsibble)
library(kableExtra)

Loading Datasets:

data("aus_production")
data("pelt")
data("gafa_stock")
data("vic_elec")
?aus_production
?pelt
?gafa_stock
?vic_elec

aus_production has a quarterly time interval.

pelt has a yearly time interval.

gafa_stock has a daily time interval, with observations only on trading days (Monday to Friday), so tsibble reports its interval as irregular.

vic_elec has a half-hourly time interval.
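
The intervals listed above can also be checked in code rather than read from the help pages. This is a minimal sketch, assuming the tsibble and tsibbledata packages (both attached by fpp3) are installed; it uses tsibble's interval() and is_regular() helpers.

```r
library(tsibble)
library(tsibbledata)  # provides aus_production, pelt, gafa_stock, vic_elec

# Print the interval tsibble has detected for each dataset
tsibble::interval(aus_production)
tsibble::interval(pelt)
tsibble::interval(gafa_stock)
tsibble::interval(vic_elec)

# gafa_stock only contains trading days, so it is stored as an irregular tsibble
regular <- c(
  aus_production = is_regular(aus_production),
  pelt = is_regular(pelt),
  gafa_stock = is_regular(gafa_stock),
  vic_elec = is_regular(vic_elec)
)
regular
```

The irregular flag on gafa_stock matches the [!] shown in its tsibble header.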

autoplot(aus_production, Bricks)

autoplot(pelt, Lynx)

autoplot(gafa_stock, Close)

autoplot(vic_elec, Demand)

autoplot(aus_production, Bricks) +
  labs(x="Year (in Quarters)", y="Brick Production (in millions)") +
  ggtitle("Quarterly Production of Bricks in Australia") +
  theme(plot.title=element_text(hjust=0.5))

autoplot(pelt, Lynx) +
  labs(x="Year (1845 - 1935)", y="Number of Canadian Lynx Pelts Traded") +
  ggtitle("Hudson Bay Company Trading Records of Canadian Lynx Furs") +
  theme(plot.title=element_text(hjust=0.5))

autoplot(gafa_stock, Close) +
  labs(x="Daily (Monday to Friday)", y="Stock Price") +
  ggtitle("Daily Gafa Stock Price") +
  theme(plot.title=element_text(hjust=0.5))

autoplot(vic_elec, Demand) +
  labs(x="Time (Every Half-Hour)", y="Total Electricity Demand (MWh)") +
  ggtitle("Electricity Demand in Victoria, Australia") +
  theme(plot.title=element_text(hjust=0.5))

2. Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

gafa_stock %>%
  group_by(Symbol) %>%
  filter(Close == max(Close)) %>%
  select(Symbol, Date, Close) %>%
  arrange(desc(Close))
## # A tsibble: 4 x 3 [!]
## # Key:       Symbol [4]
## # Groups:    Symbol [4]
##   Symbol Date       Close
##   <chr>  <date>     <dbl>
## 1 AMZN   2018-09-04 2040.
## 2 GOOG   2018-07-26 1268.
## 3 AAPL   2018-10-03  232.
## 4 FB     2018-07-25  218.
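
An alternative to filter(Close == max(Close)) is dplyr's slice_max(), which states the intent directly and makes the handling of ties explicit. The sketch below runs on a small made-up table (the symbols AAA and BBB and their prices are hypothetical, not taken from gafa_stock).

```r
library(dplyr)

# Hypothetical prices for two made-up symbols
prices <- tibble(
  Symbol = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  Date = as.Date(c("2018-01-02", "2018-01-03", "2018-01-04",
                   "2018-01-02", "2018-01-03", "2018-01-04")),
  Close = c(10, 12, 11, 50, 48, 49)
)

# One row per symbol: the day with the highest closing price
# (with_ties = TRUE would return every day that shares the maximum)
peaks <- prices |>
  group_by(Symbol) |>
  slice_max(Close, n = 1, with_ties = TRUE) |>
  ungroup()
peaks
```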

3. Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

a. You can read the data into R with the following script:

tute1 <- readr::read_csv("/Users/mohamedhassan/Downloads/tute1.csv")
reactable(tute1)

b. Convert the data to time series

mytimeseries <- tute1 |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)
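
The yearquarter() step matters because it tells tsibble that each row is one quarter rather than one calendar date. The sketch below shows the same conversion on a tiny hypothetical stand-in for tute1.csv (the Sales values are made up).

```r
library(dplyr)
library(tsibble)

# Hypothetical stand-in for tute1.csv: quarters stored as plain dates
toy <- tibble(
  Quarter = as.Date(c("1981-03-01", "1981-06-01", "1981-09-01", "1981-12-01")),
  Sales = c(1020.2, 889.2, 795.0, 1003.9)
)

# Convert the dates to a quarterly index, then declare it as the tsibble index
toy_ts <- toy |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)
toy_ts
```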

c. Construct time series plots of each of the three series

mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

Check what happens when you don’t include facet_grid().

mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
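  geom_line()

Without facet_grid(), all three series are drawn in a single panel on a shared y-axis, which can make the variation in the smaller-valued series harder to see.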

4. The USgas package contains data on the demand for natural gas in the US.

a. Install the USgas package.

#install.packages("USgas")

b. Create a tsibble from us_total with year as the index and state as the key.

library(USgas)
reactable(us_total)
# us_total is annual, so each (state, Year) pair identifies one row
new_us_gas <- us_total |>
  rename(Year = year) |>
  as_tsibble(
    key = state,
    index = Year
  )
reactable(new_us_gas)
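
A tsibble requires each key and index combination to appear only once, which is why the choice of table and index matters here. The sketch below uses a small hypothetical monthly table (the values are made up) to show how tsibble's is_duplicated() can flag the problem, and how aggregating to one row per state and year resolves it.

```r
library(dplyr)
library(tsibble)

# Hypothetical monthly consumption for two states
monthly <- tibble(
  state = rep(c("Maine", "Vermont"), each = 4),
  date = rep(as.Date(c("2019-01-01", "2019-02-01", "2020-01-01", "2020-02-01")), times = 2),
  y = c(10, 12, 11, 13, 5, 6, 7, 8)
)
with_year <- monthly |>
  mutate(Year = as.integer(format(date, "%Y")))

# Several months share the same (state, Year), so this is not yet a valid key/index pair
has_dups <- is_duplicated(with_year, key = state, index = Year)

# Aggregate to one row per state and year before building the tsibble
annual <- with_year |>
  group_by(state, Year) |>
  summarise(y = sum(y), .groups = "drop") |>
  as_tsibble(key = state, index = Year)
annual
```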

c. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).

new_us_gas |>
  filter(state %in% c("Maine", "Vermont", "New Hampshire", "Massachusetts", "Connecticut", "Rhode Island")) |>
  ggplot(aes(x = Year, y = y, color = state)) +
  geom_line() +
  facet_grid(state ~ ., scales = "free_y") +
  labs(x="Year", y="Natural Gas Consumption") + 
  ggtitle("Annual Natural Gas Consumption for States in the New England Area") +
  theme(plot.title=element_text(hjust=0.3)) 

5.

a. Download tourism.xlsx from the book website and read it into R using readxl::read_excel().

tourism_data <- readxl::read_excel("/Users/mohamedhassan/Downloads/tourism.xlsx")
kbl(head(tourism_data, n = 10)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), latex_options = "scale_down", full_width = FALSE)
Quarter     Region    State            Purpose   Trips
1998-01-01  Adelaide  South Australia  Business  135.0777
1998-04-01  Adelaide  South Australia  Business  109.9873
1998-07-01  Adelaide  South Australia  Business  166.0347
1998-10-01  Adelaide  South Australia  Business  127.1605
1999-01-01  Adelaide  South Australia  Business  137.4485
1999-04-01  Adelaide  South Australia  Business  199.9126
1999-07-01  Adelaide  South Australia  Business  169.3551
1999-10-01  Adelaide  South Australia  Business  134.3579
2000-01-01  Adelaide  South Australia  Business  154.0344
2000-04-01  Adelaide  South Australia  Business  168.7764

b. Create a tsibble which is identical to the tourism tsibble from the tsibble package.

# Tourism Tsibble
kbl(head(tourism, n = 10)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), latex_options = "scale_down", full_width = FALSE)
Quarter  Region    State            Purpose   Trips
1998 Q1  Adelaide  South Australia  Business  135.0777
1998 Q2  Adelaide  South Australia  Business  109.9873
1998 Q3  Adelaide  South Australia  Business  166.0347
1998 Q4  Adelaide  South Australia  Business  127.1605
1999 Q1  Adelaide  South Australia  Business  137.4485
1999 Q2  Adelaide  South Australia  Business  199.9126
1999 Q3  Adelaide  South Australia  Business  169.3551
1999 Q4  Adelaide  South Australia  Business  134.3579
2000 Q1  Adelaide  South Australia  Business  154.0344
2000 Q2  Adelaide  South Australia  Business  168.7764
# Parse Quarter as a yearquarter; this plain tibble is reused in parts c and d
tourism_data2 <- tourism_data %>%
  mutate(Quarter = yearquarter(Quarter))

# Region, State and Purpose together identify each series, so they form the key
tourism_tsibble <- tourism_data2 %>%
  as_tsibble(key = c(Region, State, Purpose), index = Quarter)
kbl(head(tourism_tsibble, n = 10)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), latex_options = "scale_down", full_width = FALSE)
Quarter  Region    State            Purpose   Trips
1998 Q1  Adelaide  South Australia  Business  135.0777
1998 Q2  Adelaide  South Australia  Business  109.9873
1998 Q3  Adelaide  South Australia  Business  166.0347
1998 Q4  Adelaide  South Australia  Business  127.1605
1999 Q1  Adelaide  South Australia  Business  137.4485
1999 Q2  Adelaide  South Australia  Business  199.9126
1999 Q3  Adelaide  South Australia  Business  169.3551
1999 Q4  Adelaide  South Australia  Business  134.3579
2000 Q1  Adelaide  South Australia  Business  154.0344
2000 Q2  Adelaide  South Australia  Business  168.7764

c. Find what combination of Region and Purpose had the maximum number of overnight trips on average.

average_max_trips <- tourism_data2 |>
  group_by(Region, Purpose) |>
  summarize(Overnight_Trips = round(mean(Trips), 2))|>
  ungroup() |>
  filter(Overnight_Trips == max(Overnight_Trips))
reactable(average_max_trips)
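
One detail worth knowing for this kind of average: on a tsibble, summarise() keeps the time index, so grouping by Region and Purpose alone still returns one row per quarter. Averaging over all quarters requires dropping the tsibble structure first. The sketch below illustrates this on a hypothetical two-region table (regions A and B and their trip counts are made up).

```r
library(dplyr)
library(tsibble)

# Hypothetical trips for two made-up regions over two quarters
trips <- tibble(
  Quarter = yearquarter(rep(c("2000 Q1", "2000 Q2"), times = 2)),
  Region = rep(c("A", "B"), each = 2),
  Trips = c(10, 20, 30, 50)
) |>
  as_tsibble(key = Region, index = Quarter)

# On a tsibble, summarise() keeps the index: still one row per region and quarter
per_quarter <- trips |>
  group_by(Region) |>
  summarise(Avg = mean(Trips))

# Converting to a tibble first averages over all quarters
overall <- trips |>
  as_tibble() |>
  group_by(Region) |>
  summarise(Avg = mean(Trips)) |>
  slice_max(Avg, n = 1)
overall
```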

d. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

total_trips_state <- tourism_data2 |>
  select(Quarter, State, Trips) |>
  group_by(Quarter, State) |>
  summarize(Total_Trips = round(sum(Trips), 2), .groups = "drop") |>
  as_tsibble(
    key = State,
    index = Quarter
  )
kbl(head(total_trips_state, n = 10)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), latex_options = "scale_down", full_width = FALSE)
Quarter  State  Total_Trips
1998 Q1  ACT    551.00
1998 Q2  ACT    416.03
1998 Q3  ACT    436.03
1998 Q4  ACT    449.80
1999 Q1  ACT    378.57
1999 Q2  ACT    558.18
1999 Q3  ACT    448.90
1999 Q4  ACT    594.83
2000 Q1  ACT    599.67
2000 Q2  ACT    557.14
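
The same totals can also be computed directly on a tsibble, because summarise() on a tsibble aggregates within each time point and turns the grouping variables into the new key. This is a sketch using the tourism tsibble that ships with the tsibble package, not the Excel file read above, and without the rounding step.

```r
library(dplyr)
library(tsibble)  # ships the tourism tsibble

# Sum trips over Region and Purpose; Quarter is kept as the index automatically
state_totals <- tourism |>
  group_by(State) |>
  summarise(Total_Trips = sum(Trips))
state_totals
```

The result is already a tsibble keyed by State, so no as_tsibble() call is needed afterwards.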

8. Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.

Total Private Employed

total_private <- us_employment |>
  filter(Title == "Total Private")
kbl(head(total_private, n = 10)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), latex_options = "scale_down", full_width = FALSE)
Month     Series_ID      Title          Employed
1939 Jan  CEU0500000001  Total Private  25338
1939 Feb  CEU0500000001  Total Private  25447
1939 Mar  CEU0500000001  Total Private  25833
1939 Apr  CEU0500000001  Total Private  25801
1939 May  CEU0500000001  Total Private  26113
1939 Jun  CEU0500000001  Total Private  26485
1939 Jul  CEU0500000001  Total Private  26481
1939 Aug  CEU0500000001  Total Private  26848
1939 Sep  CEU0500000001  Total Private  27468
1939 Oct  CEU0500000001  Total Private  27830
autoplot(total_private, Employed)

gg_season(total_private, Employed, labels = "both")

gg_subseries(total_private, Employed)

gg_lag(total_private, Employed)

ACF(total_private, Employed) |>
  autoplot()

  • Can you spot any seasonality, cyclicity and trend?

There is seasonality: employment appears to be highest during the typical warm-weather months (May to September). The series also trends upward over the long run.

  • What do you learn about the series?

The number of people employed by the private sector in the U.S. has followed a generally upward trajectory. The growth from 1940 to 1980 is roughly the same as the growth from 1980 to 2020, with slightly more in the later period.

  • What can you say about the seasonal patterns?

The number of employed typically peaks during the Spring and Summer months, while the number employed during the Fall and Winter months appears to lag behind.

  • Can you identify any unusual years?

There was a decline in the number of employed around 2008, which coincides with the global recession that took place at that time. U.S. employment began to increase again starting in 2010.
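
To make the link between seasonality and the ACF plot concrete, the sketch below builds a hypothetical monthly series (a pure 12-month sine wave, not the employment data) and computes its autocorrelations with feasts' ACF(). A seasonal series shows positive autocorrelation at the seasonal lag and negative autocorrelation half a season away.

```r
library(dplyr)
library(tsibble)
library(feasts)

# Hypothetical monthly series: a pure 12-month sine wave, 10 years long
toy <- tibble(
  Month = yearmonth("2000 Jan") + 0:119,
  value = sin(2 * pi * (0:119) / 12)
) |>
  as_tsibble(index = Month)

# Autocorrelations for the first two years of lags
toy_acf <- toy |>
  ACF(value, lag_max = 24)

# Rows are ordered by increasing lag
r <- toy_acf$acf
r[12] > 0  # strong positive correlation one full season apart
r[6] < 0   # negative correlation half a season apart
```

Calling autoplot(toy_acf) draws the same values in the style of the ACF plots used throughout this exercise.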

Bricks From Aus_Production

autoplot(aus_production, Bricks)

gg_season(aus_production, Bricks, labels = "both")

gg_subseries(aus_production, Bricks)

gg_lag(aus_production, Bricks, geom = "point")

ACF(aus_production, Bricks) |>
  autoplot()

  • Can you spot any seasonality, cyclicity and trend?

Brick production is lowest in Q1.

  • What do you learn about the series?

Brick production peaked in 1980 and has not sustained that level since.

  • What can you say about the seasonal patterns?

Brick production is lowest in Q1, while production in Q3 is slightly higher than in Q2 and Q4.

  • Can you identify any unusual years?

After the peak in 1980, Brick production declined significantly a few years later. It rebounded sharply not long after, with 1989 nearly matching 1980 production levels. Since then, however, Brick production has not recovered and has failed to exceed 500 million.

Hare From Pelt

autoplot(pelt, Hare)

# gg_season() produced an error, most likely because the data are annual and have no within-year seasonal period to plot
# gg_season(pelt, Hare, period = "year")
gg_subseries(pelt, Hare)

gg_lag(pelt, Hare, geom = "point")

ACF(pelt, Hare) |>
  autoplot()

  • Can you spot any seasonality, cyclicity and trend?

It appears that the number of Snowshoe Hare pelts traded follows a roughly 10-year cycle, with the ACF plot reflecting the repeated rises and falls in trading.

  • What do you learn about the series?

Hare pelts were heavily traded during the 1800s, and trading declined from the late 19th to the early 20th century.

  • What can you say about the seasonal patterns?

Since the data are annual, no within-year seasonal pattern can be observed.

  • Can you identify any unusual years?

Trade of Hare pelts reached its peak a little after 1860 and never returned to that level.

“H02” Cost from PBS

data("PBS")
Cost_PBS <- PBS |>
  filter(ATC2 == "H02")
kbl(head(Cost_PBS, n = 10)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), latex_options = "scale_down", full_width = FALSE)
Month Concession Type ATC1 ATC1_desc ATC2 ATC2_desc Scripts Cost
1991 Jul Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 63261 317384
1991 Aug Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 53528 269891
1991 Sep Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 52822 269703
1991 Oct Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 54016 280418
1991 Nov Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 49281 268070
1991 Dec Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 51798 277139
1992 Jan Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 42436 221772
1992 Feb Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 52913 272345
1992 Mar Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 62908 325700
1992 Apr Concessional Co-payments H Systemic hormonal preparations, excl. sex hormones and insulins H02 CORTICOSTEROIDS FOR SYSTEMIC USE 68499 349271
autoplot(Cost_PBS, Cost)

gg_season(Cost_PBS, Cost)

gg_subseries(Cost_PBS, Cost)

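# gg_lag() produced an error here, most likely because Cost_PBS still contains several series (one per Concession and Type combination) and gg_lag() expects a single series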
# gg_lag(Cost_PBS, Cost)
ACF(Cost_PBS, Cost) |>
  autoplot()
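
One possible workaround for the gg_lag() error, assuming it is caused by Cost_PBS holding more than one series, is to collapse the H02 series into a single total before plotting. The sketch below only builds that single series; the name h02_total is introduced here for illustration.

```r
library(dplyr)
library(tsibble)
library(tsibbledata)  # provides PBS

# Sum Cost across all Concession and Type combinations for ATC2 == "H02";
# summarise() on a tsibble keeps Month as the index and drops the key
h02_total <- PBS |>
  filter(ATC2 == "H02") |>
  summarise(Cost = sum(Cost))
h02_total
```

With a single series in hand, a call such as gg_lag(h02_total, Cost, geom = "point") should then be possible. Filtering Cost_PBS down to one Concession and Type combination would be another option if the individual series are of interest.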

  • Can you spot any seasonality, cyclicity and trend?

The Concessional Safety Net and General Safety Net have lower costs from February to around June/July, while Concessional Co-payments have higher costs during the same months.

  • What do you learn about the series?

General Co-payments appear to have the most stable costs year-round. The other forms of prescription insurance have costs that vary throughout the year.

Additionally, the increased and decreased costs associated with General Safety Net and Concessional Safety Net appear to follow a 12-month pattern.

  • What can you say about the seasonal patterns?

Concessional Safety Net and General Safety Net appear to follow the same pattern of lower costs between February and June/July, and increased costs in the subsequent months. Concessional Co-payments appear to have the opposite pattern, with higher costs from February to July, and decreased costs in the subsequent months. There doesn’t appear to be any discernible pattern in costs associated with General Co-payments.

  • Can you identify any unusual years?

There doesn’t appear to be a specific year where there was an unusual increase or decrease in costs.

Barrels from us_gasoline

data("us_gasoline")
autoplot(us_gasoline, Barrels)

gg_season(us_gasoline, Barrels, labels = "both")

gg_subseries(us_gasoline, Barrels)

gg_lag(us_gasoline, Barrels, geom = "point")

ACF(us_gasoline, Barrels) |>
  autoplot()

  • Can you spot any seasonality, cyclicity and trend?

Production of gasoline appears to increase during the Spring and Summer, and decrease during the Fall and Winter.

  • What do you learn about the series?

The number of barrels produced steadily increased from 1991 to around 2007, before declining around 2008-2009, which coincides with the global recession that occurred. However, the number of barrels produced has since recovered, reaching a peak that slightly exceeds the peak prior to the decline.

  • What can you say about the seasonal patterns?

Production tends to pick up during the warm weather months in the Spring and Summer, and lag during cold weather months in the Fall and Winter.

  • Can you identify any unusual years?

After a mostly upward trajectory during the 1990s, there was a significant drop in production around 2001-2002, which may have coincided with the September 11th attacks in the U.S. and subsequent war in Afghanistan.