Time Series

Lab 1

1.

Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

“Use ? (or help()) to find out about the data in each series.” “What is the time interval of each series?” “Use autoplot() to produce a time plot of each series.” “For the last plot, modify the axis labels and”title.”

Assuming the packages have been installed, we can start the libraries:

library(fpp3)
## Warning: package 'fpp3' was built under R version 4.4.2
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble      3.2.1     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.1
## ✔ lubridate   1.9.3     ✔ fable       0.4.1
## ✔ ggplot2     3.5.1
## Warning: package 'tsibble' was built under R version 4.4.2
## Warning: package 'tsibbledata' was built under R version 4.4.2
## Warning: package 'feasts' was built under R version 4.4.2
## Warning: package 'fabletools' was built under R version 4.4.2
## Warning: package 'fable' was built under R version 4.4.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
library(tsibble)
library(dplyr)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0     ✔ readr   2.1.5
## ✔ purrr   1.0.2     ✔ stringr 1.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()     masks stats::filter()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag()        masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(ggplot2)

Now we take a look at each dataset

Bricks from aus_production Dataset

use “?” to get information on each daset:

?aus_production
## starting httpd help server ... done
head(aus_production)

Description Quarterly estimates of selected indicators of manufacturing production in Australia.

Format Time series of class tsibble.

Details aus_production is a half-hourly tsibble with six values:

Beer: Beer production in megalitres. Tobacco: Tobacco and cigarette production in tonnes. Bricks: Clay brick production in millions of bricks. Cement: Portland cement production in thousands of tonnes. Electricity: Electricity production in gigawatt hours. Gas: Gas production in petajoules.

Source Australian Bureau of Statistics, catalogue number 8301.0.55.001 table 1.

For the time series we can focus only on the columns Bricks and Quarter

#Filter for Bricks time series
time_bricks <- aus_production %>% filter(!is.na(Bricks)) %>% select(Quarter, Bricks)

dim(time_bricks)
## [1] 198   2
head(time_bricks)

In this case we see that after filtering any missing values we have 198 entries for the 2 columns.

#Time interval
min(time_bricks$Quarter)
## <yearquarter[1]>
## [1] "1956 Q1"
## # Year starts on: January
max(time_bricks$Quarter)
## <yearquarter[1]>
## [1] "2005 Q2"
## # Year starts on: January

The time interval for aus_productions is Quarterly, and goes from Q1 of 1956 to Q2 of 2005.

# using auto plot, but adding labels for clarity
autoplot(time_bricks, Bricks) +
  ggtitle("Production of Bricks in Australia") +
  ylab("Bricks (in Millions)") +
  xlab("Year")

The quarterly time series graph seems to show seasonality, and a long term trend of growth until the early 80s, followed by decline until year 2000.

Lynx from Pelt Dataset

?pelt
head(pelt)

Description Hudson Bay Company trading records for Snowshoe Hare and Canadian Lynx furs from 1845 to 1935. This data contains trade records for all areas of the company.

Format Time series of class tsibble

Details pelt is an annual tsibble with two values:

Hare: The number of Snowshoe Hare pelts traded. Lynx: The number of Canadian Lynx pelts traded.

Source Hudson Bay Company

For the time series we can focus only on the columns Year and Lynx

#Filter for Lynx time series
time_lynx <- pelt %>% filter(!is.na(Lynx)) %>% select(Year, Lynx)

dim(time_lynx)
## [1] 91  2
head(time_lynx)

In this case we see that after filtering any missing values we have 91 entries for the 2 columns.

#Time interval for pelts
min(time_lynx$Year)
## [1] 1845
max(time_lynx$Year)
## [1] 1935

The time interval for pelts yearly, and goes from 1845 to 1935

# using auto plot, but adding labels for clarity
autoplot(time_lynx, Lynx) +
  ggtitle("Canadian Lynx Pelts Traded by Hudson Company 1845-1935") +
  ylab("Lynx Pelts") +
  xlab("Year")

The annual time series graph seems to show a cyclic behavior, perhaps indicating population cycle changes.

Close from gafa_stock Dataset

?gafa_stock
head(gafa_stock)

Description Historical stock prices from 2014-2018 for Google, Amazon, Facebook and Apple. All prices are in $USD.

Format Time series of class tsibble

Details gafa_stock is a tsibble containing data on irregular trading days:

Open: The opening price for the stock. High: The stock’s highest trading price. Low: The stock’s lowest trading price. Close: The closing price for the stock. Adj_Close: The adjusted closing price for the stock. Volume: The amount of stock traded.

Each stock is uniquely identified by one key:

Symbol: The ticker symbol for the stock.

Source Yahoo Finance historical data

For the time series we can focus only on the columns Date and Close

#Filter for gafa_stock time series
time_close <- gafa_stock %>% filter(!is.na(Close)) %>% select(Date, Close)

dim(time_close)
## [1] 5032    3
head(time_close)

In this case we see that after filtering any missing values we have 5032 entries for the 2 columns.

#Time interval for pelts
min(time_close$Date)
## [1] "2014-01-02"
max(time_close$Date)
## [1] "2018-12-31"

The time interval for gafa_stock is weekdays (business days) and goes from January 2nd 2014 to December 31st 2018

# using auto plot, but adding labels for clarity
autoplot(time_close, Close) +
  ggtitle("Historical Stock Prices 2014-2018: Google, Amazon, Facebook and Apple") +
  ylab("Closing Price ($USD)") +
  xlab("Year")

The daily time series graph seems to show some upwards trends for all stocks but perhaps more evident for Amazon and Google.

Demand from vic_elec Dataset

?vic_elec
head(vic_elec)

Description vic_elec is a half-hourly tsibble with three values:

Demand: Total electricity demand in MWh. Temperature: Temperature of Melbourne (BOM site 086071). Holiday: Indicator for if that day is a public holiday.

Format Time series of class tsibble.

Details This data is for operational demand, which is the demand met by local scheduled generating units, semi-scheduled generating units, and non-scheduled intermittent generating units of aggregate capacity larger than 30 MWh, and by generation imports to the region. The operational demand excludes the demand met by non-scheduled non-intermittent generating units, non-scheduled intermittent generating units of aggregate capacity smaller than 30 MWh, exempt generation (e.g. rooftop solar, gas tri-generation, very small wind farms, etc), and demand of local scheduled loads. It also excludes some very large industrial users (such as mines or smelters).

Source Australian Energy Market Operator.

This dataset has 2 time-related columns, but since Time includes the date, we can focus on that one, plus the column Demand

#Filter for vic_elec time series
time_demand <- vic_elec %>% filter(!is.na(Demand)) %>% select(Time, Demand)

dim(time_demand)
## [1] 52608     2
head(time_demand)

In this case we see that after filtering any missing values we have 5032 entries for the 2 columns.

#Time interval for pelts
min(time_demand$Time)
## [1] "2012-01-01 AEDT"
max(time_demand$Time)
## [1] "2014-12-31 23:30:00 AEDT"

The time interval for Demand is half-hourly and goes from January 1nd 2012 at midnight to December 31st 2014 at 11:30pm

# using auto plot, but adding labels for clarity
autoplot(time_demand, Demand) +
  ggtitle("Electricity Demand in Melbourne 2012-2014") +
  ylab("Demand (MWh)") +
  xlab("Time")

The half-hourly time series graph shows seasonality, with high frequency fluctuation.

2.

Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

If the idea is to find the peak prices I like to use slice_max, as I was thought that it is optimized for performance in larger datasets.

peak_prices <- gafa_stock %>%
  group_by(Symbol) %>%               # With this step we keep all the coming up steps independent
                                       # for each group of symbols (AAPL, AMZN, FB, GOOG)
  slice_max(Close, n = 1, with_ties = FALSE) %>%  # If we change with_ties from FALSE to TRUE then we can see if there are 
                                                  # repeated Max values (PS: there are not)
    select(Date, Close, Symbol)                         

peak_prices

We find that the price for all 4 stocks peaked in 2018. More specifically, Apple on October 3d 2018; Amazon on September 4th 2018; Facebook on July 25th 2018; and Google on July 26th 2018.

3.

Having downloaded the tute1.csv file from the book website, we proceed to upload it to GitHub to ensure easy reproducibility.

That replaces step a)

tute1 <- read_csv("https://raw.githubusercontent.com/Lfirenzeg/msds624labs/refs/heads/main/tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (3): Sales, AdBudget, GDP
## date (1): Quarter
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(tute1)

step b) We use the code provided in the lab to convert the Quarter column into a properly formatted quarterly time series

tute1_ts <- tute1 %>% 
  mutate(Quarter = yearquarter(Quarter)) %>% #The yearquarter() converts the Quarter column into a properly quarterly date                                                 #format
  as_tsibble(index = Quarter)  #converting the dataset into time series tibble, with Quarter column serving as the time index

head(tute1_ts)

step c) Using the code provided to create faceted line plot

tute1_ts |>
  pivot_longer(-Quarter) |> #pivoting all columns except for Quarter, changing the table from a "wide" to "long" format
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +     #adding line plot for the time series
  facet_grid(name ~ ., scales = "free_y")  #create facets of the plot, specifying based on the name column,

                                          #and "free_y adds a y-axis scale for each

And it asks to compare the same code when facet_grid() is not used:

tute1_ts |>
  pivot_longer(-Quarter) |>   #pivoting all columns except for Quarter, changing the table from a "wide" to "long" format
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() #adding line plot for the time series

Without facet_grid(name ~ ., scales = “free_y”), the resulting plot incorporates all the facets into a single y-axis, making it more difficult to visualize variability. For example, with its own y-axis, the GDP line makes more evident the range in change; while without the y-axis, its adjusted to the scale of the other 2 lines, seemingly looking flatter.

4.

The USgas package contains data on the demand for natural gas in the US. Install the USgas package. Create a tsibble from us_total with year as the index and state as the key. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island)

  1. To install the USgas package we run: install.packages(“USgas”) once. I’m not adding it to the chunk to avoid any warning messages about it already being installed when I run all the chunks
# Load the package
library(USgas)
## Warning: package 'USgas' was built under R version 4.4.2
us_total <- us_total %>%
  rename(Gas_Consumption = y)
head(us_total)
  1. Creating a time series using year as the index, and state as the key
us_total_ts <- us_total %>%
  as_tsibble(index = year, key = state)
head(us_total_ts)
  1. New England area comprises:Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island. So we filter for those states and then we can start plotting.
# defining a new vector called new_england
new_england <- c("Maine", "Vermont", "New Hampshire", 
                        "Massachusetts", "Connecticut", "Rhode Island")

gas_neweng <- us_total_ts %>%
  filter(state %in% new_england)  #look in the state column for the values specified in vector new_england, then save the   #                                   #results in gas_neweng 

Plotting time using ggplot2

# using facets plot for each state
gas_neweng %>%
  ggplot(aes(x = year, y = Gas_Consumption, colour = state)) +
  geom_line() +
  facet_wrap(~ state, scales = "free_y") +
  ggtitle("Natural Gas Consumption in New England by Year") +
  ylab("Gas Consumption (Billion Cubic Feet)") +
  xlab("Year") +
  theme_minimal()

In the plots we can see that Connecticut, Massachusetts and Vermont have a steady increase in gas consumption over the years. Maine and New Hampshire have more fluctuation, with more marked decline periods. Also, Connecticut and Massachusetts consume the largets amounts of natural gas, while Vermont consumes the least (despite growing trend).

5.

Download tourism.xlsx from the book website and read it into R using readxl::read_excel(). Create a tsibble which is identical to the tourism tsibble from the tsibble package. Find what combination of Region and Purpose had the maximum number of overnight trips on average. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

After installing :install.packages(“readxl”), we can run:

library(readxl)
## Warning: package 'readxl' was built under R version 4.4.2
# URL in book
url <- "https://bit.ly/fpptourism"

# download file
download.file(url, destfile = "tourism.xlsx", mode = "wb")

# read file
tourism <- read_excel("tourism.xlsx")

# View the first few rows
head(tourism)
  1. We had already loaded library tsibble, otherwise we would need to include it.
# Converting tourism to a time series
tourism_ts <- tourism %>%
  mutate(Quarter = yearquarter(Quarter)) %>%  # Ensures Quarter has the proper format
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))

head(tourism_ts)
max_on_avg_trips <- tourism_ts %>%
  arrange(Region, Purpose, Quarter) %>%    # to ensure proper temporal ordering
  group_by(Region, Purpose) %>%            # grouping by Region and Purpose
  summarise(avg_trips = mean(Trips, na.rm = TRUE), .groups = "drop") %>%  # to calculate the average amount of trips
  arrange(desc(avg_trips))                 # arrange by descending order of avg trips
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Region`, `Purpose`, `Quarter` first.
head(max_on_avg_trips)

We find that Melbourne, with a purpose of visiting, had the highest average of overnight trips (985), for 2017 Q4.

  1. Time series combining Purpose and Region, with just total trips by State

To combine both Purpose and Region the data is aggregated solely by State and Quarter.

# creating a new tsibble combining Purposes and Regions
st_trips_ts <- tourism_ts %>%
  as_tibble() %>%                           # convert to a tibble to allow flexible grouping
                     # this removes the tsibble structure temporarily, allowing us to group by Quarter and State 
  mutate(Quarter = yearquarter(Quarter)) %>%  # we ensure again Quarter is in yearquarter format
  group_by(State, Quarter) %>%    #grouping by State and Quarter
  summarise(total_trips = sum(Trips, na.rm = TRUE), .groups = "drop") %>%  # adding up the total trips
  arrange(desc(total_trips))                # total_trips in descending order

head (st_trips_ts)

However, if we want to see total by States (without including quarter) we can update the code to:

# creating a new tsibble combining Purposes and Regions
st_trips_ts <- tourism_ts %>%
  as_tibble() %>%                           # convert to a tibble to allow flexible grouping
                     # this removes the tsibble structure temporarily, allowing us to group by Quarter and State 
  mutate(Quarter = yearquarter(Quarter)) %>%  # we ensure again Quarter is in yearquarter format
  group_by(State) %>%    #grouping only by State
  summarise(total_trips = sum(Trips, na.rm = TRUE), .groups = "drop") %>%  # adding up the total trips
  arrange(desc(total_trips))                # total_trips in descending order

head (st_trips_ts)

But by not having the Quarter object then it couldn’t be worked as a time series anymore.

8.

Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.

a)Can you spot any seasonality, cyclicity and trend? b)What do you learn about the series? c)What can you say about the seasonal patterns? d)Can you identify any unusual years?

“Total Private” Employed from us_employment

# to filter for "Total Private" employment data
us_total_private <- us_employment %>%
  filter(Title == "Total Private") %>%
  select(Month, Employed) %>%
  as_tsibble(index = Month)

Let’s create the plots first

autoplot(us_total_private, Employed) +
  ggtitle("Total Private Employment in the U.S.") +
  xlab("Year") +
  ylab("Employed (in thousands)")

Seasonality

gg_season(us_total_private, Employed, period = "year") +
  ggtitle("Seasonal Plot of Total Private Employment") +
  xlab("Month") +
  ylab("Employed (in thousands)")

Subseries

gg_subseries(us_total_private, Employed) +
  ggtitle("Subseries Plot of Total Private Employment") +
  xlab("Month") +
  ylab("Employed (in thousands)")

Lag Plot

gg_lag(us_total_private, Employed) +
  ggtitle("Lag Plot of Total Private Employment") +
  xlab("Lag") +
  ylab("Employed (in thousands)")

Autocorrelation

us_total_private %>%
  ACF(Employed) %>%
  autoplot() +
  ggtitle("Autocorrelation Function for Total Private Employment") +
  xlab("Lag") +
  ylab("ACF")

  1. autoplot() shows a clear long-term upward trend in private employment, with dips during specific periods like 2008. There seems to be weak seasonality, with changes in a year not pronounced, but spring and summer seem to see an increase in employment. Also, the ACF shows strong correlation at all lag.

  2. The series is dominated by a strong upward trend, and employment is persistent, meaning past values have a strong influence on future values.

  3. Employment tends to peak during summer months and decline slightly at the start of the year.

  4. Perhaps the 2008 dip can be related to the financial crisis.

Bricks from aus_production

organize and filter data

bricks_ts <- aus_production %>%
  filter(!is.na(Bricks)) %>%  # Remove rows with missing values in Bricks
  select(Quarter, Bricks) %>%
  as_tsibble(index = Quarter)

now for plots:

# Plot the time series
autoplot(bricks_ts, Bricks) +
  ggtitle("Clay Brick Production in Australia") +
  xlab("Year") +
  ylab("Bricks (in millions)")

# Seasonal plot
gg_season(bricks_ts, Bricks, period = "year") +
  ggtitle("Seasonal Plot of Brick Production") +
  xlab("Quarter") +
  ylab("Bricks (in millions)")

# Subseries plot
gg_subseries(bricks_ts, Bricks) +
  ggtitle("Subseries Plot of Brick Production") +
  xlab("Quarter") +
  ylab("Bricks (in millions)")

# Lag plot
gg_lag(bricks_ts, Bricks) +
  ggtitle("Lag Plot of Brick Production") +
  xlab("Lag") +
  ylab("Bricks (in millions)")

# ACF plot
bricks_ts %>%
  ACF(Bricks) %>%
  autoplot() +
  ggtitle("Autocorrelation Function for Brick Production") +
  xlab("Lag") +
  ylab("ACF")

a) autoplot() shows a long-term growth trend in brick production from 1956 to around the 1980s, with a decline after that, stabilizing at a lower level in recent decades. Also, there seem to be strong seasonal patterns, with brick production peaks in Q2 and Q3 (spring/summer months), likely related to construction activity; and lower Production in Q1 and Q4 (winter months). ACF plot shows significant cyclic patterns with autocorrelation spikes at lag-4.

  1. Brick production seemed to grow steadily until the early 1980s (perhaps due to expanding construction demands). And Brick production seems to be strongly influenced by seasonal factors, with higher production during warmer months.

c)The seasonal pattern is consistent, with production at its highest in Q2 and Q3. Meanwhile Q1 is lower.

  1. There was a sharp decline in production in the mid-1980s that be considered unusual compared to the previous upward trend. This could be related to an economic recession or changes in the construction industry.

Hare from pelt

Again, we first filter and organize the date

hare_ts <- pelt %>%
  filter(!is.na(Hare)) %>%  # Remove rows with missing values in Hare
  select(Year, Hare) %>%
  as_tsibble(index = Year)

now we plot

# Plot the time series
autoplot(hare_ts, Hare) +
  ggtitle("Canadian Hare Pelts Traded") +
  xlab("Year") +
  ylab("Hare Pelts (in thousands)")

# Subseries plot (although for annual data, this may not reveal much)
gg_subseries(hare_ts, Hare) +
  ggtitle("Subseries Plot of Hare Pelt Trade") +
  xlab("Year") +
  ylab("Hare Pelts (in thousands)")

# Lag plot
gg_lag(hare_ts, Hare) +
  ggtitle("Lag Plot of Hare Pelt Trade") +
  xlab("Lag") +
  ylab("Hare Pelts (in thousands)")

# ACF plot
hare_ts %>%
  ACF(Hare) %>%
  autoplot() +
  ggtitle("Autocorrelation Function for Hare Pelt Trade") +
  xlab("Lag") +
  ylab("ACF")

  1. auplot doesnt show if there is clear long-term upward or downward trend in hare pelt tradin, with the data fluctuating significantly, perhaps population dynamics or economic factors. We can see multi-year cycles, that are consistent and pronounced. Since the dataset is annual, seasonality is not directly applicable. Instead, the focus is on cyclic behavior.

  2. This specific series is dominated by cyclic patterns rather than long-term trends.Significant spikes and valleys are visible, indicating periods of high and low population levels or hunting activity.

  3. Seasonal patterns are not evident since the dataset is annual.

  4. There seem to be extreme peaks around 1860, 1880, and 1920 stand out as periods of unusually high hare pelts traded.

“H02” Cost from PBS

PBS %>% 
  filter(ATC2 == "H02") %>% 
  autoplot(Cost) +
  ggtitle("H02 Drug Cost Over Time") +
  xlab("Year") +
  ylab("Cost ($AUD)")

PBS %>% 
  filter(ATC2 == "H02") %>% 
  gg_season(Cost, period = "year") +
  ggtitle("Seasonal Plot of H02 Drug Cost") +
  xlab("Month") +
  ylab("Cost ($AUD)")

PBS %>% 
  filter(ATC2 == "H02") %>% 
  gg_subseries(Cost) +
  ggtitle("Subseries Plot of H02 Drug Cost") +
  xlab("Month") +
  ylab("Cost ($AUD)")

PBS %>% 
  filter(ATC2 == "H02") %>% 
  ACF(Cost) %>% 
  autoplot() +
  ggtitle("Autocorrelation Function for H02 Drug Cost") +
  xlab("Lag") +
  ylab("ACF")

a) There is clear upward trend in costs over time, indicating increasing drug expenditures. Also costs appear to rise consistently, with a slightly steeper increase in recent years. There is strong seasonal behavior, with peaks occurring annually, with higher costs concentrated around the middle of the year.

  1. It looks like the costs are highest in the middle months (April–June) and lowest at the start and end of the year. Seasonal patterns are consistent across years but vary slightly in amplitude.

  2. Monthly distributions shows strong seasonality within each category. “Concessional/Co-payments” shows the largest variance and highest peaks. We also can cyclic behavior.

  3. No drastic anomalies for any year, but increasing variance in recent years can be related with harder to predict behavior?

Barrels from us_gasoline

us_gasoline %>%
  autoplot(Barrels) +
  ggtitle("Gasoline Production in the U.S.") +
  xlab("Year") +
  ylab("Barrels (in thousands)")

us_gasoline %>%
  gg_season(Barrels, period = "year") +
  ggtitle("Seasonal Plot of U.S. Gasoline Production") +
  xlab("Month") +
  ylab("Barrels (in thousands)")

us_gasoline %>%
  gg_subseries(Barrels) +
  ggtitle("Subseries Plot of U.S. Gasoline Production") +
  xlab("Month") +
  ylab("Barrels (in thousands)")

us_gasoline %>%
  gg_lag(Barrels) +
  ggtitle("Lag Plot of U.S. Gasoline Production") +
  xlab("Lag") +
  ylab("Barrels (in thousands)")

us_gasoline %>%
  ACF(Barrels) %>%
  autoplot() +
  ggtitle("Autocorrelation Function for U.S. Gasoline Production") +
  xlab("Lag") +
  ylab("ACF")

  1. The seasonal plot shows clear seasonal variations in gasoline production. Production tends to peak in the summer months and drops in the winter months, which is typical for higher gasoline demand during travel seasons. Also, there are longer-term cycles evident in the data. For example, there are multi-year periods of growth followed by slight declines, particularly visible in the early 2000s and the 2008 financial crisis. The overall trend shows an increase in gasoline production from 1990 to the early 2000s.

  2. We learn that there was a steady increase in production for about two decades, with periods of decline (such as 2008), suggesting sensitivity to economic conditions. The strong seasonal pattern reflects demand influenced by weather.

  3. The seasonal plot highlights a consistent pattern of higher production during summer (travel season) and lower production in winter months.

  4. 2008–2009: The global financial crisis led to a visible drop in production, reflecting reduced demand and economic slowdown.