library(fpp3)
library(tsibbledata)
library(tidyverse)
Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.
# Looking at the datasets
?aus_production #Quarterly estimates of selected indicators of manufacturing production in Australia.
?pelt #Hudson Bay Company trading records for Snowshoe Hare and Canadian Lynx furs from 1845 to 1935. This data contains trade records for all areas of the company.
?gafa_stock #Historical stock prices from 2014-2018 for Google, Amazon, Facebook and Apple.
?vic_elec #vic_elec is a half-hourly tsibble with three values: Demand, Temperature, Holiday
#checking out the data
frequency(aus_production) #quarterly interval
## [1] 4
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1956 Q1 284 5225 189 465 3923 5
## 2 1956 Q2 213 5178 204 532 4436 6
## 3 1956 Q3 227 5297 208 561 4806 7
## 4 1956 Q4 308 5681 197 570 4418 6
## 5 1957 Q1 262 5577 187 529 4339 5
## 6 1957 Q2 228 5651 214 604 4811 7
frequency(pelt) #yearly interval
## [1] 1
head(pelt)
## # A tsibble: 6 x 3 [1Y]
## Year Hare Lynx
## <dbl> <dbl> <dbl>
## 1 1845 19580 30090
## 2 1846 19600 45150
## 3 1847 19610 49150
## 4 1848 11990 39520
## 5 1849 28040 21230
## 6 1850 58000 8420
#frequency(gafa_stock) #left commented out: gafa_stock covers trading days only, so its interval is irregular
glimpse(gafa_stock)
## Rows: 5,032
## Columns: 8
## Key: Symbol [4]
## $ Symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAP…
## $ Date <date> 2014-01-02, 2014-01-03, 2014-01-06, 2014-01-07, 2014-01-08,…
## $ Open <dbl> 79.38286, 78.98000, 76.77857, 77.76000, 76.97285, 78.11429, …
## $ High <dbl> 79.57571, 79.10000, 78.11429, 77.99429, 77.93714, 78.12286, …
## $ Low <dbl> 78.86000, 77.20428, 76.22857, 76.84571, 76.95571, 76.47857, …
## $ Close <dbl> 79.01857, 77.28286, 77.70428, 77.14857, 77.63715, 76.64571, …
## $ Adj_Close <dbl> 66.96433, 65.49342, 65.85053, 65.37959, 65.79363, 64.95345, …
## $ Volume <dbl> 58671200, 98116900, 103152700, 79302300, 64632400, 69787200,…
frequency(vic_elec) #30 minute interval
## [1] 48
head(vic_elec)
## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
## Time Demand Temperature Date Holiday
## <dttm> <dbl> <dbl> <date> <lgl>
## 1 2012-01-01 00:00:00 4383. 21.4 2012-01-01 TRUE
## 2 2012-01-01 00:30:00 4263. 21.0 2012-01-01 TRUE
## 3 2012-01-01 01:00:00 4049. 20.7 2012-01-01 TRUE
## 4 2012-01-01 01:30:00 3878. 20.6 2012-01-01 TRUE
## 5 2012-01-01 02:00:00 4036. 20.4 2012-01-01 TRUE
## 6 2012-01-01 02:30:00 3866. 20.2 2012-01-01 TRUE
autoplot(aus_production, Bricks) +
ggtitle("Quarterly Brick Production")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
autoplot(pelt, Lynx) +
ggtitle("Annual Lynx Trappings")
autoplot(gafa_stock, Close) +
ggtitle("Daily Closing Prices of GAFA Stocks")
autoplot(vic_elec, Demand) +
labs(
title = "Electricity Demand in Victoria",
x = "Date and Time",
y = "Megawatts (MW)"
)
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
Since the gafa_stock time series has four symbols, it should first be grouped by Symbol and then filtered by the highest closing price of that stock.
gafa_stock %>%
group_by(Symbol) %>%
filter(Close == max(Close))
## # A tsibble: 4 x 8 [!]
## # Key: Symbol [4]
## # Groups: Symbol [4]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2018-10-03 230. 233. 230. 232. 230. 28654800
## 2 AMZN 2018-09-04 2026. 2050. 2013 2040. 2040. 5721100
## 3 FB 2018-07-25 216. 219. 214. 218. 218. 58954200
## 4 GOOG 2018-07-26 1251 1270. 1249. 1268. 1268. 2405600
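The grouped filter above keeps every row that equals the group maximum, so a stock that hit its peak close on two days would return two rows. The sketch below illustrates this on a small made-up table (the symbols `AAA` and `BBB` and their prices are hypothetical, not taken from gafa_stock) and compares it with `slice_max()`, which can be told to return exactly one row per group.

```r
library(dplyr)

# Toy prices (made-up numbers, not taken from gafa_stock)
prices <- tibble(
  Symbol = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  Date   = as.Date(c("2018-01-02", "2018-01-03", "2018-01-04",
                     "2018-01-02", "2018-01-03", "2018-01-04")),
  Close  = c(10, 12, 12, 50, 48, 49)
)

# filter() keeps every row that ties for the group maximum
peak_all <- prices |>
  group_by(Symbol) |>
  filter(Close == max(Close)) |>
  ungroup()

# slice_max() with with_ties = FALSE keeps exactly one row per group
peak_one <- prices |>
  group_by(Symbol) |>
  slice_max(Close, n = 1, with_ties = FALSE) |>
  ungroup()

print(peak_all)
print(peak_one)
```

Which behaviour is preferable depends on the question: for "what days corresponded to the peak", keeping ties is arguably the more faithful answer.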
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
tute1 <- readr::read_csv("tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#view(tute1)
head(tute1)
## # A tibble: 6 × 4
## Quarter Sales AdBudget GDP
## <date> <dbl> <dbl> <dbl>
## 1 1981-03-01 1020. 659. 252.
## 2 1981-06-01 889. 589 291.
## 3 1981-09-01 795 512. 291.
## 4 1981-12-01 1004. 614. 292.
## 5 1982-03-01 1058. 647. 279.
## 6 1982-06-01 944. 602 254
mytimeseries <- tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid().
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line()
When facet_grid() is removed, all three series are overlaid on one plot with a single shared y-axis instead of separate panels. This can be useful for comparing the trends of all three columns. However, the series may have different units or scales, which makes patterns hard to identify on a shared y-axis; the data would need to be scaled or normalized to be comparable in one plot. In this case, GDP looks almost flat because its range is small relative to the shared y-axis.
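One way to make series with different scales comparable on a single axis is to standardise each one before plotting. The following is a minimal sketch on made-up long-format data standing in for the `pivot_longer()` output; the column `t` and the values are assumptions for illustration only.

```r
library(dplyr)

# Toy long-format data (made-up values), shaped like pivot_longer() output
long <- tibble(
  t     = rep(1:4, times = 2),
  name  = rep(c("Sales", "GDP"), each = 4),
  value = c(1000, 900, 800, 1100, 250, 290, 290, 270)
)

# Standardise each series to mean 0 and standard deviation 1
scaled <- long |>
  group_by(name) |>
  mutate(z = (value - mean(value)) / sd(value)) |>
  ungroup()

print(scaled)
```

Plotting `z` instead of `value` (for example with `ggplot(scaled, aes(t, z, colour = name)) + geom_line()`) puts every series on the same unitless scale, at the cost of losing the original units.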
The USgas package contains data on the demand for natural gas in the US.
#install.packages("USgas")
library(USgas)
us_gasoline <- us_total %>%
as_tsibble(index = year, key = state)
head(us_gasoline)
## # A tsibble: 6 x 3 [1Y]
## # Key: state [1]
## year state y
## <int> <chr> <int>
## 1 1997 Alabama 324158
## 2 1998 Alabama 329134
## 3 1999 Alabama 337270
## 4 2000 Alabama 353614
## 5 2001 Alabama 332693
## 6 2002 Alabama 379343
#creating a vector for new england areas
new_england <- c("Maine","Vermont","New Hampshire",
"Massachusetts","Connecticut","Rhode Island")
#filtering gas consumption to new england areas
ne_gas <- us_gasoline |>
filter(state %in% new_england)
#Plot without facet_wrap
ggplot(ne_gas, aes(x = year, y = y, colour = state)) +
geom_line() +
labs(
title = "Annual Gas Consumption by State — New England",
x = "Year",
y = "Gas Consumption",
colour = "State"
)
A plot of all New England states shows that Massachusetts has the highest gas consumption in the area and Vermont the lowest. Connecticut seems to have steady growth in gas consumption. The other states might also have a trend, but it is difficult to see from this plot because New Hampshire, Maine, and Rhode Island look flattened on the shared y-axis.
#plotting one panel per state
ggplot(ne_gas, aes(x = year, y = y)) +
geom_line() +
facet_wrap(~ state, ncol = 3, scales = "free_y") +
labs(
title = "Annual Gas Consumption — New England States",
x = "Year",
y = "Gas Consumption"
)
This faceted plot shows each state in its own panel with its own y-axis. Here we can see a clear upward trend in gas consumption for Connecticut, Massachusetts, and Vermont, and a slow decline in Maine. Faceting reveals the pattern of each individual state.
tourism_xl <- readxl::read_excel("tourism.xlsx")
head(tourism_xl)
## # A tibble: 6 × 5
## Quarter Region State Purpose Trips
## <chr> <chr> <chr> <chr> <dbl>
## 1 1998-01-01 Adelaide South Australia Business 135.
## 2 1998-04-01 Adelaide South Australia Business 110.
## 3 1998-07-01 Adelaide South Australia Business 166.
## 4 1998-10-01 Adelaide South Australia Business 127.
## 5 1999-01-01 Adelaide South Australia Business 137.
## 6 1999-04-01 Adelaide South Australia Business 200.
head(tourism)
## # A tsibble: 6 x 5 [1Q]
## # Key: Region, State, Purpose [1]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
The built-in tourism dataset that comes in the tsibble package has:
- Index: Quarter (class = yearquarter)
- Keys: Region, State, Purpose
- Variable: Trips (numeric)
tourism_tb <- tourism_xl %>%
mutate(Quarter = yearquarter(Quarter)) %>%
as_tsibble(index = Quarter, key = c(Region, State, Purpose))
head(tourism_tb)
## # A tsibble: 6 x 5 [1Q]
## # Key: Region, State, Purpose [1]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
To do this, I would first group by Region and Purpose and calculate the average overnight trips over all quarters. Then I would sort from largest to smallest; the combination at the top has the maximum number of overnight trips on average.
avg_trips <- tourism_tb %>%
# Convert to a regular tibble first: summarise() on a tsibble keeps the Quarter index,
# so the mean would be computed per quarter instead of across all quarters.
as_tibble() %>%
group_by(Region, Purpose) %>%
summarise(avg_trips = mean(Trips, na.rm = TRUE)) %>%
arrange(desc(avg_trips))
## `summarise()` has grouped output by 'Region'. You can override using the
## `.groups` argument.
avg_trips
## # A tibble: 304 × 3
## # Groups: Region [76]
## Region Purpose avg_trips
## <chr> <chr> <dbl>
## 1 Sydney Visiting 747.
## 2 Melbourne Visiting 619.
## 3 Sydney Business 602.
## 4 North Coast NSW Holiday 588.
## 5 Sydney Holiday 550.
## 6 Gold Coast Holiday 528.
## 7 Melbourne Holiday 507.
## 8 South Coast Holiday 495.
## 9 Brisbane Visiting 493.
## 10 Melbourne Business 478.
## # ℹ 294 more rows
The combination of Sydney (Region) and Visiting (Purpose) had the maximum number of overnight trips on average, at about 747.
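The reason `as_tibble()` matters in the pipeline above is that a tsibble always carries its index through `summarise()`. The sketch below illustrates this on a tiny made-up tsibble; the regions `A` and `B` and their values are hypothetical.

```r
library(dplyr)
library(tsibble)

# Toy tsibble (made-up values): two regions observed over two quarters
toy <- tsibble(
  Quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 2)),
  Region  = rep(c("A", "B"), each = 2),
  Trips   = c(10, 20, 30, 50),
  index = Quarter, key = Region
)

# On a tsibble, the index is kept: one row per Region per Quarter
per_quarter <- toy |>
  group_by(Region) |>
  summarise(avg = mean(Trips))

# On a plain tibble, the time dimension is collapsed: one row per Region
overall <- toy |>
  as_tibble() |>
  group_by(Region) |>
  summarise(avg = mean(Trips))

print(per_quarter)
print(overall)
```

In the first result each "average" is just the single observation for that quarter, which matches the behaviour described in the code comment.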
To find the total trips by State, the tsibble is converted to a regular tibble, grouped by State and Quarter, summed, and then converted back into a tsibble.
total_trips <- tourism_tb %>%
as_tibble() %>%
group_by(State, Quarter) %>%
summarise(Trips = sum(Trips)) %>%
as_tsibble(index = Quarter, key = State)
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
total_trips
## # A tsibble: 640 x 3 [1Q]
## # Key: State [8]
## # Groups: State [8]
## State Quarter Trips
## <chr> <qtr> <dbl>
## 1 ACT 1998 Q1 551.
## 2 ACT 1998 Q2 416.
## 3 ACT 1998 Q3 436.
## 4 ACT 1998 Q4 450.
## 5 ACT 1999 Q1 379.
## 6 ACT 1999 Q2 558.
## 7 ACT 1999 Q3 449.
## 8 ACT 1999 Q4 595.
## 9 ACT 2000 Q1 600.
## 10 ACT 2000 Q2 557.
## # ℹ 630 more rows
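Because a tsibble keeps its index during `summarise()`, the same per-quarter state totals can probably be obtained without leaving the tsibble at all. The sketch below checks this on made-up data (the regions, states, and values are hypothetical); it is an illustration of the idea, not a claim about how the exercise is meant to be solved.

```r
library(dplyr)
library(tsibble)

# Toy data (made-up values): two regions in state S1, one region in state S2
toy <- tsibble(
  Quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 3)),
  Region  = rep(c("R1", "R2", "R3"), each = 2),
  State   = rep(c("S1", "S1", "S2"), each = 2),
  Trips   = c(1, 2, 3, 4, 5, 6),
  index = Quarter, key = c(Region, State)
)

# Round trip through a tibble, as in the text
via_tibble <- toy |>
  as_tibble() |>
  group_by(State, Quarter) |>
  summarise(Trips = sum(Trips), .groups = "drop") |>
  as_tsibble(index = Quarter, key = State)

# Staying in tsibble: grouping by State is enough, the Quarter index is kept
native <- toy |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

print(via_tibble)
print(native)
```

Both routes give one row per State per Quarter; the native route is shorter and avoids the `.groups` message.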
Assuming the question is just asking for the “total trips by State” without the time data, here it is:
total_by_state_notime <- tourism_tb %>%
as_tibble() %>%
group_by(State) %>%
summarise(Total_Trips = sum(Trips))
total_by_state_notime
## # A tibble: 8 × 2
## State Total_Trips
## <chr> <dbl>
## 1 ACT 41007.
## 2 New South Wales 557367.
## 3 Northern Territory 28614.
## 4 Queensland 386643.
## 5 South Australia 118151.
## 6 Tasmania 54137.
## 7 Victoria 390463.
## 8 Western Australia 147820.
This code is included for completeness. The result cannot be converted back to a tsibble because the time index was removed.
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
Can you spot any seasonality, cyclicity and trend? What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?
us_employment is a tsibble containing many series, so we first need to filter it down to the series titled “Total Private”.
us_emp <- us_employment %>%
filter(Title == "Total Private")
head(us_emp)
## # A tsibble: 6 x 4 [1M]
## # Key: Series_ID [1]
## Month Series_ID Title Employed
## <mth> <chr> <chr> <dbl>
## 1 1939 Jan CEU0500000001 Total Private 25338
## 2 1939 Feb CEU0500000001 Total Private 25447
## 3 1939 Mar CEU0500000001 Total Private 25833
## 4 1939 Apr CEU0500000001 Total Private 25801
## 5 1939 May CEU0500000001 Total Private 26113
## 6 1939 Jun CEU0500000001 Total Private 26485
autoplot(us_emp, Employed) +
ggtitle("Time plot: Monthly US Total Private Employment")
gg_season(us_emp, Employed) +
ggtitle("Seasonal plot: Monthly US Total Private Employment")
gg_subseries(us_emp, Employed) +
ggtitle("Subseries plot: Monthly US Total Private Employment")
gg_lag(us_emp, Employed) +
ggtitle("Lag plot: Monthly US Total Private Employment")
ACF(us_emp, Employed) %>%
autoplot() +
ggtitle("ACF: Monthly US Total Private Employment")
Can you spot any seasonality, cyclicity and trend? From the time plot, there is a strong upward trend in employment. Seasonality is hard to judge from the seasonal plot, but employment does seem to dip at the beginning of the year, in January and February, and rise slowly towards June.
What do you learn about the series? Employment has grown steadily over the sample, which starts in 1939. However, there are some periods of downturn, most visibly the sudden drop in employment around 2008 during the housing market crisis.
What can you say about the seasonal patterns? The subseries plot shows lower employment at the beginning of the year than in the summer months. This is supported by the lower average employment in January, represented by the blue horizontal lines in the subseries plot, compared to the averages for June and July. This could possibly be due to students on their summer break entering the workforce.
Can you identify any unusual years? There are a couple of unusual years that deviate from the steady uptrend in the time plot. The most notable one seems to be in 2008-2009 during the time of the housing crisis and the stock market dropping almost 50%.
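When a strong trend dominates, as it does here, the seasonal pattern can be easier to read after the trend is removed. The sketch below uses base R's `decompose()` on a synthetic monthly series; the monthly effects, trend slope, and start year are made-up assumptions chosen only to mimic a series with a January low and a mid-year high.

```r
set.seed(42)

# Synthetic monthly series (made-up): linear trend + fixed monthly effect + noise
month_effect <- c(-3, -2, -1, 0, 1, 2, 3, 1, 0, 0, 0, -1)  # sums to zero
n <- 240
x <- 100 + 0.5 * seq_len(n) + rep(month_effect, length.out = n) + rnorm(n, sd = 0.3)

# Classical additive decomposition: a centred moving average estimates the trend,
# and the average detrended value per month estimates the seasonal effect
parts <- decompose(ts(x, frequency = 12, start = c(2000, 1)))

seasonal_index <- parts$figure
names(seasonal_index) <- month.abb
print(round(seasonal_index, 2))
```

The estimated monthly figures recover the low in January and the high in July even though the raw series is dominated by its trend. With the fpp3 packages, an STL decomposition would be a comparable option for the real data.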
bricks <- aus_production %>%
select(Quarter, Bricks)
head(bricks)
## # A tsibble: 6 x 2 [1Q]
## Quarter Bricks
## <qtr> <dbl>
## 1 1956 Q1 189
## 2 1956 Q2 204
## 3 1956 Q3 208
## 4 1956 Q4 197
## 5 1957 Q1 187
## 6 1957 Q2 214
autoplot(bricks, Bricks) +
ggtitle("Time plot: Quarterly Australian Bricks Production")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_season(bricks, Bricks) +
ggtitle("Seasonal plot: Quarterly Australian Bricks Production")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_subseries(bricks, Bricks) +
ggtitle("Subseries plot: Quarterly Australian Bricks Production")
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_lag(bricks, Bricks) +
ggtitle("Lag plot: Quarterly Australian Bricks Production")
## Warning: Removed 20 rows containing missing values (gg_lag).
ACF(bricks, Bricks) %>%
autoplot() +
ggtitle("ACF: Quarterly Australian Bricks Production")
Can you spot any seasonality, cyclicity and trend? Based on the time plot, brick production trended upward until about 1980 Q1 and declined afterwards. As for seasonality, the seasonal and subseries plots show that brick production is highest in Q3 and lowest in Q1, indicating strong quarterly seasonality.
What do you learn about the series? Brick production is highly seasonal and seems to decline slowly after 1980 Q1. The ACF plot shows peaks in autocorrelation at multiples of 4 (lags 4, 8, 12, 16 and 20), which is consistent with quarterly seasonality.
What can you say about the seasonal patterns? Based on the subseries plot, Q1 has the lowest average brick production while Q3 has the highest. The time plot is consistent with this: it shows regular within-year ups and downs.
Can you identify any unusual years? The most notable unusual years are those with sharp declines around 1980 Q1. The two declines just before and after 1980 Q1 seem to be greater than the other declines.
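The link between quarterly seasonality and ACF peaks at multiples of 4 can be illustrated with a synthetic series. The sketch below uses base R's `acf()` rather than `ACF()` from feasts so that it runs without extra packages; the seasonal pattern and noise level are made-up assumptions.

```r
set.seed(1)

# Synthetic quarterly series (made-up): a repeating 4-quarter pattern plus noise
n <- 80
seasonal <- rep(c(-10, 0, 10, 0), length.out = n)
x <- seasonal + rnorm(n)

# Sample autocorrelations; dropping the first element removes lag 0,
# so r[k] is the autocorrelation at lag k
r <- acf(x, lag.max = 12, plot = FALSE)$acf[-1]
names(r) <- paste0("lag", 1:12)
print(round(r, 2))
```

Observations four quarters apart share the same seasonal position, so lags 4, 8 and 12 stand out as positive peaks, while lag 2 (opposite points in the pattern) is strongly negative.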
head(pelt)
## # A tsibble: 6 x 3 [1Y]
## Year Hare Lynx
## <dbl> <dbl> <dbl>
## 1 1845 19580 30090
## 2 1846 19600 45150
## 3 1847 19610 49150
## 4 1848 11990 39520
## 5 1849 28040 21230
## 6 1850 58000 8420
#selecting only the Year & Hare column
hare <- pelt %>%
select(Year, Hare)
head(hare)
## # A tsibble: 6 x 2 [1Y]
## Year Hare
## <dbl> <dbl>
## 1 1845 19580
## 2 1846 19600
## 3 1847 19610
## 4 1848 11990
## 5 1849 28040
## 6 1850 58000
autoplot(hare, Hare) +
ggtitle("Time plot: Annual Hare Pelts")
#gg_season(hare, Hare) + #this time series is annual data so there is no seasonal period
# ggtitle("Seasonal plot: Annual Hare Pelts")
#gg_subseries(hare, Hare) + #this time series is annual data so there is no seasonal period
# ggtitle("Subseries plot: Annual Hare Pelts")
gg_lag(hare, Hare) +
ggtitle("Lag plot: Annual Hare Pelts")
ACF(hare, Hare) %>%
autoplot() +
ggtitle("ACF: Annual Hare Pelts")
Can you spot any seasonality, cyclicity and trend? Since this time series is annual, there is no seasonality. However, there are fluctuations without a fixed period, which suggests cyclic behavior. There are no obvious trends in the time plot.
What do you learn about the series? Hare pelt numbers rise and fall repeatedly. The lag plot shows scattered, curved clouds rather than points along a straight line, suggesting nonlinear cyclic behavior.
What can you say about the seasonal patterns? Since this is annual data, the gg_season() and gg_subseries() functions errored out. There are no seasonal patterns.
Can you identify any unusual years? The time plot shows two very high peaks in the 1860s and 1880s, followed by very sharp declines. These two peaks stand out because they are both above 125,000, whereas the peaks in other years are below 100,000.
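For annual data with cycles, the ACF can give a rough idea of the typical cycle length: the lag of the first positive peak after the initial decay approximates the distance between successive highs. The sketch below demonstrates the idea on a synthetic series with a known 10-period cycle; the period, noise level, and search window are assumptions for illustration and say nothing about the actual cycle length in the hare data.

```r
set.seed(7)

# Synthetic annual-style series (made-up): a 10-period cycle plus noise
t <- 1:100
x <- sin(2 * pi * t / 10) + rnorm(100, sd = 0.2)

# Sample autocorrelations with lag 0 dropped, so r[k] is the value at lag k
r <- acf(x, lag.max = 20, plot = FALSE)$acf[-1]

# Look for the largest autocorrelation in a window beyond the first trough
search_window <- 5:15
cycle_length <- search_window[which.max(r[search_window])]
print(cycle_length)
```

Real cycles are less regular than this sine wave, so on observed data the peak is usually broader and the estimate should be read as approximate.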
head(PBS)
## # A tsibble: 6 x 9 [1M]
## # Key: Concession, Type, ATC1, ATC2 [1]
## Month Concession Type ATC1 ATC1_desc ATC2 ATC2_desc Scripts Cost
## <mth> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 1991 Jul Concessional Co-paymen… A Alimenta… A01 STOMATOL… 18228 67877
## 2 1991 Aug Concessional Co-paymen… A Alimenta… A01 STOMATOL… 15327 57011
## 3 1991 Sep Concessional Co-paymen… A Alimenta… A01 STOMATOL… 14775 55020
## 4 1991 Oct Concessional Co-paymen… A Alimenta… A01 STOMATOL… 15380 57222
## 5 1991 Nov Concessional Co-paymen… A Alimenta… A01 STOMATOL… 14371 52120
## 6 1991 Dec Concessional Co-paymen… A Alimenta… A01 STOMATOL… 15028 54299
#keeping only the H02 series and summing Cost across them for each month
h02 <- PBS %>%
filter(ATC2 == "H02") %>%
summarise(Cost = sum(Cost))
h02
## # A tsibble: 204 x 2 [1M]
## Month Cost
## <mth> <dbl>
## 1 1991 Jul 429795
## 2 1991 Aug 400906
## 3 1991 Sep 432159
## 4 1991 Oct 492543
## 5 1991 Nov 502369
## 6 1991 Dec 602652
## 7 1992 Jan 660119
## 8 1992 Feb 336220
## 9 1992 Mar 351348
## 10 1992 Apr 379808
## # ℹ 194 more rows
autoplot(h02, Cost) +
ggtitle("Time plot: Monthly H02 Costs")
gg_season(h02, Cost) +
ggtitle("Seasonal plot: Monthly H02 Costs")
gg_subseries(h02, Cost) +
ggtitle("Subseries plot: Monthly H02 Costs")
gg_lag(h02, Cost) +
ggtitle("Lag plot: Monthly H02 Costs")
ACF(h02, Cost) %>%
autoplot() +
ggtitle("ACF: Monthly H02 Costs")
Can you spot any seasonality, cyclicity and trend? There is an upward trend in pharmaceutical costs over time. Based on the seasonal and subseries plots, average costs spike in January and tumble in February. Costs then build up slowly from March and reach a high again in December. H02 costs show strong seasonality.
What do you learn about the series? Drug costs are trending upward throughout the years. The average drug costs are higher in Dec and Jan than they are the rest of the year.
What can you say about the seasonal patterns? The seasonal plot shows there is a peak in Jan but a hard drop in Feb followed by a gradual increase the rest of the year.
Can you identify any unusual years? The time plot shows that some of the peaks are much higher than the previous years. Normally the peaks are increasing consistently but in 1994 Jan and 2005 Jan, there seems to be an abnormal spike compared to the rest of the years.
us_gasoline <- fpp3::us_gasoline
head(fpp3::us_gasoline)
## # A tsibble: 6 x 2 [1W]
## Week Barrels
## <week> <dbl>
## 1 1991 W06 6.62
## 2 1991 W07 6.43
## 3 1991 W08 6.58
## 4 1991 W09 7.22
## 5 1991 W10 6.88
## 6 1991 W11 6.95
autoplot(us_gasoline, Barrels) +
ggtitle("Time plot: Weekly US Gasoline Production (Barrels)")
gg_season(us_gasoline, Barrels) +
ggtitle("Seasonal plot: Weekly US Gasoline Production (Barrels)")
gg_subseries(us_gasoline, Barrels) +
ggtitle("Subseries plot: Weekly US Gasoline Production (Barrels)")
gg_lag(us_gasoline, Barrels) +
ggtitle("Lag plot: Weekly US Gasoline Production (Barrels)")
ACF(us_gasoline, Barrels) %>%
autoplot() +
ggtitle("ACF: Weekly US Gasoline Production (Barrels)")
Can you spot any seasonality, cyclicity and trend? The time plot shows a gradual increase in gasoline production and a plateau around 2009. Based on the subseries plot, the average number of barrels is higher during the middle of the year than in the early and late weeks of the year, indicating strong seasonality.
What do you learn about the series? Gasoline production is higher in the summer (middle of the year) than it is in the winter. This may be because more people drive during the summer months than the winter months. Gasoline production is highly seasonal. The lag plot shows points along the diagonal, indicating strong positive autocorrelation between nearby weeks.
What can you say about the seasonal patterns? In the seasonal plot, gas production rises from May to August and peaks in July. It then gradually decreases towards December and falls sharply in January. This pattern is consistent from year to year.
Can you identify any unusual years? There is a plateau followed by a gradual decline around 2009, which may be related to the 2008 recession. There is also a sharp dip around 1999 W2; this predates the dot-com crash of 2000, so its cause is unclear from the plot alone. The decline at the end of the series cannot be attributed to the COVID-19 pandemic, because the pandemic began in 2020 and the fpp3 documentation describes this series as ending in early 2017.
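A lag plot whose points hug the diagonal corresponds to a high correlation between the series and a lagged copy of itself. The sketch below computes that lag-1 correlation directly on a synthetic series and contrasts it with a shuffled copy; the drift and noise values are made-up assumptions, not estimates from us_gasoline.

```r
set.seed(123)

# Synthetic weekly-style series (made-up): slow upward drift plus small noise
n <- 200
x <- 0.05 * seq_len(n) + rnorm(n, sd = 0.3)

# Correlation between each value and the one before it (what a lag-1 panel shows)
lag1_cor <- cor(x[-1], x[-n])

# Shuffling destroys the time ordering, so the lag-1 correlation collapses
shuffled <- sample(x)
lag1_cor_shuffled <- cor(shuffled[-1], shuffled[-n])

print(c(original = lag1_cor, shuffled = lag1_cor_shuffled))
```

The ordered series has a lag-1 correlation close to 1, while the shuffled copy has one near 0, which is the numeric counterpart of a tight diagonal versus a shapeless cloud in `gg_lag()`.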