library(fpp3)
library(tsibbledata)
library(tidyverse)

2.1

Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

# Looking at the datasets
?aus_production   #Quarterly estimates of selected indicators of manufacturing production in Australia.
?pelt             #Hudson Bay Company trading records for Snowshoe Hare and Canadian Lynx furs from 1845 to 1935. This data contains trade records for all areas of the company.
?gafa_stock       #Historical stock prices from 2014-2018 for Google, Amazon, Facebook and Apple.
?vic_elec         #vic_elec is a half-hourly tsibble with three values: Demand, Temperature, Holiday

#checking out the data
frequency(aus_production) #quarterly interval

## [1] 4

head(aus_production)

## # A tsibble: 6 x 7 [1Q]
##   Quarter  Beer Tobacco Bricks Cement Electricity   Gas
##     <qtr> <dbl>   <dbl>  <dbl>  <dbl>       <dbl> <dbl>
## 1 1956 Q1   284    5225    189    465        3923     5
## 2 1956 Q2   213    5178    204    532        4436     6
## 3 1956 Q3   227    5297    208    561        4806     7
## 4 1956 Q4   308    5681    197    570        4418     6
## 5 1957 Q1   262    5577    187    529        4339     5
## 6 1957 Q2   228    5651    214    604        4811     7

frequency(pelt) #yearly interval

## [1] 1

head(pelt)

## # A tsibble: 6 x 3 [1Y]
##    Year  Hare  Lynx
##   <dbl> <dbl> <dbl>
## 1  1845 19580 30090
## 2  1846 19600 45150
## 3  1847 19610 49150
## 4  1848 11990 39520
## 5  1849 28040 21230
## 6  1850 58000  8420

#frequency(gafa_stock) #daily interval
glimpse(gafa_stock)

## Rows: 5,032
## Columns: 8
## Key: Symbol [4]
## $ Symbol    <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAP…
## $ Date      <date> 2014-01-02, 2014-01-03, 2014-01-06, 2014-01-07, 2014-01-08,…
## $ Open      <dbl> 79.38286, 78.98000, 76.77857, 77.76000, 76.97285, 78.11429, …
## $ High      <dbl> 79.57571, 79.10000, 78.11429, 77.99429, 77.93714, 78.12286, …
## $ Low       <dbl> 78.86000, 77.20428, 76.22857, 76.84571, 76.95571, 76.47857, …
## $ Close     <dbl> 79.01857, 77.28286, 77.70428, 77.14857, 77.63715, 76.64571, …
## $ Adj_Close <dbl> 66.96433, 65.49342, 65.85053, 65.37959, 65.79363, 64.95345, …
## $ Volume    <dbl> 58671200, 98116900, 103152700, 79302300, 64632400, 69787200,…

frequency(vic_elec) #30 minute interval

## [1] 48

head(vic_elec)

## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
##   Time                Demand Temperature Date       Holiday
##   <dttm>               <dbl>       <dbl> <date>     <lgl>  
## 1 2012-01-01 00:00:00  4383.        21.4 2012-01-01 TRUE   
## 2 2012-01-01 00:30:00  4263.        21.0 2012-01-01 TRUE   
## 3 2012-01-01 01:00:00  4049.        20.7 2012-01-01 TRUE   
## 4 2012-01-01 01:30:00  3878.        20.6 2012-01-01 TRUE   
## 5 2012-01-01 02:00:00  4036.        20.4 2012-01-01 TRUE   
## 6 2012-01-01 02:30:00  3866.        20.2 2012-01-01 TRUE

Use autoplot() to produce a time plot of each series.

autoplot(aus_production, Bricks) +
  ggtitle("Quarterly Brick Production")

## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

autoplot(pelt, Lynx) +
  ggtitle("Annual Lynx Trappings")

autoplot(gafa_stock, Close) +
  ggtitle("Daily Closing Prices of GAFA Stocks")

autoplot(vic_elec, Demand) +
  labs(
    title = "Electricity Demand in Victoria",
    x = "Date and Time",
    y = "Megawatts (MW)"
  )

The aus_production time series uses a quarterly time interval.
The pelt time series uses an annual time interval.
The gafa_stock time series is recorded daily, but only on trading days, so the tsibble stores it with an irregular interval.
The vic_elec time series uses a half-hourly (30-minute) time interval.
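
A daily series that skips non-trading days, as gafa_stock does, may be stored with an irregular interval. The following is a minimal sketch of how this can be checked with is_regular() from tsibble. The toy_regular and toy_irregular objects and their values are hypothetical and are not taken from any of the datasets above.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: five consecutive calendar days
toy_regular <- tibble(
  Date  = as.Date("2020-01-06") + 0:4,
  Close = c(10, 11, 12, 11, 13)
) |>
  as_tsibble(index = Date)

# Hypothetical toy data: a weekend is skipped, so it is declared irregular
toy_irregular <- tibble(
  Date  = as.Date(c("2020-01-03", "2020-01-06", "2020-01-07")),
  Close = c(10, 11, 12)
) |>
  as_tsibble(index = Date, regular = FALSE)

is_regular(toy_regular)
is_regular(toy_irregular)
```

Declaring the interval as irregular avoids treating weekends and holidays as missing observations.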

2.2

Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

Since the gafa_stock time series has four symbols, it should first be grouped by Symbol and then filtered by the highest closing price of that stock.

gafa_stock %>% 
  group_by(Symbol) %>% 
  filter(Close == max(Close))

## # A tsibble: 4 x 8 [!]
## # Key:       Symbol [4]
## # Groups:    Symbol [4]
##   Symbol Date        Open  High   Low Close Adj_Close   Volume
##   <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
## 1 AAPL   2018-10-03  230.  233.  230.  232.      230. 28654800
## 2 AMZN   2018-09-04 2026. 2050. 2013  2040.     2040.  5721100
## 3 FB     2018-07-25  216.  219.  214.  218.      218. 58954200
## 4 GOOG   2018-07-26 1251  1270. 1249. 1268.     1268.  2405600

For AAPL, the day that corresponds with the peak closing price is 2018-10-03.
For AMZN, the day that corresponds with the peak closing price is 2018-09-04.
For FB, the day that corresponds with the peak closing price is 2018-07-25.
For GOOG, the day that corresponds with the peak closing price is 2018-07-26.
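
The same group-then-pick-the-maximum pattern can also be written with dplyr's slice_max(). This is only a sketch on made-up data: toy_prices, its symbols and its values are hypothetical and are not taken from gafa_stock.

```r
library(dplyr)

# Hypothetical toy prices for two made-up symbols
toy_prices <- tibble(
  Symbol = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  Date   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03",
                     "2020-01-01", "2020-01-02", "2020-01-03")),
  Close  = c(10, 12, 11, 50, 48, 49)
)

# Keep the row(s) with the highest Close within each Symbol
peak_days <- toy_prices |>
  group_by(Symbol) |>
  slice_max(Close, n = 1) |>
  ungroup()

peak_days
```

Like the filter() version, slice_max() keeps ties by default, so a symbol whose peak price occurs on several days would return several rows.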

2.3

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

a. You can read the data into R with the following script:

tute1 <- readr::read_csv("tute1.csv")

## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (3): Sales, AdBudget, GDP
## date (1): Quarter
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

#view(tute1)
head(tute1)

## # A tibble: 6 × 4
##   Quarter    Sales AdBudget   GDP
##   <date>     <dbl>    <dbl> <dbl>
## 1 1981-03-01 1020.     659.  252.
## 2 1981-06-01  889.     589   291.
## 3 1981-09-01  795      512.  291.
## 4 1981-12-01 1004.     614.  292.
## 5 1982-03-01 1058.     647.  279.
## 6 1982-06-01  944.     602   254

b. Convert the data to time series

mytimeseries <- tute1 |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)

c. Construct time series plots of each of the three series

mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

Check what happens when you don’t include facet_grid().

mytimeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()

When facet_grid() is removed, all the series are overlaid on one plot with a single shared y-axis instead of separate panels. This can be useful for comparing the trends of all three columns. However, the series can have different units or magnitudes, which makes patterns hard to identify on a shared y-axis; the data would need to be scaled or normalized to be compared in one plot. In this case, GDP does not look like it is changing much because its values are small relative to Sales and AdBudget, so its line looks flattened.
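
One way to put series of different magnitudes on a comparable scale is to standardize each one (subtract its mean, divide by its standard deviation) before overlaying them. The sketch below assumes this z-score approach and uses hypothetical toy values; toy_wide, scaled_long and the numbers are made up and are not the tute1 data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with two series on very different scales
toy_wide <- tibble(
  t     = 1:4,
  Sales = c(1000, 900, 800, 1000),
  GDP   = c(250, 290, 290, 292)
)

# Standardize each series separately so both have mean 0 and sd 1
scaled_long <- toy_wide |>
  pivot_longer(-t) |>
  group_by(name) |>
  mutate(scaled = (value - mean(value)) / sd(value)) |>
  ungroup()

scaled_long
```

Plotting the scaled column instead of value would let the overlaid lines share one y-axis, at the cost of losing the original units.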

2.4

The USgas package contains data on the demand for natural gas in the US.

a. Install the USgas package.

#install.packages("USgas")
library(USgas)

b. Create a tsibble from us_total with year as the index and state as the key.

# Named us_gas because the data is natural gas, and so that it does not
# mask the fpp3::us_gasoline dataset
us_gas <- us_total %>%
  as_tsibble(index = year, key = state)
head(us_gas)

## # A tsibble: 6 x 3 [1Y]
## # Key:       state [1]
##    year state        y
##   <int> <chr>    <int>
## 1  1997 Alabama 324158
## 2  1998 Alabama 329134
## 3  1999 Alabama 337270
## 4  2000 Alabama 353614
## 5  2001 Alabama 332693
## 6  2002 Alabama 379343

c. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).

#creating a vector for new england areas
new_england <- c("Maine","Vermont","New Hampshire",
                 "Massachusetts","Connecticut","Rhode Island")

#filtering gas consumption to new england areas
ne_gas <- us_gas |>
  filter(state %in% new_england)

#Plot without facet_wrap
ggplot(ne_gas, aes(x = year, y = y, colour = state)) +
  geom_line() +
  labs(
    title = "Annual Gas Consumption by State — New England",
    x = "Year",
    y = "Gas Consumption",
    colour = "State"
  )

A plot of all the New England states shows that Massachusetts has the highest gas consumption in the area and Vermont the lowest. Connecticut seems to show steady growth in gas consumption. The other states might also have trends, but they are difficult to see in this plot because the lines for New Hampshire, Maine, and Rhode Island are flattened by the shared y-axis.

#plotting one panel per state
ggplot(ne_gas, aes(x = year, y = y)) +
  geom_line() +
  facet_wrap(~ state, ncol = 3, scales = "free_y") +
  labs(
    title = "Annual Gas Consumption — New England States",
    x = "Year",
    y = "Gas Consumption" 
  )

This faceted plot shows each state in its own panel with its own y-axis. Here we can see a clear upward trend in gas consumption for Connecticut, Massachusetts, and Vermont, and a slow decline in Maine. Faceting reveals the pattern of each individual state.

2.5

a. Download tourism.xlsx from the book website and read it into R using readxl::read_excel().

tourism_xl <- readxl::read_excel("tourism.xlsx")
head(tourism_xl)

## # A tibble: 6 × 5
##   Quarter    Region   State           Purpose  Trips
##   <chr>      <chr>    <chr>           <chr>    <dbl>
## 1 1998-01-01 Adelaide South Australia Business  135.
## 2 1998-04-01 Adelaide South Australia Business  110.
## 3 1998-07-01 Adelaide South Australia Business  166.
## 4 1998-10-01 Adelaide South Australia Business  127.
## 5 1999-01-01 Adelaide South Australia Business  137.
## 6 1999-04-01 Adelaide South Australia Business  200.

b. Create a tsibble which is identical to the tourism tsibble from the tsibble package.

head(tourism)

## # A tsibble: 6 x 5 [1Q]
## # Key:       Region, State, Purpose [1]
##   Quarter Region   State           Purpose  Trips
##     <qtr> <chr>    <chr>           <chr>    <dbl>
## 1 1998 Q1 Adelaide South Australia Business  135.
## 2 1998 Q2 Adelaide South Australia Business  110.
## 3 1998 Q3 Adelaide South Australia Business  166.
## 4 1998 Q4 Adelaide South Australia Business  127.
## 5 1999 Q1 Adelaide South Australia Business  137.
## 6 1999 Q2 Adelaide South Australia Business  200.

The built-in tourism dataset that comes with the tsibble package has:
- Index: Quarter (class = yearquarter)
- Keys: Region, State, Purpose
- Variable: Trips (numeric)

tourism_tb <- tourism_xl %>% 
  mutate(Quarter = yearquarter(Quarter)) %>% 
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))

head(tourism_tb)

## # A tsibble: 6 x 5 [1Q]
## # Key:       Region, State, Purpose [1]
##   Quarter Region   State           Purpose  Trips
##     <qtr> <chr>    <chr>           <chr>    <dbl>
## 1 1998 Q1 Adelaide South Australia Business  135.
## 2 1998 Q2 Adelaide South Australia Business  110.
## 3 1998 Q3 Adelaide South Australia Business  166.
## 4 1998 Q4 Adelaide South Australia Business  127.
## 5 1999 Q1 Adelaide South Australia Business  137.
## 6 1999 Q2 Adelaide South Australia Business  200.

c. Find what combination of Region and Purpose had the maximum number of overnight trips on average.

To do this, I first group by Region and Purpose and calculate the average overnight trips over all quarters. I then sort from largest to smallest; the combination at the top has the maximum number of overnight trips on average.

avg_trips <- tourism_tb %>%
  # Convert to a regular tibble first: on a tsibble, summarise() keeps the
  # Quarter index and would return the average for each quarter instead.
  as_tibble() %>%
  group_by(Region, Purpose) %>%
  summarise(avg_trips = mean(Trips, na.rm = TRUE)) %>% 
  arrange(desc(avg_trips))

## `summarise()` has grouped output by 'Region'. You can override using the
## `.groups` argument.

avg_trips

## # A tibble: 304 × 3
## # Groups:   Region [76]
##    Region          Purpose  avg_trips
##    <chr>           <chr>        <dbl>
##  1 Sydney          Visiting      747.
##  2 Melbourne       Visiting      619.
##  3 Sydney          Business      602.
##  4 North Coast NSW Holiday       588.
##  5 Sydney          Holiday       550.
##  6 Gold Coast      Holiday       528.
##  7 Melbourne       Holiday       507.
##  8 South Coast     Holiday       495.
##  9 Brisbane        Visiting      493.
## 10 Melbourne       Business      478.
## # ℹ 294 more rows

The combination of Sydney (Region) and Visiting (Purpose) had the maximum number of overnight trips on average, at about 747 per quarter (Trips is recorded in thousands).

d. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

One way to find the total trips by State is to convert the tsibble to a regular tibble, group by State and Quarter, sum the trips, and convert the result back into a tsibble.

total_trips <- tourism_tb %>%
  as_tibble() %>% 
  group_by(State, Quarter) %>%
  summarise(Trips = sum(Trips)) %>%
  as_tsibble(index = Quarter, key = State)

## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.

total_trips

## # A tsibble: 640 x 3 [1Q]
## # Key:       State [8]
## # Groups:    State [8]
##    State Quarter Trips
##    <chr>   <qtr> <dbl>
##  1 ACT   1998 Q1  551.
##  2 ACT   1998 Q2  416.
##  3 ACT   1998 Q3  436.
##  4 ACT   1998 Q4  450.
##  5 ACT   1999 Q1  379.
##  6 ACT   1999 Q2  558.
##  7 ACT   1999 Q3  449.
##  8 ACT   1999 Q4  595.
##  9 ACT   2000 Q1  600.
## 10 ACT   2000 Q2  557.
## # ℹ 630 more rows
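
An alternative that avoids the round trip through a tibble: grouping a tsibble by State and calling summarise() keeps the time index and sums over the remaining keys. The following is a minimal sketch on hypothetical toy data; toy_trips, its regions, states and values are made up and are not the tourism data.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: three regions in two states, two quarters each
toy_trips <- tibble(
  Quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 3)),
  Region  = rep(c("A", "B", "C"), each = 2),
  State   = rep(c("X", "X", "Y"), each = 2),
  Trips   = c(1, 2, 10, 20, 100, 200)
) |>
  as_tsibble(index = Quarter, key = c(Region, State))

# summarise() on a grouped tsibble aggregates within each State and Quarter
state_totals <- toy_trips |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

state_totals
```

The result is already a tsibble keyed by State, so no call to as_tsibble() is needed afterwards.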

Assuming the question is just asking for the “total trips by State” without the time data, here it is:

total_by_state_notime <- tourism_tb %>%
  as_tibble() %>%
  group_by(State) %>%
  summarise(Total_Trips = sum(Trips))
  
total_by_state_notime

## # A tibble: 8 × 2
##   State              Total_Trips
##   <chr>                    <dbl>
## 1 ACT                     41007.
## 2 New South Wales        557367.
## 3 Northern Territory      28614.
## 4 Queensland             386643.
## 5 South Australia        118151.
## 6 Tasmania                54137.
## 7 Victoria               390463.
## 8 Western Australia      147820.

This code is included for completeness; the result cannot be converted back to a tsibble because the time index has been removed.

2.8

Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.

Can you spot any seasonality, cyclicity and trend? What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?

us_employment

us_employment is a tsibble containing many series, so we need to filter it down to the series with the Title “Total Private”.

us_emp <- us_employment %>% 
  filter(Title == "Total Private")

head(us_emp)

## # A tsibble: 6 x 4 [1M]
## # Key:       Series_ID [1]
##      Month Series_ID     Title         Employed
##      <mth> <chr>         <chr>            <dbl>
## 1 1939 Jan CEU0500000001 Total Private    25338
## 2 1939 Feb CEU0500000001 Total Private    25447
## 3 1939 Mar CEU0500000001 Total Private    25833
## 4 1939 Apr CEU0500000001 Total Private    25801
## 5 1939 May CEU0500000001 Total Private    26113
## 6 1939 Jun CEU0500000001 Total Private    26485

autoplot(us_emp, Employed) +
  ggtitle("Time plot: Monthly US Total Private Employment")

gg_season(us_emp, Employed) +
  ggtitle("Seasonal plot: Monthly US Total Private Employment")

gg_subseries(us_emp, Employed) +
  ggtitle("Subseries plot: Monthly US Total Private Employment")

gg_lag(us_emp, Employed) +
  ggtitle("Lag plot: Monthly US Total Private Employment")

ACF(us_emp, Employed) %>%
  autoplot() +
  ggtitle("ACF: Monthly US Total Private Employment")

Can you spot any seasonality, cyclicity and trend? From the time plot, there is a strong upward trend in employment. Seasonality is a little hard to judge from the seasonal plot, but there does seem to be a dip in employment at the beginning of the year, in Jan and Feb, followed by a slow rise towards June.
What do you learn about the series? Employment has grown steadily from 1940 to 2020. However, there are some periods of downturn, most visibly the sudden drop in employment around 2008 during the housing market crisis.
What can you say about the seasonal patterns? The subseries plot shows lower employment at the beginning of the year than in the summer months. This is supported by the lower average employment in Jan, represented by the blue horizontal lines in the subseries plot, compared with the averages in Jun and Jul. This could possibly be due to students on their summer break entering the workforce.
Can you identify any unusual years? A couple of unusual years deviate from the steady uptrend in the time plot. The most notable is 2008-2009, during the housing crisis, when the stock market dropped almost 50%.

bricks

bricks <- aus_production %>%
  select(Quarter, Bricks) 

head(bricks)

## # A tsibble: 6 x 2 [1Q]
##   Quarter Bricks
##     <qtr>  <dbl>
## 1 1956 Q1    189
## 2 1956 Q2    204
## 3 1956 Q3    208
## 4 1956 Q4    197
## 5 1957 Q1    187
## 6 1957 Q2    214

autoplot(bricks, Bricks) +
  ggtitle("Time plot: Quarterly Australian Bricks Production")

## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_season(bricks, Bricks) +
  ggtitle("Seasonal plot: Quarterly Australian Bricks Production")

## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_subseries(bricks, Bricks) +
  ggtitle("Subseries plot: Quarterly Australian Bricks Production")

## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_lag(bricks, Bricks) +
  ggtitle("Lag plot: Quarterly Australian Bricks Production")

## Warning: Removed 20 rows containing missing values (gg_lag).

ACF(bricks, Bricks) %>%
  autoplot() +
  ggtitle("ACF: Quarterly Australian Bricks Production")

Can you spot any seasonality, cyclicity and trend? Based on the time plot, brick production trended upward until around 1980 Q1 and then declined. As for seasonality, the seasonal and subseries plots show that brick production is highest in Q3 and lowest in Q1, indicating strong quarterly seasonality.
What do you learn about the series? Brick production is highly seasonal and seems to decline slowly after 1980 Q1. The ACF plot shows peaks in autocorrelation at lags that are multiples of 4 (for example 4, 8, 16 and 20), which is consistent with strong quarterly seasonality.
What can you say about the seasonal patterns? Based on the subseries plot, Q1 has the lowest average brick production while Q3 has the highest. This is confirmed by the time plot, which shows consistent ups and downs within each year.
Can you identify any unusual years? The most notable unusual period is around 1980 Q1, with sharp declines just before and after it that seem greater than the other declines.
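
To illustrate why quarterly seasonality shows up as ACF peaks at multiples of 4, here is a minimal sketch using base R's acf() on a simulated series. The series is hypothetical (a made-up quarterly pattern plus noise) and is not the Bricks data.

```r
set.seed(1)

# Hypothetical quarterly pattern repeated for 20 years, plus small noise
quarter_effect <- rep(c(-10, 0, 10, 0), times = 20)
toy_series <- 100 + quarter_effect + rnorm(80, sd = 1)

# Sample autocorrelations at lags 1 to 8 (lag 0 is dropped)
acf_values <- acf(toy_series, lag.max = 8, plot = FALSE)$acf[-1]

round(acf_values, 2)
```

In this toy series the largest autocorrelations occur at lags 4 and 8, because observations one and two years apart fall in the same quarter.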

hare

head(pelt)

## # A tsibble: 6 x 3 [1Y]
##    Year  Hare  Lynx
##   <dbl> <dbl> <dbl>
## 1  1845 19580 30090
## 2  1846 19600 45150
## 3  1847 19610 49150
## 4  1848 11990 39520
## 5  1849 28040 21230
## 6  1850 58000  8420

#selecting only the Year & Hare column
hare <- pelt %>%
  select(Year, Hare)

head(hare)

## # A tsibble: 6 x 2 [1Y]
##    Year  Hare
##   <dbl> <dbl>
## 1  1845 19580
## 2  1846 19600
## 3  1847 19610
## 4  1848 11990
## 5  1849 28040
## 6  1850 58000

autoplot(hare, Hare) +
  ggtitle("Time plot: Annual Hare Pelts")

#gg_season(hare, Hare) +      #this time series is annual data so there is no seasonal period
#   ggtitle("Seasonal plot: Annual Hare Pelts")

#gg_subseries(hare, Hare) +   #this time series is annual data so there is no seasonal period
#   ggtitle("Subseries plot: Annual Hare Pelts")

gg_lag(hare, Hare) +
  ggtitle("Lag plot: Annual Hare Pelts")

ACF(hare, Hare) %>%
  autoplot() +
  ggtitle("ACF: Annual Hare Pelts")

Can you spot any seasonality, cyclicity and trend? Since this time series is annual, there is no seasonality. However, there are fluctuations without a fixed period, which suggests cyclic behavior. There is no obvious trend in the time plot.
What do you learn about the series? The number of hare pelts rises and falls repeatedly. The lag plot shows curved clouds of points rather than a straight line, suggesting nonlinear cyclic behavior.
What can you say about the seasonal patterns? Since this is annual data, the gg_season() and gg_subseries() functions errored out. There are no seasonal patterns.
Can you identify any unusual years? The time plot shows two very high peaks in the 1860s and 1880s, each followed by a very sharp decline. These two peaks stand out because they are both above 125,000, whereas the peaks in other years are below 100,000.

h02

head(PBS)

## # A tsibble: 6 x 9 [1M]
## # Key:       Concession, Type, ATC1, ATC2 [1]
##      Month Concession   Type       ATC1  ATC1_desc ATC2  ATC2_desc Scripts  Cost
##      <mth> <chr>        <chr>      <chr> <chr>     <chr> <chr>       <dbl> <dbl>
## 1 1991 Jul Concessional Co-paymen… A     Alimenta… A01   STOMATOL…   18228 67877
## 2 1991 Aug Concessional Co-paymen… A     Alimenta… A01   STOMATOL…   15327 57011
## 3 1991 Sep Concessional Co-paymen… A     Alimenta… A01   STOMATOL…   14775 55020
## 4 1991 Oct Concessional Co-paymen… A     Alimenta… A01   STOMATOL…   15380 57222
## 5 1991 Nov Concessional Co-paymen… A     Alimenta… A01   STOMATOL…   14371 52120
## 6 1991 Dec Concessional Co-paymen… A     Alimenta… A01   STOMATOL…   15028 54299

#filtering out the H02 series and cost
h02 <- PBS %>%
  filter(ATC2 == "H02") %>%
  summarise(Cost = sum(Cost))

h02

## # A tsibble: 204 x 2 [1M]
##       Month   Cost
##       <mth>  <dbl>
##  1 1991 Jul 429795
##  2 1991 Aug 400906
##  3 1991 Sep 432159
##  4 1991 Oct 492543
##  5 1991 Nov 502369
##  6 1991 Dec 602652
##  7 1992 Jan 660119
##  8 1992 Feb 336220
##  9 1992 Mar 351348
## 10 1992 Apr 379808
## # ℹ 194 more rows

autoplot(h02, Cost) +
  ggtitle("Time plot: Monthly H02 Costs")

gg_season(h02, Cost) +
  ggtitle("Seasonal plot: Monthly H02 Costs")

gg_subseries(h02, Cost) +
  ggtitle("Subseries plot: Monthly H02 Costs")

gg_lag(h02, Cost) +
  ggtitle("Lag plot: Monthly H02 Costs")

ACF(h02, Cost) %>%
  autoplot() +
  ggtitle("ACF: Monthly H02 Costs")

Can you spot any seasonality, cyclicity and trend? There is an upward trend in pharmaceutical costs over time. Based on the seasonal and subseries plots, there is a significant spike in average costs in Jan followed by a sharp drop in Feb. Costs slowly build up from March onward and reach a high again in Dec. H02 costs show strong seasonality.
What do you learn about the series? Drug costs trend upward over the years. Average drug costs are higher in Dec and Jan than in the rest of the year.
What can you say about the seasonal patterns? The seasonal plot shows a peak in Jan and a hard drop in Feb, followed by a gradual increase over the rest of the year.
Can you identify any unusual years? The time plot shows that some peaks are much higher than those of the preceding years. Normally the peaks increase consistently, but in 1994 Jan and 2005 Jan there seems to be an abnormal spike compared to the other years.

us_gasoline

us_gasoline <- fpp3::us_gasoline
head(fpp3::us_gasoline)

## # A tsibble: 6 x 2 [1W]
##       Week Barrels
##     <week>   <dbl>
## 1 1991 W06    6.62
## 2 1991 W07    6.43
## 3 1991 W08    6.58
## 4 1991 W09    7.22
## 5 1991 W10    6.88
## 6 1991 W11    6.95

autoplot(us_gasoline, Barrels) +
  ggtitle("Time plot: Weekly US Gasoline Production (Barrels)")

gg_season(us_gasoline, Barrels) +
  ggtitle("Seasonal plot: Weekly US Gasoline Production (Barrels)")

gg_subseries(us_gasoline, Barrels) +
  ggtitle("Subseries plot: Weekly US Gasoline Production (Barrels)")

gg_lag(us_gasoline, Barrels) +
  ggtitle("Lag plot: Weekly US Gasoline Production (Barrels)")

ACF(us_gasoline, Barrels) %>%
  autoplot() +
  ggtitle("ACF: Weekly US Gasoline Production (Barrels)")

Can you spot any seasonality, cyclicity and trend? The time plot shows a gradual increase in gasoline production and a plateau around 2009. Based on the subseries plot, the average number of barrels is higher in the middle of the year than in the early and late weeks of the year, indicating strong seasonality.
What do you learn about the series? Gasoline production is higher in the summer (middle of the year) than in the winter. This may be because more people drive during the summer months than the winter months. Gasoline production is highly seasonal. The lag plot shows points along a diagonal line, indicating strong positive autocorrelation.
What can you say about the seasonal patterns? In the seasonal plot, gas production rises from May to August, with a peak in July. It gradually decreases towards Dec and falls sharply in Jan. This is consistent from year to year.
Can you identify any unusual years? There is a plateau and gradual decline around 2009, which may have been due to the 2008 recession. There also appear to be sharp declines around 1999 W2 and near the end of the series. The first may be related to the 2000 dot-com crash. The later one cannot be a COVID-19 effect, because this dataset ends in early 2017 and the pandemic began in 2020, so it needs another explanation.

HW1 Time Series

Jian Quan Chen

2025-09-06
