2.1

Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

Use ? (or help()) to find out about the data in each series.

#?aus_production
#?pelt
#?gafa_stock 
#?vic_elec

What is the time interval of each series?

interval(aus_production) # time interval is quarterly
## <interval[1]>
## [1] 1Q
interval(pelt) # time interval is yearly
## <interval[1]>
## [1] 1Y
interval(gafa_stock) # time interval is irregular (trading days only)
## <interval[1]>
## [1] !
interval(vic_elec) # time interval is half hour
## <interval[1]>
## [1] 30m
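
The `!` reported for gafa_stock marks an irregular interval. As a minimal sketch of what that means, the made-up data below (a hypothetical weekday-only calendar, not the real stock prices) is stored once as a regular daily tsibble with gaps and once as an irregular tsibble via `regular = FALSE`:

```r
library(tsibble)

# Hypothetical trading calendar: weekdays only, so the weekend is missing
trading_days <- as.Date("2024-01-01") + c(0, 1, 2, 3, 4, 7, 8)
toy <- data.frame(Date = trading_days, Close = c(10, 11, 12, 11, 13, 14, 13))

# Default: tsibble infers a regular 1-day interval and the weekend becomes a gap
regular_ts <- as_tsibble(toy, index = Date)
interval(regular_ts)
has_gaps(regular_ts)

# regular = FALSE declares the spacing irregular, which is what gafa_stock reports
irregular_ts <- as_tsibble(toy, index = Date, regular = FALSE)
interval(irregular_ts)
is_regular(irregular_ts)
```

Treating trading days as irregular avoids having to fill weekends and holidays with missing values.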

Use autoplot() to produce a time plot of each series.

autoplot(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

autoplot(pelt, Lynx)

autoplot(gafa_stock, Close)

# For the last plot, modify the axis labels and title.

autoplot(vic_elec, Demand) +
  labs(title = "Demand for electricity in Victoria, Australia",
       x = "Year",
       y = "Demand")

2.2

Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

peak_cday <- gafa_stock  |>
  group_by(Symbol)  |>
  filter(Close == max(Close)) 

peak_cday
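
A small sketch of the same grouped-filter pattern on made-up prices (the symbols and values are illustrative only) shows that the filter is applied within each group, and that ties for the maximum return more than one row:

```r
library(dplyr)

# Hypothetical closing prices for two made-up symbols
prices <- tibble::tibble(
  Symbol = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  Date   = as.Date("2024-01-01") + c(0, 1, 2, 0, 1, 2),
  Close  = c(10, 12, 11, 5, 5, 4)
)

# max(Close) is evaluated separately for each Symbol
peaks <- prices |>
  group_by(Symbol) |>
  filter(Close == max(Close)) |>
  ungroup()

peaks
```

Here BBB reaches its peak on two days, so both rows are kept.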

2.3

Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

# a Read tute1.csv (the path is relative, so the file must be in the working directory)
read_tute1 <- read_csv("tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (3): Sales, AdBudget, GDP
## date (1): Quarter
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# b Convert the data to time series

tute1_timeseries <- read_tute1 |>
  mutate(Quarter = yearquarter(Quarter)) |>
  as_tsibble(index = Quarter)
#  c Construct time series plots of each of the three series
tute1_timeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")
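
To see why the plot maps `value` and `name`, here is a minimal sketch of what `pivot_longer(-Quarter)` does to a wide table. The two rows are made-up numbers, not the contents of tute1.csv:

```r
library(tidyr)
library(tibble)

# Hypothetical wide table with one column per series
wide <- tibble(
  Quarter  = c("1981 Q1", "1981 Q2"),
  Sales    = c(1000, 1050),
  AdBudget = c(500, 520),
  GDP      = c(270, 275)
)

# Every column except Quarter is stacked into name/value pairs
long <- pivot_longer(wide, -Quarter)
long
```

Each original column becomes a set of rows labelled by `name`, which is what lets ggplot colour and facet by series.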

Check what happens when you don’t include facet_grid(). Without facet_grid(), all three series (AdBudget, GDP and Sales) are drawn as separate lines in a single panel with a shared y-axis. This makes the individual patterns harder to read when the series are on different scales.

tute1_timeseries |>
  pivot_longer(-Quarter) |>
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()

2.4

The USgas package contains data on the demand for natural gas in the US.

Install the USgas package.

library(USgas)
## Warning: package 'USgas' was built under R version 4.5.2
B. Create a tsibble from us_total with year as the index and state as the key.
tsibble_us_tot <- us_total  |>
  as_tsibble(index = year , key = state)  |>
  print()
## # A tsibble: 1,266 x 3 [1Y]
## # Key:       state [53]
##     year state        y
##    <int> <chr>    <int>
##  1  1997 Alabama 324158
##  2  1998 Alabama 329134
##  3  1999 Alabama 337270
##  4  2000 Alabama 353614
##  5  2001 Alabama 332693
##  6  2002 Alabama 379343
##  7  2003 Alabama 350345
##  8  2004 Alabama 382367
##  9  2005 Alabama 353156
## 10  2006 Alabama 391093
## # ℹ 1,256 more rows

C. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).

new_engl <- c("Maine", "Vermont", "New Hampshire",
                        "Massachusetts", "Connecticut", "Rhode Island")
tsibble_us_tot |>
  filter(state %in% new_engl) |>
  ggplot(aes(x = year, y = y, colour = state)) +
  geom_line(linewidth = 1) +
  labs(title = "New England Annual Gas Demand",
       x = "Year",
       y = "Total",
       colour = "State") +
  facet_grid(state ~ ., scales = "free_y") +
  theme(strip.text = element_text(size = 4.5, angle = 45, hjust = 1))

2.5

Download tourism.xlsx from the book website and read it into R using readxl::read_excel().

read_tourism <- read_excel("tourism.xlsx")

print(read_tourism)
## # A tibble: 24,320 × 5
##    Quarter    Region   State           Purpose  Trips
##    <chr>      <chr>    <chr>           <chr>    <dbl>
##  1 1998-01-01 Adelaide South Australia Business  135.
##  2 1998-04-01 Adelaide South Australia Business  110.
##  3 1998-07-01 Adelaide South Australia Business  166.
##  4 1998-10-01 Adelaide South Australia Business  127.
##  5 1999-01-01 Adelaide South Australia Business  137.
##  6 1999-04-01 Adelaide South Australia Business  200.
##  7 1999-07-01 Adelaide South Australia Business  169.
##  8 1999-10-01 Adelaide South Australia Business  134.
##  9 2000-01-01 Adelaide South Australia Business  154.
## 10 2000-04-01 Adelaide South Australia Business  169.
## # ℹ 24,310 more rows
B. Create a tsibble which is identical to the tourism tsibble from the tsibble package.
# Start from the data read from Excel, not the built-in tourism tsibble
tour_tsib <- read_tourism  |>
  mutate(Quarter = yearquarter(Quarter))  |>
  as_tsibble(key = c(Region, State, Purpose), index = Quarter)

print(tour_tsib)
## # A tsibble: 24,320 x 5 [1Q]
## # Key:       Region, State, Purpose [304]
##    Quarter Region   State           Purpose  Trips
##      <qtr> <chr>    <chr>           <chr>    <dbl>
##  1 1998 Q1 Adelaide South Australia Business  135.
##  2 1998 Q2 Adelaide South Australia Business  110.
##  3 1998 Q3 Adelaide South Australia Business  166.
##  4 1998 Q4 Adelaide South Australia Business  127.
##  5 1999 Q1 Adelaide South Australia Business  137.
##  6 1999 Q2 Adelaide South Australia Business  200.
##  7 1999 Q3 Adelaide South Australia Business  169.
##  8 1999 Q4 Adelaide South Australia Business  134.
##  9 2000 Q1 Adelaide South Australia Business  154.
## 10 2000 Q2 Adelaide South Australia Business  169.
## # ℹ 24,310 more rows
C. Find what combination of Region and Purpose had the maximum number of overnight trips on average.
overnight_trip_avg <- tour_tsib  |>
  as_tibble()  |>  # drop the Quarter index so the mean is taken over all quarters
  group_by(Region, Purpose)  |>
  summarise(Average = mean(Trips, na.rm = TRUE), .groups = "drop")  |>
  filter(Average == max(Average))  # single overall maximum, not one per Region

print(overnight_trip_avg)

D. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

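# summarise() on a grouped tsibble keeps Quarter as the index, so the result is already a tsibble keyed by State
trip_tot_tsib <- tour_tsib  |>
  group_by(State)  |>
  summarise(Total_Trips = sum(Trips))

print(trip_tot_tsib)
## # A tsibble: 640 x 3 [1Q]
## # Key:       State [8]
##    State Quarter Total_Trips
##    <chr>   <qtr>       <dbl>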
##  1 ACT   1998 Q1        551.
##  2 ACT   1998 Q2        416.
##  3 ACT   1998 Q3        436.
##  4 ACT   1998 Q4        450.
##  5 ACT   1999 Q1        379.
##  6 ACT   1999 Q2        558.
##  7 ACT   1999 Q3        449.
##  8 ACT   1999 Q4        595.
##  9 ACT   2000 Q1        600.
## 10 ACT   2000 Q2        557.
## # ℹ 630 more rows
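
As a minimal sketch on made-up data (the regions, states and trip counts below are hypothetical), `summarise()` on a grouped tsibble keeps the time index implicitly. Totals are therefore computed per State and per Quarter, and the result is still a tsibble keyed by State:

```r
library(tsibble)
library(dplyr)

# Hypothetical data: two states, each with two regions, over two quarters
toy <- tibble(
  Quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 4)),
  State   = rep(c("A", "A", "B", "B"), each = 2),
  Region  = rep(c("a1", "a2", "b1", "b2"), each = 2),
  Trips   = c(1, 2, 3, 4, 5, 6, 7, 8)
) |>
  as_tsibble(key = c(Region, State), index = Quarter)

# Regions are summed away; Quarter survives as the index
by_state <- toy |>
  group_by(State) |>
  summarise(Total_Trips = sum(Trips))

is_tsibble(by_state)
key_vars(by_state)
by_state
```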

2.8

Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.

From Employed:

  1. Can you spot any seasonality, cyclicity and trend?

There is a clear upward trend in US private employment: more people are employed as the years go on.

  2. What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?

The series has a strong seasonal pattern, reflecting regular hiring cycles within each year. The unusual years are those with sharp dips, such as around 1991 and 2010.

us_employment |>
  filter(Title == "Total Private") |>
  autoplot(Employed)

us_employment |>
  filter(Title == "Total Private") |>
  gg_season(Employed)
## Warning: `gg_season()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_season()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

us_employment |>
  filter(Title == "Total Private") |>
  gg_subseries(Employed)
## Warning: `gg_subseries()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_subseries()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

us_employment |>
  filter(Title == "Total Private") |>
  gg_lag(Employed)
## Warning: `gg_lag()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_lag()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

us_employment |>
  filter(Title == "Total Private") |>
  ACF(Employed) |>
  autoplot()
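
As a rough illustration of how trend and seasonality show up in an autocorrelation function, the sketch below uses synthetic monthly data and base R's `acf()` rather than `feasts::ACF()`. The series, its slope and its seasonal amplitude are all assumptions made for the example:

```r
set.seed(1)
t <- 1:120                                    # 10 years of hypothetical monthly data
y <- 0.5 * t + 10 * sin(2 * pi * t / 12) + rnorm(120)

# Trended series: autocorrelations start close to 1 and decay slowly
acf_raw <- acf(y, lag.max = 24, plot = FALSE)$acf[, 1, 1]

# After removing a linear trend, the seasonal pattern dominates
detrended <- residuals(lm(y ~ t))
acf_det <- acf(detrended, lag.max = 24, plot = FALSE)$acf[, 1, 1]

# Element k + 1 holds the autocorrelation at lag k
round(acf_raw[c(2, 13)], 2)   # lags 1 and 12 of the raw series
round(acf_det[c(7, 13)], 2)   # lags 6 and 12 of the detrended series
```

A slowly decaying ACF points to trend, while peaks at multiples of the seasonal period (here 12) point to seasonality.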

From Bricks:

  1. Can you spot any seasonality, cyclicity and trend?

Brick production does not follow a single steady trend: it appears to rise until around 1980 and decline afterwards, with a repeating seasonal pattern within each year.

  2. What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?

In the later years the trend seems to be declining. There is some seasonality, with production higher in Q2 and Q3. The early 2000s also appear to show an unusual decline.

autoplot(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_season(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_subseries(aus_production, Bricks)
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).

gg_lag(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values (gg_lag).

From Hare:

  1. Can you spot any seasonality, cyclicity and trend?

There doesn’t seem to be any upward or downward trend in this series.

  2. What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?

The series has no clear upward or downward trend. The data are annual, so there is no seasonal pattern to detect. The series has repeated peaks and troughs spread over many years, which points to cyclic behaviour.

autoplot(pelt, Hare)

gg_subseries(pelt, Hare)

From Cost:

  1. Can you spot any seasonality, cyclicity and trend?

For this series, there is an increase in costs over time.

  2. What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?

The trend is gradually increasing. The seasonal pattern does not look very strong, though costs may be slightly higher in winter. There are many sudden spikes in the series, possibly related to price changes.

PBS |>
  filter(ATC2 == "H02") |>
  autoplot(Cost)

PBS |>
  filter(ATC2 == "H02") |>
  gg_season(Cost)

PBS |>
  filter(ATC2 == "H02") |>
  gg_subseries(Cost)

PBS |>
  filter(ATC2 == "H02") |>
  ACF(Cost) |>
  autoplot()

From Barrels:

  1. Can you spot any seasonality, cyclicity and trend?

This series shows a fairly steady, gradual increase over time.

  2. What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?

The seasonal pattern is fairly clear: more US gasoline is supplied in the summer. There is a noticeable decline around 2010. The us_gasoline series distributed with fpp3 ends in January 2017, so it does not cover 2020 or the COVID-19 pandemic.

autoplot(us_gasoline, Barrels)

gg_season(us_gasoline, Barrels)

gg_subseries(us_gasoline, Barrels)

gg_lag(us_gasoline, Barrels)