2.1 Explore the Following Four Time Series

library(fpp3)
## Warning: package 'fpp3' was built under R version 4.3.2
## ── Attaching packages ────────────────────────────────────────────── fpp3 0.5 ──
## ✔ tibble      3.2.1     ✔ tsibble     1.1.3
## ✔ dplyr       1.1.2     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.0     ✔ feasts      0.3.1
## ✔ lubridate   1.9.2     ✔ fable       0.3.3
## ✔ ggplot2     3.4.2     ✔ fabletools  0.3.4
## Warning: package 'tsibble' was built under R version 4.3.2
## Warning: package 'tsibbledata' was built under R version 4.3.2
## Warning: package 'feasts' was built under R version 4.3.2
## Warning: package 'fabletools' was built under R version 4.3.2
## Warning: package 'fable' was built under R version 4.3.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()

2.1.A

help("aus_production")
## starting httpd help server ... done
help(pelt)
help("gafa_stock")
help("vic_elec")

2.1.B

The time interval for aus_production is quarterly: each row is a quarterly estimate of selected production indicators in Australia.

The time interval for pelt is yearly, running from 1845 to 1935.

The time interval for gafa_stock is daily but irregular, since it only covers trading days from 2014 to 2018.

The time interval for vic_elec is half-hourly: each row is a measurement of electricity demand.
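The intervals can also be checked in code rather than read from the help pages. The following is a minimal sketch, assuming fpp3 is installed; the names `regular` and `pelt_years` are illustrative only.

```r
library(fpp3)

# tsibble::interval() reports the spacing between observations.
# The namespace is spelled out because lubridate also exports interval().
tsibble::interval(aus_production)
tsibble::interval(pelt)
tsibble::interval(gafa_stock)   # trading days only, so reported as irregular
tsibble::interval(vic_elec)

# is_regular() separates regularly spaced tsibbles from irregular ones
regular <- c(
  aus_production = is_regular(aus_production),
  pelt           = is_regular(pelt),
  gafa_stock     = is_regular(gafa_stock),
  vic_elec       = is_regular(vic_elec)
)
regular

# The period covered can be read off the index column
pelt_years <- range(pelt$Year)
pelt_years
```

The irregular flag for gafa_stock matches the `[!]` shown in its tsibble header.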

2.1.C Autoplot.

autoplot(aus_production,Bricks) + ggtitle("Quarterly Productions of Bricks in Australia")
## Warning: Removed 20 rows containing missing values (`geom_line()`).

autoplot(pelt,Lynx) + ggtitle("Records of Canadian Lynx pelts traded")

autoplot(gafa_stock,Close) + ggtitle("Closing Prices for GAFA stocks")

autoplot(vic_elec,Demand) + ggtitle("Half-hourly electricity Demand")

2.2 Use filter() to find the peak closing price of each stock

gafa_stock %>%
  group_by(Symbol) %>%
  filter(Close == max(Close)) %>%
  select(Symbol,Close)
## # A tsibble: 4 x 3 [!]
## # Key:       Symbol [4]
## # Groups:    Symbol [4]
##   Symbol Close Date      
##   <chr>  <dbl> <date>    
## 1 AAPL    232. 2018-10-03
## 2 AMZN   2040. 2018-09-04
## 3 FB      218. 2018-07-25
## 4 GOOG   1268. 2018-07-26

I used dplyr functions to group the data by stock symbol, keep only the rows where Close equals its maximum within each group, and select the Symbol and Close columns. Date still appears in the output because it is the index of the tsibble, which select() always keeps.

2.3 Download file Tute1

tute1 <- read.csv("C:\\Users\\Al Haque\\OneDrive\\Desktop\\Data 624\\tute1.csv")
mytimeseries <- tute1 %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)
head(mytimeseries)
## # A tsibble: 6 x 4 [1Q]
##   Quarter Sales AdBudget   GDP
##     <qtr> <dbl>    <dbl> <dbl>
## 1 1981 Q1 1020.     659.  252.
## 2 1981 Q2  889.     589   291.
## 3 1981 Q3  795      512.  291.
## 4 1981 Q4 1004.     614.  292.
## 5 1982 Q1 1058.     647.  279.
## 6 1982 Q2  944.     602   254
mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

If you remove facet_grid(), all three series (Sales, AdBudget and GDP) are drawn in a single panel on a shared y-axis.

mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()

2.4 USgas package

2.4.A

library(USgas)
## Warning: package 'USgas' was built under R version 4.3.2

2.4.B

## Create a tsibble with year as the index and state as the key
Us_total <- us_total %>%
  as_tsibble(key = state,
             index = year)

2.4.C

Plotting the annual natural gas consumption by state for the six New England states:

Us_total %>%
  filter(state %in% c('Maine', 'Vermont', 'New Hampshire', 'Massachusetts', 'Connecticut', 'Rhode Island')) %>%
  ggplot(aes(x = year, y = y, color = state)) +
  geom_line() +
  ggtitle("US Annual Total Natural Gas Consumption")

2.5 Download Tourism.xlsx

2.5.A

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0     ✔ readr   2.1.4
## ✔ purrr   1.0.1     ✔ stringr 1.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()     masks stats::filter()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag()        masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Tourism <- readxl::read_excel("C:\\Users\\Al Haque\\OneDrive\\Desktop\\Data 624\\tourism.xlsx")
head(Tourism)
## # A tibble: 6 × 5
##   Quarter    Region   State           Purpose  Trips
##   <chr>      <chr>    <chr>           <chr>    <dbl>
## 1 1998-01-01 Adelaide South Australia Business  135.
## 2 1998-04-01 Adelaide South Australia Business  110.
## 3 1998-07-01 Adelaide South Australia Business  166.
## 4 1998-10-01 Adelaide South Australia Business  127.
## 5 1999-01-01 Adelaide South Australia Business  137.
## 6 1999-04-01 Adelaide South Australia Business  200.

2.5.B

## Convert it to a tsibble
Tourism <- Tourism %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter,key = c("Region","State","Purpose"))

2.5.C

## Group the data by Region and Purpose, take the mean of Trips, ungroup, and keep the row with the highest value.
## Note: summarise() on a tsibble also keeps Quarter, so the mean is taken within each quarter.
Tourism %>%
  group_by(Region,Purpose) %>%
  summarise(Trips = mean(Trips)) %>%
  ungroup() %>%
  filter(Trips == max(Trips))
## # A tsibble: 1 x 4 [1Q]
## # Key:       Region, Purpose [1]
##   Region    Purpose  Quarter Trips
##   <chr>     <chr>      <qtr> <dbl>
## 1 Melbourne Visiting 2017 Q4  985.

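The highest value is for Melbourne, for the purpose of visiting, in 2017 Q4 (about 985 trips). Because Tourism is a tsibble, summarise() keeps the Quarter index, so this is the highest single-quarter value rather than an average over the whole period. Converting with as_tibble() before grouping would give the overall average for each Region and Purpose.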
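The output above still carries a Quarter column, which means the mean was taken within each quarter rather than across all of them. One way to average over the whole period is to drop to a plain tibble before grouping. The following is a minimal sketch, assuming the `tourism` tsibble shipped with fpp3 holds the same data as tourism.xlsx; the name `avg_trips` is illustrative only.

```r
library(fpp3)

# Assumption: the packaged `tourism` tsibble stands in for tourism.xlsx
avg_trips <- tourism %>%
  as_tibble() %>%                      # drop the time index so Quarter is no longer kept
  group_by(Region, Purpose) %>%
  summarise(Trips = mean(Trips), .groups = "drop") %>%
  slice_max(Trips, n = 1)              # row(s) with the largest average

avg_trips
```

The result has one row per winning Region and Purpose combination and no Quarter column, so Trips here is an average across all quarters.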

2.5.D

Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

## Sum Trips over Region and Purpose within each Quarter and State
Tourism2 <- readxl::read_excel("C:\\Users\\Al Haque\\OneDrive\\Desktop\\Data 624\\tourism.xlsx")
Tourism2 %>%
  group_by(Quarter,State) %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  summarise(Trips = sum(Trips)) %>%
  as_tsibble(index = Quarter,key = State)
## `summarise()` has grouped output by 'Quarter'. You can override using the
## `.groups` argument.
## # A tsibble: 640 x 3 [1Q]
## # Key:       State [8]
## # Groups:    @ Quarter [80]
##    Quarter State Trips
##      <qtr> <chr> <dbl>
##  1 1998 Q1 ACT    551.
##  2 1998 Q2 ACT    416.
##  3 1998 Q3 ACT    436.
##  4 1998 Q4 ACT    450.
##  5 1999 Q1 ACT    379.
##  6 1999 Q2 ACT    558.
##  7 1999 Q3 ACT    449.
##  8 1999 Q4 ACT    595.
##  9 2000 Q1 ACT    600.
## 10 2000 Q2 ACT    557.
## # ℹ 630 more rows
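The same totals can be produced without re-reading the spreadsheet, because summarise() on a tsibble aggregates within each time point automatically. The following is a minimal sketch, assuming the `tourism` tsibble shipped with fpp3 holds the same data as tourism.xlsx; the name `state_trips` is illustrative only.

```r
library(fpp3)

# Assumption: the packaged `tourism` tsibble stands in for tourism.xlsx
state_trips <- tourism %>%
  group_by(State) %>%                  # State becomes the only key
  summarise(Trips = sum(Trips))        # Quarter is kept as the index

state_trips
```

Grouping by State alone is enough here: the index does not need to be listed in group_by(), and the result is already a tsibble keyed by State.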

2.8 Use graphics functions

Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series:

US Employment

us_employment %>%
  filter(Title == "Total Private") %>%
  autoplot(Employed) + ggtitle("Total Private Employed")

us_employment %>%
  filter(Title == "Total Private") %>%
  gg_season(Employed)

us_employment %>%
  filter(Title == "Total Private") %>%
  gg_subseries(Employed)

us_employment %>%
  filter(Title == "Total Private") %>%
  gg_lag(Employed,geom = "point")

us_employment %>%
  filter(Title == "Total Private") %>%
  ACF(Employed) %>%
  autoplot()

For this data, Total Private employment shows a steadily increasing trend over the years. There is also a seasonal pattern within each year: employment rises over the first six months, dips, and then rises again. The lag plot shows a strong positive relationship in every lag panel. One notable feature is the sharp drop in Total Private employment during the late 2000s.

aus_production (Bricks)

aus_production %>%
  autoplot(Bricks) + ggtitle("Clay Brick Quarterly Production in Australia")
## Warning: Removed 20 rows containing missing values (`geom_line()`).

aus_production %>%
  gg_season(Bricks)
## Warning: Removed 20 rows containing missing values (`geom_line()`).

aus_production %>%
  gg_subseries(Bricks)
## Warning: Removed 5 rows containing missing values (`geom_line()`).

aus_production %>%
  gg_lag(Bricks,geom = "point")
## Warning: Removed 20 rows containing missing values (gg_lag).

aus_production %>%
  ACF(Bricks) %>%
  autoplot()

Brick production shows no clear trend, but there is strong yearly seasonality as well as some cyclic behavior. One thing to note is a clear drop in brick production in Q1 of 1980. The seasonal plot shows production rising slightly from Q1 to Q3 before falling in Q4.

Hare From Pelt

pelt %>%
  autoplot(Hare) + ggtitle("Number of Hare Pelt Traded")

pelt %>%
  gg_subseries(Hare)

pelt %>%
  gg_lag(Hare,geom = "point")

pelt %>%
  ACF(Hare) %>%
  autoplot()

The hare pelt trading record shows no clear trend. Because the data are annual, the repeated rises and falls are better described as cyclic than seasonal: the number of pelts traded swings sharply up and down, with the numbers decreasing as the years go on. The lag plot shows moderate positive correlation, especially at lags 1 and 2. Around 1860 there is a very sharp decrease in hare pelts traded.

PBS Cost

PBS %>%
  filter(ATC2 == "H02") %>%
  autoplot(Cost) 

I was only able to plot this data with autoplot(). The other functions did not work for me, possibly because filtering on H02 still leaves four separate series (one per Concession and Type combination). None of the H02 series shows a visible trend, but each has strong seasonality, with rises and falls from month to month. The General/Co-payments series seems to have the strongest peaks and drop-offs in the chart.
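One possible workaround is to collapse the H02 series into a single monthly total before calling the other graphics functions. The following is a minimal sketch, assuming that a combined total is an acceptable substitute for the individual series; the name `h02` is illustrative only.

```r
library(fpp3)

# Sum Cost across all H02 series so only one monthly series remains
h02 <- PBS %>%
  filter(ATC2 == "H02") %>%
  summarise(Cost = sum(Cost))

h02 %>% autoplot(Cost) + ggtitle("Total monthly cost of H02 scripts")
h02 %>% gg_season(Cost)
h02 %>% gg_subseries(Cost)
h02 %>% gg_lag(Cost, geom = "point")
h02 %>% ACF(Cost) %>% autoplot()
```

Summing loses the per-series detail, so the autoplot() of the separate series above is still useful for comparing them.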

US_Gasoline

us_gasoline %>%
  autoplot(Barrels) + ggtitle("US finished motor gasoline product supplied")

us_gasoline %>%
  gg_season(Barrels)

us_gasoline %>%
  gg_subseries(Barrels)

us_gasoline %>%
  gg_lag(Barrels, geom = "point")

us_gasoline %>%
  ACF(Barrels) %>%
  autoplot()

This series shows a positive trend and signs of seasonality, with peaks and dips at particular times of the year. There also appears to be some cyclic behavior, with rises and falls that do not follow a fixed frequency. The lag plot shows positive correlation but is over-plotted. The seasonal pattern does not appear to change over time.

Fin