library(tidyverse)
library(fpp3)
Explore the following four time series: - Bricks from aus_production - Lynx from pelt - Close from gafa_stock - Demand from vic_elec.
# read in the time series
?aus_production
?pelt
?gafa_stock
?vic_elec
aus_production
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1956 Q1 284 5225 189 465 3923 5
## 2 1956 Q2 213 5178 204 532 4436 6
## 3 1956 Q3 227 5297 208 561 4806 7
## 4 1956 Q4 308 5681 197 570 4418 6
## 5 1957 Q1 262 5577 187 529 4339 5
## 6 1957 Q2 228 5651 214 604 4811 7
# Time period
min(aus_production$Quarter)
## <yearquarter[1]>
## [1] "1956 Q1"
## # Year starts on: January
max(aus_production$Quarter)
## <yearquarter[1]>
## [1] "2010 Q2"
## # Year starts on: January
# Interval of each observation
interval(aus_production)
## <interval[1]>
## [1] 1Q
The interval for the aus_production series is quarterly from 1956 to the second quarter (Q2) of 2010
pelt
head(pelt)
## # A tsibble: 6 x 3 [1Y]
## Year Hare Lynx
## <dbl> <dbl> <dbl>
## 1 1845 19580 30090
## 2 1846 19600 45150
## 3 1847 19610 49150
## 4 1848 11990 39520
## 5 1849 28040 21230
## 6 1850 58000 8420
# Time period
min(pelt$Year)
## [1] 1845
max(pelt$Year)
## [1] 1935
# Interval
interval(pelt)
## <interval[1]>
## [1] 1Y
For the pelt time series data was collected over one year intervals from 1845 to 1935
gafa_stock
head(gafa_stock)
## # A tsibble: 6 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2014-01-02 79.4 79.6 78.9 79.0 67.0 58671200
## 2 AAPL 2014-01-03 79.0 79.1 77.2 77.3 65.5 98116900
## 3 AAPL 2014-01-06 76.8 78.1 76.2 77.7 65.9 103152700
## 4 AAPL 2014-01-07 77.8 78.0 76.8 77.1 65.4 79302300
## 5 AAPL 2014-01-08 77.0 77.9 77.0 77.6 65.8 64632400
## 6 AAPL 2014-01-09 78.1 78.1 76.5 76.6 65.0 69787200
# Time period
min(gafa_stock$Date)
## [1] "2014-01-02"
max(gafa_stock$Date)
## [1] "2018-12-31"
# Interval
interval(gafa_stock)
## <interval[1]>
## [1] !
The interval of the stock data is 1 business day from January 2nd, 2014 to December 31st, 2018.
head(vic_elec)
## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
## Time Demand Temperature Date Holiday
## <dttm> <dbl> <dbl> <date> <lgl>
## 1 2012-01-01 00:00:00 4383. 21.4 2012-01-01 TRUE
## 2 2012-01-01 00:30:00 4263. 21.0 2012-01-01 TRUE
## 3 2012-01-01 01:00:00 4049. 20.7 2012-01-01 TRUE
## 4 2012-01-01 01:30:00 3878. 20.6 2012-01-01 TRUE
## 5 2012-01-01 02:00:00 4036. 20.4 2012-01-01 TRUE
## 6 2012-01-01 02:30:00 3866. 20.2 2012-01-01 TRUE
# Time period
min(vic_elec$Time)
## [1] "2012-01-01 AEDT"
max(vic_elec$Time)
## [1] "2014-12-31 23:30:00 AEDT"
# Interval
interval(vic_elec)
## <interval[1]>
## [1] 30m
The interval for the vic_elec time series is every 30 minutes from January 1st, 2012 at midnight to December 31st, 2014 at 11:30 PM
# aus_production - Bricks
aus_production %>% autoplot(Bricks) + labs(y = " Number of Bricks (millions)", title = "Australian Clay Brick Production")
# pelt - Lynx
pelt %>% autoplot(Lynx) + labs(y = "Number of pelts", title = "Canadian Lynx Traded")
# gafa_stock
gafa_stock %>% autoplot(Close) + labs(ylabel = "US Dollars", title = "Historical Closing Stock Prices")
# vic_elec - Demand
vic_elec %>% autoplot(Demand)
vic_elec %>% autoplot(Demand) + labs(x = "Year", y = "Demand (MWh)", title = "Electricity Demand for Victoria, Australia")
# EXERCISE 2.2
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock
head(gafa_stock)
## # A tsibble: 6 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2014-01-02 79.4 79.6 78.9 79.0 67.0 58671200
## 2 AAPL 2014-01-03 79.0 79.1 77.2 77.3 65.5 98116900
## 3 AAPL 2014-01-06 76.8 78.1 76.2 77.7 65.9 103152700
## 4 AAPL 2014-01-07 77.8 78.0 76.8 77.1 65.4 79302300
## 5 AAPL 2014-01-08 77.0 77.9 77.0 77.6 65.8 64632400
## 6 AAPL 2014-01-09 78.1 78.1 76.5 76.6 65.0 69787200
# Get the unique Stocks in the data set
unique(gafa_stock$Symbol)
## [1] "AAPL" "AMZN" "FB" "GOOG"
gafa_stock %>% autoplot(Close)
# Peak close for AAPL
gafa_stock %>% filter(Symbol == "AAPL") %>% arrange(desc(Close)) %>% head(1)
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Symbol`, `Date` first.
## # A tsibble: 1 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 2018-10-03 230. 233. 230. 232. 230. 28654800
The peak closing price for AAPL - Apple Inc was $232.07 on 10/03/2018
# Peak close for AMZN
gafa_stock %>% filter(Symbol == "AMZN") %>% arrange(desc(Close)) %>% head(1)
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Symbol`, `Date` first.
## # A tsibble: 1 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AMZN 2018-09-04 2026. 2050. 2013 2040. 2040. 5721100
The peak closing price for AMZN - Amazon.com Inc was $2039.51 on 09/04/2018
# Peak close for FB
gafa_stock %>% filter(Symbol == "FB") %>% arrange(desc(Close)) %>% head(1)
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Symbol`, `Date` first.
## # A tsibble: 1 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 FB 2018-07-25 216. 219. 214. 218. 218. 58954200
The peak closing price for FB - Meta Platforms Inc was $217.50 on 07/25/2018
# Peak close for GOOG
gafa_stock %>% filter(Symbol == "GOOG") %>% arrange(desc(Close)) %>% head(1)
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Symbol`, `Date` first.
## # A tsibble: 1 x 8 [!]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 GOOG 2018-07-26 1251 1270. 1249. 1268. 1268. 2405600
The peak closing price for GOOG - Alphabet Inc was $1268.33 on 07/26/2018
tute1 <- read_csv("https://raw.githubusercontent.com/D-hartog/DATA624/refs/heads/main/HW1/tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(tute1)
## # A tibble: 6 × 4
## Quarter Sales AdBudget GDP
## <date> <dbl> <dbl> <dbl>
## 1 1981-03-01 1020. 659. 252.
## 2 1981-06-01 889. 589 291.
## 3 1981-09-01 795 512. 291.
## 4 1981-12-01 1004. 614. 292.
## 5 1982-03-01 1058. 647. 279.
## 6 1982-06-01 944. 602 254
mytimeseries <- tute1 %>%
mutate(Quarter = yearquarter(Quarter, fiscal_start = 1)) %>%
as_tsibble(index = Quarter)
head(mytimeseries)
## # A tsibble: 6 x 4 [1Q]
## Quarter Sales AdBudget GDP
## <qtr> <dbl> <dbl> <dbl>
## 1 1981 Q1 1020. 659. 252.
## 2 1981 Q2 889. 589 291.
## 3 1981 Q3 795 512. 291.
## 4 1981 Q4 1004. 614. 292.
## 5 1982 Q1 1058. 647. 279.
## 6 1982 Q2 944. 602 254
mytimeseries %>%
pivot_longer(-Quarter) %>%
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid()
mytimeseries %>%
pivot_longer(-Quarter) %>%
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line()
When the facet grid function is not added, each variable is plotted on
the same graph. Since they all share the y - axis the detail of the
trends gets lost, especially for GDP.
The USgas package contains data on the demand for natural gas in the US.
# install.packages("USgas")
library(USgas)
## Warning: package 'USgas' was built under R version 4.3.3
head(us_total)
## year state y
## 1 1997 Alabama 324158
## 2 1998 Alabama 329134
## 3 1999 Alabama 337270
## 4 2000 Alabama 353614
## 5 2001 Alabama 332693
## 6 2002 Alabama 379343
#
ustotal_tsibble <- us_total %>% as_tsibble(index = year, key = state)
ustotal_tsibble %>% filter(state %in% c("Maine", "Vermont", "New Hampshire", "Massachusetts", "Connecticut", "Rhode Island")) %>% autoplot(y) + labs(x = "Year", y= "Consumption", title = "Natural Gas Consumption Among New England States")
tourism <- readxl::read_excel("tourism.xlsx")
head(tourism)
## # A tibble: 6 × 5
## Quarter Region State Purpose Trips
## <chr> <chr> <chr> <chr> <dbl>
## 1 1998-01-01 Adelaide South Australia Business 135.
## 2 1998-04-01 Adelaide South Australia Business 110.
## 3 1998-07-01 Adelaide South Australia Business 166.
## 4 1998-10-01 Adelaide South Australia Business 127.
## 5 1999-01-01 Adelaide South Australia Business 137.
## 6 1999-04-01 Adelaide South Australia Business 200.
tourism_ts <- tourism %>% mutate(Quarter = yearquarter(Quarter, fiscal_start = 1)) %>% as_tsibble(index = Quarter, key = c(Region, State, Purpose))
head(tourism_ts)
## # A tsibble: 6 x 5 [1Q]
## # Key: Region, State, Purpose [1]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
tourism %>% group_by(Region, Purpose) %>% summarise(Avg_overnight_trips = mean(Trips)) %>% arrange(desc(Avg_overnight_trips)) %>% head(1)
## `summarise()` has grouped output by 'Region'. You can override using the
## `.groups` argument.
## # A tibble: 1 × 3
## # Groups: Region [1]
## Region Purpose Avg_overnight_trips
## <chr> <chr> <dbl>
## 1 Sydney Visiting 747.
state_total_trips_ts <- tourism_ts %>% group_by(State) %>% summarise(Total_trips = sum(Trips))
head(state_total_trips_ts)
## # A tsibble: 6 x 3 [1Q]
## # Key: State [1]
## State Quarter Total_trips
## <chr> <qtr> <dbl>
## 1 ACT 1998 Q1 551.
## 2 ACT 1998 Q2 416.
## 3 ACT 1998 Q3 436.
## 4 ACT 1998 Q4 450.
## 5 ACT 1999 Q1 379.
## 6 ACT 1999 Q2 558.
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series:
head(us_employment)
## # A tsibble: 6 x 4 [1M]
## # Key: Series_ID [1]
## Month Series_ID Title Employed
## <mth> <chr> <chr> <dbl>
## 1 1939 Jan CEU0500000001 Total Private 25338
## 2 1939 Feb CEU0500000001 Total Private 25447
## 3 1939 Mar CEU0500000001 Total Private 25833
## 4 1939 Apr CEU0500000001 Total Private 25801
## 5 1939 May CEU0500000001 Total Private 26113
## 6 1939 Jun CEU0500000001 Total Private 26485
# Filter data frame for Total private employed
total_private <- us_employment %>% filter(Title == "Total Private")
Can you spot any seasonality, cyclicity and trend?
total_private %>% autoplot(Employed) + labs(title = "US Employment: Total Private")
total_private %>% filter(year(Month) > 1960 & year(Month) < 1965) %>% autoplot(Employed)
Using the autoplot() function we can see a clear positive trend from 1939 to 2020 and the trend appears to be linear. There also appears to be some cyclic behavior. In the first graph above, the cycles are not consistently spaced, occurring between approximately 5-10 years. When we zoom in to the years 1961-1965 there are what might appear to be seasonal trends.
What do you learn about the series? The series contains observations from 1939 to 2020 in one month intervals. Despite there being some fluctuations with in a year and over several years, the trend in the number of employed persons continues to rise.
What can you say about the seasonal patterns?
total_private %>% gg_season(Employed, period = "year") + labs(title = "Seasonal Trends in US Employment: Total Private ")
Using gg_season to plot the trends across every year, we see small
linear trends from the beginning to the end of each year but not a clear
seasonal pattern as was initally visualized in the zoomed in
autoplot.
total_private %>% gg_lag(period = "month", lags = 1:24)
## Plot variable not specified, automatically selected `y = Employed`
total_private %>% ACF(Employed, lag_max = 100) %>% autoplot()
There are strong autocorrelations between the lags and the
autocorrelation plot demonstrates a slow decrrease in the strength of
the correlations as the lags get further out. This aligns with the
linear trend we visualized when plotting the whole time series.
Can you identify any unusual years?
total_private %>% filter(year(Month) > 2005 & year(Month) < 2011) %>% autoplot(Employed)
The most obvious unusual year that broke from yearly trend was in 2008.
This was an unusual year as there was a larger drop in employment
starting in the middle of the year that continued downward in the start
of 2009. This is in contrast to other years where we might see a slight
drop towards the end of the year followed by a steady incline at the
start of the next year. This can be explained by the 2008 financial
crisis and recession.
bricks <- aus_production %>% select(Bricks)
head(bricks)
## # A tsibble: 6 x 2 [1Q]
## Bricks Quarter
## <dbl> <qtr>
## 1 189 1956 Q1
## 2 204 1956 Q2
## 3 208 1956 Q3
## 4 197 1956 Q4
## 5 187 1957 Q1
## 6 214 1957 Q2
min(bricks$Quarter)
## <yearquarter[1]>
## [1] "1956 Q1"
## # Year starts on: January
max(bricks$Quarter)
## <yearquarter[1]>
## [1] "2010 Q2"
## # Year starts on: January
Can you spot any seasonality, cyclic and trend?
bricks %>% autoplot(Bricks) + labs(title = "Australian Clay Brick Production")
bricks %>% filter(year(Quarter) <= 1965) %>% autoplot(Bricks)
Plotting the time series does reveal an initial upward trend in the
number of clay bricks being produced, as well as seasonality and cyclic
trends. At about 1975 there is abig dorp in the production followed by a
period of increased production before another drop off in the early
80’s. The production picks up again but after this, the production of
clay bricks follows a downward trend. The season trends still persist as
well as the cycles with a large drop in production every 5-7 years.
What do you learn about the series? The series spans the years from 1956 to the second quarter of 2010. The production of clay bricks seems to be fairly predictable up until the mid 1970’s. After this there is visually more variation in the clay brick production.
What can you say about the seasonal patterns?
bricks %>% gg_season(Bricks) + labs(title = "Seasonal Trends")
When visualizing the trends between each quarter in a given year, for
most early years there is an increase in the amount of bricks being
produced from Q1 to Q2, production steadies during Q2 to Q3 or slightly
increases, and from Q3 to Q4 there is a slight decrease. It is harder to
identify a consistent seasonal trend in later years.
bricks %>% gg_lag(Bricks, lags = 1:9, geom = "point")
## Warning: Removed 20 rows containing missing values (gg_lag).
bricks %>% ACF(Bricks, lag_max = 36) %>% autoplot()
The lag plots above show strong positive correlations over 9 lags and
when plotting the correlation coefficients we also see the typical
pattern where the slow decrease in correlations indicates the overall
linear trend and the seasonality can be seen spikes in the correlation
coefficients at every 4th lag.
Can you identify any unusual years? As the trend was continuing to increase, in the early 1980’s there was a big drop in the early 1980’s.
hare <- pelt %>% select(Hare)
head(hare)
## # A tsibble: 6 x 2 [1Y]
## Hare Year
## <dbl> <dbl>
## 1 19580 1845
## 2 19600 1846
## 3 19610 1847
## 4 11990 1848
## 5 28040 1849
## 6 58000 1850
Can you spot any seasonality, cyclicity and trend?
hare %>% autoplot(Hare) + labs(title = "Number of Snowshoe Hare pelts Traded")
There does not appear to be any linear trend in the amount of hare pelts
traded in the time frame data was collect. There does appear to be
cyclic patterns as the peaks and valleys span 3-5 years.
What do you learn about the series? The data in the series is collected every year from 1845 to 1935. Over the span of 90 years the amount of pelts traded follows a cyclical behavior.
What can you say about the seasonal patterns? Since there is only data for each year, it is not possible to get an accurate assessment on seasonality.
hare %>% gg_lag(Hare, lags = 1:10, geom = "point")
hare %>% ACF(Hare, lag_max = 25) %>% autoplot()
The autocorrelation plot shows repeating periods of positive and
negative spikes in a sinusodal wave pattern. The autocorrelations are
also decaying towards 0, getting weaker the further the lags are away
from each other.
Can you identify any unusual years?
hare %>% arrange(desc(Hare))
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by ``, `Year` first.
## # A tsibble: 91 x 2 [1Y]
## Hare Year
## <dbl> <dbl>
## 1 152650 1863
## 2 148360 1864
## 3 134860 1886
## 4 134850 1885
## 5 103790 1887
## 6 101250 1875
## 7 97120 1876
## 8 89760 1933
## 9 88480 1853
## 10 88060 1856
## # ℹ 81 more rows
In the autolot we can clearly see two spikes in the mid to late 1800’s when the amount of Hare pelts were being traded. When we sort the data we can see that between 1863-1864 and 1885 - 1886 there were periods noted in the graph.
H02 <- PBS %>% filter(ATC2 == "H02")
H02
## # A tsibble: 816 x 9 [1M]
## # Key: Concession, Type, ATC1, ATC2 [4]
## Month Concession Type ATC1 ATC1_desc ATC2 ATC2_desc Scripts Cost
## <mth> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 1991 Jul Concessional Co-paym… H Systemic… H02 CORTICOS… 63261 317384
## 2 1991 Aug Concessional Co-paym… H Systemic… H02 CORTICOS… 53528 269891
## 3 1991 Sep Concessional Co-paym… H Systemic… H02 CORTICOS… 52822 269703
## 4 1991 Oct Concessional Co-paym… H Systemic… H02 CORTICOS… 54016 280418
## 5 1991 Nov Concessional Co-paym… H Systemic… H02 CORTICOS… 49281 268070
## 6 1991 Dec Concessional Co-paym… H Systemic… H02 CORTICOS… 51798 277139
## 7 1992 Jan Concessional Co-paym… H Systemic… H02 CORTICOS… 42436 221772
## 8 1992 Feb Concessional Co-paym… H Systemic… H02 CORTICOS… 52913 272345
## 9 1992 Mar Concessional Co-paym… H Systemic… H02 CORTICOS… 62908 325700
## 10 1992 Apr Concessional Co-paym… H Systemic… H02 CORTICOS… 68499 349271
## # ℹ 806 more rows
Can you spot any seasonality, cyclicity and trend?
H02 %>% autoplot(Cost) +
facet_grid(Type ~ Concession, scales = "free_y")
There are four combinations that we can look at regarding the cost of
scripts. - Concessional scripts and co-payments: The cost from
concessional scripts and co-payments appears to increase over time with
seasonal features. - Concessional scripts and safety net costs appear to
have cyclical patterns. The peak safety net/concessional costs appear to
increase over time. - General/co payment costs do not appear to have any
trends but potentially have seasonality. - General/safety net costs also
follow a cyclical pattern with two clear spikes in the beginning of the
time series
What do you learn about the series? Monthly data about the total number of scripts and the cost of the scripts are collected. The four different groups appear to have individual patterns of trend, seasaonality and cycles.
What can you say about the seasonal patterns?
1. Concessional/safety net costs
H02 %>% filter((Concession == "Concessional") & (Type == "Safety net")) %>% gg_season(Cost) + labs(title = "Concessional/safety net costs")
# Lag plot
H02 %>% filter((Concession == "Concessional") & (Type == "Safety net")) %>% gg_lag(Cost, geom = "point") + labs(title = "Concessional/safety net costs")
# ACF plot
H02 %>% filter((Concession == "Concessional") & (Type == "Safety net")) %>% ACF(Cost, lag_max = 72) %>% autoplot() + labs(title = "Concessional/safety net costs")
The seasonal, lag and autocorrelation plots for concessional/safety net costs indicate a cyclica pattern and the decaying autocorrelations also point to some trend over time.
2. Concessional/co-payment costs
H02 %>% filter((Concession == "Concessional") & (Type == "Co-payments")) %>% gg_season(Cost) + labs(title = "Concessional/co-payment costs")
# Lag plot
H02 %>% filter((Concession == "Concessional") & (Type == "Co-payments")) %>% gg_lag(Cost, geom = "point") + labs(title = "Concessional/co-payment costs")
# ACF plot
H02 %>% filter((Concession == "Concessional") & (Type == "Co-payments")) %>% ACF(Cost, lag_max = 72) %>% autoplot() + labs(title = "Concessional/co-payment costs")
The seasonal plot shows that for most years costs of scripts are higher from March to July and then start to drop towards the end of the year. There are strong autocorrelations in the data that indicate previous months are good indicators of the cost of the current year. The autocorrelation plot provides a lot of information, the peaks at the multiples of 12 being highly correlated indicates seasonality, the decay towards 0 indicates a linear trend and the sinusoidal wave form indicates a cyclical pattern in the data.
3. General/safety net costs
# Seasonal plot
H02 %>% filter((Concession == "General") & (Type == "Safety net")) %>% gg_season(Cost) + labs(title = "General/safety net costs")
# Lag plot
H02 %>% filter((Concession == "General") & (Type == "Safety net")) %>% gg_lag(Cost, geom = "point") + labs(title = "General/safety net costs")
# ACF plot
H02 %>% filter((Concession == "General") & (Type == "Safety net")) %>% ACF(Cost, lag_max = 72) %>% autoplot() + labs(title = "General/safety net costs")
The seasonal plot of the cost of the General type/safety net shows higher costs at the beginning and at then end of the year. The lowest costs are in February to May, where April marks the beginning of the rise in costs. In the lag plot there appears to be two clusters but moderate correlations are calculated and plotted. The autocorrelation plot shows consistent periods of positive and negative correlations confirming seasonality.
4. General/co-payments costs
# Time series plot
H02 %>% filter((Concession == "General") & (Type == "Co-payments")) %>% autoplot(Cost) + labs(title = "General/co-payment costs")
# Seasonal plot
H02 %>% filter((Concession == "General") & (Type == "Co-payments")) %>% gg_season(Cost) + labs(title = "General/co-payment costs")
# Lag plot
H02 %>% filter((Concession == "General") & (Type == "Co-payments")) %>% gg_lag(Cost, geom = "point") + labs(title = "General/co-payment costs")
# ACF plot
H02 %>% filter((Concession == "General") & (Type == "Co-payments")) %>% ACF(Cost, lag_max = 72) %>% autoplot() + labs(title = "General/co-payment costs")
I re-plotted the time series to visualize it more clearly. The seasonal plots of general/co payments costs is difficult to see a clear pattern between the years. The lag plots over 9 lags demonstrate positive correlations but the autocorrelation plots confirm weak correlations. Beyond 21 lags the correlations are negative except at 23 (however it is close to 0). Visually inspecting the autocorrelation plot there is no pattern in the peaks and valleys, indicating weak seasonality in the plot.
Can you identify any unusual years? There does not appear to be any unusual years.
Can you spot any seasonality, cyclicity and trend?/What do you learn about the series?
?us_gasoline
head(us_gasoline)
## # A tsibble: 6 x 2 [1W]
## Week Barrels
## <week> <dbl>
## 1 1991 W06 6.62
## 2 1991 W07 6.43
## 3 1991 W08 6.58
## 4 1991 W09 7.22
## 5 1991 W10 6.88
## 6 1991 W11 6.95
us_gasoline %>% autoplot(Barrels) + labs(y = "Barrels (million/day)", title = "US Finished Motor Gasoline Product Supplied")
The series contains information in weekly intervals from the 6th week of
1991 to the 3rd week of 2017.. Visualizing the numbers of gasoline
barrels supplied overtime we can clearly see a linear trend up until the
mid 2000’s. The trend reverses and then again looks to trend upwards.
The plot also reveals a seasonality component as well. Zooming in on a
few early years gives a better visual on trends within one year, where
we can see the seasonal trends.
us_gasoline %>% filter(year(Week) <= 1993) %>% autoplot(Barrels)
What can you say about the seasonal patterns?
us_gasoline %>% gg_season(Barrels)
Looking at seasonality using gg_seasons, we see some consistent patterns
year to year. There appears to be an upward trend up until the middle of
the year and the number of barrels produced seem to level out or
slightly drop.
us_gasoline %>% gg_lag(Barrels, lags = 1:10, geom = "point") + labs(title = "Lag plots")
Inspecting the lag plots, there are strong positive correlations over
all lags up to 10. The data points appear to spread out as the lags get
further spaced. An autocorrelation plot will help to provide further
insight into the presence of seasonality.
# Lags over 4 years
us_gasoline %>% ACF(Barrels, lag_max = 208) %>% autoplot()
Inspecting the lag plot autocorrelation plot for a max lag of 208
(equivalent to 4 years) we can see peaks and valleys at regular
intervals indicating. I decided to plot more than what was visualized in
the lag plots to get a better idea of the long term trend. The slow
decrease in the ACF as the lags increases is due to the trend, while the
“scalloped” shape is due to the seasonality.
Can you identify any unusual years?
Visually inspecting the first plot of the time series, the first area that draws attention is the change from an upward trend to a downward trend during the years 2008 - 2010. This may reflect the stock market crisis and recession that hit the US and the reduced demand. Another period that sticks out to me is around 2015/2016. The seasonal pattern appears to have abnormal shape compared to other time periods in the series.