Do exercises 3.1, 3.2, 3.3, 3.4, 3.5, 3.7, 3.8 and 3.9 from the online Hyndman book. Please include your Rpubs link along with.pdf file of your run code
Consider the GDP information in global_economy.
Plot the GDP per capita for each country over time. Which country
has the highest GDP per capita?
How has this changed over time?.
Highest GDP seems to be Luxemborg for 2017.
Per Wiki and investopedia,we can see some changes on the plot reflect real events.
In 1973, oil production and revenues increased and Qatar had high per capita income. Qatar’s economy was in a downturn from 1982 to 1989. OPEC (Organization of Petroleum Exporting Countries) quotas on crude oil production. We can see that on the plot.
From 2010 to 2015, its China GDP growth rate slowed to 7%. Chinese manufacturing declined to its lowest level in three years, with the nation’s purchasing manager’s index falling to 49.7 in August 2015.
library(tidyverse)
library(dplyr)
# install.packages("ggplot2")
# install.packages("ggfortify")
library(ggplot2)
library(ggfortify)
library(tsibble)
library(tsibbledata)
#install.packages("feasts")
library(feasts)
library ("fpp3")
global_economy %>%
autoplot(GDP / Population, show.legend = FALSE) +
labs(title= "GDP per capita", y = "GDP per capita")
#?global_economy
# str(global_economy)
unique(global_economy$Year)
## [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974
## [16] 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
## [31] 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004
## [46] 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
gp2011 <- global_economy %>%
mutate(GDPpercp = GDP/Population) %>% group_by(Country) %>% arrange(desc(GDPpercp)) %>% filter(Year >= 1990, Code %in% c("MAC","USA", "LUX","SGP","QAT","DEU"))
gp2017 <- global_economy %>%
mutate(GDPpercp = GDP/Population) %>% group_by(Country) %>% arrange(desc(GDPpercp)) %>% filter(Year == 2017)
gp1960 <- global_economy %>%
mutate(GDPpercp = GDP/Population) %>% group_by(Country) %>% arrange(desc(GDPpercp)) %>% filter(between(Year,1960, 1990), Code %in% c("MAC","USA", "LUX","SGP","QAT","DEU"))
head(gp2011,25)
## # A tsibble: 25 x 10 [1Y]
## # Key: Country [3]
## # Groups: Country [3]
## Country Code Year GDP Growth CPI Imports Exports Population GDPpercp
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Luxembo… LUX 2014 6.63e10 5.77 109. 174. 208. 556319 119225.
## 2 Luxembo… LUX 2011 6.00e10 2.54 103. 145. 178. 518347 115762.
## 3 Luxembo… LUX 2008 5.58e10 -1.28 97.4 156. 187. 488650 114294.
## 4 Luxembo… LUX 2013 6.17e10 3.65 108. 159. 191. 543360 113625.
## 5 Luxembo… LUX 2012 5.67e10 -0.353 106. 155. 186. 530946 106749.
## 6 Luxembo… LUX 2007 5.09e10 8.35 94.2 150. 183. 479993 106018.
## 7 Luxembo… LUX 2010 5.32e10 4.86 100 142. 175. 506953 104965.
## 8 Luxembo… LUX 2017 6.24e10 2.30 111. 194. 230. 599449 104103.
## 9 Luxembo… LUX 2009 5.14e10 -4.36 97.8 132. 164. 497783 103199.
## 10 Luxembo… LUX 2015 5.78e10 2.86 109. 187. 223. 569604 101447.
## # ℹ 15 more rows
head(gp2017,25)
## # A tsibble: 25 x 10 [1Y]
## # Key: Country [25]
## # Groups: Country [25]
## Country Code Year GDP Growth CPI Imports Exports Population GDPpercp
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Luxembo… LUX 2017 6.24e10 2.30 111. 194. 230. 599449 104103.
## 2 Macao S… MAC 2017 5.04e10 9.10 136. 32.0 79.4 622567 80893.
## 3 Switzer… CHE 2017 6.79e11 1.09 98.3 53.9 65.0 8466017 80190.
## 4 Norway NOR 2017 3.99e11 1.92 115. 33.1 35.5 5282223 75505.
## 5 Iceland ISL 2017 2.39e10 3.64 122. 42.8 47.0 341284 70057.
## 6 Ireland IRL 2017 3.34e11 7.80 105. 87.9 120. 4813608 69331.
## 7 Qatar QAT 2017 1.67e11 1.58 116. 37.3 51.0 2639211 63249.
## 8 United … USA 2017 1.94e13 2.27 112. NA NA 325719178 59532.
## 9 North A… NAC 2017 2.10e13 2.35 NA NA NA 362492702 58070.
## 10 Singapo… SGP 2017 3.24e11 3.62 113. 149. 173. 5612253 57714.
## # ℹ 15 more rows
gp2011 |>
autoplot(GDPpercp)
gp1960 |>
autoplot(GDPpercp)
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United States GDP from global_economy.
The best transformation would be to calculate GDP per capita for the US,
therefore we have shown the plot for GDP per capita as a population
adjustment.
Slaughter of Victorian “Bulls, bullocks and steers” in
aus_livestock.
A good value of lamda is one which makes the size of the seasonal variation about the same across the whole series, as that makes the forecasting model simpler.
Victorian Electricity Demand from vic_elec.
For Electricity demand we can convert the tsibble to look at demand daily and monthly
Gas production from aus_production.
For the Gas dataset, the -1 lambda has shown some variation across the trend on the plot.
# install.packages("MASS")
# install.packages("Guerrero")
usgdp <- global_economy %>%
mutate(GDPpercp = GDP/Population) %>% group_by(Country) %>% arrange(desc(GDPpercp)) %>% filter(Code %in% c("USA"))
usgdp |>
autoplot(GDPpercp)
usgdp1 <- global_economy %>% filter(Code %in% c("USA"))
usgdp1 |>
autoplot(GDP)
library("MASS")
library("feasts")
library(dplyr)
aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot() + ggtitle("Australian livestock timeseries")
lambda <- aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
features(Count,features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] -0.04461887
aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot() + ggtitle("Australian livestock timeseries")
aus_livestock %>% filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>% autoplot(box_cox(Count,lambda))+
labs(y = "",
title = "Transformed livestock dataset",
round(lambda,2))
v1 <- vic_elec %>%
group_by(Date) %>%
mutate(Demand = sum(Demand)) %>%
distinct(Date, Demand)
v1 %>%
as_tsibble(index = Date) %>%
autoplot(Demand) +
labs(title= "Daily Electricity Demand", y = "MWH ")
v2 <- v1 %>%
mutate(Date = yearmonth(Date)) %>%
group_by(Date) %>%
summarise(Demand = sum(Demand)) %>%
as_tsibble(index = Date) %>%
autoplot(Demand) +
labs(title= "Monthly Electricity Demand", y = "MWH")
head(v1)
## # A tibble: 6 × 2
## # Groups: Date [6]
## Date Demand
## <date> <dbl>
## 1 2012-01-01 222438.
## 2 2012-01-02 257965.
## 3 2012-01-03 267099.
## 4 2012-01-04 222742.
## 5 2012-01-05 210585.
## 6 2012-01-06 210247.
head(v2)
## $data
## # A tsibble: 36 x 2 [1M]
## Date Demand
## <mth> <dbl>
## 1 2012 Jan 7241049.
## 2 2012 Feb 6874543.
## 3 2012 Mar 6746560.
## 4 2012 Apr 6401078.
## 5 2012 May 7375177.
## 6 2012 Jun 7388456.
## 7 2012 Jul 7568114.
## 8 2012 Aug 7492261.
## 9 2012 Sep 6567196.
## 10 2012 Oct 6680705.
## # ℹ 26 more rows
##
## $layers
## $layers[[1]]
## geom_line: na.rm = FALSE, orientation = NA
## stat_identity: na.rm = FALSE
## position_identity
##
##
## $scales
## <ggproto object: Class ScalesList, gg>
## add: function
## add_defaults: function
## add_missing: function
## backtransform_df: function
## clone: function
## find: function
## get_scales: function
## has_scale: function
## input: function
## map_df: function
## n: function
## non_position_scales: function
## scales: list
## train_df: function
## transform_df: function
## super: <ggproto object: Class ScalesList, gg>
##
## $guides
## <Guides[0] ggproto object>
##
## <empty>
##
## $mapping
## Aesthetic mapping:
## * `x` -> `Date`
## * `y` -> `Demand`
##
## $theme
## list()
aus_production %>%
autoplot(Gas) +
labs(title = "Gas Production")
lambda1 <- aus_production %>%
features(Gas, features = guerrero) %>%
pull(lambda_guerrero)
lambda1
## [1] 0.1095171
aus_production %>%
autoplot(box_cox(Gas, lambda1)) +
labs(y = "", title = "Transformed Gas Production ",
round(lambda1,2))
Why is a Box-Cox transformation unhelpful for the canadian_gas
data?
Since lambda is very close to 0, there is not much change on the plot.
All series have been adjusted for inflation
library(fpp3)
canadian_gas
## # A tsibble: 542 x 2 [1M]
## Month Volume
## <mth> <dbl>
## 1 1960 Jan 1.43
## 2 1960 Feb 1.31
## 3 1960 Mar 1.40
## 4 1960 Apr 1.17
## 5 1960 May 1.12
## 6 1960 Jun 1.01
## 7 1960 Jul 0.966
## 8 1960 Aug 0.977
## 9 1960 Sep 1.03
## 10 1960 Oct 1.25
## # ℹ 532 more rows
lambdaa <- canadian_gas %>%
features(Volume,features = guerrero) %>%
pull(lambda_guerrero)
lambdaa
## [1] 0.5767648
canadian_gas %>%
autoplot() + labs(title = "Monthly Canadian gas production")
canadian_gas %>%
autoplot(box_cox(Volume,lambdaa))
What Box-Cox transformation would you select for your retail data
(from Exercise 7 in Section 2.10)?
We have used box-cox and guerrero method, again lamda is closer to 0, therefore the plot is relatively same.
set.seed(1234)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
autoplot(myseries, Turnover)+
labs(title = "Retail Turnover", y = "$AUD (in millions)")
lambda3a <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
myseries %>%
autoplot(box_cox(Turnover, lambda3a)) +
labs(y = "", title = "Transformed Retail Turnover ",
round(lambda3a,2))
For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.
Tobacco from aus_production,
Economy class passengers between Melbourne and Sydney from ansett,
and
Pedestrian counts at Southern Cross Station from pedestrian.
When Lambda is close to 1, the shape of the data doesn’t vary by much in the case of the Tobacco dataset.This is series is not steadily increasing with level therefore, box cox value is not going to change the shape by much.
autoplot(aus_production, Tobacco)+
labs(title = "Tobacco and Cigarette Production in Tonnes")
lambda <- aus_production %>%
features(Tobacco, features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.9264636
lambda1<--1
aus_production %>%
autoplot(box_cox(Tobacco, lambda)) +
labs(y = "", title = "Transformed Tobacco Production ",
round(lambda,2))
aus_production %>%
autoplot(box_cox(Tobacco, lambda1)) +
labs(y = "", title = "Transformed Tobacco Production ",
round(lambda1,2))
lambda3 <- ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
features(Passengers,features = guerrero) %>%
pull(lambda_guerrero)
lambda3
## [1] 1.999927
ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
autoplot(Passengers) + labs(title = "Passengers on Ansett Flights")
ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
autoplot(box_cox(Passengers,lambda3)) + labs(title = "Transformed data")
head(ansett)
## # A tsibble: 6 x 4 [1W]
## # Key: Airports, Class [1]
## Week Airports Class Passengers
## <week> <chr> <chr> <dbl>
## 1 1989 W28 ADL-PER Business 193
## 2 1989 W29 ADL-PER Business 254
## 3 1989 W30 ADL-PER Business 185
## 4 1989 W31 ADL-PER Business 254
## 5 1989 W32 ADL-PER Business 191
## 6 1989 W33 ADL-PER Business 136
#hourly pedastrian counts
southern_cross <- pedestrian %>%
filter(Sensor == "Southern Cross Station")
autoplot(southern_cross, Count)+
labs(title = "Hourly Pedestrian Counts ")
head(southern_cross)
## # A tsibble: 6 x 5 [1h] <Australia/Melbourne>
## # Key: Sensor [1]
## Sensor Date_Time Date Time Count
## <chr> <dttm> <date> <int> <int>
## 1 Southern Cross Station 2015-01-01 00:00:00 2015-01-01 0 746
## 2 Southern Cross Station 2015-01-01 01:00:00 2015-01-01 1 312
## 3 Southern Cross Station 2015-01-01 02:00:00 2015-01-01 2 180
## 4 Southern Cross Station 2015-01-01 03:00:00 2015-01-01 3 133
## 5 Southern Cross Station 2015-01-01 04:00:00 2015-01-01 4 44
## 6 Southern Cross Station 2015-01-01 05:00:00 2015-01-01 5 16
lambda7 <- southern_cross %>%
features(Count, features = guerrero) %>%
pull(lambda_guerrero)
lambda7
## [1] -0.2501616
southern_cross %>%
autoplot(box_cox(Count, lambda7)) +
labs(y = "", title = "Transformed Daily Pedestrian Counts with ",
round(lambda7,2))
# daily series
southern_cross1 <- southern_cross %>%
index_by(Date) %>%
summarise(Count = sum(Count))
head(southern_cross1)
## # A tsibble: 6 x 2 [1D]
## Date Count
## <date> <int>
## 1 2015-01-01 2813
## 2 2015-01-02 4648
## 3 2015-01-03 1428
## 4 2015-01-04 1347
## 5 2015-01-05 11483
## 6 2015-01-06 11513
autoplot(southern_cross1, Count)+
labs(title = "Daily Pedestrian data")
lambda8 <- southern_cross1 %>%
features(Count, features = guerrero) %>%
pull(lambda_guerrero)
lambda8
## [1] 0.2726316
southern_cross1 %>%
autoplot(box_cox(Count, lambda8)) +
labs(y = "", title = "Transformed Daily Pedestrian data",
round(lambda8,2))
#weekly
southern_cross2 <- southern_cross %>%
mutate(Week = yearweek(Date)) %>%
index_by(Week) %>%
summarise(Count = sum(Count))
head(southern_cross2)
## # A tsibble: 6 x 2 [1W]
## Week Count
## <week> <int>
## 1 2015 W01 10236
## 2 2015 W02 60134
## 3 2015 W03 71440
## 4 2015 W04 73393
## 5 2015 W05 62535
## 6 2015 W06 79154
autoplot(southern_cross2, Count)+
labs(title = "Weekly Pedestrian ")
lambda9 <- southern_cross2 %>%
features(Count, features = guerrero) %>%
pull(lambda_guerrero)
lambda9
## [1] -0.1108714
southern_cross2 %>%
autoplot(box_cox(Count, lambda9)) +
labs(y = "", title = "Transformed Weekly Counts ",
round(lambda9,2))
Show that a 3×5 MA is equivalent to a 7-term weighted moving average
with weights of 0.067, 0.133, 0.200, 0.200, 0.200, 0.133, and
0.067.
I tried to understand how to show the calculation , however, unsure if I got the calculation right below.
library ("fpp3")
# 7 term weights - Explanation
#
# k=7 periods
# if k is 7 periods then m = (7*2)+ 1 = 15 (ie 15 MA)
# if m is 15, then series becomes 1/15{k periods before + yt + k periods after}
# since k = 7, we have to use the average of 7 periods before yt therefore we have
#
# 3 * 5-MA
# 5 MA means , m = 5, therefore k = m-1/2 = 4/2 so therefore we have 2 before and 2 after the y period. No we have another 3
# which means, m= 3, therefore k = 3-1/2 = 1 so therefore 1 before and 1 after the 5-MA
beer <- aus_production |>
filter(year(Quarter) >= 1992)
beer_ma1 <- beer |>
mutate(
`5-MA` = slider::slide_dbl(Beer, mean,
.before = 2, .after = 2, .complete = TRUE),
`3x5-MA` = slider::slide_dbl(`5-MA`, mean,
.before = 1, .after = 1, .complete = TRUE))
head(beer_ma1, n=20)
## # A tsibble: 20 x 9 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas `5-MA` `3x5-MA`
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1992 Q1 443 5777 383 1289 38332 117 NA NA
## 2 1992 Q2 410 5853 404 1501 39774 151 NA NA
## 3 1992 Q3 420 6416 446 1539 42246 175 448. NA
## 4 1992 Q4 532 5825 420 1568 38498 129 443. 445.
## 5 1993 Q1 433 5724 394 1450 39460 116 443. 449.
## 6 1993 Q2 421 6036 462 1668 41356 149 462. 450.
## 7 1993 Q3 410 6570 475 1648 42949 163 445 447.
## 8 1993 Q4 512 5675 443 1863 40974 138 435. 438.
## 9 1994 Q1 449 5311 421 1468 40162 127 435 443.
## 10 1994 Q2 381 5717 475 1755 41199 159 459. 445.
## 11 1994 Q3 423 7000 497 1962 44095 184 442 445
## 12 1994 Q4 531 6085 476 1833 41745 147 434. 439.
## 13 1995 Q1 426 4714 430 1626 41768 131 441. 445.
## 14 1995 Q2 408 3939 457 1703 43686 167 460. 446.
## 15 1995 Q3 416 6137 417 1733 46022 181 436. 442.
## 16 1995 Q4 520 4739 370 1545 42800 145 430. 431.
## 17 1996 Q1 409 4275 310 1526 43661 133 428. 435.
## 18 1996 Q2 398 5239 358 1593 44707 162 446. 434.
## 19 1996 Q3 398 6293 379 1706 46326 184 429. 434.
## 20 1996 Q4 507 5575 369 1699 43346 146 427. 428.
beer <- aus_production |>
filter(year(Quarter) >= 1992)
beer_ma2 <- beer |>
mutate(
`15-MA` = slider::slide_dbl(Beer, mean,
.before = 7, .after = 7, .complete = TRUE))
head(beer_ma1, n=20)
## # A tsibble: 20 x 9 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas `5-MA` `3x5-MA`
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1992 Q1 443 5777 383 1289 38332 117 NA NA
## 2 1992 Q2 410 5853 404 1501 39774 151 NA NA
## 3 1992 Q3 420 6416 446 1539 42246 175 448. NA
## 4 1992 Q4 532 5825 420 1568 38498 129 443. 445.
## 5 1993 Q1 433 5724 394 1450 39460 116 443. 449.
## 6 1993 Q2 421 6036 462 1668 41356 149 462. 450.
## 7 1993 Q3 410 6570 475 1648 42949 163 445 447.
## 8 1993 Q4 512 5675 443 1863 40974 138 435. 438.
## 9 1994 Q1 449 5311 421 1468 40162 127 435 443.
## 10 1994 Q2 381 5717 475 1755 41199 159 459. 445.
## 11 1994 Q3 423 7000 497 1962 44095 184 442 445
## 12 1994 Q4 531 6085 476 1833 41745 147 434. 439.
## 13 1995 Q1 426 4714 430 1626 41768 131 441. 445.
## 14 1995 Q2 408 3939 457 1703 43686 167 460. 446.
## 15 1995 Q3 416 6137 417 1733 46022 181 436. 442.
## 16 1995 Q4 520 4739 370 1545 42800 145 430. 431.
## 17 1996 Q1 409 4275 310 1526 43661 133 428. 435.
## 18 1996 Q2 398 5239 358 1593 44707 162 446. 434.
## 19 1996 Q3 398 6293 379 1706 46326 184 429. 434.
## 20 1996 Q4 507 5575 369 1699 43346 146 427. 428.
head(beer_ma2, n=20)
## # A tsibble: 20 x 8 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas `15-MA`
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1992 Q1 443 5777 383 1289 38332 117 NA
## 2 1992 Q2 410 5853 404 1501 39774 151 NA
## 3 1992 Q3 420 6416 446 1539 42246 175 NA
## 4 1992 Q4 532 5825 420 1568 38498 129 NA
## 5 1993 Q1 433 5724 394 1450 39460 116 NA
## 6 1993 Q2 421 6036 462 1668 41356 149 NA
## 7 1993 Q3 410 6570 475 1648 42949 163 NA
## 8 1993 Q4 512 5675 443 1863 40974 138 441
## 9 1994 Q1 449 5311 421 1468 40162 127 446.
## 10 1994 Q2 381 5717 475 1755 41199 159 446.
## 11 1994 Q3 423 7000 497 1962 44095 184 445.
## 12 1994 Q4 531 6085 476 1833 41745 147 436.
## 13 1995 Q1 426 4714 430 1626 41768 131 441.
## 14 1995 Q2 408 3939 457 1703 43686 167 441.
## 15 1995 Q3 416 6137 417 1733 46022 181 441.
## 16 1995 Q4 520 4739 370 1545 42800 145 433.
## 17 1996 Q1 409 4275 310 1526 43661 133 439.
## 18 1996 Q2 398 5239 358 1593 44707 162 442.
## 19 1996 Q3 398 6293 379 1706 46326 184 440
## 20 1996 Q4 507 5575 369 1699 43346 146 431.
Consider the last five years of the Gas data from
aus_production
Plot the time series. Can you identify seasonal fluctuations and/or a
trend-cycle?
From the textbook “we can think of a time series as comprising three
components: a trend-cycle component, a seasonal component, and
a remainder component (containing anything else in the time series).”.
On the gg_season below, we can see patterns exist between the quarters
therefore there is a seasonal fluctuations between quarters within the
years. On trend-cycle, there is a mild upwards trend a slight cycle but
not very obvious on the plot.
Use classical_decomposition with type=multiplicative to calculate the
trend-cycle and seasonal indices.
Do the results support the graphical interpretation from part a?
Yes, here we are able to show the seasonal indices below and there is
seasonal pattern aligning with the plot above.
Compute and plot the seasonally adjusted data.
Change one observation to be an outlier (e.g., add 300 to one
observation), and recompute the seasonally adjusted data. What is
the effect of the outlier? Does it make any difference if the outlier is
near the end rather than in the middle of the time series?
The outlier when added to an observation in the middle, does seem to show change in the plot. If its prior to the start of the series, the graph shape doesn’t change as much.
#install.packages("seasonal")
library(seasonal)
gas <- tail(aus_production, 5*4) #|> select(Gas)
#Weekly data beginning Week 6, 1991, and ending Week 3, 2017. Units are million barrels per day.
gas %>% autoplot() + geom_smooth()
gg_season(gas, period = "year", labels="both")
#Classical
gasd <- gas %>%
model(classical_decomposition(Gas, type = "multiplicative"))
components(gasd) %>%
autoplot()
labs(title = "Classical multiplicative Decomposition of Gas dataset")
## $title
## [1] "Classical multiplicative Decomposition of Gas dataset"
##
## attr(,"class")
## [1] "labels"
components(gasd) %>%
head()
## # A dable: 6 x 7 [1Q]
## # Key: .model [1]
## # : Gas = trend * seasonal * random
## .model Quarter Gas trend seasonal random season_adjust
## <chr> <qtr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "classical_decomposition(Ga… 2005 Q3 221 NA 1.13 NA 196.
## 2 "classical_decomposition(Ga… 2005 Q4 180 NA 0.925 NA 195.
## 3 "classical_decomposition(Ga… 2006 Q1 171 200. 0.875 0.974 195.
## 4 "classical_decomposition(Ga… 2006 Q2 224 204. 1.07 1.02 209.
## 5 "classical_decomposition(Ga… 2006 Q3 233 207 1.13 1.00 207.
## 6 "classical_decomposition(Ga… 2006 Q4 192 210. 0.925 0.987 208.
#x11 method to display the seasonal adjusted data
x11_dcmp <- gas %>%
model(x11 = X_13ARIMA_SEATS(Gas ~ x11())) |>
components()
autoplot(x11_dcmp) +
labs(title =
"Decomposition of total Gas using X-11.")
x11_dcmp |>
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Gas",
title = "Gas Production") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
x11_dcmp |>
gg_subseries(seasonal)
#Seats decomposition
gasg <- gas |>
model(seats = X_13ARIMA_SEATS(Gas ~ seats())) |>
components()
autoplot(gasg) +
labs(title =
"Decomposition of total gas using SEATS")
#STL decomposition
gas |>
model(
STL(Gas ~ trend(window = 15) +
season(window = "periodic"),
robust = TRUE)) |>
components() |>
autoplot()
#Outlier
gas0<- gas %>%
mutate(Gas = ifelse(Gas == 192, Gas + 300, Gas))
gas %>%
mutate(Gas = ifelse(Gas == 192, Gas + 300, Gas)) %>% model(classical_decomposition(Gas, type = "multiplicative")) %>%
components() %>%
as_tsibble() %>%
autoplot(Gas, colour = "gray") +
geom_line(aes(y=season_adjust), colour = "#0072B2") + labs(title = "Seasonally Adjusted Gas Production with an Outlier")
gas %>%
mutate(Gas = ifelse(Gas == 180, Gas + 300, Gas)) %>% model(classical_decomposition(Gas, type = "multiplicative")) %>%
components() %>%
as_tsibble() %>%
autoplot(Gas, colour = "gray") +
geom_line(aes(y=season_adjust), colour = "#0072B2") + labs(title = "Seasonally Adjusted Gas Production with an Outlier")
# library(dplyr)
# # Identify rows with the value
# row_indices <- apply(gas0, 1, function(row) any(row == "492"))
# # Subset the data
# filtered_data <- gas0[row_indices, ]
#
# filtered_data
Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11.
Does it reveal any outliers, or unusual features that you had not
noticed previously?
There does seem to be outlier with 1990 to 2000.
library(seasonal)
set.seed(12345678)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
head(myseries)
## # A tsibble: 6 x 5 [1M]
## # Key: State, Industry [1]
## State Industry `Series ID` Month Turnover
## <chr> <chr> <chr> <mth> <dbl>
## 1 Northern Territory Clothing, footwear and perso… A3349767W 1988 Apr 2.3
## 2 Northern Territory Clothing, footwear and perso… A3349767W 1988 May 2.9
## 3 Northern Territory Clothing, footwear and perso… A3349767W 1988 Jun 2.6
## 4 Northern Territory Clothing, footwear and perso… A3349767W 1988 Jul 2.8
## 5 Northern Territory Clothing, footwear and perso… A3349767W 1988 Aug 2.9
## 6 Northern Territory Clothing, footwear and perso… A3349767W 1988 Sep 3
#x11 method to display the seasonal adjusted data
x11_dcmp1 <- myseries %>%
model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
components()
autoplot(x11_dcmp1) +
labs(title =
"Decomposition of total Turnover using X-11.")
x11_dcmp1 |>
ggplot(aes(x = Month)) +
geom_line(aes(y = Turnover, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Turnover",
title = "Turnover") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
x11_dcmp1 |>
gg_subseries(seasonal)
Figures 3.19 and 3.20 show the result of decomposing the number of
persons in the civilian labour force in Australia each month from
February 1978 to August 1995.
Write about 3–5 sentences describing the results of the
decomposition. Pay particular attention to the scales of the graphs in
making your interpretation.
Is the recession of 1991/1992 visible in the estimated components?
The irregular graph shows a big downward spike in the data after 1990 January. Prior to that it is trending upward and in line with non-recession. Yes it is visible on the 2nd image in Jan/Feb subseries.
knitr::include_graphics("labour-1.png")
knitr::include_graphics("labour2-1.png")