Homework 2

Homework # 2 (624)

3.1 Consider the GDP Info in global_economy Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

library(fpp3)

## Warning: package 'fpp3' was built under R version 4.4.2

## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr

## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──

## ✔ tibble      3.2.1     ✔ tsibble     1.1.5
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.1
## ✔ lubridate   1.9.3     ✔ fable       0.4.1
## ✔ ggplot2     3.5.1

## Warning: package 'tsibble' was built under R version 4.4.2

## Warning: package 'tsibbledata' was built under R version 4.4.2

## Warning: package 'feasts' was built under R version 4.4.2

## Warning: package 'fabletools' was built under R version 4.4.2

## Warning: package 'fable' was built under R version 4.4.2

## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()

head(global_economy)

## # A tsibble: 6 x 9 [1Y]
## # Key:       Country [1]
##   Country     Code   Year         GDP Growth   CPI Imports Exports Population
##   <fct>       <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>
## 1 Afghanistan AFG    1960  537777811.     NA    NA    7.02    4.13    8996351
## 2 Afghanistan AFG    1961  548888896.     NA    NA    8.10    4.45    9166764
## 3 Afghanistan AFG    1962  546666678.     NA    NA    9.35    4.88    9345868
## 4 Afghanistan AFG    1963  751111191.     NA    NA   16.9     9.17    9533954
## 5 Afghanistan AFG    1964  800000044.     NA    NA   18.1     8.89    9731361
## 6 Afghanistan AFG    1965 1006666638.     NA    NA   21.4    11.3     9938414

library(seasonal)

## Warning: package 'seasonal' was built under R version 4.4.3

## 
## Attaching package: 'seasonal'

## The following object is masked from 'package:tibble':
## 
##     view

Using autoplot to display the GDP per capita for each country, we make the show.legend function to show false output for each country..

global_economy %>%
  autoplot(GDP/Population,show.legend = FALSE) +
  labs(x = "Year", y = "GDP Per Capita")

## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).

## Use the slice_max function to return the top 5 GDP per capita
global_economy %>%
  mutate(percap = GDP/Population) %>%
  slice_max(percap, n = 25) %>%
  distinct()

## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Country`, `Year` first.

## # A tibble: 25 × 10
##    Country     Code   Year    GDP Growth   CPI Imports Exports Population percap
##    <fct>       <fct> <dbl>  <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>  <dbl>
##  1 Monaco      MCO    2014 7.06e9  7.18     NA      NA      NA      38132 1.85e5
##  2 Monaco      MCO    2008 6.48e9  0.732    NA      NA      NA      35853 1.81e5
##  3 Liechtenst… LIE    2014 6.66e9 NA        NA      NA      NA      37127 1.79e5
##  4 Liechtenst… LIE    2013 6.39e9 NA        NA      NA      NA      36834 1.74e5
##  5 Monaco      MCO    2013 6.55e9  9.57     NA      NA      NA      37971 1.73e5
##  6 Monaco      MCO    2016 6.47e9  3.21     NA      NA      NA      38499 1.68e5
##  7 Liechtenst… LIE    2015 6.27e9 NA        NA      NA      NA      37403 1.68e5
##  8 Monaco      MCO    2007 5.87e9 14.4      NA      NA      NA      35111 1.67e5
##  9 Liechtenst… LIE    2016 6.21e9 NA        NA      NA      NA      37666 1.65e5
## 10 Monaco      MCO    2015 6.26e9  4.94     NA      NA      NA      38307 1.63e5
## # ℹ 15 more rows

It seems like in certain time-frames Monaco had the highest GDP per capita throughout the year,followed by Liechtenstein and then Luxoemburg. But what is the most recent year in the chart.

## Check the recent year.
unique(global_economy$Year)

##  [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974
## [16] 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
## [31] 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004
## [46] 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017

The data goes up to 2017

global_economy %>%
  filter(Year == 2017) %>%
  mutate(percap = GDP/Population) %>%
  group_by(Country) %>%
  arrange(desc(percap)) %>%
  select(Country,percap)

## # A tsibble: 262 x 3 [1Y]
## # Key:       Country [262]
## # Groups:    Country [262]
##    Country           percap  Year
##    <fct>              <dbl> <dbl>
##  1 Luxembourg       104103.  2017
##  2 Macao SAR, China  80893.  2017
##  3 Switzerland       80190.  2017
##  4 Norway            75505.  2017
##  5 Iceland           70057.  2017
##  6 Ireland           69331.  2017
##  7 Qatar             63249.  2017
##  8 United States     59532.  2017
##  9 North America     58070.  2017
## 10 Singapore         57714.  2017
## # ℹ 252 more rows

It seems like in 2017 Luxembourg,China and Switzerland had the highest GDP per capita, I will check the previous few years to see what happened.

## 2016
global_economy %>%
  filter(Year == 2016) %>%
  mutate(percap = GDP/Population) %>%
  group_by(Country) %>%
  arrange(desc(percap)) %>%
  select(Country,percap)

## # A tsibble: 262 x 3 [1Y]
## # Key:       Country [262]
## # Groups:    Country [262]
##    Country           percap  Year
##    <fct>              <dbl> <dbl>
##  1 Monaco           168011.  2016
##  2 Liechtenstein    164993.  2016
##  3 Luxembourg       100739.  2016
##  4 Switzerland       79866.  2016
##  5 Isle of Man       78730.  2016
##  6 Macao SAR, China  74017.  2016
##  7 Norway            70890.  2016
##  8 Ireland           64100.  2016
##  9 Iceland           60530.  2016
## 10 Qatar             59044.  2016
## # ℹ 252 more rows

In 2016 Monaco,Liechtenstein and Luxembourg had the highest GDP,it appears there is no GDP data for Monaco in 2017, check 2015 just in case.

# 2015
global_economy %>%
  filter(Year == 2015) %>%
  mutate(percap = GDP/Population) %>%
  group_by(Country) %>%
  arrange(desc(percap)) %>%
  select(Country,percap)

## # A tsibble: 262 x 3 [1Y]
## # Key:       Country [262]
## # Groups:    Country [262]
##    Country           percap  Year
##    <fct>              <dbl> <dbl>
##  1 Liechtenstein    167591.  2015
##  2 Monaco           163369.  2015
##  3 Luxembourg       101447.  2015
##  4 Switzerland       82016.  2015
##  5 Isle of Man       81672.  2015
##  6 Macao SAR, China  75484.  2015
##  7 Norway            74498.  2015
##  8 Qatar             65177.  2015
##  9 Ireland           61808.  2015
## 10 Australia         56561.  2015
## # ℹ 252 more rows

Liechtenstein was ahead this year, and finally check 2014..

# 2014
global_economy %>%
  filter(Year == 2014) %>%
  mutate(percap = GDP/Population) %>%
  group_by(Country) %>%
  arrange(desc(percap)) %>%
  select(Country,percap)

## # A tsibble: 262 x 3 [1Y]
## # Key:       Country [262]
## # Groups:    Country [262]
##    Country           percap  Year
##    <fct>              <dbl> <dbl>
##  1 Monaco           185153.  2014
##  2 Liechtenstein    179308.  2014
##  3 Luxembourg       119225.  2014
##  4 Norway            97200.  2014
##  5 Macao SAR, China  94004.  2014
##  6 Isle of Man       89942.  2014
##  7 Qatar             86853.  2014
##  8 Switzerland       86606.  2014
##  9 Denmark           62549.  2014
## 10 Australia         62328.  2014
## # ℹ 252 more rows

After plotting and verifying the data for the few years, Monaco in 2014 had the highest ever GDP per capita of all time(not accounting for Inflation..), and in recent years Monaco,Liechtenstein and Luxembourg had the highest GDP per capita but there is missing information in 2017 for Monaco and Liechtenstein but I assume they would rank highly in 2017 as well.

3.2 For each of the following series, make a graph of the data For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP

global_economy %>%
  filter(Country == "United States") %>%
  ggplot(aes(x = Year,y = GDP)) +
  geom_line() + labs(title = "U.S GDP")

It appears that this time series has a quadratic growth trend.

global_economy %>%
  filter(Country == "United States") %>%
  autoplot(sqrt(GDP)) + ggtitle("Square-Root Transformation on GDP")

Applying The square-root transformation, it seems like nothing has changed..

global_economy %>%
  filter(Country == "United States") %>%
  autoplot(log(GDP)) + ggtitle("Log Transformation on GDP")

The log transformation seems to have make the time-series data appear more linear

Australian Livestock

aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
  autoplot() + ggtitle("Australian Livestock slaughter")

## Plot variable not specified, automatically selected `.vars = Count`

## Plot variable not specified, automatically selected `.vars = Count`

## This is the meat production in Australia for human consumption, 
aus_live1 <- aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
  autoplot(log(Count)) + ggtitle("Log Transformation of Livestock Slaughter")

aus_live1

Log Transforming the count did nothing, can try box-cox?

## Try a box-cox transformation using the features function 

## Okay it prints out a lambda value
lambda <- aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
  features(Count,features = guerrero) %>%
  pull(lambda_guerrero)

lambda

## [1] -0.04461887

THe lambda value is really small and it is close to zero. From the article on Box-Cox a lambda equals 0 means it performed a log transformation.

aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
  autoplot(box_cox(Count,lambda)) + ggtitle("Box-Cox Transformation")

The article mentioned of simple calender effect by removing the variation by computing average sales per trading day in each month. Since this is a monthly count of slaughters we have to convert it to average sales per trading day in each month,

## Convert this data into average count of slaughters. 
aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
## calculate the average by dividing count by # of days..  using days_in_month to count # of day
  mutate(AverageKill = Count/days_in_month(Month)) %>%
  autoplot(AverageKill) + ggtitle("Average Slaughter per month")

Victoria electricity

vic_elec %>%
  autoplot() + ggtitle("Half-Hour Electricity Demand for Victoria,Australia")

## Plot variable not specified, automatically selected `.vars = Demand`

## Plot variable not specified, automatically selected `.vars = Demand`

It seems this time-series wouldn’t benefit from any transformations since there is no trend,and no cylic behavior but a seasonal pattern.

vic_elec %>%
  autoplot(log(Demand)) + ggtitle("Log Transformation")

Applied a log-transformation and not much has changed. What if we convert the half-hour electricity demand to a daily average electricity demand.

## Reconvert tstibble to tibble..
Vic_elec <- vic_elec %>% as_tibble()

## Get the average electricity demand for each day.
Vic_elec %>%
  group_by(Date) %>%
  summarise(Total = sum(Demand)) %>%
  mutate(Date = as.Date(Date)) %>%
  as_tsibble(index = Date,key = Total) %>%
  ggplot(aes(x = Date, y = Total)) +
  geom_line() + ggtitle("Daily Electricity Demand Per Day")

Converting the half-hour electricity demand to a daily demand makes the chart easier to read.. Gas_Production

aus_production %>%
  select(Quarter,Electricity) %>%
  autoplot()

## Plot variable not specified, automatically selected `.vars = Electricity`

## Plot variable not specified, automatically selected `.vars = Electricity`

This time-series appears quadratic, so perhaps a square root transformation can works

aus_production %>%
  autoplot(sqrt(Electricity))

The time-series seems better when it is transformed as a square-root transformation

3.3 Why is a Box-Cox transformation unhelpful The canadian_gas data is monthly canadian gas production from Jan 1960 to Feb 2005 it is a monthly time-series

canadian_gas

## # A tsibble: 542 x 2 [1M]
##       Month Volume
##       <mth>  <dbl>
##  1 1960 Jan  1.43 
##  2 1960 Feb  1.31 
##  3 1960 Mar  1.40 
##  4 1960 Apr  1.17 
##  5 1960 May  1.12 
##  6 1960 Jun  1.01 
##  7 1960 Jul  0.966
##  8 1960 Aug  0.977
##  9 1960 Sep  1.03 
## 10 1960 Oct  1.25 
## # ℹ 532 more rows

canadian_gas %>%
  autoplot() + labs(title = "Monthly Canadian gas production", x = "Months", y = "volume")

## Plot variable not specified, automatically selected `.vars = Volume`

## Plot variable not specified, automatically selected `.vars = Volume`

lambda1 <- canadian_gas %>%
  features(Volume,features = guerrero) %>%
  pull(lambda_guerrero)
  
lambda1

## [1] 0.5767648

canadian_gas %>%
  autoplot(box_cox(Volume,lambda1))

The textbook explains that we perform transformations on the data to stablize the variance.The box-cox transformation is unhelpful because the seasonal_variation is the same across both charts. The variance impact between 1979 and 1985 is diminished in the box-cox transformation. Though the variance at both tail ends are increased.

3.4 What Box-Cox transformation would you select for your retail data The aus_retail is a monthly tstibble with one value, I will use the data from the textbook.

## Taken directly from the textbook. 
set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

We can apply a box-cox transformation and see what value of lambda best transform the data.

myseries %>%
  autoplot() + ggtitle("Australian Retail Trade Turnover")

## Plot variable not specified, automatically selected `.vars = Turnover`

## Plot variable not specified, automatically selected `.vars = Turnover`

lambda2 <- myseries %>%
  features(Turnover,features = guerrero) %>%
  pull(lambda_guerrero)


lambda2

## [1] 0.08303631

The lambda value is 0.08 may approximate to 0

## Print the lambda value
myseries %>%
  autoplot(box_cox(Turnover,lambda2)) + ggtitle("Austrailian Retail with lambda = 0.08")

According to guerro which selected the lambda value for me, choose a value of 0.08 for the retail data, which is approximately a log transformation.

3.5 For the following series, find an appropriate Box-Cox transformation For this question, I will use the guerro function to choose the appropriate lambda values for the transformation in the chart.

Aus_production

aus_production %>%
  autoplot(Tobacco) + ggtitle("Quarterly Production of Tobbaco")

## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

# Creating a lambda value 
lambda3 <- aus_production %>%
  features(Tobacco,features = guerrero) %>%
  pull(lambda_guerrero)
lambda3

## [1] 0.9264636

aus_production %>%
  autoplot(box_cox(Tobacco,lambda3)) + ggtitle("Quarterly production of lambda = 0.92")

## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

The guerro function choose a lambda of 0.9 Ansett Ansett is the passenger numbers on Ansett airline flight, filter for airports between Melbourne and Sydney and economy class

ansett %>%
  filter(Airports == "MEL-SYD" & Class == "Economy") %>%
  autoplot(Passengers) + labs(title = "Passengers numbers on Ansett Flights")

## Create a lambda value from guerro 
lambda4 <- ansett %>%
  filter(Airports == "MEL-SYD" & Class == "Economy") %>%
  features(Passengers,features = guerrero) %>%
  pull(lambda_guerrero)

## Applying The Lambda value
ansett %>%
  filter(Airports == "MEL-SYD" & Class == "Economy") %>%
  autoplot(box_cox(Passengers,lambda4)) + labs(title = "Transformed passenger numbers lambda = 1.99")

The box_cox transformation choose a lambda value of 1.9 for the ansett data. Pedestrian This contains hourly pedestrian counts in the city of Melbourne

## I am not sure what's happening here.. 
pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  autoplot(Count) + ggtitle("Pedestrian counts in the city of Melbourne")

## Use the guerro feature once again for lambda
lambda5 <- pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  features(Count,features = guerrero) %>%
  pull(lambda_guerrero)

pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  autoplot(box_cox(Count,lambda5)) + ggtitle("Transformed Count with lambda = -0.25")

The lambda value chosen from the guerro function is -0.25

3.7 Consider The Last 5 Years of Gas Data A.

gas <- tail(aus_production, 5*4) |> select(Gas)

gas

## # A tsibble: 20 x 2 [1Q]
##      Gas Quarter
##    <dbl>   <qtr>
##  1   221 2005 Q3
##  2   180 2005 Q4
##  3   171 2006 Q1
##  4   224 2006 Q2
##  5   233 2006 Q3
##  6   192 2006 Q4
##  7   187 2007 Q1
##  8   234 2007 Q2
##  9   245 2007 Q3
## 10   205 2007 Q4
## 11   194 2008 Q1
## 12   229 2008 Q2
## 13   249 2008 Q3
## 14   203 2008 Q4
## 15   196 2009 Q1
## 16   238 2009 Q2
## 17   252 2009 Q3
## 18   210 2009 Q4
## 19   205 2010 Q1
## 20   236 2010 Q2

gas %>%
  autoplot() + ggtitle("Quarterly Gas production")

## Plot variable not specified, automatically selected `.vars = Gas`

We can see a strong seasonal fluctuation, with sharp increases in the middle of the year and a sharp decline at the begninng of every first quarter, it seems to be going up on a slight positive trend.

## From section 3.4
cmp <- gas %>%
  model(
  classical_decomposition(Gas,type = "multiplicative")) %>%
  components() 


cmp %>%  
  autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

C I felt like my interpretations were correct there is a slight positive trend, and the seasonal patterns are going down and up for each year. Thus its hard to account for the random variation in the data.

# Took the code from section 3.5 and plotted it for section D. 
cmp %>%
  ggplot(aes(x = Quarter)) + 
  geom_line(aes(y = Gas,color = "Data")) +
  geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend,color = "Trend")) + 
  labs(title = "Quarterly production of Gas in Australia") +
   scale_colour_manual(
    values = c("gray", "#0072B2", "#D55E00"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).

E Change one observation by adding 300 to one observation..

Gast1 <- gas
Gast1$Gas[1] <- Gast1$Gas[1] + 300
Gast1

## # A tsibble: 20 x 2 [1Q]
##      Gas Quarter
##    <dbl>   <qtr>
##  1   521 2005 Q3
##  2   180 2005 Q4
##  3   171 2006 Q1
##  4   224 2006 Q2
##  5   233 2006 Q3
##  6   192 2006 Q4
##  7   187 2007 Q1
##  8   234 2007 Q2
##  9   245 2007 Q3
## 10   205 2007 Q4
## 11   194 2008 Q1
## 12   229 2008 Q2
## 13   249 2008 Q3
## 14   203 2008 Q4
## 15   196 2009 Q1
## 16   238 2009 Q2
## 17   252 2009 Q3
## 18   210 2009 Q4
## 19   205 2010 Q1
## 20   236 2010 Q2

cmp1 <- Gast1 %>%
  model(
  classical_decomposition(Gas,type = "multiplicative")) %>%
  components() 


cmp1 %>%  
  autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

## Plot the seasonal adjusted data again..
cmp1 %>%
  ggplot(aes(x = Quarter)) + 
  geom_line(aes(y = Gas,color = "Data")) +
  geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend,color = "Trend")) + 
  labs(title = "Quarterly production of Gas in Australia") +
   scale_colour_manual(
    values = c("gray", "#0072B2", "#D55E00"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).

When I added 300 to the first observation, The seasonal adjusted data shot up at the beginning and then went down as soon it returned to the normal observations, which shows that the effect of the outlier had an effect on the seasonal adjusted data.

F Place an outlier in the middle of the dataset.

Gast2 <- gas

Gast2$Gas[10] <- Gast2$Gas[10] + 300

Gast2

## # A tsibble: 20 x 2 [1Q]
##      Gas Quarter
##    <dbl>   <qtr>
##  1   221 2005 Q3
##  2   180 2005 Q4
##  3   171 2006 Q1
##  4   224 2006 Q2
##  5   233 2006 Q3
##  6   192 2006 Q4
##  7   187 2007 Q1
##  8   234 2007 Q2
##  9   245 2007 Q3
## 10   505 2007 Q4
## 11   194 2008 Q1
## 12   229 2008 Q2
## 13   249 2008 Q3
## 14   203 2008 Q4
## 15   196 2009 Q1
## 16   238 2009 Q2
## 17   252 2009 Q3
## 18   210 2009 Q4
## 19   205 2010 Q1
## 20   236 2010 Q2

cmp2 <- Gast2 %>%
  model(
  classical_decomposition(Gas,type = "multiplicative")) %>%
  components() 


cmp2 %>%  
  autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

## Plot the seasonal adjusted data again..
cmp2 %>%
  ggplot(aes(x = Quarter)) + 
  geom_line(aes(y = Gas,color = "Data")) +
  geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend,color = "Trend")) + 
  labs(title = "Quarterly production of Gas in Australia") +
   scale_colour_manual(
    values = c("gray", "#0072B2", "#D55E00"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).

Place an outlier at the end of the observation.

Gast3 <- gas

Gast3$Gas[20] <- Gast3$Gas[20] + 300

Gast3

## # A tsibble: 20 x 2 [1Q]
##      Gas Quarter
##    <dbl>   <qtr>
##  1   221 2005 Q3
##  2   180 2005 Q4
##  3   171 2006 Q1
##  4   224 2006 Q2
##  5   233 2006 Q3
##  6   192 2006 Q4
##  7   187 2007 Q1
##  8   234 2007 Q2
##  9   245 2007 Q3
## 10   205 2007 Q4
## 11   194 2008 Q1
## 12   229 2008 Q2
## 13   249 2008 Q3
## 14   203 2008 Q4
## 15   196 2009 Q1
## 16   238 2009 Q2
## 17   252 2009 Q3
## 18   210 2009 Q4
## 19   205 2010 Q1
## 20   536 2010 Q2

cmp3 <- Gast3 %>%
  model(
  classical_decomposition(Gas,type = "multiplicative")) %>%
  components() 


cmp3 %>%  
  autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

cmp3 %>%
  ggplot(aes(x = Quarter)) + 
  geom_line(aes(y = Gas,color = "Data")) +
  geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend,color = "Trend")) + 
  labs(title = "Quarterly production of Gas in Australia") +
   scale_colour_manual(
    values = c("gray", "#0072B2", "#D55E00"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).

Placing an outlier at the middle and end of the time series we can see that there is a difference in where the outlier is placed and how the seasonal adjusted data is computed on where the outlier is.

3.8 Recall your Time Series

## Use the textbook data
set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

I will answer the previous question by plotting the myseries with the plot functions..

myseries %>%
  autoplot() + labs(title = "Australian retail trade turnover")

## Plot variable not specified, automatically selected `.vars = Turnover`

myseries %>%
  gg_season() +labs(title = "Australian retail trade turnover")

## Plot variable not specified, automatically selected `y = Turnover`

myseries %>%
  gg_subseries() + labs(title = "Australian retail trade turnover")

## Plot variable not specified, automatically selected `y = Turnover`

x11_dcmp <- myseries %>%
  model(X11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components() 

autoplot(x11_dcmp) + 
labs(title = "Decomposition of Australian clothing and accessory retailing trade turnover using X-11")

In this particular series I see in the iregular table some spikes within the chart, which could indicate outliers, which may have an effect on the trend which is slightly increasing. This would have been hard to see in the other plots without the x11 decomposition.

3.9 A. The results of the decomposition shown in the chart. The results shows a slight positive trend line that is representative of the value line.The scales in the season-year is pretty high and is going from -100 to 100. The seasonal chart shows various increases from Jan to March and a decrease during the spring and a decrease in the summer and fluctuates during the summer.

The recession of 1991/1992 is visible in the estimated components since we can see a slight decrease in the trend line and a big decrease in the remainder window. The overall data and the trend reflects the recession, i.e the outlier in the plot.

Fin

Homework 2

Ngawang Dakpa

2025-03-22