Email           :
Instagram   : https://www.instagram.com/marvis.zerex/
RPubs         : https://rpubs.com/invokerarts/
Linkedin     : https://www.linkedin.com/in/jeffry-wijaya-087a191b5/
Majors        : Business Statistics
Address      : ARA Center, Matana University Tower
                      Jl. CBD Barat Kav, RT.1, Curug Sangereng, Kelapa Dua, Tangerang, Banten 15810.



library(DT)
library(fpp2)
library(fpp3)
library(tidyverse)
library(ggplot2)
library(tsibbledata)
library(tsibble)
library(forecast)
library(fma)
library(expsmooth)
library(feasts)
library(USgas)
library(xlsx)
library(tidyr)
library(data.table)
library(seasonal)
math.trans = function(ts, objname)
{
  lambda = BoxCox.lambda(ts)
  print(paste('Lambda value for time series', objname, '=', lambda))
  print(paste('Plotting Original vs Transformed time series for', objname))
  
  df = cbind(Original = ts, Transformed = BoxCox(ts, lambda))
  
  autoplot(df, facet = TRUE) +
    xlab('Time') + ylab('Value') +
    ggtitle(paste('Original vs Transformed plot for', objname))
}

left = function(text, num_char) {
  substr(text, 1, num_char)
}
 
mid = function(text, start_num, num_char) {
  substr(text, start_num, start_num + num_char - 1)
}
 
right = function(text, num_char) {
  substr(text, nchar(text) - (num_char-1), nchar(text))
}

1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

global_economy %>%
  tsibble(key = Code, index = Year)%>%
  autoplot(GDP/Population, show.legend = FALSE) +
  labs(title= "GDP per capita",
       y = "$US")

datamax <- global_economy %>% mutate(sum = (global_economy$GDP/global_economy$Population))

maximumvalue <- datamax[which.max(datamax$sum),]
maximumvalue
## # A tsibble: 1 x 10 [1Y]
## # Key:       Country [1]
##   Country Code   Year         GDP Growth   CPI Imports Exports Population    sum
##   <fct>   <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>  <dbl>
## 1 Monaco  MCO    2014 7060236168.   7.18    NA      NA      NA      38132 1.85e5
global_economy %>%
  tsibble(key = Code, index = Year)%>%
  filter(Country=="Monaco") %>%
  autoplot(GDP/Population)

2 For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

2.1 United States GDP from global_economy

global_economy %>%
  filter(Country == "United States") %>%
  autoplot(GDP/Population) +
  labs(title= "GDP per capita", y = "$US")

2.2 Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock

aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
  autoplot(Count) +
  labs(title= "Slaughter of Victorian “Bulls, bullocks and steers", y = "Total Count")

2.3 Victorian Electricity Demand from vic_elec.

autoplot(vic_elec, Demand)

2.4 Gas production from aus_production

autoplot(aus_production, Gas)

3 Why is a Box-Cox transformation unhelpful for the canadian_gas data?

autoplot(canadian_gas,Volume)

lambda <- canadian_gas %>%
  features(Volume, features = guerrero) %>%
  pull(lambda_guerrero)

canadian_gas %>%
  autoplot(box_cox(Volume, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed gas production with $\\lambda$ = ",
         round(lambda,2))))

The plot of monthly Canadian gas production displays a seasonality of 1 year and a seasonal variance that is relatively low from 1960 through 1978, larger from 1978 through 1988 and smaller from 1988 through 2005. Because the seasonal variation increases and then decreases, the Box Cox transformation cannot be used to make the seasonal variation uniform.

4 What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries %>%
  autoplot(Turnover)+
  labs(title = "Australian Retail Trade Turnover")

lambda_retail <- myseries %>%
                  features(Turnover, features = guerrero) %>%
                  pull(lambda_guerrero)

myseries %>%
  autoplot(box_cox(Turnover, lambda_retail))+
  labs(title = latex2exp::TeX(paste0(
         "Box Cox Transformation of Australian Retail Trade Turnover with $\\lambda$ = ",
         round(lambda_retail,2))))

The plot of Australian Retail Trade doesn’t show an upward trend and a seasonality of one year. The seasonal variation increases with time.

5 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

tobacco <- ts(aus_production[,"Tobacco"])
math.trans(tobacco, 'tobacco')
## [1] "Lambda value for time series tobacco = 0.709945071249522"
## [1] "Plotting Original vs Transformed time series for tobacco"

economypassengers <- ansett %>%
                      filter(Class == "Economy", Airports =="MEL-SYD")
economy <- ts(economypassengers[,"Passengers"])
math.trans(economy, "economy passengers between Melbourne and Sydney")
## [1] "Lambda value for time series economy passengers between Melbourne and Sydney = 1.99995900720725"
## [1] "Plotting Original vs Transformed time series for economy passengers between Melbourne and Sydney"

newdata <- filter(pedestrian, Sensor =='Southern Cross Station')

datapede <- ts(newdata[,"Count"])
math.trans(datapede, 'datapede')
## [1] "Lambda value for time series datapede = 0.0748116138525378"
## [1] "Plotting Original vs Transformed time series for datapede"

6 Show that a \(3 \times 5\) MA is equivalent to a 7-term weighted moving average with weights of 0.067, 0.133, 0.200, 0.200, 0.200, 0.133, and 0.067.

A 3X5 MA is when a 3-MA follows a moving of a 5-MA series. It is also called a “centered moving average of order 5”. This is because the results are now symmetric.

\[3\times5MA=\frac{1}{15}Y_1+\frac{2}{15}Y_2+\frac{3}{15}Y_3+\frac{3}{15}Y_4++\frac{3}{15}Y_5+\frac{2}{15}Y_6+\frac{1}{15}Y_7\]

7 Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) %>% select(Gas)
gas
## # A tsibble: 20 x 2 [1Q]
##      Gas Quarter
##    <dbl>   <qtr>
##  1   221 2005 Q3
##  2   180 2005 Q4
##  3   171 2006 Q1
##  4   224 2006 Q2
##  5   233 2006 Q3
##  6   192 2006 Q4
##  7   187 2007 Q1
##  8   234 2007 Q2
##  9   245 2007 Q3
## 10   205 2007 Q4
## 11   194 2008 Q1
## 12   229 2008 Q2
## 13   249 2008 Q3
## 14   203 2008 Q4
## 15   196 2009 Q1
## 16   238 2009 Q2
## 17   252 2009 Q3
## 18   210 2009 Q4
## 19   205 2010 Q1
## 20   236 2010 Q2

7.1 Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

autoplot(gas, Gas)

From the plot above, there are seasonal fluctuations with a general upward growth trend. The seasons peak in roughly the middle of the year - potentially the summer months.

7.2 Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

gas %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Last Five Years of The Gas Data")

The results of the multiplicative decomposition show a quarterly seasonal component with a frequency of 1 year. There is an increasing trend from year 2006 through middle 2007. After year 2007, there is no trend until early 2008. After that, ther is an increasing trend late 2009.

7.3 Do the results support the graphical interpretation from part a?

The results support the graphical interpretation from part a, which was a seasonality of frequency 1 year and an increasing trend. And because classical multiplicative decomposition relies on moving averages, there is no data at the beginning and end of the trend-cycle.

7.4 Compute and plot the seasonally adjusted data.

gas_adjusted <- gas %>%
                model(classical_decomposition(Gas,type = "multiplicative")) %>%
                components()
gas_adjusted %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(y = "",
       title = "Last Five Years of The Gas Data") +
  scale_colour_manual(
    values = c("gray", "Red", "Blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

7.5 Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

gas_Obs <- gas
gas_Obs$Gas[10] <- gas_Obs$Gas[10]+300

gas_Obs %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")

gas_Obs %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(y = "",
       title = "Last Five Years of The Gas Data with 300 added to the 10th observation") +
  scale_colour_manual(
    values = c("gray", "Red", "Blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

when 300 was added to the 10th observation, it caused a large spike in the seasonally adjusted data. The quarterly gas data was taken from a seasonal low point to a relative high point. The addition of 300 to the 10th observation has a relatively small affect on the seasonal component. This is because the seasonal component is uniform for each year and only one data point has changed. It also caused a decreasing trend from early 2008 until middle 2008.

7.6 Does it make any difference if the outlier is near the end rather than in the middle of the time series?

gas_comp <- gas
gas_comp$Gas[20] <- gas_comp $Gas[10]+300

gas_comp %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")

gas_comp %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(y = "",
       title = "Last Five Years of The Gas Data with 300 added to the last observation") +
  scale_colour_manual(
    values = c("gray", "Red", "Blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

Adding 300 to the last entry causes a spike at the end of the seasonally adjusted data. The seasonal data is less affected by the change - its pattern more closely matches that of the original data. The trend looks more better, because it looks more increasing and also at the end cause adding 300 at the last observation.

8 Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

myseries %>%
  model(classical_decomposition(Turnover,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Multiplicative decomposition of my retail time series data")

myseries %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components() %>%
  autoplot()+
  labs(title = "X-11 decomposition of my retail time series data")

Compare both decomposition, the X-11 trend-cycle has captured the sudden fall in the 2000-2010.

9 Figures 1 and 2 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

Figure 1: Decomposition of the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

Figure 2: Seasonal component from the decomposition shown in the previous figure.

9.1 Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

Isolating the trend component from the seasonal component shows that the trend has increased throught the majority of the time frame, with a few stationary periods occuring in the early 90s. The monthly breakdown of the seasonal component shows that a few months show greater velocities in their variations than other months.

9.2 Is the recession of 1991/1992 visible in the estimated components?

Yes, we see a dip in employment during 1991/1992 that is not explained by seasonality or the positive trend.

10 This exercise uses the canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 – February 2005).

10.1 Plot the data using autoplot(), gg_subseries() and gg_season() to look at the effect of the changing seasonality over time.

canadian_gas %>% autoplot(Volume)

canadian_gas %>% gg_subseries(Volume)

canadian_gas %>% gg_season(Volume)

10.2 Do an STL decomposition of the data. You will need to choose a seasonal window to allow for the changing shape of the seasonal component.

canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 21) +
                   season(window = 13),
    robust = TRUE)) %>%
  components() %>%
  autoplot()

10.3 How does the seasonal shape change over time? [Hint: Try plotting the seasonal component using gg_season().]

As shown above, the seasonal shape is flat from beginning and then as the time goes by the seasonal shape increases. In year 1960 there is no trend-cycle, we can say the gas production didn’t really a trend in that time. After year 1975 there is a trend-cycle, hence the gas production increases at that time and so on.

10.4 Can you produce a plausible seasonally adjusted series?

canadian_gas %>%
 model(
    STL(Volume ~ trend(window = 21) +
                   season(window = 13),
    robust = TRUE)) %>%
  components() %>%
  ggplot(aes(x = Month)) +
  geom_line(aes(y = Volume, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(title = "STL decomposition of Canadian Gas Production") +
  scale_colour_manual(
    values = c("gray", "Red", "Blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

10.5 Compare the results with those obtained using SEATS and X-11. How are they different?

canadian_gas %>%
  model(x11 = X_13ARIMA_SEATS(Volume ~ x11())) %>%
  components() %>%
  autoplot()+
  labs(title = "X-11 decomposition of Canadian Gas Production")