Email         : sherlytaurinsiri@gmail.com
Instagram     : https://www.instagram.com/sherlytaurin
RPubs       : https://rpubs.com/sherlytaurin/
Github       : https://github.com/sherlytaurin/
Telegram       : @Sherlytaurin

1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

global_economy %>%
  tsibble(key = Code, index = Year)%>%
  autoplot(GDP/Population, show.legend = FALSE) +
  labs(title= "GDP per capita",
       y = "$US")

datamax <- global_economy %>% 
  mutate(sum = (global_economy$GDP/global_economy$Population))
maximumvalue <- datamax[which.max(datamax$sum),]
maximumvalue

## # A tsibble: 1 x 10 [1Y]
## # Key:       Country [1]
##   Country Code   Year         GDP Growth   CPI Imports Exports Population    sum
##   <fct>   <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>  <dbl>
## 1 Monaco  MCO    2014 7060236168.   7.18    NA      NA      NA      38132 1.85e5

From data, we can see that the country that have the highest GDP per capita is Monaco. The time series was like:

global_economy %>%
  tsibble(key = Code, index = Year)%>%
  filter(Country=="Monaco") %>%
  autoplot(GDP/Population)

2 For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

2.1 United States GDP from global_economy

2.2 Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock

aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
  autoplot(Count) +
  labs(title= "Slaughter of Victorian Bulls, bullocks and steers", y = "Count")

2.3 Victorian Electricity Demand from vic_elec.

vic_elec %>% autoplot(Demand)

2.4 Gas production from aus_production

aus_production %>% autoplot(Gas)

3 Why is a Box-Cox transformation unhelpful for the canadian_gas data?

The plot of monthly Canadian gas production displays a seasonality of 1 year and a seasonal variance that is relatively low from 1960 through 1978, larger from 1978 through 1988 and smaller from 1988 through 2005. Because the seasonal variation increases and then decreases, the Box Cox transformation cannot be used to make the seasonal variation uniform.

4 What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

lambda_retail <- myseries %>%
                  features(Turnover, features = guerrero) %>%
                  pull(lambda_guerrero)

myseries %>%
  autoplot(box_cox(Turnover, lambda_retail))+
  labs(title = latex2exp::TeX(paste0(
         "Box Cox Transformation of Australian Retail Trade Turnover with $\\lambda$ = ",
         round(lambda_retail,2))))

The plot from Australian Retai Trade time series doesn’t show any upward treand and a seasonality of one year. The seasonal variation increases with time. So we will choose the second one.

5 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.

math.trans = function(ts, objname)
{
  lambda = BoxCox.lambda(ts)
  print(paste('Lambda value for time series', objname, '=', lambda))
  print(paste('Plotting Original vs Transformed time series for', objname))
  
  df = cbind(Original = ts, Transformed = BoxCox(ts, lambda))
  
  autoplot(df, facet = TRUE) +
    xlab('Time') + ylab('Value') +
    ggtitle(paste('Original vs Transformed plot for', objname))
}

5.1 Tobacco from aus_production

tobacco <- ts(aus_production[,"Tobacco"])
math.trans(tobacco, 'tobacco')

## [1] "Lambda value for time series tobacco = 0.709945071249522"
## [1] "Plotting Original vs Transformed time series for tobacco"

5.2 Economy class passengers between Melbourne and Sydney from ansett

economypassengers <- ansett %>%
                      filter(Class == "Economy", Airports =="MEL-SYD")
economy <- ts(economypassengers[,"Passengers"])
math.trans(economy, "economy passengers between Melbourne and Sydney")

## [1] "Lambda value for time series economy passengers between Melbourne and Sydney = 1.99995900720725"
## [1] "Plotting Original vs Transformed time series for economy passengers between Melbourne and Sydney"

5.3 Pedestrian counts at Southern Cross Station from pedestrian

pedestriancount <- pedestrian %>%
            filter(Sensor =="Southern Cross Station")

datapedestrian <- ts(pedestriancount[,"Count"])
math.trans(datapedestrian, 'Pedestrian counts at Southern Cross Station')

## [1] "Lambda value for time series Pedestrian counts at Southern Cross Station = 0.0748116138525378"
## [1] "Plotting Original vs Transformed time series for Pedestrian counts at Southern Cross Station"

6 Show that a 3 x 5 MA is equivalent to a 7-term weighted moving average with weights of 0.067, 0.133, 0.200, 0.200, 0.200, 0.133, and 0.067.

3x5 MA is taking the average for the first five terms and then getting the moving average of each three observation set. Since a 3X5 MA will lead to a 1/15(y + 2y + 3y + 3y + 3y + 2y + y), it follows the weighted average of 1/15, 2/15, 3/15, 3/15 … etc.

it’s called “Centered Moving Average of order 5” because the result are symmetric. \[ 3 \times 5 MA = \frac {1}{15}Y_1 + \frac {2}{15}Y_2 + \frac {3}{15}Y_3 + \frac {3}{15}Y_4 + \frac {3}{15}Y_5 + \frac {2}{15}Y_6 + \frac {1}{15}Y_7 \]

7 Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) %>% select(Gas)
gas

## # A tsibble: 20 x 2 [1Q]
##      Gas Quarter
##    <dbl>   <qtr>
##  1   221 2005 Q3
##  2   180 2005 Q4
##  3   171 2006 Q1
##  4   224 2006 Q2
##  5   233 2006 Q3
##  6   192 2006 Q4
##  7   187 2007 Q1
##  8   234 2007 Q2
##  9   245 2007 Q3
## 10   205 2007 Q4
## 11   194 2008 Q1
## 12   229 2008 Q2
## 13   249 2008 Q3
## 14   203 2008 Q4
## 15   196 2009 Q1
## 16   238 2009 Q2
## 17   252 2009 Q3
## 18   210 2009 Q4
## 19   205 2010 Q1
## 20   236 2010 Q2

7.1 Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

gas %>%
  autoplot(Gas)+
  labs(title = "Last Five Years of The Gas Data")+
  theme_replace()+
  geom_line(col = "#1B89D3")

From the plot, there are seasonal fluctuation with a general upward growth trend. The highest trend is in the middle of the year and will downward until the beginning of the year and so on.

7.2 Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

gas %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Last Five Years of The Gas Data")

The results of the multiplicative decomposition show a quarterly seasonal component with a frequency of 1 year. There is an increasing trend from year 2006 through middle 2007. After year 2007, there is no trend until early 2008. After that, ther is an increasing trend late 2009.

7.3 Do the results support the graphical interpretation from part a?

The results support the graphical interpretation from part a, which was a seasonality of frequency 1 year and an increasing trend. And because classical multiplicative decomposition relies on moving averages, there is no data at the beginning and end of the trend-cycle.

7.4 Compute and plot the seasonally adjusted data.

gas_decom <- gas %>%
             model(classical_decomposition(Gas,type = "multiplicative")) %>%
             components()
gas_decom %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(y = "",
       title = "Last Five Years of The Gas Data") +
  scale_colour_manual(
    values = c("gray", "#0072B2", "#D55E00"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

7.5 Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

gas1 <- gas
gas1$Gas[10] <- gas1$Gas[10]+300

gas1 %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")

gas1 %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(y = "",
       title = "Last Five Years of The Gas Data with 300 added to the 10th observation") +
  scale_colour_manual(
    values = c("dark green", "red", "blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

when 300 was added to the 10th observation, it caused a large spike in the seasonally adjusted data. The quarterly gas data was taken from a seasonal low point to a relative high point. The addition of 300 to the 10th observation has a relatively small affect on the seasonal component. This is because the seasonal component is uniform for each year and only one data point has changed. It also caused a decreasing trend from early 2008 until middle 2008.

7.6 Does it make any difference if the outlier is near the end rather than in the middle of the time series?

gas2 <- gas
gas2$Gas[20] <- gas2$Gas[10]+300

gas2 %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")

gas2 %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(y = "",
       title = "Last Five Years of The Gas Data with 300 added to the last observation") +
  scale_colour_manual(
    values = c("dark green", "red", "blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

Adding 300 to the last entry causes a spike at the end of the seasonally adjusted data. The seasonal data is less affected by the change - its pattern more closely matches that of the original data. The trend looks more better, because it looks more increasing and also at the end cause adding 300 at the last observation.

8 Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

myseries %>%
  model(classical_decomposition(Turnover,type = "multiplicative")) %>%
  components() %>%
  autoplot() + 
  ggtitle("Multiplicative decomposition of my retail time series data")

myseries %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components() %>%
  autoplot()+
  labs(title = "X-11 decomposition of my retail time series data")

Compare both decomposition, the X-11 trend-cycle has captured the sudden fall in the 2000-2010.

9 Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

Figure 1 : Decomposition of the number of persons in the civillian labour force in Australia each month from February 1978 to August 1995.

Figure 2: Seasonal component from the decomposition shown in the previous figure.

9.1 Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

Isolating the trend component from the seasonal component shows that the trend has increased throught the majority of the time frame, with a few stationary periods occuring in the early 90s. The monthly breakdown of the seasonal component shows that a few months show greater velocities in their variations than other months.

9.2 Is the recession of 1991/1992 visible in the estimated components?

Yes, we see a dip in employment during 1991/1992 that is not explained by seasonality or the positive trend.

10 This exercise uses the canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 – February 2005).

10.1 Plot the data using autoplot(), gg_subseries() and gg_season() to look at the effect of the changing seasonality over time.

canadian_gas %>%
  autoplot(Volume)+
  labs(title = "Monthly Canadian Gas Production",
       subtitle = "autoplot()",
       y = "Billions of cubic meter")+
  theme_replace()+
  geom_line(col = "dark green")

canadian_gas %>%
  gg_subseries(Volume)+
  labs(title = "Monthly Canadian Gas Production",
       subtitle = "gg_subseries()",
       y = "Billions of cubic meter")

canadian_gas %>%
  gg_season(Volume)+
  labs(title = "Monthly Canadian Gas Production",
       subtitle = "gg_season()",
       y = "Billions of cubic meter")

We can see, that according to the plots above, Canadian_gas data has seasonally increasing trend. In general, the gas production increasing on winter and decreasing on summer.
The trend increases drammatically from 1975 to 1990. That because there were larger differences gas production between winter and summer in those year that shown on the seasonal plot.

10.2 Do an STL decomposition of the data. You will need to choose a seasonal window to allow for the changing shape of the seasonal component.

canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 21) +
                   season(window = 13),
    robust = TRUE)) %>%
  components() %>%
  autoplot()

From the STL decomposition above, The trend component adequately represent the original data (volume). Where the seasonal component(Season_year) increases from 1975 until 1985 after that decreases. The remainder component is almost around zero.

10.3 How does the seasonal shape change over time? [Hint: Try plotting the seasonal component using gg_season().]

As shown above, the seasonal shape is flat from beginning and then as the time goes by the seasonal shape increases. In year 1960 there is no trend-cycle, we can say the gas production didn’t really a trend in that time. After year 1975 there is a trend-cycle, hence the gas production increases at that time and so on.

10.4 Can you produce a plausible seasonally adjusted series?

canadian_gas %>%
 model(
    STL(Volume ~ trend(window = 21) +
                   season(window = 13),
    robust = TRUE)) %>%
  components() %>%
  ggplot(aes(x = Month)) +
  geom_line(aes(y = Volume, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) +
  labs(title = "STL decomposition of Canadian Gas Production") +
  scale_colour_manual(
    values = c("green", "red", "blue"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

10.5 Compare the results with those obtained using SEATS and X-11. How are they different?

canadian_gas %>%
  model(x11 = X_13ARIMA_SEATS(Volume ~ x11())) %>%
  components() %>%
  autoplot()+
  labs(title = "X-11 decomposition of Canadian Gas Production")

canadian_gas %>%
  model(seats = X_13ARIMA_SEATS(Volume ~ seats())) %>%
  components() %>%
autoplot() +
  labs(title ="SEATS Decomposition of Canadian Gas Production")

The decomposed trend and seasonal components are similliar each other. The changes of seasonality are differents from the original data. The differences of seasonally adjusted time series are very less between these two method. In addition, the remainder component of the SEATS decomposition is larger than that of the X11 decomposition, where both remainders are around one. The remainder component of the STL decomposition is smaller.

So, we can conclude that the STL decomposition fits the canadian_gas data better.

Tugas Ekonometrika W11

Sherly Taurin Siridion – 20194920011

April 19, 2021