3.1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

Since there are a lot of countries in this dataset, the legend takes up too much space.

## # A tibble: 58 x 5
## # Groups:   Year [58]
##    Country        Year     GDP Population GDPPerCap
##    <fct>         <dbl>   <dbl>      <dbl>     <dbl>
##  1 United States  1960 5.43e11  180671000     3007.
##  2 United States  1961 5.63e11  183691000     3067.
##  3 United States  1962 6.05e11  186538000     3244.
##  4 United States  1963 6.39e11  189242000     3375.
##  5 United States  1964 6.86e11  191889000     3574.
##  6 Kuwait         1965 2.10e 9     473554     4429.
##  7 Kuwait         1966 2.39e 9     524856     4556.
##  8 United States  1967 8.62e11  198712000     4336.
##  9 United States  1968 9.43e11  200706000     4696.
## 10 United States  1969 1.02e12  202677000     5032.
## # ... with 48 more rows

The data frame above shows the country with the highest GDP per capita in each year of the series. Six countries appear to have shared the top spot between 1960 and 2017.
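The code behind this table is not shown. A minimal sketch that would give a table of the same shape is below, assuming the fpp3 package is installed and that GDP per capita is computed as GDP / Population. The object name `top_by_year` is hypothetical.

```r
library(fpp3)

# Assumption: GDP per capita is GDP divided by Population.
top_by_year <- global_economy %>%
  as_tibble() %>%
  mutate(GDPPerCap = GDP / Population) %>%
  group_by(Year) %>%
  # Keep the single country with the largest value in each year.
  slice_max(GDPPerCap, n = 1, with_ties = FALSE) %>%
  select(Country, Year, GDP, Population, GDPPerCap)

top_by_year

# Countries that have held the top spot at least once.
unique(as.character(top_by_year$Country))
```

For the plot itself, `autoplot(global_economy, GDP / Population, show.legend = FALSE)` is one way to drop the oversized legend.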

The plot above shows the 6 countries that have had the highest per capita GDP from 1960 to 2017.

This is another plot that shows all the countries for each year in the data.

3.2 For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

US GDP shows an increasing trend over the past 50+ years, except for a dip during 2008-2009 as a result of the Great Financial Crisis.

The time series above shows cyclicality as well as seasonality.

The electricity demand time series shows seasonality.

The Australian gas production time series shows an increasing trend as well as seasonality. The variance seems to increase with the level of the series.

Transforming this time series by taking its log seems to reduce the variance and make it more constant.

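Applying a Box-Cox transformation to this time series achieves a similar result to the log transformation, i.e. it makes the variance more constant. The optimal lambda parameter is estimated to be 0.12, which is close to 0 (the value at which Box-Cox reduces to a log transformation), so the similarity is expected.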
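The text does not say how lambda was chosen. The sketch below assumes the Guerrero feature from feasts (loaded by fpp3), which is a common way to pick it, and then plots the transformed series.

```r
library(fpp3)

# Assumption: lambda is chosen with the Guerrero method.
lambda <- aus_production %>%
  features(Gas, features = guerrero) %>%
  pull(lambda_guerrero)

lambda

# Plot the Box-Cox transformed series using the selected lambda.
aus_production %>%
  autoplot(box_cox(Gas, lambda)) +
  labs(y = "Box-Cox transformed gas production")
```

The exact value returned may differ slightly between package versions.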

3.3 Why is a Box-Cox transformation unhelpful for the canadian_gas data?

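Lambda = 0.39 is the value selected for this data. However, the transformation is unhelpful because the seasonal variation in canadian_gas does not change in proportion to the level of the series: it is largest in the middle of the series and smaller at both the low early levels and the high later levels, so no single Box-Cox transformation can make the variance constant.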
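One way to judge whether the transformation helps is to plot the series before and after transforming it. This is a sketch only, assuming fpp3 is installed and that lambda comes from the Guerrero feature (the text does not say how the value was obtained).

```r
library(fpp3)

# Original series.
canadian_gas %>%
  autoplot(Volume) +
  labs(title = "Canadian gas production")

# Assumption: lambda is chosen with the Guerrero method.
lambda_cg <- canadian_gas %>%
  features(Volume, features = guerrero) %>%
  pull(lambda_guerrero)

# Transformed series, to compare the size of the seasonal swings over time.
canadian_gas %>%
  autoplot(box_cox(Volume, lambda_cg)) +
  labs(title = "Box-Cox transformed Canadian gas production")
```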

3.4 What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

Lambda = -0.04 seems to be the optimal choice for this time series. This is very close to 0, so a simple log transformation would have almost the same effect.

3.5 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

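For Tobacco, Lambda = 0.93 seems to be the best choice. This is close to 1, so the transformation changes the shape of the series very little.

For economy class passengers between Melbourne and Sydney, Lambda = 2 seems to be the best choice.

For pedestrian counts at Southern Cross Station, Lambda = -0.23 seems to be the best choice.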
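A sketch of how the three values could be obtained is below, assuming fpp3 is installed and the Guerrero feature is used for each series. The object names are hypothetical.

```r
library(fpp3)

# Tobacco production (aus_production).
tobacco_lambda <- aus_production %>%
  features(Tobacco, features = guerrero)

# Economy class passengers, Melbourne-Sydney route (ansett).
ansett_lambda <- ansett %>%
  filter(Class == "Economy", Airports == "MEL-SYD") %>%
  features(Passengers, features = guerrero)

# Hourly pedestrian counts at Southern Cross Station (pedestrian).
pedestrian_lambda <- pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  features(Count, features = guerrero)

tobacco_lambda
ansett_lambda
pedestrian_lambda
```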

3.7 Consider the last five years of the Gas data from aus_production.

  a. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

The Australian gas production time series shows an increasing trend as well as quarterly seasonality. The gas production tends to increase in Q2 and then peaks in Q3 before decreasing in Q4. This pattern repeats in each of the years.

  b. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

The above graph shows the original time series, the trend component, the seasonal component and the random component.

## # A dable: 20 x 7 [1Q]
## # Key:     .model [1]
## # :        Gas = trend * seasonal * random
##    .model                      Quarter   Gas trend seasonal random season_adjust
##    <chr>                         <qtr> <dbl> <dbl>    <dbl>  <dbl>         <dbl>
##  1 "classical_decomposition(G~ 2005 Q3   221   NA     1.13  NA              196.
##  2 "classical_decomposition(G~ 2005 Q4   180   NA     0.925 NA              195.
##  3 "classical_decomposition(G~ 2006 Q1   171  200.    0.875  0.974          195.
##  4 "classical_decomposition(G~ 2006 Q2   224  204.    1.07   1.02           209.
##  5 "classical_decomposition(G~ 2006 Q3   233  207     1.13   1.00           207.
##  6 "classical_decomposition(G~ 2006 Q4   192  210.    0.925  0.987          208.
##  7 "classical_decomposition(G~ 2007 Q1   187  213     0.875  1.00           214.
##  8 "classical_decomposition(G~ 2007 Q2   234  216.    1.07   1.01           218.
##  9 "classical_decomposition(G~ 2007 Q3   245  219.    1.13   0.996          218.
## 10 "classical_decomposition(G~ 2007 Q4   205  219.    0.925  1.01           222.
## 11 "classical_decomposition(G~ 2008 Q1   194  219.    0.875  1.01           222.
## 12 "classical_decomposition(G~ 2008 Q2   229  219     1.07   0.974          213.
## 13 "classical_decomposition(G~ 2008 Q3   249  219     1.13   1.01           221.
## 14 "classical_decomposition(G~ 2008 Q4   203  220.    0.925  0.996          219.
## 15 "classical_decomposition(G~ 2009 Q1   196  222.    0.875  1.01           224.
## 16 "classical_decomposition(G~ 2009 Q2   238  223.    1.07   0.993          222.
## 17 "classical_decomposition(G~ 2009 Q3   252  225.    1.13   0.994          224.
## 18 "classical_decomposition(G~ 2009 Q4   210  226     0.925  1.00           227.
## 19 "classical_decomposition(G~ 2010 Q1   205   NA     0.875 NA              234.
## 20 "classical_decomposition(G~ 2010 Q2   236   NA     1.07  NA              220.

The table above shows the values of the respective components.

The graph above shows the trend component overlaid on top of the plot of the time series.
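The decomposition code is not shown. A minimal sketch that would produce a table of this form is below, assuming fpp3 is installed. The name `mul.decomp` is hypothetical, chosen to mirror the `mul.decomp.new` object that appears later.

```r
library(fpp3)

# Last five years (20 quarters) of the Gas series.
gas <- tail(aus_production, 5 * 4) %>% select(Gas)

# Classical multiplicative decomposition: Gas = trend * seasonal * random.
mul.decomp <- gas %>%
  model(classical_decomposition(Gas, type = "multiplicative"))

components(mul.decomp)

# Four-panel plot: data, trend, seasonal and random components.
components(mul.decomp) %>% autoplot()
```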

  c. Do the results support the graphical interpretation from part a?

Yes. As can be seen from the table of components, the trend component increases in almost every quarter (it is roughly flat at about 219 from late 2007 through 2008), while the seasonal component is highest in Q3 (1.13) and lowest in Q1 (0.875).
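The trend and seasonal figures in the table can be checked by hand. The base-R sketch below uses no packages: it copies the 20 Gas values from the table, applies a centred 2x4 moving average for the trend, and averages the detrended ratios by quarter for the seasonal indices. This is the standard classical multiplicative procedure, not necessarily the exact code used for the table.

```r
# Gas values for 2005 Q3 to 2010 Q2, copied from the table above.
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)

# Centred 2x4 moving average: half weight on the two end quarters.
trend <- as.numeric(stats::filter(gas_values, c(0.5, 1, 1, 1, 0.5) / 4, sides = 2))

# Quarter of each observation (the series starts in Q3).
quarter <- rep(c(3, 4, 1, 2), length.out = length(gas_values))

# Average the detrended ratios by quarter, then rescale so the indices average 1.
ratio <- gas_values / trend
seasonal <- tapply(ratio, quarter, mean, na.rm = TRUE)
seasonal <- seasonal / mean(seasonal)

trend[3]            # trend for 2006 Q1: 200.5
round(seasonal, 3)  # indices for Q1 to Q4
```

The first and last two trend values are NA because the moving average needs two quarters on each side. The seasonal indices agree with the table values (0.875, 1.07, 1.13 and 0.925 for Q1 to Q4) to the precision printed there.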

  d. Compute and plot the seasonally adjusted data.

The plot above shows the seasonally adjusted data.
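In a multiplicative decomposition the seasonally adjusted value is the observation divided by its seasonal index, which is what the season_adjust column holds. A sketch of the plot is below, assuming fpp3 is installed. The object name `gas_components` is hypothetical.

```r
library(fpp3)

gas <- tail(aus_production, 5 * 4) %>% select(Gas)

gas_components <- gas %>%
  model(classical_decomposition(Gas, type = "multiplicative")) %>%
  components()

# Original series in grey, seasonally adjusted series in blue.
gas_components %>%
  as_tsibble() %>%
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y = season_adjust), colour = "#0072B2") +
  labs(y = "Gas production", title = "Seasonally adjusted gas production")
```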

  e. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

## [1] 505
components(mul.decomp.new)
## # A dable: 20 x 7 [1Q]
## # Key:     .model [1]
## # :        Gas = trend * seasonal * random
##    .model                      Quarter   Gas trend seasonal random season_adjust
##    <chr>                         <qtr> <dbl> <dbl>    <dbl>  <dbl>         <dbl>
##  1 "classical_decomposition(G~ 2005 Q3   221   NA     1.06  NA              209.
##  2 "classical_decomposition(G~ 2005 Q4   180   NA     1.12  NA              160.
##  3 "classical_decomposition(G~ 2006 Q1   171  200.    0.821  1.04           208.
##  4 "classical_decomposition(G~ 2006 Q2   224  204.    0.998  1.10           224.
##  5 "classical_decomposition(G~ 2006 Q3   233  207     1.06   1.06           220.
##  6 "classical_decomposition(G~ 2006 Q4   192  210.    1.12   0.813          171.
##  7 "classical_decomposition(G~ 2007 Q1   187  213     0.821  1.07           228.
##  8 "classical_decomposition(G~ 2007 Q2   234  254.    0.998  0.924          234.
##  9 "classical_decomposition(G~ 2007 Q3   245  294.    1.06   0.789          232.
## 10 "classical_decomposition(G~ 2007 Q4   505  294.    1.12   1.53           449.
## 11 "classical_decomposition(G~ 2008 Q1   194  294.    0.821  0.804          236.
## 12 "classical_decomposition(G~ 2008 Q2   229  256.    0.998  0.894          229.
## 13 "classical_decomposition(G~ 2008 Q3   249  219     1.06   1.08           236.
## 14 "classical_decomposition(G~ 2008 Q4   203  220.    1.12   0.820          181.
## 15 "classical_decomposition(G~ 2009 Q1   196  222.    0.821  1.08           239.
## 16 "classical_decomposition(G~ 2009 Q2   238  223.    0.998  1.07           238.
## 17 "classical_decomposition(G~ 2009 Q3   252  225.    1.06   1.06           238.
## 18 "classical_decomposition(G~ 2009 Q4   210  226     1.12   0.827          187.
## 19 "classical_decomposition(G~ 2010 Q1   205   NA     0.821 NA              250.
## 20 "classical_decomposition(G~ 2010 Q2   236   NA     0.998 NA              236.

Adding 300 to the observation for 2007 Q4 changes the seasonal component: the seasonal index for Q4 increases (from 0.925 to 1.12), while those for Q1, Q2 and Q3 decrease. The trend is also inflated for the five quarters around the outlier (2007 Q2 to 2008 Q2). The plot of the seasonally adjusted series shows a big spike at the modified 2007 Q4 value.
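A sketch of how the outlier could be introduced and the decomposition recomputed is below, assuming fpp3 is installed. The name `gas_new` is hypothetical, and row 10 corresponds to 2007 Q4 in the 20-quarter subset.

```r
library(fpp3)

gas <- tail(aus_production, 5 * 4) %>% select(Gas)

# Add 300 to the 10th observation (2007 Q4), leaving the rest unchanged.
gas_new <- gas %>%
  mutate(Gas = if_else(row_number() == 10, Gas + 300, Gas))

gas_new$Gas[10]

mul.decomp.new <- gas_new %>%
  model(classical_decomposition(Gas, type = "multiplicative"))

# Seasonally adjusted series with the outlier in place.
components(mul.decomp.new) %>%
  as_tsibble() %>%
  autoplot(season_adjust) +
  labs(y = "Seasonally adjusted gas production")
```

Changing the row index (for example to 18 for 2009 Q4) moves the outlier towards the end of the series.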

  f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

## [1] 205
## [1] 510
## # A dable: 20 x 7 [1Q]
## # Key:     .model [1]
## # :        Gas = trend * seasonal * random
##    .model                      Quarter   Gas trend seasonal random season_adjust
##    <chr>                         <qtr> <dbl> <dbl>    <dbl>  <dbl>         <dbl>
##  1 "classical_decomposition(G~ 2005 Q3   221   NA     1.03  NA              214.
##  2 "classical_decomposition(G~ 2005 Q4   180   NA     1.09  NA              165.
##  3 "classical_decomposition(G~ 2006 Q1   171  200.    0.857  0.995          199.
##  4 "classical_decomposition(G~ 2006 Q2   224  204.    1.01   1.08           221.
##  5 "classical_decomposition(G~ 2006 Q3   233  207     1.03   1.09           225.
##  6 "classical_decomposition(G~ 2006 Q4   192  210.    1.09   0.835          176.
##  7 "classical_decomposition(G~ 2007 Q1   187  213     0.857  1.02           218.
##  8 "classical_decomposition(G~ 2007 Q2   234  216.    1.01   1.07           231.
##  9 "classical_decomposition(G~ 2007 Q3   245  219.    1.03   1.08           237.
## 10 "classical_decomposition(G~ 2007 Q4   205  219.    1.09   0.856          187.
## 11 "classical_decomposition(G~ 2008 Q1   194  219.    0.857  1.03           226.
## 12 "classical_decomposition(G~ 2008 Q2   229  219     1.01   1.03           226.
## 13 "classical_decomposition(G~ 2008 Q3   249  219     1.03   1.10           241.
## 14 "classical_decomposition(G~ 2008 Q4   203  220.    1.09   0.842          186.
## 15 "classical_decomposition(G~ 2009 Q1   196  222.    0.857  1.03           229.
## 16 "classical_decomposition(G~ 2009 Q2   238  261.    1.01   0.900          235.
## 17 "classical_decomposition(G~ 2009 Q3   252  300.    1.03   0.812          244.
## 18 "classical_decomposition(G~ 2009 Q4   510  301     1.09   1.55           466.
## 19 "classical_decomposition(G~ 2010 Q1   205   NA     0.857 NA              239.
## 20 "classical_decomposition(G~ 2010 Q2   236   NA     1.01  NA              233.

Yes, but mainly for the trend. With the outlier at 2009 Q4 (510 instead of 210), only the last three trend estimates (2009 Q2 to Q4) are inflated, because the moving average is not defined for the final two quarters. With the outlier in the middle, five trend estimates (2007 Q2 to 2008 Q2) were affected. The effect on the seasonal component is similar in both cases: the Q4 index rises from 0.925 to 1.09 (1.12 for the mid-series outlier) and the other three indices fall, and the seasonally adjusted series again shows a large spike at the modified observation.

3.8 Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

The original plot shows high cyclicality as well as a seasonal component that has changed over the years. The extent of variability also seems to depend on the level of the time series. The increased variance during the 2011-12 period has largely leaked into the remainder component.
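The retail series used here is not identified in the text, so the sketch below picks one series from aus_retail purely for illustration (the name `myseries` and the choice of series are assumptions). It also assumes that fpp3 and the seasonal package, which provides the X-13ARIMA-SEATS binaries, are installed.

```r
library(fpp3)

# Assumption: an arbitrary series stands in for the author's retail series.
myseries <- aus_retail %>%
  filter(`Series ID` == first(`Series ID`))

# X-11 decomposition via X_13ARIMA_SEATS() from feasts.
x11_dcmp <- myseries %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components()

# Panels: data, trend, seasonal and irregular components.
autoplot(x11_dcmp) +
  labs(title = "X-11 decomposition of retail turnover")
```

Unusual observations tend to show up as isolated spikes in the irregular panel.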

3.9 Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995. Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation. Is the recession of 1991/1992 visible in the estimated components?

The decomposition shows an increasing trend-cycle over the period 1978 to 1995. The seasonal component seems to have changed over the years: the pattern in the early part of the period differs from that in the later years. This can also be seen in the seasonal plot (Figure 3.20), where certain months such as March, July, August, November and December show high variance over the years. The impact of the 1991/1992 recession is visible primarily in the remainder component rather than in the trend component. This suggests that the trend is estimated over a large number of periods, which results in over-smoothing.