library(fpp3)
Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?
The two countries with the highest GDP per capita are Monaco and Liechtenstein. Both show a clear upward trend over the years, and both show dips in growth that recur roughly every 10 to 15 years. Monaco has been at the top of the list for GDP per capita since 1970, while Liechtenstein moved into second place in the late 1980s and has stayed there since.
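The plot and ranking described above can be reproduced with a sketch along these lines. The column name `GDP_per_capita` and the object `top_by_year` are illustrative names, not part of the dataset.

```r
library(fpp3)

# GDP per capita is not a column in global_economy, so derive it first.
gdp_pc <- global_economy %>%
  mutate(GDP_per_capita = GDP / Population)

# One line per country; the legend is dropped because there are too many keys.
gdp_pc %>%
  autoplot(GDP_per_capita) +
  guides(colour = "none") +
  labs(title = "GDP per capita by country", y = "US$ per person")

# Country with the highest GDP per capita in each year.
top_by_year <- gdp_pc %>%
  as_tibble() %>%
  filter(!is.na(GDP_per_capita)) %>%
  group_by(Year) %>%
  slice_max(GDP_per_capita, n = 1) %>%
  ungroup() %>%
  select(Year, Country, GDP_per_capita)

top_by_year
```

Printing `top_by_year` makes it easy to check which country leads in each year instead of reading it off a crowded plot.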
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United States GDP from global_economy.
It makes sense to convert GDP to GDP per capita to account for population changes. For the US, the adjustment changes the shape of the series very little, which suggests that population growth explains only a small part of the increase in GDP.
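A minimal sketch of the population adjustment, assuming the series is selected with `Country == "United States"`; `us_gdp` and `GDP_per_capita` are illustrative names.

```r
library(fpp3)

# Keep only the US series and add the per-capita version of GDP.
us_gdp <- global_economy %>%
  filter(Country == "United States") %>%
  mutate(GDP_per_capita = GDP / Population)

# Raw GDP.
us_gdp %>%
  autoplot(GDP) +
  labs(title = "United States GDP", y = "US$")

# Population-adjusted GDP, for comparison with the plot above.
us_gdp %>%
  autoplot(GDP_per_capita) +
  labs(title = "United States GDP per capita", y = "US$ per person")
```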
Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.
For the slaughter counts, I plotted a 12-month (yearly) moving average to estimate the trend-cycle. The moving average line shows a clear downward trend in the slaughter of Victorian bulls, bullocks and steers, along with a cycle that peaks every 5 to 10 years.
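One way to compute such a moving average is sketched below. It uses a centered 2x12-MA built with `slide_dbl()` from the slider package, which is an assumption about the method; the original analysis may have used a different window.

```r
library(fpp3)
library(slider)  # slider is not attached by fpp3

vic_steers <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Bulls, bullocks and steers") %>%
  mutate(
    # 12-month moving average; it is off-center because 12 is even.
    `12-MA` = slide_dbl(Count, mean, .before = 5, .after = 6, .complete = TRUE),
    # Average adjacent pairs of the 12-MA to center the estimate.
    `2x12-MA` = slide_dbl(`12-MA`, mean, .before = 1, .after = 0, .complete = TRUE)
  )

# Original counts in gray with the trend-cycle estimate on top.
vic_steers %>%
  autoplot(Count, colour = "gray") +
  geom_line(aes(y = `2x12-MA`), colour = "#D55E00") +
  labs(title = "Victorian slaughter of bulls, bullocks and steers", y = "Count")
```

The first and last few months have no moving average because the window is incomplete there, so ggplot2 may warn about removed rows.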
Victorian Electricity Demand from vic_elec.
A transformation did not seem necessary for Victorian electricity demand. The seasonality is already clear, with more electricity used in the summer and winter months. There is no visible trend or cycle, although either could be longer than the three years of data (2012 to 2014) can capture.
Gas production from aus_production.
For gas production, the variation grows with the level of the series, so it needed to be stabilized. A Box-Cox transformation evens out the seasonal variation, which is then roughly constant over time.
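The Box-Cox step for gas production can be sketched as follows, assuming lambda is chosen with the `guerrero` feature, as is done for the retail data later on.

```r
library(fpp3)

# Pick lambda with the Guerrero method.
lambda <- aus_production %>%
  features(Gas, features = guerrero) %>%
  pull(lambda_guerrero)

# Plot the transformed series; the seasonal swings should look more even.
aus_production %>%
  autoplot(box_cox(Gas, lambda)) +
  labs(title = "Box-Cox transformed gas production", y = "Transformed gas production")
```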
Why is a Box-Cox transformation unhelpful for the canadian_gas data?
The reason to use a Box-Cox transformation is to stabilize the seasonal variation over time. As the graphs above show, applying a lambda of 0.39 does not even out the seasonal variation: it first grows and then shrinks while the level of the series keeps rising, so no single power transformation can make it constant. The transformation is therefore unhelpful for the Canadian gas data.
What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?
# mytsibble is the retail series selected in Exercise 2.8
mytsibble %>%
  autoplot(Turnover) +
  labs(title = 'Exercise 2.8 Monthly Australian retail turnover')
For the graph above, I would try a Box-Cox transformation to see whether it stabilizes the seasonal variation. A power transformation is appropriate, which is supported by the lambda of 0.27 returned by the Guerrero feature. In the graph below, the transformation helps even out the variance over time.
For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.
The Box-Cox transformation does not seem to improve on the original graph, so I would recommend not transforming the Tobacco production data.
The lambda of 2 normalized the seasonal variation for economy class passengers between Melbourne and Sydney, while maintaining the big dips in ridership in 1987, 1988, 1989, and 1992.
The variance seems normalized for the pedestrian data at Southern Cross Station, but it is extremely difficult to tell anything from a graph with this level of granularity.
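The lambdas for the three series can be obtained with a sketch like the one below. The filter values (`"Economy"`, `"MEL-SYD"`, `"Southern Cross Station"`) are assumptions based on how the exercise describes the series, and the object names are illustrative.

```r
library(fpp3)

# Select each series; Tobacco has missing values, which are dropped here.
tobacco <- aus_production %>% filter(!is.na(Tobacco))
economy <- ansett %>% filter(Class == "Economy", Airports == "MEL-SYD")
southern_cross <- pedestrian %>% filter(Sensor == "Southern Cross Station")

# Guerrero estimate of lambda for each series.
lambda_tobacco <- tobacco %>%
  features(Tobacco, features = guerrero) %>%
  pull(lambda_guerrero)
lambda_economy <- economy %>%
  features(Passengers, features = guerrero) %>%
  pull(lambda_guerrero)
lambda_pedestrian <- southern_cross %>%
  features(Count, features = guerrero) %>%
  pull(lambda_guerrero)

# Example: compare the transformed Tobacco series with the original plot.
tobacco %>%
  autoplot(box_cox(Tobacco, lambda_tobacco)) +
  labs(title = "Box-Cox transformed tobacco production")
```

The same `autoplot(box_cox(...))` call works for the other two series by swapping in the matching column and lambda.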
Consider the last five years of the Gas data from aus_production.
gas <- tail(aus_production, 5*4) %>% select(Gas)
Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
The graph above shows clear seasonal fluctuations and an upward trend in the last five years of Australian gas production.
Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
The components corroborate the observations made in Part A about the upward trend and the seasonal fluctuations. There is also little randomness in the data: the random component spans a range of only about 0.04.
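A minimal sketch of the multiplicative classical decomposition; `gas_dcmp` is an illustrative name.

```r
library(fpp3)

# Last five years of quarterly gas production, as defined in the exercise.
gas <- tail(aus_production, 5 * 4) %>% select(Gas)

# Multiplicative classical decomposition into trend, seasonal and random parts.
gas_dcmp <- gas %>%
  model(classical_decomposition(Gas, type = "multiplicative")) %>%
  components()

# One panel per component; the bars on the left help compare the scales.
gas_dcmp %>% autoplot()
```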
Compute and plot the seasonally adjusted data.
The seasonally adjusted plot above removes most of the seasonal fluctuation, which gives a clearer view of the trend in the data.
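The seasonally adjusted series is returned as the `season_adjust` column of the decomposition components, so it can be plotted as in this sketch. The object name `gas_adjusted` and the colours are arbitrary choices.

```r
library(fpp3)

gas <- tail(aus_production, 5 * 4) %>% select(Gas)

# Decompose, then convert the components back to a tsibble for plotting.
gas_adjusted <- gas %>%
  model(classical_decomposition(Gas, type = "multiplicative")) %>%
  components() %>%
  as_tsibble()

# Original series in gray, seasonally adjusted series in blue.
gas_adjusted %>%
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y = season_adjust), colour = "#0072B2") +
  labs(title = "Seasonally adjusted gas production", y = "Gas production")
```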
The outlier has a dramatic effect on the seasonally adjusted data. The benefit of seasonal adjustment is lost because the trend can no longer be made out. Because the outlier was inserted at the beginning of the data, it shows up at the beginning of the seasonally adjusted graph.
The two graphs above show that the outlier appears in the seasonally adjusted graph wherever it sits in the data. An outlier in the middle of the data shows up in the middle of the seasonally adjusted graph, and an outlier at the end shows up toward the end.
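The outlier experiment can be sketched with a small helper. `adjust_with_outlier()` is a hypothetical function written for this illustration, and the outlier size of 300 and the positions used are assumptions, not necessarily the values used for the graphs above.

```r
library(fpp3)

gas <- tail(aus_production, 5 * 4) %>% select(Gas)

# Hypothetical helper: add `size` to observation number `pos`,
# then redo the multiplicative classical decomposition.
adjust_with_outlier <- function(data, pos, size = 300) {
  data %>%
    mutate(Gas = ifelse(row_number() == pos, Gas + size, Gas)) %>%
    model(classical_decomposition(Gas, type = "multiplicative")) %>%
    components() %>%
    as_tsibble()
}

# Outlier at the start, in the middle, and at the end of the 20 quarters.
start_out <- adjust_with_outlier(gas, pos = 1)
mid_out <- adjust_with_outlier(gas, pos = 10)
end_out <- adjust_with_outlier(gas, pos = 20)

# Seasonally adjusted series with the outlier in the middle.
mid_out %>%
  autoplot(season_adjust) +
  labs(title = "Seasonally adjusted gas production, outlier in the middle")
```

Plotting `start_out` and `end_out` in the same way shows how the position of the spike moves with the position of the outlier.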
Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?
The X-11 decomposition of my retail data from Exercise 2.8 gives much more information in the irregular component. It shows a dip around the beginning of the 1990 recession and then a spike around the recovery in early 1991. Interestingly, there is no comparable dip during the Great Recession of 2008. The outliers in the early 1990s are the type of information that could help when analyzing employment data.
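An X-11 decomposition can be run as sketched below. The series selection follows the pattern suggested in Exercise 2.8 of the textbook, with an assumed seed, so `myseries` here is a stand-in and may not be the same series as the one analyzed above. The `seasonal` package must be installed for `X_13ARIMA_SEATS()` to work.

```r
library(fpp3)

# Assumption: pick a retail series as in Exercise 2.8; the seed is illustrative.
set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

# X-11 decomposition (needs the 'seasonal' package).
x11_dcmp <- myseries %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components()

# Trend, seasonal and irregular panels; outliers show up in the irregular panel.
autoplot(x11_dcmp) +
  labs(title = "X-11 decomposition of retail turnover")
```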
Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labor force in Australia each month from February 1978 to August 1995.
Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.
The value graph shows all three components added together, on a scale from about 7000 to 9000. The trend graph uses the same scale and shows a clear upward trend. The seasonal graph ranges from about -100 to 100, far narrower than the 7000 to 9000 range of the original series, so seasonality accounts for only a small share of the changes in the civilian labor force. Finally, the remainder graph shows the outliers, including the most interesting feature, the 1991/1992 recession. Its scale runs from about 100 down to -400, with the recession bottoming out near -400.
Is the recession of 1991/1992 visible in the estimated components?
The 1991/1992 recession is clearly visible in the remainder component in Figure 3.19, which shows a large dip and then a recovery around 1991/1992.