Exercise 3.1

Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?.

A. Plot the GDP per capita for each country over time. How has this changed over time?

Looking at the below plot, we can say that there is an upward trend in GDP over time.

tsibble(global_economy, key = Code, index = Year) %>%
  autoplot(GDP/Population, show.legend = FALSE) +
  labs(title = 'GDP Per Capita For All Countries',
       y = 'US Dollars')

B. Which country has the highest GDP per capita?

The below plot shows us that the country with the highest GDP is Monaco. Monaco's GDP reached $185000 US dollars in 2014.

country_gdp <- global_economy %>%
  mutate('GDP_per_capita' = GDP / Population) %>%
  select(Country, Year, GDP_per_capita) %>%
  arrange(desc(GDP_per_capita)) %>%
  head(10)

country_with_highest_gdp <- tsibble(country_gdp, key = Country, index = Year)
autoplot(country_with_highest_gdp) +
  labs(title = 'Country with The Highest GDP',
       y ='US Dollars')

 

Exercise 3.2

For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

A. United States GDP from global_economy

tsibble(global_economy, key = Code, index = Year) %>%
  filter(Country == 'United States') %>%
  autoplot(GDP/Population) +
  labs(title = 'United States GDP', y = 'US Dollars') +
  geom_line(col = '#006400') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

B. Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock

aus_livestock %>%
  filter(Animal == 'Bulls, bullocks and steers', State == 'Victoria') %>%
  autoplot(Count) +
  labs(title = 'Slaughter of Victorian “Bulls, Bullocks and Steers"', y = 'Count') +
  geom_line(col = '#DC143C') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

C. Victorian Electricity Demand from vic_elec

vic_elec %>%
  autoplot(Demand) +
  labs(title = 'Victorian Electricity Demand', y = 'Demand') +
  geom_line(col = '#9932CC') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

D. Gas production from aus_production

aus_production %>%
  autoplot(Gas) +
  labs(title = 'Gas Production') +
  geom_line(col = '#FF8C00') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

 

Exercise 3.3

Why is a Box-Cox transformation unhelpful for the canadian_gas data?

Applying Box-Cox to the data does not appear to stablize the variance in the data - the results are very similiar to the original plot.

canadian_gas %>%
  autoplot(Volume) +
  labs(title = 'Canadian Gas Production', y = 'Volume', x = 'Month') +
  geom_line(col = '#1E90FF') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

canadian_gas_lambda <- canadian_gas %>% features(Volume, features = guerrero)

canadian_gas %>%
  autoplot(box_cox(Volume, canadian_gas_lambda$lambda_guerrero)) +
  labs(title = 'Canadian Gas Production Box-Cox Transformation', y = 'Volume', x = 'Month') +
  geom_line(col = '#DAA520') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

 

Exercise 3.4

What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

Using the boxcox() function's default method to find an optimal lambda, we get a lambda value of -0.02 that we use to transform the "Turnover" response variable in the "Box-Cox Transformation" plot below.

set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

myseries %>%
  autoplot(Turnover) +
  labs(title = 'Original Monthly Australian Retail Data', x = 'Month') +
  geom_line(col = '#2E8B57') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

lambda <- myseries %>% features(Turnover, features = guerrero)

myseries %>%
  autoplot(box_cox(Turnover, lambda$lambda_guerrero)) +
  labs(title = 'Monthly Australian Retail Data Box-Cox Transformation', y = 'Turnover', x = 'Month') +
  geom_line(col = '#FA8072') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

## [1] "The Lambda value for the above transformation is -0.02"

 

Exercise 3.5

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

A. Tobacco from aus_production

aus_production %>%
  autoplot(Tobacco) +
  labs(title = 'Original Tobacco Production Data', x = 'Quarter') +
  geom_line(col = '#8B4513') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

tobacco_lambda <- aus_production %>% features(Tobacco, features = guerrero)

aus_production %>%
  autoplot(box_cox(Tobacco, tobacco_lambda$lambda_guerrero)) +
  labs(title = ' Tobacco Production Data Box-Cox Transformation', y = 'Tobacco', x = 'Quarter') +
  geom_line(col = '#F4A460') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

## [1] "The Lambda value for the above transformation is 0.93"

 

B. Economy class passengers between Melbourne and Sydney from ansett

economy_passengers <- ansett %>%
  filter(Class == 'Economy', Airports == 'MEL-SYD')

economy_passengers %>%
  autoplot(Passengers) +
  labs(title = 'Original Economy Class Passengers Between Melbourne And Sydney Data', x = 'Week') +
  geom_line(col = '#EE82EE') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

passengers_lambda <- ansett %>% filter(Class == 'Economy', Airports == 'MEL-SYD') %>%
  features(Passengers, features = guerrero)

ansett %>% filter(Class == 'Economy', Airports == 'MEL-SYD') %>%
  autoplot(box_cox(Passengers, passengers_lambda$lambda_guerrero)) +
  labs(title = 'Economy Class Passengers Between Melbourne And Sydney Data Box-Cox Transformation', y = 'Passengers', x = 'Week') +
  geom_line(col = '#FF6347') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

## [1] "The Lambda value for the above transformation is 2"

 

C. Pedestrian counts at Southern Cross Station from pedestrian

pedestrian_counts <- pedestrian %>%
  filter(Sensor == 'Southern Cross Station')

pedestrian_counts %>%
  autoplot(Count) +
  labs(title = 'Original Pedestrian Counts At Southern Cross Station Data',  y = 'Count', x = 'Date') +
  geom_line(col = '#FFA07A') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

pedestrian_lambda <- pedestrian %>% filter(Sensor == 'Southern Cross Station') %>%
  features(Count, features = guerrero)

pedestrian %>% filter(Sensor == 'Southern Cross Station') %>%
  autoplot(box_cox(Count, pedestrian_lambda$lambda_guerrero)) +
  labs(title = 'Pedestrian Counts At Southern Cross Station Data Box-Cox Transformation', y = 'Count', x = 'Date') +
  geom_line(col = '#FF6347') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

## [1] "The Lambda value for the above transformation is -0.23"

 

Exercise 3.7

Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) %>% select(Gas)

3.7 A. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

The below plot shows us that there are seasonal fluctuations with an upward growth trend. Each season peaks around the middle of the year which may be indicative of summer months when people tend to take more road trips and thus the demand for gas increases.

gas %>%
  autoplot() +
  labs(title = 'Last Five Years of Gas Data') +
  geom_line(col = '#DAA520') +
  theme(panel.grid.major.x = element_line(size = 0.15, linetype = 'solid', color = '#808080'),
        panel.grid.minor.x = element_line(size = 0.15, linetype = 'solid', color = '#696969'),
        panel.grid.major.y = element_line(size = 0.15, linetype = 'solid', color = '#2F4F4F'))

3.7 B. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

gas %>%
  model(classical_decomposition(Gas, type = 'multiplicative')) %>%
  components() %>%
  autoplot() +
  labs(title = 'Last Five Years of Gas Data')

3.7 C. Do the results support the graphical interpretation from part a?

Yes - the results support the graphical interpretation from part a. The results capture the seasonailty displayed in part a, and an upward trend is also identified.

3.7 D. Compute and plot the seasonally adjusted data.

gas_seasonally_adjusted <- gas %>% model(classical_decomposition(Gas, type = 'multiplicative')) %>% components()

gas_seasonally_adjusted %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, color = 'Data')) +
  geom_line(aes(y = season_adjust, color = 'Seasonally Adjusted')) +
  geom_line(aes(y = trend, color = 'Trend')) +
  labs(title = 'Last Five Years of Gas Data Seasonally Adjusted')

3.7 E. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier.

Adding the outlier to the middle of the time series results in a large spike in the seasonally adjusted data. It also results in the upward trend in the seasonally adjusted data being diminished.

gas_outlier <- gas
gas_outlier$Gas[10] <-gas_outlier$Gas[10] + 300

gas_outlier %>% model(classical_decomposition(Gas, type = 'multiplicative')) %>% components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, color = 'Data')) +
  geom_line(aes(y = season_adjust, color = 'Seasonally Adjusted')) +
  geom_line(aes(y = trend, color = 'Trend')) +
  labs(title = 'Last Five Years of Gas Data Seasonally Adjusted With Added Outlier')

3.7 F. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

Adding the outlier near the end of the time series results in a spike at the end of the adjusted data. It also results in the upward trend being maintained, which is not the case when the outlier is added to the middle of the timeseries.

gas_outlier <- gas
gas_outlier$Gas[20] <-gas_outlier$Gas[10] + 300

gas_outlier %>% model(classical_decomposition(Gas, type = 'multiplicative')) %>% components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, color = 'Data')) +
  geom_line(aes(y = season_adjust, color = 'Seasonally Adjusted')) +
  geom_line(aes(y = trend, color = 'Trend')) +
  labs(title = 'Last Five Years of Gas Data Seasonally Adjusted With Added Outlier')

 

Exercise 3.8

Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

The upward trend in the X-11 decomposition data continues, but displays more troughs than the data in the Multiplicative decomposition. The seasonality in the X-11 decomposition also displays a steady downward trend.

Multiplicative Decomposition

myseries %>%
  model(classical_decomposition(Turnover, type = 'multiplicative')) %>%
  components() %>%
  autoplot() + 
  ggtitle('Monthly Australian Retail Data Multiplicative Decomposition')

X-11 Decomposition

myseries %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~  x11())) %>%
  components() %>%
  autoplot() + 
  ggtitle('Monthly Australian Retail Data X-11 Decomposition')

 

Exercise 3.9

Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

Figure 3.19

Figure 3.19

Figure 3.20

Figure 3.20

3.9 A. Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

The number of people in the civilian labour force in Australia has risen steadily over time as per the trend data in figure 3.19. The month to month breakdown of the seasonal data in figure 3.20 tells us that some months display greater variation velocities than other months. We can also identify a seasonal pattern in the "season_year" graph.

3.9 B. Is the recession of 1991/1992 visible in the estimated components?

Yes, a drop in employment during 1991/1992 can be seen in the remainder graph. This drop is not explained by the seasonality or trend data.