3.1

Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

Answer:

Monaco has the highest GPD per capita, and it had increased with a small dip from time to time.

library(fpp3)
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble      3.2.1     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.1
## ✔ lubridate   1.9.4     ✔ fable       0.4.1
## ✔ ggplot2     3.5.1
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
global_economy |>
  autoplot(GDP / Population, show.legend =  FALSE) 
## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).

global_economy |>
  mutate(GDP_per_capita = GDP / Population) |>
  filter(GDP_per_capita == max(GDP_per_capita, na.rm = TRUE)) |>
  select(Country, GDP_per_capita)
global_economy |>
  filter(Country == "Monaco") |>
  autoplot(GDP/Population)
## Warning: Removed 11 rows containing missing values or values outside the scale range
## (`geom_line()`).

3.2

For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

global_economy |>
  filter(Country == "United States") |>
  autoplot(GDP / 10^12) +
  labs(y="trillions")

aus_livestock |>
  filter(Animal == "Bulls, bullocks and steers") |>
  filter(State == "Victoria") |>
  autoplot(Count)

vic_data <- vic_elec |>
  group_by(Date) |>
  mutate(Demand = sum(Demand)) |>
  distinct(Date, Demand)

vic_data |>
  as_tsibble(index = Date) |>
  autoplot(Demand)

aus_production |>
  autoplot(Gas)

3.3

Why is a Box-Cox transformation unhelpful for the canadian_gas data?

Answer:

It’s unhelpful for the canadian_gas data because the data is already stable in term of variance before applying the box-cox tranformation, so there is not much different before or after the transformation.

canadian_gas |>
  autoplot(Volume) 

lambda <- canadian_gas |>
  features(Volume, features = guerrero) |>
  pull(lambda_guerrero)

canadian_gas |>
  autoplot(box_cox(Volume, lambda)) +
    labs(y = "", title = paste0("at lambda = ",round(lambda,2)))

3.4

What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?

Answer:

I would select lambda = 0.08 since it helps stabilize the seasonal variation.

set.seed(12345678) 
seed <- aus_retail |>   
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

autoplot(seed, Turnover)

lambda <- seed |>
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)

seed |>
  autoplot(box_cox(Turnover, lambda)) +
  labs(y = "", title = paste0("at lambda = ",round(lambda,2)))

3.5

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

Answer:

  • For aus_production, the box-cox transformation didn’t seem to do much changes, so it is not very effective here, lambda being close to 1 as well.

  • For the ansett, lambda is 2, so the variance should be better.

  • The lambda is negative, so applying the box cox transformation seems unreliable.

autoplot(aus_production, Tobacco)+
  labs(title = "Tobacco and Cigarette")
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

lambda <- aus_production |>
  features(Tobacco, features = guerrero) |>
  pull(lambda_guerrero)

aus_production |>
  autoplot(box_cox(Tobacco, lambda)) +
  labs(y = "", title = paste0("at lambda = ",round(lambda,2)))
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

ansett_data <- ansett |>
  filter(Class == "Economy", Airports == "MEL-SYD")

autoplot(ansett_data, Passengers)+
  labs(title = "Economy class")

lambda <- ansett_data |>
  features(Passengers, features = guerrero) |>
  pull(lambda_guerrero)

ansett_data |>
  autoplot(box_cox(Passengers, lambda)) +
  labs(y = "", title = paste0("at lambda = ",round(lambda,2)))

pede_data <- pedestrian |>
  filter(Sensor == "Southern Cross Station") 

autoplot(pede_data, Count)+
  labs(title = "Hourly Pedestrian")

lambda <- pede_data |>
  features(Count, features = guerrero) |>
  pull(lambda_guerrero)

pede_data |>
  autoplot(box_cox(Count, lambda)) +
  labs(y = "", title = paste0("at lambda = ",round(lambda,2)))

3.7

Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) |> select(Gas)
  1. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

  2. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

  3. Do the results support the graphical interpretation from part a?

  4. Compute and plot the seasonally adjusted data.

  5. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

  6. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

    Answer

    • For question 1, an increasing trend that increases at the beginning of the year, then decreases at the middle to end of the year.

    • For question 3, the results do support the graphical interpretation from part a, because the trend is increasing, and seasonality has the up and down similar to the first graph.

    • For question 5, there was a change in the overall data when applying the change to the outlier, the flow seems to have change, from up to down.

    • I don’t see much difference, seems to be showing the overall trend.

gas <- tail(aus_production, 5*4) |> select(Gas)

autoplot(gas, Gas)

clas_decom <- gas |>
  model(classical_decomposition(Gas, type = "multiplicative")) 

components(clas_decom) |>
  autoplot() 
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

components(clas_decom) |>
  as_tsibble() |>
  autoplot(Gas) +
  geom_line(aes(y=season_adjust))

gas |>
  mutate(Gas = ifelse(Quarter == yearquarter("2008Q1"), Gas + 300, Gas)) |>
  model(classical_decomposition(Gas, type = "multiplicative")) |>
  components() |> #extracting decomposed components(trend, seasonal, remainder)
  as_tsibble() |>
  autoplot(Gas) +
  geom_line(aes(y=season_adjust))

gas |>
  mutate(Gas = ifelse(Quarter == yearquarter("2010Q1"), Gas + 300, Gas)) |>
  model(classical_decomposition(Gas, type = "multiplicative")) |>
  components() |>
  as_tsibble() |>
  autoplot(Gas) +
  geom_line(aes(y=season_adjust))

3.8

Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

Answer:

This doesn’t seem to reveal much outlier, it also captures the irregular, which is interesting to see more noises in the graph.

library(seasonal)
## 
## Attaching package: 'seasonal'
## The following object is masked from 'package:tibble':
## 
##     view
x_11 <- seed |>
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
  components()

autoplot(x_11) 

3.9

  1. Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

  2. Is the recession of 1991/1992 visible in the estimated components?

    Answer:

    1. The trend is increasing consistently in the number of the civilian labor force in Australia. The seasonal change overtime are consistent, but seems to have wider increase and decrease as year goes. The remainder seems to have a large dip around 1992, going to negative 400. That probably mean there were probably something big like the recession.
    2. It is quite visible in the remainder, is drop drastically around 1991/1992.