Q1: Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

Libraries:

library(fpp3)
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.2 ──
## ✔ tibble      3.2.1     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.2
## ✔ lubridate   1.9.4     ✔ fable       0.5.0
## ✔ ggplot2     3.5.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
library(dplyr)
library(ggplot2)
library(seasonal)
## 
## Attaching package: 'seasonal'
## The following object is masked from 'package:tibble':
## 
##     view

Plot of GDP per capita for each country:

gdp_pc <- global_economy %>%
  mutate(GDP_per_Capita = GDP / Population)

gdp_pc %>%
  autoplot(GDP_per_Capita, alpha = 0.3) +
  labs(
    title = "GDP per Capita for Each Country (Global Economy)",
    x = "Year",
    y = "GDP per Capita (US$)"
  ) +
  guides(colour = "none")  # turn off legend for clarity
## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).

Country with the highest GDP per capita, with trends for the top 10 countries:

# Get most recent year
latest_year <- max(gdp_pc$Year, na.rm = TRUE)

# Select top 10 countries based on most recent year
top10_countries <- gdp_pc %>%
  filter(Year == latest_year) %>%
  arrange(desc(GDP_per_Capita)) %>%
  slice(1:10) %>%
  pull(Country)

# Filter full dataset for those top 10 countries
top10_gdp_trend <- gdp_pc %>%
  filter(Country %in% top10_countries)

# Line plot
ggplot(top10_gdp_trend, 
       aes(x = Year, 
           y = GDP_per_Capita, 
           color = Country, 
           group = Country)) +
  geom_line(linewidth = 1) +
  geom_point() +
  labs(
    title = paste0("GDP per Capita Trends for Top 10 Countries (as of ", latest_year, ")"),
    x = "Year",
    y = "GDP per Capita (US$)",
    color = "Country"
  ) +
  theme_minimal()
## Warning: Removed 32 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 42 rows containing missing values or values outside the scale range
## (`geom_point()`).

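Comment: Among the countries with data for the most recent year, Luxembourg has the highest GDP per capita, with particularly high values around 2010. This ranking uses a single year, so countries with missing values in that year are excluded, and it does not by itself show how the leading country has changed over time.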
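To address the "how has this changed over time" part of the question, one option is to find the leading country in every year rather than only in the latest one. The following is a minimal sketch, not the original analysis; the name top_by_year is introduced here for illustration only.

```r
library(fpp3)

# For each year, keep the country with the largest GDP per capita.
# Rows with missing GDP or Population are dropped first.
top_by_year <- global_economy %>%
  as_tibble() %>%
  mutate(GDP_per_Capita = GDP / Population) %>%
  filter(!is.na(GDP_per_Capita)) %>%
  group_by(Year) %>%
  slice_max(GDP_per_Capita, n = 1) %>%
  ungroup() %>%
  select(Year, Country, GDP_per_Capita)

# Number of years each country spends in first place
top_by_year %>%
  count(Country, sort = TRUE)
```

Counting the years per country gives a quick view of whether one country dominates or whether the lead changes hands.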

Q2: For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy.

Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.

Victorian Electricity Demand from vic_elec.

Gas production from aus_production.

US GDP:

us_gdp <- global_economy %>%
  filter(Country == "United States")

us_gdp %>%
  autoplot(GDP) +
  labs(title = "United States GDP",
       y = "GDP (US$)")

GDP shows strong upward, roughly exponential growth. A log transformation converts exponential growth into roughly linear growth and stabilizes the variance, which makes long-term growth rates easier to interpret.

Log transformation:

us_gdp %>%
  autoplot(log(GDP)) +
  labs(title = "Log transformation of United States GDP",
       y = "log(GDP)")
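The effect of the log transformation can be illustrated on a small synthetic series. This sketch uses hypothetical numbers (a start value of 100 and 3% growth per period), not the US GDP data: the step size of the raw series keeps growing, while the step size of the logged series is constant.

```r
# Hypothetical series: 3% growth per period from a starting value of 100
growth_rate <- 0.03
x <- 100 * (1 + growth_rate)^(0:49)

# Period-to-period changes of the raw series grow with the level
raw_steps <- diff(x)

# Period-to-period changes of the logged series are constant,
# each equal to log(1 + growth_rate)
log_steps <- diff(log(x))

range(raw_steps)
range(log_steps)
```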

Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.

vic_bulls <- aus_livestock %>%
  filter(State == "Victoria",
         Animal == "Bulls, bullocks and steers")

# Original
vic_bulls %>%
  autoplot(Count) +
  labs(title = "Victorian Slaughter: Bulls, Bullocks and Steers",
       y = "Number Slaughtered")

Comment: The livestock data exhibits substantial fluctuations together with an overall decreasing long-term trend. Although the series shows an increase between 1990 and 2000, this upward movement appears to be temporary rather than sustained. After this period, the general direction of the series declines, indicating a long-run downward trend.

Victorian Electricity Demand from vic_elec.

# Original
vic_elec %>%
  autoplot(Demand) +
  labs(title = "Victorian Electricity Demand",
       y = "Demand (MW)")

Comment: The half-hourly series is very dense, which makes the seasonal patterns in electricity demand hard to read. Aggregating the half-hourly observations into daily averages smooths out the high-frequency fluctuations and makes the underlying trend and broader seasonal movements more visible.
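The daily aggregation described in the comment could be carried out along the following lines. This is a sketch under the assumption that a simple daily mean is the intended summary; the name vic_elec_daily is introduced here for illustration.

```r
library(fpp3)

# Re-index the half-hourly data by calendar day and average the demand
vic_elec_daily <- vic_elec %>%
  index_by(Date = as_date(Time)) %>%
  summarise(Demand = mean(Demand))

# Daily average demand
vic_elec_daily %>%
  autoplot(Demand) +
  labs(title = "Victorian Electricity Demand (Daily Average)",
       y = "Mean half-hourly demand")
```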

Gas production from aus_production.

# Original
aus_production %>%
  autoplot(Gas) +
  labs(title = "Australian Gas Production",
       y = "Gas Production")

Comment: The plot indicates a significant increase beginning around 1980, after which the series exhibits a clear upward trend. This suggests that growth accelerated during this period and continued to rise steadily in subsequent years.

# Log transformation
aus_production %>%
  autoplot(log(Gas)) +
  labs(title = "Australian Gas Production (Log Scale)",
       y = "log(Gas)")

Q3: Why is a Box-Cox transformation unhelpful for the canadian_gas data?

canadian_gas %>%
  autoplot(Volume) +
  labs(title = "Canadian Gas Production",
       y = "Volume",
       x = "Year")

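Comment: A Box–Cox transformation is unhelpful for the canadian_gas data because the size of the seasonal variation does not change steadily with the level of the series. The seasonal swings are small at the start, larger in the middle of the series, and smaller again towards the end, even though the level keeps rising. A Box–Cox transformation can only stabilize variance that increases or decreases with the level, so no single value of λ makes the seasonal variation uniform here.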
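One rough way to check how the seasonal variation relates to the level is to compare the within-year range of the series with its yearly mean. This is only a sketch: the within-year range also picks up some trend, and the names yearly_spread, Mean and Spread are introduced here for illustration.

```r
library(fpp3)

# Yearly mean (level) and within-year range (rough seasonal spread)
yearly_spread <- canadian_gas %>%
  index_by(Year = year(Month)) %>%
  summarise(
    Mean = mean(Volume),
    Spread = max(Volume) - min(Volume)
  )

# Spread over time: a Box-Cox transformation only helps when the
# spread rises or falls steadily as the level changes
yearly_spread %>%
  autoplot(Spread) +
  labs(title = "Within-year range of Canadian gas production",
       y = "Range of Volume")
```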

Q4: What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?

set.seed(123)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1)) 
myseries %>% autoplot(Turnover) +
  labs(title = "Retail Data Turnover",
       y = "$AUD (Millions)")

lambda <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

myseries %>% autoplot(box_cox(Turnover, lambda))+
  labs(title = paste("Transformed Retail Turnover with \u03BB =", round(lambda, 2)))

Comment: From the plot, the seasonal fluctuations appear more uniform after transformation. The Box–Cox transformation, particularly when λ = 0 (equivalent to the natural logarithm), is useful when the data exhibits exponential growth. By selecting an appropriate value of λ, the transformation stabilizes the variance and makes the series more suitable for modeling, thereby simplifying the forecasting process.
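For reference, the transformation itself is simple to write down. The function below is a minimal base R sketch for strictly positive data, not the box_cox() implementation used above; box_cox_sketch is a hypothetical name.

```r
# Box-Cox transformation for strictly positive values:
#   log(y)                   when lambda is 0
#   (y^lambda - 1) / lambda  otherwise
box_cox_sketch <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (abs(lambda) < 1e-8) {
    log(y)
  } else {
    (y^lambda - 1) / lambda
  }
}

y <- c(1, 10, 100, 1000)

box_cox_sketch(y, 0)    # same as log(y)
box_cox_sketch(y, 1)    # same as y - 1, so the shape is unchanged
box_cox_sketch(y, 0.5)  # between the two: compresses large values
```

Values of λ close to 1 leave the shape of the series almost unchanged, while values close to 0 behave like a log transformation.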

Q5: For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

data("pedestrian")
data("ansett")
aus_production %>% autoplot(Tobacco) +
  labs(title = "Original Tobacco Production", y = "Production in tonnes")
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

lambda <- aus_production %>%
  features(Tobacco, features = guerrero)%>%
  pull(lambda_guerrero)

aus_production %>% autoplot(box_cox(Tobacco,lambda)) +
  labs(title = paste("Transformed Tobacco Production with \u03BB =", round(lambda, 2)))
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

Comment: For the Tobacco data, the estimated Box–Cox parameter (λ = 0.93) is very close to 1, indicating that little to no transformation was necessary. This suggests that the original data already has relatively stable variance, and applying the transformation results in only a minimal change to the series.
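The same approach can be applied to the other two series named in the question. This is a sketch that mirrors the Tobacco code; the object names are introduced here for illustration, and the resulting λ values are not reported because they were not part of the original output.

```r
library(fpp3)

# Economy class passengers between Melbourne and Sydney
economy_mel_syd <- ansett %>%
  filter(Class == "Economy", Airports == "MEL-SYD")

lambda_ansett <- economy_mel_syd %>%
  features(Passengers, features = guerrero) %>%
  pull(lambda_guerrero)

economy_mel_syd %>%
  autoplot(box_cox(Passengers, lambda_ansett)) +
  labs(title = paste("Transformed Economy Passengers with \u03BB =",
                     round(lambda_ansett, 2)))

# Pedestrian counts at Southern Cross Station
southern_cross <- pedestrian %>%
  filter(Sensor == "Southern Cross Station")

lambda_ped <- southern_cross %>%
  features(Count, features = guerrero) %>%
  pull(lambda_guerrero)

southern_cross %>%
  autoplot(box_cox(Count, lambda_ped)) +
  labs(title = paste("Transformed Pedestrian Counts with \u03BB =",
                     round(lambda_ped, 2)))
```

Both series contain very small or zero values in places, so the transformed plots should be checked for warnings about non-finite values.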

Q7: Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) |> select(Gas)
head(gas)
## # A tsibble: 6 x 2 [1Q]
##     Gas Quarter
##   <dbl>   <qtr>
## 1   221 2005 Q3
## 2   180 2005 Q4
## 3   171 2006 Q1
## 4   224 2006 Q2
## 5   233 2006 Q3
## 6   192 2006 Q4

a. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

gas %>%
  autoplot(Gas)

b. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

  decomp <- gas %>%
    model(classical_decomposition(Gas, type = "multiplicative")) %>%
    components() 
  
  decomp %>%
    autoplot() +
    labs(title = "Classical multiplicative decomposition of gas production")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

c. Do the results support the graphical interpretation from part a?

Yes, the results support the graphical interpretation from part (a). The trend plot displays an overall upward movement from left to right, with a relatively stable period in the middle of the series. Additionally, the seasonal indices indicate a consistent seasonal pattern, with little variation in seasonal magnitude over the 2006–2010 period. This suggests that the seasonal effects remain stable across those years.

d. Compute and plot the seasonally adjusted data.

 decomp %>%
    ggplot(aes(x = Quarter)) +
    geom_line(aes(y = Gas, colour = "Data")) +
    geom_line(aes(y = season_adjust,
                  colour = "Seasonally Adjusted")) +
    geom_line(aes(y = trend, colour = "Trend")) +
    labs(y = "Gas",
         title = "Seasonally Adjusted Gas Production") +
    scale_colour_manual(
      values = c("gray", "green", "black"),
      breaks = c("Data", "Seasonally Adjusted", "Trend")
    )
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).

Comment: The data indicates a clear upward trend in gas production over time.
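The steps behind a classical multiplicative decomposition can be sketched in base R. The example uses synthetic quarterly data (a constant level of 100 and made-up seasonal indices), not the gas series, so that the result is easy to verify by eye: the seasonally adjusted series should recover the level of 100.

```r
# Synthetic quarterly series (hypothetical values): constant level of 100
# multiplied by a repeating seasonal pattern whose indices average to 1
seasonal_pattern <- c(0.9, 1.1, 1.2, 0.8)
x <- 100 * rep(seasonal_pattern, times = 5)
quarter <- rep(1:4, times = 5)

# Trend-cycle: centred 2x4 moving average
# (NA for the first two and last two quarters)
trend <- as.numeric(stats::filter(x, c(0.5, 1, 1, 1, 0.5) / 4, sides = 2))

# Detrend by division, then average the ratios quarter by quarter
detrended <- x / trend
indices <- tapply(detrended, quarter, mean, na.rm = TRUE)
indices <- indices / mean(indices)  # normalise so the indices average to 1

# Seasonally adjusted series: data divided by the seasonal index
season_adjust <- as.numeric(x / indices[quarter])

round(as.numeric(indices), 3)
round(season_adjust, 3)
```

The missing trend values at both ends are the reason the decomposition plots above report rows removed by geom_line().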

e. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

gas %>%
  # 249 is the 2008 Q3 observation; selecting by value assumes it is unique
  mutate(Gas = ifelse(Gas == 249, Gas + 300, Gas)) %>%
  model(classical_decomposition(Gas, type = "multiplicative")) %>%
  components() %>%
  as_tsibble() %>%
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y=season_adjust), colour = "blue") +
  labs(title = "Seasonally Adjusted Gas Production with an Outlier")

Comment: After adding 300 to 2008 Q3, that quarter becomes a clear outlier in the series. This produces a noticeable spike in both the original data and the seasonally adjusted series. However, the increase appears less pronounced in the seasonally adjusted data, as the seasonal component has been removed. The sudden jump also disrupts the underlying trend, causing a temporary distortion in the overall pattern of the series.

f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

gas %>%
  mutate(Gas = ifelse(Gas == 236, Gas + 300, Gas)) %>%
  model(classical_decomposition(Gas, type = "multiplicative")) %>%
  components() %>%
  as_tsibble() %>%
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y=season_adjust), colour = "blue") +
  labs(title = "Seasonally Adjusted Gas Production with an Outlier at the End")

Comment: It appears that the position of the outlier, whether near the middle or at the end of the series, does not make a significant difference. In both cases, there is a pronounced spike at the point of the outlier, and the underlying trend becomes less noticeable.

Q8: Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

x11_dcmp <- myseries %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components()
autoplot(x11_dcmp) +
  labs(title = "Decomposition of Australian retail turnover using X-11.")

Comment: Yes, the seasonal variation reverses over time. Early in the series, the seasonal plot shows spikes corresponding to higher turnover, whereas later on, the spikes correspond to lower turnover values. The irregular component highlights a few outliers in the data. Overall, the trend plot does not reveal any unusual patterns or anomalies.

Q9. Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

a) Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

b) Is the recession of 1991/1992 visible in the estimated components?

Answer Q9: The series represents the monthly number of persons in the civilian labor force in Australia from February 1978 to August 1995. The seasonal component, extracted from the decomposition, is shown in the previous figure, highlighting the recurring monthly patterns in the labor force over the years.

a) Figure 3.19 shows that the trend is positive and increasing over time. The seasonal component shows three peaks per year in the civilian labor force, around March, September and December. The remainder shows little noise until the end of the 1980s; larger deviations appear from then through to around 1993, with the trough around the end of 1990 and into 1991.

b) Is the recession of 1991/1992 visible in the estimated components?

Looking at Figure 3.20, the seasonal component from the decomposition, we can see a sharp decrease during the March to August period of the early 1990s. This matches Figure 3.19, the overview of the STL decomposition, where the remainder component shows a large decrease during 1990 to 1991, which supports the case for a recession.