3.1 Consider the GDP information in global_economy.

Plot the GDP per capita for each country over time.

global_economy  %>%
  autoplot(GDP / Population, show.legend = FALSE) +
  labs(title= "GDP per capita from 1960–2017", 
       y = "USD ($)", x = "year")

## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).

Which country has the highest GDP per capita?

Monaco has the highest GDP per capita, at about $185,152.50 (USD) in 2014.

global_economy %>%
  mutate(GDP_per_capita = GDP / Population) %>%
  slice_max(GDP_per_capita, n = 1) %>%
  select(Country, GDP_per_capita)

How has this changed over time?

Monaco’s GDP per capita has trended steadily upward over time, but it dips around the early 2010s. A plausible explanation is the aftermath of the 2008 global financial crisis (the Great Recession) and the weaker economic conditions of that period.

global_economy %>%
  filter(Country == "Monaco") %>%
  autoplot(GDP / Population) +
  labs(title = "Monaco GDP per capita over time", 
       y = "USD ($)", x = "year")

## Warning: Removed 11 rows containing missing values or values outside the scale range
## (`geom_line()`).

3.2 For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy

global_economy %>%
  filter(Country == "United States") %>%
  autoplot(GDP / 10^9) +
  labs(title = "United States GDP over the years", 
       y = "USD (billions)", x = "year")

Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.

aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers",
         State == "Victoria") %>%
  mutate(Number_Slaughtered = Count) %>%
  autoplot(Number_Slaughtered) +
  labs(title = "Slaughter of Victorian Bulls, Bullocks and Steers")

Victorian Electricity Demand from vic_elec.

I aggregated the half-hourly data to monthly totals to visualize Victorian electricity demand by month. Monthly demand falls around autumn and spring and peaks around the new year (the Australian summer) and again mid-year (the Australian winter).

vic_dem <- vic_elec  %>%
  group_by(Date)  %>%
  mutate(Demand = sum(Demand))  %>%
  distinct(Date, Demand)
vic_dem

vic_dem  %>%
  mutate(Date = yearmonth(Date))  %>%
  group_by(Date)  %>%
  summarise(Demand = sum(Demand)) %>%
  as_tsibble(index = Date)  %>%
  autoplot(Demand) +
  labs(title = "Monthly Victorian Electricity Demand", 
       y = "Demand (MWh)", x = "month")
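
The daily-then-monthly aggregation above can probably be done in one step. This is a minimal sketch, assuming the fpp3 packages are installed and that the index of vic_elec is the Time column. The names vic_monthly and Month are my own choices for illustration.

```r
library(fpp3)

# Aggregate the half-hourly demand straight to monthly totals.
# `vic_monthly` and `Month` are names chosen for this sketch.
vic_monthly <- vic_elec %>%
  index_by(Month = yearmonth(Time)) %>%
  summarise(Demand = sum(Demand))

vic_monthly %>%
  autoplot(Demand) +
  labs(title = "Monthly Victorian Electricity Demand",
       y = "Demand (MWh)", x = "month")
```

index_by() groups by a function of the time index, so the result stays a tsibble and no as_tsibble() call is needed.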

Gas production from aus_production.

I applied a log transformation to the Gas series from aus_production. This stabilises the variance, so the seasonal fluctuations in gas production are roughly constant in size across the whole series.

aus_gas <- aus_production %>%
  select(Quarter, Gas)

autoplot(aus_gas) +
  labs(title = "Australian Gas Production", 
       y = "Petajoules",
       x = "quarter")

## Plot variable not specified, automatically selected `.vars = Gas`

autoplot(aus_gas, log(Gas)) +
  labs(title = "Log Australian Gas Production", 
       y = "log(Petajoules)")
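
To see why the log helps, here is a small base R sketch on made-up numbers (not the gas data). It assumes the seasonal swing is proportional to the level, which is the situation where a log transformation should work best.

```r
# Synthetic quarterly series (made-up numbers): the seasonal swing is
# proportional to the level, i.e. multiplicative seasonality.
n_years  <- 10
level    <- rep(seq(10, 100, length.out = n_years), each = 4)
seasonal <- rep(c(0.8, 1.0, 1.3, 0.9), times = n_years)
y        <- level * seasonal

# Within-year range (max - min) on the original and on the log scale
year      <- rep(seq_len(n_years), each = 4)
range_raw <- tapply(y, year, function(v) max(v) - min(v))
range_log <- tapply(log(y), year, function(v) max(v) - min(v))

range_raw[c(1, n_years)]  # grows with the level
range_log[c(1, n_years)]  # constant, up to rounding
```

On the original scale the range in the last year is ten times the range in the first year, while on the log scale it is the same in every year.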

3.3 Why is a Box-Cox transformation unhelpful for the canadian_gas data?

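A Box-Cox transformation is unhelpful for the canadian_gas data because, as shown in the graph, there is no exponential growth and the seasonal variation does not change consistently with the level of the series: it is small in the early years, larger in the middle of the series, and smaller again toward the end. A Box-Cox transformation can only stabilise variation that increases or decreases steadily with the level, so here it would compress the seasonality in some periods without making it uniform.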

canadian_gas %>%
  autoplot() +
  labs(title = "Canadian Gas Production", 
       y = "Volume", x = "Year")

## Plot variable not specified, automatically selected `.vars = Volume`
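
One rough way to check this is to compare the yearly level with the within-year spread. This is only a sketch, assuming the fpp3 packages are installed. The names yearly, Year, level and spread are my own, and the within-year range is a crude proxy for seasonal variation because it also picks up any trend within the year.

```r
library(fpp3)

# Yearly level and within-year spread of Volume.
# `yearly`, `Year`, `level` and `spread` are names chosen for this sketch.
yearly <- canadian_gas %>%
  index_by(Year = year(Month)) %>%
  summarise(level  = mean(Volume),
            spread = max(Volume) - min(Volume))

# If a Box-Cox transformation were suitable, spread would rise (or fall)
# steadily with level; a rise-then-fall pattern suggests otherwise.
ggplot(yearly, aes(x = level, y = spread)) +
  geom_point() +
  labs(x = "Mean yearly Volume", y = "Within-year range of Volume")
```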

3.4 What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?

A Box-Cox transformation with a lambda of 0.015 is selected for my retail data.

set.seed(1266)


myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

autoplot(myseries, Turnover) +
  labs(title = "Retail Turnover", 
       y = "$AUD (millions)",
       x = "month")

lambda <- 0.015

myseries  %>%
  autoplot(box_cox(Turnover, lambda)) +
  labs(title = "Box-Cox Transformed Retail Turnover (lambda = 0.015)",
       y = "")
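
A lambda this close to zero behaves almost like a log transformation. The base R sketch below illustrates this with the usual Box-Cox formula for positive values. The helper box_cox_manual and the turnover numbers are made up for illustration and are not part of fpp3.

```r
# Hypothetical helper following the usual Box-Cox definition:
#   log(y)                   if lambda == 0
#   (y^lambda - 1) / lambda  otherwise (positive y assumed)
box_cox_manual <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

turnover <- c(50, 100, 200, 400)  # made-up values

box_cox_manual(turnover, 0)      # plain log
box_cox_manual(turnover, 0.015)  # within about 5% of the log here
box_cox_manual(turnover, 1)      # just a shift: y - 1
```

With lambda = 1 the shape of the series is unchanged, so values of lambda near 0 and near 1 mark the two ends of "log-like" and "no transformation needed".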

3.5 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

For Tobacco in aus_production, the guerrero feature gives a lambda of 0.9264636.

data_tobac <- aus_production %>%
  select(Quarter, Tobacco)

data_tobac %>%
  features(Tobacco, features = guerrero)

For economy class passengers between Melbourne and Sydney in ansett, the guerrero feature gives a lambda of 1.999927.

data_econ_class <- ansett %>%
  filter(Airports == "MEL-SYD", Class == "Economy")

data_econ_class %>%
  features(Passengers, features = guerrero)

For pedestrian counts at Southern Cross Station from pedestrian, aggregated to daily totals, the guerrero feature gives a lambda of 0.2726316.

data_ped <- pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  index_by(Date = date(Date_Time)) %>%
  summarise(Count = sum(Count, na.rm = TRUE))

data_ped %>%
  features(Count, features = guerrero)
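
Once a lambda has been chosen, it can be pulled out as a number and passed to box_cox() to check the transformed series visually. This is a minimal sketch for the Tobacco series, assuming the fpp3 packages are installed and that the feature column is named lambda_guerrero. The name lambda_tobacco is my own.

```r
library(fpp3)

# Pull the Guerrero lambda out as a plain number.
# `lambda_tobacco` is a name chosen for this sketch.
lambda_tobacco <- aus_production %>%
  features(Tobacco, features = guerrero) %>%
  pull(lambda_guerrero)

# Plot the transformed series (missing Tobacco values are dropped with a warning)
aus_production %>%
  autoplot(box_cox(Tobacco, lambda_tobacco)) +
  labs(title = "Box-Cox Transformed Tobacco Production",
       y = "")
```

The same pattern should work for the ansett and pedestrian series by swapping in data_econ_class or data_ped and the matching column.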

3.7 Consider the last five years of the Gas data from aus_production.

Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

The data shows a consistent seasonal pattern, with values that drop in Q1 and rise in Q3. From 2006 to 2009, both the lows and the highs trend slightly upward, as the graph shows.

gas <- tail(aus_production, 5*4) %>% select(Gas)

gas %>%
  autoplot(Gas)

Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices. Do the results support the graphical interpretation from part a? Compute and plot the seasonally adjusted data.

The classical decomposition supports the graphical interpretation from part a: the trend-cycle rises gradually, and the seasonal component is lowest in Q1 and highest in Q3.

gas %>%
  model(decomp = classical_decomposition(Gas, type = "multiplicative")) %>%
  components() %>%
  autoplot()

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
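
For the seasonally adjusted data, one option is to take the season_adjust column from the components and overlay it on the original series. This is a sketch, assuming the fpp3 packages are installed. The name gas_components is my own.

```r
library(fpp3)

gas <- tail(aus_production, 5 * 4) %>% select(Gas)

# `gas_components` is a name chosen for this sketch
gas_components <- gas %>%
  model(decomp = classical_decomposition(Gas, type = "multiplicative")) %>%
  components()

# Original series in gray, seasonally adjusted series in blue
gas_components %>%
  as_tsibble() %>%
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y = season_adjust), colour = "#0072B2") +
  labs(title = "Gas Production: Original (gray) and Seasonally Adjusted (blue)",
       y = "Petajoules")
```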

Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier? Does it make any difference if the outlier is near the end rather than in the middle of the time series?

The outlier produces a large spike in the middle of the seasonally adjusted series. I believe it does make a difference whether the outlier is near the end or in the middle of the time series, because of how the trend is estimated. Near the end, the outlier should have less effect on the trend, since the centred moving average is not computed for the last few observations and so fewer trend values include it. In the middle, it enters several moving-average windows, so classical decomposition may fit it in an undesirable way.

gas %>%
  mutate(Gas = if_else(row_number() == 10, Gas + 300, Gas)) %>%
  model(decomp = classical_decomposition(Gas, type = "multiplicative")) %>%
  components() %>%
  autoplot(season_adjust) +
  labs(title = "Seasonally Adjusted",
       y = "")
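
To make the middle-versus-end comparison concrete, here is a base R sketch on a made-up 20-quarter series (not the gas data). It assumes the trend is a centred 2x4 moving average, the usual choice for quarterly data in classical decomposition, and it only looks at the trend estimate; the seasonal indices are affected by the outlier as well. The helpers trend_2x4 and add_outlier are hypothetical names written for this sketch.

```r
# Centred 2x4 moving average (hypothetical helper for this sketch).
# The first two and last two values come out as NA.
trend_2x4 <- function(y) {
  w <- c(0.5, 1, 1, 1, 0.5) / 4
  as.numeric(stats::filter(y, w, sides = 2))
}

# Hypothetical helper: add 300 to the observation at position `pos`
add_outlier <- function(y, pos) {
  y[pos] <- y[pos] + 300
  y
}

# Made-up 20-quarter series standing in for the gas data
y <- 200 + 2 * (1:20) + rep(c(-20, 0, 25, -5), times = 5)

base_trend   <- trend_2x4(y)
trend_middle <- trend_2x4(add_outlier(y, 10))
trend_end    <- trend_2x4(add_outlier(y, 20))

# How many trend estimates does each outlier move?
changed_middle <- sum(abs(trend_middle - base_trend) > 1e-8, na.rm = TRUE)
changed_end    <- sum(abs(trend_end - base_trend) > 1e-8, na.rm = TRUE)
c(middle = changed_middle, end = changed_end)
```

The outlier in the middle moves five trend estimates, while the outlier in the last quarter moves only one, and with the smallest weight in the window.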

3.8 Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

There seems to be an irregularity in the trend around January 2010, but otherwise the decomposition looks mostly regular.

myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

x11_dcmp <- myseries  %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components()


x11_dcmp %>% autoplot()

3.9 Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

The decomposition shows a steady upward trend in the Australian civilian labor force from 1978 to 1995, rising from around 6,500 to 9,000 on the plotted scale (presumably thousands of persons, given the size of the Australian labor force). The seasonal component shows consistent within-year patterns, with dips in January, but the seasonal fluctuations are small relative to the overall trend. The remainder component captures irregular fluctuations, with noticeable irregularities around 1991-1992. Overall, the labor force has stable seasonal cycles.

Is the recession of 1991/1992 visible in the estimated components?

Yes, the 1991/1992 recession is visible in the estimated components, mainly as the irregularities in the remainder component around 1991-1992.

DATA624

Jackie Yee

2026-02-15
