Assignment 2

3.1

Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

global_economy %>% autoplot(GDP/Population, show.legend =  FALSE)

## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).

highest_GDP_country <- global_economy %>%
  mutate(GDPC = GDP / Population) %>%
  filter(GDPC == max(GDPC, na.rm = TRUE)) %>%
  pull(Country)
highest_GDP_country

## [1] Monaco
## 263 Levels: Afghanistan Albania Algeria American Samoa Andorra ... Zimbabwe

global_economy |> 
  filter(Country == "Monaco")|>
  autoplot(GDP / Population)

## Warning: Removed 11 rows containing missing values or values outside the scale range
## (`geom_line()`).

Monaco has the highest GDP per capita in the data, peaking at $185,152.53 in 2014. Its GDP per capita has grown strongly over the period shown. During periods of global economic distress the series appears to hold up and then resume growing, which suggests a stable and resilient economy.
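The filter above only finds the single all-time maximum. To look at how the leading country changes over time, one option is to pick the top country in each year. The following is a minimal sketch, assuming the fpp3 package is installed; the names top_by_year and GDPC are only illustrative.

```r
library(fpp3)

# For each year, keep the country with the largest GDP per capita.
top_by_year <- global_economy |>
  as_tibble() |>
  mutate(GDPC = GDP / Population) |>
  filter(!is.na(GDPC)) |>
  group_by(Year) |>
  slice_max(GDPC, n = 1, with_ties = FALSE) |>
  ungroup() |>
  select(Year, Country, GDPC)

# Number of years each country spent in first place.
top_by_year |>
  count(Country, sort = TRUE)
```

Years in which every country has a missing GDP or Population value drop out of the result because of the filter on GDPC.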

3.2

For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy. Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock. Victorian Electricity Demand from vic_elec. Gas production from aus_production.

global_economy |> 
  filter(Code == "USA")|>
  autoplot(GDP )

global_economy |> 
  filter(Code == "USA")|>
  autoplot(GDP / Population )

In this case I chose to transform the data to GDP per capita by dividing GDP by total population. This transformation gives a clearer picture of the US economy relative to the size of its population.

aus_livestock|>
  filter(Animal == "Bulls, bullocks and steers", State =="Victoria")|>
  autoplot()

## Plot variable not specified, automatically selected `.vars = Count`

vic_elec  %>%  autoplot(Demand)

aus_production  %>% autoplot(Gas)

lambda <- aus_production |>
  features(Gas, features = guerrero) |>
  pull(lambda_guerrero)
aus_production |>
  autoplot(box_cox(Gas, lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed gas production with $\\lambda$ = ",
         round(lambda,2))))
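For reference, the transformation applied by box_cox() above is, as I understand it from the fpp3 text, the modified Box-Cox family below, where y_t is the original observation, w_t the transformed one, and lambda the parameter estimated by the guerrero feature. This is stated as background, not as part of the assignment output.

```latex
w_t =
\begin{cases}
  \log(y_t) & \text{if } \lambda = 0, \\[4pt]
  \dfrac{\operatorname{sign}(y_t)\,|y_t|^{\lambda} - 1}{\lambda} & \text{otherwise.}
\end{cases}
```

A value of lambda = 1 leaves the shape of the series unchanged (it only shifts it down by one), while values closer to 0 compress the larger observations more strongly, approaching a log transformation.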

3.3

Why is a Box-Cox transformation unhelpful for the canadian_gas data?

autoplot(canadian_gas)

## Plot variable not specified, automatically selected `.vars = Volume`

lamb_can_gas <- canadian_gas |>
  features(Volume, features = guerrero)  |>
  pull(lambda_guerrero)

canadian_gas |>
  autoplot(box_cox(Volume,lamb_can_gas))+
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed gas production with $\\lambda$ = ",
         round(lamb_can_gas,2))))

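At this scale the Box-Cox-transformed series looks much like a log transformation, so let's try a log:

canadian_gas |>
  autoplot(log(Volume))+
  labs(y = "Log Gas volume",
       title = 
         "Transformed gas production with log")

Neither transformation evens out the seasonal variation. In canadian_gas the size of the seasonal swings is not proportional to the level of the series: it is largest in the middle of the sample and smaller both at the low early levels and at the high later levels. A Box-Cox transformation can only stabilize variance that changes monotonically with the level, so it is unhelpful here.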

3.4

What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?

set.seed(1399118)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
  

myseries |>
  autoplot(Turnover)

my_lambda <- myseries |>
  features(Turnover, features = guerrero)  |>
  pull(lambda_guerrero)

myseries |>
  autoplot(box_cox(Turnover, my_lambda)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed Turnover with $\\lambda$ = ",
         round(my_lambda, 2))))

I would select the Box-Cox transformation with the value of lambda chosen by the Guerrero method (my_lambda above). Applying it makes the size of the seasonal variation more even across the series.

3.5

For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

aus_production |>
  select(Tobacco)|>
  autoplot()

## Plot variable not specified, automatically selected `.vars = Tobacco`

## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

lambda_tobacco <- aus_production |>
  features(Tobacco, features = guerrero) |>
  pull(lambda_guerrero)

aus_production |>
  autoplot(box_cox(Tobacco, lambda_tobacco)) +
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed Tobacco with $\\lambda$ = ",
         round(lambda_tobacco, 2))))

## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).

ansett2 <- ansett |>
  filter(Class == "Economy", Airports == "MEL-SYD")

ansett2 |> autoplot(Passengers)

ansett_lambda<- ansett2|>
  features(Passengers, guerrero)|>
  pull(lambda_guerrero)
  
ansett2|>
  autoplot(box_cox(Passengers, ansett_lambda))+
  labs(y = "",
       title = latex2exp::TeX(paste0(
         "Transformed Passengers with $\\lambda$ = ",
         round(ansett_lambda,2))))

scs_ped<- pedestrian |>
  filter(Sensor == "Southern Cross Station") 
scs_ped|>autoplot(Count)
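The pedestrian series is only plotted above, without a Box-Cox transformation. A sketch of the same Guerrero workflow used for Tobacco and Passengers might look like the following, assuming the fpp3 package is installed; ped_lambda is an illustrative name. Hourly counts can be zero, and if the estimated lambda is negative those zeros cannot be transformed, so ggplot2 may warn about removed rows.

```r
library(fpp3)

scs_ped <- pedestrian |>
  filter(Sensor == "Southern Cross Station")

# Estimate lambda with the Guerrero method, as for the other series.
ped_lambda <- scs_ped |>
  features(Count, features = guerrero) |>
  pull(lambda_guerrero)

# Plot the transformed hourly counts.
scs_ped |>
  autoplot(box_cox(Count, ped_lambda)) +
  labs(y = "",
       title = paste("Transformed pedestrian count with lambda =",
                     round(ped_lambda, 2)))
```

Because the hourly data are very noisy, aggregating to daily or weekly totals before estimating lambda is another option worth comparing.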

3.7

Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4)

gas |> autoplot(Gas)

Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

Gas production shows a clear seasonal pattern with period m = 4: the peaks and troughs fall in the same quarter every year, with peaks in Q3 and troughs in Q1. There is also an upward trend over the five years.

Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

gas|>
  model(
    classical_decomposition(Gas, type = "multiplicative")
  ) |>
  components() |>
  autoplot() +
  labs(title = "Classical Multiplicative Decomposition of AUS Gas Production, 2005 Q3 to 2010 Q2",
       y = "Gas")

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

Do the results support the graphical interpretation from part a?

Yes. The seasonal component of the decomposition shows the same quarterly pattern described in part a, with a peak in Q3 and a trough in Q1.

Compute and plot the seasonally adjusted data. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

dcmp_gas <- gas |>
  model(stl = STL(Gas))
components(dcmp_gas)

## # A dable: 20 x 7 [1Q]
## # Key:     .model [1]
## # :        Gas = trend + season_year + remainder
##    .model Quarter   Gas trend season_year remainder season_adjust
##    <chr>    <qtr> <dbl> <dbl>       <dbl>     <dbl>         <dbl>
##  1 stl    2005 Q3   221  193.        26.9     0.856          194.
##  2 stl    2005 Q4   180  197.       -16.7    -0.109          197.
##  3 stl    2006 Q1   171  200.       -25.7    -3.59           197.
##  4 stl    2006 Q2   224  204.        15.4     4.86           209.
##  5 stl    2006 Q3   233  207.        27.0    -1.03           206.
##  6 stl    2006 Q4   192  210.       -16.7    -1.33           209.
##  7 stl    2007 Q1   187  213.       -25.5    -0.550          213.
##  8 stl    2007 Q2   234  216.        15.1     2.60           219.
##  9 stl    2007 Q3   245  219.        27.0    -0.730          218.
## 10 stl    2007 Q4   205  219.       -16.6     2.55           222.
## 11 stl    2008 Q1   194  219.       -25.3     0.562          219.
## 12 stl    2008 Q2   229  219.        14.8    -4.59           214.
## 13 stl    2008 Q3   249  219.        27.0     2.98           222.
## 14 stl    2008 Q4   203  220.       -16.6    -0.834          220.
## 15 stl    2009 Q1   196  222.       -25.0    -0.740          221.
## 16 stl    2009 Q2   238  223.        14.5     0.341          223.
## 17 stl    2009 Q3   252  225.        27.1    -0.132          225.
## 18 stl    2009 Q4   210  226.       -16.5     0.986          227.
## 19 stl    2010 Q1   205  226.       -24.8     4.25           230.
## 20 stl    2010 Q2   236  225.        14.2    -3.62           222.

# Add 300 to the 2007 Q3 observation (245 + 300 = 545) to create an outlier
gas_modified <- gas |>
  mutate(Gas = replace(Gas, Quarter == yearquarter("2007 Q3"), 545))

dcmp_gas <- gas_modified |>
  model(stl = STL(Gas))
components(dcmp_gas) |>
  as_tsibble() |>
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y = season_adjust), colour = "#0072B2") +
  labs(y = "Volume",
       title = "Gas Produced in AUS, 2005 Q3 to 2010 Q2 (outlier in 2007 Q3)")

Does it make any difference if the outlier is near the end rather than in the middle of the time series?

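# The series ends in 2010 Q2, so the outlier goes on the last observation
# (236 + 300 = 536); "2010 Q3" is not in the data and would change nothing.
gas_modified <- gas |>
  mutate(Gas = replace(Gas, Quarter == yearquarter("2010 Q2"), 536))
dcmp_gas <- gas_modified |>
  model(stl = STL(Gas))
components(dcmp_gas) |>
  as_tsibble() |>
  autoplot(Gas, colour = "gray") +
  geom_line(aes(y = season_adjust), colour = "#0072B2") +
  labs(y = "Volume",
       title = "Gas Produced in AUS, 2005 Q3 to 2010 Q2 (outlier in 2010 Q2)")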

Comparing the two plots, the position of the outlier in the series clearly matters for the seasonally adjusted data.
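To put numbers on the comparison, the sketch below measures how much each outlier shifts the seasonally adjusted series. It uses only base R and retypes the 20 Gas values from the components() printout above. As an assumption for illustration, it uses stats::stl() with a periodic seasonal window, which is not the same as the default settings of STL() from fpp3, so the figures will differ from the plots. The helper names seasonally_adjust and add_outlier are hypothetical.

```r
# Gas values copied from the components() printout (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)
gas_ts <- ts(gas_values, start = c(2005, 3), frequency = 4)

# Seasonally adjusted series: data minus the STL seasonal component.
# Assumption: a periodic seasonal window, unlike the fpp3 STL() default.
seasonally_adjust <- function(x) {
  fit <- stl(x, s.window = "periodic")
  x - fit$time.series[, "seasonal"]
}

# Add `size` to the observation at index `position`.
add_outlier <- function(x, position, size = 300) {
  x[position] <- x[position] + size
  x
}

baseline <- seasonally_adjust(gas_ts)
middle   <- seasonally_adjust(add_outlier(gas_ts, 9))   # 2007 Q3
end      <- seasonally_adjust(add_outlier(gas_ts, 20))  # 2010 Q2

# Change in the seasonally adjusted series caused by each outlier.
effect <- cbind(middle = middle - baseline, end = end - baseline)
round(effect, 1)
```

Rows of effect far from the outlier show how much of the spike leaks into the seasonal component for the same quarter in other years, and the row at the outlier itself shows how much of the 300 remains in the adjusted series.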

3.8

Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

x11_dcmp <- myseries |>
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
  components()
autoplot(x11_dcmp) +
  labs(title =
    "Decomposition of Turnover using X-11.")

x11_dcmp |>
  gg_subseries(seasonal) +
  labs(title = "X11 Seasonal Component")

3.9

Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation. Is the recession of 1991/1992 visible in the estimated components?

https://otexts.com/fpp3/fpp_files/figure-html/labour-1.png

https://otexts.com/fpp3/fpp_files/figure-html/labour2-1.png

The original series trends upward overall, with fluctuations around that trend. The trend component captures this steady rise but is smooth enough that the recession periods do not stand out in it. The seasonal component does not indicate any recession either. The remainder component, however, shows a sharp downturn in 1991/92, which is where the recession is visible.

Darwhin Gomez

2024-09-14
