R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this: QUESTION 1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

# 1 -----------------------------------
country_gdp <- global_economy %>%
                tsibble(key = Code, index = Year)%>%
                autoplot(GDP/Population, show.legend =  FALSE) +
                labs(title= "GDP per capita",
                     y = "US Dollar") + 
                theme(plot.title = element_text(hjust = 0.5))


monaco_gdp <- global_economy %>%
  filter(Country == "Monaco") %>%
  autoplot(GDP/Population) +
  labs(title= "GDP per capita for Monaco", y = "US Dollar") +
  theme(plot.title = element_text(hjust = 0.5))

monaco_gdp
## Warning: Removed 11 row(s) containing missing values (geom_path).

Monaco has the highest GDP and seen in the second graph above, it has been increasing over time. Most notably, it has exponentially grown after 2000. ————————————— QUESTION 2 For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy. Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock. Victorian Electricity Demand from vic_elec. Gas production from aus_production.

# a
us_gdp <- global_economy %>% filter(Country == "United States") %>%
  autoplot(GDP/10^12) + labs(title = "US GDP", y = "USD (trillions)", x = "Year") +
  theme(plot.title = element_text(hjust = 0.5))

us_gdp

No transformations needed here. There seems to be a steady increase in GDP over time.

# b 
head(aus_livestock)
## # A tsibble: 6 x 4 [1M]
## # Key:       Animal, State [1]
##      Month Animal                     State                        Count
##      <mth> <fct>                      <fct>                        <dbl>
## 1 1976 Jul Bulls, bullocks and steers Australian Capital Territory  2300
## 2 1976 Aug Bulls, bullocks and steers Australian Capital Territory  2100
## 3 1976 Sep Bulls, bullocks and steers Australian Capital Territory  2100
## 4 1976 Oct Bulls, bullocks and steers Australian Capital Territory  1900
## 5 1976 Nov Bulls, bullocks and steers Australian Capital Territory  2100
## 6 1976 Dec Bulls, bullocks and steers Australian Capital Territory  1800
aus_liv <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Bulls, bullocks and steers") %>%
  autoplot(Count) + labs(title = "Slaughter of Victorian “Bulls, bullocks and steers”", y = "Count", x = "Year") +
  theme(plot.title = element_text(hjust = 0.5))
aus_liv  

Overall, there is a decrease in the slaughter count. However, there seems to be an uptick during the 90s, then a continued downtrend.

# c
head(vic_elec)
## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
##   Time                Demand Temperature Date       Holiday
##   <dttm>               <dbl>       <dbl> <date>     <lgl>  
## 1 2012-01-01 00:00:00  4383.        21.4 2012-01-01 TRUE   
## 2 2012-01-01 00:30:00  4263.        21.0 2012-01-01 TRUE   
## 3 2012-01-01 01:00:00  4049.        20.7 2012-01-01 TRUE   
## 4 2012-01-01 01:30:00  3878.        20.6 2012-01-01 TRUE   
## 5 2012-01-01 02:00:00  4036.        20.4 2012-01-01 TRUE   
## 6 2012-01-01 02:30:00  3866.        20.2 2012-01-01 TRUE
vic <- vic_elec %>%
  autoplot(Demand) + labs(title = "Victorian Electriity Demand", y = "Demand", x = "Year") +
  theme(plot.title = element_text(hjust = 0.5)) 
vic_log <- vic_elec
vic_log$Demand <- log(vic_log$Demand)

vic_log %>%
  autoplot(Demand) + labs(title = "Log of Victorian Electriity Demand", y = "Demand", x = "Year") +
  theme(plot.title = element_text(hjust = 0.5)) 

The original graphs showed that there was a decent cyclical trend with some random spikes. I attempted a log transformation to normalize those spikes, but it didn’t do much.

# d
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
##   Quarter  Beer Tobacco Bricks Cement Electricity   Gas
##     <qtr> <dbl>   <dbl>  <dbl>  <dbl>       <dbl> <dbl>
## 1 1956 Q1   284    5225    189    465        3923     5
## 2 1956 Q2   213    5178    204    532        4436     6
## 3 1956 Q3   227    5297    208    561        4806     7
## 4 1956 Q4   308    5681    197    570        4418     6
## 5 1957 Q1   262    5577    187    529        4339     5
## 6 1957 Q2   228    5651    214    604        4811     7
aus_gas <- aus_production %>%
  autoplot(Gas) +
labs(title = "Australian Gas Production", y = "Production", x = "Year") +
  theme(plot.title = element_text(hjust = 0.5)) 
aus_gas

No transformation was made as there is a clear upward trend. If anything, the variance in seasonality differs. ———————————- QUESTION 3 Why is a Box-Cox transformation unhelpful for the canadian_gas data?

# 3-------------------------------------
canadian_gas %>%
  autoplot(Volume) + labs(title = "Original Canadian Gas Production") +
  theme(plot.title = element_text(hjust = 0.5))

lambda <- canadian_gas %>%
  features(Volume, features = guerrero) %>%
    pull(lambda_guerrero)
  
canadian_gas %>%
  autoplot(box_cox(Volume, lambda)) + labs(title = "Transformed Canadian Gas Production") +
    theme(plot.title = element_text(hjust = 0.5))

The seasonal variance is not uniform after the Box Cox transformation. The decade between 1980 and 1990 has a higher variance than the previous years and lower variance around the 2000 mark. ————————————————– QUESTION 4 What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

# 4 ---------------------------------------------
set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
  
autoplot(myseries, Turnover)+
  labs(title = "AUS Retail", y = "$AUD (in millions)") +
  theme(plot.title = element_text(hjust = 0.5))

lambda_retail <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

myseries %>%
  autoplot(box_cox(Turnover, lambda_retail)) +
  labs(title = "AUS Retail Transformed", y = "$AUD (in millions)") +
  theme(plot.title = element_text(hjust = 0.5))

I would use the transformed data because it seemed to have smoothed out the variance across all years. It also allows us to see the drop right before 2000. ——————————– QUESTION 5 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

# 5.2
ansett.data <- ansett
eco_class <- ansett %>%
  filter(Airports == "MEL-SYD", Class == "Economy")

eco_class %>%
  autoplot(Passengers) + labs(title = "Original Econ Passengers MEL-SYD") +
  theme(plot.title = element_text(hjust = 0.5))

lambda5.2 <- eco_class %>%
  features(Passengers, features = guerrero) %>%
  pull(lambda_guerrero)
lambda5.2
## [1] 1.999927
eco_class %>%
  autoplot(box_cox(Passengers, lambda5.2)) + labs(title = "Transformed Econ Passengers MEL-SYD") +
  ylab("Passengers") +
  theme(plot.title = element_text(hjust = 0.5))

# 5.3
pedestrian_data <- pedestrian
  
scross_station <-pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  mutate(Week = yearweek(Date)) %>%
  index_by(Week) %>%
  summarise(Count = sum(Count))

scross_station %>%
  autoplot(Count) + labs(title = "Original Econ Passengers MEL-SYD") +
  theme(plot.title = element_text(hjust = 0.5))

lambda5.3 <- scross_station %>%
  features(Count, features = guerrero) %>%
  pull(lambda_guerrero)
lambda5.3
## [1] 0.7986839
scross_station %>%
  autoplot(box_cox(Count, lambda5.3)) + labs(title = "Transformed Econ Passengers MEL-SYD") +
  ylab("Count") +
  theme(plot.title = element_text(hjust = 0.5))

QUESTION 6 Show that a
3×5 MA is equivalent to a 7-term weighted moving average with weights of 0.067, 0.133, 0.200, 0.200, 0.200, 0.133, and 0.067.

QUESTION 6

QUESTION 7 A. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

# 7 -------------------------------------------
# a
aus_gas <- tail(aus_production, 5*4) %>%  dplyr::select("Quarter", "Gas")
aus_gas %>%
  autoplot(Gas) + labs(title = "Time Series of Australian Gas Production") +
  ylab("Bcf") +
  theme(plot.title = element_text(hjust = 0.5))

Yes, there seems to be an upwards trend.

B. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

# 7 -------------------------------------------
# b
gas_ts <- ts(aus_gas$Gas, frequency = 4) 
gas_decomp = decompose(gas_ts, type = "multiplicative")
plot(gas_decomp)

C. Do the results support the graphical interpretation from part a? Yes, they support the interpretation from part a.

# d
data("aus_production")
ts_gas = ts(aus_gas, frequency = 12, start = 1949)

decompose_gas = decompose(ts_gas, "multiplicative")
adjust_gas = ts_gas / decompose_gas$seasonal
plot(adjust_gas)

D. Compute and plot the seasonally adjusted data.

# d
data("aus_production")
ts_gas = ts(aus_gas, frequency = 12, start = 1949)

decompose_gas = decompose(ts_gas, "multiplicative")
adjust_gas = ts_gas / decompose_gas$seasonal
plot(adjust_gas)

E. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

# e
aus_gas2 <- aus_gas %>% mutate(Gas = if_else(Gas == 187, Gas + 300, Gas))
gas_ts2 <- ts(aus_gas2$Gas, frequency = 4)  
gas_decomp2 = decompose(gas_ts2, type = "multiplicative")
plot(gas_decomp2)

adjust_gas2 = gas_ts2 / gas_decomp2$seasonal
plot(adjust_gas2)

The outlier is present and results in a large spike, visibly seen in the graphs.

F. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

# f
aus_gas3 <- aus_gas %>% mutate(Gas = if_else(Quarter == yearquarter("2010Q2"), Gas + 300, Gas))
gas_ts3 <- ts(aus_gas3$Gas, frequency = 4)  
gas_decomp3 = decompose(gas_ts3, type = "multiplicative")
plot(gas_decomp3)

adjust_gas3 = gas_ts3 / gas_decomp3$seasonal
plot(adjust_gas3)

No, the outlier still shows a significant surge at the end, similar to the middle. It essentially has the same effect. ————————————————- QUESTION 8

QUESTION 9 a - The trend component of the decomposition shows that there is an increasing trend of employment. The seasonality chart indicates that there is no seasonal trend as there is not a consistent pattern of employment. The lack of seasonality proves that labor participation is not very correlated to time of the year.
b. There was a recession in Australia in 1991 which can be shown in the error graph and seen prominently in the seasonal graph for March, August, and November.

QUESTION 10 This exercise uses the canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 – February 2005).

A. Plot the data using autoplot(), gg_subseries() and gg_season() to look at the effect of the changing seasonality over time.1

# 10 --------------------------------------------

# a
canadian_gas %>% autoplot(Volume) + labs(title = "Canadian Gas Volume by Month") +
  theme(plot.title = element_text(hjust = 0.5))

canadian_gas %>% gg_subseries(Volume) + labs(title = "Canadian Gas Volume by Month") +
  theme(plot.title = element_text(hjust = 0.5))

canadian_gas %>% gg_season(Volume) + labs(title = "Canadian Gas Volume by Month") +
  theme(plot.title = element_text(hjust = 0.5))

B. Do an STL decomposition of the data. You will need to choose a seasonal window to allow for the changing shape of the seasonal component.

# b
stl_gas <- stl(canadian_gas, s.window = 13, robust = TRUE)
autoplot(stl_gas) + xlab("Month") + ggtitle("STL Decomposition of Canadian Gas Production")

stl_gas2 <- stl(canadian_gas, s.window = "periodic", robust = TRUE)
autoplot(stl_gas2) + xlab("Month") + ggtitle("STL Decomposition of Canadian Gas Production")

stl_gas3 <- stl(canadian_gas, s.window = 250, robust = TRUE)
autoplot(stl_gas3) + xlab("Month") + ggtitle("STL Decomposition of Canadian Gas Production")

C. How does the seasonal shape change over time? [Hint: Try plotting the seasonal component using gg_season().]

# c
canadian_gas %>% gg_season(Volume) + labs(title = "Canadian Gas Volume by Month") +
  theme(plot.title = element_text(hjust = 0.5))

Rightly so, it seems that gas volume decreases during the summer months because the weather is warmer and demand for gas heating drops.

D. Can you produce a plausible seasonally adjusted series?

# d
canadian_gas$Volume <- as.ts(canadian_gas$Volume)
autoplot(canadian_gas$Volume) + 
  autolayer(trendcycle(stl_gas), series="Trend") +
  autolayer(seasadj(stl_gas), series="Seasonally Adjusted") +
  xlab("Year") + ylab("New orders index") +
  ggtitle("STL Decomposition of Canadian Gas Production") +
  scale_colour_manual(values=c("gray","blue","red"),
                      breaks=c("Data","Seasonally Adjusted","Trend"))

E. Compare the results with those obtained using SEATS and X-11. How are they different?

# e
#canadian_gas %>%
 # model(x11 = X_13ARIMA_SEATS(Volume ~ x11())) %>%
  #components() %>%
  #autoplot()+
  #labs(title = "X-11 decomposition of Canadian Gas Production")
#view(canadian_gas)