Excercise 3.1

Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

3.1 How data looks like

kable(head(global_economy))
Country Code Year GDP Growth CPI Imports Exports Population
Afghanistan AFG 1960 537777811 NA NA 7.024793 4.132233 8996351
Afghanistan AFG 1961 548888896 NA NA 8.097166 4.453443 9166764
Afghanistan AFG 1962 546666678 NA NA 9.349593 4.878051 9345868
Afghanistan AFG 1963 751111191 NA NA 16.863910 9.171601 9533954
Afghanistan AFG 1964 800000044 NA NA 18.055555 8.888893 9731361
Afghanistan AFG 1965 1006666638 NA NA 21.412803 11.258279 9938414
kable(tail(global_economy))
Country Code Year GDP Growth CPI Imports Exports Population
Zimbabwe ZWE 2012 17114849900 16.6654288 107.3283 48.99928 25.16325 14710826
Zimbabwe ZWE 2013 19091020000 1.9894928 109.0795 36.66874 21.98776 15054506
Zimbabwe ZWE 2014 19495519600 2.3769293 108.8472 33.74147 20.93015 15411675
Zimbabwe ZWE 2015 19963120600 1.7798727 106.2245 37.58864 19.16018 15777451
Zimbabwe ZWE 2016 20548678100 0.7558693 104.5606 31.27549 19.94353 16150362
Zimbabwe ZWE 2017 22040902300 4.7040354 105.5118 30.37027 19.65802 16529904

3.1 Autoplot all GDP per capita

global_economy %>%
  #R studio takes the GDP and pop from the global_economy dataset and divides them 
  autoplot(GDP/Population,show.legend = FALSE) +
  labs(x = "Year", y = "GDP Per Capita")

3.1 Getting the GDP per capita

global_economy <- global_economy %>%
  mutate(GDP_per_Capita = GDP / Population)

3.1 Highest GDP per capita in data set

Monaco in 2014 has the highest GDP per capita, followed by Liechtenstein in 2014.

global_economy_sorted <- global_economy %>%
  arrange(desc(GDP_per_Capita))

kable(head(global_economy_sorted))
Country Code Year GDP Growth CPI Imports Exports Population GDP_per_Capita
Monaco MCO 2014 7060236168 7.1796368 NA NA NA 38132 185152.5
Monaco MCO 2008 6476490406 0.7318007 NA NA NA 35853 180640.1
Liechtenstein LIE 2014 6657170923 NA NA NA NA 37127 179308.1
Liechtenstein LIE 2013 6391735894 NA NA NA NA 36834 173528.2
Monaco MCO 2013 6553372278 9.5707988 NA NA NA 37971 172588.9
Monaco MCO 2016 6468252212 3.2138488 NA NA NA 38499 168010.9

3.1 GDP over time

The data suggests that Monaco, Liechtenstein, and Luxembourg were primarily the top GDP per capita in 2012-2016.

highest_gdp_2012 <- global_economy_sorted %>%
  filter(Year == 2012) %>%          
  arrange(desc(GDP_per_Capita))  %>%           
  slice(1:5)                        

highest_gdp_2013 <- global_economy_sorted %>%
  filter(Year == 2013) %>%           
  arrange(desc(GDP_per_Capita))  %>%           
  slice(1:5)                          

highest_gdp_2014 <- global_economy_sorted %>%
  filter(Year == 2014) %>%          
  arrange(desc(GDP_per_Capita))  %>%           
  slice(1:5)                          

highest_gdp_2015 <- global_economy_sorted %>%
  filter(Year == 2015) %>%           
  arrange(desc(GDP_per_Capita))  %>%           
  slice(1:5)                        

highest_gdp_2016 <- global_economy_sorted %>%
  filter(Year == 2016) %>%          
  arrange(desc(GDP_per_Capita))  %>%           
  slice(1:5)  
kable(highest_gdp_2012)
Country Code Year GDP Growth CPI Imports Exports Population GDP_per_Capita
Monaco MCO 2012 5743029680 0.9849603 NA NA NA 37783 152000.36
Liechtenstein LIE 2012 5456009385 NA NA NA NA 36545 149295.65
Luxembourg LUX 2012 56677961787 -0.3525194 106.1643 155.41771 186.44431 530946 106749.01
Norway NOR 2012 510229136227 2.7216268 101.9908 27.54178 40.57434 5018573 101668.17
Qatar QAT 2012 186833516484 4.6872592 103.4801 29.27387 76.47241 2109568 88564.82
kable(highest_gdp_2013)
Country Code Year GDP Growth CPI Imports Exports Population GDP_per_Capita
Liechtenstein LIE 2013 6391735894 NA NA NA NA 36834 173528.15
Monaco MCO 2013 6553372278 9.570799 NA NA NA 37971 172588.88
Luxembourg LUX 2013 61739352212 3.654370 108.0053 158.61336 190.62858 543360 113625.13
Norway NOR 2013 523502127660 1.044389 104.1535 28.35444 39.13876 5079623 103059.25
Macao SAR, China MAC 2013 51552075902 11.200190 118.4497 30.40292 90.63795 575841 89524.84
kable(highest_gdp_2014)
Country Code Year GDP Growth CPI Imports Exports Population GDP_per_Capita
Monaco MCO 2014 7060236168 7.179637 NA NA NA 38132 185152.53
Liechtenstein LIE 2014 6657170923 NA NA NA NA 37127 179308.08
Luxembourg LUX 2014 66327344189 5.771916 108.6841 174.06108 208.23038 556319 119225.38
Norway NOR 2014 499338534779 1.975117 106.2800 29.78328 38.78287 5137232 97199.92
Macao SAR, China MAC 2014 55347998648 -1.201113 125.6103 31.62667 84.93632 588781 94004.39
kable(highest_gdp_2015)
Country Code Year GDP Growth CPI Imports Exports Population GDP_per_Capita
Liechtenstein LIE 2015 6268391521 NA NA NA NA 37403 167590.61
Monaco MCO 2015 6258178995 4.942330 NA NA NA 38307 163369.07
Luxembourg LUX 2015 57784495265 2.861675 109.20011 187.46843 222.70321 569604 101446.79
Switzerland CHE 2015 679289166858 1.226384 98.17176 50.62988 62.13828 8282396 82016.02
Isle of Man IMN 2015 6792417112 -0.900000 NA NA NA 83167 81672.02
kable(highest_gdp_2016)
Country Code Year GDP Growth CPI Imports Exports Population GDP_per_Capita
Monaco MCO 2016 6468252212 3.213849 NA NA NA 38499 168010.91
Liechtenstein LIE 2016 6214633651 NA NA NA NA 37666 164993.19
Luxembourg LUX 2016 58631324559 3.082643 109.5177 186.1633 221.26778 582014 100738.68
Switzerland CHE 2016 668745279605 1.375884 97.7451 54.5889 65.81131 8373338 79866.03
Isle of Man IMN 2016 6592627599 7.400000 NA NA NA 83737 78730.16

Excercise 3.2

For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

  • United States GDP from global_economy.
  • Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.
  • Victorian Electricity Demand from vic_elec.
  • Gas production from aus_production.

3.2 United States GDP from global_economy

United_States_GDP <- global_economy %>%
  filter(Country == 'United States') %>% 
  arrange(desc(GDP_per_Capita)) 


plot <- autoplot(United_States_GDP) +
  ggtitle("United States GDP") +            
  xlab("Year") +        
  ylab("GDP") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14)  
  )

print(plot)

3.2 Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.

bulls_bullocks_steers <- aus_livestock %>%
  filter(Animal == 'Bulls, bullocks and steers', State == "Victoria") 

plot <- autoplot(bulls_bullocks_steers) +
  ggtitle("Bulls, Bullocks,and Steers in Victoria") +            
  xlab("Year") +        
  ylab("Count") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14), 
    legend.title = element_text(size = 10),     # Size of legend title
    legend.text = element_text(size = 8),       # Size of legend text
    legend.key.size = unit(0.5, "cm")           # Size of legend keys
  )

print(plot)

3.2 Victorian Electricity Demand from vic_elec

vic_elec_monthly <- vic_elec %>%
  mutate(month = floor_date(Time, "month")) %>%  
  group_by(month) %>%                           
  summarize(Demand = sum(Demand, na.rm = TRUE)) 


view(vic_elec_monthly)

plot <- autoplot(vic_elec_monthly) +
  ggtitle("Victorian Electricity Demand") +            
  xlab("Year") +        
  ylab("Count") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14), 
    legend.position = "none"  # Remove the legend
  )

print(plot)

3.2 Gas production from aus_production

# Extract gas production data
gas_production <- aus_production %>%
  select(Quarter, Gas)

plot <- autoplot(gas_production) +
  ggtitle("Gas Prodiction") +            
  xlab("Quarter") +        
  ylab("Count") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14), 
    legend.position = "none"  # Remove the legend
  )

print(plot)

Excercise 3.3

Why is a Box-Cox transformation unhelpful for the canadian_gas data?

head(canadian_gas)
## # A tsibble: 6 x 2 [1M]
##      Month Volume
##      <mth>  <dbl>
## 1 1960 Jan   1.43
## 2 1960 Feb   1.31
## 3 1960 Mar   1.40
## 4 1960 Apr   1.17
## 5 1960 May   1.12
## 6 1960 Jun   1.01
plot <- autoplot(canadian_gas) +
  ggtitle("Gas Prodiction") +            
  xlab("Month") +        
  ylab("Volume") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14), 
    legend.position = "none"  # Remove the legend
  )

print(plot)

#?features()
?box_cox
lambda_gas <- canadian_gas %>%
  features(Volume,features = guerrero) %>%
  pull(lambda_guerrero)
  
lambda_gas
## [1] 0.5767648
#base_plot <- autoplot(box_cox(canadian_gas$Volume, lambda_gas))

#plot <- base_plot +
#  ggtitle("Gas Production") + 
##  xlab("Month") +
#  ylab("Volume") +
#  theme_minimal() +
#  theme(
#    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
#    axis.title.x = element_text(face = "bold", size = 14), 
#    axis.title.y = element_text(face = "bold", size = 14), 
#    legend.position = "none"  # Remove the legend
#  )

#print(plot)

The analysis can be unhelpful because the data from this period almost look identical, particularly between 1970 and 1980. During this time, there is a noticeable positive trend accompanied by a significant and swollen event, which can make it difficult to distinguish between different underlying patterns or trends.

Excercise 3.4

What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?

To effectively use the Box-Cox transformation, start by thoroughly understanding the nature of your data, including its trends, seasonality, and variance. Estimate the optimal lambda (\(\lambda\)) value using methods such as Guerrero’s, and then apply the Box-Cox transformation with this lambda to stabilize variance and improve normality. Finally, visualize the transformed data to assess the effectiveness of the transformation. It is essential to validate the lambda value and ensure that the transformation meets the needs of your data in practice.

Excercise 3.5

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. - - Tobacco from aus_production, - Economy class passengers between Melbourne and Sydney from ansett - Pedestrian counts at Southern Cross Station from pedestrian.

3.5 Tobacco

tobacco_ts <- ts(aus_production$Tobacco, frequency = 12)

lambda <- BoxCox.lambda(tobacco_ts)

transformed_tobacco <- BoxCox(tobacco_ts, lambda)
plot <- autoplot(transformed_tobacco) +
  ggtitle("Quarterly Tobacco Prodiction") +            
  xlab("Year and Quarters") +        
  ylab("Tobacco") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14), 
    legend.position = "none"  # Remove the legend
  )

print(plot)

3.5 Economy Class Passengers

ansett %>%
  filter(Airports == "MEL-SYD" & Class == "Economy") %>%
  autoplot(Passengers) + labs(title = "Passengers on Ansett Flights")

# Come back and center title. Bold, and increase size of titles.
lambda1 <- ansett %>%
  filter(Airports == "MEL-SYD" & Class == "Economy") %>%
  features(Passengers,features = guerrero) %>%
  pull(lambda_guerrero)

ansett %>%
  filter(Airports == "MEL-SYD" & Class == "Economy") %>%
  autoplot(box_cox(Passengers,lambda1)) + labs(title = "BoxCox of Passengers")

Excercise 3.7

Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) |> select(Gas)

kable(gas)
Gas Quarter
221 2005 Q3
180 2005 Q4
171 2006 Q1
224 2006 Q2
233 2006 Q3
192 2006 Q4
187 2007 Q1
234 2007 Q2
245 2007 Q3
205 2007 Q4
194 2008 Q1
229 2008 Q2
249 2008 Q3
203 2008 Q4
196 2009 Q1
238 2009 Q2
252 2009 Q3
210 2009 Q4
205 2010 Q1
236 2010 Q2

3.7 A. Plot the time series.

plot <- autoplot(gas) +
  ggtitle("5 years of Gas production") +            
  xlab("Quarters") +        
  ylab("Gas Production") +        
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5), 
    axis.title.x = element_text(face = "bold", size = 14), 
    axis.title.y = element_text(face = "bold", size = 14), 
    legend.position = "none"  # Remove the legend
  )

print(plot)

3.7 A. Can you identify seasonal fluctuations and/or a trend-cycle?

Yes definitly we can spot the highs for the middle of the year and the lows for at the begging. Positive trend starting the year the negative trend toward the end of the year.

3.7 B. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.

?classical_decomposition()
decompostion_gas <- gas %>%
  model(
  classical_decomposition(Gas,type = "multiplicative")) %>%
  components() 


decompostion_gas %>%  
  autoplot() + labs(title = "Classical Decomposition of Gas")

3.7 C. Do the results support the graphical interpretation from part a?

Yes the results seem to follow the original data.

3.7 D. Compute and plot the seasonally adjusted data.

decompostion_gas %>%
  ggplot(aes(x = Quarter)) + 
  geom_line(aes(y = Gas,color = "Data")) +
  geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend,color = "Trend")) + 
  labs(title = "Quarterly production of Gas") +
   scale_colour_manual(
    values = c("#999999", "green", "red"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )

3.7 E. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?

gas_addition <- gas
gas_addition$Gas[1] <- gas_addition$Gas[1] + 300
kable(gas_addition)
Gas Quarter
521 2005 Q3
180 2005 Q4
171 2006 Q1
224 2006 Q2
233 2006 Q3
192 2006 Q4
187 2007 Q1
234 2007 Q2
245 2007 Q3
205 2007 Q4
194 2008 Q1
229 2008 Q2
249 2008 Q3
203 2008 Q4
196 2009 Q1
238 2009 Q2
252 2009 Q3
210 2009 Q4
205 2010 Q1
236 2010 Q2

Adding 300 to a single observation in the data significantly skews its normal distribution. Most of the added values do not reach the peak of 300; they only go up to around 250. Therefore, this addition makes a substantial difference in the data.

3.7 F. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

In both cases, it would be incorrect, but it would cause a significant increase in the positive trend shown on the chart. Initially, it would create a positive trend at the beginning, but by the end, it would result in a negative trend.

Excercise 3.8

Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

?x11
aus_production
## # A tsibble: 218 x 7 [1Q]
##    Quarter  Beer Tobacco Bricks Cement Electricity   Gas
##      <qtr> <dbl>   <dbl>  <dbl>  <dbl>       <dbl> <dbl>
##  1 1956 Q1   284    5225    189    465        3923     5
##  2 1956 Q2   213    5178    204    532        4436     6
##  3 1956 Q3   227    5297    208    561        4806     7
##  4 1956 Q4   308    5681    197    570        4418     6
##  5 1957 Q1   262    5577    187    529        4339     5
##  6 1957 Q2   228    5651    214    604        4811     7
##  7 1957 Q3   236    5317    227    603        5259     7
##  8 1957 Q4   320    6152    222    582        4735     6
##  9 1958 Q1   272    5758    199    554        4608     5
## 10 1958 Q2   233    5641    229    620        5196     7
## # ℹ 208 more rows
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries %>%
  autoplot() + labs(title = "Trade turnover")

myseries %>%
  gg_season() +labs(title = "Trade turnover")

Excercise 3.9

Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

3.9 A. Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

The line graph of the values shows an increase from the 1980s to the mid-1990s. There appears to be a small disturbance around 1992, where the value line dips slightly. After this point, the value clearly drops below a count of 400. The seasonal pattern remains consistent with no notable deviations. The trend line indicates a positive overall trend.

3.9 B. Is the recession of 1991/1992 visible in the estimated components?

Yes, the recession is clearly visible in the graph. The small disturbance is evident during the 1991/1992 period.