DATA624

Exercise 3.1

Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

View(global_economy)
dim(global_economy)

## [1] 15150     9

sum(is.na(global_economy))

## [1] 23678

max(global_economy$GDP)

## [1] NA

max(global_economy$Population)

## [1] NA
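
Both calls return NA because max() propagates missing values, and the columns contain some. A minimal base-R sketch of that behaviour, using a made-up vector rather than the real columns (the name x and its values are assumptions for illustration only):

```r
# Toy vector standing in for a column with missing values (hypothetical data)
x <- c(3.2, NA, 7.5, 1.1)

max(x)                 # NA: a single missing value makes the result NA
max(x, na.rm = TRUE)   # 7.5: missing values are dropped before taking the max
```

The same na.rm = TRUE argument could be passed when calling max() on the GDP or Population columns.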

global_economy %>%
  mutate(GDP_per_capita = GDP / Population) %>%
  autoplot(GDP_per_capita) +
  labs(title = "GDP per capita", y = "Currency in US Dollars") +
  theme(legend.position = "none")

# Keep only observations above 100,000 USD per person to make the top countries readable
gdp1 <- global_economy %>%
  mutate(GDP_per_capita = GDP / Population) %>%
  filter(GDP_per_capita > 100000) %>%
  autoplot(GDP_per_capita) +
  labs(title = "GDP per capita", y = "Currency - US Dollars")
gdp1

Monaco has the highest GDP per capita. From 2013 it began to alternate with Liechtenstein, and in 2015 the highest GDP per capita belongs to Liechtenstein.

Exercise 3.2

For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy.
Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.
Victorian Electricity Demand from vic_elec.
Gas production from aus_production.

gdp_USA <- global_economy %>%
  filter (Country =="United States") %>% 
  autoplot(GDP, col="blue") +   labs(title= "USA GDP", y = "Currency - US Dollars")
gdp_USA

max(global_economy$Year)

## [1] 2017

min(global_economy$Year)

## [1] 1960

# Same series on a log scale
gdp_USA2 <- global_economy %>%
  filter(Country == "United States") %>%
  autoplot(log(GDP), col = "blue") +
  labs(title = "USA GDP (log scale)", subtitle = "1960 to 2017", y = "log(GDP in US Dollars)")
gdp_USA2

The log-transformed curve is almost a straight line, so no further transformation is needed.

view(aus_livestock)
# Note: no State filter is applied here, so Count is summed across all states
aus <- aus_livestock %>%
    group_by(Animal) %>%
    filter(Animal == "Bulls, bullocks and steers") %>%
    summarise(number_of_animal = sum(Count)) %>%
    autoplot(number_of_animal, col = "blue") +
    labs(title = "Slaughter of Victorian “Bulls, bullocks and steers”", subtitle = "aus_livestock", y = "Number of animals slaughtered")
aus

According to the plot, from January 1980 to January 2020 about 350k bulls, bullocks and steers were slaughtered in Victoria, Australia.

view(vic_elec)

vic_elec1 <- vic_elec
vic_elec1$NewDate <- format(vic_elec$Date, format = "%Y")

view(vic_elec1)
elec1 <- vic_elec1 %>%
    group_by(NewDate)%>%
    summarise(Electric_Demand = sum(Demand))%>%
    autoplot(Electric_Demand)+labs(title = "Victorian Electricity Demand", y="Electric Demand")
elec1

Electricity demand in Victoria from 2012 to 2015 ranges from about 4000 to 6000 MW.

view(aus_production)
aus1 <- aus_production %>%
     autoplot(Gas, col = "blue") +
     labs(title = "Historical Data on Gas Production", subtitle = "Australia", y = "Gas in Volume")
aus1

According to the graph, gas production has increased over the years.

Exercise 3.3

Why is a Box-Cox transformation unhelpful for the canadian_gas data?

## [1] 0

From the graph, I can’t see if box_cox() would have an effect because the change in gas volume each year is so small.

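# Estimate lambda with the Guerrero method and extract it as a single number
lambda <- canadian_gas %>%
    features(Volume, features = guerrero) %>%
    pull(lambda_guerrero)
canadian_gas %>%
    autoplot(box_cox(Volume, lambda), col = "blue")

# Same plot with a hand-picked lambda for comparison
canadian_gas %>%
    autoplot(box_cox(Volume, lambda = 1.2), col = "blue")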

The code above finds the optimal lambda for the Box-Cox transformation with the Guerrero feature. After applying box_cox() the change is not significant, although the lambda parameter does affect how the data are normalized.
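
To make the role of lambda concrete, here is a minimal base-R sketch of the Box-Cox formula for positive data. The helper box_cox_manual and the toy vector y are assumptions introduced for illustration; this is not the implementation used by box_cox() in fpp3.

```r
# Hand-written Box-Cox transform for strictly positive data
# (illustrative helper, not the fabletools implementation)
box_cox_manual <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Made-up positive series whose spread grows with its level
y <- c(10, 12, 20, 24, 40, 48, 80, 96)

box_cox_manual(y, 1)    # lambda = 1 only shifts the data: y - 1
box_cox_manual(y, 0)    # lambda = 0 is the natural logarithm
box_cox_manual(y, 0.5)  # intermediate lambda: square-root-like compression
```

With lambda close to 1 the shape of the series is essentially unchanged, which is one way to read a transformed plot that looks almost identical to the original.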

Exercise 3.4

What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

set.seed(1975)
view(aus_retail)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries %>%
  autoplot(Turnover, col="blue")

myseries %>%
  autoplot(box_cox(Turnover, lambda = 0.3), col="blue")

We chose to apply a Box-Cox transformation to Turnover with lambda = 0.3.

Exercise 3.5

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

lambda_tobacco <- aus_production %>%
   features(Tobacco, features = guerrero) %>%
   pull(lambda_guerrero)
aus_production %>%
  autoplot(box_cox(Tobacco, lambda_tobacco)) +
  labs(y = "", title = latex2exp::TeX(paste0( "Transformed tobacco production with $\\lambda$ = ", round(lambda_tobacco,2))))

Exercise 3.7

Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) %>% select(Gas)

a. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
b. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
c. Do the results support the graphical interpretation from part a?
d. Compute and plot the seasonally adjusted data.
e. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?

require(graphics)
gas <- tail(aus_production, 5*4) %>%
  dplyr::select(Gas)
gas %>%
  autoplot(Gas, col = "blue") +
  labs(title = "Historical Data on Gas Production", subtitle = "Australia", y = "Gas Volume")

# Classical multiplicative decomposition: trend-cycle, seasonal indices and remainder
gas1a <- as_tsibble(gas) %>%
    model(classical_decomposition(Gas, type = "multiplicative")) %>%
    components() %>%
    autoplot(col = "blue") +
    labs(title = "Historical Data on Gas Production", subtitle = "Australia", y = "Gas Volume")
gas1a

# Original series (blue) with the seasonally adjusted series overlaid (red)
as_tsibble(gas) %>%
   model(classical_decomposition(Gas, type = "multiplicative")) %>%
   components() %>%
   as_tsibble() %>%
   autoplot(Gas, col = "blue") +
   geom_line(aes(y = season_adjust), colour = "red") +
   labs(title = "Historical Data on Gas Production", subtitle = "Australia", y = "Gas Volume")

# Add an outlier to observation 5 on a copy, so the original series stays untouched
gas_mid <- gas
gas_mid$Gas[5] <- gas_mid$Gas[5] + 300
as_tsibble(gas_mid) %>%
    model(classical_decomposition(Gas, type = "multiplicative")) %>%
    components() %>%
    as_tsibble() %>%
    autoplot(Gas, col = "blue") +
    geom_line(aes(y = season_adjust), colour = "red") +
    labs(title = "Historical Data on Gas Production", subtitle = "Australia", y = "Gas Volume")

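# Rebuild the series so only one outlier is present, then add it near the end
gas_end <- tail(aus_production, 5*4) %>%
    dplyr::select(Gas)
gas_end$Gas[18] <- gas_end$Gas[18] + 300
as_tsibble(gas_end) %>%
    model(classical_decomposition(Gas, type = "multiplicative")) %>%
    components() %>%
    as_tsibble() %>%
    autoplot(Gas, col = "blue") +
    geom_line(aes(y = season_adjust), colour = "red") +
    labs(title = "Historical Data on Gas Production", subtitle = "Australia", y = "Gas Volume")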
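
The effect of an outlier on the seasonal indices can also be checked numerically. The following is a rough base-R sketch using decompose() from the stats package on a synthetic quarterly series. The series, the outlier position and all variable names are made up for illustration and are not the aus_production data.

```r
# Synthetic quarterly series: linear trend times a fixed seasonal pattern
# (made-up numbers, not the aus_production data)
trend    <- seq(200, 238, by = 2)                    # 20 quarters
seasonal <- rep(c(0.8, 1.1, 1.2, 0.9), times = 5)
y <- ts(trend * seasonal, frequency = 4)

# Classical multiplicative decomposition from the stats package
fit <- decompose(y, type = "multiplicative")
adj <- y / fit$seasonal                              # seasonally adjusted series

# Same series with one observation inflated by 300
y_out <- y
y_out[10] <- y_out[10] + 300
fit_out <- decompose(y_out, type = "multiplicative")
adj_out <- y_out / fit_out$seasonal

# Compare the estimated seasonal indices with and without the outlier
round(fit$figure, 3)
round(fit_out$figure, 3)
```

Comparing the two sets of indices, and plotting adj against adj_out, shows how far a single inflated observation propagates into the seasonal component. Moving the outlier to another position in y_out gives a way to explore the last question.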

Exercise 3.8

Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries %>%
  model(classical_decomposition(Turnover,type = "multiplicative")) %>%
  components() %>%
  autoplot(col="blue") + 
  ggtitle("Multiplicative decomposition of my retail time series data")

There are some outliers around January 2010 and some around 1995, but we did not notice any other unusual features.

Exercise 3.9

Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation

The figures show the result of decomposing the number of people in the civilian labour force in Australia each month from February 1978 to August 1995. The data increase over the years, so the trend is upward. In the remainder graph we can observe some atypical values in 1991/1992. The month with the highest value is December, and this behaviour repeats across the years.

Is the recession of 1991/1992 visible in the estimated components?

According to the plot, the atypical values that we observed between 1991 and 1992 may be due to the recession.

DATA624_HW2

Gabriel Santos

2023-02-11