Homework # 2 (624)
3.1 Consider the GDP Info in global_economy Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?
library(fpp3)
## Warning: package 'fpp3' was built under R version 4.4.2
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.5
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.1 ✔ feasts 0.4.1
## ✔ lubridate 1.9.3 ✔ fable 0.4.1
## ✔ ggplot2 3.5.1
## Warning: package 'tsibble' was built under R version 4.4.2
## Warning: package 'tsibbledata' was built under R version 4.4.2
## Warning: package 'feasts' was built under R version 4.4.2
## Warning: package 'fabletools' was built under R version 4.4.2
## Warning: package 'fable' was built under R version 4.4.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
head(global_economy)
## # A tsibble: 6 x 9 [1Y]
## # Key: Country [1]
## Country Code Year GDP Growth CPI Imports Exports Population
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Afghanistan AFG 1960 537777811. NA NA 7.02 4.13 8996351
## 2 Afghanistan AFG 1961 548888896. NA NA 8.10 4.45 9166764
## 3 Afghanistan AFG 1962 546666678. NA NA 9.35 4.88 9345868
## 4 Afghanistan AFG 1963 751111191. NA NA 16.9 9.17 9533954
## 5 Afghanistan AFG 1964 800000044. NA NA 18.1 8.89 9731361
## 6 Afghanistan AFG 1965 1006666638. NA NA 21.4 11.3 9938414
library(seasonal)
## Warning: package 'seasonal' was built under R version 4.4.3
##
## Attaching package: 'seasonal'
## The following object is masked from 'package:tibble':
##
## view
Using autoplot to display the GDP per capita for each country, we make the show.legend function to show false output for each country..
global_economy %>%
autoplot(GDP/Population,show.legend = FALSE) +
labs(x = "Year", y = "GDP Per Capita")
## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Use the slice_max function to return the top 5 GDP per capita
global_economy %>%
mutate(percap = GDP/Population) %>%
slice_max(percap, n = 25) %>%
distinct()
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Country`, `Year` first.
## # A tibble: 25 × 10
## Country Code Year GDP Growth CPI Imports Exports Population percap
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Monaco MCO 2014 7.06e9 7.18 NA NA NA 38132 1.85e5
## 2 Monaco MCO 2008 6.48e9 0.732 NA NA NA 35853 1.81e5
## 3 Liechtenst… LIE 2014 6.66e9 NA NA NA NA 37127 1.79e5
## 4 Liechtenst… LIE 2013 6.39e9 NA NA NA NA 36834 1.74e5
## 5 Monaco MCO 2013 6.55e9 9.57 NA NA NA 37971 1.73e5
## 6 Monaco MCO 2016 6.47e9 3.21 NA NA NA 38499 1.68e5
## 7 Liechtenst… LIE 2015 6.27e9 NA NA NA NA 37403 1.68e5
## 8 Monaco MCO 2007 5.87e9 14.4 NA NA NA 35111 1.67e5
## 9 Liechtenst… LIE 2016 6.21e9 NA NA NA NA 37666 1.65e5
## 10 Monaco MCO 2015 6.26e9 4.94 NA NA NA 38307 1.63e5
## # ℹ 15 more rows
It seems like in certain time-frames Monaco had the highest GDP per capita throughout the year,followed by Liechtenstein and then Luxoemburg. But what is the most recent year in the chart.
## Check the recent year.
unique(global_economy$Year)
## [1] 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974
## [16] 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
## [31] 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004
## [46] 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
The data goes up to 2017
global_economy %>%
filter(Year == 2017) %>%
mutate(percap = GDP/Population) %>%
group_by(Country) %>%
arrange(desc(percap)) %>%
select(Country,percap)
## # A tsibble: 262 x 3 [1Y]
## # Key: Country [262]
## # Groups: Country [262]
## Country percap Year
## <fct> <dbl> <dbl>
## 1 Luxembourg 104103. 2017
## 2 Macao SAR, China 80893. 2017
## 3 Switzerland 80190. 2017
## 4 Norway 75505. 2017
## 5 Iceland 70057. 2017
## 6 Ireland 69331. 2017
## 7 Qatar 63249. 2017
## 8 United States 59532. 2017
## 9 North America 58070. 2017
## 10 Singapore 57714. 2017
## # ℹ 252 more rows
It seems like in 2017 Luxembourg,China and Switzerland had the highest GDP per capita, I will check the previous few years to see what happened.
## 2016
global_economy %>%
filter(Year == 2016) %>%
mutate(percap = GDP/Population) %>%
group_by(Country) %>%
arrange(desc(percap)) %>%
select(Country,percap)
## # A tsibble: 262 x 3 [1Y]
## # Key: Country [262]
## # Groups: Country [262]
## Country percap Year
## <fct> <dbl> <dbl>
## 1 Monaco 168011. 2016
## 2 Liechtenstein 164993. 2016
## 3 Luxembourg 100739. 2016
## 4 Switzerland 79866. 2016
## 5 Isle of Man 78730. 2016
## 6 Macao SAR, China 74017. 2016
## 7 Norway 70890. 2016
## 8 Ireland 64100. 2016
## 9 Iceland 60530. 2016
## 10 Qatar 59044. 2016
## # ℹ 252 more rows
In 2016 Monaco,Liechtenstein and Luxembourg had the highest GDP,it appears there is no GDP data for Monaco in 2017, check 2015 just in case.
# 2015
global_economy %>%
filter(Year == 2015) %>%
mutate(percap = GDP/Population) %>%
group_by(Country) %>%
arrange(desc(percap)) %>%
select(Country,percap)
## # A tsibble: 262 x 3 [1Y]
## # Key: Country [262]
## # Groups: Country [262]
## Country percap Year
## <fct> <dbl> <dbl>
## 1 Liechtenstein 167591. 2015
## 2 Monaco 163369. 2015
## 3 Luxembourg 101447. 2015
## 4 Switzerland 82016. 2015
## 5 Isle of Man 81672. 2015
## 6 Macao SAR, China 75484. 2015
## 7 Norway 74498. 2015
## 8 Qatar 65177. 2015
## 9 Ireland 61808. 2015
## 10 Australia 56561. 2015
## # ℹ 252 more rows
Liechtenstein was ahead this year, and finally check 2014..
# 2014
global_economy %>%
filter(Year == 2014) %>%
mutate(percap = GDP/Population) %>%
group_by(Country) %>%
arrange(desc(percap)) %>%
select(Country,percap)
## # A tsibble: 262 x 3 [1Y]
## # Key: Country [262]
## # Groups: Country [262]
## Country percap Year
## <fct> <dbl> <dbl>
## 1 Monaco 185153. 2014
## 2 Liechtenstein 179308. 2014
## 3 Luxembourg 119225. 2014
## 4 Norway 97200. 2014
## 5 Macao SAR, China 94004. 2014
## 6 Isle of Man 89942. 2014
## 7 Qatar 86853. 2014
## 8 Switzerland 86606. 2014
## 9 Denmark 62549. 2014
## 10 Australia 62328. 2014
## # ℹ 252 more rows
After plotting and verifying the data for the few years, Monaco in 2014 had the highest ever GDP per capita of all time(not accounting for Inflation..), and in recent years Monaco,Liechtenstein and Luxembourg had the highest GDP per capita but there is missing information in 2017 for Monaco and Liechtenstein but I assume they would rank highly in 2017 as well.
3.2 For each of the following series, make a graph of the data For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United States GDP
global_economy %>%
filter(Country == "United States") %>%
ggplot(aes(x = Year,y = GDP)) +
geom_line() + labs(title = "U.S GDP")
It appears that this time series has a quadratic growth trend.
global_economy %>%
filter(Country == "United States") %>%
autoplot(sqrt(GDP)) + ggtitle("Square-Root Transformation on GDP")
Applying The square-root transformation, it seems like nothing has
changed..
global_economy %>%
filter(Country == "United States") %>%
autoplot(log(GDP)) + ggtitle("Log Transformation on GDP")
The log transformation seems to have make the time-series data appear
more linear
Australian Livestock
aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot() + ggtitle("Australian Livestock slaughter")
## Plot variable not specified, automatically selected `.vars = Count`
## Plot variable not specified, automatically selected `.vars = Count`
## This is the meat production in Australia for human consumption,
aus_live1 <- aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot(log(Count)) + ggtitle("Log Transformation of Livestock Slaughter")
aus_live1
Log Transforming the count did nothing, can try box-cox?
## Try a box-cox transformation using the features function
## Okay it prints out a lambda value
lambda <- aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
features(Count,features = guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] -0.04461887
THe lambda value is really small and it is close to zero. From the article on Box-Cox a lambda equals 0 means it performed a log transformation.
aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot(box_cox(Count,lambda)) + ggtitle("Box-Cox Transformation")
The article mentioned of simple calender effect by removing the
variation by computing average sales per trading day in each month.
Since this is a monthly count of slaughters we have to convert it to
average sales per trading day in each month,
## Convert this data into average count of slaughters.
aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
## calculate the average by dividing count by # of days.. using days_in_month to count # of day
mutate(AverageKill = Count/days_in_month(Month)) %>%
autoplot(AverageKill) + ggtitle("Average Slaughter per month")
Victoria electricity
vic_elec %>%
autoplot() + ggtitle("Half-Hour Electricity Demand for Victoria,Australia")
## Plot variable not specified, automatically selected `.vars = Demand`
## Plot variable not specified, automatically selected `.vars = Demand`
It seems this time-series wouldn’t benefit from any transformations since there is no trend,and no cylic behavior but a seasonal pattern.
vic_elec %>%
autoplot(log(Demand)) + ggtitle("Log Transformation")
Applied a log-transformation and not much has changed. What if we
convert the half-hour electricity demand to a daily average electricity
demand.
## Reconvert tstibble to tibble..
Vic_elec <- vic_elec %>% as_tibble()
## Get the average electricity demand for each day.
Vic_elec %>%
group_by(Date) %>%
summarise(Total = sum(Demand)) %>%
mutate(Date = as.Date(Date)) %>%
as_tsibble(index = Date,key = Total) %>%
ggplot(aes(x = Date, y = Total)) +
geom_line() + ggtitle("Daily Electricity Demand Per Day")
Converting the half-hour electricity demand to a daily demand makes the
chart easier to read.. Gas_Production
aus_production %>%
select(Quarter,Electricity) %>%
autoplot()
## Plot variable not specified, automatically selected `.vars = Electricity`
## Plot variable not specified, automatically selected `.vars = Electricity`
This time-series appears quadratic, so perhaps a square root transformation can works
aus_production %>%
autoplot(sqrt(Electricity))
The time-series seems better when it is transformed as a square-root
transformation
3.3 Why is a Box-Cox transformation unhelpful The canadian_gas data is monthly canadian gas production from Jan 1960 to Feb 2005 it is a monthly time-series
canadian_gas
## # A tsibble: 542 x 2 [1M]
## Month Volume
## <mth> <dbl>
## 1 1960 Jan 1.43
## 2 1960 Feb 1.31
## 3 1960 Mar 1.40
## 4 1960 Apr 1.17
## 5 1960 May 1.12
## 6 1960 Jun 1.01
## 7 1960 Jul 0.966
## 8 1960 Aug 0.977
## 9 1960 Sep 1.03
## 10 1960 Oct 1.25
## # ℹ 532 more rows
canadian_gas %>%
autoplot() + labs(title = "Monthly Canadian gas production", x = "Months", y = "volume")
## Plot variable not specified, automatically selected `.vars = Volume`
## Plot variable not specified, automatically selected `.vars = Volume`
lambda1 <- canadian_gas %>%
features(Volume,features = guerrero) %>%
pull(lambda_guerrero)
lambda1
## [1] 0.5767648
canadian_gas %>%
autoplot(box_cox(Volume,lambda1))
The textbook explains that we perform transformations on the data to
stablize the variance.The box-cox transformation is unhelpful because
the seasonal_variation is the same across both charts. The variance
impact between 1979 and 1985 is diminished in the box-cox
transformation. Though the variance at both tail ends are increased.
3.4 What Box-Cox transformation would you select for your retail data The aus_retail is a monthly tstibble with one value, I will use the data from the textbook.
## Taken directly from the textbook.
set.seed(12345678)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
We can apply a box-cox transformation and see what value of lambda best transform the data.
myseries %>%
autoplot() + ggtitle("Australian Retail Trade Turnover")
## Plot variable not specified, automatically selected `.vars = Turnover`
## Plot variable not specified, automatically selected `.vars = Turnover`
lambda2 <- myseries %>%
features(Turnover,features = guerrero) %>%
pull(lambda_guerrero)
lambda2
## [1] 0.08303631
The lambda value is 0.08 may approximate to 0
## Print the lambda value
myseries %>%
autoplot(box_cox(Turnover,lambda2)) + ggtitle("Austrailian Retail with lambda = 0.08")
According to guerro which selected the lambda value for me, choose a
value of 0.08 for the retail data, which is approximately a log
transformation.
3.5 For the following series, find an appropriate Box-Cox transformation For this question, I will use the guerro function to choose the appropriate lambda values for the transformation in the chart.
Aus_production
aus_production %>%
autoplot(Tobacco) + ggtitle("Quarterly Production of Tobbaco")
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
# Creating a lambda value
lambda3 <- aus_production %>%
features(Tobacco,features = guerrero) %>%
pull(lambda_guerrero)
lambda3
## [1] 0.9264636
aus_production %>%
autoplot(box_cox(Tobacco,lambda3)) + ggtitle("Quarterly production of lambda = 0.92")
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
The guerro function choose a lambda of 0.9 Ansett Ansett is the
passenger numbers on Ansett airline flight, filter for airports between
Melbourne and Sydney and economy class
ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
autoplot(Passengers) + labs(title = "Passengers numbers on Ansett Flights")
## Create a lambda value from guerro
lambda4 <- ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
features(Passengers,features = guerrero) %>%
pull(lambda_guerrero)
## Applying The Lambda value
ansett %>%
filter(Airports == "MEL-SYD" & Class == "Economy") %>%
autoplot(box_cox(Passengers,lambda4)) + labs(title = "Transformed passenger numbers lambda = 1.99")
The box_cox transformation choose a lambda value of 1.9 for the ansett
data. Pedestrian This contains hourly pedestrian counts in the city of
Melbourne
## I am not sure what's happening here..
pedestrian %>%
filter(Sensor == "Southern Cross Station") %>%
autoplot(Count) + ggtitle("Pedestrian counts in the city of Melbourne")
## Use the guerro feature once again for lambda
lambda5 <- pedestrian %>%
filter(Sensor == "Southern Cross Station") %>%
features(Count,features = guerrero) %>%
pull(lambda_guerrero)
pedestrian %>%
filter(Sensor == "Southern Cross Station") %>%
autoplot(box_cox(Count,lambda5)) + ggtitle("Transformed Count with lambda = -0.25")
The lambda value chosen from the guerro function is -0.25
3.7 Consider The Last 5 Years of Gas Data A.
gas <- tail(aus_production, 5*4) |> select(Gas)
gas
## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 221 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 205 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 236 2010 Q2
gas %>%
autoplot() + ggtitle("Quarterly Gas production")
## Plot variable not specified, automatically selected `.vars = Gas`
We can see a strong seasonal fluctuation, with sharp increases in the
middle of the year and a sharp decline at the begninng of every first
quarter, it seems to be going up on a slight positive trend.
## From section 3.4
cmp <- gas %>%
model(
classical_decomposition(Gas,type = "multiplicative")) %>%
components()
cmp %>%
autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
C I felt like my interpretations were correct there is a slight positive
trend, and the seasonal patterns are going down and up for each year.
Thus its hard to account for the random variation in the data.
D
# Took the code from section 3.5 and plotted it for section D.
cmp %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas,color = "Data")) +
geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
geom_line(aes(y = trend,color = "Trend")) +
labs(title = "Quarterly production of Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
E Change one observation by adding 300 to one observation..
Gast1 <- gas
Gast1$Gas[1] <- Gast1$Gas[1] + 300
Gast1
## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 521 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 205 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 236 2010 Q2
cmp1 <- Gast1 %>%
model(
classical_decomposition(Gas,type = "multiplicative")) %>%
components()
cmp1 %>%
autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Plot the seasonal adjusted data again..
cmp1 %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas,color = "Data")) +
geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
geom_line(aes(y = trend,color = "Trend")) +
labs(title = "Quarterly production of Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
When I added 300 to the first observation, The seasonal adjusted data
shot up at the beginning and then went down as soon it returned to the
normal observations, which shows that the effect of the outlier had an
effect on the seasonal adjusted data.
F Place an outlier in the middle of the dataset.
Gast2 <- gas
Gast2$Gas[10] <- Gast2$Gas[10] + 300
Gast2
## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 221 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 505 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 236 2010 Q2
cmp2 <- Gast2 %>%
model(
classical_decomposition(Gas,type = "multiplicative")) %>%
components()
cmp2 %>%
autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Plot the seasonal adjusted data again..
cmp2 %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas,color = "Data")) +
geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
geom_line(aes(y = trend,color = "Trend")) +
labs(title = "Quarterly production of Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
Place an outlier at the end of the observation.
Gast3 <- gas
Gast3$Gas[20] <- Gast3$Gas[20] + 300
Gast3
## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 221 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 205 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 536 2010 Q2
cmp3 <- Gast3 %>%
model(
classical_decomposition(Gas,type = "multiplicative")) %>%
components()
cmp3 %>%
autoplot() + labs(title = "Classical multiplicative decomposition of Gas Production in Australia")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
cmp3 %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas,color = "Data")) +
geom_line(aes(y = season_adjust,color = "Seasonally Adjusted")) +
geom_line(aes(y = trend,color = "Trend")) +
labs(title = "Quarterly production of Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
Placing an outlier at the middle and end of the time series we can see
that there is a difference in where the outlier is placed and how the
seasonal adjusted data is computed on where the outlier is.
3.8 Recall your Time Series
## Use the textbook data
set.seed(12345678)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
I will answer the previous question by plotting the myseries with the plot functions..
myseries %>%
autoplot() + labs(title = "Australian retail trade turnover")
## Plot variable not specified, automatically selected `.vars = Turnover`
myseries %>%
gg_season() +labs(title = "Australian retail trade turnover")
## Plot variable not specified, automatically selected `y = Turnover`
myseries %>%
gg_subseries() + labs(title = "Australian retail trade turnover")
## Plot variable not specified, automatically selected `y = Turnover`
x11_dcmp <- myseries %>%
model(X11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
components()
autoplot(x11_dcmp) +
labs(title = "Decomposition of Australian clothing and accessory retailing trade turnover using X-11")
In this particular series I see in the iregular table some spikes within
the chart, which could indicate outliers, which may have an effect on
the trend which is slightly increasing. This would have been hard to see
in the other plots without the x11 decomposition.
3.9 A. The results of the decomposition shown in the chart. The results shows a slight positive trend line that is representative of the value line.The scales in the season-year is pretty high and is going from -100 to 100. The seasonal chart shows various increases from Jan to March and a decrease during the spring and a decrease in the summer and fluctuates during the summer.
Fin