Email : je070601@gmail.com
Instagram : https://www.instagram.com/marvis.zerex/
RPubs : https://rpubs.com/invokerarts/
Linkedin : https://www.linkedin.com/in/jeffry-wijaya-087a191b5/
Majors : Business Statistics
Address : ARA Center, Matana University Tower
Jl. CBD Barat Kav, RT.1, Curug Sangereng, Kelapa Dua, Tangerang, Banten 15810.
library(DT)
library(fpp2)
library(fpp3)
library(tidyverse)
library(ggplot2)
library(tsibbledata)
library(tsibble)
library(forecast)
library(fma)
library(expsmooth)
library(feasts)
library(USgas)
library(xlsx)
library(tidyr)
library(data.table)
library(seasonal)math.trans = function(ts, objname)
{
lambda = BoxCox.lambda(ts)
print(paste('Lambda value for time series', objname, '=', lambda))
print(paste('Plotting Original vs Transformed time series for', objname))
df = cbind(Original = ts, Transformed = BoxCox(ts, lambda))
autoplot(df, facet = TRUE) +
xlab('Time') + ylab('Value') +
ggtitle(paste('Original vs Transformed plot for', objname))
}
left = function(text, num_char) {
substr(text, 1, num_char)
}
mid = function(text, start_num, num_char) {
substr(text, start_num, start_num + num_char - 1)
}
right = function(text, num_char) {
substr(text, nchar(text) - (num_char-1), nchar(text))
}global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?global_economy %>%
tsibble(key = Code, index = Year)%>%
autoplot(GDP/Population, show.legend = FALSE) +
labs(title= "GDP per capita",
y = "$US")datamax <- global_economy %>% mutate(sum = (global_economy$GDP/global_economy$Population))
maximumvalue <- datamax[which.max(datamax$sum),]
maximumvalue## # A tsibble: 1 x 10 [1Y]
## # Key: Country [1]
## Country Code Year GDP Growth CPI Imports Exports Population sum
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Monaco MCO 2014 7060236168. 7.18 NA NA NA 38132 1.85e5
global_economy %>%
tsibble(key = Code, index = Year)%>%
filter(Country=="Monaco") %>%
autoplot(GDP/Population)global_economyglobal_economy %>%
filter(Country == "United States") %>%
autoplot(GDP/Population) +
labs(title= "GDP per capita", y = "$US")canadian_gas data?lambda <- canadian_gas %>%
features(Volume, features = guerrero) %>%
pull(lambda_guerrero)
canadian_gas %>%
autoplot(box_cox(Volume, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed gas production with $\\lambda$ = ",
round(lambda,2))))The plot of monthly Canadian gas production displays a seasonality of 1 year and a seasonal variance that is relatively low from 1960 through 1978, larger from 1978 through 1988 and smaller from 1988 through 2005. Because the seasonal variation increases and then decreases, the Box Cox transformation cannot be used to make the seasonal variation uniform.
set.seed(12345678)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries %>%
autoplot(Turnover)+
labs(title = "Australian Retail Trade Turnover")lambda_retail <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
myseries %>%
autoplot(box_cox(Turnover, lambda_retail))+
labs(title = latex2exp::TeX(paste0(
"Box Cox Transformation of Australian Retail Trade Turnover with $\\lambda$ = ",
round(lambda_retail,2))))The plot of Australian Retail Trade doesn’t show an upward trend and a seasonality of one year. The seasonal variation increases with time.
aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.## [1] "Lambda value for time series tobacco = 0.709945071249522"
## [1] "Plotting Original vs Transformed time series for tobacco"
economypassengers <- ansett %>%
filter(Class == "Economy", Airports =="MEL-SYD")
economy <- ts(economypassengers[,"Passengers"])
math.trans(economy, "economy passengers between Melbourne and Sydney")## [1] "Lambda value for time series economy passengers between Melbourne and Sydney = 1.99995900720725"
## [1] "Plotting Original vs Transformed time series for economy passengers between Melbourne and Sydney"
newdata <- filter(pedestrian, Sensor =='Southern Cross Station')
datapede <- ts(newdata[,"Count"])
math.trans(datapede, 'datapede')## [1] "Lambda value for time series datapede = 0.0748116138525378"
## [1] "Plotting Original vs Transformed time series for datapede"
A 3X5 MA is when a 3-MA follows a moving of a 5-MA series. It is also called a “centered moving average of order 5”. This is because the results are now symmetric.
\[3\times5MA=\frac{1}{15}Y_1+\frac{2}{15}Y_2+\frac{3}{15}Y_3+\frac{3}{15}Y_4++\frac{3}{15}Y_5+\frac{2}{15}Y_6+\frac{1}{15}Y_7\]
aus_production.## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 221 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 205 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 236 2010 Q2
From the plot above, there are seasonal fluctuations with a general upward growth trend. The seasons peak in roughly the middle of the year - potentially the summer months.
classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.gas %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Last Five Years of The Gas Data")The results of the multiplicative decomposition show a quarterly seasonal component with a frequency of 1 year. There is an increasing trend from year 2006 through middle 2007. After year 2007, there is no trend until early 2008. After that, ther is an increasing trend late 2009.
The results support the graphical interpretation from part a, which was a seasonality of frequency 1 year and an increasing trend. And because classical multiplicative decomposition relies on moving averages, there is no data at the beginning and end of the trend-cycle.
gas_adjusted <- gas %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components()
gas_adjusted %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "",
title = "Last Five Years of The Gas Data") +
scale_colour_manual(
values = c("gray", "Red", "Blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)gas_Obs <- gas
gas_Obs$Gas[10] <- gas_Obs$Gas[10]+300
gas_Obs %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")gas_Obs %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "",
title = "Last Five Years of The Gas Data with 300 added to the 10th observation") +
scale_colour_manual(
values = c("gray", "Red", "Blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)when 300 was added to the 10th observation, it caused a large spike in the seasonally adjusted data. The quarterly gas data was taken from a seasonal low point to a relative high point. The addition of 300 to the 10th observation has a relatively small affect on the seasonal component. This is because the seasonal component is uniform for each year and only one data point has changed. It also caused a decreasing trend from early 2008 until middle 2008.
gas_comp <- gas
gas_comp$Gas[20] <- gas_comp $Gas[10]+300
gas_comp %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")gas_comp %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "",
title = "Last Five Years of The Gas Data with 300 added to the last observation") +
scale_colour_manual(
values = c("gray", "Red", "Blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)Adding 300 to the last entry causes a spike at the end of the seasonally adjusted data. The seasonal data is less affected by the change - its pattern more closely matches that of the original data. The trend looks more better, because it looks more increasing and also at the end cause adding 300 at the last observation.
myseries %>%
model(classical_decomposition(Turnover,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Multiplicative decomposition of my retail time series data")myseries %>%
model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
components() %>%
autoplot()+
labs(title = "X-11 decomposition of my retail time series data")Compare both decomposition, the X-11 trend-cycle has captured the sudden fall in the 2000-2010.
Figure 1: Decomposition of the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.
Figure 2: Seasonal component from the decomposition shown in the previous figure.
Isolating the trend component from the seasonal component shows that the trend has increased throught the majority of the time frame, with a few stationary periods occuring in the early 90s. The monthly breakdown of the seasonal component shows that a few months show greater velocities in their variations than other months.
Yes, we see a dip in employment during 1991/1992 that is not explained by seasonality or the positive trend.
canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 – February 2005).autoplot(), gg_subseries() and gg_season() to look at the effect of the changing seasonality over time.canadian_gas %>%
model(
STL(Volume ~ trend(window = 21) +
season(window = 13),
robust = TRUE)) %>%
components() %>%
autoplot()gg_season().]As shown above, the seasonal shape is flat from beginning and then as the time goes by the seasonal shape increases. In year 1960 there is no trend-cycle, we can say the gas production didn’t really a trend in that time. After year 1975 there is a trend-cycle, hence the gas production increases at that time and so on.
canadian_gas %>%
model(
STL(Volume ~ trend(window = 21) +
season(window = 13),
robust = TRUE)) %>%
components() %>%
ggplot(aes(x = Month)) +
geom_line(aes(y = Volume, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(title = "STL decomposition of Canadian Gas Production") +
scale_colour_manual(
values = c("gray", "Red", "Blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)