Email : sherlytaurinsiri@gmail.com
Instagram : https://www.instagram.com/sherlytaurin
RPubs : https://rpubs.com/sherlytaurin/
Github : https://github.com/sherlytaurin/
Telegram : @Sherlytaurin

library(fpp3)  # loads tsibble, tsibbledata, feasts, fable, dplyr and ggplot2

global_economy %>%
tsibble(key = Code, index = Year)%>%
autoplot(GDP/Population, show.legend = FALSE) +
labs(title= "GDP per capita",
y = "$US")

datamax <- global_economy %>%
mutate(sum = GDP / Population)  # GDP per capita; columns are referenced directly
maximumvalue <- datamax[which.max(datamax$sum),]
maximumvalue

## # A tsibble: 1 x 10 [1Y]
## # Key: Country [1]
## Country Code Year GDP Growth CPI Imports Exports Population sum
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Monaco MCO 2014 7060236168. 7.18 NA NA NA 38132 1.85e5
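
As a quick sanity check on the `sum` column in the output above, the GDP per capita can be recomputed by hand. This is only a sketch: the two numbers are copied from the printed row (the GDP is shown rounded), and the variable names are illustrative.

```r
# Values copied from the printed row for Monaco in 2014.
gdp <- 7060236168        # GDP in $US, as printed (rounded)
population <- 38132

# GDP per capita is simply GDP divided by population.
gdp_per_capita <- gdp / population
print(round(gdp_per_capita))
```

The result is roughly 185 thousand $US per person, which agrees with the `1.85e5` shown in the table.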
From the data, we can see that the country with the highest GDP per capita is Monaco (in 2014). Its time series looks like this:
global_economy %>%
tsibble(key = Code, index = Year)%>%
filter(Country=="Monaco") %>%
autoplot(GDP/Population)

aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot(Count) +
labs(title= "Slaughter of Victorian Bulls, bullocks and steers", y = "Count")

The plot of monthly Canadian gas production shows a seasonal period of 1 year and a seasonal variation that is relatively small from 1960 through 1978, larger from 1978 through 1988 and smaller again from 1988 through 2005. Because the seasonal variation increases and then decreases while the level of the series keeps rising, a Box-Cox transformation cannot make the seasonal variation uniform: a single value of lambda can only shrink or stretch the variation in one direction as the level grows.
lambda_retail <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
myseries %>%
autoplot(box_cox(Turnover, lambda_retail))+
labs(title = latex2exp::TeX(paste0(
"Box Cox Transformation of Australian Retail Trade Turnover with $\\lambda$ = ",
round(lambda_retail,2))))

The plot of the Australian Retail Trade time series doesn't show any upward trend, but it does show a seasonality of one year. The seasonal variation increases with time, so we will choose the second one.
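
To make the role of lambda more concrete, here is a minimal sketch of the Box-Cox formula for strictly positive data. The function name `box_cox_sketch` and the `turnover` values are made up for illustration; the `box_cox()` function used above also handles non-positive values, which this sketch does not.

```r
# Box-Cox transformation for strictly positive values:
#   lambda == 0  -> natural log
#   otherwise    -> (y^lambda - 1) / lambda
box_cox_sketch <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (abs(lambda) < 1e-8) {
    log(y)
  } else {
    (y^lambda - 1) / lambda
  }
}

# Hypothetical series that doubles at every step.
turnover <- c(10, 20, 40, 80)

print(box_cox_sketch(turnover, 0))    # log scale: equal steps
print(box_cox_sketch(turnover, 0.5))  # between log and linear
print(box_cox_sketch(turnover, 1))    # original values shifted down by 1
```

Smaller values of lambda compress large observations more strongly, which is why a lambda well below 1 is typically chosen when the seasonal variation grows with the level of the series.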

library(forecast)  # provides BoxCox.lambda(), BoxCox() and autoplot() for ts objects

# Estimate a Box-Cox lambda for a ts object, report it, and plot the
# original series next to the transformed one.
math.trans <- function(ts, objname)
{
  lambda <- BoxCox.lambda(ts)
  print(paste('Lambda value for time series', objname, '=', lambda))
  print(paste('Plotting Original vs Transformed time series for', objname))
  df <- cbind(Original = ts, Transformed = BoxCox(ts, lambda))
  autoplot(df, facets = TRUE) +
    xlab('Time') + ylab('Value') +
    ggtitle(paste('Original vs Transformed plot for', objname))
}

## [1] "Lambda value for time series tobacco = 0.709945071249522"
## [1] "Plotting Original vs Transformed time series for tobacco"
economypassengers <- ansett %>%
filter(Class == "Economy", Airports =="MEL-SYD")
economy <- ts(economypassengers[,"Passengers"])
math.trans(economy, "economy passengers between Melbourne and Sydney")

## [1] "Lambda value for time series economy passengers between Melbourne and Sydney = 1.99995900720725"
## [1] "Plotting Original vs Transformed time series for economy passengers between Melbourne and Sydney"
pedestriancount <- pedestrian %>%
filter(Sensor =="Southern Cross Station")
datapedestrian <- ts(pedestriancount[,"Count"])
math.trans(datapedestrian, 'Pedestrian counts at Southern Cross Station')

## [1] "Lambda value for time series Pedestrian counts at Southern Cross Station = 0.0748116138525378"
## [1] "Plotting Original vs Transformed time series for Pedestrian counts at Southern Cross Station"
A 3x5 MA first takes a moving average of order 5 and then takes a moving average of order 3 of those results. Each 5-MA value is 1/5 of the sum of five consecutive observations, and averaging three consecutive 5-MA values multiplies by another 1/3. A 3x5 MA therefore gives 1/15(Y_1 + 2Y_2 + 3Y_3 + 3Y_4 + 3Y_5 + 2Y_6 + Y_7), a 7-term weighted average with weights 1/15, 2/15, 3/15, 3/15, 3/15, 2/15 and 1/15 (about 0.067, 0.133, 0.200, 0.200, 0.200, 0.133 and 0.067).
The result is a centered moving average because the weights are symmetric around the middle observation.
\[
3 \times 5\text{-MA} = \frac{1}{15}Y_1 + \frac{2}{15}Y_2 + \frac{3}{15}Y_3 + \frac{3}{15}Y_4 + \frac{3}{15}Y_5 + \frac{2}{15}Y_6 + \frac{1}{15}Y_7
\]
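
The weights can also be checked numerically. The sketch below combines a 5-term and a 3-term equal-weight average by hand; the variable names are illustrative and only base R is used.

```r
# Equal weights of a 5-MA and a 3-MA.
ma5 <- rep(1/5, 5)
ma3 <- rep(1/3, 3)

# Applying the 3-MA to the 5-MA output is a convolution of the two
# weight vectors: each 3-MA term shifts the 5-MA weights by one step.
combined <- numeric(length(ma5) + length(ma3) - 1)
for (i in seq_along(ma3)) {
  idx <- i:(i + length(ma5) - 1)
  combined[idx] <- combined[idx] + ma3[i] * ma5
}

# Express the weights in fifteenths and as decimals.
print(round(combined * 15))
print(round(combined, 3))
```

Multiplying the combined weights by 15 recovers the pattern 1, 2, 3, 3, 3, 2, 1 from the formula, and the weights sum to one.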
## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 221 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 205 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 236 2010 Q2
gas %>%
autoplot(Gas)+
labs(title = "Last Five Years of The Gas Data")+
theme_replace()+
geom_line(col = "#1B89D3")

The plot shows seasonal fluctuations with a general upward trend. Production peaks in the middle of each year and then declines until the beginning of the next year, and the pattern repeats.
gas %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Last Five Years of The Gas Data")

The results of the multiplicative decomposition show a quarterly seasonal component with a period of 1 year. There is an increasing trend from 2006 through the middle of 2007. After 2007, there is no trend until early 2008. After that, the trend increases again through late 2009.
The results support the graphical interpretation from part a, which was a seasonality with a period of 1 year and an increasing trend. Because classical multiplicative decomposition relies on moving averages, there is no trend-cycle estimate at the beginning and end of the series.
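
The missing values at both ends follow directly from the moving average. As a sketch, the trend for quarterly data can be approximated in base R with a centered 2x4-MA; the values are copied from the printed `gas` table, and the variable names are illustrative.

```r
# Quarterly gas values copied from the printed tsibble (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)

# Weights of a centered 2x4-MA, the trend estimator for quarterly data.
weights_2x4 <- c(1/8, 1/4, 1/4, 1/4, 1/8)

# stats::filter() is written in full because dplyr also exports filter().
trend <- as.numeric(stats::filter(gas_values, weights_2x4, sides = 2))

print(round(trend, 2))
print(which(is.na(trend)))  # positions with no trend estimate
```

Each trend value needs two observations on either side, so the first two and last two quarters have no estimate.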
gas_decom <- gas %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components()
gas_decom %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "",
title = "Last Five Years of The Gas Data") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)

gas1 <- gas
gas1$Gas[10] <- gas1$Gas[10]+300
gas1 %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Last Five Years of The Gas Data with 300 added to the 10th observation")

gas1 %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "",
title = "Last Five Years of The Gas Data with 300 added to the 10th observation") +
scale_colour_manual(
values = c("dark green", "red", "blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)

When 300 was added to the 10th observation, it caused a large spike in the seasonally adjusted data. The 10th observation (2007 Q4) was moved from a seasonal low point to a relative high point. The addition of 300 to the 10th observation has a relatively small effect on the seasonal component. This is because the seasonal component is the same for each year and only one data point has changed. It also caused a decreasing trend from early 2008 until the middle of 2008.
gas2 <- gas
gas2$Gas[20] <- gas2$Gas[20]+300  # add 300 to the last observation itself
gas2 %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Last Five Years of The Gas Data with 300 added to the last observation")

gas2 %>%
model(classical_decomposition(Gas,type = "multiplicative")) %>%
components() %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "",
title = "Last Five Years of The Gas Data with 300 added to the last observation") +
scale_colour_manual(
values = c("dark green", "red", "blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)

Adding 300 to the last entry causes a spike at the end of the seasonally adjusted data. The seasonal component is less affected by the change, and its pattern more closely matches that of the original data. The trend is distorted less than before: it simply increases more steeply at the end because of the 300 added to the last observation.
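
One way to see why the position of the outlier matters is to check which trend estimates it touches. The sketch below uses a centered 2x4-MA in base R as a stand-in for the trend step of the classical decomposition; the values are copied from the printed `gas` table, and the helper `ma_trend` is a made-up name.

```r
# Quarterly gas values copied from the printed tsibble (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)

# Centered 2x4-MA trend; the first and last two values are NA.
ma_trend <- function(y) {
  as.numeric(stats::filter(y, c(1/8, 1/4, 1/4, 1/4, 1/8), sides = 2))
}

# Outlier in the middle (10th) versus at the end (20th observation).
middle <- gas_values
middle[10] <- middle[10] + 300
last <- gas_values
last[20] <- last[20] + 300

base_trend <- ma_trend(gas_values)
changed_by_middle <- which(abs(ma_trend(middle) - base_trend) > 1e-8)
changed_by_last <- which(abs(ma_trend(last) - base_trend) > 1e-8)

print(changed_by_middle)  # trend positions moved by the middle outlier
print(changed_by_last)    # trend positions moved by the end outlier
```

An outlier in the middle of the series shifts five consecutive trend estimates, while an outlier at the end reaches only the last available estimate and enters it with the smallest weight.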
myseries %>%
model(classical_decomposition(Turnover,type = "multiplicative")) %>%
components() %>%
autoplot() +
ggtitle("Multiplicative decomposition of my retail time series data")

myseries %>%
model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
components() %>%
autoplot()+
labs(title = "X-11 decomposition of my retail time series data")

Comparing the two decompositions, the X-11 trend-cycle has captured the sudden fall during 2000-2010.
Figure 1: Decomposition of the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.
Figure 2: Seasonal component from the decomposition shown in the previous figure.
Isolating the trend component from the seasonal component shows that the trend increased throughout the majority of the time frame, with a few stationary periods occurring in the early 90s. The monthly breakdown of the seasonal component shows that some months vary more over time than others.
Yes, we see a dip in employment during 1991/1992 that is not explained by seasonality or the positive trend.
canadian_gas %>%
autoplot(Volume)+
labs(title = "Monthly Canadian Gas Production",
subtitle = "autoplot()",
y = "Billions of cubic metres")+
theme_replace()+
geom_line(col = "dark green")

canadian_gas %>%
gg_subseries(Volume)+
labs(title = "Monthly Canadian Gas Production",
subtitle = "gg_subseries()",
y = "Billions of cubic metres")

canadian_gas %>%
gg_season(Volume)+
labs(title = "Monthly Canadian Gas Production",
subtitle = "gg_season()",
y = "Billions of cubic metres")

According to the plots above, the canadian_gas data has an increasing trend with seasonality. In general, gas production increases in winter and decreases in summer.
The trend increases dramatically from 1975 to 1990. This is because the differences in gas production between winter and summer were larger in those years, as shown in the seasonal plot.
canadian_gas %>%
model(
STL(Volume ~ trend(window = 21) +
season(window = 13),
robust = TRUE)) %>%
components() %>%
autoplot()

From the STL decomposition above, the trend component adequately represents the original data (Volume). The seasonal component (season_year) increases in amplitude from 1975 until 1985 and decreases after that. The remainder component stays close to zero.
As shown above, the seasonal shape is flat at the beginning and then grows as time goes by. Around 1960 there is no clear trend-cycle, so gas production did not really follow a trend at that time. After 1975 there is a clear trend-cycle, and gas production increases from then on.
canadian_gas %>%
model(
STL(Volume ~ trend(window = 21) +
season(window = 13),
robust = TRUE)) %>%
components() %>%
ggplot(aes(x = Month)) +
geom_line(aes(y = Volume, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(title = "STL decomposition of Canadian Gas Production") +
scale_colour_manual(
values = c("green", "red", "blue"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)

canadian_gas %>%
model(x11 = X_13ARIMA_SEATS(Volume ~ x11())) %>%
components() %>%
autoplot()+
labs(title = "X-11 decomposition of Canadian Gas Production")

canadian_gas %>%
model(seats = X_13ARIMA_SEATS(Volume ~ seats())) %>%
components() %>%
autoplot() +
labs(title ="SEATS Decomposition of Canadian Gas Production")

The decomposed trend and seasonal components are similar to each other. The changes in seasonality differ from those in the original data. The differences between the seasonally adjusted series from these two methods are very small. In addition, the remainder component of the SEATS decomposition is larger than that of the X-11 decomposition, and both remainders are around one. The remainder component of the STL decomposition is smaller.
So, we can conclude that the STL decomposition fits the canadian_gas data better.
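
When reading the remainders, it may help to keep their scale in mind: an additive decomposition leaves a remainder centered on zero, while a multiplicative one leaves a remainder centered on one. The sketch below illustrates this with base R's `decompose()` on the quarterly gas values printed earlier. It is not the STL, X-11 or SEATS procedure itself, and it assumes that the X-11 and SEATS fits above are multiplicative, as their remainders around one suggest.

```r
# Quarterly gas values copied from the printed tsibble (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)
gas_ts <- ts(gas_values, start = c(2005, 3), frequency = 4)

# Classical decomposition in both forms.
additive <- decompose(gas_ts, type = "additive")
multiplicative <- decompose(gas_ts, type = "multiplicative")

# Average remainder: near 0 for additive, near 1 for multiplicative.
additive_mean <- mean(additive$random, na.rm = TRUE)
multiplicative_mean <- mean(multiplicative$random, na.rm = TRUE)

print(round(c(additive = additive_mean,
              multiplicative = multiplicative_mean), 3))
```

A remainder from an additive method is measured in the units of the data, whereas a multiplicative remainder is a ratio, so the two are easier to compare after converting them to a common scale.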