?gold
?woolyrnq
?gas
autoplot(gold)
autoplot(woolyrnq)
autoplot(gas)
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
?which.max
which.max(gold)
## [1] 770
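The index alone does not say when the spike occurred or how large it was. A minimal sketch, assuming the `gold` series comes from the forecast package; `describe_peak` is a hypothetical helper written for this note, not a function from any package.

```r
# Hypothetical helper: index, time stamp and value of the largest observation.
describe_peak <- function(x) {
  i <- which.max(x)  # index of the first maximum; missing values are ignored
  list(index = i,
       time  = as.numeric(time(x))[i],
       value = as.numeric(x)[i])
}

# Assumes the forecast package (which ships `gold`) is installed.
if (requireNamespace("forecast", quietly = TRUE)) {
  print(describe_peak(forecast::gold))
}
```

Because `gold` has frequency 1 and no calendar start, the reported time is just the observation number.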
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
head(retaildata)
## # A tibble: 6 × 190
## `Series ID` A3349335T A3349627V A3349338X A3349398A A3349468W
## <dttm> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1982-04-01 00:00:00 303. 41.7 63.9 409. 65.8
## 2 1982-05-01 00:00:00 298. 43.1 64 405. 65.8
## 3 1982-06-01 00:00:00 298 40.3 62.7 401 62.3
## 4 1982-07-01 00:00:00 308. 40.9 65.6 414. 68.2
## 5 1982-08-01 00:00:00 299. 42.1 62.6 404. 66
## 6 1982-09-01 00:00:00 305. 42 64.4 412. 62.3
## # … with 184 more variables: A3349336V <dbl>, A3349337W <dbl>, A3349397X <dbl>,
## # A3349399C <dbl>, A3349874C <dbl>, A3349871W <dbl>, A3349790V <dbl>,
## # A3349556W <dbl>, A3349791W <dbl>, A3349401C <dbl>, A3349873A <dbl>,
## # A3349872X <dbl>, A3349709X <dbl>, A3349792X <dbl>, A3349789K <dbl>,
## # A3349555V <dbl>, A3349565X <dbl>, A3349414R <dbl>, A3349799R <dbl>,
## # A3349642T <dbl>, A3349413L <dbl>, A3349564W <dbl>, A3349416V <dbl>,
## # A3349643V <dbl>, A3349483V <dbl>, A3349722T <dbl>, A3349727C <dbl>, …
myts <- ts(retaildata[,"A3349335T"],
frequency=12, start=c(1982,4))
autoplot(myts)
ggseasonplot(myts)
ggsubseriesplot(myts)
gglagplot(myts)
ggAcf(myts)
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
The series shows clear seasonality, which becomes more pronounced after 2005: values tend to drop in February, rise in October, and peak in December. There are no clear signs of cyclicity. There is a consistent upward trend, which accelerates after 2000.
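The monthly pattern read off the plots can be cross-checked numerically by averaging each calendar month across years. This is a rough sketch only: `seasonal_means` is a hypothetical helper, the raw means still contain the trend, and the built-in `AirPassengers` series is used as a stand-in so the block runs without `retail.xlsx`.

```r
# Hypothetical helper: mean of each seasonal position (month, quarter, ...) across years.
seasonal_means <- function(x) {
  tapply(as.numeric(x), as.integer(cycle(x)), mean, na.rm = TRUE)
}

# Stand-in usage on a built-in monthly series.
# With the retail data this would be seasonal_means(myts).
round(seasonal_means(AirPassengers), 1)
```

Months whose mean sits well above or below their neighbours are the ones that should stand out in `ggseasonplot()` and `ggsubseriesplot()`.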
?plastics
autoplot(plastics)
Yes, there is clear seasonality, with sales peaking just past the middle of the year. The series also shows a consistent upward trend.
# Classical multiplicative decomposition; the time axis is measured in years
plastics %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Classical multiplicative decomposition\nof Sales of plastic product")
Yes, my assessment of the seasonality and trend was generally correct, though I did not clearly see the upward trend leveling off towards the end of the series.
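Whether the trend really flattens can be checked by pulling the trend component out of the decomposition instead of judging it by eye. A minimal sketch, assuming the `plastics` series is available from the fma package; `trend_by_year` is a hypothetical helper. The classical trend is a centred moving average, so it is missing for the first and last half-year, and the final yearly average rests on fewer points.

```r
# Hypothetical helper: average of the classical-decomposition trend within each year.
trend_by_year <- function(x, type = "multiplicative") {
  trend <- decompose(x, type = type)$trend          # centred moving average, NA at both ends
  year  <- floor(as.numeric(time(x)) + 1e-8)        # small offset guards against rounding
  tapply(as.numeric(trend), year, mean, na.rm = TRUE)
}

# Assumes the fma package (which ships `plastics`) is installed.
if (requireNamespace("fma", quietly = TRUE)) {
  print(round(trend_by_year(fma::plastics), 1))
}
```

Shrinking differences between consecutive years would be consistent with the leveling off seen in the trend panel.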
plastics_ts <- ts(plastics, frequency=frequency(plastics), start=c(2017,1))
plastics_ts %>% seas(x11="") -> fit
# Plot the seasonally adjusted series from the X11 fit
autoplot(seasadj(fit), series="Seasonally Adjusted") +
  ggtitle("Seasonally Adjusted Sales of Plastic Product")
plastics_cp2 <- plastics
plastics_cp2[14] <- plastics_cp2[14] + 600
plastics_out2_ts <- ts(plastics_cp2, frequency=frequency(plastics_cp2), start=c(2017,1))
plastics_out2_ts %>% seas(x11="") -> fit2
autoplot(plastics_out2_ts, series="Data") +
autolayer(trendcycle(fit2), series="Trend") +
autolayer(seasadj(fit2), series="Seasonally Adjusted") +
xlab("Year") + ylab("Sales") +
ggtitle("Sales of plastic product") +
scale_colour_manual(values=c("gray","blue","red"),
breaks=c("Data","Seasonally Adjusted","Trend"))
# Classical decomposition of the series with the early outlier
plastics_cp2 %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Classical multiplicative decomposition\nof Sales of plastic product")
The outlier, which occurs in the trough of a seasonal cycle, does change the shape of the seasonal component, though most of its effect is captured in the remainder.
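The change in the seasonal shape can be quantified by comparing the seasonal indices with and without the outlier. A minimal sketch using the classical decomposition from base R (not the X11 fit); `compare_seasonal` is a hypothetical helper, and `plastics` is assumed to come from the fma package.

```r
# Hypothetical helper: seasonal indices before and after adding `bump` at position `idx`.
compare_seasonal <- function(x, idx, bump, type = "multiplicative") {
  y <- x
  y[idx] <- y[idx] + bump                        # inject the outlier
  base    <- decompose(x, type = type)$figure    # one seasonal index per period
  shifted <- decompose(y, type = type)$figure
  data.frame(period = seq_along(base),
             original = base,
             with_outlier = shifted,
             change = shifted - base)
}

# Assumes the fma package (which ships `plastics`) is installed.
if (requireNamespace("fma", quietly = TRUE)) {
  print(compare_seasonal(fma::plastics, idx = 14, bump = 600))
}
```

Observation 14 is the second month of the second year, so the largest change is expected in period 2, with smaller offsetting shifts elsewhere because the indices are normalised.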
plastics_cp3 <- plastics
plastics_cp3[52] <- plastics_cp3[52] + 600
plastics_out3_ts <- ts(plastics_cp3, frequency=frequency(plastics_cp3), start=c(2017,1))
plastics_out3_ts %>% seas(x11="") -> fit3
autoplot(plastics_out3_ts, series="Data") +
autolayer(trendcycle(fit3), series="Trend") +
autolayer(seasadj(fit3), series="Seasonally Adjusted") +
  xlab("Year") + ylab("Sales") +
ggtitle("Sales of plastic product") +
scale_colour_manual(values=c("gray","blue","red"),
breaks=c("Data","Seasonally Adjusted","Trend"))
# Classical decomposition of the series with the late outlier
plastics_cp3 %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Classical multiplicative decomposition\nof Sales of plastic product")
The outlier towards the end of the series appears to have a smaller impact on the trend and seasonality. It is unclear whether adding 600 is less impactful at the end of the series because the values increase over time, so the addition is proportionally smaller, or whether the model is more influenced by earlier data points. Again, most of the outlier’s effect is captured in the remainder.
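One way to probe the first explanation is to express the 600-unit bump relative to the original value at each position. A minimal sketch; `relative_bump` is a hypothetical helper, and `plastics` is assumed to come from the fma package.

```r
# Hypothetical helper: size of the added amount relative to the original observation.
relative_bump <- function(x, idx, bump) {
  bump / as.numeric(x)[idx]
}

# Assumes the fma package (which ships `plastics`) is installed.
if (requireNamespace("fma", quietly = TRUE)) {
  print(round(relative_bump(fma::plastics, idx = c(14, 52), bump = 600), 2))
}
```

A clearly smaller ratio at observation 52 than at observation 14 would support the proportional-size explanation, although it does not rule out the second one.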