https://zhuanlan.zhihu.com/p/88103863
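In general, a time series can be decomposed into three parts: a trend component, a seasonal (periodic) component, and a remainder.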
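Written out, the two standard ways of combining these three components are the additive and the multiplicative form. The notation below follows the log identity used later in these notes (Y_t for the observed value; the subscript on the components is left implicit):

\[ Y_t = \text{Trend} + \text{Seasonal} + \text{Irregular} \quad \text{(additive)} \]
\[ Y_t = \text{Trend} \times \text{Seasonal} \times \text{Irregular} \quad \text{(multiplicative)} \]

The additive form suits series whose seasonal swings stay roughly constant in size; the multiplicative form suits series whose swings grow with the level of the series.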
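Traditional (classical) decomposition comes in two forms: additive and multiplicative. Classical decomposition is no longer recommended, for the following reasons:
1. The trend estimate is always missing values at the start and end of the series. This follows from the moving average used to estimate it: a centered moving average of order m needs about m/2 observations on each side of a point, so roughly the first and last m/2 trend values cannot be computed;
2. It is insensitive to sudden rises and drops and tends to over-smooth, so abrupt events are hard to capture;
3. Its treatment of seasonality is rigid: the seasonal pattern is assumed to be fixed, so if the seasonal fluctuation changes over time, classical decomposition cannot capture it.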
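Drawbacks of classical time series decomposition:
- It cannot estimate the trend-cycle for the first few and last few observations.
- Its trend-cycle estimate tends to over-smooth rapid rises and falls in the data.
- It assumes that the seasonal component repeats from year to year.
- It is usually not robust to outliers.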
library(fpp2)
# type="multiplicative"/ type="additive"
# elecequip %>% decompose(type="multiplicative") %>%
# autoplot() + xlab("Year") +
# ggtitle("Classical multiplicative decomposition of electrical equipment index")
plot(decompose(elecequip,type="multiplicative"))
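The missing end values are easy to see directly. The sketch below is only an illustration: it swaps in the built-in AirPassengers series (so that nothing beyond base R is needed) and counts the NA values in the trend returned by decompose().

```r
# Classical additive decomposition of a built-in monthly series
dec <- decompose(AirPassengers, type = "additive")

# The trend comes from a centered 2x12 moving average,
# so it cannot be computed at either end of the series
n_missing <- sum(is.na(dec$trend))

head(dec$trend, 8)   # the first 6 values are NA
n_missing            # 12 in total: 6 at the start and 6 at the end
```

For monthly data this means a full year of trend estimates is lost, half of it at the most recent end of the series, which is usually where the estimate matters most.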
“In particular, trend-cycle estimates are available for all observations including the end points, and the seasonal component is allowed to vary slowly over time. X11 also has some sophisticated methods for handling trading day variation, holiday effects and the effects of known predictors. It handles both additive and multiplicative decomposition. The process is entirely automatic and tends to be highly robust to outliers and level shifts in the time series.”
In other words, X11 estimates the trend-cycle all the way to both ends of the series, lets the seasonal component change slowly over time, can account for trading-day and holiday effects, supports both additive and multiplicative decomposition, and is fully automatic and robust to outliers.
library(seasonal)
elecequip %>% seas(x11="") %>%
autoplot() +
ggtitle("X11 decomposition of electrical equipment index")+
theme(text = element_text(family = "STSong"))+
theme(plot.title = element_text(hjust = 0.5))
SEATS stands for Seasonal Extraction in ARIMA Time Series. Its limitation is that it only works with quarterly and monthly data. Other kinds of seasonality, such as in daily, hourly, or weekly data, therefore require a different method.
http://www.seasonal.website/seasonal.html
library(seasonal)
autoplot(seas(elecequip)) +
ggtitle("SEATS decomposition of electrical equipment index")+
theme(text = element_text(family = "STHeiti"))+
theme(plot.title = element_text(hjust = 0.5))
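STL stands for Seasonal and Trend decomposition using Loess; that is, both the seasonal and the trend components are estimated by local polynomial regression. It has several advantages over the methods described above:
1. SEATS and X11 only handle quarterly and monthly data, whereas STL handles any type of seasonality;
2. The seasonal component can change over time, and the rate of change can be set by the user;
3. The smoothness of the trend-cycle can also be set by the user;
4. It can be made robust to outliers (robust=TRUE), so that occasional unusual observations do not distort the trend-cycle and seasonal estimates; they do, however, show up as larger values in the remainder.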
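STL also has drawbacks: it only performs additive decomposition, and it cannot adjust for trading days. If a multiplicative decomposition is needed, take logs of the series first, apply STL, and then back-transform the components.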
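The log trick can be sketched in a few lines of base R. This is a minimal illustration, assuming the built-in AirPassengers series (strictly positive, so the log is defined) and s.window = "periodic" chosen only for simplicity.

```r
y <- AirPassengers

# Additive STL on the log scale
fit <- stl(log(y), s.window = "periodic", robust = TRUE)

# Back-transform: the additive components on the log scale
# become multiplicative factors on the original scale
comp <- exp(fit$time.series)

# The three factors multiply back to the original series
recon <- comp[, "seasonal"] * comp[, "trend"] * comp[, "remainder"]
all.equal(as.numeric(recon), as.numeric(y))
```

After back-transforming, the seasonal and remainder columns are ratios centered around 1 rather than offsets centered around 0, and the trend is in the original units.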
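The two main parameters are the trend-cycle window (t.window) and the seasonal window (s.window). They control how quickly the trend-cycle and seasonal components can change; smaller values allow faster change. Both should be odd numbers: t.window is the number of consecutive observations used to estimate the trend-cycle, and s.window is the number of consecutive years used to estimate each value of the seasonal component.
s.window must be set by the user because it has no default. Setting it to infinity (s.window="periodic") forces the seasonal component to be periodic, i.e. identical across years. t.window is optional; a default is used if it is omitted.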
library(fpp2)
#elecequip %>%
# stl(t.window=13, s.window="periodic", robust=TRUE) %>%
# autoplot()
fcast <- stl(elecequip,t.window=13, s.window="periodic", robust=TRUE)
head(fcast)
$time.series
seasonal trend remainder
Jan 1996 -4.951454 81.35054 2.950917980
Feb 1996 -5.776154 80.91799 0.638167053
Mar 1996 7.921218 80.48544 -2.086656500
Apr 1996 -5.979828 80.10324 -1.523412992
May 1996 -4.686905 79.72104 -0.174137282
Jun 1996 7.792969 79.42071 -3.403673831
Jul 1996 -1.776216 79.12037 2.455847705
Aug 1996 -16.669933 79.38534 -0.305407239
Sep 1996 7.547602 79.65031 -1.787913640
Oct 1996 2.997748 80.15073 -0.038475259
Nov 1996 3.925423 80.65114 -0.366566218
Dec 1996 9.655530 81.29166 -1.247192453
Jan 1997 -4.951454 81.93218 1.659272011
Feb 1997 -5.776154 82.47371 0.722443215
Mar 1997 7.921218 83.01524 -1.076458206
Apr 1997 -5.979828 83.47181 3.778021758
May 1997 -4.686905 83.92837 -0.561466077
Jun 1997 7.792969 84.27065 -2.553622010
Jul 1997 -1.776216 84.61294 0.833280141
Aug 1997 -16.669933 84.91120 1.558728445
Sep 1997 7.547602 85.20947 -1.667074708
Oct 1997 2.997748 85.55528 0.876971740
Nov 1997 3.925423 85.90109 1.213488848
Dec 1997 9.655530 86.11876 -2.904290525
Jan 1998 -4.951454 86.33643 0.485020803
Feb 1998 -5.776154 86.32070 4.815457097
Mar 1998 7.921218 86.30496 -1.246179234
Apr 1998 -5.979828 86.07967 0.990158006
May 1998 -4.686905 85.85438 4.472527447
Jun 1998 7.792969 85.41810 -2.071073016
Jul 1998 -1.776216 84.98183 0.254384604
Aug 1998 -16.669933 84.63683 -1.596897219
Sep 1998 7.547602 84.29183 1.500569502
Oct 1998 2.997748 84.41304 -1.480790382
Nov 1998 3.925423 84.53426 -1.649679607
Dec 1998 9.655530 84.91839 -1.273917926
Jan 1999 -4.951454 85.30252 1.238934455
Feb 1999 -5.776154 86.15337 1.392781142
Mar 1999 7.921218 87.00423 -3.685444797
Apr 1999 -5.979828 88.31774 -2.887910868
May 1999 -4.686905 89.63125 2.045655264
Jun 1999 7.792969 91.09100 -2.283965223
Jul 1999 -1.776216 92.55074 7.215472374
Aug 1999 -16.669933 94.18712 1.612811085
Sep 1999 7.547602 95.82350 0.188898340
Oct 1999 2.997748 97.39144 0.500817212
Nov 1999 3.925423 98.95937 -3.484793256
Dec 1999 9.655530 100.53613 1.608336626
Jan 2000 -4.951454 102.11290 -1.861442791
Feb 2000 -5.776154 103.57941 -0.033259347
Mar 2000 7.921218 105.04593 3.262851472
Apr 2000 -5.979828 106.47777 0.482054998
May 2000 -4.686905 107.90961 0.847290726
Jun 2000 7.792969 109.10559 -2.258557225
Jul 2000 -1.776216 110.30156 -0.905347092
Aug 2000 -16.669933 110.88524 1.904694362
Sep 2000 7.547602 111.46891 4.483484359
Oct 2000 2.997748 111.28701 1.835246391
Nov 2000 3.925423 111.10510 1.829479083
Dec 2000 9.655530 110.20745 8.747019057
Jan 2001 -4.951454 109.30980 -3.798350268
Feb 2001 -5.776154 107.82629 0.999861072
Mar 2001 7.921218 106.34278 4.795999787
Apr 2001 -5.979828 104.69439 -6.254557991
May 2001 -4.686905 103.04599 0.390916434
Jun 2001 7.792969 101.27587 2.071159744
Jul 2001 -1.776216 99.50575 -1.599538861
Aug 2001 -16.669933 98.38176 -1.991830716
Sep 2001 7.547602 97.25777 -2.735374027
Oct 2001 2.997748 96.60027 -3.418019995
Nov 2001 3.925423 95.94277 1.391804697
Dec 2001 9.655530 95.71897 4.475497131
Jan 2002 -4.951454 95.49517 -1.023719733
Feb 2002 -5.776154 95.17570 -0.129544351
Mar 2002 7.921218 94.85622 1.572558406
Apr 2002 -5.979828 94.27220 -1.242370672
May 2002 -4.686905 93.68817 0.328732453
Jun 2002 7.792969 93.31346 1.093574116
Jul 2002 -1.776216 92.93874 -3.032526137
Aug 2002 -16.669933 92.70736 -0.357428410
Sep 2002 7.547602 92.47598 -0.543582139
Oct 2002 2.997748 92.43478 0.967475741
Nov 2002 3.925423 92.39357 -0.158995719
Dec 2002 9.655530 92.30554 -0.961067285
Jan 2003 -4.951454 92.21750 2.073951850
Feb 2003 -5.776154 92.17510 0.511052887
Mar 2003 7.921218 92.13270 -1.153918703
Apr 2003 -5.979828 92.35429 -0.834464863
May 2003 -4.686905 92.57588 -2.638978822
Jun 2003 7.792969 93.16112 0.185914273
Jul 2003 -1.776216 93.74635 -0.170134547
Aug 2003 -16.669933 94.58090 -0.930972127
Sep 2003 7.547602 95.41546 1.366938837
Oct 2003 2.997748 95.99587 0.726378187
Nov 2003 3.925423 96.57629 0.558288196
Dec 2003 9.655530 96.88313 2.461343968
Jan 2004 -4.951454 97.18996 -2.358509559
Feb 2004 -5.776154 97.21492 0.831234967
Mar 2004 7.921218 97.23988 -0.051093132
Apr 2004 -5.979828 97.21215 0.267680456
May 2004 -4.686905 97.18442 0.062486247
Jun 2004 7.792969 97.11237 -0.555341879
Jul 2004 -1.776216 97.04033 0.945888079
Aug 2004 -16.669933 96.91965 -0.669716488
Sep 2004 7.547602 96.79897 1.083427489
Oct 2004 2.997748 96.76474 -0.582492027
Nov 2004 3.925423 96.73052 -0.885940883
Dec 2004 9.655530 96.79291 7.101561726
Jan 2005 -4.951454 96.85530 -0.253844965
Feb 2005 -5.776154 97.19779 -0.861636232
Mar 2005 7.921218 97.54028 0.058499876
Apr 2005 -5.979828 98.19674 -0.036911706
May 2005 -4.686905 98.85320 -2.946291086
Jun 2005 7.792969 99.60269 1.644343318
Jul 2005 -1.776216 100.35218 0.684035807
Aug 2005 -16.669933 101.14300 -1.113066764
Sep 2005 7.547602 101.93382 1.318579209
Oct 2005 2.997748 102.70299 -0.750741300
Nov 2005 3.925423 103.47217 -0.327591149
Dec 2005 9.655530 104.38454 0.359933543
Jan 2006 -4.951454 105.29690 -1.185451065
Feb 2006 -5.776154 106.22678 -0.590627950
Mar 2006 7.921218 107.15666 1.062122540
Apr 2006 -5.979828 107.95910 1.500730171
May 2006 -4.686905 108.76154 -1.004629995
Jun 2006 7.792969 109.43108 2.095955428
Jul 2006 -1.776216 110.10062 -0.384401065
Aug 2006 -16.669933 110.31020 -3.050268676
Sep 2006 7.547602 110.51979 3.732612256
Oct 2006 2.997748 110.56757 3.544678070
Nov 2006 3.925423 110.61536 -0.830785455
Dec 2006 9.655530 110.70349 0.010980661
Jan 2007 -4.951454 110.79162 -1.910162521
Feb 2007 -5.776154 110.99761 -1.121458178
Mar 2007 7.921218 111.20361 6.595173540
Apr 2007 -5.979828 111.50688 -0.827049850
May 2007 -4.686905 111.81015 1.326758963
Jun 2007 7.792969 112.18828 3.128754659
Jul 2007 -1.776216 112.56641 -1.900191562
Aug 2007 -16.669933 112.80304 -2.063106481
Sep 2007 7.547602 113.03967 1.292727144
Oct 2007 2.997748 113.13677 0.675483721
Nov 2007 3.925423 113.23387 -1.289289041
Dec 2007 9.655530 113.16916 4.315306950
Jan 2008 -4.951454 113.10446 1.296993642
Feb 2008 -5.776154 112.46753 -1.461372665
Mar 2008 7.921218 111.83059 1.568188402
Apr 2008 -5.979828 110.65486 4.104963012
May 2008 -4.686905 109.47914 -1.592230176
Jun 2008 7.792969 107.84106 2.295970424
Jul 2008 -1.776216 106.20299 -0.666770892
Aug 2008 -16.669933 103.82479 2.115146716
Sep 2008 7.547602 101.44659 0.505812868
Oct 2008 2.997748 98.26394 2.758314838
Nov 2008 3.925423 95.08129 1.113287469
Dec 2008 9.655530 91.95562 -0.431145022
Jan 2009 -4.951454 88.82994 -6.498486811
Feb 2009 -5.776154 86.24744 -5.281289535
Mar 2009 7.921218 83.66495 -5.186164883
Apr 2009 -5.979828 81.90465 -1.794826480
May 2009 -4.686905 80.14436 -1.357455875
Jun 2009 7.792969 80.31661 -2.499579852
Jul 2009 -1.776216 80.48886 1.187354255
Aug 2009 -16.669933 81.10849 0.921441296
Sep 2009 7.547602 81.72812 -1.185723120
Oct 2009 2.997748 82.43057 -0.828321546
Nov 2009 3.925423 83.13303 1.031550688
Dec 2009 9.655530 84.09764 8.766833335
Jan 2010 -4.951454 85.06225 -0.830793318
Feb 2010 -5.776154 86.26165 -1.745498931
Mar 2010 7.921218 87.46106 -0.762277171
Apr 2010 -5.979828 88.67778 1.962044801
May 2010 -4.686905 89.89451 -0.007601025
Jun 2010 7.792969 90.80762 5.339407158
Jul 2010 -1.776216 91.72074 -0.074526575
Aug 2010 -16.669933 92.43823 2.371698377
Sep 2010 7.547602 93.15573 -4.203328128
Oct 2010 2.997748 93.66953 -1.987273581
Nov 2010 3.925423 94.18333 3.661251627
Dec 2010 9.655530 94.69753 -0.873064866
Jan 2011 -4.951454 95.21174 2.309709341
Feb 2011 -5.776154 95.55742 -0.621267448
Mar 2011 7.921218 95.90310 0.655683137
Apr 2011 -5.979828 95.69042 -0.260589281
May 2011 -4.686905 95.47773 2.609170503
Jun 2011 7.792969 94.64883 0.458199896
Jul 2011 -1.776216 93.81993 1.726287373
Aug 2011 -16.669933 92.89703 1.352905249
Sep 2011 7.547602 91.97413 -4.481728330
Oct 2011 2.997748 91.47537 -2.703120905
Nov 2011 3.925423 90.97662 -1.532042819
Dec 2011 9.655530 90.48989 -1.805418828
Jan 2012 -4.951454 90.00316 1.388295863
Feb 2012 -5.776154 89.63097 1.185180969
Mar 2012 7.921218 89.25879 0.619993449
$weights
[1] 0.74384858 0.98722214 0.86779454 0.92796956 0.99902934 0.66741969 0.81878152 0.99704750
[9] 0.90157427 0.99995208 0.99572147 0.95080235 0.91438160 0.98348427 0.96414682 0.60027851
[17] 0.99002219 0.80444279 0.97826181 0.92488132 0.91405884 0.97596194 0.95431186 0.74973809
[25] 0.99246360 0.40189310 0.95186414 0.96943029 0.46878500 0.86880875 0.99798544 0.92100809
[33] 0.93026946 0.93193582 0.91579216 0.94878789 0.95156020 0.93936352 0.61857650 0.75419578
[41] 0.87231267 0.84189144 0.03188847 0.91970206 0.99889216 0.99212722 0.65309107 0.92085052
[49] 0.89436314 0.99997190 0.69094649 0.99265672 0.97748320 0.84528476 0.97425251 0.88869417
[57] 0.46580898 0.89623245 0.89692337 0.00000000 0.59922249 0.96792494 0.40254249 0.14775798
[65] 0.99499534 0.86865737 0.92111189 0.87860508 0.77712536 0.66443181 0.94030829 0.47022864
[73] 0.96731066 0.99946552 0.92307877 0.95178095 0.99661101 0.96270219 0.73072121 0.99596788
[81] 0.99068582 0.97068118 0.99918718 0.97053495 0.86804512 0.99167378 0.95888906 0.97815862
[89] 0.79215448 0.99892557 0.99908019 0.97280404 0.94192937 0.98344314 0.99026198 0.81920951
[97] 0.83291690 0.97817471 0.99993658 0.99773947 0.99988013 0.99022955 0.97200967 0.98580020
[105] 0.96353935 0.98916030 0.97502814 0.04285976 0.99799366 0.97662376 0.99988263 0.99995169
[113] 0.74451474 0.91665833 0.98531530 0.96123966 0.94591308 0.98226884 0.99658041 0.99611279
[121] 0.95659085 0.98912200 0.96428746 0.93017048 0.96833425 0.86635824 0.99531488 0.72772314
[129] 0.60872284 0.64277187 0.97826359 1.00000000 0.88901019 0.96098708 0.09725082 0.97858370
[137] 0.94521605 0.71507955 0.88917074 0.86993752 0.94817627 0.98579237 0.94779627 0.50098043
[145] 0.94741484 0.93367432 0.92353370 0.53918440 0.92137096 0.84083847 0.98596906 0.86339268
[153] 0.99172314 0.77190314 0.95970051 0.99466610 0.11665421 0.32040855 0.33931730 0.90287758
[161] 0.94262407 0.81222788 0.95610555 0.97345802 0.95600037 0.97836635 0.96691397 0.00000000
[169] 0.97856232 0.90619556 0.98191779 0.88243782 0.99999654 0.30317735 0.99981273 0.83060822
[177] 0.51966372 0.87906469 0.62199758 0.97557766 0.83789924 0.98794849 0.98622581 0.99785608
[185] 0.79676393 0.99342066 0.90815828 0.94304061 0.46651138 0.78259688 0.92706106 0.89867624
[193] 0.93967881 0.95612448 0.98778581
$call
stl(x = elecequip, s.window = "periodic", t.window = 13, robust = TRUE)
$win
s t l
1951 13 13
$deg
s t l
0 1 1
$jump
s t l
196 2 2
fcast %>% autoplot()
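To make the effect of s.window concrete, the sketch below compares a periodic seasonal component with a flexible one. It is an illustration only, using the built-in nottem series and an arbitrary window of 7 years.

```r
# Same series, two seasonal windows
fit_fixed <- stl(nottem, s.window = "periodic")
fit_flex  <- stl(nottem, s.window = 7, t.window = 13)

s_fixed <- fit_fixed$time.series[, "seasonal"]
s_flex  <- fit_flex$time.series[, "seasonal"]

# Seasonal effect estimated for January in each year
jan_fixed <- s_fixed[cycle(s_fixed) == 1]
jan_flex  <- s_flex[cycle(s_flex) == 1]

diff(range(jan_fixed))   # 0: the January effect is the same every year
diff(range(jan_flex))    # positive: the January effect drifts over time
```

A small s.window lets the seasonal pattern follow the data more closely, at the risk of absorbing noise that belongs in the remainder.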
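The mstl() function provides a convenient automated STL decomposition: it uses s.window=13 and also chooses t.window automatically. This usually gives a good balance between overfitting the seasonality and allowing it to change slowly over time. As with any automated procedure, though, the default settings need adjusting for some time series.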
library(forecast)
# With hourly data there are 24 periods per day and, on average, 24 * 365.25 periods per year.
# data_ts <- mstl(msts(data, seasonal.periods = c(24, 24*365.25)))
# With monthly data the frequency is 12.
# data_ts <- mstl(ts(data, frequency = 12))
fcast <- mstl(ts(elecequip, frequency = 12))
head(fcast)
Data Trend Seasonal12 Remainder
Jan 1 79.35 79.67697 -4.690051 4.3630765
Feb 1 75.78 79.74462 -4.600674 0.6360492
Mar 1 86.32 79.81227 7.797572 -1.2898461
Apr 1 72.60 79.87992 -6.809679 -0.4702458
May 1 74.86 79.99255 -4.240861 -0.8916850
Jun 1 83.81 80.10517 6.217074 -2.5122411
fcast %>% autoplot()
Lazy mode:
A short-cut approach is to use the stlf() function. The following code will decompose the time series using STL, forecast the seasonally adjusted series, and return the reseasonalised forecasts.
fcast <- stlf(elecequip, method='naive')
fcast %>% autoplot()
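The three steps that stlf() bundles together can be spelled out by hand. The sketch below is a hand-rolled illustration in base R, not the forecast package's implementation: it uses the built-in nottem series, a periodic seasonal window, a 12-month horizon, and produces point forecasts only, all of which are assumptions made for the example.

```r
y <- nottem
m <- frequency(y)   # 12 for monthly data
h <- 12             # forecast horizon (assumed for illustration)

# 1. Decompose with STL
fit  <- stl(y, s.window = "periodic")
seas <- fit$time.series[, "seasonal"]

# 2. Seasonally adjust, then forecast the adjusted series with the naive method
adj    <- y - seas
adj_fc <- rep(as.numeric(tail(adj, 1)), h)

# 3. Reseasonalise by repeating the last year of the seasonal component
last_year <- as.numeric(tail(seas, m))
seas_fc   <- last_year[(seq_len(h) - 1) %% m + 1]

fc <- ts(adj_fc + seas_fc, start = tsp(y)[2] + 1 / m, frequency = m)
fc
```

The naive method simply carries the last seasonally adjusted value forward, so all of the shape in the forecast comes from the seasonal component that is added back in step 3.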
library(fpp2)
data(nottem)
head(nottem)
stl(nottem, s.window = "periodic", t.window = 50)
# Panels: the original series, the seasonal component, the trend, and the remainder
plot(stl(nottem, s.window = "periodic", t.window = 50))
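Use the monthplot() function to view a month plot of the series. It groups the same month from different years into subseries and draws them all in one figure; the horizontal lines are the subseries means. The plot shows both the overall trend and the seasonal pattern.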
monthplot(nottem, xlab="", ylab="")
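The horizontal lines in the month plot can also be computed as numbers. A minimal sketch in base R, assuming the same nottem series:

```r
# Mean of each monthly subseries (all Januaries, all Februaries, ...)
month_means <- tapply(nottem, cycle(nottem), mean)
names(month_means) <- month.abb
round(month_means, 1)
```

Comparing these twelve means gives a quick numeric summary of the seasonal pattern that the plot shows graphically.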
\[ \log(Y_t) = \log(\text{Trend} \times \text{Seasonal} \times \text{Irregular}) = \log(\text{Trend}) + \log(\text{Seasonal}) + \log(\text{Irregular}) \]
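R's built-in decompose() function performs seasonal decomposition directly for both additive and multiplicative models:
data(AirPassengers)
data <- decompose(AirPassengers, type = "multiplicative")
# Components: data$seasonal, data$trend, data$random
plot(data)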
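Once the seasonal component has been extracted, the series can be seasonally adjusted if needed, that is, the seasonal information can be removed from the original data.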
data <- decompose(nottem,type='additive')
data2 <- nottem-data$seasonal
# The left panel shows the original data and the right panel the seasonally adjusted data,
# which now contains only the trend and random components.
par(mfrow=c(1,2))
plot(nottem)
plot(data2)
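For a multiplicative decomposition the seasonal component is a ratio, so the adjustment divides instead of subtracting. A minimal sketch, assuming the built-in AirPassengers series:

```r
dec <- decompose(AirPassengers, type = "multiplicative")

# Multiplicative seasonal adjustment: divide out the seasonal factors
adj <- AirPassengers / dec$seasonal

# Where the trend is available, the adjusted series equals trend * random
ok <- !is.na(dec$trend)
all.equal(as.numeric(adj)[ok], as.numeric(dec$trend * dec$random)[ok])

par(mfrow = c(1, 2))
plot(AirPassengers)
plot(adj)
```

Subtracting the seasonal factors here, as in the additive case, would leave most of the seasonal swing in place, because the factors are close to 1 rather than in the units of the data.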