A new coronavirus designated 2019-nCoV was first identified in Wuhan,
the capital of China’s Hubei province. On 7 January 2020, Chinese
authorities confirmed COVID-19, and on 30 January 2020 the
Director-General of the WHO declared the COVID-19 outbreak a Public Health
Emergency of International Concern. On 8 March 2020, Bangladesh
confirmed its first three laboratory-confirmed coronavirus cases.
Infected people developed pneumonia without a clear cause, for which
existing vaccines and treatments were not effective. The virus showed
evidence of human-to-human transmission, and the transmission rate
(rate of infection) appeared to escalate in mid-January 2020.
This dataset contains daily data on the COVID-19 outbreak in Bangladesh.
It records confirmed, recovered, and death cases by age and division
from 20 December 2020 to 31 July 2022. It is a time-series dataset and
is updated on a daily basis.
With it we can forecast the daily COVID confirmed rate, confirmed
population rate, recovered rate, recovered population rate, death rate
by gender, death rate by age, and death rate by division in Bangladesh,
to support preparation for future pandemics.
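As a minimal sketch of loading the data (the file name covid19_bangladesh_daily.csv is hypothetical; the 588 x 78 layout is taken from the tsibble output further below):

```r
# Sketch: read the daily Bangladesh COVID-19 dataset and preview it
# (the file name is a placeholder, not the actual source file)
covid <- read.csv("covid19_bangladesh_daily.csv", stringsAsFactors = FALSE)

head(covid)   # first rows, as printed below
dim(covid)    # expected: 588 daily observations x 78 variables
```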
## date daily_covid_confirmed total_covid_confirmed
## 1 12/20/2020 1153 500713
## 2 12/21/2020 1470 502183
## 3 12/22/2020 1318 503501
## 4 12/23/2020 1367 504868
## 5 12/25/2020 NA NA
## 6 12/26/2020 NA NA
## daily_covid_confirmed_rate total_covid_confirmed_rate
## 1 8.66 16.28
## 2 9.38 16.24
## 3 8.70 16.21
## 4 8.58 16.17
## 5 NA NA
## 6 NA NA
## covid_confirmed_population_rate daily_recovered total_recovered
## 1 2940.07 1926 437527
## 2 2948.70 2167 439694
## 3 2956.44 2235 441929
## 4 2964.47 2416 444345
## 5 NA NA NA
## 6 NA NA NA
## recovered_rate recovered_population_rate daily_death total_death death_rate
## 1 87.38 2569.06 38 7280 1.45
## 2 87.56 2581.78 32 7312 1.46
## 3 87.77 2549.90 17 7329 1.46
## 4 88.01 2609.09 30 7359 1.46
## 5 NA NA NA NA NA
## 6 NA NA NA NA NA
## death_population_rate male_daily_death male_total_death male_death_rate
## 1 42.75 30 5552 76.26
## 2 42.93 23 5575 76.24
## 3 43.03 11 5586 76.22
## 4 43.21 21 5607 76.19
## 5 NA NA NA NA
## 6 NA NA NA NA
## female_daily_death female_total_death female_death_rate age_0_10_daily_death
## 1 8 1728 23.74 1
## 2 9 1737 23.76 0
## 3 6 1743 23.74 0
## 4 9 1752 23.81 0
## 5 NA NA NA NA
## 6 NA NA NA NA
## age_0_10_total_death age_0_10_death_rate age_11_20_daily_death
## 1 34 0.47 0
## 2 34 0.46 1
## 3 34 0.46 0
## 4 34 0.46 0
## 5 NA NA NA
## 6 NA NA NA
## age_11_20_total_death age_11_20_death_rate age_21_30_daily_death
## 1 55 0.76 1
## 2 56 0.77 1
## 3 56 0.76 0
## 4 56 0.76 0
## 5 NA NA NA
## 6 NA NA NA
## age_21_30_total_death age_21_30_death_rate age_31_40_daily_death
## 1 158 2.17 1
## 2 159 2.17 1
## 3 159 2.17 0
## 4 159 2.16 2
## 5 NA NA NA
## 6 NA NA NA
## age_31_40_total_death age_31_40_death_rate age_41_50_daily_death
## 1 368 5.05 4
## 2 369 5.05 3
## 3 369 5.03 2
## 4 371 5.04 5
## 5 NA NA NA
## 6 NA NA NA
## age_41_50_total_death age_41_50_death_rate age_51_60_daily_death
## 1 856 11.76 9
## 2 859 11.75 5
## 3 861 11.75 4
## 4 866 11.77 5
## 5 NA NA NA
## 6 NA NA NA
## age_51_60_total_death age_51_60_death_rate age_61_70_daily_death
## 1 1861 25.56 22
## 2 1866 25.52 21
## 3 1870 25.52 11
## 4 1875 25.48 18
## 5 NA NA NA
## 6 NA NA NA
## age_61_70_total_death age_61_70_death_rate age_71_80_daily_death
## 1 3948 54.23 NA
## 2 3969 54.28 NA
## 3 3980 54.30 NA
## 4 3998 54.33 NA
## 5 NA NA NA
## 6 NA NA NA
## age_71_80_total_death age_71_80_death_rate age_81_90_daily_death
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## age_81_90_total_death age_81_90_death_rate age_91_100_daily_death
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## age_91_100_total_death age_91_100_death_rate age_100_above_daily_death
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
## age_100_above_total_death age_100_above_death_rate dh_daily_death
## 1 NA NA 22
## 2 NA NA 15
## 3 NA NA 9
## 4 NA NA 16
## 5 NA NA NA
## 6 NA NA NA
## dh_total_death dh_death_rate cg_daily_death cg_total_death cg_death_rate
## 1 3981 54.68 8 1351 18.56
## 2 3996 54.65 8 1359 18.59
## 3 4005 54.65 4 1363 18.60
## 4 4021 54.64 9 1372 18.64
## 5 NA NA NA NA NA
## 6 NA NA NA NA NA
## rs_daily_death rs_total_death rs_death_rate kh_daily_death kh_total_death
## 1 2 428 5.88 0 516
## 2 1 429 5.87 1 517
## 3 1 430 5.87 1 518
## 4 1 431 5.86 1 519
## 5 NA NA NA NA NA
## 6 NA NA NA NA NA
## kh_death_rate ba_daily_death ba_total_death ba_death_rate sy_daily_death
## 1 7.09 1 232 3.19 3
## 2 7.07 3 235 3.21 3
## 3 7.07 0 235 3.21 1
## 4 7.05 1 236 3.21 0
## 5 NA NA NA NA NA
## 6 NA NA NA NA NA
## sy_total_death sy_death_rate rp_daily_death rp_total_death rp_death_rate
## 1 286 3.93 1 327 4.49
## 2 289 3.95 1 328 4.49
## 3 290 3.96 1 329 4.49
## 4 290 3.94 2 331 4.50
## 5 NA NA NA NA NA
## 6 NA NA NA NA NA
## mm_daily_death mm_total_death mm_death_rate
## 1 1 159 2.18
## 2 0 159 2.17
## 3 0 159 2.17
## 4 0 159 2.16
## 5 NA NA NA
## 6 NA NA NA
Column names of the dataset:
## [1] "date" "daily_covid_confirmed"
## [3] "total_covid_confirmed" "daily_covid_confirmed_rate"
## [5] "total_covid_confirmed_rate" "covid_confirmed_population_rate"
## [7] "daily_recovered" "total_recovered"
## [9] "recovered_rate" "recovered_population_rate"
## [11] "daily_death" "total_death"
## [13] "death_rate" "death_population_rate"
## [15] "male_daily_death" "male_total_death"
## [17] "male_death_rate" "female_daily_death"
## [19] "female_total_death" "female_death_rate"
## [21] "age_0_10_daily_death" "age_0_10_total_death"
## [23] "age_0_10_death_rate" "age_11_20_daily_death"
## [25] "age_11_20_total_death" "age_11_20_death_rate"
## [27] "age_21_30_daily_death" "age_21_30_total_death"
## [29] "age_21_30_death_rate" "age_31_40_daily_death"
## [31] "age_31_40_total_death" "age_31_40_death_rate"
## [33] "age_41_50_daily_death" "age_41_50_total_death"
## [35] "age_41_50_death_rate" "age_51_60_daily_death"
## [37] "age_51_60_total_death" "age_51_60_death_rate"
## [39] "age_61_70_daily_death" "age_61_70_total_death"
## [41] "age_61_70_death_rate" "age_71_80_daily_death"
## [43] "age_71_80_total_death" "age_71_80_death_rate"
## [45] "age_81_90_daily_death" "age_81_90_total_death"
## [47] "age_81_90_death_rate" "age_91_100_daily_death"
## [49] "age_91_100_total_death" "age_91_100_death_rate"
## [51] "age_100_above_daily_death" "age_100_above_total_death"
## [53] "age_100_above_death_rate" "dh_daily_death"
## [55] "dh_total_death" "dh_death_rate"
## [57] "cg_daily_death" "cg_total_death"
## [59] "cg_death_rate" "rs_daily_death"
## [61] "rs_total_death" "rs_death_rate"
## [63] "kh_daily_death" "kh_total_death"
## [65] "kh_death_rate" "ba_daily_death"
## [67] "ba_total_death" "ba_death_rate"
## [69] "sy_daily_death" "sy_total_death"
## [71] "sy_death_rate" "rp_daily_death"
## [73] "rp_total_death" "rp_death_rate"
## [75] "mm_daily_death" "mm_total_death"
## [77] "mm_death_rate"
Processing the data into time series:
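A sketch of this conversion, assuming the date column is month/day/year text parsed with lubridate; as_tsibble() then detects the Date column as the index, which matches the message below:

```r
# Sketch: convert the data frame into a daily tsibble
library(dplyr)
library(lubridate)
library(tsibble)

covid_ts <- covid %>%
  mutate(as.date = mdy(date)) %>%  # proper Date column
  as_tsibble()                     # index is auto-detected from the Date column

covid_ts
```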
## Using `as.date` as index variable.
## # A tsibble: 588 x 78 [1D]
## date daily_covid_con… total_covid_con… daily_covid_con… total_covid_con…
## <chr> <int> <int> <dbl> <dbl>
## 1 12/20/20… 1153 500713 8.66 16.3
## 2 12/21/20… 1470 502183 9.38 16.2
## 3 12/22/20… 1318 503501 8.7 16.2
## 4 12/23/20… 1367 504868 8.58 16.2
## 5 12/25/20… NA NA NA NA
## 6 12/26/20… NA NA NA NA
## 7 12/27/20… 1049 509148 8.29 16.0
## 8 12/28/20… NA NA NA NA
## 9 12/29/20… 1181 511261 8.1 16.0
## 10 12/30/20… 1235 512496 8.11 15.9
## # … with 578 more rows, and 73 more variables:
## # covid_confirmed_population_rate <dbl>, daily_recovered <int>,
## # total_recovered <int>, recovered_rate <dbl>,
## # recovered_population_rate <dbl>, daily_death <int>, total_death <int>,
## # death_rate <dbl>, death_population_rate <dbl>, male_daily_death <int>,
## # male_total_death <int>, male_death_rate <dbl>, female_daily_death <int>,
## # female_total_death <dbl>, female_death_rate <dbl>, …
## [,1]
## 2020-12-20 1.45
## 2020-12-21 1.46
## 2020-12-22 1.46
## 2020-12-23 1.46
## 2020-12-25 1.46
## 2020-12-26 1.46
## [1] "xts" "zoo"
Converting xts to ts:
## [1] "ts"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.420 1.490 1.540 1.588 1.730 1.780
There is no significant change, so we continue with the actual distribution.
Timeplot for Age 0-10 daily death rate:
Classical decomposition: Additive Seasonality of “daily death
rate”:
Additive Seasonality of age_0_10_death_rate (“daily death rate”)
in classical decomposition:
Visualizing the age_0_10_death_rate (“daily death rate”) time
series data after adjusting for seasonality with classical
decomposition:
Checking stationarity after seasonal adjustment:
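Both tests reported below are available in the tseries package; a minimal sketch on the seasonally adjusted series (Adj_seas_dr is the object name shown in the output):

```r
# Sketch: formal stationarity checks on the seasonally adjusted death-rate series
library(tseries)

adf.test(Adj_seas_dr)   # H0: unit root (non-stationary); a small p-value suggests stationarity
kpss.test(Adj_seas_dr)  # H0: level stationarity; a small p-value suggests non-stationarity
```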
##
## Augmented Dickey-Fuller Test
##
## data: Adj_seas_dr
## Dickey-Fuller = -1.8449, Lag order = 8, p-value = 0.644
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_seas_dr): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_seas_dr
## KPSS Level = 1.7617, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference in order to make the series
stationary.
Taking a seasonal difference to get a stationary series:
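Since the data are daily with a weekly pattern, the seasonal difference is a lag-7 difference; a minimal sketch:

```r
# Sketch: lag-7 (weekly) seasonal difference of the seasonally adjusted series;
# the ADF and KPSS tests are then re-run on the differenced series below
Diff_Adj_seas_dr <- diff(Adj_seas_dr, lag = 7)
```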
Rechecking stationarity after taking the seasonal difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_seas_dr
## Dickey-Fuller = -3.6473, Lag order = 8, p-value = 0.02815
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_seas_dr): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_seas_dr
## KPSS Level = 1.0574, Truncation lag parameter = 6, p-value = 0.01
Hence, the small ADF p-value (p-value = 0.02815, less than alpha = 0.05)
suggests that the series is now stationary after taking the seasonal
difference.
Fitting the best model: Classical Decomposition + ARIMA:
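The two fits reported below can be reproduced along these lines with the forecast package (a sketch; the differenced series is assumed to carry a weekly frequency of 7, and object names follow the output):

```r
# Sketch: candidate ARIMA fits on the differenced, seasonally adjusted series
library(forecast)

fit_manual <- Arima(Diff_Adj_seas_dr, order = c(1, 1, 1), seasonal = c(1, 0, 0))  # ARIMA(1,1,1)(1,0,0)[7]
fit_auto   <- auto.arima(Diff_Adj_seas_dr)                                        # automatic order selection

summary(fit_manual)
summary(fit_auto)
```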
## Series: Diff_Adj_seas_dr
## ARIMA(1,1,1)(1,0,0)[7]
##
## Coefficients:
## ar1 ma1 sar1
## -0.3385 -0.7776 0.0972
## s.e. 0.0437 0.0280 0.0421
##
## sigma^2 = 1.389e-05: log likelihood = 2446.17
## AIC=-4884.33 AICc=-4884.26 BIC=-4866.84
## Series: Diff_Adj_seas_dr
## ARIMA(1,0,2) with non-zero mean
##
## Coefficients:
## ar1 ma1 ma2 mean
## 0.9585 -1.1087 0.3176 1e-04
## s.e. 0.0141 0.0436 0.0398 7e-04
##
## sigma^2 = 1.378e-05: log likelihood = 2453.51
## AIC=-4897.03 AICc=-4896.92 BIC=-4875.15
# Plotting the best model Forecast:
Final forecast of the Age 81-90 daily death rate with seasonal
adjustment (Classical Decomposition + ARIMA).
Adding the Model 2 forecast mean to the seasonality of the whole data
over the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 1.067588e-04 1.823047e-05 -4.674518e-06 -6.911082e-06 -4.838255e-06
## [6] -1.949816e-06 1.011492e-06 3.891053e-06 6.659844e-06 9.315533e-06
## [11] 1.186133e-05 1.430149e-05 1.664033e-05 1.888204e-05 2.103065e-05
## [16] 2.309003e-05 2.506389e-05 2.695577e-05 2.876908e-05 3.050708e-05
## [21] 3.217290e-05 3.376954e-05 3.529988e-05 3.676665e-05 3.817252e-05
## [26] 3.951999e-05 4.081151e-05 4.204939e-05 4.323586e-05 4.437306e-05
## [31] 4.546302e-05
## [,1]
## 2020-12-20 8.66
## 2020-12-21 9.38
## 2020-12-22 8.70
## 2020-12-23 8.58
## 2020-12-25 8.29
## 2020-12-26 8.29
## [1] "xts" "zoo"
## [,1]
## 2020-12-20 8.66
## [1] "ts"
## Time Series:
## Start = 18616
## End = 19203
## Frequency = 1
## [1] 8.66 9.38 8.70 8.58 8.29 8.29 8.29 8.10 8.10 8.11 7.65 8.18
## [13] 7.64 7.64 7.52 6.85 6.29 6.55 8.29 8.29 8.29 6.02 5.00 5.66
## [25] 4.90 3.96 3.96 3.96 3.96 3.96 3.96 3.96 3.34 3.34 3.34 4.06
## [37] 3.58 3.36 3.43 3.55 3.55 3.55 3.55 3.63 2.92 3.18 2.79 2.51
## [49] 2.35 2.30 2.67 2.59 2.65 2.82 2.26 2.53 3.15 2.68 2.67 2.68
## [61] 2.85 3.14 2.33 3.30 3.13 2.65 2.63 3.13 3.30 2.87 4.31 3.36
## [73] 3.74 3.87 4.63 4.13 4.30 4.98 5.13 5.98 5.82 6.62 6.26 7.15
## [85] 9.48 8.29 7.68 10.45 10.04 9.39 10.29 11.19 13.69 12.97 13.26 13.69
## [97] 14.90 17.65 18.38 18.94 19.90 22.94 23.28 23.15 23.07 23.40 21.02 22.02
## [109] 20.65 23.57 20.49 19.81 20.59 18.29 20.89 21.00 23.36 21.46 19.06 17.68
## [121] 16.85 15.07 14.63 14.00 13.11 13.33 12.82 12.51 10.48 9.39 10.34 9.61
## [133] 9.60 8.95 8.71 8.59 8.44 9.89 8.74 8.19 8.99 8.67 7.45 9.58
## [145] 10.82 6.95 6.69 6.75 7.55 7.83 7.50 8.22 8.41 8.90 8.15 10.08
## [157] 9.11 8.12 9.30 7.91 10.11 9.41 9.67 9.81 9.94 10.40 11.03 10.73
## [169] 11.47 12.12 12.33 13.25 13.24 14.12 12.99 14.80 14.27 16.62 15.44 18.59
## [181] 18.02 16.38 19.27 19.36 20.27 19.93 21.22 22.50 21.59 23.86 23.97 25.13
## [193] 25.90 28.27 27.39 28.99 29.30 31.46 31.32 31.62 30.95 31.46 29.67 31.24
## [205] 29.21 29.14 27.23 28.96 29.06 29.09 29.59 29.31 30.48 32.19 31.05 32.55
## [217] 30.04 29.82 28.44 3.12 29.21 30.77 30.24 29.97 29.91 28.54 27.91 27.12
## [229] 26.25 25.65 24.52 24.28 23.54 23.45 22.46 20.83 20.66 20.25 21.08 21.08
## [241] 17.67 17.64 17.18 16.71 15.16 15.54 15.12 14.76 13.77 12.78 13.67 14.14
## [253] 12.07 11.95 10.11 10.40 10.76 9.82 9.66 9.82 9.69 9.07 8.76 8.65
## [265] 7.03 7.46 7.69 6.54 6.64 5.98 6.41 6.05 5.62 5.67 4.69 4.79
## [277] 4.61 4.54 4.59 4.41 4.36 4.49 4.12 3.24 3.43 3.41 2.90 3.19
## [289] 2.72 2.88 2.97 2.77 2.45 2.36 2.58 2.35 2.34 2.16 2.09 1.88
## [301] 1.74 1.80 2.20 1.80 1.51 1.36 1.85 1.49 1.39 1.44 1.53 1.50
## [313] 1.71 1.25 1.22 1.08 1.14 1.31 1.32 1.12 1.18 1.17 1.28 1.18
## [325] 1.31 1.21 1.28 1.11 1.14 1.32 1.03 1.35 1.25 1.40 1.18 1.16
## [337] 1.42 1.45 1.49 1.25 1.41 1.15 1.03 1.34 1.38 1.50 1.24 1.40
## [349] 1.07 1.03 1.44 1.45 1.35 1.22 1.34 1.13 1.52 1.75 1.29 1.05
## [361] 1.02 1.17 0.87 1.22 1.30 1.39 1.87 1.95 2.02 2.01 1.57 2.16
## [373] 2.10 2.37 2.25 2.74 2.43 2.91 3.37 3.91 4.20 4.86 5.67 5.79
## [385] 6.78 8.53 8.97 11.68 12.03 14.66 14.35 17.82 20.88 23.98 25.11 26.37
## [397] 28.49 28.02 31.29 32.37 32.40 31.64 31.98 33.37 31.10 28.33 29.77 29.17
## [409] 27.43 25.86 22.95 23.83 21.50 21.07 20.03 18.83 16.95 15.46 16.50 14.85
## [421] 13.53 13.77 12.20 10.24 9.31 8.71 7.82 6.94 6.77 5.58 5.53 5.48
## [433] 4.15 4.01 3.65 3.35 3.22 2.91 3.20 2.11 2.63 2.18 2.23 1.97
## [445] 1.91 1.86 1.77 1.88 1.75 1.54 1.38 1.69 1.16 0.83 0.90 1.06
## [457] 1.11 1.27 0.76 1.03 0.89 0.54 0.86 0.75 0.89 0.78 1.09 0.88
## [469] 0.79 0.78 0.52 0.61 0.65 0.77 0.62 0.80 0.71 0.38 0.58 0.52
## [481] 0.64 1.28 1.04 0.67 0.90 0.55 0.76 0.54 0.55 0.41 0.43 0.38
## [493] 0.47 0.41 0.63 0.58 0.95 0.40 0.42 0.60 0.18 0.55 0.38 0.41
## [505] 0.40 0.56 0.53 0.89 0.45 0.55 0.77 0.77 0.75 0.44 0.60 0.79
## [517] 0.41 0.78 0.67 0.79 0.65 0.65 0.59 0.83 0.79 0.63 0.61 0.63
## [529] 0.42 0.60 0.75 0.79 0.99 1.14 1.18 1.15 1.35 1.14 2.06 1.91
## [541] 3.56 3.88 5.76 6.27 5.94 7.38 10.87 11.03 13.30 14.32 12.18 15.07
## [553] 15.66 15.20 15.47 15.23 15.70 15.31 13.22 15.53 16.51 16.74 16.89 16.54
## [565] 15.59 15.59 13.18 13.18 13.79 13.79 11.89 11.55 13.70 11.12 9.77 9.66
## [577] 12.20 9.81 8.36 10.10 7.04 7.84 6.14 6.83 6.62 5.84 6.64 6.38
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.180 1.505 6.380 9.292 14.335 33.370 21
Minimum and maximum dates and counts:
## [1] "2020-12-20"
## [1] "2022-07-31"
## [1] 355
## [1] 212
Converting the dataset into a univariate daily series with a weekly
pattern by selecting the “daily_covid_confirmed_rate” column:
Outliers & Missing value treatment:
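The $index / $replacements listing below is the kind of output produced by forecast::tsoutliers(); a sketch of the treatment step (the frequency-7 construction is an assumption, the column name is from the dataset):

```r
# Sketch: build a weekly-seasonal daily series and treat outliers / missing values
library(forecast)

DCCR <- ts(covid$daily_covid_confirmed_rate, frequency = 7)  # daily data, weekly pattern

tsoutliers(DCCR)             # indices of suspected outliers plus suggested replacements
clean_DCCR <- tsclean(DCCR)  # replace outliers and interpolate missing values
```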
## $index
## [1] 102 103 104 105 106 108 110 117 118 198 199 200 216 220 384 393 394 395 396
## [20] 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 550
## [39] 552 553 562 563
##
## $replacements
## [1] 19.751419 20.358882 19.879549 20.239726 21.139221 20.578927 20.998327
## [8] 20.587909 19.548770 29.045692 29.418791 29.281536 29.807639 29.041732
## [15] 6.032141 18.032638 18.162037 18.196977 18.154834 18.464556 18.177000
## [22] 18.475935 18.683242 18.795951 18.766130 18.778629 19.097120 18.797762
## [29] 19.084914 19.336687 19.434562 19.342425 19.412540 19.743117 19.435365
## [36] 19.713829 19.922048 12.970765 12.538085 13.932408 15.794625 16.443985
Checking the distribution of the time series.
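A sketch of the normality checks and of a Box-Cox transformation that could produce box_clean_DCCR (the lambda estimation step is an assumption; the object names match the test output below):

```r
# Sketch: Shapiro-Wilk normality test before and after a Box-Cox transformation
library(forecast)

shapiro.test(as.numeric(clean_DCCR))           # cleaned series on the original scale

lambda <- BoxCox.lambda(clean_DCCR)            # estimate the transformation parameter
box_clean_DCCR <- BoxCox(clean_DCCR, lambda)   # transformed series
shapiro.test(as.numeric(box_clean_DCCR))
```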
##
## Shapiro-Wilk normality test
##
## data: clean_DCCR
## W = 0.86365, p-value < 2.2e-16
##
## Shapiro-Wilk normality test
##
## data: box_clean_DCCR
## W = 0.95434, p-value = 1.604e-12
Visualizing the “daily_covid_confirmed_rate” time series
data:
## Classical Decomposition for Seasonal Adjustment: suitable for daily data with a weekly pattern
Classical Decomposition on whole dataset
(daily_covid_confirmed_rate)
Additive Seasonality of daily_covid_confirmed_rate in classical
decomposition:
Visualizing the “daily_covid_confirmed_rate” time series data
after adjusting for seasonality with classical decomposition:
Splitting the “log_clean_daily_covid_confirmed_rate” data 70% : 30% into
training : test, with n = length(testdata):
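A sketch of the split (the 176-observation test length matches the value printed below; the transformed series is also referred to as box_clean_DCCR elsewhere in the report, and train_DCCR is the name used in the later TBATS call):

```r
# Sketch: 70/30 split of the transformed series into training and test sets
n      <- length(box_clean_DCCR)   # 588 observations
n_test <- floor(0.3 * n)           # 176 test observations

train_DCCR <- window(box_clean_DCCR, end   = time(box_clean_DCCR)[n - n_test])
test_DCCR  <- window(box_clean_DCCR, start = time(box_clean_DCCR)[n - n_test + 1])

length(test_DCCR)
```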
## Length of daily_covid_confirmed_rate dataset= 588
## Length of training dataset= 411.6
## Length of test dataset= 176.4
## [1] 176
## Model1: TBATS
## TBATS(1, {0,0}, 0.965, {<7,3>})
##
## Call: tbats(y = train_DCCR)
##
## Parameters
## Alpha: 0.5945616
## Beta: 0.09816488
## Damping Parameter: 0.964674
## Gamma-1 Values: 0.003993247
## Gamma-2 Values: 0.001969353
##
## Seed States:
## [,1]
## [1,] 2.918732200
## [2,] 0.013531423
## [3,] 0.007012924
## [4,] 0.030990260
## [5,] 0.014821555
## [6,] 0.018708632
## [7,] 0.022414309
## [8,] -0.020348963
##
## Sigma: 0.1526664
## AIC: 957.9533
Model1: TBATS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 12.33 0.9041
## McLeod-Li Q Q ~ chisq(20) 57.49 0 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 286 0.138
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 208 0.67
## Rank P (P-42333)/1396.3 ~ N(0,1) 43059 0.6031
Model1: TBATS error diagnostics (assumptions and useful properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
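These checks can be reproduced, or complemented, with the residual diagnostics in the forecast package; a minimal sketch (fit_tbats is an assumed object name, refitted with the same call shown in the output above):

```r
# Sketch: refit the TBATS model reported above and inspect its residuals
library(forecast)

fit_tbats <- tbats(train_DCCR)   # same call as in the report: tbats(y = train_DCCR)
res <- residuals(fit_tbats)

checkresiduals(fit_tbats)        # time plot, ACF and Ljung-Box test of the residuals
qqnorm(res); qqline(res)         # Q-Q plot for the normality assumption
```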
Plotting Forecast of Model1-TBATS:
Converting f_tbats_dccr_mean to the format of the test data.
TBATS-forecasting accuracy measures based on the test
dataset:
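A sketch of how these measures can be computed from the test set (the error sign convention and object names are assumptions following the report's naming scheme):

```r
# Sketch: point-forecast accuracy of Model 1 (TBATS) on the test set
library(forecast)

f_tbats_dccr <- forecast(fit_tbats, h = length(test_DCCR))
e <- as.numeric(test_DCCR) - as.numeric(f_tbats_dccr$mean)   # forecast errors

MSE_m1_dccr  <- mean(e^2)
RMSE_m1_dccr <- sqrt(MSE_m1_dccr)
ME_m1_dccr   <- mean(e)
MAE_m1_dccr  <- mean(abs(e))
MPE_m1_dccr  <- mean(100 * e / as.numeric(test_DCCR))
MAPE_m1_dccr <- mean(100 * abs(e) / as.numeric(test_DCCR))
```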
Model1-TBATS accuracy:
## MSE_m1_dccr = 16.94361
## RMSE_m1_dccr = 4.116261
## ME_m1_dccr = -3.673463
## MAE_m1_dccr = 3.67378
## MPE_m1_dccr = 525.7242
## MAPE_m1_dccr = 1484.96
Seasonal adjustment: Classical Decomposition on training dataset
(train_daily_covid_confirmed_rate)
Additive Seasonality of train_daily_covid_confirmed_rate in
classical decomposition:
Seasonality Adjusted from train_daily_covid_confirmed_rate in
Classical decomposition:
Visualizing the “train_daily_covid_confirmed_rate” series after
adjusting for seasonality with classical decomposition:
Now checking stationarity of “Adj_s_c_train_DCCR”.
Augmented Dickey-Fuller (ADF) test:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_train_DCCR
## Dickey-Fuller = -1.8331, Lag order = 7, p-value = 0.6479
## alternative hypothesis: stationary
The null hypothesis is rejected if the p-value < alpha.
Here, the p-value > 0.05, so we do not reject the null hypothesis.
(The ADF null hypothesis is that the series has a unit root, i.e. is
non-stationary.) Hence, the large p-value (0.6479 > 0.05) from the ADF
test suggests that the series is not stationary.
The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test:
## Warning in kpss.test(Adj_s_c_train_DCCR): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_train_DCCR
## KPSS Level = 1.0606, Truncation lag parameter = 5, p-value = 0.01
The null hypothesis is rejected if the p-value < alpha.
Here, the p-value < 0.05, so we reject the null hypothesis.
(The KPSS null hypothesis is level stationarity.) Hence, the small
p-value (0.01 < 0.05) from the KPSS test suggests that the series is
not stationary.
In conclusion, the series is not stationary according to the ADF and
KPSS tests. We need to take a seasonal difference in order to make the
series stationary.
Taking a seasonal difference to get a stationary series:
Rechecking stationarity after taking the seasonal difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_train_DCCR
## Dickey-Fuller = -3.5234, Lag order = 7, p-value = 0.0404
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_train_DCCR
## KPSS Level = 0.35132, Truncation lag parameter = 5, p-value = 0.09814
Hence, the small ADF p-value (0.0404 < 0.05) and the large KPSS p-value
(0.09814 > 0.05) suggest that the series is now stationary after taking
the seasonal difference.
## Model2- Classical decomposition + ARIMA
## Series: Diff_Adj_s_c_train_DCCR
## ARIMA(3,0,2) with zero mean
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2
## 1.2832 -0.2895 -0.0265 -1.5644 0.6578
## s.e. 0.1388 0.1040 0.0797 0.1289 0.1111
##
## sigma^2 = 0.02346: log likelihood = 190.02
## AIC=-368.04 AICc=-367.83 BIC=-343.93
## Series: Diff_Adj_s_c_train_DCCR
## ARIMA(2,0,2) with non-zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 mean
## 1.2518 -0.2863 -1.5298 0.6292 0.0037
## s.e. 0.1115 0.1092 0.0902 0.0835 0.0208
##
## sigma^2 = 0.02346: log likelihood = 189.98
## AIC=-367.97 AICc=-367.76 BIC=-343.85
Comparing AIC, AICc, and BIC, the ARIMA(3,0,2) model with zero mean
performs better than the ARIMA(2,0,2) model with non-zero mean.
Model2- ARIMA residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 7.37 0.9953
## McLeod-Li Q Q ~ chisq(20) 50.01 2e-04 *
## Turning points T (T-272.7)/8.5 ~ N(0,1) 297 0.0043 *
## Diff signs S (S-205)/5.9 ~ N(0,1) 205 1
## Rank P (P-42127.5)/1391.2 ~ N(0,1) 42868 0.5945
Model2: auto.arima residual diagnostics (assumptions and useful
properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
Plotting Forecast: Model2- Classical decomposition +
ARIMA
Forecasting with seasonal adjustment: Classical decomposition +
ARIMA:
Adding the auto.arima (Model 2) forecast mean to the seasonal index of
daily_covid_confirmed_rate over the length of the test data
[413:n_daily_covid_confirmed_rate].
Converting f_arima_sea_adj to the format of the test data:
Forecasting accuracy measures based on the test dataset: Model2
- Classical decomposition + ARIMA
Accuracy of Model2- Classical decomposition +
ARIMA:
## MSE_m2_dccr = 4.921232
## RMSE_m2_dccr = 2.218385
## ME_m2_dccr = 1.254614
## MAE_m2_dccr = 1.623782
## MPE_m2_dccr = 100.4379
## MAPE_m2_dccr = 100.4853
### Seasonal adjustment: STL Decomposition using the whole dataset (box_clean_DCCR)
Seasonality (seasonal index) of daily_covid_confirmed_rate in
STL decomposition:
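A minimal sketch of the STL step (assuming the cleaned, Box-Cox-transformed series box_clean_DCCR has weekly frequency 7):

```r
# Sketch: STL decomposition and seasonal adjustment of the whole cleaned series
library(forecast)

stl_dccr <- stl(box_clean_DCCR, s.window = "periodic")   # weekly seasonal component

s_stl_DCCR       <- stl_dccr$time.series[, "seasonal"]   # seasonal index
seasadj_stl_DCCR <- seasadj(stl_dccr)                    # seasonally adjusted series

plot(stl_dccr)
```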
Seasonal adjustment: STL Decomposition on training dataset
(train)
Seasonality Adjusted training data in STL
decomposition:
Plotting Seasonality Adjusted training data in STL
decomposition
Checking stationarity of the seasonality-adjusted training data in STL
decomposition:
##
## Augmented Dickey-Fuller Test
##
## data: seasadj_STL_train_dccr
## Dickey-Fuller = -1.7707, Lag order = 7, p-value = 0.6743
## alternative hypothesis: stationary
## Warning in kpss.test(seasadj_STL_train_dccr): p-value smaller than printed p-
## value
##
## KPSS Test for Level Stationarity
##
## data: seasadj_STL_train_dccr
## KPSS Level = 1.0606, Truncation lag parameter = 5, p-value = 0.01
Hence, the large p-value (0.6743 > 0.05) from the ADF test suggests
that the series is not stationary, and the small KPSS p-value
(0.01 < 0.05) likewise suggests that the series is not stationary and
that differencing is required.
Taking a seasonal difference to get a stationary series:
Rechecking stationarity after taking the seasonal difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_seasadj_STL_train_dccr
## Dickey-Fuller = -3.5956, Lag order = 7, p-value = 0.03347
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_seasadj_STL_train_dccr
## KPSS Level = 0.36095, Truncation lag parameter = 5, p-value = 0.09399
Hence, the small p-value (0.03347 < 0.05) from the ADF test and the
large p-value (0.09399 > 0.05) from the KPSS test suggest that the
series is now stationary after taking the seasonal difference.
## Model3- STL+ETS: Fit a model
## ETS(A,Ad,N)
##
## Call:
## ets(y = Diff_seasadj_STL_train_dccr)
##
## Smoothing parameters:
## alpha = 0.0233
## beta = 0.0233
## phi = 0.9235
##
## Initial states:
## l = 0.0725
## b = -0.0149
##
## sigma: 0.148
##
## AIC AICc BIC
## 910.0875 910.2954 934.1990
Model3: STL+ETS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 46.44 7e-04 *
## McLeod-Li Q Q ~ chisq(20) 32.16 0.0416 *
## Turning points T (T-272.7)/8.5 ~ N(0,1) 291 0.0316 *
## Diff signs S (S-205)/5.9 ~ N(0,1) 213 0.1722
## Rank P (P-42127.5)/1391.2 ~ N(0,1) 42077 0.971
Model3: STL + ETS residual diagnostics (assumptions and useful
properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
Plotting Model3-STL+ETS Forecast:
Forecasting with seasonal adjustment:
Model3-STL+ETS
Adding the Model 3 forecast mean to the STL seasonal index of the whole
data over the length of the test data.
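A sketch of this re-seasonalisation step (fit_ets and the index alignment are assumptions; f_stlets_dccr_sea_adj is the name used in the report):

```r
# Sketch: add the STL seasonal index back to the Model 3 forecast mean.
# `fit_ets` is an assumed name for the ETS fit on the adjusted series;
# the last h values of the seasonal component are used to line up with the test period.
library(forecast)

h     <- length(test_DCCR)
f_ets <- forecast(fit_ets, h = h)

seas_index <- tail(as.numeric(s_stl_DCCR), h)
f_stlets_dccr_sea_adj <- as.numeric(f_ets$mean) + seas_index
```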
Converting f_stlets_dccr_sea_adj to the format of the test data.
Forecasting accuracy measures based on the test dataset: Model3
- STL+ETS
Accuracy of Model3- STL+ETS:
## MSE_m3_dccr = 5.251982
## RMSE_m3_dccr = 2.29172
## ME_m3_dccr = 1.389098
## MAE_m3_dccr = 1.645767
## MPE_m3_dccr = 86.33861
## MAPE_m3_dccr = 102.2685
## Model4- STL+ARIMA Fit a model
## Series: Diff_seasadj_STL_train_dccr
## ARIMA(3,0,2)(2,0,2)[7] with non-zero mean
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 sar1 sar2 sma1
## 1.6985 -0.4409 -0.2589 -1.8826 0.9090 1.1761 -0.5295 -1.7196
## s.e. 0.0506 0.1007 0.0532 0.0238 0.0242 0.1052 0.0511 0.1250
## sma2 mean
## 0.7393 0.0002
## s.e. 0.1243 0.0079
##
## sigma^2 = 0.01395: log likelihood = 289.86
## AIC=-557.71 AICc=-557.05 BIC=-513.51
Model4: STL+ARIMA residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 17.42 0.6258
## McLeod-Li Q Q ~ chisq(20) 24.5 0.2212
## Turning points T (T-272.7)/8.5 ~ N(0,1) 269 0.6673
## Diff signs S (S-205)/5.9 ~ N(0,1) 211 0.3058
## Rank P (P-42127.5)/1391.2 ~ N(0,1) 41796 0.8117
Model4: STL + ARIMA residual diagnostics (assumptions and useful
properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
Plotting STL+ARIMA Model Forecast:
Forecasting with seasonal adjustment
STL+ARIMA
Adding the Model 4 (STL + ARIMA) forecast mean to the STL seasonal index
of the whole data over the length of the test data.
Converting f_stlarima_dccr_sea_adj to the format of the test data.
Forecasting accuracy measures based on the test dataset: Model4
- STL + ARIMA
Accuracy of Model4 - STL + ARIMA:
## MSE_m4_dccr = 4.96045
## RMSE_m4_dccr = 2.227207
## ME_m4_dccr = 1.266971
## MAE_m4_dccr = 1.62623
## MPE_m4_dccr = 98.87952
## MAPE_m4_dccr = 101.0316
| Model | MSE | RMSE | MAE | MPE | MAPE |
|---|---|---|---|---|---|
| M1-TBATS | 16.94361 | 4.116261 | 3.67378 | 525.7242 | 1484.96 |
| M2-CD+ARIMA | 4.921232 | 2.218385 | 1.623782 | 100.4379 | 100.4853 |
| M3- STL+ETS | 5.251982 | 2.29172 | 1.645767 | 86.33861 | 102.2685 |
| M4- STL+ARIMA | 4.965446 | 2.228328 | 1.627759 | 99.00741 | 101.3473 |
From the above comparison, it is clear that M2 (Classical Decomposition + ARIMA) has the lowest error on most of the accuracy measures.
So, comparatively, M2 is the best model.
Seasonal index using classical decomposition (whole data) => s_c_DCCR
Fitting the best model: Classical Decomposition + ARIMA:
## Series: Adj_s_c_DCCR
## ARIMA(3,1,2)
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2
## 1.2986 -0.1787 -0.1554 -1.6500 0.7343
## s.e. 0.1057 0.0815 0.0674 0.0943 0.0812
##
## sigma^2 = 0.03603: log likelihood = 144.61
## AIC=-277.23 AICc=-277.08 BIC=-250.98
## Series: Adj_s_c_DCCR
## ARIMA(2,0,2) with non-zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 mean
## 1.9632 -0.9649 -1.3394 0.4430 2.2725
## s.e. 0.0137 0.0137 0.0404 0.0385 0.4571
##
## sigma^2 = 0.03547: log likelihood = 147.17
## AIC=-282.35 AICc=-282.2 BIC=-256.09
# Plotting the best model Forecast:
Forecasting with seasonal adjustment: Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data
over the length of the test data [60:84].
## Time Series:
## Start = c(85, 1)
## End = c(87, 6)
## Frequency = 7
## [1] 2.384815 2.311789 2.274689 2.200703 2.213426 2.103879 2.087820 2.113595
## [9] 2.051116 2.024638 1.961329 1.984763 1.885944 1.880609 1.917086 1.865270
## [17] 1.849398 1.796624 1.830505 1.742030
Final forecast of the daily COVID confirmed rate: actual predictions
30 lags ahead, back-transformed from the Box-Cox transformation.
Back-transforming is required for model M2 to get forecasts on the
original scale.
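A sketch of the back-transformation (the forecast object name and lambda are assumptions; InvBoxCox() is the forecast-package inverse of BoxCox()):

```r
# Sketch: back-transform the Model 2 forecasts from the Box-Cox scale
# to the original daily_covid_confirmed_rate scale.
# `f_m2_with_seasonality` is an assumed name; `lambda` is the Box-Cox
# parameter estimated earlier with BoxCox.lambda().
library(forecast)

final_forecast_dccr <- InvBoxCox(f_m2_with_seasonality, lambda = lambda)
final_forecast_dccr
```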
## Time Series:
## Start = c(85, 1)
## End = c(87, 6)
## Frequency = 7
## [1] 10.857050 10.092467 9.724894 9.031363 9.146998 8.197910 8.067313
## [8] 8.277949 7.776574 7.573365 7.108768 7.277325 6.592577 6.557500
## [15] 6.801114 6.457679 6.355994 6.029259 6.237036 5.708921
Converting the dataset into a univariate daily series with a weekly pattern by selecting the “male_death_rate” column:
Outliers & Missing value treatment:
## $index
## [1] 20 21 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326
## [20] 327 359 395 396 397 398 399 400 401 402 403 404 405 406 407 547
##
## $replacements
## [1] 76.21311 76.08336 64.03244 64.03065 64.02682 64.02672 64.02976 64.02601
## [9] 64.02759 64.02837 64.02598 64.02301 64.02343 64.02479 64.02206 64.02517
## [17] 64.02434 64.02139 64.01933 64.02030 63.96744 63.92620 63.92108 63.91905
## [25] 63.91658 63.90811 63.90914 63.90002 63.89787 63.89416 63.89073 63.88776
## [33] 63.87976 63.88092 63.83957
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 63.82 63.85 64.09 67.76 72.40 76.26
Converting the dataset into a univariate daily series with a weekly pattern by selecting the “female_death_rate” column:
Outliers & Missing value treatment:
## $index
## [1] 85 87 88 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104
## [19] 105 106 107 108 109 110 111 112 113 114 115 116 117 118 127 128 129 130
## [37] 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 174 175
## [55] 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193
## [73] 194 195 196 197 198 199 200 201 202 204 205 206 207 208 209 210 211 212
## [91] 213 214 215 216 217 218 219 220 221 222 223 224 225 228 229 230 231 232
## [109] 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250
## [127] 251 252 253 254 255 256 257 258 260 261 262 263 266 299 300 301 302 303
## [145] 304 305 306 307
##
## $replacements
## [1] 24.37463 24.41284 24.40592 24.48815 24.52801 24.57191 24.61451 24.68377
## [9] 24.71325 24.76559 24.83343 24.87314 24.91751 24.96004 25.02872 25.05764
## [17] 25.11128 25.17890 25.21857 25.26339 25.30619 25.36933 25.40301 25.45763
## [25] 25.52485 25.56453 25.60931 25.65232 25.70984 25.74821 25.80370 25.87042
## [33] 26.58569 26.62982 26.68500 26.72736 26.78300 26.84981 26.88999 26.93539
## [41] 26.97976 27.03517 27.07743 27.13280 27.19968 27.23998 27.28520 27.32978
## [49] 27.38539 27.42755 27.48263 27.54958 28.10016 28.15349 28.21176 28.26900
## [57] 28.33742 28.39155 28.45942 28.53936 28.59252 28.65109 28.70833 28.77632
## [65] 28.83075 28.89886 28.97859 29.03158 29.09030 29.14761 29.21529 29.27002
## [73] 29.33814 29.41779 29.47080 29.52950 29.58689 29.65426 29.70927 29.77740
## [81] 29.85697 30.02378 30.13620 30.25861 30.36872 30.49189 30.62651 30.73462
## [89] 30.84838 30.96077 31.08317 31.19334 31.31649 31.45109 31.55924 31.67298
## [97] 31.78538 31.90779 32.01793 32.14108 32.27568 32.38381 32.49757 32.85058
## [105] 32.93419 33.02923 33.09780 33.17205 33.24493 33.32776 33.39833 33.48197
## [113] 33.57693 33.64560 33.71986 33.79274 33.87554 33.94609 34.02977 34.12467
## [121] 34.19345 34.26776 34.34071 34.42337 34.49392 34.57752 34.67109 34.74175
## [129] 34.81580 34.88878 34.97127 35.04178 35.12524 35.21746 35.32375 35.35564
## [137] 35.39702 35.42761 35.52985 35.93268 35.93684 35.94186 35.94405 35.94154
## [145] 35.94818 35.94942 35.95198 35.95512
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 23.74 27.59 35.91 32.25 36.15 36.18
Checking the distribution of the time series.
Visualizing the “Male death rate” time series data:
Visualizing the “Female death rate” time series
data:
## Classical Decomposition for Seasonal Adjustment: suitable for daily data with a weekly pattern
Classical Decomposition on the whole dataset (male_death_rate).
Additive seasonality of male_death_rate in classical decomposition:
Visualizing the “male_death_rate” time series data after adjusting for
seasonality with classical decomposition:
Splitting the “male_death_rate” data 70% : 30% into training : test,
with n = length(testdata):
## [1] 176
## Model1: TBATS
## TBATS(0.902, {1,0}, 1, {<7,2>})
##
## Call: tbats(y = train_male)
##
## Parameters
## Lambda: 0.90204
## Alpha: 0.155927
## Beta: 0.042308
## Damping Parameter: 1
## Gamma-1 Values: -0.001198862
## Gamma-2 Values: 0.01346572
## AR coefficients: 0.878092
##
## Seed States:
## [,1]
## [1,] 54.1420369934
## [2,] -0.0088969460
## [3,] 0.0019658361
## [4,] -0.0001500732
## [5,] -0.0030024583
## [6,] 0.0024272066
## [7,] 0.0000000000
## attr(,"lambda")
## [1] 0.9020397
##
## Sigma: 0.01876142
## AIC: -427.4503
Model1: TBATS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 23.14 0.2818
## McLeod-Li Q Q ~ chisq(20) 163.2 0 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 235 0 *
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 205 0.9321
## Rank P (P-42333)/1396.3 ~ N(0,1) 42069 0.85
Model1: TBATS error diagnostics (assumptions and useful properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
Plotting Forecast of Model1-TBATS:
Converting f_tbats_male_mean to the format of the test data.
TBATS-forecasting accuracy measures based on the test
dataset:
Model1-TBATS accuracy:
## MSE_m1_male = 0.9534867
## RMSE_m1_male = 0.9764664
## ME_m1_male = 0.8450482
## MAE_m1_male = 0.8450482
## MPE_m1_male = 1.323743
## MAPE_m1_male = 1.323743
Seasonal adjustment: Classical Decomposition on training
dataset (train_male)
Additive Seasonality of train_male_death_rate in classical
decomposition:
Seasonality adjusted from train_male in classical decomposition:
Visualizing the “train_male” series after adjusting for seasonality
with classical decomposition:
Now checking stationarity of “Adj_s_c_train_male”.
Augmented Dickey-Fuller (ADF) test:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_train_male
## Dickey-Fuller = -2.2068, Lag order = 7, p-value = 0.49
## alternative hypothesis: stationary
The null hypothesis is rejected if the p-value < alpha.
Here, the p-value > 0.05, so we do not reject the null hypothesis.
Hence, the large p-value (0.49 > 0.05) from the ADF test suggests that
the series is not stationary.
The Kwiatkowski-Phillips-Schmidt-Shin (KPSS)
test:
## Warning in kpss.test(Adj_s_c_train_male): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_train_male
## KPSS Level = 6.7789, Truncation lag parameter = 5, p-value = 0.01
The null hypothesis is rejected if the p-value < alpha.
Here, the p-value < 0.05, so we reject the null hypothesis.
Hence, the small p-value (0.01 < 0.05) from the KPSS test suggests that
the series is not stationary.
In conclusion, the series is not stationary according to the ADF and
KPSS tests. We need to take a seasonal difference in order to make the
series stationary.
Taking a seasonal difference to get a stationary series:
Rechecking stationarity after taking the seasonal difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_train_male
## Dickey-Fuller = -1.9011, Lag order = 7, p-value = 0.6192
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_train_male): p-value smaller than printed p-
## value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_train_male
## KPSS Level = 0.99098, Truncation lag parameter = 5, p-value = 0.01
The ADF and KPSS p-values suggest that the series has not become
stationary, so taking the seasonal difference did not work this time.
We will therefore continue without the seasonal difference.
## Model2- Classical decomposition + ARIMA
Model2- ARIMA residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 35.44 0.0179 *
## McLeod-Li Q Q ~ chisq(20) 169.73 0 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 251 0.0089 *
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 207 0.7982
## Rank P (P-42333)/1396.3 ~ N(0,1) 41530 0.5652
Model2: auto.arima residual diagnostics (assumptions and useful
properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
Plotting Forecast: Model2- Classical decomposition +
ARIMA
Forecasting with seasonal adjustment: Classical decomposition +
ARIMA:
Adding the auto.arima (Model 2) forecast mean to the seasonal index of
the series over the length of the test data
[413:n_daily_covid_confirmed_rate].
Converting f_arima_sea_adj to the format of the test data:
Forecasting accuracy measures based on the test dataset: Model2
- Classical decomposition + ARIMA
Accuracy of Model2- Classical decomposition +
ARIMA:
## MSE_m2_maledr = 0.3326204
## RMSE_m2_maledr = 0.5767325
## ME_m2_maledr = 0.5019622
## MAE_m2_maledr = 0.5019622
## MPE_m2_maledr = 0.7863076
## MAPE_m2_maledr = 0.7863076
| Model | MSE | RMSE | MAE | MPE | MAPE |
|---|---|---|---|---|---|
| M1-TBATS | 0.9534867 | 0.9764664 | 0.8450482 | 1.323743 | 1.323743 |
| M2-CD+ARIMA | 0.3326204 | 0.5767325 | 0.5019622 | 0.7863076 | 0.7863076 |
From the above comparison, it is clear that M2 (Classical Decomposition + ARIMA) has the lower error.
So, comparatively, M2 is the best model.
Seasonal index using classical decomposition (whole data) => s_c_DCCR
Fitting the best model: Classical Decomposition + ARIMA:
## Series: Adj_s_c_male
## ARIMA(0,2,1)(0,0,2)[7]
##
## Coefficients:
## ma1 sma1 sma2
## -0.8101 0.1273 0.0648
## s.e. 0.0219 0.0435 0.0444
##
## sigma^2 = 0.0006176: log likelihood = 1344.22
## AIC=-2680.43 AICc=-2680.37 BIC=-2662.94
# Plotting the best model Forecast:
Forecasting with seasonal adjustment: Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data
over the length of the test data.
Final Forecast of Male death rate:
## Time Series:
## Start = c(85, 1)
## End = c(87, 6)
## Frequency = 7
## [1] 63.82913 63.82773 63.82522 63.82477 63.82733 63.82987 63.82700 63.82741
## [9] 63.82524 63.82241 63.82190 63.82478 63.82762 63.82376 63.82479 63.82258
## [17] 63.81959 63.81904 63.82209 63.82511
Final Forecast of Female death rate:
#### The male death rate from COVID-19 is demonstrating a downward trend. In
contrast, the female death rate is trending upward into August 2022,
which is the opposite of the situation at the beginning.
Converting the dataset into univariate daily series with a weekly pattern by selecting the age-group columns:
Outliers & Missing value treatment:
## $index
## [1] 90 122 123 125 132 134 173 174 175 176 177 214 220
##
## $replacements
## [1] 0.4258158 0.3821749 0.3778434 0.3890710 0.3882166 0.3787093 0.3586873
## [8] 0.3755928 0.3738548 0.3728332 0.3743561 0.2967784 0.2914153
## $index
## [1] 86 87 97 109 119 122 129 205 206 207 208 224 432 433 434 435 436 437
## [19] 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455
## [37] 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473
## [55] 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491
## [73] 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509
## [91] 510 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528
## [109] 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546
## [127] 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562
##
## $replacements
## [1] 0.7522188 0.7512193 0.7545150 0.7342437 0.6850542 0.6851129 0.6700313
## [8] 0.6352553 0.6300823 0.6295010 0.6274569 0.6399224 0.6804026 0.6804623
## [15] 0.6805307 0.6808903 0.6797133 0.6800834 0.6800272 0.6805066 0.6805371
## [22] 0.6805867 0.6809538 0.6797715 0.6801415 0.6800741 0.6805924 0.6805985
## [29] 0.6806402 0.6810188 0.6798311 0.6802010 0.6801222 0.6806790 0.6806608
## [36] 0.6806947 0.6810823 0.6798918 0.6802617 0.6801830 0.6807418 0.6807218
## [43] 0.6807552 0.6811473 0.6799535 0.6803233 0.6802445 0.6808053 0.6807834
## [50] 0.6808164 0.6812091 0.6800153 0.6803851 0.6803064 0.6808671 0.6808453
## [57] 0.6808783 0.6812712 0.6800774 0.6804471 0.6803683 0.6809291 0.6809072
## [64] 0.6809404 0.6813330 0.6801393 0.6805090 0.6804302 0.6809910 0.6809690
## [71] 0.6810025 0.6813949 0.6802012 0.6805711 0.6804922 0.6810529 0.6810309
## [78] 0.6810646 0.6814568 0.6802631 0.6806330 0.6805542 0.6811149 0.6810928
## [85] 0.6811262 0.6815188 0.6803252 0.6806951 0.6806163 0.6811769 0.6811548
## [92] 0.6815579 0.6803415 0.6806887 0.6805871 0.6811250 0.6810801 0.6810899
## [99] 0.6814604 0.6802440 0.6805911 0.6804896 0.6810274 0.6809826 0.6809920
## [106] 0.6813631 0.6801464 0.6804933 0.6803918 0.6809298 0.6808852 0.6808956
## [113] 0.6812654 0.6800484 0.6803952 0.6802937 0.6808318 0.6807874 0.6807987
## [120] 0.6811682 0.6799503 0.6802966 0.6801951 0.6807360 0.6806914 0.6807033
## [127] 0.6810700 0.6798513 0.6801974 0.6800960 0.6806398 0.6805950 0.6806076
## [134] 0.6809712 0.6797528 0.6800988 0.6799974 0.6805432 0.6805015 0.6805099
## [141] 0.6808721 0.6796541
## $index
## [1] 75 76 110 111 112 122 123 124 125 126 127 128 129 130 179 180 181 183 184
## [20] 185 194 195 196 197 198 199 209 210 211 212 213 214 215 221 222 223 224 232
## [39] 233 235 236 237 238 241 242 243
##
## $replacements
## [1] 2.038783 2.038479 1.934446 1.922647 1.926944 1.842496 1.832489 1.826435
## [9] 1.812493 1.816754 1.818838 1.814069 1.816277 1.805750 1.820318 1.833556
## [17] 1.836485 1.862997 1.867498 1.871052 1.937244 1.939601 1.943304 1.965645
## [25] 1.960857 1.967106 2.113087 2.084536 2.104455 2.084738 2.089534 2.077068
## [33] 2.059579 2.154113 2.154638 2.215041 2.170497 2.294343 2.240198 2.224558
## [41] 2.216821 2.222427 2.178168 2.270064 2.263416 2.274407
## $index
## [1] 17 93 94 95 96 97 98 112 113 146 154 155 158 164 171 172 173 174 175
## [20] 176 177 178 179 180 181 194 195 196 197 198 199 204 212 213 217 218 228 229
## [39] 230 231 232 233 234 235 236 237 238 239 240 246 584
##
## $replacements
## [1] 4.995262 4.917464 4.930483 4.923745 4.932161 4.933723 4.941990 4.975891
## [9] 4.980647 4.897254 4.905319 4.918605 4.923649 4.990509 5.070249 5.082710
## [17] 5.098813 5.112008 5.108302 5.122936 5.123206 5.161860 5.174838 5.191318
## [25] 5.203984 5.473458 5.482750 5.475290 5.482532 5.465061 5.520662 5.644756
## [33] 5.659224 5.705703 5.793560 5.812532 5.924096 5.927639 5.935985 5.936119
## [41] 5.932448 5.940968 5.954954 5.959626 5.962396 5.971750 5.973555 5.969226
## [49] 5.979951 5.973704 5.889963
## $index
## [1] 102 103 104 106 111 112 123 129 137 138 170 171 172 173 174 175 176 177 179
## [20] 180 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 201
## [39] 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220
## [58] 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 237 238 239 241
## [77] 243 244 246 247 248
##
## $replacements
## [1] 11.25485 11.25690 11.26356 11.24593 11.14375 11.12574 11.14236 11.07829
## [9] 11.04426 11.04764 11.14032 11.14339 11.15713 11.16360 11.16308 11.17365
## [17] 11.17634 11.18689 11.21052 11.22371 11.25225 11.27290 11.28588 11.30935
## [25] 11.32545 11.33470 11.35534 11.36757 11.38823 11.40122 11.42468 11.44077
## [33] 11.45002 11.47065 11.48285 11.50354 11.51653 11.54964 11.55244 11.56663
## [41] 11.57239 11.58663 11.59318 11.61021 11.61983 11.62265 11.63683 11.64260
## [49] 11.65683 11.66338 11.68043 11.69002 11.69287 11.70705 11.71283 11.72705
## [57] 11.73361 11.75045 11.76012 11.76326 11.77731 11.78306 11.79726 11.80385
## [65] 11.82046 11.83021 11.83366 11.84757 11.85329 11.86747 11.87437 11.88982
## [73] 11.88535 11.87920 11.86525 11.85147 11.84752 11.84050 11.83057 11.82950
## [81] 11.82222
## $index
## [1] 95 96 97 98 99 100 116 117 118 119 120 121 122 123 129 130 131 132 133
## [20] 134 135 136 137 138 139 140 141 142 175 177 196 197 198 200 201 202 203 204
## [39] 223 224 243 244 255 256 257 258 259
##
## $replacements
## [1] 24.75490 24.74305 24.73872 24.73690 24.72144 24.72112 24.55919 24.54097
## [9] 24.53392 24.51766 24.49928 24.48983 24.47786 24.45743 24.28204 24.26508
## [17] 24.25151 24.24640 24.23113 24.21490 24.20817 24.20108 24.18441 24.17249
## [25] 24.16622 24.15077 24.13306 24.12621 24.06826 24.06122 24.14910 24.14866
## [33] 24.16604 24.15252 24.15692 24.13737 24.12089 24.09751 23.80854 23.79047
## [41] 23.59986 23.58142 23.50997 23.51134 23.51251 23.51521 23.51098
## $index
## [1] 12 23 24 43 44 101 102 103 104 105 106 116 118 120 121 122 127 128
## [19] 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146
## [37] 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178
## [55] 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196
## [73] 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214
## [91] 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232
## [109] 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250
## [127] 251 252 253 555
##
## $replacements
## [1] 54.62387 54.93031 54.93867 55.38513 55.39471 56.00482 56.05379 56.08111
## [9] 56.07231 56.11391 56.12962 56.33245 56.37757 56.46224 56.39278 56.46383
## [17] 56.79453 56.70800 56.76288 56.80699 56.84345 56.86957 56.86198 56.89680
## [25] 56.80981 56.86646 56.91046 56.94554 56.97138 56.96603 56.99974 56.91221
## [33] 56.97052 57.01426 57.04780 57.07319 56.59531 56.33818 55.96187 55.73592
## [41] 55.49469 55.23266 54.96943 54.68628 54.42854 54.05194 53.82669 53.58674
## [49] 53.32231 53.05897 52.77739 52.51900 52.14221 51.91696 51.67763 51.41237
## [57] 51.14915 50.86822 50.60957 50.23256 50.00729 49.76859 49.50252 49.23945
## [65] 48.95919 48.69996 48.32290 48.09763 47.85898 47.59286 47.32982 47.04967
## [73] 46.79053 46.41339 46.18807 45.94946 45.68326 45.42024 45.14020 44.88092
## [81] 44.50389 44.27863 44.03987 43.77366 43.51065 43.23059 42.97136 42.59443
## [89] 42.36922 42.13032 41.86410 41.60112 41.32106 41.06147 40.68527 40.46014
## [97] 40.22063 39.95424 39.69131 39.41122 39.15174 38.77622 38.55112 38.31090
## [105] 38.04425 37.78127 37.50106 37.24094 36.86846 36.64512 36.40055 36.13327
## [113] 35.87060 35.59081 35.33000 34.96044 34.73860 34.48925 34.22086 33.95801
## [121] 33.67818 33.41630 33.07017 32.83405 32.57576 32.30563 32.04287 31.76405
## [129] 31.50226 30.91918
## $index
## [1] 79 86 87 94 95 96 97 98 99 100 101 102 103 104 105 106 113 114 115
## [20] 116 117 118 123 124 210 217 218 225 226 227 228 229 230 231 232 233 234 235
## [39] 236 237 244 245 246 247 248 249 254 255 337 408 423 424 426 427 428 429 430
##
## $replacements
## [1] 17.39944 17.38259 17.38622 17.39100 17.39218 17.39163 17.39055 17.38863
## [9] 17.38995 17.39000 17.39098 17.39214 17.39160 17.39053 17.38862 17.38994
## [17] 17.43129 17.43136 17.43219 17.43343 17.43287 17.43190 17.43132 17.43084
## [25] 17.40025 17.38342 17.38301 17.38690 17.39394 17.39371 17.39677 17.39996
## [33] 17.39074 17.39031 17.38739 17.40367 17.39445 17.40639 17.40246 17.39168
## [41] 17.41765 17.41470 17.41090 17.43792 17.42035 17.43616 17.45099 17.42660
## [49] 17.33968 17.39499 17.49361 17.49826 17.49470 17.51372 17.50965 17.51290
## [57] 17.51614
## $index
## [1] 76 104 130 131 204 205 206 207 235 274 277 316 325 400 401 402 403 404 405
## [20] 406 414 415 436
##
## $replacements
## [1] 5.400335 5.474088 5.584613 5.591308 5.403294 5.408662 5.414031 5.409665
## [9] 5.493431 5.634774 5.630227 5.649999 5.649999 5.683700 5.687057 5.691273
## [17] 5.695118 5.699156 5.702559 5.706287 5.796635 5.802962 5.865009
## $index
## [1] 14 21 89 90 91 93 94 115 116 125 127 220 221 222 224 225 246 400
##
## $replacements
## [1] 1.101204 1.099655 1.108600 1.105664 1.114628 1.110723 1.098768 1.152513
## [9] 1.157936 1.168328 1.170688 1.100470 1.095198 1.106518 1.111403 1.121482
## [17] 1.170705 1.205000
## clean_0_10 clean_11_20 clean_21_30 clean_31_40
## Min. :0.2700 Min. :0.6100 Min. :1.770 Min. :4.880
## 1st Qu.:0.2800 1st Qu.:0.6600 1st Qu.:2.030 1st Qu.:5.010
## Median :0.3000 Median :0.6798 Median :2.310 Median :5.890
## Mean :0.3313 Mean :0.6800 Mean :2.191 Mean :5.608
## 3rd Qu.:0.3800 3rd Qu.:0.6815 3rd Qu.:2.350 3rd Qu.:5.950
## Max. :0.4700 Max. :0.7700 Max. :2.370 Max. :6.000
## clean_41_50 clean_51_60 clean_61_70 clean_71_80
## Min. :11.04 Min. :23.25 Min. :30.90 Min. :17.33
## 1st Qu.:11.42 1st Qu.:23.30 1st Qu.:30.92 1st Qu.:17.39
## Median :11.76 Median :23.59 Median :31.06 Median :17.42
## Mean :11.63 Mean :23.87 Mean :39.80 Mean :17.43
## 3rd Qu.:11.83 3rd Qu.:24.17 3rd Qu.:54.78 3rd Qu.:17.52
## Max. :11.90 Max. :25.56 Max. :57.08 Max. :17.53
## clean_81_90 clean_91_100
## Min. :5.399 Min. :1.095
## 1st Qu.:5.480 1st Qu.:1.130
## Median :5.650 Median :1.180
## Mean :5.637 Mean :1.187
## 3rd Qu.:5.860 3rd Qu.:1.270
## Max. :5.920 Max. :1.290
Timeplot for Age 0-10 daily death rate:
Classical decomposition: additive seasonality of “clean_0_10”:
Additive seasonality of age_0_10_death_rate (“clean_0_10_dr”) in
classical decomposition:
Visualizing the age_0_10_death_rate (“clean_0_10_dr”) time series data
after adjusting for seasonality with classical decomposition:
Splitting “clean_0_10_dr” into training : test = 70 : 30:
## [1] 176
## Model1: TBATS
## BATS(0, {0,0}, 0.967, -)
##
## Call: tbats(y = train_0_10)
##
## Parameters
## Lambda: 0
## Alpha: 0.5344394
## Beta: 0.03600666
## Damping Parameter: 0.967111
##
## Seed States:
## [,1]
## [1,] -0.780109544
## [2,] 0.001033224
## attr(,"lambda")
## [1] 4.396406e-07
##
## Sigma: 0.01230845
## AIC: -2024.692
Model1: TBATS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 24.16 0.2357
## McLeod-Li Q Q ~ chisq(20) 42.66 0.0023 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 118 0 *
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 194 0.05 *
## Rank P (P-42333)/1396.3 ~ N(0,1) 41452 0.5281
Model1: TBATS error diagnostics (assumptions and useful properties):
1. {et} are uncorrelated. All of the spikes in the ACF and PACF plots are
within the threshold bounds, so the residuals are uncorrelated. This lack
of correlation suggests the forecasts are good.
2. {et} have mean zero. The time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as constant.
3. {et} are normally distributed. The normal Q-Q plot shows the residuals are
approximately normally distributed.
Plotting Forecast of Model1-TBATS:
Converting f_tbats_0_10_mean to the format of the test data.
TBATS-forecasting accuracy measures based on the test
dataset:
Model1-TBATS accuracy:
## MSE_m1_0_10 = 0.0003926621
## RMSE_m1_0_10 = 0.0198157
## ME_m1_0_10 = 0.0193762
## MAE_m1_0_10 = 0.0193762
## MPE_m1_0_10 = 6.453993
## MAPE_m1_0_10 = 6.453993
Classical Decomposition on training dataset (traindata) -
Seasonal adjustment:
Additive Seasonality of train_0_10 death rate in classical
decomposition:
Seasonality Adjusted from train_0_10 death rate in Classical
decomposition:
Visualizing the “train_0_10 death rate” series after adjusting for
seasonality with classical decomposition:
Now checking stationarity of “Adj_s_c_train_0_10”.
Augmented Dickey-Fuller (ADF) test:
## Warning in adf.test(Adj_s_c_train_0_10): p-value greater than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_train_0_10
## Dickey-Fuller = -0.26275, Lag order = 7, p-value = 0.99
## alternative hypothesis: stationary
The null hypothesis is rejected if the p-value < alpha.
Here, the p-value > 0.05, so we do not reject the null hypothesis.
Hence, the large p-value (0.99 > 0.05) from the ADF test suggests that
the series is not stationary.
The Kwiatkowski-Phillips-Schmidt-Shin (KPSS)
test:
## Warning in kpss.test(Adj_s_c_train_0_10): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_train_0_10
## KPSS Level = 6.4555, Truncation lag parameter = 5, p-value = 0.01
The null hypothesis is rejected if the p-value < alpha.
Here, the p-value < 0.05, so we reject the null hypothesis.
Hence, the small p-value (0.01 < 0.05) from the KPSS test suggests that
the series is not stationary.
In conclusion, the series is not stationary according to the ADF and
KPSS tests. We need to take a seasonal difference in order to make the
series stationary.
Taking a seasonal difference to get a stationary series:
Rechecking stationarity after taking the seasonal difference:
## Warning in adf.test(Diff_Adj_s_c_train_0_10): p-value smaller than printed p-
## value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_train_0_10
## Dickey-Fuller = -7.1937, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_train_0_10
## KPSS Level = 0.41867, Truncation lag parameter = 5, p-value = 0.06911
Hence, the small ADF p-value (0.01 < 0.05) and the large KPSS p-value
(0.06911 > 0.05) suggest that the series is now stationary after taking
the seasonal difference.
## Model2- Classical decomposition + ARIMA
## Series: Diff_Adj_s_c_train_0_10
## ARIMA(1,0,1)(0,0,1)[7] with non-zero mean
##
## Coefficients:
## ar1 ma1 sma1 mean
## -0.0605 -0.3915 0.1087 -5e-04
## s.e. 0.1018 0.0916 0.0493 1e-04
##
## sigma^2 = 1.951e-05: log likelihood = 1647.27
## AIC=-3284.53 AICc=-3284.38 BIC=-3264.44
Model2- ARIMA residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 22.63 0.3073
## McLeod-Li Q Q ~ chisq(20) 62.96 0 *
## Turning points T (T-272.7)/8.5 ~ N(0,1) 235 0 *
## Diff signs S (S-205)/5.9 ~ N(0,1) 253 0 *
## Rank P (P-42127.5)/1391.2 ~ N(0,1) 45530 0.0145 *
Model2: auto.arima residuals Diagnostic Assumptions and useful
properties:
1. {et} are uncorrelated: all of the spikes in the ACF and PACF plots fall
within the threshold bounds, so the residuals are uncorrelated. The lack of
correlation suggests the forecasts are good.
2. {et} have mean zero: the time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as
constant.
3. {et} are normally distributed: the Normal Q-Q plot shows that the
residuals are normally distributed.
Plotting Forecast: Model2- Classical decomposition +
ARIMA
Forecasting with seasonal adjustment: Classical decomposition +
ARIMA:
Adding the Model2 auto.arima forecast mean to the seasonal index of
age_0_10_death_rate over the length of the test data [413:588].
Converting f_arima_sea_adj to the format of the test data:
Forecasting accuracy measures based on the test dataset: Model2
- Classical decomposition + ARIMA
Accuracy of Model2- Classical decomposition +
ARIMA:
## MSE_m2_0_10 = 0.08991204
## RMSE_m2_0_10 = 0.2998534
## ME_m2_0_10 = 0.2998244
## MAE_m2_0_10 = 0.2998244
## MPE_m2_0_10 = 100.1501
## MAPE_m2_0_10 = 100.1501
Using whole dataset (clean_0_10)
Seasonality (seasonal index) of age_0_10_death_rate in STL
decomposition:
Seasonal adjustment: STL Decomposition on training dataset
(train)
Seasonality Adjusted training data in STL
decomposition:
Plotting Seasonality Adjusted training data in STL
decomposition
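A minimal sketch of the STL-based seasonal adjustment (the s.window and robust settings here are ours; the original chunk options are not shown):

```r
library(forecast)

# STL decomposition of the training series
stl_fit <- stl(train_0_10, s.window = "periodic", robust = TRUE)

# seasonal component and the seasonally adjusted series
s_stl_train_0_10       <- stl_fit$time.series[, "seasonal"]
seasadj_STL_train_0_10 <- seasadj(stl_fit)

plot(seasadj_STL_train_0_10,
     main = "train_0_10 death rate, seasonally adjusted (STL)")
```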
Checking stationary of Seasonality Adjusted training data in STL
decomposition:
## Warning in adf.test(seasadj_STL_train_0_10): p-value greater than printed p-
## value
##
## Augmented Dickey-Fuller Test
##
## data: seasadj_STL_train_0_10
## Dickey-Fuller = -0.29631, Lag order = 7, p-value = 0.99
## alternative hypothesis: stationary
## Warning in kpss.test(seasadj_STL_train_0_10): p-value smaller than printed p-
## value
##
## KPSS Test for Level Stationarity
##
## data: seasadj_STL_train_0_10
## KPSS Level = 6.4552, Truncation lag parameter = 5, p-value = 0.01
Hence, a large p-value from the ADF test (p-value=0.99, greater than
Alpha=0.05) suggests that the series is not stationary, and the small KPSS
p-value (p-value=0.01, less than 0.05) likewise suggests that the series is
not stationary, so differencing is required.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_seasadj_STL_train_0_10): p-value smaller than printed
## p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_seasadj_STL_train_0_10
## Dickey-Fuller = -7.6442, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_seasadj_STL_train_0_10
## KPSS Level = 0.45176, Truncation lag parameter = 5, p-value = 0.05484
Hence, a small p-value (p-value=0.01, less than Alpha=0.05) from the ADF
test and a large p-value (p-value=0.05484, greater than Alpha=0.05) from the
KPSS test suggest that the series is now stationary after taking the
seasonal difference.
## Model3- STL+ETS: Fit a model
## ETS(A,N,N)
##
## Call:
## ets(y = Diff_seasadj_STL_train_0_10)
##
## Smoothing parameters:
## alpha = 1e-04
##
## Initial states:
## l = -5e-04
##
## sigma: 0.0044
##
## AIC AICc BIC
## -1984.933 -1984.874 -1972.877
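The ETS(A,N,N) form above is chosen automatically by ets(). A minimal sketch of fitting and forecasting Model 3 (object names are illustrative):

```r
library(forecast)

# Model 3: exponential smoothing state-space model on the differenced,
# STL-seasonally-adjusted training series; ets() selects the (A,N,N) form itself
fit_m3_0_10 <- ets(Diff_seasadj_STL_train_0_10)
summary(fit_m3_0_10)

# forecast as many steps as the test set and check the residuals
f_m3_0_10 <- forecast(fit_m3_0_10, h = length(test_0_10))
checkresiduals(fit_m3_0_10)
```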
Model3: STL+ETS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 128.64 0 *
## McLeod-Li Q Q ~ chisq(20) 128.57 0 *
## Turning points T (T-272.7)/8.5 ~ N(0,1) 297 0.0043 *
## Diff signs S (S-205)/5.9 ~ N(0,1) 204 0.8645
## Rank P (P-42127.5)/1391.2 ~ N(0,1) 43451 0.3414
Model3: STL + ETS residuals Diagnostic Assumptions and useful
properties:
1. {et} are uncorrelated: all of the spikes in the ACF and PACF plots fall
within the threshold bounds, so the residuals are uncorrelated. The lack of
correlation suggests the forecasts are good.
2. {et} have mean zero: the time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as
constant.
3. {et} are normally distributed: the Normal Q-Q plot shows that the
residuals are normally distributed.
Plotting Model3-STL+ETS Forecast:
Forecasting with seasonal adjustment:
Model3-STL+ETS
Adding model3 Forecast mean with STL seasonal index of whole
data in the length of the test data.
Converting f_stlets_dccr_sea_adj to the format of the test data
Forecasting accuracy measures based on the test dataset: Model3
- STL+ETS
Accuracy of Model3- STL+ETS:
## MSE_m3_0_10 = 0.08991859
## RMSE_m3_0_10 = 0.2998643
## ME_m3_0_10 = 0.2998358
## MAE_m3_0_10 = 0.2998358
## MPE_m3_0_10 = 100.1541
## MAPE_m3_0_10 = 100.1541
| Model | MSE | RMSE | MAE | MPE | MAPE |
|---|---|---|---|---|---|
| M1-TBATS | 0.0003921216 | 0.01980206 | 3.67378 | 6.449671 | 6.449672 |
| M2-CD+ARIMA | 0.08991268 | 0.2998544 | 0.2998255 | 100.1505 | 100.1505 |
| M3-STL+ETS | 0.0899148 | 0.299858 | 0.2998295 | 100.152 | 100.152 |
From the above comparison, it is obvious that our M2 (Classical Decomposition+ARIMA) has lower error.
So, comparatively M2 is the best model.
Seasonal Index using classical decomposition (whole data) => s_c_0_10
Fit into the Best Model Classical Decomposition + ARIMA:
## Series: Adj_s_c_0_10_dr
## ARIMA(0,2,2)(2,0,0)[7]
##
## Coefficients:
## ma1 ma2 sar1 sar2
## -1.4303 0.4464 0.0854 -0.0081
## s.e. 0.0373 0.0404 0.0451 0.0438
##
## sigma^2 = 1.433e-05: log likelihood = 2436.4
## AIC=-4862.8 AICc=-4862.69 BIC=-4840.93
# Plotting the best model Forecast:
Final Forecasting with seasonal adjustment
Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(87, 6)
## Frequency = 7
## [1] 0.3102200 0.3109158 0.3111980 0.3109655 0.3108217 0.3117049 0.3119421
## [8] 0.3117077 0.3124630 0.3127693 0.3125170 0.3123609 0.3134000 0.3136575
## [15] 0.3133825 0.3141373 0.3144434 0.3141912 0.3140352 0.3150806
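The series above is the re-seasonalised forecast: the ARIMA forecast mean plus the weekly classical seasonal index of the whole series, recycled over the horizon. A minimal sketch, assuming the phase of the seasonal index lines up with the start of the horizon (object names are illustrative):

```r
library(forecast)

# best model: ARIMA on the seasonally adjusted whole series
fit_best_0_10 <- auto.arima(Adj_s_c_0_10_dr)
fc            <- forecast(fit_best_0_10, h = 20)  # horizon shown above: c(85,1) to c(87,6)

# add one weekly cycle of the classical seasonal index back to the forecast mean
s_idx    <- as.numeric(s_c_0_10)[1:7]
final_fc <- ts(as.numeric(fc$mean) + rep_len(s_idx, length(fc$mean)),
               start = c(85, 1), frequency = 7)
final_fc
```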
#### The 0-10 yrs age group death rate by COVID-19 shows an upward trend for the next month, August 2022.
From the above analysis, Classical Decomposition + ARIMA is the best model
for this dataset. Therefore, we will use this model to forecast the
remaining age groups.
#### Age 11-20 death rate:
Timeplot of Age 11-20 death rate:
Classical decomposition: Additive Seasonality of “clean_11_20”:
Additive Seasonality of age_11_20_death_rate (“clean_11_20_dr”)
in classical decomposition:
Visualizing the age_11_20_death_rate (“clean_11_20_dr”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_11_20_dr
## Dickey-Fuller = -1.4172, Lag order = 8, p-value = 0.825
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_11_20_dr): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_11_20_dr
## KPSS Level = 2.659, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_11_20): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_11_20
## Dickey-Fuller = -7.3691, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_11_20
## KPSS Level = 0.45658, Truncation lag parameter = 6, p-value = 0.05277
Hence, a small p-value from the ADF test (p-value=0.01, less than
Alpha=0.05) and a large p-value from the KPSS test (p-value=0.05277, greater
than Alpha=0.05) suggest that the series is now stationary after taking the
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Adj_s_c_11_20_dr
## ARIMA(2,1,2)(2,0,2)[7]
##
## Coefficients:
## ar1 ar2 ma1 ma2 sar1 sar2 sma1 sma2
## 0.2157 -0.7065 -0.3300 0.6894 0.1227 0.5689 -0.0475 -0.6128
## s.e. 0.1791 0.1532 0.1874 0.1482 0.3498 0.2281 0.3492 0.2409
##
## sigma^2 = 1.572e-05: log likelihood = 2417.26
## AIC=-4816.53 AICc=-4816.22 BIC=-4777.15
# Plotting the best model Forecast:
Final Forecasting of Age 11- 20 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 0.6900724 0.6889276 0.6891482 0.6894739 0.6901611 0.6901819 0.6896137
## [8] 0.6898886 0.6887970 0.6888741 0.6891219 0.6901621 0.6905013 0.6898582
## [15] 0.6900765 0.6890291 0.6891627 0.6893992 0.6902099 0.6903767 0.6897775
## [22] 0.6899990 0.6889070 0.6890229 0.6892826 0.6902406 0.6905133 0.6898830
## [29] 0.6901124 0.6890442 0.6891630
Timeplot of Age 21-30 death rate:
Classical decomposition: Additive Seasonality of “clean_21_30”:
Additive Seasonality of age_21_30_death_rate (“clean_21_30”) in
classical decomposition:
Visualizing the age_21_30_death_rate (“clean_21_30”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_21_30
## Dickey-Fuller = -1.6824, Lag order = 8, p-value = 0.7128
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_21_30): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_21_30
## KPSS Level = 6.255, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_21_30): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_21_30
## Dickey-Fuller = -4.9952, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_21_30
## KPSS Level = 0.58302, Truncation lag parameter = 6, p-value = 0.02418
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Adj_s_c_21_30
## ARIMA(2,2,2)(0,0,2)[7]
##
## Coefficients:
## ar1 ar2 ma1 ma2 sma1 sma2
## -1.0593 -0.2695 -0.1475 -0.7720 0.1662 0.1524
## s.e. 0.0812 0.0429 0.0734 0.0696 0.0454 0.0407
##
## sigma^2 = 7.829e-05: log likelihood = 1940.32
## AIC=-3866.65 AICc=-3866.45 BIC=-3836.03
# Plotting the best model Forecast:
Final Forecast of Age 21-30 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 2.372252 2.371698 2.371578 2.372953 2.373181 2.374826 2.373685 2.377116
## [9] 2.375926 2.376146 2.377100 2.377518 2.379355 2.378085 2.381955 2.380616
## [17] 2.380861 2.380219 2.380691 2.382753 2.381317 2.385305 2.383885 2.384183
## [25] 2.383507 2.384001 2.386049 2.384622 2.388604 2.387188 2.387484
Timeplot of Age 31-40 death rate:
Classical decomposition: Additive Seasonality of “clean_31_40”:
Additive Seasonality of age_31_40_death_rate (“clean_31_40”) in
classical decomposition:
Visualizing the age_31_40_death_rate (“clean_31_40”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_31_40
## Dickey-Fuller = -1.1825, Lag order = 8, p-value = 0.9093
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_31_40): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_31_40
## KPSS Level = 6.3163, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_31_40): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_31_40
## Dickey-Fuller = -4.0213, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_31_40): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_31_40
## KPSS Level = 0.76606, Truncation lag parameter = 6, p-value = 0.01
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Adj_s_c_31_40
## ARIMA(3,2,5)(2,0,1)[7]
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 ma3 ma4 ma5
## -0.8115 -1.1035 -0.3256 -0.1518 0.2615 -0.7194 -0.2779 0.1327
## s.e. 0.3333 0.1566 0.3208 0.3323 0.1569 0.1737 0.3084 0.0453
## sar1 sar2 sma1
## 0.1011 0.1076 -0.0222
## s.e. 0.3307 0.0511 0.3293
##
## sigma^2 = 8.535e-05: log likelihood = 1917.7
## AIC=-3811.4 AICc=-3810.86 BIC=-3758.92
# Plotting the best model Forecast:
Final Forecast of Age 31-40 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 5.888800 5.887380 5.888245 5.887673 5.888160 5.889178 5.886723 5.885406
## [9] 5.885224 5.886267 5.885554 5.885942 5.886153 5.884609 5.882967 5.882803
## [17] 5.884109 5.883198 5.883547 5.883934 5.882227 5.880394 5.880425 5.881885
## [25] 5.880833 5.881114 5.881557 5.879923 5.877923 5.878036 5.879618
Timeplot of Age 41-50 death rate:
Classical decomposition: Additive Seasonality of “clean_41_50”:
Additive Seasonality of age_41_50_death_rate (“clean_41_50”) in
classical decomposition:
Visualizing the age_41_50_death_rate (“clean_41_50”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_41_50
## Dickey-Fuller = -1.8163, Lag order = 8, p-value = 0.6561
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_41_50): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_41_50
## KPSS Level = 4.2846, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_41_50): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_41_50
## Dickey-Fuller = -5.1413, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_41_50
## KPSS Level = 0.63128, Truncation lag parameter = 6, p-value = 0.01979
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Diff_Adj_s_c_41_50
## ARIMA(3,1,3)(2,0,1)[7]
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 ma3 sar1 sar2
## -0.7501 -0.1175 0.1238 -0.1928 -0.4870 -0.2034 0.6050 0.0084
## s.e. 0.4898 0.3951 0.0469 0.4936 0.2192 0.3462 0.3331 0.0531
## sma1
## -0.5491
## s.e. 0.3293
##
## sigma^2 = 8.365e-05: log likelihood = 1923
## AIC=-3826 AICc=-3825.62 BIC=-3782.27
# Plotting the best model Forecast:
Final Forecast of Age 41-50 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] -0.0008526252 -0.0019269300 -0.0016091974 -0.0007366226 0.0002775965
## [6] -0.0003275142 -0.0008334442 -0.0010684481 -0.0019834916 -0.0015962045
## [11] -0.0007645466 0.0002834610 -0.0004073261 -0.0008843674 -0.0010461160
## [16] -0.0018540426 -0.0016052725 -0.0007657726 0.0002892741 -0.0004722499
## [21] -0.0009201294 -0.0010373697 -0.0017750054 -0.0016106095 -0.0007672847
## [26] 0.0002933855 -0.0005125388 -0.0009420671 -0.0010318776 -0.0017261707
## [31] -0.0016138500
Timeplot of Age 51-60 death rate:
Classical decomposition: Additive Seasonality of “clean_51_60”:
Additive Seasonality of age_51_60_death_rate (“clean_51_60”) in
classical decomposition:
Visualizing the age_51_60_death_rate (“clean_51_60”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_51_60
## Dickey-Fuller = -1.8298, Lag order = 8, p-value = 0.6504
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_51_60): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_51_60
## KPSS Level = 7.349, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_51_60): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_51_60
## Dickey-Fuller = -5.7649, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_51_60): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_51_60
## KPSS Level = 1.752, Truncation lag parameter = 6, p-value = 0.01
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Diff_Adj_s_c_51_60
## ARIMA(2,1,1)(2,0,0)[7]
##
## Coefficients:
## ar1 ar2 ma1 sar1 sar2
## -0.0085 -0.0872 -0.8976 0.0386 0.0048
## s.e. 0.0546 0.0490 0.0334 0.0459 0.0435
##
## sigma^2 = 0.0001767: log likelihood = 1701.94
## AIC=-3391.88 AICc=-3391.74 BIC=-3365.64
# Plotting the best model Forecast:
Final Forecast of Age 51_60 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] -0.0034554033 -0.0015666348 -0.0017345626 -0.0011231832 -0.0021331350
## [6] 0.0005066834 -0.0011945706 -0.0036281953 -0.0012590532 -0.0014203719
## [11] -0.0011965861 -0.0022270564 0.0005572294 -0.0013205076 -0.0036524708
## [16] -0.0012443237 -0.0013696030 -0.0012054774 -0.0022423793 0.0005652176
## [21] -0.0013410845 -0.0036542419 -0.0012422717 -0.0013661264 -0.0012061748
## [26] -0.0022434242 0.0005657699 -0.0013424868 -0.0036544274 -0.0012421214
## [31] -0.0013657473
Timeplot of Age 61-70 death rate:
Classical decomposition: Additive Seasonality of “clean_61-70”:
Additive Seasonality of age_61-70_death_rate (“clean_61-70”) in
classical decomposition:
Visualizing the age_61-70_death_rate (“clean_61-70”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_61_70
## Dickey-Fuller = -2.6026, Lag order = 8, p-value = 0.3232
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_61_70): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_61_70
## KPSS Level = 7.0461, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_61_70
## Dickey-Fuller = -2.0836, Lag order = 8, p-value = 0.5429
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_61_70): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_61_70
## KPSS Level = 1.1307, Truncation lag parameter = 6, p-value = 0.01
Even after taking the seasonal difference, the series is still not
stationary by the ADF and KPSS tests. Since the seasonal difference does not
help here, we continue the analysis without it.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Adj_s_c_61_70
## ARIMA(5,2,1)(2,0,2)[7]
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ma1 sar1 sar2
## -1.1421 -0.5276 -0.2625 -0.0675 0.0798 0.6332 0.6354 0.2317
## s.e. 0.1690 0.1082 0.0761 0.0715 0.0483 0.1653 0.2845 0.2600
## sma1 sma2
## -0.3929 -0.0562
## s.e. 0.2866 0.1824
##
## sigma^2 = 0.0009802: log likelihood = 1203.9
## AIC=-2385.81 AICc=-2385.35 BIC=-2337.7
# Plotting the best model Forecast:
Final Forecast of Age 61-70 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 30.88852 30.88111 30.88096 30.87966 30.87370 30.86453 30.86152 30.85035
## [9] 30.83976 30.83883 30.83695 30.82920 30.81978 30.81512 30.80332 30.78820
## [17] 30.78666 30.78412 30.77419 30.76462 30.75840 30.74606 30.72749 30.72528
## [25] 30.72224 30.71049 30.70076 30.69320 30.68033 30.65854 30.65576
Timeplot of Age 71-80 death rate:
Classical decomposition: Additive Seasonality of “clean_71-80”:
Additive Seasonality of age_71_80_death_rate (“clean_71-80”) in
classical decomposition:
Visualizing the age_71-80_death_rate (“clean_71-80”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_71_80
## Dickey-Fuller = -1.6565, Lag order = 8, p-value = 0.7237
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_71_80): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_71_80
## KPSS Level = 3.6597, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_71_80): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_71_80
## Dickey-Fuller = -5.6453, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_71_80): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_71_80
## KPSS Level = 0.32402, Truncation lag parameter = 6, p-value = 0.1
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) and a large p-value of KPSS test (p-value=0.1 greater than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Diff_Adj_s_c_71_80
## ARIMA(2,0,2)(0,0,2)[7] with zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 sma1 sma2
## 1.2305 -0.7289 -1.3658 0.7854 0.2275 0.1022
## s.e. 0.1846 0.2026 0.1495 0.1948 0.0467 0.0512
##
## sigma^2 = 2.848e-05: log likelihood = 2241.56
## AIC=-4469.11 AICc=-4468.92 BIC=-4438.49
# Plotting the best model Forecast:
Final Forecast of Age 71-80 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] -0.0017757072 0.0001526584 0.0004421039 0.0010337341 0.0010363939
## [6] -0.0002511023 -0.0008234869 -0.0017500853 0.0007252975 0.0003180477
## [11] 0.0011040270 0.0008851899 -0.0005757239 -0.0008533213 -0.0017464391
## [16] 0.0010123880 0.0003045708 0.0011660042 0.0008237907 -0.0007357037
## [21] -0.0008888280 -0.0017219282 0.0010500440 0.0003330407 0.0011735890
## [26] 0.0008123716 -0.0007552839 -0.0009045981 -0.0017270614 0.0010552226
## [31] 0.0003431549
Timeplot of Age 81-90 death rate:
Classical decomposition: Additive Seasonality of
“clean_81-90”:
Additive Seasonality of age_81-90_death_rate (“clean_81-90”) in
classical decomposition:
Visualizing the age_81-90_death_rate (“clean_81-90”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_81_90
## Dickey-Fuller = -2.0054, Lag order = 8, p-value = 0.576
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_81_90): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_81_90
## KPSS Level = 7.8549, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_81_90): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_81_90
## Dickey-Fuller = -5.829, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_81_90): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_81_90
## KPSS Level = 0.12262, Truncation lag parameter = 6, p-value = 0.1
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) and a large p-value of KPSS test (p-value=0.1 greater than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Diff_Adj_s_c_81_90
## ARIMA(3,0,3)(0,0,2)[7] with non-zero mean
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 ma3 sma1 sma2 mean
## -0.1292 -0.1605 -0.8004 0.0214 0.0346 0.8934 0.2261 0.1232 9e-04
## s.e. 0.0507 0.0440 0.0425 0.0383 0.0299 0.0328 0.0458 0.0450 3e-04
##
## sigma^2 = 4.549e-05: log likelihood = 2105.36
## AIC=-4190.71 AICc=-4190.33 BIC=-4146.96
# Plotting the best model Forecast:
Final Forecast of Age 81-90 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 5.421856e-03 3.171414e-04 4.339993e-03 -5.689063e-04 7.435274e-05
## [6] 6.151670e-04 1.622889e-03 2.917438e-03 6.118759e-04 2.004177e-03
## [11] 2.364837e-04 2.687999e-04 1.221983e-03 8.517295e-04 1.861860e-03
## [16] -4.400418e-04 1.764766e-03 4.966053e-04 9.792230e-04 9.237638e-04
## [21] 6.648875e-04 1.344096e-03 1.721418e-04 1.780780e-03 8.107273e-04
## [26] 4.460707e-04 9.294408e-04 4.982765e-04 1.791449e-03 1.365223e-04
## [31] 1.846959e-03
Timeplot of Age 91-100 death rate:
Classical decomposition: Additive Seasonality of “clean_91-100”:
Additive Seasonality of age_91-100_death_rate (“clean_91-100”)
in classical decomposition:
Visualizing the age_91-100_death_rate (“clean_91-100”) time
series data after adjusted seasonality with classical
decomposition:
Checking stationary after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_91_100
## Dickey-Fuller = -2.2543, Lag order = 8, p-value = 0.4707
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_91_100): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_91_100
## KPSS Level = 7.8385, Truncation lag parameter = 6, p-value = 0.01
The Series is not stationary by the ADF and KPSS tests. We
need to take seasonal difference in order to make the series
stationary.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_91_100): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_91_100
## Dickey-Fuller = -6.9365, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_91_100): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_91_100
## KPSS Level = 0.082995, Truncation lag parameter = 6, p-value = 0.1
Hence, a small p-value of ADF test (p-value=0.01 less than
Alpha=0.05) and a large p-value of KPSS test (p-value=0.1 greater than
Alpha=0.05) suggests that the series is now stationary after taking
seasonal difference.
Fit into the Best Model Classical Decomposition +
ARIMA:
## Series: Diff_Adj_s_c_91_100
## ARIMA(1,0,1)(0,0,1)[7] with non-zero mean
##
## Coefficients:
## ar1 ma1 sma1 mean
## -0.6232 0.4436 0.1279 3e-04
## s.e. 0.1146 0.1302 0.0399 2e-04
##
## sigma^2 = 1.252e-05: log likelihood = 2482.1
## AIC=-4954.19 AICc=-4954.09 BIC=-4932.32
# Plotting the best model Forecast:
Final Forecast of Age 91-100 daily death rate with seasonal
adjustment Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 1.154185e-03 2.505588e-04 1.843208e-04 -4.851836e-04 3.793708e-04
## [6] -2.323502e-04 7.703687e-04 1.089928e-03 3.869190e-04 7.303300e-05
## [11] -4.368763e-04 4.756952e-04 -2.513691e-04 9.230403e-04 1.060711e-03
## [16] 4.051259e-04 6.168705e-05 -4.298059e-04 4.712892e-04 -2.486234e-04
## [21] 9.213293e-04 1.061778e-03 4.044615e-04 6.210111e-05 -4.300639e-04
## [26] 4.714500e-04 -2.487236e-04 9.213918e-04 1.061739e-03 4.044858e-04
## [31] 6.208600e-05
Converting the dataset into univariate daily data with a weekly pattern by selecting the 7 division columns:
Outliers & Missing value treatment:
## $index
## [1] 373 462 517
##
## $replacements
## [1] 43.67066 43.94064 43.93998
## $index
## [1] 9 10 11 12 22 23 24 25 26 27 28 29 30 31 32 73 74 75
## [19] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93
## [37] 94 95 96 97 107 108 109 110 111 115 116 117 118 122 125 127 128 129
## [55] 130 131 132 133 134 135 157 159 162 163 169 170 171 172 173 174 175 176
## [73] 177 178 179 180 181 182 183 184 188 189 190 191 192 193 194 199 200 201
## [91] 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219
## [109] 220 221 222 223 224 225 231 232 233 234 235 236 237 238 239 241 242 243
## [127] 244 245 246 250 318 319 320 321 322 323 324 325 326 327 328 329 330 331
## [145] 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 371
## [163] 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413
## [181] 414 415 416 417 418 419 420 421 422 423
##
## $replacements
## [1] 18.63055 18.63622 18.62543 18.60783 18.37778 18.36144 18.37704 18.37528
## [9] 18.36719 18.36288 18.35447 18.34283 18.32641 18.34225 18.34000 18.39518
## [17] 18.39237 18.37921 18.37748 18.36531 18.34890 18.33016 18.34571 18.34318
## [25] 18.32983 18.32837 18.31683 18.29930 18.28038 18.29629 18.29402 18.28045
## [33] 18.27925 18.26832 18.25003 18.23079 18.24665 18.24470 18.23097 18.23005
## [41] 18.00993 18.02567 18.02481 18.01242 18.01069 18.05370 18.05168 18.03886
## [49] 18.03412 18.04746 18.07394 18.07080 18.08943 18.13359 18.16362 18.18389
## [57] 18.20930 18.24839 18.22785 18.24625 18.98455 19.00263 19.04790 19.05863
## [65] 19.11494 19.09826 19.11293 19.10916 19.10053 19.09521 19.08210 19.06037
## [73] 19.04253 19.05643 19.05282 19.04347 19.03871 19.02547 19.00580 18.98683
## [81] 18.94096 18.91274 18.87830 18.84453 18.84280 18.82431 18.79812 18.59325
## [89] 18.60955 18.61704 18.63480 18.64256 18.64310 18.64443 18.67764 18.69366
## [97] 18.70070 18.71899 18.72682 18.72739 18.72878 18.76197 18.77772 18.78432
## [105] 18.80315 18.81103 18.81145 18.81441 18.84529 18.86146 18.86851 18.88723
## [113] 18.89525 18.89546 19.23853 19.26885 19.30864 19.36110 19.40968 19.44820
## [121] 19.49565 19.53577 19.56534 19.65189 19.69799 19.73305 19.77486 19.81246
## [129] 19.83785 19.95307 20.31660 20.32016 20.31557 20.31850 20.31594 20.30604
## [137] 20.30048 20.31077 20.31376 20.30822 20.31260 20.30916 20.29845 20.29274
## [145] 20.30409 20.30701 20.30128 20.30551 20.30202 20.29106 20.28514 20.29748
## [153] 20.30026 20.29429 20.29834 20.29480 20.28445 20.27993 20.28949 20.29270
## [161] 20.28739 20.27115 20.26972 20.26131 20.25994 20.25264 20.24107 20.24181
## [169] 20.23834 20.23757 20.22880 20.22881 20.22083 20.20784 20.20990 20.20666
## [177] 20.20538 20.19625 20.19765 20.18901 20.17600 20.17763 20.17476 20.17271
## [185] 20.16414 20.16547 20.15706 20.14435 20.14548 20.14289
## $index
## [1] 1 2 3 4 7 9 10 11 12 14 15 17 18 21 22 24 25 32
## [19] 35 36 38 39 43 45 46 47 48 49 50 52 121 126 127 148 149 150
## [37] 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
## [55] 169 170 171 172 173 174 175 176 177 186 187 189 191 192 193 194 195 196
## [73] 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214
## [91] 216 217 218 219 220 221 223 225 226 227 228 229 236 241 243 317 319 320
## [109] 321 322 323 324 326 327 330 331 332 333 334 335 336 337 338 339 340 341
## [127] 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359
## [145] 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377
## [163] 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395
## [181] 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413
## [199] 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431
## [217] 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449
## [235] 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467
## [253] 468 469 470 471 472 474 475 476 478 479 486
##
## $replacements
## [1] 5.813890 5.812238 5.799668 5.817333 5.825607 5.826708 5.811893 5.827083
## [9] 5.811376 5.797133 5.774171 5.757017 5.793810 5.789965 5.765736 5.746415
## [17] 5.781970 5.777914 5.759770 5.735780 5.719023 5.754181 5.735413 5.700684
## [25] 5.715135 5.702971 5.695448 5.713092 5.711751 5.693832 5.287595 5.269106
## [33] 5.259513 5.519209 5.518470 5.548069 5.557345 5.579464 5.587537 5.503501
## [41] 5.649377 5.646550 5.682285 5.685408 5.706112 5.713910 5.752371 5.779619
## [49] 5.776110 5.857899 5.821743 5.841802 5.853072 5.877275 5.923924 5.918446
## [57] 6.044259 5.966551 5.983790 5.996806 6.005881 6.083619 6.065014 6.645144
## [65] 6.738213 6.718655 6.864808 6.975105 6.896069 6.881311 6.877570 6.758924
## [73] 6.983954 6.903574 6.999367 6.941811 6.923796 6.916960 6.798750 7.020205
## [81] 6.842839 6.911535 6.955724 7.300406 6.992552 6.802541 7.024541 6.755092
## [89] 6.801908 6.954174 7.134523 6.948320 7.246749 7.088983 7.162070 7.409623
## [97] 7.754837 7.831311 7.583826 7.480835 7.620743 7.262980 7.277837 7.278428
## [105] 7.374415 7.337038 7.350484 7.353369 7.350443 7.357570 7.365123 7.356151
## [113] 7.348931 7.343873 7.337824 7.323549 7.296189 7.323459 7.327053 7.319809
## [121] 7.333890 7.343035 7.324421 7.296802 7.324460 7.328733 7.320129 7.338201
## [129] 7.344859 7.325775 7.298399 7.325975 7.330374 7.321584 7.340233 7.346957
## [137] 7.327357 7.300180 7.327634 7.332118 7.323101 7.342293 7.348772 7.329087
## [145] 7.301948 7.329389 7.333886 7.324886 7.344121 7.350567 7.330803 7.303704
## [153] 7.331132 7.335641 7.326653 7.345925 7.352354 7.332585 7.305484 7.332910
## [161] 7.337417 7.328427 7.347700 7.354123 7.334356 7.307257 7.334684 7.339190
## [169] 7.330200 7.349473 7.355897 7.336129 7.309030 7.336457 7.340963 7.331972
## [177] 7.351247 7.357670 7.337902 7.310803 7.338229 7.342735 7.333745 7.353019
## [185] 7.359443 7.339676 7.312576 7.340003 7.344509 7.335518 7.354793 7.361215
## [193] 7.341448 7.314349 7.341775 7.346280 7.337290 7.356564 7.362989 7.343222
## [201] 7.316122 7.343548 7.348054 7.339064 7.358339 7.364761 7.344994 7.317893
## [209] 7.345318 7.349823 7.340833 7.360107 7.366536 7.346772 7.319670 7.347094
## [217] 7.351600 7.342610 7.361880 7.368302 7.348541 7.321438 7.348856 7.353356
## [225] 7.344357 7.363613 7.370043 7.350377 7.323290 7.350662 7.355132 7.346216
## [233] 7.365213 7.371759 7.352200 7.325139 7.352468 7.356905 7.348065 7.366794
## [241] 7.373120 7.354224 7.327026 7.354698 7.358564 7.351035 7.366745 7.374586
## [249] 7.356404 7.329123 7.357193 7.360537 7.354362 7.367083 7.372645 7.359132
## [257] 7.332287 7.359754 7.352888 7.358367 7.350482 7.323376 7.332117
## $index
## [1] 119 120 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137
## [19] 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155
## [37] 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173
## [55] 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191
## [73] 192 193 194 195 196 197 199 200 201 202 203 204 205 206 207 208 209 210
## [91] 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228
## [109] 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246
## [127] 247 248 249 250 251 252
##
## $replacements
## [1] 6.187489 6.181767 6.224896 6.271435 6.319575 6.366469 6.427841
## [8] 6.476795 6.507742 6.573003 6.619772 6.667780 6.714695 6.776056
## [15] 6.825065 6.855954 6.921217 6.968021 7.016006 7.062929 7.124287
## [22] 7.173330 7.204161 7.269428 7.316268 7.364231 7.411159 7.472512
## [29] 7.521540 7.552394 7.617668 7.664516 7.712475 7.759405 7.820753
## [36] 7.869734 7.900615 7.965897 8.012753 8.060706 8.107638 8.168978
## [43] 8.217973 8.248846 8.314135 8.360990 8.408942 8.455871 8.517212
## [50] 8.566202 8.597071 8.662368 8.709224 8.757176 8.804104 8.865446
## [57] 8.914437 8.945286 9.010603 9.057457 9.105408 9.152335 9.213679
## [64] 9.262674 9.293504 9.358839 9.405691 9.453639 9.500566 9.561911
## [71] 9.610903 9.641751 9.707068 9.753918 9.801866 9.848793 9.910137
## [78] 9.959134 10.060480 10.112511 10.165640 10.217748 10.284275 10.338448
## [85] 10.374516 10.444977 10.497007 10.550137 10.602243 10.668767 10.722950
## [92] 10.759039 10.829482 10.881514 10.934647 10.986756 11.053282 11.107425
## [99] 11.143493 11.213990 11.266028 11.319158 11.371261 11.437769 11.491919
## [106] 11.527964 11.598515 11.650561 11.703693 11.755798 11.822296 11.876334
## [113] 11.912425 11.982996 12.035092 12.088256 12.140359 12.206643 12.260794
## [120] 12.296926 12.367516 12.419670 12.472880 12.525003 12.591098 12.645352
## [127] 12.681540 12.751331 12.804347 12.857131 12.910118 12.975300
## $index
## [1] 32 35 36 37 86 87 88 90 91 98 99 100 106 108 113 120 123 124 125
## [20] 128 143 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205
## [39] 206 207 208 209 210 211 212 213 214 216 217 218 219 220 229 230 231 232 233
## [58] 234 235 236 237 238 239 240 241 242 243 244 245 252 253
##
## $replacements
## [1] 3.026799 3.060932 3.011817 3.015552 2.989840 2.986973 2.980391 3.028940
## [9] 3.041540 3.016574 3.009310 2.989092 2.975395 2.973256 2.964287 2.978028
## [17] 2.979350 2.976767 2.974824 2.980030 3.031125 2.953289 2.957476 2.959601
## [25] 2.970421 2.976563 2.962805 2.980352 2.987952 2.990893 2.993562 3.004705
## [33] 3.010808 2.996592 3.010528 3.022201 3.024244 3.026029 3.037935 3.043537
## [41] 3.028441 3.054786 3.054275 3.056026 3.057224 3.070212 3.075636 3.104234
## [49] 3.097051 3.104175 3.110477 3.129601 3.217699 3.245958 3.254044 3.259791
## [57] 3.263558 3.277959 3.290834 3.278158 3.306383 3.314015 3.320307 3.323385
## [65] 3.333870 3.351112 3.337981 3.365999 3.372922 3.414919 3.420470
## $index
## [1] 3 6 7 9 11 12 13 15 17 18 19 21 22 27 34 35 36 37
## [19] 38 39 40 41 42 55 94 102 103 104 109 118 119 120 121 122 123 124
## [37] 125 126 164 166 168 169 170 171 172 173 174 175 178 179 180 181 182 183
## [55] 184 187 188 189 196 198 200 201 202 203 204 205 206 207 208 209 210 211
## [73] 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 230
## [91] 231 232 233 240 242 244 245 246 247 248 249 250 251 252 253 254 255 256
## [109] 257 258 259 260 261 262 263 264 265 266 267 268 269 356 373
##
## $replacements
## [1] 3.935961 3.878310 3.881898 3.863692 3.867516 3.856858 3.831871 3.827585
## [9] 3.829204 3.856745 3.864498 3.890446 3.827311 3.818644 3.834636 3.815826
## [17] 3.790854 3.776254 3.768429 3.768124 3.744943 3.776026 3.759178 3.735026
## [25] 3.585689 3.546823 3.535111 3.535887 3.512725 3.413973 3.410736 3.415548
## [33] 3.403062 3.393334 3.400189 3.394455 3.401503 3.398306 3.638231 3.657842
## [41] 3.671463 3.664683 3.667553 3.656584 3.670460 3.671666 3.686075 3.687204
## [49] 3.675953 3.686436 3.684859 3.718569 3.694176 3.684194 3.683933 3.673966
## [57] 3.725405 3.674055 3.586521 3.593008 3.597783 3.605110 3.619931 3.625778
## [65] 3.620017 3.637971 3.632587 3.649555 3.655827 3.670152 3.675652 3.661614
## [73] 3.687573 3.682893 3.699276 3.704661 3.718575 3.723749 3.717397 3.734819
## [81] 3.730794 3.748206 3.752040 3.766549 3.771746 3.773160 3.782225 3.779004
## [89] 3.797581 3.829706 3.849493 3.865531 3.889236 4.041781 4.114837 4.162283
## [97] 4.174476 4.183029 4.199119 4.201427 4.227520 4.236527 4.258614 4.270983
## [105] 4.279710 4.295728 4.297730 4.322559 4.332837 4.354501 4.366728 4.375144
## [113] 4.390779 4.395404 4.416638 4.426681 4.465221 4.460189 4.468611 4.484276
## [121] 4.491948 4.540000 4.540000
## $index
## [1] 1 2 3 14 17 18 19 20 21 95 98 105 106 119 120 121 122 128
## [19] 129 130 131 132 137 138 139 140 141 142 149 150 151 152 153 154 155 156
## [37] 157 158 159 162 163 164 165 166 178 179 180 181 182 183 184 185 186 187
## [55] 188 189 190 191 192 193 194 197 198 199 203 204 205 206 207 208 211 213
## [73] 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231
## [91] 232 233 235 236 238 322 323 324 325 326 327 328 329 330 331 332 333 334
## [109] 335 336 337 338 339 340 341 342 343 344 345 373 403 404 405 406 407 408
## [127] 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423
##
## $replacements
## [1] 4.495341 4.494771 4.504494 4.520721 4.516094 4.508285 4.509912 4.504864
## [9] 4.500151 4.219116 4.174710 4.069734 4.075372 3.815455 3.855611 3.827931
## [17] 3.683346 3.742843 3.524508 3.720149 3.702992 3.702674 3.871101 3.842949
## [25] 3.835175 3.811891 3.923295 3.906135 3.659993 3.431075 3.665575 3.639539
## [33] 3.630683 3.639590 3.726178 3.750466 3.535864 3.760656 3.730094 3.765262
## [41] 3.797582 3.676068 3.810920 3.790494 3.911562 3.951471 3.965450 3.959955
## [49] 3.998410 4.030010 4.070708 4.088529 4.109879 4.134711 4.131240 4.167945
## [57] 4.197891 4.238278 4.255930 4.275705 4.301509 4.392365 4.436443 4.457889
## [65] 4.526105 4.555344 4.597646 4.617277 4.637475 4.666166 4.723149 4.755843
## [73] 4.762148 4.776932 4.788399 4.788386 4.803244 4.831672 4.837332 4.844169
## [81] 4.858469 4.870046 4.870916 4.885058 4.913326 4.918783 4.926127 4.939919
## [89] 4.951586 4.953326 4.968276 4.994337 4.997188 4.999282 4.999374 4.885903
## [97] 4.886852 4.889726 4.891994 4.888158 4.890499 4.890238 4.884405 4.885844
## [105] 4.888325 4.891583 4.888036 4.890227 4.889786 4.884076 4.884826 4.886880
## [113] 4.891099 4.887819 4.889850 4.889230 4.883653 4.884688 4.886034 4.889118
## [121] 4.868348 4.869710 4.869017 4.868467 4.864263 4.864203 4.867721 4.865468
## [129] 4.867250 4.866642 4.865887 4.862109 4.861614 4.865030 4.862717 4.864527
## [137] 4.863917 4.862940 4.860046 4.859106 4.862398
## $index
## [1] 9 21 22 23 24 25 43 44 45 46 47 48 49 50 51 90 96 113 114
## [20] 115 116 117 119 122 124 164 165 166 167 174 175 176 177 180 181 182 183 184
## [39] 185 186 187 188 189 190 191 192 193 194 195 197 198 199 203 212 215 226 230
## [58] 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250
## [77] 251 252 253 254 255 256 257 258 259 260 261
##
## $replacements
## [1] 2.200121 2.302995 2.303752 2.305572 2.308213 2.311943 2.345183 2.341224
## [9] 2.342049 2.341846 2.334833 2.337903 2.342240 2.338668 2.334575 2.233812
## [17] 2.280530 2.132059 2.127063 2.141067 2.120447 2.122663 2.124780 2.130438
## [25] 2.132318 2.043385 2.040356 2.038757 2.038398 2.065953 2.092233 2.077155
## [33] 2.080119 2.095913 2.105679 2.134521 2.121933 2.126685 2.139214 2.147536
## [41] 2.151932 2.162981 2.191910 2.179308 2.183979 2.198046 2.208075 2.200804
## [49] 2.220651 2.244888 2.257254 2.280581 2.361320 2.502118 2.485725 2.706020
## [57] 2.717092 2.786431 2.790195 2.814967 2.805137 2.785812 2.787805 2.838370
## [65] 2.839745 2.841657 2.867761 2.855306 2.854328 2.875721 2.888298 2.895175
## [73] 2.896967 2.919986 2.912663 2.916327 2.932500 2.945293 2.955337 2.956046
## [81] 2.975013 2.971937 2.979514 2.989954 3.002667 3.012503 3.015853
## clean_dh clean_rs clean_kh clean_ba
## Min. :43.49 Min. :5.210 Min. : 6.160 Min. :2.950
## 1st Qu.:43.74 1st Qu.:5.789 1st Qu.: 7.509 1st Qu.:3.030
## Median :43.94 Median :7.331 Median :12.760 Median :3.370
## Mean :48.15 Mean :6.800 Mean :10.827 Mean :3.243
## 3rd Qu.:55.47 3rd Qu.:7.350 3rd Qu.:12.852 3rd Qu.:3.380
## Max. :58.47 Max. :7.831 Max. :13.040 Max. :3.430
## clean_sy clean_rp clean_mm
## Min. :3.393 Min. :3.431 Min. :2.038
## 1st Qu.:3.700 1st Qu.:4.360 1st Qu.:2.312
## Median :4.530 Median :4.860 Median :3.020
## Mean :4.176 Mean :4.623 Mean :2.715
## 3rd Qu.:4.550 3rd Qu.:4.886 3rd Qu.:3.020
## Max. :4.560 Max. :5.010 Max. :3.050
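The $index / $replacements listings above have the structure returned by forecast::tsoutliers(). A minimal sketch of how each division series could have been cleaned (the helper and the call to summary() are illustrative, not taken from the original script):

```r
library(forecast)

clean_division <- function(x) {
  # x: daily ts (frequency = 7) with possible missing values and outliers
  x   <- na.interp(x)       # interpolate missing values
  out <- tsoutliers(x)      # suggested outlier indices and replacement values
  print(out)                # $index and $replacements, as listed above
  x[out$index] <- out$replacements
  x
}

# e.g. clean_dh <- clean_division(dh_death_rate); then summarise the cleaned series:
summary(data.frame(clean_dh, clean_rs, clean_kh, clean_ba,
                   clean_sy, clean_rp, clean_mm))
```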
Timeplot for Dhaka daily death rate:
Classical decomposition: Additive Seasonality of “clean_dh”:
Additive Seasonality of Dhaka daily death rate (“clean_dh”) in
classical decomposition:
Visualizing the Dhaka daily death rate (“clean_dh”) time series
data after adjusted seasonality with classical decomposition:
Split “clean_dh” into train : test = 70 : 30 dataset:
## [1] 176
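The printed 176 is the length of the held-out test set. A minimal sketch of the 70 : 30 split (the rounding rule is ours):

```r
# split the cleaned Dhaka series into roughly 70% training and 30% test,
# keeping the weekly (frequency = 7) time index
n        <- length(clean_dh)
n_train  <- round(0.7 * n)

train_dh <- window(clean_dh, end   = time(clean_dh)[n_train])
test_dh  <- window(clean_dh, start = time(clean_dh)[n_train + 1])

length(test_dh)   # number of held-out observations (176 above)
```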
## Model1: TBATS
## BATS(1, {2,0}, 0.986, -)
##
## Call: tbats(y = train_dh)
##
## Parameters
## Alpha: 0.2125907
## Beta: 0.1085413
## Damping Parameter: 0.985503
## AR coefficients: 0.847158 -0.115533
##
## Seed States:
## [,1]
## [1,] 54.5705825
## [2,] 0.0224782
## [3,] 0.0000000
## [4,] 0.0000000
##
## Sigma: 0.0342772
## AIC: -280.9169
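The BATS(1, {2,0}, 0.986, -) form above is selected automatically by tbats(). A minimal sketch of fitting Model 1 and forecasting the test horizon (object names are illustrative):

```r
library(forecast)

# Model 1: TBATS on the Dhaka training series; tbats() chooses the Box-Cox,
# ARMA-error and damping settings itself
fit_tbats_dh <- tbats(train_dh)
fit_tbats_dh

# forecast over the test horizon and keep the point forecasts
f_tbats_dh      <- forecast(fit_tbats_dh, h = length(test_dh))
f_tbats_dh_mean <- f_tbats_dh$mean
plot(f_tbats_dh)
```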
Model1: TBATS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 21.42 0.3729
## McLeod-Li Q Q ~ chisq(20) 282.57 0 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 263 0.2263
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 214 0.1474
## Rank P (P-42333)/1396.3 ~ N(0,1) 43383 0.4521
Model1: TBATS errors Diagnostic Assumptions and useful properties:
1. {et} are uncorrelated: all of the spikes in the ACF and PACF plots fall
within the threshold bounds, so the residuals are uncorrelated. The lack of
correlation suggests the forecasts are good.
2. {et} have mean zero: the time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as
constant.
3. {et} are normally distributed: the Normal Q-Q plot shows that the
residuals are normally distributed.
Plotting Forecast of Model1-TBATS:
Converting f_tbats_dh_mean to the format of the test data
TBATS-forecasting accuracy measures based on the test
dataset:
Model1-TBATS accuracy:
## MSE_m1_dh = 0.8909398
## RMSE_m1_dh = 0.9438961
## ME_m1_dh = -0.8720921
## MAE_m1_dh = 0.8720921
## MPE_m1_dh = -1.98412
## MAPE_m1_dh = 1.98412
Classical Decomposition on training dataset (traindata) -
Seasonal adjustment:
Additive Seasonality of Dhaka daily death rate in classical
decomposition:
Seasonality Adjusted from train_dh death rate in Classical
decomposition:
Visualizing the “train_dh death rate” adjusted seasonality with
classical decomposition:
Now check stationary of “Adj_s_c_train_dh”
Augmented Dickey-Fuller (ADF) test:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_train_dh
## Dickey-Fuller = -3.7752, Lag order = 7, p-value = 0.02042
## alternative hypothesis: stationary
The null hypothesis will be rejected if p-value < alpha.
Here, p-value < 0.05, so we reject the null hypothesis.
Hence, a small p-value (p-value=0.02042, less than Alpha=0.05)
from the ADF test suggests that the series is stationary.
The Kwiatkowski-Phillips-Schmidt-Shin (KPSS)
test:
## Warning in kpss.test(Adj_s_c_train_dh): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_train_dh
## KPSS Level = 5.9797, Truncation lag parameter = 5, p-value = 0.01
The null hypothesis will be rejected if p-value < alpha.
Here, p-value < 0.05. So, we reject the null hypothesis.
Hence, a small p-value (p-value=0.01 less than Alpha=0.05) from
KPSS test suggests that the series is not stationary.
Since the two tests disagree (ADF suggests stationary, KPSS suggests
non-stationary), we take a seasonal difference to try to make the series
stationary under both tests.
Taking seasonal difference to get a stationary series:
Rechecking stationary after taking seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_train_dh
## Dickey-Fuller = -1.2264, Lag order = 7, p-value = 0.9016
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_train_dh): p-value smaller than printed p-
## value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_train_dh
## KPSS Level = 0.99611, Truncation lag parameter = 5, p-value = 0.01
Hence, a large p-value from the ADF test (p-value=0.9016, greater than
Alpha=0.05) and a small p-value from the KPSS test (p-value=0.01, less than
Alpha=0.05) suggest that the series is still not stationary after taking the
seasonal difference. So, we continue without the seasonal difference.
## Model2- Classical decomposition + ARIMA
## Series: Adj_s_c_train_dh
## ARIMA(0,2,1)(1,0,1)[7]
##
## Coefficients:
## ma1 sar1 sma1
## -0.7431 -0.3137 0.4663
## s.e. 0.0287 0.1924 0.1758
##
## sigma^2 = 0.001176: log likelihood = 804.7
## AIC=-1601.4 AICc=-1601.3 BIC=-1585.33
Model2- ARIMA residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 31.02 0.0549
## McLeod-Li Q Q ~ chisq(20) 325.86 0 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 269 0.6118
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 208 0.67
## Rank P (P-42333)/1396.3 ~ N(0,1) 42897 0.6863
Model2: auto.arima residuals Diagnostic Assumptions and useful
properties:
1. {et} are uncorrelated: all of the spikes in the ACF and PACF plots fall
within the threshold bounds, so the residuals are uncorrelated. The lack of
correlation suggests the forecasts are good.
2. {et} have mean zero: the time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as
constant.
3. {et} are normally distributed: the Normal Q-Q plot shows that the
residuals are normally distributed.
Plotting Forecast: Model2- Classical decomposition +
ARIMA
Forecasting with seasonal adjustment: Classical decomposition +
ARIMA:
Adding the Model2 auto.arima forecast mean to the seasonal index of
the Dhaka daily death rate over the length of the test data [413:588].
Converting f_arima_sea_adj to the format of the test data:
Forecasting accuracy measures based on the test dataset: Model2
- Classical decomposition + ARIMA
Accuracy of Model2- Classical decomposition +
ARIMA:
## MSE_m2_dh = 1.863793
## RMSE_m2_dh = 1.365208
## ME_m2_dh = -1.173337
## MAE_m2_dh = 1.173337
## MPE_m2_dh = -2.669113
## MAPE_m2_dh = 2.669113
Using whole dataset (clean_dh)
Seasonality (seasonal index) of dh_death_rate in STL
decomposition:
Seasonal adjustment: STL Decomposition on training dataset
(train)
Seasonality Adjusted training data in STL
decomposition:
Plotting Seasonality Adjusted training data in STL
decomposition
Checking stationary of Seasonality Adjusted training data in STL
decomposition:
##
## Augmented Dickey-Fuller Test
##
## data: seasadj_STL_train_dh
## Dickey-Fuller = -3.4079, Lag order = 7, p-value = 0.05267
## alternative hypothesis: stationary
## Warning in kpss.test(seasadj_STL_train_dh): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: seasadj_STL_train_dh
## KPSS Level = 5.9796, Truncation lag parameter = 5, p-value = 0.01
Hence, a large p-value from the ADF test (p-value=0.05267, greater than
Alpha=0.05) suggests that the series is not stationary, and the small KPSS
p-value (p-value=0.01, less than 0.05) likewise suggests that the series is
not stationary, so differencing is required.
Taking seasonal difference to get a stationary
series:
Rechecking stationary after taking seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_seasadj_STL_train_dh
## Dickey-Fuller = -1.3, Lag order = 7, p-value = 0.8731
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_seasadj_STL_train_dh): p-value smaller than printed p-
## value
##
## KPSS Test for Level Stationarity
##
## data: Diff_seasadj_STL_train_dh
## KPSS Level = 0.99892, Truncation lag parameter = 5, p-value = 0.01
The ADF and KPSS tests suggest that the series is still not stationary
after taking the seasonal difference. So, we continue the analysis without
the seasonal difference.
## Model3- STL+ETS: Fit a model
## ETS(M,A,N)
##
## Call:
## ets(y = seasadj_STL_train_dh)
##
## Smoothing parameters:
## alpha = 0.9101
## beta = 0.3064
##
## Initial states:
## l = 54.6539
## b = 0.0055
##
## sigma: 6e-04
##
## AIC AICc BIC
## -383.4993 -383.3515 -363.3942
Model3: STL+ETS residuals diagnostic checking:
## Null hypothesis: Residuals are iid noise.
## Test Distribution Statistic p-value
## Ljung-Box Q Q ~ chisq(20) 78.52 0 *
## McLeod-Li Q Q ~ chisq(20) 305.02 0 *
## Turning points T (T-273.3)/8.5 ~ N(0,1) 271 0.7847
## Diff signs S (S-205.5)/5.9 ~ N(0,1) 207 0.7982
## Rank P (P-42333)/1396.3 ~ N(0,1) 42733 0.7745
Model3: STL + ETS residuals Diagnostic Assumptions and useful
properties:
1. {et} are uncorrelated: all of the spikes in the ACF and PACF plots fall
within the threshold bounds, so the residuals are uncorrelated. The lack of
correlation suggests the forecasts are good.
2. {et} have mean zero: the time plot of the residuals shows a mean of zero,
and the variation of the residuals stays much the same across the historical
data apart from one outlier, so the residual variance can be treated as
constant.
3. {et} are normally distributed: the Normal Q-Q plot shows that the
residuals are normally distributed.
Plotting Model3-STL+ETS Forecast:
Forecasting with seasonal adjustment:
Model3-STL+ETS
Adding model3 Forecast mean with STL seasonal index of whole
data in the length of the test data.
Converting f_stlets_dccr_sea_adj to the format of the test data
Forecasting accuracy measures based on the test dataset: Model3
- STL+ETS
Accuracy of Model3- STL+ETS:
## MSE_m3_dh = 2.28557
## RMSE_m3_dh = 1.51181
## ME_m3_dh = -1.300388
## MAE_m3_dh = 1.300388
## MPE_m3_dh = -2.958131
## MAPE_m3_dh = 2.958131
| Model | MSE | RMSE | MAE | MPE | MAPE |
|---|---|---|---|---|---|
| M1-TBATS | 0.8909398 | 0.9438961 | 0.8720921 | -1.98412 | 1.98412 |
| M2-CD+ARIMA | 1.863793 | 1.365208 | 1.173337 | -2.669113 | 2.669113 |
| M3-STL+ETS | 2.28557 | 1.51181 | 1.300388 | -2.958131 | 2.958131 |
From the above comparison, it is obvious that our M2 (Classical Decomposition+ARIMA) has lower error.
So, comparatively M2 is the best model.
Seasonal Index using classical decomposition (whole data) => s_c_dh
Fit into the Best Model Classical Decomposition + ARIMA:
## Series: Adj_s_c_dh
## ARIMA(1,2,3)(0,0,2)[7]
##
## Coefficients:
## ar1 ma1 ma2 ma3 sma1 sma2
## 0.8477 -1.6531 0.6788 0.0340 0.1114 -0.0952
## s.e. 0.3576 0.3572 0.3596 0.1186 0.0514 0.0483
##
## sigma^2 = 0.0008169: log likelihood = 1257.8
## AIC=-2501.61 AICc=-2501.41 BIC=-2470.99
# Plotting the best model Forecast:
Final Forecasting with seasonal adjustment
Classical Decomposition + ARIMA
Adding model2 Forecast mean with seasonality of whole data in the
length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 43.99991 43.99715 43.99608 43.99302 43.98980 43.99565 43.98765 43.98729
## [9] 43.98557 43.98152 43.97649 43.97260 43.97856 43.97192 43.97132 43.96868
## [17] 43.96570 43.96099 43.95727 43.96266 43.95552 43.95487 43.95221 43.94921
## [25] 43.94448 43.94075 43.94613 43.93898 43.93832 43.93565 43.93265
#### The Dhaka daily COVID-19 death rate shows a downward trend for the
next month, August 2022.
From the above analysis, Classical Decomposition + ARIMA is the
best model for this dataset. Therefore, we will use this model to
forecast the division-wise daily death rates, repeating the same
workflow for each division (a sketch of this workflow is given below).
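A minimal sketch of the per-division workflow that follows, shown for Chattogram. The names clean_cg, Adj_s_c_cg and Diff_Adj_s_c_cg appear in the output below; dec_cg, s_c_cg, fit_cg, fc_cg and final_cg are hypothetical:

```r
library(forecast)
library(tseries)

# Classical (additive) decomposition and seasonal adjustment
dec_cg     <- decompose(clean_cg, type = "additive")
s_c_cg     <- dec_cg$seasonal          # additive seasonal index
Adj_s_c_cg <- seasadj(dec_cg)          # seasonally adjusted series

# Stationarity checks, then a lag-7 seasonal difference where needed
adf.test(Adj_s_c_cg)
kpss.test(Adj_s_c_cg)
Diff_Adj_s_c_cg <- diff(Adj_s_c_cg, lag = 7)
adf.test(Diff_Adj_s_c_cg)
kpss.test(Diff_Adj_s_c_cg)

# Best model: ARIMA order chosen by auto.arima() on the working series
fit_cg <- auto.arima(Diff_Adj_s_c_cg)

# Forecast the next month and add the weekly seasonal index back
fc_cg    <- forecast(fit_cg, h = 31)
final_cg <- fc_cg$mean + rep(s_c_cg[1:7], length.out = 31)
```

For a division where the seasonally differenced series is still non-stationary (for example Khulna below), the same auto.arima() call is made on the seasonally adjusted series instead, letting ARIMA choose its own order of differencing.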
Timeplot of Chattogram daily death rate:
Classical decomposition: Additive Seasonality of “clean_cg”:
Additive Seasonality of age_cg_death_rate (“clean_cg”) in classical
decomposition:
Visualizing the age_cg_death_rate (“clean_cg”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_cg
## Dickey-Fuller = -2.0544, Lag order = 8, p-value = 0.5553
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_cg): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_cg
## KPSS Level = 6.7873, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_cg
## Dickey-Fuller = -3.8778, Lag order = 8, p-value = 0.01511
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_cg
## KPSS Level = 0.40971, Truncation lag parameter = 6, p-value = 0.07297
Hence, the small ADF p-value (0.01, less than alpha = 0.05) and the large
KPSS p-value (0.07297, greater than alpha = 0.05) suggest that the series
is now stationary after taking the seasonal difference.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Diff_Adj_s_c_cg
## ARIMA(1,0,1)(1,0,1)[7] with zero mean
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.9212 -0.6699 0.7715 -0.6013
## s.e. 0.0264 0.0499 0.0790 0.1009
##
## sigma^2 = 0.0002002: log likelihood = 1667.79
## AIC=-3325.58 AICc=-3325.47 BIC=-3303.7
Plotting the best-model forecast:
Final forecast of the Chattogram daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] -0.0009068336 -0.0052994602 -0.0024331559 0.0046542035 0.0017168612
## [6] 0.0030372968 0.0031063878 -0.0016890551 -0.0055881080 -0.0018173292
## [11] 0.0043244721 0.0016719434 0.0029990058 0.0029094113 -0.0022784132
## [16] -0.0057977782 -0.0013302061 0.0040811337 0.0016474733 0.0029788462
## [21] 0.0027660837 -0.0027251516 -0.0059522073 -0.0009476236 0.0038996189
## [26] 0.0016343292 0.0029685760 0.0026603709 -0.0030653350 -0.0060672216
## [31] -0.0006486485
Timeplot of Rajshahi daily death rate:
Classical decomposition: Additive Seasonality of “clean_rs”:
Additive Seasonality of rs_death_rate (“clean_rs”) in classical
decomposition:
Visualizing the rs_death_rate (“clean_rs”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_rs
## Dickey-Fuller = -1.0178, Lag order = 8, p-value = 0.9358
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_rs): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_rs
## KPSS Level = 6.1343, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_rs): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_rs
## Dickey-Fuller = -5.3597, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_rs): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_rs
## KPSS Level = 0.34149, Truncation lag parameter = 6, p-value = 0.1
Hence, the small ADF p-value (0.01, less than alpha = 0.05) and the large
KPSS p-value (0.1, greater than alpha = 0.05) suggest that the series is
now stationary after taking the seasonal difference.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Diff_Adj_s_c_rs
## ARIMA(0,0,1)(0,0,2)[7] with zero mean
##
## Coefficients:
## ma1 sma1 sma2
## -0.2967 0.7109 0.1591
## s.e. 0.0404 0.0409 0.0446
##
## sigma^2 = 0.002493: log likelihood = 926.06
## AIC=-1844.13 AICc=-1844.06 BIC=-1826.63
Plotting the best-model forecast:
Final forecast of the Rajshahi daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 0.0103760005 0.0027365589 -0.0098466906 -0.0009408113 0.0052010932
## [6] 0.0039001665 -0.0035770820 0.0134752011 -0.0049333121 -0.0113821168
## [11] 0.0017994836 0.0076087924 0.0008902547 -0.0062857613 0.0145005096
## [16] -0.0064637613 -0.0115359190 0.0018033957 0.0081450212 0.0002192035
## [21] -0.0068958797 0.0147279397 -0.0064637613 -0.0115359190 0.0018033957
## [26] 0.0081450212 0.0002192035 -0.0068958797 0.0147279397 -0.0064637613
## [31] -0.0115359190
Timeplot of Khulna daily death rate:
Classical decomposition: Additive Seasonality of “clean_kh”:
Additive Seasonality of kh_death_rate (“clean_kh”) in classical
decomposition:
Visualizing the kh_death_rate (“clean_kh”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_kh
## Dickey-Fuller = -1.9352, Lag order = 8, p-value = 0.6057
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_kh): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_kh
## KPSS Level = 6.8355, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_kh
## Dickey-Fuller = -2.1923, Lag order = 8, p-value = 0.4969
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_kh): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_kh
## KPSS Level = 1.4863, Truncation lag parameter = 6, p-value = 0.01
Hence, the ADF and KPSS tests indicate that the series is still not
stationary after taking the seasonal difference, so we will continue the
analysis without seasonal differencing for this series.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Adj_s_c_kh
## ARIMA(2,2,2)(1,0,2)[7]
##
## Coefficients:
## ar1 ar2 ma1 ma2 sar1 sma1 sma2
## -0.9241 -0.1447 0.2801 -0.4327 0.8342 -0.7298 0.1290
## s.e. 0.1167 0.0690 0.1106 0.0804 0.0477 0.0640 0.0466
##
## sigma^2 = 8.558e-05: log likelihood = 1915.55
## AIC=-3815.1 AICc=-3814.85 BIC=-3780.11
Plotting the best-model forecast:
Final forecast of the Khulna daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 12.73954 12.73687 12.73774 12.73573 12.73603 12.73390 12.73536 12.73488
## [9] 12.73196 12.73323 12.73142 12.73148 12.72964 12.73070 12.73006 12.72680
## [17] 12.72826 12.72645 12.72625 12.72439 12.72532 12.72461 12.72099 12.72267
## [25] 12.72082 12.72043 12.71852 12.71937 12.71858 12.71467 12.71653
Timeplot of Barisal daily death rate:
Classical decomposition: Additive Seasonality of “clean_ba”:
Additive Seasonality of ba_death_rate (“clean_ba”) in classical
decomposition:
Visualizing the ba_death_rate (“clean_ba”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_ba
## Dickey-Fuller = -1.4568, Lag order = 8, p-value = 0.8082
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_ba): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_ba
## KPSS Level = 6.2515, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_ba): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_ba
## Dickey-Fuller = -4.7937, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_ba
## KPSS Level = 0.40526, Truncation lag parameter = 6, p-value = 0.07489
Hence, the small ADF p-value (0.01, less than alpha = 0.05) and the large
KPSS p-value (0.07489, greater than alpha = 0.05) suggest that the series
is now stationary after taking the seasonal difference.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Diff_Adj_s_c_ba
## ARIMA(2,0,2)(0,0,1)[7] with zero mean
##
## Coefficients:
## ar1 ar2 ma1 ma2 sma1
## 1.2625 -0.2842 -1.4477 0.5050 0.1201
## s.e. 0.4232 0.4138 0.3931 0.3698 0.0471
##
## sigma^2 = 6.571e-05: log likelihood = 1995.74
## AIC=-3979.48 AICc=-3979.34 BIC=-3953.23
Plotting the best-model forecast:
Final forecast of the Barisal daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 1.076929e-04 -1.289053e-03 -1.182423e-03 4.374031e-04 3.584421e-04
## [6] 1.258056e-03 7.625738e-04 -9.700829e-05 -1.633603e-03 -1.142092e-03
## [11] 6.209205e-04 4.846427e-04 1.347779e-03 6.893568e-04 -1.203067e-04
## [16] -1.645573e-03 -1.150582e-03 6.136025e-04 4.778167e-04 1.341241e-03
## [21] 6.830421e-04 -1.264210e-04 -1.651498e-03 -1.156325e-03 6.080364e-04
## [26] 4.724214e-04 1.336011e-03 6.779728e-04 -1.313348e-04 -1.656261e-03
## [31] -1.160942e-03
Timeplot of Sylhet daily death rate:
Classical decomposition: Additive Seasonality of “clean_sy”:
Additive Seasonality of sy_death_rate (“clean_sy”) in classical
decomposition:
Visualizing the sy_death_rate (“clean_sy”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_sy
## Dickey-Fuller = -1.7675, Lag order = 8, p-value = 0.6767
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_sy): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_sy
## KPSS Level = 6.864, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_sy
## Dickey-Fuller = -3.3026, Lag order = 8, p-value = 0.07025
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_sy
## KPSS Level = 0.65996, Truncation lag parameter = 6, p-value = 0.01719
Hence, the series is still not stationary after taking the seasonal
difference (ADF p-value = 0.07025 > 0.05; KPSS p-value = 0.01719 < 0.05),
so we will continue the analysis without seasonal differencing.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Adj_s_c_sy
## ARIMA(1,2,1)(1,0,1)[7]
##
## Coefficients:
## ar1 ma1 sar1 sma1
## -0.1256 -0.9079 0.4464 -0.1738
## s.e. 0.0464 0.0206 0.1572 0.1732
##
## sigma^2 = 8.155e-05: log likelihood = 1927.95
## AIC=-3845.91 AICc=-3845.8 BIC=-3824.04
Plotting the best-model forecast:
Final forecast of the Sylhet daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 4.557634 4.558873 4.558606 4.562344 4.562564 4.564583 4.566780 4.565091
## [9] 4.566513 4.566194 4.570949 4.571229 4.573799 4.574343 4.572727 4.574259
## [17] 4.573914 4.579122 4.579429 4.582245 4.582051 4.580467 4.582049 4.581691
## [25] 4.587102 4.587422 4.590347 4.589823 4.588254 4.589858 4.589495
Timeplot of Rangpur daily death rate:
Classical decomposition: Additive Seasonality of “clean_rp”:
Additive Seasonality of rp_death_rate (“clean_rp”) in classical
decomposition:
Visualizing the rp_death_rate (“clean_rp”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_rp
## Dickey-Fuller = -1.7386, Lag order = 8, p-value = 0.689
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_rp): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_rp
## KPSS Level = 4.5444, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
## Warning in adf.test(Diff_Adj_s_c_rp): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_rp
## Dickey-Fuller = -5.3822, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## Warning in kpss.test(Diff_Adj_s_c_rp): p-value greater than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_rp
## KPSS Level = 0.34195, Truncation lag parameter = 6, p-value = 0.1
Here the small ADF p-value (0.01) and the large KPSS p-value (0.1) in fact
suggest that the series is stationary after taking the seasonal difference;
the model below is nevertheless fitted on the seasonally adjusted series
(Adj_s_c_rp), letting ARIMA apply its own differencing.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Adj_s_c_rp
## ARIMA(2,1,2)(1,0,1)[7]
##
## Coefficients:
## ar1 ar2 ma1 ma2 sar1 sma1
## 0.8174 -0.7952 -1.0376 0.8799 0.7868 -0.2446
## s.e. 0.0380 0.0564 0.0352 0.0360 0.0357 0.0574
##
## sigma^2 = 0.0004696: log likelihood = 1417.5
## AIC=-2820.99 AICc=-2820.8 BIC=-2790.37
Plotting the best-model forecast:
Final forecast of the Rangpur daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 4.851212 4.851151 4.846130 4.849501 4.849852 4.849790 4.849400 4.850823
## [9] 4.851201 4.844224 4.849668 4.849898 4.849353 4.848635 4.850590 4.851534
## [17] 4.842907 4.849714 4.849721 4.848903 4.848115 4.850560 4.851855 4.841797
## [25] 4.849643 4.849552 4.848610 4.847779 4.850548 4.852059 4.840875
Timeplot of Mymensingh daily death rate:
Classical decomposition: Additive Seasonality of “clean_mm”:
Additive Seasonality of mm_death_rate (“clean_mm”) in classical
decomposition:
Visualizing the mm_death_rate (“clean_mm”) time series data after
seasonal adjustment with classical decomposition:
Checking stationarity after seasonal adjustment:
##
## Augmented Dickey-Fuller Test
##
## data: Adj_s_c_mm
## Dickey-Fuller = -2.1174, Lag order = 8, p-value = 0.5286
## alternative hypothesis: stationary
## Warning in kpss.test(Adj_s_c_mm): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: Adj_s_c_mm
## KPSS Level = 6.6633, Truncation lag parameter = 6, p-value = 0.01
The series is not stationary according to the ADF and KPSS tests. We
need to take a seasonal difference to make the series stationary.
Taking a seasonal difference to get a stationary
series:
Rechecking stationarity after taking the seasonal
difference:
##
## Augmented Dickey-Fuller Test
##
## data: Diff_Adj_s_c_mm
## Dickey-Fuller = -2.8553, Lag order = 8, p-value = 0.2163
## alternative hypothesis: stationary
##
## KPSS Test for Level Stationarity
##
## data: Diff_Adj_s_c_mm
## KPSS Level = 0.62379, Truncation lag parameter = 6, p-value = 0.02047
Hence, the series is still not stationary after taking the seasonal
difference (ADF p-value = 0.2163 > 0.05; KPSS p-value = 0.02047 < 0.05),
so the model below is fitted on the seasonally adjusted series.
Fitting the best model, Classical Decomposition + ARIMA:
## Series: Adj_s_c_mm
## ARIMA(2,2,1)(1,0,0)[7]
##
## Coefficients:
## ar1 ar2 ma1 sar1
## -0.3743 -0.1682 -0.8957 0.2699
## s.e. 0.0441 0.0442 0.0197 0.0422
##
## sigma^2 = 6.483e-05: log likelihood = 1994.8
## AIC=-3979.6 AICc=-3979.5 BIC=-3957.74
Plotting the best-model forecast:
Final forecast of the Mymensingh daily death rate with seasonal adjustment,
Classical Decomposition + ARIMA.
Adding the Model 2 forecast mean to the seasonality of the whole data over
the length of the test data.
## Time Series:
## Start = c(85, 1)
## End = c(89, 3)
## Frequency = 7
## [1] 3.049810 3.049329 3.050224 3.048856 3.048009 3.048443 3.052283 3.050532
## [9] 3.049840 3.051260 3.049430 3.048342 3.048914 3.053785 3.051611 3.050863
## [17] 3.052425 3.050471 3.049317 3.049926 3.055075 3.052787 3.052025 3.053625
## [25] 3.051637 3.050466 3.051085 3.056309 3.053990 3.053224 3.054834