Africa CDC
2024-10-05
This document analyzes reported cholera cases using time series techniques. The analysis covers visualizing the data, checking stationarity, and modeling with ARIMA.
## reported_cases
## 1 11086
## 2 72654
## 3 5137
## 4 6337
## 5 6074
## 6 6650
## 7 3180
## 8 9502
## 9 24643
## 10 21586
## 11 18742
## 12 19415
## 13 46924
## 14 37383
## 15 17504
## 16 31884
## 17 35585
## 18 34358
## 19 23012
## 20 35857
## 21 43262
## 22 153367
## 23 92079
## 24 76713
## 25 162413
## 26 72597
## 27 108535
## 28 118367
## 29 211761
## 30 210820
## 31 124484
## 32 173954
## 33 137866
## 34 108067
## 35 95560
## 36 125018
## 37 234226
## 38 167298
## 39 179323
## 40 217333
## 41 115106
## 42 188678
## 43 117570
## 44 56329
## 45 105287
## 46 71176
## 47 71058
## 48 179835
## 49 120650
## 50 55087
## 51 47256
## 52 141467
## 53 112282
## 54 177521
## 55 100317
## Time Series:
## Start = 1970
## End = 2024
## Frequency = 1
## reported_cases
## [1,] 11086
## [2,] 72654
## [3,] 5137
## [4,] 6337
## [5,] 6074
## [6,] 6650
## [7,] 3180
## [8,] 9502
## [9,] 24643
## [10,] 21586
## [11,] 18742
## [12,] 19415
## [13,] 46924
## [14,] 37383
## [15,] 17504
## [16,] 31884
## [17,] 35585
## [18,] 34358
## [19,] 23012
## [20,] 35857
## [21,] 43262
## [22,] 153367
## [23,] 92079
## [24,] 76713
## [25,] 162413
## [26,] 72597
## [27,] 108535
## [28,] 118367
## [29,] 211761
## [30,] 210820
## [31,] 124484
## [32,] 173954
## [33,] 137866
## [34,] 108067
## [35,] 95560
## [36,] 125018
## [37,] 234226
## [38,] 167298
## [39,] 179323
## [40,] 217333
## [41,] 115106
## [42,] 188678
## [43,] 117570
## [44,] 56329
## [45,] 105287
## [46,] 71176
## [47,] 71058
## [48,] 179835
## [49,] 120650
## [50,] 55087
## [51,] 47256
## [52,] 141467
## [53,] 112282
## [54,] 177521
## [55,] 100317
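The series printed above could be built along the following lines. This is a minimal sketch, assuming the counts are held in a plain numeric vector: the names `reported_cases` and `cases.ts` mirror the labels in the output, but the original loading code is not part of this document, and a series built from a data frame column prints slightly differently.

```r
# Annual reported cholera cases, 1970-2024, copied from the printed output.
reported_cases <- c(
  11086, 72654, 5137, 6337, 6074, 6650, 3180, 9502, 24643, 21586,
  18742, 19415, 46924, 37383, 17504, 31884, 35585, 34358, 23012, 35857,
  43262, 153367, 92079, 76713, 162413, 72597, 108535, 118367, 211761, 210820,
  124484, 173954, 137866, 108067, 95560, 125018, 234226, 167298, 179323, 217333,
  115106, 188678, 117570, 56329, 105287, 71176, 71058, 179835, 120650, 55087,
  47256, 141467, 112282, 177521, 100317
)

# One observation per year, so frequency = 1 and the index starts at 1970.
cases.ts <- ts(reported_cases, start = 1970, frequency = 1)

# Quick visual check of level and variability over time.
plot(cases.ts, xlab = "Year", ylab = "Reported cases",
     main = "Reported cholera cases, 1970-2024")
```

With 55 annual values starting in 1970, the series ends in 2024, which agrees with the `Start`, `End` and `Frequency` lines in the output.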
##
## Augmented Dickey-Fuller Test
##
## data: cases.ts
## Dickey-Fuller = -2.0897, Lag order = 3, p-value = 0.5384
## alternative hypothesis: stationary
## Warning in adf.test(cases.ts_diff1): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: cases.ts_diff1
## Dickey-Fuller = -5.296, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
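The two tests above follow the usual pattern: the level series does not reject a unit root (p = 0.5384), while the first difference does (p reported as 0.01, which the warning marks as an upper bound). A minimal sketch of that step is given here, assuming the tests were run with `adf.test()` from the tseries package, whose printed format matches the output; the object names follow the `data:` lines, and the helper `default_lag()` is introduced only for illustration.

```r
# Reported cases, 1970-2024, copied from the printed output.
reported_cases <- c(
  11086, 72654, 5137, 6337, 6074, 6650, 3180, 9502, 24643, 21586,
  18742, 19415, 46924, 37383, 17504, 31884, 35585, 34358, 23012, 35857,
  43262, 153367, 92079, 76713, 162413, 72597, 108535, 118367, 211761, 210820,
  124484, 173954, 137866, 108067, 95560, 125018, 234226, 167298, 179323, 217333,
  115106, 188678, 117570, 56329, 105287, 71176, 71058, 179835, 120650, 55087,
  47256, 141467, 112282, 177521, 100317
)
cases.ts <- ts(reported_cases, start = 1970, frequency = 1)

# First difference: year-on-year change in reported cases (54 values).
cases.ts_diff1 <- diff(cases.ts, differences = 1)

# adf.test() uses trunc((n - 1)^(1/3)) as its default lag order.
# default_lag() is a hypothetical helper that reproduces that arithmetic.
default_lag <- function(x) trunc((length(x) - 1)^(1/3))
default_lag(cases.ts)        # 3, matching "Lag order = 3" in the output
default_lag(cases.ts_diff1)  # 3

# Run the tests only if tseries is installed (an assumption of this sketch).
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::adf.test(cases.ts))        # level series
  print(tseries::adf.test(cases.ts_diff1))  # differenced series
}
```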
## $d
## [1] 1.062263
##
## $sd.as
## [1] 0.6192913
##
## $sd.reg
## [1] 0.6941739
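The `$d`, `$sd.as` and `$sd.reg` components above are consistent with the list returned by `fdGPH()` in the fracdiff package, which estimates a fractional differencing parameter by log-periodogram regression; that attribution is an inference, since the report does not show the call. Both standard errors are large relative to the estimate of d, so the estimate is imprecise. The sketch below, with the hypothetical helper `gph_sd_as()`, shows that the asymptotic standard error depends only on the series length and the number of Fourier frequencies used, not on the data.

```r
# Asymptotic standard error of the GPH estimate of d.
# It depends only on the series length n and the number g of low
# Fourier frequencies used in the log-periodogram regression.
gph_sd_as <- function(n, g) {
  w <- 2 * pi * seq_len(g) / n          # Fourier frequencies
  x_reg <- 2 * log(2 * sin(w / 2))      # regressor of the GPH regression
  sqrt(pi^2 / (6 * sum((x_reg - mean(x_reg))^2)))
}

# n = 55 annual observations; g = 4 is inferred from the printed value,
# not stated anywhere in the report.
gph_sd_as(n = 55, g = 4)  # approximately 0.6193, as printed for $sd.as

# The full estimate needs the fracdiff package (assumed here).
# bandw.exp = 0.4 gives g = trunc(55^0.4) = 4; the value actually used is unknown.
if (requireNamespace("fracdiff", quietly = TRUE)) {
  set.seed(42)
  x <- cumsum(rnorm(55))  # illustrative random walk, not the cholera data
  str(fracdiff::fdGPH(x, bandw.exp = 0.4))
}
```

Because `$sd.as` is a function of n and the bandwidth alone, any two series of the same length analyzed with the same bandwidth report the same value.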
## Series: cases.ts_diff1
## ARIMA(2,1,2)
##
## Coefficients:
## ar1 ar2 ma1 ma2
## 0.0482 -0.0489 -1.6252 0.6253
## s.e. 0.2890 0.2064 0.2903 0.2787
##
## sigma^2 = 2.246e+09: log likelihood = -646.92
## AIC=1303.84 AICc=1305.11 BIC=1313.69
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -969.8791 45148.98 32370.8 437.2808 667.6828 0.4806923
## ACF1
## Training set -0.002766728
## RMSE: 130737.7
## MAE: 112511
## MAPE: 174.4429 %
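The three accuracy figures above can be computed with a few lines of base R. This is a minimal sketch: the function names and the example vectors are illustrative assumptions, and the report does not show which observations were held out for evaluation.

```r
# Root mean squared error: penalizes large misses more heavily.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Mean absolute error: average size of the miss, in the units of the data.
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Mean absolute percentage error: undefined when an actual value is zero
# and inflated when actual values are close to zero.
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual)) * 100
}

# Illustrative values only, not taken from the report.
actual    <- c(100, 200, 400)
predicted <- c(110, 180, 500)

rmse(actual, predicted)  # about 59.16
mae(actual, predicted)   # about 43.33
mape(actual, predicted)  # 15
```

A MAPE above 100% means the average absolute error exceeds the size of the observed values. One possible reason, given that the model was fitted to a differenced series, is that year-on-year changes can be close to zero, which inflates percentage errors.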
This analysis covers the basic steps in exploring and modeling cholera case data using time series methods. Further refinement of models and diagnostics could be pursued to enhance predictive accuracy.
This document provides an analysis of cholera deaths using time series techniques. The analysis includes visualizing the data, checking stationarity, and modeling with ARIMA.
## Time Series:
## Start = 1970
## End = 2024
## Frequency = 1
## death
## [1,] 747
## [2,] 11427
## [3,] 386
## [4,] 636
## [5,] 582
## [6,] 504
## [7,] 194
## [8,] 462
## [9,] 1591
## [10,] 1869
## [11,] 1185
## [12,] 1581
## [13,] 2988
## [14,] 1903
## [15,] 1711
## [16,] 3837
## [17,] 3490
## [18,] 2610
## [19,] 2237
## [20,] 1443
## [21,] 2167
## [22,] 13998
## [23,] 5319
## [24,] 2542
## [25,] 8136
## [26,] 2962
## [27,] 5935
## [28,] 5853
## [29,] 9858
## [30,] 8707
## [31,] 4960
## [32,] 2752
## [33,] 4551
## [34,] 1884
## [35,] 2331
## [36,] 2230
## [37,] 6292
## [38,] 3996
## [39,] 5074
## [40,] 4883
## [41,] 3397
## [42,] 4148
## [43,] 2042
## [44,] 1366
## [45,] 1882
## [46,] 937
## [47,] 1762
## [48,] 3217
## [49,] 2436
## [50,] 880
## [51,] 741
## [52,] 4094
## [53,] 2495
## [54,] 1745
## [55,] 1379
##
## Augmented Dickey-Fuller Test
##
## data: death.ts
## Dickey-Fuller = -2.2376, Lag order = 3, p-value = 0.4788
## alternative hypothesis: stationary
## Warning in adf.test(death.ts_diff1): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: death.ts_diff1
## Dickey-Fuller = -5.3765, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
## $d
## [1] 0.8461494
##
## $sd.as
## [1] 0.6192913
##
## $sd.reg
## [1] 0.4145109
## Series: death.ts_diff1
## ARIMA(2,1,2)
##
## Coefficients:
## ar1 ar2 ma1 ma2
## -0.0554 -0.1083 -1.6733 0.6733
## s.e. 0.2301 0.2093 0.2318 0.2190
##
## sigma^2 = 8216034: log likelihood = -498.66
## AIC=1007.33 AICc=1008.61 BIC=1017.18
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -82.56852 2730.438 1762.568 84.76194 221.9807 0.5033628 0.1352915
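One diagnostic that can be read directly from the printed coefficients is the location of the moving-average roots. In both ARIMA(2,1,2) fits (cases and deaths) the two MA coefficients sum to almost exactly -1, which places one MA root near the unit circle. A near-unit MA root is a common sign of over-differencing; since both models apply d = 1 to series that already carry the `_diff1` suffix, that is a plausible explanation here, though the report does not confirm it. The sketch below only redoes the arithmetic; `ma_roots()` is a hypothetical helper.

```r
# MA coefficients as printed for the two ARIMA(2,1,2) fits.
ma_cases  <- c(-1.6252, 0.6253)
ma_deaths <- c(-1.6733, 0.6733)

# R writes the MA polynomial as 1 + theta1 * B + theta2 * B^2.
# polyroot() takes coefficients in increasing order of power.
ma_roots <- function(theta) Mod(polyroot(c(1, theta)))

ma_roots(ma_cases)   # one root very close to 1
ma_roots(ma_deaths)  # one root very close to 1

# Evaluating the polynomial at B = 1 gives (almost) zero for both fits.
1 + sum(ma_cases)    # about 1e-04
1 + sum(ma_deaths)   # about 0
```

If over-differencing is the cause, refitting on the undifferenced series with d = 1, or on the differenced series with d = 0, would be a natural next check.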
cases
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2025 -631.0939 -6471.863 5209.675 -9563.779 8301.591
## 2026 -448.0867 -6562.465 5666.291 -9799.221 8903.048
## 2027 -574.4255 -8221.251 7072.400 -12269.237 11120.386
## 2028 -487.2076 -8586.946 7612.531 -12874.689 11900.274
## 2029 -547.4184 -9579.752 8484.916 -14361.181 13266.344
## 2030 -505.8520 -10049.315 9037.611 -15101.318 14089.615
## 2031 -534.5473 -10775.070 9705.975 -16196.075 15126.981
## 2032 -514.7375 -11266.842 10237.367 -16958.662 15929.187
## 2033 -528.4132 -11860.352 10803.526 -17859.118 16802.292
## 2034 -518.9722 -12341.514 11303.570 -18599.990 17562.046
## 2035 -525.4898 -12859.215 11808.235 -19388.295 18337.315
## 2036 -520.9904 -13318.731 12276.750 -20093.445 19051.464
## 2037 -524.0965 -13787.415 12739.222 -20808.592 19760.399
## 2038 -521.9522 -14223.127 13179.222 -21476.091 20432.186
## 2039 -523.4325 -14656.866 13610.001 -22138.654 21091.789
## 2040 -522.4106 -15069.920 14025.099 -22770.906 21726.085
## 2041 -523.1161 -15476.827 14430.595 -23392.843 22346.611
## 2042 -522.6290 -15869.379 14824.121 -23993.458 22948.200
## 2043 -522.9653 -16254.564 15208.633 -24582.369 23536.438
## 2044 -522.7331 -16628.890 15583.424 -25154.975 24109.508
## 2045 -522.8934 -16995.835 15950.048 -25716.083 24670.296
## 2046 -522.7828 -17354.017 16308.452 -26263.934 25218.369
## 2047 -522.8591 -17705.255 16659.537 -26801.066 25755.348
## 2048 -522.8064 -18049.101 17003.488 -27326.960 26281.348
## 2049 -522.8428 -18386.571 17340.886 -27843.058 26797.372
## 2050 -522.8177 -18717.618 17671.983 -28349.364 27303.728
## 2051 -522.8350 -19042.861 17997.191 -28846.771 27801.101
## 2052 -522.8231 -19362.413 18316.767 -29335.490 28289.844
## 2053 -522.8313 -19676.688 18631.025 -29816.127 28770.464
## 2054 -522.8256 -19985.852 18940.201 -30288.956 29243.304
## 2055 -522.8296 -20290.206 19244.547 -30754.423 29708.764
## 2056 -522.8268 -20589.928 19544.274 -31212.809 30167.156
## 2057 -522.8287 -20885.250 19839.592 -31664.464 30618.807
## 2058 -522.8274 -21176.341 20130.687 -32109.651 31063.997
## 2059 -522.8283 -21463.393 20417.736 -32548.658 31503.001
## 2060 -522.8277 -21746.558 20700.903 -32981.722 31936.067
## 2061 -522.8281 -22025.997 20980.341 -33409.088 32363.431
## 2062 -522.8278 -22301.850 21256.194 -33830.968 32785.312
## 2063 -522.8280 -22574.253 21528.597 -34247.573 33201.917
## 2064 -522.8279 -22843.331 21797.676 -34659.092 33613.437
## 2065 -522.8280 -23109.205 22063.549 -35065.711 34020.055
## 2066 -522.8279 -23371.984 22326.329 -35467.597 34421.941
## 2067 -522.8280 -23631.776 22586.120 -35864.914 34819.258
## 2068 -522.8279 -23888.680 22843.024 -36257.814 35212.158
## 2069 -522.8280 -24142.789 23097.133 -36646.441 35600.785
## 2070 -522.8279 -24394.193 23348.538 -37030.931 35985.275
## 2071 -522.8279 -24642.978 23597.322 -37411.414 36365.758
## 2072 -522.8279 -24889.222 23843.566 -37788.012 36742.356
## 2073 -522.8279 -25133.003 24087.347 -38160.842 37115.186
## 2074 -522.8279 -25374.392 24328.736 -38530.015 37484.359
## 2075 -522.8279 -25613.459 24567.803 -38895.637 37849.981
## 2076 -522.8279 -25850.270 24804.614 -39257.807 38212.151
## 2077 -522.8279 -26084.886 25039.231 -39616.623 38570.967
## 2078 -522.8279 -26317.369 25271.713 -39972.175 38926.519
## 2079 -522.8279 -26547.776 25502.120 -40324.551 39278.895
## RMSE: 4693.267
## MAE: 3766.429
## MAPE: 135.6387 %
##
## Call:
## arima(x = cases.ts, order = c(1, d, 1), seasonal = list(order = c(1, 0, 1),
## period = 1))
##
## Coefficients:
## Warning in sqrt(diag(x$var.coef)): NaNs produced
## ar1 ma1 sar1 sma1
## 0.2715 -0.5632 0.2715 -0.5632
## s.e. NaN 1.2227 NaN 1.2227
##
## sigma^2 estimated as 2.062e+09: log likelihood = -655.94, aic = 1321.89
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 4953.194 44996.19 32512.04 -31.29897 62.92448 0.8199362
## ACF1
## Training set -0.02012548
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2025 115067.7 56871.29 173264.1 26063.960 204071.5
## 2026 117963.2 54919.77 181006.6 21546.607 214379.7
## 2027 118448.1 52315.79 184580.3 17307.475 219588.7
## 2028 118497.9 49360.62 187635.2 12761.534 224234.3
## 2029 118489.3 46354.71 190623.8 8168.981 228809.6
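The interval columns in the table above can be cross-checked against the fitted model. Assuming normal forecast errors, the 80% interval spans the point forecast plus or minus `qnorm(0.9)` standard errors, so the one-step standard error can be recovered from the 2025 row and compared with the square root of the reported `sigma^2`. The variable names below are illustrative.

```r
# 2025 row of the forecast table and sigma^2 from the model summary.
point  <- 115067.7
lo80   <- 56871.29
hi80   <- 173264.1
lo95   <- 26063.960
sigma2 <- 2.062e9

# Half-width of the 80% interval divided by the normal quantile
# gives the one-step forecast standard error.
se_1step <- (hi80 - lo80) / (2 * qnorm(0.9))
se_1step      # roughly 45,400
sqrt(sigma2)  # roughly 45,400

# The 95% lower bound follows from the same standard error.
point - qnorm(0.975) * se_1step  # close to the printed Lo 95
```

The two standard errors agree to within rounding, which suggests the 2025 interval reflects the residual variance of the fitted model alone. The intervals for later years widen because forecast uncertainty accumulates with the horizon.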
This analysis covers the basic steps in exploring and modeling reported cholera cases and deaths using time series methods. It also explores the relationship between the cases and deaths reported by the member states, in support of the public health intelligence report for Africa.