Loading important libraries
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(anomalize)
## ══ Use anomalize to improve your Forecasts by 50%! ═════════════════════════════
## Business Science offers a 1-hour course - Lab #18: Time Series Anomaly Detection!
## </> Learn more at: https://university.business-science.io/p/learning-labs-pro </>
Collect our time series data
tidyverse_cran_downloads
## # A tibble: 6,375 × 3
## # Groups: package [15]
## date count package
## <date> <dbl> <chr>
## 1 2017-01-01 873 tidyr
## 2 2017-01-02 1840 tidyr
## 3 2017-01-03 2495 tidyr
## 4 2017-01-04 2906 tidyr
## 5 2017-01-05 2847 tidyr
## 6 2017-01-06 2756 tidyr
## 7 2017-01-07 1439 tidyr
## 8 2017-01-08 1556 tidyr
## 9 2017-01-09 3678 tidyr
## 10 2017-01-10 7086 tidyr
## # … with 6,365 more rows
## # ℹ Use `print(n = ...)` to see more rows
Detecting our anomalies
We use the following functions to detect and visualize anomalies.
time_decompose() This function performs the time series decomposition: it splits the “count” column into “observed”, “season”, “trend”, and “remainder” columns. The default is method = “stl”, which is seasonal decomposition using a Loess smoother (see stats::stl()). The frequency and trend parameters are set automatically from the time scale (or periodicity) of the time series, using tibbletime-based functions under the hood.
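To make the decomposition step concrete, here is a minimal base R sketch that calls stats::stl() directly on a synthetic series. The data, the weekly frequency of 7, and the `components` data frame are assumptions made for illustration; time_decompose() chooses its own frequency and trend spans, so its numbers will differ.

```r
# Synthetic daily series (made-up data): slow trend + weekly cycle + noise
set.seed(42)
n <- 70
x <- 100 + 0.5 * seq_len(n) + 10 * sin(2 * pi * seq_len(n) / 7) + rnorm(n)

# frequency = 7 tells stl() that one seasonal cycle spans seven observations
fit <- stats::stl(ts(x, frequency = 7), s.window = "periodic")

# stl() names the seasonal component "seasonal"; anomalize calls it "season"
components <- data.frame(
  observed  = x,
  season    = as.numeric(fit$time.series[, "seasonal"]),
  trend     = as.numeric(fit$time.series[, "trend"]),
  remainder = as.numeric(fit$time.series[, "remainder"])
)
head(components)

# The three components add back up to the observed series
all.equal(components$season + components$trend + components$remainder,
          components$observed)
```

The remainder column is what the anomaly detection step works on, since the regular seasonal and trend behaviour has already been removed from it.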
anomalize() We perform anomaly detection on the remainder column of the decomposed data with the anomalize() function, which provides 3 new columns: “remainder_l1” (lower limit), “remainder_l2” (upper limit), and “anomaly” (Yes/No flag). The default is method = “iqr”, which is fast and relatively accurate at detecting anomalies. The alpha parameter defaults to alpha = 0.05; decreasing it widens the anomaly bands, making it harder for a point to be flagged, and increasing it narrows them. The max_anoms parameter defaults to max_anoms = 0.2, so at most 20% of the data can be flagged as anomalous.
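The IQR rule behind these limits can be sketched in a few lines of base R. This is a simplified illustration, not the package's code: it assumes the documented IQR factor of 0.15 / alpha (3 at the default alpha), ignores max_anoms, and is applied to raw counts (the first 20 rows of SERVER-549521 from the logs dataset shown later) rather than to a remainder column. The helper name `iqr_limits` is hypothetical.

```r
# Hypothetical helper: lower/upper limits from the 25%/75% quantiles,
# expanded by an IQR factor of 0.15 / alpha
iqr_limits <- function(x, alpha = 0.05) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), names = FALSE)
  spread <- (0.15 / alpha) * (q[2] - q[1])
  c(lower = q[1] - spread, upper = q[2] + spread)
}

# First 20 counts of SERVER-549521 (raw counts, used here for simplicity)
counts <- c(7, 9, 12, 4, 4, 2, 10, 14, 12, 7,
            57, 2, 12, 72, 13, 13, 10, 14, 14, 5)

limits <- iqr_limits(counts)
limits                                   # lower = -13.75, upper = 33.5

# Values outside the limits are the candidate anomalies
counts[counts < limits["lower"] | counts > limits["upper"]]   # 57 72

# A smaller alpha gives a larger factor and therefore wider bands
iqr_limits(counts, alpha = 0.025)
```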
time_recompose() This function recomposes the lower and upper anomaly bounds around the “observed” values. It adds two new columns: “recomposed_l1” (lower limit) and “recomposed_l2” (upper limit).
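As a rough illustration of what recomposing means, the sketch below adds the season and trend components back onto the remainder limits. It assumes, for the STL method, that each bound is season + trend + the matching remainder limit; all numbers are made up for the example.

```r
# Toy components for five observations (made-up numbers)
season       <- c(-2, 1, 3, 1, -3)
trend        <- c(10, 10.5, 11, 11.5, 12)
remainder_l1 <- -4                       # lower limit on the remainder
remainder_l2 <- 4                        # upper limit on the remainder
observed     <- c(8, 12, 21, 12, 9)

# Shift the remainder limits back onto the scale of the observed values
recomposed_l1 <- season + trend + remainder_l1
recomposed_l2 <- season + trend + remainder_l2

# A point is flagged when it falls outside its own band
outside <- observed < recomposed_l1 | observed > recomposed_l2
data.frame(observed, recomposed_l1, recomposed_l2, outside)
```

Because season and trend change over time, the band moves with the series, which is why the bounds can be drawn around the observed values in the plot.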
plot_anomalies() Finally, we plot the result with plot_anomalies() to visualize our data.
tidyverse_cran_downloads %>%
  time_decompose(count) %>%
  anomalize(remainder) %>%
  time_recompose() %>%
  plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.5)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
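If a table is more useful than a plot, the same pipeline can be stopped before plot_anomalies() and filtered on the “anomaly” flag. This is a small sketch that reuses the functions above; the object name `anomalies` is an assumption for illustration.

```r
library(tidyverse)
library(anomalize)

# Same pipeline as above, but keep only the rows flagged as anomalous
anomalies <- tidyverse_cran_downloads %>%
  time_decompose(count) %>%
  anomalize(remainder) %>%
  time_recompose() %>%
  filter(anomaly == "Yes")

anomalies

# Number of flagged days per package
anomalies %>%
  ungroup() %>%
  count(package, sort = TRUE)
```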
Find the anomalies in the following time series dataset
logs_path <- read.csv("http://bit.ly/LogsDataset")
logs_path
## date count server
## 1 2017-05-22 7 SERVER-549521
## 2 2017-05-23 9 SERVER-549521
## 3 2017-05-24 12 SERVER-549521
## 4 2017-05-25 4 SERVER-549521
## 5 2017-05-26 4 SERVER-549521
## 6 2017-05-30 2 SERVER-549521
## 7 2017-05-31 10 SERVER-549521
## 8 2017-06-01 14 SERVER-549521
## 9 2017-06-02 12 SERVER-549521
## 10 2017-06-05 7 SERVER-549521
## 11 2017-06-06 57 SERVER-549521
## 12 2017-06-07 2 SERVER-549521
## 13 2017-06-08 12 SERVER-549521
## 14 2017-06-09 72 SERVER-549521
## 15 2017-06-12 13 SERVER-549521
## 16 2017-06-13 13 SERVER-549521
## 17 2017-06-14 10 SERVER-549521
## 18 2017-06-15 14 SERVER-549521
## 19 2017-06-16 14 SERVER-549521
## 20 2017-06-19 5 SERVER-549521
## 21 2017-06-20 11 SERVER-549521
## 22 2017-06-21 6 SERVER-549521
## 23 2017-06-22 7 SERVER-549521
## 24 2017-06-23 10 SERVER-549521
## 25 2017-06-26 9 SERVER-549521
## 26 2017-06-27 7 SERVER-549521
## 27 2017-06-28 6 SERVER-549521
## 28 2017-06-29 2 SERVER-549521
## 29 2017-06-30 4 SERVER-549521
## 30 2017-07-05 2 SERVER-549521
## 31 2017-07-06 10 SERVER-549521
## 32 2017-07-07 4 SERVER-549521
## 33 2017-07-10 11 SERVER-549521
## 34 2017-07-11 6 SERVER-549521
## 35 2017-07-12 5 SERVER-549521
## 36 2017-07-13 8 SERVER-549521
## 37 2017-07-14 13 SERVER-549521
## 38 2017-07-17 9 SERVER-549521
## 39 2017-07-18 13 SERVER-549521
## 40 2017-07-19 13 SERVER-549521
## 41 2017-07-20 3 SERVER-549521
## 42 2017-07-21 5 SERVER-549521
## 43 2017-07-24 13 SERVER-549521
## 44 2017-07-25 13 SERVER-549521
## 45 2017-07-26 10 SERVER-549521
## 46 2017-07-28 13 SERVER-549521
## 47 2017-07-31 9 SERVER-549521
## 48 2017-08-01 9 SERVER-549521
## 49 2017-08-02 13 SERVER-549521
## 50 2017-08-03 4 SERVER-549521
## 51 2017-08-04 1 SERVER-549521
## 52 2017-08-07 3 SERVER-549521
## 53 2017-08-08 10 SERVER-549521
## 54 2017-08-09 8 SERVER-549521
## 55 2017-08-11 10 SERVER-549521
## 56 2017-08-14 8 SERVER-549521
## 57 2017-08-15 4 SERVER-549521
## 58 2017-08-16 4 SERVER-549521
## 59 2017-08-17 13 SERVER-549521
## 60 2017-08-18 2 SERVER-549521
## 61 2017-08-21 6 SERVER-549521
## 62 2017-08-22 3 SERVER-549521
## 63 2017-08-23 6 SERVER-549521
## 64 2017-08-24 8 SERVER-549521
## 65 2017-08-25 9 SERVER-549521
## 66 2017-08-27 20 SERVER-549521
## 67 2017-05-22 7 SERVER-573826
## 68 2017-05-23 9 SERVER-573826
## 69 2017-05-24 12 SERVER-573826
## 70 2017-05-25 4 SERVER-573826
## 71 2017-05-26 4 SERVER-573826
## 72 2017-05-30 2 SERVER-573826
## 73 2017-05-31 10 SERVER-573826
## 74 2017-06-01 14 SERVER-573826
## 75 2017-06-02 12 SERVER-573826
## 76 2017-06-05 7 SERVER-573826
## 77 2017-06-06 10 SERVER-573826
## 78 2017-06-07 2 SERVER-573826
## 79 2017-06-08 12 SERVER-573826
## 80 2017-06-09 12 SERVER-573826
## 81 2017-06-12 13 SERVER-573826
## 82 2017-06-13 13 SERVER-573826
## 83 2017-06-14 10 SERVER-573826
## 84 2017-06-15 14 SERVER-573826
## 85 2017-06-16 14 SERVER-573826
## 86 2017-06-19 5 SERVER-573826
## 87 2017-06-20 11 SERVER-573826
## 88 2017-06-21 6 SERVER-573826
## 89 2017-06-22 7 SERVER-573826
## 90 2017-06-23 10 SERVER-573826
## 91 2017-06-26 9 SERVER-573826
## 92 2017-06-27 7 SERVER-573826
## 93 2017-06-28 6 SERVER-573826
## 94 2017-06-29 2 SERVER-573826
## 95 2017-06-30 4 SERVER-573826
## 96 2017-07-05 2 SERVER-573826
## 97 2017-07-06 10 SERVER-573826
## 98 2017-07-07 4 SERVER-573826
## 99 2017-07-10 11 SERVER-573826
## 100 2017-07-11 124 SERVER-573826
## 101 2017-07-12 5 SERVER-573826
## 102 2017-07-13 8 SERVER-573826
## 103 2017-07-14 13 SERVER-573826
## 104 2017-07-17 9 SERVER-573826
## 105 2017-07-18 13 SERVER-573826
## 106 2017-07-19 13 SERVER-573826
## 107 2017-07-20 3 SERVER-573826
## 108 2017-07-21 5 SERVER-573826
## 109 2017-07-24 113 SERVER-573826
## 110 2017-07-25 13 SERVER-573826
## 111 2017-07-26 10 SERVER-573826
## 112 2017-07-28 13 SERVER-573826
## 113 2017-07-31 9 SERVER-573826
## 114 2017-08-01 9 SERVER-573826
## 115 2017-08-02 13 SERVER-573826
## 116 2017-08-03 4 SERVER-573826
## 117 2017-08-04 1 SERVER-573826
## 118 2017-08-07 3 SERVER-573826
## 119 2017-08-08 10 SERVER-573826
## 120 2017-08-09 8 SERVER-573826
## 121 2017-08-11 10 SERVER-573826
## 122 2017-08-14 8 SERVER-573826
## 123 2017-08-15 4 SERVER-573826
## 124 2017-08-16 4 SERVER-573826
## 125 2017-08-17 13 SERVER-573826
## 126 2017-08-18 2 SERVER-573826
## 127 2017-08-21 6 SERVER-573826
## 128 2017-08-22 3 SERVER-573826
## 129 2017-08-23 6 SERVER-573826
## 130 2017-08-24 8 SERVER-573826
## 131 2017-08-25 9 SERVER-573826
## 132 2017-08-27 20 SERVER-573826
## 133 2017-05-22 7 SERVER-472389
## 134 2017-05-23 9 SERVER-472389
## 135 2017-05-24 12 SERVER-472389
## 136 2017-05-25 4 SERVER-472389
## 137 2017-05-26 4 SERVER-472389
## 138 2017-05-30 2 SERVER-472389
## 139 2017-05-31 10 SERVER-472389
## 140 2017-06-01 14 SERVER-472389
## 141 2017-06-02 12 SERVER-472389
## 142 2017-06-05 7 SERVER-472389
## 143 2017-06-06 10 SERVER-472389
## 144 2017-06-07 2 SERVER-472389
## 145 2017-06-08 12 SERVER-472389
## 146 2017-06-09 12 SERVER-472389
## 147 2017-06-12 13 SERVER-472389
## 148 2017-06-13 13 SERVER-472389
## 149 2017-06-14 10 SERVER-472389
## 150 2017-06-15 14 SERVER-472389
## 151 2017-06-16 14 SERVER-472389
## 152 2017-06-19 5 SERVER-472389
## 153 2017-06-20 11 SERVER-472389
## 154 2017-06-21 6 SERVER-472389
## 155 2017-06-22 7 SERVER-472389
## 156 2017-06-23 10 SERVER-472389
## 157 2017-06-26 9 SERVER-472389
## 158 2017-06-27 72 SERVER-472389
## 159 2017-06-28 6 SERVER-472389
## 160 2017-06-29 2 SERVER-472389
## 161 2017-06-30 4 SERVER-472389
## 162 2017-07-05 2 SERVER-472389
## 163 2017-07-06 10 SERVER-472389
## 164 2017-07-07 4 SERVER-472389
## 165 2017-07-10 11 SERVER-472389
## 166 2017-07-11 6 SERVER-472389
## 167 2017-07-12 5 SERVER-472389
## 168 2017-07-13 8 SERVER-472389
## 169 2017-07-14 13 SERVER-472389
## 170 2017-07-17 9 SERVER-472389
## 171 2017-07-18 13 SERVER-472389
## 172 2017-07-19 53 SERVER-472389
## 173 2017-07-20 3 SERVER-472389
## 174 2017-07-21 5 SERVER-472389
## 175 2017-07-24 13 SERVER-472389
## 176 2017-07-25 13 SERVER-472389
## 177 2017-07-26 10 SERVER-472389
## 178 2017-07-28 13 SERVER-472389
## 179 2017-07-31 9 SERVER-472389
## 180 2017-08-01 9 SERVER-472389
## 181 2017-08-02 13 SERVER-472389
## 182 2017-08-03 4 SERVER-472389
## 183 2017-08-04 1 SERVER-472389
## 184 2017-08-07 3 SERVER-472389
## 185 2017-08-08 10 SERVER-472389
## 186 2017-08-09 8 SERVER-472389
## 187 2017-08-11 10 SERVER-472389
## 188 2017-08-14 8 SERVER-472389
## 189 2017-08-15 4 SERVER-472389
## 190 2017-08-16 4 SERVER-472389
## 191 2017-08-17 13 SERVER-472389
## 192 2017-08-18 2 SERVER-472389
## 193 2017-08-21 6 SERVER-472389
## 194 2017-08-22 3 SERVER-472389
## 195 2017-08-23 6 SERVER-472389
## 196 2017-08-24 8 SERVER-472389
## 197 2017-08-25 9 SERVER-472389
## 198 2017-08-27 20 SERVER-472389
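One possible way to approach this exercise is to reuse the pipeline from the previous section. The sketch below assumes the date column parses with as.Date() and that each server should be analysed as its own series; the names `logs` and `logs_anomalies` are made up for illustration. Because the dates have gaps (weekends and some other days are missing), the automatic frequency and trend may not suit this data, in which case the frequency and trend arguments of time_decompose() can be set explicitly.

```r
library(tidyverse)
library(anomalize)

# read.csv() returns a plain data frame with character dates, so convert it
# to a tibble with a Date column and group it by server
logs <- read.csv("http://bit.ly/LogsDataset") %>%
  as_tibble() %>%
  mutate(date = as.Date(date)) %>%
  group_by(server)

# Decompose, detect, and recompose, one series per server
logs_anomalies <- logs %>%
  time_decompose(count) %>%
  anomalize(remainder) %>%
  time_recompose()

# Rows flagged as anomalous
logs_anomalies %>%
  filter(anomaly == "Yes")

# One panel per server
logs_anomalies %>%
  plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.5)
```

In the printed data, the counts that stand out by eye are 57 and 72 for SERVER-549521, 124 and 113 for SERVER-573826, and 72 and 53 for SERVER-472389, so these are the points to compare against whatever the pipeline flags.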