Anomaly Detection

Loading important libraries

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(anomalize)

Collect our time series data

tidyverse_cran_downloads
## # A tibble: 6,375 × 3
## # Groups:   package [15]
##    date       count package
##    <date>     <dbl> <chr>  
##  1 2017-01-01   873 tidyr  
##  2 2017-01-02  1840 tidyr  
##  3 2017-01-03  2495 tidyr  
##  4 2017-01-04  2906 tidyr  
##  5 2017-01-05  2847 tidyr  
##  6 2017-01-06  2756 tidyr  
##  7 2017-01-07  1439 tidyr  
##  8 2017-01-08  1556 tidyr  
##  9 2017-01-09  3678 tidyr  
## 10 2017-01-10  7086 tidyr  
## # … with 6,365 more rows
## # ℹ Use `print(n = ...)` to see more rows

Detecting our anomalies

We now use the following functions to detect and visualize anomalies.

time_decompose() This function performs the time series decomposition. It splits the “count” column into “observed”, “season”, “trend”, and “remainder” columns. The default is method = "stl", which is seasonal decomposition using a Loess smoother (refer to stats::stl()). The frequency and trend parameters are set automatically from the time scale (or periodicity) of the time series, using tibbletime-based functions under the hood.
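To make the defaults described above concrete, the sketch below writes them out explicitly for a single package. Restricting the data to tidyr and the object name tidyr_decomposed are choices made only for this illustration.

```r
library(dplyr)
library(anomalize)

# Decompose one package's downloads, spelling out the default arguments.
tidyr_decomposed <- tidyverse_cran_downloads %>%
    ungroup() %>%
    filter(package == "tidyr") %>%
    time_decompose(count,
                   method    = "stl",   # seasonal decomposition by Loess
                   frequency = "auto",  # seasonal period picked from the time scale
                   trend     = "auto")  # trend window picked from the time scale

# The result should hold the date plus the four decomposition columns.
names(tidyr_decomposed)
```

Replacing "auto" with a period such as "1 week" is one way to override the automatic choice when the seasonality of the data is known in advance.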

anomalize() We run anomaly detection on the “remainder” column of the decomposed data using the anomalize() function, which provides three new columns: “remainder_l1” (lower limit), “remainder_l2” (upper limit), and “anomaly” (Yes/No flag). The default is method = "iqr", which is fast and relatively accurate at detecting anomalies. The alpha parameter defaults to alpha = 0.05 and controls the height of the anomaly bands: decreasing alpha widens the bands, making it harder for a point to be flagged as anomalous, while increasing alpha narrows them. The max_anoms parameter defaults to max_anoms = 0.2, meaning that at most 20% of the data can be flagged as anomalous.
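The effect of alpha can be illustrated with a small base R sketch of an IQR rule. This is a simplified illustration, not the package's implementation: the helper name iqr_flags is hypothetical, the 0.15 / alpha scaling factor is an assumption about how alpha maps to band width, the remainder values are made up, and the max_anoms cap is left out.

```r
# Flag values outside [Q1 - k * IQR, Q3 + k * IQR], with k = 0.15 / alpha.
# Assumption: the 0.15 / alpha factor stands in for the package's scaling.
iqr_flags <- function(x, alpha = 0.05) {
    q      <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
    k      <- 0.15 / alpha
    spread <- q[[2]] - q[[1]]
    lower  <- q[[1]] - k * spread
    upper  <- q[[2]] + k * spread
    data.frame(
        value        = x,
        remainder_l1 = lower,   # lower limit
        remainder_l2 = upper,   # upper limit
        anomaly      = ifelse(x < lower | x > upper, "Yes", "No")
    )
}

# Made-up remainder values with one obvious spike at the end.
remainder <- c(-2, -1, 0, 1, 2, 1, 0, -1, 30)

res_default <- iqr_flags(remainder)                # alpha = 0.05
res_strict  <- iqr_flags(remainder, alpha = 0.01)  # smaller alpha, wider bands
```

With these values the quartiles are -1 and 1, so alpha = 0.05 gives limits of about -7 and 7 and the spike of 30 is flagged, while alpha = 0.01 gives limits of about -31 and 31 and nothing is flagged.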

time_recompose() We create lower and upper bounds around the “observed” values with the time_recompose() function, which recomposes the anomaly limits around the observed values. Two new columns are created: “recomposed_l1” (lower limit) and “recomposed_l2” (upper limit).
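As a rough picture of what recomposing means, the sketch below adds the remainder limits back onto the seasonal and trend components. It assumes the STL case, where each bound is taken to be season + trend + the corresponding remainder limit; the toy data frame and all of its values are invented for the example.

```r
# Toy decomposition output (made-up numbers, three time points).
decomposed <- data.frame(
    season       = c(-5, 0, 5),
    trend        = c(100, 102, 104),
    remainder_l1 = -10,   # lower limit on the remainder
    remainder_l2 = 10     # upper limit on the remainder
)

# Assumed recomposition: put the limits back on the scale of the observed values.
decomposed$recomposed_l1 <- decomposed$season + decomposed$trend + decomposed$remainder_l1
decomposed$recomposed_l2 <- decomposed$season + decomposed$trend + decomposed$remainder_l2

decomposed[, c("recomposed_l1", "recomposed_l2")]
```

Here the lower bounds come out as 85, 92, 99 and the upper bounds as 105, 112, 119, so the band follows the seasonal and trend pattern instead of staying flat.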

plot_anomalies() We now plot the result using plot_anomalies() to visualize our data.

tidyverse_cran_downloads %>%
    time_decompose(count) %>%
    anomalize(remainder) %>%
    time_recompose() %>%
    plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.5)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
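The plot shows where the anomalies sit, but it can also help to have them as rows. A minimal sketch, reusing the same pipeline and keeping only the flagged observations (the object name anomalies is chosen for this example):

```r
library(dplyr)
library(anomalize)

# Same pipeline as above, but keep the flagged rows instead of plotting.
anomalies <- tidyverse_cran_downloads %>%
    time_decompose(count) %>%
    anomalize(remainder) %>%
    time_recompose() %>%
    filter(anomaly == "Yes")

# Number of flagged days per package.
table(anomalies$package)
```

Comparing “observed” with “recomposed_l1” and “recomposed_l2” in this table shows by how much each flagged point falls outside its band.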

Challenge

Find the anomalies in the following time series dataset.

logs_path <- read.csv("http://bit.ly/LogsDataset")
logs_path
##           date count        server
## 1   2017-05-22     7 SERVER-549521
## 2   2017-05-23     9 SERVER-549521
## 3   2017-05-24    12 SERVER-549521
## 4   2017-05-25     4 SERVER-549521
## 5   2017-05-26     4 SERVER-549521
## 6   2017-05-30     2 SERVER-549521
## 7   2017-05-31    10 SERVER-549521
## 8   2017-06-01    14 SERVER-549521
## 9   2017-06-02    12 SERVER-549521
## 10  2017-06-05     7 SERVER-549521
## 11  2017-06-06    57 SERVER-549521
## 12  2017-06-07     2 SERVER-549521
## 13  2017-06-08    12 SERVER-549521
## 14  2017-06-09    72 SERVER-549521
## 15  2017-06-12    13 SERVER-549521
## 16  2017-06-13    13 SERVER-549521
## 17  2017-06-14    10 SERVER-549521
## 18  2017-06-15    14 SERVER-549521
## 19  2017-06-16    14 SERVER-549521
## 20  2017-06-19     5 SERVER-549521
## 21  2017-06-20    11 SERVER-549521
## 22  2017-06-21     6 SERVER-549521
## 23  2017-06-22     7 SERVER-549521
## 24  2017-06-23    10 SERVER-549521
## 25  2017-06-26     9 SERVER-549521
## 26  2017-06-27     7 SERVER-549521
## 27  2017-06-28     6 SERVER-549521
## 28  2017-06-29     2 SERVER-549521
## 29  2017-06-30     4 SERVER-549521
## 30  2017-07-05     2 SERVER-549521
## 31  2017-07-06    10 SERVER-549521
## 32  2017-07-07     4 SERVER-549521
## 33  2017-07-10    11 SERVER-549521
## 34  2017-07-11     6 SERVER-549521
## 35  2017-07-12     5 SERVER-549521
## 36  2017-07-13     8 SERVER-549521
## 37  2017-07-14    13 SERVER-549521
## 38  2017-07-17     9 SERVER-549521
## 39  2017-07-18    13 SERVER-549521
## 40  2017-07-19    13 SERVER-549521
## 41  2017-07-20     3 SERVER-549521
## 42  2017-07-21     5 SERVER-549521
## 43  2017-07-24    13 SERVER-549521
## 44  2017-07-25    13 SERVER-549521
## 45  2017-07-26    10 SERVER-549521
## 46  2017-07-28    13 SERVER-549521
## 47  2017-07-31     9 SERVER-549521
## 48  2017-08-01     9 SERVER-549521
## 49  2017-08-02    13 SERVER-549521
## 50  2017-08-03     4 SERVER-549521
## 51  2017-08-04     1 SERVER-549521
## 52  2017-08-07     3 SERVER-549521
## 53  2017-08-08    10 SERVER-549521
## 54  2017-08-09     8 SERVER-549521
## 55  2017-08-11    10 SERVER-549521
## 56  2017-08-14     8 SERVER-549521
## 57  2017-08-15     4 SERVER-549521
## 58  2017-08-16     4 SERVER-549521
## 59  2017-08-17    13 SERVER-549521
## 60  2017-08-18     2 SERVER-549521
## 61  2017-08-21     6 SERVER-549521
## 62  2017-08-22     3 SERVER-549521
## 63  2017-08-23     6 SERVER-549521
## 64  2017-08-24     8 SERVER-549521
## 65  2017-08-25     9 SERVER-549521
## 66  2017-08-27    20 SERVER-549521
## 67  2017-05-22     7 SERVER-573826
## 68  2017-05-23     9 SERVER-573826
## 69  2017-05-24    12 SERVER-573826
## 70  2017-05-25     4 SERVER-573826
## 71  2017-05-26     4 SERVER-573826
## 72  2017-05-30     2 SERVER-573826
## 73  2017-05-31    10 SERVER-573826
## 74  2017-06-01    14 SERVER-573826
## 75  2017-06-02    12 SERVER-573826
## 76  2017-06-05     7 SERVER-573826
## 77  2017-06-06    10 SERVER-573826
## 78  2017-06-07     2 SERVER-573826
## 79  2017-06-08    12 SERVER-573826
## 80  2017-06-09    12 SERVER-573826
## 81  2017-06-12    13 SERVER-573826
## 82  2017-06-13    13 SERVER-573826
## 83  2017-06-14    10 SERVER-573826
## 84  2017-06-15    14 SERVER-573826
## 85  2017-06-16    14 SERVER-573826
## 86  2017-06-19     5 SERVER-573826
## 87  2017-06-20    11 SERVER-573826
## 88  2017-06-21     6 SERVER-573826
## 89  2017-06-22     7 SERVER-573826
## 90  2017-06-23    10 SERVER-573826
## 91  2017-06-26     9 SERVER-573826
## 92  2017-06-27     7 SERVER-573826
## 93  2017-06-28     6 SERVER-573826
## 94  2017-06-29     2 SERVER-573826
## 95  2017-06-30     4 SERVER-573826
## 96  2017-07-05     2 SERVER-573826
## 97  2017-07-06    10 SERVER-573826
## 98  2017-07-07     4 SERVER-573826
## 99  2017-07-10    11 SERVER-573826
## 100 2017-07-11   124 SERVER-573826
## 101 2017-07-12     5 SERVER-573826
## 102 2017-07-13     8 SERVER-573826
## 103 2017-07-14    13 SERVER-573826
## 104 2017-07-17     9 SERVER-573826
## 105 2017-07-18    13 SERVER-573826
## 106 2017-07-19    13 SERVER-573826
## 107 2017-07-20     3 SERVER-573826
## 108 2017-07-21     5 SERVER-573826
## 109 2017-07-24   113 SERVER-573826
## 110 2017-07-25    13 SERVER-573826
## 111 2017-07-26    10 SERVER-573826
## 112 2017-07-28    13 SERVER-573826
## 113 2017-07-31     9 SERVER-573826
## 114 2017-08-01     9 SERVER-573826
## 115 2017-08-02    13 SERVER-573826
## 116 2017-08-03     4 SERVER-573826
## 117 2017-08-04     1 SERVER-573826
## 118 2017-08-07     3 SERVER-573826
## 119 2017-08-08    10 SERVER-573826
## 120 2017-08-09     8 SERVER-573826
## 121 2017-08-11    10 SERVER-573826
## 122 2017-08-14     8 SERVER-573826
## 123 2017-08-15     4 SERVER-573826
## 124 2017-08-16     4 SERVER-573826
## 125 2017-08-17    13 SERVER-573826
## 126 2017-08-18     2 SERVER-573826
## 127 2017-08-21     6 SERVER-573826
## 128 2017-08-22     3 SERVER-573826
## 129 2017-08-23     6 SERVER-573826
## 130 2017-08-24     8 SERVER-573826
## 131 2017-08-25     9 SERVER-573826
## 132 2017-08-27    20 SERVER-573826
## 133 2017-05-22     7 SERVER-472389
## 134 2017-05-23     9 SERVER-472389
## 135 2017-05-24    12 SERVER-472389
## 136 2017-05-25     4 SERVER-472389
## 137 2017-05-26     4 SERVER-472389
## 138 2017-05-30     2 SERVER-472389
## 139 2017-05-31    10 SERVER-472389
## 140 2017-06-01    14 SERVER-472389
## 141 2017-06-02    12 SERVER-472389
## 142 2017-06-05     7 SERVER-472389
## 143 2017-06-06    10 SERVER-472389
## 144 2017-06-07     2 SERVER-472389
## 145 2017-06-08    12 SERVER-472389
## 146 2017-06-09    12 SERVER-472389
## 147 2017-06-12    13 SERVER-472389
## 148 2017-06-13    13 SERVER-472389
## 149 2017-06-14    10 SERVER-472389
## 150 2017-06-15    14 SERVER-472389
## 151 2017-06-16    14 SERVER-472389
## 152 2017-06-19     5 SERVER-472389
## 153 2017-06-20    11 SERVER-472389
## 154 2017-06-21     6 SERVER-472389
## 155 2017-06-22     7 SERVER-472389
## 156 2017-06-23    10 SERVER-472389
## 157 2017-06-26     9 SERVER-472389
## 158 2017-06-27    72 SERVER-472389
## 159 2017-06-28     6 SERVER-472389
## 160 2017-06-29     2 SERVER-472389
## 161 2017-06-30     4 SERVER-472389
## 162 2017-07-05     2 SERVER-472389
## 163 2017-07-06    10 SERVER-472389
## 164 2017-07-07     4 SERVER-472389
## 165 2017-07-10    11 SERVER-472389
## 166 2017-07-11     6 SERVER-472389
## 167 2017-07-12     5 SERVER-472389
## 168 2017-07-13     8 SERVER-472389
## 169 2017-07-14    13 SERVER-472389
## 170 2017-07-17     9 SERVER-472389
## 171 2017-07-18    13 SERVER-472389
## 172 2017-07-19    53 SERVER-472389
## 173 2017-07-20     3 SERVER-472389
## 174 2017-07-21     5 SERVER-472389
## 175 2017-07-24    13 SERVER-472389
## 176 2017-07-25    13 SERVER-472389
## 177 2017-07-26    10 SERVER-472389
## 178 2017-07-28    13 SERVER-472389
## 179 2017-07-31     9 SERVER-472389
## 180 2017-08-01     9 SERVER-472389
## 181 2017-08-02    13 SERVER-472389
## 182 2017-08-03     4 SERVER-472389
## 183 2017-08-04     1 SERVER-472389
## 184 2017-08-07     3 SERVER-472389
## 185 2017-08-08    10 SERVER-472389
## 186 2017-08-09     8 SERVER-472389
## 187 2017-08-11    10 SERVER-472389
## 188 2017-08-14     8 SERVER-472389
## 189 2017-08-15     4 SERVER-472389
## 190 2017-08-16     4 SERVER-472389
## 191 2017-08-17    13 SERVER-472389
## 192 2017-08-18     2 SERVER-472389
## 193 2017-08-21     6 SERVER-472389
## 194 2017-08-22     3 SERVER-472389
## 195 2017-08-23     6 SERVER-472389
## 196 2017-08-24     8 SERVER-472389
## 197 2017-08-25     9 SERVER-472389
## 198 2017-08-27    20 SERVER-472389
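One possible starting point for the challenge is sketched below. It assumes the same pipeline carries over once the date column is converted from text to Date and the rows are grouped by server. The helper name detect_log_anomalies is hypothetical, and because each series is short and skips some days, the automatic frequency and trend may need to be set by hand.

```r
library(tidyverse)
library(anomalize)

logs_path <- read.csv("http://bit.ly/LogsDataset")

# Hypothetical helper: prepare a logs data frame and run the pipeline on it.
detect_log_anomalies <- function(logs) {
    logs %>%
        as_tibble() %>%
        mutate(date = as.Date(date)) %>%  # read.csv() reads the dates as text
        group_by(server) %>%              # one time series per server
        time_decompose(count) %>%
        anomalize(remainder) %>%
        time_recompose()
}

logs_anomalized <- detect_log_anomalies(logs_path)

# Rows flagged as anomalous, then one panel per server.
logs_anomalized %>% filter(anomaly == "Yes")
logs_anomalized %>%
    plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.5)
```

As a sanity check, the large spikes visible in the printed data, such as the count of 124 on 2017-07-11 for SERVER-573826, are natural candidates to look for among the flagged rows.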