Concept Drift Detection
Assuming prior knowledge of the terms used in stream learning, Drift Detection is the step of a continuous evaluation of the model error. When this error exceeds a threshold, de drift detector returns a signal informing the model that the context has changed and it must adapt to the new data distribution.
Figure 1 depicts a simple diagram showing where the detection plays its role.

Here we will simulate a concept drift using two gaussian distributions and moving one of them towards the other. We will use a simple KNN algorithm and at certain point we expect that it starts to misclassify new points. This is where the drift detector must raise a signal.
We will make 10^5 iterations, and the KNN algorithm will have a window where the model will learn how to predict the next observation. Of course we could use an evolving algorithm to avoid the re-learning process, but this is beyond this assessment. Using a window size helps to minimize the computation burden and also introduces some kind of forgetting mechanism.
The baseline error rate is given by the formula \(p_i\) that means the probability of observe a misclassification, with standard deviation of \(s_i = \sqrt{p_i(1-p_i)/{i}}\), considering that the distribution is near normal for >30 observations, the confidence interval for \(p\) is \(pi\pm\alpha*s_i\).
At every step we will keep two values: \(p_{min}\) and \(s_{min}\), updating them whenever the current \(p_i+s_i\) reaches a new minimum. In this example, we will use an \(\alpha\) of 95%, and the drift detection will be flagged when \(p_i+s_i \ge p_{min}+2*s_{min}\).
n <- 10000 # number of iterations
drift_start <- 4000 # drift will start at this observation
drift_length <- 50 # drift will end after this amount of steps
from_mean <- 5 # the distribution will start on this mean
to_mean <- 2 # and will end on this mean
sd <- 0.5 # all distributions will have this std dev
window_size <- 100 # this is the window size of KNN
k <- 1 # number of neighbours considered
slow_drift <- seq(from_mean, to_mean, length.out = drift_length) # change sequence
#### 2 clusters dataset ----
set.seed(2020)
data <- rbind(
matrix(rnorm(n, 0, sd), ncol = 2),
matrix(c(rnorm(n / 2, 2, sd), rnorm(n / 2, from_mean, sd)), ncol = 2)
)
data <- cbind(data, rep(c(1, 2), each = n / 2))
plot(data[, 1], data[, 2], col = data[, 3], xlim = c(-5, 8), ylim = c(-4, 7),
main = "Clusters Starting Point", ylab = "", xlab = "")

# mix the samples
set.seed(2020)
idxs_train <- sample(seq_len(nrow(data)))
data <- data[idxs_train, ]
#### prime the model ----
# learn the first window_size examples and do a classification
class <- knn(data[1:window_size, -3], data[(window_size + 1), -3], data[1:window_size, 3], k = k)
# is it wrong?
error <- data[(window_size + 1), 3] != class
pmin <- pi <- 0.5 # initial values
smin <- si <- sqrt(pi * (1 - pi) / window_size) # initial values
# Start the iterations starting from the next observation
set.seed(2020) # set seed again because the drift will use rng
alarm <- 0
pi_counter <- NULL
for (i in seq_len(n - window_size - 1)) {
# First do the drift if it is time
if ((i + window_size) >= drift_start) {
# only drift the second class
if (data[(window_size + i + 1), 3] == 2) {
m <- ifelse((i + window_size) < drift_start + drift_length,
slow_drift[i + window_size - drift_start + 1], tail(slow_drift, 1)
)
data[(window_size + i + 1), -3] <- c(rnorm(1, 2, sd), rnorm(1, m, sd))
}
}
cl <- knn(data[(i + 1):(window_size + i), -3],
data[(window_size + i + 1), -3],
data[(i + 1):(window_size + i), 3],
k = k
)
class <- c(class, cl)
error <- c(error, cl != data[(window_size + i + 1), 3])
er <- sum(error) # sum(tail(error, window_size))
# update probability of failure
pi <- er / i
si <- sqrt(pi * (1 - pi) / window_size)
pi_counter <- c(pi_counter, pi)
if ((pi + si) < (pmin + smin)) {
pmin <- pi
smin <- si
} else if ((pi + si) > (pmin + 2 * smin)) {
if (alarm == 0) {
alarm <- (i + window_size)
}
pmin <- pi
smin <- si
}
}
The following plot shows the error rate for the KNN classifier. The black line is the drift starting point, and the red line is the detection point.
plot(pi_counter, type = "l", main = "Drift Detection", xlab = "Time", ylab = "Error Rate")
abline(v = c(drift_start, alarm), col = c(1, 2))

LS0tCnRpdGxlOiAiSEVBRFMgLSBISURBIC0gTGVhcm46IENvbmNlcHQgRHJpZnQiCm91dHB1dDogCiAgaHRtbF9ub3RlYm9vazogCiAgICBmaWdfaGVpZ2h0OiA4CiAgICBmaWdfd2lkdGg6IDEwCiAgICBoaWdobGlnaHQ6IHB5Z21lbnRzCiAgICB0aGVtZTogdW5pdGVkCiAgICB0b2M6IHllcwphdXRob3I6IAogIC0gRGlhbmEgTGVpdGUgUG9ydGVsYSBTaWx2YQogIC0gRnJhbmNpc2NvIEJpc2Nob2ZmCiAgLSBIdWdvIEZpbGlwZSBCYXB0aXN0YSBNb250ZWlybwogIC0gxZ5lcmFmZmV0aW4gR8O8bmXFnwotLS0KCiMgSW50cm9kdWN0aW9uCgpgYGB7ciBzZXR1cCwgbWVzc2FnZSA9IEZBTFNFfQprbml0cjo6b3B0c19jaHVuayRzZXQodGlkeS5vcHRzID0gbGlzdCh3aWR0aC5jdXRvZmYgPSA2MCksIHRpZHkgPSBUUlVFKQpsaWJyYXJ5KGNvd3Bsb3QpCmBgYAoKTW9zdCBvZiB0aGUgdGltZSwgbWFjaGluZSBsZWFybmluZyBhbGdvcml0aG1zIG5lZWQgdG8gbGVhcm4gZnJvbSBjb250aW51b3VzIGRhdGEsIHVsdGltYXRlbHkgYW4gaW5maW5pdGUgYW1vdW50IG9mIGRhdGEuCkl0IGlzIG9ubHkgbmF0dXJhbCB0byBiZWxpZXZlIHRoYXQgdGhlc2UgZGF0YSB3aWxsIGNoYW5nZSBvdmVyIHRpbWUsIGFuZCB3aGF0IGlzIG5vdyByZWxldmFudCwgbWF5IG5vdCBiZSBpbiB0aGUgZnV0dXJlLiBTbyB0aGVyZSBpcyBhIG5lZWQgdG8gdW5kZXJzdGFuZCBfd2hlbl8gdGhlIGRhdGEgaXMgY2hhbmdpbmcuCgojIENvbmNlcHQgRHJpZnQgRGV0ZWN0aW9uCgpBc3N1bWluZyBwcmlvciBrbm93bGVkZ2Ugb2YgdGhlIHRlcm1zIHVzZWQgaW4gc3RyZWFtIGxlYXJuaW5nLCBEcmlmdCBEZXRlY3Rpb24gaXMgdGhlIHN0ZXAgb2YgYSBjb250aW51b3VzIGV2YWx1YXRpb24gb2YgdGhlIG1vZGVsIGVycm9yLiBXaGVuIHRoaXMgZXJyb3IgZXhjZWVkcyBhIHRocmVzaG9sZCwgZGUgZHJpZnQgZGV0ZWN0b3IgcmV0dXJucyBhIHNpZ25hbCBpbmZvcm1pbmcgdGhlIG1vZGVsIHRoYXQgdGhlIGNvbnRleHQgaGFzIGNoYW5nZWQgYW5kIGl0IG11c3QgYWRhcHQgdG8gdGhlIG5ldyBkYXRhIGRpc3RyaWJ1dGlvbi4KCkZpZ3VyZSAxIGRlcGljdHMgYSBzaW1wbGUgZGlhZ3JhbSBzaG93aW5nIHdoZXJlIHRoZSBkZXRlY3Rpb24gcGxheXMgaXRzIHJvbGUuCgpgYGB7ciBlY2hvPUZBTFNFLCBmaWcud2lkdGg9MTAsIGZpZy5oZWlnaHQ9Nn0KZmlnX3N2ZyA8LSBnZ2RyYXcoKSArIGRyYXdfaW1hZ2UoImRyaWZ0VU1MLnN2ZyIpICsgZHJhd19wbG90X2xhYmVsKCJGaWd1cmUgMTogU2VxdWVuY2UgZGlhZ3JhbSBmb3IgYW4gb25saW5lIGFkYXB0aXZlIGxlYXJuaW5nIGFsZ29yaXRobS4gQWRhcHRlZCBmcm9tIEdhbWEsIDIwMDQuIiwgeCA9IC0wLjUsIHkgPSAwLjA4LCBzaXplID0gMTAsIGhqdXN0ID0gLTEuMiwgZm9udGZhY2UgPSAicGxhaW4iKQpwbG90KGZpZ19zdmcpCmBgYAoKCkhlcmUgd2Ugd2lsbCBzaW11bGF0ZSBhIGNvbmNlcHQgZHJpZnQgdXNpbmcgdHdvIGdhdXNzaWFuIGRpc3RyaWJ1dGlvbnMgYW5kIG1vdmluZyBvbmUgb2YgdGhlbSB0b3dhcmRzIHRoZSBvdGhlci4gV2Ugd2lsbCB1c2UgYSBzaW1wbGUgS05OIGFsZ29yaXRobSBhbmQgYXQgY2VydGFpbiBwb2ludCB3ZSBleHBlY3QgdGhhdCBpdCBzdGFydHMgdG8gbWlzY2xhc3NpZnkgbmV3IHBvaW50cy4gVGhpcyBpcyB3aGVyZSB0aGUgZHJpZnQgZGV0ZWN0b3IgbXVzdCByYWlzZSBhIHNpZ25hbC4KCldlIHdpbGwgbWFrZSAxMF41IGl0ZXJhdGlvbnMsIGFuZCB0aGUgS05OIGFsZ29yaXRobSB3aWxsIGhhdmUgYSB3aW5kb3cgd2hlcmUgdGhlIG1vZGVsIHdpbGwgbGVhcm4gaG93IHRvIHByZWRpY3QgdGhlIG5leHQgb2JzZXJ2YXRpb24uIE9mIGNvdXJzZSB3ZSBjb3VsZCB1c2UgYW4gZXZvbHZpbmcgYWxnb3JpdGhtIHRvIGF2b2lkIHRoZSByZS1sZWFybmluZyBwcm9jZXNzLCBidXQgdGhpcyBpcyBiZXlvbmQgdGhpcyBhc3Nlc3NtZW50LiBVc2luZyBhIHdpbmRvdyBzaXplIGhlbHBzIHRvIG1pbmltaXplIHRoZSBjb21wdXRhdGlvbiBidXJkZW4gYW5kIGFsc28gaW50cm9kdWNlcyBzb21lIGtpbmQgb2YgZm9yZ2V0dGluZyBtZWNoYW5pc20uCgpUaGUgYmFzZWxpbmUgZXJyb3IgcmF0ZSBpcyBnaXZlbiBieSB0aGUgZm9ybXVsYSAkcF9pJCB0aGF0IG1lYW5zIHRoZSBwcm9iYWJpbGl0eSBvZiBvYnNlcnZlIGEgbWlzY2xhc3NpZmljYXRpb24sIHdpdGggc3RhbmRhcmQgZGV2aWF0aW9uIG9mICRzX2kgPSBcc3FydHtwX2koMS1wX2kpL3tpfX0kLCBjb25zaWRlcmluZyB0aGF0IHRoZSBkaXN0cmlidXRpb24gaXMgbmVhciBub3JtYWwgZm9yID4zMCBvYnNlcnZhdGlvbnMsIHRoZSBjb25maWRlbmNlIGludGVydmFsIGZvciAkcCQgaXMgJHBpXHBtXGFscGhhKnNfaSQuCgpBdCBldmVyeSBzdGVwIHdlIHdpbGwga2VlcCB0d28gdmFsdWVzOiAkcF97bWlufSQgYW5kICRzX3ttaW59JCwgdXBkYXRpbmcgdGhlbSB3aGVuZXZlciB0aGUgY3VycmVudCAkcF9pK3NfaSQgcmVhY2hlcyBhIG5ldyBtaW5pbXVtLiBJbiB0aGlzIGV4YW1wbGUsIHdlIHdpbGwgdXNlIGFuICRcYWxwaGEkIG9mIDk1JSwgYW5kIHRoZSBkcmlmdCBkZXRlY3Rpb24gd2lsbCBiZSBmbGFnZ2VkIHdoZW4gJHBfaStzX2kgXGdlIHBfe21pbn0rMipzX3ttaW59JC4KCmBgYHtyLCBmaWcuaGVpZ2h0PTUsIGZpZy53aWR0aD03fQpuIDwtIDEwMDAwICMgbnVtYmVyIG9mIGl0ZXJhdGlvbnMKZHJpZnRfc3RhcnQgPC0gNDAwMCAjIGRyaWZ0IHdpbGwgc3RhcnQgYXQgdGhpcyBvYnNlcnZhdGlvbgpkcmlmdF9sZW5ndGggPC0gNTAgIyBkcmlmdCB3aWxsIGVuZCBhZnRlciB0aGlzIGFtb3VudCBvZiBzdGVwcwpmcm9tX21lYW4gPC0gNSAjIHRoZSBkaXN0cmlidXRpb24gd2lsbCBzdGFydCBvbiB0aGlzIG1lYW4KdG9fbWVhbiA8LSAyICMgYW5kIHdpbGwgZW5kIG9uIHRoaXMgbWVhbgpzZCA8LSAwLjUgIyBhbGwgZGlzdHJpYnV0aW9ucyB3aWxsIGhhdmUgdGhpcyBzdGQgZGV2CndpbmRvd19zaXplIDwtIDEwMCAjIHRoaXMgaXMgdGhlIHdpbmRvdyBzaXplIG9mIEtOTgprIDwtIDEgIyBudW1iZXIgb2YgbmVpZ2hib3VycyBjb25zaWRlcmVkCnNsb3dfZHJpZnQgPC0gc2VxKGZyb21fbWVhbiwgdG9fbWVhbiwgbGVuZ3RoLm91dCA9IGRyaWZ0X2xlbmd0aCkgIyBjaGFuZ2Ugc2VxdWVuY2UKCiMjIyMgMiBjbHVzdGVycyBkYXRhc2V0IC0tLS0Kc2V0LnNlZWQoMjAyMCkKZGF0YSA8LSByYmluZCgKICBtYXRyaXgocm5vcm0obiwgMCwgc2QpLCBuY29sID0gMiksCiAgbWF0cml4KGMocm5vcm0obiAvIDIsIDIsIHNkKSwgcm5vcm0obiAvIDIsIGZyb21fbWVhbiwgc2QpKSwgbmNvbCA9IDIpCikKZGF0YSA8LSBjYmluZChkYXRhLCByZXAoYygxLCAyKSwgZWFjaCA9IG4gLyAyKSkKCnBsb3QoZGF0YVssIDFdLCBkYXRhWywgMl0sIGNvbCA9IGRhdGFbLCAzXSwgeGxpbSA9IGMoLTUsIDgpLCB5bGltID0gYygtNCwgNyksCiAgICAgbWFpbiA9ICJDbHVzdGVycyBTdGFydGluZyBQb2ludCIsIHlsYWIgPSAiIiwgeGxhYiA9ICIiKQojIG1peCB0aGUgc2FtcGxlcwpzZXQuc2VlZCgyMDIwKQppZHhzX3RyYWluIDwtIHNhbXBsZShzZXFfbGVuKG5yb3coZGF0YSkpKQpkYXRhIDwtIGRhdGFbaWR4c190cmFpbiwgXQoKIyMjIyBwcmltZSB0aGUgbW9kZWwgLS0tLQojIGxlYXJuIHRoZSBmaXJzdCB3aW5kb3dfc2l6ZSBleGFtcGxlcyBhbmQgZG8gYSBjbGFzc2lmaWNhdGlvbgpjbGFzcyA8LSBrbm4oZGF0YVsxOndpbmRvd19zaXplLCAtM10sIGRhdGFbKHdpbmRvd19zaXplICsgMSksIC0zXSwgZGF0YVsxOndpbmRvd19zaXplLCAzXSwgayA9IGspCgojIGlzIGl0IHdyb25nPwplcnJvciA8LSBkYXRhWyh3aW5kb3dfc2l6ZSArIDEpLCAzXSAhPSBjbGFzcwoKcG1pbiA8LSBwaSA8LSAwLjUgIyBpbml0aWFsIHZhbHVlcwpzbWluIDwtIHNpIDwtIHNxcnQocGkgKiAoMSAtIHBpKSAvIHdpbmRvd19zaXplKSAjIGluaXRpYWwgdmFsdWVzCgojIFN0YXJ0IHRoZSBpdGVyYXRpb25zIHN0YXJ0aW5nIGZyb20gdGhlIG5leHQgb2JzZXJ2YXRpb24Kc2V0LnNlZWQoMjAyMCkgIyBzZXQgc2VlZCBhZ2FpbiBiZWNhdXNlIHRoZSBkcmlmdCB3aWxsIHVzZSBybmcKYWxhcm0gPC0gMApwaV9jb3VudGVyIDwtIE5VTEwKZm9yIChpIGluIHNlcV9sZW4obiAtIHdpbmRvd19zaXplIC0gMSkpIHsKCiAgIyBGaXJzdCBkbyB0aGUgZHJpZnQgaWYgaXQgaXMgdGltZQogIGlmICgoaSArIHdpbmRvd19zaXplKSA+PSBkcmlmdF9zdGFydCkgewogICAgIyBvbmx5IGRyaWZ0IHRoZSBzZWNvbmQgY2xhc3MKICAgIGlmIChkYXRhWyh3aW5kb3dfc2l6ZSArIGkgKyAxKSwgM10gPT0gMikgewogICAgICBtIDwtIGlmZWxzZSgoaSArIHdpbmRvd19zaXplKSA8IGRyaWZ0X3N0YXJ0ICsgZHJpZnRfbGVuZ3RoLAogICAgICAgIHNsb3dfZHJpZnRbaSArIHdpbmRvd19zaXplIC0gZHJpZnRfc3RhcnQgKyAxXSwgdGFpbChzbG93X2RyaWZ0LCAxKQogICAgICApCiAgICAgIGRhdGFbKHdpbmRvd19zaXplICsgaSArIDEpLCAtM10gPC0gYyhybm9ybSgxLCAyLCBzZCksIHJub3JtKDEsIG0sIHNkKSkKICAgIH0KICB9CgogIGNsIDwtIGtubihkYXRhWyhpICsgMSk6KHdpbmRvd19zaXplICsgaSksIC0zXSwKICAgIGRhdGFbKHdpbmRvd19zaXplICsgaSArIDEpLCAtM10sCiAgICBkYXRhWyhpICsgMSk6KHdpbmRvd19zaXplICsgaSksIDNdLAogICAgayA9IGsKICApCiAgY2xhc3MgPC0gYyhjbGFzcywgY2wpCiAgZXJyb3IgPC0gYyhlcnJvciwgY2wgIT0gZGF0YVsod2luZG93X3NpemUgKyBpICsgMSksIDNdKQoKICBlciA8LSBzdW0oZXJyb3IpICMgc3VtKHRhaWwoZXJyb3IsIHdpbmRvd19zaXplKSkKICAjIHVwZGF0ZSBwcm9iYWJpbGl0eSBvZiBmYWlsdXJlCiAgcGkgPC0gZXIgLyBpCiAgc2kgPC0gc3FydChwaSAqICgxIC0gcGkpIC8gd2luZG93X3NpemUpCiAgcGlfY291bnRlciA8LSBjKHBpX2NvdW50ZXIsIHBpKQoKICBpZiAoKHBpICsgc2kpIDwgKHBtaW4gKyBzbWluKSkgewogICAgcG1pbiA8LSBwaQogICAgc21pbiA8LSBzaQogIH0gZWxzZSBpZiAoKHBpICsgc2kpID4gKHBtaW4gKyAyICogc21pbikpIHsKICAgIGlmIChhbGFybSA9PSAwKSB7CiAgICAgIGFsYXJtIDwtIChpICsgd2luZG93X3NpemUpCiAgICB9CiAgICBwbWluIDwtIHBpCiAgICBzbWluIDwtIHNpCiAgfQp9CmBgYAoKVGhlIGZvbGxvd2luZyBwbG90IHNob3dzIHRoZSBlcnJvciByYXRlIGZvciB0aGUgS05OIGNsYXNzaWZpZXIuIFRoZSBibGFjayBsaW5lIGlzIHRoZSBkcmlmdCBzdGFydGluZyBwb2ludCwgYW5kIHRoZSByZWQgbGluZSBpcyB0aGUgZGV0ZWN0aW9uIHBvaW50LgoKYGBge3IsIGZpZy5oZWlnaHQ9NSwgZmlnLndpZHRoPTEwfQpwbG90KHBpX2NvdW50ZXIsIHR5cGUgPSAibCIsIG1haW4gPSAiRHJpZnQgRGV0ZWN0aW9uIiwgeGxhYiA9ICJUaW1lIiwgeWxhYiA9ICJFcnJvciBSYXRlIikKYWJsaW5lKHYgPSBjKGRyaWZ0X3N0YXJ0LCBhbGFybSksIGNvbCA9IGMoMSwgMikpCmBgYAo=