This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
library(anomalize)
## ══ Use anomalize to improve your Forecasts by 50%! ═════════════════════════════
## Business Science offers a 1-hour course - Lab #18: Time Series Anomaly Detection!
## </> Learn more at: https://university.business-science.io/p/learning-labs-pro </>
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.8
## ✓ tidyr 1.2.0 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(dplyr)
library(tibble)
library(tibble)
df=read.csv("http://bit.ly/CarreFourSalesDataset")
head(df)
## Date Sales
## 1 1/5/2019 548.9715
## 2 3/8/2019 80.2200
## 3 3/3/2019 340.5255
## 4 1/27/2019 489.0480
## 5 2/8/2019 634.3785
## 6 3/25/2019 627.6165
#changing date column to appropriate format and changing to tibble
df$Date <- as.Date(df$Date, format = "%m/%d/%Y")
df$Date <- as.POSIXct(df$Date)
df <- as_tibble(df)
#data structure
str(df)
## tibble [1,000 × 2] (S3: tbl_df/tbl/data.frame)
## $ Date : POSIXct[1:1000], format: "2019-01-05 03:00:00" "2019-03-08 03:00:00" ...
## $ Sales: num [1:1000] 549 80.2 340.5 489 634.4 ...
df %>%
time_decompose(Sales, method = 'stl', frequency = 'auto', trend = 'auto') %>%
anomalize(remainder, method = 'iqr', alpha = 0.1, max_anoms = 0.5) %>%
plot_anomaly_decomposition(ncol = 2, alpha_dots = 0.6)
## Converting from tbl_df to tbl_time.
## Auto-index message: index = Date
## Note: Index not ordered. tibbletime assumes index is in ascending order. Results may not be as desired.
## frequency = 12 seconds
## Note: Index not ordered. tibbletime assumes index is in ascending order. Results may not be as desired.
## trend = 12 seconds
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
df %>%
time_decompose(Sales, method = 'stl', frequency = 'auto', trend = 'auto') %>%
anomalize(remainder, method = 'iqr', alpha = 0.1, max_anoms = 0.5) %>%
time_recompose() %>%
plot_anomalies(time_recomposed = TRUE, ncol = 2, alpha_dots = 0.6)
## Converting from tbl_df to tbl_time.
## Auto-index message: index = Date
## Note: Index not ordered. tibbletime assumes index is in ascending order. Results may not be as desired.
## frequency = 12 seconds
## Note: Index not ordered. tibbletime assumes index is in ascending order. Results may not be as desired.
## trend = 12 seconds
we can see that we have many anomalies in our data