I will create a synthetic dataset with three columns Date, Item, and Price. I’ll sort the items by Date in ascending order. We’ll use the window function to group items by year to date average and Six-Day Moving Average without ruining the original dataset.
Code base
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(zoo)
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
synthetic_dataset <-read.csv("synthetic_dataset.csv")processed_data <- synthetic_dataset %>%group_by(item) %>%arrange(date) %>%mutate(ytd_average_sales =cummean(sales),six_day_moving_avg =rollmean(sales, k =6, fill =NA, align ="right") ) %>%ungroup()processed_data %>%filter(date >=as.Date("2024-01-06")) %>%head(10)
# A tibble: 10 × 9
date item category region sales quantity price_per_unit ytd_average_sales
<chr> <chr> <chr> <chr> <dbl> <int> <dbl> <dbl>
1 2024-0… Prod… Electro… North 101. 2 57.1 105.
2 2024-0… Prod… Clothing South 137. 3 37.4 153.
3 2024-0… Prod… Electro… East 318. 6 47.8 327.
4 2024-0… Prod… Home West 53.7 1 19.3 83.9
5 2024-0… Prod… Clothing Centr… 156. 10 67.9 161.
6 2024-0… Prod… Electro… North 120. 3 72.6 107.
7 2024-0… Prod… Clothing South 165. 12 46.3 155.
8 2024-0… Prod… Electro… East 335. 8 52.9 328.
9 2024-0… Prod… Home West 70.4 4 98.1 82.0
10 2024-0… Prod… Clothing Centr… 219. 12 18.8 169.
# ℹ 1 more variable: six_day_moving_avg <dbl>