Scrum Methodology in Data Analytics Consulting

Illya Mowerman & Kirk Mettler

library(dplyr)
library(ggplot2)
library(tidyr)
set.seed(123)

Opening

“Scrum’s agility fits consulting—delivering insights fast amid change.”

What is Scrum?

Why Scrum in Analytics?

- Early client value (e.g., quick insights).
- Handles data surprises or scope shifts.
- Focuses on outcomes, not just code.

Scrum Framework: Three Pillars (Transparency, Inspection, Adaptation)

R Example: Simulate a dataset to inspect and adapt.

sales_data <- data.frame(
  Date = seq(as.Date("2025-01-01"), by = "day", length.out = 30),
  Sales = rnorm(30, mean = 1000, sd = 200),
  Region = sample(c("North", "South"), 30, replace = TRUE)
)
head(sales_data, 3)
##         Date     Sales Region
## 1 2025-01-01  887.9049  North
## 2 2025-01-02  953.9645  South
## 3 2025-01-03 1311.7417  South

“Start with this, inspect trends, adapt if client adds regions.”
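One way to make the "adapt" step cheap is to wrap the simulation in a function, so a new region is a one-argument change. This is a minimal sketch: `simulate_sales()` and the "West" region are hypothetical names introduced here for illustration, not part of the original example.

```r
# Hypothetical helper: simulate daily sales for any set of regions.
simulate_sales <- function(regions, n_days = 30, seed = 123) {
  set.seed(seed)  # fixed seed so each re-run is reproducible
  data.frame(
    Date = seq(as.Date("2025-01-01"), by = "day", length.out = n_days),
    Sales = rnorm(n_days, mean = 1000, sd = 200),
    Region = sample(regions, n_days, replace = TRUE)
  )
}

# Client adds a region mid-engagement: only the input changes.
sales_v2 <- simulate_sales(c("North", "South", "West"))
table(sales_v2$Region)
```

The downstream summary and plots group by `Region`, so they should pick up the extra level without changes.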

Scrum Roles: Product Owner

R Example: Prioritize a simple sales summary.

sales_summary <- sales_data %>%
  group_by(Region) %>%
  summarise(Avg_Sales = mean(Sales)) %>%
  arrange(desc(Avg_Sales))
sales_summary
## # A tibble: 2 × 2
##   Region Avg_Sales
##   <chr>      <dbl>
## 1 South      1019.
## 2 North       966.

“PO says: ‘Client needs this stat now.’”

Scrum Roles: Scrum Master

R Example: Automate a data check to remove a blocker.

missing_check <- sales_data %>%
  summarise(Missing_Sales = sum(is.na(Sales)))
missing_check
##   Missing_Sales
## 1             0

“SM: ‘I automated this—focus on analysis.’”
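To show what "automated" could look like beyond a single NA count, the check can be packaged as a reusable function. This is a sketch under assumptions: `check_sales_data()`, the extra checks (negative values, duplicate Date/Region rows), and the toy data are all hypothetical and chosen for illustration.

```r
# Hypothetical helper: return a one-row data quality report for a sales table.
check_sales_data <- function(df) {
  data.frame(
    Missing_Sales   = sum(is.na(df$Sales)),
    Negative_Sales  = sum(df$Sales < 0, na.rm = TRUE),
    Duplicate_Rows  = sum(duplicated(df[, c("Date", "Region")]))
  )
}

# Toy data with one problem of each kind (made up for this example).
toy <- data.frame(
  Date = as.Date("2025-01-01") + c(0, 1, 1, 2),
  Sales = c(950, NA, 1020, -5),
  Region = c("North", "South", "South", "North")
)

report <- check_sales_data(toy)
report
```

Running the function at the start of each Sprint gives the team one place to look before analysis begins.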

Scrum Roles: Development Team

R Example: Team builds a quick viz.

ggplot(sales_data, aes(x = Date, y = Sales, color = Region)) +
  geom_line() +
  labs(title = "Sales Trends by Region") +
  theme_minimal()

“Team: ‘I’ll plot, you clean data.’”

Scrum Events: Sprint

Flow: Plan → Work → Review → Repeat.

Scrum Events: Sprint Planning

R Example: Plan a Sprint backlog.

sprint_backlog <- c("Clean sales data", "Plot trends", "Summarize stats")
sprint_backlog
## [1] "Clean sales data" "Plot trends"      "Summarize stats"

“Output: Sprint Goal—‘Basic sales insights.’”

Scrum Events: Daily Scrum

R Example: Track daily progress.

daily_log <- data.frame(
  Day = c("Day 1", "Day 2"),
  Task = c("Data cleaning", "Plotting"),
  Status = c("Done", "In Progress")
)
daily_log
##     Day          Task      Status
## 1 Day 1 Data cleaning        Done
## 2 Day 2      Plotting In Progress

Scrum Events: Sprint Review

R Example: Demo with client feedback.

ggplot(sales_data, aes(x = Date, y = Sales)) +
  geom_line() +
  labs(title = "Sprint 1: Sales Trend") +
  theme_minimal()

“Client: ‘Nice! Can we filter by region?’”
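The client's request could feed the next Sprint roughly as follows. This is a sketch only: `plot_region()` is a hypothetical helper, and the data is re-simulated here so the block runs on its own.

```r
library(dplyr)
library(ggplot2)

# Re-create the simulated data so this block is self-contained.
set.seed(123)
sales_data <- data.frame(
  Date = seq(as.Date("2025-01-01"), by = "day", length.out = 30),
  Sales = rnorm(30, mean = 1000, sd = 200),
  Region = sample(c("North", "South"), 30, replace = TRUE)
)

# Hypothetical helper: plot the sales trend for a single region.
plot_region <- function(df, region) {
  df %>%
    filter(Region == region) %>%
    ggplot(aes(x = Date, y = Sales)) +
    geom_line() +
    labs(title = paste0("Sales Trend: ", region)) +
    theme_minimal()
}

north_only <- filter(sales_data, Region == "North")
plot_region(sales_data, "North")
```

If the client wants all regions side by side instead, adding `facet_wrap(~ Region)` to the original plot is a possible alternative.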

Scrum Events: Sprint Retrospective

R Example: Analyze Sprint effort.

effort <- data.frame(
  Task = c("Cleaning", "Plotting"),
  Hours = c(10, 4)
)
ggplot(effort, aes(x = Task, y = Hours)) +
  geom_col() +
  labs(title = "Effort Distribution")

“Retro: ‘Let’s cut cleaning time.’”

Scrum Artifacts: Product Backlog

R Example: Build a backlog.

product_backlog <- data.frame(
  Item = c("Sales trends", "Forecast Q2", "Customer segments"),
  Priority = c(1, 2, 3)
)
product_backlog
##                Item Priority
## 1      Sales trends        1
## 2       Forecast Q2        2
## 3 Customer segments        3

Scrum Artifacts: Sprint Backlog

R Example: Subset backlog.

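# Select by priority rather than row position, so reordering the backlog is safe
sprint_tasks <- product_backlog[product_backlog$Priority == 1, ]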
sprint_tasks
##           Item Priority
## 1 Sales trends        1
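In practice the Sprint Backlog is usually limited by team capacity as well as priority. A minimal sketch of that idea, assuming hour estimates and a capacity figure that are invented here for illustration (`Estimate_Hours` and `capacity_hours` are not in the original backlog):

```r
product_backlog <- data.frame(
  Item = c("Sales trends", "Forecast Q2", "Customer segments"),
  Priority = c(1, 2, 3),
  Estimate_Hours = c(24, 30, 20)  # hypothetical estimates
)
capacity_hours <- 40  # hypothetical team capacity for one Sprint

# Walk the backlog in strict priority order and keep items
# while the running total still fits within capacity.
ordered <- product_backlog[order(product_backlog$Priority), ]
fits <- cumsum(ordered$Estimate_Hours) <= capacity_hours
sprint_tasks <- ordered[fits, ]
sprint_tasks
```

With these numbers only "Sales trends" fits, which matches the single-item Sprint Backlog above. The running total stops at the first item that does not fit, so a lower-priority item is never pulled ahead of a higher-priority one.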

Scrum Artifacts: Increment

R Example: Final increment.

ggplot(sales_data, aes(x = Date, y = Sales, color = Region)) +
  geom_line() +
  labs(title = "Increment: Regional Sales") +
  theme_minimal()

Scrum Artifacts: Burndown Chart (Complementary Practice)

R Example: Simulate burndown.

burndown <- data.frame(
  Day = 1:10,
  Hours_Remaining = c(40, 35, 30, 28, 20, 15, 12, 8, 4, 0),
  Task = "Analytics Sprint"
)
ggplot(burndown, aes(x = Day, y = Hours_Remaining)) +
  geom_line() +
  geom_point() +
  labs(title = "Burndown: Analytics Sprint", y = "Hours Left")
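A burndown is easier to read against a reference pace. The sketch below adds an "ideal" line, assumed here to be a straight line from the starting hours to zero on the last day; the `Ideal` and `Gap` columns are names introduced for this example.

```r
library(ggplot2)

burndown <- data.frame(
  Day = 1:10,
  Hours_Remaining = c(40, 35, 30, 28, 20, 15, 12, 8, 4, 0)
)

# Assumed ideal pace: linear from the starting hours down to zero.
burndown$Ideal <- seq(max(burndown$Hours_Remaining), 0,
                      length.out = nrow(burndown))
# Positive gap means more hours remain than the ideal pace allows.
burndown$Gap <- burndown$Hours_Remaining - burndown$Ideal

ggplot(burndown, aes(x = Day)) +
  geom_line(aes(y = Hours_Remaining)) +
  geom_point(aes(y = Hours_Remaining)) +
  geom_line(aes(y = Ideal), linetype = "dashed") +
  labs(title = "Burndown: Actual vs. Ideal", y = "Hours Left")
```

Days where `Gap` is positive are candidates to raise in the Daily Scrum.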

Applying Scrum to Analytics Consulting

Scenario: Retail client needs sales insights.

- Product Backlog: Trends, forecasts, segments.
- Sprint 1:
  - Goal: “Sales trend dashboard.”
  - Daily Scrum: “API’s slow.”
  - Review: “Add filters.”
  - Retro: “Automate ETL.”

R Output: Final deliverable.

sales_data %>%
  ggplot(aes(x = Date, y = Sales, color = Region)) +
  geom_line() +
  labs(title = "Sprint 1 Deliverable") +
  theme_minimal()

Benefits: Fast value, adaptability.

Challenges: Data delays, scope creep.

Closing & Q&A

“Iterate, deliver, improve—Scrum powers analytics consulting.”