REGORK FINAL PROJECT

INTRODUCTION

In an increasingly competitive grocery industry, it is more important than ever for Regork to invest strategically to maintain a strong market position. By acting on our findings, Regork can not only remain one of the top grocery chains but also thrive as the most adaptable, affordable, and advanced one in the market.

Problem Statement:

Our Regork customers are not as satisfied as they need to be; their loyalty to Regork is waning as they turn to our competitors. Our current practices are resulting in customer churn, signaling an error on our end that is costing us severely.

Question: Where should Regork invest in targeted, calendar-aware promotions and in-aisle bundles—based on observed purchase seasonality and frequently co-purchased items—to lift revenue with existing traffic?

How We Address It:

To reverse this trend, we first need to understand why our customers are shopping with other grocers. That means gathering data on buyer behavior, drawing conclusions from it, and making educated predictions. We based our findings on the Complete Journey Study provided by 84.51°, a partner of Regork that specializes in data analytics. By wrangling its transaction and product data, we can better understand our customers and shape business practices that suit them.

Our Solution

Based on our findings, we believe that a seasonally conscious marketing campaign will help build customer loyalty. We want our customers to feel that we offer the best deals exactly when they need them, and the best way to do this is to structure promotions around general shopping trends. For example, timing promotions for when foot traffic is highest will generate more sales. This will also give us more of a competitive edge against other grocery chains.

Moreover, a marketing campaign built around customer incentives for our top departments will encourage both existing and new customers to shop at Regork repeatedly. This will involve promotions, exclusive loyalty programs, and well-timed discounts. Widespread communication that alerts customers to these deals will also be imperative. We need to capitalize on our strengths, and highlighting our best departments is a great way to do that.

Furthermore, a personalized communication strategy will show customers that we understand them, resulting in increased engagement. A mass-audience communication strategy will reach the largest audience possible, while personalized opt-in messaging will give a smaller group of customers a stronger reason to shop at Regork.

Based on our data, we believe these strategies will prove to be the most effective at reducing customer churn. After understanding our consumers’ needs and desires, we came up with strategies that directly relate to them to increase customer satisfaction and loyalty. In turn, stronger, longer-lasting relationships will be made between Regork and the customer.

PACKAGES/LIBRARIES REQUIRED

Libraries

knitr::opts_chunk$set(message = FALSE, warning = FALSE)

# Import everything up front (i) and quietly (ii)
suppressPackageStartupMessages({
  library(completejourney)  # Complete Journey data access
  library(tidyverse)        # wrangling + plotting
  library(lubridate)        # date features (month/week/day-of-week)
  library(janitor)          # clean_names() and quick QA tables
  library(gt)               # clean, presentation-ready tables
  library(scales)           # currency/percent labels
  library(widyr)            # simple co-occurrence (bundle ideas)
})

What each package is for:

  • completejourney: loads the full transactions, products, households, and promotions datasets.
  • tidyverse: makes data cleaning, joining, and plotting simple.
  • lubridate: turns timestamps into month/week/day-of-week for seasonality.
  • janitor: fixes messy column names and helps quick data checks.
  • gt: creates polished tables for business-friendly display.
  • scales: formats dollars/percentages in plots and tables.
  • widyr: finds items frequently bought together (bundle candidates).
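
To make "frequently bought together" concrete, here is a minimal base-R sketch of pairwise co-occurrence counting on a made-up set of baskets. The basket IDs and item names are hypothetical and are not taken from the Complete Journey data. On the real transactions, widyr::pairwise_count() would be the package-based route to the same kind of counts; the matrix approach below is only a stand-in that shows the idea.

```r
# Hypothetical toy baskets (NOT from the Complete Journey data)
baskets <- data.frame(
  basket_id = c(1, 1, 1, 2, 2, 3, 3, 3, 4),
  item      = c("chips", "salsa", "soda",
                "chips", "salsa",
                "chips", "soda", "bread",
                "bread"),
  stringsAsFactors = FALSE
)

# Basket-by-item incidence matrix: 1 if the item appears in the basket, else 0
counts    <- unclass(table(baskets$basket_id, baskets$item))
incidence <- (counts > 0) * 1

# Item-by-item co-occurrence: entry [i, j] = number of baskets containing both items
co_occur <- crossprod(incidence)

# Keep each unordered pair once (upper triangle) and rank by basket count
idx <- which(upper.tri(co_occur), arr.ind = TRUE)
pairs <- data.frame(
  item1     = rownames(co_occur)[idx[, "row"]],
  item2     = colnames(co_occur)[idx[, "col"]],
  n_baskets = co_occur[idx],
  stringsAsFactors = FALSE
)
pairs <- pairs[order(-pairs$n_baskets), ]
print(pairs)
```

Pairs at the top of this ranking are the bundle candidates. In practice a minimum basket count would likely be needed so that rare items do not produce misleading pairs.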

EXPLORATORY DATA

Table: Top 10 Departments by Sales

tx   <- completejourney::get_transactions()
prod <- completejourney::products
txp  <- dplyr::left_join(tx, prod, by = "product_id")
txp$tx_date <- as.Date(txp$transaction_timestamp)
txp$month   <- lubridate::month(txp$tx_date, label = TRUE, abbr = TRUE)
dept_tbl <- txp |>
  dplyr::group_by(department) |>
  dplyr::summarise(
    sales = sum(sales_value, na.rm = TRUE),
    units = sum(quantity,    na.rm = TRUE),
    .groups = "drop"
  ) |>
  dplyr::arrange(dplyr::desc(sales)) |>
  dplyr::slice_head(n = 10)

dept_tbl |>
  dplyr::mutate(sales = scales::dollar(sales)) |>
  gt::gt() |>
  gt::tab_header("Top 10 Departments by Sales")

Top 10 Departments by Sales

department       sales        units
GROCERY          $2,316,394   1242944
DRUG GM          $596,827     198635
FUEL             $329,594     129662940
PRODUCE          $322,859     185444
MEAT             $308,575     67526
MEAT-PCKGD       $232,283     83423
DELI             $148,344     37954
MISCELLANEOUS    $78,859      21361882
PASTRY           $69,117      28132
NUTRITION        $57,261      25024
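
The units column is worth a sanity check before it is used for any comparison. The sketch below re-enters the table values above by hand and divides sales by units to get an implied dollars-per-unit figure for each department. FUEL and MISCELLANEOUS come out at well under one cent per unit. This suggests that quantity is recorded on a different scale for those departments, which is an assumption on our part and is not confirmed by the data documentation. The 0.01 cutoff is an arbitrary illustrative threshold.

```r
# Values re-entered by hand from the "Top 10 Departments by Sales" table
dept_check <- data.frame(
  department = c("GROCERY", "DRUG GM", "FUEL", "PRODUCE", "MEAT",
                 "MEAT-PCKGD", "DELI", "MISCELLANEOUS", "PASTRY", "NUTRITION"),
  sales = c(2316394, 596827, 329594, 322859, 308575,
            232283, 148344, 78859, 69117, 57261),
  units = c(1242944, 198635, 129662940, 185444, 67526,
            83423, 37954, 21361882, 28132, 25024),
  stringsAsFactors = FALSE
)

# Implied average dollars per recorded unit
dept_check$sales_per_unit <- dept_check$sales / dept_check$units

# Flag departments whose implied price is implausibly low (illustrative cutoff)
threshold <- 0.01
dept_check$flag <- dept_check$sales_per_unit < threshold
flagged <- dept_check$department[dept_check$flag]

print(dept_check[, c("department", "sales_per_unit", "flag")])
```

Given this, it is safer to rank departments by sales and not by units.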

Graph: Sales by Month

by_month <- txp |>
  dplyr::filter(!is.na(month)) |>
  dplyr::group_by(month) |>
  dplyr::summarise(sales = sum(sales_value, na.rm = TRUE), .groups = "drop")

ggplot2::ggplot(by_month, ggplot2::aes(month, sales, fill = month)) +
  ggplot2::geom_col(color = "white") +
  ggplot2::scale_y_continuous(labels = scales::dollar) +
  ggplot2::scale_fill_brewer(palette = "Set3", guide = "none") +
  ggplot2::labs(title = "Sales by Month", x = NULL, y = "Sales ($)")

Graph: Calendar Heatmap of Daily Sales (Day of Week by Week of Month)

# Calendar heatmap (ordered months, no Jan 2018, clearer legend)

dy <- txp |>
  dplyr::filter(!is.na(tx_date), tx_date < as.Date("2018-01-01")) |>  # drop Jan 2018
  dplyr::group_by(tx_date) |>
  dplyr::summarise(sales = sum(sales_value, na.rm = TRUE), .groups = "drop") |>
  dplyr::mutate(
    month_start = lubridate::floor_date(tx_date, "month"),
    # build month labels and set factor levels by chronological order
    month_lab_raw = format(month_start, "%b %Y"),
    month_lab = factor(
      month_lab_raw,
      levels = format(sort(unique(month_start)), "%b %Y")
    ),
    dow = lubridate::wday(tx_date, label = TRUE, abbr = TRUE, week_start = 1),
    wom = ceiling((lubridate::day(tx_date) +
                   lubridate::wday(month_start, week_start = 1) - 1) / 7)
  )

max_sales <- max(dy$sales, na.rm = TRUE)

ggplot2::ggplot(dy, ggplot2::aes(x = wom, y = dow, fill = sales)) +
  ggplot2::geom_tile(color = "white", linewidth = 0.3) +
  ggplot2::scale_fill_viridis_c(
    labels = scales::dollar,
    name   = paste0("Sales ($)\n(yellow ≈ ", scales::dollar(max_sales), " = highest)"),
    guide  = ggplot2::guide_colorbar(barheight = grid::unit(80, "pt"))
  ) +
  ggplot2::facet_wrap(~ month_lab, ncol = 2) +
  ggplot2::labs(
    title = "Calendar Heatmap: Daily Sales",
    x = "Week of Month", y = NULL
  ) +
  ggplot2::theme_minimal(base_size = 12) +
  ggplot2::theme(
    strip.text = ggplot2::element_text(face = "bold"),
    axis.text.y = ggplot2::element_text(size = 10, margin = ggplot2::margin(r = 6)),
    panel.spacing.y = grid::unit(1.0, "lines"),
    panel.spacing.x = grid::unit(0.8, "lines")
  )
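
The week-of-month index (wom) places each date in a calendar row by offsetting the day of month by the weekday on which the month starts. Below is a small base-R sketch of the same arithmetic, checked on a few 2017 dates. The helper name week_of_month is hypothetical, and the weekday conversion is a stand-in for lubridate::wday(week_start = 1).

```r
# Hypothetical helper reproducing the heatmap's week-of-month arithmetic in base R
week_of_month <- function(d) {
  d <- as.Date(d)
  month_start <- as.Date(format(d, "%Y-%m-01"))
  # POSIXlt weekdays run 0 = Sunday .. 6 = Saturday;
  # shift to 1 = Monday .. 7 = Sunday to match week_start = 1
  start_wday <- (as.POSIXlt(month_start)$wday + 6) %% 7 + 1
  day_of_month <- as.POSIXlt(d)$mday
  ceiling((day_of_month + start_wday - 1) / 7)
}

# January 2017 starts on a Sunday; May 2017 starts on a Monday
dates <- as.Date(c("2017-01-01", "2017-01-02", "2017-01-31",
                   "2017-05-01", "2017-05-31"))
wom_check <- data.frame(date = dates, wom = week_of_month(dates))
print(wom_check)
```

A month that starts late in the week can span six calendar rows (January 2017 does), so the heatmap's x-axis can run from 1 to 6.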

Graph: Category Seasonality (Lift vs. Own Average)

nm_lower <- tolower(names(txp))
cand <- c("commodity","commodity_desc","product_category","category","sub_commodity")
has  <- names(txp)[nm_lower %in% cand]
cat_col <- if (length(has)) has[1] else "department"  # fallback

# top 6 categories by sales
comm_tbl <- txp |>
  dplyr::count(!!rlang::sym(cat_col), wt = sales_value, name = "sales") |>
  dplyr::arrange(dplyr::desc(sales)) |>
  dplyr::slice_head(n = 6) |>
  dplyr::rename(category = !!rlang::sym(cat_col))

top_cats <- comm_tbl$category

# monthly lift vs each category's own average
comm_month <- txp |>
  dplyr::filter(!!rlang::sym(cat_col) %in% top_cats, !is.na(month)) |>
  dplyr::group_by(!!rlang::sym(cat_col), month) |>
  dplyr::summarise(sales = sum(sales_value, na.rm = TRUE), .groups = "drop_last") |>
  dplyr::mutate(lift = sales / mean(sales)) |>
  dplyr::rename(category = !!rlang::sym(cat_col)) |>
  dplyr::ungroup()

# --- Faceted seasonality chart with month labels under each facet ---
# Use facet_grid(rows = vars(...)) instead of facet_wrap
ggplot2::ggplot(comm_month, ggplot2::aes(month, lift, group = category)) +
  ggplot2::geom_line(linewidth = 1, color = "#1F78B4") +
  ggplot2::geom_hline(yintercept = 1, linetype = 2) +
  ggplot2::facet_grid(rows = ggplot2::vars(category), scales = "free_y", switch = "y") +
  ggplot2::labs(
    title = paste0("Seasonality by ", cat_col, " (Lift vs. Own Average)"),
    x = "Month", y = "Lift"
  ) +
  ggplot2::theme_minimal(base_size = 12) +
  ggplot2::theme(
    strip.text.y.left = ggplot2::element_text(angle = 0, face = "bold"),
    axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5, hjust = 1),
    panel.spacing = grid::unit(0.8, "lines")
  )
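
Lift here is each month's sales divided by that category's own average monthly sales, so a value of 1.2 means the month ran 20% above the category's typical month. The sketch below uses made-up numbers to show the calculation. The categories and dollar amounts are hypothetical and are not results from the data.

```r
# Hypothetical monthly sales for two categories (NOT results from the data)
toy <- data.frame(
  category = rep(c("SOFT DRINKS", "ICE CREAM"), each = 4),
  month    = rep(c("Jan", "Apr", "Jul", "Oct"), times = 2),
  sales    = c(100, 100, 100, 100,   # flat demand
               50, 100, 200, 50),    # summer peak
  stringsAsFactors = FALSE
)

# ave() computes the mean within each category and returns it aligned to the rows
toy$lift <- toy$sales / ave(toy$sales, toy$category, FUN = mean)
print(toy)
```

Because each category is scaled by its own mean, a flat category sits at 1 all year while a seasonal one swings above and below 1. This makes categories of very different size comparable on one chart.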

SUMMARY

Our analysis of Regork’s transaction and product data revealed clear opportunities to strengthen customer engagement and drive store performance through data-informed marketing strategies.

  1. Product Enhancement:
    Regularly refresh and expand product offerings to sustain customer interest and align with shifting preferences. Product updates should reflect both seasonal demand trends and customer feedback to maintain relevance and appeal.

  2. Technology Integration:
    Explore and implement emerging technologies—such as personalized recommendation systems, predictive analytics, and AI-driven pricing—to improve promotional timing, optimize inventory, and enhance operational efficiency.

  3. Collaborative Partnerships:
    Develop strategic partnerships with technology innovators and industry leaders to support scalable digital transformation and expand the reach of promotional efforts.

  4. Coupon and Promotional Strategies:
    Leverage coupon-based promotional pricing to stimulate store traffic and encourage repeat purchases. Targeted coupon marketing can serve as a traffic-building sales promotion, reinforcing loyalty while increasing the frequency and value of customer visits.

  5. Data-Driven Campaign Design:
    Utilize insights from purchasing patterns and seasonality to tailor marketing campaigns that match customer buying cycles. Align promotions with high-demand categories and time-sensitive behaviors to maximize lift and retention.

  6. Limitations and Next Steps:
    While our data analysis did not fully capture demographic-specific insights, future work should focus on personalized engagement strategies. Tailoring campaigns by demographic segment—and partnering with influencers or thought leaders—can amplify brand messaging and deepen consumer connection.