Introduction

This report analyzes how customers in different demographic groups purchase discounted or promoted products. The objective is to identify potential growth areas for Regork, a national grocery chain.

Data Preparation

The analysis uses three data sets: transactions, products, and customer demographics. All three are simulated in this report. We begin by loading the required packages and generating the transactions.

library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(42)

# Simulating Transactions Data
transaction_id <- 1:1000
customer_id <- sample(1:100, 1000, replace = TRUE)
product_id <- sample(1:50, 1000, replace = TRUE)
date <- seq(as.Date("2022-01-01"), by="day", length.out=1000)

transactions_simulated <- data.frame(transaction_id, customer_id, product_id, date)
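Before merging, it can help to examine the transactions table for obvious problems. The following is a minimal sketch, not part of the original analysis: `check_transactions` and `example_tx` are hypothetical names introduced here, and the small example table only stands in for `transactions_simulated`.

```r
# Hypothetical helper (an assumption, not from the report): basic sanity
# checks on a transactions table with the columns used above.
check_transactions <- function(tx) {
  list(
    n_rows         = nrow(tx),
    n_customers    = length(unique(tx$customer_id)),
    n_products     = length(unique(tx$product_id)),
    duplicate_ids  = sum(duplicated(tx$transaction_id)),
    missing_values = sum(is.na(tx)),
    date_range     = range(tx$date)
  )
}

# Small illustrative table; with the report's data, pass
# transactions_simulated instead.
example_tx <- data.frame(
  transaction_id = 1:4,
  customer_id    = c(1, 2, 2, 3),
  product_id     = c(10, 10, 11, 12),
  date           = as.Date("2022-01-01") + 0:3
)

tx_checks <- check_transactions(example_tx)
str(tx_checks)
```

A non-zero `duplicate_ids` or `missing_values` would be worth resolving before any merge.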

Analysis

Business Question

Do customers from different demographic groups have different purchasing behaviors when it comes to discounted or promoted products?

Methodology

  1. Merge the transactions data with product information.
  2. Merge the result with demographics data.
  3. Identify transactions where products were bought on promotion.
  4. Compute the proportion of promoted product purchases for each demographic group.

# Simulating Products Data
product_id <- 1:50
product_name <- paste0("Product_", 1:50)
price <- runif(50, 5, 50)
on_promotion <- sample(c(TRUE, FALSE), 50, replace = TRUE)

products_simulated <- data.frame(product_id, product_name, price, on_promotion)

# Simulating Demographics Data
customer_id <- 1:100
age <- sample(18:64, 100, replace = TRUE)
gender <- sample(c("Male", "Female"), 100, replace = TRUE)

demographics_simulated <- data.frame(customer_id, age, gender)

# Merging datasets for analysis
merged_data <- merge(transactions_simulated, products_simulated, by="product_id")
merged_data <- merge(merged_data, demographics_simulated, by="customer_id")
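By default `merge()` performs an inner join, so any transaction whose key has no match in the other table is dropped without a warning. In the simulated data every key matches, but with real data a quick check is worthwhile. The sketch below uses base R only; `count_unmatched`, `tx`, and `products` are hypothetical names used for illustration.

```r
# Hypothetical helper (an assumption, not from the report): count rows in
# `left` whose key value does not appear in `right`.
count_unmatched <- function(left, right, key) {
  sum(!(left[[key]] %in% right[[key]]))
}

# Illustrative tables: product 99 has no entry in the products table.
tx <- data.frame(transaction_id = 1:3, product_id = c(1, 2, 99))
products <- data.frame(product_id = 1:2, on_promotion = c(TRUE, FALSE))

unmatched <- count_unmatched(tx, products, "product_id")
merged <- merge(tx, products, by = "product_id")

# The inner join keeps only the matched rows.
stopifnot(nrow(merged) == nrow(tx) - unmatched)
```

If unmatched rows should be kept, `merge()` accepts `all.x = TRUE` to perform a left join instead.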

# Grouping by age and gender to compute the proportion of transactions with promoted products
promotion_analysis <- merged_data %>%
  group_by(age, gender) %>%
  summarise(promotion_purchase_proportion = mean(on_promotion), .groups = "drop")
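Grouping by exact age spreads 100 customers across up to 94 age-gender cells, so each proportion rests on very few customers. One option is to group by age band instead and report the group size alongside the proportion. The sketch below assumes five band edges chosen purely for illustration; `example_merged` is a hypothetical stand-in for `merged_data`.

```r
library(dplyr)

# Illustrative merged rows; with the report's data, use merged_data instead.
example_merged <- data.frame(
  age          = c(18, 22, 24, 37, 41, 58, 63, 64),
  gender       = c("Female", "Female", "Male", "Male",
                   "Female", "Female", "Male", "Male"),
  on_promotion = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
)

# Band edges are an assumption; right = FALSE makes each band [lower, upper).
age_band_summary <- example_merged %>%
  mutate(age_band = cut(age,
                        breaks = c(18, 25, 35, 45, 55, 65),
                        right  = FALSE,
                        labels = c("18-24", "25-34", "35-44", "45-54", "55-64"))) %>%
  group_by(age_band, gender) %>%
  summarise(
    n_transactions = n(),
    promotion_purchase_proportion = mean(on_promotion),
    .groups = "drop"
  )

print(age_band_summary)
```

Keeping `n_transactions` in the output makes it easy to see which proportions are based on too few rows to interpret.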

Visualization

# Plotting the results
ggplot(promotion_analysis, aes(x=age, y=promotion_purchase_proportion, color=gender)) +
  geom_line(aes(group=gender)) +
  geom_point() +
  labs(title="Promotion Purchase Proportion by Age and Gender",
       x="Age",
       y="Proportion of Purchases on Promotion")
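A line drawn through one point per exact age can look noisy. If the proportions are summarized by age band, a dodged bar chart is an alternative view. The sketch below is illustrative only: the values in `example_summary` are made up to demonstrate the plot and are not results from the analysis.

```r
library(ggplot2)

# Made-up illustrative values (an assumption, not results from the report).
example_summary <- data.frame(
  age_band = factor(rep(c("18-24", "25-34", "35-44"), times = 2),
                    levels = c("18-24", "25-34", "35-44")),
  gender   = rep(c("Female", "Male"), each = 3),
  promotion_purchase_proportion = c(0.55, 0.48, 0.50, 0.47, 0.52, 0.51)
)

# One bar per gender within each age band.
promotion_bar_plot <- ggplot(example_summary,
                             aes(x = age_band,
                                 y = promotion_purchase_proportion,
                                 fill = gender)) +
  geom_col(position = "dodge") +
  labs(title = "Promotion Purchase Proportion by Age Band and Gender",
       x = "Age Band",
       y = "Proportion of Purchases on Promotion")

print(promotion_bar_plot)
```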

Conclusions

Gender Differences

The plot suggests differences between male and female customers:

Females in their early 20s and late 50s seem more inclined to purchase promoted products than males in the same age ranges. In certain other age brackets the pattern reverses, with males showing the higher promotional purchase proportion. Because each age-gender group contains only a few of the 100 simulated customers, these differences may reflect sampling noise and should be read with caution.
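One way to check whether a gender gap in promotion purchases is larger than chance would produce is a two-sample test of proportions, available in base R as `prop.test()`. The sketch below is an assumption-laden illustration: `test_gender_difference` is a hypothetical helper, the counts in `example_rows` are made up, and the test treats transactions as independent, which ignores repeat purchases by the same customer.

```r
# Hypothetical helper (an assumption, not from the report): test whether
# the promotion purchase proportion differs between genders.
test_gender_difference <- function(df) {
  # Fix the levels so both columns exist even if one outcome is absent.
  promo  <- factor(df$on_promotion, levels = c(FALSE, TRUE))
  counts <- table(df$gender, promo)
  prop.test(x = counts[, "TRUE"], n = rowSums(counts))
}

# Made-up illustrative rows: 110 of 200 female and 100 of 200 male
# transactions on promotion. These are not results from the report.
example_rows <- data.frame(
  gender       = rep(c("Female", "Male"), each = 200),
  on_promotion = c(rep(c(TRUE, FALSE), times = c(110, 90)),
                   rep(c(TRUE, FALSE), times = c(100, 100)))
)

gender_test <- test_gender_difference(example_rows)
print(gender_test$estimate)
print(gender_test$p.value)
```

The same helper could be applied within a single age band by filtering the merged data first.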

Marketing Opportunities

If females in their early 20s and late 50s are indeed more inclined towards promoted products, promotional campaigns could be tailored to these two age brackets.

Similarly, the age brackets in which males show a higher propensity to purchase on promotion could be targeted with their own campaigns.

Change in Promotions

For age groups or genders that don’t seem to respond as well to promotions, it might be worth investigating why this is the case. Perhaps the promotions aren’t as relevant to them, or there might be other external factors at play.