Introduction
Regork grocery chain caters to a wide demographic. Based on the
data, it would be best to change that strategy so that the optimizations
can improve our customer satisfaction and financial metrics. We have
analyzed the most important customer demographics based on income
bracket. I propose we optimize various departments of this particular
store to cater to the main customer demographic.
Packages Used
tidyverse - a collection of packages for data wrangling, exploration
and visualization
completejourney - data set for Regork grocery store
Lubridate - allows date/time calculations
Data Preparation/Exploration
Sorting total sales by customer income
Preparing dataset
Prod <- products
tran <- get_transactions()
demo <- demographics
X1 <- inner_join(tran, Prod, by = "product_id")
D1 <- inner_join(X1, demo, by = "household_id")
Analyzing Data
D2 <- D1 %>%
group_by(income) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE)) %>%
arrange(desc(total_sales)) %>%
mutate(SUM = sum(total_sales), Percent = (total_sales/SUM)*100)
Visualize Data

Finding the most popular week days for our target demopraphic
($1k-99k income)
Preparing dataset
D1$weekday <- wday(D1$transaction_timestamp)
D1$hour <- hour(D1$transaction_timestamp)
target_demo <- c("50-74K","35-49K", "75-99K")
Analyzing Data
D3 <- D1 %>%
group_by(weekday) %>%
filter(income %in% target_demo) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE)) %>%
arrange(desc(total_sales)) %>%
mutate(SUM = sum(total_sales), Percent = (total_sales/SUM)*100)
D3$weekday[D3$weekday==1]<-"SUNDAY"
D3$weekday[D3$weekday==2]<-"MONDAY"
D3$weekday[D3$weekday==3]<-"TUESDAY"
D3$weekday[D3$weekday==4]<-"WEDNESDAY"
D3$weekday[D3$weekday==5]<-"THURSDAY"
D3$weekday[D3$weekday==6]<-"FRIDAY"
D3$weekday[D3$weekday==7]<-"SATURDAY"
Visualize Data

Finding the most popular hours for our target demograhpic
($1k-$99k)
Analyzing Data
D4 <- D1 %>%
group_by(hour) %>%
filter(income %in% target_demo) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE)) %>%
arrange(desc(total_sales)) %>%
mutate(SUM = sum(total_sales), Percent = (total_sales/SUM)*100) %>%
slice(1:10)
D4$hour[D4$hour==17]<-"5 PM"
D4$hour[D4$hour==18]<-"6 PM"
D4$hour[D4$hour==16]<-"4 PM"
D4$hour[D4$hour==19]<-"7 PM"
D4$hour[D4$hour==13]<-"1 PM"
D4$hour[D4$hour==15]<-"3 PM"
D4$hour[D4$hour==14]<-"2 PM"
D4$hour[D4$hour==12]<-"12 PM"
D4$hour[D4$hour==20]<-"8 PM"
D4$hour[D4$hour==11]<-"11 AM"
Visualize Data

Finding the top 10 most popular products for our customer
demographic
Analyzing Data
D5 <- D1 %>%
group_by(product_category) %>%
filter(income %in% target_demo) %>%
filter(product_category!="COUPON/MISC ITEMS") %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE)) %>%
arrange(desc(total_sales)) %>%
slice(1:10)
Visualizing Data

Summary
The problem identified through exploratory analysis was that this
particular store was not catering it’s main customer demographic.
Customers making $35K-$99k make up 55% of total sales. Customers making
less than $35k made up 21% of total sales. Customers making more than
$99k made up 26% of total sales. The solution is to optimize inventory,
schedule, and other attributes of the store to focus on low-middle class
customers. From now on the target demographic are customers making
$1k-99k. Our strategy should focus on conforming our product types to
this income bracket with decent packaging to appeal to higher income
brackets. We need to reevaluate the products in soft drinks, beef, milk,
cheese, frozen dinners, snacks, bread, beer, frozen pizza and deli meats
and make sure that it appeals to our target income bracket. And optimize
our staff scheduling/supply chain management to account for the most
common times our target demographic shops which is - Sunday, Monday,
Friday and Saturday from 4PM - 7PM. We could improve this analysis by
obtaining more attributes regarding product inventory and customer
demographics.