Can we use the holiday sales of Ham and Turkey to bolster the sales of each respective product? To solve this problem we looked at the following sets of data: Transactions, Products, Coupon Redemption, and Demographics. After joining the sets of data, we mainly used data visualization and filtering to show the total monthly sales by income range for Ham and Turkey. We also used the Coupon Redemption data to back up our solution regarding the implementation of future coupons. Our analysis of this data will help to come up with future marketing campaigns to raise the Total sales of these two products in non holiday months.
library(completejourney)
library(tidyverse)
library(lubridate)
The completejourney package contained all the data we needed to look at
the tidyverse package allowed us to manipulate the data and plot our findings
The lubridate package allowed us to use transaction timestamp to look at the totals sales per month
transactions <- get_transactions()
promotions <- get_promotions()
transactions %>%
inner_join(products) %>%
inner_join(demographics) %>%
filter(str_detect(product_type, "HAM")) %>%
filter(str_detect(product_type, "WHOLE")) %>%
mutate(month = month(transaction_timestamp, label = TRUE)) %>%
group_by(income, age, month) %>%
summarize(total_sales = sum(sales_value)) %>%
ggplot(aes(x = month, y = total_sales, fill = income)) +
geom_col() +
coord_flip() +
scale_x_discrete("Month") +
scale_y_continuous("Total Sales of Ham", labels = scales::dollar)+
labs(title = "Total Monthly Sales of Ham by Income Level")
The data shown above separates out the total sales of whole hams throughout the year by household income level. The data is interesting because one would think that higher level income households would spend more on ham throughout the year, but the data does not reflect this. We hypothesize this is because there are simply less households with that high of income level. As shown in the months of January, November and December, these are the months with the highest total sales on whole hams throughout the year. There are multiple holiday seasons that coincide with these months. What is interesting about the breakup of income levels buying ham is that a significant portion of total sales comes from lower income households. The lower income households are buying more ham during these seasons. We see that there is potential for growth in October and April, which coincides with Halloween and Easter. Our suggestion for how to increase sales in this area will be discussed after taking a look at the Turkey data.
transactions %>%
inner_join(products) %>%
inner_join(demographics) %>%
filter(str_detect(product_category, "TURKEY")) %>%
filter(str_detect(product_type, "WHOLE")) %>%
mutate(month = month(transaction_timestamp, label = TRUE)) %>%
group_by(income, age, month) %>%
summarize(total_sales = sum(sales_value)) %>%
ggplot(aes(x = month, y = total_sales, fill = income)) +
geom_col() +
coord_flip() +
scale_x_discrete("Month") +
scale_y_continuous("Total Sales of Turkey", labels = scales::dollar)+
labs(title = "Total Monthly Sales of Turkey by Income Level")
The turkey data shown above indicates there are 3 months that make up a significant portion of total sales of whole turkeys throughout the year. These three months coincide with the holidays of Thanksgiving, Hanukkah, Christmas and New Years. Turkey is an incredibly popular dish during these holidays and the data reflects this. Similar to the Ham data, a larger portion of the total sales is composed of lower income households rather than higher income households. There are small spikes in total sales of Turkey in February and April. There is potential to further sales in these months.
transactions %>%
inner_join(demographics)%>%
inner_join(products, by = "product_id")%>%
inner_join(coupons)%>%
select(product_type, product_category, income, household_id, coupon_disc, transaction_timestamp, coupon_upc, campaign_id)%>%
filter(str_detect(product_category, "TURKEY")) %>%
filter(str_detect(product_type, "WHOLE"))%>%
filter(coupon_disc > 0)
transactions %>%
inner_join(demographics) %>%
inner_join(products, by = "product_id") %>%
inner_join(coupons) %>%
select(product_type, product_category, income, household_id, coupon_disc, transaction_timestamp, coupon_upc, campaign_id) %>%
filter(str_detect(product_type, "HAM")) %>%
filter(str_detect(product_type, "WHOLE")) %>%
filter(coupon_disc > 0)
When filtering to find the amount of times a coupon was used for Turkey we found that the coupon was only used three times, and when doing the same but for Ham, the coupons were not used.
Outside of major Holiday months, Ham and Turkey sales remain relatively low throughout the calendar year. * November, December, and April are the months with the highest total sales for Ham, with approximately $430, $620, and $400 respectively. which coincides with Holidays such as Thanksgiving, Christmas, New Years, and Easter.
November is the Month with the highest total sales of Turkey at approximately $1,500, coinciding with Thanksgiving.
The data shows that mid to lower income households account for most of the total sales for both Ham and Turkey.
It was interesting to see that coupons were being underutilized for the products whole ham and whole turkey
Looking at our findings, it appears the customer only has incentive to buy Whole Turkey and Ham when preparing for a Holiday meal. After discovering that there are very little to no coupons being used with these products, we would suggest that we implement a new marketing campaign to allow the customers to earn coupons for those select products. There are multiple ways to go about this but the general idea is that in purchasing one product, you receive a coupon for the other product for later. This would work out nicely for holidays like Thanksgiving and Christmas, as it would be a relatively short time frame for customers to redeem a coupon from purchasing a turkey for thanksgiving and apply that towards a ham for Christmas. Additionally, we could have these coupons be valid during our “down” months for these products.