##Introduction
I have been asked to partner with Regork to further increase sales and revenue for its stores. It can be hard for a grocery store chain to stand out from its competitors. While working with Regork, I analyzed its candy and gift/fruit basket sales. Candy and gift baskets are very popular items today, and I would like to see how these items could be used to boost the stores' sales.
How can Regork increase sales of candy and gift/fruit baskets throughout its stores?
I will analyze certain holidays during the year to see how sales of these items rise or fall. Using Exploratory Data Analysis (EDA) techniques and different plots, I will identify the factors that lead to more candy sales within the company.
##Packages
library(completejourney)
## Welcome to the completejourney package! Learn more about these data
## sets at http://bit.ly/completejourney.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(lubridate)
##Exploratory Data Analysis
CANDY
transactions <- get_transactions()
# Keep only transactions whose product category mentions "candy"
candy_transactions <- transactions %>%
  inner_join(products) %>%
  filter(str_detect(product_category, regex("candy", ignore_case = TRUE))) %>%
  mutate(date = as_date(transaction_timestamp))
## Joining with `by = join_by(product_id)`
unique(candy_transactions$product_type)
## [1] "CANDY BARS (SINGLES)(INCLUDING" "CANDY REFRIGERATED"
## [3] "SEASONAL CANDY BOX NON-CHOCOLA" "CANDY BAGS-CHOCOCLATE"
## [5] "CHEWING GUM" "MINTS CANDY & BREATH (NOT LIF"
## [7] "SEASONAL CANDY BOX-CHOCOLATE" "CANDY BAGS-NON CHOCOLATE"
## [9] "CANDY BARS (MULTI PACK)" "MISCELLANEOUS CANDY (INCLUDING"
## [11] "CANDY BOXED CHOCOLATES W/FLOUR" "CANDY & BREATH MINTS (PKGD) (N"
## [13] "CANDY BAGS-CHOCOCLATE W/FLOUR" "SEASONAL CANDY BAGS NON-CHOCOL"
## [15] "NOVELTY CANDY" "GUM (PACKAGED)"
## [17] "CANDY BARS MULTI PACK W/FLOUR" "SEASONAL MISCELLANEOUS"
## [19] "SEASONAL CANDY BAGS-CHOCOLATE" "CANDY BOXED CHOCOLATES"
## [21] "SEASONAL CANDY BOX-CHOCOLATE W" "CANDY BOX NON-CHOCOLATE"
## [23] "CANDY BAGS-NON CHOCOLATE W/FLO" "MISC CHECKLANE CANDY"
## [25] "BULK CANDY" "DIET CNTRL BARS NUTRITIONAL"
## [27] "ANT AND ROACH" "ALMAY"
## [29] "CANDY" "SEASONAL MISCELLANEOUS W/FLOUR"
## [31] "EASTER GIFTWARE/DECOR" "EASTER EGG COLORING"
## [33] "CONTINUITY" "NOVELTY CANDY-TAXABLE"
## [35] "PROCESSED OTHER" "GERMAN FOODS"
## [37] "CARAMEL COATED SNACKS" "BASIC TOWELS"
## [39] "ANTI-ACIDS" "BULK CANDY W/FLOUR"
## [41] "HAIR BARRETTES TAILERS" "SALLY HANSEN"
## [43] "NOVELTY CANDY W/FLOUR"
candy_transactions%>%
filter(product_type == c("SEASONAL CANDY BAGS-CHOCOLATE", "SEASONAL MISCELLANEOUS" , "CANDY BARS (MULTI PACK)" , "CANDY BAGS-CHOCOCLATE", "CANDY BAGS-NON CHOCOLATE"))%>%
count(product_type)
## Warning: There was 1 warning in `filter()`.
## ℹ In argument: `==...`.
## Caused by warning in `product_type == c("SEASONAL CANDY BAGS-CHOCOLATE", "SEASONAL MISCELLANEOUS",
## "CANDY BARS (MULTI PACK)", "CANDY BAGS-CHOCOCLATE",
## "CANDY BAGS-NON CHOCOLATE")`:
## ! longer object length is not a multiple of shorter object length
## # A tibble: 5 × 2
## product_type n
## <chr> <int>
## 1 CANDY BAGS-CHOCOCLATE 596
## 2 CANDY BAGS-NON CHOCOLATE 527
## 3 CANDY BARS (MULTI PACK) 757
## 4 SEASONAL CANDY BAGS-CHOCOLATE 472
## 5 SEASONAL MISCELLANEOUS 175
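The warning above comes from comparing a column to a vector with `==`. R recycles the shorter vector down the column, so a row is kept only when its value happens to line up with the recycled position, and the counts shown are likely lower than the true totals. The sketch below illustrates the difference on a made-up tibble (`toy` and `wanted` are hypothetical names, not objects from this analysis); `%in%` is the usual operator for set membership.

```r
library(dplyr)

# Hypothetical toy data standing in for candy_transactions
toy <- tibble(product_type = c("A", "B", "A", "B", "C", "A"))
wanted <- c("A", "B")

# `==` recycles `wanted` down the column (A, B, A, B, A, B), so a row
# matches only when its value lines up with the recycled position
by_equals <- toy %>%
  filter(product_type == wanted) %>%
  count(product_type)

# `%in%` tests every row against the whole set of wanted values
by_in <- toy %>%
  filter(product_type %in% wanted) %>%
  count(product_type)

by_equals  # A is counted 2 times
by_in      # A is counted 3 times
```

Applied to the chunk above, the same change would be `filter(product_type %in% c(...))`, which would also produce different counts from those printed.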
# Total candy sales per day, one row per date
candy_transactions %>%
  group_by(date) %>%
  summarise(total_sales = sum(sales_value), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_sales)) +
  geom_line(color = "darksalmon")
Total candy sales spike around certain holidays. The highest sales occur around February (Valentine's Day), April (Easter), the end of October (Halloween), and December (Christmas).
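The spikes read off the plot can also be checked numerically by ranking days by total sales. This is a minimal sketch on invented numbers (`toy_sales` is hypothetical and does not come from the Regork data); it assumes one row per transaction line with a `date` and a `sales_value` column, as in `candy_transactions`.

```r
library(dplyr)

# Hypothetical transaction lines; values are made up for illustration
toy_sales <- tibble(
  date = as.Date(c("2017-02-13", "2017-02-13", "2017-02-14",
                   "2017-10-31", "2017-10-31")),
  sales_value = c(2.50, 4.00, 3.00, 6.00, 5.50)
)

# Collapse to one total per day, then keep the two highest days
top_days <- toy_sales %>%
  group_by(date) %>%
  summarise(total_sales = sum(sales_value), .groups = "drop") %>%
  slice_max(total_sales, n = 2)

top_days
```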
GIFT/FRUIT BASKET
# `transactions` was already loaded in the candy section
# Keep only transactions whose product category mentions "gift"
gift_transactions <- transactions %>%
  inner_join(products) %>%
  filter(str_detect(product_category, regex("gift", ignore_case = TRUE))) %>%
  mutate(date = as_date(transaction_timestamp))
## Joining with `by = join_by(product_id)`
unique(gift_transactions$product_type)
## [1] "GIFT BASKETS (NO FOOD)" "GIFT BASKETS W/FOOD" "FRUIT BASKETS"
gift_transactions%>%
filter(product_type == c("GIFT BASKETS (NO FOOD)", "GIFT BASKETS W/FOOD", "FRUIT BASKETS"))%>%
count(product_type)
## # A tibble: 3 × 2
## product_type n
## <chr> <int>
## 1 FRUIT BASKETS 2
## 2 GIFT BASKETS (NO FOOD) 2
## 3 GIFT BASKETS W/FOOD 1
# Total gift/fruit basket sales per day, one row per date
gift_transactions %>%
  group_by(date) %>%
  summarise(total_sales = sum(sales_value), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_sales)) +
  geom_line()
This plot shows the total sales of gift baskets and fruit baskets during 2017. Sales spike in mid-May and at the end of December. These spikes may reflect the end of the school year in mid-May and New Year's Eve at the end of December.
Time Analysis
CANDY
# Candy quantity sold per day around Valentine's Day
candy_transactions %>%
  filter(date >= as_date("2017-02-10"), date <= as_date("2017-02-15")) %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_quantity)) +
  geom_col(fill = "darksalmon")
This graph shows the quantity of candy sold around Valentine's Day. People commonly buy candy the day before Valentine's Day, and that pattern is clear in this graph.
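One caution when charting daily totals with `geom_col()`: it stacks every row that shares an x value, so the data should hold exactly one row per date. After `group_by()`, `mutate()` repeats the group total on every row, while `summarise()` collapses each group to a single row. The sketch below shows the difference on a made-up tibble (`toy` is hypothetical), including the bar height that stacking the `mutate()` result would give.

```r
library(dplyr)

# Hypothetical data: three transaction lines on one day, one on the next
toy <- tibble(
  date = as.Date(c("2017-02-13", "2017-02-13", "2017-02-13", "2017-02-14")),
  quantity = c(1, 2, 3, 4)
)

# mutate() keeps all four rows and repeats the daily total on each
with_mutate <- toy %>%
  group_by(date) %>%
  mutate(total_quantity = sum(quantity)) %>%
  ungroup()

# summarise() returns one row per date
with_summarise <- toy %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop")

# Bar height a stacked column chart would show for the mutate() result
stacked_height <- with_mutate %>%
  group_by(date) %>%
  summarise(bar = sum(total_quantity), .groups = "drop")

with_summarise  # 6 and 4
stacked_height  # 18 and 4: the first day is inflated by its row count
```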
# Candy quantity sold per day around Easter
candy_transactions %>%
  filter(date >= as_date("2017-04-10"), date <= as_date("2017-04-17")) %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_quantity)) +
  geom_col(fill = "darksalmon")
This graph shows candy sales around Easter. As with Valentine's Day, most of the candy is sold a day or a couple of days before the holiday itself.
# Candy quantity sold per day in October, through the day after Halloween
candy_transactions %>%
  filter(date >= as_date("2017-10-01"), date <= as_date("2017-11-01")) %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_quantity)) +
  geom_col(fill = "darksalmon")
This graph shows the quantity of candy sold during October. Unlike Valentine's Day and Easter, where most sales came the day before or a couple of days before, many people buy candy on Halloween itself. There is a large spike on the day of the holiday, which could have several causes. One is that Halloween is celebrated at night, so people may buy all their candy that day in preparation for the evening.
# Candy quantity sold per day leading up to Christmas
candy_transactions %>%
  filter(date >= as_date("2017-12-15"), date <= as_date("2017-12-26")) %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_quantity)) +
  geom_col(fill = "darksalmon")
This graph shows the quantity of candy sold leading up to Christmas. Sales are highest a few days before Christmas, and there are no sales on Christmas Day itself, probably because most stores are closed that day. Christmas is also celebrated in the morning, so people may need to buy their candy a day or more in advance.
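The four holiday windows above repeat the same filter-and-total steps with different dates. One way to cut the repetition is a small helper function. The sketch below is an assumption about how that might look: `daily_quantity()` is a hypothetical helper, not part of the original analysis, and `toy` is invented data with the same `date` and `quantity` columns as `candy_transactions`.

```r
library(dplyr)

# Hypothetical helper: total quantity per day within an inclusive date window
daily_quantity <- function(data, start, end) {
  data %>%
    filter(date >= as.Date(start), date <= as.Date(end)) %>%
    group_by(date) %>%
    summarise(total_quantity = sum(quantity), .groups = "drop")
}

# Made-up rows; the last one falls outside the window
toy <- tibble(
  date = as.Date(c("2017-12-20", "2017-12-23", "2017-12-23", "2017-12-27")),
  quantity = c(2, 5, 1, 3)
)

christmas <- daily_quantity(toy, "2017-12-15", "2017-12-26")
christmas
```

The result can be piped straight into `ggplot()` as in the chunks above. Days with no transactions simply have no row, which is why a closed day shows up as a gap in the chart.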
GIFT/FRUIT BASKET
# Gift/fruit basket quantity sold per day, mid-January through May
gift_transactions %>%
  filter(date >= as_date("2017-01-15"), date <= as_date("2017-05-30")) %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_quantity)) +
  geom_col()
Gift baskets and fruit baskets sold on Valentine's Day, at the end of February, and on the day before Easter. These sales may be tied to the holidays, but they are far smaller than candy sales over the same periods. Candy is a much stronger seller than gift baskets and fruit baskets.
# Gift/fruit basket quantity sold per day, September through December
gift_transactions %>%
  filter(date >= as_date("2017-09-01"), date <= as_date("2017-12-31")) %>%
  group_by(date) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = date, y = total_quantity)) +
  geom_col()
Gift basket and fruit basket sales spike in September. There are also sales on December 1st and closer to Christmas. These spikes are still small compared with the amount of candy sold during the same period.
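The comparison between basket and candy volumes can be made explicit by stacking the two data sets with a category label and totalling each. This is a rough sketch on invented numbers (`toy_candy` and `toy_gift` are hypothetical stand-ins for `candy_transactions` and `gift_transactions` after filtering to the same date window).

```r
library(dplyr)

# Made-up quantities for one holiday window
toy_candy <- tibble(quantity = c(3, 5, 2))
toy_gift <- tibble(quantity = c(1))

# Named arguments plus .id add a column recording where each row came from
comparison <- bind_rows(candy = toy_candy, gift = toy_gift, .id = "category") %>%
  group_by(category) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop")

comparison
```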
Product Category Analysis
CANDY
# Candy quantity sold by product type around Valentine's Day
candy_transactions %>%
  filter(date >= as_date("2017-02-10"), date <= as_date("2017-02-15")) %>%
  group_by(product_type) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = product_type, y = total_quantity)) +
  geom_col(fill = "darksalmon") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
The top-selling items from February 10th to February 15th are Candy Bars (Singles), Seasonal Candy Bags (Chocolate), and Chewing Gum.
# Candy quantity sold by product type around Easter
candy_transactions %>%
  filter(date >= as_date("2017-04-10"), date <= as_date("2017-04-17")) %>%
  group_by(product_type) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = product_type, y = total_quantity)) +
  geom_col(fill = "darksalmon") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
The top-selling items from April 10th to April 17th are Candy Bars (Singles), Seasonal Candy Bags (Chocolate), and Seasonal Miscellaneous.
# Candy quantity sold by product type in October, through the day after Halloween
candy_transactions %>%
  filter(date >= as_date("2017-10-01"), date <= as_date("2017-11-01")) %>%
  group_by(product_type) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = product_type, y = total_quantity)) +
  geom_col(fill = "darksalmon") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
The top-selling items from October 1st to November 1st are Candy Bars (Singles), Chewing Gum, and Seasonal Candy Bags (Chocolate).
# Candy quantity sold by product type leading up to Christmas
candy_transactions %>%
  filter(date >= as_date("2017-12-15"), date <= as_date("2017-12-26")) %>%
  group_by(product_type) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = product_type, y = total_quantity)) +
  geom_col(fill = "darksalmon") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
The top-selling items from December 15th to December 26th are Candy Bars (Singles), Chewing Gum, Candy Bars (Multi Pack), Seasonal Candy Bags (Chocolate), and Seasonal Miscellaneous.
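Reading the top sellers off a bar chart with dozens of rotated labels is error-prone, so a ranked table can back up each list. The sketch below uses a weighted `count()` on invented rows (`toy` is hypothetical; the quantities are not from the Regork data) and assumes the same `product_type` and `quantity` columns as `candy_transactions`.

```r
library(dplyr)

# Made-up transaction lines for one holiday window
toy <- tibble(
  product_type = c("CANDY BARS (SINGLES)", "CHEWING GUM",
                   "CANDY BARS (SINGLES)", "BULK CANDY", "CHEWING GUM"),
  quantity = c(3, 2, 4, 1, 2)
)

# Sum quantity per product type, sort descending, keep the top two
top_types <- toy %>%
  count(product_type, wt = quantity, name = "total_quantity", sort = TRUE) %>%
  head(2)

top_types
```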
GIFT/FRUIT BASKET
# Gift/fruit basket quantity sold by product type, mid-January through May
gift_transactions %>%
  filter(date >= as_date("2017-01-15"), date <= as_date("2017-05-30")) %>%
  group_by(product_type) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = product_type, y = total_quantity)) +
  geom_col()
Between January 15th and May 30th, gift baskets without food and gift baskets with food are the basket types that sell. No fruit baskets were sold during this period.
# Gift/fruit basket quantity sold by product type, September through December
gift_transactions %>%
  filter(date >= as_date("2017-09-01"), date <= as_date("2017-12-31")) %>%
  group_by(product_type) %>%
  summarise(total_quantity = sum(quantity), .groups = "drop") %>%
  ggplot(aes(x = product_type, y = total_quantity)) +
  geom_col()
From September 1st to December 31st, fruit baskets sell best, with few sales of gift baskets with food. No gift baskets without food were sold during this time.
##Summary
Overall, candy sales and gift/fruit basket sales peak during the holiday seasons. Candy sales spike sharply at these times, while gift basket sales rise but by much less. One idea that follows from this analysis is a combined candy gift basket, which would offer multiple candy options in one package for a reduced price. Regork's top candy sellers were Candy Bars (Singles) and Seasonal Candy Bags (Chocolate), so putting these two items together in one gift basket could increase gift basket sales. Since gift baskets without food also sell well, the baskets could include seasonal toys or decorations along with the candy. This way shoppers see a jam-packed gift basket with all the essentials for a holiday gift rather than just grabbing a candy bar or bag.