Fresh Food Now Data Analysis

\[Excess Short Shelf-Life Items\]

Preparing the Data

Due to the limited amount of data available to us we have stratified all three databases into one based on SKUs and Commodity to explore any potential correlations. The majority of the data that the resulting frame consisted of were ‘Perishables’ and from the month of April.

Based on this observation, we have graphed the data by SKU to determine how these items with short shelf lives are moving through the system within a single month.

Conclusion:

Excess short shelf-life items month-to-month may be an indication that although they are forecasting growth through 2023 that does not necessarily mean they need to increase the number of received quantities for all items.

\[Commodity Frequency\]

\[Warehouse Storage Planning\]

Influence of Unit Measurements

An additional way our team decided to look at the data was based on how it would be stored. Pulling all the SKUs and creating a cluster plot based on their Volume and Weight resulted in the plot below.

Conclusion:

Since Fresh Food Now will be preparing a new facility, this plot is a starting point that could influence the layout and design of the details within the future. We can see that the data falls into increments that could guide the team to determine what type of spacing and support requirements the storage facility will need to optimize their space utilization.

Lead Time by Storage Commodity

To analyze which products impose the longest Lead Times our team chose to build on the previous “Commodity Frequency” analysis by plotting the Commodities using Box and Whisker plots with respect to Lead Time. Using our filtered data, the four methods of storage (Commodities) produce the following graph:

Noting the methods of storage which impose the greatest Lead Time will further allow Fresh Food Now to restructure their inventory management for specific products to better accommodate the capacity at which they are restocked or consumed.

Conclusion:

Considering products stored as “Dry stock” contains numerous outliers which impose greater lead times than any other storage method, we can evaluate which Commodities impose larger lead times by filtering the data and obtain the following products by increasing Lead Times:

Canned Vegetables
Grocery Items (Unspecified)
Coffee and Chocolate Items

In response to greater Lead Times, Fresh Food Now can evaluate their consumer-supplier relationship and implement accurate ordering schedules and quantities as the majority of the high-lead time products face high risk for product stock-out if the forecast falls short.

Average Number Issued of Commodities by Month

Certain months throughout the year have holidays that influence the demand of certain commodities. Take for example Valentines Day; intuition tells us that to expect an influx of demand for chocolates and wine. In contrast, other holidays such as Halloween is primarily celebrated with chocolate, while New Years and the Fourth of July should see more champagne and beer bought than normal.

While a warehouse manager might have the intuition to prepare for these holidays by ordering extra chocolate or alcohol ahead of time, past data will confirm or deny through a more numerical method if these intuitions are accurate.

## [1] 0.4663303

Conclusions of Commodities by Months

Surprisingly, the results mostly defies the expectations someone might have regarding the increase of demand of chocolate or alcohol for certain holidays. Most notably, the commodity of chocolates experience the lowest average of chocolates during the month of February where many people traditionally buy chocolates and sweets for their loved ones. By the same token, Halloween stays in the average of chocolate issued and is not notable significant compared to the rest of the year. For alcohol, the highest averages are experienced in March and December perhaps due to Mardi Gras and St. Patricks day in March and New Years Eve in December.

One possible explanation for these discrepancies could be due to the Unit of Measurement, or UOM, of these items.Perhaps some of these Number Issued values are measured in individual units (such as bottles) while some items are measured in larger units (such as cases or pallets). To attempt to check the validity of this explanation, we merged the two data files SkuUsage and SkUMaster by SkuNumbers so that more information can be gained regarding the UOM. However, a significant number of the Sku’s with the commodity type (Alcohol and Chocolate/Teas) we wished to observed were either dropped because of missing data or did not match any Sku’s in both data frames.

Complete Code:

## Alcohol
dat1<-SKUUsage
dat1$ShipToCode<-as.factor(dat1$ShipToCode)
dat1<-dat1[dat1$Issued>0,]
dat1$Issued<-as.numeric(dat1$Issued)
dat2<-dat1[dat1$ShipToCode=="3L",]
dat3<-dat1[dat1$ShipToCode=="CD3L",]
dat2<-droplevels(dat2)
dat3<-droplevels(dat3)
dat1<-merge(dat2,dat3,all=TRUE)
dat6 <- read.csv("https://raw.githubusercontent.com/CGarcia44/EDA-/main/6%20SKU%20Master.csv")
dat7 <- read.csv("https://raw.githubusercontent.com/CGarcia44/EDA-/main/7%20SKU%20Usage%20Last%20FY.csv")
dat8 <- read.csv("https://raw.githubusercontent.com/CGarcia44/EDA-/main/8%20WH%20Replenishment%20Last%20FY.csv")
Months<-c("JANUARY","FEBRUARY","MARCH","APRIL","MAY",
          "JUNE","JULY","AUGUST","SEPTEMBER","OCTOBER",
          "NOVEMBER","DECEMBER")
MonthsNumb<-c(1,2,3,4,5,6,7,8,9,10,11,12)
Months<-as.factor(Months)
Months<-data.frame(Months)

dat5<-merge(dat6,dat7, by="SkuNbr")
dat5<-merge(dat5,dat8,by="SkuNbr")
dat5$Whs<-as.factor(dat5$Whs)

library(tidyr)
library(dplyr)
library(ggplot2)
library(forcats)
SKUMaster <- read.csv("https://raw.githubusercontent.com/torirv/FinalFiles/main/SKU%20Master.csv")
SKUUsage <- read.csv("https://raw.githubusercontent.com/torirv/FinalFiles/main/SKU%20Usage%20Last%20FY.csv")
WHReplenishment <- read.csv("https://raw.githubusercontent.com/torirv/FinalFiles/main/WH%20Replenishment%20Last%20FY.csv")
SKUMaster$UOM <- as.factor(SKUMaster$UOM)
SKUMaster$Flow <- as.factor(SKUMaster$Flow)
SKUMaster$SKU <-as.factor(SKUMaster$SKU)
SKUUsage$SKU <-as.factor(SKUUsage$SKU)
SKUUsage$Issued <- as.numeric(SKUUsage$Issued)

## Warning: NAs introduced by coercion

WHReplenishment$SKU <-as.factor(WHReplenishment$SKU)
SKUMaster <- SKUMaster[SKUMaster$UOMCube>0,]
SKUMaster <- SKUMaster[SKUMaster$UOMCube<2,]
SKUMaster<-SKUMaster[SKUMaster$UOMWeight>0,]
SKUMaster<-SKUMaster[SKUMaster$UOMWeight<50,]
SKUMaster<-SKUMaster[SKUMaster$UOM %in% c("CA","EA","PL","LB"),]
SKUMaster<-na.omit(SKUMaster)
SKUMaster<-droplevels(SKUMaster)
SKUUsage<-na.omit(SKUUsage)
SKUUsage<-droplevels(SKUUsage)
WHReplenishment<-na.omit(WHReplenishment)
WHReplenishment<-droplevels(WHReplenishment)

NewWHR <- WHReplenishment %>%
  group_by(SKU, Month) %>%
  summarise(NMonth,SKU, Month,Received = sum(Received))%>%
  as.data.frame(NewWHR)

## `summarise()` has grouped output by 'SKU', 'Month'. You can override using the
## `.groups` argument.

NewUsage <- SKUUsage %>%
  group_by(SKU, Month)%>%
  summarise(SKU, Month,Issued = sum(Issued),ShipToCode)%>%
  as.data.frame(NewUsage)

## `summarise()` has grouped output by 'SKU', 'Month'. You can override using the
## `.groups` argument.

Combo <- merge(NewWHR, NewUsage)%>%
  merge(SKUMaster)
Combo <- distinct(Combo)
Ratio <- Combo$Issued/Combo$Received
Forecast <-Combo$Issued*1.055
ForecastRatio <-Forecast/Combo$Received 
Combo<- cbind(Combo, Ratio,Forecast,ForecastRatio)
Combo <- Combo %>% 
  mutate(Product = case_when(
    endsWith(ShipToCode, "D") ~ "Drystock",
    endsWith(ShipToCode,"P") ~ "Perishable",
    endsWith(ShipToCode, "M") ~ "Chocolates/Teas",
    endsWith(ShipToCode,"L") ~ "Alcohol"))
Perishables <- Combo[Combo$Product == "Perishable",]
Perishables <- Perishables[Perishables$Month == "APRIL",]
QP <- quantile(Perishables$Ratio, probs = c(.25,.75),)
IQRP <-IQR(Perishables$Ratio)
Perishables<- subset(Perishables,Perishables$Ratio > (QP[1]-1.5*IQRP) & Perishables$Ratio < (QP[2]+1.5*IQRP))
Perishables$SKU <- factor(Perishables$SKU,
                          levels = Perishables$SKU[order(Perishables$Ratio,decreasing = TRUE)])

PerishPlot<-ggplot(Perishables,aes(SKU,Ratio)) +
  geom_bar(aes(fill=SKU),stat = 'identity') +
  theme(axis.text.x = element_text(angle = 90, face="bold", size = 12),
        axis.text.y = element_text(face="bold", size = 12),
        legend.position = "none",
        axis.title = element_text(face = "bold",size = 14),
        plot.title = element_text(size=20,face = "bold",hjust = .5))+
  ggtitle("Percentage of Issued Perishables \n in April by SKU")
PerishPlot

SKUMasterNew <- SKUMaster %>% 
  mutate(Storage = case_when(
    endsWith(WHS, "D") ~ "Drystock",
    endsWith(WHS,"P") ~ "Perishable",
    endsWith(WHS, "M") ~ "Chocolates/Teas",
    endsWith(WHS,"L") ~ "Alcohol"))
SKUMasterNew <- na.omit(SKUMasterNew)


SKUPlot <- ggplot(SKUMasterNew, aes(fill= Storage,x= fct_infreq(Storage)))+
  geom_bar() +
  labs(x="Commodity", y="Count")
SKUPlot

Warehouse <- SKUMasterNew[SKUMasterNew$Flow == "WH",]
Warehouse <- select(Warehouse,UOMCube, UOMWeight)


cls <-kmeans(Warehouse,5,20,iter.max=20)
Warehouse$Cluster <- as.character(cls$cluster)
ClusterPlot<- ggplot() +
  geom_point(data = Warehouse,
             mapping = aes(x=UOMCube,y=UOMWeight,colour=Cluster))
ClusterPlot

plot(dat5$Whs,dat5$LeadTime,
     xlab = "Internal Commodity",
     ylab="Lead Time")

AlcMeanIssued<-c(mean(dat1[dat1$Month=="JANUARY",]$Issued),
              mean(dat1[dat1$Month=="FEBRUARY",]$Issued),
              mean(dat1[dat1$Month=="MARCH",]$Issued),
              mean(dat1[dat1$Month=="APRIL",]$Issued),
              mean(dat1[dat1$Month=="MAY",]$Issued),
              mean(dat1[dat1$Month=="JUNE",]$Issued),
              mean(dat1[dat1$Month=="JULY",]$Issued),
              mean(dat1[dat1$Month=="AUGUST",]$Issued),
              mean(dat1[dat1$Month=="SEPTEMBER",]$Issued),
              mean(dat1[dat1$Month=="OCTOBER",]$Issued),
              mean(dat1[dat1$Month=="NOVEMBER",]$Issued),
              mean(dat1[dat1$Month=="DECEMBER",]$Issued))

Alc<-data.frame(Months,AlcMeanIssued)
plot(MonthsNumb,AlcMeanIssued,xlab="Months", ylab="Average",
              main="Average Alcohol Issued by Month")

##Chocolates & Teas

dat1<-SKUUsage
dat1$ShipToCode<-as.factor(dat1$ShipToCode)
dat1<-dat1[dat1$Issued>0,]
dat1$Issued<-as.numeric(dat1$Issued)
dat2<-dat1[dat1$ShipToCode=="3M",]
dat3<-dat1[dat1$ShipToCode=="CD3M",]
dat2<-droplevels(dat2)
dat3<-droplevels(dat3)
dat1<-merge(dat2,dat3,all=TRUE)

                   
ChocMeanIssued<-c(mean(dat1[dat1$Month=="JANUARY",]$Issued),
              mean(dat1[dat1$Month=="FEBRUARY",]$Issued),
              mean(dat1[dat1$Month=="MARCH",]$Issued),
              mean(dat1[dat1$Month=="APRIL",]$Issued),
              mean(dat1[dat1$Month=="MAY",]$Issued),
              mean(dat1[dat1$Month=="JUNE",]$Issued),
              mean(dat1[dat1$Month=="JULY",]$Issued),
              mean(dat1[dat1$Month=="AUGUST",]$Issued),
              mean(dat1[dat1$Month=="SEPTEMBER",]$Issued),
              mean(dat1[dat1$Month=="OCTOBER",]$Issued),
              mean(dat1[dat1$Month=="NOVEMBER",]$Issued),
              mean(dat1[dat1$Month=="DECEMBER",]$Issued))
Choc<-data.frame(MonthsNumb,Months,ChocMeanIssued)
plot(MonthsNumb,ChocMeanIssued,xlab="Months", ylab="Average",
              main="Average Chocolate/Teas Issued by Month")

AlcChoc<-merge(Alc,Choc,by="Months")
cor(AlcChoc$AlcMeanIssued,AlcChoc$ChocMeanIssued)

## [1] 0.4663303

Fresh Food Now Data Analysis

Victoria Vasquez

Dylan Katalbas

Caleb Garcia

4/30/2022

\[Excess Short Shelf-Life Items\]

Preparing the Data

Due to the limited amount of data available to us we have stratified all three databases into one based on SKUs and Commodity to explore any potential correlations. The majority of the data that the resulting frame consisted of were ‘Perishables’ and from the month of April.

Based on this observation, we have graphed the data by SKU to determine how these items with short shelf lives are moving through the system within a single month.

Conclusion:

Excess short shelf-life items month-to-month may be an indication that although they are forecasting growth through 2023 that does not necessarily mean they need to increase the number of received quantities for all items.

\[Commodity Frequency\]

\[Warehouse Storage Planning\]

Influence of Unit Measurements

An additional way our team decided to look at the data was based on how it would be stored. Pulling all the SKUs and creating a cluster plot based on their Volume and Weight resulted in the plot below.

Conclusion:

Lead Time by Storage Commodity

Conclusion:

Average Number Issued of Commodities by Month

Conclusions of Commodities by Months

Complete Code: