Introduction

Over the past year Instacart has become huge because of the pandemic. I even worked there during that time to earn some money on the side. I found some data that Instacart released in 2018, and while it doesn't relate to the pandemic, I still thought it would be interesting to dive into people's habits on Instacart. Some of the first questions that popped into my mind were which day of the week people order the most and what time of day they order the most. In this project I dig into a few of the questions I wondered about most regarding Instacart and its users.

What Day of the Week Do People Order Most?

While working at the BJ's grocery store during the past year of lockdown, I ran into a lot of Instacart workers, and some days my aisle would be backed up with nothing but Instacart workers. This got me thinking about when most Instacart orders are placed. My hypothesis is that the most orders are placed on Saturday and Sunday, with Thursday a close third.

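library(dplyr)    # provides %>%
library(ggplot2)

# Count orders for each day-of-week code (0-6) in the `orders` table
orders %>% 
  ggplot(aes(x = order_dow)) + 
  geom_bar(stat = "count", fill = "blue") +
  theme_classic() +
  ggtitle("Days of the Week with the Most Orders") +
  xlab("Day of the Week (0 assumed to be Saturday)") + ylab("Number of Orders")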

Review

In the graph above, days 0 and 1 have the most orders by far. Sadly, the data does not say which weekday 0 stands for, but I would assume it is Saturday. Under that assumption, the weekend and Monday are the top three days for orders, while Thursday comes in a very close fourth behind Monday.
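To rank the days with numbers instead of reading them off the bars, a small sketch like the one below could help. It runs on a made-up `orders_toy` table rather than the real `orders` data, and the `day_labels` mapping (0 = Saturday) is only my assumption, since the dataset does not document it.

```r
library(dplyr)

# Toy stand-in for the real `orders` table (hypothetical values)
orders_toy <- tibble(order_dow = c(0, 0, 0, 1, 1, 4, 6))

# ASSUMPTION: 0 = Saturday. The dataset does not document this mapping.
day_labels <- c("Saturday", "Sunday", "Monday", "Tuesday",
                "Wednesday", "Thursday", "Friday")

dow_counts <- orders_toy %>%
  count(order_dow, name = "n_orders") %>%        # orders per day code
  mutate(day = day_labels[order_dow + 1]) %>%    # codes are 0-based, R vectors are 1-based
  arrange(desc(n_orders))                        # busiest day first

print(dow_counts)
```

Swapping `orders_toy` for the real `orders` table should give the same ranking the bar chart shows, with a day name attached to each code.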

What Time of Day Do People Order?

Working as a cashier this summer, I also got a feel for when most Instacart orders came in. Most seemed to arrive either early, around 8 to 9 am, or late in the day, around 4 to 5 pm, since people were getting off work around then and wanted to make sure they had groceries when they got home. So my hypothesis is that the busiest hour for orders is 9 am.

# Count orders for each hour of the day (0-23)
orders %>% 
  ggplot(aes(x = order_hour_of_day)) + 
  geom_bar(stat = "count", fill = "deepskyblue4") +
  ggtitle("Orders by Hour of the Day") + theme_classic() +
  xlab("Hour of the Day") + ylab("Number of Orders")

Review

Looking at the graph above, my hypothesis was close: 10 am appears to be the peak hour for Instacart orders. Orders slow down around midday and pick back up slightly around 4 to 5 pm.
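The peak hour can also be pulled out directly rather than eyeballed from the chart. This is a minimal sketch on a made-up `orders_toy` table (the values are hypothetical, not from the Instacart data), using `slice_max()` from dplyr to keep the busiest hour.

```r
library(dplyr)

# Toy stand-in for the real `orders` table (hypothetical values)
orders_toy <- tibble(
  order_hour_of_day = c(8, 9, 10, 10, 10, 13, 16, 16, 17)
)

# Orders per hour
hour_counts <- orders_toy %>%
  count(order_hour_of_day, name = "n_orders")

# Keep only the single busiest hour
peak_hour <- hour_counts %>%
  slice_max(n_orders, n = 1, with_ties = FALSE)

print(peak_hour)
```

With `with_ties = FALSE` only one row comes back even if two hours tie, so it may be worth setting it to `TRUE` on the real data to see whether 10 am has any close rivals.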

How Many Items Do People Normally Order?

As a cashier I also saw a wide variety of order sizes. One time I had an Instacart order of 66 items that filled my entire checkout. That made me want to explore the most common number of items a customer buys in one Instacart order. I saw a lot of large Instacart orders while I was working, so my hypothesis is that orders are most frequent in the 20-25 item range.

order_products_prior %>% 
  group_by(order_id) %>% 
  # max() does not depend on row order the way last() does
  summarize(n_items = max(add_to_cart_order)) %>%
  ggplot(aes(x = n_items)) + 
  # geom_bar() counts rows itself; geom_histogram(stat = "count") raised a warning
  geom_bar(fill = "green4") + 
  geom_rug() + 
  coord_cartesian(xlim = c(0, 40)) +
  ggtitle("How many items do people normally order?") +
  xlab("Items per Order") + ylab("Number of Orders")

Review

Wow, was my guess off! The data shows that the most common order size is five items, with six close behind. My estimate was probably off because I worked at a wholesale store where people came to buy in bulk. At stores like Harris Teeter or Food Lion, orders must be a lot smaller; orders that small did not come up often while I worked as a cashier and Instacart worker.
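The most common basket size can be computed as well as plotted. The sketch below uses a made-up `order_products_toy` table (hypothetical values) and counts rows per order, which is one way to get the basket size; it assumes each row is one product in one order, as in `order_products_prior`.

```r
library(dplyr)

# Toy stand-in for `order_products_prior` (hypothetical values):
# order 1 has 3 items, orders 2 and 3 have 5 each, order 4 has 2
order_products_toy <- tibble(
  order_id          = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4),
  add_to_cart_order = c(1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2)
)

# One row per order with its number of items
basket_sizes <- order_products_toy %>%
  count(order_id, name = "n_items")

# How many orders there are of each size, most frequent size first
size_freq <- basket_sizes %>%
  count(n_items, name = "n_orders") %>%
  arrange(desc(n_orders))

most_common_size <- size_freq$n_items[1]
median_size      <- median(basket_sizes$n_items)

print(size_freq)
```

Comparing `most_common_size` with `median_size` on the real data would show how far the typical order sits from the peak, since a few very large orders can stretch the right side of the distribution.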

What Items Are Bought the Most?

When I think about items that are bought a lot at the grocery store, paper towels, milk, and ice cream come to mind. I wanted to check that against Instacart's data and figure out which items are ordered the most on Instacart.

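library(knitr)   # for kable()

# Count how often each product appears, keep the ten most ordered,
# then attach product names from the `products` table
Top_Ordered <- order_products_prior %>% 
  group_by(product_id) %>% 
  summarize(count = n()) %>% 
  top_n(10, wt = count) %>%
  left_join(select(products, product_id, product_name), by = "product_id") %>%
  arrange(desc(count)) 

kable(Top_Ordered)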
product_id   count    product_name
24852        472565   Banana
13176        379450   Bag of Organic Bananas
21137        264683   Organic Strawberries
21903        241921   Organic Baby Spinach
47209        213584   Organic Hass Avocado
47766        176815   Organic Avocado
47626        152657   Large Lemon
16797        142951   Strawberries
26209        140627   Limes
27845        137905   Organic Whole Milk

Top_Ordered %>% 
  ggplot(aes(x=reorder(product_name,-count), y=count))+
  geom_bar(stat="identity",fill="violetred4")+
  theme(axis.text.x=element_text(angle=90, hjust=1),axis.title.x = element_blank())+ 
  ggtitle("Most Ordered Items") + ylab("Times Ordered")

Review

This visualization shows that the top two items are both bananas, by a pretty large margin. Surprisingly, nine of the top ten entries are fruits or vegetables. While this is a great thing, I wouldn't have expected it in the United States, since we are a pretty obese country. It's also interesting to see how many organic foods are sold on Instacart. I think this is because it's a delivery service whose orders come largely from higher-income areas, where people can afford to buy organic.
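To put a number on the organic observation, here is a rough sketch that flags the top ten products by name. The table is retyped from the output above, and the flag is only a name match on the word "organic", so it is an approximation and says nothing about products that are organic but not labeled that way.

```r
library(dplyr)

# Top ten products, retyped from the kable output above
top_ordered <- tibble(
  product_name = c("Banana", "Bag of Organic Bananas", "Organic Strawberries",
                   "Organic Baby Spinach", "Organic Hass Avocado",
                   "Organic Avocado", "Large Lemon", "Strawberries",
                   "Limes", "Organic Whole Milk"),
  count = c(472565, 379450, 264683, 241921, 213584,
            176815, 152657, 142951, 140627, 137905)
)

organic_summary <- top_ordered %>%
  # ASSUMPTION: a product is "organic" if its name contains the word
  mutate(is_organic = grepl("organic", product_name, ignore.case = TRUE)) %>%
  summarize(
    n_organic             = sum(is_organic),                     # products flagged
    share_of_top10_orders = sum(count[is_organic]) / sum(count)  # weighted by times ordered
  )

print(organic_summary)
```

By this name match, six of the ten products are organic. The same `grepl()` flag could be applied to the full `products` table to see whether the pattern holds beyond the top ten.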

Comparing to the Pandemic

While Instacart has not released any data from during the pandemic, they have written an article about it. Among the interesting findings, customers placed 3% more midweek (Tuesday-Friday) orders than before the pandemic, and orders during local working hours (9 am-5 pm) went up 32% in 2020. During lockdown, a few items that help relieve stress seem to have spiked. Canned or premixed cocktails grew by 127%, and beer fanatics helped fuel a big spike in beer and spiked seltzers, with sales of the two increasing by 96% and 131% respectively. In the hard liquor category, gin proved to be the spirit of choice, with a 21% increase since the beginning of the pandemic.

If you want to read the article yourself, it is linked here: https://news.instacart.com/beyond-the-cart-a-year-of-essential-insights-b6ac201228e6

Conclusion

Looking at the habits of Instacart customers, it seems we can try to recreate the most likely order. Based on the data, on a Saturday (assuming day 0 is Saturday) around 10 am someone is going to order five items' worth of food, and more than likely they are going to buy some bananas.

The data for this project can be found here: https://www.kaggle.com/c/instacart-market-basket-analysis/overview