This past year Instacart has become huge because of the pandemic. I even worked there over the past year to earn some money on the side. I found some data released by Instacart from 2018. While it doesn't relate to the pandemic, I still thought it would be interesting to dive into what people's habits are on Instacart. Some of the first questions that popped into my mind were what day of the week people order the most and what time of day people order the most. So in this project I dug into a few of my biggest questions about Instacart and its users.
While working at the BJ's grocery store during this past year of lockdown, I ran into a lot of Instacart workers, and some days my aisle would be backed up with just Instacart workers. This got me thinking about when most Instacart orders are made. My hypothesis is that Saturday and Sunday have the most orders, with Thursday a close third.
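library(dplyr)    # provides %>% and the data verbs used below
library(ggplot2)  # plotting

# Bar chart of the number of orders for each day-of-week code (0 to 6)
orders %>%
  ggplot(aes(x = order_dow)) +
  geom_bar(fill = "blue") +
  theme_classic() +
  ggtitle("Days of the Week with the Most Orders") +
  xlab("Day of the Week (Starting at Saturday)") +
  ylab("Amount of Orders")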
In the graph above, days 0 and 1 have the most orders by far. Sadly, the data doesn't say whether day 0 is Saturday, but I would assume that it is. Going off that assumption, the weekend and Monday are the top three days for orders, while Thursday comes in a very close fourth behind Monday.
Working as a cashier this summer, I also learned when most Instacart orders would come in. It seemed like most came in either early, around 8 to 9 am, or late in the day, around 4 to 5 pm, since people were getting off work around then and wanted to make sure they would have groceries when they got home. So my hypothesis is that the hour with the most orders is 9 am.
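Since the plot only shows the codes 0 to 6, a small lookup table can make the weekday assumption explicit. This is a minimal sketch, assuming day 0 is Saturday (my guess, not something the dataset documents). `toy_orders` and `dow_labels` are made-up names, and the values are illustrative rather than the real Instacart data.

```r
library(dplyr)

# ASSUMPTION: day 0 is Saturday. The dataset does not document this mapping.
dow_labels <- c("Saturday", "Sunday", "Monday", "Tuesday",
                "Wednesday", "Thursday", "Friday")

# Toy stand-in for the real `orders` table (illustrative values only)
toy_orders <- tibble(order_dow = c(0, 0, 0, 1, 1, 2, 5, 5, 6))

dow_counts <- toy_orders %>%
  count(order_dow, name = "n_orders") %>%
  # order_dow is 0-based and R vectors are 1-based, hence the + 1
  mutate(day = dow_labels[order_dow + 1]) %>%
  arrange(desc(n_orders))

print(dow_counts)
```

If day 0 turns out to be Sunday instead, only the order of `dow_labels` needs to change.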
# Bar chart of the number of orders placed in each hour of the day
orders %>%
  ggplot(aes(x = order_hour_of_day)) +
  geom_bar(fill = "deepskyblue4") +
  ggtitle("Order Rates by Hours of the Day") +
  theme_classic() +
  xlab("Hour of the Day") +
  ylab("Amount of Orders")
Looking at the graph above, my hypothesis was close: 10 am seems to be the peak of Instacart orders during the day. Orders slow down around midday and pick back up slightly around 4 to 5 pm.
As a cashier I would also get a large variety of order sizes. One time I had an Instacart order of 66 items that filled my entire checkout. That made me want to explore the most frequent number of items a customer buys on Instacart. I saw a lot of large Instacart orders while I was working, so my hypothesis is that orders are most frequent around the 20-25 item range.
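Reading the peak off a bar chart is a bit of an eyeball job, so the busiest hour can also be pulled out as a number. A minimal sketch on made-up data: `toy_orders` is a hypothetical stand-in for the real `orders` table, and its values are illustrative only.

```r
library(dplyr)

# Toy stand-in for the real `orders` table (illustrative values only)
toy_orders <- tibble(order_hour_of_day = c(8, 9, 10, 10, 10, 13, 16, 16, 17))

# Number of orders placed in each hour
hourly <- toy_orders %>%
  count(order_hour_of_day, name = "n_orders")

# Busiest hour: the single row with the largest count
peak_hour <- hourly %>%
  slice_max(n_orders, n = 1, with_ties = FALSE)

print(peak_hour)
```

Running the same two steps on the real `orders` table would give the hour to compare against the 9 am guess.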
# Count the items in each order, then plot how often each order size occurs
order_products_prior %>%
  group_by(order_id) %>%
  # The highest add_to_cart_order value in an order is its item count
  summarize(n_items = max(add_to_cart_order)) %>%
  ggplot(aes(x = n_items)) +
  # geom_bar() counts how many orders have each n_items value
  geom_bar(fill = "green4") +
  geom_rug() +
  coord_cartesian(xlim = c(0, 40)) +
  ggtitle("How many items do people normally order?") +
  xlab("Items Ordered") +
  ylab("Amount of Orders")
Wow, was my guess off! The data shows that the most frequent order size is five items, with six close behind. My estimate must have been off because I worked at a wholesale store, where people came and bought large amounts of items. At stores like Harris Teeter or Food Lion, orders must be a lot smaller; small orders like that did not come up often while I worked as a cashier and Instacart worker.
When I think about items that are bought a lot at the grocery store, paper towels, milk, and ice cream come to mind. Looking through Instacart's data, I wanted to check that and figure out what the most ordered items on Instacart are.
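The most common order size can also be computed directly instead of being read off the plot. The sketch below counts rows per order, which is one way to get the order size without depending on row order, and then finds the most frequent size and the median. `toy_order_products` is a hypothetical stand-in for `order_products_prior` with made-up values.

```r
library(dplyr)

# Toy stand-in for `order_products_prior` (illustrative values only)
toy_order_products <- tibble(
  order_id          = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  add_to_cart_order = c(1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 4, 5)
)

# One row per order with the number of items in it
basket_sizes <- toy_order_products %>%
  count(order_id, name = "n_items")

# How many orders have each size, most common size first
size_freq <- basket_sizes %>%
  count(n_items, name = "n_orders") %>%
  arrange(desc(n_orders))

most_common_size <- size_freq$n_items[1]
median_size <- median(basket_sizes$n_items)

print(size_freq)
```

Looking at the median next to the most common size gives a feel for how much the big orders pull the distribution to the right.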
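# Ten most frequently ordered products, with their names attached
Top_Ordered <- order_products_prior %>%
  group_by(product_id) %>%
  summarize(count = n()) %>%
  top_n(10, wt = count) %>%
  left_join(select(products, product_id, product_name), by = "product_id") %>%
  arrange(desc(count))

knitr::kable(Top_Ordered)  # kable() comes from the knitr package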
| product_id | count | product_name |
|---|---|---|
| 24852 | 472565 | Banana |
| 13176 | 379450 | Bag of Organic Bananas |
| 21137 | 264683 | Organic Strawberries |
| 21903 | 241921 | Organic Baby Spinach |
| 47209 | 213584 | Organic Hass Avocado |
| 47766 | 176815 | Organic Avocado |
| 47626 | 152657 | Large Lemon |
| 16797 | 142951 | Strawberries |
| 26209 | 140627 | Limes |
| 27845 | 137905 | Organic Whole Milk |
# Bar chart of the top ten products, from most to least ordered
Top_Ordered %>%
  ggplot(aes(x = reorder(product_name, -count), y = count)) +
  geom_col(fill = "violetred4") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        axis.title.x = element_blank()) +
  ggtitle("Most Ordered Items") +
  ylab("Times Ordered")
This visualization shows that the top two items are bananas by a pretty large margin. Surprisingly, the top nine entries in the top ten are all fruits or vegetables. While this is a great thing, I wouldn't have expected it in the United States, since we are a pretty obese country. It's also interesting to see how many organic foods are sold on Instacart. I think this is because it's a delivery service and a lot of the orders come from higher-income areas, where people can afford to buy organic.
While Instacart has not released any data from during the pandemic, they have written an article about it. Some interesting findings are that customers placed 3% more midweek (Tuesday-Friday) orders than before the pandemic, and that orders during local working hours (9 am-5 pm) went up 32% in 2020. During lockdown, a few items that help relieve stress seem to have spiked. Canned or premixed cocktails grew by 127%, and beer fanatics helped fuel a big spike in beer and spiked seltzers, with sales of the two increasing by 96% and 131% respectively. In the hard liquor category, gin proved to be the spirit of choice, with a 21% increase since the beginning of the pandemic.
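The organic point can be put into numbers using the top-ten table above. This is a rough sketch that treats a product as organic only when "Organic" appears in its name, which is my assumption and could miss organic products that are not labeled that way. `top_ordered` and `organic_summary` are illustrative names.

```r
library(dplyr)

# Names and counts copied from the top-ten table above
top_ordered <- tibble(
  product_name = c("Banana", "Bag of Organic Bananas", "Organic Strawberries",
                   "Organic Baby Spinach", "Organic Hass Avocado",
                   "Organic Avocado", "Large Lemon", "Strawberries",
                   "Limes", "Organic Whole Milk"),
  count = c(472565, 379450, 264683, 241921, 213584,
            176815, 152657, 142951, 140627, 137905)
)

organic_summary <- top_ordered %>%
  # ASSUMPTION: a product is organic if its name contains "Organic"
  mutate(is_organic = grepl("Organic", product_name, fixed = TRUE)) %>%
  summarize(
    n_organic     = sum(is_organic),
    organic_share = sum(count[is_organic]) / sum(count)
  )

print(organic_summary)
```

By this rough measure, six of the ten products are organic, and they make up about 61% of the orders in the top ten.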
If you want to look at the article yourself, it is linked here: https://news.instacart.com/beyond-the-cart-a-year-of-essential-insights-b6ac201228e6
Looking at the habits of Instacart customers, it seems we can try to recreate the most likely order. Based on the data, on a Saturday around 10 am someone is going to order five items' worth of food, and more than likely they are going to buy some bananas.
The data for this project can be found here: https://www.kaggle.com/c/instacart-market-basket-analysis/overview