Do you love ice cream? I certainly love ice cream. This dessert is a common household item in the US. Most fond memories involve ice cream. It is at every birthday party and celebratory event. Ice cream is known to complement warm weather as Hot Chocolate (or known as Hot Cocoa) complements cold weather. Maybe I am a statistical outlier as I will eat ice cream year-round. In this analysis I take a deep dive into ice cream sales in 2017 to answer the question of, does weather temperature affect ice cream sales? This analysis will better understand how outside factors such as temperature can affect sales of products.
In my analysis I look at weather trends in the US compared to our sales of ice cream and hot cocoa mixes. Using data from the National Weather Service I showcase daily temperatures from various cities across the US and correlate the temperature to our sales.
completejourney- Data set characterizing household transactions over one year.
ggplot2- Visualization of data package.
dplyr- Combines data frames and allows for filtering.
tidyverse- A group of statistical packages to ease data analysis. This package is needed in almost every analysis computed on R Studios.
library(completejourney)
library(ggplot2)
library(dplyr)
library(tidyverse)
My analysis includes data of transactions, demographics, and products within “completejourney”. I also found the daily temperature in 2017 for Cincinnati, New York, Seattle, San Antonio, and Tampa when the completejourney data was compiled. These locations are geographically spaced out across the continental US (48 states).
I filtered the product_category for “Ice Cream” and used the associated product_id from the transactions data frame to find when these products were purchased. I then plugged this data into the demographics data frame to find who was buying ice cream. I could then find out which demographic was buying the most ice cream and when.
The temperature data was not raw and I could not find the raw versions to upload into R Studios. The graphs were auto generated from the National Weather Services website (cited below graphs and in references).
library(completejourney)
library(ggplot2)
library(dplyr)
library(tidyverse)
transactions <- get_transactions()
demographics <- demographics
products <- products
Ice_Cream_Sales_per_Month <- products %>%
inner_join(transactions, by = "product_id") %>%
filter(str_detect(product_category, regex("ICE CREAM", ignore_case = TRUE))) %>%
group_by(month = month(transaction_timestamp, label = TRUE)) %>%
summarize(total_sales = sum(sales_value))
knitr::kable(Ice_Cream_Sales_per_Month, format= "markdown")
month | total_sales |
---|---|
Jan | 3899.99 |
Feb | 3313.17 |
Mar | 3977.91 |
Apr | 3634.69 |
May | 3946.00 |
Jun | 4344.59 |
Jul | 4874.61 |
Aug | 4153.73 |
Sep | 4028.95 |
Oct | 3724.57 |
Nov | 3424.42 |
Dec | 3553.88 |
library(completejourney)
library(ggplot2)
library(dplyr)
library(tidyverse)
transactions <- get_transactions()
demographics <- demographics
products <- products
Ice_Cream_Sales_by_Age <- products %>%
filter(str_detect(product_category, regex("(ICE CREAM)"))) %>%
inner_join(transactions, by = "product_id") %>%
inner_join(demographics, by = "household_id") %>%
group_by(age) %>%
summarize(total_sales = sum(sales_value, na.rm = TRUE))
knitr::kable(Ice_Cream_Sales_by_Age, format= "markdown")
age | total_sales |
---|---|
19-24 | 1046.35 |
25-34 | 3693.99 |
35-44 | 7376.07 |
45-54 | 10355.26 |
55-64 | 1889.73 |
65+ | 1915.93 |
library(completejourney)
library(ggplot2)
library(dplyr)
library(tidyverse)
transactions <- get_transactions()
demographics <- demographics
products <- products
Ice_Cream_Sales_per_Income_and_Number_of_Kids <- products %>%
filter(str_detect(product_category, regex("ICE CREAM", ignore_case = TRUE))) %>%
inner_join(transactions, by = "product_id") %>%
inner_join(demographics, by = "household_id") %>%
group_by(kids_count, income) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE))
knitr::kable(Ice_Cream_Sales_per_Income_and_Number_of_Kids, format= "markdown")
kids_count | income | total_sales |
---|---|---|
0 | Under 15K | 895.87 |
0 | 15-24K | 1475.84 |
0 | 25-34K | 1726.94 |
0 | 35-49K | 3104.13 |
0 | 50-74K | 4357.26 |
0 | 75-99K | 1564.20 |
0 | 100-124K | 780.08 |
0 | 125-149K | 747.88 |
0 | 150-174K | 773.91 |
0 | 175-199K | 181.61 |
0 | 200-249K | 29.19 |
0 | 250K+ | 437.81 |
1 | Under 15K | 267.79 |
1 | 15-24K | 307.66 |
1 | 25-34K | 410.32 |
1 | 35-49K | 1205.68 |
1 | 50-74K | 656.30 |
1 | 75-99K | 909.07 |
1 | 100-124K | 169.58 |
1 | 125-149K | 460.91 |
1 | 150-174K | 515.11 |
1 | 175-199K | 68.79 |
1 | 200-249K | 79.54 |
1 | 250K+ | 121.39 |
2 | Under 15K | 171.57 |
2 | 15-24K | 173.55 |
2 | 25-34K | 179.49 |
2 | 35-49K | 385.84 |
2 | 50-74K | 793.59 |
2 | 75-99K | 156.84 |
2 | 100-124K | 115.23 |
2 | 125-149K | 118.05 |
2 | 150-174K | 63.47 |
3+ | Under 15K | 93.71 |
3+ | 15-24K | 340.18 |
3+ | 25-34K | 291.33 |
3+ | 35-49K | 407.63 |
3+ | 50-74K | 217.29 |
3+ | 75-99K | 482.94 |
3+ | 100-124K | 161.28 |
3+ | 125-149K | 535.63 |
3+ | 150-174K | 231.33 |
3+ | 175-199K | 25.01 |
3+ | 200-249K | 14.97 |
3+ | 250K+ | 71.54 |
library(completejourney)
library(ggplot2)
library(dplyr)
library(tidyverse)
transactions <- get_transactions()
demographics <- demographics
products <- products
Cocoa_Sales_Per_Month <- products %>%
inner_join(transactions, by = "product_id") %>%
filter(str_detect(product_category, regex("COCOA MIXES", ignore_case = TRUE))) %>%
group_by(month = month(transaction_timestamp, label = TRUE)) %>%
summarize(total_sales = sum(sales_value))
knitr::kable(Cocoa_Sales_Per_Month, format= "markdown")
month | total_sales |
---|---|
Jan | 529.26 |
Feb | 438.49 |
Mar | 414.49 |
Apr | 322.15 |
May | 351.00 |
Jun | 315.95 |
Jul | 272.09 |
Aug | 297.04 |
Sep | 378.60 |
Oct | 546.31 |
Nov | 564.75 |
Dec | 637.97 |
transactions <- get_transactions()
demographics <- demographics
products <- products
products %>%
inner_join(transactions, by = "product_id") %>%
filter(str_detect(product_category, regex("ICE CREAM", ignore_case = TRUE))) %>%
group_by(month = month(transaction_timestamp, label = TRUE)) %>%
summarize(total_sales = sum(sales_value)) %>%
ggplot(aes(x = month, y = total_sales,color = month, fill = month)) +
geom_col() +
scale_x_discrete("Month") +
scale_y_continuous("Ice Cream Sales", label = scales::dollar) +
ggtitle(label = "Ice Cream Sales per Month", subtitle = "July has the most ice cream sales")
The data shown above shows a distinct curve in ice cream sales. There is two notable outliers (Jan and Mar).
products %>%
filter(str_detect(product_category, regex("(ICE CREAM)"))) %>%
inner_join(transactions, by = "product_id") %>%
inner_join(demographics, by = "household_id") %>%
group_by(age) %>%
summarize(total_sales = sum(sales_value, na.rm = TRUE)) %>%
ggplot(aes(x = factor(age), y = total_sales, color = factor(age), fill = factor(age))) +
geom_col() +
scale_y_continuous("Total Sales of Ice Cream", labels = scales::dollar) +
scale_x_discrete("Age") +
ggtitle(label = "Ice Cream Sales by Age", subtitle = "Middle aged customers buy the most ice cream")
The bulk of consumers buying ice cream are middle aged between the ages of 35 and 54.
products %>%
filter(str_detect(product_category, regex("ICE CREAM", ignore_case = TRUE))) %>%
inner_join(transactions, by = "product_id") %>%
inner_join(demographics, by = "household_id") %>%
group_by(kids_count, income) %>%
summarise(total_sales = sum(sales_value, na.rm = TRUE)) %>%
ggplot(aes(x = kids_count, y = total_sales, color = kids_count, fill = kids_count)) +
geom_col() +
facet_wrap(~income) +
scale_x_discrete("Number of Kids") +
scale_y_continuous("Total Ice Cream Sales", label = scales::dollar) +
ggtitle(label = "Ice Cream Sales per Income and Number of Kids", subtitle = "Households with 50-74K and no kids spend the most on ice cream")
Households without kids leads every income range in regards to ice cream purchases except one, income ranging from 200-249k.
transactions <- get_transactions()
demographics <- demographics
products <- products
products %>%
inner_join(transactions, by = "product_id") %>%
filter(str_detect(product_category, regex("COCOA MIXES", ignore_case = TRUE))) %>%
group_by(month = month(transaction_timestamp, label = TRUE)) %>%
summarize(total_sales = sum(sales_value)) %>%
ggplot(aes(x = month, y = total_sales,color = month, fill = month)) +
geom_col() +
scale_x_discrete("Month") +
scale_y_continuous("Cocoa Mix Sales", label = scales::dollar) +
ggtitle(label = "Cocoa Mix Sales per Month", subtitle = "December has the most cocoa sales")
This graph is almost the opposite results from Ice Cream Sales. These sales numbers also correlate to the temperatures.
Source: National Weather
Service (https://www.weather.gov/wrh/Climate?wfo=iln)
After mid October, the temperature was not recorded for the Cincinnati area by the National Weather Service.
Source: National Weather Service
(https://www.weather.gov/wrh/Climate?wfo=okx)
New York, NY best represents ice cream sales by Regork. This graph shows a couple peaks in temperature in late winter which is also shown in ice cream sales.
Source: National Weather
Service (https://www.weather.gov/wrh/Climate?wfo=ewx)
Source: National Weather Service
(https://www.weather.gov/wrh/Climate?wfo=sew)
Source: National Weather Service (https://www.weather.gov/wrh/Climate?wfo=tbw)
All of the temperature graphs show a hump in temperature for the month of July. Some areas might not be as drastic as others.
After reviewing the data, I have determined that the temperature does affect the sales of ice cream. This can be proven by also looking at hot cocoa mix sales. Note that the weather across the country is not all the same. I also found that most ice cream is being purchased by customers who are 35-54 with no kids and make between 35-74K a year.
My proposed solution to increase ice cream sales during colder months is to add additional marketing to areas of the country where temperature change is not as drastic. With a moderate temperature year-round ice cream sales should be relatively constant. Adding additional marketing would help boost the sales.
I also propose to add seasonal flavors. The fall could feature flavors like apple, pumpkin, and spice. Spring could feature marshmallow and confetti flavors. Finally, the winter could feature flavors like gingerbread, eggnog, and peppermint. Making these flavors more available in the season and less during off season would create a desire for the product without consumers being burnt out on the flavors. The main consumer of ice cream is middle aged customers so unique flavors that remind them of holidays and family could be beneficial.
Note that local mom and pop ice cream parlors usually close from Labor Day to Memorial Day. This would allow us to tap into a revenue stream that may be diverted during peak warm weather.
As a last resort, we now know that ice cream is not selling the best in colder months so this will give us the opportunity to possibly invest in hot cocoa mixes more heavily as we know these sales drastically increase during colder months. This is not the ideal proposed solution, but it gives us flexibility if the above options fail.
In closing, the temperature does impact our ice cream sales. This has been shown and reinforced with the sales of hot cocoa mixes.
Limitations
If the data from the National Weather Service was raw this could be used to pick out the key data (temperature) and leave out unwanted information about record highs/lows in the area. I think another table within completejourney with data about locations of the stores would allow temperature data to be more accurate.
Boehmke, B. (2019, September 28). The Complete Journey User Guide. https://cran.r-project.org/web/packages/completejourney/vignettes/completejourney.html
US Department of Commerce, N. (2024, July 30). Climate. National Weather Service. https://www.weather.gov/wrh/climate
Cream, M. I. (2020, December 11). 4 tips for increasing wholesale ice cream sales - my/mochiTM blog. My/MochiTM Ice Cream. https://www.mymochi.com/blog/4-tips-increasing-wholesale-ice-cream-sales/
Moncel, B. (2024, June 20). What is ice cream and how is it made?. The Spruce Eats. https://www.thespruceeats.com/what-is-ice-cream-1328784