Calendar Heatmap of Online Retail Sales

Author

Nethrasree T and Deepika P

Introduction

This project aims to visualize online retail sales using a calendar heatmap.
The objective is to understand how sales vary across days and identify trends over a year.

Data Source

The data is based on monthly online retail sales trends reported by Digital Commerce 360.
Since the data is not available as a single downloadable dataset, values were compiled and approximated based on published trends.

Data Loading

In this step, the dataset is loaded from a CSV file into R.

data <- read.csv("sales_data.csv")

head(data)
   month  sales
1 Jan-25 119.84
2 Feb-25 121.00
3 Mar-25 123.50
4 Apr-25 124.80
5 May-25 124.80
6 Jun-25 125.31
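
As a quick sanity check on the input, the sketch below parses a small inline sample laid out like the head() output above and verifies the two columns before any further processing. The inline sample and the name check_data are illustrative assumptions; the real sales_data.csv is assumed to have the same month and sales columns.

```r
# Inline stand-in for sales_data.csv (assumed layout: month,sales)
csv_text <- "month,sales
Jan-25,119.84
Feb-25,121.00
Mar-25,123.50"

check_data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Fail early if the expected columns are missing or malformed
stopifnot(
  all(c("month", "sales") %in% names(check_data)),
  is.character(check_data$month),
  is.numeric(check_data$sales),
  !anyNA(check_data$sales)
)
```

The same stopifnot() call could be run on the data frame returned by read.csv("sales_data.csv").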

Load Libraries

The required libraries are loaded for data manipulation and visualization. The ggplot2 package draws the heatmap, and all data processing uses base R functions to limit dependencies. Note that library(tidyverse) already attaches ggplot2 and lubridate, so the two calls after it are redundant but harmless.

library(tidyverse)
library(lubridate)
library(ggplot2)

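Data Cleaning

The dataset stores months as text with a two-digit year (e.g., “Jan-25”). These values are converted to Date objects, anchored to the first day of each month, so that they can be used for time-based analysis and visualization. The format string uses %y (two-digit year); %Y would read “25” as the year 0025. Because %b matches month abbreviations in the current locale, an English locale is assumed.

data$month <- as.Date(paste0("01-", data$month), format="%d-%b-%y")

str(data)
'data.frame':   12 obs. of  2 variables:
 $ month: Date, format: "2025-01-01" "2025-02-01" ...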
 $ sales: num  120 121 124 125 125 ...
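
Since parsing month abbreviations with as.Date() depends on the session locale, a locale-independent alternative may be useful. The sketch below is one possible approach, not the method used in this report: parse_month_label is a hypothetical helper that looks the abbreviation up in R's built-in month.abb constant and assumes all years fall in 2000-2099.

```r
# Hypothetical helper: convert labels such as "Jan-25" to the first day of that month
parse_month_label <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  # Month number from the built-in English abbreviations (locale-independent)
  mon <- match(vapply(parts, `[`, character(1), 1), month.abb)
  # Two-digit year, assumed to mean 20xx
  yr <- 2000L + as.integer(vapply(parts, `[`, character(1), 2))
  as.Date(sprintf("%04d-%02d-01", yr, mon))
}

parse_month_label(c("Jan-25", "Feb-25", "Dec-25"))
```

Under these assumptions, data$month <- parse_month_label(data$month) would replace the as.Date() call.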

Data Transformation

The dataset contains only monthly sales values, but a calendar heatmap requires one value per day. Each month's sales value is therefore divided evenly across all the days in that month. Because every day within a month receives the same value, the resulting heatmap shows month-to-month variation on a daily grid rather than true day-to-day variation.

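# Expand each monthly row into one row per day of that month
daily_data <- do.call(rbind, lapply(seq_len(nrow(data)), function(i) {
  # Adding 32 days to the 1st always lands in the next month; snapping back to
  # that month's 1st and subtracting one day gives the last day of month i
  days <- seq(data$month[i],
              as.Date(format(data$month[i] + 32, "%Y-%m-01")) - 1,
              by = "day")

  data.frame(
    date = days,
    sales = data$sales[i] / length(days)  # equal share of the monthly total
  )
}))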
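
To make the even split concrete, the sketch below works through the arithmetic for the first three months shown in the head() output. The names monthly, days_in_month, and per_day are illustrative and not part of the original script.

```r
# First three months from the head() output, with parsed dates
monthly <- data.frame(
  month = as.Date(c("2025-01-01", "2025-02-01", "2025-03-01")),
  sales = c(119.84, 121.00, 123.50)
)

# Number of days in the month that starts on date d
days_in_month <- function(d) {
  next_month_start <- seq(d, by = "month", length.out = 2)[2]
  as.integer(format(next_month_start - 1, "%d"))
}

n_days <- vapply(seq_along(monthly$month),
                 function(i) days_in_month(monthly$month[i]),
                 integer(1))

# Per-day value assigned to every day of each month
per_day <- monthly$sales / n_days
round(per_day, 4)  # 3.8658 4.3214 3.9839
```

Here February's per-day value (about 4.32) exceeds March's (about 3.98) even though March has the larger monthly total, simply because February has fewer days. Tiles for short months can therefore appear darker than the monthly totals alone would suggest.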

Feature Creation

To construct the calendar heatmap, additional variables are required. The weekday represents the day of the week (Monday, Tuesday, etc.), and the week represents the week number in the year.

These variables help in arranging the data in a calendar-like format for visualization.

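# Fix the weekday order (Sunday at the top of the plot) to match %U, whose weeks start on Sunday; assumes an English locale
day_levels <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
daily_data$weekday <- factor(weekdays(daily_data$date), levels = rev(day_levels))
daily_data$week <- as.numeric(format(daily_data$date, "%U"))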
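
Because weekdays() returns names in the session locale, and plain character values are sorted alphabetically on a plot axis, it can help to derive the weekday from its numeric index instead. The sketch below is one possible locale-independent variant for the first week of 2025; the names dates, wday_index, day_labels, and features are illustrative assumptions.

```r
dates <- seq(as.Date("2025-01-01"), as.Date("2025-01-07"), by = "day")

# Numeric weekday: 0 = Sunday, ..., 6 = Saturday (independent of locale)
wday_index <- as.POSIXlt(dates)$wday

# Fixed English labels; levels are reversed so Sunday is drawn at the top of a y axis
day_labels <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
weekday <- factor(day_labels[wday_index + 1], levels = rev(day_labels))

# %U: week of the year with Sunday as the first day; days before the first Sunday are week 0
week <- as.integer(format(dates, "%U"))

features <- data.frame(date = dates, weekday = weekday, week = week)
features
```

January 1, 2025 is a Wednesday, so the first four days fall in week 0 and the week counter moves to 1 on Sunday, January 5.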

Calendar Heatmap Visualization

A calendar heatmap is created to visualize daily online sales. Each tile represents a day, and the color intensity indicates the sales value. Darker colors represent higher sales, while lighter colors represent lower sales.

This visualization helps in identifying patterns and trends over time.

library(ggplot2)

ggplot(daily_data, aes(x = week, y = weekday, fill = sales)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(
    title = "Calendar Heatmap of Daily Online Sales",
    x = "Week of Year",
    y = "Day of Week",
    fill = "Sales"
  ) +
  theme_minimal()
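
Week numbers on the x axis can be hard to relate to calendar months. As an optional refinement, the sketch below computes the week index at which each month of 2025 starts, using only base R; the names month_starts, month_breaks, and month_labels are illustrative assumptions and do not appear in the original script.

```r
# First day of each month in 2025
month_starts <- seq(as.Date("2025-01-01"), by = "month", length.out = 12)

# Week of the year (%U, Sunday-based) in which each month begins
month_breaks <- as.integer(format(month_starts, "%U"))

# English month abbreviations from the built-in month.abb constant
month_labels <- month.abb[as.integer(format(month_starts, "%m"))]

data.frame(month = month_labels, first_week = month_breaks)
```

These vectors could then be added to the plot with scale_x_continuous(breaks = month_breaks, labels = month_labels), so that the x axis shows month names instead of week numbers.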