This project aims to visualize online retail sales using a calendar heatmap.
The objective is to understand how sales vary across days and identify trends over a year.
The data is based on monthly online retail sales trends reported by Digital Commerce 360.
Since the data is not available as a single downloadable dataset, values were compiled and approximated based on published trends.
In this step, the dataset is loaded from a CSV file into R.
data <- read.csv("sales_data.csv")
head(data)
   month  sales
1 Jan-25 119.84
2 Feb-25 121.00
3 Mar-25 123.50
4 Apr-25 124.80
5 May-25 124.80
6 Jun-25 125.31
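Before converting anything, it can help to confirm that the file has the expected shape. The following is a minimal sketch that reads the same two-column layout from inline text rather than from sales_data.csv; the comma-separated layout with a `month,sales` header and the name `sample_data` are assumptions inferred from the head() output, not taken from the actual file.

```r
# Sketch only: reads the assumed two-column layout from inline text instead of a file.
csv_text <- "month,sales
Jan-25,119.84
Feb-25,121.00
Mar-25,123.50"

sample_data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Basic structure checks before any date conversion.
stopifnot(identical(names(sample_data), c("month", "sales")))
stopifnot(is.character(sample_data$month), is.numeric(sample_data$sales))

str(sample_data)
```

The month column arrives as plain text, which is why a separate conversion step is needed before any time-based work.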
The required libraries are loaded for data manipulation and visualization. The ggplot2 package creates the heatmap. Data processing relies on base R functions to avoid dependency issues, so of the packages loaded here only ggplot2 is strictly needed by the code in this document.
library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.5.3
Warning: package 'ggplot2' was built under R version 4.5.3
Warning: package 'lubridate' was built under R version 4.5.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.6
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
library(ggplot2)
The dataset contains month values in text format (e.g., “Jan-25”). These values are converted to Date objects, anchored on the first day of each month, so that they can be used for time-based analysis and visualization. Because the year is written with two digits, the format string uses %y; with %Y the value “25” would be read as the year 25.
data$month <- as.Date(paste0("01-", data$month), format = "%d-%b-%y")
str(data)
'data.frame': 12 obs. of 2 variables:
$ month: Date, format: "2025-01-01" "2025-02-01" ...
$ sales: num 120 121 124 125 125 ...
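The format string has to match the width of the year in the data. As a minimal sketch, assuming labels shaped like "Jan-25" (as in the head() output) and an English or C locale so that %b recognises "Jan", the two year specifiers can be compared directly. The names `label`, `with_Y` and `with_y` are illustrative only.

```r
# Sketch only: compares two format strings on a label shaped like the data.
# Assumes an English (or C) locale so that "%b" recognises "Jan".
label <- "Jan-25"

# "%Y" expects a year with century, so "25" is taken literally as the year 25.
with_Y <- as.Date(paste0("01-", label), format = "%d-%b-%Y")

# "%y" expects a two-digit year; values 00 to 68 are mapped to 2000 to 2068.
with_y <- as.Date(paste0("01-", label), format = "%d-%b-%y")

# Extract the parsed years as numbers to see the difference.
as.POSIXlt(with_Y)$year + 1900
as.POSIXlt(with_y)$year + 1900
```

A date column whose values start with "0025" in str() is the typical symptom of using %Y on two-digit years.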
The dataset contains only monthly sales values. To create a calendar heatmap, daily data is required. Therefore, each month’s sales value is distributed evenly across all the days in that month. This transformation helps in visualizing sales patterns on a daily basis.
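To make the even split concrete, here is a minimal sketch for a single month, using the January value from the head() output. It finds the end of the month with `seq(..., by = "month")`, which is one alternative way to get the same day range and not necessarily the approach used in this report; the names `month_start`, `monthly_sales` and `daily_sales` are illustrative.

```r
# Sketch only: spreads one monthly total evenly over the days of that month.
# The month and total come from the first row of the head() output.
month_start <- as.Date("2025-01-01")
monthly_sales <- 119.84

# First day of the next month minus one day is the last day of this month.
next_month_start <- seq(month_start, by = "month", length.out = 2)[2]
days <- seq(month_start, next_month_start - 1, by = "day")

# Every day receives the same share of the monthly total.
daily_sales <- rep(monthly_sales / length(days), length(days))

length(days)      # number of days in the month
sum(daily_sales)  # adds back up to the monthly total, up to floating-point rounding
```

Because every day in a month receives the same value, a heatmap built from data like this varies between months rather than within them.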
daily_data <- do.call(rbind, lapply(seq_len(nrow(data)), function(i) {
  # All days from the first to the last day of month i:
  # adding 32 days always lands in the next month, whose first day minus one is the month end
  days <- seq(data$month[i],
              as.Date(format(data$month[i] + 32, "%Y-%m-01")) - 1,
              by = "day")
  data.frame(
    date = days,
    sales = data$sales[i] / length(days)  # equal share of the monthly total
  )
}))
To construct the calendar heatmap, additional variables are required. The weekday represents the day of the week (Monday, Tuesday, etc.), and the week represents the week number in the year.
These variables help in arranging the data in a calendar-like format for visualization.
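As a small illustration of how these two coordinates behave, the sketch below derives them for three sample dates around the first Sunday of 2025. It uses the numeric weekday from `as.POSIXlt()` so the result does not depend on the locale; the English labels in `day_names` are an assumption made for display.

```r
# Sketch only: derives calendar coordinates for a few sample dates.
dates <- as.Date(c("2025-01-01", "2025-01-04", "2025-01-05"))

# "%U" counts weeks starting on Sunday; days before the first Sunday fall in week 0.
week <- as.numeric(format(dates, "%U"))

# POSIXlt stores the weekday as a number (0 = Sunday), independent of the locale.
wday_num <- as.POSIXlt(dates)$wday

# Illustrative English labels, ordered Sunday to Saturday to match "%U".
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
weekday <- factor(day_names[wday_num + 1], levels = day_names)

data.frame(date = dates, week = week, weekday = weekday)
```

Storing the weekday as a factor with explicit levels matters for plotting, since a plain character column is placed on the axis in alphabetical order.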
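# Explicit level order (Sunday first, matching %U) avoids an alphabetical axis; assumes English day names
daily_data$weekday <- factor(weekdays(daily_data$date), levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"))
daily_data$week <- as.numeric(format(daily_data$date, "%U"))
A calendar heatmap is created to visualize daily online sales. Each tile represents a day, and the color intensity indicates the sales value. Darker colors represent higher sales, while lighter colors represent lower sales.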
This visualization helps in identifying patterns and trends over time.
library(ggplot2)
ggplot(daily_data, aes(x = week, y = weekday, fill = sales)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(
    title = "Calendar Heatmap of Daily Online Sales",
    x = "Week of Year",
    y = "Day of Week",
    fill = "Sales"
  ) +
  theme_minimal()