LA-1

Author

SANJANA

PROBLEM STATEMENT

Create a calendar heatmap to show daily online sales data over one year.

STEP 1:LOAD THE REQUIRED LIBRARIES

  • readxl: Reads Excel files.

  • dplyr: Data wrangling and transformation.

  • lubridate: Date parsing and manipulation.

  • ggplot2: Data visualization

  • scales: Formatting numeric and date scales in plots.

library(readxl)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(ggplot2)
library(lubridate)

Attaching package: 'lubridate'
The following objects are masked from 'package:base':

    date, intersect, setdiff, union

STEP 2:READ AND CONVERT THE DATA FROM EXCEL FILE

  • Reads data from the Excel file.
  • Contains two columns: Date (Excel serial format) and Simulated Sales.
# Read the Excel file
sales_data <- read_excel("C:/Users/sanja/OneDrive/Desktop/april.xlsx")

# Convert Excel serial date to Date format
sales_data <- sales_data %>%
  mutate(date = as.Date(Date, origin = "1899-12-30"),
         sales = `Simulated Sales`)
sales_data
# A tibble: 30 × 4
   Date                `Simulated Sales` date       sales
   <dttm>                          <dbl> <date>     <dbl>
 1 2024-04-01 00:00:00             1000  2024-04-01 1000 
 2 2024-04-02 00:00:00             1000. 2024-04-02 1000.
 3 2024-04-03 00:00:00             1001. 2024-04-03 1001.
 4 2024-04-04 00:00:00             1001. 2024-04-04 1001.
 5 2024-04-05 00:00:00             1002. 2024-04-05 1002.
 6 2024-04-06 00:00:00             1002. 2024-04-06 1002.
 7 2024-04-07 00:00:00             1002. 2024-04-07 1002.
 8 2024-04-08 00:00:00             1003. 2024-04-08 1003.
 9 2024-04-09 00:00:00             1003. 2024-04-09 1003.
10 2024-04-10 00:00:00             1004. 2024-04-10 1004.
# ℹ 20 more rows

STEP 3:FILTER THE APRIL MONTH DATA

  • Converts Excel date from serial number to actual Date.

  • Renames and selects relevant columns: date and sales.

# Filter only April data
april_sales <- sales_data %>%
  filter(month(date) == 4) %>%
  mutate(
    month = month(date, label = TRUE),
    weekday = wday(date, label = TRUE, week_start = 1),
    week = isoweek(date)
  )
# Fix week 1 edge case in April
april_sales$week <- ifelse(april_sales$month == "Apr" & april_sales$week == 1, 53, april_sales$week)

april_sales
# A tibble: 30 × 7
   Date                `Simulated Sales` date       sales month weekday  week
   <dttm>                          <dbl> <date>     <dbl> <ord> <ord>   <dbl>
 1 2024-04-01 00:00:00             1000  2024-04-01 1000  Apr   Mon        14
 2 2024-04-02 00:00:00             1000. 2024-04-02 1000. Apr   Tue        14
 3 2024-04-03 00:00:00             1001. 2024-04-03 1001. Apr   Wed        14
 4 2024-04-04 00:00:00             1001. 2024-04-04 1001. Apr   Thu        14
 5 2024-04-05 00:00:00             1002. 2024-04-05 1002. Apr   Fri        14
 6 2024-04-06 00:00:00             1002. 2024-04-06 1002. Apr   Sat        14
 7 2024-04-07 00:00:00             1002. 2024-04-07 1002. Apr   Sun        14
 8 2024-04-08 00:00:00             1003. 2024-04-08 1003. Apr   Mon        15
 9 2024-04-09 00:00:00             1003. 2024-04-09 1003. Apr   Tue        15
10 2024-04-10 00:00:00             1004. 2024-04-10 1004. Apr   Wed        15
# ℹ 20 more rows

STEP 4:PLOT THE HEATMAP

  • aes(x, y, fill) maps weekday on X-axis, negative week on Y-axis, and sales as fill color.
  • geom_tile() draws the calendar-like heatmap.
  • scale_fill_gradient() uses blue intensity to represent sales volume.
  • Custom labels and minimal theme for clarity.
# Plot calendar heatmap for April
ggplot(april_sales, aes(x = weekday, y = -week, fill = sales)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "white", high = "steelblue", name = "Sales ($)") +
  labs(
    title = "April 2024 Sales Calendar Heatmap",
    x = "Day of Week",
    y = "Week of Year"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    panel.grid = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank()
  )