1 Overview

This report analyzes bicycle sales performance using an instructor-provided financial transaction database. The focus is store- and location-based performance, showing how transactional data can be aggregated and transformed into managerial insights for retail operations.

Key Objectives:

  • Analyze overall sales performance over time (weekly, monthly, quarterly)
  • Compare sales contributions across bikeshops (store-level performance)
  • Compare sales contributions across geographic locations (state/city-level performance)
  • Provide managerial insights and recommendations to support decision-making

2 Data Source

This analysis uses multiple related datasets provided by the instructor, representing a financial transaction database for bicycle sales.

Datasets included :

  • orderlines: base transactional records (order-level line items)
  • bikes: product reference (model and product attributes)
  • bikeshops: store reference (bikeshop and geographic information)
  • bike_orderlines: enriched transactional dataset resulting from integration of transactional and reference tables

The bike_orderlines dataset is used as the primary analysis table because it is a consolidated dataset suitable for time-based and store/location-based reporting.

# Import Library

library(tidyverse)
library(lubridate)
library(scales)
library(readxl)

# Load base/dimension tables (for relational database demonstration)

orderlines <- read_excel("orderlines.xlsx")
bikes <- read_excel("bikes.xlsx")
bikeshops <- read_excel("bikeshops.xlsx")

# Load enriched transactional dataset (main analysis table)

sales <- read_csv("bike_orderlines.csv", show_col_types = FALSE)

# Robust parsing and cleaning to prevent non-finite values in plots

sales <- sales %>%
mutate(
order_date = parse_date_time(order_date, orders = c("Ymd HMS", "Ymd HM", "Ymd")),
total_price = readr::parse_number(as.character(total_price))
) %>%
filter(!is.na(order_date), !is.na(total_price)) %>%
mutate(
week = floor_date(order_date, "week", week_start = 1),
month = floor_date(order_date, "month"),
quarter_label = paste0(year(order_date), "-Q", quarter(order_date))
)

glimpse(sales)
## Rows: 15,644
## Columns: 16
## $ order_date     <dttm> 2011-01-07, 2011-01-07, 2011-01-10, 2011-01-10, 2011-0…
## $ order_id       <dbl> 1, 1, 2, 2, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7…
## $ order_line     <dbl> 1, 2, 1, 2, 1, 2, 3, 4, 5, 1, 1, 2, 3, 4, 1, 2, 3, 4, 1…
## $ quantity       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1…
## $ price          <dbl> 6070, 5970, 2770, 5970, 10660, 3200, 12790, 5330, 1570,…
## $ total_price    <dbl> 6070, 5970, 2770, 5970, 10660, 3200, 12790, 5330, 1570,…
## $ model          <chr> "Jekyll Carbon 2", "Trigger Carbon 2", "Beast of the Ea…
## $ category_1     <chr> "Mountain", "Mountain", "Mountain", "Mountain", "Road",…
## $ category_2     <chr> "Over Mountain", "Over Mountain", "Trail", "Over Mounta…
## $ frame_material <chr> "Carbon", "Carbon", "Aluminum", "Carbon", "Carbon", "Ca…
## $ bikeshop_name  <chr> "Ithaca Mountain Climbers", "Ithaca Mountain Climbers",…
## $ city           <chr> "Ithaca", "Ithaca", "Kansas City", "Kansas City", "Loui…
## $ state          <chr> "NY", "NY", "KS", "KS", "KY", "KY", "KY", "KY", "KY", "…
## $ week           <dttm> 2011-01-03, 2011-01-03, 2011-01-10, 2011-01-10, 2011-0…
## $ month          <dttm> 2011-01-01, 2011-01-01, 2011-01-01, 2011-01-01, 2011-0…
## $ quarter_label  <chr> "2011-Q1", "2011-Q1", "2011-Q1", "2011-Q1", "2011-Q1", …

3 Data Management & Aggregation

In the financial reporting, transaction-level data is typically aggregated into different time horizons:

  • Weekly: operational monitoring and short-term fluctuations
  • Monthly: tactical planning and budgeting
  • Quarterly: strategic evaluation and executive reporting

Additionally, this report aggregates sales by:

  • Bikeshop (store performance)
  • State/City (geographic performance)
make_total_ts <- function(df, period_col, period_name) {
df %>%
group_by({{ period_col }}) %>%
summarise(total_sales = sum(total_price, na.rm = TRUE), .groups = "drop") %>%
filter(is.finite(total_sales)) %>%
mutate(period_type = period_name)
}

5 Store Performance Analysis (Bikeshop-Level)

This section evaluates revenue contribution per bikeshop, identifying top-performing and underperforming stores.

5.1 Top 10 Bikeshops by Total Sales

shop_sales <- sales %>%
group_by(bikeshop_name) %>%
summarise(total_sales = sum(total_price, na.rm = TRUE), .groups = "drop") %>%
arrange(desc(total_sales))

top10_shops <- shop_sales %>% slice_head(n = 10)

ggplot(top10_shops, aes(x = reorder(bikeshop_name, total_sales), y = total_sales, fill = bikeshop_name)) +
geom_col(alpha = 0.9, show.legend = FALSE) +
coord_flip() +
scale_y_continuous(labels = scales::dollar) +
labs(
title = "Top 10 Bikeshops by Total Sales",
subtitle = "Store-level performance comparison (highest to lowest)",
x = "Bikeshop",
y = "Total Sales (USD)"
) +
theme_minimal(base_size = 12)

5.2 Sales Concentration: How Much Do the Top Stores Contribute?

total_all <- sum(shop_sales$total_sales, na.rm = TRUE)
top5_total <- sum(shop_sales$total_sales[1:5], na.rm = TRUE)
top10_total <- sum(shop_sales$total_sales[1:10], na.rm = TRUE)

concentration <- tibble(
group = c("Top 5 Stores", "Top 10 Stores", "All Stores"),
total_sales = c(top5_total, top10_total, total_all),
share = c(top5_total/total_all, top10_total/total_all, 1)
)

concentration

Interpretation If the top stores represent a large share of total sales, management should treat them as critical revenue drivers and ensure operational excellence and sufficient inventory allocation.

6 Geographic Performance Analysis (State/City-Level)

This section analyzes sales distribution across locations to support decisions such as market expansion, regional marketing, and store network optimization.

6.1 Sales by State (Top 10)

state_sales <- sales %>%
group_by(state) %>%
summarise(total_sales = sum(total_price, na.rm = TRUE), .groups = "drop") %>%
arrange(desc(total_sales))

top10_states <- state_sales %>% slice_head(n = 10)

ggplot(top10_states, aes(x = reorder(state, total_sales), y = total_sales, fill = state)) +
geom_col(alpha = 0.9, show.legend = FALSE) +
coord_flip() +
scale_y_continuous(labels = scales::dollar) +
labs(
title = "Top 10 States by Total Sales",
subtitle = "Geographic concentration of revenue across states",
x = "State",
y = "Total Sales (USD)"
) +
theme_minimal(base_size = 12)

6.2 Sales by City (Top 10)

city_sales <- sales %>%
group_by(city) %>%
summarise(total_sales = sum(total_price, na.rm = TRUE), .groups = "drop") %>%
arrange(desc(total_sales))

top10_cities <- city_sales %>% slice_head(n = 10)

ggplot(top10_cities, aes(x = reorder(city, total_sales), y = total_sales, fill = city)) +
geom_col(alpha = 0.9, show.legend = FALSE) +
coord_flip() +
scale_y_continuous(labels = scales::dollar) +
labs(
title = "Top 10 Cities by Total Sales",
subtitle = "City-level revenue comparison (highest to lowest)",
x = "City",
y = "Total Sales (USD)"
) +
theme_minimal(base_size = 12)

7 Key Managerial Insights and Recommendations

7.1 Key Managerial Insights and Recommendations

  • Time-based performance:

Weekly sales show higher short-term variability, while monthly and quarterly trends provide more stable signals for tactical and strategic decision-making.

  • Store performance differences:

Total sales differ substantially across bikeshops, indicating unequal revenue contribution and varying store productivity.

  • Sales concentration:

Total sales differ substantially across bikeshops, indicating unequal revenue contribution and varying store productivity.

  • Geographic concentration:

Certain states and cities contribute disproportionately to total sales, suggesting strong regional markets and potential areas for expansion or increased marketing support.

  • Operational implications:

Variability in sales across time and location suggests the need for flexible inventory, staffing, and promotional planning.

7.2 Managerial Recommendations

  • Prioritize key revenue-driving bikeshops:

Allocate inventory, staffing, and marketing resources to high-performing stores to protect and grow core revenue streams.

  • Improve underperforming stores:

Investigate operational issues (assortment, staffing, local market conditions) and implement targeted interventions.

  • Use geographic insights for market planning:

Expand promotional efforts and market development in top-performing regions; consider strategic adjustments in low-performing regions.

  • Align operations with demand patterns:

Use monthly/quarterly trends to guide budgeting and forecasting, while using weekly monitoring to respond to short-term changes.

  • Monitor sales concentration risk:

If revenue is highly concentrated in a few stores/regions, develop risk-mitigation plans such as diversifying markets and strengthening mid-tier stores.