Project #1: Sales Data Analysis

Author

Elias Acton

Published

January 1, 2026

I. Define the Problem or Question [Step 1]

What is the purpose of the analysis?

To analyze retail sales transactions across multiple U.S. store cities to evaluate revenue, profit, products, promotions, and customer performance.

Who is this analysis for ?

Store leadership, finance, marketing, and operations teams.

Why is it important?

To identify top-performing products, cities, and promotions and improve profitability.

II. Detailed Analysis

How did you approach your analysis, what techniques did you use?

Data cleaning, grouping and summarizing (revenue & profit), bar charts, trend analysis, boxplots, correlation analysis, and Pareto evaluation.

Collect Data [Step 2]

Summary of key variables and what they mean

date – Transaction date

product_name – Item sold

store_city / store_region – Store location

category – Product category

promo_code – Promotion used

customer_type – New or returning customer

loyalty_member – Loyalty program status

line_revenue – Total revenue per transaction

gross_margin – Profit per transaction

gross_margin_pcnt – Profit percentage

Clear Environment

Load Libraries and Custom Functions

Load Data

Clean Data [Step 3]

Convert Data to Useful Definitions

Modify Data - Create Calculated Fields

We have unit level data (price and cost) and aggregate revenue (line_total) and profit (gross margin) but are missing aggregate cost and profit margin %.

  1. Group data by category and show sum of revenue, cost, gross margin, and average gmpcnt.
category sum_revenue sum_cost sum_gm sum_discount avg_discpcnt <- mean(discount_pct) avg_gmpcnt
Beverages 62918.667 35762.5 27156.167 6902.1225 0.05152812 0.4631132
Hardware 39238.472 19843.5 19394.972 3956.7675 0.04651442 0.5178189
Household 26915.810 14484.8 12431.010 2533.1800 0.04721670 0.4952461
Housewares 3860.955 2064.0 1796.955 437.3250 0.05116279 0.4908708
Personal 4195.057 1807.6 2387.457 404.6925 0.04734513 0.6026533
Snacks 23715.295 10956.6 12758.695 2259.4750 0.04853229 0.5526811
Stationery 2043.608 833.8 1209.808 227.1825 0.05416667 0.6559702

Group by Store and save in a variable. Create a table and save the table

product_name sum_revenue sum_cost sum_gm sum_discount avg_discpcnt <- mean(discount_pct) avg_gmpcnt
Almonds 200g 1813.7300 889.0 924.7300 215.7300 0.05952381 0.5315486
Ballpoint Pens 10ct 500.0775 148.8 351.2775 56.0625 0.06666667 0.7119710
Battery AA 12ct 1845.6525 828.0 1017.6525 222.2775 0.05147059 0.5760963
Chips Family Size 1101.5425 541.2 560.3425 125.9975 0.05897436 0.5290016
Desk Lamp 3860.9550 2064.0 1796.9550 437.3250 0.05116279 0.4908708
Dish Soap 750ml 960.8600 474.0 486.8600 103.2700 0.05444444 0.5269855
Espresso Pods 20ct 28379.8175 17909.5 10470.3175 3204.1125 0.05213333 0.3986853
Granola 500g 1103.6575 568.4 535.2575 112.3125 0.05000000 0.5053482
Green Tea 50ct 1186.5150 486.0 700.5150 107.8650 0.03750000 0.6084946
Hand Soap 500ml 442.8900 142.8 300.0900 31.9200 0.02812500 0.6896985
Laundry Detergent 2L 1821.8475 836.0 985.8475 152.6325 0.03552632 0.5595190
Luxury Chocolate 200g 19198.9875 8735.0 10463.9875 1747.5425 0.04465318 0.5616102
Notebook A5 410.9475 125.0 285.9475 25.3025 0.03593750 0.7017322
Orange Juice 2L 1174.3200 576.0 598.3200 83.8800 0.02553191 0.5289480
Paper Towels 6pk 2275.0200 1314.0 961.0200 241.2900 0.06527778 0.4384846
Phone Charger USB-C 3293.9400 1687.5 1606.4400 303.8100 0.04500000 0.5067528
Premium Coffee Beans 1kg 31181.5800 16389.0 14792.5800 3399.2100 0.05679012 0.4948390
Protein Bar 497.3775 223.0 274.3775 57.8925 0.05853659 0.5711423
Shampoo 400ml 1366.8900 569.6 797.2900 144.3300 0.04777778 0.6023675
Sparkling Water 12pk 996.4350 402.0 594.4350 107.0550 0.04500000 0.6164861
Stapler 1132.5825 560.0 572.5825 145.8175 0.05892857 0.5316697
Toilet Paper 12pk 21415.1925 11718.0 9697.1925 2004.0675 0.04744318 0.4723776
Toothpaste 100ml 458.0625 174.0 284.0625 47.9875 0.04571429 0.6380949
USB Flash Drive 128GB 34098.8800 17328.0 16770.8800 3430.6800 0.04623494 0.5135173
Vitamin C 120ct 2370.1050 1064.0 1306.1050 212.3750 0.04848485 0.5654537

Explore Data [Step 4]

Evaluate your overall data set using SKIMR

Here is a checklist of things to check on when evaluating data:

  1. Check the structure and metadata - Evaluate how many rows, columns, data-types, and so forth.

  2. Missing Data - Evaluate if your data is complete and identify missing data overall and, if so, particular patterns (e.g. “monday transaction data always is missing cost”)

  3. Data Quality & Consistency - What are the ranges of values and does the data make sense? Are data elements consistent - e.g. is “M”, “Male”, and “Men” values in the same field

  4. Relationships and Distributions - Evaluate the summary statistics - mean, std. deviation, min, max, etc. - and the distribution of the data. Is it normally distributed?

  5. Test of Reasonableness (TOR) - Does the data make sense in the context of the analysis?

SKIMR provides a quick way to look at all the key aspects of your data with a single command. The first window looks at your overall data set and then each subsequent window looks at the variables based on type (chr, factor, date, etc.).

Data summary
Name sales_data_clean
Number of rows 2500
Number of columns 23
_______________________
Column type frequency:
character 4
Date 1
factor 8
numeric 10
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
transaction_id 0 1 11 11 0 2500 0
transaction_datetime 0 1 19 19 0 2498 0
store_id 0 1 4 4 0 10 0
product_id 0 1 4 4 0 25 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
date 0 1 2025-05-01 2025-08-31 2025-07-02 123

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
store_city 0 1 FALSE 10 New: 622, Los: 502, Chi: 459, Dal: 177
store_region 0 1 FALSE 4 Wes: 843, Nor: 744, Mid: 459, Sou: 454
product_name 0 1 FALSE 25 Esp: 375, Toi: 352, Lux: 346, USB: 332
category 0 1 FALSE 7 Bev: 818, Sna: 511, Hou: 503, Har: 416
promo_code 0 1 FALSE 5 NON: 1403, SUM: 475, VIP: 282, WEE: 221
payment_method 0 1 FALSE 4 Cre: 1492, Cas: 403, Deb: 370, Mob: 235
customer_type 0 1 FALSE 2 Ret: 1396, New: 1104
loyalty_member 0 1 FALSE 2 Tru: 1279, Fal: 1221

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
unit_price 0 1 14.45 6.46 2.49 11.99 13.99 18.99 25.99 ▃▂▇▂▂
unit_cost 0 1 6.86 3.23 0.80 5.00 7.00 9.00 12.00 ▅▆▆▇▅
quantity 0 1 5.02 5.27 -30.00 1.00 3.00 5.00 30.00 ▁▁▇▂▁
discount_pct 0 1 0.05 0.06 0.00 0.00 0.00 0.10 0.20 ▇▁▂▂▁
gross_margin 0 1 30.85 34.39 -194.70 9.98 19.98 41.97 317.26 ▁▃▇▁▁
line_revenue 0 1 65.16 75.36 -449.70 15.99 37.98 85.45 617.26 ▁▁▇▁▁
line_cost 0 1 34.30 42.04 -255.00 8.50 18.00 45.00 300.00 ▁▁▇▁▁
gross_margin_pcnt 0 1 0.51 0.08 0.29 0.45 0.52 0.56 0.73 ▁▃▇▃▁
unit_discount 0 1 0.71 1.07 0.00 0.00 0.00 1.30 5.20 ▇▂▁▁▁
line_discount 0 1 6.69 12.22 0.00 0.00 0.00 9.50 129.95 ▇▁▁▁▁

This is not a fun step but is important to do early in your analysis. Remember the GIGO principle - garbage in = garbage out. If you have bad data, no amount of brilliant analysis will over come it.

Exploratory Analysis Table - Build Table to Focus Analysis (Dublin Table)

The table helps the analyst ensure all the main information is analyzed. For each categorical variable, evaluate each numeric value. Judgement should be applied and unnecessary or unrelated data can be eliminated but it is important to ensure it is a conscious decision.

The table below lists the combination of categorical and numeric variables that should be analyzed.

# A tibble: 10 × 9
   numeric_var       store_city store_region product_name category promo_code
   <chr>             <chr>      <chr>        <chr>        <chr>    <chr>     
 1 unit_price        " O "      " O "        " O "        " O "    " O "     
 2 unit_cost         " O "      " O "        " O "        " O "    " O "     
 3 quantity          " O "      " O "        " O "        " O "    " O "     
 4 discount_pct      " O "      " O "        " O "        " O "    " O "     
 5 gross_margin      " O "      " O "        " O "        " O "    " O "     
 6 line_revenue      " O "      " O "        " O "        " O "    " O "     
 7 line_cost         " O "      " O "        " O "        " O "    " O "     
 8 gross_margin_pcnt " O "      " O "        " O "        " O "    " O "     
 9 unit_discount     " O "      " O "        " O "        " O "    " O "     
10 line_discount     " O "      " O "        " O "        " O "    " O "     
# ℹ 3 more variables: payment_method <chr>, customer_type <chr>,
#   loyalty_member <chr>

Evaluate all numeric values to determine patterns in the data

Evaluate numeric values grouped by categorical values

1. Evaluate all numeric data by customer type

Findings:

  • New customers aren’t much different than existing

2. Evaluate all numeric data by store city location

Findings:

  • New York generates the highest total revenue (~$43.8K), followed by Los Angeles (~$31.1K) and Chicago (~$30.1K)

  • Profit follows the same pattern — top cities drive most gross margin

  • Large gap between top 3 cities and the rest, showing concentration risk and growth opportunity in lower performers.

3. Evaluate all numeric data by store region

4. Evaluate all numeric data by product category

Findings

  • Revenue and gross margin are highly concentrated in a small number of products.

  • Quantity and discount levels vary widely across products.

  • Gross margin % differs significantly by product, indicating pricing and cost structure differences.

Utilize bar charts to compare and rank numeric data by a category

1. Evaluate top 20 revenue generating products

Findings:

  • Top product is USB Flash Drive 128GB ($34,099).

  • Top 5 products (USB Flash Drive 128GB, Premium Coffee Beans 1kg, Espresso Pods 20ct, Toilet Paper 12pk, Luxury Chocolate 200g) generate the majority of total revenue.

2. Evaluate top 5 revenue generating stores

Findings

  • New York generates the most revenue ($43,772).

  • Los Angeles ($31,084) and Chicago ($30,113) are also major contributors.

  • There is a sharp drop after the top three cities (Dallas and Atlanta under $10K).

3. Evaluate top 25 most profitable products

Findings:

  • USB Flash Drive 128GB is the most profitable product ($16,771).

  • Premium Coffee Beans 1kg is second ($14,793).

  • Profit is heavily concentrated in the top 5 products.

4. Evaluate top 5 most profitable product categories

Findings:

  • Electronics and specialty food items dominate profitability.

  • USB Flash Drive 128GB and Premium Coffee Beans 1kg drive category-level profit.

  • Lower-priced household staples contribute revenue but significantly less profit.

Perform a Trend Analysis

Perform a time series analysis. The key value of trend analysis is to see change in values over time (can be a date, age of a person, how long people are in line, etc.).

  1. Evaluate monthly trends of revenue by month

Findings

  • Recurring monthly peaks and dips across nearly all cities.

  • Steady upward momentum over the two‑year period, indicating growing demand.

  • Some markets show sharp monthly swings while others remain stable.

  1. Evaluate monthly trends of revenue by month by product category

  1. Evaluate monthly trend by month and any other category

  1. Look at Gross Margin $ by a key categorical variable

Utilize Box Plots to Evaluate Distribution of Revenue by City by Promotion Code

Evaluate boxplot by revenue by product category by loyalty customer type (New or Returning)

Correlation - Evaluate Relationship of Numeric Values

  1. Evaluate relationship of all numeric values in a correlation matrix

  1. Evaluate Correlation of Key Numeric Values

[1] -0.01875503
  1. Evaluate correlation between discount level and quantity sold

Distributions - Evaluate Spread of Values

1. Evaluate Distribution of Revenue Data Points

  1. Evaluate Distribution of Unit Cost by Category

  1. Evaluate Distribution of Quantity Sold By Product Category

Pareto - Identify Relationship of Inputs to Outputs (“80/20 Rule”)

  1. Perform a Pareto Analysis of Revenue by Product

  1. Perform a pareto analysis of revenue and store city location

  1. Perform a pareto analysis of revenue and Category location

Build A Report to Evaluate Impact of Promotions

  1. Evaluate Impact of Promotions on Qty Sold, Unit Price, and Profit
Promo Performance by Product
Sales, Revenue, Profitability, and Discounts
Transactions Total Qty Sold Avg Unit Price Total Revenue Revenue / Unit Total Profit Profit / Unit Avg Discount % Avg Gross Margin Avg GM %
Almonds 200g
CLEARANCE20 3 40 $7.99 $255.68 $6.39 $115.68 $2.89 20.0% $38.56 45.2%
NONE 21 44 $7.99 $351.56 $7.99 $197.56 $4.49 0.0% $9.41 56.2%
SUMMER10 6 50 $7.99 $359.55 $7.19 $184.55 $3.69 10.0% $30.76 51.3%
VIP15 7 80 $7.99 $543.32 $6.79 $263.32 $3.29 15.0% $37.62 48.5%
WEEKEND5 5 40 $7.99 $303.62 $7.59 $163.62 $4.09 5.0% $32.72 53.9%
Ballpoint Pens 10ct
CLEARANCE20 1 5 $2.99 $11.96 $2.39 $7.96 $1.59 20.0% $7.96 66.6%
NONE 14 21 $2.99 $62.79 $2.99 $45.99 $2.19 0.0% $3.29 73.2%
SUMMER10 11 75 $2.99 $200.33 $2.67 $140.33 $1.87 9.1% $12.76 70.5%
VIP15 7 55 $2.99 $139.78 $2.54 $95.78 $1.74 15.0% $13.68 68.5%
WEEKEND5 3 30 $2.99 $85.22 $2.84 $61.22 $2.04 5.0% $20.41 71.8%
Battery AA 12ct
CLEARANCE20 1 25 $9.99 $199.80 $7.99 $99.80 $3.99 20.0% $99.80 49.9%
NONE 17 27 $9.99 $269.73 $9.99 $161.73 $5.99 0.0% $9.51 60.0%
SUMMER10 10 105 $9.99 $944.05 $8.99 $524.05 $4.99 10.0% $52.41 55.5%
VIP15 3 40 $9.99 $339.66 $8.49 $179.66 $4.49 15.0% $59.89 52.9%
WEEKEND5 3 10 $9.99 $92.41 $9.24 $52.41 $5.24 3.3% $17.47 58.6%
Chips Family Size
CLEARANCE20 2 10 $4.99 $39.92 $3.99 $17.92 $1.79 20.0% $8.96 44.9%
NONE 20 41 $4.99 $204.59 $4.99 $114.39 $2.79 0.0% $5.72 55.9%
SUMMER10 9 100 $4.99 $449.10 $4.49 $229.10 $2.29 10.0% $25.46 51.0%
VIP15 6 85 $4.99 $360.53 $4.24 $173.53 $2.04 15.0% $28.92 48.1%
WEEKEND5 2 10 $4.99 $47.41 $4.74 $25.41 $2.54 5.0% $12.70 53.6%
Desk Lamp
CLEARANCE20 5 45 $24.99 $899.64 $19.99 $359.64 $7.99 20.0% $71.93 40.0%
NONE 25 37 $24.99 $924.63 $24.99 $480.63 $12.99 0.0% $19.23 52.0%
SUMMER10 7 50 $24.99 $1,124.55 $22.49 $524.55 $10.49 10.0% $74.94 46.6%
VIP15 2 15 $24.99 $318.62 $21.24 $138.62 $9.24 15.0% $69.31 43.5%
WEEKEND5 4 25 $24.99 $593.51 $23.74 $293.51 $11.74 5.0% $73.38 49.5%
Dish Soap 750ml
CLEARANCE20 1 10 $4.49 $35.92 $3.59 $15.92 $1.59 20.0% $15.92 44.3%
NONE 23 32 $4.49 $143.68 $4.49 $79.68 $2.49 0.0% $3.46 55.5%
SUMMER10 16 145 $4.49 $585.95 $4.04 $295.95 $2.04 10.0% $18.50 50.5%
VIP15 4 40 $4.49 $152.66 $3.82 $72.66 $1.82 15.0% $18.16 47.6%
WEEKEND5 1 10 $4.49 $42.66 $4.27 $22.66 $2.27 5.0% $22.66 53.1%
Espresso Pods 20ct
CLEARANCE20 20 295 $14.99 $3,537.64 $11.99 $1,030.14 $3.49 20.0% $51.51 29.1%
NONE 207 392 $14.99 $5,876.08 $14.99 $2,544.08 $6.49 0.0% $12.29 43.3%
SUMMER10 69 660 $14.99 $8,904.06 $13.49 $3,294.06 $4.99 10.0% $47.74 37.0%
VIP15 50 455 $14.99 $5,718.69 $12.57 $1,851.19 $4.07 14.4% $37.02 33.7%
WEEKEND5 29 305 $14.99 $4,343.35 $14.24 $1,750.85 $5.74 5.0% $60.37 40.3%
Granola 500g
CLEARANCE20 3 20 $5.99 $95.84 $4.79 $39.84 $1.99 20.0% $13.28 41.6%
NONE 25 48 $5.99 $287.52 $5.99 $153.12 $3.19 0.0% $6.12 53.3%
SUMMER10 8 70 $5.99 $377.37 $5.39 $181.37 $2.59 10.0% $22.67 48.1%
VIP15 4 45 $5.99 $229.12 $5.09 $103.12 $2.29 15.0% $25.78 45.0%
WEEKEND5 3 20 $5.99 $113.81 $5.69 $57.81 $2.89 5.0% $19.27 50.8%
Green Tea 50ct
NONE 21 42 $7.99 $335.58 $7.99 $209.58 $4.99 0.0% $9.98 62.5%
SUMMER10 5 60 $7.99 $431.46 $7.19 $251.46 $4.19 10.0% $50.29 58.3%
VIP15 4 45 $7.99 $305.62 $6.79 $170.62 $3.79 15.0% $42.65 55.8%
WEEKEND5 2 15 $7.99 $113.86 $7.59 $68.86 $4.59 5.0% $34.43 60.5%
Hand Soap 500ml
CLEARANCE20 1 5 $3.99 $15.96 $3.19 $9.96 $1.99 20.0% $9.96 62.4%
NONE 21 39 $3.99 $155.61 $3.99 $108.81 $2.79 0.0% $5.18 69.9%
SUMMER10 6 55 $3.99 $195.51 $3.55 $129.51 $2.35 8.3% $21.59 67.1%
WEEKEND5 4 20 $3.99 $75.81 $3.79 $51.81 $2.59 5.0% $12.95 68.3%
Laundry Detergent 2L
CLEARANCE20 1 10 $12.99 $103.92 $10.39 $48.92 $4.89 20.0% $48.92 47.1%
NONE 23 37 $12.99 $480.63 $12.99 $277.13 $7.49 0.0% $12.05 57.7%
SUMMER10 6 40 $12.99 $467.64 $11.69 $247.64 $6.19 10.0% $41.27 53.0%
VIP15 3 10 $12.99 $90.93 $9.09 $35.93 $3.59 10.0% $11.98 52.7%
WEEKEND5 5 55 $12.99 $678.73 $12.34 $376.23 $6.84 5.0% $75.25 55.4%
Luxury Chocolate 200g
CLEARANCE20 11 80 $11.99 $767.36 $9.59 $367.36 $4.59 20.0% $33.40 47.9%
NONE 203 382 $11.99 $4,580.18 $11.99 $2,670.18 $6.99 0.0% $13.15 58.3%
SUMMER10 61 610 $11.99 $6,570.52 $10.77 $3,520.52 $5.77 9.8% $57.71 53.7%
VIP15 37 340 $11.99 $3,465.11 $10.19 $1,765.11 $5.19 15.0% $47.71 50.9%
WEEKEND5 34 335 $11.99 $3,815.82 $11.39 $2,140.82 $6.39 5.0% $62.97 56.1%
Notebook A5
CLEARANCE20 1 5 $3.49 $13.96 $2.79 $8.96 $1.79 20.0% $8.96 64.2%
NONE 20 40 $3.49 $139.60 $3.49 $99.60 $2.49 0.0% $4.98 71.3%
SUMMER10 4 25 $3.49 $78.53 $3.14 $53.53 $2.14 10.0% $13.38 68.2%
VIP15 2 10 $3.49 $29.66 $2.97 $19.66 $1.97 15.0% $9.83 66.3%
WEEKEND5 5 45 $3.49 $149.20 $3.32 $104.20 $2.32 5.0% $20.84 69.8%
Orange Juice 2L
NONE 35 55 $6.99 $384.45 $6.99 $208.45 $3.79 0.0% $5.96 54.2%
SUMMER10 4 45 $6.99 $283.10 $6.29 $139.10 $3.09 10.0% $34.77 49.1%
VIP15 4 35 $6.99 $207.95 $5.94 $95.95 $2.74 15.0% $23.99 46.1%
WEEKEND5 4 45 $6.99 $298.82 $6.64 $154.82 $3.44 5.0% $38.71 51.8%
Paper Towels 6pk
CLEARANCE20 2 10 $11.49 $91.92 $9.19 $31.92 $3.19 20.0% $15.96 34.7%
NONE 15 29 $11.49 $333.21 $11.49 $159.21 $5.49 0.0% $10.61 47.8%
SUMMER10 8 90 $11.49 $930.69 $10.34 $390.69 $4.34 10.0% $48.84 42.0%
VIP15 6 55 $11.49 $537.16 $9.77 $207.16 $3.77 15.0% $34.53 38.6%
WEEKEND5 5 35 $11.49 $382.04 $10.92 $172.04 $4.92 5.0% $34.41 45.0%
Phone Charger USB-C
CLEARANCE20 2 40 $15.99 $511.68 $12.79 $211.68 $5.29 20.0% $105.84 41.4%
NONE 29 65 $15.99 $1,039.35 $15.99 $551.85 $8.49 0.0% $19.03 53.1%
SUMMER10 10 60 $15.99 $863.46 $14.39 $413.46 $6.89 10.0% $41.35 47.9%
VIP15 4 20 $15.99 $271.83 $13.59 $121.83 $6.09 15.0% $30.46 44.8%
WEEKEND5 5 40 $15.99 $607.62 $15.19 $307.62 $7.69 5.0% $61.52 50.6%
Premium Coffee Beans 1kg
CLEARANCE20 22 250 $18.99 $3,798.00 $15.19 $1,548.00 $6.19 20.0% $70.36 40.8%
NONE 165 296 $18.99 $5,621.04 $18.99 $2,957.04 $9.99 0.0% $17.92 52.6%
SUMMER10 63 555 $18.99 $9,485.50 $17.09 $4,490.50 $8.09 10.0% $71.28 47.3%
VIP15 40 375 $18.99 $6,053.06 $16.14 $2,678.06 $7.14 15.0% $66.95 44.2%
WEEKEND5 34 345 $18.99 $6,223.97 $18.04 $3,118.97 $9.04 5.0% $91.73 50.1%
Protein Bar
CLEARANCE20 2 20 $2.49 $39.84 $1.99 $19.84 $0.99 20.0% $9.92 49.8%
NONE 20 48 $2.49 $119.52 $2.49 $71.52 $1.49 0.0% $3.58 59.8%
SUMMER10 8 60 $2.49 $134.46 $2.24 $74.46 $1.24 10.0% $9.31 55.4%
VIP15 8 70 $2.49 $144.42 $2.06 $74.42 $1.06 13.1% $9.30 53.6%
WEEKEND5 3 25 $2.49 $59.14 $2.37 $34.14 $1.37 5.0% $11.38 57.7%
Shampoo 400ml
CLEARANCE20 1 10 $8.49 $67.92 $6.79 $35.92 $3.59 20.0% $35.92 52.9%
NONE 26 33 $8.49 $280.17 $8.49 $174.57 $5.29 0.0% $6.71 62.3%
SUMMER10 7 55 $8.49 $420.25 $7.64 $244.25 $4.44 10.0% $34.89 58.1%
VIP15 7 55 $8.49 $396.91 $7.22 $220.91 $4.02 15.0% $31.56 55.7%
WEEKEND5 4 25 $8.49 $201.64 $8.07 $121.64 $4.87 5.0% $30.41 60.3%
Sparkling Water 12pk
CLEARANCE20 3 30 $5.49 $131.76 $4.39 $71.76 $2.39 20.0% $23.92 54.5%
NONE 26 51 $5.49 $279.99 $5.49 $177.99 $3.49 0.0% $6.85 63.6%
SUMMER10 7 80 $5.49 $395.28 $4.94 $235.28 $2.94 10.0% $33.61 59.5%
VIP15 3 35 $5.49 $163.33 $4.67 $93.33 $2.67 15.0% $31.11 57.1%
WEEKEND5 1 5 $5.49 $26.08 $5.22 $16.08 $3.22 5.0% $16.08 61.7%
Stapler
CLEARANCE20 4 55 $7.99 $351.56 $6.39 $159.06 $2.89 20.0% $39.77 45.2%
NONE 14 20 $7.99 $159.80 $7.99 $89.80 $4.49 0.0% $6.41 56.2%
SUMMER10 7 60 $7.99 $431.46 $7.19 $221.46 $3.69 10.0% $31.64 51.3%
WEEKEND5 3 25 $7.99 $189.76 $7.59 $102.26 $4.09 5.0% $34.09 53.9%
Toilet Paper 12pk
CLEARANCE20 14 95 $13.99 $1,049.25 $11.04 $384.25 $4.04 18.6% $27.45 38.3%
NONE 199 369 $13.99 $5,162.31 $13.99 $2,579.31 $6.99 0.0% $12.96 50.0%
SUMMER10 77 625 $13.99 $7,841.40 $12.55 $3,466.40 $5.55 9.7% $45.02 44.5%
VIP15 35 295 $13.99 $3,507.99 $11.89 $1,442.99 $4.89 15.0% $41.23 41.1%
WEEKEND5 27 290 $13.99 $3,854.24 $13.29 $1,824.24 $6.29 5.0% $67.56 47.3%
Toothpaste 100ml
CLEARANCE20 1 15 $3.49 $41.88 $2.79 $23.88 $1.59 20.0% $23.88 57.0%
NONE 21 30 $3.49 $104.70 $3.49 $68.70 $2.29 0.0% $3.27 65.6%
SUMMER10 7 45 $3.49 $141.35 $3.14 $87.35 $1.94 10.0% $12.48 61.8%
VIP15 4 35 $3.49 $103.83 $2.97 $61.83 $1.77 15.0% $15.46 59.5%
WEEKEND5 2 20 $3.49 $66.31 $3.32 $42.31 $2.12 5.0% $21.16 63.8%
USB Flash Drive 128GB
CLEARANCE20 18 130 $25.99 $2,676.97 $20.59 $1,116.97 $8.59 18.9% $62.05 42.9%
NONE 197 334 $25.99 $8,680.66 $25.99 $4,672.66 $13.99 0.0% $23.72 53.8%
SUMMER10 51 410 $25.99 $9,577.31 $23.36 $4,657.31 $11.36 9.8% $91.32 48.8%
VIP15 37 345 $25.99 $7,621.57 $22.09 $3,481.57 $10.09 15.0% $94.10 45.7%
WEEKEND5 29 225 $25.99 $5,542.37 $24.63 $2,842.37 $12.63 4.8% $98.01 51.5%
Vitamin C 120ct
NONE 16 27 $16.99 $458.73 $16.99 $269.73 $9.99 0.0% $16.86 58.8%
SUMMER10 8 80 $16.99 $1,223.28 $15.29 $663.28 $8.29 10.0% $82.91 54.2%
VIP15 5 15 $16.99 $203.88 $13.59 $98.88 $6.59 12.0% $19.78 53.0%
WEEKEND5 4 30 $16.99 $484.22 $16.14 $274.22 $9.14 5.0% $68.55 56.6%
  1. Evaluate Relationship of Promotions to Quantity Sold For All Proucts - Visually

Visualizing complex relationships is a powerful way to derive meaning from data.

  1. Now evaluate revenue by Promotion Code Over Time

  1. Now evaluate Revenue by Product by City Over Time

  1. Now evaluate revenue by Product by City

Analyze Data [Step 5]

The exploration phase of the analytic process is intended to expansive and generative, asking “what does the data tell me?” to the analysis phase, which is more focused on outcome in relation to the objective, and asks “what does the data mean?”.

It is a subtle but important mind shift.

Analysis 1: Analyze Sales Data

Question: What was overall sales revenue across all stores?

Our total sales revenue was $162,888.

Question: What are the best selling products?

The best selling product is USB Flash Drive 128GB and brought in 0

Display a nice table of the best selling 15 products:

product_name sum_revenue
USB Flash Drive 128GB $34,098.88
Premium Coffee Beans 1kg $31,181.58
Espresso Pods 20ct $28,379.82
Toilet Paper 12pk $21,415.19
Luxury Chocolate 200g $19,198.99
Desk Lamp $3,860.96
Phone Charger USB-C $3,293.94
Vitamin C 120ct $2,370.10
Paper Towels 6pk $2,275.02
Battery AA 12ct $1,845.65
Laundry Detergent 2L $1,821.85
Almonds 200g $1,813.73
Shampoo 400ml $1,366.89
Green Tea 50ct $1,186.52
Orange Juice 2L $1,174.32

Analysis 2: Analyze Profit Data

Question: What is the overall profitability (Revenue - Cost)?

[1] "$77,135.06"

Our total profit was $77,135.06

Question: Which Products are the most profitable?

product_name sum_gm
USB Flash Drive 128GB $16,770.88
Premium Coffee Beans 1kg $14,792.58
Espresso Pods 20ct $10,470.32
Luxury Chocolate 200g $10,463.99
Toilet Paper 12pk $9,697.19
Desk Lamp $1,796.96
Phone Charger USB-C $1,606.44
Vitamin C 120ct $1,306.10
Battery AA 12ct $1,017.65
Laundry Detergent 2L $985.85
Paper Towels 6pk $961.02
Almonds 200g $924.73
Shampoo 400ml $797.29
Green Tea 50ct $700.52
Orange Juice 2L $598.32

The most profitable product is USB Flash Drive 128GB and the total profit is $16,770.88

Here are the top 15 most profitable items:

Analysis 3: Analyze Location Data

Question: What locations generated the most revenue?

Question: What locations generated the most profit?

Analysis 4: Analyze Customer Type & Loyalty Data

Question: What customer types generated the most revenue?

Question: What customer type generated the most profit?

Question: Were Loyalty Programs Effective? Look at Revenue & Profit

loyalty_member sum_revenue sum_gm avg_gmpcnt
False 82977.25 39225.45 0.5111213
True 79910.62 37909.62 0.5112349

Analysis 5: Analyze Product Data

Question: What products generated the most revenue?

Question: What products generated the most profit?

Question: What product is the highest REVENUE generating overall?

Question: What product is the highest PROFIT (gross margin) generating overall?

Question: Were the promotions effective? If so, which were effective?

Analysis 6: Evaluate the Revenue Trend Over the Year (aggregate sales)

Question: Is sales revenue trending UP or DOWN?

Question: How is sales revenue trending UP or DOWN by Location?

Question: How are top products trending by Location?

Question: Which location is trending the best in terms of profit?

Question: Which product category is the trending the best in terms of profit?

III. Executive Summary

Write this last. This section should be concise enough that an executive can read it in 60–90 seconds. Use the numbers computed in the analysis below (inline values) and keep the narrative focused on decisions.

At-a-glance

  • Total revenue: $162,888
  • Total profit (gross margin): $77,135.06
  • Top product by revenue: USB Flash Drive 128GB ($34,098.88)
  • Top product by profit: USB Flash Drive 128GB ($16,770.88)
  • Top location by revenue: (0)

What this means (2–4 bullets)

Overall sales revenue across all stores was total_revenue → shows total performance across all stores.

The best selling product is top_rev_product and brought in top_rev_product_rev → highest revenue generating product overall.

The most profitable product is most_profitable_product_name and the total profit is most_profitable_product_margin → highest gross margin generating product.

The location generating the most revenue is top_city → top revenue generating location. ## Recommended actions (2–4 bullets)

Focus on top_rev_product — increase sales volume — Sales Team

Prioritize most_profitable_product_name — maximize gross margin — Finance & Product Team

Support top_city — allocate resources to top revenue generating location — Operations

Evaluate promotion and loyalty performance — improve revenue and gross margin — Marketing

IV. Communicate Findings [Step 6]

This is the evidence section. It should connect directly back to the question/problem in Step 1 and point to the specific tables/plots generated in Step 4–5.

Key findings

  1. Revenue is concentrated in a few products and cities

Evidence: Top 15 Revenue Products table; Revenue by Store City bar chart

Interpretation: r top_rev_product generated r top_rev_product_rev, and a small number of cities drive most sales.

  1. Highest revenue ≠ highest profit

Evidence: Top Revenue vs. Top Profit product tables

Interpretation: r most_profitable_product_name produced the most profit (r most_profitable_product_margin), showing margin matters as much as volume.

Notes, assumptions, and limitations

Data may contain outliers affecting averages.

Trends may reflect seasonality or promo timing.

Further analysis could test promo effectiveness and margin optimization.

V. Take Action [Step 7]

Turn the findings into an action plan that someone can execute.

Actions to take now (0–30 days)

  • Increase inventory and marketing support for r top_rev_product — Owner: Sales Manager — Metric: +10% revenue lift — Due: 30 days

  • Prioritize margin focus on r most_profitable_product_name — Owner: Finance & Product Team — Metric: Maintain or improve GM% — Due: 30 days

Actions to take next (30–90 days)

  • Optimize pricing strategy on top 10 products — Owner: Finance — Metric: +2–3% overall gross margin — Due: 60 days

  • Redesign underperforming promotions — Owner: Marketing — Metric: Revenue growth without GM% decline — Due: 75 days

Follow-up analysis to reduce uncertainty

  • Compare promoted vs. non-promoted periods for the same product to measure incremental revenue and profit, not just total sales.

  • Track new vs. returning customers over time to measure repeat rate and revenue per customer.