Executive Summary

3 numbers every stakeholder needs before reading further.

Metric Value What it means
Overall dissatisfaction rate 16.5% of reviewed orders score โ‰ค 2 1 in ~6 customers is unhappy
On-time delivery, still dissatisfied 11.7% of orders delivered early/on-time still score โ‰ค 2 Delivery is not the dominant root cause
Return/refund complaints Highest dissatisfaction intensity at ~79% Process pain amplifies product defect impact

Analytical flow: Category Risk โ†’ Root-Cause Split (delivery vs product) โ†’ Text Evidence โ†’ Priority Matrix โ†’ Recommendations.


1. Data & Base Table

1.1 Construction

# All joins already executed above in exec_summary chunk.
# This section shows the logic for transparency.

base_tbl <- order_items %>%
  left_join(product_lookup, by = "product_id") %>%   # category label
  left_join(orders_clean,  by = "order_id") %>%      # delay info
  left_join(reviews_clean, by = "order_id") %>%      # score + comment
  mutate(category_en = replace_na(category_en, "unknown"))

Grain: one row = one order-item. Reviews are aggregated at order level (mean score when multiple reviews exist) then joined.

1.2 Data Quality Check โ€” Score-Text Mismatch

Before any analysis, we must know whether numeric scores and written comments agree. A high mismatch means we cannot rely on one signal alone.

negative_themes_vec <- c(
  "Product defect/low quality", "Wrong item/mismatch",
  "Packaging issue",            "Delivery delay/not arrived",
  "Seller/service issue",       "Return/refund process",
  "Price/freight concern"
)

# classify comments (same keywords used in section 3)
classify_theme <- function(msg) {
  case_when(
    str_detect(msg, "quebrad|amassad|defeit|estrag|baixa qualidade|pessima qualidade|nao funciona|ruim") ~ "Product defect/low quality",
    str_detect(msg, "diferente|nao corresponde|nao e o que pedi|errad|cor errada|tamanho errado")        ~ "Wrong item/mismatch",
    str_detect(msg, "embalag|caixa|pacote|mal embalado|avariado")                                       ~ "Packaging issue",
    str_detect(msg, "atras|demor|prazo|entrega|tarde|nao chegou|nao recebi")                            ~ "Delivery delay/not arrived",
    str_detect(msg, "atendimento|suporte|resposta|vendedor|sac")                                        ~ "Seller/service issue",
    str_detect(msg, "cancel|estorno|reembolso|devolu|troca")                                            ~ "Return/refund process",
    str_detect(msg, "preco|caro|valor|frete")                                                           ~ "Price/freight concern",
    TRUE                                                                                                 ~ "Other/unclear"
  )
}

comment_tbl <- base_tbl %>%
  filter(!is.na(review_comment_message), category_en != "unknown") %>%
  mutate(
    msg   = review_comment_message %>%
              iconv(from = "", to = "ASCII//TRANSLIT") %>%
              str_to_lower() %>%
              str_replace_all("[^a-z0-9 ]", " ") %>%
              str_squish(),
    theme = classify_theme(msg)
  )

mismatch_tbl <- comment_tbl %>%
  mutate(
    negative_theme             = theme %in% negative_themes_vec,
    high_score_negative_text   = if_else(review_band == "High (4โ€“5)" & negative_theme,   1L, 0L),
    low_score_nonnegative_text = if_else(review_band == "Low (1โ€“2)"  & !negative_theme,  1L, 0L)
  ) %>%
  summarise(
    `Total comments with text`          = n(),
    `High score + negative text (rate)` = percent(mean(high_score_negative_text),   0.1),
    `Low score + non-negative text (rate)` = percent(mean(low_score_nonnegative_text), 0.1)
  )

kable(mismatch_tbl, align = "lcc") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Total comments with text High score + negative text (rate) Low score + non-negative text (rate)
47672 26.8% 15.1%

Implication: ~27% of happy-scored orders contain complaint language, and ~15% of low-scored orders have non-negative text. We must use both numeric scores and text themes together โ€” neither alone is sufficient. This is why Section 3 (text) cross-validates Section 2 (scores).


2. Category Risk Ranking

Question: Which categories carry the most product-quality pain?

Method: Dissatisfaction rate (score โ‰ค 2) ร— volume. Only categories with โ‰ฅ 100 reviewed orders are included (statistical stability).

MIN_ORDERS <- 100

category_core <- base_tbl %>%
  filter(!is.na(review_score), category_en != "unknown") %>%
  group_by(category_en) %>%
  summarise(
    reviewed_orders      = n_distinct(order_id),
    avg_review           = mean(review_score,  na.rm = TRUE),
    dissatisfaction_rate = mean(dissatisfied,   na.rm = TRUE),
    delay_rate           = mean(delayed,        na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(reviewed_orders >= MIN_ORDERS) %>%
  # risk score: 60% dissatisfaction intensity + 40% volume impact
  mutate(
    z_dissat  = as.numeric(scale(dissatisfaction_rate)),
    z_vol     = as.numeric(scale(log1p(reviewed_orders))),
    risk_score = 0.60 * z_dissat + 0.40 * z_vol
  ) %>%
  arrange(desc(risk_score))

# top 15 for chart
top15 <- category_core %>% slice_head(n = 15)

ggplot(top15,
       aes(x = reorder(category_en, dissatisfaction_rate),
           y = dissatisfaction_rate,
           fill = reviewed_orders)) +
  geom_col(width = 0.7) +
  geom_text(aes(label = percent(dissatisfaction_rate, 0.1)),
            hjust = -0.15, size = 3.2, colour = "grey30") +
  coord_flip() +
  scale_y_continuous(labels = percent_format(0.1),
                     expand = expansion(mult = c(0, 0.12))) +
  scale_fill_gradient(low = CLR_ORANGE, high = CLR_RED,
                      labels = comma_format()) +
  labs(
    title    = "Top 15 Categories by Dissatisfaction Rate",
    subtitle = paste0("Categories with โ‰ฅ ", MIN_ORDERS, " reviewed orders | Colour = order volume"),
    x = NULL, y = "Dissatisfaction Rate (score โ‰ค 2)",
    fill = "Reviewed orders",
    caption = "Risk score = 60% dissatisfaction rate + 40% log(volume)"
  ) +
  theme_olist()

top15 %>%
  transmute(
    Category             = category_en,
    `Reviewed Orders`    = comma(reviewed_orders),
    `Avg Score`          = round(avg_review, 2),
    `Dissatisfaction %`  = percent(dissatisfaction_rate, 0.1),
    `Delay Rate %`       = percent(delay_rate, 0.1),
    `Risk Score`         = round(risk_score, 2)
  ) %>%
  kable(align = "lrrrrr") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  row_spec(1:3, bold = TRUE, background = "#fff3cd")
Category Reviewed Orders Avg Score Dissatisfaction % Delay Rate % Risk Score
office_furniture 1,273 3.48 26.3% 8.8% 1.66
fashion_male_clothing 112 3.62 28.8% 5.3% 1.33
bed_bath_table 9,417 3.87 19.4% 8.3% 1.19
furniture_decor 6,449 3.89 19.7% 8.3% 1.13
fixed_telephony 217 3.67 26.1% 4.2% 1.12
computers_accessories 6,689 3.92 18.8% 7.6% 1.00
watches_gifts 5,624 4.00 16.7% 8.1% 0.63
audio 350 3.81 22.0% 12.6% 0.63
telephony 4,199 3.93 17.2% 8.1% 0.63
housewares 5,884 4.04 16.0% 6.3% 0.55
baby 2,885 3.99 17.2% 8.5% 0.52
sports_leisure 7,720 4.09 15.0% 7.2% 0.47
garden_tools 3,518 4.03 16.3% 7.8% 0.44
auto 3,897 4.04 15.9% 8.1% 0.42
home_confort 397 3.83 20.3% 10.1% 0.41

Finding: fashion_male_clothing has the highest dissatisfaction rate (28.8%), but bed_bath_table and furniture_decor rank higher on risk score because their volume (9,417 and 6,449 orders) amplifies the business impact. These three are the priority investigation targets.


3. Root-Cause Split โ€” Delivery vs Product

Question: Is dissatisfaction caused by late delivery, or by intrinsic product issues?

Logic: If a category has high dissatisfaction even for orders delivered on-time or early, the root cause is product quality, not logistics.

top12_cats <- category_core %>% slice_head(n = 12) %>% pull(category_en)

delay_split <- base_tbl %>%
  filter(!is.na(review_score), category_en %in% top12_cats, !is.na(delay_days)) %>%
  mutate(
    delivery_segment = case_when(
      delay_days <= 0              ~ "On-time / Early",
      delay_days <= 3              ~ "Late 1โ€“3 days",
      delay_days <= 7              ~ "Late 4โ€“7 days",
      TRUE                         ~ "Late > 7 days"
    ),
    delivery_segment = factor(delivery_segment,
      levels = c("On-time / Early", "Late 1โ€“3 days", "Late 4โ€“7 days", "Late > 7 days"))
  ) %>%
  group_by(category_en, delivery_segment) %>%
  summarise(
    n_orders             = n_distinct(order_id),
    dissatisfaction_rate = mean(dissatisfied, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(n_orders >= 30)   # only show cells with enough data

ggplot(delay_split,
       aes(x = delivery_segment,
           y = reorder(category_en, dissatisfaction_rate),
           fill = dissatisfaction_rate)) +
  geom_tile(colour = "white", size = 0.8) +
  geom_text(aes(label = if_else(n_orders >= 30,
                                percent(dissatisfaction_rate, 1), "")),
            size = 3, colour = "white", fontface = "bold") +
  scale_fill_gradient2(low = CLR_GREEN, mid = CLR_ORANGE, high = CLR_RED,
                       midpoint = 0.20,
                       labels = percent_format(1)) +
  labs(
    title    = "Dissatisfaction Rate by Delivery Timing",
    subtitle = "Cells with < 30 orders hidden | % shown in white = dissatisfaction rate",
    x = "Delivery Segment", y = NULL,
    fill = "Dissatisfaction",
    caption = "Key: High dissatisfaction in 'On-time/Early' column โ†’ product issue, not logistics"
  ) +
  theme_olist() +
  theme(axis.text.x = element_text(angle = 20, hjust = 1))

# Summarise on-time dissatisfaction per category for a clear verdict table
verdict_tbl <- base_tbl %>%
  filter(!is.na(dissatisfied), category_en %in% top12_cats, !is.na(delay_days)) %>%
  mutate(
    ontime = delay_days <= 0
  ) %>%
  group_by(category_en, ontime) %>%
  summarise(dissat = mean(dissatisfied), n = n_distinct(order_id), .groups = "drop") %>%
  pivot_wider(names_from = ontime,
              values_from = c(dissat, n),
              names_prefix = "") %>%
  rename(
    ontime_dissat  = `dissat_TRUE`,
    late_dissat    = `dissat_FALSE`,
    n_ontime       = `n_TRUE`,
    n_late         = `n_FALSE`
  ) %>%
  mutate(
    delta         = late_dissat - ontime_dissat,
    root_cause    = case_when(
      ontime_dissat >= 0.15 & delta < 0.15 ~ "๐Ÿ”ด Product-driven",
      ontime_dissat >= 0.15 & delta >= 0.15 ~ "๐ŸŸ  Mixed (Product + Ops)",
      ontime_dissat < 0.15  & delta >= 0.15 ~ "๐ŸŸก Ops/Delivery-driven",
      TRUE                                   ~ "๐ŸŸข Low risk"
    )
  ) %>%
  arrange(desc(ontime_dissat))

verdict_tbl %>%
  transmute(
    Category              = category_en,
    `On-time Dissatisfied` = percent(ontime_dissat, 0.1),
    `Late Dissatisfied`    = percent(late_dissat, 0.1),
    `Gap (Late โ€“ On-time)` = percent(delta, 0.1),
    `Root Cause`           = root_cause
  ) %>%
  kable(align = "lrrrr") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  row_spec(which(str_detect(verdict_tbl$root_cause, "Product")), background = "#fde8e8")
Category On-time Dissatisfied Late Dissatisfied Gap (Late โ€“ On-time) Root Cause
office_furniture 22.6% 57.0% 34.5% ๐ŸŸ  Mixed (Product + Ops)|
fashion_male_clothing 22.0% 85.7% 63.7% ๐ŸŸ  Mixed (Product + Ops)|
fixed_telephony 21.3% 72.7% 51.4% ๐ŸŸ  Mixed (Product + Ops)|
furniture_decor 15.2% 53.6% 38.5% ๐ŸŸ  Mixed (Product + Ops)|
bed_bath_table 15.2% 56.8% 41.7% ๐ŸŸ  Mixed (Product + Ops)|
audio 14.6% 71.7% 57.2% ๐ŸŸก Ops/Delivery-driven|
computers_accessories 14.0% 54.5% 40.5% ๐ŸŸก Ops/Delivery-driven|
telephony 12.7% 50.1% 37.5% ๐ŸŸก Ops/Delivery-driven|
housewares 12.1% 50.8% 38.7% ๐ŸŸก Ops/Delivery-driven|
watches_gifts 11.3% 58.6% 47.3% ๐ŸŸก Ops/Delivery-driven|
baby 11.0% 60.7% 49.7% ๐ŸŸก Ops/Delivery-driven|
sports_leisure 9.9% 56.6% 46.7% ๐ŸŸก Ops/Delivery-driven|

Key finding: Most high-dissatisfaction categories remain unhappy even when orders arrive on time. This is decisive evidence that product quality โ€” not logistics โ€” is the primary root cause. Mixed categories need a joint fix between Product and Operations teams.


4. Text Evidence โ€” What Are Customers Complaining About?

Question: What specific problems do customers describe in their own words?

4.1 Theme Distribution (All Categories)

theme_summary <- comment_tbl %>%
  group_by(theme) %>%
  summarise(
    comments             = n(),
    dissatisfaction_rate = mean(dissatisfied, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(dissatisfaction_rate))

# bubble chart: x = volume, y = dissatisfaction rate, size = impact
ggplot(theme_summary %>% filter(theme != "Other/unclear"),
       aes(x = comments, y = dissatisfaction_rate,
           colour = dissatisfaction_rate, size = comments)) +
  geom_point(alpha = 0.85) +
  geom_text_repel(aes(label = theme), size = 3.2, max.overlaps = 20,
                  colour = "grey20", fontface = "bold") +
  scale_x_log10(labels = comma_format()) +
  scale_y_continuous(labels = percent_format(1), limits = c(0, 1)) +
  scale_colour_gradient(low = CLR_ORANGE, high = CLR_RED, guide = "none") +
  scale_size_continuous(range = c(4, 14), guide = "none") +
  labs(
    title    = "Complaint Themes: Volume vs Dissatisfaction Intensity",
    subtitle = "Bubble size = comment volume | Top-right = highest priority",
    x = "Number of Comments (log scale)",
    y = "Dissatisfaction Rate",
    caption = "'Other/unclear' excluded for clarity"
  ) +
  theme_olist()

theme_summary %>%
  filter(theme != "Other/unclear") %>%
  transmute(
    Theme                 = theme,
    `Comment Volume`      = comma(comments),
    `Dissatisfaction Rate` = percent(dissatisfaction_rate, 0.1),
    Priority = case_when(
      dissatisfaction_rate >= 0.65 ~ "๐Ÿ”ด Critical",
      dissatisfaction_rate >= 0.45 ~ "๐ŸŸ  High",
      TRUE                         ~ "๐ŸŸก Medium"
    )
  ) %>%
  kable(align = "lrrl") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  row_spec(which(theme_summary %>%
                   filter(theme != "Other/unclear") %>%
                   pull(dissatisfaction_rate) >= 0.65),
           background = "#fde8e8", bold = TRUE)
Theme Comment Volume Dissatisfaction Rate Priority
Return/refund process 808 79.2% ๐Ÿ”ด Critical |
Wrong item/mismatch 1,228 68.2% ๐Ÿ”ด Critical |
Product defect/low quality 1,300 63.4% ๐ŸŸ  High |
Packaging issue 1,074 50.5% ๐ŸŸ  High |
Seller/service issue 1,420 41.5% ๐ŸŸก Medium |
Price/freight concern 935 33.3% ๐ŸŸก Medium |
Delivery delay/not arrived 14,890 21.5% ๐ŸŸก Medium |

4.2 Theme ร— Category Cross-Tab (Where Does Each Problem Concentrate?)

theme_cat <- comment_tbl %>%
  filter(
    category_en %in% top12_cats,
    theme != "Other/unclear"
  ) %>%
  count(category_en, theme) %>%
  group_by(category_en) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

ggplot(theme_cat,
       aes(x = theme, y = reorder(category_en, n),
           fill = share, size = n)) +
  geom_point(shape = 21, colour = "white") +
  scale_fill_gradient(low = CLR_ORANGE, high = CLR_RED,
                      labels = percent_format(1)) +
  scale_size_continuous(range = c(2, 12)) +
  scale_x_discrete(labels = function(x) str_wrap(x, 12)) +
  labs(
    title    = "Complaint Theme Mix by Category",
    subtitle = "Bubble size = raw count | Colour = share within category",
    x = NULL, y = NULL,
    fill = "Share of category\ncomplaints",
    size = "Comment count",
    caption = "Larger + darker = dominant complaint type for that category"
  ) +
  theme_olist() +
  theme(axis.text.x = element_text(angle = 30, hjust = 1, size = 9))

Finding: Wrong item/mismatch and Product defect concentrate in fashion and electronics categories. Packaging issue is disproportionate in furniture. Return/refund process (79% dissatisfaction rate) amplifies underlying problems โ€” customers who get a bad product and canโ€™t return it easily are the angriest segment.


5. Priority Matrix โ€” What to Fix First

Framework: Plot all qualifying categories on a 2ร—2 matrix: - X-axis: Dissatisfaction rate (intensity of pain) - Y-axis: Reviewed order volume (scale of impact) - Quadrant logic: Top-right = Fix Now | Bottom-right = Investigate | Top-left = Monitor | Bottom-left = Watch

# Median splits for quadrants
x_mid <- median(category_core$dissatisfaction_rate)
y_mid <- median(log10(category_core$reviewed_orders))

matrix_data <- category_core %>%
  mutate(
    log_vol    = log10(reviewed_orders),
    quadrant   = case_when(
      dissatisfaction_rate >= x_mid & log_vol >= y_mid ~ "FIX NOW",
      dissatisfaction_rate >= x_mid & log_vol <  y_mid ~ "INVESTIGATE",
      dissatisfaction_rate <  x_mid & log_vol >= y_mid ~ "MONITOR",
      TRUE                                              ~ "WATCH"
    ),
    q_colour   = case_when(
      quadrant == "FIX NOW"     ~ CLR_RED,
      quadrant == "INVESTIGATE" ~ CLR_ORANGE,
      quadrant == "MONITOR"     ~ CLR_BLUE,
      TRUE                      ~ CLR_GREY
    ),
    label_flag = quadrant %in% c("FIX NOW", "INVESTIGATE") |
                   risk_score > quantile(risk_score, 0.85, na.rm = TRUE)
  )

ggplot(matrix_data,
       aes(x = dissatisfaction_rate, y = log_vol,
           colour = quadrant, size = reviewed_orders)) +
  # quadrant shading
  annotate("rect", xmin = x_mid, xmax = Inf,  ymin = y_mid, ymax = Inf,
           fill = CLR_RED,    alpha = 0.05) +
  annotate("rect", xmin = x_mid, xmax = Inf,  ymin = -Inf, ymax = y_mid,
           fill = CLR_ORANGE, alpha = 0.05) +
  annotate("rect", xmin = -Inf, xmax = x_mid, ymin = y_mid, ymax = Inf,
           fill = CLR_BLUE,   alpha = 0.05) +
  annotate("rect", xmin = -Inf, xmax = x_mid, ymin = -Inf, ymax = y_mid,
           fill = CLR_GREY,   alpha = 0.05) +
  # quadrant labels
  annotate("text", x = max(matrix_data$dissatisfaction_rate) * 0.97,
           y = max(matrix_data$log_vol) * 1.01,
           label = "๐Ÿ”ด FIX NOW", hjust = 1, fontface = "bold", colour = CLR_RED,  size = 4) +
  annotate("text", x = max(matrix_data$dissatisfaction_rate) * 0.97,
           y = y_mid * 0.97,
           label = "๐ŸŸ  INVESTIGATE", hjust = 1, fontface = "bold", colour = CLR_ORANGE, size = 4) +
  annotate("text", x = x_mid * 1.02,
           y = max(matrix_data$log_vol) * 1.01,
           label = "๐Ÿ”ต MONITOR", hjust = 0, fontface = "bold", colour = CLR_BLUE,  size = 4) +
  # points
  geom_point(alpha = 0.75) +
  geom_text_repel(
    data = matrix_data %>% filter(label_flag),
    aes(label = str_replace_all(category_en, "_", " ")),
    size = 3, fontface = "bold", max.overlaps = 20,
    colour = "grey20"
  ) +
  # reference lines
  geom_vline(xintercept = x_mid, linetype = "dashed", colour = "grey50") +
  geom_hline(yintercept = y_mid, linetype = "dashed", colour = "grey50") +
  scale_x_continuous(labels = percent_format(1)) +
  scale_y_continuous(labels = function(x) comma(10^x)) +
  scale_colour_manual(values = c(
    "FIX NOW" = CLR_RED, "INVESTIGATE" = CLR_ORANGE,
    "MONITOR" = CLR_BLUE, "WATCH" = CLR_GREY
  )) +
  scale_size_continuous(range = c(2, 10), labels = comma_format()) +
  labs(
    title    = "Category Priority Matrix",
    subtitle = "X = dissatisfaction intensity | Y = reviewed order volume",
    x = "Dissatisfaction Rate (score โ‰ค 2)",
    y = "Reviewed Orders (log scale)",
    colour = "Quadrant",
    size = "Reviewed orders",
    caption = "Dashed lines = median split"
  ) +
  theme_olist()


6. Recommendations

Specific, evidence-backed actions per quadrant. Each action is tied directly to a finding in sections 2โ€“4.

reco_tbl <- matrix_data %>%
  filter(quadrant %in% c("FIX NOW", "INVESTIGATE")) %>%
  left_join(
    verdict_tbl %>% select(category_en, root_cause, ontime_dissat),
    by = "category_en"
  ) %>%
  left_join(
    theme_cat %>%
      group_by(category_en) %>%
      slice_max(n, n = 1) %>%
      select(category_en, top_complaint = theme),
    by = "category_en"
  ) %>%
  mutate(
    owner = case_when(
      str_detect(root_cause, "Product")     ~ "Product / Merchant Team",
      str_detect(root_cause, "Ops")         ~ "Operations / Logistics",
      str_detect(root_cause, "Mixed")       ~ "Product + Operations (joint)",
      TRUE                                   ~ "TBD"
    ),
    recommended_action = case_when(
      str_detect(top_complaint, "Wrong item|mismatch") ~
        "Audit product listing accuracy (title, photo, spec); enforce SKU mapping checks pre-shipment.",
      str_detect(top_complaint, "defect|low quality") ~
        "Add incoming quality inspection at seller onboarding; require quality cert for high-volume sellers.",
      str_detect(top_complaint, "Packaging") ~
        "Define packaging standards by product size/weight; penalise sellers with >10% packaging complaints.",
      str_detect(top_complaint, "Delivery") ~
        "Review SLA promise-dates; flag sellers with >15% late rate for performance review.",
      str_detect(top_complaint, "Return|refund") ~
        "Simplify return policy; add one-click return initiation in post-delivery email.",
      TRUE ~
        "Deep-dive review text sample; run seller scorecard audit."
    )
  ) %>%
  arrange(quadrant, desc(dissatisfaction_rate))

reco_tbl %>%
  transmute(
    Quadrant         = quadrant,
    Category         = str_replace_all(category_en, "_", " "),
    `Dissat. Rate`   = percent(dissatisfaction_rate, 0.1),
    `Volume`         = comma(reviewed_orders),
    `Root Cause`     = root_cause,
    `Top Complaint`  = top_complaint,
    Owner            = owner,
    `Recommended Action` = recommended_action
  ) %>%
  kable(align = "llrrlll") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = TRUE, font_size = 12) %>%
  row_spec(which(reco_tbl$quadrant == "FIX NOW"),
           background = "#fff0f0", bold = TRUE) %>%
  column_spec(8, width = "30em")
Quadrant Category Dissat. Rate Volume Root Cause Top Complaint Owner Recommended Action
FIX NOW office furniture 26.3% 1,273 ๐ŸŸ  Mixed (Product + Ops) | elivery delay/not arrived | roduct / Merchant Team | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW furniture decor 19.7% 6,449 ๐ŸŸ  Mixed (Product + Ops) | elivery delay/not arrived | roduct / Merchant Team | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW bed bath table 19.4% 9,417 ๐ŸŸ  Mixed (Product + Ops) | elivery delay/not arrived | roduct / Merchant Team | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW computers accessories 18.8% 6,689 ๐ŸŸก Ops/Delivery-driven | elivery delay/not arrived | perations / Logistics | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW telephony 17.2% 4,199 ๐ŸŸก Ops/Delivery-driven | elivery delay/not arrived | perations / Logistics | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW baby 17.2% 2,885 ๐ŸŸก Ops/Delivery-driven | elivery delay/not arrived | perations / Logistics | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW watches gifts 16.7% 5,624 ๐ŸŸก Ops/Delivery-driven | elivery delay/not arrived | perations / Logistics | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW garden tools 16.3% 3,518 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
FIX NOW housewares 16.0% 5,884 ๐ŸŸก Ops/Delivery-driven | elivery delay/not arrived | perations / Logistics | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
FIX NOW electronics 16.0% 2,550 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
FIX NOW consoles games 16.0% 1,062 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE fashion male clothing 28.8% 112 ๐ŸŸ  Mixed (Product + Ops) | elivery delay/not arrived | roduct / Merchant Team | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
INVESTIGATE fixed telephony 26.1% 217 ๐ŸŸ  Mixed (Product + Ops) | elivery delay/not arrived | roduct / Merchant Team | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
INVESTIGATE audio 22.0% 350 ๐ŸŸก Ops/Delivery-driven | elivery delay/not arrived | perations / Logistics | eview SLA promise-dates; flag sellers with >15% late rate for performance review. |
INVESTIGATE construction tools safety 20.6% 167 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE home confort 20.3% 397 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE air conditioning 19.2% 253 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE art 18.7% 202 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE furniture living room 18.5% 422 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE fashion underwear beach 18.3% 121 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE home construction 18.2% 490 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE costruction tools garden 17.6% 194 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE christmas supplies 17.6% 128 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE construction tools lights 17.1% 244 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE drinks 16.1% 297 NA NA TBD Deep-dive review text sample; run seller scorecard audit.
INVESTIGATE agro industry and commerce 16.0% 182 NA NA TBD Deep-dive review text sample; run seller scorecard audit.

7. Summary of Findings

# Finding Evidence Action Owner
1 Product quality, not delivery, drives most dissatisfaction 11.7% of on-time orders still score โ‰ค 2 Product / Merchant Team
2 fashion_male_clothing, office_furniture, fixed_telephony have highest dissatisfaction rate Section 2 ranking Product Quality
3 bed_bath_table and furniture_decor are highest business risk due to volume ร— rate Priority Matrix Product Quality
4 Wrong item/mismatch and product defects are most destructive complaint types (63โ€“68% dissat.) Text theme analysis Listing accuracy & QC
5 Return/refund friction amplifies every other problem (79% dissat. rate) Theme table Customer Experience
6 27% of satisfied customers still wrote complaint text โ€” tracking scores alone underestimates real pain Mismatch check Analytics / Monitoring

North Star recommendation: Build a monthly Category Quality Scorecard combining dissatisfaction rate + on-time dissatisfaction + top complaint theme. Any category crossing 15% on-time dissatisfaction triggers an automatic seller audit. This converts this one-time analysis into a continuous quality governance loop.


Report generated with R 4.5.2 ยท Dataset: Olist E-Commerce Public Dataset (Kaggle)