University of Sydney | MATH1115 | LAB F07.03.353 | Wed 11:00 | October 2024

Introduction

  • Stakeholder: This report is intended for the supermarket branch-head management team.
  • Aim: Identify the key factors influencing customer satisfaction, to guide improvement strategies across all branches.
  • Focus: Improving customer satisfaction by targeting high spenders and optimizing product offerings.
  • Dataset: Sourced from Kaggle (Pyae, 2019); 1000 transactions across three supermarket branches over a period of three months.

Key Factors under Investigation

Financial Factors

  • Total Spending
  • Spending Per Item

Customer Demographics & Behavior

  • Customer Type
  • Gender
  • Time Period
  • Spender Type

Product & Shopping Experience

  • Product Line
  • Payment Method
  • Branch

Customer Ratings by Branch

No significant differences in mean customer rating were observed across branches (ANOVA p-value: 0.126).

ANOVA Results: Testing Differences in Ratings Across Branches

Source     Degrees of Freedom   Sum of Squares   Mean Square   F-Statistic     P-Value
Branch                      2         12.23358      6.116789      2.075477   0.1260384
Residuals                 997       2938.33113      2.947173            NA          NA

Tukey Post-Hoc Test: Pairwise Comparison Between Branches

Comparison   Difference Between Branches   Lower Bound   Upper Bound   Adjusted P-Value
B-A                           -0.2089865    -0.5198941     0.1019211          0.2557453
C-A                            0.0458070    -0.2660583     0.3576723          0.9365896
C-B                            0.2547936    -0.0589113     0.5684984          0.1373571

Key insights from ANOVA and Tukey Post-Hoc Test:

ANOVA:

  • p-value: 0.126 (F = 2.08 on 2 and 997 degrees of freedom)
    Indicates no statistically significant difference in mean customer rating across branches.
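
As a quick arithmetic check, the ANOVA table entries can be re-derived from the sums of squares and degrees of freedom. The sketch below is not the code used to produce the report; the variable names are illustrative, and the closed-form tail probability applies only because the branch term has exactly 2 degrees of freedom.

```python
# Values copied from the ANOVA table in this report.
df_branch, ss_branch = 2, 12.23358
df_resid, ss_resid = 997, 2938.33113

# Mean square = sum of squares / degrees of freedom.
ms_branch = ss_branch / df_branch
ms_resid = ss_resid / df_resid

# F-statistic = between-branch mean square / residual mean square.
f_stat = ms_branch / ms_resid

# With 2 numerator degrees of freedom, the upper tail of the F distribution
# has a closed form: P(F > f) = (1 + 2f / df_resid) ** (-df_resid / 2).
p_value = (1 + 2 * f_stat / df_resid) ** (-df_resid / 2)

print(f"MS branch = {ms_branch:.4f}, MS residual = {ms_resid:.4f}")
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```

The recomputed values should agree with the table to rounding, which confirms that the reported p-value follows from the reported sums of squares.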

Tukey Post-Hoc Test:

  1. Branch B vs A
    Adjusted p-value: 0.256
    No significant difference.

  2. Branch C vs A
    Adjusted p-value: 0.937
    No significant difference.

  3. Branch C vs B
    Adjusted p-value: 0.137
    No significant difference.
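
The three Tukey conclusions can be read off the table in two equivalent ways: the confidence interval contains zero, or the adjusted p-value exceeds the significance level. The sketch below checks both, assuming a 5% level (the report does not state one); the dictionary layout and names are illustrative only.

```python
# (difference, lower bound, upper bound, adjusted p-value) from the Tukey table.
tukey = {
    "B-A": (-0.2089865, -0.5198941, 0.1019211, 0.2557453),
    "C-A": (0.0458070, -0.2660583, 0.3576723, 0.9365896),
    "C-B": (0.2547936, -0.0589113, 0.5684984, 0.1373571),
}
ALPHA = 0.05  # assumed significance level; not stated in the report

for pair, (diff, lower, upper, p_adj) in tukey.items():
    contains_zero = lower <= 0.0 <= upper
    significant = p_adj < ALPHA
    # An interval containing zero and an adjusted p-value above ALPHA
    # lead to the same conclusion, so the two flags must disagree.
    assert contains_zero == (not significant)
    print(f"{pair}: diff={diff:+.3f}, CI=[{lower:.3f}, {upper:.3f}], "
          f"p={p_adj:.3f}, significant={significant}")

# Consistency check: (C - A) - (B - A) should equal the reported C - B.
implied_c_minus_b = tukey["C-A"][0] - tukey["B-A"][0]
print(f"implied C-B difference = {implied_c_minus_b:.7f}")
```

The last line is a sanity check on the table itself: the three pairwise differences are not independent, so the C-B row should match the difference of the other two rows to rounding.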

Random Forest Model: Predicting satisfaction

Variables Used:

  • Total
  • Spending Per Item
  • Product Line
  • Payment Method
  • Gender
  • Customer Type
  • Time Period (morning, afternoon, evening)
  • Spender Type

Model Parameters:

  • ntree = 500: Based on performance stabilization
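  • Default mtry (square root of the number of predictors for classification; R's randomForest uses one third for regression, and both round down to 2 for the 8 predictors listed)
  • Recorded which samples were in-bag for each tree (keep.inbag = TRUE); out-of-bag error is estimated from the samples each tree did not see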
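
To make the parameter choices concrete, the sketch below works out the default mtry for the eight predictors listed above and the expected share of out-of-bag observations per tree. It assumes the model was fitted with R's randomForest package, whose argument names match ntree, mtry and keep.inbag, and it assumes all 1000 transactions were used for training; the report states neither explicitly.

```python
import math

# Predictors listed under "Variables Used" in this report.
predictors = [
    "Total", "Spending Per Item", "Product Line", "Payment Method",
    "Gender", "Customer Type", "Time Period", "Spender Type",
]
p = len(predictors)

# Documented defaults in R's randomForest package (assumed to be the one used):
# floor(sqrt(p)) for classification, max(floor(p / 3), 1) for regression.
mtry_classification = math.floor(math.sqrt(p))
mtry_regression = max(p // 3, 1)

# By default each tree is grown on a bootstrap sample of size n drawn with
# replacement, so a given observation is out-of-bag for a given tree with
# probability (1 - 1/n) ** n.
n = 1000  # assumption: all 1000 transactions were used for training
oob_fraction = (1 - 1 / n) ** n

print(f"p = {p}, mtry (classification) = {mtry_classification}, "
      f"mtry (regression) = {mtry_regression}")
print(f"expected out-of-bag share per tree = {oob_fraction:.3f}")
```

With eight predictors both defaults coincide, so the number of variables tried at each split is the same whether the rating was modelled as a class or as a number.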

Variable Importance

Key Insights

  • High spenders tend to give higher ratings.
  • Product lines like Health & Beauty receive higher satisfaction ratings.
  • No significant branch effect on customer satisfaction.

Recommendations

  1. Focus on High Spenders: Implement loyalty programs and personalized promotions (Emilev, 2023).
  2. Optimize Product Lines: Expand high-satisfaction categories like Health & Beauty (King, 2023).
  3. Uniform Strategy: Apply customer satisfaction strategies uniformly across branches, since ratings do not differ significantly between them (King, 2023).

Conclusion:

  • Total Spending and Product Line are key drivers of satisfaction.
  • Focus on optimizing product offerings and rewarding high-spending customers.

References:

The citations in this report follow the APA 7 style. The .csl file is sourced from the Citation Style Language repository (Wiernik, 2020).

Emilev, E. (2023). Benefits of retail loyalty programs. https://www.growave.io/blog/retail-loyalty-programs

King, T. (2023). Advantages of product diversification. https://www.e-marketingassociates.com/blog/advantages-of-product-diversification

Pyae, A. (2019). Supermarket sales dataset. https://www.kaggle.com/datasets/aungpyaeap/supermarket-sales

Wiernik, B. M. (2020). APA style. https://github.com/citation-style-language/styles/blob/master/apa.csl