Executive summary


“How does the delivery experience function within the customer journey?”
Since the COVID-19 pandemic, demand for online shopping has surged, driving rapid growth in the online retail sector. According to data released by Korea’s Ministry of Data and Statistics, the value of online shopping transactions in October 2023 reached approximately KRW 20.0905 trillion, an increase of 11.8% over the same month of the previous year. The share of online shopping within total retail sales has also continued to expand, reaching 26.8%.

Furthermore, a report by IMARC Group estimates that the global e-commerce market will reach approximately USD 26.8 trillion in 2024, indicating that online consumption and digital distribution ecosystems continue to expand worldwide even after the end of the pandemic.

In light of these market changes, delivery methods themselves have become increasingly diversified and segmented. Delivery options such as free shipping, next-day or two-day delivery, standard delivery, express delivery, and in-store pickup have evolved beyond mere logistics, becoming key components that shape the customer journey.

[Figure: Online Shopping Trends, 2023.10]
[Figure: Proportion of online shopping transaction amount]


According to prior studies, delivery service quality—specifically delivery speed, economic efficiency, product condition, and information transparency—has been shown to have a significant impact on customer satisfaction and repurchase intention.

However, previous research has mostly been conducted using surveys and has focused on perceived satisfaction or perception-based outcomes, resulting in a relative lack of studies that analyze actual purchase data alongside delivery methods and consumer behavior indicators.

Therefore, this study aims to explore how delivery methods are related to purchasing experience, satisfaction, and repurchase intention based on real customer behavioral data, and to present these findings through visualizations.

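Reference papers:

[1] 이상곤, "소비자 배송 만족도, 제품 만족도가 점포 충성도에 미치는 영향에 관한 연구" (A study on the effects of consumer delivery satisfaction and product satisfaction on store loyalty).

[2] 박봉교, 최호규, "온라인 쇼핑몰의 배송서비스 품질이 고객 만족도 및 쇼핑몰 충성도에 미치는 영향" (The effect of online shopping mall delivery service quality on customer satisfaction and shopping mall loyalty).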

Limitations and Study Assumptions

This dataset does not include a variable distinguishing purchase channels (online vs. offline), which introduces interpretive limitations. However, this study assumes that the dataset reflects online purchasing behavior, based on two observations. First, it contains the variable Shipping Type, which is primarily associated with online transactions. Second, it includes e-commerce-related variables such as Payment Method, Discount Applied, and Promo Code Used, which are commonly used in digital purchasing environments.

Since this study focuses on a single variable—shipping method—additional contextual factors such as shipping cost, return experience, delivery delays, or service accuracy are not included. Nevertheless, shipping method is closely tied to consumer price perception, convenience evaluation, psychological cost, and satisfaction. Therefore, the results should be interpreted not as definitive causal effects, but rather as correlational and context-dependent patterns.

Research Question

How does the delivery experience operate within the customer journey?

Core Narrative

By analyzing the relationship between delivery method and consumer purchasing behavior through data-driven analysis, this study is expected to contribute to a more empirical understanding of the impact of delivery experience on consumer decision-making. This will enable companies to gain practical insights, such as developing delivery policies based on customer segmentation and improving services from a customer journey perspective.

Visualization Focus

Through four sub-research questions, we examine whether the delivery experience is linked to purchase amount, satisfaction, and repurchase, and visualize whether it is actually associated with consumer decision-making.


Data background

Dataset
The data used in this study is the Shopping Behavior Dataset, uploaded by Kaggle user Saad Ali Yaseen, and publicly available on the Kaggle platform. The “Provenance” section on Kaggle states, “SOURCES: We take this dataset from the Kaggle platform.” This indicates that no information about the source or company of the data was provided. Therefore, this data is best interpreted as a consumer behavior dataset constructed for research, education, and machine learning practice, rather than as actual transaction logs collected directly from a specific retailer or official agency.

This dataset consists of 3,900 observations and 18 variables, including customer demographic characteristics (Age, Gender), purchase information (Purchase Amount, Category, Review Rating), shipping method (Shipping Type), shopping frequency (Frequency of Purchases), promotion usage (Discount Applied, Promo Code Used), and previous purchase experience (Previous Purchases). This broad variable structure allows for a comprehensive analysis of consumer shopping behavior, shipping choices, satisfaction, and promotion usage patterns.

The data is structured in a table format, with each row representing an individual customer’s purchase history and each column containing attribute information for that purchase. The mix of numeric and categorical variables makes it suitable for a variety of exploratory data analyses, including group comparisons, distribution analysis, and visualization. Because the source of this data is unclear and only the underlying platform is presented, the generalizability of the results may be limited. This study acknowledges these limitations and, within the scope of the data, explores the relationship between delivery method and consumer purchasing behavior.


Data cleaning

In this study, data preprocessing was performed to accurately analyze the impact of delivery method on purchasing behavior.

First, only variables directly related to the research question were selected from the original data, and variable names containing spaces were renamed to remove them for easier analysis. Observations with missing values in key variables (delivery method, purchase amount, and satisfaction) were then filtered out to ensure the reliability of the analysis. For purchase amount, the 1.5 × IQR criterion was applied so that extreme values would not distort the results.

Finally, categorical variables such as delivery method, gender, discount availability, and purchase frequency were converted to factors to facilitate visualization and statistical analysis. In this dataset neither filter excluded any rows, so the final dataset retains all 3,900 observations; together with its cleaned variable structure, it served as the basis for all subsequent analyses.

library(tidyverse)  # assumed setup: provides read_csv(), %>%, dplyr, tidyr, ggplot2

shopping_raw <- read_csv("data/shopping_behavior_updated.csv")
## Rows: 3900 Columns: 18
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (13): Gender, Item Purchased, Category, Location, Size, Color, Season, S...
## dbl  (5): Customer ID, Age, Purchase Amount (USD), Review Rating, Previous P...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
shopping <- shopping_raw %>% 
  select(
    CustomerID        = `Customer ID`,
    age               = Age,
    gender            = Gender,
    shipping_type     = `Shipping Type`,
    purchase_amount   = `Purchase Amount (USD)`,
    review_rating     = `Review Rating`,
    discount_applied  = `Discount Applied`,
    previous_purchases = `Previous Purchases`,
    purchase_frequency = `Frequency of Purchases`
  )

shopping_no_na <- shopping %>% 
  filter(
    !is.na(shipping_type),
    !is.na(purchase_amount),
    !is.na(review_rating)
  )

q1  <- quantile(shopping_no_na$purchase_amount, 0.25, na.rm = TRUE)
q3  <- quantile(shopping_no_na$purchase_amount, 0.75, na.rm = TRUE)
iqr <- q3 - q1

lower_bound <- q1 - 1.5 * iqr
upper_bound <- q3 + 1.5 * iqr

shopping_clean <- shopping_no_na %>% 
  filter(
    purchase_amount >= lower_bound,
    purchase_amount <= upper_bound
  )

shopping_final <- shopping_clean %>% 
  mutate(
    shipping_type      = factor(shipping_type),
    gender             = factor(gender),
    discount_applied   = factor(discount_applied),
    purchase_frequency = factor(purchase_frequency)
  )

glimpse(shopping_final)
## Rows: 3,900
## Columns: 9
## $ CustomerID         <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, …
## $ age                <dbl> 55, 19, 50, 21, 45, 46, 63, 27, 26, 57, 53, 30, 61,…
## $ gender             <fct> Male, Male, Male, Male, Male, Male, Male, Male, Mal…
## $ shipping_type      <fct> Express, Express, Free Shipping, Next Day Air, Free…
## $ purchase_amount    <dbl> 53, 64, 73, 90, 49, 20, 85, 34, 97, 31, 34, 68, 72,…
## $ review_rating      <dbl> 3.1, 3.1, 3.1, 3.5, 2.7, 2.9, 3.2, 3.2, 2.6, 4.8, 4…
## $ discount_applied   <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Y…
## $ previous_purchases <dbl> 14, 2, 23, 49, 31, 14, 49, 19, 8, 4, 26, 10, 37, 31…
## $ purchase_frequency <fct> Fortnightly, Fortnightly, Weekly, Weekly, Annually,…
summary(shopping_final)
##    CustomerID          age           gender            shipping_type
##  Min.   :   1.0   Min.   :18.00   Female:1248   2-Day Shipping:627  
##  1st Qu.: 975.8   1st Qu.:31.00   Male  :2652   Express       :646  
##  Median :1950.5   Median :44.00                 Free Shipping :675  
##  Mean   :1950.5   Mean   :44.07                 Next Day Air  :648  
##  3rd Qu.:2925.2   3rd Qu.:57.00                 Standard      :654  
##  Max.   :3900.0   Max.   :70.00                 Store Pickup  :650  
##                                                                     
##  purchase_amount  review_rating  discount_applied previous_purchases
##  Min.   : 20.00   Min.   :2.50   No :2223         Min.   : 1.00     
##  1st Qu.: 39.00   1st Qu.:3.10   Yes:1677         1st Qu.:13.00     
##  Median : 60.00   Median :3.70                    Median :25.00     
##  Mean   : 59.76   Mean   :3.75                    Mean   :25.35     
##  3rd Qu.: 81.00   3rd Qu.:4.40                    3rd Qu.:38.00     
##  Max.   :100.00   Max.   :5.00                    Max.   :50.00     
##                                                                     
##       purchase_frequency
##  Annually      :572     
##  Bi-Weekly     :547     
##  Every 3 Months:584     
##  Fortnightly   :542     
##  Monthly       :553     
##  Quarterly     :563     
##  Weekly        :539
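
The following worked example shows why the IQR filter leaves the row count unchanged. It is a minimal sketch that plugs in the quartiles and range of purchase_amount as printed by summary(shopping_final); the variable names observed_min, observed_max, and no_outliers are introduced here for illustration only.

```r
# Quartiles of purchase_amount as reported by summary(shopping_final)
q1  <- 39
q3  <- 81
iqr <- q3 - q1                  # interquartile range

# Fences used by the 1.5 x IQR rule
lower_bound <- q1 - 1.5 * iqr
upper_bound <- q3 + 1.5 * iqr

# Observed range of purchase_amount from the same summary output
observed_min <- 20
observed_max <- 100

# TRUE means that no observation can fall outside the fences
no_outliers <- observed_min >= lower_bound && observed_max <= upper_bound

print(c(lower = lower_bound, upper = upper_bound))
print(no_outliers)
```

Because the observed range of 20 to 100 lies entirely inside the fences, the filter step cannot drop any row in this dataset.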

In this study, delivery method and purchase frequency have a meaningful order, so the levels of each variable were redefined to make the analysis and visualizations easier to interpret. Shipping method (shipping_type) was ordered from fastest to slowest, and purchase frequency (purchase_frequency) from the shortest (most frequent) to the longest purchase cycle.

To make the characteristics of each variable clear, the visualizations use a consistent color palette: shipping_type is shown as a blue gradient based on speed (with Store Pickup set apart in green), gender as a pink-blue contrast, and purchase_frequency as a red gradient that darkens with higher frequency. This gives the graphs a natural flow and makes comparison and interpretation more intuitive.

shopping_final$shipping_type <- factor(
  shopping_final$shipping_type,
  levels = c(
    "Express",
    "Next Day Air",
    "2-Day Shipping",
    "Standard",
    "Free Shipping",
    "Store Pickup"
  )
)

shopping_final$purchase_frequency <- factor(
  shopping_final$purchase_frequency,
  levels = c(
    "Weekly",
    "Bi-Weekly",
    "Fortnightly",
    "Monthly",
    "Every 3 Months",
    "Quarterly",
    "Annually"
  )
)

shipping_colors <- c(
  "Express"        = "#08306B", 
  "Next Day Air"   = "#2171B5", 
  "2-Day Shipping" = "#6BAED6",  
  "Standard"       = "#9ECAE1", 
  "Free Shipping"  = "#C6DBEF", 
  "Store Pickup"   = "#41AB5D"   
)

gender_colors <- c(
  "Female" = "#CC79A7",
  "Male"   = "#0072B2"
)

freq_gender_colors <- c(
  "Female_High" = "#CC79A7",
  "Female_Low"  = "#F2CFE3",
  "Male_High"   = "#0072B2",
  "Male_Low"    = "#A6CEE3"
)

freq_colors <- c(
  "Weekly"         = "#99000D",
  "Bi-Weekly"      = "#CB181D",
  "Fortnightly"    = "#EF3B2C",
  "Monthly"        = "#FB6A4A",
  "Every 3 Months" = "#FC9272",
  "Quarterly"      = "#FCBBA1",
  "Annually"       = "#FEE0D2"
)

freq_group_colors <- c(
  "High" = "#EF3B2C",   
  "Low"  = "#FCBBA1"    
)


Individual figures

We will proceed with data visualization for the research question through four sub-questions.

Figure 1 Does consumer satisfaction vary depending on the delivery method?

rating_summary <- shopping_final %>%
  group_by(shipping_type) %>%
  summarise(mean_rating = mean(review_rating, na.rm = TRUE))


ggplot(rating_summary, aes(x = shipping_type, y = mean_rating, fill = shipping_type)) +
  geom_col() +
  geom_text(aes(label = round(mean_rating, 2)), vjust = -0.5) +
  scale_fill_manual(values = shipping_colors) +
  labs(
    title = "Average Customer Satisfaction by Shipping Speed",
    x = "Shipping Type",
    y = "Average Rating"
  ) +
  scale_y_continuous(limits = c(0, 5)) +
  theme_minimal() +
  theme(legend.position = "none")


The bar chart gives a visual overview of rating differences across groups. The value labels add the exact means, and the y-axis spans the full 0 to 5 score range so that differences are shown in proportion.

Mean satisfaction was lowest for store pickup at 3.71 and highest for standard delivery at 3.82. Express delivery came second at 3.78, followed by two-day delivery at 3.76, and free and next-day delivery at 3.72. The gap between the highest and lowest group is only 0.11 points, so satisfaction does not differ substantially across delivery methods.
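
One way to check a statement like this more formally is a one-way ANOVA on review rating by shipping type. The sketch below is illustrative only and is not part of the original analysis: it runs on synthetic ratings, the toy data frame is a hypothetical stand-in for shopping_final, and the group size of 650 is an assumption chosen to roughly match the real group sizes.

```r
set.seed(42)  # reproducible synthetic data

# Hypothetical stand-in for shopping_final: ratings are drawn from the same
# distribution for every shipping type, so no true group difference exists.
shipping_levels <- c("Express", "Next Day Air", "2-Day Shipping",
                     "Standard", "Free Shipping", "Store Pickup")
toy <- data.frame(
  shipping_type = factor(rep(shipping_levels, each = 650),
                         levels = shipping_levels),
  review_rating = round(runif(6 * 650, min = 2.5, max = 5.0), 1)
)

# One-way ANOVA: does mean review_rating differ across shipping types?
fit <- aov(review_rating ~ shipping_type, data = toy)
anova_table <- summary(fit)[[1]]
print(anova_table)

# Eta squared: share of rating variance explained by shipping type
ss     <- anova_table[["Sum Sq"]]
eta_sq <- ss[1] / sum(ss)
print(eta_sq)
```

On the real data, the same call with data = shopping_final would report an F statistic and p-value for the group difference, and eta squared would show how little (or how much) of the rating variance shipping type accounts for.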

rating_gender <- shopping_final %>%
  group_by(shipping_type, gender) %>% 
  summarise(mean_rating = mean(review_rating, na.rm = TRUE), .groups = "drop")

ggplot(rating_gender, aes(x = shipping_type, y = mean_rating, fill = gender)) +
  geom_col(position = position_dodge(width = 0.8)) +
  geom_text(aes(label = round(mean_rating, 2)),
            position = position_dodge(width = 0.8),
            vjust = -0.5, size = 3) +
  scale_fill_manual(values = gender_colors) +
  labs(
    title = "Average Customer Satisfaction by Shipping Type and Gender",
    x = "Shipping Type",
    y = "Average Rating"
  ) +
  coord_cartesian(ylim = c(0, 5)) +
  theme_minimal()


Comparing the two groups shows almost no difference in satisfaction between men and women for express, free, next-day, and standard shipping. Small gaps in average satisfaction appear for two-day delivery (0.09) and in-store pickup (0.08).

These gaps should be read with the difference in sample size in mind: each shipping type has 420 to 457 male observations but only 191 to 249 female observations. Because the female means rest on fewer observations, they are noisier, so differences of this size may partly reflect sampling variability rather than a real gap.

shipping_table <- shopping_final %>%
  group_by(shipping_type, gender) %>%
  summarise(count = n(), .groups = "drop") %>%
  tidyr::pivot_wider(
    names_from = gender,
    values_from = count,
    values_fill = 0
  ) %>%
  mutate(Total = rowSums(across(where(is.numeric))))

shipping_table
## # A tibble: 6 × 4
##   shipping_type  Female  Male Total
##   <fct>           <int> <int> <dbl>
## 1 Express           194   452   646
## 2 Next Day Air      191   457   648
## 3 2-Day Shipping    207   420   627
## 4 Standard          213   441   654
## 5 Free Shipping     249   426   675
## 6 Store Pickup      194   456   650
rating_summary <- shopping_final %>%
  group_by(shipping_type) %>%
  summarise(
    mean_rating = mean(review_rating, na.rm = TRUE),
    sd          = sd(review_rating, na.rm = TRUE),
    n           = n(),
    se          = sd / sqrt(n)
  )

ggplot(rating_summary, aes(x = shipping_type, y = mean_rating)) +
  geom_errorbar(aes(ymin = mean_rating - se, ymax = mean_rating + se),
                width = 0.2, linewidth = 0.8, color = "black") +
  geom_point(aes(color = shipping_type, size = n)) +
  geom_text(
    aes(label = round(mean_rating, 2)), vjust = -0.7, size = 3.5) +
  scale_color_manual(values = shipping_colors) +
  
  scale_size(range = c(3, 8)) +
  labs(
    title = "Mean Customer Satisfaction by Shipping Type (with SE)",
    x = "Shipping Type",
    y = "Average Rating"
  ) +
  theme_minimal() +
  theme(legend.position = "none")


Because the differences in satisfaction between groups are small and hard to read from the bar chart, we also plotted each mean with its standard error to show the uncertainty of the estimate, using error bars and a y-axis zoomed to the data range so that small differences are visible.

The graph shows each mean as a dot, the vertical line spans one standard error above and below the mean, and the size of the dot is proportional to the number of observations. The standard error measures how much a sample mean would be expected to vary around the population mean from sample to sample, so it indicates how precisely the mean is estimated: the smaller the standard error, the more precise the estimate. The error bars were generally short, which implies a narrow confidence interval around each mean and suggests that the group means are reliable estimates.
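
To make the link between sample size, standard error, and confidence interval concrete, here is a small sketch. The helper se_ci() is hypothetical and not part of the original analysis, and the mean (3.75) and standard deviation (0.72) are assumed values for illustration. The group sizes 191 and 457 are the smallest female and largest male cells in the shipping_table output.

```r
# Standard error and approximate 95% confidence interval for a group mean.
# se_ci() is a hypothetical helper introduced for this illustration.
se_ci <- function(mean_value, sd_value, n) {
  se     <- sd_value / sqrt(n)          # standard error of the mean
  t_crit <- qt(0.975, df = n - 1)       # two-sided 95% critical value
  c(se    = se,
    lower = mean_value - t_crit * se,
    upper = mean_value + t_crit * se)
}

# Assumed inputs: mean 3.75 and sd 0.72 are illustrative, not measured here
small_group <- se_ci(3.75, 0.72, n = 191)
large_group <- se_ci(3.75, 0.72, n = 457)

print(round(rbind(small_group, large_group), 3))
```

With the same spread, the smaller group has a larger standard error and a wider interval, which is why the error bars in the plot cover one standard error and a 95% interval is roughly twice as wide in each direction.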

ggplot(rating_gender, aes(x = shipping_type, y = mean_rating, 
                          color = gender, group = gender)) +
  geom_point(size = 3) +
  geom_line(linewidth = 1) +
  geom_text(aes(label = round(mean_rating, 2)), 
            vjust = -0.7, size = 3) +
  scale_color_manual(values = gender_colors) +
  labs(
    title = "Trend of Customer Satisfaction by Shipping Type and Gender",
    x = "Shipping Type",
    y = "Average Rating"
  ) +
  theme_minimal()


Using geom_line, we can see how satisfaction by gender changes across delivery methods. The position and slope of each line show for which delivery methods the gap between men and women widens or narrows, making the direction and magnitude of the difference easy to read. Because shipping type is categorical, the lines serve only as a visual guide between group means. Labeling each point with its mean makes clear that the actual differences are small even though the zoomed y-axis exaggerates them.

Men and women were similarly satisfied with express delivery, and free and next-day delivery were also similar across both groups. Standard delivery was rated highest by both groups. Although women’s satisfaction tended to be slightly higher than men’s overall, it was noticeably lower for two-day delivery and store pickup, the two methods women rated lowest.

This may indicate that women are more sensitive to the quality of the delivery experience, weighing not only cost and convenience but also emotional and experiential factors, although the data cannot confirm this. Men, by contrast, showed relatively small differences in satisfaction across delivery methods, rating Standard Shipping highest and Free Shipping and Next Day Air relatively low. The gender gaps for two-day delivery and in-store pickup were 0.09 and 0.08 points, respectively. However, given that average satisfaction by delivery method ranged only from 3.71 to 3.82, a spread of 0.11, it is difficult to attribute a meaningful difference in satisfaction to any specific group.

“Satisfaction doesn’t fluctuate significantly even with different delivery methods.”
Delivery method is the most noticeable change in the customer experience. The choices—whether it’s fast delivery, economical delivery, or self-pickup—clearly change, and we often expect these choices to significantly impact customer satisfaction. However, analysis reveals that changes in delivery method don’t significantly impact customer sentiment.


Figure 2 Does high satisfaction increase repurchase behavior?

To determine whether satisfaction actually influences customer repurchase behavior, we conducted a multi-step visualization analysis. Each graph was designed to begin with simple validation and trend identification, then gradually delve into the relationship in greater detail, moving through noise removal, conditional analysis, and group comparison.

1. Full Scatter Plot, Filled 2D Density: Understanding the overall picture of the relationship

ggplot(shopping_final, aes(x = review_rating, y = previous_purchases)) +
  geom_jitter(alpha = 0.35, color = "#2C3E50", width = 0.1, height = 0.1) +
  geom_smooth(method = "lm", se = TRUE, color = "red") +
  labs(
    title = "Relationship between Review Rating and Previous Purchases",
    x = "Review Rating",
    y = "Previous Purchases"
  ) +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'


First, we used a scatterplot to examine the overall pattern between the two variables. A scatterplot is a basic exploratory tool that most intuitively displays the distribution of individual observations and can be used to quickly detect whether there is a true trend of increased repurchase as satisfaction increases. A red regression line was added to the scatterplot to clearly visualize the data trend.

However, the regression line shown throughout the scatterplot was nearly horizontal, indicating no clear direction or trend between satisfaction and the number of repurchases. In other words, no evidence was observed that higher satisfaction leads to more repurchases.
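
A nearly horizontal regression line can also be expressed as a number. The sketch below is illustrative only: it uses synthetic data in which the two variables are independent by construction, and the toy data frame is a hypothetical stand-in for shopping_final. It shows how a Pearson correlation test and the fitted slope could quantify what the plot suggests.

```r
set.seed(1)  # reproducible synthetic data

# Hypothetical stand-in for shopping_final with two unrelated variables
n <- 3900
toy <- data.frame(
  review_rating      = round(runif(n, min = 2.5, max = 5.0), 1),
  previous_purchases = sample(1:50, n, replace = TRUE)
)

# Pearson correlation with a test of H0: correlation = 0
ct <- cor.test(toy$review_rating, toy$previous_purchases, method = "pearson")
print(ct$estimate)   # correlation coefficient r
print(ct$conf.int)   # 95% confidence interval for r

# Slope of the line that geom_smooth(method = "lm") would draw
fit <- lm(previous_purchases ~ review_rating, data = toy)
print(coef(fit)["review_rating"])
```

Applied to shopping_final, a correlation close to zero with a confidence interval that includes zero would support the reading of the scatterplot given here.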

ggplot(shopping_final, aes(x = review_rating, y = previous_purchases)) +
  geom_density_2d_filled(contour_var = "ndensity") +
  scale_fill_brewer(palette = "RdYlGn", direction = -1) +
  labs(
    title = "Filled 2D Density between Review Rating and Previous Purchases",
    x = "Review Rating",
    y = "Previous Purchases"
  ) +
  theme_minimal()


A filled density graph is a visualization that uses color depth to indicate where data is concentrated (density) in the relationship between review ratings and repeat purchases. A color closer to red indicates a region with a higher concentration of data, while a color closer to green indicates a region with relatively fewer observations.

This graph shows that the relationship between review ratings and repeat purchases is not a strong linear one; instead, observations are concentrated in a specific range. Most customers give above-average ratings (3.5-4.5), and repeat purchases are concentrated in the mid-range (10-35). Any association between high ratings and repeat purchases is weak at best, and very high repeat-purchase counts are not tied to particular ratings.

In other words, the density plot is consistent with the scatterplot: satisfaction shows little systematic relationship with repeat purchase behavior.

2. Binned Mean Plot: Check the average pattern after removing noise.

Scatterplots can become noisy when there are many observations, obscuring important trends. To address this issue, we divided review scores into intervals and calculated the average number of repeat purchases for each interval, creating a binned mean plot.

This graph, a more refined version of a scatterplot, aims to identify whether there is a regular pattern of increasing or decreasing average repeat purchases as satisfaction levels change.

binned_df <- shopping_final %>%
  mutate(bin = cut(review_rating, breaks = 10)) %>%
  group_by(bin) %>%
  summarise(
    rating_mid = mean(review_rating),
    mean_prev  = mean(previous_purchases)
  )

ggplot(binned_df, aes(x = rating_mid, y = mean_prev, color = rating_mid)) +
  geom_point(size = 3) +
  geom_path(linewidth = 1.3) +
  scale_color_distiller( palette = "Spectral", direction = -1, name = "Review Rating") +
  labs(
    title = "Binned Mean Previous Purchases by Review Rating",
    x = "Review Rating",
    y = "Mean Previous Purchases"
  ) +
  theme_minimal()


The average number of previous purchases based on review ratings remained largely unchanged, hovering around 25-26. There was no clear trend toward higher ratings leading to higher numbers of previous purchases.

A temporarily higher average was observed around the 4.6 rating range, but this is likely due to a specific customer segment or sample imbalance. Review ratings are therefore unlikely to be a primary factor in explaining customers’ repeat purchases.

3. Facet by Shipping Type: Conditional Pattern Exploration

In the next step, we analyzed the satisfaction-repurchase relationship separately by delivery method to determine whether any patterns that are not visible in the overall data appear only among customers using a specific delivery method.

ggplot(shopping_final, 
       aes(x = review_rating, y = previous_purchases, color = shipping_type)) +     
  geom_jitter(alpha = 0.4, width = 0.1, height = 0.1) +
  geom_smooth(method = "lm", se = FALSE, color = "red", linewidth = 1) +
  facet_wrap(~ shipping_type) +
  scale_color_manual(values = shipping_colors) + 
  labs(
    title = "Satisfaction vs. Previous Purchases by Shipping Type",
    x = "Review Rating",
    y = "Previous Purchases",
    color = "Shipping Type"
  ) +
  theme_minimal() +
  theme(legend.position = "none") 
## `geom_smooth()` using formula = 'y ~ x'


This conditional analysis is important because it allows us to explore hidden patterns within subgroups of customers.

However, the results showed that the regression lines were nearly horizontal or had only a very slight slope across all delivery methods, and there was no evidence of any difference in the impact of satisfaction across delivery methods.

In other words, we concluded that delivery method did not play a moderating role in the relationship between satisfaction and repeat purchase.
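
A moderating role is usually checked with an interaction term. The following sketch is not the method used in this report; it only illustrates the idea on synthetic data (the toy data frame and its values are hypothetical) by comparing a model with a common slope against one that allows a separate slope per shipping type.

```r
set.seed(7)  # reproducible synthetic data

shipping_levels <- c("Express", "Next Day Air", "2-Day Shipping",
                     "Standard", "Free Shipping", "Store Pickup")
n <- 3900

# Hypothetical stand-in for shopping_final with no built-in relationships
toy <- data.frame(
  review_rating      = round(runif(n, min = 2.5, max = 5.0), 1),
  previous_purchases = sample(1:50, n, replace = TRUE),
  shipping_type      = factor(sample(shipping_levels, n, replace = TRUE),
                              levels = shipping_levels)
)

# Common slope for all shipping types
fit_main <- lm(previous_purchases ~ review_rating + shipping_type, data = toy)

# Separate slope per shipping type (rating x shipping_type interaction)
fit_int  <- lm(previous_purchases ~ review_rating * shipping_type, data = toy)

# F-test for the nested models: a large p-value gives no evidence that the
# rating slope differs between shipping types
cmp <- anova(fit_main, fit_int)
print(cmp)
```

The comparison adds five interaction coefficients (six shipping types minus the reference level), so the test asks whether those five extra slopes improve the fit.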

4. Facet by Gender: Exploring Differences Based on Demographic Factors

Finally, to determine whether the impact of satisfaction on repurchase behavior differs by gender, the same analysis was conducted separately for male and female customers. This step is part of the customer segment analysis, and serves as an in-depth analysis to explore whether the relationship is stronger for certain groups.

ggplot(shopping_final, aes(x = review_rating, y = previous_purchases, color = gender)) +
  geom_jitter(alpha = 0.35, width = 0.1, height = 0.1) +
  geom_smooth(method = "lm", se = FALSE, color = "red", linewidth = 1.1) +
  facet_wrap(~ gender) +
  scale_color_manual(values = gender_colors) +
  labs(
    title = "Satisfaction vs Previous Purchases by Gender",
    x = "Review Rating",
    y = "Previous Purchases",
    color = "Gender"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
## `geom_smooth()` using formula = 'y ~ x'


The analysis results show that for female customers, there is a very weak positive slope, indicating that higher satisfaction leads to a slight increase in repeat purchases. For male customers, the regression line is completely horizontal, suggesting no relationship between satisfaction and repeat purchases.

However, even for female customers, the effect is so minimal that it cannot be considered a substantial predictor. Ultimately, we conclude that gender is not a variable that explains or reinforces the satisfaction-repurchase relationship.


“Satisfaction was sufficient, but not enough to change behavior.”

Many companies expect that increasing customer satisfaction will naturally lead to repeat purchases. However, the analysis in Figure 2 tells a different story. Although customers generally already report fairly high satisfaction, in this data that satisfaction is not associated with more repeat purchases. High customer ratings may signal a company’s success, but they do not automatically translate into loyalty or repeat purchases. The real drivers of repeat purchases likely lie elsewhere, in factors such as price, accessibility, benefits, habits, and lock-in structures, rather than in satisfaction.


Figure 3 Does repurchase/purchase frequency vary depending on the delivery method?

To determine whether delivery method influences customers’ long-term purchasing behavior (number of repurchases, shopping frequency), we conducted a progressive, in-depth analysis in the following order: basic exploration → distribution comparison → conditional analysis → simplified model → population group comparison. This structure was designed to gradually verify whether delivery method functions as a predictor of purchasing behavior.

1. Repurchase

(1) Bar chart comparing averages by delivery method - Initial confirmation of relationship

As a first step, we compared the average number of repeat purchases by delivery method. Bar charts make differences in group averages quick to see, which makes them a basic tool for checking whether delivery method is related to long-term purchasing behavior.

prev_summary <- shopping_final %>% 
  group_by(shipping_type) %>% 
  summarise(
    mean_prev = mean(previous_purchases, na.rm = TRUE),
    sd_prev   = sd(previous_purchases, na.rm = TRUE),
    n         = n(),
    se_prev   = sd_prev / sqrt(n)
  )

ggplot(prev_summary, aes(x = shipping_type, y = mean_prev, fill = shipping_type)) +
  geom_col() +
  geom_errorbar(
    aes(ymin = mean_prev - se_prev, ymax = mean_prev + se_prev),
    width = 0.2, color = "gray40"
  ) +
  geom_text(
    aes(label = round(mean_prev, 2),
        y = mean_prev + se_prev + 0.3),
    vjust = 0, size = 3.8
  ) +
  scale_fill_manual(values = shipping_colors) +
  labs(
    title = "Mean Previous Purchases by Shipping Type",
    x = "Shipping Type",
    y = "Mean Previous Purchases"
  ) +
  theme_minimal() +
  theme(legend.position = "none")


The results show that the average number of repeat purchases is very similar across all delivery methods, ranging from 24 to 26, and the standard errors are also very small, indicating virtually no difference between groups. In other words, delivery method shows no association with customers’ long-term repeat purchase patterns.

(2) Violin Plot—Advanced Verification Through Comparing the Entire Distribution

To identify differences in distribution shape that might be hidden by the mean alone, we used a violin plot. This is a more in-depth analysis graph that shows the median, quartiles, and overall distribution shape.

ggplot(shopping_final, aes(x = shipping_type, y = previous_purchases, fill = shipping_type)) +
  geom_violin(alpha = 0.7) +
  geom_boxplot(width = 0.15, fill = "white", color = "black", outlier.shape = NA) +
  scale_fill_manual(values = shipping_colors) +
  labs(
    title = "Distribution of Previous Purchases by Shipping Type",
    x = "Shipping Type",
    y = "Previous Purchases"
  ) +
  theme_minimal() +
  theme(legend.position = "none")


The distribution of the number of previous purchases by shipping method was nearly identical, and the median and interquartile range did not differ significantly.

This suggests that not only the mean but also the overall distribution remained unchanged by shipping method. In other words, a customer’s chosen shipping method did not distinguish or predict previous repeat purchases.
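
When the question is about whole distributions and not only means, a rank-based test such as Kruskal-Wallis is one possible complement to the violin plot. The sketch below is illustrative and runs on synthetic counts; the toy data frame is a hypothetical stand-in for shopping_final, and the equal group size of 650 is an assumption.

```r
set.seed(3)  # reproducible synthetic data

shipping_levels <- c("Express", "Next Day Air", "2-Day Shipping",
                     "Standard", "Free Shipping", "Store Pickup")

# Hypothetical stand-in: identical purchase-count distribution in every group
toy <- data.frame(
  shipping_type      = factor(rep(shipping_levels, each = 650),
                              levels = shipping_levels),
  previous_purchases = sample(1:50, 6 * 650, replace = TRUE)
)

# Kruskal-Wallis rank sum test across the six shipping types
kw <- kruskal.test(previous_purchases ~ shipping_type, data = toy)
print(kw)

# Group medians, matching the center line of each box in the violin plot
group_medians <- aggregate(previous_purchases ~ shipping_type,
                           data = toy, FUN = median)
print(group_medians)
```

A large p-value from this test on the real data would agree with the visual impression that the distributions overlap almost completely.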

(3) Gender × Shipping Method Cross-Analysis — Exploring Differences by Population Group

Next, we conducted a Shipping Type × Gender cross-analysis to determine whether the number of previous purchases for each shipping method differed by gender.

prev_gender <- shopping_final %>%
  group_by(gender, shipping_type) %>%
  summarise(
    mean_prev = mean(previous_purchases, na.rm = TRUE),
    sd_prev   = sd(previous_purchases, na.rm = TRUE),
    n         = n(),
    se_prev   = sd_prev / sqrt(n),
    .groups = "drop"
  )

ggplot(prev_gender,
       aes(x = shipping_type,
           y = mean_prev,
           fill = gender)) +
  geom_col(position = position_dodge(width = 0.8)) +
  geom_text(
    aes(label = round(mean_prev, 2)),
    position = position_dodge(width = 0.8),
    vjust = -0.5,
    size = 3
  ) +
  scale_fill_manual(values = gender_colors) +
  labs(
    title = "Mean Previous Purchases by Shipping Type and Gender",
    x = "Shipping Type",
    y = "Mean Previous Purchases",
    fill = "Gender"
  ) +
  coord_cartesian(
    ylim = c(0, max(prev_gender$mean_prev) * 1.1)
  ) +
  theme_minimal()


Comparing mean previous purchases by shipping method and gender, men averaged roughly one to three more previous purchases than women for most shipping methods. The gap was most pronounced for Express (26.53 for men, 23.09 for women) and Next Day Air (25.47 for men, 23.10 for women).

Conversely, women’s averages were slightly higher for 2-Day Shipping and Store Pickup. Gender differences exist, but they are small in absolute terms, suggesting that shipping method selection is influenced more by individual needs or situational factors than by gender.


2. Purchase Frequency

(1) Stacked Bar Chart — Understanding the Basic Structure of Purchase Frequency by Shipping Method

First, we used a stacked bar chart to compare the distribution of purchase frequency by shipping method. Because each bar is scaled to proportions, this chart is well suited to quickly comparing the mix of categories across shipping methods.

freq_summary <- shopping_final %>% 
  filter(!is.na(purchase_frequency)) %>%
  group_by(shipping_type, purchase_frequency) %>% 
  summarise(n = n(), .groups = "drop") %>% 
  group_by(shipping_type) %>% 
  mutate(prop = n / sum(n))

ggplot(freq_summary, 
       aes(x = shipping_type, y = prop, fill = purchase_frequency)) +
  geom_col() +
  scale_fill_manual(values = freq_colors) +
  labs(
    title = "Purchase Frequency Distribution by Shipping Type",
    x = "Shipping Type",
    y = "Proportion",
    fill = "Purchase Frequency"
  ) +
  theme_minimal()


The results of the stacked bar chart revealed a nearly identical distribution of purchase frequency by delivery method.

All purchase frequency levels, from Weekly through Annually, showed similar shares regardless of delivery method. No delivery method attracted noticeably more frequent or less frequent customers.

This suggests that delivery method does not explain how frequently a customer shops, and that purchase frequency is a behavioral pattern formed independently of delivery choice.
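Independence between two categorical variables, as described above, is commonly checked with a chi-square test on the contingency table. The following is a minimal sketch on simulated data, not the study's own analysis; `sim`, the seed, and the uniform sampling are assumptions, while the category labels are the ones that appear in this report.

```r
# Simulated stand-in for shopping_final (hypothetical data).
set.seed(7)
shipping_levels <- c("Express", "Next Day Air", "2-Day Shipping",
                     "Standard", "Free Shipping", "Store Pickup")
freq_levels <- c("Weekly", "Fortnightly", "Bi-Weekly", "Monthly",
                 "Quarterly", "Every 3 Months", "Annually")
sim <- data.frame(
  shipping_type      = sample(shipping_levels, 3900, replace = TRUE),
  purchase_frequency = sample(freq_levels, 3900, replace = TRUE)
)

# Contingency table and row proportions (what the stacked bars display).
tab       <- table(sim$shipping_type, sim$purchase_frequency)
row_props <- prop.table(tab, margin = 1)

# Chi-square test of independence.
test <- chisq.test(tab)

# Cramér's V as an effect size: 0 means no association, 1 a perfect one.
cramers_v <- sqrt(unname(test$statistic) / (sum(tab) * (min(dim(tab)) - 1)))

print(round(row_props, 3))
print(test)
print(cramers_v)
```

A Cramér's V close to zero would support the reading that purchase frequency is unrelated to delivery method.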

(2) Alluvial Plot—In-Depth Analysis Visualizing Customer Flow

Next, we used an alluvial plot to go beyond simple proportion comparisons and visualize how customers flow from delivery methods to purchase frequencies. Each flow represents the number of observations for one combination of categories, with thicker flows indicating more observations.

This graph gives an intuitive view of flow structure, answering questions such as “What purchase frequencies do the customers of each delivery method have?”, which is difficult to grasp from simple bar graphs or crosstabs.

library(ggalluvial)  # provides geom_alluvium() and geom_stratum()

freq_alluvial <- shopping_final %>%
  group_by(shipping_type, purchase_frequency) %>%
  summarise(n = n(), .groups = "drop")

ggplot(freq_alluvial,
       aes(axis1 = shipping_type,
           axis2 = purchase_frequency,
           y = n)) +
  geom_alluvium(aes(fill = purchase_frequency), width = 1/12, alpha = 0.9) +
  geom_stratum(width = 1/12, fill = "gray95", color = "gray70") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), size = 3) +
  scale_fill_manual(values = freq_colors) +
  labs(
    title = "Alluvial Plot of Shipping Type \u2192 Purchase Frequency",
    x = "",
    y = "Count",
    fill = "Purchase Frequency"
  ) +
  theme_minimal()


The alluvial plot of customer flow from shipping method to purchase frequency revealed no clear separation in purchase frequency by shipping type. Express and Next Day Air customers flowed slightly more toward Weekly and Bi-Weekly purchases.

Conversely, Free Shipping and Store Pickup customers flowed slightly more toward lower-frequency groups such as Monthly, Quarterly, and Annually. These tendencies are weak: shipping method choice is at most loosely related to purchase frequency, and overall, customers of every purchase frequency are spread fairly evenly across shipping methods.

(3) High vs. Low Simplification Analysis — Additional Verification to Determine if Patterns Were Hidden

Since multiple categories can obscure subtle patterns, we collapsed purchase frequency into two groups and compared the proportions again: High (Weekly, Fortnightly, Monthly) and Low (Quarterly, Every 3 Months, Bi-Weekly, Annually).

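# Collapse the seven frequency labels into two groups.
# Note: "Bi-Weekly" is placed in Low here, which is what produces the counts below.
# If "Bi-Weekly" means every two weeks (like "Fortnightly"), it arguably belongs in High.
shopping_final <- shopping_final %>%
  mutate(
    freq_group = case_when(
      purchase_frequency %in% c("Weekly", "Fortnightly", "Monthly") ~ "High",
      purchase_frequency %in% c("Quarterly", "Every 3 Months", "Bi-Weekly", "Annually") ~ "Low",
      TRUE ~ NA_character_
    )
  )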

table(shopping_final$freq_group, useNA = "always")
## 
## High  Low <NA> 
## 1634 2266    0
freq_simple <- shopping_final %>%
  filter(!is.na(freq_group)) %>%
  group_by(shipping_type, freq_group) %>%
  summarise(n = n(), .groups = "drop") %>%
  group_by(shipping_type) %>%
  mutate(prop = n / sum(n))

ggplot(freq_simple, 
       aes(x = shipping_type, y = prop, fill = freq_group)) +
  geom_col() +
  geom_text(
    aes(label = scales::percent(prop, accuracy = 1)),
    position = position_stack(vjust = 0.5),  
    color = "black",
    size = 3.8
  ) +
  scale_fill_manual(values = freq_group_colors) +
  
  labs(
    title = "High vs Low Purchase Frequency by Shipping Type",
    x = "Shipping Type",
    y = "Proportion",
    fill = "Frequency Group"
  ) +
  theme_minimal()


When comparing purchase frequency by simplifying it into two groups, High and Low, the proportions of the two groups were nearly identical across all delivery methods. High-frequency purchasers comprised approximately 40–45%, while low-frequency purchasers comprised approximately 55–60%. This proportional structure remained largely unchanged regardless of delivery method.

This suggests that no particular delivery method tends to attract more frequent or less frequent purchasers. In other words, purchase frequency is a behavioral characteristic that develops independently of delivery method.
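Because the High/Low split depends on how each label is assigned, a quick sensitivity check is worth considering. The sketch below compares the grouping used in this report with an alternative in which "Bi-Weekly" is counted as High. The helper `group_freq()`, the simulated vector, and the alternative grouping are all hypothetical and are not part of the original analysis.

```r
# Simulated purchase-frequency labels (hypothetical data).
set.seed(11)
freq_levels <- c("Weekly", "Fortnightly", "Bi-Weekly", "Monthly",
                 "Quarterly", "Every 3 Months", "Annually")
purchase_frequency <- sample(freq_levels, 1000, replace = TRUE)

# Hypothetical helper: label a value "High" if it is in `high`, else "Low".
group_freq <- function(x, high) ifelse(x %in% high, "High", "Low")

# Grouping used in this report: "Bi-Weekly" counted as Low.
as_in_text <- group_freq(purchase_frequency,
                         c("Weekly", "Fortnightly", "Monthly"))

# Alternative (assumption): "Bi-Weekly" read as every two weeks, so High.
alternative <- group_freq(purchase_frequency,
                          c("Weekly", "Fortnightly", "Bi-Weekly", "Monthly"))

# Share of High-frequency customers under each grouping.
share_high <- c(as_in_text  = mean(as_in_text == "High"),
                alternative = mean(alternative == "High"))
print(round(share_high, 3))
```

If the comparison across delivery methods looks the same under both groupings, the conclusion does not hinge on how "Bi-Weekly" is classified.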

(4) Gender Analysis — Demographic Factor Verification

Finally, we compared purchase frequency groups by delivery method × gender.

freq_gender <- shopping_final %>%
  filter(!is.na(freq_group)) %>%
  group_by(gender, shipping_type, freq_group) %>%
  summarise(n = n(), .groups = "drop") %>%
  group_by(gender, shipping_type) %>%
  mutate(prop = n / sum(n))

freq_gender <- freq_gender %>%
  mutate(
    fill_group = case_when(
      gender == "Female" & freq_group == "High" ~ "Female_High",
      gender == "Female" & freq_group == "Low"  ~ "Female_Low",
      gender == "Male"   & freq_group == "High" ~ "Male_High",
      gender == "Male"   & freq_group == "Low"  ~ "Male_Low"
    )
  )

ggplot(freq_gender, 
       aes(x = shipping_type, y = prop, fill = fill_group)) +
  geom_col(position = "stack") +
  geom_text(
    aes(label = scales::percent(prop, accuracy = 1)),
    position = position_stack(vjust = 0.5),
    size = 3.5,
    color = "black"
  ) +
  facet_wrap(~ gender) +
  scale_fill_manual(values = freq_gender_colors) +
  
  labs(
    title = "High vs Low Purchase Frequency by Shipping Type and Gender",
    x = "Shipping Type",
    y = "Proportion",
    fill = "Frequency Group"
  ) +
  
  theme_minimal()


Within each gender, the High/Low purchase frequency ratio was nearly identical across all delivery methods. Delivery method and purchase frequency therefore appear to be independent, and gender does not alter this relationship.


“Delivery Choice Does Not Change Behavior”

A step-by-step check of whether delivery method can explain customers’ long-term purchasing behavior showed that delivery method had very little bearing on either repeat purchases or purchase frequency.

Customers appear to choose a delivery method not as a consequence of their purchasing behavior but on independent grounds such as situational needs or convenience.

Delivery method neither determined nor predicted customers’ repeat purchase levels, nor did it differentiate their shopping frequency patterns. Improving delivery policies is therefore unlikely, on its own, to change long-term purchasing behavior, and factors other than delivery method should be considered to increase repeat purchases.


Figure 4 Does the purchase amount differ according to the delivery method?

ggplot(shopping_final, aes(x = shipping_type, y = purchase_amount, fill = shipping_type)) +
  geom_boxplot(alpha = 0.7) +
  scale_fill_manual(values = shipping_colors) +
  labs(
    title = "Purchase Amount by Shipping Type",
    x = "Shipping Type",
    y = "Purchase Amount (USD)"
  ) +
  theme_minimal() +
  theme(legend.position = "none")


The box plot shows that the median purchase amount is around USD 60 for every shipping method and that the distributions are nearly identical, so purchase amount does not differ meaningfully by shipping method.

We chose to use the box plot to visually assess the distribution, highlight outliers, and directly compare the median, range, and distribution. However, the small difference in purchase amounts made it difficult to grasp the information at a glance.

shopping_final %>%
  group_by(shipping_type) %>%
  summarise(count = n())
## # A tibble: 6 × 2
##   shipping_type  count
##   <fct>          <int>
## 1 Express          646
## 2 Next Day Air     648
## 3 2-Day Shipping   627
## 4 Standard         654
## 5 Free Shipping    675
## 6 Store Pickup     650
shopping_final %>%
  group_by(shipping_type) %>%
  summarise(
    count           = n(),
    mean_purchase   = mean(purchase_amount,  na.rm = TRUE),
    median_purchase = median(purchase_amount, na.rm = TRUE),
    sd_purchase     = sd(purchase_amount,     na.rm = TRUE)
  )
## # A tibble: 6 × 5
##   shipping_type  count mean_purchase median_purchase sd_purchase
##   <fct>          <int>         <dbl>           <dbl>       <dbl>
## 1 Express          646          60.5            60          24.0
## 2 Next Day Air     648          58.6            58.5        23.3
## 3 2-Day Shipping   627          60.7            60          23.4
## 4 Standard         654          58.5            58          24.1
## 5 Free Shipping    675          60.4            62          23.4
## 6 Store Pickup     650          59.9            60          23.9
summary_df <- shopping_final %>%
  group_by(shipping_type) %>%
  summarise(
    mean = mean(purchase_amount, na.rm = TRUE),
    sd   = sd(purchase_amount,   na.rm = TRUE),
    n    = n(),
    se   = sd / sqrt(n)
  )

ggplot(summary_df, aes(x = shipping_type, y = mean)) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                width = 0.2, color = "black") +
  geom_point(aes(color = shipping_type, size = n)) +
  geom_text(aes(label = round(mean, 2)),
            vjust = -0.7, size = 3.5) +
  
  scale_color_manual(values = shipping_colors) +
  scale_size(range = c(3, 7)) + 
  labs(
    title = "Mean Purchase Amount by Shipping Type (with SE)",
    x = "Shipping Type",
    y = "Mean Purchase Amount (USD)"
  ) +
  theme_minimal() +
  theme(legend.position = "none")


Therefore, we plotted the mean and standard error of each group with error bars. Because the Y-axis spans only about USD 57.5–61.5, small differences are easy to see. Each dot is a group mean, the vertical bar shows ±1 standard error, and the dot size is proportional to the sample size. The error bars are short overall, indicating that each mean is estimated fairly precisely.

This suggests that the group means are reasonable estimates of consumer spending within the sample. Error bar lengths differ only slightly across shipping methods (the standard deviations range from 23.3 to 24.1), but comparisons of means should still be read with this uncertainty in mind.

The mean purchase amount is slightly higher for 2-Day Shipping, Express, Free Shipping, and Store Pickup, and slightly lower for Standard and Next Day Air. The differences are only USD 1–2, too small to matter for actual purchasing decisions. They are also small relative to the sampling error: with roughly 650 customers per method and standard deviations near 24, each mean has a standard error of about 0.9, so even the largest gap (60.7 for 2-Day Shipping versus 58.5 for Standard) is only about 1.7 standard errors of the difference. These differences should therefore be read as weak tendencies at most, not as evidence of real differences in consumption patterns.
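The size of these gaps relative to sampling error can be worked out from the summary table alone. The sketch below re-enters the rounded counts, means, and standard deviations printed above, so the results are approximate. The object names (`summary_tab`, `se_diff`, `z`) are illustrative, and treating the two groups as independent samples is an assumption.

```r
# Rounded values copied from the summary table printed above.
summary_tab <- data.frame(
  shipping_type = c("Express", "Next Day Air", "2-Day Shipping",
                    "Standard", "Free Shipping", "Store Pickup"),
  n    = c(646, 648, 627, 654, 675, 650),
  mean = c(60.5, 58.6, 60.7, 58.5, 60.4, 59.9),
  sd   = c(24.0, 23.3, 23.4, 24.1, 23.4, 23.9)
)

# Standard error of each group mean.
summary_tab$se <- summary_tab$sd / sqrt(summary_tab$n)

# Compare the groups with the highest and lowest means.
hi <- summary_tab[which.max(summary_tab$mean), ]
lo <- summary_tab[which.min(summary_tab$mean), ]

diff_mean <- hi$mean - lo$mean             # largest gap between group means
se_diff   <- sqrt(hi$se^2 + lo$se^2)       # SE of the difference (independent groups)
z         <- diff_mean / se_diff           # gap expressed in standard errors

print(summary_tab)
print(c(diff_mean = diff_mean, se_diff = se_diff, z = z))
```

A gap of fewer than two standard errors is the usual rule of thumb for a difference that cannot be separated from chance at the 5% level, before any correction for comparing six groups at once.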

summary_gender <- shopping_final %>%
  group_by(gender, shipping_type) %>%
  summarise(
    mean = mean(purchase_amount, na.rm = TRUE),
    sd   = sd(purchase_amount,   na.rm = TRUE),
    n    = n(),
    se   = sd / sqrt(n),
    .groups = "drop"
  )

ggplot(summary_gender, 
       aes(x = shipping_type, y = mean, color = gender, group = gender)) +
  geom_point(size = 3) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.15) +
  scale_color_manual(values = gender_colors)+
  labs(
    title = "Mean Purchase Amount by Shipping Type and Gender (with SE)",
    x = "Shipping Type",
    y = "Mean Purchase Amount (USD)"
  ) +
  theme_minimal()


Similarly, we used error bars to check whether gender differences distorted the averages, judging by how much the two groups’ error bars overlap. Comparing the points (means) with the error bars (standard errors), we found that the error bars were often as long as, or longer than, the differences between the means for different delivery methods. Differences in purchase amounts by delivery method exist, but they are not statistically distinct.

Differences in purchase amounts by gender were also observed, but the large standard errors make it difficult to draw strong conclusions. The comparison of points and error bars suggests that the choice of delivery method is not a major factor in determining spending.

Men spend more on average than women with 2-Day Shipping, while women spend more on average than men with Next Day Air and Store Pickup. Women also spend more with Free Shipping, though slightly less so than with those methods. For Express and Standard, both genders were close to the overall group averages. However, all group means fall within roughly USD 57.5–61.5, so although there are hints of gender-specific patterns, it is difficult to conclude that any one group drives a meaningful difference in purchase amounts.
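Reading overlap between error bars is only an approximate check. A Welch two-sample t-test within a single delivery method gives a direct estimate of the gender gap and its confidence interval. The following is a minimal sketch on simulated data, not the study's analysis: `sim`, the seed, the equal gender split, and the normal distribution with mean 60 and standard deviation 24 are all assumptions for illustration.

```r
# Simulated purchases for one delivery method (hypothetical data).
set.seed(3)
sim <- data.frame(
  gender          = sample(c("Female", "Male"), 650, replace = TRUE),
  purchase_amount = rnorm(650, mean = 60, sd = 24)  # assumed distribution
)

# Welch t-test (t.test() does not assume equal variances by default).
tt <- t.test(purchase_amount ~ gender, data = sim)

print(tt$estimate)   # mean purchase amount per gender
print(tt$conf.int)   # 95% CI for the difference in means
print(tt$p.value)
```

If the confidence interval includes zero, the gender gap for that delivery method cannot be distinguished from sampling noise.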


“Different shipping methods, but wallets open nearly the same.”

In summary, delivery method neither differentiates nor predicts customer spending in this dataset. Whichever delivery method customers choose, spending remains roughly constant, and gender differences are too small to explain purchasing behavior. We conclude that delivery method is a choice based on convenience, not a variable that influences how much customers spend.

Conclusion

For each metric in each group, the top one or two delivery methods are shown; * marks the lowest-ranked delivery method.

Overall Group

Category               Result
Consumer Satisfaction  3.82 Standard / 3.78 Express
Repurchase Frequency   26.23 Standard / 26.11 2-Day Shipping
                       24.77 Next Day Air *
Purchase Frequency     High 44% Standard
Purchase Amount        $60.73 2-Day Shipping

Standard shipping combines the highest satisfaction with the highest levels of repeat purchases and purchase frequency, so a strategy is needed to increase purchase volume for products delivered primarily via standard shipping.

Female Group

Category               Result
Consumer Satisfaction  3.83 Standard / 3.79 Express
                       3.65 Store Pickup *
Repurchase Frequency   26.55 2-Day Shipping / 26.17 Standard
                       23.09 Express *, 23.10 Next Day Air *
Purchase Frequency     High 45% 2-Day Shipping
                       Low 39% Store Pickup
Purchase Amount        $61.27 Store Pickup / $61.13 Free Shipping

Store pickup has the highest purchase amount but the lowest customer satisfaction. It is therefore necessary to identify what customers are dissatisfied with in store pickup and to improve it.

Male Group

Category               Result
Consumer Satisfaction  3.81 Standard / 3.79 2-Day Shipping
Repurchase Frequency   26.53 Express
                       24.67 Store Pickup *
Purchase Frequency     High 45% Standard
Purchase Amount        $61.41 2-Day Shipping / $60.46 Express
                       $57.69 Next Day Air *

Overall, shipping method preferences do not differ greatly. Most customers are satisfied with Standard, 2-Day, and Express shipping, so strategies are needed to strengthen these options or to raise the experiential value of the other shipping methods.

This study identified overall trends in response to the research question using data from the entire cohort. Although the differences observed were small, a closer look by gender revealed subtle patterns.

In this dataset, consumer satisfaction, repeat purchases, purchase frequency, and purchase amount showed no meaningful association with delivery method and appear to be largely independent behavioral patterns. Satisfaction also showed no clear link to repurchase behavior. This suggests that while the delivery experience is part of the customer journey, it is not a primary driver of actual purchase behavior but a secondary experience that adds value after the purchase. An additional comparison by gender confirmed that no gender group biased the overall averages or distorted the interpretation of the results.

Delivery method is a supplementary service experience chosen during the purchase process. The customer journey may therefore be shaped more by other factors, such as price, promotions, product category, customer needs, and loyalty programs. The results also hint that women may be more sensitive to the quality of the delivery experience, with evaluations that reflect emotional and experiential factors as well as cost, convenience, and delivery method. Future research and practical applications will therefore require a comprehensive approach that analyzes the customer journey in light of the consumption context and psychological factors, not one focused solely on the impact of the delivery method itself.