Introduction

Coffee shops play an integral role in the lives of students, serving as spaces for studying, socializing, and relaxation. Despite the numerous local coffee shops around SJSU, many students prefer larger chains such as Starbucks. This study aims to understand the factors influencing student satisfaction across various coffee shops and provide actionable insights to help local coffee shops thrive.

Problem Statement

Local coffee shops near SJSU struggle to attract students, even though they offer unique, student-focused environments. Understanding why students underutilize these spaces is critical to helping local businesses improve their offerings and compete effectively with larger chains.

Objectives

The primary objectives of this study are to:
1. Identify key factors that influence student satisfaction across coffee shops.
2. Provide actionable recommendations to local coffee shops to enhance their offerings and better meet student needs.
3. Help local businesses create a stronger connection with the SJSU student community.

Methods

This study uses a dataset collected from SJSU students, in which satisfaction levels for four coffee shops (7Leaves, Break Time, Gong Cha, and Starbucks) are rated across four categories:
- Ambiance
- Drink Quality
- Price
- Customer Service
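
The analysis code reads the survey from a CSV file chosen interactively, so the expected layout is not visible up front. As a rough sketch of that layout, the following builds a placeholder data frame with the same column names used later in the analysis. The name `example_survey`, the seed, and the scores are assumptions for illustration only: the scores are randomly generated and are not the survey data. The 60 responses per shop and the 1 to 5 range match what the data summary reports.

```r
# Hypothetical stand-in for the survey file: same column names, random scores.
set.seed(1)  # arbitrary seed so the placeholder is reproducible
shops <- c("7Leaves", "Break Time", "Gong Cha", "Starbucks")
n_per_shop <- 60
n_total <- length(shops) * n_per_shop

example_survey <- data.frame(
  Coffee_Shop      = factor(rep(shops, each = n_per_shop)),
  Ambiance         = sample(1:5, n_total, replace = TRUE),
  Drink_Quality    = sample(1:5, n_total, replace = TRUE),
  Price            = sample(1:5, n_total, replace = TRUE),
  Customer_Service = sample(1:5, n_total, replace = TRUE)
)

# Inspect the structure: one row per response, one column per category.
str(example_survey)
```

A frame shaped like this can be substituted for the interactive file load when testing the rest of the code.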

The analysis includes:
1. Descriptive statistics for each category.
2. Visualizations of individual category ratings and overall satisfaction.
3. Independence testing (Spearman rank correlation for each category pair).
4. Statistical assumptions testing (e.g., Levene’s test for equal variances).
5. One-way ANOVA to test for significant differences between coffee shops.

Data Summary

# Load necessary libraries
library(ggplot2)
library(dplyr)
library(tidyr)
library(car)
# Load dataset interactively
combined_survey_data <- read.csv(file.choose())

# Ensure Coffee_Shop is treated as a categorical variable
combined_survey_data$Coffee_Shop <- as.factor(combined_survey_data$Coffee_Shop)

# Generate a summary of the dataset
cat("## Data Summary\n")
## ## Data Summary
summary_data <- summary(combined_survey_data)

# Print the summary
print(summary_data)
##      Coffee_Shop    Ambiance     Drink_Quality      Price      
##  7Leaves   :60   Min.   :1.000   Min.   :1.00   Min.   :1.000  
##  Break Time:60   1st Qu.:2.000   1st Qu.:2.00   1st Qu.:2.000  
##  Gong Cha  :60   Median :3.000   Median :3.00   Median :3.000  
##  Starbucks :60   Mean   :2.904   Mean   :2.95   Mean   :2.871  
##                  3rd Qu.:4.000   3rd Qu.:4.00   3rd Qu.:4.000  
##                  Max.   :5.000   Max.   :5.00   Max.   :5.000  
##  Customer_Service
##  Min.   :1.0     
##  1st Qu.:2.0     
##  Median :3.0     
##  Mean   :2.9     
##  3rd Qu.:4.0     
##  Max.   :5.0
# Optionally create a table for better visualization
knitr::kable(summary_data, caption = "Summary of Coffee Shop Satisfaction Data")
Summary of Coffee Shop Satisfaction Data
Coffee_Shop    Ambiance       Drink_Quality  Price          Customer_Service
7Leaves   :60  Min.   :1.000  Min.   :1.00   Min.   :1.000  Min.   :1.0
Break Time:60  1st Qu.:2.000  1st Qu.:2.00   1st Qu.:2.000  1st Qu.:2.0
Gong Cha  :60  Median :3.000  Median :3.00   Median :3.000  Median :3.0
Starbucks :60  Mean   :2.904  Mean   :2.95   Mean   :2.871  Mean   :2.9
NA             3rd Qu.:4.000  3rd Qu.:4.00   3rd Qu.:4.000  3rd Qu.:4.0
NA             Max.   :5.000  Max.   :5.00   Max.   :5.000  Max.   :5.0
# Descriptive Statistics and Boxplots
long_data <- combined_survey_data %>%
  pivot_longer(cols = c("Ambiance", "Drink_Quality", "Price", "Customer_Service"),
               names_to = "Category", values_to = "Score")

ggplot(long_data, aes(x = Coffee_Shop, y = Score, fill = Coffee_Shop)) +
  geom_boxplot() +
  facet_wrap(~Category, scales = "free") +
  labs(
    title = "Satisfaction by Category and Coffee Shop",
    x = "Coffee Shop",
    y = "Satisfaction Score"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

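# Calculate total average satisfaction per response (row-wise mean of the four ratings)
combined_survey_data <- combined_survey_data %>%
  rowwise() %>%
  mutate(Total_Average = mean(c(Ambiance, Drink_Quality, Price, Customer_Service), na.rm = TRUE)) %>%
  ungroup()  # drop the row-wise grouping so later steps see a regular data frame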

ggplot(combined_survey_data, aes(x = Coffee_Shop, y = Total_Average, fill = Coffee_Shop)) +
  geom_boxplot() +
  labs(
    title = "Overall Average Satisfaction by Coffee Shop",
    x = "Coffee Shop",
    y = "Average Satisfaction Score"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

# Spearman rank correlation for each pair of categories.
# exact = FALSE requests the asymptotic p-value directly: the 1-5 ratings are
# heavily tied, and an exact p-value cannot be computed with ties.
independence_results <- list(
  `Ambiance vs Drink Quality` = cor.test(combined_survey_data$Ambiance, combined_survey_data$Drink_Quality, method = "spearman", exact = FALSE),
  `Ambiance vs Price` = cor.test(combined_survey_data$Ambiance, combined_survey_data$Price, method = "spearman", exact = FALSE),
  `Ambiance vs Customer Service` = cor.test(combined_survey_data$Ambiance, combined_survey_data$Customer_Service, method = "spearman", exact = FALSE),
  `Drink Quality vs Price` = cor.test(combined_survey_data$Drink_Quality, combined_survey_data$Price, method = "spearman", exact = FALSE),
  `Drink Quality vs Customer Service` = cor.test(combined_survey_data$Drink_Quality, combined_survey_data$Customer_Service, method = "spearman", exact = FALSE),
  `Price vs Customer Service` = cor.test(combined_survey_data$Price, combined_survey_data$Customer_Service, method = "spearman", exact = FALSE)
)
cat("\nIndependence Test Results:\n")
## 
## Independence Test Results:
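# Report rho and the p-value for each pair.
# Note: p >= 0.05 means no significant monotonic association was detected;
# the "Independent" label is shorthand for that and is not proof of independence.
for (test_name in names(independence_results)) {
  test_result <- independence_results[[test_name]]
  cat(test_name, ":\n")
  cat("  Spearman rho:", round(test_result$estimate, 3), "\n")
  cat("  p-value:", round(test_result$p.value, 4), "\n")
  if (test_result$p.value < 0.05) {
    cat("  Conclusion: Dependent\n")
  } else {
    cat("  Conclusion: Independent\n")
  }
  cat("\n")
}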
## Ambiance vs Drink Quality :
##   Spearman rho: -0.087 
##   p-value: 0.1767 
##   Conclusion: Independent
## 
## Ambiance vs Price :
##   Spearman rho: -0.037 
##   p-value: 0.5636 
##   Conclusion: Independent
## 
## Ambiance vs Customer Service :
##   Spearman rho: 0.017 
##   p-value: 0.7884 
##   Conclusion: Independent
## 
## Drink Quality vs Price :
##   Spearman rho: -0.096 
##   p-value: 0.1392 
##   Conclusion: Independent
## 
## Drink Quality vs Customer Service :
##   Spearman rho: -0.027 
##   p-value: 0.6789 
##   Conclusion: Independent
## 
## Price vs Customer Service :
##   Spearman rho: 0.105 
##   p-value: 0.1046 
##   Conclusion: Independent
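
Six pairwise tests are run here, so a multiple-comparison adjustment is one way to check that the overall picture holds. The sketch below applies a Holm correction with base R's `p.adjust()` to the rounded p-values printed above. The choice of Holm and the names `spearman_p` and `holm_p` are assumptions for illustration, not part of the original analysis.

```r
# Rounded p-values copied from the Spearman output above.
spearman_p <- c(
  "Ambiance vs Drink Quality"         = 0.1767,
  "Ambiance vs Price"                 = 0.5636,
  "Ambiance vs Customer Service"      = 0.7884,
  "Drink Quality vs Price"            = 0.1392,
  "Drink Quality vs Customer Service" = 0.6789,
  "Price vs Customer Service"         = 0.1046
)

# Holm step-down correction (chosen here for illustration).
holm_p <- p.adjust(spearman_p, method = "holm")
print(round(holm_p, 4))

# Check whether any pair is significant at 0.05 after adjustment.
any(holm_p < 0.05)
```

Adjusted p-values are never smaller than the raw ones, and no pair is below 0.05 before adjustment, so none can be significant after it.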
cat("\nLevene's Test Results:\n")
## 
## Levene's Test Results:
levene_ambiance <- leveneTest(Ambiance ~ Coffee_Shop, data = combined_survey_data)
cat("Ambiance:\n")
## Ambiance:
print(levene_ambiance)
## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value Pr(>F)
## group   3  0.3778 0.7691
##       236
levene_drink_quality <- leveneTest(Drink_Quality ~ Coffee_Shop, data = combined_survey_data)
cat("\nDrink Quality:\n")
## 
## Drink Quality:
print(levene_drink_quality)
## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value Pr(>F)
## group   3  0.9598 0.4124
##       236
levene_price <- leveneTest(Price ~ Coffee_Shop, data = combined_survey_data)
cat("\nPrice:\n")
## 
## Price:
print(levene_price)
## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value Pr(>F)
## group   3  0.4024 0.7514
##       236
levene_customer_service <- leveneTest(Customer_Service ~ Coffee_Shop, data = combined_survey_data)
cat("\nCustomer Service:\n")
## 
## Customer Service:
print(levene_customer_service)
## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value Pr(>F)
## group   3  0.1734 0.9143
##       236
# One-way ANOVA for each category, with coffee shop as the grouping factor
anova_ambiance <- aov(Ambiance ~ Coffee_Shop, data = combined_survey_data)
anova_drink_quality <- aov(Drink_Quality ~ Coffee_Shop, data = combined_survey_data)
anova_price <- aov(Price ~ Coffee_Shop, data = combined_survey_data)
anova_customer_service <- aov(Customer_Service ~ Coffee_Shop, data = combined_survey_data)

# Extract the p-value of the Coffee_Shop term from each ANOVA table
anova_results <- list(
  Ambiance = summary(anova_ambiance)[[1]]["Pr(>F)"][1, 1],
  Drink_Quality = summary(anova_drink_quality)[[1]]["Pr(>F)"][1, 1],
  Price = summary(anova_price)[[1]]["Pr(>F)"][1, 1],
  Customer_Service = summary(anova_customer_service)[[1]]["Pr(>F)"][1, 1]
)

cat("\nANOVA Results for Individual Categories:\n")
## 
## ANOVA Results for Individual Categories:
for (category in names(anova_results)) {
  p_value <- anova_results[[category]]
  cat(category, ": p-value = ", round(p_value, 2), "\n")
}
## Ambiance : p-value =  0.61 
## Drink_Quality : p-value =  0.28 
## Price : p-value =  0.43 
## Customer_Service : p-value =  0.08
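
Because the ratings are ordinal 1 to 5 scores, a rank-based test is one possible cross-check on the ANOVA results. The sketch below uses the Kruskal-Wallis test (`kruskal.test()` from base R), which is not part of the original analysis. The helper `run_kruskal` and the `placeholder` data frame are hypothetical: the placeholder scores are randomly generated, so the printed p-values say nothing about the real survey.

```r
# Hypothetical helper: Kruskal-Wallis p-value for each rating category.
run_kruskal <- function(data, categories, group = "Coffee_Shop") {
  sapply(categories, function(category) {
    kruskal.test(data[[category]], data[[group]])$p.value
  })
}

# Placeholder data with the same column names as the survey (random scores).
set.seed(42)  # arbitrary seed for reproducibility
categories <- c("Ambiance", "Drink_Quality", "Price", "Customer_Service")
placeholder <- data.frame(
  Coffee_Shop = factor(rep(c("7Leaves", "Break Time", "Gong Cha", "Starbucks"), each = 60))
)
for (category in categories) {
  placeholder[[category]] <- sample(1:5, nrow(placeholder), replace = TRUE)
}

kw_p <- run_kruskal(placeholder, categories)
print(round(kw_p, 3))
```

With the real data, the same helper could be called as `run_kruskal(combined_survey_data, categories)` and the p-values compared against the ANOVA ones.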

Conclusion

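None of the four one-way ANOVAs found a statistically significant difference between coffee shops at the 0.05 level (Ambiance p = 0.61, Drink Quality p = 0.28, Price p = 0.43, Customer Service p = 0.08), so satisfaction levels are largely similar across shops in the studied categories. Customer service came closest to significance. This suggests that factors outside of ambiance, drink quality, price, and customer service, such as location or convenience, may drive student preferences.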

Recommendations

Introduce loyalty programs: Encourage repeat visits by offering rewards for frequent purchases.

Recruit student ambassadors: Have students spread the word and bring in more of their peers.

Enhance unique features: Focus on WiFi reliability, exclusive study-friendly spaces, or creative menu options to stand out.