1 Introduction

This report presents an analysis of data from 30 coffee shops across three U.S. cities: Springfield, Riverton, and Oakville. The objective is to identify key factors that contribute to coffee shop growth to inform strategic business decisions.

Key findings:

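  • Coffee quality is strongly associated with growth: 77.8% of “good” quality shops are growing, compared with 27.3% of “bad” quality shops, although the effect is not statistically significant in the logistic regression (p = 0.141)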
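  • Springfield shows the highest proportion of growing shops (90%), followed by Riverton (40%) and Oakville (10%); the Springfield term is the only statistically significant one in the logistic regression (p < 0.01)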
  • Years in business shows no clear relationship with growth (mean of 4.9 years for non-growing shops versus 5.0 years for growing shops)
  • Our models achieve 86.7% accuracy on the data used to fit them, but only about 70% under 5-fold cross-validation

2 Data Overview

The dataset contains information on 30 coffee shops with the following variables:

  • name: Name of the coffee shop
  • street: Street address
  • city: City where the shop is located (Springfield, Riverton, or Oakville)
  • years_in_business: Number of years the shop has been operating
  • coffee_quality: Subjective rating of coffee quality (bad, ok, or good)
  • growing: Binary indicator of whether the shop is growing (1) or not (0)

Coffee Shops by City
City         Count  % of Total
Oakville        10        33.3
Riverton        10        33.3
Springfield     10        33.3

Coffee Quality Distribution
Quality  Count  % of Total
bad         11        36.7
ok          10        33.3
good         9        30.0

Growth Status by City
city         Not Growing  Growing  Total  Growth %
Oakville               9        1     10        10
Riverton               6        4     10        40
Springfield            1        9     10        90

3 Exploratory Data Analysis

3.1 Coffee Quality and Growth

Growth Status by Coffee Quality
coffee_quality  Not Growing  Growing  Total  Growth %
bad                       8        3     11      27.3
ok                        6        4     10      40.0
good                      2        7      9      77.8
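
The Growth % column is the number of growing shops divided by the group total. As a minimal sketch, the figures can be recomputed from the counts in the table above; the helper name growth_rate is illustrative and not part of the original analysis.

```python
# Counts copied from the "Growth Status by Coffee Quality" table:
# quality -> (not growing, growing).
quality_counts = {
    "bad": (8, 3),
    "ok": (6, 4),
    "good": (2, 7),
}


def growth_rate(not_growing: int, growing: int) -> float:
    """Return the share of growing shops in a group, as a percentage."""
    total = not_growing + growing
    return 100.0 * growing / total


rates = {quality: growth_rate(*counts) for quality, counts in quality_counts.items()}

for quality, rate in rates.items():
    print(f"{quality}: {rate:.1f}% growing")
```

The same helper applies to the city and street-type tables by swapping in their counts.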

3.2 Years in Business Analysis

Years in Business Summary by Growth Status
growing      Mean Years  Median Years  Min Years  Max Years
Not Growing         4.9             5          1         10
Growing             5.0             5          1          9

3.3 Location Analysis

Growth Status by Street Type
street_type  Not Growing  Growing  Total  Growth %
Main St                3        3      6      50.0
Other                  9       10     19      52.6
Pine St                4        1      5      20.0

4 Predictive Modeling

4.1 Logistic Regression Model

Logistic Regression Model Results
term                  estimate  std.error  odds_ratio  p_value  significance
(Intercept)         -2.9432996  1.7858312        0.05    0.099
coffee_qualityok     0.3212832  1.4880282        1.38    0.829
coffee_qualitygood   2.2194659  1.5079589        9.20    0.141
years_in_business   -0.0299544  0.2306528        0.97    0.897
cityRiverton         2.1908405  1.4835982        8.94     0.14
citySpringfield      4.4361980  1.6469625       84.45    <0.01
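
The odds_ratio column is the exponential of each estimate. The p-values appear consistent with two-sided Wald z-tests (estimate divided by standard error), although the report does not state which test was used, so treat that as an assumption. The sketch below recomputes both from the table. It also adds an approximate 95% Wald interval for each odds ratio, which is not in the original output, to show how wide the uncertainty is with 30 shops.

```python
import math

# (estimate, std.error) pairs copied from the logistic regression table.
coefficients = {
    "(Intercept)": (-2.9432996, 1.7858312),
    "coffee_qualityok": (0.3212832, 1.4880282),
    "coffee_qualitygood": (2.2194659, 1.5079589),
    "years_in_business": (-0.0299544, 0.2306528),
    "cityRiverton": (2.1908405, 1.4835982),
    "citySpringfield": (4.4361980, 1.6469625),
}


def wald_summary(estimate: float, std_error: float, z_crit: float = 1.96):
    """Return odds ratio, two-sided Wald p-value, and an approximate 95% CI.

    ASSUMPTION: the reported p-values are Wald z-tests; the source does not say.
    """
    odds_ratio = math.exp(estimate)
    z = estimate / std_error
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci_low = math.exp(estimate - z_crit * std_error)
    ci_high = math.exp(estimate + z_crit * std_error)
    return odds_ratio, p_value, (ci_low, ci_high)


summary = {term: wald_summary(*values) for term, values in coefficients.items()}

for term, (odds_ratio, p_value, (ci_low, ci_high)) in summary.items():
    print(f"{term}: OR={odds_ratio:.2f}, p={p_value:.3f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

An interval that spans 1 means the data cannot rule out "no effect" for that term, which is why a large odds ratio alone is not evidence of a reliable effect.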

Model Performance Metrics
Metric       Value
Accuracy     86.7%
Sensitivity  93.8%
Specificity  78.6%
Pseudo R²    45.1%
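
Accuracy, sensitivity, and specificity all derive from a confusion matrix, which the report does not show. The sketch below uses hypothetical counts chosen only because they are consistent with the reported percentages and with the 16 non-growing and 14 growing shops in the data. It further assumes that "Not Growing" was treated as the positive class. Neither assumption is confirmed by the source.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # share of actual positives predicted positive
    specificity = tn / (tn + fp)  # share of actual negatives predicted negative
    return accuracy, sensitivity, specificity


# ASSUMPTION: hypothetical confusion matrix, not taken from the report.
# Positive class = "Not Growing" (16 shops), negative class = "Growing" (14 shops).
accuracy, sensitivity, specificity = classification_metrics(tp=15, fn=1, tn=11, fp=3)

print(f"accuracy={accuracy:.1%}  sensitivity={sensitivity:.1%}  specificity={specificity:.1%}")
```

Under these assumed counts, four of the 30 shops are misclassified, and three of those four are growing shops predicted as not growing.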

4.2 Decision Tree Model

Decision Tree Performance Metrics
Metric       Value
Accuracy     86.7%
Sensitivity  93.8%
Specificity  78.6%

4.3 Model Validation with Cross-Validation

Cross-Validation Results (5 Folds)
Metric                  Value
Mean Training Accuracy  86.7%
Mean Test Accuracy      70%
Difference              16.7 percentage points
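
To make the procedure concrete, here is a minimal sketch of 5-fold cross-validation using only the Python standard library. The raw data are not included in the report, so the records are rebuilt from the "Growth Status by City" counts. The model is a hypothetical stand-in that predicts each city's most common label, not the report's logistic regression or decision tree, so the numbers it prints will not match the table above.

```python
import random
from collections import Counter

# Rebuild one (city, growing) record per shop from the city-level counts.
city_counts = {"Oakville": (9, 1), "Riverton": (6, 4), "Springfield": (1, 9)}
shops = []
for city, (not_growing, growing) in city_counts.items():
    shops += [(city, 0)] * not_growing + [(city, 1)] * growing


def fit_majority_by_city(rows):
    """Hypothetical stand-in model: predict each city's most common label."""
    votes = {}
    for city, label in rows:
        votes.setdefault(city, Counter())[label] += 1
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    model = {city: counts.most_common(1)[0][0] for city, counts in votes.items()}
    return model, default


def accuracy(fitted, rows):
    """Share of rows whose label matches the fitted model's prediction."""
    model, default = fitted
    hits = sum(model.get(city, default) == label for city, label in rows)
    return hits / len(rows)


def cross_validate(rows, k=5, seed=0):
    """Return (mean training accuracy, mean test accuracy) over k folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]  # k disjoint folds covering all rows
    train_scores, test_scores = [], []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        fitted = fit_majority_by_city(train)  # fit on the other k-1 folds only
        train_scores.append(accuracy(fitted, train))
        test_scores.append(accuracy(fitted, test))
    return sum(train_scores) / k, sum(test_scores) / k


mean_train, mean_test = cross_validate(shops)
gap_points = 100 * (mean_train - mean_test)
print(f"train {mean_train:.1%}, test {mean_test:.1%}, gap {gap_points:.1f} points")
```

The key design point is that each fold's model never sees its own test rows, so the gap between mean training and mean test accuracy gives a rough measure of overfitting.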

5 Key Findings and Recommendations

Based on our analysis, we can draw several conclusions about factors affecting coffee shop growth:

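  1. Coffee quality is crucial: Shops with “good” quality coffee are considerably more likely to be growing. Specifically, 77.8% of shops with “good” quality coffee are growing (7 of 9), compared to only 27.3% of shops with “bad” quality coffee (3 of 11). With only 30 shops, however, this effect does not reach statistical significance in the logistic regression (p = 0.141).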

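  2. Location matters: Springfield shows a 90% growth rate, while Riverton has 40% and Oakville only 10%. City is also the only statistically significant predictor in the logistic regression (Springfield relative to Oakville, p < 0.01), which suggests that local market conditions vary considerably by city.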

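  3. Shop age shows little effect: Growing and non-growing shops have nearly identical years in business (means of 5.0 and 4.9 years, with a median of 5 for both), and the regression coefficient for years_in_business is close to zero (odds ratio 0.97, p = 0.897).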

  4. Model reliability: Our models achieve 86.7% accuracy on the current data, but cross-validation suggests around 70% accuracy on new data. The gap of 16.7 percentage points indicates some overfitting and only moderate generalizability.

5.1 Recommendations

  1. Focus on quality first: Invest in better coffee beans, equipment, and barista training to improve coffee quality, as shops with “good” coffee are growing at nearly three times the rate of shops with “bad” coffee (77.8% versus 27.3%).

  2. Tailor strategies by city:

    • Springfield: Capitalize on favorable market conditions; consider expansion
    • Riverton: Focus on quality improvement to stand out in a competitive market
    • Oakville: Consider more aggressive marketing or repositioning strategies

  3. Consider shop maturity: Although years in business did not predict growth in this sample, newer shops may need different support strategies than established ones:

    • New shops (1-3 years): Focus on building quality and customer base
    • Established shops (4+ years): Leverage reputation while refreshing offerings

  4. Further research needed: Collect additional data on:

    • Shop size and ambiance
    • Menu diversity beyond coffee
    • Pricing strategies
    • Local competition density
    • Demographic information for each location

6 Limitations

This analysis has several limitations that should be considered:

  1. Small sample size: With only 30 coffee shops, our findings may not be robust or generalizable to all coffee shops.

  2. Limited variables: We’re missing potentially important factors that could influence coffee shop growth, such as pricing, marketing, and local demographics.

  3. Binary growth metric: Our “growing” variable is binary, without capturing the magnitude or rate of growth.

  4. Cross-sectional data: This analysis looks at a single point in time, limiting our ability to understand growth trends over time.

  5. Potential overfitting: Despite cross-validation, our models may still be overfitted to this specific dataset due to its small size.
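
To illustrate the small-sample limitation, the sketch below computes a Wilson score interval (a standard approximate 95% confidence interval for a proportion) for each city's growth rate, using the counts from the "Growth Status by City" table. This calculation is not part of the original analysis. It is included only to show how much uncertainty surrounds a rate estimated from 10 shops.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Growing shops per city, out of 10 shops each (from the city table).
growing_by_city = {"Oakville": 1, "Riverton": 4, "Springfield": 9}

for city, growing in growing_by_city.items():
    low, high = wilson_interval(growing, 10)
    print(f"{city}: {growing / 10:.0%} growing, approx. 95% interval {low:.1%} to {high:.1%}")
```

Intervals this wide suggest the city-level rates should be read as rough indications rather than precise estimates.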

link: https://rpubs.com/ostaud/1304854