2 Introduction

This report examines which factors predict whether a coffee shop is growing. Using data from three cities, we fit a regression model with years in business, coffee quality, and city as predictors. City was the strongest predictor (p < 0.001), coffee quality had a smaller positive effect that was only marginally significant (p ≈ 0.10), and years in business had no detectable effect (p ≈ 0.99). The model explained about 49% of the variation in growth (R-squared = 0.49) and was statistically significant overall. These findings can help coffee shop owners understand which factors matter most for growth.

3 Data Overview

The dataset includes the following variables:

name: The name of the coffee shop

city: The city in which the shop is located

street: The street address of the shop

years_in_business: Number of years the shop has been in operation

coffee_quality: A categorical rating of coffee quality (good, ok, or bad)

growing: A binary indicator where 1 means the shop is growing and 0 means it is not

The dataset contains 30 observations across 6 variables.
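The regression later in this report uses numeric columns named coffee_quality_num and city_num in a data frame called t_numbers, but the encoding step itself is not shown. The sketch below is one way such columns could be derived. The rows, the city labels, and the coding scheme (bad = 1, ok = 2, good = 3, and cities numbered alphabetically) are all assumptions made for illustration; they are not the report's data or its actual encoding.

```r
# Hypothetical rows for illustration only; these are NOT the report's data.
shops <- data.frame(
  name = c("Shop A", "Shop B", "Shop C"),
  city = c("City X", "City Y", "City Z"),
  street = c("1 Example St", "2 Example St", "3 Example St"),
  years_in_business = c(2, 7, 4),
  coffee_quality = c("good", "bad", "ok"),
  growing = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Assumed ordering: bad < ok < good, coded as 1, 2, 3.
quality_levels <- c("bad", "ok", "good")
shops$coffee_quality_num <- as.integer(
  factor(shops$coffee_quality, levels = quality_levels)
)

# Assumed coding: cities numbered in alphabetical order of their labels.
shops$city_num <- as.integer(factor(shops$city))

# Keep only the numeric columns used for modelling
# (the data frame name mirrors the one in the model call).
t_numbers <- shops[, c("growing", "years_in_business",
                       "coffee_quality_num", "city_num")]
str(t_numbers)
```

Integer codes for city impose an ordering on the cities that a factor (dummy) coding would not, so the interpretation of the city coefficient depends on how the codes were assigned.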

3.1 Correlations

4 Histograms

We begin by examining where most coffee shops are located, the years in business, and the growth status of the shops.

5 Predicting growth

We use a linear model to predict whether a coffee shop is growing based on the city it is in, the number of years in business, and coffee quality. City and coffee quality enter the model as numeric codes (city_num and coffee_quality_num).

## 
## Call:
## lm(formula = growing ~ years_in_business + coffee_quality_num + 
##     city_num, data = t_numbers)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.83713 -0.27486  0.00378  0.16197  0.88185 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         1.0422838  0.2419808   4.307 0.000209 ***
## years_in_business  -0.0004534  0.0282270  -0.016 0.987308    
## coffee_quality_num  0.1579590  0.0919889   1.717 0.097842 .  
## city_num           -0.3603969  0.0893391  -4.034 0.000428 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3825 on 26 degrees of freedom
## Multiple R-squared:  0.4905, Adjusted R-squared:  0.4317 
## F-statistic: 8.343 on 3 and 26 DF,  p-value: 0.0004736
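The summary statistics in this output are related by standard formulas, which makes them easy to sanity-check. The sketch below recomputes the adjusted R-squared, the F-statistic, and the t value for city_num from the reported figures alone. The variable names are illustrative, and small differences from the printed output can arise because the inputs are rounded.

```r
# Values copied from the summary output above.
r2 <- 0.4905            # multiple R-squared
n <- 30                 # observations
p <- 3                  # predictors, excluding the intercept
df_resid <- n - p - 1   # residual degrees of freedom (26)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 4)   # 0.4317

# Overall F-statistic: explained variance per predictor over
# unexplained variance per residual degree of freedom.
f_stat <- (r2 / p) / ((1 - r2) / df_resid)
round(f_stat, 3)   # 8.343

# Upper-tail probability of the F distribution gives the overall p-value.
p_value <- pf(f_stat, p, df_resid, lower.tail = FALSE)

# Each t value is the estimate divided by its standard error.
t_city <- -0.3603969 / 0.0893391
round(t_city, 3)   # -4.034

# Two-sided p-value for the city_num coefficient.
p_city <- 2 * pt(abs(t_city), df_resid, lower.tail = FALSE)
```

Because growing is coded 0/1, the fitted values of this linear model can be read as estimated probabilities of growth, although nothing constrains them to lie between 0 and 1.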

6 Decision Tree

We fit a decision tree to predict the growth variable.

The confusion matrix below shows how the decision tree's predictions compare with the actual growth status in the test dataset.

##              
##               Growing Not Growing
##   GROWING           6           2
##   NOT GROWING       1           5
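Overall accuracy can be read directly off this matrix as the share of test shops on the diagonal. The sketch below rebuilds the matrix from the counts shown and computes that figure. The object names are illustrative. Accuracy does not depend on which margin holds the predictions, whereas measures such as sensitivity do, so only accuracy is computed here.

```r
# Counts copied from the confusion matrix above.
cm <- matrix(c(6, 2,
               1, 5),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("GROWING", "NOT GROWING"),
                             c("Growing", "Not Growing")))

# Total number of shops in the test set.
n_test <- sum(cm)

# Correct classifications sit on the diagonal: (6 + 5) / 14.
accuracy <- sum(diag(cm)) / n_test
round(accuracy, 3)   # 0.786
```

With only 14 test shops, a single reclassified shop shifts accuracy by about 7 percentage points, so this estimate is rough.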

7 Conclusion

This analysis found that a coffee shop’s location (city) was the strongest predictor of whether it was growing. Coffee quality had a smaller, positive influence that was only marginally significant, while years in business had no detectable effect. The model explained nearly half of the variation in growth. In this sample of 30 shops, these results suggest that local context and quality play a bigger role in growth than longevity.