This report examines which factors predict whether a coffee shop is growing. Using data from three cities, we fit a linear regression model with years in business, coffee quality, and city as predictors. City was the strongest predictor, coffee quality had a smaller positive effect that was only marginally significant (p ≈ 0.10), and years in business showed no clear relationship with growth. The model explained about 49% of the variation in growth and was statistically significant overall (F = 8.34, p < 0.001). These results can help coffee shop owners understand which factors matter most for growth.
The dataset includes the following variables:
name: The name of the coffee shop
city: The city in which the shop is located
street: The street address of the shop
years_in_business: Number of years the shop has been in operation
coffee_quality: A categorical rating of coffee quality — good, ok, or bad
growing: A binary indicator where 1 means the shop is growing and 0 means it is not
The dataset contains 30 observations across 6 variables.
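As a rough illustration of the variable list above, the six columns can be written as a typed record with basic validity checks. This is a minimal Python sketch, not the code used for the analysis: the class name `CoffeeShop`, the validation rules, and the example row are assumptions made for illustration, and the example values are invented placeholders rather than rows from the dataset.

```python
from dataclasses import dataclass

# Allowed categories, taken from the variable description above.
ALLOWED_QUALITY = {"good", "ok", "bad"}


@dataclass(frozen=True)
class CoffeeShop:
    """One row of the dataset (hypothetical representation)."""
    name: str
    city: str
    street: str
    years_in_business: float
    coffee_quality: str
    growing: int

    def __post_init__(self):
        # Reject values that fall outside the documented coding.
        if self.coffee_quality not in ALLOWED_QUALITY:
            raise ValueError(f"unknown coffee_quality: {self.coffee_quality!r}")
        if self.growing not in (0, 1):
            raise ValueError("growing must be 0 or 1")
        if self.years_in_business < 0:
            raise ValueError("years_in_business cannot be negative")


# Placeholder row for illustration only; not taken from the data.
example = CoffeeShop(
    name="Example Cafe",
    city="Example City",
    street="1 Example Street",
    years_in_business=4,
    coffee_quality="good",
    growing=1,
)
print(example)
```

Checking the categories when each row is built is one way to catch miscoded values before the categorical columns are converted to numbers for modelling.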
We begin by examining where most coffee shops are located, how long they have been in business, and whether they are growing.
We use a linear model to predict whether a coffee shop is growing based on the city it is in, the number of years in business, and coffee quality. City and coffee quality enter the model as numeric codes (city_num and coffee_quality_num).
##
## Call:
## lm(formula = growing ~ years_in_business + coffee_quality_num +
##     city_num, data = t_numbers)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.83713 -0.27486  0.00378  0.16197  0.88185
##
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)         1.0422838  0.2419808   4.307 0.000209 ***
## years_in_business  -0.0004534  0.0282270  -0.016 0.987308
## coffee_quality_num  0.1579590  0.0919889   1.717 0.097842 .
## city_num           -0.3603969  0.0893391  -4.034 0.000428 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3825 on 26 degrees of freedom
## Multiple R-squared:  0.4905, Adjusted R-squared:  0.4317
## F-statistic: 8.343 on 3 and 26 DF,  p-value: 0.0004736
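The figures in the summary above are linked by standard formulas, so they can be cross-checked by hand. The Python sketch below recomputes the t values, the adjusted R-squared, and the F-statistic from the reported estimates, and evaluates the fitted equation for one input. It is an illustration, not the original analysis, which was run with R's lm. The helper `predict_growing` and the numeric codes passed to it are assumptions: the report does not say how city_num and coffee_quality_num were coded.

```python
# Estimates and standard errors copied from the regression output above.
coefficients = {
    "(Intercept)": (1.0422838, 0.2419808),
    "years_in_business": (-0.0004534, 0.0282270),
    "coffee_quality_num": (0.1579590, 0.0919889),
    "city_num": (-0.3603969, 0.0893391),
}
n_obs = 30          # observations in the dataset
n_predictors = 3    # predictors, excluding the intercept
r_squared = 0.4905  # multiple R-squared from the output

# Each t value is the estimate divided by its standard error.
t_values = {name: est / se for name, (est, se) in coefficients.items()}

# Residual degrees of freedom: n - p - 1.
residual_df = n_obs - n_predictors - 1

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df

# Overall F-statistic expressed in terms of R-squared.
f_statistic = (r_squared / n_predictors) / ((1 - r_squared) / residual_df)


def predict_growing(years_in_business, coffee_quality_num, city_num):
    """Fitted value from the reported coefficients (hypothetical helper)."""
    return (
        coefficients["(Intercept)"][0]
        + coefficients["years_in_business"][0] * years_in_business
        + coefficients["coffee_quality_num"][0] * coffee_quality_num
        + coefficients["city_num"][0] * city_num
    )


for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")
print(f"residual df = {residual_df}")
print(f"adjusted R-squared = {adj_r_squared:.4f}")
print(f"F = {f_statistic:.3f}")

# The input codes below are assumed for illustration, not taken from the data.
print(f"fitted value = {predict_growing(5, 2, 1):.3f}")
```

Because the outcome is coded 0/1 and the model is linear, fitted values are not restricted to the range from 0 to 1, so they are better read as rough scores than as strict probabilities.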
We also fit a decision tree to predict the growing variable.
Predicting on the test data gives the confusion matrix below, which shows how the decision tree's predictions compare with the observed growth status in the test dataset.
##
##               Growing Not Growing
##   GROWING           6           2
##   NOT GROWING       1           5
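The confusion matrix can be summarised with a few ratios. The Python sketch below computes overall accuracy, which does not depend on whether rows or columns hold the predictions, along with the share of agreeing cases within each row and each column. The report does not state which axis is predicted and which is observed, so the row and column ratios are deliberately left unnamed rather than called precision or recall. All variable names are illustrative.

```python
# Counts copied from the confusion matrix above.
row_labels = ["GROWING", "NOT GROWING"]
col_labels = ["Growing", "Not Growing"]
counts = [
    [6, 2],
    [1, 5],
]

# Total number of test cases.
total = sum(sum(row) for row in counts)

# Cases on the diagonal, where the row and column labels agree.
agreements = sum(counts[i][i] for i in range(len(counts)))

# Overall accuracy: agreeing cases divided by all cases.
accuracy = agreements / total

# Share of each row that lies on the diagonal.
row_agreement = [counts[i][i] / sum(counts[i]) for i in range(len(counts))]

# Share of each column that lies on the diagonal.
col_totals = [sum(row[j] for row in counts) for j in range(len(col_labels))]
col_agreement = [counts[j][j] / col_totals[j] for j in range(len(col_labels))]

print(f"test cases: {total}")
print(f"accuracy: {accuracy:.3f}")
for label, share in zip(row_labels, row_agreement):
    print(f"row {label}: {share:.3f}")
for label, share in zip(col_labels, col_agreement):
    print(f"column {label}: {share:.3f}")
```

With only 14 test cases, each misclassified shop moves the accuracy by about seven percentage points, so these ratios should be read as rough indications.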
This analysis found that a coffee shop’s location (city) was the strongest predictor of whether it was growing. Coffee quality had a smaller, positive influence, while years in business showed no detectable effect. The model explained nearly half of the variation in growth (R-squared = 0.49). These results suggest that local context and quality play a bigger role in growth than longevity.