This document analyzes a dataset of coffee shops to understand the factors influencing their growth. The dataset includes information on the coffee shop’s name, street, city, years in business, coffee quality, and whether the shop is growing. We will explore the relationships between these variables using visualizations and statistical models.
The correlation matrix summarizes the pairwise relationships between the
variables in the dataset. The correlation between
city_num and
growing is positive, indicating that shops in higher-numbered cities are
more often growing (the sign depends on how the cities were coded). The
correlation between coffee_quality and growing is also positive,
suggesting that higher coffee quality is associated with growth. The
correlation between city_num and
coffee_quality is weaker, indicating that coffee quality differs only
modestly across cities, so the two variables carry largely separate
information about growth.
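A minimal sketch of how such a correlation matrix could be computed in R. The data frame `shops` and all of its values are invented for illustration; only the column names follow the model output in this document, and the numeric codings are an assumption.

```r
# Hypothetical toy data: the values are made up; only the column names
# mirror the variables used in this document.
shops <- data.frame(
  city_num           = c(1, 1, 2, 2, 3, 3, 3, 2),
  coffee_quality_num = c(1, 2, 1, 3, 2, 3, 3, 2),
  growing            = c(0, 0, 0, 1, 1, 1, 1, 0)
)

# Pearson correlations between all pairs of numeric columns.
corr <- cor(shops)
print(round(corr, 2))

# Spearman rank correlation is a common alternative for ordinal codes.
corr_rank <- cor(shops, method = "spearman")
print(round(corr_rank, 2))
```

If `city_num` is a numeric code for a categorical city label, as the name suggests, the sign and size of its correlations depend on how the cities were numbered, so they are best read as a rough first look rather than a measure of effect.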
##
## Call:
## glm(formula = growing ~ city_num + coffee_quality_num, family = binomial,
## data = t)
##
## Coefficients:
##                    Estimate Std. Error z value Pr(>|z|)
## (Intercept)         -6.3146     2.0804  -3.035  0.00240 **
## city_num             2.1409     0.7718   2.774  0.00554 **
## coffee_quality_num   1.0020     0.6385   1.569  0.11658
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 41.455 on 29 degrees of freedom
## Residual deviance: 23.927 on 27 degrees of freedom
## AIC: 29.927
##
## Number of Fisher Scoring iterations: 5
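The coefficients above are on the log-odds scale, which is hard to read directly. The sketch below only re-uses the numbers printed in the summary to express them as odds ratios with approximate Wald intervals, and to compare the fitted model with the intercept-only model through the drop in deviance. The variable names are illustrative.

```r
# Estimates and standard errors copied from the summary above.
est <- c(city_num = 2.1409, coffee_quality_num = 1.0020)
se  <- c(city_num = 0.7718, coffee_quality_num = 0.6385)

# Odds ratios: multiplicative change in the odds of growing per one-unit
# increase in each predictor, holding the other fixed.
odds_ratio <- exp(est)

# Approximate 95% Wald intervals on the odds-ratio scale.
ci_lower <- exp(est - qnorm(0.975) * se)
ci_upper <- exp(est + qnorm(0.975) * se)
print(round(cbind(odds_ratio, ci_lower, ci_upper), 2))

# Likelihood-ratio test of the fitted model against the intercept-only model.
dev_drop <- 41.455 - 23.927   # null deviance minus residual deviance
df_drop  <- 29 - 27           # two predictors added
p_value  <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)
print(p_value)
```

The interval for `coffee_quality_num` includes 1, which matches its non-significant p-value, while the interval for `city_num` lies above 1. With only 30 observations (29 null degrees of freedom), Wald intervals are rough approximations.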
Here we can see that the predicted probability of growth is higher for
coffee shops with good coffee quality, and that the city also plays a
significant role. The model predicts that coffee shops in certain
cities are more likely to grow, even if they have lower coffee
quality, which is consistent with the larger city_num coefficient
(2.14 versus 1.00 for coffee_quality_num).
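To make the predicted probabilities concrete, the following sketch evaluates the fitted equation on a small grid using the coefficients reported in the model summary. The document does not state how `city_num` and `coffee_quality_num` are coded, so the integer codes 1 to 3 used here are an assumption.

```r
# Coefficients copied from the fitted model reported in this document.
b0        <- -6.3146
b_city    <- 2.1409
b_quality <- 1.0020

# ASSUMPTION: both predictors are coded as integers 1..3; the real
# codings are not given in the document.
grid <- expand.grid(city_num = 1:3, coffee_quality_num = 1:3)

# The inverse logit of the linear predictor gives P(growing = 1).
grid$p_growing <- plogis(b0 + b_city * grid$city_num +
                         b_quality * grid$coffee_quality_num)
print(round(grid, 3))
```

Under this assumed coding, a shop in city 3 with quality 1 has a predicted probability of about 0.75, while a shop in city 1 with quality 3 has about 0.24, which illustrates how city can outweigh coffee quality in this model.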
The decision tree ranks city as the most important variable for predicting growth, followed by coffee quality. The tree first splits the data into two main branches: one for good coffee quality and one for ok or bad coffee quality. Within the good coffee quality branch, there are further splits based on the city. Both variables therefore contribute to the tree's predictions: coffee quality defines the first split, while city refines the predictions within the good-quality branch.
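A classification tree of this kind could be fit as sketched below. The document does not name the package it used, so `rpart` is an assumption, and the data frame `shops` and the rule that generates `growing` are invented for illustration.

```r
library(rpart)  # ASSUMPTION: the document does not name its tree package

# Hypothetical toy data; values are invented for illustration.
shops <- data.frame(
  city           = factor(rep(c("A", "B", "C"), each = 10)),
  coffee_quality = factor(rep(c("bad", "ok", "good"), times = 10),
                          levels = c("bad", "ok", "good"))
)
# Toy rule: a shop grows if its quality is good or if it is in city C.
shops$growing <- factor(ifelse(shops$coffee_quality == "good" |
                               shops$city == "C", "yes", "no"))

# Classification tree; minsplit is lowered because the data set is tiny.
tree <- rpart(growing ~ city + coffee_quality, data = shops,
              method = "class",
              control = rpart.control(minsplit = 5, cp = 0.01))

print(tree)                      # text view of the splits
print(tree$variable.importance)  # importance also credits surrogate splits
```

In `rpart`, variable importance sums the improvement from primary and surrogate splits, so the top-ranked variable is not necessarily the one used at the first split. That may be why a tree can rank city first while splitting on coffee quality at the root.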
## [1] "Test Accuracy: 91.67 %"
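An accuracy figure in this format can be produced as sketched here. The vectors `actual` and `predicted` are hypothetical test-set labels and are not the document's data.

```r
# Hypothetical test-set labels and predictions (12 shops, invented values).
actual    <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
predicted <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1)

# Accuracy is the share of test shops whose predicted class matches.
accuracy <- mean(predicted == actual)
msg <- paste("Test Accuracy:", round(accuracy * 100, 2), "%")
print(msg)

# Confusion matrix: rows are predictions, columns are true labels.
print(table(predicted = predicted, actual = actual))
```

The reported 91.67 % is consistent with, for example, 11 correct predictions out of 12, although the document does not state the size of its test set. On a test set that small, a single misclassified shop moves the accuracy by several percentage points.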
In this analysis, we explored the relationships between coffee shop growth, city, and coffee quality. In the logistic regression, city is a significant predictor of growth (p = 0.006), while the effect of coffee quality is positive but not significant at the 5% level (p = 0.117). The decision tree uses both variables and provides a clear visual representation of how they interact to influence growth. The test set evaluation gave an accuracy of 91.67%, which is encouraging, although the logistic regression was fit to only 30 shops, so these results should be read with caution.