1 Introduction

The aim of this project is to examine whether the success of GoodLood ice cream shops in Kraków can be explained by their spatial location and surroundings, or whether it is primarily driven by brand-related factors and online popularity.

More specifically, the analysis focuses on:

• the spatial distribution of GoodLood shops and competing ice cream shops across Kraków,

• the “areas of influence” of individual ice cream shops, modelled using Voronoi tessellations,

• characteristics of the local environment (density of other services, access to public transport etc.),

• the presence of spatial dependencies, assessed through spatial autocorrelation measures and spatial econometric models.

The project combines self-collected data (obtained through web scraping) with publicly available spatial data from OpenStreetMap.


2 Data

2.1 Ice cream shops data

First, the data on ice cream shops in Kraków and the administrative boundary of the city were loaded.

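library(sf)
library(dplyr)
library(tidyr)
library(ggplot2)
library(osmdata)
library(scales)
library(spdep)
library(spatialreg)
library(knitr)
library(modelsummary)

lodziarnie <- read.csv("lodziarnie.csv", stringsAsFactors = FALSE, fileEncoding = "UTF-8", encoding = "UTF-8")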
krakow_raw <- opq("Kraków, Poland") |>
  add_osm_feature(key = "boundary", value = "administrative") |>
  osmdata_sf()

krakow_poly <- krakow_raw$osm_multipolygons |>
  filter(name == "Kraków") |>
  st_transform(2180) |>
  st_union() |>
  st_sf(geometry = _)

2.1.1 Data transformation into sf object

First, the data were converted into an sf object and a GoodLood flag was added. As an overview, I then visualised the ice cream shops to inspect their distribution across Kraków. For better visibility, the GoodLood shops are marked in the brand’s characteristic pink.

lodziarnie_sf <- st_as_sf(lodziarnie, coords = c("lon", "lat"), crs = 4326)
lodziarnie_sf <- st_transform(lodziarnie_sf, 2180)


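# The brand flag needs a `name` column; stop early if it is missing
stopifnot("name" %in% names(lodziarnie_sf))

# Flag GoodLood shops ("GoodLood", "Good Lood", any letter case)
lodziarnie_sf <- lodziarnie_sf |>
  mutate(is_goodlood = ifelse(grepl("good[[:space:]]*lood", name, ignore.case = TRUE), 1, 0))

# Factor used for colouring the maps
lodziarnie_sf <- lodziarnie_sf |>
  mutate(typ = factor(
      ifelse(is_goodlood, "GoodLood", "Competitors"),
      levels = c("Competitors", "GoodLood")
    ))
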
ggplot() +
  geom_sf(data = krakow_poly, fill = "grey97", color = "grey85") +
  geom_sf(data = lodziarnie_sf, 
          aes(color = typ),
          size = 3) +
  scale_color_manual(
    values = c("Competitors" = "#F7E4D1", "GoodLood" = "#B51E55"),
    name = "Ice cream shop type"
  ) +
  theme_minimal() +
  labs(
    title = "Locations of GoodLood Ice Cream Shops and Competitors in Kraków",
    caption = "Data sources: self-collected data (web scraping) and OpenStreetMap"
  )
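
The brand flag relies on a case-insensitive pattern that tolerates whitespace between "good" and "lood". The sketch below illustrates its behaviour; the shop names are hypothetical and are not taken from the scraped data.

```r
# Hypothetical shop names, used only to illustrate the pattern
shop_names <- c("Good Lood", "GOODLOOD Kazimierz", "good   lood",
                "Lody Tradycyjne", "Goodlody")

# [[:space:]]* allows zero or more whitespace characters between "good" and "lood"
is_goodlood <- grepl("good[[:space:]]*lood", shop_names, ignore.case = TRUE)

data.frame(name = shop_names, is_goodlood = as.integer(is_goodlood))
```

Names that merely start with "Good" but do not continue with "lood" are not flagged, so the pattern should not pick up unrelated shops with similar names.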

2.1.2 Voronoi tessellations

To better understand the spatial reach of each ice cream shop, I constructed a Voronoi tessellation based on the point locations of all ice cream shops in the Kraków area. Each Voronoi cell contains the locations that are closer to its shop than to any other shop, and is therefore treated as the potential area of influence of that shop.

The tessellation was generated from the individual shop coordinates and clipped to the Kraków boundary. It forms the basis for the later analysis of the neighbourhood of each ice cream shop.

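# Drop duplicated coordinates: Voronoi cells are only defined for distinct points
lodziarnie_unique <- lodziarnie_sf |>
  distinct(geometry, .keep_all = TRUE) |>
  mutate(voronoi_id = row_number())

# Build the tessellation from the union of all shop locations
v_raw <- st_voronoi(st_union(lodziarnie_unique))
v_polys <- st_collection_extract(v_raw, "POLYGON")

# st_voronoi() does not return cells in the order of the input points,
# so this id is only a provisional polygon index
voronoi_sf <- st_sf(
  voronoi_id = lodziarnie_unique$voronoi_id,
  geometry   = st_sfc(v_polys, crs = 2180)
)

# Clip to the city boundary, then attach shop attributes by containment
voronoi_clip <- st_intersection(voronoi_sf, krakow_poly)
voronoi_full <- st_join(voronoi_clip, lodziarnie_unique, join = st_contains)
voronoi_full <- voronoi_full |> mutate(voronoi_id = dplyr::row_number())
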
ggplot() +
  geom_sf(data = krakow_poly, fill = "grey97", color = "grey85") +
  geom_sf(data = voronoi_full,
          aes(fill = typ),
          color = "#3B1F24",
          size = 0.2,
          alpha = 0.7) +
  geom_sf(data = lodziarnie_sf,
          color = "black",
          size = 1) +
  scale_fill_manual(
    values = c("Competitors" = "#F7E4D1", "GoodLood" = "#B51E55"),
    name   = "Ice cream shop type"
  ) +
  theme_minimal() +
  labs(
    title   = "Voronoi cells for GoodLood ice cream shops and competitors in Kraków",
    caption = "Data sources: self-collected data (web scraping) and OpenStreetMap"
  )
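
Because st_voronoi() returns cells in no guaranteed order, each cell has to be matched to its generating point spatially rather than by position. The following is a minimal sketch of that matching on toy data; the coordinates are hypothetical, and the approach (st_intersects followed by reordering) is one possible alternative to the st_join used above, not the method of this report.

```r
library(sf)

# Toy points in a planar coordinate system (hypothetical coordinates)
pts <- st_sfc(
  st_point(c(0, 0)), st_point(c(4, 1)), st_point(c(1, 3)), st_point(c(5, 5))
)

# One Voronoi cell per point; the cells come back in no guaranteed order
cells <- st_collection_extract(st_voronoi(st_union(pts)), "POLYGON")

# For each point, find the index of the cell that contains it ...
cell_of_point <- unlist(st_intersects(pts, cells))

# ... and reorder the cells so that cell i belongs to point i
cells_ordered <- cells[cell_of_point]
```

After reordering, attributes of the points can be bound to the cells row by row.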

2.2 Points of interest in Voronoi cells

To characterise the zone of influence (Voronoi cell) of each ice cream shop, a set of variables capturing the key features of each cell was constructed:

Transport accessibility: number of bus and tram stops within the cell, distance to the nearest stop.

Green and recreational areas: total area of parks and green spaces, share of green space in the cell.

Touristic and cultural amenities: count of attractions, historic sites and museums.

Commercial and service environment: number of restaurants, cafés, bars, fast-food, and retail shops.

Educational facilities: number of schools and universities.

All features were retrieved from OpenStreetMap by intersecting the Voronoi polygons with OSM point and polygon layers. Counts of POI (points of interest) were obtained using spatial intersections (st_intersects), while measures of area and green-space share rely on geometric overlay operations (st_intersection, st_area). Distances to the nearest transport stop were computed from Voronoi centroids using st_distance.

bus_stops   <- st_read("data/osm_bus_stops.gpkg", quiet = TRUE)
tram_stops  <- st_read("data/osm_tram_stops.gpkg", quiet = TRUE)
food_pois   <- st_read("data/osm_food_pois.gpkg", quiet = TRUE)
shops       <- st_read("data/osm_shops.gpkg", quiet = TRUE)
attractions <- st_read("data/osm_attractions.gpkg", quiet = TRUE)
museums     <- st_read("data/osm_museums.gpkg", quiet = TRUE)
historic    <- st_read("data/osm_historic.gpkg", quiet = TRUE)
schools     <- st_read("data/osm_schools.gpkg", quiet = TRUE)
uni         <- st_read("data/osm_uni.gpkg", quiet = TRUE)
parks  <- st_read("data/osm_parks.gpkg", quiet = TRUE)
wisla  <- st_read("data/osm_wisla.gpkg", quiet = TRUE)
rynek  <- st_read("data/osm_rynek.gpkg", quiet = TRUE)
wawel  <- st_read("data/osm_wawel.gpkg", quiet = TRUE)
buildings <- st_read("data/osm_buildings.gpkg", quiet = TRUE)

all_pois <- list(bus_stops, tram_stops, food_pois, shops,
                 attractions, museums, historic, schools, uni)
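
The aggregation code itself is not shown in this report. As a rough sketch of how per-cell counts and centroid distances of this kind can be derived, the example below uses two toy square cells and four toy points. All object and column names (cells, pois, poi_count, dist_to_poi_m) are illustrative assumptions and not the names used to build voronoi_full.

```r
library(sf)

# Two toy square "cells" (hypothetical data in a projected CRS)
cells <- st_sf(
  cell_id = 1:2,
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
    st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0)))),
    crs = 2180
  )
)

# Four toy points of interest; the last one lies outside both cells
pois <- st_sf(
  geometry = st_sfc(
    st_point(c(0.2, 0.2)), st_point(c(0.5, 0.8)),
    st_point(c(1.5, 0.5)), st_point(c(3, 3)),
    crs = 2180
  )
)

# Count per cell: st_intersects() lists, for each cell, the points it contains
cells$poi_count <- lengths(st_intersects(cells, pois))

# Distance from each cell centroid to its nearest point of interest
cent    <- st_centroid(st_geometry(cells))
nearest <- st_nearest_feature(cent, pois)
cells$dist_to_poi_m <- as.numeric(
  st_distance(cent, st_geometry(pois)[nearest], by_element = TRUE)
)

st_drop_geometry(cells)
```

The same two steps would be repeated for each POI layer; area-based measures such as green-space share additionally require st_intersection and st_area and are not covered by this sketch.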

2.2.1 POI heatmap visualisations per Voronoi cell

In this subsection, the spatial distribution of selected points of interest (POI) within the zones of influence of individual ice cream shops is visualised. Using the previously constructed Voronoi tessellation, I aggregated different POI categories to the level of each Voronoi cell and mapped these counts as heatmaps.

ggplot() +
  geom_sf(data = voronoi_full,
          aes(fill = school_count),
          color = "white",
          size  = 0.2) +
  scale_fill_viridis_c(option = "C", direction = 1,
                       name = "Number of schools") +
  theme_minimal() +
  labs(
    title   = "Number of schools within Voronoi cells",
    subtitle = "Zones of influence of ice cream shops in Kraków",
    caption = "Data: self-collected outlets + OSM (schools)"
  )

ggplot() +
  geom_sf(data = voronoi_full,
          aes(fill = food_poi_count),
          color = "white",
          size  = 0.2) +
  scale_fill_viridis_c(option = "C", direction = 1,
                       name = "Restaurants") +
  theme_minimal() +
  labs(
    title    = "Number of restaurants within Voronoi cells",
    subtitle = "Zones of influence of ice cream shops in Kraków",
    caption  = "Data: OSM (restaurants)"
  )

ggplot() +
  geom_sf(data = voronoi_full,
          aes(fill = park_count),
          color = "white",
          size  = 0.2) +
  scale_fill_viridis_c(
    option   = "C",
    direction = 1,
    name     = "Number of parks"
  ) +
  theme_minimal() +
  labs(
    title    = "Number of parks within Voronoi cells",
    subtitle = "Zones of influence of ice cream shops in Kraków",
    caption  = "Data: self-collected outlets + OSM (parks, gardens, playgrounds)"
  )

3 Spatial analysis

3.0.1 Goal of the project and problem definition

In this project, the main goal is not to measure the performance of ice cream shops directly, but rather to investigate whether the spatial success of the GoodLood brand can be related to unique local characteristics. Since GoodLood has become a widely recognized local brand in Kraków and has even expanded to other Polish cities, the empirical question is whether its shops tend to be located in more advantageous urban environments than competing ice cream shops.

To address this question, the dependent variable is defined as a binary indicator: 1 if the Voronoi cell contains a GoodLood ice cream shop, and 0 if it contains a competitor shop.

This formulation makes it possible to model the probability that a location is occupied by GoodLood as a function of its spatial characteristics. If these factors significantly predict the presence of GoodLood shops, it would suggest that the brand’s success is at least partly rooted in beneficial locations, rather than being driven solely by brand-related factors.

voronoi_full <- voronoi_full |>
  mutate(
    is_goodlood = ifelse(typ == "GoodLood", 1L, 0L)
  )

3.1 Characteristics of the closest neighbourhood of GoodLood and competitors

Before moving to spatial econometric models, the local environments of GoodLood and competing ice cream shops were compared. The goal is to check whether GoodLood tends to locate in areas with a systematically higher number of attractions or a richer service context.

The heatmap below provides an overview of the environmental characteristics surrounding GoodLood ice cream shops and their competitors. For each variable, the two group means were scaled to a 0–1 range within that variable. Because only two groups are compared, the group with the higher mean receives 1 and the group with the lower mean receives 0, so the plot shows the direction of each difference but not its size.

GoodLood outlets tend to be located in areas with greater concentrations of food-related services, more parks and playgrounds, and closer to central urban landmarks such as the Main Square and Wawel Castle. On average, they are also situated nearer to public transport stops. Competitor shops, on the other hand, are more often found in zones with higher counts of educational institutions (schools and universities) and certain categories of cultural POI.

vars_to_sum <- c(
  "bus_stop_count", "tram_stop_count",
  "food_poi_count", "shop_count",
  "attraction_count", "museum_count", "historic_count",
  "school_count", "uni_count",
  "park_count",
  "dist_to_stop_m", "dist_to_wisla_m",
  "dist_to_rynek_m", "dist_to_wawel_m",
  "building_count", "built_share",
  "share_residential", "share_commercial_retail"
)

means_by_type <- voronoi_full |>
  st_drop_geometry() |>
  group_by(typ) |>
  summarise(
    across(all_of(vars_to_sum), ~mean(.x, na.rm = TRUE)),
    .groups = "drop"
  ) |>
  pivot_longer(
    cols = all_of(vars_to_sum),
    names_to = "variable",
    values_to = "mean_value"
  )

means_scaled <- means_by_type |>
  group_by(variable) |>
  mutate(mean_scaled = rescale(mean_value)) |>
  ungroup()

nice_labels <- c(
  bus_stop_count        = "Bus stops",
  tram_stop_count       = "Tram stops",
  food_poi_count        = "Food-related POI",
  shop_count            = "Shops",
  attraction_count      = "Tourist attractions",
  museum_count          = "Museums",
  historic_count        = "Historic sites",
  school_count          = "Schools",
  uni_count             = "Universities",
  park_count            = "Parks / playgrounds",
  dist_to_stop_m        = "Distance to nearest stop [m]",
  dist_to_wisla_m       = "Distance to Vistula [m]",
  dist_to_rynek_m       = "Distance to Main Square [m]",
  dist_to_wawel_m       = "Distance to Wawel [m]",
  building_count        = "Number of buildings",
  built_share           = "Built-up share",
  share_residential     = "Share of residential buildings",
  share_commercial_retail = "Share of commercial/retail buildings"
)

means_scaled$variable_label <- nice_labels[means_scaled$variable]

ggplot(means_scaled, aes(x = variable_label, y = typ, fill = mean_scaled)) +
  geom_tile(color = "white") +
  scale_fill_viridis_c(
    name = "Scaled mean\n(0–1 per variable)"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.title  = element_blank()
  ) +
  labs(
    title = "Relative profile of environmental and built-up characteristics",
    subtitle = "Comparison of average Voronoi-cell context for GoodLood vs competitors\n(values scaled 0–1 within each variable)",
    caption = "Scaling highlights relative differences; not statistical standardisation."
  )
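
The effect of the scaling step can be checked on a small example. The sketch below applies rescale() from the scales package to two pairs of hypothetical group means; the numbers are invented for illustration only.

```r
library(scales)

# Hypothetical group means for one variable: a large gap between the groups
large_gap <- c(Competitors = 9.1, GoodLood = 12.4)
# Hypothetical group means for another variable: a very small gap
small_gap <- c(Competitors = 9.1, GoodLood = 9.2)

# Min-max scaling: (x - min) / (max - min), applied within each variable
scaled_large <- as.numeric(rescale(large_gap))
scaled_small <- as.numeric(rescale(small_gap))

rbind(scaled_large, scaled_small)
```

Both pairs are mapped to exactly 0 and 1, which is why the heatmap should be read as showing which group has the higher mean, not by how much.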

3.1.1 Spatial Weights Matrix

Next, the spatial weights graph was inspected. It reveals a highly interconnected structure, particularly in the central part of Kraków. Voronoi cells in the city centre are small and densely packed, forming a tightly knit neighbourhood network. This makes local spatial dependence plausible, since the characteristics of one cell are likely to resemble those of nearby cells; the next subsection tests this formally.

voronoi_nb <- poly2nb(voronoi_full, queen = TRUE)
lw_vor <- nb2listw(voronoi_nb, style = "W")
par(mar = c(3, 3, 3, 3))

plot(st_geometry(voronoi_full),
     border = "grey70", col = NA,
     main = "Spatial Neighbourhood Structure of Voronoi Cells",
     cex.main = 1.1)

cent_coords <- st_coordinates(st_centroid(voronoi_full))

plot(voronoi_nb, cent_coords,
     add = TRUE,
     col = "#B51E55",
     lwd = 1.2)

legend(
  "bottomleft",
  legend = c("Voronoi borders", "Neighbour links"),
  col    = c("grey70", "#B51E55"),
  lwd    = c(0.8, 1.5),
  bty    = "n"
)
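
To make the meaning of queen contiguity and of the row-standardised weights (style = "W") more concrete, the sketch below builds the same kind of objects on a toy 3 x 3 grid of squares. The grid stands in for the Voronoi cells and is an assumption made purely for illustration.

```r
library(sf)
library(spdep)

# A 3 x 3 grid of unit squares as a stand-in for Voronoi cells (toy data)
bbox <- st_bbox(c(xmin = 0, ymin = 0, xmax = 3, ymax = 3))
grid <- st_make_grid(st_as_sfc(bbox), n = c(3, 3))

# Queen contiguity: cells sharing an edge or a corner are neighbours
nb_queen <- poly2nb(grid, queen = TRUE)
# Rook contiguity: only cells sharing an edge are neighbours
nb_rook  <- poly2nb(grid, queen = FALSE)

# Number of neighbours per cell under each definition
card(nb_queen)
card(nb_rook)

# Row-standardised weights: the weights of each cell's neighbours sum to 1
lw <- nb2listw(nb_queen, style = "W")
row_sums <- sapply(lw$weights, sum)
row_sums
```

With row standardisation, the spatial lag of a variable is simply the average of that variable over a cell's neighbours, which keeps cells with many neighbours (as in the city centre) from dominating the statistics.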

3.1.2 Moran’s I statistics

In the next step, Moran’s I statistics were calculated. They reveal significant spatial autocorrelation for nearly all environmental variables describing the surroundings of ice cream shops in Kraków. Distances to the Vistula, the Main Square and Wawel show very strong autocorrelation (I between 0.84 and 0.91), reflecting the monocentric structure of the city.
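
For reference, the global Moran’s I statistic can be written as follows, where $w_{ij}$ are the elements of the spatial weights matrix, $x_i$ is the value of the variable in cell $i$ and $n$ is the number of cells. The notation is the standard textbook one and is not taken from this report.

```latex
I = \frac{n}{S_0}\,
    \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}
         {\sum_{i=1}^{n} (x_i-\bar{x})^2},
\qquad
S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij},
\qquad
\mathbb{E}[I] = -\frac{1}{n-1}
```

With the $n = 199$ Voronoi cells used in this analysis, the expected value under no spatial autocorrelation is $-1/198 \approx -0.00505$, which agrees with the expected_I column after rounding to four decimals.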

vars_moran <- c(
  "bus_stop_count", "tram_stop_count",
  "food_poi_count", "shop_count",
  "attraction_count", "museum_count", "historic_count",
  "school_count", "uni_count", "park_count",
  "dist_to_stop_m", "dist_to_wisla_m",
  "dist_to_rynek_m", "dist_to_wawel_m",
  "built_area_m2", "built_share", "building_count",
  "share_commercial_retail", "share_industrial",
  "share_other", "share_public", "share_religious"
)


moran_results <- lapply(vars_moran, function(v) {
  test <- moran.test(voronoi_full[[v]], lw_vor)
  tibble(
    variable = v,
    moran_I = round(test$estimate["Moran I statistic"], 4),
    expected_I = round(test$estimate["Expectation"], 4),
    p_value = signif(test$p.value, 3)
  )
})

#pretty layout 
moran_table <- bind_rows(moran_results)
moran_table <- moran_table |> mutate(significant = ifelse(p_value < 0.05, "Yes", "No"))

kable(
  moran_table,
  caption = "Global Moran’s I for each of spatial variables",
  align = "lccc"
)

Global Moran’s I for each of spatial variables

| variable | moran_I | expected_I | p_value | significant |
|:---|:---:|:---:|:---:|:---|
| bus_stop_count | 0.4000 | -0.0051 | 0.00e+00 | Yes |
| tram_stop_count | 0.1185 | -0.0051 | 1.25e-03 | Yes |
| food_poi_count | 0.1136 | -0.0051 | 7.60e-04 | Yes |
| shop_count | 0.3000 | -0.0051 | 0.00e+00 | Yes |
| attraction_count | 0.0303 | -0.0051 | 1.81e-01 | No |
| museum_count | 0.0647 | -0.0051 | 1.48e-02 | Yes |
| historic_count | 0.1251 | -0.0051 | 3.35e-04 | Yes |
| school_count | 0.1264 | -0.0051 | 8.37e-04 | Yes |
| uni_count | 0.1681 | -0.0051 | 6.20e-06 | Yes |
| park_count | 0.0860 | -0.0051 | 6.75e-03 | Yes |
| dist_to_stop_m | 0.0552 | -0.0051 | 7.28e-02 | No |
| dist_to_wisla_m | 0.8432 | -0.0051 | 0.00e+00 | Yes |
| dist_to_rynek_m | 0.9039 | -0.0051 | 0.00e+00 | Yes |
| dist_to_wawel_m | 0.9063 | -0.0051 | 0.00e+00 | Yes |
| built_area_m2 | 0.3934 | -0.0051 | 0.00e+00 | Yes |
| built_share | 0.7298 | -0.0051 | 0.00e+00 | Yes |
| building_count | 0.4647 | -0.0051 | 0.00e+00 | Yes |
| share_commercial_retail | 0.1817 | -0.0051 | 2.40e-06 | Yes |
| share_industrial | 0.1510 | -0.0051 | 4.35e-05 | Yes |
| share_other | 0.1903 | -0.0051 | 1.50e-06 | Yes |
| share_public | 0.2427 | -0.0051 | 0.00e+00 | Yes |
| share_religious | 0.3122 | -0.0051 | 0.00e+00 | Yes |

Most count and building-use variables (for example tram_stop_count, food_poi_count, school_count, park_count and share_commercial_retail) show weaker, yet statistically significant, spatial dependence, indicating that services, schools, parks and commercial buildings tend to form localised spatial clusters. Only two variables, attraction_count and dist_to_stop_m, do not show significant spatial autocorrelation at the 5% level.

These results confirm that the dataset has a strong spatial structure, which motivates the use of spatial econometric models in the further analysis. As an example, the map below shows the built-up share and illustrates how much Voronoi cells differ in their local environment. The central part of Kraków is much more densely built-up, while the outer areas are more open and less urbanised.

library(ggplot2)

ggplot(voronoi_full) +
  geom_sf(aes(fill = built_share), color = "white", size = 0.2) +
  scale_fill_viridis_c(
    option = "C",
    name   = "Built-up share"
  ) +
  theme_minimal() +
  labs(
    title   = "Built-up intensity within Voronoi cells",
    subtitle = "Higher values indicate a larger share of built-up area",
    caption = "Data: OSM buildings intersected with Voronoi cells"
  )

4 Spatial modelling

The exploratory analysis showed that many variables describing the surroundings of ice cream shops are spatially clustered. Because neighbouring areas tend to look similar, classic regression models, which assume independent observations, may give misleading results. For this reason, spatial econometric methods are considered.

In this section, the probability that a Voronoi cell contains a GoodLood shop (1) rather than a competitor (0) is modelled. First, a simple logistic regression was estimated as a non-spatial baseline. Then, three spatial models were applied:

  • SLX model
  • SAR model
  • SEM model

These models make it possible to evaluate whether GoodLood tends to choose locations with specific spatial features, whether neighbourhood spillovers play a role, and whether additional spatial processes influence the pattern of shops in Kraków.
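
For orientation, the four specifications can be written in their usual textbook form. Here $y$ is the GoodLood indicator, $X$ the matrix of explanatory variables, $W$ the row-standardised spatial weights matrix and $\varepsilon$ an independent error term. The symbols $\rho$ and $\lambda$ follow the output reported later in this section, while $\theta$ is notation introduced here for the coefficients of the spatially lagged regressors.

```latex
\begin{aligned}
\text{Logit:} \quad & \Pr(y_i = 1 \mid x_i) = \frac{1}{1 + \exp(-x_i^{\top}\beta)} \\
\text{SLX:}   \quad & y = X\beta + WX\theta + \varepsilon \\
\text{SAR:}   \quad & y = \rho W y + X\beta + \varepsilon \\
\text{SEM:}   \quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{aligned}
```

In the SLX model spillovers enter through the neighbours’ characteristics ($WX$), in the SAR model through the neighbours’ outcomes ($Wy$), and in the SEM model through spatially correlated unobserved factors ($Wu$).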

As a first step, a standard logistic regression model was estimated to assess how local environmental characteristics influence the probability of a GoodLood location, without accounting for spatial dependence. This model serves as a benchmark for further spatial extensions.

Explanatory variables were selected based on the exploratory spatial data analysis. Only variables with a significant Moran’s I statistic were considered. These were grouped into categories (urban density, POI categories, measures of centrality), and from each group only one or two variables with a relatively strong Moran’s I were selected in order to capture the main spatial pattern.

model_variables <- c(
  "built_share",
  "shop_count",
  "food_poi_count",
  "park_count",
  "dist_to_rynek_m",
  "share_commercial_retail"
)

model_df <- voronoi_full |>
  st_drop_geometry() |>
  select(is_goodlood, all_of(model_variables)) |>
  na.omit()

4.1 Base model - logistic regression

As the first step, a standard logistic regression model was estimated as a baseline for further analysis. The dependent variable indicates whether a given Voronoi cell contains a GoodLood ice cream shop (1) or a competing shop (0).

model_logit <- glm(
  is_goodlood ~ .,
  data   = model_df,
  family = binomial(link = "logit")
)

To assess whether the non-spatial model is sufficient, Moran’s I test was applied to the Pearson residuals of the logistic regression. The result (I = -0.037, p = 0.775) does not indicate significant positive spatial autocorrelation, suggesting that the baseline model leaves no substantial spatial structure in its residuals.

resid_logit <- residuals(model_logit, type = "pearson")
moran_resid <- moran.test(resid_logit, lw_vor)

moran_resid
## 
##  Moran I test under randomisation
## 
## data:  resid_logit  
## weights: lw_vor    
## 
## Moran I statistic standard deviate = -0.75532, p-value = 0.775
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      -0.036570095      -0.005050505       0.001741381

4.2 Spatial models

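Although the baseline logistic regression does not show spatial autocorrelation in the residuals, spatial econometric models were estimated as an additional check. The goal of this step is to examine whether spatial spillovers or other spatial effects still play a role in explaining the location of GoodLood ice cream shops, beyond the local characteristics already included in the model. Note that lmSLX, lagsarlm and errorsarlm fit linear models, so here the binary indicator is treated as a continuous outcome (a linear probability specification) and the coefficients are not on the same scale as those of the logit.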

model_slx <- lmSLX(
  is_goodlood ~ built_share + shop_count + food_poi_count +
    park_count + dist_to_rynek_m + share_commercial_retail,
  data  = model_df,
  listw = lw_vor)

model_sar <- lagsarlm(
  is_goodlood ~ built_share + shop_count + food_poi_count +
    park_count + dist_to_rynek_m + share_commercial_retail,
  data   = model_df,
  listw  = lw_vor,
  method = "eigen")

model_sem <- errorsarlm(
  is_goodlood ~ built_share + shop_count + food_poi_count +
    park_count + dist_to_rynek_m + share_commercial_retail,
  data   = model_df,
  listw  = lw_vor,
  method = "eigen"
)

First, the estimated models were compared using the Akaike Information Criterion (AIC); the values are also reported in the summary table below.

Following the approach used in class, AIC is the first criterion for choosing between the models. The difference between the SAR and SEM models is very small (145.8 vs 145.9), which indicates that these two specifications are practically indistinguishable in terms of fit. The SAR model has a lower AIC than both the SLX model (155.0) and the baseline logistic regression (165.0). However, the logistic regression is fitted with a binomial likelihood while the three spatial models are fitted with a Gaussian likelihood, so its AIC is not directly comparable with theirs. Only the ranking among SLX, SAR and SEM should be read as a difference in fit.

AIC(model_logit, model_slx, model_sar, model_sem)
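
As a quick consistency check, the AIC values can be reproduced from the log-likelihoods reported in the model summary of this section, using AIC = 2k - 2 logLik. The sketch assumes that the two reported log-likelihoods belong to the logit and SLX models and that the SLX parameter count includes the error variance; the helper function name is hypothetical.

```r
# AIC = 2k - 2 * logLik, where k is the number of estimated parameters
aic_from_loglik <- function(loglik, k) 2 * k - 2 * loglik

# Logit: 7 coefficients (intercept + 6 predictors), log-likelihood -75.520
aic_logit <- aic_from_loglik(-75.520, k = 7)

# SLX: 13 regression coefficients + 1 error variance, log-likelihood -63.520
aic_slx <- aic_from_loglik(-63.520, k = 14)

round(c(logit = aic_logit, slx = aic_slx), 1)
```

The two log-likelihoods come from different distributional assumptions (binomial for the logit, Gaussian for SLX), which is why the resulting AIC values sit on different scales.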

Next, the estimated models were compared jointly in a summary table. The table below presents the results for the baseline logistic regression and the three spatial models (SLX, SAR and SEM):

modelsummary(
  list(
    Logit = model_logit,
    SLX   = model_slx,
    SAR   = model_sar,
    SEM   = model_sem
  ),
  stars = TRUE,
  output = "markdown"
)
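
| | Logit | SLX | SAR | SEM |
|:---|:---:|:---:|:---:|:---:|
| (Intercept) | -0.844 | 0.228 | 0.257** | 0.231** |
| | (0.841) | (0.148) | (0.094) | (0.086) |
| built_share | -3.305 | -0.457 | -0.323 | -0.297 |
| | (2.420) | (0.342) | (0.226) | (0.212) |
| shop_count | 0.001 | 0.001 | 0.000 | 0.000 |
| | (0.006) | (0.001) | (0.001) | (0.001) |
| food_poi_count | -0.005 | -0.000 | -0.000 | -0.000 |
| | (0.014) | (0.001) | (0.001) | (0.001) |
| park_count | -0.002 | -0.000 | -0.000 | -0.000 |
| | (0.006) | (0.001) | (0.000) | (0.000) |
| dist_to_rynek_m | -0.000 | -0.000 | -0.000 | -0.000 |
| | (0.000) | (0.000) | (0.000) | (0.000) |
| share_commercial_retail | 3.213 | 0.359 | 0.378 | 0.357 |
| | (4.495) | (0.606) | (0.541) | (0.529) |
| lag.built_share | | 0.193 | | |
| | | (0.498) | | |
| lag.shop_count | | -0.002 | | |
| | | (0.002) | | |
| lag.food_poi_count | | -0.000 | | |
| | | (0.003) | | |
| lag.park_count | | -0.000 | | |
| | | (0.001) | | |
| lag.dist_to_rynek_m | | 0.000 | | |
| | | (0.000) | | |
| lag.share_commercial_retail | | -0.257 | | |
| | | (1.183) | | |
| rho | | | -0.131 | |
| | | | (0.121) | |
| lambda | | | | -0.128 |
| | | | | (0.122) |
| Num.Obs. | 199 | 199 | | |
| R2 | | 0.024 | | |
| R2 Adj. | | -0.039 | | |
| AIC | 165.0 | 155.0 | 145.8 | 145.9 |
| BIC | 188.1 | 201.1 | 175.5 | 175.6 |
| Log.Lik. | -75.520 | -63.520 | | |
| F | 0.471 | | | |
| RMSE | 0.34 | 0.33 | 0.33 | 0.33 |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001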

In the SAR and SEM models, the spatial parameters are not statistically significant (ρ = -0.131 with standard error 0.121; λ = -0.128 with standard error 0.122). This suggests that GoodLood locations do not show strong spatial clustering and that there are no important unobserved spatial factors affecting the results once the main location characteristics are taken into account.

The SLX model was used to check whether the characteristics of neighbouring areas affect the location of GoodLood ice cream shops. The results show that the spatially lagged variables are not statistically significant, which suggests that GoodLood locations depend mainly on local conditions rather than on spillovers from nearby Voronoi cells.

Taken together, these results indicate that spatial dependence plays only a limited role in explaining GoodLood locations. Therefore, the baseline logistic regression is selected as the main reference model.

4.3 Final model

summary(model_logit)
## 
## Call:
## glm(formula = is_goodlood ~ ., family = binomial(link = "logit"), 
##     data = model_df)
## 
## Coefficients:
##                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)             -0.8441979  0.8412667  -1.003    0.316
## built_share             -3.3052343  2.4202581  -1.366    0.172
## shop_count               0.0013016  0.0055791   0.233    0.816
## food_poi_count          -0.0045286  0.0143894  -0.315    0.753
## park_count              -0.0019601  0.0059733  -0.328    0.743
## dist_to_rynek_m         -0.0001185  0.0001201  -0.987    0.324
## share_commercial_retail  3.2134605  4.4951482   0.715    0.475
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 154.28  on 198  degrees of freedom
## Residual deviance: 151.04  on 192  degrees of freedom
## AIC: 165.04
## 
## Number of Fisher Scoring iterations: 5

The results of the logistic regression suggest that no single local characteristic strongly determines whether a Voronoi cell contains a GoodLood ice cream shop. However, the goal of this model is to compare the local context of GoodLood locations with that of competitors rather than to predict shop success.

None of the explanatory variables is statistically significant at conventional levels (all p-values exceed 0.17), which implies that GoodLood locations cannot be easily distinguished from competitor locations on the basis of the included characteristics. The signs of the coefficients point to possible tendencies: the negative coefficient on dist_to_rynek_m and the positive coefficient on share_commercial_retail are consistent with locations closer to the city centre and with a stronger commercial and retail character being more favourable for GoodLood shops. These tendencies should be read with caution, since the estimates are not statistically distinguishable from zero.

Overall, the final model indicates that GoodLood locations are at most weakly associated with broad urban characteristics, rather than with specific neighbourhood features.
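
The joint contribution of the six predictors can be gauged with a likelihood-ratio test based on the deviances printed in the summary above. The sketch below only re-uses those reported numbers; it is an additional check suggested here, not a step of the original analysis.

```r
# Values copied from the summary(model_logit) output
null_deviance  <- 154.28; df_null  <- 198
resid_deviance <- 151.04; df_resid <- 192

lr_stat <- null_deviance - resid_deviance   # likelihood-ratio statistic
lr_df   <- df_null - df_resid               # number of predictors tested jointly

# Compare with the chi-squared reference distribution
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
crit_5  <- qchisq(0.95, df = lr_df)         # 5% critical value

c(lr_stat = lr_stat, df = lr_df, crit_5 = crit_5, p_value = p_value)
```

The statistic of 3.24 on 6 degrees of freedom lies well below the 5% critical value of about 12.59, so the predictors are not jointly significant either, in line with the individual z-tests.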

5 Conclusions

The aim of this project was to investigate whether the success of GoodLood ice cream shops in Kraków can be explained primarily by their spatial location or by factors related to brand recognition and online popularity. Using Voronoi tessellations and spatial econometric methods, the analysis examined differences between GoodLood locations and competing ice cream shops in terms of their surrounding urban environment.

The results show that, although many characteristics of the city exhibit strong spatial patterns, the residuals of the baseline logistic regression show no significant spatial autocorrelation. Spatial econometric models do not reveal strong clustering or spillover effects, so the baseline logistic regression is retained as the reference model. This suggests that GoodLood locations are not driven by spatial dependence or imitation of nearby shops.

At the same time, the absence of statistically significant effects indicates that GoodLood shops do not rely on one clearly identifiable type of location. The coefficient signs hint at general urban advantages, such as centrality and commercial activity, but these effects are not statistically significant. This supports the interpretation that GoodLood’s success is not only a matter of location, but also of brand recognition and popularity beyond purely spatial factors.


Sources:

  • materials provided during classes

  • OpenStreetMap (OSM) data

  • GoodLood data obtained from the company’s official website

  • text editing, debugging, and corrections were assisted by ChatGPT