Spatial Distribution of Pharmacies in Warsaw

1. Introduction

Pharmacies are essential healthcare infrastructure and simultaneously commercial enterprises operating in a tightly regulated market. In Poland, the 2017 apteka dla aptekarza (pharmacy for a pharmacist) law significantly restricted the opening of new pharmacies: new licences are limited to qualified pharmacists and pharmacy chains are capped at four outlets per owner. This has accelerated consolidation — the national pharmacy count has been declining slowly from a peak of roughly 15,000, with incumbents competing intensely for the highest-demand locations rather than expanding freely. In this context, understanding where pharmacies locate is both a spatial and an economic question: operators are incentivised to locate near demand sources to maximise prescription and over-the-counter turnover. Hospitals and clinics generate consistent, inelastic foot traffic from patients and carers, making proximity to healthcare facilities a plausible and economically rational driver of pharmacy location.

Research questions:

Are pharmacies in Warsaw spatially clustered, regularly spaced, or randomly distributed?
Is pharmacy intensity driven by distance to hospitals and clinics?

Null hypothesis: Pharmacies follow Complete Spatial Randomness (CSR) — a homogeneous Poisson process.

2. Literature Review

2.1 Healthcare Accessibility and Pharmacy Location

The spatial distribution of pharmacies is an established concern in health geography. Guagliardo (2004) provides a foundational review of spatial accessibility measures for primary care facilities, arguing that Euclidean distance to the nearest provider is the simplest valid proxy for accessibility when detailed travel-network data are unavailable. The concept of “pharmacy deserts” — areas lacking convenient pharmacy access — was formally quantified by Qato et al. (2014) in a study of Chicago, which found systematic under-provision in low-income and minority neighbourhoods. At the European scale, Vogler, Habimana, and Arts (2014) survey the effect of pharmacy deregulation across EU member states, finding that stricter entry regulation tends to produce more spatially uniform provision, whereas deregulated markets exhibit pronounced clustering in high-demand urban cores — a pattern consistent with the economic proximity logic motivating this analysis.

The Polish pharmacy market underwent significant structural change following the 2017 apteka dla aptekarza law (Ustawa z dnia 7 kwietnia 2017 r., Dz.U. poz. 1015), which restricted new pharmacy licences to qualified pharmacists and capped chain ownership at four outlets per owner. This regulatory constraint creates a clear spatial prediction: operators competing for scarce high-demand locations will concentrate new openings near the strongest demand anchors — hospitals and clinics — rather than dispersing to under-served peripheral areas. The inhomogeneous Poisson process model adopted in Sections 8–9 operationalises this prediction empirically.

2.2 Spatial Point Pattern Analysis: Foundations and First-Order Methods

The statistical study of spatial point patterns has its roots in Ripley’s seminal work (Ripley, 1976; 1977) and was consolidated by Diggle (2013), whose monograph covers the progression from descriptive intensity estimation through second-order summaries to parametric point process modelling. A comprehensive practical treatment for R practitioners is provided by Baddeley, Rubak, and Turner (2015), whose spatstat package implements the full analytical pipeline employed in this study. First-order analysis characterises the mean density of events per unit area: while the global intensity \(\hat{\lambda} = n/|W|\) provides a single summary, non-parametric kernel density estimation (KDE) recovers local spatial variation. Diggle (1985) introduced the cross-validated bandwidth selector bw.diggle(), which minimises the mean integrated squared error of the density estimate. The quadrat count test provides a simple chi-squared test of spatial homogeneity; its sensitivity to grid resolution is addressed through multi-resolution analysis (Diggle, 2013, ch. 3).

2.3 Second-Order Methods and Point Process Modelling

Ripley’s \(K\)-function summarises the expected number of additional events within distance \(r\) of an arbitrary event, normalised by global intensity, with \(K(r) = \pi r^2\) under CSR (Ripley, 1976). Besag (1977) proposed the variance-stabilising \(L\)-transformation \(L(r) = \sqrt{K(r)/\pi}\), which linearises the CSR benchmark and simplifies visual interpretation. Simulation envelopes from repeated CSR realisations provide a test of departure from randomness across all spatial scales (Diggle, 2013), with Ripley’s isotropic edge correction applied throughout (Ripley, 1988). Parametric modelling moves beyond description to explanatory inference: the inhomogeneous Poisson process (IPP) allows intensity to vary as \(\log\lambda(u) = \mathbf{x}(u)^{\top}\boldsymbol{\beta}\), fitted via the pseudolikelihood approximation of Berman and Turner (1992). Kopczewska (2021) and Arbia, Espa, and Giuliani (2016) situate such models within the economic geography tradition, where business location is the outcome of profit-maximising behaviour under spatial constraints — a framework directly applicable to pharmacy siting near demand anchors.

3. Data

suppressPackageStartupMessages({
  library(sf)
  library(spatstat)
  library(jsonlite)
  library(httr)
  library(ggplot2)
  library(tidyverse)
  library(viridis)
  library(patchwork)
})
Sys.setenv(LANG = "en")
base_dir <- "/Users/rajatdogra/Downloads/Point&Line"

Data were downloaded from OpenStreetMap via the Overpass API (May 2026). The chunk below fetches the data fresh if the local JSON files are not present, making the document fully self-contained and reproducible.

overpass <- "https://overpass-api.de/api/interpreter"
warsaw_bb <- "(52.09,20.85,52.37,21.27)"

fetch_osm <- function(query, file) {
  if (!file.exists(file)) {
    r <- POST(overpass, body = list(data = query), encode = "form", timeout(120))
    write(content(r, "text", encoding = "UTF-8"), file)
  }
}

fetch_osm(
  paste0('[out:json][timeout:90];(node["amenity"="pharmacy"]', warsaw_bb,
         ';way["amenity"="pharmacy"]', warsaw_bb, ');out center;'),
  "data_pharmacies.json"
)
fetch_osm(
  paste0('[out:json][timeout:90];(node["amenity"~"^(hospital|clinic)$"]', warsaw_bb,
         ';way["amenity"~"^(hospital|clinic)$"]', warsaw_bb, ');out center;'),
  "data_hospitals.json"
)
fetch_osm(
  '[out:json][timeout:60];relation["name"="Warszawa"]["admin_level"="6"];out geom;',
  "data_boundary.json"
)

Pharmacies — OSM nodes/ways tagged amenity=pharmacy within Warsaw
Hospitals/clinics — OSM features tagged amenity=hospital or amenity=clinic
Warsaw administrative boundary — outer polygon of the city (admin_level=6)

# --- Pharmacies ---
pharm_json <- fromJSON("data_pharmacies.json")$elements
pharm_lat  <- ifelse(pharm_json$type == "node", pharm_json$lat, pharm_json$center$lat)
pharm_lon  <- ifelse(pharm_json$type == "node", pharm_json$lon, pharm_json$center$lon)
pharm_sf   <- data.frame(lon = pharm_lon, lat = pharm_lat) %>%
  filter(!is.na(lon), !is.na(lat)) %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  st_transform(2180)

# --- Hospitals ---
hosp_json <- fromJSON("data_hospitals.json")$elements
hosp_lat  <- ifelse(hosp_json$type == "node", hosp_json$lat, hosp_json$center$lat)
hosp_lon  <- ifelse(hosp_json$type == "node", hosp_json$lon, hosp_json$center$lon)
hosp_sf   <- data.frame(lon = hosp_lon, lat = hosp_lat) %>%
  filter(!is.na(lon), !is.na(lat)) %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  st_transform(2180)

# --- Warsaw boundary ---
bound_json <- fromJSON("data_boundary.json")$elements
members    <- bound_json$members[[1]] %>% filter(role == "outer")
lines      <- lapply(members$geometry, function(g) {
  if (is.null(g) || nrow(g) < 2) return(NULL)
  st_linestring(as.matrix(g[, c("lon", "lat")]))
})
lines      <- Filter(Negate(is.null), lines)
warsaw_sf  <- st_sfc(lines, crs = 4326) %>%
  st_sf() %>% st_union() %>% st_polygonize() %>%
  st_collection_extract("POLYGON") %>% st_sf() %>%
  mutate(area = st_area(geometry)) %>%
  arrange(desc(area)) %>% slice(1) %>%
  st_transform(2180)

# Clip to boundary
pharm_sf <- st_intersection(pharm_sf, st_geometry(warsaw_sf))
hosp_sf  <- st_intersection(hosp_sf,  st_geometry(warsaw_sf))

Table 1. Dataset summary.
Dataset	Count	Notes
Pharmacies	586	OSM amenity=pharmacy
Hospitals / clinics	270	OSM amenity=hospital or clinic
Study area	—	516.7 km², EPSG:2180

4. Study Area and Observation Window

All analysis is carried out in EPSG:2180 (Polish national coordinate system, units = metres), which preserves distances and areas accurately.

warsaw_owin  <- as.owin(st_geometry(warsaw_sf))
pharm_coords <- st_coordinates(pharm_sf)
hosp_coords  <- st_coordinates(hosp_sf)

pharm_ppp <- ppp(pharm_coords[,1], pharm_coords[,2], window = warsaw_owin) %>% unique()
hosp_ppp  <- ppp(hosp_coords[,1],  hosp_coords[,2],  window = warsaw_owin) %>% unique()

global_intensity <- npoints(pharm_ppp) / area_km2

pharm_df <- as.data.frame(pharm_ppp)
hosp_df  <- as.data.frame(hosp_ppp)

ggplot() +
  geom_sf(data = warsaw_sf, fill = "#f5f5f5", color = "grey40", linewidth = 0.5) +
  geom_point(data = hosp_df,  aes(x = x, y = y),
             color = "firebrick", size = 2, alpha = 0.7, shape = 3, stroke = 1.2) +
  geom_point(data = pharm_df, aes(x = x, y = y),
             color = "steelblue", size = 1.2, alpha = 0.55) +
  annotate("point", x = -Inf, y = Inf, size = 0) +
  labs(title = "Pharmacies and hospitals in Warsaw",
       subtitle = sprintf("Pharmacies: %d  |  Hospitals/clinics: %d  |  Area: %.0f km²",
                          npoints(pharm_ppp), npoints(hosp_ppp), area_km2),
       caption = "Source: OpenStreetMap contributors, ODbL") +
  theme_minimal(base_size = 12) +
  theme(axis.title = element_blank(),
        plot.subtitle = element_text(color = "grey40"))

Figure 1. Spatial distribution of pharmacies (blue) and hospitals/clinics (red crosses) in Warsaw. Data: OpenStreetMap.

Visual inspection already suggests clustering in the city centre and along major axes, with lower density in outer districts.

5. First-Order Analysis

5.0 Nearest-Neighbour Distances

Before computing global and kernel-based intensity measures, we examine the distribution of distances to the \(k\)th nearest neighbour using distfun(). This gives an immediate visual impression of how tightly pharmacies are packed across the city — complementary to intensity, which is a count-based measure.

# Distance to 5th nearest neighbour (unmarked pattern)
dist_5nn <- distfun(pharm_ppp, k = 5)

plot(dist_5nn,
     main = "Distance to 5th nearest pharmacy (m)",
     col  = rev(viridis(256)))
points(pharm_ppp, pch = 16, cex = 0.25, col = "white")

Figure 4a. Distance from each location in Warsaw to the 5th nearest pharmacy. Dark areas (low distance) highlight the dense central network; lighter areas show under-served peripheral zones.

summary(dist_5nn)

## Distance function for point pattern
## Distance to 5th nearest point will be computed
## defined in a polygonal window inside the frame [626505.9, 655260.3] x 
## [472229.5, 502172.4] units
## 
## Distance function values:
##  range = [201.419, 6069.665]
##  mean = 1795.578

High values in the outer districts — particularly south-east Warsaw and areas beyond the ring road — indicate potential pharmacy deserts, where residents must travel significantly further to access the network. This has direct equity and economic implications: peripheral residents are less likely to fill prescriptions promptly, and operators face lower competitive pressure, which may allow above-average margins at isolated locations.

5.1 Global intensity

cat(sprintf("Global intensity: %.2f pharmacies per km²\n", global_intensity))

## Global intensity: 1.13 pharmacies per km²

cat(sprintf("             i.e. one pharmacy per %.1f km²\n", 1 / global_intensity))

##              i.e. one pharmacy per 0.9 km²

5.2 Kernel Density Estimation

KDE provides a smooth surface of local intensity. We use Diggle’s cross-validated bandwidth selector (bw.diggle), which minimises the mean squared error of the density estimate.

bw <- bw.diggle(pharm_ppp)
cat(sprintf("Diggle bandwidth: %.0f m\n", bw))

## Diggle bandwidth: 373 m

pharm_kde <- density(pharm_ppp, sigma = bw)
plot(pharm_kde,
     main = sprintf("Pharmacy KDE — Warsaw  (σ = %.0f m)", bw),
     ribbon = TRUE)
contour(pharm_kde, add = TRUE, col = "white", lwd = 0.4, drawlabels = FALSE)
points(pharm_ppp, pch = ".", col = "white", cex = 0.8)

Figure 2. Kernel density estimate of pharmacy locations (Diggle bandwidth). Warmer colours indicate higher local intensity.

The KDE reveals a prominent high-intensity core in the city centre (Śródmieście), with secondary peaks along radial corridors and lower intensity in the outer suburbs.

5.3 Quadrat Count Test

The quadrat test divides the window into a 6×6 grid and compares observed counts to those expected under CSR.

H₀: Pharmacies follow a homogeneous Poisson process (CSR)
H₁: Significant deviation from CSR

pharm_qt <- quadrat.test(pharm_ppp, nx = 6, ny = 6)
plot(pharm_qt, main = "Quadrat count test — pharmacies in Warsaw", cex = 0.7)

Figure 3. Quadrat count test. Numbers show observed (top) and expected (bottom) counts per cell.

## χ²(30) = 450.02,  p = 3.96e-76

Result: The quadrat test strongly rejects CSR (p < 2×10⁻¹⁶). Pharmacies are not randomly distributed across Warsaw.

5.3.1 Sensitivity to grid size

The quadrat test outcome depends on the chosen grid resolution. A coarse grid may miss fine-scale heterogeneity; an overly fine grid produces many empty cells that inflate the χ² statistic artificially. We verify that the rejection of CSR is robust by repeating the test at three resolutions:

qt_4  <- quadrat.test(pharm_ppp, nx = 4, ny = 4)
qt_6  <- quadrat.test(pharm_ppp, nx = 6, ny = 6)   # original
qt_8  <- quadrat.test(pharm_ppp, nx = 8, ny = 8)
qt_10 <- quadrat.test(pharm_ppp, nx = 10, ny = 10)

sensitivity_df <- data.frame(
  Grid    = c("4×4", "6×6", "8×8", "10×10"),
  df      = c(qt_4$parameter, qt_6$parameter, qt_8$parameter, qt_10$parameter),
  chi_sq  = round(c(qt_4$statistic, qt_6$statistic, qt_8$statistic, qt_10$statistic), 1),
  p_value = sapply(list(qt_4, qt_6, qt_8, qt_10), function(x) format.pval(x$p.value, digits = 2))
)

knitr::kable(sensitivity_df,
             caption = "Table 3. Quadrat test sensitivity to grid resolution.",
             col.names = c("Grid", "df", "χ²", "p-value"))

Table 3. Quadrat test sensitivity to grid resolution.
Grid	df	χ²	p-value
4×4	14	333.7	<2e-16
6×6	30	450.0	<2e-16
8×8	50	569.9	<2e-16
10×10	78	695.3	<2e-16

All four grids yield χ² statistics far exceeding the critical value and p-values well below 2×10⁻¹⁶, confirming that the rejection of CSR is not an artefact of the 6×6 choice.

6. Covariate Analysis: Distance to Nearest Hospital

6.1 Distance Image

We compute a pixel image \(d(u)\) = Euclidean distance from each location \(u\) to the nearest hospital/clinic. We use sf::st_distance on a regular grid (more numerically stable than spatstat::distmap with large projected coordinates) and convert the result to a spatstat pixel image.

# Build a regular grid over Warsaw, keep only interior points
grid_sf     <- st_make_grid(warsaw_sf, n = c(100, 100), what = "centers")
inside      <- as.vector(st_within(grid_sf, warsaw_sf, sparse = FALSE))
grid_in     <- grid_sf[inside]
grid_coords <- st_coordinates(grid_in)

# Distance from each grid point to nearest hospital (in metres)
dist_mat    <- st_distance(grid_in, hosp_sf)
min_dist_m  <- as.numeric(apply(dist_mat, 1, min))

# Convert to spatstat pixel image
dist_to_hosp <- as.im(
  data.frame(x = grid_coords[, 1], y = grid_coords[, 2], z = min_dist_m),
  W = warsaw_owin
)

plot(dist_to_hosp,
     main = "Distance to nearest hospital/clinic (m)",
     col  = rev(viridis(256)))
points(pharm_ppp, pch = 16, cex = 0.3, col = "white")
points(hosp_ppp,  pch = 3,  cex = 0.6, col = "red", lwd = 1.2)

Figure 6. Distance to nearest hospital/clinic (metres). Pharmacies (white dots) and hospitals (red crosses) overlaid.

6.2 Nonparametric Intensity Estimate: rhohat

rhohat() estimates the intensity \(\hat{\rho}(d)\) as a function of the covariate nonparametrically. A decreasing curve would confirm that pharmacies prefer locations close to hospitals.

pharm_rhohat <- rhohat(pharm_ppp, dist_to_hosp)

plot(pharm_rhohat,
     main  = "rhohat: pharmacy intensity vs. distance to hospital",
     xlab  = "Distance to nearest hospital (m)",
     ylab  = expression(hat(rho)(d)))

Figure 7. Nonparametric estimate of pharmacy intensity as a function of distance to the nearest hospital. The dashed red line is the global mean intensity. The curve falls steeply within the first 1–2 km then stabilises near the mean, consistent with a strong proximity effect.

Result: The rhohat curve falls sharply within the first 0–2 km, then converges toward the mean intensity. This confirms that pharmacy intensity is highest immediately around hospitals, decaying with distance — consistent with the economic proximity hypothesis.

7. Second-Order Analysis: Ripley’s K and L Functions

Second-order analysis uses pairwise distances to detect clustering or regularity at different spatial scales \(r\).

7.1 K and L Functions

\[L(r) = \sqrt{\frac{K(r)}{\pi}} \quad \Rightarrow \quad L(r) - r = 0 \text{ under CSR}\]

Positive values of \(L(r) - r\) indicate clustering; negative values indicate regularity.

pharm_L <- Lest(pharm_ppp, correction = "iso")
plot(pharm_L, . - r ~ r,
     main = "L(r) − r  —  pharmacies in Warsaw",
     ylab = "L(r) − r",
     legendpos = "topleft",
     col = c("black", "red"),
     lty = c(1, 2))
abline(h = 0, lty = 2, col = "grey50")

Figure 4. Ripley’s L(r)−r function for Warsaw pharmacies. The dashed red line shows the CSR expectation (zero).

7.2 Simulation Envelopes Under CSR

We generate 99 CSR realisations to build a pointwise simulation envelope. If the observed curve lies above the upper band, clustering is significant at that scale.

set.seed(2025)
pharm_L_env <- envelope(pharm_ppp, Lest, nsim = 99,
                         correction = "iso", rank = 1,
                         global = FALSE, verbose = FALSE)
plot(pharm_L_env, . - r ~ r,
     main = "L(r) − r  with 99 CSR simulation envelopes",
     ylab = "L(r) − r",
     legendpos = "topleft",
     shade = c("lo", "hi"))

Figure 5. L(r)−r with 99 pointwise CSR simulation envelopes (grey band). The observed curve (black) lies far above the envelope across all scales, confirming strong, significant clustering.

Result: The observed L(r)−r is massively above the CSR envelope at all scales. Clustering is statistically significant from very short distances (~100 m) all the way up to city-wide scales (>5 km).

Methodological note — isotropy and edge correction. The standard L-function assumes isotropy: that clustering behaviour is the same in all directions. For Warsaw pharmacies this is a reasonable starting assumption — there is no obvious directional bias in a city built on a roughly radial plan — but it cannot be verified without a formal test (e.g., fryplot or sector-based K). The iso (Ripley) edge correction used here is appropriate for moderate sample sizes and compact windows; the border correction, by contrast, is slightly biased but computationally cheaper and was used as a secondary check with equivalent results.

8. Point Process Model

8.1 Model Specification

We fit an inhomogeneous Poisson process (IPP) using maximum pseudolikelihood (Berman-Turner approximation):

\[\log \lambda(u) = \beta_0 + \beta_1 \cdot d(u)\]

where \(d(u)\) is the distance to the nearest hospital at location \(u\). A negative \(\hat{\beta}_1\) means intensity decreases with distance.

model_null <- ppm(pharm_ppp ~ 1)
model_dist <- ppm(pharm_ppp ~ dist_to_hosp)
summary(model_dist)

## Point process model
## Fitted to data: pharm_ppp
## Fitting method: maximum likelihood (Berman-Turner approximation)
## Model was fitted using glm()
## Algorithm converged
## Call:
## ppm.formula(Q = pharm_ppp ~ dist_to_hosp)
## Edge correction: "border"
##  [border correction distance r = 0 ]
## --------------------------------------------------------------------------------
## Quadrature scheme (Berman-Turner) = data + dummy + weights
## 
## Data pattern:
## Planar point pattern:  586 points
## Average intensity 1.13e-06 points per square unit
## Window: polygonal boundary
## single connected closed polygon with 1745 vertices
## enclosing rectangle: [626505.9, 655260.3] x [472229.5, 502172.4] units
##                      (28750 x 29940 units)
## Window area = 516715000 square units
## Fraction of frame area: 0.6
## 
## Dummy quadrature points:
##      64 x 64 grid of dummy points, plus 4 corner points
##      dummy spacing: 449.2877 x 467.8586 units
## 
## Original dummy parameters: =
## Planar point pattern:  2540 points
## Average intensity 4.92e-06 points per square unit
## Window: polygonal boundary
## single connected closed polygon with 1745 vertices
## enclosing rectangle: [626505.9, 655260.3] x [472229.5, 502172.4] units
##                      (28750 x 29940 units)
## Window area = 516715000 square units
## Fraction of frame area: 0.6
## Quadrature weights:
##      (counting weights based on 64 x 64 array of rectangular tiles)
## All weights:
##  range: [16600, 210000]  total: 5.15e+08
## Weights on data points:
##  range: [35000, 105000]  total: 47700000
## Weights on dummy points:
##  range: [16600, 210000]  total: 4.68e+08
## --------------------------------------------------------------------------------
## FITTED :
## 
## Nonstationary Poisson process
## 
## ---- Intensity: ----
## 
## Log intensity: ~dist_to_hosp
## Model depends on external covariate 'dist_to_hosp'
## Covariates provided:
##  dist_to_hosp: im
## 
## Fitted trend coefficients:
##   (Intercept)  dist_to_hosp 
## -12.259493700  -0.001670928 
## 
##                   Estimate         S.E.      CI95.lo       CI95.hi Ztest
## (Intercept)  -12.259493700 6.649688e-02 -12.38982519 -12.129162209   ***
## dist_to_hosp  -0.001670928 9.344664e-05  -0.00185408  -0.001487776   ***
##                    Zval
## (Intercept)  -184.36194
## dist_to_hosp  -17.88109
## 
## ----------- gory details -----
## 
## Fitted regular parameters (theta):
##   (Intercept)  dist_to_hosp 
## -12.259493700  -0.001670928 
## 
## Fitted exp(theta):
##  (Intercept) dist_to_hosp 
## 4.739904e-06 9.983305e-01

8.2 Model Comparison

anova(model_null, model_dist, test = "LRT")

## Analysis of Deviance Table
## 
## Model 1: ~1   Poisson
## Model 2: ~dist_to_hosp    Poisson
##   Npar Df Deviance  Pr(>Chi)    
## 1    1                          
## 2    2  1   581.35 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Table 2. AIC comparison of fitted models.
Model	AIC	ΔAIC
Null (CSR)	17218.3	—
Distance to hospital	16636.2	582.2

Result: The distance model is overwhelmingly preferred:

Likelihood-ratio test: χ²(1) = 581, p < 2×10⁻¹⁶
ΔAIC = 582 — very strong evidence for the covariate model

The negative coefficient (\(\hat{\beta}_1 = -1.671 \times 10^{-3}\), p < 10⁻¹⁶) confirms that pharmacy intensity decreases significantly with distance from hospitals. In practical terms, each additional kilometre from the nearest hospital/clinic is associated with a reduction in pharmacy intensity of approximately \(1 - e^{-1.671 \times 10^{-3} \times 1000} \approx 81\%\).

Methodological note — edge correction. The ppm() function uses the border edge correction by default (guard zone \(r = 0\)), which is computationally efficient but can slightly underestimate the intensity near the boundary of the window. For most practical applications this bias is small and does not affect the sign or significance of the estimated covariate effect. An alternative is the isotropic correction, which would produce marginally different coefficient estimates but identical qualitative conclusions.

8.3 Fitted Intensity Surface

plot(model_dist, how = "image",
     main = "Fitted intensity — inhomogeneous Poisson (distance to hospital)",
     se   = FALSE)
points(pharm_ppp, pch = 16, cex = 0.25, col = "white")

Figure 8. Fitted intensity surface from the inhomogeneous Poisson model. Observed pharmacy locations overlaid (white dots).

9. Goodness of Fit

9.1 Smooth Residuals

diagnose.ppm(model_dist, which = "smooth",
             main = "Smooth residuals — distance model")

Figure 9. Smoothed residual field from the distance model. Values near zero (green) indicate good local fit; positive values (red) suggest underprediction.

## Model diagnostics (raw residuals)
## Diagnostics available:
##  smoothed residual field
## range of smoothed field =  [-4.882e-07, 7.17e-07]

9.2 Kolmogorov-Smirnov Test

pharm_ks <- cdf.test(pharm_ppp, dist_to_hosp)
print(pharm_ks)

## 
##  Spatial Kolmogorov-Smirnov test of CSR in two dimensions
## 
## data:  covariate 'dist_to_hosp' evaluated at points of 'pharm_ppp' 
##      and transformed to uniform distribution under CSR
## D = 0.45878, p-value < 2.2e-16
## alternative hypothesis: two-sided

The KS test rejects the null of no covariate effect (p < 2×10⁻¹⁶), confirming that the distance covariate significantly shapes the distribution.

9.3 Inhomogeneous K Function

We test whether residual clustering remains after accounting for the hospital covariate. If the observed inhomogeneous K falls within the simulation envelope under the fitted model, the model captures the spatial structure well.

set.seed(2025)
pharm_Ki_env <- envelope(model_dist, Kinhom, nsim = 39,
                          correction = "iso", global = FALSE, verbose = FALSE)
plot(pharm_Ki_env,
     main = "Inhomogeneous K — envelopes under fitted model",
     legendpos = "topleft")

Figure 10. Inhomogeneous K function with simulation envelopes under the fitted model. The observed curve (black) lies mostly inside the envelope, indicating reasonable fit, though some residual clustering remains at short distances.

Interpretation: The fitted inhomogeneous Poisson model captures much of the spatial variation. The residual clustering at short distances (< 500 m) suggests that additional fine-scale clustering exists beyond the hospital proximity effect — likely reflecting commercial corridors, transport hubs, and population density.

10. Discussion

Summary of findings

Analysis	Result
KDE	Pronounced centre-periphery gradient; secondary peaks along radial corridors
Quadrat test	χ²(30) = 450, p ≈ 4×10⁻⁷⁶ — CSR strongly rejected
L function	Observed far above CSR envelope at all scales (0–7 km)
rhohat	Sharp intensity decay within 0–2 km of nearest hospital
PPM coefficient	β₁ = −1.671×10⁻³ (p < 10⁻¹⁶) — ~81% drop in intensity per km
ΔAIC	582 — covariate model massively preferred
Inhomogeneous K	Residual clustering at < 500 m after accounting for covariate

Interpretation

Warsaw pharmacies exhibit strong, statistically significant clustering at all spatial scales. The clustering is not random: proximity to hospitals and clinics is a highly significant predictor of pharmacy location, consistent with economic logic (demand proximity). A ΔAIC of 582 represents very strong evidence.

From a market perspective, this finding is unsurprising but quantitatively useful. Hospitals and clinics generate inelastic foot traffic — patients collecting prescriptions are less price-sensitive and less likely to comparison-shop across neighbourhoods. Locating near a hospital therefore reduces the pharmacist’s dependence on general foot traffic and provides a stable prescription revenue base. The 81% drop in predicted intensity per kilometre from the nearest hospital is a striking magnitude: it implies that, all else equal, a pharmacy opening 2 km from the nearest hospital faces roughly 4% of the predicted intensity of one directly adjacent.

However, the hospital distance covariate does not fully explain the pattern — residual clustering at very short distances (< 500 m) suggests additional drivers, potentially:

Commercial co-location: pharmacies cluster near each other at busy intersections to capture comparison-shopping foot traffic
Population density and pedestrian flow
Proximity to tram/metro stops
Commercial zoning restrictions
Historical centre-periphery development patterns

Limitations and next steps

Residual short-range clustering. The Kinhom envelopes (Section 9.3) show that the fitted inhomogeneous Poisson model does not fully capture the pattern at scales below ~500 m. This residual clustering is consistent with a commercial co-location dynamic: pharmacies deliberately locate near each other to exploit visibility and comparison-shopping behaviour, or to capture the same micro-scale demand anchors (a single busy intersection, a transit hub). A Gibbs point process — such as a Strauss model or a Geyer saturation model — could formally test whether the residual interaction is attractive (positive association between nearby pharmacies) or repulsive (mutual avoidance due to competitive exclusion). Fitting such a model with ppm(pharm_ppp, ~dist_to_hosp, Strauss(r = 400)) would produce an interaction parameter \(\gamma\) whose value indicates which regime applies: if \(\gamma > 1\) clustering persists; if \(\gamma < 1\) pharmacies exhibit local repulsion after controlling for hospital proximity. The current IPP model is still the defensible baseline — it correctly identifies the dominant city-scale driver — but the Gibbs extension would tighten residuals and improve short-range predictions.

Additional covariates. Population density and metro/tram stop proximity are the most plausible additional drivers not captured in the current model. Including them as pixel images in ppm() would decompose the remaining variance and sharpen the hospital-distance coefficient.

Economic and policy implications. The apteka dla aptekarza law of 2017 restricted new pharmacy licences to qualified pharmacists and capped chain ownership at four outlets. The resulting market consolidation means that new openings are increasingly concentrated near the highest-demand anchors — hospitals and clinics — exactly as the model shows. Outer districts of Warsaw where distance to the nearest pharmacy exceeds 1 km (visible in the distfun map, Section 5.0) represent potential pharmacy deserts: areas where residents face longer travel times, higher barriers to filling prescriptions, and reduced access to over-the-counter health products. Quantifying these access gaps — for example by overlaying the distance surface with demographic data on elderly or low-mobility populations — would transform this statistical analysis into actionable health-equity intelligence.

Marked pattern analysis. Chain pharmacies (e.g., Apteka Gemini, Dbam o Zdrowie) and independents may follow different spatial logics: chains likely target high-footfall anchors systematically, while independents fill neighbourhood gaps. Classifying pharmacies by the OSM operator tag and building a marked PPP would enable a formal test using split K-functions or a marked ppm() with an interaction term between pharmacy type and distance to hospital.

11. Reproducibility

## Data source: OpenStreetMap, downloaded May 2026

## CRS: EPSG:2180 (Polish national grid, metres)

## R version 4.4.3 (2025-02-28)
## Platform: aarch64-apple-darwin20
## Running under: macOS 26.3.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Warsaw
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.3.2        viridis_0.6.5          viridisLite_0.4.2     
##  [4] lubridate_1.9.4        forcats_1.0.1          stringr_1.5.1         
##  [7] dplyr_1.2.0            purrr_1.2.1            readr_2.1.6           
## [10] tidyr_1.3.2            tibble_3.3.1           tidyverse_2.0.0       
## [13] ggplot2_4.0.2          httr_1.4.7             jsonlite_1.9.1        
## [16] spatstat_3.5-1         spatstat.linnet_3.4-1  spatstat.model_3.6-1  
## [19] rpart_4.1.24           spatstat.explore_3.8-0 nlme_3.1-167          
## [22] spatstat.random_3.4-5  spatstat.geom_3.7-2    spatstat.univar_3.1-7 
## [25] spatstat.data_3.1-9    sf_1.0-23             
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6          xfun_0.52             bslib_0.9.0          
##  [4] spatstat.sparse_3.1-0 lattice_0.22-6        tzdb_0.5.0           
##  [7] vctrs_0.7.1           tools_4.4.3           spatstat.utils_3.2-2 
## [10] generics_0.1.4        goftest_1.2-3         proxy_0.4-28         
## [13] pkgconfig_2.0.3       Matrix_1.7-2          KernSmooth_2.23-26   
## [16] RColorBrewer_1.1-3    S7_0.2.1              lifecycle_1.0.5      
## [19] compiler_4.4.3        farver_2.1.2          deldir_2.0-4         
## [22] htmltools_0.5.8.1     class_7.3-23          sass_0.4.9           
## [25] yaml_2.3.10           pillar_1.11.1         jquerylib_0.1.4      
## [28] classInt_0.4-11       cachem_1.1.0          wk_0.9.5             
## [31] abind_1.4-8           tidyselect_1.2.1      digest_0.6.37        
## [34] stringi_1.8.4         splines_4.4.3         polyclip_1.10-7      
## [37] fastmap_1.2.0         grid_4.4.3            cli_3.6.5            
## [40] magrittr_2.0.4        e1071_1.7-17          withr_3.0.2          
## [43] tensor_1.5.1          scales_1.4.0          timechange_0.3.0     
## [46] rmarkdown_2.29        gridExtra_2.3         hms_1.1.4            
## [49] evaluate_1.0.3        knitr_1.49            s2_1.1.9             
## [52] mgcv_1.9-1            rlang_1.1.7           Rcpp_1.0.14          
## [55] glue_1.8.0            DBI_1.2.3             R6_2.6.1             
## [58] units_1.0-0

References

Arbia, G., Espa, G., and Giuliani, D. (2016). Spatial Microeconometrics. Routledge, London.

Baddeley, A., Rubak, E., and Turner, R. (2015). Spatial Point Patterns: Methodology and Applications with R. CRC Press, Boca Raton.

Berman, M. and Turner, T.R. (1992). Approximating point process likelihoods with GLIM. Applied Statistics, 41(1), 31–38.

Besag, J. (1977). Comments on Ripley’s paper. Journal of the Royal Statistical Society, Series B, 39(2), 193–195.

Diggle, P.J. (1985). A kernel method for smoothing point process data. Applied Statistics, 34(2), 138–147.

Diggle, P.J. (2013). Statistical Analysis of Spatial and Spatio-Temporal Point Patterns, 3rd ed. CRC Press, Boca Raton.

Guagliardo, M.F. (2004). Spatial accessibility of primary care: concepts, methods and challenges. International Journal of Health Geographics, 3(1), 3.

Kopczewska, K. (2021). Applied Spatial Statistics and Econometrics: Data Analysis in R. Routledge, London.

Qato, D.M., Daviglus, M.L., Wilder, J., Lee, T., Qato, D., and Lambert, B. (2014). ‘Pharmacy deserts’ are prevalent in Chicago’s predominantly minority communities, raising medication access concerns. Health Affairs, 33(8), 1409–1416.

Ripley, B.D. (1976). The second-order analysis of stationary point processes. Journal of Applied Probability, 13(2), 255–266.

Ripley, B.D. (1977). Modelling spatial patterns. Journal of the Royal Statistical Society, Series B, 39(2), 172–212.

Ripley, B.D. (1988). Statistical Inference for Spatial Processes. Cambridge University Press, Cambridge.

Ustawa z dnia 7 kwietnia 2017 r. o zmianie ustawy – Prawo farmaceutyczne. Dziennik Ustaw 2017 poz. 1015. Sejm Rzeczypospolitej Polskiej, Warsaw.

Vogler, S., Habimana, K., and Arts, D. (2014). Does deregulation in community pharmacy lead to better accessibility of medicines? Health Policy, 117(3), 311–327.