1 Introduction

Bike-sharing systems (BSS) have become a pivotal component of modern smart cities, offering a flexible, low-carbon alternative to motorised transport and helping to reduce urban emissions. Across many cities, BSS demand is far from random: it is shaped by the underlying urban fabric — transit access, population density and centrality — which makes station siting fundamentally a spatial planning decision. One such system is Veturilo, Warsaw’s public bike-sharing network. This study takes the placement of Veturilo stations themselves as a spatial point pattern and asks two questions that point-pattern analysis can answer rigorously:

  1. Do stations cluster, or is their placement spatially random?
  2. If the pattern is structured, which urban covariates drive station intensity, and does an inhomogeneous Poisson process (independent points, spatially varying intensity) explain it — as opposed to genuine point-to-point interaction?

We do not study all of Warsaw but two contrasting districts, Śródmieście (the dense central core) and Mokotów (a larger, more peripheral district); this also keeps point counts and computation modest.

Using kernel intensity estimation, Ripley’s \(K\)/\(L\) and the pair correlation function with Monte-Carlo envelopes, nonparametric covariate analysis (rhohat) and inhomogeneous Poisson process models (ppm), we test whether an inhomogeneous Poisson process explains the observed pattern. Apparent clustering in Śródmieście disappears once spatially varying intensity is accounted for; the pattern is well described by an inhomogeneous Poisson process driven primarily by metro proximity. Mokotów shows only weak, borderline inhomogeneity and is effectively consistent with a Poisson process at its smaller sample size.

2 Literature review

The methods used here build on three standard references. Baddeley, Rubak and Turner (2015) is the canonical treatment of spatial point patterns in R and of the spatstat package, covering kernel intensity estimation, Ripley’s \(K\)/\(L\) and the pair-correlation function, inhomogeneous Poisson and Gibbs models, and envelope-based goodness-of-fit testing. Kopczewska (2020) places these tools within an applied spatial-statistics and spatial-econometrics workflow in R. Arbia, Espa and Giuliani (2021) develop the framework of spatial microeconometrics, in which individual economic units are treated as a realised point pattern with covariate-driven intensity.

Empirically, this framework has been applied most extensively to firm and retail location. Arbia, Espa, Giuliani and Mazzitelli (2012) model high-tech firms in Milan and report that apparent clustering is largely absorbed once an inhomogeneous intensity is fitted — a finding echoed in micro-spatial studies of food stores in Trento (Arbia et al. 2015), Madrid’s electronics sector (Gómez-Antonio and Sweeney 2018) and Italian manufacturing (Bocci and Rocco 2016). Giuliani, Arbia and Espa (2014) refine the methodology by weighting Ripley’s \(K\) for heterogeneous point sizes. The recurring message — that urban facility patterns are rarely homogeneous and that the appropriate null is an inhomogeneous Poisson process — motivates the analytical strategy used below.

In the bike-sharing literature, by contrast, station locations are usually taken as given. They appear either as candidates in normative location-optimisation problems (Liu et al. 2015; Conrow et al. 2018; Mix et al. 2022; Chen et al. 2023; Bencekri et al. 2024) or as fixed sites whose ridership is modelled in time and space (Lee and Leung 2023; Venkadavarahan et al. 2023). Treating the realised station network itself as a spatial point pattern and testing it against a covariate-driven Poisson null is uncommon, and is the perspective applied here to Warsaw’s Veturilo system.

3 Data and study area

Stations and their precomputed attributes come from the Mendeley dataset pxp72h4wdg/1 (2023, CC BY 4.0). District boundaries are a static Warsaw dzielnice (districts) GeoJSON, because a live Overpass query was unreliable. All geometry is projected to EPSG:2180 (metric), so distances and K-function radii are in metres and areas (including K-function values) are in square metres.

Observation windows and first-order summary. 379/383 stations matched a district (4 outside all polygons).

| District | n stations | Area (km²) | Intensity (/km²) | Median NN (m) |
|---|---|---|---|---|
| Śródmieście | 70 | 15.6 | 4.49 | 246 |
| Mokotów | 35 | 35.6 | 0.98 | 607 |

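Śródmieście’s station intensity is roughly 4.6× that of Mokotów (4.49 vs 0.98 stations per km²), and its median nearest-neighbour distance is about 2.5× shorter (246 m vs 607 m): a first hint that intensity is far from constant.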
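The first-order figures can be checked with a few lines of arithmetic. The sketch below is ours, not part of the analysis pipeline; it uses only the counts, areas and nearest-neighbour distances from the summary table, and the variable names are illustrative.

```python
# First-order summary: intensity = stations / window area (values from the summary table).
districts = {
    "Śródmieście": {"n": 70, "area_km2": 15.6, "median_nn_m": 246},
    "Mokotów": {"n": 35, "area_km2": 35.6, "median_nn_m": 607},
}

# Stations per km² in each observation window.
intensity = {name: d["n"] / d["area_km2"] for name, d in districts.items()}

# How much denser the core is, by intensity and by nearest-neighbour spacing.
intensity_ratio = intensity["Śródmieście"] / intensity["Mokotów"]
nn_ratio = districts["Mokotów"]["median_nn_m"] / districts["Śródmieście"]["median_nn_m"]

for name, lam in intensity.items():
    print(f"{name}: {lam:.2f} stations per km²")
print(f"intensity ratio: {intensity_ratio:.1f}x, median-NN ratio: {nn_ratio:.1f}x")
```

The two ratios differ because intensity scales with area while nearest-neighbour distance scales with length.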

4 Methods

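The workflow follows standard spatstat practice: a quadrat test and kernel intensity estimation (Section 5), Ripley’s \(K\)/\(L\) with 199-simulation Monte-Carlo envelopes under homogeneous and inhomogeneous Poisson nulls (Section 6), nonparametric covariate analysis with rhohat (Section 7), inhomogeneous Poisson models fitted with ppm (Section 8), and envelope-based goodness-of-fit diagnostics (Section 9).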

5 Exploratory analysis

Quadrat chi-squared test of homogeneous CSR (3x3 grid).

| District | Chi-sq | df | p-value |
|---|---|---|---|
| Śródmieście | 17.30 | 7 | 0.0312 |
| Mokotów | 15.99 | 8 | 0.0852 |

Śródmieście rejects homogeneity (\(p\approx0.03\)); Mokotów does not (\(p\approx0.09\)). With only ~4 expected points per quadrat in Mokotów, however, the \(\chi^2\) approximation is unreliable, and the density surfaces below show that Mokotów is clearly inhomogeneous. This illustrates the low power of quadrat tests and motivates the Monte-Carlo \(K\)/\(L\) analysis.
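The reported p-values can be reproduced from the test statistics alone. The sketch below is ours and rests on two assumptions: that the p-values are two-sided (twice the smaller tail, which is the default alternative in spatstat's quadrat.test), and that Mokotów's nine tiles are of roughly equal area when computing the expected count. The helper names `chi2_sf` and `two_sided_p` are illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k < (df-1)/2} (x/2)^(k+1/2) / Gamma(k+3/2)
    total = sum(half ** (k + 0.5) / math.gamma(k + 1.5) for k in range((df - 1) // 2))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def two_sided_p(x, df):
    """Twice the smaller tail (assumed convention for the reported p-values)."""
    upper = chi2_sf(x, df)
    return 2.0 * min(upper, 1.0 - upper)

p_srodmiescie = two_sided_p(17.30, 7)
p_mokotow = two_sided_p(15.99, 8)
expected_per_tile_mokotow = 35 / 9  # 35 stations over 9 tiles, equal-area approximation

print(f"{p_srodmiescie:.3f} {p_mokotow:.3f} {expected_per_tile_mokotow:.1f}")
```

Under these assumptions the results agree with the table up to rounding of the statistics. The one-sided (clustered) tail would be half as large.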

In Śródmieście, all three bandwidths show consistently high intensities along a central spine with several local hotspots, indicating multiple dense pockets rather than a single smooth gradient. In contrast, Mokotów exhibits a pronounced west–east intensity gradient, with a compact high-intensity area in the west and much lower intensities towards the east, especially in the larger, more peripheral part of the district. This is consistent with more peripheral, less transit-served areas having fewer stations.

Figure: Śródmieście, kernel intensity at three bandwidths.
Figure: Mokotów, kernel intensity at three bandwidths. Note the steep west→east gradient.

6 Second-order analysis

Where the observed K leaves the 199-sim envelope, under the homogeneous (CSR) and inhomogeneous Poisson nulls.

| District | K vs CSR: outside at r (m) | K vs inhom.: radii outside |
|---|---|---|
| Śródmieście | [562, 1439] | 0 / 513 |
| Mokotów | never | 0 / 513 |

In Śródmieście, the observed \(K\) rises above the CSR envelope at medium–large scales (roughly 560–1,440 m; apparent clustering) but lies entirely inside the inhomogeneous Poisson envelope at every radius: the clustering is an artefact of spatially varying intensity, not point interaction. Mokotów shows no detectable second-order departure, although power is low at \(n=35\). Taken together, the intensity surfaces and \(K\)-function plots indicate that both districts depart from a simple homogeneous Poisson process, but for different reasons. In Śródmieście, a high, continuous band of intensity through the core and a strongly elevated \(K(r)\) under CSR reflect large-scale clustering that disappears once inhomogeneity is modelled. In Mokotów, the departure is first-order only: the west–east intensity gradient is visible in the density surface, while \(K(r)\) stays within both the CSR and the inhomogeneous Poisson envelopes, leaving no evidence of interaction beyond that gradient.
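To make the envelope logic concrete, here is a minimal, self-contained sketch of a pointwise Monte-Carlo envelope test for \(K\) under CSR. It is not the spatstat code used for the results above. It makes several simplifying assumptions: a unit-square window, no edge correction, simulations with a fixed number of points, and a hypothetical clustered pattern in place of the station data. The function names are ours.

```python
import math
import random

def k_naive(points, radii, area=1.0):
    """Ripley's K without edge correction: area * #{ordered pairs within r} / (n (n - 1))."""
    n = len(points)
    dists = [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    return [area * 2 * sum(d <= r for d in dists) / (n * (n - 1)) for r in radii]

def csr_envelope(n, radii, nsim, rng):
    """Pointwise min/max of K over nsim uniform patterns of n points in the unit square."""
    sims = [k_naive([(rng.random(), rng.random()) for _ in range(n)], radii)
            for _ in range(nsim)]
    columns = list(zip(*sims))
    return [min(c) for c in columns], [max(c) for c in columns]

rng = random.Random(42)
radii = [0.02 * k for k in range(1, 11)]  # r = 0.02, ..., 0.20

# Hypothetical clustered pattern: 14 points jittered around each of 5 parents.
parents = [(rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)) for _ in range(5)]
observed = [(px + rng.uniform(-0.03, 0.03), py + rng.uniform(-0.03, 0.03))
            for px, py in parents for _ in range(14)]

k_obs = k_naive(observed, radii)
lo, hi = csr_envelope(len(observed), radii, nsim=199, rng=rng)

# Radii at which the observed K leaves the simulated envelope.
outside = [r for r, k, l, h in zip(radii, k_obs, lo, hi) if k < l or k > h]
print(f"{len(outside)} / {len(radii)} radii outside the CSR envelope")
```

Replacing the uniform simulations with patterns drawn from a spatially varying intensity gives the inhomogeneous analogue: an excursion that vanishes under that null is attributable to intensity rather than interaction.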

Figure: Two-district synthesis: intensity surface, K vs CSR, and K vs inhomogeneous Poisson.

7 Covariate analysis

Śródmieście - covariate correlations at station locations.

| | d_city_cen | d_metro_st | d_bus_tram | d_railway | pop_2023 |
|---|---|---|---|---|---|
| d_city_cen | 1.00 | 0.54 | 0.04 | -0.03 | -0.51 |
| d_metro_st | 0.54 | 1.00 | 0.42 | 0.02 | -0.58 |
| d_bus_tram | 0.04 | 0.42 | 1.00 | -0.22 | -0.50 |
| d_railway | -0.03 | 0.02 | -0.22 | 1.00 | 0.37 |
| pop_2023 | -0.51 | -0.58 | -0.50 | 0.37 | 1.00 |

Mokotów - covariate correlations at station locations.

| | d_city_cen | d_metro_st | d_bus_tram | d_railway | pop_2023 |
|---|---|---|---|---|---|
| d_city_cen | 1.00 | 0.38 | -0.34 | -0.02 | -0.75 |
| d_metro_st | 0.38 | 1.00 | 0.35 | 0.48 | -0.18 |
| d_bus_tram | -0.34 | 0.35 | 1.00 | -0.16 | 0.28 |
| d_railway | -0.02 | 0.48 | -0.16 | 1.00 | 0.17 |
| pop_2023 | -0.75 | -0.18 | 0.28 | 0.17 | 1.00 |

Distance-to-centre was recovered exactly (multilateration RMSE \(\approx0\) m). In Śródmieście intensity falls monotonically with distance to centre; in Mokotów that effect is flat and a population-type gradient dominates (note the strong \(d_{\text{centre}}\!\leftrightarrow\!\text{pop}\) correlation, \(\approx-0.75\)). Because the covariates are moderately collinear, the ppm coefficients must be interpreted with care. Substantively, station siting in Śródmieście closely tracks centrality, whereas in Mokotów the centre–periphery pattern is weaker and partly confounded with the population gradient, so distance to centre alone is a less decisive predictor of where stations are placed.
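One rough way to put a number on "moderate collinearity" is a variance inflation factor (VIF) for each of the three covariates that enter the models. VIFs are not part of the reported analysis; the sketch below is our own check. It assumes that the correlations measured at station locations (copied from the tables above) are a fair proxy for collinearity in the fitted models.

```python
# Correlations among the three model covariates at station locations
# (order: d_city_cen, d_metro_st, pop_2023), copied from the correlation tables.
corr = {
    "Śródmieście": [[1.00, 0.54, -0.51], [0.54, 1.00, -0.58], [-0.51, -0.58, 1.00]],
    "Mokotów": [[1.00, 0.38, -0.75], [0.38, 1.00, -0.18], [-0.75, -0.18, 1.00]],
}

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def vifs(m):
    """Variance inflation factors = diagonal of the inverse correlation matrix."""
    det = det3(m)
    # Principal 2x2 minors: delete row/column 0, 1, 2 in turn.
    minors = [
        m[1][1] * m[2][2] - m[1][2] * m[2][1],
        m[0][0] * m[2][2] - m[0][2] * m[2][0],
        m[0][0] * m[1][1] - m[0][1] * m[1][0],
    ]
    return [minor / det for minor in minors]

vif = {name: vifs(m) for name, m in corr.items()}
for name, values in vif.items():
    print(name, [round(v, 2) for v in values])
```

On these inputs every VIF is below 3, which fits the description of collinearity as moderate rather than severe. The largest values belong to distance-to-centre and population in Mokotów.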

Figure: Śródmieście, ρ vs distance to centre.
Figure: Mokotów, ρ vs distance to centre.

8 Inhomogeneous Poisson models

Models per district: ~1 (CSR) → ~ d_city_cen_km → ~ d_city_cen_km + d_metro_km + pop_k (distances in km, population in thousands).
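Written out, the full model takes the usual log-linear form of an inhomogeneous Poisson process fitted with ppm. The notation is ours: \(u\) denotes a location in the district, and the symbols follow those used elsewhere in this report.

\[
\log \lambda(u) = \beta_0 + \beta_{\text{centre}}\, d_{\text{centre}}(u) + \beta_{\text{metro}}\, d_{\text{metro}}(u) + \beta_{\text{pop}}\, \text{pop}(u)
\]

The two smaller models drop terms: CSR keeps only \(\beta_0\), and the centre-only model keeps \(\beta_0\) and \(\beta_{\text{centre}}\). Under this form, moving \(\delta\) km further from a covariate source multiplies the intensity by \(\exp(\beta\,\delta)\), all else equal.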

Śródmieście - full model coefficients.

| Term | Estimate | S.E. | CI95.lo | CI95.hi | Ztest | Zval |
|---|---|---|---|---|---|---|
| (Intercept) | -10.3901 | 0.8513 | -12.0585 | -8.7216 | *** | -12.2056 |
| d_city_cen_km | -0.2258 | 0.1898 | -0.5978 | 0.1462 | | -1.1897 |
| d_metro_km | -1.9100 | 0.6809 | -3.2446 | -0.5753 | ** | -2.8049 |
| pop_k | -4.9333 | 5.7481 | -16.1994 | 6.3328 | | -0.8582 |

Śródmieście - model selection (AIC).

| model | AIC | npar |
|---|---|---|
| ~1 | 1866.0 | 1 |
| ~centre | 1854.9 | 2 |
| ~centre+metro+pop | 1849.7 | 4 |

Śródmieście - likelihood-ratio tests (CSR → +centre → +full).

| Npar | Df | Deviance | Pr(>Chi) |
|---|---|---|---|
| 1 | NA | NA | NA |
| 2 | 1 | 12.9939 | 0.0003 |
| 4 | 2 | 9.2903 | 0.0096 |

Mokotów - full model coefficients.

| Term | Estimate | S.E. | CI95.lo | CI95.hi | Ztest | Zval |
|---|---|---|---|---|---|---|
| (Intercept) | -13.3671 | 1.9793 | -17.2464 | -9.4877 | *** | -6.7534 |
| d_city_cen_km | -0.0316 | 0.1926 | -0.4092 | 0.3460 | | -0.1640 |
| d_metro_km | -0.4786 | 0.2250 | -0.9196 | -0.0376 | * | -2.1271 |
| pop_k | 5.2790 | 13.0568 | -20.3118 | 30.8698 | | 0.4043 |

Mokotów - model selection (AIC).

| model | AIC | npar |
|---|---|---|
| ~1 | 1040.2 | 1 |
| ~centre | 1039.2 | 2 |
| ~centre+metro+pop | 1037.9 | 4 |

Mokotów - likelihood-ratio tests (CSR → +centre → +full).

| Npar | Df | Deviance | Pr(>Chi) |
|---|---|---|---|
| 1 | NA | NA | NA |
| 2 | 1 | 3.0039 | 0.0831 |
| 4 | 2 | 5.3278 | 0.0697 |

Śródmieście: the inhomogeneous model is decisively better than CSR (LRT \(p=0.0003\) for distance-to-centre; \(p=0.010\) adding metro and population). In the full model metro proximity is the dominant driver (\(\beta_{\text{metro}}\approx-1.91\), \(p<0.01\) — intensity drops \(\approx85\%\) per km from a metro station); distance-to-centre is significant alone but absorbed by the correlated metro term.
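The "≈85 % per km" figure follows from exponentiating the coefficient. The sketch below is our own worked check: it applies the same transformation to the metro estimates and their 95 % confidence bounds from the coefficient tables, assuming the log-linear intensity form that ppm fits. The helper name is illustrative.

```python
import math

# Full-model metro coefficients (per km) with 95% CI bounds, from the coefficient tables.
metro = {
    "Śródmieście": {"beta": -1.9100, "lo": -3.2446, "hi": -0.5753},
    "Mokotów": {"beta": -0.4786, "lo": -0.9196, "hi": -0.0376},
}

def pct_change_per_km(beta):
    """Percentage change in intensity for +1 km under a log-linear model."""
    return 100.0 * (math.exp(beta) - 1.0)

effects = {name: {k: pct_change_per_km(v) for k, v in c.items()}
           for name, c in metro.items()}

for name, e in effects.items():
    print(f"{name}: {e['beta']:.0f}% per km (95% CI {e['lo']:.0f}% to {e['hi']:.0f}%)")
```

The intervals are wide in both districts, so the point estimates are best read as indicating direction and rough magnitude rather than a precise decay rate.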

Mokotów: improvements are only borderline (LRT \(p\approx0.07\)–\(0.08\)); metro proximity is marginally significant in the full model (\(p<0.05\)). Inhomogeneity is weak and the pattern is close to Poisson at this sample size.
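The likelihood-ratio p-values for both districts can be recovered from the deviances, because the chi-squared tail has a closed form for 1 and 2 degrees of freedom. The sketch below is ours, not the authors' code. It also computes the implied AIC change (deviance minus 2 per added parameter) as a consistency check against the model-selection tables.

```python
import math

def lrt_p(deviance, df):
    """Chi-squared upper tail for df = 1 or 2 (closed forms)."""
    if df == 1:
        return math.erfc(math.sqrt(deviance / 2.0))
    if df == 2:
        return math.exp(-deviance / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")

# (deviance, df, reported p) for each nested comparison in the LRT tables.
steps = {
    "Śródmieście": [(12.9939, 1, 0.0003), (9.2903, 2, 0.0096)],
    "Mokotów": [(3.0039, 1, 0.0831), (5.3278, 2, 0.0697)],
}

for district, rows in steps.items():
    for deviance, df, reported in rows:
        p = lrt_p(deviance, df)
        aic_drop = deviance - 2 * df  # AIC falls by the deviance minus 2 per added parameter
        print(f"{district}: p = {p:.4f} (reported {reported}), AIC drop = {aic_drop:.1f}")
```

The recomputed p-values match the tables to the reported precision, and the implied AIC drops agree with the differences in the AIC tables to within about 0.1.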

9 Diagnostics

Goodness of fit: K_inhom simulated from the fitted ppm (199 sims).

| District | K_inhom outside fitted-model envelope |
|---|---|
| Śródmieście | 0 / 513 radii |
| Mokotów | 0 / 513 radii |

For both districts the observed \(K_{\text{inhom}}\) lies inside the envelope simulated from the fitted model at all 513 radii, so the inhomogeneous Poisson model adequately reproduces the second-order structure. The Śródmieście residual field retains a mild north–south trend (≈±10 % of mean intensity), a visible consequence of using interpolated proxy covariates rather than exact rasters. The smoothed residual surface shifts gradually from positive residuals in the north to negative residuals in the south, a total change of roughly \(10^{-6}\) in intensity units. If intensity is expressed per m², the mean of 4.49 stations per km² is about \(4.5\times10^{-6}\), so a ±10 % swing spans about \(9\times10^{-7}\), which agrees with that figure. This indicates small but systematic under-prediction in the north and over-prediction in the south.

Figure: Śródmieście, K_inhom goodness of fit.
Figure: Śródmieście, smoothed raw residuals.

10 Conclusions

  1. Veturilo station placement is neither spatially random nor self-clustering. In Śródmieście, it is well described by an inhomogeneous Poisson process whose intensity is driven primarily by metro proximity (and, collinearly, centrality). Stations there are packed far more tightly than in Mokotów, and their pattern is explained by how close a location is to the metro and to the city centre.

  2. Mokotów exhibits only weak, borderline inhomogeneity and no detectable second-order structure: it is effectively a (mildly inhomogeneous) Poisson process, and the smaller sample limits power. The district has a lower station density and a sparser, more spread-out network. There are hints that metro proximity and population matter, but the evidence is weaker, partly because there are fewer stations to learn from.

  3. The substantive contrast: in the core, stations track transit and demand tightly; the periphery has a sparser, less covariate-structured roll-out.

11 Limitations

12 Reproducibility

Analysis runs in a pinned Docker image (rocker/geospatial:4.5.2; R 4.5.2, spatstat 3.5.1, GDAL 3.8.4) with an renv.lock version manifest.

make build && make smoke
for s in 02 03 04 05 06 07 08 09; do
  docker run --rm --platform=linux/amd64 -v "$PWD":/project -w /project \
    veturilo-sppa:4.5.2 Rscript R/${s}_*.R
done
make report   # renders this document to HTML + PDF

13 References

Arbia, G., Cella, P., Espa, G., & Giuliani, D. (2015). A micro spatial analysis of firm demography: The case of food stores in the area of Trento (Italy). Empirical Economics, 48, 923–937.

Arbia, G., Espa, G., & Giuliani, D. (2021). Spatial microeconometrics. Routledge.

Arbia, G., Espa, G., Giuliani, D., & Mazzitelli, A. (2012). Clusters of firms in an inhomogeneous space: The high-tech industries in Milan. Economic Modelling, 29(1), 3–11.

Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial point patterns: methodology and applications with R. CRC Press.

Bencekri, M., Van Fan, Y., Lee, D., Choi, M., & Lee, S. (2024). Optimizing shared bike systems for economic gain: Integrating land use and retail. Journal of Transport Geography. https://doi.org/10.1016/j.jtrangeo.2024.103920

Bocci, C., & Rocco, E. (2016). Modelling the location decisions of manufacturing firms with a spatial point process approach. Journal of Applied Statistics, 43(7), 1226–1239.

Chen, W., Chen, X., Cheng, L., Chen, J., & Tao, S. (2023). Locating new docked bike sharing stations considering demand suitability and spatial accessibility. Travel Behaviour and Society. https://doi.org/10.1016/j.tbs.2023.100675

Conrow, L., Murray, A., & Fischer, H. (2018). An optimization approach for equitable bicycle share station siting. Journal of Transport Geography, 69, 163–170. https://doi.org/10.1016/j.jtrangeo.2018.04.023

Giuliani, D., Arbia, G., & Espa, G. (2014). Weighting Ripley’s K-function to account for the firm dimension in the analysis of spatial concentration. International Regional Science Review, 37(3), 251–272.

Gómez-Antonio, M., & Sweeney, S. (2018). Firm location, interaction, and local characteristics: A case study for Madrid’s electronics sector. Papers in Regional Science, 97(3), 663–686.

Kopczewska, K. (2020). Applied spatial statistics and econometrics: data analysis in R. Routledge.

Lee, C., & Leung, E. (2023). Spatiotemporal analysis of bike-share demand using DTW-based clustering and predictive analytics. Transportation Research Part E: Logistics and Transportation Review. https://doi.org/10.1016/j.tre.2023.103361

Liu, J., Li, Q., Qu, M., Chen, W., Yang, J., Xiong, H., Zhong, H., & Fu, Y. (2015). Station site optimization in bike sharing systems. 2015 IEEE International Conference on Data Mining, 883–888. https://doi.org/10.1109/icdm.2015.99

Mix, R., Hurtubia, R., & Raveau, S. (2022). Optimal location of bike-sharing stations: A built environment and accessibility approach. Transportation Research Part A: Policy and Practice. https://doi.org/10.1016/j.tra.2022.03.022

Venkadavarahan, M., Joji, M., & Marisamynathan, S. (2023). Development of spatial econometric models for estimating the bicycle sharing trip activity. Sustainable Cities and Society. https://doi.org/10.1016/j.scs.2023.104861