Bike-sharing systems (BSS) have become a pivotal component of modern smart cities, offering a flexible, low-carbon alternative to motorised transport and helping to reduce urban emissions. Across many cities, BSS demand is far from random: it is shaped by the underlying urban fabric (transit access, population density and centrality), which makes station siting fundamentally a spatial planning decision. One such system is Veturilo, Warsaw’s public bike-sharing network. This study treats the placement of Veturilo stations themselves as a spatial point pattern and asks two questions that point-pattern analysis can answer rigorously: is station placement spatially random, and is any apparent clustering explained by spatially varying intensity rather than by interaction between stations?
We do not study all of Warsaw but two contrasting districts, the dense central district of Śródmieście and the larger, more peripheral Mokotów, which also keeps point counts and computation modest.
Using kernel intensity estimation, Ripley’s \(K\)/\(L\)
and the pair correlation function with Monte-Carlo envelopes,
nonparametric covariate analysis (rhohat) and inhomogeneous
Poisson process models (ppm), we test whether an
inhomogeneous Poisson process explains the observed pattern. Apparent
clustering in Śródmieście disappears once spatially varying intensity is
accounted for; the pattern is well described by an inhomogeneous Poisson
process driven primarily by metro proximity. Mokotów shows only weak,
borderline inhomogeneity and is effectively consistent with a Poisson
process at its smaller sample size.
The methods used here build on three standard references. Baddeley,
Rubak and Turner (2015) is the canonical treatment of spatial point
patterns in R and of the spatstat package, covering kernel
intensity estimation, Ripley’s \(K\)/\(L\)
and the pair-correlation function, inhomogeneous Poisson and Gibbs
models, and envelope-based goodness-of-fit testing. Kopczewska (2020)
places these tools within an applied spatial-statistics and
spatial-econometrics workflow in R. Arbia, Espa and Giuliani (2021)
develop the framework of spatial microeconometrics, in which
individual economic units are treated as a realised point pattern with
covariate-driven intensity.
Empirically, this framework has been applied most extensively to firm and retail location. Arbia, Espa, Giuliani and Mazzitelli (2012) model high-tech firms in Milan and report that apparent clustering is largely absorbed once an inhomogeneous intensity is fitted — a finding echoed in micro-spatial studies of food stores in Trento (Arbia et al. 2015), Madrid’s electronics sector (Gómez-Antonio and Sweeney 2018) and Italian manufacturing (Bocci and Rocco 2016). Giuliani, Arbia and Espa (2014) refine the methodology by weighting Ripley’s \(K\) for heterogeneous point sizes. The recurring message — that urban facility patterns are rarely homogeneous and that the appropriate null is an inhomogeneous Poisson process — motivates the analytical strategy used below.
In the bike-sharing literature, by contrast, station locations are usually taken as given. They appear either as candidates in normative location-optimisation problems (Liu et al. 2015; Conrow et al. 2018; Mix et al. 2022; Chen et al. 2023; Bencekri et al. 2024) or as fixed sites whose ridership is modelled in time and space (Lee and Leung 2023; Venkadavarahan et al. 2023). Treating the realised station network itself as a spatial point pattern and testing it against a covariate-driven Poisson null is uncommon, and is the perspective applied here to Warsaw’s Veturilo system.
Stations and their precomputed attributes come from the Mendeley
dataset pxp72h4wdg/1 (2023, CC BY 4.0). District boundaries
are a static Warsaw dzielnice GeoJSON (a live Overpass query
was unreliable). All geometry is projected to EPSG:2180
(metric) so distances, areas and K-functions are in metres.
| District | n stations | Area (km²) | Intensity (/km²) | Median NN (m) |
|---|---|---|---|---|
| Śródmieście | 70 | 15.6 | 4.49 | 246 |
| Mokotów | 35 | 35.6 | 0.98 | 607 |
Śródmieście packs stations far more tightly than Mokotów: median nearest-neighbour distances are roughly 2.5× shorter and intensity is about 4.6× higher, a first hint that intensity is far from constant.
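The summary figures in the table can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: it copies the station counts, areas and median nearest-neighbour distances from the table and recomputes intensity and the two between-district ratios; the variable names are ours, not part of the analysis code.

```python
# Figures copied from the summary table: (stations, area in km^2, median NN distance in m).
districts = {
    "Śródmieście": (70, 15.6, 246),
    "Mokotów": (35, 35.6, 607),
}

# Intensity is the number of stations per unit area (stations per km^2).
intensity = {name: n / area for name, (n, area, _) in districts.items()}

# Ratios between the two districts.
intensity_ratio = intensity["Śródmieście"] / intensity["Mokotów"]
nn_ratio = districts["Mokotów"][2] / districts["Śródmieście"][2]

for name, lam in intensity.items():
    print(f"{name}: {lam:.2f} stations/km^2")
print(f"intensity ratio: {intensity_ratio:.1f}x, median NN ratio: {nn_ratio:.1f}x")
```

The two ratios measure different things: intensity scales with stations per area, while nearest-neighbour distance scales roughly with the inverse square root of intensity, so the distance ratio is expected to be the smaller of the two.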
The workflow follows standard spatstat practice:
- ppp/owin objects are built per district; urban covariates are carried as point marks.
- Kernel intensity estimation (density.ppp, several bandwidths) and a quadrat \(\chi^2\) test of complete spatial randomness (CSR).
- Nonparametric covariate analysis with rhohat. Distance to the city centre is reconstructed exactly by least-squares multilateration of the recorded station-to-centre distances; transit/population covariates are kernel-interpolated proxies.
- Inhomogeneous Poisson models fitted with ppm, compared to the homogeneous baseline by AIC and likelihood-ratio tests.

| District | Chi-sq | df | p-value |
|---|---|---|---|
| Śródmieście | 17.30 | 7 | 0.0312 |
| Mokotów | 15.99 | 8 | 0.0852 |
Śródmieście rejects homogeneity (\(p\approx0.03\)); Mokotów does not (\(p\approx0.09\)) but with only ~4 expected points per quadrat the \(\chi^2\) approximation is unreliable, and the density surfaces below show Mokotów is clearly inhomogeneous. This illustrates the low power of quadrat tests and motivates Monte-Carlo \(K\)/\(L\) analysis.
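As a cross-check, the p-values in the quadrat table can be approximately reproduced from the reported statistics and degrees of freedom. The sketch below assumes a two-sided convention (twice the smaller tail probability); this is our inference from the numbers, not something stated in the text. The helper `chi2_sf` is a hypothetical stand-alone implementation using closed-form expressions for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: a normal-tail part plus a finite sum of half-integer powers of x.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def two_sided_p(x, df):
    """Assumed convention: twice the smaller of the two tail probabilities."""
    upper = chi2_sf(x, df)
    return 2.0 * min(upper, 1.0 - upper)

# (statistic, df) copied from the quadrat-test table.
quadrat = {"Śródmieście": (17.30, 7), "Mokotów": (15.99, 8)}
p_values = {name: two_sided_p(x, df) for name, (x, df) in quadrat.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```

Small differences from the tabulated values are expected because the statistics are rounded to two decimals.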
In Śródmieście, all three bandwidths show consistently high intensities along a central spine with several local hotspots, indicating multiple dense clusters rather than a single smooth gradient. In contrast, Mokotów exhibits a pronounced west–east intensity gradient, with a compact high‑intensity area in the west and much lower intensities towards the east, especially in the larger, more peripheral part of the district. Overall, Śródmieście’s station pattern is strongly inhomogeneous with several dense pockets, whereas Mokotów is dominated by a broad, directional gradient in station density, consistent with more peripheral, less transit‑served areas having fewer stations.
| District | K vs CSR: outside at r (m) | K vs inhom.: radii outside |
|---|---|---|
| Śródmieście | [562, 1439] | 0 / 513 |
| Mokotów | never | 0 / 513 |
In Śródmieście, the observed \(K\) rises above the CSR envelope at medium–large scales (apparent clustering) but lies entirely inside the inhomogeneous Poisson envelope at every radius: the clustering is an artefact of spatially varying intensity, not point interaction. Mokotów shows no detectable second-order departure (low power at \(n=35\)). Consistently, the combined intensity and \(K\)-function plots indicate that both districts deviate from a simple homogeneous Poisson process, but for different reasons. In Śródmieście, a high and continuous band of intensity through the core and a strongly elevated \(K(r)\) under CSR reflect large-scale clustering that disappears once inhomogeneity is modelled, whereas in Mokotów the west–east intensity gradient and only modest excursions of \(K(r)\) around the CSR envelope are already well captured by an inhomogeneous Poisson model, leaving no evidence of additional interaction beyond that gradient.
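For reference, the second-order summaries compared above have the following standard definitions (see Baddeley, Rubak and Turner 2015). The notation is ours: \(\lambda\) is a constant intensity, \(\hat\lambda(\cdot)\) an estimated spatially varying intensity, \(|W|\) the area of the district window and \(e(\cdot)\) an edge-correction weight.

\[
K(r) = \frac{1}{\lambda}\,\mathbb{E}\bigl[\text{number of further points within distance } r \text{ of a typical point}\bigr],
\qquad
L(r) = \sqrt{\frac{K(r)}{\pi}},
\qquad
g(r) = \frac{K'(r)}{2\pi r}.
\]

Under CSR these reduce to \(K(r)=\pi r^2\), \(L(r)=r\) and \(g(r)=1\). The inhomogeneous version reweights each pair of points by the estimated intensity at both locations:

\[
\hat K_{\text{inhom}}(r) = \frac{1}{|W|}\sum_{i}\sum_{j\neq i}
\frac{\mathbf{1}\{\lVert x_i - x_j\rVert \le r\}}{\hat\lambda(x_i)\,\hat\lambda(x_j)}\;e(x_i, x_j; r).
\]

For an inhomogeneous Poisson process with correctly specified intensity, \(K_{\text{inhom}}(r)=\pi r^2\), which is why a pattern can look clustered against the CSR envelope yet sit inside the inhomogeneous one.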
Covariate correlations, Śródmieście:

| | d_city_cen | d_metro_st | d_bus_tram | d_railway | pop_2023 |
|---|---|---|---|---|---|
| d_city_cen | 1.00 | 0.54 | 0.04 | -0.03 | -0.51 |
| d_metro_st | 0.54 | 1.00 | 0.42 | 0.02 | -0.58 |
| d_bus_tram | 0.04 | 0.42 | 1.00 | -0.22 | -0.50 |
| d_railway | -0.03 | 0.02 | -0.22 | 1.00 | 0.37 |
| pop_2023 | -0.51 | -0.58 | -0.50 | 0.37 | 1.00 |

Covariate correlations, Mokotów:

| | d_city_cen | d_metro_st | d_bus_tram | d_railway | pop_2023 |
|---|---|---|---|---|---|
| d_city_cen | 1.00 | 0.38 | -0.34 | -0.02 | -0.75 |
| d_metro_st | 0.38 | 1.00 | 0.35 | 0.48 | -0.18 |
| d_bus_tram | -0.34 | 0.35 | 1.00 | -0.16 | 0.28 |
| d_railway | -0.02 | 0.48 | -0.16 | 1.00 | 0.17 |
| pop_2023 | -0.75 | -0.18 | 0.28 | 0.17 | 1.00 |
Distance-to-centre was recovered exactly (multilateration RMSE \(\approx0\) m). In Śródmieście intensity
falls monotonically with distance to centre; in Mokotów that effect is
flat and a population-type gradient dominates (note the strong \(d_{\text{centre}}\!\leftrightarrow\!\text{pop}\)
correlation \(\approx-0.75\)). Moderate
collinearity means ppm coefficients are interpreted with
care. Taken together, this suggests that in Śródmieście station siting
closely tracks centrality, whereas in Mokotów the centre–periphery
pattern is weaker and partly confounded with population gradients, so
distance to centre alone is a less decisive predictor of where stations
are placed.
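The multilateration step mentioned above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes each station has planar coordinates in metres and a recorded distance to one common, unknown centre, and it uses the standard linearisation (subtracting the first range equation from the others) followed by ordinary least squares. The station coordinates and the centre in the example are hypothetical.

```python
import math

def multilaterate(points, dists):
    """Least-squares estimate of the point whose distances to `points` are `dists`."""
    (x0, y0), d0 = points[0], dists[0]
    saa = sab = sbb = sag = sbg = 0.0
    for (x, y), d in zip(points[1:], dists[1:]):
        # Subtracting the first range equation removes the quadratic terms in the
        # unknown centre and leaves one linear equation a*cx + b*cy = g per station.
        a = 2.0 * (x - x0)
        b = 2.0 * (y - y0)
        g = x * x + y * y - x0 * x0 - y0 * y0 - d * d + d0 * d0
        saa += a * a
        sab += a * b
        sbb += b * b
        sag += a * g
        sbg += b * g
    # Solve the 2x2 normal equations by Cramer's rule.
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("stations are (nearly) collinear; centre not identifiable")
    cx = (sbb * sag - sab * sbg) / det
    cy = (saa * sbg - sab * sag) / det
    return cx, cy

# Hypothetical stations (metres) and a hypothetical true centre.
stations = [(0.0, 0.0), (1200.0, 300.0), (-500.0, 900.0), (700.0, -1100.0), (-900.0, -400.0)]
true_centre = (250.0, -150.0)
recorded = [math.dist(s, true_centre) for s in stations]

estimate = multilaterate(stations, recorded)
rmse = math.sqrt(
    sum((math.dist(s, estimate) - d) ** 2 for s, d in zip(stations, recorded)) / len(stations)
)
print(estimate, rmse)
```

With noise-free distances the recovered centre matches the true one to floating-point precision, which is the situation an RMSE of approximately 0 m describes.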
Models per district: ~1 (CSR) →
~ d_city_cen_km →
~ d_city_cen_km + d_metro_km + pop_k (distances in km,
population in thousands).
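Written out, the largest model corresponds to a log-linear intensity, which is the form ppm fits for Poisson models. The symbols \(d_{\text{centre}}(u)\), \(d_{\text{metro}}(u)\) and \(\text{pop}(u)\) are our shorthand for the covariates d_city_cen_km, d_metro_km and pop_k evaluated at location \(u\):

\[
\log \lambda(u) = \beta_0 + \beta_{\text{centre}}\,d_{\text{centre}}(u) + \beta_{\text{metro}}\,d_{\text{metro}}(u) + \beta_{\text{pop}}\,\text{pop}(u).
\]

The coefficients maximise the Poisson process log-likelihood over the district window \(W\), given the observed stations \(x_1,\dots,x_n\):

\[
\log L(\beta) = \sum_{i=1}^{n} \log \lambda_\beta(x_i) - \int_W \lambda_\beta(u)\,\mathrm{d}u.
\]

Setting all slopes to zero recovers the homogeneous baseline ~1, so the three models are nested and can be compared by likelihood-ratio tests.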

Śródmieście, full-model coefficients:

| | Estimate | S.E. | CI95.lo | CI95.hi | Ztest | Zval |
|---|---|---|---|---|---|---|
| (Intercept) | -10.3901 | 0.8513 | -12.0585 | -8.7216 | *** | -12.2056 |
| d_city_cen_km | -0.2258 | 0.1898 | -0.5978 | 0.1462 | | -1.1897 |
| d_metro_km | -1.9100 | 0.6809 | -3.2446 | -0.5753 | ** | -2.8049 |
| pop_k | -4.9333 | 5.7481 | -16.1994 | 6.3328 | | -0.8582 |
| model | AIC | npar |
|---|---|---|
| ~1 | 1866.0 | 1 |
| ~centre | 1854.9 | 2 |
| ~centre+metro+pop | 1849.7 | 4 |
| Npar | Df | Deviance | Pr(>Chi) |
|---|---|---|---|
| 1 | NA | NA | NA |
| 2 | 1 | 12.9939 | 0.0003 |
| 4 | 2 | 9.2903 | 0.0096 |

Mokotów, full-model coefficients:

| | Estimate | S.E. | CI95.lo | CI95.hi | Ztest | Zval |
|---|---|---|---|---|---|---|
| (Intercept) | -13.3671 | 1.9793 | -17.2464 | -9.4877 | *** | -6.7534 |
| d_city_cen_km | -0.0316 | 0.1926 | -0.4092 | 0.3460 | | -0.1640 |
| d_metro_km | -0.4786 | 0.2250 | -0.9196 | -0.0376 | * | -2.1271 |
| pop_k | 5.2790 | 13.0568 | -20.3118 | 30.8698 | | 0.4043 |
| model | AIC | npar |
|---|---|---|
| ~1 | 1040.2 | 1 |
| ~centre | 1039.2 | 2 |
| ~centre+metro+pop | 1037.9 | 4 |
| Npar | Df | Deviance | Pr(>Chi) |
|---|---|---|---|
| 1 | NA | NA | NA |
| 2 | 1 | 3.0039 | 0.0831 |
| 4 | 2 | 5.3278 | 0.0697 |
Śródmieście: the inhomogeneous model is decisively better than CSR (LRT \(p=0.0003\) for distance-to-centre; \(p=0.010\) adding metro and population). In the full model metro proximity is the dominant driver (\(\beta_{\text{metro}}\approx-1.91\), \(p<0.01\) — intensity drops \(\approx85\%\) per km from a metro station); distance-to-centre is significant alone but absorbed by the correlated metro term.
Mokotów: improvements are only borderline (LRT \(p\approx0.07\)–\(0.08\)); metro is marginally significant (\(p<0.05\)). Inhomogeneity is weak and the pattern is close to Poisson at this sample size.
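The headline numbers in the two paragraphs above follow directly from the tables. The sketch below is a plain arithmetic check, not part of the analysis pipeline: it converts the metro coefficients into a percentage change in intensity per km, recomputes the Wald p-values under a normal approximation, and recomputes the likelihood-ratio p-values from the reported deviances. The helper functions are ours and only cover the 1 and 2 degrees of freedom needed here.

```python
import math

def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def chi2_sf(x, df):
    """Chi-square upper tail; closed forms for the df values used in the tables."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are needed here")

# (estimate, standard error) of d_metro_km, copied from the coefficient tables.
metro = {"Śródmieście": (-1.9100, 0.6809), "Mokotów": (-0.4786, 0.2250)}

effects = {}
for name, (beta, se) in metro.items():
    effects[name] = {
        # With a log-linear intensity, exp(beta) is the multiplicative change per km.
        "pct_drop_per_km": 100.0 * (1.0 - math.exp(beta)),
        "wald_p": normal_two_sided_p(beta / se),
    }

# (deviance, df) for each nested-model step, copied from the anova tables.
lrt_steps = {
    "Śródmieście": [(12.9939, 1), (9.2903, 2)],
    "Mokotów": [(3.0039, 1), (5.3278, 2)],
}
lrt_p = {name: [chi2_sf(dev, df) for dev, df in steps] for name, steps in lrt_steps.items()}

for name in metro:
    print(name, effects[name], [round(p, 4) for p in lrt_p[name]])
```

The same conversion applied to Mokotów gives a drop of a little under 40 % per km from a metro station, considerably weaker than in Śródmieście.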
| District | Kinhom outside fitted-model envelope |
|---|---|
| Śródmieście | 0 / 513 radii |
| Mokotów | 0 / 513 radii |
For both districts the observed \(K_{\text{inhom}}\) lies inside the envelope simulated from the fitted model at all radii (0 of 513), so the inhomogeneous Poisson model adequately reproduces the second-order structure. The Śródmieście residual field retains a mild north–south trend (≈±10 % of mean intensity), a visible consequence of using interpolated proxy covariates rather than exact rasters. The smoothed residual surface shifts gradually from positive residuals in the north to negative residuals in the south, by roughly \(10^{-6}\) in intensity units, indicating a small but systematic under‑ and over‑prediction at the extremes of the district.
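The "radii outside the envelope" counts reported above come from a pointwise Monte-Carlo envelope. The sketch below illustrates the mechanics only and makes several simplifying assumptions that differ from the analysis: it simulates from a homogeneous null with a fixed number of points rather than from the fitted inhomogeneous model, works in the unit square, uses a naive \(K\) estimate without edge correction, and takes a simulated pattern as a stand-in for the observed one. All names and settings (39 simulations, 25 radii) are illustrative.

```python
import math
import random

def k_naive(points, radii, area):
    """Naive Ripley's K (no edge correction) evaluated at ascending radii."""
    n = len(points)
    dists = sorted(
        math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]
    )
    k_values, j = [], 0
    for r in radii:
        while j < len(dists) and dists[j] <= r:
            j += 1
        # Each unordered pair is counted twice, once from each endpoint.
        k_values.append(area * 2 * j / (n * (n - 1)))
    return k_values

def csr_pattern(n, rng):
    """n independent uniform points in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]

rng = random.Random(2024)
n_points, n_sim = 70, 39
radii = [0.01 * k for k in range(1, 26)]

# Stand-in for the observed pattern (here itself simulated under the null).
observed = k_naive(csr_pattern(n_points, rng), radii, area=1.0)

# Simulate under the null and take the pointwise min and max at each radius.
sims = [k_naive(csr_pattern(n_points, rng), radii, area=1.0) for _ in range(n_sim)]
lower = [min(col) for col in zip(*sims)]
upper = [max(col) for col in zip(*sims)]

outside = sum(1 for k, lo, hi in zip(observed, lower, upper) if k < lo or k > hi)
print(f"{outside} / {len(radii)} radii outside the envelope")
```

Because the envelope is pointwise, an occasional excursion at a single radius is not strong evidence against the null; a count of zero across all radii, as reported for both districts, is the clearest outcome.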
Veturilo station placement is neither spatially random nor self-clustering. In Śródmieście, it is well described by an inhomogeneous Poisson process whose intensity is driven primarily by metro proximity (and, collinearly, centrality). Stations are not randomly scattered: they are much more tightly packed than in Mokotów, and their pattern is explained by proximity to metro stations and to the city centre.
Mokotów exhibits only weak, borderline inhomogeneity and no detectable second-order structure: it is effectively a (mildly inhomogeneous) Poisson process, although the smaller sample limits power. The district has a lower station density and a sparser, more spread-out network. There are hints that metro proximity and population matter, but the evidence is weaker, partly because there are fewer stations to learn from.
The substantive contrast: in the core, stations track transit and demand tightly; the periphery has a sparser, less covariate-structured roll-out.
Analysis runs in a pinned Docker image
(rocker/geospatial:4.5.2; R 4.5.2, spatstat 3.5.1, GDAL
3.8.4) with an renv.lock version manifest.
make build && make smoke
for s in 02 03 04 05 06 07 08 09; do
docker run --rm --platform=linux/amd64 -v "$PWD":/project -w /project \
veturilo-sppa:4.5.2 Rscript R/${s}_*.R
done
make report # renders this document to HTML + PDF
Arbia, G., Cella, P., Espa, G., & Giuliani, D. (2015). A micro spatial analysis of firm demography: The case of food stores in the area of Trento (Italy). Empirical Economics, 48, 923–937.
Arbia, G., Espa, G., & Giuliani, D. (2021). Spatial microeconometrics. Routledge.
Arbia, G., Espa, G., Giuliani, D., & Mazzitelli, A. (2012). Clusters of firms in an inhomogeneous space: The high-tech industries in Milan. Economic Modelling, 29(1), 3–11.
Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial point patterns: methodology and applications with R. CRC Press.
Bencekri, M., Van Fan, Y., Lee, D., Choi, M., & Lee, S. (2024). Optimizing shared bike systems for economic gain: Integrating land use and retail. Journal of Transport Geography. https://doi.org/10.1016/j.jtrangeo.2024.103920
Bocci, C., & Rocco, E. (2016). Modelling the location decisions of manufacturing firms with a spatial point process approach. Journal of Applied Statistics, 43(7), 1226–1239.
Chen, W., Chen, X., Cheng, L., Chen, J., & Tao, S. (2023). Locating new docked bike sharing stations considering demand suitability and spatial accessibility. Travel Behaviour and Society. https://doi.org/10.1016/j.tbs.2023.100675
Conrow, L., Murray, A., & Fischer, H. (2018). An optimization approach for equitable bicycle share station siting. Journal of Transport Geography, 69, 163–170. https://doi.org/10.1016/j.jtrangeo.2018.04.023
Giuliani, D., Arbia, G., & Espa, G. (2014). Weighting Ripley’s K-function to account for the firm dimension in the analysis of spatial concentration. International Regional Science Review, 37(3), 251–272.
Gómez-Antonio, M., & Sweeney, S. (2018). Firm location, interaction, and local characteristics: A case study for Madrid’s electronics sector. Papers in Regional Science, 97(3), 663–686.
Kopczewska, K. (2020). Applied spatial statistics and econometrics: data analysis in R. Routledge.
Lee, C., & Leung, E. (2023). Spatiotemporal analysis of bike-share demand using DTW-based clustering and predictive analytics. Transportation Research Part E: Logistics and Transportation Review. https://doi.org/10.1016/j.tre.2023.103361
Liu, J., Li, Q., Qu, M., Chen, W., Yang, J., Xiong, H., Zhong, H., & Fu, Y. (2015). Station site optimization in bike sharing systems. 2015 IEEE International Conference on Data Mining, 883–888. https://doi.org/10.1109/icdm.2015.99
Mix, R., Hurtubia, R., & Raveau, S. (2022). Optimal location of bike-sharing stations: A built environment and accessibility approach. Transportation Research Part A: Policy and Practice. https://doi.org/10.1016/j.tra.2022.03.022
Venkadavarahan, M., Joji, M., & Marisamynathan, S. (2023). Development of spatial econometric models for estimating the bicycle sharing trip activity. Sustainable Cities and Society. https://doi.org/10.1016/j.scs.2023.104861