1 Introduction

This project studies how educational and public facilities are distributed across two central districts of Warsaw — Śródmieście and Żoliborz. The data was collected from OpenStreetMap and includes 394 mapped locations of different facilities such as schools, kindergartens, libraries, universities, and colleges.

The study area covers 24.06 km² and all analysis was performed using the spatstat package in R, with the observation window defined from the merged administrative boundary of both districts reprojected to EPSG:2180 (Polish national grid, metres).

Research questions:

Is the distribution of educational facilities spatially random, clustered, or regular?
Does this distribution show any relationship with proximity to the city centre?

2 Packages and Setup

The following R packages are used throughout this analysis:

Package	Purpose
`sf`	Spatial data loading, manipulation and reprojection
`spatstat`	Core point pattern analysis framework
`spatstat.geom`	Point pattern geometry and `ppp` objects
`spatstat.explore`	Density estimation, K/L/G/F functions, envelopes
`tidyverse`	Data manipulation and visualisation
`viridis`	Colour palettes for spatial maps
`RColorBrewer`	Additional colour palettes for marked patterns
`osmdata`	Downloading road network from OpenStreetMap
`GET`	Global envelope tests

3 Data Preparation

We loaded two GeoJSON files from OpenStreetMap. The boundary file was filtered to extract Śródmieście and Żoliborz districts (admin_level = 9) and dissolved into a single polygon. Facility points were clipped to this polygon using st_within(). All layers were reprojected to EPSG:2180 (Polish national grid, metres) for metric distance calculations. The point pattern was then converted to a ppp object and rescaled to kilometres, giving a study area of 24.06 km² and an average intensity of 16.38 facilities per km².
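The conversion to a `ppp` object and the rescaling to kilometres can be sketched as follows. This is a minimal illustration, not the code used in the analysis: the rectangular window and the uniformly random coordinates are hypothetical stand-ins for the dissolved district polygon and the OpenStreetMap points.

```r
library(spatstat.geom)

set.seed(1)

# Hypothetical stand-in window: a 4 km x 6 km rectangle with coordinates in metres
# (the real analysis uses the merged district polygon in EPSG:2180).
W_m <- owin(xrange = c(0, 4000), yrange = c(0, 6000),
            unitname = c("metre", "metres"))

# Hypothetical facility coordinates in metres
n   <- 394
X_m <- ppp(x = runif(n, 0, 4000), y = runif(n, 0, 6000), window = W_m)

# Rescale so that one unit of length equals 1000 m, i.e. 1 km
X_km <- rescale(X_m, 1000, "km")

area_km2 <- area(Window(X_km))  # window area in km^2
lambda   <- intensity(X_km)     # average intensity = number of points / area
```

After rescaling, every distance-based result (bandwidths, K-function arguments, model coefficients) is expressed per kilometre rather than per metre.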

Figure 1: Study Area — Śródmieście and Żoliborz Districts, Warsaw

The table below summarises the nine facility types included in the dataset:

Table 1: Educational and public facilities by type
Facility Type	Count	Intensity (per km²)
school	132	5.49
kindergarten	99	4.12
library	52	2.16
college	40	1.66
university	39	1.62
language_school	23	0.96
driving_school	4	0.17
music_school	3	0.12
dancing_school	2	0.08
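The per-type intensities in Table 1 are counts divided by the study area. The sketch below repeats that arithmetic in base R using the rounded area of 24.06 km² reported in the text. Because that area is rounded, the second decimal can differ slightly from the table for some types.

```r
# Counts per facility type, copied from Table 1
counts <- c(school = 132, kindergarten = 99, library = 52, college = 40,
            university = 39, language_school = 23, driving_school = 4,
            music_school = 3, dancing_school = 2)

area_km2 <- 24.06                 # study area as reported (rounded)

total    <- sum(counts)           # total number of facilities
overall  <- total / area_km2      # average intensity per km^2
by_type  <- counts / area_km2     # intensity per km^2 for each type
share    <- 100 * counts / total  # percentage share of each type

round(by_type, 2)
round(share, 1)
```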

The map below shows the spatial distribution of all 394 facilities within the study area, coloured by facility type.

Figure 1: Educational and public facilities by type in Śródmieście and Żoliborz

Figure 2: Distribution of facilities by type

The marked point pattern already suggests spatial inhomogeneity — facilities are not evenly spread across the study area. Schools and kindergartens appear more concentrated in the northern Żoliborz section while universities are visible primarily in the southern Śródmieście area. This visual impression is formally tested in the sections that follow.

4 First Order Analysis

First order analysis describes the overall intensity of a point pattern — how many points occur per unit area and whether this intensity is constant or varies spatially across the study region.

We examine intensity through quadrat counts, kernel density estimation, bandwidth selection, and formal statistical tests (KS and Berman) against spatial covariates including geographic coordinates and distance to the Warsaw city centre.

Intensity

The average intensity of the point pattern is 16.38 facilities per km², indicating a high density of educational infrastructure in central Warsaw. Intensity differs considerably by type: schools dominate with 5.49 per km², followed by kindergartens at 4.12 per km².

4.1 Quadrat Test

To formally test whether the pattern deviates from Complete Spatial Randomness (CSR), we applied the Chi-squared quadrat test using a 5×5 grid.

Figure 3: Quadrat count test — all facilities

Quadrat intensities range from 0 in peripheral quadrats to a maximum of 64.5 per km² in the densest central quadrat. The formal test gives X² = 183.05, df = 15, p < 2.2e-16 against the clustered alternative, which is strong evidence to reject CSR in favour of clustering. The test against the regular alternative gives p = 1, so there is no evidence of regularity.
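A one-sided quadrat test of this kind can be sketched with `quadrat.test()` from spatstat. The pattern below is a simulated Matérn cluster process on the unit square and serves only as a hypothetical stand-in for the facility data. The clustered and regular alternatives use opposite tails of the same chi-squared distribution, so their p-values sum to one. This is why a very small p-value for clustering comes with p close to 1 for regularity.

```r
library(spatstat)

set.seed(42)

# Hypothetical clustered pattern on the unit square (not the Warsaw data)
X <- rMatClust(10, 0.05, 20)

# Counts in a 5 x 5 grid of quadrats
Q <- quadratcount(X, nx = 5, ny = 5)

# One-sided chi-squared tests of CSR
qt_clu <- quadrat.test(X, nx = 5, ny = 5, alternative = "clustered")
qt_reg <- quadrat.test(X, nx = 5, ny = 5, alternative = "regular")

qt_clu$statistic   # X^2
qt_clu$parameter   # degrees of freedom = number of quadrats - 1
qt_clu$p.value     # upper-tail p-value (clustering)
qt_reg$p.value     # lower-tail p-value (regularity)
```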

4.2 Kernel Density Estimation

Kernel density estimation was applied to visualise the continuous intensity surface across the study area. The default Gaussian kernel was used with automatic bandwidth selection.

Figure 4: Kernel density surface — all facilities

The density surface confirms the pattern observed visually — intensity peaks in the north-central zone at the Żoliborz–Śródmieście boundary, reaching values above 25 per km². Intensity decreases progressively toward the southern tip of the study area. The contour plot shows tight concentric contours in the high-density zone (values 22–24 per km²), with a single dominant hotspot rather than multiple dispersed peaks.

4.3 Bandwidth Selection

Four bandwidth methods were compared — Diggle (0.0099 km), PPL (0.163 km), Scott (0.51 km) and CvL (0.50 km). The Diggle bandwidth is too small for district-scale interpretation. Scott and CvL produce the smoothest and most interpretable surfaces and are preferred for describing the overall spatial trend.
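The four selectors named above correspond to the spatstat functions `bw.diggle()`, `bw.ppl()`, `bw.scott()` and `bw.CvL()`. The sketch below applies them to a simulated clustered pattern, a hypothetical stand-in for the facility data, so the values it prints are not those of the report.

```r
library(spatstat)

set.seed(7)

# Hypothetical clustered pattern on the unit square (not the Warsaw data)
X <- rMatClust(10, 0.05, 20)

# Candidate bandwidths from four selection rules
bw <- list(
  diggle = bw.diggle(X),  # mean-square-error cross-validation
  ppl    = bw.ppl(X),     # likelihood cross-validation
  scott  = bw.scott(X),   # Scott's rule of thumb (one value per axis)
  CvL    = bw.CvL(X)      # Cronie and van Lieshout criterion
)
lapply(bw, function(b) round(as.numeric(b), 3))

# Kernel intensity estimate using one of the selected bandwidths
D <- density(X, sigma = as.numeric(bw$CvL))
```

Cross-validation selectors such as Diggle's tend to choose small bandwidths for strongly clustered data, which matches the very small Diggle value reported above.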

4.4 Spatial Distribution Tests

To test whether intensity varies systematically with location, we applied the Kolmogorov-Smirnov CDF test and Berman test along both axes and against distance to the city centre (Palace of Culture).

Test	Covariate	Statistic	p-value	Conclusion
KS test	x (E-W)	D = 0.139	4.5e-07	Significant
KS test	y (N-S)	D = 0.121	1.86e-05	Significant
KS test	dist to centre	D = 0.178	2.6e-11	Significant
Berman Z2	x	-2.37	0.018	Significant
Berman Z2	dist to centre	-6.47	9.997e-11	Highly significant

Figure 5: Smoothed distance to Palace of Culture (km)

All tests reject spatial uniformity of facility intensity. Facilities concentrate closer to the city centre: the distance model fitted in Section 7 implies that intensity falls by approximately 21% for every 1 km further from the Palace of Culture. Both east-west and north-south gradients are statistically significant.
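Tests of this type can be run with `cdf.test()` and `berman.test()` from spatstat. The sketch below is illustrative only: the square window, the centre location and the simulated intensity are all hypothetical. A negative Berman Z2 statistic indicates that points sit at lower covariate values (here, shorter distances) than expected under uniformity.

```r
library(spatstat)

set.seed(3)

# Hypothetical 5 km x 5 km window with a hypothetical centre point
W      <- owin(c(0, 5), c(0, 5))
centre <- c(2.5, 2.5)
dist_fun <- function(x, y) sqrt((x - centre[1])^2 + (y - centre[2])^2)

# Simulated pattern whose intensity decays with distance from the centre
X <- rpoispp(function(x, y) 60 * exp(-0.6 * dist_fun(x, y)), win = W)

# Kolmogorov-Smirnov CDF tests against the x coordinate and against distance
ks_x <- cdf.test(X, "x", test = "ks")
ks_d <- cdf.test(X, dist_fun, test = "ks")

# Berman Z2 test against distance to the centre
bz_d <- berman.test(X, dist_fun, which = "Z2")

c(ks_x = ks_x$p.value, ks_d = ks_d$p.value, berman_d = bz_d$p.value)
```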

5 Second Order Analysis

Second order analysis examines the spatial dependence between points — whether the presence of one facility influences the likelihood of another facility nearby. Unlike first order analysis which describes overall intensity, second order analysis captures interactions between points at different distances.

5.1 K, L, G, F and J Functions

We computed the full suite of inhomogeneous summary functions to characterise the spatial structure of the pattern at multiple scales.

Figure 6: Inhomogeneous summary functions — all facilities

All six functions give a consistent result:

Function	Observed vs Poisson	Interpretation
Kinhom	Above reference	Clustering at all scales
Linhom	Above diagonal	Clustering at all scales
PCF	Starts at ~1.9, decreases to 1	Strong short-range clustering
Finhom	Right of reference	Large empty spaces between clusters
Ginhom	Left of reference	Short nearest-neighbour distances
Jinhom	Below 1, decreasing	Clustering (J < 1 = attraction)

Even after accounting for the non-uniform intensity gradient, the pattern shows genuine spatial dependence between facility locations — facilities tend to locate near other facilities beyond what intensity variation alone explains.
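Inhomogeneous summary functions need an estimate of the intensity surface, which is then used to reweight each point. The sketch below shows the general idea on simulated data with a fixed, hypothetical bandwidth of 0.2. The nearest-neighbour and empty-space analogues (`Ginhom()`, `Finhom()`, `Jinhom()`) are called in the same way.

```r
library(spatstat)

set.seed(11)

# Hypothetical clustered pattern on the unit square (not the Warsaw data)
X <- rMatClust(10, 0.05, 20)

# Kernel estimate of the intensity surface (bandwidth chosen for illustration)
lam <- density(X, sigma = 0.2, positive = TRUE)

# Inhomogeneous K, L and pair correlation functions
K <- Kinhom(X, lambda = lam)
L <- Linhom(X, lambda = lam)
g <- pcfinhom(X, lambda = lam)

# Under an inhomogeneous Poisson process: K(r) = pi r^2, L(r) = r, g(r) = 1
plot(L, main = "Inhomogeneous L-function")
```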

5.2 Envelope Tests

To formally test whether the observed clustering exceeds what an inhomogeneous Poisson process would generate, we computed pointwise envelopes from 99 simulations and global envelopes from 39 simulations (with a further 39 used to estimate the mean).

## Generating 99 simulations by evaluating expression ... Done.
## Generating 78 simulations by evaluating expression (39 to estimate the mean and 39 to calculate envelopes) ... Done.

Figure 7: Pointwise and global envelopes — Linhom

The observed Linhom curve lies outside the envelope across virtually all distances in both the pointwise and global tests. This confirms that the clustering is statistically significant even under the inhomogeneous Poisson null hypothesis.
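Envelopes of this kind can be sketched with `envelope()`, simulating from an inhomogeneous Poisson process with the estimated intensity. The data, bandwidth and number of simulations below are hypothetical and kept small so that the example runs quickly.

```r
library(spatstat)

set.seed(5)

# Hypothetical clustered pattern and kernel intensity estimate
X   <- rMatClust(10, 0.05, 20)
lam <- density(X, sigma = 0.2, positive = TRUE)

# Pointwise envelope: the null model is an inhomogeneous Poisson process
env_pw <- envelope(X, Linhom, lambda = lam,
                   simulate = expression(rpoispp(lam)),
                   nsim = 19, verbose = FALSE)

# Global (simultaneous) envelope based on the maximum absolute deviation
env_gl <- envelope(X, Linhom, lambda = lam,
                   simulate = expression(rpoispp(lam)),
                   nsim = 19, global = TRUE, verbose = FALSE)

plot(env_gl, main = "Global envelope for Linhom")
```

A pointwise envelope is only valid as a test at a single distance fixed in advance, whereas a global envelope controls the error rate across all distances at once.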

Formal test results:

Test	Statistic	p-value	Conclusion
MAD test	0.0898	0.05	Significant at the 5% level
DCLF test	0.00306	0.05	Significant at the 5% level
Clark-Evans	R = 0.633	< 2.2e-16	Strong clustering
Hopkins-Skellam	A = 0.194	< 2.2e-16	Strong clustering

The Clark-Evans R = 0.633 means facilities are on average only 63% as far from their nearest neighbour as expected under CSR, which indicates strong clustering relative to a homogeneous pattern.
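The Clark-Evans index is the ratio of the observed mean nearest-neighbour distance to its expectation under CSR, which is 1 / (2 √λ) when edge effects are ignored. The base R sketch below applies that uncorrected formula to the reported intensity and R value to show the implied distances. The report may have used an edge correction, so these figures are indicative only.

```r
lambda <- 16.38   # average intensity (facilities per km^2), as reported
R      <- 0.633   # Clark-Evans index, as reported

# Expected mean nearest-neighbour distance under CSR (no edge correction)
expected_nn_km <- 1 / (2 * sqrt(lambda))

# Observed mean nearest-neighbour distance implied by R
implied_nn_km <- R * expected_nn_km

round(1000 * c(expected_m = expected_nn_km, implied_m = implied_nn_km))
```

Under these assumptions the expected spacing is roughly 124 m and the implied observed spacing roughly 78 m.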

6 Marked Pattern Analysis

Marked pattern analysis extends the basic point pattern by incorporating categorical labels (marks) — in our case the nine facility types. This allows us to examine whether different types of facilities show distinct spatial distributions and whether they are spatially segregated from each other.

6.1 Intensity by Facility Type

Figure 8: Kernel density by facility type

The four facility types shown occupy clearly different parts of the study area:

Schools — hotspot in northwestern Żoliborz (peak ~10 per km²)
Kindergartens — hotspot in northern Żoliborz tip (peak ~8 per km²)
Libraries — central-western corridor, most evenly distributed type
Universities — sharply concentrated in southern Śródmieście, nearly absent from Żoliborz

6.2 Relative Risk

The relative risk surfaces show the probability of each facility type at each location, conditioning on a facility being present. This removes the effect of overall intensity and reveals pure spatial differentiation between types.

Figure 9: Relative risk — probability of each facility type

Schools have the highest probability (up to 0.45) in northwestern Żoliborz. Kindergartens dominate the northern tip (up to 0.65). Universities show a sharp probability peak in southern Śródmieście (up to 0.15). Libraries show the most even spatial distribution of all types.
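Type probabilities of this kind can be estimated with `relrisk()`, which takes a multitype pattern and returns one probability image per type. The sketch below uses a hypothetical three-type pattern with randomly assigned labels and a fixed, illustrative bandwidth. In practice the bandwidth would be chosen by cross-validation.

```r
library(spatstat)

set.seed(9)

# Hypothetical multitype pattern on the unit square (not the Warsaw data)
n     <- 300
types <- factor(sample(c("school", "kindergarten", "library"), n, replace = TRUE))
Xm    <- ppp(runif(n), runif(n), window = owin(c(0, 1), c(0, 1)), marks = types)

# Estimated probability of each type at each location, given that a point occurs
p <- relrisk(Xm, sigma = 0.15)

names(p)  # one image per facility type
plot(p, main = "")
```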

6.3 Segregation Test

To formally test whether facility types are spatially segregated we applied the Monte Carlo segregation test with 99 simulations.

The null hypothesis is that all types share the same spatial distribution (random labelling). The result:

T = 9.24, p = 0.01 — the nine facility types are significantly spatially segregated at the 1% level. Schools and kindergartens concentrate in Żoliborz while universities concentrate in southern Śródmieście — this differentiation is far stronger than would arise by chance.
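A Monte Carlo p-value is a rank: the observed statistic is compared with the statistics from the simulated patterns. The base R sketch below uses the standard formula with hypothetical simulated values. With 99 simulations the smallest attainable p-value is 1/100 = 0.01, which is what is reported when the observed statistic exceeds every simulated one.

```r
# Monte Carlo p-value for a test where large values are evidence against H0
mc_pvalue <- function(obs, sims) {
  (1 + sum(sims >= obs)) / (length(sims) + 1)
}

set.seed(2)
sims <- runif(99)          # hypothetical simulated statistics, all below 1

p_extreme <- mc_pvalue(obs = 5, sims = sims)          # observed exceeds all 99
p_typical <- mc_pvalue(obs = median(sims), sims = sims)

c(p_extreme = p_extreme, p_typical = p_typical)
```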

6.4 Schools vs Kindergartens — Independence Test

## Generating 99 simulations by evaluating expression ... Done.

Figure 10: Schools vs Kindergartens — random labelling test

The observed Lcross curve falls within the random labelling envelope for most distances. Despite both types concentrating in Żoliborz, schools and kindergartens show no significant spatial attraction or repulsion between them. Their co-location likely reflects a shared response to residential demand in Żoliborz rather than direct spatial dependence between the two types.
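A random labelling envelope for the cross-type L-function can be sketched as follows. The locations are held fixed and the type labels are permuted with `rlabel()` in each simulation. The two-type pattern below is hypothetical, with labels assigned at random, so the observed curve is expected to stay mostly inside the envelope.

```r
library(spatstat)

set.seed(21)

# Hypothetical two-type pattern on the unit square (not the Warsaw data)
n  <- 200
Xm <- ppp(runif(n), runif(n), window = owin(c(0, 1), c(0, 1)),
          marks = factor(sample(c("school", "kindergarten"), n, replace = TRUE)))

# Envelope of Lcross under random labelling: locations fixed, labels permuted
env <- envelope(Xm, Lcross, i = "school", j = "kindergarten",
                simulate = expression(rlabel(Xm)),
                nsim = 99, verbose = FALSE)

# Share of distances at which the observed curve lies inside the envelope
inside <- mean(env$obs >= env$lo & env$obs <= env$hi, na.rm = TRUE)
inside
```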

7 Point Process Models

Point process models (ppm) allow us to formally estimate how the intensity of the pattern depends on spatial covariates. We fitted four Poisson process models of increasing complexity and compared them using AIC and likelihood ratio tests.

7.1 Model Specifications

Model	Formula	Description
M1	`~ 1`	Homogeneous Poisson — CSR null model
M2	`~ x + y`	Intensity varies with coordinates
M3	`~ dist_im`	Intensity driven by distance to city centre
M4	`~ polynom(dist_im, 2)`	Non-linear distance effect

7.2 Model Comparison

Table 3: Model comparison by AIC (lower = better fit)
Model	Formula	AIC
M1 — Homogeneous	~1	-1413.2
M2 — x + y trend	~x + y	-1443.9
M3 — Distance to centre	~dist_im	-1452.7
M4 — Polynomial distance	~polynom(dist_im, 2)	-1450.7

Model 3 (distance to city centre) is the best-fitting model with AIC = -1452.7. The likelihood ratio test confirms that distance to centre significantly improves on the null model (deviance = 39.9, p = 2.7e-10). The polynomial term in Model 4 does not improve the fit (p = 0.60), so the linear relationship is adequate.
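The four model formulas can be fitted and compared with `ppm()`, `AIC()` and `anova()`. The sketch below runs on simulated data: the window, the centre and the true decay rate of -0.6 are hypothetical, and `dist_im` is built here as a plain Euclidean distance image rather than taken from the report.

```r
library(spatstat)

set.seed(8)

# Hypothetical window and distance-to-centre covariate image
W       <- owin(c(0, 5), c(0, 5))
dist_im <- as.im(function(x, y) sqrt((x - 2.5)^2 + (y - 2.5)^2), W)

# Simulated pattern with log-linear decay of intensity with distance
X <- rpoispp(60 * exp(-0.6 * dist_im))

covs <- list(dist_im = dist_im)
m1 <- ppm(X ~ 1)                                  # homogeneous Poisson
m2 <- ppm(X ~ x + y)                              # log-linear trend in coordinates
m3 <- ppm(X ~ dist_im, data = covs)               # distance to centre
m4 <- ppm(X ~ polynom(dist_im, 2), data = covs)   # quadratic in distance

aic <- c(M1 = AIC(m1), M2 = AIC(m2), M3 = AIC(m3), M4 = AIC(m4))
aic

# Likelihood ratio tests for nested models
anova(m1, m3, test = "LRT")
anova(m3, m4, test = "LRT")
```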

Key coefficient from Model 3: The distance coefficient is β = -0.233 — for every 1 km further from the Palace of Culture, intensity multiplies by exp(-0.233) = 0.79, a 21% reduction per km.
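Because the model is log-linear, the coefficient acts multiplicatively on intensity. The base R lines below repeat the arithmetic for the reported coefficient and extend it to 2 km and 3 km, assuming the linear form of Model 3 holds over that range.

```r
beta <- -0.233                            # reported distance coefficient (per km)

multiplier <- exp(beta)                   # intensity ratio per additional km
pct_change <- 100 * (multiplier - 1)      # percentage change per additional km

# Intensity relative to the centre at 1, 2 and 3 km
relative <- exp(beta * c(1, 2, 3))

round(multiplier, 3)   # about 0.792
round(pct_change, 1)   # about -20.8
round(relative, 2)     # about 0.79, 0.63, 0.50
```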

7.3 Fitted Intensity

Figure 11: Fitted intensity surface — Model 3 (distance to city centre)

7.4 Model Validation

Figure 12: Relative intensity vs distance to Palace of Culture

The rhohat plot reveals that the model underestimates intensity at ~2–3 km from the Palace — the true hotspot is at the Żoliborz–Śródmieście boundary, not at the Palace itself. The residual K-function confirms remaining clustering after fitting, suggesting a cluster process model would improve the fit further.

8 Line Pattern Analysis

Line pattern analysis examines the relationship between point locations and the underlying street network. Rather than treating space as a uniform plane, we consider facilities as located on or near a network of roads, which reflects how people actually access these facilities in an urban setting.

The road network for Śródmieście and Żoliborz was downloaded from OpenStreetMap using the osmdata package, covering primary, secondary, tertiary and residential streets.

8.1 Facilities on the Street Network

Figure 13: Facilities by type on the street network

The map confirms the spatial segregation identified in earlier sections — kindergartens (purple) and schools (pink) dominate the residential streets of Żoliborz in the northern section, while universities (grey) and colleges (red) are concentrated along the major arteries of southern Śródmieście.

8.2 Facility Density Along Roads

Figure 14: Road segments coloured by nearby facility count (100m buffer)

The road density map shows that most segments are dark purple (low density) while a central corridor of higher-density segments is visible running north-south through the study area.

8.3 Network Summary

Table 4: Road network accessibility statistics
Metric	Value
Road segments with 0 facilities within 100m	1623 (59%)
Road segments with 1+ facilities within 100m	1149 (41%)
Maximum facilities near one road segment	20

41% of road segments are within 100 m of at least one facility, suggesting that educational infrastructure in central Warsaw is reasonably well integrated with the street network. The maximum of 20 facilities near a single road segment occurs on the main central arteries of Śródmieście.
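Counting facilities within 100 m of each road segment can be sketched with `st_is_within_distance()` from sf. The two segments and three points below are hypothetical toy geometries in EPSG:2180, so distances are in metres. This is one possible way to obtain such counts and not necessarily the method used in the report.

```r
library(sf)

# Two hypothetical road segments (coordinates in metres, EPSG:2180)
roads <- st_sfc(
  st_linestring(rbind(c(0, 0),    c(100, 0))),
  st_linestring(rbind(c(0, 1000), c(100, 1000))),
  crs = 2180
)

# Three hypothetical facilities
facilities <- st_sfc(
  st_point(c(50, 50)),    # 50 m from the first segment
  st_point(c(50, 150)),   # 150 m from the first segment
  st_point(c(20, -30)),   # 30 m from the first segment
  crs = 2180
)

# For each segment, indices of the facilities within 100 m
near   <- st_is_within_distance(roads, facilities, dist = 100)
n_near <- lengths(near)

n_near                   # facilities near each segment
100 * mean(n_near > 0)   # percentage of segments with at least one facility
```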

9 Conclusions

This project analysed the spatial distribution of 394 educational and public facilities across Śródmieście and Żoliborz, Warsaw. The key findings are:

Educational facilities in Śródmieście and Żoliborz are strongly clustered — confirmed by every method applied, with the Clark-Evans R = 0.633 showing facilities are on average only 63% as far from their nearest neighbour as expected under complete spatial randomness
The best fitting point process model shows that for every 1 km further from the Palace of Culture, facility intensity decreases by approximately 21% — distance to the city centre is the strongest spatial driver of facility distribution
Schools and kindergartens concentrate in the residential streets of Żoliborz while universities and colleges concentrate in the institutional zone of southern Śródmieście — libraries occupy a central intermediate position accessible to both districts
The nine facility types are significantly spatially segregated (p = 0.01) — each type occupies a distinct spatial niche reflecting the different urban character of the two districts
41% of road segments are within 100 m of at least one facility, suggesting reasonably good street network accessibility of educational infrastructure in central Warsaw

Spatial Point Pattern Analysis of Educational and Public Facilities

Warsaw — Śródmieście and Żoliborz Districts

Shagufta Shaheen (477654)
Celia Mlambo (476670)

Under the supervision of
dr Kateryna Zabarina

