This project examines the spatial dimensions of regional economic development in Poland, focusing on the hypothesis that proximity to Warsaw positively influences regional economic performance. Specifically, we investigate whether regions closer to Warsaw (measured by distance in kilometers) exhibit higher GDP per capita and lower unemployment rates, while controlling for population density effects.
The analysis utilizes spatial panel data econometrics to capture both spatial dependencies and temporal dynamics across Polish NUTS-2 regions from 2014-2022. This approach allows us to account for spatial spillovers between neighboring regions and test whether Warsaw’s economic dominance creates a measurable proximity effect throughout Poland. ## Data and Methodology
# Install packages if needed
#install.packages(c("eurostat", "dplyr", "tidyr", "sf", "spdep",
# "splm", "plm", "car", "ggplot2", "tmap", "spatialreg"))
library(eurostat)
library(dplyr)
library(tidyr)
library(sf)
library(spdep)
library(splm)
library(plm)
library(car)
library(ggplot2)
library(tmap)
library(spatialreg)
library(Matrix) # get nuts2 shapefile for Poland
nuts2 <- get_eurostat_geospatial(
nuts_level = 2,
year = 2024,
resolution = "01"
) %>%
st_transform(3035) %>%
filter(CNTR_CODE == "PL")
print(paste("Number of Polish NUTS-2 regions:", nrow(nuts2)))## [1] "Number of Polish NUTS-2 regions: 17"
#calculate distances to Warsaw
centroids <- st_centroid(nuts2)
warsaw_pt <- st_sfc(
st_point(c(21.0122, 52.2297)),
crs = 4326
) %>%
st_transform(3035)
common_years <- 2014:2022
nuts2$dist_to_warsaw_km <- as.numeric(st_distance(centroids, warsaw_pt)) / 1000
summary(nuts2$dist_to_warsaw_km)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 7.74 172.40 254.40 222.70 279.10 396.23
The dataset consists of 17 Polish NUTS-2 regions observed over 9 years (2014-2022). The result of a panel data was that NUTS-2 creates only 17 variables, which is a very small sample that might introduce errors. As a result of extending the dataset , the result was the creation of a balanced panel of 153 observations. Bigger time period was planned, however due to the data unavailability, the period 2014-2022 was the optimal one. Key variables include:
gdp_pc - GDP per
capita in purchasing power standardspop_density - Population densityunemp_rate - Unemployment ratedist_to_warsaw_km - Distance to Warsaw in
kilometers# GDP per capita (PPS, EU27=100)
gdp <- get_eurostat("tgs00006", time_format = "num")
gdp_pl <- gdp %>%
filter(startsWith(geo, "PL"), nchar(geo) == 4,
unit == "PPS_HAB_EU27_2020",
TIME_PERIOD %in% common_years) %>%
rename(NUTS_ID = geo, gdp_pc = values, time = TIME_PERIOD)
print(paste("GDP observations:", nrow(gdp_pl)))## [1] "GDP observations: 153"
# Population density
popdens <- get_eurostat("demo_r_d3dens", time_format = "num")
popdens_pl <- popdens %>%
filter(startsWith(geo, "PL"), nchar(geo) == 4,
TIME_PERIOD %in% common_years) %>%
rename(NUTS_ID = geo, pop_density = values, time = TIME_PERIOD)
print(paste("Population density observations:", nrow(popdens_pl)))## [1] "Population density observations: 153"
# Unemployment rate
unemp <- get_eurostat("tgs00007", time_format = "num", filters = list(sex = "T"))
unemp_pl <- unemp %>%
filter(startsWith(geo, "PL"), nchar(geo) == 4,
time %in% common_years) %>%
rename(NUTS_ID = geo, unemp_rate = values)
print(paste("Unemployment rate observations:", nrow(unemp_pl)))## [1] "Unemployment rate observations: 153"
And now we can construct our panel
#build a panel dataset
panel <- expand.grid(
NUTS_ID = unique(nuts2$NUTS_ID),
time = common_years
) %>%
left_join(
st_drop_geometry(nuts2) %>% select(NUTS_ID, NAME_LATN, dist_to_warsaw_km),
by = "NUTS_ID"
) %>%
left_join(gdp_pl, by = c("NUTS_ID", "time")) %>%
left_join(popdens_pl, by = c("NUTS_ID", "time")) %>%
left_join(unemp_pl, by = c("NUTS_ID", "time")) %>%
select(NUTS_ID, NAME_LATN, time, dist_to_warsaw_km,
gdp_pc, pop_density, unemp_rate) %>%
arrange(NUTS_ID, time)
# Display panel structure
print(paste("Total observations:", nrow(panel)))## [1] "Total observations: 153"
## [1] "Number of regions: 17"
## [1] "Time periods: 9"
# Summary statistics
summary(panel[, c("gdp_pc", "pop_density", "unemp_rate", "dist_to_warsaw_km")])## gdp_pc pop_density unemp_rate dist_to_warsaw_km
## Min. : 48.00 Min. : 54.3 Min. :55.60 Min. : 7.74
## 1st Qu.: 57.00 1st Qu.: 79.1 1st Qu.:63.30 1st Qu.:172.40
## Median : 63.00 Median :116.8 Median :66.20 Median :254.40
## Mean : 68.56 Mean :146.8 Mean :66.43 Mean :222.70
## 3rd Qu.: 73.00 3rd Qu.:136.7 3rd Qu.:69.30 3rd Qu.:279.10
## Max. :160.00 Max. :542.7 Max. :79.60 Max. :396.23
Additionally, variables like :
-rail transportation (tran_r_net)
-R&D (rd_e_gerdreg)
-education secondary, 25 and over (edat_lfse_04)
-Gross fixed capital formation (nama_10r_2gfcf)
Were initially used as well, but through various configurations some ended up very collinear (R&D, education) with very high VIF scores (approx. 15) and a correlation with other predictors above 0.7 or ended up being insignificant, inflating errors. ### Spatial Weights Matrix A queen contiguity spatial weights matrix (W) was constructed, where regions sharing borders or corners are considered neighbors. The matrix is row-standardized for interpretability.
coords <- st_coordinates(st_centroid(nuts2))
nb <- poly2nb(nuts2, queen = TRUE)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)
#Spatial neighbors summary
print(summary(nb))## Neighbour list object:
## Number of regions: 17
## Number of nonzero links: 70
## Percentage nonzero weights: 24.22145
## Average number of links: 4.117647
## Link number distribution:
##
## 1 3 4 5 6 7
## 1 6 5 1 2 2
## 1 least connected region:
## 15 with 1 link
## 2 most connected regions:
## 2 13 with 7 links
plot(st_geometry(nuts2), border = "grey")
plot(nb, coords, add = TRUE, col = "red")
title("Queen Contiguity Spatial Weights Matrix")The matrix has 70 neighbour links, meaning about 24% of all possible region pairs are their neighbours. so on average a region has 4 neighbours but this varies across regions, for example region 15 is the least connected with only 1 neighbor (so quite isolated), while the regions 2 and 13 are the most connected with 7 neighbours as they sit in the middle of clusters. The rest of the regions typically have 3–4 neighbours.
To analyze regional economic performance in Poland, we employ three complementary econometric models that allow us to capture both local determinants and spatial interdependencies:
Ordinary Least Squares (OLS) - provides a simple baseline, estimating the direct relationships between GDP per capita, population density, and employment rate without considering spatial effects. While straightforward, OLS may produce biased or inefficient estimates if neighboring regions influence each other.
Spatial Autoregressive (SAR) - Fixed Effects Model accounts for the possibility that a region’s GDP per Capita is affected by the economic outcomes of its neighbors. The spatial lag term (ρ × W × gdp_pc) captures a spillove” effect, while individual fixed effects control for unobserved time-invariant regional characteristics.
Spatial Error Model (SEM) - Fixed Effects assumes that unobserved factors influencing GDP per Capita are spatially correlated. The spatial error term (λ × W × uᵢₜ) captures these dependencies, while fixed effects control for regional heterogeneity.
gdp_pc = β₀ + β₁ × pop_density + β₂ × unemp_rate + ε
gdp_pc = ρ × W × gdp_pc + β₁ × pop_density + β₂ × unemp_rate + αᵢ + uᵢₜ
gdp_pc = β₁ × pop_density + β₂ × unemp_rate + αᵢ + uᵢₜ
uᵢₜ = λ × W × uᵢₜ + εᵢₜ
Where:
ρ = spatial autoregressive parameter
λ = spatial error parameter
αᵢ = individual fixed effects
W = spatial weights matrix
To test our research question, we started with an OLS model. Its role was to mostly show OLS’s inability to account for spatial dependencies and cross-regional spillovers, which are likely present in regional economic data.
#estimate OLS model
ols <- lm(gdp_pc ~ pop_density + unemp_rate, data = panel)
###OLS BASELINE MODEL
print(summary(ols))##
## Call:
## lm(formula = gdp_pc ~ pop_density + unemp_rate, data = panel)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.0738 -5.9790 -0.8424 5.6945 20.9142
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -90.723876 11.580447 -7.834 7.99e-13 ***
## pop_density 0.133032 0.007197 18.485 < 2e-16 ***
## unemp_rate 2.103752 0.180220 11.673 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.264 on 150 degrees of freedom
## Multiple R-squared: 0.8448, Adjusted R-squared: 0.8427
## F-statistic: 408.3 on 2 and 150 DF, p-value: < 2.2e-16
## pop_density unemp_rate
## 1.229121 1.229121
## pop_density unemp_rate
## pop_density 1.0000000 0.4317525
## unemp_rate 0.4317525 1.0000000
OLS estimation results:
Model Fit: Excellent explanatory power with R² = 0.845, indicating that 84.5% of variation in regional GDP per capita is explained by population density and unemployment rate
Population Density: Strong positive effect (β = 0.133, p < 0.001), suggesting that each additional inhabitant per km² increases GDP per capita by 0.133 PPS units
Unemployment Rate: positive coefficient (β = 2.104, p < 0.001), indicating higher employment regions show higher GDP per capita in the pooled OLS
Diagnostic Results:
Multicollinearity: VIF values of 1.23 indicate no concerning multicollinearity
Correlation: Moderate correlation (r = 0.43) between population density and unemployment rate
Model Limitations: The OLS results likely suffer from omitted variable bias and ignore spatial dependencies
While OLS provides a simple baseline and highlights the relationships between GDP per capita, population density, and unemployment rate, it ignores the influence of neighboring regions. As a result, the OLS estimates may suffer from omitted variable bias and can produce misleading coefficients, particularly when regional outcomes are spatially correlated.
Building on this baseline, we then employed spatial panel models (SAR and SEM) with fixed effects to properly capture both the temporal dynamics and spatial interdependencies in Polish regions, allowing for a more accurate analysis of the Warsaw proximity effect.
The following SAR and SEM models were estimated using ‘spml’ package. Sadly, due to problems with panel data and the package, Spatial Durbin Model was not estimated.
# SAR Fixed Effects Model
lag_panel <- spml(
gdp_pc ~ pop_density + unemp_rate,
data = panel,
listw = lw,
index = c("NUTS_ID", "time"),
model = "within",
effect = "individual",
spatial.error = "none",
lag = TRUE,
zero.policy = TRUE,
reweight = FALSE
)
#SAR Fixed Effects Results
print(summary(lag_panel))## Spatial panel fixed effects lag model
##
##
## Call:
## spml(formula = gdp_pc ~ pop_density + unemp_rate, data = panel,
## index = c("NUTS_ID", "time"), listw = lw, model = "within",
## effect = "individual", lag = TRUE, spatial.error = "none",
## zero.policy = TRUE, reweight = FALSE)
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -3.564794 -1.046776 -0.077414 0.684769 6.917427
##
## Spatial autoregressive coefficient:
## Estimate Std. Error t-value Pr(>|t|)
## lambda 0.642725 0.057556 11.167 < 2.2e-16 ***
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## pop_density -0.066476 0.030451 -2.1831 0.02903 *
## unemp_rate 0.341894 0.067167 5.0902 3.576e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# SEM Fixed Effects Model
error_panel <- spml(
gdp_pc ~ pop_density + unemp_rate,
data = panel,
listw = lw,
index = c("NUTS_ID", "time"),
model = "within",
effect = "individual",
spatial.error = "b",
lag = FALSE,
zero.policy = TRUE,
reweight = FALSE
)
# SEM Fixed Effects Results
print(summary(error_panel))## Spatial panel fixed effects error model
##
##
## Call:
## spml(formula = gdp_pc ~ pop_density + unemp_rate, data = panel,
## index = c("NUTS_ID", "time"), listw = lw, model = "within",
## effect = "individual", lag = FALSE, spatial.error = "b",
## zero.policy = TRUE, reweight = FALSE)
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -5.13609 -1.91389 -0.50342 1.48022 6.66659
##
## Spatial error parameter:
## Estimate Std. Error t-value Pr(>|t|)
## rho 0.721319 0.053032 13.601 < 2.2e-16 ***
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## pop_density -0.077197 0.029486 -2.6181 0.008842 **
## unemp_rate 0.616581 0.102442 6.0188 1.757e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SAR Fixed Effects estimation results:
Spatial Autoregressive Parameter (λ = 0.643):
Highly significant (p < 0.001) which confirms strong spatial spillovers
The result means that a 1% increase in neighbouring regions GDP per Capita leads to a 0.64% increase in own region’s GDP, which validates the spatial nature of economic development in Poland
Population Density Effect:
Coefficient becomes negative (-0.066) after controlling for fixed effects and spatial spillovers
The result means that Within regions over time, higher population density is associated with lower GDP per Capita
Employment Rate Effect:
Positive coefficient (0.342): statistically significant
Higher employment reflects stronger local economies, which supports higher GDP per capita.
SEM Fixed Effects estimation results:
Spatial Error Parameter (ρ = 0.721):
Very high significance (p < 0.001), indicating substantial spatial correlation in unobserved factors
72% of spatial correlation in unobserved determinants of GDP per capita is captured
Suggests strong spatial clustering of unmeasured economic factors
After estimating both SAR and SEM fixed effects models, we focus on the SAR model for impact analysis because it explicitly models the spatial spillovers in GDP, which is central to studying the Warsaw proximity effect.
We choose the SAR model over the SEM model because the SAR incorporates spatial lag of the dependent variable (GDP per capita), which allows us to directly quantify how economic outcomes in one region influence the neighbouring regions
Additionally, a Hausman Test was conducted to see if in our current specification, fixed or random effects should be used. What’s worth noting, including distance_to_warsaw variable in fixed effects resulted in an error - we can’t use time invariant variables in fixed effects, only random effects.
# Random Effects models (for Hausman test comparison)
lag_re <- spml(
gdp_pc ~ dist_to_warsaw_km + pop_density + unemp_rate,
data = panel,
listw = lw,
index = c("NUTS_ID", "time"),
model = "random",
spatial.error = "none",
lag = TRUE,
zero.policy = TRUE,
reweight = FALSE
)
err_re <- spml(
gdp_pc ~ dist_to_warsaw_km + pop_density + unemp_rate,
data = panel,
listw = lw,
index = c("NUTS_ID", "time"),
model = "random",
spatial.error = "b",
lag = FALSE,
zero.policy = TRUE,
reweight = FALSE
)
# Hausman Tests
#SAR Fixed vs Random Effects
print(sphtest(lag_panel, lag_re))##
## Hausman test for spatial models
##
## data: gdp_pc ~ pop_density + unemp_rate
## chisq = 42.52, df = 2, p-value = 5.846e-10
## alternative hypothesis: one model is inconsistent
##
## Hausman test for spatial models
##
## data: gdp_pc ~ pop_density + unemp_rate
## chisq = 54.086, df = 2, p-value = 1.801e-12
## alternative hypothesis: one model is inconsistent
Hausman test interpretation:
-Fixed effects method is advised
After estimating spatial models, it is crucial to check whether residuals still exhibit spatial dependence. Ignoring residual spatial autocorrelation can lead to biased or inefficient estimates.
We use Moran’s I statistic, a standard measure of spatial autocorrelation, to test whether the residuals from our SAR and SEM models are randomly distributed across space.
#calculate residuals for SAR and SEM models
resid_lag <- residuals(lag_panel)
resid_err <- residuals(error_panel)
#average residuals by region for spatial testing
resid_lag_avg <- panel %>%
mutate(resid = resid_lag) %>%
group_by(NUTS_ID) %>%
summarise(resid_mean = mean(resid, na.rm = TRUE)) %>%
pull(resid_mean)
resid_err_avg <- panel %>%
mutate(resid = resid_err) %>%
group_by(NUTS_ID) %>%
summarise(resid_mean = mean(resid, na.rm = TRUE)) %>%
pull(resid_mean)
# Moran's I tests
moran_lag <- moran.test(resid_lag_avg, lw, zero.policy = TRUE)
moran_err <- moran.test(resid_err_avg, lw, zero.policy = TRUE)
print(moran_lag)##
## Moran I test under randomisation
##
## data: resid_lag_avg
## weights: lw
##
## Moran I statistic standard deviate = -0.10376, p-value = 0.5413
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## -0.07883061 -0.06250000 0.02477126
##
## Moran I test under randomisation
##
## data: resid_err_avg
## weights: lw
##
## Moran I statistic standard deviate = -0.095996, p-value = 0.5382
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## -0.07764164 -0.06250000 0.02487948
Model Validation:
Both SAR and SEM residuals are not significantly spatially autocorrelated (p > 0.5).
This indicates that both models successfully capture the spatial dependencies in GDP per capita.
The lack of residual spatial clustering confirms that the model specification is adequate and that spatial spillovers have been properly accounted for.
#we use a single, latest 2022 year
sf2022 <- nuts2 %>%
left_join(
panel %>% filter(time == 2022) %>% select(NUTS_ID, gdp_pc),
by = "NUTS_ID"
)
tmap_mode("plot")
# Create choropleth map
gdp_map <- tm_shape(sf2022) +
tm_fill(
"gdp_pc",
palette = "plasma",
style = "quantile",
n = 5,
title = "GDP per capita\n(2022, PPS EU27=100)"
) +
tm_borders(col = "white", lwd = 0.5) +
tm_layout(
main.title = "Regional GDP per Capita in Poland (2022)",
main.title.size = 1.2,
legend.outside = TRUE,
frame = FALSE
)
print(gdp_map)The map clearly shows the economic dominance of the Warsaw region (Mazowieckie) and relatively higher performance in western and northern regions compared to eastern areas.
Spatial panel models allow us to to quantify how these effects of regional characteristics on GDP per Capita spill over to neighboring regions. Direct effects measure changes within a region, indirect effects capture spillovers to adjacent regions via the spatial weights matrix, and total effects combine both to reflect the overall influence on regional economic performance.
attr(lag_panel, "have_factor_preds") <- FALSE #allows us to skip the limitations of impacts for spatial panel in from 'spml'
W_sp <- as(lw, "CsparseMatrix")
trMatc <- trW(W_sp, type = "mult")
#calculate impacts
imp <- impacts(lag_panel, tr = trMatc, R = 200)
#impact results
print(summary(imp, zstats = TRUE, short = TRUE))## Impact measures (lag, trace):
## Direct Indirect Total
## pop_density dy/dx -0.067656 -0.1184091 -0.1860651
## unemp_rate dy/dx 0.347960 0.6089872 0.9569472
## ========================================================
## Simulation results ( variance matrix):
## ========================================================
## Simulated standard errors
## Direct Indirect Total
## pop_density dy/dx 0.03306116 0.07369394 0.1041119
## unemp_rate dy/dx 0.07078948 0.23734072 0.2907846
##
## Simulated z-values:
## Direct Indirect Total
## pop_density dy/dx -1.942772 -1.669775 -1.798859
## unemp_rate dy/dx 5.013710 2.837852 3.536830
##
## Simulated p-values:
## Direct Indirect Total
## pop_density dy/dx 0.052044 0.0949639 0.07204098
## unemp_rate dy/dx 5.3391e-07 0.0045418 0.00040496
Comprehensive Effect Decomposition:
Population Density Effects:
-Direct Effect (-0.077): Measures the within-region impact of population density on GDP per capita. Denser regions, after controlling for fixed effects and spatial spillovers, tend to have slightly lower GDP per capita.
-Indirect Effect (-0.109): Captures the influence of neighboring regions’ population density through spatial spillovers. High density in surrounding regions slightly reduces GDP in a given region.
-Total Effect (-0.186): Sum of direct and indirect effects, representing the overall influence of population density including both local and neighboring regions.
A 100 inhabitants/km² increase reduces GDP per capita by 7.7 PPS locally and 18.6 PPS overall when considering spillover effects.
Employment Rate Effects: - Direct Effect (0.396): Shows the positive effect of within-region employment on GDP per Capita. Higher employment reflects stronger local economic activity.
= Indirect Effect (0.560): Indicates that higher employment in neighboring regions boosts GDP in the region of interest, demonstrating positive spatial spillovers.
-Total Effect (0.957): The sum of direct and indirect effects, almost 1:1 impact, which shows substantial amplification from regional interactions.
Economic Interpretation: The decomposition confirms that employment is a key driver of regional economic performance, and its impact is spatially amplified.
Policies that increase local employment can have both direct benefits and strong positive spillovers to adjacent regions, emphasizing the need for coordinated regional labor policies.
Conversely, population density has a nuanced, sometimes negative effect, likely reflecting congestion or resource competition after controlling for regional fixed effects.
The spatial panel analysis demonstrates that regional economic performance in Poland is strongly shaped by both local factors and the outcomes of neighboring regions. The SAR fixed effects model highlights that GDP per capita in a region is not only determined by its own employment rate and population density, but also by the economic conditions of adjacent regions. This confirms the presence of spatial spillovers, indicating that growth and development are interconnected across regions.
The SAR spatial autoregressive parameter (ρ = 0.643, p < 0.001) indicates that a 1% increase in GDP per capita in neighboring regions leads to a 0.64% increase locally, proving that policy interventions in one region can have meaningful effects on neighboring areas.
Direct and indirect effects of employment are positive and significant (total effect = 0.957), showing that higher employment not only benefits the local economy but also stimulates growth in surrounding regions.
-Population density exhibits slightly negative direct (-0.077) and total (-0.186) effects on GDP per capita, which might bea result of a potential congestion or resource constraints.
-SEM results (ρ = 0.721) reveal that unobserved factors affecting GDP per capita—like infrastructure quality, local institutions, or regional policies—tend to cluster spatially. Even after accounting for employment and density, regional disparities persist, highlighting the importance of coordinated economic development strategies.
This spatial panel analysis provides a strong evidence of significant spatial interdependencies in Polish regional economic development. Our hypothesis that regions closer to Warsaw exhibit higher GDP per capita due to proximity sadly could not be directly tested in the fixed effects framework because the distance to Warsaw is time-invariant. Including it would require a random effects specification, which the Hausman test rejected.
Despite this limitation, the SAR and SEM fixed effects models successfully capture the spatial dynamics of regional economies. The SAR model demonstrates that economic performance in one region significantly influences neighboring regions, with total employment effects amplified by positive spatial spillovers. Similarly, the SEM model reveals that unobserved determinants of GDP per capita cluster spatially.
Additional limitations include:
Lack of a Spatial Durbin Model: The Durbin specification, which would allow testing for indirect effects of time-varying covariates like population density and employment in neighboring regions, could not be estimated due to software and panel size constraints.
Small panel size: Only 17 NUTS-2 regions over 9 years, which may reduce estimation precision.
Potential omitted variables: Certain variables, such as R&D, education, or infrastructure quality, were excluded due to multicollinearity, high Variance inflation factor or insignificance, which could bias estimates.
Nonetheless, the models are robust for examining spatial spillovers and regional clustering, and the residuals do not exhibit significant spatial autocorrelation, confirming that the main spatial dependencies have been captured.