Introduction

Welcome to the RPubs companion for the article Urban greening and support for environmental parties: Evidence from Germany by Eliza Hałatek and Katarzyna Kopczewska (2026).

This document provides the main R code used to reproduce the empirical results presented in the article. The analysis is based on a constituency-level dataset constructed from multiple sources, including land-use indicators, electoral results, and environmental funding data for Germany.

The analytical framework of the study is presented below. It distinguishes three key dimensions shaping urban greening outcomes: political demand, financial capacity, and structural context. The framework highlights both direct and indirect pathways through which these factors interact.

In particular, the framework emphasizes a novel political–financial pathway, where electoral support for environmental parties may influence greening outcomes indirectly through increased access to environmental funding.

Data access and replication

To reproduce the empirical results, please download the replication package here and place all files in your working directory.

All file paths in this document are defined relative to the project folder. It is therefore recommended to open the R Markdown file within an RStudio Project and to preserve the folder structure provided in the replication package.

Replication package contents

The replication package includes the following input datasets:

projects.csv → project-level dataset with harmonised postal codes and grant information
plz_unique.rds → postal-code geometries
LUCAS_DE.csv → land-use data (LUCAS, Germany only) used to construct greening indicators
german_2017_constituency_results.csv → electoral data (Green Party vote share)
Geometrie_Wahlkreise_19DBT_VG250.shp (and associated files: .dbf, .shx, etc.) → constituency boundaries for Germany

Please ensure that all shapefile components are stored in the same folder.

The shapefile is processed using the sf package and merged with the analytical dataset via constituency identifiers (WKR_NR).

Replication notes

The code provided in this document reproduces the full data preparation pipeline starting from the input datasets listed above.
The objects projects.csv and plz_unique.rds are pre-processed to reduce computation time and ensure consistency of spatial matching.

Please download all files, keep them in a single directory (without nested subfolders), and set this directory as your working directory before running the code.

Data preparation

This section documents how the final analytical dataset was constructed from raw spatial, electoral, land-use, and project-level data. The pipeline includes data cleaning, harmonisation of postal codes, spatial matching of projects to constituencies, aggregation of land-use transitions, and merging of electoral indicators.

Universal packages

library(sf)
library(dplyr)
library(readr)
library(readxl)
library(tidyr)
library(stringr)
library(writexl)

Reading data

# Constituency geometries
const <- st_read("Geometrie_Wahlkreise_19DBT_VG250.shp", quiet = TRUE)

# Cached project data (cleaned)
projects_all <- readr::read_csv("projects.csv", show_col_types = FALSE)

# Postal-code geometries
plz_unique <- readRDS("plz_unique.rds")

# Land-use data (LUCAS)
landtake_raw <- read.csv("LUCAS_DE.csv", header = TRUE, sep = ",", dec = ".")

# Election results
election <- readr::read_csv("german_2017_constituency_results.csv", show_col_types = FALSE)

Preparation

# Ensure valid geometries
const <- st_make_valid(const)
plz_unique <- st_make_valid(plz_unique)

# Ensure numeric variables
projects_all <- projects_all %>%
  mutate(
    Grant_numeric = as.numeric(Grant_numeric),
    StartYear = as.numeric(StartYear),
    EndYear = as.numeric(EndYear)
  )

# Convert LUCAS points to sf
landtake <- st_as_sf(
  landtake_raw,
  coords = c("X_WGS84", "Y_WGS84"),
  crs = 4326
) %>%
  st_transform(25832)

Assign projects to constituencies

projects_sf <- projects_all %>%
  left_join(plz_unique, by = c("Postal_clean_final" = "plz_code5")) %>%
  st_as_sf() %>%
  st_make_valid() %>%
  mutate(
    project_id = row_number(),
    plz_area = st_area(geometry)
  )

assignments <- st_intersection(
  projects_sf %>% dplyr::select(project_id, plz_area),
  const %>% dplyr::select(WKR_NR, WKR_NAME)
) %>%
  mutate(
    overlap_area = st_area(geometry),
    overlap_share = as.numeric(overlap_area / plz_area)
  )%>%
  st_drop_geometry() %>%
  group_by(project_id) %>%
  slice_max(overlap_share, n = 1, with_ties = FALSE) %>%
  ungroup()

projects_assigned <- projects_sf %>%
  left_join(assignments, by = "project_id")

project_counts <- projects_assigned %>%
  st_drop_geometry() %>%
  group_by(WKR_NR) %>%
  summarise(
    num_projects = n(),
    sum_grant = sum(Grant_numeric, na.rm = TRUE),
    .groups = "drop"
  )

Construct land-use indicators

landtake <- landtake %>%
  mutate(
    change_to_6   = as.integer(change %in% c("1to6","2to6","3to6","4to6","5to6","7to6")),
    change_from_6 = as.integer(change %in% c("6to1","6to2","6to3","6to4","6to5","6to7"))
  )

landtake_with_const <- st_join(landtake, const)

Aggregate greening measures

landtake_summary <- landtake_with_const %>%
  group_by(WKR_NR) %>%
  summarise(
    num_greened_cells = sum(change_from_6 == 1, na.rm = TRUE),
    total_valid_cells = sum(!is.na(change_from_6)),
    sum_TOT_P_2018    = sum(TOT_P_2018, na.rm = TRUE),
    sum_TOT_P_2006    = sum(TOT_P_2006, na.rm = TRUE),
    mean_DIST_COAST   = mean(DIST_COAST, na.rm = TRUE),
    mean_DIST_BORD    = mean(DIST_BORD, na.rm = TRUE),
    mean_ELEV         = mean(ELEV, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    fraction_greened = num_greened_cells / total_valid_cells,
    percent_greened  = 100 * fraction_greened
  ) %>%
  rename(geometry_land_take = geometry)

Inhabited-area greening indicators

inhab_metrics <- landtake_with_const %>%
  mutate(
    inhab_any = as.integer(
      replace_na(TOT_P_2006, 0) > 0 | replace_na(TOT_P_2018, 0) > 0
    )
  ) %>%
  group_by(WKR_NR) %>%
  summarise(
    num_greened_cells_pop = sum((change_from_6 == 1) & (inhab_any == 1), na.rm = TRUE),
    total_inhab_cells     = sum(inhab_any == 1, na.rm = TRUE),
    perc_greened_inhab    = ifelse(
      total_inhab_cells > 0,
      100 * num_greened_cells_pop / total_inhab_cells,
      NA_real_
    ),
    sum_TOT_P_2006_pop = sum(
      if_else(inhab_any == 1, replace_na(TOT_P_2006, 0), 0),
      na.rm = TRUE
    ),
    sum_TOT_P_2018_pop = sum(
      if_else(inhab_any == 1, replace_na(TOT_P_2018, 0), 0),
      na.rm = TRUE
    ),
    .groups = "drop"
  ) %>%
  mutate(
    popul.rel.change_pop = ifelse(
      sum_TOT_P_2006_pop > 0,
      sum_TOT_P_2018_pop / sum_TOT_P_2006_pop,
      NA_real_
    )
  )

thr_pop <- median(inhab_metrics$perc_greened_inhab, na.rm = TRUE)

inhab_metrics <- inhab_metrics %>%
  mutate(
    greened_binary_pop = as.integer(perc_greened_inhab >= thr_pop)
  )

Baseline land-use structure

baseline_overall <- landtake_with_const %>%
  mutate(
    is_artificial_2006 = as.integer(STR05 == 6),
    is_green_2006      = as.integer(STR05 %in% c(3, 4, 5)),
    is_artificial_2018 = as.integer(STR18a == 6),
    is_green_2018      = as.integer(STR18a %in% c(3, 4, 5))
  ) %>%
  group_by(WKR_NR) %>%
  summarise(
    pct_artificial_2006 = 100 * mean(is_artificial_2006, na.rm = TRUE),
    pct_green_2006      = 100 * mean(is_green_2006, na.rm = TRUE),
    pct_artificial_2018 = 100 * mean(is_artificial_2018, na.rm = TRUE),
    pct_green_2018      = 100 * mean(is_green_2018, na.rm = TRUE),
    .groups = "drop"
  )

baseline_inhab <- landtake_with_const %>%
  mutate(
    inhab_any = as.integer(
      replace_na(TOT_P_2006, 0) > 0 | replace_na(TOT_P_2018, 0) > 0
    ),
    is_artificial_2006 = as.integer(STR05 == 6),
    is_green_2006      = as.integer(STR05 %in% c(3, 4, 5)),
    is_artificial_2018 = as.integer(STR18a == 6),
    is_green_2018      = as.integer(STR18a %in% c(3, 4, 5))
  ) %>%
  filter(inhab_any == 1) %>%
  group_by(WKR_NR) %>%
  summarise(
    pct_artificial_2006_pop = 100 * mean(is_artificial_2006, na.rm = TRUE),
    pct_green_2006_pop      = 100 * mean(is_green_2006, na.rm = TRUE),
    pct_artificial_2018_pop = 100 * mean(is_artificial_2018, na.rm = TRUE),
    pct_green_2018_pop      = 100 * mean(is_green_2018, na.rm = TRUE),
    .groups = "drop"
  )

Project timing variables

project_years <- projects_assigned %>%
  mutate(
    StartYear_num = as.numeric(StartYear),
    EndYear_num   = as.numeric(EndYear)
  ) %>%
  group_by(WKR_NR) %>%
  summarise(
    first_project_year = suppressWarnings(min(StartYear_num, na.rm = TRUE)),
    last_project_year  = suppressWarnings(max(EndYear_num, na.rm = TRUE)),
    mean_project_year  = suppressWarnings(mean(StartYear_num, na.rm = TRUE)),
    .groups = "drop"
  )

Consolidate constituency-level summaries

cons_summary <- landtake_summary %>%
  left_join(st_drop_geometry(inhab_metrics), by = "WKR_NR") %>%
  left_join(st_drop_geometry(baseline_overall), by = "WKR_NR") %>%
  left_join(st_drop_geometry(baseline_inhab), by = "WKR_NR") %>%
  left_join(st_drop_geometry(project_years), by = "WKR_NR")

thr_uncond <- median(cons_summary$percent_greened, na.rm = TRUE)

cons_summary <- cons_summary %>%
  mutate(
    greened_binary = as.integer(percent_greened >= thr_uncond)
  )

rm(thr_pop, thr_uncond)

Election data

election_processed <- election %>%
  mutate(WKR_NR = as.numeric(sub(":.*", "", Wahlkreis))) %>%
  filter(Specification %in% c("GRÜNE", "GRÜNE/B 90")) %>%
  select(WKR_NR, green_vote_share = `Second votes %`) %>%
  mutate(green_vote_share = as.numeric(green_vote_share)) %>%
  distinct()

Build final dataset

const_data <- const %>%
  rename(geometry_const = geometry) %>%   # keep constituency geometry
  left_join(election_processed, by = "WKR_NR") %>%   # add Green Party vote share
  left_join(st_drop_geometry(cons_summary), by = "WKR_NR") %>%   # add greening and structural indicators
  left_join(st_drop_geometry(project_counts), by = "WKR_NR") %>%   # add project counts and funding
  mutate(
    green_vote_share_dec = green_vote_share / 100,   # convert vote share to proportion
    sum_TOT_P_2006_K     = sum_TOT_P_2006 / 1000,    # population in thousands
    sum_TOT_P_2018_K     = sum_TOT_P_2018 / 1000,
    sum_grant_K          = sum_grant / 1000,         # grants in thousands of euro
    mean_DIST_COAST_K    = mean_DIST_COAST / 1000,   # distance rescaled
    mean_DIST_BORD_K     = mean_DIST_BORD / 1000,
    popul.change.06.18   = sum_TOT_P_2018 - sum_TOT_P_2006,
    popul.rel.change     = ifelse(
      sum_TOT_P_2006 > 0,
      sum_TOT_P_2018 / sum_TOT_P_2006,
      NA_real_
    ),
    grant_percap = ifelse(
      sum_TOT_P_2018 > 0,
      sum_grant / sum_TOT_P_2018,
      NA_real_
    ),
    proj_percap = ifelse(
      sum_TOT_P_2018 > 0,
      num_projects / sum_TOT_P_2018,
      NA_real_
    ),
    funding_index = as.numeric(scale(grant_percap)) + as.numeric(scale(proj_percap))
  )

const_data <- const_data %>%
  mutate(
    ELEV_class = case_when(
      mean_ELEV < 0                         ~ "elev_<0",
      mean_ELEV >= 0   & mean_ELEV < 250    ~ "elev_0-250",
      mean_ELEV >= 250 & mean_ELEV < 500    ~ "elev_250-500",
      mean_ELEV >= 500 & mean_ELEV < 1000   ~ "elev_500-1000",
      mean_ELEV >= 1000 & mean_ELEV < 2000  ~ "elev_1000-2000",
      mean_ELEV >= 2000                     ~ "elev_>2000",
      TRUE                                  ~ NA_character_
    ),
    ELEV_class = factor(
      ELEV_class,
      levels = c(
        "elev_<0", "elev_0-250", "elev_250-500",
        "elev_500-1000", "elev_1000-2000", "elev_>2000"
      )
    )
  )

# Replace only project-related missing (NA) values with zero
const_data <- const_data %>%
  mutate(
    across(
      c(
        green_vote_share_dec,
        num_projects,
        sum_grant,
        sum_grant_K,
        grant_percap,
        proj_percap,
        funding_index
      ),
      ~ ifelse(is.na(.), 0, .)
    )
  )

dat <- st_drop_geometry(const_data)

Setup packages

library(sf)
library(dplyr)
library(readr)
library(tidyr)
library(stringr)
library(ggplot2)
library(psych)
library(DescTools)
library(car)
library(spdep)
library(mediation)
library(caret)
library(vip)
library(ranger)
library(xgboost)
library(pROC)
library(shapviz)
library(broom)

Variable preparation

dat <- dat %>%
  # POLITICAL VARIABLES
  mutate(
    green_vote_share_dec = ifelse(
      !is.na(green_vote_share_dec),
      green_vote_share_dec,
      green_vote_share / 100
    )
  ) %>%
  # FINANCIAL VARIABLES
  mutate(
    # Log transformation of total grants (avoid log(0))
    log_grantsK = log(pmax(sum_grant_K, 0) + 1),

    # Ensure finite values only
    grant_percap = ifelse(is.finite(grant_percap), grant_percap, NA_real_),
    proj_percap  = ifelse(is.finite(proj_percap),  proj_percap,  NA_real_),

    # Log projects per capita (small offset to avoid -Inf)
    log_proj_percap = log(proj_percap + 1e-6)
  ) %>%
  # CONTROL VARIABLES (only those requiring transformation)
  mutate(
    # Elevation as categorical factor
    ELEV_class = as.factor(ELEV_class)
  )

Variable definitions

Variable	Level	Description
Land use class (STR05)	1	Arable land
	2	Permanent crops
	3	Grass
	4	Wooded and shrub areas
	5	Bare land / low vegetation
	6	Artificial land
	7	Water

Land use class (STR18a)	1	Arable land
	2	Permanent crops
	3	Grass
	4	Wooded areas and shrubs (merged)
	5	Bare surface / low vegetation
	6	Artificial / sealed areas
	7	Inland and coastal waters
Land-use change 2005-2018	Change	STR05 ≠ STR18a
	No change	STR05 = STR18a
	Total	All categories combined
Urban greening	Binary	1 if artificial → non-artificial ≥ median; 0 otherwise
Percent greened	Continuous	Fraction of artificial → other land-use transitions
Elevation class	Categorical	<0; 0-250; 250-500; 500-1000; 1000-2000; >2000 m ASL
% artificial (2006, inhab.)	Continuous	Share of artificial land in 2006 (inhabited)
% green (2006, inhab.)	Continuous	Share of green land in 2006 (inhabited)
Green vote share	Continuous	Green Party vote share (0-1)
Grant per capita	Continuous	DBU funding per capita (€)
Log projects per capita	Continuous	Log of DBU projects per capita
Population change	Continuous	Population ratio (2018 / 2006)
Mean distance to border	Continuous	Average distance to national border (km)

Descriptive statistics

Summary statistics

	mean	sd	min	max	n
greened_binary_pop	0.505	0.501	0.000	1.000	299
perc_greened_inhab	10.259	10.424	1.572	68.421	299
green_vote_share_dec	0.088	0.039	0.022	0.212	299
grant_percap	4.883	5.831	0.000	46.500	299
log_proj_percap	-10.763	1.052	-13.816	-7.592	299
log_grantsK	6.479	1.640	0.000	9.453	299
popul.rel.change	1.006	0.084	0.786	1.569	299
pct_artificial_2006_pop	21.057	20.616	3.020	93.333	299
pct_green_2006_pop	44.060	18.386	3.846	93.130	299
mean_DIST_BORD_K	0.082	0.053	0.004	0.236	299

Correlation matrix

	greened_binary_pop	perc_greened_inhab	green_vote_share_dec	grant_percap	log_proj_percap	log_grantsK	popul.rel.change	pct_artificial_2006_pop	pct_green_2006_pop	mean_DIST_BORD_K
greened_binary_pop	1.00	0.60	0.30	0.12	0.09	0.07	0.27	0.63	-0.41	-0.03
perc_greened_inhab	0.60	1.00	0.33	0.13	0.13	0.11	0.25	0.90	-0.47	-0.03
green_vote_share_dec	0.30	0.33	1.00	0.31	0.31	0.31	0.68	0.40	-0.14	0.01
grant_percap	0.12	0.13	0.31	1.00	0.75	0.64	0.21	0.22	-0.13	0.07
log_proj_percap	0.09	0.13	0.31	0.75	1.00	0.90	0.21	0.18	-0.14	0.10
log_grantsK	0.07	0.11	0.31	0.64	0.90	1.00	0.23	0.14	-0.08	0.06
popul.rel.change	0.27	0.25	0.68	0.21	0.21	0.23	1.00	0.38	-0.21	-0.08
pct_artificial_2006_pop	0.63	0.90	0.40	0.22	0.18	0.14	0.38	1.00	-0.54	-0.06
pct_green_2006_pop	-0.41	-0.47	-0.14	-0.13	-0.14	-0.08	-0.21	-0.54	1.00	-0.17
mean_DIST_BORD_K	-0.03	-0.03	0.01	0.07	0.10	0.06	-0.08	-0.06	-0.17	1.00

Selected plots

ggplot(plot_data_inhab, aes(x = x, y = y)) +
  geom_point(aes(color = role), alpha = 0.5, size = 1.8) +
  geom_smooth(
    method = "lm",
    se = FALSE,
    aes(color = role),
    linewidth = 0.5
  ) +
  facet_wrap(
    ~ var,
    scales = "free_x",
    labeller = as_labeller(var_labels_inhab)
  ) +
  scale_color_manual(
    values = role_colors,
    name = "Variable type"
  ) +
  labs(
    x = NULL,
    y = "% greened area (inhabited cells)"
  ) +
  theme_bw() +
  theme(
    strip.background = element_rect(fill = "grey95"),
    axis.text.x = element_text(angle = 30, hjust = 1),
    legend.position = "bottom"
  )

ggplot(const_plot) +
  geom_sf(aes(fill = green_vote_share)) +
  scale_fill_gradientn(
    colours = c("grey85", "#A8E6A3", "forestgreen", "darkgreen"),
    name = "Vote share (%)"
  ) +
  theme_minimal() +
  labs(
    title = "Percentage of votes for the Green Party (second votes)"
  )

ggplot(const_plot) +
  geom_sf(aes(fill = as.factor(greened_binary_pop))) +
  scale_fill_manual(
    values = c("lightgray", "forestgreen"),
    labels = c("Below median", "Above median"),
    name = "Urban greening"
  ) +
  theme_minimal() +
  labs(
    title = "Fraction of areas showing urban greening"
  )

ggplot(const_plot) +
  geom_sf(aes(fill = grant_per_cap_cat)) +
  scale_fill_manual(
    values = c("#edf8e9", "#bae4b3", "#74c476", "#31a354", "#006d2c")
  ) +
  theme_minimal() +
  labs(
    title = "Green-related grants per capita",
    fill = "Grants per capita (€)"
  )

ggplot(const_plot) +
  geom_sf(aes(fill = project_cat)) +
  scale_fill_manual(
    values = c("#e5f5e0", "#a1d99b", "#74c476", "#31a354", "#006d2c")
  ) +
  theme_minimal() +
  labs(
    title = "Number of green-related projects",
    fill = "Project count"
  )

The empirical strategy combines three complementary approaches to investigate the relationship between political preferences and urban greening.

The model decomposes the total effect of political support into a direct component and an indirect component operating through financial capacity. This allows us to assess whether funding acts as a mechanism translating political preferences into environmental outcomes.

Probit models

We begin with a parametric approach to estimate the baseline relationship between political support and urban greening.

Model comparison

summary(mod1_inhab)

## 
## Call:
## glm(formula = greened_binary_pop ~ green_vote_share_dec + grant_percap + 
##     log_proj_percap + popul.rel.change + pct_artificial_2006_pop + 
##     pct_green_2006_pop + I(green_vote_share_dec * pct_artificial_2006_pop) + 
##     mean_DIST_BORD_K + ELEV_class, family = binomial(link = "probit"), 
##     data = dat)
## 
## Coefficients:
##                                                    Estimate Std. Error z value
## (Intercept)                                       -6.875571   3.481814  -1.975
## green_vote_share_dec                              17.002029   8.645218   1.967
## grant_percap                                      -0.025647   0.035288  -0.727
## log_proj_percap                                    0.112229   0.182040   0.617
## popul.rel.change                                   3.418891   2.999100   1.140
## pct_artificial_2006_pop                            0.455643   0.075873   6.005
## pct_green_2006_pop                                -0.010111   0.008125  -1.244
## I(green_vote_share_dec * pct_artificial_2006_pop) -1.991461   0.549469  -3.624
## mean_DIST_BORD_K                                   2.273279   2.315583   0.982
## ELEV_classelev_250-500                             0.259277   0.293854   0.882
## ELEV_classelev_500-1000                            0.386481   0.434666   0.889
##                                                   Pr(>|z|)    
## (Intercept)                                        0.04830 *  
## green_vote_share_dec                               0.04922 *  
## grant_percap                                       0.46735    
## log_proj_percap                                    0.53756    
## popul.rel.change                                   0.25430    
## pct_artificial_2006_pop                           1.91e-09 ***
## pct_green_2006_pop                                 0.21335    
## I(green_vote_share_dec * pct_artificial_2006_pop)  0.00029 ***
## mean_DIST_BORD_K                                   0.32623    
## ELEV_classelev_250-500                             0.37760    
## ELEV_classelev_500-1000                            0.37393    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 414.47  on 298  degrees of freedom
## Residual deviance: 145.46  on 288  degrees of freedom
## AIC: 167.46
## 
## Number of Fisher Scoring iterations: 11

summary(mod2_inhab)

## 
## Call:
## glm(formula = greened_binary_pop ~ green_vote_share_dec + grant_percap + 
##     popul.rel.change + pct_artificial_2006_pop + I(green_vote_share_dec * 
##     pct_artificial_2006_pop) + mean_DIST_BORD_K + ELEV_class, 
##     family = binomial(link = "probit"), data = dat)
## 
## Coefficients:
##                                                    Estimate Std. Error z value
## (Intercept)                                       -8.676029   2.858275  -3.035
## green_vote_share_dec                              17.204170   8.752413   1.966
## grant_percap                                      -0.008384   0.026672  -0.314
## popul.rel.change                                   3.446442   2.986048   1.154
## pct_artificial_2006_pop                            0.451803   0.075168   6.011
## I(green_vote_share_dec * pct_artificial_2006_pop) -1.956454   0.553040  -3.538
## mean_DIST_BORD_K                                   3.007147   2.237581   1.344
## ELEV_classelev_250-500                             0.083448   0.260160   0.321
## ELEV_classelev_500-1000                            0.197624   0.400364   0.494
##                                                   Pr(>|z|)    
## (Intercept)                                       0.002402 ** 
## green_vote_share_dec                              0.049339 *  
## grant_percap                                      0.753250    
## popul.rel.change                                  0.248426    
## pct_artificial_2006_pop                           1.85e-09 ***
## I(green_vote_share_dec * pct_artificial_2006_pop) 0.000404 ***
## mean_DIST_BORD_K                                  0.178972    
## ELEV_classelev_250-500                            0.748396    
## ELEV_classelev_500-1000                           0.621581    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 414.47  on 298  degrees of freedom
## Residual deviance: 147.61  on 290  degrees of freedom
## AIC: 165.61
## 
## Number of Fisher Scoring iterations: 11

Spatial diagnostics

Moran’s I for residuals

model	Moran_I	p_value
M1	0.039	0.117
M2	0.030	0.177

Machine learning models

To complement the regression analysis, we apply machine learning methods that allow for flexible functional forms and interactions between predictors.

ML performance

model	AUC	Accuracy	Kappa
Random Forest	0.966	0.898	0.796
XGBoost	0.963	0.881	0.762

Random Forest: variable importance

ggplot(rf_imp_df, aes(x = reorder(Variable, Importance), y = Importance)) +
  geom_col() +
  coord_flip() +
  theme_minimal() +
  labs(
    x = "Predictor",
    y = "Importance"
  )

XGBoost: variable importance

vip(xgb_model$finalModel, num_features = 10)

SHAP values (XGBoost)

sv_importance(sv, show_numbers = TRUE, max_display = 8)

sv_importance(sv, kind = "bee", show_numbers = TRUE, max_display = 8)

autoplot(pdp_artificial) +
  labs(
    x = "Share of artificial land in 2006 (inhabited areas)",
    y = "Predicted probability of urban greening"
  ) +
  theme_minimal()

Mediation analysis

This section tests whether environmental funding mediates the relationship between Green Party support and urban greening in inhabited areas. The preferred specification includes the baseline share of artificial land in 2006 within inhabited areas as a structural control.

Mediation results

Mediator	ACME	ACME 95% CI	ADE	ADE 95% CI	Total Effect
Grants per capita	-0.0125	[-0.0449; 0.0143]	-0.0785	[-0.1831; 0.0258]	-0.0889
Projects per capita	-0.0054	[-0.0377; 0.0258]	-0.0875	[-0.1973; 0.0179]	-0.0921

Urban greening and support for environmental parties: Evidence from Germany

Eliza Hałatek, Katarzyna Kopczewska

2026-04-07