library(kirkegaard)
load_packages(
dplyr, tidyr, readr, stringr, purrr, tibble,
sf, ggplot2, ggrepel, ggcorrplot, patchwork,
lavaan, modelsummary, gt
)
theme_set(theme_bw(base_size = 11))
options(digits = 3)
knitr::opts_chunk$set(
echo = TRUE, message = FALSE, warning = FALSE,
fig.width = 9, fig.height = 6, dpi = 130
)
A debate in Danish politics about what it means to be Danish motivated this analysis. A politician from the Social Democrats, Frederik Vad, posted commune-level results from the Ministry of Justice’s Tryghedsundersøgelse (safety/tryghed survey), and it was obvious that commune-level ethnic composition correlated strongly with reported safety — but the correlation had not been quantified and the underlying appendix was missing from the printed version of the main alternative source (Realdania’s Vores Livskvalitet, 2. oplag 2025). Realdania later supplied the full Appendix 4A via email.
This notebook documents the data assembly and the analyses we ran. All source scripts live in R/ and all intermediate datasets in data/.
| Measure | Source | Unit | Time |
|---|---|---|---|
| MoJ Tryghedsundersøgelsen | Ministry of Justice / Rigspolitiet annual reports; commune-level values transcribed from the district maps. | 98 communes | 2019, 2021, 2022, 2023, 2024 |
| Realdania safety & noise (Appendix 4A) | Vores Livskvalitet (Realdania 2025), received by email. | 98 communes | 2024/25 wave |
| Realdania six commune indexes | Vores Livskvalitet appendices 3A, 5B, 6A, 10B, 11A (livskvalitet, socioøkon, naboskab, mental sundhed, tillid). | 98 communes | 2024/25 wave |
| FOLK1C (ancestry × origin country) | Danmarks Statistik, direct API pull. | 98 communes × 4 HERKOMST × 242 IELAND | 2022Q2 |
| STRAF11 (reported offences) | Danmarks Statistik, direct API pull. | 98 communes × offense category × quarter | 2021-2024 |
| BY2 (population by settlement size) | Danmarks Statistik, direct API pull. | 98 communes × BYST | 2024 |
| Commune polygons | Dataforsyningen DAWA GeoJSON. | 98 commune polygons | current |
| Postnummer polygons | Dataforsyningen DAWA GeoJSON. | 1,089 postnumre | current |
| Postnummer demographics 2014 | A4.csv (DST pull, geometry-indexed) in data/. |
592 postnumre | 2014-01-01 |
| Postnummer demographics 2016 | postnumre dst 2016.xlsx in data/. |
457 postnumre | 2016-10-01 |
The R/extract_*.R scripts produce the tidy datasets read below.
tryghed_long <- readRDS("data/tryghed_MoJ_long.rds") # long panel 2019-2024
tryghed_mean <- readRDS("data/tryghed_mean_2021_2024.rds") # commune mean 2021-24
tryghed_rd <- readRDS("data/tryghed_realdania.rds") # Realdania Appendix 4A
tryghed_pn <- readRDS("data/tryghed_postnumre_top_bund.rds")# top/bund 20 postnumre
menapta <- readRDS("data/commune_menapta_2022Q2.rds") # commune MENAPTA share
crime <- readRDS("data/straf11_commune_2021_2024.rds") # reported offences
urban <- readRDS("data/urbanisering_2024.rds") # % in byområder ≥200
popdens <- readRDS("data/popdensity_2022Q2.rds") # pop / km²
indexes <- readRDS("data/commune_indexes.rds") # Realdania book indexes
pn14 <- readRDS("data/postnumre_2014.rds") # 2014 postnummer demog
pn16 <- readRDS("data/postnumre_2016.rds") # 2016 postnummer demog
pn_nbs <- readRDS("data/tryghed_postnumre_neighbours.rds")
path_fits <- readRDS("data/path_models.rds")
geo_kom <- st_read("data/geo/kommuner.geojson", quiet = TRUE) |>
filter(!udenforkommuneinddeling) |>
transmute(kommune = navn, geometry)
Five years of map-based commune values (2019, 2021, 2022, 2023, 2024). Rank-order stability and the reliability of a multi-year average.
w <- tryghed_long |>
select(kommune, year, tryghed_pct) |>
pivot_wider(names_from = year, values_from = tryghed_pct, names_prefix = "y") |>
select(-kommune)
pearson_mat <- cor(w, method = "pearson")
spearman_mat <- cor(w, method = "spearman")
round(pearson_mat, 2)
## y2019 y2021 y2022 y2023 y2024
## y2019 1.00 0.70 0.54 0.60 0.66
## y2021 0.70 1.00 0.72 0.75 0.75
## y2022 0.54 0.72 1.00 0.72 0.69
## y2023 0.60 0.75 0.72 1.00 0.70
## y2024 0.66 0.75 0.69 0.70 1.00
round(spearman_mat, 2)
## y2019 y2021 y2022 y2023 y2024
## y2019 1.00 0.61 0.47 0.45 0.56
## y2021 0.61 1.00 0.66 0.64 0.65
## y2022 0.47 0.66 1.00 0.65 0.63
## y2023 0.45 0.64 0.65 1.00 0.59
## y2024 0.56 0.65 0.63 0.59 1.00
z <- scale(w)
alpha <- function(M) {
k <- ncol(M); v <- apply(M, 2, var); tot <- var(rowSums(M))
(k/(k-1)) * (1 - sum(v)/tot)
}
sb <- function(k, rbar) k*rbar / (1 + (k-1)*rbar)
r_avg <- function(M) { R <- cor(M); mean(R[lower.tri(R)]) }
tibble(
model = c("all 5 years", "drop 2019 (4 map-era years only)"),
alpha = c(alpha(z), alpha(z[, -1])),
mean_r = c(r_avg(z), r_avg(z[, -1])),
reliability_of_mean = c(sb(5, r_avg(z)), sb(4, r_avg(z[, -1])))
) |> mutate(across(where(is.numeric), round, 3))
Adjacent-year Pearson correlations are ~0.70–0.75 and adjacent Spearmans ~0.60–0.66, indicating a stable between-commune ranking with meaningful sampling noise in any single year. Cronbach α across 5 years is 0.915; dropping 2019 (different survey format) barely changes α (0.912). We use the mean of 2021-2024 as the primary MoJ safety variable.
m <- geo_kom |> left_join(tryghed_mean, by = "kommune")
ggplot(m) +
geom_sf(aes(fill = tryghed_mean_21_24), colour = "white", linewidth = 0.15) +
scale_fill_viridis_c(option = "mako", direction = -1,
name = "% feeling safe\nin neighbourhood\n(2021-2024)",
breaks = seq(75, 95, 5)) +
coord_sf(crs = 25832) +
labs(title = "Feeling safe in neighbourhood, mean 2021-2024",
subtitle = "Ministry of Justice Tryghedsundersøgelsen (share answering 1-3 on 1-7 scale)") +
theme_void(base_size = 11) +
theme(plot.title = element_text(face = "bold"),
legend.position = c(0.88, 0.55))
Low-tryghed cluster is concentrated on Copenhagen’s western ring (Vestegn): Ishøj, Brøndby, Albertslund, Høje-Taastrup, Herlev. Islands and rural western Jutland are high.
MENAPTA: share of commune residents (1st + 2nd generation) with origin in the Middle East, North Africa, Pakistan, Turkey, Afghanistan, or the rest of Africa (from Statistics Denmark FOLK1C, 2022Q2, using the midpoint of the 2021-2024 tryghed window).
Excluded: South Africa; Israel; European overseas territories in Africa (Réunion, St Helena, Spanish Africa — all exclaves with European populations). Stateless persons included (the majority in DK are Palestinians).
menapta |>
arrange(desc(menapta_pct)) |>
select(kommune, pop_total, menapta_n, menapta_pct) |>
head(10)
National MENAPTA share is 5.48 %. The top communes are all in the Copenhagen Vestegn ring (Ishøj 29%, Brøndby 23%, Albertslund 22%, Høje-Taastrup 19%).
df_moj <- tryghed_mean |> inner_join(menapta |> select(kommune, menapta_pct), by = "kommune")
r_p <- cor(df_moj$menapta_pct, df_moj$tryghed_mean_21_24)
r_s <- cor(df_moj$menapta_pct, df_moj$tryghed_mean_21_24, method = "spearman")
ggplot(df_moj, aes(menapta_pct, tryghed_mean_21_24)) +
geom_smooth(method = "lm", se = TRUE, colour = "steelblue",
fill = "steelblue", alpha = 0.12, linewidth = 0.6) +
geom_point(alpha = 0.75, size = 1.8, colour = "grey25") +
geom_text_repel(aes(label = kommune), size = 2.8, max.overlaps = 30,
segment.alpha = 0.4, min.segment.length = 0.2) +
scale_x_continuous(breaks = seq(0, 30, 5), labels = \(x) paste0(x, "%")) +
labs(x = "MENAPTA share of commune population (2022Q2)",
y = "Share feeling safe in neighbourhood, mean 2021-2024 (%)",
title = "Commune safety vs MENAPTA share",
subtitle = sprintf("Pearson r = %.2f Spearman ρ = %.2f n = 98",
r_p, r_s)) +
theme_minimal(base_size = 11)
Realdania’s Appendix 4A asks a different question:
df_cmp <- inner_join(tryghed_mean, tryghed_rd |> select(kommune, tryg_lokalomraade),
by = "kommune")
r_p2 <- cor(df_cmp$tryghed_mean_21_24, df_cmp$tryg_lokalomraade)
r_s2 <- cor(df_cmp$tryghed_mean_21_24, df_cmp$tryg_lokalomraade, method = "spearman")
fit <- lm(tryg_lokalomraade ~ tryghed_mean_21_24, df_cmp)
df_cmp <- df_cmp |>
mutate(resid = tryg_lokalomraade - predict(fit),
label = ifelse(rank(-abs(resid)) <= 15, kommune, NA_character_))
ggplot(df_cmp, aes(tryghed_mean_21_24, tryg_lokalomraade)) +
geom_abline(slope = 1, intercept = 0, colour = "grey70", linetype = "dashed") +
geom_smooth(method = "lm", se = TRUE, colour = "steelblue",
fill = "steelblue", alpha = 0.12, linewidth = 0.6) +
geom_point(alpha = 0.8, size = 1.8, colour = "grey25") +
geom_text_repel(aes(label = label), size = 2.9, min.segment.length = 0.2) +
labs(x = "MoJ: safe in immediate neighbourhood, mean 2021-2024",
y = "Realdania: safe in whole local area, 2024/25",
title = "Two commune-level safety measures",
subtitle = sprintf(
"Pearson r = %.2f Spearman ρ = %.2f n = 98 | Realdania = %.2f + %.2f · MoJ",
r_p2, r_s2, coef(fit)[1], coef(fit)[2])) +
theme_minimal(base_size = 11)
The two measures correlate strongly (r = 0.87) but Realdania’s range is 2.4× wider. All 98 points sit below the 45° line — Realdania values are systematically lower. The largest negative residuals are all urban Vestegn-adjacent communes (Herlev, Vallensbæk, Hvidovre, Slagelse, Glostrup, Ishøj, Albertslund, Frederikssund), consistent with the “whole lokalområde” framing picking up visible nearby problem areas.
full <- tryghed_mean |>
left_join(tryghed_rd, by = "kommune") |>
left_join(menapta |> select(kommune, menapta_pct), by = "kommune") |>
left_join(crime |> select(kommune, rate_straffelov_per1000), by = "kommune") |>
left_join(indexes, by = "kommune")
m_eng <- full |>
transmute(
`Safety MoJ (neighbourhood, 2021-24)` = tryghed_mean_21_24,
`Safety Realdania (local area)` = tryg_lokalomraade,
`% reporting neighbour noise` = pct_nabostoej,
`% reporting traffic noise` = pct_trafikstoej,
`MENAPTA %` = menapta_pct,
`Crime rate (criminal code /1000/yr)` = rate_straffelov_per1000,
`Life quality (raw)` = livskvalitet_raw,
`Life quality (adj.)` = livskvalitet_adj,
`Socioeconomic index` = socioekon_indeks,
`% with lack of space at home` = pladsmangel_pct,
`Neighbourhood satisfaction` = naboskab_tilfreds,
`Neighbour activities` = nabo_aktiviteter,
`Mental health (ABC index)` = mental_sundhed_abc,
`Trust in others` = tillid_andre,
`Trust in authorities` = tillid_myndigheder
)
cm <- cor(m_eng, method = "pearson")
ggcorrplot(cm, type = "lower", hc.order = TRUE, hc.method = "ward.D2",
lab = TRUE, lab_size = 2.6, outline.col = "white",
colors = c("#c6362e", "white", "#1f6f8b"), tl.cex = 10,
legend.title = "Pearson r") +
labs(title = "Commune-level correlations (n = 98)") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
Two clean clusters: a “local wellbeing” bundle (safety, neighbourhood satisfaction, life quality, trust in others) and an “urban stress” bundle (MENAPTA, crime, noise, socioeconomic problems) which anti-correlate. Trust in authorities is the clear outlier — its variation across communes is driven by individual-level demographics (education, age, politics), not the local environment gradient.
crime |>
arrange(desc(rate_straffelov_per1000)) |>
select(kommune, rate_tot_per1000, rate_straffelov_per1000, rate_person_per1000) |>
head(10)
Reported offences sum 2021-2024, expressed as annual rate per 1,000 inhabitants using the 2022Q2 FOLK1C population denominator. We use “straffelov i alt” (all criminal code offences) as the default headline rate; narrower subcategories like violence or sexual offences produce single-digit annual counts in small communes and become unstable.
bind_rows(
urban |> arrange(desc(urban_pct)) |> head(3) |>
transmute(kommune, measure = "Urban %", value = urban_pct),
popdens |> arrange(desc(pop_density)) |> head(3) |>
transmute(kommune, measure = "Density /km²", value = pop_density)
)
Plain population density (pop / total commune area) is distorted by empty land (forests, uninhabited coastal zones); residents living in settlements of ≥200 (DST’s definition of a byområde) is a cleaner measure of “how urban the average resident’s environment is”.
dz <- full |>
left_join(urban |> select(kommune, urban_pct), by = "kommune") |>
left_join(popdens |> select(kommune, log_pop_density), by = "kommune") |>
rename(moj = tryghed_mean_21_24, rd = tryg_lokalomraade,
menapta = menapta_pct, crime = rate_straffelov_per1000,
urban = urban_pct, logdens = log_pop_density) |>
mutate(across(c(moj, rd, menapta, crime, urban, logdens),
\(x) as.numeric(scale(x))))
models <- list(
"(1) MoJ + Urban%" = lm(moj ~ menapta + crime + urban, dz),
"(2) MoJ + log10(density)" = lm(moj ~ menapta + crime + logdens, dz),
"(3) Realdania + Urban%" = lm(rd ~ menapta + crime + urban, dz),
"(4) Realdania + log10(density)" = lm(rd ~ menapta + crime + logdens, dz)
)
modelsummary(
models,
coef_map = c(
"menapta" = "MENAPTA %",
"crime" = "Crime rate (criminal code /1000/yr)",
"urban" = "Urban % (residents in settlements ≥ 200)",
"logdens" = "log10(population density, /km²)"
),
gof_map = tribble(~raw, ~clean, ~fmt,
"adj.r.squared", "adj. R²", 3),
stars = c("*" = .05, "**" = .01, "***" = .001),
statistic = "({std.error})",
coef_omit = "(Intercept)",
title = "Commune-level determinants of subjective safety (n = 98) — standardised β",
notes = c(
"Standardised β coefficients (all variables z-scored); SE in parentheses. * p<0.05, ** p<0.01, *** p<0.001. n = 98.",
"MENAPTA %: 1st+2nd-gen population share from Middle East, North Africa, Pakistan, Turkey, Afghanistan, rest of Africa. Excl. South Africa, Israel, EU overseas territories; incl. stateless. Source: DST FOLK1C, 2022Q2.",
"Crime rate: DST STRAF11 criminal code offences / 1,000 inhabitants / year, 2021-2024.",
"Urban %: DST BY2 (2024), residents in settlements ≥ 200. Density: FOLK1C pop / commune land area (EPSG:25832)."
),
output = "html"
)
| (1) MoJ + Urban% | (2) MoJ + log10(density) | (3) Realdania + Urban% | (4) Realdania + log10(density) | |
|---|---|---|---|---|
| * p < 0.05, ** p < 0.01, *** p < 0.001 | ||||
| Standardised β coefficients (all variables z-scored); SE in parentheses. * p<0.05, ** p<0.01, *** p<0.001. n = 98. | ||||
| MENAPTA %: 1st+2nd-gen population share from Middle East, North Africa, Pakistan, Turkey, Afghanistan, rest of Africa. Excl. South Africa, Israel, EU overseas territories; incl. stateless. Source: DST FOLK1C, 2022Q2. | ||||
| Crime rate: DST STRAF11 criminal code offences / 1,000 inhabitants / year, 2021-2024. | ||||
| Urban %: DST BY2 (2024), residents in settlements ≥ 200. Density: FOLK1C pop / commune land area (EPSG:25832). | ||||
| MENAPTA % | −0.656*** | −0.712*** | −0.478*** | −0.519*** |
| (0.062) | (0.064) | (0.064) | (0.070) | |
| Crime rate (criminal code /1000/yr) | −0.270*** | −0.379*** | −0.251*** | −0.333*** |
| (0.071) | (0.077) | (0.072) | (0.084) | |
| Urban % (residents in settlements ≥ 200) | −0.060 | −0.286*** | ||
| (0.071) | (0.073) | |||
| log10(population density, /km²) | 0.128 | −0.122 | ||
| (0.083) | (0.091) | |||
| adj. R² | 0.774 | 0.778 | 0.764 | 0.731 |
Key findings:
Standardised β coefficients (bias-corrected bootstrap CIs, 2000 replicates). Source code: R/path_model_safety.R and R/plot_path_model.R.
extract_effects <- function(fit, ylabel) {
ps <- parameterestimates(fit)
get <- function(op, lhs, rhs = NULL, lab = NULL) {
r <- if (!is.null(lab)) ps |> filter(label == lab)
else ps |> filter(op == !!op, lhs == !!lhs, rhs == !!rhs)
slice(r, 1)
}
tibble(
outcome = ylabel,
direct_MENAPTA = get("~", if(ylabel=="MoJ") "moj_safety" else "rd_safety", "menapta")$est,
direct_urban = get("~", if(ylabel=="MoJ") "moj_safety" else "rd_safety", "urban")$est,
indirect_MENAPTA= get(lab = "ind_men")$est,
indirect_urban = get(lab = "ind_urban")$est,
total_MENAPTA = get(lab = "tot_men")$est,
total_urban = get(lab = "tot_urban")$est,
pct_direct_MENAPTA = 100 * direct_MENAPTA / total_MENAPTA,
pct_direct_urban = 100 * direct_urban / total_urban
)
}
bind_rows(
extract_effects(path_fits$moj, "MoJ"),
extract_effects(path_fits$rd, "Realdania")
) |> mutate(across(where(is.numeric), round, 3))
MENAPTA operates almost entirely through a direct (non-crime-mediated) channel: ~91% direct in MoJ, ~88% in Realdania. The direct channel likely reflects some combination of (a) crime under-reporting or antisocial behaviour not reaching STRAF11 (nuisance, low-level disorder), and (b) a composition effect — individuals in the included origin groups reporting lower safety irrespective of where they live (this is well- documented in US interpersonal trust research).
Urban% behaves differently by measure: fully mediated by crime in the MoJ model (direct effect NS, ~71% mediated), but retains a large direct effect in the Realdania model (~67% direct) — the “whole-lokalområde” framing picks up atmosphere-of-urbanity signal that crime alone does not capture.
Realdania published safety for the top-10 and bottom-10 postnumre (Table 4.4) — 20 of 603 qualifying postnumre. We joined these with postnummer-level demographic data from two DST snapshots (2014-01-01 and 2016-10-01).
pn_moj <- pn14 |> select(postnr, menapta_proxy_pct, ikke_dansk_pct) |>
mutate(postnr = as.character(postnr))
j14 <- tryghed_pn |> mutate(postnr = as.character(postnr)) |>
inner_join(pn_moj, by = "postnr")
tibble(
predictor = c("Ikke-dansk % (2016)", "Ikke-dansk % (2014)",
"Asien + Afrika % (2014, MENAPTA-proxy)"),
n = c(19, 20, 20),
pearson_r = c(
cor(pn16 |> mutate(postnr=as.character(postnr)) |>
inner_join(tryghed_pn |> mutate(postnr=as.character(postnr)), by="postnr") |>
pull(tryg_lokalomraade),
pn16 |> mutate(postnr=as.character(postnr)) |>
inner_join(tryghed_pn |> mutate(postnr=as.character(postnr)), by="postnr") |>
pull(ikke_dansk_pct)),
cor(j14$tryg_lokalomraade, j14$ikke_dansk_pct),
cor(j14$tryg_lokalomraade, j14$menapta_proxy_pct)
),
spearman_rho = c(
cor(pn16 |> mutate(postnr=as.character(postnr)) |>
inner_join(tryghed_pn |> mutate(postnr=as.character(postnr)), by="postnr") |>
pull(tryg_lokalomraade),
pn16 |> mutate(postnr=as.character(postnr)) |>
inner_join(tryghed_pn |> mutate(postnr=as.character(postnr)), by="postnr") |>
pull(ikke_dansk_pct), method = "spearman"),
cor(j14$tryg_lokalomraade, j14$ikke_dansk_pct, method = "spearman"),
cor(j14$tryg_lokalomraade, j14$menapta_proxy_pct, method = "spearman")
)
) |> mutate(across(where(is.numeric), round, 3))
Same sign and similar magnitude at the postnummer level as at the commune level. MENAPTA-proxy (Asien + Afrika in the DST regional grouping) gives the tightest fit, mirroring the commune finding.
Neighbour-postnummer analysis (R/analyse_postnr_neighbors.R) shows that for Skovlunde (postnr 2740) the low-safety reading is consistent with a neighbour-spillover story (adjacent Albertslund 2620 has MENAPTA-proxy 23%); for Frederikssund (3600), neighbours are whiter than the focal postnummer, so the spillover hypothesis fails there and a within-postnummer “concentrated block” explanation (Ådalsparken) is the residual candidate.
MoJ (Tryghedsundersøgelsen): “Now a question about fear of crime. On a scale from 1-7 […], how safe or unsafe do you feel? By ‘neighbourhood’ we mean the area immediately surrounding your residence.” Safety % = share answering 1–3.
Realdania (Vores Livskvalitet, Appendix 4A): “Are there places in your local area where you avoid going after nightfall because you feel unsafe?” Safety % = share answering “No, I feel safe in the whole local area.”
The two constructs differ in reference radius (immediate vs whole local area) and response format (7-point scale vs forced-choice behavioural avoidance). This cleanly explains the structural offset between the two measures (MoJ mean 89%, Realdania mean 75%) and why Realdania’s spread is 2.4× MoJ’s.
# save a sessionInfo snapshot
writeLines(capture.output(sessionInfo()), "sessions_info.txt")
# OSF upload (run manually after knitting).
library(osfr)
osf_auth(readr::read_lines("~/.config/osf_token"))
proj <- osf_create_project(
title = "Migration and feelings of safety in Denmark",
description = "Commune-level analysis linking ethnic composition (MENAPTA), reported crime, urbanization and population density to two subjective safety measures (Ministry of Justice Tryghedsundersøgelsen 2019-2024 and Realdania 'Vores Livskvalitet' Appendix 4A 2024/25).",
public = TRUE
)
osf_upload(proj,
path = c("data", "figures", "R",
"notebook.Rmd", "notebook.html", "sessions_info.txt"),
conflicts = "overwrite")