2) Ethiopian population density and ERSS sample villages

This section reconstructs Figure 4, “Ethiopian population density and ERSS sample villages”, from the paper “Rural electrification, migration and structural transformation: Evidence from Ethiopia”.

Due to data access limitations, we use a combination of open geospatial datasets and survey-derived geographic variables rather than the restricted datasets used by the original authors.

Data Sources

  1. Population Density Raster (2012)
  2. Administrative Boundaries (2026)
  3. Electricity Transmission Network (2017)
  4. Road Infrastructure (2018)
  5. Household Geographic Variables (2012)
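
library(terra)          # rast(), aggregate(), crop(), mask(), vect()
library(sf)             # st_read(), st_transform(), st_intersection()
library(rnaturalearth)  # ne_countries()
library(dplyr)          # %>%, filter()
library(readr)          # read_csv()

##1. Load and smooth population raster
pop <- rast("~/Desktop/eth_pd_2012_1km.tif")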

# Aggregate to smoother resolution (~5km)
pop_smooth <- aggregate(pop, fact = 5, fun = mean, na.rm = TRUE)

##2. Load Ethiopia border and neighbouring countries
ethiopia_border <- ne_countries(country = "Ethiopia", scale = "medium", returnclass = "sf")
ethiopia_border <- st_transform(ethiopia_border, crs(pop_smooth))
ethiopia_vect <- vect(ethiopia_border)

neighbors <- ne_countries(scale = "medium", returnclass = "sf") %>%
  st_transform(st_crs(ethiopia_border)) %>%
  st_make_valid()

##3. Clip raster to Ethiopia and convert to DF

pop_crop   <- crop(pop_smooth, ethiopia_vect)
pop_masked <- mask(pop_crop, ethiopia_vect)
pop_df <- as.data.frame(pop_masked, xy = TRUE, na.rm = TRUE)
colnames(pop_df)[3] <- "pop"

##4. Classify population density (quantiles)

breaks <- quantile(pop_df$pop, probs = seq(0, 1, length.out = 7), na.rm = TRUE)
pop_df$pop_class <- cut(pop_df$pop, breaks = breaks, include.lowest = TRUE)

##5. Local boundaries

eth_l1_sf <- st_read("~/Desktop/gadm41_ETH_shp/gadm41_ETH_1.shp", quiet = TRUE)
eth_l2_sf <- st_read("~/Desktop/gadm41_ETH_shp/gadm41_ETH_2.shp", quiet = TRUE)

eth_l1_sf <- st_transform(eth_l1_sf, st_crs(ethiopia_border))
eth_l2_sf <- st_transform(eth_l2_sf, st_crs(ethiopia_border))

##6. Load WB power transmission network
wb_power <- st_read("~/Desktop/ethiopia-electricity-transmission-network/Ethiopia Electricity Transmission Network.shp", quiet = TRUE)
wb_power <- st_transform(wb_power, st_crs(ethiopia_border))
wb_power <- st_intersection(wb_power, ethiopia_border)

##7. Load and filter major roads
roads <- st_read("~/Desktop/ethiopia-180101-free.shp/gis_osm_roads_free_1.shp", quiet = TRUE)

roads_major <- roads %>%
  filter(fclass %in% c("motorway", "trunk", "primary")) %>%
  st_transform(st_crs(ethiopia_border)) %>%
  st_intersection(ethiopia_border)

##8. Load household geographic variables (sample village locations)

hh_geo <- read_csv("~/Desktop/pub_eth_householdgeovariables_y1.csv")

hh_sf <- st_as_sf(hh_geo, coords = c("LON_DD_MOD", "LAT_DD_MOD"), crs = 4326) %>%
  st_transform(st_crs(ethiopia_border))

##9. Map extent padding and colour palette

bb <- st_bbox(ethiopia_border)

xpad <- (bb$xmax - bb$xmin) * 0.15
ypad <- (bb$ymax - bb$ymin) * 0.15

paper_blues <- c(
  "#f2f6fa",
  "#dbe7f2",
  "#b7d2e8",
  "#8fb9da",
  "#5f9ec7",
  "#2f6da4"
)
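
The objects prepared above are the layers of the final map. As a minimal sketch of how such layers could be stacked with ggplot2, the block below uses synthetic stand-ins, not the real data: `grid_df` (a 6 x 6 grid) stands in for `pop_df`, and `villages` (three made-up points) stands in for the household locations. Both names and all values are assumptions for illustration only; the palette is the one defined above.

```r
library(ggplot2)

# Synthetic stand-in for pop_df (assumption): a 6 x 6 grid of made-up values.
grid_df <- expand.grid(x = 1:6, y = 1:6)
grid_df$pop <- grid_df$x * grid_df$y

# Same classification idea as in step 4: six quantile classes.
grid_breaks <- quantile(grid_df$pop, probs = seq(0, 1, length.out = 7))
grid_df$pop_class <- cut(grid_df$pop, breaks = unique(grid_breaks),
                         include.lowest = TRUE)

# Synthetic stand-in for the household points (assumption).
villages <- data.frame(x = c(2, 4, 5), y = c(5, 2, 4))

paper_blues <- c("#f2f6fa", "#dbe7f2", "#b7d2e8",
                 "#8fb9da", "#5f9ec7", "#2f6da4")

# Layer order: density classes first, then points on top.
p <- ggplot() +
  geom_raster(data = grid_df, aes(x = x, y = y, fill = pop_class)) +
  scale_fill_manual(values = paper_blues, name = "Population density") +
  geom_point(data = villages, aes(x = x, y = y),
             colour = "yellow", size = 2) +
  coord_equal() +
  theme_void()

print(p)
```

With the real objects, the boundary, transmission, road and household layers would presumably be added with `geom_sf()`, and the padded extent from `xpad` and `ypad` passed to `coord_sf()` as `xlim` and `ylim`. The exact styling of the original figure is not reproduced here.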

Transformations Done and Final Output

Several transformations were applied to ensure clarity and visual comparability:

  • Raster Aggregation: The original 1 km raster was aggregated to ~5 km resolution to reduce noise and match the visual smoothness of the paper.
  • Quantile Classification: Population density was classified into six quantile classes to highlight relative spatial concentration rather than absolute values.
  • CRS Harmonization: All layers were transformed into a common coordinate reference system to ensure spatial alignment.
  • Boundary Masking: The population raster was clipped strictly to Ethiopia’s borders to avoid showing misleading density outside the study area.
  • Road Filtering: Only major roads (motorway, trunk and primary) were retained to avoid clutter and to match the visual emphasis of the paper.
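
To make the quantile classification concrete, here is a small sketch in base R. The density values are hypothetical and are not taken from the raster; the `unique()` call is an added safeguard (not part of the code above) against tied quantiles, which would otherwise make `cut()` fail.

```r
# Hypothetical density values (people per km^2), for illustration only.
dens <- c(2, 5, 9, 14, 30, 55, 80, 120, 300, 450, 900, 2500)

# Six classes need seven break points at evenly spaced probabilities.
breaks <- quantile(dens, probs = seq(0, 1, length.out = 7), na.rm = TRUE)

# unique() drops duplicated break points if several quantiles coincide.
dens_class <- cut(dens, breaks = unique(breaks), include.lowest = TRUE)

# Count how many observations fall in each class.
class_counts <- table(dens_class)
print(class_counts)
```

Because the breaks are quantiles, each class holds roughly the same number of cells, so the darkest class marks the densest sixth of the country rather than a fixed density threshold.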

These are the differences from the paper:

  • Electrification Status Not Publicly Available: The paper classifies villages as electrified vs non-electrified. In this replication, household points are plotted without classification, each shown as a yellow dot.
  • Transmission Network Source: The paper uses utility-grade grid data, while we use World Bank open grid data, which approximates the structure.
  • Population Density Representation: The paper aggregates density at the district level, while we use a smoothed raster-based representation for higher spatial resolution.
  • Temporal Differences: The paper studies changes between 2012 and 2014, while our map reflects a static 2012 snapshot overlaid with data from 2017, 2018 and 2026, because some historical data were unavailable.

Original Figure

knitr::include_graphics("~/Desktop/Qn2 actual chart.png")
