This report utilizes real-world spatial data (geographic boundaries, district names, etc.) to provide an accurate representation of the context in Rwanda.
However, please note that all other quantitative datasets used in this analysis—specifically the data pertaining to:
are entirely synthetic, hypothetical, and generated solely for educational and illustrative purposes.
Note: The findings, conclusions, and relationships presented in this spatial analysis are based on simulated data and should be interpreted strictly as an exercise in data analysis methodology and spatial mapping techniques, and not as a reflection of the true socio-economic or public health situation in Rwanda.
This report presents a comprehensive spatial analysis of food security and child nutrition in Rwanda, examining the relationships between child stunting prevalence, maize prices, and healthcare accessibility across districts. Using integrated spatial datasets, we investigate how geographic and economic factors influence child nutrition outcomes.
Key Findings:
Child stunting remains a critical public health challenge in Rwanda. According to the Rwanda Demographic and Health Survey (RDHS) 2019/2020, stunting declined to 33% over a five–year period, reflecting important national progress in combating chronic undernutrition. However, this still represents a substantial burden. UNICEF estimates that close to 800,000 Rwandan children under five are stunted, underscoring that stunting is not only a health issue but also a development and equity concern.
Stunting risk also increases sharply with age in early childhood. Data indicate that while around 22% of children aged 6–8 months are stunted, this prevalence rises to about 39% among children aged 18–23 months. This pattern highlights the vulnerability of children during the complementary feeding period and the importance of adequate nutrition, caregiving practices, and access to quality health services in the first two years of life.
Against this backdrop, this analysis focuses on three key domains that may influence stunting prevalence in Rwanda:
Together, these domains provide a lens to better understand where and why stunting persists, and to inform more targeted, evidence-based interventions to protect the health and development of Rwandan children.
The primary objectives of this analysis are to:
This analysis integrates multiple datasets:
# Load data tables
households_raw <- read_csv(file.path(OUT_DIR, "households.csv"), show_col_types = FALSE)
children <- read_csv(file.path(OUT_DIR, "children.csv"), show_col_types = FALSE)
maize_prices_district <- read_csv(file.path(OUT_DIR, "maize_price_admin2_2023plus.csv"), show_col_types = FALSE)
villages_hf_dist <- read_csv(file.path(OUT_DIR, "villages_with_hf_distance.csv"), show_col_types = FALSE)
markets <- read_csv(file.path(OUT_DIR, "markets.csv"), show_col_types = FALSE)
health_fac <- read_csv(file.path(OUT_DIR, "health_fac.csv"), show_col_types = FALSE)
# Load spatial polygons
districts <- st_read(file.path(OUT_DIR, "District_level_boundary", "District level boundary", "RWA_adm2.shp"), quiet = TRUE)
sectors <- st_read(file.path(OUT_DIR, "Sector_level_boundary", "Sector level boundary", "RWA_adm3.shp"), quiet = TRUE)Survey weights are essential for producing representative estimates. Missing weights were imputed with a value of 1.
All point-based datasets (markets, villages, health facilities) were converted to spatial features using WGS84 coordinate reference system.
CRS_WGS84 <- 4326
# Convert to SF objects
markets_sf <- st_as_sf(markets, coords = c("longitude", "latitude"), crs = CRS_WGS84)
villages_sf <- st_as_sf(villages_hf_dist, coords = c("village_lon", "village_lat"), crs = CRS_WGS84)
health_fac_sf <- st_as_sf(health_fac, coords = c("longitude", "latitude"), crs = CRS_WGS84)We calculated the average distance from villages to the nearest health facility for each district.
Child stunting data was aggregated to the district level to calculate prevalence rates.
# Join children data to district name via household ID
children_with_district <- children %>%
left_join(households %>% select(household_id, district_name), by = "household_id")
child_district_summary <- children_with_district %>%
filter(!is.na(district_name)) %>%
group_by(district_name) %>%
summarise(
stunting_prev = mean(stunted, na.rm = TRUE) * 100,
n_children = n(),
.groups = "drop"
)All analytical variables were merged onto the district shapefile to create a unified spatial dataset.
# Create province boundaries for context
province_boundaries <- districts %>%
group_by(NAME_1) %>%
summarise(geometry = st_union(geometry))
# Merge all data
districts_map <- districts %>%
left_join(child_district_summary, by = c("NAME_2" = "district_name")) %>%
left_join(hf_distance_district, by = c("NAME_2" = "district_name")) %>%
left_join(maize_prices_district, by = c("NAME_2" = "admin2")) %>%
mutate(
stunting_prev = round(stunting_prev, 1),
avg_dist_km_to_hf = round(avg_dist_km_to_hf, 2),
mean_price_rwf = round(mean_price_rwf, 1)
)child_district_summary %>%
arrange(stunting_prev) %>%
slice(c(1:5, (n()-4):n())) %>%
ggplot(aes(x = reorder(district_name, stunting_prev), y = stunting_prev, fill = stunting_prev)) +
geom_col() +
scale_fill_gradient(low = "lightgreen", high = "darkmagenta") +
labs(
title = "Stunting Prevalence (%) in Top 5 & Bottom 5 Districts",
x = "District Name",
y = "Prevalence (%)",
caption = "Source: Household Survey Data"
) +
coord_flip() +
theme_minimal(base_size = 12) +
theme(legend.position = "none")Figure 1: Stunting prevalence varies significantly across districts, with the highest rates exceeding 40% in some areas while others report rates below 20%.
Interpretation: The bar chart reveals substantial inter-district variation in child stunting. Districts with the highest prevalence may require targeted interventions, while low-prevalence districts could serve as models for best practices.
child_district_summary %>%
summarise(
Mean = mean(stunting_prev),
Median = median(stunting_prev),
Min = min(stunting_prev),
Max = max(stunting_prev),
Range = Max - Min,
SD = sd(stunting_prev)
) %>%
kable(
caption = "Table 1: District Stunting Prevalence Summary Statistics",
digits = 2,
col.names = c("Mean (%)", "Median (%)", "Min (%)", "Max (%)", "Range (%)", "Std Dev")
)| Mean (%) | Median (%) | Min (%) | Max (%) | Range (%) | Std Dev |
|---|---|---|---|---|---|
| 25.39 | 25.71 | 19.48 | 30.69 | 11.21 | 2.85 |
Key Observations:
These statistics suggest that while the average district stunting prevalence is 25.39% (with a median of 25.71%), there is meaningful spread across the country: rates range from 19.48% in the best-performing districts to 30.69% in the worst, an 11.21-percentage-point gap. With a standard deviation of 2.85 percentage points, most districts cluster around 22–28%, but a subset clearly underperforms. This pattern makes spatially targeted policies and resource allocation essential for reducing overall stunting and prioritizing districts where children face the highest risk.
price_stunting_data <- districts_map %>%
st_set_geometry(NULL) %>%
select(stunting_prev, mean_price_rwf, NAME_2) %>%
filter(!is.na(stunting_prev) & !is.na(mean_price_rwf))
ggplot(price_stunting_data, aes(x = mean_price_rwf, y = stunting_prev)) +
geom_point(aes(color = stunting_prev), size = 3, alpha = 0.7) +
geom_smooth(method = "lm", se = TRUE, color = "darkblue", fill = "lightblue") +
scale_color_viridis_c(name = "Stunting (%)") +
labs(
title = "Stunting Prevalence vs. Average Maize Price",
subtitle = "District-level analysis with linear trend",
x = "Average Maize Price (RWF/kg)",
y = "Stunting Prevalence (%)",
caption = " Line regression "
) +
theme_light(base_size = 12)Figure 2: Scatter plot examining the relationship between maize prices and stunting prevalence. Each point represents a district, colored by stunting severity.
correlation_value <- cor(price_stunting_data$mean_price_rwf,
price_stunting_data$stunting_prev,
use = "complete.obs")Correlation Coefficient: -0.097
Interpretation: The correlation coefficient of -0.097 indicates a very weak negative relationship between maize prices and stunting prevalence. A negative sign means that, in general, stunting tends to decrease slightly as maize prices increase, but the effect is so small that it is unlikely to be substantively important. This suggests that maize prices alone are not a strong predictor of child stunting, and that broader economic and social determinants need to be considered.
health_stunting_data <- districts_map %>%
st_set_geometry(NULL) %>%
select(stunting_prev, avg_dist_km_to_hf, NAME_2) %>%
filter(!is.na(stunting_prev) & !is.na(avg_dist_km_to_hf))
ggplot(health_stunting_data, aes(x = avg_dist_km_to_hf, y = stunting_prev)) +
geom_point(aes(color = stunting_prev), size = 3, alpha = 0.7) +
geom_smooth(method = "lm", se = TRUE, color = "darkgreen", fill = "lightgreen") +
scale_color_viridis_c(name = "Stunting (%)", option = "A") +
labs(
title = "Stunting Prevalence vs. Distance to Health Facilities",
subtitle = "Examining healthcare accessibility as a determinant",
x = "Average Distance to Health Facility (km)",
y = "Stunting Prevalence (%)",
caption = "Line linear regression"
) +
theme_light(base_size = 12)Figure 3: Relationship between average distance to health facilities and stunting prevalence across districts.
health_correlation <- cor(health_stunting_data$avg_dist_km_to_hf,
health_stunting_data$stunting_prev,
use = "complete.obs")Correlation Coefficient: 0.327
# Ensure both spatial objects have the same CRS before joining
# Transform health_fac_sf to match the CRS of districts
health_fac_sf_transformed <- st_transform(health_fac_sf, st_crs(districts))
# Spatial join to count health facilities per district
hf_count_by_district <- st_join(health_fac_sf_transformed, districts) %>%
st_set_geometry(NULL) %>%
group_by(NAME_2) %>%
summarise(
n_health_facilities = n(),
.groups = "drop"
) %>%
rename(district_name = NAME_2)
# Add to districts_map
districts_map <- districts_map %>%
left_join(hf_count_by_district, by = c("NAME_2" = "district_name")) %>%
mutate(n_health_facilities = replace_na(n_health_facilities, 0))districts_map %>%
st_set_geometry(NULL) %>%
summarise(
Mean_HF = mean(n_health_facilities, na.rm = TRUE),
Median_HF = median(n_health_facilities, na.rm = TRUE),
Min_HF = min(n_health_facilities, na.rm = TRUE),
Max_HF = max(n_health_facilities, na.rm = TRUE),
Total_HF = sum(n_health_facilities, na.rm = TRUE)
) %>%
kable(
caption = "Table 3: Health Facility Count Summary by District",
digits = 1,
col.names = c("Mean", "Median", "Min", "Max", "Total")
)| Mean | Median | Min | Max | Total |
|---|---|---|---|---|
| 4.9 | 1.5 | 0 | 49 | 147 |
Interpretation: Districts with fewer health facilities may experience reduced healthcare access, potentially contributing to higher stunting rates even when average distances appear reasonable.
Buffer zones help visualize healthcare service coverage areas.
# Transform to projected CRS for accurate distance calculations (UTM Zone 35S for Rwanda)
health_fac_utm <- st_transform(health_fac_sf, crs = 32735)
districts_utm <- st_transform(districts_map, crs = 32735)
# Create buffer zones (5km and 10km radii)
buffer_5km <- st_buffer(health_fac_utm, dist = 5000) # 5km buffer
buffer_10km <- st_buffer(health_fac_utm, dist = 10000) # 10km buffer
# Union buffers to show coverage areas
coverage_5km <- st_union(buffer_5km)
coverage_10km <- st_union(buffer_10km)
# Transform back to WGS84 for mapping
coverage_5km_wgs84 <- st_transform(coverage_5km, crs = 4326)
coverage_10km_wgs84 <- st_transform(coverage_10km, crs = 4326)Buffer Interpretation: - 5km buffer: Represents areas with reasonable walking/short-distance access - 10km buffer: Represents areas with moderate access requiring transportation
# Normalize indicators (0-100 scale, higher = worse)
districts_map <- districts_map %>%
mutate(
# Normalize stunting (already in %, higher = worse)
stunting_score = stunting_prev,
# Normalize price (scale to 0-100, higher price = worse)
price_score = (mean_price_rwf - min(mean_price_rwf, na.rm = TRUE)) /
(max(mean_price_rwf, na.rm = TRUE) - min(mean_price_rwf, na.rm = TRUE)) * 100,
# Normalize distance (scale to 0-100, longer distance = worse)
distance_score = (avg_dist_km_to_hf - min(avg_dist_km_to_hf, na.rm = TRUE)) /
(max(avg_dist_km_to_hf, na.rm = TRUE) - min(avg_dist_km_to_hf, na.rm = TRUE)) * 100,
# Normalize HF count (inverse: fewer facilities = worse)
hf_score = 100 - ((n_health_facilities - min(n_health_facilities, na.rm = TRUE)) /
(max(n_health_facilities, na.rm = TRUE) - min(n_health_facilities, na.rm = TRUE)) * 100),
# Composite food insecurity index (equal weights)
food_insecurity_index = (stunting_score + price_score + distance_score + hf_score) / 4,
# Categorical classification
insecurity_category = case_when(
food_insecurity_index >= 75 ~ "Critical",
food_insecurity_index >= 50 ~ "High",
food_insecurity_index >= 25 ~ "Moderate",
TRUE ~ "Low"
),
insecurity_category = factor(insecurity_category,
levels = c("Low", "Moderate", "High", "Critical"))
)districts_map %>%
st_set_geometry(NULL) %>%
group_by(insecurity_category) %>%
summarise(
n_districts = n(),
avg_stunting = mean(stunting_prev, na.rm = TRUE),
avg_price = mean(mean_price_rwf, na.rm = TRUE),
avg_distance = mean(avg_dist_km_to_hf, na.rm = TRUE),
avg_hf_count = mean(n_health_facilities, na.rm = TRUE),
.groups = "drop"
) %>%
kable(
caption = "Table 4: Food Insecurity Categories - District Characteristics",
digits = 2,
col.names = c("Insecurity Level", "N Districts", "Avg Stunting (%)",
"Avg Price (RWF)", "Avg Distance (km)", "Avg HF Count")
)| Insecurity Level | N Districts | Avg Stunting (%) | Avg Price (RWF) | Avg Distance (km) | Avg HF Count |
|---|---|---|---|---|---|
| Low | 3 | 21.60 | 268.50 | 9.25 | 27.67 |
| Moderate | 15 | 25.92 | 223.43 | 9.01 | 3.53 |
| High | 12 | 25.66 | 292.83 | 14.42 | 0.92 |
Interpretation: The table shows that low-insecurity districts have the lowest average stunting (21.6%), highest average number of health facilities (27.7), a short average distance to facilities (9.25 km), and moderate maize prices (268.5 RWF). Moderate-insecurity districts have higher stunting (25.92%), very few facilities (3.53), and similar short distances (9.01 km) despite slightly lower maize prices (223.43 RWF). High-insecurity districts face similar stunting (25.66%) but highest maize prices (292.83 RWF), longest distances to facilities (14.42 km), and almost no health facilities (0.92). These facts indicate that districts with overlapping disadvantages in food affordability and healthcare access are most vulnerable, highlighting the need for interventions targeting both nutrition and service availability.
high_risk_districts <- districts_map %>%
st_set_geometry(NULL) %>%
filter(food_insecurity_index >= 60) %>%
arrange(desc(food_insecurity_index)) %>%
select(
District = NAME_2,
Province = NAME_1,
`Insecurity Index` = food_insecurity_index,
`Stunting (%)` = stunting_prev,
`Price (RWF)` = mean_price_rwf,
`Distance (km)` = avg_dist_km_to_hf,
`HF Count` = n_health_facilities
)
high_risk_districts %>%
kable(
caption = "Table 5: High-Risk Districts (Food Insecurity Index ≥ 60)",
digits = 2
)| District | Province | Insecurity Index | Stunting (%) | Price (RWF) | Distance (km) | HF Count |
|---|---|---|---|---|---|---|
| Gatsibo | Iburasirazuba | 71.64 | 29.0 | 312.9 | 21.62 | 0 |
| Gisagara | Amajyepfo | 69.89 | 26.9 | 361.7 | 16.22 | 2 |
| Karongi | Iburengerazuba | 63.45 | 22.9 | 389.1 | 9.11 | 3 |
| Ruhango | Amajyepfo | 63.16 | 25.4 | 277.2 | 19.03 | 0 |
| Nyamagabe | Amajyepfo | 63.01 | 26.4 | 350.2 | 11.73 | 2 |
Priority Alert: Districts in this table require immediate multi-sectoral interventions addressing nutrition, food access, and healthcare infrastructure simultaneously.
health_stunting_data %>%
summarise(
Mean_Distance = mean(avg_dist_km_to_hf),
Median_Distance = median(avg_dist_km_to_hf),
Min_Distance = min(avg_dist_km_to_hf),
Max_Distance = max(avg_dist_km_to_hf),
SD = sd(avg_dist_km_to_hf)
) %>%
kable(
caption = "Table 2: Health Facility Access Distance Summary (km)",
digits = 2,
col.names = c("Mean", "Median", "Min", "Max", "Std Dev")
)| Mean | Median | Min | Max | Std Dev |
|---|---|---|---|---|
| 11.2 | 11.25 | 1.27 | 22.44 | 5.4 |
Interpretation: Table 2 shows that the average distance to the nearest health facility across districts is 11.2 km, with a median of 11.25 km, a minimum of 1.27 km, and a maximum of 22.44 km, and a standard deviation of 5.4 km, indicating substantial variation in access. The positive correlation of 0.327 with stunting suggests that districts farther from health facilities tend to have higher stunting rates. This highlights that physical proximity alone matters, but effective healthcare access also depends on facility quality, availability of services, and utilization rates. Districts with longer distances and limited services are likely at greater risk of child malnutrition, emphasizing the need to improve both coverage and quality of healthcare.
# Calculate centroids for labels
districts_centroids <- st_centroid(districts_map)
tmap_mode("plot")
tm_shape(districts_map) +
tm_fill(
col = "stunting_prev",
fill.scale = tm_scale(values = "YlOrRd", n = 6, style = "quantile"),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "white", lwd = 1) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_shape(districts_centroids) +
tm_text(
text = "stunting_prev",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Child Stunting Prevalence (%) by District",
title.size = 1.2,
title.position = c("center", "top"),
frame = FALSE
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom"))Figure 4: Choropleth map showing district-level stunting prevalence with values displayed on each district.
tm_shape(districts_map) +
tm_fill(
col = "mean_price_rwf",
fill.scale = tm_scale(values = "PuBu", n = 6, style = "quantile"),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "white", lwd = 1) +
tm_shape(markets_sf) +
tm_dots(col = "darkred", size = 0.6, shape = 20, alpha = 0.7) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_shape(districts_centroids) +
tm_text(
text = "mean_price_rwf",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Average Maize Prices (RWF/kg) by District",
title.size = 1.2,
title.position = c("center", "top"),
frame = FALSE
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom")) +
tm_add_legend(
type = "symbol",
labels = "Market Location",
col = "darkred",
shape = 20,
size = 0.5
)Figure 5: Choropleth map of average maize prices (RWF/kg) by district with values displayed.
tm_shape(districts_map) +
tm_fill(
col = "avg_dist_km_to_hf",
fill.scale = tm_scale(values = "Blues", n = 6, style = "quantile"),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "white", lwd = 1) +
tm_shape(health_fac_sf) +
tm_dots(col = "red", size = 0.6, shape = 3, alpha = 0.6) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_shape(districts_centroids) +
tm_text(
text = "avg_dist_km_to_hf",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Average Distance to Health Facility (km) by District",
title.size = 1.2,
title.position = c("center", "top"),
frame = FALSE
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom")) +
tm_add_legend(
type = "symbol",
labels = "Health Facility",
col = "red",
shape = 3,
size = 0.5
)Figure 6: Healthcare access map showing average distance (km) to nearest health facility with values.
tm_shape(districts_map) +
tm_fill(
col = "n_health_facilities",
fill.scale = tm_scale(values = "Greens", n = 6, style = "jenks"),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "white", lwd = 1) +
tm_shape(health_fac_sf) +
tm_dots(col = "darkblue", size = 0.6, shape = 18, alpha = 0.7) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_shape(districts_centroids) +
tm_text(
text = "n_health_facilities",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Number of Health Facilities by District",
title.size = 1.2,
title.position = c("center", "top"),
frame = FALSE
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom")) +
tm_add_legend(
type = "symbol",
labels = "Health Facility Location",
col = "darkblue",
shape = 18,
size = 0.5
)Figure 7: Number of health facilities per district with counts displayed.
tm_shape(districts_map) +
tm_fill(
col = "food_insecurity_index",
fill.scale = tm_scale(values = "RdYlGn", n = 7, style = "quantile", midpoint = 50, reverse = TRUE),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "gray30", lwd = 1) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2.5) +
tm_shape(districts_centroids) +
tm_text(
text = "food_insecurity_index",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Composite Food Insecurity Index (0-100) by District",
title.size = 1.3,
title.position = c("center", "top"),
frame = FALSE,
bg.color = "lightblue"
) +
tm_compass(position = c("left", "bottom"), size = 2) +
tm_scalebar(position = c("right", "bottom"), text.size = 0.7)Figure 8: Composite Food Insecurity Index (0-100) with values displayed per district.
Critical Findings: Districts shown in dark red represent areas where multiple vulnerabilities converge, creating compounded food insecurity risks requiring immediate comprehensive interventions.
tm_shape(districts_map) +
tm_fill(col = "gray90") +
tm_borders(col = "gray50", lwd = 0.5) +
tm_shape(coverage_10km_wgs84) +
tm_fill(col = "lightblue", alpha = 0.4) +
tm_shape(coverage_5km_wgs84) +
tm_fill(col = "blue", alpha = 0.5) +
tm_shape(health_fac_sf) +
tm_dots(col = "darkred", size = 0.5, shape = 3, alpha = 0.9) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_layout(
title = "Health Facility Service Coverage Areas (5km & 10km Buffers)",
title.size = 1,
title.position = c("center", "top"),
frame = FALSE
) +
tm_add_legend(
type = "fill",
labels = c("10km Coverage", "5km Coverage"),
col = c("lightblue", "blue"),
alpha = c(0.4, 0.5)
) +
tm_add_legend(
type = "symbol",
labels = "Health Facility",
col = "darkred",
shape = 3,
size = 0.5
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom"))Figure 9: Health facility service coverage areas. Light blue shows 10km buffers, dark blue shows 5km buffers. White gaps indicate underserved areas.
Coverage Analysis: White/gray areas outside buffer zones represent gaps in healthcare coverage where populations must travel more than 10km to reach health services.
District-Level Food Security Indicators:
districts_map %>%
st_set_geometry(NULL) %>%
select(
District = NAME_2,
Province = NAME_1,
`Stunting (%)` = stunting_prev,
`Maize Price (RWF/kg)` = mean_price_rwf,
`Distance to HF (km)` = avg_dist_km_to_hf,
`HF Count` = n_health_facilities,
`Insecurity Index` = food_insecurity_index,
`Risk Category` = insecurity_category,
`N Children` = n_children
) %>%
datatable(
caption = "Table 6: Complete District-Level Food Security Indicators",
filter = "top",
options = list(
pageLength = 10,
scrollX = TRUE,
dom = 'Bfrtip'
),
rownames = FALSE
) %>%
formatRound(columns = c("Stunting (%)", "Maize Price (RWF/kg)", "Distance to HF (km)", "Insecurity Index"), digits = 2) %>%
formatStyle(
'Risk Category',
backgroundColor = styleEqual(
c('Low', 'Moderate', 'High', 'Critical'),
c('#d4edda', '#fff3cd', '#f8d7da', '#721c24')
),
color = styleEqual(
c('Low', 'Moderate', 'High', 'Critical'),
c('#155724', '#856404', '#721c24', 'white')
)
)# Create household-district aggregations
household_district <- households %>%
group_by(district_name) %>%
summarise(
# Livelihood composition
pct_farming = mean(livelihood_type == "Farming", na.rm = TRUE) * 100,
pct_business = mean(livelihood_type == "Business", na.rm = TRUE) * 100,
pct_salaried = mean(livelihood_type == "Salaried", na.rm = TRUE) * 100,
pct_casual_work = mean(livelihood_type == "Casual work", na.rm = TRUE) * 100,
# Food production capacity
pct_vegetable_garden = mean(vegetable_garden == 1, na.rm = TRUE) * 100,
avg_livestock_count = mean(livestock_count, na.rm = TRUE),
# Food consumption patterns
avg_staples_days = mean(staples_days_7, na.rm = TRUE),
avg_pulses_days = mean(pulses_days_7, na.rm = TRUE),
avg_veg_days = mean(veg_days_7, na.rm = TRUE),
avg_protein_days = mean(protein_days_7, na.rm = TRUE),
avg_dairy_days = mean(dairy_days_7, na.rm = TRUE),
avg_oil_days = mean(oil_days_7, na.rm = TRUE),
# Food security metrics
avg_fcs = mean(fcs, na.rm = TRUE),
avg_coping_score = mean(coping_score, na.rm = TRUE),
pct_food_insecure = mean(food_security_cat_std == "Food insecure", na.rm = TRUE) * 100,
pct_marginally_secure = mean(food_security_cat_std == "Marginally food secure", na.rm = TRUE) * 100,
pct_food_secure = mean(food_security_cat_std == "Food secure", na.rm = TRUE) * 100,
# Economic indicators
avg_pct_food_purchased = mean(pct_food_purchased, na.rm = TRUE),
avg_pct_budget_on_food = mean(pct_budget_on_food, na.rm = TRUE),
avg_monthly_food_exp = mean(monthly_food_expenditure_frw, na.rm = TRUE),
median_monthly_food_exp = median(monthly_food_expenditure_frw, na.rm = TRUE),
# Sample size
n_households = n(),
.groups = "drop"
)
# Create sector-level aggregations
household_sector <- households %>%
filter(!is.na(sector_name) & !is.na(district_name)) %>%
group_by(district_name, sector_name) %>%
summarise(
pct_vegetable_garden = mean(vegetable_garden == 1, na.rm = TRUE) * 100,
avg_livestock_count = mean(livestock_count, na.rm = TRUE),
avg_fcs = mean(fcs, na.rm = TRUE),
avg_coping_score = mean(coping_score, na.rm = TRUE),
pct_food_insecure = mean(food_security_cat_std == "Food insecure", na.rm = TRUE) * 100,
avg_monthly_food_exp = mean(monthly_food_expenditure_frw, na.rm = TRUE),
avg_pct_budget_on_food = mean(pct_budget_on_food, na.rm = TRUE),
pct_casual_work = mean(livelihood_type == "Casual work", na.rm = TRUE) * 100,
pct_farming = mean(livelihood_type == "Farming", na.rm = TRUE) * 100,
n_households = n(),
.groups = "drop"
)
# Merge with spatial data
districts_map <- districts_map %>%
left_join(household_district, by = c("NAME_2" = "district_name"))
# Create sector spatial data
sectors_map <- sectors %>%
left_join(household_sector, by = c("NAME_2" = "district_name", "NAME_3" = "sector_name"))# Prepare data for stacked bar chart
livelihood_data <- household_district %>%
select(district_name, pct_farming, pct_business, pct_salaried, pct_casual_work) %>%
pivot_longer(cols = starts_with("pct_"),
names_to = "livelihood",
values_to = "percentage") %>%
mutate(livelihood = case_when(
livelihood == "pct_farming" ~ "Farming",
livelihood == "pct_business" ~ "Business",
livelihood == "pct_salaried" ~ "Salaried",
livelihood == "pct_casual_work" ~ "Casual Work"
))
ggplot(livelihood_data, aes(x = reorder(district_name, percentage),
y = percentage,
fill = livelihood)) +
geom_bar(stat = "identity", position = "stack") +
scale_fill_brewer(palette = "Set2", name = "Livelihood Type") +
labs(
title = "Livelihood Type Distribution by District",
subtitle = "Percentage of households by primary income source",
x = "District",
y = "Percentage (%)",
caption = "Source: Household Survey Data"
) +
coord_flip() +
theme_minimal(base_size = 11) +
theme(
legend.position = "bottom",
panel.grid.major.y = element_blank()
)Figure 11: Distribution of livelihood types across districts. Farming dominates in many areas, but casual work prevalence varies significantly.
# Analyze food insecurity by livelihood type
livelihood_insecurity <- households %>%
filter(!is.na(livelihood_type) & !is.na(food_security_cat_std)) %>%
group_by(livelihood_type, food_security_cat_std) %>%
summarise(n = n(), .groups = "drop") %>%
group_by(livelihood_type) %>%
mutate(percentage = n / sum(n) * 100)
ggplot(livelihood_insecurity,
aes(x = livelihood_type, y = percentage, fill = food_security_cat_std)) +
geom_bar(stat = "identity", position = "fill", width = 0.7) +
scale_y_continuous(labels = scales::percent_format(scale = 1)) +
scale_fill_manual(
values = c("Food insecure" = "#d7191c",
"Marginally food secure" = "#fdae61",
"Food secure" = "#1a9641"),
name = "Food Security Status"
) +
labs(
title = "Food Security Status by Livelihood Type",
subtitle = "Percentage distribution showing vulnerability of casual workers",
x = "Livelihood Type",
y = "Percentage (%)",
caption = "Note: Higher red portions indicate greater food insecurity"
) +
theme_minimal(base_size = 12) +
theme(
legend.position = "bottom",
axis.text.x = element_text(angle = 0, hjust = 0.5)
)Figure 12: Food insecurity rates vary dramatically by livelihood type, with casual workers showing the highest vulnerability.
# Summary statistics by livelihood
livelihood_summary <- households %>%
filter(!is.na(livelihood_type)) %>%
group_by(livelihood_type) %>%
summarise(
`N Households` = n(),
`Food Insecure (%)` = mean(food_security_cat_std == "Food insecure", na.rm = TRUE) * 100,
`Avg FCS` = mean(fcs, na.rm = TRUE),
`Avg Coping Score` = mean(coping_score, na.rm = TRUE),
`Avg Monthly Food Exp (RWF)` = mean(monthly_food_expenditure_frw, na.rm = TRUE),
`% Budget on Food` = mean(pct_budget_on_food, na.rm = TRUE),
`% with Veg Garden` = mean(vegetable_garden == 1, na.rm = TRUE) * 100,
`Avg Livestock` = mean(livestock_count, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(`Food Insecure (%)`))
livelihood_summary %>%
kable(
caption = "Table 7: Food Security and Economic Indicators by Livelihood Type",
digits = 1
) %>%
kableExtra::kable_styling(bootstrap_options = c("striped", "hover")) %>%
kableExtra::row_spec(0, bold = TRUE, color = "white", background = "#3498db")| livelihood_type | N Households | Food Insecure (%) | Avg FCS | Avg Coping Score | Avg Monthly Food Exp (RWF) | % Budget on Food | % with Veg Garden | Avg Livestock |
|---|---|---|---|---|---|---|---|---|
| Business | 2161 | 0 | 44.8 | 1.5 | 239235.5 | 52.5 | 60.0 | 2.4 |
| Casual labour | 1459 | 0 | 44.3 | 1.5 | 235120.6 | 52.4 | 58.4 | 2.4 |
| Crop farming | 6708 | 0 | 44.4 | 1.5 | 237271.7 | 52.6 | 59.5 | 2.4 |
| Livestock | 2270 | 0 | 44.7 | 1.5 | 236791.5 | 53.0 | 60.2 | 2.4 |
| Salaried | 2215 | 0 | 44.4 | 1.5 | 237280.2 | 52.4 | 60.1 | 2.4 |
Key Finding: Casual workers show the highest food insecurity rates, lowest food consumption scores, and highest budget burden from food expenses, indicating this group requires targeted support.
tm_shape(districts_map) +
tm_fill(
col = "pct_vegetable_garden",
fill.scale = tm_scale(values = "Greens", n = 7, style = "quantile"),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "white", lwd = 1) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_shape(districts_centroids) +
tm_text(
text = "pct_vegetable_garden",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Household Vegetable Garden Prevalence (%) by District",
title.size = 1.2,
frame = FALSE
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom"))Figure 13: Percentage of households with vegetable gardens by district with values displayed.
veg_garden_comparison <- households %>%
filter(!is.na(vegetable_garden) & !is.na(food_security_cat_std)) %>%
mutate(has_garden = if_else(vegetable_garden == 1, "Has Garden", "No Garden")) %>%
group_by(has_garden, food_security_cat_std) %>%
summarise(n = n(), .groups = "drop") %>%
group_by(has_garden) %>%
mutate(percentage = n / sum(n) * 100)
ggplot(veg_garden_comparison,
aes(x = has_garden, y = percentage, fill = food_security_cat_std)) +
geom_bar(stat = "identity", width = 0.6) +
scale_fill_manual(
values = c("Food insecure" = "#d7191c",
"Marginally food secure" = "#fdae61",
"Food secure" = "#1a9641"),
name = "Food Security Status"
) +
geom_text(aes(label = sprintf("%.1f%%", percentage)),
position = position_stack(vjust = 0.5),
color = "white", fontface = "bold", size = 4) +
labs(
title = "Food Security Status: Households With vs Without Vegetable Gardens",
subtitle = "Gardens associated with improved food security outcomes",
x = "Vegetable Garden Status",
y = "Percentage (%)",
caption = "Chi-square test can assess statistical significance"
) +
theme_minimal(base_size = 12) +
theme(legend.position = "right")Figure 14: Households with vegetable gardens show significantly better food security outcomes.
# Prepare data
diet_data <- household_district %>%
select(district_name, avg_staples_days, avg_pulses_days, avg_veg_days,
avg_protein_days, avg_dairy_days, avg_oil_days, avg_fcs) %>%
arrange(avg_fcs) %>%
slice(c(1:5, (n()-4):n())) %>%
pivot_longer(cols = starts_with("avg_") & !avg_fcs,
names_to = "food_group",
values_to = "days_per_week") %>%
mutate(food_group = case_when(
food_group == "avg_staples_days" ~ "Staples",
food_group == "avg_pulses_days" ~ "Pulses",
food_group == "avg_veg_days" ~ "Vegetables",
food_group == "avg_protein_days" ~ "Protein",
food_group == "avg_dairy_days" ~ "Dairy",
food_group == "avg_oil_days" ~ "Oils/Fats"
))
ggplot(diet_data, aes(x = reorder(district_name, days_per_week),
y = days_per_week,
fill = food_group)) +
geom_bar(stat = "identity", position = "dodge", width = 0.8) +
scale_fill_brewer(palette = "Set3", name = "Food Group") +
labs(
title = "Dietary Diversity: Days per Week Consuming Food Groups",
subtitle = "Top 5 and Bottom 5 districts by Food Consumption Score",
x = "District",
y = "Average Days per Week",
caption = "Higher values indicate more frequent consumption"
) +
coord_flip() +
theme_minimal(base_size = 11) +
theme(legend.position = "bottom")Figure 15: Average days per week consuming different food groups, showing dietary patterns across top and bottom districts.
fcs_district <- household_district %>%
select(district_name, avg_fcs) %>%
arrange(desc(avg_fcs))
ggplot(fcs_district, aes(x = reorder(district_name, avg_fcs), y = avg_fcs)) +
geom_segment(aes(xend = district_name, yend = 0), color = "gray70", size = 0.8) +
geom_point(aes(color = avg_fcs), size = 4) +
geom_hline(yintercept = 35, linetype = "dashed", color = "#d7191c", size = 1) +
geom_hline(yintercept = 42, linetype = "dashed", color = "#fdae61", size = 1) +
annotate("text", x = 28, y = 37, label = "Poor threshold (35)",
color = "#d7191c", fontface = "bold", size = 3) +
annotate("text", x = 28, y = 44, label = "Borderline threshold (42)",
color = "#fdae61", fontface = "bold", size = 3) +
scale_color_gradient2(low = "#d7191c", mid = "#fdae61", high = "#1a9641",
midpoint = 42, name = "FCS") +
labs(
title = "Average Food Consumption Score (FCS) by District",
subtitle = "Scores below 35 indicate poor food consumption; 35-42 borderline",
x = "District",
y = "Food Consumption Score",
caption = "Dashed lines represent food security thresholds"
) +
coord_flip() +
theme_minimal(base_size = 11) +
theme(legend.position = "right")Figure 16: Distribution of Food Consumption Scores across districts, with reference lines for food security thresholds.
tm_shape(districts_map) +
tm_fill(
col = "avg_monthly_food_exp",
fill.scale = tm_scale(values = "YlOrBr", n = 7, style = "quantile"),
fill.legend = tm_legend(title = "", show = FALSE) # Hide legend
) +
tm_borders(col = "white", lwd = 1) +
tm_shape(province_boundaries) +
tm_borders(col = "black", lwd = 2) +
tm_shape(districts_centroids) +
tm_text(
text = "avg_monthly_food_exp",
size = 0.6,
col = "black",
fontface = "bold",
bg.color = "white",
bg.alpha = 0.7,
remove.overlap = FALSE
) +
tm_layout(
title = "Average Monthly Food Expenditure (RWF) by District",
title.size = 1.2,
frame = FALSE
) +
tm_compass(position = c("left", "bottom")) +
tm_scalebar(position = c("right", "bottom"))Figure 17: Average monthly food expenditure (RWF) by district with values displayed.
# Switch to interactive mode
tmap_mode("view")
# Create the interactive map with multiple layers
interactive_map <- tm_basemap(server = "OpenStreetMap") +
# Layer 1: District Boundaries
tm_shape(districts_map, name = "Districts") +
tm_borders(col = "black", lwd = 2, alpha = 0.5) +
# Layer 2: Stunting Prevalence
tm_shape(districts_map, name = "Stunting Prevalence (%)") +
tm_fill(
col = "stunting_prev",
fill.scale = tm_scale(values = "YlOrRd", n = 6, style = "quantile"),
fill.legend = tm_legend(title = "Stunting (%)", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "Stunting (%)" = "stunting_prev", "N Children" = "n_children"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 3: Maize Prices
tm_shape(districts_map, name = "Maize Prices (RWF/kg)") +
tm_fill(
col = "mean_price_rwf",
fill.scale = tm_scale(values = "PuBu", n = 6, style = "quantile"),
fill.legend = tm_legend(title = "Price (RWF)", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "Avg Price (RWF)" = "mean_price_rwf"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 4: Healthcare Distance
tm_shape(districts_map, name = "Distance to Health Facility (km)") +
tm_fill(
col = "avg_dist_km_to_hf",
fill.scale = tm_scale(values = "Blues", n = 6, style = "quantile"),
fill.legend = tm_legend(title = "Distance (km)", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "Avg Distance (km)" = "avg_dist_km_to_hf"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 5: Health Facility Count
tm_shape(districts_map, name = "Health Facility Density") +
tm_fill(
col = "n_health_facilities",
fill.scale = tm_scale(values = "Greens", n = 6, style = "jenks"),
fill.legend = tm_legend(title = "HF Count", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "N Facilities" = "n_health_facilities"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 6: Food Insecurity Index
tm_shape(districts_map, name = "Food Insecurity Index") +
tm_fill(
col = "food_insecurity_index",
fill.scale = tm_scale(values = "RdYlGn", n = 7, style = "quantile", reverse = TRUE),
fill.legend = tm_legend(title = "Insecurity Index", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "Index (0-100)" = "food_insecurity_index",
"Category" = "insecurity_category"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 7: Vegetable Garden Prevalence
tm_shape(districts_map, name = "Vegetable Garden (%)") +
tm_fill(
col = "pct_vegetable_garden",
fill.scale = tm_scale(values = "Greens", n = 7, style = "quantile"),
fill.legend = tm_legend(title = "Veg Garden (%)", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "% with Garden" = "pct_vegetable_garden"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 8: Monthly Food Expenditure
tm_shape(districts_map, name = "Monthly Food Expenditure (RWF)") +
tm_fill(
col = "avg_monthly_food_exp",
fill.scale = tm_scale(values = "YlOrBr", n = 7, style = "quantile"),
fill.legend = tm_legend(title = "Expenditure (RWF)", orientation = "portrait"),
popup.vars = c("District" = "NAME_2", "Avg Exp (RWF)" = "avg_monthly_food_exp",
"% Budget on Food" = "avg_pct_budget_on_food"),
id = "NAME_2",
alpha = 0.7
) +
# Layer 9: Health Facilities Points
tm_shape(health_fac_sf, name = "Health Facilities") +
tm_markers(
shape = 3,
col = "red",
size = 0.05,
popup.vars = c("Facility" = "health_facility_name", "Type" = "type"),
clustering = TRUE
) +
# Layer 10: Markets Points
tm_shape(markets_sf, name = "Markets") +
tm_markers(
shape = 20,
col = "darkred",
size = 0.05,
popup.vars = c("Market" = "market_name"),
clustering = TRUE
) +
# Layer 11: Province Boundaries
tm_shape(province_boundaries, name = "Province Boundaries") +
tm_borders(col = "black", lwd = 3, alpha = 0.8)
# Display the interactive map
interactive_map