Geospatial Analysis of Affected Adolescents in DRC

Author

Mulumba Kalonji Alain

Published

June 8, 2025

```{r message=FALSE}
setwd("C:/Users/Alain/Documents")


options(repos = c(CRAN = "https://cloud.r-project.org/"))

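# maptools has been archived on CRAN and is omitted; sf provides the as.owin() methods used below
requiredPackages <- c("sf", "RColorBrewer", "spatstat.geom", "spatstat.utils", "ggplot2",
                      "spatstat", "httr", "ggspatial", "viridis", "dplyr", "readxl",
                      "forcats", "tidyr", "stringr", "rnaturalearth")
# Install any missing package, then attach it
for (pkg in requiredPackages) {
  if (!require(pkg, character.only = TRUE)) {
    install.packages(pkg)
    library(pkg, character.only = TRUE)
  }
}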
```

1 Introduction

1.1 Context and Objectives

This geospatial analysis aims to map the distribution of affected adolescents in the Democratic Republic of Congo (DRC) across three key dimensions:

  1. Geographic distribution by province
  2. Gender disparities (male/female)
  3. Relative intensity (absolute numbers and population ratios)

The primary objective is to identify priority zones for targeted interventions by answering:
- Where are the most critical needs?
- Are there significant gender disparities?
- How do provinces compare after population standardization?

1.2 Methodology

Data sources:
- Geographic layers:
  - Provincial boundaries (IGF)
  - Natural Earth base map
- Demographic data:
  - Affected adolescent census (2023)
  - INS population projections

Technical approach:
- Bivariate mapping (intensity + gender)
- Standardized ratios per 1,000 inhabitants
- Static map visualizations with ggplot2 and sf
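The standardized ratio in the list above can be made concrete with a small worked example. The sketch below uses the Bas-Uele figures reported later in this document (24,889 affected adolescents, 1,419,000 inhabitants). The helper name `ratio_per_1000()` is an illustrative assumption and is not part of the original code.

```r
# Hypothetical helper: cases per 1,000 inhabitants
ratio_per_1000 <- function(cases, population) {
  stopifnot(all(population > 0))
  cases / population * 1000
}

# Bas-Uele: 24,889 affected adolescents, 1,419,000 inhabitants
bas_uele <- ratio_per_1000(cases = 24889, population = 1419000)
round(bas_uele, 2)
```

Dividing by the provincial population rather than by the provincial case total is what makes provinces of different sizes comparable.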

1.3 Key Results

Three main visualizations will be presented:

  1. Gender-comparative map (facets)
  2. Bivariate intensity/gender analysis
  3. Geographic reference map (province names)

1.4 Part I: Exploration of DRC

2 Administrative Composition

2.1 Provincial Structure

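`banderies_nad83` holds the 26 provincial boundaries, transformed to geographic (longitude/latitude) coordinates on the NAD83 datum.

`rdc_83` is the Natural Earth admin-1 layer used as a more detailed map background. Despite its name, it is transformed to Web Mercator (EPSG:3857), and it contains 11 provinces rather than 26.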

```{r}
banderies <- st_read("C:/Users/Alain/Downloads/province26/provinces26/Province26.shp")
banderies_nad83 <- st_transform(banderies, crs = "+proj=longlat +datum=NAD83")
banderies_nad83$CODE_INS

library(rnaturalearth)
rdc <- ne_states(country = "Democratic Republic of the Congo", returnclass = "sf")
rdc_83<- st_transform(rdc, crs = 3857)
rdc_83$adm1_code

ggplot() + 
  geom_sf(rdc_83, mapping = aes(geometry=geometry))
```
Reading layer `Province26' from data source 
  `C:\Users\Alain\Downloads\province26\provinces26\Province26.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 26 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 1358728 ymin: -1501941 xmax: 3481773 ymax: 596454.8
Projected CRS: World_Mercator
 [1]  10  20 302 303 305 402 403 405 406 407 502 503 504 505  61  62  63 704 705
[20] 706 707 802 803 804 902 903
 [1] "COD-1461" "COD-1871" "COD-1884" "COD-1883" "COD-1559" "COD-1894"
 [7] "COD-1897" "COD-1898" "COD-1872" "COD-1896" "COD-1895"
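The output above shows the provincial shapefile arriving in a projected CRS (World_Mercator), while the code transforms one layer to longitude/latitude and the other to EPSG:3857. A minimal sketch of how such differences can be checked before overlaying layers is given below. It uses only `st_crs()`, `st_is_longlat()` and `st_transform()` from sf; the single point is an approximate, illustrative location and not part of the original data.

```r
library(sf)

# One illustrative point in longitude/latitude (WGS 84); coordinates are approximate
pt_wgs84 <- st_sfc(st_point(c(15.3, -4.3)), crs = 4326)

# Reproject to Web Mercator (EPSG:3857), whose units are metres
pt_merc <- st_transform(pt_wgs84, crs = 3857)

st_is_longlat(pt_wgs84)              # TRUE: coordinates are degrees
st_is_longlat(pt_merc)               # FALSE: coordinates are projected
st_crs(pt_wgs84) == st_crs(pt_merc)  # FALSE: transform one layer before overlaying
st_coordinates(pt_merc)
```

Layers drawn together with `geom_sf()` are reprojected automatically, but base `plot(..., add = TRUE)` is not, so an explicit check of this kind is useful before overlaying.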

```{r}
provinces <- st_read("C:/Users/Alain/Downloads/geoBoundaries-COD-ADM2-all/geoBoundaries-COD-ADM2.shp")
provinces_nad83 <- st_transform(provinces, crs = "+proj=longlat +datum=NAD83")

ggplot() + 
  geom_sf(provinces_nad83, mapping = aes(geometry=geometry))



crds <- st_centroid(st_make_valid(provinces_nad83)) 
head(crds)

plot(st_geometry(provinces_nad83))  # plot the outlines
plot(st_geometry(crds), pch=21, bg='red', add=TRUE)
```
Reading layer `geoBoundaries-COD-ADM2' from data source 
  `C:\Users\Alain\Downloads\geoBoundaries-COD-ADM2-all\geoBoundaries-COD-ADM2.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 189 features and 5 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 12.20566 ymin: -13.456 xmax: 31.30522 ymax: 5.386098
Geodetic CRS:  WGS 84
Simple feature collection with 6 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 23.7547 ymin: 0.7354574 xmax: 30.70698 ymax: 4.471862
Geodetic CRS:  +proj=longlat +datum=NAD83
   shapeName shapeISO                 shapeID shapeGroup shapeType
1        Aba     <NA> 63176286B20411552692002        COD      ADM2
2      Aketi     <NA> 63176286B77771623677962        COD      ADM2
3       Ango     <NA> 63176286B24678106401017        COD      ADM2
4    Ariwara     <NA> 63176286B55306571020797        COD      ADM2
5        Aru     <NA> 63176286B14719092717616        COD      ADM2
6 Bafwasende     <NA> 63176286B50174012793828        COD      ADM2
                    geometry
1  POINT (30.23818 3.858422)
2    POINT (23.7547 2.98004)
3  POINT (26.07273 4.471862)
4  POINT (30.70698 3.136862)
5  POINT (30.53241 3.065347)
6 POINT (26.99695 0.7354574)


5 Kasaï-Oriental: the province hosting the most extensive kimberlite diamond fields

```{r}
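unique(rdc$name)
# Select the province by name (case-insensitive match)
Kasai_Oriental <- rdc[grepl("Kasaï-Oriental", rdc$name, ignore.case = TRUE), ]
nrow(Kasai_Oriental)

# Project to metres before building the spatstat window.
# Note: EPSG:32733 is UTM zone 33S (12-18°E); the province lies near 24°E,
# so a closer zone (EPSG:32734 or 32735) would reduce distortion.
Kasai_Oriental_proj <- st_transform(Kasai_Oriental, crs = 32733)
Kasai_Oriental_geom <- st_geometry(Kasai_Oriental_proj)[[1]]  # first (only) geometry
Kasai_Oriental_win <- as.owin(Kasai_Oriental_geom)            # observation window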
```
 [1] "Équateur"         "Bandundu"         "Kinshasa City"    "Bas-Congo"       
 [5] "Orientale"        "Sud-Kivu"         "Katanga"          "Nord-Kivu"       
 [9] "Kasaï-Occidental" "Kasaï-Oriental"   "Maniema"         
[1] 1

6 Plot to confirm the observation window

```{r}
plot(Kasai_Oriental_win, main = "Observation window - Kasaï-Oriental")
ggplot() + 
  geom_sf(Kasai_Oriental_proj, mapping = aes(geometry=geometry))

st_bbox(Kasai_Oriental_proj)  # extent in projected metres
```
   xmin    ymin    xmax    ymax 
1268838 9113480 1757329 9804526 
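The eastings in this bounding box exceed 1,200,000 m, far from the 500,000 m central easting of a UTM zone, which suggests the province lies well outside zone 33S. A minimal sketch of how a better-fitting zone could be chosen from a longitude and latitude follows. The helper `utm_epsg()` is hypothetical and ignores the special zone exceptions around Norway and Svalbard; the coordinates are the `longitude` and `latitude` attributes Natural Earth reports for the province.

```r
# Hypothetical helper: EPSG code of the WGS 84 / UTM zone containing a point
utm_epsg <- function(lon, lat) {
  zone <- floor((lon + 180) / 6) + 1       # 6-degree zones, numbered from 180°W
  base <- if (lat >= 0) 32600 else 32700   # 326xx = northern, 327xx = southern hemisphere
  base + zone
}

# Natural Earth attributes for Kasaï-Oriental: longitude 24.0842, latitude -4.40511
utm_epsg(24.0842, -4.40511)
```

The returned code could be passed to `st_transform(..., crs = ...)` in place of 32733, at the cost of changing the projected coordinates printed above.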


7 Contour map with centroids (sf and data.frame classes)

```{r}
crds <- st_centroid(st_make_valid(Kasai_Oriental_proj)) 
head(crds)

plot(st_geometry(Kasai_Oriental_proj))  
plot(st_geometry(crds), pch=21, bg='red', add=TRUE)
```
Simple feature collection with 1 feature and 121 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 1496407 ymin: 9491290 xmax: 1496407 ymax: 9491290
Projected CRS: WGS 84 / UTM zone 33S
                         featurecla scalerank adm1_code diss_me iso_3166_2
3789 Admin-1 states provinces lakes         5  COD-1896    1896      CD-KE
     wikipedia iso_a2 adm0_sr           name              name_alt name_local
3789      <NA>     CD       1 Kasaï-Oriental East Kasai|Kasai East       <NA>
         type  type_en code_local code_hasc note hasc_maybe region region_cod
3789 Province Province       <NA>     CD.KR <NA>       <NA>   <NA>       <NA>
     provnum_ne gadm_level check_me datarank abbrev postal area_sqkm sameascity
3789          9          1        0        3   <NA>     KR         0        -99
     labelrank name_len mapcolor9 mapcolor13 fips fips_alt  woe_id
3789         6       14         4          7 CG04     <NA> 2344980
                                            woe_label       woe_name latitude
3789 Kasai-Oriental, CD, Democratic Republic of Congo Kasaï-Oriental -4.40511
     longitude sov_a3 adm0_a3 adm0_label                            admin
3789   24.0842    COD     COD          2 Democratic Republic of the Congo
                             geonunit gu_a3  gn_id                    gn_name
3789 Democratic Republic of the Congo   COD 214138 Province du Kasai-Oriental
       gns_id       gns_name gn_level gn_region gn_a1_code region_sub sub_code
3789 -2046899 Kasai-Oriental        1      <NA>      CD.04       <NA>     <NA>
     gns_level gns_lang gns_adm1 gns_region min_label max_label min_zoom
3789         1      fra     CG04       <NA>         6        11        6
     wikidataid       name_ar         name_bn        name_de        name_en
3789     Q80953 كاساي الشرقية কাসাই-ওরিয়েন্টাল Kasaï-Oriental Kasaï-Oriental
            name_es        name_fr        name_el          name_hi     name_hu
3789 Kasai Oriental Kasaï-Oriental Κασάι-Οριεντάλ कासाइ-पूर्वी प्रान्त Kelet-Kasai
            name_id         name_it    name_ja        name_ko    name_nl
3789 Kasai-Oriental Kasai Orientale 東カサイ州 카사이오리앙탈 Oost-Kasaï
             name_pl        name_pt         name_ru        name_sv    name_tr
3789 Kasai Wschodnie Kasaï-Oriental Восточное Касаи Kasaï-Oriental Doğu Kasai
            name_vi  name_zh      ne_id        name_he      name_uk
3789 Kasai-Oriental 东开赛省 1159311343 קאסאי-אוריינטל Східне Касаї
          name_ur         name_fa name_zht FCLASS_ISO FCLASS_US FCLASS_FR
3789 کاسائی-مشرقی کاسای- اورینتال 东开赛省       <NA>      <NA>      <NA>
     FCLASS_RU FCLASS_ES FCLASS_CN FCLASS_TW FCLASS_IN FCLASS_NP FCLASS_PK
3789      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
     FCLASS_DE FCLASS_GB FCLASS_BR FCLASS_IL FCLASS_PS FCLASS_SA FCLASS_EG
3789      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
     FCLASS_MA FCLASS_PT FCLASS_AR FCLASS_JP FCLASS_KO FCLASS_VN FCLASS_TR
3789      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
     FCLASS_ID FCLASS_PL FCLASS_GR FCLASS_IT FCLASS_NL FCLASS_SE FCLASS_BD
3789      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
     FCLASS_UA FCLASS_TLC                geometry
3789      <NA>       <NA> POINT (1496407 9491290)

8 The Congolese railway network: an underdeveloped colonial legacy with limited modern connectivity

```{r}
railways <- st_read("C:/Users/Alain/Downloads/congo-democratic-republic-latest-free.shp/gis_osm_railways_free_1.shp")
railways_nad83 <- st_transform(railways, crs = "+proj=longlat +datum=NAD83")
str(railways)
table(railways$fclass)
plot(railways$geometry)

district <- st_read("C:/Users/Alain/Downloads/district/District.shp")
district_nad83 <- st_transform(district, crs = "+proj=longlat +datum=NAD83")


table(district$NOM)
plot(district$geometry)
```
Reading layer `gis_osm_railways_free_1' from data source 
  `C:\Users\Alain\Downloads\congo-democratic-republic-latest-free.shp\gis_osm_railways_free_1.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 1900 features and 7 fields
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 13.43202 ymin: -13.49397 xmax: 30.71915 ymax: 3.36139
Geodetic CRS:  WGS 84
Classes 'sf' and 'data.frame':  1900 obs. of  8 variables:
 $ osm_id  : chr  "4402431" "4402432" "4402433" "4402434" ...
 $ code    : int  6101 6101 6101 6101 6101 6101 6101 6101 6101 6101 ...
 $ fclass  : chr  "rail" "rail" "rail" "rail" ...
 $ name    : chr  NA NA NA NA ...
 $ layer   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ bridge  : chr  "F" "F" "F" "F" ...
 $ tunnel  : chr  "F" "F" "F" "F" ...
 $ geometry:sfc_LINESTRING of length 1900; first list element:  'XY' num [1:3, 1:2] 15.3 15.3 15.3 -4.3 -4.3 ...
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA
  ..- attr(*, "names")= chr [1:7] "osm_id" "code" "fclass" "name" ...

 funicular light_rail       rail 
         1         12       1887 
Reading layer `District' from data source 
  `C:\Users\Alain\Downloads\district\District.shp' using driver `ESRI Shapefile'
Simple feature collection with 48 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 1358728 ymin: -1501941 xmax: 3481773 ymax: 596454.8
Projected CRS: World_Mercator

    Bandundu   Bas-Fleuve     Bas-Uele         Beni         Boma       Bukavu 
           1            1            1            1            1            1 
     Butembo   Cataractes     Equateur    Gbadolite         Goma Haut-Katanga 
           1            1            1            1            1            1 
 Haut-Lomami    Haut-Uele        Ituri      Kabinda      Kananga        Kasai 
           1            1            1            1            1            1 
      Kikwit        Kindu     Kinshasa    Kisangani      Kolwezi       Kwango 
           1            1            1            1            1            1 
       Kwilu       Likasi      Lualaba   Lubumbashi       Lukaya        Lulua 
           1            1            1            1            1            1 
  Maï-Ndombe      Maniema       Matadi     Mbandaka   Mbuji-Mayi      Mongala 
           1            1            1            1            1            1 
  Mwene-Ditu    Nord-Kivu  Nord-Ubangi     Plateaux      Sankuru     Sud-Kivu 
           1            1            1            1            1            1 
  Sud-Ubangi    Tanganyka    Tshilenge       Tshopo      Tshuapa        Zongo 
           1            1            1            1            1            1 
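The `fclass` table above counts railway segments, but segment counts say little about network extent. A minimal sketch of summing line length per class with `st_length()` is shown below. The three toy linestrings, their lengths, and the choice of EPSG:32735 as a metric CRS are all assumptions for illustration and are not taken from the OSM layer.

```r
library(sf)

# Toy railway segments in a projected CRS (metres); geometries are hypothetical
toy_rail <- st_sf(
  fclass = c("rail", "rail", "light_rail"),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(3000, 4000))),  # 5 km
    st_linestring(rbind(c(0, 0), c(0, 1000))),     # 1 km
    st_linestring(rbind(c(0, 0), c(600, 800))),    # 1 km
    crs = 32735
  )
)

# Length of each segment in kilometres
toy_rail$length_km <- as.numeric(st_length(toy_rail)) / 1000

# Total length per railway class
length_by_class <- aggregate(length_km ~ fclass,
                             data = st_drop_geometry(toy_rail), FUN = sum)
length_by_class
```

Applied to the real `railways` object, the same pattern would need the layer in a metric CRS, or sf's geodesic length computation on longitude/latitude data.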

9 DRC’s parks harbor some of Earth’s rarest biodiversity, including endemic species like the Okapi, Bonobo, and Congo Peacock.

```{r}
parc <- st_read("C:/Users/Alain/Downloads/parc/Parc.shp")
parc_nad83 <- st_transform(parc, crs = "+proj=longlat +datum=NAD83")
table(parc$NOM)
plot(parc$geometry)
```
Reading layer `Parc' from data source `C:\Users\Alain\Downloads\parc\Parc.shp' using driver `ESRI Shapefile'
Simple feature collection with 44 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 12.35354 ymin: -11.19897 xmax: 30.20467 ymax: 5.388336
Geodetic CRS:  WGS 84

       Domaine de chasse de Bili-Uere     Domaine de Chasse de Bombo Lumene 
                                    1                                     1 
            Domaine de Chasse de Bomu        Domaine de Chasse de Bushimaie 
                                    1                                     1 
Domaine de Chasse de Gangala-na Bodio    Domaine de Chasse de Luama-Katanga 
                                    1                                     1 
      Domaine de chasse de Luama-Kivu    Domaine de chasse de Lubudi Sampwe 
                                    1                                     1 
     Domaine de chasse de Maika-Penge           Domaine de chasse de Mangai 
                                    1                                     1 
       Domaine de chasse de Rubi-Tele         Domaine de chasse de Rutshuru 
                                    1                                     1 
      Domaine de Chasse de Swa-Kibula      Domaine de chasse de Tshangalele 
                                    1                                     1 
                     Massif d'Itombwe                     Parc de la N'Sele 
                                    1                                     1 
             Parc Marin des Mangroves         Parc National de Kahuzi-Biega 
                                    1                                     1 
          Parc National de Kundelungu             Parc National de l'Upemba 
                                    1                                     1 
          Parc National de la Garamba             Parc National de la Maiko 
                                    1                                     1 
          Parc National de la Salonga             Parc National des Virunga 
                                    2                                     1 
                Réserve de Abumonbazi     Réserve de biosphère de la Lufira 
                                    1                                     1 
      Réserve de biosphere de la Luki      Réserve de biosphère de Yangambi 
                                    1                                     1 
                      Réserve de Bomu                        Réserve de Epi 
                                    1                                     1 
            Réserve de faune à okapis             Réserve de Lomami-Lualaba 
                                    1                                     1 
                 Réserve de Mai Mpili                    Réserve de Maniema 
                                    1                                     1 
            Réserve de Shaba Elephant          Réserve du Lac Tumba-Lediima 
                                    1                                     1 
               Réserve du Mont Kabobo                 Réserve du Sud Masisi 
                                    1                                     1 
      Réserve du triangle de la Ngiri Réserve forestière de Lomako-Yokokala 
                                    1                                     1 
   Reserve Naturelle de Kisimba Ikobo            Reserve Naturelle de Tayna 
                                    1                                     1 
          Réserve Scientifique de Luo 
                                    1 

10 Geospatial profile of Kinshasa, Democratic Republic of Congo’s capital

```{r}
plot(st_geometry(st_transform(district, crs = st_crs(provinces_nad83))))
plot(st_geometry(provinces_nad83[provinces_nad83$shapeName == "Kinshasa", ]), 
     add = TRUE, 
     col = "red")
```

11 A stark urban divide: Kinshasa’s core areas (Lukunga/Tshangu) operate at full capacity (90-100%), versus sub-30% occupancy in Maluku on the eastern outskirts.

```{r}
b0 <- st_read("C:/Users/Alain/Downloads/CD-KN-b7bd0eae-20250601-fr-gpkg/data/boundary-polygon.gpkg")
b2 <- st_read("C:/Users/Alain/Downloads/CD-KN-b7bd0eae-20250601-fr-gpkg/data/boundary-polygon-lvl2.gpkg")
b4 <- st_read("C:/Users/Alain/Downloads/CD-KN-b7bd0eae-20250601-fr-gpkg/data/boundary-polygon-lvl4.gpkg")
b7 <- st_read("C:/Users/Alain/Downloads/CD-KN-b7bd0eae-20250601-fr-gpkg/data/boundary-polygon-lvl7.gpkg")
b8 <- st_read("C:/Users/Alain/Downloads/CD-KN-b7bd0eae-20250601-fr-gpkg/data/boundary-polygon-lvl8.gpkg")
b9 <- st_read("C:/Users/Alain/Downloads/CD-KN-b7bd0eae-20250601-fr-gpkg/data/boundary-polygon-lvl9.gpkg")

target_crs <- 3857  

b0 <- st_transform(b0, crs = target_crs)
b2 <- st_transform(b2, crs = target_crs)
b4 <- st_transform(b4, crs = target_crs)
b7 <- st_transform(b7, crs = target_crs)
b8 <- st_transform(b8, crs = target_crs)
b9 <- st_transform(b9, crs = target_crs)

boundaries <- rbind(b0, b2, b4, b7, b8, b9)
nb_col <- length(unique(boundaries$NAME))
palette <- colorRampPalette(brewer.pal(9, "Set3"))(nb_col)
```
Reading layer `boundary-polygon' from data source 
  `C:\Users\Alain\Downloads\CD-KN-b7bd0eae-20250601-fr-gpkg\data\boundary-polygon.gpkg' 
  using driver `GPKG'
Simple feature collection with 234 features and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 15.12706 ymin: -5.032536 xmax: 16.53412 ymax: -3.927611
Geodetic CRS:  WGS 84
Reading layer `boundary-polygon-lvl2' from data source 
  `C:\Users\Alain\Downloads\CD-KN-b7bd0eae-20250601-fr-gpkg\data\boundary-polygon-lvl2.gpkg' 
  using driver `GPKG'
Simple feature collection with 1 feature and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 15.12706 ymin: -5.032536 xmax: 16.53412 ymax: -3.927611
Geodetic CRS:  WGS 84
Reading layer `boundary-polygon-lvl4' from data source 
  `C:\Users\Alain\Downloads\CD-KN-b7bd0eae-20250601-fr-gpkg\data\boundary-polygon-lvl4.gpkg' 
  using driver `GPKG'
Simple feature collection with 1 feature and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 15.12706 ymin: -5.032536 xmax: 16.53412 ymax: -3.927611
Geodetic CRS:  WGS 84
Reading layer `boundary-polygon-lvl7' from data source 
  `C:\Users\Alain\Downloads\CD-KN-b7bd0eae-20250601-fr-gpkg\data\boundary-polygon-lvl7.gpkg' 
  using driver `GPKG'
Simple feature collection with 24 features and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 15.12982 ymin: -5.032536 xmax: 16.53412 ymax: -3.927611
Geodetic CRS:  WGS 84
Reading layer `boundary-polygon-lvl8' from data source 
  `C:\Users\Alain\Downloads\CD-KN-b7bd0eae-20250601-fr-gpkg\data\boundary-polygon-lvl8.gpkg' 
  using driver `GPKG'
Simple feature collection with 197 features and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 15.20606 ymin: -4.440094 xmax: 15.42763 ymax: -4.296267
Geodetic CRS:  WGS 84
Reading layer `boundary-polygon-lvl9' from data source 
  `C:\Users\Alain\Downloads\CD-KN-b7bd0eae-20250601-fr-gpkg\data\boundary-polygon-lvl9.gpkg' 
  using driver `GPKG'
Simple feature collection with 11 features and 25 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 15.21345 ymin: -4.398089 xmax: 15.37933 ymax: -4.320823
Geodetic CRS:  WGS 84
```{r}
ggplot(boundaries) +
  geom_sf(aes(fill = NAME), color = "black", linewidth = 0.2, show.legend = FALSE) +
  scale_fill_manual(values = palette) +
  labs(title = "Communes, neighbourhoods and districts of Kinshasa",
       subtitle = "Each entity is shown in a distinct colour") +
  annotation_scale(location = "bl") +
  annotation_north_arrow(location = "tl", which_north = "true",
                         style = north_arrow_fancy_orienteering()) +
  theme_minimal() +
  # Label each entity with its name
  geom_sf_text(aes(label = NAME), size = 2.5, color = "black", check_overlap = TRUE)
```

```{r}
commune_zoom <- boundaries %>%
  filter(NAME %in% c("Gombe", "Limete", "Kintambo", "Bandalungwa"))

# Create a manual color palette for the 4 targeted communes
palette_zoom <- setNames(c("red", "blue", "green", "orange"),
                         c("Gombe", "Limete", "Kintambo", "Bandalungwa"))

ggplot(commune_zoom) +
  geom_sf(aes(fill = NAME), color = "black", linewidth = 0.2, show.legend = FALSE) +
  scale_fill_manual(values = palette_zoom) +
  labs(
    title = "Zoom on Selected Communes of Kinshasa",
    subtitle = "Communes: Gombe, Limete, Kintambo, Bandalungwa"
  ) +
  annotation_scale(location = "bl") +
  annotation_north_arrow(location = "tl", which_north = "true",
                         style = north_arrow_fancy_orienteering()) +
  theme_minimal() +
  geom_sf_text(aes(label = NAME), size = 2.5, color = "black", check_overlap = TRUE)
```

```{r}
# 📥 Load the data on adolescents affected by infections
ado <- read_excel("C:/Users/Alain/Downloads/ADO&JEUNES 2023.xlsx")
str(ado)
```
tibble [28 × 4] (S3: tbl_df/tbl/data.frame)
 $ Indicateur: chr [1:28] NA "Somme de Valeur" "Étiquettes de lignes" "Bas-Uele" ...
 $ (Tous)    : chr [1:28] NA "Étiquettes de colonnes" "Féminin" "13993" ...
 $ ...3      : chr [1:28] NA NA "Masculin" "10896" ...
 $ ...4      : chr [1:28] NA NA "Total général" "24889" ...
```{r}
ado_large <- ado %>%
  rename(
    Province = Indicateur,
    Femme = `(Tous)`,
    Homme = `...3`,
    Total_general = `...4`
  ) %>%
  filter(!is.na(Province)) %>%
  mutate(
    Femme = as.numeric(Femme),
    Homme = as.numeric(Homme),
    Total_general = as.numeric(Total_general)
  ) %>%
  drop_na(Femme, Homme, Total_general) %>%  # drop pivot-table header rows (NA after numeric coercion)
  distinct() %>%
  arrange(Province)

# Final display
print(ado_large)
glimpse(ado_large)

# Reshape to long format, guarding against duplicates
ado_long <- ado_large %>%
  pivot_longer(
    cols = c(Femme, Homme),
    names_to = "Gender",
    values_to = "valeur"
  ) %>%
  mutate(
    Gender = as_factor(Gender) %>%          # convert to factor
      fct_relevel("Femme", "Homme") %>%     # order the levels
      fct_recode("Female" = "Femme", "Male" = "Homme"),  # rename levels in English
    .after = Province                       # place Gender right after Province
  ) %>%
  distinct(Province, Gender, .keep_all = TRUE) %>%  # one row per Province-Gender pair
  select(Province, Gender, valeur, Total_general)

# 👀 Inspect the result
glimpse(ado_long)
if (interactive()) View(ado_long)

ado_clean <- ado_long[complete.cases(ado_long), ]
```
# A tibble: 25 × 4
   Province         Femme   Homme Total_general
   <chr>            <dbl>   <dbl>         <dbl>
 1 Bas-Uele         13993   10896         24889
 2 Haut-Katanga   2435551 2124191       4559742
 3 Haut-Lomami     111958   35086        147044
 4 Haut-Uele        80677   54107        134784
 5 Ituri            40843   36467         77310
 6 Kasaï           139262  112649        251911
 7 Kasaï-Central    78996   66261        145257
 8 Kasaï-Oriental  520251  315834        836085
 9 Kinshasa        207106  233067        440173
10 Kongo-Central    86683   77415        164098
# ℹ 15 more rows
Rows: 25
Columns: 4
$ Province      <chr> "Bas-Uele", "Haut-Katanga", "Haut-Lomami", "Haut-Uele", …
$ Femme         <dbl> 13993, 2435551, 111958, 80677, 40843, 139262, 78996, 520…
$ Homme         <dbl> 10896, 2124191, 35086, 54107, 36467, 112649, 66261, 3158…
$ Total_general <dbl> 24889, 4559742, 147044, 134784, 77310, 251911, 145257, 8…
Rows: 50
Columns: 4
$ Province      <chr> "Bas-Uele", "Bas-Uele", "Haut-Katanga", "Haut-Katanga", …
$ Gender        <fct> Female, Male, Female, Male, Female, Male, Female, Male, …
$ valeur        <dbl> 13993, 10896, 2435551, 2124191, 111958, 35086, 80677, 54…
$ Total_general <dbl> 24889, 24889, 4559742, 4559742, 147044, 147044, 134784, …
```{r}
pop_total_Rdc <- c(
  "Bas-Uele" = 1419000,
  "Équateur" = 1856000,
  "Haut-Katanga" = 5378000,
  "Haut-Lomami" = 2842000,
  "Haut-Uele" = 2614000,
  "Ituri" = 4392000,
  "Kasaï" = 3199000,
  "Kasaï-Central" = 3743000,
  "Kasaï-Oriental" = 3145000,
  "Kinshasa" = 17071000,
  "Kongo-Central" = 6365000,
  "Kwango" = 2416000,
  "Kwilu" = 6149000,
  "Lomami" = 2842000,
  "Lualaba" = 3138000,
  "Mai-Ndombe" = 2482000,
  "Mongala" = 2358000,
  "Nord-Kivu" = 8103000,
  "Nord-Ubangi" = 1482000,
  "Sankuru" = 2478000,
  "Sud-Kivu" = 6565000,
  "Sud-Ubangi" = 2614000,
  "Tanganyika" = 3561000,
  "Tshopo" = 3113000,
  "Tshuapa" = 1887000
)

# 1. Convert the named vector to a data frame
pop_data <- data.frame(
  Province = names(pop_total_Rdc),
  pop_total_Rdc = pop_total_Rdc,
  row.names = NULL
)

if (interactive()) View(pop_data)
# 2. Next: join with ado_clean and add the number of affected adolescents per 1,000 inhabitants
```

12 Cleaning province names to avoid formatting differences

```{r}
pop_data <- pop_data %>%
  mutate(Province = str_trim(str_to_title(Province)))  # normalize province names

ado_clean <- ado_clean %>%
  mutate(Province = str_trim(str_to_title(Province)))  # harmonize province names

# Merge the two data frames on the "Province" column
merged_data <- full_join(pop_data, ado_clean, by = "Province") %>%
  distinct()  # remove duplicate rows

# Check the structure after merging
glimpse(merged_data)

# Show a preview
head(merged_data)
if (interactive()) View(merged_data)
```
Rows: 53
Columns: 5
$ Province      <chr> "Bas-Uele", "Bas-Uele", "Équateur", "Équateur", "Haut-Ka…
$ pop_total_Rdc <dbl> 1419000, 1419000, 1856000, 1856000, 5378000, 5378000, 28…
$ Gender        <fct> Female, Male, Female, Male, Female, Male, Female, Male, …
$ valeur        <dbl> 13993, 10896, 12622, 16048, 2435551, 2124191, 111958, 35…
$ Total_general <dbl> 24889, 24889, 28670, 28670, 4559742, 4559742, 147044, 14…
      Province pop_total_Rdc Gender  valeur Total_general
1     Bas-Uele       1419000 Female   13993         24889
2     Bas-Uele       1419000   Male   10896         24889
3     Équateur       1856000 Female   12622         28670
4     Équateur       1856000   Male   16048         28670
5 Haut-Katanga       5378000 Female 2435551       4559742
6 Haut-Katanga       5378000   Male 2124191       4559742
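The merged table has 53 rows although the long-format adolescent table has 50, which suggests that some province names did not match across the two sources. A minimal sketch of how unmatched keys can be listed with `dplyr::anti_join()` follows. The two toy tables and their values are hypothetical; only the idea of a spelling mismatch (here "Tanganyika" versus "Tanganyka") and a province missing from one side is illustrated.

```r
library(dplyr)

# Toy tables; names and values are hypothetical
pop_toy <- data.frame(Province = c("Bas-Uele", "Tanganyika", "Tshuapa"),
                      pop = c(100, 200, 300))
ado_toy <- data.frame(Province = c("Bas-Uele", "Tanganyka", "Maniema"),
                      cases = c(10, 20, 30))

# Rows of one table with no partner in the other
only_in_pop <- anti_join(pop_toy, ado_toy, by = "Province")
only_in_ado <- anti_join(ado_toy, pop_toy, by = "Province")

only_in_pop$Province
only_in_ado$Province
```

Running the same two calls on `pop_data` and `ado_clean` before the `full_join()` would show which provinces are later dropped by `complete.cases()`.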
```{r}
province_code <- c(
  "Kinshasa" = 10, "Kongo-Central" = 20, "Kwango" = 302, "Kwilu" = 303, "Mai-Ndombe" = 305,
  "Tshuapa" = 402, "Mongala" = 403, "Nord-Ubangi" = 405, "Sud-Ubangi" = 406, "Équateur" = 407,
  "Tshopo" = 502, "Haut-Uele" = 503, "Bas-Uele" = 504, "Ituri" = 505, "Haut-Lomami" = 61,
  "Lomami" = 62, "Kasaï" = 63, "Kasaï-Central" = 704, "Kasaï-Oriental" = 705, "Sankuru" = 706,
  "Maniema" = 707, "Tanganyika" = 802, "Haut-Katanga" = 803, "Lualaba" = 804,
  "Sud-Kivu" = 902,  # assumed: 902 is the only CODE_INS value not assigned elsewhere (904 is absent from the shapefile)
  "Nord-Kivu" = 903
)

# Attach the INS code to each row by province name
merged_data$code <- unname(province_code[merged_data$Province])

# Drop rows containing NA (unmatched provinces) to avoid errors downstream
merged_data <- merged_data[complete.cases(merged_data), ]

# Check
print(merged_data)
if (interactive()) View(merged_data)
```
         Province pop_total_Rdc Gender  valeur Total_general code
1        Bas-Uele       1419000 Female   13993         24889  504
2        Bas-Uele       1419000   Male   10896         24889  504
3        Équateur       1856000 Female   12622         28670  407
4        Équateur       1856000   Male   16048         28670  407
5    Haut-Katanga       5378000 Female 2435551       4559742  803
6    Haut-Katanga       5378000   Male 2124191       4559742  803
7     Haut-Lomami       2842000 Female  111958        147044   61
8     Haut-Lomami       2842000   Male   35086        147044   61
9       Haut-Uele       2614000 Female   80677        134784  503
10      Haut-Uele       2614000   Male   54107        134784  503
11          Ituri       4392000 Female   40843         77310  505
12          Ituri       4392000   Male   36467         77310  505
13          Kasaï       3199000 Female  139262        251911   63
14          Kasaï       3199000   Male  112649        251911   63
15  Kasaï-Central       3743000 Female   78996        145257  704
16  Kasaï-Central       3743000   Male   66261        145257  704
17 Kasaï-Oriental       3145000 Female  520251        836085  705
18 Kasaï-Oriental       3145000   Male  315834        836085  705
19       Kinshasa      17071000 Female  207106        440173   10
20       Kinshasa      17071000   Male  233067        440173   10
21  Kongo-Central       6365000 Female   86683        164098   20
22  Kongo-Central       6365000   Male   77415        164098   20
23         Kwango       2416000 Female  106223        228126  302
24         Kwango       2416000   Male  121903        228126  302
25          Kwilu       6149000 Female   66973        124196  303
26          Kwilu       6149000   Male   57223        124196  303
27         Lomami       2842000 Female  560237        848427   62
28         Lomami       2842000   Male  288190        848427   62
30     Mai-Ndombe       2482000 Female   90209        127284  305
31     Mai-Ndombe       2482000   Male   37075        127284  305
32        Mongala       2358000 Female  451806        697983  403
33        Mongala       2358000   Male  246177        697983  403
34      Nord-Kivu       8103000 Female 1852584       2459446  903
35      Nord-Kivu       8103000   Male  606862       2459446  903
36    Nord-Ubangi       1482000 Female  110080        211618  405
37    Nord-Ubangi       1482000   Male  101538        211618  405
38        Sankuru       2478000 Female   69549        138639  706
39        Sankuru       2478000   Male   69090        138639  706
41     Sud-Ubangi       2614000 Female  150424        229227  406
42     Sud-Ubangi       2614000   Male   78803        229227  406
43     Tanganyika       3561000 Female  816465       1635918  802
44     Tanganyika       3561000   Male  819453       1635918  802
45         Tshopo       3113000 Female   60979         95752  502
46         Tshopo       3113000   Male   34773         95752  502
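Because the province codes are typed by hand, it may help to verify them against the `CODE_INS` values printed from the provincial shapefile earlier in this document. The sketch below does this with base R set operations. The three-entry `lookup` vector is a hypothetical excerpt in which 904 stands in for a mistyped code.

```r
# CODE_INS values as printed from the provincial shapefile
code_ins <- c(10, 20, 302, 303, 305, 402, 403, 405, 406, 407, 502, 503, 504, 505,
              61, 62, 63, 704, 705, 706, 707, 802, 803, 804, 902, 903)

# Hypothetical excerpt of a hand-typed lookup; 904 plays the role of a typo
lookup <- c("Kinshasa" = 10, "Nord-Kivu" = 903, "Sud-Kivu" = 904)

missing_in_shapefile <- lookup[!lookup %in% code_ins]  # lookup codes with no polygon
unused_in_lookup <- setdiff(code_ins, lookup)          # polygons with no lookup entry

missing_in_shapefile
```

Any code reported in `missing_in_shapefile` would produce an empty geometry after a join on `CODE_INS`, so the province would silently disappear from the map.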

13 Deduplication and standardized ratios

```{r}
ado_ratios <- merged_data %>%
  # Step 1: keep one row per Province-Gender combination
  group_by(Province, Gender) %>%
  filter(row_number() == 1) %>%  # equivalent to unique() within each group
  ungroup() %>%

  # Step 2: affected adolescents per 1,000 inhabitants of the province
  mutate(
    ratio_per_1000 = (valeur / pop_total_Rdc) * 1000
  )

# Check
stopifnot(
  "There are still duplicates" = !any(duplicated(ado_ratios[, c("Province", "Gender")]))
)
```

15 Data preparation (calculation of the proportion of females)

```{r}
carte_bivariee <- ado_ratios %>%
  group_by(code) %>%
  summarise(
    prop_female = sum(valeur[Gender == "Female"]) / sum(valeur),
    valeur_tot = sum(valeur)
  ) %>%
  left_join(banderies_nad83, by = c("code" = "CODE_INS")) %>%
  st_as_sf()

# Visualization
ggplot() +
  geom_sf(
    data = carte_bivariee,
    aes(fill = prop_female, alpha = valeur_tot),
    color = "white", linewidth = 0.3
  ) +

  # Scales: fill diverges around parity (0.5); opacity encodes total cases
  scale_fill_gradient2(
    low = "#E41A1C", mid = "#F7F7F7", high = "#377EB8",
    midpoint = 0.5, name = "Proportion of females"
  ) +
  scale_alpha_continuous(range = c(0.3, 0.9), name = "Total intensity") +

  # Theme
  theme_void() +
  labs(title = "Intensity and Gender Distribution by Province")
```
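The map above encodes the two variables continuously (fill and opacity). An alternative sometimes used for bivariate maps is to bin each variable and combine the bins into one discrete class. A minimal base-R sketch follows, using female and male counts for six provinces copied from the merged table above. The 0.45 / 0.55 cut points, the tercile binning, and the column names are illustrative assumptions, not the author's method.

```r
# Female / male counts for six provinces, taken from the merged table
bi <- data.frame(
  Province = c("Bas-Uele", "Kinshasa", "Tanganyika", "Nord-Kivu", "Kwango", "Haut-Lomami"),
  Female = c(13993, 207106, 816465, 1852584, 106223, 111958),
  Male   = c(10896, 233067, 819453,  606862, 121903,  35086)
)
bi$total <- bi$Female + bi$Male
bi$prop_female <- bi$Female / bi$total

# Gender class: the 0.45 / 0.55 cut points are an arbitrary illustrative choice
bi$gender_class <- cut(bi$prop_female, breaks = c(0, 0.45, 0.55, 1),
                       labels = c("male-leaning", "balanced", "female-leaning"))

# Intensity class: terciles of the total within this subset
bi$intensity_class <- cut(bi$total,
                          breaks = quantile(bi$total, probs = c(0, 1/3, 2/3, 1)),
                          labels = c("low", "mid", "high"), include.lowest = TRUE)

# One combined label per province, usable as a discrete fill
bi$bi_class <- paste(bi$intensity_class, bi$gender_class, sep = " / ")
bi[, c("Province", "prop_female", "bi_class")]
```

Discrete classes are easier to read in a legend than a fill-plus-opacity combination, at the cost of hiding variation within each bin.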

```{r}
# 1. 📥 Load and clean the adolescent health data
ado <- read_excel("C:/Users/Alain/Downloads/ADO&JEUNES 2023.xlsx") %>%
  rename(
    Province = Indicateur,
    Female = `(Tous)`,
    Male = `...3`,
    Total_general = `...4`
  ) %>%
  filter(!is.na(Province)) %>%
  mutate(
    Female = as.numeric(Female),
    Male = as.numeric(Male),
    Total_general = as.numeric(Total_general)
  ) %>%
  drop_na(Female, Male, Total_general)

# 2. 🌍 Add simulated coordinates (uniform random placeholders, not true province centroids)
set.seed(123)
ado$lon <- runif(nrow(ado), min = 21, max = 31)  # approximate longitude in DRC
ado$lat <- runif(nrow(ado), min = -13, max = 5)  # approximate latitude in DRC

# 3. 🎯 Select a 20% random sample of provinces
set.seed(456)
ado$u <- runif(nrow(ado))
ado_small <- ado %>% filter(u <= 0.2)

# 4. 🔄 Convert to sf (projected geographic coordinate system)
ado_small.sf <- st_as_sf(ado_small, coords = c("lon", "lat"), crs = 4326)
ado_small.sf <- st_transform(ado_small.sf, crs = 3857)  # Mercator projection

# 5. 🗺️ Define the spatial window (based on observed coordinates)
W <- as.owin(st_bbox(ado_small.sf))
rsc <- 1000  # rescaling factor (1 unit = 1 km)

# 6. 🔘 Create spatial patterns
# Unmarked pattern
pattern.um <- ppp(
  x = st_coordinates(ado_small.sf)[,1],
  y = st_coordinates(ado_small.sf)[,2],
  window = W
)
pattern.um <- rescale(pattern.um, rsc, "km")
plot(pattern.um, main = "Affected adolescents (unmarked pattern)")

# Marked pattern: pivot_longer to split by gender
ado_long <- ado_small %>%
  pivot_longer(cols = c(Female, Male), names_to = "Gender", values_to = "value")

ado_long.sf <- st_as_sf(ado_long, coords = c("lon", "lat"), crs = 4326) %>%
  st_transform(crs = 3857)

pattern.m <- ppp(
  x = st_coordinates(ado_long.sf)[,1],
  y = st_coordinates(ado_long.sf)[,2],
  window = W,
  marks = as.factor(ado_long.sf$Gender)
)
pattern.m <- rescale(pattern.m, rsc, "km")
plot(pattern.m, main = "Affected adolescents by gender (marked pattern)")

# 7. 📏 Distance analysis
between_dist <- pairdist(pattern.um)
between_dist_df <- as.data.frame(between_dist)
between_dist_df$min_d <- apply(between_dist_df, 1, function(x) sort(x)[2])

first_neib <- nndist(pattern.um)

# 8. 🗺️ Display map of DRC with case intensity
rdc <- ne_states(country = "Democratic Republic of the Congo", returnclass = "sf")
rdc_83 <- st_transform(rdc, crs = 3857)

# Join total cases to map (Natural Earth lists 11 provinces, so only identically named provinces will match)
carte_rdc <- left_join(rdc_83, ado, by = c("name" = "Province"))

# 9. 🖼️ Choropleth visualization
ggplot() +
  geom_sf(data = carte_rdc, aes(fill = Total_general), color = "white") +
  scale_fill_viridis_c(name = "Reported cases") +
  labs(
    title = "Distribution of affected adolescents in DRC",
    subtitle = "Total number of cases per province",
    caption = "Source: ADO & JEUNES 2023"
  ) +
  theme_minimal()
```
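Step 7 above derives nearest-neighbour distances twice: from the second-smallest entry in each row of the `pairdist()` matrix, and from `nndist()`. The base-R sketch below shows the same idea on three hypothetical points, without spatstat, so the logic can be checked by hand.

```r
# Three illustrative points (coordinates in km); values are hypothetical
pts <- cbind(x = c(0, 3, 3), y = c(0, 4, 0))

d <- as.matrix(dist(pts))  # pairwise Euclidean distances
diag(d) <- Inf             # ignore each point's zero distance to itself
nn <- apply(d, 1, min)     # nearest-neighbour distance per point
nn
```

Masking the diagonal avoids relying on sort order, which matters if two points share the same coordinates and the second-smallest distance is also zero.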