Summary

This report studies the causes of the recurrence of cholera currently raging in Africa. The analysis applies principal component analysis (PCA) to 13 African countries and 11 variables covering the health, climatic, environmental, socio-economic and demographic factors that could favor the establishment of the disease in these countries. The first factorial plane of the PCA accounts for 70.48% of the total variance. The correlations between the variables show that the recurrence of cholera is favored by high temperatures, rainfall and a relatively very young population. Conversely, access to drinking water, sanitation facilities and good hygiene practices, together with sound implementation of Integrated Water Resources Management, help to fight the disease. This indicates that a multisectoral approach is necessary to slow the disease down, or even eradicate it.

Keywords: Cholera, Recurrence, Africa, Variable, Individuals, Climatic, Health, Socio-economic, Environmental, Principal component analysis

PROBLEM STATEMENT

Cholera is a waterborne acute diarrheal disease caused by the bacterium Vibrio cholerae. It remains a major health problem in Africa, with recurring epidemics affecting many countries despite the prevention and control efforts of the various organizations working to eradicate it [1]. Although cholera is preventable by simple measures such as improving access to drinking water and basic sanitation, epidemics have persisted on the continent for decades [2]. According to the World Health Organization (WHO), Africa is one of the regions most affected by this disease, with peaks in incidence often linked to poor sanitation and humanitarian crises [3].

Despite vaccination, awareness-raising and improved health infrastructure, cholera outbreaks continue to occur at regular intervals, particularly in vulnerable areas [4]. This raises the question: what are the root causes of cholera recurrence in Africa, and how do these factors interact to sustain its spread?

SPECIFIC OBJECTIVES

To achieve our main goal, we set the following specific objectives:

I. Materials and methodologies

I.1 Material

For this work, we mainly used software to support data collection, processing and analysis, and the mapping of our study area:

  • ZOTERO: A free application used to manage our bibliography throughout the work.

  • KOBOTOOLBOX: Used to generate the questionnaire addressed to our different targets for collecting cholera-related data.

  • R/RStudio: R is open source software widely used in statistical analysis. We used it to perform the principal component analysis with packages such as FactoMineR and Factoshiny, and to write the report with R Markdown.

  • QGIS: Open source geographic information system (GIS) software, used to map our study area and to produce the other maps in this work.

I.2 Methodology

To analyze the causes of the recurrence of cholera in Africa, we proceeded in stages.

Step Two: Data Collection

Data collection was based on WHO reporting on the recurrence of cholera in African countries for the year 2023. The individuals are the countries that recorded the most cholera cases: Burundi, Cameroon, the Democratic Republic of Congo (DRC), Ethiopia, Kenya, Malawi, Mozambique, Nigeria, South Africa, South Sudan, Tanzania, Zambia and Zimbabwe. All of these countries recorded a surge in cholera cases in 2023 according to the WHO report on the number of cholera cases for 2023.

The variables were chosen from the parameters that could influence the recurrence of cholera. We grouped them into five groups, including:

  • Demographic parameters: We selected two, collected from the ourworldindata.org database: population density and age distribution (0-15, 15-64, and over 64). They allow us to assess whether population density favors the occurrence of cholera and which age groups are most affected by the disease.

Figure 1: Map of the population aged 0 to 15

Figure 2: Map of the population aged 15 to 64

Figure 3: Population density

II. Results and discussion

II.1 Results

II.1.1 Presentation of the dataset and mapping of the study area

At the end of the collection, the data set consists of thirteen (13) countries, i.e. 13 individuals, and eleven (11) variables. The table below summarizes the data and Figure I shows the map of our study area.

Table 1: Summary of data set for the study

# Loading data

setwd("C:/Groupe10GEAAH_corrige/Donnee_ACP_alimentaire/")

data = read.csv(file ="data.csv", header = TRUE, sep = ";", quote = "\"",
                dec = ",", row.names = 1)

data[,1:11]
##      TCD     PRE    TP   IF   TIA     PIB       UP     DCN    TA   TU       PA
## BEN 2.56 1049.02 12.72 22.2  5.90 3321.55  3402.00    6900 47.10 0.02  1175297
## BFA 2.34  831.04 25.28 24.5  6.81 2176.09  2832.85    2400 34.49 0.01  2086893
## CPV 0.69  187.82 11.50 10.8  0.58 6356.75     5.00     750 91.00 0.01    54765
## CIV 2.48 1299.33  9.73 22.3  7.69 5316.46    22.87    2500 89.89 0.01   241095
## GMB 2.34  995.69 17.24 17.6  9.60 2076.57   300.61    7000 58.67 0.03  2555332
## GHA 1.92 1209.73 25.21 14.9 10.29 5420.79  9689.50    2700 80.38 0.02  1311530
## GIN 2.49 1790.59 13.82 27.5 15.77 2640.34   110.50     340 45.33 0.01   197266
## GNB 2.23 1649.39 25.96 27.5  5.04 1831.38    82.00     410 53.90 0.01  2561140
## LBR 2.08 2450.19 27.62 33.3  6.62 1423.23  1035.56    3700 48.30 0.01   507043
## MLI 3.04  328.92 20.85 24.7  5.19 2120.62     3.75   24000 30.76 0.00  2018765
## MRT 2.87  110.81 25.35 22.6  5.77 1652.00   347.31   23000 66.96 0.00   450720
## NER 3.22  183.89 50.61 27.5  7.47 1186.53    34.79   23523 38.10 0.00  2393877
## NGA 2.09 1187.12 30.86 28.3 20.39 4922.63 59278.17 2437000 62.02 0.03 37941470
## SEN 2.49  722.89  9.93 16.3  2.91 3511.64   577.18   12000 57.67 0.01  1622980
## SLE 2.25 2654.33 26.06 31.3 17.02 1614.86   360.95     800 48.64 0.01   802371
## TGO 2.36 1217.05 26.59 23.7  8.95 2130.86  1425.00   16000 66.54 0.02   830017
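
The variables in this table sit on very different scales (TU ranges from 0.00 to 0.03 while DCN reaches 2437000), which is why a standardized PCA is the natural choice here. The following is a minimal base R sketch of that standardization, using three rows copied from the table above for three of the variables. The object names `toy`, `raw_var` and `toy_std` are illustrative and are not part of the original script.

```r
# Three rows copied from the table above (BEN, NGA, SEN), restricted to
# three variables with very different magnitudes.
toy <- data.frame(
  TU  = c(0.02, 0.03, 0.01),
  PRE = c(1049.02, 1187.12, 722.89),
  DCN = c(6900, 2437000, 12000),
  row.names = c("BEN", "NGA", "SEN")
)

# Raw variances differ by many orders of magnitude, so an unscaled PCA
# would be driven almost entirely by DCN.
raw_var <- sapply(toy, var)

# scale() centers each column and divides it by its standard deviation.
toy_std <- scale(toy)

# After standardization every column has mean 0 and variance 1.
std_mean <- colMeans(toy_std)
std_var  <- apply(toy_std, 2, var)

print(raw_var)
print(round(std_var, 10))
```

In FactoMineR, `PCA(..., scale.unit = TRUE)` applies the same kind of standardization internally, possibly with a slightly different variance denominator, so no manual scaling is needed before calling it.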

II.1.2 PCA results

II.1.2.1 Correlation matrix

# required libraries
library(car)  # linear regression
## Warning: package 'car' was built under R version 4.4.2
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.4.2
library(carData)  # linear regression
library("clusterSim")
## Warning: package 'clusterSim' was built under R version 4.4.2
## Loading required package: cluster
## Loading required package: MASS
library(DataExplorer)
## Warning: package 'DataExplorer' was built under R version 4.4.2
library(FactoInvestigate)
attach(data)
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.4.2
## corrplot 0.95 loaded
library(psych)
## 
## Attaching package: 'psych'
## The following object is masked from 'package:car':
## 
##     logit
library(Hmisc)
## Warning: package 'Hmisc' was built under R version 4.4.2
## 
## Attaching package: 'Hmisc'
## The following object is masked from 'package:psych':
## 
##     describe
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(ggplot2) # plotting
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
library(factoextra)  # plotting
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(FactoMineR)  # runs the PCA
## Warning: package 'FactoMineR' was built under R version 4.4.3
mat_cor = cor(data)
col = colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(mat_cor, method="color", col=col(200),  
         type="upper", order="hclust", 
         addCoef.col = "black", # add the correlation coefficients
         tl.col="black", tl.srt=90, # rotate the text labels
         sig.level = 0.1, insig = "blank", 
         # hide the correlation coefficients on the diagonal
         diag=FALSE)

Figure 12: Correlation matrix between variables

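# rcorr() is applied here to the correlation matrix mat_cor, not to the raw
# data, so the n = 11 reported below is the number of variables
rcorr(as.matrix(mat_cor[,1:11]))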
##       TCD   PRE    TP    IF   TIA   PIB    UP   DCN    TA    TU    PA
## TCD  1.00  0.07  0.76  0.77  0.15 -0.91 -0.35 -0.28 -0.91 -0.72 -0.23
## PRE  0.07  1.00  0.05  0.59  0.56 -0.40 -0.18 -0.17 -0.35  0.05 -0.17
## TP   0.76  0.05  1.00  0.78  0.34 -0.79  0.01  0.07 -0.83 -0.54  0.12
## IF   0.77  0.59  0.78  1.00  0.57 -0.90 -0.12 -0.05 -0.93 -0.49 -0.02
## TIA  0.15  0.56  0.34  0.57  1.00 -0.26  0.58  0.60 -0.43  0.36  0.62
## PIB -0.91 -0.40 -0.79 -0.90 -0.26  1.00  0.43  0.36  0.95  0.68  0.32
## UP  -0.35 -0.18  0.01 -0.12  0.58  0.43  1.00  1.00  0.17  0.68  0.99
## DCN -0.28 -0.17  0.07 -0.05  0.60  0.36  1.00  1.00  0.10  0.63  1.00
## TA  -0.91 -0.35 -0.83 -0.93 -0.43  0.95  0.17  0.10  1.00  0.56  0.06
## TU  -0.72  0.05 -0.54 -0.49  0.36  0.68  0.68  0.63  0.56  1.00  0.61
## PA  -0.23 -0.17  0.12 -0.02  0.62  0.32  0.99  1.00  0.06  0.61  1.00
## 
## n= 11 
## 
## 
## P
##     TCD    PRE    TP     IF     TIA    PIB    UP     DCN    TA     TU     PA    
## TCD        0.8335 0.0066 0.0061 0.6555 0.0000 0.2965 0.4114 0.0000 0.0122 0.4925
## PRE 0.8335        0.8736 0.0574 0.0749 0.2196 0.5902 0.6114 0.2942 0.8775 0.6094
## TP  0.0066 0.8736        0.0050 0.3079 0.0037 0.9799 0.8341 0.0017 0.0883 0.7362
## IF  0.0061 0.0574 0.0050        0.0675 0.0001 0.7152 0.8730 0.0000 0.1305 0.9514
## TIA 0.6555 0.0749 0.3079 0.0675        0.4461 0.0622 0.0492 0.1913 0.2777 0.0438
## PIB 0.0000 0.2196 0.0037 0.0001 0.4461        0.1888 0.2735 0.0000 0.0219 0.3371
## UP  0.2965 0.5902 0.9799 0.7152 0.0622 0.1888        0.0000 0.6101 0.0205 0.0000
## DCN 0.4114 0.6114 0.8341 0.8730 0.0492 0.2735 0.0000        0.7628 0.0382 0.0000
## TA  0.0000 0.2942 0.0017 0.0000 0.1913 0.0000 0.6101 0.7628        0.0755 0.8663
## TU  0.0122 0.8775 0.0883 0.1305 0.2777 0.0219 0.0205 0.0382 0.0755        0.0464
## PA  0.4925 0.6094 0.7362 0.9514 0.0438 0.3371 0.0000 0.0000 0.8663 0.0464       

Figure 13: P-value test
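
The p-value of a correlation depends on the number of observations n, so such tests are usually computed from the raw observations (one row per country) rather than from a matrix of coefficients. The sketch below is not the original script. It uses hypothetical toy data and a hypothetical helper, `cor_pvalues()`, built only on base R's `cor.test()`, to show the idea.

```r
# Hypothetical toy data: 6 observations of 3 variables (not the study data).
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2),
  z = c(5, 3, 6, 2, 7, 4)
)

# Pairwise Pearson p-values computed from the observations with cor.test().
cor_pvalues <- function(df) {
  k <- ncol(df)
  p <- matrix(NA_real_, k, k, dimnames = list(names(df), names(df)))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      p[i, j] <- p[j, i] <- cor.test(df[[i]], df[[j]])$p.value
    }
  }
  p
}

r_mat <- cor(toy)          # correlation coefficients
p_mat <- cor_pvalues(toy)  # matching p-values, diagonal left as NA

print(round(r_mat, 2))
print(round(p_mat, 4))
```

A matrix of this kind is what `corrplot()` expects in its `p.mat` argument; without `p.mat`, the `sig.level` and `insig` settings have nothing to act on. `Hmisc::rcorr()` called on the data matrix itself should return the same coefficients and p-values in one step.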

II.1.2.3 PCA results without South Africa

Distribution of inertia

pca_1 = PCA(X = data, scale.unit = TRUE, ncp = 11, ind.sup = NULL, 
            quanti.sup = NULL, quali.sup = NULL, row.w = NULL, 
            col.w = NULL, graph = FALSE, axes = c(1,2))

fviz_eig(pca_1, addlabels=TRUE, hjust = -0.3) +
  ylim(0, 65)

Figure 17: Decomposition of total inertia
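
The percentages shown in such a scree plot are the eigenvalues of the correlation matrix expressed as shares of the total inertia. The following base R sketch illustrates the computation on hypothetical toy data (not the study data), with illustrative object names.

```r
# Hypothetical toy data: 6 individuals, 3 variables (not the study data).
toy <- data.frame(
  a = c(2, 4, 6, 8, 10, 12),
  b = c(1, 3, 2, 5, 4, 6),
  c = c(9, 7, 8, 4, 5, 1)
)

# For a standardized PCA, the eigenvalues of the correlation matrix are the
# inertia carried by each axis; their sum equals the number of variables.
eig <- eigen(cor(toy), symmetric = TRUE)$values

pct_inertia <- 100 * eig / sum(eig)  # share of total inertia per axis
cum_inertia <- cumsum(pct_inertia)   # cumulative share

print(round(data.frame(eigenvalue = eig,
                       percent    = pct_inertia,
                       cumulative = cum_inertia), 2))
```

The second cumulative value is the inertia of the first factorial plane. In FactoMineR the equivalent table is stored in the `eig` element of the PCA result (here `pca_1$eig`).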