This report studies the causes of the recurrence of cholera in Africa. The analysis applies principal component analysis (PCA) to 13 African countries and 11 health, climatic, environmental, socio-economic and demographic variables that could favor the establishment of the disease in these countries. The first factorial plane of the PCA accounts for 70.48% of the total variance. The correlations between the variables show that the recurrence of cholera is favored by high temperatures and rainfall and by a relatively young population. Conversely, access to drinking water, sanitation facilities and good hygiene practices, together with sound implementation of Integrated Water Resources Management, help to fight the disease. This indicates that a multisectoral approach is needed to slow the disease down, or even eradicate it.
Keywords: Cholera, Recurrence, Africa, Variable, Individuals, Climatic, Health, Socio-economic, Environmental, Principal component analysis
Cholera is a waterborne acute diarrheal disease caused by the bacterium Vibrio cholerae. It remains a major health problem in Africa, with recurring epidemics affecting many countries despite the prevention and control efforts of the various organizations working to eradicate it [1]. Although cholera can be prevented by simple measures such as improving access to drinking water and basic sanitation, epidemics have persisted on the continent for decades [2]. According to the World Health Organization (WHO), Africa is one of the regions most affected by this disease, with peaks in incidence often linked to poor sanitation and humanitarian crises [3].
Despite vaccination, awareness-raising and improved health infrastructure, cholera outbreaks continue to occur at regular intervals, particularly in vulnerable areas [4]. This raises the question: what are the root causes of cholera recurrence in Africa, and how do these factors interact to sustain its spread?
To achieve our main goal, we pursue the following specific objectives:
Analyze environmental conditions and health infrastructure that promote cholera transmission, including access to drinking water and sanitation, as well as the impact of climate change.
Analyze the socio-economic determinants that influence the vulnerability of populations to cholera, in particular the influence of inequalities in access to health services and precarious living conditions.
Propose practical and appropriate solutions to limit the recurrence of cholera in Africa, based on the results of the analysis.
We used the following software to collect, process and analyze the data and to map the study area:
ZOTERO: A free application used to manage the bibliography throughout our work.
KOBOTOOLBOX: Used to build the questionnaire sent to our different targets to collect cholera-related data.
R / RStudio: R is open source software widely used for statistical analysis. We used it to run the principal component analysis with packages such as FactoMineR and Factoshiny, and to write the report with R Markdown.
QGIS: Open source geographic information system (GIS) software, used to map our study area and produce the other maps in this work.
To analyze the causes of the recurrence of cholera in Africa, we proceeded in stages.
Step Two: Data Collection
Data were collected from 2023 WHO reporting on the recurrence of cholera in African countries. The individuals are the countries that recorded the most cholera cases: Burundi, Cameroon, the Democratic Republic of Congo (DRC), Ethiopia, Kenya, Malawi, Mozambique, Nigeria, South Africa, South Sudan, Tanzania, Zambia and Zimbabwe. According to the WHO report on cholera cases in 2023, all of these countries recorded a surge in cases that year.
The variables were chosen among the parameters that could influence the recurrence of cholera. We grouped them into five groups: health, climatic, environmental, socio-economic and demographic.
Figure 1: Population map from 0 to 15 years old
Figure 2: Population map aged 15 to 64
Figure 3: Population density
The final data set comprises thirteen (13) countries, hence 13 individuals, and eleven (11) variables. The table below summarizes the data, and Figure I maps the study area.
Table 1: Summary of data set for the study
# Loading data
setwd("C:/Groupe10GEAAH_corrige/Donnee_ACP_alimentaire/")
data = read.csv(file ="data.csv", header = TRUE, sep = ";", quote = "\"",
dec = ",", row.names = 1)
data[,1:11]
## TCD PRE TP IF TIA PIB UP DCN TA TU PA
## BEN 2.56 1049.02 12.72 22.2 5.90 3321.55 3402.00 6900 47.10 0.02 1175297
## BFA 2.34 831.04 25.28 24.5 6.81 2176.09 2832.85 2400 34.49 0.01 2086893
## CPV 0.69 187.82 11.50 10.8 0.58 6356.75 5.00 750 91.00 0.01 54765
## CIV 2.48 1299.33 9.73 22.3 7.69 5316.46 22.87 2500 89.89 0.01 241095
## GMB 2.34 995.69 17.24 17.6 9.60 2076.57 300.61 7000 58.67 0.03 2555332
## GHA 1.92 1209.73 25.21 14.9 10.29 5420.79 9689.50 2700 80.38 0.02 1311530
## GIN 2.49 1790.59 13.82 27.5 15.77 2640.34 110.50 340 45.33 0.01 197266
## GNB 2.23 1649.39 25.96 27.5 5.04 1831.38 82.00 410 53.90 0.01 2561140
## LBR 2.08 2450.19 27.62 33.3 6.62 1423.23 1035.56 3700 48.30 0.01 507043
## MLI 3.04 328.92 20.85 24.7 5.19 2120.62 3.75 24000 30.76 0.00 2018765
## MRT 2.87 110.81 25.35 22.6 5.77 1652.00 347.31 23000 66.96 0.00 450720
## NER 3.22 183.89 50.61 27.5 7.47 1186.53 34.79 23523 38.10 0.00 2393877
## NGA 2.09 1187.12 30.86 28.3 20.39 4922.63 59278.17 2437000 62.02 0.03 37941470
## SEN 2.49 722.89 9.93 16.3 2.91 3511.64 577.18 12000 57.67 0.01 1622980
## SLE 2.25 2654.33 26.06 31.3 17.02 1614.86 360.95 800 48.64 0.01 802371
## TGO 2.36 1217.05 26.59 23.7 8.95 2130.86 1425.00 16000 66.54 0.02 830017
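The `read.csv` call above relies on `sep = ";"` and `dec = ","`, the usual layout of CSV files exported with a French locale. The following is a minimal sketch of what these arguments do and of a quick check worth running before a PCA. The inline table, its row names and its values are made up for illustration and are not the study data.

```r
# Hypothetical two-row extract in the same layout as data.csv:
# semicolon-separated fields, comma as decimal mark, first column = row names.
csv_text <- "ID;TCD;PRE\nAAA;2,56;1049,02\nBBB;0,69;187,82"

toy <- read.csv(text = csv_text, header = TRUE, sep = ";", dec = ",",
                row.names = 1)

# Every column should be numeric before running a PCA; a column read as
# character usually means the separator or the decimal mark was wrong.
all_numeric <- all(sapply(toy, is.numeric))

print(dim(toy))      # number of individuals and number of variables
print(all_numeric)
print(anyNA(toy))    # missing values must be handled before the PCA
```

Running the same three checks on the real `data` object would confirm that the number of rows and columns matches the number of individuals and variables described in the text.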
# Required libraries
library(car) # linear regression
library(carData) # linear regression
library("clusterSim")
library(DataExplorer)
library(FactoInvestigate)
attach(data)
library(corrplot)
library(psych)
library(Hmisc)
library(ggplot2) # plotting
library(factoextra) # plotting
library(FactoMineR) # runs the PCA
mat_cor = cor(data)
col = colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(mat_cor, method="color", col=col(200),
type="upper", order="hclust",
addCoef.col = "black", # Add the correlation coefficients
tl.col="black", tl.srt=90, # Rotate the text labels
sig.level = 0.1, insig = "blank",
# Hide the correlation coefficients on the diagonal
diag=FALSE)
Figure 12: Correlation matrix between variables
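# Note: rcorr() is applied here to the correlation matrix mat_cor rather than
# to the raw data, so the n reported below is 11 (the number of variables)
rcorr(as.matrix(mat_cor[,1:11]))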
## TCD PRE TP IF TIA PIB UP DCN TA TU PA
## TCD 1.00 0.07 0.76 0.77 0.15 -0.91 -0.35 -0.28 -0.91 -0.72 -0.23
## PRE 0.07 1.00 0.05 0.59 0.56 -0.40 -0.18 -0.17 -0.35 0.05 -0.17
## TP 0.76 0.05 1.00 0.78 0.34 -0.79 0.01 0.07 -0.83 -0.54 0.12
## IF 0.77 0.59 0.78 1.00 0.57 -0.90 -0.12 -0.05 -0.93 -0.49 -0.02
## TIA 0.15 0.56 0.34 0.57 1.00 -0.26 0.58 0.60 -0.43 0.36 0.62
## PIB -0.91 -0.40 -0.79 -0.90 -0.26 1.00 0.43 0.36 0.95 0.68 0.32
## UP -0.35 -0.18 0.01 -0.12 0.58 0.43 1.00 1.00 0.17 0.68 0.99
## DCN -0.28 -0.17 0.07 -0.05 0.60 0.36 1.00 1.00 0.10 0.63 1.00
## TA -0.91 -0.35 -0.83 -0.93 -0.43 0.95 0.17 0.10 1.00 0.56 0.06
## TU -0.72 0.05 -0.54 -0.49 0.36 0.68 0.68 0.63 0.56 1.00 0.61
## PA -0.23 -0.17 0.12 -0.02 0.62 0.32 0.99 1.00 0.06 0.61 1.00
##
## n= 11
##
##
## P
## TCD PRE TP IF TIA PIB UP DCN TA TU
## TCD 0.8335 0.0066 0.0061 0.6555 0.0000 0.2965 0.4114 0.0000 0.0122
## PRE 0.8335 0.8736 0.0574 0.0749 0.2196 0.5902 0.6114 0.2942 0.8775
## TP 0.0066 0.8736 0.0050 0.3079 0.0037 0.9799 0.8341 0.0017 0.0883
## IF 0.0061 0.0574 0.0050 0.0675 0.0001 0.7152 0.8730 0.0000 0.1305
## TIA 0.6555 0.0749 0.3079 0.0675 0.4461 0.0622 0.0492 0.1913 0.2777
## PIB 0.0000 0.2196 0.0037 0.0001 0.4461 0.1888 0.2735 0.0000 0.0219
## UP 0.2965 0.5902 0.9799 0.7152 0.0622 0.1888 0.0000 0.6101 0.0205
## DCN 0.4114 0.6114 0.8341 0.8730 0.0492 0.2735 0.0000 0.7628 0.0382
## TA 0.0000 0.2942 0.0017 0.0000 0.1913 0.0000 0.6101 0.7628 0.0755
## TU 0.0122 0.8775 0.0883 0.1305 0.2777 0.0219 0.0205 0.0382 0.0755
## PA 0.4925 0.6094 0.7362 0.9514 0.0438 0.3371 0.0000 0.0000 0.8663 0.0464
## PA
## TCD 0.4925
## PRE 0.6094
## TP 0.7362
## IF 0.9514
## TIA 0.0438
## PIB 0.3371
## UP 0.0000
## DCN 0.0000
## TA 0.8663
## TU 0.0464
## PA
Figure 13: P-value test
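A correlation test is normally run on the raw observations (one row per country), because the p-value depends on the number of individuals. The following is a minimal sketch using base R's `cor.test` on simulated data. The helper `cor_pvalues` and the toy variables `A`, `B` and `C` are assumptions made for illustration and do not come from the study.

```r
# Simulated data: 16 individuals, 3 variables (illustrative values only).
set.seed(1)
x <- rnorm(16)
toy <- data.frame(A = x,
                  B = x + rnorm(16, sd = 0.3),  # strongly related to A
                  C = rnorm(16))                # unrelated noise

# Hypothetical helper: matrix of p-values of pairwise Pearson tests.
# The diagonal is left as NA because a variable is not tested against itself.
cor_pvalues <- function(df) {
  p <- ncol(df)
  out <- matrix(NA_real_, p, p, dimnames = list(names(df), names(df)))
  for (i in seq_len(p - 1)) {
    for (j in (i + 1):p) {
      pv <- cor.test(df[[i]], df[[j]])$p.value
      out[i, j] <- pv
      out[j, i] <- pv
    }
  }
  out
}

p_mat <- cor_pvalues(toy)
print(round(cor(toy), 2))   # correlation coefficients
print(signif(p_mat, 2))     # matching p-values
```

A matrix built this way has the shape expected by the `p.mat` argument of `corrplot`, which is what makes `sig.level` and `insig` take effect.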
Distribution of inertia
pca_1 = PCA(X = data, scale.unit = TRUE, ncp = 11, ind.sup = NULL,
quanti.sup = NULL, quali.sup = NULL, row.w = NULL,
col.w = NULL, graph = FALSE, axes = c(1,2))
fviz_eig(pca_1, addlabels=TRUE, hjust = -0.3) +
ylim(0, 65)
Figure 17: Decomposition of total inertia
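Each percentage in such a scree plot is an eigenvalue of the correlation matrix divided by the sum of all eigenvalues, and the inertia of the first factorial plane is the sum of the first two percentages. The following is a minimal sketch in base R on simulated data (not the study data), using `eigen` on `cor()` as a stand-in for the standardized PCA.

```r
# Simulated data: 16 individuals, 4 variables (illustrative values only).
set.seed(42)
n <- 16
z <- rnorm(n)
toy <- data.frame(V1 = z + rnorm(n, sd = 0.5),
                  V2 = z + rnorm(n, sd = 0.5),
                  V3 = rnorm(n),
                  V4 = rnorm(n))

# Eigenvalues of the correlation matrix are the variances of the principal
# components of the standardized variables; they sum to the number of variables.
eig <- eigen(cor(toy), symmetric = TRUE, only.values = TRUE)$values

pct <- 100 * eig / sum(eig)   # percentage of inertia carried by each axis
plane_12 <- sum(pct[1:2])     # inertia of the first factorial plane

print(round(pct, 2))
print(round(plane_12, 2))
```

With FactoMineR, the same quantities are stored in `pca_1$eig` (eigenvalue, percentage of variance and cumulative percentage of variance).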