Project Utility (Scientific Rationale)
This project is useful because it allows us to:
• identify the structural factors (economic, demographic, infrastructural) that explain low electrification rates;
• assess energy inequalities between countries and between urban and rural areas;
• guide public policies towards more effective and appropriate solutions;
• determine the true potential of renewable energies as a lever for reducing disparities in access;
• provide an essential quantitative basis to support energy planning, investments, and electrification strategies.
The methodology is based on a quantitative and comparative approach, structured in several stages:
Stage 1: Data Collection
• Data from reliable sources: World Bank, UN, IEA, Our World in Data, AFRISTAT, national databases.
• Period studied: depending on availability (often 2000–2022).
Stage 2: Definition and Structuring of Variables
• Dependent variables: access to electricity (total, rural, urban).
• Explanatory variables: economic, demographic, energy, and infrastructural aspects.
Stage 3: Descriptive Analysis
• Descriptive statistics
• Comparison between countries
• Analysis of urban/rural disparities
Stage 4: Causal Analysis
• Correlations and visualizations
• Explanatory models (if necessary: multiple regression, factor analysis)
Stage 5: Interpretation
• Discussion of major challenges
• Identification of opportunities
• Policy or strategic recommendations
Central Africa remains one of the least electrified regions in the world, despite its considerable energy potential, particularly in hydropower and solar power. Electricity access rates vary significantly between countries and between urban and rural areas, revealing structural challenges related to inadequate infrastructure, economic constraints, low investment, and often unstable energy governance. Furthermore, population growth, rapid urbanization, and increasing energy demand are placing additional strain on already fragile power systems.
In this context, it is essential to conduct a scientific study to identify the major obstacles to electrification and assess the potential role of renewable energies in the sustainable improvement of electricity access. This research is justified by its contribution to the development of effective strategies tailored to the realities of the region for policymakers, regional institutions, energy stakeholders, and the scientific community.
What are the main technical, economic, demographic, and institutional factors that explain the low electrification rates in the region?
How do disparities in access to electricity manifest themselves between rural and urban areas?
What is the real potential of renewable energies to improve access to electricity in Central African countries?
What levers can be used to sustainably reduce energy inequalities?
To analyze the socio-economic, demographic, and energy determinants of access to electricity.
To assess disparities in access between rural and urban areas.
To examine the current and future contribution of renewable energy to the regional energy mix.
To identify policies and strategies likely to strengthen sustainable electrification.
H1: Low levels of electrification in Central Africa are primarily linked to weak infrastructure, financial constraints, and high population growth.
H2: Disparities between urban and rural areas result from high installation costs and a lack of targeted investment in sparsely populated areas.
H3: Renewable energies, particularly solar and hydropower, constitute a viable alternative for reducing inequalities in access.
H4: Strengthened energy governance and coherent policies would significantly improve regional electrification.
This study adopts a quantitative and analytical approach based on:
• Sources: World Bank, UN, IEA, Our World in Data, national reports.
• Variables: electricity production, access (urban/rural), population, population growth, GDP, income, HDI, industrialization, renewables, electricity demand.
• Basic statistics
• Comparison between countries
• Analysis of rural/urban disparities
• Summary graphs and tables
• Correlations between variables
• Analysis of determinants of access
• Identification of relationships between economic, demographic, and energy factors
These variables were chosen because they:
• cover the essential dimensions of electrification challenges: economic, demographic, social, and energy-related;
• explain territorial inequalities, particularly between rural and urban areas;
• reflect the structural constraints specific to Central Africa;
• allow for a robust quantitative analysis based on available data;
• allow for an assessment of the potential role of renewable energies.
Together, these variables describe the theme because they allow for the analysis of both obstacles (production, income, governance, rural population) and opportunities (hydropower, energy transition).
Overview of electrification in Central Africa
The specific challenges of the Central African region
Central Africa faces considerable challenges in terms of energy supply. An exhaustive
analysis of the issues surrounding rural electrification and the
development of renewable energies in Central Africa reveals that the
obstacles to electrification are multidimensional, encompassing
structural, economic, and institutional factors specific to the region
[1]. Territorial disparities between urban and rural areas are
particularly pronounced throughout the region. Electrification of rural
areas represents a major challenge, as rural populations account for a
substantial proportion of those without access to electricity in Central
Africa. The scale issues involved in decentralized electricity
production are particularly acute in this region, characterized by low
population density, dense forests, and considerable distances between
communities [2].
Regional dynamics of energy transition
Energy policies in Central Africa
remain fragmented and often ill-suited to the realities on the ground. A
holistic analysis of sustainable energy policies in the region reveals
the complexity of institutional, economic, and social issues
characterizing the Central African energy sector [3]. An examination of
the pace of the sustainable energy transition in Central Africa identifies
the factors slowing it despite the region’s considerable energy
resources. Central Africa’s energy potential remains largely untapped,
particularly in renewable energy, despite existing national and regional
initiatives [4].
Political economy and institutional determinants
The political and economic determinants of access to energy
Understanding the political economy of energy is crucial to grasping the obstacles to
electrification in Central Africa. The exploration of political and
economic determinants of energy access demonstrates how power relations,
governance structures, and political interests substantially influence
investment decisions in the energy sector of Central African countries
[5]. Across the region, an analysis of the political outlook for renewable
energy examines the status, drivers, and barriers of the sector. This
reveals recurring tensions between political rhetoric favoring renewable
energy and ground realities, where investment remains limited and
regulatory frameworks are underdeveloped in most countries in the
subregion [6].
Regional institutional and regulatory architectures
A comprehensive overview of renewable energy sources in Central Africa
highlights several structural shortcomings common to the region:
weak regulatory capacity, lack of appropriate financing mechanisms, poor
coordination between national and regional actors, and insufficient
incentives for private investors [7]. These institutional failures
characterize Central Africa as a whole, where energy governance
structures remain fragile and ill-suited to the requirements of massive
and sustainable electrification. The lack of harmonization of energy
policies among countries in the sub-region is also a major obstacle to
regional energy integration.
The considerable potential of renewable energies in Central Africa
Diversity and abundance of regional renewable resources
Central Africa has exceptional potential in terms of renewable energy, widely documented in the scientific literature. An in-depth examination of the region’s renewable energy potential identifies abundant resources in solar energy (particularly in the Sahelian zones of Chad and the CAR), hydropower (Congo Basin), and biomass (equatorial forests), with technical potential far exceeding the region’s current needs [8]. A detailed mapping of renewable energy sources in Central Africa demonstrates that the region has all the natural assets necessary to develop an energy mix based entirely on renewable energies, with significant geographical complementarities between countries [7].
Hydropower: a major strategic asset for Central Africa
Hydropower is Central Africa’s most significant asset, due to its abundant water resources, particularly the Congo Basin, which is the second largest river basin in the world. Analysis of the region’s hydropower potential and its development within the framework of the Central Africa Power Pool (CAPP) vision shows that the region’s hydroelectric potential could not only meet the local needs of all Central African countries, but also contribute substantially to regional energy integration and exports to neighboring regions [9]. However, the development of large hydroelectric dams, although technically feasible in several countries in the region, raises significant environmental and social issues, particularly in terms of population displacement, biodiversity protection, and impacts on forest ecosystems. Micro-hydroelectricity therefore appears to be a particularly suitable alternative for isolated rural areas throughout the sub-region [9].
Photovoltaic and micro-hydroelectric solutions for regional rural electrification
In Central Africa, solutions combining solar photovoltaic energy and micro-hydroelectricity are proposed. These decentralized technologies are particularly well suited to the context of Central Africa, where the dispersion of rural populations across all countries in the region makes the extension of the central electricity grid economically unviable in many cases [10]. Hybrid systems offer additional flexibility to meet the specific needs of each area. Optimal multi-criteria sizing and analysis for systems combining photovoltaics, wind power, fuel cells, batteries, and diesel generators ensure a reliable electricity supply even during periods of low sunlight or low water levels, which is a significant issue in Central Africa given its marked seasonal variations and climatic diversity between countries in the sub-region [11].
Microgrids: an organizational paradigm for regional decentralized electrification
The exploration of the role of renewable energy-powered microgrids in rural electrification reveals concepts directly transferable to the whole of Central Africa. Microgrids enable the creation of autonomous electrical systems at the village or community level, thereby circumventing the prohibitive costs of extending national grids in the sparsely populated areas characteristic of the region [12]. This decentralized approach is particularly relevant for Central Africa as a whole, where geography (dense forests in Gabon and the DRC, desert areas in Chad, rugged terrain) and the dispersion of settlements make conventional electrification particularly costly and logistically complex in all countries of the sub-region.
Rural electrification: a key issue in reducing inequalities in Central Africa
The scale of the rural challenge at the regional level
Rural electrification is the key challenge in reducing spatial inequalities in access to energy across Central Africa.
Documentation of these challenges specifically for the region highlights that rural electrification rates remain extremely low in all Central African countries, often below 10% and sometimes even below 5% in certain remote rural areas of Chad, CAR, and DRC [1]. Examination of the opportunities and challenges of rural renewable energy projects in Africa identifies several critical success factors applicable to the region: effective community participation, viable economic models adapted to local contexts, development of local maintenance capacities in each country, and adaptation of technologies to the specific socio-cultural and geographical contexts of each area [13].
Climate resilience and access to clean energy in the region
Analysis shows how access to clean energy in rural areas can accelerate climate resilience, a particularly crucial issue for Central Africa in the face of climate change, which affects the northern Sahelian zones and threatens the preservation of the forested areas of the Congo Basin [14].
Sector-specific applications: the regional health imperative
Exploration of strategies for achieving universal electrification of rural health facilities with decentralized energy sources is crucial for the whole of Central Africa, where the lack of electricity in rural health centers seriously compromises the quality of care in all countries in the sub-region (vaccine storage, lighting for night-time deliveries, operation of essential medical equipment) [15]. The electrification of basic social services (health, education) should be a strategic priority in the rural electrification policies of all Central African countries, as it generates immediate and tangible socio-economic benefits that can catalyze regional development.
Regional promotion and implementation strategies
Integrated approaches: renewable energy and energy efficiency
Strategies to promote renewable energy and energy efficiency in Central Africa advocate an integrated approach combining renewable energy development and energy efficiency improvements at the regional level in order to maximize the impact of the limited investments available in the various countries of the subregion [16], [17]. This approach is particularly relevant for Central Africa as a whole, where energy demand is growing rapidly with the economic development of major urban areas. Improving energy efficiency in all countries in the region will help moderate this growth in demand and thus reduce the investment needed in production capacity at the regional level.
Multidimensional assessment of regional renewable potential
Assessment of renewable energy potential according to three critical dimensions (accessibility, affordability, and reliability) is essential for Central Africa as a whole, where physical access to electricity is not enough; energy must also be financially accessible to local populations in all countries and sufficiently reliable to support productive activities that can contribute to regional economic development [18]. Renewable energy, particularly in decentralized configurations, can offer a better balance between these three dimensions than the extension of conventional networks in rural areas of all Central African countries, where infrastructure costs are prohibitive.
Technical and operational challenges specific to Central Africa
Reliability and intermittency of renewable energy sources in the region
Study of the impact of periods of low resource availability on the reliability of rural electrification systems addresses the issue of intermittency applicable to the various renewable energies that can be exploited in Central Africa, with specific characteristics depending on the geographical area [19].
In Central Africa, the seasonal variability of resources has different characteristics depending on the country: alternating dry and rainy seasons affecting hydropower in Cameroon and Gabon, more marked variations in sunshine in the Sahelian areas of Chad, and a more stable rainfall regime in the Congo Basin. This diversity requires storage solutions or hybrid systems adapted to the specific characteristics of each area to ensure a continuous supply of electricity at the regional level.
SWOT analysis of the renewable energy sector in Central Africa
A SWOT analysis (strengths, weaknesses, opportunities, threats) of renewable energies provides conclusions directly applicable to Central Africa as a whole [20]:
• Strengths: abundant and diversified renewable potential throughout the region, falling technology costs, opportunity for technological leapfrogging for all countries.
• Weaknesses: limited technical and institutional capacities in all countries, difficult access to financing, lack of local maintenance.
• Opportunities: regional economic growth, progressive urbanization, international commitment to universal access to energy, regional cooperation via the CAPP.
• Threats: political instability in certain areas of the and Chad, climate change affecting water and forest resources, competition with fossil fuels, which are sometimes subsidized.
Energy transition scenarios for the region
Analysis of a scenario for transitioning to 100% renewable energy identifies technical and economic constraints. For Central Africa as a whole, such a transition is technically possible thanks to the complementary nature of resources between countries: abundant water resources in the Congo Basin that can provide stable base load power for several countries, significant solar potential in Chad and the CAR, and biomass resources in forested areas [21].
Nevertheless, this regional transition requires massive investments in production and transport infrastructure, institutional capacity building in all countries, the development of a local maintenance and equipment production industry at the regional level, and, above all, effective regional coordination through the CAPP.
# Set options and load required libraries
knitr::opts_chunk$set(echo = TRUE)
library(FactoMineR)
library(factoextra)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.5.2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(ggplot2)
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(Factoshiny)
## Loading required package: shiny
## Loading required package: FactoInvestigate
library(shiny)
library(FactoInvestigate)
#Set options and load required libraries
library(DataExplorer)
## Warning: package 'DataExplorer' was built under R version 4.5.2
library(corrplot)
## corrplot 0.95 loaded
library(pander)
## Warning: package 'pander' was built under R version 4.5.2
##
## Attaching package: 'pander'
## The following object is masked from 'package:shiny':
##
## p
library(DT)
##
## Attaching package: 'DT'
## The following objects are masked from 'package:shiny':
##
## dataTableOutput, renderDataTable
library(rsconnect)
##
## Attaching package: 'rsconnect'
## The following object is masked from 'package:shiny':
##
## serverInfo
library(askpass)
library(VIM)
## Loading required package: colorspace
## Loading required package: grid
## VIM is ready to use.
## Suggestions and bug-reports can be submitted at: https://github.com/statistikat/VIM/issues
##
## Attaching package: 'VIM'
## The following object is masked from 'package:datasets':
##
## sleep
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
The database was entered into Excel and saved in CSV (semicolon-delimited) format. We then imported this file into R using the “read.csv()” function.
#Define the file path
file <- "C:/Users/LEGION/Desktop/TD Analyse des Données/PROJET RTI/Groupe5GEE/RTI_DATA_2022.csv"
# Load the data (semicolon separator, decimal comma, first column used as row names)
donnees_csv <- read.csv(file, header = TRUE, sep = ";", dec = ",", row.names = 1, fileEncoding = "latin1")
datatable(donnees_csv, options = list(pageLength = 5, autoWidth = TRUE))
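As a side note on the import step, base R also provides “read.csv2()”, whose defaults already match a semicolon-delimited file with decimal commas. The sketch below is only an illustration: the inline text and the names Country_A and Country_B are hypothetical and do not come from the project file.

```r
# Hypothetical two-row extract in the same layout as the project file
# (semicolon separator, decimal comma); the values are invented.
csv_text <- "Country;Elect_Gen;HDI
Country_A;2,5;0,70
Country_B;0,3;0,39"

# read.csv2() defaults to sep = ";" and dec = ","
toy <- read.csv2(text = csv_text, header = TRUE, row.names = 1)

# Both columns are parsed as numeric despite the decimal commas
str(toy)
```

With a real file, the “file” path and the “fileEncoding” argument would be passed exactly as in the call above.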
To improve the readability of the database header, we renamed the
variables so that the units given in parentheses are no longer
displayed. To do this, we used the
“colnames” function, which we applied to our database.
#Rename the columns with the variable names without the units in parentheses
colnames(donnees_csv) <- c(
"Elect_Gen", # Elec_Gen (GWh)
"Access_Elect", # Access_Elec (% of pop)
"Access_Elect_Urbain", # Access_Elec_urban (% of urban pop)
"Access_Elect_Rural", # Access_Elec_Rural (% of rural pop)
"Elec_Demand", # Elec_Demand (GWh)
"Total_Pop", # Total_Pop (hbts)
"Rural_Pop", # Rural_Pop (% of total pop)
"Pop_Growth", # Pop_Growth (annual %)
"GDP_Per_Capita", # GDP_Per_Capita (current US$)
"HDI", # HDI
"Fossil_fuels_elect_gen", # Fossil fuels elect gen (billion kWh)
"Hydroelectricity_gen", # Hydroelectricity generation (billion kWh)
"Income_Class", # Income_Class
"Indust_Level" #Indust_Level
)
datatable(donnees_csv, options = list(pageLength = 5, autoWidth = TRUE))
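The renaming above is done by hand. As a possible alternative, the unit suffix could be stripped with a regular expression. This is a minimal sketch on a hypothetical vector of raw headers (raw_names is not an object from the project); note also that “read.csv()” rewrites such headers by default (check.names = TRUE), so the file would have to be read with check.names = FALSE for this to apply.

```r
# Hypothetical raw headers, written as they might appear in the spreadsheet
raw_names <- c("Elec_Gen (GWh)", "Access_Elec (% of pop)", "HDI")

# Remove a trailing "(...)" group together with the whitespace before it
clean_names <- sub("[[:space:]]*\\(.*\\)$", "", raw_names)

clean_names
```

Writing the names out explicitly, as done above, remains the safer choice when the headers are irregular.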
In this section, we created a function that counts the missing values in each variable and computes their proportion relative to the number of observations. After running the code, we see that only the rural electrification rate (Access_Elect_Rural) has a missing value: 1 of the 11 observations, a proportion of about 0.09. All other variables are complete.
# Function to calculate the proportion of missing values per variable
proportion_valeurs_manquantes <- function(data)
{
# Calculating the number of missing values per column
nb_valeurs_manquantes <- sapply(data, function(x) sum(is.na(x)))
# Calculating the proportion of missing values
proportion_manquantes <- nb_valeurs_manquantes / nrow(data)
# Creating a dataframe for the result
resultat <- data.frame(Nombre = nb_valeurs_manquantes, Proportion = proportion_manquantes)
return(resultat)
}
# Using the function with your database
resultat <- proportion_valeurs_manquantes(donnees_csv)
# Displaying the result
resultat
## Nombre Proportion
## Elect_Gen 0 0.00000000
## Access_Elect 0 0.00000000
## Access_Elect_Urbain 0 0.00000000
## Access_Elect_Rural 1 0.09090909
## Elec_Demand 0 0.00000000
## Total_Pop 0 0.00000000
## Rural_Pop 0 0.00000000
## Pop_Growth 0 0.00000000
## GDP_Per_Capita 0 0.00000000
## HDI 0 0.00000000
## Fossil_fuels_elect_gen 0 0.00000000
## Hydroelectricity_gen 0 0.00000000
## Income_Class 0 0.00000000
## Indust_Level 0 0.00000000
# Using the aggr() function to view missing values
aggr(donnees_csv, col=c('navyblue','yellow'), numbers=TRUE, sortVars=TRUE,
labels=names(donnees_csv), cex.axis=.7, gap=3, ylab=c("Histogram of missing data","Pattern"))
##
## Variables sorted by number of missings:
## Variable Count
## Access_Elect_Rural 0.09090909
## Elect_Gen 0.00000000
## Access_Elect 0.00000000
## Access_Elect_Urbain 0.00000000
## Elec_Demand 0.00000000
## Total_Pop 0.00000000
## Rural_Pop 0.00000000
## Pop_Growth 0.00000000
## GDP_Per_Capita 0.00000000
## HDI 0.00000000
## Fossil_fuels_elect_gen 0.00000000
## Hydroelectricity_gen 0.00000000
## Income_Class 0.00000000
## Indust_Level 0.00000000
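The same counts and proportions can be obtained more compactly with base R. The sketch below uses a small hypothetical data frame (toy), not the project data, and is only meant to show the arithmetic behind the table: a proportion is the number of NA values divided by the number of rows (here 1/4; in the dataset above, 1/11 ≈ 0.0909).

```r
# Hypothetical data frame with one missing value in column "a"
toy <- data.frame(a = c(1, NA, 3, 4), b = c(10, 20, 30, 40))

# is.na() gives a logical matrix; column sums count the NAs,
# column means give their share of the rows
missing_count <- colSums(is.na(toy))
missing_share <- colMeans(is.na(toy))

data.frame(Nombre = missing_count, Proportion = missing_share)
```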
Only one variable contains a missing value:
• Access_Elect_Rural: 1 missing value (about 9.1% of the 11 observations for this variable).
All other variables have 0% missing values.
Interpretation
• The dataset is generally clean and fully usable.
• The single missing value does not invalidate the PCA: FactoMineR automatically imputes it with the mean of the variable, although the warning it prints recommends the imputePCA function of the missMDA package instead.
• With only one NA, missing data should have very little influence on the PCA results.
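To make the default treatment concrete, the following sketch reproduces mean imputation in base R on a hypothetical data frame. It is not FactoMineR's internal code, only an illustration of the rule stated in its warning; impute_mean is a helper name introduced here.

```r
# Hypothetical data with one missing value in x
toy <- data.frame(x = c(2, 4, NA, 6), y = c(1, 2, 3, 4))

# Replace each NA by the mean of the observed values of that variable
impute_mean <- function(v) {
  v[is.na(v)] <- mean(v, na.rm = TRUE)
  v
}

toy_imputed <- as.data.frame(lapply(toy, impute_mean))
toy_imputed
```

Mean imputation leaves the variable's mean unchanged but slightly shrinks its variance, which is one reason a model-based imputation may be preferred when more values are missing.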
This description involves creating histograms and boxplots for a selection of quantitative variables (columns 1, 4, 9, 11, and 12). These graphs allow us to analyze the distribution of each variable: central tendency, dispersion, outliers, etc. We begin this step by checking that the selected columns are numeric using the “sapply()” function.
# Select columns 1, 4, 9, 11, and 12
colonnes_cible <- c(1, 4, 9, 11, 12)
# Check that they are numeric
vars_quantitatives <- sapply(donnees_csv[, colonnes_cible], is.numeric)
# Loop to draw a histogram for each selected variable
for (var in names(donnees_csv[, colonnes_cible])[vars_quantitatives]) {
print(
ggplot(donnees_csv, aes_string(x = var)) +
geom_histogram(bins = 30, fill = "blue", color = "black") +
theme_minimal() +
labs(title = paste("Histogram of", var),
x = var,
y = "Frequency")
)
}
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).
The histograms show very heterogeneous distributions, which is expected in an analysis of Central Africa, where countries have very different energy profiles.
Elect_Gen (total electricity production)
• Highly asymmetrical distribution.
• Cameroon, Congo, and Gabon produce significantly more electricity than the Central African Republic (CAR) and Chad.
This indicates strong structural disparities between countries.
Access_Elect, Access_Elect_Urbain, Access_Elect_Rural
• High heterogeneity:
• Gabon and Cameroon: very high access
• Chad and CAR: very low access
• Rural: extremely low values everywhere (very low for CAR and Chad)
This confirms that rural access is the main challenge in the region.
GDP_Per_Capita, HDI
• Highly dispersed distribution: Gabon and Equatorial Guinea are by far the most dominant.
Wealthy countries have better energy performance.
Fossil_fuels_elect_gen and Hydroelectricity_gen
• Some countries do not use fossil fuels or hydropower at all (Chad, Central African Republic).
The energy mix also explains the disparities in electrification.
These descriptions may be confirmed or refuted by the PCA.
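The asymmetry read off the histograms can also be summarized by a number. The sketch below computes a moment-based skewness coefficient in base R on two hypothetical vectors; the function name skewness and the data are introduced here for illustration only and are not part of the original analysis.

```r
# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 3/2
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Hypothetical examples: one long right tail, one symmetric sample
right_skewed <- c(1, 1, 2, 2, 3, 20)
symmetric <- c(1, 2, 3, 4, 5)

skewness(right_skewed)  # positive: a few large values pull the tail to the right
skewness(symmetric)     # zero for a perfectly symmetric sample
```

Applied to a column such as donnees_csv$Elect_Gen, a clearly positive value would be consistent with a few countries producing far more than the rest.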
# Select columns 1, 4, 9, 11, and 12
colonnes_cible <- c(1, 4, 9, 11, 12)
# Check that these columns are numeric
vars_quantitatives <- sapply(donnees_csv[, colonnes_cible], is.numeric)
# Create a boxplot for each selected variable
for (var in names(donnees_csv[, colonnes_cible])[vars_quantitatives]) {
print(
ggplot(donnees_csv, aes_string(x = factor(1), y = var)) +
geom_boxplot(fill = "skyblue", color = "darkblue") +
theme_minimal() +
labs(title = paste("Boxplot of", var),
x = "",
y = var)
)
}
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_boxplot()`).
The boxplots show strong outliers for:
• Elect_Gen (Cameroon very high)
• GDP_Per_Capita (Gabon and Equatorial Guinea well above the others)
• Access_Elect_Rural (Gabon well above the others)
This confirms that the sample contains very diverse countries, which supports the use of PCA to identify typical profiles.
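The points drawn as outliers in a boxplot follow a simple rule: they lie more than 1.5 times the box height beyond the hinges. A minimal sketch with base R's “boxplot.stats()” on hypothetical values (toy_values is invented, not taken from the dataset) lists them explicitly:

```r
# Hypothetical values with one observation far above the others
toy_values <- c(450, 600, 700, 900, 1100, 1500, 8000)

# $stats holds the lower whisker, lower hinge, median, upper hinge, upper whisker;
# $out holds the points beyond 1.5 times the hinge spread
bs <- boxplot.stats(toy_values)
bs$stats
bs$out
```

With only 11 countries, a flagged point is better read as a distinct profile than as an error to be removed.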
# Function to create a barplot in proportions
creer_barplot_proportion <- function(data, column_name)
{
# Calculate the proportions
proportions <- data %>%
count(.data[[column_name]]) %>%
mutate(Proportion = n / sum(n))
# Create the barplot
ggplot(proportions, aes_string(x = column_name, y = "Proportion", fill = column_name)) +
geom_bar(stat = "identity") +
scale_y_continuous(labels = scales::percent_format()) +
labs(x = column_name, y = "Proportion (%)") +
theme_minimal()
}
# Create a bar plot for the variable "Income_Class"
creer_barplot_proportion(donnees_csv, "Income_Class")
# Create a bar plot for the variable "Indust_Level"
creer_barplot_proportion(donnees_csv, "Indust_Level")
The income bracket chart shows that over 45% of countries have very low incomes and nearly 35% have middle incomes. Only slightly less than 20% of the countries studied have relatively high incomes.
The industrialization level chart shows that nearly 40% of Central African countries have low levels of industrialization, while nearly 40% have high levels of industrialization.
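The percentages shown in these charts are simple category shares. The sketch below shows the same computation in base R on a hypothetical vector of classes (toy_class is invented for illustration). With 11 countries, each country accounts for 1/11, about 9.1%, of a bar.

```r
# Hypothetical classification of six units
toy_class <- c("Low", "Low", "Middle", "High", "Low", "Middle")

# table() counts each category; prop.table() converts counts to shares
shares <- prop.table(table(toy_class))
round(100 * shares, 1)
```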
We calculate the correlation matrix for all quantitative variables and visualize it using a correlation plot. This helps in understanding relationships between variables before performing PCA.
# Identify the quantitative columns
vars_quantitatives <- sapply(donnees_csv, is.numeric)
#Extraction of quantitative variables
donnees_quantitatives <- donnees_csv[, vars_quantitatives]
# Calculate the correlation matrix
matrice_correlation <- cor(donnees_quantitatives, use = "complete.obs")
datatable(matrice_correlation, options = list(pageLength = 6)) %>%
formatRound(columns = 1:ncol(matrice_correlation), digits = 2)
# Correlation plot (corrplot package)
corrplot(matrice_correlation, method = "color", type = "upper", tl.col = "black", tl.srt = 75)
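One detail of the call above deserves attention: use = "complete.obs" removes every row that contains an NA before any correlation is computed, so here the country with the missing rural access value is excluded from all pairs. The sketch below, on a hypothetical data frame, contrasts this with "pairwise.complete.obs", which only drops a row for the pairs that involve the missing value.

```r
# Hypothetical data: the third row has a missing value in column c
toy <- data.frame(a = c(1, 2, 3, 4, 5),
                  b = c(2, 4, 6, 8, 100),
                  c = c(5, 3, NA, 2, 1))

cor_complete <- cor(toy, use = "complete.obs")           # row 3 dropped everywhere
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")  # row 3 dropped only for pairs with c

# The a-b correlation differs because row 3 is used only in the pairwise version
cor_complete["a", "b"]
cor_pairwise["a", "b"]
```

Neither option is presented here as the better one; with 11 observations, the choice simply changes whether 10 or 11 countries enter most coefficients.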
The correlation graph (heatmap) highlights the relationships between the different variables associated with electrification in Central Africa. Several important links clearly emerge:
The three electricity access variables (Access_Elect, Access_Elect_Urbain, Access_Elect_Rural) are very strongly correlated with each other (high coefficients, in dark blue). This means that:
• when a country has good overall access to electricity,
• it also has good access in urban areas,
• and often better access in rural areas (even if the levels remain low).
This is logical: overall access is primarily driven by urban performance, but any improvement in rural access also raises total access.
GDP_Per_Capita and HDI show one of the highest positive correlations (intense blue). This indicates that:
• countries with a high GDP per capita (Gabon, Equatorial Guinea)
• also have a higher Human Development Index.
This reflects a structural reality: the wealthier a country is, the better its performance in health, education, and infrastructure, and therefore in electrification.
Elect_Gen and Elec_Demand are almost perfectly correlated. This means that:
• countries that produce a lot of electricity
• are also those that consume a lot of it.
This is expected behavior for energy systems: demand drives production, and production capacity depends on the level of industrialization and urbanization.
• Countries with high hydroelectric production (Cameroon, Gabon) are not those with high fossil fuel production.
• The two variables (Hydroelectricity_gen and Fossil_fuels_elect_gen) are therefore generally inversely correlated.
This shows two types of energy profiles:
• “hydro-dependent” countries
• “fossil fuel-dependent” countries
# Center and scale (standardize) the data
donnees_centrees_reduites <- scale(donnees_quantitatives,center = TRUE,scale=TRUE)
datatable(donnees_centrees_reduites, options = list(pageLength = 5, autoWidth = TRUE))
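For reference, standardizing with “scale()” subtracts each column's mean and divides by its sample standard deviation (the n - 1 version returned by “sd()”). The sketch below checks this by hand on a hypothetical matrix; toy and manual are names introduced here.

```r
# Hypothetical matrix with two variables on very different scales
toy <- cbind(x = c(10, 20, 30, 40), y = c(1, 3, 5, 11))

# Standardize with scale()
z <- scale(toy, center = TRUE, scale = TRUE)

# Same computation by hand: subtract column means, divide by column sd
centered <- sweep(toy, 2, colMeans(toy), "-")
manual <- sweep(centered, 2, apply(toy, 2, sd), "/")

colMeans(z)       # numerically zero
apply(z, 2, sd)   # exactly one
```

After this step every variable carries the same weight in the analysis, regardless of its original unit (GWh, %, US$).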
# Perform the PCA
resultat_acp <- PCA(donnees_centrees_reduites, axes = c(1, 2), graph = TRUE)
## Warning in PCA(donnees_centrees_reduites, axes = c(1, 2), graph = TRUE):
## Missing values are imputed by the mean of the variable: you should use the
## imputePCA function of the missMDA package
# Display the results of the PCA
print(resultat_acp)
## **Results for the Principal Component Analysis (PCA)**
## The analysis was performed on 11 individuals, described by 12 variables
## *The results are available in the following objects:
##
## name description
## 1 "$eig" "eigenvalues"
## 2 "$var" "results for the variables"
## 3 "$var$coord" "coord. for the variables"
## 4 "$var$cor" "correlations variables - dimensions"
## 5 "$var$cos2" "cos2 for the variables"
## 6 "$var$contrib" "contributions of the variables"
## 7 "$ind" "results for the individuals"
## 8 "$ind$coord" "coord. for the individuals"
## 9 "$ind$cos2" "cos2 for the individuals"
## 10 "$ind$contrib" "contributions of the individuals"
## 11 "$call" "summary statistics"
## 12 "$call$centre" "mean of the variables"
## 13 "$call$ecart.type" "standard error of the variables"
## 14 "$call$row.w" "weights for the individuals"
## 15 "$call$col.w" "weights for the variables"
### Correlation Circle of Variables
The correlation circle identifies the structure of the first two axes of the PCA. The first dimension (Dimension 1), which explains 43.41% of the total variance, is strongly correlated with variables reflecting the level of socioeconomic development and access to electricity. The vectors GDP_Per_Capita, HDI, Access_Elect and Access_Elect_Urbain point clearly in the positive direction of this axis. The variables related to electricity production and demand (Elect_Gen, Elec_Demand, Hydroelectricity_gen, Fossil_fuels_elect_gen) are better read against the second axis, where their coordinates are larger in the loadings table reported further on. Dimension 1 therefore contrasts countries with high levels of wealth and advanced electrification with those with low economic and energy capacity. Thus, this dimension can be named:
Axis 1: “Economic Development and Energy Performance”
The second dimension (Dimension 2), which explains 36.60% of the variance, contrasts demographic variables. The vectors Pop_Growth and Total_Pop are positively aligned on this axis, while Rural_Pop is projected to the negative side. This structure reflects a contrast between, on the one hand, countries with high population growth or a large total population, and on the other hand, those where the population is predominantly rural. Variables such as Hydroelectricity_gen and Elec_Demand lie close to the vertical axis, which means that they are also correlated with this dimension: the overall size of the electricity system goes together with a large total population. Dimension 2 therefore expresses characteristics related to population pressure, urbanization, and territorial imbalances, together with the scale of the national electricity system. This dimension can be named:
Axis 2: “Demographic Dynamics and Territorial Structure”
# Perform the PCA with qualitative variables
resultat_acp <- PCA(donnees_csv, scale.unit = TRUE, ncp = 2, quali.sup = 13:14, graph = TRUE)
## Warning in PCA(donnees_csv, scale.unit = TRUE, ncp = 2, quali.sup = 13:14, :
## Missing values are imputed by the mean of the variable: you should use the
## imputePCA function of the missMDA package
# Display the results of the PCA
print(resultat_acp)
## **Results for the Principal Component Analysis (PCA)**
## The analysis was performed on 11 individuals, described by 14 variables
## *The results are available in the following objects:
##
## name description
## 1 "$eig" "eigenvalues"
## 2 "$var" "results for the variables"
## 3 "$var$coord" "coord. for the variables"
## 4 "$var$cor" "correlations variables - dimensions"
## 5 "$var$cos2" "cos2 for the variables"
## 6 "$var$contrib" "contributions of the variables"
## 7 "$ind" "results for the individuals"
## 8 "$ind$coord" "coord. for the individuals"
## 9 "$ind$cos2" "cos2 for the individuals"
## 10 "$ind$contrib" "contributions of the individuals"
## 11 "$quali.sup" "results for the supplementary categorical variables"
## 12 "$quali.sup$coord" "coord. for the supplementary categories"
## 13 "$quali.sup$v.test" "v-test of the supplementary categories"
## 14 "$call" "summary statistics"
## 15 "$call$centre" "mean of the variables"
## 16 "$call$ecart.type" "standard error of the variables"
## 17 "$call$row.w" "weights for the individuals"
## 18 "$call$col.w" "weights for the variables"
# Biplot visualization
fviz_pca_biplot(resultat_acp, repel = TRUE)
This graph is a principal component analysis (PCA) biplot, which represents the relationships between countries (points) and variables (arrows) on the first two principal dimensions, Dim1 and Dim2. These dimensions capture 43.55% and 34.57% of the data variance, respectively, or 78.12% of the total variance together. We can also make the following observations:
• The first axis, titled “Economic Development and Energy Performance,” separates two groups of countries. To the right of the axis are the countries characterized by high GDP per capita, a higher HDI, and better rates of access to electricity, both in urban and rural areas: Gabon, São Tomé and Príncipe, and Equatorial Guinea, followed closer to the origin by the Republic of the Congo, Rwanda, and Cameroon. To the left of this axis appear countries with a lower level of development, reduced access to electricity, and large or fast-growing populations: the DRC, Chad, Burundi, Angola, and the Central African Republic. The first axis thus reflects the overall gradient of development and access to electricity among the countries studied.
• The second axis, called “Demographic Dynamics and Territorial Structure,” contrasts predominantly rural countries with small electricity systems, such as the Central African Republic, Burundi, and Chad (negative side), with countries that have large total electricity generation and demand, chiefly Angola and the DRC, followed by Cameroon and Gabon (positive side). Equatorial Guinea and São Tomé and Príncipe lie closer to the origin on this axis. Read together with the first axis, this places Chad, the Central African Republic, and Burundi in the lower-left quadrant: these more rural countries with high population growth are also those that encounter the greatest difficulties in accessing energy services.
The eigenvalues indicate the amount of variance explained by each principal component.
# Extract and plot eigenvalues
val.propre <- get_eigenvalue(resultat_acp)
pander(val.propre)
| | eigenvalue | variance.percent | cumulative.variance.percent |
|---|---|---|---|
| Dim.1 | 5.226 | 43.55 | 43.55 |
| Dim.2 | 4.148 | 34.57 | 78.12 |
| Dim.3 | 1.044 | 8.696 | 86.81 |
| Dim.4 | 0.7268 | 6.056 | 92.87 |
| Dim.5 | 0.5651 | 4.709 | 97.58 |
| Dim.6 | 0.1574 | 1.312 | 98.89 |
| Dim.7 | 0.08546 | 0.7122 | 99.6 |
| Dim.8 | 0.03278 | 0.2732 | 99.88 |
| Dim.9 | 0.01335 | 0.1113 | 99.99 |
| Dim.10 | 0.001465 | 0.01221 | 100 |
fviz_eig(resultat_acp, addlabels = TRUE, ylim = c(0, 50))
## Warning in geom_bar(stat = "identity", fill = barfill, color = barcolor, :
## Ignoring empty aesthetic: `width`.
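The percentages in the eigenvalue table follow directly from the eigenvalues: each one is divided by their sum, which equals the number of standardized variables (12 here). The sketch below recomputes them from the values printed in the table; the use of the "eigenvalue greater than 1" rule of thumb at the end is an added illustration, not a criterion stated in the analysis.

```r
# Eigenvalues copied from the table above
eig <- c(5.226, 4.148, 1.044, 0.7268, 0.5651, 0.1574,
         0.08546, 0.03278, 0.01335, 0.001465)

# Share of total variance carried by each dimension, and running total
variance_percent   <- 100 * eig / sum(eig)
cumulative_percent <- cumsum(variance_percent)

print(round(sum(eig), 3))             # close to 12, the number of variables
print(round(variance_percent, 2))
print(round(cumulative_percent, 2))

# Rule of thumb (an assumption here): keep dimensions with eigenvalue > 1
kaiser <- which(eig > 1)
print(kaiser)
```

Small differences in the last digit relative to the table come from the rounding of the printed eigenvalues.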
We examine the contribution of each variable to the principal components.
# Get PCA variable results
resultat.var <- get_pca_var(resultat_acp)
pander(resultat.var$coord)
| | Dim.1 | Dim.2 |
|---|---|---|
| Elect_Gen | -0.4548 | 0.8823 |
| Access_Elect | 0.8706 | 0.4483 |
| Access_Elect_Urbain | 0.7557 | 0.3368 |
| Access_Elect_Rural | 0.6208 | 0.07858 |
| Elec_Demand | -0.4971 | 0.8549 |
| Total_Pop | -0.7399 | 0.5019 |
| Rural_Pop | -0.6205 | -0.6026 |
| Pop_Growth | -0.8368 | 0.319 |
| GDP_Per_Capita | 0.7041 | 0.3572 |
| HDI | 0.779 | 0.6015 |
| Fossil_fuels_elect_gen | 0.1798 | 0.6623 |
| Hydroelectricity_gen | -0.5388 | 0.8205 |
pander(resultat.var$cor)
| | Dim.1 | Dim.2 |
|---|---|---|
| Elect_Gen | -0.4548 | 0.8823 |
| Access_Elect | 0.8706 | 0.4483 |
| Access_Elect_Urbain | 0.7557 | 0.3368 |
| Access_Elect_Rural | 0.6208 | 0.07858 |
| Elec_Demand | -0.4971 | 0.8549 |
| Total_Pop | -0.7399 | 0.5019 |
| Rural_Pop | -0.6205 | -0.6026 |
| Pop_Growth | -0.8368 | 0.319 |
| GDP_Per_Capita | 0.7041 | 0.3572 |
| HDI | 0.779 | 0.6015 |
| Fossil_fuels_elect_gen | 0.1798 | 0.6623 |
| Hydroelectricity_gen | -0.5388 | 0.8205 |
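The `$coord` and `$cor` tables are identical because, in a PCA on standardized variables, the coordinate of a variable on a component is its correlation with that component. The sketch below checks this identity with base R's `prcomp()` on simulated data; it is an illustration under that assumption and does not use `PCA()` or the study's dataset.

```r
set.seed(1)
# Simulated data (hypothetical): 10 observations, 4 variables
x <- matrix(rnorm(40), nrow = 10, ncol = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))

pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Variable coordinates: eigenvectors scaled by the component standard deviations
coord <- sweep(pca$rotation, 2, pca$sdev, "*")

# Correlations between the original variables and the component scores
cors <- cor(x, pca$x)

# The two matrices agree up to floating-point error
max_diff <- max(abs(coord - cors))
print(max_diff)
```

A consequence is that the squared coordinates of a variable, summed over all components, equal 1, which is why variable arrows never leave the unit circle.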
pander(resultat.var$cos2)
| | Dim.1 | Dim.2 |
|---|---|---|
| Elect_Gen | 0.2069 | 0.7784 |
| Access_Elect | 0.7579 | 0.2009 |
| Access_Elect_Urbain | 0.5711 | 0.1135 |
| Access_Elect_Rural | 0.3854 | 0.006175 |
| Elec_Demand | 0.2471 | 0.7308 |
| Total_Pop | 0.5475 | 0.2519 |
| Rural_Pop | 0.385 | 0.3632 |
| Pop_Growth | 0.7002 | 0.1018 |
| GDP_Per_Capita | 0.4958 | 0.1276 |
| HDI | 0.6068 | 0.3618 |
| Fossil_fuels_elect_gen | 0.03233 | 0.4386 |
| Hydroelectricity_gen | 0.2903 | 0.6732 |
pander(resultat.var$contrib)
| | Dim.1 | Dim.2 |
|---|---|---|
| Elect_Gen | 3.958 | 18.77 |
| Access_Elect | 14.5 | 4.845 |
| Access_Elect_Urbain | 10.93 | 2.735 |
| Access_Elect_Rural | 7.374 | 0.1489 |
| Elec_Demand | 4.728 | 17.62 |
| Total_Pop | 10.48 | 6.074 |
| Rural_Pop | 7.367 | 8.756 |
| Pop_Growth | 13.4 | 2.453 |
| GDP_Per_Capita | 9.486 | 3.076 |
| HDI | 11.61 | 8.722 |
| Fossil_fuels_elect_gen | 0.6186 | 10.57 |
| Hydroelectricity_gen | 5.555 | 16.23 |
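The four tables above are linked by two simple relations: the quality of representation is the squared coordinate (cos2 = coord²), and the contribution is that cos2 expressed as a percentage of the eigenvalue of the dimension. The sketch below recomputes the Dim.1 columns from the printed coordinates and the Dim.1 eigenvalue (5.226).

```r
# Dim.1 coordinates copied from the coordinates table above
coord_dim1 <- c(
  Elect_Gen = -0.4548, Access_Elect = 0.8706, Access_Elect_Urbain = 0.7557,
  Access_Elect_Rural = 0.6208, Elec_Demand = -0.4971, Total_Pop = -0.7399,
  Rural_Pop = -0.6205, Pop_Growth = -0.8368, GDP_Per_Capita = 0.7041,
  HDI = 0.779, Fossil_fuels_elect_gen = 0.1798, Hydroelectricity_gen = -0.5388
)
eig1 <- 5.226  # eigenvalue of Dim.1 from the eigenvalue table

# Quality of representation: squared correlation with the axis
cos2_dim1 <- coord_dim1^2

# Contribution (%): share of the axis variance due to each variable
contrib_dim1 <- 100 * cos2_dim1 / eig1

print(round(cos2_dim1, 4))
print(round(contrib_dim1, 2))
print(round(sum(cos2_dim1), 3))  # close to the eigenvalue of Dim.1
```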
Let’s now visualize these contributions on the contribution graphs:
From the contribution table and the contribution graphs for the variables, it emerges that:
• The variables that contribute most to the formation of dimension 1 are Access_Elect, Pop_Growth, HDI, Access_Elect_Urbain, Total_Pop and GDP_Per_Capita.
• The variables that contribute most to the formation of dimension 2 are Elect_Gen, Elec_Demand, Hydroelectricity_gen and Fossil_fuels_elect_gen.
• The variables Elect_Gen, Elec_Demand, HDI, Hydroelectricity_gen, Access_Elect, Pop_Growth and Total_Pop contribute most to the formation of the factorial plane.
fviz_pca_var(resultat_acp, col.var = "contrib", gradient.cols = c("blue", "orange", "red"), repel = TRUE, title = "Contribution of Variables to Principal Components")
fviz_contrib(resultat_acp, choice = "var", axes = 1, top = 12)
fviz_contrib(resultat_acp, choice = "var", axes = 2, top = 12)
fviz_contrib(resultat_acp, choice = "var", axes = 1:2, top = 12)
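A common way to read these bar charts is to compare each bar with the contribution expected if all variables contributed equally, that is 100 / 12 ≈ 8.33% with 12 variables. The sketch below applies that threshold to the contributions printed in the table above; treating it as the cut-off for "important" variables is a convention assumed here, not a rule stated in the analysis.

```r
# Contributions (%) copied from the contribution table above
contrib <- data.frame(
  Dim.1 = c(3.958, 14.5, 10.93, 7.374, 4.728, 10.48,
            7.367, 13.4, 9.486, 11.61, 0.6186, 5.555),
  Dim.2 = c(18.77, 4.845, 2.735, 0.1489, 17.62, 6.074,
            8.756, 2.453, 3.076, 8.722, 10.57, 16.23),
  row.names = c("Elect_Gen", "Access_Elect", "Access_Elect_Urbain",
                "Access_Elect_Rural", "Elec_Demand", "Total_Pop",
                "Rural_Pop", "Pop_Growth", "GDP_Per_Capita", "HDI",
                "Fossil_fuels_elect_gen", "Hydroelectricity_gen")
)

# Expected contribution if every variable contributed equally
threshold <- 100 / nrow(contrib)

above_dim1 <- rownames(contrib)[contrib$Dim.1 > threshold]
above_dim2 <- rownames(contrib)[contrib$Dim.2 > threshold]

print(round(threshold, 2))
print(above_dim1)
print(above_dim2)
```

Rural_Pop and HDI exceed the threshold on Dim.2 only marginally (8.76% and 8.72%), so they are borderline contributors to that axis.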
In this section, we explore the coordinates, quality of representation, and contributions of individuals (observations) to the PCA axes.
# Get PCA individual results
resultat.ind <- get_pca_ind(resultat_acp)
pander(resultat.ind$coord)
| | Dim.1 | Dim.2 |
|---|---|---|
| Cameroon | 0.5026 | 1.469 |
| Republic of the Congo | 0.9198 | 0.3619 |
| DRC | -4.357 | 2.313 |
| Gabon | 3.612 | 1.262 |
| Chad | -2.188 | -2.291 |
| Central African Republic | -1.257 | -2.717 |
| Equatorial Guinea | 2.132 | 0.2254 |
| Angola | -1.384 | 3.906 |
| Rwanda | 0.7403 | -1.327 |
| Burundi | -1.613 | -2.485 |
| São Tomé and Príncipe | 2.894 | -0.7181 |
pander(resultat.ind$cos2)
| | Dim.1 | Dim.2 |
|---|---|---|
| Cameroon | 0.06392 | 0.5459 |
| Republic of the Congo | 0.1893 | 0.0293 |
| DRC | 0.6939 | 0.1956 |
| Gabon | 0.8161 | 0.09962 |
| Chad | 0.4192 | 0.4593 |
| Central African Republic | 0.1402 | 0.6543 |
| Equatorial Guinea | 0.5306 | 0.00593 |
| Angola | 0.1021 | 0.8129 |
| Rwanda | 0.08499 | 0.273 |
| Burundi | 0.2763 | 0.6559 |
| São Tomé and Príncipe | 0.585 | 0.03602 |
pander(resultat.ind$contrib)
| | Dim.1 | Dim.2 |
|---|---|---|
| Cameroon | 0.4394 | 4.728 |
| Republic of the Congo | 1.472 | 0.287 |
| DRC | 33.03 | 11.73 |
| Gabon | 22.7 | 3.491 |
| Chad | 8.33 | 11.5 |
| Central African Republic | 2.751 | 16.18 |
| Equatorial Guinea | 7.904 | 0.1113 |
| Angola | 3.334 | 33.45 |
| Rwanda | 0.9533 | 3.858 |
| Burundi | 4.526 | 13.54 |
| São Tomé and Príncipe | 14.57 | 1.13 |
Let’s now visualize these contributions on the contribution graphs:
From the analysis of the contribution graphs for the individuals, it emerges that:
• The individuals that contribute most to the formation of dimension 1 are the DRC, Gabon and São Tomé and Príncipe.
• The individuals that contribute most to the formation of dimension 2 are Angola and the Central African Republic, followed by Burundi, the DRC and Chad.
• The individuals DRC, Angola and Gabon contribute most to the formation of the factorial plane.
fviz_pca_ind(resultat_acp, col.ind = "cos2", gradient.cols = c("blue", "orange", "red"), repel = TRUE)
fviz_contrib(resultat_acp, choice = "ind", axes = 1, top = 12)
fviz_contrib(resultat_acp, choice = "ind", axes = 2, top = 12)
fviz_contrib(resultat_acp, choice = "ind", axes = 1:2, top = 12)
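The individual contributions can be recovered from the coordinates: assuming uniform weights 1/n for the 11 countries (the default when no row weights are given), the contribution of a country to an axis is 100 × coord² / (n × eigenvalue). The sketch below recomputes the contribution table from the printed coordinates and eigenvalues.

```r
# Coordinates copied from the individuals' coordinates table above
countries <- c("Cameroon", "Republic of the Congo", "DRC", "Gabon", "Chad",
               "Central African Republic", "Equatorial Guinea", "Angola",
               "Rwanda", "Burundi", "São Tomé and Príncipe")
coord <- cbind(
  Dim.1 = c(0.5026, 0.9198, -4.357, 3.612, -2.188, -1.257,
            2.132, -1.384, 0.7403, -1.613, 2.894),
  Dim.2 = c(1.469, 0.3619, 2.313, 1.262, -2.291, -2.717,
            0.2254, 3.906, -1.327, -2.485, -0.7181)
)
rownames(coord) <- countries

eig <- c(Dim.1 = 5.226, Dim.2 = 4.148)  # from the eigenvalue table
n <- nrow(coord)

# Contribution (%) of each country to each axis, assuming weights 1/n
contrib <- 100 * sweep(coord^2, 2, n * eig, "/")

print(round(contrib, 2))
print(round(colSums(contrib), 1))  # each column sums to about 100
```

Because the contribution grows with the square of the coordinate, a single distant country such as the DRC on Dim.1 or Angola on Dim.2 can account for about a third of an axis.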
# Perform HCPC
resultat.cah <- HCPC(resultat_acp, nb.clust = -1, consol = FALSE, graph = FALSE)
# Visualize hierarchical clustering
plot.HCPC(resultat.cah, choice = 'tree', title = 'Hierarchical Tree')
plot.HCPC(resultat.cah, choice = 'map', draw.tree = FALSE, title = 'Factor Map')
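`HCPC()` builds a hierarchical tree on the principal-component coordinates of the individuals. As a rough base-R sketch of the same idea, the example below applies Ward clustering with `hclust()` to the two retained coordinates printed earlier. It is not a reproduction of `HCPC()`: the choice of `k = 3` clusters is an assumption made for illustration, whereas `nb.clust = -1` lets `HCPC()` pick the cut level automatically.

```r
# Coordinates of the countries on Dim.1 and Dim.2 (from the table above)
countries <- c("Cameroon", "Republic of the Congo", "DRC", "Gabon", "Chad",
               "Central African Republic", "Equatorial Guinea", "Angola",
               "Rwanda", "Burundi", "São Tomé and Príncipe")
coord <- cbind(
  Dim.1 = c(0.5026, 0.9198, -4.357, 3.612, -2.188, -1.257,
            2.132, -1.384, 0.7403, -1.613, 2.894),
  Dim.2 = c(1.469, 0.3619, 2.313, 1.262, -2.291, -2.717,
            0.2254, 3.906, -1.327, -2.485, -0.7181)
)
rownames(coord) <- countries

# Euclidean distances between countries in the factorial plane
d <- dist(coord)

# Ward's criterion, as an approximation of what HCPC does by default
tree <- hclust(d, method = "ward.D2")

# Cut the tree into k clusters (k = 3 is a hypothetical choice)
clusters <- cutree(tree, k = 3)
print(split(names(clusters), clusters))
```

Calling `plot(tree)` on this object draws a dendrogram comparable to the 'Hierarchical Tree' plot.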
# Fit multiple linear regression
regression <- lm(Access_Elect ~ Total_Pop + Elect_Gen + Elec_Demand + Rural_Pop + Pop_Growth + HDI + GDP_Per_Capita + Fossil_fuels_elect_gen + Hydroelectricity_gen, data = donnees_csv)
print(summary(regression))
##
## Call:
## lm(formula = Access_Elect ~ Total_Pop + Elect_Gen + Elec_Demand +
## Rural_Pop + Pop_Growth + HDI + GDP_Per_Capita + Fossil_fuels_elect_gen +
## Hydroelectricity_gen, data = donnees_csv)
##
## Residuals:
## Cameroon Republic of the Congo DRC
## -0.45629 -0.16779 -0.08937
## Gabon Chad Central African Republic
## -0.26422 1.67162 1.38688
## Equatorial Guinea Angola Rwanda
## 0.02580 0.29476 2.39262
## Burundi São Tomé and Príncipe
## -4.32482 -0.46918
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -9.896e+00 3.970e+01 -0.249 0.844
## Total_Pop 1.964e-06 7.106e-07 2.763 0.221
## Elect_Gen -9.286e-02 3.373e-02 -2.753 0.222
## Elec_Demand 1.068e-02 1.192e-02 0.896 0.535
## Rural_Pop -5.320e-01 2.255e-01 -2.359 0.255
## Pop_Growth -2.065e+01 1.127e+01 -1.832 0.318
## HDI 2.361e+02 4.705e+01 5.017 0.125
## GDP_Per_Capita -1.845e-03 2.421e-03 -0.762 0.585
## Fossil_fuels_elect_gen 8.264e+01 2.925e+01 2.826 0.217
## Hydroelectricity_gen 8.124e+01 2.401e+01 3.384 0.183
##
## Residual standard error: 5.456 on 1 degrees of freedom
## Multiple R-squared: 0.9964, Adjusted R-squared: 0.9644
## F-statistic: 31.06 on 9 and 1 DF, p-value: 0.1384
regression5 <- lm(Access_Elect ~ Hydroelectricity_gen + HDI, data = donnees_csv)
print(summary(regression5))
##
## Call:
## lm(formula = Access_Elect ~ Hydroelectricity_gen + HDI, data = donnees_csv)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.8533 -5.4869 -0.4644 5.8651 13.6879
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -92.2424 15.1983 -6.069 0.000299 ***
## Hydroelectricity_gen -0.8533 0.5783 -1.476 0.178278
## HDI 262.3808 27.6107 9.503 1.24e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.183 on 8 degrees of freedom
## Multiple R-squared: 0.9192, Adjusted R-squared: 0.899
## F-statistic: 45.51 on 2 and 8 DF, p-value: 4.26e-05
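The two summaries can be compared through their residual degrees of freedom: with n = 11 countries, the full model with 9 predictors leaves only 11 - 9 - 1 = 1 degree of freedom, while the reduced model with 2 predictors leaves 8. The sketch below recomputes the adjusted R² and, for the reduced model, the F-statistic from the printed R² values; the helper functions `adj_r2` and `f_stat` are hypothetical names introduced for this illustration.

```r
# Hypothetical helpers implementing the standard formulas
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)
f_stat <- function(r2, n, p) (r2 / p) / ((1 - r2) / (n - p - 1))

n <- 11  # number of countries

# Values read from the two summaries above
full    <- list(r2 = 0.9964, p = 9)
reduced <- list(r2 = 0.9192, p = 2)

df_full    <- n - full$p - 1
df_reduced <- n - reduced$p - 1

adj_full    <- adj_r2(full$r2, n, full$p)
adj_reduced <- adj_r2(reduced$r2, n, reduced$p)
f_reduced   <- f_stat(reduced$r2, n, reduced$p)

print(c(df_full = df_full, df_reduced = df_reduced))
print(round(c(adj_full = adj_full, adj_reduced = adj_reduced), 3))
print(round(f_reduced, 2))
```

With a single residual degree of freedom, the full model nearly interpolates the data, which is why none of its coefficients is individually significant despite an R² of 0.9964.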
# Regression diagnostics: residuals vs fitted values
plot(regression5, which = 1)
# Regression diagnostics: normal Q-Q plot of the residuals
plot(regression5, which = 2)
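We can use a fitted model to predict Access_Elect from the values of the predictor variables. Here, predict() is called on the full model (regression) without new data, so it returns the fitted values for the 11 countries.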
# Make predictions
predictions <- predict(regression)
pander(predictions)
| Country | Predicted Access_Elect |
|---|---|
| Cameroon | 71.46 |
| Republic of the Congo | 50.77 |
| DRC | 21.59 |
| Gabon | 93.76 |
| Chad | 10.03 |
| Central African Republic | 14.31 |
| Equatorial Guinea | 66.97 |
| Angola | 48.21 |
| Rwanda | 48.21 |
| Burundi | 14.62 |
| São Tomé and Príncipe | 78.47 |
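To illustrate how a prediction is formed for a new case, the sketch below applies the coefficients of the reduced model (regression5) by hand. The input values (HDI = 0.60, Hydroelectricity_gen = 5) are hypothetical, the unit of Hydroelectricity_gen is the one used in the dataset and is not restated here, and the function `predict_access` is an illustrative helper, not part of the original analysis. Clamping the result to the 0–100 range is an added assumption, since a linear model can otherwise return impossible access rates.

```r
# Coefficients copied from summary(regression5) above
coef5 <- c(intercept = -92.2424,
           Hydroelectricity_gen = -0.8533,
           HDI = 262.3808)

# Hypothetical helper: linear prediction, clamped to a valid percentage
predict_access <- function(hydro, hdi) {
  raw <- coef5[["intercept"]] +
    coef5[["Hydroelectricity_gen"]] * hydro +
    coef5[["HDI"]] * hdi
  pmin(pmax(raw, 0), 100)
}

# Hypothetical country with HDI = 0.60 and Hydroelectricity_gen = 5
p_mid <- predict_access(hydro = 5, hdi = 0.60)
# A low HDI gives a negative raw prediction, which is clamped to 0
p_low <- predict_access(hydro = 5, hdi = 0.30)

print(round(p_mid, 2))
print(p_low)
```

With the fitted model object available, the equivalent call would be `predict(regression5, newdata = data.frame(Hydroelectricity_gen = 5, HDI = 0.60))`, without the clamping.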
In this analysis, we examined the challenges of electrification in Central Africa by applying statistical techniques to a set of socioeconomic, demographic, and energy variables. Principal Component Analysis (PCA) reduced the complexity of the dataset and identified the major dimensions that structure regional disparities. The results highlight the crucial role of GDP per capita, electricity production, access to electricity (urban and rural), and demographic characteristics in differentiating the countries of the region.
The PCA revealed that the first two dimensions capture most of the variability between countries (about 78% of the total variance), contrasting, on the one hand, states with relatively high economic and energy capacity, and on the other hand, those facing structural weaknesses, high levels of rural population, or significant population growth. Cluster analysis, when combined with PCA results, reveals distinct national profiles, reflecting heterogeneous levels of electrification, economic development, and territorial organization. The regression results suggest that the Human Development Index, which is itself strongly correlated with GDP per capita, is the main explanatory factor for access to electricity in the region: it is the only significant predictor in the reduced model, whereas GDP per capita is not significant in the full model. This is consistent with dynamics already observed in other African contexts.
Overall, these results provide crucial insights into the persistent disparities in electrification across Central Africa and underscore the need for targeted policies, particularly to strengthen rural electrification, improve energy efficiency, and diversify generation sources, especially through renewable energy. Future research could incorporate longitudinal data to analyze changes over time, or include institutional and policy variables to better understand the influence of governance on electrification progress.