Introduction

Agriculture holds a central position in the economies of West Africa. It serves as the primary source of employment and income for rural populations, while making a substantial contribution to the Gross Domestic Product (GDP). Nevertheless, this sector is currently confronted with a dual challenge: promoting economic development while mitigating its impact on the environment and the climate. Indeed, agriculture simultaneously acts as a driving force for development and a contributing factor to climate change. On the one hand, it fosters economic growth and ensures food security; on the other, it generates greenhouse gas emissions, accelerates deforestation, and exerts increasing pressure on freshwater resources. In this context, it becomes imperative to examine the true contribution of agriculture to both economic development and climate change. This report, undertaken within the framework of the Research and Information Processing Project (RTI), seeks to elucidate—through empirical data—the interrelations between agricultural, economic, and environmental variables, with the ultimate goal of guiding the formulation and implementation of sustainable, evidence-based public policies.

This report explores the key socioeconomic and environmental variables that shape the interaction between agriculture, economic development, and climate change in West Africa. Using statistical tools such as Principal Component Analysis (PCA), correlation analysis, hierarchical clustering, and multiple linear regression, we aim to uncover patterns and relationships that shed light on the challenges and opportunities for sustainable agriculture in the region.

Map of West African Countries Included in the Analysis

By analyzing a dataset covering GHG emissions, population dynamics, land use, and climatic factors, this report aims to provide actionable recommendations for policymakers, researchers, and stakeholders working to ensure resilient and sustainable agriculture in West Africa.

# Set options and load required libraries
knitr::opts_chunk$set(echo = TRUE)
library(Factoshiny)
## Loading required package: FactoMineR
## Loading required package: shiny
## Loading required package: FactoInvestigate
## Loading required package: ggplot2
library(factoextra)
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(ggplot2)
library(psych)
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(shiny)
library(FactoInvestigate)
library(DataExplorer)
library(corrplot)
## corrplot 0.95 loaded
library(pander)
## 
## Attaching package: 'pander'
## The following object is masked from 'package:shiny':
## 
##     p
library(DT)
## 
## Attaching package: 'DT'
## The following objects are masked from 'package:shiny':
## 
##     dataTableOutput, renderDataTable
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(FactoMineR)
library(rsconnect)
## 
## Attaching package: 'rsconnect'
## The following object is masked from 'package:shiny':
## 
##     serverInfo
library(knitr)
library(magick)
## Warning: package 'magick' was built under R version 4.5.2
## Linking to ImageMagick 6.9.13.29
## Enabled features: cairo, freetype, fftw, ghostscript, heic, lcms, pango, raw, rsvg, webp
## Disabled features: fontconfig, x11

Loading Data

The dataset is loaded from a specified directory.

# Set working directory and load data

setwd("C:/Users/Victus/OneDrive - Institut 2IE/Bureau/PROJET RTI")

donnees <- read.csv(file = "Données RTI GEAAH bon 2026.csv", header = TRUE, sep = ";", quote = "\"", dec = ",", row.names = 1)
datatable(donnees, options = list(pageLength = 10, autoWidth = TRUE))
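
Because the file uses semicolons as field separators and commas as decimal marks, a quick structural check after loading can catch columns that were silently read as text. The following is a minimal sketch on a small made-up extract: the two rows and their values are hypothetical and are not taken from the dataset, and `sample_text` and `sample_data` are illustrative names.

```r
# Hypothetical two-row extract in the same layout (";" separator, "," decimal mark).
# The values are invented for illustration only.
sample_text <- "Pays;Pluv;Temp_moy\nA;850,5;27,3\nB;420,0;29,1"

# read.csv2() defaults to sep = ";" and dec = ",", matching the options used above.
sample_data <- read.csv2(text = sample_text, header = TRUE, row.names = 1)

# Every column should be numeric and complete before cor() or PCA() is called.
numeric_columns <- vapply(sample_data, is.numeric, logical(1))
missing_values <- colSums(is.na(sample_data))

print(numeric_columns)
print(missing_values)
```

The same two checks could be run on `donnees` itself; a column flagged as non-numeric usually points to a stray character or a wrong decimal mark in the source file.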

Correlation Analysis

We calculate the correlation matrix for all the variables and visualize it using a correlation plot. This helps in understanding relationships between variables before performing PCA.

# Calculate correlation matrix
matrice.cor <- cor(donnees[, 1:13])
datatable(matrice.cor, options = list(pageLength = 20)) %>%
  formatRound(columns = 1:ncol(matrice.cor), digits = 3)
# Correlation plot (corrplot package)
corrplot(matrice.cor, method = "color", type = "upper", tl.col = "black", tl.srt = 75)
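
With 13 variables the correlation matrix contains 78 distinct pairs, so ranking the pairs by absolute correlation can make the plot easier to read. The sketch below shows one way to do this in base R; the helper name `strongest_pairs` and the toy data are hypothetical and serve only as an illustration.

```r
# Hypothetical helper: list variable pairs ordered by absolute correlation.
strongest_pairs <- function(cor_matrix, n = 5) {
  idx <- which(upper.tri(cor_matrix), arr.ind = TRUE)  # each pair appears once
  pairs <- data.frame(
    var1 = rownames(cor_matrix)[idx[, "row"]],
    var2 = colnames(cor_matrix)[idx[, "col"]],
    r    = cor_matrix[idx],
    stringsAsFactors = FALSE
  )
  pairs <- pairs[order(abs(pairs$r), decreasing = TRUE), ]
  head(pairs, n)
}

# Toy data, invented for illustration: b is an exact linear function of a.
toy <- data.frame(a = c(1, 2, 3, 4, 5),
                  b = c(2, 4, 6, 8, 10),
                  c = c(5, 3, 4, 1, 2))

print(strongest_pairs(cor(toy), n = 3))
```

Applied to the report's data, the call would be `strongest_pairs(matrice.cor, n = 10)`.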

The correlation plot, read together with the PCA dimensions presented in the next section, highlights the following relationships between the variables:

PIB (Gross Domestic Product) and Perte_couv_forest (Forest Cover Loss): A moderate correlation is observed along PCA Dimension 2, suggesting that the current economic development model (as measured by PIB) goes hand in hand with the degradation of natural capital (deforestation and forest cover loss).

Population and Ret_eau_douc (Freshwater Withdrawal): A strong positive association is evident on Dimension 1, indicating that countries under high demographic pressure and with high Prod_cer (Cereal Production) are also those facing the most pronounced water stress (high freshwater withdrawal).

Prod_cer (Cereal Production) and Specific Agricultural Emissions (emi_metan and Emi_N): A strong positive correlation is observed on Dimension 1. This suggests that the increase in agricultural production, essential for food security, is currently closely linked to the rise in non-CO2 GHG emissions (methane from livestock, nitrous oxide from fertilizers).

Pluv (Rainfall) and Demographic/Emissions Pressure (Population, emi_metan): A negative correlation is observed (the variables lie on opposite sides of Dim1), suggesting that the countries with the lowest rainfall (the Sahel) are also those that must manage high demographic pressure and the associated agricultural emissions.

PIB (Gross Domestic Product) and Simple Climate Indicators (Temp_moy, GES): PIB contributes little to the structure of the two dimensions (5% on Dim1-2) and is moderately correlated with Temp_moy and GES on Dimension 1. This indicates that economic wealth is not the primary factor explaining the differences in climate and environmental profiles among countries in the region.

Factorial Analysis (FA) / Principal component analysis (PCA)

Factor analysis is a statistical data analysis method that reduces the dimensionality of a data set by identifying the underlying variables (called factors) that explain the relationships between the observed variables. This technique is particularly useful for simplifying the analysis of complex data with a large number of interconnected variables.

For our analysis, we will use principal component analysis (PCA), which is a factor analysis method based on quantitative data.

We perform a PCA to reduce dimensionality and identify the main components that explain the variance in the dataset.
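
With `scale.unit = TRUE`, the PCA works on standardized variables, which amounts to an eigen-decomposition of the correlation matrix. The sketch below illustrates this equivalence in base R with `prcomp()` and `eigen()` rather than FactoMineR; the toy data are invented for illustration.

```r
# Toy data, invented for illustration (6 observations, 3 variables on different scales).
toy <- data.frame(x = c(2, 4, 6, 8, 10, 12),
                  y = c(110, 95, 130, 160, 150, 190),
                  z = c(0.9, 0.7, 0.8, 0.3, 0.4, 0.1))

# PCA on centred, unit-variance variables.
toy_pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Eigenvalues of the correlation matrix.
eigenvalues <- eigen(cor(toy))$values

# The squared standard deviations of the components equal those eigenvalues,
# and they sum to the number of variables (here 3).
print(toy_pca$sdev^2)
print(eigenvalues)
print(100 * eigenvalues / sum(eigenvalues))  # percentage of variance per axis
```

This is why the eigenvalues of a standardized PCA on 13 variables sum to 13, and why an eigenvalue of 1 corresponds to the variance of a single original variable.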

# Perform PCA
resultat.acp <- PCA(donnees, scale.unit = TRUE, ncp = 5, graph = TRUE)

The best qualitative variable to illustrate the distances between individuals on the map is the variable: Income_Class. This is also the only variable that has been represented in the graph of individuals. Moreover, the analysis of this result does not reveal any atypical individuals.

# Biplot visualization
fviz_pca_biplot(resultat.acp, repel = TRUE)
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## ℹ The deprecated feature was likely used in the ggpubr package.
##   Please report the issue at <https://github.com/kassambara/ggpubr/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

This graph is a principal component analysis (PCA) biplot, which represents the relationships between countries (dots) and variables (arrows) on the first two principal dimensions, Dim1 and Dim2. These dimensions capture 50.22% and 21.80% of the variance in the data, respectively, so together they explain about 72% of the total variance. We can also make the following observations:

The First Principal Axis, which explains 50.22% of the variance, contrasts countries whose profile is dominated by extreme environmental and agricultural constraints (such as freshwater withdrawal and demographic pressure on cereal production) with countries where these constraints are less pronounced. This axis highlights the fault line between climate adaptation models: countries in the Sahel (like Niger and Mali) face priority adaptation challenges related to resource scarcity and specific agricultural emissions.

The Second Principal Axis, which explains 21.80% of the variance, contrasts countries where economic growth (PIB) is linked to the degradation of natural capital (forest cover loss) with those having a different development structure and environmental pressure. This axis underscores the impact of the economic growth model on climate mitigation: countries on the coast (like Côte d’Ivoire and Ghana) are defined by the environmental externalities of their stronger economic development, requiring land-use planning policies aimed at decoupling PIB growth from forest cover loss.

Eigenvalues

The eigenvalues indicate the amount of variance explained by each principal component.

# Extract and plot eigenvalues
val.propre <- get_eigenvalue(resultat.acp)
pander(val.propre)
          eigenvalue   variance.percent   cumulative.variance.percent
Dim.1     6.528        50.22              50.22
Dim.2     2.834        21.8               72.01
Dim.3     1.65         12.69              84.71
Dim.4     0.813        6.254              90.96
Dim.5     0.5691       4.377              95.34
Dim.6     0.2138       1.644              96.98
Dim.7     0.1783       1.372              98.35
Dim.8     0.1571       1.208              99.56
Dim.9     0.03231      0.2485             99.81
Dim.10    0.01326      0.102              99.91
Dim.11    0.01012      0.07781            99.99
Dim.12    0.001252     0.009631           100
Dim.13    9.243e-05    0.000711           100
fviz_eig(resultat.acp, addlabels = TRUE, ylim = c(0, 50))
## Warning in geom_bar(stat = "identity", fill = barfill, color = barcolor, :
## Ignoring empty aesthetic: `width`.

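The Kaiser criterion retains the axes whose eigenvalues are greater than 1, that is, axes that each carry more than the average share of variance (100/13 ≈ 7.69% for 13 variables). Here the first three eigenvalues exceed 1 (6.528, 2.834, and 1.65). The scree plot, however, shows its elbow after the second axis, and the first two axes already account for about 72% of the variance, so only these two axes are retained for the rest of the analysis.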
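
The eigenvalue rule can be checked directly on the eigenvalues printed in the table above. In this short sketch the rule flags the first three axes, while the first two alone carry about 72% of the variance, which is the basis for working on the Dim1-Dim2 plane. The object names are illustrative.

```r
# Eigenvalues copied from the table printed above (13 standardized variables).
eig <- c(6.528, 2.834, 1.65, 0.813, 0.5691, 0.2138, 0.1783, 0.1571,
         0.03231, 0.01326, 0.01012, 0.001252, 9.243e-05)

# Kaiser criterion: keep axes whose eigenvalue exceeds 1, i.e. axes that carry
# more variance than a single standardized variable (100 / 13 = 7.69 %).
kept_by_kaiser <- which(eig > 1)

# Share of variance per axis and cumulative share.
pct <- 100 * eig / sum(eig)
summary_table <- data.frame(axis = seq_along(eig),
                            eigenvalue = eig,
                            percent = round(pct, 2),
                            cumulative = round(cumsum(pct), 2))

print(kept_by_kaiser)
print(head(summary_table, 4))
```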

Contribution of Variables to Components

We examine the contribution of each variable to the principal components.

# Extract the PCA results for variables and for individuals
resultat.var <- get_pca_var(resultat.acp)
resultat.ind <- get_pca_ind(resultat.acp)

Let’s now visualize these contributions on the contribution graphs:

From the analysis of the contribution graphs for the variables, it emerges that:

- The variables that contribute most to the formation of Dimension 1 (50.22%), the axis of Agricultural Intensification, Demographics, and Water Stress, are: emi_metan (Methane Emissions), Prod_cer (Cereal Production), Population, and Ret_eau_douc (Freshwater Withdrawal).

- The variables that contribute most to the formation of Dimension 2 (21.80%), the axis of Development and Forest Degradation, are: Perte_couv_forest (Forest Cover Loss) and PIB (Gross Domestic Product).

- Similarly, the variables emi_metan, Perte_couv_forest, Ret_eau_douc, and Population contribute most to the formation of the overall factorial plane.
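
The contribution of a variable to an axis is its squared coordinate divided by the eigenvalue of that axis, expressed in percent. For a single axis, the dashed reference line drawn by `fviz_contrib()` corresponds to the expected contribution if all variables contributed equally (100 divided by the number of variables). The sketch below reproduces this computation in base R on invented toy data; it is an illustration of the formula, not the FactoMineR code.

```r
# Toy data, invented for illustration.
toy <- data.frame(x = c(2, 4, 6, 8, 10, 12),
                  y = c(110, 95, 130, 160, 150, 190),
                  z = c(0.9, 0.7, 0.8, 0.3, 0.4, 0.1))

decomp <- eigen(cor(toy))

# Coordinates of the variables: eigenvectors scaled by sqrt(eigenvalue).
# For standardized data these are the variable-axis correlations.
coord <- decomp$vectors %*% diag(sqrt(decomp$values))
dimnames(coord) <- list(names(toy), paste0("Dim.", seq_along(decomp$values)))

# Contribution (%) of each variable to each axis: coord^2 / eigenvalue.
contrib <- 100 * sweep(coord^2, 2, decomp$values, "/")

# Reference level for a single axis: equal contributions from all variables.
expected <- 100 / ncol(toy)

print(round(contrib, 2))
print(names(which(contrib[, "Dim.1"] > expected)))
```

With the 13 variables of this report, the single-axis reference level would be 100 / 13, about 7.69%.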

Variable contributions

resultat.var <- get_pca_var(resultat.acp)
fviz_pca_var(resultat.acp, 
             col.var = "contrib", 
             gradient.cols = c("blue", "orange", "red"), 
             repel = TRUE, 
             title = "Contribution of Variables to Principal Components")

Contributions on Dim 1

fviz_contrib(resultat.acp, choice = "var", axes = 1, top = 10)

Contributions on Dim 2

fviz_contrib(resultat.acp, choice = "var", axes = 2, top = 10)

Contributions on Dim 1-2

fviz_contrib(resultat.acp, choice = "var", axes = 1:2, top = 10)

PCA for Individuals

In this section, we explore the coordinates, quality of representation, and contributions of individuals (observations) to the PCA axes.

# Get PCA individual results
resultat.ind <- get_pca_ind(resultat.acp)
pander(resultat.ind$coord)
                Dim.1      Dim.2      Dim.3      Dim.4      Dim.5
Benin           -0.7391    -0.2913    0.3556     0.4479     0.2687
Burkina Faso    2.131      0.0367     -1.11      0.2684     -0.7802
Cap Vert        -3.765     0.01316    1.965      -2.552     -0.05553
Cote ivoire     1.986      3.663      0.2642     -0.0434    -0.8967
Gambia          -2.336     -1.443     0.4386     0.9443     -0.1387
Ghana           2.931      3.06       1.947      0.5571     0.5778
Guinee          0.2131     1.132      -1.611     -0.7131    0.1519
Guinee Bissau   -2.632     -0.8734    -0.4919    0.8918     0.4948
Liberia         -2.794     1.239      -1.656     -0.1878    -0.1963
Mali            4.782      -1.891     -0.9253    -1.196     1.47
Mauritania      -0.624     -1.55      1.595      0.6179     -0.784
Niger           3.41       -2.541     -0.3675    -0.4026    -1.617
Senegal         1.549      -0.8287    1.204      0.8662     0.9015
Sierra Leone    -2.332     0.7767     -2.148     0.3382     0.3392
Togo            -1.779     -0.5035    0.5412     0.1634     0.2649
pander(resultat.ind$cos2)
                Dim.1      Dim.2      Dim.3      Dim.4      Dim.5
Benin           0.2745     0.04265    0.06355    0.1008     0.03629
Burkina Faso    0.581      0.0001724  0.1576     0.009218   0.07791
Cap Vert        0.5768     7.049e-06  0.1572     0.2651     0.0001255
Cote ivoire     0.2022     0.6879     0.003578   9.654e-05  0.04121
Gambia          0.6206     0.2367     0.02187    0.1014     0.002186
Ghana           0.3703     0.4039     0.1634     0.01338    0.0144
Guinee          0.009112   0.257      0.5209     0.102      0.004627
Guinee Bissau   0.7585     0.08353    0.02649    0.08709    0.02681
Liberia         0.6171     0.1213     0.2169     0.002788   0.003045
Mali            0.7334     0.1147     0.02746    0.04588    0.0693
Mauritania      0.054      0.333      0.3529     0.05293    0.08522
Niger           0.5322     0.2954     0.006182   0.007418   0.1197
Senegal         0.3623     0.1036     0.2188     0.1132     0.1226
Sierra Leone    0.4788     0.0531     0.4062     0.01007    0.01013
Togo            0.7651     0.06127    0.07079    0.006453   0.01696
pander(resultat.ind$contrib)
                Dim.1      Dim.2      Dim.3      Dim.4      Dim.5
Benin           0.5579     0.1997     0.5111     1.645      0.8461
Burkina Faso    4.636      0.003168   4.977      0.5906     7.131
Cap Vert        14.47      0.0004074  15.6       53.41      0.03613
Cote ivoire     4.029      31.57      0.2821     0.01544    9.419
Gambia          5.575      4.898      0.7775     7.312      0.2253
Ghana           8.771      22.03      15.31      2.545      3.911
Guinee          0.04638    3.013      10.49      4.17       0.2702
Guinee Bissau   7.074      1.794      0.9776     6.521      2.868
Liberia         7.974      3.61       11.09      0.2892     0.4514
Mali            23.35      8.411      3.46       11.73      25.31
Mauritania      0.3977     5.65       10.28      3.13       7.2
Niger           11.88      15.18      0.5458     1.329      30.64
Senegal         2.452      1.616      5.858      6.152      9.52
Sierra Leone    5.556      1.419      18.65      0.938      1.348
Togo            3.232      0.5963     1.183      0.2189     0.822

Let’s now visualize these contributions on the contribution graphs:

From the analysis of the contribution graphs for the individuals, it emerges that:

- The individuals that contribute most to the formation of Dimension 1 (50.22%), the Adaptation and Water Stress axis, are: Mali (23.35%), Cap Vert (14.47%), Niger (11.88%), and Ghana (8.77%).

- The individual that contributes most to the formation of Dimension 2 (21.80%), the Growth and Forest Degradation axis, is Côte d’Ivoire (31.57%, a highly dominant contribution), followed by Ghana (22.03%) and Niger (15.18%).

- Similarly, the individuals Côte d’Ivoire, Mali, Ghana, and Niger contribute most to the formation of the overall factorial plane.
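
For individuals with equal weights, the contribution to an axis is 100 × coord² / (n × eigenvalue). The sketch below recomputes the Dim.1 contributions from the coordinates and the eigenvalue printed in this report; small differences from the contribution table come from the rounding of the printed values, and equal weighting of the 15 countries is an assumption.

```r
# Dim.1 coordinates copied from the coordinates table of this report.
coord_dim1 <- c(Benin = -0.7391, "Burkina Faso" = 2.131, "Cap Vert" = -3.765,
                "Cote ivoire" = 1.986, Gambia = -2.336, Ghana = 2.931,
                Guinee = 0.2131, "Guinee Bissau" = -2.632, Liberia = -2.794,
                Mali = 4.782, Mauritania = -0.624, Niger = 3.41,
                Senegal = 1.549, "Sierra Leone" = -2.332, Togo = -1.779)

eigenvalue_dim1 <- 6.528          # from the eigenvalue table
n <- length(coord_dim1)           # 15 countries, assumed equally weighted

# Contribution (%) of each country to Dim.1.
contrib_dim1 <- 100 * coord_dim1^2 / (n * eigenvalue_dim1)

# Countries contributing more than the average share (100 / n).
above_average <- sort(contrib_dim1[contrib_dim1 > 100 / n], decreasing = TRUE)

print(round(contrib_dim1, 2))
print(round(above_average, 2))
```

The same formula, applied with the Dim.2 coordinates and eigenvalue (2.834), gives the second column of the contribution table.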

Contribution of individuals

fviz_pca_ind(resultat.acp, 
             col.ind = "cos2", 
             gradient.cols = c("blue", "orange", "red"), 
             repel = TRUE,
             title = "Position of Countries and Quality of Representation (Cos2)")

Contribution on Dim 1

fviz_contrib(resultat.acp, choice = "ind", axes = 1, top = 10)

Contribution on Dim 2

fviz_contrib(resultat.acp, choice = "ind", axes = 2, top = 10)

Contribution on Dim 1-2

fviz_contrib(resultat.acp, choice = "ind", axes = 1:2, top = 10)

Classification / Hierarchical Clustering on Principal Components (HCPC)

Classification, like factor analysis, is a data analysis method; it groups observations into categories called classes. There are several types of classification; for our dataset, we used Hierarchical Ascendant Classification (HAC), also known as agglomerative hierarchical clustering.

Hierarchical clustering is performed on the PCA results to identify clusters within the data.

# Perform HCPC
resultat.cah <- HCPC(resultat.acp, nb.clust = 3, consol = FALSE, graph = FALSE)

# Visualize hierarchical clustering
plot.HCPC(resultat.cah, choice = 'tree', title = 'Hierarchical Tree')

plot.HCPC(resultat.cah, choice = 'map', draw.tree = FALSE, title = 'Factor Map')
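
HCPC builds a hierarchical tree on the retained principal components using Ward's criterion and then cuts it into the requested number of clusters. As a rough cross-check, the same idea can be sketched with base R's `hclust()` on the five coordinates printed in the individuals table of this report. This is not the HCPC implementation, so the partition it returns may differ from the one reported here.

```r
# Coordinates on Dim.1 to Dim.5, copied from the individuals table of this report.
coord <- matrix(c(
  -0.7391, -0.2913,  0.3556,  0.4479,  0.2687,
   2.131,   0.0367, -1.11,    0.2684, -0.7802,
  -3.765,   0.01316, 1.965,  -2.552,  -0.05553,
   1.986,   3.663,   0.2642, -0.0434, -0.8967,
  -2.336,  -1.443,   0.4386,  0.9443, -0.1387,
   2.931,   3.06,    1.947,   0.5571,  0.5778,
   0.2131,  1.132,  -1.611,  -0.7131,  0.1519,
  -2.632,  -0.8734, -0.4919,  0.8918,  0.4948,
  -2.794,   1.239,  -1.656,  -0.1878, -0.1963,
   4.782,  -1.891,  -0.9253, -1.196,   1.47,
  -0.624,  -1.55,    1.595,   0.6179, -0.784,
   3.41,   -2.541,  -0.3675, -0.4026, -1.617,
   1.549,  -0.8287,  1.204,   0.8662,  0.9015,
  -2.332,   0.7767, -2.148,   0.3382,  0.3392,
  -1.779,  -0.5035,  0.5412,  0.1634,  0.2649
), ncol = 5, byrow = TRUE,
dimnames = list(
  c("Benin", "Burkina Faso", "Cap Vert", "Cote ivoire", "Gambia", "Ghana",
    "Guinee", "Guinee Bissau", "Liberia", "Mali", "Mauritania", "Niger",
    "Senegal", "Sierra Leone", "Togo"),
  paste0("Dim.", 1:5)))

# Ward clustering on Euclidean distances between countries, cut into 3 groups.
tree <- hclust(dist(coord), method = "ward.D2")
groups <- cutree(tree, k = 3)

# Country names listed by cluster.
print(split(names(groups), groups))
```

Calling `plot(tree)` draws the corresponding dendrogram, which can be compared with the hierarchical tree produced by `plot.HCPC()`.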

The hierarchical clustering performed on the individuals reveals 3 clusters:

  • Cluster 1: Intermediate and Transition Countries

This large cluster, encompassing Liberia, Sierra Leone, Guinea-Bissau, Cap Vert, Mauritania, Togo, Benin, and Gambia, includes countries that do not present the extreme characteristics of the other two poles. The key drivers are low values for all pressure variables (Population, emi_metan, GES, PIB, etc.). Their central or negative position on the PCA indicates that agricultural and climate pressures do not yet dominate their structure as much as they do for the other clusters. These countries are crucial as they represent a window of opportunity to prevent imbalances. Their issues are heterogeneous and often linked to governance factors (e.g., the effectiveness of Togo and Benin in managing deforestation). Strategic Implication: They require a mixed and contextual approach that draws on the transfer of best practices (water engineering from the Sahel and sustainable land management from the Coast), with an emphasis on strengthening governance and the enforcement of environmental laws.

  • Cluster 2: The Priority Adaptation Pole

This cluster corresponds to the group located at the positive extreme of Dimension 1 of the PCA, representing the primary adaptation challenges. Its composition includes Mali, Niger, and Senegal. The key drivers are high values for Emi_N (Nitrous Oxide), Prod_cer (Cereal Production), emi_metan (Methane), and Ret_eau_douc (Freshwater Withdrawal). This confirms that the challenges in these countries are dominated by agricultural intensification and livestock (hence the specific emissions) within a context of structural water stress. The low IDH (Human Development Index, HDI) value of this cluster also indicates that this agricultural and environmental pressure goes together with a low level of human development. Strategic Implication: Water engineering policies must focus on securing resources and improving Water Use Efficiency (WUE) to support cereal production without depleting groundwater reserves.

  • Cluster 3 (The Coast): The Mitigation and Growth Pole

This cluster is positioned in the upper part of the factorial map and represents the mitigation challenges, dominated by the Dimension 2 dynamic. Its composition includes Côte d’Ivoire and Ghana. The key drivers are high values for PIB (GDP), emission (GHG), Population, and GES (Greenhouse Gases). This confirms that their core issue is one of intense economic growth (PIB), which generates strong negative environmental externalities (GES and emission). Crucially, PIB and Perte_couv_forest (Forest Cover Loss) are correlated along this axis. Strategic Implication: Hydro-agricultural planning policies must target mitigation, aiming to decouple GDP growth from deforestation and emissions through practices like agroforestry and land tenure security.

The factor map produced by plot.HCPC(resultat.cah, choice = 'map') presents the different groups of individuals in the factorial plane.

Multiple Linear Regression

Finally, we fit a multiple linear regression model to explore the relationships between PIB (Gross Domestic Product) and various predictors.

Calculation of regression coefficients

# Fit multiple linear regression
regression <- lm(PIB ~ Population + Pluv + Prod_cer + Temp_moy + emission + emi_metan + Emi_N + Ret_eau_douc + IDH + GES + POL_AIR + Perte_couv_forest, data = donnees)
print(summary(regression))
## 
## Call:
## lm(formula = PIB ~ Population + Pluv + Prod_cer + Temp_moy + 
##     emission + emi_metan + Emi_N + Ret_eau_douc + IDH + GES + 
##     POL_AIR + Perte_couv_forest, data = donnees)
## 
## Residuals:
##         Benin  Burkina Faso      Cap Vert   Cote ivoire        Gambia 
##    -1.485e+09     1.014e+09     1.209e+09     4.632e+08    -5.835e+07 
##         Ghana        Guinee Guinee Bissau       Liberia          Mali 
##     1.354e+09    -2.073e+09     3.438e+09    -2.404e+09     7.828e+08 
##    Mauritania         Niger       Senegal  Sierra Leone          Togo 
##     7.163e+08    -1.960e+08    -1.995e+09     2.297e+09    -3.062e+09 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)       -1.018e+11  2.253e+11  -0.452    0.696
## Population        -5.688e+02  4.625e+03  -0.123    0.913
## Pluv               9.707e+06  3.356e+07   0.289    0.800
## Prod_cer           1.319e+04  2.762e+04   0.478    0.680
## Temp_moy          -1.812e+09  6.640e+09  -0.273    0.811
## emission           2.378e+03  4.063e+03   0.585    0.618
## emi_metan         -6.671e+02  6.000e+03  -0.111    0.922
## Emi_N             -5.337e+03  6.939e+03  -0.769    0.522
## Ret_eau_douc      -6.440e+00  2.451e+01  -0.263    0.817
## IDH                1.315e+11  3.084e+11   0.426    0.711
## GES                4.935e+02  1.031e+03   0.479    0.679
## POL_AIR            1.357e+09  3.657e+09   0.371    0.746
## Perte_couv_forest  1.190e+04  4.367e+04   0.272    0.811
## 
## Residual standard error: 4.922e+09 on 2 degrees of freedom
## Multiple R-squared:  0.9922, Adjusted R-squared:  0.9454 
## F-statistic:  21.2 on 12 and 2 DF,  p-value: 0.04591

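The full model is statistically significant overall at the 5% level (F-test p-value = 0.04591), but none of its individual coefficients is significant (all p-values are above 0.5). This is expected with 12 predictors for only 15 observations, which leaves just 2 residual degrees of freedom. We therefore successively remove the variables with the highest p-values. The intermediate model below, with seven predictors, is highly significant (p-value = 1.731e-05); emission, GES, and Emi_N (p-value = 0.042633) are significant at the 5% level.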

regression1 <- lm(PIB ~ Prod_cer + emission + Emi_N + Ret_eau_douc + IDH + GES + POL_AIR, data = donnees)
print(summary(regression1))
## 
## Call:
## lm(formula = PIB ~ Prod_cer + emission + Emi_N + Ret_eau_douc + 
##     IDH + GES + POL_AIR, data = donnees)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -5.322e+09 -1.719e+09 -2.873e+08  2.235e+09  4.785e+09 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -4.050e+10  2.068e+10  -1.958 0.091045 .  
## Prod_cer      6.297e+03  3.043e+03   2.070 0.077258 .  
## emission      2.363e+03  4.119e+02   5.738 0.000707 ***
## Emi_N        -4.476e+03  1.810e+03  -2.473 0.042633 *  
## Ret_eau_douc -3.311e+00  1.829e+00  -1.810 0.113223    
## IDH           3.679e+10  2.884e+10   1.276 0.242756    
## GES           4.006e+02  6.465e+01   6.197 0.000446 ***
## POL_AIR       3.880e+08  2.114e+08   1.835 0.109097    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.117e+09 on 7 degrees of freedom
## Multiple R-squared:  0.9809, Adjusted R-squared:  0.9618 
## F-statistic: 51.33 on 7 and 7 DF,  p-value: 1.731e-05
# Fit the reduced model retained at the seventh iteration
regression7 <- lm(PIB ~ emission + Emi_N + GES, data = donnees)
print(summary(regression7))
## 
## Call:
## lm(formula = PIB ~ emission + Emi_N + GES, data = donnees)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -7.080e+09 -2.958e+09  5.672e+08  2.562e+09  6.107e+09 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.206e+09  1.858e+09  -1.725  0.11243    
## emission     2.890e+03  2.791e+02  10.356 5.20e-07 ***
## Emi_N       -1.369e+03  3.437e+02  -3.982  0.00215 ** 
## GES          3.807e+02  6.290e+01   6.054 8.27e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.32e+09 on 11 degrees of freedom
## Multiple R-squared:  0.9669, Adjusted R-squared:  0.9579 
## F-statistic: 107.3 on 3 and 11 DF,  p-value: 1.995e-08

After successively removing the variables with the highest p-values, we obtained at the seventh iteration a more significant model, with a p-value of 1.995e-08. All of its variables (emission, Emi_N, and GES) are significant, with p-values below 5%. The model also achieves a coefficient of determination R² = 0.9669 (adjusted R² = 0.9579), close to 1, which indicates a good fit.
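
The successive removal of the least significant predictor can be automated. The sketch below is one possible way to write such a loop in base R; the helper name `backward_eliminate` and the toy data are hypothetical, and the procedure is not necessarily the exact sequence of steps followed for this report.

```r
# Hypothetical helper: drop the predictor with the largest p-value, refit,
# and repeat until every remaining predictor is significant at level alpha
# (or only one predictor is left).
backward_eliminate <- function(data, response, alpha = 0.05) {
  predictors <- setdiff(names(data), response)
  repeat {
    fit <- lm(reformulate(predictors, response = response), data = data)
    tab <- coef(summary(fit))
    pvals <- tab[rownames(tab) != "(Intercept)", "Pr(>|t|)", drop = FALSE]
    worst <- which.max(pvals[, 1])
    if (pvals[worst, 1] < alpha || length(predictors) == 1) break
    predictors <- setdiff(predictors, rownames(pvals)[worst])
  }
  fit
}

# Toy data, invented for illustration: y depends on x1 only.
set.seed(1)
toy <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))
toy$y <- 2 + 3 * toy$x1 + rnorm(40, sd = 0.5)

final_fit <- backward_eliminate(toy, response = "y")
print(coef(summary(final_fit)))
```

On the report's data the call would be `backward_eliminate(donnees, response = "PIB")`. With only 15 observations, the order in which variables are dropped can change with small changes in the data, so the selected model is best read as exploratory.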

Regression graphs

# Plot regression diagnostics
plot(regression, which = 1)

The residuals-versus-fitted plot of the full model shows no obvious pattern, which is consistent with the linearity assumption, but the large residuals for Togo, Liberia, and Guinea-Bissau suggest that these cases warrant further investigation.

plot(regression1, which = 2)

In the normal Q-Q plot of the intermediate model (regression1), the points generally follow a straight line, although there are some deviations, particularly for Benin, Ghana, and Niger. This suggests that the residuals are approximately normally distributed, which supports the normality assumption of the model.

Predictions

We can use the model to make predictions for PIB based on the values of the predictor variables.

# Make predictions
predictions <- predict(regression)
pander(predictions)
Country         Predicted PIB
Benin           1.717e+10
Burkina Faso    1.671e+10
Cap Vert        612379013
Cote ivoire     6.256e+10
Gambia          1.871e+09
Ghana           6.865e+10
Guinee          1.616e+10
Guinee Bissau   -3.047e+09
Liberia         5.58e+09
Mali            1.992e+10
Mauritania      7.544e+09
Niger           1.394e+10
Senegal         2.653e+10
Sierra Leone    4.397e+09
Togo            1.046e+10

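Since predict() is called without new data, these are the in-sample fitted values of the full model. Compared with the residuals reported in the model summary, the fitted values are close to the observed PIB for Côte d’Ivoire, Ghana, Niger, Gambia, and Mali, but there are notable discrepancies for Guinea-Bissau (for which the model returns a negative PIB), Liberia, Cap Vert, and Togo.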
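
The gap between fitted and observed values can be quantified from the outputs already shown. Since a residual in `lm()` is the observed value minus the fitted value, adding the residuals of the full model to its fitted values recovers the observed PIB. The sketch below uses the numbers printed in this report (rounded as displayed) and ranks the countries by relative error; the object names are illustrative.

```r
countries <- c("Benin", "Burkina Faso", "Cap Vert", "Cote ivoire", "Gambia",
               "Ghana", "Guinee", "Guinee Bissau", "Liberia", "Mali",
               "Mauritania", "Niger", "Senegal", "Sierra Leone", "Togo")

# Fitted values and residuals copied from the outputs of the full model.
fitted_pib <- c(1.717e10, 1.671e10, 612379013, 6.256e10, 1.871e9,
                6.865e10, 1.616e10, -3.047e9, 5.58e9, 1.992e10,
                7.544e9, 1.394e10, 2.653e10, 4.397e9, 1.046e10)
resid_pib <- c(-1.485e9, 1.014e9, 1.209e9, 4.632e8, -5.835e7,
               1.354e9, -2.073e9, 3.438e9, -2.404e9, 7.828e8,
               7.163e8, -1.960e8, -1.995e9, 2.297e9, -3.062e9)

# In lm(), residual = observed - fitted, so observed = fitted + residual.
observed_pib <- fitted_pib + resid_pib

comparison <- data.frame(
  country = countries,
  observed = observed_pib,
  fitted = fitted_pib,
  rel_error_pct = round(100 * abs(resid_pib) / observed_pib, 1)
)

# Smallest relative errors first.
comparison <- comparison[order(comparison$rel_error_pct), ]
print(comparison, row.names = FALSE)
```

Relative error is used here because PIB differs by two orders of magnitude across countries, so a residual of a given size matters far more for a small economy than for a large one.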

Conclusion

In this analysis, we examined the challenges of climate adaptation and mitigation in West Africa by applying statistical techniques (PCA, HAC, and regression) to key socioeconomic, agricultural, and environmental data. The results highlight key factors, such as demographic pressure, agricultural emissions (emi_metan), and forest cover loss, that structure the differences in climate and environmental profiles among countries in the region.