Impact du Changement Climatique sur l’Agriculture dans la Région Euro-Méditerranéenne : Une Analyse par Économétrie Spatiale

Data

Pour mener à bien cette analyse empirique, nous avons construit une base de données de panel équilibré couvrant 23 pays de la région Euro-Méditerranéenne sur une période de 21 ans, s’étendant de 2000 à 2020.

Le choix des variables a suivi une approche itérative. En l’absence d’une référence bibliographique unique traitant spécifiquement de cette configuration régionale avec cette méthodologie précise, nous avons ajusté notre modèle au fur et à mesure de l’étude}. Cette démarche exploratoire nous a permis de tester différentes combinaisons de variables pour isoler les facteurs les plus pertinents expliquant la variabilité de la production agricole.

Notre jeu de données se divise en deux catégories principales :

Variables Économiques : Elles incluent les indicateurs de performance agricole (tels que la valeur ajoutée agricole en pourcentage du PIB ou en dollars constants) ainsi que les facteurs de production classiques (travail, capital, terres arables). La collecte de ces données a été automatisée grâce à des scripts Python développés pour scraper l’API de la base de données de la Banque Mondiale (World Bank Open Data).

Variables Climatiques : Pour capturer la réalité physique du changement climatique, nous avons intégré des variables telles que la température moyenne annuelle, les précipitations et d’autres indices bioclimatiques pertinents après standardisation. Ces données granulaires ont été extraites et agrégées à l’échelle nationale en utilisant la puissance de calcul de Google Earth Engine, assurant ainsi une précision temporelle et spatiale élevée.

L’assemblage de ces deux sources distinctes a nécessité un travail rigoureux d’harmonisation et de nettoyage, notamment pour assurer la cohérence des unités de mesure avant l’étape de modélisation spatiale.

# ===========================================================
# 0. Importation des Packages
# ===========================================================

library(sf)

## Warning: le package 'sf' a été compilé avec la version R 4.4.3

## Linking to GEOS 3.13.0, GDAL 3.10.1, PROJ 9.5.1; sf_use_s2() is TRUE

library(sp)

## Warning: le package 'sp' a été compilé avec la version R 4.4.3

library(spdep)

## Warning: le package 'spdep' a été compilé avec la version R 4.4.3

## Le chargement a nécessité le package : spData

## Warning: le package 'spData' a été compilé avec la version R 4.4.3

## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`

library(spatialreg)

## Warning: le package 'spatialreg' a été compilé avec la version R 4.4.3

## Le chargement a nécessité le package : Matrix

## 
## Attachement du package : 'spatialreg'

## Les objets suivants sont masqués depuis 'package:spdep':
## 
##     get.ClusterOption, get.coresOption, get.mcOption,
##     get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
##     set.coresOption, set.mcOption, set.VerboseOption,
##     set.ZeroPolicyOption

library(plm)

## Warning: le package 'plm' a été compilé avec la version R 4.4.3

library(splm)

## Warning: le package 'splm' a été compilé avec la version R 4.4.3

library(readxl)
library(dplyr)

## Warning: le package 'dplyr' a été compilé avec la version R 4.4.3

## 
## Attachement du package : 'dplyr'

## Les objets suivants sont masqués depuis 'package:plm':
## 
##     between, lag, lead

## Les objets suivants sont masqués depuis 'package:stats':
## 
##     filter, lag

## Les objets suivants sont masqués depuis 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)

## Warning: le package 'ggplot2' a été compilé avec la version R 4.4.3

library(RColorBrewer)
library(classInt)

## Warning: le package 'classInt' a été compilé avec la version R 4.4.3

library(corrplot)

## corrplot 0.95 loaded

library(car)

## Le chargement a nécessité le package : carData

## 
## Attachement du package : 'car'

## L'objet suivant est masqué depuis 'package:dplyr':
## 
##     recode

# ===========================================================
# 1.1 CHARGEMENT ET PRÉPARATION DES DONNÉES
# ===========================================================

panel <- read_excel("panel_final.xlsx")

panel <- panel %>%
  mutate(
    LogProd = log(Agri_Production_USD_2015),
    
    Prep = Precipitation_mm,
    Land = Agri_Land_SqKm,
    Rural = Rural_Pop_Share,
    Fert = Fertilizer_Consump,
    TempAnom = Temp_Anomaly,
    PrecipAnom = Precip_Anomaly,
    CV = precip_cv,
    HotDays = hot_days
  ) %>%
  select(ISO3, Year, LogProd, Prep, Land, Rural, Fert, 
         TempAnom, PrecipAnom, CV, HotDays)

panel <- panel %>%
  filter(complete.cases(.))

cat("Observations après nettoyage :", nrow(panel), "\n")

## Observations après nettoyage : 483

# ===========================================================
# 1.2 DIAGNOSTIC DE MULTICOLINÉARITÉ
# ===========================================================

# Matrice de corrélation
vars_numeriques <- panel %>%
  select(LogProd, Land, Rural, Fert, TempAnom, PrecipAnom, CV, HotDays)

cor_matrix <- cor(vars_numeriques, use = "complete.obs")

# Visualiser
corrplot(cor_matrix, method = "color", type = "upper", 
         addCoef.col = "black", tl.col = "black", tl.srt = 45,
         title = "Matrice de Corrélation",
         mar = c(0, 0, 2, 0))

# Test VIF (sur modèle simple)
model_test <- lm(LogProd ~ Prep + Land + Rural + Fert + TempAnom + 
                   PrecipAnom + CV + HotDays, data = panel)

vif_values <- vif(model_test)
cat("\n=== VIF (Variance Inflation Factor) ===\n")

## 
## === VIF (Variance Inflation Factor) ===

print(sort(vif_values, decreasing = TRUE))

##       Prep         CV    HotDays       Fert      Rural PrecipAnom       Land 
##   4.274247   2.795028   2.619879   1.349912   1.142868   1.142638   1.123962 
##   TempAnom 
##   1.022246

# Si VIF > 10 : problème sérieux
vars_problematiques <- names(vif_values[vif_values > 10])
if(length(vars_problematiques) > 0) {
  cat("\n Variables avec VIF > 10 :\n")
  print(vars_problematiques)
}

# ==========================================================
# 1.3 STANDARDISATION DES VARIABLES
# ==========================================================

# Standardiser toutes les variables continues (sauf ISO3 et Year)
panel_std <- panel %>%
  mutate(across(
    c(LogProd, Prep, Land, Rural, Fert, TempAnom, PrecipAnom, CV, HotDays),
    ~ scale(.x)[,1],
    .names = "{.col}_std"
  ))

cat("\n=== Variables standardisées ===\n")

## 
## === Variables standardisées ===

summary(panel_std %>% select(ends_with("_std")))

##   LogProd_std         Prep_std          Land_std         Rural_std      
##  Min.   :-2.4720   Min.   :-0.9385   Min.   :-0.5827   Min.   :-1.5067  
##  1st Qu.:-0.6551   1st Qu.:-0.7510   1st Qu.:-0.5642   1st Qu.:-0.7788  
##  Median : 0.1112   Median :-0.5212   Median :-0.3389   Median :-0.1390  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.9521   3rd Qu.: 0.9448   3rd Qu.: 0.2225   3rd Qu.: 0.4583  
##  Max.   : 1.5415   Max.   : 2.6389   Max.   : 4.2486   Max.   : 2.8944  
##     Fert_std        TempAnom_std      PrecipAnom_std         CV_std        
##  Min.   :-0.7691   Min.   :-4.14811   Min.   :-4.00948   Min.   :-1.55876  
##  1st Qu.:-0.5702   1st Qu.:-0.46470   1st Qu.:-0.31143   1st Qu.:-0.79799  
##  Median :-0.3306   Median :-0.01205   Median :-0.06335   Median :-0.04055  
##  Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.00000   Mean   : 0.00000  
##  3rd Qu.: 0.1331   3rd Qu.: 0.50797   3rd Qu.: 0.25941   3rd Qu.: 0.52462  
##  Max.   : 7.2468   Max.   : 3.40152   Max.   : 4.97198   Max.   : 3.28936  
##   HotDays_std     
##  Min.   :-1.2130  
##  1st Qu.:-1.0581  
##  Median :-0.2053  
##  Mean   : 0.0000  
##  3rd Qu.: 0.8668  
##  Max.   : 1.9260

# VIF après standardisation
model_test_std <- lm(LogProd_std ~ Prep_std + Land_std + Rural_std + Fert_std + 
                       TempAnom_std + PrecipAnom_std + CV_std + HotDays_std, 
                     data = panel_std)

vif_std <- vif(model_test_std)
cat("\n=== VIF après standardisation ===\n")

## 
## === VIF après standardisation ===

print(sort(vif_std, decreasing = TRUE))

##       Prep_std         CV_std    HotDays_std       Fert_std      Rural_std 
##       4.274247       2.795028       2.619879       1.349912       1.142868 
## PrecipAnom_std       Land_std   TempAnom_std 
##       1.142638       1.123962       1.022246

# ========================================================
# 2. CRÉATION DU SHAPEFILE
# ========================================================

library(rnaturalearth)

## Warning: le package 'rnaturalearth' a été compilé avec la version R 4.4.3

world <- ne_countries(scale = "medium", returnclass = "sf")

pays <- c("MAR", "DZA", "TUN", "LBY", "EGY", "JOR", "LBN", "IRQ", 
          "SAU", "YEM", "OMN", "ARE", "QAT", "KWT", "BHR", "IRN", 
          "TUR", "ISR", "PRT", "ESP", "FRA", "ITA", "GRC")

shapefile <- world %>% 
  filter(adm0_a3 %in% pays | admin == "France") %>%
  select(iso_a3, adm0_a3, name, geometry)

shapefile_sp <- as_Spatial(shapefile)

plot(shapefile$geometry, main = "Zone d'étude (23 pays)")

# =======================================================
# 3. MATRICES DE PONDÉRATION SPATIALE
# =======================================================

coords <- coordinates(shapefile_sp)
Points <- SpatialPoints(coords)

# Contiguïté Queen
nb_queen <- poly2nb(shapefile_sp, queen = TRUE)

## Warning in poly2nb(shapefile_sp, queen = TRUE): some observations have no neighbours;
## if this seems unexpected, try increasing the snap argument.

## Warning in poly2nb(shapefile_sp, queen = TRUE): neighbour object has 3 sub-graphs;
## if this sub-graph count seems unexpected, try increasing the snap argument.

W_queen <- nb2listw(nb_queen, style = "W", zero.policy = TRUE)
cat("\n=== Matrice Contiguïté Queen ===\n")

## 
## === Matrice Contiguïté Queen ===

summary(nb_queen)

## Neighbour list object:
## Number of regions: 23 
## Number of nonzero links: 52 
## Percentage nonzero weights: 9.829868 
## Average number of links: 2.26087 
## 1 region with no links:
## 22
## 3 disjoint connected subgraphs
## Link number distribution:
## 
## 0 1 2 3 5 7 
## 1 6 8 6 1 1 
## 6 least connected regions:
## 7 8 10 12 15 19 with 1 link
## 1 most connected region:
## 6 with 7 links

# 3 plus proches voisins
knn3 <- knearneigh(Points, k = 3)
nb_k3 <- knn2nb(knn3)
W_k3 <- nb2listw(nb_k3, style = "W", zero.policy = TRUE)
cat("\n=== Matrice 3 plus proches voisins ===\n")

## 
## === Matrice 3 plus proches voisins ===

summary(nb_k3)

## Neighbour list object:
## Number of regions: 23 
## Number of nonzero links: 69 
## Percentage nonzero weights: 13.04348 
## Average number of links: 3 
## Non-symmetric neighbours list
## Link number distribution:
## 
##  3 
## 23 
## 23 least connected regions:
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 with 3 links
## 23 most connected regions:
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 with 3 links

# 4 plus proches voisins
knn4 <- knearneigh(Points, k = 4)
nb_k4 <- knn2nb(knn4)
W_k4 <- nb2listw(nb_k4, style = "W", zero.policy = TRUE)

# Visualiser
plot(shapefile_sp, main = "Matrice 3 plus proches voisins")
plot(nb_k3, coords, add = TRUE, col = "red")

# =====================================================
# 4. CARTOGRAPHIE (2020)
# =====================================================

data_2020 <- panel %>% filter(Year == 2020)

shapefile_2020 <- shapefile %>%
  left_join(data_2020, by = c("adm0_a3" = "ISO3"))

# Carte Log Production
ggplot(shapefile_2020) +
  geom_sf(aes(fill = LogProd), color = "white") +
  scale_fill_viridis_c(name = "Log(Production)") +
  theme_minimal() +
  labs(title = "Log Production Agricole 2020")

ggsave("carte_logprod_2020.png", width = 12, height = 8, dpi = 300)

# Carte HotDays
ggplot(shapefile_2020) +
  geom_sf(aes(fill = HotDays), color = "white") +
  scale_fill_gradient2(low = "blue", mid = "yellow", high = "red",
                       midpoint = 100, name = "Jours > 35°C") +
  theme_minimal() +
  labs(title = "Jours de Chaleur Extrême 2020")

ggsave("carte_hotdays_2020.png", width = 12, height = 8, dpi = 300)

# =======================================================
# 5. AUTOCORRÉLATION SPATIALE (par année)
# =======================================================

moran_annees <- data.frame()

for(annee in unique(panel$Year)) {
  data_temp <- panel %>% filter(Year == annee)
  
  shapefile_temp <- shapefile %>%
    left_join(data_temp, by = c("adm0_a3" = "ISO3"))
  
  shapefile_temp_sp <- as_Spatial(shapefile_temp)
  
  if(sum(!is.na(shapefile_temp_sp$LogProd)) >= 3) {
    moran_test <- moran.test(shapefile_temp_sp$LogProd, 
                             listw = W_k3, 
                             zero.policy = TRUE,
                             na.action = na.omit)
    
    moran_annees <- rbind(moran_annees, data.frame(
      Year = annee,
      Moran_I = moran_test$estimate[1],
      p_value = moran_test$p.value
    ))
  }
}

# Graphique
ggplot(moran_annees, aes(x = Year, y = Moran_I)) +
  geom_line(color = "blue", size = 1) +
  geom_point(size = 3, aes(color = p_value < 0.05)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  scale_color_manual(values = c("grey", "darkblue"), 
                     name = "Significatif (p<0.05)") +
  theme_minimal() +
  labs(title = "Autocorrélation Spatiale - Log Production (2000-2020)",
       y = "I de Moran", x = "Année")

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

ggsave("evolution_moran.png", width = 12, height = 7, dpi = 300)

# =============================================================
# 6. PRÉPARATION PANEL
# =============================================================

# Utiliser les variables standardisées
pdata <- pdata.frame(panel_std, index = c("ISO3", "Year"))
pdim(pdata)

## Balanced Panel: n = 23, T = 21, N = 483

# Spécification avec variables standardisées
Eq <- LogProd_std ~ Land_std + Rural_std +
  Fert_std + TempAnom_std + CV_std + HotDays_std

# ====================================================
# 7.1 MODÈLES ASPATIALS
# ====================================================

# Pooled
pooled <- plm(Eq, data = pdata, model = "pooling")
cat("\n=== POOLED OLS ===\n")

## 
## === POOLED OLS ===

summary(pooled)

## Pooling Model
## 
## Call:
## plm(formula = Eq, data = pdata, model = "pooling")
## 
## Balanced Panel: n = 23, T = 21, N = 483
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -1.8533851 -0.4845680  0.0073198  0.5484674  1.4459489 
## 
## Coefficients:
##                 Estimate  Std. Error t-value  Pr(>|t|)    
## (Intercept)   9.1568e-16  2.9672e-02  0.0000  1.000000    
## Land_std      3.7729e-01  3.1486e-02 11.9826 < 2.2e-16 ***
## Rural_std     4.4401e-01  3.1729e-02 13.9936 < 2.2e-16 ***
## Fert_std     -1.0300e-01  3.4448e-02 -2.9900  0.002934 ** 
## TempAnom_std  4.5214e-02  2.9903e-02  1.5121  0.131184    
## CV_std       -1.4094e-01  3.9582e-02 -3.5607  0.000407 ***
## HotDays_std  -2.6401e-01  3.8209e-02 -6.9097 1.565e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    482
## Residual Sum of Squares: 202.42
## R-Squared:      0.58004
## Adj. R-Squared: 0.57475
## F-statistic: 109.575 on 6 and 476 DF, p-value: < 2.22e-16

# Effets fixes
fe <- plm(Eq, data = pdata, model = "within")
cat("\n=== EFFETS FIXES ===\n")

## 
## === EFFETS FIXES ===

summary(fe)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = Eq, data = pdata, model = "within")
## 
## Balanced Panel: n = 23, T = 21, N = 483
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4841789 -0.0634734 -0.0066343  0.0661197  0.5545844 
## 
## Coefficients:
##                Estimate Std. Error  t-value  Pr(>|t|)    
## Land_std      0.0716880  0.1260723   0.5686 0.5698909    
## Rural_std    -0.3646060  0.0299553 -12.1717 < 2.2e-16 ***
## Fert_std      0.0076786  0.0096753   0.7936 0.4278244    
## TempAnom_std -0.0065740  0.0058281  -1.1280 0.2599222    
## CV_std       -0.0605016  0.0171291  -3.5321 0.0004545 ***
## HotDays_std   0.1333755  0.0599599   2.2244 0.0266117 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    9.297
## Residual Sum of Squares: 6.4333
## R-Squared:      0.30803
## Adj. R-Squared: 0.26535
## F-statistic: 33.6824 on 6 and 454 DF, p-value: < 2.22e-16

# Effets aléatoires
re <- plm(Eq, data = pdata, model = "random")
cat("\n=== EFFETS ALÉATOIRES ===\n")

## 
## === EFFETS ALÉATOIRES ===

summary(re)

## Oneway (individual) effect Random Effect Model 
##    (Swamy-Arora's transformation)
## 
## Call:
## plm(formula = Eq, data = pdata, model = "random")
## 
## Balanced Panel: n = 23, T = 21, N = 483
## 
## Effects:
##                   var std.dev share
## idiosyncratic 0.01417 0.11904 0.032
## individual    0.43402 0.65881 0.968
## theta: 0.9606
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -0.57531409 -0.08250507 -0.00021344  0.08183267  0.45762199 
## 
## Coefficients:
##                 Estimate  Std. Error  z-value  Pr(>|z|)    
## (Intercept)   1.4721e-16  1.4499e-01   0.0000 1.0000000    
## Land_std      2.0792e-01  9.8323e-02   2.1146 0.0344599 *  
## Rural_std    -3.4134e-01  3.0459e-02 -11.2068 < 2.2e-16 ***
## Fert_std      5.4372e-03  1.0177e-02   0.5342 0.5931748    
## TempAnom_std -3.6202e-03  6.1063e-03  -0.5929 0.5532786    
## CV_std       -6.5567e-02  1.7978e-02  -3.6470 0.0002653 ***
## HotDays_std   5.5631e-02  5.8285e-02   0.9545 0.3398482    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    10.031
## Residual Sum of Squares: 7.502
## R-Squared:      0.2521
## Adj. R-Squared: 0.24267
## Chisq: 160.447 on 6 DF, p-value: < 2.22e-16

# Test Hausman
cat("\n=== TEST HAUSMAN ===\n")

## 
## === TEST HAUSMAN ===

print(phtest(fe, re))

## 
##  Hausman Test
## 
## data:  Eq
## chisq = 15.486, df = 6, p-value = 0.01679
## alternative hypothesis: one model is inconsistent

# ===================================================
# 7.2 TESTS LM POUR SPÉCIFICATION SPATIALE
# ===================================================

cat("\n=== TESTS LM (Baltagi, Song & Koh) ===\n")

## 
## === TESTS LM (Baltagi, Song & Koh) ===

# Ces tests sont faits pour les panels (splm) et gèrent automatiquement N*T

# 1. Test Joint (H0: Pas d'autocorrélation spatiale ni d'effets aléatoires)
test_joint <- bsktest(Eq, data = pdata, listw = W_k3, test = "LMH")
cat("\nTest Joint (LMH) :\n")

## 
## Test Joint (LMH) :

print(test_joint)

## 
##  Baltagi, Song and Koh LM-H one-sided joint test
## 
## data:  LogProd_std ~ Land_std + Rural_std + Fert_std + TempAnom_std +     CV_std + HotDays_std
## LM-H = 3743.3, p-value < 2.2e-16
## alternative hypothesis: Random Regional Effects and Spatial autocorrelation

# 2. Test LM pour Effets Aléatoires vs OLS (sans spatial)
test_lm1 <- bsktest(Eq, data = pdata, listw = W_k3, test = "LM1")
cat("\nTest LM1 (Effets Aléatoires) :\n")

## 
## Test LM1 (Effets Aléatoires) :

print(test_lm1)

## 
##  Baltagi, Song and Koh SLM1 marginal test
## 
## data:  LogProd_std ~ Land_std + Rural_std + Fert_std + TempAnom_std +     CV_std + HotDays_std
## LM1 = 61.182, p-value < 2.2e-16
## alternative hypothesis: Random effects

# 3. Test LM pour Corrélation Spatiale (SLM2)
# H0 : Pas de corrélation spatiale (lambda = 0)
test_lm2 <- bsktest(Eq, data = pdata, listw = W_k3, test = "LM2")
cat("\nTest LM2 (Corrélation Spatiale) :\n")

## 
## Test LM2 (Corrélation Spatiale) :

print(test_lm2)

## 
##  Baltagi, Song and Koh LM2 marginal test
## 
## data:  LogProd_std ~ Land_std + Rural_std + Fert_std + TempAnom_std +     CV_std + HotDays_std
## LM2 = -1.044, p-value = 1.703
## alternative hypothesis: Spatial autocorrelation

# 4. Test Conditionnel pour le Lag Spatial (SAR)
# H0 : Rho = 0 (en supposant qu'il y a des effets aléatoires)
test_cond_sar <- bsktest(Eq, data = pdata, listw = W_k3, test = "CLMlambda")
cat("\nTest Conditionnel SAR (CLMlambda) :\n")

## 
## Test Conditionnel SAR (CLMlambda) :

print(test_cond_sar)

## 
##  Baltagi, Song and Koh LM*-lambda conditional LM test (assuming
##  sigma^2_mu >= 0)
## 
## data:  LogProd_std ~ Land_std + Rural_std + Fert_std + TempAnom_std +     CV_std + HotDays_std
## LM*-lambda = 6.8512, p-value = 7.323e-12
## alternative hypothesis: Spatial autocorrelation

# 5. Test Conditionnel pour l'Erreur Spatiale (SEM)
# H0 : Lambda = 0
test_cond_sem <- bsktest(Eq, data = pdata, listw = W_k3, test = "CLMmu")
cat("\nTest Conditionnel SEM (CLMmu) :\n")

## 
## Test Conditionnel SEM (CLMmu) :

print(test_cond_sem)

## 
##  Baltagi, Song and Koh LM*- mu conditional LM test (assuming lambda may
##  or may not be = 0)
## 
## data:  LogProd_std ~ Land_std + Rural_std + Fert_std + TempAnom_std +     CV_std + HotDays_std
## LM*-mu = 59.191, p-value < 2.2e-16
## alternative hypothesis: Random regional effects

# =====================================================
# 7.3 MODÈLES SPATIAUX (Maximum de Vraisemblance)
# =====================================================


# ================================
# SEM
# ================================
cat("\n=== SEM (Effets Fixes) ===\n")

## 
## === SEM (Effets Fixes) ===

SEM_fe <- spml(Eq, data = pdata, listw = W_k3,
               model = "within", lag = FALSE, spatial.error = "b")
summary(SEM_fe)

## Spatial panel fixed effects error model
##  
## 
## Call:
## spml(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = FALSE, spatial.error = "b")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4844040 -0.0638887 -0.0065038  0.0670269  0.5552875 
## 
## Spatial error parameter:
##     Estimate Std. Error t-value  Pr(>|t|)    
## rho 0.360470   0.047358  7.6117 2.706e-14 ***
## 
## Coefficients:
##                Estimate Std. Error t-value  Pr(>|t|)    
## Land_std      0.1245597  0.1149902  1.0832  0.278711    
## Rural_std    -0.3021559  0.0304093 -9.9363 < 2.2e-16 ***
## Fert_std      0.0069578  0.0085226  0.8164  0.414275    
## TempAnom_std -0.0060757  0.0066913 -0.9080  0.363875    
## CV_std       -0.0473542  0.0159718 -2.9649  0.003028 ** 
## HotDays_std   0.0644050  0.0591439  1.0890  0.276174    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ================================
# SAR
# ================================
cat("\n=== SAR (Effets Fixes) ===\n")

## 
## === SAR (Effets Fixes) ===

SAR_fe <- spml(Eq, data = pdata, listw = W_k3,
               model = "within", lag = TRUE, spatial.error = "none")
summary(SAR_fe)

## Spatial panel fixed effects lag model
##  
## 
## Call:
## spml(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = TRUE, spatial.error = "none")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4436408 -0.0612026 -0.0033675  0.0564098  0.5360724 
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value  Pr(>|t|)    
## lambda 0.370797   0.039878  9.2983 < 2.2e-16 ***
## 
## Coefficients:
##                 Estimate  Std. Error  t-value  Pr(>|t|)    
## Land_std      0.10491129  0.11180823   0.9383  0.348083    
## Rural_std    -0.28294073  0.02827812 -10.0056 < 2.2e-16 ***
## Fert_std      0.00037107  0.00862203   0.0430  0.965672    
## TempAnom_std -0.00668758  0.00516897  -1.2938  0.195736    
## CV_std       -0.04480384  0.01526207  -2.9356  0.003329 ** 
## HotDays_std   0.07077813  0.05337545   1.3260  0.184826    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# CALCUL MANUEL DES IMPACTS SAR 
cat("\n=== IMPACTS SAR (Calcul Manuel) ===\n")

## 
## === IMPACTS SAR (Calcul Manuel) ===

# 1. On récupère les coefficients bruts du modèle
coeffs <- coef(SAR_fe)
lambda <- as.numeric(SAR_fe$coefficients["lambda"]) # Le coefficient spatial
betas <- coeffs[names(coeffs) != "lambda"] # Les coefficients X (Beta)

# 2. On prépare le Multiplicateur Spatial (I - lambda*W)^-1
# On a besoin de la matrice W sous forme de matrice carrée
W_mat <- listw2mat(W_k3) 
N <- nrow(W_mat)
I <- diag(N)

# Le cœur du calcul spatial (Leplicateur de LeSage & Pace)
# S = (I - lambda * W)^-1
S <- solve(I - lambda * W_mat) 

# 3. Calcul des scalaires
# Impact Direct Moyen = Moyenne de la diagonale de S
impact_direct_multiplier <- mean(diag(S))

# Impact Total Moyen = Moyenne de la somme des lignes de S
# (1 / (1 - lambda) est aussi une bonne approximation pour le total)
impact_total_multiplier <- mean(rowSums(S))

# 4. Application aux variables
# On crée un tableau propre
impacts_table <- data.frame(
  Variable = names(betas),
  Beta_Coeff = as.numeric(betas),
  Direct = as.numeric(betas) * impact_direct_multiplier,
  Total = as.numeric(betas) * impact_total_multiplier
)

# On calcule l'indirect (Total - Direct)
impacts_table$Indirect <- impacts_table$Total - impacts_table$Direct

# Affichage propre
print(impacts_table, digits = 4)

##       Variable Beta_Coeff     Direct      Total   Indirect
## 1     Land_std  0.1049113  0.1092877  0.1667368  0.0574491
## 2    Rural_std -0.2829407 -0.2947437 -0.4496813 -0.1549375
## 3     Fert_std  0.0003711  0.0003866  0.0005897  0.0002032
## 4 TempAnom_std -0.0066876 -0.0069666 -0.0106287 -0.0036621
## 5       CV_std -0.0448038 -0.0466729 -0.0712073 -0.0245345
## 6  HotDays_std  0.0707781  0.0737307  0.1124886  0.0387579

# Interprétation pour le rapport :
cat("\n--- INTERPRÉTATION ---\n")

## 
## --- INTERPRÉTATION ---

cat("Le multiplicateur spatial est de", round(impact_total_multiplier, 3), ".\n")

## Le multiplicateur spatial est de 1.589 .

cat("Cela signifie que tout effet direct est amplifié de", 
    round((impact_total_multiplier - 1)*100, 1), "% par les rétroactions régionales.\n")

## Cela signifie que tout effet direct est amplifié de 58.9 % par les rétroactions régionales.

# ================================
# SDM
# ================================
cat("\n=== SDM (Effets Fixes) ===\n")

## 
## === SDM (Effets Fixes) ===

SDM_fe <- tryCatch({
  spml(Eq, data = pdata, listw = W_k3,
       model = "within", lag = TRUE, spatial.error = "none",
       Durbin = TRUE)
}, error = function(e) {
  cat("  SDM échoue :", e$message, "\n")
  NULL
})

summary(SDM_fe)

## Spatial panel fixed effects lag model
##  
## 
## Call:
## spml(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = TRUE, spatial.error = "none", Durbin = TRUE)
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4436408 -0.0612026 -0.0033675  0.0564098  0.5360724 
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value  Pr(>|t|)    
## lambda 0.370797   0.039878  9.2983 < 2.2e-16 ***
## 
## Coefficients:
##                 Estimate  Std. Error  t-value  Pr(>|t|)    
## Land_std      0.10491129  0.11180823   0.9383  0.348083    
## Rural_std    -0.28294073  0.02827812 -10.0056 < 2.2e-16 ***
## Fert_std      0.00037107  0.00862203   0.0430  0.965672    
## TempAnom_std -0.00668758  0.00516897  -1.2938  0.195736    
## CV_std       -0.04480384  0.01526207  -2.9356  0.003329 ** 
## HotDays_std   0.07077813  0.05337545   1.3260  0.184826    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# CALCUL MANUEL DES IMPACTS SDM
cat("\n=== IMPACTS SDM (Calcul Manuel) ===\n")

## 
## === IMPACTS SDM (Calcul Manuel) ===

if(exists("SDM_fe") && !is.null(SDM_fe)) {
  
  # 1. Préparation des matrices
  # ---------------------------
  W_mat <- listw2mat(W_k3) # Ta matrice W en format carré
  N <- nrow(W_mat)
  I <- diag(N)
  
  # Récupération des coefficients
  coeffs <- coef(SDM_fe)
  rho <- as.numeric(SDM_fe$coefficients["lambda"]) # Le coefficient spatial (rho)
  
  # Le Multiplicateur Spatial Global : S = (I - rho * W)^-1
  S_global <- solve(I - rho * W_mat) 
  
  # 2. Identification des variables (X et WX)
  # ---------------------------
  # On liste les variables explicatives de base (sans les effets spatiaux)
  # Adapte cette liste si tes noms de variables changent !
  vars_base <- c("Land_std", "Rural_std", "Fert_std", "TempAnom_std", 
                 "PrecipAnom_std", "CV_std", "HotDays_std")
  
  # Tableau pour stocker les résultats
  res_impacts <- data.frame(
    Variable = vars_base,
    Direct = NA,
    Indirect = NA,
    Total = NA
  )
  
  # 3. Boucle de calcul pour chaque variable
  # ---------------------------
  for(i in 1:length(vars_base)) {
    var_name <- vars_base[i]
    
    # a. Trouver Beta (Coef Direct)
    beta <- coeffs[var_name]
    if(is.na(beta)) beta <- 0 # Sécurité
    
    # b. Trouver Theta (Coef Indirect/Spatial)
    # splm nomme souvent les lags spatiaux "lag_NomVariable" ou "W.NomVariable"
    # On essaie de le trouver
    theta_name <- paste0("lag_", var_name) 
    # Si 'lag_' ne marche pas, essaie sans préfixe ou regarde names(coeffs)
    theta <- coeffs[theta_name]
    if(is.na(theta)) theta <- 0 
    
    # c. Calcul de la Matrice d'Impact pour cette variable k
    # M_k = S_global * (Beta * I + Theta * W)
    # C'est la formule magique du SDM (LeSage & Pace)
    Impact_Matrix <- S_global %*% (beta * I + theta * W_mat)
    
    # d. Extraction des scalaires (Moyennes)
    # Direct = Moyenne de la diagonale (Impact sur soi-même + feedback)
    res_impacts$Direct[i] <- mean(diag(Impact_Matrix))
    
    # Total = Moyenne de la somme des lignes (Impact sur tout le monde)
    res_impacts$Total[i] <- mean(rowSums(Impact_Matrix))
    
    # Indirect = Total - Direct (Débordement pur)
    res_impacts$Indirect[i] <- res_impacts$Total[i] - res_impacts$Direct[i]
  }
  
  # 4. Affichage
  print(res_impacts, digits = 4)
  
  cat("\nNote : \n")
  cat("- Direct : L'effet local incluant le feedback (boucle de rétroaction).\n")
  cat("- Indirect : L'effet de débordement vers (ou depuis) les voisins.\n")
}

##         Variable     Direct   Indirect      Total
## 1       Land_std  0.1092877  0.0574491  0.1667368
## 2      Rural_std -0.2947437 -0.1549375 -0.4496813
## 3       Fert_std  0.0003866  0.0002032  0.0005897
## 4   TempAnom_std -0.0069666 -0.0036621 -0.0106287
## 5 PrecipAnom_std  0.0000000  0.0000000  0.0000000
## 6         CV_std -0.0466729 -0.0245345 -0.0712073
## 7    HotDays_std  0.0737307  0.0387579  0.1124886
## 
## Note : 
## - Direct : L'effet local incluant le feedback (boucle de rétroaction).
## - Indirect : L'effet de débordement vers (ou depuis) les voisins.

# ================================
# SDEM
# ================================
cat("\n=== SDEM (Effets Fixes) ===\n")

## 
## === SDEM (Effets Fixes) ===

SDEM_fe <- tryCatch({
  spml(Eq, data = pdata, listw = W_k3,
       model = "within", lag = FALSE, spatial.error = "b",
       Durbin = TRUE)
}, error = function(e) {
  cat("  SDEM échoue :", e$message, "\n")
  NULL
})

if(!is.null(SDEM_fe)) summary(SDEM_fe)

## Spatial panel fixed effects error model
##  
## 
## Call:
## spml(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = FALSE, spatial.error = "b", Durbin = TRUE)
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4844040 -0.0638887 -0.0065038  0.0670269  0.5552875 
## 
## Spatial error parameter:
##     Estimate Std. Error t-value  Pr(>|t|)    
## rho 0.360470   0.047358  7.6117 2.706e-14 ***
## 
## Coefficients:
##                Estimate Std. Error t-value  Pr(>|t|)    
## Land_std      0.1245597  0.1149902  1.0832  0.278711    
## Rural_std    -0.3021559  0.0304093 -9.9363 < 2.2e-16 ***
## Fert_std      0.0069578  0.0085226  0.8164  0.414275    
## TempAnom_std -0.0060757  0.0066913 -0.9080  0.363875    
## CV_std       -0.0473542  0.0159718 -2.9649  0.003028 ** 
## HotDays_std   0.0644050  0.0591439  1.0890  0.276174    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ================================
# SARAR
# ================================
cat("\n=== SARAR (Effets Fixes) ===\n")

## 
## === SARAR (Effets Fixes) ===

SARAR_fe <- tryCatch({
  spml(Eq, data = pdata, listw = W_k3,
       model = "within", lag = TRUE, spatial.error = "kkp")
}, error = function(e) {
  cat("  SARAR échoue :", e$message, "\n")
  NULL
})

if(!is.null(SARAR_fe)) summary(SARAR_fe)

## Spatial panel fixed effects sarar model
##  
## 
## Call:
## spml(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = TRUE, spatial.error = "kkp")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4144700 -0.0570232 -0.0018168  0.0549934  0.5382131 
## 
## Spatial error parameter:
##      Estimate Std. Error t-value  Pr(>|t|)    
## rho -0.706006   0.070577 -10.003 < 2.2e-16 ***
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value  Pr(>|t|)    
## lambda  0.72093    0.03023  23.848 < 2.2e-16 ***
## 
## Coefficients:
##                Estimate Std. Error t-value  Pr(>|t|)    
## Land_std      0.0682653  0.0830351  0.8221   0.41101    
## Rural_std    -0.1347073  0.0215879 -6.2399 4.378e-10 ***
## Fert_std     -0.0059797  0.0070657 -0.8463   0.39738    
## TempAnom_std -0.0049792  0.0028365 -1.7554   0.07919 .  
## CV_std       -0.0264665  0.0110939 -2.3857   0.01705 *  
## HotDays_std   0.0574537  0.0353441  1.6256   0.10404    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ===============================================
# 7.4 MODÈLES PAR MÉTHODE DES MOMENTS GÉNÉRALISÉS
# ===============================================


# ================================
# SEM GM
# ================================
cat("\n=== SEM (GM) ===\n")

## 
## === SEM (GM) ===

SEM_GM <- spgm(Eq, data = pdata, listw = W_k3,
               model = "within", lag = FALSE, spatial.error = TRUE,
               moments = "fullweights")
summary(SEM_GM)

## Spatial fixed effects error model (GM estimation) 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = FALSE, spatial.error = TRUE, moments = "fullweights")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4622044 -0.0567109 -0.0046324  0.0596812  0.5567316 
## 
## Estimated spatial coefficient, variance components and theta:
##           Estimate
## rho        0.28333
## sigma^2_v  0.01298
## 
## Coefficients:
##                Estimate Std. Error  t-value  Pr(>|t|)    
## Land_std      0.1133729  0.1172537   0.9669  0.333593    
## Rural_std    -0.3185086  0.0303627 -10.4901 < 2.2e-16 ***
## Fert_std      0.0078410  0.0087406   0.8971  0.369678    
## TempAnom_std -0.0060156  0.0064985  -0.9257  0.354605    
## CV_std       -0.0503962  0.0162165  -3.1077  0.001885 ** 
## HotDays_std   0.0800413  0.0593487   1.3487  0.177445    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ================================
# SAR GM
# ================================
cat("\n=== SAR (GM) ===\n")

## 
## === SAR (GM) ===

SAR_GM <- spgm(Eq, data = pdata, listw = W_k3,
               model = "within", lag = TRUE, spatial.error = FALSE,
               moments = "fullweights")
summary(SAR_GM)

## Spatial w2sls model 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = TRUE, spatial.error = FALSE, moments = "fullweights")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4243026 -0.0575349  0.0010825  0.0567094  0.5272416 
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value  Pr(>|t|)    
## lambda 0.547681   0.080035  6.8431 7.752e-12 ***
## 
## Coefficients:
##                Estimate Std. Error t-value  Pr(>|t|)    
## Land_std      0.1207600  0.1120338  1.0779    0.2811    
## Rural_std    -0.2439833  0.0318813 -7.6529 1.966e-14 ***
## Fert_std     -0.0031149  0.0087241 -0.3570    0.7211    
## TempAnom_std -0.0067418  0.0051686 -1.3044    0.1921    
## CV_std       -0.0373154  0.0155638 -2.3976    0.0165 *  
## HotDays_std   0.0409168  0.0548637  0.7458    0.4558    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ================================
# SARAR GM
# ================================
cat("\n=== SARAR (GM) ===\n")

## 
## === SARAR (GM) ===

SARAR_GM <- spgm(Eq, data = pdata, listw = W_k3,
                 model = "within", lag = TRUE, spatial.error = TRUE,
                 moments = "fullweights")
summary(SARAR_GM)

## Spatial fixed effects  SARAR model (GM estimation) 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = TRUE, spatial.error = TRUE, moments = "fullweights")
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -2.92267 -0.67174  0.13370 -0.00774  0.84073  2.06411 
## 
## Estimated spatial coefficient, variance components and theta:
##            Estimate
## rho       -0.307888
## sigma^2_v  0.010581
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value  Pr(>|t|)    
## lambda 0.582344   0.073575   7.915 2.473e-15 ***
## 
## Coefficients:
##                Estimate Std. Error t-value  Pr(>|t|)    
## Land_std      0.0846554  0.1050505  0.8059   0.42033    
## Rural_std    -0.2114300  0.0323279 -6.5402 6.145e-11 ***
## Fert_std     -0.0051727  0.0084566 -0.6117   0.54075    
## TempAnom_std -0.0060776  0.0041126 -1.4778   0.13946    
## CV_std       -0.0354771  0.0145894 -2.4317   0.01503 *  
## HotDays_std   0.0594276  0.0492609  1.2064   0.22767    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Note technique : Pour SARAR et GNS, les impacts dépendent du paramètre rho (Lag Spatial).
# Le paramètre lambda (Erreur Spatiale) n'affecte pas les impacts marginaux, juste l'efficacité.
# Donc, on peut utiliser la fonction impacts() standard.



# ================================
# SDM GM
# ================================
cat("\n=== SDM (GM) ===\n")

## 
## === SDM (GM) ===

SDM_GM <- tryCatch({
  spgm(Eq, data = pdata, listw = W_k3,
       model = "within", lag = TRUE, spatial.error = FALSE,
       Durbin = TRUE, moments = "fullweights")
}, error = function(e) {
  cat("  SDM GM échoue :", e$message, "\n")
  NULL
})

if(!is.null(SDM_GM)) summary(SDM_GM)

## Spatial w2sls model 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, Durbin = TRUE, 
##     model = "within", lag = TRUE, spatial.error = FALSE, moments = "fullweights")
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -0.43495294 -0.05836112  0.00024012  0.05324303  0.50373296 
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value Pr(>|t|)  
## lambda  0.36996    0.14364  2.5756  0.01001 *
## 
## Coefficients:
##                    Estimate Std. Error t-value  Pr(>|t|)    
## Land_std          0.1180265  0.1140683  1.0347   0.30081    
## Rural_std        -0.2686852  0.0332390 -8.0834 6.296e-16 ***
## Fert_std         -0.0037189  0.0089123 -0.4173   0.67648    
## TempAnom_std     -0.0068447  0.0073363 -0.9330   0.35082    
## CV_std           -0.0394132  0.0158607 -2.4850   0.01296 *  
## HotDays_std       0.0030325  0.0590712  0.0513   0.95906    
## lag_Land_std     -0.0260544  0.1875483 -0.1389   0.88951    
## lag_Rural_std    -0.0391780  0.0618701 -0.6332   0.52658    
## lag_Fert_std     -0.0229731  0.0168972 -1.3596   0.17396    
## lag_TempAnom_std -0.0014899  0.0077524 -0.1922   0.84760    
## lag_CV_std       -0.0127003  0.0257434 -0.4933   0.62177    
## lag_HotDays_std   0.1916280  0.0902100  2.1242   0.03365 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ================================
# SDEM GM
# ================================
cat("\n=== SDEM (GM) ===\n")

## 
## === SDEM (GM) ===

SDEM_GM <- tryCatch({
  spgm(Eq, data = pdata, listw = W_k3,
       model = "within", lag = FALSE, spatial.error = TRUE,
       Durbin = TRUE, moments = "fullweights")
}, error = function(e) {
  cat("  SDEM GM échoue :", e$message, "\n")
  NULL
})

if(!is.null(SDEM_GM)) summary(SDEM_GM)

## Spatial fixed effects error model (GM estimation) 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, Durbin = TRUE, 
##     model = "within", lag = FALSE, spatial.error = TRUE, moments = "fullweights")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4463083 -0.0592220 -0.0008193  0.0535213  0.5070592 
## 
## Estimated spatial coefficient, variance components and theta:
##           Estimate
## rho       0.274581
## sigma^2_v 0.012055
## 
## Coefficients:
##                    Estimate Std. Error t-value  Pr(>|t|)    
## Land_std          0.0947363  0.1167479  0.8115  0.417102    
## Rural_std        -0.2881849  0.0312662 -9.2172 < 2.2e-16 ***
## Fert_std         -0.0044350  0.0094488 -0.4694  0.638803    
## TempAnom_std     -0.0078650  0.0070506 -1.1155  0.264632    
## CV_std           -0.0461532  0.0159449 -2.8945  0.003797 ** 
## HotDays_std       0.0287577  0.0584726  0.4918  0.622850    
## lag_Land_std     -0.0303047  0.2172911 -0.1395  0.889082    
## lag_Rural_std    -0.1800764  0.0427651 -4.2108 2.544e-05 ***
## lag_Fert_std     -0.0161481  0.0177053 -0.9121  0.361741    
## lag_TempAnom_std -0.0034129  0.0080323 -0.4249  0.670913    
## lag_CV_std       -0.0340081  0.0275689 -1.2336  0.217365    
## lag_HotDays_std   0.2381072  0.0920197  2.5876  0.009666 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ================================
# MODÈLE MANSKI 
# ================================
# GNS combine : Spatial Lag (rho) + Spatial Error (lambda) + Spatial X (theta)
# C'est souvent lourd à estimer, on utilise GMM (spgm)

cat("\n=== GNS / MANSKI (GM) ===\n")

## 
## === GNS / MANSKI (GM) ===

# On active tout : spatial.error=TRUE, lag=TRUE, Durbin=TRUE
# On utilise GMM (spgm) car ML est souvent impossible à identifier

GNS_GM <- tryCatch({
  spgm(Eq, data = pdata, listw = W_k3,
       model = "within", 
       lag = TRUE,            # Lag spatial (rho)
       spatial.error = TRUE,  # Erreur spatiale (lambda)
       Durbin = TRUE,         # Lag des X (theta)
       moments = "fullweights")
}, error = function(e) {
  cat("  GNS non convergent :", e$message, "\n")
  NULL
})

if(!is.null(GNS_GM)) summary(GNS_GM)

## Spatial fixed effects  SARAR model (GM estimation) 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, Durbin = TRUE, 
##     model = "within", lag = TRUE, spatial.error = TRUE, moments = "fullweights")
## 
## Residuals:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -2.8977 -0.6675  0.2148  0.0249  0.9099  1.9257 
## 
## Estimated spatial coefficient, variance components and theta:
##           Estimate
## rho       -0.10987
## sigma^2_v  0.01151
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value Pr(>|t|)    
## lambda  0.50613    0.13895  3.6425  0.00027 ***
## 
## Coefficients:
##                     Estimate  Std. Error t-value  Pr(>|t|)    
## Land_std          0.12121209  0.11686849  1.0372   0.29966    
## Rural_std        -0.26031695  0.03431939 -7.5851 3.322e-14 ***
## Fert_std         -0.00226569  0.00898898 -0.2521   0.80100    
## TempAnom_std     -0.00642378  0.00763651 -0.8412   0.40024    
## CV_std           -0.03705444  0.01631264 -2.2715   0.02312 *  
## HotDays_std      -0.00751924  0.06108677 -0.1231   0.90203    
## lag_Land_std     -0.02048493  0.18273126 -0.1121   0.91074    
## lag_Rural_std     0.01403534  0.06098525  0.2301   0.81798    
## lag_Fert_std     -0.02164125  0.01693473 -1.2779   0.20128    
## lag_TempAnom_std -0.00070762  0.00789114 -0.0897   0.92855    
## lag_CV_std       -0.00495344  0.02518838 -0.1967   0.84410    
## lag_HotDays_std   0.17007772  0.08917512  1.9072   0.05649 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# CALCUL MANUEL DES IMPACTS GNS / MANSKI
# ======================================
if(exists("GNS_GM") && !is.null(GNS_GM)) {
  cat("\n=== IMPACTS GNS / MANSKI (Calcul Manuel) ===\n")
  
  # 1. Récupérer les coefficients
  coeffs <- coef(GNS_GM)
  # Dans spgm, le lag spatial s'appelle souvent 'lambda' ou 'rho' selon la version
  # Regardons ton output : "Spatial autoregressive coefficient: lambda"
  # Donc le coefficient multiplicateur est 'lambda' ici (0.50613)
  rho_scalar <- GNS_GM$coefficients[1] # C'est souvent le premier dans le vecteur $coefficients pur
  
  # Si on ne le trouve pas, on le prend dans le résumé
  if(is.null(rho_scalar)) rho_scalar <- 0.50613 # Valeur de ton output précédent
  
  # 2. Préparer le Multiplicateur Spatial S = (I - rho*W)^-1
  W_mat <- listw2mat(W_k3)
  N <- nrow(W_mat)
  I <- diag(N)
  S_global <- solve(I - rho_scalar * W_mat)
  
  # 3. Boucle sur les variables
  vars_base <- c("Land_std", "Rural_std", "Fert_std", "TempAnom_std", 
                 "PrecipAnom_std", "CV_std", "HotDays_std")
  
  res_gns <- data.frame(Variable = vars_base, Direct = NA, Indirect = NA, Total = NA)
  
  for(i in 1:length(vars_base)) {
    var <- vars_base[i]
    
    # Beta (Direct)
    beta <- coeffs[var]
    if(is.na(beta)) beta <- 0
    
    # Theta (Lagged X)
    # Dans ton output GNS, ils s'appellent "lag_Land_std", etc.
    theta_name <- paste0("lag_", var)
    theta <- coeffs[theta_name]
    if(is.na(theta)) theta <- 0
    
    # Matrice d'impact
    M <- S_global %*% (beta * I + theta * W_mat)
    
    res_gns$Direct[i] <- mean(diag(M))
    res_gns$Total[i]  <- mean(rowSums(M))
    res_gns$Indirect[i] <- res_gns$Total[i] - res_gns$Direct[i]
  }
  
  print(res_gns, digits = 4)
}

## 
## === IMPACTS GNS / MANSKI (Calcul Manuel) ===
##         Variable    Direct  Indirect    Total
## 1       Land_std  0.128506  0.075450  0.20396
## 2      Rural_std -0.281328 -0.217351 -0.49868
## 3       Fert_std -0.006333 -0.042075 -0.04841
## 4   TempAnom_std -0.007130 -0.007309 -0.01444
## 5 PrecipAnom_std  0.000000  0.000000  0.00000
## 6         CV_std -0.041286 -0.043773 -0.08506
## 7    HotDays_std  0.022158  0.306996  0.32915

# =========================================================
# 7.5 CRÉATION MANUELLE DES LAGS SPATIAUX (WX) POUR LE SLX
# =========================================================
# On doit créer les variables W_Land, W_Rural, etc. (Moyenne des voisins)

# 1. On s'assure que le panel suit l'ordre de la matrice de poids W_k3
# L'ordre de W_k3 est celui de 'shapefile_sp' (basé sur 'shapefile')
order_iso <- shapefile$adm0_a3

# 2. Variables à spatialiser (Tes X)
vars_X <- c("Land_std", "Rural_std", "Fert_std", "TempAnom_std", 
            "PrecipAnom_std", "CV_std", "HotDays_std")

# 3. Boucle pour créer les W_X
panel_slx <- panel_std %>%
  arrange(Year, match(ISO3, order_iso)) # On trie par Année puis par l'ordre de W

# On initialise les colonnes
for(v in vars_X) panel_slx[[paste0("W_", v)]] <- NA

# Calcul année par année
for(an in unique(panel_slx$Year)) {
  # Indices des lignes pour cette année
  idx <- which(panel_slx$Year == an)
  
  # Vérification de sécurité : est-ce qu'on a bien tous les pays ?
  if(length(idx) == length(order_iso)) {
    sub_data <- panel_slx[idx, ]
    
    # Calcul des lags spatiaux pour chaque variable
    for(v in vars_X) {
      # lag.listw multiplie la matrice W par le vecteur de données
      panel_slx[idx, paste0("W_", v)] <- spdep::lag.listw(W_k3, sub_data[[v]])
    }
  }
}

# Mise à jour du pdata avec ces nouvelles variables
pdata_slx <- pdata.frame(panel_slx, index = c("ISO3", "Year"))
cat(" Variables spatiales W_X créées pour le modèle SLX.\n")

##  Variables spatiales W_X créées pour le modèle SLX.

# --- A. MODÈLE SLX (Spatial Lag of X) ---
# Estimé par PLM (Effets Fixes) avec les variables W_... qu'on vient de créer
cat("\n=== SLX (Effets Fixes) ===\n")

## 
## === SLX (Effets Fixes) ===

# On écrit la formule explicitement : Y ~ X + W_X
formule_SLX <- LogProd_std ~ Land_std + Rural_std + Fert_std + 
  TempAnom_std + PrecipAnom_std + CV_std + HotDays_std +
  W_Land_std + W_Rural_std + W_Fert_std + 
  W_TempAnom_std + W_PrecipAnom_std + W_CV_std + W_HotDays_std

SLX_fe <- plm(formule_SLX, data = pdata_slx, model = "within")
summary(SLX_fe)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = formule_SLX, data = pdata_slx, model = "within")
## 
## Balanced Panel: n = 23, T = 21, N = 483
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4223014 -0.0588610 -0.0040658  0.0578428  0.5247039 
## 
## Coefficients:
##                    Estimate Std. Error t-value  Pr(>|t|)    
## Land_std          0.1316401  0.1248779  1.0542   0.29238    
## Rural_std        -0.2710782  0.0334678 -8.0997 5.321e-15 ***
## Fert_std          0.0023832  0.0094333  0.2526   0.80066    
## TempAnom_std     -0.0064566  0.0142969 -0.4516   0.65177    
## PrecipAnom_std    0.0040578  0.0065323  0.6212   0.53479    
## CV_std           -0.0079020  0.0206809 -0.3821   0.70258    
## HotDays_std       0.1362215  0.0710516  1.9172   0.05585 .  
## W_Land_std        1.6183788  0.7486200  2.1618   0.03116 *  
## W_Rural_std      -0.2913538  0.0599154 -4.8628 1.607e-06 ***
## W_Fert_std        0.0089613  0.0128501  0.6974   0.48593    
## W_TempAnom_std   -0.0027028  0.0148067 -0.1825   0.85524    
## W_PrecipAnom_std -0.0082402  0.0085910 -0.9592   0.33799    
## W_CV_std         -0.0777442  0.0229556 -3.3867   0.00077 ***
## W_HotDays_std    -0.0058710  0.0917497 -0.0640   0.94901    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    9.297
## Residual Sum of Squares: 5.859
## R-Squared:      0.3698
## Adj. R-Squared: 0.31893
## F-statistic: 18.6936 on 14 and 446 DF, p-value: < 2.22e-16

# Note : Pour SLX, Impacts Directs = Beta, Impacts Indirects = Theta (Coeffs W_)


# ======================
# 8. COMPARAISON AIC 
# ======================

# Fonction AIC sécurisée (renvoie NA si le modèle a planté)
get_aic_safe <- function(model_obj) {
  tryCatch({
    if(is.null(model_obj)) return(NA)
    
    # Si c'est un modèle spgm (GMM), on approxime l'AIC via les résidus
    # car GMM n'a pas de vraisemblance standard
    resid <- residuals(model_obj)
    n <- length(resid)
    ssr <- sum(resid^2)
    k <- length(coef(model_obj))
    
    # Formule AIC simple : n * log(SSR/n) + 2k
    aic_val <- n * log(ssr/n) + 2 * k
    return(aic_val)
  }, error = function(e) return(NA))
}

# 1. On liste tous les noms de modèles possibles dans ton script
noms_potentiels <- c("pooled", "fe", "re", "SLX_fe", 
                     "SEM_fe", "SAR_fe", "SDM_fe", "SDEM_fe", "SARAR_fe",
                     "SEM_GM", "SAR_GM", "SARAR_GM", "SDM_GM", "SDEM_GM", "GNS_GM")

# 2. On construit la table de comparaison dynamiquement
comparaison <- data.frame(Modele = character(), AIC = numeric(), stringsAsFactors = FALSE)

# On boucle pour vérifier quels modèles existent vraiment dans ton environnement
# et on les ajoute à une liste propre 'modeles_clean'
modeles_clean <- list()

for(nom in noms_potentiels) {
  if(exists(nom)) {     # Si l'objet existe en mémoire
    obj <- get(nom)     # On le récupère
    if(!is.null(obj)) { # Si ce n'est pas NULL (échec tryCatch)
      val_aic <- get_aic_safe(obj)
      
      if(!is.na(val_aic)) {
        # On ajoute au tableau
        comparaison <- rbind(comparaison, data.frame(Modele = nom, AIC = val_aic))
        # On ajoute à la liste propre pour la suite
        modeles_clean[[nom]] <- obj
      }
    }
  }
}

# 3. Tri final
comparaison <- comparaison %>% 
  arrange(AIC)

cat("\n=== CLASSEMENT FINAL PAR AIC ===\n")

## 
## === CLASSEMENT FINAL PAR AIC ===

print(comparaison)

##      Modele        AIC
## 1    SAR_GM -2165.0216
## 2    SAR_fe -2158.0538
## 3    SDM_fe -2158.0538
## 4    SDM_GM -2156.6765
## 5   SDEM_GM -2148.5737
## 6  SARAR_fe -2138.2964
## 7    SEM_GM -2125.2548
## 8    SLX_fe -2103.0175
## 9        fe -2073.8530
## 10   SEM_fe -2062.3102
## 11  SDEM_fe -2062.3102
## 12       re -1997.6209
## 13   pooled  -406.0546
## 14 SARAR_GM   189.4138
## 15   GNS_GM   193.0619

# Graphique de comparaison
ggplot(comparaison, aes(x = reorder(Modele, AIC), y = AIC)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = round(AIC, 0)), hjust = -0.1, size = 3) +
  coord_flip() +
  theme_minimal() +
  labs(title = "Comparaison des Modèles (AIC)", subtitle = "Plus petit = Meilleur", x = "", y = "AIC")

ggsave("comparaison_aic_final.png", width = 8, height = 6)


# ===============================================
# 9. SÉLECTION ET VALIDATION DU MEILLEUR MODÈLE
# ===============================================

# On prend le premier de la liste (le plus petit AIC)
meilleur_nom <- comparaison$Modele[1]
modele_final <- modeles_clean[[meilleur_nom]]

cat("\n LE MODÈLE GAGNANT EST :", meilleur_nom, "\n")

## 
##  LE MODÈLE GAGNANT EST : SAR_GM

cat("Avec un AIC de :", comparaison$AIC[1], "\n\n")

## Avec un AIC de : -2165.022

summary(modele_final)

## Spatial w2sls model 
## 
## Call:
## spgm(formula = Eq, data = pdata, listw = W_k3, model = "within", 
##     lag = TRUE, spatial.error = FALSE, moments = "fullweights")
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.4243026 -0.0575349  0.0010825  0.0567094  0.5272416 
## 
## Spatial autoregressive coefficient:
##        Estimate Std. Error t-value  Pr(>|t|)    
## lambda 0.547681   0.080035  6.8431 7.752e-12 ***
## 
## Coefficients:
##                Estimate Std. Error t-value  Pr(>|t|)    
## Land_std      0.1207600  0.1120338  1.0779    0.2811    
## Rural_std    -0.2439833  0.0318813 -7.6529 1.966e-14 ***
## Fert_std     -0.0031149  0.0087241 -0.3570    0.7211    
## TempAnom_std -0.0067418  0.0051686 -1.3044    0.1921    
## CV_std       -0.0373154  0.0155638 -2.3976    0.0165 *  
## HotDays_std   0.0409168  0.0548637  0.7458    0.4558    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# --- DIAGNOSTIC DES RÉSIDUS ---
residus <- residuals(modele_final)

# 1. Test de Normalité (Shapiro-Wilk)
# On prend un échantillon si N > 5000 (limite de la fonction shapiro)
if(length(residus) > 5000) {
  residus_test <- sample(residus, 5000)
} else {
  residus_test <- residus
}
cat("\n--- Test de Normalité (Shapiro-Wilk) ---\n")

## 
## --- Test de Normalité (Shapiro-Wilk) ---

print(shapiro.test(residus_test))

## 
##  Shapiro-Wilk normality test
## 
## data:  residus_test
## W = 0.94902, p-value = 7.588e-12

# 2. Test d'Hétéroscédasticité (Breusch-Pagan)
# Note : bptest marche sur plm/spml si la méthode est compatible
#try({
#  library(lmtest)
#  cat("\n--- Test d'Hétéroscédasticité (Breusch-Pagan) ---\n")
#  print(bptest(modele_final))
#}, silent = TRUE)

# 3. RMSE (Erreur Quadratique Moyenne)
rmse <- sqrt(mean(residus^2))
cat("\n--- Performance Prédictive ---\n")

## 
## --- Performance Prédictive ---

cat("RMSE :", round(rmse, 4), "\n")

## RMSE : 0.1048

Impact du Changement Climatique sur l’Agriculture dans la Région Euro-Méditerranéenne : Une Analyse par Économétrie Spatiale

A. EL GHAOUTI, B. REGRAGUI, S. ELALAMI

2025-12-23

Introduction

Data