This code evaluates and compares the influence of several climatic variables (temperature, pressure, humidity, wind characteristics, sunshine, cloud cover, evapotranspiration, soil moisture) on rainfall. It applies four approaches to identify which predictors contribute most to rainfall variability: Relative Weights Analysis (the rwa package), relative weights from the iopsych package, relaimpo (relative importance in linear regression), and Random Forest variable importance. Ranking predictors in this way helps researchers prioritize variables for predictive modeling and interpret their effect on rainfall patterns. Comparing several methods shows whether the importance rankings are consistent or an artifact of a single technique.
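A synthetic climate dataset is generated with 100 observations and 20 variables: rainfall plus 19 candidate predictors. Each variable is drawn independently, so rainfall has no built-in relationship with any predictor. Ten of the predictors are then selected for the analysis.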
set.seed(123) # For reproducibility
n <- 100 # Number of observations
data <- data.frame(
rainfall = rnorm(n, mean = 50, sd = 20), # mm
temperature = rnorm(n, mean = 27, sd = 5), # °C
mslp = rnorm(n, mean = 1013, sd = 10), # hPa
humidity = rnorm(n, mean = 70, sd = 15), # %
wind_speed = rnorm(n, mean = 5, sd = 2), # m/s
wind_gust = rnorm(n, mean = 10, sd = 4), # m/s
wind_dir = runif(n, min = 0, max = 360), # degrees
sunshine = rnorm(n, mean = 8, sd = 3), # hours
cloud_cover = rnorm(n, mean = 50, sd = 20), # %
evapotransp = rnorm(n, mean = 4, sd = 1.2), # mm
soil_moisture = rnorm(n, mean = 30, sd = 10), # %
dew_point = rnorm(n, mean = 20, sd = 3), # °C
vapor_pressure = rnorm(n, mean = 25, sd = 5), # hPa
radiation = rnorm(n, mean = 200, sd = 40), # W/m²
snowfall = rnorm(n, mean = 2, sd = 1), # mm
cyclone_freq = rpois(n, lambda = 1), # count/year
drought_index = rnorm(n, mean = 0, sd = 1), # normalized index
heatwaves = rpois(n, lambda = 3), # count/year
storm_surge = rnorm(n, mean = 0.5, sd = 0.2), # m
sea_temp = rnorm(n, mean = 28, sd = 2) # °C
)
# Display column names
cbind(names(data))
## [,1]
## [1,] "rainfall"
## [2,] "temperature"
## [3,] "mslp"
## [4,] "humidity"
## [5,] "wind_speed"
## [6,] "wind_gust"
## [7,] "wind_dir"
## [8,] "sunshine"
## [9,] "cloud_cover"
## [10,] "evapotransp"
## [11,] "soil_moisture"
## [12,] "dew_point"
## [13,] "vapor_pressure"
## [14,] "radiation"
## [15,] "snowfall"
## [16,] "cyclone_freq"
## [17,] "drought_index"
## [18,] "heatwaves"
## [19,] "storm_surge"
## [20,] "sea_temp"
# Select variables of interest
predictors <- c("temperature", "mslp", "humidity", "wind_speed", "wind_gust",
"wind_dir", "sunshine", "cloud_cover", "evapotransp", "soil_moisture")
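The dataset above draws every column independently, so no importance method has a true signal to recover. As a hedged sketch of how a controlled test could be set up instead, the example below plants a known relationship in a small simulated data frame. The data frame name `sim`, the sample size, and the coefficients 0.5 and 0.2 are illustrative assumptions, not values from this analysis.

```r
set.seed(42)
n_sim <- 500

# Hypothetical predictors, drawn with the same distributions used above
sim <- data.frame(
  humidity    = rnorm(n_sim, mean = 70, sd = 15),
  cloud_cover = rnorm(n_sim, mean = 50, sd = 20),
  wind_dir    = runif(n_sim, min = 0, max = 360)
)

# Assumed ground truth: rainfall depends on humidity and cloud cover only;
# wind_dir is pure noise with respect to rainfall
sim$rainfall <- 10 + 0.5 * sim$humidity + 0.2 * sim$cloud_cover +
  rnorm(n_sim, sd = 5)

# A linear fit should recover coefficients close to the planted values
fit_sim <- lm(rainfall ~ ., data = sim)
round(coef(fit_sim), 2)
```

With a planted signal like this, each importance method can be checked against a known answer: humidity and cloud cover should rank above wind direction.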
Relative Weights Analysis (RWA) is performed to assess the relative
importance of predictors for rainfall using the rwa
package. Results are visualized with a plot.
# Install if necessary: devtools::install_github("martinctc/rwa")
library(rwa)
library(tidyverse)
library(ggplot2)
# Perform RWA
rwa_res <- data %>%
rwa(outcome = "rainfall", predictors = predictors, applysigns = TRUE, plot = TRUE)
print(rwa_res)
## $predictors
## [1] "temperature" "mslp" "humidity" "wind_speed"
## [5] "wind_gust" "wind_dir" "sunshine" "cloud_cover"
## [9] "evapotransp" "soil_moisture"
##
## $rsquare
## [1] 0.1393408
##
## $result
## Variables Raw.RelWeight Rescaled.RelWeight Sign Sign.Rescaled.RelWeight
## 1 temperature 0.0029786799 2.1376937 - -2.1376937
## 2 mslp 0.0211222581 15.1587011 - -15.1587011
## 3 humidity 0.0032970672 2.3661891 - -2.3661891
## 4 wind_speed 0.0368088936 26.4164472 - -26.4164472
## 5 wind_gust 0.0012037277 0.8638730 - -0.8638730
## 6 wind_dir 0.0031142104 2.2349592 + 2.2349592
## 7 sunshine 0.0316936800 22.7454385 + 22.7454385
## 8 cloud_cover 0.0001839223 0.1319946 - -0.1319946
## 9 evapotransp 0.0288193989 20.6826681 + 20.6826681
## 10 soil_moisture 0.0101189795 7.2620354 + 7.2620354
##
## $n
## [1] 100
##
## $lambda
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.989392202 0.017190783 0.019097416 -0.071206322 0.061324594
## [2,] 0.017190783 0.992811599 -0.016805712 -0.011865218 0.006548506
## [3,] 0.019097416 -0.016805712 0.989205937 -0.007634166 -0.040597968
## [4,] -0.071206322 -0.011865218 -0.007634166 0.987566845 0.103897794
## [5,] 0.061324594 0.006548506 -0.040597968 0.103897794 0.985541980
## [6,] -0.004509839 0.036606653 -0.011674573 -0.078344027 -0.046207227
## [7,] -0.021780922 0.097444018 -0.047481920 0.022529978 0.021140349
## [8,] -0.099726611 0.033139669 -0.095129650 -0.037112788 0.014425753
## [9,] -0.027009740 -0.040167074 0.087553484 -0.024551124 -0.094432935
## [10,] 0.021026663 0.004061219 0.003450175 -0.004697654 0.027436608
## [,6] [,7] [,8] [,9] [,10]
## [1,] -0.0045098393 -0.021780922 -0.099726611 -0.02700974 0.0210266633
## [2,] 0.0366066532 0.097444018 0.033139669 -0.04016707 0.0040612193
## [3,] -0.0116745729 -0.047481920 -0.095129650 0.08755348 0.0034501748
## [4,] -0.0783440274 0.022529978 -0.037112788 -0.02455112 -0.0046976542
## [5,] -0.0462072266 0.021140349 0.014425753 -0.09443293 0.0274366076
## [6,] 0.9940246420 0.021199866 -0.034865806 0.02190644 0.0006958231
## [7,] 0.0211998660 0.987571848 0.003419253 -0.07545609 0.0732687245
## [8,] -0.0348658059 0.003419253 0.987695793 -0.01178077 -0.0375788717
## [9,] 0.0219064447 -0.075456090 -0.011780769 0.98513914 0.0604955616
## [10,] 0.0006958231 0.073268724 -0.037578872 0.06049556 0.9941398152
##
## $RXX
## temperature mslp humidity wind_speed wind_gust
## temperature 1.000000000 0.03057903 0.04383271 -0.130622187 0.11448792
## mslp 0.030579031 1.00000000 -0.04486571 -0.024848379 0.01821008
## humidity 0.043832713 -0.04486571 1.00000000 -0.019259855 -0.08991276
## wind_speed -0.130622187 -0.02484838 -0.01925986 1.000000000 0.20661771
## wind_gust 0.114487923 0.01821008 -0.08991276 0.206617706 1.00000000
## wind_dir -0.003355358 0.07351034 -0.01715014 -0.158840275 -0.10127733
## sunshine -0.039457550 0.19748895 -0.10387146 0.047174056 0.05348873
## cloud_cover -0.195277199 0.06542997 -0.19174217 -0.061100182 0.02433769
## evapotransp -0.052401317 -0.08840685 0.18169769 -0.059775298 -0.19523946
## soil_moisture 0.044374452 0.01209812 0.01148258 -0.006524876 0.05026555
## wind_dir sunshine cloud_cover evapotransp soil_moisture
## temperature -0.003355358 -0.03945755 -0.19527720 -0.05240132 0.044374452
## mslp 0.073510345 0.19748895 0.06542997 -0.08840685 0.012098125
## humidity -0.017150138 -0.10387146 -0.19174217 0.18169769 0.011482584
## wind_speed -0.158840275 0.04717406 -0.06110018 -0.05977530 -0.006524876
## wind_gust -0.101277328 0.05348873 0.02433769 -0.19523946 0.050265551
## wind_dir 1.000000000 0.04176611 -0.06429154 0.04612582 0.004685983
## sunshine 0.041766105 1.00000000 0.01353752 -0.15402893 0.140767109
## cloud_cover -0.064291543 0.01353752 1.00000000 -0.03395423 -0.076681974
## evapotransp 0.046125819 -0.15402893 -0.03395423 1.00000000 0.111762407
## soil_moisture 0.004685983 0.14076711 -0.07668197 0.11176241 1.000000000
##
## $RXY
## temperature mslp humidity wind_speed wind_gust
## -4.953215e-02 -1.291760e-01 -4.407900e-02 -1.927111e-01 -5.648439e-02
## wind_dir sunshine cloud_cover evapotransp soil_moisture
## 7.277891e-02 1.570370e-01 -2.214862e-05 1.719307e-01 1.213162e-01
# Plot RWA results
# Reuse the fitted result instead of recomputing the analysis
plot_rwa(rwa_res)
# Extract rescaled relative weights
rwa_vals <- data.frame(
term = rwa_res$result$Variables,
rwa_importance = rwa_res$result$Rescaled.RelWeight,
sign = rwa_res$result$Sign
)
rwa_vals
## term rwa_importance sign
## 1 temperature 2.1376937 -
## 2 mslp 15.1587011 -
## 3 humidity 2.3661891 -
## 4 wind_speed 26.4164472 -
## 5 wind_gust 0.8638730 -
## 6 wind_dir 2.2349592 +
## 7 sunshine 22.7454385 +
## 8 cloud_cover 0.1319946 -
## 9 evapotransp 20.6826681 +
## 10 soil_moisture 7.2620354 +
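The `$lambda`, `$RXX` and `$RXY` elements printed above appear to correspond to the pieces of Johnson's relative weights procedure. The base-R sketch below shows that computation on toy data, assuming the standard formulation (symmetric square root of the predictor correlation matrix, then regression on the orthogonalized predictors). The helper name `relative_weights` and the toy variables are hypothetical and are not part of the rwa package.

```r
set.seed(1)
n_toy <- 200

# Toy predictors with some correlation between x1 and x2
X <- matrix(rnorm(n_toy * 3), n_toy, 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
X[, "x2"] <- X[, "x2"] + 0.5 * X[, "x1"]
y <- 2 * X[, "x1"] + X[, "x2"] + rnorm(n_toy)

relative_weights <- function(X, y) {
  RXX <- cor(X)        # predictor intercorrelations
  RXY <- cor(X, y)     # predictor-outcome correlations
  eig <- eigen(RXX, symmetric = TRUE)
  # Symmetric square root of RXX: links the original predictors to an
  # orthogonal set that approximates them as closely as possible
  lambda <- eig$vectors %*% diag(sqrt(eig$values)) %*% t(eig$vectors)
  # Standardized coefficients of y on the orthogonal predictors
  beta <- solve(lambda, RXY)
  # Raw weight of predictor j: sum over k of lambda[j, k]^2 * beta[k]^2
  raw <- as.vector((lambda^2) %*% (beta^2))
  names(raw) <- colnames(X)
  list(raw = raw,
       rescaled = 100 * raw / sum(raw),  # percent of R-squared
       rsquare = sum(raw))
}

rw <- relative_weights(X, y)
rw$rescaled
```

The raw weights sum to the model R-squared, which is why the rescaled weights can be read as percentages of explained variance.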
The iopsych package is used to compute relative weights
for a subset of predictors based on the correlation matrix.
# Install if necessary: install.packages("iopsych")
library(iopsych)
# Select climatic variables
predictors_iopsych <- c("temperature", "mslp", "humidity", "wind_speed", "wind_gust")
data_clim <- data[, c(predictors_iopsych, "rainfall")]
# Compute correlation matrix
Rs <- cor(data_clim)
# Define indices
ys <- ncol(Rs) # Target variable 'rainfall' is the last column
xs <- 1:(ys - 1) # Predictors are columns 1 to (ys - 1)
# Calculate relative weights
weights <- relWt(Rs, ys, xs)
print(weights)
## $eps
## EPS
## temperature 0.003548804
## mslp 0.017318088
## humidity 0.002365735
## wind_speed 0.038113258
## wind_gust 0.001547116
##
## $beta_star
## [1] -0.05820187 -0.13163076 -0.04856565 -0.19674865 -0.03332019
##
## $lambda_star
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.99532663 0.014886284 0.023334112 -0.068864095 0.061778323
## [2,] 0.01488628 0.999519329 -0.022469955 -0.012509081 0.008839314
## [3,] 0.02333411 -0.022469955 0.998419037 -0.006584295 -0.045460908
## [4,] -0.06886410 -0.012509081 -0.006584295 0.991852888 0.106234434
## [5,] 0.06177832 0.008839314 -0.045460908 0.106234434 0.991338921
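# Store results (relWt() returns a list, so data.frame() expands it into
# EPS, beta_star and lambda_star columns; EPS holds the raw relative weights)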
weights_vals <- data.frame(term = predictors_iopsych, relWt_iopsych = weights)
A multiple linear regression model is built, and relative importance
is assessed using the relaimpo package with the LMG method.
Bootstrap confidence intervals are also computed.
library(relaimpo)
# Linear regression model
lm_model <- lm(rainfall ~ ., data = data[, c("rainfall", predictors)])
summary(lm_model)
##
## Call:
## lm(formula = rainfall ~ ., data = data[, c("rainfall", predictors)])
##
## Residuals:
## Min 1Q Median 3Q Max
## -45.602 -11.525 -0.225 11.167 41.761
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 364.924572 196.626364 1.856 0.0668 .
## temperature -0.220011 0.389082 -0.565 0.5732
## mslp -0.312790 0.194325 -1.610 0.1110
## humidity -0.077033 0.119974 -0.642 0.5225
## wind_speed -1.866023 0.956224 -1.951 0.0541 .
## wind_gust 0.062871 0.506159 0.124 0.9014
## wind_dir 0.005954 0.017351 0.343 0.7323
## sunshine 1.166700 0.588929 1.981 0.0507 .
## cloud_cover -0.013768 0.096268 -0.143 0.8866
## evapotransp 2.572923 1.505432 1.709 0.0909 .
## soil_moisture 0.126101 0.170411 0.740 0.4613
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 17.86 on 89 degrees of freedom
## Multiple R-squared: 0.1393, Adjusted R-squared: 0.04264
## F-statistic: 1.441 on 10 and 89 DF, p-value: 0.1757
# Relative importance with relaimpo
relimp_res <- calc.relimp(lm_model, type = "lmg", rela = TRUE)
print(relimp_res)
## Response variable: rainfall
## Total response variance: 333.2931
## Analysis based on 100 observations
##
## 10 Regressors:
## temperature mslp humidity wind_speed wind_gust wind_dir sunshine cloud_cover evapotransp soil_moisture
## Proportion of variance explained by model: 13.93%
## Metrics are normalized to sum to 100% (rela=TRUE).
##
## Relative importance metrics:
##
## lmg
## temperature 0.020885225
## mslp 0.151627778
## humidity 0.023324071
## wind_speed 0.263072950
## wind_gust 0.009597559
## wind_dir 0.022322165
## sunshine 0.228147535
## cloud_cover 0.001161206
## evapotransp 0.205806175
## soil_moisture 0.074055335
##
## Average coefficients for different model sizes:
##
## 1X 2Xs 3Xs 4Xs
## temperature -1.870294e-01 -0.1899729785 -0.1928580743 -0.195799811
## mslp -2.482714e-01 -0.2559734345 -0.2636952587 -0.271360841
## humidity -5.164515e-02 -0.0558978107 -0.0596962722 -0.063073041
## wind_speed -1.778041e+00 -1.7799703414 -1.7841633685 -1.790499762
## wind_gust -2.746261e-01 -0.2386503104 -0.2023252684 -0.165633967
## wind_dir 1.254081e-02 0.0117749084 0.0110191800 0.010273201
## sunshine 8.956961e-01 0.9306700515 0.9640719669 0.996069216
## cloud_cover -2.059056e-05 -0.0003318168 -0.0009560183 -0.001885244
## evapotransp 2.486101e+00 2.4874998969 2.4900708944 2.494131835
## soil_moisture 2.048541e-01 0.1990002361 0.1924365314 0.185157119
## 5Xs 6Xs 7Xs 8Xs 9Xs
## temperature -0.198899017 -0.202248068 -0.205936631 -0.210057250 -0.214710709
## mslp -0.278902135 -0.286258187 -0.293373981 -0.300199099 -0.306686289
## humidity -0.066066444 -0.068720741 -0.071086097 -0.073218359 -0.075178652
## wind_speed -1.798842830 -1.809050192 -1.820984059 -1.834520570 -1.849557943
## wind_gust -0.128564857 -0.091108255 -0.053253397 -0.014986115 0.023713072
## wind_dir 0.009536436 0.008808174 0.008087433 0.007372834 0.006662446
## sunshine 1.026821300 1.056477944 1.085177310 1.113044188 1.140187978
## cloud_cover -0.003113638 -0.004638156 -0.006459209 -0.008581219 -0.011013053
## evapotransp 2.500052045 2.508245241 2.519159378 2.533263408 2.551031167
## soil_moisture 0.177155837 0.168426208 0.158961606 0.148755690 0.137803161
## 10Xs
## temperature -0.220011243
## mslp -0.312790074
## humidity -0.077032866
## wind_speed -1.866023436
## wind_gust 0.062870786
## wind_dir 0.005953642
## sunshine 1.166700296
## cloud_cover -0.013768347
## evapotransp 2.572922800
## soil_moisture 0.126100983
plot(relimp_res)
# Bootstrap confidence intervals
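# The number of bootstrap replicates is set with `b`; rela = TRUE matches calc.relimp above
boot_rel <- boot.relimp(lm_model, b = 100, type = "lmg", rela = TRUE)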
boot_eval <- booteval.relimp(boot_rel)
plot(boot_eval)
# Store results
relimp_vals <- data.frame(term = names(relimp_res$lmg), relimp = relimp_res$lmg)
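The LMG metric used above assigns each predictor its average gain in R-squared over all orders in which predictors can enter the model. The sketch below illustrates that idea by brute force on three toy predictors. It is a minimal illustration under that definition, not relaimpo's implementation, and the helper names `r2`, `perms` and `lmg_shares` are hypothetical.

```r
set.seed(11)
n_toy <- 200
toy <- data.frame(x1 = rnorm(n_toy), x2 = rnorm(n_toy), x3 = rnorm(n_toy))
toy$x2 <- toy$x2 + 0.6 * toy$x1            # make x1 and x2 correlated
toy$y  <- 1.5 * toy$x1 + 0.8 * toy$x2 + rnorm(n_toy)

# R-squared of the model using the given predictors (0 for the empty model)
r2 <- function(vars, df) {
  if (length(vars) == 0) return(0)
  summary(lm(reformulate(vars, response = "y"), data = df))$r.squared
}

# All orderings of a vector, returned as a list
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Average each predictor's incremental R-squared over all entry orders
lmg_shares <- function(df, predictors) {
  orders <- perms(predictors)
  contrib <- setNames(numeric(length(predictors)), predictors)
  for (ord in orders) {
    for (k in seq_along(ord)) {
      gain <- r2(ord[seq_len(k)], df) - r2(ord[seq_len(k - 1)], df)
      contrib[ord[k]] <- contrib[ord[k]] + gain
    }
  }
  contrib / length(orders)
}

lmg_vals <- lmg_shares(toy, c("x1", "x2", "x3"))
lmg_vals / sum(lmg_vals)   # normalized shares, comparable to rela = TRUE
```

The number of orderings grows factorially with the number of predictors, so this brute-force version is only practical for a handful of variables.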
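A Random Forest model is trained to evaluate variable importance using permutation importance, which randomForest reports as %IncMSE: the increase in out-of-bag Mean Squared Error when a predictor's values are randomly permuted. Values at or below zero indicate that the predictor adds no predictive skill.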
library(randomForest)
library(vip)
# Random Forest model
rf_model <- randomForest(rainfall ~ ., data = data[, c("rainfall", predictors)], importance = TRUE)
vip(rf_model)
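# Store results (rf_model$importance holds the unscaled values;
# importance(rf_model) returns them scaled by their standard errors by default)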
rf_vals <- data.frame(term = rownames(rf_model$importance), rf_importance = rf_model$importance[, "%IncMSE"])
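The idea behind the %IncMSE measure can be sketched without a forest: shuffle one predictor, predict again, and record how much the error grows. The example below does this with a linear model on held-out toy data so that it runs in base R. randomForest applies the same principle to the out-of-bag samples of each tree, so this is an illustration of the principle only. The helper names `mse` and `permutation_importance` and the toy data are hypothetical.

```r
set.seed(7)
n_toy <- 300
toy_rf <- data.frame(x1 = rnorm(n_toy), x2 = rnorm(n_toy))
toy_rf$y <- 3 * toy_rf$x1 + rnorm(n_toy)   # assumed setup: only x1 carries signal

train <- toy_rf[1:200, ]
test  <- toy_rf[201:300, ]
fit   <- lm(y ~ x1 + x2, data = train)

# Mean squared prediction error on a data set
mse <- function(model, newdata) {
  mean((newdata$y - predict(model, newdata))^2)
}

permutation_importance <- function(model, newdata, vars) {
  base_mse <- mse(model, newdata)
  sapply(vars, function(v) {
    shuffled <- newdata
    shuffled[[v]] <- sample(shuffled[[v]])   # break the link between v and y
    mse(model, shuffled) - base_mse          # increase in error after shuffling
  })
}

imp <- permutation_importance(fit, test, c("x1", "x2"))
imp
```

A predictor with real signal shows a large increase in error, while a noise predictor yields a value near zero that can fall on either side of it, which is why negative importances appear for uninformative variables.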
Results from RWA, iopsych, relaimpo, and Random Forest are combined for a comparative analysis of predictor importance.
# Merge results
# weights_vals carries every element of the relWt() output; keep only the
# raw relative weights (EPS) so the table has one column per method
iopsych_vals <- setNames(weights_vals[, 1:2], c("term", "relWt_iopsych"))
comparison <- reduce(list(rwa_vals[, c("term", "rwa_importance")],
iopsych_vals,
relimp_vals,
rf_vals), full_join, by = "term")
comparison
## term rwa_importance relWt_iopsych relimp rf_importance
## 1 temperature 2.1376937 0.003548804 0.020885225 -5.5541866
## 2 mslp 15.1587011 0.017318088 0.151627778 -12.8965991
## 3 humidity 2.3661891 0.002365735 0.023324071 -5.0028329
## 4 wind_speed 26.4164472 0.038113258 0.263072950 7.4090640
## 5 wind_gust 0.8638730 0.001547116 0.009597559 -12.7619162
## 6 wind_dir 2.2349592 NA 0.022322165 -4.4516461
## 7 sunshine 22.7454385 NA 0.228147535 8.1225910
## 8 cloud_cover 0.1319946 NA 0.001161206 -1.2997407
## 9 evapotransp 20.6826681 NA 0.205806175 -1.6443073
## 10 soil_moisture 7.2620354 NA 0.074055335 0.4869619
# Display comparison table
library(kableExtra)
comparison %>%
kbl() %>%
kable_classic_2(full_width = FALSE)
| term | rwa_importance | relWt_iopsych | relimp | rf_importance |
|---|---|---|---|---|
| temperature | 2.1376937 | 0.0035488 | 0.0208852 | -5.5541866 |
| mslp | 15.1587011 | 0.0173181 | 0.1516278 | -12.8965991 |
| humidity | 2.3661891 | 0.0023657 | 0.0233241 | -5.0028329 |
| wind_speed | 26.4164472 | 0.0381133 | 0.2630729 | 7.4090640 |
| wind_gust | 0.8638730 | 0.0015471 | 0.0095976 | -12.7619162 |
| wind_dir | 2.2349592 | NA | 0.0223222 | -4.4516461 |
| sunshine | 22.7454385 | NA | 0.2281475 | 8.1225910 |
| cloud_cover | 0.1319946 | NA | 0.0011612 | -1.2997407 |
| evapotransp | 20.6826681 | NA | 0.2058062 | -1.6443073 |
| soil_moisture | 7.2620354 | NA | 0.0740553 | 0.4869619 |
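The methods report importance on different scales (percent of R-squared, proportions, and change in MSE), so the columns are easier to compare as ranks. The sketch below is one possible way to do that, using the scores copied from the comparison table above. The object names `scores`, `ranks` and `rank_agreement` are illustrative, and the iopsych column is left out because it covers only five predictors.

```r
# Importance scores copied from the comparison table above
scores <- data.frame(
  term = c("temperature", "mslp", "humidity", "wind_speed", "wind_gust",
           "wind_dir", "sunshine", "cloud_cover", "evapotransp", "soil_moisture"),
  rwa_importance = c(2.1376937, 15.1587011, 2.3661891, 26.4164472, 0.8638730,
                     2.2349592, 22.7454385, 0.1319946, 20.6826681, 7.2620354),
  relimp = c(0.020885225, 0.151627778, 0.023324071, 0.263072950, 0.009597559,
             0.022322165, 0.228147535, 0.001161206, 0.205806175, 0.074055335),
  rf_importance = c(-5.5541866, -12.8965991, -5.0028329, 7.4090640, -12.7619162,
                    -4.4516461, 8.1225910, -1.2997407, -1.6443073, 0.4869619)
)

# Rank 1 = most important predictor within each method
ranks <- data.frame(term = scores$term,
                    lapply(scores[-1], function(x) rank(-x)))
ranks

# Spearman correlation between methods measures agreement of the rankings
rank_agreement <- cor(scores[-1], method = "spearman")
round(rank_agreement, 2)
```

For the values in this document, the RWA and LMG columns order the predictors identically, while the Random Forest ranking differs from both.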
This project serves as an educational exercise, demonstrating how several methods (RWA, iopsych, relaimpo, Random Forest) can be applied to assess predictor importance in a controlled environment. Because the variables here are simulated independently, the fitted model explains little variance (R-squared = 0.139, F-test p-value = 0.1757) and the rankings reflect sampling noise, not physical drivers. The same workflow can be applied to real-world climate data to identify the dominant climatic drivers of rainfall and to support predictive modeling.
Author: Abdi-Basid ADAN
Email: abdi-basid@outlook.com