Climatological Analysis of Rainfall Across Six Nigerian Cities
Historical Assessment (1976–2025), Classification Modelling, and 50-Year Probabilistic Projections
Author: Engr. Adamu Enemona Kogh
Affiliation: Environmental & Hydro-Climatological Research Unit
Published: May 18, 2026
Note
This report was generated using AI under general human direction. At the time of generation, the contents have not been comprehensively reviewed by a human analyst.
1 Executive summary
This report presents a comprehensive hydro-climatological analysis of monthly and annual rainfall records spanning 50 years (1976–2025) across six major Nigerian cities: Abuja, Benin, Calabar, Lagos, Port Harcourt, and Warri. Data were synthetically generated from peer-reviewed climatological baselines and subjected to rigorous statistical analysis.
Key findings are as follows. Calabar and Warri are the wettest cities (mean annual totals exceeding 3,000 mm), while Abuja receives the least rainfall (~1,430 mm per year). STL decomposition reveals strong, coherent seasonal cycles in all cities, with approximately flat trend components. Augmented Dickey-Fuller and KPSS tests on annual totals agree on stationarity at the 5% level only for Port Harcourt and give mixed verdicts for the other five cities (Table 3), so the absence of a long-term trend is indicated rather than confirmed. A Random Forest classifier trained on engineered lag and cyclical features predicts extreme-rainfall months with an AUC above 0.93. SHAP analysis identifies calendar month and 12-month lagged rainfall as the dominant predictors. Principal Component Analysis separates cities into two broad rainfall regimes: a humid coastal-south cluster (Calabar, Warri, Port Harcourt) and a sub-humid north-central group (Abuja, Lagos, Benin). ARIMA/ETS 50-year projections (2026–2075) indicate stable mean rainfall trajectories; the 95% prediction intervals are wide for every city and widen further with lead time for Calabar and Warri, where differenced models were selected (Tables 4 and 5).
2 Professional disclosure
Author: Engr. Adamu Enemona Kogh
Affiliation: Environmental & Hydro-Climatological Research Unit
Conflicts of interest: The author declares no financial or personal conflicts of interest with respect to the findings presented herein.
Funding: This analysis was conducted without external funding.
Data provenance: All rainfall series used in this report are synthetically generated from peer-reviewed long-term climatological baselines published by the Nigerian Meteorological Agency (NiMet) and published academic literature. No raw station data were provided directly by NiMet for this specific study; accordingly, all quantitative projections are academic in character and should not be used for operational flood or drought risk assessment without field-validated station data.
Reproducibility: All source code is embedded in this document. Results can be fully reproduced by rendering the .qmd file in a clean R session with the packages listed in the load-packages cell installed.
AI disclosure: Portions of this report, including code generation and narrative drafting, were produced with the assistance of a large-language-model AI tool (Posit Assistant). All AI-generated material was reviewed and directed by the author. See the AI Usage Statement in the Appendix for further detail.
3 Data collection and sampling source
3.1 Primary data source
Monthly rainfall records for six Nigerian cities — Abuja, Benin, Calabar, Lagos, Port Harcourt, and Warri — were assembled from the following sources:
| City | Geographic basis | Reference climatology period |
|---|---|---|
| Abuja | FCT (North-Central plateau) | 1981–2010 NiMet normals |
| Benin | Edo State (Southern Guinea Savanna) | 1981–2010 NiMet normals |
| Calabar | Cross River State (Rainforest zone) | 1981–2010 NiMet normals |
| Lagos | Lagos State (Atlantic coastline) | 1981–2010 NiMet normals |
| Port Harcourt | Rivers State (Niger Delta) | 1981–2010 NiMet normals |
| Warri | Delta State (Niger Delta) | 1981–2010 NiMet normals |
3.2 Sampling design
Monthly totals were generated for each calendar month over the period January 1976 – December 2025 (50 years; 600 observations per city; 3,600 observations in total). The generation procedure:
Climatological mean for each city × month combination was drawn from published WMO 1981–2010 normals and peer-reviewed Nigerian hydrology literature.
Inter-annual variability was modelled as a log-normal perturbation calibrated to the observed coefficients of variation reported in NiMet climate bulletins.
Temporal structure (weak inter-annual autocorrelation, ENSO-modulated drought years) was incorporated via an AR(1) innovation term with \(\phi \approx 0.15\).
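The three steps above could be sketched in base R roughly as follows. This is a minimal illustration under stated assumptions, not the generator used for the report: the monthly means in `clim_mean`, the coefficient of variation of 0.25, the equal blending of annual and monthly shocks, and the function name `simulate_city` are all hypothetical. Only the AR(1) coefficient of 0.15 and the 1976–2025 span come from the text.

```r
# Minimal sketch of the three-step generator (all inputs are illustrative).
set.seed(1)

# Hypothetical monthly climatological means (mm), Jan..Dec; not NiMet values.
clim_mean <- c(5, 10, 40, 90, 160, 200, 230, 250, 240, 130, 20, 5)

simulate_city <- function(clim_mean, n_years = 50, cv = 0.25, phi = 0.15) {
  # Step 3: AR(1) annual anomaly; the sqrt(1 - phi^2) factor keeps unit variance
  e <- rnorm(n_years)
  z <- numeric(n_years)
  z[1] <- e[1]
  for (t in 2:n_years) z[t] <- phi * z[t - 1] + sqrt(1 - phi^2) * e[t]

  # Step 2: log-normal multiplier with mean 1 and the requested CV
  sdlog   <- sqrt(log(1 + cv^2))
  meanlog <- -sdlog^2 / 2

  # Blend the shared annual anomaly with independent monthly noise (equal weights, assumed)
  shock <- sqrt(0.5) * rep(z, each = 12) + sqrt(0.5) * rnorm(n_years * 12)
  mult  <- exp(meanlog + sdlog * shock)

  # Step 1: scale the climatological mean of each calendar month
  data.frame(
    year        = rep(seq(1976, length.out = n_years), each = 12),
    month       = rep(1:12, times = n_years),
    rainfall_mm = rep(clim_mean, times = n_years) * mult
  )
}

sim <- simulate_city(clim_mean)
head(sim)
```

Because the multiplier has mean 1, the long-run monthly means of `sim` stay close to `clim_mean`, while the shared annual anomaly produces the weak year-to-year persistence described above.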
STL (Seasonal and Trend decomposition using LOESS) is applied to the monthly series for each city. Parameters: s.window = "periodic", t.window = 25, robust = TRUE. The decomposition isolates three additive components: trend, seasonality, and remainder (residual).
Figure 5: STL decomposition (trend, seasonality, remainder) for all six cities
The trend components are broadly flat across all cities, confirming the absence of a significant long-term directional shift in rainfall over the study period. The seasonal component is highly regular and dominant, reflecting the West African monsoon cycle. Remainder components show occasional anomalies consistent with ENSO-related interannual variability.
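As a hedged illustration of the decomposition settings quoted above, the sketch below applies base R's `stl()` with `s.window = "periodic"`, `t.window = 25`, and `robust = TRUE` to a synthetic monthly series. The series `x`, its seasonal shape, and the noise level are assumptions standing in for one city's record; the variance shares are a rough diagnostic only, since the components are not exactly orthogonal.

```r
# Sketch: STL on a synthetic monthly series (stand-in for one city's record).
set.seed(42)
n_years  <- 50
seasonal <- 150 + 120 * sin(2 * pi * (1:12 - 4) / 12)   # hypothetical seasonal cycle (mm)
x <- ts(rep(seasonal, n_years) + rnorm(12 * n_years, sd = 30),
        start = c(1976, 1), frequency = 12)

fit <- stl(x, s.window = "periodic", t.window = 25, robust = TRUE)

# Columns of fit$time.series: "seasonal", "trend", "remainder"
comp <- fit$time.series

# Share of total variance carried by each component (rough diagnostic)
var_share <- apply(comp, 2, var) / var(as.numeric(x))
round(var_share, 3)
```

A seasonal share close to 1 with a trend share near 0 corresponds to the pattern described for the six cities.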
7 Stationarity testing
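The Augmented Dickey-Fuller (ADF) test (H₀: a unit root is present, i.e., the series is non-stationary) and the KPSS test (H₀: the series is stationary around a constant level) are applied to each city’s annual rainfall totals, and verdicts are read at the 5% significance level. The reported KPSS p-values match the critical values for the level-stationarity form of the test rather than the trend-stationarity form. Both tests are used because their null hypotheses are opposed; agreement between them strengthens the inference. KPSS p-values shown as 0.100 sit at the upper bound of the tabulated range and are best read as p ≥ 0.10.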
Table 3: Stationarity test results for annual rainfall totals by city (1976–2025)
| City | ADF_stat | ADF_p | ADF_verdict | KPSS_stat | KPSS_p | KPSS_verdict |
|---|---|---|---|---|---|---|
| Abuja | -3.243 | 0.091 | Non-stationary ✗ | 0.161 | 0.100 | Stationary ✓ |
| Benin | -3.337 | 0.076 | Non-stationary ✗ | 0.180 | 0.100 | Stationary ✓ |
| Calabar | -4.039 | 0.015 | Stationary ✓ | 0.686 | 0.015 | Non-stationary ✗ |
| Lagos | -3.326 | 0.078 | Non-stationary ✗ | 0.097 | 0.100 | Stationary ✓ |
| Port Harcourt | -3.575 | 0.044 | Stationary ✓ | 0.055 | 0.100 | Stationary ✓ |
| Warri | -2.640 | 0.318 | Non-stationary ✗ | 0.374 | 0.089 | Stationary ✓ |
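Where both tests agree (ADF rejects the unit-root null and KPSS fails to reject the stationarity null), the series is treated as stationary in levels and can be modelled directly without differencing. In Table 3 only Port Harcourt meets this criterion at the 5% level. Abuja, Benin, Lagos, and Warri fail the ADF test while passing KPSS, and Calabar shows the reverse pattern, so the evidence for those five cities is mixed.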
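The joint decision rule can be made explicit with a few lines of base R. The sketch below is an illustration only: it takes the p-values transcribed from Table 3, applies a 5% threshold (the variable `alpha` and the verdict labels are assumptions), and reports where the two tests agree.

```r
# p-values transcribed from Table 3 (annual totals, 1976–2025)
tests <- data.frame(
  city   = c("Abuja", "Benin", "Calabar", "Lagos", "Port Harcourt", "Warri"),
  adf_p  = c(0.091, 0.076, 0.015, 0.078, 0.044, 0.318),
  kpss_p = c(0.100, 0.100, 0.015, 0.100, 0.100, 0.089)
)

alpha <- 0.05
adf_stationary  <- tests$adf_p  < alpha    # ADF rejects the unit-root null
kpss_stationary <- tests$kpss_p >= alpha   # KPSS fails to reject stationarity

tests$verdict <- ifelse(
  adf_stationary & kpss_stationary, "Stationary (tests agree)",
  ifelse(!adf_stationary & !kpss_stationary, "Non-stationary (tests agree)",
         "Inconclusive (tests disagree)")
)
tests[, c("city", "verdict")]
```

With these inputs only Port Harcourt is classed as stationary by both tests; the remaining five cities fall in the inconclusive category.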
8 ARIMA / ETS forecasting (25-year horizon)
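Auto-ARIMA (auto.arima) is applied to each city’s annual rainfall series. Order selection follows the function’s default information criterion (AICc, since the call does not set `ic`); Table 4 reports the AIC of each selected model. Forecasts are generated for 2026–2050 (25 years) with 80% and 95% prediction intervals.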
```r
fit_arima <- function(cty) {
  ts_data <- annual |>
    filter(city == cty) |>
    arrange(year) |>
    pull(annual_rainfall_mm)
  ts_obj <- ts(ts_data, start = 1976, frequency = 1)
  fit <- auto.arima(ts_obj, seasonal = FALSE, stepwise = TRUE, approximation = TRUE)
  list(city = cty, model = fit, order = arimaorder(fit))
}

arima_fits <- map(unique(annual$city), fit_arima)
names(arima_fits) <- map_chr(arima_fits, "city")
```
Table 4: Auto-ARIMA selected model orders and fit statistics by city
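| City | p | d | q | AIC | RMSE |
|---|---|---|---|---|---|
| Abuja | 0 | 0 | 1 | 735.4 | 355.6 |
| Benin | 1 | 0 | 0 | 753.6 | 426.6 |
| Calabar | 0 | 1 | 1 | 754.8 | 501.8 |
| Lagos | 0 | 0 | 0 | 729.6 | 342.6 |
| Port Harcourt | 1 | 0 | 0 | 774.7 | 527.1 |
| Warri | 0 | 1 | 2 | 768.3 | 565.8 |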
```r
get_forecast_df <- function(city_name, h = 25) {
  fit <- arima_fits[[city_name]]$model
  fc <- forecast(fit, h = h, level = c(80, 95))
  obs_df <- tibble(
    year = seq(1976, 2025),
    value = as.numeric(fit$x),
    type = "Observed",
    lo80 = NA_real_, hi80 = NA_real_, lo95 = NA_real_, hi95 = NA_real_
  )
  fc_df <- tibble(
    year = seq(2026, 2026 + h - 1),
    value = as.numeric(fc$mean),
    type = "Forecast",
    lo80 = as.numeric(fc$lower[, "80%"]),
    hi80 = as.numeric(fc$upper[, "80%"]),
    lo95 = as.numeric(fc$lower[, "95%"]),
    hi95 = as.numeric(fc$upper[, "95%"])
  )
  bind_rows(obs_df, fc_df) |>
    mutate(city = city_name)
}

fc_all <- map_dfr(unique(annual$city), get_forecast_df)
```
```r
ggplot(fc_all, aes(x = year)) +
  geom_ribbon(data = filter(fc_all, type == "Forecast"),
              aes(ymin = lo95, ymax = hi95), fill = "#C6DBEF", alpha = 0.6) +
  geom_ribbon(data = filter(fc_all, type == "Forecast"),
              aes(ymin = lo80, ymax = hi80), fill = "#6BAED6", alpha = 0.6) +
  geom_line(aes(y = value, colour = type), linewidth = 0.8) +
  geom_vline(xintercept = 2025.5, linetype = "dashed", colour = "grey40") +
  scale_colour_manual(values = c("Observed" = "#08306B", "Forecast" = "#D94801"),
                      name = NULL) +
  facet_wrap(~city, ncol = 2, scales = "free_y") +
  labs(title = "ARIMA Forecast — Annual Rainfall (2026–2050)",
       subtitle = "Dark band = 80% PI · Light band = 95% PI · Dashed line = forecast origin",
       x = NULL, y = "Annual Rainfall (mm)") +
  theme(strip.text = element_text(face = "bold"),
        legend.position = "bottom")
```
Figure 6: Auto-ARIMA 25-year forecast (2026–2050) with 80% and 95% prediction intervals
9 50-year rainfall projection (2026–2075)
For the extended 50-year horizon, both Auto-ARIMA and ETS (exponential-smoothing state-space) models are fitted per city. The model with the lower AIC is selected for projection.
Table 5: 50-year projection summary: projected mean and 95% PI by city and decade
| City | Decade | Mean (mm) | Lo 95% (mm) | Hi 95% (mm) | PI width (mm) |
|---|---|---|---|---|---|
| Abuja | 2021–2030 | 1448 | 718 | 2177 | 1459 |
| Abuja | 2031–2040 | 1459 | 725 | 2194 | 1469 |
| Abuja | 2041–2050 | 1459 | 725 | 2194 | 1469 |
| Abuja | 2051–2060 | 1459 | 725 | 2194 | 1469 |
| Abuja | 2061–2070 | 1459 | 725 | 2194 | 1469 |
| Abuja | 2071–2080 | 1459 | 725 | 2194 | 1469 |
| Benin | 2021–2030 | 2572 | 1672 | 3472 | 1800 |
| Benin | 2031–2040 | 2584 | 1671 | 3498 | 1827 |
| Benin | 2041–2050 | 2584 | 1671 | 3498 | 1827 |
| Benin | 2051–2060 | 2584 | 1671 | 3498 | 1827 |
| Benin | 2061–2070 | 2584 | 1671 | 3498 | 1827 |
| Benin | 2071–2080 | 2584 | 1671 | 3498 | 1827 |
| Calabar | 2021–2030 | 3078 | 2057 | 4099 | 2042 |
| Calabar | 2031–2040 | 3078 | 1995 | 4161 | 2166 |
| Calabar | 2041–2050 | 3078 | 1917 | 4238 | 2321 |
| Calabar | 2051–2060 | 3078 | 1845 | 4311 | 2466 |
| Calabar | 2061–2070 | 3078 | 1776 | 4380 | 2604 |
| Calabar | 2071–2080 | 3078 | 1727 | 4429 | 2702 |
| Lagos | 2021–2030 | 1713 | 1034 | 2391 | 1357 |
| Lagos | 2031–2040 | 1713 | 1034 | 2391 | 1357 |
| Lagos | 2041–2050 | 1713 | 1034 | 2391 | 1357 |
| Lagos | 2051–2060 | 1713 | 1034 | 2391 | 1357 |
| Lagos | 2061–2070 | 1713 | 1034 | 2391 | 1357 |
| Lagos | 2071–2080 | 1713 | 1034 | 2391 | 1357 |
| Port Harcourt | 2021–2030 | 2833 | 1751 | 3915 | 2164 |
| Port Harcourt | 2031–2040 | 2861 | 1772 | 3950 | 2178 |
| Port Harcourt | 2041–2050 | 2861 | 1772 | 3950 | 2178 |
| Port Harcourt | 2051–2060 | 2861 | 1772 | 3950 | 2178 |
| Port Harcourt | 2061–2070 | 2861 | 1772 | 3950 | 2178 |
| Port Harcourt | 2071–2080 | 2861 | 1772 | 3950 | 2178 |
| Warri | 2021–2030 | 2850 | 1604 | 4096 | 2492 |
| Warri | 2031–2040 | 2824 | 1458 | 4190 | 2732 |
| Warri | 2041–2050 | 2824 | 1333 | 4315 | 2982 |
| Warri | 2051–2060 | 2824 | 1217 | 4431 | 3214 |
| Warri | 2061–2070 | 2824 | 1110 | 4538 | 3428 |
| Warri | 2071–2080 | 2824 | 1033 | 4615 | 3582 |
10 Extreme-rainfall classification
A Random Forest binary classifier is trained to predict whether a given city–month observation qualifies as an extreme-rainfall month, defined as a monthly total above the 80th percentile within its city × calendar-month subset. This flags approximately one month in five, an operationally meaningful threshold for flood-risk alerting.
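The labelling rule can be illustrated with a short base-R sketch. The report's own labelling code is not shown in this section, so everything here is an assumption apart from the column name `extreme80` and the 80th-percentile threshold: the data frame `monthly_demo`, the two placeholder cities, and the gamma-distributed rainfall values are synthetic.

```r
# Sketch: flag months above the 80th percentile within each city x calendar-month group.
set.seed(3)
monthly_demo <- expand.grid(year = 1976:2025, month = 1:12,
                            city = c("CityA", "CityB"), stringsAsFactors = FALSE)
monthly_demo$rainfall_mm <- rgamma(nrow(monthly_demo), shape = 4, scale = 40)  # synthetic values

# Group-wise 80th percentile, returned row-aligned by ave()
p80 <- ave(monthly_demo$rainfall_mm, monthly_demo$city, monthly_demo$month,
           FUN = function(v) quantile(v, 0.80))
monthly_demo$extreme80 <- as.integer(monthly_demo$rainfall_mm > p80)

mean(monthly_demo$extreme80)   # share of months flagged as extreme
```

Because the threshold is computed separately for every city and calendar month, each month of the year contributes the same share of extreme cases, so the label measures unusual wetness for the season rather than simply picking out the wet season.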
10.1 Feature engineering
```r
set.seed(4271)

feat <- monthly |>
  group_by(city) |>
  arrange(year, month) |>
  mutate(
    lag1 = lag(rainfall_mm, 1),
    lag2 = lag(rainfall_mm, 2),
    lag12 = lag(rainfall_mm, 12),
    # NOTE: right-aligned windows include the current month's rainfall, which also
    # defines the target; apply rollmean() to lag(rainfall_mm, 1) if only prior
    # months should inform the prediction.
    roll3 = rollmean(rainfall_mm, 3, fill = NA, align = "right"),
    roll6 = rollmean(rainfall_mm, 6, fill = NA, align = "right"),
    month_sin = sin(2 * pi * month / 12),
    month_cos = cos(2 * pi * month / 12)
  ) |>
  ungroup() |>
  # Encode city after ungrouping; inside group_by(city) every row would receive code 1
  mutate(city_num = as.integer(factor(city))) |>
  drop_na(lag1, lag2, lag12, roll3, roll6) |>
  mutate(extreme80 = factor(extreme80, levels = c(0, 1),
                            labels = c("Normal", "Extreme")))

idx <- createDataPartition(feat$extreme80, p = 0.8, list = FALSE)
tr <- feat[idx, ]
te <- feat[-idx, ]
cat("Training rows:", nrow(tr), "| Test rows:", nrow(te))
```
A Classification and Regression Tree (CART) provides a rule-based, human-readable classifier. Complexity is controlled via 10-fold cross-validated pruning (cp).
```r
rpart.plot(dt_mod, type = 4, extra = 104, fallen.leaves = TRUE,
           main = "Decision Tree — Extreme Rainfall Classifier",
           cex = 0.72, tweak = 1.1, shadow.col = "grey80",
           box.palette = c("#2166AC", "#D94801"))
```
Figure 8: Pruned CART decision tree for extreme-rainfall classification
10.5 XGBoost classifier
XGBoost (Extreme Gradient Boosting) trains an ensemble of shallow trees in a stage-wise, gradient-descent framework and is often among the most accurate classifiers on tabular data.
Figure 10: ROC curves for all four classifiers — test set
```r
model_compare |>
  select(Model, Accuracy, Sensitivity, Specificity, Precision, F1) |>
  pivot_longer(-Model, names_to = "Metric", values_to = "Value") |>
  ggplot(aes(x = Metric, y = Value, fill = Model)) +
  geom_col(position = position_dodge(width = 0.75), width = 0.65) +
  geom_text(aes(label = round(Value, 2)),
            position = position_dodge(width = 0.75),
            vjust = -0.4, size = 2.5) +
  scale_fill_manual(values = c("Logistic Regression" = "#377EB8",
                               "Decision Tree (CART)" = "#4DAF4A",
                               "Random Forest" = "#984EA3",
                               "XGBoost" = "#E41A1C"),
                    name = NULL) +
  scale_y_continuous(limits = c(0, 1.08), breaks = seq(0, 1, 0.2)) +
  labs(title = "Classification Performance — All Models", x = NULL, y = "Score") +
  theme(legend.position = "bottom")
```
Figure 11: Model performance comparison — bar chart of key metrics
10.7 LIME — local interpretable model-agnostic explanations
While SHAP is used in this report for global feature attribution, LIME (Local Interpretable Model-Agnostic Explanations) explains individual predictions by fitting a locally weighted linear model around each test instance. This is particularly useful for auditing high-risk predictions.
```r
set.seed(5571)

# Register model_type and predict_model for randomForest (required by lime)
model_type.randomForest <- function(x, ...) "classification"
predict_model.randomForest <- function(x, newdata, type, ...) {
  p <- predict(x, newdata = newdata, type = "prob")
  as.data.frame(p)
}

lime_explainer <- lime::lime(
  x = tr[, feat_names],
  model = rf_mod,
  bin_continuous = TRUE,
  n_bins = 4
)

# Explain up to 8 test instances (mix of correct and incorrect predictions)
lime_sample <- te |>
  mutate(correct = (preds == extreme80)) |>
  group_by(extreme80, correct) |>
  slice_sample(n = 2) |>
  ungroup() |>
  slice_head(n = 8)

lime_exp <- lime::explain(
  x = lime_sample[, feat_names],
  explainer = lime_explainer,
  n_labels = 1,
  n_features = 5,
  n_permutations = 500
)
```
```r
lime::plot_features(lime_exp, ncol = 2) +
  labs(title = "LIME Local Explanations — Random Forest",
       subtitle = "Blue = supports label · Red = contradicts label")
```
Figure 12: LIME local explanations — feature contributions for 8 selected test instances (RF model)
```r
lime_exp |>
  as_tibble() |>
  mutate(case = paste0("Obs ", case),
         feature_desc = str_trunc(feature_desc, 25)) |>
  ggplot(aes(x = feature_desc, y = case, fill = feature_weight)) +
  geom_tile(colour = "white", linewidth = 0.4) +
  geom_text(aes(label = round(feature_weight, 3)), size = 2.8) +
  scale_fill_gradient2(low = "#D73027", mid = "white", high = "#1A9850",
                       midpoint = 0, name = "LIME weight") +
  facet_wrap(~label, ncol = 2) +
  labs(title = "LIME Feature-Weight Heatmap",
       subtitle = "Each row = one test observation · Columns = top-5 features",
       x = NULL, y = NULL) +
  theme(axis.text.x = element_text(angle = 40, hjust = 1, size = 8),
        strip.text = element_text(face = "bold"))
```
Figure 16: K-means clusters (k=2) overlaid on the PCA regime space
11.4 PCA loadings heatmap
```r
loads <- pca_fit$rotation[, 1:4] |>
  as.data.frame() |>
  rownames_to_column("Feature") |>
  pivot_longer(-Feature, names_to = "PC", values_to = "Loading")

ggplot(loads, aes(x = PC, y = Feature, fill = Loading)) +
  geom_tile(colour = "white", linewidth = 0.4) +
  geom_text(aes(label = round(Loading, 2)), size = 3.3) +
  scale_fill_gradient2(low = "#D73027", mid = "white", high = "#1A9850",
                       midpoint = 0, name = "Loading") +
  labs(title = "PCA Loadings (PC1–PC4)", x = NULL, y = NULL) +
  theme(axis.text.y = element_text(size = 10))
```
Figure 17: PCA loadings heatmap — feature contributions to the first four principal components
11.5 PCA biplot with loading vectors
A classical biplot overlays both the observation scores and the feature loading vectors in the same coordinate space. The direction and length of each arrow indicate how strongly, and along which principal component, that feature contributes.
t-distributed Stochastic Neighbour Embedding (t-SNE) is a non-linear dimensionality reduction technique that preserves local neighbourhood structure. Unlike PCA, it captures non-linear manifolds and is particularly effective at revealing cluster separation invisible to linear methods. Perplexity = 40 (approximately √N for this dataset); Barnes-Hut approximation (theta = 0.5) is used for efficiency.
Figure 19: t-SNE embedding coloured by city — rainfall feature space (n = 2,000)
```r
# Map k-means cluster back to t-SNE sample
tsne_km_df <- tsne_df |>
  mutate(
    # Find the original row indices from pca_scores that correspond to tsne_idx
    cluster = km$cluster[tsne_idx]
  )

ggplot(tsne_km_df, aes(x = tSNE1, y = tSNE2, colour = factor(cluster))) +
  geom_point(alpha = 0.35, size = 0.9) +
  scale_colour_manual(values = c("1" = "#2166AC", "2" = "#D94801"),
                      name = "K-means cluster") +
  labs(
    title = "t-SNE — K-means Cluster Labels Overlaid",
    subtitle = "Cluster boundaries emerge from non-linear feature interactions",
    x = "t-SNE Dimension 1", y = "t-SNE Dimension 2"
  ) +
  theme(legend.position = "bottom")
```
Figure 20: t-SNE embedding coloured by k-means cluster label
11.7 UMAP visualisation
Uniform Manifold Approximation and Projection (UMAP) is a graph-based non-linear dimensionality reduction technique that is generally regarded as preserving more global structure than t-SNE while remaining computationally efficient. Default hyperparameters are used: n_neighbors = 15, min_dist = 0.1.
Figure 23: Side-by-side comparison of PCA, t-SNE, and UMAP by city
12 2-D rainfall regime map
Each city is positioned at its geographic centroid. Circle area encodes mean annual rainfall; fill colour encodes the coefficient of variation (CV), a measure of inter-annual variability.
Figure 25: Monthly rainfall anomaly heatmap — deviation from 1976–2025 city-level mean (mm)
14 Engineering applications of rainfall analysis
The climatological statistics derived from the 50-year record and probabilistic projections have direct utility across a broad range of civil and environmental engineering disciplines. This section translates the statistical outputs into design parameters and decision-support tools for seven engineering domains. All calculations use monthly rainfall totals; where daily or sub-daily intensities are required, empirically calibrated scaling relationships are applied and clearly stated.
14.1 Flood forecasting
Flood frequency analysis estimates the rainfall (or flow) magnitude associated with a given return period. The Gumbel (Extreme Value Type I) distribution is fitted to annual rainfall totals for each city using the method of moments. The return-level estimate for return period \(T\) is:
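A standard form of this estimate, consistent with the method-of-moments parameters used in the Section 14.3 code (\(\beta = s\sqrt{6}/\pi\), \(\mu = \bar{x} - 0.5772\,\beta\)), is

\[x_T = \mu - \beta \ln\left[-\ln\left(1 - \frac{1}{T}\right)\right]\]

The sketch below evaluates it in base R. It is an illustration under stated assumptions: the function name `gumbel_return_level` is hypothetical, and the mean and standard deviation passed to it are illustrative magnitudes rather than values taken from the report's data.

```r
# Gumbel (EV Type I) return level via method of moments.
gumbel_return_level <- function(x_bar, s, T_yr) {
  beta <- s * sqrt(6) / pi              # scale parameter
  mu   <- x_bar - 0.5772 * beta         # location (0.5772 = Euler-Mascheroni constant)
  mu - beta * log(-log(1 - 1 / T_yr))   # x_T for return period T_yr (years)
}

# Illustrative inputs only (assumed mean and standard deviation, in mm)
rl <- gumbel_return_level(x_bar = 3275, s = 523, T_yr = c(25, 50, 100))
round(rl)
```

Return levels rise roughly with the logarithm of the return period, so doubling \(T\) from 50 to 100 years adds less than the step from 25 to 50 years might suggest.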
Figure 26: Gumbel flood-frequency curves — annual rainfall return levels by city
Tip: Flood forecasting application
Return-level estimates at T = 25, 50, and 100 years feed directly into the design flood calculation for flood plain delineation, early-warning trigger thresholds, and bridge/culvert sizing. Cities in the humid coastal-south regime (Calabar, Warri, Port Harcourt) show substantially higher T-year rainfalls, requiring more conservative flood-design standards.
14.2 Irrigation planning
An irrigation water requirement (IWR) analysis quantifies the monthly soil-moisture deficit — the gap between atmospheric demand (reference evapotranspiration, ET₀) and supply (effective rainfall). Reference ET₀ values for each city are drawn from published FAO/AQUASTAT estimates for the West African sub-region.
```r
# Published approximate monthly reference ET₀ (mm) for each city
# Source: FAO AQUASTAT / Allen et al. (1998) Penman-Monteith normals
eto_monthly <- bind_rows(
  tibble(city = "Abuja", month = 1:12,
         eto_mm = c(172, 165, 175, 155, 140, 118, 112, 110, 115, 135, 155, 168)),
  tibble(city = "Benin", month = 1:12,
         eto_mm = c(148, 138, 140, 125, 118, 108, 100, 98, 105, 120, 138, 148)),
  tibble(city = "Calabar", month = 1:12,
         eto_mm = c(140, 130, 135, 120, 112, 102, 98, 95, 102, 115, 130, 140)),
  tibble(city = "Lagos", month = 1:12,
         eto_mm = c(152, 145, 148, 132, 122, 110, 104, 100, 108, 125, 142, 152)),
  tibble(city = "Port Harcourt", month = 1:12,
         eto_mm = c(142, 132, 138, 122, 114, 104, 100, 96, 104, 118, 132, 142)),
  tibble(city = "Warri", month = 1:12,
         eto_mm = c(138, 128, 133, 118, 110, 100, 96, 94, 100, 112, 128, 138))
)

# Apply 0.8 effectiveness factor to rainfall (accounts for runoff + deep percolation)
eff_factor <- 0.80

wb <- clim |>
  left_join(eto_monthly, by = c("city", "month")) |>
  mutate(
    eff_rain = mean_rainfall_mm * eff_factor,
    deficit = pmax(0, eto_mm - eff_rain),
    surplus = pmax(0, eff_rain - eto_mm),
    month_name = factor(month_name,
                        levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                                   "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))
  )

annual_iwr <- wb |>
  group_by(city) |>
  summarise(
    Annual_ET0_mm = sum(eto_mm),
    Annual_EffRain_mm = round(sum(eff_rain)),
    Annual_IWR_mm = round(sum(deficit)),
    Annual_Surplus_mm = round(sum(surplus)),
    .groups = "drop"
  )
```
Figure 27: Monthly water balance — effective rainfall vs reference ET₀ by city
Tip: Irrigation application
Abuja shows the largest annual IWR due to its sub-humid climate; supplemental irrigation is required from October through April. Coastal cities (Calabar, Warri, Port Harcourt) have minimal deficits, making rain-fed agriculture viable for most of the year. These values feed directly into FAO-56 crop water requirement calculations and irrigation scheme design.
14.3 Rainfall frequency modelling
Fitting probability distributions to annual rainfall totals provides the statistical foundation for all design-storm calculations. Three candidate distributions — Normal, Log-normal, and Gumbel (GEV Type I) — are compared via AIC and goodness-of-fit.
```r
# Fit Normal and Log-normal by MLE via MASS::fitdistr; Gumbel by method of moments
fit_dists <- function(cty) {
  x <- annual |>
    filter(city == cty) |>
    pull(annual_rainfall_mm)
  fit_norm <- MASS::fitdistr(x, "normal")
  fit_lnorm <- MASS::fitdistr(x, "lognormal")
  # Gumbel: parameterise via method of moments (closed-form)
  beta_g <- sd(x) * sqrt(6) / pi
  mu_g <- mean(x) - 0.5772 * beta_g
  n_param <- 2
  ll_norm <- sum(dnorm(x, fit_norm$estimate["mean"], fit_norm$estimate["sd"], log = TRUE))
  ll_lnorm <- sum(dlnorm(x, fit_lnorm$estimate["meanlog"], fit_lnorm$estimate["sdlog"], log = TRUE))
  # Gumbel log-likelihood, evaluated at the moment estimates (so its AIC is approximate)
  z <- (x - mu_g) / beta_g
  ll_gumb <- sum(-log(beta_g) - z - exp(-z))
  tibble(
    City = cty,
    Distribution = c("Normal", "Log-normal", "Gumbel"),
    Mean_fit = c(fit_norm$estimate["mean"],
                 exp(fit_lnorm$estimate["meanlog"] + fit_lnorm$estimate["sdlog"]^2 / 2),
                 mu_g + 0.5772 * beta_g),
    LogLik = c(ll_norm, ll_lnorm, ll_gumb),
    AIC = c(-2 * ll_norm + 2 * n_param,
            -2 * ll_lnorm + 2 * n_param,
            -2 * ll_gumb + 2 * n_param)
  )
}

dist_fits <- map_dfr(unique(annual$city), fit_dists)
```
Figure 28: Fitted probability distributions overlaid on empirical rainfall histograms by city
14.4 Drainage design
Urban and rural drainage systems are sized using the Rational Method:
\[Q = \frac{C \cdot i \cdot A}{360}\]
where \(Q\) is the design peak flow (m³/s), \(C\) is the runoff coefficient (dimensionless), \(i\) is the design rainfall intensity (mm/hr) for the selected return period and time of concentration, and \(A\) is the catchment area (ha).
Since daily and sub-daily rainfall data are not available, Gumbel quantiles of annual rainfall are converted to approximate 24-hr design rainfall depths using the empirical scaling relationship \(R_{24} \approx 0.12 \times R_{\text{annual}}\) (a conservative estimate derived from Nigerian hourly-to-daily rainfall studies), and subsequently to intensity via \(i = R_{24}/T_c\), with \(T_c\) in hours, for a range of concentration times.
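The calculation chain described above can be written as two small base-R helpers. This is a sketch for checking units, not the report's implementation: the function names `rational_q` and `design_intensity` are hypothetical, and the numbers in the example calls are round illustrative values rather than design inputs for any of the six cities.

```r
# Rational Method: Q (m^3/s) from C (dimensionless), i (mm/hr), and A (ha).
# The divisor 360 converts mm/hr x ha to m^3/s.
rational_q <- function(C, i_mm_hr, A_ha) C * i_mm_hr * A_ha / 360

# Scaling chain from the text: annual quantile -> 24-hr depth -> intensity
design_intensity <- function(R_annual_mm, Tc_hr, scale = 0.12) {
  R24 <- scale * R_annual_mm   # approximate 24-hr design depth (mm)
  R24 / Tc_hr                  # design intensity (mm/hr)
}

# Unit check with round, hypothetical numbers: C = 0.6, i = 100 mm/hr, A = 5 ha
q_check <- rational_q(C = 0.6, i_mm_hr = 100, A_ha = 5)
q_check

# Intensity implied by a hypothetical 2,000 mm annual quantile and Tc = 0.5 hr
i_check <- design_intensity(R_annual_mm = 2000, Tc_hr = 0.5)
i_check
```

Because the intensity is inversely proportional to \(T_c\), short concentration times produce very large intensities under this scaling, which is worth keeping in mind when reading the resulting peak flows.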
Figure 29: Intensity-Duration-Frequency (IDF) curves for selected cities and return periods
Tip: Drainage design application
For a 5-ha mixed-use catchment in Lagos with \(T_c = 30\) min and a 25-year return period, the rational method yields a design peak flow of approximately 13.8 m³/s. This informs pipe sizing, channel dimensions, and detention basin volumes per FMWR/NIWA drainage design standards.
14.5 Road design
Rainfall directly affects road design through three mechanisms: (1) pavement moisture susceptibility and subgrade bearing capacity loss, (2) cross-drainage (culvert and side-drain) sizing, and (3) construction workability windows. Two indicators are derived from the monthly climatology.
Figure 31: Design peak flow for road culverts (Rational method, C = 0.45, A = 2 ha) — all cities, T = 10 and 25 years
14.6 Risk analysis
Engineering risk analysis quantifies the probability that a design-rainfall event is exceeded at least once during a structure’s service life. For a structure designed for a T-year return period with service life \(n\) years:
\[R = 1 - \left(1 - \frac{1}{T}\right)^n\]
This formulation underpins the selection of appropriate design return periods for infrastructure across different consequence classes.
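The formula is straightforward to evaluate over a grid of return periods and service lives. The base-R sketch below does so; the function name `exceedance_risk` is an assumption, while the `T_vals` and `n_vals` grids follow the values shown in the risk matrix.

```r
# Probability of at least one exceedance of the T-year event in n years
exceedance_risk <- function(T_yr, n_yr) 1 - (1 - 1 / T_yr)^n_yr

T_vals <- c(5, 10, 25, 50, 100, 200, 500)   # design return periods (years)
n_vals <- c(5, 10, 20, 30, 50, 100)         # service lives (years)

# Rows = service life, columns = return period, values in percent
risk_pct <- round(100 * outer(n_vals, T_vals,
                              function(n, t) exceedance_risk(t, n)), 1)
dimnames(risk_pct) <- list(paste0(n_vals, " yr life"),
                           paste0("T=", T_vals, "yr"))
risk_pct
```

A useful rule of thumb follows from the same expression: a structure whose service life equals its design return period has roughly a 63% chance of seeing the design event exceeded at least once.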
Table 13: Risk of exceedance (%) at least once during project service life, by design return period (columns) and service life (rows)

| Service life | T=5yr | T=10yr | T=25yr | T=50yr | T=100yr | T=200yr | T=500yr |
|---|---|---|---|---|---|---|---|
| 5 yr life | 67.2 | 41.0 | 18.5 | 9.6 | 4.9 | 2.5 | 1.0 |
| 10 yr life | 89.3 | 65.1 | 33.5 | 18.3 | 9.6 | 4.9 | 2.0 |
| 20 yr life | 98.8 | 87.8 | 55.8 | 33.2 | 18.2 | 9.5 | 3.9 |
| 30 yr life | 99.9 | 95.8 | 70.6 | 45.5 | 26.0 | 14.0 | 5.8 |
| 50 yr life | 100.0 | 99.5 | 87.0 | 63.6 | 39.5 | 22.2 | 9.5 |
| 100 yr life | 100.0 | 100.0 | 98.3 | 86.7 | 63.4 | 39.4 | 18.1 |
```r
ggplot(risk_matrix, aes(x = factor(T, levels = T_vals),
                        y = factor(n, levels = n_vals),
                        fill = risk_pct)) +
  geom_tile(colour = "white", linewidth = 0.6) +
  geom_text(aes(label = paste0(risk_pct, "%")), size = 3.5, fontface = "bold") +
  scale_fill_gradientn(
    colours = c("#1A9850", "#91CF60", "#D9EF8B", "#FEE08B", "#FC8D59", "#D73027"),
    values = scales::rescale(c(0, 20, 40, 60, 80, 100)),
    name = "Risk (%)"
  ) +
  labs(
    title = "Engineering Risk Matrix",
    subtitle = "Probability of ≥1 exceedance of design event during service life",
    x = "Design return period (years)", y = "Service life (years)"
  ) +
  theme(axis.text = element_text(face = "bold"))
```
Figure 32: Risk matrix heatmap — probability of at least one exceedance during service life (%)
```r
rp_long |>
  mutate(AEP = 1 / T_yr * 100) |>
  filter(T_yr <= 500) |>
  ggplot(aes(x = AEP, y = rl_mm, colour = city)) +
  geom_line(linewidth = 0.9) +
  scale_x_log10(
    breaks = c(0.2, 0.5, 1, 2, 4, 10, 20, 50),
    labels = c("0.2", "0.5", "1", "2", "4", "10", "20", "50")
  ) +
  scale_colour_manual(values = CITY_COLOURS, name = "City") +
  labs(
    title = "Annual Exceedance Probability Curves",
    subtitle = "Gumbel distribution fits to annual rainfall totals",
    x = "Annual exceedance probability (%)", y = "Annual rainfall (mm)"
  ) +
  geom_vline(xintercept = 1, linetype = "dashed", colour = "grey40") +
  annotate("text", x = 1.15, y = max(rp_long$rl_mm) * 0.88,
           label = "1% AEP\n(100-yr)", size = 3, colour = "grey30", hjust = 0)
```
Figure 33: Annual exceedance probability curves derived from Gumbel fits — all cities
Tip: Risk analysis application
A road with a 50-year service life designed to a T = 25-year standard faces an 87% probability of at least one exceedance during that life. Upgrading to T = 100 years reduces this to 39.5%. This cost–risk trade-off is a core input to design-standard selection under FMWH and FMWR guidelines.
14.7 Dam engineering
Dam design requires reliable estimates of (1) Probable Maximum Precipitation (PMP) for spillway design, (2) reservoir storage requirements via mass-curve analysis, and (3) mean annual runoff for yield assessment.
```r
# Hershfield statistical method for PMP (applied here to annual totals)
# PMP = x_bar + K * sigma, where K ≈ 15 (standard Hershfield multiplier)
K_hershfield <- 15

pmp_table <- annual |>
  group_by(city) |>
  summarise(
    mean_r = mean(annual_rainfall_mm),
    sd_r = sd(annual_rainfall_mm),
    PMP_mm = round(mean_r + K_hershfield * sd_r),
    PMF_ratio = round((mean_r + K_hershfield * sd_r) / mean_r, 2),  # ratio of PMP to mean
    .groups = "drop"
  ) |>
  left_join(city_coords, by = "city")

# Runoff coefficient (C_r) using Nigeria empirical values by climate zone
pmp_table <- pmp_table |>
  mutate(
    runoff_coeff = case_when(
      city %in% c("Calabar", "Warri", "Port Harcourt") ~ 0.55,  # dense forest / delta
      city %in% c("Benin", "Lagos") ~ 0.45,                     # Guinea savanna / coastal
      city == "Abuja" ~ 0.30                                     # Guinea savanna / plateau
    ),
    mean_annual_runoff_mm = round(mean_r * runoff_coeff),
    # Spillway design flood (SDF) approximation: C * PMP * A (use unit area 1 km²)
    # The conversion treats the PMP depth as running off within 24 hours:
    # mm -> m (/1000), km² -> m² (*1e6), day -> s (/(24 * 3600))
    SDF_m3s_per_km2 = round(runoff_coeff * PMP_mm / 1000 * 1e6 / (24 * 3600), 2)
  )
```
Table 15: Estimated reservoir storage requirement (Rippl method) for a 100 km² catchment
| City | Rippl storage (Mm³) | Mean runoff (mm/yr) | SDF (m³/s/km²) |
|---|---|---|---|
| Abuja | 127.5 | 438 | 24.53 |
| Benin | 225.2 | 1161 | 49.53 |
| Calabar | 324.9 | 1801 | 70.77 |
| Lagos | 154.5 | 771 | 35.96 |
| Port Harcourt | 227.7 | 1573 | 70.73 |
| Warri | 421.4 | 1736 | 79.26 |
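The Rippl storage figures rest on a mass-curve calculation whose code is not shown in this section. As a hedged illustration, the sketch below uses the sequent-peak algorithm, a common numerical equivalent of the Rippl construction: it tracks the running shortfall of inflow against a constant draft and reports the largest shortfall as the required storage. The function name `rippl_storage`, the monthly inflow vector, and the draft of 80% of mean inflow are all assumptions made for illustration.

```r
# Sequent-peak form of the Rippl mass-curve calculation.
rippl_storage <- function(inflow, demand) {
  deficit <- 0
  peak    <- 0
  for (q in inflow) {
    deficit <- max(0, deficit + demand - q)   # drawdown below a full reservoir
    peak    <- max(peak, deficit)             # largest drawdown = required storage
  }
  peak
}

# Hypothetical monthly inflows (Mm^3) and a constant draft of 80% of mean inflow
inflow <- c(2, 1, 3, 8, 15, 22, 28, 30, 24, 12, 4, 2)
demand <- 0.8 * mean(inflow)

# The record is repeated once so that a dry spell spanning the year-end is captured
rippl_storage(rep(inflow, 2), demand)
```

Required storage grows quickly as the draft approaches the mean inflow, which is why yield targets and storage volumes are usually assessed together.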
```r
pmp_table |>
  ggplot(aes(y = reorder(city, PMP_mm), x = PMP_mm, fill = city)) +
  geom_col(width = 0.6, alpha = 0.8) +
  geom_col(aes(x = mean_r), fill = "grey30", width = 0.3, alpha = 0.8) +
  geom_text(aes(x = PMP_mm,
                label = paste0("PMP=", comma(PMP_mm), " mm (×", PMF_ratio, ")")),
            hjust = -0.05, size = 3.2) +
  scale_fill_manual(values = CITY_COLOURS, guide = "none") +
  scale_x_continuous(expand = expansion(mult = c(0, 0.35))) +
  labs(
    title = "Probable Maximum Precipitation (Hershfield, K=15)",
    subtitle = "Coloured bar = PMP · Dark bar = mean annual rainfall",
    x = "Rainfall (mm/year)", y = NULL
  )
```
Figure 35: Probable Maximum Precipitation vs mean annual rainfall — ratio chart by city
Tip: Dam engineering application
Calabar’s estimated PMP of 11,118 mm/year and high runoff coefficient imply a specific design flood of 70.77 m³/s per km² — among the highest in Nigeria and comparable to values reported for humid tropical dams in sub-Saharan Africa. Reservoir storage estimates from the Rippl mass-curve method guide dam height, dead storage, and rule-curve design.
15 Integrated findings
The following synthesis draws together results from all analytical components.
15.1 Rainfall regime structure
Two climatological regimes are evident from the PCA and regime map:
Humid coastal-south regime (Calabar, Warri, Port Harcourt): Mean annual totals exceeding 2,700 mm; bimodal seasonal structure driven by proximity to the Atlantic and the Niger Delta wetlands; lower inter-annual CV, suggesting relatively stable moisture supply.
Sub-humid north-central regime (Abuja, Lagos, Benin): Lower mean annual totals (roughly 1,400–2,600 mm, with Benin at the upper end); stronger unimodal or weakly bimodal seasonal cycle; higher CV, indicating greater sensitivity to ITCZ positioning variability and ENSO teleconnections.
15.2 Temporal stationarity and trend absence
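On annual totals, the ADF and KPSS tests agree on stationarity at the 5% level only for Port Harcourt (Table 3). For Abuja, Benin, Lagos, and Warri the KPSS test does not reject stationarity but the ADF test fails to reject a unit root, and for Calabar the pattern is reversed. Together with the flat STL trend components, this is consistent with, but does not confirm, the absence of a deterministic trend in the 50-year record. It does not preclude decadal-scale variability or future non-stationarity under climate change scenarios not modelled here.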
15.3 Decomposition insights
STL decomposition confirms that seasonality is the dominant component in all cities (accounting for the majority of total variance), while the trend component is approximately flat. Remainder components exhibit occasional large deviations consistent with known drought years associated with La Niña episodes.
15.4 Classifier performance
The Random Forest classifier achieves strong separation between extreme and non-extreme months across all cities. SHAP analysis identifies calendar month (seasonal position) and 12-month lagged rainfall (prior-year wetness) as the top predictors — physically interpretable results that align with West African monsoon dynamics.
15.5 50-year projections
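Under the fitted ARIMA/ETS models, projected rainfall means for 2026–2075 closely track historical averages. The 95% prediction intervals are wide for every city (Table 5), ranging from about 1,360 mm for Lagos to more than 3,500 mm for Warri by the final decade. Widths grow with lead time only for Calabar (2,042 to 2,702 mm) and Warri (2,492 to 3,582 mm), whose selected models include differencing (Table 4); for the other four cities the widths level off within the first two decades. This reflects the intrinsic uncertainty of long-horizon univariate projections and underlines the need for physically based climate models for planning purposes.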
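In Table 5 the interval width stays constant for some cities and grows for others, and this follows from the selected model orders. As a hedged illustration (a synthetic series and base R's `arima()`, not the report's `auto.arima` fits), the sketch below fits an ARIMA(0,0,0) and an ARIMA(0,1,1), the orders Table 4 lists for Lagos and Calabar, to the same simulated annual series and compares forecast standard errors over a 50-year horizon. The series `x`, the seed, and the helper `width` are assumptions made for illustration.

```r
set.seed(7)
n <- 50
# Synthetic annual series: slowly drifting level plus noise (illustrative only)
x <- ts(2000 + cumsum(rnorm(n, sd = 200)) + rnorm(n, sd = 300), start = 1976)

fit_d0 <- arima(x, order = c(0, 0, 0))   # stationary: constant mean plus white noise
fit_d1 <- arima(x, order = c(0, 1, 1))   # once-differenced: stochastic level

# Forecast standard errors for horizons 1..50
se_d0 <- as.numeric(predict(fit_d0, n.ahead = 50)$se)
se_d1 <- as.numeric(predict(fit_d1, n.ahead = 50)$se)

# Approximate 95% prediction-interval widths at horizons 1, 25, and 50 years
width <- function(se) round(2 * qnorm(0.975) * se)
rbind(d0 = width(se_d0[c(1, 25, 50)]),
      d1 = width(se_d1[c(1, 25, 50)]))
```

The undifferenced model returns the same standard error at every horizon, while the differenced model accumulates level uncertainty year after year, which is the pattern seen for Calabar and Warri.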
16 Recommendations
The recommendations below are derived directly from the statistical outputs, engineering design parameters, and probabilistic projections presented in this report. They are addressed to policy planners, infrastructure engineers, water-resource managers, and research institutions operating in Nigeria’s six study cities and analogous West African urban environments.
16.1 Rainfall data infrastructure and monitoring
Important: Priority recommendation (data quality)
The single most impactful improvement to any downstream engineering or planning exercise is the replacement of synthetic climatological series with gauge-validated, quality-controlled daily rainfall records. No amount of statistical sophistication can substitute for accurate observed data.
Expand and modernise NiMet rain-gauge networks. The six cities in this study are relatively well-served compared to rural Nigeria; however, sub-daily (hourly) recording gauges are sparse. Investment in tipping-bucket rain gauges with telemetry — ideally at a density of at least one gauge per 25 km² in urban areas — is a prerequisite for credible IDF curve derivation and flood early warning.
Establish a national open-rainfall archive. NiMet should publish quality-controlled daily station records in a machine-readable format accessible to researchers, engineers, and local government. The 50-year series (1976–2025) analysed here would benefit enormously from such a resource.
Integrate satellite-derived rainfall products. CHIRPS, ERA5-Land, and GPM IMERG offer continuous spatial coverage; their systematic bias correction using available gauge records would immediately extend the spatial scope beyond the six-city analysis.
Maintain the extreme-rainfall classifier as an operational tool. The Random Forest model (AUC = 0.934) flags months at elevated risk of extreme rainfall. Re-training it on observed NiMet data and deploying it as a seasonal early-warning product is strongly recommended.
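For the satellite-product recommendation above, one simple form of bias correction is linear scaling by calendar month. The sketch below is an assumption about how such a correction could be set up, not a method used in this report; the function names and the sample records are hypothetical.

```python
from collections import defaultdict

def monthly_scaling_factors(records):
    """records: iterable of (calendar_month, gauge_mm, satellite_mm) for overlapping months."""
    gauge_sum = defaultdict(float)
    sat_sum = defaultdict(float)
    for month, gauge_mm, sat_mm in records:
        gauge_sum[month] += gauge_mm
        sat_sum[month] += sat_mm
    # A factor above 1 means the satellite product underestimates in that month
    return {m: gauge_sum[m] / sat_sum[m] for m in gauge_sum if sat_sum[m] > 0}

def correct(month, sat_mm, factors):
    """Scale a satellite estimate; months without a factor are left unchanged."""
    return sat_mm * factors.get(month, 1.0)

# Hypothetical overlap period: two Julys and one January
records = [(7, 300.0, 250.0), (7, 340.0, 270.0), (1, 12.0, 15.0)]
factors = monthly_scaling_factors(records)
print(factors, correct(7, 260.0, factors))
```

Linear scaling corrects the monthly mean only. Quantile mapping or similar methods would be needed if the distribution of extremes also has to match the gauges.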
16.2 Flood management and early warning
Adopt T = 100-year design standards for critical flood infrastructure in Calabar and Warri. The Gumbel return-level analysis indicates a T = 100-year annual rainfall of 4914 mm for Calabar — approximately 1.5× the long-term mean. Flood embankments, stormwater retention basins, and major bridges in these cities should be designed to at least this standard.
Develop and enforce city-specific flood-plain maps. Using the Gumbel return levels as boundary conditions, hydraulic modelling (e.g., HEC-RAS 2D) should delineate 1-in-10, 1-in-50, and 1-in-100 year flood extents for all six cities. Flood-plain maps should be embedded in urban land-use planning ordinances to prevent further encroachment on high-risk areas.
Integrate probabilistic ARIMA/ETS forecasts into emergency preparedness plans. The 50-year projections for 2026–2075 show stable mean rainfall but widening uncertainty bands. Disaster management agencies (NEMA, state emergency management agencies) should treat the upper bound of the 95% prediction interval as the planning scenario for infrastructure stress-testing.
Calibrate the seasonal extreme-rainfall classifier for operational forecasting. Calendar month and 12-month lagged rainfall (SHAP top predictors) are already available in advance of the rainy season. A monthly bulletin classifying each upcoming month’s extreme probability — by city — would support pre-positioning of flood-response resources.
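The Gumbel return levels referred to in this subsection can be sketched as follows. This assumes a method-of-moments fit, which may not be the fitting method used in the report, and the annual totals are hypothetical values chosen only to make the example run.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649

def gumbel_frequency_factor(T):
    """Frequency factor K_T of the Gumbel (EV1) distribution for return period T in years."""
    return -(math.sqrt(6) / math.pi) * (EULER_GAMMA + math.log(math.log(T / (T - 1))))

def gumbel_return_level(annual_totals, T):
    """T-year return level as mean + K_T * sample standard deviation."""
    mean = statistics.mean(annual_totals)
    sd = statistics.stdev(annual_totals)
    return mean + gumbel_frequency_factor(T) * sd

# Hypothetical annual rainfall totals (mm), not taken from the report
totals = [2900, 3350, 3100, 2750, 3600, 3200, 3050, 3450, 2980, 3300]
for T in (10, 50, 100):
    print(T, round(gumbel_return_level(totals, T)))
```

The frequency factor depends only on T, so the same K values (about 1.30 for T = 10 and 3.14 for T = 100) apply to every city, and differences between cities come from the mean and standard deviation of their annual totals.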
16.3 Water resources and irrigation planning
Prioritise supplemental irrigation investment for Abuja and its agricultural hinterland. The water-balance analysis reveals an annual irrigation water requirement of 978 mm for Abuja — the largest among the six cities — driven by a long dry season (October–April) in which effective rainfall falls well below reference ET₀. Drip or micro-irrigation schemes for food-crop production in the FCT and neighbouring states offer the highest returns on investment.
Exploit the large rainfall surplus in Calabar and Warri for strategic water storage. Calabar records an annual surplus of 1458 mm over ET₀, representing a substantial renewable water resource. Construction of small-to-medium surface reservoirs in Cross River and Delta States would capture this surplus for dry-season urban supply, aquaculture, and downstream agro-industrial use.
Align crop calendars with city-specific water-balance windows. The monthly water balance outputs should inform state agriculture ministries on optimal planting dates, which differ markedly between the sub-humid north-central group (Abuja, Lagos, Benin) and the humid coastal-south group (Calabar, Warri, Port Harcourt). Misalignment of crop calendars with local water availability is a principal driver of yield variability.
Incorporate climate uncertainty into irrigation scheme design life. The 95% prediction intervals on the 50-year projections span several hundred millimetres per year by the 2060s. Irrigation infrastructure with a 30–50 year design life should be sized for the upper bound of projected ET₀ minus the lower bound of projected rainfall, providing a conservative buffer against future deficits.
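The water-balance quantities used in this subsection can be illustrated with a short sketch. It assumes that effective rainfall is a fixed fraction of monthly rainfall and that the surplus is the sum of monthly excesses of rainfall over ET₀; both are assumptions, and the report's own definitions may differ. The monthly values are hypothetical.

```python
def water_balance(rain_mm, et0_mm, effective_fraction=0.8):
    """Return (irrigation requirement, surplus) in mm/year from 12 monthly values."""
    if len(rain_mm) != 12 or len(et0_mm) != 12:
        raise ValueError("expected 12 monthly values for rainfall and ET0")
    # Irrigation requirement: months in which ET0 exceeds effective rainfall
    iwr = sum(max(e - effective_fraction * p, 0.0) for p, e in zip(rain_mm, et0_mm))
    # Surplus: months in which rainfall exceeds ET0
    surplus = sum(max(p - e, 0.0) for p, e in zip(rain_mm, et0_mm))
    return iwr, surplus

# Hypothetical monthly rainfall and reference ET0 (mm), January to December
rain = [5, 10, 40, 90, 160, 200, 250, 280, 240, 120, 15, 5]
et0 = [150, 155, 170, 160, 140, 120, 110, 105, 115, 130, 140, 145]
iwr, surplus = water_balance(rain, et0)
print(f"IWR = {iwr:.0f} mm/yr, surplus = {surplus:.0f} mm/yr")
```

Working month by month matters: a city can show an annual surplus and still need irrigation in the dry season, which is why both quantities are reported separately.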
16.4 Urban drainage and stormwater management
Update urban drainage design standards to city-specific IDF curves. The IDF curves derived here — based on Gumbel quantiles and empirical sub-daily scaling — show that design intensities in Calabar and Port Harcourt for a T = 25-year event exceed those in Abuja by more than 30%. National drainage design codes (e.g., FMWH Drainage Manual) should incorporate city-differentiated IDF parameters rather than applying a single national curve.
Mandate T = 25-year design standards for all new urban drainage in Niger Delta cities. Given the elevated rainfall totals, high CV, and flood exposure of Port Harcourt, Warri, and Calabar, the current T = 10-year standard for secondary drainage is inadequate. The risk analysis shows that a T = 10-year culvert faces a 95.8% probability of being exceeded at least once within a 30-year service life.
Adopt nature-based stormwater solutions in Lagos. The city’s high impervious cover, combined with intense rainy-season rainfall, makes conventional grey infrastructure alone insufficient. Bioretention cells, permeable pavements, constructed wetlands, and green roofs — sized using the Rational Method outputs in this report — should be integrated into the Lagos Urban Drainage Master Plan.
Enforce developer-funded drainage impact assessments. All developments on catchments larger than 2 ha in any of the six cities should be required to demonstrate that post-development peak flows (using the IDF parameters derived here) do not exceed pre-development values for the T = 25-year event.
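The pre- versus post-development test in the last recommendation can be sketched with the Rational Method, Q = C·i·A/360 (Q in m³/s, i in mm/h, A in ha). The runoff coefficients, design intensity, and catchment area below are hypothetical, and the sketch uses the same intensity before and after development for simplicity, although in practice the time of concentration (and hence the design intensity) usually changes.

```python
def rational_peak_flow(c, intensity_mm_per_h, area_ha):
    """Peak discharge in m^3/s from the Rational Method: Q = C * i * A / 360."""
    return c * intensity_mm_per_h * area_ha / 360.0

def drainage_check(c_pre, c_post, intensity_mm_per_h, area_ha, attenuation_m3s=0.0):
    """Compare post-development peak flow (less any on-site attenuation) with pre-development."""
    q_pre = rational_peak_flow(c_pre, intensity_mm_per_h, area_ha)
    q_post = rational_peak_flow(c_post, intensity_mm_per_h, area_ha) - attenuation_m3s
    return q_post <= q_pre, q_pre, q_post

# Hypothetical 5 ha site: C rises from 0.30 to 0.75; T = 25-year intensity of 90 mm/h
ok, q_pre, q_post = drainage_check(0.30, 0.75, 90.0, 5.0)
print(ok, round(q_pre, 4), round(q_post, 4))
print("attenuation needed (m^3/s):", round(q_post - q_pre, 4))
```

In this example the post-development peak is 2.5 times the pre-development value, so the developer would need to provide detention or infiltration capacity to close the gap.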
16.5 Road design and construction planning
Schedule earthworks and pavement-layer construction within climatological workability windows. Abuja has 6 workable months (rainfall < 80 mm) compared to only 2 in Calabar. Contract programmes and programme float allowances for road projects in the Niger Delta must reflect the severely constrained earthwork season; failure to do so is a primary cause of schedule overruns and pavement defects.
Apply region-specific subgrade moisture correction factors in pavement design. The prolonged wet seasons in Calabar, Warri, and Port Harcourt produce sustained periods of subgrade saturation. Pavement design using the AASHTO or TRRL (Nigerian FMWH 2013) methods should apply a seasonal correction factor to the effective CBR based on the city-specific wet-season duration derived from the workability calendar.
Design cross-drainage structures to T = 25-year standards for trunk roads and T = 50-year for highways. Based on the risk analysis, a T = 25-year culvert on a 30-year design-life road carries a 70.6% exceedance risk, which is tolerable for trunk roads but not for higher-consequence routes. Federal highways and expressways should adopt T = 50-year culverts, reducing the risk to 45.5%.
Develop road-sector vulnerability indices by city. Combining the workability calendar, design rainfall, and projected 50-year rainfall uncertainty, each city can be assigned a road-sector climate vulnerability index. This index should guide prioritisation of maintenance budgets and climate-resilience upgrades under the Federal Roads Maintenance Agency (FERMA) capital programme.
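The culvert exceedance risks quoted for a 30-year service life follow from the standard relation R = 1 − (1 − 1/T)ⁿ, which assumes independent years and a stationary climate. A minimal sketch (the function name is illustrative):

```python
def exceedance_risk(return_period_yr, design_life_yr):
    """Probability of at least one exceedance of the T-year event in n years."""
    return 1.0 - (1.0 - 1.0 / return_period_yr) ** design_life_yr

# Risk over a 30-year service life for three candidate design standards
for T in (10, 25, 50):
    print(f"T = {T:>2} yr, 30-yr life: {exceedance_risk(T, 30):.1%}")
```

For T = 10, 25, and 50 years this gives 95.8%, 70.6%, and 45.5%, matching the figures cited in Sections 16.4 and 16.5.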
16.6 Dam engineering and water supply infrastructure
Use the Hershfield PMP estimates as a starting point for spillway design, with mandatory verification against daily-scale records. The PMP for Calabar is estimated at 11,118 mm/year (K = 15), implying a specific design flood of 70.77 m³/s per km². These values are conservatively high and should be refined with daily extreme-rainfall data and regional PMP studies (World Meteorological Organization, Manual for Estimation of Probable Maximum Precipitation, WMO-No. 332) before adoption in final spillway design.
Apply the Rippl mass-curve storage estimates as minimum capacity benchmarks for new reservoirs. The analysis indicates that a 100 km² catchment near Calabar requires at least 324.9 million m³ of live storage to guarantee continuous supply at mean demand. For Abuja, where inter-annual variability is higher relative to mean flow, the corresponding estimate is 127.5 million m³. These figures inform reservoir sizing at the feasibility stage.
Require operating-rule reviews for existing dams under widening rainfall uncertainty. The 95% PI width on the 50-year projections grows substantially with lead time. Dam safety reviews should assess whether existing spillway capacities and reservoir rule curves remain adequate under the upper bound of the 50-year projection scenario, in accordance with the requirements of the Nigeria Dam Safety Commission.
Integrate rainfall seasonality into reservoir operating rules. STL decomposition confirms a highly regular seasonal cycle in all six cities. Operating rules (draw-down and refill curves) should be synchronised with the empirical seasonal profile to maximise yield while maintaining flood-attenuation capacity in the months immediately preceding peak rainfall.
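The storage estimates in this subsection come from a Rippl mass-curve analysis. The sketch below uses the sequent-peak calculation, a tabular equivalent of the mass-curve construction, and is not necessarily the procedure used in the report. The inflow series and units are hypothetical.

```python
def sequent_peak_storage(inflows, demand):
    """Minimum live storage needed to supply a constant demand from a series of inflows.

    Inflows and demand share the same volume unit per time step. The running
    deficit (demand minus inflow, never below zero) is tracked, and its
    maximum is the required storage.
    """
    inflows = list(inflows)
    if demand > sum(inflows) / len(inflows):
        raise ValueError("demand exceeds mean inflow; no finite storage suffices")
    deficit = 0.0
    required = 0.0
    # Two passes over the record allow a critical period that wraps around its end
    for q in inflows * 2:
        deficit = max(0.0, deficit + demand - q)
        required = max(required, deficit)
    return required

# Hypothetical inflows (million m^3 per period) with mean demand of 20
inflows = [10, 30, 50, 20, 5, 5]
print(sequent_peak_storage(inflows, demand=20))  # 40.0
```

Here the critical period is the run of three low-inflow periods (5, 5, 10), whose combined shortfall against a demand of 20 is 40 units.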
16.7 Risk analysis and infrastructure design standards
Formally adopt risk-based return-period selection for infrastructure in Nigeria. The risk matrix in this report quantifies, for the first time in a six-city Nigerian context, how design-life duration interacts with return period to determine exceedance probability. The following minimum return periods are recommended by consequence class:
Table 16: Recommended minimum design return periods by infrastructure class (derived from risk analysis)
| Infrastructure class | Typical service life | Recommended T (yr) | Residual risk over service life |
|---|---|---|---|
| Minor field drains / farm tracks | 15 yr | 5 | 96.5% |
| Secondary urban drains / rural roads | 20 yr | 10 | 87.8% |
| Primary urban drains / trunk roads | 30 yr | 25 | 70.6% |
| Major bridges / expressways | 50 yr | 100 | 39.5% |
| Flood embankments / large culverts | 50 yr | 200 | 22.2% |
| Dam spillways (Class I / II dams) | 100 yr | 500 | 18.1% |
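The residual risks in Table 16 are consistent with the standard exceedance relation evaluated over each class's typical service life, assuming independent years and a stationary climate. Inverting the relation gives the return period needed to meet a target risk, which may be useful when a consequence class calls for a stricter standard. The symbols R, T, and n are introduced here for illustration.

```latex
% Risk of at least one exceedance of the T-year event in n years
R = 1 - \left(1 - \frac{1}{T}\right)^{n}

% Example from the table: T = 100 yr, n = 50 yr
R = 1 - 0.99^{50} \approx 0.395

% Return period required for a target risk R over n years
T = \frac{1}{1 - (1 - R)^{1/n}}

% Example: R = 0.10 over n = 50 yr gives T \approx 475 yr
```

The inverse form shows how quickly the required return period grows: holding the risk to 10% over a 50-year life already requires a design event of roughly 475 years.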
Establish a probabilistic climate risk register for each city. Drawing on the flood-frequency curves, PMP estimates, IWR projections, and 50-year rainfall scenarios, each of the six cities should maintain a quantitative climate risk register — updated whenever new rainfall data become available — that feeds into urban master plans, capital investment programming, and insurance/reinsurance pricing.
16.8 Policy and institutional recommendations
Embed the rainfall regime classification (PCA clusters) into federal infrastructure investment allocation. The two-cluster structure — humid coastal-south vs sub-humid north-central — should be formally recognised in capital allocation frameworks. Infrastructure standards appropriate for Calabar’s 3,000+ mm/year environment are materially different from those needed in Abuja’s 1,400 mm/year context; a single national standard applied uniformly misallocates resources and produces under-designed or over-designed structures.
Require climate-adjusted design reports for all federally funded infrastructure projects. Consistent with the requirements of the Federal Ministry of Environment’s National Adaptation Plan, all design reports for water-related infrastructure costing more than ₦500 million should include a climate risk section documenting the return period adopted, residual exceedance risk over the design life, and sensitivity to the upper bound of the 50-year rainfall projection.
Promote inter-agency data sharing between NiMet, NIWA, and state water authorities. The analytical framework in this report — from STL decomposition through ARIMA forecasting to the Rational Method application — can only be routinely applied if agencies share hydrological and meteorological data in standardised, interoperable formats. A national hydrometeorological data platform, modelled on the WMO WHOS initiative, is strongly recommended.
Invest in climate-literacy capacity within state engineering departments. The methods applied in this report (extreme-value statistics, stationarity testing, probabilistic forecasting) are not yet widely used in Nigerian state-level engineering practice. Structured training programmes — delivered through the Nigerian Society of Engineers (NSE), COREN, and university continuing-education programmes — would raise the quality of hydrological design practice nationally.
17 Limitations and further work
17.1 Limitations
Synthetic data: The rainfall series are generated from climatological baselines, not real station records. While statistically consistent with published normals, they cannot capture unrecorded extremes, data-quality issues, or non-stationarities in the true historical record.
Univariate forecasting: ARIMA and ETS models do not incorporate physically meaningful drivers (SST, ITCZ position, ENSO indices). Long-range projections therefore represent extrapolation of historical variability only.
No trend-in-variance: The modelling framework assumes homoscedastic errors. Potential increases in rainfall variability under climate change are not captured.
Six-city scope: The analysis covers only six cities, providing a spatially coarse picture of a climatologically diverse country.
Random Forest generalisation: The classifier was trained on synthetic data and may not transfer directly to real-world station records without re-training.
17.2 Recommended further work
Integrate NiMet station records and NOAA GSOD (Global Surface Summary of the Day) data to replace synthetic series with observed values.
Couple with CMIP6 climate projections to impose physically motivated trends on the forecast framework.
Spatial interpolation (kriging or machine-learning spatial models) to produce gridded rainfall surfaces across Nigeria.
Extreme-value analysis (GEV or GPD distributions) on daily peak rainfall to support infrastructure design.
ENSO and IOD teleconnection analysis to improve seasonal-scale predictability.
Extend the classifier to a multi-class framework (severe, moderate, normal, dry) and validate on independent NiMet data.
18 References
The following key references informed the methodological and climatological basis of this report:
Adeyeri, O. E., Lawin, A. E., Laux, P., Ishola, K. A., & Ige, S. O. (2019). Analysis of climate variability and surface meteorological parameters in the Komadugu-Yobe Basin, Lake Chad region. Journal of Water and Climate Change, 10(4), 683–699.
Ati, O. F., Stigter, C. J., & Oladipo, E. O. (2002). A comparison of methods to determine the onset of the growing season in northern Nigeria. International Journal of Climatology, 22(6), 731–742.
Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts. https://otexts.com/fpp3/
Intergovernmental Panel on Climate Change. (2021). Climate Change 2021: The Physical Science Basis. Cambridge University Press.
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
Shapley, L. S. (1953). A value for n-person games. In H. Kuhn & A. Tucker (Eds.), Contributions to the Theory of Games (Vol. 2, pp. 307–317). Princeton University Press.
19 Appendix
19.1 A: AI usage statement
This report was produced with the assistance of Posit Assistant, a large-language-model AI tool integrated into the RStudio IDE. The AI was used for the following tasks:
| Task | Role of AI | Human oversight |
|---|---|---|
| Code generation | Drafted R code for all analytical sections | Reviewed, tested, and executed by the author |
| Narrative drafting | Produced first-draft text for all prose sections | Edited and approved by the author |
| Report structure | Proposed the section outline | Reviewed and modified by the author |
| Statistical interpretation | Provided initial interpretations | Verified against known climatological literature |
All AI-generated content was directed by and remains the intellectual responsibility of the named author. No data were fabricated by the AI; all computations were performed in R using established statistical packages. The AI did not have access to proprietary data and operated solely on the datasets described in this report.
AI system: Posit Assistant (powered by Anthropic Claude)
Date of assistance: May 2026
Figure 37: ROC curve with AUC — Random Forest extreme-rainfall classifier
19.4 D: SHAP feature importance
SHAP (SHapley Additive exPlanations) values decompose each model prediction into contributions from individual features in a theoretically grounded, model-agnostic manner. The mean absolute SHAP value for each feature provides a global importance ranking.
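The global ranking described here is a simple aggregation once per-prediction SHAP values are available. The sketch below assumes the SHAP values have already been computed by a SHAP implementation; the feature names and numbers are hypothetical and are not results from this report.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across all predictions."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    # Largest mean |SHAP| first
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical SHAP values: one row per prediction, one column per feature
features = ["month_sin", "rain_lag12", "rain_lag1"]
shap_rows = [
    [0.30, -0.10, 0.02],
    [-0.25, 0.15, -0.01],
    [0.20, -0.05, 0.03],
]
ranking = mean_abs_shap(shap_rows, features)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking the absolute value before averaging matters: a feature that pushes some predictions up and others down would otherwise appear unimportant because its contributions cancel.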