This report documents the current state of the Reservoir Operations workstream (Part 2 of the Transboundary Dams project). The central question is whether transboundary dams — dams whose downstream river crosses an international border — are operated more aggressively than otherwise-comparable domestic dams, and whether that excess intensifies during periods of geopolitical dispute.
The analysis uses satellite-derived data on reservoir storage (GRSAD surface area, GREALM water height) as proxies for dam operations, climate-residualized to isolate human decisions from weather-driven variation. Three identification strategies are used: (1) a cross-sectional dose-response design exploiting variation in downstream border distance; (2) a within-cascade panel design for the Lancang–Mekong; and (3) an event study around known geopolitical flashpoints.
GRSAD: Monthly satellite-derived reservoir surface area (km²) from Landsat imagery, for ~7,200 reservoirs worldwide. Linked to GRanD via reservoir ID (id); matched to GDAT via a GRanD→GDAT crosswalk.

Baseline period: 1984–2000 (defines the seasonal climatology used to construct anomalies).

Reservoirs without baseline coverage use a within-sample z-score (scale(area_km2) over the full operational series) as a fallback volatility measure, flagged as vol_source = "within_sample_grsad".

GREALM: Satellite altimetry water surface height (m) from NASA/CNES radar altimeters. Monthly and 10-day resolution, keyed by lake_id. Used for dams not in GRSAD or for case study dams where height is preferable to area.
| Dam | Basin | Role | GDAT fid | GREALM lake_id | Data source | Baseline available |
|---|---|---|---|---|---|---|
| High Aswan | Nile | Downstream | 313 | 331 | GRSAD + GREALM | Yes (1984–2000) |
| GERD | Nile | Upstream | 345 | 1296 | GREALM only | No (filled 2020) |
| Nuozhadu | Mekong | Upstream | 6511 | 1979 | GRSAD + GREALM | No (completed 2014) |
| Xiaowan | Mekong | Upstream | 6610 | 1992 | GRSAD + GREALM | Partial |
Note on GERD: GERD is not in GRanD v1.3 and has no GRSAD coverage. All GERD analysis uses GREALM lake_id 1296 (identified as the closest GREALM lake to GERD’s coordinates at 11.2°N, 35.1°E, distance ~0.21°). Because GERD began filling in July 2020, there is no baseline period; within-sample z-scores are used throughout.
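The within-sample z-score used for GERD can be illustrated with a minimal sketch. The pipeline itself is in R and uses scale(); the Python below is only an illustration, and the function name and the toy height values are hypothetical.

```python
from statistics import mean, stdev

def within_sample_z(values):
    """Center on the series mean and scale by the sample SD (n - 1), as R's scale() does."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical water heights (m) for a short post-fill series.
heights = [560.0, 565.0, 575.0, 580.0, 570.0]
z = within_sample_z(heights)
```

By construction the result has mean zero and unit variance over the series it was computed on, which is why it measures deviation from the dam's own post-construction average and not from a pre-dam baseline.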
For each reservoir \(i\) and calendar month \(m\), we construct a baseline seasonal climatology over 1984–2000, where \(t\) indexes monthly observations:
\[\bar{x}_{im} = \text{median}(x_{it} : \text{month}(t) = m,\ t \in \text{baseline}), \quad \sigma_{im} = \text{sd}(x_{it} : \text{month}(t) = m,\ t \in \text{baseline})\]
The raw anomaly is \(z_{it} = (x_{it} - \bar{x}_{im}) / \sigma_{im}\), where \(x_{it}\) is surface area (km², GRSAD) or water height (m, GREALM).
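As a rough illustration of the climatology and anomaly definitions above, the following Python sketch computes the per-month baseline median and SD and then the raw z-score. It is not the pipeline code (which is in R); the function names, the tuple layout, and the toy values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median, stdev

def baseline_climatology(obs, start=1984, end=2000):
    """obs: list of (year, month, value). Returns {month: (median, sd)} over the baseline."""
    by_month = defaultdict(list)
    for year, month, value in obs:
        if start <= year <= end:
            by_month[month].append(value)
    # A month needs at least two baseline values for the SD to be defined.
    return {m: (median(v), stdev(v)) for m, v in by_month.items() if len(v) >= 2}

def raw_anomaly(obs, clim):
    """z_it = (x_it - median_im) / sd_im; months without a usable climatology are skipped."""
    out = []
    for year, month, value in obs:
        if month in clim and clim[month][1] > 0:
            med, sd = clim[month]
            out.append((year, month, (value - med) / sd))
    return out

# Toy series: July area is 10, 12, 14 km² in three baseline years, then 18 km² in 2005.
obs = [(1990, 7, 10.0), (1991, 7, 12.0), (1992, 7, 14.0), (2005, 7, 18.0)]
clim = baseline_climatology(obs)
z = raw_anomaly(obs, clim)
```

In the toy data the July baseline has median 12 and SD 2, so the 2005 value of 18 maps to an anomaly of 3.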
To isolate the human operational signal from climate-driven variation, we residualize within each reservoir on ERA5-Land catchment runoff anomalies, precipitation, temperature anomalies, and month fixed effects:
\[z_{it} = \alpha_i + \beta_1\, \text{runoff}_{it} + \beta_2\, \text{precip}_{it} + \beta_3\, \text{temp}_{it} + \gamma_m + \varepsilon_{it}\]
The residual \(\hat{\varepsilon}_{it}\) is our primary measure of operational anomaly: reservoir behavior unexplained by climate. Residualization requires at least 24 observations per reservoir.
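A minimal sketch of the per-reservoir residualization, assuming ordinary least squares with an intercept (playing the role of \(\alpha_i\) for a single reservoir), the three climate covariates, and month dummies. The pipeline is in R; this pure-Python version, including the helper names and the synthetic data, is hypothetical and only meant to make the regression step concrete.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_residuals(X, y):
    """Residuals from regressing y on the columns of X via the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]

def residualize(rows, min_obs=24):
    """rows: dicts with keys z, runoff, precip, temp, month for ONE reservoir."""
    if len(rows) < min_obs:
        return None  # too few observations to residualize
    months = sorted({r["month"] for r in rows})
    X = []
    for r in rows:
        dummies = [1.0 if r["month"] == m else 0.0 for m in months[1:]]  # first month is the base
        X.append([1.0, r["runoff"], r["precip"], r["temp"]] + dummies)
    return ols_residuals(X, [r["z"] for r in rows])

# Synthetic three-year monthly series (all values are made up for illustration).
rows = [
    {"month": t % 12 + 1,
     "runoff": math.sin(0.7 * t),
     "precip": math.cos(1.3 * t),
     "temp": math.sin(0.37 * t + 1.0),
     "z": 0.8 * math.sin(0.7 * t) - 0.3 * math.cos(1.3 * t) + 0.1 * math.sin(2.1 * t)}
    for t in range(36)
]
eps = residualize(rows)
```

The residuals sum to zero and are orthogonal to each covariate by construction, which is the sense in which they are "unexplained by climate" under this linear specification.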
From \(\hat{\varepsilon}_{it}\) we construct three reservoir-level summary statistics:
| Statistic | Definition | Interpretation |
|---|---|---|
| vol_annual_sd | Mean annual SD of \(\hat{\varepsilon}_{it}\) | Primary outcome: operational volatility |
| frac_extreme | Fraction of months with \(\lvert\hat{\varepsilon}_{it}\rvert > 1.5\) | Frequency of large deviations |
| dry_bias | Mean of negative anomalies | Systematic tendency toward low storage |
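The three summary statistics can be sketched as follows. This is an illustrative Python version under the definitions in the table, not the R pipeline code; the function name and the toy residuals are hypothetical, and years with fewer than two observations are skipped here because their SD is undefined.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize(resid):
    """resid: list of (year, eps) for one reservoir. Returns the three summary statistics."""
    by_year = defaultdict(list)
    for year, eps in resid:
        by_year[year].append(eps)
    annual_sds = [stdev(v) for v in by_year.values() if len(v) >= 2]
    negatives = [e for _, e in resid if e < 0]
    return {
        "vol_annual_sd": mean(annual_sds) if annual_sds else None,
        "frac_extreme": sum(abs(e) > 1.5 for _, e in resid) / len(resid),
        "dry_bias": mean(negatives) if negatives else None,
    }

# Toy residuals: a calm year followed by a volatile year.
resid = [(2001, -1.0), (2001, 1.0), (2002, -2.0), (2002, 2.0)]
stats = summarize(resid)
```

In the toy series half of the months exceed the 1.5 threshold in absolute value, and the mean of the negative anomalies is -1.5.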
These figures show the raw reservoir storage time series and standardized anomaly for each case study dam. The top panel shows the raw measure (area in km² or height in m). The bottom panel shows the z-score relative to baseline (or within-sample z for post-2000 dams), with the climate-residualized version highlighted in red/orange.
High Aswan Dam: GRSAD surface area and standardized anomaly (GRSAD)
High Aswan Dam: GREALM water height anomaly
Interpretation: High Aswan is the oldest dam in the sample (completed 1970) and the downstream unit in the Nile case. Baseline-normalized data are available from 1984. The climate-residualized anomaly captures operational decisions by the Egyptian government, distinct from Nile flood regime variation.
GERD (Grand Ethiopian Renaissance Dam): GREALM water height, within-sample z-score (no baseline period available)
Interpretation: GERD’s GREALM series begins when filling started in 2020. Three distinct filling phases are visible (July 2020, July 2021, July 2022), corresponding to the three annual rainy seasons during which Ethiopia impounded additional water. The z-score is within-sample (no pre-construction baseline exists), so the “anomaly” measures deviation from the dam’s own average post-construction behavior. This is not directly comparable to baseline-normalized z-scores at High Aswan. The within-sample scaling means the series is centered on the mean and scaled to unit variance over 2020–present.
Nuozhadu Dam: GRSAD surface area and anomaly
Nuozhadu Dam: GREALM height anomaly (within-sample z)
Interpretation: Nuozhadu (completed 2014) is the largest dam in the Lancang cascade, with a storage capacity of ~23,000 MCM. It lies ~1,000 km upstream of the Chinese–Myanmar border. The GRSAD series shows relatively flat area (the reservoir is large and area is a poor proxy for volume at large reservoirs); the GREALM height series is more informative. No baseline period overlap: filled after 2000.
Xiaowan Dam: GRSAD surface area and anomaly
Xiaowan Dam: GREALM height anomaly
Interpretation: Xiaowan (completed 2010) lies upstream of Nuozhadu in the Lancang cascade. It has partial baseline overlap and benefits from climate residualization over its pre-2000 construction period. The climate-residualized anomaly isolates operational behavior from Yunnan monsoon dynamics.
These figures overlay the operational anomalies of the upstream treated dam and the downstream (control/affected) reservoir at the same time scale. Periods where the upstream dam shows high storage (positive anomaly) while the downstream unit shows low storage (negative anomaly) are candidates for operational divergence driven by retention.
Nile: GERD (upstream, blue) vs. High Aswan (downstream, red) monthly operational anomalies. ~68 overlapping months (GERD post-fill era, Jul 2020–present).
Interpretation and caveats: The overlapping period covers 2020–present (GERD’s post-filling era only). The two series are not on the same scale: GERD uses within-sample z, while High Aswan uses climate-residualized baseline z. Direct comparison of levels is therefore not straightforward. The shading threshold (upstream z > 0.5 AND downstream z < −0.5 simultaneously) is never triggered in the current data — likely because the different normalizations put the series on incompatible scales. Decision needed: normalize both to a common metric (e.g., both within-sample z, or both percentile rank) before interpreting divergent episodes.
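One way to put both series on a common metric is the percentile-rank option mentioned above. The sketch below is a hypothetical Python illustration (the pipeline is in R): each series is converted to its own within-series percentile rank, and a month is flagged when the upstream rank is high while the downstream rank is low. The 0.75 and 0.25 cutoffs are placeholders chosen for the example, not values from the analysis.

```python
def percentile_rank(values):
    """Map each value to its mid-rank percentile in (0, 1) within its own series."""
    n = len(values)
    ranks = []
    for v in values:
        below = sum(x < v for x in values)
        ties = sum(x == v for x in values)
        ranks.append((below + 0.5 * ties) / n)
    return ranks

def divergent_months(upstream, downstream, hi=0.75, lo=0.25):
    """Indices where upstream storage is unusually high and downstream unusually low."""
    up, down = percentile_rank(upstream), percentile_rank(downstream)
    return [i for i, (u, d) in enumerate(zip(up, down)) if u > hi and d < lo]

# Toy anomalies on deliberately different scales.
upstream = [0.1, 0.4, 2.0, 0.3]
downstream = [5.0, 4.0, -3.0, 6.0]
flags = divergent_months(upstream, downstream)
```

Because ranks are invariant to the scale of each series, the flag no longer depends on whether a dam was normalized against a baseline or within sample.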
Mekong: Xiaowan (upstream, blue) vs. Nuozhadu (downstream, red) monthly operational anomalies. 442 overlapping months.
Interpretation: The Mekong series covers 1985–present for Xiaowan (partial baseline, completed 2010) but only 2014–present for Nuozhadu (within-sample z, completed 2014). There is no pre-construction Nuozhadu series, so the figure does not provide a true pre-trend check — only Xiaowan’s pre-2010 behavior is visible. Post-2014 (both dams operational), some divergence is visible but the shading threshold is also never triggered. The same scale-incompatibility issue applies for months where within-sample z is used.
The cross-section exploits variation in downstream border distance (\(d_i\) = along-river km from dam to nearest international border) as a continuous treatment intensity. The estimating equation is:
\[V_i = \alpha + \beta \cdot f(d_i) + \gamma X_i + \delta_{\text{purpose}} + \delta_{\text{basin}} + \varepsilon_i\]
where \(V_i\) is operational volatility (vol_annual_sd), \(f(d_i)\) is either linear or log-linear in border distance, \(X_i\) includes log reservoir capacity and dam age, and fixed effects absorb purpose (hydropower/irrigation/flood control) and basin heterogeneity. Standard errors are clustered by river system.
Control group construction: For each basin, controls are drawn from different operator countries in hydrologically isolated basins (see Identification section below).
Sample: 139 dams in total; 39 have non-NA
dam_dist_border from
rivers_dams_purpose.gpkg. The remaining 100 are
non-transboundary controls that lack border distance by
construction.
Known limitations:

- GERD (fid=345) is not in rivers_dams_purpose.gpkg and has no dam_dist_border. It is excluded from the regression even though it is the primary treated dam in the Nile basin.
- Kishanganga (fid=7993, completed 2018) has no GRSAD/GREALM data and is not in the sample.
- Hrusov (fid=33153) is used as the Danube treated dam, as a proxy for Gabčíkovo (same dam system, 0.52° apart, dam_dist_border = 0).
- log_capacity is imputed to 0 for 133/139 dams (most GDAT dams lack volume documentation); a cap_missing indicator is included.
| Specification | dam_dist_border | log_dam_dist_border | dam_age | N | Within R² |
|---|---|---|---|---|---|
| (1) | -0.0003* (0.0001) | — | -0.0006* (0.0002) | 39 | 0.044 |
| (2) | — | -0.0123* (0.0040) | -0.0004* (0.0001) | 39 | 0.019 |
| (3) | -0.0003* (0.0001) | — | -0.0006* (0.0002) | 39 | 0.044 |
| (4) | — | -0.0123* (0.0040) | -0.0004* (0.0001) | 39 | 0.019 |
| (5) | −4.7×10⁻⁵ (4.0×10⁻⁵) | — | -0.0006** (3×10⁻⁵) | 38 | 0.095 |
| (6) | −8×10⁻²¹ (4×10⁻²⁰) | — | 1.4×10⁻¹⁸ (—) | 38 | 0.051 |
Dose-response: predicted operational volatility by downstream border distance (log scale). Points = individual dams; line = predicted from log-linear specification.
Main finding: Border distance has a negative and statistically significant coefficient in both the linear (−0.0003, p<0.05) and log-linear (−0.0123, p<0.05) specifications. The sign means that dams closer to an international border have higher operational volatility, consistent with the strategic excess hypothesis. The log-linear coefficient implies that a 10% increase in border distance is associated with a decrease of about 0.0012 units in vol_annual_sd (−0.0123 × ln 1.1 ≈ −0.0012, roughly 0.2% of the mean).
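The 10% figure follows from the log-linear form: a proportional change in border distance shifts predicted volatility by the coefficient times the log of the ratio. The arithmetic can be checked with a short Python sketch; the variable names are illustrative, and only the coefficient comes from the table.

```python
import math

beta_log = -0.0123                    # log-linear coefficient on border distance (from the table)
exact = beta_log * math.log(1.10)     # change in vol_annual_sd for a 10% increase in distance
first_order = beta_log * 0.10         # common approximation: ln(1 + x) is close to x for small x

print(f"exact: {exact:.5f}, first-order: {first_order:.5f}")
```

The two versions differ only in the fifth decimal place, so either is adequate at the precision reported in the text.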
Dam age: The negative coefficient on
dam_age (dams built longer ago are less volatile) may
reflect regulatory maturation, sediment infilling, or systematic
differences in data quality for older reservoirs. It is not central to
the transboundary argument but important to control for.
Entropy balancing (EB): The EB coefficients are identical to OLS, which indicates the control group was already reasonably balanced on observables before reweighting. This is reassuring for internal validity but worth checking more carefully with the full 139-observation sample.
Caveats to interpret carefully:

- N=39 is small. With purpose + basin FE, degrees of freedom are limited.
- Several primary treated dams of interest (GERD, Kishanganga, Gabčíkovo proper) are absent from this regression due to data gaps. The result is identified primarily from variation among control dams plus Baglihar (with within-sample vol) and Hrusov.
- The dry_bias outcome (col 6) shows negligible and statistically meaningless coefficients: the point estimate is essentially zero and the SE is numerically unstable. This outcome may need to be redefined or dropped.
| Basin | dam_dist_border | dam_age | N | R² |
|---|---|---|---|---|
| Pooled | -0.0003 (6.7×10⁻⁵) | -0.0005 (0.0005) | 38 | 0.045 |
Note: The by-basin table from the pipeline currently collapses to a single row (38 obs) because individual-basin samples are too small to separately absorb purpose and basin FE. Once GERD border distance and Baglihar data quality issues are resolved, individual-basin estimates will be meaningful.
For the Lancang cascade, we cannot use cross-country controls (all major Chinese dam operators share the same ultimate state principal). Instead, we exploit within-cascade variation in border proximity: dams closer to the Chinese–Myanmar border should exhibit higher operational anomalies during periods of diplomatic tension, after controlling for cascade position via dam fixed effects.
\[\hat{\varepsilon}_{it} = \alpha_i + \beta \cdot \text{borderdist}_i \times \text{Post}_t + \delta_t + \nu_{it}\]
where \(\text{Post}_t\) is an indicator for periods of Mekong diplomatic tension. Three event windows are tested: the 2019 drought (when Stimson Center satellite data revealed Chinese retention), the post-Stimson political response, and any tension period.
| Specification | borderdist × Pre | borderdist × Post | N | Within R² |
|---|---|---|---|---|
| 2019 Drought | −0.089 (0.084) | — (dropped: collinear) | 1115 | 0.123 |
| Post-Stimson | −0.045 (0.038) | — (dropped: collinear) | 1115 | 0.016 |
| Any Tension | 0.201 (0.092) | 0.283* (0.075) | 1115 | 0.130 |
Mekong cascade: operational anomaly time series by dam, with tension periods shaded.
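Main finding: In the “any tension” specification, the borderdist × Post interaction is positive and significant (0.283, p<0.05). Read as a proximity effect, this suggests that dams closer to the border have higher operational anomalies during tension periods relative to non-tension periods. That reading requires borderdist to be coded so that larger values mean closer to the border; if it is distance in km, as in the cross-section, a positive coefficient instead means anomalies rise with distance, so the coding should be confirmed before interpreting the sign. The pre-tension coefficient (0.201) is also large, which raises a concern about pre-existing differences.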
Critical caveat — collinearity: The 2019 drought and
post-Stimson specifications drop the Post interaction due
to collinearity. This occurs because Nuozhadu (fid=6511) and
Xiaowan (fid=6610) — the two most downstream cascade dams and thus most
relevant for border-distance variation — are absent from
rivers_dams_purpose.gpkg, so their
dam_dist_border is NA. The border distance variation in the
panel comes only from the 3 upper-cascade dams (Jinfeng, Manwan,
Dachaoshan), which may not cover the post-2019 drought period well.
Status: The within-cascade design is currently unidentified for the primary event windows. This needs to be resolved either by (a) manually adding Nuozhadu and Xiaowan border distances from the raw GDAT gpkg, or (b) accepting that the “any tension” result is the only feasible specification given the sample structure.
To test whether operational excess intensifies during active geopolitical disputes, we estimate:
\[\hat{\varepsilon}_{it} = \sum_{\substack{k=-K \\ k \neq -1}}^{K} \beta_k \cdot \mathbf{1}[t = t^* + k] + \alpha_i + \delta_{\text{event}} + \nu_{it}\]
where \(t^*\) is the event date, \(K=24\) months, \(\alpha_i\) are dam fixed effects, and \(\delta_{\text{event}}\) are event fixed effects. Reference month: \(k = -1\).
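To make the event-time indicators concrete, the sketch below builds the dummy vector for one observation with the reference month omitted. It is a hypothetical Python illustration of the specification, not the R pipeline code; dates are represented as (year, month) tuples for simplicity.

```python
def months_between(event, date):
    """Signed month difference between two (year, month) tuples."""
    return (date[0] - event[0]) * 12 + (date[1] - event[1])

def event_time_bins(K=24, ref=-1):
    """Event-time values that get their own coefficient (reference month omitted)."""
    return [k for k in range(-K, K + 1) if k != ref]

def dummy_row(date, event, K=24, ref=-1):
    """Indicator vector for one observation, or None if it falls outside the window."""
    k = months_between(event, date)
    if abs(k) > K:
        return None
    return [1 if k == b else 0 for b in event_time_bins(K, ref)]

bins = event_time_bins()
row = dummy_row((2021, 9), (2021, 7))        # two months after a July 2021 event
ref_row = dummy_row((2021, 6), (2021, 7))    # the reference month itself
```

An observation in the reference month has an all-zero row, so each \(\beta_k\) is measured relative to \(k = -1\). If no observation exists in that month, as for the first GERD fill, the coefficients for that event have no anchor.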
Events included:
| Event | Date | Dam | Data |
|---|---|---|---|
| GERD First Fill | 2020-07-01 | GERD (GREALM 1296) | ⚠️ per-event fails (ref=−1 missing) |
| GERD Second Fill | 2021-07-01 | GERD (GREALM 1296) | ✅ |
| GERD Third Fill | 2022-07-01 | GERD (GREALM 1296) | ✅ |
| Mekong Drought | 2019-01-01 | Nuozhadu (GREALM 1979) | ✅ |
| ICJ Gabčíkovo | 1992-10-01 | Gabčíkovo | ❌ no data |
Total sample: 77 observations across 4 events (pooled model).
Pooled event study: operational anomaly (within-sample z) relative to event date, ±24 months. Reference = −1 month. Dam + event FE. Coefficients from pooled model across GERD fill events and Mekong 2019 drought.
Per-event event study: same specification estimated separately for each event. GERD First Fill (Event 1) fails due to missing pre-event data.
Pooled result: The event study shows a positive trend in operational anomalies in the post-period relative to the reference month (−1). Coefficients rise from near zero at \(k=0\) to large positive values by \(k=14\)–\(17\) (approximately 44–56 units of within-sample z). However, the standard errors in the pooled table are essentially zero (on the order of \(10^{-13}\)), which indicates a numerical degeneracy: with \(K=24\) the model has 48 event-time coefficients plus dam and event fixed effects for only 77 observations, so it fits the data almost exactly. This is a fundamental power problem, not a finding.

GERD First Fill failure: Event 1 (2020-07-01) fails in the per-event model (the error is caught by tryCatch in the pipeline) because GERD’s GREALM series begins after filling started: June 2020 (rel_month = −1, the reference) has no observation. Decision needed: Use \(K=12\) instead of \(K=24\) (reduces obs requirement), or change the reference period to \(k=0\) for post-construction events.
Bottom line: The event study design is correct in concept but currently under-powered. The fix requires either (a) expanding the event set substantially to get more obs per bin, or (b) reducing K, or (c) using a simpler pre/post DiD around events rather than the full event study coefficient vector.
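The power problem can be seen from a simple count of event-time coefficients against the pooled sample size. The Python sketch below is illustrative only; it holds the sample at the 77 observations reported above, so the ratios for smaller windows are upper bounds, since narrowing the window also drops observations.

```python
def event_time_coefficients(K):
    """Number of beta_k terms for k in [-K, K] with one reference month omitted."""
    return 2 * K

n_obs = 77  # pooled sample size reported above

# Observations per event-time coefficient, before counting dam and event fixed effects.
ratios = {K: n_obs / event_time_coefficients(K) for K in (24, 12, 6)}
for K, ratio in ratios.items():
    print(f"K={K}: {event_time_coefficients(K)} coefficients, {ratio:.2f} obs per coefficient")
```

With fewer than two observations per coefficient at \(K=24\), the fixed effects leave almost no residual variation, which is consistent with the near-zero standard errors.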
Treatment: Continuous downstream border distance
\(d_i\) (along-river km, from
rivers_dams_purpose.gpkg).
Control group logic (per basin):
| Basin | Treated dam(s) | Control countries | Control selection |
|---|---|---|---|
| Nile | GERD (Ethiopia) | Kenya, Uganda, Tanzania | Same East African climate; no Blue Nile basin overlap |
| Indus | Baglihar, Kishanganga (India) | India (Godavari, Mahanadi basins) | Different river basin; state-level operators unconnected to Indus geopolitics |
| Danube | Hrusov/Gabčíkovo (Slovakia) | Czech Republic, Austria | Different countries; no Danube basin overlap in gpkg controls |
Entropy balancing: Controls reweighted to match treated moments on log capacity, purpose, dam age. EB weights are estimated but coefficients were unchanged post-weighting (good sign for initial balance).
Identifying assumption: Conditional on purpose FE, basin FE, log capacity, and dam age, downstream border distance is as good as randomly assigned — two otherwise-identical dams differ in operational volatility only because one faces a geopolitical incentive to retain water.
Plausibility: The companion Part 3 (strategic placement) finding bears directly on this assumption: if transboundary dams are deliberately sited near borders, the cross-section compares intentionally-placed treated dams against engineering-optimal domestic dams, and under that reading \(\hat{\beta}\) is a lower bound on true operational excess.
| Check | Status |
|---|---|
| Pre-trend test (event study) | ⚠️ Fails due to data sparsity (see above) |
| Dose-response monotonicity | Not yet tested (tercile split) |
| Placebo basins (Amazon, Congo) | Not yet implemented |
| Purpose heterogeneity (hydropower × border dist) | Table generated (tab_resops_purpose_het.tex), not yet reviewed |
| Connection to Part 1 (discharge prediction) | Not yet implemented |
| Component | Status |
|---|---|
| GRSAD anomalies (Steps 1–2) | ✅ 7,246 reservoirs; 6,399 residualized |
| GREALM anomalies (Steps 3–4) | ✅ 519 lakes; 177 residualized |
| Within-sample vol fallback (Step 6A-WS) | ✅ 847 post-2000 dams added |
| Figure 1 time series (all 4 case dams) | ✅ Saved |
| Figure 2 upstream-downstream (Nile + Mekong) | ✅ Saved (scaling issue noted) |
| Cross-section regression (Steps 7A–7B) | ✅ N=39; negative border-distance coefficient |
| Mekong cascade (Step 8) | ⚠️ Partial — “any tension” result only |
| Event study pooled (Step 9) | ⚠️ Runs but SEs degenerate; K=24 too large |
1. GERD border distance: GERD (fid=345) is not in
rivers_dams_purpose.gpkg. Should it be added manually? GERD
is clearly transboundary (downstream border Ethiopia–Sudan ≈ 100 km). If
added manually, it would enter the cross-section regression as the
primary treated observation in the Nile basin.
2. Mekong cascade collinearity: Nuozhadu and Xiaowan
lack dam_dist_border in the gpkg. Options: (a) add manually
from raw GDAT coordinates; (b) accept “any tension” result only; (c)
drop the within-cascade design.
3. Event study window K: Currently K=24 → 77 obs → degenerate SEs. Reduce to K=12? Or change reference period for GERD events (ref=0 instead of ref=−1)?
4. Figure 2 scale: Both upstream and downstream series should be on the same anomaly scale for meaningful comparison. Options: (a) both within-sample z; (b) both percentile rank; (c) normalize relative to their own post-treatment period.
5. Baglihar within-sample vol:
vol_source = "within_sample_grsad" for Baglihar (completed
2009; vol_annual_sd = 0.788). Is this acceptable for the paper? Needs a
note that post-2000 dams use within-sample normalization.
6. Validity checks (Step 10): Placebo basins,
dose-response monotonicity, and the connection to Part 1 discharge
anomalies are not yet implemented. Step 10E also requires
join_grsad_to_panel.R to be run first.
| Output | Location |
|---|---|
| Figures | Output/Figures/Reservoir_Operations/ |
| Regression tables (LaTeX) | Output/Tables/tab_resops_*.tex |
| resops_dam_summary.rds | Data/Intermediate/Reservoir_Operations/ |
| grsad_area_anomalies_resops.csv | Data/Intermediate/Reservoir_Operations/ |
| grealm_height_anomalies_monthly.csv | Data/Intermediate/Reservoir_Operations/ |
| Main analysis script | Code/reservoir_operations.R |
| Cluster run wrapper | run_resops.R (repo root) |
| Diagnostic script | run_diagnostics.R (repo root) |
| Pipeline review log | review_log_2026-03-14.md (repo root) |
| Methodology writeup | Writeup/res_ops_empirical.tex |
Report generated: 2026-03-14. Run
rmarkdown::render("res_ops_progress_report.Rmd") from the
repo root to regenerate.