Reservoir Operations: Progress Report

Overview

This report documents the current state of the Reservoir Operations workstream (Part 2 of the Transboundary Dams project). The central question is whether transboundary dams — dams whose downstream river crosses an international border — are operated more aggressively than otherwise-comparable domestic dams, and whether that excess intensifies during periods of geopolitical dispute.

The analysis uses satellite-derived data on reservoir storage (GRSAD surface area, GREALM water height) as proxies for dam operations, climate-residualized to isolate human decisions from weather-driven variation. Three identification strategies are used: (1) a cross-sectional dose-response design exploiting variation in downstream border distance; (2) a within-cascade panel design for the Lancang–Mekong; and (3) an event study around known geopolitical flashpoints.

Data Sources

GRSAD — Global Reservoir Surface Area Dataset

Monthly satellite-derived reservoir surface area (km²) from Landsat imagery, for ~7,200 reservoirs worldwide. Linked to GRanD via reservoir ID (id); matched to GDAT via a GRanD→GDAT crosswalk. Baseline period: 1984–2000 (defines the seasonal climatology used to construct anomalies).

Coverage: 7,246 GRSAD reservoirs; 6,399 successfully residualized against climate
Post-2000 dams: Reservoirs completed after ~2000 have no baseline observations and cannot be baseline-normalized. For these ~847 dams, we compute a within-sample z-score (scale(area_km2) over the full operational series) as a fallback volatility measure, flagged as vol_source = "within_sample_grsad".
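
The within-sample fallback can be illustrated with a minimal sketch. The pipeline itself is in R and uses scale(); the Python below only mirrors that calculation (center on the series mean, divide by the sample SD). The function name and the example values are hypothetical, not taken from the pipeline or from GRSAD.

```python
from statistics import mean, stdev

def within_sample_z(values):
    """Center on the series mean and scale by the sample SD (n - 1 denominator),
    which is what R's scale() does for a single column."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("constant series: z-score undefined")
    return [(v - m) / s for v in values]

# Illustrative surface areas (km^2); not real GRSAD values.
area_km2 = [10.0, 12.0, 14.0]
z = within_sample_z(area_km2)
```

Because the series is standardized against itself, the result has mean zero and unit variance by construction, so it measures deviation from the dam's own operating history rather than from a pre-2000 climatology.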

GREALM — Global Reservoir and Lake Monitor

Satellite altimetry water surface height (m) from NASA/CNES radar altimeters. Monthly and 10-day resolution, keyed by lake_id. Used for dams not in GRSAD or for case study dams where height is preferable to area.

Coverage: 519 GREALM lakes/reservoirs; 177 residualized
Case study lake_ids: High Aswan = 331, GERD = 1296 (“Millennium” in GREALM), Nuozhadu = 1979, Xiaowan = 1992

Case Study Dams

Case study dams: identifiers and data availability
Dam	Basin	Role	GDAT fid	GREALM lake_id	Data source	Baseline available
High Aswan	Nile	Downstream	313	331	GRSAD + GREALM	Yes (1984–2000)
GERD	Nile	Upstream	345	1296	GREALM only	No (filling began 2020)
Nuozhadu	Mekong	Upstream	6511	1979	GRSAD + GREALM	No (completed 2014)
Xiaowan	Mekong	Upstream	6610	1992	GRSAD + GREALM	Partial

Note on GERD: GERD is not in GRanD v1.3 and has no GRSAD coverage. All GERD analysis uses GREALM lake_id 1296 (identified as the closest GREALM lake to GERD’s coordinates at 11.2°N, 35.1°E, distance ~0.21°). Because GERD began filling in July 2020, there is no baseline period; within-sample z-scores are used throughout.

Methodology

Step 1: Constructing the Operational Anomaly

For each reservoir \(i\) and calendar month \(m\), we construct a baseline seasonal climatology from the monthly observations \(x_{it}\) over 1984–2000:

\[\bar{x}_{im} = \text{median}(x_{it} : \text{month}(t) = m,\ t \in \text{baseline}), \quad \sigma_{im} = \text{sd}(x_{it} : \text{month}(t) = m,\ t \in \text{baseline})\]

The raw anomaly is \(z_{it} = (x_{it} - \bar{x}_{im}) / \sigma_{im}\) with \(m = \text{month}(t)\), where \(x_{it}\) is surface area (km², GRSAD) or water height (m, GREALM).
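
Step 1 can be sketched as follows for a single reservoir. This is an illustration in Python rather than the pipeline's R code. The tuple layout, the function name, and the rule of returning None when a calendar month has fewer than two baseline observations are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median, stdev

BASELINE = (1984, 2000)  # inclusive baseline years, as stated in the report

def baseline_anomaly(obs):
    """obs: list of (year, month, x) for ONE reservoir.
    Returns a list of (year, month, z). z is None where the calendar month has
    fewer than two baseline observations or zero baseline spread (assumed rule)."""
    by_month = defaultdict(list)
    for year, month, x in obs:
        if BASELINE[0] <= year <= BASELINE[1]:
            by_month[month].append(x)
    # Calendar-month climatology: median as center, SD as scale.
    clim = {}
    for month, xs in by_month.items():
        if len(xs) >= 2 and stdev(xs) > 0:
            clim[month] = (median(xs), stdev(xs))
    out = []
    for year, month, x in obs:
        if month in clim:
            center, spread = clim[month]
            out.append((year, month, (x - center) / spread))
        else:
            out.append((year, month, None))
    return out

# Illustrative values only: three baseline Januaries, then two post-baseline months.
obs = [(1990, 1, 10.0), (1991, 1, 12.0), (1992, 1, 14.0),
       (2010, 1, 16.0), (2010, 7, 5.0)]
anoms = baseline_anomaly(obs)
```

In this toy series the January 2010 value sits two baseline standard deviations above the January median, while July has no baseline observations and is left undefined, which is the situation post-2000 dams face in every month.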

Step 2: Climate Residualization

To isolate the human operational signal from climate-driven variation, we residualize within each reservoir on ERA5-Land catchment runoff anomalies, precipitation, temperature anomalies, and month fixed effects:

\[z_{it} = \alpha_i + \beta_1\, \text{runoff}_{it} + \beta_2\, \text{precip}_{it} + \beta_3\, \text{temp}_{it} + \gamma_m + \varepsilon_{it}\]

The residual \(\hat{\varepsilon}_{it}\) is our primary measure of operational anomaly: reservoir behavior unexplained by climate. The regression is estimated only for reservoirs with at least 24 observations.
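
A minimal sketch of the per-reservoir residualization is given below, in Python for illustration (the pipeline is in R). It handles the intercept and month fixed effects by demeaning within calendar month, which by the Frisch–Waugh–Lovell theorem gives the same slope coefficients and residuals as including month dummies, and then solves the normal equations directly. All function names are hypothetical, and no standard errors are computed.

```python
from collections import defaultdict

MIN_OBS = 24  # minimum observations per reservoir, as stated in the report

def demean_by_month(months, values):
    """Subtract the calendar-month mean (absorbs the intercept and month FE)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for m, v in zip(months, values):
        sums[m] += v
        counts[m] += 1
    return [v - sums[m] / counts[m] for m, v in zip(months, values)]

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear after month demeaning")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - s) / aug[i][i]
    return beta

def residualize(months, z, regressors):
    """months: calendar month per observation; z: anomalies; regressors: list of
    columns (e.g. runoff, precip, temp). Returns (beta, residuals) for one
    reservoir, or None if it has fewer than MIN_OBS observations."""
    if len(z) < MIN_OBS:
        return None
    y = demean_by_month(months, z)
    xs = [demean_by_month(months, col) for col in regressors]
    k = len(xs)
    xtx = [[sum(p * q for p, q in zip(xs[i], xs[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(xs[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(beta[j] * xs[j][t] for j in range(k)) for t, yi in enumerate(y)]
    return beta, resid
```

The returned residuals play the role of \(\hat{\varepsilon}_{it}\). A reservoir whose regressors do not vary within calendar month raises an error rather than returning an arbitrary fit.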

Step 3: Volatility Summary Statistics

From \(\hat{\varepsilon}_{it}\) we construct three reservoir-level summary statistics:

Statistic	Definition	Interpretation
`vol_annual_sd`	Mean annual SD of \(\hat{\varepsilon}_{it}\)	Primary outcome: operational volatility
`frac_extreme`	Fraction of months with \(\lvert\hat{\varepsilon}_{it}\rvert > 1.5\)	Frequency of large deviations
`dry_bias`	Mean of \(\hat{\varepsilon}_{it}\) over months with \(\hat{\varepsilon}_{it} < 0\)	Systematic tendency toward low storage
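
The three statistics can be sketched as below, in Python for illustration. Two readings are assumptions rather than confirmed pipeline behavior: "mean annual SD" is taken to be the SD of residuals within each calendar year, averaged across years, and `dry_bias` is taken to be the mean over months with a negative residual. The function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def volatility_summary(resid, threshold=1.5):
    """resid: list of (year, eps_hat) for one reservoir."""
    if not resid:
        raise ValueError("no residuals supplied")
    by_year = defaultdict(list)
    for year, e in resid:
        by_year[year].append(e)
    # Assumed reading: within-year SD, then averaged across years.
    annual_sds = [stdev(v) for v in by_year.values() if len(v) >= 2]
    eps = [e for _, e in resid]
    negatives = [e for e in eps if e < 0]
    return {
        "vol_annual_sd": mean(annual_sds) if annual_sds else None,
        "frac_extreme": sum(abs(e) > threshold for e in eps) / len(eps),
        "dry_bias": mean(negatives) if negatives else None,
    }

# Illustrative residuals, two observations per year.
summary = volatility_summary([(2001, -1.0), (2001, 1.0), (2002, -2.0), (2002, 2.0)])
```

Under this definition `dry_bias` is negative whenever any negative residual exists, so it measures how deep the low-storage months are rather than whether the residuals are skewed toward low storage.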

Figure 1: Case Study Time Series

These figures show the raw reservoir storage time series and standardized anomaly for each case study dam. The top panel shows the raw measure (area in km² or height in m). The bottom panel shows the z-score relative to baseline (or within-sample z for post-2000 dams), with the climate-residualized version highlighted in red/orange.

High Aswan (Nile — downstream)

High Aswan Dam: GRSAD surface area and standardized anomaly (GRSAD)

High Aswan Dam: GREALM water height anomaly

Interpretation: High Aswan is the oldest dam in the sample (completed 1970) and the downstream unit in the Nile case. Baseline-normalized data are available from 1984. The climate-residualized anomaly is intended to capture operational decisions by the Egyptian government, as distinct from variation in the Nile flood regime.

GERD (Nile — upstream, Ethiopia)

GERD (Grand Ethiopian Renaissance Dam): GREALM water height, within-sample z-score (no baseline period available)

Interpretation: GERD’s GREALM series begins when filling started in 2020. Three distinct filling phases are visible (July 2020, July 2021, July 2022), corresponding to the three annual rainy seasons during which Ethiopia impounded additional water. The z-score is within-sample (no pre-construction baseline exists), so the “anomaly” measures deviation from the dam’s own average post-construction behavior. This is not directly comparable to baseline-normalized z-scores at High Aswan. The within-sample scaling means the series is centered on the mean and scaled to unit variance over 2020–present.

Nuozhadu (Mekong — upstream, China)

Nuozhadu Dam: GRSAD surface area and anomaly

Nuozhadu Dam: GREALM height anomaly (within-sample z)

Interpretation: Nuozhadu (completed 2014) is the largest dam in the Lancang cascade, with a storage capacity of ~23,000 MCM. It lies ~1,000 km upstream of the Chinese–Myanmar border. The GRSAD series shows relatively flat area (the reservoir is large, and area is a poor proxy for volume at large reservoirs), so the GREALM height series is more informative. There is no baseline-period overlap because the reservoir filled after 2000.

Xiaowan (Mekong — upstream, China)

Xiaowan Dam: GRSAD surface area and anomaly

Xiaowan Dam: GREALM height anomaly

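Interpretation: Xiaowan (completed 2010) lies upstream of Nuozhadu in the Lancang cascade. Its series begins before 2000, so it has partial baseline overlap and can be residualized against climate over that period; those pre-2000 observations predate the dam’s completion and therefore describe the site before impoundment rather than dam operations. The climate-residualized anomaly is intended to isolate operational behavior from Yunnan monsoon dynamics.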

Figure 2: Upstream–Downstream Comparison

These figures overlay the operational anomalies of the upstream treated dam and the downstream (control/affected) reservoir at the same time scale. Periods where the upstream dam shows high storage (positive anomaly) while the downstream unit shows low storage (negative anomaly) are candidates for operational divergence driven by retention.

Nile: GERD vs. High Aswan

Nile: GERD (upstream, blue) vs. High Aswan (downstream, red) monthly operational anomalies. ~68 overlapping months (GERD post-fill era, Jul 2020–present).

Interpretation and caveats: The overlapping period covers 2020–present (GERD’s post-filling era only). The two series are not on the same scale: GERD uses within-sample z, while High Aswan uses climate-residualized baseline z. Direct comparison of levels is therefore not straightforward. The shading threshold (upstream z > 0.5 AND downstream z < −0.5 simultaneously) is never triggered in the current data — likely because the different normalizations put the series on incompatible scales. Decision needed: normalize both to a common metric (e.g., both within-sample z, or both percentile rank) before interpreting divergent episodes.
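
The shading rule and one of the proposed common metrics (percentile rank) can be sketched as below, in Python for illustration. The function names are hypothetical. The 0.5 and −0.5 defaults are the z-score thresholds quoted above; the report does not specify equivalent thresholds on a rank scale, so none are assumed here.

```python
from bisect import bisect_right

def percentile_rank(values):
    """Map each value to the share of the series that is <= it, in (0, 1]."""
    ordered = sorted(values)
    n = len(values)
    return [bisect_right(ordered, v) / n for v in values]

def divergent_months(up, down, hi=0.5, lo=-0.5):
    """Indices where the upstream series exceeds hi while the downstream
    series is below lo (the report's shading rule for z-scores)."""
    return [t for t, (u, d) in enumerate(zip(up, down)) if u > hi and d < lo]

# Illustrative anomalies for three months.
flags = divergent_months([0.2, 0.9, 1.4], [-0.1, -0.7, 0.3])
ranks = percentile_rank([3.0, 1.0, 2.0])
```

Percentile ranks depend only on the ordering within each series, so they are unaffected by whether the underlying z-score was baseline-normalized or within-sample. That is the property the common-metric option relies on.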

Mekong: Xiaowan vs. Nuozhadu

Mekong: Xiaowan (upstream, blue) vs. Nuozhadu (downstream, red) monthly operational anomalies. 442 months plotted; both dams are observed together only from 2014, when the Nuozhadu series begins.

Interpretation: The Mekong series covers 1985–present for Xiaowan (partial baseline, completed 2010) but only 2014–present for Nuozhadu (within-sample z, completed 2014). There is no pre-construction Nuozhadu series, so the figure does not provide a true pre-trend check — only Xiaowan’s pre-2010 behavior is visible. Post-2014 (both dams operational), some divergence is visible but the shading threshold is also never triggered. The same scale-incompatibility issue applies for months where within-sample z is used.

Cross-Sectional Analysis

Design

The cross-section exploits variation in downstream border distance (\(d_i\) = along-river km from dam to nearest international border) as a continuous treatment intensity. The estimating equation is:

\[V_i = \alpha + \beta \cdot f(d_i) + \gamma X_i + \delta_{\text{purpose}} + \delta_{\text{basin}} + \varepsilon_i\]

where \(V_i\) is operational volatility (vol_annual_sd), \(f(d_i)\) is either linear or log-linear in border distance, \(X_i\) includes log reservoir capacity and dam age, and fixed effects absorb purpose (hydropower/irrigation/flood control) and basin heterogeneity. Standard errors clustered by river system.

Control group construction: For each basin, controls are drawn from hydrologically isolated basins, in different operator countries where available (see the Identification Strategy section below).

Sample: 139 dams in total; 39 have non-NA dam_dist_border from rivers_dams_purpose.gpkg. The remaining 100 are non-transboundary controls that lack border distance by construction.

Known limitations:

- GERD (fid=345) is not in rivers_dams_purpose.gpkg and has no dam_dist_border. Excluded from regression even though it is the primary treated dam in the Nile basin.
- Kishanganga (fid=7993, completed 2018) has no GRSAD/GREALM data. Not in sample.
- Hrusov (fid=33153) is used as the Danube treated dam, as a proxy for Gabčíkovo (same dam system, 0.52° apart, dam_dist_border = 0).
- log_capacity is imputed to 0 for 133/139 dams (most GDAT dams lack volume documentation); cap_missing indicator included.

Results: Main Cross-Section Table

Cross-sectional regression: operational volatility on border distance. Purpose and basin FE in all columns. SE clustered at river-system level. EB = entropy balanced. * p<0.05, ** p<0.01.
Specification	dam_dist_border	log_dam_dist_border	dam_age	N	Within R²
Linear	-0.0003* (0.0001)	—	-0.0006* (0.0002)	39	0.044
Log-linear	—	-0.0123* (0.0040)	-0.0004* (0.0001)	39	0.019
EB Linear	-0.0003* (0.0001)	—	-0.0006* (0.0002)	39	0.044
EB Log-linear	—	-0.0123* (0.0040)	-0.0004* (0.0001)	39	0.019
Frac Extreme	−4.7×10⁻⁵ (4.0×10⁻⁵)	—	-0.0006** (3×10⁻⁵)	38	0.095
Dry Bias	−8×10⁻²¹ (4×10⁻²⁰)	—	1.4×10⁻¹⁸ (—)	38	0.051

Dose-Response Plot

Dose-response: predicted operational volatility by downstream border distance (log scale). Points = individual dams; line = predicted from log-linear specification.

Interpretation

Main finding: Border distance has a negative and statistically significant coefficient in both the linear (−0.0003, p<0.05) and log-linear (−0.0123, p<0.05) specifications. The sign means that dams closer to an international border have higher operational volatility, consistent with the strategic excess hypothesis. The log-linear coefficient implies that a 10% increase in border distance is associated with a decrease in vol_annual_sd of about 0.0012 units (−0.0123 × ln 1.10 ≈ −0.0012; roughly 0.2% of the mean).
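
The size of the log-linear effect can be checked with a short calculation, sketched here in Python. It assumes the regressor is the natural log of border distance, which the report does not state explicitly. The helper name is hypothetical.

```python
import math

beta_log = -0.0123  # log-linear coefficient from the main cross-section table

def effect_of_pct_change(beta, pct):
    """Change in the outcome when the regressor rises by pct percent,
    in a model that is linear in the natural log of the regressor."""
    return beta * math.log(1 + pct / 100)

delta = effect_of_pct_change(beta_log, 10)   # exact: beta * ln(1.10)
approx = beta_log * 0.10                     # common small-change approximation
```

Both the exact value and the approximation round to about −0.0012 units of vol_annual_sd for a 10% increase in border distance.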

Dam age: The negative coefficient on dam_age (dams built longer ago are less volatile) may reflect regulatory maturation, sediment infilling, or systematic differences in data quality for older reservoirs. It is not central to the transboundary argument but important to control for.

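Entropy balancing (EB): The EB coefficients, standard errors, and within R² are identical to OLS at the reported precision. This is consistent with a control group that was already balanced on observables before reweighting, but exact equality can also arise if the EB weights do not enter the estimation sample, for example if the reweighted controls are among the 100 dams dropped for missing dam_dist_border. The weights should be confirmed as applied, and the comparison repeated on the full 139-observation sample, before this is read as evidence of balance.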

Caveats to interpret carefully:

- N=39 is small. With purpose + basin FE, degrees of freedom are limited.
- Several primary treated dams of interest (GERD, Kishanganga, Gabčíkovo proper) are absent from this regression due to data gaps. The result is identified primarily from variation among control dams plus Baglihar (with within-sample vol) and Hrusov.
- The dry_bias outcome (the Dry Bias row) shows negligible and statistically meaningless coefficients: the point estimate is essentially zero and the SE is numerically unstable. This outcome may need to be redefined or dropped.

By-Basin Results

Per-basin cross-sectional regression (Nile + Indus + Danube pooled). Note: individual basin N too small for separate estimation with FE; by-basin table shows pooled within-group result.
Basin	dam_dist_border	dam_age	N	R²
Pooled	-0.0003 (6.7×10⁻⁵)	-0.0005 (0.0005)	38	0.045

Note: The by-basin table from the pipeline currently collapses to a single row (38 obs) because individual-basin samples are too small to separately absorb purpose and basin FE. Individual-basin estimates can be revisited once the GERD border distance and Baglihar data quality issues are resolved, although per-basin samples will remain small.

Mekong Cascade: Within-Cascade Design

Design

For the Lancang cascade, we cannot use cross-country controls (all major Chinese dam operators share the same ultimate state principal). Instead, we exploit within-cascade variation in border proximity: dams closer to the Chinese–Myanmar border should exhibit higher operational anomalies during periods of diplomatic tension, after controlling for cascade position via dam fixed effects.

\[\hat{\varepsilon}_{it} = \alpha_i + \beta \cdot \text{borderdist}_i \times \text{Post}_t + \delta_t + \nu_{it}\]

where \(\text{Post}_t\) indexes periods of Mekong diplomatic tension. Three event windows are tested: the 2019 drought (when Stimson Center satellite data revealed Chinese retention), the post-Stimson political response, and any tension period.

Results

Mekong within-cascade panel. Dam and year-month FE. SE clustered at dam level. Lancang cascade dams only (5 dams, 1,115 obs). * p<0.05.
Specification	borderdist × Pre	borderdist × Post	N	Within R²
2019 Drought	−0.089 (0.084)	— (dropped: collinear)	1115	0.123
Post-Stimson	−0.045 (0.038)	— (dropped: collinear)	1115	0.016
Any Tension	0.201 (0.092)	0.283* (0.075)	1115	0.130

Mekong cascade: operational anomaly time series by dam, with tension periods shaded.

Interpretation

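Main finding: In the “any tension” specification, the borderdist × Post interaction is positive and significant (0.283, p<0.05). If borderdist is coded as distance to the border, as dam_dist_border is in the cross-section, a positive coefficient means that operational anomalies during tension periods rise with distance from the border, which is the opposite of the hypothesized pattern; the reading that dams closer to the border have higher anomalies holds only if the variable is coded as proximity. The coding should be confirmed before this result is interpreted. The pre-tension coefficient (0.201) is also large, which raises a concern about pre-existing differences.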

Critical caveat (collinearity): The 2019 drought and post-Stimson specifications drop the Post interaction due to collinearity. This occurs because Nuozhadu (fid=6511) and Xiaowan (fid=6610), the two case-study dams, are absent from rivers_dams_purpose.gpkg, so their dam_dist_border is NA. The border distance variation in the panel therefore comes only from the other 3 cascade dams (Jinfeng, Manwan, Dachaoshan), which may not cover the post-2019 drought period well.

Status: The within-cascade design is currently unidentified for the primary event windows. This needs to be resolved either by (a) manually adding Nuozhadu and Xiaowan border distances from the raw GDAT gpkg, or (b) accepting that the “any tension” result is the only feasible specification given the sample structure.
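
A quick support check can show, for each event window, whether the interaction has any chance of being estimated. The Python sketch below is illustrative and is not the pipeline's code. It counts how many distinct border distances exist among dams that have a non-missing distance and are observed both inside and outside the tension window. Fewer than two means borderdist × Post is absorbed by the dam and time fixed effects. The condition is necessary, not sufficient. The function name and row layout are hypothetical.

```python
from collections import defaultdict

def interaction_support(rows):
    """rows: iterable of (dam, borderdist_or_None, post) with post in {0, 1}.
    Looks only at dams with a non-missing border distance."""
    regimes = defaultdict(set)  # dam -> set of post values observed
    dist = {}
    for dam, d, post in rows:
        if d is not None:
            regimes[dam].add(post)
            dist[dam] = d
    in_both = sorted(dam for dam, seen in regimes.items() if seen == {0, 1})
    distinct = {dist[dam] for dam in in_both}
    return {
        "dams_in_both_regimes": in_both,
        "distinct_distances": len(distinct),
        "passes_check": len(distinct) >= 2,
    }

# Illustrative rows: dam C has no border distance; dam B is never observed post.
rows = [("A", 50.0, 0), ("A", 50.0, 1), ("B", 120.0, 0),
        ("C", None, 0), ("C", None, 1)]
support = interaction_support(rows)
```

Running the check once per window (2019 drought, post-Stimson, any tension) would show directly which windows lose identification when the two NA dams are excluded.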

Event Study

Design

To test whether operational excess intensifies during active geopolitical disputes, we estimate:

\[\hat{\varepsilon}_{it} = \sum_{\substack{k=-K \\ k \neq -1}}^{K} \beta_k \cdot \mathbf{1}[t = t^* + k] + \alpha_i + \delta_{\text{event}} + \nu_{it}\]

where \(t^*\) is the event date, \(K=24\) months, \(\alpha_i\) are dam fixed effects, and \(\delta_{\text{event}}\) are event fixed effects. Reference month: \(k = -1\).

Events included:

Event	Date	Dam	Data
GERD First Fill	2020-07-01	GERD (GREALM 1296)	⚠️ per-event fails (ref=−1 missing)
GERD Second Fill	2021-07-01	GERD (GREALM 1296)	✅
GERD Third Fill	2022-07-01	GERD (GREALM 1296)	✅
Mekong Drought	2019-01-01	Nuozhadu (GREALM 1979)	✅
ICJ Gabčíkovo	1992-10-01	Gabčíkovo	❌ no data

Total sample: 77 observations across 4 events (pooled model).

Results

Pooled event study: operational anomaly (within-sample z) relative to event date, ±24 months. Reference = −1 month. Dam + event FE. Coefficients from pooled model across GERD fill events and Mekong 2019 drought.

Per-event event study: same specification estimated separately for each event. GERD First Fill (Event 1) fails due to missing pre-event data.

Interpretation

Pooled result: The event study shows a positive trend in operational anomalies in the post-period relative to the reference month (−1). Coefficients rise from near zero at \(k=0\) to large positive values by \(k=14\)–\(17\) (approximately 44–56 units of within-sample z, far outside the plausible range for a z-score). The standard errors in the pooled table are essentially zero (on the order of \(10^{-13}\)), which indicates numerical degeneracy (a near-perfect fit or a rank-deficient covariance matrix) rather than precision. This is an over-parameterization problem, not a finding:

With 77 obs and 48 relative-month coefficients (49 bins less the reference), plus dam FE and event FE, most bins have 1–2 observations. The OLS is over-parameterized.
The “significance” stars are spurious: standard errors collapse toward machine precision when the residuals are numerically zero or the covariance estimate is rank-deficient.
The pre-trend F-test fails with a singular covariance matrix (caught by tryCatch in the pipeline).
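
The sparsity problem can be diagnosed before estimation with a bin count, sketched below in Python for illustration (the function name and return keys are hypothetical). For a window of ±K months there are 2K + 1 bins and 2K coefficients once the reference bin is dropped. With the report's 77 observations and K = 24, that leaves at most 77 − 48 = 29 residual degrees of freedom before any fixed effects are counted.

```python
from collections import Counter

def bin_diagnostics(rel_months, K, ref=-1, n_fixed_effects=0):
    """rel_months: relative month (t - t*) for each observation in the sample."""
    kept = [k for k in rel_months if -K <= k <= K]
    counts = Counter(kept)
    n_coefs = 2 * K  # 2K + 1 bins minus the reference bin
    return {
        "n_obs": len(kept),
        "n_event_coefs": n_coefs,
        "resid_df_upper_bound": len(kept) - n_coefs - n_fixed_effects,
        "empty_bins": [k for k in range(-K, K + 1) if counts[k] == 0],
        "ref_observed": counts[ref] > 0,
    }

# Illustrative case: a series that starts in the event month, so no pre-period.
diag = bin_diagnostics(list(range(0, 20)), K=24)
```

In the illustrative case the reference bin is unobserved and the residual degrees of freedom are negative, the same pattern described for the GERD first fill.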

GERD First Fill failure: Event 1 (2020-07-01) fails in the per-event model because GERD’s GREALM series begins only once filling has started: June 2020 (rel_month = −1, the reference) has no observation. Decision needed: change the reference period to \(k=0\) for post-construction events, and separately consider \(K=12\) instead of \(K=24\). A shorter window reduces the number of bins to estimate but does not by itself supply the missing \(k=-1\) observation.

Bottom line: The event study design is correct in concept but currently under-powered. The fix requires either (a) expanding the event set substantially to get more obs per bin, or (b) reducing K, or (c) using a simpler pre/post DiD around events rather than the full event study coefficient vector.
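
Option (c) can be sketched as a single before/after contrast per event, shown here in Python for illustration. The function name, the ±12-month default, and the choice to count month 0 as post-event are assumptions. The sketch is a within-dam contrast only; it becomes a difference-in-differences only once the same contrast for a comparison dam is subtracted.

```python
from statistics import mean

def pre_post_difference(series, K=12):
    """series: list of (rel_month, eps_hat) for one event, where rel_month is
    months since the event date. Returns mean(post) - mean(pre) within +/-K
    months, or None if either side has no observations."""
    pre = [e for k, e in series if -K <= k < 0]
    post = [e for k, e in series if 0 <= k <= K]
    if not pre or not post:
        return None
    return mean(post) - mean(pre)

# Illustrative residuals around a hypothetical event.
diff = pre_post_difference([(-2, 0.0), (-1, 0.2), (0, 1.0), (1, 1.2)])
```

This replaces up to 48 bin coefficients with one parameter per event. An event with no pre-period observations, such as the GERD first fill, returns None rather than a degenerate estimate.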

Identification Strategy

Cross-Sectional Design

Treatment: Continuous downstream border distance \(d_i\) (along-river km, from rivers_dams_purpose.gpkg).

Control group logic (per basin):

Basin	Treated dam(s)	Control countries	Control selection
Nile	GERD (Ethiopia)	Kenya, Uganda, Tanzania	Same East African climate; no Blue Nile basin overlap
Indus	Baglihar, Kishanganga (India)	India (Godavari, Mahanadi basins)	Different river basin; state-level operators unconnected to Indus geopolitics
Danube	Hrusov/Gabčíkovo (Slovakia)	Czech Republic, Austria	Different countries; no Danube basin overlap in gpkg controls

Entropy balancing: Controls are reweighted to match treated moments on log capacity, purpose, and dam age. EB weights are estimated, and coefficients were unchanged after weighting; this is consistent with good initial balance, provided the weights are confirmed to enter the estimation sample.
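
Balance before and after reweighting can be checked covariate by covariate with a standardized mean difference, sketched below in Python. This is a generic diagnostic, not the pipeline's entropy-balancing routine. The function name is hypothetical, and the denominator (the unweighted pooled SD) is one common convention among several.

```python
from math import sqrt
from statistics import mean, variance

def std_mean_diff(treated, control, weights=None):
    """(treated mean - weighted control mean) / pooled unweighted SD."""
    if weights is None:
        weights = [1.0] * len(control)
    w_mean = sum(w * c for w, c in zip(weights, control)) / sum(weights)
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - w_mean) / pooled_sd

# Illustrative covariate values (e.g. dam age); not real data.
treated = [1.0, 2.0, 3.0]
control = [2.0, 3.0, 4.0]
before = std_mean_diff(treated, control)
after = std_mean_diff(treated, control, weights=[1.0, 0.0, 0.0])
```

If the weighted and unweighted differences are identical for every covariate, the weights are either uniform or not being applied, which is worth ruling out when weighted and unweighted regressions return the same estimates.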

Identifying assumption: Conditional on purpose FE, basin FE, log capacity, and dam age, downstream border distance is as good as randomly assigned — two otherwise-identical dams differ in operational volatility only because one faces a geopolitical incentive to retain water.

Plausibility: The companion Part 3 (strategic placement) finding bears on this assumption. If transboundary dams are deliberately sited near borders, border distance is not literally random. The argument is instead about the direction of the bias: the cross-section then compares intentionally placed treated dams against engineering-optimal domestic dams, in which case \(\hat{\beta}\) would be a lower bound on true operational excess.

SUTVA

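No interference: River network interference flows only downstream and within basins. Controls are in hydrologically isolated basins and, except in the Indus case (where controls are Indian dams in the Godavari and Mahanadi basins), in different operator countries.
No hidden treatment variation: Drawing controls from different countries is intended to ensure that “domestic dam” and “transboundary dam” represent genuinely different institutional environments. This does not apply to the Indus controls, which share an operator country with the treated dams.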

Validity Checks (planned)

Check	Status
Pre-trend test (event study)	⚠️ Fails due to data sparsity (see above)
Dose-response monotonicity	Not yet tested (tercile split)
Placebo basins (Amazon, Congo)	Not yet implemented
Purpose heterogeneity (hydropower × border dist)	Table generated (`tab_resops_purpose_het.tex`), not yet reviewed
Connection to Part 1 (discharge prediction)	Not yet implemented

Current Status and Decisions Needed

What is working

Component	Status
GRSAD anomalies (Steps 1–2)	✅ 7,246 reservoirs; 6,399 residualized
GREALM anomalies (Steps 3–4)	✅ 519 lakes; 177 residualized
Within-sample vol fallback (Step 6A-WS)	✅ 847 post-2000 dams added
Figure 1 time series (all 4 case dams)	✅ Saved
Figure 2 upstream-downstream (Nile + Mekong)	✅ Saved (scaling issue noted)
Cross-section regression (Steps 7A–7B)	✅ N=39; negative border-distance coefficient
Mekong cascade (Step 8)	⚠️ Partial — “any tension” result only
Event study pooled (Step 9)	⚠️ Runs but SEs degenerate; K=24 too large

Decisions needed

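1. GERD border distance: GERD (fid=345) is not in rivers_dams_purpose.gpkg. Should it be added manually? GERD is clearly transboundary: the Ethiopia–Sudan border lies a short distance downstream (recorded here as ≈ 100 km; commonly cited figures are closer to 15–40 km, so the value should be measured along the river before entry). If added manually, it would enter the cross-section regression as the primary treated observation in the Nile basin.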

2. Mekong cascade collinearity: Nuozhadu and Xiaowan lack dam_dist_border in the gpkg. Options: (a) add manually from raw GDAT coordinates; (b) accept “any tension” result only; (c) drop the within-cascade design.

3. Event study window K: Currently K=24 → 77 obs → degenerate SEs. Reduce to K=12? Or change reference period for GERD events (ref=0 instead of ref=−1)?

4. Figure 2 scale: Both upstream and downstream series should be on the same anomaly scale for meaningful comparison. Options: (a) both within-sample z; (b) both percentile rank; (c) normalize relative to their own post-treatment period.

5. Baglihar within-sample vol: vol_source = "within_sample_grsad" for Baglihar (completed 2009; vol_annual_sd = 0.788). Is this acceptable for the paper? Needs a note that post-2000 dams use within-sample normalization.

6. Validity checks (Step 10): Placebo basins, dose-response monotonicity, and the connection to Part 1 discharge anomalies are not yet implemented. Step 10E also requires join_grsad_to_panel.R to be run first.

File Reference

Output	Location
Figures	`Output/Figures/Reservoir_Operations/`
Regression tables (LaTeX)	`Output/Tables/tab_resops_*.tex`
`resops_dam_summary.rds`	`Data/Intermediate/Reservoir_Operations/`
`grsad_area_anomalies_resops.csv`	`Data/Intermediate/Reservoir_Operations/`
`grealm_height_anomalies_monthly.csv`	`Data/Intermediate/Reservoir_Operations/`
Main analysis script	`Code/reservoir_operations.R`
Cluster run wrapper	`run_resops.R` (repo root)
Diagnostic script	`run_diagnostics.R` (repo root)
Pipeline review log	`review_log_2026-03-14.md` (repo root)
Methodology writeup	`Writeup/res_ops_empirical.tex`

Report generated: 2026-03-14. Run rmarkdown::render("res_ops_progress_report.Rmd") from the repo root to regenerate.

Reservoir Operations: Progress Report

Transboundary Dams — Part 2

Katie Zhang

March 14, 2026
