Rural China Electric Kettle Promotion Program: Statistical Analysis Report

A Cluster-Randomized Controlled Trial — Unified Results

Author

Alasdair Cohen, Andrew Mertens

Published

June 5, 2026

TODOs and Open Questions

PM2.5 (SAP §5.1.2 secondary outcome) is out of scope for this analysis per PI decision and is documented as a deviation below.

Deviations from Pre-Registered SAP

Documented Deviations from SAP v1 (2018-04-29)

Methodological deviations:

  1. Unpaired vs. paired t-tests for continuous outcomes (SAP §5.2). The SAP specifies “paired t-tests” for unadjusted between-group comparisons of EUM, TTC, and PM2.5. Since these are comparisons between independent treatment and control households (not within-household before-after), unpaired (Welch’s) two-sample t-tests are used. This is the standard approach for a parallel-arm cRCT.

  2. Diarrhea age group boundary (SAP §5.1.2). The SAP specifies age groups “children <5; individuals age 5-18; adults age >=18” — the boundary at age 18 overlaps. We assign 18-year-olds to the 5-18 group, consistent with the SAP’s use of “individuals” for that category and “adults” (implying >18) for the older group.

  3. Covariate screening does not account for clustering (SAP §5.3). Pre-screening of covariates via likelihood ratio tests uses simple GLM/LM models without cluster-level random effects. The final adjusted models do include cluster random effects.

  4. EUM outlier cap at 10 L/person/day. SAP is silent on EUM outliers. Raw data contains values up to ~150 L/p/d, which are physically impossible (a single kettle cannot realistically boil > 5 kWh/day = ~10 L/p/d for a 5-person HH). lpd_outlier_cap = 10 in load_clean_eum_data() sets all values above this to NA. The external replicator effectively used the same threshold.

Data availability deviations:

  1. PM2.5 analysis excluded. PM2.5 is out of scope for this analysis per PI decision; the SAP-listed sensor outcome will not be part of this paper.

  2. Electricity affordability and risk perception variables not found. Resolved: fe15 → electricity_afford and fq1/fq2 → risk_untreated_water_first/_second are now in d_clean. Subgroup targets subgroup_electricity_afford, subgroup_risk_perception, subgroup_boiler_sex, subgroup_boiler_age are implemented and render in §11 of this report.

  3. Sensitivity analysis by participation rate not yet implemented (SAP §5.6). Cannot be derived cleanly from existing data — the natural proxies (village-level EK uptake, EUM coverage) are either the outcome itself (using it as a stratifier creates selection bias) or a planned subsample unrelated to attendance. Requires the NCRWSTG attendance records of the promotion events.

  4. Travel time to health center and travel time to water source not available as covariates for adjusted models (SAP §5.3). Resolved: hl15 → health_centre_travel_min and w11 → water_travel_min are now in d_clean and sap_covariates.

  5. Round 6 (endline) not included in primary analyses. The endline was a short survey only; most pre-specified outcomes are analyzed at baseline (R1) and midline (R3) per the SAP timing (§2.7 note about choosing ~6-month follow-up). An exploratory 12-month EK-use estimate is reported in §5d from the R6 enumerator-observation index.

Highest-value spot-checks

  1. uses_electric_kettle primary outcome — concordance against BE.1 and the EK Use Likelihood Index. The SAP primary outcome is built solely from S.1 (drinking_water_method == "Boiled W Electricity"). Cross-check it against (a) kettle_type (BE.1, asked only if S.1 endorsed electric boiling) and (b) the enumerator-observation EK Use Likelihood Index (1–10). Systematic disagreement would suggest the primary outcome needs a tweak (e.g. include HHs flagged by BE.1 even if S.1 says otherwise).

  2. Handwashing recoding rules vs field practice. The code currently maps only "Yes, after defecating" and "Yes, after defecating & before meals" to reported_hw_soap_outcome == 1. The other “Yes” categories ("Yes, but rarely" — 410 HHs at baseline; "Yes, before meals" — 8 HHs; "Yes, when guests visit") are currently coded as 0. Same call needed for OD.1’s "Yes, but not likely regularly used" (currently 0). Confirm these are the SAP-intended categorisations.

  3. Verify the 7 unmatched EUM household IDs against paper forms. IDs 20021018, 20021124, 20021125, 20021126, 20020714, 20020815, 20020917 each have 5 rounds of real EUM kWh readings but no survey match (the prior merge corrections were reverted because they crossed treatment arms). Resolving them recovers up to 35 rounds of legitimate EUM data; leaving them unresolved means they are silently dropped from cluster-aware analyses.

  4. Village → randomisation-block lookup tables. Every stratified analysis depends on the village ID lists at src/R/data_cleaning_functions.R lines 12–21 (mountain_villageids, plain_villageids, baseline_ek_use_0..3_villageids). Cross-check against the original SAP §8.3 randomisation allocation document. A single mis-assigned village would systematically corrupt one block.

  5. The 0.104 kWh/L EUM conversion constant. This single multiplier scales every litres-per-person-per-day estimate in the report. The SAP §8.1 references it as the “grand mean from EUM testing protocol”. Confirm against the original kettle-calibration document, particularly whether the grand mean is still appropriate given the observed mix of kettle types (Standard EK w/auto, EK (M)/EK (L) no auto, induction, etc.).

  6. The 21 inconsistent 1-week / 2-week diarrhea responses (logged in §2.6.3). We have shown these are real respondent inconsistencies rather than a coding bug. A PI field-memory read on whether this is broad diarrhea-recall noise — i.e., whether it should push us toward switching the SAP §5.1.2 primary recall window from 7-day to same-day — would be more credible than a remote analyst’s call.

1. Study Overview

This analysis follows the pre-registered Statistical Analysis Plan (SAP v1, 2018-04-29) for the Rural China Electric Kettle Promotion Program cluster-randomized controlled trial (ClinicalTrials.gov: NCT03376152).

Design: Parallel arm cohort cRCT, 1:1 ratio, 30 villages (clusters), stratified randomization by geography (mountains/plains) and baseline EK use.

Population: 900 poverty households (30 per village), ITT analysis.

Primary Outcomes: Electric kettle use prevalence (reported) and intensity (EUM-measured).

Two-sided tests, 5% significance level, 95% confidence intervals.

2. Data Cleaning and QC Audit

2.1 Pipeline architecture

The authoritative pipeline is driven by _targets.R, which calls four cleaning functions in src/R/data_cleaning_functions.R:

  • load_clean_household_data() → d_clean (HH × round panel)
  • load_clean_individual_data() → d_individ (individual × round)
  • load_clean_diarrhea_data(d_clean) → d_diar (individual × round × recall period)
  • load_clean_eum_data(d_clean) → d_eum (HH × EUM measurement)

The legacy src/01_data_cleaning.R is retained only for producing shareable CSVs (see replication/EKT_replication_data_corrected/); it does not feed any analysis target.

2.2 Variable derivation audit

The following key derivations were cross-checked against the survey instrument and the raw .dta factor labels:

Output variable | Raw source | Recoding rule | Status
uses_electric_kettle | s1 | "Boiled W Electricity" → 1, NA → NA, else 0 | ✓ verified
reported_hw_soap_outcome | hy4 | "Yes, after defecating" OR "Yes, after defecating & before meals" → 1; "Don't know" → NA; else 0 | ✓ verified
observed_hw_soap_outcome | od1 | "Yes, & likely used regularly" → 1; "Not able to check" → NA; else 0 | ✓ verified
mountains | village | 20 mountain villages, 10 plain villages | ✓ verified (SAP §8.3)
baseline_uses_ek | village | 4 categories (Cat. 0–3, SAP §8.3) | ✓ verified
diarrhea_today | hl8 | "Yes"→1, "No"→0, "DK"→NA | ✓ value labels confirmed
diarrhea_oneweek | hl9 | same | ✓ value labels confirmed
diarrhea_twoweek | hl10 | same | ✓ verified (survey HL.10 = “Diarrhea in the last two weeks?”)
cough_today | hl2 | same | ✓ confirmed (hl1 is the skip-gate)
cough_oneweek | hl3 | same | ✓ confirmed
rash_oneweek | hl5 | same | ✓ confirmed (hl4 is the skip-gate)
toothache_oneweek | hl6 | same | ✓ confirmed
hoh_female | g3c1..g3c4, g4c1..g4c4 | first person where g3=="Yes" AND g4=="Female" | ✓ verified (all observed HoHs at positions 1–4)
hoh_age | g3c1..g3c4, g5c1..g5c4 | age of first g3=="Yes" person | ✓ verified
pct_adults_highschool | g5c*, g6c* | adults (age≥18) with edu ∈ {"High School", "University & higher"} | ✓ fixed (see §2.3)
nhh | g1c1..g1c8 | count of non-NA person codes | ✓ verified
eum_conversion | g5c1..g5c8 | sum of (age<12 → 0.5, else 1) | ✓ verified (SAP §8.1)
lpd | eum_interval, eum_conversion, interval_days | (eum_interval / 0.104) / eum_conversion / interval_days | ✓ verified (SAP §8.1)
lpd_outlier_cap | parameter (default 10 L/p/d) | lpd > cap → NA | documented deviation; SAP-silent decision (see §2.4)
water_travel_min | w11 | numeric minutes (NA when HH uses a piped connection at home) | ✓ verified (SAP §5.3 covariate trigger evaluated in §2.6.13)
health_centre_travel_min | hl15 | numeric minutes to nearest health centre | ✓ verified (SAP §5.3 covariate; ~99% non-NA at baseline; median 20 min)
risk_untreated_water_first | fq1 | first-response category (Nothing / They will get sick / diarrhea / vomit / typhoid / amoeba / cholera / stomach ache / Don’t know / Other) | ✓ verified (SAP §5.4 subgroup; ~99% non-NA)
risk_untreated_water_second | fq2 | secondary response (same categories as fq1) | ✓ verified (less complete, ~66% non-NA, since respondents often gave only one answer)
electricity_afford | fe15 | Very expensive / Expensive / Slightly expensive / Affordable / Very affordable / Don’t know / Govt. pays | ✓ verified (SAP §5.4 subgroup; 100% non-NA)
primary_boiler_age | be16_age1 | numeric age (years) of the primary person who boils water (BE.16 person 1) | ✓ verified (SAP §5.4 boiling subgroup; ~38% non-NA, since only HHs that report boiling are asked BE.16)
primary_boiler_sex | be16_sex1 | “Female” / “Male” | ✓ verified (60% Female, 40% Male among boiling HHs)
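
The eum_conversion, lpd, and lpd_outlier_cap rows can be read together as one small calculation. The following is a minimal Python sketch of that arithmetic, not the pipeline's R code in load_clean_eum_data(). The function names, the example household, and the treatment of non-positive intervals as NA are assumptions made for illustration.

```python
KWH_PER_LITRE = 0.104  # EUM conversion constant cited from SAP §8.1


def person_equivalents(ages):
    """eum_conversion rule: 0.5 for each member under 12, 1 for everyone else."""
    return sum(0.5 if age < 12 else 1.0 for age in ages)


def litres_per_person_per_day(kwh_interval, ages, interval_days, cap=10.0):
    """lpd rule from the table; returns None (NA) when the value is unusable."""
    if interval_days <= 0:
        return None  # assumption: backward or same-day intervals become NA
    litres = kwh_interval / KWH_PER_LITRE
    lpd = litres / person_equivalents(ages) / interval_days
    return None if lpd > cap else lpd  # lpd_outlier_cap: lpd > cap -> NA


# Hypothetical household (not study data): two adults and one child aged 8,
# 30 kWh metered over a 90-day interval.
example = litres_per_person_per_day(30.0, [40, 38, 8], 90)

# Plausibility bound behind the cap: 5 kWh/day shared by five adult-equivalents.
bound = (5.0 / KWH_PER_LITRE) / 5
```

Under these assumptions the hypothetical household comes out at roughly 1.3 L/p/d, and the 5 kWh/day bound works out to about 9.6 L/p/d, which is where the rounded cap of 10 comes from.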

2.3 Issues identified and resolved

Education match for pct_adults_highschool. The previous code matched education strings "High school", "College", "College/university". The actual factor labels in both baseline and midline .dta files are "High School" (capital S) and "University & higher". The capital-S form was matched correctly; the "University & higher" label was not, so any adult with university education was excluded from the numerator of the high-school+ proportion. The effect is small in this sample (2 university-educated adults observed at g6c1 in baseline, 0 in midline), but systematic. Fixed by updating the match set to include both forms.
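
A minimal Python sketch of the corrected match, for illustration only (the pipeline itself is R). The function name, the (age, label) roster layout, the "Primary" label, and the handling of households with no adults are assumptions; only the two matched labels and the age≥18 rule come from the text above.

```python
HS_PLUS_LABELS = {"High School", "University & higher"}


def pct_adults_highschool(roster):
    """Share of adults (age >= 18) whose education label is high school or above.

    `roster` is a list of (age, education_label) pairs. Returns None when the
    household has no adults with a recorded age (an assumption of this sketch).
    """
    adults = [edu for age, edu in roster if age is not None and age >= 18]
    if not adults:
        return None
    return sum(edu in HS_PLUS_LABELS for edu in adults) / len(adults)


# Hypothetical household: a university-educated adult, a second adult with a
# lower (illustrative) label, and a child. A match set without
# "University & higher" would have returned 0.0 here instead of 0.5.
household = [(45, "University & higher"), (43, "Primary"), (10, "Primary")]
share = pct_adults_highschool(household)
```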

Diarrhea two-week recall mapping (hl10 → diarrhea_twoweek). An earlier draft of this report flagged a potential variable misidentification because the -layout extraction of the survey PDF appeared to show only three question texts in the diarrhea block (HL.7 / HL.8 / HL.9), with a fourth column header (HL.10) but no matching question. A second extraction without -layout cleared this up: in the survey, HL.1 / HL.4 / HL.7 / HL.11 are the “Person/code” column labels that introduce each block of health questions, and the actual diarrhea questions are:

  • HL.8 = “Diarrhea today?”
  • HL.9 = “Diarrhea in the last week?”
  • HL.10 = “Diarrhea in the last two weeks?”

This is exactly what the cleaning code assumes, and matches the pattern across the other blocks (HL.2/HL.3 = cough today/1-week; HL.5/HL.6 = rash/toothache 1-week). The Stata hl1, hl4, hl7, hl11 variables are not real questions: they correspond to the Person/code column labels, have no value labels, and (as expected) show 0% “Yes” rates in the data. The hl8/hl9/hl10 → diarrhea_today/oneweek/twoweek mapping is correct.

A separate observation that remains: at baseline, among respondents at roster position 1 (c1), 21 individuals answered “Yes” to one-week diarrhea but “No” to two-week diarrhea (logically impossible for nested recall windows), and 20 answered the consistent reverse pattern. The table in §2.6.3 counts all roster positions, so its cells are larger. The cleaning is not at fault: this is a real data inconsistency arising from recall bias or rapid-fire question order. It is worth noting in any final write-up of the diarrhea results. The differential-recall sensitivity analysis in §12 already addresses how to handle this in the primary outcome.

Cluster-aware unadjusted analyses. The previous unadjusted MH and t-tests treated households within villages as independent, which under-states variance for a cRCT (inflated Type I error). The pipeline now uses washb::washb_mh() and washb::washb_ttest() for all unadjusted analyses (binary outcomes via stratified Mantel-Haenszel with strat = randomization_block; continuous outcomes via paired t-tests on randomization-block means). Degrees of freedom for the t-tests are now n_blocks − 1 rather than n_households − 2, which is the correct cluster-aware accounting. The tt_* and mh_* targets now carry an n_blocks column reporting how many of the 8 SAP randomization blocks (mountains × baseline_uses_ek) have both arms present in each stratum.
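
The degrees-of-freedom point can be illustrated with a small paired t statistic on block-level means. This is a Python sketch under stated assumptions, not the washb::washb_ttest() implementation. The function name and the block means are hypothetical, and the p-value (which needs the t distribution) is left out.

```python
import math
from statistics import mean, stdev


def paired_block_ttest(block_means):
    """Paired t statistic on per-block (control_mean, treatment_mean) pairs.

    Blocks missing either arm must be dropped by the caller. Returns the mean
    difference (treatment - control), the t statistic, and df = n_blocks - 1.
    """
    diffs = [treat - ctrl for ctrl, treat in block_means]
    n_blocks = len(diffs)
    if n_blocks < 2:
        raise ValueError("need at least two blocks with both arms present")
    se = stdev(diffs) / math.sqrt(n_blocks)
    return mean(diffs), mean(diffs) / se, n_blocks - 1


# Hypothetical block-level means (not study data): six blocks with both arms.
blocks = [(0.8, 1.1), (1.0, 1.6), (0.9, 1.0), (1.2, 1.9), (0.7, 1.3), (1.1, 1.2)]
diff, t_stat, df = paired_block_ttest(blocks)
```

With six usable blocks the test has 5 degrees of freedom regardless of how many households sit inside each block, which is the cluster-aware accounting described above.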

2.4 Flags requiring further verification

1. Placeholder treatment assignment. Until the real (unblinded) village-level allocation is integrated, all household-level inference is null by construction. The current placeholder uses set.seed(20250117) and randomises at the village level (correct for a cRCT design), but the labels themselves are arbitrary.

2. lpd_outlier_cap is a non-SAP decision: RESOLVED (set to 10 L/p/d). The default is now lpd_outlier_cap = 10 in load_clean_eum_data(). Rationale: a single household kettle cannot realistically boil more than ~5 kWh/day of drinking water (boil cycles are ~0.1–0.3 kWh each); for a 5-person HH that’s ~10 L/p/d. The cap also aligns the authoritative pipeline with the external replicator (whose effective cap was ~10). Documented as methodological deviation 4 in the Deviations list above.

3. Five HHs have 10 EUM rounds — RESOLVED: all 7 ID corrections reverted. Investigation showed each source ID had EUM arm = 1 (treatment) while each target ID had arm = 0 (control) and lived in a survey village. The arm mismatch is incompatible with the typo hypothesis: if the SRC were really a typo for the TGT, they would be the same household and therefore in the same arm. All 7 corrections have been disabled in load_clean_eum_data(). The 5 previously-merged HHs now correctly show 5 rounds each; the 7 SRC IDs now exist as separate, unmatched EUM records and are silently dropped from cluster-aware analyses (their joined mountains / baseline_uses_ek are NA). The 7 SRC IDs (20021018, 20021124, 20021125, 20021126, 20020714, 20020815, 20020917) need verification against the paper data collection forms before any merge can be safely re-applied.

4. EUM round-1 date 2017-01-01: RESOLVED (one HH, fix applied). The single affected HH is 11020623 (village 1, control arm, mountain × Cat. 1). Round 1 was recorded as 2017-01-01 with kWh = 0 (an installation reading). The HH’s other rounds are spaced ~88–103 days apart starting 2018-02-02; the cohort’s median round-1 date is 2017-11-23 (range 2017-11-06 to 2017-12-10). The recorded date is almost certainly a typo where the month digit was lost (2017-11-01 → 2017-01-01). Without this fix the interval into round 2 is 397 days instead of ~95, dividing lpd by ~4 at round 2 for this HH. A targeted correction in load_clean_eum_data() now maps this single date to 2017-11-01. Paper-form verification by the data team is still requested to confirm the exact installation date.

5. 17 HHs with backward EUM dates — INVESTIGATED (3 distinct patterns identified; 12 rows remain after the ID-correction revert). The earlier count of 17 HHs / 32 rows included artefacts from the ID-correction merge (Flag 3); after reverting those corrections, 12 rows across 12 distinct HHs remain. They fall into three clear patterns:

  • Pattern A, village 30, round 5 (6 HHs): 10220504, 10220505, 10220514, 10220517, 10220528, 10220529. Each shows a round-5 date 1–4 days before round 4 (both in late Aug 2018). kWh either doesn’t change (HHs 10220504 / 10220505 show 0 kWh throughout) or increases sensibly (other 4 HHs). The most parsimonious read is that village 30 was never re-visited for the November endline and a round-5 row was entered with a near-duplicate Aug date.
  • Pattern B, village 21, round 3 (5 HHs): 10520126, 10520127, 10520128, 10520129, 10520130. Each has round 3 dated 2018-08-28 and round 4 dated between 2018-08-18 and 2018-08-20, putting round 4 before round 3. kWh increases monotonically across rounds, so the order is correct. Round 3 should be a midline (~May 2018) visit; the recorded 2018-08-28 is most likely a month typo (05 → 08).
  • Pattern C, HH 20020418, round 5 (village 19, treatment): round 5 dated 2018-01-26, 196 days before round 4 (2018-08-10). kWh increases by 148 between rounds 4 and 5, consistent with another ~3 months of normal use. Most likely a month typo (11 → 01); the correct date is probably ~2018-11-26.

The current lpd_negative flag sets lpd = NA for all 12 rows, so they do not bias the analysis. Fixing the underlying dates would recover the data for these intervals (the kWh readings are real), but the corrections need to be verified against the paper forms before being applied. Action: data team to verify the dates for the 12 (household_id, round) pairs listed in the §2.6.9 backward-date chunk; the inferred corrections above are the proposed mapping.
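
The interval figures quoted in Flag 4 and Pattern C can be reproduced with plain date arithmetic. The Python sketch below uses only dates stated in the text; the helper name is an assumption, and the proposed corrections remain unverified against paper forms.

```python
from datetime import date


def interval_days(prev, curr):
    """Days from the previous reading to the current one (negative if backward)."""
    return (curr - prev).days


# Flag 4: HH 11020623, round 1 as recorded vs. as corrected, against round 2.
round2 = date(2018, 2, 2)
recorded = interval_days(date(2017, 1, 1), round2)    # 397 days
corrected = interval_days(date(2017, 11, 1), round2)  # 93 days

# Pattern C: HH 20020418, round 4 followed by round 5 as recorded, and under
# the proposed (unverified) correction to 2018-11-26.
pattern_c = interval_days(date(2018, 8, 10), date(2018, 1, 26))    # -196 days
proposed_c = interval_days(date(2018, 8, 10), date(2018, 11, 26))  # 108 days
```

The ratio recorded / corrected is about 4.3, which matches the statement that the uncorrected date would divide lpd by roughly 4 at round 2.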

2.5 Confirmed correct against raw data

  • s1 factor label is exactly "Boiled W Electricity" in both baseline and midline .dta files.
  • hy4 and od1 factor labels match the SAP-aligned recoding rules.
  • Health-outcome hl* variables have value labels No / Yes / DK (codes 1/2/3) as the cleaning code assumes.
  • All hl# → outcome mappings used by the cleaning code match the survey instrument question numbering: HL.2 / HL.3 = cough today / 1-week; HL.5 / HL.6 = rash / toothache 1-week; HL.8 / HL.9 / HL.10 = diarrhea today / 1-week / 2-week. See §2.3 for the resolution of the earlier hl10 flag.
  • The hl1, hl4, hl7, hl11 variables correspond to the “Person/code” column labels that introduce each block of health questions in the survey. They have no value labels, near-zero “Yes” rates, and are not analysed.
  • Every other variable rename in rename_vars() has been cross-checked against the survey instrument (extracted without -layout to avoid column misalignment): s1 = drinking water method (S.1), w1 / w6 = winter / summer drinking water source (W.1 / W.6), w23 = water quality perception, w26 = water filter use, be1 = kettle type (BE.1), be6 / be7 = winter / summer boil frequency, hy2 / hy3 = handwash before meals / after defecation, hy4 = reported soap use, hy5 = toilet type, od1 = observed soap, hd13 / hd14 = water-test / air-pollution sample flags. The 11 asset variables g15a..g15k map to refrigerator, rice cooker, washing machine, AC, old TV, flat TV, DVD, computer, e-scooter, petrol scooter, motorcycle per survey item G.15. All mappings are one-to-one and correct.
  • HoH appears only at roster positions 1–4 in the survey (640/125/10/1 in baseline c1–c4; 0 at c5–c8), so the current 1–4 search is complete.
  • Village stratification matches SAP §8.3 (20 mountain + 10 plain; 4 baseline-EK-use categories; 30 villages total).
  • Modern health-outcome recoding (string match on "Yes"/"No"/"DK") correctly recovers all events. The legacy 01_data_cleaning.R has a bind-then-numeric-match bug that drops every "Yes" event; it is no longer used for analysis but is the source of the broken shared CSVs in replication/EKT replication data/.
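
As an illustration of the string-match recoding described in the last bullet and in the §2.2 table, here is a minimal Python sketch. It is not the pipeline's R code; the function names are assumptions, and mapping any unexpected label to NA is a choice made for this sketch.

```python
def recode_yes_no_dk(label):
    """Map a health-outcome label to 1 / 0 / None by exact string match."""
    mapping = {"Yes": 1, "No": 0}
    # "DK", missing values, and any unexpected label fall through to None (NA).
    return mapping.get(label)


def recode_uses_electric_kettle(method):
    """1 for "Boiled W Electricity", None for missing, 0 for any other method."""
    if method is None:
        return None
    return 1 if method == "Boiled W Electricity" else 0


labels = ["Yes", "No", "DK", None]
recoded = [recode_yes_no_dk(x) for x in labels]

methods = ["Boiled W Electricity", "Boiled W", "Bottled W", None]
ek_use = [recode_uses_electric_kettle(m) for m in methods]
```

Matching on the label text rather than on numeric codes is what avoids the class of bug attributed to the legacy script, where events were lost after binding and numeric matching.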

2.6 Live QC checks

These chunks rebuild every time the report renders and surface any regressions in the cleaning pipeline.

2.6.1 Treatment counts per round

Treatment assignment counts by round
Should be constant across rounds (village-level cRCT)
round treat_0 treat_1 total
1 390 510 900
3 390 510 900
6 390 510 900

2.6.2 Drinking-water method ↔︎ EK use concordance

drinking_water_method × uses_electric_kettle
`Boiled W Electricity` should be 100% EK=1; all others should be EK=0
drinking_water_method uses_electric_kettle n
Bottled W 0 3
Boiled W 0 955
Boiled W Electricity 1 814
Other 0 1

2.6.3 Diarrhea recall-period consistency

For each individual–round combination, compare the 1-week and 2-week diarrhea answers. A “Yes” to the 1-week question implies “Yes” to the 2-week question (nested recall windows), so the “Yes 1wk / No 2wk” cell should be zero. Anything non-zero reflects respondent inconsistency (recall bias or rapid question ordering); the variable mapping itself was verified against the survey instrument in §2.3.

1-week vs 2-week diarrhea per individual
Logically impossible cells = response inconsistency, not a coding bug
label round_1 round_3
No both 1895 1876
No 1wk / Yes 2wk (consistent) 28 19
Yes 1wk / No 2wk (logically impossible) 43 6
Yes both 45 41
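
The four cells above follow from a simple classification of each answer pair. A minimal Python sketch, with a hypothetical function name and made-up answers (the pipeline's own implementation is in R and may differ):

```python
from collections import Counter


def recall_cell(one_week, two_week):
    """Label a (1-week, 2-week) answer pair coded 1 = Yes, 0 = No, None = NA."""
    if one_week is None or two_week is None:
        return None
    if one_week == 1 and two_week == 0:
        return "Yes 1wk / No 2wk (logically impossible)"
    if one_week == 0 and two_week == 1:
        return "No 1wk / Yes 2wk (consistent)"
    return "Yes both" if one_week == 1 else "No both"


# Hypothetical answers (not study data).
pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 0), (None, 1)]
counts = Counter(recall_cell(a, b) for a, b in pairs)
```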

The specific individuals in the logically-impossible cell (“Yes 1wk / No 2wk”) are listed below so the data team can trace back to the survey records:

Individuals reporting diarrhea in past week but NOT in past 2 weeks
Cleaning is correct; these are respondent inconsistencies in the raw data
Round HH ID Person
1 10020706 1
1 10020706 2
1 10020709 1
1 10120006 2
1 10220505 1
1 10220506 2
1 10220512 1
1 10220525 1
1 10220527 1
1 10220527 2
1 10420815 1
1 10520007 1
1 10520016 2
1 10520021 5
1 10520030 2
1 10620713 3
1 10620714 3
1 10620720 4
1 10620924 1
1 10620925 1
1 10620930 4
1 10820523 1
1 10920001 5
1 10920702 3
1 10920703 1
1 10920703 2
1 10920708 2
1 10920710 1
1 10920712 5
1 10920724 2
1 10920728 1
1 10921103 1
1 11020212 2
1 11020504 2
1 11020615 1
1 20020603 1
1 20100013 2
1 20120228 1
1 20120610 1
1 20120612 1
1 20120614 2
1 20120623 1
1 20120623 2
3 10120504 3
3 10920709 1
3 10920712 4
3 10920716 3
3 10920718 1
3 10920724 2

2.6.4 Head-of-household identification rate by round

Households with an HoH identified, by round
Expected ~85–90% (some HHs do not flag a HoH in the roster)
round n_hh n_with_hoh pct_with_hoh
1 900 776 86.2
3 900 769 85.4
6 900 0 0.0

2.6.5 Education distribution at baseline (head-level)

Adults with high-school+ education (baseline)
% of adults; % of HHs with at least one such adult
n mean_pct median_pct pct_any_hs_adult
899 3.6 0 8.7

2.6.6 EUM outlier and negative-interval counts

EUM data quality flags by round
Negative interval = meter reset / date error; outlier = lpd > cap
round n_total n_lpd n_negative_flagged n_outlier_flagged
1 516 0 0 0
2 516 460 0 6
3 516 438 3 5
4 516 390 7 6
5 516 351 14 5

2.6.7 EUM interval-day distribution

Intervals between EUM readings should cluster around the planned ~90-day cadence. Intervals ≤ 0 days are date errors; very short (< 7) or very long (> 180) intervals are suspect.

Interval length between consecutive EUM readings, by round
Counts of intervals in each duration bin
interval_class round_2 round_3 round_4 round_5
30–180 days (expected) 486 466 437 409
NA (no prior reading) 30 45 74 93
> 180 days (very long) 0 5 0 0
≤ 0 days (backward / same-day) 0 0 5 7
7–29 days (short) 0 0 0 6
< 7 days (very short) 0 0 0 1
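
The bins in this table can be expressed as a single classification function. The Python sketch below mirrors the bin labels shown; the function name is hypothetical and the exact boundary handling in the R pipeline is assumed from the labels rather than confirmed.

```python
def interval_class(days):
    """Assign an EUM interval length (in days) to the bins used in the table."""
    if days is None:
        return "NA (no prior reading)"
    if days <= 0:
        return "≤ 0 days (backward / same-day)"
    if days < 7:
        return "< 7 days (very short)"
    if days < 30:
        return "7–29 days (short)"
    if days <= 180:
        return "30–180 days (expected)"
    return "> 180 days (very long)"


# Intervals quoted elsewhere in this report: the corrected and uncorrected
# round-2 interval for HH 11020623, and the backward interval for HH 20020418.
classes = [interval_class(d) for d in (93, 397, -196)]
```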

2.6.8 EUM households should have an electric kettle at baseline

Every household with EUM readings should have a kettle_type recorded at baseline (BE.1 is only asked when the household reports boiling with electricity in S.1).

EUM households cross-checked against baseline kettle_type
Any 'missing' rows = EUM data for HHs with no electric kettle on record
kettle_status n
kettle_type missing in baseline survey 294
kettle_type = Standard EK w/auto shutoff 118
kettle_type = EK (M), no auto shutoff 80
kettle_type = EK (L), no auto shutoff 19
kettle_type = Other 3
kettle_type = Induction kettle (M) 1
kettle_type = Kettle (S) & hot plate 1

2.6.9 EUM round-1 (installation) date sanity

Round-1 readings are the installation/baseline measurement. Dates should cluster around the baseline-survey period (late 2017 / early 2018). Wide ranges or out-of-period values suggest data-entry errors.

EUM round-1 (installation) date distribution
Should be a narrow window aligned with baseline data collection
n earliest_date median_date latest_date range_days
510 2017-11-01 2017-11-23 2017-12-10 39

Round-1 dates that fall well outside the expected installation window (2017-11 / 2017-12) — likely typos for the data team to correct:

EUM round-1 dates that look like data-entry typos
Any HH with a round-1 date before 2017-06-01 (median is 2017-11-23)
household_id round date eum
(no rows)

Within-HH date monotonicity (summary)
Rows where the EUM reading date is not strictly later than the previous round
n_backward_or_same_day n_households_affected
12 12

Specific (household_id, round) pairs with backward dates — for the data team to cross-check against paper forms:

Households with backward EUM dates
prev_date should be earlier than date; rows shown are where it is not
household_id Prev round Prev date Round Date
10220504 4 2018-08-18 5 2018-08-16
10220505 4 2018-08-26 5 2018-08-22
10220514 4 2018-08-19 5 2018-08-16
10220517 4 2018-08-19 5 2018-08-16
10220528 4 2018-08-20 5 2018-08-17
10220529 4 2018-08-18 5 2018-08-17
10520126 3 2018-08-28 4 2018-08-19
10520127 3 2018-08-28 4 2018-08-18
10520128 3 2018-08-28 4 2018-08-20
10520129 3 2018-08-28 4 2018-08-20
10520130 3 2018-08-28 4 2018-08-20
20020418 4 2018-08-10 5 2018-01-26

2.6.10 EUM attrition pattern

How many households completed how many EUM rounds?

EUM follow-up completion: households by number of rounds
Of HHs with any EUM data, distribution of rounds completed (max should be 5)
n_rounds_completed n_households pct
1 30 5.8
2 24 4.7
3 42 8.1
4 49 9.5
5 371 71.9

Households with more than 5 rounds — physically impossible (there are only 5 EUM rounds). See §2.4 flag 3: these IDs correspond exactly to the pending ID-correction list in load_clean_eum_data(), and the multi-round rows have non-monotonic cumulative kWh, so the corrections likely merge distinct households.

Households with > 5 EUM rounds (likely ID-correction artefacts)
All 5 IDs match the 'corrected to' targets in load_clean_eum_data()
household_id n_rounds_completed
(no rows)

2.6.11 Treatment-arm alignment: d_eum vs d_clean

d_eum’s treatment column comes from TC_scrambled in the EUM raw file. d_clean’s treatment column is the placeholder seeded village-level rbinom in load_clean_household_data(). Until the real allocation replaces the placeholder, these two assignments are independent random labels and should disagree at roughly chance rates. Once the real allocation arrives, this check must show 100% agreement.

Treatment arm: d_eum (TC_scrambled) vs d_clean (placeholder)
Expected to disagree today; must agree once the real allocation is wired in
tx_eum tx_clean n label
0 0 209 Agree
0 1 240 Disagree (placeholder vs TC_scrambled)
1 0 20 Disagree (placeholder vs TC_scrambled)
1 1 40 Agree
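
The agreement rate implied by the table can be computed directly from its four cells. A minimal Python sketch (the function name is an assumption; the counts are the ones tabulated above):

```python
def agreement_rate(cells):
    """Share of rows where the two treatment labels agree.

    `cells` maps (tx_eum, tx_clean) to a count.
    """
    total = sum(cells.values())
    agree = sum(n for (tx_eum, tx_clean), n in cells.items() if tx_eum == tx_clean)
    return agree / total


# Counts from the table above.
cells = {(0, 0): 209, (0, 1): 240, (1, 0): 20, (1, 1): 40}
rate = agreement_rate(cells)  # 249 / 509
```

The same function would return exactly 1.0 once the real allocation is wired in, which makes it usable as a pass/fail check.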

2.6.12 kWh-per-day sanity

Energy analogue of the lpd_outlier check: distribution of kWh used between consecutive readings, normalised by interval days. Negative values indicate meter resets/replacements. Persistently high values (> 5 kWh/day for a kettle is implausible — a boil cycle is ~0.1–0.3 kWh) suggest measurement errors that may slip past the lpd_outlier_cap.

kWh per day across all EUM intervals
n_negative = meter resets; n_gt_5kWh = implausibly high energy use
stat value
n 1701.000
min -1.388
p1 0.000
p25 0.053
median 0.156
p75 0.296
p99 1.891
max 55.748
n_negative 12.000
n_gt_5kWh 8.000

2.6.13 Water-travel-time covariate (SAP §5.3 trigger)

The SAP includes travel time to the drinking water source as a covariate only if more than 5% of households have travel time greater than 10 minutes. The chunk below evaluates that trigger.

Travel time to drinking water source (W.11) at baseline
SAP §5.3: include as covariate only if pct_gt_10_min > 5%
n_total n_with_minutes n_gt_10_min pct_gt_10_min median_min max_min
900 43 7 0.78 5 20

Most rows are NA because the question lets households fill in either a numeric minute count or a piped-connection letter code (a–e); only HHs who report fetching water externally fill in minutes. NA is therefore the expected value for HHs with piped water at home.
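
The trigger evaluation reduces to one percentage and one comparison. In the Python sketch below the function name is hypothetical; the 5% threshold is from SAP §5.3 as quoted above, and the denominator is all 900 households, which is what reproduces the tabulated 0.78%.

```python
def water_travel_trigger(n_total, n_gt_10_min, threshold_pct=5.0):
    """Return (pct_gt_10_min, include_covariate) for the SAP §5.3 rule."""
    pct = 100.0 * n_gt_10_min / n_total
    return pct, pct > threshold_pct


# Counts from the table above: 7 of 900 households report more than 10 minutes.
pct, include = water_travel_trigger(900, 7)
```

With these counts the percentage is about 0.78 and the comparison against 5% is false. Using n_with_minutes (43) as the denominator instead would give a very different figure, so the choice of denominator is worth stating wherever this result is reported.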

2.6.14 Missing-data summary for analysis-key variables

Missing-data rates at midline for key analysis variables
variable n_missing pct_missing
uses_electric_kettle 27 3.0
reported_hw_soap_outcome 4 0.4
observed_hw_soap_outcome 4 0.4
log10wTTC 623 69.2
mountains 0 0.0
baseline_uses_ek 0 0.0
toilet 34 3.8
filter_use 34 3.8
water_quality_perception 37 4.1

3. CONSORT Flow

Participant Flow Summary
Stage N
Eligible poverty households 3,545 (across 43 villages)
Randomly selected villages 30
Randomly selected households 900
Baseline (R1) — HH surveys 900
Treatment arm (R1) 510
Control arm (R1) 390
Midline (R3) — HH surveys 900
Treatment arm (R3) 510
Control arm (R3) 390
Endline (R6) — short survey 900
Water quality subsample (R1+R3) 546

4. Baseline Characteristics (SAP Section 4.5)

Table 1: Baseline characteristics by treatment arm
Variable Level Control (arm 0) Treatment (arm 1)
n 390 510
surveyTime (mean (SD)) 25.23 (7.59) 24.44 (7.02)
nhh (mean (SD)) 2.27 (1.11) 2.42 (1.26)
n_female_adults (mean (SD)) 0.96 (0.60) 1.02 (0.65)
n_female_children (mean (SD)) 0.13 (0.36) 0.16 (0.45)
pct_adults_highschool (mean (SD)) 0.03 (0.12) 0.04 (0.13)
uses_electric_kettle (%) 0 217 (55.6) 342 (67.1)
1 173 (44.4) 168 (32.9)
drinking_water_method (%) Bottled W 0 ( 0.0) 2 ( 0.4)
Boiled W 217 (55.6) 340 (66.7)
Boiled W Electricity 173 (44.4) 168 (32.9)
water_quality_perception (%) Very bad 1 ( 0.3) 0 ( 0.0)
Poor 55 (14.1) 34 ( 6.7)
Satisfactory 76 (19.5) 128 (25.1)
Good 160 (41.1) 248 (48.6)
Very good 78 (20.1) 85 (16.7)
Don't know 19 ( 4.9) 15 ( 2.9)
filter_use (%) 1 364 (97.6) 494 (97.2)
2 5 ( 1.3) 4 ( 0.8)
3 4 ( 1.1) 10 ( 2.0)
toilet (%) No toilet 8 ( 2.1) 6 ( 1.2)
Ventilation improved toilet 14 ( 3.6) 2 ( 0.4)
Basic pit latrine 309 (79.4) 429 (84.3)
Double pit latrine 1 ( 0.3) 1 ( 0.2)
Septic tank latrine 2 ( 0.5) 4 ( 0.8)
Biogas toilet 2 ( 0.5) 4 ( 0.8)
Water flush toilet 29 ( 7.5) 26 ( 5.1)
Water flush toilet w/sewer connection 21 ( 5.4) 36 ( 7.1)
Other 3 ( 0.8) 1 ( 0.2)
reported_hw_soap_outcome (%) 0 360 (92.3) 478 (93.7)
1 30 ( 7.7) 32 ( 6.3)
observed_hw_soap_outcome (%) 0 354 (90.8) 423 (82.9)
1 36 ( 9.2) 87 (17.1)
asset_fridge (%) 0 96 (24.6) 84 (16.5)
1 287 (73.6) 412 (80.8)
2 7 ( 1.8) 14 ( 2.7)
asset_washingmachine (%) 0 305 (78.4) 433 (84.9)
1 84 (21.6) 76 (14.9)
2 0 ( 0.0) 1 ( 0.2)
asset_ricecooker (%) 0 49 (12.6) 59 (11.6)
1 326 (83.6) 442 (86.7)
2 15 ( 3.8) 9 ( 1.8)
asset_tv_flat (%) 0 225 (57.8) 317 (63.0)
1 158 (40.6) 181 (36.0)
2 6 ( 1.5) 5 ( 1.0)
asset_computer (%) 0 369 (94.6) 482 (95.4)
1 19 ( 4.9) 22 ( 4.4)
2 2 ( 0.5) 1 ( 0.2)

5. Unadjusted Means and SDs (SAP Section 5.2)

Per the SAP: “Unadjusted and adjusted means and SDs will be reported for treatment and control groups.”

Unadjusted Means (SD) by Treatment Arm at Midline
Outcome Control N Control Mean (SD) Treatment N Treatment Mean (SD)
EK use prevalence 378 0.537 (0.499) 495 0.545 (0.498)
Log10 TTC 58 0.285 (0.803) 219 0.139 (0.512)
Diarrhea (7-day) 836 0.042 (0.200) 1182 0.012 (0.108)
Handwashing (reported) 386 0.197 (0.398) 510 0.041 (0.199)
Handwashing (observed) 387 0.243 (0.429) 509 0.128 (0.334)
Cough (7-day) 787 0.105 (0.307) 1173 0.033 (0.179)
Skin rash (7-day) 826 0.036 (0.187) 1173 0.009 (0.092)
Toothache (7-day) 826 0.062 (0.241) 1167 0.010 (0.101)
EUM liters/person/day 387 1.119 (1.395) 51 1.438 (1.614)

Baseline vs Midline: Key Outcomes by Arm

Figure 1. Baseline (round 1) and midline (round 3) prevalence of self-reported electric kettle use and self-reported handwashing with soap, by treatment arm. Bars show the proportion in each round-arm cell.

6. Primary Outcomes (SAP Section 5.1.1)

5a. EK Use Prevalence (Reported)

Descriptive Statistics

EK Use Prevalence by Round and Treatment Arm
Round Arm N N Using EK Prevalence
1 Control 390 173 44.4%
1 Treatment 510 168 32.9%
3 Control 378 203 53.7%
3 Treatment 495 270 54.5%

Figure 2. Self-reported electric kettle (EK) use by round and arm. N and number of EK users per cell are shown above.

Figure 3. Proportion of households reporting electric kettle use at baseline (round 1) and midline (round 3), by treatment arm.

Unadjusted: Mantel-Haenszel Prevalence Ratios and Differences

EK Use: Unadjusted MH Estimates (Midline)
Stratum Measure Estimate (95% CI) P-value N
Overall (block-stratified) PR 1.068 (0.915, 1.246) 0.4057 873
Overall (block-stratified) RD 0.034 (-0.042, 0.110) 0.3835 873
Mountain PR 1.134 (0.920, 1.398) 0.2398 580
Mountain RD 0.060 (-0.034, 0.153) 0.2108 580
Plain PR 0.972 (0.777, 1.214) 0.8005 293
Plain RD -0.017 (-0.148, 0.114) 0.7960 293
EK Cat. 0 PR -- -- 144
EK Cat. 0 RD -- -- 144
EK Cat. 1 PR -- -- 204
EK Cat. 1 RD -- -- 204
EK Cat. 2 PR 0.913 (0.776, 1.074) 0.2731 353
EK Cat. 2 RD -0.057 (-0.159, 0.045) 0.2718 353
EK Cat. 3 PR -- -- 172
EK Cat. 3 RD -- -- 172
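
As a reference for how a block-stratified Mantel-Haenszel point estimate is formed, the Python sketch below pools per-block 2×2 counts into a prevalence ratio and a risk difference using the standard MH weights. It is an illustration, not the pipeline's code: the function name `mh_pr_rd` and the counts are hypothetical, and the cluster-aware variance behind the reported CIs is not reproduced.

```python
def mh_pr_rd(strata):
    """Pool 2x2 tables with Mantel-Haenszel weights.

    Each stratum is (events_trt, n_trt, events_ctl, n_ctl).
    Returns (prevalence ratio, risk difference); no variance is computed.
    """
    pr_num = pr_den = rd_num = rd_den = 0.0
    for a, n1, c, n0 in strata:
        total = n1 + n0
        pr_num += a * n0 / total          # treated events, weighted by control size
        pr_den += c * n1 / total          # control events, weighted by treated size
        rd_num += (a * n0 - c * n1) / total
        rd_den += n1 * n0 / total
    return pr_num / pr_den, rd_num / rd_den

# Hypothetical counts for two randomization blocks (not study data)
blocks = [(30, 100, 20, 100), (45, 150, 15, 50)]
pr, rd = mh_pr_rd(blocks)
print(f"PR = {pr:.3f}, RD = {rd:.3f}")
```

In this toy example the second block has identical prevalence in both arms, so it pulls the pooled PR toward 1 without changing its direction.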

Adjusted: Modified Poisson GLMM (IRR)

EK Use: Adjusted IRR (Modified Poisson GLMM)
Outcome IRR (95% CI) P-value N
uses_electric_kettle 0.945 (0.522, 1.711) 0.8520 873

6b. EK Use Intensity (EUM)

Descriptive Statistics

EK Use Intensity: Liters per Person per Day by Round (Interval-based)
Round Arm N Mean L/p/d SD Median Min L/p/d Max L/p/d
2 Control 407 0.86 1.19 0.52 0.00 9.48
2 Treatment 53 1.46 1.72 0.78 0.00 7.62
3 Control 387 1.12 1.39 0.67 0.00 9.64
3 Treatment 51 1.44 1.61 0.84 0.00 7.00
4 Control 343 1.02 1.19 0.70 0.00 9.43
4 Treatment 47 1.58 1.75 1.12 0.03 8.28
5 Control 306 1.27 1.49 0.92 0.00 9.00
5 Treatment 45 1.22 1.16 0.92 0.01 6.28
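
All maxima in this table sit below 10 L/person/day because of the outlier cap documented in the SAP deviations, which sets values above the cap to NA. A minimal Python sketch of that rule follows; it is not the R function `load_clean_eum_data()`, the helper names are hypothetical, and the input values are invented for illustration.

```python
from statistics import mean, median

LPD_OUTLIER_CAP = 10.0  # L/person/day, per the documented deviation

def cap_lpd(values, cap=LPD_OUTLIER_CAP):
    """Replace values above the cap with None (the analogue of NA in R)."""
    return [v if v is not None and v <= cap else None for v in values]

def summarise(values):
    """N, mean, median and max over the non-missing values."""
    kept = [v for v in values if v is not None]
    return {"n": len(kept), "mean": mean(kept), "median": median(kept), "max": max(kept)}

# Hypothetical household values (not study data); 150.0 mimics an implausible reading
raw = [0.0, 0.5, 1.2, 9.4, 150.0]
capped = cap_lpd(raw)
summary = summarise(capped)
print(summary)
```

Because capped values become missing rather than being truncated to 10, they drop out of N as well as out of the mean.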

Unadjusted: T-tests by Round

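EUM lpd: unadjusted cluster-aware t-test by round (Treatment − Control)
Paired t-test on block-level means (washb_ttest); df = n_blocks − 1. A Welch unpaired fallback, which is not cluster-aware, is used when n_blocks < 2. Diff is estimated from block-level means, so it need not equal the difference between the pooled Control Mean and Treatment Mean columns.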
Round Control Mean Treatment Mean Diff (95% CI) P-value N Blocks Method
2 (~3mo) 0.86 1.46 1.223 (-0.639, 3.086) 0.1522 460 6 paired-block (washb_ttest)
3 (~6mo) 1.12 1.44 0.583 (-0.429, 1.595) 0.1986 438 6 paired-block (washb_ttest)
4 (~9mo) 1.02 1.58 0.699 (-0.074, 1.472) 0.0677 390 6 paired-block (washb_ttest)
5 (~12mo) 1.27 1.22 0.104 (-0.307, 0.514) 0.5450 351 6 paired-block (washb_ttest)
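
The paired-block approach named in the table note can be sketched as follows. This Python illustration is not `washb_ttest`; the function name `paired_block_ttest` and the block means are hypothetical, and it returns only the difference, standard error, t statistic, and degrees of freedom (the CI would additionally need a t quantile at those degrees of freedom).

```python
import math
from statistics import mean, stdev

def paired_block_ttest(block_means):
    """Paired t-test on block-level arm means.

    block_means: list of (control_mean, treatment_mean), one pair per block.
    Returns (mean difference, standard error, t statistic, degrees of freedom).
    """
    diffs = [trt - ctl for ctl, trt in block_means]
    n_blocks = len(diffs)
    if n_blocks < 2:
        raise ValueError("need at least 2 blocks; a fallback test would apply")
    diff = mean(diffs)
    se = stdev(diffs) / math.sqrt(n_blocks)
    return diff, se, diff / se, n_blocks - 1

# Hypothetical block-level means in L/person/day (not study data)
blocks = [(0.8, 1.0), (1.0, 1.6), (1.2, 1.3), (0.9, 2.1), (1.1, 1.0), (0.7, 1.5)]
diff, se, t_stat, df = paired_block_ttest(blocks)
print(f"diff = {diff:.3f}, se = {se:.3f}, t = {t_stat:.2f}, df = {df}")
```

With six blocks the test has 5 degrees of freedom, which is why the intervals are wide even though several hundred households contribute.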

Unadjusted: T-tests Stratified (Midline, Round 3)

EUM lpd: Stratified T-tests (Midline, Round 3)
Stratum Control Treatment Diff (95% CI) P-value N
Overall (block-paired) 1.12 1.44 0.583 (-0.429, 1.595) 0.1986 438
Mountain 1.19 1.12 -0.057 (-1.354, 1.241) 0.8685 276
Plain 1.00 1.97 1.223 (-1.032, 3.478) 0.1448 162
EK Cat. 0 1.89 1.24 -0.653 (-3.085, 1.779) 0.4802 61
EK Cat. 1 1.13 3.06 -- -- 91
EK Cat. 2 0.94 1.45 0.647 (-3.532, 4.827) 0.2993 191
EK Cat. 3 0.92 1.29 0.314 (-1.583, 2.212) 0.2823 95

Adjusted: Linear Mixed Model

EUM lpd: Adjusted Mean Difference (Linear Mixed Model, Midline)
Outcome Estimate (95% CI) P-value N
lpd 0.436 (-0.366, 1.238) 0.2643 438

6c. EK Use Likelihood Index (Enumerator Observations)

EK Use Likelihood Index (1-10 scale) by Round and Arm
Round Arm N Mean Index SD
3 Control 246 8.13 1.49
3 Treatment 305 8.29 1.78
6 Control 271 8.62 1.48
6 Treatment 344 8.64 1.78

Figure 5. Distribution of the enumerator-observation EK Use Likelihood Index categories (Unlikely / Occasional / Likely regular user) at baseline (R1) and midline (R3), by treatment arm.

6d. Exploratory: EK use at 12 months (R6 endline)

Exploratory analysis — not in the pre-specified SAP

R6 (endline) was a short survey that captured only enumerator observations of the kettle (od8/od9/od10) and not the self-reported drinking_water_method that underlies the SAP-defined uses_electric_kettle outcome at baseline and midline.

To produce a 12-month EK-use estimate from R6, we derive a binary “likely EK user” from the enumerator-observation index ek_use_likelihood (1–10):

  • Strict threshold: ek_use_likelihood > 7 (“Likely regular user”).
  • Lenient threshold: ek_use_likelihood > 4 (“Occasional or Likely regular user”).

Both are reported because the choice of cut-point is arbitrary. The lenient threshold is the closer analog to the midline self-report binary, which captures anyone whose primary boiling method is electric, regardless of frequency.
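
The two cut-points can be applied as in the Python sketch below. The function name `likely_user_share` and the index values are hypothetical; only the thresholds (> 7 and > 4) come from the text.

```python
def likely_user_share(index_values, threshold):
    """Percent of households whose 1-10 likelihood index exceeds the threshold."""
    flags = [v > threshold for v in index_values]
    return 100 * sum(flags) / len(flags)

STRICT, LENIENT = 7, 4  # cut-points defined in the text

# Hypothetical index values for ten households (not study data)
index = [10, 9, 8, 8, 7, 6, 5, 4, 3, 9]
strict_pct = likely_user_share(index, STRICT)
lenient_pct = likely_user_share(index, LENIENT)
print(strict_pct, lenient_pct)
```

Both comparisons are strict inequalities, so a household scoring exactly 7 counts under the lenient definition only, and one scoring exactly 4 counts under neither.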

This analysis was not pre-specified and should be interpreted as a descriptive 12-month follow-up signal, not a primary outcome.

Descriptive: enumerator-observed EK use at 12 months by arm

Enumerator-observed EK use at 12 months
Mean likelihood index (1–10), and % above each threshold
Arm N Mean likelihood % strict (> 7) % lenient (> 4)
Control 271 8.62 89.3 97.8
Treatment 344 8.64 87.5 96.2

Cluster-aware MH at 12 months (both thresholds)

EK use at 12 months: MH PR / RD by threshold
Cluster-aware MH stratified by randomization block
Threshold Measure Estimate (95% CI) P-value N
Strict (likelihood > 7) PR 1.043 (0.973, 1.117) 0.2363 615
Strict (likelihood > 7) RD 0.037 (-0.025, 0.099) 0.2402 615
Lenient (likelihood > 4) PR 1.000 (0.978, 1.023) 0.9841 615
Lenient (likelihood > 4) RD 0.000 (-0.022, 0.023) 0.9839 615

7. Secondary Outcomes

7a. Thermotolerant Coliforms (TTC) — SAP Section 5.1.2

Descriptive: Detection, Geometric Mean, Risk Classification

TTC Summary by Round, Arm, and Geography
Round Arm Geography N % Detected Geo. Mean MPN/100mL 95% CI Lower 95% CI Upper
1 Control mountain 55 20.0 2.0 1.3 2.9
1 Treatment mountain 100 29.0 2.5 1.8 3.5
1 Treatment plain 114 21.1 2.0 1.5 2.7
3 Control mountain 58 12.1 1.9 1.2 3.1
3 Treatment mountain 104 8.7 1.4 1.1 1.8
3 Treatment plain 115 8.7 1.4 1.1 1.7

Figure 6. Geometric mean thermotolerant coliforms (TTC, MPN/100 mL) in household drinking water by round, geography (mountain vs plain), and treatment arm. Error bars are 95% confidence intervals around the geometric mean.

TTC Risk Classification
Round Arm Risk Category N %
1 Control Not detected 44 80.0
1 Control 1-9 MPN/100mL 2 3.6
1 Control 10-99 MPN/100mL 8 14.5
1 Control >=100 MPN/100mL 1 1.8
1 Treatment Not detected 161 75.2
1 Treatment 1-9 MPN/100mL 14 6.5
1 Treatment 10-99 MPN/100mL 30 14.0
1 Treatment >=100 MPN/100mL 9 4.2
3 Control Not detected 51 87.9
3 Control 1-9 MPN/100mL 0 0.0
3 Control 10-99 MPN/100mL 2 3.4
3 Control >=100 MPN/100mL 5 8.6
3 Treatment Not detected 200 91.3
3 Treatment 1-9 MPN/100mL 6 2.7
3 Treatment 10-99 MPN/100mL 5 2.3
3 Treatment >=100 MPN/100mL 8 3.7
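
The four risk categories in this table can be assigned from a raw count as in the Python sketch below. The function name is hypothetical, and treating any value below 1 MPN/100 mL as "Not detected" is an assumption about how non-detects are coded.

```python
from collections import Counter

def ttc_risk_category(mpn_per_100ml: float) -> str:
    """Map a TTC count (MPN/100 mL) to the four categories used in the table."""
    if mpn_per_100ml < 1:          # assumption: non-detects are coded below 1
        return "Not detected"
    if mpn_per_100ml < 10:
        return "1-9 MPN/100mL"
    if mpn_per_100ml < 100:
        return "10-99 MPN/100mL"
    return ">=100 MPN/100mL"

# Hypothetical sample results (not study data)
samples = [0, 3, 12, 250, 0]
counts = Counter(ttc_risk_category(s) for s in samples)
print(counts)
```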

Unadjusted: T-tests (Overall)

Log10 TTC: unadjusted cluster-aware t-test (Treatment − Control)
Paired t-test on block-level means when n_blocks ≥ 2; Welch unpaired fallback otherwise (not cluster-aware).
Round Control Mean Treatment Mean Diff (95% CI) P-value N Blocks Method
1 (Baseline) 0.29 0.35 0.062 (-0.130, 0.254) 0.5252 269 1 Welch (fallback, NOT cluster-aware)
3 (Midline) 0.29 0.14 -0.146 (-0.368, 0.075) 0.1917 277 1 Welch (fallback, NOT cluster-aware)
Note

The water-quality subsample (~270 at baseline, ~280 at midline) often does not cover ≥ 2 randomization blocks with both arms present. When that happens, the function falls back to a Welch unpaired t-test on the raw values; the Method column makes the fallback explicit. The fallback is not cluster-aware (it treats individual households as independent), so the adjusted LMM result below remains the primary inference for TTC.
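
For reference, the Welch fallback treats every household as independent and uses the Welch-Satterthwaite degrees of freedom. The Python sketch below illustrates that calculation; the function name `welch_ttest` and the input values are hypothetical, and it is not the pipeline's R code.

```python
import math
from statistics import mean, variance

def welch_ttest(x, y):
    """Welch two-sample t-test of mean(y) - mean(x) with unequal variances.

    Returns (difference, standard error, t statistic, Welch-Satterthwaite df).
    """
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    diff = mean(y) - mean(x)
    se = math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return diff, se, diff / se, df

# Hypothetical log10 TTC values (not study data)
control = [0.0, 0.0, 1.2, 0.0, 2.1, 0.0]
treatment = [0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.3, 0.0]
diff, se, t_stat, df = welch_ttest(control, treatment)
print(f"diff = {diff:.3f}, se = {se:.3f}, t = {t_stat:.2f}, df = {df:.1f}")
```

Because the standard error ignores within-village correlation, a test of this kind will generally understate uncertainty in a cluster-randomized design.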

Unadjusted: T-tests Stratified (Midline)

Log10 TTC: Stratified T-tests (Midline)
Stratum Control Treatment Diff (95% CI) P-value N
Overall (Welch fallback, not cluster-aware) 0.29 0.14 -0.146 (-0.368, 0.075) 0.1917 277
Mountain 0.29 0.15 -0.139 (-0.372, 0.095) 0.2409 162
Plain NA 0.13 -- -- 115
EK Cat. 0 NA 0.27 -- -- 52
EK Cat. 1 0.16 0.05 -0.113 (-0.355, 0.130) 0.3517 89
EK Cat. 2 NA 0.00 -- -- 28
EK Cat. 3 0.42 0.17 -0.250 (-0.637, 0.137) 0.1975 108

Adjusted: Linear Mixed Model

Log10 TTC: Adjusted Mean Difference (Linear Mixed Model, Midline)
Outcome Estimate (95% CI) P-value N
log10wTTC -0.157 (-1.130, 0.815) 0.5595 277

7b. Diarrhea Prevalence — SAP Section 5.1.2

Unadjusted: MH (7-day recall) — Overall + Stratified

Diarrhea (7-day recall): Unadjusted MH Estimates (Midline)
Stratum Measure Estimate (95% CI) P-value N
Overall (block-stratified) PR 0.484 (0.214, 1.095) 0.0817 2018
Overall (block-stratified) RD -0.015 (-0.031, 0.001) 0.0668 2018
Mountain PR 0.520 (0.189, 1.434) 0.2064 1291
Mountain RD -0.014 (-0.034, 0.006) 0.1702 1291
Plain PR 0.413 (0.105, 1.621) 0.2051 727
Plain RD -0.016 (-0.041, 0.009) 0.2144 727
EK Cat. 0 PR -- -- 334
EK Cat. 0 RD -- -- 334
EK Cat. 1 PR -- -- 454
EK Cat. 1 RD -- -- 454
EK Cat. 2 PR 0.965 (0.353, 2.636) 0.9441 797
EK Cat. 2 RD -0.001 (-0.020, 0.019) 0.9435 797
EK Cat. 3 PR -- -- 433
EK Cat. 3 RD -- -- 433

Diarrhea Prevalence by Age Group — Descriptive

Diarrhea Prevalence (%) by Age Group and Arm (Midline, 7-day recall)
Age Group Control N Control Events Control Prevalence (%) Treatment N Treatment Events Treatment Prevalence (%)
<5 years 11 0 0.0 17 2 11.8
5-18 years 82 3 3.7 125 1 0.8
>18 years 661 20 3.0 990 11 1.1
Unknown 82 12 14.6 50 0 0.0
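
The age grouping and the prevalence column can be reproduced as in the Python sketch below. It follows the documented boundary rule (18-year-olds belong to the 5-18 group); the function names are hypothetical, and mapping a missing age to "Unknown" is an assumption about how that row arises.

```python
def age_group(age):
    """Assign the report's age categories; 18-year-olds fall in '5-18 years'."""
    if age is None:                # assumption: missing age maps to "Unknown"
        return "Unknown"
    if age < 5:
        return "<5 years"
    if age <= 18:
        return "5-18 years"
    return ">18 years"

def prevalence_pct(events, n):
    """Prevalence as a percentage, rounded to one decimal place."""
    return round(100 * events / n, 1)

# Counts taken from the table above
print(age_group(18), prevalence_pct(20, 661), prevalence_pct(2, 17))
```

The under-5 cells hold only 11 and 17 individuals, so a single event moves the treatment-arm prevalence by almost 6 percentage points.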

By Age Group — MH Prevalence Ratios

Diarrhea PR by Age Group (Midline, 7-day recall)
Age Group Measure Estimate (95% CI) P-value N
<5 years PR -- -- 28
5-18 years PR 0.355 (0.039, 3.234) 0.3581 207
>18 years PR 0.495 (0.192, 1.273) 0.1443 1651
all PR 0.484 (0.214, 1.095) 0.0817 2018

Figure 7. Cluster-aware Mantel–Haenszel prevalence ratios for 7-day diarrhea at midline, by age group (<5 years, 5–18 years, >18 years, and all ages). Stratified by the eight randomization blocks (mountains × baseline EK use category).

Figure 8. Forest plot of the 7-day diarrhea prevalence ratios by age group (midline), on a log scale. Point estimates and 95% CIs from cluster-aware Mantel–Haenszel.
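
Because ratio estimates are plotted and tested on the log scale, a reported CI can be converted back to a log-scale standard error as a consistency check. The Python sketch below assumes a symmetric Wald interval on the log scale with a normal reference distribution; the function name `p_from_ratio_ci` is hypothetical, and the pipeline may use a different reference distribution.

```python
import math
from statistics import NormalDist

def p_from_ratio_ci(estimate, lo, hi, level=0.95):
    """Recover the log-scale SE from a ratio's CI; return (se, z, two-sided p)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = 2 * NormalDist().cdf(-abs(z))
    return se, z, p

# Overall 7-day diarrhea PR as reported: 0.484 (0.214, 1.095)
se, z, p = p_from_ratio_ci(0.484, 0.214, 1.095)
print(f"se(log PR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recovered p-value is about 0.08, in line with the reported 0.0817.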

Adjusted: Modified Poisson GLMM

Diarrhea: Adjusted IRR (Modified Poisson GLMM)
Outcome IRR (95% CI) P-value N
diarrhea_value 0.435 (0.032, 5.971) 0.5331 2018

7c. Handwashing with Soap — SAP Section 5.1.2

Reported Handwashing — Overall + Stratified

Reported Handwashing with Soap: Unadjusted MH (Midline)
Stratum Measure Estimate (95% CI) P-value N
Overall (block-stratified) PR 0.308 (0.193, 0.493) <0.0001 896
Overall (block-stratified) RD -0.115 (-0.166, -0.065) <0.0001 896
Mountain PR 0.215 (0.122, 0.381) <0.0001 596
Mountain RD -0.159 (-0.224, -0.094) <0.0001 596
Plain PR 0.714 (0.281, 1.818) 0.4802 300
Plain RD -0.027 (-0.102, 0.048) 0.4856 300
EK Cat. 0 PR -- -- 150
EK Cat. 0 RD -- -- 150
EK Cat. 1 PR -- -- 206
EK Cat. 1 RD -- -- 206
EK Cat. 2 PR 0.586 (0.309, 1.111) 0.1015 360
EK Cat. 2 RD -0.052 (-0.116, 0.012) 0.1115 360
EK Cat. 3 PR -- -- 180
EK Cat. 3 RD -- -- 180

Observed Handwashing — Overall + Stratified

Observed Handwashing with Soap: Unadjusted MH (Midline)
Stratum Measure Estimate (95% CI) P-value N
Overall (block-stratified) PR 0.870 (0.583, 1.298) 0.4962 896
Overall (block-stratified) RD -0.018 (-0.072, 0.035) 0.5073 896
Mountain PR 1.384 (0.765, 2.506) 0.2825 596
Mountain RD 0.029 (-0.026, 0.085) 0.2985 596
Plain PR 0.575 (0.326, 1.013) 0.0556 300
Plain RD -0.113 (-0.226, -0.000) 0.0492 300
EK Cat. 0 PR -- -- 150
EK Cat. 0 RD -- -- 150
EK Cat. 1 PR -- -- 207
EK Cat. 1 RD -- -- 207
EK Cat. 2 PR 0.605 (0.378, 0.966) 0.0353 359
EK Cat. 2 RD -0.087 (-0.167, -0.006) 0.0348 359
EK Cat. 3 PR -- -- 180
EK Cat. 3 RD -- -- 180

8. Tertiary Outcome: Cough Prevalence (SAP Section 5.1.3)

Overall + Stratified by Randomization Block

Cough (7-day): unadjusted cluster-aware MH (midline)
Stratum Measure Estimate (95% CI) P-value N
Overall (block-stratified) PR 0.432 (0.266, 0.699) 0.0006 1960
Overall (block-stratified) RD -0.049 (-0.077, -0.022) 0.0004 1960
Mountain PR 0.606 (0.311, 1.181) 0.1409 1240
Mountain RD -0.026 (-0.058, 0.006) 0.1121 1240
Plain PR 0.270 (0.131, 0.557) 0.0004 720
Plain RD -0.089 (-0.139, -0.040) 0.0004 720
EK Cat. 0 PR -- -- 332
EK Cat. 0 RD -- -- 332
EK Cat. 1 PR -- -- 446
EK Cat. 1 RD -- -- 446
EK Cat. 2 PR 0.971 (0.520, 1.813) 0.9261 763
EK Cat. 2 RD -0.002 (-0.035, 0.032) 0.9243 763
EK Cat. 3 PR -- -- 419
EK Cat. 3 RD -- -- 419

By Age Group (SAP §5.1.3: <5, 5-18, >18, overall)

Cough (7-day) PR by age group (midline)
Cluster-aware MH stratified by randomization block
Age Group Measure Estimate (95% CI) P-value N
<5 years PR 0.714 (0.128, 3.974) 0.7008 28
5-18 years PR 2.786 (0.267, 29.038) 0.3917 198
>18 years PR 0.411 (0.243, 0.697) 0.0010 1610
all PR 0.432 (0.266, 0.699) 0.0006 1960

9. Negative Control Outcomes (SAP Section 5.1.4)

Overall + Stratified by Randomization Block

Negative controls (7-day): unadjusted cluster-aware MH (midline)
Outcome Stratum Measure Estimate (95% CI) P-value N
rash_value Overall (block-stratified) PR 0.078 (0.020, 0.315) 0.0003 1999
rash_value Overall (block-stratified) RD -0.034 (-0.050, -0.018) <0.0001 1999
rash_value Mountain PR 0.034 (0.001, 0.907) 0.0435 1277
rash_value Mountain RD -0.022 (-0.037, -0.006) 0.0059 1277
rash_value Plain PR 0.107 (0.023, 0.503) 0.0047 722
rash_value Plain RD -0.057 (-0.091, -0.022) 0.0014 722
rash_value EK Cat. 0 PR -- -- 333
rash_value EK Cat. 0 RD -- -- 333
rash_value EK Cat. 1 PR -- -- 449
rash_value EK Cat. 1 RD -- -- 449
rash_value EK Cat. 2 PR 0.055 (0.007, 0.408) 0.0046 790
rash_value EK Cat. 2 RD -0.041 (-0.063, -0.018) 0.0004 790
rash_value EK Cat. 3 PR -- -- 427
rash_value EK Cat. 3 RD -- -- 427
toothache_value Overall (block-stratified) PR 0.137 (0.047, 0.400) 0.0003 1993
toothache_value Overall (block-stratified) RD -0.039 (-0.057, -0.021) <0.0001 1993
toothache_value Mountain PR 0.167 (0.046, 0.605) 0.0064 1272
toothache_value Mountain RD -0.037 (-0.060, -0.014) 0.0014 1272
toothache_value Plain PR 0.083 (0.011, 0.642) 0.0170 721
toothache_value Plain RD -0.042 (-0.071, -0.012) 0.0053 721
toothache_value EK Cat. 0 PR -- -- 328
toothache_value EK Cat. 0 RD -- -- 328
toothache_value EK Cat. 1 PR -- -- 450
toothache_value EK Cat. 1 RD -- -- 450
toothache_value EK Cat. 2 PR 0.309 (0.105, 0.905) 0.0322 791
toothache_value EK Cat. 2 RD -0.024 (-0.046, -0.002) 0.0320 791
toothache_value EK Cat. 3 PR -- -- 424
toothache_value EK Cat. 3 RD -- -- 424

Rash by Age Group (SAP §5.1.4)

Skin rash (7-day) PR by age group (midline)
Cluster-aware MH stratified by randomization block
Age Group Measure Estimate (95% CI) P-value N
<5 years PR -- -- 26
5-18 years PR -- -- 203
>18 years PR 0.094 (0.023, 0.380) 0.0009 1646
all PR 0.078 (0.020, 0.315) 0.0003 1999

Toothache by Age Group (SAP §5.1.4)

Toothache (7-day) PR by age group (midline)
Cluster-aware MH stratified by randomization block
Age Group Measure Estimate (95% CI) P-value N
<5 years PR -- -- 26
5-18 years PR -- -- 201
>18 years PR 0.162 (0.052, 0.504) 0.0017 1640
all PR 0.137 (0.047, 0.400) 0.0003 1993

10. All Adjusted Results Summary

All Adjusted Treatment Effects (Midline)
Outcome Model Type Estimate (95% CI) P-value N
uses_electric_kettle Binary (IRR) 0.945 (0.522, 1.711) 0.8520 873
diarrhea_value Binary (IRR) 0.435 (0.032, 5.971) 0.5331 2018
reported_hw_soap_outcome Binary (IRR) 0.178 (0.042, 0.755) 0.0192 896
observed_hw_soap_outcome Binary (IRR) 0.946 (0.250, 3.585) 0.9354 896
cough_value Binary (IRR) 0.524 (0.149, 1.845) 0.3143 1960
rash_value Binary (IRR) 0.086 (0.011, 0.649) 0.0174 1999
toothache_value Binary (IRR) 0.183 (0.024, 1.403) 0.1022 1993
log10wTTC Continuous (Coef) -0.157 (-1.130, 0.815) 0.5595 277
lpd Continuous (Coef) 0.436 (-0.366, 1.238) 0.2643 438

Forest Plot: All Outcomes

Figure 9. Forest plot of all binary outcomes at midline, showing the unadjusted cluster-aware Mantel–Haenszel prevalence ratio and the adjusted modified Poisson GLMM IRR (with CR2 cluster-robust SE) side by side. Negative controls (skin rash, toothache) are marked with triangles; SAP-specified outcomes with circles. Estimates on log scale; dashed reference at 1.

11. Subgroup Analyses (SAP Section 5.4)

By Primary Water Type

Subgroup by water type not estimable (no within-group variation in EK use when stratified by baseline water method).

By Water Quality Perception

EK Use PR by Water Quality Perception
Perception PR (95% CI) P-value N
Good 0.910 (0.764, 1.084) 0.2903 498
Very good 1.814 (0.789, 4.170) 0.1608 133
Don't know -- -- 14
Satisfactory 1.193 (0.724, 1.966) 0.4878 148
Poor 0.523 (0.219, 1.248) 0.1440 64

By Electricity Affordability (FE.15)

EK Use PR by Perceived Electricity Affordability (FE.15)
Affordability category PR (95% CI) P-value N
Slightly expensive 2.396 (1.729, 3.320) <0.0001 259
Affordable 0.773 (0.610, 0.979) 0.0326 536
Very affordable -- -- 50
Expensive 1.750 (0.576, 5.316) 0.3236 12
Very expensive -- -- 11

By Risk Perception of Untreated Water (FQ.1)

Shown both at the raw FQ.1-category level and at a binary collapse into “aware of any health consequence” vs “not aware”.

EK Use PR by Risk Perception of Untreated Water (FQ.1)
Variable Category PR (95% CI) P-value N
risk_untreated_water_first They will get sick 1.161 (0.919, 1.467) 0.2096 493
risk_untreated_water_first They will get diarrhea 1.519 (1.133, 2.037) 0.0053 252
risk_untreated_water_first They will get stomach ache -- -- 68
risk_untreated_water_first Nothing -- -- 13
risk_untreated_water_first Other -- -- 22
risk_untreated_water_first Don't know -- -- 14
risk_aware_binary Aware (any health consequence) 1.072 (0.912, 1.260) 0.4007 816
risk_aware_binary Not aware 0.735 (0.519, 1.043) 0.0845 19
risk_aware_binary Unclear -- -- 36

By Primary Boiler Sex (SAP §5.4 boiling subgroup)

Restricted to households where BE.16 was administered (i.e., the household reported boiling water and identified a primary boiler).

EK Use PR by Primary Boiler Sex
Boiler sex PR (95% CI) P-value N
Female -- -- 298
Male 1.022 (0.980, 1.066) 0.3173 176

By Primary Boiler Age Bin (SAP §5.4 boiling subgroup)

Bins (<50, 50–64, 65+) chosen to give comparable Ns from the observed distribution of primary_boiler_age (min 30, median 56, max 90).

EK Use PR by Primary Boiler Age Bin
Age bin PR (95% CI) P-value N
<50 1.029 (0.973, 1.089) 0.3173 112
50-64 -- -- 197
65+ -- -- 165

12. Sensitivity Analyses

12a. Differential Recall Check (SAP Section 5.1.2)

SAP §5.1.2 trigger and current default

The SAP-specified primary recall window for diarrhea is 7 days. If there is evidence of differential recall between treatment and control arms — i.e. the treatment-control prevalence comparison is materially different at 1-day vs 7-day recall — the SAP allows switching the primary outcome to same-day recall.

The pipeline does not switch automatically. The two tables below provide the inspection material; the decision is left to the PI.

Step 1 — prevalence by recall period, round, and arm

If recall is comparable across arms, prevalence in each row should rise monotonically with the recall window (today < 1-week < 2-week) in both arms. A gap between arms that widens as the window grows indicates differential recall.

Diarrhea prevalence (%) by recall period, round, and arm
Round Arm N (today) N (1-week) N (2-week) Prevalence today (%) Prevalence 1-week (%) Prevalence 2-week (%)
1 Control 860 852 837 3.14 5.28 4.78
1 Treatment 1199 1188 1183 1.50 4.04 2.79
3 Control 835 836 793 2.63 4.19 5.67
3 Treatment 1188 1182 1149 0.51 1.18 1.31
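
The ordering check described for Step 1 can be applied mechanically, as in the Python sketch below. The prevalence values are copied from the table; the function name `is_monotone` is hypothetical, and the sketch flags the ordering only, leaving the recall decision to the PI.

```python
# Prevalence (%) by (round, arm) in the order today, 1-week, 2-week,
# copied from the Step 1 table
prevalence = {
    (1, "Control"): (3.14, 5.28, 4.78),
    (1, "Treatment"): (1.50, 4.04, 2.79),
    (3, "Control"): (2.63, 4.19, 5.67),
    (3, "Treatment"): (0.51, 1.18, 1.31),
}

def is_monotone(values):
    """True when each longer recall window has strictly higher prevalence."""
    return all(a < b for a, b in zip(values, values[1:]))

ordering_ok = {cell: is_monotone(v) for cell, v in prevalence.items()}
for cell, ok in ordering_ok.items():
    print(cell, "monotone" if ok else "NOT monotone")
```

With the tabulated values, both round 3 cells satisfy the ordering, while both round 1 cells do not, because 1-week prevalence exceeds 2-week prevalence at baseline.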

Step 2 — treatment effect (MH PR / RD) at each recall window, midline

The clearer test for differential recall: does the treatment effect itself shift across recall windows? If the MH PR is similar at today / 1-week / 2-week, there is no differential-recall problem and the 7-day primary outcome stands. If it shifts (e.g., today shows a large protective effect that washes out at 1-week and 2-week), the SAP-allowed switch to same-day recall is supported.

Midline diarrhea MH (cluster-aware, block-stratified) by recall window
Compare the PR row across the three windows — large shifts indicate differential recall
Recall window Measure Estimate (95% CI) P-value N
Today (same day) PR 0.297 (0.081, 1.090) 0.0672 2023
Today (same day) RD -0.012 (-0.023, -0.000) 0.0455 2023
1-week (SAP primary) PR 0.484 (0.214, 1.095) 0.0817 2018
1-week (SAP primary) RD -0.015 (-0.031, 0.001) 0.0668 2018
2-week (sensitivity) PR 0.357 (0.165, 0.772) 0.0089 1942
2-week (sensitivity) RD -0.027 (-0.045, -0.008) 0.0057 1942

How to use these two tables
  1. Read Step 1 to confirm that prevalence increases monotonically with the recall window in each arm (today < 1-week < 2-week). If either arm violates that ordering, recall is suspect even before you check Step 2.
  2. Read Step 2 to compare the treatment-effect PR across the three recall windows. A roughly constant PR across windows means the 7-day primary outcome is fine. A PR that moves systematically toward 1 with longer recall (e.g., 0.5 today, 0.8 at 1-week, 1.0 at 2-week) would indicate differential under-reporting in one arm.
  3. If the PI judges that differential recall is present, the pre-specified switch to same-day (today) recall as the primary outcome is supported by SAP §5.1.2. The pipeline will then need to be updated to use recall_period == "today" as the primary filter for the diarrhea MH and adjusted models.

12b. 14-Day Diarrhea Recall Sensitivity

Diarrhea (14-day recall): Sensitivity MH (Midline)
Measure Estimate (95% CI) P-value N
PR 0.357 (0.165, 0.772) 0.0089 1942
RD -0.027 (-0.045, -0.008) 0.0057 1942

12c. TTC Sensitivity (With/Without Outliers)

Log10 TTC: sensitivity analysis (with/without TTC:TC ≥ 1 outliers)
Analysis Diff (95% CI) P-value N Method
All observations -0.146 (-0.368, 0.075) 0.1917 277 Welch (fallback, NOT cluster-aware)
Excluding TTC:TC>=1 outliers (n=252 removed) -1.275 (-2.148, -0.402) 0.0091 25 Welch (fallback, NOT cluster-aware)

Note: TC data are needed to implement the TTC:TC ≥ 1 outlier exclusion per SAP §5.1.2.

Participation Rate Sensitivity (SAP Section 5.6)

The SAP specifies a sensitivity analysis stratified by village-level participation rate to approximate the treatment-on-the-treated effect. This analysis has not been run because village participation rate data have not yet been incorporated.

13. Covariates Used in Adjusted Models

Covariates Included in Adjusted Models (Pre-screened per SAP Section 5.3)
Model Type Covariates passing LRT screen (p <= 0.25)
Binary outcomes (Poisson) mountains, baseline_uses_ek, toilet, asset_fridge
Continuous outcomes (Linear) mountains, baseline_uses_ek, toilet, w19, w21
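
The screening rule (keep a covariate when its likelihood ratio test gives p <= 0.25) can be sketched in Python as below. This is an illustration under stated assumptions, not the pipeline's code: the function names and log-likelihood values are hypothetical, and the p-value formula holds only for a 1-degree-of-freedom test, so multi-level covariates such as toilet type would need a general chi-square tail probability.

```python
import math

def lrt_pvalue_1df(loglik_null, loglik_full):
    """P-value of a likelihood ratio test with 1 degree of freedom."""
    stat = max(0.0, 2 * (loglik_full - loglik_null))
    # For 1 df, P(chi-square > stat) equals P(|Z| > sqrt(stat))
    return math.erfc(math.sqrt(stat / 2))

def screen(covariates, loglik_null, threshold=0.25):
    """Keep covariates whose 1-df LRT p-value is at or below the threshold."""
    return [name for name, ll in covariates.items()
            if lrt_pvalue_1df(loglik_null, ll) <= threshold]

# Hypothetical log-likelihoods of models with one added covariate (not study output)
fits = {"mountains": -498.0, "asset_fridge": -499.2, "asset_computer": -499.95}
kept = screen(fits, loglik_null=-500.0)
print(kept)
```

The 0.25 threshold is deliberately permissive: in this toy example a covariate with p of about 0.2 is retained, and only the one with essentially no improvement in fit is dropped.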

Report generated from the targets pipeline. Analyses follow the pre-registered SAP (v1, 2018-04-29), except where deviations and exploratory analyses are flagged in this report.

Statistical software: R version 4.4.2 (2024-10-31 ucrt) with targets, lme4, lmerTest, metafor.