Note on sample

For now, the sample follows a conservative approach: a household is included only if all of the following criteria are met (among other data cleaning steps, not shown):

  1. Household was surveyed at each round (balanced), and
  2. Household did not move between villages across rounds, and
  3. Household has either:
    • 3a. had the intervention for a full year (i.e. bridge installed by ‘Follow-up round 1’), or
    • 3b. been a control across all survey rounds

##           Deployment tx_factor last_survey_date n_surveys_incl pct_of_surveys villages_incl pct_of_villages
## 1:          Baseline         0       2021-06-13           6666             47           203              49
## 2: Follow-up round 1         0       2022-06-23           5117             43           156              45
## 3: Follow-up round 1         1       2022-06-17           1549             70            47              72
## 4: Follow-up round 2         0       2023-03-12           5117             96           156              98
## 5: Follow-up round 2         1       2023-02-18           1549             41            47              42
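
The three inclusion criteria listed above can be expressed as a simple filter. The following is a minimal sketch, not the code used for this analysis; the record fields (`hh_id`, `round`, `village`) and the `bridge_round` lookup are hypothetical names introduced for illustration.

```python
from collections import defaultdict

ROUNDS = ("Baseline", "Follow-up round 1", "Follow-up round 2")

def balanced_sample(surveys, bridge_round):
    """Return the set of household ids meeting the three inclusion criteria.

    surveys: iterable of dicts with hypothetical keys 'hh_id', 'round', 'village'.
    bridge_round: dict mapping village -> round by which its bridge was installed;
                  control villages are absent (or map to None).
    """
    visits_by_hh = defaultdict(dict)
    for s in surveys:
        visits_by_hh[s["hh_id"]][s["round"]] = s["village"]

    keep = set()
    for hh, visits in visits_by_hh.items():
        # Criterion 1: surveyed at every round (balanced)
        if set(visits) != set(ROUNDS):
            continue
        # Criterion 2: same village in every round
        villages = set(visits.values())
        if len(villages) != 1:
            continue
        village = villages.pop()
        # Criterion 3: control throughout, or bridge installed by follow-up round 1
        installed = bridge_round.get(village)
        if installed is None or installed == "Follow-up round 1":
            keep.add(hh)
    return keep
```

Households in villages whose bridge was installed only by follow-up round 2 are excluded under this reading of the third criterion, since they have not yet had the intervention for a full year.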

Hypothesis

Stabilization of crop yields in intervention villages may be attributable to improved rural transportation. Villages with bridges are able to maintain yields despite external stress (climate, environmental, economic, etc.) due to improved and more dependable access to inputs and markets.

These plots from the preliminary exploratory data analysis are consistent with this hypothesis.

Mean response plot of average cultivated land (m2) and harvest (kg) across villages

Mean response plot of average number and type of crops harvested across villages

Rwanda Situation, 2019 - 2023

Decrease in peak annual NDVI over time

For reference, here are commonly accepted values for NDVI in vegetation, although NDVI is highly dependent on species, season, and setting.

What explains notable decrease in NDVI starting in mid-2021?

  1. Climatic

Extreme drought in East Africa and relative drought in Rwanda.

  2. Environmental
  3. Economic

The global supply of fertilizer has been depressed, and the resulting decrease in fertilizer use, due to high cost and national rationing, could be driving declines in agricultural productivity that are reflected in NDVI (Farmers Review Africa).

  4. Technical

Representativeness of Sample

Temporal

  1. Are households re-visited in subsequent survey rounds at the same time of year?
  2. Are villages in each treatment group surveyed around the same time of year?

On average, the first follow-up was conducted about one year and one month after baseline in both treatment groups, although there was more variation in the time since baseline for control households. The two groups have been visited at similar rates in the second follow-up and very close to a year after the previous follow-up.
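
The comparison of time since baseline between treatment groups can be sketched as follows. This is an illustrative sketch only; the record fields (`group`, `baseline`, `followup`) are hypothetical, and the population standard deviation is used here as one possible measure of the variation mentioned above.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

def time_since_baseline(households):
    """Mean and spread of days between baseline and follow-up, by treatment group.

    households: iterable of dicts with hypothetical keys
                'group' (str), 'baseline' (date), 'followup' (date).
    Returns {group: (mean_days, sd_days)}.
    """
    gaps = defaultdict(list)
    for hh in households:
        gaps[hh["group"]].append((hh["followup"] - hh["baseline"]).days)
    return {group: (mean(days), pstdev(days)) for group, days in gaps.items()}

# Toy records with invented dates
example = [
    {"group": "Control", "baseline": date(2021, 6, 1), "followup": date(2022, 7, 1)},
    {"group": "Control", "baseline": date(2021, 6, 1), "followup": date(2022, 7, 11)},
    {"group": "Intervention", "baseline": date(2021, 6, 10), "followup": date(2022, 7, 5)},
]
summary = time_since_baseline(example)
```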

Villages were visited in each survey round within +/- 5 of their original order, with some outliers. The difference in the start of survey round by village is very similar to the pattern observed among households. The time spent surveying in each village was comparable across survey rounds, except for a few outliers where one or two households were likely revisited after the main survey campaign in the village had ended.

There may be some bias in the sample regarding the time of year at which households in each treatment group are visited. It appears that households in intervention villages were most likely to be surveyed at the beginning of each round, corresponding to the fall. Since some of the questions in the survey, especially those related to harvest, use the recent past as a reference, respondents in the two treatment groups could be answering about different reference periods. For example, the main harvest season in Rwanda is Sept - Feb; households in intervention villages were surveyed at the beginning of this season, and households in control villages were more often surveyed at the end of this season.

This bias will likely diminish when the second follow-up round has been completed (~May 2023), because more intervention villages that are later in the visitation order will enter the balanced sample. However, because surveys in each treatment group were not uniformly distributed across survey rounds, this may still influence the responses in the final sample.

Spatial

  1. Does placement of villages in proximity to urban areas or services differ between treatment groups?
  2. Does placement of villages impact whether there is potential for cropland expansions?

There are few obvious discrepancies in the spatial distribution of villages by treatment group. More intervention villages are in the Western Province and fewer in the Northern, Southern, and Eastern Provinces, but it is unclear how much difference there is between villages in these provinces. Both treatment groups contain villages near national or district roads and also those that appear more isolated. There may be a concentration of intervention villages closer to the center of the country, at mid latitudes. The spatial distribution between groups deserves further investigation and could perhaps consider population density, rural/urban status, etc.

##    Province_Eng tx_overall_sample  N  pct
## 1:     Northern           Control 21 13.5
## 2:     Northern      Intervention  3  6.4
## 3:     Southern           Control 63 40.4
## 4:     Southern      Intervention 18 38.3
## 5:      Eastern           Control  2  1.3
## 6:      Western           Control 61 39.1
## 7:      Western      Intervention 23 48.9
## 8:       Kigali           Control  9  5.8
## 9:       Kigali      Intervention  3  6.4
##    tx_overall_sample mean_popl
## 1:      Intervention       685
## 2:           Control       734

Crop composition, diversification, or intensification in RCT

Definitions and vocabulary used in RCT

Questions that inform estimates of total monthly harvest by crop:

  1. Does your family cultivate crops?
  2. Which of the following crops do you grow? [if yes in #1]
  3. In what month did you most recently harvest [crop from #2]?
    • 4a. How many kilos did you harvest in [month from #3]?
    • 4b. If the crop is harvested frequently in #3: How many kilos did you harvest in the past month?
    • 4c. If the crop has not yet been harvested in #3: not asked to quantify

Questions that inform estimates of cropland area:

  5. How much land do you use to cultivate crops? [if yes in #1]
  6. Do you own or rent any land besides what your house is built upon? (not linked to any of the previous questions)
    • 7a. How much land owned by your household are you currently using for farming your own crops, grazing, and fallow land? [if yes, own in #6]
    • 7b. How much land do you rent to someone else? [if yes, rent to in #6]
    • 7c. How much land do you rent from someone else? [if yes, rent from in #6]

Expect #5 to be less than or equal to the sum of #7a-#7c.

Expect the harvest from #2-#4 to be collected on an area less than or equal to #5.

Crop diversification

Measured by increase in quantity of households growing a respective crop

RCT Question: If “Does your family cultivate crops” is Yes, then “Which of the following crops do you grow?”, and create entries for each crop respondent reports growing from a list of 28 options (no ‘other’ option is allowed). No time component associated with the survey.

Households in intervention villages exhibit more crop diversification than those in control villages after at least one year of the presence of a bridge. In intervention villages, the number of households growing each respective crop, except wheat, only increased (positive percentage point difference) at follow-up, and the relative change was higher than in control villages. There was a larger increase in the number of households reporting growing beans and maize after bridge construction. There are large differences in other crops as well, for example bananas and sorghum. However, the difference between treatment groups is sometimes non-significant, likely due to high variation in the data.

Note: The plot below represents the mean percentage point difference in the percent of households growing a respective crop at follow-up round 2 compared to baseline (among respondents who report growing any crop) in the intervention and control villages. The dashed vertical line at 0 indicates the level of no change between survey rounds.

##    tx_overall_sample         Crops   n mean_chg_from_baseline_hh
## 1:           Control         Beans 156                      0.97
## 2:      Intervention         Beans  47                      3.74
## 3:           Control         Maize 156                      1.56
## 4:      Intervention         Maize  47                      4.81
## 5:           Control Sweetpotatoes 156                      1.54
## 6:      Intervention Sweetpotatoes  47                      1.74
##    lci_chg_from_baseline_hh uci_chg_from_baseline_hh mean_chg_from_baseline_pct
## 1:                     0.41                      1.5                        3.0
## 2:                     2.68                      4.8                       11.0
## 3:                     0.69                      2.4                        4.8
## 4:                     3.10                      6.5                       14.1
## 5:                     0.99                      2.1                        4.8
## 6:                     0.82                      2.7                        4.6
##    lci_chg_from_baseline_pct uci_chg_from_baseline_pct
## 1:                       1.3                       4.8
## 2:                       7.9                      14.1
## 3:                       2.0                       7.5
## 4:                       8.8                      19.4
## 5:                       2.9                       6.6
## 6:                       1.8                       7.5
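
The percentage point difference described in the note above can be sketched as follows. This is a minimal sketch under assumptions: each household is represented as a set of crop names (an empty set meaning no crops reported), the village dictionaries with `baseline` and `followup` keys are hypothetical, and the sketch makes no claim about which column of the table it corresponds to.

```python
from statistics import mean

def pct_growing(households, crop):
    """Percent of crop-growing households in one village that grow `crop`.

    households: list of sets of crop names; empty sets (no crops) are excluded
                from the denominator, as in the note above.
    """
    growers = [h for h in households if h]
    if not growers:
        return None
    return 100.0 * sum(crop in h for h in growers) / len(growers)

def mean_pp_change(villages, crop):
    """Mean across villages of (follow-up % minus baseline %), in percentage points."""
    changes = []
    for v in villages:
        before = pct_growing(v["baseline"], crop)
        after = pct_growing(v["followup"], crop)
        if before is not None and after is not None:
            changes.append(after - before)
    return mean(changes) if changes else None
```

A positive value means the share of households growing the crop rose between baseline and follow-up; zero corresponds to the dashed no-change line.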

Crop intensification

Measured by quantity of total harvest among those that grow a respective crop

RCT Question: If “Does your family cultivate crops” is Yes, then for each [crop] reported in “Which of the following crops do you grow?”, ask “In what month did you most recently harvest [crop]” and “How many kilos did you harvest in that month?” or, if the crop is harvested frequently, “How many kilos did you harvest in the past month?”

Among crops subject to annual change – maize, beans, and sweet potatoes – the data suggest that the amount of beans and maize harvested may be greater during some months in intervention villages than in control villages. The harvest of sweet potatoes seems greatly exaggerated in intervention villages, both before and after the construction of a bridge, making this an unstable and unsuitable comparison for treatment effect.

This plot represents median total monthly harvest (kg) for each crop in intervention and control villages where at least one household grew the respective crop in the survey round without removing outliers.

This plot compares the sum of the total harvest (kg) by crop between all intervention villages and the mean bootstrapped sum from subsets of control villages equal in size to the number of intervention villages. The data are trimmed at the 99th percentile regardless of treatment status. The overall magnitude of the harvest may be lower in intervention villages due to a smaller number of villages where any household grows the respective crop. This is apparent in the beans harvest during the Sept – Feb season. Conversely, there is some evidence to suggest that the harvest of beans is increasing between May and June and that the harvest of maize is greater in both seasons relative to control villages. The timing of the banana harvest appears to differ between the two groups, but the total amount over the season is more comparable. Though the harvest of sweet potatoes is greater in intervention villages, the pattern is similar pre- and post-construction of the bridge.
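
The size-matched comparison described above can be sketched as follows. This is an illustrative sketch, not the analysis code: it assumes that trimming is applied to village-level totals with a nearest-rank percentile, and that each control subset is drawn without replacement. All function names and the toy numbers are hypothetical.

```python
import math
import random
from statistics import mean

def percentile_cap(values, q=0.99):
    """Nearest-rank value at quantile q, used as the trimming threshold."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def trim(values, cap):
    """Drop values above the threshold, regardless of treatment status."""
    return [v for v in values if v <= cap]

def bootstrapped_control_sum(control, n_intervention, n_draws=1000, seed=0):
    """Mean of summed harvest over random control subsets of size n_intervention."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sums = [sum(rng.sample(control, n_intervention)) for _ in range(n_draws)]
    return mean(sums)

# Toy village-level harvest totals (kg); invented numbers
intervention = [120.0, 80.0, 95.0]
control = [100.0, 90.0, 110.0, 70.0, 130.0, 105.0]

cap = percentile_cap(intervention + control, q=0.99)
intervention_total = sum(trim(intervention, cap))
control_total = bootstrapped_control_sum(trim(control, cap), len(intervention))
```

Matching the subset size to the number of intervention villages puts the two sums on the same scale, which a raw sum over all control villages would not.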

Cropland extent

Cropland extent is likely an inadequate outcome measure due to poor consistency between survey responses regarding area of land cultivated and area of land owned or rented, or in other words, land available to the household for cultivation. Many values are negative, meaning households are supposedly cultivating land that they neither own nor rent.

Note: The plot below is a histogram of the difference between the two definitions of land, where differences have been trimmed to the 99th percentile and null differences (=0) have been removed. It contains data from the experimental subsample across all survey rounds.

## [1] "Summary of differences"
## 
## negative     none positive 
##     2922    11172     5693
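
The consistency check summarized above can be sketched as follows. This is a minimal sketch under assumptions: the difference is taken as land available minus land cultivated (so that a negative value means more land is cultivated than is owned or rented, as in the text), and land available is taken as the sum of #7a-#7c following the expectation stated earlier. Function and argument names are hypothetical.

```python
from collections import Counter

def land_difference(cultivated, owned_used, rented_out, rented_in):
    """Land available (sum of #7a-#7c) minus land cultivated (#5), same units."""
    available = owned_used + rented_out + rented_in
    return available - cultivated

def classify(diff):
    """Label a difference as 'negative', 'none', or 'positive'."""
    if diff < 0:
        return "negative"
    if diff > 0:
        return "positive"
    return "none"

def summarize(responses):
    """Count responses per category.

    responses: iterable of (cultivated, owned_used, rented_out, rented_in) tuples.
    """
    return Counter(classify(land_difference(*r)) for r in responses)

# Toy responses with invented areas
toy = [(500, 300, 0, 100), (400, 400, 0, 0), (200, 300, 50, 0)]
counts = summarize(toy)
```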

NDVI

The left y-axis in the plots below represents median total monthly harvest (kg) for each crop in intervention and control villages. The right y-axis represents the median peak NDVI by month from cropland in villages. The workflow to generate the NDVI metric was as follows:

  1. Mask Sentinel-2 NDVI to remove clouds and non-cropland;
  2. Create monthly composite by taking per pixel max NDVI value observed during respective month;
  3. In each village, take the median value of max NDVI among all pixels designated as cropland;
  4. Generate monthly median peak NDVI time series.
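
Steps 2 to 4 of the workflow above can be sketched as follows. This is an illustrative sketch, not the pipeline used here: it assumes step 1 (cloud and non-cropland masking) has already been applied, and the flat tuple layout `(village, pixel_id, date, ndvi)` is a hypothetical stand-in for raster data.

```python
from collections import defaultdict
from statistics import median

def monthly_median_peak_ndvi(observations):
    """Build a monthly median peak NDVI series per village.

    observations: iterable of (village, pixel_id, 'YYYY-MM-DD', ndvi) tuples for
                  pixels that already passed the cloud and cropland mask (step 1).
    Returns {village: [(month, median_peak_ndvi), ...]} sorted by month.
    """
    # Step 2: monthly composite, per-pixel maximum NDVI within each month
    pixel_max = {}
    for village, pixel, day, ndvi in observations:
        key = (village, day[:7], pixel)
        if key not in pixel_max or ndvi > pixel_max[key]:
            pixel_max[key] = ndvi

    # Step 3: within each village and month, collect the per-pixel maxima
    grouped = defaultdict(list)
    for (village, month, _pixel), value in pixel_max.items():
        grouped[(village, month)].append(value)

    # Step 4: median across cropland pixels, ordered as a monthly time series
    series = defaultdict(list)
    for village, month in sorted(grouped):
        series[village].append((month, median(grouped[(village, month)])))
    return dict(series)
```

Taking the per-pixel maximum before the village median keeps the metric focused on peak greenness within each month rather than on whichever acquisition date happened to be cloud-free.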

Among villages in which at least one household harvested the respective crop in the respective month, the data suggest that the amount of beans and maize harvested may be greater during some months in intervention villages than in control villages. The plot below includes the individual and summed median harvest of bananas, beans, and maize. Although bananas are not a crop that is likely to have been subject to short-term change in response to the intervention, the crop mask over which NDVI is extracted was optimized to increase the accuracy of identification of these three crops; thus, bananas are included in the harvest to be more comparable to the calculation of NDVI. More work is needed to create a crop mask specific to maize and beans.

Looking at the total harvest of bananas, beans, and maize collectively against NDVI in villages, there are some differences in timing and amplitude of harvest. Unfortunately, this differentiation is not apparent in NDVI.

Summary

From the RCT, I chose to investigate differences in agricultural productivity between intervention and control villages and pre-/post-bridge construction as one of the outcomes most subject to short- to medium-term change and that showed some evidence of effect during preliminary analysis.

I took a conservative subset of data from the RCT that included households with surveys from baseline and both follow-up rounds (i.e. a balanced sample) and that either had had the intervention for at least a full year at the time of analysis or had been a control across all survey rounds to date. To date, the analysis uses 46.57% of the total data available, which is expected to increase to 73.5% by the time follow-up round 2, currently ongoing, is completed. Thus, this analysis is subject to change with additional data.

Of the 28 crops recorded in the household surveys, beans, maize, and sweet potatoes were suggested as the ones most likely to respond to changes in farming practices within a few growing seasons. Bananas are also included in some of the analysis because the spatial crop mask used to compute NDVI was optimized for detection of bananas in addition to beans and maize.

RCT Results

Crop diversification

Households in intervention villages exhibit more crop diversification than those in control villages. In intervention villages, the number of households growing each respective crop, except wheat, only increased (positive percentage point difference) at follow-up, and the relative change was higher than in control villages.

Crop intensification

Among villages in which at least one household harvested the respective crop in the respective month, the data suggests that the amount harvested of beans, maize, and bananas may be greater during some months in intervention than control villages. (The plot is interactive; click legend to select individual crop.)

However, the overall magnitude of the harvest may be lower in intervention villages due to a smaller number of villages where any household harvests the respective crop in that month. This is apparent in the beans harvest during the Sept – Feb season. Conversely, there is some evidence to suggest that the harvest of beans is increasing between May - June and that the harvest of maize is greater in both seasons relative to control villages. The timing of the banana harvest appears to differ between the two groups, but the seasonal harvest amount is more comparable. Though the harvest of sweet potatoes is greater in intervention villages, the pattern is similar pre- and post-construction of the bridge.

Cropland extent

Cropland extent is an inadequate outcome measure due to poor consistency between survey responses regarding area of land cultivated and area of land owned or rented, or in other words, land available to the household for cultivation.

NDVI Results

Peak NDVI notably decreased in the region, regardless of intervention status, in mid- to late-2021, near the start of follow-up round 1. This is likely a result of other environmental and economic factors, such as drought in East Africa and decreased global supply of fertilizer.

Looking at the total harvest of bananas, beans, and maize collectively against NDVI in villages, there are some differences in timing and amplitude of harvest. Unfortunately, this differentiation is not apparent in NDVI.

Limitations (Next Steps)

  • Timing / reference of surveys (more cleaning, inherent)
  • Reliability of survey responses (more cleaning, complete survey round, inherent)
  • Resolution of satellite data compared to size of plots and change in plot size (higher resolution data sources)
  • Crop mask insufficient (build local crop identifier, crop cuttings)
  • Confounding factors driving NDVI more than intervention (specify and quantify these)