Module: M&E Fundamentals, Module 3
Focus: Building dictionary-grade indicators (definitions, anatomy, characteristics, targets, disaggregation, and common pitfalls)
Target: Entry-Level M&E Officers, Data Managers & Program Coordinators
Delivery: Facilitator-led instructional tabs with applied multi-sector case activities
In the previous module, you got to know your project really well and used the logframe tool to identify the inputs, outputs, outcomes, impact, and assumptions that comprise your project design. In this module, we transition from what you want to achieve to how you will measure it.
To build a robust measurement system, we must first establish a shared vocabulary for the key terms you will use:
💡 Expert note — Indicator vs. Measure: A measure is any countable data point. An indicator is a measure deliberately chosen because it serves as a proxy for something harder to observe directly. You cannot directly observe “improved food security,” so you track a dietary diversity score or coping strategy index instead. Keeping this distinction explicit stops teams from assuming that anything countable is automatically worth tracking.
💡 Expert note — Baselines are sometimes estimates, not measurements: In programs that inherit a weak reporting history, the “baseline” is often itself a best estimate rather than a clean measured value. This uncertainty should be documented in the indicator dictionary, not hidden behind a tidy number.
Indicators DO NOT include the specific numbers or percentages that you would like to reach. An indicator is simply a variable. It should be independent, meaning that it is nondirectional and can vary in any direction.
💡 Core Rule: An indicator name must remain true whether the measured value rises or falls. Students can apply this quick test to any indicator they encounter, including imperfect donor logframes that blend targets into indicator names.
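The quick test can be partly automated as a first-pass screen. The sketch below is only a heuristic: the word list `DIRECTIONAL_WORDS` and the function name `directional_words_in` are assumptions made for illustration, and a person still has to review anything it flags or misses.

```python
import re

# Hypothetical, non-exhaustive list of words that bake a direction into a name.
DIRECTIONAL_WORDS = {
    "increase", "increased", "increasing",
    "decrease", "decreased", "decreasing",
    "improve", "improved", "improving",
    "reduce", "reduced", "reducing", "reduction",
}


def directional_words_in(indicator_name: str) -> list[str]:
    """Return the directional words found in an indicator name, in order."""
    tokens = re.findall(r"[a-z]+", indicator_name.lower())
    return [token for token in tokens if token in DIRECTIONAL_WORDS]


# Names taken from this module's own examples.
candidates = [
    "Increase the percentage of patients retained on ART",
    "Decrease in malaria mortality",
    "Malaria mortality rate",
]
flagged = {name: directional_words_in(name) for name in candidates}
```

A name with an empty list passes the screen. A name with any entry should be reworded so that it describes the variable only, with the direction and the number moved to the target.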
Different levels of your program require different types of measurement. In your logframe, you defined project inputs, outputs, outcomes, and impact. For each of these logframe levels, you will choose at least one indicator.
💡 Expert note — The attribution gradient: As you move from inputs toward impact, your confidence that the program caused the observed change weakens, because more external factors intervene — weather for agriculture, co-financing partners for health systems, market prices for an enterprise. This is why monitoring indicators can be tracked monthly with high confidence, while evaluation indicators usually need a comparison group or trend analysis to claim causality, not just a before-and-after number.
Stage 1: Input → Stage 2: Activity → Stage 3: Output → Stage 4: Outcome → Stage 5: Impact
As a general rule, a program will track many inputs and outputs, but very few overarching impact indicators.
Indicators can be either quantitative or qualitative.
An important part of what constitutes a quantitative indicator is the metric — the precise calculation or formula on which the indicator is based. If an indicator calls for a percentage, a fraction is required to calculate it. You must clearly define your metrics:
In many cases, indicators need to be accompanied by clarifications of the terms used. For example, if your indicator is number of antenatal care (ANC) providers trained, “providers” would need to be defined (e.g., clinicians providing direct clinical services) and “trained” would also need to be defined (e.g., staff who attended every day of a five-day training course and passed the final exam with a score of at least 85%).
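Written as a rule, the example definition above leaves no room for interpretation. The following is a minimal sketch, assuming each training record carries the fields shown; the record structure, field names, and sample values are hypothetical, while the five-day attendance requirement and the 85% pass mark come from the example definition.

```python
from dataclasses import dataclass

COURSE_DAYS = 5    # the example definition: attended every day of a five-day course
PASS_MARK = 85.0   # the example definition: final exam score of at least 85%


@dataclass(frozen=True)
class TrainingRecord:
    provider_id: str     # hypothetical identifier
    is_clinician: bool   # "provider" = clinician providing direct clinical services
    days_attended: int
    exam_score: float    # percent


def counts_as_trained(record: TrainingRecord) -> bool:
    """Apply the written definition of 'provider' and 'trained' to one record."""
    return (
        record.is_clinician
        and record.days_attended >= COURSE_DAYS
        and record.exam_score >= PASS_MARK
    )


def providers_trained(records: list[TrainingRecord]) -> int:
    """Indicator value: number of ANC providers trained."""
    return sum(counts_as_trained(r) for r in records)


# Hypothetical sample data.
records = [
    TrainingRecord("P01", True, 5, 90.0),   # counts
    TrainingRecord("P02", True, 4, 95.0),   # missed a day
    TrainingRecord("P03", True, 5, 84.9),   # below the pass mark
    TrainingRecord("P04", False, 5, 88.0),  # not a clinician
    TrainingRecord("P05", True, 5, 85.0),   # counts, exactly at the pass mark
]
total_trained = providers_trained(records)
```

Keeping the thresholds as named constants means any change to the definition is visible in one place and can be recorded in the indicator dictionary at the same time.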
💡 Expert note — Numerator-denominator alignment: Both the numerator and denominator must measure the same population, time period, and geographic unit, or the resulting percentage is meaningless. A frequent real-world error: the numerator is counted for the current month while the denominator is drawn from an outdated annual projection.
🍳 Why a fraction without aligned ingredients fails
Imagine baking a cake where the recipe calls for “2 cups of flour per batch,” but the flour was measured this morning while “batch” refers to a tray size from last year’s oven. The ratio looks precise on paper, but it no longer describes anything real — the two numbers were never measuring the same thing at the same time.
A numerator and denominator work the same way. A percentage is only meaningful when both halves of the fraction share the same population, the same time window, and the same geography. Mismatch any one of these, and the indicator becomes a number that looks authoritative but means nothing.
A fraction is only as trustworthy as the alignment between its numerator and denominator.
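One way to enforce this alignment is to refuse to calculate a percentage until the two halves of the fraction have been compared. The sketch below assumes each count is labelled with its reference population, period, and geography; the `Count` structure, its field names, and the labels used in the example are illustrative assumptions, not part of any standard tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Count:
    value: float
    reference_population: str  # the group both halves of the fraction are drawn from
    period: str                # e.g. "2025-Q1"
    geography: str             # e.g. "District A"


def misaligned_dimensions(numerator: Count, denominator: Count) -> list[str]:
    """List the dimensions on which the numerator and denominator disagree."""
    dimensions = ("reference_population", "period", "geography")
    return [d for d in dimensions if getattr(numerator, d) != getattr(denominator, d)]


def percentage(numerator: Count, denominator: Count) -> float:
    """Compute a percentage only when both halves of the fraction are aligned."""
    problems = misaligned_dimensions(numerator, denominator)
    if problems:
        raise ValueError("Numerator and denominator differ on: " + ", ".join(problems))
    if denominator.value <= 0:
        raise ValueError("Denominator must be positive")
    return 100.0 * numerator.value / denominator.value


# Hypothetical labels; aligned on all three dimensions.
completed = Count(180, "TB patients starting treatment", "2025-Q1", "District A")
started = Count(250, "TB patients starting treatment", "2025-Q1", "District A")
q1_rate = percentage(completed, started)

# The error described above: a current count set against an outdated annual figure.
outdated = Count(1000, "TB patients starting treatment", "2024", "District A")
mismatch = misaligned_dimensions(completed, outdated)
```

Calling `percentage(completed, outdated)` raises an error instead of returning a number that looks authoritative but means nothing.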
💡 Expert note — Document your data source: A metric is only as good as its source. The indicator dictionary should always specify where the numerator and denominator values come from and how frequently that source is updated.
Let’s follow a single quantitative indicator — TB Treatment Completion Rate — through its entire lifecycle to see how the definitions interact numerically.
| Period | Numerator | Denominator | Result | Target | Variance (percentage points) |
|---|---|---|---|---|---|
| Baseline (2024) | 650 | 1000 | 65% | N/A | N/A |
| Q1 2025 | 180 | 250 | 72% | 70% | +2 |
| Q2 2025 | 210 | 280 | 75% | 75% | 0 |
| Q3 2025 | 230 | 275 | 83.6% | 80% | +3.6 |
| Q4 2025 (Target) | N/A | N/A | N/A | 85% | N/A |
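The arithmetic behind the table can be reproduced in a few lines. This is a minimal sketch using the table's own figures; the function names are illustrative, and variance is treated as the simple difference between result and target, expressed in percentage points.

```python
# (period, numerator, denominator, target in percent or None)
rows = [
    ("Baseline (2024)", 650, 1000, None),
    ("Q1 2025", 180, 250, 70.0),
    ("Q2 2025", 210, 280, 75.0),
    ("Q3 2025", 230, 275, 80.0),
]


def completion_rate(numerator: int, denominator: int) -> float:
    """Result = numerator / denominator, as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


def variance_points(result: float, target):
    """Variance = result minus target, in percentage points (None if no target)."""
    return None if target is None else round(result - target, 1)


summary = []
for period, numerator, denominator, target in rows:
    result = completion_rate(numerator, denominator)
    summary.append((period, result, variance_points(result, target)))

for period, result, variance in summary:
    print(period, result, variance)
```

The baseline row has a result but no variance because no target applies to it. Q4 2025 is left out because only its target is known so far.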
M&E professionals often evaluate indicators against standard frameworks to ensure they are fit for purpose (such as the SMART or CREAM criteria). Specifically, a strong indicator meets these seven criteria:
1. Objective: No ambiguity about what is being measured and what data are being collected.
2. Direct: Captures the input, output, outcome, impact, or risk directly, not through a tangential proxy.
3. Practical: Data can be collected without excessive cost, time, or technical burden.
4. Adequate: On its own (or with companion indicators), gives enough signal to judge success.
5. Useful for Management: The result actually changes what the team does next.
6. Attributable: If the indicator changes, the team can reasonably link that change to the project.
7. Disaggregated: Data is split where doing so reveals something management needs to know.
💡 Expert note — The criteria are often in tension: A highly Direct indicator (precisely measuring the true outcome) is sometimes the least Practical one to collect — expensive, slow, or technically difficult. This forces teams to choose a proxy. Naming this tradeoff explicitly helps students understand why real M&E plans often use the “best available” indicator rather than a theoretically perfect one.
| Common Flaw | Weak Indicator | Strong Indicator |
|---|---|---|
| Directional (Includes Target) | Decrease in malaria mortality | Malaria mortality rate |
| Subjective / Vague | Quality of HIV care provided | Percentage of facilities scoring >85% on standard HIV care checklist |
| Not Direct | Number of pamphlets printed to measure knowledge | Percentage of target population able to identify 3 HIV transmission routes |
To ensure consistency, every indicator should be documented in an M&E plan using an Indicator Dictionary.
| Element | Detail |
|---|---|
| Indicator Name | ANC 1 Coverage |
| Definition | Proportion of pregnant women attending at least one ANC visit |
| Numerator | Number of new ANC 1 visits |
| Denominator | Estimated number of pregnant women in catchment |
| Data Source | ANC Register / Population Projections |
| Frequency | Monthly |
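A dictionary entry can also be stored as a structured record so that incomplete entries are caught before data collection starts. The sketch below mirrors the six elements in the table above; the class name `IndicatorEntry` and the `missing_fields` check are assumptions for illustration, not a prescribed format.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class IndicatorEntry:
    name: str
    definition: str
    numerator: str
    denominator: str
    data_source: str
    frequency: str

    def missing_fields(self) -> list[str]:
        """Return the names of any elements left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# The ANC 1 Coverage entry from the table above.
anc1 = IndicatorEntry(
    name="ANC 1 Coverage",
    definition="Proportion of pregnant women attending at least one ANC visit",
    numerator="Number of new ANC 1 visits",
    denominator="Estimated number of pregnant women in catchment",
    data_source="ANC Register / Population Projections",
    frequency="Monthly",
)

# A hypothetical draft with the denominator left blank.
draft = IndicatorEntry("Draft indicator", "To be agreed", "Count of X", "", "Register", "Quarterly")
gaps = draft.missing_fields()
```

An empty list from `missing_fields()` only shows that every element has been filled in. It does not show that the numerator and denominator are well chosen.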
Setting targets can be a powerful exercise. It motivates your team to achieve success, and it also ensures that everyone — team members, beneficiaries, partners, and donors — understands what is going to be accomplished.
Your targets can express:
When setting targets, it is important to be ambitious but realistic. You must use data, not feelings, to set targets. You can find this data by looking at your baseline, checking historical trends, or consulting expert opinion and research findings.
💡 Expert note — Floor targets vs. stretch targets: A floor target is the minimum acceptable result, often contractual and tied to funding continuation. A stretch target is an aspirational ceiling used internally for motivation but not reported as a firm commitment. Conflating the two is a common cause of donor relationship strain, when teams end up reporting against an aspirational number they never should have promised.
⚠️ Expert note — Watch for target-setting bias: When the same team being evaluated also sets its own targets, there is a natural incentive to set them low. Strong M&E practice involves an external sanity check — historical sector data or comparable peer programs — before finalizing a target.
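Keeping the floor and the stretch target as two separate values makes it harder to report against the wrong one. The following is a minimal sketch, assuming an indicator where higher values are better; the function name, the status labels, and the example figures are all hypothetical.

```python
def target_status(result: float, floor: float, stretch: float) -> str:
    """Classify a result against a floor (minimum) and a stretch (aspirational) target.

    Assumes higher values are better, as with a completion or coverage rate.
    """
    if floor > stretch:
        raise ValueError("The floor target cannot be higher than the stretch target")
    if result < floor:
        return "below floor"
    if result < stretch:
        return "floor met"
    return "stretch met"


# Hypothetical targets: a contractual floor of 80% and an internal stretch of 90%.
statuses = [target_status(r, floor=80.0, stretch=90.0) for r in (76.0, 83.6, 91.0)]
```

Only the comparison against the floor belongs in reports that treat the target as a firm commitment. The stretch comparison is for internal use.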
Most strong indicators are disaggregated, meaning that the data is separated into categories. Common disaggregations include sex (male/female) and age band (0-4, 5-14, etc.), with bands that do not overlap so that each person falls into exactly one category. It is important to disaggregate your indicators because activities will affect different types of people in different ways.
Note: Not every indicator needs to be disaggregated. If it is not helpful to break an indicator into different categories, then don’t do it.
💡 Expert note — Disaggregation has a sample-size cost: Every additional category multiplies the sample size needed for statistically meaningful comparisons between subgroups. This matters most for an enterprise or small program without a large participant pool. Pair the rule “disaggregate only if it informs a decision” with “and only if you’ll have enough cases in each category for the comparison to mean anything.”
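Before committing to a disaggregation, it helps to count how many cases each category would actually contain. The sketch below is an illustration only: the threshold of 30 cases, the field names, and the sample records are assumptions, and the appropriate minimum depends on the comparison you intend to make.

```python
from collections import Counter

MIN_CASES_PER_GROUP = 30  # hypothetical threshold; agree on the real one with your analyst


def thin_subgroups(records, keys, min_cases=MIN_CASES_PER_GROUP):
    """Return the subgroups (and their sizes) that fall below the minimum case count.

    Subgroups with zero cases never appear in the data, so they are not reported here.
    """
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return {group: n for group, n in counts.items() if n < min_cases}


# Hypothetical participant list for a small program.
records = (
    [{"sex": "F", "age_band": "0-4"}] * 40
    + [{"sex": "M", "age_band": "0-4"}] * 35
    + [{"sex": "F", "age_band": "5-14"}] * 12
    + [{"sex": "M", "age_band": "5-14"}] * 9
)

by_sex = thin_subgroups(records, ["sex"])
by_sex_and_age = thin_subgroups(records, ["sex", "age_band"])
```

In this made-up data, splitting by sex alone leaves enough cases in each group, while adding age band produces two groups that are too small to compare.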
When selecting indicators, teams often face common challenges. Avoid these pitfalls:
1. Choosing an indicator that the program activities cannot affect. For example, a project providing local clinical training should not use a global indicator like proportion of health care facilities with adequate conditions to provide care, because facility conditions rely on supply chains and equipment that training alone cannot fix.
2. Selecting an indicator that does not accurately represent the desired outcome. This often happens when the denominator is wrong. For instance, if an Intermediate Result states expanded access to antiretroviral (ARV) treatment for pregnant women, using the indicator percentage of women on ARVs who are pregnant is incorrect. If more nonpregnant women receive treatment but the number of pregnant women receiving treatment stays the same, the indicator would decrease, even though access for pregnant women has not changed. A better-aligned indicator would take pregnant women who need ARV treatment as the denominator and those receiving it as the numerator.
3. Indicator drift. This occurs when an indicator’s definition quietly changes mid-program — a different denominator population, a different “trained” threshold — without being re-documented. The indicator name stays the same, but trend data across periods is no longer truly comparable. This is one of the most common, and most silent, data-quality failures in long-running programs.
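The denominator problem in pitfall 2 is easiest to see with numbers. The sketch below uses made-up counts in which treatment among pregnant women stays constant while treatment among nonpregnant women doubles. The alternative indicator, which uses pregnant women eligible for ARV treatment as the denominator, is one illustrative correction rather than a prescribed one.

```python
def percent(numerator: float, denominator: float) -> float:
    return 100.0 * numerator / denominator


# Hypothetical counts for two reporting years.
year1 = {"pregnant_on_arv": 200, "nonpregnant_on_arv": 300, "pregnant_eligible": 400}
year2 = {"pregnant_on_arv": 200, "nonpregnant_on_arv": 600, "pregnant_eligible": 400}


def flawed_indicator(counts: dict) -> float:
    """Percentage of women on ARVs who are pregnant."""
    all_women_on_arv = counts["pregnant_on_arv"] + counts["nonpregnant_on_arv"]
    return percent(counts["pregnant_on_arv"], all_women_on_arv)


def aligned_indicator(counts: dict) -> float:
    """Percentage of eligible pregnant women who are on ARVs (illustrative alternative)."""
    return percent(counts["pregnant_on_arv"], counts["pregnant_eligible"])


flawed = (flawed_indicator(year1), flawed_indicator(year2))
aligned = (aligned_indicator(year1), aligned_indicator(year2))
```

With these counts the flawed indicator falls from 40% to 25% although nothing changed for pregnant women, while the aligned indicator stays at 50% in both years.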
An agricultural program is tracking the success rate of a livestock cross-breeding initiative focused on meat production. They are introducing Boran cattle, Dorper sheep, and Galla goats. Their chosen indicator is: Percentage of all farm animals that are pedigree cross-breeds. Over one year, this percentage drops from 15% to 8%. The project manager panics, assuming the breeding program has a negative return on investment (ROI).
Task: Discuss why this indicator is flawed based on what we learned about denominators. What alternative factor (e.g., the farmer purchasing a massive flock of local chickens) could cause this drop? Propose the correct numerator and denominator to track the actual ROI for the targeted breeds.
Debrief: Map your corrected indicator back to the seven criteria from Section 4 — which criteria did the original indicator fail, and which does your corrected version now satisfy?
A foodstuff enterprise dealing in Mwea Pishori rice wants to track its sales volume to better manage inventory. The primary outcome indicator is the Total kilograms of rice sold per month.
Task: Define the standard disaggregations required for this indicator to be “Useful for Management.” Justify why separating the data by pricing tiers, specifically disaggregating retail orders (under 30 kilograms) versus wholesale orders (30 kilograms or more), is critical for financial decision-making.
Debrief: Map your corrected indicator back to the seven criteria from Section 4 — which criteria did the original indicator fail, and which does your corrected version now satisfy?
Your organization is facilitating On-the-Job Training (OJT) for clinical documentation systems. The scope of this training encompasses 400 total participants across ten different facility locations.
Task: As a group, build a complete indicator dictionary entry for tracking the success of this rollout. Determine the indicator name, a strict definition of what counts as a “trained” participant, the exact numerator, and practical data sources to verify attendance across the 10 locations.
| Attribute | Your Input |
|---|---|
| Indicator Name | |
| Definition | |
| Numerator | |
| Denominator | |
| Data Source | |
| Disaggregation | |
| Frequency | |
Debrief: Map your indicator back to the seven criteria from Section 4. Which criteria does your dictionary entry already satisfy, and which still need work?
Question 1 “Increase the percentage of patients retained on ART.” Is this a valid indicator name?
Question 2 Which logframe level does “Number of nutritional supplements distributed” measure?
Question 3 True or False: Every indicator in your logframe must be disaggregated by age and sex.
Question 4 A program reports its trained-staff numerator using a stricter pass mark this year than last year, without updating the indicator dictionary. What problem does this illustrate?
Question 5 What is the difference between a floor target and a stretch target?
| Question | Correct Choice | Technical Trainer Rationale |
|---|---|---|
| 1 | B | The phrase “Increase the percentage” bakes a direction into the indicator name. The Golden Rule requires the indicator to remain true whether the value rises or falls — this fails the diagnostic test from Section 1. |
| 2 | C | Distributed supplements are an immediate, tangible product of project activity — fully within the project’s control, which is the defining feature of an Output. |
| 3 | B | Disaggregation carries a real sample-size and reporting-burden cost. It should be applied only where it changes a management decision and where each subgroup has enough cases to be meaningful. |
| 4 | B | This is a textbook case of indicator drift: the underlying definition changed (stricter pass mark) without re-documentation, so the trend line across years is no longer a true apples-to-apples comparison. |
| 5 | B | Floor targets are contractual minimums tied to funding; stretch targets are internal motivational ceilings. Reporting a stretch target as if it were a commitment is a common source of donor relationship strain. |
| Resource | Relevance to This Module | Website |
|---|---|---|
| WHO — Consolidated Guidelines on Person-Centred HIV Strategic Information | Strategic information standards for HIV indicator design. | who.int |
| PEPFAR — Monitoring, Evaluation, and Reporting (MER) Indicator Reference Guide | Standardized indicator definitions, numerators, and denominators for global health programs. | pepfar.gov |
| MEASURE Evaluation — Data Quality Assurance (DQA) Toolkits | Practical tools for assessing data accuracy, completeness, timeliness, and integrity. | measureevaluation.org |
| DHIS2 — User Manual: Indicators and Indicator Types | Technical reference for building Data Elements, Indicator Types, and Indicators in DHIS2. | dhis2.org |