Overview


77.1%
GPS Average Endline
Baseline 59.2% across all six modules

48.1%
Control Average Endline
Baseline 45.3% across control schools

+29.0pp
GPS Advantage at Endline
Consistent positive effect across all six modules


+62.9pp
Scaffolding for Struggling Learners
37.1% → 100%  ·  One of the largest individual item gains in the evaluation

+67.6pp
SEND Adaptation
13.6% → 81.2%  ·  Transformational shift in inclusive practice
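The gains on these two cards are percentage-point differences (endline minus baseline), not relative changes. A minimal Python sketch of that arithmetic, using the baseline and endline values quoted on the cards; the function name `pp_change` is illustrative only and is not part of the evaluation tooling:

```python
def pp_change(baseline_pct: float, endline_pct: float) -> float:
    """Percentage-point change: endline minus baseline, rounded to 1 decimal."""
    return round(endline_pct - baseline_pct, 1)


# Baseline and endline values as quoted on the two cards above.
scaffolding = pp_change(37.1, 100.0)
send_adaptation = pp_change(13.6, 81.2)

print(f"Scaffolding for struggling learners: {scaffolding:+.1f}pp")
print(f"SEND adaptation: {send_adaptation:+.1f}pp")
```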


Key Findings

Strongest Performer
Lesson Planning & Classroom Management achieved the greatest module-level gain (+23.8pp) and the second-highest endline score (82.2%), behind only Ethics & Ownership (93.1%). It is the standout model for replication across the programme.
Priority Area  ·  Action Required
Assessment & Evaluation has the weakest gain (+8.3pp) and the lowest endline score (66.6%), and is one of two modules still below 70%. The core problem: teachers are assessing, but assessment results are not feeding back into lesson planning (42.1%, the lowest item in the module and among the lowest in the evaluation). In most observed classrooms, assessment is not yet serving its primary purpose.
Programme Impact
GPS schools outperformed control schools across all six modules at endline. The GPS advantage averaged +29.0pp — a substantial and consistent programme effect.
Progress Against Threshold
Four of six modules crossed the 70% performance threshold by endline. Child Development (68.9%) and Assessment & Evaluation (66.6%) have yet to reach this target; Child Development is close to achieving it.
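The programme-level figures in these findings can be cross-checked against the module-level scores in the Module Summary table. The sketch below assumes the headline averages are simple unweighted means of the six module scores; the report does not state its weighting, so treat this as an illustration rather than the evaluation's actual method.

```python
from statistics import mean

THRESHOLD = 70.0  # programme performance threshold (endline %)

# GPS figures per module from the Module Summary table: (endline %, gain in pp).
modules = {
    "M1": (68.9, 23.4),
    "M2": (81.2, 19.3),
    "M3": (70.7, 21.6),
    "M4": (82.2, 23.8),
    "M5": (66.6, 8.3),
    "M6": (93.1, 11.1),
}

# Assumption: unweighted means across modules.
mean_endline = mean(endline for endline, _ in modules.values())
mean_gain = mean(gain for _, gain in modules.values())
implied_baseline = mean_endline - mean_gain

# Modules whose endline score is still under the threshold.
below_threshold = sorted(m for m, (endline, _) in modules.items() if endline < THRESHOLD)

print(f"Mean endline: {mean_endline:.1f}%")
print(f"Mean gain: {mean_gain:+.1f}pp")
print(f"Implied baseline: {implied_baseline:.1f}%")
print("Below threshold:", ", ".join(below_threshold))
```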


Cross-Cutting Training Gap  ·  Present in Five of Six Modules
Interactive and group-based pedagogy has not taken root across the programme. Group work, collaborative problem-solving, and varied approaches are the weakest-scoring item in M2 (48.8%) and M4 (53.6%), among the weakest in M1 (45.7%) and M5 (45.3%), and remain below 70% in M3 (68.6%). These are not separate module weaknesses; they are one systemic training gap. Teachers absorbed the content of each module but have not internalised the pedagogy. Interactive methods must be the centrepiece of the next training cycle, not one item among many.


Safeguarding Notice  ·  Programme Team Action Required
Enumerators documented incidents of physical punishment and threatening language in a small number of classrooms during endline observations. These incidents are not reflected in the M6 aggregate score of 93.1%. These are safeguarding incidents, not data points. They require direct follow-up with the schools and teachers concerned through programme safeguarding protocols, independently of this evaluation. See the M6 tab for detail.


Module Performance: Endline Score vs. Improvement from Baseline

Legend: Strongest performer  ·  Priority area  ·  All other modules
Quadrant lines: vertical at the 70% programme performance threshold; horizontal at the mean improvement across all six modules (+17.9pp).


Module Summary

Module | Endline | Gain | GPS Lead | Position
M6: Ethics, Principles & Ownership | 93.1% | +11.1pp | +35.3pp | Strong but Stable
M4: Lesson Planning & Classroom Mgmt | 82.2% | +23.8pp | +32.8pp | Strong and Improving
M2: Curriculum Studies | 81.2% | +19.3pp | +24.0pp | Strong and Improving
M3: Teaching Methodology | 70.7% | +21.6pp | +32.4pp | Strong and Improving
M1: Child Development & Psychology | 68.9% | +23.4pp | +25.9pp | Improving but Still Weak
M5: Assessment and Evaluation | 66.6% | +8.3pp | +24.0pp | Needs Attention
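The module-level Position labels are consistent with a simple quadrant rule built on the two quadrant lines: the 70% endline threshold and the +17.9pp mean improvement. The sketch below is an assumed reconstruction of that rule, not the dashboard's documented logic. In particular, treating values exactly on a line as above it (`>=`) is an assumption, and the rule is not claimed to reproduce the item-level labels used in the module scorecards.

```python
THRESHOLD = 70.0  # vertical quadrant line: endline score (%)
MEAN_GAIN = 17.9  # horizontal quadrant line: mean improvement (pp)


def position(endline_pct: float, gain_pp: float) -> str:
    """Assign a module to a quadrant label from its endline score and gain."""
    strong = endline_pct >= THRESHOLD   # assumption: on-the-line counts as strong
    improving = gain_pp >= MEAN_GAIN    # assumption: on-the-line counts as improving
    if strong and improving:
        return "Strong and Improving"
    if strong:
        return "Strong but Stable"
    if improving:
        return "Improving but Still Weak"
    return "Needs Attention"


# (endline %, gain in pp) per module, from the Module Summary table.
modules = {
    "M6": (93.1, 11.1),
    "M4": (82.2, 23.8),
    "M2": (81.2, 19.3),
    "M3": (70.7, 21.6),
    "M1": (68.9, 23.4),
    "M5": (66.6, 8.3),
}

labels = {name: position(endline, gain) for name, (endline, gain) in modules.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```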

M1: Child Development


68.9%
GPS Endline Score
Amber  ·  just below the 70% threshold

+23.4pp
Improvement from Baseline
Above the programme average of +17.9pp

+25.9pp
GPS Advantage at Endline
Control schools: 43.0%


What We Observed at Endline — Enumerator Notes (n=55)
What Improved
Encouragement rose sharply (+28.9pp) and equal participation remained high (95.2%). Teachers consistently praised effort, not just correct answers, and ensured that all learners, regardless of gender or background, had an equal opportunity to contribute.
Remaining Challenges
Local teaching materials (23.5%), group work and play-based activities (45.7%), and varied teaching approaches (53.7%) remain among the weakest items. Most classrooms still relied on single-method verbal delivery.
Field Observation
“The teacher celebrated progress not just for perfection. He acknowledged the efforts publicly to support every child’s improvement.”


Item Performance — Quadrant Analysis


Item Scorecard

Item | Endline | Change | vs Control | Position
Scaffolding / support for struggling learners | 100.0% | +62.9pp | +56.2pp | Strong & Improving
Equal participation (gender / background) | 95.2% | +3.5pp | +11.9pp | Stable
Encouragement and motivation | 92.9% | +28.9pp | +33.8pp | Strong & Improving
Respectful classroom culture | 90.7% | +11.6pp | +24.0pp | Stable
SEND adaptation | 81.2% | +67.6pp | +31.2pp | Strong & Improving
Developmental appropriateness | 77.9% | +24.4pp | +23.7pp | Strong & Improving
Empathy / conflict resolution | 69.6% | -8.2pp | +49.6pp | Declined
Responds to student emotions | 62.0% | +43.8pp | +26.3pp | Improving
Varied approaches (visuals, stories, songs) | 53.7% | +50.7pp | +20.4pp | Improving
Seating promotes cooperation | 52.3% | +3.5pp | +10.6pp | Needs Attention
Calming techniques (low-resource) | 52.1% | +39.6pp | +18.8pp | Improving
Group work / play-based learning | 45.7% | +43.1pp | +16.5pp | Improving
Local / low-cost teaching materials | 23.5% | +14.4pp | +23.5pp | Needs Attention

M2: Curriculum Studies


81.2%
GPS Endline Score
Green tier  ·  above the 70% threshold

+19.3pp
Improvement from Baseline
Above the programme average of +17.9pp

+24.0pp
GPS Advantage at Endline
Control schools: 57.2%


What We Observed at Endline — Enumerator Notes (n=55)
What Improved
Subject knowledge and lesson sequencing were near-universal strengths. Teachers were confident, organised lessons with clear structure, connected new content to prior knowledge, and corrected student errors effectively.
Remaining Challenges
Interactive methods (group work, problem-solving) scored just 48.8% — the weakest item in the module. Most lessons relied on teacher-led Q&A; collaborative or practical activities were rarely observed.
Field Observation
“The lesson followed a clear and logical sequence. The teacher connected new content to previous knowledge or real-life experiences, making it easier for learners to understand and follow.”


Item Performance — Quadrant Analysis


Item Scorecard

Item | Endline | Change | vs Control | Position
Content confidence and accuracy | 100.0% | +3.6pp | +12.5pp | Stable
Logical lesson sequencing | 98.8% | +14.3pp | +44.6pp | Stable
Student engagement and interest | 84.9% | +8.7pp | +26.6pp | Stable
Error correction / addressing confusion | 81.2% | +28.9pp | +24.1pp | Strong & Improving
Real-life / local relevance | 75.6% | +33.1pp | +25.6pp | Strong & Improving
Interactive methods (group work, problem-solving) | 48.8% | +37.3pp | +11.3pp | Improving

M3: Teaching Methodology


70.7%
GPS Endline Score
Green tier  ·  above the 70% threshold

+21.6pp
Improvement from Baseline
Above the programme average of +17.9pp

+32.4pp
GPS Advantage at Endline
Control schools: 38.3%


What We Observed at Endline — Enumerator Notes (n=55)
What Improved
Linking lessons to prior knowledge and active monitoring both improved substantially. Most teachers walked the classroom, marked exercises, and provided corrections to students in real time during activities.
Remaining Challenges
Teaching aids (30.3%) remain critically absent — visual, audio, or audio-visual resources were not observed in the majority of classrooms. Active student participation vs. passive listening remained inconsistent.
Field Observation
“The teacher monitors the student’s progress. He goes through the class, marks the exercise of the students, engages the students. He corrects and makes feedback of the students’ work.”


Item Performance — Quadrant Analysis


Item Scorecard

Item | Endline | Change | vs Control | Position
Links to prior knowledge | 87.2% | +20.5pp | +45.5pp | Stable
Active monitoring and timely feedback | 80.2% | +12.3pp | +38.5pp | Stable
Method-objective alignment | 79.1% | +34.9pp | +23.5pp | Strong & Improving
Active participation vs lecture | 68.6% | +20.9pp | +22.8pp | Needs Attention
Teaching aids used skillfully | 30.3% | +24.4pp | +30.3pp | Improving

M4: Lesson Planning & Mgmt


82.2%
GPS Endline Score
Programme's strongest module  ·  green tier

+23.8pp
Improvement from Baseline
Largest gain across all six modules

+32.8pp
GPS Advantage at Endline
Control schools: 49.4%


What We Observed at Endline — Enumerator Notes (n=55)
What Improved
Lesson structure (93.0%), non-punitive discipline (91.7%), and classroom routines (88.4%) were all high. Teachers arrived with structured plans, managed transitions smoothly, and corrected behaviour fairly.
Remaining Challenges
Interactive engagement scored 53.6% — the only weak item. Classrooms were well-managed but often passive; few lessons used group work or problem-solving to activate deeper learner participation.
Field Observation
“The lesson followed a logical sequence. The content was matched and aligned to the objectives of the lesson plan and scheme of work.”


Item Performance — Quadrant Analysis


Item Scorecard

Item | Endline | Change | vs Control | Position
Clear lesson structure | 93.0% | +15.6pp | +15.2pp | Stable
Positive / non-punitive discipline | 91.7% | +1.2pp | +41.7pp | Stable
Classroom rules and routines | 88.4% | +31.5pp | +38.4pp | Strong & Improving
Formative assessment before close | 86.0% | +14.6pp | +26.9pp | Stable
Interactive methods for engagement | 53.6% | +39.0pp | +20.3pp | Improving

M5: Assessment & Evaluation


66.6%
GPS Endline Score
Amber  ·  below the 70% threshold. Assessment loop broken. Action required.

+8.3pp
Improvement from Baseline
Lowest gain across all six modules

+24.0pp
GPS Advantage at Endline
Control schools: 42.6%


What We Observed at Endline — Enumerator Notes (n=55)
What Improved
Assessment alignment to content reached 100% — tasks consistently tested what was taught. Fair, calm assessment environments were maintained across schools and scoring was generally consistent.
Remaining Challenges
The assessment feedback loop is broken. ‘Assessment data used for planning’ scored 42.1%, the lowest item in the module and among the lowest in the evaluation. Teachers are assessing, but results stop there: they do not feed back into future lesson design. Assessment without that loop is measurement without learning. Varied methods scored only 45.3%; oral questioning dominated throughout.
Field Observation
“The teacher assessed the students through short quizzes, oral questions and class work activities during the period. The teacher allowed students to demonstrate equal opportunity to show their knowledge.”


Item Performance — Quadrant Analysis


Item Scorecard

Item | Endline | Change | vs Control | Position
Assessment aligned to content | 100.0% | +17.9pp | +10.0pp | Strong & Improving
Fair assessment environment | 90.8% | +16.4pp | +27.2pp | Strong & Improving
Consistent scoring method | 75.0% | +10.0pp | -25.0pp | Strong & Improving
Constructive feedback | 62.5% | +8.4pp | +22.5pp | Improving
Variety of assessment methods | 45.3% | +0.1pp | +16.1pp | Needs Attention
Assessment data used for planning | 42.1% | +24.2pp | +37.6pp | Improving

M6: Ethics & Ownership


93.1%
GPS Endline Score
Highest endline score across all six modules

+11.1pp
Improvement from Baseline
Below programme average  ·  module already strong at baseline

+35.3pp
GPS Advantage at Endline
Control schools: 57.8%  ·  Control declined from baseline


Safeguarding Notice  ·  Programme Team Action Required
Enumerators documented incidents of physical punishment and threatening language in a small number of classrooms during endline observations. These incidents do not appear in the aggregate score of 93.1% and require direct follow-up with the schools and teachers concerned. These are safeguarding incidents, not statistical outliers. They must be escalated through programme safeguarding protocols independently of this evaluation report. R&E will make the raw observation records available to the programme team on request.


What We Observed at Endline — Enumerator Notes (n=55)
What Improved
Respectful treatment (98.8%), punctuality (97.7%), and trust-building (91.9%) were near-universal. Teachers arrived prepared with plans and materials, treated all students fairly, and built warm classroom relationships.
Remaining Challenges
A small number of incidents involving physical punishment or threatening language were documented, undermining classroom safety in those settings. In some classrooms, students were not yet fully confident to ask questions or make mistakes freely.
Field Observation
“The teacher was punctual and visibly well prepared. He arrived early on time with a clear lesson plan, lesson note, scheme of work, and teaching materials.”


Item Performance — Quadrant Analysis


Item Scorecard

Item | Endline | Change | vs Control | Position
Respectful / fair treatment of all students | 98.8% | +6.9pp | +23.8pp | Stable
Punctuality and preparedness | 97.7% | +11.1pp | +39.4pp | Strong & Improving
Builds trust and respect with students | 91.9% | +2.4pp | +55.5pp | Stable
Recognises student success and effort | 91.7% | +24.3pp | +14.4pp | Strong & Improving
Teacher engagement and motivation | 86.0% | +10.4pp | +40.2pp | Stable

School Breakdown


GPS School Performance — Baseline, Endline and Improvement by Module

Each cell shows the endline score with the change from baseline in parentheses.  Green ≥70%  |  Amber 60–70%  |  Red <60%  |  Sorted by overall improvement. GPS schools only (n=43).
School | Baseline | Overall | M1 Child Dev | M2 Curriculum | M3 Teaching | M4 Lesson Plan | M5 Assessment | M6 Ethics
Gadhyare ★ Model school | 46% | 88% (+43pp) | 83% (+50pp) | 93% (+52pp) | 90% (+62pp) | 84% (+44pp) | 80% (+26pp) | 100% (+22pp)
Faadumo Biixi ★ Model school | 44% | 87% (+42pp) | 80% (+38pp) | 86% (+29pp) | 87% (+53pp) | 93% (+50pp) | 85% (+61pp) | 90% (+23pp)
Sheikh Yousuf ★ Model school | 49% | 90% (+42pp) | 85% (+51pp) | 94% (+47pp) | 87% (+50pp) | 97% (+42pp) | 81% (+33pp) | 100% (+27pp)
31 May | 53% | 75% (+22pp) | 73% (+26pp) | 79% (+16pp) | 70% (+30pp) | 70% (+39pp) | 79% (+34pp) | 80% (-10pp)
Mohamoud Xandulle | 60% | 79% (+19pp) | 72% (+29pp) | 88% (+29pp) | 70% (+22pp) | 82% (+23pp) | 62% (-4pp) | 100% (+15pp)
Omar Tooray | 51% | 70% (+19pp) | 56% (+20pp) | 75% (+13pp) | 48% (+12pp) | 86% (+31pp) | 58% (+4pp) | 98% (+32pp)
Omar Binu Khadab | 63% | 82% (+18pp) | 79% (+34pp) | 83% (+15pp) | 70% (+11pp) | 82% (+10pp) | 77% (+33pp) | 100% (+8pp)
Ahmed Dhagax | 60% | 72% (+12pp) | 62% (+29pp) | 78% (+18pp) | 63% (+12pp) | 82% (+14pp) | 58% (-2pp) | 87% (+2pp)
Gacmo Dheere | 66% | 77% (+11pp) | 66% (+11pp) | 81% (+12pp) | 79% (+19pp) | 78% (+21pp) | 73% (+6pp) | 87% (-1pp)
Sheikh Ali Ibrahim (INVESTIGATE) | 77% | 64% (-12pp) | 55% (-16pp) | 67% (-8pp) | 57% (-12pp) | 75% (+1pp) | 41% (-43pp) | 92% (+5pp)
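The colour tiers in the table legend can be expressed as a small rule. The sketch below applies it to each school's overall endline score and lists schools whose overall score fell. Two points are assumptions rather than documented behaviour: that the Amber band is 60% up to but not including 70%, and that a negative overall change is what triggers the INVESTIGATE flag.

```python
def rag_tier(score_pct: float) -> str:
    """Map a score to the legend's tiers (assumed bands: Green >=70, Amber 60 to <70, Red <60)."""
    if score_pct >= 70:
        return "Green"
    if score_pct >= 60:
        return "Amber"
    return "Red"


# (school, overall endline %, overall change in pp) from the table above.
schools = [
    ("Gadhyare", 88, 43),
    ("Faadumo Biixi", 87, 42),
    ("Sheikh Yousuf", 90, 42),
    ("31 May", 75, 22),
    ("Mohamoud Xandulle", 79, 19),
    ("Omar Tooray", 70, 19),
    ("Omar Binu Khadab", 82, 18),
    ("Ahmed Dhagax", 72, 12),
    ("Gacmo Dheere", 77, 11),
    ("Sheikh Ali Ibrahim", 64, -12),
]

# Hypothetical flag rule: any school whose overall score declined.
flagged = [name for name, _, change in schools if change < 0]

for name, endline, change in schools:
    marker = "  <- review" if name in flagged else ""
    print(f"{name}: {endline}% ({change:+d}pp) {rag_tier(endline)}{marker}")
```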

Action Points


Priority Actions for the Next Programme Phase

Immediate  ·  Before Next School Term
1. Escalate safeguarding incidents from M6 endline observations
Physical punishment and threatening language were documented in a small number of classrooms. Identify the specific schools from enumerator observation records and initiate follow-up through programme safeguarding protocols. R&E will provide raw observation records on request.
2. Review how assessment and lesson planning are taught as a connected cycle (M5)
Assessment data used for planning scored 42.1%, the lowest item in M5 and among the lowest in the evaluation. The current training sequence does not appear to be building a feedback loop between assessment and future lesson design. Lead trainers should review how this connection is taught and whether it is practised, not just described, in training sessions.
Next Training Cycle  ·  Structural Changes Required
3. Make interactive pedagogy the centrepiece of the next training cycle, not one item among many
Group work, collaborative problem-solving, and varied approaches are the weakest or near-weakest items in four of six modules (M1: 45.7%, M2: 48.8%, M4: 53.6%, M5: 45.3%). This is a programme-wide failure to translate content knowledge into classroom practice. Interactive methods training must run as a cross-cutting strand through all modules, with observed practice during training sessions, not just instruction about why it matters.
4. Build the assessment-to-planning feedback loop as a required classroom routine (M5)
Assessment variety (45.3%) and use of results for planning (42.1%) require targeted module redesign. Consider introducing a simple structured routine, for example a weekly one-page template where teachers record what the assessment revealed and what they will adjust in the next lesson. The routine should be taught, practised, and observed in training.
5. Provide targeted support to bring M1 Child Development above the 70% threshold
At 68.9%, Child Development is the closer of the two modules still below the programme performance threshold, and the gap is small. Two items are dragging the average: local and low-cost teaching materials (23.5%) and group work or play-based learning (45.7%). School-level coaching visits focused specifically on these two items are likely to close the gap without a full retraining cycle.
Ongoing Monitoring
6. Track interactive methods as a single cross-programme metric in quarterly reporting
Because the weakness appears in most modules, report it as one programme-level indicator rather than as disaggregated module scores. A single headline figure makes the trend visible and actionable at SLT level.
7. Investigate the control group decline on M6 Ethics before the next data collection
Control schools declined from 64.2% to 57.8% on M6. With n=12 control schools this is a fragile figure, but a 6.4pp deterioration in teacher professional conduct in non-GPS schools is worth investigating. It may reflect broader systemic pressures or a sampling artefact. Either way, it should be understood before it becomes a question the programme cannot answer.
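For action point 6, one possible shape for the single programme-level indicator is a simple average of the interactive-methods items already reported in the module scorecards. The sketch below is illustrative only: the choice of one item per module and the unweighted mean are assumptions, and the programme may prefer to weight by the number of observations.

```python
from statistics import mean

# Endline scores (%) for the interactive-methods item in each module scorecard.
# The item chosen to represent each module is an assumption for illustration.
interactive_items = {
    "M1: Group work / play-based learning": 45.7,
    "M2: Interactive methods (group work, problem-solving)": 48.8,
    "M3: Active participation vs lecture": 68.6,
    "M4: Interactive methods for engagement": 53.6,
    "M5: Variety of assessment methods": 45.3,
}

THRESHOLD = 70.0  # programme performance threshold (%)

# Assumption: unweighted mean across modules.
headline = mean(interactive_items.values())
gap_to_threshold = THRESHOLD - headline

print(f"Interactive methods indicator: {headline:.1f}%")
print(f"Gap to the 70% threshold: {gap_to_threshold:.1f}pp")
```

Reporting the same calculation each quarter would give one figure to track, with the item-level scores kept available for diagnosis.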