Background

In 2015, the Foundation began building an evaluation system for its projects.

The primary responsibilities of the evaluation system were established as:

In general terms, the interventions aim to improve academic results, equality of opportunities, and the life benefits that sport offers, by implementing a series of sports activities for at-risk children and young people.

Most of the Foundation’s interventions took place in communities, so the link between these interventions and the activities of the formal education system was neither direct nor evident. In 2016 the entity decided to change course and move the execution of the interventions into formal education settings through agreements with certified education secretariats.

During 2016 the Inter-American Development Bank identified a freely available test, the YARPET survey, designed in English, which provides items on 17 common youth development goals and is aimed at experiential intervention programs for at-risk youth. According to its author, it was an administratively efficient and psychometrically valid instrument for investigating the development of such interventions. Consultants of the Inter-American Development Bank translated the test and applied it in 2016 to the children, youth, and instructors of the intervention that the Foundation ran in communities.

During 2017, following the change of approach, two activities were carried out to advance toward the intended evaluation system:

The shift of the projects toward formal education settings and the delivery of the pedagogical documents to teachers forced a rethinking of the test, so that it could be used for follow-up with the pedagogical method aligned with what was evaluated. During the first semester of 2018, the instrument was adapted again, following the philosophy of YARPET but with different items.

In the second half of 2018, the new test was applied in the intervention, which was running in different schools from those of the previous year, and simultaneously in other schools.

The new analyses showed that the fit of the questions to the conception of socio-emotional and citizen competences differed between the large city and rural schools and small urban areas. It was therefore decided to establish two tests, one for metropolitan areas and another for the rest of the country’s territorial entities.

The adjusted version for rural schools and small urban areas was applied in 2019 in the projects.

The present document deals with the answers given in those projects.

Introduction

The proposal that guides the pedagogical records of the project combines elements of pedagogy for peace, the development of citizen competencies, and life skills.

Within the framework of competencies for life, it seeks to develop self-knowledge, empathy, assertive communication, management of emotions and feelings, management of problems and conflicts, management of stress and tensions, critical and creative thinking, interpersonal relationships, and decision making.

Each competency includes a set of achievements that are developed in the pedagogical records.

The evaluation system must align the skills the program aims to develop with those it evaluates. On that basis, the current version of the tool was designed.

Methodology of inquiry.

The purpose of the tool is to investigate the level of development of citizen and life skills in children and young people over ten years old through items that they answer themselves. The skills are expected to be incorporated by students into their work and daily life, and are therefore treated as latent factors of which students are not necessarily aware.

The self-concept items are written as statements answered on an ordinal scale, from “This statement does not describe me at all” to “This statement describes me very well; it looks a lot like me”, with five intermediate options, where “not being like this” is the least desirable and “being that way” is the most desirable. The seven options are symmetrical: the three on the left are considered less desirable, and the three on the right more desirable. Young people are expected to respond honestly, since the test asks for their opinion about themselves and has no right or wrong answers.

The items are shuffled: those that target the same competency are not contiguous, so the objective of each one is not obvious to the respondent.

Latent factors

The latent factors to be evaluated were conceptualized and grouped into two macro factors: personal and social objectives.

The tool for measuring socio-emotional and citizen competences was designed with this construct in mind; however, the students who answer it do not necessarily understand each item as expected. For this reason, it is crucial to perform analyses that identify how the items group into latent factors, since these factors are not directly observable except through the items.

Reliability Analysis

Two factors affect the degree of reliability of a test: variability and length.

Variability: The sample of subjects chosen to calculate the reliability of a test can influence the result obtained. Reliability therefore depends not only on the characteristics of the test but also on the sample of subjects to which it is applied, so a test cannot be said to have a fixed reliability coefficient. In sum, the reliability of a test will be higher the greater the variability in the selected sample of subjects.

Length: In general terms, the reliability of a test increases as its length, that is, its number of items, increases. This does not mean that it is advisable to lengthen a test unnecessarily in order to increase its reliability, nor that any test can become a reliable instrument solely by increasing its length indefinitely.
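
The relation between length and reliability is commonly quantified with the Spearman-Brown formula, which the text does not name. The sketch below is only illustrative: the function name `spearman_brown` and the starting reliability of 0.70 are assumptions, and the formula itself assumes that the added items are parallel to the existing ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when the test length is multiplied by length_factor.

    Assumes the added items are parallel to the existing ones.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical starting point: a test with reliability 0.70.
for factor in (1, 2, 4, 8):
    print(factor, round(spearman_brown(0.70, factor), 3))
```

Each doubling of the length adds less reliability than the previous one, which is consistent with the warning against lengthening a test indefinitely.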

To determine the reliability of the instrument applied to the project population, Cronbach’s alpha coefficient was computed, together with McDonald’s omega coefficient, a similar indicator. Reliability refers to the degree of consistency and stability of the scores obtained throughout successive measurement processes with the same instrument.

In the framework of classical test theory, the empirical score (X) obtained by a subject in a test is the sum of two components: the true score (V) and the inevitable measurement error associated with the instrument. That is, if the measurement error is removed from the empirical score (obtained in the test), the true score is obtained. In this context, reliability refers to the proportion of true variance in the total variance, that is, the part of the total variance that reflects what the items have in common.
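
In symbols, and assuming that the true score and the error are uncorrelated, this decomposition can be written as follows. The symbol \(E\) for the measurement error and \(\rho_{XX'}\) for the reliability coefficient are conventional notation and do not appear in the text:

\[
X = V + E, \qquad \sigma_X^2 = \sigma_V^2 + \sigma_E^2, \qquad \rho_{XX'} = \frac{\sigma_V^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
\]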

Cronbach’s alpha (\(\alpha\)): An indicator of the reliability of a test based on its degree of internal consistency. It indicates the degree to which the items of a test covary.
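
As an illustration of how alpha follows from the covariation of the items, the sketch below computes it with the usual formula from the item variances and the variance of the total score. The function name `cronbach_alpha` and the simulated data are assumptions made for the example; they are not the project data or the software used for this report.

```python
import numpy as np


def cronbach_alpha(scores) -> float:
    """Cronbach's alpha for a (respondents x items) array of scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (n_items / (n_items - 1)) * (1.0 - item_variances.sum() / total_variance)


# Simulated example: six items that share one underlying trait plus noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=(500, 1))
simulated = trait + 0.8 * rng.normal(size=(500, 6))
print(round(cronbach_alpha(simulated), 2))
```

When the items share little variance, the sum of the item variances approaches the variance of the total score and alpha drops toward zero.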

McDonald’s omega (\(\omega\)): Based on the confirmatory factor analysis of the items, it is an indicator of internal consistency similar to the alpha coefficient, but preferred by some psychometricians.
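
A minimal sketch of how omega can be obtained from factor loadings is given below. It assumes a single factor, standardized loadings, and uncorrelated errors, so that each item's unique variance is one minus its squared loading. The function name `mcdonald_omega` and the loadings are hypothetical and are not the values estimated for this tool.

```python
def mcdonald_omega(loadings) -> float:
    """Omega for one factor, from standardized loadings with uncorrelated errors."""
    loadings = [float(value) for value in loadings]
    if not loadings:
        raise ValueError("at least one loading is required")
    common = sum(loadings) ** 2                           # variance due to the factor
    unique = sum(1.0 - value ** 2 for value in loadings)  # unique variance of the items
    return common / (common + unique)


# Hypothetical standardized loadings for a four-item factor.
example_loadings = [0.72, 0.65, 0.58, 0.70]
print(round(mcdonald_omega(example_loadings), 3))
```

With equal loadings the result coincides with the standardized alpha; the two coefficients diverge when the loadings differ across items.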

There is no general rule; however, a value above 0.70 is commonly regarded as acceptable for both coefficients, although values above 0.80 are preferred.

It should be noted that there is usually a subset of students who answer the test carelessly or try to sabotage it, which would lower the level of reliability of the test. Most of these students are identified through statistical techniques and are eliminated from the analysis.
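
The text does not say which statistical techniques were used. As one common example of such screening, and not necessarily the one applied here, the sketch below flags respondents who give the same answer many times in a row. The function names and the cut-off `max_run` are assumptions.

```python
def longest_run(responses) -> int:
    """Length of the longest run of identical consecutive answers."""
    if not responses:
        return 0
    longest = current = 1
    for previous, answer in zip(responses, responses[1:]):
        current = current + 1 if answer == previous else 1
        longest = max(longest, current)
    return longest


def flag_careless(rows, max_run: int):
    """Indices of respondents whose longest run reaches max_run (hypothetical cut-off)."""
    return [index for index, row in enumerate(rows) if longest_run(row) >= max_run]


# Hypothetical answers on the seven-option scale, coded 1 to 7.
answers = [
    [3, 5, 4, 6, 2, 5, 7, 4],
    [7, 7, 7, 7, 7, 7, 7, 7],
    [1, 2, 2, 2, 5, 6, 6, 3],
]
print(flag_careless(answers, max_run=6))
```

A screen of this kind only catches one pattern of careless answering, so in practice it would be combined with other checks before any record is removed.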

65-item tool

The diagram below shows how the items group together according to the answers of the sample of students to whom the 65-item tool was applied:

Global reliability:

Cronbach’s alpha: 0.95

McDonald’s omega: 0.96

The version for non-metropolitan areas shows high reliability in the population to which it was applied.

Omega for latent factors

| Latent factor | Omega |
| --- | --- |
| Self-esteem | 0.85 |
| Leadership and teamwork | 0.78 |
| Conflict management and resolution | 0.76 |
| Organization | 0.79 |

By construction, it is better for each latent factor to have five or more items.

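Item weighting: Self-esteem

| Item | Weighting |
| --- | --- |
| I11 | 0.141 |
| I17 | 0.134 |
| I18 | 0.137 |
| I22 | 0.119 |
| I24 | 0.104 |
| I25 | 0.130 |
| I31 | 0.128 |
| I56 | 0.107 |

Item weighting: Leadership and teamwork

| Item | Weighting |
| --- | --- |
| I03 | 0.170 |
| I06 | 0.195 |
| I30 | 0.142 |
| I36 | 0.140 |
| I49 | 0.164 |
| I61 | 0.188 |

Item weighting: Conflict management and resolution

| Item | Weighting |
| --- | --- |
| I04 | 0.205 |
| I39 | 0.295 |
| I58 | 0.237 |
| I60 | 0.263 |

Item weighting: Organization

| Item | Weighting |
| --- | --- |
| I12 | 0.201 |
| I16 | 0.182 |
| I23 | 0.161 |
| I42 | 0.203 |
| I46 | 0.123 |
| I54 | 0.131 |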

The calculation was made on 553 records.
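
The text does not state how the weightings are combined into a score. One plausible reading, given that the weightings of each factor add up to approximately one, is a weighted average of the item answers. The sketch below follows that assumption for the conflict management and resolution factor; the function name, the 1 to 7 coding of the answers, and the student's answers are hypothetical.

```python
# Weightings reported for the conflict management and resolution factor.
CONFLICT_WEIGHTS = {"I04": 0.205, "I39": 0.295, "I58": 0.237, "I60": 0.263}


def weighted_score(responses: dict, weights: dict) -> float:
    """Weighted average of item answers; assumes answers are coded numerically."""
    missing = sorted(set(weights) - set(responses))
    if missing:
        raise KeyError(f"missing answers for items: {missing}")
    total_weight = sum(weights.values())  # guards against rounding in the weights
    weighted_sum = sum(weights[item] * responses[item] for item in weights)
    return weighted_sum / total_weight


# Hypothetical answers of one student, coded 1 (least desirable) to 7 (most desirable).
student = {"I04": 5, "I39": 6, "I58": 4, "I60": 7}
print(round(weighted_score(student, CONFLICT_WEIGHTS), 3))
```

Dividing by the sum of the weights keeps the score on the original answer scale even when the rounded weightings do not add up to exactly one.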

Annex A. Construct confirmation methodology.

The tool was built from a theory, and it must be verified whether the statements fit this theory. Not all students may understand a statement in the same way, in which case the item does not correlate consistently with the rest of its group (latent factor) and behaves like an isolated item. In other cases, an item planned for a given latent factor is answered consistently with the items of another factor and is relevant there. To verify how the items were understood, factor analysis is performed: usually an exploratory analysis first and then a confirmatory one.

The variability in the young people’s responses can be broken down conceptually as follows:

The total variance of an item = variance shared with the other items + specific variance of the item + variance of measurement errors.

The specific variance can be combined with the variance due to measurement errors, giving all the variance not shared with the other variables:

The total variance of an item = shared or common variance + variance not shared by the item.
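
In symbols, for a standardized item \(i\) whose total variance is one, the same decomposition is often written as follows. The names \(h_i^2\) (communality), \(s_i^2\) (specific variance), \(e_i^2\) (error variance), and \(u_i^2\) (uniqueness) are conventional notation assumed here and are not used in the text:

\[
1 = h_i^2 + s_i^2 + e_i^2 = h_i^2 + u_i^2, \qquad u_i^2 = s_i^2 + e_i^2
\]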

The variance of measurement errors can be seen as each young person’s systematic style of selecting responses to the items, for example choosing the extreme scores or avoiding them.

Exploratory factor analysis examines the variance common to all variables, starting from the correlation matrix, and tries to simplify the information that this matrix offers.

Each cell of the correlation matrix reflects the proportion of variance common to two items, except on the main diagonal, where each item is paired with itself and therefore has a value of one. The ones on the main diagonal reflect all of an item’s variance, both shared and unshared. To analyze the shared variance exclusively, one alternative is to replace the ones in the correlation matrix with an estimate of the proportion of variance that each item has in common with all the others.

Since that value is not known, it must be estimated, while the cells outside the adjusted diagonal keep their values.
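
The text does not say which estimate was used for the diagonal. A common starting estimate is the squared multiple correlation of each item with all the others, and the sketch below uses it as an assumption. The function name `reduced_correlation_matrix` and the simulated data are hypothetical.

```python
import numpy as np


def reduced_correlation_matrix(data):
    """Correlation matrix with the diagonal ones replaced by squared multiple correlations."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inverse = np.linalg.inv(corr)
    smc = 1.0 - 1.0 / np.diag(inverse)  # shared-variance estimate for each item
    reduced = corr.copy()
    np.fill_diagonal(reduced, smc)      # off-diagonal cells are left untouched
    return reduced, smc


# Simulated example: four items that share one underlying trait plus noise.
rng = np.random.default_rng(1)
trait = rng.normal(size=(300, 1))
items = trait + rng.normal(size=(300, 4))
reduced, smc = reduced_correlation_matrix(items)
print(np.round(smc, 2))
```

Factor extraction methods that work on the shared variance typically refine these diagonal values iteratively rather than keeping the first estimate.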

In conceptual terms, the aim is to give a score to the latent factor, about which we have no direct information, expressed as a set of equations of the form:
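
A generic form consistent with the description, for \(m\) latent factors and \(p\) items, is shown here. The weight symbol \(w_{ji}\) and the counts \(m\) and \(p\) are assumed notation:

\[
\begin{aligned}
F_1 &= w_{11} X_1 + w_{12} X_2 + \dots + w_{1p} X_p \\
F_2 &= w_{21} X_1 + w_{22} X_2 + \dots + w_{2p} X_p \\
&\;\;\vdots \\
F_m &= w_{m1} X_1 + w_{m2} X_2 + \dots + w_{mp} X_p
\end{aligned}
\]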

Each \(F_i\) represents a latent factor. We know the response values \(X_i\) of the items, and we must assign weights or loadings to each variable, reflecting how much each variable contributes to that factor. A variable that does not point to the factor will have a loading of zero or very close to zero. A variable that contributes directly to the factor will have a positive loading, and a variable that contributes negatively to the factor will have a negative loading. This value is called the factor loading. The variables with factor loadings above a certain threshold, for example 0.3, are those that define the factor, and in this way it is verified whether the theory coincides with the practice.
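
The selection of the variables that define each factor can be sketched as below. The loadings and item labels are hypothetical, the function name `items_defining_factors` is an assumption, and the comparison uses the absolute value so that strongly negative loadings also count, which is a choice made for this example rather than something stated in the text.

```python
def items_defining_factors(loadings: dict, threshold: float = 0.3) -> dict:
    """Group items by factor, keeping those whose absolute loading exceeds the threshold."""
    if not loadings:
        return {}
    n_factors = len(next(iter(loadings.values())))
    groups = {factor: [] for factor in range(n_factors)}
    for item, row in loadings.items():
        for factor, value in enumerate(row):
            if abs(value) > threshold:
                groups[factor].append(item)
    return groups


# Hypothetical loadings of four items on two factors.
example = {
    "X1": [0.62, 0.05],
    "X2": [0.48, -0.10],
    "X3": [0.12, 0.71],
    "X4": [0.25, -0.44],
}
print(items_defining_factors(example))
```

An item that exceeds the threshold on no factor behaves like an isolated item, and one that exceeds it on an unplanned factor signals that students understood it differently from the theory.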

The confirmatory analysis is carried out by specifying the structural equation model to be confirmed. For example:

g =~ I01 + I05 + I06 + I08 + I09 + I10 + I11 + I15 + I16 + I17 + I19 + I20 + I23 + I24 + I25 + I28 + I29 + I30 + I31 + I32 + I33 + I34 + I35 + I36 + I37 + I38 + I39 + I40 + I41 + I42 + I43 + I44 + I45 + I46 + I48

F1 =~ I30 + I32 + I33 + I27 + I35 + I36 + I39 + I44 + I45 + I46 + I48

F2 =~ I08 + I10 + I15 + I23 + I25 + I28 + I29 + I31 + I37 + I38 + I40 + I43 + I41

F3 =~ I01 + I05 + I11 + I19 + I34 + I49 + I02 + I18 + I21 + I22 + I42

F4 =~ I06 + I09 + I17 + I24 + I04 + I07

F5 =~ I16 + I20 + I03 + I12 + I13 + I26

For presentation, the software is again asked to take into account only the variables with factor loadings above a certain threshold.


  1. A brief technical explanation of how the exploratory factor analysis was developed in 2017 and 2018 is provided in Annex A.