Cover page

Value Boxes

Indicators: 55

Components: 12

Dimensions: 3

Regions: 240

Composite indicators

Pros and cons of composite indicators

| Pros | Cons |
|---|---|
| Summarise complex or multi-dimensional issues in support of decision-makers. | May send misleading policy messages if they are poorly constructed or misinterpreted. |
| Are easier to interpret than trends spread across many separate indicators. | May invite simplistic policy conclusions if not used in combination with the underlying indicators. |
| Facilitate the ranking of countries on complex issues in a benchmarking exercise. | May lend themselves to instrumental use if the construction stages are not transparent and based on sound statistical or conceptual principles. |
| Assess the progress of countries over time on complex issues. | The selection of indicators and weights may become the target of political challenge. |
| Reduce the size of a set of indicators, or include more information within the existing size limit. | May disguise serious failings in some dimensions of the phenomenon, making it harder to identify the proper remedial action. |
| Place issues of country performance and progress at the centre of the policy arena. | May lead to wrong policies if dimensions of performance that are difficult to measure are ignored. |
| Facilitate communication with ordinary citizens and promote accountability. | |

Stages in the construction of composite indicators

| Stage | Description |
|---|---|
| 1. Theoretical framework | Provides the basis for selecting and combining variables into a meaningful composite indicator under a fitness-for-purpose principle. |
| 2. Data selection | Should be based on the analytical soundness, measurability, and country coverage of the indicators, their relevance to the phenomenon being measured, and their relationship to each other. The use of proxy variables should be considered when data are scarce. |
| 3. Imputation of missing data and treatment of outliers | Is needed to provide a complete and clean dataset (e.g., by means of single or multiple imputation). |
| 4. Multivariate analysis | Should be used to study the overall structure of the dataset, assess its suitability, and guide subsequent methodological choices (e.g., weighting, aggregation). |
| 5. Normalisation | Should be carried out to render the variables comparable. |
| 6. Weighting and aggregation | Should be done along the lines of the underlying theoretical framework. |
| 7. Uncertainty and sensitivity analysis | Should be undertaken to assess the robustness of the composite indicator with respect to, e.g., the mechanism for including or excluding an indicator, the normalisation scheme, the imputation of missing data, the choice of weights, and the aggregation method. |
| 8. Back to the data | Is needed to reveal the main drivers of an overall good or bad performance. Transparency is essential to good analysis and policymaking. |
| 9. Links to other indicators | Should be made to correlate the composite indicator (or its dimensions) with existing (simple or composite) indicators and to identify linkages through regressions. |
| 10. Visualisation of the results | Should receive proper attention, given that the visualisation can influence (or help to enhance) interpretability. |
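Stage 7 can be illustrated with a small Monte Carlo experiment on the choice of weights. The sketch below is written in Python purely for illustration and is not the COINr workflow used in this dashboard; the region names, scores, noise level, and helper functions (`weighted_mean`, `rank_regions`, `simulate_ranks`) are all hypothetical. It perturbs equal weights at random, recomputes a weighted arithmetic mean, and records how each region's rank varies across runs.

```python
import random


def weighted_mean(scores, weights):
    """Weighted arithmetic mean of one region's indicator scores."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def rank_regions(composite):
    """Return {region: rank}, where rank 1 is the highest composite score."""
    ordered = sorted(composite, key=composite.get, reverse=True)
    return {region: i + 1 for i, region in enumerate(ordered)}


def simulate_ranks(data, n_runs=1000, noise=0.25, seed=42):
    """Perturb equal weights by up to +/- noise and collect each region's ranks."""
    rng = random.Random(seed)  # fixed seed so the experiment is reproducible
    n_indicators = len(next(iter(data.values())))
    ranks = {region: [] for region in data}
    for _ in range(n_runs):
        # Assumption: each weight is drawn uniformly around the equal weight 1.0
        weights = [1.0 + rng.uniform(-noise, noise) for _ in range(n_indicators)]
        composite = {r: weighted_mean(s, weights) for r, s in data.items()}
        for region, rank in rank_regions(composite).items():
            ranks[region].append(rank)
    return ranks


# Hypothetical normalised scores for four regions and three indicators
data = {
    "Region A": [80.0, 75.0, 90.0],
    "Region B": [70.0, 85.0, 60.0],
    "Region C": [72.0, 70.0, 74.0],
    "Region D": [30.0, 40.0, 35.0],
}

ranks = simulate_ranks(data)
for region, r in ranks.items():
    print(region, "best rank:", min(r), "worst rank:", max(r))
```

Regions whose best and worst ranks coincide are robust to the weighting assumption; a wide rank interval signals that a region's position depends on the weights chosen. The same loop could also vary the normalisation or aggregation method, which is closer to what a full uncertainty analysis would do.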

Basic indicators

Basic Human Needs: Nutrition and basic care; Water and sanitation; Shelter; Personal security.

Foundations: Access to basic knowledge; Access to ICT; Health and wellness; Environmental quality.

Opportunity: Personal rights; Personal freedom and choice; Tolerance and inclusion; Access to advanced education.

Methodology

Steps and \(\textsf{R}\) packages:

  1. Exploratory Data Analysis (EDA): skimr.

  2. Treatment of outliers and NAs: dlookr.

  3. Assessment of internal consistency using PCA: FactoMineR.

  4. Normalization of raw indicators: min-max and \(z\)-scores.

  5. Weighting of indicators: equal weights and PCA: FactoMineR.

  6. Aggregation of indicators: generalized means (\(\beta\) = 0, 0.25, 0.50, 0.75, 1.0), PCA, and PCA-equal weightings.

  7. Sensitivity and uncertainty analysis using Monte Carlo simulations: COINr.

  8. Clustering of NUTS 2 regions using Kohonen self-organizing maps (SOM): kohonen.

  9. Comparison with other indicators (GDP): eurostat, mgcv.

  10. Visualization and communication: flexdashboard, DT, flextable, giscoR, highcharter, reactable, tmap.
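Steps 4 and 6 can be made concrete with a short numerical sketch. It is written in Python for illustration only and does not reproduce the R code behind this dashboard. It assumes the usual definition of the weighted generalized mean, \(M_\beta(x) = (\sum_i w_i x_i^\beta)^{1/\beta}\) with weights summing to 1, where \(\beta = 1\) gives the arithmetic mean and \(\beta = 0\) is taken as the geometric-mean limit. The function names and example scores are hypothetical.

```python
import math


def min_max(values):
    """Rescale values linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization is undefined for a constant indicator")
    return [(v - lo) / (hi - lo) for v in values]


def z_scores(values):
    """Centre on the mean and scale by the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def generalized_mean(scores, weights, beta):
    """Weighted generalized mean; beta = 0 is handled as the geometric mean."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalize weights to sum to 1
    if beta == 0:
        # Geometric-mean limit; requires strictly positive scores
        return math.exp(sum(wi * math.log(s) for wi, s in zip(w, scores)))
    return sum(wi * s ** beta for wi, s in zip(w, scores)) ** (1 / beta)


# Hypothetical component scores for one region, on a 0-100 scale
scores = [40.0, 60.0, 90.0]
equal_weights = [1, 1, 1]

for beta in (0, 0.25, 0.50, 0.75, 1.0):
    print(beta, round(generalized_mean(scores, equal_weights, beta), 2))
```

Lower values of \(\beta\) penalize unbalanced profiles more strongly, so a region cannot fully compensate a weak component with a strong one. Because the geometric mean needs positive inputs, \(z\)-scores would have to be shifted or rescaled before being aggregated with \(\beta = 0\).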

EDA

Categorical variables: Summary statistics

Characteristics of the categorical variables in the EU-SPI dataset

| Variable | NAs | Complete rate | # Levels |
|---|---|---|---|
| Country | 0 | 100% | 26 |
| NUTS2 region | 0 | 100% | 238 |
| Region name | 0 | 100% | 236 |
| Dimension | 0 | 100% | 3 |
| Component | 0 | 100% | 12 |

NAs: missing values.
Two region names each occur twice, Limburg (BE and NL) and Luxembourg (BE and LU), which is why the 238 NUTS2 regions have only 236 distinct names.
Croatia was removed because its NUTS2 definition changed in 2021 and the new regions do not match the EU-SPI 2020 dataset.

Numeric variables: Summary statistics

By region

Maps

Capital regions

By region and dimension

Maps

Capital regions

By region and component

Cleveland dot plot

Boxplots by component

Robustness

EU-SPI distributions

Rank uncertainty

Monte Carlo simulations

Clustering

EU-SPI vs. GDP

Eligibility GDP

EU-SPI vs. GDP

Conclusions

1. Composite indicators synthesize many primary variables of diverse nature (environmental, social, economic, technical).

2. This work focuses on the EU-SPI indicator, which measures social progress in EU regions.

3. Methods tested following OECD and EU recommendations in the stages of:

4. Robustness of the EU-SPI indicator verified through:

5. Future research lines:
