1 Experiments


combination of experiments used: risk - time - trust

variables under analysis:


Figure 1.1: Distribution of variables under analysis




Figure 1.2: Distribution of variables under analysis by country


1.1 Cluster analysis: choice of algorithm and number of groups



Figure 1.3: Comparison of clustering algorithms by number of groups; analysis on standardised data;
algorithms: kmeans: k-means, pam: partition around medoids, ward.D2: hierarchical clustering by Ward’s method;
indices: CH=Calinski-Harabasz, S=Silhouette, C=Hubert & Levin C index, DB=Davies-Bouldin


The clustering algorithm and the optimal number of groups were chosen by:
1. standardising the variables under analysis;
2. computing the Euclidean distance matrix;
3. comparing the values of the Calinski-Harabasz, Silhouette, Hubert & Levin C and Davies-Bouldin indices obtained for the candidate solutions from 2 to 10 groups, as produced by the k-means, partition around medoids and Ward hierarchical clustering algorithms.
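The first two steps, and one of the indices from the third, can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used for the report: the toy data, the function names and the two candidate partitions are all hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption about how the variables were standardised.

```python
import math
import statistics

def standardise(rows):
    """Z-score each column: mean 0, sample standard deviation 1 (assumed)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean_matrix(rows):
    """Full symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in rows] for a in rows]

def calinski_harabasz(rows, labels):
    """CH = [B / (k - 1)] / [W / (n - k)]; larger values are better."""
    n, p = len(rows), len(rows[0])
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(lab, []).append(row)
    k = len(groups)
    overall = [sum(r[j] for r in rows) / n for j in range(p)]
    between = within = 0.0
    for members in groups.values():
        centre = [sum(r[j] for r in members) / len(members) for j in range(p)]
        # Between-cluster dispersion: size-weighted squared distance to the overall mean.
        between += len(members) * math.dist(centre, overall) ** 2
        # Within-cluster dispersion: squared distances to the cluster centre.
        within += sum(math.dist(r, centre) ** 2 for r in members)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical toy data: rows are respondents, columns are made-up scores.
data = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0],
        [5.0, 30.0], [5.2, 31.0], [4.8, 29.0]]
z = standardise(data)
dist = euclidean_matrix(z)

# Two hypothetical partitions of the same data, to compare by index value.
ch_separated = calinski_harabasz(z, [0, 0, 0, 1, 1, 1])
ch_mixed = calinski_harabasz(z, [0, 1, 0, 1, 0, 1])
```

In the actual comparison the labels would come from each algorithm at each number of groups from 2 to 10, and the same index would be tabulated across all of those solutions.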

Excluding the two-group solution, considered too generic, we observe that the Calinski-Harabasz and Silhouette indices reach their maximum at the four-cluster partition for the k-means algorithm; the Hubert & Levin C index, for which the optimum is the minimum value, tends to decrease up to ten clusters, but an elbow is visible at the four-group partition, after which the decrease slows down.
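To make the "smaller is better" reading of the C index concrete, the sketch below computes it from a distance matrix using its standard definition, C = (S_w - S_min) / (S_max - S_min), where S_w is the sum of the within-cluster pairwise distances and S_min and S_max are the sums of the same number of smallest and largest distances overall. The one-dimensional toy points and both partitions are hypothetical and serve only to illustrate the formula.

```python
from itertools import combinations

def c_index(dist, labels):
    """Hubert & Levin C index from a distance matrix; smaller is better."""
    pairs = list(combinations(range(len(labels)), 2))
    all_d = sorted(dist[i][j] for i, j in pairs)
    within = [dist[i][j] for i, j in pairs if labels[i] == labels[j]]
    n_w = len(within)
    if n_w == 0:
        raise ValueError("no within-cluster pairs: every cluster is a singleton")
    s_w = sum(within)            # observed within-cluster distance sum
    s_min = sum(all_d[:n_w])     # best case: the n_w smallest distances
    s_max = sum(all_d[-n_w:])    # worst case: the n_w largest distances
    if s_max == s_min:
        raise ValueError("index undefined: all candidate sums are equal")
    return (s_w - s_min) / (s_max - s_min)

# Hypothetical one-dimensional points forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]

c_separated = c_index(dist, [0, 0, 1, 1])  # partition matching the groups
c_mixed = c_index(dist, [0, 1, 0, 1])      # partition cutting across them
```

A partition whose within-cluster pairs are exactly the closest pairs scores 0, which is why a curve that keeps falling as groups are added is read through its elbow rather than through its minimum.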

choice: k-means with 4 clusters


1.2 Cluster analysis results





Figure 1.4: Cluster profiling: average value of variables under analysis by cluster



Clusters summary:

  • 1: patient but untrusting (orange)
  • 2: high risk-takers (red)
  • 3: impatient (green)
  • 4: trusting (blue)



Figure 1.5: clusters distribution



Figure 1.6: Clusters by socioeconomic variables






Figure 1.7: Correspondence Analysis - Clusters Experiments



Figure 1.8: Clusters by gender, age, education level and income at country level





2 DQQ



2.1 Cluster analysis: choice of algorithm and number of groups


The algorithm and the optimal number of clusters were chosen using a distance matrix computed with the Gower metric.



Figure 2.1: Comparison of clustering algorithms by number of clusters; analysis on standardised data;
algorithms: pam: partition around medoids, ward.D2: hierarchical clustering by Ward’s method;
indices: S=Silhouette, C=Hubert & Levin C index


In this case, since the data mix quantitative and qualitative variables, only the Partition Around Medoids and hierarchical clustering algorithms could be compared, the latter again obtained with Ward's method;
in particular, the distance matrix was computed using the Gower metric, and the values of the Silhouette and Hubert & Levin C indices were evaluated for all possible partitions from 2 to 10 groups (given the structure of the data, Calinski-Harabasz and Davies-Bouldin cannot be used).
In general the PAM algorithm gives better results on both indices; as for the number of groups, since the Silhouette values show no significant differences across the partitions, a solution was chosen that would not overfit the groups.
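As an illustration of why the Gower metric suits mixed data, the sketch below computes a simplified Gower distance: quantitative variables contribute their absolute difference divided by the variable's range, qualitative variables contribute 0 when the categories match and 1 otherwise, and the contributions are averaged. It assumes equal weights and no missing values, and the toy rows and column meanings are hypothetical; it is not the implementation used for the report.

```python
def gower_matrix(rows, is_numeric):
    """Simplified Gower distances for mixed data (equal weights, no missing values)."""
    p = len(is_numeric)
    # Range of each quantitative column, used to rescale differences to [0, 1].
    ranges = []
    for j in range(p):
        if is_numeric[j]:
            col = [r[j] for r in rows]
            ranges.append(max(col) - min(col))
        else:
            ranges.append(None)

    def distance(a, b):
        total = 0.0
        for j in range(p):
            if is_numeric[j]:
                # A constant column (range 0) contributes nothing.
                total += abs(a[j] - b[j]) / ranges[j] if ranges[j] else 0.0
            else:
                total += 0.0 if a[j] == b[j] else 1.0
        return total / p

    return [[distance(a, b) for b in rows] for a in rows]

# Hypothetical rows: (a made-up quantitative score, a made-up yes/no answer).
rows = [(2.0, "yes"), (4.0, "no"), (6.0, "yes")]
gower = gower_matrix(rows, is_numeric=[True, False])
```

Because the result is only a dissimilarity matrix with no coordinates or centroids, it can be passed to medoid-based or hierarchical methods, while indices that rely on cluster centroids are not available.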

choice: partition around medoids with 4 clusters


2.2 Cluster analysis results





Figure 2.2: Cluster profiling by the variables in analysis




Figure 2.3: Mosaic plot DQQ cluster and qualitative cluster variables


Clusters summary:

  • 1: lack (only 2.4% consume all five recommended food groups, and they have the lowest FGDS value);
  • 2: little and badly (fewer than 20% consume all five recommended food groups, yet more than 80% consume sweetFood, saltyFriedSnack and sweetBeverage, and they are the heaviest consumers of processedMeat);
  • 3: balanced (90% consume the five recommended food groups, with an above-average FGDS);
  • 4: much but also badly (almost all, 95%, consume the five recommended food groups but at the same time have a high NCD-Risk score, with consumption of sweetFood, saltyFriedSnack, sweetBeverage and processedMeat).



Figure 2.4: clusters distribution



Figure 2.5: Clusters DQQ by socioeconomic variables






Figure 2.6: Correspondence Analysis - Clusters DQQ



Figure 2.7: Clusters DQQ by gender, age, education level and income at country level





3 Comparison between Experiments and DQQ clusters




Figure 3.1: clusterDQQ - clusterExp


3.1 DQQ variables by Experiment clusters





Figure 3.2: Average values of quantitative DQQ variables by Experiment clusters






Figure 3.3: qualitative DQQ variables by Experiment clusters


3.2 Experiment variables by DQQ clusters


3.2.1 Risk



Figure 3.4: Risk propensity distribution by DQQ clusters and country




3.2.2 Time



Figure 3.5: Time Preferences distribution by DQQ clusters and country




3.2.3 Trust game



Figure 3.6: player.trust distribution by DQQ clusters and country


Figure 3.7: player.trust_inst distribution by DQQ clusters and country




3.2.4 Public Goods Game



Figure 3.8: PGG.1 distribution by DQQ clusters and country


Figure 3.9: PGG.2 distribution by DQQ clusters and country




4 Innovation by clusters


4.1 DQQ cluster by interest in innovation and in local products


4.1.1 Q12. New Food Product




Figure 4.1: Mosaic plot Choice on a new food product by DQQ cluster


4.1.2 Q13. Interest in a new nutrient dense Food Product





Figure 4.2: multiple pairwise comparisons (with Bonferroni correction)
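For readers unfamiliar with the correction named in the caption, the Bonferroni adjustment multiplies each raw p-value by the number of comparisons and caps the result at 1. The sketch below assumes six pairwise comparisons, as four clusters would give; the underlying test is not specified here, and the raw p-values are invented placeholders, not results from this analysis.

```python
from itertools import combinations

def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Four clusters give six unordered pairs to compare.
pairs = list(combinations([1, 2, 3, 4], 2))

# Invented raw p-values, one per pair, purely for illustration.
raw = [0.001, 0.004, 0.02, 0.03, 0.2, 0.6]
adjusted = dict(zip(pairs, bonferroni(raw)))

# Pairs still below the conventional 0.05 threshold after adjustment.
significant = [pair for pair, p in adjusted.items() if p < 0.05]
```

The adjustment is conservative: with six comparisons a raw p-value has to fall below roughly 0.0083 to remain under 0.05.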


4.1.3 Q6. Interest in a new local food product





Figure 4.3: multiple pairwise comparisons (with Bonferroni correction)

4.2 Experiment cluster by interest in innovation and in local products


4.2.1 Q12. New Food Product




Figure 4.4: Mosaic plot Choice on a new food product by Experiment cluster


4.2.2 Q13. Interest in a new nutrient dense Food Product



4.2.3 Q6. Interest in a new local food product