| age_label | gender_label | n |
|---|---|---|
| 25-34 | Female | 34 |
| 35-44 | Male | 28 |
| 35-44 | Female | 25 |
| 45-54 | Male | 23 |
| 45-54 | Female | 21 |
| 25-34 | Male | 18 |
| 55-64 | Male | 16 |
| 55-64 | Female | 15 |
| 18-24 | Male | 6 |
| 18-24 | Female | 5 |
| 65+ | Female | 4 |
| 65+ | Male | 3 |
| 25-34 | Prefer not to say | 1 |
| 35-44 | Prefer not to say | 1 |
Human vs Retora Message Ratings
1 Overview
This report compares human survey ratings with Retora model outputs for the same set of messages. We summarize agreement at the message level, examine patterns by demographic group, and provide dimension-level comparisons.
2 Data Sources
- Qualtrics export: retora_February 2, 2026_15.54.csv
- Retora runs: custom_feedback_runs_by_group.csv
- Message ID map: messages_surveyjs_id_map.csv
3 Prepare Human Ratings
4 Prepare Retora Runs
5 Combine Sources
| source | n_ratings |
|---|---|
| human | 1998 |
| retora | 12100 |
6 Message Summary Table (Rank Comparison)
| Message ID | Message Text | Human Rank | Retora Rank |
|---|---|---|---|
| msg_01 | To achieve carbon neutrality by 2050, the EU should strengthen climate policy by expanding renewables and also increasing fossil fuel production to stabilize emissions. A just transition is needed, but jobs should not change. | 9 | 10 |
| msg_02 | The EU must strengthen climate policy to reach carbon neutrality by 2050, but we should avoid major changes. Invest massively in renewables while keeping fossil jobs exactly as they are. | 5 | 8 |
| msg_03 | The EU must reach carbon neutrality by 2050 with stronger climate policy, but it should cost nothing and require no lifestyle changes. Invest massively in renewables while guaranteeing no higher bills and no disruption. | 10 | 9 |
| msg_04 | To reach carbon neutrality by 2050, the EU should strengthen climate policy and invest in renewables. This can be done without meaningful public spending if markets work properly and workers adapt. | 1 | 7 |
| msg_05 | The EU must impose tougher climate rules to force rapid decarbonization by 2050. Fossil industries have delayed progress too long, so they should accept the costs while renewables expand quickly. | 2 | 6 |
| msg_06 | Meeting the 2050 neutrality goal needs stronger EU climate policy: faster renewable deployment, grid upgrades, and stable investment rules. A just transition should fund retraining and targeted regional support for fossil-dependent areas. | 7 | 5 |
| msg_07 | Carbon neutrality by 2050 requires stronger EU policies focused on delivery: accelerate permits for renewables, invest in grids, and provide predictable signals for investors. To sustain support, fund retraining and transition protections. | 4 | 2 |
| msg_08 | To credibly reach carbon neutrality by 2050, the EU should strengthen climate policy through faster permitting, grid investment, and clear financing pathways for renewables. Pair this with retraining, wage support, and regional development. | 8 | 3 |
| msg_09 | The EU must strengthen climate policies to reach carbon neutrality by 2050. This requires major investment in renewables and a just transition that supports workers and regions affected by the decline of fossil fuels. | 3 | 4 |
| msg_10 | If the EU is serious about carbon neutrality by 2050, it must strengthen climate policy with execution: speed renewable permitting, modernize grids, and mobilize public and private capital. Guarantee a just transition for workers and regions. | 6 | 1 |
7 Message-Level Comparison (Overall)
| Message Text | Human Mean | Retora Mean | Human N | Retora N |
|---|---|---|---|---|
| To reach carbon neutrality by 2050, the EU should strengthen climate policy and invest in renewables. This can be done without meaningful public spending if markets work properly and workers adapt. | 7.518 | 6.467 | 220 | 1210 |
| The EU must impose tougher climate rules to force rapid decarbonization by 2050. Fossil industries have delayed progress too long, so they should accept the costs while renewables expand quickly. | 7.130 | 6.991 | 169 | 1210 |
| The EU must strengthen climate policies to reach carbon neutrality by 2050. This requires major investment in renewables and a just transition that supports workers and regions affected by the decline of fossil fuels. | 6.839 | 7.279 | 180 | 1210 |
| Carbon neutrality by 2050 requires stronger EU policies focused on delivery: accelerate permits for renewables, invest in grids, and provide predictable signals for investors. To sustain support, fund retraining and transition protections. | 6.815 | 7.305 | 200 | 1210 |
| The EU must strengthen climate policy to reach carbon neutrality by 2050, but we should avoid major changes. Invest massively in renewables while keeping fossil jobs exactly as they are. | 6.783 | 5.833 | 230 | 1210 |
| If the EU is serious about carbon neutrality by 2050, it must strengthen climate policy with execution: speed renewable permitting, modernize grids, and mobilize public and private capital. Guarantee a just transition for workers and regions. | 6.591 | 7.383 | 230 | 1210 |
| Meeting the 2050 neutrality goal needs stronger EU climate policy: faster renewable deployment, grid upgrades, and stable investment rules. A just transition should fund retraining and targeted regional support for fossil-dependent areas. | 6.541 | 7.225 | 170 | 1210 |
| To credibly reach carbon neutrality by 2050, the EU should strengthen climate policy through faster permitting, grid investment, and clear financing pathways for renewables. Pair this with retraining, wage support, and regional development. | 6.519 | 7.296 | 189 | 1210 |
| To achieve carbon neutrality by 2050, the EU should strengthen climate policy by expanding renewables and also increasing fossil fuel production to stabilize emissions. A just transition is needed, but jobs should not change. | 6.040 | 4.299 | 200 | 1210 |
| The EU must reach carbon neutrality by 2050 with stronger climate policy, but it should cost nothing and require no lifestyle changes. Invest massively in renewables while guaranteeing no higher bills and no disruption. | 5.824 | 5.244 | 210 | 1210 |
8 Message-Level Comparison by Dimension
9 Message-Level Comparison by Age
10 Message-Level Comparison by Gender (Male/Female)
11 Agreement Metrics (Message Level)
| spearman | pearson | r2 |
|---|---|---|
| 0.261 | 0.261 | 0.068 |
How to read these numbers (plain language)
- Spearman ρ: whether the two rankings move in the same order. Closer to 1 means the ordering is very similar.
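- Pearson r: how closely the relationship between the two sets of ranks follows a straight line. Closer to 1 means a tighter line. Because this section correlates ranks with no ties, Pearson r equals Spearman ρ here.
- R²: the square of Pearson r, that is, the share of variation in Retora’s ranking that a straight-line fit on the human ranking accounts for. Closer to 1 means stronger agreement.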
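As a cross-check, the three metrics can be recomputed from the ranks in the Message Summary Table. The sketch below is not the report's own pipeline. It uses plain Python with hand-written helpers (`pearson`, `spearman_no_ties`), which are illustrative names and assume ranks without ties.

```python
from math import sqrt

# Ranks copied from the Message Summary Table, in order msg_01 .. msg_10.
human_rank = [9, 5, 10, 1, 2, 7, 4, 8, 3, 6]
retora_rank = [10, 8, 9, 7, 6, 5, 2, 3, 4, 1]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_no_ties(rx, ry):
    """Spearman rho from two tie-free rank lists (sum of squared rank differences)."""
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

rho = spearman_no_ties(human_rank, retora_rank)
r = pearson(human_rank, retora_rank)  # computed on ranks, so it matches rho
r2 = r ** 2

print(round(rho, 3), round(r, 3), round(r2, 3))  # 0.261 0.261 0.068
```

The squared rank differences sum to 122, so ρ = 1 − 6·122 / (10·99) ≈ 0.261, matching the table.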
12 Agreement Metrics (Mean Scores)
| spearman | pearson | r2 |
|---|---|---|
| 0.261 | 0.54 | 0.292 |
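Here Spearman is unchanged from the rank-based section while Pearson is higher. Spearman depends only on the ordering of the means, whereas Pearson uses the mean values themselves. The sketch below reproduces this from the means in the overall message-level table. The helpers `pearson` and `ranks_desc` are illustrative and not part of the report's code, and R² from the rounded means comes out within rounding of the tabulated value.

```python
from math import sqrt

# Mean scores copied from the overall message-level table, in table order.
human_mean = [7.518, 7.130, 6.839, 6.815, 6.783, 6.591, 6.541, 6.519, 6.040, 5.824]
retora_mean = [6.467, 6.991, 7.279, 7.305, 5.833, 7.383, 7.225, 7.296, 4.299, 5.244]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def ranks_desc(values):
    """Rank 1 = highest value. Assumes no ties, which holds for the means above."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = position
    return ranks

# Pearson on the raw means uses the size of the gaps between messages.
r_means = pearson(human_mean, retora_mean)
# Spearman is Pearson applied to the ranks, so only the ordering matters.
rho_means = pearson(ranks_desc(human_mean), ranks_desc(retora_mean))

print(round(rho_means, 3), round(r_means, 2), round(r_means ** 2, 2))
```

Much of the higher Pearson value comes from the two messages both sources score lowest, which sit far from the rest on the Retora scale.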
13 Retora Rank Stability (Across Runs and Demographics)
This section looks only at Retora outputs. We first ask: if we repeat the same run multiple times, do we get the same ranking of messages?
Then we check: do different demographic groups produce similar rankings?
13.1 Stability Across Runs (within each age x gender group)
| Gender | Age | Spearman Mean | Spearman SD | Pearson Mean | Pearson SD | R2 Mean | R2 SD |
|---|---|---|---|---|---|---|---|
| Female | 18-24 | 0.931 | 0.047 | 0.933 | 0.044 | 0.872 | 0.080 |
| Male | 18-24 | 0.919 | 0.063 | 0.921 | 0.057 | 0.850 | 0.099 |
| Male | 25-34 | 0.899 | 0.092 | 0.902 | 0.083 | 0.820 | 0.143 |
| Male | 45-54 | 0.899 | 0.043 | 0.899 | 0.036 | 0.809 | 0.065 |
| Female | 35-44 | 0.891 | 0.072 | 0.900 | 0.053 | 0.812 | 0.092 |
| Female | 25-34 | 0.883 | 0.095 | 0.900 | 0.066 | 0.814 | 0.114 |
| Male | 55-64 | 0.867 | 0.088 | 0.866 | 0.084 | 0.756 | 0.145 |
| Female | 45-54 | 0.861 | 0.081 | 0.866 | 0.072 | 0.755 | 0.122 |
| Male | 35-44 | 0.857 | 0.106 | 0.859 | 0.098 | 0.747 | 0.163 |
| Female | 55-64 | 0.816 | 0.150 | 0.843 | 0.125 | 0.725 | 0.207 |
| Male | 65-74 | 0.810 | 0.109 | 0.805 | 0.107 | 0.658 | 0.171 |
| Male | 75-85 | 0.800 | 0.112 | 0.792 | 0.122 | 0.640 | 0.179 |
Why stability matters
Stability is basically the “would we get the same answer twice?” test. If the same Retora setup is run again and the message order changes a lot, it is hard to trust the ranking for decisions like which message to show, which copy to keep, or which segment to target. Stable rankings suggest the system is picking up a consistent signal rather than noise from sampling, randomness, or small quirks in who happened to be included. Stability alone does not show that the ordering matches human preferences; that is what the agreement metrics above measure. It also makes the results easier to explain to others, because you can say “this ordering is robust,” not “it depends on the run.”
How stable these rankings are
Overall, these rankings look quite stable. In most age and gender groups, rerunning the same setup gives you almost the same ordering of messages, with only small reshuffling. The average similarity across runs is high (Spearman 0.87 and Pearson 0.87), which in plain terms means “usually the same story.”
Where it gets a bit less steady is in the groups with the lowest average Spearman. In this run, those are: Male 75-85, Male 65-74, Female 55-64. There, the ordering still tends to be broadly similar, but you see more wobble from run to run, meaning a few messages trade places depending on the particular run. In the plot, occasional dips look like “one run had an odd shuffle,” rather than a general collapse of consistency.
In short: Retora gives a fairly reliable message ordering across repeated runs (average Spearman about 0.87), and the ordering is moderately consistent across demographic groups (Spearman 0.697 in Section 13.2). The main caution is that a few segments show more run-to-run variability, so you should treat fine-grained differences in those segments (like rank 4 vs rank 6) as less certain than the big picture (top messages vs bottom messages).
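One way a per-group stability figure like those in the table could be obtained is to correlate every pair of repeated runs within a group and then summarize the pairwise values. The sketch below illustrates that idea only. The report does not state its exact procedure, the run data are hypothetical, and the choice of sample standard deviation is an assumption.

```python
from itertools import combinations
from statistics import mean, stdev

def spearman_no_ties(rx, ry):
    """Spearman rho from two tie-free rank lists."""
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# HYPOTHETICAL ranks of the 10 messages from three repeated runs of one
# age x gender group. These are illustrative values, not data from the report.
runs = [
    [10, 8, 9, 7, 6, 5, 2, 3, 4, 1],
    [10, 8, 9, 6, 7, 5, 2, 3, 4, 1],
    [9, 8, 10, 7, 6, 4, 2, 3, 5, 1],
]

def run_stability(rank_lists):
    """Mean and sample SD of Spearman rho over all pairs of runs (needs >= 3 runs)."""
    rhos = [spearman_no_ties(a, b) for a, b in combinations(rank_lists, 2)]
    return mean(rhos), stdev(rhos)

mean_rho, sd_rho = run_stability(runs)
print(round(mean_rho, 3), round(sd_rho, 3))
```

A mean near 1 with a small SD corresponds to the "same story on every run" case; a lower mean or larger SD corresponds to the wobble described for the least stable groups.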
13.2 Stability Across Demographic Groups
| Spearman rho | Pearson r | R2 |
|---|---|---|
| 0.697 | 0.697 | 0.486 |
Plain-language takeaways
- Across runs: high Spearman/Pearson means repeating the same Retora setup gives very similar message ordering.
- Across groups: high correlations mean different age/gender segments still rank messages in a similar order.
- Lower values suggest that rankings shift depending on which group you look at or which run you sample.