1 A Trick of the Tail

Analyzing levels of extremism and their trends requires identifying the more extremist part of the population i.e. the tail of the distribution. The selected tail should have enough samples for statistical analysis and enough resolution (number of unique values) differentiating the samples levels of extremism. In this analysis we compare three methods of calculating a threshold that will identify the tail: Quantile, Median + K x MAD, and Median + K x Qn. The population with extremism level above the threshold will be considered as the more extremist part of the population.

1.1 The Main Challenge

Calculating the extremist threshold create a challenge due to two main reasons: + The size of the tail is too small (we want ~ 100 samples) + The size of the tail is too high (non-extremist population is included in it) + The tails does not have enough granularity between samples (low number of unique values)

1.2 Threshold Calculation Methods

Quantile-based (Top N%) approach simply defines the tail as the top N% of the data. For example, the top 15% might be considered the tail. It’s straightforward to understand and implement but doesn’t account for the shape of the distribution.

Pros:

Simple to understand and communicate
Directly controls the proportion of data in the tail
Consistent across different datasets

Cons:

Doesn’t account for the shape of the distribution

Median + K x MAD approach uses the median and the Median Absolute Deviation (MAD) to establish the extremism threshold. It’s robust against outliers and works well for non-normal distributions. The threshold is typically set at Median + k * MAD, where k is set to allow big enough size of the extremism tail.

Pros:

Robust to outliers, as it uses the median instead of the mean
Accounts for the spread of the data
Works well for non-normal distributions

Cons:

May not be as intuitive to interpret as percentile-based methods
The choice K * MAD can be somewhat arbitrary

Median + K x Qn approach uses the median and the Rousseeuw-Croux Qn estimator to establish the extremism threshold. The Qn estimator is more efficient than MAD and better handles skewed distributions common in extremism research. The threshold is set at Median + k * Qn.

Pros:

More efficient than MAD for non-normal distributions
Robust to outliers and asymmetric distributions
Better suited for extremism data which is often skewed

Cons:

Requires the robustbase package
Less commonly known than MAD
The choice of K factor can be somewhat arbitrary

2 Select most suitable K

# Define the three methods to compare
methods <- c("Quantile", "MAD", "Tau")
q_pct <- 0.85
k_factor <- 1.5
k_range <- seq(1, 2.5, 0.1)
min_tail_size <- 100
min_unique_values <- 5
moderate_threshold <- NULL
note <- paste0("Minimal tail size = ", min_tail_size, " | Minimal unique values = ", min_unique_values)
threshold_subtitle <- paste0("k factor = ", k_factor, " | Quantile percentage = ", q_pct*100, "%")
external_legend <- list(c("red", "dashed"), c("blue", "dashed"), c("darkgreen", "dashed"))
legend_labels <- c("Quantile", "Median+MAD", "Median+TAU")


# Set method parameters
method_params <- list(
  q_pct = q_pct,          # percentile for Quantile method
  k_factor = k_factor     # Factor for MAD, Sn, Qn, TAu methods
)

group_selections <- list(
  Wave = c("First", "Second", "Third", "Fourth", "Fifth", "Sixth")
)

2.1 Ideology

results <- af_test_robust_k_suitability(df, "pe_ideology", wave_var = "Wave", method = "MAD", 
                                        k_factor = k_factor, k_range = k_range,
                                        min_tail_size = min_tail_size, min_unique_values = min_unique_values, 
                                        moderate_threshold = moderate_threshold, suggest_optimal = TRUE)

# Comprehensive multi-metric table
comprehensive_table <- af_create_comprehensive_k_table(results, title = "MAD K Factor Analysis Summary", note = note)
comprehensive_table

Metric	First	Second	Third	Fourth	Fifth	Sixth
MAD K Factor Analysis Summary
K Factor: 1.5 \| Waves: 6
Sample Size	1608	1607	886	697	1524	1114
Tail Size	341	288	183	162	255	247
Tail %	21.2	17.9	20.7	23.2	16.7	22.2
Tail Mean	6.852	6.345	6.199	6.247	6.599	6.523
Unique Values	2	8	13	16	10	11
Suitable	✗ Not Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable
Suggested K	1	1.7	1.8	2	1.8	1.8
Minimal tail size = 100 \| Minimal unique values = 5


results <- af_test_robust_k_suitability(df, "pe_ideology", wave_var = "Wave", method = "Tau", 
                                        k_factor = k_factor, k_range = k_range,
                                        min_tail_size = min_tail_size, min_unique_values = min_unique_values, 
                                        moderate_threshold = moderate_threshold, suggest_optimal = TRUE)

# Comprehensive multi-metric table
comprehensive_table <- af_create_comprehensive_k_table(results, title = "Tau K Factor Analysis Summary", note = note)
comprehensive_table

Metric	First	Second	Third	Fourth	Fifth	Sixth
Tau K Factor Analysis Summary
K Factor: 1.5 \| Waves: 6
Sample Size	1608	1607	886	697	1524	1114
Tail Size	282	187	130	106	172	180
Tail %	17.5	11.6	14.7	15.2	11.3	16.2
Tail Mean	7	6.567	6.425	6.574	6.842	6.754
Unique Values	1	2	4	5	4	4
Suitable	✗ Not Suitable	✗ Not Suitable	✗ Not Suitable	✓ Suitable	✗ Not Suitable	✗ Not Suitable
Suggested K	1	1.4	1.4	1.5	1.4	1.4
Minimal tail size = 100 \| Minimal unique values = 5

result <- af_compare_tail_methods(df, 
                                  group_selections = group_selections, 
                                  measure_var = "pe_ideology",
                                  methods = methods,
                                  method_params = method_params)
  
# gt::gt(result$results)
# af_combine_plots(result$plots)
af_combine_plots(result$plots, 
                 main_title = "Cognitive (Ideology) Dimension - Tail Threshold",
                 subtitle = threshold_subtitle, note = "", common_legend = FALSE,
                 external_legend = external_legend, legend_labels = legend_labels)

2.2 Violence

results <- af_test_robust_k_suitability(df, "pe_violence", wave_var = "Wave", method = "MAD", 
                                        k_factor = k_factor, k_range = k_range,
                                        min_tail_size = min_tail_size, min_unique_values = min_unique_values, 
                                        moderate_threshold = moderate_threshold, suggest_optimal = TRUE)

# Comprehensive multi-metric table
comprehensive_table <- af_create_comprehensive_k_table(results, title = "MAD K Factor Analysis Summary", note = note)
comprehensive_table

Metric	First	Second	Third	Fourth	Fifth	Sixth
MAD K Factor Analysis Summary
K Factor: 1.5 \| Waves: 6
Sample Size	1608	1607	886	697	1524	1114
Tail Size	578	505	282	201	467	360
Tail %	35.9	31.4	31.8	28.8	30.6	32.3
Tail Mean	2.383	2.718	2.608	2.616	2.678	2.677
Unique Values	438	469	272	195	449	345
Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable
Suggested K	2.5	2.5	2.5	2.5	2.5	2.5
Minimal tail size = 100 \| Minimal unique values = 5


results <- af_test_robust_k_suitability(df, "pe_violence", wave_var = "Wave",  method = "Tau", 
                                        k_factor = k_factor, k_range = k_range,
                                        min_tail_size = min_tail_size, min_unique_values = min_unique_values, 
                                        moderate_threshold = moderate_threshold, suggest_optimal = TRUE)

# Comprehensive multi-metric table
comprehensive_table <- af_create_comprehensive_k_table(results, title = "Tau K Factor Analysis Summary", note = note)
comprehensive_table

Metric	First	Second	Third	Fourth	Fifth	Sixth
Tau K Factor Analysis Summary
K Factor: 1.5 \| Waves: 6
Sample Size	1608	1607	886	697	1524	1114
Tail Size	481	427	236	173	398	302
Tail %	29.9	26.6	26.6	24.8	26.1	27.1
Tail Mean	2.594	2.95	2.843	2.815	2.884	2.902
Unique Values	399	407	230	170	387	292
Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable
Suggested K	2.5	2.5	2.5	2.5	2.5	2.5
Minimal tail size = 100 \| Minimal unique values = 5

result <- af_compare_tail_methods(df, 
                                  group_selections = group_selections, 
                                  measure_var = "pe_violence",
                                  methods = methods,
                                  method_params = method_params)
  
# gt::gt(result$results)
af_combine_plots(result$plots, 
                 main_title = "Behavioral (Violence) Dimension - Tail Threshold",
                 subtitle = threshold_subtitle, note = "", common_legend = FALSE,
                 external_legend = external_legend, legend_labels = legend_labels)

2.3 Intolernace

results <- af_test_robust_k_suitability(df, "pe_intolerance", wave_var = "Wave", method = "MAD", 
                                        k_factor = k_factor, k_range = k_range,
                                        min_tail_size = min_tail_size, min_unique_values = min_unique_values, 
                                        moderate_threshold = moderate_threshold, suggest_optimal = TRUE)

# Comprehensive multi-metric table
comprehensive_table <- af_create_comprehensive_k_table(results, title = "MAD K Factor Analysis Summary", note = note)
comprehensive_table

Metric	First	Second	Third	Fourth	Fifth	Sixth
MAD K Factor Analysis Summary
K Factor: 1.5 \| Waves: 6
Sample Size	1608	1607	886	697	1524	1114
Tail Size	260	273	131	111	231	174
Tail %	16.2	17	14.8	15.9	15.2	15.6
Tail Mean	6.285	6.251	6.446	6.301	6.407	6.272
Unique Values	208	199	103	90	186	114
Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable	✓ Suitable
Suggested K	2.2	2.1	1.7	1.6	2.1	1.9
Minimal tail size = 100 \| Minimal unique values = 5


results <- af_test_robust_k_suitability(df, "pe_intolerance", wave_var = "Wave", method = "Tau", 
                                        k_factor = k_factor, k_range = k_range,
                                        min_tail_size = min_tail_size, min_unique_values = min_unique_values, 
                                        moderate_threshold = moderate_threshold, suggest_optimal = TRUE)

# Comprehensive multi-metric table
comprehensive_table <- af_create_comprehensive_k_table(results, title = "Tau K Factor Analysis Summary", note = note)
comprehensive_table

Metric	First	Second	Third	Fourth	Fifth	Sixth
Tau K Factor Analysis Summary
K Factor: 1.5 \| Waves: 6
Sample Size	1608	1607	886	697	1524	1114
Tail Size	154	170	71	61	121	94
Tail %	9.6	10.6	8	8.8	7.9	8.4
Tail Mean	6.583	6.482	6.765	6.592	6.774	6.512
Unique Values	103	96	44	40	76	35
Suitable	✓ Suitable	✓ Suitable	✗ Not Suitable	✗ Not Suitable	✓ Suitable	✗ Not Suitable
Suggested K	1.6	1.6	1.3	1.2	1.6	1.4
Minimal tail size = 100 \| Minimal unique values = 5

result <- af_compare_tail_methods(df, 
                                  group_selections = group_selections, 
                                  measure_var = "pe_intolerance",
                                  methods = methods,
                                  method_params = method_params)
  
# gt::gt(result$results)
af_combine_plots(result$plots, 
                 main_title = "Social (Intolerance) Dimension - Tail Threshold",
                 subtitle = threshold_subtitle, note = "", common_legend = FALSE,
                 external_legend = external_legend, legend_labels = legend_labels)

3 Community Check

3.1 Political Orientation

# Generate the warning table
warning_table <- af_extremism_tail_warnings(
  df = df,  # Your original survey data
  community_var = "pe_left_center_right",
  wave_var = "Wave",
  dimensions = c("pe_ideology", "pe_violence", "pe_intolerance"),
  threshold_type = "MAD",
  k_factor = 1.5,
  tail_size_warning = 10
)

# Display the table
warning_table

Community	First	Second	Third	Fourth	Fifth	Sixth
Extremism Tail Size Analysis by Community and Wave
Method: MAD \| K-factor: 1.5 \| Warning threshold: 10 individuals¹
Cognitive
right	198 (22%)	157 (17.7%)	90 (16.6%)	70 (16.7%)	122 (13.7%)	107 (16.2%)
center	60 (12.9%)	47 (9.9%)	29 (12.2%)	41 (21.7%)	73 (16%)	80 (24.4%)
left	83 (34.2%)	84 (34.3%)	64 (60.4%)	51 (57.3%)	60 (33.7%)	60 (48.4%)
Behavioral
right	357 (39.6%)	342 (38.6%)	194 (35.8%)	116 (27.7%)	268 (30.1%)	184 (27.8%)
center	141 (30.4%)	114 (24%)	69 (29%)	59 (31.2%)	140 (30.6%)	127 (38.7%)
left	80 (32.9%)	49 (20%)	19 (17.9%)	26 (29.2%)	59 (33.1%)	49 (39.5%)
Social
right	204 (22.6%)	233 (26.3%)	110 (20.3%)	94 (22.4%)	188 (21.1%)	140 (21.1%)
center	49 (10.6%)	31 (6.5%)	18 (7.6%)	17 (9%)	38 (8.3%)	28 (8.5%)
left	7 (2.9%)	9 (3.7%)	3 (2.8%)	0 (0%)	5 (2.8%)	6 (4.8%)
¹ Values shown as: count (percentage of community). Bold red indicates tail size below warning threshold.

3.2 Religiosity

# Generate the warning table
warning_table <- af_extremism_tail_warnings(
  df = df,  # Your original survey data
  community_var = "pe_religiosity",
  wave_var = "Wave",
  dimensions = c("pe_ideology", "pe_violence", "pe_intolerance"),
  threshold_type = "MAD",
  k_factor = 1.5,
  tail_size_warning = 10
)

# Display the table
warning_table

Community	First	Second	Third	Fourth	Fifth	Sixth
Extremism Tail Size Analysis by Community and Wave
Method: MAD \| K-factor: 1.5 \| Warning threshold: 10 individuals¹
Cognitive
Secular	146 (21.3%)	151 (20.7%)	101 (27.2%)	90 (30.7%)	128 (21.1%)	148 (33%)
Religious	126 (18.1%)	84 (12.7%)	58 (15.3%)	57 (19.1%)	80 (11.6%)	68 (13.8%)
National Religious	24 (41.4%)	16 (30.8%)	8 (17.8%)	6 (15.8%)	11 (16.7%)	10 (19.2%)
Ultra-Orthodox	45 (26.9%)	37 (23%)	16 (17.6%)	9 (13.2%)	36 (22.2%)	21 (17.6%)
Behavioral
Secular	231 (33.7%)	209 (28.6%)	121 (32.5%)	84 (28.7%)	198 (32.7%)	166 (37%)
Religious	273 (39.2%)	225 (33.9%)	131 (34.7%)	100 (33.6%)	219 (31.7%)	150 (30.4%)
National Religious	18 (31%)	10 (19.2%)	7 (15.6%)	5 (13.2%)	12 (18.2%)	9 (17.3%)
Ultra-Orthodox	56 (33.5%)	61 (37.9%)	23 (25.3%)	12 (17.6%)	38 (23.5%)	35 (29.4%)
Social
Secular	56 (8.2%)	66 (9%)	42 (11.3%)	38 (13%)	34 (5.6%)	40 (8.9%)
Religious	129 (18.5%)	126 (19%)	59 (15.6%)	50 (16.8%)	144 (20.9%)	84 (17%)
National Religious	12 (20.7%)	15 (28.8%)	7 (15.6%)	7 (18.4%)	15 (22.7%)	14 (26.9%)
Ultra-Orthodox	63 (37.7%)	66 (41%)	23 (25.3%)	16 (23.5%)	38 (23.5%)	36 (30.3%)
¹ Values shown as: count (percentage of community). Bold red indicates tail size below warning threshold.

IL Trick of the Tail

IL Political Extremism

Amir Freund

2025-08-07

1 A Trick of the Tail

1.1 The Main Challenge

1.2 Threshold Calculation Methods

2 Select most suitable K

2.1 Ideology

2.2 Violence

2.3 Intolernace

3 Community Check

3.1 Political Orientation

3.2 Religiosity