Set the default theme & font for plots and tables
font_size <- 12
af_theme <- theme_bw(base_size = font_size) +
  theme(
    plot.title   = element_text(size = font_size),  # Plot title
    axis.title   = element_text(size = font_size),  # Axis titles
    axis.text    = element_text(size = font_size),  # Axis tick labels
    legend.title = element_text(size = font_size),  # Legend title
    legend.text  = element_text(size = font_size),  # Legend text
    strip.text   = element_text(size = font_size)   # Facet variable's name
  )
ggplot2::theme_set(af_theme)
Load survey dataset
df <- as.data.frame(readRDS("Israel Survey/data/il_pe.RDS"))
# Create age group variable (move to measures)
df$age_group <- af_create_groups(df$age, c(18, 30, 45, 60, Inf),
                                 c("18-30", "31-45", "46-60", "60+"))
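`af_create_groups()` is a project helper whose definition is not shown here. As a rough sketch of comparable binning with base R's `cut()`, assuming each break is an inclusive upper bound (so that age 30 falls into "18-30" and age 60 into "46-60"), one could write:

```r
# Hypothetical ages, for illustration only
ages <- c(18, 30, 31, 45, 46, 60, 61, 85)

# Assumption: upper bounds are inclusive, matching the labels "18-30", "31-45", ...
age_group <- cut(
  ages,
  breaks = c(18, 30, 45, 60, Inf),
  labels = c("18-30", "31-45", "46-60", "60+"),
  right = TRUE,           # intervals are (lower, upper]
  include.lowest = TRUE   # so that age 18 itself lands in the first bin
)

table(age_group)
```

Whether the helper handles the boundaries the same way is an assumption; the labels "46-60" and "60+" both mention 60, so the boundary rule is worth checking in the helper itself.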
This analysis calculates similarity scores between different groups (e.g., Traditional vs Secular, Traditional vs Religious) to determine which groups have more similar characteristics across various variables.
Method: Cramér’s V with Similarity Transformation
Create contingency table: Cross-tabulate the comparison groups with the variable categories
Remove empty rows and columns: Filter out rows and columns with zero observations to ensure valid statistical analysis
Calculate Cramér’s V: This measures the strength of association between the two variables, from 0 (no association) to 1 (perfect association)
Convert to similarity: Similarity = 1 - Cramér’s V
Interpretation: Higher similarity scores indicate that the two groups have more similar voting patterns, gender distributions, etc.
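The four steps above can be sketched in base R as follows. This is a minimal illustration, not the code inside `af_compare_religiosity_grouping()`; the function name `cramers_v_similarity` and the toy data are made up for the example.

```r
# Similarity = 1 - Cramér's V for two categorical vectors.
# `cramers_v_similarity` is a hypothetical name, not the project's function.
cramers_v_similarity <- function(group, category) {
  # Step 1: contingency table of comparison group by variable category
  tab <- table(group, category)
  # Step 2: drop rows and columns with no observations
  tab <- tab[rowSums(tab) > 0, colSums(tab) > 0, drop = FALSE]
  # V is undefined unless the table has at least 2 rows and 2 columns
  if (min(dim(tab)) < 2) return(NA_real_)
  # Step 3: Cramér's V from the (uncorrected) chi-squared statistic
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  n <- sum(tab)
  v <- sqrt(as.numeric(chi2) / (n * (min(dim(tab)) - 1)))
  # Step 4: convert association to similarity
  1 - v
}

# Toy data (made up): two groups with identical category shares
group <- rep(c("A", "B"), each = 6)
category <- rep(c("x", "x", "x", "y", "y", "z"), times = 2)
sim_identical <- cramers_v_similarity(group, category)

# Toy data (made up): two groups that never share a category
sim_disjoint <- cramers_v_similarity(c("A", "A", "B", "B"),
                                     c("x", "x", "y", "y"))
```

With identical category shares the similarity is 1, and with completely disjoint categories it is 0, which matches the interpretation given above.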
Method: Kolmogorov-Smirnov Test P-value
Extract data: Get the numerical values for each comparison group
Perform KS test: Compare the distributions of the two groups
Use p-value as similarity: A higher p-value means the test found less evidence that the two distributions differ, so it is treated as higher similarity
Interpretation: Higher p-values indicate that the distributions of age (or another numerical variable) in the two groups are more similar.
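A minimal sketch of these steps with base R's `ks.test()` is shown below. The function name `ks_similarity` and the simulated ages are assumptions for illustration only; the project's own implementation may differ.

```r
# `ks_similarity` is a hypothetical name, not the project's function.
ks_similarity <- function(x, y) {
  # Extract data: keep the non-missing values of each group
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  # Two-sample KS test; ties make the p-value approximate, hence suppressWarnings
  suppressWarnings(ks.test(x, y)$p.value)
}

set.seed(1)  # simulated ages, for illustration only
a <- rnorm(200, mean = 40, sd = 12)
b <- rnorm(200, mean = 40, sd = 12)          # same distribution as `a`
c_shifted <- rnorm(200, mean = 55, sd = 12)  # clearly older group

p_same <- ks_similarity(a, b)
p_shifted <- ks_similarity(a, c_shifted)
```

One caveat when reading the result: a p-value also depends on sample size, so the same difference in distributions yields a smaller p-value in a larger sample. Scores for pairs with very different group sizes are therefore not strictly comparable.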
For each wave and overall analysis:
This method provides a standardized way to compare how similar different groups are across multiple types of variables, helping to make evidence-based decisions about category grouping.
# Run the analysis
result <- af_compare_religiosity_grouping(
  df = df,
  nominal_vars = c("vote", "vote2022", "pe_left_center_right"),
  numerical_vars = c(),
  comparison_var = "religiosity",
  comparison_levels = list(
    c("Traditional", "Secular"),
    c("Traditional", "Religious")
  ),
  wave_var = "Wave"
)
# Display recommendation
cat(result$recommendation)
Based on overall similarity scores, the most similar pair is: Traditional vs Religious (similarity score: 0.704)
# Display similarity scores table
knitr::kable(result$summary_by_wave,
             caption = "Average Similarity Scores by Wave",
             digits = 3)
wave | Traditional vs Secular_similarity | Traditional vs Religious_similarity |
---|---|---|
Overall | 0.631 | 0.704 |
Wave Fifth | 0.499 | 0.759 |
Wave First | 0.628 | 0.776 |
Wave Fourth | 0.700 | 0.665 |
Wave Second | 0.616 | 0.713 |
Wave Sixth | 0.630 | 0.709 |
Wave Third | 0.696 | 0.662 |
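The overall score favours Traditional vs Religious, but the per-wave rows are not unanimous. A small sketch for counting which pair scores higher in each wave, using the values printed in the table (the column names `trad_vs_secular` and `trad_vs_religious` are shortened here for illustration):

```r
# Values copied from the "Average Similarity Scores by Wave" table
scores <- data.frame(
  wave = c("Fifth", "First", "Fourth", "Second", "Sixth", "Third"),
  trad_vs_secular   = c(0.499, 0.628, 0.700, 0.616, 0.630, 0.696),
  trad_vs_religious = c(0.759, 0.776, 0.665, 0.713, 0.709, 0.662)
)

# Which pair scores higher in each wave?
scores$closer_pair <- ifelse(
  scores$trad_vs_religious > scores$trad_vs_secular,
  "Traditional vs Religious",
  "Traditional vs Secular"
)

table(scores$closer_pair)
```

Traditional vs Religious scores higher in four of the six waves; the Third and Fourth waves go the other way.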
# # Display detailed results table
# knitr::kable(result$similarity_scores,
# caption = "Detailed Similarity Scores",
# digits = 3)
# Display plots
# Access plots by variable name
result$plots$overall
# Run the analysis
result <- af_compare_religiosity_grouping(
  df = df,
  nominal_vars = c("vote", "vote2022", "pe_left_center_right"),
  numerical_vars = c(),
  comparison_var = "religiosity",
  comparison_levels = list(
    c("National Ultra-Orthodox", "Orthodox"),
    c("National Ultra-Orthodox", "National Religious")
  ),
  wave_var = "Wave"
)
# Display recommendation
cat(result$recommendation)
Based on overall similarity scores, the most similar pair is: National Ultra-Orthodox vs Orthodox (similarity score: 0.711)
# Display similarity scores table
knitr::kable(result$summary_by_wave,
             caption = "Average Similarity Scores by Wave",
             digits = 3)
wave | National Ultra-Orthodox vs Orthodox_similarity | National Ultra-Orthodox vs National Religious_similarity |
---|---|---|
Overall | 0.711 | 0.590 |
Wave Fifth | 0.868 | 0.686 |
Wave First | 0.779 | 0.575 |
Wave Fourth | 0.470 | 0.549 |
Wave Second | 0.859 | 0.491 |
Wave Sixth | 0.623 | 0.911 |
Wave Third | 0.479 | 0.561 |
# # Display detailed results table
# knitr::kable(result$similarity_scores,
# caption = "Detailed Similarity Scores",
# digits = 3)
# Display plots
# Access plots by variable name
result$plots$overall