This document examines the effectiveness of different communication methods at varying distances, illustrating the concept of sum-to-zero contrasts in statistical modeling.
We’ll examine how effective different communication methods are at varying distances:
Effectiveness is measured by the percentage of words correctly understood by the listener.
methods <- rep(c("Whisper", "Shout"), each = 40)
distances <- rep(c("Next to you", "Mile away"), times = 2, each = 20)
effectiveness <- c(
rnorm(20, mean = 95, sd = 3), # Whispering next to you
rnorm(20, mean = 0, sd = 0.5), # Whispering a mile away
rnorm(20, mean = 98, sd = 2), # Shouting next to you
rnorm(20, mean = 60, sd = 10) # Shouting a mile away
)
effectiveness <- pmin(pmax(effectiveness, 0), 100)
# adding them duplicate so one of each factor
# is used in the contrasts
data <- data.frame(Method = factor(methods),
Method_dummy = factor(methods),
Distance = factor(distances),
Distance_dummy = factor(distances),
Effectiveness = effectiveness)
kable(head(data, 10)) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = F, position = "center") %>%
add_header_above(c("Our Variables" = 4, "Our Measure" = 1)) %>%
row_spec(0, bold = T, color = "white", background = "#3498db")| Method | Method_dummy | Distance | Distance_dummy | Effectiveness |
|---|---|---|---|---|
| Whisper | Whisper | Next to you | Next to you | 98.10967 |
| Whisper | Whisper | Next to you | Next to you | 91.70492 |
| Whisper | Whisper | Next to you | Next to you | 99.17936 |
| Whisper | Whisper | Next to you | Next to you | 97.60990 |
| Whisper | Whisper | Next to you | Next to you | 92.03071 |
| Whisper | Whisper | Next to you | Next to you | 99.69145 |
| Whisper | Whisper | Next to you | Next to you | 94.82602 |
| Whisper | Whisper | Next to you | Next to you | 99.02199 |
| Whisper | Whisper | Next to you | Next to you | 99.95801 |
| Whisper | Whisper | Next to you | Next to you | 96.11295 |
p <- ggplot(data, aes(x = Method, y = Effectiveness, fill = Distance)) +
geom_boxplot(width = 0.7, alpha = 0.7, outlier.shape = NA) +
geom_jitter(width = 0.2, alpha = 0.5, aes(color = Distance)) +
theme_minimal() +
labs(title = "Communication Effectiveness by Method and Distance",
y = "Effectiveness (%)",
x = "Communication Method") +
scale_fill_viridis(discrete = TRUE, option = "D", end = 0.8) +
scale_color_viridis(discrete = TRUE, option = "D", end = 0.8) +
theme(
legend.position = "bottom",
plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
axis.title = element_text(size = 12, face = "bold"),
axis.text = element_text(size = 10),
legend.title = element_text(size = 10, face = "bold")
) +
scale_y_continuous(limits = c(0, 100), breaks = seq(0, 100, 20)) +
guides(fill = guide_legend(title = "Distance"),
color = "none") # This removes the duplicate legend
ggplotly(p)model <- lm(Effectiveness ~ Method_dummy * Distance_dummy, data = data)
tidy_model <- tidy(model, conf.int = TRUE) %>%
mutate(across(where(is.numeric), ~round(., 3)))
model_summary <- summary(model)
coefs <-model_summary$coefficients
kable(tidy_model,
col.names = c("Term", "Estimate", "Std. Error", "t value", "p value", "CI Lower", "CI Upper"),
caption = "Linear Model Summary") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE) %>%
row_spec(0, bold = TRUE, color = "white", background = "#3498db")| Term | Estimate | Std. Error | t value | p value | CI Lower | CI Upper |
|---|---|---|---|---|---|---|
| (Intercept) | 61.716 | 1.122 | 55.014 | 0 | 59.482 | 63.951 |
| Method_dummyWhisper | -61.632 | 1.586 | -38.848 | 0 | -64.791 | -58.472 |
| Distance_dummyNext to you | 36.021 | 1.586 | 22.705 | 0 | 32.861 | 39.181 |
| Method_dummyWhisper:Distance_dummyNext to you | 59.516 | 2.244 | 26.526 | 0 | 55.047 | 63.984 |
In this dummy-coded model:
Predicted values:
Now, let’s apply sum-to-zero contrasts:
contrasts(data$Method) <- contr.sum(2)
contrasts(data$Distance) <- contr.sum(2)
model_sum_zero <- lm(Effectiveness ~ Method * Distance, data = data)
model_sums_summary <- summary(model_sum_zero)
coef_sum <- model_sums_summary$coefficients
tidy_model_sum_zero <- tidy(model_sum_zero, conf.int = TRUE) %>%
mutate(across(where(is.numeric), ~round(., 3)))
kable(tidy_model_sum_zero,
col.names = c("Term", "Estimate", "Std. Error", "t value", "p value", "CI Lower", "CI Upper"),
caption = "Sum-to-Zero Contrasts Model Summary") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE) %>%
row_spec(0, bold = TRUE, color = "white", background = "#3498db")| Term | Estimate | Std. Error | t value | p value | CI Lower | CI Upper |
|---|---|---|---|---|---|---|
| (Intercept) | 63.790 | 0.561 | 113.726 | 0 | 62.673 | 64.907 |
| Method1 | 15.937 | 0.561 | 28.413 | 0 | 14.820 | 17.054 |
| Distance1 | -32.889 | 0.561 | -58.636 | 0 | -34.007 | -31.772 |
| Method1:Distance1 | 14.879 | 0.561 | 26.526 | 0 | 13.762 | 15.996 |
Predicted values:
Whispering a Mile Away:
Whispering Next to you:
Shouting a Mile Away:
Shouting Next to you:
This example illustrates how different communication methods interact with distance. The extreme interaction effect shows that the effectiveness of a communication method depends heavily on the distance. The dummy-coded model helps us understand the specific effects relative to a reference group (Whispering Next to you), while the sum-to-zero contrasts provide insights about overall effects and deviations from the grand mean. Both coding schemes reveal the strong interaction: the effect of distance is much more pronounced for Whispering than for Shouting. This real-world example provides an intuitive understanding of interaction effects in statistical models.