This tutorial walks through the analysis of a Randomized Controlled Trial (RCT) testing three behavioural nudge interventions designed to increase physical exercise. Participants were randomly assigned to one of four groups:
Group
Description
Control
No intervention
Social Comparison
Participants received information about how their exercise behaviour compared to peers
Commitment Device
Participants made a commitment (e.g., depositing money) they would forfeit if they did not meet exercise goals
Default Scheduling
Exercise sessions were pre-scheduled in participants’ calendars, requiring them to opt out rather than opt in
The primary outcome is the change in exercise days per week from baseline to 8 weeks (change_days_8wk). A secondary binary outcome indicates whether participants met a success threshold (success_8wk).
We will cover:
Descriptive statistics and balance checks
Outcome distributions
Chi-squared tests for success rates
ANOVA and linear regression for main effects
Moderation (interaction) analysis
Mediation analysis
Logistic regression for predicting success
Setup
Load packages
We use library() to load R packages — collections of functions that extend R’s capabilities. Each package serves a specific purpose:
Package
Purpose
haven
Read data files from other statistical software (Stata, SPSS, SAS)
dplyr
Data manipulation — filtering, grouping, summarising, creating new variables
ggplot2
Creating plots and visualisations using a grammar of graphics
broom
Convert model output into tidy data frames
car
Companion to Applied Regression — additional regression diagnostics
patchwork
Combine multiple ggplot2 plots into a single figure
library(haven)library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(ggplot2)library(broom)library(car)
Loading required package: carData
Attaching package: 'car'
The following object is masked from 'package:dplyr':
recode
library(patchwork)
Load and prepare data
The data are stored as a Stata .dta file. After loading, we convert the numeric group and gender codes into labelled factors so that tables and plots display meaningful names.
Commands used:
read_dta() (from haven): Reads a Stata .dta file into R as a data frame. The result is stored in an object called data using the assignment operator <-.
|> (pipe operator): Takes the result on the left and passes it as input to the function on the right. This lets us chain operations in a readable sequence instead of nesting functions.
mutate() (from dplyr): Creates or modifies columns in a data frame. Here we overwrite two existing columns.
factor(): Converts a numeric variable into a categorical (factor) variable. The levels argument specifies which numeric codes exist; labels assigns meaningful names to each level.
In an RCT, randomisation should make the groups comparable on baseline characteristics. We verify this by computing the mean and standard deviation of key variables within each group. Large differences would suggest a problem with the randomisation.
Commands used:
group_by() (from dplyr): Splits the data frame into groups defined by one or more variables. All subsequent operations (like summarise()) are then performed separately within each group.
summarise() (from dplyr): Collapses each group into a single summary row. Inside it, we call summary functions on each variable.
mean() and sd(): Base R functions that compute the arithmetic mean and standard deviation. The argument na.rm = TRUE tells R to ignore missing values instead of returning NA.
If the means and SDs are roughly similar across the four groups, the randomisation appears to have worked well. Formal balance tests (e.g., ANOVA or chi-squared tests on each baseline variable) can be used but are not strictly necessary — randomisation guarantees balance in expectation.
Outcome distribution
Before running any statistical tests, it is good practice to inspect the distribution of the outcome variable. This helps identify skewness, outliers, or floor/ceiling effects that might affect the choice of statistical model.
Commands used:
summary(): A base R function that produces a quick numeric summary — minimum, 1st quartile, median, mean, 3rd quartile, and maximum. The $ operator extracts a single column from a data frame (e.g., data$change_days_8wk).
summary(data$days_exercise_8wk)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 1.500 2.800 2.995 4.200 7.000
summary(data$change_days_8wk)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-2.1000 0.0000 0.5000 0.5533 1.1000 3.3000
Next we create histograms using ggplot2. The key components:
ggplot(): Initialises a plot. The aes() function inside it maps variables to visual properties (here, x = change_days_8wk maps the outcome to the horizontal axis).
geom_histogram(): Adds a histogram layer. bins = 30 controls how many bars the range is divided into.
facet_wrap(~ group_assignment): Creates a separate panel for each level of group_assignment, so we can compare distributions side by side.
labs(): Sets the plot title and axis labels.
ggplot(data, aes(x = change_days_8wk)) +geom_histogram(bins =30, fill ="skyblue", color ="black") +facet_wrap(~ group_assignment) +labs(title ="Change in exercise days by group",x ="Change in days per week",y ="Count" )
TipWhat to look for
Check whether the distributions are roughly symmetric and whether any group shows a clear shift relative to the control group.
Success rates by group
The binary variable success_8wk indicates whether a participant met a pre-defined exercise threshold. We use a cross-tabulation to inspect success rates and a chi-squared test to assess whether success rates differ significantly across groups.
Commands used:
table(): Creates a cross-tabulation (contingency table) of two categorical variables. The first argument becomes the rows, the second becomes the columns. The cells contain counts.
prop.table(): Converts a table of counts into proportions. The argument margin = 1 means “compute proportions within each row” (i.e., each row sums to 1). Using margin = 2 would compute column proportions instead.
chisq.test(): Performs Pearson’s chi-squared test of independence. It tests the null hypothesis that the row variable (group assignment) and the column variable (success) are independent — i.e., that success rates are the same across all groups.
A statistically significant chi-squared test (p < 0.05) indicates that the probability of success differs across at least some of the groups. It does not tell us which groups differ — for that, we would need pairwise comparisons.
Main effects analysis
One-way ANOVA
A one-way ANOVA tests whether the mean change in exercise days differs across the four groups. It is the multi-group generalisation of a two-sample t-test.
Commands used:
oneway.test(): Performs a one-way analysis of variance. The formula change_days_8wk ~ group_assignment reads as “change in exercise days as a function of group assignment.” The ~ (tilde) is R’s formula notation: the outcome goes on the left, the predictor(s) on the right. By default, this version does not assume equal variances across groups (Welch’s ANOVA).
oneway.test(change_days_8wk ~ group_assignment, data = data)
One-way analysis of means (not assuming equal variances)
data: change_days_8wk and group_assignment
F = 392.93, num df = 3, denom df = 1660, p-value < 2.2e-16
Linear regression (basic)
An equivalent approach is to fit a linear regression with group_assignment as the predictor. The control group serves as the reference category, so each coefficient represents the difference in mean change between that intervention group and the control.
Commands used:
lm(): Fits a linear regression model (ordinary least squares). The formula change_days_8wk ~ group_assignment specifies the outcome and predictor. When the predictor is a factor, R automatically creates dummy variables — one for each level except the reference level (the first level, here “Control”).
summary(): When applied to a model object (not a data vector), this prints the full regression table — coefficients, standard errors, t-statistics, p-values, R-squared, and the F-test.
basic_reg <-lm(change_days_8wk ~ group_assignment, data = data)summary(basic_reg)
Call:
lm(formula = change_days_8wk ~ group_assignment, data = data)
Residuals:
Min 1Q Median 3Q Max
-2.21760 -0.49467 0.00533 0.47507 2.80533
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.07507 0.02487 -3.018 0.00257 **
group_assignmentSocial Comparison 0.89267 0.03518 25.376 < 2e-16 ***
group_assignmentCommitment Device 1.05093 0.03518 29.875 < 2e-16 ***
group_assignmentDefault Scheduling 0.56973 0.03518 16.196 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.6812 on 2996 degrees of freedom
Multiple R-squared: 0.2586, Adjusted R-squared: 0.2579
F-statistic: 348.4 on 3 and 2996 DF, p-value: < 2.2e-16
TipReading the output
The intercept is the mean change for the control group.
Each group_assignment coefficient is the additional change relative to control.
A significant coefficient (p < 0.05) suggests that group’s nudge had an effect beyond the control condition.
Regression with covariates
Adding baseline covariates (age, gender, baseline exercise) can improve precision by reducing residual variance. In an RCT, covariate adjustment should not change the point estimates much (if randomisation worked), but it can tighten confidence intervals.
Commands used:
lm() again, but now with additional predictors joined by +. The formula change_days_8wk ~ group_assignment + age + gender + baseline_days_exercise includes the treatment variable and three covariates. The + in a formula means “and also control for” — it does not imply an interaction between the variables.
Call:
lm(formula = change_days_8wk ~ group_assignment + age + gender +
baseline_days_exercise, data = data)
Residuals:
Min 1Q Median 3Q Max
-2.00406 -0.45097 -0.03391 0.44709 2.66079
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.100751 0.054593 1.845 0.0651 .
group_assignmentSocial Comparison 0.891009 0.034154 26.088 <2e-16 ***
group_assignmentCommitment Device 1.059475 0.034157 31.018 <2e-16 ***
group_assignmentDefault Scheduling 0.569158 0.034149 16.667 <2e-16 ***
age 0.000457 0.001043 0.438 0.6611
genderFemale 0.029561 0.024157 1.224 0.2212
baseline_days_exercise -0.086766 0.006383 -13.593 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.6612 on 2993 degrees of freedom
Multiple R-squared: 0.3022, Adjusted R-squared: 0.3008
F-statistic: 216.1 on 6 and 2993 DF, p-value: < 2.2e-16
Moderation analysis
Moderation (interaction) analysis asks: does the effect of the intervention depend on a participant characteristic? For example, does the Social Comparison nudge work better for people who are more sensitive to social norms?
We fit separate interaction models for four moderators, each motivated by a specific theoretical link to the interventions.
Social norm sensitivity
We expect that the Social Comparison nudge might be more effective for participants who are more sensitive to social norms.
Commands used (these apply to all four moderation models below):
lm() with the * operator: In a formula, A * B is shorthand for A + B + A:B — it includes the main effects of both variables and their interaction. The interaction term A:B allows the effect of A to vary depending on the value of B (and vice versa). This is the key ingredient of moderation analysis.
expand.grid(): Creates a data frame containing every combination of the supplied values. Here we generate all combinations of the 4 group levels and social norm sensitivity values 1 through 10. This gives us a grid of hypothetical participants to generate predictions for.
predict(): Takes a fitted model and new data, and returns the model’s predicted outcome for each row. The argument newdata = specifies the data frame of values to predict for. This is how we produce the predicted lines in the interaction plots.
geom_line(): A ggplot2 layer that draws lines connecting data points. Combined with color = group_assignment, it draws a separate line for each group.
social_norm_df <-expand.grid(group_assignment =levels(data$group_assignment),social_norm_sensitivity =1:10)social_norm_df$predicted <-predict(social_norm_mod, newdata = social_norm_df)ggplot(social_norm_df, aes(x = social_norm_sensitivity, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Predicted change by social norm sensitivity",x ="Social Norm Sensitivity",y ="Predicted Change in Exercise Days",color ="Group" ) +theme_minimal()
TipReading interaction plots
If the lines are roughly parallel, there is little moderation — the intervention effect is similar regardless of the moderator value. Non-parallel lines (especially crossing lines) suggest moderation. Check the regression output to see whether the interaction terms are statistically significant.
How to interpret this model
The regression output includes several types of coefficients. Here is how to read each one:
Main effects of group (e.g., group_assignmentSocial Comparison): These represent the difference between that intervention group and the control group when social norm sensitivity equals zero. Because the scale does not include zero in practice, these coefficients are not directly meaningful on their own — they are the intercept difference extrapolated to the edge of the scale.
Main effect of social norm sensitivity (social_norm_sensitivity): This is the slope of social norm sensitivity for the control group only (the reference category). It tells you how much the predicted change in exercise days increases per one-unit increase in social norm sensitivity among control participants.
Interaction terms (e.g., group_assignmentSocial Comparison:social_norm_sensitivity): These are the key coefficients for moderation. Each one tells you how much the slope of social norm sensitivity differs in that intervention group compared to the control group. A positive and significant interaction means the intervention effect grows as social norm sensitivity increases. A negative and significant interaction means the intervention effect shrinks (or reverses) at higher levels of the moderator.
In the plot, look specifically at the Social Comparison line relative to Control. If the gap between these two lines widens as social norm sensitivity increases, this confirms a positive interaction — the Social Comparison nudge is more effective for participants who care more about social norms.
Loss aversion
The Commitment Device nudge involves a potential loss (forfeiting a deposit). We test whether participants with higher loss aversion respond more strongly to this nudge.
The commands below follow the same pattern as the social norm sensitivity model: lm() with * for the interaction, expand.grid() + predict() for the prediction grid, and geom_line() for the plot.
loss_aversion_df <-expand.grid(group_assignment =levels(data$group_assignment),loss_aversion =1:10)loss_aversion_df$predicted <-predict(loss_aversion_mod, newdata = loss_aversion_df)ggplot(loss_aversion_df, aes(x = loss_aversion, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Predicted change by loss aversion",x ="Loss Aversion",y ="Predicted Change in Exercise Days",color ="Group" ) +theme_minimal()
How to interpret this model
The structure of this output mirrors the social norm sensitivity model above:
The interaction termgroup_assignmentCommitment Device:loss_aversion is the most theoretically relevant coefficient. A positive and significant value would indicate that the Commitment Device nudge becomes more effective as loss aversion increases — consistent with the idea that people who are more averse to losses respond more strongly to the threat of forfeiting their deposit.
Interaction terms for the other groups (Social Comparison, Default Scheduling) test whether loss aversion moderates those interventions as well, which may or may not be theoretically expected.
In the plot, focus on the Commitment Device line. If it diverges upward from Control as loss aversion increases, this supports the hypothesis. If all four lines remain roughly parallel, loss aversion does not meaningfully moderate any of the intervention effects.
Health literacy
The Default Scheduling nudge may be particularly helpful for participants with lower health literacy, who might otherwise struggle to plan exercise independently.
Same commands as the previous moderation models (lm() with *, expand.grid(), predict(), geom_line()).
health_literacy_df <-expand.grid(group_assignment =levels(data$group_assignment),health_literacy =1:10)health_literacy_df$predicted <-predict(health_literacy_mod, newdata = health_literacy_df)ggplot(health_literacy_df, aes(x = health_literacy, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Predicted change by health literacy",x ="Health Literacy",y ="Predicted Change in Exercise Days",color ="Group" ) +theme_minimal()
How to interpret this model
Here the theoretical prediction runs in the opposite direction from the previous two models. We expect Default Scheduling to be more helpful for people with lower health literacy, because these participants may lack the knowledge to structure their own exercise routines. This means we are looking for a negative interaction term for group_assignmentDefault Scheduling:health_literacy — indicating that the advantage of Default Scheduling over Control decreases as health literacy goes up.
In the plot, look for the Default Scheduling line to be well above Control on the left side (low health literacy) but converging toward Control on the right (high health literacy). This pattern would confirm that the nudge compensates for low health literacy.
Self-efficacy
Self-efficacy (belief in one’s ability to exercise) may moderate the effect of all interventions. Participants with higher self-efficacy might benefit more because they feel capable of acting on the nudge.
Unlike the previous three moderators, self-efficacy is not tied to a single intervention — it could plausibly moderate all of them. A person who believes they can exercise regularly may be better positioned to act on any nudge they receive.
Look at the interaction terms for all three intervention groups. Positive and significant values would suggest that participants with higher self-efficacy benefit more from the nudge.
Alternatively, if the interaction is negative, it could mean that the nudge is most helpful for people with low self-efficacy — the intervention compensates for a lack of internal motivation.
In the plot, check whether the intervention lines fan out (diverge from Control) as self-efficacy increases (positive moderation) or whether they converge (negative moderation). Also note whether the pattern is similar across interventions or specific to one.
NoteA note on multiple interaction models
We have fitted four separate interaction models, each with a different moderator. Each model includes interaction terms for all three intervention groups, meaning we are testing many coefficients across these models. Be cautious about over-interpreting any single marginally significant interaction (e.g., p-values near 0.05). Ideally, interpret these results in light of the pre-specified theoretical hypotheses rather than treating every significant coefficient as a confirmed finding.
Mediation analysis
Mediation analysis asks: through what mechanism does the intervention affect the outcome? Here we test whether adherence (how closely participants followed the intervention protocol) mediates the effect of group assignment on the change in exercise days.
We follow the classic Baron and Kenny (1986) approach:
Step 1: Show that the intervention affects the mediator (adherence).
Step 2: Show that the mediator affects the outcome, controlling for the intervention.
If both steps yield significant effects, and the direct effect of the intervention is reduced when controlling for the mediator, we have evidence of mediation.
Step 1: Intervention → Adherence
Commands used:
Both steps use lm() and summary(), which we have already seen. The only difference is the choice of outcome and predictors in the formula.
In Step 1, the outcome is the mediator (adherence) and the predictor is group_assignment. This tests whether the intervention affects adherence.
In Step 2, the outcome is the final outcome (change_days_8wk) and we include bothadherence and group_assignment using +. This lets us see (a) whether adherence predicts the outcome after controlling for group, and (b) whether the group coefficients shrink compared to the basic regression — which would indicate mediation.
adherence_mod <-lm(adherence ~ group_assignment, data = data)summary(adherence_mod)
Call:
lm(formula = adherence ~ group_assignment, data = data)
Residuals:
Min 1Q Median 3Q Max
-31.097 -6.783 -0.064 6.903 38.903
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 47.7827 0.3490 136.92 <2e-16 ***
group_assignmentSocial Comparison 8.2813 0.4935 16.78 <2e-16 ***
group_assignmentCommitment Device 10.0427 0.4935 20.35 <2e-16 ***
group_assignmentDefault Scheduling 5.3147 0.4935 10.77 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 9.557 on 2996 degrees of freedom
Multiple R-squared: 0.1371, Adjusted R-squared: 0.1363
F-statistic: 158.7 on 3 and 2996 DF, p-value: < 2.2e-16
Step 2: Adherence → Outcome (controlling for intervention)
mediation_mod <-lm(change_days_8wk ~ adherence + group_assignment, data = data)summary(mediation_mod)
Call:
lm(formula = change_days_8wk ~ adherence + group_assignment,
data = data)
Residuals:
Min 1Q Median 3Q Max
-2.49456 -0.31504 0.01512 0.31666 1.58562
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.4799715 0.0474607 -52.25 <2e-16 ***
adherence 0.0503301 0.0009223 54.57 <2e-16 ***
group_assignmentSocial Comparison 0.4758666 0.0260588 18.26 <2e-16 ***
group_assignmentCommitment Device 0.5454852 0.0265803 20.52 <2e-16 ***
group_assignmentDefault Scheduling 0.3022458 0.0253919 11.90 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4825 on 2995 degrees of freedom
Multiple R-squared: 0.6283, Adjusted R-squared: 0.6278
F-statistic: 1265 on 4 and 2995 DF, p-value: < 2.2e-16
TipInterpretation
Compare the group coefficients in this model with those from the basic regression (without adherence). If the coefficients shrink when adherence is added, this suggests that part of the intervention effect operates through adherence. A significant coefficient on adherence is necessary for mediation.
Logistic regression
When the outcome is binary (success_8wk), we use logistic regression instead of linear regression. The model estimates the log-odds of success as a function of group assignment and psychological factors.
Commands used:
glm(): Stands for Generalised Linear Model. It extends lm() to handle non-normal outcomes. The formula works the same way as in lm(). The critical difference is the family = binomial() argument, which tells R to fit a logistic regression (binary outcome, logit link function). The coefficients are on the log-odds scale, not the probability scale.
summary() on a glm object: Similar to a linear model summary, but reports z-statistics instead of t-statistics. Coefficients represent changes in log-odds per unit increase in the predictor.
Call:
glm(formula = success_8wk ~ group_assignment + self_efficacy +
loss_aversion + social_norm_sensitivity, family = binomial(),
data = data)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.86185 0.59783 -13.151 < 2e-16 ***
group_assignmentSocial Comparison 3.55527 0.45919 7.742 9.75e-15 ***
group_assignmentCommitment Device 4.02467 0.45717 8.803 < 2e-16 ***
group_assignmentDefault Scheduling 2.68794 0.46724 5.753 8.78e-09 ***
self_efficacy 0.24173 0.03648 6.626 3.46e-11 ***
loss_aversion 0.12284 0.03306 3.715 0.000203 ***
social_norm_sensitivity 0.12847 0.03408 3.769 0.000164 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 2385.8 on 2999 degrees of freedom
Residual deviance: 2025.0 on 2993 degrees of freedom
AIC: 2039
Number of Fisher Scoring iterations: 7
Predicted probabilities
To make the logistic regression results more interpretable, we compute predicted probabilities of success for each group at low (3) and high (7) levels of self-efficacy, holding the other variables at their midpoints.
Commands used:
data.frame(): Creates a new data frame manually by specifying column names and their values. We use rep() (repeat) to generate repeated values for each column.
predict() with type = "response": When applied to a logistic model, type = "response" converts the predictions from the log-odds scale back to the probability scale (values between 0 and 1). Without this argument, predict() would return log-odds, which are harder to interpret.
logit_predictions <-data.frame(group_assignment =rep(levels(data$group_assignment), each =2),self_efficacy =rep(c(3, 7), times =4),loss_aversion =5,social_norm_sensitivity =5)logit_predictions$predicted_prob <-predict( logit_mod,newdata = logit_predictions,type ="response")logit_predictions
The predicted_prob column shows the estimated probability of meeting the exercise success threshold. Comparing rows within a group shows the effect of self-efficacy; comparing across groups shows the intervention effect.
Summary visualisations
Change in exercise by group
Commands used:
geom_boxplot(): A ggplot2 layer that draws box-and-whisker plots. The box shows the interquartile range (25th to 75th percentile), the line inside is the median, and the whiskers extend to 1.5× the IQR. Points beyond the whiskers are plotted individually as potential outliers.
theme_minimal(): Applies a clean, minimal visual theme to the plot (light background, no grid boxes).
theme(legend.position = "none"): Hides the legend, since the x-axis labels already identify the groups.
ggplot(data, aes(x = group_assignment, y = change_days_8wk,fill = group_assignment)) +geom_boxplot() +labs(title ="Change in exercise frequency by intervention",x ="Intervention Group",y ="Change in Days per Week" ) +theme_minimal() +theme(legend.position ="none")
Combined interaction effects
Commands used:
Here we use the patchwork package to arrange multiple plots into a single figure.
We save each plot to a variable (p1, p2, p3, p4) using <- instead of displaying them immediately.
The + operator (from patchwork, not base R) places plots side by side.
The / operator stacks plots vertically.
So (p1 + p2) / (p3 + p4) creates a 2×2 grid: top row is p1 and p2, bottom row is p3 and p4.
plot_annotation() adds an overall title to the combined figure.
p1 <-ggplot(social_norm_df, aes(x = social_norm_sensitivity, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Social norm sensitivity", x ="Social Norm Sensitivity",y ="Predicted Change", color ="Group") +theme_minimal()p2 <-ggplot(loss_aversion_df, aes(x = loss_aversion, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Loss aversion", x ="Loss Aversion",y ="Predicted Change", color ="Group") +theme_minimal()p3 <-ggplot(health_literacy_df, aes(x = health_literacy, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Health literacy", x ="Health Literacy",y ="Predicted Change", color ="Group") +theme_minimal()p4 <-ggplot(self_efficacy_df, aes(x = self_efficacy, y = predicted,color = group_assignment)) +geom_line(linewidth =1) +labs(title ="Self-efficacy", x ="Self-Efficacy",y ="Predicted Change", color ="Group") +theme_minimal()(p1 + p2) / (p3 + p4) +plot_annotation(title ="Interaction effects by cognitive factors")
Further reading
If you want to deepen your understanding of R and the statistical methods used in this tutorial, the following books are excellent starting points:
Wickham, Çetinkaya-Rundel, and Grolemund (2023) — The go-to guide for data wrangling and visualisation with the tidyverse. Covers everything from importing data to building ggplot2 graphics. Freely available online.
Ismay and Kim (2019) — A gentle, hands-on introduction to statistical inference using R and the tidyverse, well suited for students new to both statistics and programming. Freely available online.
Navarro (2019) — A comprehensive introductory statistics textbook written specifically for psychology students, with all examples in R. Freely available online.
Field, Miles, and Field (2012) — A thorough and accessible guide to statistics using R, covering ANOVA, regression, and more with a practical, example-driven approach.
James et al. (2021) — An approachable introduction to machine learning and statistical modelling with R, ideal for students ready to go beyond basic inference. Freely available online.
References
Baron, Reuben M, and David A Kenny. 1986. “The Moderator–Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.”Journal of Personality and Social Psychology 51 (6): 1173–82. https://doi.org/10.1037/0022-3514.51.6.1173.
Field, Andy, Jeremy Miles, and Zoë Field. 2012. Discovering Statistics Using R. SAGE Publications.
Ismay, Chester, and Albert Y Kim. 2019. Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. Chapman; Hall/CRC. https://moderndive.com.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning: With Applications in R. 2nd ed. Springer. https://www.statlearning.com.
Navarro, Danielle. 2019. Learning Statistics with R: A Tutorial for Psychology Students and Other Beginners. https://learningstatisticswithr.com.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd ed. O’Reilly Media. https://r4ds.hadley.nz.
Social norm sensitivity
We expect that the Social Comparison nudge might be more effective for participants who are more sensitive to social norms.
Commands used (these apply to all four moderation models below):
lm()with the*operator: In a formula,A * Bis shorthand forA + B + A:B— it includes the main effects of both variables and their interaction. The interaction termA:Ballows the effect of A to vary depending on the value of B (and vice versa). This is the key ingredient of moderation analysis.expand.grid(): Creates a data frame containing every combination of the supplied values. Here we generate all combinations of the 4 group levels and social norm sensitivity values 1 through 10. This gives us a grid of hypothetical participants to generate predictions for.predict(): Takes a fitted model and new data, and returns the model’s predicted outcome for each row. The argumentnewdata =specifies the data frame of values to predict for. This is how we produce the predicted lines in the interaction plots.geom_line(): A ggplot2 layer that draws lines connecting data points. Combined withcolor = group_assignment, it draws a separate line for each group.If the lines are roughly parallel, there is little moderation — the intervention effect is similar regardless of the moderator value. Non-parallel lines (especially crossing lines) suggest moderation. Check the regression output to see whether the interaction terms are statistically significant.
How to interpret this model
The regression output includes several types of coefficients. Here is how to read each one:
group_assignmentSocial Comparison): These represent the difference between that intervention group and the control group when social norm sensitivity equals zero. Because the scale does not include zero in practice, these coefficients are not directly meaningful on their own — they are the intercept difference extrapolated to the edge of the scale.social_norm_sensitivity): This is the slope of social norm sensitivity for the control group only (the reference category). It tells you how much the predicted change in exercise days increases per one-unit increase in social norm sensitivity among control participants.group_assignmentSocial Comparison:social_norm_sensitivity): These are the key coefficients for moderation. Each one tells you how much the slope of social norm sensitivity differs in that intervention group compared to the control group. A positive and significant interaction means the intervention effect grows as social norm sensitivity increases. A negative and significant interaction means the intervention effect shrinks (or reverses) at higher levels of the moderator.In the plot, look specifically at the Social Comparison line relative to Control. If the gap between these two lines widens as social norm sensitivity increases, this confirms a positive interaction — the Social Comparison nudge is more effective for participants who care more about social norms.