For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Researchers tested the question of whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N=40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics. In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement. The authors then performed a mixed-design ANOVA with agreement score as the dependent variable, group (Moroccan or Spanish, between-subjects) as the fixed-effects factor, and temporal focus (past or future, within-subjects) as the random effects factor. In addition, the authors performed unpaired two-sample t-tests to determine whether there was a significant difference between the two groups in agreement scores for PAST questions, and whether there was a significant difference in scores for FUTURE questions.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685).


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' files
library(readxl) # import Excel files
# install.packages("ez") # run once if ez is not installed

# optional packages/functions:
# library(afex) # anova functions
library(ez) # anova functions 2 (provides ezANOVA, used in Step 4)
# library(scales) # for plotting
# std.err <- function(x) sd(x)/sqrt(length(x)) # standard error

Step 2: Load data

# Just Experiment 2
data_path <- 'data/DeLaFuenteEtAl_2014_RawData.xls'
d <- read_excel(data_path, sheet=3)
head(d)
## # A tibble: 6 × 5
##   group    participant subscale item                      Agreement (0=complet…¹
##   <chr>          <dbl> <chr>    <chr>                                      <dbl>
## 1 Moroccan           1 PAST     1. Para mí son muy impor…                      4
## 2 Moroccan           1 PAST     2. Los jóvenes deben con…                      4
## 3 Moroccan           1 PAST     3. Creo que las personas…                      5
## 4 Moroccan           1 PAST     4. La juventud de hoy en…                      2
## 5 Moroccan           1 PAST     5. Los ancianos saben má…                      4
## 6 Moroccan           1 PAST     6. El modo correcto de h…                      3
## # ℹ abbreviated name:
## #   ¹​`Agreement (0=complete disagreement; 5=complete agreement)`
#d$subscale

Step 3: Tidy data

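# Shorten the long agreement column name
d_select <- d %>%
  rename(Agreement = `Agreement (0=complete disagreement; 5=complete agreement)`)

# Relabel the groups as plural nouns, matching the wording of the target result
d_clean <- d_select %>%
  mutate(group = recode(group,
                        "young Spaniard" = "Spaniards",
                        "Moroccan" = "Moroccans"))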

Step 4: Run analysis

Pre-processing

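# Average the item ratings into one score per participant, group, and subscale
summary_table <- d_clean %>%
  group_by(participant, group, subscale) %>%
  summarise(agree_score = mean(Agreement, na.rm = TRUE), .groups = "drop")

# One row per participant and group, with FUTURE and PAST as columns
d_wide <- summary_table %>%
  pivot_wider(names_from = subscale, values_from = agree_score)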

head(d_wide)
## # A tibble: 6 × 4
##   participant group     FUTURE  PAST
##         <dbl> <chr>      <dbl> <dbl>
## 1           1 Moroccans    3.3  3.36
## 2           1 Spaniards    3.3  2.55
## 3           2 Moroccans    3.2  3.82
## 4           2 Spaniards    3.6  3.91
## 5           3 Moroccans    3.2  3.18
## 6           3 Spaniards    3.5  3.45
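Before moving on, it can help to list any rows of the wide table that lack a score on either subscale, since those participants cannot enter the mixed ANOVA. The sketch below is a minimal illustration on a small made-up table (`toy_wide` and its values are hypothetical stand-ins for `d_wide`, not the study data); the same `filter()` call could be applied to `d_wide` itself.

```r
library(dplyr)

# Hypothetical stand-in for d_wide; NaN mimics a subscale with no valid ratings
toy_wide <- data.frame(
  participant = c(1, 1, 2, 2),
  group       = c("Moroccans", "Spaniards", "Moroccans", "Spaniards"),
  FUTURE      = c(3.3, 3.3, NaN, 3.6),
  PAST        = c(3.36, 2.55, 3.82, 3.91)
)

# Rows where at least one subscale mean is missing (is.na() is TRUE for NaN too)
incomplete <- toy_wide %>%
  filter(is.na(PAST) | is.na(FUTURE))

incomplete
```

Note that the participant numbers repeat across groups in the output above, so a missing row has to be identified by participant and group together.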

Descriptive statistics

Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):

d_long <- d_wide %>%
  pivot_longer(
    cols = c(PAST, FUTURE),
    names_to = "time",
    values_to = "Agreement") %>%
  mutate(
    Agreement = as.numeric(Agreement),
    group = factor(group, levels = c("Spaniards", "Moroccans")),
    time  = factor(time,  levels = c("PAST", "FUTURE")))

descr_table <- d_long %>%
  group_by(group, time) %>%
  summarise(
    n_nonmiss      = sum(!is.na(Agreement)),
    mean_Agreement = mean(Agreement, na.rm = TRUE),
    se_Agreement   = sd(Agreement, na.rm = TRUE) / sqrt(n_nonmiss),
    .groups = "drop"
  )

ggplot(descr_table, aes(x = group, y = mean_Agreement, fill = time)) +
  geom_bar(stat = "identity", 
           position = position_dodge(width = 0.8), 
           color = "black", 
           width = 0.6) +  # thinner bars
  geom_errorbar(aes(ymin = mean_Agreement - se_Agreement,
                    ymax = mean_Agreement + se_Agreement),
                position = position_dodge(width = 0.8), 
                width = 0) +  # 0 width = line-style error bars (no caps)
  coord_cartesian(ylim = c(2, 4)) +
  labs(x = "", y = "Rating", fill = NULL) +
  scale_fill_manual(values = c("PAST" = "darkgray", "FUTURE" = "lightgray"),
                    labels = c("PAST" = "Past-Focused Statements", 
                               "FUTURE" = "Future-Focused Statements"),
                    guide = guide_legend(nrow = 1)) +
  theme_minimal() +
  theme(
    text = element_text(size = 12),
    legend.position = "top",
    legend.title = element_blank(),
    legend.text = element_text(size = 12),
    panel.grid = element_blank(),
    axis.line = element_line(color = "black")
  )

Inferential statistics

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2).
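As a cross-check on the interaction test described above: in a 2 × 2 mixed design with balanced groups, the interaction F equals the square of a Student's t-test comparing the groups on within-subject difference scores (PAST minus FUTURE). The sketch below illustrates this on simulated data only; `sim_wide`, `sim_long`, and all values in them are hypothetical, and base R `aov()` is used here in place of `ezANOVA` purely to keep the example self-contained.

```r
# Simulated data (hypothetical, not the study data): 2 groups x 2 within-subject levels
set.seed(42)
n_per_group <- 12
sim_wide <- data.frame(
  id     = factor(seq_len(2 * n_per_group)),
  group  = factor(rep(c("A", "B"), each = n_per_group)),
  PAST   = rnorm(2 * n_per_group, mean = 3, sd = 0.6),
  FUTURE = rnorm(2 * n_per_group, mean = 3, sd = 0.6)
)

# Student's t-test on the within-subject difference scores
sim_wide$past_minus_future <- sim_wide$PAST - sim_wide$FUTURE
t_diff <- t.test(past_minus_future ~ group, data = sim_wide, var.equal = TRUE)

# Same data in long format for a mixed ANOVA
sim_long <- data.frame(
  id    = rep(sim_wide$id, times = 2),
  group = rep(sim_wide$group, times = 2),
  time  = factor(rep(c("PAST", "FUTURE"), each = nrow(sim_wide))),
  score = c(sim_wide$PAST, sim_wide$FUTURE)
)
fit <- aov(score ~ group * time + Error(id / time), data = sim_long)

# The interaction is tested in the within-subject stratum
within_tab <- summary(fit)[["Error: id:time"]][[1]]
rownames(within_tab) <- trimws(rownames(within_tab))
f_interaction <- within_tab["group:time", "F value"]

# The two numbers should agree up to floating-point error
c(t_squared = unname(t_diff$statistic)^2, F_interaction = f_interaction)
```

Applied to the real data, the same idea would mean running a Student's t-test on `PAST - FUTURE` by group in `d_wide` and comparing its squared t value with the `group:time` F.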

# reproduce the above results here

# participant 25 has missing data and needs to be removed from the dataset
d_long_clean <- d_long %>%
  filter(!is.na(Agreement)) %>%  # drop missing subscale means
  distinct(participant, group, time, .keep_all = TRUE) %>%
  group_by(participant, group) %>%
  # keep only participants who still have both a PAST and a FUTURE score
  filter(sum(time == "PAST") == 1 & sum(time == "FUTURE") == 1) %>%
  ungroup() %>%
  mutate(
    # participant numbers repeat across groups, so build a unique ID for ezANOVA
    participant_unique = paste0("P", participant, "_", group),
    participant_unique = factor(participant_unique),
    group = factor(group, levels = c("Spaniards", "Moroccans")),
    time  = factor(time,  levels = c("PAST", "FUTURE"))
  )

# Sanity check: confirm each remaining participant has exactly one PAST and one FUTURE score
missing_check <- d_long_clean %>%
  group_by(participant_unique, group) %>%
  summarise(
    n_obs     = n(),
    has_past  = sum(time == "PAST", na.rm = TRUE),
    has_future= sum(time == "FUTURE", na.rm = TRUE),
    .groups = "drop"
  )

complete_pairs <- missing_check %>%
  filter(n_obs == 2, has_past == 1, has_future == 1) %>%
  select(participant_unique, group)

d_long_complete <- d_long_clean %>%
  semi_join(complete_pairs, by = c("participant_unique", "group"))

nrow(d_long) - nrow(d_long_complete) # rows dropped (2 rows per excluded participant-by-group pair)
## [1] 4
# Now run the ANOVA
anova_result <- ezANOVA(
    data = d_long_complete,
    dv = .(Agreement),
    wid = .(participant_unique),
    within = .(time),
    between = .(group),
    detailed = TRUE,
    type = 3)

print(anova_result)
## $ANOVA
##        Effect DFn DFd          SSn      SSd           F            p p<.05
## 1 (Intercept)   1  76 1543.4554821 15.24977 7692.090448 3.991708e-78     *
## 2       group   1  76    0.4398308 15.24977    2.191977 1.428650e-01      
## 3        time   1  76    3.9662047 37.77666    7.979308 6.040164e-03     *
## 4  group:time   1  76    9.1188907 37.77666   18.345608 5.327735e-05     *
##           ges
## 1 0.966785451
## 2 0.008226326
## 3 0.069591536
## 4 0.146734962
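The `ges` column printed by `ezANOVA` is generalized eta squared, whereas the paper reports partial eta squared (ηp2), so the two are not directly comparable. A rough way to get a comparable number is to compute partial eta squared by hand from the sums of squares in the `group:time` row. The sketch below copies those values from the output above; the variable names are illustrative only.

```r
# Values copied from the group:time row of the ezANOVA output above
ss_effect <- 9.1188907   # SSn
ss_error  <- 37.77666    # SSd
df_effect <- 1           # DFn
df_error  <- 76          # DFd

# Partial eta squared: effect SS over effect SS plus its error SS
partial_eta_sq <- ss_effect / (ss_effect + ss_error)

# Recompute F from the same quantities as a consistency check
f_value <- (ss_effect / df_effect) / (ss_error / df_error)

round(c(partial_eta_sq = partial_eta_sq, F = f_value), 3)
```

This gives a partial eta squared of about .19, close to the reported ηp2 = .20, and recovers the F of 18.35 shown in the table.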

Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,

# reproduce the above results here
# use the same complete-case data as the ANOVA and the FUTURE test
past_data <- d_long_complete %>%
  filter(time == "PAST")

# relevel() makes Moroccans the first group, so t is Moroccans minus Spaniards
print(t.test(Agreement ~ relevel(group, ref = "Moroccans"), var.equal = TRUE, data = past_data))
## 
##  Two Sample t-test
## 
## data:  Agreement by relevel(group, ref = "Moroccans")
## t = 3.8562, df = 76, p-value = 0.0002394
## alternative hypothesis: true difference in means between group Moroccans and group Spaniards is not equal to 0
## 95 percent confidence interval:
##  0.2851528 0.8943343
## sample estimates:
## mean in group Moroccans mean in group Spaniards 
##                3.280886                2.691142

and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685)

# reproduce the above results here
future_data <- d_long_complete %>%
  filter(time == "FUTURE")

t_test_result <- t.test(Agreement ~ relevel(group, ref = "Moroccans"), var.equal = TRUE, data = future_data)

print(t_test_result)
## 
##  Two Sample t-test
## 
## data:  Agreement by relevel(group, ref = "Moroccans")
## t = -3.3898, df = 76, p-value = 0.001112
## alternative hypothesis: true difference in means between group Moroccans and group Spaniards is not equal to 0
## 95 percent confidence interval:
##  -0.5990628 -0.1556380
## sample estimates:
## mean in group Moroccans mean in group Spaniards 
##                3.116239                3.493590
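Both t-tests above use `var.equal = TRUE` (Student's t-test). One way to see why the reported degrees of freedom point to that choice is to compare the df each variant produces: Student's test always gives n1 + n2 − 2, while Welch's test usually gives a smaller, fractional value. The sketch below uses simulated ratings only; the vectors, means, and standard deviations are hypothetical and chosen just to have 40 observations per group, as in the reported design.

```r
set.seed(1)

# Simulated ratings (hypothetical), 40 per group as in the reported design
moroccans <- rnorm(40, mean = 3.3, sd = 0.7)
spaniards <- rnorm(40, mean = 2.7, sd = 0.6)

# Student's t-test assumes equal variances; Welch's does not
student <- t.test(moroccans, spaniards, var.equal = TRUE)
welch   <- t.test(moroccans, spaniards, var.equal = FALSE)

c(student_df = unname(student$parameter), welch_df = unname(welch$parameter))
```

With 40 participants per group, Student's test has exactly 78 df, which matches the t(78) in the paper. The 76 df obtained above reflects the 78 participants left after removing incomplete cases.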

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

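Mostly. The results were very similar but not identical. The interaction was F(1, 76) = 18.35, p < .001 (reported: F(1, 78) = 19.12), the PAST comparison was t(76) = 3.86, p < .001 (reported: t(78) = 4.04), and the FUTURE comparison was t(76) = −3.39, p = .001 (reported: t(78) = −3.32). My degrees of freedom are 76 rather than 78 because participant records with missing data (participant 25 among them) were removed from my dataset, leaving 78 participants. This could also be why the test statistics differ slightly. In addition, ezANOVA reports generalized eta squared (ges = .15), which is a different measure from the reported ηp2 = .20.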

How difficult was it to reproduce your results?

Not difficult once the data was cleaned.

What aspects made it difficult? What aspects made it easy?

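The data were not perfectly clean, and the dataset needed slightly more manipulation than expected (pivoting between long and wide formats). It was also hard to decide at first whether Welch’s or Student’s t-test more closely reflected the original authors’ results and assumptions about variance. I based this on the df: the reported value of 78 is a whole number equal to 80 − 2, which is what Student’s equal-variance test gives, so I used var.equal = TRUE. I’m not sure this is what the authors actually did.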