For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Researchers tested the question of whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N = 40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics. In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement. The authors then performed a mixed-design ANOVA with agreement score as the dependent variable, group (Moroccan or Spanish) as the between-subjects factor, and temporal focus (past or future) as the within-subjects factor. In addition, the authors performed unpaired two-sample t-tests to determine whether there was a significant difference between the two groups in agreement scores for PAST questions, and whether there was a significant difference in scores for FUTURE questions.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, \(\eta_p^2 = .20\) (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685).


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
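library(sjstats) # convenience functions for common statistical computations
library(psychReport) # helpers for reporting results in psychology
library(rstatix) # pipe-friendly basic statistical tests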

# optional packages/functions:
# library(afex) # anova functions
library(ez) # anova functions (ezANOVA is used in Step 4)
# library(scales) # for plotting
std.err <- function(x) sd(x)/sqrt(length(x)) # standard error of the mean

Step 2: Load data

# Just Experiment 2
data_path <- 'data/DeLaFuenteEtAl_2014_RawData.xls'
d <- read_excel(data_path, sheet=3)

Step 3: Tidy data

participantnum <- rep(1:80, each = 21) # new id vector: 80 participants x 21 rows each

d <- d %>% # drop the item column and shorten the agreement column name
  select(-item) %>% 
  rename(agreement = `Agreement (0=complete disagreement; 5=complete agreement)`)

# fix misnumbered participant rows (note: the participant column is dropped below in favor of id)
d$participant <- replace(d$participant, c(501:521), 25)
d$participant <- replace(d$participant, c(1341:1361), 25)

d <- d %>% # participant numbers repeat across groups, so use the new unique id instead
  mutate(id = participantnum) %>% 
  select(-participant)
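
The id vector above assumes that the rows are ordered by participant and that every participant contributed exactly 21 rows. A minimal sketch of how that assumption could be checked, shown on a small made-up data frame rather than the real data (toy and check_ids are hypothetical names, not part of the original analysis):

```r
# Toy stand-in for the tidied data: 4 participants x 3 rows each
# (the real data are assumed to have 80 participants x 21 rows).
toy <- data.frame(
  id    = rep(1:4, each = 3),
  group = rep(c("Moroccan", "young Spaniard"), each = 6)
)

# TRUE only if every id has exactly n_items rows and belongs to one group.
check_ids <- function(df, n_items) {
  rows_per_id   <- table(df$id)
  groups_per_id <- tapply(df$group, df$id, function(g) length(unique(g)))
  all(rows_per_id == n_items) && all(groups_per_id == 1)
}

check_ids(toy, n_items = 3)
```

On the real data the call would be check_ids(d, n_items = 21); a FALSE would suggest that the rep(1:80, each = 21) shortcut assigns some rows to the wrong participant.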

Step 4: Run analysis

Pre-processing

dfig2 <- d %>% 
  group_by(group, subscale) %>% 
  summarize(mean = mean(agreement), stderr = std.err(agreement))

dfig2
## # A tibble: 4 × 4
## # Groups:   group [2]
##   group          subscale  mean stderr
##   <chr>          <chr>    <dbl>  <dbl>
## 1 Moroccan       FUTURE    3.12 0.0698
## 2 Moroccan       PAST      3.29 0.0698
## 3 young Spaniard FUTURE    3.49 0.0600
## 4 young Spaniard PAST      2.68 0.0578
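
The standard errors above are computed over every individual rating, so each participant contributes many rows. If the error bars are meant to reflect variability across participants (an assumption here; the original figure may use something else), the ratings can first be averaged within participant. A sketch on simulated ratings, with toy and participant_se as hypothetical names:

```r
library(dplyr)

# Simulated stand-in for d: 6 participants, 2 subscales, 5 items each.
set.seed(1)
toy <- expand.grid(item = 1:5, subscale = c("PAST", "FUTURE"), id = 1:6,
                   stringsAsFactors = FALSE)
toy$group <- ifelse(toy$id <= 3, "Moroccan", "young Spaniard")
toy$agreement <- sample(1:5, nrow(toy), replace = TRUE)

# Average within participant first, then summarize across participants.
participant_se <- function(df) {
  df %>%
    group_by(group, subscale, id) %>%
    summarize(score = mean(agreement), .groups = "drop") %>%
    group_by(group, subscale) %>%
    summarize(mean = mean(score),
              stderr = sd(score) / sqrt(n()),
              n = n(),
              .groups = "drop")
}

toy_summary <- participant_se(toy)
toy_summary
```

When every participant answers the same number of items per subscale, the cell means are the same as in the item-level summary; only the standard errors change, because n is now the number of participants per group.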

Descriptive statistics

Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):

dfig2$group <- factor(dfig2$group, levels = c("young Spaniard", "Moroccan"), labels = c("Spaniard", "Moroccan"), ordered = TRUE)
dfig2$subscale <- factor(dfig2$subscale, levels = c("PAST", "FUTURE"), labels = c("Past-Focused Statements", "Future-Focused Statements"), ordered = TRUE)

ggplot(dfig2, aes(x = group, y = mean, fill = subscale)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_errorbar(aes(ymin = mean - stderr, ymax = mean + stderr), position = position_dodge(.9), width = .25) +
  ylim(0, 4) +
  theme_classic()

Inferential statistics

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, \(\eta_p^2 = .20\) (Fig. 2).

# reproduce the above results here

daov <- ezANOVA(data = d, dv = agreement, wid = id, within = subscale, between = group, detailed = TRUE)

daov
## $ANOVA
##           Effect DFn DFd          SSn      SSd           F            p p<.05
## 1    (Intercept)   1  78 1582.7355501 16.34106 7554.795449 2.107189e-79     *
## 2          group   1  78    0.6035956 16.34106    2.881114 9.361091e-02      
## 3       subscale   1  78    4.1514592 39.98697    8.097983 5.659022e-03     *
## 4 group:subscale   1  78    9.8145046 39.98697   19.144520 3.712656e-05     *
##          ges
## 1 0.96563402
## 2 0.01060211
## 3 0.06864243
## 4 0.14838416
if (!require(MBESS)) { install.packages("MBESS"); library(MBESS) }

results <- daov

# partial eta squared from the ezANOVA sums of squares; approach taken from
# https://groups.google.com/g/ez4r/c/4CHBP-jlZGY?pli=1
results$ANOVA$partialetasquared <- results$ANOVA$SSn/(results$ANOVA$SSn+results$ANOVA$SSd)
loweretasquared <- c()
upperetasquared <- c()
for (cR in 1:nrow(results$ANOVA)) {
  # degrees of freedom for this effect, reused in the limits below
  df.1 <- results$ANOVA$DFn[cR]
  df.2 <- results$ANOVA$DFd[cR]
  Lims <- conf.limits.ncf(F.value = results$ANOVA$F[cR], conf.level = 0.95, df.1 = df.1, df.2 = df.2)
  Lower.lim <- Lims$Lower.Limit/(Lims$Lower.Limit + df.1 + df.2 + 1)
  Upper.lim <- Lims$Upper.Limit/(Lims$Upper.Limit + df.1 + df.2 + 1)
  if (is.na(Lower.lim)) {
    Lower.lim <- 0
  }
  if (is.na(Upper.lim)) {
    Upper.lim <- 1
  }
  loweretasquared <- c(loweretasquared,Lower.lim)
  upperetasquared <- c(upperetasquared,Upper.lim)
}
results$ANOVA$partialetasquared.lower <- loweretasquared
results$ANOVA$partialetasquared.upper <- upperetasquared

print(results$ANOVA$partialetasquared[4]) #this is the partial eta squared for the interaction
## [1] 0.1970726
# for the full table, run print(results$ANOVA)
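
Because F = (SSn/DFn)/(SSd/DFd), the ratio SSn/(SSn + SSd) used above can also be written in terms of F and its degrees of freedom. A small sketch of that identity (partial_eta_sq is a hypothetical helper name), which makes it easy to compare the reproduced F with the reported one:

```r
# Partial eta squared from an F statistic and its degrees of freedom.
# Algebraically identical to SSn / (SSn + SSd).
partial_eta_sq <- function(f_value, df_effect, df_error) {
  (f_value * df_effect) / (f_value * df_effect + df_error)
}

partial_eta_sq(19.144520, 1, 78) # interaction F from the ezANOVA table
partial_eta_sq(19.12, 1, 78)     # interaction F as reported in the paper
```

Both values round to .20, so the small difference in F does not change the reported effect size.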

Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,

# reproduce the above results here

dt1 <- d %>% # average items within participant: 80 participant means give 78 df
  filter(subscale == "PAST") %>% 
  group_by(id, group) %>% 
  summarize(mean = mean(agreement))

t.test(mean ~ group, data = dt1, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  mean by group
## t = 4.0034, df = 78, p-value = 0.0001413
## alternative hypothesis: true difference in means between group Moroccan and group young Spaniard is not equal to 0
## 95 percent confidence interval:
##  0.3107666 0.9255970
## sample estimates:
##       mean in group Moroccan mean in group young Spaniard 
##                     3.293182                     2.675000

and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685)

# reproduce the above results here

dt2 <- d %>% # average items within participant: 80 participant means give 78 df
  filter(subscale == "FUTURE") %>% 
  group_by(id, group) %>% 
  summarize(mean = mean(agreement))

t.test(mean ~ group, data = dt2, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  mean by group
## t = -3.3637, df = 78, p-value = 0.001195
## alternative hypothesis: true difference in means between group Moroccan and group young Spaniard is not equal to 0
## 95 percent confidence interval:
##  -0.5929718 -0.1520282
## sample estimates:
##       mean in group Moroccan mean in group young Spaniard 
##                       3.1200                       3.4925

I reached three hours of working on this assignment.

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

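I was (mostly) able to reproduce the results. The interaction F (19.14 vs. the reported 19.12) and the two t statistics (4.00 vs. 4.04, and −3.36 vs. −3.32) are off by minuscule amounts, and the partial eta squared (.197) matches the reported .20 after rounding. The paper reports p = .001 for all three tests, whereas I obtained p < .001 for the interaction and the PAST comparison and p = .0012 for the FUTURE comparison. So there must be some difference between the authors’ analysis and mine (I used ezANOVA with its default settings and equal-variance t-tests on participant means). Mostly, though, I think the results hold up, although it is a bit strange that I wasn’t able to reproduce the exact values for all of the statistical tests.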

How difficult was it to reproduce your results?

It was pretty difficult: I wasn’t able to reproduce the results exactly.

What aspects made it difficult? What aspects made it easy?

First, there were errors in the original data frame: some participant id numbers were mislabeled, which caused problems in later statistical tests. Next, participant numbers were reused across the two levels of the between-subjects factor (group), so there were two 1’s, two 2’s, etc., which confused the R functions when it came to running the statistical tests. Third, one of the column names contained spaces, which made renaming it a bit tricky. Finally, they did not specify all aspects of their statistical tests, so I spent a long time tweaking the parameters of the tests and R functions to figure out how to get the exact results they obtained, but to no avail. For example, they did not specify how they got to 78 df in their t-tests: did they take the mean of all of the judgments per participant in each subscale? (This is what I did, and I got pretty close to their result.) Fortunately, their data were otherwise formatted well and intuitively, so they were easy to clean and work with. But I reached three hours before I could figure out how to reproduce their results exactly.