For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Researchers tested whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N = 40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics. In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement. The authors then performed a mixed-design ANOVA with agreement score as the dependent variable, group (Moroccan or Spanish, between-subjects) as the fixed-effects factor, and temporal focus (past or future, within-subjects) as the random-effects factor. In addition, the authors performed unpaired two-sample t-tests to determine whether there was a significant difference between the two groups in agreement scores for PAST questions, and whether there was a significant difference in scores for FUTURE questions.

Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685).

Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# optional packages/functions:
library(afex) # anova functions
library(ez) # anova functions 2
# library(scales) # for plotting
# std.err <- function(x) sd(x)/sqrt(length(x)) # standard error

Step 2: Load data

# Just Experiment 2
data_path <- '/Users/as4508/Documents/GitHub/problem_sets/ps3/Group B/Choice 3/data/DeLaFuenteEtAl_2014_RawData-OSF-corrected_EHedit.xls'
d <- read_excel(data_path, sheet=3)

Step 3: Tidy data

names(d)

## [1] "group"                                                    
## [2] "participant"                                              
## [3] "subscale"                                                 
## [4] "item"                                                     
## [5] "Agreement (0=complete disagreement; 5=complete agreement)"

head(d)

## # A tibble: 6 × 5
##   group    participant subscale item                      Agreement (0=complet…¹
##   <chr>          <dbl> <chr>    <chr>                                      <dbl>
## 1 Moroccan           1 PAST     1. Para mí son muy impor…                      4
## 2 Moroccan           1 PAST     2. Los jóvenes deben con…                      4
## 3 Moroccan           1 PAST     3. Creo que las personas…                      5
## 4 Moroccan           1 PAST     4. La juventud de hoy en…                      2
## 5 Moroccan           1 PAST     5. Los ancianos saben má…                      4
## 6 Moroccan           1 PAST     6. El modo correcto de h…                      3
## # ℹ abbreviated name:
## #   ¹`Agreement (0=complete disagreement; 5=complete agreement)`

d_tidy <- d %>%
  mutate(agreement = `Agreement (0=complete disagreement; 5=complete agreement)`,
         group = factor(group, levels = c("young Spaniard", "Moroccan")),
         subscale = factor(subscale, levels = c("PAST", "FUTURE")),
         participant = factor(participant),
         id = interaction(group, participant, drop = TRUE))

head(d_tidy)

## # A tibble: 6 × 7
##   group    participant subscale item      Agreement (0=complet…¹ agreement id   
##   <fct>    <fct>       <fct>    <chr>                      <dbl>     <dbl> <fct>
## 1 Moroccan 1           PAST     1. Para …                      4         4 Moro…
## 2 Moroccan 1           PAST     2. Los j…                      4         4 Moro…
## 3 Moroccan 1           PAST     3. Creo …                      5         5 Moro…
## 4 Moroccan 1           PAST     4. La ju…                      2         2 Moro…
## 5 Moroccan 1           PAST     5. Los a…                      4         4 Moro…
## 6 Moroccan 1           PAST     6. El mo…                      3         3 Moro…
## # ℹ abbreviated name:
## #   ¹`Agreement (0=complete disagreement; 5=complete agreement)`
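A mixed ANOVA needs ratings from every participant on both subscales, so it can help to count the non-missing ratings in each participant-by-subscale cell before modelling. The following is a minimal sketch in base R, assuming a long-format data frame shaped like `d_tidy`. The `toy` data frame, its values, and the `expected_n` threshold are made up for illustration and are not the study data.

```r
# Hypothetical long-format data (NOT the study data): two participants,
# two ratings per subscale, one rating missing.
toy <- data.frame(
  id        = c("Moroccan.1", "Moroccan.1", "Moroccan.1", "Moroccan.1",
                "Moroccan.2", "Moroccan.2", "Moroccan.2", "Moroccan.2"),
  subscale  = c("PAST", "PAST", "FUTURE", "FUTURE",
                "PAST", "PAST", "FUTURE", "FUTURE"),
  agreement = c(4, 5, 2, 3, 4, NA, 3, 3)
)

# Assumed number of ratings per cell in this toy example
expected_n <- 2

# Count non-missing ratings per participant x subscale cell
n_ratings <- with(toy, tapply(!is.na(agreement), list(id, subscale), sum))

# Flag participants with any empty or short cell
short_cell <- is.na(n_ratings) | n_ratings < expected_n
incomplete <- rownames(n_ratings)[apply(short_cell, 1, any)]

n_ratings
incomplete
```

With the real data, the same idea would apply to `d_tidy$id`, `d_tidy$subscale`, and `d_tidy$agreement`, with `expected_n` set to the number of items in each subscale.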

Step 4: Run analysis

Pre-processing

# standard error of the mean, ignoring missing values
se <- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))

d_subject <- d_tidy %>%
  group_by(group, participant, subscale) %>%
  summarise(mean_agreement = mean(agreement),
            se = se(agreement),
            .groups = "drop")

d_group <- d_tidy %>%
  group_by(group, subscale) %>%
  summarise(mean_agreement = mean(agreement),
            se = se(agreement),
            .groups = "drop")
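Note that `d_group` is summarised directly from `d_tidy`, so its `se` is computed over every individual item rating in a group rather than over participants. If the error bars are meant to show the standard error across participants (an assumption; the source does not say which was used), the group summary would instead be computed from participant means. The sketch below contrasts the two on made-up data; `toy` and its values are hypothetical.

```r
# Same helper as in the report
se <- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))

# Hypothetical ratings (NOT the study data): 3 participants, 4 items each
toy <- data.frame(
  participant = rep(c("p1", "p2", "p3"), each = 4),
  agreement   = c(4, 4, 5, 3,  2, 3, 2, 1,  5, 5, 4, 4)
)

# (a) SE over all 12 item-level ratings, as d_group does
se_items <- se(toy$agreement)

# (b) SE over the 3 participant means
subject_means <- tapply(toy$agreement, toy$participant, mean)
se_subjects <- se(subject_means)

c(se_items = se_items, se_subjects = se_subjects)
```

In this balanced toy example the group mean is the same either way; only the standard error changes, because approach (b) treats participants, not items, as the unit of observation.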

Descriptive statistics

Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):

knitr::kable(d_group, digits = 2)

group	subscale	mean_agreement	se
young Spaniard	PAST	2.67	0.06
young Spaniard	FUTURE	3.49	0.06
Moroccan	PAST	3.29	0.07
Moroccan	FUTURE	3.12	0.07

ggplot(d_group, aes(x = group, y = mean_agreement, fill = subscale)) +
  geom_col(position = position_dodge(width = 0.6), width = 0.6) +
  geom_errorbar(aes(ymin = mean_agreement - se, ymax = mean_agreement + se), 
                width = 0.15,
                position = position_dodge(width = 0.6)) +
  scale_x_discrete(labels = c("Moroccan" = "Moroccans", "young Spaniard" = "Spaniards")) +
  scale_fill_manual(values = c("grey40", "grey80"),
                    labels = c("FUTURE" = "Future-Focused Statements", "PAST" = "Past-Focused Statements")) + 
  coord_cartesian(ylim = c(2, 4)) +
  labs(x = "Group", y = "Rating", fill = "Temporal focus") +
  theme_minimal() +
  theme(legend.position = "top", legend.direction = "horizontal", panel.grid.minor = element_blank())

Inferential statistics

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2).

# reproduce the above results here

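# named anova_afex so that the object does not mask stats::anova()
anova_afex <- aov_ez(id = "id",
                     dv = "agreement",
                     between = "group",
                     within = "subscale",
                     data = d_tidy,
                     fun_aggregate = mean) # average the items within each participant x subscale cell

anova_afex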

## Anova Table (Type 3 tests)
## 
## Response: agreement
##           Effect    df  MSE         F  ges p.value
## 1          group 1, 78 0.21    2.88 + .011    .094
## 2       subscale 1, 78 0.51   8.10 ** .069    .006
## 3 group:subscale 1, 78 0.51 19.14 *** .148   <.001
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1

anova_results <- ezANOVA(data = d_tidy,
                         dv = agreement,
                         wid = id,
                         within = subscale,
                         between = group,
                         detailed = TRUE,
                         type = 3)

anova_results

## $ANOVA
##           Effect DFn DFd          SSn      SSd           F            p p<.05
## 1    (Intercept)   1  78 1582.7355501 16.34106 7554.795449 2.107189e-79     *
## 2          group   1  78    0.6035956 16.34106    2.881114 9.361091e-02      
## 3       subscale   1  78    4.1514592 39.98697    8.097983 5.659022e-03     *
## 4 group:subscale   1  78    9.8145046 39.98697   19.144520 3.712656e-05     *
##          ges
## 1 0.96563402
## 2 0.01060211
## 3 0.06864243
## 4 0.14838416

partial_eta2 <- anova_results$ANOVA
partial_eta2 <- partial_eta2 %>% mutate(partial_eta2 = SSn / (SSn + SSd))
partial_eta2

##           Effect DFn DFd          SSn      SSd           F            p p<.05
## 1    (Intercept)   1  78 1582.7355501 16.34106 7554.795449 2.107189e-79     *
## 2          group   1  78    0.6035956 16.34106    2.881114 9.361091e-02      
## 3       subscale   1  78    4.1514592 39.98697    8.097983 5.659022e-03     *
## 4 group:subscale   1  78    9.8145046 39.98697   19.144520 3.712656e-05     *
##          ges partial_eta2
## 1 0.96563402   0.98978094
## 2 0.01060211   0.03562159
## 3 0.06864243   0.09405544
## 4 0.14838416   0.19707257
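As a quick arithmetic check, partial eta squared for the interaction can be computed in two equivalent ways: from the sums of squares, SSn / (SSn + SSd), or from the F statistic and its degrees of freedom, (F × DFn) / (F × DFn + DFd). The sketch below uses the values printed in the `group:subscale` row of the `ezANOVA` output above; the variable names are illustrative.

```r
# Values copied from the group:subscale row of the ezANOVA output
ss_effect <- 9.8145046   # SSn
ss_error  <- 39.98697    # SSd
f_value   <- 19.144520   # F
df_effect <- 1           # DFn
df_error  <- 78          # DFd

# Partial eta squared from sums of squares
eta_from_ss <- ss_effect / (ss_effect + ss_error)

# Partial eta squared from F and its degrees of freedom
eta_from_f <- (f_value * df_effect) / (f_value * df_effect + df_error)

round(c(from_ss = eta_from_ss, from_f = eta_from_f), 3)
```

Both routes agree with the `partial_eta2` column to the printed precision, and the result rounds to the reported ηp2 = .20.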

Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,

# reproduce the above results here

past <- filter(d_subject, subscale == "PAST")
t.test(mean_agreement ~ relevel(group, ref = "Moroccan"), data = past, var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  mean_agreement by relevel(group, ref = "Moroccan")
## t = 4.0034, df = 78, p-value = 0.0001413
## alternative hypothesis: true difference in means between group Moroccan and group young Spaniard is not equal to 0
## 95 percent confidence interval:
##  0.3107666 0.9255970
## sample estimates:
##       mean in group Moroccan mean in group young Spaniard 
##                     3.293182                     2.675000

and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685)

# reproduce the above results here

future <- filter(d_subject, subscale == "FUTURE")
t.test(mean_agreement ~ relevel(group, ref = "Moroccan"), data = future, var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  mean_agreement by relevel(group, ref = "Moroccan")
## t = -3.3637, df = 78, p-value = 0.001195
## alternative hypothesis: true difference in means between group Moroccan and group young Spaniard is not equal to 0
## 95 percent confidence interval:
##  -0.5929718 -0.1520282
## sample estimates:
##       mean in group Moroccan mean in group young Spaniard 
##                       3.1200                       3.4925

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

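Yes. After switching to the corrected data file, I was able to reproduce all of the target results, with only small numerical differences from the reported values: F(1, 78) = 19.14 versus the reported 19.12 for the interaction (ηp2 = .197, which rounds to the reported .20), t(78) = 4.00 versus 4.04 for past-focused statements, and t(78) = −3.36 versus −3.32 for future-focused statements.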
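To make the size of these differences concrete, the reported and reproduced statistics can be placed side by side. The sketch below uses the reported values quoted from the paper and the reproduced values from the output above; the `comparison` data frame and its column names are just one illustrative way to lay this out.

```r
# Reported values are quoted from the paper; reproduced values are copied
# from the ezANOVA and t.test output in this report.
comparison <- data.frame(
  statistic  = c("F (interaction)", "partial eta squared",
                 "t (PAST)", "t (FUTURE)"),
  reported   = c(19.12, 0.20, 4.04, -3.32),
  reproduced = c(19.144520, 0.19707257, 4.0034, -3.3637)
)

# Absolute and percentage difference relative to the reported value
comparison$abs_diff <- abs(comparison$reproduced - comparison$reported)
comparison$pct_diff <- 100 * comparison$abs_diff / abs(comparison$reported)

comparison
```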

How difficult was it to reproduce your results?

It was extremely difficult, mainly because of the data file I started with. I initially used the original raw data and did not notice that the folder also contained a corrected data set. In the original file, observations were missing for participant 25, which kept causing errors in the ANOVA. I was able to fix those errors manually, but the fix gave me a different F statistic. It was also difficult to set up the figure so that it matched the one in the original paper as closely as possible.

What aspects made it difficult? What aspects made it easy?

Starting from the original raw data was the main difficulty: as mentioned above, it led to a lot of debugging of the ANOVA and to statistics that did not match the paper. The documentation of the methods and the types of analyses that were run made it easier, as did details such as the sample size and degrees of freedom.

Reproducibility Report: Group B Choice 3