For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Researchers tested whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N = 40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics. In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement. The authors then performed a mixed-design ANOVA with agreement score as the dependent variable, group (Moroccan or Spanish) as the between-subjects factor, and temporal focus (past or future) as the within-subjects factor. In addition, the authors performed unpaired two-sample t-tests to determine whether the two groups differed significantly in agreement scores for PAST questions, and whether they differed significantly in scores for FUTURE questions.

Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001 (de la Fuente et al., 2014, p. 1685).

Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
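library(Rmisc) # summarySE(); attaches plyr, which masks some dplyr verbs
library(purrr) # map_df() (already attached by tidyverse)
library(broom) # tidy() for test output
library(car) # Anova() and related functions
library(psych) # descriptive statistics helpers
library(rstatix) # anova_test(), get_summary_stats()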

# #optional packages/functions:
# library(afex) # anova functions
# library(ez) # anova functions 2
# library(scales) # for plotting
# std.err <- function(x) sd(x)/sqrt(length(x)) # standard error

Step 2: Load data

# Just Experiment 2
data_path <- 'data/DeLaFuenteEtAl_2014_RawData_jbyun.xls'
d <- read_excel(data_path, sheet=3)

Step 3: Tidy data & Pre-processing

str(d)

## tibble [1,680 × 5] (S3: tbl_df/tbl/data.frame)
##  $ group                                                    : chr [1:1680] "Moroccan" "Moroccan" "Moroccan" "Moroccan" ...
##  $ participant                                              : num [1:1680] 1 1 1 1 1 1 1 1 1 1 ...
##  $ subscale                                                 : chr [1:1680] "PAST" "PAST" "PAST" "PAST" ...
##  $ item                                                     : chr [1:1680] "1. Para mí son muy importantes las tradiciones y las antiguas costumbres" "2. Los jóvenes deben conservar las tradiciones" "3. Creo que las personas eran más felices hace unas décadas que en la actualidad" "4. La juventud de hoy en día necesita mantener los valores de sus padres y sus abuelos" ...
##  $ Agreement (0=complete disagreement; 5=complete agreement): num [1:1680] 4 4 5 2 4 3 4 2 2 3 ...

colnames(d) <- c('group', 'id', 'subscale', 'item', 'agreement')
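Because the reflection at the end of this report mentions a miscoded participant number, a completeness check at this point may be useful. The following is only a minimal sketch on a hypothetical toy data frame (`toy`, not the real data): it counts rows per group-by-participant cell, which in this design should equal the 21 questionnaire items, and builds a participant key (`uid`, a name assumed here) that stays unique even when ID numbers are reused across groups.

```r
# Toy long-format data (hypothetical): 2 groups x 3 participants x 21 items,
# with participant numbers reused across groups.
toy <- expand.grid(item = 1:21, id = 1:3,
                   group = c("Moroccan", "young Spaniard"),
                   stringsAsFactors = FALSE)

# Simulate a coding error: one Spanish participant's rows carry the wrong ID.
toy$id[toy$group == "young Spaniard" & toy$id == 3] <- 2

# Rows per group x participant; every cell should contain exactly 21 rows.
counts <- as.data.frame(table(group = toy$group, id = toy$id),
                        stringsAsFactors = FALSE)
problems <- counts[counts$Freq != 21, ]
print(problems)

# A key that is unique across groups even when ID numbers repeat.
toy$uid <- interaction(toy$group, toy$id, drop = TRUE)
nlevels(toy$uid)
```

In the toy data the check flags two cells: one participant with 42 rows and one with none, which is the pattern a duplicated participant number would produce.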

#d_past <- d %>%
#  filter(subscale == "PAST")

#d_future <- d %>%
#  filter(subscale == "FUTURE")

df <- pivot_wider(data = d, names_from = item, values_from = agreement)

#df_past <- pivot_wider(data = d_past, names_from = item, values_from = 'agreement')
#colnames(df_past) <- c('group', 'id', 'subscale', 'Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6', 'Q7', 'Q8', 'Q9', 'Q10', 'Q11')

#df_future <- pivot_wider(data = d_future, names_from = item, values_from = 'agreement')
#colnames(df_future) <- c('group', 'id', 'subscale', 'Q12', 'Q13', 'Q14', 'Q15', 'Q16', 'Q17', 'Q18', 'Q19', 'Q20', 'Q21')

# rename the 21 item columns to Q1..Q21
colnames(df) <- c('group', 'id', 'subscale', paste0('Q', 1:21))

df <- df %>%
  mutate(Moroccan = ifelse(group == "Moroccan", 1, 0)) %>%
  mutate(Moroccan = as.factor(Moroccan)) %>%
  mutate(Future = ifelse(subscale == "FUTURE", 1, 0)) %>%
  mutate(Future = as.factor(Future)) %>%
  mutate(group = ifelse(group == "young Spaniard", "Spaniard", "Moroccan")) %>%
  mutate(group = as.factor(group)) %>%
  mutate(subscale = as.factor(subscale)) %>%
  mutate(id = as.factor(id))

# get average agreement score per participant and subscale
# (items from the other subscale are NA in each row, hence na.rm = TRUE)
df <- df %>%
  mutate(avg_agreement = rowMeans(df[ , paste0('Q', 1:21)], na.rm = TRUE))

col_order <- c('id', 'Moroccan', 'Future', 'avg_agreement', 'group', 'subscale', paste0('Q', 1:21))
df <- df[, col_order]

col_short <- c('id', 'Moroccan', 'Future', 'avg_agreement', 'group', 'subscale')
df_tidy <- df[, col_short]
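The per-participant subscale means can also be computed directly from the long-format data, without first pivoting to wide format. The sketch below uses base R's `aggregate()` on a small hypothetical data frame (`toy`, with made-up ratings); it is an alternative illustration, not the procedure used above.

```r
# Toy long-format data (hypothetical values): one participant per group,
# two items per subscale.
toy <- data.frame(
  group     = rep(c("Moroccan", "Spaniard"), each = 4),
  id        = rep(1, 8),
  subscale  = rep(c("PAST", "PAST", "FUTURE", "FUTURE"), times = 2),
  agreement = c(4, 2, 3, 5, 1, 3, 4, 4)
)

# Mean agreement per group x participant x subscale.
# The formula interface drops rows with missing ratings by default.
means <- aggregate(agreement ~ group + id + subscale, data = toy, FUN = mean)
print(means)
```

Each row of `means` corresponds to one row of `df_tidy` in the analysis above (one participant on one subscale).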

Step 4: Run analysis

Pre-processing

Pre-processing was done while tidying up data.

Descriptive statistics

df_summ <- summarySE(df_tidy, measurevar = 'avg_agreement', groupvars = c('group', 'subscale'), na.rm = TRUE)
df_summ

##      group subscale  N avg_agreement        sd         se        ci
## 1 Moroccan   FUTURE 40      3.120000 0.5561774 0.08793937 0.1778742
## 2 Moroccan     PAST 40      3.293182 0.7311921 0.11561162 0.2338466
## 3 Spaniard   FUTURE 40      3.492500 0.4257045 0.06730980 0.1361469
## 4 Spaniard     PAST 40      2.675000 0.6473862 0.10236075 0.2070442

#kable(df_summ)
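The `se` and `ci` columns reported by `summarySE()` can be checked by hand: the standard error is the standard deviation divided by the square root of N, and the confidence-interval half-width is that standard error multiplied by the t quantile for N - 1 degrees of freedom. A short sketch using the Moroccan/FUTURE row from the table above:

```r
# Values copied from row 1 of the summarySE() output above.
n <- 40
s <- 0.5561774

se <- s / sqrt(n)                  # standard error of the mean
ci <- qt(0.975, df = n - 1) * se   # half-width of the 95% confidence interval

round(c(se = se, ci = ci), 4)
```

The plot below draws error bars of plus or minus one `se`, not the 95% interval.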

df_tidy %>%
  dplyr::group_by(group, subscale) %>%
  get_summary_stats(avg_agreement, type = "mean_sd")

## # A tibble: 4 × 6
##   group    subscale variable          n  mean    sd
##   <fct>    <fct>    <chr>         <dbl> <dbl> <dbl>
## 1 Moroccan FUTURE   avg_agreement    40  3.12 0.556
## 2 Moroccan PAST     avg_agreement    40  3.29 0.731
## 3 Spaniard FUTURE   avg_agreement    40  3.49 0.426
## 4 Spaniard PAST     avg_agreement    40  2.68 0.647

Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):

ggplot(data = df_summ, aes(y = avg_agreement, x = group, fill = subscale)) +
  geom_bar(position = position_dodge(), stat = 'identity') +
  geom_errorbar(aes(ymin = avg_agreement - se, ymax = avg_agreement + se),
                width = .2, position = position_dodge(.9)) +
  coord_cartesian(ylim = c(2.0, 4.0)) +
  scale_y_continuous(breaks = seq(2.00, 4.00, 0.25)) +
  theme(legend.direction = "vertical",
        legend.background = element_rect(fill = "transparent"),
        axis.line = element_line(),
        panel.grid = element_blank(), 
        panel.background = element_blank(),
        plot.title = element_text(hjust = 0.5)) +
  labs(x = "Group", y = "Rating", fill = "Temporal Focus")

Inferential statistics

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2).

# two-way mixed design ANOVA using the r base function
aov_mix <- aov(avg_agreement ~ Moroccan*Future + Error(id/Future), data = df_tidy)
summary(aov_mix)

## 
## Error: id
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 39  8.549  0.2192               
## 
## Error: id:Future
##           Df Sum Sq Mean Sq F value Pr(>F)  
## Future     1  4.151   4.151   6.974 0.0118 *
## Residuals 39 23.215   0.595                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: Within
##                 Df Sum Sq Mean Sq F value   Pr(>F)    
## Moroccan         1  0.604   0.604   1.917     0.17    
## Moroccan:Future  1  9.815   9.815  31.164 3.32e-07 ***
## Residuals       78 24.565   0.315                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
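In the output above, the `Error: id` stratum has only 39 residual degrees of freedom, which suggests that `id` takes 40 distinct values, i.e., that participant numbers repeat across the two groups. The following is a sketch, on simulated toy data, of how a participant key that is unique across groups (`uid`, a name assumed here) places the between-subjects factor in the subject stratum. It is not a rerun of the real analysis.

```r
set.seed(1)
n <- 10  # participants per group (toy value)

# Simulated scores: IDs 1..n are reused in both groups, as in the raw data.
toy <- expand.grid(id = 1:n,
                   group = c("Moroccan", "Spaniard"),
                   focus = c("PAST", "FUTURE"))
toy$score <- rnorm(nrow(toy), mean = 3, sd = 0.5)

# Participant key that is unique across groups.
toy$uid <- interaction(toy$group, toy$id, drop = TRUE)

# Mixed-design ANOVA: group between subjects, focus within subjects.
fit <- aov(score ~ group * focus + Error(uid / focus), data = toy)
summary(fit)
```

With the unique key, `group` is tested in the `uid` stratum and `group:focus` in the `uid:focus` stratum, each against 2n - 2 residual degrees of freedom (78 in the real design).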

# reproduce the above results here: two-way mixed ANOVA using rstatix package
mix_anova <- anova_test(data = df_tidy, dv = avg_agreement, wid = id, between = Moroccan, within = Future)
get_anova_table(mix_anova)

## ANOVA Table (type II tests)
## 
##            Effect DFn DFd      F        p p<.05   ges
## 1        Moroccan   1  78  2.881 9.40e-02       0.011
## 2          Future   1  78  8.098 6.00e-03     * 0.069
## 3 Moroccan:Future   1  78 19.145 3.71e-05     * 0.148

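This test (mixed ANOVA using the rstatix package) gives me results that are much closer to those from the original work: the interaction is F(1, 78) = 19.145, versus the reported F(1, 78) = 19.12. The base R aov() output differs most likely because its id stratum has only 39 residual degrees of freedom, which indicates that id takes just 40 distinct values, i.e., participant numbers are reused across the two groups. As a result, aov() treats group as varying within id and places the Moroccan terms in the Within stratum. Note also that rstatix reports generalized eta squared (ges = .148) by default, whereas the paper reports partial eta squared (ηp2 = .20).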
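The paper reports partial eta squared, which can be recovered from an F statistic and its degrees of freedom. The helper name `f_to_pes` below is made up for this sketch; the inputs are the interaction row of the rstatix table above.

```r
# Partial eta squared from an F statistic:
# eta_p^2 = (F * df1) / (F * df1 + df2)
f_to_pes <- function(f, df1, df2) (f * df1) / (f * df1 + df2)

pes <- f_to_pes(f = 19.145, df1 = 1, df2 = 78)
round(pes, 2)  # 0.2
```

This agrees with the reported ηp2 = .20. `anova_test()` also accepts `effect.size = "pes"` to report partial eta squared directly.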

Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,

# reproduce the above results here
df_past <- df_tidy %>%
  filter(Future == 0)

t1 <- t.test(avg_agreement ~ group, data = df_past, alternative = "two.sided", conf.level = 0.95)
t1

## 
##  Welch Two Sample t-test
## 
## data:  avg_agreement by group
## t = 4.0034, df = 76.872, p-value = 0.0001428
## alternative hypothesis: true difference in means between group Moroccan and group Spaniard is not equal to 0
## 95 percent confidence interval:
##  0.3106955 0.9256681
## sample estimates:
## mean in group Moroccan mean in group Spaniard 
##               3.293182               2.675000

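The test statistic (t = 4.00) is close to, but not identical to, the reported value (t = 4.04). Note that t.test() runs Welch's test by default (df = 76.87), whereas the reported df of 78 implies a pooled-variance (Student's) t-test. Because the two groups are the same size, the t statistic is the same under both tests; only the degrees of freedom, p-value, and confidence interval differ slightly.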
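To match the reported degrees of freedom, `t.test()` can be asked for the pooled-variance test with `var.equal = TRUE`. The sketch below uses simulated data (the means and SDs passed to `rnorm()` are arbitrary toy values), so its numbers are not the study's results. It only illustrates that, with equal group sizes, the Student and Welch statistics coincide while the degrees of freedom differ.

```r
set.seed(42)
n <- 40  # participants per group, as in the study

# Simulated scores (toy values, not the real data).
toy <- data.frame(
  group = rep(c("Moroccan", "Spaniard"), each = n),
  score = c(rnorm(n, mean = 3.3, sd = 0.7), rnorm(n, mean = 2.7, sd = 0.65))
)

student <- t.test(score ~ group, data = toy, var.equal = TRUE)  # pooled variance
welch   <- t.test(score ~ group, data = toy)                    # default: Welch

unname(student$parameter)  # 78
all.equal(unname(student$statistic), unname(welch$statistic))  # TRUE
```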

and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001 (de la Fuente et al., 2014, p. 1685).

# reproduce the above results here
df_future <- df_tidy %>%
  filter(Future == 1)

t2 <- t.test(avg_agreement ~ group, data = df_future, alternative = "two.sided", conf.level = 0.95)
t2

## 
##  Welch Two Sample t-test
## 
## data:  avg_agreement by group
## t = -3.3637, df = 73.02, p-value = 0.001228
## alternative hypothesis: true difference in means between group Moroccan and group Spaniard is not equal to 0
## 95 percent confidence interval:
##  -0.5932088 -0.1517912
## sample estimates:
## mean in group Moroccan mean in group Spaniard 
##                 3.1200                 3.4925

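The test statistic is similar here as well (t = −3.36 versus the reported t = −3.32). As above, this is Welch's test (df = 73.02) rather than the pooled-variance test implied by the reported df of 78.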
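Both t statistics can also be reconstructed from the descriptive statistics reported earlier. The function name `pooled_t` is hypothetical; the sketch assumes equal group sizes and a pooled-variance test with 2n - 2 degrees of freedom, and takes its inputs from the `summarySE()` table.

```r
# Pooled-variance t-test from summary statistics (equal n per group).
pooled_t <- function(m1, s1, m2, s2, n) {
  sp2 <- (s1^2 + s2^2) / 2   # pooled variance when both groups have n cases
  se  <- sqrt(sp2 * 2 / n)   # standard error of the mean difference
  t   <- (m1 - m2) / se
  df  <- 2 * n - 2
  c(t = t, df = df, p = 2 * pt(-abs(t), df))
}

# Moroccan vs. Spaniard, means and SDs from the descriptive table.
past   <- pooled_t(3.293182, 0.7311921, 2.675000, 0.6473862, n = 40)
future <- pooled_t(3.120000, 0.5561774, 3.492500, 0.4257045, n = 40)

round(past[["t"]], 2)    # 4 (i.e., 4.00)
round(future[["t"]], 2)  # -3.36
```

These match the statistics from `t.test()` above, so the small gap to the published values (4.04 and −3.32) does not come from the choice between Welch's and Student's test.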

tab <- map_df(list(t1, t2), tidy)
tab <- tab %>% add_column("group" = c("Past", "Future"))
tab <- tab %>% select(c("group", "estimate", "estimate1", "estimate2", "statistic", "p.value", "conf.low", "conf.high", "alternative"))
kable(tab, caption = "t-test results (Mean agreement by group) for Past-focused statements and Future-focused statements")

t-test results (Mean agreement by group) for Past-focused statements and Future-focused statements
group	estimate	estimate1	estimate2	statistic	p.value	conf.low	conf.high	alternative
Past	0.6181818	3.293182	2.6750	4.003398	0.0001428	0.3106955	0.9256681	two.sided
Future	-0.3725000	3.120000	3.4925	-3.363653	0.0012279	-0.5932088	-0.1517912	two.sided

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

For the most part, I was able to get similar numbers. However, I could not exactly reproduce the target results. I ran the mixed-design ANOVA with two different functions (base R's aov() and rstatix's anova_test()) and got different test statistics; only the rstatix results were close to the original.

How difficult was it to reproduce your results?

It was quite difficult. The dataset was not in a tidy format, so I had to tidy it first. There were also errors in the dataset, which forced me to make some judgment calls.

What aspects made it difficult? What aspects made it easy?

First, there were errors in the data (for participant 24). I was not sure what had happened, but it looked like a coding error. Instead of dropping the affected observations, I modified the data (the participant number, to be exact), which might have caused discrepancies between my results and the original authors’. Second, the dataset was not in a tidy format, so it took some time to clean up. What made it really difficult to reproduce the results was the lack of clarity about which statistical models were used; I had to guess what had been done in order to proceed. Having a codebook made it easier to understand the data. Having the data in long format also helped, because the structure was easy to read and understand.

Reproducibility Report: Group B Choice 3

Jiwon Byun