For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.
The researchers tested whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N = 40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics. In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement. The authors then performed a mixed-design ANOVA with agreement score as the dependent variable, group (Moroccan or Spanish) as the between-subjects factor, and temporal focus (past or future) as the within-subjects factor. In addition, the authors performed unpaired two-sample t-tests to determine whether the two groups differed significantly in agreement scores for PAST questions, and whether they differed significantly in agreement scores for FUTURE questions.
Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):
According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685).
library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
# #optional packages/functions:
# library(afex) # anova functions
# library(ez) # anova functions 2
# library(scales) # for plotting
# std.err <- function(x) sd(x)/sqrt(length(x)) # standard error
# Just Experiment 2
data_path <- 'data/DeLaFuenteEtAl_2014_RawData.xls'
d <- read_excel(data_path, sheet=3)
head(d)
## # A tibble: 6 x 5
## group participant subscale item `Agreement (0=complete di…
## <chr> <dbl> <chr> <chr> <dbl>
## 1 Moroc… 1 PAST 1. Para mí son muy imp… 4
## 2 Moroc… 1 PAST 2. Los jóvenes deben c… 4
## 3 Moroc… 1 PAST 3. Creo que las person… 5
## 4 Moroc… 1 PAST 4. La juventud de hoy … 2
## 5 Moroc… 1 PAST 5. Los ancianos saben … 4
## 6 Moroc… 1 PAST 6. El modo correcto de… 3
length(d)
## [1] 5
filterd_d <- select(d, -item)
colnames(filterd_d)
## [1] "group"
## [2] "participant"
## [3] "subscale"
## [4] "Agreement (0=complete disagreement; 5=complete agreement)"
names(filterd_d)[4] <- "Agreement"
#summarize
sorted_d <- filterd_d %>%
group_by(participant, group, subscale)%>%
summarise(Agreement=mean(Agreement))
sorted_d
## # A tibble: 158 x 4
## # Groups: participant, group [80]
## participant group subscale Agreement
## <dbl> <chr> <chr> <dbl>
## 1 1 Moroccan FUTURE 3.3
## 2 1 Moroccan PAST 3.36
## 3 1 young Spaniard FUTURE 3.3
## 4 1 young Spaniard PAST 2.55
## 5 2 Moroccan FUTURE 3.2
## 6 2 Moroccan PAST 3.82
## 7 2 young Spaniard FUTURE 3.6
## 8 2 young Spaniard PAST 3.91
## 9 3 Moroccan FUTURE 3.2
## 10 3 Moroccan PAST 3.18
## # … with 148 more rows
length(sorted_d$participant) #check data frame length
## [1] 158
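With 80 participants and two subscales each, 160 rows would be expected here, so 158 suggests that some participant-by-group combinations are missing a subscale. A minimal sketch of how such cases could be located, using a small made-up data frame (`toy_means` and its values are hypothetical stand-ins for `sorted_d`):

```r
library(dplyr)

# Hypothetical stand-in for the per-participant means:
# Moroccan participant 2 has no PAST row.
toy_means <- data.frame(
  participant = c(1, 1, 2, 1, 1),
  group = c("Moroccan", "Moroccan", "Moroccan",
            "young Spaniard", "young Spaniard"),
  subscale = c("FUTURE", "PAST", "FUTURE", "FUTURE", "PAST"),
  Agreement = c(3.3, 3.4, 3.2, 3.3, 2.5)
)

# Count subscales per participant within each group and keep
# only the combinations that do not have exactly two.
incomplete <- toy_means %>%
  ungroup() %>%
  count(participant, group, name = "n_subscales") %>%
  filter(n_subscales != 2)

incomplete
```

Applied to the real data, the same pattern would list the participants whose PAST or FUTURE mean is absent before the table is reshaped.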
#make data wide
tpf_long <- sorted_d %>%
pivot_wider(names_from = "subscale",
values_from = "Agreement")
#check sample size
length(tpf_long$participant) #equals 80 that's good
## [1] 80
#re-sort by condition
tpf_long_2 <- arrange(tpf_long, group)
names(tpf_long_2)[2] <- "Country"
names(tpf_long_2)[3] <- "Future"
names(tpf_long_2)[4] <- "Past"
tpf_long_2
## # A tibble: 80 x 4
## # Groups: participant, Country [80]
## participant Country Future Past
## <dbl> <chr> <dbl> <dbl>
## 1 1 Moroccan 3.3 3.36
## 2 2 Moroccan 3.2 3.82
## 3 3 Moroccan 3.2 3.18
## 4 4 Moroccan 4 3.82
## 5 5 Moroccan 2.9 3.27
## 6 6 Moroccan 3.2 2.27
## 7 7 Moroccan 3.3 4.09
## 8 8 Moroccan 4.3 1.55
## 9 9 Moroccan 3 4
## 10 10 Moroccan 2.7 3.27
## # … with 70 more rows
#compute summary stats
tpf_summ <- tpf_long_2 %>%
group_by(Country) %>%
summarise(FutureMean = mean(Future), PastMean = mean(Past))
tpf_summ
## # A tibble: 2 x 3
## Country FutureMean PastMean
## <chr> <dbl> <dbl>
## 1 Moroccan 3.14 NA
## 2 young Spaniard 3.49 NA
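The `NA` values in `PastMean` are what `mean()` returns when any of its inputs is missing. One plausible cause is that at least one participant in each group has no `Past` score after reshaping, or that some item-level ratings are missing in the raw file. The sketch below, on made-up data (`toy_wide` is hypothetical), shows how `na.rm = TRUE` lets the summary go through while also reporting how many scores each mean is based on. Whether the original authors handled missing responses this way is not stated, so this is an assumption.

```r
library(dplyr)

# Hypothetical wide-format scores with one missing Past value per group
toy_wide <- data.frame(
  participant = c(1, 2, 3, 1, 2, 3),
  Country = rep(c("Moroccan", "young Spaniard"), each = 3),
  Future = c(3.3, 3.2, 3.2, 3.3, 3.6, 3.4),
  Past = c(3.4, NA, 3.2, 2.5, 3.9, NA)
)

toy_summ <- toy_wide %>%
  ungroup() %>%                 # drop any leftover grouping first
  group_by(Country) %>%
  summarise(
    FutureMean = mean(Future, na.rm = TRUE),
    PastMean = mean(Past, na.rm = TRUE),  # ignore missing scores
    n_past = sum(!is.na(Past))            # scores behind PastMean
  )

toy_summ
```

Keeping the count next to each mean makes it visible when a summary rests on fewer participants than expected, which matters here because the reported degrees of freedom imply all 80 participants were analyzed.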
Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):
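A possible starting point for the figure, assuming it is a grouped bar chart of mean agreement per group and temporal focus with standard-error bars. The numbers in `toy_cells` are invented for illustration only and are not the study's results; the real cell means and standard errors would have to be computed from the data first.

```r
library(ggplot2)

# Hypothetical cell means and standard errors (not the real results)
toy_cells <- data.frame(
  Country = rep(c("Moroccan", "young Spaniard"), each = 2),
  Focus = rep(c("Past", "Future"), times = 2),
  Mean = c(3.3, 3.1, 2.7, 3.5),
  SE = c(0.10, 0.08, 0.09, 0.07)
)

# Side-by-side bars per group, with error bars of +/- 1 SE
fig2_sketch <- ggplot(toy_cells, aes(x = Country, y = Mean, fill = Focus)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = Mean - SE, ymax = Mean + SE),
                position = position_dodge(width = 0.9), width = 0.2) +
  labs(x = "Group", y = "Mean agreement rating", fill = "Temporal focus")

fig2_sketch
```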
According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2).
# reproduce the above results here
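One way the mixed ANOVA could be approached is with base R's `aov()` and an `Error()` term for the repeated measure. The sketch below uses invented long-format data (`toy_long` is hypothetical), so its output will not match the paper. Note that the raw data reuse participant numbers across the two groups, so a unique identifier would be needed first, for example `interaction(Country, participant)`; that step is an assumption about how to prepare the real data.

```r
# Hypothetical long-format data: 4 participants per group, 2 scores each
toy_long <- data.frame(
  id = factor(rep(1:8, each = 2)),   # unique across both groups
  Country = factor(rep(c("Moroccan", "young Spaniard"), each = 8)),
  Focus = factor(rep(c("Past", "Future"), times = 8)),
  Agreement = c(3.6, 3.0, 3.9, 3.2, 3.3, 3.1, 3.8, 2.9,
                2.6, 3.5, 2.9, 3.7, 2.5, 3.4, 3.0, 3.3)
)

# Country is between-subjects; Focus is within-subjects (nested in id)
fit <- aov(Agreement ~ Country * Focus + Error(id / Focus), data = toy_long)
summary(fit)

# Within-subjects stratum: rows are Focus, Country:Focus, Residuals
within_tab <- summary(fit)[[2]][[1]]
ss <- within_tab[["Sum Sq"]]

# Partial eta squared for the interaction: SS_effect / (SS_effect + SS_error)
eta_p2 <- ss[2] / (ss[2] + ss[3])
eta_p2
```

With the full data set of 80 participants, the interaction would be tested on 1 and 78 degrees of freedom, which is a quick check that the model is specified as in the paper.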
Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,
# reproduce the above results here
and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685)
# reproduce the above results here
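Both group comparisons could be sketched with `t.test()`. The reported 78 degrees of freedom equal 40 + 40 − 2, which corresponds to the pooled-variance (Student) test, so `var.equal = TRUE` is set here; R's default is the Welch test, which usually gives non-integer degrees of freedom. The data in `toy_scores` are invented for illustration.

```r
# Hypothetical per-participant scores, 5 participants per group
toy_scores <- data.frame(
  Country = rep(c("Moroccan", "young Spaniard"), each = 5),
  Past = c(3.4, 3.8, 3.2, 3.9, 3.3, 2.5, 3.0, 2.7, 3.1, 2.6),
  Future = c(3.3, 3.2, 3.0, 3.1, 2.9, 3.3, 3.6, 3.7, 3.4, 3.8)
)

# The formula interface compares the first group level with the second,
# so a positive t means Moroccan > young Spaniard.
past_test <- t.test(Past ~ Country, data = toy_scores, var.equal = TRUE)
future_test <- t.test(Future ~ Country, data = toy_scores, var.equal = TRUE)

past_test
future_test
```

Because the groups are ordered alphabetically, the sign convention matches the paper: a positive t for past-focused items and a negative t for future-focused items would indicate the reported pattern.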
Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?
No. I was unable to complete the reproduction within 3 hours. Figuring out the code for tidying the data was a steep learning curve, and although I could have kept going, it would have taken more time than I had. I got stuck at the pre-processing stage: I couldn’t generate a table of summary statistics, because the past-focused means came out as NA. I think it’s because I couldn’t group the table with the ‘participant’ column still there, and I couldn’t figure out a way around it.
How difficult was it to reproduce your results?
I think the hardest part is the data wrangling.
What aspects made it difficult? What aspects made it easy?
Since I am a beginner, I had to look things up often, which took a lot of time. I was also often unsure whether I was doing the right thing. I definitely need to practice this more when I have more time.