For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.
The researchers tested whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N = 40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics.
In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement.
Analyses
The authors then performed a mixed-design ANOVA with group (Moroccan or Spanish) as the between-subjects factor and temporal focus (past or future) as the within-subjects factor.
In addition, the authors performed unpaired two-sample t-tests to determine whether there was a significant difference between the two groups in agreement scores for PAST questions, and whether there was a significant difference in scores for FUTURE questions.
Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):
According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685).
library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
# #optional packages/functions:
library(afex) # anova functions
library(ez) # anova functions 2
library(scales) # for plotting
# std.err <- function(x) sd(x)/sqrt(length(x)) # standard error
# Just Experiment 2
data_path <- 'data/DeLaFuenteEtAl_2014_RawData.xls'
d <- read_excel(data_path, sheet=3)
View(d)
# ifelse() takes a condition and returns one value where it is TRUE and another where it is FALSE.
# Here it shifts the participant IDs of the non-Moroccan group by 40 so that IDs do not overlap across the two groups.
data <- d |>
  # mutate() creates a new column or overwrites an existing one; here it overwrites participant
  mutate(participant = ifelse(group == "Moroccan", participant, participant + 40))
View(data)
Notes for myself: duplicated() identifies duplicated elements, and unique() extracts the unique elements. The dataset contains many duplicated items for the same participant, so these need to be removed before the analysis.
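To make the note above concrete, here is a minimal base R sketch of how duplicated() and unique() behave. The data frame `toy` and its values are made up for illustration and are not taken from the paper's dataset.

```r
# Toy data (made-up values): participant 1 answered item 2 twice.
toy <- data.frame(
  participant = c(1, 1, 1, 2, 2),
  item        = c(1, 2, 2, 1, 2),
  rating      = c(4, 3, 5, 2, 2)
)

# duplicated() flags every row whose key columns repeat an earlier row.
# The rating column is left out of the key on purpose.
is_dup <- duplicated(toy[, c("participant", "item")])
print(is_dup)

# Keep only the first occurrence of each participant/item combination.
toy_dedup <- toy[!is_dup, ]
print(toy_dedup)

# unique() returns each distinct value once.
print(unique(toy$participant))
```

Only the second copy of a repeated combination is flagged, so the first response is the one that is kept.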
# Goal: find the duplicated rows and see what the authors did to the data before running their analyses.
# The agreement rating is left out of the duplicate check because many different rows legitimately share the same rating.
duplicates <- data %>% # the pipe passes the result on its left to the function on its right
  arrange(group, participant, subscale, item) %>% # sort the rows by group, then participant, subscale, and item
  group_by(group, participant, subscale, item) %>% # form one group per unique combination of these four columns
  filter(n() > 1) # keep only rows whose combination occurs more than once, i.e. the duplicates
View(duplicates)
# Remove the duplicated rows.
data_drop <- data %>%
  arrange(group, participant, item) %>%
  rename(rating = "Agreement (0=complete disagreement; 5=complete agreement)") %>% # give the long column a short name that is easy to reference in code
  distinct(group, participant, subscale, item, .keep_all = TRUE) # keep the first row for each unique combination, along with all other columns
View(data_drop)
Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):
# First compute the mean agreement with the past- and future-focused statements for each group.
datasummary <- data_drop |>
  group_by(group, subscale) |>
  summarise(meanRating = mean(rating, na.rm = TRUE),
            n = sum(!is.na(rating)), # number of non-missing ratings
            seRating = sd(rating, na.rm = TRUE) / sqrt(n), # standard error of the mean, not the standard deviation
            .groups = "drop")
View(datasummary)
# Plot the group means as a dodged bar chart with error bars of +/- 1 standard error.
ggplot(datasummary, aes(x = group, y = meanRating, fill = subscale)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_errorbar(aes(ymin = meanRating - seRating, ymax = meanRating + seRating), width = 0.2, position = position_dodge(.9)) +
  scale_fill_brewer(palette = "Set1") +
  coord_cartesian(ylim = c(2, 4))
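The summary above treats every individual rating as an independent observation. A possible alternative, which is my assumption and not something the paper's method section is quoted as confirming here, is to average each participant's ratings per subscale first and then compute the group mean and standard error across participants. A minimal base R sketch on made-up data (the data frame `toy` and the helper `std_err` are hypothetical):

```r
# Toy data (made-up values): 4 participants, 2 PAST and 2 FUTURE items each.
toy <- data.frame(
  participant = rep(1:4, each = 4),
  group       = rep(c("A", "B"), each = 8),
  subscale    = rep(c("PAST", "PAST", "FUTURE", "FUTURE"), times = 4),
  rating      = c(4, 5, 2, 3,  3, 4, 2, 2,  2, 3, 4, 5,  1, 2, 4, 4)
)

# Step 1: one mean rating per participant and subscale.
per_person <- aggregate(rating ~ participant + group + subscale,
                        data = toy, FUN = mean)

# Step 2: group-level mean and standard error across participants.
std_err <- function(x) sd(x) / sqrt(length(x))
group_mean <- aggregate(rating ~ group + subscale, data = per_person, FUN = mean)
group_se   <- aggregate(rating ~ group + subscale, data = per_person, FUN = std_err)

print(group_mean)
print(group_se)
```

With this approach the sample size behind each error bar is the number of participants in the group, not the number of ratings.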
## Inferential statistics
According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2).
# reproduce the above results here
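One way the mixed ANOVA could be set up is sketched below with base R's aov(). This is an illustration on simulated ratings, not a reproduction of the paper's result: the data frame `toy`, its group labels, and its values are made up, and the sketch assumes the ratings have already been averaged to one value per participant and subscale.

```r
# Toy data with simulated ratings: 20 hypothetical participants,
# one mean rating per participant and subscale.
set.seed(1)
n_per_group <- 10
toy <- expand.grid(participant = 1:(2 * n_per_group),
                   subscale = c("PAST", "FUTURE"))
toy$group <- factor(ifelse(toy$participant <= n_per_group, "A", "B"))
toy$participant <- factor(toy$participant)
toy$rating <- rnorm(nrow(toy), mean = 3, sd = 0.5)

# group varies between participants; subscale varies within participants.
fit <- aov(rating ~ group * subscale + Error(participant / subscale), data = toy)
anova_tables <- summary(fit)
print(anova_tables)

# Partial eta squared for the interaction: SS_effect / (SS_effect + SS_error).
within_tab <- anova_tables[["Error: participant:subscale"]][[1]]
term_names <- trimws(rownames(within_tab))
ss_interaction <- within_tab[term_names == "group:subscale", "Sum Sq"]
ss_error <- within_tab[term_names == "Residuals", "Sum Sq"]
eta_p2 <- ss_interaction / (ss_interaction + ss_error)
print(eta_p2)
```

With 20 toy participants the within-subjects error term has 18 degrees of freedom. With the 80 participants described in the paper, the same design would give the reported F(1, 78).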
Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,
# reproduce the above results here
and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685)
# reproduce the above results here
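Both reported t-tests have 78 degrees of freedom, which with two groups of 40 equals 40 + 40 − 2. That corresponds to a pooled-variance (Student's) t-test, whereas R's t.test() defaults to the Welch test. A minimal sketch on simulated per-participant means follows; the vectors `group_a` and `group_b` are made up and stand in for one subscale's means in each group.

```r
# Simulated per-participant mean ratings for one subscale (made-up values).
set.seed(2)
group_a <- rnorm(40, mean = 3.3, sd = 0.4)
group_b <- rnorm(40, mean = 2.9, sd = 0.4)

# var.equal = TRUE requests the pooled-variance test with n1 + n2 - 2 df.
pooled <- t.test(group_a, group_b, var.equal = TRUE)
print(pooled)

# The default (Welch) test generally yields fractional df of at most 78 here.
welch <- t.test(group_a, group_b)
print(welch$parameter)
```

The same call would be run once on the PAST means and once on the FUTURE means, with the Moroccan group passed first to match the sign of the reported t values.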
Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?
No, I was not able to finish the entire assignment. I only got as far as making the bar graph.
How difficult was it to reproduce your results?
Very difficult.
What aspects made it difficult? What aspects made it easy?
At first I did not understand what was wrong with the dataset, so I had to work that out, and I did not know what to search for while I was working on it.