Experiment 1 of Tarampi et al. (2016) explored how stereotype threat contributes to gender differences in performance on perspective-taking tasks. The researchers found that women tended to perform better when the same task was framed as a social perspective-taking task rather than a spatial one. This finding aligns with my research interest in social cognition: it suggests that others’ expectations influence one’s performance. I wonder how it might translate into social learning and education strategies.
Two tests were used in the experiment: the spatial orientation test and the road-map test. Participants also filled out the Autism-Spectrum Quotient (AQ) questionnaire (Baron-Cohen et al., 2001). All the materials can be found on OSF. The original experiment was conducted in person, with participants tested individually or in same-sex groups of 2 to 8. Researchers first introduced perspective-taking ability, framing it either as a test of spatial ability or as a test of empathetic ability. Participants then took the spatial orientation test and the road-map test and filled out the AQ questionnaire, with researchers providing instructions at the start of each of the three tasks.
A foreseeable challenge will be crafting a reasonable schedule for recruiting participants, running the task in person, and coding responses. The original study had 139 undergraduate student participants. If I were to run it with around the same number of participants, the time needed to go from recruitment to analyzable data would be substantial. Possible solutions include focusing on only one type of task or running the experiment online.
The repository: https://github.com/psych251/tarampi2016_rescue
The prior replication attempt made four main changes compared to the original study.
The original study was conducted in person, and participants were tested individually or in same-sex groups of 2 to 8 participants. The prior replication attempt converted the study to an online format, and participants were tested individually. A concern the replication experimenter raised is whether state induction of gender bias was effective in the online environment.
Participants in the original study were 139 undergraduate students at UC Santa Barbara. The prior replication was done with 255 Mechanical Turk workers whose demographics weren’t specified in the report, so it is unclear how much the two samples differ in age and background.
The original study consisted of three parts: two timed pencil-and-paper tests of perspective-taking ability (the object-perspective/spatial orientation test (Hegarty & Waller, 2004) and the road-map test (Money et al., 1965; modified by Zacks et al., 2002)) and a questionnaire that aims to measure autistic traits in adults. The prior replication attempt kept only the road-map test in order to collect more data of interest. It also included two questions at the end of the experiment, one asking about the participant’s prior CS experience and the other asking about the participant’s assessment of their math ability.
The original paper did not describe any exclusion criteria. The experimenter of the replication attempt ran two sets of data analyses, one using a light exclusion criterion (excluding participants who labeled no corner) and one using a strict exclusion criterion (excluding those with less than 60% accuracy). The original results did not replicate under either criterion.
Original effect size, power analysis for samples to achieve 80%, 90%, 95% power to detect that effect size. Considerations of feasibility for selecting planned sample size.
How much power does your planned sample have for original effect? For an attenuated effect that is half the size of the original?
(If power analysis is not possible or precise, discuss more fully how you determined a sample size that would be sufficient for rescue.)
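As a rough first pass at the questions above, the sample size can be approximated with the normal approximation for a two-sided comparison of two equal-sized groups. This is only a sketch: it treats the effect as a standardized mean difference (Cohen's d) rather than the full 2 X 2 interaction, and `d_original = 0.5` is a placeholder, not the effect size reported in the original paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def n_per_group(d, power, alpha=0.05):
    """Approximate participants needed per group to detect effect d."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(d, n, alpha=0.05):
    """Approximate power with n participants per group (ignores the far tail)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(d * math.sqrt(n / 2) - z_alpha)


d_original = 0.5  # PLACEHOLDER: replace with the original study's effect size

for target in (0.80, 0.90, 0.95):
    full = n_per_group(d_original, target)
    half = n_per_group(d_original / 2, target)
    print(f"power {target:.2f}: n/group = {full} (original d), {half} (half d)")
```

Because the required n scales with 1/d², an effect half the size of the original needs roughly four times as many participants, which is worth weighing against feasibility. A dedicated power tool for factorial ANOVA would give a more precise answer for the interaction itself.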
The sample size will be determined after the power analysis. Participants will be 18 to 22 years old, to match the age range of the original study, and must complete the study on a laptop.
The original study did not specify any exclusion criteria. I’m considering excluding participants who do not label any corner or whose accuracy rate is below chance, since either pattern signals low effort.
In the original study, “the experiment consisted of two timed pencil-and-paper tests of perspective-taking ability: the object-perspective/spatial-orientation test (Hegarty & Waller, 2004) and the standardized road-map test of direction sense (the road-map test; Money et al., 1965; modified by Zacks et al., 2002)…The road-map test consisted of a bird’s-eye diagram of a path through a city. Participants were instructed to imagine walking along the path and write either “R” or “L” at each corner to indicate whether to take a right or left turn. The social version of the task included a human figure at every corner (see Fig. 2, right panel). Participants in the social condition were instructed to imagine themselves taking the perspective of the person as he or she walked along the path. Their score was the number of corners labeled correctly.”
In this rescue project, I will follow the first replication attempt and only focus on the road-map test. In the first replication attempt, participants labeled corners using drop-down menus to indicate either turning left or right. The experimenter expressed concerns about the extra cognitive load related to interacting with the drop-down menus. Therefore, I plan to change the drop-down menus to text boxes where participants can type “L” for left or “R” for right.
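To make the typed-response format concrete, here is a minimal sketch of how one participant's text-box entries could be scored against an answer key. The function name, the list-based data layout, and the choice to treat blank or invalid entries as unlabeled are my assumptions, not part of the original materials.

```python
def score_road_map(responses, answer_key):
    """Return (n_labeled, n_correct) for one participant.

    responses  -- typed entries, one per corner ("" if left blank)
    answer_key -- correct turns, each "L" or "R" (hypothetical layout)
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")

    n_labeled = 0
    n_correct = 0
    for typed, correct in zip(responses, answer_key):
        cleaned = typed.strip().upper()  # accept "l", " R ", etc.
        if cleaned not in ("L", "R"):
            continue  # assumption: blank or invalid entries count as unlabeled
        n_labeled += 1
        if cleaned == correct:
            n_correct += 1
    return n_labeled, n_correct


# Example with a made-up four-corner key
print(score_road_map(["l", " R", "", "R"], ["L", "R", "R", "L"]))
```

Following the original study, the participant's score is `n_correct`, the number of corners labeled correctly; `n_labeled` is kept so that accuracy can be computed for the exclusion rule.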
The code for the replication study was written in JavaScript, HTML, and CSS. Only part of the code is available, and there is no code for data collection, so I plan to recreate the study with jsPsych.
Participants will be tested individually online. In both conditions, participants will be told that they will complete a task that will test their perspective-taking ability.
“Participants in the spatial condition were given unmodified tests and also received the following information, which emphasized that perspective-taking is a spatial ability in which men have an advantage over women:
Perspective-taking ability can be thought of as a measure of spatial ability. Spatial ability is a cognitive ability that is defined as understanding the relations between objects in space and being able to mentally manipulate them and respond correctly. Males often score higher on measures of spatial ability.
Participants in the social condition were given modified tests, which included human figures, and received the following additional information, which emphasized that perspective-taking is an empathetic ability in which women have an advantage over men: Perspective-taking ability can be thought of as a measure of empathetic ability. Empathetic ability is a social ability that is defined as being able to identify with and understand what another person is seeing or feeling, and respond appropriately. Females often score higher on measures of empathetic ability.”
The above-mentioned instructions will appear line by line on the screen. Participants will be instructed to read carefully and click the “next” button to go to the next sentence. This is different from the previous replication attempt, and the goal is to achieve a more effective state induction.
The participants will then complete the road-map test. They will first be given the instructions and complete an untimed practice trial in which they are expected to label three corners correctly.
Once they have correctly completed the practice trial, they will be given 36 s to complete as many of the 32 items as possible. The original study gave participants 30 s; I am giving participants 36 s to account for the extra time that typing may take. Instead of asking about participants’ CS experience, as the first replication attempt did, I plan to ask questions that probe whether the state induction of gender bias has been effective.
So, after completing the road-map test, two multiple-choice questions will appear on the screen.
Please fill in the blank based on the information you saw at the beginning of the experiment.
Perspective-taking ability can be thought of as a measure of ______ (Choice 1: spatial ability. Choice 2: empathetic ability)
______ (Choice 1: females. Choice 2: males) often score higher on measures of empathetic/spatial (depending on the condition) ability.
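The two questions above can be scored against the framing each participant actually received. The sketch below assumes responses are stored as the literal choice text and that `condition` is either "spatial" or "social"; those storage details are hypothetical, while the expected answers come from the instructions quoted earlier.

```python
# Expected answers per condition, taken from the framing instructions:
# spatial framing -> "spatial ability", males score higher
# social framing  -> "empathetic ability", females score higher
EXPECTED_ANSWERS = {
    "spatial": ("spatial ability", "males"),
    "social": ("empathetic ability", "females"),
}


def passed_induction_check(condition, ability_choice, sex_choice):
    """True only if both questions match the participant's condition."""
    return (ability_choice, sex_choice) == EXPECTED_ANSWERS[condition]


print(passed_induction_check("social", "empathetic ability", "females"))
print(passed_induction_check("social", "empathetic ability", "males"))
```

Requiring both answers to be correct is the stricter of two possible rules; scoring each question separately would also be defensible.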
Requiring participants to click the “next” button to proceed to the next sentence of the instructions functions as an attention check. Additionally, participants need to complete the practice trial in order to proceed.
The analysis will be a 2 (sex: male, female) X 2 (condition: social, spatial) between-subjects analysis of variance (ANOVA). The original study did not exclude any data based on participants’ performance in the road-map test. However, I’m considering excluding participants who clearly did not put in effort (e.g., those who did not label any corner or whose accuracy rate was below chance).
Unlike the original study, I plan to perform three separate ANOVAs: one with the participants who correctly answered the multiple-choice questions at the end of the study, one with those who did not, and one with all participants.
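The key test in this design is the sex X condition interaction. In a 2 X 2 between-subjects design that term has one degree of freedom, so its F statistic can be computed directly from the contrast of the four cell means. The sketch below does only that: it is an illustration with made-up scores, it does not compute main effects or a p-value, and the actual analysis would be run in a statistics package.

```python
from statistics import mean

CELL_ORDER = [
    ("female", "social"),
    ("female", "spatial"),
    ("male", "social"),
    ("male", "spatial"),
]
CONTRAST_SIGNS = [1, -1, -1, 1]  # interaction contrast on the cell means


def interaction_f(cells):
    """cells maps (sex, condition) -> list of scores.

    Returns (F, df_numerator, df_denominator) for the interaction.
    """
    means = [mean(cells[key]) for key in CELL_ORDER]
    sizes = [len(cells[key]) for key in CELL_ORDER]

    # Pooled within-cell variance (mean squared error)
    ss_within = sum(
        (score - m) ** 2
        for key, m in zip(CELL_ORDER, means)
        for score in cells[key]
    )
    df_error = sum(sizes) - len(CELL_ORDER)
    mse = ss_within / df_error

    contrast = sum(sign * m for sign, m in zip(CONTRAST_SIGNS, means))
    f_value = contrast ** 2 / (mse * sum(1 / n for n in sizes))
    return f_value, 1, df_error


# Made-up scores, for illustration only
toy = {
    ("female", "social"): [10, 12],
    ("female", "spatial"): [6, 8],
    ("male", "social"): [7, 9],
    ("male", "spatial"): [9, 11],
}
print(interaction_f(toy))
```

The same function can be applied to each of the three subsets (all participants, those who passed the induction questions, and those who did not), since each is just a different set of cell lists.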
This study aims to convert the original study to an online format, which is in line with the 1st replication attempt. However, it differs from the 1st replication in three ways. 1) Instead of showing the instructions in one paragraph, I plan to show the instructions line by line to mimic real-time communication. 2) Instead of asking the participant to select “right” or “left” from a drop-down menu, I plan to let them type in their answer in a text box. 3) Unlike the 1st replication, which introduced additional questions probing participants’ CS and math background, I plan to introduce questions that probe whether the state induction succeeded. This is to address one of the main concerns from the previous replication.
Unlike the original study (which had no exclusion criteria) or the 1st replication (which had two sets of exclusion criteria), I plan to apply one exclusion criterion to filter out the data from participants who clearly did not put effort into their answers.
You can comment this section out prior to final report with data collection.
Sample size, demographics, data exclusions based on rules spelled out in analysis plan
Any differences from what was described as the original plan, or “none”.
Data preparation following the analysis plan. Read in data and remove unneeded columns. Exclude participants who did not label any corner correctly or whose accuracy is less than 50%. Calculate mean scores in each group (males-spatial, males-social, females-spatial, females-social).
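The preparation steps above can be sketched as follows. The row layout (dictionaries with `sex`, `condition`, `n_labeled`, and `n_correct` keys) is hypothetical, and accuracy is assumed here to mean corners correct divided by corners labeled; if accuracy is instead defined over all 32 items, the denominator would change.

```python
from collections import defaultdict
from statistics import mean


def keep_participant(n_labeled, n_correct, min_accuracy=0.5):
    """Apply the exclusion rule: drop non-responders and accuracy below 50%."""
    if n_labeled == 0:
        return False  # labeled no corner at all
    return n_correct / n_labeled >= min_accuracy


def group_means(rows):
    """Mean score (corners correct) per (sex, condition) after exclusions."""
    scores_by_group = defaultdict(list)
    for row in rows:
        if keep_participant(row["n_labeled"], row["n_correct"]):
            scores_by_group[(row["sex"], row["condition"])].append(row["n_correct"])
    return {group: mean(scores) for group, scores in scores_by_group.items()}
```

A participant with no correct corners has an accuracy of 0 and is therefore dropped by the same rule, so the two parts of the criterion do not need separate handling beyond the division-by-zero guard.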
Conduct a t-test to see whether state induction was effective.
Run the ANOVA on all participants. Run the same test on participants who correctly answered both of the state induction questions. Run the same test on participants who did not correctly answer both of the state induction questions.
Three-panel graph with original, 1st replication, and your replication is ideal here
Any follow-up analyses desired (not required).
Combining across the original paper, 1st replication, and 2nd replication, what is the aggregate effect size?
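One common way to answer this is a fixed-effect, inverse-variance weighted average of the three effect sizes. The sketch below shows the arithmetic only; the effect sizes and variances in the example are placeholders, not values from any of the three studies, and a random-effects model may be more appropriate given how much the studies differ in format and sample.

```python
import math


def fixed_effect_pool(effects, variances):
    """Inverse-variance weighted mean effect and its standard error."""
    if len(effects) != len(variances):
        raise ValueError("need one variance per effect size")
    weights = [1 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    standard_error = math.sqrt(1 / sum(weights))
    return pooled, standard_error


# PLACEHOLDER values: original, 1st replication, 2nd replication
example_effects = [0.4, 0.0, 0.2]
example_variances = [0.04, 0.02, 0.02]
print(fixed_effect_pool(example_effects, example_variances))
```

Studies with smaller sampling variance (typically the larger samples) receive more weight, so the larger replication samples would pull the aggregate estimate toward their results.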
Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.
Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.