Replication of ‘Data quality of platforms and panels for online behavioral research’ by Peer et al. (2021, Behavior Research Methods)

Introduction

Justification for choice of study

I am replicating a recent study by Peer et al. (March 2021), which compared the quality of data gathered from the most popular platforms used for online behavioral research: Amazon MTurk, CloudResearch (CR; formerly TurkPrime), and Prolific. Peer et al. (2021) found “higher data quality for Prolific and CloudResearch compared to MTurk (the differences between Prolific and MTurk and between CR and MTurk were significant (p < .001), but the difference between CR and Prolific was not (p = 0.91)).”

In response, a team of investigators from CloudResearch, Litman et al. (2021, SSRN), attempted to replicate Peer et al.’s (2021) findings and found “undisclosed methodological decisions that would limit the inferences made in [the original publication]” (Litman et al., 2021). The CloudResearch team was specifically concerned that Peer et al. (2021) “chose to turn off the recommended data quality filters and reputation qualifications, including filters that are on by default and were designed to address known data quality issues on MTurk.” Contrary to the findings of Peer et al. (2021), after implementing what they describe as the recommended and default filters on MTurk, the team at CloudResearch (Litman et al., 2021) found CloudResearch’s data quality to be superior to that of Prolific.

I am curious to examine the data quality filters that were used in each of these studies and to investigate these sites’ data quality (as originally operationalized by Peer et al., 2021) as a relatively objective third-party investigator.

Anticipated challenges

The only challenge I anticipate at this stage is identifying the specific automated/default and manually set data quality filters that each team of researchers used in their data collection process.

Links

Project repository: https://github.com/psych251/saadatian2021.git

Survey preview: https://ucbpsych.qualtrics.com/jfe/preview/SV_cSKr5voz0akZ7P8?Q_CHL=preview&Q_SurveyVersionID=current

Project OSF: https://osf.io/qw4sf/?view_only=49333769f6464316a805d48e8cd1d5ca

Original paper: https://github.com/psych251/saadatian2021/blob/3c2eeb3b8a352c51dea439f7878b2ab95f67acff/original-paper/Peer%20et%20al.%20(2021).pdf

Original study’s OSF: https://osf.io/342dp/

Methods

Power Analysis

Here I aim to replicate the main finding of Study 2, in which Peer et al. (2021) compare data quality across the three sites and find “statistically significant differences between the sites on [overall data quality score], F(2, 1458) = 129.4, p < .001, which showed higher scores for Prolific and CR (M = 5.87, 5.78, SD = 1.0, 1.1, respectively) compared to MTurk (M = 4.55, SD = 1.9)”.

To detect an effect corresponding to an F statistic of 129.4 with a power of 0.95, I would need a minimum of 43 participants from each site. However, to obtain more precise estimates of the differences in data quality across the three sites, I recruit 100 participants from each of the three platforms (MTurk, CloudResearch, and Prolific), for a planned total of 300 participants across the three independent samples.
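As a rough illustration of how such a power calculation can be set up, the sketch below converts the reported F statistic into Cohen's f and passes it to base R's `power.anova.test()`. It assumes three equally sized groups and a significance level of .05, neither of which is stated above, so the result depends on those assumptions and may differ from the 43 participants per site reported here.

```r
# Effect size implied by the reported omnibus test, F(2, 1458) = 129.4.
f_stat <- 129.4
df1 <- 2      # numerator df (3 sites - 1)
df2 <- 1458   # denominator df

# Cohen's f from the F statistic: f^2 = F * df1 / df2
cohen_f <- sqrt(f_stat * df1 / df2)

# power.anova.test() expects variances rather than f. With within.var = 1,
# the variance of the group means (n - 1 denominator) is f^2 * k / (k - 1).
k <- 3  # number of sites
res <- power.anova.test(groups      = k,
                        between.var = cohen_f^2 * k / (k - 1),
                        within.var  = 1,
                        sig.level   = 0.05,  # ASSUMED alpha
                        power       = 0.95)

# Required participants per site under these assumptions, rounded up.
n_per_site <- ceiling(res$n)
n_per_site
```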

Planned Sample

One hundred participants who are 18 years or older and current residents of the United States are recruited from each of Amazon MTurk, CloudResearch, and Prolific. As in the original study, participants are paid $1.50 each for completing the study, which took an average of 9.8 minutes in the original study (SD = 5.2; Peer et al., 2021).

Data Quality Pre-screening

The present study implements all data quality pre-screening filters that were used in the original study (Peer et al., 2021). Peer et al. (2021) describe the data quality pre-screening and exclusion criteria as follows:

1. “Restricting the study to participants with at least 95% approval rating and at least 100 previous submissions” for participants recruited from all three sites
2. Using the site setting to “block low data quality workers” on CloudResearch
3. Excluding Prolific participants who completed their Study 1 on Prolific
4. Excluding CloudResearch participants who completed their Study 1 on CloudResearch or MTurk

I plan to implement all relevant exclusion criteria used by the original authors (numbers 1 and 2 from the list above).

*Note: since the original investigators had to post the survey twice on MTurk (once through their MTurk account and once through their CR account), some participants (N = 39) completed the survey twice. Peer et al. (2021) removed those participants’ later submissions, and I will do the same.

Exclusions

Unlike the original study, I do not plan to exclude participants who do not complete the entire study, as I think that such an exclusion might significantly affect the data quality.

Measures

Attention

-> [ac1, ac2, ac3]

The original authors measure attention using one two-item explicit question and one covert question. As in the original study, “the first [attention check] asks participants to answer ‘six’ and ‘three’ to two items regardless of their actual preference (other responses are coded as failures); the second is an item within a scale worded ‘I currently don’t pay attention to the questions I’m being asked in the survey’ (response other than ‘strongly disagree’ is coded as a failure).”
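The scoring rule described above can be sketched in R as follows. The numeric codes for the passing answers ("six" = 6, "three" = 3, "strongly disagree" = 1) are assumptions about how the survey export encodes responses, and the `toy` data frame is made up for illustration.

```r
# Toy responses; column names follow the survey export (ac1, ac2, ac3).
toy <- data.frame(
  ac1 = c(6, 7, 6, NA),
  ac2 = c(3, 4, 3, 3),
  ac3 = c(1, 2, 1, 1)
)

# ASSUMED numeric codes for the passing answers:
# "six" = 6, "three" = 3, "strongly disagree" = 1.
pass_codes <- c(ac1 = 6, ac2 = 3, ac3 = 1)

# Score each check: 1 = passed, 0 = failed, NA = no response.
for (item in names(pass_codes)) {
  toy[[paste0(item, "_pass")]] <- as.integer(toy[[item]] == pass_codes[[item]])
}

# Number of checks passed; a row with any missing check stays NA.
toy$ac_total <- rowSums(toy[, paste0(names(pass_codes), "_pass")])
toy
```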

Comprehension

-> [comprehension1 and comprehension2]

Comprehension is examined by coding participants’ written summaries of the instructions for two tasks. “The first task asks participants to identify faces in a picture, but includes an instruction to only report zero; the second includes instructions for the ‘Matrix task’ (Mazar et al., 2008).” In the original study, two independent, blind raters coded the participant responses. The investigators counted an answer as correct if both coders had coded it as correct, and brought in a third rater for responses with split votes. Due to time constraints in data coding and analysis, the PI, Kimia Saadatian, is the sole rater of open responses to the comprehension items. Following this class replication project, Kimia plans to have the data coded independently by two blind raters.

Reliability

-> [nfc1 to nfc18]

-> Not immediately relevant; will later be reported in exploratory analyses.

Similar to the original study, reliability is measured using Cronbach’s alpha for the eighteen-item, five-point Need for Cognition scale (Cacioppo et al., 1984). The original investigators chose this scale because it has been validated and found to be highly reliable across many studies.
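For reference, Cronbach's alpha can be computed from first principles as in the sketch below. The data are simulated stand-ins for nfc1 to nfc18 (continuous rather than five-point), and the helper `cronbach_alpha()` is a hypothetical name introduced here; the `ltm` package also provides `cronbach.alpha()`. Any reverse-worded items would need to be reverse-scored before this step.

```r
# Cronbach's alpha:
# alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
cronbach_alpha <- function(items) {
  items <- na.omit(as.matrix(items))   # listwise deletion of incomplete rows
  k <- ncol(items)                     # number of items
  item_vars <- apply(items, 2, var)    # variance of each item
  total_var <- var(rowSums(items))     # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Simulated stand-in for the 18 NFC items: one shared trait plus item noise.
set.seed(1)
trait <- rnorm(200)
nfc <- sapply(1:18, function(i) trait + rnorm(200))
colnames(nfc) <- paste0("nfc", 1:18)

alpha_nfc <- cronbach_alpha(nfc)
alpha_nfc
```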

Honesty

-> [honesty 1, honesty2, honesty3]

Replicating the honesty measure in the original study, honesty is measured using “an online version of the Matrix task (Mazar et al., 2008) that include(s) two unsolvable matrices. Reporting solving any of these two problems will be coded as a dishonest response. Additionally, we examine whether participants lie about their eligibility for a future study by asking them to indicate if they want to be invited to a study that samples participants of their own gender but whose age will be described as 5-10 years above the age participants reported in the beginning of the study.”

Data Quality

-> [ac_total + comprehension_total + honesty_total]

The overall data quality score is a composite based on the average scores of attention, comprehension, and dishonesty.
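A minimal sketch of the composite, following the variable label above (ac_total + comprehension_total + honesty_total). The values in `toy` are invented, and whether the components are summed or averaged, and how missing components are handled, are assumptions to be checked against the original analysis script.

```r
# Toy component scores; the three column names come from the label above.
toy <- data.frame(
  ac_total            = c(3, 2, NA),
  comprehension_total = c(2, 1, 2),
  honesty_total       = c(4, 3, 4)
)

parts <- c("ac_total", "comprehension_total", "honesty_total")

# Sum the components. A participant with any missing component gets NA,
# so incomplete cases drop out of later analyses instead of scoring low.
toy$data_quality <- rowSums(toy[, parts])
toy
```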

Other measures

The below measures are not immediately relevant to the main analyses reported in this class project report.

To replicate the original study as closely as possible, we also measure participants’ drop-out rates, response duration, overall response time and speed between sites, differences in responses to the NFC scale, demographics, and patterns of usage of the site (main purpose, frequency of usage, number of submissions and approval ratings, usage of other sites), and whether or not the participants have completed a similar study in the last months.

Procedure

Mimicking the original survey introduction (Peer et al., 2021), the description of the current study states: “You are invited to complete a survey on individual differences in personal attitudes, opinions, and behaviors.”

I use the original .QSF file for the survey content, which begins with demographic questions, followed by the data quality measures described above.

Participants then answer questions about their usage of the online platform (e.g., how often they use the site, why they use it, how much they earn on average, the percentage of their submissions that researchers have approved, and how often, if at all, they use other sites) before finishing the survey and receiving their online payment.

Deviations from the Original Study

The new survey does not force responses on any question except the demographics questions on the first page (whose data are needed for later items). Instead, all other questions are set to “request” responses (a Qualtrics setting).

Analysis Plan

The present study uses the same analysis (one-way ANOVA) that was conducted in the original study by Peer et al. (2021). I attempt to replicate Study 2, where the main finding is that data quality is significantly higher for data gathered from Prolific and CloudResearch (M = 5.87, 5.78, SD = 1.0, 1.1, respectively) than for data gathered from MTurk (M = 4.55, SD = 1.9; F(2, 1458) = 129.4, p < .001).

In order to compute the overall data quality score, the original authors:

Relevant for present EXPLORATORY analyses:

"Used chi-square tests to examine differences in the rates of attention, comprehension and dishonesty
Computed average scores for (across the items per aspect)
Compared them between sites using ANOVA and regression analyses.
Tested for differences between reliability coefficients using Hakstian & Whalen (1976) method"

Relevant for present CONFIRMATORY analyses:

"Computed an overall composite score of data quality (based on the average scores of attention, comprehension and dishonesty)
Compare composite score of data quality between sites using ANOVA and regression."
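The confirmatory comparison can be sketched in R as a one-way ANOVA followed by Bonferroni-corrected pairwise tests. The data below are simulated from the means and SDs reported by Peer et al. (2021) with 100 participants per site; they are not the replication data, and the column names `site` and `data_quality` are placeholders.

```r
# Simulated data: 100 participants per site, drawn from the reported M and SD.
set.seed(251)
sim <- data.frame(
  site = factor(rep(c("MTurk", "CloudResearch", "Prolific"), each = 100)),
  data_quality = c(rnorm(100, mean = 4.55, sd = 1.9),   # MTurk
                   rnorm(100, mean = 5.78, sd = 1.1),   # CloudResearch
                   rnorm(100, mean = 5.87, sd = 1.0))   # Prolific
)

# Omnibus one-way ANOVA of the composite score across sites.
fit <- aov(data_quality ~ site, data = sim)
summary(fit)

# Post hoc pairwise t-tests with Bonferroni correction.
posthoc <- pairwise.t.test(sim$data_quality, sim$site,
                           p.adjust.method = "bonferroni")
posthoc$p.value
```

With the real data, `sim` would be replaced by the cleaned data frame, and rows with a missing composite score would be dropped by `aov()` through its default NA handling.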

Differences from Original Study

I do not anticipate any deviations from the original analysis plan at this stage.

Methods Addendum

Actual Sample

As of Nov 30 at 10 pm PST, the total number of recorded responses was 236. Before exporting the data file from Qualtrics, one MTurk submission was removed because the participant entered an invalid survey ID at the end. The final .CSV file included a total of 235 rows.

Unlike Peer et al. (2021), I do not exclude participants who did not respond to the entire survey, as I believe that doing so would itself be a manipulation of data quality. Instead, I omit NAs from analyses.

Duplicated Rows?

*NOTE: I expect the data to contain duplicates, i.e., participants who took the survey multiple times (it is not yet clear how this happened). Duplicates have not yet been removed at the time of writing and presenting this report. I will search for (and later remove) duplicates by closely examining duplicate participation IDs, duplicate IP addresses, and duplicate qualitative responses to the open-ended questions. Once data collection from CloudResearch is complete, I will also remove participants who took the survey through both the CR posting and the MTurk posting of the project.
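One way to carry out this plan is sketched below: sort by start time, flag repeated IDs and repeated IP addresses, and keep only the earliest submission per ID. The `toy` data frame is invented; the column names `id`, `IPAddress`, and `StartDate` mirror the Qualtrics export. Treating a repeated IP address as a flag for manual review, and not as grounds for automatic removal, is an assumption of this sketch.

```r
# Toy export with one repeated id and two repeated IP addresses.
toy <- data.frame(
  id        = c("A1", "B2", "A1", "C3", "D4"),
  IPAddress = c("1.1.1.1", "2.2.2.2", "1.1.1.1", "3.3.3.3", "2.2.2.2"),
  StartDate = as.POSIXct(c("2021-11-29 23:01", "2021-11-29 23:02",
                           "2021-11-29 23:05", "2021-11-29 23:07",
                           "2021-11-29 23:09"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Sort so that duplicated() marks the LATER submissions, not the first one.
toy <- toy[order(toy$StartDate), ]

dup_id <- duplicated(toy$id)         # same participant id seen earlier
dup_ip <- duplicated(toy$IPAddress)  # same IP address seen earlier

# Flag anything suspicious for manual review.
toy$flag_duplicate <- dup_id | dup_ip

# Keep only the earliest submission per participant id.
deduped <- toy[!dup_id, ]
deduped
```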

Differences from pre-data collection methods plan

Data Quality Filters

I planned to apply the following data quality filters that were used by Peer et al. (2021):

Data collected from MTurk - only workers who have completed 100 submissions or more and have a HIT approval rating of 95% or more.
Data collected from Prolific - only workers who have completed 100 submissions or more and 95% of those submissions are eligible.
Data collected from CloudResearch - only workers who have completed 100 submissions or more and have a HIT approval rating of 95% or more. Keep the default setting, which is set to “block low data quality workers”. Implement any other default filters that Peer et al. (2021) may have intentionally or mistakenly turned off, as suggested by Litman et al. (2021).

However, on CloudResearch and Prolific, I was not able to find or create a filter that would restrict participation to those with 100 prior submissions. Other than that, I used the default filters from each site, with the exception of adding a filter on MTurk to allow only participants who have a HIT approval rating of 95% or more (premium qualifications on MTurk would incur additional fees).

Variables and Items

Attention Checks

I scored three attention check items, whereas the original investigators discussed and scored only two (even though all three were present in the .QSF file posted on their OSF page). I included all three in my calculation of the overall data quality score, thereby increasing its possible range and maximum.

Second Honesty Measure

Peer et al. (2021) state that 2 of the 5 matrices used to calculate participants’ honesty were unsolvable. However, after examining the .QSF file from their OSF page, I have come to strongly believe that there are 3 (as opposed to 2) unsolvable matrices. As such, I recoded responses to those three items so that a participant who reports having solved one of those matrices is coded as “0” for honesty on that item. By including the third item in this honesty scale, I again increased the possible range and maximum of the data quality score.
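The recoding described above might look like the following sketch. Which three matrices are unsolvable is not stated here, so the column selection in `unsolvable` is purely a placeholder; the raw coding (1 = reported solving, 0 = did not) and leaving non-responses as NA are likewise assumptions, and the `toy` data are invented.

```r
# Toy responses to three matrix items (column names follow the export).
toy <- data.frame(
  X3_honesty1 = c(0, 1, NA),
  X4_honesty1 = c(0, 0, 0),
  X5_honesty1 = c(0, 1, 0)
)

# HYPOTHETICAL selection: placeholders for the three unsolvable matrices.
unsolvable <- c("X3_honesty1", "X4_honesty1", "X5_honesty1")

# ASSUMED raw coding: 1 = reported solving the matrix, 0 = did not.
# Reverse it so that 1 = honest and 0 = dishonest (claimed an unsolvable matrix).
honest <- 1 - toy[, unsolvable]

# Honesty score on the matrix task; rows with a missing item stay NA.
toy$matrix_honesty <- rowSums(honest)
toy
```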

Results

Results from Peer et al. (2021)

The composite Data Quality score ranged from 0 to 7 (M = 5.41; SD = 1; Med = 6). Overall data quality composite scores, ranked from highest to lowest, were: 1) Prolific (M = 5.87, SD = 1.0); 2) CR (M = 5.78, SD = 1.1); 3) MTurk (M = 4.55, SD = 1.9).

Peer et al. (2021) found statistically significant differences between the sites on Overall Data Quality Score, F(2, 1458) = 129.4, p < .001.

Post hoc tests with Bonferroni correction showed:

- a significant difference between Prolific and MTurk, p < .001
- a significant difference between CR and MTurk, p < .001
- no significant difference between CR and Prolific, p = 0.91

Data preparation

Data preparation following the analysis plan.

###Data Preparation
##### Starting a Script #####
  # Clear environment
    rm(list=ls())
  # Checking working directory
    getwd()

## [1] "/Users/kimiasaadatian/Desktop/ExpPsych/saadatian2021"

  #### Loading in data ####
      dat <- read.csv("incompletedata.csv",
                       na.strings="NA", strip.white=TRUE)
  ### Load Relevant Libraries and Functions
    library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

    library(dplyr)
    library(car)

## Loading required package: carData

## 
## Attaching package: 'car'

## The following object is masked from 'package:dplyr':
## 
##     recode

## The following object is masked from 'package:purrr':
## 
##     some

    library(ggplot2)
    library(ltm)

## Loading required package: MASS

## 
## Attaching package: 'MASS'

## The following object is masked from 'package:dplyr':
## 
##     select

## Loading required package: msm

## Loading required package: polycor

##### Looking at your data #####
  #### Identifying problems ####
    # Finding the number of duplicate cases
      # WARNING: These do not include the first instance of the duplicated value.
      length(dat$id) - length(unique(dat$id))

## [1] 29

    # Finding rows of duplicate cases, using the same id column as the count above.
      # WARNING: These do not include the first instance of the duplicated value.
      which(duplicated(dat$id))

  #### Looking at the dataset ####
        colnames(dat) # list of column names in order

##   [1] "StartDate"                                  
##   [2] "EndDate"                                    
##   [3] "Status"                                     
##   [4] "IPAddress"                                  
##   [5] "Progress"                                   
##   [6] "Duration..in.seconds."                      
##   [7] "Finished"                                   
##   [8] "RecordedDate"                               
##   [9] "ResponseId"                                 
##  [10] "RecipientLastName"                          
##  [11] "RecipientFirstName"                         
##  [12] "RecipientEmail"                             
##  [13] "ExternalReference"                          
##  [14] "LocationLatitude"                           
##  [15] "LocationLongitude"                          
##  [16] "DistributionChannel"                        
##  [17] "UserLanguage"                               
##  [18] "Q_BallotBoxStuffing"                        
##  [19] "Q1.2_Browser"                               
##  [20] "Q1.2_Version"                               
##  [21] "Q1.2_Operating.System"                      
##  [22] "Q1.2_Resolution"                            
##  [23] "gender"                                     
##  [24] "gender_3_TEXT"                              
##  [25] "age"                                        
##  [26] "ethnicity"                                  
##  [27] "ethnicity_5_TEXT"                           
##  [28] "education"                                  
##  [29] "education_9_TEXT"                           
##  [30] "income"                                     
##  [31] "country"                                    
##  [32] "ac1"                                        
##  [33] "ac2"                                        
##  [34] "nfc1"                                       
##  [35] "nfc2"                                       
##  [36] "nfc3"                                       
##  [37] "nfc4"                                       
##  [38] "nfc5"                                       
##  [39] "nfc6"                                       
##  [40] "nfc7"                                       
##  [41] "nfc8"                                       
##  [42] "nfc9"                                       
##  [43] "nfc10"                                      
##  [44] "nfc11"                                      
##  [45] "ac3"                                        
##  [46] "nfc12"                                      
##  [47] "nfc13"                                      
##  [48] "nfc14"                                      
##  [49] "nfc15"                                      
##  [50] "nfc16"                                      
##  [51] "nfc17"                                      
##  [52] "nfc18"                                      
##  [53] "nfctime_First.Click"                        
##  [54] "nfctime_Last.Click"                         
##  [55] "nfctime_Page.Submit"                        
##  [56] "nfctime_Click.Count"                        
##  [57] "comprehension1"                             
##  [58] "comprehensiontime1_First.Click"             
##  [59] "comprehensiontime1_Last.Click"              
##  [60] "comprehensiontime1_Page.Submit"             
##  [61] "comprehensiontime1_Click.Count"             
##  [62] "comprehensiontime2_First.Click"             
##  [63] "comprehensiontime2_Last.Click"              
##  [64] "comprehensiontime2_Page.Submit"             
##  [65] "comprehensiontime2_Click.Count"             
##  [66] "accomprehension"                            
##  [67] "comprehension2"                             
##  [68] "X1_matrixtime_First.Click"                  
##  [69] "X1_matrixtime_Last.Click"                   
##  [70] "X1_matrixtime_Page.Submit"                  
##  [71] "X1_matrixtime_Click.Count"                  
##  [72] "X1_honesty1"                                
##  [73] "X2_matrixtime_First.Click"                  
##  [74] "X2_matrixtime_Last.Click"                   
##  [75] "X2_matrixtime_Page.Submit"                  
##  [76] "X2_matrixtime_Click.Count"                  
##  [77] "X2_honesty1"                                
##  [78] "X3_matrixtime_First.Click"                  
##  [79] "X3_matrixtime_Last.Click"                   
##  [80] "X3_matrixtime_Page.Submit"                  
##  [81] "X3_matrixtime_Click.Count"                  
##  [82] "X3_honesty1"                                
##  [83] "X4_matrixtime_First.Click"                  
##  [84] "X4_matrixtime_Last.Click"                   
##  [85] "X4_matrixtime_Page.Submit"                  
##  [86] "X4_matrixtime_Click.Count"                  
##  [87] "X4_honesty1"                                
##  [88] "X5_matrixtime_First.Click"                  
##  [89] "X5_matrixtime_Last.Click"                   
##  [90] "X5_matrixtime_Page.Submit"                  
##  [91] "X5_matrixtime_Click.Count"                  
##  [92] "X5_honesty1"                                
##  [93] "usage1"                                     
##  [94] "usage2"                                     
##  [95] "usage2_4_TEXT"                              
##  [96] "usage3"                                     
##  [97] "usage4"                                     
##  [98] "Q9.5_8"                                     
##  [99] "Q9.5_9"                                     
## [100] "Q9.5_20"                                    
## [101] "Q9.5_21"                                    
## [102] "Q9.5_16"                                    
## [103] "Q9.5_16_TEXT"                               
## [104] "usage6"                                     
## [105] "usage6_3_TEXT"                              
## [106] "honesty2"                                   
## [107] "honesty2_2_TEXT"                            
## [108] "id"                                         
## [109] "comments"                                   
## [110] "SC0"                                        
## [111] "PROLIFIC_PID"                               
## [112] "workerid"                                   
## [113] "assignmentId"                               
## [114] "hitId"                                      
## [115] "site"                                       
## [116] "gender.1"                                   
## [117] "aid"                                        
## [118] "randomId"                                   
## [119] "sign"                                       
## [120] "sample"                                     
## [121] "Create.New.Field.or.Choose.From.Dropdown..."

        head(dat) # first six rows

##        StartDate        EndDate Status       IPAddress Progress
## 1 11/29/21 23:07 11/29/21 23:13      0  71.227.248.177      100
## 2 11/29/21 23:02 11/29/21 23:14      0     73.45.63.48      100
## 3 11/29/21 23:01 11/29/21 23:15      0 108.226.198.101      100
## 4 11/29/21 23:04 11/29/21 23:16      0    72.24.229.97      100
## 5 11/29/21 23:14 11/29/21 23:21      0  71.212.117.122      100
## 6 11/29/21 23:11 11/29/21 23:22      0     108.64.6.85      100
##   Duration..in.seconds. Finished   RecordedDate        ResponseId
## 1                   346        1 11/29/21 23:13 R_w05LdNz9t3MdMvT
## 2                   712        1 11/29/21 23:14 R_1LYdFg61IZxKZJM
## 3                   851        1 11/29/21 23:15 R_3ndAaQKXVjdw6HG
## 4                   708        1 11/29/21 23:16 R_1Io7uoe6rKoNTZp
## 5                   464        1 11/29/21 23:21 R_1RlLRVZvxcjk2Q1
## 6                   670        1 11/29/21 23:22 R_vCZLiJb8raUMuXL
##   RecipientLastName RecipientFirstName RecipientEmail ExternalReference
## 1                NA                 NA             NA                NA
## 2                NA                 NA             NA                NA
## 3                NA                 NA             NA                NA
## 4                NA                 NA             NA                NA
## 5                NA                 NA             NA                NA
## 6                NA                 NA             NA                NA
##   LocationLatitude LocationLongitude DistributionChannel UserLanguage
## 1          47.6722         -122.1257           anonymous           EN
## 2          42.1097          -87.9399           anonymous           EN
## 3          35.4145         -119.0403           anonymous           EN
## 4          43.6138         -116.3972           anonymous           EN
## 5          47.6459         -122.3995           anonymous           EN
## 6          26.1250          -80.2670           anonymous           EN
##   Q_BallotBoxStuffing  Q1.2_Browser Q1.2_Version Q1.2_Operating.System
## 1                  NA Safari iPhone         15.1                iPhone
## 2                  NA        Chrome 96.0.4664.55             Macintosh
## 3                  NA        Chrome 96.0.4664.45       Windows NT 10.0
## 4                  NA        Chrome 96.0.4664.45       Windows NT 10.0
## 5                  NA        Chrome 96.0.4664.45       Windows NT 10.0
## 6                  NA        Safari           14             Macintosh
##   Q1.2_Resolution gender gender_3_TEXT age ethnicity ethnicity_5_TEXT education
## 1         414x736      2                17         1                          4
## 2       2560x1440      1                14         4                          4
## 3        1536x864      2                23         1                          4
## 4        1708x961      2                54         1                          5
## 5       1920x1080      2                28       2,3                          3
## 6        810x1080      2                15       1,4                          4
##   education_9_TEXT income country ac1 ac2 nfc1 nfc2 nfc3 nfc4 nfc5 nfc6 nfc7
## 1               NA      4     189   5   4    3    2    4    3    3    2    4
## 2               NA      4     189   7   4    3    4    2    4    2    1    5
## 3               NA      2     189   6   3    4    4    2    2    2    4    2
## 4               NA      1     189   6   3    2    3    4    3    4    2    3
## 5               NA      3     189   6   3    2    4    2    2    3    4    2
## 6               NA      1     189   6   3    4    4    2    2    2    3    2
##   nfc8 nfc9 nfc10 nfc11 ac3 nfc12 nfc13 nfc14 nfc15 nfc16 nfc17 nfc18
## 1    3    3     3     4   1     2     2     3     2     4     2     2
## 2    3    4     3     4   2     1     3     3     3     1     4     3
## 3    2    2     4     4   1     2     4     4     4     2     1     4
## 4    4    4     2     2   1     2     2     2     2     4     4     2
## 5    3    4     4     4   1     2     3     4     4     4     3     4
## 6    3    2     3     4   1     2     3     4     4     3     2     4
##   nfctime_First.Click nfctime_Last.Click nfctime_Page.Submit
## 1               4.722            101.132             101.757
## 2              28.934            171.462             172.227
## 3              10.681            112.922             115.176
## 4              10.855            129.400             131.725
## 5               7.694             90.258              91.612
## 6               2.902            121.400             122.619
##   nfctime_Click.Count comprehension1 comprehensiontime1_First.Click
## 1                  35              1                          0.496
## 2                  19              0                         43.569
## 3                  25              1                         51.391
## 4                  19              1                         39.662
## 5                  21              1                          6.537
## 6                  35              1                          4.811
##   comprehensiontime1_Last.Click comprehensiontime1_Page.Submit
## 1                        48.968                         52.746
## 2                        78.131                         84.274
## 3                       129.717                        134.719
## 4                        84.654                         86.418
## 5                        58.372                         67.765
## 6                        88.782                         90.455
##   comprehensiontime1_Click.Count comprehensiontime2_First.Click
## 1                             12                          0.549
## 2                              2                          0.000
## 3                              6                          4.254
## 4                              2                          2.973
## 5                              7                          2.352
## 6                              9                          0.890
##   comprehensiontime2_Last.Click comprehensiontime2_Page.Submit
## 1                         1.264                          3.038
## 2                         0.000                         20.011
## 3                         5.707                          7.464
## 4                         4.202                          5.866
## 5                         3.828                          5.161
## 6                         2.139                          4.178
##   comprehensiontime2_Click.Count accomprehension comprehension2
## 1                              3               1              1
## 2                              0              NA             NA
## 3                              2               1              1
## 4                              2               1              1
## 5                              2               1              1
## 6                              2               1              1
##   X1_matrixtime_First.Click X1_matrixtime_Last.Click X1_matrixtime_Page.Submit
## 1                    15.157                   15.157                    16.042
## 2                    15.301                   15.301                    21.006
## 3                     7.913                    7.913                    13.869
## 4                     0.000                    0.000                    21.007
## 5                    11.023                   11.023                    12.682
## 6                     0.000                    0.000                    21.016
##   X1_matrixtime_Click.Count X1_honesty1 X2_matrixtime_First.Click
## 1                         1           0                     1.707
## 2                         1           0                    13.320
## 3                         1           0                     0.000
## 4                         0          NA                     0.000
## 5                         1           0                    13.944
## 6                         0          NA                    20.287
##   X2_matrixtime_Last.Click X2_matrixtime_Page.Submit X2_matrixtime_Click.Count
## 1                    1.707                     2.133                         1
## 2                   14.688                    16.322                         2
## 3                    0.000                    21.017                         0
## 4                    0.000                    21.109                         0
## 5                   13.944                    15.735                         1
## 6                   20.287                    21.114                         1
##   X2_honesty1 X3_matrixtime_First.Click X3_matrixtime_Last.Click
## 1           0                     1.633                    1.633
## 2           0                     9.912                    9.912
## 3          NA                    19.512                   19.512
## 4          NA                    20.468                   20.468
## 5           0                     0.000                    0.000
## 6           1                    17.386                   17.386
##   X3_matrixtime_Page.Submit X3_matrixtime_Click.Count X3_honesty1
## 1                     1.886                         1           0
## 2                    10.812                         1           0
## 3                    21.010                         1           0
## 4                    21.007                         1           1
## 5                    21.023                         0          NA
## 6                    21.013                         1           1
##   X4_matrixtime_First.Click X4_matrixtime_Last.Click X4_matrixtime_Page.Submit
## 1                     1.001                    1.001                     1.215
## 2                     8.484                    8.484                    12.599
## 3                     0.000                    0.000                    21.112
## 4                     0.000                    0.000                    21.109
## 5                    19.933                   19.933                    20.527
## 6                    16.401                   16.401                    21.013
##   X4_matrixtime_Click.Count X4_honesty1 X5_matrixtime_First.Click
## 1                         1           0                     0.644
## 2                         1           0                     6.091
## 3                         0          NA                     0.000
## 4                         0          NA                    19.508
## 5                         1           1                    18.299
## 6                         1           1                    14.585
##   X5_matrixtime_Last.Click X5_matrixtime_Page.Submit X5_matrixtime_Click.Count
## 1                    0.644                     0.828                         1
## 2                    6.091                     6.858                         1
## 3                    0.000                    21.007                         0
## 4                   19.508                    21.007                         1
## 5                   18.299                    18.971                         1
## 6                   14.585                    21.111                         1
##   X5_honesty1 usage1 usage2 usage2_4_TEXT               usage3
## 1           0      3      2                                 $7
## 2           0      4      3                             20 GBP
## 3          NA      4      2                             $5-$10
## 4           1      3      2                                 20
## 5           1      2      2                                  4
## 6           1      2      2               I don’t know like 7$
##                                                                                                                             usage4
## 1                                                                                                                             100%
## 2                                                                                                                              97%
## 3 I'm not sure. I believe the vast majority, if not all. It will not let me check while doing this task, but it's likely over 95%.
## 4                                                                                                                             100%
## 5                                                                                                                               97
## 6                                                                                                            I don’t know like 97%
##   Q9.5_8 Q9.5_9 Q9.5_20 Q9.5_21 Q9.5_16 Q9.5_16_TEXT usage6 usage6_3_TEXT
## 1      1      5       1       1       1                   4            NA
## 2      1      5       1       1       1                   4            NA
## 3      1      4       1       1       1                   4            NA
## 4      1      4       1       1       5       Google      4            NA
## 5      5      5       3       1       1         none      4            NA
## 6      1      4       3       1       1                   1            NA
##   honesty2 honesty2_2_TEXT                       id
## 1        0                 610aba8c00984e30b7fc809b
## 2        0                 616507c81808c6caff7f6278
## 3        1                 5ed70cb596035b197801f12f
## 4        1                 614f783d947d56fab33f93fe
## 5        1                 58a96911ea3d11000170e50f
## 6        1                 6111b9c48538fdc476c7b8f1
##                                    comments SC0             PROLIFIC_PID
## 1                                             5 610aba8c00984e30b7fc809b
## 2                                             5 616507c81808c6caff7f6278
## 3                                             2 5ed70cb596035b197801f12f
## 4                                             0 614f783d947d56fab33f93fe
## 5                                             2 58a96911ea3d11000170e50f
## 6 Sorry I couldn’t answer the math problems   0 6111b9c48538fdc476c7b8f1
##   workerid assignmentId hitId     site gender.1 aid randomId sign   sample
## 1       NA           NA    NA Prolific       NA  NA       NA  GBP Prolific
## 2       NA           NA    NA Prolific       NA  NA       NA  GBP Prolific
## 3       NA           NA    NA Prolific       NA  NA       NA  GBP Prolific
## 4       NA           NA    NA Prolific       NA  NA       NA  GBP Prolific
## 5       NA           NA    NA Prolific       NA  NA       NA  GBP Prolific
## 6       NA           NA    NA Prolific       NA  NA       NA  GBP Prolific
##   Create.New.Field.or.Choose.From.Dropdown...
## 1                                          NA
## 2                                          NA
## 3                                          NA
## 4                                          NA
## 5                                          NA
## 6                                          NA

        if (interactive()) View(dat) # view dataset (skipped when knitting, since View() needs a GUI)

        nrow(dat) # number of rows in dataset ## N = 235

## [1] 235

        ncol(dat) # number of columns in dataset

## [1] 121

#### Data exclusion / filtering
  #### Removing unwanted data ####
    # Omitting rows with NAs (note: the result is stored in `data`; the analyses below continue to use `dat`)
      data <- na.omit(dat)
    # Removing columns we don't need for the primary analyses
      dat <- dat %>% 
          dplyr::select(ac1, ac2, ac3, comprehension1, 
                        comprehension2, contains("honesty"),
                        ResponseId, site, sample
                        )
      colnames(dat)

##  [1] "ac1"             "ac2"             "ac3"             "comprehension1" 
##  [5] "comprehension2"  "X1_honesty1"     "X2_honesty1"     "X3_honesty1"    
##  [9] "X4_honesty1"     "X5_honesty1"     "honesty2"        "honesty2_2_TEXT"
## [13] "ResponseId"      "site"            "sample"

Confirmatory analysis

I report differences between sites on each data quality measure and then aggregate those findings to a composite score of data quality, reporting differences across the three sites.

ATTENTION (originally : ACQs, χ2(4) = 548.48, 203.56, p < .001. )

#### recode ac1 so that: pass if ac1 == "6"; fail otherwise
#### recode ac2 so that: pass if ac2 == "3"; fail otherwise
#### recode ac3 so that: pass if ac3 == "1"; fail otherwise

dat <- dat %>% 
  mutate(ac1_pass = ifelse(ac1 == "6", 1,0),
         ac2_pass = ifelse(ac2 == "3", 1,0),
         ac3_pass = ifelse(ac3 == "1", 1,0),
         ac_total = (ac1_pass + ac2_pass + ac3_pass)
         )

### Basic Descriptive Stats for Attention from each Sample
group_by(dat, sample) %>%
  summarise(
    mean = mean(ac_total, na.rm = TRUE),
    sd = sd(ac_total, na.rm = TRUE) # the na.rm tells R to ignore NA values
           )

## # A tibble: 2 × 3
##   sample    mean    sd
##   <chr>    <dbl> <dbl>
## 1 MTurk     2.41 0.920
## 2 Prolific  2.73 0.700
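
The original attention result is reported as a chi-square test, whereas the code above only gives means and SDs. A minimal sketch of how the number of passed attention checks could be cross-tabulated by sample and tested is shown below. The data frame `toy` and its counts are made up purely for illustration; they are not the replication data, and this is not necessarily the exact test specification Peer et al. used.

```r
# Hypothetical toy data: 40 participants per sample, 0-3 attention checks passed.
# These counts are invented for illustration only.
toy <- data.frame(
  sample = rep(c("MTurk", "Prolific"), each = 40),
  ac_total = c(rep(0:3, times = c(5, 8, 12, 15)),
               rep(0:3, times = c(5, 5, 10, 20)))
)

# Contingency table: rows = sample, columns = number of checks passed
ac_table <- table(toy$sample, toy$ac_total)

# Pearson chi-square test of independence;
# df = (rows - 1) * (columns - 1) = (2 - 1) * (4 - 1) = 3
ac_chisq <- chisq.test(ac_table)
ac_chisq
```

With the real data, `toy` would be replaced by `dat`, and a third site would raise the degrees of freedom accordingly.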

COMPREHENSION (originally : χ2(4) = 152.4, p < .001 )

#### Have 2 coders code any response that suggested a minimum level of understanding as indicating comprehension and only flag responses as incorrect if they were undoubtedly illegible. Responses that are flagged by both raters will be coded as incorrect answers.
  ### manually code open responses to comprehension1
  ### manually code open responses to comprehension2

dat <- dat %>% 
  mutate(comprehension_total = (comprehension1 + comprehension2)
         )
dat$comprehension_total

##   [1]  2 NA  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2 NA  2  2  2  2  2
##  [26]  2  2  1  2  2  2  2  2  2  2  1  2  2  2  2  1  2  2  2  2  2  2  2  2  2
##  [51]  2  2  1  2  2  2  2  2  2  2  2  1  2  2  2  2  2  2  2  2  2  2  2  2  2
##  [76]  1  1  2  2  2  2  2  2  1  2  2  2  2  2  2  2  2  2  1  2  2  2  2  2  2
## [101] NA NA NA  2 NA  0  0  1  2  0  0  2  1  2  0  1  0  2  0  2  2  0  2  0  0
## [126]  2  2  2  2  1  2  1  2  0  0  1  0  0  2  1  1  1  0  1  1  0  0  0  2  2
## [151]  2  2  1  0  2  1  2  2  2  2  2  2  1  1  2  2  0  1  2  2  2  1  2  2  2
## [176]  0  2  1  2  1  1  2  2  2  2  2  1  2  2  2  2  2  2  2  2  2  2  2  1  0
## [201]  2  2  2  2  2  2  2 NA NA NA  0 NA NA NA NA NA NA NA NA NA NA NA NA NA  2
## [226]  2  2 NA NA  2  2 NA NA  0 NA

### Basic Descriptive Stats for Comprehension from each Sample
group_by(dat, sample) %>%
  summarise(
    mean = mean(comprehension_total, na.rm = TRUE),
    sd = sd(comprehension_total, na.rm = TRUE) # the na.rm tells R to ignore NA values
           )

## # A tibble: 2 × 3
##   sample    mean    sd
##   <chr>    <dbl> <dbl>
## 1 MTurk     1.36 0.822
## 2 Prolific  1.91 0.289

HONESTY (originally : χ2(4) = 153.44, p < .001 )

Peer et al. (2021) claim that 2 out of the 5 matrices in the first measure of honesty (X1_honesty1 to X5_honesty1) are unsolvable. However, using the same .QSF file, I find that three matrices are unsolvable.

More specifically, X1_honesty1 and X2_honesty1 are solvable, so we won’t look at participants’ responses to these matrices when calculating the overall honesty scores.

X3_honesty1, X4_honesty1, and X5_honesty1 are NOT solvable. Since responses were coded as 0 for “found it” and 1 for “no”, they do not need to be reverse-coded to serve as honesty scores (i.e., a response of 0 to any of these three items means the participant is NOT being honest).

#### code responses to honesty manually so that:
  ### honesty1 : 0 = "found it" = dishonest    1 = "no" = honest
  ### honesty2  : 0 = not honest 1 = honest
  ### honesty3 : 0 = not honest 1 = honest 2 = other

colnames(dat)

##  [1] "ac1"                 "ac2"                 "ac3"                
##  [4] "comprehension1"      "comprehension2"      "X1_honesty1"        
##  [7] "X2_honesty1"         "X3_honesty1"         "X4_honesty1"        
## [10] "X5_honesty1"         "honesty2"            "honesty2_2_TEXT"    
## [13] "ResponseId"          "site"                "sample"             
## [16] "ac1_pass"            "ac2_pass"            "ac3_pass"           
## [19] "ac_total"            "comprehension_total"

dat <- dat %>%
  mutate(honesty1_matrix = (X3_honesty1 + X4_honesty1 +
                          X5_honesty1
                          ),
          honesty_total = (honesty1_matrix + honesty2
                          )
         )

### Basic Descriptive Stats for Honesty from each Sample
group_by(dat, sample) %>%
  summarise(
    mean = mean(honesty_total, na.rm = TRUE),
    sd = sd(honesty_total, na.rm = TRUE) # the na.rm tells R to ignore NA values
           )

## # A tibble: 2 × 3
##   sample    mean    sd
##   <chr>    <dbl> <dbl>
## 1 MTurk     2.04  1.51
## 2 Prolific  2.59  1.36

### We can remove the unnecessary honesty columns from our dataset now 
## don't forget to include the new variables (comprehension_total, etc.) in this process.
colnames(dat)

##  [1] "ac1"                 "ac2"                 "ac3"                
##  [4] "comprehension1"      "comprehension2"      "X1_honesty1"        
##  [7] "X2_honesty1"         "X3_honesty1"         "X4_honesty1"        
## [10] "X5_honesty1"         "honesty2"            "honesty2_2_TEXT"    
## [13] "ResponseId"          "site"                "sample"             
## [16] "ac1_pass"            "ac2_pass"            "ac3_pass"           
## [19] "ac_total"            "comprehension_total" "honesty1_matrix"    
## [22] "honesty_total"

    dat <- dat %>% 
        dplyr::select(ac1, ac2, ac3, ac_total, comprehension1, 
                      comprehension2, comprehension_total, 
                      X3_honesty1, X4_honesty1, X5_honesty1, 
                      honesty1_matrix, honesty2, honesty_total,
                      ResponseId, site, sample
                      )
colnames(dat)

##  [1] "ac1"                 "ac2"                 "ac3"                
##  [4] "ac_total"            "comprehension1"      "comprehension2"     
##  [7] "comprehension_total" "X3_honesty1"         "X4_honesty1"        
## [10] "X5_honesty1"         "honesty1_matrix"     "honesty2"           
## [13] "honesty_total"       "ResponseId"          "site"               
## [16] "sample"

OVERALL DATA QUALITY SCORES (From the original study: “The score gave participants a value between 0 and 7, showing whether they passed one or both ACQs, answered correctly one or two comprehension questions,.. claim to have solved the unsolvable problems, [and claimed to qualify for a future study that they did not qualify for based on their earlier responses in the survey]. … The overall composite score should not be considered as measuring the same construct. Rather, it is used here as a multifactorial measure that attests to the overall general level of data quality”)

dat <- dat %>%
  mutate(dataquality = (ac_total + comprehension_total +
                        honesty_total)
          )

colnames(dat)

##  [1] "ac1"                 "ac2"                 "ac3"                
##  [4] "ac_total"            "comprehension1"      "comprehension2"     
##  [7] "comprehension_total" "X3_honesty1"         "X4_honesty1"        
## [10] "X5_honesty1"         "honesty1_matrix"     "honesty2"           
## [13] "honesty_total"       "ResponseId"          "site"               
## [16] "sample"              "dataquality"

dat$dataquality

##   [1]  3 NA NA NA NA  9  4 NA NA NA NA  7 NA NA NA  7  9 NA NA NA  7 NA NA NA NA
##  [26] NA NA NA NA NA  9 NA NA NA  9 NA  8  8 NA NA NA NA NA NA NA NA NA NA NA NA
##  [51]  9 NA  7  9 NA NA NA NA  6 NA  9  4 NA NA  6 NA NA NA NA NA NA NA NA  8  9
##  [76]  5 NA  9  7  8  7  6 NA  7 NA NA NA NA NA NA NA  6 NA NA  6  7  8 NA NA NA
## [101] NA NA NA NA NA  1  2  5  9  3  5  6  5  9 NA  2  0 10  2  4 NA  3  9  2  0
## [126] NA  6  8  9  8 NA  3 NA  3  6  4  4  2  9  5  4  3 NA  6  6  1 NA  3  7 NA
## [151] NA  6  4  4  3  4 NA NA  5 NA NA  6  5 NA NA  6 NA NA  6 NA NA  4 NA  6  7
## [176]  3 NA  7  9  5 NA  9 NA  9  9 NA  5  8  6 NA  9 NA NA NA  9  7  9 NA NA  6
## [201] NA  9  9  9  9  7 NA NA NA NA  2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [226] NA  9 NA NA NA NA NA NA NA NA
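
Many of the composite scores above are NA because `+` returns NA whenever any single component is missing. The sketch below illustrates that behaviour on a made-up three-row data frame (`toy` is hypothetical, not the replication data) and contrasts it with `rowSums(..., na.rm = TRUE)`. The lenient version treats a missing component as 0, which changes what the score means, so it is shown only for comparison and not as a recommended fix.

```r
# Hypothetical toy data with some missing component scores
toy <- data.frame(
  ac_total            = c(3, 2, NA),
  comprehension_total = c(2, NA, 1),
  honesty_total       = c(4, 3, 2)
)

# Strict composite: NA if any component is missing (same behaviour as `+` in mutate())
strict <- toy$ac_total + toy$comprehension_total + toy$honesty_total

# Lenient composite: missing components are silently counted as 0
lenient <- rowSums(toy, na.rm = TRUE)

# How many components each participant actually has
n_components <- rowSums(!is.na(toy))

data.frame(strict, lenient, n_components)
```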

LOOKING AT ALL MEANS ON EACH PLATFORM

group_by(dat, sample) %>%
  summarise(
    mean = mean(dataquality, na.rm = TRUE),
    sd = sd(dataquality, na.rm = TRUE) # the na.rm tells R to ignore NA values
           )

## # A tibble: 2 × 3
##   sample    mean    sd
##   <chr>    <dbl> <dbl>
## 1 MTurk     5.61  2.67
## 2 Prolific  7.19  1.66

summary(dat$dataquality) # dataquality ranges from 0 to 10 (M = 6.09)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   4.000   6.000   6.087   9.000  10.000     132

## The original authors' data quality score ranged from 0 to 7 and comprised comp1, comp2, honesty1_1, honesty1_2, honesty2, ac1, and ac2. Here we added ac3 and honesty1_3, so the maximum should be 9. An observed maximum of 10 therefore means that at least one component exceeds its expected range (possibly honesty2 or a comprehension item coded above 1); this still needs to be checked.
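
One way to track down a composite that exceeds its intended maximum is to compare each component's observed maximum with its expected maximum. The sketch below does this on a made-up data frame (`toy` is hypothetical). The expected maxima follow the coding described in this report (three 0/1 attention checks, two 0/1 comprehension items, three 0/1 matrix items, one 0/1 honesty2 item), which is itself an assumption about how the raw columns are coded.

```r
# Expected maximum of each component under the coding described in this report (assumed)
expected_max <- c(ac_total = 3, comprehension_total = 2,
                  honesty1_matrix = 3, honesty2 = 1)

# Hypothetical toy data in which honesty2 contains an unexpected value of 2
toy <- data.frame(
  ac_total            = c(3, 2, 1),
  comprehension_total = c(2, 2, 0),
  honesty1_matrix     = c(3, 1, 0),
  honesty2            = c(1, 2, 0)
)

# Observed maximum per component, ignoring missing values
observed_max <- sapply(toy[names(expected_max)], max, na.rm = TRUE)

# Components whose observed maximum exceeds the expected maximum
out_of_range <- names(expected_max)[observed_max > expected_max]
out_of_range
```

Running the same comparison on `dat` instead of `toy` would show which column pushes the composite above 9.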

COMPARING DATA QUALITY ACROSS THE GROUPS (the different sites)

# Compute t test since we currently only have 2 samples
ttest <- print(t.test(dat$dataquality ~ dat$sample))

## 
##  Welch Two Sample t-test
## 
## data:  dat$dataquality by dat$sample
## t = -3.6508, df = 87.815, p-value = 0.0004436
## alternative hypothesis: true difference in means between group MTurk and group Prolific is not equal to 0
## 95 percent confidence interval:
##  -2.4438623 -0.7210123
## sample estimates:
##    mean in group MTurk mean in group Prolific 
##               5.611111               7.193548

library("report") # must be loaded before report() is called
report(ttest) # medium significant difference with MTurk having lower data quality

# Once data collection is complete, compute a one-way ANOVA
singleANOVA <- aov(dataquality ~ sample, data = dat)
# Summary of one-way ANOVA
summary(singleANOVA)

##              Df Sum Sq Mean Sq F value  Pr(>F)   
## sample        1   54.3   54.26   9.322 0.00289 **
## Residuals   101  587.9    5.82                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 132 observations deleted due to missingness

library("report") # Load the package every time you start R
report(singleANOVA) # medium significant difference with MTurk having lower data quality

## For one-way between subjects designs, partial eta squared is equivalent to eta squared.
## Returning eta squared.

## The ANOVA (formula: dataquality ~ sample) suggests that:
## 
##   - The main effect of sample is statistically significant and medium (F(1, 101) = 9.32, p = 0.003; Eta2 = 0.08, 95% CI [0.02, 1.00])
## 
## Effect sizes were labelled following Field's (2013) recommendations.

## checking normality even though it's not necessary given the size of the samples
# histogram
hist(singleANOVA$residuals)

# QQ-plot
library(car)

qqPlot(singleANOVA$residuals,
  id = FALSE # id = FALSE to remove point identification
      )

## Since residuals follow a normal distribution, we can now check whether the variances are equal across different groups. The result will determine whether we use the ANOVA or the Welch ANOVA.

par(mfrow = c(1, 2)) # combine plots

# 1. Homogeneity of variances
plot(singleANOVA, which = 3)

# 2. Normality
plot(singleANOVA, which = 2)

Data quality scores differ significantly between the two sites sampled so far (MTurk and Prolific).

The Welch Two Sample t-test testing the difference of data quality scores by sample (mean in group MTurk = 5.61, mean in group Prolific = 7.19) suggests that the effect is negative, statistically significant, and medium (difference = -1.58, 95% CI [-2.44, -0.72], t(87.81) = -3.65, p < .001; Cohen’s d = -0.71, 95% CI [-1.11, -0.31]).

The one-way ANOVA also demonstrates that the main effect of sample is statistically significant and medium (F(1, 101) = 9.32, p = 0.003; Eta2 = 0.08, 95% CI [0.02, 1.00]).
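
As a quick arithmetic check, the reported F value and eta squared can be recomputed from the sums of squares printed in the ANOVA table above. The sketch below uses only those printed (rounded) values, so small rounding differences are possible; the variable names are mine.

```r
# Values copied from the one-way ANOVA table printed above
ss_sample    <- 54.26   # sum of squares for `sample` (equals Mean Sq because Df = 1)
ss_residuals <- 587.9   # residual sum of squares
df_sample    <- 1
df_residuals <- 101

# Eta squared = effect sum of squares / total sum of squares
eta_squared <- ss_sample / (ss_sample + ss_residuals)

# F = mean square of the effect / mean square of the residuals
f_value <- (ss_sample / df_sample) / (ss_residuals / df_residuals)

round(c(eta_squared = eta_squared, F = f_value), 2)
```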

#### *Side-by-side graph with original graph is ideal here*
data <- dat %>% 
        group_by(sample, dataquality) %>%
        summarise()

## `summarise()` has grouped output by 'sample'. You can override using the `.groups` argument.

# boxplots and t-tests since we only have 2 samples at this point
## Basic Boxplot
boxplot(dat$dataquality ~ dat$sample, 
        ylab = ("Composite Data Quality Score"), 
        xlab = ("Sites used to Gather the Data")
       ) ## there are no outliers

## Elementary Boxplot
ggplot(dat) +
  aes(x = sample, y = dataquality) +
  geom_boxplot() + # geom_smooth() dropped: it needs a continuous x variable
  theme_classic() + # ggplot2's built-in theme; theme_classic2() requires ggpubr, which is only loaded later
  ylab("Composite Data Quality Score") + 
  xlab("Sites used to Gather the Data")

## Cool colorful boxplot with P values
library(ggpubr)
# Edit from here #
x <- which(names(dat) == "sample") # name of grouping variable
y <- which(names(dat) == "dataquality"
           | names(dat) == "ac_total"
           | names(dat) == "comprehension_total"
           | names(dat) == "honesty_total"
           ) # names of variables to test
method <- "t.test" # one of "wilcox.test" or "t.test"
paired <- FALSE # if paired make sure that in the dataframe you have first all individuals at T1, then all individuals again at T2
# Edit until here

# Edit at your own risk
for (i in y) {
  for (j in x) {
    # if/else (rather than ifelse()) selects the plot type for a single logical flag
    p <- if (paired) {
      ggpaired(dat,
        x = colnames(dat[j]), y = colnames(dat[i]),
        color = colnames(dat[j]), line.color = "gray", line.size = 0.4,
        palette = "npg",
        legend = "none",
        xlab = colnames(dat[j]),
        ylab = colnames(dat[i]),
        add = "jitter"
      )
    } else {
      ggboxplot(dat,
        x = colnames(dat[j]), y = colnames(dat[i]),
        color = colnames(dat[j]),
        palette = "npg",
        legend = "none",
        add = "jitter"
      )
    }
    #  Add p-value
    print(p + stat_compare_means(aes(label = paste0(..method.., ", p-value = ", ..p.format..)),
      method = method,
      paired = paired,
      # group.by = NULL,
      ref.group = NULL
    ))
  }
}

## Warning: Removed 7 rows containing non-finite values (stat_boxplot).

## Warning: Removed 7 rows containing non-finite values (stat_compare_means).

## Warning: Removed 7 rows containing missing values (geom_point).

## Warning: Removed 27 rows containing non-finite values (stat_boxplot).

## Warning: Removed 27 rows containing non-finite values (stat_compare_means).

## Warning: Removed 27 rows containing missing values (geom_point).

## Warning: Removed 131 rows containing non-finite values (stat_boxplot).

## Warning: Removed 131 rows containing non-finite values (stat_compare_means).

## Warning: Removed 131 rows containing missing values (geom_point).

## Warning: Removed 132 rows containing non-finite values (stat_boxplot).

## Warning: Removed 132 rows containing non-finite values (stat_compare_means).

## Warning: Removed 132 rows containing missing values (geom_point).

### Run this code after data collection is complete to visualize the comparison for all 3 sites
# Edit from here
x <- which(names(dat) == "sample") # name of grouping variable
y <- which(names(dat) == "dataquality"
           | names(dat) == "ac_total"
           | names(dat) == "comprehension_total"
           | names(dat) == "honesty_total"# names of variables to test
          )
method1 <- "anova" # one of "anova" or "kruskal.test"
method2 <- "t.test" # one of "wilcox.test" or "t.test"
my_comparisons <- list(c("MTurk", "Prolific"), 
                       c("MTurk", "CloudResearch"), 
                       c("Prolific", "CloudResearch")) # comparisons for post-hoc tests
# Edit until here

# Edit at your own risk
for (i in y) {
  for (j in x) {
    p <- ggboxplot(dat,
      x = colnames(dat[j]), y = colnames(dat[i]),
      color = colnames(dat[j]),
      legend = "none",
      palette = "npg",
      add = "jitter"
    )
    print(
      p + stat_compare_means(aes(label = paste0(..method.., ", p-value = ", ..p.format..)),
        method = method1, label.y = max(dat[, i], na.rm = TRUE)
      )
      + stat_compare_means(comparisons = my_comparisons, method = method2, label = "p.format") # remove if p-value of ANOVA or Kruskal-Wallis test >= alpha
    )
  }
}

## Warning: Removed 7 rows containing non-finite values (stat_boxplot).

## Warning: Removed 7 rows containing non-finite values (stat_compare_means).

## Warning: Removed 7 rows containing non-finite values (stat_signif).

## Warning: Computation failed in `stat_signif()`:
## not enough 'y' observations

## Warning: Removed 7 rows containing missing values (geom_point).

## Warning: Removed 27 rows containing non-finite values (stat_boxplot).

## Warning: Removed 27 rows containing non-finite values (stat_compare_means).

## Warning: Removed 27 rows containing non-finite values (stat_signif).

## Warning: Computation failed in `stat_signif()`:
## not enough 'y' observations

## Warning: Removed 27 rows containing missing values (geom_point).

## Warning: Removed 131 rows containing non-finite values (stat_boxplot).

## Warning: Removed 131 rows containing non-finite values (stat_compare_means).

## Warning: Removed 131 rows containing non-finite values (stat_signif).

## Warning: Computation failed in `stat_signif()`:
## not enough 'y' observations

## Warning: Removed 131 rows containing missing values (geom_point).

## Warning: Removed 132 rows containing non-finite values (stat_boxplot).

## Warning: Removed 132 rows containing non-finite values (stat_compare_means).

## Warning: Removed 132 rows containing non-finite values (stat_signif).

## Warning: Computation failed in `stat_signif()`:
## not enough 'y' observations

## Warning: Removed 132 rows containing missing values (geom_point).

#### coolest Boxplot
library(ggstatsplot)

## You can cite this package as:
##      Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
##      Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167

ggbetweenstats(data = dat,
                x = sample,
                y = dataquality,
                type = "parametric", # ANOVA or Kruskal-Wallis
                var.equal = TRUE, # ANOVA or Welch ANOVA
                plot.type = "box",
                pairwise.comparisons = TRUE,
                pairwise.display = "significant",
                centrality.plotting = FALSE,
                bf.message = FALSE
              )

Exploratory analyses

I truly wish I had more time to run exploratory analyses in time for my final presentation and the submission of this report. I began looking at data reliability and responses to the Need for Cognition (NFC) Scale but have not had a chance to complete that.

For the time being, please disregard the below section on reliability and the NFC scale. I look forward to looking into the data more closely once the quarter ends!

RELIABILITY

#### Reliability is measured using the Cronbach's alpha of the NFC scale
### The original authors re-coded the negatively worded items of the NFC before running analyses; here, the code explicitly defines the reverse-scored items within the scale. Since the NFC is a 5-point Likert scale, I compute 6 - item to reverse-code an item.

## Reverse scoring nfc items
dat$nfc3_reversed <- 6-dat$nfc3

## Error in `$<-.data.frame`(`*tmp*`, nfc3_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc4_reversed <- 6-dat$nfc4

## Error in `$<-.data.frame`(`*tmp*`, nfc4_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc5_reversed <- 6-dat$nfc5

## Error in `$<-.data.frame`(`*tmp*`, nfc5_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc7_reversed <- 6-dat$nfc7

## Error in `$<-.data.frame`(`*tmp*`, nfc7_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc8_reversed <- 6-dat$nfc8

## Error in `$<-.data.frame`(`*tmp*`, nfc8_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc9_reversed <- 6-dat$nfc9

## Error in `$<-.data.frame`(`*tmp*`, nfc9_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc12_reversed <- 6-dat$nfc12

## Error in `$<-.data.frame`(`*tmp*`, nfc12_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc16_reversed <- 6-dat$nfc16

## Error in `$<-.data.frame`(`*tmp*`, nfc16_reversed, value = numeric(0)): replacement has 0 rows, data has 235

dat$nfc17_reversed <- 6-dat$nfc17

## Error in `$<-.data.frame`(`*tmp*`, nfc17_reversed, value = numeric(0)): replacement has 0 rows, data has 235

## Creating the nfc scale
dat <- dat %>% 
  mutate(nfc_scale = (nfc1 + nfc2 + nfc3_reversed +
                      nfc4_reversed + nfc5_reversed +
                      nfc6 + nfc7_reversed + nfc8_reversed +
                      nfc9_reversed + nfc10 + nfc11 +
                      nfc12_reversed + nfc13 + nfc14 +nfc15 + 
                      nfc16_reversed + nfc17_reversed + nfc18
                      )
         )

## Error: Problem with `mutate()` column `nfc_scale`.
## ℹ `nfc_scale = (...)`.
## x object 'nfc1' not found

cronbach.alpha(nfc_total)

## Error in cronbach.alpha(nfc_total): object 'nfc_total' not found
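
Cronbach's alpha is computed from the individual item columns rather than from a single summed score. The sketch below shows the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), in base R on made-up items. `reverse_likert()`, `cronbach_alpha()`, and `toy_items` are hypothetical names introduced for illustration; this is not the `cronbach.alpha()` function called above, and the toy values are not NFC responses.

```r
# Reverse-score an item on an n-point Likert scale (5-point: 6 - x)
reverse_likert <- function(x, n_points = 5) (n_points + 1) - x

# Cronbach's alpha from a data frame with one column per item;
# rows with any missing item are dropped first
cronbach_alpha <- function(items) {
  items <- na.omit(as.data.frame(items))
  k <- ncol(items)
  item_vars <- sapply(items, var)      # variance of each item
  total_var <- var(rowSums(items))     # variance of the summed scale
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Hypothetical toy items; q3_neg is negatively worded
toy_items <- data.frame(
  q1     = c(1, 2, 3, 4, 5),
  q2     = c(2, 2, 3, 5, 5),
  q3_neg = c(5, 4, 3, 2, 1)
)
toy_items$q3 <- reverse_likert(toy_items$q3_neg)

alpha_toy <- cronbach_alpha(toy_items[c("q1", "q2", "q3")])
alpha_toy
```

For the real analysis, the item columns would have to be retained in `dat` (they are dropped by the earlier `select()` calls) before reverse-scoring and computing alpha.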

Discussion

Summary of Replication Attempt

In short, the present study only partially replicated the findings of Peer et al. (2021), because no data have yet been collected or analyzed from participants on CloudResearch. For the comparison between Prolific and MTurk, the present results replicate those of the original study: data quality was significantly lower on MTurk (M = 5.61, SD = 2.67) than on Prolific (M = 7.19, SD = 1.66), a medium-sized difference (t(87.81) = -3.65, p < .001; Cohen’s d = -0.71, 95% CI [-1.11, -0.31]). The one-way ANOVA agreed with the Welch Two Sample t-test reported above: the main effect of sample is statistically significant and medium (F(1, 101) = 9.32, p = 0.003; Eta2 = 0.08, 95% CI [0.02, 1.00]).

Commentary

Deviations from the original study (e.g., in procedure and exclusion criteria) are very unlikely to have moderated the results. Whether the original findings, and the aforementioned critiques of them, hold will be better answered once data collection is complete.