Initial report

Description of task

Toy name               Hidden effect size
whizbang balls         0
constructo bricks      0.1
rainbow clickers       0.1851
doodle noodles         0.2961
singing bling rings    0.4331
brahma buddies         0.5961
magic colorclay        0.7851
moon-candy makers      1

Sample size index    Hidden sample size per group    Seconds to run ‘experiment’
1                    10                              1
2                    12                              2
3                    14                              2
4                    16                              2
5                    19                              2
6                    22                              3
7                    26                              3
8                    30                              3
9                    35                              4
10                   41                              5
11                   48                              5
12                   57                              6
13                   66                              7
14                   78                              8
15                   91                              10
16                   106                             11
17                   125                             13
18                   146                             15
19                   171                             18
20                   200                             20
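
The two hidden columns appear to follow simple rules, although the report does not state them. The per-group sample sizes match 20 values spaced geometrically from 10 to 200 and rounded to the nearest integer, and the run times match one second per 10 participants per group, rounded up. The Python sketch below checks that reading against the table; both rules and the function names are assumptions made for illustration.

```python
import math

# Hidden sample sizes per group and run times, copied from the table above.
TABLE_N = [10, 12, 14, 16, 19, 22, 26, 30, 35, 41,
           48, 57, 66, 78, 91, 106, 125, 146, 171, 200]
TABLE_SECONDS = [1, 2, 2, 2, 2, 3, 3, 3, 4, 5,
                 5, 6, 7, 8, 10, 11, 13, 15, 18, 20]


def geometric_sample_size(index, n_min=10, n_max=200, steps=20):
    """Assumed rule: geometric spacing from n_min to n_max, then rounding."""
    ratio = (n_max / n_min) ** ((index - 1) / (steps - 1))
    return round(n_min * ratio)


def seconds_to_run(n):
    """Assumed rule: one second per 10 participants per group, rounded up."""
    return math.ceil(n / 10)


guessed_n = [geometric_sample_size(i) for i in range(1, 21)]
guessed_seconds = [seconds_to_run(n) for n in TABLE_N]

# Report whether each assumed rule reproduces its table column.
print("sample sizes match:", guessed_n == TABLE_N)
print("run times match:", guessed_seconds == TABLE_SECONDS)
```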

\[ \begin{aligned} z &\sim \mbox{Normal}\left(\delta\sqrt{n/2},\, 1\right)\\ x &= \mbox{sgn}(z)\left[1 - \left(1 - F_{\chi_1^2}\left(z^2\right)\right)^{\frac{1}{q}}\right] \end{aligned} \] where \(\delta\) is the hidden effect size, \(n\) is the hidden sample size per group, \(q\in\{3,7\}\), and \(x \in (-1,1)\).

Inclusion criteria

Initial cleaning, which removed the records of people who did not consent or who were rejected for using a mobile device, left 1042 potential participants. The inclusion criteria, and the number of participants eliminated by each in turn, are reported here.

  • First time doing experiment: Must have answered “No, this is my FIRST TIME doing this survey” to the question “Have you completed this survey previously?”. Eliminated 87 participants.
  • Following directions/engagement: Must have performed at least 1 null sample and 1 experimental sample. Participants eliminated by this criterion started the instructions, but did not perform the experimental task. Eliminated 233 participants.
  • Field is arguably scientific: Must have offered an answer to the question “Is your work in a field that would typically be considered scientific?”, and that answer must not have been “No”. Eliminated 167 participants.
  • Education at University level: Must have offered an answer to the question “What is the highest level of formal education you have achieved in your scientific field?”, and that answer must not have been “I have no formal scientific education at the University level”. Eliminated 53 participants.

Total number of participants after exclusions: 502
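
As a quick arithmetic check, the following Python sketch applies the four exclusions in the order listed and confirms that they lead from 1042 to 502. The short labels are abbreviations of the criteria above; the variable names are illustrative.

```python
# Successive exclusions, in the order reported above.
initial = 1042
exclusions = [
    ("First time doing experiment", 87),
    ("Following directions/engagement", 233),
    ("Field is arguably scientific", 167),
    ("Education at University level", 53),
]

remaining = initial
for criterion, eliminated in exclusions:
    remaining -= eliminated
    print(f"{criterion}: eliminated {eliminated}, {remaining} remain")
```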

Understanding of shuffle reports

Question: “Do you understand why the random shuffle reports could be useful?”

Exploratory behavior by effect size

## 
##  Kruskal-Wallis rank sum test
## 
## data:  n_null by factor(abs(effect_size))
## Kruskal-Wallis chi-squared = 2.7468, df = 7, p-value = 0.9074
## 
##  Kruskal-Wallis rank sum test
## 
## data:  n_expt by factor(abs(effect_size))
## Kruskal-Wallis chi-squared = 44.052, df = 7, p-value = 2.088e-07
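
The p-values in this output are upper-tail probabilities of a chi-squared distribution with 7 degrees of freedom. As a sanity check, the Python sketch below recomputes them from the reported statistics using the closed form that exists for odd degrees of freedom. The function name is illustrative, and the code is a stand-in for whatever R computes internally, not a description of it.

```python
import math


def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X >= x) for chi-squared with odd df."""
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form only covers odd df")
    # Finite sum: sum over j of x^j / (1 * 3 * 5 * ... * (2j + 1)),
    # for j = 0 .. (df - 3) / 2. The sum is empty when df == 1.
    total, term = 0.0, 1.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2.0))
    density_part = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    return tail + density_part * total


# Statistics copied from the two Kruskal-Wallis tests above.
p_null = chi2_sf_odd_df(2.7468, 7)
p_expt = chi2_sf_odd_df(44.052, 7)
print(f"n_null: p = {p_null:.4f}")
print(f"n_expt: p = {p_expt:.3e}")
```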

Error rates

Question: “I’m ready to report to Santa, and…”

  • “…I will tell him that I conclude that the SPARKLIES are faster when making [toy name].”
  • “…I will tell him that I conclude that the JINGLIES are faster when making [toy name].”
  • “…I will tell him that I conclude that BOTH TEAMS ARE THE SAME SPEED when making [toy name].”
  • “…I will tell him that I CANNOT DETECT A DIFFERENCE between the two teams when making [toy name].”
  • “…I will tell him that I got bored and want to leave.”

Confidence

Question: “How confident are you in your assessment above?”

  • Not confident at all
  • Somewhat doubtful
  • Somewhat confident
  • Very confident

Evidence responses

Effect of evidence power

## 
## Call:
## glm(formula = response_alt ~ abs(effect_size) * factor(evidence_power) + 
##     n_null + n_expt, family = binomial, data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.9546  -0.7577   0.1517   0.5768   1.7136  
## 
## Coefficients:
##                                            Estimate Std. Error z value
## (Intercept)                              -1.1601994  0.2899078  -4.002
## abs(effect_size)                          8.9516657  1.3525389   6.618
## factor(evidence_power)7                  -0.0571807  0.3317220  -0.172
## n_null                                    0.0005566  0.0006672   0.834
## n_expt                                    0.0020710  0.0039387   0.526
## abs(effect_size):factor(evidence_power)7 -2.2494326  1.6236334  -1.385
##                                          Pr(>|z|)    
## (Intercept)                              6.28e-05 ***
## abs(effect_size)                         3.63e-11 ***
## factor(evidence_power)7                     0.863    
## n_null                                      0.404    
## n_expt                                      0.599    
## abs(effect_size):factor(evidence_power)7    0.166    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 634.34  on 501  degrees of freedom
## Residual deviance: 395.94  on 496  degrees of freedom
## AIC: 407.94
## 
## Number of Fisher Scoring iterations: 6
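
To show how the coefficients combine, the Python sketch below converts the fitted linear predictor into a probability that response_alt equals 1. It assumes that evidence power 3 is the reference level of the factor, which the output implies but does not state, and the values of n_null and n_expt in the usage lines are hypothetical. The function name is illustrative.

```python
import math

# Coefficient estimates copied from the glm output above.
COEF = {
    "intercept": -1.1601994,
    "abs_effect": 8.9516657,
    "power7": -0.0571807,
    "n_null": 0.0005566,
    "n_expt": 0.0020710,
    "abs_effect_x_power7": -2.2494326,
}


def fitted_probability(effect_size, evidence_power, n_null, n_expt):
    """Inverse-logit of the linear predictor for one participant."""
    if evidence_power not in (3, 7):
        raise ValueError("evidence_power must be 3 or 7")
    d = abs(effect_size)  # the model uses the absolute effect size
    is_power7 = 1.0 if evidence_power == 7 else 0.0  # assumed: 3 is baseline
    eta = (COEF["intercept"]
           + COEF["abs_effect"] * d
           + COEF["power7"] * is_power7
           + COEF["n_null"] * n_null
           + COEF["n_expt"] * n_expt
           + COEF["abs_effect_x_power7"] * d * is_power7)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical participant who drew 20 null and 20 experimental samples.
for power in (3, 7):
    for d in (0.0, 0.1851, 0.4331, 1.0):
        p = fitted_probability(d, power, n_null=20, n_expt=20)
        print(f"power={power} |effect|={d}: {p:.3f}")
```

Because the interaction estimate is negative, the fitted curve rises more slowly with effect size at evidence power 7 than at evidence power 3, although the output reports p = 0.166 for that term.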

Other questions

Field is scientific

Question: “Is your work in a field that would typically be considered scientific?”

  • Yes
  • No
  • Depends/Maybe/Other [specify]

Education and training

Question: “What is the highest level of formal education you have achieved in your scientific field?”

  • I have no formal scientific education at the University level
  • I have some education at the Bachelor level (or equivalent)
  • I have completed a Bachelor’s degree (or equivalent)
  • I have some education above the Bachelor level (e.g., Master’s, PhD, or equivalent)
  • I have completed a PhD (or equivalent)

and

Question: “How many years of formal statistical training do you have?”

[enter a number]

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   1.000   2.000   3.182   4.000  26.000      32

How they use statistics

Question: “How do statistics play a role in your work? (check all that apply)”

  • Statistics do not play a role in my work
  • I use statistics in practice for the analysis of data
  • I comment on statistical practice, but do not develop them myself
  • I develop statistical methods
  • I comment on the philosophy of statistics
  • Other [specify]

Field

Question: “In what applied field(s) do you use statistics? (check all that apply)”

  • Biological sciences
  • Medical sciences
  • Physical sciences
  • Social/Behavioral sciences
  • Computer science / technology
  • Other [specify]

Preferred statistical method

Question: “If you use statistics regularly in your work, what sort of inferential procedures would you typically prefer?”

  • Classical, frequentist, or error statistical
  • Bayesian (of any sort)
  • None; I prefer only descriptive
  • I do not use statistics in my work
  • Other [specify]

Significance testing

Question: “What is your opinion about statistical significance testing? (check all that apply)”

  • I do not have a strong opinion for or against significance testing
  • I think significance testing is necessary for science
  • I think significance testing is fine, but prefer other approaches to statistical inference
  • I think significance testing is fine, but people misunderstand/misuse it
  • I think significance testing should be discontinued or performed very rarely
  • I think the logic of significance testing is fatally flawed
  • I do not understand the question
  • Other [specify]

Richard D. Morey

2019-01-08