E. J. Wagenmakers recently posted a Twitter poll asking:

You read two papers, each reporting three preregistered analyses. Which effect is more likely to replicate?

  1. p=.024, p=.034, p=.031

  2. p=.25, p=.032, p=.0001

Plenty of researchers struggle with the interpretation of p-values. There are two arguments for option (1):

  1. It has more “significant” p-values (this is a bad argument).
  2. Tighter p-values mean more consistent test statistics, which imply a more precise estimate of the effect size (a little better).

However, there are two reasons that option (2) is the better choice.

Distribution of p-values

For this first approach, I’ll generate fake data with a “small” to “medium” effect size, use Welch’s t-test to evaluate the effect, and examine the distribution of t statistics and p-values.

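library(dplyr)    # pipes, group_by(), do(), tibble()
library(broom)    # tidy() turns a test result into a one-row data frame
library(ggplot2)
set.seed(20812)
N <- 119    # observations per group
es <- 0.4   # true difference in means, in SD units
# Run 1000 simulated experiments, one Welch t-test per row
tTests <- tibble(n = 1:1000) %>%
  group_by(n) %>%
  do(tidy(t.test(rnorm(N, 0, 1), rnorm(N, 0+es, 1))))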

Here are the t statistics:

ggplot(tTests, aes(statistic, after_stat(density))) +
  geom_histogram(binwidth = 0.5)

Here are the resulting p-values:

ggplot(tTests, aes(p.value, y = after_stat(density))) +
  geom_histogram(binwidth = 0.01)

When the null hypothesis is false, the distribution of p-values is positively skewed: most of the mass sits near zero, with a long tail toward larger values. The values from option (2) look more like what we’d expect from this distribution.
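For contrast with the case where the null hypothesis is true, here is a minimal sketch in base R. The helper `sim_p` and its seed are illustrative choices of mine rather than part of the analysis above; it reuses the same group size and effect size.

```r
# Hypothetical helper: simulate two-sided Welch t-test p-values
# for two groups of size n whose means differ by `es` (sd = 1).
sim_p <- function(n_sims, n, es) {
  replicate(n_sims, t.test(rnorm(n, 0, 1), rnorm(n, es, 1))$p.value)
}

set.seed(1)
p_null <- sim_p(1000, n = 119, es = 0)    # null hypothesis true
p_alt  <- sim_p(1000, n = 119, es = 0.4)  # null hypothesis false

# Under the null, p-values are roughly uniform on [0, 1];
# under the alternative, they pile up near zero.
summary(p_null)
summary(p_alt)
mean(p_null < 0.05)  # close to the nominal 5% error rate
mean(p_alt < 0.05)   # an estimate of power at alpha = 0.05
```

The skew comes from the effect itself: the larger the true effect or the sample, the more the p-values concentrate near zero. The rest of this section continues with the `tTests` simulation.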

For example, the mean of the simulated p-values is 0.028, which is close to the mean p-value for option (1) (about 0.030). However, the probability of drawing three p-values from this distribution and seeing none below 0.01, as in option (1), is approximately:

mean(replicate(10000, all(sample(tTests$p.value, 3) > 0.01)))
[1] 0.0302
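As a rough cross-check on the simulated figure, the same quantity can be approximated analytically from the power of the test at a threshold of 0.01. This is a sketch under assumptions: it uses `power.t.test()` from base R's stats package, which assumes equal variances (a pooled test rather than Welch's), and it treats the three p-values as independent draws.

```r
# Assumed design, matching the simulation: two groups of n = 119,
# true mean difference 0.4, sd 1, two-sided test.
pw <- power.t.test(n = 119, delta = 0.4, sd = 1, sig.level = 0.01)$power

# Probability that a single p-value exceeds 0.01, and that three
# independent p-values all do.
p_miss_one   <- 1 - pw
p_miss_three <- p_miss_one^3
c(power = pw, miss_three = p_miss_three)
```

If the simulation is behaving, `p_miss_three` should land in the same neighborhood as the resampled estimate.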

Effect size

Using the degrees of freedom in this example data, I can convert the p-values from the poll to t statistics and estimate Cohen’s d for effect size:

pVals <- matrix(c(0.024, 0.034, 0.031,
                  0.25, 0.032, 0.0001),
                nrow = 2, byrow = TRUE)
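# Pooled degrees of freedom for two groups of size N
# (the Welch df in the simulation is close to this)
dof <- N*2 - 2
# Two-sided p-value -> t statistic -> approximate Cohen's d (2t / sqrt(df))
tVals <- qt(pVals/2, dof)
dVals <- abs(2*tVals) / sqrt(dof)
# Average implied effect size per option (one row per option)
apply(dVals, 1, mean)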
[1] 0.2853020 0.3154265

The p-values for option (2) imply a larger average effect size than those for option (1). If we assume that a successful “replication” means finding an effect where one exists, the second set of p-values is the better choice, since a larger effect is detected more often at a given sample size.
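Because the comparison rests on averages, it can help to look at the individual implied effect sizes as well. The sketch below wraps the same conversion in a helper; the name `p_to_d` and the row labels are hypothetical, and it assumes two-sided p-values from two-sample tests with 119 observations per group, as in the simulated data.

```r
# Hypothetical helper: convert two-sided p-values to Cohen's d,
# assuming a two-sample t-test with equal group sizes and `dof`
# degrees of freedom (d is approximately 2t / sqrt(dof)).
p_to_d <- function(p, dof) {
  t_abs <- abs(qt(p / 2, dof))
  2 * t_abs / sqrt(dof)
}

p_vals <- rbind(option1 = c(0.024, 0.034, 0.031),
                option2 = c(0.25, 0.032, 0.0001))
d_vals <- p_to_d(p_vals, dof = 2 * 119 - 2)

round(d_vals, 3)      # per-analysis effect sizes
rowMeans(d_vals)      # average per option
apply(d_vals, 1, sd)  # spread per option
```

The row means should reproduce the averages above. The per-analysis values show that option (2)'s average is pulled up by the p = .0001 result, while its p = .25 analysis implies a smaller effect than any analysis in option (1).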

Conclusion

Wagenmakers’s question is underspecified. However, the p-values in option (2) look more like what we’d expect from an actual set of replications, and also imply a larger effect.
