Download
Read in cache.
Counts.
## # A tibble: 1 × 1
## n
## <int>
## 1 210
## # A tibble: 8 × 2
## status n
## <chr> <int>
## 1 ASK 12
## 2 cannot tell direction in rep 4
## 3 cannot tell direction in target 1
## 4 done 159
## 5 missing 2
## 6 non-experiment 12
## 7 reproduction 19
## 8 unusable 1
## # A tibble: 1 × 1
## n
## <int>
## 1 177
## # A tibble: 4 × 2
## include n
## <chr> <int>
## 1 exp 45
## 2 no 34
## 3 pred_int 24
## 4 stats 107
We have up to 177 studies available for at least some analyses.
Descriptions:
We parse out values from the raw stats.
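The parsing rules themselves are not shown in this output. As a minimal sketch, assuming raw stats are stored as strings such as `t(28) = 2.50` (a hypothetical format), a t statistic could be extracted with a regular expression and converted to Cohen's d. The helper name `parse_t_stat` and the equal-groups conversion d = 2t / sqrt(df) are assumptions for illustration, not the pipeline's actual code.

```r
# Hypothetical helper: extract t and df from a string such as "t(28) = 2.50".
parse_t_stat <- function(raw) {
  pattern <- "t[[:space:]]*\\([[:space:]]*([0-9.]+)[[:space:]]*\\)[[:space:]]*=[[:space:]]*(-?[0-9.]+)"
  m <- regmatches(raw, regexec(pattern, raw))[[1]]
  if (length(m) == 0) {
    # No match: return NAs so the row can be flagged for manual review.
    return(c(t = NA_real_, df = NA_real_, d = NA_real_))
  }
  df <- as.numeric(m[2])
  t_val <- as.numeric(m[3])
  # Assumed conversion for two equal-sized independent groups.
  c(t = t_val, df = df, d = 2 * t_val / sqrt(df))
}

parse_t_stat("t(28) = 2.50")
parse_t_stat("F(1, 30) = 4.2")  # not a t test, so every element is NA
```

Returning `NA` rather than raising an error keeps unparsed rows visible, which is what makes a follow-up check for rows that have a stat input but no effect size possible.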
Check that nothing with a raw stat input fails to produce an effect size (ES).
## # A tibble: 0 × 2
## # Rowwise:
## # … with 2 variables: target_lastauthor_year <chr>, target_raw_stat <chr>
## # A tibble: 0 × 2
## # Rowwise:
## # … with 2 variables: target_lastauthor_year <chr>, replication_raw_stat <chr>
Compute prediction intervals and p_orig.
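The exact definitions used here are not shown. One common pairwise formulation, sketched below under the assumptions that both estimates are approximately normal and that there is no between-study heterogeneity, checks whether the replication estimate falls inside a prediction interval around the original and computes a two-sided p value for the difference. The function name `pred_int_check` and its arguments are hypothetical.

```r
# Sketch under assumptions: normal estimates, no heterogeneity (tau^2 = 0).
pred_int_check <- function(d_orig, se_orig, d_rep, se_rep, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  # Standard error of the difference between the two estimates.
  se_diff <- sqrt(se_orig^2 + se_rep^2)
  lower <- d_orig - z * se_diff
  upper <- d_orig + z * se_diff
  # Two-sided p value for the original and replication estimating the same effect.
  p_orig <- 2 * (1 - pnorm(abs(d_orig - d_rep) / se_diff))
  data.frame(
    lower = lower,
    upper = upper,
    predInt = as.integer(d_rep >= lower & d_rep <= upper),
    p_orig = p_orig
  )
}

pred_int_check(d_orig = 0.60, se_orig = 0.20, d_rep = 0.10, se_rep = 0.10)
```

Under this formulation the two measures agree by construction: `predInt` is 1 exactly when `p_orig` is at least `1 - level`.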
First pass plots.
Working more on the visualization with subjective rep status.
Note: it may seem odd that d and SE are missing for many more replications than p values are. This is because a signed d_calc cannot be computed without a filled-in same-direction value, even when the p value and the unsigned d_calc are available.
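The dependence on the direction flag can be sketched as follows. The names `signed_d`, `d_unsigned`, and `same_direction` are hypothetical stand-ins for whatever the pipeline actually uses.

```r
# Hypothetical inputs: d_unsigned (magnitude) and same_direction (TRUE/FALSE/NA).
signed_d <- function(d_unsigned, same_direction) {
  # ifelse() returns NA wherever same_direction is NA, so the signed effect
  # size is missing even though the magnitude is known.
  ifelse(same_direction, d_unsigned, -d_unsigned)
}

signed_d(c(0.4, 0.4, 0.4), c(TRUE, FALSE, NA))
# 0.4 -0.4 NA
```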
## total rows
## [1] 177
## number of rows for subjective w/ demographic/experimental
## expected
## [1] 176
## actual
## [1] 176
## number of rows for predInt/p_orig w/ demographic/experimental
## expected
## [1] 131
## actual
## [1] 131
## number of complete rows for full analysis
## expected
## [1] 107
## actual
## [1] 107
## include target_lastauthor_year academic_year
## Length:177 Length:177 Length:177
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## subfield pub_year log_p log_sample
## Length:177 Min. :-48.525 Min. :-295.9909 Min. :0.6931
## Class :character 1st Qu.: -2.525 1st Qu.: -16.7525 1st Qu.:3.6889
## Mode :character Median : 1.475 Median : -6.0717 Median :4.6151
## Mean : 0.000 Mean : -16.1629 Mean :4.5020
## 3rd Qu.: 3.475 3rd Qu.: -3.5666 3rd Qu.:5.1957
## Max. : 8.475 Max. : 0.6932 Max. :7.4955
## NA's :57 NA's :1
## log_ratio_ss change_platform target_d_calc stanford
## Min. :-3.50656 Min. :0.0000 Min. :0.02236 Min. :0.00000
## 1st Qu.:-0.88504 1st Qu.:0.0000 1st Qu.:0.45040 1st Qu.:0.00000
## Median :-0.15149 Median :1.0000 Median :0.60249 Median :0.00000
## Mean :-0.41790 Mean :0.5311 Mean :0.82353 Mean :0.09605
## 3rd Qu.: 0.05205 3rd Qu.:1.0000 3rd Qu.:0.95758 3rd Qu.:0.00000
## Max. : 3.67313 Max. :1.0000 Max. :7.86738 Max. :1.00000
## NA's :1 NA's :56
## open_data open_mat is_within single_vignette
## Min. :0.0000 Min. :0.0000 Min. :0.000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000 Median :0.000 Median :0.0000
## Mean :0.2938 Mean :0.4689 Mean :0.452 Mean :0.4407
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.000 Max. :1.0000
##
## log_trials predInt p_orig sub_rep
## Min. :0.000 Min. :0.0000 Min. :0.00000 Min. :0.0000
## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
## Median :1.609 Median :0.0000 Median :0.03315 Median :0.5000
## Mean :2.163 Mean :0.4511 Mean :0.21397 Mean :0.4901
## 3rd Qu.:4.094 3rd Qu.:1.0000 3rd Qu.:0.36853 3rd Qu.:1.0000
## Max. :8.230 Max. :1.0000 Max. :0.99971 Max. :1.0000
## NA's :44 NA's :44 NA's :1
Note: tier 2 and 3 models are subject to change, as a few more studies might in fact have usable stats and some of the effect size extraction might be incorrect.
including tier 3 predictors
regularized tier 1 sub
regularized p_orig
Tier 3
| preds | r | p |
|---|---|---|
| z_pub_year | 0.064 | 0.399 |
| open_data | 0.150 | 0.047 |
| open_mat | 0.002 | 0.979 |
| stanford | -0.027 | 0.725 |
| change_platform | -0.158 | 0.037 |
| z_log_ratio_ss | -0.047 | 0.536 |
| is_within | 0.333 | <0.001 |
| single_vignette | -0.267 | <0.001 |
| z_log_sample | -0.108 | 0.155 |
| z_log_trials | 0.182 | 0.015 |
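A table of this shape can be built by running a correlation test of each predictor against the outcome. The sketch below uses simulated stand-in data, since the real data and the outcome variable used for this table are not shown here. The names `toy`, `outcome`, `x1`, and `x2` are hypothetical.

```r
set.seed(1)
# Simulated stand-in data; the real data are not reproduced here.
toy <- data.frame(
  outcome = rnorm(50),
  x1 = rnorm(50),
  x2 = rbinom(50, 1, 0.5)
)
preds <- c("x1", "x2")

cor_table <- do.call(rbind, lapply(preds, function(v) {
  ct <- cor.test(toy[[v]], toy$outcome)  # Pearson correlation by default
  data.frame(
    preds = v,
    r = round(unname(ct$estimate), 3),
    p = round(ct$p.value, 3)
  )
}))
cor_table
```

With ten predictors tested separately, some small p values are expected by chance. `p.adjust()` is one way to correct for that if these are read as more than descriptive.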
Note: predInt may not be reliably calculated in some cases. Dealing with numbers is hard!
# Exploratory even randomer
##
## Call:
## lm(formula = sub_rep ~ z_pub_year + subfield + open_data + open_mat +
## stanford + change_platform + z_log_ratio_ss + is_within +
## single_vignette + z_log_sample + z_log_trials, data = data_tier1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.7511 -1.3244 -0.2346 1.3409 3.4323
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.218252 0.407093 7.905 3.89e-13 ***
## z_pub_year 0.094443 0.143961 0.656 0.51274
## subfieldnon-psych -0.094368 0.455422 -0.207 0.83611
## subfieldother-psych 0.005194 0.403553 0.013 0.98975
## subfieldsocial -0.600773 0.345548 -1.739 0.08400 .
## open_data 0.472727 0.357222 1.323 0.18759
## open_mat -0.508718 0.327028 -1.556 0.12176
## stanford -0.008220 0.442056 -0.019 0.98519
## change_platform -0.589066 0.288515 -2.042 0.04280 *
## z_log_ratio_ss -0.117647 0.157094 -0.749 0.45501
## is_within 1.218763 0.381555 3.194 0.00169 **
## single_vignette -0.328094 0.483279 -0.679 0.49818
## z_log_sample -0.091768 0.208241 -0.441 0.66003
## z_log_trials -0.350492 0.260318 -1.346 0.18005
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.633 on 162 degrees of freedom
## Multiple R-squared: 0.2007, Adjusted R-squared: 0.1365
## F-statistic: 3.128 on 13 and 162 DF, p-value: 0.0003479
Repeating regularized
## 11 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) 2.8428523
## z_pub_year .
## open_data 0.1085095
## open_mat .
## stanford .
## change_platform -0.4669277
## z_log_ratio_ss .
## is_within 0.9137496
## single_vignette -0.1819649
## z_log_sample .
## z_log_trials .
## 11 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) 2.72149889
## z_pub_year .
## open_data .
## open_mat .
## stanford .
## change_platform -0.08774184
## z_log_ratio_ss .
## is_within 0.62829911
## single_vignette .
## z_log_sample .
## z_log_trials .
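The `.` entries above are coefficients that the penalty shrank exactly to zero. The printout resembles coefficient output from the glmnet package, but the call is not shown, so the sketch below instead illustrates the underlying idea in base R: a lasso fit by coordinate descent with soft-thresholding. The function names, the objective scaling (1/(2n) times the residual sum of squares plus `lambda` times the L1 norm), and the simulated data are all assumptions for illustration.

```r
# Soft-thresholding operator: shrinks toward zero and clips small values to 0.
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Lasso via cyclic coordinate descent on standardized predictors.
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  X <- scale(X)          # mean 0, sd 1 for every predictor
  y_c <- y - mean(y)     # centering y removes the need for an intercept
  n <- nrow(X)
  beta <- rep(0, ncol(X))
  for (it in seq_len(n_iter)) {
    for (j in seq_along(beta)) {
      # Partial residual: leave predictor j out of the current fit.
      r_j <- y_c - X[, -j, drop = FALSE] %*% beta[-j]
      rho <- sum(X[, j] * r_j) / n
      beta[j] <- soft_threshold(rho, lambda) / (sum(X[, j]^2) / n)
    }
  }
  setNames(beta, colnames(X))
}

set.seed(7)
X <- matrix(rnorm(300), ncol = 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- 2 * X[, "x1"] + rnorm(100)
lasso_cd(X, y, lambda = 0.3)
```

Larger values of `lambda` zero out more coefficients, which is one possible reason the second fit above retains fewer predictors than the first.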
Principal components are not very interpretable.
## Importance of components:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## Standard deviation 1.504085 1.0247345 0.9327507 0.55806625 0.52678095
## Proportion of Variance 0.429554 0.1993864 0.1651977 0.05913497 0.05269058
## Cumulative Proportion 0.429554 0.6289404 0.7941381 0.85327309 0.90596367
## Comp.6 Comp.7 Comp.8 Comp.9 Comp.10
## Standard deviation 0.40112599 0.3263851 0.30491529 0.27764252 0.24033322
## Proportion of Variance 0.03055164 0.0202271 0.01765352 0.01463676 0.01096732
## Cumulative Proportion 0.93651531 0.9567424 0.97439593 0.98903268 1.00000000
##
## Loadings:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## z_pub_year 0.306 0.643 0.602 0.317 0.162
## open_data 0.170 -0.506 0.186 -0.802
## open_mat 0.149 0.120 -0.556 0.517 -0.234 0.557
## stanford 0.110 0.126 -0.971
## change_platform -0.123 -0.113 0.531 -0.121 0.786 -0.177 0.108
## z_log_ratio_ss -0.409 -0.341 0.709 -0.450
## is_within -0.199 0.222 -0.107 -0.166 0.167 0.832
## single_vignette 0.185 -0.278 0.135 0.279 0.114 -0.236 -0.102
## z_log_sample 0.598 -0.155 0.138 -0.745 -0.104
## z_log_trials -0.510 0.541 -0.227 -0.297 -0.104 -0.432 -0.111
## Comp.10
## z_pub_year
## open_data
## open_mat
## stanford -0.164
## change_platform
## z_log_ratio_ss
## is_within 0.377
## single_vignette 0.839
## z_log_sample 0.147
## z_log_trials 0.306
##
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## SS loadings 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
## Proportion Var 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
## Cumulative Var 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
## Comp.10
## SS loadings 1.0
## Proportion Var 0.1
## Cumulative Var 1.0
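When reading the output above, the informative part is the "Importance of components" block. The "SS loadings" rows are 1.0 for every component by construction, because each loading vector has unit length. A minimal sketch on simulated stand-in data (the variable names are hypothetical) shows both quantities:

```r
set.seed(42)
# Simulated stand-in predictors; x1 and x2 share a common source.
n <- 100
a <- rnorm(n)
toy <- data.frame(
  x1 = a + rnorm(n, sd = 0.5),
  x2 = a + rnorm(n, sd = 0.5),
  x3 = rnorm(n)
)

pca <- princomp(toy)

# Proportion of variance explained, from the component standard deviations.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)

# Sum of squared loadings per component: always 1, so not informative.
ss_loadings <- colSums(unclass(pca$loadings)^2)

round(prop_var, 3)
round(ss_loadings, 3)
```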
Let’s try super strong priors, consolidating a few very highly correlated variables.
And let’s see if it’s the link function.
##
## Call:
## lm(formula = sub_rep ~ is_within + single_vignette + z_log_trials,
## data = data_tier1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.7587 -1.3730 -0.3709 1.3848 2.8753
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.7485 0.3126 8.792 1.48e-15 ***
## is_within 1.2259 0.3624 3.383 0.000888 ***
## single_vignette -0.7744 0.4353 -1.779 0.076969 .
## z_log_trials -0.4134 0.2221 -1.861 0.064390 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.652 on 172 degrees of freedom
## Multiple R-squared: 0.1317, Adjusted R-squared: 0.1165
## F-statistic: 8.695 on 3 and 172 DF, p-value: 2.106e-05
## # A tibble: 1 × 12
## mean_pub_year mean_l…¹ mean_…² mean_…³ mean_…⁴ mean_…⁵ sd_pu…⁶ sd_lo…⁷ sd_lo…⁸
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 -6.17e-14 -16.2 4.50 -0.418 0.824 2.16 6.47 33.1 1.11
## # … with 3 more variables: sd_log_ratio_ss <dbl>, sd_target_d_calc <dbl>,
## # sd_log_trials <dbl>, and abbreviated variable names ¹mean_log_p,
## # ²mean_log_sample, ³mean_log_ratio_ss, ⁴mean_target_d_calc,
## # ⁵mean_log_trials, ⁶sd_pub_year, ⁷sd_log_p, ⁸sd_log_sample