Download
Read in from the cache.
Counts.
## # A tibble: 1 × 1
## n
## <int>
## 1 210
## # A tibble: 12 × 2
## status n
## <chr> <int>
## 1 ASK 12
## 2 cannot tell direction in rep 2
## 3 cannot tell direction in replication 1
## 4 cannot tell direction in target 1
## 5 cannot tell direction on rep 1
## 6 done 156
## 7 missing 2
## 8 non-experiment 12
## 9 rep is the problem, might be salvageable 1
## 10 reproduction 19
## 11 unusable 1
## 12 use raw!!! 2
## # A tibble: 1 × 1
## n
## <int>
## 1 177
## # A tibble: 4 × 2
## include n
## <chr> <int>
## 1 exp 48
## 2 no 34
## 3 pred_int 24
## 4 stats 104
We have as many as 177 usable rows, at least for some analyses.
Descriptions:
We parse out values from the raw stats.
Check that nothing has a raw stat input but fails to yield an effect size (ES). The first table covers target stats, the second replication stats.
## # A tibble: 0 × 2
## # Rowwise:
## # … with 2 variables: target_lastauthor_year <chr>, target_raw_stat <chr>
## # A tibble: 1 × 2
## # Rowwise:
## target_lastauthor_year replication_raw_stat
## <chr> <chr>
## 1 obrien2015 MSE4:m1=4.8(.6),m2=5.5(.6),m3=3.8(.5),m4=3.7(.5)
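As a rough illustration of what parsing a raw stat string involves, here is a minimal Python sketch for strings shaped like the one shown above. It is not the parser used in this analysis: the function name `parse_group_stats` is hypothetical, the format is inferred from this single example, and the source does not say whether the parenthesized values are SDs or SEs, so they are returned under the neutral name "spread".

```python
import re

def parse_group_stats(raw_stat):
    """Split a raw stat string into its label and per-group (center, spread) pairs.

    Assumed format (inferred from one example): LABEL:m1=4.8(.6),m2=5.5(.6),...
    """
    label, _, body = raw_stat.partition(":")
    # Each group looks like m<index>=<number>(<number>); numbers may omit the leading 0.
    pattern = re.compile(r"m(\d+)=(-?\d*\.?\d+)\((-?\d*\.?\d+)\)")
    groups = {
        int(idx): (float(center), float(spread))
        for idx, center, spread in pattern.findall(body)
    }
    return label, groups

label, groups = parse_group_stats("MSE4:m1=4.8(.6),m2=5.5(.6),m3=3.8(.5),m4=3.7(.5)")
```

Turning four group means into a single effect size also requires knowing which contrast is of interest, which the string itself does not encode.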
Compute prediction intervals and p_orig.
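The formulas are not spelled out here, so the following Python sketch uses common definitions as an assumption: a 95% prediction interval centered on the original effect with width driven by both standard errors, and a p_orig that asks how surprising the original estimate would be if it came from the same distribution as the replication. The function and argument names are hypothetical, and `tau` (between-study heterogeneity) defaults to 0 because there is one replication per target.

```python
from math import sqrt
from statistics import NormalDist

def prediction_interval(d_orig, se_orig, se_rep, level=0.95):
    """Interval in which the replication estimate is expected to fall."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(se_orig**2 + se_rep**2)
    return d_orig - half_width, d_orig + half_width

def in_prediction_interval(d_orig, se_orig, d_rep, se_rep, level=0.95):
    """1 if the replication estimate lies inside the interval, else 0."""
    lower, upper = prediction_interval(d_orig, se_orig, se_rep, level)
    return int(lower <= d_rep <= upper)

def p_orig(d_orig, se_orig, d_rep, se_rep, tau=0.0):
    """Two-sided p value for the original estimate under the replication's distribution."""
    z = abs(d_orig - d_rep) / sqrt(tau**2 + se_orig**2 + se_rep**2)
    return 2 * (1 - NormalDist().cdf(z))

# Illustrative numbers only, not taken from the dataset.
inside = in_prediction_interval(0.8, 0.2, 0.1, 0.15)
p = p_orig(0.8, 0.2, 0.1, 0.15)
```

Both quantities need a signed effect size and a standard error for the target and the replication, which is why they are missing for more rows than the subjective rating.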
First pass plots.
Working more on the visualization with subjective rep status.
Note: it may seem odd that d and SE are missing for many more replications than p values are. This is because we can't compute a signed d_calc unless the same-direction field is filled in, even when we do have the p value and the unsigned d_calc.
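The dependency described in this note can be sketched in a few lines of Python. The names `signed_d`, `unsigned_d`, and `same_direction` are hypothetical stand-ins for the actual columns, and the sign convention (positive when the replication goes the same way as the target) is an assumption.

```python
def signed_d(unsigned_d, same_direction):
    """Attach a sign to an unsigned effect size, or return None if direction is unknown."""
    if unsigned_d is None or same_direction is None:
        return None  # direction not coded, so no signed effect size is available
    return abs(unsigned_d) if same_direction else -abs(unsigned_d)

# Illustrative values only.
examples = [signed_d(0.4, True), signed_d(0.4, False), signed_d(0.4, None)]
```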
## total rows
## [1] 177
## number of rows for subjective w/ demographic/experimental
## expected
## [1] 176
## actual
## [1] 176
## number of rows for predInt/p_orig w/ demographic/experimental
## expected
## [1] 128
## actual
## [1] 127
## number of complete rows for full analysis
## expected
## [1] 104
## actual
## [1] 103
## include target_lastauthor_year academic_year
## Length:177 Length:177 Length:177
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## subfield pub_year log_p log_sample
## Length:177 Min. :-48.525 Min. :-295.9909 Min. :0.6931
## Class :character 1st Qu.: -2.525 1st Qu.: -15.9231 1st Qu.:3.6889
## Mode :character Median : 1.475 Median : -6.0717 Median :4.6151
## Mean : 0.000 Mean : -15.1343 Mean :4.5020
## 3rd Qu.: 3.475 3rd Qu.: -3.7106 3rd Qu.:5.1957
## Max. : 8.475 Max. : 0.6932 Max. :7.4955
## NA's :57 NA's :1
## log_ratio_ss change_platform target_d_calc stanford
## Min. :-3.50656 Min. :0.0000 Min. : 0.0562 Min. :0.00000
## 1st Qu.:-0.88504 1st Qu.:0.0000 1st Qu.: 0.4679 1st Qu.:0.00000
## Median :-0.15149 Median :1.0000 Median : 0.6905 Median :0.00000
## Mean :-0.41790 Mean :0.5311 Mean : 6.9238 Mean :0.09605
## 3rd Qu.: 0.05205 3rd Qu.:1.0000 3rd Qu.: 1.5818 3rd Qu.:0.00000
## Max. : 3.67313 Max. :1.0000 Max. :452.2157 Max. :1.00000
## NA's :1 NA's :56
## open_data open_mat is_within single_vignette
## Min. :0.0000 Min. :0.0000 Min. :0.000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000 Median :0.000 Median :0.0000
## Mean :0.2938 Mean :0.4689 Mean :0.452 Mean :0.4407
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.000 Max. :1.0000
##
## log_trials predInt p_orig sub_rep
## Min. :0.000 Min. :0.0000 Min. :0.00000 Min. :0.0000
## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.00001 1st Qu.:0.0000
## Median :1.609 Median :0.0000 Median :0.03442 Median :0.5000
## Mean :2.163 Mean :0.4651 Mean :0.24073 Mean :0.4901
## 3rd Qu.:4.094 3rd Qu.:1.0000 3rd Qu.:0.46557 3rd Qu.:1.0000
## Max. :8.230 Max. :1.0000 Max. :0.99971 Max. :1.0000
## NA's :48 NA's :48 NA's :1
## # A tibble: 91 × 3
## value2 value1 corr
## <chr> <chr> <dbl>
## 1 is_within z_log_trials 0.7
## 2 open_data open_mat 0.56
## 3 is.soc single_vignette 0.51
## 4 is.cog z_log_trials 0.39
## 5 open_data z_pub_year 0.38
## 6 single_vignette z_log_sample 0.38
## 7 open_mat z_pub_year 0.35
## 8 open_mat z_log_sample 0.35
## 9 is_within is.cog 0.31
## 10 z_log_sample z_pub_year 0.3
## # … with 81 more rows
regularized
preds | r | p |
---|---|---|
z_pub_year | 0.064 | 0.399 |
open_data | 0.150 | 0.047 |
open_mat | 0.002 | 0.979 |
stanford | -0.027 | 0.725 |
change_platform | -0.158 | 0.037 |
z_log_ratio_ss | -0.047 | 0.536 |
is_within | 0.333 | 0.000 |
single_vignette | -0.267 | 0.000 |
z_log_sample | -0.108 | 0.155 |
z_log_trials | 0.182 | 0.015 |
Principal components are not very interpretable.
## Importance of components:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## Standard deviation 1.504085 1.0247345 0.9327507 0.55806625 0.52678095
## Proportion of Variance 0.429554 0.1993864 0.1651977 0.05913497 0.05269058
## Cumulative Proportion 0.429554 0.6289404 0.7941381 0.85327309 0.90596367
## Comp.6 Comp.7 Comp.8 Comp.9 Comp.10
## Standard deviation 0.40112599 0.3263851 0.30491529 0.27764252 0.24033322
## Proportion of Variance 0.03055164 0.0202271 0.01765352 0.01463676 0.01096732
## Cumulative Proportion 0.93651531 0.9567424 0.97439593 0.98903268 1.00000000
##
## Loadings:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## z_pub_year 0.306 0.643 0.602 0.317 0.162
## open_data 0.170 -0.506 0.186 -0.802
## open_mat 0.149 0.120 -0.556 0.517 -0.234 0.557
## stanford 0.110 0.126 -0.971
## change_platform -0.123 -0.113 0.531 -0.121 0.786 -0.177 0.108
## z_log_ratio_ss -0.409 -0.341 0.709 -0.450
## is_within -0.199 0.222 -0.107 -0.166 0.167 0.832
## single_vignette 0.185 -0.278 0.135 0.279 0.114 -0.236 -0.102
## z_log_sample 0.598 -0.155 0.138 -0.745 -0.104
## z_log_trials -0.510 0.541 -0.227 -0.297 -0.104 -0.432 -0.111
## Comp.10
## z_pub_year
## open_data
## open_mat
## stanford -0.164
## change_platform
## z_log_ratio_ss
## is_within 0.377
## single_vignette 0.839
## z_log_sample 0.147
## z_log_trials 0.306
##
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## SS loadings 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
## Proportion Var 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
## Cumulative Var 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
## Comp.10
## SS loadings 1.0
## Proportion Var 0.1
## Cumulative Var 1.0
##
## Call:
## lm(formula = sub_rep ~ z_pub_year + subfield + open_data + open_mat +
## stanford + change_platform + z_log_ratio_ss + is_within +
## single_vignette + z_log_sample + z_log_trials, data = data_tier1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.7511 -1.3244 -0.2346 1.3409 3.4323
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.218252 0.407093 7.905 3.89e-13 ***
## z_pub_year 0.094443 0.143961 0.656 0.51274
## subfieldnon-psych -0.094368 0.455422 -0.207 0.83611
## subfieldother-psych 0.005194 0.403553 0.013 0.98975
## subfieldsocial -0.600773 0.345548 -1.739 0.08400 .
## open_data 0.472727 0.357222 1.323 0.18759
## open_mat -0.508718 0.327028 -1.556 0.12176
## stanford -0.008220 0.442056 -0.019 0.98519
## change_platform -0.589066 0.288515 -2.042 0.04280 *
## z_log_ratio_ss -0.117647 0.157094 -0.749 0.45501
## is_within 1.218763 0.381555 3.194 0.00169 **
## single_vignette -0.328094 0.483279 -0.679 0.49818
## z_log_sample -0.091768 0.208241 -0.441 0.66003
## z_log_trials -0.350492 0.260318 -1.346 0.18005
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.633 on 162 degrees of freedom
## Multiple R-squared: 0.2007, Adjusted R-squared: 0.1365
## F-statistic: 3.128 on 13 and 162 DF, p-value: 0.0003479
Repeating the regularized fit.
## 11 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) 2.8428523
## z_pub_year .
## open_data 0.1085095
## open_mat .
## stanford .
## change_platform -0.4669277
## z_log_ratio_ss .
## is_within 0.9137496
## single_vignette -0.1819649
## z_log_sample .
## z_log_trials .
## 11 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) 2.7619133
## z_pub_year .
## open_data .
## open_mat .
## stanford .
## change_platform .
## z_log_ratio_ss .
## is_within 0.4362908
## single_vignette .
## z_log_sample .
## z_log_trials .
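The `.` entries above are coefficients shrunk to exactly zero. A small Python sketch of soft-thresholding shows why an L1-type penalty produces this pattern. It is only an illustration: soft-thresholding is the exact lasso solution only when predictors are orthonormal, the coefficient values below are made up, and it is an assumption that the two fits shown differ in penalty strength.

```python
def soft_threshold(beta, penalty):
    """Shrink a coefficient toward zero by `penalty`, clipping at exactly zero."""
    magnitude = max(abs(beta) - penalty, 0.0)
    if magnitude == 0.0:
        return 0.0
    return magnitude if beta > 0 else -magnitude

# Made-up unpenalized coefficients, not taken from the models above.
unpenalized = {"a": 1.2, "b": -0.6, "c": 0.1}

weak = {name: soft_threshold(b, 0.3) for name, b in unpenalized.items()}
strong = {name: soft_threshold(b, 0.8) for name, b in unpenalized.items()}
```

With the weaker penalty two coefficients survive; with the stronger one only the largest does, mirroring how the second fit keeps a single predictor.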
Let’s try very strong priors and consolidate a few highly correlated variables.
And let’s see if it’s the link.
##
## Call:
## lm(formula = sub_rep ~ is_within + single_vignette + z_log_trials,
## data = data_tier1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.7587 -1.3730 -0.3709 1.3848 2.8753
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.7485 0.3126 8.792 1.48e-15 ***
## is_within 1.2259 0.3624 3.383 0.000888 ***
## single_vignette -0.7744 0.4353 -1.779 0.076969 .
## z_log_trials -0.4134 0.2221 -1.861 0.064390 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.652 on 172 degrees of freedom
## Multiple R-squared: 0.1317, Adjusted R-squared: 0.1165
## F-statistic: 8.695 on 3 and 172 DF, p-value: 2.106e-05
Note: predInt may not be reliably calculated in some cases. Dealing with numbers is hard!