Pull data

Download

Read in the cached data.

Counts.

## # A tibble: 1 × 1
##       n
##   <int>
## 1   210

## # A tibble: 8 × 2
##   status                              n
##   <chr>                           <int>
## 1 ASK                                12
## 2 cannot tell direction in rep        4
## 3 cannot tell direction in target     1
## 4 done                              159
## 5 missing                             2
## 6 non-experiment                     12
## 7 reproduction                       19
## 8 unusable                            1

## # A tibble: 1 × 1
##       n
##   <int>
## 1   177

## # A tibble: 4 × 2
##   include      n
##   <chr>    <int>
## 1 exp         45
## 2 no          34
## 3 pred_int    24
## 4 stats      107

As many as 177 studies are usable for at least some analyses.

Descriptions:

“done” – fully coded, can be used for whatever is specified in other column
“ES_direction” – we don’t have the direction coded
“ask” – might be salvageable, but I have questions about coding
“unusable” – got as far as I can, and it does not have complete descriptives
“missing” – student didn’t finish or we don’t have write-up
“non-experiment” – student replicated something that was not an experiment
“reproduction” – student did a reproduction and not replication

Parsing

We parse out values from the raw stats.
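A minimal sketch of what this parsing step could look like. The input format is an assumption: the raw stats are taken to look like `t(28) = 2.31` or `F(1, 40) = 5.2`, and `parse_raw_stat` is a hypothetical helper, not the function used in this analysis.

```r
# Hypothetical parser. Assumes strings shaped like "t(28) = 2.31" or
# "F(1, 40) = 5.2"; the actual coding format is not shown in this document.
parse_raw_stat <- function(x) {
  pattern <- "^\\s*([A-Za-z]+)\\s*\\(([^)]*)\\)\\s*=\\s*(-?[0-9.]+)\\s*$"
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) {
    # No match: return NAs so the row can be flagged by a later check.
    return(list(stat = NA_character_, df = NA_real_, value = NA_real_))
  }
  # m[1] is the full match; m[2], m[3], m[4] are the captured groups.
  df <- as.numeric(strsplit(m[3], ",")[[1]])
  list(stat = m[2], df = df, value = as.numeric(m[4]))
}

parsed <- parse_raw_stat("t(28) = 2.31")
parsed$stat
parsed$df
parsed$value
```

Returning NAs rather than raising an error keeps unparseable rows in the data, so they show up when checking what didn't parse.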

what didn’t parse

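Check that no row with a raw stat input fails to produce an effect size (ES). Both checks below return zero rows, first for the target stats and then for the replication stats.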

## # A tibble: 0 × 2
## # Rowwise: 
## # … with 2 variables: target_lastauthor_year <chr>, target_raw_stat <chr>

## # A tibble: 0 × 2
## # Rowwise: 
## # … with 2 variables: target_lastauthor_year <chr>, replication_raw_stat <chr>
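A check of this kind can be sketched in base R as a filter for rows that have a raw stat but no effect size. The data below are made up, and `target_es` is a hypothetical column name; only `target_lastauthor_year` and `target_raw_stat` appear in the output above.

```r
# Toy data for illustration only. `target_es` is a hypothetical name for
# the parsed effect size column.
studies <- data.frame(
  target_lastauthor_year = c("study_a", "study_b", "study_c"),
  target_raw_stat = c("t(28) = 2.31", NA, "F(1, 40) = 5.2"),
  target_es = c(0.87, NA, 0.72),
  stringsAsFactors = FALSE
)

# Rows with a stat input but no ES output; zero rows means all parsed.
has_stat <- !is.na(studies$target_raw_stat)
no_es <- is.na(studies$target_es)
unparsed <- studies[has_stat & no_es,
                    c("target_lastauthor_year", "target_raw_stat")]
nrow(unparsed)
```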

PredInt and P_orig

Compute prediction intervals and p_orig.
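For reference, one common formulation of both quantities is sketched below. It assumes a normal approximation and, by default, zero between-study heterogeneity; the function and argument names are hypothetical, and the code used in this analysis may differ.

```r
# Sketch only: normal approximation, standard errors on the same scale
# as the effect sizes. All names here are hypothetical.

# Prediction interval for the replication estimate, centered on the
# original estimate, and whether the replication falls inside it.
pred_int_check <- function(d_orig, se_orig, d_rep, se_rep, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  se_diff <- sqrt(se_orig^2 + se_rep^2)
  lower <- d_orig - z * se_diff
  upper <- d_orig + z * se_diff
  list(lower = lower, upper = upper,
       inside = d_rep >= lower & d_rep <= upper)
}

# p_orig: two-sided probability of an original estimate at least this far
# from the replication estimate if both estimate the same effect.
# tau2 is the assumed heterogeneity variance (0 by default here).
p_orig <- function(d_orig, se_orig, d_rep, se_rep, tau2 = 0) {
  z <- abs(d_orig - d_rep) / sqrt(tau2 + se_orig^2 + se_rep^2)
  2 * (1 - pnorm(z))
}

close_pair <- pred_int_check(0.6, 0.2, 0.5, 0.1)
far_pair <- pred_int_check(1.0, 0.2, 0.0, 0.1)
close_pair$inside
far_pair$inside
p_orig(1.0, 0.2, 0.0, 0.1)
```

With `tau2 = 0`, the two measures agree by construction: the replication falls outside the 95% prediction interval exactly when p_orig is below 0.05.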

viz SMD

First pass plots.

Working more on the visualization with subjective rep status.

How much missing data?

Note: it may seem odd that d and SE are missing for many more replications than p values are. This is because d_calc cannot be signed unless the same-direction field is filled in. In those cases we still have the p value and the unsigned d_calc.
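The dependence on the direction field can be sketched as follows. `sign_d`, `abs_d`, and `same_direction` are hypothetical names used for illustration.

```r
# Sketch: the signed effect size needs a known direction. When
# same_direction is NA, ifelse() returns NA, so the signed value is
# missing even though the unsigned value exists.
sign_d <- function(abs_d, same_direction) {
  ifelse(same_direction, abs_d, -abs_d)
}

signed <- sign_d(c(0.4, 0.4, 0.4), c(TRUE, FALSE, NA))
signed
```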

## total rows

## [1] 177

## number of rows for subjective w/ demographic/experimental

## expected

## [1] 176

## actual

## [1] 176

## number of rows for predInt/p_orig w/ demographic/experimental
## expected

## [1] 131

## actual

## [1] 131

## number of complete rows for full analysis
## expected

## [1] 107

## actual

## [1] 107
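The expected versus actual comparison amounts to counting rows with no missing values across the variables a given analysis needs. A base R sketch on made-up data, where `count_complete` is a hypothetical helper and the column subsets are illustrative:

```r
# Toy data for illustration only.
toy <- data.frame(
  sub_rep = c(1, 0, NA, 1),
  p_orig = c(0.2, NA, 0.5, 0.01),
  open_data = c(1, 0, 1, 0)
)

# Number of rows that are complete on the given columns.
count_complete <- function(df, cols) {
  sum(complete.cases(df[, cols, drop = FALSE]))
}

n_subjective <- count_complete(toy, c("sub_rep", "open_data"))
n_full <- count_complete(toy, c("sub_rep", "p_orig", "open_data"))
n_subjective  # 3
n_full        # 2
```

Adding an outcome with more missing values can only shrink the count, which is why the analyses above go from 176 to 131 to 107 rows.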

code vars for models

##    include          target_lastauthor_year academic_year     
##  Length:177         Length:177             Length:177        
##  Class :character   Class :character       Class :character  
##  Mode  :character   Mode  :character       Mode  :character  
##                                                              
##                                                              
##                                                              
##                                                              
##    subfield            pub_year           log_p             log_sample    
##  Length:177         Min.   :-48.525   Min.   :-295.9909   Min.   :0.6931  
##  Class :character   1st Qu.: -2.525   1st Qu.: -16.7525   1st Qu.:3.6889  
##  Mode  :character   Median :  1.475   Median :  -6.0717   Median :4.6151  
##                     Mean   :  0.000   Mean   : -16.1629   Mean   :4.5020  
##                     3rd Qu.:  3.475   3rd Qu.:  -3.5666   3rd Qu.:5.1957  
##                     Max.   :  8.475   Max.   :   0.6932   Max.   :7.4955  
##                                       NA's   :57          NA's   :1       
##   log_ratio_ss      change_platform  target_d_calc        stanford      
##  Min.   :-3.50656   Min.   :0.0000   Min.   :0.02236   Min.   :0.00000  
##  1st Qu.:-0.88504   1st Qu.:0.0000   1st Qu.:0.45040   1st Qu.:0.00000  
##  Median :-0.15149   Median :1.0000   Median :0.60249   Median :0.00000  
##  Mean   :-0.41790   Mean   :0.5311   Mean   :0.82353   Mean   :0.09605  
##  3rd Qu.: 0.05205   3rd Qu.:1.0000   3rd Qu.:0.95758   3rd Qu.:0.00000  
##  Max.   : 3.67313   Max.   :1.0000   Max.   :7.86738   Max.   :1.00000  
##  NA's   :1                           NA's   :56                         
##    open_data         open_mat        is_within     single_vignette 
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median :0.000   Median :0.0000  
##  Mean   :0.2938   Mean   :0.4689   Mean   :0.452   Mean   :0.4407  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.000   Max.   :1.0000  
##                                                                    
##    log_trials       predInt           p_orig           sub_rep      
##  Min.   :0.000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000  
##  Median :1.609   Median :0.0000   Median :0.03315   Median :0.5000  
##  Mean   :2.163   Mean   :0.4511   Mean   :0.21397   Mean   :0.4901  
##  3rd Qu.:4.094   3rd Qu.:1.0000   3rd Qu.:0.36853   3rd Qu.:1.0000  
##  Max.   :8.230   Max.   :1.0000   Max.   :0.99971   Max.   :1.0000  
##                  NA's   :44       NA's   :44        NA's   :1

z-score

check z-score
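A sketch of z-scoring a predictor and then checking the result, assuming the usual definition (subtract the mean, divide by the standard deviation, ignoring missing values). `z_score` is a hypothetical helper.

```r
# Standardize a numeric vector; NAs are ignored when computing the mean
# and SD, and stay NA in the output.
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

x <- c(2, 4, 4, 6, NA)
z <- z_score(x)

# The check: mean should be ~0 and SD ~1 after standardizing.
mean(z, na.rm = TRUE)
sd(z, na.rm = TRUE)
```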

tier 1 data

tier 2 data

tier 3 data

Pre-reg’d Models

Note: tier 2 and 3 models are subject to change, as a few more studies might in fact have usable stats and some of the effect size extraction might be incorrect.

Sensitivity analysis TODO

Exploratory working

correlation

including tier 3 predictors

Frequentist Lassoes

regularized tier 1 sub

regularized p_orig

Tier 3

Correlations

| preds           |      r |     p |
|-----------------|-------:|------:|
| z_pub_year      |  0.064 | 0.399 |
| open_data       |  0.150 | 0.047 |
| open_mat        |  0.002 | 0.979 |
| stanford        | -0.027 | 0.725 |
| change_platform | -0.158 | 0.037 |
| z_log_ratio_ss  | -0.047 | 0.536 |
| is_within       |  0.333 | 0.000 |
| single_vignette | -0.267 | 0.000 |
| z_log_sample    | -0.108 | 0.155 |
| z_log_trials    |  0.182 | 0.015 |
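A table of this shape can be produced by running a Pearson correlation test of each predictor against one outcome. The sketch below uses made-up data, and `cor_table` is a hypothetical helper. Which outcome the table above uses is not stated here, so the outcome name is only a placeholder.

```r
# One row per predictor: Pearson r and its p value against the outcome.
# cor.test() drops incomplete pairs before testing.
cor_table <- function(df, outcome, preds) {
  rows <- lapply(preds, function(p) {
    test <- cor.test(df[[p]], df[[outcome]])
    data.frame(preds = p,
               r = round(unname(test$estimate), 3),
               p = round(test$p.value, 3))
  })
  do.call(rbind, rows)
}

# Toy data for illustration only.
toy <- data.frame(
  outcome = c(0, 0, 1, 0, 1, 1, 0, 1, 1, 1),
  pred_a = 1:10,
  pred_b = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
)
res <- cor_table(toy, "outcome", c("pred_a", "pred_b"))
res
```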

Individual predictor - outcome correlations

Sub_rep

Pred_int

Note: predInt may not be reliably calculated in some cases. Dealing with numbers is hard!

P_orig