Introduction to the statcheck package

statcheck is an R package for quickly checking whether the statistical results reported in a paper contain errors.

The package automatically extracts the statistical results from a file and recomputes the p-values.

By comparing the results reported in the paper with the values statcheck recomputes,
it checks whether the two are inconsistent.
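The recomputation step can be illustrated with base R alone. This is a minimal sketch, not statcheck's own code: the helper names `p_from_t` and `p_from_f` are made up for illustration, and the sketch assumes a two-tailed t test. The inputs are two results from the paper checked later in this post.

```r
# Two-tailed p-value for a t statistic with df degrees of freedom.
# (Helper names are hypothetical, not part of statcheck.)
p_from_t <- function(t, df) 2 * pt(-abs(t), df)

# Upper-tail p-value for an F statistic with df1 and df2 degrees of freedom.
p_from_f <- function(f, df1, df2) pf(f, df1, df2, lower.tail = FALSE)

p_t <- p_from_t(3.96, 5)      # reported in the paper as p < .05
p_f <- p_from_f(15.1, 2, 8)   # reported in the paper as p < .005

print(c(t = p_t, F = p_f))
```

Both values fall below the reported thresholds, so neither result would be flagged as inconsistent.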

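Because statcheck works by extracting text from the file and then comparing it,
the conversion step itself can introduce errors, so the package authors recommend manually checking any result it flags.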

Below is a short demonstration of using the statcheck package to quickly check the statistical results in a paper stored on my computer, “Voluntary Attention Modulates Processing of Eye-Specific Visual Information”.

Basic setup

setwd("~/R/statcheck") # set the working directory to the folder containing the file
library(statcheck) # load statcheck library

Reading the data and comparing results

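The statcheck package can read all PDF and HTML files in a folder and check their results. According to the authors' documentation, however, converting PDF files is more error-prone, so this walkthrough uses an HTML file.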

# use checkHTML to get the comparison results
checkResults <- checkHTML("/Users/CSE/R/statcheck/Voluntary Attention Modulates Processing of Eye-Specific Visual Information.htm")

# to get results from all HTML files in a directory, use the checkHTMLdir function

Inspecting the results

summary(checkResults) # show summary of comparison results
##   Source pValues Errors DecisionErrors
## 1      1      12      1              0
## 2  Total      12      1              0

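Source: the source file for each row (here a single file, plus a Total row)
pValues: number of p-values checked
Errors: number of inconsistent results
DecisionErrors: number of inconsistencies that change the statistical conclusion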

The comparison shows one Error; printing the full results object shows the details.

checkResults 
##                                                                         Source
## 1  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 2  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 3  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 4  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 5  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 6  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 7  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 8  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 9  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 10 Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 11 Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 12 Voluntary Attention Modulates Processing of Eye-Specific Visual Information
##    Statistic df1 df2 Test.Comparison Value Reported.Comparison
## 1          t  NA   5               =  3.96                   <
## 2          F   2   8               = 15.10                   <
## 3          F   2   8               = 21.20                   <
## 4          t  NA   4               =  5.91                   <
## 5          t  NA   4               = -5.48                   <
## 6          t  NA   4               =  3.85                   <
## 7          t  NA   4               = -0.56                   >
## 8          t  NA   4               =  1.68                   <
## 9          t  NA   4               =  3.21                   <
## 10         F   2  10               = 10.17                   <
## 11         t  NA   5               =  3.49                   <
## 12         t  NA   5               =  2.02                   <
##    Reported.P.Value     Computed                       Raw Error
## 1             0.050 0.0107428441      t(5) = 3.96, p < .05 FALSE
## 2             0.005 0.0019235634  F(2, 8) = 15.1, p < .005 FALSE
## 3             0.001 0.0006348013  F(2, 8) = 21.2, p < .001 FALSE
## 4             0.010 0.0041033902      t(4) = 5.91, p < .01 FALSE
## 5             0.010 0.0053986115     t(4) = -5.48, p < .01 FALSE
## 6             0.050 0.0183025983      t(4) = 3.85, p < .05 FALSE
## 7             0.500 0.6053561995      t(4) = -0.56, p > .5 FALSE
## 8             0.100 0.1682548864       t(4) = 1.68, p < .1  TRUE
## 9             0.050 0.0325889218      t(4) = 3.21, p < .05 FALSE
## 10            0.010 0.0038897538 F(2, 10) = 10.17, p < .01 FALSE
## 11            0.050 0.0174704493      t(5) = 3.49, p < .05 FALSE
## 12            0.100 0.0993702870       t(5) = 2.02, p < .1 FALSE
##    DecisionError OneTail OneTailedInTxt APAfactor
## 1          FALSE   FALSE          FALSE      0.75
## 2          FALSE   FALSE          FALSE      0.75
## 3          FALSE   FALSE          FALSE      0.75
## 4          FALSE   FALSE          FALSE      0.75
## 5          FALSE   FALSE          FALSE      0.75
## 6          FALSE   FALSE          FALSE      0.75
## 7          FALSE   FALSE          FALSE      0.75
## 8          FALSE   FALSE          FALSE      0.75
## 9          FALSE   FALSE          FALSE      0.75
## 10         FALSE   FALSE          FALSE      0.75
## 11         FALSE   FALSE          FALSE      0.75
## 12         FALSE   FALSE          FALSE      0.75

Error = TRUE: the result is inconsistent

This indicates that the 8th statistical result may be wrong.
Start by comparing “Computed” with “Reported.P.Value”.

Computed: the p-value computed by statcheck
Reported.P.Value: the p-value reported in the paper
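With twelve rows it is easy to lose track of the flagged one, so it can help to print only the rows where Error is TRUE. The sketch below uses a small stand-in data frame (`results`, a hypothetical name) holding three rows copied from the output above, with the same column names. Whether a real statcheck result object can be subset in exactly this way may depend on the package version.

```r
# Stand-in for a few rows of the detailed results, copied from the output
# shown in this post (rows 7, 8 and 12). Not a real statcheck object.
results <- data.frame(
  Raw = c("t(4) = -0.56, p > .5", "t(4) = 1.68, p < .1", "t(5) = 2.02, p < .1"),
  Reported.P.Value = c(0.5, 0.1, 0.1),
  Computed = c(0.6053561995, 0.1682548864, 0.0993702870),
  Error = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only the rows flagged as inconsistent, and only the columns to compare.
flagged <- results[results$Error, c("Raw", "Reported.P.Value", "Computed")]
print(flagged)
```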

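Raw: the text fragment extracted from the document
Here the Raw text is “t(4) = 1.68, p < .1”, that is, a reported p-value below .1,
whereas the p-value computed by statcheck is 0.1682548864, so the two are inconsistent.
However, neither the reported nor the computed p-value counts as significant at the default alpha of .05, so DecisionErrors is still 0.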

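If the alpha level is changed to .1, however, the reported result counts as significant while the computed one does not, so this error is classified as a DecisionError.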
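The difference between the two flags can be sketched in a few lines of base R. This is a deliberately simplified illustration and not statcheck's implementation: the function `check_less_than` is hypothetical, it handles only results reported as “p < x”, and it ignores rounding and one-tailed tests.

```r
# Simplified sketch of Error and DecisionError for a result reported as "p < x".
# Hypothetical helper, not statcheck's code.
check_less_than <- function(reported_p, computed_p, alpha = 0.05) {
  # Inconsistent if the recomputed p-value is not below the reported bound.
  error <- !(computed_p < reported_p)
  # "p < x" is treated as a significance claim only when x is at most alpha.
  reported_sig <- reported_p <= alpha
  computed_sig <- computed_p < alpha
  # A decision error is an inconsistency that flips the significance conclusion.
  decision_error <- error && (reported_sig != computed_sig)
  c(Error = error, DecisionError = decision_error)
}

at_05 <- check_less_than(0.1, 0.1682548864, alpha = 0.05)
at_10 <- check_less_than(0.1, 0.1682548864, alpha = 0.1)
print(rbind(alpha_.05 = at_05, alpha_.10 = at_10))
```

For the 8th result this gives an Error at both alpha levels but a DecisionError only at alpha = .1, which matches the two summaries in this post.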

# reset alpha to .1
checkResults2 <- checkHTML("/Users/CSE/R/statcheck/Voluntary Attention Modulates Processing of Eye-Specific Visual Information.htm", alpha = 0.1)
summary(checkResults2)
##   Source pValues Errors DecisionErrors
## 1      1      12      1              1
## 2  Total      12      1              1
checkResults2
##                                                                         Source
## 1  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 2  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 3  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 4  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 5  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 6  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 7  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 8  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 9  Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 10 Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 11 Voluntary Attention Modulates Processing of Eye-Specific Visual Information
## 12 Voluntary Attention Modulates Processing of Eye-Specific Visual Information
##    Statistic df1 df2 Test.Comparison Value Reported.Comparison
## 1          t  NA   5               =  3.96                   <
## 2          F   2   8               = 15.10                   <
## 3          F   2   8               = 21.20                   <
## 4          t  NA   4               =  5.91                   <
## 5          t  NA   4               = -5.48                   <
## 6          t  NA   4               =  3.85                   <
## 7          t  NA   4               = -0.56                   >
## 8          t  NA   4               =  1.68                   <
## 9          t  NA   4               =  3.21                   <
## 10         F   2  10               = 10.17                   <
## 11         t  NA   5               =  3.49                   <
## 12         t  NA   5               =  2.02                   <
##    Reported.P.Value     Computed                       Raw Error
## 1             0.050 0.0107428441      t(5) = 3.96, p < .05 FALSE
## 2             0.005 0.0019235634  F(2, 8) = 15.1, p < .005 FALSE
## 3             0.001 0.0006348013  F(2, 8) = 21.2, p < .001 FALSE
## 4             0.010 0.0041033902      t(4) = 5.91, p < .01 FALSE
## 5             0.010 0.0053986115     t(4) = -5.48, p < .01 FALSE
## 6             0.050 0.0183025983      t(4) = 3.85, p < .05 FALSE
## 7             0.500 0.6053561995      t(4) = -0.56, p > .5 FALSE
## 8             0.100 0.1682548864       t(4) = 1.68, p < .1  TRUE
## 9             0.050 0.0325889218      t(4) = 3.21, p < .05 FALSE
## 10            0.010 0.0038897538 F(2, 10) = 10.17, p < .01 FALSE
## 11            0.050 0.0174704493      t(5) = 3.49, p < .05 FALSE
## 12            0.100 0.0993702870       t(5) = 2.02, p < .1 FALSE
##    DecisionError OneTail OneTailedInTxt APAfactor
## 1          FALSE   FALSE          FALSE      0.75
## 2          FALSE   FALSE          FALSE      0.75
## 3          FALSE   FALSE          FALSE      0.75
## 4          FALSE   FALSE          FALSE      0.75
## 5          FALSE   FALSE          FALSE      0.75
## 6          FALSE   FALSE          FALSE      0.75
## 7          FALSE   FALSE          FALSE      0.75
## 8           TRUE   FALSE          FALSE      0.75
## 9          FALSE   FALSE          FALSE      0.75
## 10         FALSE   FALSE          FALSE      0.75
## 11         FALSE   FALSE          FALSE      0.75
## 12         FALSE   FALSE          FALSE      0.75

Other notes

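  1. statcheck can read and check all the files in a folder in one pass
  2. When extracting text fragments, statcheck only picks up results reported strictly in APA style, so some results will be missed
    Example: “F(2, 70) = 4.48, MSE = 6.61, p <.02” is skipped because of the extra “MSE = 6.61” in the middle
  3. When a result is flagged as problematic, it is best to check it by hand.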
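Why the extra “MSE = 6.61” causes a result to be skipped can be shown with a regular expression. The pattern below is a rough approximation written for this post, not the expression statcheck actually uses: it assumes an F test reported as “F(df1, df2) = value, p < value” with nothing between the test statistic and the p-value.

```r
# Hypothetical, simplified pattern for an APA-style F test result.
# It requires the p-value to follow the test statistic directly.
apa_f <- paste0(
  "F\\s?\\(\\s?\\d+\\s?,\\s?\\d+\\s?\\)",  # F(df1, df2)
  "\\s?=\\s?-?\\d*\\.?\\d+",                # = value
  "\\s?,\\s?p\\s?[<>=]\\s?\\d?\\.\\d+"      # , p < value
)

strict_apa <- "F(2, 8) = 15.1, p < .005"
with_mse   <- "F(2, 70) = 4.48, MSE = 6.61, p <.02"

matched <- c(
  strict_apa = grepl(apa_f, strict_apa, perl = TRUE),
  with_mse   = grepl(apa_f, with_mse, perl = TRUE)
)
print(matched)
```

The first string matches and the second does not, because after “4.48,” the pattern expects “p” but finds “MSE”.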