Tutorial_{suddengains}

6/8/2020

注意

接下來所要介紹的大部分都是臨床心理學領域的理論，如有翻譯不當敬請見諒。

介紹

“突然進步”是個別實驗參與者數據在連續兩次治療間獲得變量上的大幅改善，該變量的改變在縱貫實驗檔案資料系列中是穩定的。

“突然進步”初次被Tang & DeRubeis所定義與調查，他們依循著治療會期之間檢驗實驗參與者在參與認知行為療程時，在憂鬱症狀上的改變。

“突然進步”的主要研究迄今為止都與心理療程有關係，但是分析方法也可以用於其他領域中來檢驗實驗參與者間的改變。

對於有著重要的潛在“突然進步”而言，特別觀察這些療程可能對於了解何時與為何“大幅改進”這個情況的發生會有更多的資訊，也會使介入的效能與效率增強。

心理療程16項研究的後設分析後(樣本總共1104位)，與那些沒經歷過“突然進步”的比較之下，發現經歷過“突然進步”的療程參與者都與療程結束後有更好的臨床結果有關係。

“突然進步”在已公開的臨床研究結果呈現的相當不一(例如：有經歷“突然進步”的實驗參與者從整體的17.8%至52.2%都有)，其中有部分可能是因為檢驗的方法導致呈現的結果不同。

無論如何，這些差異非常難以檢驗，因為能夠做成比較的必要方法細節通常不會顯示。

介紹

此外，由於安慰劑介入與研究模擬都出現了“突然進步”現象，有些研究對於現有的方法所檢驗出來的“突然進步”效度開始產生懷疑。

並不是所有“突然進步”都是有意義的，或是顯示出與介入有因果關係的“突然進步”都需要進一步被研究。

這凸顯了需要檢驗這些關聯的存在與強度，並考量到現在檢驗的方法都可以被改進。

“突然進步”封包是第一個軟體程式來提供精密且可複製的方法來自動檢驗“突然進步”，這可能對於改善方法的報告與研究的一致性有所價值。

這也可能可以促進對識別“突然進步”方法的進一步檢驗，來改善這些方法的效度以保證這些方法能夠更精準的反映有意義的情況。

這篇文章專注於提供較容易的，對於“突然進步”是如何被計算的概覽，描述package的主要功能，以及提供如何在縱貫研究資料上使用。

希望使用這封包能夠促進改善效率、報告、以及重現度在“突然進步”的研究上。

“突然進步”的識別

Tang & DeRubeis 提出了三項準則來識別“突然進步”：

1.“突然進步”在一定的實驗條件下，幅度必須非常大。

雖然最初是在「貝克抑鬱量表」(BDI)上必須降低至少7分才能歸於這類情況，之後的研究中，於其他量表上通常都使用「可信改變指數」(RCI)來確定一個合適的“突然進步”的發生時點。

(備註：量表的分數越低表示患者越不抑鬱，換句話說，進步 = 患者的量表分數變低)

2.“突然進步”在有關的實驗條件下，幅度必須非常大，這被定義為至少比先前的實驗分數少25%。

3.“突然進步”必須與症狀變動有極大的關聯。

最初獨立T考驗用來比較因為“突然進步”影響症狀變動之前與之後的情況下，“突然進步”的大小。

這種方法是有爭議的，因為在“突然進步”發生之前與之後的測量獨立性與假設的並不符合。

“突然進步”的識別

不同研究之間用識別“突然進步”的規準不同。

例如：有的研究使用了不同的方法來識別“突然進步”發生的間隔來符合規準1；

一些研究並沒有包含規準2，因為考慮到使用不同的反應量表，還有對於資料中找尋“突然進步”而言有幾乎沒有影響

以及研究使用了不同的方法來選擇於規準3內使用的臨界值

於規準1來識別一個“突然進步”發生的間隔上，Tang & DeRubeis起初於臨床實驗中會期之間分數變化的頻率分佈表為基礎，使用了規準1在「貝克抑鬱量表」上定義了減7分的發生時點。

研究者表示在臨床樣本上，7分大約就是1個標準差。

遺漏值

遺漏值，例如說有個實驗參與者不想在這個或是其他實驗提供數據了，在識別“突然進步”時因為幾個原因需要小心謹慎。

第一個原因，憑藉著個別參與者的遺漏值數量以及模式，是有可能不能識別“突然進步”的，詳見表格1。

特別的是：為了符合估計規準3的標準誤，3項–緊接著發生進步之前–的測量中至少有2項要被呈現出來，3項緊接著發生進步之後的測量中的2項也要。

有些研究者認為取代遺漏值的方法，像是最後觀察值推估或是插補法不適合用來識別“突然進步”因為在非實驗參與者提供的數據中會有潛在或是額外的進步會被觀察到。

第二個原因：在潛在“突然進步”發生前後時間點的遺漏值，2種方法被建議用來評價變化的穩定性。

由於Tang & colleagues更新的規準3 ，一些研究將所有會期還有會期間隔使用了2.776的臨界值來檢驗穩定性。

在之後的分析中，了解在識別階段時數據為何遺漏以及遺漏值要用何種方法處理非常重要。

在識別“突然進步”上，檢驗遺漏值的影響以及發現不同處理遺漏值方法的後續研究會非常有價值。

遺漏值

術語

在進步前後的特別會期(或是測量時點)依據慣例的命名，在緊接著進步前的會期叫做會期N(也叫進步前會期)，緊接著進步後的會期叫做會期N+1(也叫進步後會期)。

其他會期的名稱都是會期N(例如N-2、N+3)。

反轉

根據Tang & DeRube的說法，只要50%在“突然進步”得到的改善在之後的時點又再度回升，會被視作為“突然進步”的反轉。

例如：只要“突然進步”呈現了分數從40降到30，如果再之後的分數被觀測到大於35或更多，進步就會被歸類為有了反轉的跡象。

Wucherpfennig等人表示，反轉可能並不是個穩定現象。

這些研究者透過提出穩定反轉如果被歸類於“突然退步”來更改這個規準。

“突然退步”

雖然突然退步研究的比“突然進步”少，突然退步呈現了一種相反現象，研究參與者在結果上顯示出大量且穩定的分數增加。

有些研究者反轉三種“突然進步”的規準，其他研究者之後調整了規準2的門檻百分比，例如：33%。

“突然進步”封包的功能

Data

建立一個以25分的量表作為追蹤工具的縱貫資料

library(suddengains)
library(ggplot2)
library(tidyr)
set.seed(123)
x<- 1:25
test <- data.frame( 
         ID= 1:6, 
         t0= sample(x, 6), 
    t1= sample(x, 6),
    t2= sample(x, 6),
    t3= sample(x, 6),
    t4= sample(x, 6),
    t5= sample(x, 6))
head(test)

##   ID t0 t1 t2 t3 t4 t5
## 1  1 15 22 19  9  7  7
## 2  2 19 11  9 19 21  9
## 3  3 14  5  3  4 12 24
## 4  4  3 20  8 14 15 10
## 5  5 10 14  7 17 10 21
## 6  6 18 23 10 11 13 25

Data preparation(1/2)

methods=“pattern”

確認資料是否可以進行分析

thelist <- c("t0","t1", "t2","t3","t4", "t5")
select_cases(data=test, id_var_name="ID", 
             sg_var_list=c("t0","t1", "t2","t3","t4", "t5"), 
             method= "pattern", 
             return_id_lgl=F) #FALSE會回傳全部的資料，TRUE只會回傳ID與結果

## The method 'pattern' was used to select cases.
## See help('select_cases') for more information.

##   ID t0 t1 t2 t3 t4 t5 sg_select
## 1  1 15 22 19  9  7  7      TRUE
## 2  2 19 11  9 19 21  9      TRUE
## 3  3 14  5  3  4 12 24      TRUE
## 4  4  3 20  8 14 15 10      TRUE
## 5  5 10 14  7 17 10 21      TRUE
## 6  6 18 23 10 11 13 25      TRUE

Data preparation(1/2)

method = “min_sess”

也可以自行選擇session number

select_cases(data = test,
             id_var_name = "ID",
             sg_var_list = c("t0","t1", "t2","t3","t4", "t5"),
             method = "min_sess",
             min_sess_num = 4,
             return_id_lgl = T) #return_id_lgl=T,只會回傳單獨結果

## The method 'min_sess' was used to select cases.

##   ID sg_select
## 1  1      TRUE
## 2  2      TRUE
## 3  3      TRUE
## 4  4      TRUE
## 5  5      TRUE
## 6  6      TRUE

Default

include_graphics("pattern.png")

Note: xn-2 to xn+3 are consecutive data points of the primary outcome measure. “x” = Present data; “.” = Missing data. “x” represents available data to be examined as a possible pregain session.

select_cases

將符合條件的個案設為另一個物件

test2 <- select_cases(data=test, id_var_name="ID", 
            sg_var_list=thelist,
            method="pattern", 
            return_id_lgl=F)%>%
    dplyr::filter(sg_select ==TRUE)
test2

##   ID t0 t1 t2 t3 t4 t5 sg_select
## 1  1 15 22 19  9  7  7      TRUE
## 2  2 19 11  9 19 21  9      TRUE
## 3  3 14  5  3  4 12 24      TRUE
## 4  4  3 20  8 14 15 10      TRUE
## 5  5 10 14  7 17 10 21      TRUE
## 6  6 18 23 10 11 13 25      TRUE

Identification of sudden gains (crit1, crit2, crit3)

sg_crit1_cutoff

The cut-off value for clinically meaningful change on the measure used to identify sudden gains is specified using the argument sg_crit1_cutoff.

sg_crit2_pct

The minimum percentage drop from the pre- to post-gain session is specified using the argument sg_crit2_pct. The default is a minimum of a 25% drop, i.e. sg_crit2_pct = 0.25.

sg_crit3 = TRUE

Default sg_crit3_alpha = 0.05.

Define cut-off for first criterion

Default data

define_crit1_cutoff(sd = 10.5,
                    reliability = 0.931)

## $sd
## [1] 10.5
## 
## $reliability
## [1] 0.931
## 
## $standard_error_measurement
## [1] 2.758124
## 
## $standard_error_difference
## [1] 3.900577
## 
## $reliable_change_value
## [1] 7.645131

Reliability?

假設需要計算內在一致性信度(Cronbach’s alpha)

psych::alpha(test2[, -1])

## Some items ( t0 t4 t5 ) were negatively correlated with the total scale and 
## probably should be reversed.  
## To do this, run the function again with the 'check.keys=TRUE' option

## 
## Reliability analysis   
## Call: psych::alpha(x = test2[, -1])
## 
##   raw_alpha std.alpha G6(smc) average_r   S/N  ase mean  sd median_r
##      -0.38     -0.17   0.064    -0.025 -0.15 0.89   13 2.2   -0.076
## 
##  lower alpha upper     95% confidence boundaries
## -2.13 -0.38 1.38 
## 
##  Reliability if an item is dropped:
##    raw_alpha std.alpha G6(smc) average_r    S/N var.r  med.r
## t0     -0.81     -0.50    0.46    -0.071 -0.333 0.180 -0.229
## t1     -0.56     -0.32    0.63    -0.052 -0.245 0.119 -0.023
## t2     -0.42     -0.19    0.19    -0.033 -0.159 0.079 -0.104
## t3     -0.51     -0.43    1.00    -0.064 -0.301 0.145 -0.161
## t4     -0.28     -0.11    0.65    -0.020 -0.099 0.132 -0.023
## t5      0.34      0.33    0.65     0.090  0.494 0.120  0.082
## 
##  Item statistics 
##    n raw.r std.r  r.cor  r.drop mean  sd
## t0 6  0.59  0.58  0.584  0.1742 13.2 5.9
## t1 6  0.54  0.50  0.755  0.0071 15.8 7.1
## t2 6  0.37  0.42 -0.079 -0.0323  9.3 5.3
## t3 6  0.43  0.55  1.200  0.0222 12.3 5.5
## t4 6  0.22  0.36  0.319 -0.1372 13.0 4.8
## t5 6  0.08 -0.12 -2.200 -0.4724 16.0 8.2

#raw_alpha=-0.38, t0 sd=5.9

define_crit1_cutoff

可以用來計算達顯著改變的分數

define_crit1_cutoff(sd = 5.9,
                    reliability = 0.38)

## $sd
## [1] 5.9
## 
## $reliability
## [1] 0.38
## 
## $standard_error_measurement
## [1] 4.645665
## 
## $standard_error_difference
## [1] 6.569962
## 
## $reliable_change_value
## [1] 12.87713

create_bysg

bysg <- create_bysg(data=test2,  
        sg_crit1_cutoff =5, 
        sg_crit2_pct = NULL,
        sg_crit3 = F,
        id_var_name ="ID",
        tx_start_var_name="t0", 
        tx_end_var_name="t5",
        sg_var_list=thelist,
                sg_measure_name="t",
                identify="sg")
knitr::kable(bysg, table.attr = "style='width:30%;'")

ID	id_sg	sg_crit123	sg_session_n	sg_freq_byperson	t0	t1	t2	t3	t4	t5	sg_t_2n	sg_t_1n	sg_t_n	sg_t_n1	sg_t_n2	sg_t_n3	sg_magnitude	sg_t_tx_change	sg_change_proportion	sg_reversal_value	sg_reversal
1	1_sg_3	1	3	1	15	22	19	9	7	7	15	22	19	9	7	7	10	8	1.2500000	14.0	0
4	4_sg_2	1	2	1	3	20	8	14	15	10	NA	3	20	8	14	15	12	-7	-1.7142857	14.0	1
5	5_sg_2	1	2	2	10	14	7	17	10	21	NA	10	14	7	17	10	7	-11	-0.6363636	10.5	1
5	5_sg_4	1	4	2	10	14	7	17	10	21	14	7	17	10	21	NA	7	-11	-0.6363636	13.5	0
6	6_sg_2	1	2	1	18	23	10	11	13	25	NA	18	23	10	11	13	13	-7	-1.8571429	16.5	1

create_byperson

byperson1 <- create_byperson(data=test2, 
        sg_crit1_cutoff =5,
        sg_crit2_pct = NULL,
        sg_crit3 = F,
        id_var_name ="ID",
        tx_start_var_name="t0", 
        tx_end_var_name="t5",
        sg_var_list= thelist,
                sg_measure_name="t",
        identify_sg_1to2=F,
        multiple_sg_select="first")
knitr::kable(byperson1,, table.attr = "style='width:30%;'")

ID	id_sg	sg_crit123	sg_session_n	sg_freq_byperson	sg_t_2n	sg_t_1n	sg_t_n	sg_t_n1	sg_t_n2	sg_t_n3	sg_magnitude	sg_t_tx_change	sg_change_proportion	sg_reversal_value	sg_reversal	t0	t1	t2	t3	t4	t5	sg_select
1	1_sg_3	1	3	1	15	22	19	9	7	7	10	8	1.2500000	14.0	0	15	22	19	9	7	7	TRUE
2	NA	0	NA	0	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	19	11	9	19	21	9	TRUE
3	NA	0	NA	0	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	14	5	3	4	12	24	TRUE
4	4_sg_2	1	2	1	NA	3	20	8	14	15	12	-7	-1.7142857	14.0	1	3	20	8	14	15	10	TRUE
5	5_sg_2	1	2	2	NA	10	14	7	17	10	7	-11	-0.6363636	10.5	1	10	14	7	17	10	21	TRUE
6	6_sg_2	1	2	1	NA	18	23	10	11	13	13	-7	-1.8571429	16.5	1	18	23	10	11	13	25	TRUE

identify_sg

identify_sg(data =test2,
            sg_crit1_cutoff = 5,
            sg_crit2_pct = NULL,
            sg_crit3 = TRUE,
              sg_crit3_alpha = 0.05,
            id_var_name = "ID",
            sg_var_list = thelist,
            identify_sg_1to2 = FALSE)

## # A tibble: 6 x 10
##      ID    t0    t1    t2    t3    t4    t5 sg_2to3 sg_3to4 sg_4to5
##   <int> <int> <int> <int> <int> <int> <int>   <int>   <int>   <int>
## 1     1    15    22    19     9     7     7       0       1       0
## 2     2    19    11     9    19    21     9       0       0       0
## 3     3    14     5     3     4    12    24       0       0       0
## 4     4     3    20     8    14    15    10       0       0       0
## 5     5    10    14     7    17    10    21       0       0       0
## 6     6    18    23    10    11    13    25       1       0       0

          #crit123_details =T 把criterion 123的分析結果詳細列出

check_interval

檢測特定時間點之間的變化

check_interval( pre_value = test$t0, 
                post_value = test$t5, 
                sg_crit1_cutoff=5,
                sg_crit2_pct=0.1, 
                sg_crit3= TRUE,
                sg_crit3_alpha=.05, 
                identify="sg")

## # Check sudden gain
## ## Met Criterion 1: YES
## ## Met Criterion 2: YES
## ## Met Criterion 3: NO
## ## Sudden gain: NO
## 
## # Detailed output
## ## Criterion 1: Cut-off: 5 
## ## Criterion 2: Percentage change threshhold: 10 %
## ## Criterion 3: Adjusted: YES, Critical value: 2.228
## ## Number of pre gain values present: 6
## ## Number of post gain values present: 6
## ## Mean of pre gain values: 13.167
## ## Mean of post gain values: 16
## ## SD of pre gain values: 5.913
## ## SD of post gain values: 8.198

plot

Create plot of average change in score around the gain

plot_sg <- plot_sg(data = bysg,
                       id_var_name = "ID",
                       tx_start_var_name = "t0",
                       tx_end_var_name = "t5",
                       sg_pre_post_var_list = c("sg_t_2n", "sg_t_1n", "sg_t_n",
                                                "sg_t_n1", "sg_t_n2", "sg_t_n3"),
                       ylab = "Score", xlab = "Time",
                       colour_single = "darkblue")
plot_sg

## It is then possible to apply other ggplot2 functions to the plot if desired

plot_sg+ ggplot2::coord_cartesian(ylim=c(0, 40))+
  ggplot2::scale_x_discrete(labels = c("First", "n-2", "n-1", "n", 
                                       "n+1", "n+2", "n+3", "Last"))

## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.

Plot individual trajectories

tra_1 <- test %>% 
    plot_sg_trajectories(id_var="ID",
            select_id_list =c(1:6), 
            var_list= thelist, 
            show_id =T, 
            id_label_size=4,
            label.padding=0.2, 
            show_legend=F, 
            colour="viridis",
            connect_missing=F,
            scale_x_num=T, 
            scale_x_num_start=0, 
            xlab="Time",
            ylab="Score")

tra_1

Conclusion

優點

-可以快速找出個案顯著的變化以及變化的區間(不論是增加或減少)

-大量減少為了畫圖資料整理的時間

缺點

-只能定性不能定量

-追蹤時間點要超過4個以上才能使用

-似乎不好應用到其他資料類別的分析(例如想看年度追蹤的資料是不是在某一年有顯著增加，但跑不出來)