The main goal was to determine what to do with fluency tasks that used different categories (i.e., T, S, Fruit) than what was included in the TabCAT-EXAMINER SEM (i.e., F and L; Animals and Vegetables).
I first investigated distributions of each Fluency item as well as correlations between those items and TabCAT-EXAMINER components (e.g., TabCAT Flanker). Distributions of the T, S, and Fruits items and correlations differed from what was seen in F, L, Animals, and Vegetables; this suggests that we cannot simply put those fluency items in the models.
I next completed a series of crosswalks, converting fluency tasks that were not in the original SEM to tasks that were. There was a subset of participants who had: words (T) and words (S); words (S) and words (L); words (fruit) and words (animals); words (animals) and words (vegetables).
Using these combinations, I converted both words (T) and words (S) to words (L), based on the mean and SD of words (L). Similarly, I converted fruits to vegetables. Words (F) and animal data remained the same. I then recreated an EF score, meaning we now had two scores:
A. TabCAT-EXAMINER Original: Running Dots, Dot counting, Flanker, Set Shifting, Animals, Vegetables, Words (F) and Words (L).
B. TabCAT-EXAMINER Crosswalk: This score included the same data points for: Running Dots, Dot counting, Flanker, Set Shifting, Animals and Words (F). Vegetables and Words (L) included scores that were cross walked from Fruits and from Words (T), Words (S).
I completed very simple validation analyses (heatmap), looking at the relationship between the TabCAT-EXAMINER versions within the full sample. They are highly correlated (.95), which makes sense since they share many of the same components. Additionally, the distributions of the scores were fairly similar: mean difference: -.078, SD of difference: .163.
I then reduced the sample to just those who had their fluency data converted, since these participants may have differences (TabCAT-EXAMINER Original minus TabCAT-EXAMINER Crosswalk) that are masked in the full sample. The correlation between versions of TabCAT-EXAMINER remains high (.94), and the average difference between versions is still small, although it is larger than in the full sample, as expected: mean difference: -.152, SD of difference: .198.
There are differences in mean and SD across fluency items. There are also differences in the correlations between those fluency items and other components of the TabCAT-EXAMINER score. Below are histograms for each fluency item. Following the histograms is a table showing all numerical data descriptions.
Item | n | mean | sd | cor:flanker | cor:Dot_count | cor:run_dot | cor:set_shift |
---|---|---|---|---|---|---|---|
word_f | 294 | 16.099 | 4.626 | 0.111 | 0.155 | 0.202 | 0.124 |
word_l | 435 | 15.309 | 4.199 | 0.057 | 0.286 | 0.142 | 0.15 |
word_s | 425 | 18.416 | 4.868 | 0.134 | 0.354 | 0.253 | 0.156 |
word_t | 289 | 17.08 | 4.902 | 0.098 | 0.34 | 0.236 | 0.085 |
word_m | 2 | 19.5 | 7.778 | n<5 | n<5 | n<5 | n<5 |
word_b | 2 | 15.5 | 4.95 | n<5 | n<5 | n<5 | n<5 |
word_n | 1 | 11 | NA | n<5 | n<5 | n<5 | n<5 |
word_c | 1 | 16 | NA | n<5 | n<5 | n<5 | n<5 |
word_fruit | 285 | 16.639 | 4.288 | 0.3 | 0.197 | 0.209 | 0.265 |
word_veg | 439 | 15.551 | 4.162 | 0.069 | 0.114 | 0.172 | 0.074 |
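A table like the one above could be produced along the following lines. This is a minimal sketch rather than the code used for the report: the data are made up, `describe` and `pearson` are hypothetical helper names, and I am assuming correlations were computed on pairwise-complete cases and suppressed when fewer than five pairs were available (the "n<5" entries).

```python
from math import sqrt
from statistics import mean, stdev

def pearson(xs, ys, min_pairs=5):
    """Pearson r over pairs where both values are present (None = missing).

    Returns None when fewer than `min_pairs` complete pairs exist,
    mirroring the "n<5" entries in the table (an assumption).
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < min_pairs:
        return None
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

def describe(item, component):
    """n, mean, and SD of a fluency item, plus its correlation with one component."""
    present = [x for x in item if x is not None]
    return {
        "n": len(present),
        "mean": mean(present),
        "sd": stdev(present) if len(present) > 1 else None,  # NA when n = 1
        "cor": pearson(item, component),
    }

# Hypothetical data for seven participants (not the study data).
word_s = [18, 22, None, 15, 20, 25, 17]
flanker = [7.1, 8.0, 6.5, None, 7.6, 8.8, 6.9]

row = describe(word_s, flanker)
print(row)
```

Running `describe` once per item and component would fill in one row of the table at a time.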
The TabCAT-EXAMINER score converts continuous variables to ordinal for factor creation. Here I checked to see if converting to ordinal removed some of the distribution and correlation issues noted in the raw score section. Binning in this way did not make much of a difference.
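As an illustration of the continuous-to-ordinal step, here is a minimal sketch. The cut points are invented for the example; the actual recode thresholds used for the TabCAT-EXAMINER score are not given in this report, and `to_ordinal` is a hypothetical helper name.

```python
from bisect import bisect_right

def to_ordinal(score, cut_points):
    """Map a raw word count to a 0-based ordinal category.

    `cut_points` must be sorted ascending; a score equal to a cut point
    falls into the higher category.
    """
    return bisect_right(cut_points, score)

# Hypothetical cut points (NOT the thresholds used for the real score).
cuts = [5, 10, 15, 20, 25]

raw = [3, 5, 12, 19, 31]
ordinal = [to_ordinal(x, cuts) for x in raw]
print(ordinal)
```

Because binning is monotone, it preserves rank order within an item, which fits the observation that it did little to change the pattern of correlations.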
Item | n | mean | sd | cor:flanker | cor:Dot_count | cor:run_dot | cor:set_shift |
---|---|---|---|---|---|---|---|
word_f_fword_recode | 294 | 8.238 | 2.21 | 0.112 | 0.159 | 0.203 | 0.135 |
word_s_fword_recode | 425 | 9.304 | 2.213 | 0.155 | 0.374 | 0.256 | 0.17 |
word_t_fword_recode | 289 | 8.702 | 2.308 | 0.113 | 0.357 | 0.227 | 0.088 |
word_l_lword_recode | 435 | 8.372 | 2.053 | 0.053 | 0.291 | 0.14 | 0.142 |
word_s_lword_recode | 425 | 9.706 | 2.075 | 0.152 | 0.349 | 0.24 | 0.167 |
word_t_lword_recode | 289 | 9.128 | 2.191 | 0.105 | 0.357 | 0.247 | 0.086 |
veg_recode | 439 | 7.695 | 1.831 | 0.071 | 0.112 | 0.168 | 0.073 |
word_fruit_animal_recode | 285 | 5.832 | 1.441 | 0.286 | 0.18 | 0.21 | 0.244 |
word_fruit_veg_recode | 285 | 8.172 | 1.956 | 0.301 | 0.196 | 0.228 | 0.262 |
Many participants were missing words (L) and words (vegetables). However, there was a reasonably sized subset of folks who had: words (T) and words (S); words (S) and words (L); words (fruit) and words (animals); words (animals) and words (vegetables).
I completed a crosswalk to convert T words and S words to L, then fruits to vegetables. Animals and words (F) were not changed.
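One common way to do a conversion "based on the mean and SD" of the target item is linear (mean-sigma) equating, which matches z-scores across the two scales. The sketch below assumes that approach; the exact procedure used here is not spelled out, the paired data are invented, and `linear_crosswalk` is a hypothetical helper name.

```python
from statistics import mean, stdev

def linear_crosswalk(source_scores, target_scores):
    """Build a converter from the source scale to the target scale.

    A score k SDs above the source mean maps to a score k SDs above
    the target mean (mean-sigma linear equating).
    """
    mu_s, sd_s = mean(source_scores), stdev(source_scores)
    mu_t, sd_t = mean(target_scores), stdev(target_scores)

    def convert(x):
        return mu_t + sd_t * (x - mu_s) / sd_s

    return convert

# Hypothetical participants who completed both words (S) and words (L).
words_s = [12, 15, 18, 20, 23, 25]
words_l = [10, 13, 15, 16, 19, 21]

s_to_l = linear_crosswalk(words_s, words_l)
converted = [round(s_to_l(x), 2) for x in words_s]
print(converted)
```

By construction, the converted scores have the same mean and SD as words (L) in the overlap sample. The same helper would apply to fruits and vegetables, and words (T) could in principle be chained through words (S), although the report does not say whether that is how T was handled.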
Results are reported under “creation of factor” in two tabs. The gist of it is that the crosswalk worked pretty well, and either score is likely viable. That said, it’d be a good idea to verify that both are correlated as expected with other data (e.g., age, CDR box, imaging, etc.).
These analyses include all participants. They compare a TabCAT-EXAMINER score created with the original DIALS sample data and a score (TabCAT-EXAMINER Crosswalk) created after crosswalking the fluency items (lexical and semantic). Both versions were made with the same factor loadings, just different data. The correlation between scores is .95.
Histograms show distributions for both versions of the score. The printout shows numerical summaries of the distributions. The “Crosswalk Diff.” row displays the original score minus the crosswalk score. To me, that difference is pretty small and (in this sample) is zero for most folks, since their fluency scores did not change.
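To make the difference score concrete, here is a small sketch with invented factor scores. It computes original minus crosswalk, summarizes it for everyone, and then again for only those whose score changed. In the real analysis the restricted sample was defined by who had fluency data converted; filtering on a nonzero difference is a stand-in I am assuming for illustration.

```python
from statistics import mean, stdev

# Hypothetical factor scores for five participants (not the study data).
original = [0.80, 0.50, 1.20, -0.30, 0.95]
crosswalk = [0.80, 0.65, 1.20, -0.10, 0.95]

# Difference is original minus crosswalk, as in the "Crosswalk Diff." row.
diff = [o - c for o, c in zip(original, crosswalk)]

full = {"n": len(diff), "mean": mean(diff), "sd": stdev(diff)}

# Participants whose fluency data were not converted have a difference of 0.
n_unchanged = sum(1 for d in diff if d == 0)

# Stand-in for the restricted sample: keep only scores that changed.
changed = [d for d in diff if d != 0]
restricted = {"n": len(changed), "mean": mean(changed), "sd": stdev(changed)}

print(full, n_unchanged, restricted)
```

Dropping the zeros necessarily pulls the mean difference further from zero, which is the pattern seen when moving from the full sample to the restricted one.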
The heatmap shows the correlation between the latent factors. The Lexical factor and Semantic factor were part of the IRT model and were included to account for covariance between the lexical items (F words, L words) and between the semantic items (animals, vegetables). The first three columns/rows (TabCAT-EXAMINER, Lexical factor, Semantic factor) are from the model with the original data. The fourth through sixth columns/rows are from the model that included crosswalk fluency data. The 725 (lower triangle) indicates the number of participants in the corresponding correlation in the upper triangle. The main takeaway here is the .95 correlation between TabCAT-EXAMINER scores.
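The layout of the heatmap (correlations in the upper triangle, participant counts in the lower triangle) can be sketched as below. The scores are invented, only two columns are shown instead of six, `heatmap_matrix` is a hypothetical helper name, and Pearson is used throughout even though the real heatmap uses Spearman for some variables.

```python
from math import sqrt

def pairwise_r_and_n(xs, ys):
    """Pearson r and the number of complete pairs (None marks missing)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy), n

def heatmap_matrix(columns):
    """Grid with r above the diagonal and n below it; the diagonal stays None."""
    names = list(columns)
    k = len(names)
    grid = [[None] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            r, n = pairwise_r_and_n(columns[names[i]], columns[names[j]])
            grid[i][j] = round(r, 2)  # upper triangle: correlation
            grid[j][i] = n            # lower triangle: participants in that correlation
    return names, grid

# Hypothetical scores; the last participant is missing the original score.
scores = {
    "original": [0.80, 0.50, 1.20, -0.30, 0.95, None],
    "crosswalk": [0.80, 0.65, 1.20, -0.10, 0.95, 0.40],
}
names, grid = heatmap_matrix(scores)
for name, row in zip(names, grid):
    print(name, row)
```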
##
##
## Table: Description of factors
##
## | | n| mean| sd| median| min| max| range| skew| kurtosis|
## |:---------------|---:|------:|-----:|------:|------:|-----:|-----:|------:|--------:|
## |Original Score | 725| 0.775| 0.519| 0.805| -1.790| 2.162| 3.953| -0.354| 0.453|
## |Crosswalk Score | 725| 0.853| 0.531| 0.863| -1.423| 2.389| 3.812| -0.170| 0.278|
## |Crosswalk Diff. | 725| -0.078| 0.163| -0.067| -0.750| 0.421| 1.171| -0.640| 1.178|
## [1] "Values in heatmap are: Correlation (no covariates). Spearman used for spearman.list, pearson for other variables"
This analysis includes only participants who had missing fluency data, i.e., those whose fluency data were converted (n=269). I examined this sample in particular because their TabCAT-EXAMINER scores (original vs crosswalk) will differ. In the full sample, many of the participants have identical TabCAT-EXAMINER scores (original = crosswalk).
Histograms: the distributions are fairly similar. The mean is higher for the crosswalk score, as it was in the full-sample analysis. The mean crosswalk difference is larger in magnitude here (-0.152 vs. -0.078), which makes sense since we’re essentially removing folks whose difference score was 0. Although individual differences can be quite large (range: 1.171, from -0.750 to 0.421), the patterns in the heatmap remain largely the same.
Heatmap: Main take away is that the correlation between factors remains high: .94
##
##
## Table: Description of factors (restricted sample)
##
## | | n| mean| sd| median| min| max| range| skew| kurtosis|
## |:--------------------|---:|------:|-----:|------:|------:|-----:|-----:|------:|--------:|
## |Original Score | 269| 0.744| 0.538| 0.735| -1.790| 1.845| 3.635| -0.541| 0.864|
## |Crosswalk Score | 269| 0.896| 0.575| 0.903| -1.423| 2.373| 3.796| -0.275| 0.267|
## |Crosswalk Difference | 269| -0.152| 0.198| -0.139| -0.750| 0.421| 1.171| -0.255| 0.118|
## [1] "Values in heatmap are: Correlation (no covariates). Spearman used for spearman.list, pearson for other variables"