Summary of Analyses

  The main goal was to determine how to handle fluency tasks that used categories (T, S, Fruit) different from those included in the TabCAT-EXAMINER SEM (F and L; Animals and Vegetables).
  
  I first investigated distributions of each Fluency item as well as correlations between those items and TabCAT-EXAMINER components (e.g., TabCAT Flanker). Distributions of the T, S, and Fruits items and correlations differed from what was seen in F, L, Animals, and Vegetables; this suggests that we cannot simply put those fluency items in the models.
  
  I next completed a series of cross walks, converting non-included fluency tasks to fluency tasks that were in the original SEM. There was a subset of participants who had: words (T) and words (S); words (S) and words (L); words (fruit) and words (animals); words (animals) and words (vegetables).
  
  Using these combinations, I converted both words (T) and words (S) to words (L), based on the mean and SD of words (L). Similarly, I converted fruits to vegetables. Words (F) and animal data remained the same. I then recreated an EF score, meaning we now had two scores:
  
  
   A. TabCAT-EXAMINER Original: Running Dots, Dot counting, Flanker, Set Shifting, Animals, Vegetables, Words (F) and Words (L). 
   
   B. TabCAT-EXAMINER Crosswalk: This score uses the same data points as the original for Running Dots, Dot counting, Flanker, Set Shifting, Animals, and Words (F). Vegetables and Words (L) additionally include scores crosswalked from Fruits and from Words (T) and Words (S), respectively.
   
   
  I completed very simple validation analyses (heatmap), looking at the relationship between the two TabCAT-EXAMINER versions within the full sample. They are highly correlated (.95), which makes sense since they share many of the same components. Additionally, the distributions of the scores were fairly similar: mean difference: -.078, SD of difference: .163.
  
  I then reduced the sample to just those who had their fluency data converted, since these participants may have differences (TabCAT-EXAMINER Original minus TabCAT-EXAMINER Crosswalk) that are masked in the full sample. The correlation between versions of TabCAT-EXAMINER remains high (.94) and the average difference between versions is still small, although it is larger than in the full sample, as expected: mean difference: -.152, SD of difference: .198.

1. Comparison of fluency tasks

a. Raw Scores

There are differences in mean and SD across fluency items. There are also differences in the correlations between those fluency items and other components of the TabCAT-EXAMINER score. Below are histograms for each fluency item. Following the histograms is a table showing all numerical data descriptions.

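| Item       |   n |   mean |    sd | cor:flanker | cor:Dot_count | cor:run_dot | cor:set_shift |
|:-----------|----:|-------:|------:|------------:|--------------:|------------:|--------------:|
| word_f     | 294 | 16.099 | 4.626 |       0.111 |         0.155 |       0.202 |         0.124 |
| word_l     | 435 | 15.309 | 4.199 |       0.057 |         0.286 |       0.142 |          0.15 |
| word_s     | 425 | 18.416 | 4.868 |       0.134 |         0.354 |       0.253 |         0.156 |
| word_t     | 289 |  17.08 | 4.902 |       0.098 |          0.34 |       0.236 |         0.085 |
| word_m     |   2 |   19.5 | 7.778 |         n<5 |           n<5 |         n<5 |           n<5 |
| word_b     |   2 |   15.5 |  4.95 |         n<5 |           n<5 |         n<5 |           n<5 |
| word_n     |   1 |     11 |    NA |         n<5 |           n<5 |         n<5 |           n<5 |
| word_c     |   1 |     16 |    NA |         n<5 |           n<5 |         n<5 |           n<5 |
| word_fruit | 285 | 16.639 | 4.288 |         0.3 |         0.197 |       0.209 |         0.265 |
| word_veg   | 439 | 15.551 | 4.162 |       0.069 |         0.114 |       0.172 |         0.074 |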
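
The kind of per-item summary shown in the table can be sketched as follows. This is a minimal Python illustration, not the code used for the report: the function names, the use of `None` for missing values, and the rule that a correlation is suppressed when fewer than 5 complete pairs exist are assumptions inferred from the "n<5" entries.

```python
import math
from statistics import mean, stdev

MIN_PAIRS = 5  # assumption: mirrors the "n<5" entries in the table


def pearson(xs, ys):
    """Pearson correlation for two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(item, component):
    """Summarise one fluency item against one component (None = missing)."""
    scores = [s for s in item if s is not None]
    pairs = [(s, c) for s, c in zip(item, component)
             if s is not None and c is not None]
    row = {
        "n": len(scores),
        "mean": round(mean(scores), 3) if scores else None,
        # The sample SD is undefined for a single observation (NA in the table).
        "sd": round(stdev(scores), 3) if len(scores) > 1 else None,
    }
    if len(pairs) < MIN_PAIRS:
        row["cor"] = "n<5"  # too few complete pairs to report a correlation
    else:
        xs, ys = zip(*pairs)
        row["cor"] = round(pearson(xs, ys), 3)
    return row


# Toy input: two observed scores and one missing score.
toy = describe([14, 25, None], [0.1, 0.2, 0.3])
```

With only two observed scores, the toy row reports a mean and SD but suppresses the correlation, which is the pattern seen for the rarely administered letters.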

b. Binned Scores

The TabCAT-EXAMINER score converts continuous variables to ordinal for factor creation. Here I checked to see if converting to ordinal removed some of the distribution and correlation issues noted in the raw score section. Binning in this way did not make much of a difference.

| Item                     |   n |  mean |    sd | cor:flanker | cor:Dot_count | cor:run_dot | cor:set_shift |
|:-------------------------|----:|------:|------:|------------:|--------------:|------------:|--------------:|
| word_f_fword_recode      | 294 | 8.238 |  2.21 |       0.112 |         0.159 |       0.203 |         0.135 |
| word_s_fword_recode      | 425 | 9.304 | 2.213 |       0.155 |         0.374 |       0.256 |          0.17 |
| word_t_fword_recode      | 289 | 8.702 | 2.308 |       0.113 |         0.357 |       0.227 |         0.088 |
| word_l_lword_recode      | 435 | 8.372 | 2.053 |       0.053 |         0.291 |        0.14 |         0.142 |
| word_s_lword_recode      | 425 | 9.706 | 2.075 |       0.152 |         0.349 |        0.24 |         0.167 |
| word_t_lword_recode      | 289 | 9.128 | 2.191 |       0.105 |         0.357 |       0.247 |         0.086 |
| veg_recode               | 439 | 7.695 | 1.831 |       0.071 |         0.112 |       0.168 |         0.073 |
| word_fruit_animal_recode | 285 | 5.832 | 1.441 |       0.286 |          0.18 |        0.21 |         0.244 |
| word_fruit_veg_recode    | 285 | 8.172 | 1.956 |       0.301 |         0.196 |       0.228 |         0.262 |
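
For readers unfamiliar with the recoding step, here is a rough Python sketch of mapping a raw word count onto ordinal bins. The cut points below are hypothetical and are NOT the ones used for the TabCAT-EXAMINER score; the point is only that binning is a monotone transformation, so it preserves the ordering of participants, which may be part of why the binned correlations look so similar to the raw ones.

```python
from bisect import bisect_right

# Hypothetical cut points for illustration only; the real recode rules
# for each fluency item are not given in this report.
CUTS = [5, 10, 15, 20, 25]


def to_ordinal(raw, cuts=CUTS):
    """Return the ordinal bin (0 .. len(cuts)) that a raw count falls into."""
    return bisect_right(cuts, raw)


raw_scores = [3, 9, 10, 16, 18, 27]
binned = [to_ordinal(s) for s in raw_scores]

# Binning never reverses the order of two raw scores.
order_preserved = all(a <= b for a, b in zip(binned, binned[1:]))
```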

c. Cross walk

The crosswalk follows Monsell et al. (2016) and uses the equate library with the “mean” method.

Many participants were missing words (L) and words (vegetables). However, a reasonably large subset of participants had both tasks in each of the following pairs: words (T) and words (S); words (S) and words (L); words (fruit) and words (animals); words (animals) and words (vegetables).

I completed a crosswalk to convert T words and S words to L, then fruits to vegetables. Animals and words (F) were not changed.
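
To make the conversion concrete, here is a minimal Python sketch of two standard equating formulas. It is an illustration under stated assumptions, not the equate library's implementation: mean equating shifts a score by the difference in means, while linear equating also rescales by the ratio of SDs. The function names and the toy scores are hypothetical, and this sketch cannot confirm which variant matches the settings actually used.

```python
from statistics import mean, stdev


def mean_equate(score, from_scores, to_scores):
    """Mean equating: shift by the difference between the two means."""
    return score - mean(from_scores) + mean(to_scores)


def linear_equate(score, from_scores, to_scores):
    """Linear equating: match both the mean and the SD of the target scale."""
    slope = stdev(to_scores) / stdev(from_scores)
    return mean(to_scores) + slope * (score - mean(from_scores))


# Toy paired data (hypothetical): participants who completed both tasks.
s_words = [14, 18, 20, 22, 26]  # words (S), mean 20
l_words = [12, 15, 16, 18, 19]  # words (L), mean 16

# A participant with 23 S words, expressed on the L scale.
l_equiv_mean = mean_equate(23, s_words, l_words)
l_equiv_linear = linear_equate(23, s_words, l_words)
```

In the toy data the S task runs four words higher on average, so mean equating simply subtracts four; linear equating additionally pulls the score toward the L mean because L scores are less spread out.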

Results are reported under “creation of factor” in two tabs. The gist of it is that the crosswalk worked pretty well, and either score is likely viable. That said, it’d be a good idea to verify that both are correlated as expected with other data (e.g., age, CDR box, imaging etc.)

2. Creation of Factor Score

Full sample

These analyses include all participants. They compare a TabCAT-EXAMINER score created with the original DIALS sample against a score (TabCAT-EXAMINER crosswalk) created after crosswalking the fluency items (lexical and semantic). Both versions were made with the same factor loadings, just different data. The correlation between scores is .95.

Histograms show distributions for both versions of the score. The printout shows numerical descriptions of each distribution. The “Crosswalk Diff.” row displays the original score minus the crosswalk score. To me, that difference is pretty small and (in this sample) is zero for most folks since their fluency scores did not change.
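
As a quick consistency check on the difference row, the mean of the per-participant differences must equal the difference of the two means. A small Python sketch, using the means reported in the "Description of factors" tables (the helper name is hypothetical):

```python
def difference_scores(original, crosswalk):
    """Per-participant difference: original score minus crosswalk score."""
    return [o - c for o, c in zip(original, crosswalk)]


# Means reported in the tables (full sample, then restricted sample).
full_diff = round(0.775 - 0.853, 3)
restricted_diff = round(0.744 - 0.896, 3)

# Toy scores (hypothetical): unchanged fluency data gives a difference of 0.
toy_diffs = difference_scores([0.5, 1.0, 0.8], [0.5, 1.25, 0.8])
```

Both computed values agree with the reported mean differences (-0.078 and -0.152), and the toy example shows why participants whose fluency data were not converted contribute a difference of exactly zero.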

The heatmap shows the correlation between the latent factors. The Lexical factor and Semantic factor were part of the IRT model and were included to account for covariance between the lexical items (F words, L words) and between the semantic items (animals, vegetables). The first three columns/rows (TabCAT-EXAMINER, Lexical factor, Semantic factor) are from the model with the original data. The fourth through sixth columns/rows are from the model that included crosswalk fluency data. The 725 (lower triangle) indicates the number of participants in the corresponding correlation in the upper triangle. The main takeaway here is the .95 correlation between TabCAT-EXAMINER scores.

## 
## 
## Table: Description of factors
## 
## |                |   n|   mean|    sd| median|    min|   max| range|   skew| kurtosis|
## |:---------------|---:|------:|-----:|------:|------:|-----:|-----:|------:|--------:|
## |Original Score  | 725|  0.775| 0.519|  0.805| -1.790| 2.162| 3.953| -0.354|    0.453|
## |Crosswalk Score | 725|  0.853| 0.531|  0.863| -1.423| 2.389| 3.812| -0.170|    0.278|
## |Crosswalk Diff. | 725| -0.078| 0.163| -0.067| -0.750| 0.421| 1.171| -0.640|    1.178|
## [1] "Values in heatmap are: Correlation (no covariates). Spearman used for spearman.list, pearson for other variables"

Restricted sample

Includes only participants who had missing fluency data (n=269). I examined this sample in particular because their TabCAT-EXAMINER scores (original vs crosswalk) will differ. In the full sample, many of the participants have identical TabCAT-EXAMINER scores (original = crosswalk).

Histograms: the distributions are fairly similar. The mean is higher for the crosswalk score, as it was in the full sample analysis. The crosswalk difference is larger in magnitude here, which makes sense since we’re essentially removing folks whose difference score was 0. Although this difference can be quite large for individual participants (range: 1.171, from -0.750 to 0.421), the patterns in the heatmap remain largely the same.

Heatmap: The main takeaway is that the correlation between the two TabCAT-EXAMINER scores remains high: .94

## 
## 
## Table: Description of factors (restricted sample)
## 
## |                     |   n|   mean|    sd| median|    min|   max| range|   skew| kurtosis|
## |:--------------------|---:|------:|-----:|------:|------:|-----:|-----:|------:|--------:|
## |Original Score       | 269|  0.744| 0.538|  0.735| -1.790| 1.845| 3.635| -0.541|    0.864|
## |Crosswalk Score      | 269|  0.896| 0.575|  0.903| -1.423| 2.373| 3.796| -0.275|    0.267|
## |Crosswalk Difference | 269| -0.152| 0.198| -0.139| -0.750| 0.421| 1.171| -0.255|    0.118|
## [1] "Values in heatmap are: Correlation (no covariates). Spearman used for spearman.list, pearson for other variables"