code written: 2020-01-14
last ran: 2020-01-18
website: http://rpubs.com/navona/SPINS_behaviouralOutliers


Description. This script visualizes and summarizes outliers in the neurocognition and social cognition variables that comprise the CCA \(Y\) set. Here, an outlier is defined as an observation that falls more than 1.5 * IQR (interquartile range) below the 25th percentile or above the 75th percentile, where IQR is the difference between the 75th and 25th percentiles. Such observations, especially in regression models, can distort predictions and reduce accuracy if left unadjusted. Here, however, our review suggests that all outliers are “real” values (i.e., not data entry errors). Thus, the identified outliers are left in the data. We may decide to replace the values after discussion with a statistician.
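The 1.5 * IQR rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the code behind this report: the function name `iqr_outliers`, the example scores, and the quartile convention (linear interpolation, which matches R's default `quantile()` behaviour) are all assumptions. Other quartile conventions can shift the fences slightly and so change which borderline values are flagged.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # method="inclusive" interpolates linearly between order statistics;
    # this is an assumed convention, not necessarily the one used in the report.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up scores for illustration only (not SPINS data).
scores = [38, 41, 44, 45, 46, 47, 49, 52, 90]
print(iqr_outliers(scores))  # -> [90]
```

For these made-up scores, Q1 = 44 and Q3 = 49, so the fences are 36.5 and 56.5 and only the value 90 is flagged.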

Note. This analysis includes data from the n=412 participants who were eligible and passed DWI quality control.


Visualization. Statistical outliers are labelled with record_id, and shown with larger points and a solid colour.


Table. The following table summarizes the data visualized above and gives the outlier count for each variable. Reported means and standard deviations are computed with outliers included.

Variable                  Mean           Standard deviation   Outlier count
neurocognition
  Processing speed        45.3349515     13.4388565           4
  Attention & vigilance   43.2296296     12.4959047           6
  Working memory          44.5995146     11.7368677           0
  Verbal learning         44.6432039     10.3183571           2
  Visual learning         43.1601942     12.2489810           1
  Problem solving         45.3300971     10.6929163           0
social cognition
  RMET                    25.8655257     4.9193505            8
  RAD                     55.6381418     8.9771558            4
  ER_40                   2151.1867322   705.4864946          19
  TASIT_1                 23.3853659     3.3004083            16
  TASIT_2                 50.4463415     7.8281230            20
  TASIT_3                 50.9266504     7.6606647            8
  IRI                     67.8446602     12.8686097           5
  EA                      0.4776058      0.1779057            12
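
A per-variable summary like the one tabulated above (mean, standard deviation, and outlier count, with outliers kept in the mean and standard deviation) could be produced along the following lines. This is a hedged Python sketch rather than the script that generated the table: the function name `summarize`, the variable names `var_a` and `var_b`, and the example values are hypothetical, and it assumes that missing values are dropped before summarizing and that the standard deviation is the sample (n - 1) version.

```python
import statistics


def summarize(values, k=1.5):
    """Mean, sample SD, and count of values outside the k * IQR fences."""
    # Assumption: missing entries (None) are dropped before summarizing.
    observed = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    n_out = sum(v < q1 - k * iqr or v > q3 + k * iqr for v in observed)
    return {
        # Outliers are counted but not removed, so they contribute to both statistics.
        "mean": statistics.mean(observed),
        "sd": statistics.stdev(observed),
        "outlier_count": n_out,
    }


# Hypothetical variables with made-up values (not SPINS data).
data = {
    "var_a": [38, 41, 44, 45, 46, 47, 49, 52, 90],
    "var_b": [20, 22, None, 23, 25, 26, 27],
}

for name, values in data.items():
    row = summarize(values)
    print(f"{name:<8}{row['mean']:>8.2f}{row['sd']:>8.2f}{row['outlier_count']:>4d}")
```

Counting outliers without removing them mirrors the decision stated in the description to leave the identified outliers in the data.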