Note: The manuscript is currently under preparation. The copyright belongs to Xie et al.

Method

Participants and design

To detect a medium-to-large effect size (average Cohen’s d = 0.65; Cohen’s f = 0.325) as reported by Xie et al. (2024), a minimum of 22 participants was required to achieve high statistical power (1-β = .90) at a significance level of α = .05 (Faul et al., 2009). A total of 48 undergraduate students were initially recruited. One participant was excluded because of a disruption in the E-Prime software, resulting in a final sample of 47 participants (40 women; Mage = 20.36 years, SDage = 1.95) for behavioral analyses. For fNIRS analyses, data from five participants were excluded because of an excessive number of bad channels (in this study, more than 20%). For log data analyses, six participants were excluded because of interruptions in the logging software. All participants were right-handed, had normal or corrected-to-normal vision and hearing, and had no history of neurological or psychiatric disorders. Written informed consent was obtained prior to participation, and all participants received monetary compensation. The study was approved by the institutional ethics review committee.

A within-subjects design was employed, with strategy type as the independent variable, comprising three conditions: silent reading, stylus drawing, and finger drawing. Dependent variables mainly included accuracy on item memory and source memory tasks, self-reported motivation levels, and β values of relative changes in oxygenated hemoglobin concentration (Δ[HbO]).

Materials and measures

……

Apparatus and optode layout

All word and test stimuli were presented using E-Prime v2.0 software (Psychology Software Tools, Pittsburgh, PA) on a non-touch desktop computer with a 24-inch Dell monitor (1,920 × 1,080 resolution). For the drawing tasks, participants were provided with a touchscreen tablet (12.3-inch Surface Pro) running a custom drawing and logging application called “touchtimer” (see Figure A1). The interface consisted of two main sections: a top toolbar and a large drawing area below it. The toolbar included functions such as “Next page” (proceed to the next screen). Participants completed the stylus and finger drawing tasks in the designated drawing area. The background of the interface was white, and drawing was rendered in medium-width black lines. In addition, the software automatically recorded the duration of each event in which the participant touched the screen with either the stylus or a finger, enabling subsequent comparisons of log data (touchscreen duration and count) across drawing conditions.

Cortical hemodynamic activity was recorded using the NIRScout near-infrared functional brain imaging system (NIRx Medical Technologies, LLC, USA), which employs dual-wavelength (785 nm and 830 nm) continuous-wave semiconductor lasers. Near-infrared light emitted by multiple sources was scattered through cortical and other biological tissues and detected by corresponding detectors. The sampling rate of the imaging system was 6.25 Hz.

A total of 24 optodes (see Figure 1) were arranged into two symmetrical multichannel probe arrays, each positioned over one cerebral hemisphere. Each array included 5 emitters and 7 detectors, forming 16 channels per hemisphere, for a total of 32 channels. Optodes were inserted into designated slots on a specialized probe cap (NIRScap, NIRx Medical Technologies), with an inter-optode distance of approximately 3 cm. Optode placement followed the international 10–10 EEG system. For example, detector D2, emitter S4, and detector D7 were placed at scalp locations C1, C3, and C5 (left of Cz), respectively; their right-hemisphere counterparts (detector D9, emitter S9, and detector D14) were placed at C2, C4, and C6 (right of Cz). Following data collection, the spatial coordinates of five reference landmarks (Nz, Iz, AL, AR, Cz) and all optode positions were recorded using a 3D digitizer (PATRIOT, Polhemus, Colchester, VT). Channel locations were registered to MNI space using a probabilistic registration method, yielding anatomical localization and cortical coverage probabilities for each channel (see Table A3).



Figure 1 The optode configuration and channels on a 3D brain.

Procedure

……



Figure 2 Trial structure during the encoding phase.

Data analyses

Data preprocessing

For the behavioral data, a small proportion of trials (1%) was excluded after it was confirmed that participants had failed to follow the designated task instructions (e.g., performing stylus drawing on trials intended for silent reading).

fNIRS data were preprocessed using nirsLAB software (v2017.6, NIRx Medical Technologies). First, data segments unrelated to the encoding phase were truncated to improve computational efficiency. Second, channel-level data quality was assessed by calculating the relative coefficient of variation (CV, in %), which served as an estimate of the signal-to-noise ratio (Schmitz et al., 2005). Channels with CVs exceeding 15% were classified as bad channels and excluded from analyses (Piper et al., 2014). Participants with more than six bad channels (i.e., >20% of the 32 channels) were excluded from further analyses. Third, motion artifacts, including discontinuities and spike artifacts, were removed. Discontinuities were automatically detected and corrected by subtracting a constant from the signal (STD threshold = 5), and spike artifacts were replaced using linear interpolation. Fourth, a band-pass filter (0.01–0.2 Hz) was applied to reduce physiological noise (e.g., respiration, heartbeat) and signal drifts caused by head movement, vasomotion, or instrumental instability (Huppert et al., 2009; Zhang et al., 2005). Fifth, hemodynamic responses were estimated from the filtered data using the modified Beer–Lambert law: light intensity was converted into optical density and its change, from which changes in the concentrations of oxygenated and deoxygenated hemoglobin (Δ[HbO] and Δ[HbR]) were computed. Baseline correction was applied using the 30-second rest period prior to the first trial.
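The channel and participant exclusion rules described above can be illustrated with a minimal sketch. It assumes that the CV is computed as 100 × SD / mean of the raw intensity signal and that the population standard deviation is used; the exact formula implemented in nirsLAB may differ. The function names and the example signals are hypothetical and serve only as illustration.

```python
from statistics import mean, pstdev

CV_THRESHOLD = 15.0      # percent; channels above this are "bad" (from the text)
MAX_BAD_FRACTION = 0.20  # participants above this share of bad channels are excluded
N_CHANNELS = 32          # total number of channels (from the text)


def coefficient_of_variation(signal):
    """Return the CV in percent, assumed here to be 100 * SD / mean."""
    m = mean(signal)
    if m == 0:
        raise ValueError("CV is undefined for a zero-mean signal")
    return 100.0 * pstdev(signal) / m


def bad_channels(raw_by_channel, threshold=CV_THRESHOLD):
    """List the channel labels whose CV exceeds the threshold."""
    return [ch for ch, sig in raw_by_channel.items()
            if coefficient_of_variation(sig) > threshold]


def exclude_participant(n_bad, n_channels=N_CHANNELS,
                        max_fraction=MAX_BAD_FRACTION):
    """True if the share of bad channels exceeds the allowed fraction."""
    return n_bad / n_channels > max_fraction


# Hypothetical raw intensities for two channels of one participant.
example = {1: [1.00, 1.02, 0.98, 1.01], 2: [1.0, 2.0, 3.0, 2.0]}
flagged = bad_channels(example)
```

With 32 channels, six bad channels correspond to 18.75% and seven to about 21.9%, so the 20% criterion and the "more than six" criterion select the same participants.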

The hemodynamic response function (HRF) was selected as the basis function for convolution (parameters: [6 16 1 1 6 0 32]) to construct the design matrix. Temporal high-pass filtering was applied using a discrete cosine transform with a cutoff of 128 s. For each strategy type and channel, regression coefficients (β values) of Δ[HbO] and Δ[HbR] were then estimated. Given that HbO is more sensitive to local cerebral blood flow and shows stronger correlations with fMRI BOLD signals (Hoshi et al., 2001; Huppert et al., 2006), only the β values for Δ[HbO] were used as the measure of brain activation in subsequent analyses.
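To make the role of the HRF parameters concrete, the following sketch builds a double-gamma HRF and convolves it with a task boxcar. It assumes that the parameter vector follows the convention of SPM's spm_hrf (delay of response, delay of undershoot, dispersion of response, dispersion of undershoot, response-to-undershoot ratio, onset, and kernel length in seconds); the text does not state this explicitly. The function names and the 10-second block are hypothetical, and the sketch is not the nirsLAB implementation.

```python
import math

FS_HZ = 6.25      # sampling rate reported in the text
DT = 1.0 / FS_HZ  # seconds per sample (0.16 s)
HRF_PARAMS = (6, 16, 1, 1, 6, 0, 32)  # parameter vector from the text


def gamma_pdf(t, shape, rate):
    """Gamma probability density, defined as zero for t <= 0."""
    if t <= 0:
        return 0.0
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(t)
               - rate * t - math.lgamma(shape))
    return math.exp(log_pdf)


def double_gamma_hrf(t, p=HRF_PARAMS):
    """Response gamma minus a scaled undershoot gamma (assumed SPM convention)."""
    peak_delay, under_delay, peak_disp, under_disp, ratio, onset, _length = p
    u = t - onset
    response = gamma_pdf(u, peak_delay / peak_disp, 1.0 / peak_disp)
    undershoot = gamma_pdf(u, under_delay / under_disp, 1.0 / under_disp)
    return response - undershoot / ratio


def hrf_kernel(p=HRF_PARAMS, dt=DT):
    """Sample the HRF over the kernel length and scale it to unit sum."""
    n = int(round(p[6] / dt)) + 1
    kernel = [double_gamma_hrf(i * dt, p) for i in range(n)]
    total = sum(kernel)
    return [k / total for k in kernel]


def convolve_boxcar(boxcar, kernel):
    """Causal convolution, truncated to the length of the boxcar."""
    out = [0.0] * len(boxcar)
    for i, b in enumerate(boxcar):
        if b == 0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += b * k
    return out


# Hypothetical 10-s task block followed by 30 s of rest, sampled at 6.25 Hz.
block = [1] * int(10 * FS_HZ) + [0] * int(30 * FS_HZ)
regressor = convolve_boxcar(block, hrf_kernel())
```

Each column of the design matrix would be one such regressor per strategy type; the β values are the weights obtained when the measured Δ[HbO] time series is regressed on these columns.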

Mixed-effects model analyses

Given the nonindependence of observations in our dataset, we employed mixed-effects models for inferential analyses to examine differences across conditions. These models incorporate both fixed and random effects, allowing us to account for variability at the participant and/or item level (Brown, 2021). Continuous outcomes, including fNIRS and log data, were analyzed using linear mixed-effects models (LMMs). Binary outcomes related to memory performance (item memory and source memory accuracy) were analyzed using generalized linear mixed-effects models (GLMMs). Ordinal motivation data were analyzed using cumulative link mixed models (CLMMs). All analyses were conducted in R 4.3.1 (R Core Team, 2023), using the lme4 package (Bates et al., 2015) for GLMMs and LMMs, the lmerTest package (Kuznetsova et al., 2017) for significance testing of fixed effects, and the ordinal package for CLMMs (Taylor et al., 2023). To control for inflated false positives due to multiple comparisons in the fNIRS analyses, p values were adjusted using the false discovery rate (FDR) procedure with a threshold of q < .05 (Benjamini & Hochberg, 1995).
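The FDR adjustment can be illustrated with a short sketch of the Benjamini–Hochberg step-up procedure. The analyses themselves were run in R; this standalone Python version is for illustration only, and the function name and the four p values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p values (q values) in input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values from four channel-wise tests.
p_raw = [0.01, 0.04, 0.03, 0.005]
q_values = bh_adjust(p_raw)
significant = [q < 0.05 for q in q_values]
```

Each p value is multiplied by the number of tests divided by its rank, and the running minimum from the largest rank downward keeps the adjusted values monotone. A channel is then reported as significant when its q value falls below .05.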

Results

Means and standard deviations of the dependent variables by strategy type are presented in Tables 1 and A4. Correlation analyses indicated that memory accuracy was not significantly associated with age, visual imagery ability, touchscreen usage intensity, or drawing experience (see Table A5; |r|s ≤ .17, ps > .05). Accordingly, these variables were not included in subsequent analyses.



Memory data

……

Motivation data

……



Figure 3 Memory and motivation performance as a function of strategy type. The black dots represent the means. The bars represent the standard deviations. * p < .05. ** p < .01. *** p < .001.

fNIRS data

A separate LMM was fitted for each channel to examine the effect of strategy type on β values of Δ[HbO]. Likelihood ratio tests revealed significant main effects of strategy type in several channels, including Channel 4 (primary motor cortex, left BA4), Channel 10 (primary motor cortex, left BA4), Channel 2 (pre-motor and supplementary motor cortex, left BA6), Channel 11 (pre-motor and supplementary motor cortex, left BA6), Channel 5 (primary somatosensory and somatosensory association cortex, left BA1/2/3/7), Channel 13 (primary somatosensory cortex, left BA1/2/3), and Channel 16 (subcentral area, left BA43) (qs < .05). Specifically, in these channels, β values were significantly higher in both the stylus drawing (qs < .05; see Figure 4a) and finger drawing (qs < .05; see Figure 4b) conditions than in the silent reading condition. No significant differences emerged between the two drawing conditions (qs > .05; see Figure 4c). No significant main effects of strategy type were observed in the remaining channels (qs > .05).

Correlation analyses (see Table A7) between brain activation and memory accuracy revealed that activation in Channels 4, 10, 2, 11, 5, 13, and 16 was significantly positively associated with item memory accuracy (.19 ≤ r ≤ .25, ps < .05). No significant correlations were found between activation in the remaining channels and item memory accuracy. Additionally, activation in all channels was unrelated to source memory accuracy.



Figure 4 Heatmaps of brain activation comparisons across different strategy types.

Log data

LMMs were used to examine the effect of drawing type on touchscreen duration and touchscreen count (see Table 1). Touchscreen duration was significantly longer in the finger drawing condition than in the stylus drawing condition (estimate = 0.93, SE = 0.07, t = 13.90, p < .001). Touchscreen count was also significantly higher in the finger drawing condition than in the stylus drawing condition (estimate = 0.97, SE = 0.22, t = 4.39, p < .001).