Model description

This document describes a real‑time activation model used to estimate the availability of propositional content in memory during writing-from-source tasks. Propositional content is represented as elementary discourse units (EDUs), e.g. sentences. The model treats reading as a stream of memory‑updating events that dynamically shape which ideas are available for writing under time pressure and interference. It is driven by eye‑movement–derived events but abstracts away from word‑level processing: it is intended to capture proposition‑level availability, not surface recall.

The model builds on activation‑based and interference‑based accounts of memory in psycholinguistics (ACT‑R, cue‑based retrieval, discourse models), according to which linguistic representations are continuously updated, decay over time, and compete for retrieval (Anderson 1983; Anderson and Bower 1973; Lewis and Vasishth 2005). Like cue‑based retrieval and discourse comprehension models, it assumes that forgetting reflects reduced accessibility rather than loss of representations. EDUs are treated as propositional units whose activation is strengthened by reading, weakened by interference from new content, and eroded by non‑source-reading activities such as writing. Availability is defined in relative terms, capturing the selective accessibility of propositions under competition. The model integrates insights from activation‑based memory theory, discourse processing, and eye‑movement research into a real‑time account of source memory during writing.

The present model extends a time‑ and interference‑based activation framework by embedding propositions in structured networks over which activation can spread. This implements a hybrid architecture that combines principles from activation‑based memory theory, discourse‑level structure models, and knowledge‑based spreading activation accounts of comprehension. At any point in time, each EDU has a continuous activation value that reflects its current accessibility in memory. Activations change dynamically as a function of reading, writing, decay, interference, and spreading activation.

Overall the model implements a hybrid activation‑based memory system with the following properties:

This architecture aligns with a large body of psycholinguistic work in which forgetting during reading is understood as a consequence of decay, interference, and competitive retrieval in a structured memory space rather than loss of representations (e.g. Lewis, Vasishth, and Van Dyke 2006; Myers and O’Brien 1998).

Full functions can be found at the end of this document and will be loaded here.

# Load functions for reading history model with activation spreading
source("../r/rh-sa.R")

Model inputs

Example data

Below is an example of the input format that the model expects. All variables are required except token. Each row is an event with a time stamp (t_start) and a duration. event_type has the value "read" whenever edu is not NA and "other" whenever edu is NA. Thus, every "read" event is associated with an edu, i.e. a text unit within the source texts, on which the fixation duration was spent. The code assumes that edu values are unambiguous identifiers of text regions across source texts.

Rows: 8,354
Columns: 7
$ token        <chr> "test", "test", "test", "test", "test", "test", "test", "test", "test", "test", "test",…
$ source       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "RW1_short_ascii", NA, NA, "RW1_short_ascii", "…
$ edu          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "001kgfc", NA, NA, "002smtf", "003twnf", "002sm…
$ t_start      <dbl> 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.7…
$ event_type   <chr> "other", "other", "other", "other", "other", "other", "other", "other", "other", "other…
$ duration     <int> 797, 355, 1788, 4286, 1761, 2183, 2565, 2076, 120, 93, 160, 140, 107, 602, 307, 221, 19…
$ words_in_edu <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 16, NA, NA, 22, 24, 22, 16, 16, 16, NA, 18, NA,…

words_in_edu comes from the rst_tsv file:

Rows: 246
Columns: 4
$ edu          <chr> "001wfwa", "002tiwf", "003atti", "004saec", "005osit", "006netd", "007tpid", "008bats",…
$ text         <chr> "when faced with a particularly tough question on rounds during my intern year i would …
$ source       <chr> "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_…
$ words_in_edu <int> 20, 25, 21, 18, 5, 39, 4, 25, 29, 19, 11, 36, 14, 32, 29, 16, 8, 23, 20, 31, 18, 35, 38…

Span information from the rst_tsv file is required to calculate the weights between EDUs.

Rows: 183
Columns: 3
$ source <chr> "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii"…
$ edu    <chr> "001wfwa,002tiwf", "001wfwa,002tiwf,003atti,004saec,005osit", "001wfwa,002tiwf,003atti,004sae…
$ level  <dbl> 9, 7, 5, 3, 1, 8, 10, 8, 6, 8, 6, 4, 4, 2, 4, 3, 0, 2, 2, 4, 5, 3, 1, 3, 1, 2, 1, 3, 5, 3, 13…

The input data format requires minimal data transformation from the incremental report returned by CyWrite (Chukharev-Hudilainen 2019) and no data cleaning. The data wrangling that would be minimally required (in R) is shown here:

library(dplyr)  # select(), mutate(), case_when(), left_join()
library(tidyr)  # unnest()

jsonlite::fromJSON(txt = token, flatten = TRUE) %>%
  select(eye) %>%
  unnest(eye) %>%
  # label events as "read" or "other"
  mutate(event_type = case_when(!is.na(edu) ~ "read", TRUE ~ "other")) %>%
  # add number of words in edu
  left_join(data_pe, by = join_by(edu))

Finally we need a named vector with source as values and EDUs as names.

# EDUs by source 
edu_source <- setNames(data_pe$source, data_pe$edu)

# Preview
edu_source[1:2]
          001wfwa           002tiwf 
"AM1_short_ascii" "AM1_short_ascii" 

Weights for activation spreading

IGNORE: Semantic similarity (TF‑IDF based option)

This function is currently not used by the model.

For semantic and discourse-related activation spreading we need to prepare two weight matrices. The implementation of the functions that generate these matrices can be found at the end of this document.

First, the function build_semantic_W() builds a semantic similarity matrix for knowledge‑based spreading activation for each source text on the basis of the data stored in data_pe. This matrix captures how closely related propositions are in terms of their lexical‑semantic content. This matrix provides the structural basis for landscape‑style spreading activation, in which activation can flow between propositions that are semantically related even in the absence of explicit discourse‑structural links.

Semantic representations are constructed using a standard distributional semantic approach, operationalised as follows:

  1. Tokenisation: Each EDU text is tokenised into word tokens using a standard word tokenizer.
  2. Vocabulary Construction: A corpus vocabulary is constructed from the set of EDUs belonging to a given source text. Each unique word type defines a dimension in the semantic space.
  3. Document–Term Matrix (DTM): EDUs are mapped onto a document–term matrix in which each row corresponds to an EDU and each column corresponds to a vocabulary term. Cell values reflect term frequency within the EDU.
  4. TF‑IDF Transformation: To down‑weight high‑frequency, low‑informativeness words and emphasise discriminative lexical content, the DTM is transformed using term‑frequency–inverse‑document‑frequency (TF‑IDF) weighting. This yields a vector representation of each EDU in a weighted semantic space.

This code creates a semantic similarity matrix for each source text separately.

semantic_W_by_source <- data_pe %>%
  split(.$source) %>%
  lapply(build_semantic_W)

Semantic similarity between EDUs is computed using cosine similarity between their TF‑IDF vectors. For each pair of EDUs \(i\) and \(j\):

\[ S_{ij} = \frac{\vec{v}_i \cdot \vec{v}_j}{\|\vec{v}_i\| \|\vec{v}_j\|} \]

where \(\vec{v}_i\) and \(\vec{v}_j\) are the TF‑IDF vectors of EDUs \(i\) and \(j\), respectively.

Because TF‑IDF weights are non‑negative, cosine similarity here yields values in the range \([0, 1]\), where higher values indicate greater semantic overlap.

The resulting semantic similarity matrix \(S\) has the following properties:

  • Rows and columns correspond to EDUs from the same source text
  • \(S_{ij} = S_{ji}\) (the matrix is symmetric)
  • \(S_{ii} = 1\) (self‑similarity)
  • Off‑diagonal values encode graded semantic relatedness
  • The matrix is dense and continuous, reflecting probabilistic semantic associations rather than categorical links

Row and column names are set to EDU identifiers, allowing direct alignment with the activation state vector. A preview is shown here:

           001kgfc    002smtf    003twnf    004eahf    005wtsg    006bssf
001kgfc 1.00000000 0.10840610 0.01538937 0.08444951 0.07463417 0.02065252
002smtf 0.10840610 1.00000000 0.13354690 0.06980781 0.22185241 0.07885955
003twnf 0.01538937 0.13354690 1.00000000 0.04804877 0.08606724 0.05782249
004eahf 0.08444951 0.06980781 0.04804877 1.00000000 0.03901257 0.17790446
005wtsg 0.07463417 0.22185241 0.08606724 0.03901257 1.00000000 0.06385578
006bssf 0.02065252 0.07885955 0.05782249 0.17790446 0.06385578 1.00000000

This semantic similarity matrix approximates semantic memory structure in a distributional sense. It captures world‑knowledge‑based associations among propositions that are not necessarily adjacent in the text or directly linked by discourse relations.

In the model, this matrix supports landscape‑style spreading activation, whereby activation of one proposition partially activates other propositions that are semantically related. This mechanism accounts for phenomena such as:

  • activation of bridging and predictive inferences
  • background conceptual resonance during reading
  • reactivation of earlier propositions based on semantic overlap
  • maintenance of topic‑level coherence across time

Semantic spreading is implemented with a small weighting parameter, ensuring that it provides weak but persistent support rather than dominating direct encoding or discourse‑structural effects.

Semantic similarity (SBERT‑based option)

The function build_semantic_W_sbert() constructs a semantic similarity matrix for knowledge‑based spreading activation for each source text on the basis of the data stored in data_pe. This matrix is derived from sentence‑level semantic embeddings, allowing semantic relations to be captured independently of surface lexical overlap. This matrix provides the structural basis for landscape‑style semantic spreading activation, in which activation can flow between propositions that are semantically related in a conceptual sense, even if they do not share words or are not directly connected by discourse structure.

Each EDU is mapped onto a dense semantic vector using a Sentence‑BERT (SBERT) model (Reimers and Gurevych 2019). SBERT models are trained to represent the meaning of sentences such that semantically equivalent or closely related sentences have similar vector representations, even when they differ substantially in wording. This makes them particularly well suited for short, propositional EDUs, where lexical overlap may be minimal.

SBERT is accessed via Python.

library(reticulate)
use_python("/usr/bin/python3", required = TRUE)
transformers <- import("sentence_transformers")
np <- import("numpy")

# initialise model (do once per session)
sbert_model <- transformers$SentenceTransformer("all-MiniLM-L6-v2")

# Generate SBERT embeddings for EDUs
emb_sbert_by_source <- data_pe %>%
  split(.$source) %>%
  lapply(compute_sbert_embeddings)

Semantic representations are constructed as follows:

  1. Sentence encoding: Each EDU text is passed to a pretrained SBERT model, which produces a fixed‑length, dense vector representation intended to capture the EDU’s propositional meaning.
  2. Embedding normalisation: EDU embeddings are L2‑normalised to ensure that cosine similarity reflects angular distance in semantic space rather than vector magnitude.
  3. Per‑source processing: Embeddings are constructed separately for each source text, and semantic similarity matrices are computed independently for each source. This prevents semantic spreading across sources unless explicitly modeled elsewhere.

# Build semantic similarity matrix for each source and its embeddings
edu_list <- data_pe %>% split(.$source)

common_names <- intersect(names(edu_list), names(emb_sbert_by_source))

semantic_W_by_source <- mapply(
  build_semantic_W_sbert,
  edu_df = edu_list[common_names],
  embeddings = emb_sbert_by_source[common_names],
  SIMPLIFY = FALSE)

SBERT embeddings approximate long‑term semantic knowledge learned from large corpora, including information about synonymy, paraphrase, and typical conceptual associations.

For each source text, a semantic similarity matrix \(S\) is computed by taking the cosine similarity between all pairs of EDU embeddings. For EDUs \(i\) and \(j\):

\[ S_{ij} = \frac{\vec{v}_i \cdot \vec{v}_j}{\|\vec{v}_i\| \|\vec{v}_j\|} \]

where \(\vec{v}_i\) and \(\vec{v}_j\) are the sentence‑level embedding vectors of EDUs \(i\) and \(j\), respectively. A preview is shown here:

          001kgfc   002smtf   003twnf   004eahf   005wtsg   006bssf
001kgfc 1.0000000 0.2677321 0.1633561 0.4572471 0.1959719 0.1470794
002smtf 0.2677321 1.0000000 0.3595323 0.3795831 0.4427657 0.5298123
003twnf 0.1633561 0.3595323 1.0000000 0.2868217 0.4470820 0.3980450
004eahf 0.4572471 0.3795831 0.2868217 1.0000000 0.3417116 0.4565610
005wtsg 0.1959719 0.4427657 0.4470820 0.3417116 1.0000000 0.4074331
006bssf 0.1470794 0.5298123 0.3980450 0.4565610 0.4074331 1.0000000

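In general, cosine similarity lies in the range \([-1, 1]\). For sentence embeddings of EDUs from the same text, values are typically non‑negative, as in the preview above, and the model treats \(S_{ij}\) as lying in \([0, 1]\); higher values indicate greater semantic relatedness in embedding space. Because embeddings encode meaning beyond surface form, high similarity values can arise even when EDUs share few or no lexical items.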
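As a rough illustration of the cosine‑similarity step, the sketch below computes a similarity matrix from a toy embedding matrix in base R. The helper name cosine_sim_matrix and the toy vectors are assumptions made for this example; this is not the source of build_semantic_W_sbert().

```r
# Hypothetical helper (not part of rh-sa.R): cosine similarity between rows
cosine_sim_matrix <- function(emb) {
  # L2-normalise each row, guarding against zero-length vectors
  norms <- sqrt(rowSums(emb^2))
  norms[norms == 0] <- 1
  unit <- emb / norms
  # Dot products of unit vectors are cosine similarities
  sim <- unit %*% t(unit)
  dimnames(sim) <- list(rownames(emb), rownames(emb))
  sim
}

# Toy 3-dimensional "embeddings" for three made-up EDU ids
emb <- rbind(e1 = c(1, 0, 0),
             e2 = c(1, 1, 0),
             e3 = c(0, 0, 2))
S_toy <- cosine_sim_matrix(emb)
round(S_toy, 3)
```

Row and column names are carried over from the embedding matrix, so the result can be indexed by EDU identifier in the same way as the preview matrix.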

As in the previous implementation, a separate semantic similarity matrix is created for each source text, ensuring that semantic spreading remains source‑internal by default.

The resulting semantic similarity matrix \(S\) has the following properties:

  • Rows and columns correspond to EDUs from the same source text
  • \(S_{ij} = S_{ji}\) (the matrix is symmetric)
  • \(S_{ii} = 1\) (self‑similarity)
  • Off‑diagonal values encode graded semantic relatedness
  • The matrix is dense and continuous, reflecting probabilistic semantic associations rather than categorical links

Row and column names are set to EDU identifiers, allowing direct alignment with the activation state vector.

The SBERT‑based semantic similarity matrix is intended to approximate the reader’s semantic memory structure in a distributional sense. Embedding‑based similarity captures conceptual relatedness that may support:

  • synonymy and paraphrase relations
  • bridging and predictive inferences
  • resonance between propositions with similar meanings
  • maintenance of topic‑level coherence despite variation in wording

In the model, this matrix supports landscape‑style semantic spreading activation, whereby activation of one proposition partially reactivates other propositions that are semantically related at the level of meaning rather than surface form. Semantic spreading is implemented with a small weighting parameter, ensuring that knowledge‑based resonance provides weak but persistent background support rather than dominating direct encoding or discourse‑structural integration mechanisms.

RST discourse-structural relations

The function build_rst_W() constructs a discourse‑structural weight matrix that encodes the strength of structural relations among propositions as defined by Rhetorical Structure Theory (RST) span annotations. This matrix provides the basis for discourse‑structural spreading activation in the model, corresponding to construction–integration processes. The implementation can be found at the end of this document.

W_by_source <- data_spans %>%
  split(.$source) %>%
  lapply(build_rst_W)

The input data_spans data frame contains one row per RST span and includes:

  • a list of EDUs participating in the span
  • a hierarchical level indicating the depth of the span within the RST tree

Lower levels correspond to more global discourse groupings, whereas higher levels correspond to more local, tightly integrated spans.

Each RST span contributes pairwise connections among all EDUs contained within that span. The contribution of a span is weighted by its hierarchical level according to the following mapping:

\[ w = \frac{1}{\text{level} + 1} \]

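This transformation has three important properties:

  • it ensures finite, well‑defined weights even for root‑level spans (level = 0)
  • the contribution of a single span shrinks with its depth, so each individual global span (low level) contributes more than each individual local span (high level)
  • because spans are nested, a pair of EDUs that shares a local span also shares every enclosing span and therefore accumulates the largest total weight, whereas a pair linked only by global spans receives few contributions, reflecting their background coherence role.

Thus, after accumulation, EDUs that co‑occur in small, local spans receive stronger mutual support than EDUs that are related only through large, global spans.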

The resulting matrix \(W\) is constructed as follows:

  1. Initialisation: A square matrix is initialised with rows and columns corresponding to all EDUs appearing in the span annotations.

  2. Accumulation of Span Contributions: For each RST span:

    • all unordered pairs of EDUs within that span are identified
    • the span’s weight \(w\) is added symmetrically to the corresponding matrix cells.

Connections accumulate across spans, allowing EDUs that repeatedly co‑occur across different discourse structures to develop stronger association weights.

  3. Normalisation: After all spans have been processed:
    • rows with zero or non‑finite sums are stabilised
    • rows are normalised so that each row sums to one.

This ensures that the matrix can be interpreted as distributing a bounded amount of activation across structurally related EDUs.
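The construction steps above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the source of build_rst_W(): the function name rst_weights_sketch and the two toy spans are hypothetical, and the input is assumed to have a comma‑separated edu column and a numeric level column, as in the span data shown earlier.

```r
# Hypothetical sketch of the span-accumulation scheme (not build_rst_W() itself)
rst_weights_sketch <- function(spans) {
  edu_sets <- strsplit(spans$edu, ",", fixed = TRUE)
  edus <- sort(unique(unlist(edu_sets)))
  W <- matrix(0, length(edus), length(edus), dimnames = list(edus, edus))
  for (k in seq_along(edu_sets)) {
    members <- unique(edu_sets[[k]])
    if (length(members) < 2) next
    w <- 1 / (spans$level[k] + 1)      # span weight from hierarchical level
    # Add w to every cell linking two members of the span
    W[members, members] <- W[members, members] + w
  }
  diag(W) <- 0                         # exclude self-connections
  rs <- rowSums(W)
  rs[!is.finite(rs) | rs == 0] <- 1    # isolated EDUs keep an all-zero row
  W / rs                               # divide each row by its sum
}

# Toy input: a local span (a, b) nested inside a root span (a, b, c)
spans_toy <- data.frame(
  edu   = c("a,b", "a,b,c"),
  level = c(1, 0),
  stringsAsFactors = FALSE
)
W_toy <- rst_weights_sketch(spans_toy)
round(W_toy, 3)
```

In this toy case the pair (a, b) shares both spans and ends up with a larger weight than the pair (a, c). Row normalisation also makes W_toy["a", "c"] differ from W_toy["c", "a"], because rows a and c have different sums.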

A preview of the RST weight matrix is shown here:

           001kgfc    002smtf    003twnf    004eahf    005wtsg    006bssf
001kgfc 0.00000000 0.02932211 0.02816846 0.02680505 0.02513866 0.02513866
002smtf 0.02932211 0.00000000 0.02816846 0.02680505 0.02513866 0.02513866
003twnf 0.02820099 0.02820099 0.00000000 0.02683601 0.02516770 0.02516770
004eahf 0.02690947 0.02690947 0.02690947 0.00000000 0.02523659 0.02523659
005wtsg 0.02523659 0.02523659 0.02523659 0.02523659 0.00000000 0.02690947
006bssf 0.02523659 0.02523659 0.02523659 0.02523659 0.02690947 0.00000000

The final matrix has the following properties:

  • \(W_{ij} \ge 0\) for all \(i, j\);
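  • \(W_{ij} = W_{ji}\) before normalisation; row normalisation can break exact symmetry, so \(W_{ij} \approx W_{ji}\) in the final matrix (compare rows 001kgfc and 003twnf in the preview);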
  • self‑connections are excluded (\(W_{ii} = 0\));
  • each row sums to one (or zero if an EDU is structurally isolated).

The RST‑based weight matrix represents discourse‑structural support relations among propositions. It operationalises the idea that propositions are embedded in a hierarchical discourse structure and that integration processes distribute activation preferentially along structurally local connections.

Application of reading history model

The following code illustrates how the reading history function update_reading_history can be applied to slices of data. These data slices can have any size from 1 to the overall length of the writing session. Details of the implementation of update_reading_history and a conceptual description can be found below.

# Model parameters
lambda  <- .08       # slow passive time decay (/sec)
alpha   <- .9        # moderate encoding strength
beta    <- .1        # strong propositional level interference from reading
gamma   <- .18       # strong decay during writing / "other"
refix_window <- 2.0  # seconds
k_maint      <- 0.3  # refixation boost
theta        <- 0.15 # retrieval threshold (normalised)

# Activation spreading
discourse_spread_rate <- 0.01 # discourse structure (very low)
semantic_spread_rate <- 0.02  # semantic resonance

# Vector for initial states
state <- init_state()

# Data frame for results
rh <- tibble()

# Loop over consecutive slices of 20 events (to simulate real time)
for(i in seq(1, nrow(data), 20)){
  tmp <- slice(data, i:min(i + 19, nrow(data)))
  res <- update_reading_history(tmp, state)
  state <- res$state
  rh <- bind_rows(rh, res$output)
}

# Add source information for plotting
rh <- left_join(rh, data_pe, by = "edu")

The model returns the following information:

Rows: 22,492
Columns: 8
$ token        <chr> "test", "test", "test", "test", "test", "test", "test", "test", "test", "test", "test",…
$ t_start      <dbl> 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.7…
$ activation   <dbl> 1.00000000, 0.29664513, 1.00000000, 0.29841672, 1.00000000, 0.30018625, 1.00000000, 0.1…
$ available    <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, F…
$ edu          <chr> "006bssf", "006bssf", "010isat", "006bssf", "010isat", "006bssf", "010isat", "006bssf",…
$ text         <chr> "but she still felt like a stranger at her own company whose remote policies were hapha…
$ source       <chr> "RW1_short_ascii", "RW1_short_ascii", "RW1_short_ascii", "RW1_short_ascii", "RW1_short_…
$ words_in_edu <int> 18, 18, 9, 18, 9, 18, 9, 18, 9, 12, 18, 9, 12, 15, 18, 9, 12, 15, 19, 18, 9, 12, 15, 19…

The model results are visualised in Figure 1. Activation values are relative indices of memory prominence: the model does not estimate absolute memory strength or assign memory‑strength units to activation values. Instead, it models dynamic changes in relative availability under time‑based decay and interference, which are assumed to be the relevant determinants of writing behavior. Accordingly, the model’s predictions concern availability and competition among propositions, which are assumed to drive rereading and source use decisions, and it permits statements such as EDU “X” being more active than EDU “Y” at a given moment.

Figure 1: Reading history. Each line represents an EDU. Colour indicates whether the propositional information of the EDU is available or not.

Conceptual overview

The model treats each EDU as a propositional memory trace whose activation evolves continuously over time as readers alternate between reading, writing, and other cognitively demanding activities. Activation reflects the momentary accessibility of the information associated with each EDU, not its permanent storage in long‑term memory.

Activation increases when an EDU is processed during reading and decreases due to the passage of time, competition from newly processed information, and attentional diversion during non‑reading activities such as writing or planning. In addition, activation can be redistributed across structurally and semantically related propositions via spreading‑activation mechanisms.

The core assumptions of the model are:

The model operates in real time and produces a memory snapshot after each event, yielding continuous estimates of the activation and availability of all propositions encountered so far. This allows the model to track how the set of accessible ideas changes dynamically during writing-from-source tasks.

State Variables

At any time \(t\), the model maintains the following state variables:

The complete model state at time \(t\) can thus be characterised as:

\[ \text{state}(t) = \bigl\{ A(t),\; t_{\text{last}},\; t^{\text{fix}},\; W^{(s)},\; S^{(s)} \bigr\} \]

In other words, propositional information is strengthened, weakened, supported, and suppressed over time under the combined influence of processing, interference, discourse structure, and semantic knowledge.

Default parameters

lambda = 0.08   # passive decay rate (per sec)
alpha  = 0.9    # encoding strength
beta   = 0.10   # interference from new reading
gamma  = 0.18   # decay during non-reading (“other”)
theta  = 0.15   # availability threshold (normalised)

refix_window = 2.0  # refixation window (sec)
k_maint      = 0.3  # refixation boost

# Activation spreading
discourse_spread_rate = 0.01 # weight of discourse-structural spreading
semantic_spread_rate = 0.02  # weight of semantic resonance

These values are intended for adult L1 readers processing propositional content.

Event Representation

Each input event corresponds to a single attentional or task-related episode. Events are processed sequentially in time order.

Each event has the following attributes:

The model makes no assumptions about dwell structure or fixation clustering; any event stream that satisfies these conditions can be used.

Mathematical model

Passive time‑based decay

Between any two events, all EDU activations decay exponentially:

\[ \Delta t = \frac{t - t_{\text{last}}}{1000} \]

\[ A_e(t) = A_e(t_{\text{last}}) \cdot e^{-\lambda \Delta t} \]

where:

  • \(A_e(t)\) is the activation of EDU \(e\)
  • \(\lambda\) is the passive decay rate (per second)

This operation implements a baseline loss of activation over time, independent of current task demands or new input. Exponential decay reflects the assumption, common to activation‑based memory models, that representations become less accessible as a smooth function of elapsed time rather than disappearing abruptly.

Crucially, continuous decay captures the fact that propositional information fades from working accessibility unless it is actively maintained or reactivated. Passive decay provides a temporal reference frame against which the effects of interference, rehearsal, and spreading activation can be evaluated.

For non-reading events (event_type == "other", e.g. writing, planning, thinking), no new content is encoded. Instead, existing activations decay further:

\[ A_e \leftarrow A_e \cdot e^{-\gamma \cdot (d / 1000)} \]

where:

  • \(\gamma\) reflects decay due to cognitive engagement
  • \(d\) is event duration in milliseconds

This step models task‑dependent decay capturing the idea that source information becomes less accessible when attention is directed away from the source text. Activities such as writing or planning do not merely allow time to pass; they actively compete for cognitive resources that might otherwise support maintenance of source propositions.

The parameter \(\gamma\) therefore represents additional decay due to attentional diversion, over and above passive time‑based fading. This allows the model to distinguish between simple temporal gaps and cognitively demanding activities that are known to accelerate forgetting of previously read content.
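A small worked example may help to make the two decay steps concrete. The sketch below applies the two formulas in sequence to a hypothetical activation vector, using the default values of lambda and gamma; the EDU names, gap, and event duration are made up, and how the implementation combines the gap and the event duration is not shown here.

```r
# Illustrative values taken from the default parameters
lambda <- 0.08  # passive decay rate (per second)
gamma  <- 0.18  # additional decay during "other" events (per second)

# Hypothetical activation vector for two EDUs
A <- c(e1 = 1.0, e2 = 0.5)

# Passive decay over a 5000 ms gap since the previous event
dt <- 5000 / 1000
A_passive <- A * exp(-lambda * dt)

# Further decay during a 3000 ms "other" event
d <- 3000
A_other <- A_passive * exp(-gamma * d / 1000)

round(A_passive, 3)
round(A_other, 3)
```

Both steps multiply every activation by the same factor, so the ratio between the two EDUs is unchanged; only subsequent normalisation, interference, and encoding change their relative standing.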

Encoding, interference, and maintenance during reading

During reading events (event_type == "read"), activation dynamics combine encoding, interference, and maintenance processes.

Encoding

Encoding input is computed as fixation duration on a logarithmic scale, normalised by EDU length:

\[ I = \frac{\log(d + 1)}{w_e} \]

where \(d\) is the duration (in milliseconds) of the fixation on EDU \(e\) and \(w_e\) is the number of words in that EDU (words_in_edu).

This transformation serves three purposes:

  1. Logarithmic scaling reflects diminishing returns of fixation duration: very long fixations increase encoding strength, but not linearly.
  2. Normalisation by EDU length prevents long propositions from receiving disproportionately large activation simply because they contain more words.
  3. Continuous scaling allows encoding strength to vary smoothly rather than categorically.

As a result, \(I\) can be interpreted as an approximation of processing depth at the propositional level rather than raw visual exposure.
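The behaviour of the encoding input can be checked with a few numbers. The sketch below assumes the natural logarithm (R's default for log()) and durations in milliseconds; the function name encoding_input and the example values are hypothetical.

```r
# Hypothetical helper: encoding input I = log(d + 1) / w_e
encoding_input <- function(duration_ms, words_in_edu) {
  log(duration_ms + 1) / words_in_edu
}

# Same EDU length (16 words), doubling fixation durations
I_dur <- encoding_input(c(100, 200, 400), words_in_edu = 16)
round(I_dur, 3)

# Same duration (300 ms), short (5 words) versus long (39 words) EDU
I_len <- encoding_input(300, words_in_edu = c(5, 39))
round(I_len, 3)
```

Doubling the duration raises the encoding input by far less than a factor of two, and the same fixation duration yields a smaller input for the longer EDU.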

Interference

Reading a new EDU suppresses activation of all existing EDUs:

\[ A_j \leftarrow A_j \cdot e^{-\beta I} \quad \forall j \]

where \(\beta\) controls interference strength.

This operation implements similarity‑independent interference, a core principle of cue‑based and activation‑based memory models. Encoding new information reduces the accessibility of previously processed information because all representations compete for a limited pool of activation. Importantly, interference is scaled by encoding strength \(I\). Deep engagement with a new proposition produces stronger competition than superficial processing, reflecting that deeply processed information induces greater forgetting of competing representations.

Refixation‑based maintenance

If the same EDU is refixated within a short temporal window, the refixation is treated not as a new encoding event but as short‑term conceptual rehearsal that stabilises an already active representation:

\[ \text{if } (t - t^{\text{fix}}_e) < \tau: \quad A_e \leftarrow A_e \cdot (1 + k) \]

where:

  • \(\tau\) is the refixation window (refix_window, in seconds)
  • \(k\) is the maintenance gain (k_maint)

Multiplicative scaling preserves the relative activation of the EDU while strengthening it enough to resist decay and interference. This captures the intuition that rereading shortly after reading helps keep an idea active rather than creating a qualitatively new memory trace.

Encoding update

After interference and maintenance, the currently fixated EDU receives an additive activation boost:

\[ A_e \leftarrow A_e + \alpha I \]

Finally, its fixation time is updated:

\[ t^{\text{fix}}_e \leftarrow t \]

This step establishes the longer‑lasting encoding effect of reading: propositions gain activation in proportion to processing depth, allowing them to persist beyond the immediate fixation. The parameter \(\alpha\) determines how strongly new information enters the activation landscape relative to decay and spreading. Updating the last fixation time ensures that subsequent refixations can be identified as maintenance rather than new encoding.
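The three reading steps (interference, maintenance, encoding) can be combined into a single update for one read event. The sketch below is a minimal reading of the formulas in this section, not the source of update_reading_history(): the function name read_update_sketch, its argument names, and the toy state are assumptions. It follows the formulas literally, so interference is applied to all EDUs, including the fixated one, and timestamps in milliseconds are converted to seconds before comparison with the refixation window.

```r
# Hypothetical single read-event update (not update_reading_history() itself)
read_update_sketch <- function(A, t_fix, edu, t, d, w_e,
                               alpha = 0.9, beta = 0.1,
                               refix_window = 2.0, k_maint = 0.3) {
  I <- log(d + 1) / w_e                  # encoding input
  A <- A * exp(-beta * I)                # interference on all EDUs
  if (!is.na(t_fix[edu]) && (t - t_fix[edu]) / 1000 < refix_window) {
    A[edu] <- A[edu] * (1 + k_maint)     # refixation-based maintenance
  }
  A[edu] <- A[edu] + alpha * I           # additive encoding boost
  t_fix[edu] <- t                        # store the latest fixation time
  list(A = A, t_fix = t_fix)
}

# Toy state: e1 was read earlier, e2 has not been read yet
A     <- c(e1 = 0.8, e2 = 0.0)
t_fix <- c(e1 = 0, e2 = NA)

# e2 (16 words) is fixated for 300 ms at t = 10000 ms
res <- read_update_sketch(A, t_fix, edu = "e2", t = 10000, d = 300, w_e = 16)
round(res$A, 3)
```

Calling the function again for the same EDU one second later would trigger the maintenance branch, because the time since the stored fixation is shorter than refix_window.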

Availability for retrieval

To estimate which EDUs are likely retrievable at a given moment, activations are normalised:

\[ A^*_e = \frac{A_e}{\max_j A_j} \]

An EDU is considered available if:

\[ A^*_e \ge \theta \]

where \(\theta\) is a threshold parameter. Thus availability reflects relative retrievability under competition, not absolute memory strength.

Normalisation transforms raw activation values into relative prominence within the current memory state. Availability is therefore defined competitively: an EDU is retrievable not because it is absolutely strong but because it remains sufficiently strong relative to other active propositions.
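The normalisation and threshold steps can be illustrated as follows. The helper name availability_sketch and the toy activations are hypothetical, and the guard for an all‑zero state is an assumption added for this example rather than something documented for the model.

```r
# Hypothetical helper: relative availability of each EDU
availability_sketch <- function(A, theta = 0.15) {
  top <- max(A)
  # Assumption: if nothing is active yet, nothing is available
  A_star <- if (top > 0) A / top else A * 0
  data.frame(edu = names(A),
             activation = unname(A_star),
             available = unname(A_star >= theta))
}

avail <- availability_sketch(c(e1 = 0.9, e2 = 0.3, e3 = 0.1))
avail
```

Here e3 falls below the threshold only relative to e1: if e1 were weaker, the same raw activation of e3 could count as available.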

Activation spreading

Discourse spreading

The first spreading mechanism implements discourse‑based support between propositions that are structurally related within a text.

In the Construction–Integration (CI) model, propositions are not independent items but nodes in a discourse network whose connections reflect textual relations and coherence structure (Kintsch 1988, 1998). During the integration phase, activation spreads across this network, stabilising coherent interpretations while less connected propositions lose activation.

Rhetorical Structure Theory (RST) (Mann and Thompson 1988) provides an explicit representation of such discourse structure, capturing how clauses and propositions are grouped into spans at different hierarchical levels. Propositions that co‑occur in small, local spans are assumed to be more tightly integrated than propositions that are only linked through larger, more global spans.

For each source text, a discourse‑structural weight matrix \(W^{(s)}\) is constructed such that:

  • Rows and columns correspond to EDUs from the same source
  • \(W_{ij}\) reflects the strength of discourse‑structural association between EDU \(i\) and EDU \(j\)
  • Weights accumulate across all RST spans in which two EDUs co‑occur
  • Span contributions are weighted inversely by hierarchical level; combined with accumulation over nested spans, this yields stronger links for more local relations
  • Rows are normalised so that outgoing activation is bounded

Formally, CI‑style spreading is implemented as:

\[ A^{new}_i = A_i + s \sum_j W_{ij} A_j \]

where \(s\) is a small spreading parameter (discourse_spread_rate). Spreading is restricted to EDUs belonging to the same source text.

This mechanism captures:

  • discourse coherence maintenance
  • reinforcement of locally related propositions
  • similarity‑based competition within a text

It corresponds to the integration phase of CI models and to episodic connections in landscape‑style models of comprehension.

Semantic spreading

The second spreading mechanism implements knowledge‑based resonance between propositions that are semantically related, independent of explicit discourse structure.

In the Landscape Model of reading comprehension (van den Broek et al. 1996, 1999), activation fluctuates across cycles as a function of:

  • attention to currently read units
  • residual activation of prior units
  • reactivation of related units via long‑term semantic memory

Recent developments of the Landscape Model (e.g., the LS‑R model) (Yeari and van den Broek 2015, 2016) operationalise semantic memory using distributional semantic representations (e.g., LSA), allowing activation to spread between text units based on their semantic similarity.

For each source text, a semantic similarity matrix \(S^{(s)}\) is constructed such that:

  • Rows and columns correspond to EDUs from the same source
  • \(S_{ij} \in [0,1]\) reflects semantic similarity between EDU texts, with negative similarities set to zero
  • Similarity is computed from EDU‑level text representations (here as cosine similarity between SBERT embeddings; Reimers and Gurevych 2019)
  • The matrix is symmetric and numeric
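The matrix construction can be sketched without an embedding model. The example below uses hand-made two-dimensional vectors as stand-ins for SBERT embeddings (an assumption made purely for illustration) and derives a clipped cosine similarity matrix from them.

```r
# Toy 2-d "embeddings" standing in for SBERT vectors (illustrative only)
V <- rbind(
  e1 = c(1, 0),
  e2 = c(1, 1),
  e3 = c(-1, 0)
)

# L2-normalise rows so that the dot product equals cosine similarity
V <- V / sqrt(rowSums(V^2))

S <- tcrossprod(V)   # S[i, j] = cosine(e_i, e_j)
S[S < 0] <- 0        # clip negative similarities into [0, 1]
diag(S) <- 1         # guard against rounding on the diagonal

print(round(S, 3))
```

The opposing vectors e1 and e3 have a cosine of -1, which is clipped to 0, so they exchange no activation.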

Semantic spreading is implemented as:

\[ A^{new}_i = A_i + \gamma \sum_j S_{ij} A_j \]

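where \(\gamma\) is a small semantic spreading rate, passed as semantic_spread_rate in the implementation. It is distinct from the task‑based decay rate, which the code also names gamma.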

This mechanism captures:

  • activation of inferences and predictions
  • background conceptual resonance
  • reactivation of previously read but semantically related information

Unlike discourse‑structural spreading, semantic spreading does not assume explicit textual relations and can activate propositions that were never adjacent in the text.
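A single semantic spreading step might look as follows. The similarity matrix, the activations and \(\gamma = 0.03\) are assumed values; e1 and e3 are taken to be semantically close even though nothing is said about their position in the text.

```r
# Illustrative similarity matrix for three EDUs of one source
edus <- c("e1", "e2", "e3")
S <- matrix(
  c(1.0, 0.1, 0.8,
    0.1, 1.0, 0.0,
    0.8, 0.0, 1.0),
  nrow = 3, byrow = TRUE, dimnames = list(edus, edus)
)

A <- c(e1 = 1.0, e2 = 0.5, e3 = 0.1)  # current activations (assumed)
gamma <- 0.03                          # assumed semantic spreading rate

# A_new_i = A_i + gamma * sum_j S_ij * A_j
A_new <- A + gamma * as.numeric(S %*% A)

print(A_new)
```

Because the diagonal of the similarity matrix is 1, the sum includes the term for j = i, so each EDU also receives a small boost from its own activation. The weakly active e3 is reactivated mainly through its similarity to e1.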

Algorithmic summary

  1. Calculate the time elapsed since the previous event, converting timestamps from milliseconds to seconds:
    \(\Delta t = (t - t_{\text{last}}) / 1000\)
  2. Apply exponential time‑based decay as a function of \(\Delta t\)
  3. For event type "other" (e.g., writing, planning, thinking), apply additional exponential task‑based decay scaled by the duration of the event
  4. For event type "read":
    • Compute encoding strength from fixation duration, normalised by EDU length
    • Apply interference to all active EDUs as a function of encoding strength
    • If the EDU was fixated recently, apply refixation‑based maintenance (rehearsal)
    • Add encoding activation to the currently fixated EDU
  5. Apply discourse‑structural spreading activation for each source text:
    • spread activation across structurally related EDUs using an RST‑derived weight matrix
    • spreading strength is governed by a small CI‑style spreading parameter
  6. Apply semantic spreading activation for each source text:
    • spread activation across semantically related EDUs using a semantic similarity matrix
    • spreading strength is governed by a small semantic resonance parameter
  7. Normalise activations relative to the most active EDU at that moment
  8. Determine availability by comparing normalised activation with the predefined threshold \(\theta\)

After each event, the model outputs a complete memory snapshot containing the current activation values and availability status for all EDUs encountered so far.
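The core of one update cycle can be sketched in a few lines. The example below processes a single read event and omits refixation maintenance and both spreading steps for brevity; all parameter values, EDU names and event properties are assumptions chosen for illustration.

```r
# Assumed parameter values, for illustration only
lambda <- 0.01   # passive decay rate (per second)
beta   <- 0.05   # interference rate
alpha  <- 1.0    # encoding gain
theta  <- 0.3    # availability threshold

A  <- c(e1 = 0.9, e2 = 0.6)  # activations after the previous event
dt <- 2                       # seconds since the previous event

# Steps 1-2: passive exponential decay
A <- A * exp(-lambda * dt)

# Step 4: a 400 ms fixation on a new 10-word EDU "e3"
input <- log(400 + 1) / 10    # encoding strength, length-normalised
A <- A * exp(-beta * input)   # interference on existing EDUs
A["e3"] <- alpha * input      # encoding of the fixated EDU

# Steps 7-8: normalise by the most active EDU and apply the threshold
A_norm <- A / max(A)
snapshot <- data.frame(
  edu        = names(A_norm),
  activation = as.numeric(A_norm),
  available  = as.numeric(A_norm) >= theta
)

print(snapshot)
```

Decay and interference scale e1 and e2 by the same factor, so their ratio is preserved; what changes their relative standing is the newly encoded e3 entering the competition.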

Full R implementation

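# Packages used below -----------------------------------------------------
library(dplyr)     # %>%, mutate(), bind_rows()
library(stringr)   # str_split()
library(tibble)    # tibble()
library(text2vec)  # itoken(), create_vocabulary(), create_dtm(), TfIdf, sim2()

# Initial state information -----------------------------------------------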
init_state <- function() {
  list(
    A = numeric(),               # named activation vector
    last_t = NA_real_,           # last event time (ms)
    last_fix_time = numeric()    # last fixation per EDU (sec)
  )
}

# Activation weights based on RST relations -------------------------------
build_rst_W <- function(spans) {

  spans_source <- spans %>%
    mutate(
      edu_list = str_split(edu, ","),
      weight = 1 / (level + 1)
    )

  edus <- unique(unlist(spans_source$edu_list))

  W <- matrix(
    0,
    nrow = length(edus),
    ncol = length(edus),
    dimnames = list(edus, edus)
  )

  for (i in seq_len(nrow(spans_source))) {

    span_edus <- spans_source$edu_list[[i]]
    w <- spans_source$weight[i]

    if (length(span_edus) < 2) next

    pairs <- combn(span_edus, 2)

    for (j in seq_len(ncol(pairs))) {
      e1 <- pairs[1, j]
      e2 <- pairs[2, j]
      W[e1, e2] <- W[e1, e2] + w
      W[e2, e1] <- W[e2, e1] + w
    }
  }

  # row-normalise; EDUs without any links keep an all-zero row
  rs <- rowSums(W)
  rs[rs == 0 | !is.finite(rs)] <- 1
  W <- W / rs

  W
}

# Semantic similarity matrix for spreading activation ---------------------

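# Compute and cache EDU embeddings
# NOTE: `sbert_model` is not defined in this listing; it is assumed to be a
# sentence-transformers model object loaded beforehand (e.g. via reticulate)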
compute_sbert_embeddings <- function(edu_df) {

  # edu_df: columns edu, text
  texts <- edu_df$text

  emb <- sbert_model$encode(
    texts,
    convert_to_numpy = TRUE,
    normalize_embeddings = TRUE)

  rownames(emb) <- edu_df$edu
  emb
}

# Build semantic similarity matrix
build_semantic_W_sbert <- function(edu_df, embeddings) {

  edus <- edu_df$edu
  V <- embeddings[edus, , drop = FALSE]

  # cosine similarity
  S <- tcrossprod(V)

  # numerical safety
  diag(S) <- 1
  S[S < 0] <- 0

  S
}

# Compute TF‑IDF‑based semantic matrix (previous version)
build_semantic_W <- function(edu_df) {

  it <- itoken(
    edu_df$text,
    tokenizer = word_tokenizer,
    progressbar = FALSE
  )

  vocab <- create_vocabulary(it)
  vectorizer <- vocab_vectorizer(vocab)

  dtm <- create_dtm(it, vectorizer)
  tfidf <- TfIdf$new()
  dtm_tfidf <- tfidf$fit_transform(dtm)

  # cosine similarity
  S <- sim2(dtm_tfidf, method = "cosine", norm = "l2")

  S <- as.matrix(S)
  rownames(S) <- edu_df$edu
  colnames(S) <- edu_df$edu

  S
}

# Spreading activation ----------------------------------------------------

# Landscape semantic spreading, knowledge‑based resonance, semantic memory
spread_activation_landscape <- function(A,
                                        semantic_W_by_source,
                                        edu_source,
                                        gamma = 0.03) { # spreading of semantic information

  if (length(A) == 0) return(A)

  A_new <- A
  active_edus <- names(A)

  for (src in unique(edu_source[active_edus])) {

    if (!src %in% names(semantic_W_by_source)) next

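    edus <- active_edus[edu_source[active_edus] == src]
    # keep only EDUs that have an entry in the similarity matrix
    edus <- intersect(edus, rownames(semantic_W_by_source[[src]]))
    if (length(edus) < 2) next

    S <- semantic_W_by_source[[src]][edus, edus, drop = FALSE]

    # force a plain numeric matrix
    S <- as.matrix(S)
    storage.mode(S) <- "double"

    a_vec <- as.numeric(A[edus])

    # no spreading if activations are degenerate
    if (any(!is.finite(a_vec))) next

    A_new[edus] <- A[edus] + gamma * as.numeric(S %*% a_vec)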
  }

  A_new
}


# Discourse spreading activation (construction integration / RST)
spread_activation_ci <- function(A, W_by_source, edu_source, s = 0.05) {

  if (length(A) == 0) return(A)

  A_new <- A
  active_edus <- names(A)

  for (src in unique(edu_source[active_edus])) {

    # --- GUARD 1: source must exist in W_by_source ---
    if (!src %in% names(W_by_source)) next

    W_src <- W_by_source[[src]]

    # --- GUARD 2: weight matrix must be valid ---
    if (is.null(W_src)) next

    edus <- active_edus[edu_source[active_edus] == src]
    # keep only EDUs that appear in the weight matrix
    edus <- intersect(edus, rownames(W_src))

    # --- GUARD 3: need at least two EDUs ---
    if (length(edus) < 2) next

    W <- W_src[edus, edus, drop = FALSE]

    # force numeric
    W <- as.matrix(W)
    a_vec <- as.numeric(A[edus])

    # --- GUARD 4: no spreading if degenerate ---
    if (any(!is.finite(a_vec)) || nrow(W) == 0) next

    # Add the weighted activation of related EDUs; the spreading factor s
    # determines how strongly the weights contribute
    A_new[edus] <- A[edus] + s * as.numeric(W %*% a_vec)
  }

  A_new
}

# Update activation -------------------------------------------------------
`%||%` <- function(x, y) if (is.null(x) || is.na(x)) y else x

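# Process a single event (one row of the event log) and return the new state.
# Relies on objects defined outside this listing: lambda, gamma, beta, alpha,
# k_maint, refix_window, discourse_spread_rate, semantic_spread_rate,
# W_by_source, semantic_W_by_source and edu_source.
update_one_event <- function(row, state) {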

  A <- state$A
  last_fix_time <- state$last_fix_time

  # time elapsed (sec)
  if (is.na(state$last_t)) {
    dt <- 0
  } else {
    dt <- (row$t_start - state$last_t) / 1000
  }

  # universal passive decay
  if (length(A) > 0 && dt > 0) {
    A <- A * exp(-lambda * dt)
  }

  # ---- EVENT TYPES ----

  if (row$event_type == "other") {

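    # task engagement without source input
    # NOTE: `gamma` here is the global task-based decay rate, not the
    # `gamma` argument of spread_activation_landscape()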
    if (length(A) > 0) {
      A <- A * exp(-gamma * (row$duration / 1000))
    }

  } else if (row$event_type == "read") {

    edu <- row$edu
    now <- row$t_start / 1000

    # encoding strength (EDU-length normalised)
    input <- log(row$duration + 1) / row$words_in_edu

    # interference from new reading
    if (length(A) > 0) {
      A <- A * exp(-beta * input)
    }

    if (!is.na(edu) &&
        edu %in% names(last_fix_time) &&
        !is.na(last_fix_time[edu]) &&
        (now - last_fix_time[edu]) < refix_window) {

      A[edu] <- (A[edu] %||% 0) * (1 + k_maint)
    }

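    # encoding (skipped when the fixation is not mapped to an EDU,
    # so that no NA-named entry is created)
    if (!is.na(edu)) {
      A[edu] <- (A[edu] %||% 0) + alpha * input
      last_fix_time[edu] <- now
    }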
  }

  # spreading activation across propositions
  # after encoding / interference

  # discourse / CI spreading: based on RST relations
  A <- spread_activation_ci(
        A,
        W_by_source,
        edu_source = edu_source,
        s = discourse_spread_rate
  )

  # semantic / landscape spreading: based on semantic similarity
  A <- spread_activation_landscape(
    A,
    semantic_W_by_source,
    edu_source,
    gamma = semantic_spread_rate
  )


  list(
    A = A,
    last_t = row$t_start,
    last_fix_time = last_fix_time
  )
}

update_reading_history <- function(df, state) {

  df <- df[order(df$t_start), ]

  outputs <- vector("list", nrow(df))

  for (i in seq_len(nrow(df))) {

    state <- update_one_event(df[i, ], state)

    A <- state$A

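    # reporting snapshot: scale by the most active EDU
    # (ignores non-finite values and avoids division by zero)
    finite_A <- A[is.finite(A)]
    if (length(finite_A) > 0 && max(finite_A) > 0) {
      A_norm <- A / max(finite_A)
    } else {
      A_norm <- A
    }

    # theta is the global availability threshold
    outputs[[i]] <- tibble(
      token      = df$token[i],
      t_start    = df$t_start[i],
      edu        = as.character(names(A_norm)),
      activation = as.numeric(A_norm),
      available  = as.numeric(A_norm) >= theta
    )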

  }

  list(
    state = state,
    output = dplyr::bind_rows(outputs)
  )
}

References

Anderson, John R. 1983. The Architecture of Cognition. Cambridge, MA: Harvard University Press.
Anderson, John R., and Gordon H. Bower. 1973. Human Associative Memory. Washington, DC: Winston.
Broek, Paul van den, Kimberly Risden, Charles R. Fletcher, and R. Thurlow. 1996. “A ‘Landscape’ View of Reading: Fluctuating Patterns of Activation and the Construction of a Stable Memory Representation.” In Models of Understanding Text, edited by Bruce K. Britton and Arthur C. Graesser, 165–87. Lawrence Erlbaum Associates.
Broek, Paul van den, Michael Young, Yuet-Yin Tzeng, and Tracy Linderholm. 1999. “The Landscape Model of Reading: Inferences and the Online Construction of a Memory Representation.” In The Construction of Mental Representations During Reading, edited by Herre van Oostendorp and Susan R. Goldman, 71–98. Lawrence Erlbaum Associates.
Chukharev-Hudilainen, Evgeny. 2019. “Empowering Automated Writing Evaluation with Keystroke Logging.” In Observing Writing, edited by Eva Lindgren and Kirk Sullivan, 38:125–42. Brill.
Kintsch, Walter. 1988. “The Role of Knowledge in Discourse Comprehension: A Construction-Integration Model.” Psychological Review 95 (2): 163–82. https://doi.org/10.1037/0033-295X.95.2.163.
———. 1998. Comprehension: A Paradigm for Cognition. Cambridge: Cambridge University Press.
Lewis, Richard L., and Shravan Vasishth. 2005. “An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval.” Cognitive Science 29 (3): 375–419. https://doi.org/10.1207/s15516709cog0000_25.
Lewis, Richard L., Shravan Vasishth, and Julie A. Van Dyke. 2006. “Computational Principles of Working Memory in Sentence Comprehension.” Trends in Cognitive Sciences 10 (10): 447–54. https://doi.org/10.1016/j.tics.2006.08.007.
Mann, William C., and Sandra A. Thompson. 1988. “Rhetorical Structure Theory: Toward a Functional Theory of Text Organization.” Text 8 (3): 243–81.
Myers, Jerome L., and Edward J. O’Brien. 1998. “Accessing the Discourse Representation During Reading.” Discourse Processes 26 (2-3): 131–57. https://doi.org/10.1080/01638539809545042.
Reimers, Nils, and Iryna Gurevych. 2019. “Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. https://arxiv.org/abs/1908.10084.
Yeari, Menahem, and Paul van den Broek. 2015. “The Role of Textual Semantic Constraints in Knowledge-Based Inference Generation During Reading Comprehension: A Computational Approach.” Memory 23 (8): 1193–1214. https://doi.org/10.1080/09658211.2014.968169.
———. 2016. “A Computational Modeling of Semantic Knowledge in Reading Comprehension: Integrating the Landscape Model with Latent Semantic Analysis.” Behavior Research Methods 48: 880–96. https://doi.org/10.3758/s13428-016-0749-6.