TODO:

Phase 2

Treat gist as analysis, not mechanism: extract it after the fact; use activation and E.

Define 2–3 gist extraction metrics; get gist on the basis of activated EDUs and their associations

Phase 3

SBERT vs learned E: what does each contribute?

Phase 4

Test gists against produced texts

Because the model is explicit and causal, you can do things like: “Which EDU needs reactivation to recover the intended gist?”

Mechanically:

Model description

This document describes a real‑time activation model used to estimate the availability of propositional content (EDUs, e.g. sentences) in memory during writing‑from‑source tasks. The model treats reading as a stream of memory‑updating events that dynamically shape which ideas are available for writing under time pressure and interference. It is driven by eye‑movement–derived events but abstracts away from word‑level processing, focusing instead on proposition‑level representations. The model is intended to capture availability and accessibility of meaning, not verbatim recall.

The model builds on activation‑based and interference‑based accounts of memory in psycholinguistics (e.g. ACT‑R, cue‑based retrieval, and related comprehension models), according to which linguistic representations are continuously updated, decay over time, and compete for retrieval (Anderson 1983; Anderson and Bower 1973; Lewis and Vasishth 2005). As in cue‑based retrieval frameworks, forgetting is treated not as loss of information but as a reduction in accessibility under competition (Lewis, Vasishth, and Van Dyke 2006; Myers and O’Brien 1998). EDUs are treated as propositional units whose activation is strengthened by reading, weakened by interference from new information, and eroded by non‑source‑reading activities such as writing. Availability is defined in relative terms, reflecting selective accessibility among competing propositions rather than absolute memory strength.

The present model extends a time‑ and interference‑based activation framework by embedding propositions in structured networks over which activation can spread and by allowing aspects of this structure to change during reading. This yields a hybrid architecture that combines (i) graded activation dynamics, (ii) knowledge‑based semantic relationships among propositions, and (iii) episodic learning driven by co‑activation during comprehension. At any point in time, each EDU has a continuous activation value reflecting its current accessibility. Activations change dynamically as a function of reading, writing, decay, interference, and spreading activation. In addition, when propositions remain simultaneously active and available, their mutual associations can strengthen, producing reader‑ and episode‑specific integration of the text.

In line with Fuzzy Trace Theory (Reyna and Brainerd 1995; Reyna 2008), gist is treated here as a meaning‑based representation that is more stable than verbatim content. In the present model, gist is not implemented as a separate trace or stored as a distinct unit. Instead, it arises from the evolving structure of activation and learned associations across propositions over time: stable patterns of activation and connectivity correspond to gist‑like understanding, while transient or weakly connected propositions correspond to peripheral or unstable interpretations.

Overall, the model implements a hybrid activation‑based memory system with the following properties:

This architecture aligns with a large body of psycholinguistic work in which forgetting during comprehension is understood as a consequence of decay, interference, competitive retrieval, and learning in a structured memory space, rather than as the loss of stored representations (e.g. Lewis, Vasishth, and Van Dyke 2006; Myers and O’Brien 1998).

Full functions can be found at the end of this document and will be loaded here.

# Load functions for reading history model with activation spreading
source("../r/rh-sa.R")

Model inputs

Example data

Below is an example of the input format that the model expects. All variables are required except token. Each row is an event with a time stamp (t_start) and a duration in milliseconds. event_type is "read" whenever edu is not NA and "other" whenever edu is NA. Thus, every "read" event is associated with an edu, i.e. a text unit within the source texts, to which the fixation duration applies. The code assumes that edu values are unambiguous identifiers of text regions across source texts.

Rows: 8,354
Columns: 7
$ token        <chr> "test", "test", "test", "test", "test", "test", "test", "test", "test", "test", "test",…
$ source       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "RW1_short_ascii", NA, NA, "RW1_short_ascii", "…
$ edu          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "001kgfc", NA, NA, "002smtf", "003twnf", "002sm…
$ t_start      <dbl> 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.7…
$ event_type   <chr> "other", "other", "other", "other", "other", "other", "other", "other", "other", "other…
$ duration     <int> 797, 355, 1788, 4286, 1761, 2183, 2565, 2076, 120, 93, 160, 140, 107, 602, 307, 221, 19…
$ words_in_edu <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 16, NA, NA, 22, 24, 22, 16, 16, 16, NA, 18, NA,…

words_in_edu comes from the rst_tsv file:

Rows: 246
Columns: 4
$ edu          <chr> "001wfwa", "002tiwf", "003atti", "004saec", "005osit", "006netd", "007tpid", "008bats",…
$ text         <chr> "when faced with a particularly tough question on rounds during my intern year i would …
$ source       <chr> "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_…
$ words_in_edu <int> 20, 25, 21, 18, 5, 39, 4, 25, 29, 19, 11, 36, 14, 32, 29, 16, 8, 23, 20, 31, 18, 35, 38…

Span information from the rst_tsv file is required to calculate the weights between PEs.

Rows: 183
Columns: 3
$ source <chr> "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii"…
$ edu    <chr> "001wfwa,002tiwf", "001wfwa,002tiwf,003atti,004saec,005osit", "001wfwa,002tiwf,003atti,004sae…
$ level  <dbl> 9, 7, 5, 3, 1, 8, 10, 8, 6, 8, 6, 4, 4, 2, 4, 3, 0, 2, 2, 4, 5, 3, 1, 3, 1, 2, 1, 3, 5, 3, 13…

The input data format requires minimal data transformation from the incremental report returned by CyWrite (Chukharev-Hudilainen 2019) and no data cleaning. The data wrangling that would be minimally required (in R) is shown here:

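library(dplyr)  # select(), mutate(), case_when(), left_join(), join_by(), %>%
library(tidyr)  # unnest()

jsonlite::fromJSON(txt = token, flatten = TRUE) %>%
  select(eye) %>%
  unnest(eye) %>%
  # label events as "read" (edu present) or "other" (edu missing)
  mutate(event_type = case_when(!is.na(edu) ~ "read", TRUE ~ "other")) %>%
  # add number of words in edu
  left_join(data_pe, by = join_by(edu))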

Finally we need a named vector with source as values and EDUs as names.

# EDUs by source 
edu_source <- setNames(data_pe$source, data_pe$edu)

# Preview
edu_source[1:2]
          001wfwa           002tiwf 
"AM1_short_ascii" "AM1_short_ascii" 

Semantic similarity

The function build_S_sbert() constructs a semantic similarity matrix for knowledge‑based spreading activation for each source text on the basis of the data stored in data_pe. This matrix is derived from sentence‑level semantic embeddings allowing semantic relations to be captured independently of surface lexical overlap. This matrix provides the structural basis for landscape‑style semantic spreading activation, in which activation can flow between propositions that are semantically related in a conceptual sense, even if they do not share words or are not directly connected by discourse structure.

Each EDU is mapped onto a dense semantic vector using a Sentence‑BERT (SBERT) model (Reimers and Gurevych 2019). SBERT models are trained to represent the meaning of sentences such that semantically equivalent or closely related sentences have similar vector representations, even when they differ substantially in wording. This makes them particularly well suited for short, propositional EDUs, where lexical overlap may be minimal.

SBERT is accessed via Python.

library(reticulate)
use_python("/usr/bin/python3", required = TRUE)
transformers <- import("sentence_transformers")
np <- import("numpy")

# initialise model (do once per session)
sbert_model <- transformers$SentenceTransformer("all-MiniLM-L6-v2")

# Generate SBERT embeddings for EDUs
emb_sbert_by_source <- data_pe %>%
  split(.$source) %>%
  lapply(compute_sbert_embeddings)

Semantic representations are constructed as follows:

  1. Sentence encoding: Each EDU text is passed to a pretrained SBERT model, which produces a fixed‑length, dense vector representation intended to capture the EDU’s propositional meaning.
  2. Embedding normalisation: EDU embeddings are L2‑normalised to ensure that cosine similarity reflects angular distance in semantic space rather than vector magnitude.
  3. Per‑source processing: Embeddings are constructed separately for each source text, and semantic similarity matrices are computed independently for each source. This prevents semantic spreading across sources.

# Build semantic similarity matrix for each source from its embeddings
edu_list <- data_pe %>% split(.$source)

common_names <- intersect(names(edu_list), names(emb_sbert_by_source))

S_by_source <- mapply(
  build_S_sbert,
  edu_df = edu_list[common_names],
  embeddings = emb_sbert_by_source[common_names],
  SIMPLIFY = FALSE)

SBERT embeddings approximate long‑term semantic knowledge learned from large corpora, including information about synonymy, paraphrase, and typical conceptual associations.

For each source text, a semantic similarity matrix \(S\) is computed by taking the cosine similarity between all pairs of EDU embeddings. For EDUs \(i\) and \(j\):

\[ S_{ij} = \frac{\vec{v}_i \cdot \vec{v}_j}{\|\vec{v}_i\| \|\vec{v}_j\|} \]

where \(\vec{v}_i\) and \(\vec{v}_j\) are the sentence‑level embedding vectors of EDUs \(i\) and \(j\), respectively. A preview is shown here:

          001kgfc   002smtf   003twnf   004eahf   005wtsg   006bssf
001kgfc 1.0000000 0.2677321 0.1633561 0.4572471 0.1959719 0.1470794
002smtf 0.2677321 1.0000000 0.3595323 0.3795831 0.4427657 0.5298123
003twnf 0.1633561 0.3595323 1.0000000 0.2868217 0.4470820 0.3980450
004eahf 0.4572471 0.3795831 0.2868217 1.0000000 0.3417116 0.4565610
005wtsg 0.1959719 0.4427657 0.4470820 0.3417116 1.0000000 0.4074331
006bssf 0.1470794 0.5298123 0.3980450 0.4565610 0.4074331 1.0000000

Cosine similarity ranges from \(-1\) to \(1\). Because build_S_sbert() sets negative values to zero, the entries of \(S\) lie in the range \([0, 1]\), where higher values indicate greater semantic relatedness in embedding space. Because embeddings encode meaning beyond surface form, high similarity values can arise even when EDUs share few or no lexical items.

The resulting semantic similarity matrix \(S\) has the following properties:

  • Rows and columns correspond to EDUs from the same source text
  • \(S_{ij} = S_{ji}\) (the matrix is symmetric)
  • \(S_{ii} = 1\) (self‑similarity)
  • Off‑diagonal values encode graded semantic relatedness
  • The matrix is dense and continuous, reflecting graded semantic associations rather than categorical links

Row and column names are set to EDU identifiers, allowing direct alignment with the activation state vector.
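
The properties listed above can be checked on a toy example. The following is a minimal sketch in which three invented two‑dimensional vectors stand in for SBERT embeddings; the vectors and the labels edu_a, edu_b and edu_c are assumptions made for illustration. Only the normalisation, tcrossprod() and clipping steps mirror what build_S_sbert() does.

```r
# Toy "embeddings": three hypothetical EDUs in a 2-d space (invented values)
V <- rbind(edu_a = c(1, 0),
           edu_b = c(1, 1),
           edu_c = c(-1, 0.2))

# L2-normalise each row so that dot products equal cosine similarities
V <- V / sqrt(rowSums(V^2))

# Cosine similarity between all pairs of rows
S <- tcrossprod(V)

# Same numerical safety steps as in build_S_sbert()
diag(S) <- 1     # exact self-similarity
S[S < 0] <- 0    # clip negative cosines to zero

round(S, 3)
isSymmetric(S)   # S_ij equals S_ji
range(S)         # all values lie within [0, 1] after clipping
```

In this toy case edu_c points away from the other two vectors, so its raw cosine similarities are negative and are clipped to zero. Such a pair would contribute nothing to semantic spreading.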

The SBERT‑based semantic similarity matrix is intended to approximate the reader’s semantic memory structure in a distributional sense. Embedding‑based similarity captures conceptual relatedness that supports:

  • synonymy and paraphrase relations
  • bridging and predictive inferences
  • resonance between propositions with similar meanings
  • maintenance of topic‑level coherence despite variation in wording

In the model, this matrix supports landscape‑style semantic activation spreading whereby activation of one proposition partially reactivates other propositions that are semantically related at the level of meaning rather than surface form. Semantic spreading is implemented with a small weighting parameter ensuring that knowledge‑based resonance provides weak but persistent background support rather than dominating direct encoding or discourse‑structural integration mechanisms.

Application of reading history model

The following code illustrates how the reading history function update_reading_history can be applied to slices of data. These data slices can have any size from 1 to the overall length of the writing session. Details of the implementation of update_reading_history and a conceptual description can be found below.

# Model parameters
lambda  <- .08       # slow passive time decay (/sec)
alpha   <- .9        # moderate encoding strength
beta    <- .1        # strong propositional-level interference from reading
gamma   <- .18       # strong decay during writing / "other"
refix_window <- 2.0  # seconds
k_maint      <- 0.3  # refixation boost
theta        <- 0.15 # retrieval threshold (normalised)

# Episodic learning
eta <- 0.001  # learning rate
rho <- 0.0002  # slow decay

# Activation spreading
semantic_spread_rate <- 0.02  
episodic_spread_rate <- 0.01 # episodic learning influence

# Vector for initial states
state <- init_state(S_by_source, edu_source)

# Data frame for results
rh <- tibble()

# Loop over the data in steps of 20, passing one event (row) at a time to simulate real-time updating
for(i in seq(1, nrow(data), 20)){
  tmp <- slice(data, i)
  res <- update_reading_history(tmp, state)
  state <- res$state
  rh <- bind_rows(rh, res$output)
}

# Add source information for plotting
rh <- left_join(rh, data_pe, by = "edu")

The model returns the following information:

Rows: 22,492
Columns: 8
$ token        <chr> "test", "test", "test", "test", "test", "test", "test", "test", "test", "test", "test",…
$ t_start      <dbl> 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.7…
$ activation   <dbl> 1.00000000, 0.29644592, 1.00000000, 0.29644592, 1.00000000, 0.29644592, 1.00000000, 0.1…
$ available    <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, F…
$ edu          <chr> "006bssf", "006bssf", "010isat", "006bssf", "010isat", "006bssf", "010isat", "006bssf",…
$ text         <chr> "but she still felt like a stranger at her own company whose remote policies were hapha…
$ source       <chr> "RW1_short_ascii", "RW1_short_ascii", "RW1_short_ascii", "RW1_short_ascii", "RW1_short_…
$ words_in_edu <int> 18, 18, 9, 18, 9, 18, 9, 18, 9, 12, 18, 9, 12, 15, 18, 9, 12, 15, 19, 18, 9, 12, 15, 19…

The model results are visualised in Figure 1. Activation values are relative indices of memory prominence, not measures of absolute memory strength. The model tracks dynamic changes in relative availability under time‑based decay and interference, which are assumed to be the relevant determinants of rereading and source‑use decisions during writing. It therefore permits statements about the relative prominence of propositions in memory (e.g. EDU “X” being more active than EDU “Y” at a given moment), but does not assign absolute memory‑strength units to activation values.

Figure 1: Reading history. Each line represents an EDU. Colour indicates whether the propositional information of the EDU is available or not.

Conceptual overview

The model treats each EDU as a propositional memory trace whose activation evolves continuously over time as readers alternate between reading, writing, and other cognitively demanding activities. Activation reflects the momentary accessibility of the information associated with each EDU, not its permanent storage in long‑term memory.

Activation increases when an EDU is processed during reading and decreases due to the passage of time, competition from newly processed information, and attentional diversion during non‑reading activities such as writing or planning. In addition, activation can be redistributed across semantically related propositions via a spreading‑activation mechanism.

The core assumptions of the model are:

In addition to fixed semantic knowledge, the model assumes that associative structure can be learned during reading itself. When propositions remain simultaneously active and available, their mutual associations may strengthen, giving rise to episodic connections that reflect the reader’s evolving integration of the text. These episodic associations are specific to the current reading episode and depend on reading order, attentional patterns, and task demands. They are not interpreted as new propositional representations but as changes in the connectivity of existing propositions.

Within this framework, higher‑level understanding such as gist is not represented as a separate memory trace. Instead, gist emerges as a stable pattern of co‑activation and learned association across propositions that persists across time, interference, and task transitions. Peripheral propositions, by contrast, form weaker or transient connections and are more susceptible to decay. Episodic learning therefore provides a mechanism by which central ideas can gradually dominate the activation landscape as comprehension unfolds.

The model operates in real time and produces a memory snapshot after each event, yielding continuous estimates of the activation and availability of all propositions encountered so far. This allows the model to track how the set of accessible ideas changes dynamically during writing-from-source tasks.

State variables

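At any time \(t\), the model maintains the following state variables:

  • \(A(t)\): the activation of each EDU encountered so far
  • \(t_{\text{last}}\): the time of the most recently processed event
  • \(t^{\text{fix}}\): the time of the last fixation on each EDU
  • \(S^{(s)}\): the fixed semantic similarity matrix for each source text \(s\)
  • \(E^{(s)}(t)\): the episodic association matrix for each source text \(s\), learned during reading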

The complete model state at time \(t\) can thus be characterised as:

\[ \text{state}(t) = \bigl\{ A(t),\; t_{\text{last}},\; t^{\text{fix}},\; S^{(s)},\; E^{(s)}(t) \bigr\} \]

In other words, propositional information is strengthened, weakened, supported, and suppressed over time under the combined influence of processing, interference, semantic knowledge, and episodic learning driven by co‑activation.

Default parameters

lambda = 0.08   # passive decay rate (per sec)
alpha  = 0.9    # encoding strength
beta   = 0.10   # interference from new reading
gamma  = 0.18   # decay during non-reading ("other")
theta  = 0.15   # availability threshold (normalised)

refix_window = 2.0  # refixation window (sec)
k_maint      = 0.3  # refixation boost

# Activation spreading
semantic_spread_rate = 0.02  # how much does semantic resonance matter
episodic_spread_rate = 0.01 # episodic learning influence

# Episodic learning
eta = 0.001  # Learning rate
rho = 0.0002  # slow decay

These values are intended for adult L1 readers processing propositional content.

Event representation

Each input event corresponds to a single attentional or task-related episode. Events are processed sequentially in time order.

Each event has the following attributes:

The model makes no assumptions about dwell structure or fixation clustering; any event stream that satisfies these conditions can be used.

Mathematical model

Passive time‑based decay

Between any two events, all EDU activations decay exponentially:

\[ \Delta t = \frac{t - t_{\text{last}}}{1000} \]

\[ A_e(t) = A_e(t_{\text{last}}) \cdot e^{-\lambda \Delta t} \]

where:

  • \(A_e(t)\) is the activation of EDU \(e\)
  • \(\lambda\) is the passive decay rate (per second)

This operation implements a baseline loss of activation over time, independent of current task demands or new input. Exponential decay reflects the assumption, common to activation‑based memory models, that representations become less accessible as a smooth function of elapsed time rather than disappearing abruptly.

Crucially, continuous decay captures the fact that propositional information fades from working accessibility unless it is actively maintained or reactivated. Passive decay provides a temporal reference frame against which the effects of interference, rehearsal, and spreading activation can be evaluated.

For non-reading events (event_type == "other", e.g. writing, planning, thinking), no new content is encoded. Instead, existing activations decay further:

\[ A_e \leftarrow A_e \cdot e^{-\gamma \cdot (d / 1000)} \]

where:

  • \(\gamma\) is the rate of additional decay due to cognitive engagement in non‑reading activity (per second)
  • \(d\) is the event duration in milliseconds

This step models task‑dependent decay capturing the idea that source information becomes less accessible when attention is directed away from the source text. Activities such as writing or planning do not merely allow time to pass; they actively compete for cognitive resources that might otherwise support maintenance of source propositions.

The parameter \(\gamma\) therefore represents additional decay due to attentional diversion, over and above passive time‑based fading. This allows the model to distinguish between simple temporal gaps and cognitively demanding activities that are known to accelerate forgetting of previously read content.
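
The two decay steps can be illustrated with a short sketch. The helper names passive_decay() and task_decay() and the activation values are invented for this example and are not part of rh-sa.R; the rates are the defaults listed in this document.

```r
# Rates taken from the default parameters listed in this document
lambda <- 0.08  # passive decay rate (per second)
gamma  <- 0.18  # additional decay during "other" events (per second)

# Hypothetical helpers (illustrative names, not the rh-sa.R functions)
passive_decay <- function(A, t, t_last, lambda) {
  dt <- (t - t_last) / 1000            # elapsed time in seconds
  A * exp(-lambda * dt)
}
task_decay <- function(A, duration_ms, gamma) {
  A * exp(-gamma * duration_ms / 1000) # event duration converted to seconds
}

# Two EDUs with invented activations: 5 s pass, then a 2 s writing event
A <- c(edu_a = 1.0, edu_b = 0.4)
A <- passive_decay(A, t = 15000, t_last = 10000, lambda = lambda)
A <- task_decay(A, duration_ms = 2000, gamma = gamma)
round(A, 3)
```

Both EDUs are scaled by the same factor, exp(-0.08 * 5 - 0.18 * 2) = exp(-0.76), so decay alone lowers all activations without changing their ratio. Relative prominence changes only through encoding, interference, maintenance, and spreading.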

Encoding, interference, and maintenance during reading

During reading events (event_type == "read"), activation dynamics combine encoding, interference, and maintenance processes.

Encoding

Encoding input is computed as fixation duration on a logarithmic scale, normalised by EDU length:

\[ I = \frac{\log(d + 1)}{w_e} \]

where \(d\) is the duration (in milliseconds) of the fixation on EDU \(e\) and \(w_e\) is the number of words in that EDU (words_in_edu).

This transformation serves three purposes:

  1. Logarithmic scaling reflects diminishing returns of fixation duration: very long fixations increase encoding strength, but not linearly.
  2. Normalisation by EDU length prevents long propositions from receiving disproportionately large activation simply because they contain more words.
  3. Continuous scaling allows encoding strength to vary smoothly rather than categorically.

As a result, \(I\) can be interpreted as an approximation of processing depth at the propositional level rather than raw visual exposure.
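
A small sketch makes the diminishing returns and the length normalisation concrete. The helper name encoding_input() and the durations and word counts are invented for illustration. The natural logarithm (R's default for log()) is assumed.

```r
# Hypothetical helper: encoding input I = log(d + 1) / w_e
# (natural logarithm assumed; d in milliseconds, w_e in words)
encoding_input <- function(duration_ms, words_in_edu) {
  log(duration_ms + 1) / words_in_edu
}

# Diminishing returns: doubling the fixation duration on a 16-word EDU
I_dur <- encoding_input(c(200, 400), words_in_edu = 16)
round(I_dur, 3)

# Length normalisation: same fixation duration on a short and a long EDU
I_len <- encoding_input(300, words_in_edu = c(5, 25))
round(I_len, 3)
```

Doubling the duration raises \(I\) by much less than a factor of two, whereas a fivefold increase in EDU length reduces \(I\) by exactly a factor of five.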

Interference

Reading a new EDU suppresses activation of all existing EDUs:

\[ A_j \leftarrow A_j \cdot e^{-\beta I} \quad \forall j \]

where \(\beta\) controls interference strength.

This operation implements similarity‑independent interference: encoding new information reduces the accessibility of all previously processed information, regardless of its content, because all representations compete for a limited pool of activation. Such competition is a core principle of cue‑based and activation‑based memory models. Importantly, interference is scaled by encoding strength \(I\). Deep engagement with a new proposition produces stronger competition than superficial processing, reflecting the assumption that deeply processed information induces greater forgetting of competing representations.

Refixation‑based maintenance

If the same EDU is refixated within a short temporal window, the refixation is treated not as a new encoding event but as short‑term conceptual rehearsal that stabilises an already active representation.

\[ \text{if } (t - t^{\text{fix}}_e) < \tau: \quad A_e \leftarrow A_e \cdot (1 + k) \]

where:

  • \(\tau\) is the refixation window (refix_window, in seconds)
  • \(k\) is the maintenance gain (k_maint)

Multiplicative scaling preserves the relative activation of the EDU while strengthening it enough to resist decay and interference. This captures the intuition that rereading shortly after reading helps keep an idea active rather than creating a qualitatively new memory trace.

Encoding update

After interference and maintenance, the currently fixated EDU receives an additive activation boost:

\[ A_e \leftarrow A_e + \alpha I \]

Finally, its fixation time is updated:

\[ t^{\text{fix}}_e \leftarrow t \]

This step establishes the longer‑lasting encoding effect of reading: propositions gain activation in proportion to processing depth, allowing them to persist beyond the immediate fixation. The parameter \(\alpha\) determines how strongly new information enters the activation landscape relative to decay and spreading. Updating the last fixation time ensures that subsequent refixations can be identified as maintenance rather than new encoding.
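
The sequence of interference, maintenance, and encoding for a single read event can be sketched as follows. This is a minimal illustration, not the update_reading_history implementation: the function read_step(), its arguments, and the toy events are assumptions, and times are given directly in seconds to keep the refixation check simple.

```r
# Parameters from the defaults listed in this document
alpha <- 0.9; beta <- 0.10
refix_window <- 2.0; k_maint <- 0.3

# Hypothetical single read-event update (illustrative only).
# A: named activation vector; last_fix: named vector of fixation times (s)
read_step <- function(A, last_fix, edu, t_sec, duration_ms, words_in_edu) {
  I <- log(duration_ms + 1) / words_in_edu     # encoding input
  A <- A * exp(-beta * I)                      # interference on all existing EDUs
  if (!edu %in% names(A)) A[edu] <- 0          # first encounter starts at zero
  recent <- edu %in% names(last_fix) &&
    (t_sec - last_fix[[edu]]) < refix_window
  if (recent) A[edu] <- A[edu] * (1 + k_maint) # refixation-based maintenance
  A[edu] <- A[edu] + alpha * I                 # additive encoding boost
  last_fix[edu] <- t_sec                       # remember this fixation time
  list(A = A, last_fix = last_fix)
}

# Read edu_a, then edu_b, then return to edu_a within the refixation window
s <- list(A = numeric(), last_fix = numeric())
s <- read_step(s$A, s$last_fix, "edu_a", t_sec = 0.0, duration_ms = 300, words_in_edu = 16)
s <- read_step(s$A, s$last_fix, "edu_b", t_sec = 0.5, duration_ms = 300, words_in_edu = 16)
s <- read_step(s$A, s$last_fix, "edu_a", t_sec = 1.0, duration_ms = 300, words_in_edu = 16)
round(s$A, 3)
```

After the third event, edu_a is more active than edu_b even though both received identical fixations, because the return to edu_a falls inside the refixation window and adds a second encoding boost. Passive decay between events is left out of this sketch.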

Semantic activation spreading

The semantic spreading mechanism implements knowledge‑based resonance between propositions that are semantically related, independent of explicit discourse structure. In the Landscape Model of reading comprehension (Broek et al. 1996, 1999), activation fluctuates across cycles as a function of:

  • attention to currently read units
  • residual activation of prior units
  • reactivation of related units via long‑term semantic memory

Recent developments of the landscape model (e.g., the LS‑R model) (Yeari and Broek 2015, 2016) operationalise semantic memory using distributional semantic representations (e.g., LSA), allowing activation to spread between text units based on their semantic similarity.

For each source text, a semantic similarity matrix \(S^{(s)}\) is constructed such that:

  • Rows and columns correspond to EDUs from the same source
  • \(S_{ij} \in [0,1]\) reflects semantic similarity between EDU texts
  • Similarity is computed from EDU‑level text representations (here using SBERT‑based semantic similarity)
  • The matrix is symmetric and numeric

Semantic spreading is implemented as:

\[ A^{new}_i = A_i + \gamma \sum_j S_{ij} A_j \]

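where \(\gamma\) here denotes a small semantic spreading rate (semantic_spread_rate in the code), not the task‑based decay rate \(\gamma\) (gamma) used for non‑reading events above.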

This mechanism captures:

  • activation of inferences and predictions
  • background conceptual resonance
  • reactivation of previously read but semantically related information

Unlike discourse‑structural spreading, semantic spreading does not assume explicit textual relations and can activate propositions that were never adjacent in the text.

Integrated activation spreading

Activation spreading in the model is governed by a composite spreading operator that combines fixed semantic knowledge with learned episodic associations. For each source text \(s\), activation spreads across propositions according to:

\[ A^{new}_i = A_i + \sum_j M^{(s)}_{ij} A_j \]

where the effective spreading matrix \(M^{(s)}\) is defined as:

\[ M^{(s)} = \gamma \, S^{(s)} + \epsilon \, E^{(s)} \]

Here:

  • \(S^{(s)}\) is the semantic similarity matrix for source text \(s\),
  • \(E^{(s)}\) is the episodic association matrix learned during reading,
  • \(\gamma\) controls the strength of semantic (knowledge‑based) spreading (semantic_spread_rate; distinct from the task‑based decay rate gamma),
  • \(\epsilon\) controls the influence of episodic associations on spreading (episodic_spread_rate).

This formulation implements the assumption that activation dynamics are constrained jointly by long‑term semantic knowledge and by episodic structure emerging from prior co‑activation. Semantic spreading provides stable background resonance based on conceptual similarity, whereas episodic spreading reflects reader‑ and episode‑specific integration of propositions encountered during the current reading episode.

Integrated spreading is applied only to propositions that are both active and represented in the corresponding semantic and episodic structures. Early in reading, episodic associations are weak or absent, and activation spreading is dominated by semantic similarity. As reading progresses, episodic associations accumulate and increasingly shape the activation landscape, allowing central ideas to reinforce one another and peripheral ideas to fade.

This integrated spreading mechanism provides the link between episodic learning and evolving availability of propositions and constitutes the core mechanism by which stable, gist‑like activation patterns emerge over time.
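
One spreading step with the composite operator can be sketched on a toy example. The three EDUs and the matrices S and E below are invented values, and the sketch spreads over all propositions rather than restricting spreading to active ones. The spreading rates are the defaults listed in this document.

```r
# Spreading rates from the defaults listed in this document
semantic_spread_rate <- 0.02
episodic_spread_rate <- 0.01

# Invented 3-EDU example: semantic similarity S and episodic associations E
edus <- c("edu_a", "edu_b", "edu_c")
S <- matrix(c(1.0, 0.5, 0.1,
              0.5, 1.0, 0.3,
              0.1, 0.3, 1.0), nrow = 3, byrow = TRUE,
            dimnames = list(edus, edus))
E <- matrix(0, 3, 3, dimnames = list(edus, edus))
E["edu_a", "edu_b"] <- E["edu_b", "edu_a"] <- 0.2

# Only edu_a is active before spreading
A <- c(edu_a = 1.0, edu_b = 0.0, edu_c = 0.0)

# Effective spreading matrix and one spreading step: A_new = A + M %*% A
M <- semantic_spread_rate * S + episodic_spread_rate * E
A_new <- A + drop(M %*% A)
round(A_new, 4)
```

Here edu_b receives more activation than edu_c for two reasons: it is semantically closer to edu_a, and it shares a learned episodic association with it. Because \(S_{ii} = 1\), the active EDU also receives a small self‑boost in this formulation.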

Episodic learning via co‑activation

The model includes a mechanism for episodic learning that allows associations between propositions to change as a function of their co‑activation during reading. This mechanism captures the idea that comprehension involves not only activating propositions but also integrating them into a situation‑specific representation whose structure reflects the reader’s processing history.

For each source text \(s\), the model maintains an episodic association matrix \(E^{(s)}\), which is initialised with zero weights and updated during reading. After each reading event and subsequent activation spreading, episodic associations are strengthened between propositions that are simultaneously active and retrieval‑ready. Formally, for two propositions \(i\) and \(j\):

\[ \Delta E_{ij} = \eta \cdot A_i \cdot A_j \]

where \(\eta\) is a small learning rate controlling how quickly episodic associations form. Learning is gated by availability, such that associations are updated only for propositions whose normalised activation exceeds the availability threshold. This constraint prevents spurious learning from weakly activated or transient propositions and reflects the assumption that learning primarily involves information that is functionally accessible.

Episodic associations decay slowly over time:

\[ E^{(s)} \leftarrow (1 - \rho) \cdot E^{(s)} \]

where \(\rho\) is a small decay parameter. This allows infrequently reinforced associations to weaken, while repeatedly co‑activated propositions develop stable connections.

Episodic associations participate in activation spreading alongside semantic similarity, thereby influencing future activation dynamics. Because episodic learning depends on the sequence and duration of reading events, this mechanism allows the model to capture reader‑specific integration and order effects, and to explain how certain propositions become central while others remain peripheral. Gist‑level understanding emerges as a consequence of these stable episodic associations interacting with graded activation dynamics.
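
The learning and decay rules above can be sketched in vectorised form. The function update_E_sketch() is a hypothetical illustration, not the update_E_by_coactivation function from rh-sa.R. It assumes that self‑associations stay at zero, that decay is applied once per update after strengthening, and that at least one activation is positive.

```r
eta <- 0.001; rho <- 0.0002; theta <- 0.15  # defaults from this document

# Hypothetical vectorised update (illustrative; assumes max(A) > 0)
update_E_sketch <- function(E, A, eta, rho, theta) {
  available <- (A / max(A)) >= theta        # availability gate
  idx <- names(A)[available]
  if (length(idx) >= 2) {
    dE <- eta * outer(A[idx], A[idx])       # eta * A_i * A_j for all pairs
    diag(dE) <- 0                           # no self-associations (assumption)
    E[idx, idx] <- E[idx, idx] + dE
  }
  (1 - rho) * E                             # slow decay of all associations
}

edus <- c("edu_a", "edu_b", "edu_c")
E <- matrix(0, 3, 3, dimnames = list(edus, edus))
A <- c(edu_a = 1.0, edu_b = 0.5, edu_c = 0.05)  # edu_c falls below threshold
E <- update_E_sketch(E, A, eta, rho, theta)
E
```

Only the pair edu_a and edu_b is strengthened. edu_c has a normalised activation of 0.05, below the threshold of 0.15, so its row and column remain zero.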

Availability for retrieval

To estimate which EDUs are likely retrievable at a given moment, activations are normalised:

\[ A^*_e = \frac{A_e}{\max_j A_j} \]

An EDU is considered available if:

\[ A^*_e \ge \theta \]

where \(\theta\) is a threshold parameter. Thus availability reflects relative retrievability under competition, not absolute memory strength.

Normalisation transforms raw activation values into relative prominence within the current memory state. Availability is therefore defined competitively: an EDU is retrievable not because it is absolutely strong but because it remains sufficiently strong relative to other active propositions.
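
The competitive definition of availability can be illustrated with a few invented activation values, using the default threshold listed in this document.

```r
theta <- 0.15  # availability threshold from the defaults in this document

A <- c(edu_a = 0.80, edu_b = 0.30, edu_c = 0.10)  # invented activations

# Normalise relative to the most active EDU and apply the threshold
A_norm <- A / max(A)
available <- A_norm >= theta
data.frame(edu = names(A), activation = round(A_norm, 3), available)

# Halving every activation leaves availability unchanged
A_half <- 0.5 * A
available_half <- (A_half / max(A_half)) >= theta
identical(available, available_half)
```

Because availability depends only on activation relative to the current maximum, a uniform loss of activation, such as passive decay, does not by itself make any EDU unavailable.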

Algorithmic summary

For each incoming event, processed sequentially in time order:

  1. Calculate elapsed time since the previous event:
    \[ \Delta t = \frac{t - t_{\text{last}}}{1000} \]

  2. Apply passive time‑based decay:
    • All EDU activations decay exponentially as a function of \(\Delta t\)

  3. For event type "other" (e.g., writing, planning, thinking):
    • Apply additional task‑based decay proportional to event duration, reflecting attentional diversion from source content

  4. For event type "read":
    • Compute encoding strength from fixation duration, normalised by EDU length
    • Apply interference to all active EDUs as a function of encoding strength
    • If the currently fixated EDU was fixated recently, apply refixation‑based maintenance (conceptual rehearsal)
    • Add encoding activation to the currently fixated EDU
    • Update the fixation time for that EDU

  5. Apply integrated activation spreading:
    • For the relevant source text, construct an effective spreading operator combining:
      • semantic similarity relations (knowledge‑based)
      • episodic associations learned through prior co‑activation
    • Redistribute activation across propositions using this composite structure
    • Spreading strength is governed by separate semantic and episodic weighting parameters

  6. Determine availability for retrieval:
    • Normalise activation values relative to the most active EDU
    • Mark propositions as available if their normalised activation exceeds a predefined threshold

  7. Update episodic associations (read events only):
    • Strengthen associations between propositions that are simultaneously active and available, proportional to their joint activation
    • Apply slow decay to episodic associations, allowing weakly supported links to fade over time

After each event, the model outputs a complete memory snapshot containing the current activation values and availability status for all EDUs encountered so far, as well as updated episodic association strengths.
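The steps above can be traced by hand for a single read event. The following is a minimal sketch, not the implementation itself: all parameter values, the toy similarity matrix, and the EDU names are assumptions chosen for illustration, and step 7 is written with `outer()` as a vectorised stand-in for a pairwise update.

```r
# Hypothetical parameter values (illustration only)
lambda <- 0.01; beta <- 0.10; alpha <- 1.00
semantic_rate <- 0.05; episodic_rate <- 0.05
theta <- 0.5; eta <- 0.05; rho <- 0.01

# Memory state before the event: two EDUs already encoded
A <- c(e1 = 1.0, e2 = 0.4)

# Incoming read event: 400 ms fixation on a new 10-word EDU,
# starting 2000 ms after the previous event
dt       <- 2000 / 1000
duration <- 400
n_words  <- 10
target   <- "e3"

# Steps 1-2: passive exponential decay over the elapsed time
A <- A * exp(-lambda * dt)

# Step 4: encoding strength, interference on existing EDUs, then encoding
input <- log(duration + 1) / n_words
A <- A * exp(-beta * input)
A[target] <- alpha * input

# Step 5: spreading over toy semantic (S) and episodic (E) structure
edus <- names(A)
S <- matrix(c(1.0, 0.2, 0.7,
              0.2, 1.0, 0.1,
              0.7, 0.1, 1.0),
            nrow = 3, byrow = TRUE, dimnames = list(edus, edus))
E <- matrix(0, 3, 3, dimnames = list(edus, edus))
M <- semantic_rate * S + episodic_rate * E
diag(M) <- 0                         # no self-spreading
A[edus] <- A[edus] + as.numeric(M %*% A[edus])

# Step 6: availability relative to the most active EDU
A_norm    <- A / max(A)
available <- A_norm >= theta

# Step 7: strengthen links among co-available EDUs, then slow decay
av <- edus[which(available)]
if (length(av) >= 2) {
  E[av, av] <- E[av, av] + eta * outer(A[av], A[av])
}
E <- (1 - rho) * E
diag(E) <- 0

print(round(A_norm, 3))
print(available)
print(round(E, 4))
```

In this toy run the weakly encoded EDU falls below the threshold, so only the link between the two available EDUs is strengthened; the episodic matrix stays symmetric with a zero diagonal.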

Full R implementation

# Initial state information -----------------------------------------------

init_state <- function(S_by_source, edu_source) {

  E_by_source <- lapply(S_by_source, function(S) {
    matrix(0, nrow = nrow(S), ncol = ncol(S),
           dimnames = dimnames(S))
  })

  list(
    A = numeric(),             # named activation vector
    last_t = NA_real_,         # last event time (ms)
    last_fix_time = numeric(), # last fixation per EDU (sec)
    S_by_source = S_by_source,
    E_by_source = E_by_source,
    edu_source = edu_source
  )
}

# Semantic similarity matrix for spreading activation ---------------------

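# Compute and cache EDU embeddings
# (sbert_model is not defined in this listing; it is assumed to be a
#  Sentence-BERT model object created beforehand, e.g. via reticulate)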
compute_sbert_embeddings <- function(edu_df) {

  # edu_df: columns edu, text
  texts <- edu_df$text

  emb <- sbert_model$encode(
    texts,
    convert_to_numpy = TRUE,
    normalize_embeddings = TRUE)

  rownames(emb) <- edu_df$edu
  emb
}

# Build semantic similarity matrix
build_S_sbert <- function(edu_df, embeddings) {

  edus <- edu_df$edu
  V <- embeddings[edus, , drop = FALSE]

  # cosine similarity
  S <- tcrossprod(V)

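  # numerical safety: force an exact unit diagonal; negative similarities
  # are clipped to zero so that spreading is never inhibitory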
  diag(S) <- 1
  S[S < 0] <- 0

  S
}

# Episodic learning update ------------------------------------------------
update_E_by_coactivation <- function(E, A, available, eta, rho) {

  edus <- names(A)
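  # which() drops NA availability flags instead of propagating them as NA names
  avail_edus <- edus[which(available[edus])]

  # No early return here: with fewer than two available EDUs the loops below
  # add nothing, but the slow decay at the end is still applied.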

  for (i in seq_along(avail_edus)) {
    for (j in seq_along(avail_edus)) {
      if (i < j) {
        ei <- avail_edus[i]
        ej <- avail_edus[j]

        delta <- eta * A[ei] * A[ej]
        E[ei, ej] <- E[ei, ej] + delta
        E[ej, ei] <- E[ei, ej]
      }
    }
  }

  # slow decay
  E <- (1 - rho) * E
  diag(E) <- 0

  E
}


# Combine spreading matrices ----------------------------------------------
combine_spreading_matrices <- function(S, E,
                                       semantic_rate,
                                       episodic_rate) {

  M <- semantic_rate  * S + episodic_rate  * E

  diag(M) <- 0
  M
}

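# Update activation -------------------------------------------------------
# Note: update_one_event() reads the model parameters (lambda, gamma, beta,
# alpha, k_maint, refix_window, theta, eta, rho, semantic_spread_rate,
# episodic_spread_rate) from the enclosing environment; they must be defined
# before the functions below are called.

# Fallback operator: returns y when x is NULL, empty, or NA
`%||%` <- function(x, y) if (is.null(x) || length(x) == 0 || is.na(x)) y else x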

update_one_event <- function(row, state) {

  A <- state$A
  last_fix_time <- state$last_fix_time

  # time elapsed (sec)
  if (is.na(state$last_t)) {
    dt <- 0
  } else {
    dt <- (row$t_start - state$last_t) / 1000
  }

  # universal passive decay
  if (length(A) > 0 && dt > 0) {
    A <- A * exp(-lambda * dt)
  }

  # ---- EVENT TYPES ----

  if (row$event_type == "other") {

    # task engagement without source input
    if (length(A) > 0) {
      A <- A * exp(-gamma * (row$duration / 1000))
    }

  } else if (row$event_type == "read") {

    # encoding and interference (EDU-length normalised)
    edu <- row$edu
    now <- row$t_start / 1000
    input <- log(row$duration + 1) / row$words_in_edu

    # interference from new reading
    if (length(A) > 0) {
      A <- A * exp(-beta * input)
    }

    if (!is.na(edu) &&
        edu %in% names(last_fix_time) &&
        !is.na(last_fix_time[edu]) &&
        (now - last_fix_time[edu]) < refix_window) {

      A[edu] <- (A[edu] %||% 0) * (1 + k_maint)
    }

    A[edu] <- (A[edu] %||% 0) + alpha * input
    last_fix_time[edu] <- now

    ## ---- combined spreading (S + E) ----
    src  <- row$source
    edus_all <- names(A)

    if (!is.na(src) && src %in% names(state$S_by_source)) {

      S_full <- state$S_by_source[[src]]
      E_full <- state$E_by_source[[src]]

      # only EDUs that exist in the structure matrices
      edus_struct <- intersect(edus_all, rownames(S_full))

      # spreading only makes sense with at least 2 EDUs
      if (length(edus_struct) >= 2) {

        S <- S_full[edus_struct, edus_struct, drop = FALSE]
        E <- E_full[edus_struct, edus_struct, drop = FALSE]

        M <- combine_spreading_matrices(
          S, E,
          semantic_rate  = semantic_spread_rate,
          episodic_rate  = episodic_spread_rate
        )

        A[edus_struct] <- A[edus_struct] + as.numeric(M %*% A[edus_struct])
      }
    }

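    ## ---- availability ----
    # guard against an all-zero state, where A / max(A) would be NaN
    peak <- max(A)
    if (is.finite(peak) && peak > 0) {
      A_norm <- A / peak
    } else {
      A_norm <- A * 0
    }
    available <- A_norm >= theta
    names(available) <- names(A)

    ## ---- episodic learning ----
    # only for events that map onto a known source text; the episodic
    # matrix is fetched here so this block does not depend on the
    # spreading block having run
    if (!is.na(src) && src %in% names(state$E_by_source)) {

      E_full <- state$E_by_source[[src]]

      # EDUs that exist in episodic matrix
      edus_E <- intersect(names(A), rownames(E_full))

      # episodic learning only makes sense with >= 2 EDUs
      if (length(edus_E) >= 2) {

        E_sub <- E_full[edus_E, edus_E, drop = FALSE]

        E_sub <- update_E_by_coactivation(
          E_sub,
          A = A[edus_E],
          available = available[edus_E],
          eta = eta,
          rho = rho
        )

        E_full[edus_E, edus_E] <- E_sub
      }

      state$E_by_source[[src]] <- E_full
    }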
  }

  ## ---- return updated state ----
  list(
    A = A,
    last_t = row$t_start,
    last_fix_time = last_fix_time,
    S_by_source = state$S_by_source,
    E_by_source = state$E_by_source,
    edu_source = state$edu_source
  )

}

update_reading_history <- function(df, state) {

  df <- df[order(df$t_start), ]

  outputs <- vector("list", nrow(df))

  for (i in seq_len(nrow(df))) {

    state <- update_one_event(df[i, ], state)

    A <- state$A

    # reporting snapshot
    if (length(A) > 0 && any(is.finite(A))) {
      A_norm <- A / max(A)
    } else {
      A_norm <- A
    }

    outputs[[i]] <- tibble::tibble(
      token      = df$token[i],
      t_start    = df$t_start[i],
      edu        = names(A_norm),
      activation = as.numeric(A_norm),
      available  = A_norm >= theta
    )

  }

  list(
    state = state,
    output = dplyr::bind_rows(outputs)
  )
}
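The listing does not show how its inputs are laid out, so the following sketch sets up a toy input in the shape the functions appear to expect. Everything here is an assumption for illustration: the parameter values, the EDU and source names, the hand-made similarity matrix (standing in for SBERT output), and the toy event log. The column names are the ones accessed in `update_one_event()` and `update_reading_history()`. The final call is executed only if the functions from the listing have been sourced, and it assumes the tibble and dplyr packages are installed and tibble is attached.

```r
# Hypothetical global parameters read by update_one_event()
lambda <- 0.01; gamma <- 0.05; beta <- 0.10; alpha <- 1.0
k_maint <- 0.2; refix_window <- 5          # refixation window in seconds
theta <- 0.5; eta <- 0.05; rho <- 0.01
semantic_spread_rate <- 0.05; episodic_spread_rate <- 0.05

# Toy similarity matrix for one source text with three EDUs
edus <- c("s1_e1", "s1_e2", "s1_e3")
S <- matrix(c(1.0, 0.3, 0.6,
              0.3, 1.0, 0.2,
              0.6, 0.2, 1.0),
            nrow = 3, byrow = TRUE, dimnames = list(edus, edus))
S_by_source <- list(source1 = S)
edu_source  <- c(s1_e1 = "source1", s1_e2 = "source1", s1_e3 = "source1")

# Event log: one row per fixation ("read") or non-reading episode ("other");
# times and durations are in milliseconds
events <- data.frame(
  token        = c("The", "model", NA, "decay"),
  t_start      = c(0, 450, 900, 4900),
  duration     = c(400, 350, 4000, 300),
  event_type   = c("read", "read", "other", "read"),
  edu          = c("s1_e1", "s1_e2", NA, "s1_e1"),
  words_in_edu = c(12, 9, NA, 12),
  source       = c("source1", "source1", NA, "source1"),
  stringsAsFactors = FALSE
)

# Run the model only if the functions from the listing are available
if (exists("init_state") && exists("update_reading_history")) {
  res <- update_reading_history(events, init_state(S_by_source, edu_source))
  print(head(res$output))
}
```

The last row revisits the first EDU within the assumed refixation window, so it would exercise the maintenance branch in addition to encoding.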

References

Anderson, John R. 1983. The Architecture of Cognition. Cambridge, MA: Harvard University Press.
Anderson, John R., and Gordon H. Bower. 1973. Human Associative Memory. Washington, DC: Winston.
Broek, Paul van den, Kimberly Risden, Charles R. Fletcher, and R. Thurlow. 1996. “A ‘Landscape’ View of Reading: Fluctuating Patterns of Activation and the Construction of a Stable Memory Representation.” In Models of Understanding Text, edited by Bruce K. Britton and Arthur C. Graesser, 165–87. Lawrence Erlbaum Associates.
Broek, Paul van den, Michael Young, Yuet-Yin Tzeng, and Tracy Linderholm. 1999. “The Landscape Model of Reading: Inferences and the Online Construction of a Memory Representation.” In The Construction of Mental Representations During Reading, edited by Herre van Oostendorp and Susan R. Goldman, 71–98. Lawrence Erlbaum Associates.
Chukharev-Hudilainen, Evgeny. 2019. “Empowering Automated Writing Evaluation with Keystroke Logging.” In Observing Writing, edited by Eva Lindgren and Kirk Sullivan, 38:125–42. Brill.
Lewis, Richard L., and Shravan Vasishth. 2005. “An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval.” Cognitive Science 29 (3): 375–419. https://doi.org/10.1207/s15516709cog0000_25.
Lewis, Richard L., Shravan Vasishth, and Julie A. Van Dyke. 2006. “Computational Principles of Working Memory in Sentence Comprehension.” Trends in Cognitive Sciences 10 (10): 447–54. https://doi.org/10.1016/j.tics.2006.08.007.
Myers, Jerome L., and Edward J. O’Brien. 1998. “Accessing the Discourse Representation During Reading.” Discourse Processes 26 (2–3): 131–57. https://doi.org/10.1080/01638539809545042.
Reimers, Nils, and Iryna Gurevych. 2019. “Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. https://arxiv.org/abs/1908.10084.
Reyna, Valerie F. 2008. “A Theory of Medical Decision Making and Health: Fuzzy Trace Theory.” Medical Decision Making 28 (6): 850–65. https://doi.org/10.1177/0272989X08327066.
Reyna, Valerie F., and Charles J. Brainerd. 1995. “Fuzzy-Trace Theory: An Interim Synthesis.” Learning and Individual Differences 7 (1): 1–75. https://doi.org/10.1016/1041-6080(95)90031-4.
Yeari, Menahem, and Paul van den Broek. 2015. “The Role of Textual Semantic Constraints in Knowledge-Based Inference Generation During Reading Comprehension: A Computational Approach.” Memory 23 (8): 1193–1214. https://doi.org/10.1080/09658211.2014.968169.
———. 2016. “A Computational Modeling of Semantic Knowledge in Reading Comprehension: Integrating the Landscape Model with Latent Semantic Analysis.” Behavior Research Methods 48: 880–96. https://doi.org/10.3758/s13428-016-0749-6.