This document describes a real‑time activation model used to estimate the availability of propositional content (elementary discourse units, EDUs; e.g. sentences) in memory during writing-from-source tasks. The model treats reading as a stream of memory‑updating events that dynamically shape which ideas are available for writing under time pressure and interference. It is driven by eye‑movement–derived events but abstracts away from word‑level processing, and it is intended to capture proposition‑level availability, not surface recall.
The model builds on activation‑based and interference‑based accounts of memory in psycholinguistics (ACT‑R, cue‑based retrieval, discourse models), according to which linguistic representations are continuously updated, decay over time, and compete for retrieval (Anderson 1983; Anderson and Bower 1973; Lewis and Vasishth 2005). Like cue‑based retrieval and discourse comprehension models, it assumes that forgetting reflects reduced accessibility rather than loss of representations. EDUs are treated as propositional units whose activation is strengthened by reading, weakened by interference from new content, and eroded by non‑source-reading activities such as writing. Availability is defined in relative terms, capturing the selective accessibility of propositions under competition. The model integrates insights from activation‑based memory theory, discourse processing, and eye‑movement research into a real‑time account of source memory during writing.
The present model extends a time‑ and interference‑based activation framework by embedding propositions in structured networks over which activation can spread. This implements a hybrid architecture that combines principles from activation‑based memory theory, discourse‑level structure models, and knowledge‑based spreading activation accounts of comprehension. At any point in time, each EDU has a continuous activation value that reflects its current accessibility in memory. Activations change dynamically as a function of reading, writing, decay, interference, and spreading activation.
Overall the model implements a hybrid activation‑based memory system with the following properties:
This architecture aligns with a large body of psycholinguistic work in which forgetting during reading is understood as a consequence of decay, interference, and competitive retrieval in a structured memory space rather than loss of representations (e.g. Lewis, Vasishth, and Van Dyke 2006; Myers and O’Brien 1998).
Full functions can be found at the end of this document and will be loaded here.
# Load functions for reading history model with activation spreading
source("../r/rh-sa.R")
Below is an example of the input format that the model expects. All variables are required except token. Each row is a time-stamped event with an associated duration and an event_type. event_type is "read" whenever edu is not NA and "other" whenever edu is NA. Thus, every "read" event is associated with an EDU, i.e. a text unit within the source texts, on which the fixation fell. The code assumes that edu values are unambiguous identifiers of text regions across source texts.
Rows: 8,354
Columns: 7
$ token <chr> "test", "test", "test", "test", "test", "test", "test", "test", "test", "test", "test",…
$ source <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "RW1_short_ascii", NA, NA, "RW1_short_ascii", "…
$ edu <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "001kgfc", NA, NA, "002smtf", "003twnf", "002sm…
$ t_start <dbl> 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.7…
$ event_type <chr> "other", "other", "other", "other", "other", "other", "other", "other", "other", "other…
$ duration <int> 797, 355, 1788, 4286, 1761, 2183, 2565, 2076, 120, 93, 160, 140, 107, 602, 307, 221, 19…
$ words_in_edu <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 16, NA, NA, 22, 24, 22, 16, 16, 16, NA, 18, NA,…
words_in_edu comes from the rst_tsv file:
Rows: 246
Columns: 4
$ edu <chr> "001wfwa", "002tiwf", "003atti", "004saec", "005osit", "006netd", "007tpid", "008bats",…
$ text <chr> "when faced with a particularly tough question on rounds during my intern year i would …
$ source <chr> "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_…
$ words_in_edu <int> 20, 25, 21, 18, 5, 39, 4, 25, 29, 19, 11, 36, 14, 32, 29, 16, 8, 23, 20, 31, 18, 35, 38…
Span information from the rst_tsv file is required to calculate the weights between PEs.
Rows: 183
Columns: 3
$ source <chr> "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii", "AM1_short_ascii"…
$ edu <chr> "001wfwa,002tiwf", "001wfwa,002tiwf,003atti,004saec,005osit", "001wfwa,002tiwf,003atti,004sae…
$ level <dbl> 9, 7, 5, 3, 1, 8, 10, 8, 6, 8, 6, 4, 4, 2, 4, 3, 0, 2, 2, 4, 5, 3, 1, 3, 1, 2, 1, 3, 5, 3, 13…
The input data format requires minimal data transformation from the incremental report returned by CyWrite (Chukharev-Hudilainen 2019) and no data cleaning. The data wrangling that would be minimally required (in R) is shown here:
jsonlite::fromJSON(txt = token, flatten = TRUE) %>%
select(eye) %>%
unnest(eye) %>%
# label events as "read" (fixation on an EDU) or "other"
mutate(event_type = case_when(!is.na(edu) ~ "read", TRUE ~ "other")) %>%
# add number of words in edu
left_join(data_pe, by = join_by(edu))
Finally we need a named vector with source as values and EDUs as names.
# EDUs by source
edu_source <- setNames(data_pe$source, data_pe$edu)
# Preview
edu_source[1:2]
001wfwa 002tiwf
"AM1_short_ascii" "AM1_short_ascii"
This function is currently not used by the model.
For semantic and discourse-related activation spreading we need to prepare two weight matrices. The implementation of the functions that generate these matrices can be found at the end of this document.
First, the function build_semantic_W() builds a semantic similarity matrix for knowledge‑based spreading activation for each source text on the basis of the data stored in data_pe. This matrix captures how closely related propositions are in terms of their lexical‑semantic content. This matrix provides the structural basis for landscape‑style spreading activation, in which activation can flow between propositions that are semantically related even in the absence of explicit discourse‑structural links.
Semantic representations are constructed using a standard distributional semantic approach, operationalised as follows:
This code creates a semantic similarity matrix for each source text separately.
semantic_W_by_source <- data_pe %>%
split(.$source) %>%
lapply(build_semantic_W)
Semantic similarity between EDUs is computed using cosine similarity between their TF‑IDF vectors. For each pair of EDUs \(i\) and \(j\):
\[ S_{ij} = \frac{\vec{v}_i \cdot \vec{v}_j}{\|\vec{v}_i\| \|\vec{v}_j\|} \]
where \(\vec{v}_i\) and \(\vec{v}_j\) are the TF‑IDF vectors of EDUs \(i\) and \(j\), respectively.
Cosine similarity yields values in the range \([0, 1]\), where higher values indicate greater semantic overlap.
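As a minimal illustration of the formula above, the following base R sketch computes the cosine similarity of two made-up term-weight vectors. The vectors and the helper cosine_sim() are hypothetical and only serve the example; the model itself relies on build_semantic_W().

```r
# Hypothetical TF-IDF weights for two EDUs over a five-term vocabulary
v_i <- c(0.0, 0.4, 0.0, 0.7, 0.2)
v_j <- c(0.3, 0.4, 0.0, 0.0, 0.2)

# Cosine similarity: dot product divided by the product of the vector norms
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

s_ij <- cosine_sim(v_i, v_j)
s_ij
```

Because TF‑IDF weights are non-negative, the result cannot fall below 0, and a vector compared with itself yields 1.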
The resulting semantic similarity matrix \(S\) has the following properties:
Row and column names are set to EDU identifiers, allowing direct alignment with the activation state vector. A preview is shown here:
001kgfc 002smtf 003twnf 004eahf 005wtsg 006bssf
001kgfc 1.00000000 0.10840610 0.01538937 0.08444951 0.07463417 0.02065252
002smtf 0.10840610 1.00000000 0.13354690 0.06980781 0.22185241 0.07885955
003twnf 0.01538937 0.13354690 1.00000000 0.04804877 0.08606724 0.05782249
004eahf 0.08444951 0.06980781 0.04804877 1.00000000 0.03901257 0.17790446
005wtsg 0.07463417 0.22185241 0.08606724 0.03901257 1.00000000 0.06385578
006bssf 0.02065252 0.07885955 0.05782249 0.17790446 0.06385578 1.00000000
This semantic similarity matrix approximates semantic memory structure in a distributional sense. It captures world‑knowledge‑based associations among propositions that are not necessarily adjacent in the text or directly linked by discourse relations.
In the model, this matrix supports landscape‑style spreading activation, whereby activation of one proposition partially activates other propositions that are semantically related. This mechanism accounts for phenomena such as:
Semantic spreading is implemented with a small weighting parameter, ensuring that it provides weak but persistent support rather than dominating direct encoding or discourse‑structural effects.
The function build_semantic_W_sbert() constructs a semantic similarity matrix for knowledge‑based spreading activation for each source text on the basis of the data stored in data_pe. This matrix is derived from sentence‑level semantic embeddings, allowing semantic relations to be captured independently of surface lexical overlap. It provides the structural basis for landscape‑style semantic spreading activation, in which activation can flow between propositions that are semantically related in a conceptual sense, even if they do not share words or are not directly connected by discourse structure.
Each EDU is mapped onto a dense semantic vector using a Sentence‑BERT (SBERT) model (Reimers and Gurevych 2019). SBERT models are trained to represent the meaning of sentences such that semantically equivalent or closely related sentences have similar vector representations, even when they differ substantially in wording. This makes them particularly well suited for short, propositional EDUs, where lexical overlap may be minimal.
SBERT is accessed via Python.
library(reticulate)
use_python("/usr/bin/python3", required = TRUE)
transformers <- import("sentence_transformers")
np <- import("numpy")
# initialise model (do once per session)
sbert_model <- transformers$SentenceTransformer("all-MiniLM-L6-v2")
# Generate SBERT embeddings for EDUs
emb_sbert_by_source <- data_pe %>%
split(.$source) %>%
lapply(compute_sbert_embeddings)
Semantic representations are constructed as follows:
# Build semantic similarity matrix for each source and its embeddings
edu_list <- data_pe %>% split(.$source)
common_names <- intersect(names(edu_list), names(emb_sbert_by_source))
semantic_W_by_source <- mapply(
build_semantic_W_sbert,
edu_df = edu_list[common_names],
embeddings = emb_sbert_by_source[common_names],
SIMPLIFY = FALSE)
SBERT embeddings approximate long‑term semantic knowledge learned from large corpora, including information about synonymy, paraphrase, and typical conceptual associations.
For each source text, a semantic similarity matrix \(S\) is computed by taking the cosine similarity between all pairs of EDU embeddings. For EDUs \(i\) and \(j\):
\[ S_{ij} = \frac{\vec{v}_i \cdot \vec{v}_j}{\|\vec{v}_i\| \|\vec{v}_j\|} \]
where \(\vec{v}_i\) and \(\vec{v}_j\) are the sentence‑level embedding vectors of EDUs \(i\) and \(j\), respectively. A preview is shown here:
001kgfc 002smtf 003twnf 004eahf 005wtsg 006bssf
001kgfc 1.0000000 0.2677321 0.1633561 0.4572471 0.1959719 0.1470794
002smtf 0.2677321 1.0000000 0.3595323 0.3795831 0.4427657 0.5298123
003twnf 0.1633561 0.3595323 1.0000000 0.2868217 0.4470820 0.3980450
004eahf 0.4572471 0.3795831 0.2868217 1.0000000 0.3417116 0.4565610
005wtsg 0.1959719 0.4427657 0.4470820 0.3417116 1.0000000 0.4074331
006bssf 0.1470794 0.5298123 0.3980450 0.4565610 0.4074331 1.0000000
Cosine similarity between embedding vectors can in principle range from \(-1\) to \(1\); the implementation sets negative values to zero, so the entries of \(S\) lie in \([0, 1]\), with higher values indicating greater semantic relatedness in embedding space. Because embeddings encode meaning beyond surface form, high similarity values can arise even when EDUs share few or no lexical items.
As in the previous implementation, a separate semantic similarity matrix is created for each source text, ensuring that semantic spreading remains source‑internal by default.
The resulting semantic similarity matrix \(S\) has the following properties:
Row and column names are set to EDU identifiers, allowing direct alignment with the activation state vector.
The SBERT‑based semantic similarity matrix is intended to approximate the reader’s semantic memory structure in a distributional sense. Embedding‑based similarity captures conceptual relatedness that may support:
In the model, this matrix supports landscape‑style semantic spreading activation, whereby activation of one proposition partially reactivates other propositions that are semantically related at the level of meaning rather than surface form. Semantic spreading is implemented with a small weighting parameter, ensuring that knowledge‑based resonance provides weak but persistent background support rather than dominating direct encoding or discourse‑structural integration mechanisms.
The function build_rst_W() constructs a discourse‑structural weight matrix that encodes the strength of structural relations among propositions as defined by Rhetorical Structure Theory (RST) span annotations. This matrix provides the basis for discourse‑structural spreading activation in the model, corresponding to construction–integration processes. The implementation can be found at the end of this document.
W_by_source <- data_spans %>%
split(.$source) %>%
lapply(build_rst_W)
The input data_spans data frame contains one row per RST span and includes:
level indicating the depth of the span within the RST tree
Lower levels correspond to more global discourse groupings, whereas higher levels correspond to more local, tightly integrated spans.
Each RST span contributes pairwise connections among all EDUs contained within that span. The contribution of a span is weighted by its hierarchical level according to the following mapping:
\[ w = \frac{1}{\text{level} + 1} \]
This transformation has three important properties:
level = 0)
Thus, EDUs that co‑occur in smaller, lower‑level spans receive stronger mutual support than EDUs that are only related at higher discourse levels.
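To make the level-to-weight mapping concrete, the short sketch below evaluates \(w = 1 / (\text{level} + 1)\) for a handful of hypothetical level values. The values are chosen for illustration only and are not taken from the span data.

```r
# Hypothetical span levels
level <- 0:5

# Weight contributed by a span at each level
w <- 1 / (level + 1)

data.frame(level = level, weight = round(w, 3))
```

The weight equals 1 at level 0 and decreases monotonically as the level value increases.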
The resulting matrix \(W\) is constructed as follows:
Initialisation: A square matrix is initialised with rows and columns corresponding to all EDUs appearing in the span annotations.
Accumulation of Span Contributions: For each RST span:
Connections accumulate across spans, allowing EDUs that repeatedly co‑occur across different discourse structures to develop stronger association weights.
This ensures that the matrix can be interpreted as distributing a bounded amount of activation across structurally related EDUs.
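The accumulation and row normalisation can be sketched on a toy example. The two spans, their levels, and the EDU identifiers e1 to e3 below are invented for illustration; the sketch mirrors the logic of build_rst_W() under these assumptions but is not a substitute for it.

```r
# Two hypothetical spans over three made-up EDU identifiers
spans <- list(
  list(edus = c("e1", "e2"),       level = 2),
  list(edus = c("e1", "e2", "e3"), level = 1)
)
all_edus <- c("e1", "e2", "e3")
W <- matrix(0, 3, 3, dimnames = list(all_edus, all_edus))

for (sp in spans) {
  w <- 1 / (sp$level + 1)        # span weight derived from its level
  pairs <- combn(sp$edus, 2)     # all unordered EDU pairs within the span
  for (j in seq_len(ncol(pairs))) {
    e1 <- pairs[1, j]
    e2 <- pairs[2, j]
    W[e1, e2] <- W[e1, e2] + w   # accumulate symmetrically
    W[e2, e1] <- W[e2, e1] + w
  }
}

# Row-normalise so that every row sums to 1
W_norm <- W / rowSums(W)
round(W_norm, 3)
```

In this toy case e1 and e2 co-occur in both spans, so their mutual weight ends up larger than the weight either of them shares with e3.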
An example preview of the RST weight matrix is shown here:
001kgfc 002smtf 003twnf 004eahf 005wtsg 006bssf
001kgfc 0.00000000 0.02932211 0.02816846 0.02680505 0.02513866 0.02513866
002smtf 0.02932211 0.00000000 0.02816846 0.02680505 0.02513866 0.02513866
003twnf 0.02820099 0.02820099 0.00000000 0.02683601 0.02516770 0.02516770
004eahf 0.02690947 0.02690947 0.02690947 0.00000000 0.02523659 0.02523659
005wtsg 0.02523659 0.02523659 0.02523659 0.02523659 0.00000000 0.02690947
006bssf 0.02523659 0.02523659 0.02523659 0.02523659 0.02690947 0.00000000
The final matrix has the following properties:
The RST‑based weight matrix represents discourse‑structural support relations among propositions. It operationalises the idea that propositions are embedded in a hierarchical discourse structure and that integration processes distribute activation preferentially along structurally local connections.
The following code illustrates how the reading history function update_reading_history can be applied to slices of data. These data slices can have any size from 1 to the overall length of the writing session. Details of the implementation of update_reading_history and a conceptual description can be found below.
# Model parameters
lambda <- .08 # slow passive time decay (/sec)
alpha <- .9 # moderate encoding strength
beta <- .1 # strong propositional-level interference from reading
gamma <- .18 # strong decay during writing / "other"
refix_window <- 2.0 # seconds
k_maint <- 0.3 # refixation boost
theta <- 0.15 # retrieval threshold (normalised)
# Activation spreading
discourse_spread_rate <- 0.01 # discourse structure (very low)
semantic_spread_rate <- 0.02 # semantic resonance
# Vector for initial states
state <- init_state()
# Data frame for results
rh <- tibble()
# Loop over the data in steps of 20 events (to simulate real time);
# note that slice(data, i) passes a single row to the model at each step
for(i in seq(1, nrow(data), 20)){
tmp <- slice(data, i)
res <- update_reading_history(tmp, state)
state <- res$state
rh <- bind_rows(rh, res$output)
}
# Add source information for plotting
rh <- left_join(rh, data_pe, by = "edu")
The model returns the following information:
Rows: 22,492
Columns: 8
$ token <chr> "test", "test", "test", "test", "test", "test", "test", "test", "test", "test", "test",…
$ t_start <dbl> 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.738678e+12, 1.7…
$ activation <dbl> 1.00000000, 0.29664513, 1.00000000, 0.29841672, 1.00000000, 0.30018625, 1.00000000, 0.1…
$ available <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, F…
$ edu <chr> "006bssf", "006bssf", "010isat", "006bssf", "010isat", "006bssf", "010isat", "006bssf",…
$ text <chr> "but she still felt like a stranger at her own company whose remote policies were hapha…
$ source <chr> "RW1_short_ascii", "RW1_short_ascii", "RW1_short_ascii", "RW1_short_ascii", "RW1_short_…
$ words_in_edu <int> 18, 18, 9, 18, 9, 18, 9, 18, 9, 12, 18, 9, 12, 15, 18, 9, 12, 15, 19, 18, 9, 12, 15, 19…
The model results are visualised in Figure 1. Activation values are relative indices of memory prominence and carry no absolute memory-strength units. The model therefore supports statements about the relative prominence of propositions (e.g. EDU “X” being more active than EDU “Y” at a given moment) but not about absolute memory strength. What it tracks are dynamic changes in relative availability under time‑based decay and interference, i.e. availability and competition among propositions, which are assumed to drive rereading and source-use decisions during writing.
Figure 1: Reading history. Each line represents an EDU. Colour indicates whether the propositional information of the EDU is available or not.
The model treats each EDU as a propositional memory trace whose activation evolves continuously over time as readers alternate between reading, writing, and other cognitively demanding activities. Activation reflects the momentary accessibility of the information associated with each EDU, not its permanent storage in long‑term memory.
Activation increases when an EDU is processed during reading and decreases due to the passage of time, competition from newly processed information, and attentional diversion during non‑reading activities such as writing or planning. In addition, activation can be redistributed across structurally and semantically related propositions via spreading‑activation mechanisms.
The core assumptions of the model are:
The model operates in real time and produces a memory snapshot after each event, yielding continuous estimates of the activation and availability of all propositions encountered so far. This allows the model to track how the set of accessible ideas changes dynamically during writing-from-source tasks.
At any time \(t\), the model maintains the following state variables:
The complete model state at time \(t\) can thus be characterised as:
\[ \text{state}(t) = \bigl\{ A(t),\; t_{\text{last}},\; t^{\text{fix}},\; W^{(s)},\; S^{(s)} \bigr\} \]
In other words, propositional information is strengthened, weakened, supported, and suppressed over time under the combined influence of processing, interference, discourse structure, and semantic knowledge.
lambda = 0.08 # passive decay rate (per sec)
alpha = 0.9 # encoding strength
beta = 0.10 # interference from new reading
gamma = 0.18 # decay during non-reading (“other”)
theta = 0.15 # availability threshold (normalised)
refix_window = 2.0 # refixation window (sec)
k_maint = 0.3 # refixation boost
# Activation spreading
discourse_spread_rate = 0.01 # weight of discourse-structural spreading
semantic_spread_rate = 0.02 # weight of semantic resonance
These values are intended for adult L1 readers processing propositional content.
Each input event corresponds to a single attentional or task-related episode. Events are processed sequentially in time order.
Each event has the following attributes:
event_type ∈ {read, other}
For read events only:
The model makes no assumptions about dwell structure or fixation clustering; any event stream that satisfies these conditions can be used.
Between any two events, all EDU activations decay exponentially:
\[ \Delta t = \frac{t - t_{\text{last}}}{1000} \]
\[ A_e(t) = A_e(t_{\text{last}}) \cdot e^{-\lambda \Delta t} \]
where:
This operation implements a baseline loss of activation over time, independent of current task demands or new input. Exponential decay reflects the assumption, common to activation‑based memory models, that representations become less accessible as a smooth function of elapsed time rather than disappearing abruptly.
Crucially, continuous decay captures the fact that propositional information fades from working accessibility unless it is actively maintained or reactivated. Passive decay provides a temporal reference frame against which the effects of interference, rehearsal, and spreading activation can be evaluated.
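A small worked example of passive decay, using the default \(\lambda = 0.08\) and made-up activations and time stamps (the EDU names e1 and e2 are hypothetical):

```r
lambda <- 0.08                       # passive decay rate per second (document default)
A_last <- c(e1 = 1.0, e2 = 0.5)      # hypothetical activations at the previous event
t_last <- 10000                      # previous event time in ms (made up)
t_now  <- 15000                      # current event time in ms (made up)

dt <- (t_now - t_last) / 1000        # elapsed time in seconds
A_now <- A_last * exp(-lambda * dt)  # the same decay factor applies to every EDU
A_now
```

Because every EDU is multiplied by the same factor, passive decay lowers all activations but leaves their ratios unchanged.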
For non-reading events (event_type == "other", e.g. writing, planning, thinking), no new content is encoded. Instead, existing activations decay further:
\[ A_e \leftarrow A_e \cdot e^{-\gamma \cdot (d / 1000)} \]
where:
This step models task‑dependent decay capturing the idea that source information becomes less accessible when attention is directed away from the source text. Activities such as writing or planning do not merely allow time to pass; they actively compete for cognitive resources that might otherwise support maintenance of source propositions.
The parameter \(\gamma\) therefore represents additional decay due to attentional diversion, over and above passive time‑based fading. This allows the model to distinguish between simple temporal gaps and cognitively demanding activities that are known to accelerate forgetting of previously read content.
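The following sketch applies the task-based decay step to made-up activations for a hypothetical 3-second writing episode, and contrasts it with passive decay over an interval of the same length. Both rates are the document defaults; everything else is assumed for illustration.

```r
lambda <- 0.08  # passive decay rate per second (document default)
gamma  <- 0.18  # additional decay rate during "other" events (document default)

A <- c(e1 = 0.8, e2 = 0.4)  # hypothetical activations before the event
d <- 3000                   # duration of a hypothetical writing episode (ms)

# Task-based decay over the duration of the "other" event
A_task <- A * exp(-gamma * (d / 1000))

# For comparison only: passive decay over an interval of the same length
A_passive <- A * exp(-lambda * (d / 1000))

rbind(task = A_task, passive = A_passive)
```

With these defaults, three seconds of writing reduce activation more than three seconds of passive decay would.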
During reading events (event_type == "read"), activation dynamics combine encoding, interference, and maintenance processes.
Encoding input is computed as fixation duration on a logarithmic scale, normalised by EDU length:
\[ I = \frac{\log(d + 1)}{w_e} \]
where \(d\) is the fixation duration (in ms) on EDU \(e\) and \(w_e\) is the number of words in that EDU.
This transformation serves three purposes:
1. Logarithmic scaling reflects diminishing returns of fixation duration: very long fixations increase encoding strength, but not linearly.
2. Normalisation by EDU length prevents long propositions from receiving disproportionately large activation simply because they contain more words.
3. Continuous scaling allows encoding strength to vary smoothly rather than categorically.
As a result, \(I\) can be interpreted as an approximation of processing depth at the propositional level rather than raw visual exposure.
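The behaviour of \(I\) can be checked with a few made-up fixation durations and EDU lengths. The helper encoding_input() is a hypothetical name introduced here for illustration; it simply restates the formula.

```r
# Hypothetical helper restating I = log(d + 1) / w_e
encoding_input <- function(duration_ms, words_in_edu) {
  log(duration_ms + 1) / words_in_edu
}

# Same EDU length, increasing fixation duration (ms)
I_dur <- encoding_input(c(200, 400, 600), words_in_edu = 16)

# Same fixation duration, increasing EDU length (words)
I_len <- encoding_input(400, words_in_edu = c(8, 16, 32))

round(I_dur, 3)
round(I_len, 3)
```

Each additional 200 ms adds less to \(I\) than the previous 200 ms did, and doubling the number of words halves \(I\).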
Reading a new EDU suppresses activation of all existing EDUs:
\[ A_j \leftarrow A_j \cdot e^{-\beta I} \quad \forall j \]
where \(\beta\) controls interference strength.
This operation implements similarity‑independent interference, a core principle of cue‑based and activation‑based memory models. Encoding new information reduces the accessibility of previously processed information because all representations compete for a limited pool of activation. Importantly, interference is scaled by encoding strength \(I\). Deep engagement with a new proposition produces stronger competition than superficial processing, reflecting that deeply processed information induces greater forgetting of competing representations.
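As a sketch of how interference scales with encoding strength, the example below applies the suppression factor \(e^{-\beta I}\) to made-up activations, once for a short and once for a long hypothetical fixation on a 16-word EDU.

```r
beta <- 0.10                           # interference strength (document default)
A <- c(e1 = 1.0, e2 = 0.6, e3 = 0.2)   # hypothetical activations before reading

I_shallow <- log(100 + 1) / 16         # brief fixation (100 ms) on a 16-word EDU
I_deep    <- log(1000 + 1) / 16        # long fixation (1000 ms) on the same EDU

A_shallow <- A * exp(-beta * I_shallow)
A_deep    <- A * exp(-beta * I_deep)

rbind(shallow = A_shallow, deep = A_deep)
```

The longer fixation suppresses existing activations more strongly, although with these parameter values the difference is small.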
If the same EDU is refixated within a short temporal window, the refixation is treated not as a new encoding event but as short‑term conceptual rehearsal that stabilises an already active representation.
\[ \text{if } (t - t^{\text{fix}}_e) < \tau: \quad A_e \leftarrow A_e \cdot (1 + k) \]
where:
Multiplicative scaling preserves the relative activation of the EDU while strengthening it enough to resist decay and interference. This captures the intuition that rereading shortly after reading helps keep an idea active rather than creating a qualitatively new memory trace.
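A minimal sketch of the maintenance rule, assuming that \(\tau\) corresponds to refix_window and \(k\) to k_maint as in the code at the end of this document. The helper maintain() and the time stamps are hypothetical.

```r
refix_window <- 2.0  # refixation window in seconds (document default)
k_maint      <- 0.3  # refixation boost (document default)

# Hypothetical helper: boost A_e only if the last fixation was recent enough
maintain <- function(A_e, now, t_fix) {
  if ((now - t_fix) < refix_window) A_e * (1 + k_maint) else A_e
}

boosted   <- maintain(A_e = 0.5, now = 101.2, t_fix = 100)  # 1.2 s after last fixation
unchanged <- maintain(A_e = 0.5, now = 103.5, t_fix = 100)  # 3.5 s after last fixation

c(boosted = boosted, unchanged = unchanged)
```

Only the refixation that falls inside the 2-second window receives the multiplicative boost.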
After interference and maintenance, the currently fixated EDU receives an additive activation boost:
\[ A_e \leftarrow A_e + \alpha I \]
Finally, its fixation time is updated:
\[ t^{\text{fix}}_e \leftarrow t \]
This step establishes the longer‑lasting encoding effect of reading: propositions gain activation in proportion to processing depth, allowing them to persist beyond the immediate fixation. The parameter \(\alpha\) determines how strongly new information enters the activation landscape relative to decay and spreading. Updating the last fixation time ensures that subsequent refixations can be identified as maintenance rather than new encoding.
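The additive encoding step can be sketched as follows. The helper encode() is a hypothetical simplification that treats previously unseen EDUs as having activation 0; the activations, durations, and EDU names are made up.

```r
alpha <- 0.9  # encoding strength (document default)

# Hypothetical helper: add the encoding boost, treating unseen EDUs as 0
encode <- function(A, edu, duration_ms, words_in_edu) {
  I <- log(duration_ms + 1) / words_in_edu
  prev <- if (edu %in% names(A)) A[[edu]] else 0
  A[edu] <- prev + alpha * I
  A
}

A <- c(e1 = 0.40)
A <- encode(A, "e2", duration_ms = 400, words_in_edu = 16)  # first fixation on e2
A <- encode(A, "e1", duration_ms = 400, words_in_edu = 16)  # e1 is read again
A
```

A newly read EDU enters the activation vector with \(\alpha I\), whereas an EDU that is already present has \(\alpha I\) added to its current activation.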
To estimate which EDUs are likely retrievable at a given moment, activations are normalised:
\[ A^*_e = \frac{A_e}{\max_j A_j} \]
An EDU is considered available if:
\[ A^*_e \ge \theta \]
where \(\theta\) is a threshold parameter. Thus availability reflects relative retrievability under competition, not absolute memory strength.
Normalisation transforms raw activation values into relative prominence within the current memory state. Availability is therefore defined competitively: an EDU is retrievable not because it is absolutely strong but because it remains sufficiently strong relative to other active propositions.
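A short example of the normalisation and threshold step, using the default \(\theta = 0.15\) and made-up raw activations:

```r
theta <- 0.15                             # availability threshold (document default)
A <- c(e1 = 0.90, e2 = 0.30, e3 = 0.09)   # hypothetical raw activations

A_norm <- A / max(A)          # activation relative to the strongest EDU
available <- A_norm >= theta  # availability under competition

data.frame(edu = names(A), activation = round(A_norm, 3), available = available)
```

Here e3 falls below the threshold only because e1 is so much stronger; with a weaker competitor the same raw activation could count as available.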
The first spreading mechanism implements discourse‑based support between propositions that are structurally related within a text.
In the Construction–Integration (CI) model, propositions are not independent items but nodes in a discourse network whose connections reflect textual relations and coherence structure (Kintsch 1988, 1998). During the integration phase, activation spreads across this network, stabilising coherent interpretations while less connected propositions lose activation.
Rhetorical Structure Theory (RST) (Mann and Thompson 1988) provides an explicit representation of such discourse structure, capturing how clauses and propositions are grouped into spans at different hierarchical levels. Propositions that co‑occur in smaller, lower‑level spans are assumed to be more tightly integrated than propositions that are only linked at higher, more global levels.
For each source text, a discourse‑structural weight matrix \(W^{(s)}\) is constructed such that:
Formally, CI‑style spreading is implemented as:
\[ A^{new}_i = A_i + s \sum_j W_{ij} A_j \]
where \(s\) is a small spreading parameter. Spreading is restricted to EDUs belonging to the same source text.
This mechanism captures:
It corresponds to the integration phase of CI models and to episodic connections in landscape‑style models of comprehension.
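The discourse-structural update can be illustrated with a toy three-EDU network. The weight matrix W below is invented (row-normalised, zero diagonal) and the activations are made up; only the spreading rate is the document default.

```r
s <- 0.01  # discourse spreading rate (document default)

edus <- c("e1", "e2", "e3")
A <- c(e1 = 1.0, e2 = 0.2, e3 = 0.0)  # hypothetical activations

# Hypothetical row-normalised discourse weights (each row sums to 1)
W <- matrix(c(0.0, 0.6, 0.4,
              0.6, 0.0, 0.4,
              0.5, 0.5, 0.0),
            nrow = 3, byrow = TRUE, dimnames = list(edus, edus))

# A_new_i = A_i + s * sum_j W_ij * A_j
A_new <- A + s * as.numeric(W %*% A)
A_new
```

The inactive EDU e3 picks up a small amount of activation from its structurally related neighbours, while the ordering of the three EDUs is unchanged.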
The second spreading mechanism implements knowledge‑based resonance between propositions that are semantically related, independent of explicit discourse structure.
In the Landscape Model of reading comprehension (Broek et al. 1996, 1999), activation fluctuates across cycles as a function of:
Recent developments of the landscape model (e.g., the LS‑R model) (Yeari and Broek 2015, 2016) operationalise semantic memory using distributional semantic representations (e.g., LSA), allowing activation to spread between text units based on their semantic similarity.
For each source text, a semantic similarity matrix \(S^{(s)}\) is constructed such that:
Semantic spreading is implemented as:
\[ A^{new}_i = A_i + \gamma \sum_j S_{ij} A_j \]
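where \(\gamma\) here denotes a small semantic spreading rate (semantic_spread_rate in the code), not the task‑based decay rate \(\gamma\) used for non‑reading events.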
This mechanism captures:
- activation of inferences and predictions
- background conceptual resonance
- reactivation of previously read but semantically related information
Unlike discourse‑structural spreading, semantic spreading does not assume explicit textual relations and can activate propositions that were never adjacent in the text.
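A toy version of the semantic update is sketched below. The similarity matrix S and the activations are invented; the spreading rate is the document default (semantic_spread_rate).

```r
semantic_spread_rate <- 0.02  # semantic spreading rate (document default)

edus <- c("e1", "e2", "e3")
A <- c(e1 = 1.0, e2 = 0.2, e3 = 0.0)  # hypothetical activations

# Hypothetical symmetric similarity matrix with unit diagonal
S <- matrix(c(1.0, 0.5, 0.1,
              0.5, 1.0, 0.2,
              0.1, 0.2, 1.0),
            nrow = 3, byrow = TRUE, dimnames = list(edus, edus))

# A_new_i = A_i + rate * sum_j S_ij * A_j
A_new <- A + semantic_spread_rate * as.numeric(S %*% A)
A_new
```

Because the similarity matrices built in this document have a diagonal of 1 and the sum runs over all \(j\), each EDU also receives a small contribution from its own activation in this step.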
"other" (e.g., writing, planning, thinking) apply additional task‑based decay proportional to the duration of the event"read":
After each event, the model outputs a complete memory snapshot containing the current activation values and availability status for all EDUs encountered so far.
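# Required packages -------------------------------------------------------
library(tidyverse) # dplyr, stringr, tibble
library(text2vec)  # itoken, create_dtm, TfIdf, sim2
# Initial state information -----------------------------------------------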
init_state <- function() {
list(
A = numeric(), # named activation vector
last_t = NA_real_, # last event time (ms)
last_fix_time = numeric() # last fixation per EDU (sec)
)
}
# Activation weights based on RST relations -------------------------------
build_rst_W <- function(spans) {
spans_source <- spans %>%
mutate(
edu_list = str_split(edu, ","),
weight = 1 / (level + 1)
)
edus <- unique(unlist(spans_source$edu_list))
W <- matrix(
0,
nrow = length(edus),
ncol = length(edus),
dimnames = list(edus, edus)
)
for (i in seq_len(nrow(spans_source))) {
span_edus <- spans_source$edu_list[[i]]
w <- spans_source$weight[i]
if (length(span_edus) < 2) next
pairs <- combn(span_edus, 2)
for (j in seq_len(ncol(pairs))) {
e1 <- pairs[1, j]
e2 <- pairs[2, j]
W[e1, e2] <- W[e1, e2] + w
W[e2, e1] <- W[e2, e1] + w
}
}
# row normalise (rows without any connections are left as zeros)
rs <- rowSums(W)
rs[rs == 0 | !is.finite(rs)] <- 1
W <- W / rs
W[is.na(W)] <- 0
W
}
# Semantic similarity matrix for spreading activation ---------------------
# Compute and cache EDU embeddings
compute_sbert_embeddings <- function(edu_df) {
# edu_df: columns edu, text
texts <- edu_df$text
emb <- sbert_model$encode(
texts,
convert_to_numpy = TRUE,
normalize_embeddings = TRUE)
rownames(emb) <- edu_df$edu
emb
}
# Build semantic similarity matrix
build_semantic_W_sbert <- function(edu_df, embeddings) {
edus <- edu_df$edu
V <- embeddings[edus, , drop = FALSE]
# cosine similarity
S <- tcrossprod(V)
# numerical safety
diag(S) <- 1
S[S < 0] <- 0
S
}
# Compute TF‑IDF‑based semantic matrix (previous version)
build_semantic_W <- function(edu_df) {
it <- itoken(
edu_df$text,
tokenizer = word_tokenizer,
progressbar = FALSE
)
vocab <- create_vocabulary(it)
vectorizer <- vocab_vectorizer(vocab)
dtm <- create_dtm(it, vectorizer)
tfidf <- TfIdf$new()
dtm_tfidf <- tfidf$fit_transform(dtm)
# cosine similarity
S <- sim2(dtm_tfidf, method = "cosine", norm = "l2")
S <- as.matrix(S)
rownames(S) <- edu_df$edu
colnames(S) <- edu_df$edu
S
}
# Spreading activation ----------------------------------------------------
# Landscape semantic spreading, knowledge‑based resonance, semantic memory
spread_activation_landscape <- function(A,
semantic_W_by_source,
edu_source,
gamma = 0.03) { # spreading of semantic information
if (length(A) == 0) return(A)
A_new <- A
active_edus <- names(A)
for (src in unique(edu_source[active_edus])) {
if (!src %in% names(semantic_W_by_source)) next
edus <- active_edus[edu_source[active_edus] == src]
if (length(edus) < 2) next
S <- semantic_W_by_source[[src]][edus, edus, drop = FALSE]
S <- apply(S, 2, as.numeric)
storage.mode(S) <- "double"
a_vec <- as.numeric(A[edus])
if (nrow(S) != length(a_vec)) next
A_new[edus] <- A[edus] + gamma * as.numeric(S %*% a_vec)
}
A_new
}
# Discourse spreading activation (construction integration / RST)
spread_activation_ci <- function(A, W_by_source, edu_source, s = 0.05) {
if (length(A) == 0) return(A)
A_new <- A
active_edus <- names(A)
for (src in unique(edu_source[active_edus])) {
# --- GUARD 1: source must exist in W_by_source ---
if (!src %in% names(W_by_source)) next
edus <- active_edus[edu_source[active_edus] == src]
# --- GUARD 2: need at least two EDUs ---
if (length(edus) < 2) next
W_src <- W_by_source[[src]]
# --- GUARD 3: weight matrix must be valid ---
if (is.null(W_src)) next
W <- W_src[edus, edus, drop = FALSE]
# force numeric
W <- as.matrix(W)
a_vec <- as.numeric(A[edus])
# --- GUARD 4: no spreading if degenerate ---
if (any(!is.finite(a_vec)) || nrow(W) == 0) next
# Add the weighted activation of related EDUs to each EDU's activation;
# the spreading factor s determines the strength of this contribution
A_new[edus] <- A[edus] + s * as.numeric(W %*% a_vec)
}
A_new
}
# Update activation -------------------------------------------------------
`%||%` <- function(x, y) if (is.null(x) || is.na(x)) y else x
update_one_event <- function(row, state) {
A <- state$A
last_fix_time <- state$last_fix_time
# time elapsed (sec)
if (is.na(state$last_t)) {
dt <- 0
} else {
dt <- (row$t_start - state$last_t) / 1000
}
# universal passive decay
if (length(A) > 0 && dt > 0) {
A <- A * exp(-lambda * dt)
}
# ---- EVENT TYPES ----
if (row$event_type == "other") {
# task engagement without source input
if (length(A) > 0) {
A <- A * exp(-gamma * (row$duration / 1000))
}
} else if (row$event_type == "read") {
edu <- row$edu
now <- row$t_start / 1000
# encoding strength (EDU-length normalised)
input <- log(row$duration + 1) / row$words_in_edu
# interference from new reading
if (length(A) > 0) {
A <- A * exp(-beta * input)
}
if (!is.na(edu) &&
edu %in% names(last_fix_time) &&
!is.na(last_fix_time[edu]) &&
(now - last_fix_time[edu]) < refix_window) {
A[edu] <- (A[edu] %||% 0) * (1 + k_maint)
}
# encoding
A[edu] <- (A[edu] %||% 0) + alpha * input
last_fix_time[edu] <- now
}
# spreading activation across propositions
# after encoding / interference
# discourse / CI spreading: based on RST relations
A <- spread_activation_ci(
A,
W_by_source,
edu_source = edu_source,
s = discourse_spread_rate
)
# semantic / landscape spreading: based on semantic similarity
A <- spread_activation_landscape(
A,
semantic_W_by_source,
edu_source,
gamma = semantic_spread_rate
)
list(
A = A,
last_t = row$t_start,
last_fix_time = last_fix_time
)
}
update_reading_history <- function(df, state) {
df <- df[order(df$t_start), ]
outputs <- vector("list", nrow(df))
for (i in seq_len(nrow(df))) {
state <- update_one_event(df[i, ], state)
A <- state$A
# reporting snapshot
if (length(A) > 0 && any(is.finite(A))) {
A_norm <- A / max(A)
} else {
A_norm <- A
}
outputs[[i]] <- tibble(
token = df$token[i],
t_start = df$t_start[i],
edu = names(A_norm),
activation = as.numeric(A_norm),
available = A_norm >= theta
)
}
list(
state = state,
output = dplyr::bind_rows(outputs)
)
}