An introduction to the randomverse

Juan Pablo Acuña González

20 enero 2023

Abstract

The broad philosophy of the randomverse is to capture new developments of computer simulated or in silico systems by integrating quantitative methods into biology with an emphasis into molecular biology. It is intended to be employed by students, teachers, researchers, and professionals in the mathematical, statistical, computational and biological sciences in all levels of scholarity from secondary to tertiary education. The randomverse is built by moving towards synthesizing information to generate and test new hypotheses about how network biology works and focusing on stochastic models for biochemical reactions using the standard algorithm for simulating these dynamics, ‘the Gillespie algorithm’ (Gillespie 1977), since the kinetics of biological processes at the intra-cellular level keep showing a stochastic behaviour that is better understood by building in silico models (Schneider 2016). This is still an active research area, so the main purpose here is to make it more accessible for transdisciplinary research and to fill in the gap by providing a tool to analyze actual models and data used, develop training with case studies and stress the construction of models with a discrete stochastic simulation perspective. This article illustrates several case studies of well–mixed stochastic biochemical reaction network models to investigate its behavior, accompanied with working code examples by implementing them in a shiny app (Chang et al. 2022).

1 State of the art and general background

In silico systems biology refers to the modeling in biology at the systems level through synthetic devices. Advances in the study of biological systems rely on the assumed underlying predictive factor that models provide in understanding the functionality of the whole of a system as well as the different parts in any system. The systematic approach allows to extend across species to predict behavior of similar systems in humans which in turn can be applied to medical studies and develop health solutions. When models utilize simulations experimentation can be validated with a saving of time and resources.

By combining previous experience of how a system works with the input of new data, the construction and analysis of models simplify the description of systems. Empirically, models explore what happens at the organism level when the individual parts represented as variables are disturbed. These perturbations lead to new hypothesis about phenomena to justify such discrepancies and so, the output of the initial model becomes the input of a new model and the cycle starts again.

To attack in silico modeling proper computing technology is nedeed. In the last three decades, the $\texttt{R}$ programming language (R Core Team 2022) has become the standard tool in statistics by using a balance of a high-level language with one of the best support for stochastic simulation and statistical analysis. The great advantages of $\texttt{R}$ against other technologies are its completely free open-source software nature and its increased use in bioinformatics and several applications of systems biology (Giorgi, Ceraolo, and Mercatelli 2022).

In the field of stochastic modeling for in silico biology several efforts have been made in the form of $\texttt{R}$ packages. smfsb (D. Wilkinson 2018) is the main reference for combining modeling, biological applications and $\texttt{R}$ developement. Packages adaptivetau (Johnson 2019) and GillespieSSA (Pineda-Krch and Cannoodt 2022) implement Gillespie’s exact stochastic simulation algorithm (direct method) and several approximate methods. GillespieSSA2 (Cannoodt et al. 2020) and sar (still at its developmental stage) present improvements over GillespieSSA with faster implementations of Gillespie’s algorithm. The package lnar (Giagos 2010) provides likelihood-based inference of stochastic kinetic models using a Linear Noise Approximation. In regards to the shinyverse, the app hosted at https://edbonneville.shinyapps.io/gillespie_shiny/ allows the user to simulate and explore infectious disease epidemics stochastically with the Gillespie algorithm.

As for the literature, some of the textbooks in stochastic modeling that deal with biological applications are D. J. Wilkinson (2020) and Fuchs (2013). The text by Allen (2011) stresses applications of stochastic processes in biology and in other areas as well, including finance, economics, physics, chemistry, and engineering. Another very recent reference is Zobitz (2022) which is written in a very modern style employing the tidyverse (Wickham et al. 2019) and allied libraries.

All of the aforementioned material has been developed towards a quantitative focus, therefore, the randomverse proposes to unify previous software into a more qualitative approach in order to give deeper understanding into the biology rather than into the algorithms, and in doing so, providing a better tool for decision making in the areas of drug development and health improvement.

2 Stochastic modeling in molecular biology, biochemistry and genetics

2.1 Introduction

Stochastic modeling of natural phenomena is an indispensable tool in understanding complex diseases by allowing to incorporate factors such as the mechanisms of heredity and mutation, the causes of senescence or the emergence of fatal viral infections. The functionality of these processes are based on chemical reactions in organisms and the structures of living cells.

The overlapping areas of molecular biology, biochemistry and genetics have been studied from an enormous set of modeling techniques. The randomverse focuses on addressing applications which utilize the framework of biochemical reactions for the representation of biological processes.

This produces a representation of biological systems as networks of coupled biochemical reactions that can be simulated in silico by different algorithms depending on assumptions made about the underlying kinetics.

Biochemical reactions are typically specified by reaction equations of the form as in Equation (1).

\[\begin{equation} \begin{aligned} a_1 A_1 + . . . + a_k A_k → b_1 B_1 + . . . + b_l B_l, \end{aligned} \tag{1} \end{equation}\]

where $k$ is the number of reactants and $l$ is the number of products. $A_i$ represents the $i$th reactant molecule and $B_j$ is the $j$th product molecule. $a_i$ is the number of molecules of $A_i$ consumed in a single reaction step, and $b_j$ is the number of molecules of $B_j$ produced in a single reaction step. The coefficients $a_i$ and $b_j$ are known as stoichiometries. The stoichiometries are assumed to be natural numbers with greatest common divisor equal to one.

A starting point for describing biological phenomena by the use of reaction equations and their associated mathematical theory is to describe the natural laws which underlie the essential process of information transfer between information-carrying molecules, the molecular biology central dogma.

Cells in the human body specialize in different tasks, and all these different kinds of cells come from a single-celled embryo. With every division of that cell instructions are transmitted to new cells.

These instructions can be coded into a string which represents a polymer made of nucleotides. The four nucleotides in DNA molecules, Adenine (A), Guanine (G), Cytosine (C) and Thymine (T) arranged in speciﬁc sequences store the information for the functioning of living organisms.

The basic path of DNA replication consists on, if a gene is activated, it’s transcription into messenger RNA (mRNA), and if this gene is protein coding, the mRNA gets translated into a protein. Transcription occurs in the nucleus for eukaryotes and translation occurs in the cytoplasm.

The gene is a central concept for information encoding of proteins since all living cells depend on them because they are essential for structural and functional tasks in organisms.

Through a stochastic process the effects prevalent at the scale of genetic and biochemical networks can be modeled in order to represent the noisy characteristics of biological systems. The processes involved in gene expression and regulation can be simplified by generating auto-regulatory networks. These networks describe the regulatory mechanisms that determine the required amount of proteins produced by each cell via a set of coupled biochemical reactions. Therefore, transcription and translation can be assembled to construct genetic networks. The components in Equation (2) summarize a very simple auto-regulatory genetic network.

\[\begin{equation} \begin{aligned} g \rightarrow g + r \,\ (\text{transcription}) \\ g + P_2 \leftarrow \rightarrow g · P_2 \,\ (\text{repression}) \\ r \rightarrow r + P \,\ (\text{translation}) \\ 2P \leftarrow\rightarrow P_2 \,\ (\text{dimerisation}) \\ r \rightarrow ∅ \,\ (\text{mRNA} \,\ \text{degradation}) \\ P \rightarrow ∅ \,\ (\text{protein} \,\ \text{degradation}). \end{aligned} \tag{2} \end{equation}\]

In these equations, $P$ stands for a protein, $P_2$ for the compound of two of these proteins (dimer), $g$ for a gene and $r$ for a transcript of $g$. The empty set, $∅$, indicates the degradation of the product of a reaction and the dot, $\cdot$, represents the compound of two reactants.

The above mentioned application is intrinsically stochastic.

2.2 Representation of biological models

Biological phenomena are represented as models by specifying a biochemical network with a list of reactions corresponding to the system of interest. Also, the rate of every reaction and the initial amounts of each reacting species are needed in this representation.

For example, in the auto-regulatory network there are eight reactions, since two of them can happen in both directions, a process known as reversibility. Each reaction has a rate law associated with it that quantifies the reaction propensity to take place and that depends on the current amounts of available reactants. In addition, the auto-regulatory system is defined with an initial amount for each of the five biological species involved ($g·P_2$ , $g$, $r$, $P$, and $P_2$). With all the previous elements, the reactions, the rate laws, and the initial amounts, the model is specified and can be simulated dynamically in silico.

Once the analytically representation is established, a graphical representation of the model is constructed to understand reaction networks as pathway diagrams. The graphical model for a set of coupled biochemical reactions uses a graph where the nodes are partitioned into a set that represents the species (elliptical nodes) and another set that represents the reactions (rectangular nodes). The arcs from a species node to a reaction node indicates that those species are reactants for those reactions and the arcs from reactions node to species nodes indicates that those species are products of those reactions. The weights associated with the arcs represent the stoichiometries associated with the reactants and products (unnumbered arcs are assumed to have a weight of one). Finally, there’s an integer number of molecules from each of the nodes representing a species of a place (P) to a transition (T) indicating the abundance of that species associated with it at a given time. A set of molecule numbers at any given point in time is known as the current marking of the net.

Therefore the graphical model for auto-regulation has the following representation:

Figure 1: Auto-regulatory genetic reaction network

The reaction network in Figure 1 implies a loop. It shows the evolution of particular reactions when they occur, so within each step of the process, a new network will be given with its respective new markings. To measure change in such a system a Pre matrix containing the weights of the arcs going from places to transitions and a Post matrix containing the weights of arcs from transitions to places are compared in a tabular format. The numbers of molecules associated with each species will decrease according to the numbers on Pre and increase according to the numbers on Post every time a reaction takes place. By evaluating the difference between the Post and the Pre matrices the state change of the system is obtained.

The tabular representation of the auto-regulation model is shown in Table 1.

Table 1: Tabular representation of auto-regulatory system.
	Reactants ( Pre)					Products (Post)
Species	$g⋅P_2$	$g$	$r$	$P$	$P_2$	$g⋅P_2$	$g$	$r$	$P$	$P_2$
Repression		1			1	1
Reverse repression	1						1			1
Transcription		1					1	1
Translation			1					1	1
Dimerisation				2						1
Dissociation					1				2
mRNA degradation			1
Protein degradation				1

And the tabular representation of the overall effect of each reaction on the marking (state) of the network is in Table 2.

Table 2: Overall effect of each reaction on the marking of the auto-regulatory network.
Species	$g⋅P_2$	$g$	$r$	$P$	$P_2$
Repression	1	-1	0	0	-1
Reverse repression	-1	1	0	0	1
Transcription	0	0	1	0	0
Translation	0	0	0	1	0
Dimerisation	0	0	0	-2	1
Dissociation	0	0	0	2	-1
mRNA degradation	0	0	-1	0	0
Protein degradation	0	0	0	-1	0

To give a matricial representation of this net effect, Equation (3) introduces the matrix $A$ as the difference between Post and Pre, which is called the reaction matrix. The transpose of this matrix is denoted $S$ and is called the stoichiometry matrix.

\[\begin{equation} A = Post - Pre = \begin{pmatrix} 1 & -1 & 0 & 0 & -1 \\ -1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -2 & 1 \\ 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 \\ \end{pmatrix} \tag{3} \end{equation}\]

The matricial representation of the list of reactions that take place is written in the form of a vector. Whith a matricial representation of biological processes, the current marking in the model can be updated using matrix algebra. The transitions that have taken place subsequent to the current marking of the net are represented as a vector $r$, and the current marking itself is represented with a vector $M$. Then the new marking of the network at a given time is the vector $\overline{M}$ that is related to the old marking via the matrix Equation (4).

\[\begin{equation} \overline{M} = M + Sr. \tag{4} \end{equation}\]

2.3 Simulation

To conduct a simulation study a system of reactions involving $u$ species $X_1 , X_2 , . . . , X_u$ and $v$ reactions, $R_1 , R_2 , . . . , R_v$ is considered. The number of molecules of $X_i$ at time $t$ is $X_{it}$, so the state of the system at time $t$ is $X_t = (X_{1t} , X_{2t} , . . . , X_{ut})^T$. The number reactions of type $R_i$ in the time window $(0, t]$ is $R_{it}$, and so $R_t = (R_{1t} , . . . , R_{vt} )^T$. Then Equation (5) updates the state of the system.

\[\begin{equation} X_t − X_0 = SR_t. \tag{5} \end{equation}\]

Here

\[\begin{equation} S = (Post − Pre)^T, \tag{6} \end{equation}\]

is the $u × v$ stoichiometry matrix of the reaction network. Each stochastic rate constant for a reaction $R_i$ is denoted as $c_i$ and it’s associated rate law or hazard function is denoted $h_i (x, c_i )$, where $x = (x_1 , x_2 , . . . , x_u )^T$ is the current marking (state) of the system. When the state of the system is $x$ at time $t$, the probability that an $R_i$ reaction (or transition) will occur in the time interval $(t, t + dt]$ is given by $h_i (x, c_i ) dt$. In the absence of any other reactions taking place the time to such a reaction event is an exponential random quantity denoted $Exp(h_i (x, c_i ))$.

Since the system consists of $v$ reactions and the hazard for a type $i$ reaction is $h_i (x, c_i )$, then the hazard for a reaction of some type occurring follows Equation (7).

\[\begin{equation} h_0 (x, c) ≡ \sum_{i=1}^v h_i (x, c_i ). \tag{7} \end{equation}\]

A reaction event $Exp(h_0 (x, c))$ that updates the process is of random type proportional to $h_i (x, c_i )$ (the reaction type will be $i$ with probability $h_i (x, c_i )/h_0 (x, c)$) and independent of the time to the next event. The simulation progresses using the time to the next event and then the event type in each step of the system gets updated. This standard discrete event simulation procedure is known as the Gillespie algorithm (Gillespie 1977) in the context of genetic and biochemical kinetics.

A shiny version of this algorithm is implemented into the app under the $\texttt{Stochastic}$ tab and can be summarized as follows:

The system starts with initial conditions $t = 0$, rate constants $c_1 , c _2 , . . . , c_v$ and initial numbers of molecules for each species, $x_1 , x _2 , . . . , x_u$ .
A hazard function $h_i (x, c_i )$ is calculated for each $i = 1, 2, . . . , v$ based on the current state $x$.
The combined reaction hazard $h_0 (x, c) ≡ \sum_{i=1}^v h_i (x, c_i )$ is obtained.
A sample from an $Exp(h_0 (x, c))$ random quantity to simulate the time to the next event $t'$ is calculated.
The time gets upadeted as $t : = t + t'$ .
A sample from a discrete random quantity with probabilities $h_i (x, c_i ) / h_0 (x, c), i = 1, 2, . . . , v$ is calculated to simulate the reaction index $j$ to the next event type.
The state $x$ gets updated according to reaction $j$ as $x : = x + S (j)$ where $S(j)$ denotes the $j$th column of the stoichiometry matrix $S$.
The updated values for $x$ and $t$ get stacked in the form of a list.
If $t < T_{max}$, the algorithm returns to step $2$.
The output, from both $x$ and $t$, gets binded as a data frame once the loop is over.

This routine repeats itself an unknown number of times until the $T_{max}$ condition is reached, since in practice it’s not usually known in advance the number of reactions of a given process and only a time barrier can be studied. The pre-stacking and post-binding in the algorithm allows for a tidy and efficient code structure.

2.4 Deterministic approach

Deterministic kinetics of biological systems are governed by rate laws represented by systems of differential equations. When the differential equations are too complicated to be solved analytically, the solutions can be examined numerically by integrating a system of Ordinary Differential Equations (ODE). The simplest and most practical approach employs the first-order Euler method.

The basic idea of numerically integrating a system of differential equations representing a system starts by writing the system as a vector of ODE as in Equation (8).

\[\begin{equation} \frac{dX}{dt} = f(X), \tag{8} \end{equation}\]

where $X$ is a $p$-dimensional vector and $f (·) : R^p → R^p$ is an arbitrary (non-linear) $p$-dimensional function of $X$, which for the previous in silico models considered, $f(X) = Sr(X)$.

Employing the definition of derivative defined in Equation (9)

\[\begin{equation} \frac{dX}{dt}(t) = \lim_{\Delta t \to 0} \frac{X(t + \Delta t)- X(t)}{\Delta t}, \tag{9} \end{equation}\]

with a small increment $∆t$ Equation (10) is obtained

\[\begin{equation} \frac{X(t+\Delta t)-X(t)}{\Delta t} \approx f(X(t)) \tag{10} \end{equation}\]

and rearranging Equation (11) gives a simple method for computing $X(t + ∆t)$ from $X(t)$. Starting with the initial condition $X(0)$, then $X(∆t), X(2∆t), X(3∆t), . . .$ can be computed to get the full dynamics of the system.

\[\begin{equation} X(t + ∆t) \approx X(t) + ∆tf (X(t)). \tag{11} \end{equation}\]

The shiny analogue for the deterministic simulator (also in function of the time horizon) is provided under the $\texttt{Deterministic}$ tab and can be summarized as follows:

The system starts with initial conditions $t = 0$, initial rate laws $f(X(t))$ and initial numbers of molecules for each species, $X= (x_1 , x _2 , . . . , x_u)$.
A $\Delta t$ time step is fixed
The time to the next event is calculated as $t : = t + \Delta t$.
The next event type is calculated as $X = X + f(X(t)) \Delta t$.
The updated values for $X$ and $t$ get stacked in the form of a list.
If $t < T_{max}$, the algorithm returns to step $2$.
The output, from both $X$ and $t$, gets binded as a data frame once the loop is over.

2.5 Summary of simulation output

The $\texttt{Summary}$ tab provides graphical output of realizations in function of time and the number of trajectories. Histograms and facets are optional visualizations.

3 Basic illustration through the shiny app

The url for this app is https://randomverse.shinyapps.io/RandomVerse/.

The input variables are specified by a model parameters. These include the body of the function, the arguments of the function and it’s values, the Pre and Post reaction matrices of the system, the dimensions of the model specifying the number of reactions and species as well as species and model names. With all the specified parameters a time horizon is set and the corresponding realizations for a model are returned as an output with a species specification.

3.1 Stochastic model

3.1.1 Auto-regulatory genetic network

Table 3: Shiny components of the auto-regulatory genetic network. Stochastic model.
Component	Input
Function Body	$\texttt{th1x2x5,th2x1,th3x2,th4x3,th50.5x4(x4-1),th6x5,th7x3,th8*x4}$
Function Arguments	$\texttt{x1,x2,x3,x4,x5,th1,th2,th3,th4,th5,th6,th7,th8}$
Initial Markings with Hazards	$\texttt{0,1,2,10,12,1,10,0.01,10,1,1,0.1,0.01}$
Pre matrix	$\texttt{0,1,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,2,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0}$
Post matrix	$\texttt{1,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0}$
Reactions	8
Species	5
Species names	g⋅P2,g,r,P,P2
Model name	Auto-regulatory genetic network

The output for this model produces a representation as a stochastic processes shown in Figure 2.

Figure 2: A simulated realisation of the discrete stochastic dynamics of the genetic auto-regulatory network model, for a period of 700 seconds

Nevertheless, faceting the output as in Figure 3 of the process gives a better understanding of the behavior for each species.

Figure 3: Facet view of the genetic auto-regulatory network

3.1.2 Lotka–Volterra system

The reaction equations between the competing species can be represented in Equation (12).

\[\begin{equation} \begin{aligned} Y_1 \rightarrow 2Y_1 \\ Y_1+Y_2 \rightarrow 2Y_2 \\ Y_2 \rightarrow ∅. \end{aligned} \tag{12} \end{equation}\]

For this example, the shiny components are showed in Table 4.

Table 4: Shiny components of the Lotka-Volterra system. Stochastic model.
Component	Input
Function Body	$\texttt{th1x1,th2x1x2,th3x2}$
Function Arguments	$\texttt{x1,x2,th1,th2,th3}$
Initial Markings with Hazards	$\texttt{50,100,1,0.05,0.6}$
Pre matrix	$\texttt{1,0,1,1,0,1}$
Post matrix	$\texttt{2,0,0,2,0,0}$
Reactions	3
Species	2
Species names	Prey,Predator
Model name	Lotka-Volterra System

A realization for this model is shown in Figure 4.

A single realisation of a stochastic LV process. The state of the system is initialised to 50 prey and 100 predators, and the stochastic rate constants are c = (1, 0.005, 0.6)

Figure 4: A single realisation of a stochastic LV process
The state of the system is initialised to 50 prey and 100 predators, and the stochastic rate constants are c = (1, 0.005, 0.6)

3.1.3 Dimerisation kinetics

This problem considers the dimerisation kinetics of a protein, $P$, at very low concentrations. The forward and backward reactions, respectively, are represented in Equation (13).

\[\begin{equation} \begin{aligned} 2P \rightarrow P_2 \\ P_2 \rightarrow P. \\ \end{aligned} \tag{13} \end{equation}\]

The shiny components for the dimerisation kinetics model are showed in Table 5.

Table 5: Shiny components of the dimeristaion kinetics of a protein. Stochastic model.
Component	Input
Function Body	$\texttt{th1x1(x1-1)/2,th2*x2}$
Function Arguments	$\texttt{x1,x2,th1,th2}$
Initial Markings with Hazards	$\texttt{301,0,1.66e-3,0.2}$
Pre matrix	$\texttt{2,0,0,1}$
Post matrix	$\texttt{0,1,2,0}$
Reactions	2
Species	2
Species names	Protein,Dimer
Model name	Dimerisation kinetics

The graphical output of Figure 5 represents the dimerisation kinetics.

Figure 5: A simulated realisation of the discrete stochastic dynamics of the dimerisation kinetics model

3.1.4 Michaelis–Menten enzyme kinetics

Here a substrate $S$ is converted to a product $P$ only in the presence of a catalyst $E$ (or enzyme). A plausible model for this is given in Equation (14).

\[\begin{equation} \begin{aligned} S + E \rightarrow SE \\ SE \rightarrow S + E \\ SE \rightarrow P + E. \\ \end{aligned} \tag{14} \end{equation}\]

It’s components for the shiny app are given in Table 6.

Table 6: Shiny components of the Michaelis-Menten enzyme kinetics. Stochastic model.
Component	Input
Function Body	$\texttt{th1x1x2,th2x3,th3x3}$
Function Arguments	$\texttt{x1,x2,x3,x4,th1,th2,th3}$
Initial Markings with Hazards	$\texttt{301,120,100,10,0.00166,1e-04,0.1}$
Pre matrix	$\texttt{1,1,0,0,0,0,1,0,0,0,1,0}$
Post matrix	$\texttt{0,0,1,0,1,1,0,0,0,1,0,1}$
Reactions	3
Species	4
Species names	S,E,SE,P
Model name	Michaelis-Menten enzyme kinetics

The graphical representation of these kinetics is shown in Figure 6.

Figure 6: A simulated realisation of the discrete stochastic dynamics of the Michaelis–Menten kinetics model

3.2 Deterministic model

3.2.1 Auto-regulatory genetic network

The deterministic model for auto-regulation is obtained by changing the body of the function to the rate laws:

$\texttt{th1*x2*x5-th2*x1,th2*x1-th1*x5*x2,th3*x2-th7*x3+th4*x3,th4*x3+2*th6*x5-2*th5*x5-th8*x4,th2*x1-th1*x5*x2}.$

This produces the graphical representation in Figure 7.

Figure 7: Deterministic version of the Auto-regulatory network

3.2.2 Lotka–Volterra system

Considering all three reactions, the set of ordinary differential equations (ODEs) for the system takes the form of Equation (15).

\[\begin{equation} \begin{aligned} \frac{d[Y_1]}{dt} = k_1Y[1] - k_2[Y_1][Y_2] \\ \frac{d[Y_2]}{dt} = k_2Y[1][Y_2] - k_3[Y_2]. \end{aligned} \tag{15} \end{equation}\]

For the implementation in shiny, under the deterministic tab the body of the function changes to:

$\texttt{th1*x1-th2*x1*x2,th2*x1*x2-th3*x2}.$

Thus, producing the deterministic dynamics of the model shown in Figure 8.

Figure 8: Deterministic version of the Lotka-Volterra System

3.2.3 Dimerisation kinetics

Since these are reversible reactions that can proceed in both directions, expanding gives the pair of ODEs in Equation (16):

\[\begin{equation} \begin{aligned} \frac{d[P]}{dt} = 2k_2[P_2] -2 k_1[P]^2 \\ \frac{d[P_2]}{dt} = k_1[P]^2 -k_2[P_2]. \end{aligned} \tag{16} \end{equation}\]

The body of the function changes to:

$\texttt{2*th2*x2-2*th1*x1**2,th1*x1**2-th2*x2}.$

Figure 9 represents the model produced.

Figure 9: Deterministic version of the Dimerisation kinetics model

3.2.4 Michaelis–Menten enzyme kinetics

Assuming that $k_1 , k_2$ , and $k_3$ are deterministic mass-action kinetic rate constants, then the ODEs governing the deterministic dynamics are of the form of Equation (17).

\[\begin{equation} \begin{aligned} \frac{d[S]}{dt} = k_2[SE] - k_1[S][E] \\ \frac{d[E]}{dt} = (k_2+k_3)[SE] - k_1[S][E]$ \\ \frac{d[SE]}{dt} = k_1[S][E] - (k_2+k_3)[SE] \\ \frac{d[P]}{dt} = k_3[SE]. \end{aligned} \tag{17} \end{equation}\]

The input for body of the function is:

$\texttt{th2*x3-th1*x1*x2,(th2+th3)*x3-th1*x1*x2,th1*x1*x2-(th2+th3)*x3,th3*x3}.$

And Figure 10 shows the grpahical output of the ODE model.

Figure 10: Deterministic version of the MIchaelis-Menten enzyme kinetics model

3.3 Simulation analysis

3.3.1 Lotka-Volterra system

The trajectories for levels of $\texttt{Predator}$ species in the Lotka-Volterra model from $1000$ runs overlaid and the density histogram of the simulated realisations of $\texttt{Predator}$ at time $t = 7$. This gives an estimate of the probability mass function (PMF) for $\texttt{Predator}(7)$ as shown in Figures 11 and 12.

Figure 11: Trajectories of a predator species of the Lotka-Volterra model

Figure 12: Histogram of a predator species of the Lotka-Volterra model

3.3.2 Auto-regulatory genetic network

Figure 11 shows the time-evolution of the number of molecules of $P$ over a $10$-second period. Figure 11 shows the empirical PMF for the number of molecules of $P$ at time $t = 10$ seconds, based on $1000$ runs. In a model with low levels on initial markings there is a probability that there will be no molecules of $P$ at time $10$, as there’s not yet have been a transcription event. The distribution is far from normal, for concreteness, there is a degree of uncertainty regarding the sensitivity of the gene transcription rate.

Figure 13: Trajectories of a protein species of the auto-regulatory genetic network

Figure 14: Histogram of a protein species of the auto-regulatory genetic network

Session info

## R version 4.2.2 Patched (2022-11-10 r83330)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=es_MX.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_MX.UTF-8        LC_COLLATE=es_MX.UTF-8    
##  [5] LC_MONETARY=es_MX.UTF-8    LC_MESSAGES=es_MX.UTF-8   
##  [7] LC_PAPER=es_MX.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_MX.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] kableExtra_1.3.4 diagram_1.6.5    shape_1.4.6      BiocStyle_2.24.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.9          bslib_0.4.0         compiler_4.2.2     
##  [4] BiocManager_1.30.18 jquerylib_0.1.4     highr_0.9          
##  [7] tools_4.2.2         digest_0.6.29       viridisLite_0.4.1  
## [10] jsonlite_1.8.0      evaluate_0.16       lifecycle_1.0.2    
## [13] rlang_1.0.5         cli_3.4.0           rstudioapi_0.14    
## [16] magick_2.7.3        yaml_2.3.5          xfun_0.33          
## [19] fastmap_1.1.0       stringr_1.4.0       httr_1.4.4         
## [22] knitr_1.40          xml2_1.3.3          systemfonts_1.0.4  
## [25] sass_0.4.2          webshot_0.5.2       svglite_2.1.0      
## [28] glue_1.6.2          R6_2.5.1            rmarkdown_2.16     
## [31] bookdown_0.29       magrittr_2.0.3      scales_1.2.1       
## [34] htmltools_0.5.4     rvest_1.0.0         colorspace_2.0-3   
## [37] stringi_1.7.8       munsell_0.5.0       cachem_1.0.6

References

Allen, Linda J. S. 2011. An Introduction to Stochastic Processes with Applications to Biology. CRC Press-Taylor & francis.

Cannoodt, Robrecht, Wouter Saelens, Louise Deconinck, and Yvan Saeys. 2020. “Dyngen: A Multi-Modal Simulator for Spearheading New Single-Cell Omics Analyses.” bioRxiv, February. https://doi.org/10.1101/2020.02.06.936971.

Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2022. Shiny: Web Application Framework for r. https://CRAN.R-project.org/package=shiny.

Fuchs, Christiane. 2013. Inference for Diffusion Processes: With Applications in Life Sciences. Springer.

Giagos, Vasileios. 2010. “Inference for Auto-Regulatory Genetic Networks Using Diffusion Process Approximations.” PhD thesis.

Gillespie, Daniel T. 1977. “Exact Stochastic Simulation of Coupled Chemical Reactions.” The Journal of Physical Chemistry 81 (25): 2340–61. https://doi.org/10.1021/j100540a008.

Giorgi, Federico M., Carmine Ceraolo, and Daniele Mercatelli. 2022. “The r Language: An Engine for Bioinformatics and Data Science.” Life 12 (5). https://doi.org/10.3390/life12050648.

Johnson, Philip. 2019. Adaptivetau: Tau-Leaping Stochastic Simulation. https://CRAN.R-project.org/package=adaptivetau.

Pineda-Krch, Mario, and Robrecht Cannoodt. 2022. GillespieSSA: Gillespie’s Stochastic Simulation Algorithm (SSA). https://CRAN.R-project.org/package=GillespieSSA.

R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Schneider, Maria Victoria. 2016. In Silico Systems Biology. Humana Press.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wilkinson, Darren. 2018. Smfsb: Stochastic Modelling for Systems Biology. https://CRAN.R-project.org/package=smfsb.

Wilkinson, Darren James. 2020. Stochastic Modelling for Systems Biology. Chapman & Hall/CRC.

Zobitz, John M. 2022. Adding Stochasticity to Parameters | Exploring Modeling with Data and Differential Equations Using r. github. https://jmzobitz.github.io/ModelingWithR/adding-stochasticity-to-parameters.html.

	Reactants ( Pre)					Products (Post)
Species	\(g⋅P_2\)	\(g\)	\(r\)	\(P\)	\(P_2\)	\(g⋅P_2\)	\(g\)	\(r\)	\(P\)	\(P_2\)
Repression		1			1	1
Reverse repression	1						1			1
Transcription		1					1	1
Translation			1					1	1
Dimerisation				2						1
Dissociation					1				2
mRNA degradation			1
Protein degradation				1

Component	Input
Function Body	\(\texttt{th1x2x5,th2x1,th3x2,th4x3,th50.5x4(x4-1),th6x5,th7x3,th8*x4}\)
Function Arguments	\(\texttt{x1,x2,x3,x4,x5,th1,th2,th3,th4,th5,th6,th7,th8}\)
Initial Markings with Hazards	\(\texttt{0,1,2,10,12,1,10,0.01,10,1,1,0.1,0.01}\)
Pre matrix	\(\texttt{0,1,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,2,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0}\)
Post matrix	\(\texttt{1,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0}\)
Reactions	8
Species	5
Species names	g⋅P2,g,r,P,P2
Model name	Auto-regulatory genetic network

Component	Input
Function Body	\(\texttt{th1x1,th2x1x2,th3x2}\)
Function Arguments	\(\texttt{x1,x2,th1,th2,th3}\)
Initial Markings with Hazards	\(\texttt{50,100,1,0.05,0.6}\)
Pre matrix	\(\texttt{1,0,1,1,0,1}\)
Post matrix	\(\texttt{2,0,0,2,0,0}\)
Reactions	3
Species	2
Species names	Prey,Predator
Model name	Lotka-Volterra System

Component	Input
Function Body	\(\texttt{th1x1(x1-1)/2,th2*x2}\)
Function Arguments	\(\texttt{x1,x2,th1,th2}\)
Initial Markings with Hazards	\(\texttt{301,0,1.66e-3,0.2}\)
Pre matrix	\(\texttt{2,0,0,1}\)
Post matrix	\(\texttt{0,1,2,0}\)
Reactions	2
Species	2
Species names	Protein,Dimer
Model name	Dimerisation kinetics

Component	Input
Function Body	\(\texttt{th1x1x2,th2x3,th3x3}\)
Function Arguments	\(\texttt{x1,x2,x3,x4,th1,th2,th3}\)
Initial Markings with Hazards	\(\texttt{301,120,100,10,0.00166,1e-04,0.1}\)
Pre matrix	\(\texttt{1,1,0,0,0,0,1,0,0,0,1,0}\)
Post matrix	\(\texttt{0,0,1,0,1,1,0,0,0,1,0,1}\)
Reactions	3
Species	4
Species names	S,E,SE,P
Model name	Michaelis-Menten enzyme kinetics