October 26, 2023
Process \(\mathbf{X}=\{X^{(i)}\colon -\infty < i < \infty\}\).
\(X^{(i)}=(X^{(i)}_{1}, \ldots, X^{(i)}_{d}),\) with set of vertices \(V = \{1, \dots, d\}.\)
\(X^{(i)}_{v} \in A\), \(A\) is a finite alphabet.
Process observed at time \(i \in \{1, \ldots, n\}.\)
Subscript indicates the vertex and superscript the time the observation was taken.
We assume that the process \(\mathbf{X}\) has an underlying graph \(G^*.\)
Consider these estimated neighborhoods for a set of 5 discrete random variables: \[ \mathrm{ne(X_1)=\{2,3\}}, \qquad \mathrm{ne(X_2)=\{3\}}, \qquad\mathrm{ne(X_3)=\{1,2,4,5\}}, \] \[ \mathrm{ne(X_4)=\{3\}}, \qquad \mathrm{ne(X_5)=\{3,4\}}. \]
Leonardi et al. (2021) presents a model selection criterion for independence within random vectors.
\(\mathbf{Y}=\{Y^{(i)}\}\) is multivariate stochastic process.
\(Y^{(i)}=(Y^{(i)}_{1}, \ldots, Y^{(i)}_{7}),\) with joint probability distribution \(p(y_1,\ldots,y_7).\)
The method decomposes the vector’s distribution function into independent blocks.
If the set of independence of \(\mathbf{Y}\) is \(U = \{3\}\), then \[p(y_1,\ldots,y_7) = p(y_1,y_2,y_3)p(y_4,y_5,y_6,y_7).\]
Therefore, \((Y_1,Y_2,Y_3) \perp (Y_4,Y_5,Y_6,Y_7).\)
Proposal: Method to estimate the graph of conditional dependencies for multivariate stochastic processes with mixing conditions.
Aims to overcome limitations of previous works:
Proposed solution: penalized pseudo-likelihood criterion for entire graph estimation for multivariate processes with mixing conditions.
Key advantages:
Definition: For \(i<j\), let \(X^{(i:j)}\) denote the sequence of vectors \(X^{(i)}, X^{(i+1)}, \ldots, X^{(j)}.\) We say the process \(\mathbf{X} = \{X^{(i)}\colon -\infty < i < \infty\}\) satisfies a mixing condition with rate \(\{\psi(\ell)\}_{\ell \in \mathbb R}\) if for each \(k, m \in \mathbb N\) and each \(x^{(1:k)} \in (A^d)^k\), \(x^{(1:m)}\in (A^d)^m\) with \(\mathbb{P}(X^{(1:m)}=x^{(1:m)})>0,\) we have that \[ \Bigl| \mathbb{P} \bigl(X^{(n:(n+k-1))}=x^{(1:k)}\, |\, X^{(1:m)}=x^{(1:m)}\bigr) - \mathbb{P} \bigl(X^{(n:(n+k-1))} =x^{(1:k)}\bigr)\Bigr| \qquad\qquad \] \[ \qquad\qquad\qquad\qquad\qquad\qquad\qquad \leq \psi(n-m) \mathbb{P}\bigl(X^{(n:(n+k-1))}=x^{(1:k)}\bigr), \] for \(n\geq m+\ell.\)
We take a regularized pseudo maximum likelihood approach to estimate the graph \(G^*\), given a sample \(x^{(1)},\dotsc, x^{(n)}\) of the stochastic process.
The pseudo log likelihood estimator is given by \[ \log \widehat L(G) \;=\; \sum_{v \in V} \:\sum_{(a_v\in A)} \:\sum_{a_{G(v)}\in A^{|G(v)|}} N(a_v,a_{G(v)})\log \widehat\pi(a_v|a_{G(v)})\,, \] The conditional probabilities are estimated from the observed data \[ \hat\pi(a_v|a_{G(v)}) = \frac{N(a_v, a_{G(v)})}{N(a_{G(v)})}. \]
Therefore, the graph estimator is \(\widehat G = \underset{G}{\arg\max}\Big\{\log \widehat L(G) - \lambda_n \sum_{v \in V} |A|^{|G(v)|}\Big\}.\)
We prove the consistency of this estimator.
Study motivated by Leonardi et al. (2021) and Leonardi et al. (2023).
Overcame limitations of previous works.
Proved consistency and convergence rate of the proposed estimator.
Proposed practical implementation methods: exact, simulated annealing, and stepwise algorithm.
Factors affecting estimation process: number of nodes, alphabet size, and graph structure.
Leonardi, F., Carvalho, R., and Frondana, I. (2023). Structure recovery for partially observed discrete markov random fields on graphs under not necessarily positive distributions. Scandinavian Journal of Statistics.
Leonardi, F., Lopez-Rosenfeld, M., Rodriguez, D., Severino, M. T. F., and Sued, M. (2021). Independent block identification in multivariate time series. Journal of Time Series Analysis, 42(1):19–33.
This work was produced as part of the activities of the Research, Innovation and Dissemination Center for Neuromathematics (grant FAPESP 2013/07699-0).
The authors also received financial support from CAPES and CNPq.
CEPID NeuroMat Researchers Workshop