After a couple of introductory chapters, we can finally turn to the subject at hand. In this chapter, we will define stochastic processes and discuss some of their properties. We will also review a number of simple stochastic processes and use R to better understand their behavior.
It can be hard to define a subject in a single sentence. Think about it: how would you succinctly define ‘mathematics,’ or even ‘statistics?’ We are lucky that a ‘stochastic process’ has a simple, one-line definition:
A random variable that changes through time.
We’re cheating a bit by leaning on the definition of ‘random variable,’ but still, we have a simple, seven-word definition for an immensely complicated topic!
We denote random variables with capital letters like \(X\). We denote stochastic processes with \(X_t\), which simply means ‘the value of \(X\) at time \(t\).’ We’ll start with perhaps the simplest stochastic process: a Bernoulli Process. \[ X_t \sim \mbox{Bern}(p), \] for \(t = 1, 2, \ldots\) (this is called the index of the process), and all of the \(X_t\) random variables are independent. This is a discrete stochastic process because the index is discrete; stochastic processes with continuous indices (continuous intervals of the real line) are considered continuous stochastic processes (this distinction is similar to the one we make between the supports of discrete and continuous random variables!).
library("xts")
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
#library("roll")
library("ggplot2")
library("data.table")
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:xts':
##
## first, last
Since we know that a Bernoulli random variable is essentially a coin flip that takes on the value 1 with probability \(p\) and the value 0 with probability \(1 - p\), this stochastic process is just a series of zeroes and ones. For our starting stochastic process, let’s set \(p = \frac{1}{2}\).
# generate a Bernoulli process
data <- data.table(t = 1:100,
                   X = rbinom(100, 1, 1 / 2))
# visualize
ggplot(data, aes(x = t, y = X)) +
  geom_line() +
  theme_bw() +
  ggtitle("A Bernoulli Process") +
  ylim(c(-2, 2))
It’s important to remember that stochastic processes are still random variables. That is, \(X_t\) (and \(X_4\), \(X_{17}\) and \(X_{368}\), and all of the other steps in the process) has an expectation, a variance, moments, density functions, etc. In this specific case, for example, we have:
\[ E(X_t) = \frac{1}{2}, \qquad Var(X_t)=\frac{1}{4}, \qquad P(X_t = 1) = P(X_t = 0) = \frac{1}{2}. \]
for all values of \(t\) in the index \(\{1, 2, \ldots \}\). Of course, this type of generality isn’t guaranteed: we will also work with processes where we don’t have these neat solutions for all values of \(t\) in the index (i.e., where the distribution of the process changes with \(t\), in which case we might have \(E(X_1) \ne E(X_2)\), or something along these lines).
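Since every step of this process has the same distribution, we can check these values empirically by simulating many draws of a single step (a quick sketch; the sample size of 10,000 is an arbitrary choice):
# empirically check the moments of a single step X_t
set.seed(0)
x <- rbinom(10000, 1, 1 / 2)
mean(x)      # should be close to E(X_t) = 1/2
var(x)       # should be close to Var(X_t) = 1/4
mean(x == 1) # should be close to P(X_t = 1) = 1/2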
As in most statistical fields, stochastic processes can be ‘levered up’ from the marginal (single process) to the joint (multiple processes). We will see joint stochastic processes later on, so let’s introduce a couple of examples here and explore some of their key properties.
Let’s define \(Y_t\), where: \(Y_0=0\) and
\[ Y_t = Y_{t-1} + \varepsilon_t, \] where \(t = 1, 2, \ldots\) and \(\varepsilon_t\) is a series of i.i.d. \(N(0, 1)\) random variables (standard normal random variables).
This is called a Gaussian Random Walk. The Gaussian part means ‘Normal’ and the Random Walk part means that, well, the process looks like it’s randomly walking around! We can easily simulate this process in R:
# replicate
set.seed(0)
# generate a Gaussian Random Walk
data <- data.table(t = 1:100,
                   Y = cumsum(rnorm(100)))
# visualize
ggplot(data, aes(x = t, y = Y)) +
  geom_line() +
  theme_bw() +
  geom_hline(yintercept = 0) +
  ggtitle("Gaussian Random Walk")
Let’s now define a process \(Z_t\) with \(Z_0=0\) and \[ Z_t=Y_t-\gamma_t, \]
where \(t=1,2,\ldots\) and \(\gamma_t\) is a series of i.i.d. \(\mbox{Expo}(1)\) random variables.
We can easily visualize a simulation of these two processes:
# generate the processes
data <- data.table(t = 1:100,
                   Y = cumsum(rnorm(100)))
data$Z <- data$Y - rexp(100, rate = 1)
# visualize
ggplot(data, aes(x = t, y = Y)) +
  geom_line(col = "black") +
  geom_line(aes(y = Z), col = "red") +
  theme_bw() +
  geom_hline(yintercept = 0) +
  ggtitle("Joint Processes Y (black) and Z (red)") +
  ylab("")
Colloquially, a stochastic process with the Markov property only cares about its most recent state. Mathematically, the stochastic process \(X_t\) satisfies the Markov property if:
\[ P(X_t|X_1, X_2, \ldots, X_{t-1}) = P(X_t|X_{t-1}). \] That is, within the filtration of \(X_t\) (all of the information before time \(t\)), \(X_{t-1}\) gives us all of the information that we need about \(X_t\) (put another way, \(X_t\) is conditionally independent of \(X_1, X_2, \ldots, X_{t-2}\) given \(X_{t-1}\)).
Our Bernoulli Process is clearly Markov, since each step is just an i.i.d. Bernoulli, and thus: \[ P(X_t|X_1, X_2, \ldots , X_{t-1}) = P(X_t|X_{t-1})=P(X_t). \] That is, conditioning on the past doesn’t even tell us anything, because each step is independent of previous steps.
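We can see this empirically: in a simulated Bernoulli process, conditioning on the previous step doesn’t change the distribution of the current step (a quick sketch):
# empirical check: for the Bernoulli process, conditioning on the past changes nothing
set.seed(0)
x <- rbinom(10000, 1, 1 / 2)
mean(x[-1])                     # estimates P(X_t = 1)
mean(x[-1][x[-length(x)] == 1]) # estimates P(X_t = 1 | X_{t-1} = 1); about the same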
We can also consider our Gaussian Random Walk: \(Y_0=0\) and \[ Y_t=Y_{t-1}+\varepsilon_t, \] where \(\varepsilon_t\) is a series of i.i.d. \(N(0,1)\). This is also Markov; given \(Y_{t-1}\), we have all of the information about \(Y_t\) that we could get from the history of the process; the only remaining randomness is \(\varepsilon_t\), which is independent of the past values \(Y_1, \ldots, Y_{t-1}\).
We continue with one of the most natural, common, and interesting types of stochastic processes: Markov chains.
Imagine 20 lily pads arranged in a circle. Suppose a frog starts at pad 20. Each second, it either stays where it is, or jumps one pad in the clockwise direction, or jumps one pad in the counter-clockwise direction, each with probability 1/3. (See Figure 1).
Figure 1
This leads to some direct probability computations, e.g.
\[ P(\mbox{at pad}\ {\bf 1}\ \mbox{after one step}) = \frac{1}{3}, \quad P(\mbox{at pad}\ {\bf 20}\ \mbox{after one step}) = \frac{1}{3}, \]
\[ P(\mbox{at pad}\ {\bf 19}\ \mbox{after one step}) = \frac{1}{3}, \quad P(\mbox{at pad}\ {\bf 18}\ \mbox{after one step}) = 0. \]
Proceeding for two steps, we see that e.g. \[ P(\mbox{at pad}\ {\bf 2}\ \mbox{after two steps}) = \frac{1}{3}\cdot \frac{1}{3} = \frac{1}{9}, \] since starting at pad 20, there is a 1/3 chance of jumping to pad 1, and then from pad 1 there is a further 1/3 chance of jumping to pad 2. And, \[ P(\mbox{at pad}\ {\bf 19}\ \mbox{after two steps}) = \frac{1}{3}\cdot \frac{1}{3} + \frac{1}{3}\cdot \frac{1}{3}= \frac{2}{9}, \]
since the frog could either first jump from pad 20 to pad 19 and then stay there on the second step, with probability (1/3)(1/3), or it could first stay at pad 20 and then jump to pad 19, again with probability (1/3)(1/3). And so on.
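We can sanity-check these two-step calculations with a quick Monte Carlo simulation (a sketch; the number of simulated frogs is an arbitrary choice):
# Monte Carlo check of the two-step Frog Walk probabilities
set.seed(0)
n_sims <- 100000
pad <- rep(20, n_sims) # every simulated frog starts at pad 20
for (step in 1:2) {
  move <- sample(c(-1, 0, 1), n_sims, replace = TRUE) # each with probability 1/3
  pad <- (pad - 1 + move) %% 20 + 1 # wrap around the circle of 20 pads
}
mean(pad == 2)  # should be close to 1/9
mean(pad == 19) # should be close to 2/9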
Exercise 1. For the Frog Walk, compute P(at pad #1 after 2 steps).
Markov chain theory concerns itself largely with the question of what happens in the long run. For example, what is \[ P(\mbox{frog at pad}\ {\bf 14}\ \mbox{after 27 steps})? \]
Further ahead, what is \[ \lim_{k \to\infty} P(\mbox{frog at pad}\ {\bf 14}\ \mbox{after}\ k\ \mbox{steps})? \] And, will the frog necessarily eventually return to pad #20? Or, will the frog necessarily eventually visit every pad?
A (discrete-time, discrete-space, time-homogeneous) Markov chain is specified by three ingredients:
A state space \(S\), any non-empty finite or countable set.
Initial probabilities \(\{\nu_i\}_{i\in S}\), where \(\nu_i\) is the probability of starting at \(i\) (at time 0). (So, \(\nu_i \ge 0\), and \(\sum_i \nu_i = 1\).)
Transition probabilities \(\{p_{ij}\}_{i,j\in S}\), where \(p_{ij}\) is the probability of jumping to \(j\) if you start at \(i\). (So, \(p_{ij} \ge 0\), and \(\sum_j p_{ij} = 1\) for every \(i\).)
In the Frog Walk example:
The state space is \(S = \{1, 2, 3, \ldots, 20\}\). (All the lily pads.)
The initial distribution is \(\nu_{20} = 1\), and \(\nu_i = 0\) for all \(i\ne 20\). (Since the frog always starts at pad 20.)
The transition probabilities are given by:
\[ p_{ij}= \left\{ \begin{array}{ll} \frac{1}{3}, & |j-i|\le 1\\ \frac{1}{3}, & |j-i|=19 \\ 0, & \mbox{otherwise}. \end{array} \right. \]
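To make this concrete, here is a sketch of the Frog Walk transition matrix in R, with a check that the two-step probabilities match the hand calculations above (the number of iterations in the final loop is an arbitrary choice):
# build the 20 x 20 Frog Walk transition matrix
S <- 20
P <- matrix(0, S, S)
for (i in 1:S) {
  P[i, i] <- 1 / 3                  # stay
  P[i, (i %% S) + 1] <- 1 / 3       # jump clockwise
  P[i, ((i - 2) %% S) + 1] <- 1 / 3 # jump counter-clockwise
}
all(abs(rowSums(P) - 1) < 1e-12) # each row sums to 1
P2 <- P %*% P # two-step transition probabilities
P2[20, 2]  # = 1/9, as computed above
P2[20, 19] # = 2/9, as computed above
# a numerical peek at the long-run question: powers of P flatten out
Pk <- diag(S)
for (k in 1:500) Pk <- Pk %*% P
Pk[20, 14] # close to 1/20: in the long run, every pad is about equally likely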
Given any Markov chain, let \(X_n\) be the Markov chain’s state at time \(n\). So, \(X_0, X_1, X_2, \ldots\) are random variables. Then at time 0, we have \[ P(X_0=i)=\nu_i \quad \mbox{for every} \ i\in S. \] The transition probabilities \(p_{ij}\) can be interpreted as conditional probabilities. Indeed, if \(P(X_n=i) > 0\), then \[ P(X_{n+1}=j\ |\ X_n=i)=p_{ij}, \qquad i,j\in S, \quad n=0,1,2,\ldots. \] This conditional probability does not depend on \(n\), which is the time-homogeneous property. Also, \[ P(X_{n+1}=j\ |\ X_0=i_0, X_1=i_1, \ldots, X_n=i_n) = P(X_{n+1}=j\ |\ X_n=i_n)=p_{i_nj}, \] that is, the probabilities at time \(n+1\) depend only on the state at time \(n\); this is the Markov property.
We can then compute joint probabilities by relating them to conditional probabilities. For example, for \(i,j\in S\), \[ P(X_0=i, X_1=j)=P(X_0=i)P(X_1=j\ |\ X_0=i)=\nu_i p_{ij}. \] More generally, \[ P(X_0=i_0, X_1=i_1, \ldots, X_n=i_n) = \nu_{i_0}p_{i_0 i_1}\cdots p_{i_{n-1}i_n}. \]
Let \(S=\{1,2,3\}\), and \(\nu=(1/7, 2/7, 4/7)\), and (see Figure 2) \[ (p_{ij})= \left( \begin{array}{lll} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{array} \right) = \left( \begin{array}{lll} 0 & 1/2 & 1/2 \\ 1/3 & 1/3 & 1/3 \\ 1/4 & 1/4 & 1/2 \end{array} \right) \]
Figure 2
Then \[ P(X_0=3, X_1=2)=\nu_3p_{32}=\frac{4}{7}\cdot \frac{1}{4}=\frac{1}{7}\quad \mbox{and}\quad P(X_0=2, X_1=1, X_2=3)=\nu_2p_{21}p_{13}=\frac{2}{7}\cdot \frac{1}{3}\cdot \frac{1}{2}=\frac{1}{21}. \] We can ask, what happens in the long run, as \(n\to \infty\)? Is every state visited for sure? What is \(\lim_{n\to \infty} P(X_n=3)\)?
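These hand computations, and the long-run question, are easy to explore numerically; a minimal sketch:
# the three-state example: initial distribution and transition matrix
nu <- c(1/7, 2/7, 4/7)
P <- matrix(c(0,   1/2, 1/2,
              1/3, 1/3, 1/3,
              1/4, 1/4, 1/2),
            nrow = 3, byrow = TRUE)
nu[3] * P[3, 2]           # P(X_0=3, X_1=2) = 1/7
nu[2] * P[2, 1] * P[1, 3] # P(X_0=2, X_1=1, X_2=3) = 1/21
# the distribution of X_n is nu %*% P^n; iterating approximates the limit
dist <- nu
for (n in 1:100) dist <- dist %*% P
dist # approximates lim P(X_n = i) for i = 1, 2, 3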
Exercise 2. For the above example:
Compute \(P(X_0=1, X_1=2)\).
Compute \(P(X_0=1, X_1=1, X_2=1)\).
Compute \(P(X_0=1, X_1=1, X_2=1, X_3=2)\).
Let \(0<p<1\) (e.g., \(p=1/2\)). Suppose you repeatedly bet $1. Each time, you have probability \(p\) of winning $1, and probability \(1-p\) of losing $1. Let \(X_n\) be the net gain (in dollars) after \(n\) bets.
Then \(\{X_n\}\) is a Markov chain, with \(S={\bf Z}\), and (see Figure 3) \[ p_{ij}= \left\{ \begin{array}{ll} p, & j=i+1\\ 1-p, & j=i-1 \\ 0, & \mbox{otherwise}. \end{array} \right. \]
Figure 3
Usually we take the initial value \(X_0=0\), so \(\nu_0=1\), but sometimes we instead take \(X_0=a\) for some other \(a\in {\bf Z}\), so \(\nu_a=1\).
We can then ask: what happens in the long run? Will you eventually win $1,000? Will you necessarily eventually go broke? And so on.
An important special case is \(p=1/2\), called the Simple Symmetric Random Walk (s.s.r.w.), since then \(p=1-p\).
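We can simulate this betting walk directly; a quick sketch for the symmetric case \(p = 1/2\) (the horizon of 1,000 bets is an arbitrary choice):
# simulate a Simple Symmetric Random Walk of 1000 bets
set.seed(0)
p <- 1 / 2
data <- data.table(n = 0:1000,
                   X = c(0, cumsum(sample(c(1, -1), 1000, replace = TRUE,
                                          prob = c(p, 1 - p)))))
# visualize
ggplot(data, aes(x = n, y = X)) +
  geom_line() +
  theme_bw() +
  geom_hline(yintercept = 0) +
  ggtitle("Simple Symmetric Random Walk (p = 1/2)")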
Exercise 3. For the simple random walk with \(p=2/3\) and \(X_0=0\) with probability 1:
Compute \(P(X_0=0, X_1=1)\).
Compute \(P(X_0=0, X_1=1, X_2=0)\).
Compute \(P(X_0=0, X_2=2)\).