
Limit Sets

For a sequence \(A_1,A_2,...\) of sets, we define the following sets: \[limsup_n A_n = \cap_{n=1}^\infty \cup_{k=n}^\infty A_k\] \[liminf_n A_n = \cup_{n=1}^\infty \cap_{k=n}^\infty A_k\]

Notice what the definition for \(limsup\) says: For an element \(a \in limsup_n A_n\) it must be the case that \(a\) belongs to infinitely many of the \(A_n\). That is, \(a \in limsup_n A_n\) if and only if for each \(n\) there is some \(k \geq n\) for which \(a \in A_k\). For \(liminf\) it is the case that \(a \in liminf_n A_n\) if and only if \(a\) belongs to all but finitely many of the \(A_n\).
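To make the definitions concrete, here is a minimal Python sketch (not from the notes); the sequence \(A_n\) and the horizon \(N\) are illustrative choices, with \(N\) standing in for infinity.

```python
# Finite-horizon sketch for the hypothetical sequence
# A_n = {2} | ({0} if n even else {1}) | ({3} if n <= 5).
# Element 2 is in every A_n, 0 and 1 each occur infinitely often, and 3
# occurs only finitely often, so limsup = {0, 1, 2} and liminf = {2}.
N = 1000  # horizon standing in for infinity

def A(n):
    s = {2} | ({0} if n % 2 == 0 else {1})
    if n <= 5:
        s |= {3}
    return s

# limsup_n A_n: intersection over n of the tail unions A_n U A_{n+1} U ...
limsup = set.intersection(
    *(set.union(*(A(k) for k in range(n, N))) for n in range(1, N // 2))
)
# liminf_n A_n: union over n of the tail intersections A_n & A_{n+1} & ...
liminf = set.union(
    *(set.intersection(*(A(k) for k in range(n, N))) for n in range(1, N // 2))
)
print(limsup, liminf)  # {0, 1, 2} {2}
```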

Note 1: In general, \[\cap_{k=n}^\infty A_k \uparrow liminf_n A_n\] and \[\cup_{k=n}^\infty A_k \downarrow limsup_n A_n\]

Note 2: For every \(m,n\), \[\cap_{k=n}^\infty A_k \subset \cup_{k=m}^\infty A_k\] and hence \[liminf_n A_n \subset limsup_n A_n\].

Note 3: When \(n\) stands for time we say that \(limsup_n A_n = [A_n \quad i.o.]\) where i.o. stands for infinitely often. This gives the event {\(A_n\) occurs for infinitely many \(n\)}. For \(liminf_n A_n\) the event is {\(A_n\) occurs eventually}, i.e., for all sufficiently large \(n\).

Note 4: By De Morgan’s laws the complement of \(limsup_n A_n\) equals:

\[\begin{align} (limsup_{n \rightarrow \infty} A_n)^c &= (\cap_{n=1}^\infty \cup_{m=n}^ \infty A_m)^c\\ &= \cup_{n=1}^\infty (\cup_{m=n}^\infty A_m)^c \\ &= \cup_{n=1}^\infty \cap_{m=n}^\infty A_m^c \\ &= liminf_n A_n^c \end{align}\]

In words, if \(\omega\) belongs to \(A_n\) for only finitely many \(n\), then we can infer two things:

  1. \(\omega \in A_n^c\) for infinitely many \(n\)

  2. \(\omega \in A_n^c\) for all but finitely many \(n\).

Hence \[P(limsup_n A_n) = 0 \implies P(liminf_n A_n^c) = 1 \implies P(limsup_n A_n^c) = 1\] Which form we end up using depends on the problem.


Remark 1: For real sequences \(\{a_n\}\) we have that \[limsup(a_n) = inf_{n \geq 1} sup_{m \geq n} a_m\] and \[liminf(a_n) = sup_{n \geq 1} inf_{m \geq n} a_m\]

Equivalently, if \(a \in \mathcal{R}\), then \[a = limsup_n a_n\] is equivalent to \(\forall \epsilon >0\), the set \(\{n \in N: a_n > a+\epsilon\}\) is finite and \(\{n \in N: a_n > a-\epsilon\}\) is infinite.

Similarly, \[a = liminf_n a_n\] is equivalent to \(\forall \epsilon >0\), the set \(\{n \in N: a_n < a+\epsilon\}\) is infinite and \(\{n \in N: a_n < a-\epsilon\}\) is finite.

The sequence converges (possibly to \(\pm \infty\)) if and only if \(limsup_n a_n=liminf_n a_n\).
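A small numerical sketch of Remark 1 (illustrative; the sequence \(a_n = (-1)^n(1+\frac{1}{n})\) and the horizon are hypothetical choices): its limsup is \(1\) and its liminf is \(-1\), which the truncated tail suprema and infima recover.

```python
# Finite-horizon approximation of limsup/liminf for a_n = (-1)^n (1 + 1/n).
N = 2000
a = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]

# Tail suprema/infima sup_{m >= n} a_m and inf_{m >= n} a_m, truncated at N;
# only the first N//2 tails are used so every tail is long enough to be stable.
tail_sup = [max(a[n:]) for n in range(N // 2)]
tail_inf = [min(a[n:]) for n in range(N // 2)]

print(min(tail_sup))  # ~  1.0 : limsup = inf_n sup_{m>=n} a_m
print(max(tail_inf))  # ~ -1.0 : liminf = sup_n inf_{m>=n} a_m
```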

Remark 2 Let \(f\) be a real-valued function, and define \(\{x: f(x) \leq a\} = \{f \leq a\}\). Then the following hold:

\[\{f \leq a\} = \cap_{k=1}^\infty \{f<a+\frac{1}{k}\}\]

and

\[\{f < a\} = \cup_{k=1}^\infty \{f<a-\frac{1}{k}\}\]

Remark 3 Let \(\{a_n\}\) be a sequence of real numbers and let \(A_n=(-\infty,a_n)\). Then \[limsup_n A_n = (-\infty, a) \text{ or } (-\infty, a], \quad a \equiv limsup_n(a_n)\] and \[liminf_n A_n = (-\infty, a) \text{ or } (-\infty, a], \quad a \equiv liminf_n(a_n)\] (whether the endpoint \(a\) is included depends on the sequence; e.g., for \(a_n = 1 + \frac{1}{n}\) we have \(a = 1 \in A_n\) for every \(n\)).


Theorem For each sequence \(\{A_n\}\), \[P(liminf_n A_n) \leq liminf_n P(A_n) \leq limsup_n P(A_n) \leq P(limsup_n A_n)\] If \(A_n \rightarrow A\), then \(P(A_n) \rightarrow P(A)\).

(The proof of this Theorem is based on Note 1 and on the continuity of probability. In class next time.)

This Theorem implies that if \(P(A_n) \rightarrow 0\) then \(P(liminf_n A_n) =0\).
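A quick example (added for illustration, not in the notes) showing all three inequalities can be strict: let \(A\) and \(B\) be disjoint events with \(0 < P(A) < P(B)\), and let \(A_n = A\) for odd \(n\) and \(A_n = B\) for even \(n\). Then \(liminf_n A_n = \emptyset\) and \(limsup_n A_n = A \cup B\), so

\[0 = P(liminf_n A_n) < P(A) = liminf_n P(A_n) < P(B) = limsup_n P(A_n) < P(A) + P(B) = P(limsup_n A_n)\]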

The Borel Cantelli Lemmas

Lemma 1 If \(\sum_n P(A_n)\) converges, then \(P(limsup_n A_n)=0\).

The interpretation of this lemma is that if a sequence of events has summable probabilities, then with probability 1 only finitely many of the events occur. If we let \(I_n\) be the indicator for the occurrence of event \(A_n\) and \(N=\sum_{n=1}^\infty I_n\) be the total number of events that occur, then \(P(N < \infty) =1\).

Lemma 1 can also be deduced from the Monotone Convergence Theorem: by that theorem, \(E[N] = \sum_{n=1}^\infty P(A_n) < \infty\), and a random variable with finite expectation is finite a.s.

Proof: Since \(limsup_n A_n \subset \cup_{k=m}^\infty A_k\) for every \(m\), it follows that \[P(limsup_n A_n) \leq P(\cup_{k=m}^\infty A_k) \leq \sum_{k=m}^\infty P(A_k)\] Since \(\sum_n P(A_n)\) converges, its tails vanish: \(\sum_{k=m}^\infty P(A_k) \rightarrow 0\) as \(m \rightarrow \infty\). Hence \(P(limsup_n A_n) = 0\).

Lemma 2 If \(\{A_n\}\) is a sequence of independent events and \(\sum_n P(A_n)\) diverges, then \(P(limsup_n A_n)=1\).

A consequence of the two lemmas is that for a sequence of independent events \(\{A_n\}\), either \(P(limsup_n A_n)=1\) or \(P(limsup_n A_n)=0\), depending on whether \(\sum_n P(A_n)\) diverges or converges. This is known as a 0-1 law. The difficulty in any particular application is to prove which of the two cases holds.
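A Monte Carlo sketch of the dichotomy (illustrative; the probabilities \(1/n^2\) and \(1/n\) are hypothetical choices, and a finite horizon \(N\) stands in for infinity):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000                      # finite horizon standing in for infinity
n = np.arange(1, N + 1)

# Lemma 1: P(A_n) = 1/n^2 is summable, so only finitely many A_n occur a.s.
# Lemma 2: P(A_n) = 1/n diverges and the A_n are independent, so a.s.
#          infinitely many occur; the count grows with the horizon.
for label, p in [("summable 1/n^2:", 1 / n**2), ("divergent 1/n: ", 1 / n)]:
    counts = [(rng.random(N) < p).sum() for _ in range(10)]
    print(label, counts)
# Typical output: the first row stays near 1-3 occurrences no matter how
# large N is; the second row sits near sum_{k<=N} 1/k ~ log(N) + 0.577 ~ 12
# and keeps increasing as N grows.
```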

Convergence of Random Variables

Suppose we have random variables \(X\) and \(X_1,X_2,...\) on a probability space. We are interested in the probability of the event that \(lim_n X_n = X\). Consider the complement of this event: \(X_n\) fails to converge to \(X\) if and only if, for some \(\epsilon>0\), \(|X_n - X| \geq \epsilon\) for infinitely many \(n\). Thus the event “\(lim_n X_n =X\)” has probability 1 if and only if \[P(|X_n - X| \geq \epsilon \quad i.o.)=0\] for each \(\epsilon > 0\). The event “\(|X_n - X| \geq \epsilon \quad i.o.\)” is the limsup of the events “\(|X_n - X| \geq \epsilon\)”, so by the Theorem above, \(P(|X_n - X| \geq \epsilon \quad i.o.)=0\) implies that \[lim_n P(|X_n - X| \geq \epsilon)=0\] If this holds for each \(\epsilon >0\), then \(X_n\) converges in probability to \(X\).

Theorem \(lim_n X_n = X\) with probability 1 (or a.s.) if and only if \(P(|X_n - X| \geq \epsilon \quad i.o.)=0\) holds for each \(\epsilon>0\).

Theorem A sufficient condition for \(X_n \rightarrow X\) a.s. is that \[\sum_{n=1}^\infty P(|X_n - X|>\epsilon) < \infty \quad \forall \epsilon >0\]

This follows by the first Borel Cantelli lemma.
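A standard example (added for illustration) separates the two theorems: let the \(A_n\) be independent with \(P(A_n)=\frac{1}{n}\) and set \(X_n = I_{A_n}\), \(X = 0\). Then \(P(|X_n - X| \geq \epsilon) \leq \frac{1}{n} \rightarrow 0\), so \(X_n \rightarrow 0\) in probability; but \(\sum_n \frac{1}{n} = \infty\), so by the second Borel-Cantelli lemma \(P(X_n = 1 \quad i.o.) = 1\) and \(X_n\) does not converge to \(0\) a.s. If instead \(P(A_n) = \frac{1}{n^2}\), then \(\sum_n P(|X_n| > \epsilon) \leq \sum_n \frac{1}{n^2} < \infty\) and the sufficient condition above gives \(X_n \rightarrow 0\) a.s.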


Borel Cantelli and Almost Sure Convergence of a Sequence of R.V.

For a sequence of random variables \(\{X_n\}\) and limit random variable \(X\), let \(\epsilon >0\) and let \(A_n(\epsilon)\) be the event

\[A_n(\epsilon) \equiv \{\omega: |X_n(\omega) - X(\omega)| > \epsilon\}\]

That is, \(A_n(\epsilon)\) corresponds to the set of \(\omega\) for which \(X_n(\omega)\) is more than \(\epsilon\) away from \(X(\omega)\).

The Borel Cantelli lemmas say that:

  1. If, for every \(\epsilon > 0\), \[\sum_{n=1}^\infty P(A_n(\epsilon)) = \sum_{n=1}^\infty P(|X_n(\omega)-X(\omega)| > \epsilon) < \infty\] then \(X_n \rightarrow X\) a.s.

  2. If, for some \(\epsilon > 0\), \[\sum_{n=1}^\infty P(A_n(\epsilon)) = \sum_{n=1}^\infty P(|X_n(\omega)-X(\omega)| > \epsilon) = \infty\] and the \(X_n\) are independent, then \(X_n \nrightarrow X\) a.s.

Interpretation A random variable is a real-valued function from the sample space \(\Omega\) to \(\mathcal{R}\). The sequence of random variables \(X_1,X_2,...\) is a sequence of functions defined on elements of \(\Omega\). Almost sure convergence requires that the sequence of real numbers \(X_n(\omega)\) converge to \(X(\omega)\) (as a real sequence) as \(n \rightarrow \infty\) for all \(\omega \in \Omega\), except perhaps for \(\omega\) in a set of probability zero under \(P\).


Remark It may be useful to note that \[\{limsup_n X_n \leq a\} = \left(\cup_{m=1}^\infty \left\{\omega: X_n(\omega) > a+\frac{1}{m} \quad i.o. \right\}\right)^c\]

  1. If \(\sum_{n=1}^\infty P(X_n > a + \frac{1}{m})<\infty\) for every \(m \geq 1\), then \(P(limsup_n X_n \leq a) = 1\).

  2. If the events \(\{X_n > a + \frac{1}{m}\}\) are independent and \(\sum_{n=1}^\infty P(X_n > a + \frac{1}{m})=\infty\) for some \(m \geq 1\), then \(P(limsup_n X_n \leq a) = 0\) or, equivalently, \(P(limsup_n X_n > a) = 1\).


Modes of convergence

  1. \(X_n \rightarrow X\) a.s. if \(P(lim_{n \rightarrow \infty} X_n = X) =1\)

  2. \(X_n \rightarrow X\) in \(L_p\) for \(p \geq 1\) if \(E|X_n|^p < \infty\) for all \(n\), \(E|X|^p < \infty\), and \(E|X_n -X|^p \rightarrow 0\).

  3. \(X_n \rightarrow X\) in probability if \[lim_{n \rightarrow \infty} P(|X_n-X| \geq \epsilon) = 0, \forall \epsilon > 0\] Equivalently \[lim_{n \rightarrow \infty} P(|X_n-X|<\epsilon) = 1, \forall \epsilon > 0\]

  4. \(X_n \rightarrow X\) in distribution if \(F_{X_n}(t) \rightarrow F_X(t)\) for all points \(t \in \mathcal{R}\) of continuity of \(F_X\).

Theorem The following implications hold:

If \(X_n \rightarrow X\) either a.s. or in \(L_p, p\geq 1\) \(\implies X_n \rightarrow X\) in probability \(\implies X_n \rightarrow X\) in distribution.

Additionally, if \(r \geq p \geq 1\), then \(X_n \rightarrow X\) in \(L_r\) implies that \(X_n \rightarrow X\) in \(L_p\).
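A numerical sketch of a standard example (not from the notes) showing the implications cannot be reversed: take \(U \sim Uniform(0,1)\) and \(X_n = n \cdot I\{U < 1/n\}\). Then \(X_n \rightarrow 0\) a.s. and in probability, but \(E|X_n| = 1\) for every \(n\), so \(X_n\) does not converge to \(0\) in \(L_1\).

```python
import numpy as np

# X_n = n * 1{U < 1/n} with U ~ Uniform(0,1): for each omega with U(omega) > 0,
# X_n(omega) = 0 once n > 1/U(omega), so X_n -> 0 a.s. (and in probability).
# Yet E|X_n| = n * P(U < 1/n) = 1 for all n, so there is no L_1 convergence.
rng = np.random.default_rng(0)
U = rng.random(1_000_000)            # one U per sample path omega

for n in [10, 100, 1000, 10_000]:
    X_n = n * (U < 1 / n)
    print(n, (X_n != 0).mean(), X_n.mean())
# P(X_n != 0) ~ 1/n -> 0, while the sample mean of |X_n| stays near 1.
```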

Where we are going with this The Borel-Cantelli lemmas will be useful in proving the SLLN (Strong Law of Large Numbers).

Convergence Theorems

Convergence theorems give conditions under which we can interchange a limit with an integral: if \(lim_{k \rightarrow \infty} f_k(x) = f(x)\) a.e. and \(f_k\), \(f\) are measurable, they give conditions under which \(lim_{k \rightarrow \infty} \int f_k = \int f\). This interchange is not valid in general.

These theorems are: the Monotone Convergence Theorem, Fatou’s Lemma, and the Dominated Convergence Theorem.
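A standard counterexample showing why conditions are needed (added for illustration): on \([0,1]\) with Lebesgue measure, take \(f_k = k \cdot I_{(0,1/k)}\). Then \(f_k(x) \rightarrow 0\) for every \(x\), yet

\[\lim_{k \rightarrow \infty} \int f_k \,d\mu = 1 \neq 0 = \int \lim_{k \rightarrow \infty} f_k \,d\mu\]

The mass escapes through a spike that grows taller as its base shrinks; the Monotone Convergence Theorem does not apply because \(f_k\) is not monotone, and the Dominated Convergence Theorem does not apply because no integrable function dominates all the \(f_k\).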

Homework 2

Problem 1

  1. Prove that \[limsup_n (A_n \cap B_n) \subset (limsup_n A_n) \cap (limsup_n B_n)\]
  2. Prove that \[(liminf_n A_n) \cup (liminf_n B_n) \subset liminf_n (A_n \cup B_n)\]

Problem 2

Let \(X_1,X_2,...\) be independent absolutely continuous random variables such that \(X_n\) has density function \(f_n\) given by \(f_n(x) = \frac{n}{\pi(1+n^2x^2)}\). With respect to which modes of convergence does \(X_n\) converge to \(X=0\)?

Hint: \(E|X_n|^p\) diverges, and \(P(|X_n|>\epsilon) \approx \frac{2}{\pi n \epsilon}\).
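A quick numerical check of the hint (this verifies the approximation only; it is not a solution): \(f_n\) is the Cauchy density with scale \(1/n\), so \(P(|X_n|>\epsilon) = \frac{2}{\pi}\arctan\left(\frac{1}{n\epsilon}\right)\), which behaves like \(\frac{2}{\pi n \epsilon}\) for large \(n\).

```python
import math

# Compare the exact tail P(|X_n| > eps) for the Cauchy(scale 1/n) density
# f_n(x) = n / (pi (1 + n^2 x^2)) against the hint's approximation 2/(pi n eps).
eps = 0.5
for n in [10, 100, 1000]:
    exact = (2 / math.pi) * math.atan(1 / (n * eps))
    approx = 2 / (math.pi * n * eps)
    print(n, round(exact, 6), round(approx, 6))
# The two agree to leading order in 1/n, confirming P(|X_n| > eps) ~ 2/(pi n eps).
```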

An example that applies the two Borel Cantelli lemmas

(This example contains many types of “inequality arguments” that are used over and over again in economics. Try it out.)

Suppose \(X_1,X_2,...\) is a sequence of i.i.d. random variables that are exponentially distributed with parameter \(\lambda >0\). We can use the Borel Cantelli lemmas to show that

\[limsup_n \frac{X_n}{log(n)} = \frac{1}{\lambda} \quad a.s.\] This exercise shows that if you repeatedly take measurements of a quantity that is exponentially distributed with parameter \(\lambda\), then you would expect that, in the long run, after \(n\) measurements, you would get some measurements as large as \(\frac{log(n)}{\lambda}\).


The strategy for solving such a problem is to show two things:

  1. \(limsup_n \frac{X_n}{log(n)} \leq \frac{1}{\lambda}\) a.s.
  2. \(limsup_n \frac{X_n}{log(n)} \geq \frac{1}{\lambda}\) a.s.

Usually, (1) is shown by looking at \[\sum_{n=1}^\infty P\left(\frac{X_n}{log(n)} > \frac{1+\epsilon}{\lambda}\right)\] and then invoking Lemma 1 of Borel-Cantelli, and (2) is shown by looking at \[\sum_{n=1}^\infty P\left(\frac{X_n}{log(n)} > \frac{1-\epsilon}{\lambda}\right)\] It is usually sufficient to analyze only \(\sum_{n=1}^\infty P\left(\frac{X_n}{log(n)} > \frac{1}{\lambda}\right)\). Then, Lemma 2 of Borel-Cantelli is invoked, and the result follows by combining (1) and (2).


Note that for \(X \sim exp(\lambda)\) we know that \(P(X>x)=e^{-\lambda x}\) for each \(x>0\). We first show (1). For any \(\epsilon >0\),

\[\sum_{n=1}^\infty P\left(\frac{X_n}{logn} > \frac{1+\epsilon}{\lambda}\right) = \sum_{n=1}^{\infty} e^{-(1+\epsilon)log(n)} = \sum_{n=1}^{\infty} n^{-(1+\epsilon)} < \infty\]

By Lemma 1, we have that for every \(\epsilon >0\), \[P(\frac{X_n}{log(n)} > \frac{1+\epsilon}{\lambda} \quad i.o.) = 0\] Hence for every \(\epsilon >0\), \[P(limsup_{n \rightarrow \infty} \frac{X_n}{log(n)} \leq \frac{1+\epsilon}{\lambda}) = 1\] Then, letting \(\epsilon \downarrow 0\), it follows that \[limsup_{n \rightarrow \infty} \frac{X_n}{log(n)} \leq \frac{1}{\lambda} \quad a.s.\]

We also have that

\[\sum_{n=1}^\infty P\left(\frac{X_n}{log(n)} > \frac{1}{\lambda}\right) = \sum_{n=1}^{\infty} e^{-log(n)} = \sum_{n=1}^{\infty} n^{-1} = \infty\]

By Lemma 2, \(P(\frac{X_n}{log(n)} > \frac{1}{\lambda} \quad i.o.) = 1\). Hence \[limsup_{n \rightarrow \infty} \frac{X_n}{log(n)} \geq \frac{1}{\lambda} \quad a.s.\]

Combining the two results yields the desired conclusion.

Alternatively, the event \[\left(limsup_n \frac{X_n}{log(n)} = \frac{1}{\lambda}\right)=\cap_{m=1}^\infty \left(\frac{1}{\lambda} \leq limsup_n \frac{X_n}{log(n)} \leq \frac{1}{\lambda}+\frac{1}{m}\right)\] has probability one.
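A Monte Carlo sketch of the result (illustrative; \(\lambda = 2\) and the horizons are arbitrary choices): the largest value of \(\frac{X_n}{log(n)}\) over a late block \(n \in [N/2, N)\) should settle near \(\frac{1}{\lambda}\), since by the two lemmas the ratio exceeds \(\frac{1-\epsilon}{\lambda}\) infinitely often but exceeds \(\frac{1+\epsilon}{\lambda}\) only finitely often.

```python
import numpy as np

# i.i.d. X_n ~ Exp(lam): block maxima of X_n / log(n) over n in [N/2, N)
# should hover near 1/lam = 0.5 as N grows, matching
# limsup_n X_n / log(n) = 1/lam a.s.
rng = np.random.default_rng(0)
lam = 2.0
for N in [10**4, 10**5, 10**6, 10**7]:
    n = np.arange(N // 2, N)
    X = rng.exponential(scale=1 / lam, size=n.size)
    print(N, (X / np.log(n)).max())   # stays near 1/lam = 0.5
```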