Data 605 Discussion Week 8

Alexander Ng

10/14/2019

Chapter 8 Exercise 6 Chebyshev Inequality

Statement

Let \(S_n\) be the number of successes in \(n\) Bernoulli trials with probability \(p\) for success on each trial.

Show using Chebyshev’s inequality, that for each \(\epsilon > 0\)

\[P \left( \vert \frac{S_n}{n}-p\vert \geq \epsilon \right) \leq \frac{p(1-p)}{n\epsilon^2}\]

Solution

For a single Bernoulli variable \(X_i\) the variance is \(p(1-p)\) because we know \(X_i^2 = X_i\) therefore, \[V(X_i) = E[X_i^2] - E[X_i]^2 = E[X_i]-E[X_i]^2 = p - p^2\]

It follows that the variance of a sum of iid Bernoulli variables \(S_n\) has variance:

\[V(S_n) = \sum_{i=1}^{n}V(X_i) = np(1-p)\]

It further follows that \[V(\frac{S_n}{n}) = \frac{1}{n^2}V(S_n)= \frac{np(1-p)}{n^2}=\frac{p(1-p)}{n}\] It also follows that \[E[\frac{S_n}{n}] = \frac{ E[ S_n]}{n} = \frac{np}{n} = p\]

Now applying Chebyshev’s inequality to \(X = \frac{S_n}{n}\) and using the calculations above we obtain

\[P[| X- \mu | \geq \epsilon ] \leq \frac{V(X)}{\epsilon^2}\] And substituting the above identities we get:

\[P[| \frac{S_n}{n} - p | \geq \epsilon ] \leq \frac{p(1-p)}{n\epsilon^2}\]

This concludes the proof.