Probability and Statistics II

Lecture Notes

1 Lecture 1: Random Variables

1.1 Introduction

1.1.1 Random Experiment

A random experiment is an experiment whose outcomes cannot be known with certainty. Instead, we assign probabilities to the various outcomes. All possible outcomes may be stated in advance, and the probabilities of the outcomes may be determined from experience.

1.1.2 Sample Space

The sample space is the set of all possible outcomes of a random experiment. It is usually denoted by \(S\).

Examples:

  • Tossing a coin once: \(S = \{H, T\}\)
  • Throwing a fair die once: \(S = \{1, 2, 3, 4, 5, 6\}\)

1.2 Random Variable

A random variable is a rule that assigns a numerical value to each outcome of a random experiment.

Formally, consider a random experiment with sample space \(S\). A function that assigns to each element \(s \in S\) exactly one number \(X(s) = x\) is called a random variable.

The space or range of \(X\) is the set of real numbers: \[D = \{x : x = X(s),\ s \in S\}\]

In other words, \(X\) is a function from the sample space \(S\) to the real numbers \((-\infty, \infty)\). We can also think of a random variable as the numerical outcome of a random experiment. The domain of a random variable is the sample space and its range is a subset of the real numbers.

Notation: We use capital letters such as \(X\) to denote a random variable and lower-case letters such as \(x\) to denote its possible values.

Example 1: Coin tossed three times

The random variable \(X\) equals the number of heads that appear, so its space is: \[D = \{0, 1, 2, 3\}\]

Example 2: Die thrown once

Let \(X\) denote the outcome: \(x = 1, 2, 3, 4, 5, 6\). Each value has probability \(\frac{1}{6}\).

1.3 Discrete and Continuous Random Variables

1.3.1 Discrete Random Variable

A random variable is said to be discrete if its possible values are countable (if its space is finite or countably infinite).

A set \(D\) is said to be countable if its elements can be listed, i.e., it is finite or there is a one-to-one correspondence between \(D\) and the positive integers. For example, the number of defective items in a sample of 20 items takes values in the finite set \(\{0, 1, \ldots, 20\}\).

A discrete variable is one whose values are distinct from each other; the values are usually (but not necessarily) integers.

1.3.2 Continuous Random Variable

A random variable is said to be continuous if it can take uncountably many values (values in an interval).

Examples: Heights, weights, time.

If its cumulative distribution function \(F_X(x)\) is continuous for all \(x \in \mathbb{R}\), then \(X\) is continuous.

Let \(X\) be a random variable that can assume values only in the intervals \([x_1, x_2),\ [x_2, x_3),\ \ldots,\ [x_n, x_{n+1})\) with respective probabilities \(p_1, p_2, \ldots, p_n\), where \[P(x_i \leq X < x_{i+1}) = p_i,\quad i = 1, 2, \ldots, n\]

If \(\sum_{i=1}^{n} p_i = 1\), then \(X\) is a continuous random variable.

N.B.: For a continuous random variable, \(P(X = x) = 0\) for every single value \(x\), so probabilities are assigned to intervals rather than to specific values as in the discrete case.


1.4 Probability Distribution of a Discrete Random Variable

Suppose that \(X\) is a discrete random variable and \(x\) is one of its possible values, then the probability that \(X = x\) is denoted: \[P_X(x) = P(X = x)\]

The probability distribution of a discrete random variable is the relationship that pairs the values of the random variable with their corresponding probabilities. This relationship can be given as a table, a formula, or a graph.

Let \(X\) be a discrete random variable with space \(D\). The probability mass function (pmf) is: \[P_X(x) = P(X = x),\quad x \in D\]

1.4.1 Properties of the Probability Mass Function

1. Non-Negativity Property \[P_X(x) \geq 0\] \[0 \leq P_X(x) \leq 1,\quad x \in D\]

Example: For a fair die, \(P_X(x) = \frac{1}{6},\ x = 1,2,3,4,5,6\). Each value satisfies \(0 \leq \frac{1}{6} \leq 1\).

2. Total Probability Property \[\sum_{x \in D} P_X(x) = 1\]

1.4.2 Example 1: Coin Tossed Three Times

Let the random variable \(X\) be the number of heads that appear. What is the probability distribution of the random variable \(X\)?

Solution:

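The possible values of \(X\) are \(0, 1, 2, 3\), and the sample space consists of 8 equally likely outcomes:

\(S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}\)

Counting the outcomes with each number of heads gives:

\[\begin{array}{c|cccc} x & 0 & 1 & 2 & 3 \\ \hline P(X=x) & \frac{1}{8} & \frac{3}{8} & \frac{3}{8} & \frac{1}{8} \end{array}\]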
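
The table can be reproduced by brute-force enumeration. The following is a minimal R sketch, assuming a fair coin so that all 8 outcomes are equally likely; the names `tosses`, `heads`, and `pmf_heads` are illustrative only.

```r
# Enumerate the 8 equally likely outcomes of three coin tosses.
tosses <- expand.grid(t1 = c("H", "T"), t2 = c("H", "T"), t3 = c("H", "T"),
                      stringsAsFactors = FALSE)

# X = number of heads in each outcome.
heads <- rowSums(tosses == "H")

# The relative frequency of each value of X gives the pmf.
pmf_heads <- table(heads) / nrow(tosses)
print(pmf_heads)
```

Counting outcomes in this way is only valid because the outcomes are equally likely; for a biased coin each outcome would need its own weight.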

1.4.3 Example 2: Die Rolled Once

A die is rolled once. Let the random variable \(X\) denote the number facing up. What is the probability mass function (p.m.f) of \(X\)?

Solution: \[P(X = x) = \begin{cases} \frac{1}{6}, & x = 1, 2, 3, 4, 5, 6 \\ 0, & \text{elsewhere} \end{cases}\]

1.4.4 Example 3

Given \(f(x) = \frac{x}{10},\ x = 1, 2, 3, 4\), show that \(f(x)\) is a p.m.f.

Two conditions required:

  • Non-negativity: \(f(x) \geq 0\) for all \(x\)
  • Total probability: \(\sum_{\text{all }x} f(x) = 1\)

\[\begin{array}{c|cccc|c} x & 1 & 2 & 3 & 4 & \Sigma \\ \hline f(x) & \frac{1}{10} & \frac{2}{10} & \frac{3}{10} & \frac{4}{10} & 1 \end{array}\]

Condition 1: Non-negativity

For \(x \in \{1,2,3,4\}\), we have \(x > 0\), therefore: \(f(x) = \frac{x}{10} > 0\)

Condition 2: Total Probability = 1 \[\sum_{x=1}^{4} f(x) = \frac{1}{10} + \frac{2}{10} + \frac{3}{10} + \frac{4}{10} = \frac{1+2+3+4}{10} = \frac{10}{10} = 1 \checkmark\]

Both conditions are satisfied \(\Rightarrow\) \(f(x)\) is a p.m.f. \(\blacksquare\)

1.4.5 Example 4 — Bernoulli Distribution

Given \(f(x) = p^x(1-p)^{1-x},\ x = 0, 1,\ 0 < p < 1\), show that \(f(x)\) is a p.m.f.

Condition 1: Non-negativity

Since \(0 < p < 1\): \(p^x \geq 0\) and \((1-p)^{1-x} \geq 0\), therefore \(f(x) \geq 0\)

\[\begin{array}{c|cc|c} x & 0 & 1 & \Sigma \\ \hline f(x) & 1-p & p & 1 \end{array}\]

Condition 2: Total Probability = 1 \[f(0) + f(1) = p^0(1-p)^1 + p^1(1-p)^0 = (1-p) + p = 1 \checkmark\]

1.4.6 Example 5 — Binomial Distribution

Given \[f(x) = \binom{n}{x} p^x q^{n-x},\quad x = 0, 1, 2, \ldots, n\] where \(p\) is the probability of success and \(q = 1 - p\) is the probability of failure. Show that \(f(x)\) is a p.m.f.

Solution:

\[\sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = \binom{n}{0}p^0 q^n + \binom{n}{1}p^1 q^{n-1} + \binom{n}{2}p^2 q^{n-2} + \cdots + \binom{n}{n}p^n q^0\]

\[= q^n + \binom{n}{1}pq^{n-1} + \binom{n}{2}p^2 q^{n-2} + \cdots + p^n\]

From the binomial formula: \((a+b)^n = b^n + \binom{n}{1}ab^{n-1} + \binom{n}{2}a^2 b^{n-2} + \cdots + a^n\), applied with \(a = p\) and \(b = q\):

\[\Rightarrow q^n + \binom{n}{1}pq^{n-1} + \binom{n}{2}p^2 q^{n-2} + \cdots + p^n = (p+q)^n = 1\]

Hence \(\displaystyle\sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = 1\). Since every term \(\binom{n}{x} p^x q^{n-x}\) is also non-negative, \(f(x)\) is a p.m.f.

1.4.7 Example 6 — Poisson Distribution

Let \(X\) be a Poisson distributed random variable with \[f(x;\lambda) = \frac{e^{-\lambda}\lambda^x}{x!},\quad x = 0, 1, 2, \ldots \quad (= 0 \text{ elsewhere})\] where \(\lambda > 0\) is a parameter. Show that \(f(x)\) is a p.m.f.

Solution:

\[\sum_{x} f(x) = \sum_{x=0}^{\infty} \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \cdots + \frac{\lambda^n}{n!} + \cdots\right)\]

Since \(e^{\lambda} = 1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \cdots\):

\[\sum_{x} f(x) = e^{-\lambda} \cdot e^{\lambda} = e^0 = 1\]

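Every term \(\frac{e^{-\lambda}\lambda^x}{x!}\) is also non-negative, since the Poisson parameter \(\lambda\) is positive. Hence \(f(x)\) is a p.m.f.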
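
The two p.m.f. conditions can also be checked numerically. The sketch below is a rough illustration rather than a proof: `is_pmf` is a hypothetical helper name, the parameter values (`n = 10`, `p = 0.3`, `lambda = 2.5`) are arbitrary, and the Poisson sum is truncated at 100 because the support is infinite.

```r
# Helper (hypothetical name): checks the two pmf conditions on a vector of probabilities.
is_pmf <- function(probs, tol = 1e-8) {
  all(probs >= 0) && abs(sum(probs) - 1) < tol
}

# Example 3: f(x) = x / 10 on x = 1, 2, 3, 4.
ex3 <- is_pmf((1:4) / 10)

# Example 5: binomial pmf; n = 10 and p = 0.3 are illustrative values.
ex5 <- is_pmf(dbinom(0:10, size = 10, prob = 0.3))

# Example 6: Poisson pmf with illustrative lambda = 2.5. The support is infinite,
# so the sum is truncated at x = 100, where the remaining tail is negligible.
ex6 <- is_pmf(dpois(0:100, lambda = 2.5))

print(c(example3 = ex3, binomial = ex5, poisson = ex6))
```

A numerical check like this only covers the chosen parameter values; the algebraic arguments above are what establish the result for all \(n\), \(p\), and \(\lambda\).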


1.5 Probability Distribution of a Continuous Random Variable

If \(X\) is a continuous random variable, then the function \(f(x)\) whose integral over an interval gives the probability that \(X\) falls in that interval is called a probability density function (p.d.f).

1.5.1 Properties of a Continuous Random Variable

Let \(X\) be a continuous random variable with pdf \(f(x)\), then:

  1. Non-negativity: \(f(x) \geq 0\) for all \(x\)

  2. Total Area Property: \(\displaystyle\int_{-\infty}^{\infty} f(x)\, dx = 1\)

  3. Probabilities: \(P(a < X < b) = \displaystyle\int_{a}^{b} f(x)\, dx\)

1.5.2 Example 1

Let \(X\) be the delay (in hours) of a flight with probability density function: \[f(x) = \begin{cases} 0.2 - 0.02x, & 0 \leq x \leq 10 \\ 0, & \text{otherwise} \end{cases}\]

(i) Show that \(f(x)\) is a pdf

At the endpoints: \(f(0) = 0.2\) and \(f(10) = 0.2 - 0.02(10) = 0\).

Since \(f(x)\) decreases linearly from \(0.2\) to \(0\), we have \(f(x) \geq 0\) for \(0 \leq x \leq 10\).

Verify total area equals 1: \[\int_{0}^{10}(0.2 - 0.02x)\,dx = \Big[0.2x - 0.01x^2\Big]_0^{10} = (0.2 \times 10 - 0.01 \times 100) = (2-1) = 1 \checkmark\]

(ii) Find \(P(X \geq 2)\)

First find \(f(2)\): \(f(2) = 0.2 - 0.02(2) = 0.16\).

Since the graph is a straight line, the region from \(x=2\) to \(x=10\) forms a triangle with base \(= 10-2=8\) and height \(= 0.16\):

\[P(X \geq 2) = \frac{1}{2} \times 8 \times 0.16 = 0.64\]

Alternatively, using integration: \[P(X \geq 2) = \int_{2}^{10}(0.2 - 0.02x)\,dx = \Big[0.2x - 0.01x^2\Big]_2^{10} = (2-1) - (0.4 - 0.04) = 0.64\]
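
Both parts of this example can be checked with numerical integration. This is a minimal sketch using base R's `integrate()`; the function name `f_delay` is illustrative.

```r
# Density from Example 1: f(x) = 0.2 - 0.02 x on [0, 10], and 0 elsewhere.
f_delay <- function(x) ifelse(x >= 0 & x <= 10, 0.2 - 0.02 * x, 0)

# Total area under the density over its support.
total_area <- integrate(f_delay, 0, 10)$value

# P(X >= 2) by integration, and by the triangle-area shortcut.
p_int <- integrate(f_delay, 2, 10)$value
p_tri <- 0.5 * (10 - 2) * f_delay(2)

cat(sprintf("Total area = %.4f\nP(X >= 2) = %.4f (integral), %.4f (triangle)\n",
            total_area, p_int, p_tri))
```

The triangle shortcut works here only because the density is linear and reaches zero at \(x = 10\); the integral is the general method.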

1.5.3 Example 2

A continuous random variable has the following probability density function: \[f(x) = \begin{cases} k(x+2)^2, & 0 \leq x \leq 2 \\ 0, & \text{otherwise} \end{cases}\]

(i) Find the value of \(k\)

Since \(f(x)\) is a p.d.f., the total area under the curve must equal 1: \[\int_{0}^{2} k(x+2)^2\,dx = 1\]

Expanding \((x+2)^2 = x^2 + 4x + 4\): \[k\int_{0}^{2}(x^2 + 4x + 4)\,dx = k\left[\frac{x^3}{3} + 2x^2 + 4x\right]_0^2 = k\left[\frac{8}{3} + 8 + 8\right] = k \cdot \frac{56}{3} = 1\]

\[\Rightarrow k = \frac{3}{56}\]

(ii) Find \(P(0 < X < 1)\)

\[P(0 < X < 1) = \int_{0}^{1} \frac{3}{56}(x+2)^2\,dx = \frac{3}{56}\left[\frac{x^3}{3} + 2x^2 + 4x\right]_0^1 = \frac{3}{56}\left(\frac{1}{3} + 2 + 4\right) = \frac{3}{56} \cdot \frac{19}{3} = \frac{19}{56}\]

(iii) Find \(P(X > 1)\)

\[P(X > 1) = 1 - P(0 < X < 1) = 1 - \frac{19}{56} = \frac{37}{56}\]

1.5.4 Example 3

A p.d.f is given by the piecewise function: \[f(x) = \begin{cases} k, & 0 \leq x \leq 2 \\ k(2x-3), & 2 \leq x \leq 3 \\ 0, & \text{otherwise} \end{cases}\]

(i) Find the value of \(k\)

Since \(f(x)\) is a p.d.f.: \[\int_{0}^{2} k\,dx + \int_{2}^{3} k(2x-3)\,dx = 1\]

\[k[x]_0^2 + k\big[x^2 - 3x\big]_2^3 = 1\]

\[k[2-0 + (9-9)-(4-6)] = k[2+2] = 4k = 1\]

\[\Rightarrow k = \frac{1}{4}\]

(ii) Find \(P(1 < X < 2.5)\)

Split the integral across the two ranges: \[P(1 < X < 2.5) = \int_{1}^{2}\frac{1}{4}\,dx + \int_{2}^{2.5}\frac{1}{4}(2x-3)\,dx\]

\[= \frac{1}{4}[x]_1^2 + \frac{1}{4}\big[x^2-3x\big]_2^{2.5}\]

\[= \frac{1}{4}(2-1) + \frac{1}{4}\big[(6.25-7.5)-(4-6)\big]\]

\[= \frac{1}{4}(1) + \frac{1}{4}(-1.25+2) = \frac{1}{4}(1) + \frac{1}{4}(0.75) = \frac{1.75}{4} = \frac{7}{16}\]
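
For a piecewise density it is easy to slip on a limit of integration, so a numerical cross-check is useful. The sketch below works with the unnormalised shape \(g(x) = f(x)/k\) (the name `g` is an assumption of this sketch) and integrates each piece separately.

```r
# Unnormalised shape g(x) = f(x) / k from Example 3.
g <- function(x) ifelse(x >= 0 & x <= 2, 1,
                 ifelse(x > 2 & x <= 3, 2 * x - 3, 0))

# k is the reciprocal of the area under g; integrate each piece separately.
area <- integrate(g, 0, 2)$value + integrate(g, 2, 3)$value
k <- 1 / area

# P(1 < X < 2.5), again split at the breakpoint x = 2.
p <- k * (integrate(g, 1, 2)$value + integrate(g, 2, 2.5)$value)

cat(sprintf("k = %.4f\nP(1 < X < 2.5) = %.4f\n", k, p))
```

Splitting at the breakpoint mirrors the hand calculation and keeps each integrand smooth, which helps the numerical routine.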


1.6 Exercises (Lecture 1)

Exercise 1. A continuous random variable \(X\) has a p.d.f.: \[f(x) = \begin{cases} c(x^2 - 2x + 3), & 0 \leq x \leq 2 \\ 0, & \text{otherwise} \end{cases}\] where \(c\) is a constant. Find:

  • The value of \(c\). Answer: \(c = \frac{3}{14}\)
  • \(P(X \geq 1)\). Answer: \(\frac{1}{2}\)

Exercise 2. Consider a continuous random variable \(X\) with p.d.f: \[f(x) = \begin{cases} kx^2, & 0 \leq x \leq 1 \\ 0, & \text{elsewhere} \end{cases}\]

  • Determine the value of \(k\)
  • Find \(a\) such that \(\Pr(X \leq a) = \Pr(X > a)\)
  • Find \(b\) such that \(\Pr(X > b) = 0.05\)

Exercise 3. Let \(X\) be a continuous random variable with p.d.f: \[f(x) = \begin{cases} ax, & 0 \leq x \leq 1 \\ a, & 1 \leq x \leq 2 \\ -ax+3a, & 2 \leq x \leq 3 \\ 0, & \text{elsewhere} \end{cases}\]

  • Define the constant \(a\)
  • Compute the probability \(\Pr(X \leq 1.5)\)

1.7 Solutions to Exercises (Lecture 1)

1.7.1 Solution to Exercise 2

Finding \(k\):

Since \(f(x)\) is a p.d.f.: \[\int_{-\infty}^{\infty} f(x)\,dx = 1 \Rightarrow \int_{0}^{1} kx^2\,dx = 1\]

\[k\left[\frac{x^3}{3}\right]_0^1 = 1 \Rightarrow \frac{k}{3} = 1 \Rightarrow k = 3\]

Finding \(a\) such that \(\Pr(X \leq a) = \Pr(X > a)\):

\[\int_0^a 3x^2\,dx = \int_a^1 3x^2\,dx\] \[[x^3]_0^a = [x^3]_a^1\] \[a^3 = 1 - a^3\] \[2a^3 = 1 \Rightarrow a = \sqrt[3]{\frac{1}{2}}\]

Finding \(b\) such that \(\Pr(X > b) = 0.05\):

\[\int_b^1 3x^2\,dx = 0.05 \Rightarrow \left[x^3\right]_b^1 = 0.05 \Rightarrow 1 - b^3 = 0.05\] \[b^3 = 0.95 \Rightarrow b = \sqrt[3]{0.95}\]

1.7.2 Solution to Exercise 3

Finding \(a\):

\[\int_0^1 ax\,dx + \int_1^2 a\,dx + \int_2^3(-ax+3a)\,dx = 1\]

\[\left[\frac{ax^2}{2}\right]_0^1 + [ax]_1^2 + \left[-\frac{ax^2}{2}+3ax\right]_2^3 = 1\]

\[\frac{a}{2} + a + \frac{a}{2} = 1 \Rightarrow 2a = 1 \Rightarrow a = \frac{1}{2}\]

Computing \(\Pr(X \leq 1.5)\):

\[\Pr(X \leq 1.5) = \int_0^1 \frac{1}{2}x\,dx + \int_1^{1.5}\frac{1}{2}\,dx = \left[\frac{x^2}{4}\right]_0^1 + \left[\frac{x}{2}\right]_1^{1.5}\]

\[= \left(\frac{1}{4} - 0\right) + (0.75 - 0.5) = 0.25 + 0.25 = 0.5\]


2 Lecture 2: The Cumulative Distribution Function

2.1 Definition

The function \(F(x)\) is called the distribution function or cumulative distribution function of the random variable \(X\), if:

Discrete Type: \[F(x) = \Pr(X \leq x) = \sum_{t \leq x} f(t)\]

Continuous Type: \[F(x) = \Pr(X \leq x) = \int_{-\infty}^{x} f(t)\,dt\]

2.2 Properties of \(F(x)\)

(i) \(F(-\infty) = \lim_{x \to -\infty} F(x) = 0\) and \(F(\infty) = \lim_{x \to \infty} F(x) = 1\)

Also \(0 \leq F(x) \leq 1\); \(F(x) = 0\) below the smallest value and \(F(x) = 1\) above the largest value.

(ii) \(F(x)\) is a monotone, non-decreasing function, i.e., \(F(a) \leq F(b)\) for \(a < b\).

If \(X\) is a continuous random variable, then: \[P(a < X < b) = F_X(b) - F_X(a) = \int_a^b f_X(t)\,dt\]

(iii) \(F(x)\) is continuous from the right, i.e.: \[\lim_{h \to 0^+} F(x+h) = F(x)\]

If \(X\) is a continuous random variable, the c.d.f. and p.d.f. are related by: \[F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \qquad f_X(x) = \frac{d}{dx}F_X(x)\]


2.3 Discrete Distribution Function / CDF

2.3.1 Example 1

Let the random variable \(X\) of the discrete type have the p.m.f.: \[f(x) = \begin{cases} \frac{x}{6}, & x = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases}\]

Find the distribution function of \(X\).

Recall: \(F(x) = \displaystyle\sum_{t \leq x} f(t)\)

Solution:

\[F(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{6}, & 1 \leq x < 2 \\ \frac{3}{6}, & 2 \leq x < 3 \\ 1, & 3 \leq x \end{cases}\]

Note: \(F(x)\) is a step function that is constant on every interval not containing 1, 2, or 3, and jumps at those points by \(\frac{1}{6}\), \(\frac{2}{6}\), and \(\frac{3}{6}\) respectively, which are the values of \(f(1)\), \(f(2)\), and \(f(3)\).

2.3.2 Example 2

Given the distribution function for a random variable \(Y\):

\[F_Y(t) = \begin{cases} 0, & t < -2 \\ 1/3, & -2 \leq t < 1 \\ 7/12, & 1 \leq t < 5 \\ 47/60, & 5 \leq t < 11 \\ 57/60, & 11 \leq t < 20 \\ 1, & 20 \leq t \end{cases}\]

Find: (a) \(f_Y(t)\), (b) \(\Pr[0 \leq Y \leq 1]\), (c) \(\Pr[3 \leq Y \leq 10]\)

Solution (a): Finding \(f_Y(t)\)

To find \(f_Y(t)\), locate the points of discontinuity of \(F_Y(t)\): these are \(-2, 1, 5, 11, 20\). The value of the probability function at each point equals the size of the jump in \(F_Y(t)\):

\[\begin{array}{c|c} t & f_Y(t) \\ \hline -2 & 1/3 \\ 1 & 1/4 \\ 5 & 1/5 \\ 11 & 1/6 \\ 20 & 1/20 \end{array}\]

Solution (b): \(\Pr[0 \leq Y \leq 1]\)

Since \(Y\) has no probability mass at \(0\), \(\Pr[Y < 0] = F(0)\), so: \[\Pr[0 \leq Y \leq 1] = F(1) - F(0) = \frac{7}{12} - \frac{1}{3} = \frac{1}{4}\]

Solution (c): \(\Pr[3 \leq Y \leq 10]\)

Since \(Y\) has no probability mass at \(3\), \(\Pr[Y < 3] = F(3)\), so: \[\Pr[3 \leq Y \leq 10] = F(10) - F(3) = \frac{47}{60} - \frac{7}{12} = \frac{47}{60} - \frac{35}{60} = \frac{12}{60} = \frac{1}{5}\]
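
The jump-size rule from part (a) is a one-line computation in R. The sketch below is illustrative only: the variable names are assumptions, and `stepfun()` from base R's stats package is used to rebuild \(F_Y\) as a right-continuous step function.

```r
# Jump points of F_Y and the value of F_Y at each of them (from Example 2).
t_vals <- c(-2, 1, 5, 11, 20)
F_vals <- c(1/3, 7/12, 47/60, 57/60, 1)

# The pmf at each jump point is the increase in F_Y there.
pmf_Y <- diff(c(0, F_vals))
names(pmf_Y) <- t_vals

# Right-continuous step function equal to F_Y.
F_Y <- stepfun(t_vals, c(0, F_vals))

# Closed-interval probabilities as sums of the pmf over the jump points inside.
p_b <- sum(pmf_Y[t_vals >= 0 & t_vals <= 1])
p_c <- sum(pmf_Y[t_vals >= 3 & t_vals <= 10])

print(round(pmf_Y, 4))
print(F_Y(c(0, 1, 10)))
cat(sprintf("P(0 <= Y <= 1) = %.4f\nP(3 <= Y <= 10) = %.4f\n", p_b, p_c))
```

Summing the p.m.f. over the points inside a closed interval is safe even when an endpoint carries probability mass, which is where the shortcut \(F(b) - F(a)\) needs care.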


2.4 Continuous Distribution Function / CDF

2.4.1 Example 3

Let \(X\) be a random variable of continuous type defined by the p.d.f.: \[f(x) = \begin{cases} \frac{2}{x^3}, & 1 < x < \infty \\ 0, & \text{elsewhere} \end{cases}\]

Find the cumulative distribution function \(F(x)\).

Solution:

\[F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{1}^{x} \frac{2}{t^3}\,dt = \int_{1}^{x} 2t^{-3}\,dt = \Big[-t^{-2}\Big]_1^x = 1 - \frac{1}{x^2}\]

Therefore: \[F(x) = \begin{cases} 0, & x < 1 \\ 1 - \frac{1}{x^2}, & 1 \leq x \end{cases}\]

2.4.2 Example 4 — Finding the Density Function

Let \(X\) be the random variable whose distribution function is: \[F_X(t) = \begin{cases} 0, & t < 0 \\ t, & 0 \leq t \leq 1 \\ 1, & 1 < t \end{cases}\]

Find the density function of \(X\).

Solution: Differentiate \(F_X(t)\): \[f_X(t) = \frac{d}{dt}\left[F_X(t)\right]\]

\[f_X(t) = \begin{cases} 1, & 0 < t < 1 \\ 0, & t < 0 \text{ or } t > 1 \end{cases}\]

This is the Uniform(0, 1) distribution.

2.4.3 Example 5

Let \(X\) be a continuous random variable with p.d.f: \[f(x) = \begin{cases} \frac{1}{2}x, & 0 < x < 2 \\ 0, & \text{otherwise} \end{cases}\]

Obtain the c.d.f of the random variable \(X\).

Solution:

For \(x \leq 0\): \(F(x) = 0\)

For \(0 < x < 2\): \[F(x) = \int \frac{1}{2}x\,dx = \frac{x^2}{4} + c_1\]

Using \(F(0) = 0\): \(\frac{0^2}{4} + c_1 = 0 \Rightarrow c_1 = 0\)

Check: \(F(2) = \frac{2^2}{4} + 0 = 1\)

Hence the c.d.f becomes: \[F(x) = \begin{cases} 0, & x \leq 0 \\ \frac{x^2}{4}, & 0 < x < 2 \\ 1, & x \geq 2 \end{cases}\]
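
The closed form can be compared with a c.d.f. built directly from the definition \(F(x) = \int_{-\infty}^{x} f(t)\,dt\). This is a minimal sketch; `f5`, `F5_num`, and the evaluation points are illustrative choices.

```r
# Density from Example 5: f(x) = x / 2 on (0, 2).
f5 <- function(x) ifelse(x > 0 & x < 2, x / 2, 0)

# Numerical c.d.f.: integrate the density from 0 (where the support starts) up to x.
F5_num <- function(x) {
  if (x <= 0) return(0)
  if (x >= 2) return(1)
  integrate(f5, 0, x)$value
}

# Compare with the closed form x^2 / 4 at a few illustrative points.
pts <- c(-1, 0.5, 1, 1.5, 3)
closed <- ifelse(pts <= 0, 0, ifelse(pts >= 2, 1, pts^2 / 4))
numeric_cdf <- sapply(pts, F5_num)

print(data.frame(x = pts, closed_form = closed, numerical = numeric_cdf))
```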


2.5 Exercises (Lecture 2)

Exercise 1. The random variable \(Z\) has the probability function: \[f_Z(x) = \begin{cases} \frac{1}{3}, & x = 0, 1, 2 \\ 0, & \text{elsewhere} \end{cases}\] What is the distribution function of \(Z\)?

Answer: \[F_Z(x) = \begin{cases} 0, & x < 0 \\ \frac{1}{3}, & 0 \leq x < 1 \\ \frac{2}{3}, & 1 \leq x < 2 \\ 1, & 2 \leq x \end{cases}\]

Exercise 2. The random variable \(U\) has the probability function: \(f_U(-3) = \frac{1}{2}\), \(f_U(0) = \frac{1}{6}\), \(f_U(4) = \frac{1}{3}\). Find the distribution function of \(U\).

Answer: \[F_U(x) = \begin{cases} 0, & x < -3 \\ \frac{1}{2}, & -3 \leq x < 0 \\ \frac{2}{3}, & 0 \leq x < 4 \\ 1, & 4 \leq x \end{cases}\]

Exercise 3. Verify that \[F_X(t) = \begin{cases} 0, & t < -1 \\ \frac{t+1}{2}, & -1 \leq t \leq 1 \\ 1, & t > 1 \end{cases}\] is a distribution function and specify the probability density function for \(X\). Use it to compute \(P\!\left(-\frac{1}{2} \leq X \leq \frac{1}{2}\right)\).

Answer: \(\frac{1}{2}\)

Exercise 4. \(Y\) is a continuous random variable with: \[f(y) = \begin{cases} 2(1-y), & 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}\] Find the cumulative distribution function of \(Y\).

Answer: \[F(y) = \begin{cases} 0, & y < 0 \\ 2y - y^2, & 0 \leq y \leq 1 \\ 1, & y > 1 \end{cases}\]

Exercise 5. \(Z\) is a continuous random variable with probability density function: \[f(z) = \begin{cases} 10e^{-10z}, & z > 0 \\ 0, & \text{elsewhere} \end{cases}\] Find the cumulative distribution function of \(Z\).

Answer: \[F(z) = \begin{cases} 1 - e^{-10z}, & z > 0 \\ 0, & \text{elsewhere} \end{cases}\]

Exercise 6. For each of the following, find the constant \(c\) so that \(f(x)\) is a valid p.m.f. (discrete case) or p.d.f. (continuous case).

(a) \[f(x) = \begin{cases} c\left(\frac{2}{3}\right)^x, & x = 1, 2, 3, \ldots \\ 0, & \text{elsewhere} \end{cases}\] Answer: \(c = \frac{1}{2}\)

(b) \[f(x) = \begin{cases} cxe^{-x}, & 0 < x < \infty \\ 0, & \text{elsewhere} \end{cases}\] Answer: \(c = 1\)

Exercise 7. Let the p.m.f of the random variable \(X\) be: \[f(x) = \begin{cases} \frac{x}{15}, & x = 1, 2, 3, 4, 5 \\ 0, & \text{elsewhere} \end{cases}\] Find:

  1. \(P(X = 1 \text{ or } 2)\). Answer: \(\frac{1}{5}\)
  2. \(P\!\left(\frac{1}{2} < X < \frac{5}{2}\right)\). Answer: \(\frac{1}{5}\)
  3. \(P(1 \leq X \leq 2)\). Answer: \(\frac{1}{5}\)

Exercise 8. Let \(f(x)\) be the p.m.f. or p.d.f of a random variable \(X\). Find the distribution function \(F(x)\).

(a) \[f(x) = \begin{cases} 1, & x = 0 \\ 0, & \text{elsewhere} \end{cases}\] Answer: \[F(x) = \begin{cases} 0, & x < 0 \\ 1, & x \geq 0 \end{cases}\]

(b) \[f(x) = \begin{cases} 3(1-x)^2, & 0 < x < 1 \\ 0, & \text{elsewhere} \end{cases}\] Answer: \[F(x) = \begin{cases} 0, & x \leq 0 \\ 3x - 3x^2 + x^3, & 0 < x < 1 \\ 1, & x \geq 1 \end{cases}\]

Exercise 9. Given the distribution function: \[F(x) = \begin{cases} 0, & x < -1 \\ \frac{x+2}{4}, & -1 \leq x < 1 \\ 1, & 1 \leq x \end{cases}\]

Compute:

  1. \(P\!\left(-\frac{1}{2} < X \leq \frac{1}{2}\right)\). Answer: \(\frac{1}{4}\)
  2. \(P(X = 0)\). Answer: \(0\)
  3. \(P(X = 1)\). Answer: \(\frac{1}{4}\)
  4. \(P(2 < X \leq 3)\). Answer: \(0\)

3 Lecture 3: Measures of Location

3.1 Quantiles

The \(p^{th}\) quantile of a random variable \(X\), denoted \(\xi_p\), is the value such that:

\[\Pr(X \leq \xi_p) \geq p \quad \text{and} \quad \Pr(X \geq \xi_p) \geq 1 - p\]

In particular, the median is the \(0.5^{th}\) quantile, denoted \(\xi_{0.5}\).

3.1.1 Percentiles

Let \(0 < p < 1\). The \(100p^{th}\) percentile (quantile of order \(p\)) of the distribution of \(X\) is a value \(\xi_p\) such that:

\[\Pr(X \leq \xi_p) \geq p \quad \text{and} \quad \Pr(X \geq \xi_p) \geq 1 - p\]


3.2 Median

The median lies halfway through the distribution, splitting the area under the curve into two equal halves.

Continuous Random Variable: If \(X\) is continuous with pdf \(f(x)\) on \((-\infty, \infty)\), the median \(M\) satisfies:

\[\int_{-\infty}^{M} f(x)\,dx = \int_{M}^{\infty} f(x)\,dx = \frac{1}{2}\]

or equivalently \(F(M) = \frac{1}{2}\), where \(F(x)\) is the CDF.

Discrete Random Variable: If \(X\) is discrete with PMF \(p_X(x)\) and values \(x_1, x_2, \ldots, x_n\), the median \(M\) satisfies:

\[\sum_{x=x_1}^{M} p_X(x) \geq \frac{1}{2} \quad \text{and} \quad \sum_{x=M}^{x_n} p_X(x) \geq \frac{1}{2}\]

3.2.1 Example 1: Median and 25th Percentile

Find the median and 25th percentile of:

\[f(x) = \begin{cases} 3(1-x)^2, & 0 < x < 1 \\ 0, & \text{elsewhere} \end{cases}\]

Solution — Median:

Let the median be \(M\). Setting \(\int_0^M 3(1-x)^2\,dx = \frac{1}{2}\) and substituting \(z = 1-x\):

\[-(1-M)^3 + 1 = \frac{1}{2} \implies (1-M)^3 = \frac{1}{2}\]

\[\boxed{M = 1 - \sqrt[3]{\frac{1}{2}}}\]

Solution — 25th Percentile:

Let \(P\) be the 25th percentile. Setting \(\int_0^P 3(1-x)^2\,dx = \frac{1}{4}\):

\[(1-P)^3 = \frac{3}{4} \implies \boxed{P = 1 - \sqrt[3]{\frac{3}{4}}}\]

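library(ggplot2)

# Evaluate the pdf on a grid over its support (0, 1)
x <- seq(0, 1, length.out = 300)
fx <- 3 * (1 - x)^2
# Median from the closed form M = 1 - (1/2)^(1/3)
M <- 1 - (1/2)^(1/3)

df <- data.frame(x = x, fx = fx)
# Rows to the left of the median, used to shade half of the area
df_shade <- df[df$x <= M, ]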

ggplot(df, aes(x, fx)) +
  geom_area(data = df_shade, aes(x, fx), fill = "#3B8BD4", alpha = 0.3) +
  geom_line(linewidth = 1, color = "#185FA5") +
  geom_vline(xintercept = M, linetype = "dashed", color = "#D85A30") +
  annotate("text", x = M + 0.05, y = 2, label = paste0("M = ", round(M, 3)),
           color = "#D85A30", size = 4) +
  labs(title = "PDF with Median Highlighted",
       x = "x", y = "f(x)") +
  theme_minimal()

Figure 3.1: PDF f(x) = 3(1-x)^2 with median shaded
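
When the equation \(F(\xi_p) = p\) has no convenient closed form, the same quantiles can be found by root finding. The sketch below uses base R's `uniroot()` on the c.d.f. of this example; `F_ex1` and `quantile_ex1` are hypothetical names introduced for illustration.

```r
# c.d.f. of f(x) = 3 (1 - x)^2 on (0, 1): F(x) = 1 - (1 - x)^3.
F_ex1 <- function(x) 1 - (1 - x)^3

# Helper (hypothetical name): solve F(x) = p on (0, 1) by root finding.
quantile_ex1 <- function(p) {
  uniroot(function(x) F_ex1(x) - p, interval = c(0, 1), tol = 1e-10)$root
}

med <- quantile_ex1(0.50)
p25 <- quantile_ex1(0.25)

cat(sprintf("Median          = %.4f (closed form %.4f)\n", med, 1 - (1/2)^(1/3)))
cat(sprintf("25th percentile = %.4f (closed form %.4f)\n", p25, 1 - (3/4)^(1/3)))
```

Root finding works here because \(F\) is continuous and strictly increasing on \((0, 1)\), so each equation has exactly one solution in the interval.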


3.2.2 Example 2: Median of a Discrete Distribution

Given \(p = \frac{1}{4}\), find the median of:

\[P_X(x) = \begin{cases} p(1-p)^x, & x = 0, 1, 2, \ldots \\ 0, & \text{elsewhere} \end{cases}\]

Solution:

p <- 1/4
x_vals <- 0:6
pmf <- p * (1 - p)^x_vals
cdf <- cumsum(pmf)

results <- data.frame(x = x_vals, PMF = round(pmf, 4), CDF = round(cdf, 4))
knitr::kable(results, caption = "PMF and CDF for Example 2")
Table 3.1: PMF and CDF for Example 2
x PMF CDF
0 0.2500 0.2500
1 0.1875 0.4375
2 0.1406 0.5781
3 0.1055 0.6836
4 0.0791 0.7627
5 0.0593 0.8220
6 0.0445 0.8665

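From the CDF: \(F(1) = 0.4375 < 0.5\) and \(F(2) = 0.5781 \geq 0.5\), so \(x = 2\) is the smallest value with \(\Pr(X \leq x) \geq \frac{1}{2}\). The second condition also holds, since \(\Pr(X \geq 2) = 1 - F(1) = 0.5625 \geq \frac{1}{2}\). Hence the median is \(M = 2\).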
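
The same median can be read off from R's built-in geometric distribution functions. This is a sketch under the assumption that R's parameterisation (number of failures before the first success) matches the p.m.f. above, which it does for \(x = 0, 1, 2, \ldots\); the names `m`, `left`, and `right` are illustrative.

```r
p <- 1/4

# R's geometric distribution counts failures before the first success,
# so dgeom(x, p) = p * (1 - p)^x for x = 0, 1, 2, ..., matching Example 2.
m <- qgeom(0.5, prob = p)   # smallest x with F(x) >= 0.5

# Check both parts of the median definition.
left  <- pgeom(m, prob = p)          # P(X <= m)
right <- 1 - pgeom(m - 1, prob = p)  # P(X >= m)

cat(sprintf("m = %d, P(X <= m) = %.4f, P(X >= m) = %.4f\n", m, left, right))
```

Checking both inequalities matters for discrete distributions, because the c.d.f. jumps over \(\frac{1}{2}\) rather than passing through it.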


3.2.3 Example 3: Finding k, Median, and IQR

\[f(x) = \begin{cases} \frac{k}{x}, & 1 \leq x \leq 9 \\ 0, & \text{otherwise} \end{cases}\]

(i) Finding \(k\):

\[\int_1^9 \frac{k}{x}\,dx = k[\ln x]_1^9 = k\ln 9 = 1 \implies k = \frac{1}{\ln 9} \approx 0.4551\]

(ii) Median:

The CDF is \(F(x) = \frac{\ln x}{\ln 9}\). Setting \(F(m) = \frac{1}{2}\):

\[\frac{\ln m}{\ln 9} = \frac{1}{2} \implies m = 9^{1/2} = 3\]

(iii) Interquartile Range:

\[Q_1: F(q_1) = \frac{1}{4} \implies q_1 = 9^{1/4} \approx 1.732\]

\[Q_3: F(q_3) = \frac{3}{4} \implies q_3 = 9^{3/4} \approx 5.196\]

\[\text{IQR} = q_3 - q_1 \approx 5.196 - 1.732 = 3.464\]

k <- 1 / log(9)
q1 <- 9^(1/4); q3 <- 9^(3/4); iqr <- q3 - q1
cat(sprintf("k = %.4f\nMedian = 3\nQ1 = %.3f, Q3 = %.3f\nIQR = %.3f\n",
            k, q1, q3, iqr))
k = 0.4551
Median = 3
Q1 = 1.732, Q3 = 5.196
IQR = 3.464

3.3 Mode

Definition: The mode of a distribution is the value \(x\) that maximises the PDF or PMF.

For a continuous random variable, differentiate the PDF:

  • Set \(f'(x) = 0\)
  • If \(f''(x) < 0\) at \(x = x_1\) \(\Rightarrow\) local maximum (mode)
  • Check higher derivatives if needed

3.3.1 Example 4: Mode of a Discrete Distribution

\[P_X(x) = \begin{cases} \left(\frac{1}{2}\right)^x, & x = 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}\]

By observation, the maximum probability occurs at \(x = 1\), so Mode = 1.

3.3.2 Example 5: Mode of a Continuous Distribution

\[f(x) = \begin{cases} \frac{1}{2}x^2 e^{-x}, & 0 < x < \infty \\ 0, & \text{otherwise} \end{cases}\]

Solution:

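\[f'(x) = \frac{1}{2}e^{-x}(2x - x^2) = \frac{1}{2}x e^{-x}(2 - x) = 0 \implies x = 0 \text{ or } x = 2\]

Only \(x = 2\) lies inside the support \((0, \infty)\). With \(f''(x) = \frac{1}{2}e^{-x}(x^2 - 4x + 2)\):

\[f''(2) = -e^{-2} \approx -0.1353 < 0 \implies \textbf{Mode} = 2\]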

3.3.3 Example 6: Mode of \(f(x) = 12x^2(1-x)\)

\[f(x) = \begin{cases} 12x^2(1-x), & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}\]

\[f'(x) = 24x - 36x^2 = 12x(2 - 3x) = 0 \implies x = 0 \text{ or } x = \frac{2}{3}\]

Since \(f''(x) = 24 - 72x\) gives \(f''(2/3) = -24 < 0\), and \(x = 0\) is not inside the support, Mode = \(\frac{2}{3}\).
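
The modes in Examples 5 and 6 can be confirmed by maximising each density numerically. The sketch below uses base R's `optimize()`, which searches a bounded interval; the upper bound of 20 for Example 5 is an assumed cut-off, not part of the problem.

```r
# Densities from Examples 5 and 6.
f_ex5 <- function(x) 0.5 * x^2 * exp(-x)   # support (0, Inf)
f_ex6 <- function(x) 12 * x^2 * (1 - x)    # support (0, 1)

# optimize() searches a bounded interval. For Example 5 the upper bound 20 is an
# assumed cut-off, chosen because the density is already negligible there.
mode_ex5 <- optimize(f_ex5, interval = c(0, 20), maximum = TRUE)$maximum
mode_ex6 <- optimize(f_ex6, interval = c(0, 1),  maximum = TRUE)$maximum

cat(sprintf("Mode (Example 5) = %.3f\nMode (Example 6) = %.3f\n", mode_ex5, mode_ex6))
```

A numerical search returns an approximate maximiser and can miss the global maximum of a multimodal density, so it complements the derivative test rather than replacing it.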

3.3.4 Example 7: Piecewise PDF — CDF, Median, and Mode

\[f(x) = \begin{cases} cx, & 0 \leq x \leq 1 \\ c(2-x), & 1 \leq x \leq 2 \\ 0, & \text{otherwise} \end{cases}\]

Step 1: Find \(c\).

\[\int_0^1 cx\,dx + \int_1^2 c(2-x)\,dx = \frac{c}{2} + \frac{c}{2} = 1 \implies c = 1\]

Step 2: CDF.

\[F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x^2}{2}, & 0 \leq x \leq 1 \\[6pt] 2x - \dfrac{x^2}{2} - 1, & 1 \leq x \leq 2 \\ 1, & x \geq 2 \end{cases}\]

Median: \(F(1) = \frac{1}{2}\), so \(m = 1\).

Mode: \(f(x) = x\) increases on \([0,1]\) and \(f(x) = 2-x\) decreases on \([1,2]\), so maximum is at \(x = 1\). Mode = 1.


Exercise: For \(f(x) = \frac{3}{64}x^2(4-x)\), \(0 \leq x \leq 4\), determine the mode.


4 Lecture 4: Expectation of a Random Variable

4.1 Definition

Expectation gives the average (mean) value of a random variable.

  • Discrete: \(E(X) = \displaystyle\sum_{i=1}^n x_i P(X = x_i)\)

  • Continuous: \(E(X) = \displaystyle\int_{-\infty}^{\infty} x f(x)\,dx\)


4.2 Examples

4.2.1 Example 1: Fair Die

\[E(X) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5\]

4.2.2 Example 2: Discrete PMF

\[P(X = x) = \frac{x}{21}, \quad x = 1, 2, 3, 4, 5, 6\]

\[E(X) = \sum_{x=1}^{6} x \cdot \frac{x}{21} = \frac{1^2 + 2^2 + \cdots + 6^2}{21} = \frac{91}{21} \approx 4.333\]

4.2.3 Example 3: Continuous Case

\[f(x) = \frac{1}{9}(3-x)(1+x), \quad 0 \leq x \leq 3\]

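\[E(X) = \frac{1}{9}\int_0^3 x(3-x)(1+x)\,dx = \frac{1}{9}\int_0^3 (3x + 2x^2 - x^3)\,dx = \frac{1}{9}\left[\frac{3x^2}{2} + \frac{2x^3}{3} - \frac{x^4}{4}\right]_0^3 = \frac{1}{9} \cdot \frac{45}{4} = \frac{5}{4} = 1.25\]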

integrand <- function(x) x * (1/9) * (3 - x) * (1 + x)
result <- integrate(integrand, 0, 3)
cat("E(X) =", result$value)
E(X) = 1.25

4.3 Properties of Expectation

Let \(g(X) = aX + b\) with constants \(a, b, c\):

  1. \(E(c) = c\)
  2. \(E[aX + b] = aE(X) + b\)
  3. \(E[ag(X) \pm bh(X)] = aE[g(X)] \pm bE[h(X)]\)

Proof of (2):

\[E[aX + b] = \int_{-\infty}^{\infty}(ax + b)f(x)\,dx = a\int_{-\infty}^{\infty}xf(x)\,dx + b\int_{-\infty}^{\infty}f(x)\,dx = aE(X) + b\]

4.3.1 Example 4: Expectation of Functions

\[P(X = x) = \frac{x}{10}, \quad x = 1, 2, 3, 4\]

Compute \(E[5X^3 - 2X^2]\):

\[E[5X^3 - 2X^2] = 5E(X^3) - 2E(X^2)\]

x <- 1:4; p <- x / 10
E_X2 <- sum(x^2 * p)
E_X3 <- sum(x^3 * p)
result <- 5 * E_X3 - 2 * E_X2
cat(sprintf("E(X^2) = %.1f\nE(X^3) = %.1f\nE[5X^3 - 2X^2] = %.1f\n",
            E_X2, E_X3, result))
E(X^2) = 10.0
E(X^3) = 35.4
E[5X^3 - 2X^2] = 157.0

4.4 Variance of a Random Variable

\[\text{Var}(X) = E\left[(X - E(X))^2\right] = E(X^2) - [E(X)]^2\]

Notation: Mean \(\mu = E(X)\); Variance \(\sigma^2 = \text{Var}(X)\); Standard deviation \(\sigma = \sqrt{\text{Var}(X)}\).

4.4.1 Variance of Linear Functions

\[\text{Var}(aX) = a^2\text{Var}(X)\] \[\text{Var}(aX + b) = a^2\text{Var}(X)\]

Proof of \(\text{Var}(aX + b) = a^2\text{Var}(X)\):

\[\text{Var}(aX + b) = E[(aX+b)^2] - [E(aX+b)]^2\] \[= a^2E(X^2) + 2abE(X) + b^2 - [aE(X) + b]^2\] \[= a^2\left[E(X^2) - (E(X))^2\right] = a^2\text{Var}(X)\]
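
The linear rules for expectation and variance can be checked on a concrete distribution. The sketch below uses a fair die with arbitrary constants `a = 3` and `b = -2`; the helper `E` is a hypothetical name for the expectation of a function of \(X\) under this p.m.f.

```r
# Fair die: values and probabilities.
x <- 1:6
p <- rep(1/6, 6)

# Expectation of any function g(X) under the pmf.
E <- function(g) sum(g(x) * p)

a <- 3; b <- -2   # illustrative constants

EX   <- E(function(v) v)
VarX <- E(function(v) v^2) - EX^2

# Left-hand sides computed directly from the distribution of aX + b.
E_lin   <- E(function(v) a * v + b)
Var_lin <- E(function(v) (a * v + b)^2) - E_lin^2

cat(sprintf("E(aX+b)   = %.4f, aE(X)+b    = %.4f\n", E_lin, a * EX + b))
cat(sprintf("Var(aX+b) = %.4f, a^2 Var(X) = %.4f\n", Var_lin, a^2 * VarX))
```

The additive constant \(b\) shifts the mean but drops out of the variance, which is what the two printed lines illustrate.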

4.4.2 Example 5: Variance of a Continuous Variable

\[f(x) = \frac{x+3}{18}, \quad -3 \leq x \leq 3\]

E_X  <- integrate(function(x) x * (x + 3) / 18, -3, 3)$value
E_X2 <- integrate(function(x) x^2 * (x + 3) / 18, -3, 3)$value
VarX <- E_X2 - E_X^2
cat(sprintf("E(X) = %.4f\nE(X^2) = %.4f\nVar(X) = %.4f\n", E_X, E_X2, VarX))
E(X) = 1.0000
E(X^2) = 3.0000
Var(X) = 2.0000

4.4.3 Example 6: Piecewise PDF

\[f(x) = \begin{cases} kx, & 0 \leq x \leq 1 \\ k, & 1 \leq x \leq 3 \\ k(4-x), & 3 \leq x \leq 4 \\ 0, & \text{otherwise} \end{cases}\]

Finding \(k\):

\[k\int_0^1 x\,dx + k\int_1^3 dx + k\int_3^4(4-x)\,dx = k\left(\frac{1}{2} + 2 + \frac{1}{2}\right) = 3k = 1 \implies k = \frac{1}{3}\]

Finding \(E(X)\):

\[E(X) = \frac{1}{3}\int_0^1 x^2\,dx + \frac{1}{3}\int_1^3 x\,dx + \frac{1}{3}\int_3^4 x(4-x)\,dx = 2\]

4.4.4 Example 7: Piecewise PDF with Quantile

\[f(x) = \begin{cases} \frac{c}{3}x, & 0 \leq x \leq 3 \\ c, & 3 \leq x \leq 4 \\ 0, & \text{otherwise} \end{cases}\]

Finding \(c\): \(\frac{3}{2}c + c = \frac{5}{2}c = 1 \implies c = \frac{2}{5}\)

\(E(X) = \frac{13}{5} = 2.6\)

Finding \(a\) such that \(P(X \geq a) = 0.85\):

The condition is equivalent to \(P(X < a) = 0.15\). Since \(P(0 \leq X \leq 3) = \frac{3}{2}c = 0.60 > 0.15\), the value \(a\) lies in \([0, 3]\):

\[\int_0^a \frac{2}{15}x\,dx = \frac{a^2}{15} = 0.15 \implies a^2 = 2.25 \implies a = 1.5\]

Note: Setting \(\int_3^a \frac{2}{5}\,dx = 0.85 - P(0 \le X \le 3)\) instead solves \(P(X \leq a) = 0.85\), which gives \(\frac{2}{5}(a - 3) = 0.25\), i.e. \(a = 3.625\). Any valid answer must lie in the support \([0, 4]\).

4.4.5 Example 8: Cell Battery Lifetime

\[f(x) = \begin{cases} \frac{3}{4}\left[1 - (x-2)^2\right], & 1 \leq x \leq 3 \\ 0, & \text{otherwise} \end{cases}\]

(i) \(E(X)\):

\[E(X) = \frac{3}{4}\int_1^3 x\left[1 - (x-2)^2\right]dx = 2\]

(ii) Probability radio works for at least 22 hours (two independent cells, each \(\geq 2.2\) tens of hours):

\[P(X \geq 2.2) = \frac{3}{4}\int_{2.2}^{3}\left[1 - (x-2)^2\right]dx \approx 0.352\]

\[P(\text{both cells}) = (0.352)^2 \approx 0.1239\]

f8 <- function(x) (3/4) * (1 - (x - 2)^2)
p_22 <- integrate(f8, 2.2, 3)$value
cat(sprintf("P(X >= 2.2) = %.4f\nP(both cells) = %.4f\n", p_22, p_22^2))
P(X >= 2.2) = 0.3520
P(both cells) = 0.1239

4.4.6 Example 9: Electric Bulb Lifetime

\[f(x) = \begin{cases} \frac{6}{125}x(5-x), & 0 \leq x \leq 5 \\ 0, & \text{otherwise} \end{cases}\]

Here \(X\) is taken to be the lifetime of a bulb in years, and two bulbs are assumed to fail independently of each other. The code computes the probability that neither bulb fails in the first year and the probability that exactly one bulb fails within 2 years.

f9 <- function(x) (6/125) * x * (5 - x)

# Probability neither bulb fails in the first year
p_fail1 <- integrate(f9, 0, 1)$value
p_surv1 <- 1 - p_fail1
p_neither <- p_surv1^2

# Probability exactly one bulb fails within 2 years
p_fail2 <- integrate(f9, 0, 2)$value
p_surv2 <- 1 - p_fail2
p_exactly_one <- 2 * p_fail2 * p_surv2

cat(sprintf("P(fail in year 1) = %.4f\nP(neither fails in year 1) = %.4f\n", p_fail1, p_neither))
P(fail in year 1) = 0.1040
P(neither fails in year 1) = 0.8028
cat(sprintf("P(fail within 2 years) = %.4f\nP(exactly one fails in 2 years) = %.4f\n", p_fail2, p_exactly_one))
P(fail within 2 years) = 0.3520
P(exactly one fails in 2 years) = 0.4562

Exercise: Two dice are tossed. Let \(Y = |X_1 - X_2|\). Compute \(\text{Var}(Y)\).


5 Lecture 5: Moments of a Random Variable

5.1 Definition

Moments are expectations of powers of a random variable and characterise properties of the distribution.

5.2 Raw Moments (About the Origin)

The \(r^{th}\) raw moment of \(X\), denoted \(\mu'_r\), is:

\[\mu'_r = E(X^r), \quad r = 1, 2, \ldots\]

5.3 Central Moments (About the Mean)

Taking moments about the mean \(\mu = E(X)\), the \(r^{th}\) central moment is:

\[\mu_r = E\left[(X - \mu)^r\right]\]

5.3.1 Formulas

Discrete:

\[\mu'_r = \sum_x x^r P(X = x), \qquad \mu_r = \sum_x (x - \mu)^r P(X = x)\]

Continuous:

\[\mu'_r = \int_{-\infty}^{\infty} x^r f(x)\,dx, \qquad \mu_r = \int_{-\infty}^{\infty} (x - \mu)^r f(x)\,dx\]


5.4 Remarks

\[\begin{array}{l|l} \text{Moment} & \text{Result} \\ \hline \mu'_1 = E(X^1) & \text{Mean of } X \\ \mu_1 = E[(X-\mu)^1] & \text{Always } = 0 \\ \mu_2 = E[(X-\mu)^2] & \text{Variance of } X \end{array}\]

5.5 Relationships Between Raw and Central Moments

Second central moment:

\[\mu_2 = \mu'_2 - (\mu'_1)^2\]

Third central moment:

\[\mu_3 = \mu'_3 - 3\mu'_1\mu'_2 + 2(\mu'_1)^3\]

Fourth central moment (exercise):

\[\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4\]
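
These identities can be checked numerically on any distribution with finite moments. The sketch below reuses the p.m.f. \(P(X = x) = x/10\), \(x = 1, 2, 3, 4\), purely as an illustration; the variable names are assumptions of this sketch.

```r
# Illustrative pmf: P(X = x) = x / 10 on x = 1, 2, 3, 4.
x <- 1:4
p <- x / 10

raw     <- sapply(1:4, function(r) sum(x^r * p))            # mu'_1 .. mu'_4
mu      <- raw[1]
central <- sapply(1:4, function(r) sum((x - mu)^r * p))     # mu_1 .. mu_4

# Central moments rebuilt from raw moments using the identities above.
mu2 <- raw[2] - raw[1]^2
mu3 <- raw[3] - 3 * raw[1] * raw[2] + 2 * raw[1]^3
mu4 <- raw[4] - 4 * raw[3] * raw[1] + 6 * raw[2] * raw[1]^2 - 3 * raw[1]^4

print(rbind(direct = central[2:4], from_raw = c(mu2, mu3, mu4)))
```

The two rows of the printed matrix should agree up to floating-point rounding, and the first central moment `central[1]` should be zero.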


5.6 Factorial Moments

The \(r^{th}\) factorial moment of \(X\) is:

\[\mu_{[r]} = E[X(X-1)(X-2)\cdots(X-r+1)]\]

The first factorial moment is always the mean: \(\mu_{[1]} = E(X)\).

5.6.1 Example 13: Factorial Moments of a Fair Die

x <- 1:6; p <- rep(1/6, 6)

mu1 <- sum(x * p)
mu2 <- sum(x * (x - 1) * p)
mu3 <- sum(x * (x - 1) * (x - 2) * p)

df_fact <- data.frame(
  x = x,
  Prob = p,
  `x(x-1)` = x * (x - 1),
  `x(x-1)(x-2)` = x * (x - 1) * (x - 2),
  check.names = FALSE
)
knitr::kable(df_fact, caption = "Factorial moment calculations for a fair die")
Table 5.1: Factorial moment calculations for a fair die
x Prob x(x-1) x(x-1)(x-2)
1 0.1666667 0 0
2 0.1666667 2 0
3 0.1666667 6 6
4 0.1666667 12 24
5 0.1666667 20 60
6 0.1666667 30 120
cat(sprintf("\n1st Factorial Moment E(X)         = %.4f\n", mu1))

1st Factorial Moment E(X)         = 3.5000
cat(sprintf("2nd Factorial Moment E[X(X-1)]     = %.4f\n", mu2))
2nd Factorial Moment E[X(X-1)]     = 11.6667
cat(sprintf("3rd Factorial Moment E[X(X-1)(X-2)]= %.4f\n", mu3))
3rd Factorial Moment E[X(X-1)(X-2)]= 35.0000

6 Lecture 6: Moment Generating Function (MGF)

6.1 Introduction

Computing all moments directly can be tedious. The moment generating function (MGF) generates all moments from a single function. When it exists in an open interval around \(t = 0\), the MGF uniquely determines the distribution, so a distribution can be identified from its MGF.

6.2 Definition

Let \(X\) be a random variable. The MGF, denoted \(M_X(t)\), is:

\[M_X(t) = E\left[e^{tX}\right], \quad -h < t < h, \quad h > 0\]

Discrete:

\[M_X(t) = \sum_{\text{all } x} e^{tx} P(X = x)\]

Continuous:

\[M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx\]


6.3 Deriving Moments from the MGF

Using the Taylor expansion \(e^{tx} = 1 + tx + \frac{t^2 x^2}{2!} + \frac{t^3 x^3}{3!} + \cdots\) and differentiating \(M_X(t)\) with respect to \(t\):

\[M'_X(t)\Big|_{t=0} = E(X) = \mu'_1\]

\[M''_X(t)\Big|_{t=0} = E(X^2) = \mu'_2\]

General result: The \(r^{th}\) raw moment is obtained by differentiating the MGF \(r\) times and setting \(t = 0\):

\[\mu'_r = E(X^r) = M_X^{(r)}(t)\Big|_{t=0}\]
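The general result can be illustrated numerically. The sketch below is an illustration only: it takes the MGF of a fair die (an assumed example) and approximates its first two derivatives at \(t = 0\) by central finite differences. The step size `h` is an arbitrary small value.

```r
# MGF of a fair die: M(t) = (1/6) * sum_{x=1}^{6} exp(t * x)
M <- function(t) mean(exp(t * (1:6)))

h <- 1e-4  # step size for the finite differences (arbitrary small value)

# Central-difference approximations to M'(0) and M''(0)
d1 <- (M(h) - M(-h)) / (2 * h)
d2 <- (M(h) - 2 * M(0) + M(-h)) / h^2

# Compare with the raw moments computed directly
cat(sprintf("M'(0)  ~ %.4f  (E(X)   = %.4f)\n", d1, mean(1:6)))
cat(sprintf("M''(0) ~ %.4f  (E(X^2) = %.4f)\n", d2, mean((1:6)^2)))
```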


6.4 Example: Exponential Distribution

\[f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}\]

6.4.1 (a) Finding the MGF

\[M_X(t) = \int_0^{\infty} e^{tx} \cdot \lambda e^{-\lambda x}\,dx = \lambda \int_0^{\infty} e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t}, \quad t < \lambda\]

6.4.2 (b) Mean and Variance from the MGF

\[M'_X(t) = \lambda(\lambda - t)^{-2}\]

\[E(X) = M'_X(0) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}\]

\[M''_X(t) = 2\lambda(\lambda - t)^{-3}\]

\[E(X^2) = M''_X(0) = \frac{2}{\lambda^2}\]

\[\text{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}\]

6.4.3 (c) Verification (Direct Calculation)

lambda_vals <- c(0.5, 1, 2, 5)

results <- lapply(lambda_vals, function(lam) {
  # Direct integration
  EX  <- integrate(function(x) x * lam * exp(-lam * x), 0, Inf)$value
  EX2 <- integrate(function(x) x^2 * lam * exp(-lam * x), 0, Inf)$value
  data.frame(
    Lambda    = lam,
    `E(X) formula` = round(1 / lam, 4),
    `E(X) direct`  = round(EX, 4),
    `Var(X) formula` = round(1 / lam^2, 4),
    `Var(X) direct`  = round(EX2 - EX^2, 4),
    check.names = FALSE
  )
})

knitr::kable(do.call(rbind, results),
             caption = "Verification: MGF formula vs direct integration for Exponential distribution")
Table 6.1: Verification: MGF formula vs direct integration for Exponential distribution
Lambda E(X) formula E(X) direct Var(X) formula Var(X) direct
0.5 2.0 2.0 4.00 4.00
1.0 1.0 1.0 1.00 1.00
2.0 0.5 0.5 0.25 0.25
5.0 0.2 0.2 0.04 0.04
x_seq <- seq(0, 5, length.out = 300)
df_exp <- do.call(rbind, lapply(lambda_vals, function(lam) {
  data.frame(x = x_seq, fx = lam * exp(-lam * x_seq), lambda = factor(lam))
}))

ggplot(df_exp, aes(x, fx, color = lambda)) +
  geom_line(linewidth = 1) +
  labs(
    title = expression("Exponential PDF for Various " * lambda),
    x = "x", y = "f(x)",
    colour = expression(lambda)
  ) +
  theme_minimal()

Figure 6.1: Exponential PDF for various lambda values


6.5 Summary Table

Concept Formula
Raw moment \(\mu'_r = E(X^r)\)
Central moment \(\mu_r = E[(X - \mu)^r]\)
MGF \(M_X(t) = E[e^{tX}]\)
Mean from MGF \(E(X) = M'_X(0)\)
Variance from MGF \(\text{Var}(X) = M''_X(0) - [M'_X(0)]^2\)
\(\mu_2 = \sigma^2\) \(\mu'_2 - (\mu'_1)^2\)
\(\mu_3\) \(\mu'_3 - 3\mu'_1\mu'_2 + 2(\mu'_1)^3\)

7 Lecture 7: Special Theoretical Probability Distributions

8 Introduction

Special theoretical probability distributions are theoretical distributions which are not obtained directly from actual experiments. They are deduced mathematically on the basis of certain assumptions, and are broadly classified into:

  • Discrete probability distributions
  • Continuous probability distributions

9 Discrete Probability Distributions

9.1 Discrete Uniform Distribution

A random variable \(X\) has a discrete uniform distribution if:

\[f(x) = \begin{cases} \frac{1}{N}, & x = 1, 2, 3, \ldots, N \\ 0, & \text{otherwise} \end{cases}\]

9.1.1 Mean

\[E(X) = \sum_{x=1}^{N} x \cdot f(x) = \sum_{x=1}^{N} x \cdot \frac{1}{N} = \frac{1}{N} \sum_{x=1}^{N} x\]

\[= \frac{1}{N} \cdot \frac{N(N+1)}{2} = \frac{N+1}{2}\]

\[\therefore \text{Mean} = \frac{N+1}{2}\]

9.1.2 Variance

\[\text{Var}(X) = \sigma^2 = E(X^2) - [E(X)]^2\]

\[E(X^2) = \sum_{x=1}^{N} x^2 \cdot \frac{1}{N} = \frac{1}{N} \cdot \frac{N(N+1)(2N+1)}{6} = \frac{(N+1)(2N+1)}{6}\]

\[\text{Var}(X) = \frac{(N+1)(2N+1)}{6} - \left(\frac{N+1}{2}\right)^2 = \frac{N^2 - 1}{12}\]

\[\boxed{\text{Var}(X) = \frac{N^2 - 1}{12}}\]

9.1.3 Moment Generating Function

\[M_X(t) = E(e^{tX}) = \sum_{x=1}^{N} e^{tx} \cdot \frac{1}{N} = \frac{1}{N}[e^t + e^{2t} + \cdots + e^{Nt}]\]

\[= \frac{e^t(1 - e^{Nt})}{N(1 - e^t)}, \quad t \neq 0\]
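The mean, variance and MGF formulas for the discrete uniform distribution can be checked numerically. The sketch below assumes \(N = 10\) and evaluates the MGF at \(t = 0.3\); both values are arbitrary illustrative choices.

```r
N <- 10                 # illustrative choice
x <- 1:N
f <- rep(1 / N, N)      # pmf: 1/N on 1, ..., N

mean_direct <- sum(x * f)
var_direct  <- sum(x^2 * f) - mean_direct^2

# MGF as a finite sum and via the geometric-series closed form (t != 0)
t0 <- 0.3
mgf_sum    <- sum(exp(t0 * x) * f)
mgf_closed <- exp(t0) * (1 - exp(N * t0)) / (N * (1 - exp(t0)))

cat(sprintf("Mean: %.4f vs (N+1)/2    = %.4f\n", mean_direct, (N + 1) / 2))
cat(sprintf("Var:  %.4f vs (N^2-1)/12 = %.4f\n", var_direct, (N^2 - 1) / 12))
cat(sprintf("MGF at t = %.1f: %.6f (sum) vs %.6f (closed form)\n",
            t0, mgf_sum, mgf_closed))
```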


9.2 Bernoulli Distribution

A Bernoulli trial is a random experiment with only two mutually exclusive outcomes: success or failure. Let:

\[X = \begin{cases} 1, & \text{outcome is a success} \\ 0, & \text{outcome is a failure} \end{cases}\]

The distribution is characterised by a single parameter \(p\), where \(p = \Pr(\text{success})\) and \(q = 1 - p = \Pr(\text{failure})\).

\[P(X = x) = \begin{cases} p^x(1-p)^{1-x}, & x = 0, 1 \\ 0, & \text{otherwise} \end{cases}\]

9.2.1 Proof that \(f(x)\) is a pmf

\[\sum_{x=0}^{1} f(x) = p^0 q^1 + p^1 q^0 = q + p = 1 \checkmark\]

9.2.2 Mean

\[E(X) = \sum_{x=0}^{1} x \cdot P(X = x) = 0 \cdot (1-p) + 1 \cdot p = p\]

9.2.3 Variance

\[E(X^2) = 0^2 \cdot (1-p) + 1^2 \cdot p = p\]

\[\text{Var}(X) = E(X^2) - [E(X)]^2 = p - p^2 = p(1-p)\]

9.2.4 Moment Generating Function

\[M_X(t) = E[e^{tX}] = \sum_{x=0}^{1} e^{tx} p^x (1-p)^{1-x} = (1-p) + pe^t = q + pe^t\]


9.3 Binomial Distribution

The binomial distribution arises when we count the number of successes in \(n\) independent trials, each with probability of success \(p\).

\[f(x) = \begin{cases} \binom{n}{x} p^x q^{n-x}, & x = 0, 1, 2, \ldots, n \\ 0, & \text{elsewhere} \end{cases}\]

We write \(X \sim \text{Bin}(n, p)\).

9.3.1 Proof that \(f(x)\) is a pmf

\[\sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = (p+q)^n = 1 \checkmark\]

9.3.2 Moment Generating Function

\[M_X(t) = E(e^{tX}) = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x q^{n-x}\]

Applying the binomial expansion with \(a = q\) and \(b = pe^t\):

\[\boxed{M_X(t) = (q + pe^t)^n}\]

9.3.3 Mean

\[M_X'(t) = npe^t(q + pe^t)^{n-1}\]

\[M_X'(0) = np(p+q)^{n-1} = np\]

\[\therefore \textbf{Mean} = np\]

9.3.4 Variance

Using the product rule on \(M_X'(t)\):

\[M_X''(t) = npe^t(q + pe^t)^{n-2}(q + npe^t)\]

\[M_X''(0) = np(q + np) = npq + n^2p^2\]

\[\text{Var}(X) = M_X''(0) - [M_X'(0)]^2 = (npq + n^2p^2) - (np)^2 = npq\]

\[\boxed{\text{Var}(X) = npq}\]
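A short numerical check of the binomial results, using the pmf from `dbinom()`. The values \(n = 12\), \(p = 0.3\) and \(t = 0.5\) are assumptions made for illustration.

```r
n <- 12; p <- 0.3; q <- 1 - p     # illustrative parameters
x <- 0:n
f <- dbinom(x, size = n, prob = p)

EX   <- sum(x * f)                # mean computed from the pmf
VarX <- sum(x^2 * f) - EX^2       # variance computed from the pmf

# MGF at a single point: sum over the pmf vs the closed form (q + p e^t)^n
t0 <- 0.5
mgf_sum    <- sum(exp(t0 * x) * f)
mgf_closed <- (q + p * exp(t0))^n

cat(sprintf("E(X)   = %.4f, np  = %.4f\n", EX, n * p))
cat(sprintf("Var(X) = %.4f, npq = %.4f\n", VarX, n * p * q))
cat(sprintf("M(%.1f) = %.6f (sum) vs %.6f (closed form)\n", t0, mgf_sum, mgf_closed))
```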

9.3.5 Examples

Example 1. The binomial distribution with \(n = 7\), \(p = \frac{1}{2}\):

\[f(x) = \binom{7}{x} \left(\frac{1}{2}\right)^7, \quad x = 0,1,\ldots,7\]

\[E(X) = 7 \cdot \frac{1}{2} = \frac{7}{2}, \qquad \text{Var}(X) = 7 \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{7}{4}\]

\[P(0 \leq X \leq 1) = \frac{1}{128} + \frac{7}{128} = \frac{8}{128}, \qquad P(X = 5) = \frac{21}{128}\]

Example 2. A marksman hits a target with probability \(p = \frac{5}{6}\). He fires 9 shots. \(X \sim \text{Bin}(9, \frac{5}{6})\).

  1. \(P(X \geq 7) = P(X=7) + P(X=8) + P(X=9) = 0.822\)

  2. \(P(X \leq 6) = 1 - P(X \geq 7) = 1 - 0.822 = 0.178\)

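  3. Given that he hits the target exactly 6 times, the probability that the three misses are consecutive: the 6 hits can be arranged among the 9 shots in \(\binom{9}{6} = 84\) equally likely ways, and three successive misses can occur in 7 ways (positions 1-3, 2-4, …, 7-9). \[P = \frac{7}{84} = \frac{1}{12}\]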

Example 3. Four coins tossed simultaneously. \(X \sim \text{Bin}(4, \frac{1}{2})\).

  1. \(P(X = 0) = \frac{1}{16} = 0.0625\)

  2. \(P(X = 1) = \frac{4}{16} = 0.25\)

  3. \(P(X \geq 1) = 1 - P(X=0) = 1 - \frac{1}{16} = 0.9375\)

Example 4. 10% of production is defective. \(X \sim \text{Bin}(10, 0.1)\).

\[P(X = 2) = \binom{10}{2}(0.1)^2(0.9)^8 = 0.1937\] \[P(X \geq 2) = 1 - [P(X=0) + P(X=1)] = 0.264\]
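The probabilities in Examples 1, 2 and 4 can be reproduced with R's built-in `dbinom()` and `pbinom()`. This is a sketch for checking the arithmetic; the object names are illustrative.

```r
# Example 1: X ~ Bin(7, 1/2)
ex1_a <- pbinom(1, size = 7, prob = 0.5)    # P(0 <= X <= 1)
ex1_b <- dbinom(5, size = 7, prob = 0.5)    # P(X = 5)

# Example 2: X ~ Bin(9, 5/6); P(X >= 7) is the upper tail beyond 6
ex2 <- pbinom(6, size = 9, prob = 5/6, lower.tail = FALSE)

# Example 4: X ~ Bin(10, 0.1)
ex4_a <- dbinom(2, size = 10, prob = 0.1)                      # P(X = 2)
ex4_b <- pbinom(1, size = 10, prob = 0.1, lower.tail = FALSE)  # P(X >= 2)

print(round(c(ex1_a = ex1_a, ex1_b = ex1_b, ex2 = ex2,
              ex4_a = ex4_a, ex4_b = ex4_b), 4))
```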

9.3.6 Binomial Exercises

  1. If the MGF of \(X\) is \(\left(\frac{1}{3} + \frac{2}{3}e^t\right)^5\), find \(E(X)\) and \(\text{Var}(X)\).

  2. Let \(Y\) be the number of successes in \(n\) independent repetitions with probability of success \(p\). Find the smallest \(n\) such that \(P(Y \geq 1) \geq 0.95\). [5]

  3. Let \(X \sim \text{Bin}(2, p)\) and \(Y \sim \text{Bin}(4, p)\). If \(P(X \geq 1) = \frac{5}{9}\), find \(P(Y \geq 1)\).

  4. 90% of students entering a diploma course successfully complete it. 15 students commence the course. Find the probability that:

    1. All 15 successfully complete the course. [0.206]
    2. Only 1 student fails. [0.343]
    3. No more than 2 students fail. [0.816]
    4. At least 2 students fail. [0.451]
  5. A company has ten telephone lines. The probability that any particular line is engaged is \(p = 0.2\). Find the expected number of free lines [8] and calculate:

    1. All lines engaged. [0.0000001]
    2. At least one line is free. [1]
    3. Exactly two lines are free. [0.000074]
  6. 10% of glasses made by a machine are defective. A sample of 10 is selected. Find the probability that none are defective and the expected number of defectives. [0.349, 1]

  7. Bolts are shipped in lots of 20. Each bolt is defective with probability 0.05.

    1. Expected number of defectives per lot. [1]
    2. Probability that a lot contains no defectives. [0.358]
  8. You receive 10 lots from the manufacturer in question 7.

    1. Expected number of lots with no defectives. [3.58]
    2. Probability of no defectives in all 10 lots. [0.000035]
  9. Probability that a pen drawn from a box is defective is 0.1. A sample of 6 pens is taken. Find the probability it contains:

    1. Exactly 5 or 6 defective pens.
    2. More than 2 defective pens.
    3. Less than 3 defective pens.

9.4 Poisson Distribution

A discrete random variable \(X\) has a Poisson distribution if:

\[\Pr(X = x) = f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, 3, \ldots\]

We write \(X \sim \text{Po}(\lambda)\), where \(\lambda > 0\) is the parameter.

Poisson Recurrence Formula:

\[\Pr(X = x+1) = \frac{\lambda}{x+1} \Pr(X = x), \quad x = 0, 1, 2, \ldots\]
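The recurrence lets successive probabilities be built up from \(\Pr(X = 0) = e^{-\lambda}\) without computing factorials. A minimal sketch, assuming \(\lambda = 2.5\) and stopping at \(x = 10\) (both arbitrary), compared against `dpois()`:

```r
lambda <- 2.5    # illustrative rate
x_max  <- 10     # largest value of x to compute

prob <- numeric(x_max + 1)
prob[1] <- exp(-lambda)                   # P(X = 0); prob[x + 1] stores P(X = x)
for (x in 0:(x_max - 1)) {
  # P(X = x + 1) = lambda / (x + 1) * P(X = x)
  prob[x + 2] <- lambda / (x + 1) * prob[x + 1]
}

# Largest absolute difference from R's built-in Poisson pmf
max_diff <- max(abs(prob - dpois(0:x_max, lambda)))
print(max_diff)
```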

9.4.1 Uses

  1. Estimation of probabilities of rare events (e.g., telephone calls at a switchboard, insurance claims, accidents, flaws in manufactured material).
  2. Approximation to the Binomial with the same mean \(\lambda = np\), when \(n > 50\) and \(p < 0.1\).

9.4.2 Moment Generating Function

\[M_X(t) = E(e^{tX}) = \sum_{x=0}^{\infty} e^{tx} \cdot \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!}\]

\[= e^{-\lambda} \cdot e^{\lambda e^t}\]

\[\boxed{M_X(t) = e^{\lambda(e^t - 1)}}\]

9.4.3 Mean

\[M_X'(t) = \lambda e^t \cdot e^{\lambda(e^t - 1)}\]

\[M_X'(0) = \lambda e^0 \cdot e^{\lambda(1-1)} = \lambda\]

\[\therefore \textbf{Mean} = \lambda\]

9.4.4 Variance

Using the product rule on \(M_X'(t)\):

\[M_X''(t) = \lambda e^t e^{\lambda(e^t-1)}[1 + \lambda e^t]\]

\[M_X''(0) = \lambda(1)(1)[1 + \lambda] = \lambda + \lambda^2\]

\[\text{Var}(X) = M_X''(0) - [M_X'(0)]^2 = (\lambda + \lambda^2) - \lambda^2 = \lambda\]

\[\boxed{\text{For a Poisson distribution: Mean} = \lambda = \text{Var}(X)}\]

9.4.5 Poisson or Binomial?

Given a frequency distribution, the decision between Poisson and binomial is often made by comparing the mean and variance. The closer these two values are, the more likely a Poisson distribution applies.

9.4.6 Additive Property

If \(X \sim \text{Po}(\lambda_1)\) and \(Y \sim \text{Po}(\lambda_2)\) are independent, then \(X + Y \sim \text{Po}(\lambda_1 + \lambda_2)\).
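The additive property can be checked numerically by convolving two Poisson pmfs. The sketch below assumes the two variables are independent and uses illustrative rates 1.3 and 2.1.

```r
rate_x <- 1.3; rate_y <- 2.1     # illustrative rates
k <- 0:15

# P(X + Y = s) = sum_{j=0}^{s} P(X = j) * P(Y = s - j), assuming independence
conv <- sapply(k, function(s) sum(dpois(0:s, rate_x) * dpois(s:0, rate_y)))

# pmf of a single Poisson variable with the summed rate
direct <- dpois(k, rate_x + rate_y)

max_diff <- max(abs(conv - direct))
print(max_diff)
```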

9.4.7 Examples

Example 1. Telephone calls arrive at a switchboard at 50 per hour. Find the probabilities of 0, 1, or 2 calls in any 5-minute period.

Average rate per 5 minutes: \(\lambda = \frac{50}{12} = 4.17\). So \(X \sim \text{Po}(4.17)\).

\[P(X=0) = \frac{e^{-4.17}(4.17)^0}{0!} = 0.02\] \[P(X=1) = \frac{e^{-4.17}(4.17)^1}{1!} = 0.06\] \[P(X=2) = \frac{e^{-4.17}(4.17)^2}{2!} = 0.13\]

Example 2. A professor has probability 0.001 of being late to any class. In 100 classes, the number of late arrivals is \(X \sim \text{Bin}(100, 0.001)\), approximated by \(\text{Po}(\lambda)\) with \(\lambda = np = 0.1\).

\[P(X=0) = \frac{e^{-0.1}(0.1)^0}{0!} = 0.9048\] \[P(X=1) = \frac{e^{-0.1}(0.1)^1}{1!} = 0.0905\]

These match the exact binomial probabilities very closely.
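To see how close the approximation in Example 2 is, the exact binomial and approximating Poisson probabilities can be tabulated side by side. A brief sketch (the data frame name `comparison` is illustrative):

```r
n <- 100; p <- 0.001      # Example 2: Bin(100, 0.001)
x <- 0:2

comparison <- data.frame(
  x        = x,
  binomial = dbinom(x, size = n, prob = p),   # exact probabilities
  poisson  = dpois(x, lambda = n * p)         # Poisson approximation, lambda = np
)
comparison$abs_diff <- abs(comparison$binomial - comparison$poisson)

print(round(comparison, 5))
```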

Example 3. External calls arrive at rate 1 per 5 minutes; internal calls at rate 2 per 5 minutes.

Per 2-minute period: \(E \sim \text{Po}(0.4)\), \(I \sim \text{Po}(0.8)\).

By the additive property: \(E + I \sim \text{Po}(1.2)\).

\[P(E+I > 2) = 1 - [P(0) + P(1) + P(2)]\] \[= 1 - \left[e^{-1.2} + 1.2e^{-1.2} + \frac{1.44}{2}e^{-1.2}\right] = 1 - 2.92\,e^{-1.2} \approx 0.1205\]

Example 4. Calls per 10 minutes follow \(\text{Po}(0.6)\).

  1. \(P(X=0) = e^{-0.6} = 0.55\)

  2. For 40 minutes: \(X' \sim \text{Po}(0.6 \times 4) = \text{Po}(2.4)\). \[P(X' > 2) = 1 - [P(0) + P(1) + P(2)] = 0.430\]
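The tail probabilities in Examples 3 and 4 can be evaluated with `ppois()` and `dpois()`. A minimal sketch; the object names are illustrative.

```r
# Example 3: E + I ~ Po(1.2) over a 2-minute period
p_ex3 <- ppois(2, lambda = 1.2, lower.tail = FALSE)       # P(E + I > 2)

# Example 4: Po(0.6) per 10 minutes, hence Po(2.4) per 40 minutes
p_ex4_none <- dpois(0, lambda = 0.6)                      # P(X = 0)
p_ex4_more <- ppois(2, lambda = 2.4, lower.tail = FALSE)  # P(X' > 2)

print(round(c(ex3 = p_ex3, ex4_none = p_ex4_none, ex4_more = p_ex4_more), 4))
```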

Example 5. A hospital admits 50 patients per day; 3% require special rooms. Using the Poisson approximation with \(\lambda = np = 50 \times 0.03 = 1.5\):

\[P(X > 3) = 1 - [P(0) + P(1) + P(2) + P(3)] = 1 - 4.1875\,e^{-1.5} = 0.0656\]

9.4.8 Poisson Exercises

  1. Calls arrive randomly at 24 per hour. Find:

    1. Probability of no calls in 5 minutes. [0.135]
    2. Probability of more than 4 calls in 5 minutes. [0.0527]
  2. Bacteria in 1 ml of inoculum follow \(\text{Po}(2)\). Find the probability that at least 3 bacteria are present (dose is ineffective). [0.323]

  3. A manufactured cloth has type A flaws with mean 0.5/m and type B flaws with mean 1/m, independently Poisson distributed.

    1. Probability that 1 metre has: (i) 2 or fewer type A flaws [0.986], (ii) no flaws of either type [0.223].
    2. Show that the probability of exactly 1 flaw is exactly three times that of 1 flaw of each type.
    3. Removing a type A flaw costs 8p and a type B flaw costs 2p. Find the mean and standard deviation of removal cost per metre. [mean = 6, s.d. = 6]
  4. Radio resistors are defective with probability 0.2%. Sold in lots of 100 with a no-defective guarantee. What is the probability a lot violates the guarantee? [0.181]

  5. Beer packages are removed at 10 per hour during rush periods. Find:

    1. \(P(\geq 1 \text{ removed in first 6 minutes})\) [0.6321]
    2. \(P(\geq 1 \text{ removed in each of 3 consecutive 6-minute intervals})\) [0.2526]
  6. Accidents occur at rate 1 every 2 months. Find the expected number per year, its standard deviation, and \(P(\text{no accidents in a given month})\). [6, 2.45, 0.61]

  7. A used car salesman makes sales as a Poisson process with rate \(\lambda = 2\) per week. For the number of sales \(X\) in a given week, find: \(P(X = 3)\), \(P(X \geq 3)\), \(P(X \leq 3)\). [0.1804, 0.3233, 0.8571]

  8. A factory packs bolts in boxes of 500. The probability that a bolt is defective is 0.002. Find the probability that a box contains exactly 2 defective bolts.

  9. The mean number of bacteria per ml is 4 (\(\text{Po}(4)\)). Find:

    1. In 1 ml: (i) \(P(\text{no bacteria})\), (ii) \(P(4 \text{ bacteria})\).
    2. In 3 ml: \(P(\text{less than 2 bacteria})\).
    3. In \(\frac{1}{2}\) ml: \(P(\text{more than 2 bacteria})\).

9.5 Hypergeometric Distribution

The hypergeometric distribution arises in sampling without replacement. A box contains \(N\) items, \(m\) of which are defective. In a sample of \(n\) items drawn without replacement:

\[f(x) = \frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, n\]

Note: In sampling with replacement, the binomial distribution applies with \(p = m/N\).

9.5.1 Mean and Variance

\[\mu = \frac{nm}{N}, \qquad \sigma^2 = \frac{nm(N-m)(N-n)}{N^2(N-1)}\]

9.5.2 Examples

Example 1. A carton contains 20 fuses, 5 defective. A sample of 3 is chosen without replacement. Here \(N=20\), \(m=5\), \(N-m=15\), \(n=3\).

\[f(0) = \frac{\binom{5}{0}\binom{15}{3}}{\binom{20}{3}} = \frac{91}{228}, \quad f(1) = \frac{35}{76}, \quad f(2) = \frac{5}{38}, \quad f(3) = \frac{1}{114}\]
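Example 1 can be reproduced with `dhyper()`, which also gives a check on the mean and variance formulas. Note that `dhyper()` uses its own argument names: `m` is the number of defectives, `n` the number of non-defectives and `k` the sample size. The sketch below maps the notation of these notes onto those arguments.

```r
N <- 20; m <- 5; n <- 3      # lot size, defectives, sample size (Example 1)
x <- 0:n

# dhyper(x, m = defectives, n = non-defectives, k = sample size)
f <- dhyper(x, m = m, n = N - m, k = n)

mean_direct <- sum(x * f)
var_direct  <- sum(x^2 * f) - mean_direct^2

# Closed-form mean and variance from Section 9.5.1
mean_formula <- n * m / N
var_formula  <- n * m * (N - m) * (N - n) / (N^2 * (N - 1))

print(round(f, 4))
cat(sprintf("Mean: %.4f vs %.4f; Var: %.4f vs %.4f\n",
            mean_direct, mean_formula, var_direct, var_formula))
```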

Example 2. A box has 20 spare parts: 15 good, 5 defective. 4 parts are selected at random. \(P(X = 3)\) where \(X\) is the number of good parts.

\[P(X=3) = \frac{\binom{15}{3}\binom{5}{1}}{\binom{20}{4}} = 0.4696\]

Example 3. A basket has 5 white and 45 black balls (\(N = 50\), \(m = 5\)). Draw 10 without replacement. \(P(4 \text{ white})\):

\[P(X=4) = \frac{\binom{5}{4}\binom{45}{6}}{\binom{50}{10}} = 0.00396\]

9.5.3 Hypergeometric Exercises

  1. A box contains 5 marbles, 3 are chipped. Two are chosen without replacement. Find the probability function for the number of chipped marbles.

  2. A panel of 7 judges decides a beauty contest by simple majority. 4 will vote for Marie, 3 for Sue. If 3 judges are selected randomly, what is the probability a majority favour Marie? [22/35]

  3. An incoming lot of 100 items is sampled (5 items). The lot is accepted if 1 or fewer defectives are found.

    1. If the lot contains 5 defectives, what is the probability of acceptance? [0.981]
    2. If the lot contains 15 defectives, what is the probability of rejection? [0.1609]
  4. In a school of 20 students, 6 are compulsive smokers. Prefects check 10 lockers at random. Find \(P(\text{cigarettes found in at least 3 lockers})\) where \(N=20\), \(n=10\).


9.6 Geometric (Pascal) Distribution

A random variable \(X\) has a geometric distribution if:

\[f(x) = pq^x, \quad x = 0, 1, 2, \ldots; \quad 0 < p < 1\]

This occurs when \(X\) represents the number of failures (unsuccessful trials) before the first success.

9.6.1 Moment Generating Function

\[M_X(t) = E(e^{tX}) = p \sum_{x=0}^{\infty} (qe^t)^x = \frac{p}{1 - qe^t} = p(1-qe^t)^{-1}\]

9.6.2 Mean

\[M_X'(t) = pqe^t(1-qe^t)^{-2}\]

\[M_X'(0) = pq(1-q)^{-2} = pqp^{-2} = \frac{q}{p}\]

\[\therefore \textbf{Mean} = \frac{q}{p}\]

9.6.3 Variance

Using the product rule on \(M_X'(t)\):

\[M_X''(t) = pqe^t(1-qe^t)^{-3}[1 + qe^t]\]

\[M_X''(0) = pq(p)^{-3}(1+q) = qp^{-2}(1+q)\]

\[\text{Var}(X) = M_X''(0) - [M_X'(0)]^2 = qp^{-2}(1+q) - \frac{q^2}{p^2} = \frac{q}{p^2}\]

\[\boxed{\text{Var}(X) = \frac{q}{p^2}}\]
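R's `dgeom()` uses the same parameterisation as above, \(f(x) = pq^x\) for \(x = 0, 1, 2, \ldots\), so the mean and variance can be checked by summing over a long but finite range. The sketch assumes \(p = 0.3\) and truncates at \(x = 500\), beyond which the remaining probability is negligible for this \(p\).

```r
p <- 0.3; q <- 1 - p          # illustrative success probability
x <- 0:500                    # truncation point; the tail beyond 500 is negligible here
f <- dgeom(x, prob = p)       # p * q^x

EX   <- sum(x * f)
VarX <- sum(x^2 * f) - EX^2

cat(sprintf("E(X)   = %.4f, q/p   = %.4f\n", EX, q / p))
cat(sprintf("Var(X) = %.4f, q/p^2 = %.4f\n", VarX, q / p^2))
```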


9.7 Negative Binomial Distribution

Consider independent repetitions with constant probability \(p\) of success. Let \(X\) = total number of failures before the \(r\)th success. Then:

\[f(x) = \binom{x+r-1}{r-1} p^r q^x, \quad x = 0, 1, 2, \ldots\]

Note: When \(r = 1\), this reduces to the geometric distribution.

9.7.1 Moment Generating Function

\[M_X(t) = p^r(1-qe^t)^{-r} = \left[\frac{p}{1-qe^t}\right]^r\]

9.7.2 Mean

\[M_X'(t) = p^r r q e^t (1-qe^t)^{-r-1}\]

\[M_X'(0) = rqp^{-1}\]

\[\therefore \textbf{Mean} = \frac{rq}{p}\]

9.7.3 Variance

\[M_X''(t) = p^r rqe^t(1-qe^t)^{-r-2}[1 + rqe^t]\]

\[M_X''(0) = rqp^{-2}(1+rq)\]

\[\text{Var}(X) = rqp^{-2}(1+rq) - \left(\frac{rq}{p}\right)^2 = \frac{rq}{p^2}\]

\[\boxed{\text{Var}(X) = \frac{rq}{p^2}}\]

9.7.4 Example

A telephone market research agency finds \(p = 0.40\) probability that a call is answered.

(a) Probability that the 10th answer comes on the 20th call (\(r = 10\), \(x + r = 20\)):

\[P = \binom{19}{9}(0.4)^{10}(0.6)^{10} = 0.05857\]

(b) Expected number of calls to obtain 7 answers:

\[E[X + r] = \frac{r}{p} = \frac{7}{0.4} = 17.5\]

(c) Probability that the first answer comes on the third call (\(r = 1\), \(x + r = 3\)):

\[P = \binom{2}{0}(0.4)^1(0.6)^2 = 0.144\]
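The three parts of this example can be checked with `dnbinom()`, whose first argument is the number of failures \(x\) before the `size`-th success, matching the definition of \(X\) above. A sketch with illustrative object names:

```r
p <- 0.4; q <- 1 - p

# (a) 10th answer on the 20th call: x = 10 failures before the r = 10th success
p_a <- dnbinom(10, size = 10, prob = p)

# (b) expected number of calls for r = 7 answers: r + E(X) = r + r*q/p = r/p
r <- 7
expected_calls <- r + r * q / p

# (c) first answer on the third call: x = 2 failures before the first success
p_c <- dnbinom(2, size = 1, prob = p)

cat(sprintf("(a) %.5f  (b) %.1f  (c) %.3f\n", p_a, expected_calls, p_c))
```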


10 Continuous Distributions

10.1 Uniform (Rectangular) Distribution

A random variable \(X\) has a rectangular distribution if:

\[f(x) = \begin{cases} \frac{1}{b-a}, & a < x < b \\ 0, & \text{elsewhere} \end{cases}\]

10.1.1 Mean

\[E(X) = \frac{1}{b-a} \int_a^b x \, dx = \frac{1}{2(b-a)}[b^2 - a^2] = \frac{a+b}{2}\]

\[\boxed{E(X) = \frac{a+b}{2}}\]

10.1.2 Variance

\[E(X^2) = \frac{1}{b-a} \int_a^b x^2 \, dx = \frac{b^2 + ab + a^2}{3}\]

\[\text{Var}(X) = \frac{b^2 + ab + a^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{(b-a)^2}{12}\]

\[\boxed{\text{Var}(X) = \frac{(b-a)^2}{12}}\]

10.1.3 Moment Generating Function

\[M_X(t) = \frac{1}{t(b-a)}\left[e^{tb} - e^{ta}\right], \quad t \neq 0\]

10.1.4 Example

Buses arrive every 10 minutes and a passenger reaches the stop at a random time. Let \(X\) be the waiting time in minutes. Then \(X \sim U(0, 10)\):

\[E(X) = \frac{0+10}{2} = 5, \qquad \text{Var}(X) = \frac{(10-0)^2}{12} = \frac{100}{12} \approx 8.33\]
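The bus example can be verified by numerical integration, in the same spirit as the exponential check in Lecture 6. The sketch below also evaluates the MGF at \(t = 0.2\), an arbitrary non-zero point.

```r
a <- 0; b <- 10                              # U(0, 10) waiting time
f <- function(x) dunif(x, min = a, max = b)  # density 1/(b - a) on (a, b)

EX  <- integrate(function(x) x * f(x), a, b)$value
EX2 <- integrate(function(x) x^2 * f(x), a, b)$value
VarX <- EX2 - EX^2

# MGF at one non-zero point: numerical integral vs closed form
t0 <- 0.2
mgf_num    <- integrate(function(x) exp(t0 * x) * f(x), a, b)$value
mgf_closed <- (exp(t0 * b) - exp(t0 * a)) / (t0 * (b - a))

cat(sprintf("E(X) = %.4f, Var(X) = %.4f\n", EX, VarX))
cat(sprintf("M(%.1f) = %.6f (numerical) vs %.6f (closed form)\n",
            t0, mgf_num, mgf_closed))
```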


11 The Normal Distribution

The continuous distribution having the density function

\[f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\dfrac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty\]

is called the normal distribution or Gauss distribution. A random variable having this distribution is said to be normally distributed.

This distribution is very important because:

  • Many random variables of practical interest are normal or approximately normal.
  • It appears in mathematical proofs of various statistical tests.
  • It is a useful approximation of more complicated distributions (binomial, Poisson).

In the formula above, \(\mu\) is the mean and \(\sigma\) is the standard deviation.

We write: \(X \sim N(\mu,\, \sigma^2)\), so that \(E(X) = \mu\) and \(\text{Var}(X) = \sigma^2\).


12 The Bell Curve

The curve of \(f(x)\) is a bell-shaped curve, symmetric with respect to \(x = \mu\).

  • For \(\mu > 0\) (\(\mu < 0\)) the curves have the same shape but are shifted \(|\mu|\) units to the right (left).
  • The smaller \(\sigma^2\) is, the higher the peak at \(x = \mu\) and the steeper the descents on both sides.
  • The density function has a maximum at \(x = \mu\) and inflection points at \(x = \mu \pm \sigma\).
x <- seq(-4, 4, length.out = 500)
y <- dnorm(x, mean = 0, sd = 1)

ggplot(data.frame(x, y), aes(x, y)) +
  geom_line(color = "#2C7BB6", linewidth = 1.2) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "grey40") +
  geom_vline(xintercept = c(-1, 1), linetype = "dotted", color = "#D7191C") +
  annotate("text", x = 0.15, y = 0.42, label = "mu == 0", parse = TRUE, size = 4) +
  annotate("text", x = 1.15, y = 0.25, label = "mu + sigma", size = 3.5, color = "#D7191C") +
  annotate("text", x = -1.15, y = 0.25, label = "mu - sigma", size = 3.5,
           color = "#D7191C", hjust = 1) +
  labs(
    title = "The Normal (Bell) Curve",
    subtitle = "X ~ N(0, 1): symmetric, bell-shaped, inflection points at x = mu +/- sigma",
    x = "x", y = "f(x)"
  ) +
  theme_minimal(base_size = 13)

Figure 12.1: Bell-shaped normal curve centred at mu = 0, sigma = 1


13 Effect of Changing \(\mu\) and \(\sigma\)

13.1 Effect of \(\mu\) (location shift)

Changing \(\mu\) shifts the curve left or right without changing its shape.

x <- seq(-8, 12, length.out = 600)

df_mu <- data.frame(
  x = rep(x, 3),
  y = c(dnorm(x, -2, 1), dnorm(x, 0, 1), dnorm(x, 4, 1)),
  dist = rep(c("mu = -2", "mu = 0", "mu = 4"), each = length(x))
)

ggplot(df_mu, aes(x, y, color = dist)) +
  geom_line(linewidth = 1.1) +
  scale_color_manual(values = c("#D7191C", "#2C7BB6", "#1A9641")) +
  labs(
    title = "Effect of Changing mu (sigma = 1 fixed)",
    subtitle = "Changing mu shifts the curve left or right",
    x = "x", y = "f(x)", color = "Distribution"
  ) +
  theme_minimal(base_size = 13)

Figure 13.1: Effect of changing mu: same shape, different location

13.2 Effect of \(\sigma\) (spread)

Changing \(\sigma\) affects the height and spread of the curve.

x <- seq(-8, 8, length.out = 600)

df_sigma <- data.frame(
  x = rep(x, 3),
  y = c(dnorm(x, 0, 0.5), dnorm(x, 0, 1), dnorm(x, 0, 2)),
  dist = rep(c("sigma = 0.5", "sigma = 1", "sigma = 2"), each = length(x))
)

ggplot(df_sigma, aes(x, y, color = dist)) +
  geom_line(linewidth = 1.1) +
  scale_color_manual(values = c("#D7191C", "#2C7BB6", "#1A9641")) +
  labs(
    title = "Effect of Changing sigma (mu = 0 fixed)",
    subtitle = "Smaller sigma: taller and narrower; larger sigma: shorter and wider",
    x = "x", y = "f(x)", color = "Distribution"
  ) +
  theme_minimal(base_size = 13)

Figure 13.2: Effect of changing sigma: same centre, different spread


14 The Standard Normal Distribution

14.1 Definition

If \(\mu = 0\) and \(\sigma = 1\), the distribution is called the standard normal distribution:

\[f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty\]

We write \(Z \sim N(0, 1)\).

14.2 Standardisation (Z-transformation)

For any \(X \sim N(\mu, \sigma^2)\), the transformation

\[Z = \frac{X - \mu}{\sigma}\]

produces a standard normal variable \(Z \sim N(0, 1)\).

Proof that \(E(Z) = 0\) and \(\text{Var}(Z) = 1\):

\[E(Z) = E\!\left(\frac{X-\mu}{\sigma}\right) = \frac{E(X)-\mu}{\sigma} = \frac{\mu - \mu}{\sigma} = 0\]

\[\text{Var}(Z) = \text{Var}\!\left(\frac{X-\mu}{\sigma}\right) = \frac{\text{Var}(X)}{\sigma^2} = \frac{\sigma^2}{\sigma^2} = 1\]
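A numerical illustration of standardisation, assuming \(X \sim N(50, 8^2)\) (arbitrary values). The integrals are taken over \(\mu \pm 10\sigma\), outside of which the density is negligible, and the last lines confirm that the CDF of \(X\) at a point equals \(\Phi\) at the standardised point.

```r
mu <- 50; sigma <- 8                       # illustrative parameters
lower <- mu - 10 * sigma                   # integration range covering
upper <- mu + 10 * sigma                   # essentially all of the mass

z <- function(x) (x - mu) / sigma          # the Z-transformation

EZ  <- integrate(function(x) z(x)   * dnorm(x, mu, sigma), lower, upper)$value
EZ2 <- integrate(function(x) z(x)^2 * dnorm(x, mu, sigma), lower, upper)$value
VarZ <- EZ2 - EZ^2

# P(X <= x0) computed directly and via the standard normal CDF
x0 <- 60
p_direct <- pnorm(x0, mean = mu, sd = sigma)
p_std    <- pnorm(z(x0))

cat(sprintf("E(Z) = %.4f, Var(Z) = %.4f\n", EZ, VarZ))
cat(sprintf("P(X <= %g) = %.4f; Phi(%.2f) = %.4f\n", x0, p_direct, z(x0), p_std))
```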

14.3 The Cumulative Distribution Function \(\Phi(z)\)

\[\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt\]

This integral has no closed form in terms of elementary functions; values are read from standard normal tables or computed numerically (for example with pnorm() in R).

z <- seq(-4, 4, length.out = 500)

par(mfrow = c(1, 2), mar = c(4, 4, 3, 1))

# PDF
plot(z, dnorm(z), type = "l", lwd = 2, col = "#2C7BB6",
     main = "Standard Normal PDF", xlab = "z", ylab = "f(z)")
abline(v = 0, lty = 2, col = "grey50")
grid()

# CDF
plot(z, pnorm(z), type = "l", lwd = 2, col = "#D7191C",
     main = "Standard Normal CDF", xlab = "z", ylab = expression(Phi(z)))
abline(h = 0.5, lty = 2, col = "grey50")
abline(v = 0, lty = 2, col = "grey50")
grid()

Figure 14.1: Standard Normal PDF and CDF

14.4 Symmetry Property

Because \(f(z)\) is symmetric about \(z = 0\):

\[\Phi(-z) = 1 - \Phi(z)\]

z <- seq(-4, 4, length.out = 500)
y <- dnorm(z)

df <- data.frame(z, y)

ggplot(df, aes(z, y)) +
  geom_line(color = "#2C7BB6", linewidth = 1.2) +
  geom_area(data = subset(df, z <= -1), aes(z, y), fill = "#D7191C", alpha = 0.5) +
  geom_area(data = subset(df, z >= 1),  aes(z, y), fill = "#1A9641", alpha = 0.5) +
  annotate("text", x = -2.2, y = 0.05, label = "P(Z < -1)\n= 0.1587", color = "#D7191C", size = 4) +
  annotate("text", x =  2.2, y = 0.05, label = "P(Z > 1)\n= 0.1587", color = "#1A9641", size = 4) +
  geom_vline(xintercept = c(-1, 1), linetype = "dashed", color = "grey40") +
  labs(
    title = "Symmetry Property: Phi(-z) = 1 - Phi(z)",
    x = "z", y = "f(z)"
  ) +
  theme_minimal(base_size = 13)

Figure 14.2: Symmetry of the standard normal: P(Z < -1) = P(Z > 1)


15 Properties of the Normal Distribution

  1. The curve is bell-shaped and symmetric about \(x = \mu\), so: \[\Pr(X < \mu) = \Pr(X > \mu) = 0.5\]

  2. As \(x \to \pm\infty\), \(f(x) \to 0\).

  3. Since \(f(x)\) is a valid p.d.f.: \[\int_{-\infty}^{\infty} f(x)\,dx = 1\]

  4. The inflection points are at \(x = \mu \pm \sigma\).

  5. Any normal variable can be standardised: if \(X \sim N(\mu, \sigma^2)\) then \(Z = \frac{X-\mu}{\sigma} \sim N(0,1)\).

15.1 The Empirical Rule (68-95-99.7 Rule)

x <- seq(-4, 4, length.out = 600)
y <- dnorm(x)
df <- data.frame(x, y)

ggplot(df, aes(x, y)) +
  geom_area(data = subset(df, x >= -3 & x <= 3), aes(x, y), fill = "#ABDDA4", alpha = 0.8) +
  geom_area(data = subset(df, x >= -2 & x <= 2), aes(x, y), fill = "#66C2A5", alpha = 0.8) +
  geom_area(data = subset(df, x >= -1 & x <= 1), aes(x, y), fill = "#3288BD", alpha = 0.8) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = c(-3,-2,-1,0,1,2,3), linetype = "dashed", color = "grey40") +
  annotate("text", x = 0, y = 0.20, label = "68.27%", color = "white", fontface = "bold", size = 4.5) +
  annotate("text", x = 0, y = 0.08, label = "95.45%", color = "white", fontface = "bold", size = 4.5) +
  annotate("text", x = 0, y = 0.02, label = "99.73%", color = "black", fontface = "bold", size = 4.5) +
  scale_x_continuous(
    breaks = -3:3,
    labels = c(expression(mu-3*sigma), expression(mu-2*sigma), expression(mu-sigma),
               expression(mu),
               expression(mu+sigma), expression(mu+2*sigma), expression(mu+3*sigma))
  ) +
  labs(
    title = "The Empirical Rule (68 - 95 - 99.7)",
    x = "", y = "f(x)"
  ) +
  theme_minimal(base_size = 12)

Figure 15.1: The Empirical Rule: areas under the normal curve

Interval Probability
\(\mu \pm \sigma\) 68.27%
\(\mu \pm 2\sigma\) 95.45%
\(\mu \pm 3\sigma\) 99.73%
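The three percentages in the table follow from the standard normal CDF, since \(\Pr(\mu - k\sigma < X < \mu + k\sigma) = \Phi(k) - \Phi(-k)\). A short sketch using `pnorm()`:

```r
k <- 1:3                                  # number of standard deviations
coverage <- pnorm(k) - pnorm(-k)          # Phi(k) - Phi(-k)

# Express as percentages rounded to two decimals
pct <- round(100 * coverage, 2)
print(data.frame(interval = paste0("mu +/- ", k, " sigma"), percent = pct))
```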

16 Moment Generating Function, Mean, and Variance

16.1 Moment Generating Function

By definition:

\[M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} \cdot \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx\]

Completing the square in the exponent:

\[(x-\mu)^2 - 2\sigma^2 t(x-\mu) = [(x-\mu) - \sigma^2 t]^2 - \sigma^4 t^2\]

After simplification (and noting the remaining integral equals 1 since it is the area under a normal density):

\[\boxed{M_X(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}}\]

16.2 Mean

\[M_X'(t) = \frac{d}{dt}\left[e^{\mu t + \frac{\sigma^2 t^2}{2}}\right] = (\mu + \sigma^2 t)\,e^{\mu t + \frac{\sigma^2 t^2}{2}}\]

At \(t = 0\):

\[M_X'(0) = (\mu + 0)\cdot e^0 = \mu\]

\[\boxed{E(X) = \mu}\]

16.3 Variance

Using the product rule on \(M_X'(t)\) with \(U = \mu + \sigma^2 t\) and \(V = e^{\mu t + \frac{\sigma^2 t^2}{2}}\):

\[M_X''(t) = \sigma^2\,e^{\mu t + \frac{\sigma^2 t^2}{2}} + (\mu + \sigma^2 t)^2\,e^{\mu t + \frac{\sigma^2 t^2}{2}}\]

At \(t = 0\):

\[M_X''(0) = \sigma^2 + \mu^2\]

\[\text{Var}(X) = M_X''(0) - [M_X'(0)]^2 = (\sigma^2 + \mu^2) - \mu^2\]

\[\boxed{\text{Var}(X) = \sigma^2}\]
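The closed-form MGF can be compared with a direct numerical evaluation of \(E(e^{tX})\). The sketch assumes \(\mu = 1\), \(\sigma = 2\) and \(t = 0.3\); all three values are arbitrary.

```r
mu <- 1; sigma <- 2; t0 <- 0.3            # illustrative values

# E(e^{tX}) by numerical integration against the normal density
mgf_num <- integrate(function(x) exp(t0 * x) * dnorm(x, mu, sigma),
                     -Inf, Inf)$value

# Closed form: exp(mu * t + sigma^2 * t^2 / 2)
mgf_closed <- exp(mu * t0 + sigma^2 * t0^2 / 2)

cat(sprintf("M(%.1f) = %.6f (numerical) vs %.6f (closed form)\n",
            t0, mgf_num, mgf_closed))
```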


17 The Cumulative Distribution and Probability Calculations

17.1 Key Formula

\[\Pr(a \leq X \leq b) = F(b) - F(a) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)\]

17.2 Useful Symmetry Results

\[\Phi(-z) = 1 - \Phi(z)\] \[\Pr(X \geq x) = 1 - \Phi\!\left(\frac{x - \mu}{\sigma}\right)\] \[\Pr(X < \mu) = \Phi(0) = 0.5\]
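As an illustration of the key formula and the symmetry results, the sketch below assumes \(X \sim N(100, 15^2)\) with \(a = 85\) and \(b = 130\) (hypothetical values), so that the standardised limits are \(-1\) and \(2\).

```r
mu <- 100; sigma <- 15                    # illustrative parameters
a <- 85; b <- 130

# P(a <= X <= b) directly and via standardisation: Phi(2) - Phi(-1)
p_direct <- pnorm(b, mu, sigma) - pnorm(a, mu, sigma)
p_std    <- pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma)

# Symmetry: Phi(-z) = 1 - Phi(z), shown here for z = 1.3
sym_left  <- pnorm(-1.3)
sym_right <- 1 - pnorm(1.3)

# Upper tail: P(X >= b) = 1 - Phi((b - mu) / sigma)
upper_tail <- 1 - pnorm((b - mu) / sigma)

cat(sprintf("P(%g <= X <= %g) = %.4f (direct), %.4f (standardised)\n",
            a, b, p_direct, p_std))
cat(sprintf("Phi(-1.3) = %.4f, 1 - Phi(1.3) = %.4f\n", sym_left, sym_right))
cat(sprintf("P(X >= %g) = %.4f\n", b, upper_tail))
```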


18 Visualising Probability Areas

18.1 \(P(Z < z_0)\): Left-tail probability

z <- seq(-4, 4, length.out = 500)
df <- data.frame(z, y = dnorm(z))
z0 <- 1.5
prob <- round(pnorm(z0), 4)

ggplot(df, aes(z, y)) +
  geom_area(data = subset(df, z <= z0), aes(z, y), fill = "#3288BD", alpha = 0.6) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = z0, linetype = "dashed", color = "#D7191C") +
  annotate("text", x = -1, y = 0.15,
           label = paste0("P(Z < ", z0, ") = ", prob),
           size = 5, color = "#2C7BB6", fontface = "bold") +
  annotate("text", x = z0 + 0.1, y = 0.42, label = paste0("z = ", z0),
           color = "#D7191C", size = 4, hjust = 0) +
  labs(title = "Left-tail Probability: P(Z < 1.5)", x = "z", y = "f(z)") +
  theme_minimal(base_size = 13)

Figure 18.1: Left-tail probability: P(Z < 1.5)

18.2 \(P(Z > z_0)\): Right-tail probability

z0 <- 1.0
prob <- round(1 - pnorm(z0), 4)

ggplot(df, aes(z, y)) +
  geom_area(data = subset(df, z >= z0), aes(z, y), fill = "#D7191C", alpha = 0.6) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = z0, linetype = "dashed") +
  annotate("text", x = 2.5, y = 0.15,
           label = paste0("P(Z > ", z0, ") = ", prob),
           size = 5, color = "#D7191C", fontface = "bold") +
  labs(title = "Right-tail Probability: P(Z > 1)", x = "z", y = "f(z)") +
  theme_minimal(base_size = 13)

Figure 18.2: Right-tail probability: P(Z > 1)

18.3 \(P(a < Z < b)\): Between two values

z_lo <- -1; z_hi <- 1
prob <- round(pnorm(z_hi) - pnorm(z_lo), 4)

ggplot(df, aes(z, y)) +
  geom_area(data = subset(df, z >= z_lo & z <= z_hi), aes(z, y),
            fill = "#1A9641", alpha = 0.6) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = c(z_lo, z_hi), linetype = "dashed", color = "#D7191C") +
  annotate("text", x = 0, y = 0.15,
           label = paste0("P(", z_lo, " < Z < ", z_hi, ") = ", prob),
           size = 5, color = "#1A9641", fontface = "bold") +
  labs(title = "Between Two Values: P(-1 < Z < 1) = 0.6827", x = "z", y = "f(z)") +
  theme_minimal(base_size = 13)

Figure 18.3: Probability between two z-values: P(-1 < Z < 1)


19 Worked Examples

19.1 Example 1: Direct Standard Normal Lookups

For \(Z \sim N(0,1)\), determine:

(a) \(\Pr(Z \leq 2.44)\)

(b) \(\Pr(Z \leq -1.16)\)

(c) \(\Pr(Z \geq 1)\)

(d) \(\Pr(2 \leq Z \leq 10)\)

19.1.1 Solution

(a) Read directly from tables: \(\Phi(2.44) = \mathbf{0.9927}\)

(b) \(\Phi(-1.16) = 1 - \Phi(1.16) = 1 - 0.8770 = \mathbf{0.1230}\)

(c) \(\Pr(Z \geq 1) = 1 - \Phi(1) = 1 - 0.8413 = \mathbf{0.1587}\)

(d) \(\Pr(2 \leq Z \leq 10) = \Phi(10) - \Phi(2) \approx 1 - 0.9772 = \mathbf{0.0228}\)

ggplot(df, aes(z, y)) +
  geom_area(data = subset(df, z >= 1), aes(z, y), fill = "#D7191C", alpha = 0.6) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = 1, linetype = "dashed") +
  annotate("text", x = 2.5, y = 0.12,
           label = "P(Z >= 1) = 0.1587", size = 5, color = "#D7191C", fontface = "bold") +
  labs(title = "Example 1(c): P(Z >= 1)", x = "z", y = "f(z)") +
  theme_minimal(base_size = 13)

Figure 19.1: Example 1(c): P(Z >= 1) = 0.1587

# Verify using R
cat("(a) P(Z <= 2.44)  =", pnorm(2.44), "\n")
#> (a) P(Z <= 2.44)  = 0.9926564
cat("(b) P(Z <= -1.16) =", pnorm(-1.16), "\n")
#> (b) P(Z <= -1.16) = 0.1230244
cat("(c) P(Z >= 1)     =", 1 - pnorm(1), "\n")
#> (c) P(Z >= 1)     = 0.1586553
cat("(d) P(2 <= Z <= 10) =", pnorm(10) - pnorm(2), "\n")
#> (d) P(2 <= Z <= 10) = 0.02275013

19.2 Example 2: Finding the Constant \(c\)

For \(Z \sim N(0,1)\), find \(c\) such that:

(a) \(\Pr(Z \geq c) = 10\%\)

(b) \(\Pr(Z \leq c) = 5\%\)

(c) \(\Pr(0 \leq Z \leq c) = 45\%\)

(d) \(\Pr(-c \leq Z \leq c) = 99\%\)

19.2.1 Solution

(a) \(\Pr(Z \geq c) = 0.10 \Rightarrow \Phi(c) = 0.90 \Rightarrow c = \Phi^{-1}(0.90) = \mathbf{1.282}\)

(b) \(\Phi(c) = 0.05 \Rightarrow c = -\Phi^{-1}(0.95) = \mathbf{-1.645}\)

(c) \(\Phi(c) - \Phi(0) = 0.45 \Rightarrow \Phi(c) = 0.95 \Rightarrow c = \mathbf{1.645}\)

(d) \(\Phi(c) - \Phi(-c) = 0.99 \Rightarrow 2\Phi(c) - 1 = 0.99 \Rightarrow \Phi(c) = 0.995 \Rightarrow c = \mathbf{2.576}\)

cat("(a) c such that P(Z >= c) = 10%:", qnorm(0.90), "\n")
#> (a) c such that P(Z >= c) = 10%: 1.281552
cat("(b) c such that P(Z <= c) = 5%:", qnorm(0.05), "\n")
#> (b) c such that P(Z <= c) = 5%: -1.644854
cat("(c) c such that P(0 <= Z <= c) = 45%:", qnorm(0.95), "\n")
#> (c) c such that P(0 <= Z <= c) = 45%: 1.644854
cat("(d) c such that P(-c <= Z <= c) = 99%:", qnorm(0.995), "\n")
#> (d) c such that P(-c <= Z <= c) = 99%: 2.575829
c_val <- qnorm(0.995)

ggplot(df, aes(z, y)) +
  geom_area(data = subset(df, z >= -c_val & z <= c_val), aes(z, y),
            fill = "#3288BD", alpha = 0.5) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = c(-c_val, c_val), linetype = "dashed", color = "#D7191C") +
  annotate("text", x = 0, y = 0.15,
           label = "P(-2.576 <= Z <= 2.576) = 99%",
           size = 4.5, color = "#2C7BB6", fontface = "bold") +
  annotate("text", x = -c_val - 0.1, y = 0.38,
           label = paste0("-c = -", round(c_val, 3)), color = "#D7191C", size = 3.8, hjust = 1) +
  annotate("text", x = c_val + 0.1, y = 0.38,
           label = paste0("c = ", round(c_val, 3)), color = "#D7191C", size = 3.8, hjust = 0) +
  labs(title = "Example 2(d): Two-sided 99% interval", x = "z", y = "f(z)") +
  theme_minimal(base_size = 13)

Figure 19.2: Example 2(d): P(-c <= Z <= c) = 99%, c = 2.576


19.3 Example 3: \(X \sim N(-2, 0.25)\)

Here \(\mu = -2\), \(\sigma^2 = 0.25\), so \(\sigma = 0.5\).

Find \(c\) such that:

(a) \(\Pr(X \geq c) = 0.2\)

(b) \(\Pr(-c \leq X \leq -1) = 0.5\)

(c) \(\Pr(-2-c \leq X \leq -2+c) = 0.9\)

(d) \(\Pr(-2-c \leq X \leq -2+c) = 99.6\%\)

19.3.1 Solution

(a) \[1 - \Phi\!\left(\frac{c+2}{0.5}\right) = 0.2 \Rightarrow \Phi\!\left(\frac{c+2}{0.5}\right) = 0.8 \Rightarrow \frac{c+2}{0.5} = 0.842 \Rightarrow c = -1.579\]

(b) \[\Phi\!\left(\frac{-1+2}{0.5}\right) - \Phi\!\left(\frac{-c+2}{0.5}\right) = 0.5\] \[\Phi(2) - \Phi\!\left(\frac{-c+2}{0.5}\right) = 0.5 \Rightarrow 0.9772 - \Phi\!\left(\frac{-c+2}{0.5}\right) = 0.5\] \[\Phi\!\left(\frac{-c+2}{0.5}\right) = 0.4772 \Rightarrow \frac{-c+2}{0.5} = -0.057 \Rightarrow c = 2.0285\]

(c) \[\Phi(2c) - \Phi(-2c) = 0.9 \Rightarrow 2\Phi(2c) - 1 = 0.9 \Rightarrow \Phi(2c) = 0.95 \Rightarrow 2c = 1.645 \Rightarrow c \approx 0.8225\]

(d) \[2\Phi(2c) - 1 = 0.996 \Rightarrow \Phi(2c) = 0.998 \Rightarrow 2c = 2.878 \Rightarrow c = 1.439\]

mu <- -2; sigma <- 0.5
cat("(a) c = ", mu + sigma * qnorm(0.8), "\n")
#> (a) c =  -1.579189
cat("(c) c = ", qnorm(0.95) / 2, "\n")
#> (c) c =  0.8224268
cat("(d) c = ", qnorm(0.998) / 2, "\n")
#> (d) c =  1.439081
x_seq <- seq(-4.5, 0.5, length.out = 500)
df3 <- data.frame(x = x_seq, y = dnorm(x_seq, mean = -2, sd = 0.5))
c_a <- -2 + 0.5 * qnorm(0.8)

ggplot(df3, aes(x, y)) +
  geom_area(data = subset(df3, x >= c_a), aes(x, y), fill = "#D7191C", alpha = 0.5) +
  geom_line(linewidth = 1.2, color = "#2C7BB6") +
  geom_vline(xintercept = c_a, linetype = "dashed", color = "#D7191C") +
  geom_vline(xintercept = -2, linetype = "dotted", color = "grey40") +
  annotate("text", x = c_a + 0.1, y = 0.6, label = paste0("c = ", round(c_a, 3)),
           color = "#D7191C", size = 4, hjust = 0) +
  annotate("text", x = -1.2, y = 0.2, label = "P(X >= c) = 0.20",
           size = 4, color = "#D7191C", fontface = "bold") +
  labs(title = "Example 3(a): X ~ N(-2, 0.25), P(X >= c) = 0.20",
       x = "x", y = "f(x)") +
  theme_minimal(base_size = 13)

Figure 19.3: Example 3: X ~ N(-2, 0.25), shaded region for part (a)
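The R verification for Example 3 covers parts (a), (c) and (d) only. A possible check for part (b) is sketched here; the names `p_lower` and `c_b` are introduced purely for illustration.

```r
mu <- -2; sigma <- 0.5

# P(-c <= X <= -1) = 0.5  =>  F(-c) = F(-1) - 0.5, where F is the CDF of X
p_lower <- pnorm(-1, mu, sigma) - 0.5

# qnorm() returns the lower endpoint -c, so negate it to obtain c
c_b <- -qnorm(p_lower, mu, sigma)
cat("(b) c =", c_b, "\n")

# Check: the probability over [-c, -1] should come back as 0.5
cat("check:", pnorm(-1, mu, sigma) - pnorm(-c_b, mu, sigma), "\n")
```

The result should agree with the hand calculation \(c = 2.0285\) to about three decimal places.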


19.4 Example 4: Finding \(\mu\) and \(\sigma\) from Percentile Information

An industrial process mass-produces items which are normally distributed. 11.55% of them weigh over 20 kg and 5.89% weigh under 10 kg. Find \(\mu\) and \(\sigma\).

19.4.1 Solution

From the upper tail:

\[\Pr(X > 20) = 0.1155 \Rightarrow \Phi\!\left(\frac{20-\mu}{\sigma}\right) = 0.8845 \Rightarrow \frac{20-\mu}{\sigma} = 1.2\]

\[\mu + 1.2\sigma = 20 \quad \cdots (i)\]

From the lower tail:

\[\Pr(X < 10) = 0.0589 \Rightarrow \frac{10-\mu}{\sigma} = -1.56\]

\[\mu - 1.56\sigma = 10 \quad \cdots (ii)\]

Solving simultaneously:

Subtract (ii) from (i): \(2.76\sigma = 10 \Rightarrow \sigma = 3.623\)

Substitute into (i): \(\mu = 20 - 1.2(3.623) = 15.652\)

\[\boxed{\mu = 15.652 \text{ kg}, \quad \sigma = 3.623 \text{ kg}}\]

# Solve simultaneously
A <- matrix(c(1, 1.2, 1, -1.56), nrow = 2, byrow = TRUE)
b <- c(20, 10)
sol <- solve(A, b)
cat("mu =", round(sol[1], 3), ", sigma =", round(sol[2], 3), "\n")
#> mu = 15.652 , sigma = 3.623
mu4 <- 15.652; sigma4 <- 3.623
x4 <- seq(mu4 - 4*sigma4, mu4 + 4*sigma4, length.out = 500)
df4 <- data.frame(x = x4, y = dnorm(x4, mu4, sigma4))

ggplot(df4, aes(x, y)) +
  geom_area(data = subset(df4, x <= 10), aes(x, y), fill = "#D7191C", alpha = 0.5) +
  geom_area(data = subset(df4, x >= 20), aes(x, y), fill = "#1A9641", alpha = 0.5) +
  geom_line(linewidth = 1.2, color = "#2C7BB6") +
  geom_vline(xintercept = mu4, linetype = "dotted", color = "grey30") +
  geom_vline(xintercept = c(10, 20), linetype = "dashed", color = "grey50") +
  annotate("text", x = 7.5, y = 0.03,
           label = "5.89%\n< 10 kg", color = "#D7191C", size = 4, fontface = "bold") +
  annotate("text", x = 22.5, y = 0.03,
           label = "11.55%\n> 20 kg", color = "#1A9641", size = 4, fontface = "bold") +
  annotate("text", x = mu4, y = 0.115,
           label = paste0("mu = ", round(mu4, 2)), size = 4, vjust = -0.5) +
  labs(
    title = "Example 4: Fitted Normal Distribution",
    subtitle = paste0("mu = ", round(mu4, 3), ", sigma = ", round(sigma4, 3)),
    x = "Weight (kg)", y = "f(x)"
  ) +
  theme_minimal(base_size = 13)

Figure 19.4: Example 4: Normal distribution fitted from two percentile constraints


19.5 Example 5: Standard Normal Probabilities with Diagrams

For \(Z \sim N(0,1)\), find:

(a) \(P(Z < 0)\)

(b) \(P(-1 < Z < 1)\)

(c) \(P(Z > 2.54)\)

par(mfrow = c(1, 3), mar = c(4, 3, 3, 1))
z <- seq(-4, 4, by = 0.01)

# (a) P(Z < 0)
plot(z, dnorm(z), type = "l", lwd = 2, main = "(a) P(Z < 0) = 0.5",
     xlab = "z", ylab = "", col = "#2C7BB6")
z_shade <- seq(-4, 0, by = 0.01)
polygon(c(-4, z_shade, 0), c(0, dnorm(z_shade), 0), col = "#3288BD55", border = NA)
abline(v = 0, lty = 2)
text(0, 0.1, "0.5", cex = 1.2, col = "#2C7BB6", font = 2)

# (b) P(-1 < Z < 1)
plot(z, dnorm(z), type = "l", lwd = 2, main = "(b) P(-1 < Z < 1) = 0.6827",
     xlab = "z", ylab = "", col = "#2C7BB6")
z_shade <- seq(-1, 1, by = 0.01)
polygon(c(-1, z_shade, 1), c(0, dnorm(z_shade), 0), col = "#1A964155", border = NA)
abline(v = c(-1, 1), lty = 2)
text(0, 0.1, "0.6827", cex = 1.1, col = "#1A9641", font = 2)

# (c) P(Z > 2.54)
plot(z, dnorm(z), type = "l", lwd = 2, main = "(c) P(Z > 2.54) = 0.0055",
     xlab = "z", ylab = "", col = "#2C7BB6")
z_shade <- seq(2.54, 4, by = 0.01)
polygon(c(2.54, z_shade, 4), c(0, dnorm(z_shade), 0), col = "#D7191C55", border = NA)
abline(v = 2.54, lty = 2)
text(3, 0.05, "0.0055", cex = 1.0, col = "#D7191C", font = 2)

Figure 19.5: Example 5: Three standard normal probability regions

cat("(a) P(Z < 0)        =", pnorm(0), "\n")
#> (a) P(Z < 0)        = 0.5
cat("(b) P(-1 < Z < 1)   =", pnorm(1) - pnorm(-1), "\n")
#> (b) P(-1 < Z < 1)   = 0.6826895
cat("(c) P(Z > 2.54)     =", 1 - pnorm(2.54), "\n")
#> (c) P(Z > 2.54)     = 0.005542623

19.6 Example 6: \(X \sim N(9, 9)\) — Find \(P(5 < X < 11)\)

Here \(\mu = 9\), \(\sigma^2 = 9\), so \(\sigma = 3\).

Standardise:

\[Z_1 = \frac{5 - 9}{3} = -1.333, \qquad Z_2 = \frac{11 - 9}{3} = 0.667\]

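\[P(5 < X < 11) = \Phi(0.667) - \Phi(-1.333) \approx \Phi(0.67) - \Phi(-1.33) = 0.7486 - 0.0918 = \mathbf{0.6568}\]

The table lookup rounds each \(z\) to two decimal places; without that rounding the probability is 0.6563, which is the value R reports for this example.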

x6 <- seq(9 - 4*3, 9 + 4*3, length.out = 500)
df6 <- data.frame(x = x6, y = dnorm(x6, 9, 3))

ggplot(df6, aes(x, y)) +
  geom_area(data = subset(df6, x > 5 & x < 11), aes(x, y), fill = "#3288BD", alpha = 0.5) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = c(5, 11), linetype = "dashed", color = "#D7191C") +
  geom_vline(xintercept = 9, linetype = "dotted", color = "grey40") +
  annotate("text", x = 8, y = 0.06,
           label = "P(5 < X < 11) = 0.6568",
           size = 4.5, color = "#2C7BB6", fontface = "bold") +
  annotate("text", x = 5, y = 0.135, label = "x = 5\n(Z = -1.33)",
           color = "#D7191C", size = 3.5, hjust = 0.5) +
  annotate("text", x = 11, y = 0.135, label = "x = 11\n(Z = 0.67)",
           color = "#D7191C", size = 3.5, hjust = 0.5) +
  labs(title = "Example 6: X ~ N(9, 9), P(5 < X < 11)",
       x = "x", y = "f(x)") +
  theme_minimal(base_size = 13)

Figure 19.6: Example 6: P(5 < X < 11) for X ~ N(9, 9)

mu6 <- 9; sigma6 <- 3
cat("Z1 =", (5 - mu6)/sigma6, "\n")
#> Z1 = -1.333333
cat("Z2 =", (11 - mu6)/sigma6, "\n")
#> Z2 = 0.6666667
cat("P(5 < X < 11) =", pnorm(11, mu6, sigma6) - pnorm(5, mu6, sigma6), "\n")
#> P(5 < X < 11) = 0.6562962

19.7 Example 7: \(X \sim N(10, \sigma^2)\) with \(P(X > 12) = 0.1587\)

Step 1: Find \(\sigma\)

\[P(X > 12) = 0.1587 \Rightarrow \frac{12-10}{\sigma} = 1 \Rightarrow \sigma = 2\]

Step 2: Find \(P(9 < X < 11)\)

\[P(9 < X < 11) = P\!\left(\frac{9-10}{2} < Z < \frac{11-10}{2}\right) = P(-0.5 < Z < 0.5)\]

\[= \Phi(0.5) - \Phi(-0.5) = 0.6915 - 0.3085 = \mathbf{0.3830}\]

x7 <- seq(10 - 4*2, 10 + 4*2, length.out = 500)
df7 <- data.frame(x = x7, y = dnorm(x7, 10, 2))

ggplot(df7, aes(x, y)) +
  geom_area(data = subset(df7, x > 9 & x < 11), aes(x, y), fill = "#1A9641", alpha = 0.5) +
  geom_line(linewidth = 1.2, color = "black") +
  geom_vline(xintercept = c(9, 11), linetype = "dashed", color = "#D7191C") +
  annotate("text", x = 10, y = 0.08,
           label = "P(9 < X < 11) = 0.3830",
           size = 4.5, color = "#1A9641", fontface = "bold") +
  labs(title = "Example 7: X ~ N(10, 4), P(9 < X < 11)",
       x = "x", y = "f(x)") +
  theme_minimal(base_size = 13)

Figure 19.7: Example 7: P(9 < X < 11) for X ~ N(10, 4)
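Example 7 has no numerical verification. A minimal sketch of one, using `qnorm()` in place of the table lookup; the variable names `mu7`, `sigma7` and `p7` are illustrative only.

```r
mu7 <- 10

# P(X > 12) = 0.1587  =>  (12 - mu) / sigma = qnorm(1 - 0.1587)
sigma7 <- (12 - mu7) / qnorm(1 - 0.1587)
cat("sigma =", sigma7, "\n")

# P(9 < X < 11) with the recovered sigma
p7 <- pnorm(11, mu7, sigma7) - pnorm(9, mu7, sigma7)
cat("P(9 < X < 11) =", p7, "\n")
```

Because 0.1587 is itself a rounded table value, `sigma7` comes out very close to, but not exactly, 2.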


19.8 Example 8: Job Time \(X \sim N(55, 100)\)

\(\mu = 55\) minutes, \(\sigma = 10\) minutes.

(i) \(P(X > 75)\)

\[Z = \frac{75-55}{10} = 2 \Rightarrow P(Z > 2) = 1 - 0.9772 = \mathbf{0.0228}\]

(ii) \(P(X < 60)\)

\[Z = \frac{60-55}{10} = 0.5 \Rightarrow P(Z < 0.5) = \mathbf{0.6915}\]

(iii) \(P(45 < X < 60)\)

\[Z_1 = \frac{45-55}{10} = -1, \quad Z_2 = 0.5\]

\[P(-1 < Z < 0.5) = 0.6915 - 0.1587 = \mathbf{0.5328}\]

x8 <- seq(55 - 4*10, 55 + 4*10, length.out = 500)
y8 <- dnorm(x8, 55, 10)
df8 <- data.frame(x = x8, y = y8)

par(mfrow = c(1, 3), mar = c(4, 3, 3, 1))

# (i) P(X > 75)
plot(x8, y8, type = "l", lwd = 2, col = "#2C7BB6",
     main = "(i) P(X > 75) = 0.0228", xlab = "x", ylab = "")
x_s <- x8[x8 >= 75]
polygon(c(75, x_s, max(x8)), c(0, dnorm(x_s, 55, 10), 0),
        col = "#D7191C55", border = NA)
abline(v = 75, lty = 2)

# (ii) P(X < 60)
plot(x8, y8, type = "l", lwd = 2, col = "#2C7BB6",
     main = "(ii) P(X < 60) = 0.6915", xlab = "x", ylab = "")
x_s <- x8[x8 <= 60]
polygon(c(min(x8), x_s, 60), c(0, dnorm(x_s, 55, 10), 0),
        col = "#3288BD55", border = NA)
abline(v = 60, lty = 2)

# (iii) P(45 < X < 60)
plot(x8, y8, type = "l", lwd = 2, col = "#2C7BB6",
     main = "(iii) P(45 < X < 60) = 0.5328", xlab = "x", ylab = "")
x_s <- x8[x8 >= 45 & x8 <= 60]
polygon(c(45, x_s, 60), c(0, dnorm(x_s, 55, 10), 0),
        col = "#1A964155", border = NA)
abline(v = c(45, 60), lty = 2)

Figure 19.8: Example 8: Three probability regions for X ~ N(55, 100)

cat("(i)   P(X > 75)        =", 1 - pnorm(75, 55, 10), "\n")
#> (i)   P(X > 75)        = 0.02275013
cat("(ii)  P(X < 60)        =", pnorm(60, 55, 10), "\n")
#> (ii)  P(X < 60)        = 0.6914625
cat("(iii) P(45 < X < 60)   =", pnorm(60, 55, 10) - pnorm(45, 55, 10), "\n")
#> (iii) P(45 < X < 60)   = 0.5328072

20 Normal Approximation to Other Distributions

20.1 Normal Approximation to the Binomial

If \(X \sim \text{Bin}(n, p)\) and \(n\) is large, then \(X\) is approximately \(N(\mu, \sigma^2)\) where:

\[\mu = np, \qquad \sigma^2 = npq = np(1-p)\]

Conditions for use:

  • \(p \leq 0.5\): use when \(np > 5\)
  • \(p > 0.5\): use when \(n(1-p) > 5\)

# Bin(50, 0.4): p <= 0.5 and np = 20 > 5, so the condition above is met
n_b <- 50; p_b <- 0.4
mu_b <- n_b * p_b; sigma_b <- sqrt(n_b * p_b * (1 - p_b))

x_bin <- 0:n_b
pmf_bin <- dbinom(x_bin, n_b, p_b)

x_norm <- seq(mu_b - 4*sigma_b, mu_b + 4*sigma_b, length.out = 400)
pdf_norm <- dnorm(x_norm, mu_b, sigma_b)

df_bin <- data.frame(x = x_bin, y = pmf_bin)
df_norm_approx <- data.frame(x = x_norm, y = pdf_norm)

ggplot() +
  geom_col(data = df_bin, aes(x, y), fill = "#ABDDA4", color = "grey60", width = 0.8) +
  geom_line(data = df_norm_approx, aes(x, y), color = "#D7191C", linewidth = 1.3) +
  labs(
    title = "Normal Approximation to Bin(50, 0.4)",
    subtitle = paste0("mu = np = ", mu_b, ",  sigma = sqrt(npq) = ", round(sigma_b, 3)),
    x = "x", y = "Probability / Density"
  ) +
  theme_minimal(base_size = 13)

Figure 20.1: Normal approximation to Bin(50, 0.4)
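To see how close the approximation is numerically, the sketch below compares an exact binomial probability with its normal approximation for the same Bin(50, 0.4) case. The cut-off `k = 22` is a hypothetical choice, and the continuity correction (evaluating the normal CDF at `k + 0.5`) is a standard refinement that is assumed here rather than taken from these notes.

```r
n <- 50; p <- 0.4
mu <- n * p
sigma <- sqrt(n * p * (1 - p))
k <- 22  # hypothetical cut-off

exact     <- pbinom(k, n, p)            # exact P(X <= k)
approx    <- pnorm(k, mu, sigma)        # normal approximation, no correction
approx_cc <- pnorm(k + 0.5, mu, sigma)  # normal approximation with continuity correction

cat("exact:", exact, "\n")
cat("normal:", approx, "\n")
cat("normal + continuity correction:", approx_cc, "\n")
```

For a discrete variable, the corrected version is usually noticeably closer to the exact value.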

20.2 Normal Approximation to the Poisson

If \(X \sim \text{Po}(\lambda)\) and \(\lambda\) is large, then \(X\) is approximately \(N(\lambda, \lambda)\).

lambda_p <- 20
x_poi <- 0:45
pmf_poi <- dpois(x_poi, lambda_p)

x_np <- seq(0, 45, length.out = 400)
pdf_np <- dnorm(x_np, lambda_p, sqrt(lambda_p))

ggplot() +
  geom_col(data = data.frame(x = x_poi, y = pmf_poi),
           aes(x, y), fill = "#ABD9E9", color = "grey60", width = 0.8) +
  geom_line(data = data.frame(x = x_np, y = pdf_np),
            aes(x, y), color = "#D7191C", linewidth = 1.3) +
  labs(
    title = "Normal Approximation to Po(20)",
    subtitle = "mu = lambda = 20,  sigma = sqrt(lambda) = 4.47",
    x = "x", y = "Probability / Density"
  ) +
  theme_minimal(base_size = 13)

Figure 20.2: Normal approximation to Po(20)
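A similar numerical comparison can be sketched for the Po(20) case. The cut-off `k = 25` is hypothetical, and the continuity correction is again an assumption not stated in these notes.

```r
lambda <- 20
k <- 25  # hypothetical cut-off

exact_p     <- ppois(k, lambda)                       # exact P(X <= k)
approx_p    <- pnorm(k, lambda, sqrt(lambda))         # N(lambda, lambda), no correction
approx_p_cc <- pnorm(k + 0.5, lambda, sqrt(lambda))   # with continuity correction

cat("exact:", exact_p, "\n")
cat("normal:", approx_p, "\n")
cat("normal + continuity correction:", approx_p_cc, "\n")
```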


21 Standard Normal MGF — Proof

For \(Z \sim N(0,1)\):

\[M_Z(t) = \int_{-\infty}^{\infty} e^{tz} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz\]

Combine exponents:

\[tz - \frac{z^2}{2} = -\frac{1}{2}(z^2 - 2tz) = -\frac{1}{2}[(z-t)^2 - t^2] = \frac{t^2}{2} - \frac{(z-t)^2}{2}\]

Therefore:

\[M_Z(t) = e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-(z-t)^2/2}\,dz = e^{t^2/2} \cdot 1\]

\[\boxed{M_Z(t) = e^{t^2/2}}\]

Using the MGF to find mean and variance:

\[M_Z'(t) = t\,e^{t^2/2} \Rightarrow M_Z'(0) = 0 \Rightarrow E(Z) = 0\]

\[M_Z''(t) = e^{t^2/2} + t^2 e^{t^2/2} = (1 + t^2)e^{t^2/2} \Rightarrow M_Z''(0) = 1\]

\[\text{Var}(Z) = 1 - 0^2 = 1\]
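The closed form \(M_Z(t) = e^{t^2/2}\) can be compared with direct numerical integration of \(E(e^{tZ})\). This is only a sanity check, not a proof; the helper `mgf_numeric` and the test points in `t_vals` are hypothetical.

```r
# Numerically integrate e^{tz} * phi(z) over the real line
mgf_numeric <- function(t) {
  integrate(function(z) exp(t * z) * dnorm(z), lower = -Inf, upper = Inf)$value
}

t_vals <- c(-1, 0, 0.5, 1)            # hypothetical test points
numeric_vals <- sapply(t_vals, mgf_numeric)
closed_form  <- exp(t_vals^2 / 2)

print(cbind(t = t_vals, numeric = numeric_vals, closed_form = closed_form))
```

The two columns should agree to within the tolerance of `integrate()`.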


22 Exercises

1. Let \(X \sim N(80, 9)\). Find:

  1. \(\Pr(X > 83)\) [0.1587]
  2. \(\Pr(X < 81)\) [0.6306]
  3. \(\Pr(X < 80)\) [0.5]
  4. \(\Pr(78 \leq X < 82)\) [0.4950]

2. Let \(X \sim N(14, 4)\). Find constant \(c\) such that:

  1. \(\Pr(X \leq c) = 95\%\) [17.29]
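  2. \(\Pr(X \leq c) = 5\%\) [10.71]
  3. \(\Pr(-c \leq X \leq c) = 99\%\) [19.152, taking the interval to be symmetric about the mean, i.e. \(c = 14 + 2(2.576)\); read literally as \([-c, c]\), the lower limit contributes essentially nothing and \(c = 14 + 2(2.326) \approx 18.65\)]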

3. The lifetime \(X\) of a certain automobile battery is \(N(4, 1)\) years. If the manufacturer guarantees the battery for 3 years, what percentage must be replaced? [16%]

4. Airmail envelope weights: \(X \sim N(1.950,\; 0.025^2)\) g. In 1000 envelopes, about how many weigh over 2 g? [About 23]

5. Bolt diameters follow \(X \sim N(0.279,\; 0.001^2)\) cm. Specification: \(0.280 \pm 0.002\) cm. What percentage of bolts meets specifications? [84%]

6. Bolt lengths are normally distributed. Bolts outside \([2.983,\; 3.021]\) inches are rejected. (i) In a batch of \(N = 300\) with \(\bar{x} = 3.007\) and \(s = 0.011\), find the expected numbers \(n_1\) (too short) and \(n_2\) (too long). [\(n_1 = 4.4\), \(n_2 = 30.6\)] (ii) In a batch of \(N = 600\), \(n_1 = 20\) bolts are too short and \(n_2 = 15\) are too long; estimate \(\mu\) and \(\sigma\). [\(\hat{\mu} = 3.001\), \(\hat{\sigma} = 0.01\)]

7. For \(Z \sim N(0,1)\), show that \(M_Z(t) = e^{t^2/2}\). Use the MGF to show \(E(Z) = 0\) and \(\text{Var}(Z) = 1\).

# Exercise 1 verification
cat("--- Exercise 1: X ~ N(80, 9), sigma = 3 ---\n")
#> --- Exercise 1: X ~ N(80, 9), sigma = 3 ---
cat("(a) P(X > 83) =", 1 - pnorm(83, 80, 3), "\n")
#> (a) P(X > 83) = 0.1586553
cat("(b) P(X < 81) =", pnorm(81, 80, 3), "\n")
#> (b) P(X < 81) = 0.6305587
cat("(c) P(X < 80) =", pnorm(80, 80, 3), "\n")
#> (c) P(X < 80) = 0.5
cat("(d) P(78 <= X < 82) =", pnorm(82, 80, 3) - pnorm(78, 80, 3), "\n\n")
#> (d) P(78 <= X < 82) = 0.4950149
# Exercise 2 verification
cat("--- Exercise 2: X ~ N(14, 4), sigma = 2 ---\n")
#> --- Exercise 2: X ~ N(14, 4), sigma = 2 ---
cat("(a) c for P(X <= c) = 95%:", qnorm(0.95, 14, 2), "\n")
#> (a) c for P(X <= c) = 95%: 17.28971
cat("(b) c for P(X <= c) = 5% :", qnorm(0.05, 14, 2), "\n\n")
#> (b) c for P(X <= c) = 5% : 10.71029
# Exercise 3
cat("--- Exercise 3: X ~ N(4, 1), guarantee 3 years ---\n")
#> --- Exercise 3: X ~ N(4, 1), guarantee 3 years ---
cat("P(X < 3) =", pnorm(3, 4, 1), "=>", round(pnorm(3, 4, 1)*100, 1), "%\n\n")
#> P(X < 3) = 0.1586553 => 15.9 %
# Exercise 4
cat("--- Exercise 4: X ~ N(1.950, 0.025^2) ---\n")
#> --- Exercise 4: X ~ N(1.950, 0.025^2) ---
cat("P(X > 2) =", 1 - pnorm(2, 1.950, 0.025),
    "=> approx", round((1 - pnorm(2, 1.950, 0.025)) * 1000), "in 1000\n\n")
#> P(X > 2) = 0.02275013 => approx 23 in 1000
# Exercise 5
cat("--- Exercise 5: X ~ N(0.279, 0.001^2) ---\n")
#> --- Exercise 5: X ~ N(0.279, 0.001^2) ---
cat("P(0.278 <= X <= 0.282) =",
    pnorm(0.282, 0.279, 0.001) - pnorm(0.278, 0.279, 0.001), "\n")
#> P(0.278 <= X <= 0.282) = 0.8399948
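The verification above stops at Exercise 5. A possible check for the third part of Exercise 2 and for Exercise 6 is sketched here. All variable names are illustrative, and the third part of Exercise 2 is solved under both readings of the interval (symmetric about the mean, and literally \([-c, c]\)) because the two give different values of \(c\).

```r
# Exercise 2, part 3: X ~ N(14, 4)
c_sym <- 14 + 2 * qnorm(0.995)   # upper end of the 99% interval symmetric about the mean
f_lit <- function(c) pnorm(c, 14, 2) - pnorm(-c, 14, 2) - 0.99
c_lit <- uniroot(f_lit, c(14, 30))$root   # literal interval [-c, c], solved numerically
cat("symmetric about mean: c =", c_sym, "\n")
cat("literal [-c, c]:      c =", c_lit, "\n")

# Exercise 6: rejection limits for bolt lengths
lo <- 2.983; hi <- 3.021

# (i) expected counts out of 300 with mean 3.007 and sd 0.011
n_short <- 300 * pnorm(lo, 3.007, 0.011)
n_long  <- 300 * (1 - pnorm(hi, 3.007, 0.011))
cat("(i) too short:", n_short, " too long:", n_long, "\n")

# (ii) 20 of 600 too short and 15 of 600 too long give two quantile equations
z_lo <- qnorm(20 / 600)
z_hi <- qnorm(1 - 15 / 600)
sigma_hat <- (hi - lo) / (z_hi - z_lo)
mu_hat    <- lo - z_lo * sigma_hat
cat("(ii) mu_hat =", mu_hat, " sigma_hat =", sigma_hat, "\n")
```

Small differences from the bracketed answers are expected, since those were obtained from tables with \(z\) rounded to two decimal places.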

22.1 Gamma Distribution

22.1.1 The Gamma Function

\[\Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1} e^{-y} \, dy, \quad \alpha > 0\]

Key properties:

  • \(\Gamma(1) = 1\)
  • \(\Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1)\) for \(\alpha > 1\)
  • For positive integer \(\alpha > 1\): \(\Gamma(\alpha) = (\alpha-1)!\)

22.1.2 Beta Function

\[B(\alpha, \beta) = \int_0^1 y^{\alpha-1}(1-y)^{\beta-1} \, dy = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\]
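The gamma and beta function identities above can be checked with R's built-in `gamma()`, `factorial()` and `beta()`. In this sketch the parameter values are hypothetical, and the name `beta_par` is used so that the base function `beta()` is not masked.

```r
alpha <- 3.5; beta_par <- 2   # hypothetical parameter values

# Gamma(1) = 1
g1 <- gamma(1)

# Recursion: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1)
rec_ok <- isTRUE(all.equal(gamma(alpha), (alpha - 1) * gamma(alpha - 1)))

# Integer case: Gamma(6) = 5!
fact_ok <- isTRUE(all.equal(gamma(6), factorial(5)))

# Beta function three ways: built-in, gamma ratio, numerical integral
b_builtin <- beta(alpha, beta_par)
b_ratio   <- gamma(alpha) * gamma(beta_par) / gamma(alpha + beta_par)
b_int     <- integrate(function(y) y^(alpha - 1) * (1 - y)^(beta_par - 1), 0, 1)$value

cat("Gamma(1) =", g1, " recursion holds:", rec_ok, " factorial case holds:", fact_ok, "\n")
cat("B(alpha, beta):", b_builtin, b_ratio, b_int, "\n")
```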

22.1.3 Gamma Distribution PDF

A random variable \(X\) has a gamma distribution with parameters \(\alpha\) and \(\beta\) if:

\[f(x) = \frac{1}{\Gamma(\alpha)\beta^\alpha} x^{\alpha-1} e^{-x/\beta}, \quad 0 < x < \infty\]

Uses: The gamma distribution frequently models waiting times (e.g., time until “death” in life testing).

Special case: When \(\alpha = 1\), writing \(\lambda = 1/\beta\), the gamma density reduces to the exponential density:

\[f(x) = \lambda e^{-\lambda x}, \quad x > 0\]

22.1.4 Moment Generating Function

\[M_X(t) = (1 - \beta t)^{-\alpha}, \quad t < \frac{1}{\beta}\]

22.1.5 Mean

\[M_X'(t) = \alpha\beta(1-\beta t)^{-\alpha-1}\]

\[M_X'(0) = \alpha\beta \Rightarrow \textbf{Mean} = \alpha\beta\]

22.1.6 Variance

\[M_X''(t) = \alpha\beta^2(\alpha+1)(1-\beta t)^{-\alpha-2}\]

\[M_X''(0) = \alpha\beta^2(\alpha+1) = \alpha^2\beta^2 + \alpha\beta^2\]

\[\text{Var}(X) = \alpha^2\beta^2 + \alpha\beta^2 - (\alpha\beta)^2 = \alpha\beta^2\]

\[\boxed{\text{Var}(X) = \alpha\beta^2}\]
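The mean \(\alpha\beta\) and variance \(\alpha\beta^2\) can be confirmed by integrating against the gamma density. A minimal sketch with hypothetical parameters; note that R's `dgamma()` matches the parameterisation used here when called with `shape = alpha` and `scale = beta`.

```r
alpha <- 3; beta_par <- 2   # hypothetical parameter values

# First and second moments by numerical integration
m1 <- integrate(function(x) x   * dgamma(x, shape = alpha, scale = beta_par), 0, Inf)$value
m2 <- integrate(function(x) x^2 * dgamma(x, shape = alpha, scale = beta_par), 0, Inf)$value

cat("mean:    ", m1,        " formula alpha * beta   =", alpha * beta_par,   "\n")
cat("variance:", m2 - m1^2, " formula alpha * beta^2 =", alpha * beta_par^2, "\n")
```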


22.2 Chi-Square Distribution

The chi-square distribution is a special case of the gamma distribution with \(\alpha = r/2\) and \(\beta = 2\):

\[f(x) = \frac{1}{\Gamma(r/2) \cdot 2^{r/2}} x^{r/2-1} e^{-x/2}, \quad 0 < x < \infty\]

We write \(X \sim \chi^2(r)\), where \(r\) is the degrees of freedom.

\[M_X(t) = (1-2t)^{-r/2}\]

\[\textbf{Mean} = \alpha\beta = \frac{r}{2} \cdot 2 = r, \qquad \textbf{Var}(X) = \alpha\beta^2 = \frac{r}{2} \cdot 4 = 2r\]

22.2.1 Examples

Example 1. If \(f(x) = \frac{1}{4}xe^{-x/2}\), then \(X \sim \chi^2(4)\):

\[\mu = 4, \quad \text{Var}(X) = 8, \quad M_X(t) = (1-2t)^{-2}\]

Example 2. If \(M_X(t) = (1-2t)^{-8}\), then \(X \sim \chi^2(16)\).
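The relationship between the chi-square and gamma densities, and the density in Example 1, can be checked pointwise. The evaluation points in `x` are hypothetical.

```r
x <- c(0.5, 1, 2, 5, 10)   # hypothetical evaluation points
r <- 4

# chi-square(r) is gamma with shape r/2 and scale 2
same_as_gamma <- isTRUE(all.equal(dchisq(x, df = r), dgamma(x, shape = r / 2, scale = 2)))

# Example 1: f(x) = (1/4) x e^{-x/2} is the chi-square(4) density
same_as_ex1 <- isTRUE(all.equal(dchisq(x, df = 4), x * exp(-x / 2) / 4))

cat("matches gamma(r/2, 2):", same_as_gamma, "\n")
cat("matches Example 1 density:", same_as_ex1, "\n")
```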


End of Lecture 7 Notes



23 The Beta Distribution

A random variable \(X\) is said to have a Beta density function if its probability density function is given by:

\[f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 < x < 1\]

\[= 0, \quad \text{elsewhere}\]

where \(\alpha, \beta > 0\) are the parameters of the distribution.

It can be shown that:

\[\int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\,dx = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\]

23.1 Mean and Variance

23.1.1 Mean

\[E(X) = \frac{\alpha}{\alpha + \beta}\]

23.1.2 Variance

\[\text{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\]

23.2 Proof

We use the general \(k\)-th moment formula:

\[E(X^k) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \int_0^1 x^{k+\alpha-1}(1-x)^{\beta-1}\,dx = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+k)}{\Gamma(\alpha+\beta+k)}\]

23.2.1 Deriving the Mean (\(k=1\))

\[E(X) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{\Gamma(\alpha+\beta+1)}\]

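Writing \(\Gamma(n) = (n-1)!\), which holds for integer arguments (for general \(\alpha, \beta > 0\) the same cancellation follows from \(\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)\)):

\[= \frac{(\alpha+\beta-1)!\; \alpha!}{(\alpha-1)!\;(\alpha+\beta)!}\]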

\[= \frac{(\alpha+\beta-1)!\cdot \alpha \cdot (\alpha-1)!}{(\alpha-1)!\cdot (\alpha+\beta)\cdot(\alpha+\beta-1)!}\]

\[= \frac{\alpha}{\alpha+\beta}\]

\[\boxed{E(X) = \frac{\alpha}{\alpha+\beta}}\]

23.2.2 Deriving the Variance

\[E(X^2) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+2)}{\Gamma(\alpha+\beta+2)}\]

\[= \frac{(\alpha+\beta-1)!\;(\alpha+1)!}{(\alpha-1)!\;(\alpha+\beta+1)!}\]

\[= \frac{(\alpha+\beta-1)!\cdot(\alpha+1)\cdot\alpha\cdot(\alpha-1)!}{(\alpha-1)!\cdot(\alpha+\beta+1)\cdot(\alpha+\beta)\cdot(\alpha+\beta-1)!}\]

\[= \frac{\alpha(\alpha+1)}{(\alpha+\beta+1)(\alpha+\beta)}\]

Therefore:

\[\text{Var}(X) = E(X^2) - [E(X)]^2 = \frac{\alpha(\alpha+1)}{(\alpha+\beta+1)(\alpha+\beta)} - \frac{\alpha^2}{(\alpha+\beta)^2}\]

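\[= \frac{\alpha(\alpha+1)(\alpha+\beta) - \alpha^2(\alpha+\beta+1)}{(\alpha+\beta+1)(\alpha+\beta)^2}\]

The numerator simplifies to \(\alpha\left[(\alpha+1)(\alpha+\beta) - \alpha(\alpha+\beta+1)\right] = \alpha\left[\alpha^2 + \alpha\beta + \alpha + \beta - \alpha^2 - \alpha\beta - \alpha\right] = \alpha\beta\), so: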

\[\boxed{\text{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta+1)(\alpha+\beta)^2}}\]

Note: The moment generating function of the Beta distribution has no simple closed form, so its moments are found directly from \(E(X^k)\) rather than from the MGF.

The Beta density can take on a variety of shapes, including the uniform distribution on \((0,1)\) when \(\alpha = \beta = 1\).

23.3 Example

If the annual proportion of erroneous income tax returns filed with KRA can be modelled as a Beta distribution with \(\alpha = 2\) and \(\beta = 9\), find the probability that in any given year there will be fewer than 10% erroneous returns.
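The example is stated without a worked solution. A minimal sketch of one way to compute it in R, assuming the Beta(2, 9) model given in the question; the names `beta_par`, `dens`, `p_cdf` and `p_int` are illustrative.

```r
alpha <- 2; beta_par <- 9

# P(X < 0.10) from the Beta CDF
p_cdf <- pbeta(0.10, shape1 = alpha, shape2 = beta_par)

# The same probability by integrating the density f(x) = 90 x (1 - x)^8
dens <- function(x) {
  gamma(alpha + beta_par) / (gamma(alpha) * gamma(beta_par)) *
    x^(alpha - 1) * (1 - x)^(beta_par - 1)
}
p_int <- integrate(dens, lower = 0, upper = 0.10)$value

cat("pbeta:    ", p_cdf, "\n")
cat("integrate:", p_int, "\n")
```

Both routes give approximately 0.264, which agrees with the closed form \(1 - 10(0.9)^9 + 9(0.9)^{10}\) obtained by integrating \(90x(1-x)^8\) directly.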


24 Practice Questions

24.1 Question 1: Uniform (Rectangular) Distribution — Mean and Variance

A random variable \(X\) has a rectangular distribution if:

\[f(x) = \frac{1}{b-a}, \quad a < x < b; \qquad f(x) = 0, \quad \text{elsewhere}\]

Find the mean and variance of this distribution. (5 marks)

24.1.1 Solution

Mean:

\[E(X) = \int_a^b x \cdot \frac{1}{b-a}\,dx = \frac{1}{b-a} \cdot \frac{1}{2}[x^2]_a^b = \frac{1}{2(b-a)}[b^2-a^2]\]

\[= \frac{1}{2(b-a)}(b-a)(b+a) = \frac{a+b}{2}\]

\[\boxed{E(X) = \frac{a+b}{2}}\]

Variance:

\[E(X^2) = \int_a^b x^2 \cdot \frac{1}{b-a}\,dx = \frac{1}{3(b-a)}[x^3]_a^b = \frac{b^3 - a^3}{3(b-a)}\]

\[= \frac{(b-a)(b^2+ab+a^2)}{3(b-a)} = \frac{b^2+ab+a^2}{3}\]

\[\text{Var}(X) = \frac{b^2+ab+a^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{4(b^2+ab+a^2) - 3(a^2+2ab+b^2)}{12} = \frac{b^2 - 2ab + a^2}{12}\]

\[\boxed{\text{Var}(X) = \frac{(b-a)^2}{12}}\]


24.2 Question 2: Binomial Distribution — Defective Items

It is expected that 10% of production from a continuous process will be defective. From a sample of 10 units chosen at random, find the probability that:

  1. Exactly 2 will be defective. (1 mark)
  2. At least 2 will be defective. (2 marks)

24.2.1 Solution

Let \(X\) = number of defective items. Then \(X \sim \text{Bin}(n=10,\; p=0.1,\; q=0.9)\).

\[P(X = x) = \binom{n}{x} p^x q^{n-x}\]

(a) Exactly 2 defective:

\[P(X = 2) = \binom{10}{2}(0.1)^2(0.9)^8 = \mathbf{0.1937}\]

(b) At least 2 defective:

\[P(X \geq 2) = 1 - [P(X=0) + P(X=1)] = \mathbf{0.264}\]
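Both parts can be checked with R's binomial functions. The variable names below are illustrative.

```r
n <- 10; p <- 0.1

# (a) P(X = 2)
p_exactly2 <- dbinom(2, size = n, prob = p)

# (b) P(X >= 2) = 1 - P(X <= 1)
p_atleast2 <- 1 - pbinom(1, size = n, prob = p)

cat("(a) P(X = 2)  =", p_exactly2, "\n")
cat("(b) P(X >= 2) =", p_atleast2, "\n")
```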


24.3 Question 3: Gamma Distribution — Finding Parameters

A gamma distribution has a mean of 18 and a variance of 27. Find the parameters \(\alpha\) and \(\beta\) of the distribution. (3 marks)

24.3.1 Solution

Recall that for a gamma distribution: \(\text{Mean} = \alpha\beta\) and \(\text{Var}(X) = \alpha\beta^2\).

\[\alpha\beta = 18 \quad \cdots (i)\] \[\alpha\beta^2 = 27 \quad \cdots (ii)\]

Dividing (ii) by (i):

\[\beta = \frac{27}{18} = 1.5\]

Substituting into (i):

\[\alpha = \frac{18}{\beta} = \frac{18}{1.5} = 12\]

\[\boxed{\alpha = 12, \quad \beta = 1.5}\]


24.4 Question 4: Exponential Distribution — MGF, Mean and Variance

A continuous random variable \(X\) has the p.d.f.:

\[f(x) = \lambda e^{-\lambda x}, \quad x > 0; \qquad f(x) = 0, \quad \text{elsewhere}\]

where \(\lambda\) is a constant. Find:

  1. The moment generating function. (2 marks)
  2. The mean and variance. (2 marks)

24.4.1 Solution

(a) MGF (the integral converges for \(t < \lambda\)):

\[M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx} \lambda e^{-\lambda x}\,dx = \lambda \int_0^{\infty} e^{-x(\lambda - t)}\,dx\]

\[= \frac{-\lambda}{\lambda - t}\left[e^{-x(\lambda-t)}\right]_0^{\infty} = \frac{-\lambda}{\lambda - t}[0 - 1]\]

\[\boxed{M_X(t) = \frac{\lambda}{\lambda - t}}\]

(b) Mean and Variance:

Mean:

\[M_X'(t) = \frac{d}{dt}\left[\lambda(\lambda-t)^{-1}\right] = \frac{\lambda}{(\lambda-t)^2}\]

\[M_X'(0) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}\]

\[\boxed{E(X) = \frac{1}{\lambda}}\]

Variance:

\[M_X''(t) = \frac{d}{dt}\left[\lambda(\lambda-t)^{-2}\right] = \frac{2\lambda}{(\lambda-t)^3}\]

\[M_X''(0) = \frac{2\lambda}{\lambda^3} = \frac{2}{\lambda^2}\]

\[\text{Var}(X) = M_X''(0) - [M_X'(0)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2}\]

\[\boxed{\text{Var}(X) = \frac{1}{\lambda^2}}\]


24.5 Question 5: Normal Distribution — Mean and Standard Deviation from Percentiles

An industrial process mass-produces items which are normally distributed. 11.55% of them weigh over 20 kg and 5.89% weigh under 10 kg. Calculate:

  1. The mean weight. (2 marks)
  2. The standard deviation. (2 marks)

24.5.1 Solution

(a) and (b):

From \(P(X > 20) = 0.1155\):

\[1 - \Phi\left(\frac{20-\mu}{\sigma}\right) = 0.1155 \Rightarrow \Phi\left(\frac{20-\mu}{\sigma}\right) = 0.8845\]

\[\frac{20-\mu}{\sigma} = \Phi^{-1}(0.8845) = 1.2 \Rightarrow \mu + 1.2\sigma = 20 \quad \cdots (i)\]

From \(P(X < 10) = 0.0589\):

\[\Phi\left(\frac{10-\mu}{\sigma}\right) = 0.0589 \Rightarrow \frac{10-\mu}{\sigma} = -1.56 \Rightarrow \mu - 1.56\sigma = 10 \quad \cdots (ii)\]

Solving (i) and (ii) simultaneously:

\[\boxed{\mu = 15.652, \quad \sigma = 3.623}\]


24.6 Question 6: Normal Distribution — Package Weights

Masses of packages from a machine are normally distributed with mean \(\mu = 200\) g and standard deviation \(\sigma = 2\) g. Find the probability that a randomly selected package weighs:

  1. Less than 197 g. (2 marks)
  2. Between 198.5 g and 200.5 g. (2 marks)

24.6.1 Solution

Let \(X \sim N(200, 4)\).

(a)

\[P(X < 197) = P\left(Z < \frac{197-200}{2}\right) = P(Z < -1.50) = \mathbf{0.0668}\]

(b)

\[P(198.5 < X < 200.5) = P\left(\frac{198.5-200}{2} < Z < \frac{200.5-200}{2}\right)\] \[= P(-0.75 < Z < 0.25) = \Phi(0.25) - \Phi(-0.75)\] \[= 0.5987 - 0.2266 = \mathbf{0.3721}\]
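The table-based answers can be compared with `pnorm()`. The variable names below are illustrative.

```r
mu <- 200; sigma <- 2

# (a) P(X < 197)
p_a <- pnorm(197, mu, sigma)

# (b) P(198.5 < X < 200.5)
p_b <- pnorm(200.5, mu, sigma) - pnorm(198.5, mu, sigma)

cat("(a) P(X < 197)           =", p_a, "\n")
cat("(b) P(198.5 < X < 200.5) =", p_b, "\n")
```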


24.7 Question 7: Standard Normal Distribution — MGF, Mean and Variance

A random variable \(X\) has a Standard Normal distribution. Find:

  1. The moment generating function. (3 marks)
  2. The mean and variance. (2 marks)

24.7.1 Solution

The standard normal distribution has mean 0 and variance 1:

\[f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad -\infty < x < \infty\]

(a) MGF:

\[M_X(t) = \int_{-\infty}^{\infty} e^{tx} \cdot \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx\]

Complete the square in the exponent:

\[tx - \frac{x^2}{2} = -\frac{1}{2}(x-t)^2 + \frac{t^2}{2}\]

\[M_X(t) = e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{(x-t)^2}{2}}\,dx = e^{t^2/2}\]

\[\boxed{M_X(t) = e^{t^2/2}}\]

(b) Mean and Variance:

\[M_X'(t) = t\, e^{t^2/2} \Rightarrow M_X'(0) = 0 \Rightarrow \boxed{E(X) = 0}\]

\[M_X''(t) = e^{t^2/2} + t^2 e^{t^2/2} \Rightarrow M_X''(0) = 1\]

\[\text{Var}(X) = M_X''(0) - [M_X'(0)]^2 = 1 - 0 = \boxed{1}\]


24.8 Question 8: Beta Distribution — Mean

A random variable \(X\) has a Beta density function:

\[f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 < x < 1; \quad \alpha,\beta > 0\]

Find the mean of the distribution. (2 marks)

24.8.1 Solution

Using:

\[E(X^k) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \int_0^1 x^{k+\alpha-1}(1-x)^{\beta-1}\,dx = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+k)}{\Gamma(\alpha+\beta+k)}\]

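Setting \(k = 1\) and using \(\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)\) and \(\Gamma(\alpha+\beta+1) = (\alpha+\beta)\,\Gamma(\alpha+\beta)\):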

\[\boxed{E(X) = \frac{\alpha}{\alpha+\beta}}\]


24.9 Question 9: Gamma-type Distribution — MGF

A continuous random variable \(V\) has p.d.f.:

\[f(v) = ve^{-v}, \quad v > 0; \qquad f(v) = 0, \quad \text{elsewhere}\]

Find the moment generating function for \(V\). (2 marks)

24.9.1 Solution

\[M_V(t) = E(e^{tV}) = \int_0^{\infty} e^{tv} \cdot v e^{-v}\,dv = \int_0^{\infty} v\, e^{-v(1-t)}\,dv\]

This is a gamma-type integral, which converges for \(t < 1\). With \(\lambda = 1-t\):

\[\int_0^{\infty} v\,e^{-\lambda v}\,dv = \frac{1}{\lambda^2}\]

\[\boxed{M_V(t) = \frac{1}{(1-t)^2}}\]
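The density \(ve^{-v}\) is a gamma density with shape 2 and scale 1, so the result can be checked numerically. A minimal sketch; the helper `mgf_v` and the test points are hypothetical, and all of them satisfy \(t < 1\).

```r
# f(v) = v e^{-v} coincides with dgamma(v, shape = 2, scale = 1)
v <- c(0.5, 1, 3)
same_density <- isTRUE(all.equal(v * exp(-v), dgamma(v, shape = 2, scale = 1)))

# Numerical MGF: integrate e^{tv} * v e^{-v} over (0, Inf), valid for t < 1
mgf_v <- function(t) {
  integrate(function(v) exp(t * v) * v * exp(-v), lower = 0, upper = Inf)$value
}

t_pts <- c(-0.5, 0, 0.3)                 # hypothetical test points
mgf_num    <- sapply(t_pts, mgf_v)
mgf_closed <- 1 / (1 - t_pts)^2

cat("density matches gamma(2, 1):", same_density, "\n")
print(cbind(t = t_pts, numeric = mgf_num, closed_form = mgf_closed))
```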


25 Topic Seven: Bivariate Distributions

25.1 Objectives

By the end of this topic, you should be able to:

  • Formally define a joint probability mass function of two discrete random variables.
  • Use a joint probability mass function to find the probability of a specific event.
  • Find a marginal probability mass function of a discrete random variable \(X\) from the joint PMF of \(X\) and \(Y\).
  • Formally define a joint probability density function of two continuous random variables.
  • Use a joint probability density function to find the probability of a specific event.
  • Find a marginal probability density function of a continuous random variable \(X\) from the joint PDF of \(X\) and \(Y\).

25.2 Two Discrete Random Variables

25.2.1 Introduction

When two discrete random variables \(X\) and \(Y\) are considered together, we define their joint probability mass function \(f(x,y) = P(X=x, Y=y)\). A joint PMF satisfies \(f(x,y) \geq 0\) for all \((x,y)\) and \(\sum_x \sum_y f(x,y) = 1\).

Example. Suppose we toss a pair of fair, four-sided dice — one RED and one BLACK. Let:

  • \(X\) = outcome on the RED die \(\in \{1,2,3,4\}\)
  • \(Y\) = outcome on the BLACK die \(\in \{1,2,3,4\}\)

The joint sample space has \(4 \times 4 = 16\) equally likely outcomes:

\[S = \{(1,1),(1,2),\ldots,(4,4)\}\]

Since the dice are fair:

\[P(X = x,\; Y = y) = \frac{1}{16} \quad \text{for all } (x,y) \in S\]
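
The dice example can be reproduced with exact fractions. The sketch below builds the joint PMF as a dictionary and sums it over an event to get a probability; the two events shown (equal faces, faces summing to 5) and the helper name `prob` are illustrative choices, not part of the original example.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 5)  # a four-sided die shows 1, 2, 3 or 4

# Joint PMF: each of the 16 (red, black) outcomes has probability 1/16.
joint = {(x, y): Fraction(1, 16) for x, y in product(faces, faces)}

def prob(event):
    """P[(X, Y) satisfies event]: sum the joint PMF over matching outcomes."""
    return sum(p for (x, y), p in joint.items() if event(x, y))

total = sum(joint.values())                 # should be 1 for a valid PMF
p_equal = prob(lambda x, y: x == y)         # both dice show the same face
p_sum_5 = prob(lambda x, y: x + y == 5)     # the faces add up to 5

# Marginal PMF of X: sum the joint PMF over all values of Y.
marginal_x = {x: sum(joint[(x, y)] for y in faces) for x in faces}

print(total, p_equal, p_sum_5)
print(marginal_x)
```

Each event contains 4 of the 16 outcomes, so both probabilities come out as \(1/4\), and each marginal probability is \(1/4\) as well.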


25.2.2 Example 2: Joint Distribution of Discrete \(X\) and \(Y\)

Consider the following joint probability distribution:

            \(Y=1\)    \(Y=2\)    \(Y=3\)
\(X=1\)     \(1/12\)   \(1/6\)    \(1/12\)
\(X=2\)     \(1/8\)    \(1/4\)    \(1/24\)
\(X=3\)     \(1/12\)   \(1/24\)   \(1/12\)

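Check that the probabilities sum to 1:

\[\frac{1}{12}+\frac{1}{6}+\frac{1}{12}+\frac{1}{8}+\frac{1}{4}+\frac{1}{24}+\frac{1}{12}+\frac{1}{24}+\frac{1}{12} = \frac{2+4+2+3+6+1+2+1+2}{24} = \frac{23}{24}\]

The entries as printed sum to \(\frac{23}{24}\), not 1, so the table falls \(\frac{1}{24}\) short of a valid joint PMF (one entry appears to be misprinted). The solution below uses the entries exactly as given.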

Find: (a) the marginal distribution of \(X\); (b) the marginal distribution of \(Y\); (c) whether \(X\) and \(Y\) are independent; (d) \(\text{Cov}(X,Y)\); (e) the correlation coefficient.

25.2.2.1 Solution

(a) Marginal Distribution of \(X\) (sum across rows):

\[P(X=1) = \frac{1}{12}+\frac{1}{6}+\frac{1}{12} = \frac{1+2+1}{12} = \frac{4}{12} = \frac{1}{3}\]

\[P(X=2) = \frac{1}{8}+\frac{1}{4}+\frac{1}{24} = \frac{3+6+1}{24} = \frac{10}{24} = \frac{5}{12}\]

\[P(X=3) = \frac{1}{12}+\frac{1}{24}+\frac{1}{12} = \frac{2+1+2}{24} = \frac{5}{24}\]

(b) Marginal Distribution of \(Y\) (sum down columns):

\[P(Y=1) = \frac{1}{12}+\frac{1}{8}+\frac{1}{12} = \frac{2+3+2}{24} = \frac{7}{24}\]

\[P(Y=2) = \frac{1}{6}+\frac{1}{4}+\frac{1}{24} = \frac{4+6+1}{24} = \frac{11}{24}\]

\[P(Y=3) = \frac{1}{12}+\frac{1}{24}+\frac{1}{12} = \frac{2+1+2}{24} = \frac{5}{24}\]

(c) Independence of \(X\) and \(Y\):

Check whether \(P(X=x, Y=y) = P(X=x) \cdot P(Y=y)\) for all \((x,y)\).

At \(X=1\), \(Y=1\):

\[P(X=1,Y=1) = \frac{1}{12} = \frac{6}{72}\]

\[P(X=1)\cdot P(Y=1) = \frac{1}{3} \times \frac{7}{24} = \frac{7}{72}\]

Since \(\frac{6}{72} \neq \frac{7}{72}\), the product rule fails for this pair, and one failure is enough: \(X\) and \(Y\) are not independent.
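
Parts (a) to (c) can be reproduced with exact fractions. The sketch below stores the table entries exactly as printed, sums rows and columns to get the marginals, and tests the product rule in every cell. It also adds up all nine entries, which is a useful validity check; for the entries as printed that total is \(23/24\). The variable names are illustrative.

```python
from fractions import Fraction as F

# Joint table exactly as printed: rows are X = 1, 2, 3; columns are Y = 1, 2, 3.
table = [
    [F(1, 12), F(1, 6), F(1, 12)],
    [F(1, 8), F(1, 4), F(1, 24)],
    [F(1, 12), F(1, 24), F(1, 12)],
]

marginal_x = [sum(row) for row in table]          # sum across each row
marginal_y = [sum(col) for col in zip(*table)]    # sum down each column
total = sum(marginal_x)                           # total mass of the table

# Independence requires P(x, y) = P(x) * P(y) in every cell.
independent = all(
    table[i][j] == marginal_x[i] * marginal_y[j]
    for i in range(3)
    for j in range(3)
)

print("P(X=x):", marginal_x)
print("P(Y=y):", marginal_y)
print("total:", total)
print("independent:", independent)
```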

(d) Covariance:

\[\text{Cov}(X,Y) = E(XY) - E(X)E(Y)\]

\[E(X) = 1\left(\frac{1}{3}\right) + 2\left(\frac{5}{12}\right) + 3\left(\frac{5}{24}\right) = \frac{8+20+15}{24} = \frac{43}{24}\]

\[E(Y) = 1\left(\frac{7}{24}\right) + 2\left(\frac{11}{24}\right) + 3\left(\frac{5}{24}\right) = \frac{7+22+15}{24} = \frac{44}{24} = \frac{11}{6}\]

\[E(XY) = \sum\sum xy\, P(x,y)\]

\[\begin{aligned} &= (1)(1)\frac{1}{12} + (1)(2)\frac{1}{6} + (1)(3)\frac{1}{12} \\ &\quad + (2)(1)\frac{1}{8} + (2)(2)\frac{1}{4} + (2)(3)\frac{1}{24} \\ &\quad + (3)(1)\frac{1}{12} + (3)(2)\frac{1}{24} + (3)(3)\frac{1}{12} \end{aligned}\]

\[= \frac{1}{12}+\frac{1}{3}+\frac{1}{4}+\frac{1}{4}+1+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}+\frac{3}{4} = \frac{41}{12}\]

\[\text{Cov}(X,Y) = \frac{41}{12} - \frac{43}{24} \times \frac{11}{6} = \frac{41}{12} - \frac{473}{144} = \frac{492-473}{144}\]

\[\boxed{\text{Cov}(X,Y) = \frac{19}{144} \approx 0.132}\]

(e) Correlation Coefficient:

\[\rho_{XY} = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}\]

Variance of \(X\):

\[E(X^2) = 1^2\left(\frac{1}{3}\right) + 2^2\left(\frac{5}{12}\right) + 3^2\left(\frac{5}{24}\right) = \frac{1}{3}+\frac{20}{12}+\frac{45}{24} = \frac{93}{24} = \frac{31}{8}\]

\[\text{Var}(X) = \frac{31}{8} - \left(\frac{43}{24}\right)^2 = \frac{383}{576}\]

Variance of \(Y\):

\[E(Y^2) = 1^2\left(\frac{7}{24}\right) + 2^2\left(\frac{11}{24}\right) + 3^2\left(\frac{5}{24}\right) = \frac{7+44+45}{24} = 4\]

\[\text{Var}(Y) = 4 - \left(\frac{11}{6}\right)^2 = \frac{23}{36}\]

\[\rho_{XY} = \frac{19/144}{\sqrt{383/576}\cdot\sqrt{23/36}} = \frac{19/144}{\sqrt{383 \cdot 23}/144} = \frac{19}{\sqrt{8809}}\]

\[\boxed{\rho_{XY} \approx 0.202}\]
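
The arithmetic in parts (d) and (e) is easy to get wrong by hand, so here is a sketch that recomputes it with exact fractions from the table entries exactly as printed. The helper `expect` and the variable names are illustrative assumptions; only the final correlation is converted to floating point.

```python
import math
from fractions import Fraction as F

rows = [
    [F(1, 12), F(1, 6), F(1, 12)],   # X = 1
    [F(1, 8), F(1, 4), F(1, 24)],    # X = 2
    [F(1, 12), F(1, 24), F(1, 12)],  # X = 3
]
joint = {(x, y): rows[x - 1][y - 1] for x in (1, 2, 3) for y in (1, 2, 3)}

def expect(g):
    """Sum of g(x, y) * P(x, y) over all cells of the table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
e_xy = expect(lambda x, y: x * y)
cov = e_xy - e_x * e_y

var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2
rho = float(cov) / math.sqrt(float(var_x) * float(var_y))

print("E(X), E(Y), E(XY):", e_x, e_y, e_xy)
print("Cov:", cov, " Var(X):", var_x, " Var(Y):", var_y)
print("rho:", round(rho, 4))
```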


25.3 Two Continuous Random Variables

25.3.1 Joint Probability Density Function

Let \(X\) and \(Y\) be two continuous random variables with two-dimensional sample space \(S\). The function \(f(x,y)\) is a joint probability density function if it satisfies:

  1. \(f(x,y) \geq 0\) for all \((x,y)\)
  2. \(\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1\)
  3. For any region \(A\) in the sample space: \(P[(X,Y) \in A] = \displaystyle\iint_A f(x,y)\,dx\,dy\)

Key definitions:

  • Marginal PDF of \(X\): \(\displaystyle f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy\)
  • Marginal PDF of \(Y\): \(\displaystyle f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx\)
  • Independence: \(X\) and \(Y\) are independent if and only if \(f(x,y) = f_X(x)\,f_Y(y)\) for all \((x,y)\)
  • Covariance: \(\text{Cov}(X,Y) = E(XY) - E(X)E(Y)\)
  • Correlation: \(\rho_{XY} = \dfrac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}\)

25.3.2 Example 1: \(f(x,y) = 4xy\)

\[f(x,y) = \begin{cases} 4xy, & 0 < x < 1,\; 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}\]

25.3.2.1 Validity

Since \(4xy \geq 0\) on \((0,1)\times(0,1)\), condition 1 holds. Check condition 2:

\[\int_0^1\int_0^1 4xy\,dx\,dy = \int_0^1 4y\left[\frac{x^2}{2}\right]_0^1 dy = \int_0^1 2y\,dy = [y^2]_0^1 = 1 \checkmark\]

25.3.2.2 (a) Marginal PDF of \(X\)

\[f_X(x) = \int_0^1 4xy\,dy = 4x\left[\frac{y^2}{2}\right]_0^1 = 2x, \quad 0 < x < 1\]

25.3.2.3 (b) Marginal PDF of \(Y\)

\[f_Y(y) = \int_0^1 4xy\,dx = 4y\left[\frac{x^2}{2}\right]_0^1 = 2y, \quad 0 < y < 1\]

25.3.2.4 (c) Independence

\[f_X(x)\,f_Y(y) = (2x)(2y) = 4xy = f(x,y)\]

Since this holds for every \((x,y)\) in the unit square (and both sides are zero elsewhere), \(X\) and \(Y\) are independent.

25.3.2.5 (d) Covariance

\[E(X) = \int_0^1 x(2x)\,dx = 2\left[\frac{x^3}{3}\right]_0^1 = \frac{2}{3}\]

\[E(Y) = \frac{2}{3} \quad \text{(by symmetry)}\]

\[E(XY) = \int_0^1\int_0^1 xy(4xy)\,dx\,dy = 4\left(\int_0^1 x^2\,dx\right)\!\left(\int_0^1 y^2\,dy\right) = 4 \cdot \frac{1}{3} \cdot \frac{1}{3} = \frac{4}{9}\]

\[\text{Cov}(X,Y) = \frac{4}{9} - \frac{2}{3}\cdot\frac{2}{3} = \frac{4}{9} - \frac{4}{9} = \boxed{0}\]

25.3.2.6 (e) Correlation Coefficient

Since \(\text{Cov}(X,Y) = 0\):

\[\boxed{\rho_{XY} = 0}\]

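The random variables are uncorrelated, as expected from their independence. The converse does not hold in general: uncorrelated variables need not be independent.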
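
The results of this example can be checked numerically. The sketch below approximates double integrals over the unit square with a midpoint grid; the helper `expect`, the grid size of 200 points per axis, and the test point \(x = 0.3\) are illustrative assumptions.

```python
def f(x, y):
    """Joint density 4xy on the open unit square."""
    return 4 * x * y

def expect(g, n=200):
    """Midpoint-rule approximation of the double integral of g(x, y) * f(x, y)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += g(x, y) * f(x, y)
    return total * h * h

def marginal_x(x, n=200):
    """Midpoint-rule approximation of the integral of f(x, y) over y in (0, 1)."""
    h = 1.0 / n
    return h * sum(f(x, (j + 0.5) * h) for j in range(n))

total = expect(lambda x, y: 1.0)      # should be close to 1
e_x = expect(lambda x, y: x)          # should be close to 2/3
e_y = expect(lambda x, y: y)          # should be close to 2/3
e_xy = expect(lambda x, y: x * y)     # should be close to 4/9
cov = e_xy - e_x * e_y                # should be close to 0

print(total, e_x, e_y, e_xy, cov)
print(marginal_x(0.3), 2 * 0.3)       # numerical marginal vs. 2x
```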


25.3.3 Example 2: \(f(x,y) = kx + y\)

\[f(x,y) = \begin{cases} kx + y, & 0 < x < 1,\; 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}\]

25.3.3.1 Finding \(k\)

For a valid PDF:

\[\int_0^1\int_0^1 (kx+y)\,dy\,dx = 1\]

\[\int_0^1 \left[kxy + \frac{y^2}{2}\right]_0^1 dx = \int_0^1 \left(kx + \frac{1}{2}\right)dx = \left[\frac{kx^2}{2} + \frac{x}{2}\right]_0^1 = \frac{k}{2} + \frac{1}{2} = 1\]

\[\frac{k}{2} = \frac{1}{2} \Rightarrow \boxed{k = 1}\]

Hence \(f(x,y) = x + y,\quad 0 < x < 1,\; 0 < y < 1\). This is nonnegative on the unit square, so both conditions for a PDF hold.

25.3.3.2 (a) Marginal PDF of \(X\)

\[f_X(x) = \int_0^1 (x+y)\,dy = \left[xy + \frac{y^2}{2}\right]_0^1 = x + \frac{1}{2}, \quad 0 < x < 1\]

25.3.3.3 (b) Marginal PDF of \(Y\)

\[f_Y(y) = \int_0^1 (x+y)\,dx = \left[\frac{x^2}{2}+yx\right]_0^1 = \frac{1}{2} + y, \quad 0 < y < 1\]

25.3.3.4 (c) Independence

\[f_X(x)\,f_Y(y) = \left(x+\frac{1}{2}\right)\!\left(y+\frac{1}{2}\right) = xy + \frac{x}{2}+\frac{y}{2}+\frac{1}{4}\]

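The two expressions are not identical: at \(x = y = \frac{1}{4}\), for example, \(f(x,y) = \frac{1}{2}\) while \(f_X(x)f_Y(y) = \frac{9}{16}\). Since \(f(x,y) = x+y \neq f_X(x)f_Y(y)\), \(X\) and \(Y\) are not independent.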

25.3.3.5 (d) Covariance

\[E(X) = \int_0^1 x\left(x+\frac{1}{2}\right)dx = \int_0^1\left(x^2 + \frac{x}{2}\right)dx = \left[\frac{x^3}{3}+\frac{x^2}{4}\right]_0^1 = \frac{1}{3}+\frac{1}{4} = \frac{7}{12}\]

\[E(Y) = \frac{7}{12} \quad \text{(by symmetry)}\]

\[E(XY) = \int_0^1\int_0^1 xy(x+y)\,dy\,dx = \int_0^1\int_0^1 (x^2y + xy^2)\,dy\,dx\]

\[= \int_0^1 \left[\frac{x^2 y^2}{2}+\frac{xy^3}{3}\right]_0^1 dx = \int_0^1\left(\frac{x^2}{2}+\frac{x}{3}\right)dx = \left[\frac{x^3}{6}+\frac{x^2}{6}\right]_0^1 = \frac{1}{6}+\frac{1}{6} = \frac{1}{3}\]

\[\text{Cov}(X,Y) = \frac{1}{3} - \frac{7}{12}\cdot\frac{7}{12} = \frac{1}{3} - \frac{49}{144} = \frac{48-49}{144}\]

\[\boxed{\text{Cov}(X,Y) = -\frac{1}{144}}\]

25.3.3.6 (e) Correlation Coefficient

\[E(X^2) = \int_0^1 x^2\!\left(x+\frac{1}{2}\right)dx = \int_0^1\left(x^3+\frac{x^2}{2}\right)dx = \frac{1}{4}+\frac{1}{6} = \frac{5}{12}\]

\[\text{Var}(X) = \frac{5}{12} - \left(\frac{7}{12}\right)^2 = \frac{60-49}{144} = \frac{11}{144}\]

\[\sigma_X = \sigma_Y = \frac{\sqrt{11}}{12}\]

\[\rho_{XY} = \frac{-1/144}{({\sqrt{11}}/{12})({\sqrt{11}}/{12})} = \frac{-1/144}{11/144} = -\frac{1}{11}\]

\[\boxed{\rho_{XY} = -\frac{1}{11} \approx -0.091}\]

\(X\) and \(Y\) have a very weak negative correlation.
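
Because the density here is a polynomial on the unit square, every moment can be computed exactly term by term using \(\int_0^1 x^a\,dx = \frac{1}{a+1}\). The sketch below does this with fractions; the dictionary representation of the density and the helper name `moment` are assumptions made for illustration.

```python
from fractions import Fraction as F

# f(x, y) = x + y on the unit square, stored as
# {(power of x, power of y): coefficient}.
density = {(1, 0): F(1), (0, 1): F(1)}

def moment(m, n):
    """E[X^m Y^n]: integrate x^m y^n f(x, y) term by term over the unit square."""
    return sum(c / ((a + m + 1) * (b + n + 1)) for (a, b), c in density.items())

total = moment(0, 0)                  # total probability
e_x, e_y, e_xy = moment(1, 0), moment(0, 1), moment(1, 1)
cov = e_xy - e_x * e_y
var_x = moment(2, 0) - e_x ** 2
var_y = moment(0, 2) - e_y ** 2

# Here var_x == var_y, so sigma_x * sigma_y equals var_x and rho stays exact.
assert var_x == var_y
rho = cov / var_x

print(total, e_x, e_xy, cov, var_x, rho)
```

The exact values reproduce the hand calculation: \(E(X) = 7/12\), \(E(XY) = 1/3\), \(\text{Cov}(X,Y) = -1/144\), \(\text{Var}(X) = 11/144\) and \(\rho_{XY} = -1/11\).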


25.3.4 Example 3: \(f(x,y) = cx + 1\) over a Triangular Region

\[f_{X,Y}(x,y) = \begin{cases} cx + 1, & x \geq 0,\; y \geq 0,\; x+y < 1 \\ 0, & \text{otherwise} \end{cases}\]

25.3.4.1 Finding \(c\)

The region of integration is \(R = \{(x,y) : 0 \leq x \leq 1,\; 0 \leq y \leq 1-x\}\).

\[\int_0^1\int_0^{1-x}(cx+1)\,dy\,dx = 1\]

\[\int_0^1 (cx+1)(1-x)\,dx = \int_0^1 (cx - cx^2 + 1 - x)\,dx = 1\]

\[\left[\frac{cx^2}{2} - \frac{cx^3}{3} + x - \frac{x^2}{2}\right]_0^1 = \frac{c}{2} - \frac{c}{3} + 1 - \frac{1}{2} = \frac{c}{6} + \frac{1}{2} = 1\]

\[\frac{c}{6} = \frac{1}{2} \Rightarrow \boxed{c = 3}\]

Hence \(f_{X,Y}(x,y) = 3x + 1\) on the region \(x \geq 0,\; y \geq 0,\; x+y < 1\).
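
The value of \(c\) can be recovered exactly, as a sketch. After the inner integral over \(y\), the total mass is \(\int_0^1 (cx+1)(1-x)\,dx\), which is linear in \(c\), so two evaluations determine the solution of "mass = 1". The helper names `integrate_poly` and `total_mass` are illustrative assumptions.

```python
from fractions import Fraction as F

def integrate_poly(coeffs):
    """Integrate sum(coeffs[k] * x**k) over [0, 1] exactly."""
    return sum(F(c) / (k + 1) for k, c in enumerate(coeffs))

def total_mass(c):
    """Integral of (c*x + 1)(1 - x) = 1 + (c - 1) x - c x^2 over [0, 1]."""
    return integrate_poly([1, c - 1, -c])

base = total_mass(0)              # mass contributed by the constant term
slope = total_mass(1) - base      # extra mass per unit of c
c = (1 - base) / slope            # solve base + slope * c = 1

print(base, slope, c, total_mass(c))
```

Here `base` is \(1/2\) and `slope` is \(1/6\), matching the equation \(\frac{c}{6} + \frac{1}{2} = 1\) in the derivation.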

25.3.4.2 (a) Marginal PDF of \(X\)

\[f_X(x) = \int_0^{1-x}(3x+1)\,dy = (3x+1)(1-x) = 3x + 1 - 3x^2 - x\]

\[\boxed{f_X(x) = 1 + 2x - 3x^2, \quad 0 \leq x \leq 1}\]

25.3.4.3 (b) Marginal PDF of \(Y\)

For fixed \(y\), \(x\) ranges from \(0\) to \(1-y\):

\[f_Y(y) = \int_0^{1-y}(3x+1)\,dx = \left[\frac{3x^2}{2}+x\right]_0^{1-y} = \frac{3}{2}(1-y)^2 + (1-y)\]

\[= \frac{3}{2}(1-2y+y^2) + 1 - y = \frac{3}{2} - 3y + \frac{3}{2}y^2 + 1 - y\]

\[\boxed{f_Y(y) = \frac{5}{2} - 4y + \frac{3}{2}y^2, \quad 0 \leq y \leq 1}\]
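
Both marginals can be checked numerically against the joint density on the triangle. The sketch below integrates out one variable with a midpoint rule and compares the result with the boxed formulas, then confirms that each marginal integrates to 1. The function names, the grid size, and the test points \(x = 0.25\), \(y = 0.4\) are illustrative assumptions.

```python
def joint(x, y):
    """Joint density 3x + 1 on the triangle x >= 0, y >= 0, x + y < 1."""
    return 3 * x + 1 if (x >= 0 and y >= 0 and x + y < 1) else 0.0

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def marginal_x_numeric(x):
    # For fixed x, y runs from 0 to 1 - x.
    return midpoint(lambda y: joint(x, y), 0.0, 1.0 - x)

def marginal_y_numeric(y):
    # For fixed y, x runs from 0 to 1 - y.
    return midpoint(lambda x: joint(x, y), 0.0, 1.0 - y)

def f_x(x):
    return 1 + 2 * x - 3 * x ** 2

def f_y(y):
    return 2.5 - 4 * y + 1.5 * y ** 2

print(marginal_x_numeric(0.25), f_x(0.25))
print(marginal_y_numeric(0.4), f_y(0.4))
print(midpoint(f_x, 0.0, 1.0), midpoint(f_y, 0.0, 1.0))  # both close to 1
```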