2025-03-14

What Is a Discrete Random Variable in Probability?

  • A discrete random variable takes on a countable number of distinct values; the probabilities of those values define its probability mass function.
  • Key Characteristics of Discrete Random Variables:
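    • Countable Outcomes: The possible values can be listed one by one, e.g. 0, 1, 2, 3 for the number of heads in three coin tosses (outcomes such as HHH, HHT, etc.).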
    • Probability Distribution: Described by a probability mass function, which assigns a probability to each possible outcome.
    • Sum of Probabilities: The total sum of the probabilities must be 1.

Probability Mass Function (PMF)

A PMF assigns a probability to each discrete outcome:

\[ P(X = k) = f(k) \]

Example: A fair die roll:

\[ P(X = k) = \frac{1}{6}, \quad k = 1, 2, 3, 4, 5, 6 \]
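
As a small illustration, the die PMF can be stored as a named vector in R and checked against the two requirements above (each probability in [0, 1], total equal to 1). This is only a sketch; the names `k`, `pmf`, and `p_three` are chosen here for illustration.

```r
# PMF of a fair six-sided die, stored as a named numeric vector.
k <- 1:6
pmf <- setNames(rep(1 / 6, length(k)), k)

# A valid PMF has probabilities in [0, 1] that sum to 1
# (all.equal() is used to tolerate floating-point rounding).
stopifnot(all(pmf >= 0 & pmf <= 1))
stopifnot(isTRUE(all.equal(sum(pmf), 1)))

# P(X = 3) is looked up by the name of the outcome.
p_three <- pmf[["3"]]
print(p_three)
```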

Cumulative Distribution Function (CDF)

The cumulative distribution function (CDF) gives the probability that a discrete random variable \(X\) takes a value less than or equal to \(x\):

\[ F(x) = P(X \leq x) = \sum_{k \leq x} P(X = k) \]

Properties of a CDF:

  1. \(F(x)\) is non-decreasing.
  2. Limits: \[ \lim_{x \to -\infty} F(x) = 0, \quad \lim_{x \to \infty} F(x) = 1 \]
  3. Step Function for Discrete Variables:
    • Since \(X\) is discrete, \(F(x)\) is a step function rather than a continuous curve: it is constant between possible values and jumps by \(P(X = k)\) at each value \(k\).

Example: CDF of a Fair Die Roll

For a fair 6-sided die, the PMF is:

\[ P(X = k) = \frac{1}{6}, \quad k = 1, 2, 3, 4, 5, 6 \]

The corresponding CDF is:

\[ F(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{6}, & 1 \leq x < 2 \\ \frac{2}{6}, & 2 \leq x < 3 \\ \frac{3}{6}, & 3 \leq x < 4 \\ \frac{4}{6}, & 4 \leq x < 5 \\ \frac{5}{6}, & 5 \leq x < 6 \\ 1, & x \geq 6 \end{cases} \]
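
The piecewise CDF above is just the running sum of the PMF. A minimal sketch in base R, assuming the hypothetical names `cdf_values` and `F_die`, builds it with `cumsum()` and wraps it in `stepfun()` so that it can be evaluated at any real \(x\):

```r
# PMF of a fair die and its running sum.
k <- 1:6
pmf <- rep(1 / 6, 6)
cdf_values <- cumsum(pmf)

# stepfun(x, y) needs length(y) == length(x) + 1; the first y value is
# the height to the left of the first jump. With the default
# right = FALSE the function is right-continuous, so F(k) already
# includes the jump at k, matching F(x) = P(X <= x).
F_die <- stepfun(k, c(0, cdf_values))

# Evaluate below the support, at a jump, between jumps, and above.
print(F_die(c(0.5, 1, 2.7, 6, 10)))
```

Calling `plot(F_die)` would draw the staircase shape described in the properties list.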

Example Problem

  • Suppose a chess knight K is placed on a regular 8x8 board by selecting one of the squares uniformly at random. Let X be the number of squares attacked by K. Find the probability distribution table, the probability mass function, and the cumulative distribution function of X.
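
One way to obtain the distribution is brute-force enumeration: visit all 64 squares, count the knight moves that stay on the board, and tabulate the counts. The sketch below is not necessarily how the table in this post was produced; `offsets`, `attacked`, `squares`, and `df_knight` are names introduced here for illustration.

```r
# The eight (row, column) offsets of a knight move.
offsets <- rbind(c(1, 2), c(2, 1), c(2, -1), c(1, -2),
                 c(-1, -2), c(-2, -1), c(-2, 1), c(-1, 2))

# Number of on-board squares attacked from square (row_i, col_i).
attacked <- function(row_i, col_i, n = 8) {
  rr <- row_i + offsets[, 1]
  cc <- col_i + offsets[, 2]
  sum(rr >= 1 & rr <= n & cc >= 1 & cc <= n)
}

# Evaluate X on every square of the board.
squares <- expand.grid(row_i = 1:8, col_i = 1:8)
x <- mapply(attacked, squares$row_i, squares$col_i)

# Count the squares for each K in 1..8 (zero counts included),
# then divide by 64 because every square is equally likely.
counts <- table(factor(x, levels = 1:8))
df_knight <- data.frame(K_i = 1:8, P_X_equals_K = as.numeric(counts) / 64)
print(df_knight)
```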

Probability Distribution Table

K P(X = K)
1 0.0000
2 0.0625
3 0.1250
4 0.3125
5 0.0000
6 0.2500
7 0.0000
8 0.2500
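
These values come from counting squares: the 4 corner squares attack 2 squares each, 8 squares attack 3, 20 squares attack 4, 16 squares attack 6, and the 16 central squares attack 8. Dividing each count by 64 gives the PMF in closed form:

\[ P(X = k) = \begin{cases} \frac{4}{64}, & k = 2 \\ \frac{8}{64}, & k = 3 \\ \frac{20}{64}, & k = 4 \\ \frac{16}{64}, & k = 6 \\ \frac{16}{64}, & k = 8 \\ 0, & \text{otherwise} \end{cases} \]

The probabilities sum to \(\frac{4 + 8 + 20 + 16 + 16}{64} = 1\), as required.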

Probability Mass Function (Plotly)

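library(plotly)

# Distribution table from above as a data frame (counts out of 64 squares)
df_PDF <- data.frame(K_i = 1:8,
                     P_X_equals_K = c(0, 4, 8, 20, 0, 16, 0, 16) / 64)

# Bar chart of the PMF: one bar per value K
plot_ly(
  x = df_PDF$K_i,
  y = df_PDF$P_X_equals_K,
  type = "bar"
) %>%
  layout(title = "Probability Mass Function (PMF)",
         xaxis = list(title = "K"),
         yaxis = list(title = "P(X = K)"))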

Probability Mass Function (ggplot2)

library(ggplot2)

# df_PDF holds the distribution table (columns K_i and P_X_equals_K)
ggplot(df_PDF, aes(x = K_i, y = P_X_equals_K)) +
  geom_col(fill = "blue") +  # bar heights are taken directly from the data
  geom_text(aes(label = round(P_X_equals_K, 4)), vjust = -0.5, size = 5) +
  labs(
    title = "Probability Mass Function (PMF)",
    x = "K (Values of Random Variable X)",
    y = "P(X = K)"
  ) +
  theme_minimal()

Cumulative Distribution Function (ggplot2)

library(ggplot2)

# Cumulative probabilities F(K) = P(X <= K) for K = 1, ..., 8
K <- 1:8
CDF <- c(0, 4/64, 12/64, 32/64, 32/64, 48/64, 48/64, 64/64)
df_CDF <- data.frame(K_i = K, F_K = CDF)
ggplot(df_CDF, aes(x = K_i, y = F_K)) +
  geom_col(fill = "orange") +
  geom_text(aes(label = round(F_K, 4)), vjust = -0.5, size = 5) +
  labs(
    title = "Cumulative Distribution Function (CDF)",
    x = "K (Values of Random Variable X)",
    y = "P(X <= K)"
  ) +
  theme_minimal()
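
For comparison with the die example, the same cumulative values can be written as a piecewise function of any real \(x\). Adjacent integer steps with equal height (for instance 4 and 5) are merged into one interval:

\[ F(x) = \begin{cases} 0, & x < 2 \\ \frac{4}{64}, & 2 \leq x < 3 \\ \frac{12}{64}, & 3 \leq x < 4 \\ \frac{32}{64}, & 4 \leq x < 6 \\ \frac{48}{64}, & 6 \leq x < 8 \\ 1, & x \geq 8 \end{cases} \]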