Discrete random variable (X)
In probability, a discrete random variable takes values in a finite or countably infinite set (for example 0, 1, 2, …). Each possible value has an assigned probability. Examples: the number of heads when tossing n coins, the number of students who pass in a group of 20, the number of trials until the first success.
Probability mass function f(x)
The pmf assigns to each value x the probability \(f(x)=P(X=x)\). It must satisfy \(f(x)\ge0\) for every x and \(\sum_x f(x)=1\).
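
The two conditions can be checked numerically. The following is a minimal R sketch; the support and the probabilities are illustrative values chosen here, not taken from the text.

```r
# Hypothetical pmf on the support {0, 1, 2, 3} (illustrative values)
x_demo <- 0:3
f_demo <- c(0.1, 0.2, 0.3, 0.4)

# Condition 1: every probability is non-negative
all_nonneg <- all(f_demo >= 0)

# Condition 2: the probabilities sum to 1 (compared with a numerical tolerance)
sums_to_one <- isTRUE(all.equal(sum(f_demo), 1))

is_valid_pmf <- all_nonneg && sums_to_one
```

Comparing with `all.equal` instead of `==` avoids false negatives caused by floating-point rounding.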
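Cumulative distribution function F(x)
The CDF is \(F(x)=P(X\le x)\). For a discrete variable it is a step function with a jump of size \(f(x)\) at each value x of the support, so \(P(a\le X\le b)=F(b)-F(a^-)\), where \(F(a^-)=P(X<a)\) is the left limit of F at a.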
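
For an integer-valued variable the left limit reduces to \(F(a^-)=F(a-1)\). The sketch below illustrates this in R, assuming a Binomial(10, 0.5) variable and the interval 3 to 6 purely as an example; all variable names are hypothetical.

```r
# Illustrative assumption: X ~ Binomial(10, 0.5), an integer-valued variable
size_demo <- 10; prob_demo <- 0.5
lo_k <- 3; hi_k <- 6

# For integer support, F(a^-) = P(X < a) = F(a - 1)
via_cdf <- pbinom(hi_k, size_demo, prob_demo) - pbinom(lo_k - 1, size_demo, prob_demo)

# Same probability obtained by summing the pmf over lo_k, ..., hi_k
via_pmf <- sum(dbinom(lo_k:hi_k, size_demo, prob_demo))
```

Subtracting `F(lo_k)` instead of `F(lo_k - 1)` would drop the mass at the left endpoint, which is the usual off-by-one mistake with discrete intervals.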
Expected value E(X)
\(\mathbb{E}[X]=\sum_x x f(x)\) is the “center of mass” of the distribution.
Variance V(X)
\(\mathbb{V}[X]=\sum_x (x-\mu)^2 f(x)\) measures the dispersion around \(\mu=\mathbb{E}[X]\).
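
Both sums translate directly into vectorized R. As a small sketch, take the number of heads in two fair coin tosses (an illustrative example, with hypothetical variable names):

```r
# Number of heads in two fair coin tosses: support and pmf
x_heads <- 0:2
f_heads <- c(0.25, 0.50, 0.25)

mu_heads <- sum(x_heads * f_heads)                    # E[X] = sum of x * f(x)
sigma2_heads <- sum((x_heads - mu_heads)^2 * f_heads) # V[X] = sum of (x - mu)^2 * f(x)

# Equivalent shortcut: V[X] = E[X^2] - (E[X])^2
sigma2_alt <- sum(x_heads^2 * f_heads) - mu_heads^2
```

Here the mean is 1 and the variance is 0.5, and the shortcut formula gives the same variance.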
Discrete uniform distribution
If X takes each value in \(\{a,\ldots,b\}\) with equal probability: \(\displaystyle P(X=x)=\frac{1}{b-a+1},\ x=a,\ldots,b.\) \(\quad \mathbb{E}[X]=\frac{a+b}{2},\ \mathbb{V}[X]=\frac{(b-a+1)^2-1}{12}.\)

a <- 2; b <- 7  # endpoints of the support
x <- a:b
pmf <- rep(1/(b-a+1), length(x))  # equal mass on each of the b-a+1 values
df <- data.frame(x = x, `P(X=x)` = pmf, check.names = FALSE)
knitr::kable(df, caption = sprintf("Discrete uniform on {%d,...,%d}", a, b))

| x | P(X=x) |
|---|---|
| 2 | 0.1666667 |
| 3 | 0.1666667 |
| 4 | 0.1666667 |
| 5 | 0.1666667 |
| 6 | 0.1666667 |
| 7 | 0.1666667 |
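
The closed-form mean and variance can be compared against the defining sums. This is a sketch for the range 2 to 7 used in the table; the variable names are hypothetical.

```r
# Same illustrative range as the table above
lo_u <- 2; hi_u <- 7
vals <- lo_u:hi_u
probs <- rep(1 / length(vals), length(vals))

# Moments from the defining sums
mean_sum <- sum(vals * probs)
var_sum  <- sum((vals - mean_sum)^2 * probs)

# Moments from the closed-form expressions
mean_formula <- (lo_u + hi_u) / 2
var_formula  <- ((hi_u - lo_u + 1)^2 - 1) / 12
```

For this range both approaches give a mean of 4.5 and a variance of 35/12.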

Bernoulli distribution
For a single trial with success probability p: \(P(X=x)=p^x(1-p)^{1-x},\ x\in\{0,1\}.\) \(\quad \mathbb{E}[X]=p,\ \mathbb{V}[X]=p(1-p).\)

p <- 0.3  # success probability; value inferred from the table below
df <- data.frame(x = c(0,1), `P(X=x)` = c(1-p, p), check.names = FALSE)
knitr::kable(df, caption = sprintf("Bernoulli(p = %.2f)", p))

| x | P(X=x) |
|---|---|
| 0 | 0.7 |
| 1 | 0.3 |
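
A Bernoulli variable is a binomial variable with a single trial, so base R's `dbinom` with `size = 1` reproduces the table. The sketch below also recovers the mean and variance from the pmf; the variable names are hypothetical and the value 0.3 is the one shown in the table.

```r
prob_success <- 0.3  # same value as in the table above

# A Bernoulli(p) variable is a Binomial(1, p) variable
pmf_bern <- dbinom(0:1, size = 1, prob = prob_success)

# Moments computed directly from the pmf
mean_bern <- sum((0:1) * pmf_bern)
var_bern  <- sum(((0:1) - mean_bern)^2 * pmf_bern)
```

The results agree with \(p=0.3\) and \(p(1-p)=0.21\).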
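
Binomial distribution
For the number of successes X in n independent trials, each with success probability p: \(P(X=k)=\binom{n}{k}p^k(1-p)^{n-k},\ k=0,\ldots,n.\) \(\quad \mathbb{E}[X]=np,\ \mathbb{V}[X]=np(1-p).\)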

n <- 12; p_sem <- 0.8  # number of trials and success probability
k <- 0:n  # full support
pmf <- dbinom(k, size = n, prob = p_sem)
cdf <- pbinom(k, size = n, prob = p_sem)
df <- data.frame(k = k, `P(X=k)` = pmf, `P(X≤k)` = cdf, check.names = FALSE)
knitr::kable(df, caption = sprintf("Binomial(n = %d, p = %.2f): pmf and CDF", n, p_sem), digits = 6)

| k | P(X=k) | P(X≤k) |
|---|---|---|
| 0 | 0.000000 | 0.000000 |
| 1 | 0.000000 | 0.000000 |
| 2 | 0.000004 | 0.000005 |
| 3 | 0.000058 | 0.000062 |
| 4 | 0.000519 | 0.000581 |
| 5 | 0.003322 | 0.003903 |
| 6 | 0.015502 | 0.019405 |
| 7 | 0.053150 | 0.072555 |
| 8 | 0.132876 | 0.205431 |
| 9 | 0.236223 | 0.441654 |
| 10 | 0.283468 | 0.725122 |
| 11 | 0.206158 | 0.931281 |
| 12 | 0.068719 | 1.000000 |
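
Upper-tail probabilities follow from the CDF column by complement. As a sketch with the same parameters as the table (variable names are hypothetical), \(P(X\ge 10)\) can be obtained in three equivalent ways:

```r
size_bin <- 12; prob_bin <- 0.8  # same parameters as the table above

# P(X >= 10) as the complement of the CDF at 9
upper_compl <- 1 - pbinom(9, size = size_bin, prob = prob_bin)

# Same tail via lower.tail = FALSE, which returns P(X > 9)
upper_tail <- pbinom(9, size = size_bin, prob = prob_bin, lower.tail = FALSE)

# Same tail by summing the pmf over 10, 11, 12
upper_sum <- sum(dbinom(10:12, size = size_bin, prob = prob_bin))
```

All three give about 0.5583, which matches 1 minus the table entry 0.441654 for \(P(X\le 9)\).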

Poisson distribution
\(P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!},\ k=0,1,2,\ldots\), with \(\mathbb{E}[X]=\lambda\) and \(\mathbb{V}[X]=\lambda\).

lambda <- 4  # rate parameter
k <- 0:20  # the support is infinite; the table is truncated at 20
pmf <- dpois(k, lambda)
cdf <- ppois(k, lambda)
df <- data.frame(k = k, `P(X=k)` = pmf, `P(X≤k)` = cdf, check.names = FALSE)
knitr::kable(df, caption = sprintf("Poisson(λ = %.1f): pmf and CDF (k=0..20)", lambda), digits = 6)

| k | P(X=k) | P(X≤k) |
|---|---|---|
| 0 | 0.018316 | 0.018316 |
| 1 | 0.073263 | 0.091578 |
| 2 | 0.146525 | 0.238103 |
| 3 | 0.195367 | 0.433470 |
| 4 | 0.195367 | 0.628837 |
| 5 | 0.156293 | 0.785130 |
| 6 | 0.104196 | 0.889326 |
| 7 | 0.059540 | 0.948866 |
| 8 | 0.029770 | 0.978637 |
| 9 | 0.013231 | 0.991868 |
| 10 | 0.005292 | 0.997160 |
| 11 | 0.001925 | 0.999085 |
| 12 | 0.000642 | 0.999726 |
| 13 | 0.000197 | 0.999924 |
| 14 | 0.000056 | 0.999980 |
| 15 | 0.000015 | 0.999995 |
| 16 | 0.000004 | 0.999999 |
| 17 | 0.000001 | 1.000000 |
| 18 | 0.000000 | 1.000000 |
| 19 | 0.000000 | 1.000000 |
| 20 | 0.000000 | 1.000000 |
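
Because the Poisson support is infinite, the table stops at 20 and leaves out a small amount of probability. The sketch below measures that omitted mass and checks that the truncated sums still approximate the mean and variance; the variable names are hypothetical.

```r
rate <- 4        # same rate as in the table above
ks_pois <- 0:20  # truncated support used in the table

pmf_pois <- dpois(ks_pois, rate)

# Probability mass left out by truncating at 20, i.e. P(X > 20)
tail_mass <- ppois(20, rate, lower.tail = FALSE)

# Truncated sums approximate E[X] and V[X], both equal to the rate
mean_trunc <- sum(ks_pois * pmf_pois)
var_trunc  <- sum((ks_pois - mean_trunc)^2 * pmf_pois)
```

With a rate of 4 the omitted mass is far below the six decimals displayed, which is why the CDF column reads 1.000000 from k = 17 onward.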
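
Hypergeometric distribution
When n items are drawn without replacement from a population of N items, K of which are successes, the number of successes X in the sample satisfies \(P(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}\) for \(\max(0,n-(N-K))\le k\le\min(n,K)\). \(\quad \mathbb{E}[X]=n\frac{K}{N},\ \mathbb{V}[X]=n\frac{K}{N}\left(1-\frac{K}{N}\right)\frac{N-n}{N-1}.\)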

N <- 50; K <- 12; n <- 8  # population size, successes in the population, draws
k_min <- max(0, n-(N-K)); k_max <- min(n, K)  # bounds of the support
k <- k_min:k_max
pmf <- dhyper(k, K, N-K, n)  # arguments: successes, failures, number of draws
df1 <- data.frame(N=N, K=K, n=n)
df2 <- data.frame(k=k, `P(X=k)`=pmf, check.names = FALSE)
knitr::kable(df1, caption = "Example parameters")

| N | K | n |
|---|---|---|
| 50 | 12 | 8 |

knitr::kable(df2, digits = 6)

| k | P(X=k) |
|---|---|
| 0 | 0.091089 |
| 1 | 0.282081 |
| 2 | 0.339378 |
| 3 | 0.205684 |
| 4 | 0.068057 |
| 5 | 0.012445 |
| 6 | 0.001210 |
| 7 | 0.000056 |
| 8 | 0.000001 |
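
The mean and variance formulas, including the finite-population factor \(\frac{N-n}{N-1}\), can be checked against the pmf. This is a sketch with the same parameters as above; the variable names are hypothetical.

```r
pop <- 50; succ <- 12; draws <- 8  # same parameters as the example above

# Support and pmf
ks_hyp <- max(0, draws - (pop - succ)):min(draws, succ)
pmf_hyp <- dhyper(ks_hyp, succ, pop - succ, draws)

# Moments from the defining sums
mean_hyp <- sum(ks_hyp * pmf_hyp)
var_hyp  <- sum((ks_hyp - mean_hyp)^2 * pmf_hyp)

# Moments from the closed-form expressions
mean_hyp_formula <- draws * succ / pop
var_hyp_formula  <- mean_hyp_formula * (1 - succ / pop) * (pop - draws) / (pop - 1)
```

For these parameters the mean is 1.92. The variance is about 1.2507, smaller than the 1.4592 that a binomial with the same n and p = K/N would give, because sampling is without replacement.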
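
Negative binomial distribution
If X counts the failures before the r-th success in independent trials with success probability p: \(P(X=k)=\binom{k+r-1}{r-1}p^r(1-p)^k,\ k=0,1,2,\ldots\) \(\quad \mathbb{E}[X]=\frac{r(1-p)}{p},\ \mathbb{V}[X]=\frac{r(1-p)}{p^2}.\)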
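
r <- 3  # number of successes required
k <- 0:20  # number of failures; the table is truncated at 20
pmf <- dnbinom(k, size = r, prob = p)  # p is the success probability (0.3 in the table below)
df <- data.frame(k=k, `P(X=k)`=pmf, check.names = FALSE)
knitr::kable(df, caption = sprintf("NegBin(r = %d, p = %.2f): failures before %d successes", r, p, r), digits = 6)

| k | P(X=k) |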
|---|---|
| 0 | 0.027000 |
| 1 | 0.056700 |
| 2 | 0.079380 |
| 3 | 0.092610 |
| 4 | 0.097240 |
| 5 | 0.095296 |
| 6 | 0.088943 |
| 7 | 0.080048 |
| 8 | 0.070042 |
| 9 | 0.059925 |
| 10 | 0.050337 |
| 11 | 0.041643 |
| 12 | 0.034008 |
| 13 | 0.027468 |
| 14 | 0.021974 |
| 15 | 0.017433 |
| 16 | 0.013729 |
| 17 | 0.010741 |
| 18 | 0.008354 |
| 19 | 0.006463 |
| 20 | 0.004977 |
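
The table stops at 20 failures while the distribution still has visible mass there, so the moment formulas are easier to verify on a longer range. The sketch below does that and also shows that r = 1 reduces to R's geometric distribution; the variable names and the cutoff of 500 are assumptions made for illustration.

```r
r_succ <- 3; prob_nb <- 0.3  # same parameters as the table above
ks_nb <- 0:500               # long enough that the omitted tail is negligible here

pmf_nb <- dnbinom(ks_nb, size = r_succ, prob = prob_nb)

# Moments from (truncated) sums
mean_nb <- sum(ks_nb * pmf_nb)
var_nb  <- sum((ks_nb - mean_nb)^2 * pmf_nb)

# Moments from the closed-form expressions
mean_nb_formula <- r_succ * (1 - prob_nb) / prob_nb    # 7
var_nb_formula  <- r_succ * (1 - prob_nb) / prob_nb^2  # about 23.33

# With r = 1 the pmf coincides with dgeom (failures before the first success)
same_as_geom <- isTRUE(all.equal(dnbinom(0:10, size = 1, prob = prob_nb),
                                 dgeom(0:10, prob = prob_nb)))
```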

Geometric distribution
If X counts the trials up to and including the first success: \(P(X=k)=(1-p)^{k-1}p,\ k\ge1\). \(\quad \mathbb{E}[X]=\frac{1}{p},\ \mathbb{V}[X]=\frac{1-p}{p^2}.\)

k <- 1:15  # trial on which the first success occurs
pmf <- dgeom(k-1, prob = p)  # R's dgeom counts failures before the first success, hence k-1
cdf <- pgeom(k-1, prob = p)
df <- data.frame(k=k, `P(X=k)`=pmf, `P(X≤k)`=cdf, check.names=FALSE)
knitr::kable(df, caption = sprintf("Geometric(p = %.2f)", p), digits = 6)

| k | P(X=k) | P(X≤k) |
|---|---|---|
| 1 | 0.300000 | 0.300000 |
| 2 | 0.210000 | 0.510000 |
| 3 | 0.147000 | 0.657000 |
| 4 | 0.102900 | 0.759900 |
| 5 | 0.072030 | 0.831930 |
| 6 | 0.050421 | 0.882351 |
| 7 | 0.035295 | 0.917646 |
| 8 | 0.024706 | 0.942352 |
| 9 | 0.017294 | 0.959646 |
| 10 | 0.012106 | 0.971752 |
| 11 | 0.008474 | 0.980227 |
| 12 | 0.005932 | 0.986159 |
| 13 | 0.004152 | 0.990311 |
| 14 | 0.002907 | 0.993218 |
| 15 | 0.002035 | 0.995252 |
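
Summing the pmf gives the closed form \(P(X\le k)=1-(1-p)^k\), since \(X>k\) means the first k trials all failed. The sketch below compares it with `pgeom`, using the shift by one that R's parameterization requires; the variable names are hypothetical.

```r
prob_geo <- 0.3   # same success probability as the table above
trials <- 1:15

# Closed form for X = number of trials up to and including the first success
cdf_closed <- 1 - (1 - prob_geo)^trials

# R counts failures before the first success, hence the shift by one
cdf_r <- pgeom(trials - 1, prob = prob_geo)
```

For example, the second entry is \(1-0.7^2=0.51\), as in the table.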