Raman Meikrishnan, Madisen Phillips, Kevin R. MacCauley
October 5, 2023
The Poisson Distribution is a discrete probability distribution for the number of events that occur in a fixed interval when the events happen independently at a constant average rate. Its probability mass function is:
\[ p(x) = \frac{e^{-\lambda}\lambda^x}{x!} \]
where \(\lambda\) is the average rate of occurrence (the expected number of events per interval).
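To make the formula concrete, here is a minimal sketch that evaluates it by hand and compares the result with R's built-in `dpois`. The helper name `poisson_pmf` and the demo values `x_vals` and `lambda_demo` are illustrative assumptions, not part of the original report.

```r
# Hand-written Poisson pmf, term by term from the formula above
# (poisson_pmf is a hypothetical helper used only for illustration)
poisson_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

x_vals <- 0:5        # illustrative counts (assumption)
lambda_demo <- 3     # illustrative rate (assumption)

manual  <- poisson_pmf(x_vals, lambda_demo)
builtin <- dpois(x_vals, lambda_demo)

# TRUE when the hand-written formula agrees with dpois
all.equal(manual, builtin)
```

For large counts, `dpois` is the safer choice because `factorial(x)` overflows long before the probabilities themselves become meaningless.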
# Defining the name of the distribution
distribution_name <- "Poisson Distribution"
distribution_name
## [1] "Poisson Distribution"
The given formula is:
\[ p(x) = \frac{e^{-22}22^x}{x!} \]
for \(x = 0, 1, 2, \ldots\)
This is a Poisson Distribution with \(\lambda = 22\).
We are looking for the probability of catching at least 20 mice. The R code below calculates this probability.
# Parameters
lambda <- 22
# Probability of catching at least 20 mice
prob_at_least_20 <- 1 - ppois(19, lambda)
prob_at_least_20
## [1] 0.693973
The ppois function calculates the cumulative probability
of a Poisson distribution. The complement rule is used here, which means
we first find the probability of the opposite event (catching 19 or
fewer mice) and then subtract that from 1 to get the probability of
catching at least 20 mice.
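As a cross-check on the complement rule, the sketch below computes the same probability three ways: with the complement, with the `lower.tail = FALSE` argument of `ppois`, and by summing the individual probabilities from `dpois`. The variable names are illustrative assumptions.

```r
lambda <- 22

# 1. Complement rule: 1 - P(X <= 19)
p_complement <- 1 - ppois(19, lambda)

# 2. Ask ppois for the upper tail P(X > 19) directly
p_upper_tail <- ppois(19, lambda, lower.tail = FALSE)

# 3. Sum the point probabilities for 0..19 and subtract from 1
p_summed <- 1 - sum(dpois(0:19, lambda))

# All three should agree up to floating-point error
c(p_complement, p_upper_tail, p_summed)
```

The `lower.tail = FALSE` form tends to be more accurate when the tail probability is very small, although for this problem the difference is negligible.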
library(ggplot2)
# Set up a data frame to hold the distribution
x <- 0:40
lambda <- 22
poisson_data <- data.frame(
  x = x,
  y = dpois(x, lambda)
)
# Create the plot
ggplot(poisson_data, aes(x, y)) +
  geom_bar(stat="identity", fill="steelblue") +
  labs(title="Poisson Distribution with lambda = 22",
       x="x",
       y="Probability") +
  theme_minimal()
The standard deviation of a Poisson distribution is the square root of \(\lambda\).
## [1] 4.690416
The standard deviation measures how spread out the numbers are from the average. In a Poisson distribution, it’s equal to the square root of the average rate \(\lambda\).
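The code that produced the value above is not shown in the report; a minimal sketch that reproduces it, assuming the same \(\lambda = 22\), could look like this (the name `sd_poisson` is an assumption):

```r
lambda <- 22

# For a Poisson distribution the variance equals lambda,
# so the standard deviation is its square root
sd_poisson <- sqrt(lambda)
sd_poisson
```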
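The Binomial Distribution helps us understand situations where you ask the same yes-or-no question a fixed number of times and count the yeses. Each trial is independent and has the same chance of a yes.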
## [1] "Binomial Distribution"
Let’s find the probability of the bummer situation where fewer than 15 mice have Lyme disease. “Fewer than 15” means 14 or fewer, so we evaluate the cumulative probability at 14.
# R Code for Probability Calculation
p_fewer_than_15 <- pbinom(14, size = 25, prob = 0.7)
p_fewer_than_15
## [1] 0.09780001
On average, how many mice out of 25 should we expect to have Lyme disease?
## [1] 17.5
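The 17.5 above is the binomial mean \(n \cdot p\). A minimal sketch, assuming the same \(n = 25\) and \(p = 0.7\) used elsewhere in this section, computes it directly and cross-checks it against the definition of an expected value (the variable names are assumptions):

```r
n <- 25    # number of mice examined
p <- 0.7   # probability that a single mouse has Lyme disease

# Mean of a binomial distribution
expected_infected <- n * p
expected_infected

# Cross-check: sum of x * P(X = x) over every possible count
expected_by_definition <- sum(0:n * dbinom(0:n, size = n, prob = p))
expected_by_definition
```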
# Set up a data frame to hold the distribution
x <- 0:25
binom_data <- data.frame(
  x = x,
  y = dbinom(x, size = 25, prob = 0.7)
)
# Create the plot
ggplot(binom_data, aes(x, y)) +
  geom_bar(stat="identity", fill="steelblue") +
  labs(title="Binomial Distribution (n=25, p=0.7)",
       x="x",
       y="Probability") +
  theme_minimal()
The Gamma Distribution helps us understand the lifespan of things like transistors. Two main characters, the shape alpha (α) and the scale beta (β), help shape this lifespan story.
The equations to find α and β from the mean and variance are:
\[ \text{Mean} = \alpha \cdot \beta \]
\[ \text{Variance} = \alpha \cdot \beta^2 \]
Let’s find the values of α and β.
# Given data
mean <- 24
variance <- 12
# Solving for alpha and beta
alpha <- mean^2 / variance
beta <- variance / mean
alpha
## [1] 48
beta
## [1] 0.5
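The two formulas used for α and β follow from the mean and variance equations: dividing the variance by the mean leaves β, and substituting β back into the mean gives α = Mean / β = Mean² / Variance. The sketch below repeats that calculation step by step and confirms that the resulting parameters reproduce the given mean and variance. The names `mean_life` and `var_life` are assumptions chosen to avoid masking R's built-in `mean` function.

```r
mean_life <- 24   # given mean lifetime in weeks
var_life  <- 12   # given variance

# Variance / Mean = (alpha * beta^2) / (alpha * beta) = beta
beta <- var_life / mean_life

# Mean = alpha * beta, so alpha = Mean / beta
alpha <- mean_life / beta

c(alpha = alpha, beta = beta)

# Round trip: the parameters should give back the mean and variance
c(mean = alpha * beta, variance = alpha * beta^2)
```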
What are the chances that a transistor will work for at least 18 weeks but not more than 24 weeks?
# Probability of lasting between 18 and 24 weeks
prob_18_to_24 <- pgamma(24, shape = alpha, rate = 1/beta) - pgamma(18, shape = alpha, rate = 1/beta)
prob_18_to_24
## [1] 0.4872366
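Because β is a scale parameter here, the calculation passes `rate = 1/beta` to `pgamma`. An equivalent sketch uses the `scale` argument directly, which avoids the reciprocal. The variable names `p_rate` and `p_scale` are assumptions.

```r
alpha <- 48    # shape
beta  <- 0.5   # scale

# Version with rate = 1 / scale
p_rate <- pgamma(24, shape = alpha, rate = 1 / beta) -
  pgamma(18, shape = alpha, rate = 1 / beta)

# Same probability using the scale argument and diff()
p_scale <- diff(pgamma(c(18, 24), shape = alpha, scale = beta))

c(p_rate, p_scale)
```

Supplying both `rate` and `scale` with inconsistent values is an error in R, so it is safest to pick one parameterization and use it throughout.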
# Parameters (repeated so this chunk runs on its own)
mean <- 24
variance <- 12
alpha <- mean^2 / variance
beta <- variance / mean
# Set up a data frame to hold the distribution
x <- seq(0, 80, by = 0.1)
gamma_data <- data.frame(
  x = x,
  y = dgamma(x, shape=alpha, rate=1/beta)
)
# Create the plot
ggplot(gamma_data, aes(x, y)) +
  geom_line(color="steelblue") +
  labs(title="Gamma Distribution",
       x="x",
       y="Density") +
  theme_minimal()
By what age have 99 out of 100 transistors stopped working?
## [1] 32.7853
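The code for this answer is not shown. The value is consistent with the 99th percentile of the fitted gamma distribution, so a plausible sketch, assuming α = 48 and β = 0.5 from earlier, uses `qgamma` (the name `age_99` is an assumption):

```r
alpha <- 48    # shape
beta  <- 0.5   # scale

# Age (in weeks) by which 99% of transistors have failed
age_99 <- qgamma(0.99, shape = alpha, scale = beta)
age_99

# Round trip: the cumulative probability at that age should be 0.99
pgamma(age_99, shape = alpha, scale = beta)
```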
How does the cost of maintaining transistors bounce around based on their lifetimes?
## [1] 192
IQ scores follow a “Normal Distribution”, which means most people have scores around the average, but a few people score much lower or much higher. In the US, the average IQ is 100, and the “spread” of scores (the standard deviation) is 15.
Let’s find the chance of meeting a super smart cookie with an IQ bigger than 130!
# Calculating the probability of IQ greater than 130
prob_greater_130 <- 1 - pnorm(130, mean = 100, sd = 15)
prob_greater_130
## [1] 0.02275013
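An IQ of 130 sits exactly two standard deviations above the mean, so the same probability can be read off the standard normal distribution. The sketch below shows that route next to the upper-tail form of `pnorm`; the variable names are assumptions.

```r
mean_iq <- 100
sd_iq   <- 15

# Upper tail requested directly
p_upper <- pnorm(130, mean = mean_iq, sd = sd_iq, lower.tail = FALSE)

# Same thing after standardizing: (130 - 100) / 15 = 2
z_130 <- (130 - mean_iq) / sd_iq
p_standard <- 1 - pnorm(z_130)

c(z_130, p_upper, p_standard)
```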
Now, let’s find the chance of meeting someone whose IQ is not too far from the average, between 95 and 115.
# Calculating the probability of IQ between 95 and 115
prob_between_95_115 <- pnorm(115, mean = 100, sd = 15) - pnorm(95, mean = 100, sd = 15)
prob_between_95_115
## [1] 0.4719034
# Set up a data frame to hold the distribution
x <- seq(55, 145, by = 0.1)
normal_data <- data.frame(
  x = x,
  y = dnorm(x, mean = 100, sd = 15)
)
# Create the plot
ggplot(normal_data, aes(x, y)) +
  geom_line(color="steelblue") +
  labs(title="Normal Distribution (mean=100, sd=15)",
       x="IQ Score",
       y="Density") +
  theme_minimal()
The standard score (z-score) tells us how special an IQ of 120 is compared to everyone else: it counts how many standard deviations the score sits above the average.
## [1] 1.333333
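The calculation behind this value is not shown; a minimal sketch, assuming the mean of 100 and standard deviation of 15 stated earlier, is below. The variable names are assumptions.

```r
iq      <- 120
mean_iq <- 100
sd_iq   <- 15

# Standard score: distance from the mean in units of standard deviations
z_score <- (iq - mean_iq) / sd_iq
z_score

# Share of people expected to score below 120
share_below <- pnorm(z_score)
share_below
```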
We’re looking for the range of IQ scores where you’ll find the middle 95% of people.
# Finding the IQ values for the middle 95% of the population
lower_bound <- qnorm(0.025, mean = 100, sd = 15)
upper_bound <- qnorm(0.975, mean = 100, sd = 15)
c(lower_bound, upper_bound)
## [1] 70.60054 129.39946
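The middle 95% leaves 2.5% in each tail, which is why the quantiles 0.025 and 0.975 are used. As a sanity check, the sketch below rebuilds the bounds from the standard normal critical value and confirms that the interval really covers 95% of the distribution. The names `z_crit`, `bounds`, and `coverage` are assumptions.

```r
mean_iq <- 100
sd_iq   <- 15

# Critical value that leaves 2.5% in the upper tail (about 1.96)
z_crit <- qnorm(0.975)

# Bounds as mean -/+ z_crit standard deviations
bounds <- mean_iq + c(-1, 1) * z_crit * sd_iq
bounds

# Probability mass between the two bounds
coverage <- diff(pnorm(bounds, mean = mean_iq, sd = sd_iq))
coverage
```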