Analysis of Mouse Catching Statistics

Raman Meikrishnan, Madisen Phillips, Kevin R. MacCauley

October 5, 2023

Problem 3: Parts A, B, and C

Introduction to Poisson Distribution

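The Poisson Distribution is a discrete probability distribution that models the number of events occurring in a fixed interval when the events happen independently at a constant average rate. Its probability mass function is: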

\[ p(x) = \frac{e^{-\lambda}\lambda^x}{x!} \]

where \(\lambda\) is the average number of occurrences per interval and \(x = 0, 1, 2, \ldots\) is the observed count.
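
To connect the formula to R, here is a minimal sketch that evaluates it by hand and compares the result with the built-in dpois. The helper name poisson_pmf and the value \(\lambda = 3\) are our own illustrative assumptions, not part of the problem.

```r
# Hypothetical helper: the Poisson formula written out term by term
poisson_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

counts <- 0:10        # illustrative counts
lambda_demo <- 3      # illustrative rate (an assumption, not from the problem)

manual  <- poisson_pmf(counts, lambda_demo)
builtin <- dpois(counts, lambda_demo)

# TRUE when the hand-written formula agrees with dpois
pmf_matches <- isTRUE(all.equal(manual, builtin))
pmf_matches
```

If the two agree, dpois can stand in for the formula in the rest of the analysis.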

# Defining the name of the distribution
distribution_name <- "Poisson Distribution"
distribution_name
## [1] "Poisson Distribution"

Identifying the Distribution

The given formula is:

\[ p(x) = \frac{e^{-22}22^x}{x!} \]

for \(x = 0, 1, 2, \ldots\)

This is a Poisson Distribution with \(\lambda = 22\).

Probability of Catching At Least 20 Mice

We are looking for the probability of catching at least 20 mice. The R code below calculates this probability.

# Parameters
lambda <- 22

# Probability of catching at least 20 mice
prob_at_least_20 <- 1 - ppois(19, lambda)
prob_at_least_20
## [1] 0.693973

Explanation of Probability Calculation

The ppois function returns the cumulative probability \(P(X \le q)\) of a Poisson distribution, so ppois(19, lambda) is the probability of catching 19 or fewer mice. The complement rule is used here: because “at least 20” is the opposite of “19 or fewer”, subtracting that cumulative probability from 1 gives the probability of catching at least 20 mice.
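
As a cross-check on the complement rule, the sketch below computes the same probability three ways. The variable names and the cutoff of 200 for the direct sum are our own assumptions; the cutoff only needs to be far enough into the tail that the remaining probability is negligible.

```r
lambda <- 22

# 1. Complement rule, as in the calculation above
via_complement <- 1 - ppois(19, lambda)

# 2. Ask ppois for the upper tail P(X > 19) directly
via_upper_tail <- ppois(19, lambda, lower.tail = FALSE)

# 3. Add up the individual probabilities for 20, 21, ..., 200
via_direct_sum <- sum(dpois(20:200, lambda))

c(via_complement, via_upper_tail, via_direct_sum)
```

All three values should agree to many decimal places. The lower.tail = FALSE form is usually preferred when the tail probability is very small, since it avoids subtracting two nearly equal numbers.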

library(ggplot2)

# Set up a data frame to hold the distribution
x <- 0:40
lambda <- 22
poisson_data <- data.frame(
  x = x,
  y = dpois(x, lambda)
)

# Create the plot
ggplot(poisson_data, aes(x, y)) +
  geom_bar(stat="identity", fill="steelblue") +
  labs(title="Poisson Distribution with lambda = 22",
       x="x",
       y="Probability") +
  theme_minimal()

Standard Deviation

The standard deviation of a Poisson distribution is the square root of \(\lambda\).

# Standard deviation
std_dev <- sqrt(lambda)
std_dev
## [1] 4.690416

Explanation of Standard Deviation

The standard deviation measures how spread out the numbers are from the average. In a Poisson distribution, it’s equal to the square root of the average rate \(\lambda\).
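
The square-root rule can be checked numerically from the definition of variance. This is only a sketch: the names k, p_k, and the truncation at 200 are our assumptions, chosen so the ignored tail is negligible for \(\lambda = 22\).

```r
lambda <- 22
k   <- 0:200                 # truncated support (assumption: tail beyond 200 is negligible)
p_k <- dpois(k, lambda)      # probability of each count

mean_check <- sum(k * p_k)                    # E[X]
var_check  <- sum((k - mean_check)^2 * p_k)   # E[(X - E[X])^2]
sd_check   <- sqrt(var_check)

c(mean = mean_check, variance = var_check, sd = sd_check)
```

Both the mean and the variance should come out at essentially 22, which is why the standard deviation is \(\sqrt{22}\).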

Conclusion to Problem 3 - a, b and c.

Problem 3: Part D

Introduction to Binomial Distribution

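The Binomial Distribution counts the number of “yes” outcomes in a fixed number of independent yes-or-no trials that all share the same probability of “yes”. Here each of the 25 mice either has Lyme disease or does not, and each has a 0.7 chance of having it.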
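
For reference, the binomial probability of exactly \(x\) successes in \(n\) trials with success probability \(p\) can be written as below. The second form plugs in \(n = 25\) and \(p = 0.7\), which are the values used in the R code for this part.

\[ p(x) = \binom{n}{x} p^x (1 - p)^{n - x} = \binom{25}{x} (0.7)^x (0.3)^{25 - x} \]

for \(x = 0, 1, 2, \ldots, 25\)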

# R Code for Named Distribution
distribution_name <- "Binomial Distribution"
distribution_name
## [1] "Binomial Distribution"

Probability of Fewer Than 15 Mice with Lyme Disease

Let’s find the probability that fewer than 15 of the 25 mice have Lyme disease. Since the count is a whole number, “fewer than 15” means 14 or fewer, which is why pbinom is evaluated at 14.

# R Code for Probability Calculation
p_fewer_than_15 <- pbinom(14, size = 25, prob = 0.7)
p_fewer_than_15
## [1] 0.09780001
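
Because “fewer than 15” and “at most 15” are easy to mix up, here is a small sketch that spells out the difference. The variable names are our own; the parameters match the calculation above.

```r
n_mice <- 25
p_lyme <- 0.7

# P(X < 15) = P(X <= 14): cumulative probability up to 14
via_cdf <- pbinom(14, size = n_mice, prob = p_lyme)

# Same quantity by adding up P(X = 0), ..., P(X = 14)
via_sum <- sum(dbinom(0:14, size = n_mice, prob = p_lyme))

# P(X <= 15) also counts exactly 15 mice, so it is strictly larger
at_most_15 <- pbinom(15, size = n_mice, prob = p_lyme)

c(fewer_than_15 = via_cdf, by_summing = via_sum, at_most_15 = at_most_15)
```

The first two entries should match, while the third includes the extra probability of exactly 15 infected mice.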

Expected Number of Mice with Lyme Disease

On average, how many mice out of 25 should we expect to have Lyme disease?

# R Code for Expected Value Calculation
expected_value <- 25 * 0.7
expected_value
## [1] 17.5
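
The shortcut \(n \cdot p\) can be confirmed from the definition of expected value, which weights each possible count by its probability. A minimal sketch, with variable names that are our own:

```r
k <- 0:25                                   # every possible number of infected mice
p_k <- dbinom(k, size = 25, prob = 0.7)     # probability of each count

# Expected value as a probability-weighted average of the counts
expected_from_definition <- sum(k * p_k)
expected_from_definition
```

This should agree with 25 * 0.7 up to floating-point rounding.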

Conclusion to Problem 3 - d.

# Set up a data frame to hold the distribution
x <- 0:25
binom_data <- data.frame(
  x = x,
  y = dbinom(x, size = 25, prob = 0.7)
)

# Create the plot
ggplot(binom_data, aes(x, y)) +
  geom_bar(stat="identity", fill="steelblue") +
  labs(title="Binomial Distribution (n=25, p=0.7)",
       x="x",
       y="Probability") +
  theme_minimal()

Problem 4: Transistor Life Analysis

Introduction to Gamma Distribution

The Gamma Distribution is a common model for the lifespan of things like transistors. Two parameters, the shape alpha (α) and the scale beta (β), determine what the lifetime distribution looks like.

Finding Alpha and Beta

The equations to find α and β from the mean and variance are:

\[ \text{Mean} = \alpha \cdot \beta \]

\[ \text{Variance} = \alpha \cdot \beta^2 \]
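
These two equations can be solved by division. Dividing the variance by the mean cancels α, and dividing the squared mean by the variance cancels β:

\[ \beta = \frac{\text{Variance}}{\text{Mean}} = \frac{\alpha \cdot \beta^2}{\alpha \cdot \beta}, \qquad \alpha = \frac{\text{Mean}^2}{\text{Variance}} = \frac{\alpha^2 \cdot \beta^2}{\alpha \cdot \beta^2} \]

With the mean of 24 and variance of 12 used in this problem, that gives \(\beta = 12 / 24 = 0.5\) and \(\alpha = 24^2 / 12 = 48\).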

Let’s find the values of α and β.

# Given data
mean <- 24
variance <- 12

# Solving for alpha and beta
alpha <- mean^2 / variance
beta <- variance / mean

alpha
## [1] 48
beta
## [1] 0.5

Probability of Lasting Between 18 and 24 Weeks

What are the chances that a transistor will work for at least 18 weeks but not more than 24 weeks?

# Probability of lasting between 18 and 24 weeks
prob_18_to_24 <- pgamma(24, shape = alpha, rate = 1/beta) - pgamma(18, shape = alpha, rate = 1/beta)
prob_18_to_24
## [1] 0.4872366
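
R's gamma functions accept either a rate or a scale, and mixing them up is a common source of wrong answers. The sketch below is a sanity check under the parameter values found above: it repeats the calculation with scale = beta and also integrates the density numerically. The variable names are our own.

```r
alpha <- 48
beta  <- 0.5   # scale parameter, so rate = 1 / beta = 2

# Rate parameterization, as in the calculation above
via_rate <- pgamma(24, shape = alpha, rate = 1 / beta) -
  pgamma(18, shape = alpha, rate = 1 / beta)

# Equivalent scale parameterization
via_scale <- pgamma(24, shape = alpha, scale = beta) -
  pgamma(18, shape = alpha, scale = beta)

# Area under the density between 18 and 24 weeks
via_integral <- integrate(dgamma, lower = 18, upper = 24,
                          shape = alpha, scale = beta)$value

c(via_rate, via_scale, via_integral)
```

The three numbers should agree closely. Writing scale = beta keeps the code in the same parameterization as the mean and variance equations.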

# Recompute the gamma parameters from the mean and variance
mean <- 24
variance <- 12
alpha <- mean^2 / variance
beta <- variance / mean

# Set up a data frame to hold the distribution
x <- seq(0, 80, by = 0.1)
gamma_data <- data.frame(
  x = x,
  y = dgamma(x, shape=alpha, rate=1/beta)
)

# Create the plot
ggplot(gamma_data, aes(x, y)) +
  geom_line(color="steelblue") +
  labs(title="Gamma Distribution",
       x="x",
       y="Density") +
  theme_minimal()

Finding the 99th Percentile

By what age have 99 out of 100 transistors stopped working? This is the 99th percentile of the lifetime distribution.

# 99th percentile
percentile_99 <- qgamma(0.99, shape = alpha, rate = 1/beta)
percentile_99
## [1] 32.7853
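
A quick way to check a percentile is to feed it back into the cumulative distribution function, since qgamma and pgamma are inverses of each other. A minimal sketch using the same shape and rate as above (the variable names are ours):

```r
shape_par <- 48
rate_par  <- 2   # 1 / beta with beta = 0.5

# Lifetime below which 99% of transistors fall
q99 <- qgamma(0.99, shape = shape_par, rate = rate_par)

# Plugging it back in should recover the 0.99 we started from
round_trip <- pgamma(q99, shape = shape_par, rate = rate_par)
c(q99 = q99, round_trip = round_trip)
```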

Variance of the Cost Function

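How much does the cost of maintaining transistors vary with their lifetimes? The code below multiplies the lifetime variance by \(4^2\), which is the variance of a cost that is linear in the lifetime with a slope of 4.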

# Variance of the cost function
var_C <- (4^2) * variance
var_C
## [1] 192
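
The factor of \(4^2\) comes from the rule for the variance of a linear function. As an assumption for illustration, suppose the cost has the form \(C = 4X + b\), where \(X\) is the lifetime and \(b\) is some fixed amount; the constant \(b\) is hypothetical and does not affect the variance:

\[ \text{Var}(C) = \text{Var}(4X + b) = 4^2 \cdot \text{Var}(X) = 16 \cdot 12 = 192 \]

If the actual cost function is not linear in \(X\), this shortcut does not apply and the variance would need to be worked out from the gamma moments instead.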

Conclusion to Problem 4

Problem 5: Analyzing IQ Scores

IQ scores follow a “Normal Distribution”, which means most people have scores around the average, but a few people score much lower or much higher. In the US, the average IQ is 100, and the standard deviation (the “spread” of scores) is 15.

Probability of IQ Bigger Than 130

Let’s find the chance of meeting a super smart cookie with an IQ bigger than 130!

# Calculating the probability of IQ greater than 130
prob_greater_130 <- 1 - pnorm(130, mean = 100, sd = 15)
prob_greater_130
## [1] 0.02275013
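
The same answer can be reached by standardizing first, which shows that an IQ of 130 is exactly 2 standard deviations above the mean. A short sketch, with variable names of our own choosing:

```r
# Convert 130 to a z-score: (value - mean) / sd
z_130 <- (130 - 100) / 15

# Upper tail of the standard normal beyond that z-score
upper_tail_std <- pnorm(z_130, lower.tail = FALSE)

# Direct calculation on the IQ scale, for comparison
upper_tail_direct <- 1 - pnorm(130, mean = 100, sd = 15)

c(z = z_130, standardized = upper_tail_std, direct = upper_tail_direct)
```

Both routes should give the same probability, a little over 2%.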

Probability of IQ Between 95 and 115

Now, let’s find the chance of meeting someone whose IQ is not too far from the average, between 95 and 115.

# Calculating the probability of IQ between 95 and 115
prob_between_95_115 <- pnorm(115, mean = 100, sd = 15) - pnorm(95, mean = 100, sd = 15)
prob_between_95_115
## [1] 0.4719034

# Set up a data frame to hold the distribution
x <- seq(55, 145, by = 0.1)
normal_data <- data.frame(
  x = x,
  y = dnorm(x, mean = 100, sd = 15)
)

# Create the plot
ggplot(normal_data, aes(x, y)) +
  geom_line(color="steelblue") +
  labs(title="Normal Distribution (mean=100, sd=15)",
       x="IQ Score",
       y="Density") +
  theme_minimal()

Standard Score for IQ of 120

The standard score (z-score) tells us how many standard deviations an IQ of 120 sits above the average.

# Calculating the z-score for IQ of 120
z_score_120 <- (120 - 100) / 15
z_score_120
## [1] 1.333333
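
A z-score is easier to interpret once it is turned into a percentile. The sketch below does that with pnorm; the variable names are ours.

```r
z_score_120 <- (120 - 100) / 15

# Share of the population scoring below 120, from the standard normal
share_below_120 <- pnorm(z_score_120)

# Same share computed directly on the IQ scale
share_below_direct <- pnorm(120, mean = 100, sd = 15)

c(share_below_120, share_below_direct)
```

Both values should match and land at roughly the 91st percentile.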

IQ Values for Middle 95% of the Population

We’re looking for the range of IQ scores where you’ll find the middle 95% of people.

# Finding the IQ values for the middle 95% of the population
lower_bound <- qnorm(0.025, mean = 100, sd = 15)
upper_bound <- qnorm(0.975, mean = 100, sd = 15)
c(lower_bound, upper_bound)
## [1]  70.60054 129.39946
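
These bounds are just the mean plus or minus a critical z-value times the standard deviation. A minimal sketch of that view, with variable names that are our own:

```r
# z-value that leaves 2.5% in the upper tail of the standard normal
z_crit <- qnorm(0.975)

# Mean plus or minus z_crit standard deviations
bounds_from_z <- 100 + c(-1, 1) * z_crit * 15

# Direct calculation on the IQ scale, for comparison
bounds_direct <- qnorm(c(0.025, 0.975), mean = 100, sd = 15)

rbind(bounds_from_z, bounds_direct)
```

Since z_crit is a little under 2, the interval is slightly narrower than the rule-of-thumb range of 70 to 130.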

Conclusion to Problem 5