MATH266: Activities and Simulations with Special Discrete Distributions

Example 1: Robin Hood

An archer is able to hit the bullseye 70% of the time. Assume each shot is independent of the others.

Part I: Identify which distribution should be used in each scenario and calculate the desired probability.

A. The archer’s first bullseye comes on the fifth arrow.
B. The archer’s first bullseye comes within the first five arrows.
C. The archer’s third bullseye comes on the fifth arrow.
D. The archer’s first bullseye comes on the tenth arrow, given that the archer has not made any bullseyes within the first five arrows.
E. The archer gets three bullseyes within the first five arrows, given that the archer has made seven bullseyes within the first ten arrows.
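
As a check on hand calculations, the five scenarios above can be evaluated, in order, with R's built-in distribution functions. This is a sketch rather than an official solution key: the mapping of each scenario to a distribution (geometric, negative binomial, hypergeometric) is one reading of the problems, and the variable names are made up for illustration. Note that dgeom() and dnbinom() count the misses before a success, not the total number of arrows.

```r
p <- 0.7  # probability of a bullseye on any one shot

# A: first bullseye on arrow 5 (geometric: 4 misses before the first hit)
prob_a <- dgeom(4, prob = p)

# B: first bullseye within the first 5 arrows (at most 4 misses before the first hit)
prob_b <- pgeom(4, prob = p)

# C: third bullseye on arrow 5 (negative binomial: 2 misses before the 3rd hit)
prob_c <- dnbinom(2, size = 3, prob = p)

# D: first bullseye on arrow 10, given no bullseyes in the first 5
# (conditional probability; by memorylessness it equals prob_a)
prob_d <- dgeom(9, prob = p) / pgeom(4, prob = p, lower.tail = FALSE)

# E: 3 bullseyes in the first 5 arrows, given 7 in the first 10
# (hypergeometric: 7 hits and 3 misses among 10 arrows, first 5 arrows are the draw)
prob_e <- dhyper(3, m = 7, n = 3, k = 5)

round(c(A = prob_a, B = prob_b, C = prob_c, D = prob_d, E = prob_e), 5)
```

Scenario E does not depend on the 70% hit rate: once the total number of bullseyes is fixed, every arrangement of hits among the ten arrows is equally likely.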

Part II: Simulate the probabilities

Example for Problem A: The archer’s first bullseye comes on the fifth arrow.

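nsim<-1000   # number of simulated rounds of five arrows
count<-0     # rounds in which the first bullseye is on arrow 5
for(i in 1:nsim){
  # 1 = bullseye (prob 0.7), 0 = miss (prob 0.3)
  arrows<-sample(size=5, x=c(0,1), prob=c(0.3, 0.7), replace=TRUE)
  # exactly one bullseye in five arrows, and it is the fifth
  if(sum(arrows)==1 && arrows[5]==1){
    count<-count+1
  }
}
count/nsim   # estimated probability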

## [1] 0.006
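
The same pattern extends to the other scenarios. As one hedged illustration, here is a sketch for the third scenario in Part I (third bullseye on the fifth arrow). The seed, the larger nsim, and the use of replicate() are choices made here, not part of the original example. The event occurs when exactly two of the first four arrows hit and the fifth arrow hits.

```r
set.seed(266)   # arbitrary seed, only so the run is reproducible
nsim <- 10000

hits <- replicate(nsim, {
  # 1 = bullseye (prob 0.7), 0 = miss (prob 0.3)
  arrows <- sample(x = c(0, 1), size = 5, prob = c(0.3, 0.7), replace = TRUE)
  # two bullseyes among arrows 1 to 4, and a bullseye on arrow 5
  sum(arrows[1:4]) == 2 && arrows[5] == 1
})

est_c <- mean(hits)
est_c
```

The estimate should land near the exact negative binomial value, choose(4, 2) * 0.7^3 * 0.3^2 = 0.18522.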

Example 2: Occupancy Models

Ecologists use occupancy models to study animal populations. Ecologists at the Department of Natural Resources (DNR) use helicopter surveys to look for otter tracks in the snow along the Mississippi River and learn which parts of the river are occupied by otters. The occupancy rate is the probability that an animal is present at a particular site. The detection rate is the probability that the animal is detected at a site where it is present (in this case, that its tracks are seen from the helicopter). If the animal is not detected, this might be because the site is not occupied, or because the site is occupied and the tracks were not detected.

A common model used by ecologists is a zero-inflated binomial model. If a region is occupied, then the number of detections is binomial with $n$ the number of sites and $p$ the detection rate. If a region is unoccupied, the number of detections is 0.

Let $\alpha$ be the occupancy rate, $p$ the detection rate, and $n$ the number of sites.

A. Find the probability of zero detections. Express it as a function of $\alpha$, $n$, and $p$.
B. DNR ecologists search five sites along the Mississippi River for the presence of otters. Suppose $\alpha=0.75$ and $p=0.5$. Let $Z$ be the number of observed detections. Give the probability function for $Z$.
C. Write an R script to simulate 1000 draws from the distribution of $Z$, using the parameters from Part B. You may build it up using functions like rbinom() and sample().

### inflated binomial
occupied<-sample(size=1,
                 x=c(0,1),
                 prob=c(0.25, 0.75),
                 replace=TRUE)
#occupied

if(occupied==0){
  detect=0
}
if(occupied==1){
  detect=rbinom(n=1, size=5, prob=0.5)
}

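### LOOP IT
nsim<-10000          # note: ten times the 1000 draws requested in the problem
sim<-numeric(nsim)   # preallocate instead of growing the vector
for(i in 1:nsim){
  # 1 = region occupied (prob 0.75), 0 = unoccupied (prob 0.25)
  occupied<-sample(size=1,
                   x=c(0,1),
                   prob=c(0.25, 0.75),
                   replace=TRUE)

  if(occupied==0){
    detect<-0   # unoccupied: no detections possible
  } else {
    detect<-rbinom(n=1, size=5, prob=0.5)   # occupied: binomial over 5 sites
  }
  sim[i]<-detect
}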

# What does it look like
hist(sim)

# zero inflation
mean(sim==0)

## [1] 0.2753
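
For comparison with the simulated proportion of zeros, the exact probability function can be written down from the model: $P(Z=0)=(1-\alpha)+\alpha(1-p)^n$ and $P(Z=z)=\alpha\binom{n}{z}p^z(1-p)^{n-z}$ for $z=1,\ldots,n$. The helper below, dzib(), is a hypothetical name introduced here for illustration. It is a sketch of that formula, not a function from base R.

```r
# Zero-inflated binomial PMF (dzib is a made-up helper name, not a base R function)
dzib <- function(z, n, p, alpha) {
  # occupied with probability alpha: binomial number of detections
  # unoccupied with probability 1 - alpha: zero detections
  alpha * dbinom(z, size = n, prob = p) + (1 - alpha) * (z == 0)
}

pmf <- dzib(0:5, n = 5, p = 0.5, alpha = 0.75)
names(pmf) <- 0:5
pmf
sum(pmf)   # a valid PMF sums to 1
```

With these parameters, $P(Z=0)=0.25+0.75\times 0.5^5\approx 0.2734$, which agrees with the simulated proportion to within simulation error.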

Poisson

If a random variable $X$ has a Poisson distribution with parameter $\lambda>0$, then its PMF is given by \[f(x)=P(X=x)=\frac{e^{-\lambda}\lambda^x}{x!}, \quad x=0, 1, 2, \ldots\]
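
As a quick check of the formula, the PMF can be computed directly and compared with R's built-in dpois(). The value $\lambda=5$ and the range 0 to 10 are arbitrary choices for illustration.

```r
lambda <- 5   # illustrative value
x <- 0:10     # illustrative range of counts

by_hand <- exp(-lambda) * lambda^x / factorial(x)   # the PMF formula, term by term
built_in <- dpois(x, lambda = lambda)

all.equal(by_hand, built_in)   # TRUE if the two agree up to rounding error
```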

Simulate Draws from the Poisson to Estimate $\lambda$

poisSamp<-rpois(n=1000, lambda=5)
mean(poisSamp)

## [1] 5.002

var(poisSamp)

## [1] 4.42442

BONUS FUN! Normal Approximation to Binomial

## VARIABLES FOR SIM
nsim<-1000
this_n<-50
this_p<-.5

# RANDOM DRAWS
samp<-rbinom(n=nsim, size=this_n, prob=this_p)

# MEAN AND STANDARD DEVIATION
m<-this_n*this_p
std<-sqrt(this_n*this_p*(1-this_p))

hist(samp, density=20, breaks=20, prob=TRUE, 
     xlab="Binomial", 
     xlim=c(0, this_n),
     main="Normal Curve Over Histogram")
curve(dnorm(x, mean=m, sd=std), 
      col="darkblue", lwd=2, add=TRUE, yaxt="n")
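
To put a number on how close the approximation is, one can compare an exact binomial probability with its normal counterpart. The sketch below reuses the same n and p as the plot. The cutoff of 20 and the continuity correction of 0.5 are choices made here for illustration.

```r
this_n <- 50
this_p <- 0.5
m <- this_n * this_p
std <- sqrt(this_n * this_p * (1 - this_p))

cutoff <- 20   # illustrative cutoff: P(X <= 20)

exact <- pbinom(cutoff, size = this_n, prob = this_p)
approx_cc <- pnorm(cutoff + 0.5, mean = m, sd = std)   # with continuity correction
approx_raw <- pnorm(cutoff, mean = m, sd = std)        # without continuity correction

c(exact = exact, with_correction = approx_cc, without_correction = approx_raw)
```

The continuity correction treats the integer 20 as the interval from 19.5 to 20.5, which is why the corrected value tracks the exact binomial probability more closely here.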