Math 365 Poisson Process

Due: ¾/2014

Poisson process simulations

Running the simulation code and plotting for \( \lambda \in \{1,2,3,4,5,6,7\} \):

# Poisson process simulations
for (lambda in 1:7) {
  maxcustomers <- 100

  # independent exponentially distributed waiting times T_n
  T <- rexp(maxcustomers, rate = lambda)
  hist(T, xlab = "nth customer", ylab = "Interarrival time T_n", main = paste("lambda=", 
      lambda))

  # cumulative sum Y_n=T_1+...T_n
  Y <- cumsum(T)
  plot(Y, xlab = "nth customer", ylab = "Arrival time Y_n")
}

plot of chunk unnamed-chunk-1

The exponential distribution of arrival times decays at a much faster rate as \( \lambda \) is increased. This makes sense because the time constant of the exponential is related by \[ \lambda = \frac{1}{\tau} \].
- In the case \( \lambda=1 \), we see that the average arrival rate \( \frac{100}{97}=1.031 \).
- In the case \( \lambda=2 \), we see that the average arrival rate \( \frac{100}{55}=1.818 \).
- In the case \( \lambda=4 \), we see that the average arrival rate \( \frac{100}{27}=3.703 \).
- In the case \( \lambda=6 \), we see that the average arrival rate \( \frac{100}{16}=6.250 \).
Slope should be equal to \( \lambda \)

Real life example

  f <- read.table(file = "CoalMineAccidents.csv", sep = ",", header = TRUE)
  summary(f)

  ##       Days        Casualties   
  ##  Min.   :   0   Min.   : 10.0  
  ##  1st Qu.:  34   1st Qu.: 16.0  
  ##  Median :  96   Median : 27.0  
  ##  Mean   : 159   Mean   : 55.7  
  ##  3rd Qu.: 215   3rd Qu.: 63.0  
  ##  Max.   :1205   Max.   :439.0

  hist(f[, "Days"], freq = FALSE)  # density plot

plot of chunk unnamed-chunk-3

Estimating \( \lambda = 0.0065 \)

lambda <- 0.0065
hist(f[, "Days"], freq = FALSE, main = paste("Density estimate lambda=", lambda))  # density plot
lines(lambda * exp(-lambda * 0:1200))

plot of chunk unnamed-chunk-4

```
Y <- cumsum(f[, "Days"])
plot(Y, xlab = "Number of Days", ylab = "Cumulative sum of accidents")
```
For the most part yes, it seems like the slope of the cumuative sum function is independent of time, which implies that the probability density function is time homogeneous. However, this does not hold for the interval \( [120,153] \) where the slope (and hence the PDF) seems to vary over time.

f.subset1 <- f[1:120, "Days"]
lambda1 <- 0.009
hist(f.subset1, freq = FALSE, main = paste("Density estimate lambda=", lambda1))  # density plot
lines(lambda1 * exp(-lambda1 * 0:1200))

plot of chunk unnamed-chunk-6


f.subset2 <- f[121:153, "Days"]
lambda2 <- 0.0022
hist(f.subset2, freq = FALSE, main = paste("Density estimate lambda=", lambda2))  # density plot
lines(lambda2 * exp(-lambda2 * 0:1200))

plot of chunk unnamed-chunk-6

Our estimates on the separated datasets are \( \lambda_1=0.009 \) and \( \lambda_2=0.0022 \).