Due: ¾/2014
Feynman Liang (feynman.liang@gmail.com)
Running the simulation code and plotting for \( \lambda \in \{1,2,3,4,5,6,7\} \):
# Poisson process simulations
for (lambda in 1:7) {
maxcustomers <- 100
# independent exponentially distributed waiting times T_n
T <- rexp(maxcustomers, rate = lambda)
hist(T, xlab = "nth customer", ylab = "Interarrival time T_n", main = paste("lambda=",
lambda))
# cumulative sum Y_n=T_1+...T_n
Y <- cumsum(T)
plot(Y, xlab = "nth customer", ylab = "Arrival time Y_n")
}
The exponential distribution of arrival times decays at a much faster rate as \( \lambda \) is increased. This makes sense because the time constant of the exponential is related by \[ \lambda = \frac{1}{\tau} \].
Slope should be equal to \( \lambda \)
1.
f <- read.table(file = "CoalMineAccidents.csv", sep = ",", header = TRUE)
summary(f)
## Days Casualties
## Min. : 0 Min. : 10.0
## 1st Qu.: 34 1st Qu.: 16.0
## Median : 96 Median : 27.0
## Mean : 159 Mean : 55.7
## 3rd Qu.: 215 3rd Qu.: 63.0
## Max. :1205 Max. :439.0
2.
hist(f[, "Days"], freq = FALSE) # density plot
Estimating \( \lambda = 0.0065 \)
lambda <- 0.0065
hist(f[, "Days"], freq = FALSE, main = paste("Density estimate lambda=", lambda)) # density plot
lines(lambda * exp(-lambda * 0:1200))
Y <- cumsum(f[, "Days"])
plot(Y, xlab = "Number of Days", ylab = "Cumulative sum of accidents")
For the most part yes, it seems like the slope of the cumuative sum function is independent of time, which implies that the probability density function is time homogeneous. However, this does not hold for the interval \( [120,153] \) where the slope (and hence the PDF) seems to vary over time.
f.subset1 <- f[1:120, "Days"]
lambda1 <- 0.009
hist(f.subset1, freq = FALSE, main = paste("Density estimate lambda=", lambda1)) # density plot
lines(lambda1 * exp(-lambda1 * 0:1200))
f.subset2 <- f[121:153, "Days"]
lambda2 <- 0.0022
hist(f.subset2, freq = FALSE, main = paste("Density estimate lambda=", lambda2)) # density plot
lines(lambda2 * exp(-lambda2 * 0:1200))
Our estimates on the separated datasets are \( \lambda_1=0.009 \) and \( \lambda_2=0.0022 \).