You live exactly 35 minutes from work (driving and getting to your desk from the parking garage) under perfect driving conditions (no traffic, construction, red lights, slow drivers, or weather issues) and must be at work by 9 am (if you are late one more time, you are fired).
You are fairly slow getting ready in the morning, so no matter how hard you try, you just can’t get out of the door before 8:05 am. That means you have 20 minutes of spare time for things like traffic, construction, red lights, and slow drivers.
There are two routes you can take, and the likelihood and severity of different kinds of delays are indicated. It is snowing today, and snow on this route typically means an additional 10 minutes with a standard deviation of 20.
Use simulation to determine which route will provide you with a higher probability of getting to work on time. Show and annotate your code (placed after your answers).
Route A:
Normal traffic has a mean of 1.5 minutes with a standard deviation of 2 minutes.
It is off-season for construction, so the likelihood of encountering emergency construction is about 3.9%, and if it is encountered, the median delay is 10 minutes with the minimum recorded being 1 minute and the max 260 minutes.
This route has no traffic lights! :)
Route B:
Normal traffic has a mean of 6 minutes with a standard deviation of 4 minutes.
It is off-season for construction, so the likelihood of encountering emergency construction is about 1.5%, and if it is encountered, the median delay is 5 minutes with the minimum recorded being 0 minutes and the max 300 minutes.
There are 7 red lights. Each occurs with an independent probability of 33%. The length of a red light is fixed at 1 minute.
Evaluation metrics are going to consider:
1. How you determine the test or approach you are going to use.
Answer: One method is to calculate the mathematical expectation (also known as the expected value or first moment) of the time spent on each event. Another method is to run simulations and compute the mean time spent on each event. In either case, the summed delay for each route should fall as far within the 20 minutes of spare time as possible.
2. Explicitly mention what you need to use for this approach, e.g., the parameters, statistics, predictors necessary.
Answer: Three types of distributions are involved in this case: (1) the normal (Gaussian) distribution, (2) the triangular (Simpson’s) distribution, and (3) the binomial distribution. The 20 minutes of spare time may be used up by four events: (i) traffic, (ii) construction, (iii) red lights, and (iv) snowy weather.
The delays from snowy weather and traffic follow normal distributions. The delay from construction follows a triangular distribution. The delay from waiting at red lights follows a binomial distribution.
The normal distribution has two parameters: mean and standard deviation. The triangular distribution has three parameters: lower limit, upper limit, and mode. Only the median is given in this case, so it is used in place of the mode. The binomial distribution has two parameters: the number of trials and the success probability of each trial.
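Because the problem reports a median while the triangular distribution is parameterised by its mode, it may be worth checking what median the fitted distributions actually have. The sketch below uses the closed-form median of a triangular distribution; `triangle_median` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Closed-form median of a triangular distribution with
# lower limit a, upper limit b, and mode c (hypothetical helper).
triangle_median <- function(a, b, c) {
  if ((c - a) / (b - a) >= 0.5) {
    # the median lies to the left of the mode
    a + sqrt((b - a) * (c - a) / 2)
  } else {
    # the median lies to the right of the mode
    b - sqrt((b - a) * (b - c) / 2)
  }
}

# Medians implied by plugging the stated medians in as modes
median_A <- triangle_median(1, 260, 10)
median_B <- triangle_median(0, 300, 5)
print(round(c(A = median_A, B = median_B), 2))
```

Under this substitution the implied medians come out near 80 and 90 minutes, far above the stated 10 and 5 minutes, so the triangular model should be read as a rough approximation of the construction delay rather than an exact fit.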
3. Report the results of your approach.
Answer: The expected delay from snowy weather is 10 minutes. The expected traffic delay is 1.5 minutes for Route A and 6 minutes for Route B. The expected construction delay is the probability of construction times the triangular mean, (minimum + mode + maximum)/3, with the median standing in for the mode: \(0.039 \times \frac{10+1+260}{3}=3.523\) minutes for Route A and \(0.015 \times \frac{5+0+300}{3}=1.525\) minutes for Route B. The expected red-light delay for Route B is \(7 \times 0.33 \times 1=2.31\) minutes.
Route A has a sum of \(10+1.5+3.523=15.023\) minutes, while Route B has a sum of \(10+6+1.525+2.31=19.835\) minutes. Also, the simulation method is covered in the code chunk.
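The arithmetic above can be reproduced in a few lines of R. This is only a check of the stated sums; the variable names are chosen for this illustration.

```r
# Expected delay (minutes) from each source, using the stated parameters
snow_mean    <- 10
traffic_mean <- c(A = 1.5, B = 6)

# Triangular mean = (min + mode + max) / 3, weighted by the
# probability that construction is encountered
constr_mean <- c(A = 0.039 * (1 + 10 + 260) / 3,
                 B = 0.015 * (0 + 5 + 300) / 3)

# Binomial mean = trials * probability, times 1 minute per red light
light_mean <- c(A = 0, B = 7 * 0.33 * 1)

expected_total <- snow_mean + traffic_mean + constr_mean + light_mean
print(round(expected_total, 3))
```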
4. The written communication of the process and final recommendation.
Answer: Route A has the smaller expected delay; hence it is recommended. As for the simulation method (N=100,000), it is crucial to replace the negative values in the sample with 0, because a delay cannot be negative. The random seed is set to 525 for your record. Figure 1 shows the density plots of the extra time on each route (Route A is in #fb3675 ruby color; Route B is in #00c0c5 turquoise color; the dashed line indicates 20 minutes). Route A has a simulated mean of 19.30353 minutes, while Route B has a simulated mean of 23.98242 minutes. Both are higher than the analytic sums above because truncating negative draws at 0 raises the mean of the snow and traffic delays. Still, Route A is recommended.
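The gap between the analytic sums and the simulated means can be checked in closed form. For \(X \sim N(\mu, \sigma)\), the mean after replacing negative values with 0 is \(\mu\,\Phi(\mu/\sigma) + \sigma\,\varphi(\mu/\sigma)\). The sketch below applies that formula to the snow and traffic delays and keeps the construction and red-light terms unchanged; `mean_trunc0` is a hypothetical helper named for this illustration.

```r
# Mean of max(0, X) for X ~ Normal(mu, sigma)
mean_trunc0 <- function(mu, sigma) {
  mu * pnorm(mu / sigma) + sigma * dnorm(mu / sigma)
}

snow_adj    <- mean_trunc0(10, 20)
traffic_adj <- c(A = mean_trunc0(1.5, 2), B = mean_trunc0(6, 4))

# Construction and red-light expectations are unaffected by truncation
constr_mean <- c(A = 0.039 * (1 + 10 + 260) / 3,
                 B = 0.015 * (0 + 5 + 300) / 3)
light_mean  <- c(A = 0, B = 7 * 0.33)

adjusted_total <- snow_adj + traffic_adj + constr_mean + light_mean
print(round(adjusted_total, 2))
```

The adjusted expectations come out at roughly 19.2 and 23.9 minutes, in line with the simulated means once sampling noise is allowed for.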
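# rtriangle() comes from the triangle package; qplot() comes from ggplot2
library(triangle)
library(ggplot2)
N <- 100000    # number of simulated commutes
set.seed(525)  # fixed seed for reproducibility
# snow and traffic delays in minutes: rnorm(n, mean, sd)
snow <- rnorm(N,10,20)
trafficA <- rnorm(N,1.5,2)
trafficB <- rnorm(N,6,4)
# construction delay if encountered: rtriangle(n, min, max, mode), median used as mode
constrA <- rtriangle(N,1,260,10)
constrB <- rtriangle(N,0,300,5)
# number of red lights hit out of 7, each with probability 0.33
redlight <- rbinom(N,7,0.33)
# a delay cannot be negative, so replace the negative draws with 0
snow[snow<0] <- 0
trafficA[trafficA<0] <- 0
trafficB[trafficB<0] <- 0
# construction is weighted by its probability: this reproduces its expected delay
# but does not simulate whether construction occurs on a given day
routeA <- snow+trafficA+0.039*constrA
routeB <- snow+trafficB+0.015*constrB+1*redlight
data <- data.frame("time"=c(routeA,routeB),"route"=factor(c(rep("A",N),rep("B",N))))
# Figure 1: density of the total extra time per route, dashed line at 20 minutes
qplot(time,data=data,geom="density",color=route)+
  ggtitle("Figure 1. Density of the extra time by two routes")+
  labs(x="Time (minute)",y="Density")+theme_classic()+
  geom_vline(xintercept=20,linetype="dashed")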
mean(routeA)
## [1] 19.30353
mean(routeB)
## [1] 23.98242
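Since the question asks for the probability of arriving on time, a variant of the simulation can estimate that probability directly. The sketch below is one possible approach, not the code that produced the means above: it draws construction as an occur/not-occur event with `rbinom` instead of weighting every delay by its probability, and it uses a base-R inverse-CDF sampler in place of `rtriangle`. The helpers `rtri` and `simulate_route` are hypothetical names introduced for this illustration, and the median is again assumed to stand in for the triangular mode.

```r
# Inverse-CDF sampler for a triangular distribution (base R only)
rtri <- function(n, a, b, c) {
  u  <- runif(n)
  fc <- (c - a) / (b - a)  # CDF value at the mode
  ifelse(u < fc,
         a + sqrt(u * (b - a) * (c - a)),
         b - sqrt((1 - u) * (b - a) * (b - c)))
}

# Total extra time (minutes) for n simulated commutes on one route
simulate_route <- function(n, traffic_mean, traffic_sd, p_constr,
                           tri_min, tri_max, tri_mode,
                           n_lights = 0, p_red = 0) {
  snow    <- pmax(rnorm(n, 10, 20), 0)                    # no negative delays
  traffic <- pmax(rnorm(n, traffic_mean, traffic_sd), 0)
  # construction either happens (1) or not (0) on each commute
  constr  <- rbinom(n, 1, p_constr) * rtri(n, tri_min, tri_max, tri_mode)
  lights  <- rbinom(n, n_lights, p_red) * 1               # 1 minute per red light
  snow + traffic + constr + lights
}

set.seed(525)
N <- 100000
delay_A <- simulate_route(N, 1.5, 2, 0.039, 1, 260, 10)
delay_B <- simulate_route(N, 6, 4, 0.015, 0, 300, 5, n_lights = 7, p_red = 0.33)

# On time means the total extra time fits within the 20 spare minutes
p_on_time <- c(A = mean(delay_A <= 20), B = mean(delay_B <= 20))
print(p_on_time)
```

Comparing `p_on_time["A"]` with `p_on_time["B"]` answers the question in the form it was asked; under these assumptions Route A should again come out ahead.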