Data: WorkTravel
My commute consists of roughly 2 hours to work and 2 hours back.
Method1: Bus-Train with wait and traffic time
Method2: Train-Ferry with wait but no traffic time
Method3: Car with no wait but traffic time
The data was inserted into a Postgres database and then exported to a CSV file, worktravel.csv. It covers 3 weeks, with each week using one of the three methods described above.
wt <- read.table("C:\\worktravel.csv", sep=",", header = TRUE)
head(wt)
## traveldate traveltype routeid waittime traffictime method traveltime
## 1 2015-06-29 Leaving R1 10 0 Bus-Trains 95
## 2 2015-06-29 Arriving R1 15 15 Bus-Trains 95
## 3 2015-06-30 Leaving R1 10 0 Bus-Trains 95
## 4 2015-06-30 Arriving R1 20 15 Bus-Trains 95
## 5 2015-07-01 Leaving R1 5 0 Bus-Trains 95
## 6 2015-07-01 Arriving R1 0 10 Bus-Trains 95
## travelcost
## 1 6.75
## 2 6.75
## 3 6.75
## 4 6.75
## 5 6.75
## 6 6.75
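# plyr provides ddply(), used below for the grouped summary
suppressMessages(library(plyr))
# Total time per trip = traffic delay + waiting time + travel time (minutes)
wt_sub_tr <- data.frame(type = wt$traveltype, method = wt$method,
                        totaltime = wt$traffictime + wt$waittime + wt$traveltime,
                        cost = wt$travelcost)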
head(wt_sub_tr)
## type method totaltime cost
## 1 Leaving Bus-Trains 105 6.75
## 2 Arriving Bus-Trains 125 6.75
## 3 Leaving Bus-Trains 105 6.75
## 4 Arriving Bus-Trains 130 6.75
## 5 Leaving Bus-Trains 100 6.75
## 6 Arriving Bus-Trains 105 6.75
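# Mean and standard deviation of total time, and cost per trip, by direction and method
wt_summary <- ddply(wt_sub_tr, .(type, method), summarise,
                    mean = mean(totaltime),
                    sd = sd(totaltime),
                    cost = mean(cost))  # same as sum(cost)/5 with 5 trips per group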
head(wt_summary)
## type method mean sd cost
## 1 Arriving Bus-Trains 119.0 9.617692 6.75
## 2 Arriving Car 100.0 6.123724 10.00
## 3 Arriving Train-Ferry-Train 139.0 8.944272 2.75
## 4 Leaving Bus-Trains 102.6 2.509980 6.75
## 5 Leaving Car 89.2 2.774887 10.00
## 6 Leaving Train-Ferry-Train 131.8 2.049390 2.75
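# boxplot() draws one box per method whether method is a factor or a character column
boxplot(totaltime ~ method, data = wt_sub_tr,
        main = "Work Travel",
        xlab = "Travel Methods",
        ylab = "Total Travel Time (minutes)",
        col = c("gold", "blue", "red"))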
With the summary table and the plot, we can conclude that the car is the fastest option and the ferry route is the most economical. However, by alternating routes throughout the week, I can save on both time and money.
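Both claims can be read directly off the summary table. As a minimal sketch, the block below re-enters the printed `wt_summary` values by hand (rather than reading worktravel.csv) and looks up the method with the lowest mean time and the lowest cost. The names `summ`, `fastest`, and `cheapest` are introduced only for this illustration.

```r
# Values copied by hand from the printed wt_summary output above
summ <- data.frame(
  type   = rep(c("Arriving", "Leaving"), each = 3),
  method = rep(c("Bus-Trains", "Car", "Train-Ferry-Train"), times = 2),
  mean   = c(119.0, 100.0, 139.0, 102.6, 89.2, 131.8),
  cost   = c(6.75, 10.00, 2.75, 6.75, 10.00, 2.75),
  stringsAsFactors = FALSE
)

# Fastest method in each direction (lowest mean total time)
fastest <- sapply(split(summ, summ$type),
                  function(d) d$method[which.min(d$mean)])

# Cheapest method per trip (cost is the same in both directions)
cheapest <- summ$method[which.min(summ$cost)]

print(fastest)
print(cheapest)
```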
What is the probability that, if I take the bus to work, my wait time is no more than 5 minutes?
Let us look at the numbers for the bus only:
subset(wt, traveltype=="Leaving" & method=="Bus-Trains")
## traveldate traveltype routeid waittime traffictime method traveltime
## 1 2015-06-29 Leaving R1 10 0 Bus-Trains 95
## 3 2015-06-30 Leaving R1 10 0 Bus-Trains 95
## 5 2015-07-01 Leaving R1 5 0 Bus-Trains 95
## 7 2015-07-02 Leaving R1 8 0 Bus-Trains 95
## 9 2015-07-03 Leaving R1 5 0 Bus-Trains 95
## travelcost
## 1 6.75
## 3 6.75
## 5 6.75
## 7 6.75
## 9 6.75
Based on the sample gathered, 2 of the 5 recorded waits were 5 minutes or less, so the probability that my wait time will be no more than 5 minutes is 2/5 = 0.4.
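The same figure can be computed rather than counted by eye. This is a minimal sketch that re-enters the five waiting times from the output above into a vector (the name `bus_wait` is hypothetical); with the full data loaded, the vector would come from the `subset()` call instead.

```r
# Waiting times in minutes for the bus trips to work,
# copied from the subset output above
bus_wait <- c(10, 10, 5, 8, 5)

# Empirical probability: share of observed waits that are 5 minutes or less
p_short_wait <- mean(bus_wait <= 5)
print(p_short_wait)
```

With only five observations this is a rough estimate; a longer record of bus weeks would tighten it.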
I left work at 3:30pm and reached home at 5:35pm, a trip of 125 minutes. My husband thinks that since the car was at home and the trip took over 2 hours, I must have taken the ferry. What is the probability that he is right?
subset(wt_sub_tr, (method=="Bus-Trains" | method=="Train-Ferry-Train") & type=="Arriving")
## type method totaltime cost
## 2 Arriving Bus-Trains 125 6.75
## 4 Arriving Bus-Trains 130 6.75
## 6 Arriving Bus-Trains 105 6.75
## 8 Arriving Bus-Trains 120 6.75
## 10 Arriving Bus-Trains 115 6.75
## 12 Arriving Train-Ferry-Train 140 2.75
## 14 Arriving Train-Ferry-Train 145 2.75
## 16 Arriving Train-Ferry-Train 130 2.75
## 18 Arriving Train-Ferry-Train 150 2.75
## 20 Arriving Train-Ferry-Train 130 2.75
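Treating the bus and the ferry as equally likely beforehand (a prior of 1/2 each), all 5 ferry trips home and 2 of the 5 bus trips home took at least 125 minutes, so by Bayes' theorem:
\[P(\text{ferry} \mid \text{travel} \geq 125\text{ mins}) = \frac{P(\text{travel} \geq 125 \mid \text{ferry})\,P(\text{ferry})}{P(\text{travel} \geq 125 \mid \text{ferry})\,P(\text{ferry}) + P(\text{travel} \geq 125 \mid \text{bus})\,P(\text{bus})}\]
\[= \frac{5/5 \times 1/2}{(5/5 \times 1/2) + (2/5 \times 1/2)} = \frac{5}{7} \approx 0.7143\]
The probability that he is right is about 71.43%.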
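The Bayes calculation can also be checked numerically. The sketch below re-enters the ten trip times home from the subset output above and assumes, as the formula does, a prior of 1/2 for each method. All variable names are introduced for this illustration only.

```r
# Total times (minutes) for trips home, copied from the subset output above
bus_home   <- c(125, 130, 105, 120, 115)
ferry_home <- c(140, 145, 130, 150, 130)

# Assumed prior: bus and ferry equally likely before seeing the travel time
prior_bus   <- 0.5
prior_ferry <- 0.5

# Likelihoods: share of each method's trips that took at least 125 minutes
lik_bus   <- mean(bus_home >= 125)    # 2 of 5
lik_ferry <- mean(ferry_home >= 125)  # 5 of 5

# Bayes' theorem: P(ferry | travel >= 125)
posterior_ferry <- (lik_ferry * prior_ferry) /
  (lik_ferry * prior_ferry + lik_bus * prior_bus)
print(round(posterior_ferry, 4))
```

Changing `prior_bus` and `prior_ferry` shows how much the answer depends on the equal-prior assumption.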