Author: Marcus Poulton, Birkbeck, University of London, August 2017
This is an R Markdown Notebook that compares actual ambulance journey times against predictions produced by the Google Maps Distance Matrix API. A set of nearly 5000 actual ambulance journeys was kindly provided by London Ambulance Service. These journeys were undertaken in London toward critically ill patients, using blue lights and sirens. Two types of vehicle were used: Ambulances and Fast Response Vehicles. This document attempts to characterise the differences in emergency vehicle transit times compared with normal road users in an urban setting.
This document is available on RPubs
For each journey the following information was provided:
From this information we wrote software that called the Google Maps Distance Matrix API for each journey to obtain journey time and distance estimates. Unfortunately, the Google API does not accept retrospective journey times; only departure times in the future are allowed. We therefore repeatedly added 7 days to each journey start time in our sample set until that start time was in the future, which preserves the day of the week and the time of day. Because of this, Google can only make limited use of the traffic conditions at the time the API request was made.
Note: An alternative method is to call the API at the same time of week as each journey in our sample, but the test would then take at least a week to complete.
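The seven-day shift described above can be sketched as follows. This is a minimal illustration rather than the software used for the study: the function name `shift_to_future`, the sample timestamps, and the use of timezone-naive datetimes are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def shift_to_future(start, now, step_days=7):
    """Move `start` forward in whole weeks until it is strictly later than `now`.

    Adding multiples of 7 days keeps the day of week and the time of day unchanged.
    """
    step = timedelta(days=step_days)
    shifted = start
    while shifted <= now:
        shifted += step
    return shifted

# Hypothetical journey start (a Monday morning) and hypothetical API request time
original = datetime(2017, 3, 6, 8, 15)
request_time = datetime(2017, 8, 1, 12, 0)

departure = shift_to_future(original, request_time)
print(original.strftime("%A %H:%M"), "->", departure.isoformat())
```

The shifted time falls in the first week after the request, so the estimate is requested for the nearest future occurrence of the same weekly time slot.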
The CSV file used in this document has the following information for each journey:
Google returns two journey times when it estimates a route. First, it reports the estimated journey time using a standard model. There is little information on how this model is implemented. The API also reports an estimated journey time taking into account traffic conditions. Again, there is little information on how this is calculated, other than it is based on historic traffic conditions and live traffic.
The first plot does not consider ambulance journey times at all but simply compares the two Google estimators. This plot shows that the estimated journey time in traffic does vary with the estimate produced by the standard model; however, the spread is relatively contained.
We can see in the next plot how the estimated journey time, taking into consideration the traffic conditions, varies from the estimate based solely on the standard model. This variation increases, as expected, as the journey length increases. Again, this plot uses only the two Google estimators for the sample journeys and does not include actual journey time.
Does the difference between the Google predicted journey time and the journey time factored for traffic vary by hour of day? First we look at the number of samples we have for each hour of the day by vehicle type. Most emergency journeys occur during the day and the first part of the night. Between midnight and 5am the number of journeys is relatively low. In addition, the ratio of ambulance journeys to fast response vehicle journeys is higher during the day.
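Counting samples per hour of day and vehicle type can be sketched as below. The rows are invented for illustration and the vehicle type labels are assumed spellings; they are not taken from the London Ambulance Service data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: (journey start, vehicle type)
journeys = [
    (datetime(2017, 3, 6, 8, 15), "Ambulance"),
    (datetime(2017, 3, 6, 8, 40), "Fast Response Vehicle"),
    (datetime(2017, 3, 7, 8, 5), "Ambulance"),
    (datetime(2017, 3, 7, 23, 30), "Ambulance"),
    (datetime(2017, 3, 8, 2, 10), "Fast Response Vehicle"),
]

# Key each journey by (hour of day, vehicle type) and count occurrences
counts = Counter((start.hour, vehicle) for start, vehicle in journeys)

for (hour, vehicle), n in sorted(counts.items()):
    print(f"{hour:02d}:00  {vehicle:<22} {n}")
```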
The following boxplot shows the variation in estimated journey time through the day. There is very little variation in the estimates throughout the day.
Again, the estimated distance travelled by hour of day does not appear to vary much, with a slight increase around 5am. This is due to the ratio of incidents and available resources remaining roughly constant.
The first plot is a histogram of the actual journey times in our dataset, with an associated QQ plot. The histogram is overlaid with an estimated log-normal curve. The QQ plot indicates that journey times follow the log-normal distribution closely up to about 20 minutes. Additionally, we can see that there are some outliers where journey times are inexplicably long. Initially, we fit a 99th percentile line (vertical red dotted line).
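One way to estimate a log-normal curve and place a 99th percentile line is sketched below. The text does not say whether the percentile line comes from the raw data or from the fitted curve; this sketch assumes the fitted curve, and the journey times are invented for illustration.

```python
import math
from statistics import NormalDist, fmean, stdev

def fit_lognormal(durations):
    """Estimate log-normal parameters as the mean and sample standard
    deviation of log(duration)."""
    logs = [math.log(d) for d in durations]
    return fmean(logs), stdev(logs)

def lognormal_quantile(mu, sigma, p):
    """Quantile of a log-normal distribution with log-scale parameters mu, sigma."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(p))

# Hypothetical journey times in seconds, including one unusually long journey
durations = [240, 310, 365, 420, 480, 540, 615, 700, 830, 1900]

mu, sigma = fit_lognormal(durations)
cutoff = lognormal_quantile(mu, sigma, 0.99)
print(f"fitted median: {math.exp(mu):.0f} s, 99th percentile: {cutoff:.0f} s")
```

Fitting on the log scale reduces the problem to a normal fit, which is why only the mean and standard deviation of the logged values are needed.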
Next we plot ambulance journey time vs. estimated journey time using the Google Maps Distance Matrix API. We do this initially without removing outliers. We can see that quite a few journeys took considerably longer than the Google estimate. We assume from previous research that these are either artifacts of the ambulance telemetry reporting mechanism or errors in operational procedure and can be considered as outliers.
Here we remove all journeys that took more than 1500 seconds and show a scatter plot of the actual vs estimated journey times. Note that the estimate provided by Google is based solely on the standard model and does not include allowance for traffic conditions at the time of day that the journey was undertaken. At this stage we plot both vehicle types simultaneously.
The green line is the regression line and the red line is a local polynomial regression fitting (loess). The polynomial is very similar to the straight line regression indicating an almost linear relationship between the standard model and the actual journey times. This encourages us to investigate whether a simple coefficient could be used to translate car traffic model times into ambulance times.
We repeat the analysis using the estimates that take into account predicted traffic conditions at the time of day that the journey was undertaken. Here the regression and loess lines are very similar to the previous chart, indicating little benefit in using the traffic model.
We now look at each of the two vehicle types in turn: ambulances and fast response vehicles.
The first boxplot is the Google estimated journey time in traffic by actual duration for ambulances. The box represents the first and third quartiles and the notch at the end of each whisker approximately represents the 95% confidence interval. Assuming a linear model with an intercept of 0, the green regression line on the plot gives us an estimate that ambulances are 1.355 times quicker than regular road traffic.
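A linear model with an intercept of 0 has a closed-form least-squares slope, sketched below. Which variable was regressed on which is not stated in the text, so this example assumes the Google estimate is modelled as a multiple of the actual ambulance time; the paired durations are invented for illustration.

```python
def slope_through_origin(x, y):
    """Least-squares slope k for the model y = k * x (intercept fixed at 0)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx

# Hypothetical paired durations in seconds
actual = [300, 420, 510, 640, 780]
google_in_traffic = [410, 560, 700, 860, 1060]

k = slope_through_origin(actual, google_in_traffic)
print(f"Google estimate is about {k:.3f} x actual ambulance time")
```

In R, the same kind of model can be fitted with `lm(y ~ 0 + x)`, where the `0` removes the intercept term.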
We calculate the error in prediction as \(\Delta = \text{Estimated Duration} - \text{Actual Duration in Traffic}\) and then plot the delta against the actual journey time. Here the mean delta decreases in value as journey time increases, indicating that Google overestimates the journey time. Put another way, the ambulance arrives faster than predicted. As expected, the prediction error grows as the journey time increases.
Fast response vehicles are generally used for shorter journeys, but the regression and loess lines are again similar to those for ambulances. Again, assuming a linear model, the data gives us an estimate that fast response vehicles are marginally quicker than ambulances and approximately 1.552 times quicker than regular road traffic.
Ambulances and fast response vehicles are faster than regular traffic. Ambulances are 1.355 times quicker than regular road traffic. Fast response vehicles are 1.552 times quicker than regular road traffic.
Travel time predictions are most accurate during the day. Prediction worsens in the early hours of the morning.
There is little difference between the Google traffic and model estimates.