How DoorDash achieves fast travel estimates

Online ordering and delivery have become an increasingly popular lifestyle. With just a few taps on the DoorDash app, you can select your favorite store, build your cart, place your order, and have it delivered to your doorstep in minutes. In the DoorDash ecosystem, travel distance and time are critical factors for logistics. But behind this seemingly straightforward process lies the complex challenge of achieving fast routing at the millisecond level.

Why do we need fast routing? Because it powers many aspects of logistics, particularly in the early stages, where the scale and scope are massive and latency critically impacts user experience. Fast routing is essential for identifying nearby merchants, estimating ETAs and fees, and selecting suitable dashers. The efficiency of these operations is crucial to maintaining the high standards of service for which DoorDash is known.

Balancing accuracy with fast response times is critical for providing the best experience for both consumers and dashers. One of the main things we work to improve is accuracy, which we measure by how far our estimates deviate from the actual delivery data collected later. Another key focus is latency, where we struggled for a long time to get below the 100 ms level at a reasonable cost. To push beyond these limits, we explored various approaches, ultimately leading to a breakthrough: Geo-Grid-Cache. This system groups locations into cells and serves precomputed center-to-center travel estimates instead of dynamically calculating a route for every request. This technique significantly lowers latency while maintaining a similar level of accuracy. It also offers substantial cost savings by eliminating the need for incremental computations through travel engines.

Table 1: Comparison of accuracy (Deviation), response time (Latency), and system throughput (QPS) across different travel estimation methods. Deviation is defined as the mean absolute percentage error (MAPE) between DoorDash’s ground truth data and the travel estimations.
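As a rough illustration of the Deviation metric defined above, MAPE can be computed as in the sketch below. The function name and the sample travel times are hypothetical and are not DoorDash data; the sketch also assumes every ground-truth value is nonzero.

```python
def mape(actual, estimated):
    """Mean absolute percentage error, in percent, over paired observations.

    Assumes every value in `actual` is nonzero.
    """
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Per-pair absolute error, relative to the ground-truth value.
    errors = [abs(a - e) / abs(a) for a, e in zip(actual, estimated)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical travel times in seconds: ground truth vs. estimates.
actual_s = [600, 900, 1200]
estimated_s = [660, 810, 1200]
deviation = mape(actual_s, estimated_s)
print(f"Deviation (MAPE): {deviation:.2f}%")
```

A lower value means the estimates track the ground truth more closely.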

Table 1 highlights key performance comparisons across different travel estimation methods. Our in-house engine offers better accuracy and significantly higher throughput than 3rd-party solutions, making it a more efficient and reliable choice. The Geo-Grid-Cache breakthrough achieves ultra-low latency while maintaining accuracy close to in-house estimates, providing a major efficiency improvement. On the other hand, straight-line estimates deliver the fastest response times but are significantly less accurate due to their simplified distance assumptions.

A new era of travel time estimates

DoorDash's highly reliable and scalable travel service provides comprehensive support for numerous use cases with diverse priorities, including accuracy, throughput, and concurrency. A significant challenge arises, however, when a use case requires low-latency responses. Engaging any travel engine, such as Open Source Routing Machine (OSRM), typically results in a latency service level agreement (SLA) above 100 ms, which is suboptimal for latency-sensitive applications like homepage refresh, search ranking, or matching dashers with offers at large scale. Many microservices work around this by relying on a simple formula that calculates straight-line distance and time across the Earth's surface. While this approach offers faster responses, it is far less accurate than using precise, real-time travel engine data, especially for long-distance estimations, because it does not factor in the routes that are actually available. What we need instead is a mechanism that provides both rapid and accurate travel estimations.
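The straight-line workaround mentioned above is commonly implemented with the haversine formula. The sketch below is one such implementation, not necessarily the formula any DoorDash service uses; the constant travel speed used to turn distance into time is a purely illustrative assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point drift before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def straight_line_minutes(distance_km, assumed_speed_kmh=30.0):
    """Travel time at a constant speed (hypothetical value; ignores roads and traffic)."""
    return 60.0 * distance_km / assumed_speed_kmh


sf = (37.7749, -122.4194)
san_jose = (37.3382, -121.8863)
d = haversine_km(*sf, *san_jose)
print(f"{d:.1f} km, {straight_line_minutes(d):.0f} min at the assumed speed")
```

For these two city-center coordinates the straight-line distance comes out at roughly 68 km (about 42 miles), which is shorter than any drivable route. That gap is the kind of error a road-aware estimate avoids.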

At first blush, pre-computing seems to offer a means of achieving faster routing without sacrificing accuracy. But pre-computing and storing distance values at massive scale, spanning trillions of point-to-point distance pairs across multiple countries, had previously been considered impossible. So we introduced a geogrid hash system to tackle these challenges. First, we divided DoorDash World, meaning all of our serviceable areas, into cells using H3, a hexagonal hierarchical geospatial indexing system. We then pre-computed the whole adjacency matrix ahead of time using OSRM and cached all travel distances and durations, significantly reducing the need for real-time travel engine computations. This innovative solution marks a new era of fast and efficient routing.
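The pre-computation idea can be sketched in a few lines. To stay self-contained, the sketch below swaps H3 hexagons for a toy square grid and replaces the travel engine with a stand-in function. `cell_of`, `center_of`, `cells_within`, `precompute`, and `fake_engine` are all hypothetical names, and none of this reflects DoorDash's actual pipeline.

```python
import math
from typing import Callable, Dict, Iterable, Iterator, Tuple

Cell = Tuple[int, int]        # (row, col) in a toy square grid, not an H3 index
LatLng = Tuple[float, float]


def cell_of(lat: float, lng: float, size_deg: float) -> Cell:
    """Map a point to the toy grid cell that contains it."""
    return (math.floor(lat / size_deg), math.floor(lng / size_deg))


def center_of(cell: Cell, size_deg: float) -> LatLng:
    """Center point of a toy grid cell."""
    row, col = cell
    return ((row + 0.5) * size_deg, (col + 0.5) * size_deg)


def cells_within(cell: Cell, k: int) -> Iterator[Cell]:
    """All cells within k steps of `cell`, including the cell itself."""
    row, col = cell
    for dr in range(-k, k + 1):
        for dc in range(-k, k + 1):
            yield (row + dr, col + dc)


def precompute(origins: Iterable[Cell], k: int, size_deg: float,
               estimate_fn: Callable[[LatLng, LatLng], float]) -> Dict[Tuple[Cell, Cell], float]:
    """Cache a center-to-center estimate for every (origin, nearby destination) pair."""
    cache = {}
    for origin in origins:
        origin_center = center_of(origin, size_deg)
        for dest in cells_within(origin, k):
            cache[(origin, dest)] = estimate_fn(origin_center, center_of(dest, size_deg))
    return cache


def fake_engine(a: LatLng, b: LatLng) -> float:
    """Stand-in for a travel engine call; returns a made-up cost, not a real duration."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


origin = cell_of(37.7749, -122.4194, 0.01)
cache = precompute([origin], k=2, size_deg=0.01, estimate_fn=fake_engine)
print(len(cache), "cell pairs cached for one origin")
```

Because every request that falls into the same pair of cells shares one cached value, the expensive engine call happens once per cell pair rather than once per request.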

Figure 1: In this section of DoorDash’s new geogrid caching system, you can see that the estimated delivery time to the red house is more refined than routes to more distant destinations. Travel time around unavailable routes can be factored in, which accounts for the significant difference in travel time between the white apartment and the yellow house despite similar straight-line proximity.

As shown in Figure 1, the number in each cell of our geogrid caching system represents a pre-computed travel estimate from the restaurant to the destination. These numbers are computed using our in-house engine with average road traffic. When two points are geographically close, smaller cells with higher resolution can be used, leading to a more accurate estimate. The numbers increase as routes work around areas that can’t be traversed.

For the urban and suburban areas where DoorDash operates, H3’s mid-level resolution typically strikes a good balance between granularity and area coverage. Defining the unit cell was critical. Resolution 10 may be a good option for avoiding excess same-cell queries, but it is not feasible for supporting longer distances, or even a short two-mile radius. On the other hand, resolution 6 may be suitable for longer deliveries, such as from San Francisco to San Jose (50 miles), but it is inaccurate for short-distance queries, whose endpoints often fall into the same cell. The need to balance hit rate, cost, and accuracy led us to a tiering solution in which each point is stored at three resolutions. This design is extensible for future use cases requiring longer distances or deeper resolutions.
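A back-of-the-envelope calculation shows why a single resolution struggles to cover both cases. The sketch below assumes an average H3 cell area of about 36.1 km² at resolution 6 and a factor of roughly 7 between consecutive resolutions. Both are approximations, and the resolutions printed are examples rather than the tiers DoorDash actually uses.

```python
import math

RES6_AVG_AREA_KM2 = 36.129   # approximate average H3 cell area at resolution 6
APERTURE = 7                 # each finer resolution has roughly 1/7 the cell area
KM_PER_MILE = 1.609344


def avg_cell_area_km2(resolution: int) -> float:
    """Approximate average cell area at a given resolution."""
    return RES6_AVG_AREA_KM2 / APERTURE ** (resolution - 6)


def cells_in_radius(resolution: int, radius_miles: float) -> int:
    """Rough count of cells needed to cover a circle of the given radius."""
    radius_km = radius_miles * KM_PER_MILE
    return math.ceil(math.pi * radius_km ** 2 / avg_cell_area_km2(resolution))


for res in (6, 8, 10):
    print(f"res {res}: ~{cells_in_radius(res, 2.0)} cells in 2 mi, "
          f"~{cells_in_radius(res, 50.0)} cells in 50 mi")
```

Under these assumptions, a two-mile radius at resolution 10 already spans on the order of two thousand destination cells per origin, while at resolution 6 the same radius fits within roughly one cell, so most short trips would start and end in the same cell.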

Speeding processing with Spark

Designing the service involved two main components: online productionization and offline data generation. The online flow was relatively straightforward, requiring a stable connection between the production server and the Redis cache cluster. For each point-to-point travel estimation request, the system checks all three tiers of precomputed results and always selects the highest available tier, as illustrated in Figure 2 (left). However, offline data generation posed a much greater challenge. With approximately 6 billion cell-to-cell travel estimates across three resolution tiers, both computational efficiency and storage optimization were critical, as illustrated in Figure 2 (right). After multiple iterations, we chose Databricks’ Spark to handle the task. Spark’s parallelized big data processing capabilities and its ability to install OSRM directly onto the cluster allowed us to replace costly network calls with local HTTP calls, boosting processing speed by up to 10-fold.
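The online flow described above can be sketched as a tiered cache lookup. In this sketch a plain dictionary stands in for the Redis cluster and a toy square grid stands in for H3. The tier names, cell sizes, and key format are hypothetical. Reading "highest available tier" as "finest resolution that has a hit" is an assumption, as is falling back to another estimator on a full miss.

```python
import math
from typing import Callable, Dict, Tuple

LatLng = Tuple[float, float]

# Hypothetical tiers, finest first: (tier name, toy cell size in degrees).
TIERS = (("fine", 0.001), ("mid", 0.01), ("coarse", 0.1))


def cell_id(lat: float, lng: float, size_deg: float) -> str:
    """Toy cell identifier; a real system would use an H3 index here."""
    return f"{math.floor(lat / size_deg)}_{math.floor(lng / size_deg)}"


def cache_key(tier: str, origin_cell: str, dest_cell: str) -> str:
    """Hypothetical key layout for one precomputed cell pair."""
    return f"{tier}:{origin_cell}:{dest_cell}"


def lookup(cache: Dict[str, float], origin: LatLng, dest: LatLng,
           fallback: Callable[[LatLng, LatLng], float]) -> Tuple[float, str]:
    """Return (estimate, source), trying tiers from finest to coarsest."""
    for tier, size in TIERS:
        key = cache_key(tier, cell_id(*origin, size), cell_id(*dest, size))
        hit = cache.get(key)
        if hit is not None:
            return hit, tier
    # Assumed behavior on a full miss: defer to another estimator.
    return fallback(origin, dest), "fallback"


origin, dest = (37.7749, -122.4194), (37.8044, -122.2712)
store = {cache_key("mid", cell_id(*origin, 0.01), cell_id(*dest, 0.01)): 1260.0}
result = lookup(store, origin, dest, fallback=lambda o, d: -1.0)
print(result)
```

Each request costs at most three key lookups, which is what keeps the online path cheap compared with a live routing call.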

Figure 2: Understanding the GeoGrid Caching System. Every cell in DoorDash’s geogrid caching system is uniquely indexed (e.g., 88283ff), similar to how buildings or cities are named. We precompute travel estimates at three resolutions: fine-grained (up to one mile), mid-tier (moderate distances), and large-scale (up to 100 miles) to balance accuracy and efficiency.

Offline data generation is a cornerstone of our geogrid caching system. By leveraging Databricks' Spark-based distributed computing, we have transformed a labor-intensive and time-consuming process into a streamlined operation. Our offline efforts make it possible to maintain routing services that offer high performance and accuracy, ultimately enhancing the experience for dashers, partners, and customers.

Looking forward

By powering local commerce with this innovative new routing strategy, DoorDash is overcoming existing limitations, unlocking new opportunities, and ensuring that every online order is a seamless and satisfying experience. Stay tuned for more exciting updates as DoorDash continues to push the boundaries of innovation and enhance its services for customers and partners alike.
