Running thousands of experiments effectively means carefully balancing our speed with the necessary controls to maintain trust in experimental outputs – but figuring out that balance is never easy.
Category Archives: Data
Adapted Switch-back Testing to Quantify Incrementality for App Marketplace Search Ads
At DoorDash, we use experimentation as one of our robust approaches to validating the incremental return on marketing investment.
Five Common Data Quality Gotchas in Machine Learning and How to Detect Them Quickly
The vast majority of work in developing machine learning models in the industry is data preparation, but current methods require a lot of intensive and repetitive work by practitioners.
Building Scalable Real Time Event Processing with Kafka and Flink
At DoorDash, real-time events are an important data source for gaining insight into our business, but building a system capable of handling billions of real-time events is challenging.
Building a Source of Truth for an Inventory with Disparate Data Sources
Managing inventory becomes a serious challenge when transitioning from food delivery — where the item ordered is prepared on demand — to grocery and alcohol delivery.
Using Back-Door Adjustment Causal Analysis to Measure Pre-Post Effects
When A/B testing is not recommended because of regulatory requirements or technical limitations to setting up a controlled experiment, we can still quickly implement a new feature and measure its effects in a data-driven way.
Meet Dash-AB: The Statistics Engine of Experimentation at DoorDash
For any data-driven company, it’s key that every change is tested by experiments to ensure that it has a positive measurable impact on the key performance metrics.
How We Applied Client-Side Caching to Improve Feature Store Performance by 70%
At DoorDash, we make millions of predictions every second to power machine learning applications that enhance our search, recommendation, logistics, and fraud areas, and scaling these complex systems along with our feature store is a continual challenge.
3 Principles for Building an ML Platform That Will Sustain Hypergrowth
Taking full advantage of a large and diverse set of machine learning (ML) use cases calls for creating a centralized platform that can support new business initiatives, improve user experiences, enhance operational efficiency, and accelerate overall ML adoption.
Making Applications Compatible with Postgres Tables BigInt Update
Previously, DoorDash relied on Postgres as its main data storage and used Python Django database models to define the data.