Data Archives

Introducing Fabricator: A Declarative Feature Engineering Framework

As machine learning (ML) becomes increasingly important across tech companies, feature engineering becomes a bigger focus for improving the predictive power of models.

How to Run Apache Airflow on Kubernetes at Scale

As an orchestration engine, Apache Airflow let us quickly build pipelines in our data infrastructure.

The 4 Principles DoorDash Used to Increase Its Logistics Experiment Capacity by 1000%

In our real-time delivery logistics system, the environment, behavior of Dashers (our term for delivery drivers), and consumer demand are highly volatile.

Building Faster Indexing with Apache Kafka and Elasticsearch

DoorDash describes how it built a faster search index using open source projects.

Overcoming Rapid Growth Challenges for Datasets in Snowflake

A proper optimization framework for data infrastructure streamlines engineering efforts, allowing platforms to scale.

Maintaining Machine Learning Model Accuracy Through Monitoring

Machine learning model drift occurs as data changes, but a robust monitoring system helps maintain integrity.

Building Riviera: A Declarative Real-Time Feature Engineering Framework

In a business with fluid dynamics between customers, drivers, and merchants, real-time data helps us make crucial decisions that grow our business and delight our customers.

How to Drive Effective Data Science Communication with Cross-Functional Teams

Analytics teams focused on detecting meaningful business insights may overlook the need to communicate those insights effectively to the cross-functional partners who can use them to improve the business.

Running Experiments with Google Adwords for Campaign Optimization

Running experiments on marketing channels involves many challenges, yet at DoorDash, we found a number of ways to optimize our marketing with rigorous testing on our digital ad platforms.

Building Flexible Ensemble ML Models with a Computational Graph

DoorDash extended its machine learning platform to support ensemble models.