DoorDash USA • Posted 1 day ago
$130,600 - $285,000/Yr
Full-time • Senior
Seattle, WA

About the position

DoorDash is building the world’s most reliable on-demand logistics engine. Behind the scenes, our Machine Learning Platform (MLP) powers real-time decision-making for millions of orders each day, supporting business-critical use cases like Ads, Groceries, Logistics, Fraud, and Search. We’re looking for a Staff Software Engineer to lead our ML Serving initiatives, enabling seamless, high-performance, and highly scalable model inference at DoorDash. You will guide a small, talented team in developing and operating a next-generation ML serving platform that handles millions of QPS across a global marketplace.

Responsibilities

  • Lead the Vision & Architecture - Set the technical direction for an extremely high-QPS (multi-million QPS) serving platform, enabling rapid and reliable deployment of ML models across a variety of use cases.
  • Build for Scale & Reliability - Own and evolve our model serving stack to ensure zero-downtime, 24/7 operations. You’ll tackle unique scaling challenges around throughput, isolation, and latency.
  • Enable Self-Serve Model Deployments - Develop abstractions that let ML Engineers seamlessly bring their own models (BYOM) and custom GPU-accelerated workloads online.
  • Improve Developer Velocity - Drive innovations that reduce time-to-production. Standardize workflows for deploying, validating, and monitoring ML services with strong observability and debugging capabilities.
  • Collaborate Across the Company - Work closely with teams in Ads, Fraud, Logistics, Groceries, and more to tailor the serving platform for their specific needs while maintaining a core set of robust, reusable components.
  • Mentor & Lead - Guide a small but growing team of senior engineers. Champion best practices, set coding standards, conduct design reviews, and help shape DoorDash’s ML culture.

Requirements

  • 8+ years of industry experience in software engineering, with at least 1 year of technical lead experience.
  • Deep expertise in building large-scale, distributed systems—you’re comfortable architecting services that handle millions of requests per second with single-digit millisecond latencies.
  • Strong knowledge of CS fundamentals and experience with programming languages like Python, Go, Kotlin, C++, or Java.
  • Experience with production ML systems—you’ve built or operated high-QPS inference services, real-time feature stores, or large-scale data pipelines.
  • Passion for reliability & performance—you’ve developed strategies for zero-downtime deployments, high availability, and low-latency serving, and you understand cost vs. performance trade-offs.
  • Track record of technical leadership—you excel at collaboration, driving projects end-to-end, and mentoring other engineers in best practices.

Nice-to-haves

  • GPU experience for ML serving and real-time inference.
  • Familiarity with deep learning frameworks (PyTorch, TensorFlow) and large language models (LLMs) such as GPT or BERT.
  • Experience with microservices and container orchestration (Kubernetes, EKS).
  • Cloud computing experience (AWS, GCP, etc.), including cost attribution and optimization.
  • Background in model lifecycle management (MLflow, ML orchestration systems, or metadata tracking).

Benefits

  • 401(k) plan with an employer match
  • Paid time off
  • Paid parental leave
  • Wellness benefits
  • Paid holidays
  • Medical, dental, and vision benefits
  • Disability and basic life insurance
  • Family-forming assistance
  • Commuter benefit match
  • Mental health program