Jobgether • Posted 3 days ago
$175,900 - $307,800/Yr
Full-time • Senior

About the position

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching. One of our partner companies is currently looking for a Staff Software Engineer, Speculative Decoding in California.

We’re looking for a seasoned engineer with deep experience in Generative AI inference and a strong command of speculative decoding techniques. In this role, you’ll be responsible for developing high-performance, scalable algorithms that enhance speed and accuracy within production-level AI systems. Working in a multi-data center Kubernetes environment, you’ll help design and integrate state-of-the-art decoding methods while driving performance improvements across the inference stack. If you’re passionate about transforming leading-edge AI research into production-ready solutions and mentoring others while doing so, this is the ideal opportunity.

Responsibilities

  • Design and implement speculative decoding algorithms to enhance Generative AI inference performance and efficiency.
  • Optimize system architecture and software infrastructure for real-time, large-scale AI model deployment.
  • Develop and maintain high-performance codebases in C++ and Rust for production-grade distributed systems.
  • Work within a multi-process, Kubernetes-based environment utilizing technologies such as MPI.
  • Partner with software, research, and operations teams to improve model evaluation, post-training processes, and system scalability.
  • Translate recent advancements in AI and speculative decoding into practical, robust implementations.
  • Provide technical leadership and contribute to a culture of innovation, mentorship, and continuous improvement.

Requirements

  • Master’s degree in Computer Science or Electrical Engineering, or equivalent practical experience.
  • 5+ years of hands-on experience in generative AI inference, particularly with speculative decoding.
  • Expertise in C++ with a proven record of building high-performance, distributed systems.
  • Familiarity with PyTorch and performance evaluation methodologies for generative models.
  • Deep understanding of AI infrastructure challenges, model architecture, and scalable deployment.
  • Proficiency with cloud-native tools, Kubernetes environments, and inter-process communication.
  • Strong problem-solving abilities, creativity, and collaboration skills in a fast-paced setting.

Benefits

  • Competitive base salary between $175,900 and $307,800 (based on experience)
  • Equity participation
  • Comprehensive health and wellness benefits
  • Flexible work environment with potential site-based requirements
  • Continuous learning and growth opportunities
  • Inclusive culture committed to diversity, equity, and belonging
  • Opportunity to work at the forefront of AI innovation