Amazon.com
Posted 3 days ago
$151,300 - $261,500/Yr
Mid Level
Cupertino, CA
General Merchandise Retailers

About the position

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the servers that use them. This role is for a software engineer on the Machine Learning Inference Model Enablement team for AWS Neuron at Annapurna Labs. The role is responsible for the development, enablement, and performance tuning of a wide variety of ML model families, including massive-scale large language models such as the Llama family, DeepSeek, and beyond, as well as Stable Diffusion, vision transformers, and many more. The Inference Model Enablement team works side by side with compiler engineers and runtime engineers to create, build, and tune distributed inference solutions with Trainium and Inferentia. Experience optimizing inference performance for both latency and throughput on these large models using Python, PyTorch, or JAX is a must. Experience with DeepSpeed and other distributed inference libraries is a bonus, as extending these techniques for the Neuron-based system is key.

Responsibilities

  • Help lead efforts to build distributed inference support for PyTorch in the Neuron SDK.
  • Tune models to ensure the highest performance and maximize efficiency on AWS Trainium and Inferentia silicon and servers.
  • Design and code solutions to drive efficiencies in software architecture.
  • Create metrics, implement automation and other improvements, and resolve the root cause of software defects.
  • Build high-impact solutions to deliver to a large customer base.
  • Participate in design discussions and code reviews, and communicate with internal and external stakeholders.
  • Work cross-functionally to help drive business decisions with technical input.
  • Work in a startup-like development environment.

Requirements

  • 5+ years of non-internship professional software development experience.
  • 5+ years of non-internship design or architecture experience of new and existing systems.
  • Knowledge of the fundamentals of machine learning and LLMs, including their architecture and their training and inference lifecycles.
  • Experience programming with at least one software programming language.

Nice-to-haves

  • 5+ years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations.
  • Master's degree in computer science or equivalent.

Benefits

  • Medical, financial, and/or other benefits.
  • Equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package.