Tesla, posted 4 days ago
Mid Level
Palo Alto, CA
Motor Vehicle and Parts Dealers

As a Software Engineer within the Autopilot AI Infrastructure team, you will work on reinforcing, optimizing, and scaling our infrastructure components supporting AI research activities for Autopilot and the Tesla Bot. At the core of our autonomy capabilities are neural networks that the research team is designing to train on very large amounts of data, across large-scale GPU clusters and our supercomputer Dojo. Robustly training these models at scale and in the shortest amount of time is critical to our mission. We are building out the Machine Learning Platform that our engineers and leadership use to schedule, manage and monitor machine learning experiments, data pipelines and artifacts. With the ever-increasing size of our datasets and compute clusters, we are looking for an experienced backend engineer to help drive scalability improvements and new capabilities in the platform.

  • Develop and deploy solutions to scale our infrastructure effectively in response to rapidly growing demands
  • Drive implementation of best practices and monitoring systems to proactively detect and address issues in our production environment
  • Work across the stack on tools and infrastructure empowering the machine learning team to be effective. This ranges from developing/running model training and evaluation code to back-end infrastructure to occasional front-end work
  • Coordinate required resources with the team managing the cluster hardware to maintain high availability
  • Work closely with the research team to understand requirements and priorities
  • Expertise in designing scalable and durable distributed systems
  • Strong knowledge of Python/Go and Linux
  • Experience working with diverse backend infrastructure components (SQL / NoSQL databases, caching, message brokers, event streams, monitoring, etc.)
  • Hands-on experience with containerization and orchestration technologies (Docker, Kubernetes) and setting up CI/CD flows
  • Knowledge of front-end development in React / strong product sense
  • Knowledge of machine learning, computer vision, or neural networks
  • Experience working with HPC clusters