Rivian • posted 17 days ago
$129,300 - $161,600/Yr
Full-time • Mid Level
Palo Alto, CA
Transportation Equipment Manufacturing

About the position

Rivian is on a mission to keep the world adventurous forever. This goes for the emissions-free Electric Adventure Vehicles we build and the curious, courageous souls we seek to attract. As a company, we constantly challenge what's possible, never simply accepting what has always been done. We reframe old problems, seek new solutions, and operate comfortably in the unknown. Our backgrounds are diverse, but our team shares a love of the outdoors and a desire to protect it for future generations.

The Autonomy org at Rivian is seeking a talented and motivated Data Engineer to join our Cloud and Data Team. As a Data Engineer, you will play a crucial role in designing, building, and maintaining robust data pipelines to support our data-driven initiatives. Your expertise in AWS, Python, SQL, Apache Spark, and Databricks will be pivotal to the efficiency, scalability, and reliability of our data infrastructure.

Responsibilities

  • Design, develop, and maintain robust, scalable, and low-latency data pipelines to ingest vehicle telemetry and system logs from internal fleets and customer vehicles.
  • Partner with cross-functional teams to understand data requirements for autonomy features and implement reliable data solutions accordingly.
  • Own and evolve the ingestion architecture for new platforms like R2 to ensure seamless integration with existing systems.
  • Develop and optimize ETL processes for key initiatives such as Autonomy Data Recorder (ADR) optimization, metrics tagging, and simulation data ingestion.
  • Build Spark-based processing jobs using Databricks to handle large-scale structured and semi-structured datasets efficiently (see the sketch after this list).
  • Enforce and enhance data quality and validation layers to ensure high integrity of datasets used for model development, metrics analysis, and safety evaluation.
  • Implement observability, automation, and CI/CD for data workflows using tools like Airflow, Kubernetes, and Terraform.
  • Collaborate with data analysts and scientists to enable seamless access to curated and well-documented datasets for analytics and dashboarding use cases.
  • Support global fleet expansion by scaling data infrastructure to comply with regional data privacy regulations and performance SLAs.
  • Contribute to the team's ongoing efforts in documentation, monitoring, alerting, and codebase maintainability.
  • Stay current with modern data engineering technologies, AWS best practices, and Spark ecosystem developments.
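For illustration only, here is a minimal sketch of the kind of Spark-based Databricks job described above: ingesting semi-structured vehicle telemetry from object storage, applying a basic validation layer, and writing a curated Delta table. The bucket paths, column names, and schema are hypothetical placeholders, not Rivian's actual pipeline.

```python
# Hypothetical sketch: paths, columns, and schema are illustrative,
# not Rivian's actual telemetry pipeline.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("telemetry-ingest").getOrCreate()

# Ingest semi-structured JSON telemetry logs from object storage.
raw = (
    spark.read.json("s3://example-bucket/telemetry/raw/")
    .withColumn("ingested_at", F.current_timestamp())
    .withColumn("event_date", F.to_date("event_ts"))
)

# Basic validation layer: drop records missing required identifiers.
curated = raw.filter(
    F.col("vehicle_id").isNotNull() & F.col("event_ts").isNotNull()
)

# Write a curated, partitioned Delta table for downstream analytics.
(
    curated.write.format("delta")
    .mode("append")
    .partitionBy("event_date")
    .save("s3://example-bucket/telemetry/curated/")
)
```

On Databricks, Delta is the default table format; outside Databricks the same code would need the delta-spark package configured.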

Requirements

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field.
  • 3+ years of hands-on experience building data infrastructure on AWS using services like S3, Lambda, Glue, Step Functions, and IAM.
  • Proficiency in Python, SQL, and Apache Spark, especially within the Databricks environment.
  • Strong understanding of distributed systems and data processing patterns (e.g., batch, streaming, event-driven); a brief illustration follows this list.
  • Experience implementing robust CI/CD pipelines, containerization (Docker), and orchestration (Kubernetes preferred).
  • Demonstrated ability to write modular, testable, and maintainable code and work within a collaborative codebase.
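As a brief illustration of the batch versus streaming patterns mentioned above, the sketch below contrasts PySpark's bounded and unbounded read APIs over the same hypothetical dataset; all paths and column names are placeholders.

```python
# Hypothetical sketch contrasting batch and streaming reads in PySpark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("patterns-demo").getOrCreate()

# Batch pattern: process a bounded dataset once and stop.
batch_df = spark.read.parquet("s3://example-bucket/logs/2024-01-01/")
batch_df.groupBy("vehicle_id").count().show()

# Streaming pattern: the same aggregation over an unbounded file source;
# Spark incrementally updates the result as new files arrive.
stream_df = (
    spark.readStream.schema(batch_df.schema)  # file streams need an explicit schema
    .parquet("s3://example-bucket/logs/")
)
query = (
    stream_df.groupBy("vehicle_id").count()
    .writeStream.outputMode("complete")  # emit the full updated aggregate each trigger
    .format("console")
    .start()
)
query.awaitTermination()
```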

Nice-to-haves

  • Familiarity with data governance concepts and experience with tools like Unity Catalog or AWS Lake Formation are a plus.
  • Experience with geospatial data or telemetry from edge devices is a plus.

Benefits

  • Robust medical/Rx, dental and vision insurance packages for full-time employees, their spouse or domestic partner, and children up to age 26. Coverage is effective on the first day of employment.