Riot Games • Posted 2 days ago
Full-time • Senior
Los Angeles, CA

About the position

The Live Operations Observability Pipeline (LOOP) team enables Riot to efficiently operate games played by millions of players across the world. The team builds and maintains a pipeline that carries petabytes of operational metrics data emitted by services deployed globally. We ensure that our games are operable and observable by defining a well-communicated set of standards and building tools that enhance our ability to respond to failure. Engineers in Live Operations work across all games and architectures at Riot to solve global problems.

As a Senior Software Engineer, you'll leverage your expertise in building large-scale, highly available systems to create an efficient operational ecosystem that eliminates delays in identifying and preventing impact to the player experience. You'll build tools that maintain an accurate and consistent interpretation of the information obtained from dozens of internal monitoring systems and processes. You'll partner closely with service and game teams to help improve the observability of their services. Availability, automation, and reliability will be your watchwords. You'll report to the team's Engineering Manager.

Responsibilities

  • Lead the creation of Riot-wide standards and best practices for alerting and monitoring
  • Create and operate tools and services that help achieve operational excellence
  • Communicate at a technical level with other development teams to help improve their services
  • Drive collaboration and alignment with multiple internal and globally dispersed teams
  • Characterize and identify system problems both within operations and in our tooling and services
  • Mentor software engineers through code and technical design reviews
  • Identify and propose fixes for systemic issues
  • Provide ongoing maintenance, support, and enhancements for existing platforms

Requirements

  • 4+ years of experience building, deploying and operating features end-to-end within an existing large system
  • Experience driving software engineering best practices within the team, including design reviews, coding standards, code reviews, tools improvements, source control management, build processes, and testing
  • Understanding of distributed systems, microservices, and software at high scale
  • Comfortable using whichever language/framework is necessary for the job
  • Ability to participate in an on-call rotation to ensure 24/7 system availability and handle critical incidents

Nice-to-haves

  • Experience with distributed systems, specifically microservices
  • Experience working in container-based ecosystems and with a container scheduler (e.g. Marathon, Mesos, Kubernetes, GKE, Amazon ECS)
  • Experience with Java and Go
  • Familiarity with Site Reliability best practices
  • Familiarity and experience with third-party monitoring platforms such as New Relic, Datadog, and Prometheus

Benefits

  • Open paid time off policy
  • Flexible work schedules
  • Medical, dental, and life insurance
  • Parental leave for you, your spouse/domestic partner, and children
  • 401k with company match