Home Depot • posted 20 days ago
Full-time • Senior
Atlanta, GA
Building Material and Garden Equipment and Supplies Dealers

About the position

The Software Engineer Principal joins the Enterprise Mobile Reliability Engineering team and provides leadership in the design, development, and end-to-end lifecycle management of consumer mobile applications at enterprise scale. This role focuses on ensuring the reliability, resiliency, and operational excellence of mobile products that millions of users depend on.

Collaboration is central: the Software Engineer Principal partners closely with other product and engineering teams to share reliability best practices, address complex technical questions, and encourage robust cross-team connections. They will also actively engage with third-party vendors and the open-source community to drive continuous improvements in stability and resiliency across the mobile application portfolio.

Key technical responsibilities include architecting and building reusable code elements, producing architectural diagrams, and developing comprehensive documentation that supports mobile resiliency and effective incident response. The Software Engineer Principal will establish service level objectives specifically crafted for the unique demands of consumer mobile applications, and will be deeply involved in configuring, monitoring, performance tuning, and reliability testing of apps in production.

As a technical leader, this individual is expected to mentor and develop junior engineers, cultivating expertise in mobile reliability engineering and fostering a culture of operational excellence. Demonstrated experience throughout the complete product lifecycle, from inception through monitoring and incident management, combined with mastery of modern mobile development practices, is essential for success in this role.

Responsibilities

  • Collaborates and pairs with other product team members (UX, engineering, and product management) to create secure, reliable, scalable software solutions
  • Documents, reviews, and ensures that all quality and change control standards are met
  • Writes custom code or scripts to automate infrastructure, monitoring services, and test cases
  • Writes custom code or scripts to do 'destructive testing' to ensure adequate resiliency in production
  • Creates meaningful dashboards, logging, alerting, and responses to ensure that issues are captured and addressed proactively
  • Contributes to enterprise-wide tools to drive destructive testing, automation, or engineering empowerment
  • Identifies product enhancements (client-facing or technical) to create a better experience for the end users
  • Identifies unsecured code areas and implements fixes as they are discovered with or without tooling
  • Identifies, implements, and shares technical solutions that can be used across the organization
  • Creates and architects foundational code elements that can be reused many times by a product
  • Creates meaningful architecture diagrams and other documentation needed for security reviews or other interested parties
  • Defines Service Level Objectives for the product to constantly measure their reliability in production and help prioritize backlog work
  • Fields questions from other product teams or support teams
  • Monitors tools and participates in conversations to encourage collaboration across product teams
  • Provides application support for software running in production
  • Proactively monitors production Service Level Objectives for products
  • Works with vendors and the open-source community to help identify and implement feature enhancements in software products
  • Works with other product teams to create API specifications and contracts for shared data
  • Proactively reviews the performance and capacity of all aspects of production: code, infrastructure, data, and message processing
  • Triages high-priority issues and outages as they arise
  • Participates in and leads learning activities around modern software design and development core practices (communities of practice)
  • Learns, through reading, tutorials, and videos, new technologies and best practices being used within other technology organizations
  • Attends conferences and learns how to apply new technologies where appropriate

Requirements

  • Must be eighteen years of age or older
  • Must be legally permitted to work in the United States
  • Mastery of an object-oriented programming language (preferably Java)
  • 8+ years of relevant experience in software engineering, with a strong emphasis on enterprise-scale mobile applications
  • Expert-level hands-on experience with both Android and iOS development, including deep proficiency in native frameworks such as Kotlin/Java for Android and Swift/Objective-C for iOS
  • Advanced knowledge of application monitoring and user analytics tools (e.g., New Relic, Firebase, AppDynamics, or similar) for mobile applications, including configuration, integration, and real-time data interpretation for reliability and performance
  • Proven expertise in application performance management (APM) specifically for mobile environments, including strategies for continuous monitoring, alerting, and remediation in production
  • Demonstrated skills in leveraging application analytics platforms to measure user engagement and application health and to drive data-informed operational improvements
  • Strong background in designing, building, and deploying containerized workloads and microservices using Kubernetes, with an emphasis on supporting mobile application backends and associated infrastructure at scale
  • Proficiency in Java as a primary backend or native Android language, and experience using Java-based frameworks and tools in the mobile ecosystem
  • Deep understanding of mobile reliability engineering principles, including designing for resiliency, implementing robust error handling, automated recovery, and incident response for high-availability mobile applications
  • Direct experience architecting, implementing, and maintaining mobile infrastructure and cloud backend services for consumer-facing apps, encompassing high availability, scalability, disaster recovery, and security best practices
  • Expertise in modern application development practices, including CI/CD pipelines, automated testing (unit, integration, and E2E), and agile methodologies for mobile software delivery
  • Experience with the end-to-end lifecycle management of mobile apps, including app store deployment, telemetry instrumentation, operational monitoring, user incident triage, and post-release stability improvement
  • Strong foundation in RESTful/gRPC API design and implementation, particularly as it supports mobile client consumption
  • Solid understanding of cloud platform services (e.g., AWS, GCP, or Azure) as relevant to mobile backend infrastructure and scaling high-concurrency consumer workloads
  • Commitment to mentoring and enabling other engineers in reliability engineering, best practices in monitoring, and operational excellence