Couchbase, Inc. • Posted 17 days ago
$182,835 - $215,100/Yr
Full-time • Senior

About the position

As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency, from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission.

Site Reliability Engineers (SREs) are hybrid software and systems engineers. They are the glue holding things together, whether that’s infrastructure, security, services, teams, or processes. You will join Couchbase to lead the Cloud Platform & Production Pipeline initiatives and collaborate with the Product Management, Engineering, and Support teams to on-board new services, features, and solutions. You will also ensure that best practices for Couchbase Capella site reliability, scalability, and cloud cost optimization are followed and that business SLAs are met.

You will work with many software engineers and teams to ensure our cloud platform meets the needs of our organization and customers. You will set the strategy and operational KPIs for the platform team and the applications supported by the cloud organization. You will have an immediate impact on the day-to-day efficiency of cloud operations and an ongoing impact on growth.

Responsibilities

  • Collaborate with the product & engineering team to on-board new cloud services, features & solutions.
  • Ensure SRE best practices are applied during design, development, deployment & monitoring.
  • Lead SRE solutions with clear problem statements & business value.
  • Lead solution architecture, design and development using cloud-native technologies like Golang, Docker, Kubernetes, and cloud platforms (AWS, GCP, Azure).
  • Work closely with development teams to write, review, and optimize code for scalability, reliability, and performance.
  • Guide architectural decisions and software design, ensuring the platform can scale to meet future customer demands.
  • Administer Capella cloud infrastructure and services for reliability, availability & optimization.
  • Help shape a mitigation-first strategy and adjustments to processes, tools and services.
  • Drive mitigation and root cause analysis (RCA) for critical product events.
  • Improve Capella Observability, Serviceability and Tooling.
  • Improve CI/CD & release pipelines.
  • Collaborate with the infrastructure and service team on implementing best security practices.
  • Take ownership of many controls, processes, and risks required to maintain our compliance portfolio (SOC 2, PCI-DSS, GDPR, and HIPAA, among others).
  • Mentorship & Team Development: Mentor developers, fostering a culture of continuous learning and technical excellence. Lead by example in writing clean, maintainable, and efficient code.
  • Stay up-to-date with new technologies and industry trends, and continuously improve the platform to meet the changing needs of the company.

Requirements

  • 10+ years of experience in SRE/DevSecOps operating on public cloud.
  • 5+ years of recent experience leading cloud solutions architecture, design and implementation.
  • Cloud service provider (CSP) administration skills in AWS, GCP & Azure.
  • Proficiency with programming and scripting languages like Go, Python, Java, or Ruby.
  • High proficiency with Linux operating systems.
  • Experience in running, managing and maintaining Kubernetes clusters, both self-managed (vanilla/plain k8s) & managed (preferably AWS EKS).
  • Knowledge and understanding of security topics such as vulnerability management, pen testing, SCA, DAST and SAST, and of security tools such as Sysdig, Snyk, Black Duck etc.
  • Proficiency with Terraform, configuration management tools and version control systems (Git), and with integrating them into CI/CD platforms and toolchains such as CircleCI, GitHub, Spinnaker etc.
  • Strong understanding of networking and security concepts, including TCP/IP, DNS, HTTP, firewalls, VPNs, VPCs, Private Links etc.
  • Deep working experience with cloud platforms and with commercial and open source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos etc.

Nice-to-haves

  • Proficiency with Databases such as Couchbase is a plus.
  • Security certifications are appreciated.

Benefits

  • Generous Time Off Program - Flexibility to care for you and your family.
  • Wellness Benefits - A variety of world class medical plans to choose from, along with dental, vision, life insurance, and employee assistance programs.
  • Financial Planning - RSU equity program, ESPP program, Retirement program and Business Travel Insurance.
  • Career Growth - A "Be valued, create value" approach.
  • Fun Perks - An ergonomic and comfortable in-office / WFH setup. Food & Snacks for in-office employees.