Red Hat
Posted 2 days ago
$116,270 - $191,840/Yr
Full-time • Senior
Raleigh, NC

About the position

The Red Hat Performance and Scale Engineering team is looking for an experienced Senior Software Engineer to join our passionate, diverse, global team, which fosters innovation and collaboration to achieve world-class performance for Red Hat OpenShift, the industry-leading enterprise Kubernetes platform, so that user workloads can do more. This position calls for a creative, adaptable engineer who is eager to learn new technologies and thrive in a dynamic, open source-oriented culture.

In this role, you will solve challenging, industry-impacting problems, focusing specifically on optimizing and enhancing the performance of Red Hat OpenShift Container Platform for our on-prem Telco 5G Core Network Functions and Radio Access Network (RAN) solution offerings. You should have a keen interest in building large, performant platforms and making things big and fast, a curiosity to explore the details, and the cross-discipline skills to develop projects from beginning to end. Use cases currently involve Telco 5G control plane and data plane workloads deployed in bare metal environments, evolving toward the future of edge computing for Telco and other enterprise applications with Artificial Intelligence.

As a Senior Software Engineer, you will leverage your knowledge of systems, networking, and hardware performance to theorize bottlenecks and limitations, devise test plans, execute workloads, measure performance, clearly articulate findings, and fix bottlenecks. You may be located in one of our many offices or work permanently remotely within the US.

At Red Hat, our commitment to open source innovation extends beyond our products: it's embedded in how we work and grow. Red Hatters embrace change, especially in our fast-moving technological landscape, and have a strong growth mindset. That's why we encourage our teams to proactively, thoughtfully, and ethically use AI to simplify their workflows, cut complexity, and boost efficiency. This empowers our associates to focus on higher-impact work, creating smarter, more innovative solutions that solve our customers' most pressing challenges.

Responsibilities

  • Understand the complex architecture of distributed Kubernetes systems and deploy lab environments across dozens or hundreds of nodes that reflect customer expectations and demands
  • Work closely with management, product owners, developers, and quality engineers to understand product requirements and build suitable performance test plans
  • Conduct hardware-level performance tuning and optimization for telco workloads on bare metal servers
  • Simulate real-world workloads to systematically stress environments through comprehensive end-to-end automation, leveraging custom-built and state-of-the-art open source tools and frameworks
  • Collect aggregated workload and system metrics from large distributed systems
  • Deep dive into performance issues with the intent of discovering their root cause on complex distributed systems
  • Design and contribute to orchestration, benchmarking, monitoring and reporting tools used within and beyond the Performance and Scale teams
  • Document your research and results clearly and concisely, communicate findings both internally and externally, and provide continuous feedback to Engineering teams and leadership

Requirements

  • Minimum 7 years of combined education and experience in a role like Software Engineering, Performance Engineering, or Site Reliability Engineering (SRE)
  • Significant hands-on experience deploying and managing container orchestration platforms like Kubernetes or Red Hat OpenShift
  • Solid Linux system administration and engineering skills, with a good understanding of bare metal server operations
  • Solid scripting skills, particularly in Bash, Python, Go, or Red Hat Ansible Automation Platform
  • Proven experience in designing, implementing, and documenting performance testing strategies and frameworks for optimizing system performance
  • A solid understanding of performance analysis methodologies and experience with system-level performance tools (e.g., iostat, vmstat, sar, perf)
  • Familiarity with observability stacks (e.g., Prometheus, Grafana, Jaeger, OpenTelemetry, ELK, Splunk)
  • Understanding of cloud-native architectures, microservices, CI/CD pipelines, and collaborative software development methodologies, tools, and version control (Git, GitLab)
  • Knowledge of TCP/IP, DNS, DHCP, load balancing, and container networking
  • Demonstrated ability to take initiative, work independently, proactively seek collaboration, and drive projects to completion
  • Excellent communication skills and ability to present technical findings to diverse audiences
  • Ability to collaborate with cross-functional teams to identify opportunities for AI integration within the software development lifecycle, driving continuous improvement and innovation in engineering practices, and to share use cases from successful experiments with stakeholders for broader use

Nice-to-haves

  • Direct experience with telecommunications workloads (5G Core, RAN) or telco architectures
  • A demonstrated history of contributing to open-source projects
  • Experience with using AI for building testing frameworks and automation
  • Experience working with public clouds like AWS, Azure, GCP or IBM Cloud

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!