Senior Site Reliability Engineer
Red Hat · Boston, Massachusetts, US
Job description
About the Job

As a senior member of the team, you will drive service performance with a high degree of autonomy, tackling complex, non-routine technical challenges that directly impact global service stability. You will work across teams to define and iterate upon repeatable processes for onboarding new managed services, demonstrating good judgment in employing methods that scale.

Your technical contributions will focus on the codebase of command-and-control software to automate the building, deployment, monitoring, and alerting of Red Hat services, with a specific expectation of proficiency in Agentic tooling. You will leverage these autonomous systems to enhance intelligent orchestration and self-healing capabilities across the platform. The remainder of your time will be spent diagnosing complex issues, planning, documenting, and mentoring junior engineers to foster a culture of technical excellence.

What you will do:

- Design, write, and maintain software (primarily in Python and Golang) that automates the deployment, monitoring, and maintenance of Red Hat managed services.
- Onboarding of new services onto our OpenShift-based platform
- Adhering to cloud-native design principles & b...