JobMesh

Senior Site Reliability Engineer

Bumble Inc. · London, England, GB

Job description

We are looking for an experienced engineer with strong Linux and system-level expertise who can operate autonomously in complex production environments. You must be able to independently troubleshoot incidents, lead and support post-incident service recovery, and drive improvements to overall system stability, performance, and observability.

We are looking for a hands-on Site Reliability Engineer (SRE) with a strong background in Linux infrastructure and third-party system operations. This role focuses on managing and optimizing large-scale environments (5,000+ hosts) running technologies like Kafka, Redis, and Kubernetes. The position does not involve application development but requires deep operational expertise and solid troubleshooting skills.

Qualifications:

- 5+ years of experience in Linux system administration or SRE roles
- Proven experience managing large-scale infrastructure environments
- Strong troubleshooting and performance tuning skills at the infrastructure level
- Basic scripting/automation experience (Bash, Python)
- Familiarity with IaC tools (e.g., Ansible, Puppet)
- Knowledge of distributed systems and container orchestration (Kafka, Kubernetes, etc.)
- Excel...