Site Reliability Engineer
Recorded Future, Inc. · SE
Job description
With 1,000+ intelligence professionals serving over 1,900 clients worldwide, Recorded Future is the world’s most advanced, and largest, intelligence company!
We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our growing team. In this role, you will be instrumental in ensuring the reliability, scalability, and performance of our critical systems. You will work closely with development teams to build and maintain robust infrastructure, implement automation, and foster a culture of operational excellence. This position requires a strong understanding of cloud environments, observability, and infrastructure as code principles.
What You'll Do:
- Ensure the platform's performance, capacity, scalability, reliability, resiliency, security, compliance, support, and cost efficiency, and meet its SLAs, SLOs, RPOs, and RTOs, either directly or in collaboration with other teams.
- Make systemic improvements, both proactively and in response to recurring issues.
- Perform comprehensive Root Cause Analysis for outages.
- Design, implement, and maintain scalable and reliable infrastructure on AWS.
- Develop and manage observability solutions using tools such as Grafana, ELK (Elastic...