Senior Reliability Engineer (Data Infrastructure)
ComplyAdvantage · Lisbon, PT
Job description
What you will be doing:

We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our Data Infrastructure team. You will be responsible for ensuring the reliability, availability, and performance of our critical data systems running on AWS and GCP. Your expertise in cloud infrastructure, automation, and operational excellence will be crucial in supporting our product across our global client base.

As a Senior Site Reliability Engineer you will:

- Design, implement, and maintain highly available and reliable data infrastructure services, including SQL, NoSQL, Kafka, and Spark-based data layers. Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Participate in an on-call rotation to respond to incidents and ensure rapid resolution of production issues. Conduct thorough post-incident reviews to identify root causes and implement preventative measures.
- Manage and automate cloud infrastructure using Terraform and Helm, adhering to GitOps principles.
- Implement and maintain comprehensive monitoring, logging, and tracing solutions to proactively identify and resolve performance and reliability issues.
- Monitor and manage data...