Deadline: 30 June 2025
Job Type: Permanent
Contact Email: sid@fintop.co.uk
Job Description
About the Company:
A leading IT services provider is seeking a Senior DevOps Site Reliability Engineer (SRE) to enhance system reliability, scalability, and performance. This is an opportunity to work in a dynamic team, driving automation, monitoring, and infrastructure optimization.
Roles & Responsibilities:
Lead and mentor a team of engineers in implementing best practices for reliability engineering.
Design and optimize scalable infrastructure, ensuring high availability and fault tolerance.
Develop and refine automation tools for monitoring, alerting, and incident response.
Manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and improve system performance.
Conduct postmortem analyses of incidents and implement preventive measures to reduce the risk of recurrence.
Collaborate with development teams to optimize CI/CD pipelines and deployment workflows.
Enhance system observability by implementing and tuning logging, monitoring, and alerting solutions.
Perform capacity planning, performance tuning, and cost optimization for cloud environments.
Strengthen security and compliance in cloud-native infrastructure.
Participate in on-call rotations and handle escalations for critical incidents.
Requirements:
BSc in Computer Science, Engineering, or a related field (or equivalent experience).
5+ years of experience in DevOps, SRE, or related infrastructure roles.
Expertise in cloud environments (AWS, Google Cloud, Azure) and container orchestration (Kubernetes, Docker Swarm).
Deep knowledge of infrastructure-as-code tools such as Terraform, Ansible, or SaltStack.
Strong proficiency in Python, Go, or Bash for automation and scripting.