
Site Reliability Engineer

Spectrum IT Recruitment
Posted a day ago, valid for 5 hours
Location

Southampton, Hampshire SO15 2AE, England

Salary

£45,000 - £54,000 per annum

Contract type

Full Time

Life Insurance
Employee Assistance

By applying, a CV-Library account will be created for you. CV-Library's Terms & Conditions and Privacy Policy will apply.

Sonic Summary

  • The company is seeking a Site Reliability Engineer based in Southampton with 3-6 years of experience in a similar role. Benefits include life insurance and private medical insurance.
  • The role involves overseeing production environments, enhancing system performance, and developing automated solutions for resilient and scalable systems.
  • Candidates should have practical experience with Kubernetes, cloud platforms like AWS, and familiarity with monitoring tools such as Grafana and Splunk.
  • The position requires proficiency in at least one programming language and a solid understanding of CI/CD principles and infrastructure-as-code tools.
  • Security clearance is required for this position, and the work arrangement includes hybrid working with two days in the office each week.

Site Reliability Engineer

Southampton HQ - 2 days a week in the office

Cloud, SaaS, AWS

Please be advised that security clearance is required for this position.

We are working alongside one of our longstanding clients to help them recruit a Site Reliability Engineer. The company delivers cutting-edge enterprise software solutions across both cloud and on-premises environments, empowering organisations to enhance customer experiences, maintain regulatory compliance, and proactively fight fraud. The company is trusted by businesses worldwide to drive seamless, intelligent customer interactions.

In this role, you'll oversee the production environment by ensuring system availability and maintaining a comprehensive perspective on overall health. You'll develop tools and software to support and streamline the management of platform infrastructure and key applications. A major focus will be enhancing the dependability, performance, and delivery speed of our software products. You'll also be responsible for analysing and fine-tuning system performance to anticipate user demands and drive innovation. Additionally, you'll take the lead in providing operational support and technical oversight for several large-scale distributed applications.

How You'll Contribute:

  • Monitor and interpret system and application metrics to fine-tune performance and troubleshoot issues effectively
  • Collaborate closely with developers to enhance service quality through thorough testing and structured release practices
  • Engage in architectural discussions, manage platform operations, and contribute to capacity forecasting
  • Design and implement automated solutions to build resilient, scalable systems
  • Maintain a strong focus on delivering new features while ensuring stability and adherence to service level goals

You'll Stand Out If You Have:

  • Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus
  • Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo
  • Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck
  • Experience using configuration management platforms like Ansible, Puppet, or Chef
  • Professional certifications in cloud DevOps, such as AWS Certified DevOps Engineer or Google Cloud Professional DevOps Engineer, or similar credentials

Do You Have What It Takes?

  • 3-6 years of hands-on experience in a similar role, with a strong emphasis on systems engineering, automation, and service reliability
  • Proficient in at least one programming language such as Python, Go, Java, or C#, along with scripting skills in Bash or PowerShell
  • Solid grasp of cloud platforms like AWS, including an understanding of how core services like EC2, ECS, Lambda, and DynamoDB operate under reliability constraints
  • Practical experience using infrastructure-as-code tools like CloudFormation or Terraform
  • In-depth knowledge of CI/CD principles and hands-on experience with tools such as Jenkins, GitLab CI/CD, or CircleCI
  • Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture
  • Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch
  • Excellent analytical and troubleshooting abilities, especially within complex distributed systems
  • Proven experience handling incident management and conducting blameless postmortems, including leading cross-functional teams through resolution and communication during critical outages

Benefits

  • Life Insurance - 4 x Annual Salary
  • Private Medical Insurance
  • Employee Assistance Programme
  • Hybrid Working - 3 Days from Home
  • GP Online Assistance Portal
  • + Much More

Please click the "Apply" button to state your interest in this position.

Spectrum IT Recruitment (South) Limited is acting as an Employment Agency in relation to this vacancy.
