Senior Site Reliability Engineer
Posted 7 days 18 hours ago by Embarcaderomediagroup
Senior Site Reliability & Platform Engineer
Manchester Hybrid/Flexible Working Full-Time
Drive better infrastructure and developer experience at scale
At Sorted, we're building robust, scalable systems to support modern digital services - and we're looking for a Senior Site Reliability & Platform Engineer to help lead the way.
You'll sit at the heart of our engineering operations, combining SRE principles (service-level reliability, observability, incident response) with platform engineering practices such as GitOps, Infrastructure as Code, DevSecOps automation, and self-service enablement, to help development teams ship faster, more safely, and more cost-efficiently.
What you'll be doing:
- Designing and operating highly reliable, scalable, and secure Azure-based platforms
- Applying SRE principles like SLOs, observability, and incident management to drive service reliability
- Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows
- Enabling teams through platform tools, reusable Terraform modules, and self-service infrastructure
- Enhancing CI/CD pipelines (Azure DevOps, YAML-based) with security scanning and progressive delivery
- Supporting AKS clusters and Azure services (SQL, Cosmos DB, ADF, Functions, Logic Apps, etc.)
- Improving monitoring and alerting with Datadog, Grafana, ELK, and proactive failure detection
- Participating in the on-call rota and leading incident response workflows and blameless postmortems
- Coaching engineers, upskilling teams, and contributing to a culture of continuous improvement
- Driving cost awareness through FinOps practices and automated budget controls
We're seeking someone with strong platform and cloud engineering experience who can collaborate across teams and incorporate reliability thinking into all aspects of their work. Ideally, you have:
- In-depth Azure knowledge (AKS, Functions, SQL, Cosmos DB, etc.)
- Strong Infrastructure as Code skills with Terraform (v1.7+)
- Experience with CI/CD pipelines, GitOps, and automation tools (PowerShell, Bash)
- Familiarity with observability and incident tools like Datadog, ELK, and synthetic monitoring
- Solid understanding of networking (TCP/IP, Load Balancing, DNS, Routing)
- Good knowledge of DevSecOps practices - including security scanning, IAM, and RBAC
- Experience with FinOps - tagging, budgeting, cost optimisation
- Experience with Windows and Linux Operating Systems
- Understanding of progressive delivery methods (canary, blue/green)
- Familiarity with security scanning tools (Trivy, tfsec) integrated into pipelines
- A proactive approach to problem-solving, documentation, and coaching
Additional bonus skills include experience with Azure governance tools, advanced Datadog capabilities, Kubernetes autoscaling solutions, GitOps workflows, automated cost dashboards, compliance frameworks, and internal platform development.
What You Can Expect:
- Competitive salary: £70,000 - £80,000 depending on experience
- 25 days holiday plus bank holidays
- Flexible remote/hybrid working with office collaboration as needed
- Health Shield from day one
- Annual £200 personal development budget
- Enhanced family leave policy
- Life Assurance coverage
- Pension contributions via salary sacrifice
- 35-hour workweek (plus 1-hour unpaid lunch)
This is a great opportunity for someone passionate about building robust infrastructure and enabling others to move faster and more securely. You might come from a cloud engineering, SRE, or DevOps background - what matters most is your curiosity, systems thinking, and drive to improve operational efficiency.
At Sorted, we are committed to fostering an inclusive environment where people from all backgrounds can thrive. If you need any accommodations during the interview process, please let us know; we're happy to assist.