Senior Cloud Reliability Engineer
Posted 14 hours 52 minutes ago by Jagex Ltd
Location: Cambridge, UK - Applicants should be based (or willing to relocate) within a comfortable commuting distance of our office to attend onsite as required.
As a Senior Cloud Reliability Engineer in Cloud Tech, you'll help keep RuneScape reliable, scalable and high-performing for players around the world. You'll work across Game, Central Tech and Cloud Platform teams to improve reliability, observability, automation and cloud-native adoption on Jagex's hybrid-cloud platform.
This is a role with real breadth: hands-on production engineering, architecture influence across multiple teams, and the chance to modernise services that directly affect player experience. You'll join a highly experienced team and help shape how Jagex delivers reliable live services at scale.
What you'll be doing
- Partner with game and development teams to move services toward cloud-native architectures, improving resilience, security and cost efficiency across live environments.
- Support the migration of workloads from managed VPS environments onto Jagex's cloud platform, helping teams modernise safely without compromising uptime.
- Define, embed and improve SLIs, SLOs and error budget thinking so service reliability is measurable and better understood across teams.
- Design and enhance observability and alerting across logs, metrics and traces, giving teams faster insight into issues and reducing time to detection.
- Automate operational tasks such as scaling, failover and deployments, while building self-healing mechanisms that reduce toil and improve recovery.
- Contribute hands-on reliability improvements across Linux-based production systems, reusable IaC modules and team codebases, while helping raise engineering standards across Cloud Tech.
What you'll bring
- Proven experience owning reliability for large-scale, internet-facing services in production.
- Demonstrable AWS expertise across services such as VPC, EC2, ECS/EKS, ELB, ECR, Route53, KMS, IAM and Systems Manager.
- Proven capability in cloud-native design, workload modernisation and Infrastructure as Code delivery.
- Strong practical experience with SLIs, SLOs, incident response, root cause analysis and resilient system design.
- Demonstrable production experience with Debian-based Linux environments, virtual machine fleet management and configuration management tooling.
- Hands-on experience with observability platforms, CI/CD, containerisation and programming or scripting in Python or Java.
When you join Jagex you can look forward to a generous Perks & Benefits package including:
- Private Healthcare, including Dental Plan.
- Discretionary annual performance bonus.
- Minimum 6% Pension contributions.
- Life Insurance.
- Enhanced family leave policies from day 1.
- Flexible working hours.
- 25 days annual leave plus bank holidays, the option to buy or sell holidays, and much more!
We are committed to providing equal opportunities and creating an environment where everyone can thrive. We welcome applications from all backgrounds, and we recruit, develop, and promote based on merit and ability.
If you require any reasonable adjustments to support you during the recruitment process, please let us know when you're invited to interview.
Right to Work Statement
This role is only open to applicants who have the permanent right to work in the UK. We are unable to provide or take over visa sponsorship for this position, either now or in the future. Applicants must therefore be able to demonstrate their ongoing eligibility to work in the UK without the need for employer sponsorship.