
Capacity and Performance Reliability Manager

Posted 3 hours 10 minutes ago by Computappoint

£90,000 Annual
Permanent
Not Specified
I.T. & Communications Jobs
London, City, United Kingdom, EC1A2
Job Description

  • Permanent
  • Central London - 3 days on-site per week
  • Up to £90,000 (DOE)
This is an exciting opportunity with a prestigious financial services client of ours, who is seeking a talented Capacity and Performance Reliability Manager. It is a rare chance to play a central role in maintaining the stability and performance of one of the world's most critical financial trading platforms. You'll ensure regulatory compliance, proactive risk mitigation, and seamless handling of peak trading demands. You'll also contribute to transparent global reference prices and help safeguard price risk management in a dynamic, business-critical environment.

Job Title: Capacity and Performance Reliability Manager
Job Type: Permanent
Salary: Up to £90,000 (DOE)
Location: Central London
Working Arrangement: Hybrid - 3 days on-site per week

As Capacity and Performance Reliability Manager, you will:
  • Forecast demand and plan capacity across virtual, containerised, and physical environments using historical data, predictive analytics, and scenario modelling.
  • Conduct stress testing, performance tuning, and automate scaling/resource provisioning with Infrastructure as Code (IaC) and cloud-native tools.
  • Maintain and enhance the Capacity Management tool suite (e.g. Athene, Grafana) for zero data loss and high automation.
  • Develop and manage Service Level Objectives (SLOs), SLIs, error budgets, monitoring, alerting, and observability solutions.
  • Lead incident response, blameless post-mortems, and continuous improvement initiatives.
  • Produce capacity plans, reliability reports, and recommendations; own the recommendations tracker and report to senior management.
  • Collaborate closely with development, operations, business teams, architects, and third-party suppliers to embed reliability into design and delivery.
  • Champion automation, observability, and a reliability-focused culture while ensuring regulatory and governance compliance.
What We're Looking For
  • 5+ years of hands-on experience in performance, capacity, or reliability management.
  • At least 5 years in business-critical global banking, financial services, or technology environments, ideally with exposure to trading technologies and experience linking technical metrics to business outcomes.
  • Proven expertise in capacity forecasting, modelling, trend analysis, and queueing theory/system modelling.
  • Strong proficiency with monitoring and automation tools (e.g. Athene, Grafana, Prometheus, DataDog, Terraform, Kubernetes, CI/CD pipelines).
  • Significant SQL knowledge, advanced Excel skills, and coding ability (e.g. Python, Visual Basic, MS SQL), plus an understanding of APIs and scripting.
  • ITIL Foundation Certification (or equivalent); experience in SRE/reliability engineering highly desirable.
  • Excellent analytical, communication, and stakeholder management skills to present insights to senior leaders and collaborate across technical and non-technical teams.
  • Knowledge of cloud architecture, containers, orchestration, and agile practices is a plus.

Services offered by Computappoint Limited are those of an Employment Business and/or Employment Agency in relation to this vacancy.

Computappoint do not use AI to filter or assess candidates. We use experienced and dedicated recruiters who want to match the best people to roles.
