Principal SRE
Posted 3 hours 11 minutes ago by Barclays
The Principal Site Reliability Engineer will be a senior technical expert responsible for driving end-to-end resilience, reliability, and scalability across our mission-critical payments platform. This role focuses on front-to-back payment flows, ensuring systems are designed for fault tolerance, observability, and operational excellence.
You will perform deep technical reviews, troubleshoot complex issues, and define patterns for resiliency by design. As a hands-on engineer, you will collaborate with development and production support teams, advocate chaos engineering, and build a culture of designing for failure. This position requires strong technical breadth across infrastructure, applications, networks, databases, and integrations, combined with expertise in modern reliability engineering practices.
Key Responsibilities
- Reliability Engineering Leadership
- Drive strategies to improve reliability, maintainability, and scalability across payment flows and platform components.
- Architecture & Design Reviews
- Conduct deep technical assessments of system architectures, identifying risks and recommending improvements for fault tolerance and disaster recovery.
- Incident Management & Root Cause Analysis
- Act as a senior escalation point for production incidents, lead RCA, and implement permanent fixes to prevent recurrence.
- Resiliency by Design
- Define and enforce reliability patterns, frameworks, and best practices; ensure adoption across engineering teams.
- Chaos Engineering & Failure Testing
- Advocate and implement chaos engineering principles to validate system resilience under real-world failure scenarios.
- Observability & Monitoring
- Design and implement full-stack observability solutions, including metrics, logging, distributed tracing, and alerting.
- Automation & Tooling
- Develop automation for failover, capacity management, and self-healing mechanisms to reduce operational risk.
- Collaboration
- Partner with development, infrastructure, and production support teams to embed reliability into the SDLC.
- Continuous Improvement
- Analyze service risk assessments and production incidents to identify systemic issues and drive long-term improvements.
- Culture Building
- Promote operational excellence and a mindset of designing for failure across all engineering teams.
- Technical Expertise
- 12+ years in software engineering or infrastructure roles, with at least 5 years focused on reliability engineering or SRE.
- Proven experience building and operating fault-tolerant, highly available systems at scale.
- Architecture & Design
- Strong knowledge of distributed systems, resiliency patterns (circuit breakers, retries, failover), and disaster recovery strategies.
- Technical Breadth
- Expertise across infrastructure (compute, storage, networking), application architecture, databases, and integration patterns.
- Problem Solving
- Ability to troubleshoot complex technical issues across distributed systems and perform deep root cause analysis.
- Collaboration & Influence
- Skilled at working with development, operations, and architecture teams to embed reliability into design and delivery.
To drive technical excellence and innovation by leading the design and implementation of robust software solutions, providing mentorship to engineering teams, fostering cross-functional collaboration, and contributing to strategic planning to ensure the delivery of high-quality solutions aligned with business objectives.
Accountabilities
- Provision of guidance and expertise to engineering teams to ensure alignment with best practices and foster a culture of technical excellence.
- Contribution to strategic planning by aligning technical decisions with business goals, anticipating future technology trends, and providing insights to optimize product roadmaps.
- Design and implementation of complex, scalable, and maintainable software solutions, considering long-term viability and business objectives.
- Mentoring and coaching of junior and mid-level engineers to foster professional growth and knowledge sharing, elevating the overall skillset and capabilities of the organization.
- Collaboration with business partners, product managers, designers, and other stakeholders to translate business requirements into technical solutions and ensure a cohesive approach to product development.
- Innovation within the organization by identifying and incorporating new technologies, methodologies, and industry practices into the engineering process.