Principal Cloud Native Platform Engineer UK
Posted 5 days 16 hours ago by Nscale Ltd.
UK
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you'll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you'll be contributing to building the technology that powers the future.
About The Role
The Principal Cloud Native Platform Engineer is a senior technical leader responsible for the long-term integrity, coherence, and evolution of Nscale's cloud-native platform. This role extends beyond individual systems, focusing on architecture, standards, and engineering excellence as the organisation and platform scale.
The role combines deep hands-on engineering with strong architectural stewardship. You will act as a technical escalation point, a mentor to senior engineers, and a trusted advisor to engineering leadership, helping shape the direction of the platform and the practices used to build it.
Working closely with the Director of Cloud Native Platform Engineering, you will draw on your experience in technical direction to accelerate the delivery of Nscale's platform service offerings, marrying innovation with efficiency.
What You'll be Doing (Responsibilities)
- Own and evolve the core platform architecture across multiple subsystems
- Design and review complex, multi-controller Kubernetes-native systems
- Maintain a strong bias toward simplicity, explicitness, and long-term maintainability
- Act as a technical escalation point for the most complex platform problems
Standardisation & Technical Governance
- Define and maintain platform-wide engineering standards, including:
  - Controller and operator design patterns
  - API and CRD design guidelines
  - Versioning, compatibility, and deprecation strategies
- Ensure consistency across teams in:
  - Reconciliation behaviour
  - Error handling and retry semantics
- Review and influence designs to prevent:
  - Unnecessary divergence
  - Overlapping abstractions
- Establish reference implementations and shared libraries where appropriate
Mentoring & Capability Building
- Actively mentor Senior and mid-level engineers in:
  - Kubernetes internals and control plane design
  - Distributed systems thinking
  - Production readiness and failure analysis
- Raise the overall technical bar through:
  - Design reviews
  - Code reviews focused on correctness and clarity
  - Knowledge sharing and documentation
- Identify skill gaps within the team and contribute to closing them through guidance and example
- Serve as a trusted technical advisor to engineering leadership
Cross-Team Influence
- Align platform engineering decisions with:
  - SRE operational requirements
  - Infrastructure and hardware roadmaps
  - Product and customer needs
- Communicate architectural intent clearly through:
  - Reviews and technical discussions
- Ensure that platform changes are understandable, supportable, and well-documented
- Demonstrated experience designing and building Kubernetes-native systems, including custom controllers, operators, CRDs, and reconciliation logic that runs reliably in production.
- Proven ability to design coherent, multi-component platform architectures that evolve over time without accumulating excessive complexity or technical debt.
- Production-grade software engineering in Go: strong track record of writing maintainable, testable, and resilient Go code for long-lived distributed systems.
- Experience designing Kubernetes APIs and internal abstractions that are explicit, stable, and aligned with real operational constraints.
- Deep understanding of failure modes in Kubernetes and distributed systems, and the ability to design for graceful degradation, recovery, and operability.
- Experience defining and enforcing platform standards for controller design, API usage, versioning, and operational behaviour across teams.
- Demonstrated ability to mentor senior and mid-level engineers, raise technical standards, and improve design quality across the organisation.
- Strong grounding in Linux internals, networking, and infrastructure fundamentals sufficient to debug complex, cross-layer issues.
- Working knowledge of GPU-based platforms and AI workloads, including scheduling constraints, performance considerations, and multi-tenant isolation.
- Highly competitive package (base + equity) with reviews every 12 months.
- Join the fastest-growing tech startup. This is your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI.
- Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
- Human-First Flexibility: We treat you as humans first. Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.
- Join our thriving remote-first team. Geography is no barrier to impact or connection. We build seamless virtual collaboration, empowering you, wherever you work.
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.
If there's anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice.