Senior Platform ML Engineer - Freelance/B2B - Remote

Posted 1 day 10 hours ago by Robson Bale Ltd

Contract
Not Specified
Other
Not Specified, Poland
Job Description

Senior Platform ML Engineer - Freelance/B2B - Remote from the EU

Candidates MUST be based in the EU for this role.

We are unable to offer this opportunity to anyone outside the EU.

6 months initially

Market rate

Must-Have Skills

Production ML platform experience

  • 5+ years building and operating ML infrastructure or large-scale data/ML systems on cloud platforms
  • Experience supporting mission-critical systems serving multiple teams

Distributed systems engineering

  • Containers (Docker) and orchestration (Kubernetes)
  • Experience with streaming and batch processing systems (e.g., Kafka/Kinesis, Spark/Flink)

Low-latency and high-scale systems

  • Experience designing and operating systems with strict latency and throughput requirements (e.g., sub-10 ms inference or retrieval paths)
  • Familiarity with caching, traffic shaping, and request management in production

Reliability, observability, and operability

  • Designing systems with SLOs, monitoring, and safe deployment practices
  • Experience with incident response, capacity planning, and post-incident reviews

Security and governance by design

  • Experience working with IAM, secrets management, and network boundaries
  • Ability to embed security, compliance, and governance into engineering workflows

Platform integration skills

  • Experience combining multiple platform components (open source and managed services) into a coherent, shared, multi-team, production-ready ML platform
  • Comfortable evaluating and integrating tools rather than relying on a single end-to-end solution
  • Evaluating build vs buy vs extend trade-offs

Collaboration and communication

  • Clear articulation of technical trade-offs and recommendations
  • Ability to produce architecture designs, PoC findings, and decision input
  • Effective collaboration with platform, infra, and ML teams

Nice-to-Have Skills

  • Experience with enterprise ML platforms (e.g., Databricks, Domino, ClearML)
  • Kubernetes-first ML systems
  • Hands-on experience running ML workloads on Kubernetes (EKS preferred)
  • Multi-tenant environments, resource isolation, autoscaling
  • Experience running and optimizing GPU-based training workloads in shared, multi-tenant environments (e.g., scheduling, utilization, cost efficiency)
  • Feature platform or feature store experience
  • Online/offline consistency, schema evolution
  • Familiarity with Hopsworks, Feast, or similar
  • Governance and compliance experience in regulated ML environments
  • Experience onboarding teams onto shared platforms
  • MLOps awareness (cost attribution and optimization for ML workloads)
  • Developer experience/platform enablement mindset (golden paths, templates, onboarding guides)

High-Level Tasks & Expectations

PoC Evaluations

  • Support and contribute hands-on to multiple ML platform PoCs
  • Work closely with Applied Scientists, ML Engineers, and internal platform teams
  • Evaluate platform capabilities across:
      • GPU training and experimentation
      • Real-time and batch inference
      • Orchestration, monitoring, and operability
      • Multi-tenancy, isolation, and scaling
  • Assess integration points with existing in-house tooling
  • Perform performance and operability analysis
  • Contribute technical input to:
      • Build vs buy vs extend decisions
      • Target platform stack recommendations
      • OPEX and CAPEX justification for rollout

Production Rollout

  • Take responsibility for technical execution during early production rollout, working closely with internal teams who will own the platform long-term
  • Contribute hands-on to building the production environment
  • Support integration of the selected platform into the Client's ecosystem

Help define and implement:

  • Initial best practices
  • Governance and compliance foundations
  • Operational ownership and maintenance boundaries

Support onboarding of:

  • PoC use cases as first production adopters
  • Additional early adopter teams
  • Reduce friction and uncertainty for teams adopting new AI/ML use cases