AI Engineer (Compiler & Edge AI Enablement) - Munich - Long Term Contract
My Client's AI Competence Center is seeking a highly skilled AI/ML Compiler Engineer to play a defining role in the OneCompiler program.
This position is central to their strategy to build a single, scalable, MLIR-based edge AI compiler covering all hardware targets - from Cortex-M MCUs to S32 automotive processors, Neutron NPUs, Ara accelerators, and third-party DSPs.
You will develop the technologies that horizontally unify the Front End, IR transformations, and Back End code generation, replacing today's fragmented deployment flows (TFLite, ONNX Runtime, Glow, custom flows) with OneCompiler as the company-wide platform for model ingestion, optimization, and code generation.
The mission of the AI Engineer is to:
- Develop and extend the OneCompiler MLIR/IREE-based toolchain as the future-proof compiler Front End and deployment infrastructure.
- Enable scalable customer model deployment (bring your own model, BYOM) across the full portfolio through unified IRs (StableHLO, TOSA, torch-mlir, custom dialects) and modern compilation flows.
- Advance exploratory compiler tracks such as training graph ingestion, runtime bindings, auto-tuning, and heterogeneous partitioning and mapping for multi-IP scheduling.
- Prepare OneCompiler for integration into eIQ, enabling consistent productization and reuse across business lines (BLs).
This is a technical leadership role operating at the intersection of AI systems, compilers, and hardware enablement.
Key Responsibilities
1. Build & Evolve the OneCompiler Front End (MLIR-Based)
- Implement and maintain Front End ingestion pipelines for PyTorch, ONNX, TensorFlow, and TFLite (StableHLO, TOSA, and torch-mlir import paths).
- Develop new MLIR dialects to interface with the IP stack (e.g., the MLIR dialect used with the Neutron Converter).
- Establish custom pipelines for quantized models, including LLM ingestion paths (GPT-like models).
2. Optimize & Lower Models to Hardware
- Design and implement optimization passes, IR lowerings, and Back End mappings for the supported hardware targets.
- Participate in device-level codegen, enabling efficient operator support and microcode generation.
3. Advance Exploratory Compiler Tracks
You will contribute to OneCompiler's innovation agenda:
- Training graph ingestion for efficient on-device learning
- Auto-tuning using SHARK Tuner-based approaches
- Heterogeneous scheduling and partitioning across CPUs, NPUs, DSPs, and Ara accelerators
- Runtime binding unification (TFLite API compatibility layers, ONNX Runtime-like APIs for IREE)
These tracks shape the long-term viability of OneCompiler as the default ecosystem for all customers.
4. Integration into Software Products
- Deliver compiler components for eIQ and offline/online runtimes.
- Create model bundles and execution formats consumable by ML SDKs and automotive stacks.
5. Collaboration & Technology Transfer
- Work with internal BLs and product lines (PLs) to transition OneCompiler into products.
- Collaborate with external partners to accelerate compiler innovation.
- Contribute to invention disclosures, whitepapers, and internal guidelines.
Your Profile
- Strong foundation in compiler design, MLIR/LLVM, or modern ML compilers (TVM, IREE, XLA).
- Deep understanding of AI/ML models, quantization flows, and hardware aware optimizations.
- Proficiency in C++ and Python, with experience building compiler passes or device backends.
- Familiarity with PyTorch/ONNX/TensorFlow and model export flows.
- Experience with embedded or heterogeneous compute architectures (Cortex family, NPUs, DSPs).
Preferred Qualifications
- MLIR dialect development & IR transformations.
- IREE AOT compilation and runtime flows.
- Contributions to open-source ML compilers or graph runtimes.
- Knowledge of automotive/industrial constraints (safety, determinism, latency budgets).
If this exciting opportunity is of interest, please let me know ASAP. Interviews can be arranged on short notice.