Solution Architect - Kernels and Performance, Core ML

Indeed

Full-time

Onsite

No experience limit

No degree limit

PV49+C7 Lisbon, Portugal

Description

Summary: This role involves driving low-level performance engineering of AI workloads and optimizing both model training and inference across advanced accelerator architectures.

Highlights:

1. Drive performance engineering for AI workloads on advanced accelerators
2. Work on cutting-edge ML models, toolchains, and frameworks
3. Directly impact next-generation AI performance

We're looking for a **Solution Architect – Kernels and Performance**, Core ML to join our team in Lisbon, Portugal, in a hybrid working mode.

In this role, you will drive low-level performance engineering of AI workloads, optimizing both model training and inference across advanced accelerator architectures like TPU and GPU. You will work on cutting-edge ML models, toolchains and frameworks, enabling scalable, efficient deployment of AI solutions in production. This position combines deep system-level engineering with architectural leadership, directly impacting next-generation AI performance.

**Responsibilities**

* Design and optimize high-performance kernels using low-level languages like Pallas, Mosaic and Triton for TPU and GPU architectures
* Architect infrastructure such as benchmarking suites, autotuning frameworks and performance analysis tools to support kernel development and testing
* Develop regression testing strategies and comprehensive documentation to maintain quality and facilitate adoption across developer communities
* Collaborate with ML researchers, framework developers (JAX, PyTorch) and compiler engineers (XLA) to address performance bottlenecks and implement effective solutions
* Track advancements in hardware architectures, compiler technologies and AI models to identify optimization opportunities and guide roadmap decisions
* Advocate best practices for integrating optimized kernels into open-source libraries and production systems

**Requirements**

* Bachelor’s degree in Computer Science or equivalent practical experience
* 12+ years of overall industry experience in software engineering or related fields
* Minimum 5 years of experience in C++ or Python development
* At least 3 years of experience testing, maintaining or launching software products
* Minimum 1 year of experience in software design and architecture
* Proven background in kernel-level performance optimization for ML workloads

**Nice to have**

* Experience optimizing TPU/GPU code using Pallas, CUDA or Triton
* Familiarity with ML frameworks such as JAX and PyTorch, including advanced components like attention mechanisms and Mixture of Experts (MoEs)
* Understanding of modern accelerator characteristics such as data movement, pipelining and heterogeneous compute
* Knowledge of compiler principles, code generation and toolchains such as MLIR and OpenXLA
* Experience building developer infrastructure for OSS libraries and creating high-performance APIs
* Strong problem-solving and investigative skills with proven ability to work across cross-functional teams

**We offer**

* Competitive compensation depending on experience and skills
* Variety of projects within one company
* Being a part of a project following engineering excellence standards
* Individual career path and professional growth opportunities
* Internal events and communities
* Flexible work hours

Source: Indeed

João Santos

Indeed · HR
