ML Performance Engineer
Pinely · Amsterdam, North Holland, The Netherlands
Job description
We’re looking for a performance-focused ML Engineer to help speed up large-scale model training by optimizing our internal stack and compute infrastructure. You’ll work across the full training pipeline, from GPU kernels to system-level throughput, applying profiling, CUDA-level tuning, and distributed systems techniques. The goal is to reduce training time, boost iteration speed, and use compute more efficiently. This is a key role in a growing team building deep technical expertise in ML training systems.

Responsibilities:
- Optimize our model training pipeline to improve both speed and reliability, enabling faster and more efficient experimentation;
- Apply GPU-level optimization techniques using tools like JAX, Triton, and low-level CUDA to improve training performance and efficiency at scale;
- Identify and resolve performance bottlenecks across the entire ML pipeline, from data loading and preprocessing to CUDA kernels;
- Build tools and extend internal infrastructure to support scalable, reproducible, and high-performance training workflows;
- Mentor and support engineers and researchers in adopting performance best practices across the team;
- Help grow the team’s GPU and sy...