JobMesh

Senior Product Manager, AI Inference - Dynamo

NVIDIA · Santa Clara, California, US

Job description

NVIDIA is seeking a highly technical Product Manager to own the evolution of NVIDIA Dynamo, our flagship distributed inference framework. In this role, you will define the roadmap for high-scale LLM and generative AI serving, bridging the gap between cutting-edge hardware (Vera Rubin, LPU, and NVLink) and software optimizations such as disaggregated serving, KV-aware routing, and intelligent KV cache management. We need a self-starter to continue growing the product portfolio and to work with customers to incorporate model evaluation into end-to-end LLM workflows. We're looking for a rare blend of technical and product skills, along with a passion for groundbreaking technology. If this fits, we would love to learn more about you!

What you'll be doing:

Core Dynamo Architecture: Drive the product strategy for Dynamo’s modular components, including the KV-aware Router, KV Block Manager (KVBM), and communication planes.

Inference Orchestration: Define requirements for sophisticated routing logic that minimizes redundant prefill and optimizes Time to First Token (TTFT) across substantial GPU clusters.

Memory & KV Cache Management: Define strategy for multi-tier KV cache offloading enabling long-c...