JobMesh

Virtual Platform Engineering Sr. Manager, Annapurna Labs Machine Learning Accelerators, AWS

Amazon.com · Cupertino, California, US

AWS's Trainium and Inferentia chips power the world's largest machine learning clusters. Our team builds virtual platforms — full-system C++ models of these...

Job description

AWS's Trainium and Inferentia chips power the world's largest machine learning clusters. Our team builds virtual platforms — full-system C++ models of these custom SoCs — that let software teams start development months before silicon exists. For Trainium3, our virtual platform enabled a full training workload within 12 hours of first silicon, putting servers in customer hands within weeks! We're hiring a hands-on engineering manager to lead and scale our virtual platform effort. You'll own the end-to-end delivery of virtual platforms that our software partners depend on — from model architecture and level of detail, through deployment and customer enablement. This is a builder-leader role: you'll set the technical direction while staying close to the code and your customers. What you'll do: - Lead the team delivering virtual platforms used by design verification as well as driver, runtime, collectives, and application software teams to develop and validate software pre-silicon - Own the virtual platform roadmap — deciding what to model, at what fidelity, and when to deliver it, based on customer needs and chip schedules - Drive platform usability, performance, and scalability so t...