Software Development Manager
Oracle · Seattle, Washington, US
Job description
OCI (Oracle Cloud Infrastructure) AI Infrastructure is at the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the AI revolution, creating systems that allow customers to scale from tens to thousands of GPUs without compromising performance. Our team is responsible for designing and developing fundamental architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like RoCE and InfiniBand.

We're looking for an experienced front-line engineering manager to lead and support the AI Data Plane (DP) team. You'll build a highly available, massive-scale, integrated cloud service in a distributed, multi-tenant cloud environment for hyper-scale AI customers. You'll bring experience in leading engineering teams, including hands-on technical management and on-call experience. You have technical experience with high-performance, mission-critical environments. You demonstrate strong ownership, solid communication skills, and a bias for action. You...