Backend Engineer, Multi-cloud Inference Platform
Modular · CA
Job description
About the role:
In the Cloud Inference team, we are focused on building end-to-end distributed LLM inference deployments and a repeatable, observable, productive, low-toil platform for managing these deployments. Our goal is to make inference both the fastest and the most scalable, while also building the easiest platform for deploying and scaling models for enterprises and developers alike.

We're seeking engineers who are passionate about pushing the boundaries of distributed inference systems and enjoy working at the intersection of large-scale systems and machine learning. We evaluate candidates on their breadth and depth of experience in backend engineering, AI inference, and distributed systems development. If this sounds exciting, we invite you to join our world-leading AI infrastructure team and help drive our industry forward!

LOCATION:
Candidates based in the US or Canada are welcome to apply. You can work in our office in Los Altos, CA or remotely from home. Onboarding for new hires is conducted in person at our headquarters in Los Altos, CA.

What you will do:
- Build the multi-cloud, multi-tenant platform powering Modular’s inference services.
- Build fault-tolera...