Staff Technical Lead for Inference & ML Performance

fal · San Francisco, California, US

Job description

fal is pioneering the next generation of generative-media infrastructure. We're pushing the boundaries of model inference performance to power seamless creative experiences at unprecedented scale. We're looking for a Staff Technical Lead for Inference & ML Performance, someone who blends deep technical expertise with strategic vision, guiding a team to build and optimize state-of-the-art inference systems. This role is intense yet deeply impactful. Apply if you're ready to lead the future of inference performance at a fast-paced, high-growth frontier.

Why this role matters

You’ll shape the future of fal’s inference engine and ensure our generative models achieve best-in-class performance. Your work directly impacts our ability to rapidly deliver cutting-edge creative solutions to users, from individual creators to global brands.

What you'll do

Set technical direction.
Day-to-day: Guide your team (kernels, applied performance, ML compilers, distributed inference) to build high-performance inference solutions.
What success looks like: fal’s inference engine consistently outperforms industry benchmarks in throughput, latency, and efficiency.

Hands-on IC leadership.
Day-to-day: Personally contri...