JobMesh

Member of technical staff (Inference) - Paris

H Company · Paris, Île-de-France, FR

Job description

About H:

H exists to push the boundaries of superintelligence with agentic AI. By automating complex, multi-step tasks typically performed by humans, AI agents will help unlock full human potential. H is hiring the world’s best AI talent, seeking those who are dedicated as much to building safely and responsibly as to advancing disruptive agentic capabilities. We promote a mindset of openness, learning, and collaboration, where everyone has something to contribute.

About the Team:

The Inference team develops and enhances the inference stack for serving H-models that power our agent technology. The team focuses on optimizing hardware utilization to reach high throughput, low latency, and cost efficiency in order to deliver a seamless user experience.

Key Responsibilities:

- Develop scalable, low-latency, and cost-effective inference pipelines
- Optimize model performance (memory usage, throughput, and latency) using advanced techniques like distributed computing, model compression, quantization, and caching mechanisms
- Develop specialized GPU kernels for performance-critical tasks like attention mechanisms, matrix multiplications, etc.
- Collaborate with H research teams on model architecture...