Research Engineer, Multimodal Data
Eventual · San Francisco, California, US
Job description
About Eventual

Every breakthrough Physical AI system (humanoid robots, autonomous vehicles, video generation models) is trained on petabytes of video, lidar, radar, and sensor data. But today's data platforms (Databricks, Snowflake) were built for spreadsheet-like analytics, not the multimodal corpora that power AI.

As a result, robotics and video-AI teams iterate on model improvement about once a week. Most of that week isn't training. It's finding the right data: writing CV heuristics over raw footage, paying annotators for edge cases, hand-curating clips before a cluster ever spins up. GPU bandwidth has grown 2-3× per generation. Storage and pipelines haven't. The gap widens every year.

Eventual was founded in 2022 to close it. Our open-source engine, Daft, is the distributed data engine purpose-built for multimodal AI. It is already running 2 PB/day at Amazon and 60-100 PB at another FAANG company, and is in production at Mobileye, TogetherAI, and CloudKitchens.

We are building a video-native index on top of our engine for Physical AI that collapses the data iteration loop. Describe the dataset you want, get a curated table in minutes, feed it to your GPUs at line rate. One iteration...