Research Scientist - Seed Multimodal Interaction and World Model
ByteDance · San Jose, California, US
Job description
About the Team
The Seed Multimodal Interaction and World Model team is dedicated to developing models that boast human-level multimodal understanding and interaction capabilities. The team also aspires to advance the exploration and development of multimodal assistant products.

Responsibilities:
- Research and develop large-scale multimodal foundation models
- Develop unified modeling frameworks that integrate video, audio, and language, with a focus on visual latent reasoning
- Explore reinforcement learning-based approaches to bridge understanding and generation for multimodal visual reasoning
- Collaborate with researchers to evaluate models on tasks involving world modeling, reasoning, and instruction-conditioned generation

The base salary range for this position in the selected city is $208,800 - $438,000 annually.