JobMesh

Research Scientist - Seed Multimodal Interaction and World Model

ByteDance · San Jose, California, US

Job description

About the Team

The Seed Multimodal Interaction and World Model team is dedicated to developing models that boast human-level multimodal understanding and interaction capabilities. The team also aspires to advance the exploration and development of multimodal assistant products.

The base salary range for this position in the selected city is $208,800 - $438,000 annually.

Responsibilities:

- Research and develop large-scale multimodal foundation models
- Develop unified modeling frameworks that integrate video, audio, and language, with a focus on visual latent reasoning
- Explore reinforcement learning-based approaches to bridge understanding and generation for multimodal visual reasoning
- Collaborate with researchers to evaluate models on tasks involving world modeling, reasoning, and instruction-conditioned generation