Research Scientist in Speech Foundation Model - Seed - Graduates - 2027 Start (PhD)

ByteDance · San Jose, California, US

Job description

About the team

The mission of the Seed Speech team is to enrich interactive and creative processes through the application of multimodal speech technologies. The team focuses on the forefront of research and product development in speech and audio, music, natural language understanding, and multimodal deep learning.

Responsibilities:

- Develop and scale speech foundation models for understanding and generation tasks.
- Design training pipelines including data construction, instruction tuning, and model alignment.
- Improve core capabilities such as speech recognition, synthesis, reasoning, and robustness.
- Optimize model architectures, training efficiency, and system performance.
- Explore natural and interactive interfaces for speech-based systems.

The base salary range for this position in the selected city is $244,800 - $450,000 annually.