PhD Research Intern, Multi-Modal Foundation Encoder for Perception
Zoox · Foster City, California, US
Job description
About Our Internship Program
Zoox’s internship program offers hands-on experience with cutting-edge technology, mentorship from some of the industry’s brightest minds, and the opportunity to make meaningful contributions to real projects. We seek interns who demonstrate strong academic performance, engagement beyond the classroom, intellectual curiosity, and a genuine interest in Zoox’s mission.

Project Overview:
During this internship, you will lead the development of a multi-modal (vision, LiDAR, radar, and language) temporal foundation encoder to support 3D object detection and tracking, 3D segmentation (occupancy), and live maps. This Multi-Modal Foundation Encoder (MMFE) is key to achieving End-to-End Perception at Zoox. Your research will aim to significantly improve system performance on long-tail events and rare classes by using a large-capacity foundation model to learn rich representations across different sensor modalities. Additionally, the project aims to improve perception in adverse environmental conditions such as medium to heavy rain and fog (including reducing false positives on water splashes or dust particles), achieve long-range sensing for highway dri...