Research Scientist/Engineer (Science of Scheming)

Apollo Research · London, England, GB

Job description

Application deadline: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.

ABOUT THE OPPORTUNITY:

We want to develop a “Science of Scheming”. The goal is ambitious and we’re looking for Research Scientists and Research Engineers who are excited to build a new hard science from the ground up.

YOU WILL HAVE THE OPPORTUNITY TO:

Note: We are not hiring for interpretability roles.

- Collaborate with leading AI developers. We partner with multiple labs, giving you access to a breadth of models that no single AI lab could offer. Through long-term research collaborations, your work directly impacts how the most capable AI systems are built and deployed.
- Deeply study the RL dynamics that lead to the emergence of reward-seeking, evaluation awareness or misaligned preferences. Design and train model organisms, and scale your insights to frontier systems.
- Work towards “Scaling laws of scheming”. Build the empirical foundations to predict how scheming risks evolve as models scale in capability.
- Develop novel and ambitious evaluation techniques that have a chance of scaling to highly evaluation-aware models.
- Deep dive into AI cognition ....