AI Safety & Alignment
Research Scientist / Engineer, Science of Scheming
London, UK
£100,000 - £200,000
In this role, you'll research and develop a "Science of Scheming" to detect and mitigate deceptive behaviors in AI systems.
Apollo Research · Added today
Applications are handled by the employer on an external website. AI Safety Careers does not process applications directly.
Collaborate with leading AI labs to study their models and influence how frontier systems are built and deployed.
Design experiments to study RL dynamics that lead to reward-seeking or misaligned behaviors.
Develop empirical frameworks to predict how scheming risks evolve as models scale in capability.
Create novel evaluation techniques capable of detecting scheming in advanced AI systems.
Apollo Research is a London-based research and auditing organisation focused on detecting deception in advanced AI systems.