Research Scientist, Autonomous Agents — Reward Modelling

DeepMind

London

Snapshot

We are looking for Research Scientists to join the Autonomous Agents team and conduct research on next-generation technologies that power increasingly open-ended autonomous agents, which strive to assist, support, and supplement humans in their daily personal and professional lives.

About Us

Artificial Intelligence could be one of humanity's most useful inventions. At Google DeepMind, we're a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

This role will focus on research into developing next-generation reward models by analyzing the geometric and behavioral trajectories of multi-turn interactions to create highly adaptive and human-aligned autonomous agents.

A core focus will be mining implicit information from interaction dynamics to infer user state and enable the real-time adaptation and steering of AI agents. We will develop novel methodologies for capturing the success of complex multi-turn interactions spanning both human-LLM and multi-agent scenarios. The majority of the work in this role will focus on rapid experimental iteration to quickly validate hypotheses and explore new research directions.

Within the team, Research Scientists are encouraged to lead/support a research agenda aimed at producing practically applicable technological advances in the ability of increasingly autonomous agents to assist, support, and empower humans.

The expectation is that research scientists will conduct novel research according to ambitious, long-term agendas, while maintaining a strong focus on methods and tools offering practical benefits in the short term, as a form of pragmatic grounding. Central to this process is the idea that rapid iteration over, and refinement of, solutions catering to real-world use-cases provides a strong basis for better understanding the research boundary in a fast-paced field.

Key responsibilities:

Participate in the ideation and development of new use-cases and desired capabilities of human-oriented agents of any form, advised by the current state of research.
Partner with research engineers to develop ambitious prototypes pertaining to desired or anticipated agent use-cases, and design and implement evaluation protocols around these prototypes.
Identify, motivated by empirical study of existing methods' failures or limitations on use-case-based evaluations, the roadblocks and research challenges, and...
...develop novel technical or methodological solutions to overcome such obstacles and limitations.
Identify sources of data, design and implement data collection processes (supported by research engineering partners), and conduct human annotation and evaluation campaigns for the production and evaluation of strong baselines for each use-case.
Help identify, within Google DeepMind's broad portfolio of research projects, methods which could be adapted or tried against our evaluations, as well as teams and individuals with which the team could partner to overcome challenges, whilst providing grounding and evaluation for that collaborator's research agenda.

About You

In order to set you up for success as a Research Scientist, we look for the following skills and experience:

A PhD in a technical field or equivalent practical experience. This specific role is targeting recent graduates, and the ideal candidate will be willing to work closely with one or more senior researchers on established, high-value projects.
Hands-on experience with experiment design and analysis, including data collection and validation methodology, statistical analysis of results and their significance, and performing rigorous ablation studies.
Experience in a research domain connected to the production of increasingly autonomous human-oriented agents (e.g. LLM-powered agents, RL/IL applications in NLP, evaluation design).
A desire to produce the next generation of agentic systems capable of learning from and efficiently adapting to deployment in real-world scenarios.

In addition, the following would be an advantage:

Strong end-to-end system building and prototyping skills.
Experience with one or more of: fine-tuning LLMs, running human data collection/annotation campaigns, self-play, multi-agent systems.
Experience with open-ended learning, RL, and frontier methods for training LLMs (RLHF, RLAIF, multi-turn RL, multi-agent interactions, reward function design and modelling, etc.).
A curiosity about, or experience with, research topics surrounding personalization, memory, reasoning, self-improvement, and safety.

Closing date: Monday 2nd March 2026 at 10:00am GMT

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives, and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition (including breastfeeding), or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

Posted 2026-02-27
