Research Engineer/Research Scientist - Red Team (Misuse)
Posted 7 hours ago by AI Security Institute
£80,000 - £100,000 Annual
Permanent
Full Time
Research Jobs
London, United Kingdom
Job Description
Research Engineer/Research Scientist - Red Team (Misuse) London, UK
About the AI Security Institute The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We're in the heart of the UK government with direct lines to No. 10 (the Prime Minister's office), and we work with frontier developers and governments globally.
We're here because governments are critical for advanced AI going well, and UK AISI is uniquely positioned to mobilise them. With our resources, unique agility and international influence, this is the best place to shape both AI development and government action.
Team Description Interventions that secure a system from abuse by bad actors or misaligned AI systems will grow in importance as AI systems become more capable, autonomous, and integrated into society.
The Misuse Red Team is a specialised sub-team within AISI's wider Red Team. We red-team frontier AI safeguards for dangerous capabilities, research novel attack vectors, and develop advanced automated attack tooling. We share our findings with frontier AI companies (including Anthropic, OpenAI, DeepMind), key UK officials, and other governments to inform their respective deployment, research, and policy decision-making.
We have published on several topics, including novel automated attack algorithms (Boundary Point Jailbreaking), poisoning attacks, safeguards safety cases, defending fine tuning APIs, third party attacks on agents, agent misuse, and pre training data filtering. Some example impact cases have been advancing the benchmarking of agent misuse, identifying novel vulnerabilities and collaborating with frontier labs to mitigate them, and producing insights into the feasibility and effectiveness of attacks and defences in data poisoning and fine tuning APIs.
We're looking for research scientists and research engineers for our misuse sub-team with expertise developing and analysing attacks and protections for systems based on large language models or who have broader experience with frontier LLM research and development. An ideal candidate would have a strong track record of performing and publishing novel and impactful research in these or other areas of LLM research. We're looking for:
The team is currently led by Eric Winsor and Xander Davies - advised by Geoffrey Irving and Yarin Gal. You'll work with incredible technical staff across AISI, including alumni from Anthropic, OpenAI, DeepMind, and top universities. You may also collaborate with external teams from Anthropic, OpenAI, and Gray Swan.
We are open to hires at junior, senior, staff and principal research scientist levels.
Representative projects you might work on
The experiences listed below should be interpreted as examples of the expertise we're looking for, as opposed to a list of everything we expect to find in one applicant:
You may be a good fit if you have:
Annual salary is benchmarked to role scope and relevant experience. Most offers land between £65,000 and £145,000 made up of a base salary plus a technical allowance (take home salary = base + technical allowance). An additional 28.97% employer pension contribution is paid on the base salary.
This role sits outside of the DDaT pay framework given the scope of this role requires in depth technical expertise in frontier AI safety, robustness and advanced AI architectures.
The full range of salaries are available below:
Selection process The interview process may vary candidate to candidate, however, you should expect a typical process to include some technical proficiency tests, discussions with a cross section of our team at AISI (including non technical staff), conversations with your team lead. The process will culminate in a conversation with members of the senior leadership team here at AISI.
Candidates should expect to go through some or all of the following stages once an application has been submitted:
Internal Fraud Database The Internal Fraud function of the Fraud, Error, Debt and Grants Function at the Cabinet Office processes details of civil servants who have been dismissed for committing internal fraud, or who would have been dismissed had they not resigned. The Cabinet Office receives the details from participating government organisations of civil servants who have been dismissed, or who would have been dismissed had they not resigned, for internal fraud. In instances such as this . click apply for full job details
About the AI Security Institute The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We're in the heart of the UK government with direct lines to No. 10 (the Prime Minister's office), and we work with frontier developers and governments globally.
We're here because governments are critical for advanced AI going well, and UK AISI is uniquely positioned to mobilise them. With our resources, unique agility and international influence, this is the best place to shape both AI development and government action.
Team Description Interventions that secure a system from abuse by bad actors or misaligned AI systems will grow in importance as AI systems become more capable, autonomous, and integrated into society.
The Misuse Red Team is a specialised sub-team within AISI's wider Red Team. We red-team frontier AI safeguards for dangerous capabilities, research novel attack vectors, and develop advanced automated attack tooling. We share our findings with frontier AI companies (including Anthropic, OpenAI, DeepMind), key UK officials, and other governments to inform their respective deployment, research, and policy decision-making.
We have published on several topics, including novel automated attack algorithms (Boundary Point Jailbreaking), poisoning attacks, safeguards safety cases, defending fine tuning APIs, third party attacks on agents, agent misuse, and pre training data filtering. Some example impact cases have been advancing the benchmarking of agent misuse, identifying novel vulnerabilities and collaborating with frontier labs to mitigate them, and producing insights into the feasibility and effectiveness of attacks and defences in data poisoning and fine tuning APIs.
We're looking for research scientists and research engineers for our misuse sub-team with expertise developing and analysing attacks and protections for systems based on large language models or who have broader experience with frontier LLM research and development. An ideal candidate would have a strong track record of performing and publishing novel and impactful research in these or other areas of LLM research. We're looking for:
- Research Scientists, who typically lead technical direction - picking the questions, designing the experiments, and owning the conclusions (typically evidenced by a strong publication record).
- Research Engineers, who typically lead execution - building the systems and code that make those experiments possible at scale, and owning reliability, speed, and reproducibility.
The team is currently led by Eric Winsor and Xander Davies - advised by Geoffrey Irving and Yarin Gal. You'll work with incredible technical staff across AISI, including alumni from Anthropic, OpenAI, DeepMind, and top universities. You may also collaborate with external teams from Anthropic, OpenAI, and Gray Swan.
We are open to hires at junior, senior, staff and principal research scientist levels.
Representative projects you might work on
- Designing, building, running and evaluating methods to automatically attack and evaluate safeguards, such as LLM automated attacking and direct optimisation approaches.
- Building a benchmark for asynchronous monitoring for signs of misuse and jailbreak development across multiple model interactions.
- Investigating novel attacks and defences for data poisoning LLMs with backdoors or other attacker goals.
- Performing adversarial testing of frontier AI system safeguards and producing reports that are impactful and action guiding for safeguard developers.
The experiences listed below should be interpreted as examples of the expertise we're looking for, as opposed to a list of everything we expect to find in one applicant:
You may be a good fit if you have:
- Hands on research experience with large language models (LLMs) - such as training, fine tuning, evaluation, or safety research.
- A demonstrated track record of peer reviewed publications in top tier ML conferences or journals.
- Ability and experience writing clean, documented research code for machine learning experiments, including experience with ML frameworks like PyTorch or evaluation frameworks like Inspect.
- A sense of mission, urgency, responsibility for success.
- An ability to bring your own research ideas and work in a self directed way, while also collaborating effectively and prioritizing team efforts over extensive solo work.
- Experience working on adversarial robustness, other areas of AI security, or red teaming against any kind of system.
- Experience working on AI alignment or AI control.
- Extensive experience writing production quality code.
- Desire to and experience with improving our team through mentoring and feedback.
- Experience designing, shipping, and maintaining complex technical products.
- Incredibly talented, mission driven and supportive colleagues.
- Direct influence on how frontier AI is governed and deployed globally.
- Work with the Prime Minister's AI Advisor and leading AI companies.
- Opportunity to shape the first & best resourced public interest research team focused on AI security.
- Pre release access to multiple frontier models and ample compute.
- Extensive operational support so you can focus on research and ship quickly.
- Work with experts across national security, policy, AI research and adjacent sciences.
- If you're talented and driven, you'll own important problems early.
- 5 days off and annual stipends for learning and development, and funding for conferences and external collaborations.
- Freedom to pursue research bets without product pressure.
- Opportunities to publish and collaborate externally.
- Modern central London office (cafes, food court, gym), or where applicable, option to work in similar government offices in Birmingham, Cardiff, Darlington, Edinburgh, Salford or Bristol.
- Hybrid working, flexibility for occasional remote work abroad and stipends for work from home equipment.
- At least 25 days' annual leave, 8 public holidays, extra team wide breaks and 3 days off for volunteering.
- Generous paid parental leave (36 weeks of UK statutory leave shared between parents + 3 extra paid weeks + option for additional unpaid time).
- On top of your salary, we contribute 28.97% of your base salary to your pension.
- Discounts and benefits for cycling to work, donations and retail/gyms.
Annual salary is benchmarked to role scope and relevant experience. Most offers land between £65,000 and £145,000 made up of a base salary plus a technical allowance (take home salary = base + technical allowance). An additional 28.97% employer pension contribution is paid on the base salary.
This role sits outside of the DDaT pay framework given the scope of this role requires in depth technical expertise in frontier AI safety, robustness and advanced AI architectures.
The full range of salaries are available below:
Selection process The interview process may vary candidate to candidate, however, you should expect a typical process to include some technical proficiency tests, discussions with a cross section of our team at AISI (including non technical staff), conversations with your team lead. The process will culminate in a conversation with members of the senior leadership team here at AISI.
Candidates should expect to go through some or all of the following stages once an application has been submitted:
- Initial assessment
- Initial screening call
- Research interviewTechnical assessment
- Behavioural interview
- Final interview with members of the senior leadership team
Internal Fraud Database The Internal Fraud function of the Fraud, Error, Debt and Grants Function at the Cabinet Office processes details of civil servants who have been dismissed for committing internal fraud, or who would have been dismissed had they not resigned. The Cabinet Office receives the details from participating government organisations of civil servants who have been dismissed, or who would have been dismissed had they not resigned, for internal fraud. In instances such as this . click apply for full job details