Episode
"Automated Alignment is Harder Than You Think" by Aleksandr Bowkis, Marie_DB, Jacob Pfau, Geoffrey Irving
- Published: May 17, 2026
- Duration seconds: 471
- Processing state: not_requested
Actions
POST https://stenobird.com/v1/public/podcasts/lesswrong-curated-popular-5643401/episodes/automated-alignment-is-harder-than-you-think-by-aleksandr-bowkis-marie-db-jacob-pfau-geoffrey-irving/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/lesswrong-curated-popular-5643401/automated-alignment-is-harder-than-you-think-by-aleksandr-bowkis-marie-db-jacob-pfau-geoffrey-irving.md
Read the agent-friendly Markdown representation of this episode resource.
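
The two actions above can be sketched in Python with the standard library. This is only an illustration under stated assumptions: the helper names `transcription_request` and `markdown_request` are hypothetical, and the page does not say whether the POST needs a body or any headers, so an empty body with no authentication is assumed. The functions only build the requests and send nothing.

```python
from urllib.request import Request

# URL pieces copied from the actions listed on this page.
BASE = "https://stenobird.com"
PODCAST = "lesswrong-curated-popular-5643401"
EPISODE = (
    "automated-alignment-is-harder-than-you-think-"
    "by-aleksandr-bowkis-marie-db-jacob-pfau-geoffrey-irving"
)


def transcription_request() -> Request:
    """Build (but do not send) the POST that requests a transcript.

    Assumption: an empty body is acceptable; the page does not document one.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    return Request(url, data=b"", method="POST")


def markdown_request() -> Request:
    """Build (but do not send) the GET for the Markdown representation."""
    url = f"{BASE}/podcast/{PODCAST}/{EPISODE}.md"
    return Request(url, method="GET")
```

Passing either object to `urllib.request.urlopen` would perform the call. Because the POST is described as idempotent, repeating it should not queue duplicate transcription work, though the response format is not documented here.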
Summary
This is a summary of a paper published by the alignment team at UK AISI. Read the full paper here. AI research agents may help solve ASI alignment, for example via the following plan:
- Build agents that can do empirical alignment work (e.g. writing code, running experiments, designing evaluations, and red teaming) and confirm they are not scheming.[1]
- Use these agents to build increasingly sophisticated empirical safety cases for each successive generation of agents, gradually aut...