Episode
He Co-Invented the Transformer. Now: Continuous Thought Machines - Llion Jones and Luke Darlow [Sakana AI]
- Published: Nov 23, 2025
- Duration (seconds): 4359
- Processing state: processed
Actions
POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/machine-learning-street-talk/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai.md
Read the agent-friendly Markdown representation of this episode resource.
Summary
Llion Jones, co-inventor of the Transformer, argues that current scaling laws are trapping AI in a local minimum of pattern matching rather than true reasoning. He and Luke Darlow introduce Continuous Thought Machines (CTM) as a biologically-inspired alternative that allows models to 'ponder' and process information step-by-step.
Topics
- Transformer Architecture
- Continuous Thought Machines
- Sakana AI
- Adaptive Computation
- Neural Network Architecture
- Machine Learning Research
- Artificial General Intelligence
- Pattern Recognition
Highlights
- Main idea: The Transformer architecture excels at pattern recognition but lacks the ability to genuinely 'think' through complex, multi-step problems
- Failure mode: Current LLMs use 'brute force' scaling to mimic complex shapes or logic, effectively faking understanding through high-dimensional straight lines
- Practical takeaway: Continuous Thought Machines (CTM) enable adaptive computation, allowing a model to spend more time on harder tasks by 'walking' through a problem
- Technical insight: CTM uses a self-bootstrapping mechanism where the model is trained to predict only the next step in a sequence it has already partially mastered
- Research philosophy: Moving away from 'architecture lottery' and fixed-compute models toward systems that can naturally backtrack and correct errors
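The adaptive-computation takeaway above can be illustrated with a toy sketch. This is not the CTM architecture or Sakana AI's code; it only shows the general idea of a loop that keeps taking internal "ticks" until its prediction is confident enough, so harder inputs consume more steps. The function `ponder`, the `threshold` and `max_ticks` parameters, and the fixed per-tick evidence vectors are all assumptions made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ponder(evidence_per_tick, threshold=0.9, max_ticks=50):
    """Hypothetical halting loop: accumulate evidence one tick at a time
    and stop as soon as the most likely class reaches `threshold`.
    Assumes max_ticks >= 1."""
    logits = [0.0] * len(evidence_per_tick)
    for tick in range(1, max_ticks + 1):
        # Each tick adds the same evidence; a real model would compute it.
        logits = [l + e for l, e in zip(logits, evidence_per_tick)]
        probs = softmax(logits)
        if max(probs) >= threshold:
            break
    return tick, probs.index(max(probs))


# An "easy" input gives strong evidence per tick, a "hard" one weak evidence.
easy_ticks, easy_label = ponder([2.0, 0.0, 0.0])
hard_ticks, hard_label = ponder([0.3, 0.2, 0.0])
print(easy_ticks, hard_ticks)
```

Both inputs end on the same label here, but the hard one needs many more ticks to get there, which is the behaviour the highlight describes as spending more time on harder tasks.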
Chapters
- 1:05 Stepping Back from Transformers: Llion Jones discusses the shift in AI research from the open-ended exploration of the Transformer era to the current era of reduced research freedom.
- 6:40 The Era of Technology Capture: An exploration of how the ubiquity of the Transformer architecture may be creating a 'local minimum' in AI development.
- 17:15 The Limits of Scaling: A critique of how current models can produce clearly incorrect outputs despite massive scale, signaling a fundamental architectural flaw.
- 28:40 Introducing Continuous Thought Machines: A deep dive into the CTM architecture and how it differs from the 'instantaneous' processing of standard Transformers.
- 34:00 Adaptive Computation & Maze Solving: Using the maze analogy to explain how CTM can use attention to retrieve information and 'think' through steps sequentially.
- 39:40 Technical Deep Dive: CTM Architecture: A technical look at neuron synchronization and measuring activations over time within the CTM framework.
- 55:45 The Future of AI Research: Advice for young researchers on navigating the 'maze' of AI and the importance of pursuing passion-driven, bottom-up research.
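The 39:40 chapter mentions neuron synchronization and measuring activations over time. One simple way to picture that, offered only as a sketch, is to record each neuron's activation at every tick and compare the resulting traces pairwise. The listing does not say how the episode or the CTM framework defines synchronization, so the inner-product measure and the name `synchronization` below are assumptions.

```python
def synchronization(traces):
    """Pairwise inner products of activation traces.

    `traces[i]` is the list of activations of neuron i across ticks.
    Assumption: synchronization is taken to be the dot product of two
    traces, so neurons that rise and fall together score high.
    """
    n = len(traces)
    return [
        [sum(a * b for a, b in zip(traces[i], traces[j])) for j in range(n)]
        for i in range(n)
    ]


# Three hypothetical neurons observed over four ticks:
# neurons 0 and 1 move together, neuron 2 moves in opposition.
traces = [
    [1.0, -1.0, 1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, 1.0, -1.0, 1.0],
]
sync = synchronization(traces)
for row in sync:
    print(row)
```

The resulting matrix is symmetric, with positive entries for neurons in phase and negative entries for neurons out of phase.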