# He Co-Invented the Transformer. Now: Continuous Thought Machines - Llion Jones and Luke Darlow [Sakana AI]

Page: https://stenobird.com/podcast/machine-learning-street-talk/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai
Text version: https://stenobird.com/podcast/machine-learning-street-talk/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai.md
Podcast: [Machine Learning Street Talk (MLST)](https://stenobird.com/podcast/machine-learning-street-talk)
Published: 2025-11-23T17:36:59+00:00
Episode link: https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/He-Co-Invented-the-Transformer--Now-Continuous-Thought-Machines---Llion-Jones-and-Luke-Darlow-Sakana-AI-e3bbt96
Audio file: https://traffic.megaphone.fm/APO6903071163.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai
Duration seconds: 4359

## Resource

Llion Jones, co-inventor of the Transformer, argues that current scaling laws are trapping AI in a local minimum of pattern matching rather than true reasoning. He and Luke Darlow introduce Continuous Thought Machines (CTM) as a biologically-inspired alternative that allows models to 'ponder' and process information step-by-step.

## Highlights

- Main idea: The Transformer architecture excels at pattern recognition but lacks the ability to genuinely 'think' through complex, multi-step problems
- Failure mode: Current LLMs use 'brute force' scaling to mimic complex shapes or logic, effectively faking understanding through high-dimensional straight lines
- Practical takeaway: Continuous Thought Machines (CTM) enable adaptive computation, allowing a model to spend more time on harder tasks by 'walking' through a problem
- Technical insight: CTM uses a self-bootstrapping mechanism where the model is trained to predict only the next step in a sequence it has already partially mastered
- Research philosophy: Moving away from 'architecture lottery' and fixed-compute models toward systems that can naturally backtrack and correct errors

## Topics

Transformer Architecture, Continuous Thought Machines, Sakana AI, Adaptive Computation, Neural Network Architecture, Machine Learning Research, Artificial General Intelligence, Pattern Recognition

## Chapters

- 1:05 — Stepping Back from Transformers: Llion Jones discusses the shift in AI research from the open-ended exploration of the Transformer era to the current era of reduced research freedom.
- 6:40 — The Era of Technology Capture: An exploration of how the ubiquity of the Transformer architecture may be creating a 'local minimum' in AI development.
- 17:15 — The Limits of Scaling: A critique of how current models can produce clearly incorrect outputs despite massive scale, signaling a fundamental architectural flaw.
- 28:40 — Introducing Continuous Thought Machines: A deep dive into the CTM architecture and how it differs from the 'instantaneous' processing of standard Transformers.
- 34:00 — Adaptive Computation & Maze Solving: Using the maze analogy to explain how CTM can use attention to retrieve information and 'think' through steps sequentially.
- 39:40 — Technical Deep Dive: CTM Architecture: A technical look at neuron synchronization and measuring activations over time within the CTM framework.
- 55:45 — The Future of AI Research: Advice for young researchers on navigating the 'maze' of AI and the importance of pursuing passion-driven, bottom-up research.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/machine-learning-street-talk/he-co-invented-the-transformer-now-continuous-thought-machines-llion-jones-and-luke-darlow-sakana-ai.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.