# #228 - GPT 5.2, Scaling Agents, Weird Generalization

- Page: https://stenobird.com/podcast/last-week-in-ai/228-gpt-5-2-scaling-agents-weird-generalization
- Text version: https://stenobird.com/podcast/last-week-in-ai/228-gpt-5-2-scaling-agents-weird-generalization.md
- Podcast: [Last Week in AI](https://stenobird.com/podcast/last-week-in-ai)
- Published: 2025-12-17T08:00:00+00:00
- Episode link: https://rss.art19.com/episodes/ff43c594-5876-4808-9d7e-4ff32cca7d5b.mp3?rss_browser=BAhJIg90cmFuc2NyaWJyBjoGRVQ%3D--952c5701c84ad333c69d5faa668f8177091704f0
- Audio file: https://rss.art19.com/episodes/ff43c594-5876-4808-9d7e-4ff32cca7d5b.mp3?rss_browser=BAhJIg90cmFuc2NyaWJyBjoGRVQ%3D--952c5701c84ad333c69d5faa668f8177091704f0
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/last-week-in-ai/episodes/228-gpt-5-2-scaling-agents-weird-generalization
- Duration seconds: 5202

## Resource

OpenAI's GPT-5.2 release marks a significant leap in multimodal performance, though it introduces new cost and knowledge-cutoff challenges. The episode also explores the $1 billion Disney-OpenAI partnership and the complexities of scaling multi-agent systems.

## Highlights

- Main idea: GPT-5.2 outperforms Claude Opus 4.5 on reasoning benchmarks such as SWE-bench Pro
- Business shift: Disney's $1 billion investment in OpenAI aims to integrate Marvel, Pixar, and Star Wars characters into Sora
- Practical takeaway: Scaling multi-agent systems requires solving complex tool coordination and task performance challenges
- Failure mode: Relying solely on increased compute (software-only singularity) may not be enough to reach superintelligence without algorithmic breakthroughs
- Geopolitical tension: New U.S. chip export rules and investigations into smuggling networks highlight AI hardware as critical national security infrastructure

## Topics

OpenAI, GPT-5.2, Multi-agent systems, AI hardware, Robotics, Machine Learning, Generative Video, AI Regulation

## Chapters

- 7:50 — GPT-5.2 Performance vs Claude 4.5: A comparison of reasoning capabilities, noting GPT-5.2's top-tier performance on SWE-bench Pro.
- 14:35 — Product Updates: Adobe & Google: Discussion of ChatGPT's new integration with Adobe apps and Google's approach to linking AI sources.
- 21:00 — Global Chip Competition: The struggle for Nvidia H200 chips in China and the implications of U.S. export controls.
- 27:30 — The Rise of Neuromorphic Computing: Unconventional AI's massive seed round and the pursuit of energy-efficient, biological-style computing.
- 48:00 — The Science of Scaling Agents: DeepMind's research into the difficulties of coordinating multiple agents in complex environments.
- 1:08:05 — Stability in LLM Reasoning: Exploring mathematical approaches to maintaining stability during intermediate reasoning steps.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/last-week-in-ai/episodes/228-gpt-5-2-scaling-agents-weird-generalization/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/last-week-in-ai/228-gpt-5-2-scaling-agents-weird-generalization.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
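For agents that want to call the two actions listed under Actions, the following is a minimal Python sketch using only the standard library. The URLs and HTTP methods come from this page; everything else is an assumption made for illustration: the helper names (`build_read_markdown_request`, `build_transcript_request`, `send`), the empty POST body, the absence of authentication headers, and the treatment of the response as plain text. The page does not document request or response schemas, so adjust accordingly.

```python
from urllib.request import Request, urlopen

# Values taken from the URLs on this page.
BASE_URL = "https://stenobird.com"
PODCAST = "last-week-in-ai"
EPISODE = "228-gpt-5-2-scaling-agents-weird-generalization"


def build_read_markdown_request(podcast=PODCAST, episode=EPISODE):
    """Build the GET request for the read_markdown action."""
    url = f"{BASE_URL}/podcast/{podcast}/{episode}.md"
    return Request(url, method="GET")


def build_transcript_request(podcast=PODCAST, episode=EPISODE):
    """Build the POST request for the request_transcript action.

    Assumption: the endpoint accepts an empty body and needs no
    authentication; the page does not specify either.
    """
    url = (
        f"{BASE_URL}/v1/public/podcasts/{podcast}"
        f"/episodes/{episode}/transcription-requests"
    )
    return Request(url, data=b"", method="POST")


def send(request, timeout=10.0):
    """Send a prepared request and return (status code, body text).

    Assumption: the body is decoded as UTF-8 text without parsing,
    because the response format is not documented on this page.
    """
    with urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

Building the request is kept separate from sending it so the URL and method can be inspected or logged before any network traffic happens. Calling `send(build_transcript_request())` would issue the actual POST; since the action is described as idempotent, repeating that call for the same episode should not enqueue duplicate work.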