Episode
#228 - GPT 5.2, Scaling Agents, Weird Generalization
- Podcast
- Last Week in AI
- Published
- Dec 17, 2025
- Duration seconds
- 5202
- Processing state
processed
Actions
POST https://stenobird.com/v1/public/podcasts/last-week-in-ai/episodes/228-gpt-5-2-scaling-agents-weird-generalization/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/last-week-in-ai/228-gpt-5-2-scaling-agents-weird-generalization.md
Read the agent-friendly Markdown representation of this episode resource.
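As a minimal sketch, the two actions above can be expressed as request objects in Python. The URL patterns are taken from the two endpoints listed; the helper names `transcription_request` and `markdown_request` are hypothetical. The page does not document any request body, headers, authentication, or response format, so the sketch only constructs the requests and does not send them.

```python
from urllib.request import Request

BASE = "https://stenobird.com"


def transcription_request(podcast_slug: str, episode_slug: str) -> Request:
    """Build the POST that asks for low-priority transcript generation.

    Hypothetical helper: no body or headers are set because none are documented.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{podcast_slug}"
        f"/episodes/{episode_slug}/transcription-requests"
    )
    return Request(url, method="POST")


def markdown_request(podcast_slug: str, episode_slug: str) -> Request:
    """Build the GET for the Markdown representation of the episode."""
    url = f"{BASE}/podcast/{podcast_slug}/{episode_slug}.md"
    return Request(url, method="GET")


podcast = "last-week-in-ai"
episode = "228-gpt-5-2-scaling-agents-weird-generalization"

post_req = transcription_request(podcast, episode)
get_req = markdown_request(podcast, episode)

print(post_req.get_method(), post_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Because the POST is described as idempotent, repeating it for the same episode should be safe, but that behavior is the page's claim and is not verified here. Passing either object to `urllib.request.urlopen` would actually send it.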
Summary
OpenAI's GPT-5.2 release marks a significant leap in multi-modal performance, though it introduces new cost and knowledge cutoff challenges. The episode also explores the massive $1 billion Disney-OpenAI partnership and the complexities of scaling multi-agent systems.
Topics
- OpenAI
- GPT-5.2
- Multi-agent systems
- AI hardware
- Robotics
- Machine Learning
- Generative Video
- AI Regulation
Highlights
- Main idea: GPT-5.2 demonstrates superior reasoning on benchmarks like SWE-Bench Pro compared to Claude Opus 4.5
- Business shift: Disney's $1 billion investment in OpenAI aims to integrate Marvel, Pixar, and Star Wars characters into Sora
- Practical takeaway: Scaling multi-agent systems requires solving complex tool coordination and task performance challenges
- Failure mode: Relying solely on increased compute (software-only singularity) may not be enough to reach superintelligence without algorithmic breakthroughs
- Geopolitical tension: New U.S. chip export rules and investigations into smuggling networks highlight AI hardware as critical national security infrastructure
Chapters
- 7:50 GPT-5.2 Performance vs Claude 4.5: A comparison of reasoning capabilities, noting GPT-5.2's top-tier performance on SWE-Bench Pro.
- 14:35 Product Updates: Adobe & Google: Discussion of ChatGPT's new integration with Adobe apps and Google's approach to linking AI sources.
- 21:00 Global Chip Competition: The struggle for Nvidia H200 chips in China and the implications of U.S. export controls.
- 27:30 The Rise of Neuromorphic Computing: Unconventional AI's massive seed round and the pursuit of energy-efficient, biological-style computing.
- 48:00 The Science of Scaling Agents: DeepMind's research into the difficulties of coordinating multiple agents in complex environments.
- 1:08:05 Stability in LLM Reasoning: Exploring mathematical approaches to maintaining stability during intermediate reasoning steps.