Episode

Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery - #759

Podcast: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Published: Dec 17, 2025
Duration seconds: 3174
Processing state: processed
Canonical source: https://twimlai.com/podcast/twimlai/rethinking-pretraining-for-agentic-ai/
Audio: https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN3462034138.mp3?updated=1766003076
JSON: /v1/public/podcasts/twiml-ai-podcast/episodes/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759
Markdown: /podcast/twiml-ai-podcast/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759.md

Actions

POST https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/twiml-ai-podcast/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759.md
Read the agent-friendly Markdown representation of this episode resource.
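
The two actions above can be sketched in Python as request builders. This is a minimal illustration, not a documented client: the helper names (transcription_request, markdown_request) are hypothetical, the empty POST body is an assumption, and nothing here is known about authentication, rate limits, or response format. Only the URLs and HTTP methods come from the listing above.

```python
from urllib.request import Request

# Host and paths are copied from the Actions listing; everything else is assumed.
BASE = "https://stenobird.com"
SLUG = "rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759"
API_PATH = f"/v1/public/podcasts/twiml-ai-podcast/episodes/{SLUG}"
PAGE_PATH = f"/podcast/twiml-ai-podcast/{SLUG}"


def transcription_request() -> Request:
    """Build (but do not send) the POST that requests transcript generation.

    Assumption: the endpoint accepts an empty body. The listing describes
    the call as idempotent, so repeating it should be safe.
    """
    url = f"{BASE}{API_PATH}/transcription-requests"
    return Request(url, data=b"", method="POST")


def markdown_request() -> Request:
    """Build (but do not send) the GET for the Markdown representation."""
    return Request(f"{BASE}{PAGE_PATH}.md", method="GET")
```

Passing either object to urllib.request.urlopen would send it. The builders are kept separate from the network call so the URLs and methods can be inspected first.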

Summary

To move beyond static benchmarks, AI pre-training must shift from simple next-token prediction to supporting multi-step reasoning and environmental interaction. Aakanksha Chowdhery argues that true agentic capabilities like error recovery and tool use require fundamental changes to training objectives and to training data, in particular the use of trajectory data.

Topics

Agentic AI
Pre-training
Large Language Models
Machine Learning
Reasoning
Artificial Intelligence
Neural Networks
Model Evaluation

Highlights

Main idea: Agentic AI requires a fundamental rethink of pre-training objectives rather than relying solely on post-training refinements
Failure mode: Static benchmarks like GSM8K fail to measure a model's ability to interact with dynamic environments
Practical takeaway: Training on 'trajectory' data is essential for teaching models to plan multiple steps ahead and recover from failed actions
Main idea: Scaling remains the primary driver for discovering emergent capabilities like cross-modal reasoning and dynamic tool learning
Practical takeaway: High-quality, representative data curation is just as critical as scale for achieving efficiency in modern models

Chapters

1:00 Foundations of Large-Scale Pre-training: Aakanksha discusses her experience building the distributed systems for PaLM and Gemini, highlighting the complexities of scaling models to hundreds of billions of parameters.
4:35 The Need for Fundamental Pre-training Shifts: The limitations of current benchmarks and the argument that agentic capabilities cannot be solved through post-training alone.
8:15 Attention Mechanisms and Reasoning: How the attention mechanism serves as the fundamental engine for long-form reasoning and multi-step planning.
16:15 Beyond Next-Token Prediction: Examining why the standard next-token prediction objective is insufficient for the complex requirements of autonomous agents.
20:10 Data Curation and Efficiency: Insights into how data quality and curation drive competition and efficiency in the current LLM landscape.
27:40 Designing Better Benchmarks: The importance of breaking down real-world workflows into measurable sub-problems to create meaningful evaluation metrics.
36:05 Predictive Planning and Error Recovery: The necessity of training models to 'think ahead' and develop the ability to self-correct during inference.