Episode

Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery - #759

Podcast
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Published
Dec 17, 2025
Duration seconds
3174
Processing state
processed
Canonical source
https://twimlai.com/podcast/twimlai/rethinking-pretraining-for-agentic-ai/
Audio
https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN3462034138.mp3?updated=1766003076
JSON
/v1/public/podcasts/twiml-ai-podcast/episodes/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759
Markdown
/podcast/twiml-ai-podcast/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/twiml-ai-podcast/rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759.md
    Read the agent-friendly Markdown representation of this episode resource.
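
The two actions above can be sketched in Python with the standard library. This is a minimal sketch, not a documented client: the URLs are copied from the actions list, while the helper names, the empty POST body, and the absence of authentication headers are assumptions, since the page does not state what the endpoint expects. The functions only build the requests; nothing is sent.

```python
from urllib.request import Request

# URL pieces copied from the Actions list above.
BASE = "https://stenobird.com"
PODCAST_SLUG = "twiml-ai-podcast"
EPISODE_SLUG = "rethinking-pre-training-for-agentic-ai-with-aakanksha-chowdhery-759"


def transcription_request() -> Request:
    """Build (but do not send) the POST that requests transcript generation."""
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST_SLUG}"
        f"/episodes/{EPISODE_SLUG}/transcription-requests"
    )
    # Assumption: an empty body and no auth headers are sufficient.
    return Request(url, data=b"", method="POST")


def markdown_request() -> Request:
    """Build (but do not send) the GET for the Markdown representation."""
    url = f"{BASE}/podcast/{PODCAST_SLUG}/{EPISODE_SLUG}.md"
    return Request(url, method="GET")
```

To actually send either request, pass it to `urllib.request.urlopen`. Because the POST is described as idempotent, repeating it should be safe, though the response format is not documented on this page.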

Summary

To move beyond static benchmarks, AI pre-training must shift from simple next-token prediction toward objectives that support multi-step reasoning and interaction with an environment. Aakanksha Chowdhery argues that agentic capabilities such as error recovery and tool use require fundamental changes to training objectives and to the trajectory data that models are trained on.

Topics

  • Agentic AI
  • Pre-training
  • Large Language Models
  • Machine Learning
  • Reasoning
  • Artificial Intelligence
  • Neural Networks
  • Model Evaluation

Highlights

  • Main idea: Agentic AI requires a fundamental rethink of pre-training objectives rather than relying solely on post-training refinements
  • Failure mode: Static benchmarks like GSM8K fail to measure a model's ability to interact with dynamic environments
  • Practical takeaway: Training on 'trajectory' data is essential for teaching models to plan multiple steps ahead and recover from failed actions
  • Main idea: Scaling remains the primary driver for discovering emergent capabilities like cross-modal reasoning and dynamic tool learning
  • Practical takeaway: High-quality, representative data curation is just as critical as scale for achieving efficiency in modern models
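
To make the 'trajectory' data idea in the highlights concrete, here is a purely illustrative sketch of what one trajectory record might look like. The `Step` fields, the `recovery_points` helper, and the example actions are all hypothetical; the episode summary does not specify a data format. The sketch only shows that a trajectory records a sequence of actions and their outcomes, so a failed action followed by a successful one marks an error recovery.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One agent step (hypothetical schema, not from the episode)."""
    action: str        # what the agent tried
    observation: str   # what the environment returned
    succeeded: bool    # whether the action achieved its goal


def recovery_points(steps: list[Step]) -> list[int]:
    """Return indices of successful steps that directly follow a failed step."""
    return [
        i + 1
        for i in range(len(steps) - 1)
        if not steps[i].succeeded and steps[i + 1].succeeded
    ]


# A toy trajectory: the first action fails, the agent then recovers.
example = [
    Step("run_tests", "2 tests failed", False),
    Step("edit_file", "patch applied", True),
    Step("run_tests", "all tests passed", True),
]
```

Here `recovery_points(example)` returns `[1]`: the step at index 1 is the first successful action after the failure at index 0.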

Chapters

  1. 1:00 Foundations of Large Scale Pre-training: Aakanksha discusses her experience building the distributed systems for PaLM and Gemini, highlighting the complexities of scaling models to hundreds of billions of parameters.
  2. 4:35 The Need for Fundamental Pre-training Shifts: The limitations of current benchmarks and the argument that agentic capabilities cannot be achieved through post-training alone.
  3. 8:15 Attention Mechanisms and Reasoning: How the attention mechanism serves as the fundamental engine for long-form reasoning and multi-step planning.
  4. 16:15 Beyond Next-Token Prediction: Examining why the standard next-token prediction objective is insufficient for the complex requirements of autonomous agents.
  5. 20:10 Data Curation and Efficiency: Insights into how data quality and curation drive competition and efficiency in the current LLM landscape.
  6. 27:40 Designing Better Benchmarks: The importance of breaking down real-world workflows into measurable sub-problems to create meaningful evaluation metrics.
  7. 36:05 Predictive Planning and Error Recovery: The necessity of training models to 'think ahead' and develop the ability to self-correct during inference.