Episode

Glean's Waldo: The Agentic Search Model Making AI Faster and Cheaper

Podcast: Answer Engine Optimization (AEO): The AI Search Podcast
Published: May 2, 2026
Duration seconds: 394
Processing state: processed
Canonical source: https://share.transistor.fm/s/52bff747
Audio: https://media.transistor.fm/52bff747/413dfcbb.mp3
JSON: /v1/public/podcasts/answer-engine-optimization-aeo-the-ai-search-podcast-7756998/episodes/glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper
Markdown: /podcast/answer-engine-optimization-aeo-the-ai-search-podcast-7756998/glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper.md

Actions

POST https://stenobird.com/v1/public/podcasts/answer-engine-optimization-aeo-the-ai-search-podcast-7756998/episodes/glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/answer-engine-optimization-aeo-the-ai-search-podcast-7756998/glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper.md
Read the agent-friendly Markdown representation of this episode resource.
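
The two actions above can be sketched in Python using only the standard library. The URLs are the ones listed above; the function names, the empty POST body, and the timeout are assumptions made for illustration, since this page does not document request bodies, headers, or the response format.

```python
import urllib.request

BASE_URL = "https://stenobird.com"
PODCAST = "answer-engine-optimization-aeo-the-ai-search-podcast-7756998"
EPISODE = "glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper"


def transcription_request(podcast: str = PODCAST, episode: str = EPISODE) -> urllib.request.Request:
    """Build the POST that asks for low-priority transcript generation."""
    url = (
        f"{BASE_URL}/v1/public/podcasts/{podcast}"
        f"/episodes/{episode}/transcription-requests"
    )
    # Assumption: the endpoint accepts an empty body.
    return urllib.request.Request(url, data=b"", method="POST")


def markdown_request(podcast: str = PODCAST, episode: str = EPISODE) -> urllib.request.Request:
    """Build the GET for the Markdown representation of the episode."""
    url = f"{BASE_URL}/podcast/{podcast}/{episode}.md"
    return urllib.request.Request(url, method="GET")


def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send a prepared request and return (status code, decoded body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Because the POST is described as idempotent, repeating `send(transcription_request())` should be safe. What the response contains is not stated here, so inspect it before relying on any field.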

Summary

Glean's Waldo is an agentic search model that uses reinforcement learning to optimize the retrieval process. Acting as a specialized intermediary between the user's query and frontier models, it cuts latency by 50% and token consumption by 25% without sacrificing final answer quality.
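
To make the scale of those savings concrete, the sketch below applies the reductions listed under Highlights (50% lower latency, 25% fewer tokens) to a baseline. The baseline latency and token count are hypothetical numbers chosen for illustration; the episode page does not state what the reductions are measured against.

```python
def apply_reduction(baseline: float, reduction: float) -> float:
    """Return the value left after removing a fractional reduction."""
    return baseline * (1.0 - reduction)


# Reductions quoted in the episode highlights.
LATENCY_REDUCTION = 0.50
TOKEN_REDUCTION = 0.25

# Hypothetical baseline for one complex query; not from the episode.
baseline_latency_s = 8.0
baseline_tokens = 12_000

latency_s = apply_reduction(baseline_latency_s, LATENCY_REDUCTION)
tokens = apply_reduction(baseline_tokens, TOKEN_REDUCTION)

print(f"latency: {baseline_latency_s} s -> {latency_s} s")
print(f"tokens:  {baseline_tokens} -> {tokens:.0f}")
```

Under these assumed baselines, the query would take 4 seconds instead of 8 and consume 9,000 tokens instead of 12,000.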

Topics

Agentic Search
Glean Waldo
Reinforcement Learning
Enterprise AI
LLM Efficiency
Answer Engine Optimization
Query Decomposition
Retrieval Augmented Generation

Highlights

Main idea: Waldo functions as a specialized agent that performs query decomposition and tool selection before any call reaches a frontier model.
Efficiency gain: The model achieves 50% lower latency and consumes 25% fewer tokens through optimized retrieval.
Technical mechanism: The system runs a planning loop of query decomposition, iterative search, and evaluation rather than a single lookup.
Strategic shift: The rise of agentic search necessitates a move from traditional SEO to Answer Engine Optimization (AEO) to ensure brand discoverability.
Failure mode: Relying on monolithic models for complex enterprise queries leads to unsustainable costs and slow response times.
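
The planning loop named under "Technical mechanism" can be illustrated with a toy sketch. This is not Glean's implementation: the in-memory tools, the keyword-based tool ranking, and every function name below are hypothetical stand-ins, and the reinforcement-learned policy is replaced by simple rules. It only shows the shape of decompose, search, evaluate, and retry.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Toy in-memory "tools"; a real system would query search indexes or APIs.
TOOLS = {
    "docs": {
        "vacation policy": "Employees accrue 1.5 vacation days per month.",
        "expense policy": "Expenses above $500 need manager approval.",
    },
    "people": {
        "hr contact": "HR can be reached at the internal help desk.",
        "payroll": "Payroll questions go to the finance desk.",
    },
}


@dataclass
class Evidence:
    sub_query: str
    tool: str
    snippet: str


def decompose(query: str) -> List[str]:
    """Query decomposition (naive): split a compound question on ' and '."""
    parts = (part.strip(" ?").lower() for part in query.split(" and "))
    return [part for part in parts if part]


def rank_tools(sub_query: str) -> List[str]:
    """Tool selection: keyword rule standing in for a learned policy."""
    return ["people", "docs"] if "contact" in sub_query else ["docs", "people"]


def search(tool: str, sub_query: str) -> Optional[str]:
    """Return the first snippet whose key phrase appears in the sub-query."""
    for key, snippet in TOOLS[tool].items():
        if key in sub_query:
            return snippet
    return None


def run_loop(query: str, max_rounds: int = 2) -> Tuple[List[Evidence], List[str]]:
    """Decompose, then search each sub-query iteratively until evidence is found."""
    evidence: List[Evidence] = []
    unresolved: List[str] = []
    for sub_query in decompose(query):
        for tool in rank_tools(sub_query)[:max_rounds]:
            snippet = search(tool, sub_query)
            if snippet is not None:  # evaluation: did this round find evidence?
                evidence.append(Evidence(sub_query, tool, snippet))
                break
        else:
            unresolved.append(sub_query)  # every round came back empty
    return evidence, unresolved


evidence, unresolved = run_loop(
    "What is the vacation policy and who handles payroll and where is the cafeteria?"
)
for item in evidence:
    print(f"[{item.tool}] {item.sub_query}: {item.snippet}")
print("unresolved:", unresolved)
```

Only the curated `evidence` list would then be passed to a frontier model, which is the intuition behind the lower token count: the expensive model sees a short set of snippets instead of raw search results.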

Chapters

0:00 Introduction to Waldo: Glean's new agentic search model and its impact on enterprise AI efficiency.
1:00 The Problem with Massive Models: Discussing the latency and token costs associated with sending every query directly to frontier LLMs.
2:00 Efficiency Without Quality Loss: How Waldo uses a curated set of evidence to maintain high-quality answers while reducing compute.
3:00 The Agentic Reasoning Loop: A deep dive into query decomposition, tool selection, and the iterative search process.
4:00 The Future of Enterprise AI: Analyzing the shift from monolithic models to orchestrated systems of specialized agents.
5:00 Optimizing for Agentic Discovery: How brands must adapt their content structure to be found by intermediate search agents.
6:00 Conclusion and Takeaways: Final thoughts on the reality of agentic search models and the changing landscape of AI information consumption.