Episode
Glean's Waldo: The Agentic Search Model Making AI Faster and Cheaper
- Published
- May 2, 2026
- Duration seconds
- 394
- Processing state
- processed
- Canonical source
- https://share.transistor.fm/s/52bff747
Actions
POST https://stenobird.com/v1/public/podcasts/answer-engine-optimization-aeo-the-ai-search-podcast-7756998/episodes/glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/answer-engine-optimization-aeo-the-ai-search-podcast-7756998/glean-s-waldo-the-agentic-search-model-making-ai-faster-and-cheaper.md
Read the agent-friendly Markdown representation of this episode resource.
Summary
Glean's Waldo introduces an agentic search model that uses reinforcement learning to optimize the retrieval process. By acting as a specialized intermediary, it significantly reduces latency and token consumption without sacrificing final answer quality.
Topics
- Agentic Search
- Glean Waldo
- Reinforcement Learning
- Enterprise AI
- LLM Efficiency
- Answer Engine Optimization
- Query Decomposition
- Retrieval Augmented Generation
Highlights
- Main idea: Waldo functions as a specialized agent that performs query decomposition and tool selection before a query reaches a frontier model
- Efficiency gain: Through optimized retrieval, the model achieves 50% lower latency and consumes 25% fewer tokens
- Technical mechanism: The system uses a planning loop of query decomposition, iterative search, and evaluation rather than a single lookup
- Strategic shift: The rise of agentic search necessitates a move from traditional SEO to Answer Engine Optimization (AEO) to ensure brand discoverability
- Failure mode: Relying on monolithic models for complex enterprise queries leads to unsustainable costs and slow response times
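The planning loop named in the highlights (query decomposition, iterative search, evaluation) can be pictured with a toy sketch. Nothing below reflects Glean's actual implementation: the corpus, the keyword-overlap search, the split on " and ", the relax-the-threshold retry rule, and every function name are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory corpus standing in for enterprise documents.
CORPUS = {
    "doc1": "Waldo decomposes a query into smaller sub-queries",
    "doc2": "Tool selection picks the right search backend for each sub-query",
    "doc3": "Curated evidence is passed to the frontier model for the final answer",
}


@dataclass
class Plan:
    sub_queries: list[str]
    evidence: dict[str, list[str]] = field(default_factory=dict)


def decompose(query: str) -> list[str]:
    """Toy decomposition (assumption): split a compound question on ' and '."""
    return [part.strip() for part in query.lower().split(" and ") if part.strip()]


def search(sub_query: str, min_overlap: int) -> list[str]:
    """Toy keyword search: doc ids sharing at least min_overlap words."""
    words = set(sub_query.split())
    hits = []
    for doc_id, text in CORPUS.items():
        overlap = len(words & set(text.lower().split()))
        if overlap >= min_overlap:
            hits.append(doc_id)
    return hits


def evaluate(plan: Plan) -> list[str]:
    """Evaluation step (assumption): sub-queries that still lack evidence."""
    return [q for q in plan.sub_queries if not plan.evidence.get(q)]


def agentic_search(query: str, max_rounds: int = 3) -> Plan:
    plan = Plan(sub_queries=decompose(query))
    min_overlap = max_rounds  # start strict, relax the match each round
    for _ in range(max_rounds):
        pending = evaluate(plan)
        if not pending:
            break  # every sub-query has evidence, stop searching
        for q in pending:
            plan.evidence[q] = search(q, min_overlap)
        min_overlap = max(1, min_overlap - 1)
    return plan


plan = agentic_search("how does waldo decompose a query and what is tool selection")
print(plan.evidence)
```

In this sketch the first sub-query finds evidence on the first pass, while the second only matches after the threshold is relaxed, which is the point of iterating rather than doing a single lookup. In the design the episode describes, only the curated evidence collected this way would be passed on to the frontier model.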
Chapters
- 0:00 Introduction to Waldo: An introduction to Glean's new agentic search model and its impact on enterprise AI efficiency.
- 1:00 The Problem with Massive Models: Discussing the latency and token costs associated with sending every query directly to frontier LLMs.
- 2:00 Efficiency Without Quality Loss: How Waldo uses a curated set of evidence to maintain high-quality answers while reducing compute.
- 3:00 The Agentic Reasoning Loop: A deep dive into query decomposition, tool selection, and the iterative search process.
- 4:00 The Future of Enterprise AI: Analyzing the shift from monolithic models to orchestrated systems of specialized agents.
- 5:00 Optimizing for Agentic Discovery: How brands must adapt their content structure to be found by intermediate search agents.
- 6:00 Conclusion and Takeaways: Final thoughts on the reality of agentic search models and the changing landscape of AI information consumption.