Episode

The Truth About Agents in Production

Podcast
The Data Exchange with Ben Lorica
Published
Dec 31, 2025
Duration seconds
1537
Processing state
processed
Canonical source
https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18412003-the-truth-about-agents-in-production.mp3
Audio
https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18412003-the-truth-about-agents-in-production.mp3
JSON
/v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/the-truth-about-agents-in-production
Markdown
/podcast/the-data-exchange-with-ben-lorica/the-truth-about-agents-in-production.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/the-truth-about-agents-in-production/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/the-truth-about-agents-in-production.md
    Read the agent-friendly Markdown representation of this episode resource.
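
The two actions above can be sketched as plain HTTP requests. This is a minimal sketch using only Python's standard library. The URLs are copied from the listing; the helper names are hypothetical, and the empty POST body is an assumption, since the listing does not state whether the endpoint expects a payload, headers, or authentication. The functions only build the requests and do not send anything.

```python
from urllib.request import Request

BASE = "https://stenobird.com"
# Paths copied from the Actions listing above.
TRANSCRIPTION_PATH = (
    "/v1/public/podcasts/the-data-exchange-with-ben-lorica"
    "/episodes/the-truth-about-agents-in-production/transcription-requests"
)
MARKDOWN_PATH = (
    "/podcast/the-data-exchange-with-ben-lorica"
    "/the-truth-about-agents-in-production.md"
)


def build_transcription_request() -> Request:
    """Build the POST that asks for transcript generation.

    Assumption: an empty body is acceptable; the listing does not say.
    """
    return Request(BASE + TRANSCRIPTION_PATH, data=b"", method="POST")


def build_markdown_request() -> Request:
    """Build the GET for the Markdown representation of the episode."""
    return Request(BASE + MARKDOWN_PATH, method="GET")
```

To actually send one of these, pass it to `urllib.request.urlopen`. Because the POST is described as idempotent, repeating it should be safe, though the response format is not documented here.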

Summary

A panel of industry leaders from Anthropic, LlamaIndex, Pydantic, and Arize AI discusses the transition from simple LLM prompts to complex agentic workflows. The discussion focuses on the practical engineering challenges of reliability, evaluation, and tool integration in production environments.

Topics

  • Agentic AI
  • LLM Evaluation
  • AI Observability
  • Computer Use
  • Model Context Protocol
  • RAG
  • Software Engineering
  • AI Infrastructure

Highlights

  • Main idea: Successful agent deployment relies on translating specific business processes into workflows rather than forcing AI into existing structures
  • Practical takeaway: Implementing type safety in agent frameworks is critical for the reliability of coding agents
  • Failure mode: Over-reliance on offline evaluations can hide real-world user friction and production errors
  • Main idea: The future of agents lies in 'computer use' and the ability to interact with unstructured interfaces where APIs do not exist
  • Practical takeaway: Using high-reasoning models for planning and delegating execution to faster, cheaper models can optimize agentic performance
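
Two of the takeaways above (typed structures for reliability, and a planner model delegating to a cheaper executor model) can be illustrated together. The following is a minimal sketch, not the panel's implementation: `Step`, `run_agent`, `fake_planner`, and `fake_executor` are hypothetical names, and the two "models" are stand-in functions rather than real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    """One unit of work in a plan, validated at construction time."""
    tool: str
    argument: str

    def __post_init__(self) -> None:
        # Reject malformed plans early instead of failing mid-execution.
        if not isinstance(self.tool, str) or not self.tool:
            raise ValueError("tool must be a non-empty string")
        if not isinstance(self.argument, str):
            raise ValueError("argument must be a string")


Planner = Callable[[str], List[Step]]   # stands in for a high-reasoning model
Executor = Callable[[Step], str]        # stands in for a faster, cheaper model


def run_agent(task: str, planner: Planner, executor: Executor) -> List[str]:
    """Call the planner once, then hand each typed step to the executor."""
    plan = planner(task)
    return [executor(step) for step in plan]


def fake_planner(task: str) -> List[Step]:
    # Hypothetical plan; a real planner would be a model call.
    return [Step("search", task), Step("summarize", task)]


def fake_executor(step: Step) -> str:
    # Hypothetical execution; a real executor would call a tool or model.
    return f"{step.tool}:{step.argument}"


if __name__ == "__main__":
    print(run_agent("agents", fake_planner, fake_executor))
```

The design choice here is that the expensive planner runs once per task while the cheap executor runs once per step, and the `Step` type is the only contract between them.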

Chapters

  1. 1:00 Architectural Patterns in Agents: The panel explores successful agent architectures, highlighting the importance of type safety in coding agents.
  2. 2:50 Product-Led AI Development: A discussion of why the best teams focus on solving user problems rather than simply implementing new AI capabilities.
  3. 4:40 The Challenge of Agent Planning: An analysis of the difficulties in managing context handoffs and planning across multi-agent systems.
  4. 10:20 The Role of Evaluations: A debate on the necessity of offline vs. online evaluations and the value of product analytics in measuring agent success.
  5. 17:40 MCP and Computer Use: Exploring the Model Context Protocol (MCP) and the potential for agents to navigate software via direct computer interaction.
  6. 21:30 The Future of Agent Interfaces: Predictions on standardized interfaces like SQL and the evolution of RAG into more active, tool-using search agents.