Episode

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

Podcast
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Published
Apr 16, 2026
Duration
54:18 (3258 seconds)
Canonical source
https://twimlai.com/podcast/twimlai/how-capital-one-delivers-multi-agent-systems
Audio
https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN9114691307.mp3?updated=1776383902

Summary

Capital One scales multi-agent systems in highly regulated environments by separating agent design from runtime governance. The discussion explores how their 'Chat Concierge' uses specialized models and robust observability to execute complex, goal-oriented actions.

Topics

  • Multi-Agent Systems
  • Generative AI Platform
  • Enterprise AI Governance
  • Model Distillation
  • AI Observability
  • LLM Evaluation
  • Agentic Workflows
  • Cloud-Native AI

Highlights

  • Main idea: Multi-agent systems are essential for breaking down complex, multifaceted problems into specific, goal-oriented actions
  • Practical takeaway: Use a platform-centric approach to separate agent design from runtime governance, embedding guardrails and cyber controls at agent boundaries
  • Failure mode: Treating agents as isolated units; evaluate them instead with end-to-end frameworks that account for the entire stochastic workflow
  • Strategy: Leverage model distillation and fine-tuning to achieve the necessary balance between reasoning capabilities and low-latency performance
  • Technical takeaway: Implement a robust observability stack and pluggable SDKs to allow the platform to evolve alongside rapidly advancing LLM capabilities
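
To make the "guardrails at agent boundaries" takeaway concrete, here is a minimal Python sketch of one way to keep agent logic separate from runtime governance. It is not Capital One's implementation, which this page does not describe. Every name in it (GovernedAgent, GuardrailViolation, no_card_numbers, echo_agent) is hypothetical, and the 16-digit check is only a stand-in for a real policy control.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

Agent = Callable[[str], str]   # agent design: plain message-in, message-out logic
Check = Callable[[str], bool]  # governance rule: returns True when the text is allowed


class GuardrailViolation(Exception):
    """Raised when a message fails a check at the agent boundary."""


@dataclass
class GovernedAgent:
    """Hypothetical wrapper that applies checks on the way in and on the way out."""
    name: str
    agent: Agent
    input_checks: List[Check] = field(default_factory=list)
    output_checks: List[Check] = field(default_factory=list)
    audit_log: List[str] = field(default_factory=list)

    def __call__(self, message: str) -> str:
        self._enforce(self.input_checks, message, "input")
        reply = self.agent(message)
        self._enforce(self.output_checks, reply, "output")
        self.audit_log.append(f"{self.name}: ok")
        return reply

    def _enforce(self, checks: List[Check], text: str, stage: str) -> None:
        for check in checks:
            if not check(text):
                # Record the block so an observability layer could pick it up.
                self.audit_log.append(f"{self.name}: blocked at {stage}")
                raise GuardrailViolation(f"{self.name} blocked at {stage}")


def no_card_numbers(text: str) -> bool:
    """Illustrative rule: reject text containing a bare 16-digit number."""
    return re.search(r"\b\d{16}\b", text) is None


def echo_agent(message: str) -> str:
    """Stand-in agent with no governance logic of its own."""
    return f"Echo: {message}"


governed = GovernedAgent(
    name="echo",
    agent=echo_agent,
    input_checks=[no_card_numbers],
    output_checks=[no_card_numbers],
)
```

In this sketch the agent function never sees the checks, so the rules can change at the platform level without touching agent code. Calling governed("hello") passes both boundaries, while a message containing a 16-digit number raises GuardrailViolation before the agent runs.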

Chapters

  1. 1:05 The Evolution of Intelligence: Rashmi discusses her transition from academic research in distributed intelligence to operationalizing agentic AI in the enterprise.
  2. 5:10 The Shift to Multi-Agentic Workflows: An exploration of why complex problems require moving beyond simple LLM responses toward systems capable of taking specific actions.
  3. 9:55 Operating in Regulated Environments: How Capital One manages the tension between deploying cutting-edge agentic technology and maintaining strict regulatory compliance.
  4. 17:20 Platform Architecture and Data Lineage: The technical challenges of managing context, memory, and tool integration across multiple agents without exhausting context windows.
  5. 21:10 Evaluation and Golden Datasets: Moving from individual agent evaluations to end-to-end evaluation frameworks for complex, multi-step workflows.
  6. 25:30 Designing for Future Scalability: How a robust observability stack and pluggable SDKs allow for the seamless integration of new AI components and models.
  7. 29:45 Model Specialization and Distillation: Using fine-tuning and student-teacher distillation to optimize for reasoning, latency, and personalized customer experiences.
  8. 45:55 Treating Agentic AI as a System: Final lessons on the importance of treating agentic workflows as integrated systems rather than a collection of disparate tools.
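
As background for the student-teacher distillation mentioned in chapter 7, here is a minimal Python sketch of the standard temperature-scaled distillation loss, where a student model is trained to match a teacher's softened output distribution. It is a generic textbook formulation, not the method described in the episode. The function names and the default temperature of 2.0 are assumptions chosen for illustration.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 2.0,  # assumed value, for illustration only
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

The loss is zero when the student's logits match the teacher's and grows as the two distributions diverge. In practice this term is usually combined with an ordinary supervised loss on labeled data.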