Episode

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

Podcast: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Published: Apr 16, 2026
Duration: 54:18 (3258 seconds)
Canonical source: https://twimlai.com/podcast/twimlai/how-capital-one-delivers-multi-agent-systems
Audio: https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN9114691307.mp3?updated=1776383902
JSON: /v1/public/podcasts/twiml-ai-podcast/episodes/how-capital-one-delivers-multi-agent-systems-with-rashmi-shetty-765
Markdown: /podcast/twiml-ai-podcast/how-capital-one-delivers-multi-agent-systems-with-rashmi-shetty-765.md

Summary

Capital One scales multi-agent systems in highly regulated environments by separating agent design from runtime governance. The discussion explores how its 'Chat Concierge' uses specialized models and robust observability to execute complex, goal-oriented actions.

Topics

Multi-Agent Systems
Generative AI Platform
Enterprise AI Governance
Model Distillation
AI Observability
LLM Evaluation
Agentic Workflows
Cloud-Native AI

Highlights

Main idea: Multi-agent systems are essential for breaking down complex, multifaceted problems into specific, goal-oriented actions
Practical takeaway: Use a platform-centric approach to separate agent design from runtime governance, embedding guardrails and cyber controls at agent boundaries
Failure mode: Treating agents as isolated units; evaluate them instead with end-to-end frameworks that account for the entire stochastic workflow
Strategy: Leverage model distillation and fine-tuning to achieve the necessary balance between reasoning capabilities and low-latency performance
Technical takeaway: Implement a robust observability stack and pluggable SDKs to allow the platform to evolve alongside rapidly advancing LLM capabilities
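
To make the "separate agent design from runtime governance" takeaway concrete, here is a minimal Python sketch of guardrails applied at agent boundaries. It is an illustration under stated assumptions, not Capital One's platform or API: the names GovernedAgent, run_pipeline, no_account_numbers, planner, and responder are all hypothetical, and the digit-run check is a toy stand-in for real controls.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical types for illustration only; not an API from the episode.
Agent = Callable[[str], str]        # agent design: task logic only
Guardrail = Callable[[str], bool]   # runtime governance: True means "allowed"


class GuardrailViolation(Exception):
    """Raised when a message fails a check at an agent boundary."""


@dataclass
class GovernedAgent:
    """Wraps an agent so checks run on every input and output it handles."""
    name: str
    agent: Agent
    input_checks: List[Guardrail] = field(default_factory=list)
    output_checks: List[Guardrail] = field(default_factory=list)

    def __call__(self, message: str) -> str:
        for check in self.input_checks:
            if not check(message):
                raise GuardrailViolation(f"{self.name}: input blocked")
        result = self.agent(message)
        for check in self.output_checks:
            if not check(result):
                raise GuardrailViolation(f"{self.name}: output blocked")
        return result


def run_pipeline(agents: List[GovernedAgent], message: str) -> str:
    """Pass a message through each governed agent in order."""
    for agent in agents:
        message = agent(message)
    return message


def no_account_numbers(text: str) -> bool:
    """Toy check (an assumption): reject text with 9 or more digits in a row."""
    run = 0
    for ch in text:
        run = run + 1 if ch.isdigit() else 0
        if run >= 9:
            return False
    return True


# Two stand-in agents; real agents would call models and tools.
def planner(message: str) -> str:
    return f"plan: look up inventory for '{message}'"


def responder(plan: str) -> str:
    return f"done ({plan})"


pipeline = [
    GovernedAgent("planner", planner, input_checks=[no_account_numbers]),
    GovernedAgent("responder", responder, output_checks=[no_account_numbers]),
]
```

Calling run_pipeline(pipeline, "used SUVs") passes through both agents, while a message containing a long digit run raises GuardrailViolation at the first boundary. The agent functions carry no governance logic of their own, so checks can be changed at the wrapper without touching agent code.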

Chapters

1:05 The Evolution of Intelligence: Rashmi discusses her transition from academic research in distributed intelligence to operationalizing agentic AI in the enterprise.
5:10 The Shift to Multi-Agentic Workflows: An exploration of why complex problems require moving beyond simple LLM responses toward systems capable of taking specific actions.
9:55 Operating in Regulated Environments: How Capital One manages the tension between deploying cutting-edge agentic technology and maintaining strict regulatory compliance.
17:20 Platform Architecture and Data Lineage: The technical challenges of managing context, memory, and tool integration across multiple agents without exhausting context windows.
21:10 Evaluation and Golden Datasets: Moving from individual agent evaluations to end-to-end evaluation frameworks for complex, multi-step workflows.
25:30 Designing for Future Scalability: How a robust observability stack and pluggable SDKs allow for the seamless integration of new AI components and models.
29:45 Model Specialization and Distillation: Using fine-tuning and student-teacher distillation to optimize for reasoning, latency, and personalized customer experiences.
45:55 Treating Agentic AI as a System: Final lessons on the importance of treating agentic workflows as integrated systems rather than a collection of disparate tools.