# Building and Deploying Real-World RAG Applications with Ram Sriharsha - #669

Page: https://stenobird.com/podcast/twiml-ai-podcast/building-and-deploying-real-world-rag-applications-with-ram-sriharsha-669
Text version: https://stenobird.com/podcast/twiml-ai-podcast/building-and-deploying-real-world-rag-applications-with-ram-sriharsha-669.md
Podcast: [The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)](https://stenobird.com/podcast/twiml-ai-podcast)
Published: 2024-01-29T19:19:00+00:00
Episode link: https://twimlai.com/podcast/twimlai/building-and-deploying-real-world-rag-applications/
Audio file: https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN5047897251.mp3?updated=1706556580
Processing state: failed
JSON: https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/building-and-deploying-real-world-rag-applications-with-ram-sriharsha-669
Duration seconds: 2129

## Resource

Today we’re joined by Ram Sriharsha, VP of engineering at Pinecone. In our conversation, we dive into vector databases and retrieval-augmented generation (RAG). We explore the trade-offs between relying solely on LLMs for retrieval tasks and combining LLMs with retrieval from a vector database, the advantages and complexities of RAG with vector databases, the key considerations for building and deploying real-world RAG-based applications, and Pinecone’s new serverless offering in depth. Currently in public preview, Pinecone Serverless is a vector database that enables on-demand data loading, flexible scaling, and cost-effective query processing. Ram discusses how the serverless paradigm impacts the vector database’s core architecture, key features, and other considerations. Lastly, Ram shares his perspective on the future of vector databases in helping enterprises deliver RAG systems. The complete show notes for this episode can be found at twimlai.com/go/669.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/building-and-deploying-real-world-rag-applications-with-ram-sriharsha-669/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/twiml-ai-podcast/building-and-deploying-real-world-rag-applications-with-ram-sriharsha-669.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
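
The two actions above could be driven from a script along these lines. This is a minimal sketch using only the Python standard library. It assumes the `POST` takes an empty body and needs no authentication, and it makes no assumption about the response format, since none of that is documented on this page. The helper names (`build_transcript_request`, `build_markdown_request`, `send`) are hypothetical and chosen only for illustration.

```python
import urllib.request

# Slugs copied from the action URLs listed on this page.
PODCAST = "twiml-ai-podcast"
EPISODE = "building-and-deploying-real-world-rag-applications-with-ram-sriharsha-669"


def build_transcript_request() -> urllib.request.Request:
    """Build (but do not send) the request_transcript call."""
    url = (
        f"https://stenobird.com/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    # Assumption: an empty body and no auth headers are sufficient.
    return urllib.request.Request(url, data=b"", method="POST")


def build_markdown_request() -> urllib.request.Request:
    """Build (but do not send) the read_markdown call."""
    url = f"https://stenobird.com/podcast/{PODCAST}/{EPISODE}.md"
    return urllib.request.Request(url, method="GET")


def send(req: urllib.request.Request, timeout: float = 10.0):
    """Send a prepared request and return (status code, decoded body)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

An agent that needs this episode processed would call `send(build_transcript_request())` once, then fetch the Markdown again later with `send(build_markdown_request())`. Because the action is described as idempotent, repeating the `POST` should be safe, though how often to re-check for a finished transcript is left open here.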

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.