{"podcast":{"title":"Latent Space: The AI Engineer Podcast","slug":"latent-space-ai-engineer","podcast_index_feed_id":6058902,"rss_url":"https://api.substack.com/feed/podcast/1084089.rss","website_url":"https://www.latent.space/podcast","image_url":"https://substackcdn.com/feed/podcast/1084089/ca7468da5614a246d2906ee8926f6de7.jpg","author":"Latent.Space","episode_count":214,"summary":"The AI Engineer newsletter + Top technical AI podcast. How leading labs build Agents, Models, Infra, & AI for Science. See https://latent.space/about for highlights from Greg Brockman, Andrej Karpathy, George Hotz, Simon Willison, Soumith Chintala et al!","last_synced_at":"2026-07-17T00:20:53.505905+00:00","page_url":"https://stenobird.com/podcast/latent-space-ai-engineer"},"episode":{"title":"Retrieval After RAG: Hybrid Search, Agents, and Database Design — Simon Hørup Eskildsen of Turbopuffer","slug":"retrieval-after-rag-hybrid-search-agents-and-database-design-simon-h-rup-eskildsen-of-turbopuffer","published_at":"2026-03-12T22:56:01+00:00","page_url":"https://stenobird.com/podcast/latent-space-ai-engineer/retrieval-after-rag-hybrid-search-agents-and-database-design-simon-h-rup-eskildsen-of-turbopuffer","show_page_url":"https://stenobird.com/podcast/latent-space-ai-engineer","url":"https://www.latent.space/p/turbopuffer","audio_url":"https://api.substack.com/feed/podcast/190777516/3e8657eee5a6ccb27814143e15672fd5.mp3","summary":"The founder of Turbopuffer explains how a massive infrastructure cost problem at Readwise led to the creation of a specialized search engine for unstructured data. He details the architectural shift toward using object storage and NVMe to provide high-performance hybrid search at a fraction of traditional costs.","meta_description":"Learn how Turbopuffer uses object storage and NVMe to optimize hybrid search and reduce vector database costs by up to 95%.","key_points":["Main idea: Modern AI workloads require a 'search engine for unstructured data' that combines full-text and vector search rather than just a vector database","Practical takeaway: Moving heavy workloads to an architecture built on object storage and NVMe can reduce infrastructure costs by 95% for companies like Cursor","Failure mode: Relying on traditional relational databases for high-scale vector search can lead to unsustainable monthly costs that break unit economics","Architectural insight: A successful new database requires three ingredients: a new workload, a new storage architecture, and support for diverse query plans","Technical lesson: High-performance retrieval in AI agents relies on optimizing concurrency and minimizing round trips through intelligent cluster downloading"],"chapters":[{"start_ms":65000,"title":"Engineering Roots","summary":"Simon discusses his transition from Denmark to Canada and his experience working on infrastructure at Shopify."},{"start_ms":340000,"title":"The Architecture of Turbopuffer","summary":"An exploration of building a database around object storage and the necessity of a new workload for modern companies."},{"start_ms":605000,"title":"The Readwise Origin Story","summary":"How the need to scale recommendation engines and semantic search without exploding costs led to the birth of Turbopuffer."},{"start_ms":880000,"title":"Optimizing Retrieval","summary":"A technical look at the mechanics of downloading and building clusters to minimize round trips during search."},{"start_ms":1150000,"title":"Leveraging Cloud Primitives","summary":"The role of GCS and S3 availability in shaping the storage architecture of the platform."},{"start_ms":1690000,"title":"The Cursor Case Study","summary":"How Turbopuffer helped Cursor migrate their workload, resulting in a 95% reduction in costs."},{"start_ms":1965000,"title":"High Concurrency and Agentic Workloads","summary":"Analyzing the massive query concurrency required by modern AI agents and coding tools."}],"topics":["Vector Search","Hybrid Search","Object Storage","Infrastructure Engineering","RAG","Database Design","Cloud Architecture","AI Agents"],"duration_seconds":3632,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/retrieval-after-rag-hybrid-search-agents-and-database-design-simon-h-rup-eskildsen-of-turbopuffer/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/latent-space-ai-engineer/retrieval-after-rag-hybrid-search-agents-and-database-design-simon-h-rup-eskildsen-of-turbopuffer.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}