# #355 AI's Impact on Databases with Shireesh Thota, CVP of Databases at Microsoft

- Page: https://stenobird.com/podcast/dataframed/355-ai-s-impact-on-databases-with-shireesh-thota-cvp-of-databases-at-microsoft
- Text version: https://stenobird.com/podcast/dataframed/355-ai-s-impact-on-databases-with-shireesh-thota-cvp-of-databases-at-microsoft.md
- Podcast: [DataFramed](https://stenobird.com/podcast/dataframed)
- Published: 2026-04-13T09:00:00+00:00
- Episode link: https://www.datacamp.com/podcast
- Audio file: https://dts.podtrac.com/redirect.mp3/cohst.app/pdcst/6G1A6D/episodes.captivate.fm/episode/8c5bd163-0851-4fda-8acc-c895e5984943.mp3
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/dataframed/episodes/355-ai-s-impact-on-databases-with-shireesh-thota-cvp-of-databases-at-microsoft
- Duration seconds: 3157

## Resource

The data stack is shifting from human-centric design to platforms that must serve both developers and AI agents. This discussion explores how semantic context and unified environments such as Microsoft Fabric help prevent AI hallucinations and simplify complex data engineering.
## Highlights

- Main idea: AI agents require semantic context, such as ontologies and metadata, to reason accurately and avoid hallucinating queries
- Practical takeaway: Use unified platforms like Microsoft Fabric to bridge the gap between data integration, engineering, and real-time analytics
- Failure mode: Relying on raw schema alone for LLM-generated queries leads to errors because agents cannot distinguish between sensitive data types without annotation
- Strategic choice: Select SQL for deep relational modeling and consistency, or NoSQL for schema flexibility and high-scale JSON workloads
- Industry trend: The rise of 'Software 2.0' and agentic workflows necessitates databases that are accessible via natural language while maintaining core security and resiliency

## Topics

AI Agents, Microsoft Fabric, Azure Cosmos DB, PostgreSQL, Semantic Modeling, Data Engineering, NoSQL, Cloud Databases

## Chapters

- 1:00 - The New Challenge: Platforms for Humans and Agents: An introduction to how the data stack must evolve to support both human developers and autonomous AI agents using Azure and Fabric.
- 4:50 - The Complexity of Data Lifecycles: Discussing the difficulty of managing data as it moves from operational systems of record to analytical warehouses.
- 12:50 - Microsoft Fabric as a Unified Solution: How a one-stop shop for integration, science, and engineering reduces the friction of fragmented data tools.
- 16:40 - Simplifying Development without Losing Power: The goal of using natural language to interact with platforms while maintaining the deep control required by developers.
- 20:40 - Embracing Open Source and Postgres: Microsoft's strategy regarding OSS, specifically their deep commitment and contributions to the Postgres ecosystem.
- 28:40 - Semantic Modeling and AI Context: Why defining entities and relationships is critical to preventing AI hallucinations in automated querying.
- 40:20 - The Changing Role of Data Professionals: Reflecting on how data modeling remains a human-centric necessity even as automation increases.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/dataframed/episodes/355-ai-s-impact-on-databases-with-shireesh-thota-cvp-of-databases-at-microsoft/transcription-requests` - Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/dataframed/355-ai-s-impact-on-databases-with-shireesh-thota-cvp-of-databases-at-microsoft.md` - Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
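
As a rough illustration of the two entries under Actions, the sketch below builds the `POST` and `GET` requests with Python's standard library. Only the URLs and HTTP methods come from this page. The helper names (`build_request_transcript`, `build_read_markdown`, `send`), the empty `POST` body, and the absence of authentication headers are assumptions made for the example, and the response format is not documented here.

```python
import urllib.request

# URL parts copied from the Actions section of this page.
BASE = "https://stenobird.com"
SLUG = (
    "355-ai-s-impact-on-databases-with-shireesh-thota-"
    "cvp-of-databases-at-microsoft"
)


def build_request_transcript():
    """Build the request_transcript call (hypothetical helper name).

    Assumption: the endpoint accepts an empty body and needs no auth header;
    the page does not say either way.
    """
    url = (
        f"{BASE}/v1/public/podcasts/dataframed/episodes/"
        f"{SLUG}/transcription-requests"
    )
    return urllib.request.Request(url, data=b"", method="POST")


def build_read_markdown():
    """Build the read_markdown call (hypothetical helper name)."""
    url = f"{BASE}/podcast/dataframed/{SLUG}.md"
    return urllib.request.Request(url, method="GET")


def send(req, timeout=10.0):
    """Send a prepared request and return (status code, decoded body).

    The body is returned as text because the response schema is unknown.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Because the page describes `request_transcript` as idempotent, calling `send(build_request_transcript())` more than once should be safe, whereas `send(build_read_markdown())` only reads the resource and, like a page view, does not enqueue transcription.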