{"podcast":{"title":"Data Engineering Podcast","slug":"data-engineering-podcast","podcast_index_feed_id":403671,"rss_url":"https://serve.podhome.fm/rss/1c0357c0-6aba-5766-a2d5-2090d8dab6bc","website_url":"https://www.dataengineeringpodcast.com","image_url":"https://assets.podhome.fm/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/638557928872209534cover.jpg","author":"Tobias Macey","episode_count":510,"summary":"This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.","last_synced_at":null,"page_url":"https://stenobird.com/podcast/data-engineering-podcast"},"episode":{"title":"Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows","slug":"branches-diffs-and-sql-how-dolt-powers-agentic-workflows","published_at":"2026-02-01T23:46:18+00:00","page_url":"https://stenobird.com/podcast/data-engineering-podcast/branches-diffs-and-sql-how-dolt-powers-agentic-workflows","show_page_url":"https://stenobird.com/podcast/data-engineering-podcast","url":"https://www.dataengineeringpodcast.com/dolt-version-controlled-database-episode-499","audio_url":"https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/639055858748539200f2ffcea3-c0d7-4aeb-808a-4c6cf03db4bc.mp3","summary":"Dolt introduces Git-style semantics—branching, merging, and diffing—directly to the SQL database layer. 
This allows for version-controlled data management, enabling safe agentic workflows and reproducible machine learning experiments.","meta_description":"Learn how Dolt uses a novel storage engine to bring Git-style branching and merging to MySQL and Postgres-compatible databases for AI and data engineering.","key_points":["Main idea: Dolt implements Git semantics (branch, merge, diff) for both database schema and row-level data","Practical takeaway: Use branching to run A/B tests on different embedding models or chunking strategies in ML pipelines","Technical detail: Dolt uses a 'Prolly Tree' storage engine to enable efficient, cryptographically provable audit logs and fast JSON querying","Failure mode: Avoid treating Dolt as a standard MySQL/Postgres clone; it is a new engine that implements the syntax via AST transformation","Future frontier: The next major challenge in data management is managing and versioning the 'context' generated by AI agents"],"chapters":[{"start_ms":70000,"title":"Introduction to Dolt","summary":"Tim Sehn introduces Dolt, the world's first version-controlled SQL database, and its origins."},{"start_ms":330000,"title":"Data Sharing and Use Cases","summary":"Exploring how Dolt enables efficient data sharing and its popularity in stock market data distribution."},{"start_ms":580000,"title":"Competitive Landscape","summary":"A comparison of Dolt against other database technologies like PlanetScale, Neon, and Replit's infrastructure."},{"start_ms":830000,"title":"Dolt vs. 
Traditional MySQL/Postgres","summary":"Understanding the architectural differences between Dolt's engine and standard SQL implementations."},{"start_ms":1090000,"title":"The Database for AI","summary":"How version control mitigates trust issues and enables safe, reviewable writes for AI agents."},{"start_ms":1340000,"title":"Decentralized Data and Cloning","summary":"The power of being able to clone a database to a local machine for isolated development."},{"start_ms":1840000,"title":"Engineering the SQL Dialect","summary":"How Dolt uses AST transformations to support both MySQL and Postgres-compatible syntax on a single engine."}],"topics":["SQL","Version Control","Data Engineering","AI Workflows","Database Engines","Machine Learning","Git Semantics","Data Management"],"duration_seconds":3413,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/data-engineering-podcast/episodes/branches-diffs-and-sql-how-dolt-powers-agentic-workflows/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/data-engineering-podcast/branches-diffs-and-sql-how-dolt-powers-agentic-workflows.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}