# 914: Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz

- Page: https://stenobird.com/podcast/super-data-science/914-data-lakes-101-and-why-they-re-key-for-ai-models-with-oz-katz
- Text version: https://stenobird.com/podcast/super-data-science/914-data-lakes-101-and-why-they-re-key-for-ai-models-with-oz-katz.md
- Podcast: [Super Data Science: ML & AI Podcast with Jon Krohn](https://stenobird.com/podcast/super-data-science)
- Published: 2025-08-15T11:00:00+00:00
- Episode link: https://www.podtrac.com/pts/redirect.mp3/chrt.fm/track/E581B9/arttrk.com/p/VI4CS/pscrb.fm/rss/p/traffic.megaphone.fm/SUPERDATASCIENCEPTYLTD7728592050.mp3
- Audio file: https://www.podtrac.com/pts/redirect.mp3/chrt.fm/track/E581B9/arttrk.com/p/VI4CS/pscrb.fm/rss/p/traffic.megaphone.fm/SUPERDATASCIENCEPTYLTD7728592050.mp3
- Processing state: failed
- JSON: https://stenobird.com/v1/public/podcasts/super-data-science/episodes/914-data-lakes-101-and-why-they-re-key-for-ai-models-with-oz-katz
- Duration seconds: 1552

## Resource

In this Five-Minute Friday, Cofounder and CTO of lakeFS Oz Katz talks to Jon Krohn about data warehouses, data lakes, and how companies can handle increasingly complex data infrastructures and formats. Hear about lakeFS’s collaboration with Legofest, lakeFS’s approach to helping users collaborate on data lakes, and how to overcome the challenges of working with multimodal data.

Additional materials: www.superdatascience.com/914

This episode is brought to you by the Dell AI Factory with NVIDIA.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/super-data-science/episodes/914-data-lakes-101-and-why-they-re-key-for-ai-models-with-oz-katz/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/super-data-science/914-data-lakes-101-and-why-they-re-key-for-ai-models-with-oz-katz.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.