Episode

The DuckLake Lakehouse Format // Hannes Mühleisen // #339

Podcast
MLOps.community
Published
Sep 19, 2025
Duration seconds
3444
Processing state
failed
Canonical source
https://podcasters.spotify.com/pod/show/mlops/episodes/The-DuckLake-Lakehouse-Format--Hannes-Mhleisen--339-e38eh6f
Audio
https://anchor.fm/s/174cb1b8/podcast/play/108528271/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2025-8-19%2F407774943-44100-2-544185dd3fe51.mp3
JSON
/v1/public/podcasts/mlops-community/episodes/the-ducklake-lakehouse-format-hannes-m-hleisen-339
Markdown
/podcast/mlops-community/the-ducklake-lakehouse-format-hannes-m-hleisen-339.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/mlops-community/episodes/the-ducklake-lakehouse-format-hannes-m-hleisen-339/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/mlops-community/the-ducklake-lakehouse-format-hannes-m-hleisen-339.md
    Read the agent-friendly Markdown representation of this episode resource.

Summary

The DuckLake Lakehouse Format // MLOps Podcast #339 with Hannes Mühleisen, Co-founder and CEO of DuckDB Labs.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Managing data on object stores has been a painful affair. Users had to choose between data swamp chaos or a maze of metadata files with catalog servers on top. DuckLake is a new paradigm for managing data on object stores: First, it uses classical SQL data management systems to manage metadata. Second, actual data is stored in Parquet files on pretty arbitrary storage. Third, processing queries is done client-side, or anywhere really. DuckDB is the first system to integrate with DuckLake using an extension with the same name. Conceptually, DuckLake enables central control over truth while decentralizing compute and storage entirely. DuckLake turns data warehouse architecture upside down by departing from the integrated metadata/compute layer towards a fully disconnected operation with only centralized metadata. For the first time, DuckLake allows a “multi-player” experience with DuckDB, where computation stays fully local, but transactional control is centralized.

// Bio
Hannes Mühleisen is a creator of the DuckDB database management system and Co-founder and CEO of DuckDB Labs. He is a senior researcher at the Centrum Wiskunde & Informatica (CWI) in Amsterdam. He is also Professor of Data Engineering at Radboud University Nijmegen.

// Related Links
Website: https://hannes.muehleisen.org
Unleashing Unconstrained News Knowledge Graphs to Combat Misinformation // Robert Caulk // #279 - https://youtu.be/pF8zTI867EI

✌️ Connect With Us ✌️
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYEx…
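
The three-part split described in the abstract (metadata in a SQL database, data in files on arbitrary storage, query processing on the client) can be illustrated with a toy model. The sketch below is not DuckLake's actual schema or API: it assumes SQLite as the metadata database, uses CSV files as a stand-in for Parquet so that it runs on the Python standard library alone, and every name in it (open_catalog, append_rows, scan, the data_files table) is hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def open_catalog(path):
    """Open (or create) the central metadata database (hypothetical schema)."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS data_files ("
        " snapshot_id INTEGER NOT NULL,"
        " file_path TEXT NOT NULL)"
    )
    con.commit()
    return con


def append_rows(con, data_dir, rows):
    """Write rows to a new data file, then register the file in the catalog.

    The file only becomes visible to readers once the catalog insert commits.
    """
    (latest,) = con.execute(
        "SELECT COALESCE(MAX(snapshot_id), 0) FROM data_files"
    ).fetchone()
    snapshot_id = latest + 1
    file_path = Path(data_dir) / f"part-{snapshot_id}.csv"  # CSV stands in for Parquet
    with open(file_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    with con:  # commits on success, rolls back on error
        con.execute(
            "INSERT INTO data_files (snapshot_id, file_path) VALUES (?, ?)",
            (snapshot_id, str(file_path)),
        )
    return snapshot_id


def scan(con, as_of=None):
    """Client-side read: ask the catalog which files are visible, then read them locally."""
    if as_of is None:
        (as_of,) = con.execute(
            "SELECT COALESCE(MAX(snapshot_id), 0) FROM data_files"
        ).fetchone()
    paths = con.execute(
        "SELECT file_path FROM data_files"
        " WHERE snapshot_id <= ? ORDER BY snapshot_id",
        (as_of,),
    ).fetchall()
    rows = []
    for (path,) in paths:
        with open(path, newline="") as f:
            rows.extend(tuple(r) for r in csv.reader(f))
    return rows


def demo():
    """Append two batches, then read the latest state and the state as of snapshot 1."""
    with tempfile.TemporaryDirectory() as d:
        con = open_catalog(str(Path(d) / "catalog.db"))
        append_rows(con, d, [("1", "duck")])
        append_rows(con, d, [("2", "lake")])
        latest = scan(con)
        first = scan(con, as_of=1)
        con.close()
    return latest, first
```

In this toy, the catalog is the only shared component: writers and readers each do their own file I/O and only consult the SQL database to agree on which files make up the table at a given snapshot. That is the sense in which control over truth is central while compute and storage are not. A real deployment would differ in many ways that this sketch does not attempt to model, such as concurrent writers, deletes, and schema changes.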