Episode

Malloy: Hierarchical Data, Semantic Models, and the Future of Analytics

Podcast: Data Engineering Podcast
Published: Dec 8, 2025
Duration: 3528 seconds (58:48)
Processing state: processed
Canonical source: https://www.dataengineeringpodcast.com/malloy-advanced-analytics-language-episode-491
Audio: https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/6390075077084996601a990f24-c158-4040-82de-362219334ea5.mp3
JSON: /v1/public/podcasts/data-engineering-podcast/episodes/malloy-hierarchical-data-semantic-models-and-the-future-of-analytics
Markdown: /podcast/data-engineering-podcast/malloy-hierarchical-data-semantic-models-and-the-future-of-analytics.md

Summary

SQL's flat relational model often conflicts with the hierarchical nature of real-world data. Malloy is a query language that makes semantic modeling and hierarchy first-class concepts and uses SQL as a compilation target rather than as an interface people write by hand.
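To make the flat-versus-hierarchical tension concrete, here is a minimal sketch in plain TypeScript. It is not Malloy syntax and does not show how Malloy compiles queries; the `FlatRow` and `NestedOrder` types, the `nestOrders` helper, and the sample data are all hypothetical and exist only for illustration. The sketch takes rows shaped like the result of a SQL join, where order-level fields repeat on every line-item row, and rebuilds the nested structure those rows only imply.

```typescript
// Hypothetical flat result of joining orders to their line items:
// order-level fields (orderId, customer) repeat on every item row.
interface FlatRow {
  orderId: number;
  customer: string;
  item: string;
  price: number;
}

// Hypothetical hierarchical shape: each order owns its items.
interface NestedOrder {
  orderId: number;
  customer: string;
  items: { item: string; price: number }[];
}

// Rebuild the hierarchy that the flat rows only imply.
function nestOrders(rows: FlatRow[]): NestedOrder[] {
  const byId = new Map<number, NestedOrder>();
  for (const row of rows) {
    let order = byId.get(row.orderId);
    if (order === undefined) {
      order = { orderId: row.orderId, customer: row.customer, items: [] };
      byId.set(row.orderId, order);
    }
    order.items.push({ item: row.item, price: row.price });
  }
  return Array.from(byId.values());
}

// Illustrative sample data, not taken from the episode.
const flatRows: FlatRow[] = [
  { orderId: 1, customer: "Ada", item: "keyboard", price: 80 },
  { orderId: 1, customer: "Ada", item: "mouse", price: 25 },
  { orderId: 2, customer: "Grace", item: "monitor", price: 300 },
];

const nested = nestOrders(flatRows);
console.log(JSON.stringify(nested, null, 2));
```

In the flat form, the fact that "keyboard" and "mouse" belong to the same order has to be inferred from a repeated key; in the nested form that context is part of the structure itself, which is the kind of hierarchy the summary describes.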

Topics

  • Data Engineering
  • Malloy
  • SQL
  • Semantic Modeling
  • TypeScript
  • Data Transformation
  • Relational Algebra
  • Open Source

Highlights

  • Main idea: Malloy moves beyond the limitations of SQL by implementing a hierarchical mental model that preserves data context
  • Practical takeaway: Using TypeScript as a runtime allows Malloy to integrate seamlessly into modern web-based and VS Code-driven developer workflows
  • Failure mode: Relying on SQL as a human-facing interface leads to inflexible, unmaintainable, and non-composable data transformations
  • Technical insight: Malloy's structured, semantic design is intended to make it well suited to LLM-generated queries
  • Future vision: Transitioning the core runtime to Rust/WASM could provide the high-performance, cross-language integration needed for Python-centric data science

Chapters

  1. 5:20 The Core Problem: Michael Toy discusses the fundamental limitations of SQL and the motivation behind creating a language that better reflects human problem-solving.
  2. 9:40 Beyond SQL Abstractions: An exploration of previous attempts to layer over SQL and why Malloy focuses on a different approach to relational algebra and semantic layers.
  3. 14:10 Decoupling Data and Metadata: The tension between raw data columns and the curated transformations that define their meaning and usage.
  4. 22:40 Language Design and Ecosystem: Discussing the choice of TypeScript for the runtime and the importance of developer experience in the modern toolchain.
  5. 26:50 The Future of the Runtime: Reflections on the potential for a Rust-based WASM implementation to bridge the gap between TypeScript and Python environments.
  6. 31:20 Notebooks and Pipelines: How Malloy fits into interactive analysis via notebooks and its role in automated transformation pipelines.
  7. 45:10 Open Source and Community: The importance of an open-source-first approach to building trust and inviting community contributions to the Malloy ecosystem.