Episode
Malloy: Hierarchical Data, Semantic Models, and the Future of Analytics
- Podcast
- Data Engineering Podcast
- Published
- Dec 8, 2025
- Duration seconds
- 3528
- Processing state
processed
Actions
POST https://stenobird.com/v1/public/podcasts/data-engineering-podcast/episodes/malloy-hierarchical-data-semantic-models-and-the-future-of-analytics/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/data-engineering-podcast/malloy-hierarchical-data-semantic-models-and-the-future-of-analytics.md
Read the agent-friendly Markdown representation of this episode resource.
Summary
SQL's flat relational model often conflicts with the hierarchical nature of real-world data. Malloy is a query language that makes semantic modeling and hierarchy first-class citizens and uses SQL as a compilation target rather than a hand-written interface.
Topics
- Data Engineering
- Malloy
- SQL
- Semantic Modeling
- TypeScript
- Data Transformation
- Relational Algebra
- Open Source
Highlights
- Main idea: Malloy moves beyond the limitations of SQL by implementing a hierarchical mental model that preserves data context
- Practical takeaway: Using TypeScript as a runtime allows Malloy to integrate seamlessly into modern web-based and VS Code-driven developer workflows
- Failure mode: Relying on SQL as a human-facing interface leads to inflexible, unmaintainable, and non-composable data transformations
- Technical insight: The language is designed to be highly compatible with LLM-generated queries due to its structured, semantic nature
- Future vision: Transitioning the core runtime to Rust/WASM could provide the high-performance, cross-language integration needed for Python-centric data science
Chapters
- 5:20 The Core Problem: Michael Toy discusses the fundamental limitations of SQL and the motivation behind creating a language that better reflects human problem-solving.
- 9:40 Beyond SQL Abstractions: An exploration of previous attempts to layer over SQL and why Malloy focuses on a different approach to relational algebra and semantic layers.
- 14:10 Decoupling Data and Metadata: The tension between raw data columns and the curated transformations that define their meaning and usage.
- 22:40 Language Design and Ecosystem: Discussing the choice of TypeScript for the runtime and the importance of developer experience in the modern toolchain.
- 26:50 The Future of the Runtime: Reflections on the potential for a Rust-based WASM implementation to bridge the gap between TypeScript and Python environments.
- 31:20 Notebooks and Pipelines: How Malloy fits into interactive analysis via notebooks and its role in automated transformation pipelines.
- 45:10 Open Source and Community: The importance of an open-source-first approach to building trust and inviting community contributions to the Malloy ecosystem.