{"podcast":{"title":"Data Engineering Podcast","slug":"data-engineering-podcast","podcast_index_feed_id":403671,"rss_url":"https://serve.podhome.fm/rss/1c0357c0-6aba-5766-a2d5-2090d8dab6bc","website_url":"https://www.dataengineeringpodcast.com","image_url":"https://assets.podhome.fm/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/638557928872209534cover.jpg","author":"Tobias Macey","episode_count":510,"summary":"This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.","last_synced_at":null,"page_url":"https://stenobird.com/podcast/data-engineering-podcast"},"episode":{"title":"Your Data, Your Lake: How Observe Uses Iceberg and Streaming ETL for Observability","slug":"your-data-your-lake-how-observe-uses-iceberg-and-streaming-etl-for-observability","published_at":"2026-01-18T23:50:41+00:00","page_url":"https://stenobird.com/podcast/data-engineering-podcast/your-data-your-lake-how-observe-uses-iceberg-and-streaming-etl-for-observability","show_page_url":"https://stenobird.com/podcast/data-engineering-podcast","url":"https://www.dataengineeringpodcast.com/observe-lakehouse-technology-for-app-telemetry-episode-497","audio_url":"https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/639043759793013763a79c0604-4ef4-41b2-8eb6-a71de98d8b37.mp3","summary":"Learn how to apply lakehouse architectures to observability workloads to achieve petabyte-scale efficiency. Jacob Leverich explains how using open table formats like Iceberg and streaming ETL can eliminate data silos and reduce costs.","meta_description":"Discover how Observe uses Iceberg, streaming ETL, and lakehouse architectures to build a high-performance, cost-effective observability platform.","key_points":["Main idea: Lakehouse architectures can replace expensive, siloed observability tools by leveraging cloud-native warehousing and open formats","Practical takeaway: Organizing telemetry data by use case and columnarizing it significantly improves query performance and cost efficiency","Technical breakthrough: Iceberg v3's ability to shred JSON data is a major unlock for handling semi-structured OpenTelemetry data","Failure mode: Relying on generic data pipelines for observability can lead to high latency and unmanageable costs at scale","Strategic advantage: Adopting 'your data in your lake' strategies prevents vendor lock-in and enables unified querying across logs, metrics, and traces"],"chapters":[{"start_ms":390000,"title":"The Evolution of Data Processing","summary":"A look back at the foundations of MapReduce and the shift toward modern data processing architectures."},{"start_ms":720000,"title":"Challenges in Semi-Structured Data","summary":"Discussing the difficulties of parallel processing and relational querying for semi-structured datasets."},{"start_ms":1050000,"title":"The High Cost of Observability Silos","summary":"Analyzing how fragmented tools and proprietary formats exacerbate costs and usability issues in observability."},{"start_ms":1380000,"title":"Optimizing for Streaming ETL","summary":"The necessity of building specialized streaming pipelines to optimize for end-to-end latency in observability."},{"start_ms":1720000,"title":"Efficient Data Loading in Lakehouses","summary":"Strategies for loading data into a lakehouse without creating massive, unmanageable single tables."},{"start_ms":2370000,"title":"The 'Your Data, Your Lake' Strategy","summary":"Why enterprises prefer owning their data in open, accessible formats rather than proprietary silos."},{"start_ms":4020000,"title":"The Future of Open Table Formats","summary":"How advancements like Iceberg v3 JSON shredding are transforming the observability landscape."}],"topics":["Lakehouse Architecture","Apache Iceberg","Observability","Streaming ETL","OpenTelemetry","Data Engineering","JSON Shredding","Cloud-Native Warehousing"],"duration_seconds":4341,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/data-engineering-podcast/episodes/your-data-your-lake-how-observe-uses-iceberg-and-streaming-etl-for-observability/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/data-engineering-podcast/your-data-your-lake-how-observe-uses-iceberg-and-streaming-etl-for-observability.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}