Episode

How to build in Observability at Petabyte Scale

Podcast: Adventures in DevOps
Published: Sep 7, 2025
Duration seconds: 2731
Processing state: processed
Canonical source: https://adventuresindevops.com/episodes/2025/09/07/how-you-build-observability-that-scales-to-enterprise
Audio: https://dts.podtrac.com/redirect.mp3/api.spreaker.com/download/episode/67654497/download.mp3
JSON: /v1/public/podcasts/adventures-in-devops/episodes/how-to-build-in-observability-at-petabyte-scale
Markdown: /podcast/adventures-in-devops/how-to-build-in-observability-at-petabyte-scale.md

Actions

POST https://stenobird.com/v1/public/podcasts/adventures-in-devops/episodes/how-to-build-in-observability-at-petabyte-scale/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/adventures-in-devops/how-to-build-in-observability-at-petabyte-scale.md
Read the agent-friendly Markdown representation of this episode resource.

Summary

Learn how Observe scales observability to petabytes of data per day by leveraging Snowflake's architecture instead of building a proprietary database. The discussion covers the technical trade-offs of using Kafka for stream processing and the strategic move toward open data formats like Iceberg.

Topics

Observability
Snowflake
Kafka
Data Engineering
Cloud Architecture
Apache Iceberg
Petabyte Scale
Stream Processing
AWS S3

Highlights

Main idea: Resist the 'founding engineer instinct' to build a custom database, and focus instead on delivering immediate user value
Architectural choice: Use Kafka as a buffer to smooth out massive data bursts before they hit Snowflake's batch-based engine
Strategic advantage: Leveraging open formats like Iceberg prevents vendor lock-in and allows customers to maintain true data ownership
Failure mode: Relying on proprietary cloud services such as Amazon Kinesis can couple the platform tightly to AWS, which hinders expansion to other clouds (GCP, Azure)
Practical takeaway: A usage-based pricing model for queries, paired with low-cost ingestion, prevents the 'bill shock' common in observability
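
The buffering highlight above can be illustrated with a small simulation. This is a minimal sketch, not the pipeline described in the episode: it uses an in-memory queue as a stand-in for a Kafka topic, and the names `BurstBuffer`, `simulate`, and `max_batch`, along with the arrival counts, are hypothetical choices made for illustration. It shows how a buffer lets producers write at burst rates while the downstream batch loader receives bounded, evenly sized batches.

```python
from collections import deque
from typing import Deque, List


class BurstBuffer:
    """In-memory stand-in for a durable log such as a Kafka topic (hypothetical)."""

    def __init__(self) -> None:
        self._queue: Deque[dict] = deque()

    def produce(self, events: List[dict]) -> None:
        # Producers append as fast as data arrives; nothing is dropped.
        self._queue.extend(events)

    def drain_batch(self, max_batch: int) -> List[dict]:
        # The consumer pulls at most max_batch events per load cycle.
        batch: List[dict] = []
        while self._queue and len(batch) < max_batch:
            batch.append(self._queue.popleft())
        return batch

    def backlog(self) -> int:
        # Number of events still waiting to be loaded.
        return len(self._queue)


def simulate(arrivals: List[int], max_batch: int) -> List[int]:
    """Return the size of each batch handed to the batch loader."""
    buf = BurstBuffer()
    loaded: List[int] = []
    for count in arrivals:
        buf.produce([{"seq": i} for i in range(count)])
        loaded.append(len(buf.drain_batch(max_batch)))
    # Keep draining after arrivals stop until the backlog is empty.
    while buf.backlog() > 0:
        loaded.append(len(buf.drain_batch(max_batch)))
    return loaded


if __name__ == "__main__":
    # One cycle carries a burst of 5000 events; the loader never sees more than 1500.
    print(simulate([100, 5000, 200, 0], max_batch=1500))
```

In this sketch the burst is absorbed as backlog and worked off over later cycles, so the trade-off is added latency during spikes in exchange for a steady load on the batch engine.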

Chapters

1:00 Context: Observability at Scale: Introduction to the challenges of managing petabyte-scale data streams and the evolution of observability expertise.
4:20 The Decision Against Proprietary Engines: Why building on top of Snowflake was a strategic choice to avoid the overhead of developing a custom execution engine.
7:50 Kafka as a Buffering Layer: Using Kafka to manage high-volume ingestion and bridge the gap between streaming data and batch-based processing.
14:40 Predictable Pricing Models: How separating ingestion costs from query usage helps customers avoid unexpected monthly billing spikes.
21:30 Custom Stream Processing: The technical necessity of building custom stream processing layers to handle historical data reprocessing efficiently.
28:20 Future-Proofing with Iceberg: Leveraging open data formats to enable data portability and multi-cloud interoperability.
35:10 Security and Identity Risks: The risks of IAM trust policy exploitation and the importance of modern authentication methods such as passkeys.