Episode

Building the Backbone of AI Agents: Telemetry, Open Source, and the Future of Developer Infrastructure with Brian Douglas

Podcast
Screaming in the Cloud
Published
Apr 30, 2026
Duration
29:17 (1757 seconds)
Processing state
processed
Canonical source
https://share.transistor.fm/s/41391d7a
Audio
https://dts.podtrac.com/redirect.mp3/media.transistor.fm/41391d7a/34028eaa.mp3
JSON
/v1/public/podcasts/screaming-in-the-cloud/episodes/building-the-backbone-of-ai-agents-telemetry-open-source-and-the-future-of-developer-infrastructure-with-brian-douglas
Markdown
/podcast/screaming-in-the-cloud/building-the-backbone-of-ai-agents-telemetry-open-source-and-the-future-of-developer-infrastructure-with-brian-douglas.md

Summary

AI agents are advancing rapidly, but the underlying infrastructure for observability and execution is lagging. Brian Douglas explains how Paper Compute is building the necessary telemetry and sandboxed runtimes to make agentic workflows reliable and secure.

Topics

  • AI Agents
  • Developer Infrastructure
  • Telemetry
  • Open Source
  • Observability
  • Sandboxing
  • NixOS
  • Distributed Systems

Highlights

  • Main idea: Agentic workflows need telemetry beyond what standard OpenTelemetry provides, covering token usage, call duration, and prompt context
  • Practical takeaway: Using sandboxed runtimes like Stereos (built on NixOS) can prevent agents from accessing unauthorized local data or system resources
  • Failure mode: Relying on default LLM permissions can lead to unexpected data leaks, such as agents accessing personal calendars via connected accounts
  • Main idea: Open source serves as a critical trust signal for infrastructure tools, allowing developers to audit policies and controls
  • Practical takeaway: Storing agent sessions as Merkle DAGs makes them hashable and searchable, and supports generating reusable skills from them
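
To make the telemetry highlight concrete, here is a minimal Python sketch of a per-call record that captures tokens, duration, and prompt context. It is an illustration only, not Paper Compute's schema or an OpenTelemetry API: `AgentCallRecord`, `record_call`, and the `call_model` callback (assumed to return the response text plus prompt and completion token counts) are all hypothetical names.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class AgentCallRecord:
    """One model call within an agent session (hypothetical schema)."""
    session_id: str
    prompt: str
    prompt_tokens: int = 0
    completion_tokens: int = 0
    duration_s: float = 0.0
    # Derived in __post_init__ so the prompt can be referenced without storing it twice.
    prompt_sha256: str = field(init=False, default="")

    def __post_init__(self):
        self.prompt_sha256 = hashlib.sha256(self.prompt.encode("utf-8")).hexdigest()

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.completion_tokens


def record_call(session_id, prompt, call_model):
    """Run call_model(prompt) and capture token counts and wall-clock duration.

    Assumption: call_model returns (text, prompt_tokens, completion_tokens).
    """
    start = time.perf_counter()
    text, prompt_tokens, completion_tokens = call_model(prompt)
    elapsed = time.perf_counter() - start
    record = AgentCallRecord(session_id, prompt, prompt_tokens, completion_tokens, elapsed)
    return text, record
```

A real deployment would likely export such records to a tracing backend; the point here is only which fields an agent-specific record might carry.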

Chapters

  1. 1:00 Paper Compute's Mission: Brian introduces Paper Compute and its focus on building distributed systems and infrastructure specifically for AI agents.
  2. 3:10 Telemetry for Agents: A deep dive into tracking prompts, human context, and the limitations of current observability tools like OpenTelemetry for LLMs.
  3. 5:15 Observability and Data Interpretation: The importance of seeing raw data from agent executions to understand performance and reliability.
  4. 7:15 Security and Agent Sandboxing: Discussing the risks of agentic access to sensitive data and the need for controlled, isolated environments.
  5. 9:35 The Tapes Product and Session Hashing: How using Merkle DAGs allows for searchable, hashable agent sessions and the development of reusable agent skills.
  6. 16:15 The Cost of Agentic Compute: The high RAM and EC2 costs associated with running intensive agentic tools like Claude Code.
  7. 20:35 Open Source Strategy: Why building in the open is a strategic advantage for gaining developer trust and establishing industry standards.
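
The session hashing idea from chapter 5 can be sketched as a small content-addressed graph. This is a generic Merkle DAG illustration in Python, not the Tapes implementation: `SessionDAG`, `node_hash`, and the node fields (`role`, `content`, `parents`) are hypothetical. Each node's hash covers its own payload and the hashes of its parents, so identical conversation prefixes share nodes and any change upstream changes every hash downstream.

```python
import hashlib
import json


def node_hash(role, content, parents):
    """Hash a node from its payload plus its parents' hashes."""
    payload = json.dumps(
        {"role": role, "content": content, "parents": sorted(parents)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SessionDAG:
    """Content-addressed store of session turns (illustrative only)."""

    def __init__(self):
        self.nodes = {}

    def add(self, role, content, parents=()):
        parents = list(parents)
        for parent in parents:
            if parent not in self.nodes:
                raise KeyError(f"unknown parent hash: {parent}")
        digest = node_hash(role, content, parents)
        # Identical turns with identical history collapse into one node.
        self.nodes.setdefault(
            digest, {"role": role, "content": content, "parents": sorted(parents)}
        )
        return digest

    def search(self, term):
        """Return hashes of nodes whose content contains term."""
        return [h for h, node in self.nodes.items() if term in node["content"]]
```

For example, two sessions that start with the same user prompt and then diverge share the first node and differ only from the branch point onward, which is what makes deduplication and lookup by hash cheap.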