Episode
Building the Backbone of AI Agents: Telemetry, Open Source, and the Future of Developer Infrastructure with Brian Douglas
- Podcast: Screaming in the Cloud
- Published: Apr 30, 2026
- Duration seconds: 1757
- Processing state: processed
- Canonical source: https://share.transistor.fm/s/41391d7a
Actions
POST https://stenobird.com/v1/public/podcasts/screaming-in-the-cloud/episodes/building-the-backbone-of-ai-agents-telemetry-open-source-and-the-future-of-developer-infrastructure-with-brian-douglas/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/screaming-in-the-cloud/building-the-backbone-of-ai-agents-telemetry-open-source-and-the-future-of-developer-infrastructure-with-brian-douglas.md
Read the agent-friendly Markdown representation of this episode resource.
Summary
AI agents are advancing rapidly, but the underlying infrastructure for observability and execution is lagging. Brian Douglas explains how Paper Compute is building the necessary telemetry and sandboxed runtimes to make agentic workflows reliable and secure.
Topics
- AI Agents
- Developer Infrastructure
- Telemetry
- Open Source
- Observability
- Sandboxing
- NixOS
- Distributed Systems
Highlights
- Main idea: Agentic workflows require specialized telemetry beyond standard OpenTelemetry to track tokens, duration, and prompt context
- Practical takeaway: Using sandboxed runtimes like Stereos (built on NixOS) can prevent agents from accessing unauthorized local data or system resources
- Failure mode: Relying on default LLM permissions can lead to unexpected data leaks, such as agents accessing personal calendars via connected accounts
- Main idea: Open source serves as a critical trust signal for infrastructure tools, allowing developers to audit policies and controls
- Practical takeaway: Implementing Merkle DAGs for agent sessions enables efficient session hashing, searching, and skill generation
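
To illustrate the Merkle DAG highlight, here is a minimal sketch of content-addressed agent sessions. It is not the Tapes implementation, which this summary does not describe; the `Node` schema, the `SessionStore` class, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration. The idea it shows is that each step's hash covers its own content plus the hashes of its parents, so identical session prefixes collapse to the same nodes and any change to an early step changes every hash downstream.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One step of an agent session (hypothetical schema)."""
    role: str             # e.g. "user", "assistant", "tool"
    content: str
    parents: tuple = ()   # digests of the steps this one follows

    @property
    def digest(self) -> str:
        # Hash the step's own data together with its parents' digests,
        # so a node's identity depends on its entire history.
        payload = json.dumps(
            {"role": self.role, "content": self.content,
             "parents": sorted(self.parents)},
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class SessionStore:
    """In-memory, content-addressed store of session nodes (illustrative)."""

    def __init__(self):
        self.nodes = {}

    def add(self, role, content, parents=()):
        node = Node(role, content, tuple(parents))
        # Identical steps with identical history are stored only once.
        self.nodes.setdefault(node.digest, node)
        return node.digest

    def search(self, text):
        # Naive substring search; returns the digests of matching nodes.
        return [h for h, n in self.nodes.items() if text in n.content]


store = SessionStore()
root = store.add("user", "fix the failing test")
step = store.add("assistant", "running pytest", [root])
```

Because a digest identifies a step together with its full history, a recurring chain of digests across sessions is one plausible signal for a reusable skill; that use is an interpretation of the highlight, not something the summary spells out.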
Chapters
- 1:00 Paper Compute's Mission: Brian introduces Paper Compute and its focus on building distributed systems and infrastructure specifically for AI agents.
- 3:10 Telemetry for Agents: A deep dive into tracking prompts, human context, and the limitations of current observability tools like OpenTelemetry for LLMs.
- 5:15 Observability and Data Interpretation: The importance of seeing raw data from agent executions to understand performance and reliability.
- 7:15 Security and Agent Sandboxing: Discussing the risks of agentic access to sensitive data and the need for controlled, isolated environments.
- 9:35 The Tapes Product and Session Hashing: How using Merkle DAGs allows for searchable, hashable agent sessions and the development of reusable agent skills.
- 16:15 The Cost of Agentic Compute: The high RAM and EC2 costs associated with running intensive agentic tools like Claude Code.
- 20:35 Open Source Strategy: Why building in the open is a strategic advantage for gaining developer trust and establishing industry standards.
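
The telemetry discussion (tokens, duration, prompt and human context) can be made concrete with a small sketch. This is not Paper Compute's schema and not an OpenTelemetry API; `AgentSpan`, `record_span`, and every field name are hypothetical, chosen only to show what an agent-specific record might carry beyond a generic timing span.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, asdict


@dataclass
class AgentSpan:
    """Hypothetical per-call telemetry record; field names are illustrative."""
    name: str
    prompt: str
    human_context: str = ""   # what the human asked for, if known
    input_tokens: int = 0
    output_tokens: int = 0
    duration_s: float = 0.0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@contextmanager
def record_span(sink, name, prompt, human_context=""):
    """Time one agent call and append the finished record to `sink`."""
    span = AgentSpan(name=name, prompt=prompt, human_context=human_context)
    start = time.perf_counter()
    try:
        yield span
    finally:
        # Record the duration even if the wrapped call raises.
        span.duration_s = time.perf_counter() - start
        sink.append(asdict(span))


events = []
with record_span(events, "plan", "Summarize the repository") as span:
    # A real agent would call the model here and copy its reported usage.
    span.input_tokens = 12
    span.output_tokens = 30
```

Keeping the raw prompt and token counts on the record, rather than only an aggregate, matches the episode's emphasis on seeing raw data from agent executions.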