Episode

Matt Aitken from Trigger.dev @ AIE

Podcast: Scaling DevTools
Published: Apr 16, 2026
Duration: 710 seconds (11:50)
Processing state: processed
Canonical source: https://podcast.scalingdevtools.com/episodes/matt-aitken-from-trigger-dev-aie
Audio: https://media.transistor.fm/745e5ae9/dc0770b4.mp3
JSON: /v1/public/podcasts/scaling-devtools/episodes/matt-aitken-from-trigger-dev-aie
Markdown: /podcast/scaling-devtools/matt-aitken-from-trigger-dev-aie.md

Actions

POST https://stenobird.com/v1/public/podcasts/scaling-devtools/episodes/matt-aitken-from-trigger-dev-aie/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/scaling-devtools/matt-aitken-from-trigger-dev-aie.md
Read the agent-friendly Markdown representation of this episode resource.
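
As a rough sketch, the two actions above can be built as plain request descriptions. The host and paths are copied from the Actions list; the helper names and the `EpisodeAction` shape are illustrative assumptions. This page does not describe request bodies, headers, authentication, or response formats, so none are assumed here.

```typescript
// Host and paths come from the Actions list above.
// Helper names and the EpisodeAction shape are hypothetical.
const BASE_URL = "https://stenobird.com";

interface EpisodeAction {
  method: "GET" | "POST";
  url: string;
}

// POST: request low-priority transcript generation for an episode.
function transcriptionRequest(podcast: string, episode: string): EpisodeAction {
  return {
    method: "POST",
    url: `${BASE_URL}/v1/public/podcasts/${podcast}/episodes/${episode}/transcription-requests`,
  };
}

// GET: read the Markdown representation of an episode.
function markdownResource(podcast: string, episode: string): EpisodeAction {
  return {
    method: "GET",
    url: `${BASE_URL}/podcast/${podcast}/${episode}.md`,
  };
}

const action = transcriptionRequest("scaling-devtools", "matt-aitken-from-trigger-dev-aie");
console.log(action.method, action.url);
```

Either description can then be sent with `fetch(action.url, { method: action.method })`. Because the POST is described as idempotent, repeating it for the same episode should be safe.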

Summary

Building sophisticated AI agents requires moving beyond stateless APIs to manage long-running execution state and context. Matt Aitken explains how Trigger.dev uses VM snapshotting to handle durable agents and secure code execution.

Topics

AI Agents
Durable Execution
LLM Orchestration
Micro-VMs
Firecracker
Software Sandboxing
Serverless Computing
Observability

Highlights

Main idea: Moving from stateless LLM calls to durable agents requires managing both conversation context and execution state
Technical approach: Using VM snapshots of CPU, memory, and filesystem allows agents to pause and resume without losing progress
Practical takeaway: Snapshotting lets developers avoid paying for compute while an agent waits for human feedback or external events
Security challenge: Executing LLM-generated code requires robust sandboxing, such as Firecracker micro-VMs, to prevent unauthorized access to credentials
Failure mode: Relying solely on message history leads to bloated context and high costs as conversation length increases
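
The failure mode in the last highlight can be made concrete with a little arithmetic. The sketch below assumes a stateless chat API where every request resends the full message history, and a fixed number of tokens added per turn. The function name and the numbers are hypothetical and do not come from the episode; prompt caching and output tokens are ignored.

```typescript
// Hypothetical model: each turn appends `tokensPerTurn` tokens to the history,
// and a stateless API needs the whole history resent on every request.
function cumulativeInputTokens(turns: number, tokensPerTurn: number): number {
  let history = 0; // tokens currently in the conversation history
  let total = 0;   // input tokens sent across all requests so far
  for (let turn = 1; turn <= turns; turn++) {
    history += tokensPerTurn; // the new message joins the history
    total += history;         // the full history is sent again
  }
  return total;
}

// Closed form of the same sum: tokensPerTurn * turns * (turns + 1) / 2.
console.log(cumulativeInputTokens(10, 500));  // 27500
console.log(cumulativeInputTokens(100, 500)); // 2525000
```

Under these assumptions, ten times as many turns costs roughly 92 times as many input tokens, because the total grows with the square of the conversation length.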

Chapters

1:00 The Problem with Stateless LLM APIs: Discussing the limitations of simple request-response patterns and the growing complexity of managing massive conversation histories.
1:50 Defining Durable Agents: Exploring the trade-offs of building agents that need persistent memory and long-running processes.
2:40 Step-based Execution vs. State Management: Comparing traditional step-based caching methods with more advanced approaches for handling extremely long agent runs.
4:20 VM Snapshotting for Execution State: How Trigger.dev snapshots the entire machine state to handle context, files, and memory across different servers.
5:10 Handling Latency and Human-in-the-Loop: Managing the cost and efficiency of agents that must wait for external events, other agents, or human approvals.
6:45 Sandboxing with Firecracker: The move toward using Firecracker micro-VMs to provide secure, high-performance sandboxes for untrusted, LLM-generated code.
8:25 Security and Observability in AI: Addressing the challenges of fine-grained permissions and the need for queryable execution logs in production environments.
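
For readers unfamiliar with the step-based caching mentioned in the 2:40 chapter, here is a minimal generic sketch. It is not Trigger.dev's API: the `step` helper, the in-memory `Map`, and the simulated model calls are all illustrative assumptions. Each named step stores its result, so replaying the run after a restart skips work that has already completed.

```typescript
// Generic illustration of step-based caching (not Trigger.dev's API).
type StepCache = Map<string, unknown>;

// Run `fn` once per step name; later calls return the stored result.
function step<T>(cache: StepCache, name: string, fn: () => T): T {
  if (cache.has(name)) {
    return cache.get(name) as T; // completed in an earlier attempt
  }
  const result = fn();
  cache.set(name, result); // a real system would persist this durably
  return result;
}

// A toy two-step agent; `stats.modelCalls` counts simulated model calls.
function agentRun(cache: StepCache, stats: { modelCalls: number }): string {
  const plan = step(cache, "plan", () => {
    stats.modelCalls++;
    return "draft outline";
  });
  return step(cache, "write", () => {
    stats.modelCalls++;
    return `essay based on ${plan}`;
  });
}

// Run the agent, then replay it against the same cache.
function replayDemo(): number {
  const cache: StepCache = new Map();
  const stats = { modelCalls: 0 };
  agentRun(cache, stats); // first attempt executes both steps
  agentRun(cache, stats); // replay: both steps come from the cache
  return stats.modelCalls;
}

console.log(replayDemo()); // 2
```

Note that the replay still re-enters the function from the top and walks past every cached step. The snapshot approach described in the 4:20 chapter instead captures the machine state itself, which is why it is presented as better suited to very long runs.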