Episode

Matt Aitken from Trigger.dev @ AIE

Podcast
Scaling DevTools
Published
Apr 16, 2026
Duration seconds
710
Processing state
processed
Canonical source
https://podcast.scalingdevtools.com/episodes/matt-aitken-from-trigger-dev-aie
Audio
https://media.transistor.fm/745e5ae9/dc0770b4.mp3
JSON
/v1/public/podcasts/scaling-devtools/episodes/matt-aitken-from-trigger-dev-aie
Markdown
/podcast/scaling-devtools/matt-aitken-from-trigger-dev-aie.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/scaling-devtools/episodes/matt-aitken-from-trigger-dev-aie/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/scaling-devtools/matt-aitken-from-trigger-dev-aie.md
    Read the agent-friendly Markdown representation of this episode resource.

Summary

Building sophisticated AI agents requires moving beyond stateless APIs to manage long-running execution state and context. Matt Aitken explains how Trigger.dev uses VM snapshotting to handle durable agents and secure code execution.

Topics

  • AI Agents
  • Durable Execution
  • LLM Orchestration
  • Micro-VMs
  • Firecracker
  • Software Sandboxing
  • Serverless Computing
  • Observability

Highlights

  • Main idea: Moving from stateless LLM calls to durable agents requires managing both conversation context and execution state
  • Technical approach: Snapshotting a VM's CPU, memory, and filesystem state lets agents pause and resume without losing progress
  • Practical takeaway: Snapshotting allows developers to avoid paying for compute while waiting for human feedback or external events
  • Security challenge: Executing LLM-generated code requires robust sandboxing, for example with Firecracker micro-VMs, to prevent unauthorized credential access
  • Failure mode: Relying solely on message history leads to bloated context and high costs as conversation length increases
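
The failure mode in the last highlight can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every turn adds a fixed number of tokens and that each stateless call resends the full history. The function names and the numbers are hypothetical, not figures from the episode.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens sent when the whole history is resent every turn.

    Assumption (not from the episode): each turn appends exactly
    `tokens_per_turn` tokens, so turn k sends k * tokens_per_turn tokens.
    """
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def closed_form(turns: int, tokens_per_turn: int) -> int:
    """Same total via the triangular-number formula n(n+1)/2."""
    return tokens_per_turn * turns * (turns + 1) // 2


if __name__ == "__main__":
    # Illustrative values only: 500 tokens added per turn.
    for n in (10, 100, 1000):
        print(n, cumulative_input_tokens(n, 500))
```

Under these assumptions the total grows quadratically: going from 100 to 1000 turns multiplies the input tokens sent by roughly a hundred, not ten.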

Chapters

  1. 1:00 The Problem with Stateless LLM APIs: Discussing the limitations of simple request-response patterns and the growing complexity of managing massive conversation histories.
  2. 1:50 Defining Durable Agents: An exploration of the trade-offs involved in building agents that require persistent memory and long-running processes.
  3. 2:40 Step-based Execution vs. State Management: Comparing traditional step-based caching methods with more advanced approaches for handling extremely long agent runs.
  4. 4:20 VM Snapshotting for Execution State: How Trigger.dev snapshots the entire machine state to handle context, files, and memory across different servers.
  5. 5:10 Handling Latency and Human-in-the-loop: Managing the cost and efficiency of agents that must wait for external events, other agents, or human approvals.
  6. 6:45 Sandboxing with Firecracker: The move toward using Firecracker micro-VMs to provide secure, high-performance sandboxes for untrusted, LLM-generated code.
  7. 8:25 Security and Observability in AI: Addressing the challenges of fine-grained permissions and the need for queryable execution logs in production environments.
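
The step-based caching mentioned in chapter 3 can be sketched in a few lines. This is a generic illustration of the pattern, not Trigger.dev's API: `StepCache`, `run_agent`, and the step names are hypothetical, and results are assumed to be JSON-serializable. Each completed step's result is persisted, so a re-run after a crash or a pause skips work that already finished.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable


class StepCache:
    """Persist each completed step's result so a re-run can skip it."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.results: dict[str, Any] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def step(self, key: str, fn: Callable[[], Any]) -> Any:
        # Replay: a step that already finished returns its stored result.
        if key in self.results:
            return self.results[key]
        result = fn()  # must be JSON-serializable in this sketch
        self.results[key] = result
        self.path.write_text(json.dumps(self.results))
        return result


def run_agent(cache: StepCache, calls: list[str]) -> str:
    """Hypothetical two-step run; `calls` records which steps really executed."""

    def plan() -> str:
        calls.append("plan")
        return "draft outline"

    outline = cache.step("plan", plan)

    def write() -> str:
        calls.append("write")
        return f"report based on: {outline}"

    return cache.step("write", write)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "steps.json"
        first: list[str] = []
        run_agent(StepCache(path), first)
        second: list[str] = []
        run_agent(StepCache(path), second)  # simulates a restart
        print(first, second)  # the second list stays empty: both steps were cached
```

In this pattern the function still re-executes from the top on every resume and looks up each step in turn, and only explicitly cached values survive a restart. Snapshotting the whole machine, as chapter 4 describes, avoids both constraints by preserving memory and files directly.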