Episode

Solving the 20-Year S3 File System Problem with Hunter Leath

Podcast
Screaming in the Cloud
Published
Jan 20, 2026
Duration seconds
1903
Processing state
processed
Canonical source
https://share.transistor.fm/s/87694d04
Audio
https://dts.podtrac.com/redirect.mp3/media.transistor.fm/87694d04/81e22e5b.mp3
JSON
/v1/public/podcasts/screaming-in-the-cloud/episodes/solving-the-20-year-s3-file-system-problem-with-hunter-leath
Markdown
/podcast/screaming-in-the-cloud/solving-the-20-year-s3-file-system-problem-with-hunter-leath.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/screaming-in-the-cloud/episodes/solving-the-20-year-s3-file-system-problem-with-hunter-leath/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/screaming-in-the-cloud/solving-the-20-year-s3-file-system-problem-with-hunter-leath.md
    Read the agent-friendly Markdown representation of this episode resource.

Summary

Archil addresses the long-standing problem of using S3 as a high-performance file system by placing a fast SSD cache layer in front of it. Legacy applications get POSIX compatibility on top of S3 at a significantly lower cost than EBS or EFS.

Topics

  • Cloud Storage
  • Amazon S3
  • File Systems
  • AWS EFS
  • Infrastructure Engineering
  • Data Caching
  • Cloud Economics
  • Linux Kernel

Highlights

  • Main idea: Archil acts as a pull-through cache, using fast SSDs to bridge the gap between S3's performance and the semantics applications expect from a local disk
  • Practical takeaway: Users can attach Archil directly to existing S3 buckets without migrating data, maintaining S3 as the source of truth
  • Economic advantage: Because it removes the need for over-provisioning, Archil can cost approximately 1.95 cents per GB, significantly less than EBS
  • Failure mode: Traditional attempts to mount S3 as a file system often fail due to high latency and excessive metadata operations (e.g., shell profile reads)
  • Future roadmap: The project is currently built on FUSE to allow rapid iteration but aims to move into the Linux kernel as a stable module
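
The pull-through cache idea from the highlights can be sketched in a few lines. This is a toy illustration, not Archil's implementation: the `PullThroughCache` class, the in-memory dict standing in for the SSD tier, and the `bucket` dict standing in for S3 are all assumptions made for the example. It shows why repeated small reads (such as a shell profile being read again and again) stop hitting the slow origin after the first access.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy read-through cache; a dict stands in for the fast SSD tier."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical stand-in for an S3 GET
        self._store: Dict[str, bytes] = {}   # hypothetical stand-in for SSD storage
        self.origin_reads = 0                # how many times the slow origin was hit

    def read(self, key: str) -> bytes:
        # On a miss, pull the object from the origin once and keep a copy.
        if key not in self._store:
            self.origin_reads += 1
            self._store[key] = self._fetch(key)
        # Hits are served from the cache without touching the origin.
        return self._store[key]


# A dict plays the role of the S3 bucket, which remains the source of truth.
bucket = {"home/.profile": b"export PATH=$PATH:/opt/bin\n"}
cache = PullThroughCache(lambda key: bucket[key])

for _ in range(1000):
    cache.read("home/.profile")

print(cache.origin_reads)  # 1
```

A real system would also need eviction, write-back to the bucket, and invalidation when the bucket changes underneath the cache; none of that is modeled here.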

Chapters

  1. 3:30 The Vision for Archil: Hunter Leath explains the mission to provide an EFS-like experience without the high cost of Amazon's managed services.
  2. 5:45 Why S3 Mounting Fails: A look at the technical hurdles of legacy software making thousands of discrete read operations against high-latency object storage.
  3. 10:30 The SSD Cache Layer: How using fast SSDs as a cache layer enables high-performance workloads like video transcoding and AI training.
  4. 12:40 Integration with Existing Buckets: Details on how Archil attaches to existing production S3 buckets and synchronizes changes back to the source.
  5. 15:00 The Economics of Archil vs EBS: A mathematical comparison showing how Archil's pricing beats EBS by eliminating the need for over-provisioning.
  6. 19:30 Competing with AWS Features: Discussion on the competitive landscape and the risk of AWS introducing similar caching features.
  7. 29:10 Moving Toward the Linux Kernel: The technical roadmap from FUSE-based implementation to a mainline Linux kernel module.
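
The over-provisioning argument in the economics chapter comes down to simple arithmetic: when a volume is billed on provisioned capacity, the effective price of each GB that actually holds data is the list price divided by utilization. The sketch below illustrates that relationship only. The function name and the price used are hypothetical and are not real AWS or Archil figures.

```python
def cost_per_used_gb(price_per_provisioned_gb: float, utilization: float) -> float:
    """Effective price of each GB that actually holds data.

    utilization is the fraction of provisioned capacity in use, in (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_provisioned_gb / utilization


# Hypothetical figure for illustration only, not a real price.
HYPOTHETICAL_BLOCK_PRICE = 0.10  # dollars per provisioned GB

half_full = cost_per_used_gb(HYPOTHETICAL_BLOCK_PRICE, 0.5)
full = cost_per_used_gb(HYPOTHETICAL_BLOCK_PRICE, 1.0)
print(f"{half_full:.2f} vs {full:.2f}")  # 0.20 vs 0.10
```

Under these assumed numbers, a volume kept half empty as headroom costs twice as much per stored GB as one billed only for what is used.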