Episode
Solving the 20-Year S3 File System Problem with Hunter Leath
- Podcast
- Screaming in the Cloud
- Published
- Jan 20, 2026
- Duration seconds
- 1903
- Processing state
- processed
- Canonical source
- https://share.transistor.fm/s/87694d04
Actions
POST https://stenobird.com/v1/public/podcasts/screaming-in-the-cloud/episodes/solving-the-20-year-s3-file-system-problem-with-hunter-leath/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/screaming-in-the-cloud/solving-the-20-year-s3-file-system-problem-with-hunter-leath.md
Read the agent-friendly Markdown representation of this episode resource.
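As a rough illustration, the two actions above could be prepared from Python's standard library as follows. This is a sketch under assumptions: the page does not say whether the POST expects a request body, authentication, or particular headers, so the empty body here is a guess, and the helper names `build_transcription_request` and `build_markdown_request` are hypothetical. The requests are only constructed, not sent.

```python
import urllib.request

# Both URLs are copied from the Actions listed on this page.
TRANSCRIPTION_URL = (
    "https://stenobird.com/v1/public/podcasts/screaming-in-the-cloud/episodes/"
    "solving-the-20-year-s3-file-system-problem-with-hunter-leath/transcription-requests"
)
MARKDOWN_URL = (
    "https://stenobird.com/podcast/screaming-in-the-cloud/"
    "solving-the-20-year-s3-file-system-problem-with-hunter-leath.md"
)


def build_transcription_request() -> urllib.request.Request:
    """Prepare the POST that requests transcript generation.

    Assumption: an empty body is acceptable; the page does not document one.
    """
    return urllib.request.Request(TRANSCRIPTION_URL, data=b"", method="POST")


def build_markdown_request() -> urllib.request.Request:
    """Prepare the GET for the Markdown representation of this episode."""
    return urllib.request.Request(MARKDOWN_URL, method="GET")


if __name__ == "__main__":
    for req in (build_transcription_request(), build_markdown_request()):
        print(req.get_method(), req.full_url)
```

Passing either object to `urllib.request.urlopen` would perform the call. Because the page describes the POST as idempotent, repeating it should be safe, though the response format is not documented here.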
Summary
Archil solves the long-standing problem of using S3 as a high-performance file system by implementing a high-speed SSD cache layer. This approach allows legacy applications to use S3 with POSIX compatibility and significantly lower costs than EBS or EFS.
Topics
- Cloud Storage
- Amazon S3
- File Systems
- AWS EFS
- Infrastructure Engineering
- Data Caching
- Cloud Economics
- Linux Kernel
Highlights
- Main idea: Archil acts as a pull-through cache, using fast SSDs to bridge the performance gap between S3 and local disk semantics
- Practical takeaway: Users can attach Archil directly to existing S3 buckets without migrating data, maintaining S3 as the source of truth
- Economic advantage: When accounting for over-provisioning, Archil can cost approximately 1.95 cents per GB, significantly cheaper than EBS
- Failure mode: Traditional attempts to mount S3 as a file system often fail due to high latency and excessive metadata operations (e.g., shell profile reads)
- Future roadmap: The project is currently built on FUSE to allow rapid iteration but aims to move into the Linux kernel as a stable module
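The pull-through cache idea in the first highlight can be sketched in a few lines. This is a toy model, not Archil's implementation: the class name `PullThroughCache`, the LRU eviction policy, and the synchronous write-through behavior are all assumptions made for illustration, and a plain dictionary stands in for the S3 bucket.

```python
from collections import OrderedDict
from typing import Callable


class PullThroughCache:
    """Toy pull-through cache: a small fast tier in front of a slow backing store."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        store: Callable[[str, bytes], None],
        capacity: int,
    ) -> None:
        self._fetch = fetch        # slow read from the backing store (stand-in for S3)
        self._store = store        # slow write to the backing store
        self._capacity = capacity  # maximum number of cached objects
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._entries:
            # Hit: serve from the fast tier and mark the entry as recently used.
            self.hits += 1
            self._entries.move_to_end(key)
            return self._entries[key]
        # Miss: pull the object through from the backing store, then cache it.
        self.misses += 1
        data = self._fetch(key)
        self._insert(key, data)
        return data

    def write(self, key: str, data: bytes) -> None:
        # Assumed policy: write through immediately so the backing store
        # stays the source of truth. A real system might sync asynchronously.
        self._store(key, data)
        self._insert(key, data)

    def _insert(self, key: str, data: bytes) -> None:
        self._entries[key] = data
        self._entries.move_to_end(key)
        # Evict least recently used entries once over capacity.
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)


if __name__ == "__main__":
    bucket = {"profile": b"export PATH=/usr/bin"}  # dictionary standing in for a bucket
    cache = PullThroughCache(bucket.__getitem__, bucket.__setitem__, capacity=2)
    cache.read("profile")  # first read goes to the backing store
    cache.read("profile")  # repeat read is served from the cache
    print(cache.hits, cache.misses)
```

The repeated read is the case the failure-mode highlight describes: small, frequent reads such as shell profile lookups only pay the object-store latency once.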
Chapters
- 3:30 The Vision for Archil: Hunter Leath explains the mission to provide EFS-like experiences without the high cost of Amazon's managed services.
- 5:45 Why S3 Mounting Fails: A look at the technical hurdles of legacy software making thousands of discrete read operations against high-latency object storage.
- 10:30 The SSD Cache Layer: How using fast SSDs as a cache layer enables high-performance workloads like video transcoding and AI training.
- 12:40 Integration with Existing Buckets: Details on how Archil attaches to existing production S3 buckets and synchronizes changes back to the source.
- 15:00 The Economics of Archil vs EBS: A mathematical comparison showing how Archil's pricing beats EBS by eliminating the need for over-provisioning.
- 19:30 Competing with AWS Features: Discussion on the competitive landscape and the risk of AWS introducing similar caching features.
- 29:10 Moving Toward the Linux Kernel: The technical roadmap from FUSE-based implementation to a mainline Linux kernel module.