Episode
S12 E16: Nikunj Bajaj, True Foundry
- Published
- Apr 28, 2026
- Duration seconds
- 1956
- Processing state
- processed
- Canonical source
- https://codestory.co/podcast/e16-nikunj-bajaj-true-foundry/
Actions
POST https://stenobird.com/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md
Read the agent-friendly Markdown representation of this episode resource.
Summary
Nikunj Bajaj explains how True Foundry built an enterprise-grade AI gateway using a split-plane architecture to manage LLM and agent traffic. He details the transition from Meta engineer to founder, focusing on the infrastructure needs of the upcoming agentic application era.
Topics
- AI Gateway
- LLM Infrastructure
- Kubernetes
- Split-plane architecture
- Agentic applications
- Machine Learning Operations
- Startup scaling
- Cloud computing
Highlights
- Main idea: True Foundry utilizes a split-plane architecture to separate the control plane from the gateway plane, ensuring high availability in the critical request path
- Practical takeaway: Building on Kubernetes allows startups to leverage existing enterprise standards and community-driven scaling capabilities
- Failure mode: Relying on fragile, non-standardized infrastructure can lead to downtime in production-critical AI agent applications
- Strategic insight: The future of AI infrastructure lies in becoming a central compute orchestration platform, similar to the evolution of Snowflake for data
- Foundational principle: Long-term entrepreneurial success depends on building a reputation for radical transparency and ownership of mistakes
Chapters
- 1:00 Split-Plane Architecture: Introduction to the design principle of separating the control plane from the gateway plane to ensure reliability.
- 7:00 The Unified AI Stack: Discussing the vertically stacked platform approach involving software, machine learning, and underlying infrastructure.
- 9:50 Betting on Kubernetes: Why investing in Kubernetes-based machine learning infrastructure was a strategic move as other platforms declined.
- 21:40 Achieving Low Latency at Scale: How the gateway maintains sub-second latency while handling tens of thousands of requests per second across 17 regions.
- 27:50 Lessons in Entrepreneurship: Reflecting on early mistakes and the importance of maintaining a consistent architectural foundation.
- 33:50 The Vision for Compute Orchestration: Comparing the future of True Foundry to the rise of data warehouses like Databricks and Snowflake.
- 37:00 Building Mission-Driven Teams: How a clear problem statement and founder dedication drive team resilience during startup rough patches.