# S12 E16: Nikunj Bajaj, True Foundry

- Page: https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry
- Text version: https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md
- Podcast: [Code Story: Insights from Startup Tech Leaders](https://stenobird.com/podcast/code-story)
- Published: 2026-04-28T10:00:16+00:00
- Episode link: https://codestory.co/podcast/e16-nikunj-bajaj-true-foundry/
- Audio file: https://pdst.fm/e/pscrb.fm/rss/p/audio4.redcircle.com/episodes/7deaad27-2d92-4f67-8da5-1709162da475/stream.mp3
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry
- Duration seconds: 1956

## Resource

Nikunj Bajaj explains how True Foundry built an enterprise-grade AI gateway using a split-plane architecture to manage LLM and agent traffic. He details the transition from Meta engineer to founder, focusing on the infrastructure needs of the upcoming agentic application era.

## Highlights

- Main idea: True Foundry utilizes a split-plane architecture to separate the control plane from the gateway plane, ensuring high availability in the critical request path
- Practical takeaway: Building on Kubernetes allows startups to leverage existing enterprise standards and community-driven scaling capabilities
- Failure mode: Relying on fragile, non-standardized infrastructure can lead to downtime in production-critical AI agent applications
- Strategic insight: The future of AI infrastructure lies in becoming a central compute orchestration platform, similar to the evolution of Snowflake for data
- Foundational principle: Long-term entrepreneurial success depends on building a reputation for radical transparency and ownership of mistakes

## Topics

AI Gateway, LLM Infrastructure, Kubernetes, Split-plane architecture, Agentic applications, Machine Learning Operations, Startup scaling, Cloud computing

## Chapters

- 1:00 - Split-Plane Architecture: Introduction to the design principle of separating the control plane from the gateway plane to ensure reliability.
- 7:00 - The Unified AI Stack: Discussing the vertically stacked platform approach involving software, machine learning, and underlying infrastructure.
- 9:50 - Betting on Kubernetes: Why investing in Kubernetes-based machine learning infrastructure was a strategic move as other platforms declined.
- 21:40 - Achieving Low Latency at Scale: How the gateway maintains sub-second latency while handling tens of thousands of requests per second across 17 regions.
- 27:50 - Lessons in Entrepreneurship: Reflecting on early mistakes and the importance of maintaining a consistent architectural foundation.
- 33:50 - The Vision for Compute Orchestration: Comparing the future of True Foundry to the rise of data warehouses like Databricks and Snowflake.
- 37:00 - Building Mission-Driven Teams: How a clear problem statement and founder dedication drive team resilience during startup rough patches.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry/transcription-requests` - Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md` - Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
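
For agents that want to call the two entries listed under Actions, the following is a minimal Python sketch using only the standard library. The two URLs and HTTP methods are taken from this page. Everything else is an assumption made for illustration: the helper names (`build_request_transcript`, `build_read_markdown`, `send`), the empty request body on the `POST`, the absence of authentication headers, and the timeout value. The page does not document the response format, so the sketch returns the raw status code and body text without interpreting them.

```python
from urllib.request import Request, urlopen

# Identifiers taken from the URLs on this page.
PODCAST = "code-story"
EPISODE = "s12-e16-nikunj-bajaj-true-foundry"


def build_request_transcript(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build the POST for the `request_transcript` action (hypothetical helper).

    Assumption: no body or auth header is required; the page does not say.
    """
    url = (
        f"https://stenobird.com/v1/public/podcasts/{podcast}"
        f"/episodes/{episode}/transcription-requests"
    )
    return Request(url, method="POST")


def build_read_markdown(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build the GET for the `read_markdown` action (hypothetical helper)."""
    url = f"https://stenobird.com/podcast/{podcast}/{episode}.md"
    return Request(url, method="GET")


def send(request: Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a prepared request and return (status code, body text).

    Non-2xx responses raise urllib.error.HTTPError; callers decide how to
    handle them. The 10-second timeout is an arbitrary illustrative default.
    """
    with urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

A caller would run `send(build_read_markdown())` to fetch the Markdown, and `send(build_request_transcript())` only when a transcript is actually needed. Because the page describes `request_transcript` as idempotent, repeating that call after a timeout should be safe, though the exact response to a repeated request is not specified here.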