# S12 E16: Nikunj Bajaj, True Foundry

Page: https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry
Text version: https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md
Podcast: [Code Story: Insights from Startup Tech Leaders](https://stenobird.com/podcast/code-story)
Published: 2026-04-28T10:00:16+00:00
Episode link: https://codestory.co/podcast/e16-nikunj-bajaj-true-foundry/
Audio file: https://pdst.fm/e/pscrb.fm/rss/p/audio4.redcircle.com/episodes/7deaad27-2d92-4f67-8da5-1709162da475/stream.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry
Duration seconds: 1956

## Resource

Nikunj Bajaj explains how True Foundry built an enterprise-grade AI gateway using a split-plane architecture to manage LLM and agent traffic. He also describes his transition from Meta engineer to founder and the infrastructure needs of the coming era of agentic applications.

## Highlights
- Main idea: True Foundry uses a split-plane architecture that separates the control plane from the gateway plane, so the gateway in the critical request path stays highly available
- Practical takeaway: Building on Kubernetes allows startups to leverage existing enterprise standards and community-driven scaling capabilities
- Failure mode: Relying on fragile, non-standardized infrastructure can lead to downtime in production-critical AI agent applications
- Strategic insight: The future of AI infrastructure lies in becoming a central compute orchestration platform, similar to the evolution of Snowflake for data
- Foundational principle: Long-term entrepreneurial success depends on building a reputation for radical transparency and ownership of mistakes
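
The split-plane idea in the first highlight can be illustrated with a small sketch. This is a generic rendering of the pattern, not True Foundry's implementation, which the episode summary does not describe at code level. The names `Gateway`, `fetch_config`, `ControlPlaneUnavailable`, and the example route are all hypothetical. The assumption is that the gateway plane keeps a local copy of routing config and never calls the control plane while serving a request.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


class ControlPlaneUnavailable(Exception):
    """Raised by the hypothetical config fetcher when the control plane is down."""


@dataclass
class Gateway:
    """Hypothetical gateway-plane process that routes using cached config."""

    fetch_config: Callable[[], Dict[str, str]]  # assumed call into the control plane
    routes: Dict[str, str] = field(default_factory=dict)
    last_sync: Optional[float] = None

    def sync(self) -> bool:
        """Try to refresh routing config; keep the last good copy on failure."""
        try:
            fresh = dict(self.fetch_config())
        except ControlPlaneUnavailable:
            return False
        self.routes = fresh
        self.last_sync = time.monotonic()
        return True

    def route(self, model: str) -> str:
        """Resolve a model name from the local cache only (no control-plane call)."""
        try:
            return self.routes[model]
        except KeyError:
            raise LookupError(f"no route configured for {model!r}") from None


def demo() -> str:
    """Show that routing keeps working after the control plane goes away."""
    state = {"up": True}

    def fetch() -> Dict[str, str]:
        if not state["up"]:
            raise ControlPlaneUnavailable()
        return {"chat-model": "https://provider.example/v1"}

    gateway = Gateway(fetch_config=fetch)
    gateway.sync()          # initial config pull succeeds
    state["up"] = False     # simulate a control-plane outage
    gateway.sync()          # refresh fails, cached routes are kept
    return gateway.route("chat-model")
```

In this sketch a control-plane outage only delays config updates; requests already covered by the cached routes continue to resolve.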

## Topics

AI Gateway, LLM Infrastructure, Kubernetes, Split-plane architecture, Agentic applications, Machine Learning Operations, Startup scaling, Cloud computing

## Chapters
- 1:00 — Split-Plane Architecture: Introduction to the design principle of separating the control plane from the gateway plane to ensure reliability.
- 7:00 — The Unified AI Stack: Discussing the vertically stacked platform approach involving software, machine learning, and underlying infrastructure.
- 9:50 — Betting on Kubernetes: Why investing in Kubernetes-based machine learning infrastructure was a strategic move as other platforms declined.
- 21:40 — Achieving Low Latency at Scale: How the gateway maintains sub-second latency while handling tens of thousands of requests per second across 17 regions.
- 27:50 — Lessons in Entrepreneurship: Reflecting on early mistakes and the importance of maintaining a consistent architectural foundation.
- 33:50 — The Vision for Compute Orchestration: Comparing the future of True Foundry to the rise of data warehouses like Databricks and Snowflake.
- 37:00 — Building Mission-Driven Teams: How a clear problem statement and founder dedication drive team resilience during startup rough patches.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
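
As a rough sketch, an agent could prepare the two actions above with Python's standard library. The URLs and HTTP methods come from this page; everything else is an assumption, including the empty POST body, the absence of authentication headers, and the helper names `build_read_markdown`, `build_request_transcript`, and `send`. The response format is not documented here, so the sketch returns the raw status and body.

```python
import urllib.request
from typing import Tuple

EPISODE_MD = (
    "https://stenobird.com/podcast/code-story/"
    "s12-e16-nikunj-bajaj-true-foundry.md"
)
TRANSCRIPT_REQUESTS = (
    "https://stenobird.com/v1/public/podcasts/code-story/episodes/"
    "s12-e16-nikunj-bajaj-true-foundry/transcription-requests"
)


def build_read_markdown() -> urllib.request.Request:
    """Prepare the read_markdown action (GET on the Markdown representation)."""
    return urllib.request.Request(EPISODE_MD, method="GET")


def build_request_transcript() -> urllib.request.Request:
    """Prepare the request_transcript action (POST).

    Assumption: the endpoint accepts an empty body; this page does not
    document a payload or any required headers.
    """
    return urllib.request.Request(TRANSCRIPT_REQUESTS, data=b"", method="POST")


def send(req: urllib.request.Request, timeout: float = 10.0) -> Tuple[int, str]:
    """Send a prepared request and return (status code, decoded body)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Because the page describes `request_transcript` as idempotent, calling `send(build_request_transcript())` more than once should be safe, though the exact response to a repeated call is not specified here.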

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.