Episode

S12 E16: Nikunj Bajaj, True Foundry

Podcast
Code Story: Insights from Startup Tech Leaders
Published
Apr 28, 2026
Duration seconds
1956
Processing state
processed
Canonical source
https://codestory.co/podcast/e16-nikunj-bajaj-true-foundry/
Audio
https://pdst.fm/e/pscrb.fm/rss/p/audio4.redcircle.com/episodes/7deaad27-2d92-4f67-8da5-1709162da475/stream.mp3
JSON
/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry
Markdown
/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/code-story/episodes/s12-e16-nikunj-bajaj-true-foundry/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/code-story/s12-e16-nikunj-bajaj-true-foundry.md
    Read the agent-friendly Markdown representation of this episode resource.

Summary

Nikunj Bajaj explains how True Foundry built an enterprise-grade AI gateway that uses a split-plane architecture to manage LLM and agent traffic. He also describes his move from engineer at Meta to founder, and the infrastructure that the coming era of agentic applications will need.

Topics

  • AI Gateway
  • LLM Infrastructure
  • Kubernetes
  • Split-plane architecture
  • Agentic applications
  • Machine Learning Operations
  • Startup scaling
  • Cloud computing

Highlights

  • Main idea: True Foundry uses a split-plane architecture that separates the control plane from the gateway plane, so the critical request path stays highly available
  • Practical takeaway: Building on Kubernetes allows startups to leverage existing enterprise standards and community-driven scaling capabilities
  • Failure mode: Relying on fragile, non-standardized infrastructure can lead to downtime in production-critical AI agent applications
  • Strategic insight: The future of AI infrastructure lies in becoming a central compute orchestration platform, similar to the evolution of Snowflake for data
  • Foundational principle: Long-term entrepreneurial success depends on building a reputation for radical transparency and ownership of mistakes
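
The split-plane idea in the first highlight can be illustrated with a minimal sketch. This is not True Foundry's implementation, which the episode page does not describe in detail. The class names, method names, and the route URL are all hypothetical, and the assumption here is that the gateway serves requests from a locally cached configuration snapshot so that a control plane outage does not interrupt traffic.

```python
from dataclasses import dataclass, field


class ControlPlaneUnavailable(Exception):
    """Raised by the hypothetical control plane when it cannot be reached."""


@dataclass
class ControlPlane:
    """Hypothetical control plane: owns routing configuration, not traffic."""
    routes: dict[str, str] = field(default_factory=dict)
    healthy: bool = True

    def fetch_routes(self) -> dict[str, str]:
        if not self.healthy:
            raise ControlPlaneUnavailable("control plane is down")
        return dict(self.routes)  # hand out a copy, never the live object


@dataclass
class Gateway:
    """Hypothetical gateway plane: serves requests from a local snapshot."""
    control_plane: ControlPlane
    snapshot: dict[str, str] = field(default_factory=dict)

    def sync(self) -> bool:
        """Refresh the snapshot out of band; keep the old one on failure."""
        try:
            self.snapshot = self.control_plane.fetch_routes()
            return True
        except ControlPlaneUnavailable:
            return False

    def handle(self, model: str) -> str:
        """Request path: reads only the snapshot, never the control plane."""
        upstream = self.snapshot.get(model)
        if upstream is None:
            raise KeyError(f"no route configured for model {model!r}")
        return upstream


def demo() -> str:
    control = ControlPlane(routes={"chat-model": "https://upstream.example/chat"})
    gateway = Gateway(control_plane=control)
    gateway.sync()            # initial configuration pull
    control.healthy = False   # simulate a control plane outage
    gateway.sync()            # refresh fails, the old snapshot is retained
    return gateway.handle("chat-model")  # request path still resolves
```

In this sketch the request path (`handle`) never calls the control plane, so an outage only delays configuration updates rather than blocking traffic.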

Chapters

  1. 1:00 Split-Plane Architecture: Introduction to the design principle of separating the control plane from the gateway plane to ensure reliability.
  2. 7:00 The Unified AI Stack: Discussing the vertically stacked platform approach involving software, machine learning, and underlying infrastructure.
  3. 9:50 Betting on Kubernetes: Why investing in Kubernetes-based machine learning infrastructure was a strategic move as other platforms declined.
  4. 21:40 Achieving Low Latency at Scale: How the gateway maintains sub-second latency while handling tens of thousands of requests per second across 17 regions.
  5. 27:50 Lessons in Entrepreneurship: Reflecting on early mistakes and the importance of maintaining a consistent architectural foundation.
  6. 33:50 The Vision for Compute Orchestration: Comparing the future of True Foundry to the rise of data platforms like Databricks and Snowflake.
  7. 37:00 Building Mission-Driven Teams: How a clear problem statement and founder dedication drive team resilience during startup rough patches.