# The CEO Behind the Fastest-Growing AI Inference Company | Tuhin Srivastava

- Page: https://stenobird.com/podcast/gradient-dissent/the-ceo-behind-the-fastest-growing-ai-inference-company-tuhin-srivastava
- Text version: https://stenobird.com/podcast/gradient-dissent/the-ceo-behind-the-fastest-growing-ai-inference-company-tuhin-srivastava.md
- Podcast: [Gradient Dissent: Conversations on AI](https://stenobird.com/podcast/gradient-dissent)
- Published: 2025-11-18T12:00:00+00:00
- Episode link: https://wandb.ai/site/resources/podcast
- Audio file: https://episodes.captivate.fm/episode/bb7a7e75-34e4-471e-a3b8-12dfff61e22e.mp3
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/the-ceo-behind-the-fastest-growing-ai-inference-company-tuhin-srivastava
- Duration seconds: 3553

## Resource

In this episode of Gradient Dissent, Lukas Biewald talks with Tuhin Srivastava, CEO and founder of Baseten, one of the fastest-growing companies in the AI inference ecosystem.

Tuhin shares the real story behind Baseten’s rise and how the market finally aligned with the infrastructure they’d spent years building. They get into the core challenges of modern inference, including why dedicated deployments matter, how runtime and infrastructure bottlenecks stack up, and what makes serving large models fundamentally different from smaller ones. Tuhin also explains how vLLM, TensorRT-LLM, and SGLang differ in practice, what it takes to tune workloads for new chips like the B200, and why reliability becomes harder as systems scale.

The conversation dives into company-building, from killing product lines to avoiding premature scaling while navigating a market that shifts every few weeks.
Connect with us here:

- Tuhin Srivastava: https://www.linkedin.com/in/tuhin-srivastava/
- Lukas Biewald: https://www.linkedin.com/in/lbiewald/
- Weights & Biases: https://www.linkedin.com/company/wandb/

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/the-ceo-behind-the-fastest-growing-ai-inference-company-tuhin-srivastava/transcription-requests`
  Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/gradient-dissent/the-ceo-behind-the-fastest-growing-ai-inference-company-tuhin-srivastava.md`
  Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
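
For agents that want to call the two actions listed under Actions, the following is a minimal Python sketch of how the requests might be constructed with the standard library. The URLs and HTTP methods are taken from this page; the helper names (`build_transcript_request`, `build_markdown_request`) are hypothetical, and sending the POST with an empty body is an assumption, since this page does not document a request body, headers, or the response format. The sketch only builds the request objects and does not send them.

```python
from urllib.request import Request

# Values copied from this page.
BASE = "https://stenobird.com"
PODCAST = "gradient-dissent"
EPISODE = "the-ceo-behind-the-fastest-growing-ai-inference-company-tuhin-srivastava"


def build_transcript_request() -> Request:
    """Build the request_transcript action (POST).

    Assumption: no request body is required, so an empty body is used.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}/episodes/{EPISODE}"
        "/transcription-requests"
    )
    return Request(url, data=b"", method="POST")


def build_markdown_request() -> Request:
    """Build the read_markdown action (GET)."""
    url = f"{BASE}/podcast/{PODCAST}/{EPISODE}.md"
    return Request(url, method="GET")
```

Either request object could then be passed to `urllib.request.urlopen`. Because the page describes `request_transcript` as idempotent, repeating the POST for the same episode should not need special handling on the client side.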