# Reiner Pope – The math behind how LLMs are trained and served

Page: https://stenobird.com/podcast/dwarkesh-podcast/reiner-pope-the-math-behind-how-llms-are-trained-and-served
Text version: https://stenobird.com/podcast/dwarkesh-podcast/reiner-pope-the-math-behind-how-llms-are-trained-and-served.md
Podcast: [Dwarkesh Podcast](https://stenobird.com/podcast/dwarkesh-podcast)
Published: 2026-04-29T17:07:03+00:00
Episode link: https://www.dwarkesh.com/p/reiner-pope
Audio file: https://api.substack.com/feed/podcast/195859978/d0ec0edfef14862f0fc2e095a1ee64b0.mp3
Processing state: not_requested
JSON: https://stenobird.com/v1/public/podcasts/dwarkesh-podcast/episodes/reiner-pope-the-math-behind-how-llms-are-trained-and-served
Duration seconds: 8030

## Resource

Did a very different format with Reiner Pope - a blackboard lecture where he walks through how frontier LLMs are trained and served. It’s shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk.

It’s a bit technical, but I encourage you to hang in there – it’s really worth it. There are fewer than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner does. It was a real delight to learn from him.

Recommend watching this one on YouTube so you can see the chalkboard.

Reiner is CEO of MatX, a new chip startup (full disclosure - I’m an angel investor). He was previously at Google, where he worked on software efficiency, compilers, and TPU architecture.

Download markdown of transcript here to chat with an LLM.

Wrote up some flashcards and practice problems to help myself retain what Reiner taught. Hope it's helpful to you too!

Sponsors

- Jane Street needs constant access to incredibly low-latency compute. I recently asked one of their engineers, Clark, to talk me through how they meet these demands. Our conversation, which touched on everything from FPGAs to liquid cooling, was extremely helpful as I prepped to interview Reiner. You can watch the full discussion and explore Jane Street’s open roles at janestreet.com/dwarkesh
- Google’s Gemma 4 is the first open model that’s let me shut off the internet and create a fully disconnected “focus machine”. This is because Gemma is small enough to run on my laptop, but powerful enough to actually be useful. So, to prep for this interview, I downloaded Reiner’s scaling book, disconnected from wifi, and used Gemma to help me break down the material. Check it out at goo.gle/Gemma4
- Cursor helped me turn some n…

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/dwarkesh-podcast/episodes/reiner-pope-the-math-behind-how-llms-are-trained-and-served/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/dwarkesh-podcast/reiner-pope-the-math-behind-how-llms-are-trained-and-served.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
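
As a rough illustration of the two actions above, the following Python sketch builds the `request_transcript` and `read_markdown` calls with the standard library. The URLs and HTTP methods come from the list above; everything else is an assumption: the helper names (`build_request_transcript`, `build_read_markdown`, `send`) are hypothetical, the POST is sent with an empty body, no authentication is included, and the response format is not documented on this page, so it is returned as raw text.

```python
import urllib.request

BASE = "https://stenobird.com"
SLUG = "reiner-pope-the-math-behind-how-llms-are-trained-and-served"
PAGE_PATH = f"podcast/dwarkesh-podcast/{SLUG}"
API_PATH = f"v1/public/podcasts/dwarkesh-podcast/episodes/{SLUG}"


def build_request_transcript() -> urllib.request.Request:
    """Build the POST for the request_transcript action.

    Assumption: the endpoint accepts an empty body and needs no auth header.
    """
    url = f"{BASE}/{API_PATH}/transcription-requests"
    return urllib.request.Request(url, data=b"", method="POST")


def build_read_markdown() -> urllib.request.Request:
    """Build the GET for the read_markdown action."""
    url = f"{BASE}/{PAGE_PATH}.md"
    return urllib.request.Request(url, method="GET")


def send(request: urllib.request.Request, timeout: float = 30.0) -> tuple[int, str]:
    """Send a prepared request and return (status code, body as text).

    The response schema is not described on this page, so the body is
    returned undecoded beyond UTF-8 text.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

Because the action is described as idempotent, calling `send(build_request_transcript())` more than once should be safe; a later `send(build_read_markdown())` would then fetch the Markdown representation of the episode.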

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.