# Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

Page: https://stenobird.com/podcast/twiml-ai-podcast/inside-s1-an-o1-style-reasoning-model-that-cost-under-50-to-train-with-niklas-muennighoff-721
Text version: https://stenobird.com/podcast/twiml-ai-podcast/inside-s1-an-o1-style-reasoning-model-that-cost-under-50-to-train-with-niklas-muennighoff-721.md
Podcast: [The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)](https://stenobird.com/podcast/twiml-ai-podcast)
Published: 2025-03-03T23:56:03+00:00
Episode link: https://twimlai.com/podcast/twimlai/inside-s1-an-o1-style-reasoning-model-that-cost-under-50-to-train/
Audio file: https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN4321517135.mp3?updated=1741045497
Processing state: failed
JSON: https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/inside-s1-an-o1-style-reasoning-model-that-cost-under-50-to-train-with-niklas-muennighoff-721
Duration seconds: 2969

## Resource

Today, we're joined by Niklas Muennighoff, a PhD student at Stanford University, to discuss his paper, “s1: Simple Test-Time Scaling.” We explore the motivations behind s1, as well as how it compares to OpenAI's o1 and DeepSeek's R1 models. We dig into the different approaches to test-time scaling, including parallel and sequential scaling, as well as s1’s data curation process, its training recipe, and its use of model distillation from Google Gemini and DeepSeek R1. We explore the novel "budget forcing" technique developed in the paper, which lets the model think longer on harder problems and optimize test-time compute for better performance. Additionally, we cover the evaluation benchmarks used, the comparison between supervised fine-tuning and reinforcement learning, and similar projects like the Hugging Face Open R1 project. Finally, we discuss the open-sourcing of s1 and its future directions. The complete show notes for this episode can be found at https://twimlai.com/go/721.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/inside-s1-an-o1-style-reasoning-model-that-cost-under-50-to-train-with-niklas-muennighoff-721/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/twiml-ai-podcast/inside-s1-an-o1-style-reasoning-model-that-cost-under-50-to-train-with-niklas-muennighoff-721.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
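
As a rough illustration, the two actions above could be invoked from Python's standard library along these lines. This is a minimal sketch, not a documented client: the helper names (`build_request_transcript`, `build_read_markdown`, `send`) are hypothetical, and the empty POST body, the absence of authentication, and the shape of the responses are assumptions, since this page does not specify them.

```python
import urllib.request

# URL parts copied from the action definitions on this page.
BASE = "https://stenobird.com"
PODCAST_SLUG = "twiml-ai-podcast"
EPISODE_SLUG = (
    "inside-s1-an-o1-style-reasoning-model-that-cost-under-50-"
    "to-train-with-niklas-muennighoff-721"
)


def build_request_transcript() -> urllib.request.Request:
    """Build the POST for the request_transcript action (hypothetical helper)."""
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST_SLUG}"
        f"/episodes/{EPISODE_SLUG}/transcription-requests"
    )
    # Assumption: the endpoint accepts an empty body; the page does not say.
    return urllib.request.Request(url, data=b"", method="POST")


def build_read_markdown() -> urllib.request.Request:
    """Build the GET for the read_markdown action (hypothetical helper)."""
    url = f"{BASE}/podcast/{PODCAST_SLUG}/{EPISODE_SLUG}.md"
    return urllib.request.Request(url, method="GET")


def send(req: urllib.request.Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a prepared request and return (HTTP status, decoded body)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Calling `send(build_request_transcript())` would issue the transcription request, and `send(build_read_markdown())` would fetch the Markdown representation. Because the action is described as idempotent, repeating the POST should be safe, though the status code returned on a repeat call is not stated here.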

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.