# The Startup Powering The Data Behind AGI

- Page: https://stenobird.com/podcast/gradient-dissent/the-startup-powering-the-data-behind-agi
- Text version: https://stenobird.com/podcast/gradient-dissent/the-startup-powering-the-data-behind-agi.md
- Podcast: [Gradient Dissent: Conversations on AI](https://stenobird.com/podcast/gradient-dissent)
- Published: 2025-09-16T10:00:00+00:00
- Episode link: https://wandb.ai/site/resources/podcast
- Audio file: https://episodes.captivate.fm/episode/becd4fd5-189b-4644-b956-4efd1c5756c1.mp3
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/the-startup-powering-the-data-behind-agi
- Duration seconds: 3375

## Resource

Surge AI CEO Edwin Chen explains why high-quality, expert-led human data is the critical bottleneck for frontier LLMs. He argues that traditional labeling is broken and that the future of AGI depends on moving beyond simple benchmarks toward complex, multi-day reasoning tasks.

## Highlights

- Main idea: The industry is moving from simple classification to high-complexity tasks requiring days of human expertise
- Failure mode: Relying on inter-annotator agreement or simple checkboxes fails to capture subjective quality in creative or complex domains
- Practical takeaway: Effective model training requires understanding the researcher's underlying goal rather than just following rigid instructions
- Critical insight: Benchmark hacking on academic datasets is creating a disconnect between leaderboard performance and real-world utility
- Future trend: The ratio of spend on data versus compute should increase as models require more nuanced, specialized human feedback

## Topics

LLM Training, Data Labeling, AGI, Reinforcement Learning, Machine Learning Benchmarks, Human-in-the-loop, Synthetic Data, Surge AI

## Chapters

- 1:00 — The Data Collection Landscape: An overview of the massive, constant spend required for data in foundation model training.
- 5:15 — Scaling Human Networks: How Surge initially sourced workers and built its early network.
- 13:30 — The Myth of the PhD Solution: Why simply hiring experts like PhDs doesn't solve the fundamental problems of data quality.
- 26:10 — The Shift to High-Cognitive Tasks: Moving from five-second labeling tasks to complex problems that take days to solve.
- 30:25 — The Danger of Benchmark Hacking: How optimizing for leaderboards like LMSYS can degrade real-world model performance.
- 39:20 — Data for Scientific Discovery: The role of specialized data in training models for chemistry and advanced reasoning.
- 47:40 — Synthetic vs. Human Data: The limitations of synthetic data and the necessity of messy, real-world human inputs.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/the-startup-powering-the-data-behind-agi/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/gradient-dissent/the-startup-powering-the-data-behind-agi.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
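
As a rough illustration of the two actions listed under Actions, the sketch below builds the corresponding HTTP requests with Python's standard library. The helper names (`build_transcript_request`, `build_read_markdown_request`, `send`) are hypothetical, and the sketch assumes the POST needs no request body, custom headers, or authentication, none of which this page specifies.

```python
from urllib.request import Request, urlopen

# URLs copied from the Actions section of this page.
TRANSCRIPT_REQUEST_URL = (
    "https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/"
    "the-startup-powering-the-data-behind-agi/transcription-requests"
)
MARKDOWN_URL = (
    "https://stenobird.com/podcast/gradient-dissent/"
    "the-startup-powering-the-data-behind-agi.md"
)


def build_transcript_request() -> Request:
    """Build the request_transcript call (POST).

    Assumption: an empty body and no auth are sufficient; the page does
    not document a payload or required headers.
    """
    return Request(TRANSCRIPT_REQUEST_URL, method="POST")


def build_read_markdown_request() -> Request:
    """Build the read_markdown call (GET)."""
    return Request(MARKDOWN_URL, method="GET")


def send(req: Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a prepared request and return (status code, decoded body).

    Raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    with urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

An agent that needs the episode processed would call `send(build_transcript_request())` explicitly, since a plain `send(build_read_markdown_request())` page read does not enqueue transcription. Because the page describes the POST as idempotent, repeating it should be safe, though the response format is not documented here.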