{"podcast":{"title":"Latent Space: The AI Engineer Podcast","slug":"latent-space-ai-engineer","podcast_index_feed_id":6058902,"rss_url":"https://api.substack.com/feed/podcast/1084089.rss","website_url":"https://www.latent.space/podcast","image_url":"https://substackcdn.com/feed/podcast/1084089/ca7468da5614a246d2906ee8926f6de7.jpg","author":"Latent.Space","episode_count":204,"summary":"The AI Engineer newsletter + Top technical AI podcast. How leading labs build Agents, Models, Infra, & AI for Science. See https://latent.space/about for highlights from Greg Brockman, Andrej Karpathy, George Hotz, Simon Willison, Soumith Chintala et al!","last_synced_at":null,"page_url":"https://stenobird.com/podcast/latent-space-ai-engineer"},"episode":{"title":"[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton","slug":"neurips-best-paper-1000-layer-networks-for-self-supervised-rl-kevin-wang-et-al-princeton","published_at":"2026-01-02T16:00:00+00:00","page_url":"https://stenobird.com/podcast/latent-space-ai-engineer/neurips-best-paper-1000-layer-networks-for-self-supervised-rl-kevin-wang-et-al-princeton","show_page_url":"https://stenobird.com/podcast/latent-space-ai-engineer","url":"https://www.latent.space/p/neurips-best-paper-1000-layer-networks","audio_url":"https://api.substack.com/feed/podcast/186610577/1c67d698a72366b17c184a249d44225b.mp3","summary":"From undergraduate research seminars at Princeton to winning Best Paper award at NeurIPS 2025, Kevin Wang, Ishaan Javali, Michał Bortkiewicz, Tomasz Trzcinski, and Benjamin Eysenbach defied conventional wisdom by scaling reinforcement learning networks to 1,000 layers deep, unlocking performance gains that the RL community thought impossible. 
We caught up with the team live at NeurIPS to dig into the story behind RL1000: why deep networks have worked in language and vision but failed in RL for over a decade (spoiler: it’s not just about depth, it’s about the objective), how they discovered that self-supervised RL (learning representations of states, actions, and future states via contrastive learning) scales where value-based methods collapse, the critical architectural tricks that made it work (residual connections, layer normalization, and a shift from regression to classification), why scaling depth is more parameter-efficient than scaling width (linear vs. quadratic growth), how JAX and GPU-accelerated environments let them collect hundreds of millions of transitions in hours (the data abundance that unlocked scaling in the first place), the “critical depth” phenomenon where performance doesn’t just improve, it multiplies once you cross 15M+ transitions and add the right architectural components, why this isn’t just “make networks bigger” but a fundamental shift in RL objectives (their code doesn’t have a line saying “maximize rewards”; it’s pure self-supervised representation learning), how deep-teacher, shallow-student distillation could unlock deployment at scale (train frontier capabilities with 1000 layers, distill down to efficient inference models), the robotics implications (goal-conditioned RL without human supervision or demonstrations, scaling architecture…","meta_description":"From undergraduate research seminars at Princeton to winning Best Paper award at NeurIPS 2025, Kevin Wang, Ishaan Javali, Michał Bortkiewicz, Tomasz 
Trzc…","key_points":[],"chapters":[],"topics":[],"duration_seconds":1699,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/neurips-best-paper-1000-layer-networks-for-self-supervised-rl-kevin-wang-et-al-princeton/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/latent-space-ai-engineer/neurips-best-paper-1000-layer-networks-for-self-supervised-rl-kevin-wang-et-al-princeton.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}