# New top score on ARC-AGI-2-pub (29.4%) - Jeremy Berman

- Page: https://stenobird.com/podcast/machine-learning-street-talk/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman
- Text version: https://stenobird.com/podcast/machine-learning-street-talk/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman.md
- Podcast: [Machine Learning Street Talk (MLST)](https://stenobird.com/podcast/machine-learning-street-talk)
- Published: 2025-09-27T16:21:01+00:00
- Episode link: https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/New-top-score-on-ARC-AGI-2-pub-29-4---Jeremy-Berman-e38pj96
- Audio file: https://traffic.megaphone.fm/APO8526044538.mp3
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman
- Duration seconds: 4107

## Resource

Jeremy Berman explains how shifting from Python code generation to natural language instructions allowed his system to achieve a top score on the ARC-AGI-2-pub leaderboard. The discussion explores the transition from pattern memorization to true algorithmic reasoning and the potential for models to synthesize new knowledge.

## Highlights

- Main idea: Natural language provides a more expressive programming medium than Python for solving complex visual reasoning tasks
- Practical takeaway: In the ARC-AGI-2-pub challenge, a stronger 'checker' model is more critical for success than a stronger 'instruction creator'
- Failure mode: Relying solely on pre-training can actually hinder reasoning by encouraging pattern memorization over logical deduction
- Technical insight: The trade-off in ARC-AGI-2-pub involves balancing the breadth of the search space with the depth of the instruction complexity
- Future vision: True AGI requires a meta-skill for reasoning that allows models to learn and synthesize new skills without losing existing knowledge

## Topics

ARC-AGI, Program Synthesis, Natural Language Processing, Reinforcement Learning, Artificial General Intelligence, Symbolic Reasoning, Evolutionary Algorithms, Machine Learning

## Chapters

- 1:00 — The Goal of Knowledge Synthesis: Discussing the need for AI to move beyond data compression toward systems that can integrate and learn new information dynamically.
- 6:10 — Evolutionary Program Synthesis: A look at the transition from program synthesis to reinforcement learning with verifiable feedback.
- 11:40 — The Shift to Natural Language: Why moving from Python to English instructions improved accuracy by increasing the degrees of freedom in the solution space.
- 17:05 — Neural Networks vs. Turing Completeness: Debating whether LLMs possess true intelligence or are simply searching through the space of Turing programs.
- 22:05 — The Challenge of Continual Learning: Exploring the possibility of freezing expert layers to allow for new learning without catastrophic forgetting.
- 27:35 — The Power of Expressive Programs: Analyzing how combining neural networks with a Python terminal can bridge the gap between intuition and execution.
- 54:10 — Pre-training as a Barrier to Reasoning: A provocative take on how massive pre-training might act as a 'consultant' that knows names but lacks deductive capability.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/machine-learning-street-talk/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.