# New top score on ARC-AGI-2-pub (29.4%) - Jeremy Berman

Page: https://stenobird.com/podcast/machine-learning-street-talk/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman
Text version: https://stenobird.com/podcast/machine-learning-street-talk/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman.md
Podcast: [Machine Learning Street Talk (MLST)](https://stenobird.com/podcast/machine-learning-street-talk)
Published: 2025-09-27T16:21:01+00:00
Episode link: https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/New-top-score-on-ARC-AGI-2-pub-29-4---Jeremy-Berman-e38pj96
Audio file: https://traffic.megaphone.fm/APO8526044538.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman
Duration seconds: 4107

## Resource

Jeremy Berman explains how switching from Python code generation to natural language instructions took his system to a new top score of 29.4% on the ARC-AGI-2-pub leaderboard. The discussion covers the move from pattern memorization toward algorithmic reasoning and whether models can synthesize new knowledge.

## Highlights
- Main idea: Natural language provides a more expressive programming medium than Python for solving complex visual reasoning tasks
- Practical takeaway: In the ARC-AGI-2-pub challenge, a stronger 'checker' model is more critical for success than a stronger 'instruction creator'
- Failure mode: Relying solely on pre-training can hinder reasoning by encouraging pattern memorization over logical deduction
- Technical insight: Solving ARC-AGI-2-pub means trading off the breadth of the search space against the complexity of each instruction
- Future vision: True AGI requires a meta-skill for reasoning that allows models to learn and synthesize new skills without losing existing knowledge
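
The creator and checker roles mentioned above can be pictured with a toy propose-then-check loop. This is only an illustrative sketch, not Berman's pipeline: the names `FOLLOWERS`, `propose_instructions`, `check`, and `best_instruction` are hypothetical, the "instruction follower" is a lookup table standing in for a language model, and the checker is assumed to be a plain exact-match score on training pairs.

```python
from typing import Callable, Dict, List, Tuple

Grid = List[List[int]]

# Hypothetical stand-in for a model that follows an English instruction.
# A real system would call a language model; this toy knows three phrases.
FOLLOWERS: Dict[str, Callable[[Grid], Grid]] = {
    "flip the grid left to right": lambda g: [row[::-1] for row in g],
    "flip the grid top to bottom": lambda g: g[::-1],
    "swap rows and columns": lambda g: [list(col) for col in zip(*g)],
}


def propose_instructions() -> List[str]:
    """Hypothetical 'instruction creator': returns candidate instructions."""
    return list(FOLLOWERS)


def check(instruction: str, examples: List[Tuple[Grid, Grid]]) -> float:
    """Hypothetical 'checker': fraction of training pairs reproduced exactly."""
    follow = FOLLOWERS[instruction]
    hits = sum(follow(inp) == out for inp, out in examples)
    return hits / len(examples)


def best_instruction(examples: List[Tuple[Grid, Grid]]) -> Tuple[float, str]:
    """Score every candidate and return the (score, instruction) pair that ranks highest."""
    scored = [(check(c, examples), c) for c in propose_instructions()]
    return max(scored)


# Two made-up training pairs that are both left-right mirrors.
examples = [
    ([[1, 2], [3, 4]], [[2, 1], [4, 3]]),
    ([[0, 5, 5]], [[5, 5, 0]]),
]
score, instruction = best_instruction(examples)
```

In this toy, widening the search means adding more candidate phrases, while the quality of the final answer depends entirely on how reliably `check` separates good candidates from bad ones.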

## Topics

ARC-AGI, Program Synthesis, Natural Language Processing, Reinforcement Learning, Artificial General Intelligence, Symbolic Reasoning, Evolutionary Algorithms, Machine Learning

## Chapters
- 1:00 — The Goal of Knowledge Synthesis: Discussing the need for AI to move beyond data compression toward systems that can integrate and learn new information dynamically.
- 6:10 — Evolutionary Program Synthesis: A look at the transition from program synthesis to reinforcement learning with verifiable feedback.
- 11:40 — The Shift to Natural Language: Why moving from Python to English instructions improved accuracy by increasing the degrees of freedom in the solution space.
- 17:05 — Neural Networks vs. Turing Completeness: Debating whether LLMs possess true intelligence or are simply searching through the space of Turing programs.
- 22:05 — The Challenge of Continual Learning: Exploring the possibility of freezing expert layers to allow for new learning without catastrophic forgetting.
- 27:35 — The Power of Expressive Programs: Analyzing how combining neural networks with a Python terminal can bridge the gap between intuition and execution.
- 54:10 — Pre-training as a Barrier to Reasoning: A provocative take on how massive pre-training might act as a 'consultant' that knows names but lacks deductive capability.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/machine-learning-street-talk/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
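
As a minimal sketch of invoking `request_transcript` from Python, the snippet below builds the POST request with the standard library. The URL is the one listed above; the empty request body, the absence of authentication, and the helper names `build_transcript_request` and `request_transcript` are assumptions, since this page does not document the request or response format.

```python
import urllib.request

# Endpoint copied from the request_transcript action on this page.
TRANSCRIPT_REQUEST_URL = (
    "https://stenobird.com/v1/public/podcasts/machine-learning-street-talk"
    "/episodes/new-top-score-on-arc-agi-2-pub-29-4-jeremy-berman"
    "/transcription-requests"
)


def build_transcript_request() -> urllib.request.Request:
    """Build the POST request. Empty body and no auth headers are assumptions."""
    return urllib.request.Request(TRANSCRIPT_REQUEST_URL, data=b"", method="POST")


def request_transcript(timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code.

    Raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    with urllib.request.urlopen(build_transcript_request(), timeout=timeout) as resp:
        return resp.status
```

Because the action is described as idempotent, calling `request_transcript()` again after a timeout should be safe, though how the service reports an already-queued request is not stated here.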

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.