{"podcast":{"title":"Machine Learning Street Talk (MLST)","slug":"machine-learning-street-talk","podcast_index_feed_id":781643,"rss_url":"https://anchor.fm/s/1e4a0eac/podcast/rss","website_url":"https://podcasters.spotify.com/pod/show/machinelearningstreettalk","image_url":"https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/4981699/4981699-1757416025703-f026fa81b6d04.jpg","author":"Machine Learning Street Talk (MLST)","episode_count":250,"summary":"Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).","last_synced_at":null,"page_url":"https://stenobird.com/podcast/machine-learning-street-talk"},"episode":{"title":"Transformers Need Glasses! - Federico Barbero","slug":"transformers-need-glasses-federico-barbero","published_at":"2025-03-08T22:49:35+00:00","page_url":"https://stenobird.com/podcast/machine-learning-street-talk/transformers-need-glasses-federico-barbero","show_page_url":"https://stenobird.com/podcast/machine-learning-street-talk","url":"https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/Transformers-Need-Glasses----Federico-Barbero-e2vt2tn","audio_url":"https://anchor.fm/s/1e4a0eac/podcast/play/99567991/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2025-2-8%2F93abdba7-8239-0f46-eb47-df058b0f2033.mp3","summary":"Federico Barbero (DeepMind/Oxford) is the lead author of \"Transformers Need Glasses!\". 
Have you ever wondered why LLMs struggle with seemingly simple tasks like counting or copying long strings of text? We break down the theoretical reasons behind these failures, revealing architectural bottlenecks and the challenges of maintaining information fidelity across extended contexts. Federico explains how these issues are rooted in the transformer's design, drawing parallels to over-squashing in graph neural networks and detailing how the softmax function limits sharp decision-making. But it's not all bad news! Discover practical \"glasses\" that can help transformers see more clearly, from simple input modifications to architectural tweaks. SPONSOR MESSAGES: *** CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting! https://centml.ai/pricing/ Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Go to https://tufalabs.ai/ *** https://federicobarbero.com/ TRANSCRIPT + RESEARCH: https://www.dropbox.com/s/h7ys83ztwktqjje/Federico.pdf?dl=0 TOC: 1. Transformer Limitations: Token Detection & Representation [00:00:00] 1.1 Transformers fail at single token detection [00:02:45] 1.2 Representation collapse in transformers [00:03:21] 1.3 Experiment: LLMs fail at copying last tokens [00:18:00] 1.4 Attention sharpness limitations in transformers 2. Transformer Limitations: Information Flow & Quantization [00:18:50] 2.1 Unidirectional information mixing [00:18:50] 2.2 Unidirectio…","meta_description":"Federico Barbero (DeepMind/Oxford) is the lead author of \"Transformers Need Glasses!\". 
Have you ever wondered why LLMs struggle with seemingly s…","key_points":[],"chapters":[],"topics":[],"duration_seconds":3654,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/transformers-need-glasses-federico-barbero/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/machine-learning-street-talk/transformers-need-glasses-federico-barbero.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}