{"podcast":{"title":"\"The Cognitive Revolution\" | AI Builders, Researchers, and Live Player Analysis","slug":"the-cognitive-revolution","podcast_index_feed_id":6011783,"rss_url":"https://feeds.megaphone.fm/RINTP3108857801","website_url":"https://www.cognitiverevolution.ai/","image_url":"https://megaphone.imgix.net/podcasts/30f818da-c930-11ed-9b4b-1352ca96fb17/image/888e2c534b7c2534213c97e025646932.png?ixlib=rails-4.3.1&max-w=3000&max-h=3000&fit=crop&auto=format,compress","author":"Turpentine","episode_count":346,"summary":"A biweekly podcast where hosts Nathan Labenz and Erik Torenberg interview the builders on the edge of AI and explore the dramatic shift it will unlock in the coming years. The Cognitive Revolution is part of the Turpentine podcast network. To learn more: turpentine.co","last_synced_at":null,"page_url":"https://stenobird.com/podcast/the-cognitive-revolution"},"episode":{"title":"Untangling Neural Network Mechanisms: Goodfire's Lee Sharkey on Parameter-based Interpretability","slug":"untangling-neural-network-mechanisms-goodfire-s-lee-sharkey-on-parameter-based-interpretability","published_at":"2025-08-27T23:11:00+00:00","page_url":"https://stenobird.com/podcast/the-cognitive-revolution/untangling-neural-network-mechanisms-goodfire-s-lee-sharkey-on-parameter-based-interpretability","show_page_url":"https://stenobird.com/podcast/the-cognitive-revolution","url":"https://www.cognitiverevolution.ai","audio_url":"https://pdst.fm/e/mgln.ai/e/1113/pscrb.fm/rss/p/traffic.megaphone.fm/RINTP8528953586.mp3?updated=1756335782","summary":"Today Lee Sharkey of Goodfire joins The Cognitive Revolution to discuss his research on parameter decomposition methods that break down neural networks into interpretable computational components, exploring how his team's \"stochastic parameter decomposition\" approach addresses the limitations of sparse autoencoders and offers new pathways for understanding, monitoring, and potentially steering AI systems at the mechanistic 
level. Check out our sponsors: Oracle Cloud Infrastructure, Shopify. Shownotes below brought to you by Notion AI Meeting Notes - try one month for free at https://notion.com/lp/nathan Parameter vs. Activation Decomposition: Traditional interpretability methods like Sparse Autoencoders (SAEs) focus on analyzing activations, while parameter decomposition focuses on understanding the parameters themselves - the actual \"algorithm\" of the neural network. No \"True\" Decomposition: None of the decompositions (whether sparse dictionary learning or parameter decomposition) are objectively \"right\" because they're all attempting to discretize a fundamentally continuous object, inevitably introducing approximations. Tradeoff in Interpretability: There's a balance between reconstruction loss and causal importance - as you decompose networks more, reconstruction loss may worsen, but interpretability might improve up to a certain point. Potential Unlearning Applications: Parameter decomposition may make unlearning more straightforward than with SAEs because researchers are already working in parameter space and can directly modify vectors that perform specific functions. Function Detection vs. 
Input Direction: A function like \"deception\" might manifest in many different input directions that SAEs struggle to identify as a single concept, while parameter decomp…","meta_description":"Today Lee Sharkey of Goodfire joins The Cognitive Revolution to discuss his research on parameter decomposition methods that break down neural networks in…","key_points":[],"chapters":[],"topics":[],"duration_seconds":7331,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/the-cognitive-revolution/episodes/untangling-neural-network-mechanisms-goodfire-s-lee-sharkey-on-parameter-based-interpretability/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/the-cognitive-revolution/untangling-neural-network-mechanisms-goodfire-s-lee-sharkey-on-parameter-based-interpretability.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}