{"podcast":{"title":"The Data Exchange with Ben Lorica","slug":"the-data-exchange-with-ben-lorica","podcast_index_feed_id":1196000,"rss_url":"https://rss.buzzsprout.com/682433.rss","website_url":"https://thedataexchange.media/","image_url":"https://storage.buzzsprout.com/ljk0yj7r22pi61grsmelnsoa9084?.jpg","author":"Ben Lorica","episode_count":345,"summary":"A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning and AI. Detailed show notes for each episode can be found on https://thedataexchange.media/ The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].","last_synced_at":null,"page_url":"https://stenobird.com/podcast/the-data-exchange-with-ben-lorica"},"episode":{"title":"Breaking the Memory Wall in the Age of Inference","slug":"breaking-the-memory-wall-in-the-age-of-inference","published_at":"2026-02-12T12:00:00+00:00","page_url":"https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/breaking-the-memory-wall-in-the-age-of-inference","show_page_url":"https://stenobird.com/podcast/the-data-exchange-with-ben-lorica","url":"https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18625276-breaking-the-memory-wall-in-the-age-of-inference.mp3","audio_url":"https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18625276-breaking-the-memory-wall-in-the-age-of-inference.mp3","summary":"Sid Sheth, CEO of D-Matrix, explains how digital in-memory computing (DIMC) overcomes the 'memory wall' bottleneck in AI inference. 
The discussion focuses on reducing data movement to significantly improve energy efficiency and token generation speed.","meta_description":"Explore how D-Matrix is solving the AI inference memory bottleneck using digital in-memory computing to optimize transformer model performance.","key_points":["Main idea: The 'memory wall' occurs because moving model parameters between memory and compute consumes excessive time and energy","Practical takeaway: Digital in-memory computing (DIMC) allows matrix operations to happen directly where parameters are stored, eliminating data movement","Failure mode: Hardware startups often fail if they lack experience navigating the complex, high-stakes physical cycles of chip tape-outs","Efficiency metric: Moving from traditional architectures to DIMC can enable running 100B+ parameter models within a single rack with 5-10x better efficiency","Industry trend: The future of AI scaling depends on emerging Ethernet-based scale-up networks like Broadcom's ESUN to connect servers within racks"],"chapters":[{"start_ms":60000,"title":"The Importance of Chip Industry Experience","summary":"A discussion on why successful AI hardware ventures require veterans who have navigated multiple successful chip tape-outs."},{"start_ms":260000,"title":"The Shift from Training to Inference","summary":"Analyzing why the hardware focus is moving from model training to the massive scale required for inference in data centers."},{"start_ms":870000,"title":"The Memory Wall and Data Movement","summary":"An analogy of the highway bottleneck between compute and memory, and how moving data creates a performance ceiling."},{"start_ms":1270000,"title":"The Persistence of Matrix Math","summary":"Why fundamental matrix operations remain the core of AI hardware and how to optimize them without changing the underlying math."},{"start_ms":1480000,"title":"Digital In-Memory Computing (DIMC)","summary":"How D-Matrix uses SRAM-tier computing to process parameters in place, 
drastically reducing energy and latency."},{"start_ms":2520000,"title":"Scaling via Rack-Level Interconnects","summary":"The role of Ethernet-based scale-up networks and the competition between NVLink, ESUN, and UALink in connecting AI servers."},{"start_ms":2730000,"title":"Open Standards in AI Hardware","summary":"D-Matrix's approach to embracing open software stacks like PyTorch and hardware standards like UCIe and Ethernet."}],"topics":["AI Inference","Digital In-Memory Computing","Hardware Accelerators","Transformer Models","Memory Wall","Data Center Infrastructure","Semiconductor Manufacturing","Edge Computing"],"duration_seconds":2743,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/breaking-the-memory-wall-in-the-age-of-inference/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/breaking-the-memory-wall-in-the-age-of-inference.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}