Episode

Sepp Hochreiter - LSTM: The Comeback Story?

Podcast
Machine Learning Street Talk (MLST)
Published
Feb 12, 2025
Duration seconds
4021
Processing state
processed
Canonical source
https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/Sepp-Hochreiter---LSTM-The-Comeback-Story-e2uoffb
Audio
https://anchor.fm/s/1e4a0eac/podcast/play/98368427/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2025-1-12%2F528f8181-5bf4-b25d-6c34-e71a7ea674b4.mp3
JSON
/v1/public/podcasts/machine-learning-street-talk/episodes/sepp-hochreiter-lstm-the-comeback-story
Markdown
/podcast/machine-learning-street-talk/sepp-hochreiter-lstm-the-comeback-story.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/sepp-hochreiter-lstm-the-comeback-story/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/machine-learning-street-talk/sepp-hochreiter-lstm-the-comeback-story.md
    Read the agent-friendly Markdown representation of this episode resource.
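
The two actions above can be sketched in Python. This is a minimal sketch that only builds the requests and does not send them. The helper names `transcription_request` and `markdown_request` are hypothetical, the URL patterns are inferred from the two URLs listed above, and nothing is assumed about authentication or the response format, since the page does not describe either.

```python
from urllib.request import Request

# Base URL taken from the Actions list above.
BASE_URL = "https://stenobird.com"


def transcription_request(podcast_slug: str, episode_slug: str) -> Request:
    """Build (but do not send) the POST that requests transcript generation.

    Hypothetical helper: the path pattern is inferred from the single
    example URL shown on this page.
    """
    url = (
        f"{BASE_URL}/v1/public/podcasts/{podcast_slug}"
        f"/episodes/{episode_slug}/transcription-requests"
    )
    return Request(url, method="POST")


def markdown_request(podcast_slug: str, episode_slug: str) -> Request:
    """Build (but do not send) the GET for the Markdown representation.

    Hypothetical helper: the path pattern is inferred from the single
    example URL shown on this page.
    """
    url = f"{BASE_URL}/podcast/{podcast_slug}/{episode_slug}.md"
    return Request(url, method="GET")


post_req = transcription_request(
    "machine-learning-street-talk",
    "sepp-hochreiter-lstm-the-comeback-story",
)
get_req = markdown_request(
    "machine-learning-street-talk",
    "sepp-hochreiter-lstm-the-comeback-story",
)
```

Either request could then be passed to `urllib.request.urlopen`. Because the POST is described as idempotent, repeating it for the same episode should be safe, though the exact response body is not documented here.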

Summary

In this episode, Sepp Hochreiter, the inventor of LSTM (Long Short-Term Memory) networks, a foundational technology in AI, discusses his journey, the origins of LSTM, and why he believes his latest work, xLSTM, could be the next big thing in AI, particularly for applications like robotics and industrial simulation. He also shares his controversial perspective on Large Language Models (LLMs) and why reasoning is a critical missing piece in current AI systems.

SPONSOR MESSAGES:

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting! https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Go to https://tufalabs.ai/

TRANSCRIPT AND BACKGROUND READING:
https://www.dropbox.com/scl/fi/n1vzm79t3uuss8xyinxzo/SEPPH.pdf?rlkey=fp7gwaopjk17uyvgjxekxrh5v&dl=0

Prof. Sepp Hochreiter
https://www.nx-ai.com/
https://x.com/hochreitersepp
https://scholar.google.at/citations?user=tvUH3WMAAAAJ&hl=en

TOC:

1. LLM Evolution and Reasoning Capabilities
[00:00:00] 1.1 LLM Capabilities and Limitations Debate
[00:03:16] 1.2 Program Generation and Reasoning in AI Systems
[00:06:30] 1.3 Human vs AI Reasoning Comparison
[00:09:59] 1.4 New Research Initiatives and Hybrid Approaches

2. LSTM Technical Architecture
[00:13:18] 2.1 LSTM Development History and Technical Background
[00:20:38] 2.2 LSTM vs RNN Architecture and Computational Complexity
[00:25:10] 2.3 xLSTM Architecture and Flash Attention Comparison
[00:30:51] 2.4 Evolution of Gating Mechanisms from Sigmoid to Exponential

3. Industrial…