Episode

2024 in Open Models [LS Live @ NeurIPS]

Podcast
Latent Space: The AI Engineer Podcast
Published
Dec 23, 2024
Duration seconds
2544
Processing state
processed
Canonical source
https://www.latent.space/p/2024-open-models
Audio
https://api.substack.com/feed/podcast/153509369/adcaf325218a8d4e0e3f2f3e31b113a4.mp3
JSON
/v1/public/podcasts/latent-space-ai-engineer/episodes/2024-in-open-models-ls-live-neurips
Markdown
/podcast/latent-space-ai-engineer/2024-in-open-models-ls-live-neurips.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/2024-in-open-models-ls-live-neurips/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/latent-space-ai-engineer/2024-in-open-models-ls-live-neurips.md
    Read the agent-friendly Markdown representation of this episode resource.
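The two actions above can be sketched as request objects. This is a minimal sketch that only builds the requests from the URLs listed on this page and does not send them. The helper names `transcription_request` and `markdown_request` are hypothetical. This page does not say whether the POST needs a body, headers, or authentication, so none are set.

```python
from urllib.request import Request

# URL pieces copied from the Actions list on this page.
BASE = "https://stenobird.com"
PODCAST = "latent-space-ai-engineer"
EPISODE = "2024-in-open-models-ls-live-neurips"


def transcription_request() -> Request:
    """Build (but do not send) the POST that requests transcript generation.

    Assumption: no body or auth header is needed; the page does not specify.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    return Request(url, method="POST")


def markdown_request() -> Request:
    """Build (but do not send) the GET for the Markdown representation."""
    url = f"{BASE}/podcast/{PODCAST}/{EPISODE}.md"
    return Request(url, method="GET")
```

To actually issue a request, pass either object to `urllib.request.urlopen`. The response format is not described on this page, so inspect the status code and body rather than assuming a shape.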

Summary

Happy holidays! We’ll be sharing snippets from Latent Space LIVE! through the break, bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all our LS supporters who helped fund the venue and A/V production!

For NeurIPS last year we did our standard conference podcast coverage, interviewing the authors of selected papers (which we have now also done for ICLR and ICML). However, we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) get a 2024 year-in-review recap from experts. As a result, we organized the first Latent Space LIVE!, our first in-person miniconference, at NeurIPS 2024 in Vancouver.

Since Nathan Lambert (Interconnects) joined us for the hit RLHF 201 episode at the start of this year, it is hard to overstate how much Open Models have exploded. In 2023 only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.ai, and of course Meta's Llama 1 and 2. This year a whole cast of new open models has burst onto the scene, from Google's Gemma and Cohere's Command R, to Alibaba's Qwen and the DeepSeek models, to LLM360 and DCLM, and of course the Allen Institute's OLMo, OLMoE, PixMo, Molmo, and OLMo 2 models. We were honored to host Luca Soldaini, one of the research leads on the OLMo series of models at AI2. Pursuing Open Model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe, California and the White House. We were also honored to hear from Sophia Yang, head of devrel at Mistral, who also presented a great session at the AI Engineer World's…