Episode

Ashley Edwards - Genie Paper (DeepMind/Runway)

Podcast
Machine Learning Street Talk (MLST)
Published
Sep 13, 2024
Duration seconds
1504
Processing state
processed
Canonical source
https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/Ashley-Edwards---Genie-Paper-DeepMindRunway-e2oc9m5
Audio
https://anchor.fm/s/1e4a0eac/podcast/play/91677829/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2024-8-13%2F43b709ac-1c14-74ab-673a-ab9bb09beac6.mp3
JSON
/v1/public/podcasts/machine-learning-street-talk/episodes/ashley-edwards-genie-paper-deepmind-runway
Markdown
/podcast/machine-learning-street-talk/ashley-edwards-genie-paper-deepmind-runway.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/ashley-edwards-genie-paper-deepmind-runway/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/machine-learning-street-talk/ashley-edwards-genie-paper-deepmind-runway.md
    Read the agent-friendly Markdown representation of this episode resource.
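
The two actions above can be sketched in Python as follows. This is a minimal sketch, not a documented client: the URLs are copied from the action list, while the empty POST body, the absence of authentication headers, and the helper names `build_transcription_request` and `build_markdown_request` are assumptions made for illustration. The requests are only constructed here, not sent.

```python
from urllib.request import Request

# URLs copied verbatim from the Actions list above.
BASE = "https://stenobird.com"
EPISODE_PATH = (
    "/v1/public/podcasts/machine-learning-street-talk"
    "/episodes/ashley-edwards-genie-paper-deepmind-runway"
)
MARKDOWN_PATH = (
    "/podcast/machine-learning-street-talk"
    "/ashley-edwards-genie-paper-deepmind-runway.md"
)


def build_transcription_request() -> Request:
    """Build the POST that requests low-priority transcript generation.

    Assumption: the endpoint accepts an empty body and needs no auth header;
    the page does not say either way.
    """
    url = BASE + EPISODE_PATH + "/transcription-requests"
    return Request(url, data=b"", method="POST")


def build_markdown_request() -> Request:
    """Build the GET that reads the Markdown representation of the episode."""
    return Request(BASE + MARKDOWN_PATH, method="GET")


if __name__ == "__main__":
    for req in (build_transcription_request(), build_markdown_request()):
        print(req.get_method(), req.full_url)
```

To actually issue a request, pass the object to `urllib.request.urlopen`. Because the POST is described as idempotent, repeating it should be safe, but the response format and status codes are not stated on this page.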

Summary

Ashley Edwards, who was at DeepMind when she co-authored the Genie paper and is now at Runway, discusses the Genie system and its applications in video generation, robotics, and game creation.

MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval-augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.

Topics covered:

  • Genie's approach to learning interactive environments, balancing compression and fidelity.
  • The use of latent action models and VQ-VAE models for video processing and tokenization.
  • Challenges in maintaining action consistency across frames and integrating text-to-image models.
  • Evaluation metrics for AI-generated content, such as FID and PSNR diff metrics.

Broader implications and applications:

  • The potential impact of AI video generation on content creation jobs.
  • Applications of Genie in game generation and robotics.
  • The use of foundation models in robotics and the differences between internet video data and specialized robotics data.
  • Challenges in mapping AI-generated actions to real-world robotic actions.

Ashley Edwards: https://ashedwards.github.io/

TOC (* marks the best bits)

00:00:00 1. Intro to Genie & Brave Search API: Trade-offs & limitations *
00:02:26 2. Genie's Architecture: Latent action, VQ-VAE, video processing *
00:05:06 3. Genie's Constraints: Frame consistency & image model integration
00:07:26 4. Evaluation: FID, PSNR diff metrics & latent induction methods
00:09:44 5. AI Video Gen: Content creation impact, depth & parallax effects
00:11:39 6. Model Scaling: Training data…