# WorldKV: Efficient World Memory with World Retrieval and Compression

Page: https://stenobird.com/podcast/daily-paper-cast-7079649/worldkv-efficient-world-memory-with-world-retrieval-and-compression
Text version: https://stenobird.com/podcast/daily-paper-cast-7079649/worldkv-efficient-world-memory-with-world-retrieval-and-compression.md
Podcast: [Daily Paper Cast](https://stenobird.com/podcast/daily-paper-cast-7079649)
Published: 2026-05-23T04:26:21+00:00
Episode link: https://share.transistor.fm/s/1ceb95e4
Audio file: https://media.transistor.fm/1ceb95e4/c73cb8ce.mp3
Processing state: not_requested
JSON: https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649/episodes/worldkv-efficient-world-memory-with-world-retrieval-and-compression
Duration seconds: 1351

## Resource

🤗 Upvotes: 29 | cs.CV
Authors: Jung Yi, Minjae Kim, Paul Hyunbin Cho, Wooseok Jang, Sangdoo Yun, Seungryong Kim
Title: WorldKV: Efficient World Memory with World Retrieval and Compression
Arxiv: http://arxiv.org/abs/2605.22718v1

Abstract: Autoregressive video diffusion models have enabled real-time, action-conditioned world generation. However, sustaining a persistent world, where revisiting a previously seen viewpoint yields consistent content, remains an open problem. Full KV-cache attention preserves this consistency but breaks real-time constraints: memory footprint and attention cost grow linearly with rollout length. Sliding window inference restores throughput but discards long-term consistency. We propose WorldKV, a training-free framework with two components: World Retrieval and World Compression. World Retrieval stores evicted KV-cache chunks in GPU/CPU memory and selectively retrieves scene-relevant chunks via camera/action correspondence, inserting them back into the native attention window without re-encoding. World Compression prunes redundant tokens within each chunk via key-key similarity to an anchor frame, halving per-chunk storage to fit 2x more history under a fixed budget. On Matrix-Game-2.0 and LingBot-World-Fast, WorldKV matches or exceeds full-KV memory fidelity at roughly 2x the throughput, and is competitive with memory-trained baselines without any fine-tuning.

Project Page: https://cvlab-kaist.github.io/WorldKV/
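The World Compression idea in the abstract (pruning tokens whose keys are redundant with an anchor frame, keeping about half of each chunk) can be illustrated with a small sketch. This is not the authors' implementation: the function name `compress_chunk`, the `keep_ratio` parameter, the use of cosine similarity, and the choice to score each token by its maximum similarity to any anchor key are all assumptions made for illustration, and the keys are toy 2-D vectors rather than real attention keys.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compress_chunk(anchor_keys, chunk_keys, keep_ratio=0.5):
    """Return sorted indices of the chunk tokens to keep.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    a token is considered redundant when its key closely matches some
    key of the anchor frame, so the tokens with the LOWEST maximum
    similarity to the anchor are the ones retained.
    """
    redundancy = [
        max(cosine(key, anchor) for anchor in anchor_keys)
        for key in chunk_keys
    ]
    n_keep = max(1, math.ceil(len(chunk_keys) * keep_ratio))
    least_redundant_first = sorted(
        range(len(chunk_keys)), key=lambda i: redundancy[i]
    )
    return sorted(least_redundant_first[:n_keep])


# Toy example: one anchor key and four chunk keys.
anchor_keys = [[1.0, 0.0]]
chunk_keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
kept = compress_chunk(anchor_keys, chunk_keys)
print(kept)  # [1, 3]
```

With `keep_ratio=0.5`, half of the tokens survive, which mirrors the "halving per-chunk storage" figure quoted in the abstract. In a real KV cache the same indices would be used to select both the keys and their matching values.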

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649/episodes/worldkv-efficient-world-memory-with-world-retrieval-and-compression/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/daily-paper-cast-7079649/worldkv-efficient-world-memory-with-world-retrieval-and-compression.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
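A minimal sketch of invoking `request_transcript` explicitly, using only the Python standard library. The URL and the `POST` method come from the action listed above. The helper names, the empty request body, the absence of authentication headers, and the handling of the response are assumptions, since this page does not document them.

```python
import urllib.request

# Episode resource URL, taken from the JSON link on this page.
EPISODE_URL = (
    "https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649"
    "/episodes/worldkv-efficient-world-memory-with-world-retrieval-and-compression"
)


def build_transcript_request(episode_url=EPISODE_URL):
    """Build (but do not send) the POST request for the request_transcript action."""
    return urllib.request.Request(
        f"{episode_url}/transcription-requests",
        data=b"",  # assumption: the endpoint needs no request body
        method="POST",
    )


def request_transcript(episode_url=EPISODE_URL, timeout=10.0):
    """Send the request and return (HTTP status, response text).

    The response schema is not described on this page, so the body is
    returned as raw text instead of being parsed.
    """
    req = build_transcript_request(episode_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Because the action is described as idempotent, calling `request_transcript()` more than once for the same episode should be safe. Error handling for non-2xx responses (which `urlopen` raises as `urllib.error.HTTPError`) is left to the caller.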

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.