# GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

- Page: https://stenobird.com/podcast/daily-paper-cast-7079649/ggt-100k-generative-ground-truth-for-generalizable-real-world-image-restoration
- Text version: https://stenobird.com/podcast/daily-paper-cast-7079649/ggt-100k-generative-ground-truth-for-generalizable-real-world-image-restoration.md
- Podcast: [Daily Paper Cast](https://stenobird.com/podcast/daily-paper-cast-7079649)
- Published: 2026-06-02T04:13:27+00:00
- Episode link: https://share.transistor.fm/s/8b7766eb
- Audio file: https://media.transistor.fm/8b7766eb/d12987c1.mp3
- Processing state: not_requested
- JSON: https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649/episodes/ggt-100k-generative-ground-truth-for-generalizable-real-world-image-restoration
- Duration seconds: 1397

## Resource

🤗 Upvotes: 31 | cs.CV

- Authors: Xiangtao Kong, Jixin Zhao, Lingchen Sun, Rongyuan Wu, Lei Zhang
- Title: GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration
- Arxiv: http://arxiv.org/abs/2605.31039v1

Abstract: Real-world image restoration (IR) is bottlenecked by the scarcity of high-quality paired training data. Synthetic datasets are abundant but often fail to model real-world degradations, while real-world paired datasets are expensive and difficult to capture. As a result, IR models trained on these datasets show limited generalization in real-world scenarios. In this work, we propose Generative Ground Truth (GGT) by using generative multimodal foundation models (MFMs) to produce high-quality (HQ) targets from real-world low-quality (LQ) images. We first conduct a systematic evaluation of nine state-of-the-art MFMs, including Nano-Banana-2 and GPT-Image-2, on images of various scenes and degradation types.
The results demonstrate that Nano-Banana-2 with VLM-based adaptive prompting shows the highest capability to synthesize perceptually realistic and content-faithful HQ targets, which can serve as the GGT for the LQ input. We then employ Nano-Banana-2 to build a GGT synthesis pipeline, which involves multi-stage quality control to ensure data reliability, and construct GGT-100K, an LQ-HQ paired dataset comprising 103,707 training pairs and covering diverse scenes and complex real-world degradations. A test set of 500 image pairs is also established. Extensive experiments show that GGT-100K consistently improves the real-world generalization of a wide range of IR models, with particularly strong benefits for finetuning generative models for IR tasks. Our results suggest that MFMs can serve as practical tools for restoration-oriented data gener…

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649/episodes/ggt-100k-generative-ground-truth-for-generalizable-real-world-image-restoration/transcription-requests`. Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/daily-paper-cast-7079649/ggt-100k-generative-ground-truth-for-generalizable-real-world-image-restoration.md`. Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
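
For agents that want to call the `request_transcript` and `read_markdown` actions listed on this page, here is a minimal Python sketch that only builds the two HTTP requests. The helper names `build_transcript_request` and `build_markdown_request` are hypothetical. The page does not say whether a request body, headers, or authentication are required, so the sketch assumes none, and it does not guess at the response format.

```python
from urllib.request import Request

# URL parts copied from the action URLs on this page.
BASE = "https://stenobird.com"
PODCAST = "daily-paper-cast-7079649"
EPISODE = (
    "ggt-100k-generative-ground-truth-for-generalizable-"
    "real-world-image-restoration"
)


def build_transcript_request(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build (but do not send) the POST for the request_transcript action.

    Assumption: no body or auth header is needed; the page does not specify.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{podcast}"
        f"/episodes/{episode}/transcription-requests"
    )
    return Request(url, method="POST")


def build_markdown_request(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build (but do not send) the GET for the read_markdown action."""
    url = f"{BASE}/podcast/{podcast}/{episode}.md"
    return Request(url, method="GET")
```

Passing either object to `urllib.request.urlopen` would send it. Because the page describes `request_transcript` as idempotent, repeating the POST should be safe, while the GET on its own does not enqueue transcription.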