# High-Efficiency Diffusion Models for On-Device Image Generation and Editing with Hung Bui - #753

- Page: https://stenobird.com/podcast/twiml-ai-podcast/high-efficiency-diffusion-models-for-on-device-image-generation-and-editing-with-hung-bui-753
- Text version: https://stenobird.com/podcast/twiml-ai-podcast/high-efficiency-diffusion-models-for-on-device-image-generation-and-editing-with-hung-bui-753.md
- Podcast: [The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)](https://stenobird.com/podcast/twiml-ai-podcast)
- Published: 2025-10-28T20:26:00+00:00
- Episode link: https://twimlai.com/podcast/twimlai/high-efficiency-diffusion-models-for-on-device-image-generation-and-editing/
- Audio file: https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN6593247207.mp3?updated=1761682149
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/high-efficiency-diffusion-models-for-on-device-image-generation-and-editing-with-hung-bui-753
- Duration seconds: 3143

## Resource

Hung Bui explains how to compress computationally expensive diffusion models into single-step architectures for mobile deployment. The discussion focuses on the technical mechanics of distillation and the use of 'coach' networks to bridge the gap between teacher and student distributions.
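
To make the idea of collapsing a multi-step sampler into one step concrete, here is a minimal toy sketch of generic output-regression distillation. It is not the method described in the episode: the scalar "teacher", the target value `2.0`, the affine "student", and every function name are assumptions chosen for illustration, and the coach network is left out because this page does not describe its mechanics.

```python
import random

def teacher_sample(z, steps=100):
    """Hypothetical multi-step teacher: many small updates pulling z toward 2.0."""
    x = z
    for _ in range(steps):
        x = x + (1.0 / steps) * (2.0 - x)
    return x

def fit_student(pairs):
    """Fit a one-step affine student y = w*z + c by ordinary least squares."""
    n = len(pairs)
    mean_z = sum(z for z, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((z - mean_z) * (y - mean_y) for z, y in pairs)
    var = sum((z - mean_z) ** 2 for z, _ in pairs)
    w = cov / var
    c = mean_y - w * mean_z
    return w, c

def student_sample(z, w, c):
    """Single evaluation replaces the teacher's 100-iteration loop."""
    return w * z + c

# Build (noise, teacher output) pairs and regress the student onto them.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
pairs = [(z, teacher_sample(z)) for z in noise]
w, c = fit_student(pairs)

# Compare the one-step student with the 100-step teacher on a fresh input.
test_z = 0.5
gap = abs(student_sample(test_z, w, c) - teacher_sample(test_z))
print(f"w={w:.4f} c={c:.4f} gap={gap:.2e}")
```

In this toy the teacher happens to be an exactly affine function of its input, so a one-step student can match it almost perfectly. Real diffusion teachers are nonlinear neural networks, which is where the training difficulties mentioned in the episode summary arise.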

## Highlights

- Main idea: Single-step diffusion models can achieve high-quality results by distilling knowledge from multi-step teacher models
- Technical breakthrough: A secondary 'coach' network is used to align the student's early-stage distribution with the teacher's distribution
- Practical takeaway: Efficient on-device generation requires minimizing the iterative denoising process to reduce latency and compute
- Failure mode: Standard distillation can fail early in training because the student's distribution is too different from the teacher's for the signal to be useful
- Future direction: The next frontier involves optimizing reasoning models and agents within fixed hardware compute budgets

## Topics

Diffusion Models, Model Distillation, On-Device AI, Image Generation, Neural Network Compression, Qualcomm AI, Computer Vision, Edge Computing

## Chapters

- 1:05 - Introduction and Background: Hung Bui discusses his career path from academia to leadership roles at Qualcomm, Google DeepMind, and Adobe.
- 5:00 - Building AI Talent in Southeast Asia: A look at the efforts to recruit and develop high-level AI researchers and engineers in Vietnam and the broader region.
- 12:35 - Challenges in Large-Scale Language Models: The difficulty of training massive-parameter models like ChatGPT using localized, non-English datasets.
- 16:20 - Optimizing Small Model Performance: Strategies for extracting higher performance from smaller models through data iteration and efficient training.
- 20:20 - The Goal of Efficient Image Generation: Comparing the compute requirements of text generation versus the iterative nature of diffusion-based image generation.
- 24:05 - Distillation and the Denoising Function: Deep dive into the distillation framework used to reduce hundred-step denoising processes into a single inference step.
- 27:45 - The Role of the Coach Network: Explaining how a secondary network acts as a bridge to stabilize training when student and teacher distributions diverge.
- 35:40 - On-Device Agents and Future Scaling: Discussing the future of low-latency AI agents and managing inference-time scaling under fixed hardware budgets.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/high-efficiency-diffusion-models-for-on-device-image-generation-and-editing-with-hung-bui-753/transcription-requests` - Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/twiml-ai-podcast/high-efficiency-diffusion-models-for-on-device-image-generation-and-editing-with-hung-bui-753.md` - Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.