Episode

AI Engineering for Art — with comfyanonymous, of ComfyUI

Podcast
Latent Space: The AI Engineer Podcast
Published
Jan 4, 2025
Duration seconds
3304
Processing state
processed
Canonical source
https://www.latent.space/p/comfyui
Audio
https://api.substack.com/feed/podcast/154105963/0411e065a5b8d299a4838ff6a24052ec.mp3
JSON
/v1/public/podcasts/latent-space-ai-engineer/episodes/ai-engineering-for-art-with-comfyanonymous-of-comfyui
Markdown
/podcast/latent-space-ai-engineer/ai-engineering-for-art-with-comfyanonymous-of-comfyui.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/ai-engineering-for-art-with-comfyanonymous-of-comfyui/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/latent-space-ai-engineer/ai-engineering-for-art-with-comfyanonymous-of-comfyui.md
    Read the agent-friendly Markdown representation of this episode resource.

Summary

Applications for the NYC AI Engineer Summit, focused on Agents at Work, are open!

When we first started Latent Space, in the lightning round we’d always ask guests: “What’s your favorite AI product?” The majority would say Midjourney. The simple UI of prompt → very aesthetic image turned it into a $300M+ ARR bootstrapped business as it rode the first wave of AI image generation.

In open source land, the Stable Diffusion community was congregating around AUTOMATIC1111 as the de facto web UI. Unlike Midjourney, which offered some flags but was mostly prompt-driven, A1111 let users play with a lot more parameters, supported additional modalities like img2img, and allowed users to load in custom models. If you’re interested in some of the SD history, you can look at our episodes with Lexica, Replicate, and Playground.

One of the people involved with that community was comfyanonymous, who was also part of the Stability team in 2023 and who decided to build an alternative called ComfyUI. It is now one of the fastest growing open source projects in generative images and the preferred partner for launches like Black Forest Labs’s Flux Tools on Day 1. The idea behind it was simple: “Everyone is trying to make easy to use interfaces. Let me try to make a powerful interface that's not easy to use.”

Unlike its predecessors, ComfyUI does not have an input text box. Everything is based around the idea of a node: there’s a text input node, a CLIP node, a checkpoint loader node, a KSampler node, a VAE node, etc. While daunting for simple image generation, the tool is amazing for more complex workflows, since you can break down every step of the process and then chain many of them together rather than manually switching between tools. You can also re-start execution halfway instead of from the begi…