# Docker Model Runner

- Page: https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner
- Text version: https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner.md
- Podcast: [DevOps and Docker Talk: Cloud Native Interviews and Tooling](https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling)
- Published: 2025-04-21T18:44:48+00:00
- Episode link: https://podcast.bretfisher.com/episodes/docker-model-runner
- Audio file: https://media.transistor.fm/b8689db1/ab36fce8.mp3
- Processing state: processed
- JSON: https://stenobird.com/v1/public/podcasts/devops-and-docker-talk-cloud-native-interviews-and-tooling/episodes/docker-model-runner
- Duration seconds: 924

## Resource

Docker Model Runner simplifies running LLMs locally by using a single command to manage models via llama.cpp. This episode explores the architecture, OCI artifact integration, and practical use cases for local AI inference.

## Highlights

- Main idea: Docker Model Runner provides a streamlined interface for running LLMs using the `docker model` command
- Technical detail: Models are distributed as OCI artifacts containing the model blob and license files, rather than full container images
- Practical takeaway: Use Open WebUI with Docker Model Runner to create a private, local ChatGPT-like experience
- Failure mode: Large models can cause timeouts or system freezes, occasionally requiring a Docker Desktop restart
- Future roadmap: Upcoming support for Windows, Docker CE, and MLX for significant performance boosts on Apple Silicon

## Topics

Docker Model Runner, LLM, llama.cpp, OCI Artifacts, Docker Hub, Open WebUI, AI Infrastructure, DevOps Automation

## Chapters

- 1:00 — The Agentic DevOps Guild: An introduction to the new community for accelerating AI adoption in DevOps, CI/CD, and Platform Engineering.
- 3:05 — Docker Model Runner Elevator Pitch: A high-level overview of how Docker Model Runner lowers the barrier to entry for running local LLMs.
- 4:05 — Enabling Docker Model Runner: How to enable the feature in Docker Desktop and the distinction between Model Runner and Docker AI.
- 7:30 — Downloading Models via Docker Hub: Exploring the new packaging format for models and how to pull them from the Docker Hub AI account.
- 9:50 — Architecture and llama.cpp: A deep dive into the underlying use of llama.cpp and how models are dynamically loaded into memory.
- 13:05 — OCI Artifacts and ORAS: Understanding the technical implementation of models as OCI artifacts and the role of tools like ORAS.
- 14:10 — Troubleshooting and Future Roadmap: Addressing current limitations like model size issues and discussing upcoming Windows and Linux support.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/devops-and-docker-talk-cloud-native-interviews-and-tooling/episodes/docker-model-runner/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
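
As a rough illustration of the two entries under Actions, the following Python sketch builds the corresponding HTTP requests with the standard library. The URLs and methods are taken from this page. The helper names (`build_read_markdown_request`, `build_request_transcript_request`, `send`) are hypothetical, and the sketch assumes that `request_transcript` needs no request body or authentication header, which the page does not specify.

```python
import urllib.request

# URLs copied from the Actions section of this page.
READ_MARKDOWN_URL = (
    "https://stenobird.com/podcast/"
    "devops-and-docker-talk-cloud-native-interviews-and-tooling/"
    "docker-model-runner.md"
)
REQUEST_TRANSCRIPT_URL = (
    "https://stenobird.com/v1/public/podcasts/"
    "devops-and-docker-talk-cloud-native-interviews-and-tooling/"
    "episodes/docker-model-runner/transcription-requests"
)


def build_read_markdown_request() -> urllib.request.Request:
    """Build the GET request for the read_markdown action."""
    return urllib.request.Request(READ_MARKDOWN_URL, method="GET")


def build_request_transcript_request() -> urllib.request.Request:
    """Build the POST request for the request_transcript action.

    Assumption: an empty body and no auth header are acceptable;
    the page does not document a payload.
    """
    return urllib.request.Request(REQUEST_TRANSCRIPT_URL, data=b"", method="POST")


def send(request: urllib.request.Request, timeout: float = 30.0) -> tuple[int, str]:
    """Send a prepared request and return (HTTP status, decoded body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

Calling `send(build_read_markdown_request())` would fetch the Markdown text of this episode. Because a page view does not enqueue transcription, an agent that needs the episode processed would additionally call `send(build_request_transcript_request())`. The action is described as idempotent, so repeating that call should be harmless.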