# Docker Model Runner

Page: https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner
Text version: https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner.md
Podcast: [DevOps and Docker Talk: Cloud Native Interviews and Tooling](https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling)
Published: 2025-04-21T18:44:48+00:00
Episode link: https://podcast.bretfisher.com/episodes/docker-model-runner
Audio file: https://media.transistor.fm/b8689db1/ab36fce8.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/devops-and-docker-talk-cloud-native-interviews-and-tooling/episodes/docker-model-runner
Duration seconds: 924

## Resource

Docker Model Runner simplifies running LLMs locally: a single `docker model` command manages models, with llama.cpp handling inference under the hood. This episode explores the architecture, OCI artifact integration, and practical use cases for local AI inference.

## Highlights
- Main idea: Docker Model Runner provides a streamlined interface for running LLMs through the `docker model` command
- Technical detail: Models are distributed as OCI artifacts containing the model blob and license files, rather than full container images
- Practical takeaway: Use Open WebUI with Docker Model Runner to create a private, local ChatGPT-like experience
- Failure mode: Large models can cause timeouts or system freezes, occasionally requiring a Docker Desktop restart
- Future roadmap: Support is planned for Windows and Docker CE, along with MLX, which is expected to bring significant performance gains on Apple Silicon
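
The `docker model` workflow and the timeout failure mode noted above can be sketched as a thin Python wrapper. This is a minimal sketch and not something shown in the episode: `pull` and `run` are `docker model` subcommands, while the helper names, the model name `ai/smollm2`, and the 120-second timeout are illustrative assumptions.

```python
import subprocess


def model_argv(subcommand: str, *args: str) -> list[str]:
    """Build the argument vector for a `docker model` subcommand."""
    return ["docker", "model", subcommand, *args]


def run_model_cli(argv: list[str], timeout_s: float = 120.0) -> str:
    """Run a `docker model` command and return its stdout.

    A hung or very slow model load surfaces as a RuntimeError instead of
    blocking the caller indefinitely. The timeout value is an assumption.
    """
    try:
        result = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"{' '.join(argv)} exceeded {timeout_s}s") from exc
    return result.stdout
```

With Docker Model Runner enabled, a call such as `run_model_cli(model_argv("pull", "ai/smollm2"))` would pull the model, and `model_argv("run", "ai/smollm2", "Hello")` builds a one-shot prompt command.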

## Topics

Docker Model Runner, LLM, llama.cpp, OCI Artifacts, Docker Hub, Open WebUI, AI Infrastructure, DevOps Automation

## Chapters
- 1:00 — The Agentic DevOps Guild: An introduction to the new community for accelerating AI adoption in DevOps, CI/CD, and Platform Engineering.
- 3:05 — Docker Model Runner Elevator Pitch: A high-level overview of how Docker Model Runner lowers the barrier to entry for running local LLMs.
- 4:05 — Enabling Docker Model Runner: How to enable the feature in Docker Desktop and the distinction between Model Runner and Docker AI.
- 7:30 — Downloading Models via Docker Hub: Exploring the new packaging format for models and how to pull them from the Docker Hub AI account.
- 9:50 — Architecture and llama.cpp: A deep dive into the underlying use of llama.cpp and how models are dynamically loaded into memory.
- 13:05 — OCI Artifacts and ORAS: Understanding the technical implementation of models as OCI artifacts and the role of tools like ORAS.
- 14:10 — Troubleshooting and Future Roadmap: Addressing current limitations like model size issues and discussing upcoming Windows and Linux support.
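
For agents that need to seek within the audio file, the chapter timestamps above can be converted to second offsets and checked against the listed duration of 924 seconds. A minimal sketch; the function and variable names are illustrative assumptions.

```python
def timestamp_to_seconds(stamp: str) -> int:
    """Convert an `m:ss` or `h:mm:ss` timestamp to a number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


EPISODE_DURATION_S = 924  # from "Duration seconds" in the metadata
CHAPTER_STARTS = ["1:00", "3:05", "4:05", "7:30", "9:50", "13:05", "14:10"]

# Second offsets for each chapter start, in listed order.
chapter_offsets = [timestamp_to_seconds(s) for s in CHAPTER_STARTS]

# Every chapter should begin before the episode ends.
all_within_episode = all(o < EPISODE_DURATION_S for o in chapter_offsets)
```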

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/devops-and-docker-talk-cloud-native-interviews-and-tooling/episodes/docker-model-runner/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
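
The two actions above can be sketched with Python's standard library. This is a minimal sketch under stated assumptions: the URLs and HTTP methods come from the action list, while the helper names and the empty POST body are assumptions, since the expected request body and response format are not documented here.

```python
import urllib.request

BASE = "https://stenobird.com"
SHOW = "devops-and-docker-talk-cloud-native-interviews-and-tooling"
EPISODE = "docker-model-runner"


def build_read_markdown() -> urllib.request.Request:
    """Build the GET request for the Markdown representation of the episode."""
    url = f"{BASE}/podcast/{SHOW}/{EPISODE}.md"
    return urllib.request.Request(url, method="GET")


def build_request_transcript() -> urllib.request.Request:
    """Build the POST request that asks for transcript generation.

    Assumption: the endpoint accepts an empty body. The action is described
    as idempotent, so repeating the call should be safe.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{SHOW}/episodes/{EPISODE}"
        "/transcription-requests"
    )
    return urllib.request.Request(url, data=b"", method="POST")
```

Nothing is sent until a request is passed to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_request_transcript(), timeout=10)`.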

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.