Episode
Docker Model Runner
- Published: Apr 21, 2025
- Duration: 924 seconds
- Processing state: processed
- Canonical source: https://podcast.bretfisher.com/episodes/docker-model-runner
Actions
POST https://stenobird.com/v1/public/podcasts/devops-and-docker-talk-cloud-native-interviews-and-tooling/episodes/docker-model-runner/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/devops-and-docker-talk-cloud-native-interviews-and-tooling/docker-model-runner.md
Read the agent-friendly Markdown representation of this episode resource.
Summary
Docker Model Runner simplifies running LLMs locally: the 'docker model' command manages models, and llama.cpp performs the inference. This episode explores the architecture, OCI artifact integration, and practical use cases for local AI inference.
Topics
- Docker Model Runner
- LLM
- llama.cpp
- OCI Artifacts
- Docker Hub
- Open WebUI
- AI Infrastructure
- DevOps Automation
Highlights
- Main idea: Docker Model Runner provides a streamlined interface for running LLMs using the 'docker model' command
- Technical detail: Models are distributed as OCI artifacts containing the model blob and license files, rather than full container images
- Practical takeaway: Use Open WebUI with Docker Model Runner to create a private, local ChatGPT-like experience
- Failure mode: Large models can cause timeouts or system freezes, occasionally requiring a Docker Desktop restart
- Future roadmap: Upcoming support for Windows, Docker CE, and MLX for significant performance boosts on Apple Silicon
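
The 'docker model' workflow named in the highlights can be sketched roughly as follows. This is a minimal illustration, not the method shown in the episode: it assumes the `pull` and `run` subcommands of the Docker Model Runner CLI, and the model name `ai/smollm2` and the helper names `model_command` and `run_if_available` are hypothetical choices made for this example. The script only prints the commands; nothing is executed unless `run_if_available` is called explicitly.

```python
import shutil
import subprocess


def model_command(action, model, prompt=None):
    """Build the argv for a `docker model` subcommand without running it."""
    if action not in ("pull", "run"):
        raise ValueError(f"unsupported action: {action}")
    argv = ["docker", "model", action, model]
    if action == "run" and prompt is not None:
        # Assumption: a trailing prompt asks for a one-shot answer
        # instead of opening an interactive chat session.
        argv.append(prompt)
    return argv


def run_if_available(argv):
    """Run the command only when a docker CLI is on PATH; otherwise return None."""
    if shutil.which("docker") is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Hypothetical model name under the Docker Hub AI account; not from the episode.
    model = "ai/smollm2"
    for argv in (
        model_command("pull", model),
        model_command("run", model, "Say hello."),
    ):
        print(" ".join(argv))
```

Building the argument list separately from executing it keeps the sketch safe to read and test on machines without Docker Desktop. Given the failure mode noted above, a small model is the safer first choice.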
Chapters
- 1:00 The Agentic DevOps Guild: An introduction to the new community for accelerating AI adoption in DevOps, CI/CD, and Platform Engineering.
- 3:05 Docker Model Runner Elevator Pitch: A high-level overview of how Docker Model Runner lowers the barrier to entry for running local LLMs.
- 4:05 Enabling Docker Model Runner: How to enable the feature in Docker Desktop and the distinction between Model Runner and Docker AI.
- 7:30 Downloading Models via Docker Hub: Exploring the new packaging format for models and how to pull them from the Docker Hub AI account.
- 9:50 Architecture and llama.cpp: A deep dive into the underlying use of llama.cpp and how models are dynamically loaded into memory.
- 13:05 OCI Artifacts and ORAS: Understanding the technical implementation of models as OCI artifacts and the role of tools like ORAS.
- 14:10 Troubleshooting and Future Roadmap: Addressing current limitations like model size issues and discussing upcoming Windows and Linux support.