# ML Infrastructure Without The Ops: Simplifying The ML Developer Experience With Runhouse

- Page: https://stenobird.com/podcast/ai-engineering-podcast/ml-infrastructure-without-the-ops-simplifying-the-ml-developer-experience-with-runhouse
- Text version: https://stenobird.com/podcast/ai-engineering-podcast/ml-infrastructure-without-the-ops-simplifying-the-ml-developer-experience-with-runhouse.md
- Podcast: [AI Engineering Podcast](https://stenobird.com/podcast/ai-engineering-podcast)
- Published: 2024-11-11T00:51:37+00:00
- Episode link: https://www.aiengineeringpodcast.com/runhouse-ml-infrastructure-simplified-episode-40
- Audio file: https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/638668760280576680bd3e1488-da51-4aa8-afb1-962bc5866457v1.mp3
- Processing state: failed
- JSON: https://stenobird.com/v1/public/podcasts/ai-engineering-podcast/episodes/ml-infrastructure-without-the-ops-simplifying-the-ml-developer-experience-with-runhouse
- Duration seconds: 4572

## Resource Summary

Machine learning workflows have long been complex and difficult to operationalize. They are often characterized by a period of research, resulting in an artifact that gets passed to another engineer or team to prepare for running in production. The MLOps category of tools has tried to build a new set of utilities to reduce that friction, but has instead introduced a new barrier at the team and organizational level. Donny Greenberg took the lessons that he learned on the PyTorch team at Meta and created Runhouse. In this episode he explains how, by reducing the number of opinions in the framework, he has also reduced the complexity of moving from development to production for ML systems.


Announcements

- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Donny Greenberg about Runhouse and the current state of ML infrastructure

Interview

- Introduction
- How did you get involved in machine learning?
- What are the core elements of infrastructure for ML and AI?
- How has that changed over the past ~5 years?
- For the past few years the MLOps and data engineering stacks were built and managed separately. How does the current generation of tools and product requirements influence the present and future approach to those domains?
- There are numerous projects that aim to bridge the complexity gap in running Python and ML code from your laptop up to distributed compute on clouds (e.g. Ray, Metaflow, Dask, Modin, etc.). How do you view the decision process for teams trying to understand which tool(s) to use for managing their ML/AI developer experience?
- Can you describe what Runhouse is and the story behind it?
- What are the core problems that you a…

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/ai-engineering-podcast/episodes/ml-infrastructure-without-the-ops-simplifying-the-ml-developer-experience-with-runhouse/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/ai-engineering-podcast/ml-infrastructure-without-the-ops-simplifying-the-ml-developer-experience-with-runhouse.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.