# Are Evals Dead?

Page: https://stenobird.com/podcast/mlops-community/are-evals-dead
Text version: https://stenobird.com/podcast/mlops-community/are-evals-dead.md
Podcast: [MLOps.community](https://stenobird.com/podcast/mlops-community)
Published: 2025-09-26T16:00:04+00:00
Episode link: https://podcasters.spotify.com/pod/show/mlops/episodes/Are-Evals-Dead-e38jjqf
Audio file: https://anchor.fm/s/174cb1b8/podcast/play/108694799/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2025-8-23%2F407998018-44100-2-4625ae0cd88ad.mp3
Processing state: failed
JSON: https://stenobird.com/v1/public/podcasts/mlops-community/episodes/are-evals-dead
Duration seconds: 1524

## Resource

AI Conversations Powered by Prosus Group

Your AI agent isn't failing because it's dumb; it's failing because you refuse to test it. Chiara Caratelli cuts through the hype to show why evaluations, not bigger models or fancier prompts, decide whether agents succeed in the real world. If you're not stress-testing, simulating, and iterating on failures, you're not building AI; you're shipping experiments disguised as products.

- Guest speaker: Chiara Caratelli, Data Scientist @ Prosus Group
- Host: Demetrios Brinkmann, Founder of MLOps Community

Connect with us:

- Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
- Join our Slack community: https://go.mlops.community/slack
- Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
- Sign up for the next meetup: https://go.mlops.community/register
- MLOps Swag/Merch: https://shop.mlops.community/

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/mlops-community/episodes/are-evals-dead/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/mlops-community/are-evals-dead.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.
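The two actions above can be sketched as plain HTTP requests. This is a minimal sketch using only Python's standard library. The URLs and methods come from the action list. The empty POST body, the absence of authentication, and the helper names (`build_request_transcript`, `build_read_markdown`, `send`) are assumptions for illustration, since this page does not document request or response formats.

```python
import urllib.request

BASE = "https://stenobird.com"
EPISODE_PATH = "/podcast/mlops-community/are-evals-dead"
API_PATH = "/v1/public/podcasts/mlops-community/episodes/are-evals-dead"


def build_request_transcript() -> urllib.request.Request:
    """Build the request_transcript call (POST).

    Assumption: the endpoint accepts an empty body and needs no auth header.
    """
    return urllib.request.Request(
        BASE + API_PATH + "/transcription-requests",
        data=b"",
        method="POST",
    )


def build_read_markdown() -> urllib.request.Request:
    """Build the read_markdown call (GET) for the Markdown representation."""
    return urllib.request.Request(BASE + EPISODE_PATH + ".md", method="GET")


def send(req: urllib.request.Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a prepared request and return (status code, decoded body).

    The shape of the response body is not documented here, so it is
    returned as raw text rather than parsed.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

An agent that needs the episode processed would call `send(build_request_transcript())` explicitly. Because the action is described as idempotent, repeating the call should be safe, and `send(build_read_markdown())` can be used afterwards to re-read the resource.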

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.