Episode

Evals, Feedback Loops, and the Engineering That Makes AI Work

Podcast: AI + a16z
Published: Feb 17, 2026
Duration: 2629 seconds (43:49)
Processing state: not_requested
Canonical source: https://ai-a16z.simplecast.com/episodes/evals-feedback-loops-and-the-engineering-that-makes-ai-work-cqU91fWY
Audio: https://mgln.ai/e/1344/afp-848985-injected.calisto.simplecastaudio.com/112866f3-1a50-4a8d-b12e-850b73e71b33/episodes/0a4f8869-211c-4465-91aa-860173e18e94/audio/128/default.mp3?aid=rss_feed&awCollectionId=112866f3-1a50-4a8d-b12e-850b73e71b33&awEpisodeId=0a4f8869-211c-4465-91aa-860173e18e94&feed=Hb_IuXOo
JSON: /v1/public/podcasts/ai-a16z-6874937/episodes/evals-feedback-loops-and-the-engineering-that-makes-ai-work
Markdown: /podcast/ai-a16z-6874937/evals-feedback-loops-and-the-engineering-that-makes-ai-work.md

Actions

POST https://stenobird.com/v1/public/podcasts/ai-a16z-6874937/episodes/evals-feedback-loops-and-the-engineering-that-makes-ai-work/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/ai-a16z-6874937/evals-feedback-loops-and-the-engineering-that-makes-ai-work.md
Read the agent-friendly Markdown representation of this episode resource.

Summary

Martin Casado speaks with Ankur Goyal, founder and CEO of Braintrust, about where engineering actually matters in AI and where it doesn't. They cover the open-source vs. closed-source model cycle, why Chinese models are gaining ground faster than spending suggests, whether AI demand will eventually saturate, and the Bash vs. SQL benchmark that challenges the "just give it a computer" approach to agents.