Episode

Evals, Feedback Loops, and the Engineering That Makes AI Work

Podcast: AI + a16z
Published: Feb 17, 2026
Duration seconds: 2629
Processing state: not_requested
Canonical source: https://ai-a16z.simplecast.com/episodes/evals-feedback-loops-and-the-engineering-that-makes-ai-work-cqU91fWY
Audio: https://mgln.ai/e/1344/afp-848985-injected.calisto.simplecastaudio.com/112866f3-1a50-4a8d-b12e-850b73e71b33/episodes/0a4f8869-211c-4465-91aa-860173e18e94/audio/128/default.mp3?aid=rss_feed&awCollectionId=112866f3-1a50-4a8d-b12e-850b73e71b33&awEpisodeId=0a4f8869-211c-4465-91aa-860173e18e94&feed=Hb_IuXOo
JSON: /v1/public/podcasts/ai-a16z-6874937/episodes/evals-feedback-loops-and-the-engineering-that-makes-ai-work
Markdown: /podcast/ai-a16z-6874937/evals-feedback-loops-and-the-engineering-that-makes-ai-work.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/ai-a16z-6874937/episodes/evals-feedback-loops-and-the-engineering-that-makes-ai-work/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/ai-a16z-6874937/evals-feedback-loops-and-the-engineering-that-makes-ai-work.md
    Read the agent-friendly Markdown representation of this episode resource.
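
The two actions above could be called from a script roughly as follows. This is a minimal sketch using only the Python standard library. The URLs and HTTP methods come from the list above; the empty POST body, the absence of authentication headers, and the helper names (`build_transcription_request`, `build_markdown_request`, `send`) are assumptions made for illustration, and the response format is not documented here, so the sketch returns the raw status and body without interpreting them.

```python
import urllib.request

# URLs copied from the Actions list above.
BASE = "https://stenobird.com"
EPISODE_PATH = (
    "/v1/public/podcasts/ai-a16z-6874937/episodes/"
    "evals-feedback-loops-and-the-engineering-that-makes-ai-work"
)
MARKDOWN_PATH = (
    "/podcast/ai-a16z-6874937/"
    "evals-feedback-loops-and-the-engineering-that-makes-ai-work.md"
)


def build_transcription_request() -> urllib.request.Request:
    """Build the POST that requests low-priority transcript generation.

    Assumption: the endpoint accepts an empty body and needs no auth header.
    """
    url = BASE + EPISODE_PATH + "/transcription-requests"
    return urllib.request.Request(url, data=b"", method="POST")


def build_markdown_request() -> urllib.request.Request:
    """Build the GET that reads the Markdown representation of the episode."""
    return urllib.request.Request(BASE + MARKDOWN_PATH, method="GET")


def send(req: urllib.request.Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a prepared request and return (status code, decoded body).

    The body is returned as-is because its schema is not described here.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Because the POST is described as idempotent, calling `send(build_transcription_request())` more than once should be safe; the status code and body returned by the service are left for the caller to inspect.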

Summary

Martin Casado speaks with Ankur Goyal, founder and CEO of Braintrust, about where engineering actually matters in AI and where it doesn't. They cover the open source vs closed source model cycle, why Chinese models are gaining ground faster than spending suggests, whether AI demand will eventually saturate, and the Bash vs SQL benchmark that challenges the "just give it a computer" approach to agents.