Episode

SWE-bench & SWE-agent | Data Brew | Episode 44

Podcast: Data Brew by Databricks
Published: Apr 17, 2025
Duration seconds: 2182
Processing state: processed
Canonical source: https://www.buzzsprout.com/1370119/episodes/16876013-swe-bench-swe-agent-data-brew-episode-44.mp3
Audio: https://www.buzzsprout.com/1370119/episodes/16876013-swe-bench-swe-agent-data-brew-episode-44.mp3
JSON: /v1/public/podcasts/data-brew-by-databricks/episodes/swe-bench-swe-agent-data-brew-episode-44
Markdown: /podcast/data-brew-by-databricks/swe-bench-swe-agent-data-brew-episode-44.md

Actions

POST https://stenobird.com/v1/public/podcasts/data-brew-by-databricks/episodes/swe-bench-swe-agent-data-brew-episode-44/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/data-brew-by-databricks/swe-bench-swe-agent-data-brew-episode-44.md
Read the agent-friendly Markdown representation of this episode resource.

Summary

In this episode, Kilian Lieret, Research Software Engineer, and Carlos Jimenez, Computer Science PhD Candidate at Princeton University, discuss SWE-bench and SWE-agent, two groundbreaking tools for evaluating and enhancing AI in software engineering. Highlights include:
- SWE-bench: A benchmark for assessing AI models on real-world coding tasks.
- Addressing data leakage concerns in GitHub-sourced benchmarks.
- SWE-agent: An AI-driven system for navigating and solving coding challenges.
- Ov...