{"podcast":{"title":"Super Data Science: ML & AI Podcast with Jon Krohn","slug":"super-data-science","podcast_index_feed_id":220402,"rss_url":"https://feeds.megaphone.fm/SUPERDATASCIENCEPTYLTD9836501887","website_url":"https://www.superdatascience.com/podcast","image_url":"https://megaphone.imgix.net/podcasts/efa92454-1c31-11ef-9e30-03596b470c27/image/c3e0edc239c962f8bcd144000fafa5aa.jpeg?ixlib=rails-4.3.1&max-w=3000&max-h=3000&fit=crop&auto=format,compress","author":"Jon Krohn","episode_count":991,"summary":"The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. 
We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship − everything you need to crush it with data science.","last_synced_at":null,"page_url":"https://stenobird.com/podcast/super-data-science"},"episode":{"title":"903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir","slug":"903-llm-benchmarks-are-lying-to-you-and-what-to-do-instead-with-sinan-ozdemir","published_at":"2025-07-08T11:00:00+00:00","page_url":"https://stenobird.com/podcast/super-data-science/903-llm-benchmarks-are-lying-to-you-and-what-to-do-instead-with-sinan-ozdemir","show_page_url":"https://stenobird.com/podcast/super-data-science","url":"https://www.podtrac.com/pts/redirect.mp3/chrt.fm/track/E581B9/arttrk.com/p/VI4CS/pscrb.fm/rss/p/traffic.megaphone.fm/SUPERDATASCIENCEPTYLTD4709835756.mp3?updated=1752060782","audio_url":"https://www.podtrac.com/pts/redirect.mp3/chrt.fm/track/E581B9/arttrk.com/p/VI4CS/pscrb.fm/rss/p/traffic.megaphone.fm/SUPERDATASCIENCEPTYLTD4709835756.mp3?updated=1752060782","summary":"Has AI benchmarking reached its limit, and what do we have to fill this gap? Sinan Ozdemir speaks to Jon Krohn about the lack of transparency in training data and the necessity of human-led quality assurance to detect AI hallucinations, when and why to be skeptical of AI benchmarks, and the future of benchmarking agentic and multimodal models. Additional materials: www.superdatascience.com/903 This episode is brought to you by Trainium2, the latest AI chip from AWS, by Adverity, the conversational analytics platform and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. 
In this episode you will learn: (16:48) Sinan’s new podcast, Practically Intelligent (21:54) What to know about the limits of AI benchmarking (53:22) Alternatives to AI benchmarks (1:01:23) The difficulties in getting a model to recognize its mistakes","meta_description":"Has AI benchmarking reached its limit, and what do we have to fill this gap? Sinan Ozdemir speaks to Jon Krohn about the lack of transparency in training…","key_points":[],"chapters":[],"topics":[],"duration_seconds":5300,"processing_state":"failed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/super-data-science/episodes/903-llm-benchmarks-are-lying-to-you-and-what-to-do-instead-with-sinan-ozdemir/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/super-data-science/903-llm-benchmarks-are-lying-to-you-and-what-to-do-instead-with-sinan-ozdemir.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}