{"podcast":{"title":"Data Engineering Podcast","slug":"data-engineering-podcast","podcast_index_feed_id":403671,"rss_url":"https://serve.podhome.fm/rss/1c0357c0-6aba-5766-a2d5-2090d8dab6bc","website_url":"https://www.dataengineeringpodcast.com","image_url":"https://assets.podhome.fm/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/638557928872209534cover.jpg","author":"Tobias Macey","episode_count":512,"summary":"This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.","last_synced_at":"2026-06-08T08:20:33.411847+00:00","page_url":"https://stenobird.com/podcast/data-engineering-podcast"},"episode":{"title":"The AI Data Paradox: High Trust in Models, Low Trust in Data","slug":"the-ai-data-paradox-high-trust-in-models-low-trust-in-data","published_at":"2025-11-09T23:53:19+00:00","page_url":"https://stenobird.com/podcast/data-engineering-podcast/the-ai-data-paradox-high-trust-in-models-low-trust-in-data","show_page_url":"https://stenobird.com/podcast/data-engineering-podcast","url":"https://www.dataengineeringpodcast.com/boomi-data-for-ai-survey-results-episode-488","audio_url":"https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/638983160301766546433fb150-92e7-42a5-a006-aacb4f6fee76.mp3","summary":"Ariel Pohryles explores the 'AI Data Paradox,' where high trust in model outputs masks a deep-seated lack of trust in underlying organizational data. The discussion details how data engineers must evolve from building pipelines to governing a massive sprawl of autonomous AI agents.","meta_description":"Explore the AI Data Paradox: why 77% of leaders trust AI models but only 50% trust their data, and how to manage the coming sprawl of AI agents.","key_points":["Main idea: The rise of 'Shadow AI' agents requires data teams to shift from managing data sources to governing autonomous agent actions","Failure mode: Relying on curated, small datasets for AI can create a false sense of security while ignoring broader organizational data rot","Practical takeaway: Organizations should focus on automated pipelines and metadata management to provide the traceability needed for AI-driven decisions","Trend: A resurgence in Master Data Management (MDM) is driven by the need to eliminate duplicates and enrich data for high-stakes AI use cases","Strategic advice: Use AI to automate the data engineering lifecycle itself to handle the increasing complexity of real-time, unstructured workloads"],"chapters":[{"start_ms":300000,"title":"The State of AI Data Investment","summary":"Ariel introduces recent survey findings regarding how data leaders are preparing their infrastructure for generative AI."},{"start_ms":540000,"title":"Reconciling the Trust Paradox","summary":"An analysis of why leaders trust AI model outputs despite significant distrust in the underlying organizational data sources."},{"start_ms":770000,"title":"Risks of Autonomous AI","summary":"Discussing the dangers of AI's ability to 'think' independently when fed unverified or unmanaged data."},{"start_ms":1000000,"title":"Automating Data Validation","summary":"The shift from manual data quality reviews to automated, scalable validation pipelines for production AI."},{"start_ms":1490000,"title":"The Challenge of Data Sprawl","summary":"How the ease of building AI agents is creating a new layer of 'Shadow IT' that data engineers must eventually govern."},{"start_ms":1720000,"title":"Governing the Agent Force","summary":"The necessity of implementing visibility, certification, and kill-switches for the growing population of business agents."},{"start_ms":2400000,"title":"The Future of Data Management","summary":"Predicting a move toward platform consolidation and the use of AI to accelerate the data engineering lifecycle."}],"topics":["Data Engineering","Generative AI","Data Governance","AI Agents","Master Data Management","Data Pipeline Automation","Metadata Management","Shadow IT"],"duration_seconds":3095,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/data-engineering-podcast/episodes/the-ai-data-paradox-high-trust-in-models-low-trust-in-data/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/data-engineering-podcast/the-ai-data-paradox-high-trust-in-models-low-trust-in-data.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}