{"podcast":{"title":"Data Futurology - Leadership And Strategy in Artificial Intelligence, Machine Learning, Data Science","slug":"data-futurology-leadership-and-strategy","podcast_index_feed_id":68063,"rss_url":"https://anchor.fm/s/3fab060/podcast/rss","website_url":"https://www.datafuturology.com/","image_url":"https://d3t3ozftmdmh3i.cloudfront.net/production/podcast_uploaded/567608/567608-1593736387795-5897b4f790597.jpg","author":"Felipe Flores","episode_count":266,"summary":"Artificial intelligence is a tremendously beneficial technology that's advancing at an incredibly rapid pace. As more and more organisations adopt and implement AI we find that the main challenges are not in the technology itself but in the human side, ie: the approaches, chosen problems and what's called 'the last mile', etc. That's why Data Futurology focuses on the leadership side of AI and how to get the most value from it. Join me, Felipe Flores, a Data Science executive with almost 20 years of experience in the space. Every week I speak with top industry leaders from around the world","last_synced_at":null,"page_url":"https://stenobird.com/podcast/data-futurology-leadership-and-strategy"},"episode":{"title":"#244: Navigating Data Quality: Insights from the Chief Operator of Data Quality Camp","slug":"244-navigating-data-quality-insights-from-the-chief-operator-of-data-quality-camp","published_at":"2023-08-16T00:41:05+00:00","page_url":"https://stenobird.com/podcast/data-futurology-leadership-and-strategy/244-navigating-data-quality-insights-from-the-chief-operator-of-data-quality-camp","show_page_url":"https://stenobird.com/podcast/data-futurology-leadership-and-strategy","url":"https://podcasters.spotify.com/pod/show/datafuturology/episodes/244-Navigating-Data-Quality-Insights-from-the-Chief-Operator-of-Data-Quality-Camp-e285fqg","audio_url":"https://anchor.fm/s/3fab060/podcast/play/74677520/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2023-7-16%2F343224439-44100-2-f85279486c7a5.mp3","summary":"Data quality is the essential foundation for reliable AI and machine learning models. Chad Sanderson shares pragmatic strategies for implementing data contracts and managing data reliability through community-driven knowledge.","meta_description":"Learn how to implement data contracts, manage data producers and consumers, and build robust data quality strategies with Chad Sanderson.","key_points":["Main idea: Data should be treated as a permanent organizational asset that outlasts changing technologies and processes","Practical takeaway: Start with 'low-tech' data contracts using YAML or even Word documents to define schemas and SLAs before moving to automated enforcement","Failure mode: Neglecting to identify downstream dependencies can lead to unexpected breaking changes when producers modify data structures","Practical takeaway: Use the 'tier one' approach to prioritize quality efforts on the most critical datasets rather than attempting to fix everything at once","Main idea: Effective data contracts require collaboration between producers and consumers to define requirements like latency and error thresholds"],"chapters":[{"start_ms":230000,"title":"The Power of Community-Driven Knowledge","summary":"Why community-driven insights are more objective for scaling data quality strategies."},{"start_ms":400000,"title":"Data as a Permanent Asset","summary":"Treating data with the same long-term importance as the company's core identity."},{"start_ms":570000,"title":"Initial Steps for Data Quality","summary":"How to begin building a robust approach to improving data reliability."},{"start_ms":730000,"title":"Prioritizing Tier One Datasets","summary":"Identifying critical data columns and assessing the severity of quality issues."},{"start_ms":900000,"title":"The Business Case for Data Quality","summary":"Aligning data quality improvements with financial incentives and business value."},{"start_ms":1080000,"title":"Defining Data Contracts","summary":"Codifying schemas, semantics, and SLAs between producers and consumers."},{"start_ms":1260000,"title":"Low-Tech vs. High-Tech Implementation","summary":"Using YAML and GitHub to implement flexible, scalable data contracts."}],"topics":["Data Quality","Data Contracts","Artificial Intelligence","Machine Learning","Data Engineering","Data Governance","Data Strategy","Data Observability"],"duration_seconds":2332,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/data-futurology-leadership-and-strategy/episodes/244-navigating-data-quality-insights-from-the-chief-operator-of-data-quality-camp/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/data-futurology-leadership-and-strategy/244-navigating-data-quality-insights-from-the-chief-operator-of-data-quality-camp.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}