{"podcast":{"title":"Data Engineering Podcast","slug":"data-engineering-podcast","podcast_index_feed_id":403671,"rss_url":"https://serve.podhome.fm/rss/1c0357c0-6aba-5766-a2d5-2090d8dab6bc","website_url":"https://www.dataengineeringpodcast.com","image_url":"https://assets.podhome.fm/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/638557928872209534cover.jpg","author":"Tobias Macey","episode_count":510,"summary":"This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.","last_synced_at":null,"page_url":"https://stenobird.com/podcast/data-engineering-podcast"},"episode":{"title":"Beyond the PDF: Rowan Cockett on Reproducible, Composable Science","slug":"beyond-the-pdf-rowan-cockett-on-reproducible-composable-science","published_at":"2026-03-22T20:11:13+00:00","page_url":"https://stenobird.com/podcast/data-engineering-podcast/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science","show_page_url":"https://stenobird.com/podcast/data-engineering-podcast","url":"https://www.dataengineeringpodcast.com/continous-science-foundation-curvenote-scientific-research-data-management-episode-506","audio_url":"https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/6390980655121795181d9b156f-8508-483e-9015-4b41c9a448ec.mp3","summary":"Scientific research is currently trapped in static, non-reproducible PDF formats that hinder collaboration. Rowan Cockett explores how moving toward composable, cloud-optimized data architectures can enable a new era of interactive and verifiable science.","meta_description":"Explore the shift from static PDFs to composable science with Rowan Cockett. 
Learn how open standards and cloud-native data formats drive research reprodu…","key_points":["Main idea: The reproducibility crisis is a socio-technical problem rooted in static publishing formats and misaligned incentives","Failure mode: Relying on uncurated 'zip file' data dumps on repositories leads to poor discoverability and broken research lineages","Practical takeaway: Implementing 'graceful degradation' allows interactive research widgets to remain useful even as underlying compute environments evolve","Main idea: True scientific progress requires 'composability'—the ability to treat research components like software packages that can be easily integrated","Technical goal: Moving toward an Open Exchange Architecture (OXA) that integrates data, code, and narrative into a single, archivable unit"],"chapters":[{"start_ms":60000,"title":"Introduction to Rowan Cockett","summary":"Rowan discusses his background in geoscience visualization and his transition into building collaborative data management tools."},{"start_ms":250000,"title":"The Goal of Reproducible Science","summary":"The importance of creating systems where researchers can trust and reuse results to accelerate scientific discovery."},{"start_ms":430000,"title":"The Problem with Uncurated Data","summary":"Critique of current data sharing practices, such as uploading unorganized files to repositories like Zenodo."},{"start_ms":620000,"title":"Visualizing Large-Scale Datasets","summary":"Using cloud-optimized formats to enable interactive zooming into massive microscopy datasets, similar to Google Maps."},{"start_ms":810000,"title":"Tackling Socio-Technical Challenges","summary":"Addressing the misalignment between technical capabilities and the social incentives of the publishing industry."},{"start_ms":1000000,"title":"The Future of Open Publishing","summary":"How preprint servers and initiatives like the Journal of Open Source Software are democratizing scientific 
credit."},{"start_ms":1190000,"title":"Modern Data Engineering in Research","summary":"Integrating software engineering best practices and data carpentry into the scientific workflow."}],"topics":["Scientific Reproducibility","Data Engineering","Cloud-Native Data","Open Science","Data Visualization","Software Engineering Best Practices","Research Infrastructure","Interoperability"],"duration_seconds":2560,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/data-engineering-podcast/episodes/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/data-engineering-podcast/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}