Episode

Beyond the PDF: Rowan Cockett on Reproducible, Composable Science

Podcast
Data Engineering Podcast
Published
Mar 22, 2026
Duration seconds
2560
Processing state
processed
Canonical source
https://www.dataengineeringpodcast.com/continous-science-foundation-curvenote-scientific-research-data-management-episode-506
Audio
https://op3.dev/e/dts.podtrac.com/redirect.mp3/serve.podhome.fm/episode/f6ff0caa-931b-4c08-bfdd-08dc7f5cd336/6390980655121795181d9b156f-8508-483e-9015-4b41c9a448ec.mp3
JSON
/v1/public/podcasts/data-engineering-podcast/episodes/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science
Markdown
/podcast/data-engineering-podcast/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/data-engineering-podcast/episodes/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/data-engineering-podcast/beyond-the-pdf-rowan-cockett-on-reproducible-composable-science.md
    Read the agent-friendly Markdown representation of this episode resource.
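
A minimal sketch of how the two actions above might be constructed from Python's standard library. The URLs and HTTP methods come from the listing; the empty POST body, the absence of authentication headers, and the helper names are assumptions, since the page does not document them.

```python
from urllib.request import Request

BASE = "https://stenobird.com"
PODCAST = "data-engineering-podcast"
EPISODE = "beyond-the-pdf-rowan-cockett-on-reproducible-composable-science"


def transcription_request() -> Request:
    """Build (but do not send) the POST that requests a transcript."""
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    # Assumption: the endpoint accepts an empty body and needs no auth header.
    return Request(url, data=b"", method="POST")


def markdown_request() -> Request:
    """Build (but do not send) the GET for the Markdown representation."""
    url = f"{BASE}/podcast/{PODCAST}/{EPISODE}.md"
    return Request(url, method="GET")
```

Either object can be passed to `urllib.request.urlopen` to perform the call. Because the POST is described as idempotent, repeating it should not queue duplicate work.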

Summary

Scientific research is currently trapped in static, non-reproducible PDF formats that hinder collaboration. Rowan Cockett explores how moving toward composable, cloud-optimized data architectures can enable a new era of interactive and verifiable science.

Topics

  • Scientific Reproducibility
  • Data Engineering
  • Cloud-Native Data
  • Open Science
  • Data Visualization
  • Software Engineering Best Practices
  • Research Infrastructure
  • Interoperability

Highlights

  • Main idea: The reproducibility crisis is a socio-technical problem rooted in static publishing formats and misaligned incentives
  • Failure mode: Relying on uncurated 'zip file' data dumps on repositories leads to poor discoverability and broken research lineages
  • Practical takeaway: Implementing 'graceful degradation' allows interactive research widgets to remain useful even as underlying compute environments evolve
  • Main idea: True scientific progress requires 'composability'—the ability to treat research components like software packages that can be easily integrated
  • Technical goal: Moving toward an Open Exchange Architecture (OXA) that integrates data, code, and narrative into a single, archivable unit
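
The 'graceful degradation' takeaway above can be illustrated with a small sketch: try the live computation first, and fall back to an archived static output when the compute environment is no longer available. This is one possible reading of the idea, not the approach described in the episode; `render_widget`, `Rendered`, and `unavailable_kernel` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rendered:
    mode: str      # "interactive" if live compute worked, else "static"
    content: str


def render_widget(live_compute: Callable[[], str], static_snapshot: str) -> Rendered:
    """Prefer the live result; degrade to the archived snapshot on any failure."""
    try:
        return Rendered("interactive", live_compute())
    except Exception:
        # The environment has drifted or is offline: show the saved output.
        return Rendered("static", static_snapshot)


def unavailable_kernel() -> str:
    """Hypothetical stand-in for a compute backend that no longer runs."""
    raise RuntimeError("kernel unavailable")
```

Calling `render_widget(unavailable_kernel, "figure-1.png")` yields a static result, so the reader still sees the archived figure rather than an error.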

Chapters

  1. 1:00 Introduction to Rowan Cockett: Rowan discusses his background in geoscience visualization and his transition into building collaborative data management tools.
  2. 4:10 The Goal of Reproducible Science: The importance of creating systems where researchers can trust and reuse results to accelerate scientific discovery.
  3. 7:10 The Problem with Uncurated Data: Critique of current data sharing practices, such as uploading unorganized files to repositories like Zenodo.
  4. 10:20 Visualizing Large-Scale Datasets: Using cloud-optimized formats to enable interactive zooming into massive microscopy datasets, similar to Google Maps.
  5. 13:30 Tackling Socio-Technical Challenges: Addressing the misalignment between technical capabilities and the social incentives of the publishing industry.
  6. 16:40 The Future of Open Publishing: How preprint servers and initiatives like the Journal of Open Source Software are democratizing scientific credit.
  7. 19:50 Modern Data Engineering in Research: Integrating software engineering best practices and data carpentry into the scientific workflow.
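
The map-style zooming mentioned in chapter 4 generally relies on a multiscale pyramid: the image is stored at successively halved resolutions, each split into fixed-size tiles, so a viewer fetches only the tiles in view. The sketch below computes such a pyramid's levels. It is a generic illustration, not the specific format discussed in the episode; the function name and the 256-pixel tile size are assumptions.

```python
import math


def pyramid_levels(width: int, height: int, tile: int = 256) -> list[tuple[int, int, int]]:
    """Return (width, height, tile_count) per level, from full resolution down."""
    levels = []
    w, h = width, height
    while True:
        tiles = math.ceil(w / tile) * math.ceil(h / tile)
        levels.append((w, h, tiles))
        if w <= tile and h <= tile:
            break  # the whole image now fits in a single tile
        # Halve each dimension for the next, coarser level.
        w, h = max(1, math.ceil(w / 2)), max(1, math.ceil(h / 2))
    return levels
```

For a 1024 by 1024 image this gives three levels with 16, 4, and 1 tiles respectively.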