Episode

🔬 Training Transformers to solve 95% failure rate of Cancer Trials — Ron Alfa & Daniel Bear, Noetik

Podcast
Latent Space: The AI Engineer Podcast
Published
Apr 20, 2026
Duration seconds
5121
Processing state
processed
Canonical source
https://www.latent.space/p/noetik
Audio
https://api.substack.com/feed/podcast/194810752/1b92bce4d49858354007a47c48e4e6d4.mp3
JSON
/v1/public/podcasts/latent-space-ai-engineer/episodes/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik
Markdown
/podcast/latent-space-ai-engineer/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/latent-space-ai-engineer/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik.md
    Read the agent-friendly Markdown representation of this episode resource.
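
The two actions above can be sketched as plain HTTP requests. This is a minimal illustration using only Python's standard library. The helper names (`transcription_request`, `markdown_request`) are hypothetical, and the sketch assumes the endpoints need no request body, headers, or authentication, which the listing does not confirm.

```python
from urllib.request import Request

# Values copied from the action URLs listed above.
BASE = "https://stenobird.com"
PODCAST = "latent-space-ai-engineer"
EPISODE = (
    "training-transformers-to-solve-95-failure-rate-of-cancer-trials-"
    "ron-alfa-daniel-bear-noetik"
)


def transcription_request(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build (but do not send) the POST that requests transcript generation.

    Assumption: the endpoint accepts an empty POST with no auth header.
    """
    url = f"{BASE}/v1/public/podcasts/{podcast}/episodes/{episode}/transcription-requests"
    return Request(url, method="POST")


def markdown_request(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build (but do not send) the GET for the Markdown representation."""
    url = f"{BASE}/podcast/{podcast}/{episode}.md"
    return Request(url, method="GET")
```

Either request could then be passed to `urllib.request.urlopen`. Because the POST is described as idempotent, repeating it for the same episode should not queue duplicate work, though the response format is not documented here.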

Summary

The 95% failure rate in cancer clinical trials is often a patient selection problem rather than a drug discovery failure. Noetik uses large-scale spatial transcriptomics and transformer models to predict how specific tumors will respond to existing treatments.

Topics

  • Spatial Transcriptomics
  • Transformer Models
  • Cancer Clinical Trials
  • Drug Development
  • Bioinformatics
  • Machine Learning
  • Precision Medicine
  • Biotech Software

Highlights

  • Main idea: Clinical trial failures are driven by poor patient-to-treatment matching, not necessarily poor pharmacology
  • Practical takeaway: Predicting 19,000-gene spatial expression maps from H&E images can identify likely responders without expensive whole-transcriptome spatial assays
  • Technical insight: Scaling models on massive, multi-modal datasets (H&E, protein, and spatial transcriptomics) is essential for capturing non-linear biological patterns
  • Failure mode: Relying solely on cell-line-based drug discovery fails to account for the complex, multicellular architecture of human tumors
  • Strategic shift: The industry is moving from drug-discovery-only business models toward software licensing platforms that improve trial success rates

Chapters

  1. 1:00 The Thesis: Solving the Matching Problem: Introduction to Noetik's approach to reducing the 95% failure rate in cancer trials through better patient selection.
  2. 7:10 The Gap in Cell Line Models: Why traditional cell line studies fail to translate to human clinical outcomes: cell lines lack the mutational complexity of real tumors.
  3. 13:40 Scaling Biological Transformers: Why massive, multi-modal datasets are needed to reproduce the scaling behavior seen in natural language processing.
  4. 20:05 Decoding Spatial Transcriptomics: Defining spatial transcriptomics and how it provides the richest possible map of tumor biology.
  5. 26:15 Computer Vision for Biology: Treating high-dimensional gene expression as a massive computer vision problem with thousands of channels.
  6. 32:50 Simulating Genetic Perturbations: Using 'world models' to simulate the effects of knocking down specific genes within a tumor environment.
  7. 39:20 The Advantage of Paired Data: How Noetik's unique access to paired H&E and spatial data provides a competitive edge in model training.