# 🔬 Training Transformers to solve 95% failure rate of Cancer Trials — Ron Alfa & Daniel Bear, Noetik

Page: https://stenobird.com/podcast/latent-space-ai-engineer/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik
Text version: https://stenobird.com/podcast/latent-space-ai-engineer/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik.md
Podcast: [Latent Space: The AI Engineer Podcast](https://stenobird.com/podcast/latent-space-ai-engineer)
Published: 2026-04-20T16:17:17+00:00
Episode link: https://www.latent.space/p/noetik
Audio file: https://api.substack.com/feed/podcast/194810752/1b92bce4d49858354007a47c48e4e6d4.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik
Duration seconds: 5121

## Resource

The 95% failure rate in cancer clinical trials is often a patient selection problem rather than a drug discovery failure. Noetik uses large-scale spatial transcriptomics and transformer models to predict how specific tumors will respond to existing treatments.

## Highlights

- Main idea: Clinical trial failures are driven by poor patient-to-treatment matching, not necessarily poor pharmacology
- Practical takeaway: Using H&E assays to predict 19,000-gene spatial maps can identify responders without expensive whole-plex sequencing
- Technical insight: Scaling models on massive, multi-modal datasets (H&E, protein, and spatial transcriptomics) is essential for capturing non-linear biological patterns
- Failure mode: Relying solely on cell-line-based drug discovery fails to account for the complex, multicellular architecture of human tumors
- Strategic shift: The industry is moving from drug-discovery-only models toward software licensing platforms that improve trial success rates

## Topics

Spatial Transcriptomics, Transformer Models, Cancer Clinical Trials, Drug Development, Bioinformatics, Machine Learning, Precision Medicine, Biotech Software

## Chapters

- 1:00 — The Thesis: Solving the Matching Problem: Introduction to Noetik's approach to reducing the 95% failure rate in cancer trials through better patient selection.
- 7:10 — The Gap in Cell Line Models: Why traditional cell line studies fail to translate to human clinical outcomes due to lack of mutation complexity.
- 13:40 — Scaling Biological Transformers: The necessity of massive, multi-modal datasets to achieve the scaling laws seen in natural language processing.
- 20:05 — Decoding Spatial Transcriptomics: Defining spatial transcriptomics and how it provides the richest possible map of tumor biology.
- 26:15 — Computer Vision for Biology: Treating high-dimensional gene expression as a massive computer vision problem with thousands of channels.
- 32:50 — Simulating Genetic Perturbations: Using 'world models' to simulate the effects of knocking down specific genes within a tumor environment.
- 39:20 — The Advantage of Paired Data: How Noetik's unique access to paired H&E and spatial data provides a competitive edge in model training.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/latent-space-ai-engineer/episodes/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/latent-space-ai-engineer/training-transformers-to-solve-95-failure-rate-of-cancer-trials-ron-alfa-daniel-bear-noetik.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
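
For agents that want to call the two endpoints listed under Actions, the following is a minimal Python sketch using only the standard library. The URLs and HTTP methods come from this page. The function names, the slug constants, and the empty POST body are assumptions made for illustration; the page does not document a request body, required headers, or the response format.

```python
from urllib.request import Request

# URL pieces taken from the Actions section of this page.
BASE_URL = "https://stenobird.com"
PODCAST_SLUG = "latent-space-ai-engineer"
EPISODE_SLUG = (
    "training-transformers-to-solve-95-failure-rate-of-cancer-trials-"
    "ron-alfa-daniel-bear-noetik"
)


def build_read_markdown_request() -> Request:
    """Build (but do not send) the GET request for the Markdown representation."""
    url = f"{BASE_URL}/podcast/{PODCAST_SLUG}/{EPISODE_SLUG}.md"
    return Request(url, method="GET")


def build_transcript_request() -> Request:
    """Build (but do not send) the POST request that asks for a transcript.

    Assumption: an empty body is sent, since the page documents no payload.
    """
    url = (
        f"{BASE_URL}/v1/public/podcasts/{PODCAST_SLUG}"
        f"/episodes/{EPISODE_SLUG}/transcription-requests"
    )
    return Request(url, data=b"", method="POST")


if __name__ == "__main__":
    # Print the requests instead of sending them, so the sketch has no side effects.
    for req in (build_read_markdown_request(), build_transcript_request()):
        print(req.get_method(), req.full_url)
```

To actually send either request, pass it to `urllib.request.urlopen`. Because the page describes `request_transcript` as idempotent, repeating the POST should be safe, but how the service signals an already-queued request is not stated here.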