# Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

Page: https://stenobird.com/podcast/daily-paper-cast-7079649/distilling-long-cot-reasoning-through-collaborative-step-wise-multi-teacher-decoding
Text version: https://stenobird.com/podcast/daily-paper-cast-7079649/distilling-long-cot-reasoning-through-collaborative-step-wise-multi-teacher-decoding.md
Podcast: [Daily Paper Cast](https://stenobird.com/podcast/daily-paper-cast-7079649)
Published: 2026-05-19T04:20:34+00:00
Episode link: https://share.transistor.fm/s/e1a410e1
Audio file: https://media.transistor.fm/e1a410e1/373fb12b.mp3
Processing state: not_requested
JSON: https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649/episodes/distilling-long-cot-reasoning-through-collaborative-step-wise-multi-teacher-decoding
Duration seconds: 1272

## Resource

🤗 Upvotes: 34 | cs.AI
Authors: Taewon Yun, Jisu Shin, Jeonghwan Choi, Seunghwan Bang, Hwanjun Song
Title: Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding
Arxiv: http://arxiv.org/abs/2605.02290v1

Abstract: Distilling large reasoning models is essential for making Long-CoT reasoning practical, as full-scale inference remains computationally prohibitive. Existing curation-based approaches select complete reasoning traces post-hoc, overlooking collaboration among heterogeneous teachers and lacking dynamic exploration, which leads to redundant sampling and missed complementary reasoning. We introduce CoRD, a collaborative multi-teacher decoding framework that performs step-wise reasoning synthesis guided by predictive perplexity-based scoring and beam search. This enables heterogeneous LRMs to jointly construct coherent reasoning trajectories while efficiently preserving diverse, high-potential hypotheses. Experiments show that CoRD produces higher-quality reasoning data and achieves near teacher-level student performance with fewer, structured supervision signals, without substantial efficiency overhead. CoRD further generalizes well to out-of-domain and open-ended settings. The dataset and model are available at https://github.com/DISL-Lab/CoRD.
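
The abstract names the ingredients (step-wise proposals from several teachers, perplexity-based scoring, beam search) but does not give the scoring formula or the decoding loop. The following is a minimal toy sketch of generic step-level beam search over multiple proposers, not the authors' implementation. The names `Teacher`, `Scorer`, `perplexity`, and `stepwise_beam_search` are hypothetical, and ranking beams by the accumulated log-perplexity of their steps is an assumption made only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# a Teacher proposes the next reasoning step given the steps so far;
# a Scorer returns per-token log-probabilities for a candidate step.
Teacher = Callable[[Sequence[str]], str]
Scorer = Callable[[Sequence[str], str], List[float]]


def perplexity(token_logprobs: List[float]) -> float:
    """Perplexity = exp(-mean token log-probability); lower is better."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def stepwise_beam_search(
    teachers: Sequence[Teacher],
    scorer: Scorer,
    beam_width: int,
    max_steps: int,
) -> List[Tuple[float, List[str]]]:
    """Grow reasoning trajectories one step at a time.

    Each beam is (cumulative score, steps). At every iteration each teacher
    extends each surviving beam by one step, and only the `beam_width`
    lowest-scoring trajectories are kept.
    """
    beams: List[Tuple[float, List[str]]] = [(0.0, [])]
    for _ in range(max_steps):
        candidates: List[Tuple[float, List[str]]] = []
        for total, steps in beams:
            for propose in teachers:
                step = propose(steps)
                ppl = perplexity(scorer(steps, step))
                # Assumed aggregation: sum of per-step log-perplexities.
                candidates.append((total + math.log(ppl), steps + [step]))
        candidates.sort(key=lambda c: c[0])
        beams = candidates[:beam_width]
    return beams
```

In this sketch a trajectory can mix steps from different teachers, which is the sense in which the proposers "jointly construct" a trace. A real system would also need a stopping criterion and an actual language model behind `Scorer`, both of which are left out here.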

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/daily-paper-cast-7079649/episodes/distilling-long-cot-reasoning-through-collaborative-step-wise-multi-teacher-decoding/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/daily-paper-cast-7079649/distilling-long-cot-reasoning-through-collaborative-step-wise-multi-teacher-decoding.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.