# Reasoning Over Complex Documents with DocLLM with Armineh Nourbakhsh - #672

- Page: https://stenobird.com/podcast/twiml-ai-podcast/reasoning-over-complex-documents-with-docllm-with-armineh-nourbakhsh-672
- Text version: https://stenobird.com/podcast/twiml-ai-podcast/reasoning-over-complex-documents-with-docllm-with-armineh-nourbakhsh-672.md
- Podcast: [The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)](https://stenobird.com/podcast/twiml-ai-podcast)
- Published: 2024-02-19T19:07:00+00:00
- Episode link: https://twimlai.com/podcast/twimlai/reasoning-over-complex-documents-with-docllm/
- Audio file: https://pscrb.fm/rss/p/traffic.megaphone.fm/MLN8614358492.mp3?updated=1708370325
- Processing state: failed
- JSON: https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/reasoning-over-complex-documents-with-docllm-with-armineh-nourbakhsh-672
- Duration seconds: 2738

## Resource

Today we're joined by Armineh Nourbakhsh of JP Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and an introduction to the DocLLM model. Armineh explains how this model, distinct from both traditional LLMs and document AI models, incorporates both textual semantics and spatial layout in processing enterprise documents like reports and complex contracts. We dig into her team’s approach to training DocLLM, their choice of a generative model as opposed to an encoder-based approach, the datasets they used to build the model, their approach to incorporating layout information, and the various ways they evaluated the model’s performance.

The complete show notes for this episode can be found at twimlai.com/go/672.
## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/twiml-ai-podcast/episodes/reasoning-over-complex-documents-with-docllm-with-armineh-nourbakhsh-672/transcription-requests`. Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/twiml-ai-podcast/reasoning-over-complex-documents-with-docllm-with-armineh-nourbakhsh-672.md`. Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
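
For agents that need this episode processed, the two actions listed under Actions could be prepared roughly as in the sketch below. It only builds the requests and does not send them. The function names and constants are hypothetical, and it assumes that neither endpoint needs a request body or authentication header, which this page does not state.

```python
from urllib.request import Request

# URL parts copied from the action definitions on this page.
BASE = "https://stenobird.com"
PODCAST = "twiml-ai-podcast"
EPISODE = "reasoning-over-complex-documents-with-docllm-with-armineh-nourbakhsh-672"


def build_request_transcript() -> Request:
    """Build (but do not send) the POST for the request_transcript action."""
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    # Assumption: no body or auth header is required; the page does not say.
    return Request(url, method="POST")


def build_read_markdown() -> Request:
    """Build (but do not send) the GET for the read_markdown action."""
    url = f"{BASE}/podcast/{PODCAST}/{EPISODE}.md"
    return Request(url, method="GET")
```

Either request can be passed to `urllib.request.urlopen` to actually send it. The response format is not documented on this page, so any parsing of the result would be a further assumption.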