Episode

Arvind Jain on Building Glean and the Future of Enterprise AI

Podcast
Gradient Dissent: Conversations on AI
Published
Aug 5, 2025
Duration seconds
2621
Processing state
processed
Canonical source
https://wandb.ai/site/resources/podcast
Audio
https://episodes.captivate.fm/episode/1b866c83-7cf5-4181-942a-d98dfcaef7aa.mp3
JSON
/v1/public/podcasts/gradient-dissent/episodes/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai
Markdown
/podcast/gradient-dissent/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai.md

Actions

  • POST https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai/transcription-requests
    Idempotently request low-priority transcript generation for this episode.
  • GET https://stenobird.com/podcast/gradient-dissent/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai.md
    Read the agent-friendly Markdown representation of this episode resource.

Summary

Arvind Jain explains how Glean grew from a 2019 enterprise search startup into a leading AI platform by building on its early use of transformer models. He discusses the technical architecture required to make LLMs safe and effective for internal corporate knowledge.

Topics

  • Enterprise AI
  • Large Language Models
  • Retrieval-Augmented Generation
  • Semantic Search
  • Transformer Models
  • AI Agents
  • Data Security
  • Software Engineering

Highlights

  • Main idea: Glean uses a RAG-style architecture to connect LLMs to private enterprise data securely
  • Technical takeaway: Using citations and evaluation frameworks is critical to suppressing hallucinations in enterprise settings
  • Failure mode: Relying solely on massive foundation models without purpose-trained layers can miss the nuance of internal documentation
  • Practical takeaway: AI should be viewed as a force multiplier that enables teams to scale output rather than a tool for headcount reduction
  • Strategic insight: The shift toward SaaS-heavy environments made enterprise search more difficult but also more technically tractable via API-driven data access
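
The first two highlights (permission-aware retrieval and citations) can be illustrated with a toy sketch. This is not Glean's implementation: the `Document` type, the `allowed_groups` field, the word-overlap `score` function, and the sample corpus are all hypothetical names chosen for illustration. The only point it shows is that filtering on permissions happens before ranking, so text the user cannot read never reaches the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical ACL: groups allowed to read this document


def score(query, doc):
    """Toy relevance score: number of distinct query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.text.lower().split())
    return len(query_words & doc_words)


def retrieve(query, user_groups, corpus, k=2):
    """Return up to k documents the user may read, most relevant first."""
    # Filter on permissions BEFORE ranking so unauthorized text never reaches the prompt.
    visible = [d for d in corpus if d.allowed_groups & user_groups]
    ranked = sorted(visible, key=lambda d: score(query, d), reverse=True)
    # Drop documents that share no words with the query.
    return [d for d in ranked[:k] if score(query, d) > 0]


def build_prompt(query, docs):
    """Assemble a prompt that asks the model to cite the document ids it used."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {query}"
    )


# Hypothetical sample corpus with per-document access groups.
CORPUS = [
    Document("hr-1", "Salary bands for the engineering org", frozenset({"hr"})),
    Document("eng-1", "Engineering onboarding guide and setup steps", frozenset({"eng", "hr"})),
    Document("eng-2", "Deployment runbook for the search service", frozenset({"eng"})),
]

if __name__ == "__main__":
    docs = retrieve("engineering onboarding guide", frozenset({"eng"}), CORPUS)
    print(build_prompt("engineering onboarding guide", docs))
```

In this sketch a user in the `eng` group who asks about salary bands gets no sources back, because the only matching document is restricted to `hr`. A production system would replace the word-overlap score with a real search stack and pull permissions from the source applications.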

Chapters

  1. 1:00 Defining Enterprise AI: An introduction to Glean's mission to provide a ChatGPT-like experience for internal company data and workflows.
  2. 4:25 The Pre-LLM Era: How Glean utilized transformer models in 2019 to solve the fragmentation of enterprise information.
  3. 7:40 Fine-tuning vs. Out-of-the-box Models: How to decide when to use massive foundation models and when to use a specialized search stack.
  4. 14:25 Security and RAG Architecture: Implementing RAG so that models only access data the requesting user is explicitly authorized to see.
  5. 17:55 Lessons from Rubrik and Google: Reflections on building large-scale, high-impact companies and the importance of tackling universal problems.
  6. 30:50 The Future of Work and AI Agents: Why AI is an enabler for human productivity and how roles like software engineering will evolve toward design and review.
  7. 34:05 Evaluating Model Performance: Using golden sets and evaluation frameworks to measure accuracy and minimize errors in production.
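
The golden-set evaluation mentioned in the final chapter can be sketched as a small harness. This is an illustration under stated assumptions, not the framework discussed in the episode: the case format (`question`, `expected`, `source`), the substring match, and the `stub_answer` function are all hypothetical. A case passes only if the answer contains the expected phrase and cites the expected source.

```python
def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(text.lower().split())


def evaluate(answer_fn, golden_set):
    """Return the fraction of golden cases answered correctly with the right citation."""
    passed = 0
    for case in golden_set:
        answer, cited_sources = answer_fn(case["question"])
        correct = normalize(case["expected"]) in normalize(answer)
        grounded = case["source"] in cited_sources
        if correct and grounded:
            passed += 1
    return passed / len(golden_set) if golden_set else 0.0


# Hypothetical golden set: questions with a known answer phrase and source id.
GOLDEN_SET = [
    {
        "question": "How many vacation days do new hires get?",
        "expected": "20 days",
        "source": "hr-policy",
    },
    {
        "question": "Which team owns the search service?",
        "expected": "platform team",
        "source": "eng-wiki",
    },
]


def stub_answer(question):
    """Stand-in for a real assistant: returns (answer text, list of cited source ids)."""
    canned = {
        "How many vacation days do new hires get?": (
            "New hires get 20 days of vacation.",
            ["hr-policy"],
        ),
        "Which team owns the search service?": (
            "The infra team owns it.",
            ["eng-wiki"],
        ),
    }
    return canned[question]


if __name__ == "__main__":
    print(f"accuracy: {evaluate(stub_answer, GOLDEN_SET):.2f}")
```

Requiring both a correct answer and a matching citation is one design choice among several. It penalizes answers that happen to be right but are not grounded in the expected source, which fits the emphasis on citations in the highlights.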