Episode

Arvind Jain on Building Glean and the Future of Enterprise AI

Podcast: Gradient Dissent: Conversations on AI
Published: Aug 5, 2025
Duration seconds: 2621
Processing state: processed
Canonical source: https://wandb.ai/site/resources/podcast
Audio: https://episodes.captivate.fm/episode/1b866c83-7cf5-4181-942a-d98dfcaef7aa.mp3
JSON: /v1/public/podcasts/gradient-dissent/episodes/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai
Markdown: /podcast/gradient-dissent/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai.md

Actions

POST https://stenobird.com/v1/public/podcasts/gradient-dissent/episodes/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/gradient-dissent/arvind-jain-on-building-glean-and-the-future-of-enterprise-ai.md
Read the agent-friendly Markdown representation of this episode resource.

Summary

Arvind Jain explains how Glean grew from a 2019 enterprise search startup into an enterprise AI platform, building on its early use of transformer models. He discusses the technical architecture needed to make LLMs safe and effective on internal corporate knowledge.

Topics

Enterprise AI
Large Language Models
Retrieval-Augmented Generation
Semantic Search
Transformer Models
AI Agents
Data Security
Software Engineering

Highlights

Main idea: Glean uses a RAG-style architecture to connect LLMs to private enterprise data securely
Technical takeaway: Using citations and evaluation frameworks is critical to suppressing hallucinations in enterprise settings
Failure mode: Relying solely on massive foundation models without purpose-trained layers can miss the nuance of internal documentation
Practical takeaway: AI should be viewed as a force multiplier that enables teams to scale output rather than a tool for headcount reduction
Strategic insight: The shift toward SaaS-heavy environments made enterprise search more difficult but also more technically tractable via API-driven data access
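
Illustrative sketch for the first two highlights: the episode summary does not describe Glean's implementation, so the following is only a minimal, assumed example of permission-aware retrieval with citations. The names Document, retrieve, and build_prompt are hypothetical, and the term-overlap ranking is a stand-in for a real semantic search stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical indexed document with an access-control list."""
    doc_id: str
    text: str
    allowed_groups: frozenset


def retrieve(query, user_groups, corpus, top_k=3):
    """Return up to top_k documents the user may read, ranked by term overlap."""
    query_terms = set(query.lower().split())
    # Filter on permissions BEFORE ranking, so text the user is not
    # authorized to see can never reach the model prompt.
    visible = [d for d in corpus if d.allowed_groups & user_groups]
    scored = [(len(query_terms & set(d.text.lower().split())), d) for d in visible]
    scored = [(score, d) for score, d in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def build_prompt(query, docs):
    """Assemble a prompt that asks the model to cite sources by document id."""
    sources = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )


corpus = [
    Document("hr-1", "Vacation policy for all employees", frozenset({"everyone"})),
    Document("fin-7", "Executive vacation policy exceptions", frozenset({"finance"})),
]
docs = retrieve("vacation policy", frozenset({"everyone"}), corpus)
prompt = build_prompt("vacation policy", docs)
```

Filtering before ranking is the design choice that matters here: a document outside the user's groups is never scored, quoted, or cited.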

Chapters

1:00 Defining Enterprise AI: An introduction to Glean's mission to provide a ChatGPT-like experience for internal company data and workflows.
4:25 The Pre-LLM Era: How Glean used transformer models in 2019 to address the fragmentation of enterprise information.
7:40 Fine-tuning vs. Out-of-the-box Models: How to decide when to rely on massive foundation models and when to use a specialized search stack.
14:25 Security and RAG Architecture: Implementing RAG to ensure AI models only access data that users are explicitly authorized to see.
17:55 Lessons from Rubrik and Google: Reflections on building large-scale, high-impact companies and the importance of tackling universal problems.
30:50 The Future of Work and AI Agents: Why AI is an enabler for human productivity and how roles like software engineering will evolve toward design and review.
34:05 Evaluating Model Performance: Using golden sets and evaluation frameworks to measure accuracy and minimize errors in production.
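
Illustrative sketch for the final chapter: a golden set can be treated as a fixed list of questions with known-good answers that every model or prompt change is scored against. The example below is an assumption, not the framework discussed in the episode; evaluate and the case-insensitive substring check are hypothetical simplifications of a real grading step.

```python
def evaluate(answer_fn, golden_set):
    """Score answer_fn against a golden set.

    golden_set is a list of {"question": str, "expected": str} dicts.
    A case passes when the expected text appears in the answer
    (case-insensitive substring match, a deliberately crude grader).
    Returns (accuracy, failures).
    """
    if not golden_set:
        raise ValueError("golden_set must contain at least one case")
    failures = []
    for case in golden_set:
        answer = answer_fn(case["question"])
        if case["expected"].lower() not in answer.lower():
            failures.append({"question": case["question"], "answer": answer})
    accuracy = (len(golden_set) - len(failures)) / len(golden_set)
    return accuracy, failures


# Hypothetical golden set and a stubbed answer function standing in for a model.
golden_set = [
    {"question": "Who approves expense reports?", "expected": "direct manager"},
    {"question": "What is the VPN hostname?", "expected": "vpn.example.com"},
]
canned_answers = {
    "Who approves expense reports?": "Your direct manager approves them.",
    "What is the VPN hostname?": "I could not find that.",
}
accuracy, failures = evaluate(lambda q: canned_answers[q], golden_set)
```

Returning the failing cases alongside the accuracy makes regressions inspectable rather than just countable.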