Company
Glean
Title
Building Robust Enterprise Search with LLMs and Traditional IR
Industry
Tech
Year
2023
Summary (short)
Glean tackles enterprise search by combining traditional information retrieval techniques with modern LLMs and embeddings. Rather than relying solely on AI techniques, they emphasize the importance of rigorous ranking algorithms, personalization, and hybrid approaches that combine classical IR with vector search. The company has achieved unicorn status and serves major enterprises by focusing on holistic search solutions that include personalization, feed recommendations, and cross-application integrations.
# Building Production-Grade Enterprise Search at Glean

## Company Overview

Glean is an enterprise search company founded in 2019 by ex-Google engineers to solve the challenge of finding information across enterprise applications. Taking inspiration from Google's internal MOMA tool, they built a comprehensive search solution that has achieved unicorn status ($1B+ valuation) and serves major enterprises like Databricks, Canva, Confluent, and Duolingo.

## Technical Architecture and Approach

### Hybrid Search Implementation

- Combines multiple search approaches, pairing classical IR with embedding-based vector search, rather than relying on a single technique

### Core Search Components

- Query understanding pipeline to interpret user intent
- Document understanding to assess quality and relevance
- Ranking stack built from multiple specialized modules
- Personalization engine that considers signals such as role, team, and interaction history

### Infrastructure Considerations

- Uses distributed Elasticsearch as foundational infrastructure
- Built integrations with major SaaS application APIs
- Handles complex permission models across applications
- Manages real-time indexing and freshness requirements

## Production Challenges and Solutions

### Search Quality

- Rigorous tuning of ranking algorithms through iterative optimization
- Focus on intellectual honesty in evaluating search quality
- Regular evaluation of each ranking component's performance
- Careful balance between different ranking signals

### Cost and Performance

- Optimization of infrastructure costs while maintaining quality
- Management of latency requirements for real-time search
- Efficient handling of cross-application data access
- Balancing index freshness with performance

### Enterprise Integration

- Built robust API integrations with enterprise SaaS tools
- Handles complex permission models across applications
- Support for enterprise-specific features like Go links
- Cross-platform mention tracking and notifications

## LLM Integration Strategy

### Thoughtful AI Adoption

- Careful integration of LLMs rather than wholesale replacement
- Recognition that LLMs are not silver bullets for all search problems
- Hybrid approach combining traditional IR with modern AI
- Focus on measurable user experience improvements

### AI Feature Development

- Experimentation with chat interfaces
- Testing of various LLM applications
- Validation of AI features against user needs
- Careful evaluation of cost-benefit tradeoffs

## Production System Features

### Core Enterprise Features

- Cross-application search
- Permission-aware results
- Document collections
- Go links functionality
- Trending content identification
- Feed recommendations
- Mentions aggregation across platforms
- Real-time updates

### Personalization Capabilities

- Role-based customization
- Team-aware results
- Historical interaction incorporation
- Organizational context understanding

## Deployment Considerations

### Enterprise Requirements

- Security and compliance handling
- Performance at enterprise scale
- Integration with existing tools
- Support for multiple data sources
- Real-time indexing capabilities

### Operational Focus

- Regular ranking algorithm updates
- Performance monitoring and optimization
- Cost management and efficiency
- Integration maintenance and updates

## Lessons Learned

### Technical Insights

- Importance of hybrid approaches over pure AI solutions
- Value of rigorous ranking algorithm tuning
- Need for personalization in enterprise context
- Balance between features and performance

### Product Development

- Focus on solving real user problems
- Importance of retention-driving features
- Need for integrated platform approach
- Value of enterprise-specific features

### AI Integration

- Careful evaluation of AI applications
- Focus on measurable improvements
- Recognition of AI limitations
- Importance of traditional IR techniques

## Future Directions

### Technology Evolution

- Continued experimentation with LLMs
- Enhanced personalization capabilities
- Improved cross-application integration
- Advanced ranking algorithms

### Enterprise Features

- Enhanced collaboration tools
- Improved document understanding
- Better context awareness
- Advanced permission handling

## Implementation Guidelines

### Best Practices

- Start with solid IR foundations
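
To make the hybrid, permission-aware approach described in this write-up more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Glean's implementation: the `Doc` type, the `allowed_groups` permission model, the hand-written toy embeddings, and the choice of reciprocal rank fusion to merge the lexical (BM25) and vector rankings are all hypothetical. A production system would use a real index and a learned embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    embedding: list[float]  # hypothetical precomputed vector
    allowed_groups: set[str] = field(default_factory=set)  # hypothetical ACL


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[Doc], k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    """Classical IR signal: Okapi BM25 over whitespace tokens."""
    tokens = {d.doc_id: tokenize(d.text) for d in docs}
    n = len(docs)
    avg_len = sum(len(t) for t in tokens.values()) / n
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores[doc_id] = score
    return scores


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ranked(scores: dict[str, float]) -> list[str]:
    return sorted(scores, key=scores.get, reverse=True)


def hybrid_search(query: str, query_embedding: list[float], docs: list[Doc],
                  user_groups: set[str], k: int = 60) -> list[str]:
    """Filter by permissions, then fuse lexical and vector rankings."""
    # Permission filter first, so restricted documents never reach ranking.
    visible = [d for d in docs if d.allowed_groups & user_groups]
    if not visible:
        return []
    lexical = ranked(bm25_scores(query, visible))
    semantic = ranked({d.doc_id: cosine(query_embedding, d.embedding) for d in visible})
    # Reciprocal rank fusion: each ranking contributes 1 / (k + position).
    fused = Counter()
    for ranking in (lexical, semantic):
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return [doc_id for doc_id, _ in fused.most_common()]


# Toy corpus with made-up 2-dimensional embeddings.
DOCS = [
    Doc("wiki-1", "vacation policy and holiday schedule", [1.0, 0.0], {"all"}),
    Doc("hr-2", "confidential salary bands", [0.0, 1.0], {"hr"}),
    Doc("eng-3", "deployment runbook for search service", [0.2, 0.9], {"eng", "all"}),
]

print(hybrid_search("vacation policy", [0.9, 0.1], DOCS, user_groups={"all"}))
# prints ['wiki-1', 'eng-3']
```

Rank fusion is used here because it merges rankings without requiring BM25 and cosine scores to share a scale. A tuned ranking stack of the kind the write-up describes would more likely weight and combine many signals, including personalization.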
