This case study examines how Databricks implemented an LLM-powered AI assistant to enhance its sales operations, providing a window into the real-world challenges and solutions of deploying LLMs in production.
The core business challenge was addressing the overwhelming amount of information sales teams need to process across multiple siloed systems. Sales representatives required easy access to account data, opportunity information, market insights, and various sales collateral, while also needing to perform numerous repetitive administrative tasks. The Field AI Assistant was developed to streamline these processes through automation and intelligent data integration.
From a technical implementation perspective, the solution demonstrates several key aspects of production LLM deployment:
**Architecture and Integration**
The system is built entirely on the Databricks tech stack, utilizing their Mosaic AI agent framework. The architecture employs a compound AI agent approach, with one driver agent coordinating multiple tools and functions. The solution integrates diverse data sources including:
* Internal Databricks Lakehouse for account intelligence and sales content
* CRM system (Salesforce)
* Collaboration platforms for unstructured data
* Various external APIs and services
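The compound-agent pattern above can be sketched as a single driver that routes model-selected tool calls to a registry of tools. This is a minimal illustration, not Databricks' actual implementation; the tool names and the hand-built tool call are assumptions.

```python
# Minimal sketch of the driver-agent pattern: one coordinator dispatches
# model-selected tool calls to registered tools. Illustrative only.
from typing import Any, Callable, Dict

class DriverAgent:
    """Coordinates a registry of tools; the LLM decides which one to call."""
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def dispatch(self, tool_call: Dict[str, Any]) -> Any:
        # In production the tool_call would come from the model's
        # function-calling output; here it is constructed by hand.
        fn = self._tools[tool_call["name"]]
        return fn(**tool_call.get("arguments", {}))

# Hypothetical tool backed by the lakehouse data source listed above.
def lookup_account(account_id: str) -> dict:
    return {"account_id": account_id, "source": "lakehouse"}

agent = DriverAgent()
agent.register("lookup_account", lookup_account)
result = agent.dispatch(
    {"name": "lookup_account", "arguments": {"account_id": "acct-42"}}
)
```

In a real deployment the registry would hold one tool per integration (CRM lookups, document search, external APIs), and the dispatch loop would run until the model stops requesting tools.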
**Model Selection and Management**
The team chose Azure OpenAI's GPT-4 as their primary model after careful evaluation of various options, including open-source alternatives. The selection criteria focused on:
* Groundedness and factual accuracy
* Ability to generate relevant content
* Effectiveness in tool/function selection
* Adherence to output formatting requirements
Importantly, the architecture was designed to be model-agnostic, so new models can be adopted easily as they become available within the framework.
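A model-agnostic design like the one described typically means application code depends on a small interface, with one adapter per provider. The sketch below assumes illustrative class and method names; the Azure OpenAI call itself is stubbed out.

```python
# Sketch of a model-agnostic layer: swapping GPT-4 for another model
# means adding one adapter. Names are illustrative assumptions.
from abc import ABC, abstractmethod

class ChatModel(ABC):
    """Provider-agnostic interface the assistant codes against."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class AzureOpenAIModel(ChatModel):
    """Adapter for an Azure OpenAI deployment (API call stubbed out)."""
    def __init__(self, deployment: str):
        self.deployment = deployment

    def complete(self, prompt: str) -> str:
        # Real code would call the Azure OpenAI chat completions API here.
        return f"[{self.deployment}] response to: {prompt}"

def answer(model: ChatModel, question: str) -> str:
    # Application code never names a concrete provider.
    return model.complete(question)

reply = answer(AzureOpenAIModel("gpt-4"), "Summarize the account.")
```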
**Data Engineering and Quality**
The case study highlights the critical importance of data quality in production LLM systems. The team implemented:
* Iterative expansion of datasets
* Focused data engineering pipelines
* Creation of clean, gold-layer single-source-of-truth datasets
* Strong data governance through Unity Catalog
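The kind of cleanup behind a gold single-source-of-truth table can be illustrated with a small deduplication step: normalize keys and keep the latest record per account. In practice this would be a Spark pipeline governed by Unity Catalog; the field names here are illustrative.

```python
# Toy sketch of a gold-layer cleaning step: normalize account keys and
# keep only the most recently updated record per account. Illustrative.
def build_gold_accounts(raw_rows: list[dict]) -> list[dict]:
    latest: dict[str, dict] = {}
    for row in raw_rows:
        key = row["account_id"].strip().lower()
        cleaned = {
            "account_id": key,
            "name": row["name"].strip(),
            "updated_at": row["updated_at"],
        }
        # Deduplicate: retain the newest record for each account key.
        if key not in latest or cleaned["updated_at"] > latest[key]["updated_at"]:
            latest[key] = cleaned
    return sorted(latest.values(), key=lambda r: r["account_id"])

raw = [
    {"account_id": "A1 ", "name": " Acme ", "updated_at": "2024-01-02"},
    {"account_id": "a1", "name": "Acme Corp", "updated_at": "2024-03-01"},
    {"account_id": "B2", "name": "Globex", "updated_at": "2024-02-10"},
]
gold = build_gold_accounts(raw)
```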
**Production Infrastructure**
The production deployment leverages several key components:
* Mosaic AI Vector Search for embedding models and vector databases
* Function calling and tool usage interfaces
* MLflow for LLMOps-style model customization and management
* DSPy on Databricks for prompt engineering
* Mosaic AI Gateway for monitoring, rate limiting, and guardrails
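The retrieval step behind a vector search tool can be sketched as ranking stored document embeddings by cosine similarity to a query vector. The tiny hand-made vectors below stand in for a real embedding model endpoint; this is not the Mosaic AI Vector Search API.

```python
# Illustrative retrieval sketch: rank (doc_id, embedding) pairs by
# cosine similarity to a query vector and return the top-k ids.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query: list[float],
          index: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy index of sales-collateral embeddings (ids are hypothetical).
index = [
    ("pricing_deck", [1.0, 0.0]),
    ("support_faq", [0.0, 1.0]),
    ("case_study", [0.9, 0.1]),
]
results = top_k([1.0, 0.0], index, k=2)
```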
**Governance and Security**
The implementation emphasizes robust governance measures:
* Early engagement with Enterprise Security, Privacy, and Legal teams
* Comprehensive governance model built on Unity Catalog
* Access controls and rate limiting
* Payload logging and monitoring
* Safety and bias checks
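Gateway-style controls of the kind listed above can be sketched as a wrapper that logs every payload and enforces a sliding-window rate limit. Limits and field names are illustrative assumptions, not the Mosaic AI Gateway's actual behavior.

```python
# Sketch of gateway-style governance: payload logging plus a simple
# sliding-window rate limit. Illustrative only.
import time

class Gateway:
    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._calls: list[float] = []
        self.log: list[dict] = []  # payload log for auditing

    def allow(self, user: str, payload: str) -> bool:
        now = time.monotonic()
        # Drop calls that have aged out of the rate-limit window.
        self._calls = [t for t in self._calls if now - t < self.window]
        allowed = len(self._calls) < self.max_requests
        if allowed:
            self._calls.append(now)
        # Every request is logged, whether or not it was allowed.
        self.log.append({"user": user, "payload": payload, "allowed": allowed})
        return allowed

gw = Gateway(max_requests=2)
decisions = [gw.allow("rep-1", f"query-{i}") for i in range(3)]
```

A production gateway would layer safety and bias checks on the same choke point, since every request already passes through it.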
**Evaluation and Monitoring**
The team implemented several approaches to measure and maintain system quality:
* LLM-as-a-judge capability for response scoring
* Continuous monitoring of running systems
* Evaluation of safety, bias, and quality metrics
* Small focus group testing during pilot phase
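The LLM-as-a-judge approach can be sketched as follows: the judge model is shown the question, the assistant's answer, and a rubric, and returns a numeric score. The judge below is a stub standing in for a real model endpoint, and the rubric wording is an assumption.

```python
# Hedged sketch of LLM-as-a-judge scoring: prompt a judge model with a
# rubric and parse its 1-5 score. The judge is a stub, not a real model.
def judge_response(question: str, answer: str, judge) -> int:
    rubric = (
        "Rate the answer 1-5 for groundedness and relevance.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with the score only."
    )
    score = int(judge(rubric).strip())
    if not 1 <= score <= 5:
        raise ValueError(f"judge returned out-of-range score: {score}")
    return score

# Stub judge that always answers "4"; a real system would call a model here.
stub_judge = lambda prompt: "4"
score = judge_response("When does the contract renew?", "June 30, 2025.", stub_judge)
```

Scores like these can then feed the continuous monitoring and safety/bias metrics mentioned above.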
**Challenges and Learnings**
The case study candidly discusses several challenges faced during implementation:
* Data messiness requiring significant cleanup and pipeline development
* Difficulty in measuring ROI, requiring careful experimentation with focus groups
* Complexity in building evaluation datasets
* Need for strong governance from the start
**Capabilities and Use Cases**
The deployed system supports various use cases including:
* Conversational interaction with data across multiple sources
* Automated document creation and downloading
* CRM updates and field management
* Personalized email drafting
* Customer proposal creation
* Meeting preparation assistance
The system provides comprehensive customer insights including financial news, competitive landscape analysis, product consumption data, support case information, and use case recommendations. It also handles data hygiene alerts and manages sales collateral.
From an LLMOps perspective, this case study demonstrates the complexity of building and deploying production-grade LLM systems in enterprise environments. It highlights the importance of robust infrastructure, careful model selection, strong governance, and comprehensive monitoring. The modular architecture and emphasis on data quality and governance provide valuable lessons for similar implementations.
The implementation showcases how LLM systems can be effectively integrated into existing business processes while maintaining security and compliance requirements. The focus on measurable business impact and user adoption through careful piloting and evaluation provides a template for other organizations considering similar deployments.