Marsh McLennan, a global professional services firm, implemented a comprehensive LLM-based assistant solution reaching 87% of their 90,000 employees worldwide and processing 25 million requests annually. Initially focused on productivity enhancement through API access and RAG, they evolved their strategy from using out-of-the-box models to incorporating fine-tuned models for specific tasks, achieving accuracy exceeding GPT-4 on those tasks while maintaining cost efficiency. The implementation has conservatively saved over a million hours annually across the organization.
Marsh McLennan's journey into production LLM deployment represents a significant case study in enterprise-wide AI implementation, showcasing both the opportunities and challenges of rolling out generative AI at scale in a large organization.
# Initial Approach and Strategy
The company's approach to LLM deployment differed from typical enterprise adoption patterns. Rather than starting with extensive use-case analysis and business case development, they recognized the immediate potential of the technology and moved quickly to make it accessible across the organization. This approach was deliberately chosen to avoid the common enterprise trap of over-analysis and project bureaucracy that could stifle innovation and experimentation.
Their implementation timeline was notably aggressive:
* Early 2023: Initial exploration
* April 2023: Secure API access made available to teams
* June 2023: Pilot LLM assistant launched
* August/September 2023: Full global rollout
# Technical Implementation and Architecture
The technical architecture was designed with several key principles in mind:
* Flexibility and experimentation
* Cost-effectiveness
* Security and access control
* Scalability
Rather than hosting their own models, they opted for a cloud-based API approach, renting models by the call (see the sketch after the list below). This decision was driven by the desire to maintain flexibility and keep experimentation costs low. The implementation included:
* Core LLM assistant integrated into their office suite
* Multiple AI-powered helper applications surrounding the core suite
* RAG (Retrieval Augmented Generation) implementation for secure data access
* API integration layer for service access
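To illustrate the rent-by-the-call pattern, here is a minimal sketch using the OpenAI Python client as a stand-in. The case study does not name the actual provider, model, or client library, so those details are assumptions, not Marsh McLennan's implementation.

```python
# Minimal sketch of per-call access to a hosted model, rather than
# self-hosted inference. Provider, model, and client library are
# assumptions; the case study does not specify them.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def assistant_reply(user_message: str, context: str = "") -> str:
    # Pay-per-call keeps experimentation cheap: no GPUs to provision,
    # and an idle prototype costs nothing.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system",
             "content": f"Use this context if it is relevant:\n{context}"},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content
```

A benefit of routing everything through a shared API layer like this is that swapping or upgrading the underlying model requires no changes to the applications built on top of it.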
# Evolution to Fine-Tuned Models
One of the most interesting aspects of their journey was the evolution in their approach to model fine-tuning. Initially, they were skeptical of fine-tuning for several reasons:
* Higher costs compared to out-of-the-box models
* Operational complexities in managing multiple models
* Data security concerns
* Infrastructure multiplication challenges
However, their perspective shifted as they discovered new approaches that made fine-tuning more economically viable. Key factors that influenced this change included (see the fine-tuning sketch after this list):
* Ability to share infrastructure across multiple use cases
* Surprisingly low training costs (around $20 per training cycle)
* Accuracy levels exceeding GPT-4 on the targeted tasks
* Better economics for specific task automation
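To make the economics concrete, the sketch below shows one way training cycles in the tens of dollars are achievable: parameter-efficient fine-tuning with LoRA adapters, where only a small set of adapter weights is trained and many adapters can share one base model deployment. The case study does not say which fine-tuning method Marsh McLennan used, so the method, base model, dataset file, and hyperparameters here are all assumptions.

```python
# Hedged sketch: task-specific fine-tuning with LoRA adapters (PEFT).
# Base model, dataset file, and hyperparameters are illustrative
# assumptions, not Marsh McLennan's actual configuration.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains only a small adapter on top of the frozen base model,
# which is why a single training cycle can cost so little.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         task_type="CAUSAL_LM"))

data = load_dataset("json", data_files="task_examples.jsonl")["train"]  # hypothetical file
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                     max_length=512),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="adapter-out", num_train_epochs=3,
                           per_device_train_batch_size=4,
                           learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
model.save_pretrained("adapter-out")  # saves only the small adapter
```

Because each use case ships as a small adapter over a shared base model, the infrastructure-multiplication concern listed above largely disappears.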
# Security and Data Management
The organization placed a strong emphasis on security and data control. Their RAG implementation was particularly important; as the retrieval sketch after this list illustrates, it allowed them to:
* Maintain precise control over data access
* Match existing data security paradigms
* Avoid the complications of embedding sensitive data directly into fine-tuned models
* Manage access controls effectively across their global organization
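Below is a minimal sketch of what ACL-aware retrieval can look like. The in-memory index, group model, and toy scoring are illustrative assumptions, not Marsh McLennan's actual stack; the point is that retrieval enforces the same entitlements as the source systems before any text reaches the prompt.

```python
# Hedged sketch of ACL-aware retrieval for RAG. A real system would use
# a vector store and embeddings; the principle is identical: filter by
# the caller's entitlements BEFORE ranking and prompt assembly.
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)  # mirrors source-system ACLs

def retrieve(query: str, user_groups: set, corpus: list[Doc],
             k: int = 5) -> list[Doc]:
    # Filter first, so restricted text can never leak into the prompt.
    visible = [d for d in corpus if user_groups & d.allowed_groups]
    # Toy relevance score: term overlap (stand-in for vector search).
    terms = set(query.lower().split())
    ranked = sorted(visible,
                    key=lambda d: -len(terms & set(d.text.lower().split())))
    return ranked[:k]

def build_prompt(query: str, docs: list[Doc]) -> str:
    context = "\n\n".join(d.text for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Usage: a user in the 'claims' group only ever sees claims documents.
corpus = [Doc("a", "Claims handling guide ...", {"claims"}),
          Doc("b", "Executive compensation memo ...", {"hr_exec"})]
query = "How do we handle claims?"
print(build_prompt(query, retrieve(query, {"claims"}, corpus)))
```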
# Results and Impact
The implementation has shown significant measurable impact:
* 87% of 90,000 global employees have used the tool
* Processing approximately 25 million requests annually
* Conservative estimate of over 1 million hours saved per year
* Improvements in client service, decision making, and work-life balance
# Future Direction and Lessons Learned
Their experience has led to several insights about the future of enterprise LLM deployment:
* The value of starting with broad accessibility and letting use cases emerge organically
* The importance of making experimentation cheap and accessible
* The potential of specialized, targeted models for specific tasks
* The benefit of a hybrid approach using both general and fine-tuned models (see the routing sketch below)
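In practice, a hybrid setup can be as simple as a routing table in front of the model calls: known, repetitive tasks go to cheap specialized models, and open-ended requests fall back to the general model. The sketch below is an assumption about how such routing might look; the task labels and model identifiers are hypothetical.

```python
# Hedged sketch of a hybrid routing layer. Task labels and model
# identifiers are hypothetical, not Marsh McLennan's actual registry.
FINE_TUNED = {
    "contract_summary": "ft:contract-summarizer-v2",   # hypothetical
    "policy_extraction": "ft:policy-extractor-v1",     # hypothetical
}
GENERAL_MODEL = "gpt-4"

def pick_model(task: str | None) -> str:
    """Route a labeled task to its specialized model, else fall back."""
    return FINE_TUNED.get(task, GENERAL_MODEL)

assert pick_model("contract_summary") == "ft:contract-summarizer-v2"
assert pick_model(None) == GENERAL_MODEL  # open-ended requests
```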
Their future strategy includes:
* Continuing enhancement of their productivity suite
* Focus on process-specific automation
* Implementation of more specialized models for subtasks
* Development of a "flywheel" approach where initial LLM implementations gather data that informs subsequent fine-tuning (see the logging sketch below)
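The flywheel is worth spelling out because it is mostly plumbing: production interactions are logged, user-accepted responses are kept, and the result becomes the supervised dataset for the next fine-tuning cycle. The sketch below assumes a simple JSONL log and a chat-style training format; both are illustrative, not Marsh McLennan's actual schema.

```python
# Hedged sketch of the data "flywheel": log production interactions,
# then export accepted ones as fine-tuning examples. Schema and storage
# are illustrative assumptions.
import json
import time

def log_interaction(path: str, prompt: str, response: str,
                    accepted: bool) -> None:
    record = {"ts": time.time(), "prompt": prompt,
              "response": response, "accepted": accepted}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

def export_training_set(log_path: str, out_path: str) -> None:
    # Keep only interactions users accepted; these become the
    # supervised examples for the next fine-tuning cycle.
    with open(log_path) as f, open(out_path, "w") as g:
        for line in f:
            r = json.loads(line)
            if r["accepted"]:
                g.write(json.dumps({"messages": [
                    {"role": "user", "content": r["prompt"]},
                    {"role": "assistant", "content": r["response"]},
                ]}) + "\n")
```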
# Challenges and Considerations
The case study highlights several important challenges in enterprise LLM deployment:
* Balancing the desire for experimentation with enterprise security requirements
* Managing the economics of model deployment at scale
* Addressing concerns about job replacement versus augmentation
* Maintaining control over data access while enabling broad accessibility
# Technical Lessons
Several key technical lessons emerged:
* The importance of API-first architecture for flexibility
* The value of RAG for maintaining data security
* The need for balanced infrastructure investment
* The benefits of incremental improvement in model accuracy and efficiency
The case provides valuable insights into how large enterprises can successfully implement LLMs at scale while maintaining security, managing costs, and driving meaningful business value. It particularly highlights the importance of practical, iterative approaches over theoretical perfection, and the value of making AI capabilities broadly accessible while maintaining appropriate controls.