Company: LinkedIn
Title: Domain-Adapted Foundation Models for Enterprise-Scale LLM Deployment
Industry: Tech
Year: 2024
Summary (short):
LinkedIn developed a family of domain-adapted foundation models (EON models) to power GenAI capabilities across its platform serving 1B+ members. By adapting open-source models like Llama through multi-task instruction tuning and safety alignment, the team created models that maintain high performance while being 75x more cost-efficient than GPT-4. The EON-8B model demonstrated significant improvements in production applications, including a 4% increase in candidate-job-requirements matching accuracy compared to GPT-4o mini in the Hiring Assistant product.
LinkedIn's journey into production LLM deployment is a comprehensive case study in enterprise-scale LLMOps, showing how a major technology platform can adapt and deploy foundation models for domain-specific needs while maintaining cost efficiency and performance.

The company's approach evolved through several phases, starting with initial experiments in 2022 using in-house models for personalized AI-assisted messaging. By 2023, LinkedIn had begun using GPT-4 for premium profile writing suggestions and collaborative articles, though latency and cost remained challenges. This led to the development of the InBart model, an in-house domain-adapted transformer that leveraged several key LLMOps techniques:

* Model imitation
* Automated instruction faithfulness checking
* Hallucination detection
* Retrieval Augmented Generation (RAG) for inference

However, maintaining and expanding these single-purpose fine-tuned models proved challenging, leading to their latest innovation: the Economic Opportunity Network (EON) project, a more sophisticated and scalable approach to LLMOps. The EON project introduces several important LLMOps innovations and best practices:

**Model Architecture and Training Pipeline**

* Built on top of open-source models like Llama
* Implemented a two-step training process (see the DPO sketch below):
  * Multi-task instruction tuning with reasoning traces
  * Preference and safety alignment using RLHF and DPO
* Utilized prompt simplification strategies, achieving a 30% reduction in prompt size
* Training data comprised around 200M tokens, with a focus on quality and diversity

**Infrastructure and Deployment**

* Developed on an on-premise Kubernetes platform
* Implemented a modular training pipeline connecting:
  * Data preprocessing
  * Model training
  * Offline inference
  * Comprehensive evaluation
* Flexible architecture allowing different optimization techniques:
  * In-house Liger Kernels
  * DeepSpeed ZeRO
  * Hugging Face Accelerate
  * vLLM for inference (see the serving sketch below)

**Evaluation and Monitoring**

* Integrated with MLflow for experiment tracking
* Comprehensive evaluation framework including:
  * Open-source benchmarks (ARC, MuSR, IFEval)
  * Berkeley Function Calling Leaderboard metrics
  * Internal safety scores
  * GPT-4 as a judge for certain metrics (see the judge sketch below)
* Centralized company-wide leaderboard for model performance tracking
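The case study does not include LinkedIn's training code, so as a rough illustration of the preference-alignment step described above, here is a minimal DPO sketch assuming a recent version of Hugging Face's open-source TRL library. The model name, hyperparameters, and the tiny preference dataset are placeholder assumptions, not LinkedIn's actual setup:

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Placeholder base model; EON starts from open-source Llama checkpoints.
model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical preference pairs: "chosen" responses are preferred over "rejected".
pairs = Dataset.from_dict({
    "prompt": ["Summarize this candidate's fit for a backend engineering role."],
    "chosen": ["A grounded summary citing only stated qualifications."],
    "rejected": ["A summary that invents qualifications not in the profile."],
})

# beta controls how far the tuned policy may drift from the reference model.
config = DPOConfig(output_dir="eon-dpo-sketch", beta=0.1, per_device_train_batch_size=1)

# When no reference model is passed, TRL uses a frozen copy of `model`.
trainer = DPOTrainer(model=model, args=config, train_dataset=pairs, processing_class=tokenizer)
trainer.train()
```

In practice, this step would run over preference and safety-alignment data at scale rather than a handful of pairs; the sketch only shows the shape of the training loop.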
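For the offline-inference stage, the post names vLLM. Below is a minimal batch-generation sketch using vLLM's public Python API; since the EON checkpoints are not public, an open Llama model and an invented prompt stand in:

```python
from vllm import LLM, SamplingParams

# Stand-in for a domain-adapted checkpoint such as EON-8B.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")

# Low temperature for more deterministic evaluation-style outputs.
params = SamplingParams(temperature=0.2, max_tokens=256)

prompts = [
    "Assess whether the following candidate profile meets the job requirements: ...",
]

# vLLM batches and schedules these requests internally (continuous batching),
# which is part of how serving cost per call stays low.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)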
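The evaluation framework pairs MLflow tracking with GPT-4 as a judge for some metrics. The following speculative sketch wires those two pieces together; the rubric, prompt, and `eval_pairs` data are invented for illustration and are not LinkedIn's internal harness:

```python
import mlflow
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = (
    "Rate the answer from 1 (poor) to 5 (excellent) for instruction faithfulness.\n"
    "Question: {question}\nAnswer: {answer}\n"
    "Reply with a single digit."
)

def judge_score(question: str, answer: str) -> int:
    """Score a model answer with GPT-4 acting as the judge (simplified rubric)."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(question=question, answer=answer)}],
    )
    return int(resp.choices[0].message.content.strip()[0])

# Hypothetical (question, model answer) pairs from the offline-inference stage.
eval_pairs = [
    ("Does this candidate meet the listed requirements?",
     "Yes: the profile shows 5 years of relevant backend experience."),
]

# Log judge scores to MLflow so runs can feed a shared leaderboard.
with mlflow.start_run(run_name="eon-8b-judge-eval"):
    scores = [judge_score(q, a) for q, a in eval_pairs]
    mlflow.log_metric("judge_faithfulness_mean", sum(scores) / len(scores))
```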
**Cost Optimization and Performance**

The EON-8B model achieved remarkable efficiency improvements:

* 75x more cost-effective than GPT-4
* 6x more cost-effective than GPT-4o
* Significantly reduced GPU requirements for serving

**Production Application - Hiring Assistant**

The deployment of EON models in LinkedIn's Hiring Assistant product demonstrates real-world effectiveness:

* Handles 90% of LLM calls in the Hiring Assistant workflow
* Improved candidate-job-requirements matching accuracy by 4% compared to GPT-4o mini
* 30% improvement over the Llama-3-8B-Instruct baseline
* Successfully integrated with LinkedIn's Responsible AI principles

**Safety and Responsible AI**

The team implemented several measures to ensure responsible deployment:

* Instruction tuning with synthetic safe outputs
* Preference alignment data
* Built-in bias detection for job requirements
* Compliance with LinkedIn's trust and fairness principles

**Challenges and Solutions**

The team faced and addressed several common LLMOps challenges:

* Balancing generalization with domain-specific performance
* Managing training data quality and diversity
* Implementing efficient evaluation frameworks
* Scaling inference while maintaining cost efficiency

**Future Developments**

The team is actively working on enhancing the system with:

* Complex multi-turn interactions
* Advanced planning and reasoning capabilities
* Efficient context representations
* Dynamic goal identification
* Improved storage and retrieval techniques

This case study demonstrates how large enterprises can implement LLMOps at scale while balancing performance, cost, and responsible AI principles. The modular architecture and comprehensive evaluation framework provide a template for other organizations looking to deploy domain-adapted language models in production environments.

The success of the EON project highlights the importance of:

* Building flexible, modular infrastructure
* Implementing comprehensive evaluation frameworks
* Balancing cost efficiency with performance
* Maintaining strong safety and responsibility standards
* Creating scalable, reusable solutions

LinkedIn's approach shows that with proper LLMOps practices, organizations can adapt and deploy foundation models while maintaining control over costs, performance, and safety.
