Discord: Large-Scale AI Assistant Deployment with Safety-First Evaluation Approach

LLMOps Database

Tech

Discord

Company

Discord

Title

Large-Scale AI Assistant Deployment with Safety-First Evaluation Approach

Industry

Tech

Link

https://www.youtube.com/watch?v=OrtBEBLMXdM

Year

2023

Summary (short)

Discord implemented Clyde AI, a chatbot assistant that was deployed to over 200 million users, focusing heavily on safety, security, and evaluation practices. The team developed a comprehensive evaluation framework using simple, deterministic tests and metrics, implemented through their open-source tool PromptFu. They faced unique challenges in preventing harmful content and jailbreaks, leading to innovative solutions in red teaming and risk assessment, while maintaining a balance between casual user interaction and safety constraints.

This case study examines Discord's journey in deploying Clyde AI, a large-scale chatbot implementation that reached over 200 million users. The presentation, given by a former team lead who worked on both the developer platform and LLM products teams, provides valuable insights into the challenges and solutions in deploying LLMs at scale with a particular focus on safety and evaluation practices. # Core Challenge and Approach The primary challenge wasn't in the model development or fine-tuning, but rather in ensuring safety and preventing harmful outputs. The team faced significant hurdles in preventing the system from generating dangerous content (like bomb-making instructions) or engaging in harmful behaviors. This was particularly challenging given Discord's young user base and the tendency of some users to actively try to break or exploit the system. The team identified that the major launch blockers were typically related to security, legal, safety, and policy concerns rather than technical issues. This led to the development of a comprehensive evaluation framework that could quantify risks ahead of time and satisfy stakeholders' concerns. # Evaluation Philosophy and Implementation Discord's approach to evaluations (evals) was notably practical and developer-focused. They treated evals as unit tests, emphasizing: * Simple, fast, and ideally deterministic evaluations * Local running capability without cloud dependencies * Basic metrics that are easy to understand * Integration into the regular development workflow The team developed PromptFu, an open-source CLI tool for evals, which features declarative configs and supports developer-first evaluation practices. Every pull request required an accompanying eval, creating a culture of continuous testing and evaluation. A particularly interesting example of their practical approach was their solution for maintaining a casual chat personality. Instead of complex LLM graders or sophisticated metrics, they simply checked if responses began with lowercase letters - a simple heuristic that achieved 80% of their goals with minimal effort. # System Architecture and Technical Solutions The team implemented several innovative technical solutions: * Split testing between tool triggering and content generation with static contexts * Model routing strategy that occasionally used GPT-4 responses to "course correct" lower-powered models that were drifting from their system prompts in long conversations * Integration with DataDog for observability, choosing to use existing tools rather than adopting specialized LLM-specific solutions * Simple prompt management using Git as source of truth and Retool for configuration # Safety and Red Teaming Discord's approach to safety was particularly comprehensive, given their unique challenges with a young, technically savvy user base prone to testing system limits. They developed a two-pronged approach: * Pre-deployment safeguards: * Comprehensive risk assessment framework * Extensive red teaming * Compliance and legal constraint evaluation * Live filtering and monitoring The team created sophisticated red teaming approaches, including using unaligned models to generate toxic inputs and developing application-specific jailbreak testing. They documented various attack vectors, including the "grandma jailbreak" incident, which helped improve their safety measures. # Observability and Feedback The observability strategy focused on practical integration with existing tools, particularly DataDog. While they weren't able to implement a complete feedback loop for privacy reasons, they did: * Implement online production evals * Use model-graded evaluations * Incorporate public feedback and dogfooding data into their eval suite # Challenges and Limitations The case study honestly addresses several limitations and challenges: * Inability to implement a complete feedback loop due to privacy constraints * Challenges with model drift in long conversations * Difficulty in maintaining consistent personality across different model types * Constant battle against creative user attempts to bypass safety measures # Key Learnings The case study emphasizes several important lessons for LLMOps at scale: * Simplicity in evaluation metrics often outperforms complex solutions * Safety and security considerations should be primary, not secondary concerns * Developer experience and integration with existing workflows is crucial * The importance of practical, implementable solutions over theoretical perfection This case study is particularly valuable as it provides real-world insights into deploying LLMs at scale while maintaining safety and quality standards. Discord's approach demonstrates that successful LLMOps isn't just about sophisticated technical solutions, but about building practical, maintainable systems that can be effectively monitored and improved over time.

Start deploying reproducible AI workflows today

Enterprise-grade MLOps platform trusted by thousands of companies in production.

Book a Demo

Use Open Source