Decagon offers an interesting case study in deploying LLMs at scale for customer support through its AI agent platform. The Series B company has developed what it calls an "AI Agent Engine," which demonstrates several key aspects of production LLM systems.
## System Architecture and Components
The system is built around five core components that work together to create a complete LLMOps solution:
### Core AI Agent
The foundation is their "brain" that handles enterprise logic and knowledge processing. This component:
* Ingests and processes company knowledge bases, help center articles, and standard operating procedures
* Implements tool calling capabilities for actions like issuing refunds or checking order status
* Works across multiple channels (chat, email, SMS, voice) with the same core logic but different interaction patterns
* Includes guardrails and security measures to prevent prompt injection and unauthorized actions
What's particularly notable is their approach to tools and permissions. They've implemented a sophisticated system for managing sensitive operations like refunds, with configurable criteria that can include both hard rules (e.g., customer status, time since last refund) and softer qualitative assessments.
### Routing System
The routing component orchestrates human-AI collaboration:
* Dynamically routes conversations between AI and human agents based on configurable criteria
* Is particularly important for regulated industries like healthcare and financial services
* Supports flexible handoff patterns, including the ability to return conversations to AI handling after human intervention
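At its core, a criteria-driven router like this is a small decision function over conversation state and operator configuration. The config keys below (`always_escalate_topics`, `min_confidence`) are hypothetical stand-ins for whatever criteria an operator would actually configure:

```python
def route(conversation: dict, config: dict) -> str:
    """Return 'human' or 'ai' for a conversation turn based on
    configurable escalation criteria."""
    # Certain topics (e.g. in regulated industries) always go to a human.
    if conversation["topic"] in config["always_escalate_topics"]:
        return "human"
    # Low model confidence triggers a handoff; after the human resolves
    # the sensitive part, later turns can route back to the AI.
    if conversation["ai_confidence"] < config["min_confidence"]:
        return "human"
    return "ai"
```

Because the function is evaluated per turn rather than per conversation, it naturally supports the return-to-AI handoff pattern described above.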
### Agent Assist
This component acts as a co-pilot for human agents, showing how LLMs can augment rather than replace human workers:
* Provides human agents access to the AI brain's capabilities
* Allows for human review and approval of AI-suggested actions
* Can serve as a stepping stone for companies gradually adopting AI technology
### Admin Dashboard
The dashboard serves as the central nervous system for monitoring and improving the AI system:
* Enables configuration of brand voice, guidelines, and response structures
* Provides testing capabilities for new agent changes
* Tracks key metrics like deflection rate (percentage of conversations handled without human escalation) and customer satisfaction scores
* Facilitates continuous monitoring and improvement of the system
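Deflection rate itself is a simple ratio; a minimal sketch (the function name and signature are my own, not Decagon's):

```python
def deflection_rate(total_conversations: int, escalated: int) -> float:
    """Percentage of conversations handled end-to-end by the AI,
    i.e. resolved without escalation to a human agent."""
    if total_conversations == 0:
        return 0.0
    return 100.0 * (total_conversations - escalated) / total_conversations
```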
### Quality Assurance Interface
Their QA approach reflects rigorous testing and evaluation practices:
* Pre-deployment testing with comprehensive test sets (hundreds of conversations per workflow)
* Continuous testing as production data comes in
* Evaluation of both quantitative metrics and qualitative aspects like tone and formatting
* Structured taxonomies for consistent quality assessment
* Gradual rollout strategy (starting with 5% of users) with rapid iteration loops
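A gradual rollout like the 5% start described above is commonly implemented with deterministic hash bucketing, so each user consistently sees the same experience across sessions. This is a generic sketch of the technique, not Decagon's mechanism:

```python
import hashlib


def in_rollout(user_id: str, percent: float, salt: str = "agent-v2") -> bool:
    """Deterministically assign a user to the rollout cohort by hashing
    the (salted) user id into one of 10,000 buckets; the first
    `percent`% of buckets are in the cohort."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10_000
    return bucket < percent * 100
```

Changing the salt reshuffles cohorts between experiments, while raising `percent` from 5 toward 100 only ever adds users, so no one flips back to the old behavior mid-rollout.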
## Production Considerations
Several aspects of their system demonstrate mature LLMOps practices:
**Testing and Evaluation:**
* Multiple testing phases: pre-deployment and continuous testing
* Test sets that cover different paths and edge cases
* Automated evaluation of responses using specialized evaluation agents
* Adaptation of test sets based on real user interactions and changing product offerings
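A bare-bones version of such a harness, with a separate evaluator judging each agent response, might look like the following; the callables and test-case shape are assumptions for illustration, not Decagon's API:

```python
from typing import Callable


def run_eval(agent: Callable[[str], str],
             evaluator: Callable[[str, str, str], bool],
             test_set: list[dict]) -> float:
    """Run each test case through the agent, have a separate evaluator
    judge the response against the expected behavior, and return the
    fraction of cases that pass."""
    passed = 0
    for case in test_set:
        response = agent(case["input"])
        if evaluator(case["input"], response, case["expected_behavior"]):
            passed += 1
    return passed / len(test_set)
```

In production the `evaluator` would typically itself be an LLM-backed "evaluation agent" scoring tone and formatting as well as correctness, while the structure of the loop stays the same.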
**Safety and Security:**
* Built-in guardrails against prompt injection attempts
* Enterprise-grade security controls
* Compliance with regulatory requirements
* Penetration testing support
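Input-side prompt-injection screening is often layered, starting with cheap pattern checks before any model-based classification. The patterns below are a deliberately naive illustration of that first layer, not a production filter:

```python
import re

# Illustrative patterns only; real guardrails combine filters like these
# with model-based classifiers and output-side checks.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]


def looks_like_injection(message: str) -> bool:
    """Cheap first-pass screen run before the message reaches the agent."""
    lowered = message.lower()
    return any(re.search(p, lowered) for p in SUSPICIOUS_PATTERNS)
```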
**Customization and Configuration:**
* Flexible configuration of brand voice and response patterns
* Custom workflow definitions
* Adjustable security parameters
* Channel-specific optimizations
**Monitoring and Analytics:**
* Real-time tracking of key metrics
* Comprehensive logging of interactions
* Analysis of conversation patterns
* Customer satisfaction monitoring
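A rolling customer-satisfaction tracker is one small building block of such monitoring; this sketch assumes a simple numeric score per rated conversation:

```python
from collections import deque


class CsatMonitor:
    """Rolling average of customer-satisfaction scores over the
    last `window` rated conversations."""

    def __init__(self, window: int = 100):
        self.scores = deque(maxlen=window)  # oldest scores fall off

    def record(self, score: float) -> None:
        self.scores.append(score)

    @property
    def average(self) -> float:
        return sum(self.scores) / len(self.scores) if self.scores else 0.0
```

A windowed average like this surfaces recent regressions (e.g. after a prompt or workflow change) that a lifetime average would smooth over.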
What's particularly interesting about Decagon's approach is how they've balanced automation with human oversight. Rather than pushing for complete automation, they've created a system that can operate independently when appropriate but seamlessly integrate human judgment for complex or sensitive cases.
They've also shown a nuanced understanding of different interaction modalities, adapting their core agent brain to the different response-time expectations and interaction patterns of chat, email, and voice channels.
Their testing and evaluation approach is notably comprehensive, combining automated checks with human review and gradually expanding deployment to ensure quality. The system demonstrates how production LLM applications need multiple layers of safety and quality controls, especially when handling sensitive operations like financial transactions.
The focus on brand consistency and customization also shows how enterprise LLM applications need to go beyond simple prompt engineering to maintain consistent voice and behavior across all interactions. This includes handling complex workflows while staying within brand guidelines and regulatory requirements.
Overall, Decagon's implementation shows how production LLM systems require careful orchestration of multiple components, robust safety measures, and sophisticated monitoring and improvement processes. Their approach to gradually rolling out features and maintaining multiple layers of quality control provides a good model for enterprise LLM deployment.