Company
Deutsche Telekom
Title
Building a Multi-Agent LLM Platform for Customer Service Automation
Industry
Telecommunications
Year
2023
Summary (short)
Deutsche Telekom developed a comprehensive multi-agent LLM platform to automate customer service across multiple European countries and channels. They built their own agent computing platform called LMOS to manage agent lifecycles, routing, and deployment, moving away from traditional chatbot approaches. The platform handled over 1 million customer queries with an 89% acceptable answer rate and achieved a 38% improvement in agent handover rates compared to vendor solutions in A/B testing.
Deutsche Telekom undertook an ambitious project to deploy generative AI across their European footprint through the Frag Magenta 1BOT program. The challenge was significant: they needed to support customer service automation across 10 countries in multiple languages, handling both chat and voice channels while serving approximately 100 million customers. Rather than taking a traditional chatbot or RAG-based approach, the team recognized early on that they needed a more sophisticated platform to handle the complexity of their use case. They developed a comprehensive multi-agent architecture and platform called LMOS (Language Models Operating System) that manages the entire lifecycle of AI agents in production.

The technical architecture consists of several key components. The base layer is a custom agent framework built in Kotlin, chosen for its DSL capabilities and concurrency support. On top of it they developed a DSL called ARC (Agent Reactor) that allows rapid development of agents through a declarative approach, significantly lowering the barrier to entry for developers and data scientists who might not be JVM experts. The agents themselves run as isolated microservices, each focused on a specific business domain (such as billing or contracts). This isolation helps contain the impact of prompt changes and allows multiple teams to work in parallel. The platform also includes sophisticated routing capabilities to direct queries to the appropriate agents, either through explicit routing rules or intent-based routing.

A key innovation is their approach to production deployment and operations. They built a custom control plane on top of Kubernetes and Istio that treats agents as first-class citizens in the platform. This allows for sophisticated traffic management, tenant/channel management, and progressive rollouts. The platform supports both their own ARC-based agents and agents built with other frameworks such as LangChain or LlamaIndex.

Notable technical features include:

* Sophisticated prompt management with built-in guardrails
* Dynamic agent discovery and routing
* Multi-channel support (chat, voice) with channel-specific optimizations
* Support for multiple LLM models with easy model switching
* Built-in security features such as PII detection and handling
* Comprehensive testing and evaluation capabilities

The results have been impressive. The deployment has handled over 1 million customer queries with an 89% acceptable answer rate, and A/B testing showed a 38% improvement in agent handover rates compared to vendor solutions. The platform has also dramatically improved development velocity: work that used to take 2 months can now be done in 10 days.

A particularly interesting aspect of their approach is how they handle testing and quality assurance. They employ a combination of automated testing and human annotation, with a focus on continually building up test cases based on real-world interactions. They have implemented sophisticated guardrails and monitoring to detect potential issues such as hallucinations or prompt injections.
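As a rough illustration of what such an input-side guardrail stage can look like, the sketch below masks common PII patterns and applies a simple prompt-injection heuristic before a query reaches the model. The `InputGuardrails` object, the regex patterns, and the heuristic phrases are illustrative assumptions, not Deutsche Telekom's actual implementation.

```kotlin
// Sketch of a pre-request guardrail stage: PII masking plus a naive prompt-injection
// heuristic. All names and patterns here are illustrative assumptions.

sealed class GuardrailResult {
    data class Allowed(val sanitizedInput: String) : GuardrailResult()
    data class Blocked(val reason: String) : GuardrailResult()
}

object InputGuardrails {
    // Rough example patterns for PII (email, IBAN-like strings, phone numbers).
    private val piiPatterns = mapOf(
        "EMAIL" to Regex("""[\w.+-]+@[\w-]+\.[\w.]+"""),
        "IBAN" to Regex("""\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b"""),
        "PHONE" to Regex("""\+?\d[\d\s/-]{7,}\d"""),
    )

    // Simple phrase list standing in for a real prompt-injection detector.
    private val injectionHints = listOf(
        "ignore previous instructions",
        "disregard the system prompt",
    )

    fun check(userInput: String): GuardrailResult {
        if (injectionHints.any { userInput.contains(it, ignoreCase = true) }) {
            return GuardrailResult.Blocked("possible prompt injection")
        }
        // Mask PII before the text is sent to the LLM or written to logs.
        var sanitized = userInput
        for ((label, pattern) in piiPatterns) {
            sanitized = pattern.replace(sanitized, "<$label>")
        }
        return GuardrailResult.Allowed(sanitized)
    }
}

fun main() {
    println(InputGuardrails.check("My IBAN is DE89370400440532013000, why was I charged twice?"))
}
```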
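Returning to the declarative agent style that ARC enables: the sketch below shows what a small Kotlin DSL for defining a domain-scoped agent with a context-dependent prompt could look like. The `agent { }` builder, the field names, and the `billingAgent` example are hypothetical and should not be read as the actual ARC API.

```kotlin
// Minimal sketch of a declarative agent DSL in Kotlin. Builder functions,
// field names, and the example agent are assumptions for illustration only.

data class AgentContext(val channel: String, val language: String)

data class AgentDefinition(
    val name: String,
    val description: String,
    val tools: List<String>,
    val systemPrompt: (AgentContext) -> String,
)

class AgentBuilder {
    var name: String = ""
    var description: String = ""
    var tools: List<String> = emptyList()
    private var promptFn: (AgentContext) -> String = { "" }

    // Dynamic prompt generation: the prompt is a function of the request context.
    fun prompt(block: AgentContext.() -> String) {
        promptFn = { ctx -> ctx.block() }
    }

    fun build() = AgentDefinition(name, description, tools, promptFn)
}

fun agent(block: AgentBuilder.() -> Unit): AgentDefinition =
    AgentBuilder().apply(block).build()

// Example: a hypothetical agent scoped to the billing domain.
val billingAgent = agent {
    name = "billing-agent"
    description = "Answers questions about invoices and payments"
    tools = listOf("get_invoice", "get_payment_status")
    prompt {
        """
        You are a customer service assistant for billing questions.
        Answer in $language, keep responses suited to the $channel channel,
        and hand over to a human agent when you are unsure.
        """.trimIndent()
    }
}

fun main() {
    val ctx = AgentContext(channel = "chat", language = "de")
    println(billingAgent.systemPrompt(ctx))
}
```

The appeal of this style is that the agent definition reads like configuration, while the surrounding platform decides how the model is called, which guardrails run, and where the agent is deployed.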
The team made several architectural decisions that proved valuable:

* Using a multi-agent architecture to contain risks and allow parallel development
* Building on familiar enterprise technologies (Kubernetes, JVM) while adding new AI-specific capabilities
* Creating a DSL that makes agent development accessible while maintaining power and flexibility
* Implementing sophisticated routing and traffic management to allow gradual rollouts

Their approach to managing prompts is particularly sophisticated. Rather than treating prompts as simple templates, they have built comprehensive prompt management into the platform, with features such as:

* Dynamic prompt generation based on context
* Built-in security and safety checks
* Support for dialog design patterns
* Automated prompt testing and evaluation

The team has open-sourced many components of the platform, aiming to contribute to the broader LLMOps ecosystem. The platform is structured in layers:

* Foundation computing abstractions for LLM handling
* Single-agent abstractions and tooling
* Agent lifecycle management
* Multi-agent collaboration

This layered approach allows for extensibility while providing the strong guardrails and operational capabilities needed for enterprise deployment. Their experience shows that while LLMs introduce new challenges, many established software engineering practices remain relevant; the key is adapting them to the unique characteristics of LLM-based systems.
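To picture how the single-agent abstraction and the multi-agent collaboration layer fit together, the sketch below shows a minimal router that tries explicit rules first and falls back to intent classification, as described earlier. The interface and class names are assumptions, and the keyword scorer merely stands in for the LLM- or embedding-based intent detection a production platform would use.

```kotlin
// Illustrative sketch of intent-based routing across domain agents.
// Names, scoring logic, and the example rules are assumptions.

interface DomainAgent {
    val name: String
    fun handle(query: String): String
}

class IntentClassifier(private val keywordsByIntent: Map<String, List<String>>) {
    // Naive keyword scoring; a real classifier would use an LLM or embeddings.
    fun classify(query: String): String? =
        keywordsByIntent
            .mapValues { (_, kws) -> kws.count { query.contains(it, ignoreCase = true) } }
            .filterValues { it > 0 }
            .entries.maxByOrNull { it.value }
            ?.key
}

class AgentRouter(
    private val agents: Map<String, DomainAgent>,     // intent -> agent (microservice client)
    private val explicitRules: Map<Regex, String>,     // rule -> intent
    private val classifier: IntentClassifier,
    private val fallback: DomainAgent,
) {
    fun route(query: String): String {
        // 1. Explicit routing rules take precedence.
        // 2. Otherwise fall back to intent-based routing, then to a default agent.
        val intent = explicitRules.entries
            .firstOrNull { it.key.containsMatchIn(query) }
            ?.value
            ?: classifier.classify(query)
        val agent = intent?.let { agents[it] } ?: fallback
        return agent.handle(query)
    }
}

fun main() {
    val billing = object : DomainAgent {
        override val name = "billing-agent"
        override fun handle(query: String) = "[$name] handling: $query"
    }
    val router = AgentRouter(
        agents = mapOf("billing" to billing),
        explicitRules = mapOf(Regex("invoice|rechnung", RegexOption.IGNORE_CASE) to "billing"),
        classifier = IntentClassifier(mapOf("billing" to listOf("bill", "payment", "charge"))),
        fallback = billing,
    )
    println(router.route("Why is my invoice higher this month?"))
}
```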
