Deepgram tackles the challenge of building efficient language AI products for call centers by advocating small, domain-specific language models over large foundation models. They demonstrate the approach with a 500M-parameter model fine-tuned on call center transcripts, which performs better on call center tasks such as conversation continuation and summarization while being faster and more cost-effective than far larger models.
# Domain-Specific Language Models for Call Center Intelligence at Deepgram
## Company Background
- Deepgram is a Speech-to-Text startup founded in 2015
- Series B company with $85 million in total funding
- Processed over one trillion minutes of audio
- Claims to provide the fastest and most accurate speech-to-text API on the market
## Problem Statement and Market Context
### Language AI Evolution
- Language is viewed as the universal interface to AI
- Businesses need adapted AI solutions for practical implementation
- Over the next two years, many businesses will derive value from language AI products
### Multi-Modal Pipeline Architecture
- Three-stage pipeline approach: speech-to-text (ASR), speaker diarization, and downstream language-model tasks such as summarization (see the sketch below)
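Concretely, the three stages compose as a simple dataflow. A minimal sketch; the stage functions are placeholders to show the structure, not Deepgram's actual API:

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # e.g. "agent" or "customer"
    text: str


def transcribe(audio: bytes) -> str:
    """Stage 1: speech-to-text (placeholder for an ASR call)."""
    raise NotImplementedError


def diarize(transcript: str) -> list[Utterance]:
    """Stage 2: attribute each utterance to a speaker."""
    raise NotImplementedError


def summarize(utterances: list[Utterance]) -> str:
    """Stage 3: condense the call with the domain-specific language model."""
    raise NotImplementedError


def analyze_call(audio: bytes) -> str:
    """Chain the stages: audio -> transcript -> speaker turns -> summary."""
    return summarize(diarize(transcribe(audio)))
```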
### Call Center Use Case Specifics
- Centralized facilities handling large volumes of calls
- Staffed with specially trained agents
- Need for AI products supporting both the customer and employee experience
## Technical Challenges with Large Language Models
### Scale and Performance Issues
- Large models typically exceed 100 billion parameters
- Resource intensive deployment requirements
### Domain Specificity Challenges
- LLMs have broad but shallow knowledge
- Call center conversations have specialized vocabulary, structure, and speaking patterns of their own
### Out-of-Distribution Problems
- Standard LLMs struggle with real call center conversations
- Conversations generated by general-purpose models read as unrealistic next to real call transcripts (see the perplexity sketch below)
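One way to make "out of distribution" concrete is perplexity: a general-purpose model typically assigns much higher perplexity to raw call transcripts (disfluent, lowercase, unpunctuated) than to edited prose. A sketch using Hugging Face transformers; `gpt2` is an illustrative stand-in, not what Deepgram used:

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def perplexity(model_name: str, text: str) -> float:
    """Perplexity of `text` under a causal LM; higher means more out-of-distribution."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return math.exp(loss.item())


edited = "Thank you for calling. How may I help you today?"
raw = "yeah hi um i was calling about uh the the bill i got yesterday"
print(perplexity("gpt2", edited), perplexity("gpt2", raw))
```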
## Solution: Domain-Adapted Small Language Models
### Technical Implementation
- Base model: a compact language model of roughly 500M parameters
- Transfer learning: the base model is fine-tuned on call center transcripts (a training sketch follows this list)
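The talk does not publish training code, so the following is a minimal transfer-learning sketch with the Hugging Face Trainer. The checkpoint, file name, and hyperparameters are assumptions; `gpt2-medium` stands in for a model in the ~500M-parameter class:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "gpt2-medium"  # stand-in checkpoint, roughly the 500M-parameter class
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical corpus: one call transcript per line in transcripts.txt
ds = load_dataset("text", data_files={"train": "transcripts.txt"})
ds = ds.map(lambda batch: tok(batch["text"], truncation=True, max_length=512),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="callcenter-lm",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=ds["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```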
### Production Implementation
- Integrated pipeline demonstration: transcription, diarization, and summarization exposed through a single API (illustrated below)
- Performance metrics: latency and accuracy tracked against larger general-purpose models
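As a rough illustration of the API-first integration, Deepgram's pre-recorded transcription endpoint can run diarization and summarization in the same request. The parameter names below reflect the public docs at one point in time and should be checked against the current API reference:

```python
import requests

DEEPGRAM_API_KEY = "YOUR_KEY"


def transcribe_call(audio_path: str) -> dict:
    """One request: transcription + speaker diarization + summarization."""
    with open(audio_path, "rb") as f:
        resp = requests.post(
            "https://api.deepgram.com/v1/listen",
            params={"diarize": "true", "summarize": "v2", "punctuate": "true"},
            headers={"Authorization": f"Token {DEEPGRAM_API_KEY}",
                     "Content-Type": "audio/wav"},
            data=f,
        )
    resp.raise_for_status()
    return resp.json()
```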
## Key Benefits and Results
### Efficiency Advantages
- Faster inference times
- Lower resource requirements
- Cost-effective deployment
### Quality Improvements
- Better handling of domain-specific conversations
- More realistic conversation generation
- Accurate summarization capabilities
### Production Readiness
- Integrated with existing API infrastructure
- Scalable deployment model
- Real-time processing capabilities
## LLMOps Best Practices Demonstrated
### Model Selection and Optimization
- Conscious choice of smaller, specialized models over larger general models
- Focus on practical deployment constraints
- Balance between model capability and operational efficiency
### Domain Adaptation Strategy
- Effective use of transfer learning
- Domain-specific data utilization
- Targeted performance optimization
### Production Integration
- API-first approach
- Pipeline architecture implementation
- Real-time processing capabilities
- Integration of multiple AI components (ASR, diarization, summarization)
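For the real-time path, streaming audio over a websocket is the usual pattern. A hedged sketch against Deepgram's streaming endpoint, assuming 16 kHz linear-16 PCM chunks; header and message details may differ across API and library versions:

```python
import asyncio
import json

import websockets  # pip install websockets

URI = "wss://api.deepgram.com/v1/listen?encoding=linear16&sample_rate=16000"


async def stream_call(audio_chunks, api_key: str):
    """Send raw audio chunks and print transcripts as they arrive."""
    headers = {"Authorization": f"Token {api_key}"}
    # Note: newer websockets releases use `additional_headers=` instead.
    async with websockets.connect(URI, extra_headers=headers) as ws:

        async def send():
            for chunk in audio_chunks:
                await ws.send(chunk)
            await ws.send(json.dumps({"type": "CloseStream"}))

        async def receive():
            async for message in ws:
                result = json.loads(message)
                alt = result.get("channel", {}).get("alternatives", [{}])[0]
                if alt.get("transcript"):
                    print(alt["transcript"])

        await asyncio.gather(send(), receive())
```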
### Monitoring and Quality Control
- Performance metrics tracking
- Accuracy measurements
- Response time monitoring
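These three bullets map to simple instrumentation: per-request latency timing plus word error rate (WER) against reference transcripts. The WER routine below is a standard word-level Levenshtein implementation, not Deepgram's internal tooling:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between first i ref words and first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(r)][len(h)] / max(len(r), 1)


reference = "thank you for calling how can i help you"
hypothesis = "thank you for calling how may i help"  # stand-in model output
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")

# Latency: wrap any pipeline call with a timer, e.g.
#   import time
#   start = time.perf_counter()
#   result = transcribe_call("call.wav")  # from the API sketch above
#   print(f"latency = {time.perf_counter() - start:.2f}s")
```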
This case study represents a practical approach to implementing LLMs in production, focusing on domain-specific optimization and operational efficiency rather than raw model size. It demonstrates how careful consideration of deployment constraints and domain requirements can lead to more effective real-world AI solutions.