DoorDash implemented an LLM-based chatbot system to improve their Dasher support automation, replacing a traditional flow-based system. The solution uses RAG (Retrieval Augmented Generation) to leverage their knowledge base, along with sophisticated quality control systems including LLM Guardrail for real-time response validation and LLM Judge for quality monitoring. The system successfully handles thousands of support requests daily while achieving a 90% reduction in hallucinations and 99% reduction in compliance issues.
# LLM-Based Dasher Support System at DoorDash
## Overview
DoorDash implemented a sophisticated LLM-based support system to assist their delivery contractors (Dashers) with various issues they encounter during deliveries. The system replaces a traditional flow-based automation system with a more flexible and intelligent solution powered by large language models and RAG (Retrieval Augmented Generation).
## System Architecture and Components
### RAG System Implementation
- Built a RAG system that leverages existing knowledge base articles
- System flow: the Dasher's conversation is condensed into a search query, relevant knowledge base articles are retrieved from a vector store, and the model generates a response grounded in the retrieved articles (a simplified sketch follows)
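A minimal sketch of this kind of RAG flow is shown below. The function names, keyword-scoring retrieval, and canned data are illustrative assumptions rather than DoorDash's actual code; in particular, the keyword scorer stands in for a real vector-store similarity search and the returned prompt stands in for an LLM call.

```python
from dataclasses import dataclass


@dataclass
class KBArticle:
    title: str
    content: str


def summarize_to_query(conversation: list[str]) -> str:
    """Condense the Dasher's conversation into a search query (placeholder heuristic)."""
    return conversation[-1]  # a real system would use an LLM to contextualize the query


def retrieve_articles(query: str, store: list[KBArticle], k: int = 3) -> list[KBArticle]:
    """Naive keyword scoring standing in for a vector-store similarity search."""
    scored = sorted(store, key=lambda a: -sum(w in a.content.lower() for w in query.lower().split()))
    return scored[:k]


def generate_response(query: str, articles: list[KBArticle]) -> str:
    """Build a grounded prompt; a real system would send this to an LLM."""
    context = "\n\n".join(f"{a.title}:\n{a.content}" for a in articles)
    prompt = (
        "Answer the Dasher's question using ONLY the knowledge base articles below.\n\n"
        f"Articles:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return prompt  # replace with an LLM call in practice


if __name__ == "__main__":
    kb = [KBArticle("Late deliveries", "If a delivery is running late, contact the customer through the app.")]
    query = summarize_to_query(["My delivery is running late, what should I do?"])
    print(generate_response(query, retrieve_articles(query, kb)))
```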
### Quality Control Systems
#### LLM Guardrail System
- Online monitoring tool for real-time response evaluation
- Two-tier approach for efficiency: a lightweight first-pass check, escalating to a more expensive LLM-based evaluation only when needed (sketched after this list)
- Checks multiple aspects, including groundedness in the retrieved context and policy compliance
- Achieved 90% reduction in hallucinations
- Reduced severe compliance issues by 99%
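The sketch below illustrates the general shape of such a two-tier guardrail, under the assumption that a cheap check runs first and only failures pay for the expensive LLM-based evaluation. The overlap heuristic and function names are placeholders, not DoorDash's implementation.

```python
from dataclasses import dataclass


@dataclass
class GuardrailVerdict:
    passed: bool
    reason: str


def cheap_overlap_check(response: str, context: str, threshold: float = 0.5) -> bool:
    """Tier 1: fast lexical-overlap proxy for groundedness (stand-in for an embedding similarity check)."""
    resp_tokens = set(response.lower().split())
    ctx_tokens = set(context.lower().split())
    if not resp_tokens:
        return False
    return len(resp_tokens & ctx_tokens) / len(resp_tokens) >= threshold


def llm_groundedness_check(response: str, context: str) -> GuardrailVerdict:
    """Tier 2: placeholder for a more expensive LLM-based evaluation of groundedness and compliance."""
    # In production this would prompt an LLM to judge whether the response is
    # supported by `context` and respects policy; here the sketch simply fails closed.
    return GuardrailVerdict(passed=False, reason="escalated: needs LLM or human review")


def guardrail(response: str, context: str) -> GuardrailVerdict:
    if cheap_overlap_check(response, context):
        return GuardrailVerdict(passed=True, reason="tier-1 overlap check passed")
    # Only responses that fail the cheap check pay for the expensive evaluation.
    return llm_groundedness_check(response, context)
```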
#### LLM Judge System
- Comprehensive quality monitoring system
- Evaluates five key aspects of response quality
- Uses a combination of automated LLM-based evaluation and human review (a simplified judging sketch follows this list)
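A simplified LLM-as-judge sketch is shown below. The aspect names and prompt wording are illustrative assumptions rather than DoorDash's exact criteria, and `call_judge_llm` is a placeholder for a real model call.

```python
import json

# Illustrative quality dimensions; not necessarily the five aspects DoorDash scores.
ASPECTS = ["accuracy", "relevance", "coherence", "grammar", "grounding"]

JUDGE_PROMPT = """You are a quality judge for Dasher support transcripts.
Rate the assistant response from 1-5 on each aspect: {aspects}.
Return JSON like {{"accuracy": 4, ...}}.

Transcript:
{transcript}
"""


def call_judge_llm(prompt: str) -> str:
    """Placeholder for the judge-model call; returns a canned score for the sketch."""
    return json.dumps({aspect: 5 for aspect in ASPECTS})


def judge_transcript(transcript: str) -> dict[str, int]:
    """Score one transcript across all aspects and return the parsed ratings."""
    prompt = JUDGE_PROMPT.format(aspects=", ".join(ASPECTS), transcript=transcript)
    return json.loads(call_judge_llm(prompt))


print(judge_transcript("Dasher: My delivery is late.\nAssistant: Contact the customer through the app."))
```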
## Quality Improvement Pipeline
### Knowledge Base Optimization
- Regular reviews and updates to eliminate misleading information
- Development of developer-friendly KB management portal
- Continuous expansion of article coverage
### Retrieval Enhancement
- Focus on query contextualization, i.e., rewriting conversational messages into standalone search queries (sketched below)
- Vector store optimization
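One common way to implement query contextualization is to rewrite the latest message into a standalone query before searching the vector store, roughly as sketched below. The prompt text and the fallback heuristic are assumptions for illustration, not DoorDash's implementation.

```python
CONTEXTUALIZE_PROMPT = """Rewrite the Dasher's last message as a standalone search query,
resolving pronouns and references using the conversation history.

History:
{history}

Last message: {last_message}
Standalone query:"""


def contextualize_query(history: list[str], last_message: str, llm=None) -> str:
    """Turn a context-dependent message into a self-contained retrieval query."""
    prompt = CONTEXTUALIZE_PROMPT.format(history="\n".join(history), last_message=last_message)
    if llm is None:
        # Placeholder fallback: append recent history so the query carries its own context.
        return f"{last_message} ({' '.join(history[-2:])})"
    return llm(prompt)


# Example: "It still hasn't arrived" becomes a query that mentions the missing hot bag.
print(contextualize_query(["I ordered a new hot bag last week"], "It still hasn't arrived"))
```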
### Prompt Engineering Best Practices
- Breaking down complex prompts into manageable components
- Avoiding negative language in prompts
- Implementing chain-of-thought prompting
- Running independent sub-prompts in parallel where possible (see the sketch after this list)
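The sketch below shows how decomposed prompts might be fanned out concurrently with `asyncio`, so the slowest sub-call rather than the sum of all calls determines latency, with a chain-of-thought instruction on the drafting step. The sub-prompt names and the `call_llm` stub are illustrative assumptions.

```python
import asyncio


async def call_llm(prompt: str) -> str:
    """Placeholder async LLM call; a real system would hit a model endpoint here."""
    await asyncio.sleep(0.1)  # simulate network latency
    return f"[answer to: {prompt[:40]}...]"


async def answer_support_request(conversation: str, kb_context: str) -> str:
    # Break one large instruction into focused sub-prompts that can run independently...
    subtasks = {
        "intent": f"Classify the Dasher's issue in one phrase:\n{conversation}",
        "policy": f"List the policy constraints relevant to this context:\n{kb_context}",
        "draft": f"Think step by step, then draft a reply grounded in:\n{kb_context}\n\n{conversation}",
    }
    # ...and execute them concurrently so total latency tracks the slowest call.
    responses = await asyncio.gather(*(call_llm(p) for p in subtasks.values()))
    results = dict(zip(subtasks, responses))
    return results["draft"]


print(asyncio.run(answer_support_request("My delivery address looks wrong", "KB: address change policy ...")))
```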
### Regression Prevention
- Automated testing framework, analogous to software unit testing (see the sketch after this list)
- Maintains prompt quality and model performance
- Continuous addition of new test cases
- Blocking deployment of failing prompts
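A prompt-regression suite in this spirit might look like the `pytest` sketch below, wired into CI so a failing case blocks deployment. The canned bot responses and test phrases are hypothetical, and `run_support_bot` stands in for the production prompt plus model.

```python
# test_prompt_regression.py -- run with `pytest`; wire into CI so failures block deployment.
import pytest


def run_support_bot(message: str) -> str:
    """Stand-in for the production prompt + model; returns canned answers for the sketch."""
    canned = {
        "How do I change my bank account?": "You can update your bank details in the Dasher app under Earnings.",
        "My delivery is late, what should I do?": "If a delivery is running late, contact the customer through the app.",
    }
    return canned.get(message, "")


REGRESSION_CASES = [
    # (user message, phrase the response must contain, phrase it must never contain)
    ("How do I change my bank account?", "dasher app", "guaranteed payment"),
    ("My delivery is late, what should I do?", "contact the customer", "deactivated"),
]


@pytest.mark.parametrize("message,must_contain,must_not_contain", REGRESSION_CASES)
def test_prompt_regression(message, must_contain, must_not_contain):
    response = run_support_bot(message).lower()
    assert must_contain in response, "expected grounding phrase missing"
    assert must_not_contain not in response, "known-bad phrasing reappeared"
```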
## Technical Challenges and Solutions
### RAG System Challenges
- Groundedness and relevance issues
- Context summarization accuracy
- Language consistency
- Function call consistency
- Latency management
### Quality Monitoring
- Automated evaluation system with human oversight
- Continuous calibration between automated and human reviews (a simple calibration sketch follows this list)
- Regular transcript sampling and analysis
- Integration of feedback into improvement pipeline
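Calibration between automated and human reviews can be as simple as tracking agreement on a sampled batch of transcripts, as in the sketch below. The `Review` structure and rate definitions are illustrative assumptions, not DoorDash's metrics.

```python
from dataclasses import dataclass


@dataclass
class Review:
    transcript_id: str
    automated_pass: bool
    human_pass: bool


def calibration_report(reviews: list[Review]) -> dict[str, float]:
    """Compare automated verdicts against human verdicts on a sampled batch of transcripts."""
    n = len(reviews)
    agree = sum(r.automated_pass == r.human_pass for r in reviews)
    false_pass = sum(r.automated_pass and not r.human_pass for r in reviews)   # judge too lenient
    false_fail = sum((not r.automated_pass) and r.human_pass for r in reviews)  # judge too strict
    return {
        "agreement_rate": agree / n,
        "false_pass_rate": false_pass / n,
        "false_fail_rate": false_fail / n,
    }


sample = [Review("t1", True, True), Review("t2", True, False), Review("t3", False, False)]
print(calibration_report(sample))
```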
## Production Deployment and Scaling
### System Integration
- Seamless integration with existing support infrastructure
- Fallback mechanisms to human agents when needed (routing logic sketched below)
- API integration for system actions
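A minimal sketch of such routing logic is shown below. The confidence threshold and the `requires_account_action` flag are hypothetical parameters, used only to illustrate when a conversation falls back from the bot to a human agent.

```python
from enum import Enum, auto


class Route(Enum):
    SEND_BOT_REPLY = auto()
    ESCALATE_TO_AGENT = auto()


def route_response(guardrail_passed: bool, confidence: float, requires_account_action: bool,
                   confidence_floor: float = 0.7) -> Route:
    """Decide whether the bot replies directly or hands off to a human agent."""
    if not guardrail_passed:
        return Route.ESCALATE_TO_AGENT   # never ship a response the guardrail rejected
    if requires_account_action:
        return Route.ESCALATE_TO_AGENT   # actions outside the bot's API scope go to a human
    if confidence < confidence_floor:
        return Route.ESCALATE_TO_AGENT   # low-confidence cases fall back to existing support flows
    return Route.SEND_BOT_REPLY


print(route_response(guardrail_passed=True, confidence=0.9, requires_account_action=False))
```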
### Performance Metrics
- Handles thousands of support requests daily
- Significant reduction in response time compared to human agents
- High automation rate while maintaining quality
- Continuous monitoring of key metrics such as hallucination rate, severe compliance issues, and automation rate
### Future Improvements
- Expanding automated handling of complex scenarios
- Enhanced data collection and analysis
- Continuous model and ontology improvements
- Regular system updates based on collected insights
## Key Learnings and Best Practices
- Importance of multi-layered quality control
- Value of combining automated and human oversight
- Need for comprehensive testing frameworks
- Benefits of iterative improvement process
- Significance of proper prompt engineering
- Critical role of accurate knowledge base maintenance
The DoorDash implementation demonstrates a sophisticated approach to LLMOps, combining multiple quality control systems with practical deployment considerations. Their success in reducing hallucinations and compliance issues while maintaining high throughput showcases the effectiveness of their layered approach to quality control and continuous improvement.