Doordash: LLMs for Enhanced Search Retrieval and Query Understanding

LLMOps Database

Company

Doordash

Title

LLMs for Enhanced Search Retrieval and Query Understanding

Industry

E-commerce

Link

https://careersatdoordash.com/blog/how-doordash-leverages-llms-for-better-search-retrieval/

Year

2024

Summary (short)

Doordash implemented an advanced search system using LLMs to better understand and process complex food delivery search queries. They combined LLMs with knowledge graphs for query segmentation and entity linking, using retrieval-augmented generation (RAG) to constrain outputs to their controlled vocabulary. The system improved popular dish carousel trigger rates by 30%, increased whole page relevance by over 2%, and led to higher conversion rates while maintaining high precision in query understanding.

This case study details how Doordash implemented LLMs to enhance their search retrieval system, particularly focusing on understanding complex user queries in their food delivery platform. The company faced a unique challenge in handling compound search queries that combine multiple requirements, such as "vegan chicken sandwich," where results need to respect strict dietary preferences while maintaining relevance.

The implementation demonstrates a sophisticated approach to integrating LLMs into a production search system. Rather than completely replacing traditional search methods, Doordash created a hybrid system that leverages LLMs' strengths while mitigating their weaknesses.

The system architecture consists of two main components: document processing and query understanding. For document processing, Doordash built knowledge graphs for both food and retail items, creating rich metadata structures. This foundation was crucial for maintaining consistency and accuracy in the system.

The query understanding pipeline uses LLMs in two key ways:

* First, for query segmentation, where LLMs break down complex queries into meaningful segments. To prevent hallucinations, they constrained the LLM outputs using a controlled vocabulary derived from their knowledge graph taxonomies. Instead of generic segmentation, the system categorizes segments into specific taxonomies like cuisines, dish types, and dietary preferences.
* Second, for entity linking, where query segments are mapped to concepts in their knowledge graph.

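
The controlled-vocabulary constraint on query segmentation can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the taxonomy names, the vocabulary entries, and the `validate_segments` helper are hypothetical, and the LLM call itself is omitted. The sketch only shows one plausible way to discard segments that fall outside a controlled vocabulary after the model has responded.

```python
from typing import Dict, List, Set

# Hypothetical controlled vocabulary. In the case study the real taxonomies
# are derived from the knowledge graph; these entries are made up.
CONTROLLED_VOCAB: Dict[str, Set[str]] = {
    "dietary_preference": {"vegan", "vegetarian", "gluten-free"},
    "dish_type": {"chicken sandwich", "pizza", "burrito"},
    "cuisine": {"italian", "mexican", "thai"},
}


def validate_segments(llm_output: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only segments whose taxonomy and value are in the vocabulary."""
    validated: Dict[str, List[str]] = {}
    for taxonomy, values in llm_output.items():
        allowed = CONTROLLED_VOCAB.get(taxonomy)
        if allowed is None:
            continue  # unknown taxonomy: treat as a hallucination and drop it
        normalized = [v.strip().lower() for v in values]
        kept = [v for v in normalized if v in allowed]
        if kept:
            validated[taxonomy] = kept
    return validated


# Pretend LLM output for the query "vegan chicken sandwich", including one
# invented taxonomy and one out-of-vocabulary value.
raw = {
    "dietary_preference": ["Vegan"],
    "dish_type": ["chicken sandwich"],
    "cooking_style": ["grilled"],  # taxonomy not in the vocabulary
    "cuisine": ["martian"],        # value not in the vocabulary
}
segments = validate_segments(raw)
print(segments)
```

With this kind of filter, a dietary attribute such as "vegan" only reaches retrieval if it matches a known taxonomy value, which is one way to keep precision high for critical attributes.
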
They implemented a sophisticated RAG (retrieval-augmented generation) approach to prevent hallucinations and ensure accuracy:

* Generate embeddings for search queries and taxonomy concepts
* Use approximate nearest neighbor (ANN) search to retrieve the top 100 relevant taxonomy concepts
* Prompt the LLM to link queries to specific taxonomies while constraining choices to the retrieved concepts

The system includes several important production considerations:

* Post-processing steps to prevent hallucinations
* Manual audits of processed queries to ensure quality
* Regular evaluation of system precision, especially for critical attributes like dietary preferences
* Integration with existing ranking systems

The team carefully considered the trade-offs between memorization and generalization. While LLMs provided excellent results for batch processing of known queries, they recognized the challenges of scaling to handle new, unseen queries. Their solution was to combine LLM-based processing with traditional methods that generalize better to new queries, including statistical models and embedding retrieval.

The results were significant and measurable:

* 30% increase in popular dish carousel trigger rates
* 2% improvement in whole page relevance for dish-intent queries
* 1.6% additional improvement in whole page relevance after ranker retraining
* Increased same-day conversions

The implementation shows careful attention to production concerns:

* Handling of edge cases and long-tail queries
* System maintenance and updates
* Feature staleness considerations
* Integration with existing systems
* Performance monitoring and metrics

Doordash's approach to preventing LLM hallucinations is particularly noteworthy.
They used a combination of:

* Controlled vocabularies
* Constrained outputs through RAG
* Post-processing validation
* Manual audits
* Integration with knowledge graphs

The case study demonstrates several LLMOps best practices:

* Clear evaluation metrics and testing methodology
* Hybrid approach combining multiple techniques
* Careful consideration of production scalability
* Integration with existing systems and data structures
* Continuous monitoring and improvement processes

An interesting aspect of their implementation is how they handled the cold start problem for new queries and items. By combining memorization-based LLM approaches with generalization-based traditional methods, they created a robust system that can handle both common and novel queries effectively.

The system's architecture also shows careful consideration of latency and performance requirements, using batch processing where appropriate while maintaining real-time capabilities through their hybrid approach. This demonstrates a practical understanding of the constraints and requirements of production systems.

Future directions mentioned in the case study indicate ongoing work to expand the system's capabilities, including query rewriting, personalization, and improved understanding of consumer behavior. This suggests a mature approach to system evolution and improvement, with clear roadmaps for future development.

Overall, this case study provides a comprehensive example of how to successfully integrate LLMs into a production system while maintaining reliability, accuracy, and performance. The hybrid approach and careful attention to practical constraints make this a particularly valuable reference for similar implementations.
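
As an illustration of the RAG-based entity linking steps listed earlier (embed, retrieve the nearest taxonomy concepts, constrain the LLM to the retrieved set), here is a minimal sketch under stated assumptions. The trigram `embed` function is a toy stand-in for a real embedding model, the exact top-k search stands in for an ANN index, and the prompt wording and all function names are hypothetical rather than Doordash's actual implementation. The LLM call is omitted.

```python
import math
from collections import Counter
from typing import List, Optional


def embed(text: str) -> Counter:
    """Toy embedding: character-trigram counts (stand-in for a real model)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_candidates(segment: str, concepts: List[str], k: int = 100) -> List[str]:
    """Exact top-k by cosine similarity; a production system would use ANN."""
    query_vec = embed(segment)
    ranked = sorted(concepts, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(segment: str, candidates: List[str]) -> str:
    """Hypothetical prompt that restricts the LLM to the retrieved concepts."""
    options = "\n".join(f"- {c}" for c in candidates)
    return (
        f"Link the query segment '{segment}' to exactly one concept below. "
        f"Answer with the concept text only, or NONE.\n{options}"
    )


def accept_link(llm_answer: str, candidates: List[str]) -> Optional[str]:
    """Post-processing: reject any answer that is not a retrieved candidate."""
    answer = llm_answer.strip()
    return answer if answer in candidates else None


# Made-up taxonomy concepts for demonstration only.
concepts = ["Vegan", "Vegetarian", "Chicken Sandwich", "Fried Chicken", "Pizza"]
candidates = retrieve_candidates("vegan", concepts, k=3)
prompt = build_prompt("vegan", candidates)
print(prompt)
```

The `accept_link` check mirrors the post-processing validation idea: even if the model answers with a concept that was never offered, the link is discarded rather than passed downstream.
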
