DoorDash: Building a Food Delivery Product Knowledge Graph with LLMs

Company

DoorDash

Title

Building a Food Delivery Product Knowledge Graph with LLMs

Industry

E-commerce

Link

https://doordash.engineering/2024/04/23/building-doordashs-product-knowledge-graph-with-large-language-models/

Year

Summary (short)

DoorDash leveraged LLMs to transform their retail catalog management by implementing three key systems: an automated brand extraction pipeline that identifies and deduplicates new brands at scale; an organic product labeling system combining string matching with LLM reasoning to improve personalization; and a generalized attribute extraction process using LLMs with RAG to accelerate annotation for entity resolution across merchants. These innovations significantly improved product discoverability and personalization while reducing the manual effort that previously caused long turnaround times and high costs.
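The organic labeling approach described above, string matching combined with LLM reasoning, can be illustrated with a minimal sketch. It is not DoorDash's implementation: the function name `label_organic`, the regular expressions, and the `llm_is_organic` callable (a stand-in for whatever model call a real system would make) are all assumptions made for illustration. The idea is that a cheap string match settles the obvious cases and only the remaining items are passed to the LLM.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns: a whole-word match for "organic" and a simple negation check.
_ORGANIC = re.compile(r"\borganic\b")
_NEGATED = re.compile(r"\b(?:non|not)[\s-]+organic\b")


def label_organic(
    title: str,
    description: str = "",
    llm_is_organic: Optional[Callable[[str], bool]] = None,
) -> dict:
    """Label a product as organic using string matching first, then an optional LLM."""
    text = f"{title} {description}".lower()

    # Stage 1: string matching handles explicit mentions without a model call.
    if _NEGATED.search(text):
        return {"organic": False, "source": "string_match"}
    if _ORGANIC.search(text):
        return {"organic": True, "source": "string_match"}

    # Stage 2: defer to the LLM (an assumed callable) for items with no explicit mention.
    if llm_is_organic is not None:
        return {"organic": bool(llm_is_organic(text)), "source": "llm"}

    # No evidence either way and no model available: default to not organic.
    return {"organic": False, "source": "default"}
```

Recording the `source` of each label makes it possible to audit how many decisions came from the string match and how many relied on the model.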

Tags

DoorDash, a leading food delivery platform, showcases an innovative application of large language models in building and maintaining a product knowledge graph. This case study explores how LLMs can be leveraged to structure and organize vast amounts of food delivery-related data into a meaningful and interconnected knowledge representation system.

The Context and Challenge:

DoorDash operates in a complex domain where millions of menu items, restaurants, cuisines, and food attributes need to be organized in a way that enables efficient search, discovery, and recommendations. Traditional approaches to building knowledge graphs often rely on manual curation or rigid rule-based systems, which can be both time-consuming and inflexible when dealing with the dynamic nature of the food delivery business.

Building a knowledge graph for a food delivery platform presents several unique challenges:

* Menu items and descriptions vary greatly across restaurants
* The same dish might be described differently by different establishments
* Ingredients and preparation methods need to be accurately extracted and categorized
* Relationships between different food items, cuisines, and dietary restrictions must be established
* The system needs to handle continuous updates as restaurants modify their menus

The LLM-Powered Solution:

While we don't have the complete implementation details from just the title, we can reasonably infer that DoorDash likely employs LLMs in several key ways to build and maintain their knowledge graph:

Information Extraction:

* LLMs can process unstructured menu descriptions and extract structured information about dishes, ingredients, and preparation methods
* They can identify and standardize different variations of the same dish across multiple restaurants
* The models can understand and categorize dietary attributes, spice levels, and portion sizes from natural language descriptions

Relationship Mapping:

* LLMs can help establish semantic relationships between different entities in the knowledge graph
* They can identify similar dishes across different cuisines and create meaningful connections
* The models can understand hierarchical relationships (e.g., that a "California Roll" is a type of "Sushi" which is part of "Japanese Cuisine")

Data Enrichment:

* LLMs can generate additional metadata and attributes for menu items based on their descriptions
* They can help standardize and normalize varying descriptions of similar items
* The models can suggest related items and complementary dishes based on semantic understanding

Production Considerations:

Implementing such a system in production requires careful attention to several LLMOps aspects:

Data Quality and Validation:

* Input validation systems to ensure menu descriptions are properly formatted
* Quality checks on LLM outputs to ensure extracted information is accurate
* Feedback loops to improve model performance based on real-world usage

Scalability and Performance:

* Efficient processing of large volumes of menu data
* Batch processing capabilities for periodic updates
* Real-time processing for new menu additions

Error Handling and Monitoring:

* Systems to detect and flag unusual or potentially incorrect extractions
* Monitoring of model performance and accuracy over time
* Fallback mechanisms when LLM processing fails or produces low-confidence results

Integration Considerations:

* APIs for accessing and querying the knowledge graph
* Services for updating and maintaining the graph structure
* Integration with search and recommendation systems

The DoorDash knowledge graph likely serves as a foundational component for several key features:

Search Enhancement:

* More accurate understanding of user queries
* Better matching of search terms with relevant menu items
* Understanding of implicit relationships between food items

Recommendation Systems:

* Suggesting similar or complementary items
* Understanding user preferences across different cuisines
* Identifying substitute items for out-of-stock dishes

Menu Understanding:

* Better categorization of menu items
* Understanding of dietary restrictions and preferences
* Identification of common ingredients and preparation methods

Future Potential and Challenges:

The use of LLMs in building knowledge graphs presents both opportunities and challenges:

Opportunities:

* Continuous improvement through learning from new data
* Ability to handle edge cases and unusual descriptions
* Flexible adaptation to new cuisines and food trends

Challenges:

* Ensuring consistent accuracy across different types of cuisine
* Handling multilingual menu descriptions
* Maintaining performance at scale
* Managing computational costs

Best Practices and Lessons:

While specific details aren't provided in the title alone, we can infer several best practices for similar implementations:

* Implement robust validation systems for LLM outputs
* Use human-in-the-loop processes for quality assurance
* Maintain clear versioning of the knowledge graph
* Implement monitoring systems for data quality
* Regular evaluation of model performance and accuracy

The use of LLMs in building DoorDash's product knowledge graph represents a significant advancement in how food delivery platforms can organize and understand their vast product catalogs. This approach likely enables more sophisticated search and recommendation capabilities while reducing the manual effort required for maintaining accurate product information. The system demonstrates how LLMs can be effectively deployed in production to solve complex data organization challenges in the e-commerce space.
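The validation, fallback, and human-in-the-loop practices listed above can be made concrete with a short sketch. Everything in it is an assumption for illustration rather than a description of DoorDash's system: the required fields (`name`, `cuisine`, `ingredients`), the self-reported `confidence` key, the 0.8 threshold, and the three status values are all hypothetical. The sketch only checks the LLM's raw output and routes it; it does not call a model.

```python
import json
from dataclasses import dataclass, field

# Hypothetical schema and threshold, chosen only for this example.
REQUIRED_FIELDS = ("name", "cuisine", "ingredients")
MIN_CONFIDENCE = 0.8


@dataclass
class ExtractionResult:
    status: str  # "accepted", "needs_review", or "failed"
    data: dict = field(default_factory=dict)
    reason: str = ""


def validate_extraction(raw_llm_output: str) -> ExtractionResult:
    """Check a raw LLM response and decide whether to accept, review, or reject it."""
    # Quality check 1: the output must be parseable JSON.
    try:
        data = json.loads(raw_llm_output)
    except json.JSONDecodeError:
        return ExtractionResult("failed", reason="invalid JSON")
    if not isinstance(data, dict):
        return ExtractionResult("failed", reason="expected a JSON object")

    # Quality check 2: every required field must be present.
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        return ExtractionResult("failed", data, f"missing fields: {', '.join(missing)}")

    # Fallback: low or absent confidence goes to a human review queue.
    confidence = data.get("confidence", 0.0)
    if not isinstance(confidence, (int, float)) or confidence < MIN_CONFIDENCE:
        return ExtractionResult("needs_review", data, "low confidence")

    return ExtractionResult("accepted", data)
```

In a pipeline built this way, "failed" results could trigger a retry or a rule-based fallback, while "needs_review" results would be queued for annotators, whose corrections could feed the feedback loop mentioned earlier.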
