Choco's journey implementing LLMs in production offers valuable insights into building and scaling AI applications effectively. The case study focuses on Choco AI, a system designed to streamline and automate order processing for food and beverage distributors, and demonstrates practical application of LLMOps principles in a real-world business context.
The company faced a complex challenge: automating the interpretation and processing of unstructured orders coming through various channels (email, voicemail, SMS, WhatsApp, fax) into a standardized format for ERP system integration. The technical complexity was amplified by the need to handle context-dependent product identification, such as matching generic product requests (e.g., "2 kilos of tomatoes") to specific SKUs from catalogs containing dozens of variants.
Key LLMOps Implementation Aspects:
**Modular Architecture Design**
The team deliberately moved away from a single, catch-all LLM prompt, despite its appeal during the initial hackathon phase. Instead, they implemented a modular architecture in which different LLMs and ML models handle specific tasks (sketched after the list below). This architectural decision reflects mature LLMOps practices:
* Breaking down complex workflows into smaller, testable components
* Assigning specific responsibilities to different models (e.g., separate models for transcription, correction, and information extraction)
* Enabling independent optimization and maintenance of each component
* Facilitating easier debugging and performance monitoring
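A minimal sketch of what such a modular pipeline could look like is shown below; the stage boundaries, function names, and `NotImplementedError` stubs are illustrative assumptions rather than Choco's actual implementation:

```python
# The stage boundaries below (transcription -> correction -> extraction -> matching)
# are inferred from the tasks described in the case study; the function names and
# signatures are illustrative, not Choco's actual code.

def transcribe(audio_path: str) -> str:
    """Speech-to-text for voicemail orders; in practice a dedicated ASR model call."""
    raise NotImplementedError("plug in the transcription model here")

def correct(text: str) -> str:
    """Clean up transcription/OCR noise with a narrowly scoped correction model."""
    raise NotImplementedError("plug in the correction model here")

def extract_line_items(text: str) -> list[dict]:
    """Pull structured line items (product description, quantity, unit) out of free text."""
    raise NotImplementedError("plug in the extraction model here")

def match_to_catalog(items: list[dict], customer_id: str) -> list[dict]:
    """Resolve generic product descriptions to customer-specific SKUs."""
    raise NotImplementedError("plug in catalog matching here")

def process_voicemail_order(audio_path: str, customer_id: str) -> list[dict]:
    """End-to-end flow composed from the stages above; each stage can be evaluated,
    swapped, or debugged independently of the others."""
    text = transcribe(audio_path)
    text = correct(text)
    items = extract_line_items(text)
    return match_to_catalog(items, customer_id)
```

Because each stage has a narrow contract, a regression in, say, catalog matching can be isolated and fixed without retesting transcription or extraction.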
**Comprehensive Evaluation Framework**
Choco implemented a robust evaluation pipeline that embodies several LLMOps best practices:
* Maintaining extensive test datasets for each AI/ML task
* Implementing component-specific metrics (e.g., Word Error Rate for transcription, as sketched after this list)
* Testing both individual components and end-to-end system performance
* Enabling rapid evaluation of new models or updates (demonstrated by integrating GPT-4 within a week of its release)
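For the transcription component, a component-level evaluation might look like the sketch below. Word Error Rate is the metric named in the case study, while the test-set schema and the `model.transcribe` interface are assumptions made for illustration:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def evaluate_transcription(model, test_set: list[dict]) -> float:
    """Average WER over a labeled test set of {"audio_path", "reference_text"} items
    (the schema and the model.transcribe interface are assumed for illustration)."""
    scores = [
        word_error_rate(ex["reference_text"], model.transcribe(ex["audio_path"]))
        for ex in test_set
    ]
    return sum(scores) / len(scores)
```

Analogous harnesses for extraction and matching, plus an end-to-end suite, are what make a one-week model swap such as the GPT-4 integration feasible to validate.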
**Data Quality and Human Labeling**
The company's approach to data quality and labeling demonstrates sophisticated LLMOps practices:
* Building custom internal tools for efficient labeling processes
* Leveraging domain expertise through their Customer Success teams rather than relying solely on external agencies
* Maintaining strict data privacy practices while building large-scale labeled datasets
* Creating user-friendly interfaces for human review and correction
**Continuous Learning and Improvement System**
Choco implemented a sophisticated approach to model improvement:
* Designing the system to capture and utilize user feedback through the review interface (a sketch of this loop follows the list)
* Building internal tools for error flagging and correction
* Implementing automated learning mechanisms to improve accuracy over time
* Measuring both out-of-the-box accuracy ("Day-0 performance") and how quickly accuracy improves as feedback accumulates (learning-curve metrics)
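A simple way to close this loop is to log every reviewer correction and derive learning-curve metrics from that log; the JSONL storage and field names below are assumptions for illustration, not Choco's internal schema:

```python
import json
import time
from pathlib import Path

FEEDBACK_LOG = Path("feedback_log.jsonl")  # assumed storage; production would use a database

def record_correction(customer_id: str, raw_request: str,
                      predicted_sku: str, corrected_sku: str) -> None:
    """Persist every reviewer correction so it can feed future predictions
    and be aggregated into learning-curve metrics."""
    event = {
        "ts": time.time(),
        "customer_id": customer_id,
        "raw_request": raw_request,
        "predicted_sku": predicted_sku,
        "corrected_sku": corrected_sku,
        "was_correct": predicted_sku == corrected_sku,
    }
    with FEEDBACK_LOG.open("a") as f:
        f.write(json.dumps(event) + "\n")

def accuracy_by_order_index(events: list[dict]) -> dict[int, float]:
    """Learning-curve view: accuracy as a function of how many orders a customer
    has already placed (index 0 approximates "Day-0 performance")."""
    per_customer: dict[str, list[dict]] = {}
    for e in sorted(events, key=lambda e: e["ts"]):
        per_customer.setdefault(e["customer_id"], []).append(e)
    buckets: dict[int, list[bool]] = {}
    for history in per_customer.values():
        for i, e in enumerate(history):
            buckets.setdefault(i, []).append(e["was_correct"])
    return {i: sum(v) / len(v) for i, v in buckets.items()}
```

Bucketing accuracy by how many orders a customer has already placed separates Day-0 performance from the gains contributed by accumulated feedback.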
**Production Deployment Considerations**
Their production deployment strategy shows careful consideration of real-world constraints:
* Implementing a human review interface for initial orders to ensure accuracy
* Building self-service error resolution mechanisms to reduce dependency on the AI engineering team
* Creating comprehensive observability systems for monitoring performance (a minimal sketch follows this list)
* Designing the system to scale as hundreds of new customers are onboarded
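As a minimal illustration of per-stage observability (assuming the modular pipeline sketched earlier), each stage can be wrapped so that its latency and failures are logged under its own name; a production setup would emit metrics and traces rather than log lines:

```python
import functools
import logging
import time

logger = logging.getLogger("order_pipeline")  # logger name is illustrative

def observed(stage_name: str):
    """Wrap a pipeline stage so latency and failures are recorded per stage,
    a minimal stand-in for a full observability stack (metrics, traces, alerts)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
                logger.info("stage=%s status=ok latency_ms=%.1f",
                            stage_name, (time.perf_counter() - start) * 1000)
                return result
            except Exception:
                logger.exception("stage=%s status=error latency_ms=%.1f",
                                 stage_name, (time.perf_counter() - start) * 1000)
                raise
        return wrapper
    return decorator

@observed("extract_line_items")
def extract_line_items(text: str) -> list[dict]:
    """Stage body omitted; only the instrumentation pattern is shown here."""
    ...
```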
**Notable Technical Decisions**
* Choosing in-context learning over fine-tuning for continuous improvement
* Providing dynamic, per-customer context to LLMs for personalized responses (see the sketch after this list)
* Building separate interfaces for internal and customer-facing interactions
* Creating automated feedback loops for continuous model improvement
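A sketch of how dynamic context provision can substitute for fine-tuning is shown below: the prompt is assembled per customer from their catalog slice and previously confirmed corrections. The prompt wording, field names, and `max_examples` cutoff are assumptions, not Choco's actual prompts:

```python
def build_matching_prompt(raw_request: str, customer_id: str,
                          catalog: list[dict], past_corrections: list[dict],
                          max_examples: int = 5) -> str:
    """Assemble a per-customer prompt: instead of fine-tuning, the model sees the
    relevant catalog slice plus recent reviewer corrections as few-shot examples.
    All field names and wording here are illustrative assumptions."""
    catalog_lines = "\n".join(f"- {p['sku']}: {p['name']}" for p in catalog)
    examples = "\n".join(
        f'Request: "{c["raw_request"]}" -> SKU: {c["corrected_sku"]}'
        for c in past_corrections[-max_examples:]  # most recent corrections for this customer
    )
    return (
        "You match free-text order lines to SKUs from the catalog below.\n\n"
        f"Catalog:\n{catalog_lines}\n\n"
        f"Previously confirmed matches for this customer:\n{examples or '(none yet)'}\n\n"
        f'Order line: "{raw_request}"\n'
        "Answer with a single SKU."
    )
```

Because the corrections come straight from the feedback log, accuracy for a given customer can improve with every reviewed order, without any model retraining.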
**Results and Impact**
The implementation has shown significant business impact:
* Achieving over 95% prediction accuracy in product matching
* Enabling customers to reduce manual order entry time by 60%
* Allowing processing of 50% more orders without additional staffing
* Successfully scaling to hundreds of new customers while maintaining system quality
**Challenges and Lessons**
The case study highlights several important lessons for LLMOps practitioners:
* The importance of breaking down complex tasks into manageable, testable components
* The value of comprehensive evaluation pipelines in enabling rapid iteration
* The critical role of human expertise in maintaining system quality
* The need for both automated and manual feedback mechanisms
This case study represents a mature implementation of LLMOps principles, showing how careful system design, comprehensive testing, and continuous improvement mechanisms can create a robust AI system that delivers real business value. Their approach to modularity, evaluation, and continuous learning provides valuable insights for other organizations looking to implement LLMs in production environments.