Company
Whatnot
Title
Enhancing E-commerce Search with GPT-based Query Expansion
Industry
E-commerce
Year
2023
Summary (short)
Whatnot improved their e-commerce search functionality by implementing a GPT-based query expansion system to handle misspellings and abbreviations. The system processes search queries offline through data collection, tokenization, and GPT-based correction, storing expansions in a production cache for low-latency serving. This approach reduced irrelevant content by more than 50% compared to their previous method when handling misspelled queries and abbreviations.
# Whatnot's GPT Integration for E-commerce Search Enhancement

## Overview

Whatnot, an e-commerce platform, implemented an LLM-based solution to enhance its search functionality by addressing common user input issues such as misspellings and abbreviations. The case study demonstrates a practical approach to integrating GPT into a production search system while meeting low-latency requirements.

## Technical Implementation

### System Architecture

The implementation follows a hybrid approach combining offline processing with real-time serving (illustrative code sketches for the main stages appear at the end of this writeup):

- **Offline Processing Pipeline**
- **Runtime System**

### Data Collection and Processing

The system implements comprehensive logging at multiple levels:

- **Log Collection Layers**
- **Data Processing Steps**

### LLM Integration

- **GPT Implementation**
- **Production Considerations**

### Caching Strategy

- **Cache Design**

### Query Processing Pipeline

- **Runtime Flow**

## Production Deployment

### Performance Optimization

- Sub-250ms latency target for search operations
- Offline GPT processing to avoid runtime delays
- Efficient cache lookup mechanisms
- Irrelevant content reduced by over 50% for problem queries

### Monitoring and Evaluation

- Search engagement metrics tracking
- Result relevance assessment
- Performance impact measurement
- User behavior analysis across search sessions

## Technical Challenges and Solutions

### Current Limitations

- Unidirectional expansion (e.g., "sdcc" → "san diego comic con" works, but not vice versa)
- Token-level processing constraints
- Real-time semantic search challenges

### Proposed Improvements

- **Short-term Enhancements**
- **Future Initiatives**

## Implementation Lessons

### Success Factors

- Offline processing for heavy computation
- Caching strategy for low latency
- Comprehensive logging and analysis
- Careful prompt engineering for GPT

### Best Practices

- Multi-level search session tracking
- Token frequency analysis for processing
- Confidence-based expansion application
- Hybrid online/offline architecture

## Technical Infrastructure

### Data Pipeline

- Search query logging system
- Data warehouse integration
- Token processing pipeline
- GPT integration layer

### Production Systems

- Key-value store for expansions
- Query processing service
- Search backend integration
- Monitoring and analytics

## Future Roadmap

### Planned Enhancements

- Semantic search capabilities
- Advanced entity extraction
- Attribute validation
- Content understanding features

### Technical Considerations

- Real-time model inference requirements
- Production-latency ANN index infrastructure
- Knowledge graph integration
- Automated attribute tagging

This case study demonstrates a pragmatic approach to integrating LLMs into production systems, balancing the power of GPT with real-world performance requirements. The hybrid architecture, combining offline processing with cached serving, provides a blueprint for similar implementations in other e-commerce platforms.
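The case study describes the pipeline at a high level without code. As a rough illustration of the offline data-collection step, the sketch below tokenizes logged queries and counts token frequencies so that frequent-but-unusual tokens can be sent to GPT for correction. The function names, the frequency threshold, and the tokenization rule are all assumptions, not Whatnot's actual implementation:

```python
from collections import Counter
import re

def tokenize(query: str) -> list[str]:
    """Lowercase and split a raw search query into simple word tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())

def token_frequencies(logged_queries: list[str]) -> Counter:
    """Count how often each token appears across logged search queries."""
    counts = Counter()
    for query in logged_queries:
        counts.update(tokenize(query))
    return counts

def candidate_tokens(counts: Counter, min_count: int = 5) -> list[str]:
    """Select tokens that occur often enough to be worth sending to GPT.

    Frequent-but-unrecognized tokens are more likely misspellings or
    abbreviations (e.g. "pokmon", "sdcc") than one-off noise.
    """
    return [tok for tok, n in counts.items() if n >= min_count]
```

Filtering by frequency keeps the GPT batch small and focused on tokens that actually affect many searches, which matters when the expansion step runs as a recurring offline job.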
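For the GPT-based correction itself, the write-up does not disclose the prompt or model. A minimal sketch using the OpenAI Python SDK might look like the following, where the prompt wording, the model choice, and the JSON output contract are illustrative assumptions:

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are a search query normalizer for an e-commerce marketplace. "
    "For each input token, return a JSON object mapping the token to its "
    "corrected or expanded form: fix misspellings and expand abbreviations "
    "(e.g. 'sdcc' -> 'san diego comic con'). If a token is already "
    "correct, map it to itself."
)

def expand_tokens(tokens: list[str]) -> dict[str, str]:
    """Offline batch call: ask GPT for corrections/expansions of tokens."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # model choice is an assumption
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": json.dumps(tokens)},
        ],
        temperature=0,  # deterministic output suits a normalization task
    )
    # A production system would validate the returned JSON before caching it.
    return json.loads(response.choices[0].message.content)
```

Because this runs offline, occasional slow or failed GPT calls never touch the user-facing request path; they only delay the next cache refresh.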
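The lessons above mention confidence-based expansion application without detailing how confidence is derived. One plausible interpretation, sketched below under that assumption, is to keep an expansion only when its words are well supported in the logged query vocabulary rather than trusting GPT output blindly:

```python
from collections import Counter

def gate_expansions(expansions: dict[str, str],
                    vocab_counts: Counter,
                    min_support: int = 3) -> dict[str, str]:
    """Keep an expansion only if every word in the expanded form already
    appears at least min_support times in the logged vocabulary."""
    kept = {}
    for token, expanded in expansions.items():
        words = expanded.split()
        if all(vocab_counts[w] >= min_support for w in words):
            kept[token] = expanded
    return kept
```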
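Finally, the cached-serving side: the case study names a key-value store for expansions but not a specific product, so Redis and the `qexp:` key scheme below are assumptions. The runtime path is one batched cache lookup per query, which is the kind of cheap hot-path operation that keeps search within the sub-250ms budget:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def store_expansions(expansions: dict[str, str]) -> None:
    """Offline job: persist token -> expansion pairs in the KV store."""
    for token, expanded in expansions.items():
        if expanded != token:  # only cache tokens that actually change
            r.set(f"qexp:{token}", expanded)

def rewrite_query(raw_query: str) -> str:
    """Runtime path: one batched MGET per query; tokens missing from the
    cache pass through unchanged, so the worst case is the original query."""
    tokens = raw_query.lower().split()
    cached = r.mget([f"qexp:{t}" for t in tokens])
    return " ".join(hit or token for hit, token in zip(cached, tokens))
```

With this shape, a query like "pokmon sdcc" would be rewritten to "pokemon san diego comic con" before reaching the search backend, which also matches the unidirectional behavior noted in the limitations: nothing maps "san diego comic con" back to "sdcc".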
