OLX faced a challenge with unstructured job roles on its job listings platform, making it difficult for users to find relevant positions. The team implemented a production solution using Prosus AI Assistant, a proprietary LLM service, to automatically extract and standardize job roles from listings. The system processes around 2,000 daily job updates, making approximately 4,000 API calls per day. Initial A/B testing showed positive uplift in most metrics, particularly for queries returning fewer than 50 search results, though the operational cost of roughly 15K per month has led them to consider transitioning to self-hosted models.
# OLX Job Role Extraction Case Study: LLMOps Implementation
## Project Overview
OLX, a major online classifieds platform, implemented a sophisticated LLM-based system to automate job role extraction from their job listings. The project leverages Prosus AI Assistant, a proprietary LLM solution, to enhance their job search functionality by creating standardized job role taxonomies and automating role extraction from job descriptions.
## Technical Implementation
### LLM Selection and Infrastructure
- Chose Prosus AI Assistant over self-hosted LLMs, trading higher per-call costs for faster deployment and no inference infrastructure to manage
### Production Pipeline Architecture
- Built an end-to-end job-role extraction pipeline: ad-event ingestion, LLM-based role extraction, and search-index updates (the core step is sketched below)
- Processing volume: ~2,000 job-ad updates per day, translating to ~4,000 LLM API calls per day
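The case study doesn't publish OLX's code, but the core step is easy to picture. Below is a minimal Python sketch, with hypothetical names (`JobListing`, `extract_role`, `call_llm`), of constraining the model to a closed role set so output stays standardized:

```python
# Hypothetical sketch of the core extraction step: map a raw job listing to a
# standardized role from the category's taxonomy. Names are illustrative,
# not OLX's actual API.
from dataclasses import dataclass

@dataclass
class JobListing:
    ad_id: str
    category: str
    title: str
    description: str

def build_extraction_prompt(listing: JobListing, allowed_roles: list[str]) -> str:
    """Constrain the model to a closed set of roles to keep output standardized."""
    roles = ", ".join(allowed_roles)
    return (
        f"Extract the job role from the listing below.\n"
        f"Answer with exactly one role from this list: {roles}.\n\n"
        f"Title: {listing.title}\n"
        f"Description: {listing.description}"
    )

def extract_role(listing: JobListing, allowed_roles: list[str], call_llm) -> str:
    """One LLM call per listing; ~2,000 updates/day at OLX's reported volume."""
    answer = call_llm(build_extraction_prompt(listing, allowed_roles)).strip()
    # Fall back to an explicit unknown value rather than trusting a bad role.
    return answer if answer in allowed_roles else "unknown"
```

Constraining the answer to a known role list is what turns the task into standardization rather than free-form generation; anything outside the list falls back to `unknown` for later review.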
### Data Processing and Preparation
- Implemented careful sampling of job listings per category for taxonomy construction and validation
- Preprocessed listing text (cleaning and normalization) before sending it to the LLM
- Analyzed top search keywords per category to ground the taxonomy in real user demand (see the sketch after this list)
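As an illustration of the keyword analysis step, the snippet below (synthetic data, assumed log schema) derives the top-searched keywords per category, the same signal the taxonomy prompt consumes:

```python
# Illustrative sketch: derive top-searched keywords per category from a
# search log, to seed the taxonomy-generation prompts. Schema is assumed.
import pandas as pd

searches = pd.DataFrame({
    "category": ["jobs/driver", "jobs/driver", "jobs/it", "jobs/it", "jobs/it"],
    "keyword":  ["truck driver", "delivery", "python developer",
                 "python developer", "data analyst"],
})

top_keywords = (
    searches.groupby(["category", "keyword"]).size()
    .rename("count").reset_index()
    .sort_values(["category", "count"], ascending=[True, False])
    .groupby("category").head(50)   # keep the 50 most-searched terms per category
)
print(top_keywords)
```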
## LLM Operations Details
### Prompt Engineering
- Developed task-specific prompts with explicit task descriptions and output-format instructions (see the example prompt below)
- Used the LangChain framework for prompt templating and orchestration of LLM calls (a templating sketch follows this list)
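The case study names LangChain but not the exact templates, so the following is only a plausible templating sketch; imports vary across LangChain versions, and the prompt wording is illustrative rather than OLX's production prompt:

```python
# Minimal LangChain templating sketch (prompt wording is illustrative).
from langchain.prompts import PromptTemplate

extraction_prompt = PromptTemplate(
    input_variables=["category", "roles", "title", "description"],
    template=(
        "### Task Description ###\n"
        "You standardize job roles for the {category} category.\n"
        "Pick exactly one role from: {roles}\n\n"
        "Title: {title}\nDescription: {description}\n\n"
        "### Expected Output Format ###\n"
        "A single role string, nothing else."
    ),
)

prompt_text = extraction_prompt.format(
    category="jobs/it",
    roles="python developer, data analyst, devops engineer",
    title="Senior Python Engineer",
    description="Build backend services...",
)
print(prompt_text)
```

Keeping the template in one versioned object, rather than string-concatenating prompts at call sites, is what makes the "iterative testing and refinement" noted later practical.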
### Taxonomy Generation Process
- Created a hierarchical job-role taxonomy per category by prompting the LLM with the category's top-searched keywords and existing job roles
- Example prompt structure (a parsing sketch follows the excerpt):
```text
### Task Description ###
Consider the following top-searched keywords and job-roles in the {category} category...
### Expected Output Format ###
```
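The output side matters as much as the prompt. Assuming (the case study doesn't specify) that the expected output format is JSON mapping parent roles to child roles, parsing the response into a usable hierarchy could look like this:

```python
# Sketch of turning the taxonomy-generation response into a usable structure.
# The JSON shape here is an assumption, not confirmed by the case study.
import json

raw_response = """
{
  "software engineer": ["backend developer", "frontend developer"],
  "data specialist": ["data analyst", "data engineer"]
}
"""

taxonomy: dict[str, list[str]] = json.loads(raw_response)

# Flatten to the closed role set used to constrain per-listing extraction.
allowed_roles = sorted({child for children in taxonomy.values() for child in children})
print(allowed_roles)
```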
### Production System Integration
- Implemented a two-phase deployment: offline taxonomy generation, then online role extraction on live listings
- Created a subscription service for ad create/update events
- Integrated extracted roles into the search indexing system (see the event-handling sketch after this list)
- Built monitoring and validation systems
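A hedged sketch of how the online phase might wire together, with stand-in event and index clients (`handle_ad_event` and `FakeIndex` are illustrative, not OLX's actual services):

```python
# Online phase sketch: subscribe to ad events, extract a role, and push it
# to the search index. Queue and index clients are stand-ins.
import json

def handle_ad_event(event: dict, extract_role, search_index) -> None:
    """Process one ad create/update event; ~2,000 such updates arrive daily."""
    if event.get("type") not in {"ad_created", "ad_updated"}:
        return  # ignore deletes and unrelated events
    role = extract_role(event["ad_id"], event["title"], event["description"])
    # Attach the standardized role so search can filter and boost on it.
    search_index.update(event["ad_id"], {"standardized_role": role})

class FakeIndex:
    """In-memory stand-in for the search index client."""
    def __init__(self):
        self.docs = {}
    def update(self, ad_id, fields):
        self.docs.setdefault(ad_id, {}).update(fields)

# Example wiring with a stubbed extractor:
index = FakeIndex()
handle_ad_event(
    json.loads('{"type": "ad_created", "ad_id": "a1", '
               '"title": "Truck driver", "description": "..."}'),
    extract_role=lambda ad_id, title, desc: "driver",
    search_index=index,
)
print(index.docs)  # {'a1': {'standardized_role': 'driver'}}
```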
## Monitoring and Evaluation
### Performance Metrics
- Conducted A/B testing focused on search and conversion metrics, showing positive uplift in most of them
- Segmented the results by result-set size; the largest gains appeared on queries returning fewer than 50 results (see the segmentation sketch below)
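The segmentation itself is straightforward to reproduce. With synthetic numbers (the real metrics aren't published), the analysis splits queries at the 50-result threshold where OLX saw the biggest lift:

```python
# Sketch of the segmented A/B analysis: compare treatment vs. control CTR,
# split by how many results the query returned. Data is synthetic.
import pandas as pd

results = pd.DataFrame({
    "variant":      ["control", "treatment"] * 4,
    "result_count": [10, 10, 30, 30, 80, 80, 200, 200],
    "ctr":          [0.050, 0.064, 0.055, 0.066, 0.070, 0.071, 0.072, 0.072],
})

results["segment"] = results["result_count"].apply(
    lambda n: "<50 results" if n < 50 else ">=50 results"
)
print(results.groupby(["segment", "variant"])["ctr"].mean().unstack())
```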
### Quality Control
- Informal accuracy monitoring during annotation
- Initial validation with 100 manually reviewed sample extractions (see the sampling sketch after this list)
- Continuous monitoring of extraction quality
- Regular taxonomy updates based on category changes
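For the initial 100-extraction validation, a reproducible sampling step is all that's needed; the helper below is a sketch, not OLX's tooling:

```python
# Draw a fixed, reproducible sample of extractions for manual annotation
# (the case study used 100 samples for initial validation).
import random

def sample_for_review(extractions: list[dict], n: int = 100, seed: int = 42) -> list[dict]:
    """Reproducible random sample of (listing, extracted_role) pairs to annotate."""
    rng = random.Random(seed)
    return rng.sample(extractions, k=min(n, len(extractions)))

batch = [{"ad_id": f"a{i}", "role": "driver"} for i in range(1000)]
review_set = sample_for_review(batch)
print(len(review_set))  # 100
```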
## Challenges and Solutions
### Category Evolution Management
- Built processes to detect category additions and restructurings and regenerate the affected taxonomies
### Resource Optimization
- Managed API request volume by restricting LLM calls to ad create/update events; one plausible deduplication approach is sketched below
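One plausible way to cap call volume, assumed rather than confirmed by the case study, is to skip LLM calls when a listing's text hasn't changed since it was last processed:

```python
# Assumed optimization (not confirmed by the case study): skip the LLM call
# when a listing's text is unchanged since the last extraction.
import hashlib

_seen: dict[str, str] = {}  # ad_id -> content hash of last processed text

def needs_extraction(ad_id: str, title: str, description: str) -> bool:
    digest = hashlib.sha256(f"{title}\n{description}".encode()).hexdigest()
    if _seen.get(ad_id) == digest:
        return False   # unchanged text: reuse the stored role, save one API call
    _seen[ad_id] = digest
    return True
```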
### Cost Management
- Current operational costs run at roughly 15K per month for about 120K API calls, motivating the planned move toward self-hosted models (back-of-envelope below)
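The reported figures give simple unit economics (the source doesn't state a currency, so the sketch treats cost as a generic unit):

```python
# Back-of-envelope unit economics from the reported figures.
calls_per_day = 4_000
monthly_cost = 15_000

calls_per_month = calls_per_day * 30      # ~120,000 calls
cost_per_call = monthly_cost / calls_per_month
print(f"~{cost_per_call:.3f} per call")   # ~0.125 per call
```

At that per-call price, the case for self-hosting grows with volume, which is exactly the transition flagged under Future Developments.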
## Future Developments
### Planned Improvements
- Considering transition to self-hosted models
- Exploring broader information extraction
- Enhanced search relevance integration
- Automated taxonomy update system
### Scalability Considerations
- Planning for increased processing volumes
- Infrastructure optimization
- Cost reduction strategies
- Enhanced monitoring systems
## Lessons Learned and Best Practices
### Implementation Insights
- Importance of thorough prompt engineering
- Value of iterative testing and refinement
- Need for balanced resource utilization
- Significance of proper monitoring and evaluation
### Technical Recommendations
- Start with managed solutions for quick deployment
- Plan for potential self-hosting transition
- Implement robust monitoring from the start
- Focus on prompt optimization and management
- Consider long-term cost implications
This case study demonstrates a successful implementation of LLMOps in a production environment, highlighting both the immediate benefits and long-term considerations of using LLMs for business-critical tasks. The project showcases the importance of careful planning, monitoring, and cost management in LLM-based solutions.