Automating Job Role Extraction Using Prosus AI Assistant in Production

OLX 2024

OLX faced a challenge with unstructured job roles on its job listings platform, which made it difficult for users to find relevant positions. The team deployed a production solution that uses Prosus AI Assistant, an LLM-based GenAI service, to automatically extract and standardize job roles from job listings. The system processes around 2,000 new or updated job ads per day, which translates to approximately 4,000 API calls per day. Initial A/B testing showed positive uplift in most metrics, particularly for searches returning fewer than 50 results, though the high operational cost of roughly $15,000 per month has led the team to consider moving to self-hosted models.

Industry

E-commerce

Overview

OLX, a global online marketplace, undertook a project to improve its job listings search experience by extracting structured job roles from unstructured job advertisement data. The core problem was that job roles were not clearly defined within the jobs taxonomies. Instead, they were buried within ad titles and descriptions, which made it difficult for job seekers to find relevant positions. This case study documents the journey from proof of concept to production deployment, highlighting both the successes and the pragmatic cost considerations that come with using external LLM APIs at scale.

The solution leverages Prosus AI Assistant, an LLM service developed by Prosus (OLX’s parent company, a global consumer internet group), which operates on top of OpenAI’s infrastructure through a special agreement that includes enhanced privacy measures and a zero-day data retention policy. This case study is particularly instructive for teams considering the build-versus-buy decision for LLM capabilities in production systems.

Technical Architecture and Pipeline Design

The job-role extraction system operates through a multi-stage pipeline that processes job advertisements to create structured, searchable job role data. The architecture integrates with OLX’s existing infrastructure, particularly their search indexing system.

Data Preprocessing

Before sending data to the LLM, the team implemented several preprocessing steps. They sampled 2,000 job ads for their proof of concept, accounting for uneven distribution across sub-categories to ensure representative coverage. The preprocessing pipeline includes text cleaning, trimming content to the first 200 words/tokens (to manage API costs and stay within token limits), and translation where necessary since the initial focus was on the Polish market.
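The trimming and sampling steps above can be sketched roughly as follows. This is a minimal illustration, not OLX's implementation: it assumes whitespace-delimited words as the trimming unit (the source says "words/tokens" without naming a tokenizer), and the function names, the `sub_category` field, and the proportional quota rule are all hypothetical.

```python
import random
import re
from collections import defaultdict

MAX_WORDS = 200  # trimming limit mentioned in the case study


def clean_and_trim(text, max_words=MAX_WORDS):
    """Strip HTML-like tags, collapse whitespace, keep the first max_words words."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    words = no_tags.split()
    return " ".join(words[:max_words])


def stratified_sample(ads, total, seed=0):
    """Sample roughly `total` ads, proportionally per sub-category, at least one each.

    The 'sub_category' key and the quota rule are assumptions for illustration.
    """
    rng = random.Random(seed)
    by_cat = defaultdict(list)
    for ad in ads:
        by_cat[ad["sub_category"]].append(ad)
    sample = []
    for items in by_cat.values():
        quota = max(1, round(total * len(items) / len(ads)))
        sample.extend(rng.sample(items, min(quota, len(items))))
    return sample
```

Giving every sub-category a floor of one ad is one simple way to keep rare sub-categories represented; other allocation rules would work equally well.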

Search Keyword Analysis

A parallel analysis examined the most-searched keywords in the Jobs categories. Using the LLM, they categorized keywords into professions, job types, locations, and broader descriptors. This analysis revealed that approximately 60% of searched keywords relate to specific professions, validating the focus on job role extraction as a high-impact improvement area.

Taxonomy Tree Generation

The team used a structured approach to generate normalized job-role taxonomies. This involved providing the LLM with up to 100 profession-related searched keywords and up to 50 job roles extracted from randomly selected job ads within each category. A carefully crafted prompt guided the model to produce hierarchical taxonomies considering both responsibilities and department structures. The prompt structure explicitly requested categorization with detailed instructions and specified output format requirements.
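As a rough sketch of how such a prompt might be assembled, the snippet below caps the inputs at the limits described above and asks for a structured output. The prompt wording, the JSON output shape, and the function name are hypothetical placeholders; the source does not publish the actual prompt.

```python
MAX_KEYWORDS = 100  # "up to 100 profession-related searched keywords"
MAX_ROLES = 50      # "up to 50 job roles" per category


def build_taxonomy_prompt(category, keywords, job_roles):
    """Assemble a taxonomy-generation prompt; the wording is a hypothetical placeholder."""
    kw = "\n".join(f"- {k}" for k in keywords[:MAX_KEYWORDS])
    roles = "\n".join(f"- {r}" for r in job_roles[:MAX_ROLES])
    return (
        f"Category: {category}\n\n"
        "Group the items below into a hierarchical job-role taxonomy, "
        "considering both responsibilities and department structure.\n\n"
        f"Searched keywords:\n{kw}\n\n"
        f"Extracted job roles:\n{roles}\n\n"
        "Return only JSON: a list of objects with keys 'role' and 'children'."
    )
```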

Production Pipeline

The production implementation consists of two main operational modes:

A dedicated service subscribes to ad events and uses Prosus AI Assistant to extract job taxonomy information. The extracted job roles are then sent to AWS Kinesis, which feeds into the search team’s indexing pipeline. The enriched data connects extracted job roles with other ad information like titles and parameters for search lookup.
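The event-driven flow described here could look something like the sketch below. The event fields, the `extract_role` callable, and the stream name are assumptions for illustration; the only real API used is the boto3 Kinesis client's `put_record` call, and the client is injected so the handler can be exercised without AWS access.

```python
import json


def handle_ad_event(event, extract_role, kinesis_client, stream_name):
    """Extract a job role for one ad event and publish the enriched record.

    `event` keys, `extract_role`, and `stream_name` are hypothetical.
    """
    role = extract_role(event["title"], event["description"])
    record = {
        "ad_id": event["ad_id"],
        "title": event["title"],
        "job_role": role,
    }
    # put_record is the boto3 Kinesis client method; Data must be bytes.
    kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=str(event["ad_id"]),
    )
    return record
```

In a real deployment `kinesis_client` would typically be created with `boto3.client("kinesis")`, and `extract_role` would wrap the LLM call.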

Prompt Engineering Practices

The team developed specific prompt engineering guidelines through their experimentation:

The team also utilized the LangChain framework to streamline interactions with the LLM API, simplify outcome specifications, and chain tasks for enhanced efficiency.

Resource Utilization and Scaling

In production, the system handles approximately 2,000 newly created or updated ads daily. The team made an architectural decision to break down the processing into two sub-tasks—job-role extraction and matching within the standardized tree—resulting in approximately 4,000 daily API requests to Prosus AI Assistant.
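The two-call split, and why it doubles the request count, can be illustrated with a small sketch. The prompt strings and the `call_llm` callable are hypothetical stand-ins; the source does not describe the actual prompts or client code.

```python
def extract_and_match(ad_text, taxonomy, call_llm):
    """Run the two sub-tasks in sequence: free-form extraction, then tree matching."""
    # Call 1: extract a free-form job role from the ad text.
    raw_role = call_llm(f"Extract the job role from this ad:\n{ad_text}")
    # Call 2: map the raw role onto one entry of the standardized taxonomy.
    options = ", ".join(taxonomy)
    matched = call_llm(
        f"Pick the closest match for '{raw_role}' from: {options}. "
        "Answer with the option only."
    )
    # Guard against answers that are not in the taxonomy.
    return matched if matched in taxonomy else None


ADS_PER_DAY = 2_000
CALLS_PER_AD = 2  # one extraction call plus one matching call
DAILY_CALLS = ADS_PER_DAY * CALLS_PER_AD  # 4,000, as reported
```

Separating the steps keeps each prompt simple, at the price of one extra request per ad.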

For taxonomy generation, the API request volume depends on the number of sub-categories and is only triggered when there are changes or updates to the category tree, which occurs at most a few times per month. This distinction between continuous extraction operations and periodic taxonomy regeneration is an important architectural consideration for managing costs and system complexity.

Evaluation and Results

The team conducted A/B testing to evaluate the impact of the job-role extraction system, focusing on the retrieval stage of search (not yet integrated into search ranking). They acknowledged that significant results require time and designed their experiment with strategic segmentation, dividing results into low, medium, and high segments.

Key observations from the experiments include:

The team was transparent about one limitation: because extracted job roles currently influence only which ads are retrieved and not how they are ranked, improvements may not appear prominently in the top results.

Model Selection and Trade-offs

The decision to use Prosus AI Assistant over self-hosted LLMs was driven by several factors:

The team acknowledged potential risks including slightly longer response times, dependency on external API availability, and questions about long-term viability. They positioned this as a strategic choice for rapid deployment while remaining open to exploring custom LLMs for future optimization.

Cost Considerations and Future Direction

The case study provides valuable transparency about operational costs: approximately $15,000 per month for the Prosus AI Assistant service. This cost revelation prompted serious reflection on sustainability and efficiency for ongoing operations.
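A back-of-the-envelope calculation puts that figure in per-request terms. It assumes a 30-day month and that the whole bill is attributable to the roughly 4,000 daily extraction and matching requests; in practice the monthly cost may also cover taxonomy regeneration and other usage, so treat the result as an upper-bound estimate.

```python
MONTHLY_COST_USD = 15_000  # reported monthly cost
DAILY_REQUESTS = 4_000     # reported daily API requests
DAYS_PER_MONTH = 30        # assumption
CALLS_PER_AD = 2           # extraction + matching

monthly_requests = DAILY_REQUESTS * DAYS_PER_MONTH
cost_per_request = MONTHLY_COST_USD / monthly_requests
cost_per_ad = cost_per_request * CALLS_PER_AD

print(f"{monthly_requests:,} requests/month")     # 120,000 requests/month
print(f"${cost_per_request:.3f} per request")     # $0.125 per request
print(f"${cost_per_ad:.2f} per processed ad")     # $0.25 per processed ad
```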

The team is now evaluating a pivot toward self-hosted models, which could offer:

This honest assessment of the economics of LLM operations is particularly valuable for teams planning production deployments. While external services can expedite exploration and proof-of-concept phases, long-term cost considerations often guide strategic decisions toward self-hosted alternatives.

Handling System Evolution

A notable operational challenge is managing category evolution. As OLX’s teams continuously improve job categories, changes can necessitate recreation of job-role taxonomies and potentially introduce inconsistencies between taxonomies created before and after sub-category changes.

The planned strategy involves implementing an automated process that detects changes in sub-categories and automatically regenerates necessary job-role taxonomies. This proactive approach ensures the extraction model remains aligned with the evolving job landscape without requiring manual intervention.
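One possible way to detect such changes is to fingerprint each category's sub-category list and compare it with the fingerprint stored at the last taxonomy run. This is only a sketch of the idea; the source does not say how detection will be implemented, and the function names and data shapes here are hypothetical.

```python
import hashlib
import json


def fingerprint(sub_categories):
    """Stable hash of a category's sub-category list, independent of order."""
    canonical = json.dumps(sorted(sub_categories), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def categories_to_regenerate(current_tree, stored_fingerprints):
    """Return categories whose sub-categories are new or changed since the last run."""
    return [
        category
        for category, subs in current_tree.items()
        if stored_fingerprints.get(category) != fingerprint(subs)
    ]
```

Only the categories returned here would need their job-role taxonomies regenerated, which keeps the periodic API volume tied to actual changes.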

Key Takeaways for LLMOps Practitioners

This case study illustrates several important LLMOps principles:

The OLX team’s transparency about both successes and challenges—including the significant monthly costs that are prompting reconsideration of their approach—provides realistic guidance for teams implementing similar LLM-powered extraction systems in production environments.
