Building a Knowledge Base Chatbot for Data Team Support Using RAG

HP 2024

HP's data engineering teams were spending 20-30% of their time handling support requests and SQL queries, creating a significant productivity bottleneck. Using Databricks Mosaic AI, they implemented a RAG-based knowledge base chatbot that could answer user queries about data models, platform features, and access requests in real-time. The solution, which included a web crawler for knowledge ingestion and vector search capabilities, was built in just three weeks and led to substantial productivity gains while reducing operational costs by 20-30% compared to their previous data warehouse solution.

Industry

Tech

Overview

HP, a global technology company known for computing and printing solutions supporting over 200 million printers worldwide, faced significant operational challenges within their Big Data Platform and Solutions organization. This team is responsible for data ingestion, platform support, and customer data products across all HP business units. The core problem was that nontechnical users struggled to discover, access, and gain insights from HP’s vast data repositories, which span PCs, printers, web applications, and mobile apps. This created bottlenecks that consumed substantial engineering resources and slowed decision-making across the organization.

The case study demonstrates how HP leveraged generative AI, specifically through the Databricks ecosystem, to build a production knowledge base chatbot that addresses internal support challenges. While this case study is presented by Databricks (a vendor with clear commercial interest), the technical details and quantified outcomes provide useful insights into LLMOps practices for enterprise knowledge management.

The Problem

HP’s data engineering teams were overwhelmed with support requests from internal users and partners, ranging from questions about specific data models to requests for access to restricted data, guidance on platform features, and help onboarding new employees. According to William Ma, Data Science Manager at HP, the teams spent approximately 20-30% of their time crafting SQL queries, investigating data issues, and cross-referencing information across multiple systems. For a five-person team, this overhead amounted to the productivity of roughly one full-time employee.

Additionally, data teams were tasked with building usage dashboards, cost analysis reports, and budget tracking tools for leadership decision-making. The manual nature of accessing and exploring data created latency that impeded real-time strategic decisions. Data movement between warehouses and AI workspaces also introduced security, privacy, and governance complexities, particularly around Data Subject Requests (DSRs) that required data deletion when customers submitted such requests.

Technical Architecture and LLMOps Implementation

Platform Migration and Infrastructure

HP migrated from AWS Redshift to the Databricks Data Intelligence Platform running on AWS, adopting a lakehouse architecture that unifies data, analytics, and AI capabilities on a single platform.

Model Selection Process

HP used the Mosaic AI Playground to experiment with different large language models before settling on DBRX as their production model. The selection criteria focused on appropriateness for chatbot use cases and cost-effectiveness. This experimentation phase represents a critical LLMOps practice—evaluating multiple models against specific use case requirements before committing to a production deployment.
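The playground-driven selection process can be approximated offline: run the same evaluation prompts through each candidate model and score the responses against the use case's criteria. A minimal sketch follows; the candidate models, prompts, and scoring rule are illustrative stand-ins, not HP's actual harness.

```python
# Sketch of an offline model-comparison harness in the spirit of the
# Mosaic AI Playground experimentation: run the same evaluation prompts
# through each candidate and score the responses.

EVAL_PROMPTS = [
    "How do I request access to the sales data model?",
    "Which table holds printer telemetry events?",
]

# Stub "models": real code would call each model's serving endpoint.
CANDIDATES = {
    "model-a": lambda p: "",                        # returns nothing
    "model-b": lambda p: "Short grounded answer.",  # concise
    "model-c": lambda p: "x" * 500,                 # verbose padding
}

def score_response(response: str) -> float:
    # Toy rubric: empty answers score 0, verbose answers are penalized.
    # A production rubric would check groundedness, citations, and cost.
    if not response.strip():
        return 0.0
    return min(1.0, 80 / len(response))

def evaluate(candidates, prompts):
    return {
        name: sum(score_response(model(p)) for p in prompts) / len(prompts)
        for name, model in candidates.items()
    }

scores = evaluate(CANDIDATES, EVAL_PROMPTS)
best = max(scores, key=scores.get)  # the model to promote to production
```

The same structure scales to real criteria (cost per token, latency, answer quality) once the stubs are replaced with endpoint calls.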

RAG Architecture

The core GenAI solution implements a Retrieval Augmented Generation (RAG) pattern: a web crawler ingests internal knowledge base content, the content is indexed for vector search, and retrieved passages ground the chatbot's responses to user queries.

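The end-to-end flow can be sketched with an in-memory stand-in. Here a bag-of-words "embedding" and a toy store substitute for a real embedding model and managed vector index; the document texts and prompt wording are illustrative only.

```python
# Minimal in-memory RAG sketch: crawled pages are "embedded", indexed,
# and retrieved to ground the model's answer.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: lowercase bag of words.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.docs = []  # (text, vector) pairs

    def ingest(self, pages):
        # In HP's setup, a web crawler supplies the knowledge-base pages.
        for page in pages:
            self.docs.append((page, embed(page)))

    def retrieve(self, query, k=2):
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(query, context):
    bullets = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{bullets}\n\nQuestion: {query}"

store = VectorStore()
store.ingest([
    "Access requests for restricted data go through the governance portal.",
    "The printer telemetry table is refreshed hourly from device events.",
    "Usage dashboards are built on the lakehouse cost tables.",
])
question = "How do I request access to restricted data?"
context = store.retrieve(question)
prompt = build_prompt(question, context)  # sent to the LLM (e.g. DBRX)
```

Grounding the prompt in retrieved passages is what lets the chatbot answer from HP's own documentation rather than from the model's training data.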
Development Velocity

A notable aspect of this implementation is the development timeline. An intern on the data team implemented the end-to-end solution in less than three weeks using Databricks Mosaic AI. The case study contrasts this with other teams who reportedly spent months building similar solutions on different platforms with experienced staff engineers. While this comparison should be viewed with appropriate skepticism given the promotional nature of the source, it does suggest that modern LLMOps platforms can significantly accelerate GenAI application development when the tooling is well-integrated.

Governance and Security Considerations

The implementation addresses several enterprise governance requirements through Unity Catalog integration, which governs data and AI assets under a single set of access controls.

By keeping AI tooling within the same platform as the data (rather than moving data to external AI workspaces), HP reduced the governance complexity and associated costs.
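One concrete consequence is DSR handling: a deletion request must propagate to derived AI artifacts (such as a vector index) as well as to the source tables, and co-locating both makes that a single operation. The sketch below illustrates the idea with in-memory stores; the table layout and identifiers are invented for illustration.

```python
# Sketch: a Data Subject Request (DSR) must delete a customer's data
# from the source table *and* from derived artifacts like a vector
# index. The in-memory stores here are illustrative stand-ins.

source_table = {
    "cust-1": {"email": "a@example.com"},
    "cust-2": {"email": "b@example.com"},
}
# Derived vector index: chunk id -> (owning customer id, embedding)
vector_index = {
    "chunk-1": ("cust-1", [0.1, 0.2]),
    "chunk-2": ("cust-2", [0.3, 0.4]),
    "chunk-3": ("cust-1", [0.5, 0.6]),
}

def handle_dsr(customer_id: str) -> int:
    """Delete a customer everywhere; return the number of records removed."""
    removed = 0
    if source_table.pop(customer_id, None) is not None:
        removed += 1
    stale = [cid for cid, (owner, _) in vector_index.items()
             if owner == customer_id]
    for cid in stale:
        del vector_index[cid]
        removed += 1
    return removed
```

When the index lives in a separate AI workspace, the same request requires coordinating deletions across two governed systems, which is the complexity HP avoided.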

Expanding Use Cases: AI/BI Genie

Beyond the knowledge base chatbot, HP is exploring AI/BI Genie for natural language data querying. The tool lets nontechnical users query data conversationally and receive immediate responses, potentially reducing the manual SQL support burden on data engineers.

This represents an evolution from document-based RAG to structured data querying, expanding the GenAI footprint within the organization.
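The general pattern behind such tools can be sketched as follows. AI/BI Genie is a Databricks product whose internals are not described in the case study; in this toy version a template lookup stands in for the LLM translation step, and the table and questions are invented for illustration.

```python
# Toy illustration of the natural-language-to-SQL pattern: translate a
# question into SQL (a template lookup stands in for an LLM here), run
# it against the warehouse, and return the rows to the user.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage (team TEXT, cost REAL)")
conn.executemany("INSERT INTO usage VALUES (?, ?)",
                 [("ingest", 120.0), ("platform", 80.0), ("ingest", 30.0)])

# Stand-in for the LLM: map known question shapes to vetted SQL.
TEMPLATES = {
    "total cost by team":
        "SELECT team, SUM(cost) FROM usage GROUP BY team ORDER BY team",
}

def answer(question: str):
    sql = TEMPLATES.get(question.lower())
    if sql is None:
        raise ValueError("question not understood")
    return conn.execute(sql).fetchall()

rows = answer("Total cost by team")
```

Restricting execution to vetted query shapes, rather than running arbitrary generated SQL, is a common safeguard in this kind of deployment.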

Results and Outcomes

The quantified results include a 20-30% reduction in operational costs compared with the previous data warehouse solution, along with forecasted productivity gains from the chatbot.

The cost savings appear to be primarily from the platform migration rather than the GenAI implementation specifically. The productivity gains from the chatbot are described as “forecasted” rather than measured, suggesting the solution may still be relatively new or that concrete metrics weren’t available at the time of publication.

Future Roadmap

HP plans to expand GenAI solutions to external customers to improve troubleshooting and query resolution processes. This suggests a progression from internal productivity tools to customer-facing AI applications, which typically involves more rigorous requirements around reliability, accuracy, and response quality.

Critical Assessment

While this case study provides useful insights into enterprise LLMOps practices, several points warrant consideration: the source is vendor-published, the headline productivity gains are forecasted rather than measured, and the favorable development-timeline comparison with other platforms is anecdotal.

That said, the technical architecture described follows established LLMOps best practices: using RAG to ground LLM responses in proprietary data, implementing proper governance controls, selecting models based on experimentation, and building referenceable outputs for user verification. The emphasis on keeping data and AI tooling unified to reduce governance complexity is a practical consideration for enterprise deployments.
