Company
Portkey, Airbyte, Comet
Title
Building Production-Ready AI Agents and Monitoring Systems
Industry
Tech
Year
2024
Summary (short)
The panel discussion and demo sessions showcase how companies like Portkey, Airbyte, and Comet are tackling the challenges of deploying LLMs and AI agents in production. They address key issues including monitoring, observability, error handling, data movement, and human-in-the-loop processes. The solutions presented range from AI gateways for enterprise deployments to experiment tracking platforms and tools for building reliable AI agents, demonstrating both the challenges and emerging best practices in LLMOps.
This comprehensive panel discussion and demo session brings together key players in the LLMOps space to discuss the challenges and solutions for deploying LLMs and AI agents in production environments. The session features representatives from multiple companies, including Portkey, Airbyte, and Comet, each addressing a different aspect of the LLMOps pipeline.

Starting with data movement and embeddings, Airbyte's experience highlights the complexities of handling embeddings in production. They initially attempted to build their own embedding service but ran into memory-usage and timeout issues when processing large documents. This led to important lessons about the trade-offs between building and using managed services, and about the need to carefully consider infrastructure requirements when running embedding operations at scale.

Comet's presentation focused on the critical importance of monitoring and observability in LLM applications. They emphasized that many AI and traditional machine learning models are deployed without proper monitoring in place, which makes debugging and maintenance difficult. Their solution provides comprehensive tracking of inputs, outputs, and various metrics, including hallucination detection and bias measures, and their framework supports both out-of-the-box metrics and custom metric development for LLM applications.

Portkey addressed the challenges of enterprise LLM deployment through their AI gateway. They focus on handling failure modes, which are more complex with LLMs than with traditional APIs because of the models' stochastic nature. Their gateway provides routing logic and guardrails, allowing applications to handle various types of failures and switch between models when needed. They have also implemented features for bulk data processing and for orchestrating multiple agents without losing context.
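Airbyte's memory and timeout problems with large documents illustrate a general pattern: embed in bounded batches and retry transient timeouts rather than sending one large payload. The sketch below is a minimal illustration of that pattern only; `embed_in_batches` and its parameters are hypothetical, not Airbyte's actual service.

```python
import time
from typing import Callable, List


def embed_in_batches(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,
    max_retries: int = 3,
    backoff_s: float = 1.0,
) -> List[List[float]]:
    """Embed documents in fixed-size batches so that no single request
    can exhaust memory or exceed a provider timeout (hypothetical sketch)."""
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                vectors.extend(embed_fn(batch))
                break
            except TimeoutError:
                if attempt == max_retries - 1:
                    raise  # give up after the final retry
                time.sleep(backoff_s * 2 ** attempt)  # exponential backoff
    return vectors
```

The key design choice is bounding each request: memory and latency then scale with `batch_size`, not with the total corpus, and a transient timeout costs one retried batch rather than the whole job.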
A significant portion of the discussion centered on AI agents and their production deployment challenges. Key insights emerged about the need for:

* Durability and error handling in agent execution
* Managing context and data movement between agents
* Handling asynchronous operations and callbacks
* Implementing human-in-the-loop processes effectively

The speakers highlighted that while agents offer powerful capabilities, they require careful attention to error handling, retry logic, and state management. The discussion emphasized the importance of building resilient systems that can handle failures at every level, from API timeouts to model hallucinations.

Regarding human-in-the-loop processes, the panel agreed that they will remain important for the foreseeable future, particularly for critical operations or when high confidence is required. They discussed implementations ranging from simple approval workflows to more complex interaction patterns.

The companies also shared insights about cost management and optimization in production environments, including strategies for managing token usage, choosing appropriate models for different tasks, and implementing cost-effective architectures: for instance, using simpler models for initial processing and calling more expensive models only when needed.

Monitoring and evaluation emerged as crucial components across all implementations. The speakers emphasized the need for:

* Comprehensive logging of inputs and outputs
* Tracking of key metrics such as hallucination rates and response quality
* Performance monitoring across different components of the system
* Cost and usage tracking

The demo sessions showcased practical implementations of these concepts, with each company demonstrating its approach to a different aspect of the LLMOps challenge.
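The cheaper-model-first strategy described above can be sketched as a simple confidence-gated cascade. The `ModelTier` structure and the confidence threshold are illustrative assumptions, not any panelist's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ModelTier:
    """One model in a cost-ordered cascade (hypothetical sketch)."""
    name: str
    call: Callable[[str], Tuple[str, float]]  # returns (answer, confidence)
    cost_per_call: float


def cascade(
    prompt: str,
    tiers: List[ModelTier],
    min_confidence: float = 0.8,
) -> Tuple[str, float]:
    """Try cheaper models first; escalate to the next tier only when
    the returned confidence is below the threshold."""
    spent = 0.0
    answer = ""
    for tier in tiers:
        answer, confidence = tier.call(prompt)
        spent += tier.cost_per_call
        if confidence >= min_confidence:
            break  # good enough; no need to pay for a larger model
    return answer, spent
```

In practice the confidence signal might come from a classifier, log-probabilities, or a self-evaluation prompt; the routing logic itself stays this simple.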
Portkey demonstrated their AI gateway capabilities, Comet showed their observability and monitoring tools, and various speakers presented solutions for building reliable AI agents.

A key theme throughout the discussion was the rapid pace of change in the LLM space and the need for systems that can adapt to new models and capabilities. The speakers emphasized building flexible architectures that can accommodate new models and features as they become available.

The discussion also touched on the importance of proper evaluation and testing frameworks. Given the stochastic nature of LLMs, traditional testing approaches need to be adapted. The speakers discussed various approaches to validation, including:

* Automated testing with synthetic conversations
* Evaluation of model outputs against defined metrics
* Monitoring of production systems for degradation
* Integration testing across components

Security and access control were also discussed, particularly in the context of enterprise deployments. The solutions presented include features for managing access to sensitive data and for controlling model usage across different user groups.

Overall, the session provided a comprehensive overview of the current state of LLMOps, highlighting both the challenges and the emerging solutions in the space. The presentations emphasized the importance of building robust, monitored, and maintainable systems when deploying LLMs to production.
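A minimal version of the synthetic-conversation testing the panel described might look like the following. The `evaluate` helper and its substring-containment metric are hypothetical stand-ins for the richer metrics (hallucination rate, response quality) that production suites would use.

```python
from typing import Callable, Dict, List


def evaluate(
    agent: Callable[[str], str],
    cases: List[Dict[str, str]],  # each: {"prompt": ..., "must_contain": ...}
) -> Dict[str, float]:
    """Run synthetic prompts through an agent and score a simple
    case-insensitive containment check per case (hypothetical sketch)."""
    passed = 0
    for case in cases:
        reply = agent(case["prompt"])
        if case["must_contain"].lower() in reply.lower():
            passed += 1
    return {"pass_rate": passed / len(cases), "total": float(len(cases))}
```

Running a suite like this in CI, and again periodically against production, gives an early signal of the degradation the speakers warned about; because LLM outputs are stochastic, thresholds on `pass_rate` tend to be more useful than exact-match assertions.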
