Ramp developed an AI-powered Tour Guide agent to help users navigate its financial operations platform. The agent guides users through complex tasks by taking control of the cursor while explaining each step. It relies on iterative action generation and tuned prompts, and it aims to improve user productivity and platform accessibility while maintaining user trust through transparent human-agent collaboration.
# Building and Deploying an AI Tour Guide Agent at Ramp
## Overview
Ramp, a financial operations platform, developed an AI-powered Tour Guide agent to help users navigate its product. This case study covers how the team deployed LLMs in production, with a focus on user experience, system architecture, and practical implementation considerations.
## Technical Architecture and Implementation
### Agent Design Philosophy
- Focused on human-agent collaboration rather than full automation
- Implements a visible, controllable cursor that users can monitor and interrupt
- Uses a classifier to automatically identify queries suitable for Tour Guide intervention
- Maintains user trust through transparent step-by-step actions and explanations
### Core Technical Components
- Interactive element recognition system
- DOM processing and annotation pipeline
- Iterative action generation system
- User interface integration with explanatory banners
### Action Generation System
- Breaks every user interaction down into three basic action types
- Generates actions iteratively rather than planning the whole sequence up front
- Originally split action generation across a two-step LLM process
- Later consolidated the two steps into a single prompt for better performance
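
The iterative approach can be sketched as a loop that asks the model for one action, applies it, and then re-reads the page before asking again. This is a minimal illustration, not Ramp's implementation: `Action`, `run_tour`, `read_page`, `propose_action`, and `apply_action` are hypothetical names, the model is passed in as a plain callable, and `kind` is left as a free-form string because the three action types are not named here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str          # placeholder for one of the basic interaction types
    target: str        # label of the element to act on
    explanation: str   # text shown to the user alongside the step


def run_tour(
    goal: str,
    read_page: Callable[[], str],
    propose_action: Callable[[str, str, List[Action]], Optional[Action]],
    apply_action: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Generate and apply one action at a time until the model reports it is done."""
    history: List[Action] = []
    for _ in range(max_steps):
        page = read_page()                       # re-read state after every step
        action = propose_action(goal, page, history)
        if action is None:                       # assumed "task complete" signal
            break
        apply_action(action)
        history.append(action)
    return history
```

Passing a single `propose_action` callable mirrors the consolidated single-prompt design; a two-step variant would call the model twice inside the loop. The `max_steps` cap is an added safety assumption so a confused model cannot loop forever.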
## Prompt Engineering and Optimization
### Input Processing
- Developed custom annotation script for HTML elements
- Incorporated accessibility tags from DOM
- Created visible labels similar to Vimium browser extension
- Implemented DOM simplification to remove irrelevant objects
- Focused on clean, efficient inputs for better model guidance
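
To make the annotation and simplification steps concrete, here is a rough sketch using Python's standard `html.parser`. It keeps only interactive elements, prefers accessibility attributes for naming them, and assigns each a short letter label. The set of interactive tags, the output format, and the names `Annotator` and `annotate` are assumptions made for illustration; the actual annotation script is not described in detail here.

```python
from html.parser import HTMLParser
from string import ascii_uppercase
from typing import Dict, List, Optional

# Assumed set of tags treated as interactive.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
TEXT_BEARING_TAGS = {"a", "button"}  # tags whose inner text can serve as a name


class Annotator(HTMLParser):
    """Collect interactive elements and give each a short letter label."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: List[Dict[str, str]] = []
        self._open: Optional[Dict[str, str]] = None  # element still collecting text

    def handle_starttag(self, tag, attrs):
        # Everything that is not interactive is dropped (the simplification step).
        if tag not in INTERACTIVE_TAGS or len(self.elements) >= len(ascii_uppercase):
            return
        attr = dict(attrs)
        element = {
            "label": ascii_uppercase[len(self.elements)],
            "tag": tag,
            # Prefer accessibility information when the DOM provides it.
            "name": attr.get("aria-label") or attr.get("placeholder") or "",
        }
        self.elements.append(element)
        if tag in TEXT_BEARING_TAGS:
            self._open = element

    def handle_data(self, data):
        # Fall back to visible text only when no accessible name was found.
        if self._open is not None and not self._open["name"]:
            self._open["name"] = data.strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def annotate(html: str) -> List[str]:
    """Return one compact line per interactive element."""
    parser = Annotator()
    parser.feed(html)
    parser.close()
    return [f'[{e["label"]}] <{e["tag"]}> {e["name"]}' for e in parser.elements]
```

A production version would run in the browser against the live DOM and draw the labels on screen; this sketch only shows the shape of the simplified input the model might receive.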
### Prompt Optimization Techniques
- Labeled interactable elements with letters (A-Z) in prompts
- Constrained decision space to improve accuracy
- Balanced prompt length against latency requirements
- Avoided context stuffing in favor of enriched interactions
- Maintained concise prompts for optimal performance
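
One way to picture the constrained decision space is a prompt that lists the labeled elements and a parser that rejects any reply outside that list. The prompt wording and the function names `build_prompt` and `parse_choice` below are hypothetical; only the idea of labeling elements with letters comes from the case study.

```python
from string import ascii_uppercase
from typing import Dict, List, Tuple


def build_prompt(goal: str, element_names: List[str]) -> Tuple[str, Dict[str, str]]:
    """Return (prompt, label -> name map) with elements labeled A, B, C, ..."""
    if len(element_names) > len(ascii_uppercase):
        raise ValueError("more elements than available labels")
    labels = dict(zip(ascii_uppercase, element_names))
    lines = [f"{label}: {name}" for label, name in labels.items()]
    prompt = (
        f"Goal: {goal}\n"
        "Elements:\n" + "\n".join(lines) + "\n"
        "Answer with exactly one letter."
    )
    return prompt, labels


def parse_choice(reply: str, labels: Dict[str, str]) -> str:
    """Accept the model's reply only if it is one of the offered labels."""
    choice = reply.strip().upper()
    if choice not in labels:
        raise ValueError(f"reply {reply!r} is not one of {sorted(labels)}")
    return choice
```

Because the model can only answer with a letter that was actually offered, a hallucinated selector or element name fails validation instead of reaching the page.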
## Evaluation and Quality Assurance
### Testing Approach
- Heavy reliance on manual testing
- Systematic identification of failure patterns
- Implementation of protective guardrails
- Restricted agent access to complex workflows
### Specific Restrictions
- Limited access to complex canvas interfaces
- Controlled interaction with large table elements
- Added hardcoded restrictions for high-risk pages
- Focus on reliable, predictable behavior
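
Restrictions of this kind could be expressed as a simple pre-check that runs before the agent is allowed to act on a page. The route patterns and the row threshold below are invented placeholders, since the actual restricted pages and limits are not listed here, and the sketch blocks outright where the real system may only limit what the agent can do.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Hypothetical patterns; the real list of high-risk pages is not given.
BLOCKED_ROUTES = ("/example-high-risk/*", "/example-canvas-editor")
MAX_TABLE_ROWS = 50  # assumed cutoff for what counts as a "large" table


def tour_allowed(
    path: str,
    table_row_counts: Iterable[int] = (),
    has_canvas: bool = False,
) -> bool:
    """Return True only if none of the hardcoded restrictions apply to this page."""
    if any(fnmatchcase(path, pattern) for pattern in BLOCKED_ROUTES):
        return False
    if has_canvas:
        return False
    return all(rows <= MAX_TABLE_ROWS for rows in table_row_counts)
```

Keeping the rules as plain data makes it easy to add a page to the blocklist as soon as manual testing surfaces a new failure pattern.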
## User Experience Design
### Interface Elements
- Interactive cursor control system
- Popup explanation banners
- Step-by-step action visibility
- User interrupt capabilities
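
The interrupt capability can be illustrated with a shared flag that is checked before every step, so the user can stop the tour between actions. This is a sketch under assumptions: `play_steps` and `show_banner` are hypothetical names, and a real front end would set the flag from a UI event handler rather than from Python.

```python
import threading
from typing import Callable, Iterable, List


def play_steps(
    steps: Iterable[str],
    show_banner: Callable[[str], None],
    interrupted: threading.Event,
) -> List[str]:
    """Show each step's explanation, stopping as soon as the user interrupts."""
    shown: List[str] = []
    for explanation in steps:
        if interrupted.is_set():   # user took back control; stop immediately
            break
        show_banner(explanation)
        shown.append(explanation)
    return shown
```

Checking the flag before each step, rather than once at the start, is what keeps the user in control for the whole tour.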
### Trust Building Features
- Transparent action execution
- Clear explanations for each step
- User control over process
- Visual feedback mechanisms
## Production Deployment Considerations
### Performance Optimization
- Consolidated LLM calls to reduce latency
- Simplified DOM processing for efficiency
- Streamlined prompt structure
- Balanced accuracy vs. speed requirements
### Safety and Reliability
- Implementation of guardrails
- Controlled action space
- User override capabilities
- Automatic query classification
## Lessons Learned and Best Practices
### Key Insights
- Importance of constraining decision space for LLMs
- Value of iterative action generation
- Need for balance between automation and user control
- Significance of transparent AI operations
### Future Development
- Plans for expansion into broader "Ramp Copilot"
- Focus on maintaining user-centric design
- Continued refinement of interaction patterns
- Integration with wider platform functionality
## Technical Challenges and Solutions
### DOM Processing
- Development of efficient annotation systems
- Integration with accessibility standards
- Optimization of element selection
- Balance of information density and processing speed
### Model Integration
- Optimization of prompt structures
- Management of state updates
- Integration with user interface
- Handling of edge cases and errors
### Performance Optimization
- Reduction of LLM call overhead
- Streamlining of processing pipeline
- Optimization of user interface updates
- Balance of responsiveness and accuracy