This case study examines how RBC (Royal Bank of Canada) implemented a RAG system called Arcane to solve the challenge of accessing and interpreting complex investment policies and procedures. The project was developed by RBC's AI Solution Acceleration and Innovation team; after the pilot phase was completed, further scaling was handed to Borealis AI, RBC's AI research division.
## Problem Context and Business Need
The financial domain presents unique challenges of complexity and scale. Financial operations span multiple areas including core banking systems, ATMs, online banking, treasury management, mobile banking, asset-liability management, foreign exchange, and various investment products. The specific problem addressed by Arcane was the bottleneck created by highly trained specialists (often with 5-10 years of post-secondary education) needing to spend significant time searching through documentation to answer investment-related queries. This inefficiency had direct financial implications, with every second of delay potentially impacting millions of dollars in bottom-line results.
## Technical Implementation
The system was built as a chat interface with several key technical components:
* Document Processing: One of the biggest challenges was handling semi-structured data from various sources (XML, HTML, PDF files, websites). The team found that proper parsing was crucial for success, since each document format and structure required its own approach.
* Vector Database Integration: While vector databases weren't a major bottleneck, the team found that proper usage strategies were important for effective retrieval.
* Chat Memory Management: The system maintains conversation context, though this creates challenges in managing the accuracy of responses over multi-turn conversations.
* Model Selection: The team experimented with different models for both embedding and generation:
- GPT-3.5 proved fast and efficient
- GPT-4 was notably slower but provided better quality
- GPT-4 Turbo offered a middle ground in terms of speed and quality
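The case study does not show Arcane's actual pipeline code, but the two components it flags as most important (parsing documents into context-preserving chunks, then retrieving them by vector similarity) can be sketched in a few lines. This is a minimal illustration: the function names are hypothetical, the chunker splits on paragraph boundaries with a character budget, and the bag-of-words "embedding" is a stand-in for a real sentence encoder such as the Sentence-BERT models mentioned later.

```python
import math
import re
from collections import Counter

def chunk_document(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into chunks on paragraph boundaries, carrying a small
    overlap forward so each chunk keeps some surrounding context."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = current[-overlap:] if overlap else ""
        current = (current + "\n" + para).strip()
    if current:
        chunks.append(current)
    return chunks

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a production system would use a trained encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the top-k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The design point the team emphasizes survives even in this toy form: retrieval quality is bounded by chunk quality, so the chunker, not the vector store, is where most of the engineering effort goes.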
## Evaluation and Quality Assurance
The team implemented a comprehensive evaluation strategy using multiple approaches:
* Ragas Framework: Used for measuring context precision, answer relevancy, and faithfulness of responses.
* Traditional IR Metrics: Incorporated precision, recall, and F-score measurements.
* Custom Evaluation Methods:
- Human evaluation
- Automated fact-checking tools
- Consistency checks across similar queries
- Embedding-based semantic similarity measurements (using Sentence-BERT)
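The "traditional IR metrics" leg of this strategy is straightforward to reproduce. Assuming retrieval results and ground-truth relevance are both expressed as sets of chunk identifiers (an assumption; the case study does not describe the labeling scheme), precision, recall, and F-score reduce to a few lines:

```python
def retrieval_metrics(retrieved: set[str], relevant: set[str]) -> dict[str, float]:
    """Classic IR metrics over retrieved vs. ground-truth relevant chunk IDs."""
    tp = len(retrieved & relevant)  # true positives: relevant chunks we actually fetched
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Frameworks like Ragas layer LLM-judged measures (context precision, faithfulness) on top of this kind of set-based scoring; keeping both, as the team did, lets cheap deterministic metrics catch regressions that the more expensive judged metrics would surface only later.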
## Security and Privacy Considerations
The implementation included several security measures:
* Protection against model inversion attacks
* Safeguards against prompt injection
* Data Loss Prevention (DLP) controls
* Preference for internal deployment over cloud when possible
* Careful handling of proprietary information
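The case study names the safeguards but not their implementation. As a rough illustration of the prompt-injection and DLP layers, a first-line screen can be pattern-based; the patterns below are purely illustrative (real controls combine classifiers, allow-lists, and policy engines, and the number format shown is a hypothetical example of sensitive data, not RBC's actual rule set):

```python
import re

# Illustrative patterns only; production systems use far richer detection.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"reveal (your|the) system prompt",
]
DLP_PATTERNS = [
    r"\b\d{3}[- ]?\d{3}[- ]?\d{3}\b",  # SIN-like 9-digit number (hypothetical rule)
]

def screen_input(text: str) -> list[str]:
    """Return policy flags raised by a user message before it reaches the model."""
    flags = []
    lowered = text.lower()
    if any(re.search(p, lowered) for p in INJECTION_PATTERNS):
        flags.append("possible_prompt_injection")
    if any(re.search(p, text) for p in DLP_PATTERNS):
        flags.append("possible_sensitive_data")
    return flags
```

A screen like this would typically gate both directions: user input before retrieval, and model output before display, with flagged messages routed to logging rather than silently dropped.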
## Key Lessons and Best Practices
1. Parsing and Chunking Strategy:
- Proper document parsing proved to be the most critical component
- Chunks needed to maintain meaningful context
- Careful consideration of chunk size and content boundaries was essential
2. User Education and Risk Management:
- Users needed training on system capabilities and limitations
- Clear communication about potential risks and appropriate use cases
- Implementation of guardrails for human operators
3. Technical Optimization:
- Use of best available models for generation tasks
- Smaller, efficient models sufficient for embeddings
- Balance between response quality and speed
4. Conversation Management:
- Managing context across multiple questions proved challenging
- Risk of accuracy degradation in extended conversations
- Need for careful prompt engineering to maintain context
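The conversation-management lesson above can be made concrete with a simple history trimmer. This is a sketch under stated assumptions: turns are (role, text) pairs, the budget is counted in words rather than model tokens, and older turns are dropped outright, whereas a production system would count real tokens and often summarize dropped turns instead of discarding them.

```python
def trim_history(turns: list[tuple[str, str]], max_words: int = 100) -> list[tuple[str, str]]:
    """Keep the most recent (role, text) turns whose combined word count fits
    the budget, walking backwards from the newest turn."""
    kept, total = [], 0
    for role, text in reversed(turns):
        words = len(text.split())
        if total + words > max_words:
            break  # budget exhausted; older turns are dropped
        kept.append((role, text))
        total += words
    return list(reversed(kept))
```

Trimming from the oldest end preserves the turns most likely to matter for the next answer, which directly addresses the accuracy-degradation risk the team observed in extended conversations.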
## Results and Impact
While specific metrics weren't provided in the presentation, the system successfully demonstrated its ability to:
- Reduce time spent searching through documentation
- Provide consistent answers to investment policy questions
- Scale expertise across the organization
- Maintain security and privacy compliance in a heavily regulated environment
The pilot phase proved successful enough that the system was designated for broader deployment through Borealis AI, RBC's specialized AI research division.
## Production Considerations
The deployment strategy focused heavily on reliability and security:
- Internal deployment preferred over cloud-based solutions
- Robust evaluation pipelines to ensure response quality
- Multiple layers of security controls
- Careful consideration of model selection based on performance requirements
- Regular monitoring and evaluation of system outputs
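The monitoring bullet can be sketched as a thin wrapper around the answer path. Everything here is hypothetical scaffolding, not Arcane's actual code: it logs latency and answer length for later evaluation, and applies one simple guardrail (never returning an empty answer to an advisor).

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("rag.monitor")

FALLBACK = "No grounded answer found; please consult the source policy document."

def monitored_answer(query: str, answer_fn) -> str:
    """Call the RAG answer function, logging latency and output size, and
    substituting a safe fallback if the model returns nothing usable."""
    start = time.perf_counter()
    answer = answer_fn(query)
    latency = time.perf_counter() - start
    log.info("query=%r latency=%.3fs answer_len=%d", query, latency, len(answer))
    if not answer.strip():
        return FALLBACK  # guardrail: never surface an empty response
    return answer
```

In a regulated setting these logs double as an audit trail, feeding the same evaluation pipelines described earlier rather than a separate monitoring stack.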
The case study demonstrates the complexity of implementing LLM systems in highly regulated financial environments, emphasizing the importance of careful evaluation, security measures, and user education alongside technical implementation.