Company
Wealthsimple
Title
Building a Secure and Scalable LLM Gateway for Financial Services
Industry
Finance
Year
2023
Summary (short)
Wealthsimple, a Canadian FinTech company, developed a comprehensive LLM platform to securely leverage generative AI while protecting sensitive financial data. They built an LLM gateway with built-in security features, PII redaction, and audit trails, eventually expanding to include self-hosted models, RAG capabilities, and multi-modal inputs. The platform achieved widespread adoption with over 50% of employees using it monthly, leading to improved productivity and operational efficiencies in client service workflows.
This case study examines Wealthsimple's journey in implementing LLMs in a highly regulated financial services environment, showcasing a thoughtful approach to balancing innovation with security and compliance requirements. Wealthsimple is a Canadian FinTech company focused on helping Canadians achieve financial independence through a unified app for investing, saving, and spending. Their GenAI initiative was organized into three main streams: employee productivity, operational optimization, and the underlying LLM platform infrastructure.

The company's LLM journey began in response to the release of ChatGPT in late 2022. Recognizing both the potential and the risks of LLMs, particularly in financial services where data security is paramount, they developed several key components.

**LLM Gateway Development and Security**

Their first major initiative was building an LLM gateway to address security concerns while enabling exploration of LLM capabilities. The gateway included:

* Audit trails tracking what data was sent externally
* VPN and Okta authentication requirements
* A proxy service for various LLM providers
* Conversation export/import capabilities
* Retry and fallback mechanisms for reliability

To drive adoption, they implemented both incentives and soft enforcement mechanisms:

* Free usage for employees
* Centralized access to multiple LLM providers
* Integration with staging and production environments
* Gentle nudges when employees used external LLM services directly

**PII Protection and Self-Hosted Models**

A significant enhancement was their custom PII redaction model, built using Microsoft's Presidio framework and an internally developed NER model. However, redaction created some friction in the user experience, leading them to explore self-hosted models as a solution.
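The gateway's request path described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Wealthsimple's actual code: the `Gateway` class, the regex-based redaction, and the provider table are hypothetical stand-ins for the production PII redaction model and the real provider integrations.

```python
import re
from dataclasses import dataclass, field

# Illustrative sketch of the gateway's request path: redact PII, write an
# audit entry, then try providers in order with fallback. All names and
# patterns here are assumptions for illustration only.

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SIN = re.compile(r"\b\d{3}[- ]?\d{3}[- ]?\d{3}\b")  # Canadian SIN format

def redact(text: str) -> str:
    """Replace PII spans before anything leaves the network boundary."""
    return SIN.sub("<SIN>", EMAIL.sub("<EMAIL>", text))

@dataclass
class Gateway:
    providers: list          # ordered (name, callable) pairs: primary first
    audit_log: list = field(default_factory=list)

    def chat(self, user: str, prompt: str) -> str:
        clean = redact(prompt)                  # raw PII is never forwarded
        self.audit_log.append({"user": user, "sent": clean})
        last_err = None
        for name, call in self.providers:       # retry via fallback chain
            try:
                return call(clean)
            except Exception as err:
                last_err = err
        raise RuntimeError("all providers failed") from last_err
```

With a flaky primary provider, a request still succeeds through the fallback, and the audit trail records only the redacted prompt that actually left the boundary.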
They implemented:

* Self-hosted versions of Llama 2 and Mistral models
* Whisper for voice transcription
* Models hosted within their VPC for secure processing of sensitive data

**RAG Implementation and Platform Evolution**

They built a comprehensive RAG system using:

* Elasticsearch (later OpenSearch) as their vector database
* Airflow DAGs for knowledge base updates and indexing
* A simple semantic search API
* Integration with LangChain for orchestration

**Data Applications Platform**

To facilitate experimentation and rapid prototyping, they created a data applications platform using Python and Streamlit, which led to:

* Seven applications within the first two weeks
* Two applications progressing to production
* Faster feedback loops for stakeholders

**2024 Strategy Refinement**

The company's approach evolved to become more business-focused:

* Discontinued ineffective nudge mechanisms
* Added Gemini integration for enhanced context window capabilities
* Implemented multi-modal capabilities
* Adopted AWS Bedrock for managed LLM services

**Results and Adoption Metrics**

The platform achieved significant adoption:

* 2,200+ daily messages
* ~33% weekly active users
* ~50% monthly active users
* 80% of LLM usage through the gateway
* Usage spread evenly across tenure and roles

**Key Use Cases and Applications**

The primary use cases fell into three categories:

* Programming support (approximately 50% of usage)
* Content generation/augmentation
* Information retrieval

A notable production application was their client experience triaging workflow, which combined:

* Whisper for voice transcription
* Self-hosted LLMs for classification enrichment
* Automated routing of customer inquiries

**Lessons Learned**

Important insights from their implementation included:

* The need for integrated tools rather than separate platforms
* The importance of centralization over multiple specialized tools
* The value of balancing security with usability
* The evolution from a build-first to a more nuanced build-vs-buy approach

**Technical Architecture**

Their platform integrated multiple components:

* OpenSearch/Elasticsearch for vector storage
* LangChain for orchestration
* Airflow for data pipeline management
* Custom API endpoints compatible with OpenAI's specifications
* Multiple LLM providers including OpenAI, Cohere, and Anthropic (via Bedrock)

The case study demonstrates the complexity of implementing LLMs in a regulated industry and the importance of building proper infrastructure and guardrails while maintaining user adoption and productivity. Their journey from a basic gateway to a comprehensive platform showcases the evolution of enterprise LLM adoption and the balance between security, usability, and business value.
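The RAG retrieval path that the architecture describes (a vector index fed by Airflow, queried through a simple semantic search API) can be sketched end to end. This is a toy illustration under stated assumptions: the hashed bag-of-words embedding stands in for a real embedding model, the in-memory list stands in for an OpenSearch kNN index, and `KnowledgeBase` and its methods are hypothetical names.

```python
import hashlib
import math

# Minimal sketch of a semantic-search flow like the one behind the RAG
# system: embed documents, index them, answer queries by cosine similarity.
# The embedding and index here are deliberately simplistic stand-ins.

def embed(text: str, dims: int = 64) -> list[float]:
    vec = [0.0] * dims
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]          # unit vector: dot == cosine

class KnowledgeBase:
    def __init__(self) -> None:
        self.docs: list[tuple[str, list[float]]] = []

    def index(self, doc: str) -> None:      # a scheduled DAG would batch this
        self.docs.append((doc, embed(doc)))

    def search(self, query: str, k: int = 3) -> list[str]:
        qv = embed(query)
        scored = sorted(self.docs,
                        key=lambda d: -sum(a * b for a, b in zip(qv, d[1])))
        return [doc for doc, _ in scored[:k]]
```

In a full RAG pipeline, the top-k passages returned by `search` would then be placed into the prompt by the orchestration layer (LangChain, in Wealthsimple's case) before the LLM call.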
