Company
Scotiabank
Title
AI-Powered Chatbot Automation with Hybrid NLU and LLM Approach
Industry
Finance
Year
2022
Summary (short)
Scotiabank developed a hybrid chatbot system combining traditional NLU with modern LLM capabilities to handle customer service inquiries. They created an innovative "AI for AI" approach using three ML models (nicknamed Luigi, Eva, and Peach) to automate the review and improvement of chatbot responses, resulting in 80% time savings in the review process. The system includes LLM-powered conversation summarization to help human agents quickly understand customer contexts, marking the bank's first production use of generative AI features.
This case study from Scotiabank demonstrates a pragmatic and thoughtful approach to implementing AI systems in a highly regulated financial environment, combining traditional NLU chatbot technology with newer LLM capabilities. The implementation shows careful consideration of control, reliability, and bias management while still pursuing innovation.

### Background and Context

Scotiabank developed their chatbot in response to customer demand and to reduce pressure on their contact center. The initial implementation was launched in November 2022 as an authenticated experience in their mobile app. The chatbot was designed as a traditional intent-based NLU system rather than a pure LLM-based solution, reflecting the bank's need for controlled and predictable responses in a financial context. The system can handle over 700 different types of responses across various banking topics, including accounts, investments, credit cards, loans, and mortgages.

### Core Architecture and Approach

The bank took a hybrid approach to their chatbot implementation:

* Base System: Traditional NLU intent-based chatbot for core functionality
* Human Review Process: Initially implemented a two-week sustainment cycle with AI trainers reviewing customer queries
* Automation Layer: Developed three ML models to automate review and improvement processes
* LLM Integration: Added targeted LLM features for specific use cases like conversation summarization

### AI for AI Implementation

The team developed what they called an "AI for AI" approach, creating three main models (playfully named after video game characters). Illustrative sketches of the classification and similarity techniques follow this section.

**Luigi (Binary Classifier)**

* Purpose: Automates first-level review of chatbot responses
* Function: Determines if bot responses are correct or incorrect
* Challenge: Handled class imbalance through data augmentation
* Results: Successfully automated the initial review process

**Eva (Multi-class Classifier)**

* Purpose: Automates second-level review and intent matching
* Function: Can both validate responses and suggest better intent matches
* Features: Uses n-gram analysis and multiple features to understand context
* Innovation: Works independently from the NLU tool, using direct customer utterances

**Peach (Similarity Analysis)**

* Purpose: Assists AI trainers in training data management
* Function: Calculates similarity scores between new training phrases and existing ones
* Process: Includes data preparation, feature extraction, and vector representation
* Benefit: Helps prevent redundant or conflicting training data
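As a rough illustration of how a Luigi-style first-level review could work, the sketch below trains a binary classifier on trainer-labelled review outcomes and routes low-confidence predictions back to humans. The features, example data, oversampling step, and confidence threshold are all hypothetical; the case study states only that a binary classifier was used and that class imbalance was addressed through data augmentation.

```python
# Minimal sketch of a Luigi-style first-level review classifier.
# All data, features, and thresholds here are illustrative assumptions,
# not Scotiabank's actual implementation.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.utils import resample

# Hypothetical review log: customer utterance plus the intent the bot chose,
# labelled by AI trainers as correct (1) or incorrect (0).
reviews = pd.DataFrame({
    "text": [
        "what is my chequing balance || intent:account_balance",
        "lost my credit card || intent:card_replacement",
        "mortgage renewal rates || intent:account_balance",      # bad match
        "increase my credit limit || intent:credit_limit_change",
    ],
    "correct": [1, 1, 0, 1],
})

# Incorrect responses are rare, so oversample the minority class
# (a simple stand-in for the data augmentation mentioned in the case study).
minority = reviews[reviews["correct"] == 0]
majority = reviews[reviews["correct"] == 1]
augmented = pd.concat([
    majority,
    resample(minority, replace=True, n_samples=len(majority), random_state=42),
])

# TF-IDF word n-grams feeding a simple linear classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(augmented["text"], augmented["correct"])

# Auto-approve confident "correct" predictions; everything else goes to a human.
probs = model.predict_proba(reviews["text"])[:, 1]
for text, p in zip(reviews["text"], probs):
    status = "auto-approve" if p > 0.8 else "send to AI trainer"
    print(f"{p:.2f}  {status}  {text}")
```

An Eva-style second-level model could follow the same pattern with a multi-class target over intents instead of a binary correct/incorrect label, which is what allows it to suggest a better intent match rather than just flagging a failure.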
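The Peach-style check on new training phrases can likewise be approximated with vector representations and a cosine-similarity score between a proposed phrase and the phrases already assigned to intents. This is a minimal sketch under assumed TF-IDF features and an arbitrary threshold; the actual representation and cutoff are not disclosed in the case study.

```python
# Minimal sketch of a Peach-style similarity check for new training phrases.
# The feature choice (TF-IDF) and the 0.7 threshold are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Existing training phrases, grouped by intent (hypothetical examples).
existing = {
    "account_balance": ["what is my balance", "how much money do I have"],
    "card_replacement": ["I lost my credit card", "replace my debit card"],
}

def check_new_phrase(phrase: str, threshold: float = 0.7):
    """Return existing phrases that are suspiciously similar to the new one."""
    all_phrases = [p for phrases in existing.values() for p in phrases]
    intents = [i for i, phrases in existing.items() for _ in phrases]

    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    vectors = vectorizer.fit_transform(all_phrases + [phrase])
    sims = cosine_similarity(vectors[-1], vectors[:-1]).ravel()

    # Flag near-duplicates so AI trainers can avoid adding redundant or
    # conflicting training data across intents.
    return [
        (all_phrases[i], intents[i], round(float(sims[i]), 2))
        for i in range(len(all_phrases))
        if sims[i] >= threshold
    ]

print(check_new_phrase("whats my account balance"))
```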
### LLM Integration and Prompt Engineering

The team's approach to incorporating LLM capabilities shows careful consideration of when and where to use this technology:

* Focus Area: Implemented LLM-based summarization for chat handovers to human agents
* Results: Achieved 80% reduction in conversation length for agent review
* Process: Developed through careful prompt engineering, including:
  * Objective definition
  * Prompt type evaluation (zero-shot vs. few-shot)
  * Context vs. examples testing
  * Evaluation methodology using ROUGE-N metrics
  * Iterative refinement based on feedback

A sketch of this prompt-evaluation loop appears at the end of this write-up.

### Bias Control and Quality Assurance

The team implemented several measures to ensure system reliability and fairness:

* Separate Training Sets: Luigi and Eva use different data sources and features
* Multiple Algorithm Types: Diversity in approach helps prevent systematic bias
* Regular Bias Audits: Both during development and in production
* Human Oversight: Maintained for cases where automated systems disagree
* Transparency: Following bank-wide AI development best practices

### Results and Impact

The implementation achieved significant improvements:

* Time Savings: Automated significant portions of the review process
* Quality Improvement: Maintained high accuracy while increasing coverage
* Efficiency: Reduced average handle time for human agents
* Scale: Enabled review of more customer interactions
* Innovation: Successfully implemented the bank's first generative AI use case in production

### Technical Considerations and Challenges

The team faced and overcame several technical challenges:

* Bias Management: Careful design to prevent automated perpetuation of biases
* Data Quality: Managing class imbalance in training data
* Regulatory Compliance: Navigating financial institution requirements
* Integration: Combining traditional NLU with newer LLM capabilities
* Stakeholder Management: Balancing technical capabilities with business needs

The case study demonstrates a thoughtful approach to modernizing customer service systems in a regulated environment, showing how traditional NLU systems can be enhanced with both custom ML models and targeted LLM capabilities. The team's focus on controlled automation while maintaining quality and preventing bias provides valuable lessons for similar implementations in financial services and other regulated industries.
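To make the prompt-evaluation loop from the LLM Integration section concrete, the following hypothetical sketch compares a zero-shot and a few-shot summarization prompt using ROUGE-N scores against human-written reference summaries. The prompt wording, the `call_llm` placeholder, and the choice of the open-source `rouge-score` package are assumptions; the case study confirms only that prompt types were compared and evaluated with ROUGE-N metrics.

```python
# Hypothetical sketch of comparing zero-shot vs. few-shot summarization prompts
# with ROUGE-N, in the spirit of the evaluation loop described above.
# `call_llm` is a placeholder for whatever model endpoint is actually used.
from rouge_score import rouge_scorer  # pip install rouge-score

ZERO_SHOT = "Summarize this banking support chat for the human agent taking over:\n{chat}"

FEW_SHOT = (
    "Summarize banking support chats for the agent taking over.\n"
    "Chat: [customer asks about a blocked card] -> Summary: Customer's card was "
    "blocked after travel; wants it unblocked and a limit review.\n"
    "Chat: {chat} -> Summary:"
)

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to the chosen LLM and return its reply."""
    raise NotImplementedError("wire up to the model endpoint being evaluated")

def evaluate(prompts: dict[str, str], chats: list[str], references: list[str]) -> dict[str, float]:
    """Average ROUGE-2 F1 of each prompt's summaries against reference summaries."""
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2"], use_stemmer=True)
    results = {}
    for name, template in prompts.items():
        scores = []
        for chat, ref in zip(chats, references):
            summary = call_llm(template.format(chat=chat))
            scores.append(scorer.score(ref, summary)["rouge2"].fmeasure)
        results[name] = sum(scores) / len(scores)
    return results

# Usage: pick the prompt style with the best average score, then iterate on its wording.
# evaluate({"zero_shot": ZERO_SHOT, "few_shot": FEW_SHOT}, chats, reference_summaries)
```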
