Northwestern Mutual: Multi-Agent GenAI System for Developer Support and Documentation

Company

Northwestern Mutual

Title

Multi-Agent GenAI System for Developer Support and Documentation

Industry

Insurance

Link

https://www.youtube.com/watch?v=7pvEYLW1yZw

Year

2023

Summary (short)

Northwestern Mutual implemented a GenAI-powered developer support system to address challenges with their internal developer support chat system, which suffered from long response times and repetitive basic queries. Using Amazon Bedrock Agents, they developed a multi-agent system that could automatically handle common developer support requests, documentation queries, and user management tasks. The system went from pilot to production in just three months and successfully reduced support engineer workload while maintaining strict compliance with internal security and risk management requirements.

Northwestern Mutual's case study demonstrates a practical implementation of multi-agent LLM systems in a highly regulated enterprise environment. The project showcases how a large insurance company successfully deployed GenAI capabilities while maintaining strict security and compliance requirements.

## Project Context and Challenge

Northwestern Mutual provides financial planning services, wealth management, and insurance products. They identified that their internal developer support system, which operated primarily through chat, had several key problems:

* Long response times, especially during weekends and holidays
* Many basic questions that could be automated
* Repetitive requests for simple tasks like user unlocking
* Documentation that existed but wasn't being utilized effectively

## Technical Implementation

The team implemented a multi-agent system using Amazon Bedrock with several key architectural decisions:

### Infrastructure and Services

* Fully serverless architecture to minimize infrastructure management
* Amazon SQS for message queuing and reliability
* AWS Lambda for orchestration layer
* DynamoDB for state management of escalated messages
* OpenSearch Serverless as the vector database for knowledge bases
* Cross-region inference for improved stability and performance

### Agent Architecture

The system was implemented with five distinct agents, each with specific responsibilities:

* Documentation provider
* User management
* Repository management
* Pipeline failure analysis
* Response evaluation agent

### Security and Compliance Features

* Implementation of strict guardrails through Amazon Bedrock
* PII masking and confidential information protection
* Explicit confirmation requirements for any actions (strict yes/no responses only)
* Integration with existing AWS security controls

### Key Design Decisions

The team made several important architectural decisions:

* Keeping agents focused on specific tasks to prevent confusion
* Implementing an evaluation agent to verify responses before sending to users
* Using cross-region inference for improved reliability
* Building comprehensive observability into the system

## Risk Management and Compliance

A notable aspect of the implementation was how they handled the restriction that AI couldn't take direct actions. They worked with risk partners to develop a sustainable solution:

* Agents must explicitly state intended actions
* Users must provide explicit "yes" or "no" confirmation
* No ambiguous confirmations allowed
* All actions are logged and monitored

## Development Timeline and Process

The project moved remarkably quickly:

* Started serious development in June
* Reached production by September
* Implemented in phases with continuous refinement

## Lessons Learned and Best Practices

The team identified several key learnings:

### Data Quality and Management

* Well-curated documentation is crucial for knowledge base effectiveness
* The "garbage in, garbage out" principle applies strongly to GenAI systems

### Technical Implementation

* Cross-region inference should be implemented from the start
* Models should explain their decisions to facilitate debugging
* Agent responsibilities should be limited and focused

### User Experience

* Noise and hallucinations must be filtered before reaching users
* Over-indexing on filtering messages is preferable to risking user trust
* Building trust is crucial and hard to regain if lost

### Operational Considerations

* Strong observability is essential for both technical issues and user interactions
* User behavior monitoring helps identify needs for new agent capabilities
* Regular feedback mechanisms ensure information accuracy

## Results and Impact

The system successfully:

* Reduced response times from hours to minutes for common queries
* Freed up support engineers to focus on complex issues
* Maintained compliance with strict regulatory requirements
* Improved employee engagement through faster response times
* Created a foundation for more complex business-facing applications

## Future Developments

The team is looking forward to implementing multi-agent collaboration capabilities from Amazon Bedrock to:

* Simplify their code base
* Improve agent orchestration
* Reduce complexity in managing multiple agents
* Enable more sophisticated use cases

The case study demonstrates how enterprises can successfully implement GenAI solutions while maintaining strict security and compliance requirements. The focus on user experience, careful agent design, and robust architecture provides a valuable template for other organizations looking to implement similar solutions.
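
The explicit confirmation rule described under Risk Management and Compliance (the agent states the intended action, the user answers with an unambiguous "yes" or "no", and everything is logged) can be illustrated with a small, framework-free sketch. This is not Northwestern Mutual's code and does not use the Amazon Bedrock API; `PendingAction`, `ConfirmationGate`, and `parse_confirmation` are hypothetical names, and treating replies as case-insensitive with surrounding whitespace ignored is an assumption made here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PendingAction:
    """An action an agent has proposed but not yet run (hypothetical type)."""
    description: str
    execute: Callable[[], str]


def parse_confirmation(reply: str) -> Optional[bool]:
    """Return True for an exact "yes", False for an exact "no", else None.

    Assumption: case and surrounding whitespace are ignored; anything else
    (for example "sure" or "yes please") counts as ambiguous.
    """
    normalized = reply.strip().lower()
    if normalized == "yes":
        return True
    if normalized == "no":
        return False
    return None


@dataclass
class ConfirmationGate:
    """Runs an action only after an unambiguous "yes", logging every step."""
    audit_log: List[str] = field(default_factory=list)

    def propose(self, action: PendingAction) -> str:
        # The agent must state the intended action before anything happens.
        self.audit_log.append(f"proposed: {action.description}")
        return f'I intend to: {action.description}. Reply "yes" or "no".'

    def resolve(self, action: PendingAction, reply: str) -> str:
        decision = parse_confirmation(reply)
        if decision is None:
            # Ambiguous confirmations are rejected; nothing is executed.
            self.audit_log.append(f"ambiguous reply for: {action.description}")
            return 'Please reply with exactly "yes" or "no".'
        if not decision:
            self.audit_log.append(f"declined: {action.description}")
            return "Okay, no action taken."
        result = action.execute()
        self.audit_log.append(f"executed: {action.description}")
        return result
```

In this sketch a reply such as "sure, go ahead" leaves the action unexecuted and prompts the user again, which mirrors the "no ambiguous confirmations" requirement. In a real deployment the `execute` callable would wrap whatever backend call performs the user unlock or repository change, and the audit log would go to a durable store rather than an in-memory list.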
