Untold Studios developed an AI assistant integrated into Slack to help their visual effects artists access internal resources and tools more efficiently. Using Amazon Bedrock with Claude 3.5 Sonnet and a serverless architecture, they created a natural language interface that handles 120 queries per day, reducing information search time from minutes to seconds while maintaining strict data security. The solution combines RAG capabilities with function calling to access multiple knowledge bases and internal systems, significantly reducing the support team's workload.
Untold Studios, a tech-driven visual effects and animation studio, implemented an AI assistant to boost their artists' productivity while meeting strict security requirements. This case study demonstrates a practical approach to deploying LLMs in a production environment, with particular attention to security, scalability, and user experience.
The core challenge was giving their diverse pool of artists a single, easy way to reach internal resources and tools. Their solution leverages Amazon Bedrock and plugs directly into their existing Slack workflow, making adoption seamless for end users.
## Technical Architecture and Implementation
The solution is built on a robust serverless architecture using multiple AWS services:
* The foundation is Amazon Bedrock running Anthropic's Claude 3.5 Sonnet model for natural language processing
* A two-function Lambda approach handles Slack integration: the first function acknowledges events within Slack's three-second window, while the second performs the slower retrieval and generation work (see the sketch after this list)
* Amazon S3 stores unstructured data, while DynamoDB provides persistent storage for logs and user preferences
* Security and access control are maintained through careful user management and role-based access
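The split exists because Slack's Events API expects an acknowledgment within three seconds, which is far too short for retrieval and generation. Below is a minimal sketch of that two-Lambda pattern, assuming an API Gateway trigger and a hypothetical worker function name rather than the studio's actual code:

```python
import json
import boto3

lambda_client = boto3.client("lambda")

def ack_handler(event, context):
    """First Lambda: receives Slack events via API Gateway and acknowledges fast."""
    body = json.loads(event["body"])

    # Slack sends a one-time URL verification challenge when the app is configured.
    if body.get("type") == "url_verification":
        return {"statusCode": 200, "body": body["challenge"]}

    # Hand the event to the worker asynchronously ("Event" invocation), then
    # return immediately so Slack sees a response well inside its time limit.
    lambda_client.invoke(
        FunctionName="assistant-worker",  # hypothetical worker function name
        InvocationType="Event",
        Payload=json.dumps(body),
    )
    return {"statusCode": 200, "body": ""}

def worker_handler(event, context):
    """Second Lambda: free to spend tens of seconds on retrieval and generation."""
    slack_event = event["event"]
    # ... run RAG and tool calls on slack_event["text"] here, then post the
    # answer back to the originating channel via Slack's chat.postMessage API.
```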
A particularly noteworthy aspect of their implementation is the RAG (Retrieval Augmented Generation) setup. Instead of building a custom vector store, they leveraged Amazon Bedrock connectors to integrate with existing knowledge bases in Confluence and Salesforce. For other data sources, they export content to S3 and use the S3 connector, letting Amazon Bedrock handle embeddings and vector search. This approach significantly reduced development complexity and time.
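With a Bedrock Knowledge Base in place, a RAG query can be a single call against the `bedrock-agent-runtime` client. The snippet below is a generic sketch of that call; the knowledge base ID and model ARN are placeholders, not values from the case study:

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

def ask_knowledge_base(question: str) -> str:
    """Retrieve relevant chunks and generate an answer in one managed call."""
    response = agent_runtime.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": "KB_ID_PLACEHOLDER",
                "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                            "anthropic.claude-3-5-sonnet-20240620-v1:0",
            },
        },
    )
    # Bedrock handles chunking, embeddings, and vector search behind the connector.
    return response["output"]["text"]
```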
## Function Calling Implementation
The implementation of function calling demonstrates a pragmatic approach to production LLM deployment. Rather than using comprehensive frameworks like LangChain, they opted for a lightweight, custom approach that focuses on their specific needs. They created an extensible base class for tools, where each new function is automatically discovered and added to the LLM's capabilities based on user permissions.
Their function calling system is intentionally limited to a single pass rather than implementing a full agent architecture, prioritizing simplicity and robustness. This shows a mature understanding of the tradeoffs between capability and reliability in production systems.
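A rough sketch of what this can look like with the Bedrock Converse API follows; the `Tool` base class, the example tool, and every name in it are illustrative assumptions rather than the studio's actual code:

```python
import boto3

bedrock = boto3.client("bedrock-runtime")
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

class Tool:
    """Base class: each subclass declares a spec and an execute method."""
    name: str
    description: str
    input_schema: dict

    def execute(self, **kwargs) -> str:
        raise NotImplementedError

    def spec(self) -> dict:
        return {"toolSpec": {
            "name": self.name,
            "description": self.description,
            "inputSchema": {"json": self.input_schema},
        }}

class RenderQueueStatus(Tool):
    """Hypothetical example tool; real tools would call internal systems."""
    name = "render_queue_status"
    description = "Return the current status of a named render queue."
    input_schema = {"type": "object",
                    "properties": {"queue": {"type": "string"}},
                    "required": ["queue"]}

    def execute(self, queue: str) -> str:
        return f"Queue {queue}: 12 jobs running"  # stubbed lookup

def answer(question: str, tools: list) -> str:
    """Single-pass tool use: at most one round of tool calls, no agent loop."""
    registry = {t.name: t for t in tools}
    tool_config = {"tools": [t.spec() for t in tools]}
    messages = [{"role": "user", "content": [{"text": question}]}]

    response = bedrock.converse(modelId=MODEL_ID, messages=messages,
                                toolConfig=tool_config)

    if response["stopReason"] == "tool_use":
        assistant_msg = response["output"]["message"]
        messages.append(assistant_msg)
        results = []
        for block in assistant_msg["content"]:
            if "toolUse" in block:
                call = block["toolUse"]
                output = registry[call["name"]].execute(**call["input"])
                results.append({"toolResult": {
                    "toolUseId": call["toolUseId"],
                    "content": [{"text": output}],
                }})
        # Feed the tool results back once and take whatever the model says next.
        messages.append({"role": "user", "content": results})
        response = bedrock.converse(modelId=MODEL_ID, messages=messages,
                                    toolConfig=tool_config)

    return response["output"]["message"]["content"][0]["text"]
```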
## Security and Monitoring
Security considerations are evident throughout the design:
* All data remains within the AWS ecosystem
* Strict access controls based on user roles
* Comprehensive logging of all queries and tool invocations to DynamoDB (sketched after this list)
* Integration with CloudWatch for performance monitoring
* Direct error notifications to a dedicated Slack channel
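The audit log above can be as simple as one `put_item` per interaction; the table name and item layout here are assumptions for illustration:

```python
import time
import boto3

audit_table = boto3.resource("dynamodb").Table("assistant-audit-log")  # hypothetical table

def log_interaction(user_id: str, query: str, tools_used: list, latency_ms: int):
    """Write one audit record per query so usage can be reviewed later."""
    audit_table.put_item(Item={
        "user_id": user_id,                    # partition key
        "timestamp": int(time.time() * 1000),  # sort key, epoch milliseconds
        "query": query,
        "tools_used": tools_used,
        "latency_ms": latency_ms,
    })
```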
## User Experience and Integration
The solution demonstrates careful attention to user experience:
* Integration with Slack eliminates the need for new software or interfaces
* Immediate feedback through emoji reactions shows query status (see the sketch after this list)
* Support for both direct messages and channel mentions covers different use cases
* User-specific memory stores preferences and defaults
* Natural language interface handles ambiguous queries effectively
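Two of these details, emoji-reaction status updates and per-user memory, each map to a handful of API calls. A hedged sketch, with assumed emoji choices and a hypothetical preferences table:

```python
import boto3
from slack_sdk import WebClient

slack = WebClient(token="xoxb-...")  # bot token, normally read from a secret store
prefs_table = boto3.resource("dynamodb").Table("assistant-user-memory")  # hypothetical

def mark_in_progress(channel: str, message_ts: str):
    # React to the user's message so they see the query was picked up immediately.
    slack.reactions_add(channel=channel, timestamp=message_ts,
                        name="hourglass_flowing_sand")

def mark_done(channel: str, message_ts: str):
    slack.reactions_remove(channel=channel, timestamp=message_ts,
                           name="hourglass_flowing_sand")
    slack.reactions_add(channel=channel, timestamp=message_ts,
                        name="white_check_mark")

def load_user_memory(user_id: str) -> dict:
    # Stored defaults (e.g. a preferred project) can be merged into the prompt.
    item = prefs_table.get_item(Key={"user_id": user_id}).get("Item", {})
    return item.get("preferences", {})
```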
## Performance and Impact
The system currently handles about 120 queries per day, with 10-20% requiring additional tool interactions. The impact has been significant:
* Information search time reduced from minutes to seconds
* Decreased load on support and technology teams
* Better utilization of internal resources
* Streamlined access to multiple systems through a single interface
## Future Development
The team has plans for continued development:
* Adding render job error analysis capabilities
* Implementing semantic clustering of queries to identify common issues
* Proactive knowledge base updates based on query patterns
* Integration of new AI capabilities as they become available
## Technical Lessons and Best Practices
Several valuable lessons emerge from this implementation:
* Using pre-built connectors and managed services can significantly reduce development time
* Modular architecture with clean separation between LLM interface and business logic enables easy expansion
* Starting with a simple but robust implementation (single-pass function calling) is often preferable to a complex agent architecture
* Comprehensive logging and monitoring are crucial for production systems
* Integration with existing workflows (Slack) reduces adoption friction
## Challenges and Solutions
The case study reveals several challenges they overcame:
* Meeting Slack's 3-second response requirement through a two-function architecture
* Maintaining security while providing broad access to internal resources
* Handling diverse user needs and technical experience levels
* Managing tool access based on user roles and permissions
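The last point can be handled by filtering the tool registry before it is handed to the model, so each user only sees functions their role permits. A minimal illustration building on the `Tool` base class sketched earlier, with a made-up role map:

```python
# Hypothetical mapping from role to permitted tool names.
ROLE_TOOLS = {
    "artist": {"render_queue_status"},
    "support": {"render_queue_status", "reset_license"},
}

def tools_for_user(role: str, all_tools: list) -> list:
    """Return only the tools this role is allowed to invoke."""
    allowed = ROLE_TOOLS.get(role, set())
    return [tool for tool in all_tools if tool.name in allowed]
```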
The implementation demonstrates a thoughtful balance between capability and complexity, and between security and accessibility, showing how LLMs can be deployed effectively in production environments with strict security requirements. The focus on managed services and existing integrations, rather than building everything from scratch, offers a valuable template for other organizations looking to implement similar solutions.