Anthropic developed Clio, a privacy-preserving system for understanding how their LLM Claude is used in the real world without compromising user privacy. The system uses Claude itself to analyze and cluster conversations, extracting high-level insights without humans ever reading the raw data. This allowed Anthropic to improve their safety evaluations, understand usage patterns across languages and domains, and detect potential misuse - all while maintaining strong privacy guarantees through techniques like minimum cluster sizes and privacy auditing.
This case study explores how Anthropic built and deployed Clio, an innovative system for analyzing LLM usage patterns while preserving user privacy. The development team consisted of researchers and engineers from Anthropic's Societal Impacts team who were focused on understanding how their AI systems affect users and society at large.
The key challenge they faced was a fundamental tension between two competing needs: understanding how Claude was being used in production in order to improve safety and capabilities, and respecting user privacy by avoiding any intrusion into private conversations. Prior to Clio, their approach was primarily top-down - hypothesizing potential issues and building specific evaluations for them, rather than starting from actual usage patterns.
Clio's technical architecture is a multi-stage pipeline (a minimal code sketch follows this list):
* Conversations are first analyzed by Claude itself to create high-level summaries of user intent
* These summaries are converted into embeddings that capture semantic meaning
* Similar conversations are clustered based on these embeddings
* Raw conversations are discarded after summarization
* Claude analyzes the clusters to generate descriptions and metadata
* A privacy auditing system ensures no personally identifiable information remains
* Strict minimum cluster sizes are enforced (e.g., requiring 1000+ distinct conversations)
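The sketch below illustrates this kind of pipeline in Python. It is a hedged approximation, not Anthropic's published implementation: the helper callables (`summarize`, `embed`, `describe`, `audit_privacy`) are hypothetical stand-ins for Claude-backed components, and the clustering method (k-means) and cluster count are illustrative assumptions.

```python
from dataclasses import dataclass

import numpy as np
from sklearn.cluster import KMeans

MIN_CLUSTER_SIZE = 1000  # minimum distinct conversations per reported cluster


@dataclass
class Cluster:
    description: str
    size: int


def analyze_usage(conversations, summarize, embed, describe, audit_privacy):
    """Privacy-preserving usage analysis in the spirit of Clio.

    `summarize`, `embed`, `describe`, and `audit_privacy` are hypothetical
    callables standing in for Claude-backed components.
    """
    # 1. Summarize each conversation into a high-level statement of intent.
    summaries = [summarize(conv) for conv in conversations]
    del conversations  # raw conversations are not referenced past this point

    # 2. Embed summaries so semantically similar usage lands close together.
    vectors = np.array([embed(s) for s in summaries])

    # 3. Group similar summaries into clusters (k-means as a stand-in).
    labels = KMeans(n_clusters=50, n_init="auto").fit_predict(vectors)

    # 4. Describe each cluster, dropping small ones and auditing for PII.
    clusters = []
    for label in set(labels):
        members = [s for s, l in zip(summaries, labels) if l == label]
        if len(members) < MIN_CLUSTER_SIZE:
            continue  # too few conversations to report safely
        description = describe(members)
        if audit_privacy(description):  # True only if no identifying detail remains
            clusters.append(Cluster(description, len(members)))
    return clusters
```

The key design property is visible in the structure itself: raw conversations exit the pipeline immediately after summarization, and nothing is reported unless it survives both the size threshold and the privacy audit.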
The team implemented several key privacy safeguards (a threshold-enforcement sketch follows this list):
* No human ever reads raw conversations
* Multiple layers of privacy filtering and auditing
* Minimum thresholds for both unique organizations and conversation counts
* Removal of any detail that applies to fewer than roughly 1,000 individuals, since such rare specifics can themselves be identifying
* Defense-in-depth strategy for privacy protection
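A sketch of how such aggregation thresholds might be enforced. The exact cutoffs below are illustrative assumptions based on the description above (the 500-account figure in particular is invented for the example; only the ~1,000-conversation threshold appears in the source).

```python
MIN_CONVERSATIONS = 1000   # minimum distinct conversations per cluster (from the text above)
MIN_UNIQUE_ACCOUNTS = 500  # assumed minimum distinct accounts/organizations


def passes_aggregation_thresholds(cluster):
    """Defense in depth: a cluster is reportable only if it aggregates enough
    conversations AND enough distinct accounts that no detail can be traced
    back to a small group of users."""
    return (
        cluster["num_conversations"] >= MIN_CONVERSATIONS
        and cluster["num_unique_accounts"] >= MIN_UNIQUE_ACCOUNTS
    )


def redact_rare_details(details, population_counts, k=1000):
    """Drop any detail that applies to fewer than ~k individuals, since a
    sufficiently rare specific can itself be identifying."""
    return [d for d in details if population_counts.get(d, 0) >= k]
```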
The development process started with extensive ethical discussions before any code was written. The team carefully considered potential privacy risks and designed safeguards from the ground up. They validated Clio's accuracy through synthetic data experiments, generating test datasets with known distributions and verifying that Clio could reconstruct them correctly.
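One way to implement that validation, sketched under assumptions: generate conversations from a known topic distribution, run them through the pipeline, and check that the recovered cluster proportions match the ground truth. The topic names, sample size, and tolerance here are illustrative, and the identity `pipeline` in the final line stands in for the real summarize-embed-cluster chain.

```python
import random
from collections import Counter

# Ground-truth topic mix for the synthetic dataset (illustrative values).
TRUE_DISTRIBUTION = {"coding help": 0.40, "translation": 0.35, "creative writing": 0.25}


def generate_synthetic_dataset(n=10_000, seed=0):
    """Sample conversation topics with known proportions."""
    rng = random.Random(seed)
    topics = list(TRUE_DISTRIBUTION)
    weights = [TRUE_DISTRIBUTION[t] for t in topics]
    return rng.choices(topics, weights=weights, k=n)


def recovers_distribution(pipeline, dataset, tolerance=0.03):
    """Run the pipeline and check that recovered cluster proportions
    match the known ground truth within `tolerance`."""
    recovered = Counter(pipeline(dataset))  # maps topic label -> count
    n = len(dataset)
    return all(
        abs(recovered[topic] / n - share) <= tolerance
        for topic, share in TRUE_DISTRIBUTION.items()
    )


# An identity "pipeline" stands in for the real summarize->embed->cluster chain.
print(recovers_distribution(lambda convs: convs, generate_synthetic_dataset()))
```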
Some key findings and applications of Clio included:
* Understanding usage patterns across different languages and cultures
* Identifying over-refusal and under-refusal patterns in Claude's responses
* Detecting coordinated abuse attempts through unusual cluster patterns (see the heuristic sketch after this list)
* Monitoring new feature deployments, such as Claude's computer use capability
* Understanding how Claude was being used during sensitive periods like elections
* Discovering unexpected use cases like parenting advice and crisis counseling
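The abuse-detection finding above relies on cluster-level statistics rather than on reading individual conversations. Below is a sketch of two plausible heuristics - both assumptions for illustration, not Anthropic's published rules: flagging a cluster whose size spikes abruptly relative to its history, or whose traffic is dominated by a handful of accounts.

```python
import statistics


def flag_suspicious_clusters(clusters, spike_zscore=3.0, max_account_share=0.20):
    """Flag clusters whose aggregate patterns suggest coordinated misuse.

    Heuristics (illustrative assumptions):
      * the cluster's latest daily count spikes far above its history, or
      * a single account contributes an outsized share of the traffic.
    """
    flagged = []
    for cluster in clusters:
        history = cluster["daily_counts"]  # e.g. [120, 130, 125, ..., 900]
        baseline = history[:-1]
        mean = statistics.mean(baseline)
        stdev = statistics.stdev(baseline) or 1.0  # guard against zero variance
        spike = (history[-1] - mean) / stdev

        counts = cluster["per_account_counts"]  # account id -> conversation count
        top_share = max(counts.values()) / sum(counts.values())

        if spike > spike_zscore or top_share > max_account_share:
            flagged.append(cluster["description"])
    return flagged
```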
The system proved particularly valuable for improving Claude's behavior:
* Calibrating refusal patterns for different types of requests
* Improving responses in different languages
* Understanding and addressing potential misuse
* Grounding safety evaluations in real-world usage patterns
* Identifying areas where Claude needed better handling of limitations or uncertainty
A notable aspect of this case study is Anthropic's decision to publicly share detailed information about Clio's architecture and findings, including publishing the specific prompts and parameters needed for other organizations to implement similar systems. This transparency was driven by Anthropic's status as a public benefit company and its belief in the importance of sharing safety-relevant findings with the broader AI community.
The team identified several future directions for Clio:
* Understanding emotional impacts and user relationships with AI
* Studying how AI is changing work patterns and economic impacts
* Improving educational applications
* Analyzing value judgments and ensuring pluralistic responses
* Expanding privacy-preserving monitoring capabilities
The case study demonstrates how careful engineering and ethical consideration can resolve seemingly conflicting requirements - in this case, gaining valuable usage insights while maintaining strong privacy guarantees. The success of Clio suggests that privacy-preserving analytics may be a crucial component of responsible AI deployment.