Dropbox's security research team discovered vulnerabilities in OpenAI's GPT-3.5 and GPT-4 models where repeated tokens could trigger model divergence and extract training data. They identified that both single-token and multi-token repetitions could bypass OpenAI's initial security controls, leading to potential data leakage and denial of service risks. The findings were reported to OpenAI, who subsequently implemented improved filtering mechanisms and server-side timeouts to address these vulnerabilities.
# LLM Security Research and Mitigation at Dropbox
## Overview
While building AI-powered features on top of OpenAI's models, Dropbox conducted extensive security research that led to the discovery of significant vulnerabilities in the GPT-3.5 and GPT-4 model families. The research spanned April 2023 to January 2024 and revealed how repeated tokens could compromise model security and potentially leak training data.
## Technical Discovery Process
### Initial Findings
- First discovered a prompt injection vulnerability triggered by repeated character sequences in April 2023
- Found that the vulnerability could bypass prompt guardrails and cause hallucinatory responses
- Documented the research publicly on their tech blog and GitHub repository
- Presented findings at the CAMLIS conference
### Deep Technical Analysis
- Identified that the root cause was token repetition rather than character-sequence repetition (see the tokenization sketch after this list)
- Demonstrated the vulnerability using specific token combinations, detailed in the vulnerability sections below
- Both GPT-3.5 and GPT-4 model families were affected
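To see why the distinction between character repetition and token repetition matters, it helps to look at how a tokenizer handles repeated text. The following is a minimal sketch using OpenAI's `tiktoken` library and the `cl100k_base` encoding used by the GPT-3.5/GPT-4 chat models; the specific strings are illustrative, not taken from Dropbox's research.

```python
# Minimal sketch: how repeated characters vs. repeated tokens differ under
# BPE tokenization. Assumes the cl100k_base encoding (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# A run of one character often collapses into a few multi-character tokens,
# while separating the repeats keeps the token count proportional -- so the
# attack surface is repetition at the token level, not the character level.
for text in ["aaaaaaaaaaaaaaaa", " a" * 16]:
    tokens = enc.encode(text)
    print(f"{text[:20]!r} -> {len(tokens)} tokens: {tokens[:8]}...")
```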
## Vulnerability Details
### Single-Token Attack
- Initial research showed model divergence through single-token repetition
- Could cause models to drift from their instructions, produce hallucinatory responses, and emit memorized training data
### Multi-Token Attack Innovation
- Discovered that multi-token sequences could also trigger the vulnerability
- Example: the 'jq_THREADS' combination (tokens 45748 and 57339); see the construction sketch after this list
- Successfully extracted memorized training data from both GPT-3.5 and GPT-4
- Demonstrated extraction of Bible passages and technical documentation
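As a defensive-testing illustration, a probe like the one described above can be assembled directly from token IDs. The IDs 45748 and 57339 are the pair Dropbox reported; that they decode to the `jq_THREADS` string under the `cl100k_base` encoding is an assumption to verify against your tokenizer version, and the `repeats` value here is arbitrary.

```python
# Sketch of constructing a repeated multi-token probe (defensive testing
# only). Token IDs are from Dropbox's write-up; the cl100k_base decoding
# is an assumption -- verify against your tokenizer version.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def repeated_token_prompt(token_ids: list[int], repeats: int) -> str:
    # A single-token attack is the degenerate case: len(token_ids) == 1.
    return enc.decode(token_ids * repeats)

probe = repeated_token_prompt([45748, 57339], repeats=1000)
print(probe[:40], "...", len(enc.encode(probe)), "tokens")
```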
### OpenAI's Security Response Timeline
- November 2023: Initial filtering implementation for single-token attacks
- January 2024: Dropbox reported multi-token vulnerability
- January 29, 2024: OpenAI updated filtering to block multi-token repeats
- Implemented server-side timeout mechanisms
## Production Security Implications
### Attack Vectors
- Prompt injection leading to guardrail bypassing
- Training data extraction
- Potential denial of service through long-running requests
- Resource exhaustion attacks
### Security Controls and Mitigations
- Input sanitization for repeated tokens (a detector sketch follows this list)
- Setting appropriate max_tokens limits
- Implementing request timeouts
- Monitoring for suspicious token patterns
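Dropbox built a repeated-tokens detector (covered under tooling below); a stripped-down version of the idea might look like the following. The threshold and n-gram window are hypothetical knobs, and counting overlapping n-grams this way is a coarse heuristic that can false-positive on long benign texts, so contiguous run-length checks would be a natural refinement.

```python
# Stripped-down repeated-token detector sketch (not Dropbox's actual tool).
# Flags input whose most frequent token n-gram exceeds a repeat threshold.
from collections import Counter
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def has_repeated_tokens(text: str, max_repeats: int = 50, max_ngram: int = 3) -> bool:
    tokens = enc.encode(text)
    for n in range(1, max_ngram + 1):
        counts = Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
        if counts and max(counts.values()) > max_repeats:
            return True
    return False

# Hypothetical malicious input: the same token pair repeated many times.
assert has_repeated_tokens("jq_THREADS " * 200)
```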
### Architectural Considerations
- Need for robust input validation
- Importance of request timeout mechanisms (see the call sketch after this list)
- Resource utilization monitoring
- Defense-in-depth approach for LLM security
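Putting the list above together, the mitigations can be layered around the completion call itself. This is a sketch assuming the v1 OpenAI Python client and the hypothetical `has_repeated_tokens` detector from the previous section; the specific limits are placeholders to tune for your workload.

```python
# Sketch: defense-in-depth around a chat completion call. Assumes the v1
# OpenAI Python client; has_repeated_tokens is the hypothetical detector
# sketched earlier. Limits below are placeholder values.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def safe_completion(prompt: str) -> str:
    if has_repeated_tokens(prompt):      # input validation layer
        raise ValueError("Rejected: suspicious repeated-token pattern")
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,   # cap output so a diverging model cannot run unbounded
        timeout=30.0,     # client-side timeout against long-running requests
    )
    return response.choices[0].message.content
```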
## Development and Testing Tools
- Developed Python scripts for token analysis
- Created a repeated-tokens detector tool
- Implemented LangChain-compatible detection capabilities (see the callback sketch after this list)
- Planning to open-source security tools
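One plausible shape for a LangChain-compatible check is a callback handler that screens every prompt before the LLM call. This is an illustrative sketch rather than Dropbox's released tool; it assumes `langchain-core` and reuses the hypothetical detector from earlier.

```python
# Illustrative LangChain integration (not Dropbox's released tool): a
# callback that screens prompts with the hypothetical detector above.
from langchain_core.callbacks import BaseCallbackHandler

class RepeatedTokenGuard(BaseCallbackHandler):
    raise_error = True  # propagate the exception instead of just logging it

    def on_llm_start(self, serialized, prompts, **kwargs):
        for prompt in prompts:
            if has_repeated_tokens(prompt):
                raise ValueError("Blocked prompt with repeated-token pattern")

# Attach to any LangChain chat model, e.g.:
#   llm = ChatOpenAI(callbacks=[RepeatedTokenGuard()])
```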
## Best Practices for LLM Operations
### Monitoring and Detection
- Regular security reviews of LLM implementations
- Continuous monitoring for unusual model behavior
- Token pattern analysis in production traffic
- Performance impact assessment
### Response Procedures
- Clear vulnerability disclosure process
- Collaboration with model providers
- Quick deployment of mitigations
- Documentation of security findings
### Risk Management
- Understanding of model security boundaries
- Regular security assessments
- Proactive vulnerability testing
- Balance between functionality and security
## Broader Impact on LLMOps
### Industry Implications
- Demonstrated vulnerabilities likely affect other models
- Need for standardized security testing
- Importance of vendor collaboration
- Balance between model capability and security
### Future Considerations
- Transfer learning implications for security
- Open-source model vulnerabilities
- Need for industry-wide security standards
- Importance of shared security tools and practices
## Lessons Learned
### Technical Insights
- Token-level analysis is crucial for security
- Multiple attack vectors need consideration
- Security controls must evolve with threats
- Resource consumption as a security concern
### Operational Recommendations
- Implement comprehensive token filtering
- Deploy robust timeout mechanisms
- Monitor resource utilization
- Regular security testing and updates
### Documentation and Sharing
- Importance of vulnerability disclosure
- Value of open-source security tools
- Need for industry collaboration
- Documentation of security findings