Nextdoor developed a novel system to improve email engagement by optimizing notification subject lines using generative AI. They combined prompt engineering via the ChatGPT API with a reward model and rejection sampling to generate authentic, engaging subject lines without hallucinations. The system includes caching for cost optimization and daily performance monitoring. A/B testing showed a 1% lift in sessions, a 0.4% increase in Weekly Active Users, and a 1% increase in ad revenue compared to user-generated subject lines.
# Nextdoor's LLMOps Implementation for Email Engagement Optimization
## Overview and Business Context
Nextdoor, the neighborhood network platform, implemented a sophisticated LLMOps system to enhance user engagement through AI-generated email subject lines. The project specifically focused on their "New and Trending notifications" email system, where they needed to generate engaging subject lines for posts being shared with users.
## Technical Implementation
### Subject Line Generator Architecture
- Used the OpenAI API (ChatGPT) without fine-tuning as the base generation model
- Applied prompt engineering so the model extracts authentic content from the post rather than composing new text (a sketch follows this list)
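To make the extraction-first approach concrete, here is a minimal sketch of what such a generator could look like with the OpenAI Python client. The prompt wording, model name, and sampling parameters are illustrative assumptions, not Nextdoor's production values.

```python
# Illustrative sketch only: prompt text, model, and parameters are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

EXTRACTION_PROMPT = (
    "Choose the most engaging sentence or phrase from the post below to use "
    "verbatim as an email subject line. Do not add, rephrase, or invent words.\n\n"
    "Post:\n{post}\n\nSubject line:"
)

def generate_subject_line(post_text: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask the LLM to extract (not compose) a candidate subject line."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": EXTRACTION_PROMPT.format(post=post_text)}],
        temperature=0.7,
        max_tokens=30,
    )
    return response.choices[0].message.content.strip()
```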
### Reward Model Development
- Fine-tuned OpenAI's "ada" model for evaluating subject line engagement
- Used as the scorer in a rejection-sampling loop: multiple candidate subject lines are generated and the reward model keeps the most engaging, authentic candidate (see the sketch after this list)
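A minimal sketch of that rejection-sampling loop is below. The candidate count and the shape of the scoring function are assumptions, and the reward model is abstracted behind a callable rather than a concrete fine-tuned ada call.

```python
# Sketch of rejection sampling against a reward model; signatures are assumptions.
from typing import Callable, List

def rejection_sample(
    post_text: str,
    generate: Callable[[str], str],      # e.g. generate_subject_line from the sketch above
    score: Callable[[str, str], float],  # reward model: (post, subject) -> predicted engagement
    num_candidates: int = 5,
) -> str:
    """Generate several candidate subject lines and keep the one the reward model rates highest."""
    candidates: List[str] = [generate(post_text) for _ in range(num_candidates)]
    return max(candidates, key=lambda subject: score(post_text, subject))
```

Sampling more candidates improves the odds of a high-scoring subject line but multiplies API calls, which is where the cost controls described next come in.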
### Engineering Infrastructure
Beyond the models themselves, the production system relies on cost controls, ongoing monitoring, and quality safeguards, covered in the subsections below.
### Cost Optimization
- Implemented a caching system so subject lines are not regenerated unnecessarily, keeping API costs down
- Retry mechanism for failed generation requests (a sketch of both follows this list)
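As a rough illustration of how caching and retries might fit together, the sketch below memoizes subject lines per post and retries transient failures with exponential backoff. The in-memory dictionary, attempt count, and backoff schedule are assumptions; a production system would more likely use a shared cache.

```python
import time
from typing import Callable, Dict

# Illustrative in-memory cache keyed by post id; a real deployment would likely
# use a shared store so each post's subject line is generated at most once.
_subject_cache: Dict[str, str] = {}

def get_subject_line(
    post_id: str,
    post_text: str,
    generate: Callable[[str], str],  # any callable mapping post text to a subject line
    max_attempts: int = 3,
) -> str:
    """Return a cached subject line, or generate one with retries on failure."""
    if post_id in _subject_cache:
        return _subject_cache[post_id]
    for attempt in range(1, max_attempts + 1):
        try:
            subject = generate(post_text)
            _subject_cache[post_id] = subject
            return subject
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            time.sleep(2 ** attempt)  # simple exponential backoff before retrying
```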
### Performance Monitoring and Maintenance
- Daily monitoring of reward model performance
- Dedicated control groups held out to measure model performance against real engagement
- Retraining triggers based on monitored performance (one possible check is sketched after this list)
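One plausible shape for the daily check, assuming the control groups are used to compare the reward model's preferences against realized engagement: compute a simple pairwise accuracy and flag retraining when it drifts below a threshold. The data shapes and the 0.60 threshold are assumptions; the case study only reports roughly 65% accuracy overall.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Comparison:
    model_choice: str  # subject line the reward model ranked higher
    user_choice: str   # subject line that actually earned more engagement

def reward_model_accuracy(comparisons: List[Comparison]) -> float:
    """Fraction of pairs where the reward model agreed with real engagement."""
    if not comparisons:
        return 0.0
    return sum(c.model_choice == c.user_choice for c in comparisons) / len(comparisons)

def needs_retraining(comparisons: List[Comparison], threshold: float = 0.60) -> bool:
    """Flag the reward model for retraining when its daily accuracy drifts too low."""
    return reward_model_accuracy(comparisons) < threshold
```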
### Quality Control and Safety Measures
- Constraints enforced on generated subject lines
- Hallucination prevention through the extraction-based prompting approach (a possible validation check is sketched after this list)
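Because the prompt asks the model to extract rather than compose text, hallucinations can be caught with a simple post-generation check: the subject line must appear verbatim in the post and fit a length budget. This validator is only a sketch, and the 100-character limit is an assumption.

```python
def is_valid_subject_line(subject: str, post_text: str, max_length: int = 100) -> bool:
    """Accept a subject line only if it was lifted from the post and is not too long."""
    candidate = subject.strip().strip('"').strip()
    if not candidate or len(candidate) > max_length:
        return False
    # Extraction-based prompting means a faithful subject line is a substring of the post.
    return candidate.lower() in post_text.lower()
```

A candidate that fails the check could simply be regenerated, or the email could fall back to the user-generated subject line.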
## Results and Metrics
### Performance Improvements
- Session metrics: approximately 1% lift in sessions compared to user-generated subject lines
- A/B testing also showed a 0.4% increase in Weekly Active Users and a 1% increase in ad revenue
## Key Learnings
### Prompt Engineering Insights
- Improvements from prompt engineering hit a ceiling
- Difficulty in finding "optimal" prompts
- Limited systematic methods for prompt enhancement
- Heavy reliance on human intuition
### Model Performance
- Reward model accuracy at 65%
- Challenge in predicting popular content
- Room for improvement using real-time engagement signals
## Future Development Plans
### Planned Improvements
- Fine-tuning the generation model, which is currently used without fine-tuning
- Operational enhancements
- Personalizing subject lines for individual users
### Infrastructure Considerations
- Scalability of caching system
- Cost-effective personalization strategies
- Real-time performance monitoring improvements
## Technical Stack Summary
- Primary LLM: OpenAI API (ChatGPT)
- Evaluation Model: Fine-tuned OpenAI ada model
- Infrastructure: caching of generated subject lines, retry mechanism, and daily reward model monitoring with dedicated control groups
- Performance Metrics: sessions, Weekly Active Users, and ad revenue, evaluated through A/B testing
This case study demonstrates a comprehensive LLMOps implementation that successfully combines prompt engineering, reward modeling, and robust engineering practices to create a production-ready AI system. The approach shows how careful consideration of cost, performance, and quality control can lead to measurable business improvements while maintaining system reliability and efficiency.