Digits implemented a production system for generating contextual questions for accountants using fine-tuned T5 models. The system helps accountants interact with clients by automatically generating relevant questions about transactions. They addressed key challenges like hallucination and privacy through multiple validation checks, in-house fine-tuning, and comprehensive evaluation metrics. The solution was successfully deployed using TensorFlow Extended on Google Cloud Vertex AI, with careful attention to training-serving skew and model performance monitoring.
# Fine-Tuned T5 Models for Accounting Question Generation at Digits
## Company Overview and Use Case
Digits is a fintech company focused on bringing efficiency to small-business finance and helping accountants better serve their clients. They implemented a production system using generative AI to automatically generate relevant questions about financial transactions that accountants can send to their clients. This automation saves valuable time in client interactions while maintaining professional communication standards.
## Technical Implementation Details
### Model Selection and Architecture
- Base model: T5 family, pre-trained by the Google Brain team
- Fine-tuned in-house for domain-specific accounting use cases
- Implements encoder-decoder transformer architecture
- Supports multiple personas for different communication styles
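The persona support above can be sketched as a conditioned model input: a single fine-tuned seq2seq model switches communication styles based on a style tag prepended to the prompt. The persona names and prompt template below are illustrative assumptions, not Digits' published format.

```python
# Hypothetical sketch: conditioning a T5-style input on a persona tag.
# Persona names and the prompt template are placeholders, not Digits' actual format.

PERSONAS = {
    "formal": "polite and formal",
    "friendly": "warm and conversational",
    "concise": "brief and direct",
}

def build_model_input(transaction: str, persona: str) -> str:
    """Prefix the task and persona so one fine-tuned model can switch styles."""
    style = PERSONAS[persona]
    return f"ask_question style: {style} transaction: {transaction}"

build_model_input("Payment of $1,200.00 to Acme Corp on 2023-04-01", "formal")
```

Encoding the persona in the input text keeps serving simple: one deployed model covers every communication style, with no per-persona checkpoints.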
### Data Processing Pipeline
- Uses TensorFlow Transform for data preprocessing
- Scales preprocessing via Google Cloud Dataflow
- Implements FastSentencePieceTokenizer from TensorFlow Text
- Exports tokenization as part of the model to avoid training-serving skew
- Structures training data as input/target pairs for supervised fine-tuning
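For seq2seq fine-tuning, each training example pairs a serialized transaction context (the encoder input) with a human-written target question. The field names and serialization format below are assumptions for illustration, not Digits' published schema.

```python
# Illustrative only: one way to structure (input, target) pairs for T5 fine-tuning.
# Field names and the serialization format are assumptions, not Digits' schema.
from typing import TypedDict

class QGExample(TypedDict):
    inputs: str   # serialized transaction context fed to the encoder
    targets: str  # human-written question the decoder learns to produce

def make_example(vendor: str, amount: str, date: str, question: str) -> QGExample:
    return {
        "inputs": f"generate question: vendor: {vendor} amount: {amount} date: {date}",
        "targets": question,
    }

ex = make_example("Acme Corp", "$1,200.00", "2023-04-01",
                  "Could you confirm what the $1,200.00 payment to Acme Corp was for?")
```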
### Training Infrastructure
- Leverages TensorFlow Extended (TFX) on Google Cloud's Vertex AI
- Maintains training artifacts in company-wide metadata store
- Converts all model operations to TensorFlow for deployment efficiency
- Implements custom TFX component for model evaluation
### Production Safety Measures
- Three-layer validation system combining hallucination detection, toxic-content filtering, and human review
- Privacy protection through in-house fine-tuning rather than third-party APIs
- No sharing of customer data without consent
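The layered validation described above can be sketched as a simple decision chain: check for toxic content (discard and alert), check for hallucinated values (discard), and otherwise route the output to a human for final approval. The specific word list, regex, and outcome labels are placeholders; Digits' production rules are not public.

```python
# Hedged sketch of a layered output check; word list, regex, and outcome
# labels are placeholders, not Digits' production rules.
import re

TOXIC_WORDS = {"stupid", "idiot"}  # placeholder list

def contains_toxic(text: str) -> bool:
    return any(w in text.lower() for w in TOXIC_WORDS)

def hallucinated_amounts(question: str, source: str) -> list[str]:
    """Dollar amounts in the question that never appear in the source transaction."""
    amounts = re.findall(r"\$[\d,]+(?:\.\d{2})?", question)
    return [a for a in amounts if a not in source]

def validate(question: str, source: str) -> str:
    if contains_toxic(question):
        return "discard_and_alert"      # surface to the ML team
    if hallucinated_amounts(question, source):
        return "discard"                # drop outputs with invented values
    return "send_for_human_review"      # accountant approves before sending
```

The ordering matters: the cheapest, highest-severity checks run first, and only outputs that pass every automated layer reach the human-in-the-loop step.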
### Model Evaluation Framework
- Comprehensive automated evaluation metrics
- Automated comparison against previous model versions
- Human-curated validation dataset
- Automated deployment recommendations based on metric improvements
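The automated deployment recommendation can be sketched as a metric-by-metric comparison between a candidate model and the current production baseline. The metric names and threshold below are illustrative assumptions, not Digits' actual evaluation criteria.

```python
# Sketch of an automated "recommend or block" comparison between a candidate
# model and the production baseline; metric names and threshold are illustrative.

def recommend_deployment(candidate: dict[str, float],
                         baseline: dict[str, float],
                         min_gain: float = 0.0) -> bool:
    """Recommend the candidate only if no tracked metric regresses."""
    return all(candidate[m] >= baseline[m] + min_gain for m in baseline)

baseline = {"rougeL": 0.41, "exact_match": 0.18}
candidate = {"rougeL": 0.44, "exact_match": 0.21}
recommend_deployment(candidate, baseline)  # True: both metrics improved
```

Requiring improvement on every metric (rather than an average) is a conservative choice that prevents a regression on one axis from being masked by a gain on another.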
### Production Deployment
- Uses TensorFlow Serving without Python layer dependency
- Implements serving signatures for raw text input processing
- Maintains consistent preprocessing between training and serving
- Scales via Google Cloud infrastructure
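Because preprocessing and tokenization live inside the exported model, clients can send raw text directly to TensorFlow Serving. The payload shape below follows TF Serving's documented REST predict format; the model name, signature name, and input key are assumptions for illustration.

```python
# Building a request body for TensorFlow Serving's REST predict API.
# The endpoint shape follows TF Serving's documented format; the input key
# "text" and the model name in the URL are assumptions.
import json

def predict_payload(raw_texts: list[str]) -> str:
    body = {
        "signature_name": "serving_default",   # signature accepting raw text
        "instances": [{"text": t} for t in raw_texts],
    }
    return json.dumps(body)

payload = predict_payload(["generate question: vendor: Acme Corp amount: $1,200.00"])
# POST to http://<host>:8501/v1/models/<model_name>:predict with this body
```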
## MLOps Best Practices
### Data Management
- Careful preprocessing of training data
- Maintenance of validation datasets
- Privacy-conscious data handling
- Implementation of data versioning
### Model Versioning and Deployment
- Automated model comparison and evaluation
- Version tracking in metadata store
- Streamlined deployment process
- Training-serving skew prevention
### Quality Assurance
- Multi-layer validation system
- Automated toxic content detection
- Hallucination detection systems
- Human-in-the-loop validation
### Monitoring and Maintenance
- Continuous evaluation of model performance
- Automated alerting for toxic content detection
- Version comparison tracking
- Performance metrics monitoring
## Technical Challenges and Solutions
### Handling Model Hallucinations
- Implementation of pattern matching for detection
- Automatic discarding of problematic outputs
- Alert system for ML team investigation
- Human review safety net
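One cheap pattern-matching check in this spirit is flagging degenerate generations, such as a phrase repeated verbatim, which is a common symptom of a model going off the rails. The n-gram heuristic and threshold below are illustrative assumptions, not Digits' detection rules.

```python
# Illustrative pattern check for degenerate generations (repeated phrases),
# a common failure symptom; the n-gram size is an assumed threshold.
import re

def has_repeated_phrase(text: str, n: int = 3) -> bool:
    """True if any n-word phrase occurs more than once."""
    words = re.findall(r"\w+", text.lower())
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return len(grams) != len(set(grams))
```

Outputs that trip such a pattern check would be discarded automatically and, per the workflow above, surfaced to the ML team for investigation.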
### Training-Serving Skew Prevention
- Integration of tokenization in model
- Consistent preprocessing across pipeline
- TensorFlow Transform integration
- Export of preprocessing graph
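The core idea behind these measures is a single source of truth for preprocessing. In Digits' stack that role is played by TensorFlow Transform and the in-graph tokenizer; this pure-Python sketch only illustrates the principle.

```python
# Minimal illustration of skew prevention: one preprocessing function shared
# by the training pipeline and the serving path, so both see identical inputs.
# (In the real stack this is TensorFlow Transform plus an in-graph tokenizer.)

def preprocess(text: str) -> str:
    """Single source of truth for text normalization."""
    return " ".join(text.lower().split())

def training_features(raw: str) -> str:
    return preprocess(raw)          # used when building training examples

def serving_features(raw: str) -> str:
    return preprocess(raw)          # the very same function at inference time

assert training_features("  Payment  to ACME ") == serving_features("Payment to acme")
```

Exporting the preprocessing graph with the model makes this guarantee structural rather than conventional: there is no second implementation that can drift.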
### Scaling Considerations
- Cloud-based training infrastructure
- Distributed preprocessing
- Efficient serving architecture
- Metadata management at scale
### Privacy Protection
- In-house model fine-tuning
- Secure data handling
- Consent-based data usage
- Private training infrastructure
## Results and Impact
### Efficiency Improvements
- Reduced time per client interaction
- Automated question generation
- Maintained professional communication standards
- Flexible persona-based responses
### Technical Achievements
- Successfully deployed production generative AI system
- Implemented comprehensive safety measures
- Established robust evaluation framework
- Created scalable MLOps pipeline
### System Reliability
- Multiple validation layers
- Consistent preprocessing
- Automated evaluation
- Regular performance monitoring
## Infrastructure Components
### Cloud Services
- Google Cloud Vertex AI
- Google Cloud Dataflow
- TensorFlow Serving
- Custom TFX components
### Development Tools
- TensorFlow Extended (TFX)