Prodigy

Streamline Data Annotation with Prodigy and ZenML

Integrate Prodigy, a modern annotation tool, with ZenML to bring data labeling, data inspection, and error analysis into your machine learning workflows. Annotation becomes part of your ML pipeline, and the labeled data feeds directly into the steps that train and evaluate your models.

Features with ZenML

  • Seamless Integration:
    Easily incorporate Prodigy as a data annotation step within your ZenML pipelines.
  • Efficient Data Labeling:
    Leverage Prodigy's intuitive and optimized interface for fast and accurate data annotation.
  • Flexible Workflow Customization:
    Customize annotation workflows using Prodigy's pre-built components and ZenML's extensible architecture.
  • Streamlined Data Management:
    Effortlessly manage datasets, annotations, and metadata within the ZenML framework.

Main Features

  • Intuitive and efficient web-based annotation interface
  • Pre-built workflows for various annotation tasks
  • Customizable scripts for data loading, saving, and annotation logic
  • Extensible front-end with custom HTML and JavaScript support
  • Optimized for fast and accurate data labeling

How to use ZenML with Prodigy

# zenml annotator register prodigy --flavor prodigy
# optionally also pass in --custom_config_path="<PATH_TO_CUSTOM_CONFIG_FILE>"
# zenml stack register prodigy -o default -a default -an prodigy --set

# wget https://raw.githubusercontent.com/explosion/prodigy-recipes/master/example-datasets/news_headlines.jsonl

# Now annotate your data
# zenml annotator dataset annotate your_dataset --command="textcat.manual news_topics ./news_headlines.jsonl --label Technology,Politics,Economy,Entertainment"

# access the data later on using Python in your pipelines
from typing import Any, Dict, List

from zenml import step
from zenml.client import Client

@step
def import_annotations() -> List[Dict[str, Any]]:
    zenml_client = Client()
    annotations = zenml_client.active_stack.annotator.get_labeled_data(dataset_name="your_dataset")
    # Do something with the annotations
    return annotations
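
The "Do something with the annotations" comment above leaves the downstream processing open. One possible sketch, in plain Python, flattens the returned records into (text, label) pairs for training. It assumes each record is a dictionary with a "text" field, an "answer" field ("accept", "reject", or "ignore"), and an "accept" list holding the selected labels, which is the shape a multi-label textcat.manual session is expected to produce. The helper names to_text_label_pairs and label_counts and the sample records are hypothetical, so check the field names against your own dataset before relying on them.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def to_text_label_pairs(
    annotations: List[Dict[str, Any]],
) -> List[Tuple[str, str]]:
    """Flatten accepted annotation records into (text, label) pairs.

    Assumes each record has "text", "answer", and "accept" keys
    (hypothetical record shape; verify against your dataset).
    """
    pairs: List[Tuple[str, str]] = []
    for record in annotations:
        # Keep only records the annotator explicitly accepted.
        if record.get("answer") != "accept":
            continue
        text = record.get("text")
        if not text:
            continue
        # One pair per selected label, so multi-label records expand.
        for label in record.get("accept", []):
            pairs.append((text, label))
    return pairs


def label_counts(pairs: List[Tuple[str, str]]) -> Counter:
    """Count how often each label occurs, e.g. to spot class imbalance."""
    return Counter(label for _, label in pairs)


# Hypothetical sample records standing in for get_labeled_data() output.
sample = [
    {"text": "Chip maker unveils new processor",
     "accept": ["Technology"], "answer": "accept"},
    {"text": "Parliament passes budget bill",
     "accept": ["Politics", "Economy"], "answer": "accept"},
    {"text": "Unclear headline", "accept": [], "answer": "ignore"},
]

pairs = to_text_label_pairs(sample)
print(pairs)
print(label_counts(pairs))
```

Inside a pipeline, a helper like this could be called from the import_annotations step or from a separate step that receives its output, keeping the label-extraction logic testable without a running Prodigy instance.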
    

Connect Your ML Pipelines to a World of Tools

Expand your ML pipelines with more than 50 ZenML Integrations

  • Amazon S3
  • Apache Airflow
  • Argilla
  • AutoGen
  • AWS
  • AWS Strands
  • Azure Blob Storage
  • Azure Container Registry
  • AzureML Pipelines
  • BentoML
  • Comet