Integrate Prodigy with ZenML - Data Annotator Integrations

Streamline Data Annotation with Prodigy and ZenML

Integrate Prodigy, a modern annotation tool, with ZenML to bring data labeling, data inspection, and error analysis directly into your ML pipelines. Annotations created in Prodigy can then be retrieved inside pipeline steps and used downstream, for example to train or evaluate models.

Features with ZenML

Seamless Integration:
Easily incorporate Prodigy as a data annotation step within your ZenML pipelines.
Efficient Data Labeling:
Leverage Prodigy's intuitive and optimized interface for fast and accurate data annotation.
Flexible Workflow Customization:
Customize annotation workflows using Prodigy's pre-built components and ZenML's extensible architecture.
Streamlined Data Management:
Effortlessly manage datasets, annotations, and metadata within the ZenML framework.

Main Features

Intuitive and efficient web-based annotation interface
Pre-built workflows for various annotation tasks
Customizable scripts for data loading, saving, and annotation logic
Extensible front-end with custom HTML and JavaScript support
Optimized for fast and accurate data labeling

How to use ZenML with Prodigy

# zenml annotator register prodigy --flavor prodigy
# optionally also pass in --custom_config_path="<PATH_TO_CUSTOM_CONFIG_FILE>"
# zenml stack register prodigy -o default -a default -an prodigy --set

# wget https://raw.githubusercontent.com/explosion/prodigy-recipes/master/example-datasets/news_headlines.jsonl

# Now annotate your data
# zenml annotator dataset annotate your_dataset --command="textcat.manual news_topics ./news_headlines.jsonl --label Technology,Politics,Economy,Entertainment"

# access the data later on using Python in your pipelines
from typing import Any, Dict, List

from zenml import step
from zenml.client import Client

@step
def import_annotations() -> List[Dict[str, Any]]:
    zenml_client = Client()
    annotations = zenml_client.active_stack.annotator.get_labeled_data(dataset_name="your_dataset")
    # Do something with the annotations
    return annotations

This code snippet demonstrates how to import annotations from Prodigy within a ZenML step. It uses the ZenML client to access the active stack's annotator component and retrieves the labeled data for a specific dataset. The annotations can then be processed further in the pipeline.
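
As a rough illustration of what "processed further" might look like, the sketch below flattens accepted annotations into (text, label) pairs. It assumes each annotation is a dict with "text", "answer", and "accept" keys, which is how choice-style Prodigy tasks are commonly shaped; the actual record shape depends on the recipe you ran and is not confirmed here. The helper name to_training_pairs and the sample records are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def to_training_pairs(annotations: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Flatten accepted annotations into (text, label) pairs.

    Assumes each record has "text", "answer" and "accept" keys. This is an
    assumption about the record shape, not a guarantee of the integration.
    """
    pairs: List[Tuple[str, str]] = []
    for record in annotations:
        # Skip tasks the annotator rejected or ignored.
        if record.get("answer") != "accept":
            continue
        text = record.get("text")
        if not text:
            continue
        # "accept" is assumed to hold the selected labels; emit one pair per label.
        for label in record.get("accept", []):
            pairs.append((text, label))
    return pairs


# Hand-written sample records, for illustration only.
sample = [
    {"text": "New chip doubles battery life", "answer": "accept", "accept": ["Technology"]},
    {"text": "Unclear headline", "answer": "ignore", "accept": []},
]
print(to_training_pairs(sample))
```

Inside a step such as import_annotations, a helper like this could be applied to the list returned by get_labeled_data before handing the pairs to a training step.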

Additional Resources

ZenML Prodigy Integration Docs

Prodigy Documentation

Blog: How to annotate image data for object detection with Prodigy
