Ensure Data Quality and Consistency in Your ML Pipelines with Great Expectations and ZenML

Integrate Great Expectations with ZenML to add data profiling, testing, and documentation to your ML workflows. Running validation as a pipeline step helps you maintain data quality standards, share human-readable validation results with your team, and improve observability throughout your ML pipeline.

Features with ZenML

  • Seamless integration of Great Expectations data validation within ZenML pipelines
  • Automated storage and versioning of Expectation Suites and Validation Results using ZenML's Artifact Store
  • Easy visualization of Great Expectations artifacts directly in the ZenML dashboard or Jupyter notebooks
  • Flexible deployment options for stores to leverage existing Great Expectations configurations or let ZenML manage the setup

Main Features of Great Expectations

  • Automated data profiling to generate validation rules (Expectations) based on dataset properties
  • Comprehensive data quality checks using predefined or inferred Expectations
  • Human-readable documentation of validation rules, quality checks, and results
  • Support for various data formats and sources, with ZenML currently supporting pandas DataFrames
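The first two features, profiling a dataset to infer rules and then checking new data against them, can be illustrated with a small plain-Python sketch. It does not use the Great Expectations API: RangeRule, profile_ranges, and validate are hypothetical names, the sample rows are made up, and a real profiler infers many more kinds of rules than a min/max range.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RangeRule:
    """Hypothetical stand-in for an Expectation: values must lie in [min_value, max_value]."""
    column: str
    min_value: float
    max_value: float


def profile_ranges(rows: List[Dict[str, float]]) -> List[RangeRule]:
    """Infer one range rule per column from the observed minimum and maximum."""
    columns = list(rows[0].keys()) if rows else []
    return [
        RangeRule(col, min(r[col] for r in rows), max(r[col] for r in rows))
        for col in columns
    ]


def validate(rows: List[Dict[str, float]], rules: List[RangeRule]) -> Dict[str, dict]:
    """Check every rule against the rows and collect the values that violate it."""
    report = {}
    for rule in rules:
        unexpected = [
            r[rule.column]
            for r in rows
            if not rule.min_value <= r[rule.column] <= rule.max_value
        ]
        report[rule.column] = {
            "success": not unexpected,
            "unexpected_values": unexpected,
        }
    return report


# Made-up sample data: a reference dataset to profile and a new batch to check.
reference = [{"X_Minimum": 0, "X_Maximum": 50}, {"X_Minimum": 1200, "X_Maximum": 1710}]
new_batch = [{"X_Minimum": 42, "X_Maximum": 60}, {"X_Minimum": 2500, "X_Maximum": 1700}]

rules = profile_ranges(reference)
report = validate(new_batch, rules)
print(report)
```

In this sketch the value 2500 falls outside the range inferred for X_Minimum, so that column is reported as failing while X_Maximum passes.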

How to use ZenML with Great Expectations
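from zenml import pipeline
from zenml.config import DockerSettings
from zenml.integrations.constants import GREAT_EXPECTATIONS
from zenml.integrations.great_expectations.steps.ge_validator import (
    great_expectations_validator_step,
)

# GreatExpectationExpectationConfig must also be imported from the ZenML
# Great Expectations integration; its module path is not shown in this example.
# importer and splitter are user-defined steps that load and split the dataset;
# they are not shown here either.

# Assumed setup: install the Great Expectations integration in the pipeline image.
docker_settings = DockerSettings(required_integrations=[GREAT_EXPECTATIONS])

# Configure the built-in validator step with a list of expectations.
ge_validator_step = great_expectations_validator_step.with_options(
    parameters={
        "expectations_list": [
            # Every value in column X_Minimum must lie between 0 and 2000.
            GreatExpectationExpectationConfig(
                expectation_name="expect_column_values_to_be_between",
                expectation_args={
                    "column": "X_Minimum",
                    "min_value": 0,
                    "max_value": 2000,
                },
            ),
        ],
        "data_asset_name": "steel_plates_train_df",
    }
)

@pipeline(enable_cache=False, settings={"docker": docker_settings})
def validation_pipeline():
    imported_data = importer()
    train, test = splitter(imported_data)
    # Validate only the training split.
    ge_validator_step(train)

validation_pipeline()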

The code example shows a simple ZenML pipeline that uses Great Expectations for data validation. It imports great_expectations_validator_step and configures it with a list of expectations. Each expectation is specified with the GreatExpectationExpectationConfig class through an expectation name and a set of expectation arguments, such as the column name and the allowed value range. The pipeline then loads the data, splits it, and validates the training split. When you run the pipeline, the resulting artifacts are automatically stored and versioned in ZenML's Artifact Store. By default, the Great Expectations stores for validation results and checkpoints are also configured to use your active artifact store.
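To make the "expectation name plus expectation arguments" pattern concrete, here is a minimal plain-Python sketch of how such a configuration could be resolved into checks. It is not how ZenML or Great Expectations implement it: ExpectationSpec, CHECKS, and run_expectations are hypothetical names, and the two check functions are simplified illustrations that only borrow Great Expectations' naming style.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ExpectationSpec:
    """Hypothetical stand-in for a config object: a check name plus keyword arguments."""
    expectation_name: str
    expectation_args: Dict[str, Any] = field(default_factory=dict)


def _values_between(rows: List[dict], column: str, min_value: float, max_value: float) -> bool:
    """True if every value in the column lies in [min_value, max_value]."""
    return all(min_value <= row[column] <= max_value for row in rows)


def _values_not_null(rows: List[dict], column: str) -> bool:
    """True if no row has a missing (None) value in the column."""
    return all(row.get(column) is not None for row in rows)


# Registry mapping expectation names to simplified check functions.
CHECKS: Dict[str, Callable[..., bool]] = {
    "expect_column_values_to_be_between": _values_between,
    "expect_column_values_to_not_be_null": _values_not_null,
}


def run_expectations(rows: List[dict], specs: List[ExpectationSpec]) -> Dict[str, bool]:
    """Look up each spec by name and call the check with its arguments."""
    return {
        spec.expectation_name: CHECKS[spec.expectation_name](rows, **spec.expectation_args)
        for spec in specs
    }


# Made-up sample rows and a two-item expectations list.
rows = [{"X_Minimum": 42}, {"X_Minimum": 1998}]
specs = [
    ExpectationSpec(
        "expect_column_values_to_be_between",
        {"column": "X_Minimum", "min_value": 0, "max_value": 2000},
    ),
    ExpectationSpec("expect_column_values_to_not_be_null", {"column": "X_Minimum"}),
]
results = run_expectations(rows, specs)
print(results)
```

Keeping the check name and its arguments as plain data is what allows a list of expectations to be passed to a step as parameters and stored alongside the validation results.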

Additional Resources
ZenML Great Expectations Integration Docs
Great Expectations Documentation
ZenML Great Expectations SDK Docs
