Integrate Amazon S3 with ZenML - Artifact Store Integrations

Unleash Scalable Cloud Storage with Amazon S3 and ZenML

Elevate your MLOps game by integrating Amazon S3 with ZenML for efficient and reliable artifact storage. This powerful combination allows you to store and manage your pipeline artifacts in the cloud, ensuring scalability, high availability, and seamless collaboration for your machine learning projects.

Features with ZenML

Scalable and Reliable Artifact Storage
Store your pipeline artifacts in the cloud with Amazon S3, ensuring scalability and high availability for your ML workflows.
Seamless Integration
Easily register and use an S3 Artifact Store in your ZenML stacks with just a few commands.
Secure Access Control
Leverage S3's built-in security features and ZenML's authentication methods to control access to your artifacts.
Collaboration Made Easy
Share pipeline artifacts with team members and stakeholders through cloud-based S3 storage.

Main Features

Secure and durable object storage
Scalable storage capacity
High availability and data redundancy
Flexible access control and authentication
Integration with various AWS services

How to use ZenML with Amazon S3


# Step 1: Install the S3 integration
>>> zenml integration install s3

# Step 2: Register the S3 artifact store
>>> zenml artifact-store register s3_store -f s3 --path="s3://your-bucket-name"

# Step 3: [Optional] Connect the S3 artifact store to a Service Connector
>>> zenml artifact-store connect s3_store -i

# Step 4: Update your stack to use the S3 artifact store
>>> zenml stack update -a s3_store

# Step 5: Run the pipeline using the S3 artifact store
>>> python3 my_pipeline.py

Initiating a new run for the pipeline: my_pipeline.
Executing a new run.
Using user: user1
Using stack: remote_stack
  orchestrator: default
  artifact_store: s3_store
You can visualize your pipeline runs in the ZenML Dashboard. In order to try it locally, please run zenml up.
Step my_step has started.
Step my_step has finished in 0.078s.
Step my_step completed successfully.
Pipeline run has finished in 0.112s.

The artifact value you saved in the `my_pipeline` run is:
{'key': 'value', 'message': 'Hello from S3!'}
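
Step 2 passes an `s3://` path to the artifact store. As a minimal sketch, the path can be sanity-checked before registering it. The helper name `split_s3_path` is hypothetical and not part of ZenML, and the bucket-name pattern covers only a simplified subset of the S3 naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit).

```python
import re
from typing import Tuple
from urllib.parse import urlparse

# Simplified subset of the S3 bucket naming rules (an assumption of this
# sketch): 3-63 characters, lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def split_s3_path(path: str) -> Tuple[str, str]:
    """Split an s3:// path into (bucket, key_prefix).

    Hypothetical helper, not a ZenML API. Raises ValueError when the
    scheme is not s3 or the bucket name fails the simplified check.
    """
    parsed = urlparse(path)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// path, got {path!r}")
    bucket = parsed.netloc
    if not _BUCKET_RE.fullmatch(bucket):
        raise ValueError(f"{bucket!r} does not look like a valid bucket name")
    # Drop leading and trailing slashes so the prefix is a plain key prefix.
    return bucket, parsed.path.strip("/")


if __name__ == "__main__":
    print(split_s3_path("s3://your-bucket-name"))
    print(split_s3_path("s3://your-bucket-name/zenml/artifacts"))
```

A check like this only catches malformed paths early. It does not confirm that the bucket exists or that your credentials can reach it.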


from typing_extensions import Annotated

from zenml import pipeline, step
from zenml.client import Client


@step
def my_step(input_dict: dict) -> Annotated[dict, "dict_from_s3"]:
    # The returned dict is saved in the active artifact store under the
    # artifact name "dict_from_s3".
    output_dict = input_dict.copy()
    output_dict["message"] = "Hello from S3!"
    return output_dict


@pipeline
def my_pipeline(input_dict: dict):
    my_step(input_dict)


if __name__ == "__main__":
    input_data = {"key": "value"}
    my_pipeline(input_data)

    # Load the artifact back from the artifact store by its name.
    print(
        "The artifact value you saved in the `my_pipeline` run is:\n"
        + str(Client().get_artifact_version(name_id_or_prefix="dict_from_s3").load())
    )

my_pipeline.py

This example shows how to set up and use an S3 Artifact Store, optionally connected through a Service Connector, in a ZenML pipeline. After installing the S3 integration and registering the store, update your stack to use it. The my_step function returns a dictionary that ZenML saves to the artifact store as the dict_from_s3 artifact, my_pipeline runs that step, and the script then loads the artifact back through the ZenML Client and prints it.
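
To confirm that the artifact really landed in the bucket registered in Step 2, you can compare its storage location with the store path. The sketch below is one way to do that. The helper names `uri_is_under` and `check_artifact_location` are hypothetical, and the sketch assumes the object returned by `Client().get_artifact_version(...)` exposes a `uri` attribute, as recent ZenML releases do.

```python
def uri_is_under(artifact_uri: str, store_path: str) -> bool:
    """Return True if artifact_uri is store_path itself or lies below it.

    The comparison is made on whole path segments, so
    "s3://your-bucket-name-other/x" is not treated as being under
    "s3://your-bucket-name".
    """
    root = store_path.rstrip("/")
    return artifact_uri == root or artifact_uri.startswith(root + "/")


def check_artifact_location(name: str, store_path: str) -> bool:
    """Look up an artifact by name and check that it is stored under store_path.

    Hypothetical helper: it needs ZenML installed and a stack with the
    S3 artifact store active.
    """
    # Imported lazily so uri_is_under can be used without ZenML installed.
    from zenml.client import Client

    artifact = Client().get_artifact_version(name_id_or_prefix=name)
    # Assumption: the artifact version response exposes its location as `uri`.
    print(f"{name} is stored at {artifact.uri}")
    return uri_is_under(artifact.uri, store_path)
```

After running my_pipeline.py, calling `check_artifact_location("dict_from_s3", "s3://your-bucket-name")` should return True if the stack was updated as in Step 4. A False result usually means the run used a different artifact store.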

Additional Resources

Full ZenML integration documentation
