Integrations

Hugging Face (Inference Endpoints) and ZenML

Effortlessly deploy Hugging Face models to production with ZenML
Category: Deployer

Integrate Hugging Face Inference Endpoints with ZenML to streamline the deployment of transformers, sentence-transformers, and diffusers models. This integration allows you to leverage Hugging Face's secure, scalable infrastructure for hosting models, while managing the deployment process within your ZenML pipelines.

Features with ZenML

  • Seamless deployment of Hugging Face models directly from ZenML pipelines
  • Simplified management of inference endpoints within the ZenML ecosystem
  • Automatic scaling of deployments based on demand using Hugging Face's infrastructure
  • A centralized registry of deployed models for easy tracking and monitoring (see the sketch below)
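
The centralized registry is easy to see in action. The sketch below queries the model deployer in the active ZenML stack for services deployed by a pipeline step. It assumes a Hugging Face model deployer is registered on your stack and reuses the pipeline and step names from the example further down this page; the exact attribute names are worth verifying against your installed ZenML version.

from zenml.integrations.huggingface.model_deployers import HuggingFaceModelDeployer

# Fetch the Hugging Face model deployer from the active ZenML stack
model_deployer = HuggingFaceModelDeployer.get_active_model_deployer()

# Look up services that a given pipeline step has deployed
services = model_deployer.find_model_server(
    pipeline_name="deploy_and_infer",
    pipeline_step_name="huggingface_model_deployer_step",
)

for service in services:
    # Each service tracks the state and prediction URL of its endpoint
    print(service.status.state)
    print(service.prediction_url)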

Main Features

  • Secure model hosting on dedicated Hugging Face infrastructure
  • Autoscaling capabilities to handle variable inference workloads (see the configuration sketch after this list)
  • Support for a wide range of model types and frameworks
  • Pay-per-use pricing for cost-effective deployments
  • Enterprise-grade security features like VPC deployment
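
As a rough sketch of how several of these features surface in code, the HuggingFaceServiceConfig fields below cover replica-based autoscaling, hardware selection, and endpoint visibility. The field names follow ZenML's Hugging Face integration, but treat the exact names and accepted values as assumptions to check against your installed version.

from zenml.integrations.huggingface.services import HuggingFaceServiceConfig

service_config = HuggingFaceServiceConfig(
    model_name="text-classification-model",
    repository="myorg/text-classifier",  # Hugging Face Hub repository to serve
    framework="pytorch",
    task="text-classification",
    vendor="aws",                 # cloud vendor backing the endpoint
    region="us-east-1",
    accelerator="gpu",
    instance_size="x1",           # Hugging Face instance size tier
    min_replica=0,                # scale to zero when idle
    max_replica=4,                # autoscale up under load
    endpoint_type="protected",    # "public", "protected", or "private" (VPC)
)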

How to use ZenML with Hugging Face (Inference Endpoints)

from typing import Annotated

from zenml import pipeline, step
from zenml.integrations.huggingface.services import HuggingFaceServiceConfig
from zenml.integrations.huggingface.services.huggingface_deployment import (
    HuggingFaceDeploymentService,
)
from zenml.integrations.huggingface.steps import huggingface_model_deployer_step


@step
def predictor(
    service: HuggingFaceDeploymentService,
) -> Annotated[str, "predictions"]:
    """Run an inference request against the deployed prediction service."""
    data = load_live_data()  # placeholder: supply your own data-loading logic
    prediction = service.predict(data)
    return prediction


@pipeline
def deploy_and_infer():
    # Bundle all deployment settings into a single service configuration
    service_config = HuggingFaceServiceConfig(
        model_name="text-classification-model",
        repository="myorg/text-classifier",
        task="text-classification",
        accelerator="gpu",
    )
    # Deploy (or update) the Hugging Face Inference Endpoint and hand the
    # resulting service over to the inference step
    service = huggingface_model_deployer_step(service_config=service_config)
    predictor(service)

This code snippet demonstrates how to use the huggingface_model_deployer_step within a ZenML pipeline to deploy a trained model to Hugging Face Inference Endpoints. The step takes a HuggingFaceServiceConfig that bundles the deployment settings: a name for the deployed model, the Hugging Face repository to serve, the task the model performs, and the accelerator to run it on. The returned service handle is then passed to the downstream predictor step, which runs inference against the live endpoint.
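
Running the pipeline is then an ordinary function call. A minimal sketch, assuming the active ZenML stack already has a Hugging Face model deployer component registered (which is where the Hugging Face access token lives, keeping credentials out of pipeline code):

if __name__ == "__main__":
    # Triggers the deployment and the follow-up inference step
    deploy_and_infer()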

Additional Resources

  • Example project: Deploying a text classification model with Hugging Face and ZenML
  • ZenML Hugging Face integration guide
  • Hugging Face Inference Endpoints documentation


Start Your Free Trial Now

  • No new paradigms - bring your own tools and infrastructure
  • No data leaves your servers, we only track metadata
  • Free trial included - no strings attached, cancel anytime

Connect Your ML Pipelines to a World of Tools

Expand your ML pipelines with Apache Airflow and 50+ other ZenML integrations
Discord
Prodigy
SageMaker Pipelines
Pigeon
MLflow
AWS
Great Expectations
GitHub Actions
BentoML
Google Cloud Vertex AI Pipelines
Docker