Effortlessly deploy Hugging Face models to production with ZenML
Integrate Hugging Face Inference Endpoints with ZenML to streamline the deployment of transformers, sentence-transformers, and diffusers models. This integration allows you to leverage Hugging Face's secure, scalable infrastructure for hosting models, while managing the deployment process within your ZenML pipelines.
Features with ZenML
- Seamless deployment of Hugging Face models directly from ZenML pipelines
- Simplified management of inference endpoints within the ZenML ecosystem
- Automatic scaling of deployments based on demand, using Hugging Face's infrastructure
- A centralized registry of deployed models for easy tracking and monitoring
Main Features
- Secure model hosting on dedicated Hugging Face infrastructure
- Autoscaling capabilities to handle variable inference workloads
- Support for a wide range of model types and frameworks
- Pay-per-use pricing for cost-effective deployments
- Enterprise-grade security features such as VPC deployment
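Before the pipeline below can deploy anything, the Hugging Face integration and a model deployer need to be part of your ZenML stack. A minimal setup sketch using the ZenML CLI, with placeholder values for your Hugging Face token and namespace (exact flags may vary between ZenML versions):

```shell
# Install the Hugging Face integration and its dependencies
zenml integration install huggingface -y

# Register a Hugging Face model deployer with your account credentials
zenml model-deployer register hf_endpoint_deployer \
    --flavor=huggingface \
    --token=<YOUR_HF_TOKEN> \
    --namespace=<YOUR_HF_NAMESPACE>

# Add the model deployer to the active stack
zenml stack update -d hf_endpoint_deployer
```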
How to use ZenML with Hugging Face (Inference Endpoints)
from typing import Annotated

from zenml import pipeline, step
from zenml.integrations.huggingface.services import (
    HuggingFaceDeploymentService,
    HuggingFaceServiceConfig,
)
from zenml.integrations.huggingface.steps import huggingface_model_deployer_step

@step
def predictor(
    service: HuggingFaceDeploymentService,
) -> Annotated[str, "predictions"]:
    # Run an inference request against the prediction service
    data = load_live_data()  # placeholder for your own data-loading logic
    prediction = service.predict(data)
    return prediction

@pipeline
def deploy_and_infer():
    # Describe the endpoint to create: model name, repository, hardware, and task
    service_config = HuggingFaceServiceConfig(
        model_name="text-classification-model",
        repository="myorg/text-classifier",
        accelerator="gpu",
        task="text-classification",
    )
    service = huggingface_model_deployer_step(service_config=service_config)
    predictor(service)
This code snippet demonstrates how to use the huggingface_model_deployer_step within a ZenML pipeline to deploy a trained model to Hugging Face Inference Endpoints. The step takes a HuggingFaceServiceConfig describing the deployment: the name for the deployed model, the Hugging Face repository to serve, the accelerator to run on, and the task the model performs. It returns a HuggingFaceDeploymentService that downstream steps, such as the predictor step above, can use to send inference requests.
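After the pipeline runs, the endpoint is tracked by ZenML's model deployer, which is what backs the centralized registry of deployments mentioned above. A sketch of inspecting and cleaning up deployments from the ZenML CLI (command names follow the standard CLI; exact output and flags may vary by version):

```shell
# List all model servers managed by the active model deployer
zenml model-deployer models list

# Show the details of one deployment (replace <UUID> with an ID from the list)
zenml model-deployer models describe <UUID>

# Tear down the endpoint when it is no longer needed
zenml model-deployer models delete <UUID>
```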
Additional Resources
Example project: Deploying a text classification model with Hugging Face and ZenML
ZenML Hugging Face integration guide
Hugging Face Inference Endpoints documentation