Streamline ML Workflows with Apache Airflow Orchestration in ZenML
Seamlessly integrate the robustness of Apache Airflow with the ML-centric capabilities of ZenML pipelines. This powerful combination simplifies the orchestration of complex machine learning workflows, enabling data scientists and engineers to focus on building high-quality models while leveraging Airflow's proven production-grade features.
Features with ZenML
- Native execution of ZenML pipelines as Airflow DAGs (see the sketch after this list)
- Simplified management of complex ML workflows
- Enhanced efficiency and scalability for MLOps pipelines
- Compatibility with both local and remote Airflow deployments
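To make the first point concrete, here is a minimal sketch of a multi-step ZenML pipeline; the step names and logic (load_data, train_model) are illustrative, not part of the integration itself. When the Airflow orchestrator is active in your ZenML stack, the task dependencies of the generated Airflow DAG mirror the data flow between steps.

from zenml import step, pipeline


@step
def load_data() -> list[int]:
    # Hypothetical data-loading step; becomes one Airflow task.
    return [1, 2, 3, 4]


@step
def train_model(data: list[int]) -> float:
    # Hypothetical training step; it runs after load_data because
    # it consumes that step's output.
    return sum(data) / len(data)


@pipeline
def training_pipeline():
    data = load_data()
    train_model(data)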
Main Features
- Robust workflow orchestration for data pipelines
- Extensive library of pre-built operators and sensors (illustrated below)
- Intuitive web-based user interface for monitoring and managing workflows
- Scalable architecture for running workflows on distributed systems
- Strong focus on extensibility, allowing custom plugins and operators
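For context on the Airflow features listed above, here is what a hand-written Airflow DAG looks like without ZenML: a minimal sketch using the pre-built BashOperator, with an illustrative dag_id and a manual-trigger schedule. The ZenML integration generates comparable DAGs for you, so you rarely need to write these by hand.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A minimal hand-written Airflow DAG, shown only for comparison.
with DAG(
    dag_id="manual_example",        # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule=None,                  # trigger manually from the web UI
                                    # (schedule_interval on older Airflow)
) as dag:
    hello = BashOperator(task_id="say_hello", bash_command="echo hello")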
How to use ZenML with Apache Airflow
from zenml import step, pipeline
from zenml.integrations.airflow.flavors.airflow_orchestrator_flavor import (
    AirflowOrchestratorSettings,
)


@step
def my_step() -> None:
    print("Running in Airflow!")


# Run each step in its own Docker container via Airflow's DockerOperator.
airflow_settings = AirflowOrchestratorSettings(
    operator="airflow.providers.docker.operators.docker.DockerOperator",
    operator_args={},  # extra keyword arguments forwarded to the operator
)


@pipeline(settings={"orchestrator.airflow": airflow_settings})
def my_airflow_pipeline():
    my_step()


if __name__ == "__main__":
    my_airflow_pipeline()
This code snippet demonstrates how to configure a ZenML pipeline to run on Apache Airflow. The AirflowOrchestratorSettings object specifies which Airflow operator executes each step (here, the DockerOperator) along with any additional operator arguments. Each step of the pipeline runs in a separate Docker container orchestrated by Airflow.
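Settings can also be attached to individual steps rather than the whole pipeline, and anything in operator_args is forwarded to the chosen operator. As a hedged sketch, the hypothetical step below overrides its Airflow task to retry on failure, using the standard retries argument from Airflow's BaseOperator:

from zenml import step
from zenml.integrations.airflow.flavors.airflow_orchestrator_flavor import (
    AirflowOrchestratorSettings,
)

# Hypothetical per-step override: retry this step's Airflow task twice.
# "retries" is a standard Airflow BaseOperator argument passed through
# operator_args; adjust to whatever arguments your chosen operator accepts.
retry_settings = AirflowOrchestratorSettings(
    operator="airflow.providers.docker.operators.docker.DockerOperator",
    operator_args={"retries": 2},
)


@step(settings={"orchestrator.airflow": retry_settings})
def flaky_step() -> None:
    print("This step's Airflow task will be retried on failure.")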
Additional Resources
Read the full ZenML Airflow integration documentation
Learn more about Apache Airflow