As Large Language Models (LLMs) revolutionize software development, the challenge of ensuring their reliable performance becomes increasingly crucial. This comprehensive guide explores the landscape of LLM evaluation, from specialized platforms like Langfuse and LangSmith to cloud provider solutions from AWS, Google Cloud, and Azure. Learn how to implement effective evaluation strategies, automate testing pipelines, and choose the right tools for your specific needs. Whether you're just starting with manual evaluations or ready to build sophisticated automated pipelines, discover how to gain confidence in your LLM applications through robust evaluation practices.
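To give a flavour of what an automated evaluation check can look like, here is a minimal, framework-agnostic sketch. The `model_answer` helper and the test cases are hypothetical placeholders, not Langfuse or LangSmith APIs; in practice you would call your LLM and log the resulting scores to whichever platform you choose.

```python
# Minimal sketch of an automated LLM evaluation loop (illustrative only).

TEST_CASES = [
    {"prompt": "What is the capital of France?", "expected_keywords": ["paris"]},
    {"prompt": "Name a relational database.", "expected_keywords": ["postgres", "mysql", "sqlite"]},
]


def model_answer(prompt: str) -> str:
    # Placeholder: swap in a real LLM call here.
    return "Paris is the capital of France."


def keyword_score(answer: str, expected_keywords: list[str]) -> float:
    # Score 1.0 if any expected keyword appears in the answer, else 0.0.
    answer_lower = answer.lower()
    return 1.0 if any(keyword in answer_lower for keyword in expected_keywords) else 0.0


def run_eval() -> float:
    # Average the per-case scores into a single headline metric.
    scores = [keyword_score(model_answer(case["prompt"]), case["expected_keywords"]) for case in TEST_CASES]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    accuracy = run_eval()
    print(f"Keyword-match accuracy: {accuracy:.2f}")
    assert accuracy >= 0.5, "Evaluation threshold not met"
```

Even a simple gate like this, run on every change, is a step up from purely manual spot checks and can later be swapped for richer scoring and tracing.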
Connecting model training pipelines to production deployment is often seen as a difficult milestone on the road to MLOps maturity. ZenML rises to the challenge with a novel approach to continuous model deployment that smooths the transition from experimentation to production.

This week I spoke with Kush Varshney, author of 'Trustworthy Machine Learning', a fantastic guide and overview of all of the different ways machine learning can go wrong and an optimistic take on how to think about addressing those issues.
ML practitioners today are embracing data-centric machine learning because of its substantive effect on MLOps practices. In this article, we take a brief excursion into how data-centric machine learning is fuelling MLOps best practices, and why you should care about this change.
This week I spoke with Matt Squire, the CTO and co-founder of Fuzzy Labs, where they help partner organizations think through how best to productionise their machine learning workflows.
With ZenML 0.6.3, you can now run your ZenML steps on SageMaker, Vertex AI, and AzureML! It’s common for certain steps, such as model training, to require specific infrastructure (e.g. a GPU-enabled environment), and Step Operators give you the power to switch out the infrastructure for individual steps to support this.
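To make that concrete, here is a minimal sketch of how a training step might be pointed at a step operator. The decorator syntax shown follows ZenML's current Python API rather than the exact 0.6.3 release, and it assumes a step operator named `sagemaker` has already been registered in your active stack.

```python
from zenml import pipeline, step


@step
def load_data() -> list:
    # Lightweight step: runs in the default orchestrator environment.
    return [1, 2, 3, 4]


# The step_operator name refers to a step operator registered in your active
# stack (e.g. one backed by SageMaker, Vertex AI, or AzureML).
@step(step_operator="sagemaker")
def train_model(data: list) -> float:
    # Heavy step: dispatched to the specialized environment behind the operator.
    return sum(data) / len(data)


@pipeline
def training_pipeline():
    data = load_data()
    train_model(data)


if __name__ == "__main__":
    training_pipeline()
```

Only the decorated step is sent to the remote environment; the rest of the pipeline keeps running on your orchestrator as before.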
This week I spoke with Emmanuel Ameisen, a data scientist and ML engineer currently based at Stripe. Emmanuel also wrote an excellent O'Reilly book called 'Building Machine Learning Powered Applications', a book I find myself often returning to for inspiration and that I was pleased to get the chance to reread in preparation for our discussion.
As we outgrew our initial template GitHub Actions workflow, here are the five things we added to our GitHub Actions arsenal to fit our growing needs: Caching, Reusable Workflows, Composite Actions, Comment Triggers, and Concurrency Management.
An exploration of some frameworks created by Google and Microsoft that can help you think through improvements to how machine learning models get developed and deployed in production.
This week I spoke with Johnny Greco, a data scientist working at Radiology Partners. Johnny transitioned into his current role from an academic career in astronomy, where he also worked in the open-source space building a really interesting synthetic image data project.