
Large Language Models are transforming how developers interact with documentation, but they're constrained by context windows and struggle with traditional documentation formats. Here's how we're addressing this at ZenML by implementing the llms.txt standard: a structured approach to making our documentation more accessible to both AI assistants and human readers.
Understanding llms.txt
The llms.txt standard, proposed by Jeremy Howard and the team at Answer.ai, addresses a fundamental challenge in the age of AI assistants: how to make website content more accessible to Large Language Models (LLMs) while working within their context window limitations. Traditional documentation, with its complex HTML structure, navigation elements, and JavaScript, can be challenging for LLMs to process effectively. The llms.txt standard offers an elegant solution by providing a markdown-based format that's both human-readable and machine-friendly.
At its core, the llms.txt format follows a precise structure that begins with a project name as a top-level header, followed by a required summary block that provides essential context. This summary serves as a quick overview for both humans and machines to understand the scope and purpose of the documentation. Beyond these required elements, the format supports optional detailed information sections and resource listings, all organized under second-level headers.
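To make the structure above concrete, here's a minimal sketch of what an llms.txt file looks like. The section names, links, and descriptions are illustrative examples, not the contents of our actual file:

```markdown
# ZenML

> ZenML is an extensible, open-source MLOps framework for creating
> portable, production-ready machine learning pipelines.

## Docs

- [Getting Started](https://docs.zenml.io/getting-started): installation and core concepts
- [User Guide](https://docs.zenml.io/user-guide): step-by-step pipeline tutorials

## Optional

- [Component Guide](https://docs.zenml.io/stacks-and-components): stack component reference
```

The H1 title and blockquote summary are the required elements; the H2 sections below them hold markdown link lists that tools can expand or filter as needed.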
You can get a sense of just how much the standard has taken off by browsing llmstxt.site, one of the popular directories of projects that support llms.txt:

Technical Specification
The technical implementation is straightforward yet powerful. Files must be served at /llms.txt in the website root, ensuring consistent access across different projects. Using markdown to define structure enables easy parsing and processing while maintaining human readability. This approach allows for automated expansion of URLs, structured parsing of sections, and transformation into various formats like XML or JSON for different use cases.
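As a sketch of how simple this structured parsing can be, here's a minimal Python parser (our own naming and logic, not an official library) that splits an llms.txt document into its title, summary, and per-section link lists:

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt document into title, summary, and sections.

    A minimal sketch: assumes the spec's structure of one H1 title,
    an optional blockquote summary, and H2 sections of markdown links.
    """
    title = None
    summary_lines = []
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> "):
            summary_lines.append(line[2:].strip())
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            # Collect markdown links of the form "- [name](url): description"
            m = re.match(r"-\s*\[(.+?)\]\((.+?)\)(?::\s*(.*))?", line.strip())
            if m:
                name, url, desc = m.groups()
                sections[current].append(
                    {"name": name, "url": url, "desc": desc or ""}
                )
    return {"title": title, "summary": " ".join(summary_lines), "sections": sections}

# Demo on a tiny hypothetical document:
sample = """# ZenML
> An open-source MLOps framework.

## Docs
- [User Guide](https://docs.zenml.io/user-guide): tutorials
"""
parsed = parse_llms_txt(sample)
```

From a structure like `parsed["sections"]`, generating XML or JSON for downstream tools is a one-liner with the standard library.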
ZenML's Implementation
We've taken a thoughtful approach to implementing llms.txt, focusing on scalability and usability across different context window sizes. Our implementation follows a modular approach with specialized files for different documentation aspects, all hosted in a HuggingFace dataset and also available under the https://www.zenml.io/ domain.
The foundation of our implementation is the base llms.txt file, which contains approximately 120,000 tokens covering our documentation’s User Guides and Getting Started information. This file serves as the primary entry point for basic queries and initial exploration of ZenML's capabilities. We've structured it to begin with a clear overview of ZenML as an extensible, open-source MLOps framework, followed by carefully organized sections covering our core documentation and component integrations.
Beyond the base file, we've created specialized documentation files to address different user needs:
- Our component-guide.txt, containing 180,000 tokens, provides detailed information about ZenML integrations and stack components, making it invaluable for users working with specific integrations or configuring their MLOps stack.
- The how-to-guides.txt, at 75,000 tokens, offers practical implementation guidance and summarized workflows, perfect for users seeking concrete examples and step-by-step instructions.
- For users requiring comprehensive documentation access, we also maintain llms-full.txt, a complete corpus of 600,000 tokens that serves as an unabridged reference. This file is particularly useful when working with AI models that support larger context windows or when dealing with complex queries that require deep context.
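Since these files trade completeness against context budget, picking the right one for a given model can be automated. Here's a small hypothetical helper: the token counts come from the figures above, but the selection logic and function name are our own sketch, not a ZenML API:

```python
# Approximate token counts for each ZenML documentation file (see above).
DOC_FILES = [
    ("llms.txt", 120_000),             # user guides + getting started
    ("how-to-guides.txt", 75_000),     # practical workflows
    ("component-guide.txt", 180_000),  # integrations and stack components
    ("llms-full.txt", 600_000),        # complete corpus
]

def pick_doc_file(context_window: int, reserve: int = 8_000) -> str:
    """Pick the largest documentation file that fits the model's context.

    `reserve` leaves headroom for the prompt and the model's response.
    Falls back to the smallest file if nothing fits at all.
    """
    budget = context_window - reserve
    fitting = [(name, tokens) for name, tokens in DOC_FILES if tokens <= budget]
    if not fitting:
        return min(DOC_FILES, key=lambda f: f[1])[0]
    return max(fitting, key=lambda f: f[1])[0]
```

With a 1M-token context window (e.g. some Gemini models) this selects llms-full.txt; with a 128K window it falls back to the base llms.txt.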
Practical Applications and Integration
The real power of our llms.txt versions becomes apparent when integrated with modern AI development tools. We've tested our documentation across various environments to ensure optimal usability, and through that practical experience we've developed insights into effective usage patterns and integration strategies.
Working with the ZenML documentation in Cursor demonstrates how seamlessly the llms.txt format integrates with modern development environments. The structured format enables precise code completion and documentation lookup, enhancing the development experience without breaking flow.
When working with larger documentation files, Vertex AI Studio's capabilities shine. The platform's ability to handle our comprehensive llms-full.txt file enables deep documentation analysis and complex query resolution, which is particularly valuable for advanced scenarios or when you're not quite sure which file to use.
Our built-in RunLLM tool on the ZenML docs website shows how this structured documentation can power instant queries right on the docs site, and I wanted to include it in this overview since not everyone knows about it. This integration demonstrates the format's flexibility and utility in web-based environments.
Finally, if you load the llms-full.txt file into NotebookLM as a source, you can both chat with the documentation and generate custom podcasts around a theme of your choice.
Looking Forward
We adopted the llms.txt standard because we found it a useful approach and users had asked us about it. We think it’ll open up some useful options for using AI to improve your pipelines, enabling better context understanding, more accurate code suggestions, and improved documentation search and retrieval.
You can find our base llms.txt file at https://zenml.io/llms.txt, and you can access all our specialized files through our HuggingFace dataset too if you’d like. We encourage you to explore these resources and provide feedback on how we can make them even more useful for your MLOps journey.