data:image/s3,"s3://crabby-images/a3a18/a3a18525143daee72f01b6d288285c651a334a5e" alt="🤗 Embedding HuggingFace datasets visualizations with ZenML"
Yesterday, Alex pointed me to this tweet from Julien Chaumond, CTO of Hugging Face:
data:image/s3,"s3://crabby-images/06821/06821cf6463e3f46bf5d535c5ef20f90a2d73c2e" alt="Screenshot of a tweet by Julien Chaumond showing off how you can embed the hugging face datasets viewer in any webpage"
We instantly thought it would be a good idea to embed the visualization in the ZenML dashboard. Since the 🤗 Hugging Face team already exposes this embedding functionality as a simple iframe, it was easy to do:
data:image/s3,"s3://crabby-images/30737/30737f76f9a995bf60cccb5ad974a1a63dbe03f8" alt="Screenshot showing the HTML code you can use to embed an iframe for a hugging face dataset"
See an example on any 🤗 Hugging Face dataset
Within a few hours, we had it reviewed and merged:
data:image/s3,"s3://crabby-images/4f085/4f085efeba8dbcb0764c048ee9bc9b188fe52f4b" alt="Image of the ZenML dashboard with a Hugging Face artifact visualization embedded"
Custom visualizations in ZenML
In ZenML, a materializer is the component that saves objects to artifact storage and loads them back. The interface is quite simple, and optionally includes a function where users can attach custom visualizations:
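To make the shape of that interface concrete, here is a minimal standalone sketch. `SketchMaterializer` and `JsonMaterializer` are hypothetical stand-ins written only for illustration; they do not import ZenML, and the real ZenML base class differs in its details (method signatures, type registration, artifact store handling). The only ideas carried over from the text are the save/load pair and the optional `save_visualizations` hook.

```python
import html
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict


class SketchMaterializer(ABC):
    """Hypothetical stand-in for a materializer (not ZenML's real class)."""

    def __init__(self, uri: str) -> None:
        # Directory that holds everything belonging to one artifact.
        self.uri = uri

    @abstractmethod
    def save(self, data: Any) -> None:
        """Write the object to storage."""

    @abstractmethod
    def load(self) -> Any:
        """Read the object back from storage."""

    def save_visualizations(self, data: Any) -> Dict[str, str]:
        """Optional hook: map each visualization file path to its type."""
        return {}


class JsonMaterializer(SketchMaterializer):
    """Toy materializer: stores a dict as JSON and renders it as HTML."""

    def save(self, data: Dict[str, Any]) -> None:
        with open(os.path.join(self.uri, "data.json"), "w", encoding="utf-8") as f:
            json.dump(data, f)

    def load(self) -> Dict[str, Any]:
        with open(os.path.join(self.uri, "data.json"), encoding="utf-8") as f:
            return json.load(f)

    def save_visualizations(self, data: Dict[str, Any]) -> Dict[str, str]:
        # Render each key/value pair as a list item, escaping user content.
        items = "".join(
            f"<li>{html.escape(str(k))}: {html.escape(str(v))}</li>"
            for k, v in data.items()
        )
        path = os.path.join(self.uri, "summary.html")
        with open(path, "w", encoding="utf-8") as f:
            f.write(f"<ul>{items}</ul>")
        # Key: where the file lives. Value: what kind of file it is.
        return {path: "html"}
```

A materializer that does not override `save_visualizations` simply contributes no visualizations, which is why the hook can stay optional.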
The materializer interface is extensible, and it’s easy to make custom ones by adding a class to your codebase. For 🤗 Hugging Face datasets, there is already a standard materializer that takes care of reading and writing a dataset to and from storage. All that needed to be done was to implement the save_visualizations function.
📢 Note: there are other ways to create custom visualizations in ZenML, but this was the simplest in this case.
The save_visualizations function expects us to return a dictionary of key-value pairs, where the key is the path where the visualization file is stored, and the value is the type of file that we persist. ZenML already supports HTML file types, so the logic was fairly simple. Here is the implementation:
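As a rough, standalone sketch of that logic (not the actual ZenML code): write an HTML file containing the viewer iframe, then return its path mapped to the HTML type. The function name `save_viewer_visualization`, the file name `viewer.html`, the plain string `"html"` as the type, and the iframe attributes are all assumptions made for illustration. The embed URL pattern is likewise assumed from the embed snippet described above, and in the real materializer the dataset name would have to be derived from the dataset object rather than passed in.

```python
import os
import tempfile
from typing import Dict
from urllib.parse import quote


def save_viewer_visualization(repo_id: str, uri: str) -> Dict[str, str]:
    """Write an iframe for the dataset viewer and report it as HTML.

    repo_id: dataset name on the Hub, e.g. "nyu-mll/glue" (illustrative).
    uri: directory where the visualization file should be written.
    """
    # Assumed embed URL pattern for the Hugging Face dataset viewer.
    embed_url = f"https://huggingface.co/datasets/{quote(repo_id)}/embed/viewer"
    iframe = (
        f'<iframe src="{embed_url}" frameborder="0" '
        'width="100%" height="560px"></iframe>'
    )

    path = os.path.join(uri, "viewer.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(iframe)

    # Key: where the visualization is stored. Value: its file type.
    return {path: "html"}
```

Because the HTML file holds nothing but the iframe, the dashboard only needs to render it as HTML; the viewer itself is loaded from the embed URL.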
You can see the full materializer implementation here.
And that’s that! Now, when any 🤗 Hugging Face dataset is returned from a step in a ZenML pipeline, the materializer also embeds the viewer in the ZenML dashboard.
How to embed a 🤗 dataset view in ZenML
Here is a simple example in action that embeds the glue dataset:
Run the above with ZenML version 0.62.0 or later, and you’ll see the following in the ZenML dashboard:
data:image/s3,"s3://crabby-images/ea475/ea475b4056deeb2140a3750f0762e443e1fac794" alt="GIF showing the ZenML dashboard UI, navigating to an embedded hugging face dataset"
This was a fun two hours, spent on a relatively simple but hopefully popular enhancement to the ZenML Hugging Face integration. Give us a star if you like it, or say hi on Slack! Till next time.