
AI-Generated Storytelling: A GenAI Comic About ZenML

Alex Strick van Linschoten
Aug 19, 2024

I spent a good chunk of Saturday making a GenAI comic about ZenML. The idea (thanks Stefan!) was to showcase the value that the open-source framework we build can add to a team that's struggling to scale its mostly manual deployment and iteration processes. We speak to companies every day about the challenges they face, and the story I wanted to tell was one that represented the common patterns we see as companies move towards MLOps maturity.

GenAI tooling

For this, I mainly used getimg.ai and Replicate for the image generation. I started out using my local GPU machine, but it was much more convenient to use a service like getimg.ai, which lets you generate 6 or 8 images at once. (Fun fact: I created almost 2000 images during the course of the day!) I used Replicate for Flux since I had a bunch of credits left over there from the Hamel Husain / Dan Becker finetuning course, and it was impossible for me to play around with the Pro model locally (way too big!).

I used Adobe Express for the layout after a 10-minute browse around for suggestions. There are fancier options for layout, but I needed something simple and it was extremely quick to get going. I watched a really great tutorial on Twitter (shout out to Jordan Dené Ellis and Kris Kashtanova!) on how to get started, which taught me basically everything I needed for the layout.

The storyboard for the panels was the product of a very long conversation with Claude Sonnet, starting with some priming around story archetypes and engaging flows, and then moving on to the panels and layout. Once I'd settled on a version I was happy with, I moved on to image generation and actually making it happen. Along the way I sometimes had to go back and rework or remove images and panels that felt redundant. No plan survives contact with reality and all that!

Visual continuity challenges

The hardest part by far was achieving visual continuity over the course of the whole story, namely making sure that the face of the main character didn't change from image to image. There are ways to achieve this through prompts (e.g. give the character a scar or glasses or a vibrant hair colour), but those all felt a bit hacky and didn't hold up well under other changes (e.g. when his emotions changed). In the end I settled on a trick I'd learned during Part 2 of the FastAI course, which was more along the lines of passing in a reference image alongside the prompt. I got this working with diffusers first, just as proof that it was possible, and then quite quickly switched to having getimg.ai handle this: you can pass in characters, styles or other content.

Screenshot of the getimg.ai image generation tool, using custom characters

The rest of the work was just many iterations: tweaking how closely the image reference was to be followed, and adjusting the number of inference steps and the guidance scale. I also had a standard negative prompt ('Disfigured, blurry, nude'), but I didn't play around with that much.
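For anyone who wants to reproduce that iteration loop locally, here is a minimal sketch using diffusers. It assumes plain image-to-image generation is a reasonable stand-in for the "image reference" approach (the exact mechanism getimg.ai uses isn't something I can vouch for). The model ID, the `Img2ImgSettings` class, and the `settings_grid` and `generate` helpers are all illustrative names and choices, not what was used for the comic.

```python
from dataclasses import dataclass
from itertools import product

# The negative prompt quoted above.
NEGATIVE_PROMPT = "Disfigured, blurry, nude"


@dataclass(frozen=True)
class Img2ImgSettings:
    """One combination of the knobs being tweaked (hypothetical helper)."""
    strength: float           # 0-1; lower values stay closer to the reference image
    num_inference_steps: int  # more steps is slower, often cleaner
    guidance_scale: float     # how strongly the text prompt is followed


def settings_grid(strengths, steps, scales):
    """Return every combination of the three knobs, in itertools.product order."""
    return [Img2ImgSettings(s, n, g) for s, n, g in product(strengths, steps, scales)]


def generate(prompt, reference_path, settings,
             model_id="stabilityai/stable-diffusion-xl-base-1.0"):
    """Run one img2img generation per settings entry.

    Needs diffusers, torch and a CUDA GPU. The default model_id is an
    assumption standing in for whichever checkpoint you actually use.
    """
    # Imported lazily so the grid helper works without the heavy dependencies.
    import torch
    from diffusers import AutoPipelineForImage2Image
    from diffusers.utils import load_image

    pipe = AutoPipelineForImage2Image.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    reference = load_image(reference_path)

    images = []
    for s in settings:
        result = pipe(
            prompt=prompt,
            image=reference,
            negative_prompt=NEGATIVE_PROMPT,
            strength=s.strength,
            num_inference_steps=s.num_inference_steps,
            guidance_scale=s.guidance_scale,
        )
        images.append(result.images[0])
    return images


# Build a small sweep: two strengths x one step count x two guidance scales.
grid = settings_grid(strengths=[0.4, 0.6], steps=[30], scales=[5.0, 7.5])
```

Calling `generate("a tired ML engineer at his desk, cartoon style", "character.png", grid)` would then produce one image per combination (the prompt and file name here are made up). Keeping the sweep small matters, since every extra value multiplies the number of generations you wait for.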

Aside from Flux, the other model I used was realcartoon-xl-v6, and that was the main one I used for facial continuity. Unfortunately, this means that some of the images in the comic are photorealistic while others have more of a cartoon feel to them. I was too excited to play around with Flux; I probably should have just stuck with the cartoon style and a single model.

I saw that there is a version of Flux on Replicate that supports img2img, but I didn't have much success with it and the inference speeds were pretty slow. If I were to spend more time on this, I'd definitely dig into developing more characters with DreamBooth-adjacent techniques, ideally on top of flux-schnell. (There are also services which claim to handle character continuity for you, mainly in the 'AI girlfriend / boyfriend' space it seemed, but none of them were flexible and speedy enough to handle the task. Leonardo.ai, for example, was one that I tried.)

How to try this out

Nevertheless, it was really fun to play around with this. If you're planning on working on something similar, I'd recommend getimg.ai, and do upgrade your account so you can get the fastest inference possible. (No affiliate links with them here; I just found their service to be super solid!) That said, I reckon something like 1/10 of my time working on this was spent waiting for images to generate, which was just far enough from instant to be annoying throughout.

Let me know if you've worked on similar projects and you know of services or techniques or ways to make this a bit easier! It was just fun enough that I might try this again some time! And please do give the comic a read!
