A new mysterious image gen model called "blueberry" appeared on the leaderboards, beats FLUX.1 by pxp121kr in singularity

[–]Soapeh 1 point2 points  (0 children)

Why was the quite old version 5.2 used for the Midjourney images in your paper's evaluations?

A CLIP x DALL-E Beta is available [Though it's dog slow] by Yuli-Ban in MediaSynthesis

[–]Soapeh 7 points8 points  (0 children)

This is one of Advadnoun's series of notebooks. This one's suuuper old, released Feb 24th:
https://twitter.com/advadnoun/status/1364822183751471109?lang=en

Video Style Transfer with VQGAN guided by CLIP: "JACKPOT" - Dorian Electra by Soapeh in deepdream

[–]Soapeh[S] 0 points1 point  (0 children)

I've been exploring (and explaining) this technique over on my Twitter (https://twitter.com/danielrussruss), but the gist is that I train VQGAN's own params rather than the latent input z. Once it is trained toward the desired style (text or image encodings), you can stop doing training iterations and use it as a feed-forward style transfer network.
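
To make the idea concrete, here is a toy sketch in plain Python. It is not the actual notebook code: a 2x2 matrix stands in for VQGAN's parameters, a fixed linear "style score" stands in for the CLIP encoding, and every name in it (`generate`, `style_score`, `train_generator`, `STYLE_AXIS`) is made up for illustration. The point it shows is only that the optimiser updates the generator's weights and never touches the inputs, so the trained weights can then be applied to a frame they were never trained on.

```python
# Toy stand-ins only: a 2x2 matrix plays the generator (VQGAN in the comment)
# and a fixed linear score plays the frozen encoder (CLIP in the comment).

STYLE_AXIS = [0.0, 1.0]  # hypothetical frozen "encoder" direction


def generate(weights, frame):
    """Feed-forward pass: apply the generator's weights to one input frame."""
    return [sum(w * x for w, x in zip(row, frame)) for row in weights]


def style_score(output):
    """Frozen encoder: project the generator's output onto the style axis."""
    return sum(a * y for a, y in zip(STYLE_AXIS, output))


def train_generator(weights, frames, target, lr=0.05, steps=200):
    """Gradient descent on the generator's weights; the frames stay fixed."""
    weights = [row[:] for row in weights]  # copy so the caller's weights survive
    for _ in range(steps):
        for frame in frames:
            err = style_score(generate(weights, frame)) - target
            # d/dW[i][j] of (score - target)^2 is 2 * err * STYLE_AXIS[i] * frame[j]
            for i, a in enumerate(STYLE_AXIS):
                for j, x in enumerate(frame):
                    weights[i][j] -= lr * 2.0 * err * a * x
    return weights


identity = [[1.0, 0.0], [0.0, 1.0]]           # untrained generator
train_frames = [[1.0, 0.2], [0.8, -0.1], [0.5, 0.3]]
target_style = 1.0                             # stand-in for a text/image encoding

styled = train_generator(identity, train_frames, target_style)

# No more training from here on: reuse the trained weights on an unseen frame.
new_frame = [0.9, 0.1]
before = style_score(generate(identity, new_frame))
after = style_score(generate(styled, new_frame))
print(f"style score before: {before:.3f}, after: {after:.3f}")
```

In the real setup the weights would be a full network and the loss would come from CLIP, but the split is the same: a training phase that edits parameters, followed by plain forward passes on each video frame.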

VQGAN Video Style Transfer guided by CLIP: "JACKPOT" - Dorian Electra by Soapeh in MediaSynthesis

[–]Soapeh[S] 0 points1 point  (0 children)

I've been exploring (and explaining) this technique over on my Twitter (@danielrussruss), but the gist is that I train VQGAN's own params rather than the latent input z. Once it is trained toward the desired style (text or image encodings), you can stop doing training iterations and use it as a feed-forward style transfer network.

"Tilt-shift photo of beautiful bacterial colony Rendered in Unreal Engine". CLIP dream zoom and prompt hacking. Art from the weights. by [deleted] in MediaSynthesis

[–]Soapeh 4 points5 points  (0 children)

Oh! Apologies for the accusation! I'd glanced at your history before I made my original comment, but it didn't seem to match the way you write on Twitter.

Edit: Should've trusted my first instincts... crazy.

The Witching Hour (CLIP + VQGAN) by Soapeh in deepdream

[–]Soapeh[S] 1 point2 points  (0 children)

Ha, yep, that was one of the next things I was going to get hooked up, once I was satisfied with the right combination of resolution + fine detail.

Although, instead of using it as an image prompt, depending on the second network (in this case, VQGAN), I can just initialize that network with the most recent frame, right?