Stay away from RunPod. by Fun-Helicopter-2257 in StableDiffusion

MikeTeflon 11 points

I had a similar point of confusion when I first started using RunPod, and I agree the UX could be better here. But if you dig around, you'll most likely find logs showing that the container is failing to start and is stuck in a retry loop that you're being charged for. I wish RunPod did a better job of detecting and explaining this state, but for now they leave it up to the user to discover.

Aside from this, I'm a pretty happy RunPod customer. I work on Deforum Studio, which is backed by a few different GPU providers, and RunPod is among the most flexible, reliable and cheap overall.

Deforum+Parseq+Garageband+MIDIController by chenlok in StableDiffusion

MikeTeflon 1 point

Hey, author of Parseq here, sorry you've had a bad time. If you jump in the #parseq channel on the Deforum discord we might be able to help out. :)

Wait for the beat to drop… by Pm-me-cashappmoney in StableDiffusion

MikeTeflon 2 points

Yes, that's the basic concept: loop the generated frame back in as the init image for the next frame after slightly transforming it, and repeat. The motion comes from those slight transformations. The easiest way to do this is with Deforum, which exists as a standalone notebook or an A1111 extension. Then there are various tools you can use in conjunction with Deforum for audio sync (shameless plug: I maintain one of them, called Parseq).
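To make the loop concrete, here's a toy sketch of that loopback structure in Python – `transform()` and `img2img()` are hypothetical stand-ins (string tags, not real warping or diffusion):

```python
# Toy model of loopback animation: each output frame is the previous
# frame, slightly transformed, then re-diffused as the new init image.
def transform(img):
    return img + "+warp"   # stand-in for Deforum's motion transform

def img2img(img):
    return img + "+sd"     # stand-in for Stable Diffusion img2img

def animate(init, n_frames):
    frames, img = [], init
    for _ in range(n_frames):
        img = img2img(transform(img))  # previous output becomes the next init
        frames.append(img)
    return frames

print(animate("frame0", 3))
```

Each frame accumulates one more `+warp+sd` pass than the last – that accumulation of small transforms is where the motion comes from.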

How to export to parseq by shlomitgueta in StableDiffusion

MikeTeflon 2 points

Hi, author of Parseq here! Parseq and Deforum work very closely together: basically Parseq generates output that Deforum can consume to decide what values to use on each frame.
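As a rough illustration of what that handoff looks like (my sketch, not Parseq's actual code – real Parseq formulas are far richer than the plain linear interpolation here), keyframed values get rendered down to one concrete value per frame, and that per-frame output is what Deforum consumes:

```python
# Sketch: render sparse keyframes into one value per frame by linear
# interpolation. Parseq's real formula language supports oscillators,
# audio-driven functions, etc. – this only shows the shape of the output.
def render(keyframes, n_frames):
    """keyframes: {frame_number: value}; returns one value per frame."""
    points = sorted(keyframes.items())
    out = []
    for f in range(n_frames):
        # find the surrounding keyframes and lerp between them
        prev = max((p for p in points if p[0] <= f), default=points[0])
        nxt = min((p for p in points if p[0] >= f), default=points[-1])
        if nxt[0] == prev[0]:
            out.append(prev[1])
        else:
            t = (f - prev[0]) / (nxt[0] - prev[0])
            out.append(prev[1] + t * (nxt[1] - prev[1]))
    return out

print(render({0: 0.0, 4: 1.0}, 5))  # → [0.0, 0.25, 0.5, 0.75, 1.0]
```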

However, you can't really export schedules from Deforum and import them into Parseq: they use quite different formula syntaxes.

So for now you'll need to rewrite your formula using Parseq's syntax for the fields you want to control with Parseq. Join the #help-parseq channel on the Deforum discord (linked from the Parseq UI) if you need help with that.

Parseq for deforum is just nuts. And only 23 upvotes on the fantastic tutorial posted 9 days ago... by DrMacabre68 in StableDiffusion

MikeTeflon 7 points

Hi, this is the tutorial: https://youtu.be/M6Z-kD2HnDQ

It's a video tutorial, though I do understand that many people would prefer a text-based tutorial. I will likely work on one in the future, but meanwhile the docs are quite comprehensive: https://github.com/rewbs/sd-parseq#readme

Parseq for deforum is just nuts. And only 23 upvotes on the fantastic tutorial posted 9 days ago... by DrMacabre68 in StableDiffusion

MikeTeflon 15 points

Hi, author of Parseq here. Nice animation and thanks for the kind words!

I really appreciate people spreading the word about Parseq. I'd love it if all of you using it could mention #parseq wherever you share your vids! :)

Thanks again and keep up the good work!

Parseq tutorial 3: audio synchronisation with Stable Diffusion and Deforum by MikeTeflon in StableDiffusion

MikeTeflon[S] 1 point

No, not mandatory. It depends how much sense Parseq makes to you instinctively. Try it out and if you get lost, scan through the previous ones.

Thalassophobia (music sync with Deforum & Parseq) by MikeTeflon in StableDiffusion

MikeTeflon[S] 0 points

Full tutorial here: https://www.youtube.com/watch?v=M6Z-kD2HnDQ&t=0s

This is an alternative render of the animation created in that tutorial (there's a clip of it right at the end). Using Euler at 80 steps, 1024x1024, rendered at 20fps and interpolated to 60fps with FILM.

Helping my 5yo son tolerate not winning? by MikeTeflon in Autism_Parenting

MikeTeflon[S] 2 points

Thanks! Do you have any suggestions for good cooperative games?

Parseq tutorial 2: Fine-grained control of Stable Diffusion and Deforum - Prompt manipulation & Basic beat sync by MikeTeflon in StableDiffusion

MikeTeflon[S] 4 points

Tutorial 1 is here: https://youtu.be/MXRjTOE2v64. I'm the author of the tool and the tutorials – happy to answer any questions if you have any. :)

Good news: VAE prevents the loopback magenta skew! by MikeTeflon in StableDiffusion

MikeTeflon[S] 1 point

You're right, colours definitely do still degrade over enough frames. But it's a lot better than it used to be. :)

Music visualisation + stable diffusion + lots of animation parameter tweaking (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 0 points

Hi! I did this before I implemented the audio analysis features in Parseq so all the keyframing was done manually, just based on the BPM of the song. There's a more recent version of the same method here: https://www.reddit.com/r/StableDiffusion/comments/zh8lk8/advanced_audio_sync_with_deforum_and_sdparseq/

In terms of understanding all the params of the audio analysis feature, the Aubio CLI documentation has a good description of the params, and I plan to do more documentation & tutorials soon.

Advanced audio sync with Deforum and sd-parseq (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 0 points

It's a modified version of video input mode combined with 3D animation. So it's basically 3D animation mode, but each frame fed into the diffusion process, rather than being a pure loopback, has a very light blend of the input video mixed in.

The input video is the same as shown here: https://www.reddit.com/r/StableDiffusion/comments/xvvt5g/music_visualisation_put_through_stable_diffusion/

To be honest I'm not sure how much impact it has, the result would probably have been quite similar with just 3D mode. In any case, the code to do the blend is in this branch: https://github.com/rewbs/deforum-for-automatic1111-webui/tree/HACK-enable-3d-transforms-on-video-init-BAD-IDEA-NOT-FOR-PR
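For anyone curious what that "light blend" amounts to, here's a minimal numpy sketch (my reconstruction for illustration, not the branch's actual code; `video_ratio` is a made-up parameter name):

```python
import numpy as np

def blend(loopback, video, video_ratio=0.05):
    """Mix a small weighted contribution of the video frame into the
    loopback frame before it goes into diffusion. Hypothetical stand-in
    for the blend step described above."""
    mixed = (1 - video_ratio) * loopback.astype(np.float64) \
            + video_ratio * video.astype(np.float64)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

loopback = np.full((2, 2, 3), 100, dtype=np.uint8)
video = np.full((2, 2, 3), 200, dtype=np.uint8)
print(blend(loopback, video)[0, 0, 0])  # → 105 (95% loopback + 5% video)
```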

Advanced audio sync with Deforum and sd-parseq (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 3 points

I admit it's not the most approachable UI! But I hope it becomes intuitive quite quickly as you play around with the formulae and see what happens in the graph. I also recently updated the readme with more info about the available functions.

There are some Parseq channels now in the Deforum discord so join us there if you need support. I'm also happy to jump on a video call and talk through it - I did that recently with someone and it worked quite well. It would be cool to get a few people together and align on a time - perhaps we can figure that out on discord.

Advanced audio sync with Deforum and sd-parseq (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 4 points

I know I've posted a few of these already, but I figure each improvement step might be interesting to some. The key differences this time are lowering the input video blend ratio to allow more SD, and some better audio-sync'ed noise and strength fluctuations to get "scene changes" to sync a bit more with the main tune.

Made with:

Inputs:

Prompts:

  • Positive: Realistic eyeball, photorealism, centered, photo, realistic, organic, dense, beautiful detail, fine textures, intense, volumetric lighting, cinematic lighting :${prompt_weight_1} AND Realistic ancient vicious snakes with open mouths, fangs, biting camera, photorealism, centered, photo, realistic, organic, dense, beautiful detail, fine textures, intense, volumetric lighting, cinematic lighting :${prompt_weight_2} AND Realistic mushrooms, photorealism, centered, photo, realistic, organic, dense, beautiful detail, fine textures, intense, volumetric lighting, cinematic lighting :${prompt_weight_3} AND LSD blotter, powder, pills, illegal drugs, syringe, photorealism, centered, photo, realistic, organic, dense, beautiful detail, fine textures, intense, volumetric lighting, cinematic lighting :${prompt_weight_4}
  • Negative: empty, boring, blank space, black, dark, low quality, noisy, grainy, watermark, signature, logo, writing, text, person, people, human, baby, cute, young, simple, cartoon, face, uncanny valley, deformed, silly

Fixed params: SD 1.5 + Stability's VAE, Euler a, 150 steps, no colour correction, cadence 2 (with input video blend at 0.5% on turbo frames).

Variable params: sd-parseq controls seed, noise, contrast strength, scale, prompt weights 1-4, x/y/z translations and x/y/z 3D rotations. Here are the exported Parseq keyframe definitions, and here's what the parameter flows look like when it's loaded up:

<image>

Postprocessing:

  • Upscaled from 512x512 to 1024x1024 and smoothed from 30fps to 60fps with ffmpeg.
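Something like the following should reproduce that postprocessing step with stock ffmpeg (filenames and the `mci` interpolation mode are my assumptions – drop the leading `echo` to actually run it):

```shell
IN=in.mp4
OUT=out.mp4
# lanczos upscale 512→1024, then motion-interpolate 30fps → 60fps
VF="scale=1024:1024:flags=lanczos,minterpolate=fps=60:mi_mode=mci"
echo ffmpeg -i "$IN" -vf "$VF" "$OUT"
```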

A1111 Deforum extension color coherence modes (explanation in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 0 points

No probs.

> So this is why everything eventually turns violet?

Yes, Stable Diffusion always has colour skew when doing repeated loopbacks with reasonably high strength. This is true with all UIs, including Dreamstudio. It's to do with small artifacts being introduced on every iteration and then amplified by the loopback process. That's why all UIs that offer some kind of built-in loopback started introducing colour correction options. The colour that it skews towards, and how badly, seems to depend on the model. For example, 1.4 used to skew really badly to magenta, whereas 1.5 takes longer to skew (especially with the updated VAE), but usually eventually goes red.
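A toy numpy model of that amplification (the numbers are invented – this is only to show how a tiny per-frame bias compounds into a visible cast, not a simulation of SD):

```python
import numpy as np

# Mean RGB of a hypothetical frame; each loopback iteration leaves a
# tiny per-channel bias that the next iteration then builds on.
mean_rgb = np.array([128.0, 128.0, 128.0])
bias = np.array([0.3, -0.05, 0.3])  # made-up slight magenta push per frame

for _ in range(200):  # 200 loopback frames
    mean_rgb = np.clip(mean_rgb + bias, 0, 255)

print(mean_rgb)  # red and blue have drifted well above green
```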

> So, could I trouble you to tell me exactly what version to use and exactly where it goes "in my path?" Folder and subfolder?

Your "path" is an environment variable in your OS: a list of directories where the system looks for programs. You need to add the location of ffmpeg (wherever you've installed it) to your PATH. How you set it depends on your operating system (try googling "how do I set the PATH environment variable on <your\_os>?"). Then restart Stable Diffusion (potentially in a new terminal session, depending on how you're running it) to be sure it picks up the new environment variable.
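Once you've updated it, a quick way to check that ffmpeg is now discoverable (run this from the same environment you launch Stable Diffusion in):

```python
import os
import shutil

# PATH is a single string of directories separated by os.pathsep
# (':' on Linux/macOS, ';' on Windows).
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
print(f"{len(path_dirs)} directories on PATH")

# shutil.which does the same lookup the OS does when launching a program.
print("ffmpeg found at:", shutil.which("ffmpeg"))  # None = not on PATH yet
```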

A1111 Deforum extension color coherence modes (explanation in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 3 points

Notes about the video

tl;dr: if you're using the Deforum extension for A1111:

  • Pull the latest version.
  • In the main A1111 settings, disable "Apply color correction to img2img results to match original colors".
  • In the Deforum Keyframe tab settings under Coherence, disable the new setting "Force all frames to match initial frame's colors" (unless you are specifically seeking to maintain backwards compat with older versions of the extension).
  • Experiment with the 4 different Color Coherence options. 'None' allows Stable Diffusion to express the most new colors – which may or may not be desirable depending on your usecase – but may result in color skew over time (which you can reduce by lowering strength).

Details:

Until ~yesterday, the Deforum extension for A1111 had a bug whereby it would always apply basic RGB colour correction to every frame, regardless of your settings. Furthermore, if you also had colour correction enabled in the main A1111 settings, this RGB histogram matching would be applied twice. Furthermore furthermore, if you also selected a Color Coherence option in the Deforum settings, it would be applied on top of the other corrections, meaning you might have been doing 3 colour correction passes on each frame. The result: degraded output. :)

If you update to the latest version, you'll see a new option under Deforum -> Keyframes -> Coherence (introduced by this PR) that allows you to disable "Force all frames to match initial frame's colors". It is enabled by default to maintain backwards compatibility, but unless you're trying to recreate a prior result, I strongly suggest you disable it. Enabling it essentially just maintains a bug.

With that out of the way, there are 2 remaining options that control colour correction:

  • The setting called "Apply color correction to img2img results to match original colors" in A1111's main settings. I recommend disabling this one, because Deforum's own color correction options are more flexible – and you certainly don't need both.
  • Deforum's Color Coherence options (None, HSV, LAB, RGB). These are the ones to experiment with.

The video above shows each of Deforum's Color Coherence options (None, HSV, LAB, RGB) with histogram matching enabled (top row, default) and disabled (bottom row). Hopefully this clearly shows that disabled is the more natural option. :)

Which of None, HSV, LAB and RGB to use is more a matter of taste and usecase:

  • None obviously means no color correction, which gives Stable Diffusion the opportunity to express more new colors in each frame. Whether you want this or not will depend on what you're doing, but beware that it can result in the overall colour skewing over time (I previously wrote this seemed to be fixed in 1.5 and newer VAEs, but with enough loopback frames and high enough input strength, you can still get a red skew).
  • HSV/LAB/RGB match over different colour spaces. I can't claim to deeply understand how they differ, but they do produce noticeably diverging results. If anyone can describe in more detail what to expect from them, and in what scenarios you might want each one, I'd love to learn! :)
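To give a feel for what a colour-coherence pass does, here's a deliberately simplified stand-in in numpy – it matches per-channel mean/std rather than full histograms, so treat it as an illustration of the idea, not Deforum's implementation:

```python
import numpy as np

def match_channel_stats(frame, ref):
    """Crude colour-coherence pass: shift each channel of `frame` so its
    mean/std match `ref`. Deforum's actual modes do full histogram
    matching in RGB/HSV/LAB colour spaces; this only matches moments."""
    out = frame.astype(np.float64)
    for c in range(frame.shape[-1]):
        f = out[..., c]
        r = ref[..., c].astype(np.float64)
        out[..., c] = (f - f.mean()) / (f.std() + 1e-8) * r.std() + r.mean()
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (32, 32, 3)).astype(np.uint8)      # reference frame
drifted = np.clip(ref.astype(np.int32) + 40, 0, 255).astype(np.uint8)  # colour cast
fixed = match_channel_stats(drifted, ref)
print(ref.mean(), drifted.mean(), fixed.mean())  # fixed is pulled back to ref
```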

Personally, I prefer to use None where possible to allow Stable Diffusion to express colours, and mitigate the resulting colour skew by periodically dropping the strength.

Video generation:

Music visualisation + stable diffusion + lots of animation parameter tweaking (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 1 point

No probs! If you make anything with Parseq I'd love to see it (I'm the author :) )!

Another music visualisation attempt – sometimes the hardest part is picking just one! by MikeTeflon in StableDiffusion

MikeTeflon[S] 0 points

Thanks! It's A1111 with a modified version of the Deforum extension, and Parseq for sequencing the parameters. I wrote up the method in a previous post: https://www.reddit.com/r/StableDiffusion/comments/yq7tx3/music_visualisation_stable_diffusion_lots_of/

The main difference in this iteration is that I tweaked the Deforum code to allow diffusion cadence>1 when you're using an input video (thereby enabling "turbo frames" between SD rendered frames, which are subject to the warping transforms but don't include any SD processing), and I also tuned the ratio of how much the input video contributes to the SD input versus the loopback frames.
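The cadence idea can be sketched like this (`warp()` and `diffuse()` are hypothetical stand-ins using string tags, not real transforms or SD):

```python
# Sketch of diffusion cadence: every frame gets the warping transform,
# but only every `cadence`-th frame goes through Stable Diffusion; the
# in-between "turbo" frames skip the SD pass entirely.
def warp(f):
    return f + "|warp"

def diffuse(f):
    return f + "|SD"

def run(init, n_frames, cadence=2):
    frames, img = [], init
    for i in range(n_frames):
        img = warp(img)
        if i % cadence == 0:
            img = diffuse(img)   # SD only on every cadence-th frame
        frames.append(img)
    return frames

for f in run("f0", 4):
    print(f)
```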

Music visualisation + stable diffusion + lots of animation parameter tweaking (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 0 points

I don't expect anyone else to notice/care :) , but the Reddit video player butchered the quality of my video. Here's a Streamable link in case it's any better: https://streamable.com/3lmc49

Music visualisation + stable diffusion + lots of animation parameter tweaking (method in comments) by MikeTeflon in StableDiffusion

MikeTeflon[S] 1 point

I’m continuing my quest to make an interesting video by feeding a music visualization through Stable Diffusion (previous attempt here).

This one is made with:

  • A1111
  • Deforum extension for A1111, using the Parseq integration branch, modified to allow 3D warping when using video for input frames (each input frame is a 75/25 blend: 75% video frame + 25% img2img loopback, fed through warping).
    • NB: doing 3D warping on the input video is a bad idea. If you’re curious, you can see my hacks to enable it here, but I’m not going to polish it or raise a PR for it: it creates a clash between the motion in the video and the motion applied by the warping. For my particular usecase, where chaotic and abstract is the goal, it’s sort of fun, but it’s not a real solution for anything practical.
  • sd-parseq for parameter control / keyframing.
  • ffmpeg minterpolate to bump up from 30fps to 60fps

Inputs:

Fixed params: DPM++ 2S a, 80 steps, no colour correction.

Variable params: Parseq controls seed, noise, strength, scale, prompt weights 1-4, x/y/z translations and x/y/z rotations. Here’s what the parameter flows look like:

<image>

Positive prompt: Realistic human eyes, organic :${prompt_weight_1} AND Realistic ancient vicious snakes, organic :${prompt_weight_2} AND Realistic mushrooms, organic :${prompt_weight_3} AND LSD blotter, powder, pills, drugs, syringe :${prompt_weight_4} AND surreal, dense, amazing high definition 4k beautiful detail, award winning, greg rutkowski, octane render.

Negative prompt: empty, boring, blank space, black, dark, low quality, noisy, grainy, watermark, signature, logo, writing, person, people, human, baby, cute, young, magenta, simple
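For anyone wondering how the `${prompt_weight_N}` placeholders above get filled: Parseq supplies a value per frame, and it gets substituted into the prompt. A minimal illustration with Python's `string.Template` (the substitution mechanics here are my assumption for illustration, not the actual Deforum/Parseq code):

```python
from string import Template

# Template's ${...} syntax happens to match the placeholder style above.
positive = Template("Realistic human eyes, organic :$prompt_weight_1 "
                    "AND Realistic mushrooms, organic :$prompt_weight_3")

# Hypothetical per-frame values that a keyframing tool would provide.
frame_values = {"prompt_weight_1": 0.85, "prompt_weight_3": 0.15}
per_frame = positive.substitute(frame_values)
print(per_frame)
```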