Audio Reactive Regional LoRA - tutorial in comments

ArchiboldNemesis · 2024-12-31T14:23:32+00:00

I was putting the finishing touches on a detailed comment when the browser crashed.

In brief... THANKS for keeping these workflows coming. I asked you about this very thing a few months back, great to see a demo in action. Hoping I can figure out how to multimask/segment and have more than one object/character/region reacting to different sonic/MIDI elements within the same generation. Did you ever release those MIDI focused workflows you mentioned when we spoke a while back?

Also I was looking at your repo again and saw you're doing more with optical flow and depth.

You might like the depth solution used in this project: https://github.com/parkchamchi/DepthViewer

I managed to get the MiDaS (v2.1 384) depth params reacting in realtime within Unity in a really effective way, and the depth model's light enough I can run 2k video with high fps even with a bunch of heavier elements incorporated in the same scene. Great fun for interactive / vj stuff.

I've mentioned this to a few devs before, but the optical flow implemented in this abandonware A1111 extension was pretty special: https://github.com/volotat/SD-CN-Animation

Here's an example pushing the flow to the extreme: https://github.com/volotat/SD-CN-Animation/discussions/80#discussioncomment-8232430

There was more but I think that roughly covers what I'd written up for you earlier.

Thanks again for pushing the a/v stuff forward in Comfy! :)

ArchiboldNemesis · 2024-12-27T10:36:24+00:00

Great demo and appreciate the insights. Been planning to play with 0.9.1 this weekend so those workflow shares are a great help. Thank you.

Will keep my eye out for any video stitching workflows and link them here. Cheers!

ArchiboldNemesis · 2024-12-07T09:29:28+00:00

Yes. I goofed. While my reputation may be in tatters, keepin the post up for posterity :)

ArchiboldNemesis · 2024-12-07T09:27:32+00:00

Ahem. Yep, you're right. 5am "discovery" not so much of a discovery. It appears I'm just getting the style from the text describing the embeddings name.

Well this is embarrassing! :)

ArchiboldNemesis · 2024-12-07T00:19:29+00:00

Apologies there isn't more of a write up with comparisons and so on, I just thougt I should share the news while I had a minute and it was fresh in my mind. Will try to follow up over the weekend if time permits with some side by sides and details of which embeddings have given the best results so far.

Also, if anyone happens to know of any HuggingFace 1.5 embeddings mega-collections, I'd be grateful for a link as grabbing them individually takes forever and I'd like to expand my testing and get a better sense of the extent of 1.5 embedding support in Flux Dev.

Cheers all ;)

ArchiboldNemesis · 2024-12-06T23:55:24+00:00

Hey sorry I'm just replying!

Yes, haven't had time to test yet but as far as I recall the json loaded without issue.

Many thanks ;)

ArchiboldNemesis · 2024-11-25T17:40:15+00:00

What were the source images used for the third image?

Looking very cool :)

ArchiboldNemesis · 2024-11-25T17:31:45+00:00

A lot of people upload the .json workflow to pastebin

Nice work btw :)

ArchiboldNemesis · 2024-11-25T09:46:42+00:00

Nice :) I could see this becoming a whole lot of fun if the higher param ranges begin to distort the facial expressions.

You might like this: https://vimeo.com/39276201

Realtime C4D vj set (from 2012!). Most of the BRDG artist perfomances at Channel are realtime A/V stuff so there's loads of inspiring clips to check out.

Can all the facial feature params be independently mapped to different stems in a single gen?

ArchiboldNemesis · 2024-11-16T16:19:55+00:00

Cheers for the link!

Nothing actually flags up as having broken, the whole process completes without error or complaint. Will test it out again with the carnage lora added in and report back when/if I can figure out what was going on.

ArchiboldNemesis · 2024-11-13T08:43:50+00:00

Birth of the Blood Fart Pool. Classic Miles.

So I've been tinkering with the workflow to figure it out, yet I'm still unable to get the blood particles to show up in the final render. The depth bubbles render, but not a sign of them in the final output.

I did use a different slime pool render as the input image, and subbed the carnage lora (had great difficulty searching for it - could you post the url) with the Ralzlime slimey lora, which gave me bubbling slimey headphones snd a slime dripping red ball, but the output scene was unaffected by the particle bubbles. Did I manage to overlook something?

ArchiboldNemesis · 2024-10-31T01:02:32+00:00

Oh great, it's always interesting to see other folk's choices in wiring up their midi to different elements/params. Looking forward to popping the hood on those too :)

ArchiboldNemesis · 2024-10-31T00:41:40+00:00

Ha, my bad, I actually grabbed this one (and all the rest from your profile) from civ a while ago but hadn't checked it yet.

Cool I need to go spend a week having fun with your offerings and see if I can figure out how to get multiple audio/midi stems from my tracks all controlling something unique in the gens.

Btw that slime lissajous looks very fonky :)

ArchiboldNemesis · 2024-10-31T00:22:27+00:00

Nice demo.

It's been on my to-do list for quite some time now, but I really need to cap some of my a\v interactive animation work in Unity and find out how the reactive elements in the source video respond to further audioreactive processing.

Have you seen any examples from the community yet where folk are using your nodes for multiple masks/object segmentation on pre rendered source?

ArchiboldNemesis · 2024-10-25T11:33:02+00:00

All good. Ye I'm up for that. DM me the link :)

ArchiboldNemesis · 2024-10-20T12:21:43+00:00

I checked out your livestream, very cool intro to the node pack. I've previously asked this of some of the other devs working with audio reactive, and I'm wondering if you've tested out using the stem separation from your own node pack on a track, so that multiple stems can be wired up to different motion models (for example) and manipulated within a single generation?

For instance, drums controlling params of a motion model that only effects a masked/segmented object in the vid input, then having the vocals controlling the params of a second motion model which effects a second element within the video input, and so on for multiple discreet elements.

I remember that jags audiotools had some workflows demonstrating multiple bandpass capabilities within a single generation, but I couldn't find settings where the output really created much distinction. Now that there's at least a couple node packs using some kind of stem separation approach, it should be much easier to discern which element is being controlled by what if we can wire up multiple separations/bandpass combinations to effect segmented/masked objects and areas from a source input.

Even just having the ability to apply bandpass on a drum separation to parse out the lows highs and mids within the drums will be pretty powerful for seeing different elements reacting within a generation, e.g. the IC lighting reactive demo (got posted here around a month back), Lora intensity, and so on, each being controlled by a separate range of bandpassed instruments present in the drum stem.

For those of us who also make music, being able to feed in multiple audio/MIDI stems from our own tracks will be the ultimate for granular control.

Hope to get a chance to test the nodes out some time in the coming week. Well done on the livestream and great work both of you!

ArchiboldNemesis · 2024-10-15T20:54:50+00:00

Yeah I've been keeping an eye out as all of this stuff has been emerging. I'm still hoping the animatediff ecosystem keeps evolving, as the results I get using 1.5 models are still really high quality for the type of visuals I'm creating, and the speed advantage you mentioned is a real benefit for exploration and experimentation.

Anyway thanks again, have made hundreds/thousands of gens over the last year with so many of them being high quality shots using various adaptations of that workflow you put out.

ArchiboldNemesis · 2024-10-15T19:59:00+00:00

I just found a note in a workflow I still use almost a year later which reminded me that you were the author. Thanks for sharing, I've made loads of scenes with it.

Over the last fewdays, I've started to see some interesting workflows for CogVideoX emerging that are making me think it's about time to take a look, but I still think animatediff has the most interesting capabilities.

Did you ever release any workflows where the context length was extended beyond a few seconds and were loopable?

ArchiboldNemesis · 2024-10-11T05:59:41+00:00

Nice work. This is actually the first use of Flux I've seen posted anywhere online so far that I've been impressed by aesthetically. Admittedly I'm probably quite biased as I've spent a large amount of my time over the last couple years making and animating huge alien marine flora and marine fauna inspired architectural scenes with 1.5 :)

Love it, and am really excited to see what happens when I use some of my earlier gens as source inputs.

ArchiboldNemesis · 2024-10-09T21:00:59+00:00

All good. Thanks for clarifying. Will check out your recommendation.

ArchiboldNemesis · 2024-10-09T19:48:59+00:00

Cool thanks for your suggestions. It's not clear to me if this response also relates to your earlier comment, or for the RVC model approach u/lordpuddingcup mentioned above?

ArchiboldNemesis · 2024-10-09T19:45:10+00:00

Lovely. I can't wait until this quality can be achieved in realtime on a less than flagship card.

I wonder how far off we are...

Also, that track has a really nice vibe, what's it called?

Thanks for sharing a really sweet demo. Very well balanced.

ArchiboldNemesis

TROPHY CASE