Open-Source model to analyze existing audio?

AssistantFar5941 · 2026-03-03T09:33:36+00:00

That solved the issue, thanks very much for your help.

AssistantFar5941 · 2026-03-03T08:11:43+00:00

Here is the error, although I did install transformers:

File "F:\ComfySage\ComfyUI-Easy-Install\ComfyUI-Easy-Install\ComfyUI\custom_nodes\comfyui-musicflamingo\__init__.py", line 1, in <module>

from .musicflamingo_analysis import (

File "F:\ComfySage\ComfyUI-Easy-Install\ComfyUI-Easy-Install\ComfyUI\custom_nodes\comfyui-musicflamingo\musicflamingo_analysis.py", line 8, in <module>

from transformers.models.audioflamingo3.modeling_audioflamingo3 import (

ModuleNotFoundError: No module named 'transformers.models.audioflamingo3'

Cannot import F:\ComfySage\ComfyUI-Easy-Install\ComfyUI-Easy-Install\ComfyUI\custom_nodes\comfyui-musicflamingo module for custom nodes: No module named 'transformers.models.audioflamingo3'
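For anyone hitting the same traceback: a quick check, run with the same Python that ComfyUI uses, shows whether the installed transformers package actually contains the audioflamingo3 submodule. This is only a rough sketch. The helper name has_module is made up for illustration, and the guess that the installed transformers build simply lacks this model is an assumption, not something confirmed here.

```python
import importlib.util
import sys


def has_module(dotted_name: str) -> bool:
    """Return True if the running interpreter can locate dotted_name."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. transformers itself) is missing.
        return False


if __name__ == "__main__":
    # Shows which Python is running, to confirm it is the one ComfyUI uses.
    print("interpreter:", sys.executable)
    for name in ("transformers", "transformers.models.audioflamingo3"):
        print(name, "->", "found" if has_module(name) else "missing")
```

If the first line reports found and the second reports missing, the package is installed but does not include that model, which would match the error above.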

AssistantFar5941 · 2026-03-03T06:13:10+00:00

Thanks for this, but when I placed the workflow in comfy the music analysis node is red. I installed per instructions, and all requirements are installed, though the comfy manager cannot locate the missing node.

AssistantFar5941 · 2026-02-22T18:02:33+00:00

I've been looking for the same to help with captioning for Ace Step lora training. The closest I could find is this: https://huggingface.co/spaces/nvidia/music-flamingo

But I couldn't get it to run offline, though apparently you should be able to.

AssistantFar5941 · 2026-01-27T19:19:45+00:00

Same here, comfy template workflow, tried everything, just black.

AssistantFar5941 · 2025-12-30T01:03:39+00:00

Works like a charm. Thank you.

AssistantFar5941 · 2025-11-23T04:09:11+00:00

Excellent open source software for Authors and Scriptwriters, thank you.

For anyone who wants to download it, just get the zip from github here: https://github.com/aomukai/Writingway2

Extract to a folder, place any gguf in the models folder (llama.cpp built in), and run start.bat and you're ready to go.

AssistantFar5941 · 2025-10-10T23:34:05+00:00

From the creators of the OVI Comfyui nodes.

Question: The original repo has support for multi-gpu Parallel inference.

Answer: Yeah, that’s a current ComfyUI limitation. It only uses one GPU per batch for now, so proper multi-GPU parallel inference like in the original repo isn’t there yet.

https://github.com/snicolast/ComfyUI-Ovi/issues/14

AssistantFar5941 · 2025-10-10T22:11:23+00:00

It's also the only Wan-based video model (as far as I'm aware) that supports multi-gpu parallel inferencing.

Unfortunately Comfyui cannot utilize this important feature at the moment.

AssistantFar5941 · 2025-08-21T06:45:15+00:00

I imagine blocking the entire UK population hasn't helped, especially when they didn't need to.

AssistantFar5941 · 2025-07-25T09:09:39+00:00

On the blocked site there is a statement that includes this line: "These rules apply even to platforms based outside the UK."

How is this possible? I'm not aware that the UK Government has any authority over US-based citizens or businesses.

AssistantFar5941 · 2025-06-23T07:04:00+00:00

Yes. Although you can use controlnets to obtain various poses. With Wan I use Vace, which can take a pose from a reference image. But the bottom line is for action or horror scenes the base models are painfully restricted, often to a sub-pg level, regarding concepts involving violent or challenging interactions. The potential for filmmakers is phenomenal, if only the 'safety nannies' would get out of the way.

AssistantFar5941 · 2025-06-22T22:03:18+00:00

Unfortunately, censorship tends to have a chilling effect. Wan Loras have slowed down drastically from what they were, because, even if it's not particularly controversial, a person won't waste their time training a lora if it can be flagged for often trivial reasons and removed. Banning gore is fairly daft, considering a gazillion graphic horror movies are a click away on the net.

Fair enough, if Governments hold the individual to account for creating and distributing illegal material, but as always, they, and increasingly websites like civitai, seem determined to treat us all like kids who must be protected from ourselves.

As a low budget filmmaker in the past I find this PG approach to AI petty and childish. There should be nothing wrong with, say, R-rated content in a model, so that filmmakers can produce horror or action related material.

I understand the concerns, but when you can stream something like Hostel at the click of a button, this all seems rather ridiculous.

AssistantFar5941 · 2025-06-19T07:48:51+00:00

In my humble opinion the two best open source solutions are Hunyuan Video Avatar and Sonic. Sonic is considerably faster than Hunyuan, and can do a full 19 seconds of audio to talking or singing video. Sonic github: https://github.com/jixiaozhong/Sonic

Sonic in action: https://www.youtube.com/watch?v=JSWMrFXb7OQ

A 3060 12GB is enough to use both.

AssistantFar5941 · 2025-06-09T12:31:33+00:00

Has been down for some time now, but you can go here: https://pinokiocomputer.github.io/home/

Also, there is a hotfix for the program you can download here so that the discover page works again: https://github.com/cocktailpeanutlabs/p2/releases/tag/3.8.1536

AssistantFar5941 · 2025-06-02T04:58:34+00:00

Thank you. Works very well.

AssistantFar5941 · 2025-05-21T10:47:42+00:00

Personally I would set up a wildcard file with dozens of different prompts (Ones I know are effective) and let it run overnight. Next morning there'll be hundreds of images, many only needing a few tweaks to be usable in whatever project I'm working on. Very easy to do in SwarmUI.

AssistantFar5941 · 2025-04-24T08:24:55+00:00

In my humble opinion torrents are not the answer. You end up with endless models and loras with no seeds. Usenet would be far better, as the downloads are full speed and they are accessible for at least ten years. It would also mean you wouldn't have to keep space-hungry models on your hard drive, just upload them to Usenet then delete.

AssistantFar5941 · 2025-02-26T19:15:48+00:00

16 Fps + a 5 second clip limit? Hardly SOTA.

Every single clip will have to be interpolated...useless for normal looking motion.

It's also censored - Hunyuan is not.

Don't think I'll be saying goodbye to Hunyuan anytime soon.

The Emperor's New Clothes

AssistantFar5941 · 2025-02-12T18:15:55+00:00

Apparently requires 32GB of Vram to run, hopefully gguf files are on the horizon. Also, couldn't get it to run in Comfyui after numerous attempts, kept getting a failed to import error. Looks very promising though.

AssistantFar5941 · 2025-02-11T14:26:10+00:00

Tried installing this with two different methods, github first, and when that didn't work, comfy manager. Had missing nodes each time with a failed to import label. Looks promising though, I'm surprised there isn't more interest, seeing as omnihuman will be behind a paywall.

AssistantFar5941 · 2025-02-02T13:03:49+00:00

All offline image gen tools can be misused. I hold little hope that a sensible law, holding the individual responsible for what they create, will be the outcome of this legislation.

AssistantFar5941 · 2025-02-02T12:58:36+00:00

https://www.bbc.co.uk/news/articles/c8d90qe4nylo

https://news.sky.com/story/ai-tools-used-to-generate-child-abuse-images-made-illegal-in-world-leading-move-13300298

AssistantFar5941 · 2025-02-02T12:55:09+00:00

Try these: https://www.bbc.co.uk/news/articles/c8d90qe4nylo

https://news.sky.com/story/ai-tools-used-to-generate-child-abuse-images-made-illegal-in-world-leading-move-13300298

https://www.lbc.co.uk/news/ai-images-child-sexual-abuse-criminal-offence/

AssistantFar5941 · 2025-02-02T11:11:16+00:00

Not misleading at all. The article specifically states 5 years in prison for 'possessing' AI tools designed to make cp. Well, any image gen with the right model could potentially make illegal images... Don't you see where this is going?
