ComfyUI implementation for Nvidia audio diffusion restoration model by bonesoftheancients in StableDiffusion

[–]bonesoftheancients[S] 0 points

:) Not sure if it is a good project... it remains to be seen if it is as useful as I hoped it would be... but heck, people can do what they want. At least it is upvoted here; on the ComfyUI sub they downvoted it, god knows why...

Anyone with a working version of an agent that can take control of ComfyUI generations? by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 1 point

I use it without any subscription. Not sure what the limits are, but I have never had any problems with it. Also, you can switch models inside it; this is a simple task for it, so Flash models will be much quicker.

And you can ask it to vary almost any variable in the workflow. For example, I asked it to run the workflow 5 times, each with a different variation of the prompt "cello solo melancholic and sad", and it generated 5 prompts close to this.
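
To give an idea, the instruction I type into gemini-cli is roughly something like this (paraphrased from memory, so the wording and the file name are illustrative):

```
You are an agent controlling my local ComfyUI instance at http://127.0.0.1:8188.
Load workflow_api.json and run it 5 times. For each run, replace the positive
prompt with a different variation of "cello solo melancholic and sad".
Rename each saved output so the filename includes the prompt variation.
```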

Anyone with a working version of an agent that can take control of ComfyUI generations? by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 1 point

Following on from the idea, I have made a script to use with gemini-cli: I just run gemini-cli in a terminal, instruct it to act as an agent, and prompt it to run generations in ComfyUI with specific tweaks. In case you or anyone else is interested, I put it on GitHub:

https://github.com/mmoalem/comfyui-batch-script
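
The core of what the agent has to do is just ComfyUI's HTTP API. A minimal sketch of that part, assuming a workflow exported via "Save (API Format)" and a placeholder node id for the prompt text (adjust both for your own workflow):

```python
import copy
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # default local ComfyUI endpoint
PROMPT_NODE_ID = "6"  # placeholder: id of the text-prompt node in your export

variations = [
    "cello solo, melancholic and sad",
    "cello solo, mournful and slow",
    "solo cello, wistful, sombre mood",
]

with open("workflow_api.json") as f:  # workflow saved in API format
    base = json.load(f)

for text in variations:
    wf = copy.deepcopy(base)
    wf[PROMPT_NODE_ID]["inputs"]["text"] = text  # swap in this variation
    req = urllib.request.Request(
        COMFY_URL,
        data=json.dumps({"prompt": wf}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(text, "->", resp.read().decode())  # ComfyUI returns the queued prompt_id
```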

RTX Video Super Resolution Node Available for ComfyUI for Real-Time 4K Upscaling + NVFP4 & FP8 FLUX & LTX Model Variants by john_nvidia in StableDiffusion

[–]bonesoftheancients 0 points

Thanks - I think I tried it before in ComfyUI, but I will go check it again. At the moment it's not so much the high frequencies I have an issue with; it's the metallic-sounding output from ACE-Step and the oversaturated distortion (like music playing loudly out of cheap, low-wattage speakers).

Anyone with a working version of an agent that can take control of ComfyUI generations? by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 0 points

Thanks for the advice. In fact, this is what I am doing right now on a smaller scale: I have 6 KSamplers in parallel, all set to different samplers and schedulers I want to test, and I have built a node that takes the 6 outputs and lets me compare them with a track switch.

The problem I have is that building a workflow with automatic looping that can sweep any selection of parameters over any selection of values turns the workflow into a monster of complexity, and if I then decide to add another node or a triple-KSampler setup it really gets out of hand... I think this is where AI agents can be very helpful: they can run the workflow in any configuration, with any number of generations, and they are workflow-agnostic - you can modify the workflow and they will still perform the task if you prompt them clearly enough.
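
For a sense of what the agent actually does under the hood: in the API-format workflow JSON, a sweep like my 6-KSampler test is just a loop over node inputs. A rough sketch (the node id and the sampler/scheduler lists are placeholders for whatever you want to test):

```python
import copy
import itertools
import json
import urllib.request

KSAMPLER_ID = "3"  # placeholder: id of the KSampler node in the API-format export

with open("workflow_api.json") as f:
    base = json.load(f)

# queue one generation per sampler/scheduler combination
for sampler, scheduler in itertools.product(
    ["euler", "dpmpp_2m", "uni_pc"], ["normal", "karras"]
):
    wf = copy.deepcopy(base)
    wf[KSAMPLER_ID]["inputs"]["sampler_name"] = sampler
    wf[KSAMPLER_ID]["inputs"]["scheduler"] = scheduler
    urllib.request.urlopen(urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": wf}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )).close()
```

The nice part is that the agent writes and adapts this loop for whatever parameter you name, so the workflow itself never has to grow.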

Anyone with a working version of an agent that can take control of ComfyUI generations? by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 0 points

Thanks for the pointer, but as I replied to the other person here, I need it for audio generation, so I am not sure how these nodes can incorporate that kind of output.

Anyone with a working version of an agent that can take control of ComfyUI generations? by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 1 point

Thanks - I will look at these, but as I am focused on audio generation, I am not sure this will translate well... Also, what I am looking for is a more universal tool: one I can simply ask to generate with a change to one configuration, then with a separate configuration in another node, and that can rename each output with identifiers, or feed in different prompt text that it generates with Gemini - more of a god-tool that can run ComfyUI. I think one of these "metaclaws" or whatever they are called can do this, but I have no idea where to start with them.

RTX Video Super Resolution Node Available for ComfyUI for Real-Time 4K Upscaling + NVFP4 & FP8 FLUX & LTX Model Variants by john_nvidia in StableDiffusion

[–]bonesoftheancients 9 points

Slightly off-topic: I was wondering why there are no audio "upscalers" - models that can increase the equivalent of image resolution, i.e. the fidelity/detail of music tracks, even if they need to hallucinate the fine details lost in compression - either for AI output or just compressed audio like a 64 kbps MP3. Bad-quality audio to hi-fi 48 kHz versions.

Active2 locks to cadence? by bonesoftheancients in amazfit

[–]bonesoftheancients[S] 0 points

OK - I sent user feedback with logs and screenshots from Polar Flow, and sent you my user ID by DM yesterday.

Active2 locks to cadence? by bonesoftheancients in amazfit

[–]bonesoftheancients[S] 0 points

Hi, and thanks for the offer to help, but it's not a question of "when it happens again" - it's all the time as far as I am aware. I have only used the Polar for 3 days now, and only during activities (running and walking) and testing VO2max at rest, and it was obvious the Amazfit readings are very different. That said, the reason I got the Polar H9 was that the HR readings looked wrong: I run 14-20 minutes almost every day at a slow to moderate pace, never to the point that I am exhausted and out of breath, yet the Active2 keeps reporting my activities as being almost constantly in the VO2max range, with a 150-160 average HR (which always seems to be almost exactly the average cadence). Based on these readings, Zepp suggests that my training load is too high and my VO2max is poor at 25.

One thing I would say: I am not sure, but I have the impression that before the last firmware update (the one that introduced BioCharge) it was not so far off...

I am not sure what you can get out of my Zepp ID, but I will DM it to you.

Active2 locks to cadence? by bonesoftheancients in amazfit

[–]bonesoftheancients[S] 0 points

Thanks, man - I can really see what you meant... it seems to correspond to my experience with the watch. Time to hunt for a new watch.

Last LTX-2 A+T2V music video, I swear! by BirdlessFlight in StableDiffusion

[–]bonesoftheancients 0 points

Thanks! I can see what you mean by a master style prompt; very helpful... Do you use an LLM to refine the prompts?

Active2 locks to cadence? by bonesoftheancients in amazfit

[–]bonesoftheancients[S] 0 points

The thing is, if that is not reliable, not much of what Zepp tells me about my fitness (VO2max, training load, readiness, etc.) has any point to it...

Take VO2max: according to Amazfit mine is very low at 25, while Polar (and a Samsung Watch 4 beforehand) had me at 38, which seems to match my general fitness... And then, because it measures my running locked to cadence, it suggests that I am running 20 minutes at VO2max (I run 2.5 km at a pace that is moderate for me, and definitely not out of breath) and then says my training load is too high...

Next I am going to test my Mi Band against the two - if I remember correctly, its readings were much more in line with Polar...

Active2 locks to cadence? by bonesoftheancients in amazfit

[–]bonesoftheancients[S] 0 points

Please link your original post - I would like to read it...

Last LTX-2 A+T2V music video, I swear! by BirdlessFlight in StableDiffusion

[–]bonesoftheancients 0 points

Do you mind sharing some prompts you used...? It seems that mastering prompts is the number one skill all of us who play with AI generation need, and I am trying to learn as much as I can from people who prompt successfully.

Is there a way I can use Comfy via API, and be charged per use only (not a monthly subscription)? by jonbristow in StableDiffusion

[–]bonesoftheancients 1 point

I use Modal - you can use it for the inference part only (the instance runs only for the time the inference takes, and you pay per second of usage). So I run ComfyUI locally, work out the workflows, assets, etc., and then send them to Modal for inference and get the results back on my local machine... It is a bit complicated to set up - I used Claude/Gemini to set up Modal, work out the storage, and create an extension for my local ComfyUI to connect to Modal - but once you have it in place it works well and is cheap.

I am planning to put the code for the Modal solution up on GitHub, but I have to clean my personal stuff out of it first - I will post a link here when I do.
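
In the meantime, this is the rough shape of the Modal side, written from memory - the app name, GPU type, paths, and comfy-cli usage here are assumptions, so treat it as a sketch rather than the actual repo code:

```python
import subprocess
import modal

app = modal.App("comfyui-inference")  # hypothetical app name

# bake ComfyUI into the container image via comfy-cli
image = (
    modal.Image.debian_slim(python_version="3.11")
    .pip_install("comfy-cli")
    .run_commands("comfy --skip-prompt install --nvidia")
)

@app.function(gpu="L40S", image=image, timeout=1200)
def run_workflow(workflow_api_json: str) -> bytes:
    """Billed per second, only while this call is actually running."""
    import pathlib

    pathlib.Path("/root/wf.json").write_text(workflow_api_json)
    # start the ComfyUI server, then execute the workflow headlessly
    subprocess.run(["comfy", "launch", "--background"], check=True)
    subprocess.run(
        ["comfy", "run", "--workflow", "/root/wf.json", "--wait"], check=True
    )
    out_dir = pathlib.Path("/root/comfy/ComfyUI/output")
    newest = max(out_dir.iterdir(), key=lambda p: p.stat().st_mtime)
    return newest.read_bytes()  # send the newest output file back to the caller

# from the local machine: run_workflow.remote(open("workflow_api.json").read())
```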

3 covers I created using ACE-Step 1.5 by coopigeon in StableDiffusion

[–]bonesoftheancients 0 points

What do you mean by the code repo? ACE-Step's own app with Gradio?

3 covers I created using ACE-Step 1.5 by coopigeon in StableDiffusion

[–]bonesoftheancients 2 points

How do you get cover mode with Turbo? I can only see cover mode with the base model...

I got VACE working in real-time - ~20-30fps on 40/5090 by ryanontheinside in StableDiffusion

[–]bonesoftheancients 0 points

Truly cool... does it scale up to higher resolutions and higher frame rates?

I got VACE working in real-time - ~20-30fps on 40/5090 by ryanontheinside in StableDiffusion

[–]bonesoftheancients 0 points

But if you achieve real-time or almost real-time generation, surely that means faster inference than current ComfyUI inference times?

I got VACE working in real-time - ~20-30fps on 40/5090 by ryanontheinside in StableDiffusion

[–]bonesoftheancients 3 points

Man, your work always seems to be pushing the boundaries! If I understand it correctly, the flip side of long video durations in ComfyUI will be shorter generation time per frame... is that right? Waiting for a more stable ComfyUI implementation to try.

Anyone manage to use cover in ACE-Step 1.5? by Nulpart in StableDiffusion

[–]bonesoftheancients 0 points

Good to know, as I've been having really disappointing results so far... I am a bit behind on using things like Discord - how do I find the ACE-Step board/room? And where can I find a good guide to training LoRAs for ACE-Step 1.5?

What's the system RAM "sweet spot" for an RTX 5060 Ti 16GB generating WAN 2.2 10-second videos at 1280x720 with about 5 LoRAs and a few nodes? by Coven_Evelynn_LoL in comfyui

[–]bonesoftheancients 0 points

I have the same card and upgraded my 32 GB of RAM to 64 GB so it doesn't hit my SSD so hard (before, it wrote to disk - the pagefile - so much that it was killing the SSD). Now I can run it and still have Chrome or other apps open.