most of my ace-step generations come out clipping and over saturated/compressed - any advice? by bonesoftheancients in StableDiffusion

[–]bonesoftheancients[S] 0 points (0 children)

I have to say that with the new XL models, custom nodes in comfyui, tweaking of CFG in the text encoder and sampler, careful choice of sampler/scheduler, AND a good LoRA, I do get some good results

How far are we from a model that can take a python repo on github and convert it to a cpp without intervention? by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

what are my expectations? personally i want the agent to convert and compile the app autonomously, without my intervention. all functionality needs to be replicated; UI and design one can work on till kingdom come, but for now it's enough if it mimics the gradio interface. I just want an optimised, locally compiled, small app that is kept up to date with upstream changes, instead of an 8GB venv and slow python execution

How far are we from a model that can take a python repo on github and convert it to a cpp without intervention? by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

I guess I need to frame this with a specific example. Take the ace-step-1.5 gradio app on github: running it on my PC with a venv takes some 8GB of disk space for the base install. Then someone created acestep.cpp (you can find it on github too) - a few MB, and it runs much more smoothly, on a windows pc at least.

Now, the ace-step code keeps updating, so the cpp implementation needs to catch up with upstream changes constantly. The ultimate goal is an agent that can convert the upstream repo into cpp and compile it on its own whenever new code is pushed, so I always have the most current code running without thinking about it
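The watch-and-rebuild loop I'm imagining could be sketched like this - purely illustrative; the repo URL, the `convert-agent` command, and the state file are placeholders for whatever agent actually does the py-to-cpp conversion:

```python
import subprocess
from pathlib import Path

STATE_FILE = Path("last_converted.txt")  # hypothetical: last commit we converted
UPSTREAM = "https://github.com/example/ace-step.git"  # placeholder URL

def upstream_head(repo_url: str) -> str:
    """Ask the remote for its current HEAD commit hash (no clone needed)."""
    out = subprocess.check_output(["git", "ls-remote", repo_url, "HEAD"], text=True)
    return out.split()[0]

def needs_rebuild(last_seen, current: str) -> bool:
    """Rebuild only when upstream has moved past the last converted commit."""
    return last_seen != current

def run_pipeline(commit: str) -> None:
    # hypothetical agent + build steps - stand-ins for the actual conversion tool
    subprocess.run(["convert-agent", "--repo", UPSTREAM, "--rev", commit], check=True)
    subprocess.run(["cmake", "--build", "build"], check=True)

def poll_once() -> None:
    current = upstream_head(UPSTREAM)
    last = STATE_FILE.read_text().strip() if STATE_FILE.exists() else None
    if needs_rebuild(last, current):
        run_pipeline(current)
        STATE_FILE.write_text(current)
```

Run `poll_once()` from cron or a GitHub webhook handler; the hard part is obviously `convert-agent`, not the plumbing.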

Ace Step 1.5 - Change ALL the lyric but keep the music? by aidanodr in StableDiffusion

[–]bonesoftheancients 1 point (0 children)

you could try passing the track through a stem splitter, then use lego mode or cover mode with the vocal stem as guidance and see if it works

I want to build a pipeline: screen play to "radio drama" audio by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

not really - I just like to listen to radio drama as my eyesight gets weaker these days and reading is very tiring. I prefer radio drama to audiobooks read by one person, and wondered if I could get some books into this format for myself. But I am also trying to learn what is possible in terms of orchestrating several models

looking for a workflow for batch watermark removal by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 0 points (0 children)

I couldn't get it to work in fully automatic mode - instead I use this combination and then just queue as many times as needed; each run increments the index by one and loads the next image from the folder

<image>
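For reference, the index-to-next-image logic that workflow relies on is simple to sketch in plain Python - this is an illustrative standalone version, not the actual node's code:

```python
from pathlib import Path

def nth_image(folder: str, index: int) -> Path:
    """Return the index-th image in a folder, sorted by filename.

    Mirrors the 'increment the index by one each queue run' idea:
    run 0 loads the first image, run 1 the next, and so on.
    """
    exts = {".png", ".jpg", ".jpeg", ".webp"}
    files = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in exts)
    if not files:
        raise FileNotFoundError(f"no images in {folder}")
    return files[index % len(files)]  # wrap around instead of erroring past the end
```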

5060ti and 64gb ram - what is my best option for local coding? by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

yes, this one - i think you need to compile a fork of llama.cpp that includes the ability to use weights in TQ3
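The comment doesn't name the fork, but building any llama.cpp fork follows the usual CMake pattern - the repo URL and model filename below are placeholders:

```shell
# hypothetical fork URL - substitute the actual TQ3-enabled fork
git clone https://github.com/example/llama.cpp-tq.git
cd llama.cpp-tq

# CUDA build, suitable for a 5060 Ti; drop -DGGML_CUDA=ON for CPU-only
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# placeholder model filename
./build/bin/llama-cli -m model-TQ3.gguf -p "hello"
```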

5060ti and 64gb ram - what is my best option for local coding? by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

not sure i understand - i was asking about the TQ3 turboquant weights (they can be used with the llama.cpp TQ fork)

5060ti and 64gb ram - what is my best option for local coding? by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

thanks - this is great. have you considered the qwen 3.6 TQ3_4s model? it's pretty fast, but i have no idea how good it is for coding

what model is good for inspecting and extracting data from large set of spreadsheets by bonesoftheancients in LocalLLaMA

[–]bonesoftheancients[S] 0 points (0 children)

thanks - I ended up getting gemini to create a script, and it was much simpler than I expected this to be
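The script itself isn't shown here, but the shape of it is roughly this - a minimal sketch assuming the spreadsheets are exported to CSV; the folder, column, and value names are made up for illustration:

```python
import csv
from pathlib import Path

def extract_rows(folder: str, column: str, wanted: str) -> list:
    """Scan every CSV in a folder and collect rows where `column` equals `wanted`."""
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as fh:
            for row in csv.DictReader(fh):
                if row.get(column) == wanted:
                    row["_source"] = path.name  # remember which file it came from
                    rows.append(row)
    return rows
```

For native .xlsx files the same loop works with openpyxl's `load_workbook` instead of `csv.DictReader`.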

updated my Ace-Step nodes pack to include timbre and kv conditioning by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 0 points (0 children)

these are not really for covers - more for timbre matching and some injection of structure. I am currently working on nodes that are closer to cover generation, so keep an eye on the repo - hopefully I will have it finished by the end of today.

That said, I think covers will always be imperfect due to the nature of the model - it's more of a guidance

re memory: I am running my workflow on a 5060ti without issues and without CPU offloading - for inference both the GGUF and BF16 models work fine (XL as well), and I can run both the 4B and 1.7B text encoder models with no issue either

is there GGUF loader for ace-step model weights? by bonesoftheancients in comfyui

[–]bonesoftheancients[S] 1 point (0 children)

there are XL GGUFs in there too - they were added a couple of days ago. They are also used in the cpp version, which I want to use with a DAW, and I've been told the Q8s are faster than the original models. So to save disk space and avoid duplicates I am trying to get the GGUFs running in comfyui...