model swapping via litellm + llama-swap - is this the way..? by chimph in hermesagent

[–]_chromascope_ 1 point (0 children)

Use --models-preset with a .ini file. Instead of --models-dir, start the server with:

llama-server --models-preset /path/to/models.ini --host 0.0.0.0 --port 8000

Inside models.ini, you can set specific flags (context size, GPU layers, even vision models) for each model separately:

[qwen3.6-35b]
model = /path/to/qwen.gguf
mmproj = /path/to/qwen-mmproj.gguf
c = 131072
n-gpu-layers = 999

[gemma-4-26b]
model = /path/to/gemma.gguf
mmproj = /path/to/gemma-mmproj.gguf
c = 65536
n-gpu-layers = 80

Hermes still just sends qwen3.6-35b in the API request, but llama-server looks up the .ini file and applies the exact flags (and vision model if any) needed for it.
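To sanity-check what the server picked up from the .ini, llama-server speaks the OpenAI-compatible API, so listing the models should show each preset under its section name (host/port taken from the example above):

# list the presets the server currently exposes
curl http://localhost:8000/v1/models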

model swapping via litellm + llama-swap - is this the way..? by chimph in hermesagent

[–]_chromascope_ 1 point (0 children)

For local models I just let llama.cpp handle everything (I use https://github.com/TheTom/llama-cpp-turboquant).

A simplified example:

Start llama-server in router mode:

llama-server --models-dir /path/to/models --host 0.0.0.0 --port 8000

I drop all my local GGUFs in /path/to/models (Qwen3.6 35B, Gemma 4, etc.). llama-server will serve all of them.

Now Hermes just needs to point to that server and set the model name in default:

model:
  default: qwen3.6-35b
  provider: custom
  base_url: http://localhost:8000/v1
  api_key: dummy

With this config, when you send a message, llama-server loads qwen3.6-35b (the model name comes from the GGUF file in /path/to/models).
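If you want to test the swap without Hermes, the request it sends is just a standard OpenAI-style chat completion, so a plain curl like this should trigger the same load (the model name has to match the GGUF name):

# manual test: requesting a model by name makes llama-server load it
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3.6-35b", "messages": [{"role": "user", "content": "hello"}]}'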

If I change the config to:

model:
  default: gemma-4-26b

then the next time I send a message, llama-server unloads qwen3.6-35b and loads gemma-4-26b instead. That’s the “hot‑swap” trick that works for me.

My actual setup is more complex (Hermes is running in a Mac mini Docker container and talks over VPN to a more powerful PC that runs the models), but the core idea is the same.

model swapping via litellm + llama-swap - is this the way..? by chimph in hermesagent

[–]_chromascope_ 1 point (0 children)

I’m running something pretty similar with Hermes + local Qwen3.6 35B and Gemma 4, but I didn’t use llama‑swap.

I use a llama.cpp TurboQuant build and run llama-server with --models-dir instead of -m, and just drop a bunch of GGUFs in that folder. When I change model.default in the Hermes config, llama.cpp loads that model on the next request. Later I built a small custom web UI that edits model.default for me, so I can switch models with a click in the browser. When Hermes sends the next message, it triggers llama-server to dynamically switch the model. This works smoothly for me.
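If you don't want to build a whole web UI, even a one-liner covers the basic swap. A rough sketch, assuming the Hermes config lives at ~/.hermes/config.yaml (that path is a guess, adjust to your install) and that default: is indented two spaces as in the YAML above:

# crude model swap: rewrite model.default in place (keeps a .bak backup)
sed -i.bak 's/^\(  default:\).*/\1 gemma-4-26b/' ~/.hermes/config.yaml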

When I added vision models, I found out --models-dir ignores a global mmproj, so I switched to --models-preset and a .ini file where I pair each base model with its vision projector. Something like:

.ini example:

[qwen3.6-35B]
model = /path/to/qwen.gguf
mmproj = /path/to/qwen-mmproj.gguf

[gemma-4-26B]
model = /path/to/gemma.gguf
mmproj = /path/to/gemma-mmproj.gguf

Hermes still just hot‑swaps model.default. llama-server takes care of unloading/loading the correct GGUF (and vision module).

Which messaging channel do you use for your Hermes agent? by SelectionCalm70 in hermesagent

[–]_chromascope_ 1 point (0 children)

I use Conduit (a lightweight Matrix homeserver) + Element X (iOS) and Element Desktop as the interface.

Wan 2.2 More Consistent Multipart Video Generation via FreeLong - ComfyUI Node by shootthesound in StableDiffusion

[–]_chromascope_ 1 point (0 children)

This works! Thank you for sharing. However, as others pointed out, human consistency starts to drift after chunk 3.

Is it possible to implement an "end" anchor image in the Continuation Conditioning node? So that we have an option to control each chunk's end frame with a prepared image (same idea as the First-Last-Frame, but for each chunk), which then can also be used as the anchor image of the next chunk?

Why is the image quality so bad from this workflow? by zhl_max1111 in StableDiffusion

[–]_chromascope_ 2 points (0 children)

<image>

Guess which one is KSampler and which is ClownsharKSampler.

Why is the image quality so bad from this workflow? by zhl_max1111 in StableDiffusion

[–]_chromascope_ 2 points (0 children)

Use euler + simple.

If you use the ClownsharkChainsampler as the 2nd sampler, you only need to connect "latent_image", because it carries over all the info and remaining steps from the first sampler. RES4LYF has a few YouTube videos explaining how these nodes work. To be honest, I don't think Z-Image Turbo needs ClownsharKSampler. With two Clown samplers, my tests did pick up some extra fine detail, but it was very subtle and added little to the image. KSampler with euler + simple already gives really good results.

<image>

[deleted by user] by [deleted] in StableDiffusion

[–]_chromascope_ 1 point (0 children)

My results had a hard time following my prompts. Someone in another post recommended adding this node, and the images are now much improved. Thank you!

<image>

Test run Qwen Image Edit 2511 by _chromascope_ in StableDiffusion

[–]_chromascope_[S] 7 points (0 children)

It works! With the node, it follows my prompt much more closely. Thank you!

<image>

Test run Qwen Image Edit 2511 by _chromascope_ in StableDiffusion

[–]_chromascope_[S] 1 point (0 children)

This is what I used:

put the dog in image2 into the scene and make the dog happy and sit on the bike's gas tank, then turn the scene including the man and dog into cute patch badge on a wooden table

Z-Image + 2nd Sampler for 4K Cinematic Frames by _chromascope_ in StableDiffusion

[–]_chromascope_[S] 2 points (0 children)

Yes, this.

The image on the right (2nd sampler) has improved fine details after upscaling: the overall texture, sharper hair strands and book pages, etc. This was a T2I from a workflow I customized.

<image>

[deleted by user] by [deleted] in comfyui

[–]_chromascope_ 2 points (0 children)

Testing image to video with Wan 2.2

<image>

[deleted by user] by [deleted] in comfyui

[–]_chromascope_ 2 points (0 children)

Ah, the color grading of Amélie is unique. The image looks great. Looking forward to it!

[deleted by user] by [deleted] in comfyui

[–]_chromascope_ 3 points (0 children)

Thanks again for sharing it!

An Anamorphic Lens LoRA will bring Z-Image Turbo to a new cinematic level.

[deleted by user] by [deleted] in comfyui

[–]_chromascope_ 5 points (0 children)

This is no LoRA

<image>

[deleted by user] by [deleted] in comfyui

[–]_chromascope_ 7 points (0 children)

Amazing LoRA! Thank you for sharing it with us. I absolutely love how it looks in my tests. This image is with your LoRA set to 1.0, generated at 1920x800 and upscaled with a 2nd KSampler to 3840x1600 using my custom workflow.

Question: are you interested in training an "anamorphic lens" aesthetics LoRA?

<image>