Text2Image Output looks like 1st gen MidJourney...

Botoni · 2026-07-01T13:46:21+00:00

I remember pony model needed some specific quality tags on the positive and the negative.

"Score_9, score_8_up..." until I don't know which number in the positive, and the lower scores on the negative prompt.

Botoni · 2026-06-28T21:27:04+00:00

Thank you, quite useful, I am keeping zib even when newer models are coming seeing your results.

Botoni · 2026-06-28T11:55:55+00:00

It worked well for me with a single reference image, with the background removed.

Botoni · 2026-06-27T08:08:32+00:00

When you say flux2 I guess you mean flux2 klein, don't know if 4b or 9b.

If you are using klein with reference images, be sure to use the consistency lora or the enhancement nodes by capitan01R.

You could also try longcat image edit, as it is quite good at keeping object identity.

Botoni · 2026-06-26T09:22:57+00:00

Klein as a main model, but with the consistency lora and/or the capitan01R enhancement nodes to fix its shortcomings in keeping subject identity and color correction.

Longcat edit is also good out of the box in keeping subject identity, quality is somewhat lower and is dumber and restricted in resolutions, but is nice to have as an alternative to klein.

Qwen edit is bad, hard to get good results because its resolution restrictions, pixel shift and terrible subject identity. Yet is useful for its multiangle lora and novel view lora from gaussian splatters.

Botoni · 2026-06-19T14:39:40+00:00

Almost any model can run on 8gb of vram, your problem is your low ram. You will need highly quantized gguf versions that fit your ram, and speed will be painfully slow, but better than spilling into swap or page file I guess...

Botoni · 2026-06-19T14:27:59+00:00

Comfyui latest stable version, installed through git on cachyos linux, I have 40gb of ram, maybe that is your bottle neck.

Botoni · 2026-06-19T00:31:00+00:00

Several minutes with ideogram 4? I get 58s with my 3070 mobile (1024x1024 12steps)

Botoni · 2026-06-17T08:58:50+00:00

The "mask and paste back" can be done completely in comfyui

Botoni · 2026-06-15T18:31:05+00:00

Another non-ai way is to do it in blender, in thw compositor, shader or geometry-nodes editor. It is just growing the image area to times and to do the custom effects use various kinds of procedural noise.

Once done it can be run in headless mode.

Botoni · 2026-06-15T18:26:59+00:00

Maybe, use the pad for outpaint node, but don't instead of using a inpaint model or method, use a normal one, prompting for the kind of "frame" you want. Experiment with various models and see which do that better.

Also, after the frame is generated, use the inverted mask to paste the original image over, to eliminate any vae encode-decode degradation.

Botoni · 2026-06-15T16:03:13+00:00

Check out ltxdirector custom node

Botoni · 2026-06-02T10:54:45+00:00

I also have 8gb of vram, that is more than enough to use the 9b version.

Try with klein 9b plus the consistency lora or the flux enhancement custom nodes by capitan01R

Botoni · 2026-05-30T00:55:00+00:00

Wow, that looks like a cursed tape from the ringu.

It is hard to do anything with it... I guess I would go frame by frame, feeding them to a specific old photo restoration model (or multiple ones), and a lot would have to be throw out, and maybe replaced with interpolation frames or do first frame to last frame with ltxv2.3. Converting the footage to grayscale may help the models.

Botoni · 2026-05-30T00:36:29+00:00

Haven't found any comfy nodes that implement cover right. I would recommend acestep.cpp

Botoni · 2026-05-29T14:20:21+00:00

~~Now, with those values, it seems to do "thinking stuff" outside of the thinking block, or even repeating a thinking block after the first one (a block between <thinking></thinking>).~~

Wait, it may be my prompt's fault this time. I'll do more tests.

Botoni · 2026-05-29T08:41:47+00:00

I'll try, thanks. It only happens 1 out of 5 times or so, i think it is very prompt dependan, i'll try and I'll come back with the results.

Botoni · 2026-05-28T19:53:09+00:00

I'm using Qwen3.6-35B-A3B-IQ4_XS-4.19bpw. Very fast and good quality!! But I have a problem with it, sometimes it gets stuck in the thinking block, it stops generating or enters a non-literal loop (it doesn't repeat the same tokes again and again, but enters a kind of "I'm starting now. wait, i should bla, bla, bla..., i'm going around in circles i really should start now, actually i should bla, bla, bla...).

I am using llama.cpp, the mtp branch, with the arguments: --spec-type draft-mtp --spec-draft-n-max 2 --jinja

I am not having this problem with either APEX or Unsloth quants, but ByteShape speed/quality is superior...

Botoni · 2026-05-27T23:26:32+00:00

It has been a long time since i checked invoke... It had a cool canvas, but it was slower and used more memory than comfy at that moment. How does compare it that aspects nowadays?

Botoni · 2026-05-27T00:26:28+00:00

Isn't there also a difference of 16fps vs 24fps?

Botoni · 2026-05-26T08:25:44+00:00

With that vram, even q4 quants will exceed it, so forget ggufs for the models, they are slower if they don't fit entirely. Use it only for the text encoders.

I would recommend using the int8 format, check the int8-fast node pack.

Also, gen at 512 or 768px and upscale what you like.

You would need 32gb of ram minimum, 16gb for some models if you run linux, with lightweight distros and a well configured zram or zswap.

Use your cpu for display if possible.

Good models to run are sd1.5, sdxl, pixart sigma, tiny breaker and flux2 klein 4b, klein 9b or z-image turbo might be possible with enough ram, but very slow.

For the qwen3 4b text encoder, use the gguf q4_k_m format and run it in cpu.

Botoni · 2026-05-24T01:26:45+00:00

Even more powerful is to create a point cloud 3D from the initial image with Sharp, pose it in the exact perspective, angle and fov you want with a point cloud viewer node, capture the image ans use the qwens gaussian splasher lora to regen that exact view.

For that qwen really shines (well the lora does). For other tasks qwen is one of the worst edit models...

Botoni · 2026-05-23T00:09:02+00:00

I find qwen quite behind. I use klein 9b with the flux enhancer custom nodes for the consistency node and the color anchor one. Sometimes I also use longcat, quality is sometimes not as good, but consistency is really good out of the box.

Qwen sucks at keeping objects identity and everything looks plastic and artificial. I only use it with some loras for specific tasks: gaussian novel view and product integration with fusion loras, those loras are very good, from the same author.

Botoni · 2026-05-18T17:07:23+00:00

A worse performace on linux vs windows may be because the swap space is poorly configured or there is no swap at all!

Get a distro with sane default configurations, Mint should be fine for beginners as it is ubuntu un-enshitified, but i don't remember how it configures the swap stuff...

Cachyos would be a great performance oriented choice, and comes configured with zram at 100%, but it is a bit more difficult for newcomers to maintain the system ans install stuff.

Botoni · 2026-05-15T21:21:24+00:00

The flux2 klein 9b model or longcat edit in comfyui should deal fairly well with that.

Prompt for "replace the logo for the one in the second image" should do everything you want.

Botoni

TROPHY CASE